Physics Reports 382 (2003) 1 – 111 www.elsevier.com/locate/physrep
Exact mean-field theory of ionic solutions: non-Debye screening
Luis M. Varela*, Manuel García, Víctor Mosquera
Grupo de Física de Coloides y Polímeros, Departamento de Física de la Materia Condensada, Universidad de Santiago de Compostela, E-15706, Santiago de Compostela, Spain
Accepted 13 April 2003; editor: S. Peyerimhoff
Abstract
The main aim of this report is to analyze the equilibrium properties of primitive model (PM) ionic solutions in the formally exact mean-field formalism. First, we review the main theoretical and numerical results reported throughout the last century for homogeneous (electrolytes) and inhomogeneous (electric double layer, edl) ionic systems, starting with the classical mean-field theory of electrolytes due to Debye and Hückel (DH). In this formalism, the effective potential is derived from the Poisson–Boltzmann (PB) equation and its asymptotic behavior is analyzed in the classical Debye theory of screening. The thermodynamic properties of electrolyte solutions are briefly reviewed in the DH formalism. The main analytical and numerical extensions of the DH formalism are reviewed, ranging from the earliest extensions that overcome the linearization of the PB equation to the more sophisticated integral equation techniques introduced after the late 1960s. Some Monte Carlo and molecular dynamics simulations are also reviewed. The potential distributions in an inhomogeneous ionic system are studied in the classical PB framework, with a brief presentation of the classical Gouy–Chapman (GC) theory of the electric double layer. The mean-field theory is placed in context using field theoretic (FT) results: it is proven that the classical PB theory is recovered at the Gaussian or one-loop level of the exact FT, and a systematic way to obtain the corrections to the DH theory is derived. In particular, it is proven, following Kholodenko and Beyerlein, that corrections to DH theory effectively lead to a renormalization of charges and Debye screening length. The main analytical and numerical results for this non-Debye screening length are reviewed, ranging from asymptotic expansions and self-consistent theory to nonlinear DH results and hypernetted chain (HNC) calculations.
Finally, we study the exact mean-field theory of ionic solutions, the so-called dressed-ion theory (DIT). An analysis of its statistical foundations is reported together with a detailed study of its linear response function, $\hat{\alpha}(k)$, which generalizes the concept of screening length and contains all the information about the effective quantities. The relation of this quantity to the structure factor of the fluid is explicitly analyzed, the renormalized charges and screening length for a one component charged spheres (OCCS) system are derived in the modified mean spherical approximation (MMSA), and a comparison of the DIT/MMSA predictions for the effective magnitudes to HNC results is included. In addition, the predicted DIT/MMSA thermodynamic properties are studied for the RPM electrolyte, and extensions of this formalism to asymmetric electrolyte solutions are presented. The main DIT results for the edl due to Ennis et al. are introduced and, finally, we analyze the main features of the application of the new equilibrium formalism to the calculation of transport coefficients, the so-called dressed ion transport theory (DITT). In this framework, the relaxation and electrophoretic corrections to the ionic mobility are interpreted in terms of the existence of new kinetic entities in the bulk solution: the effective or dressed particles.
© 2003 Elsevier B.V. All rights reserved.
PACS: 61.20.Gy; 61.20.Qg; 82.45.+z; 82.70.Dd
* Corresponding author. Tel.: +34-981-563100x14042; fax: +34-981-520676. E-mail address: [email protected] (L.M. Varela).
0370-1573/03/$ - see front matter. doi:10.1016/S0370-1573(03)00210-2
Contents
0. Nomenclature
1. Introduction
2. Classical mean-field theory of ionic solutions
2.1. Primitive model
2.2. Mean-field assumption: the Poisson–Boltzmann equation
2.2.1. Debye–Hückel theory: screening of the ionic correlations
2.2.2. Further developments. Extensions of DH theory
2.2.3. Guggenheim's theory: contribution of the short range forces
2.2.4. Mayer's cluster sum theory
2.2.5. Potential and charge distributions at a flat surface: the Gouy–Chapman model
3. Integral equation techniques and computer simulations
3.1. Mean spherical approximation (MSA) and its thermodynamically consistent generalization (GMSA)
3.2. Percus–Yevick and hypernetted chain approximations
3.3. Simulation results
4. Field theory of ionic solutions
5. Calculus of the effective parameters
5.1. Asymptotic expansions
5.2. Self-consistent screening length
5.3. Numerical results
6. Dressed ion theory
6.1. Modified MSA approach
7. Thermodynamic predictions
8. The primitive model double layer: effective surface charge
9. Transport theory of electrolytes: DITT
9.1. Relaxation of the ionic cloud
9.2. Electrophoretic effect
9.3. Formulation of the DITT conductance equation
9.4. Comparison to experimental results
10. Conclusions
Acknowledgements
References
0. Nomenclature
$a$: lattice spacing of the hyperlattice
$A$: Helmholtz free energy of the system
$A^{\mathrm{ex}}$: excess Helmholtz free energy of the system
$\mathbf{A}$: matrix A
$A_{ij}$: potential of the mean force between ions i and j
$b_{ij}(r)$: bridge function
$c$: molar concentration
$c_i$: molar concentration of species i
$c_{ij}(r)$: direct correlation function between ions i and j
$c_{ij}^{l}(r)$: long range part of the direct correlation function between ions i and j
$c_{ij}^{0}(r)$: short range part of the direct correlation function among ions of species i and j
$D$: dimensionality of the space
$D_j$: diffusion coefficient of ion j
$D_j^{0}$: limiting diffusion coefficient of species j
$E$: external field
$E_0$: external field amplitude
$\hat{f}(k)$: Fourier transform of function $f(r)$
$f_{ij}(r)$: two body Mayer's function
$f_{\pm}$: rational mean activity coefficient
$F_{ij}$: Coulomb force acting on ion j due to ion i
$g_{ij}(r)$: radial distribution function of ions of species i in the neighbourhood of an ion of species j
$G^{(D)}(i,j)$: lattice propagator between lattice sites i and j
$h_{ij}(r)$: total correlation function among ions of species i and j
$h_{ij}^{0}(r)$: short range part of the total correlation function among ions of species i and j
$h_{i0}(z)$: macroion–ion correlation function
$h_{i0}^{0}(z)$: short range part of the macroion–ion correlation function
$I$: ionic strength
$J_i$: probability flux for i particles
$k_B$: Boltzmann's constant
$k_D$: Debye's parameter
$l$: mean free path
$G^{\mathrm{ex}}$: excess Gibbs free energy
$G$: free energy of the system
$g$: Gibbs free energy per particle
$l_B$: Bjerrum's length
$m$: mass of the ions
$n$: number density of the fluid
$n^{*}$: reduced number density of the fluid
$n_i$: number density of species i
$N_i$: total number of ions of species i in solution
$N_A$: Avogadro's number
$P$: pressure
$P_{\mathrm{osm}}$: osmotic pressure of the system
$q_i$: charge of an ion of species i
$q_j^{*}$: renormalized or effective charge of ions of species j
$r_{ij}$: distance between the centers of ions i and j
$S(k)$: static structure factor of the fluid
$S_{ij}(k)$: partial structure factor
$S_{NN}(k)$: number–number structure factor
$S_{NZ}(k)$: charge–number structure factor
$S_{ZZ}(k)$: charge–charge structure factor
$T$: absolute temperature
$u$: internal energy per particle
$U$: internal energy of the system
$U^{\mathrm{ex}}$: excess internal energy of the system
$U(r^N)$: total potential energy of the system
$V$: volume of the system
$\bar{V}$: partial molar volume
$V^{\mathrm{el}}$: electrostatic potential energy
$v_i^{0}$: contribution of the short-range part of the pair correlation to the velocity field in the neighbourhood of ion i
$v_i(0)$: velocity of the surrounding ionic cloud relative to the bare i particle
$v_{ij}$: velocity of ion j in the neighbourhood of ion i
$V(r_j)$: drift velocity of the solution as a whole in the position of particle j
$x_j$: molar fraction of ionic species j
$\alpha(r)$: DIT linear response function
$\hat{\alpha}(k)$: Fourier transform of the DIT linear response function
$1-\alpha$: degree of ionic association
$\beta$: inverse thermal energy
$\gamma_j$: practical activity coefficient of ionic species j
$\gamma_{ij}$: GMMSA correlation parameters between ionic species i and j
$\gamma_{\pm}$: practical mean activity coefficient
$2\Gamma$: MSA decay constant
$\delta^{(D)}(r)$: Dirac's delta function in D-dimensional space
$\delta_{ij}$: Kronecker's delta
[illegible] $= k_D e^2/\varepsilon$: coupling constant
$\varepsilon$: dielectric permittivity of the solvent
$\varepsilon^{*}$: effective dielectric permittivity of the medium
$\epsilon(k)$: longitudinal dielectric function
$\eta$: viscosity of the medium
[illegible]$(n)$: slope of the direct correlation function inside the ionic core
[illegible] $= l_B/\sigma$: reduced Bjerrum's length
$\kappa$: effective decay constant of the fluid
$\lambda_i$: fugacity of species i
$\lambda_i^{0}$: limiting equivalent conductance of ion i
$\lambda_{Bj}$: thermal wavelength of species j
$\Lambda$: equivalent conductance of the system
$\Lambda_0$: limiting equivalent conductance
$\mu_j$: chemical potential of species j
$\nu_j$: stoichiometric coefficient of species j
$\Xi$: grand canonical partition function
$\rho$: charge density of the fluid
$\rho_s$: surface charge
$\rho_s^{*}$: effective surface charge
$\rho_j(r)$: charge density in the neighbourhood of ion j
$\bar{\rho}_j(r)$: average charge density in the neighbourhood of ion j
$\rho_j^{c}(r)$: charge density of the central particle j
$\rho_0^{0}(z)$: short range charge distribution in the wall
$\tilde{\rho}_l$: ionic frictional coefficients
$\sigma_{ij}$: mean ionic diameter of the pair of ions i and j
$\sigma_i$: hard sphere diameter of ions of species i
$\tau$: collision time
$\tau^{*}$: relaxation time of the ionic atmosphere
[illegible]: dynamic DIT pole
[illegible]$_{ZZ}(k)$: charge response function
$\phi$: volume fraction of ions
$\phi_j(r)$: Coulomb's potential created by ion j at a distance r
$\phi_{ij}(r)$: Coulomb potential energy between ions i and j
$\phi_{ij}^{0}(r)$: short range pair interaction between ions i and j
$\phi_{ij}^{\mathrm{sc}}(r)$: soft core pair potential
$\phi_{\mathrm{osm}}$: osmotic coefficient of the system
[illegible]: renormalization parameter
$\psi_0$: potential at the surface of the central particle in the electric double layer
$\bar{\psi}_j(r)$: average electrostatic potential created by an ion of species j in solution
$\bar{\omega}$: frequency of the external field
$\omega_i^{0}$: mobility of ions of species i at infinite dilution
$\omega_j$: mobility of an ion of species j
1. Introduction
Charged complex fluids have a prominent role among the fundamental systems that form the class of complex fluids, a category that covers basically the whole spectrum of liquid matter. Being the main contribution to the interparticle potential, the Coulombic interaction is responsible for the main properties of these systems, ranging from the homogenization of molecular structures (stability of colloidal dispersions) to the existence of concentration gradients and mesomorphic structures. However, the most outstanding consequence of this long-range interaction is the coupling of the
degrees of freedom of many particles, which confers a typical many-body character on the problem. The paradigm of a charged complex fluid is the ionic liquid, a category which comprises liquid metals, molten salts and ionic solutions. The latter are neutral systems formed by a solute of positive and negative ions immersed in a neutral polar solvent. This kind of system varies widely in complexity, ranging from electrolyte solutions where cations and anions have comparable size and charge, to highly asymmetric macromolecular ionic liquids in which macroions (polymers, micelles, proteins, ...) and microscopic counterions coexist. Thus, the importance of this system in many theoretical and applied fields is beyond doubt.
The theory of ionic solutions has been one of the most important and fundamental problems in statistical physics throughout the last century. Since the formulation of the seminal Debye–Hückel (DH) theory [1] of homogeneous ionic solutions (electrolytes), with its discovery of the screened form of the mean interionic potential, the number of both theoretical and experimental contributions to this field has increased constantly. There is general agreement that the DH theory was a revolution in the understanding of the properties of ionic media. In fact, it has been the theoretical framework where most of the studies of electrolyte solutions have taken place since then. Besides, the corresponding theory for the electric double layer (the so-called Gouy–Chapman theory) [2] constitutes the basis of modern colloid science [3]. In a Coulomb system, one ion interacts with many different ions simultaneously, making the mean-field approach very successful in describing qualitatively (and usually also quantitatively) experiments and simulations.
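As a rough numerical illustration of the screened DH potential mentioned above, the following Python sketch evaluates the Bjerrum length, the Debye parameter and the point-ion DH pair energy. It is only a sketch under stated assumptions: SI units, point ions (the DH limiting law), and water near 25 °C with an assumed relative permittivity of 78.4. The function names are illustrative and are not taken from the report.

```python
import math

# Physical constants in SI units (CODATA 2018 values)
E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro's number, 1/mol


def bjerrum_length(eps_r, temperature):
    """Bjerrum length l_B = e^2 / (4 pi eps_0 eps_r k_B T), in metres."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS_0 * eps_r * K_B * temperature)


def debye_parameter(molar_concs, valences, eps_r, temperature):
    """Debye parameter k_D in 1/m, from k_D^2 = 4 pi l_B sum_i n_i z_i^2.

    molar_concs are in mol/L and are converted to number densities in 1/m^3.
    """
    l_b = bjerrum_length(eps_r, temperature)
    sum_n_z2 = sum(c * 1000.0 * N_A * z**2
                   for c, z in zip(molar_concs, valences))
    return math.sqrt(4.0 * math.pi * l_b * sum_n_z2)


def dh_pair_energy(r, z_i, z_j, k_d, l_b):
    """Point-ion DH pair energy in units of k_B T.

    Screened (Yukawa-type) form z_i z_j l_B exp(-k_D r) / r, which reduces
    to the bare Coulomb energy when k_D = 0.
    """
    return z_i * z_j * l_b * math.exp(-k_d * r) / r


if __name__ == "__main__":
    # Assumed example: 0.1 mol/L aqueous 1:1 electrolyte at 298.15 K
    eps_r, temperature = 78.4, 298.15
    l_b = bjerrum_length(eps_r, temperature)
    k_d = debye_parameter([0.1, 0.1], [1, -1], eps_r, temperature)
    print(f"Bjerrum length: {l_b * 1e9:.3f} nm")
    print(f"Debye length:   {1e9 / k_d:.3f} nm")
    print(f"Pair energy at 1 nm: {dh_pair_energy(1e-9, 1, -1, k_d, l_b):.3f} kT")
```

For these assumed inputs the Bjerrum length comes out at about 0.7 nm and the Debye length at about 1 nm, so the cation–anion attraction at a separation of 1 nm is already reduced well below its bare Coulomb value.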
This is the reason why both mean-field approximations proved to be powerful tools for the interpretation of ionic fluids and, in fact, they continue to be the basis of the theoretical understanding of many phenomena such as ionic and colloidal stability, the thermodynamics of electrolyte solutions and phase transitions in ionic fluids [4]. However, it has to be admitted that most practical applications of the classical mean-field theory of electrolyte solutions are made under conditions where it ought to be inaccurate, due to the approximations involved in its derivation. Kjellander and Mitchell [5] analyzed the origin of this puzzling success and listed possible causes for it, ranging from the occurrence of several cancelling errors in the PB approximation to the fact that, when a PB expression is actually used, fitted values of the system parameters are often employed. Moreover, models are frequently constructed where these parameters can be calculated from the particular details of the system under consideration. This strategy is followed, for example, for fitting the observed forces between charged surfaces in electrolyte solutions, where the real surface charge has to be replaced by a smaller effective one if PB is to be successfully employed. This reduction has been shown by accurate double-layer studies to be related to an overestimation of the repulsion in the PB theory, and not only to counterion condensation [6–8]. Besides, the analysis of the double layer due to Attard et al. [9] confirmed that the double-layer interactions at large separations agree with the predictions of the PB theory as long as the actual surface charge density is replaced by an apparent one. The same can be said of the asymptotic tail of the potential of mean force between ions in a bulk electrolyte solution, where DH theory is valid at large separations if real charges are replaced by their effective values [10]. The classical PB approach is the basis of the DLVO theory, named after Derjaguin et al.
[3]. The PB prediction of a Yukawa-type potential is essential for the interpretation of the effective interaction pair potential between two colloids in the solvent. However, this approach becomes inadequate to describe highly charged objects for which the electrostatic energy of a microion near the colloid surface exceeds the thermal energy, and the linearization of the PB equation is not justified. In this
case, however, the electrostatic potential in exact or mean-field theory still takes the DH-like form far from the charged bodies, provided that the source of the potential is renormalized ($Z \to Z_{\mathrm{eff}}$) [11]. Of course, one is faced with the difficult task of predicting this effective colloid charge. These limitations of the classical mean-field treatment arise from the neglect of ionic correlations, moment condition constraints and high electrostatic coupling, which depend on concentration, ion size and charge asymmetry. A great number of attempts to go beyond the DH theory have been reported, based, for example, on going beyond the linearization of the PB equation [12], the introduction of ionic pairing [13], or the existence of a pseudoreticular structure in the dense regime of an ionic solution [14,15]. Integral equation techniques have also been used to obtain the pair correlation functions from the Ornstein–Zernike (OZ) equation, such as the mean spherical approximation (MSA) [16], its thermodynamically consistent generalization GMSA [17], or their improvements based on cluster resummation techniques, the optimized random phase approximation (ORPA) [18] and the $\Gamma$-ordering theory [19]. Other integral equation techniques have been tested with success, such as the Percus–Yevick type equation [20] and the hypernetted chain equation (HNC) [21]. At the same time, Mayer expansions [22–24] have also been used for the study of ionic systems, based on the infinite resummation of diagrams that account for the long range character of the Coulomb interaction. The theoretical results derived from the above frameworks are usually not directly comparable to experimental data. Despite some controversy about the real nature of computer calculations, it is now generally accepted that they constitute real computer experiments, and that they can replace direct experimental data in most current situations.
Computer simulation (by both Monte Carlo (MC) and molecular dynamics (MD) techniques) has been used since the 1960s in the interpretation of the physics of ionic systems [25–27], providing a testing ground for the validity of the different approaches. In 1970 Card and Valleau [28] carried out MC simulations of 1:1 electrolytes at low concentrations, obtaining the internal energy, osmotic coefficient, constant-volume heat capacity and contact values of the pair correlation functions. In the mid-1970s, Larsen [29] published an MC study of the restricted primitive model (RPM) electrolyte including the molten salt region, and since then an enormous number of MC and MD results have been published in the area of ionic solutions, proving invaluable in discriminating between competing theories and revealing new structural features of electrolyte solutions. More recently, Zhang and coworkers [30] studied a soft ion model of a symmetric 1:1 electrolyte solution by grand canonical MC (GCMC) simulations and compared the results to those derived using conventional HNC integral equation theory, using both the standard Ewald summation method and the so-called minimum image (MI) method. MC calculations have also been reported for the sticky spheres model by Shew et al. [31]. Many efforts have been devoted to the study of the phase behaviour and critical parameters of charge and size symmetric and asymmetric electrolyte solutions (see, for example, Caillol et al. [32] and Yan and de Pablo [33]). MD simulations of several symmetric (1:1) and asymmetric (1:2) electrolyte solutions have been reported by Heinzinger [34] and by Suh et al. [35]. Similarly, a soft sphere model has recently been studied by Zhang et al. [36]. The elegant way in which integral equation techniques describe ionic correlations is somewhat obscured by the inherent difficulties which their solution poses.
This problem has motivated the search for alternative and simpler methods to extend PB theory. Density functional theory (DFT) has traditionally been employed to understand both homogeneous (bulk electrolyte solutions) and inhomogeneous ionic systems [37]. The minimization of the Helmholtz free energy
functional provides a useful technique to include ionic correlations while avoiding the lengthy numerical calculations needed to solve integral equations. Consequently, DFT has usually been used to discuss the validity of different approaches to the physics of ionic systems and to understand ionic phase transitions (for a review of this field see Levin and Fisher in Ref. [38]). Frusawa and Hayakawa formulated a density functional expression for the canonical system of a Coulomb gas, demonstrating that it is dual to the sine-Gordon theory, and analyzed the generalized DH (GDH) equation proposed by Fisher et al. [39,40], proving that it only holds in some special cases. Barbosa [41] recently proposed a local density functional theory that introduces ionic correlations into the PB formalism, using the DH hole-cavity (DHHC) theory to account for the ionic correlations in a local fashion. He was able to predict an increase in condensation when salt is added to a charged colloid. However, to our knowledge, there has been no direct application of DFT to the calculation of effective non-Debye charges and screening lengths of ionic systems. One of the conceptually more fruitful approaches to ionic fluids is that of systematic field theory (FT). It is in this framework that the mean-field PB treatment of charged fluids is adequately contextualized and admits a natural extension treating ionic correlations and fluctuations by means of loop-wise expansions [42–50]. In this context, classical mean-field theory is recovered at the Gaussian or one-loop level [43] and the PB equation is shown to constitute the saddle point of the exact FT, as proven by Netz and Orland [45]. Thus, classical mean-field theory is contextualized as the low-charge, low-concentration limit where fluctuation corrections are negligible.
On the other hand, Kholodenko and Beyerlein [42] demonstrated from field-theoretic perturbation expansions that the corrections to DH theory lead to rescaled ionic charges and screening length. FT demonstrates that the preservation of the classical picture of ionic screening demands the introduction of a non-Debye screening scheme, which is necessary to preserve the mean-field PB-like formalism of ionic fluids at finite concentrations. In addition, excluded volume effects have been accounted for in the field-theoretic treatment, modelling them by means of a short-ranged hard-core repulsion. The results have been applied to the calculation of the thermodynamic properties of the one-component (OCP) and two-component (TCP) plasmas [44]. Netz studied the edl from the FT perspective in the strong coupling regime [48], and reported FT results for the contributions to the van der Waals interaction between two dielectric semi-infinite half-spaces in the presence of mobile salt ions [49]. Despite the great accuracy of the existing sophisticated approaches based on the solution of field equations or the OZ equation, they are usually purely formal or demand numerical computations (with the probable exception of the MSA and GMSA). For this reason, they are of little use for characterizing data and are seldom used in empirical applications. The mean-field picture is the only one that provides tractable expressions for fitting data on the thermodynamic properties of the system, and so its preservation is of fundamental importance. Consequently, it has been extended to situations where it is not strictly applicable, showing a somewhat paradoxical ability to fit empirical data if effective values of the system’s parameters are used [9,51–54]. This apparent success of PB theory has to be supported by exact statistical mechanical theory before one can relate it to fundamental properties of the system.
Consequently, there has been considerable activity during the last decades on reformulations of the exact statistical theory of ionic liquids in the primitive model (PM) [5,55–57]. Their common characteristic is the introduction of effective parameters (charge densities, screening lengths, ...) in the classical PB scheme [58] in order to account for ionic correlations and higher order electrostatic coupling, in accordance with FT predictions. The decay
constant (together with the ionic charges) contains all the information about ionic correlations and allows the construction of the mean-field concentration-dependent interionic potential, so this is the fundamental quantity to be renormalized. Consequently, in recent years there has been a resurgence of interest in the asymptotic decay of spatial correlations. Numerical calculations based on the hypernetted chain (HNC) approximation have been reported [59,60] and several theoretical corrections to Debye’s classical result for the screening length have been proposed, which comprise asymptotic diagrammatic expansions [55,61], self-consistent analytic theories [59], the non-linear DH approach and the MSA [60]. The most successful formalism for going beyond the PB level is the formally exact mean-field-like dressed-ion theory (DIT) [5,55]. This theory casts the exact theory of electrolyte solutions in a linear mean-field PB form, developing the idea of renormalizing the system’s parameters in order to account for ionic correlations. DIT is an exact theory, so it is applicable to all primitive model systems irrespective of the charge and size of their components: symmetric and non-symmetric electrolytes as well as mixtures of colloid particles and small ions and purely colloidal systems. In this framework no distinction is made between ions and colloid particles, as both are assumed to be charged hard spheres dispersed in a uniform dielectric continuum, solely characterized by its dielectric constant $\varepsilon$.
By splitting the ion–ion correlation function $h_{ij}(r)$ and the direct correlation function $c_{ij}(r)$ into a long-range part and a short-range part, following Stell’s hypervertex formalism [19], the conventional parameters (charge, screening length, electric permittivity, ...) undergo a renormalization process and become effective quantities that can be related to the linear response function $\hat{\alpha}(k)$, the DIT functional generalization of Debye’s screening parameter [62,63]. In order to obtain concrete expressions for the effective decay length and renormalized charges, the DIT linear response function has been related to the Bhatia–Thornton static structure factors of the fluid [64]. Through this relation, and by means of an adequate structural model of the fluid, analytical expressions for the linear response function, and consequently for the effective quantities of the system, can be obtained. This is the so-called “DIT route” to the effective quantities, and it requires the use of some equilibrium correlation function. The logical choice would be an equilibrium distribution function that minimizes the analytical complexity or demands only short numerical calculations. Of course, one looks in the first place at the conventional MSA for this purpose, but this closure relation leads to an underestimation of the effective screening constant at low concentrations [65–67]. The problem with other integral equations is that they would demand lengthy numerical calculations, so approximations at the direct-correlation function level have to be made in order to avoid analytical or numerical complexity [41]. For this reason, a modified version of the MSA, the so-called MMSA, was introduced [64]. By neglecting the correlations inside the hard core of the ions, the most probable cause of the low-concentration deficiencies, the MMSA allowed the calculation of the DIT linear response function and correctly fitted the HNC data for the effective screening length throughout the whole concentration range.
Neglecting correlations inside the core is equivalent to assuming a constant potential and direct correlation function in that zone, a hypothesis which has been used in density functional studies of colloidal solutions [68]. A generalized version of the MMSA (GMMSA) was introduced in order to extend the original framework to asymmetric electrolytes. In this theory, specific interactions between ionic species [69] are treated by means of a short-range interaction parameter in the hard-core region. The GMMSA allows the prediction of the HNC screening length of 1:z electrolytes (z = 1, ..., 4) up to the concentration of transition to the oscillatory regime of the pair correlations [69].
These advances in the mean-field theory of ionic solutions have already been applied to the prediction of actual thermodynamic quantities of electrolyte solutions and the electric double layer. Thus, results have been reported for the excess internal energy and osmotic coefficient of symmetric electrolyte solutions using the self-consistent screening lengths [59] and the MMSA [70]. The Helmholtz free energy of the system has been evaluated in terms of the renormalized quantities of exact mean-field theories, and from it the chemical potentials and activity coefficients of ionic systems have been analyzed [71]. The critical behaviour of a system is among the most relevant pieces of thermodynamic information, and its analysis usually demands knowledge of the Helmholtz free energy. The critical behaviour of ionic fluids has been a matter of intensive research during the last decades (for an extensive review see Ref. [72] and references therein). The main results that have been reported in this field employ the usual DH theory corrected for the existence of ionic association and dipole–ion interactions [4]. Therefore, the Debye screening length is employed for modelling the electrostatic interactions. However, ionic systems near criticality are very far from the Debye–Hückel limiting law (DHLL). Given that there is a finite screening length, the critical behaviour of fluids containing charges would have to become Ising-like when the critical point is approached closely enough. The actual value that the screening length assumes might be of crucial importance for the crossover behaviour actually observed in experiments and numerical simulations [73].
The exact computation of the concentration dependence of transport coefficients from first principles is not practical because of the difficulties arising from the solute–solvent interaction, mimicked by random forces giving rise to Brownian motion [74–77], and the complexity of introducing hydrodynamic interactions in Brownian dynamics simulations [78]. This has motivated the use of alternative theoretical frameworks based on hydrodynamic extensions of the DH equilibrium scheme, such as the early linear response Fuoss–Onsager (FO) formalism [79–82], the first treatment of non-equilibrium phenomena in ionic systems, or the Fokker–Planck–Smoluchowski equation combined with HNC or MSA correlation functions. The FO transport formalism is a linear response theory based on the use of hydrodynamic continuity equations, and was originally formulated to extend the DH equilibrium theory to transport processes. In this theoretical framework, the perturbed parts of the total pair correlation, $\delta h_{ij}(r, t)$, and of the total average potential, $\delta\psi_i(r, t)$, are related by means of a Poisson equation, used as a closure relation, and the equilibrium pair correlation is provided by any statistical equilibrium model of the bulk fluid. Onsager and Fuoss obtained limiting laws for the conductance by applying the DH equilibrium results to transport phenomena, later extended to self-diffusion of single ions [81] and ionic mixtures [82]. Since then, progress in this field has been slow and laborious. Due to the close connection between the non-equilibrium behaviour of a system and its equilibrium structure, no significant improvement was registered until more accurate descriptions of the equilibrium were introduced. Onsager et al. made the effort of extending Onsager’s own conductivity results to finite concentrations [82], but the fact that only the DH pair distribution function was available at that time led to limited results.
Soon after the application of the MSA to charged systems in the early 1970s [83], its application to transport processes at finite concentrations was attempted by combining the mean-field FO hydrodynamic formalism with the MSA equilibrium scheme. The restricted primitive model (RPM) was used by Ebeling and coworkers [84,85] to describe the variation of conductance with ionic concentration, computing the relaxation contribution with the aid of MSA distribution functions. The primitive model (PM) description, with no restriction on the ion sizes, was adopted
by Durand-Vidal et al. [78,86–88] to study the dominant forces that determine the deviation from ideal behaviour of transport processes in electrolytes: relaxation and electrophoretic forces. They formulated a linear response formalism in which Onsager’s continuity equations were combined with the MSA equilibrium correlation functions using the Green’s response function formalism. Concentration independent potentials were obtained and the results applied to self-diffusion, acoustophoresis and conductance of strong and associated electrolytes and to micellar systems. Another strategy to obtain transport coefficients at the mean-field level is the combination of the FO transport formalism with the modern, formally exact DIT equilibrium framework. This is the conceptual basis of the dressed-ion transport theory (DITT) [89,90], a formalism that incorporates ionic correlations into the description of transport phenomena by renormalizing the kinetic entities (renormalized charges) and using a non-Debye screening scheme, in the DIT fashion. Varela et al. [89] employed the DIT equilibrium distribution functions to obtain the perturbed electric field acting on an ion due to the distortion of the ionic atmosphere caused by the external field, and completed the formulation of the DITT by evaluating the electrophoretic (hydrodynamic) correction to the ion mobilities [90]. The latter effect is due to the distortion of the velocity field of the liquid around the moving particle, so that neighbouring ions do not move in a stationary medium, and its computation also requires knowledge of the ionic distributions. DITT is an improvement over transport theories previously developed from less accurate pair distributions than those of the DIT.
The derived expressions for the electrophoretic velocity and the relaxation field were used to evaluate the mobility of DITT quasiparticles and the conductance of the electrolyte solution, obtaining an exact reformulation of Onsager’s limiting law of conductance in terms of the concentration-dependent deviations of the renormalized quantities from the bare ones.

The present work is structured in a mainly historical manner, starting in Section 2 with the derivation of the classical linearized PB (LPB) mean-field theory for both homogeneous and inhomogeneous ionic systems, and with the analysis of its thermodynamic implications. Attard’s [91] proof of the exponential decay of ionic correlations is also included. The main extensions of the DH theory are reviewed in this section together with the classical GC treatment of the edl. In Section 3, we examine the main applications of the results of the statistical theory of liquids to ionic fluids, including some cornerstones of the statistical mechanics of bulk electrolytes such as Waisman’s analytic solutions of the MSA for the neutral electrolyte. The thermodynamically consistent extension, the generalized MSA (GMSA), is also studied, and we emphasize that it predicts a behaviour of the charge–charge correlation function similar to that of the MSA. In addition, applications of the PY and HNC integral equations to the theory of electrolyte solutions are briefly presented in the same section, together with a brief report on some MC and MD simulations of ionic fluids. Section 4 is devoted to the analysis of the field-theoretic approach to the theory of electrolyte solutions. In this report, we mainly follow the treatment of Kholodenko and Beyerlein [42], who proved that PB theory is just the Gaussian or one-loop level of the exact FT and that the extension of DH theory demands the introduction of renormalized charges and a non-Debye screening length in the calculations.
The main theoretical formalisms that have been reported throughout the last century for the calculation of the non-Debye parameters are reviewed in Section 5, including analytic asymptotic expansions [55,61], self-consistent approaches [59] and numerical techniques such as non-linear DH, HNC treatments and computer simulations. Section 6 is devoted to the presentation of the formally exact mean-field theory of ionic solutions. We review Kjellander and Mitchell’s [5,55] formulation of the DIT. The modified MSA (MMSA)
is introduced in this section and its formal consistency is explicitly demonstrated, and the effective screening length and renormalized charges of an OCCS are explicitly calculated using the DIT route. The case of asymmetric electrolyte solutions is also studied in this section. For this task a generalization of the successful MMSA is introduced and the HNC asymptotic behaviour of their distribution functions is interpreted in terms of this structural model. The thermodynamic properties arising from the DIT structural model are examined in Section 7. More specifically, the internal energy and osmotic coefficient of a 1:1 RPM electrolyte solution are analyzed in terms of the existence of dressed particles in the fluid. The Helmholtz free energy is also provided in terms of the effective DIT quantities, and the concentration dependence of the activity coefficients of electrolyte solutions is analyzed. The classical edl theory is also revisited in the next section, and the rescaling of the surface charge density of a flat wall is presented. Finally, the application of the DIT structural model to the prediction of transport coefficients (DITT) is introduced in Section 9, with an explicit derivation of the relaxation and electrophoretic corrections to ionic mobility and the formulation of a DITT conductance equation in terms of renormalized kinetic entities.

2. Classical mean-field theory of ionic solutions

The theory of electrolyte solutions and of the electric double layer (edl) has been the object of a huge number of scientific results during the 20th century, owing to the great number of applications in the most diverse areas of basic and applied research and in industry. Progress in this field has been possible mainly because of an adequate knowledge of the interionic interactions.
It was the combination of the interaction potential with the formalisms of electrostatics, statistical mechanics and hydrodynamics that allowed the formulation of classical equilibrium and transport theories of ionic solutions. These theories have been successfully applied to situations where the long-range Coulombic interactions predominate over solvent–solvent, ion–solvent or short-ranged ion–ion forces. The behaviour of ionic solutions is mainly determined by the competition between the thermal motion of ions and the attractive and repulsive interactions between them. Even in highly dilute solutions, the damped oscillatory behaviour of the radial distribution function shows the existence of a short range order in the bulk, as a result of the partial compensation of the thermal motion by interionic interactions. However, the calculation of the equilibrium structure of an ionic system is a highly difficult task from the statistical point of view, due to the particular form of the interaction potential. In particular, the long-range nature of the Coulomb interaction makes a straightforward virial expansion, as used for classical fluids, impossible [92]. One of the earliest efforts to evaluate the ionic distribution functions was undertaken by Debye and Hückel in their classical paper of 1923 [1]. Their results were extremely influential, mainly because more elaborate liquid state theories were not developed until the 1960s and 1970s and a common approach to electrolytes and non-electrolytes was not possible. Besides, DH results are now recognized as the universally valid limiting law for ionic thermodynamic quantities at infinite dilution. In fact, the importance of the DH formalism is still enormous nowadays, when it continues to be the theoretical basis of most practical applications.
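As a rough numerical illustration of the screening on which DH theory rests, the sketch below evaluates the standard Debye length, $\lambda_D = \kappa_D^{-1}$ with $\kappa_D^2 = \sum_j n_j q_j^2 / (\varepsilon k_B T)$, for a fully dissociated aqueous electrolyte. The function name `debye_length`, the relative permittivity of 78.4 assumed for water at 298.15 K and the list of concentrations are illustrative assumptions, not values taken from this report.

```python
import math

# Physical constants in SI units
E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length(molar_conc, valences, temperature=298.15, eps_r=78.4):
    """Return the Debye length in metres (hypothetical helper).

    molar_conc  -- concentration of each ionic species in mol/L
    valences    -- signed valence z_j of each species, same order
    temperature -- absolute temperature in K
    eps_r       -- assumed relative permittivity of the solvent
    """
    # Convert mol/L to number densities n_j in 1/m^3
    densities = [c * 1000.0 * N_A for c in molar_conc]
    # kappa^2 = sum_j n_j q_j^2 / (eps k_B T)
    kappa_sq = sum(n * (z * E_CHARGE) ** 2 for n, z in zip(densities, valences))
    kappa_sq /= eps_r * EPS_0 * K_B * temperature
    return 1.0 / math.sqrt(kappa_sq)


if __name__ == "__main__":
    for c in (0.001, 0.01, 0.1, 1.0):
        lam = debye_length([c, c], [+1, -1])
        print(f"c = {c:5.3f} M  ->  Debye length = {lam * 1e10:6.1f} Angstrom")
```

Under these assumptions the screening length of a 0.001 M 1:1 electrolyte comes out at roughly 96 Å, and it shrinks as the inverse square root of the concentration.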
Debye and Hückel introduced a model of ionic solutions where the ions are treated as point charges which interact by means of a Coulomb potential in a uniform dielectric background. They assumed that the ions are distributed
according to an exponential distribution law (Boltzmann distribution) characteristic of a system in thermal equilibrium with a heat reservoir (the solvent). The key concept in DH theory is the ionic atmosphere, a spatial separation of charge made up of mobile ions which balance the charge of the central ion, and which allows the particular ordering inside the bulk solution to be understood. The spatial range of correlations itself is determined by the size of this charge inhomogeneity. These and other concepts will be briefly summarized in the following sections.

2.1. Primitive model

When facing the study of charged media, one is forced to decide whether or not to take the solvent contribution into account. In a first approximation, when one considers only the physics arising directly from the Coulomb interaction, the solvent molecules can be ignored. This is the option taken by classical electrolyte and double layer theories, and constitutes the so-called primitive model (PM). In this model, the solvent is assumed to form a continuum background where the ions are immersed, and particular details concerning the solvent structure are neglected. The classical picture of a solvent virtually unaltered by the dissolution of the ions is valid only in highly dilute media, where the degrees of freedom of the solvent overwhelmingly dominate those of the ions, so the solvent can be considered as a mere heat reservoir. What remains of the solvent after the smoothing operation is just its permittivity, $\varepsilon$, which is taken to be that of the medium. The ions are assumed to be made of a material with the same dielectric constant as that of the solvent, and in this medium they exhibit a behaviour similar to that of the particles in a real gas. This means that no pressure or temperature effect can be accounted for, since it would require a detailed description of the molecular nature of the solvent.
In particular, a study of the short-range potential of mean force would be needed, a task that requires taking the orientation and interaction of solvent molecules around two ions into account. Despite the inherent difficulties, more recent theories take solvent granularity explicitly into account, together with the ion–solvent and solvent–solvent interactions, allowing a more detailed description of the ionic solution. The most important polar solvent is undoubtedly water. Its characteristics as a solvent can be explained by considering the charge distribution of the molecule and its high dipole moment. As proved by X-ray studies, water molecules are far from adopting a packed structure [93]. The volume per molecule at 298.15 K is about 30 Å³, suggesting a molecular diameter (assuming spherical geometry) of 3.48 Å if a closely packed structure were adopted. This is far from the 2.9 to 3.05 Å detected in X-ray scattering experiments. In fact, in the temperature range from 273.15 to 353.15 K, between 4.4 and 4.9 nearest neighbours are detected instead of the 12 neighbours characteristic of close packing. However, liquid water exhibits a short range order which results in a marked primary maximum in the radial distribution function at about 3 Å, and a secondary maximum at 4.5 Å. The last decade has witnessed remarkable progress in this classical understanding of the properties of water. X-ray and neutron scattering, together with simulation techniques, have been used to establish its structure from the deeply supercooled liquid and amorphous thermodynamic states to the supercritical state (see Ref. [94] and references therein). These results provide a picture of water as a liquid with different degrees of tetrahedrality, depending on its thermodynamic state.
Near ambient temperatures and below, water exists in some sort of hybrid state between a low density form with an open, tetrahedral structure and a high density form with a significant degree of non-tetrahedral hydrogen bonds [94].
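The packing estimate quoted above for water can be checked with a few lines of arithmetic. The sketch below assumes a molar mass of 18.015 g/mol and a density of 0.997 g/cm³ at 298.15 K (neither value is stated in this report), and takes "closely packed" to mean a face-centred cubic arrangement of equal spheres, for which the volume per sphere is $d^3/\sqrt{2}$, with d the nearest-neighbour distance. The function names are hypothetical.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, 1/mol


def volume_per_molecule(molar_mass=18.015, density=0.997):
    """Volume per molecule in cubic Angstrom.

    molar_mass in g/mol and density in g/cm^3; the defaults are assumed
    values for liquid water near 298 K.
    """
    volume_cm3 = molar_mass / (density * N_A)
    return volume_cm3 * 1e24  # 1 cm^3 = 1e24 cubic Angstrom


def close_packed_distance(volume):
    """Nearest-neighbour distance (Angstrom) for fcc packing, V = d^3 / sqrt(2)."""
    return (math.sqrt(2.0) * volume) ** (1.0 / 3.0)


if __name__ == "__main__":
    v = volume_per_molecule()
    d = close_packed_distance(v)
    print(f"volume per molecule       = {v:.1f} cubic Angstrom")
    print(f"close-packed neighbour d  = {d:.2f} Angstrom")
```

With these inputs the volume per molecule is close to 30 Å³ and the close-packed nearest-neighbour distance is about 3.49 Å, in line with the figure quoted in the text and larger than the 2.9 to 3.05 Å observed in X-ray scattering.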
The equilibrium between these high and low density forms is affected to different degrees by the introduction of solutes, depending on whether the solute is hydrophilic or hydrophobic. This picture is radically modified by the introduction of ions that have charges comparable to those responsible for the dipole moment of the solvent molecules and that are approximately the same size. Under these circumstances, according to the classical theory of Frank and Evans [95], the solvent molecules adopt an iceberg-like structure around the ions (hydration), lowering the entropy of the medium. Recent studies by A. Soper and coworkers on the hydrophobic hydration of non-polar molecules (see for example Ref. [96] and references therein) are changing this conventional view. According to these results, the response of water to the presence of the hydrophobic molecules of the solute is not an iceberg-like ordering of the hydration shell of the non-polar group, as followed from the standard model; instead, a compression of the second shell is observed. This compression is accompanied by a sharpening of the second-neighbour water correlations, and this reduction of the structural freedom of water contributes to the entropic driving force of the hydrophobic interaction. In highly dilute solutions, however, the solvent remains almost unaltered during the dissolution process. If we assume that the ions adopt a simple cubic structure in solution, the interionic distance is 94 Å for an ionic concentration of 0.001 M. Thus, in the space between any two ions there exists a number of molecules high enough to consider the medium as a continuum. This situation persists up to concentrations of the order of 1 M, at which only two or three solvent molecules exist between two given ions.
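The interionic distances quoted above follow from simple arithmetic. As a minimal sketch, assume a fully dissociated 1:1 electrolyte (two ions per formula unit) with every ion occupying one site of a simple cubic lattice, so that the nearest-neighbour distance is the cube root of the volume per ion. The function name and the 3 Å size used for a water molecule are illustrative assumptions.

```python
N_A = 6.02214076e23  # Avogadro constant, 1/mol


def interionic_distance(molar_conc, ions_per_formula=2):
    """Simple cubic nearest-neighbour distance between ions, in Angstrom.

    molar_conc       -- electrolyte concentration in mol/L
    ions_per_formula -- ions released per formula unit (2 for a 1:1 salt)
    """
    ions_per_m3 = molar_conc * 1000.0 * N_A * ions_per_formula
    # Each ion owns a cube of volume 1/n; its edge is the lattice spacing
    return (1.0 / ions_per_m3) ** (1.0 / 3.0) * 1e10


if __name__ == "__main__":
    water_size = 3.0  # assumed size of a water molecule, Angstrom
    for c in (0.001, 0.01, 0.1, 1.0):
        d = interionic_distance(c)
        print(f"c = {c:5.3f} M -> d = {d:5.1f} Angstrom "
              f"(about {d / water_size:4.1f} water molecule sizes)")
```

Under these assumptions the spacing is about 94 Å at 0.001 M and about 9.4 Å at 1 M, which is only about three times the assumed size of a water molecule.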
Throughout this study we shall assume that any details of solvent structure, apart from its dielectric constant, can be ignored and, therefore, that the ions are immersed in a structureless background where they interact in the medium via Coulomb’s law:

\[ \phi_{ij}(r) = \frac{q_i q_j}{4\pi\varepsilon r_{ij}}, \qquad (1) \]

where $q_i$ is the charge of an ion of species i and $r_{ij}$ is the separation between the interacting ions. These are the basic features of the PM. This model involves two levels of approximation: first, the Born–Oppenheimer procedure for averaging out the electron contributions, and second, the McMillan–Mayer theory of solutions, which integrates out the solvent degrees of freedom. As we pointed out previously, incorporating the solvent via the dielectric permittivity is asymptotically exact. In the regime of intermediate concentrations, deviations from Coulomb behaviour are detected, due to the existence of repulsive cavity terms associated with exclusion effects of the solvent and many-body contributions due to ionic polarizability [15,91]. At higher concentrations, oscillatory behaviour is registered, associated with the molecular size of the solvent, as well as solvent-induced many-body interactions [97]. These effects are totally neglected in the PM, where the average Coulomb law is assumed at all separations. In the PM, the ions are stabilized against collapse by a short-range repulsion which adds to the electrostatic interaction, recognizing that the ions have finite size. The most usual choice for this interaction is the hard sphere potential with additive diameters:

\[ \phi^{0}_{ij}(r) = \begin{cases} \infty, & r < \sigma_{ij} = \tfrac{1}{2}(\sigma_i + \sigma_j), \\ 0, & r > \sigma_{ij}, \end{cases} \qquad (2) \]

where $\sigma_i$ is the hard sphere diameter of species i. This diameter includes approximately the first solvation shell, and is therefore larger than the crystallographic bare ion diameter. The PM mimics the short range
repulsions due to the Pauli exclusion of electrons by means of a hard core, preventing molecular overlap. The most commonly used version of the PM is the restricted PM (RPM), which assumes that the ions are charged hard spheres of equal size and opposite charges, and this is the version we shall use in the remainder of this report.

2.2. Mean-field assumption: the Poisson–Boltzmann equation

The whole classical theory of ionic fluids is built on the mean-field Poisson–Boltzmann equation. This equation relies upon the assumption that the ions in an ionic solution interact through averaged electrostatic potentials that obey the rules of classical electrostatics. Although these assumptions are far from obvious at a microscopic level, they have proved highly useful for understanding ionic behaviour. The PB equation constitutes the starting point for the DH theory of electrolyte solutions and for the GC (or non-linear PB) double layer theory. The main objective of the PB equation is to calculate the average electrostatic potential acting on a given ion i, $\bar{\psi}_i(r)$, created by an ensemble of charged particles statistically distributed in a certain spatial region. The homogeneous case corresponds to electrolyte solutions, whereas the inhomogeneous one corresponds to the edl, a spatial distribution of charge in the neighbourhood of a wall (colloid) immersed in an ionic solution. Classical theory unifies the treatment of both systems in a mean-field framework, where each ion interacts with a number of neighbours large enough to validate the mean-field hypothesis. The main idea is to focus on one ion in the system and to assume that the role of the neighbouring particles is to create an average or molecular field that adds to the external field, and in which the tagged particle fluctuates like a free particle [98]. In the remainder of this section, we shall analyze in detail the mathematical and physical consequences of this hypothesis in the framework of electrolyte theory.

2.2.1. Debye–Hückel theory: screening of the ionic correlations

Let us consider a mixture of s ionic species of charges $q_j = z_j e$ and number densities $n_j = N_j/V$, where $N_j$ is the number of ions of species j and V is the total volume of the system, immersed in a polar solvent of dielectric constant $\varepsilon$. Long-range Coulombic interactions act on all the ionic pairs in solution, attenuating the thermal disorder. The existence of these interactions gives rise to a form of structural order in the neighbourhood of the ions, in the form of an inhomogeneous spatial distribution of charge balancing the charge of the central ion. In this region, known as the ionic atmosphere, there is an excess of ions of opposite charge to that of the central ion with respect to the equilibrium concentration. This concept of an ionic atmosphere condenses the greater part of the intuition of Debye and Hückel on the structure of ionic fluids. A statistical description of this structure general enough to treat both the equilibrium and dynamical cases demands the introduction of $g_{ij}(\mathbf{r}_1, \mathbf{r}_2, t)$, the time-dependent pair distribution function, defined as the conditional probability of finding, at a time t, a particle of species j in an element $d\mathbf{r}_2$ around $\mathbf{r}_2$ when an ion of species i is in the element $d\mathbf{r}_1$ in the neighbourhood of $\mathbf{r}_1$. Knowledge of this function allows the calculation of the thermodynamic and transport properties of the fluid. For an isotropic equilibrium electrolyte solution the correlation functions depend upon $|\mathbf{r}_1 - \mathbf{r}_2| = r$ only, so we can neglect every time-dependent effect, particularly Brownian motion of thermal origin. However, to study equilibrium and transport processes in a unified manner, we have to allow for the existence of an external force acting on the system. Under these circumstances, the system is
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111
not in equilibrium and one has to take into account the time dependence of the pair correlations and treat them hydrodynamically. This point of view will be extremely useful for studying the transport properties of ionic solutions in Section 9. The hydrodynamic continuity equation can be written for the radial distribution functions as [99,100]

$$-\frac{\partial g_{ij}(\mathbf{r}_1, \mathbf{r}_2, t)}{\partial t} = \nabla_1 \cdot (g_{ij}\mathbf{v}_{ij}) + \nabla_2 \cdot (g_{ji}\mathbf{v}_{ji}), \tag{3}$$

where $\mathbf{v}_{ij}$ is the velocity of ion $j$ in the neighbourhood of ion $i$, and $\nabla_1$ and $\nabla_2$ denote the gradients with respect to the coordinates of ions $i$ and $j$, respectively. The above equation relates the structure of the fluid to the motion of its particles. This motion is mainly due to three phenomena:

(1) Forces acting on ions: interionic interactions (Coulombic and concentration gradients) and external perturbations.
(2) Brownian motion of thermal origin.
(3) Diffusion of the solution as a whole.

Let $\omega_j$ be the mobility of the central ion, so that the Coulombic force acting on this particle due to ion $i$, $\mathbf{F}_{ij}$, provides it with velocity $\mathbf{v}_j = \omega_j \mathbf{F}_{ij}$. The existence of a concentration gradient, $\nabla g_{ij}$, in the bulk generates an ionic flux $-D_j \nabla g_{ij} = -k_B T \omega_j \nabla g_{ij}$, where $D_j = k_B T \omega_j$ is the diffusion coefficient of ion $j$. On the other hand, this flux is given by $\mathbf{v}_{ij} g_{ij}$, so the velocity of the central ion due to inhomogeneities in concentration is given by

$$\mathbf{v}_{ij} = -k_B T \omega_j \frac{\nabla_2 g_{ij}}{g_{ij}} = -k_B T \omega_j \nabla_2 \ln g_{ij}. \tag{4}$$
Adding the drift velocity of the solution as a whole at the position of particle $j$, $\mathbf{V}(\mathbf{r}_j)$, to the two previous contributions to $\mathbf{v}_{ij}$ we get

$$\mathbf{v}_{ij} = \mathbf{V}(\mathbf{r}_j) + \omega_j (\mathbf{F}_{ij} - k_B T \nabla_2 \ln g_{ij}). \tag{5}$$

In equilibrium $\mathbf{V}(\mathbf{r}_j) = 0$, and the distribution functions depend upon $r = |\mathbf{r}_1 - \mathbf{r}_2|$ only, as a consequence of the medium being isotropic. In this situation the ions have zero average velocities, so Eq. (5) leads to

$$\mathbf{F}_{ij} = k_B T \nabla_2 \ln g_{ij}. \tag{6}$$

Using Kirkwood's potential of mean force, $A_{ij}$, defined by [101,102]

$$\mathbf{F}_{ij} = -\nabla_j A_{ij} \tag{7}$$

we can write for the equilibrium radial distribution function:

$$g_{ij}(r) = e^{-\beta A_{ij}(r)}, \tag{8}$$

where $\beta = 1/(k_B T)$. The above equation constitutes the so-termed Boltzmann distribution, and links the equilibrium ionic distribution to the interionic potential energy. Unfortunately, the calculation of the potential of mean force demands previous knowledge of the fluid structure, so one is trapped in a vicious circle.
To solve this problem, Debye and Hückel introduced the average electrostatic potential created by the rest of the ions in the neighbourhood of ion $j$, $\bar{\psi}_j(r)$, a quantity that results from the screening of the Coulombic potential of the central ion, $\phi_j(r) = q_j/4\pi\varepsilon r$. $\bar{\psi}_j(r)$ may be expressed as [101]

$$\bar{\psi}_j(r) \int \cdots \int e^{-\beta U(r^N)} \prod_{l \neq j} dr_l^{N_l} = \int \cdots \int \psi_j(r)\, e^{-\beta U(r^N)} \prod_{l \neq j} dr_l^{N_l}, \tag{9}$$

where $\psi_j(r) = \sum_i (q_i q_j / 4\pi\varepsilon r_{ij})$, $U(r^N)$ is the total potential energy of the $N$ particles in the system, and the integral extends over the configuration space of all the particles except the central ion. It is easily demonstrated that the above potential satisfies a Poisson equation:

$$\nabla^2 \bar{\psi}_j(r) = -\frac{\bar{\rho}_j(r)}{\varepsilon} \tag{10}$$

with an average charge density, $\bar{\rho}_j(r)$:

$$\bar{\rho}_j(r) \int \cdots \int e^{-\beta U(r^N)} \prod_{l \neq j} dr_l^{N_l} = \int \cdots \int \rho_j(r)\, e^{-\beta U(r^N)} \prod_{l \neq j} dr_l^{N_l}. \tag{11}$$
One of the main assumptions of the DH theory is that the Coulomb interaction overwhelmingly dominates other interionic forces, so the mean force between two ions of species $i$ and $j$ is mainly of an electrostatic type. This hypothesis implies that all short ranged ionic correlations and higher order electrostatic couplings are neglected in DH theory. On the other hand, the Coulomb potential in the bulk was substituted by the previously introduced mean electrostatic potential, which comprises the effect of the whole medium, so the potential of mean force between ions $i$ and $j$ can be expressed as the charge of ion $i$ times the mean electrostatic potential created by ion $j$: $A_{ij} \simeq q_i \bar{\psi}_j$. A major consequence of this hypothesis is that every ion fluctuates in an effective potential created by the rest of the ions, so this assumption is of a mean-field type. As is well known, the validity of a mean-field theory is related to the number of neighbours that interact with a given particle. The long range character of the Coulomb interaction guarantees that in the case of an ionic fluid this number is fairly high, and this is the main reason why DH theory works well and gives quantitatively accurate results in many cases. Indeed, the DH method becomes asymptotically exact in the limit of low coupling, as we shall see below. The charge density at a given point of the fluid can be expressed in terms of the radial distribution function as

$$\rho_j(r) = \sum_{l=1}^{s} n_l q_l g_{lj}(r). \tag{12}$$

Substituting the above equation in (10), and using Eq. (8) and the mean-field DH hypothesis, $A_{ij} \simeq q_i \bar{\psi}_j$, we get

$$\nabla^2 \bar{\psi}_j = -\frac{1}{\varepsilon} \sum_{l=1}^{s} n_l q_l e^{-\beta q_l \bar{\psi}_j} \tag{13}$$
that is the so-called PB equation for the average electrostatic potential created by a statistically distributed ensemble of ions. This equation is of extreme importance, constituting the basis of the mean-field theory of ionic solutions, and it is practically impossible to count the number of theoretical, experimental and numerical results based on it. The mean-field approximation is the main hypothesis involved in the derivation of the above result, but it is not the only one. The PB equation is a non-linear differential equation, and the analytical complexities involved in its resolution are the main reason for introducing another approximation to allow the linearization of the PB equation. In the low coupling limit, the electrostatic potential energy is much lower than the thermal energy, so $\beta q_k \bar{\psi}_j \ll 1$. This hypothesis is known as the hot and dilute plasma approximation, and it is valid in the low charge density and/or high temperature regimes. Under these circumstances, one can expand the exponential in the PB equation and retain just the first order terms to get

$$\nabla^2 \bar{\psi}_j = k_D^2 \bar{\psi}_j, \tag{14}$$

which is the linearized PB equation (LPB). The electroneutrality condition, $\sum_{k=1}^{s} n_k q_k = 0$, has been used in the derivation of this result, and Debye's parameter has been introduced:

$$k_D^2 = \frac{\beta}{\varepsilon} \sum_{k=1}^{s} n_k q_k^2. \tag{15}$$
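To give a feeling for the magnitudes involved in Eq. (15), the following minimal sketch evaluates Debye's parameter in SI units. It assumes that $\varepsilon$ is the absolute permittivity of the solvent, written here as `eps_r * EPS_0`; the function name `debye_parameter` and the illustrative input values (a 0.01 M 1:1 salt in water, relative permittivity 78.4, 298.15 K) are assumptions made for this example, not data taken from the report.

```python
import math

# Physical constants in SI units
E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_parameter(molarities, valences, eps_r, temperature):
    """Return k_D in 1/m from Eq. (15): k_D^2 = (beta/eps) * sum_k n_k q_k^2."""
    beta = 1.0 / (K_B * temperature)
    eps = eps_r * EPS_0                  # absolute permittivity (assumed meaning of eps)
    total = 0.0
    for molarity, valence in zip(molarities, valences):
        n_k = molarity * 1000.0 * N_A    # mol/L -> ions per cubic metre
        q_k = valence * E_CHARGE
        total += n_k * q_k ** 2
    return math.sqrt(beta * total / eps)


# Illustrative (assumed) case: 0.01 M 1:1 electrolyte in water at 298.15 K
k_d = debye_parameter([0.01, 0.01], [+1, -1], eps_r=78.4, temperature=298.15)
debye_length_nm = 1e9 / k_d
print(f"k_D = {k_d:.3e} 1/m, Debye length = {debye_length_nm:.2f} nm")
```

Under these assumed inputs the screening length $1/k_D$ comes out at roughly 3 nm, and since $k_D \propto \sqrt{n}$ it halves when the concentration is multiplied by four.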
For point ions, the solution of the LPB equation is given by

$$\bar{\psi}_j(r) = A \frac{e^{-k_D r}}{r} + B \frac{e^{k_D r}}{r}. \tag{16}$$
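As a quick numerical sanity check of Eq. (16), the sketch below verifies by finite differences that both the decaying and the growing branches satisfy the spherically symmetric form of the LPB equation, $(1/r)\,d^2(r\bar{\psi})/dr^2 = k_D^2 \bar{\psi}$. The value of `K_D`, the step `h` and the sampled radii are arbitrary choices made for illustration.

```python
import math


def radial_laplacian(psi, r, h=1e-3):
    """Spherically symmetric Laplacian, (1/r) d^2(r*psi)/dr^2, by central differences."""
    def u(x):
        return x * psi(x)
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)


K_D = 1.3  # assumed Debye parameter, arbitrary inverse-length units


def decaying(r):
    return math.exp(-K_D * r) / r


def growing(r):
    return math.exp(K_D * r) / r


residuals = []
for psi in (decaying, growing):
    for r in (0.5, 1.0, 2.0, 4.0):
        lhs = radial_laplacian(psi, r)
        rhs = K_D ** 2 * psi(r)
        residuals.append(abs(lhs - rhs) / abs(rhs))

max_residual = max(residuals)
print(f"largest relative residual: {max_residual:.2e}")
```

Both branches pass the check, which is why a boundary condition at infinity is needed to discard the growing one.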
The only solutions which satisfy the boundary condition $\bar{\psi}_j(\infty) = 0$ are those with $B = 0$, so the valid solutions are screened Coulomb potentials of the Yukawa type:

$$\bar{\psi}_j(r) = A \frac{e^{-k_D r}}{r}. \tag{17}$$

As we can see in this equation, this potential is concentration dependent, reflecting its many-body nature. The whole effect of the rest of the system on the central ion $j$ is contained in the Debye constant, whose inverse is a measure of the range of the interaction and, consequently, of the size of the ionic atmosphere. For highly dilute media, we can expand $\bar{\psi}_j(r)$ in a series in $k_D r$ and retain just the lowest order terms:

$$\bar{\psi}_j(r) \simeq \frac{A}{r}. \tag{18}$$

In this low concentration limit, $\bar{\psi}_j(r)$ must recover the potential corresponding to an isolated point ion, so $A$ must be equal to $q_j/4\pi\varepsilon$. Therefore,

$$\bar{\psi}_j(r) = \frac{q_j}{4\pi\varepsilon} \frac{e^{-k_D r}}{r}. \tag{19}$$

This is the main result of DH theory and it was a major scientific achievement. The above equation states that the potential created by an ion immersed in a charged medium is of the Yukawa type. The net effect of the charged medium on the Coulomb interaction is the screening term $e^{-k_D r}$, which reduces the range of the interaction. It is worth pointing out that this average electrostatic
potential is not a real pair potential, since it depends on concentration through the parameter $k_D$. This is the main consequence of the mean-field character of the DH theory. Using the definition of the total correlation function between species $i$ and $j$, $h_{ij}(r)$,

$$h_{ij}(r) = g_{ij}(r) - 1 = e^{-\beta A_{ij}(r)} - 1 \tag{20}$$

and expanding the exponential on the right-hand side of the above equation we get for the DH total correlation function:

$$h^{DH}_{ij}(r) \simeq -\beta q_i \bar{\psi}_j = -\frac{\beta q_i q_j}{4\pi\varepsilon r} e^{-k_D r}. \tag{21}$$

Once more, this result points out the close connection between interaction and structure in DH theory, as all the structural information is contained in the interaction. This is the great contribution of DH theory, and its formal beauty and depth are evident. The statistical mechanical meaning of DH theory is further clarified if one derives it from the Ornstein–Zernike (OZ) equation. The relation between the direct correlation function, $c_{ij}(r)$, and the total correlation function, $h_{ij}(r)$, is given by the OZ equation [103],

$$h_{ij}(r) = c_{ij}(r) + \sum_k n_k \int d\mathbf{r}'\, c_{ik}(|\mathbf{r} - \mathbf{r}'|)\, h_{kj}(r'). \tag{22}$$

The integral is taken over the whole space and the sum is over all the species (ionic or colloidal) present. Eq. (22) is the fundamental equation of the theory of liquids, and in Fourier space it is written as

$$\hat{h}_{ij}(k) = \hat{c}_{ij}(k) + \sum_l n_l \hat{c}_{il}(k) \hat{h}_{lj}(k), \tag{23}$$
where $\hat{f}(k)$ denotes the Fourier transform of $f(r)$, defined by

$$\hat{f}(k) = \int d\mathbf{r}\, f(r)\, e^{-i\mathbf{k}\cdot\mathbf{r}},$$

and we have used that the Fourier transform of a convolution of functions is given by [104]

$$\widehat{(f * g)}(k) = \hat{f}(k)\, \hat{g}(k).$$

At low concentrations we can approximate the direct and total correlation functions by [102,103]

$$c_{ij}(r) = f_{ij}(r) = e^{-\beta \phi_{ij}(r)} - 1 \simeq -\beta q_i q_j \phi(r), \qquad h_{ij}(r) = e^{-\beta A_{ij}(r)} - 1 \simeq -\beta A_{ij}(r), \tag{24}$$

where we have introduced the two-body Mayer function, $f_{ij}(r)$ [103], and $\phi(r) = 1/(4\pi\varepsilon r)$. This approximation of the direct correlation function corresponds to the lowest order of the two particle correlation function [102]. Assuming this form of the pair correlation is equivalent to retaining only the pair interaction and neglecting all the effects of the rest of the medium on the interaction. Substituting the above expressions in the OZ equation leads to:

$$-\beta A_{ij}(r) = -\frac{\beta q_i q_j}{4\pi\varepsilon r} + \beta^2 \sum_l n_l \int d^3 r'\, \frac{q_i q_l}{4\pi\varepsilon r'}\, A_{lj}(|\mathbf{r} - \mathbf{r}'|). \tag{25}$$
Let us assume a potential of mean force of the form:

$$A_{ij}(r) = \frac{q_i q_j}{4\pi\varepsilon}\, w(r), \tag{26}$$

where $w(r)$ is a function independent of the ionic species to be determined. Fourier transforming the above equations and inserting the Fourier transform of the electrostatic potential,

$$\hat{\phi}_{ij}(k) = \frac{q_i q_j}{\varepsilon k^2} \tag{27}$$

in the Fourier transform of Eq. (25), we get

$$\hat{w}(k) = \frac{4\pi}{k^2 + k_D^2}. \tag{28}$$

Inverting the Fourier transform in this equation and integrating over the angular variables we get:

$$w(r) = \frac{4\pi}{(2\pi)^2\, i r} \int_{-\infty}^{\infty} dk\, \frac{k\, e^{ikr}}{k^2 + k_D^2}. \tag{29}$$

The integral in this equation is easily calculated by extending the $k$ values into the complex plane and applying the residue theorem [105] to an integration contour in the upper half plane, obtaining:

$$w(r) = \frac{e^{-k_D r}}{r}, \tag{30}$$

from which one recovers the DH potential of mean force:

$$A_{ij}(r) = \frac{q_i q_j}{4\pi\varepsilon r}\, e^{-k_D r}. \tag{31}$$
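The Fourier pair behind this result can be checked numerically in the forward direction, which avoids the slowly converging oscillatory inverse integral. With the transform convention $\hat{f}(k) = \int d\mathbf{r}\, f(r) e^{-i\mathbf{k}\cdot\mathbf{r}}$, a radial function transforms as $\hat{f}(k) = (4\pi/k)\int_0^\infty r \sin(kr) f(r)\, dr$, so the Yukawa form $e^{-k_D r}/r$ should give $4\pi/(k^2 + k_D^2)$. The sketch below is a plain trapezoidal estimate; the cutoff `r_max`, the number of steps and the sampled wavenumbers are assumptions chosen for illustration.

```python
import math


def yukawa_transform(k, k_d, r_max=60.0, steps=60_000):
    """3D Fourier transform of exp(-k_d*r)/r for a radial function:
    (4*pi/k) * integral_0^inf sin(k*r) * exp(-k_d*r) dr, by the trapezoidal rule."""
    h = r_max / steps
    total = 0.0
    # The integrand vanishes at r = 0 and is negligible at r_max,
    # so only the interior points contribute.
    for i in range(1, steps):
        r = i * h
        total += math.sin(k * r) * math.exp(-k_d * r)
    return 4.0 * math.pi * total * h / k


k_d = 1.0  # assumed Debye parameter, arbitrary inverse-length units
for k in (0.5, 1.0, 3.0):
    numeric = yukawa_transform(k, k_d)
    closed_form = 4.0 * math.pi / (k * k + k_d * k_d)
    print(f"k = {k}: numeric = {numeric:.6f}, closed form = {closed_form:.6f}")
```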
As pointed out in the previous part of this section, DH theory predicts the screening of ionic interactions in the bulk solution, and the decay length is determined by the ion concentration. At this stage, a non-specialist in the physics of ionic fluids could think that this result may be an artifact of the approximations involved in the derivation of the DH theory. However, as shown by Attard [91], exponential decay of the pair correlations is a completely general result for ionic systems, where "exponential" can mean both monotonic and damped sinusoidal behaviour. This is the most characteristic property of ionic fluids (apart from the Coulomb interaction, obviously). Thus, the total correlation function is of shorter range than the pair interaction, a fact which contrasts markedly with what happens in fluids with integrable power law potentials, where the total correlation function decays at the same rate as the pair potential [19,106,107]. The situation is also different from that of a system with infinitely short ranged pair potentials, such as the hard sphere fluid or the Gaussian fluid, where the total correlation function is of longer range than the pair potential. The argument of Attard in Ref. [91] is of great interest and we shall briefly summarize its main features here. The basis of the derivation relies upon the exact closure of the OZ equation [103]:

$$h_{ij}(r) = -1 + \exp[-\beta \phi_{ij}(r) - c_{ij}(r) + h_{ij}(r) + b_{ij}(r)], \tag{32}$$

where $b_{ij}(r)$ is the bridge function. The condition of integrability of $h_{ij}(r)$ demands that this function decays to zero for large $r$, and, therefore, the right-hand side of the above equation must exhibit the
same behaviour. Consequently, linearizing the exponential in the limit $r \to \infty$ and neglecting terms that decay as the square of the pair correlation functions, we get

$$h_{ij}(r) = -c^0_{ij}(r) + h_{ij}(r) + b_{ij}(r), \tag{33}$$

where

$$c^0_{ij}(r) = c_{ij}(r) + \beta \phi_{ij}(r) \tag{34}$$

is the short range part of the direct correlation function of ions of species $i$ and $j$, a quantity that is of shorter range than the direct correlation function of the fluid. Eq. (33) implies that, asymptotically, the short range part of the direct correlation function coincides with the bridge function. Let us suppose that $h_{ij}(r)$ decays at least as an integrable power law, $h(r) \sim r^{-\eta}$, $\eta > 3$. Since the bridge function consists of diagrams comprised of $h$ bonds and there are no nodal points between the root points [103], at least two $h$ bonds must bridge between the root points and, therefore, the individual diagrams of $b_{ij}(r)$ must decay at least as fast as the square of $h_{ij}(r)$. Taking into account that the range of the individual binodal diagram does not change as the number of field points increases, one can conclude that the bridge function itself decays as the square of the total correlation function. By virtue of Eq. (32) the short range part of the direct correlation function must decay at the same rate as $b_{ij}(r)$ when $r \to \infty$, which ensures that the first moments of $c^0_{ij}(r)$ exist. Thus, if $h_{ij}(r)$ decays as an integrable power law, then $c^0_{ij}(r) \sim O[h(r)^2]$, $r \to \infty$. From this result one can show by induction on the moments that if $h_{ij}(r)$ is exponentially decaying then all the moments of $c^0_{ij}(r)$ exist and, consequently, it is also exponentially decaying. Finally, from this result and from the existence of the zeroth moment of the total correlation function, $\hat{h}_{ij}(0)$, which is a consequence of the fundamental assumption of integrability of the total correlation function, it is straightforward to demonstrate that $h_{ij}(r)$ must be exponentially decaying. The proof is again by induction on the moments. Let us equate the coefficients of $k^{2n}$ in the small-$k$ Taylor expansion of the OZ equation:

$$\mathbf{H}^{(2n)} = \mathbf{C}^{0(2n)} + [\mathbf{H}^{(2n)} \mathbf{C}^{0(0)} + \mathbf{H}^{(2n-2)} \mathbf{C}^{0(2)} + \cdots + \mathbf{H}^{(0)} \mathbf{C}^{0(2n)}] + \mathbf{H}^{(2n+2)} \mathbf{Q}, \tag{35}$$

where $\mathbf{A}$ denotes a matrix, and $Q_i = q_i$ is an $s \times 1$ matrix (a column vector) whose components are the charges of the ions of the different species in the bulk. Assume that all moments of the total correlation functions

$$h^{(2n)} = \frac{4\pi(-1)^n}{(2n+1)!} \int_0^\infty h(r)\, r^{2n+2}\, dr \tag{36}$$

exist for $n \le 2m$, where $m > 1$ is an integer. Since $c^0_{ij}(r)$ is either exponential or at least as short ranged as $h(r)$, all the moments of the short range part of the direct correlation function also exist for $n \le 2m$. Therefore it follows from Eq. (35) that $\mathbf{H}^{(2m+2)}$ must be finite. Since $\mathbf{H}^{(0)}$ exists by fundamental assumption ($h(r)$ is integrable, as we have pointed out previously), all the moments of the correlation function exist and this function must be at least exponentially decaying. This result proves that the screened form of the interaction and pair correlations derived in the simple DH formalism is far from being an artifact. Instead, it is somewhat surprising that such a simplified theoretical framework includes the main physical features of screening in ionic fluids. Knowing the DH mean electrostatic potential, one can obtain the ionic correlation functions and the whole system's thermodynamics, in what has been called the "microthermodynamics" of the system
[102]. There are three main routes for the calculation of the thermodynamic properties of a physical system from its correlation functions: (1) the energy equation, (2) the virial equation, (3) the compressibility equation. In systems with short range interactions, the knowledge of the interaction between any two particles is enough to obtain the thermodynamic properties, and one seldom must take higher order correlations into account [102]. Due to the long range nature of the Coulomb interaction, this argument no longer seems valid for ionic systems. However, the screening of the ionic interaction due to the rearrangement of the ionic charge in the neighbourhood of an ion reduces the range of the interaction in such a way that all the thermodynamic properties are determined by pair interactions, at least in the low concentration regime. DH thermodynamics are usually obtained from the energy equation, which relates the internal energy of the system to the microscopic interactions between its particles [103]:

$$U = \frac{3}{2} N k_B T + \frac{1}{2} \sum_i \sum_j n_i n_j \int d^3 r\, \phi_{ij}(r)\, g_{ij}(r), \tag{37}$$

where $N$ is the total number of particles in the system and the term $\frac{3}{2} N k_B T$ constitutes the contribution of the translational degrees of freedom. The second term on the right-hand side of the above equation contains the effect of the interactions. The sum is taken over all pairs of species in the medium, and the integral extends over the whole volume of the system. The factor $1/2$ avoids counting twice the interaction between each pair of particles. For an isotropic system, the pair correlations and the interparticle potentials depend on interparticle distance only. Averaging out the angular coordinates in the above equation we get:

$$U = \frac{3}{2} N k_B T + 2\pi \sum_i \sum_j n_i n_j \int_0^\infty r^2\, dr\, \phi_{ij}(r)\, g_{ij}(r). \tag{38}$$

In the DH formalism, the radial distribution function is given by Eq. (21). Substituting this expression in the above equation, and using the usual expression of the Coulomb interaction, we obtain the excess internal energy of the charged fluid, $U^{ex} = U - U^{id}$:

$$\frac{\beta U^{ex}}{V} = -\frac{k_D^3}{8\pi}. \tag{39}$$
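The step from Eq. (38) to Eq. (39) can be reproduced numerically. The sketch below is an illustration under stated assumptions: it treats the interaction term of Eq. (38) as an energy per unit volume, writes $g_{ij} = 1 + h_{ij}$ with $h_{ij}$ taken from Eq. (21), and drops the constant part of $g_{ij}$ because its contribution is proportional to $(\sum_i n_i q_i)^2$, which vanishes by electroneutrality. The function name `dh_excess_energy` and the reduced-unit inputs are hypothetical.

```python
import math


def dh_excess_energy(densities, charges, eps, beta, r_max_factor=50.0, steps=100_000):
    """Return (beta*U_ex/V, k_D) from the radial energy integral with the DH h_ij."""
    k_d = math.sqrt(beta * sum(n * q * q for n, q in zip(densities, charges)) / eps)
    r_max = r_max_factor / k_d
    h = r_max / steps
    # r^2 * phi_ij * h_ij = -beta * (q_i q_j)^2 * exp(-k_d r) / (4 pi eps)^2,
    # so only the integral of exp(-k_d r) is needed (trapezoidal rule).
    radial = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        radial += weight * math.exp(-k_d * i * h)
    radial *= h
    pair_sum = sum(ni * nj * (qi * qj) ** 2
                   for ni, qi in zip(densities, charges)
                   for nj, qj in zip(densities, charges))
    prefactor = 2.0 * math.pi * beta * (-beta) / (4.0 * math.pi * eps) ** 2
    return prefactor * pair_sum * radial, k_d


# Assumed reduced units: symmetric 1:1 system, eps = 1, beta = 0.5
value, k_d = dh_excess_energy([1.0, 1.0], [1.0, -1.0], eps=1.0, beta=0.5)
reference = -k_d ** 3 / (8.0 * math.pi)
print(f"numerical: {value:.8f}, -k_D^3/(8 pi): {reference:.8f}")
```

The two numbers agree to the accuracy of the quadrature, independently of the particular densities and charges chosen, as long as the mixture is electroneutral.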
As one can see in this result, in the limit of infinite dilution the DH internal energy of the electrolyte is proportional to the $3/2$ power of the concentration. This result can be interpreted in the following way: the ions and their ionic atmospheres, which carry charges of opposite sign, form spherical capacitors, so we can view the ionic medium as a collection of $2N$ independent charged spherical capacitors of radius $1/k_D$ and capacitance $C = \varepsilon/k_D$, so the energy of the system is

$$U^c = -\sum_{j=1}^{2N} \frac{q_j^2}{2C} = -\frac{N q^2 k_D}{\varepsilon}. \tag{40}$$
Introducing the density of species $j$, and summing over all the species present, the above energy equation is given by

$$\frac{\beta U^c}{V} = -\beta \sum_{j=1}^{s} \frac{n_j q_j^2 k_D}{2\varepsilon} = -\frac{k_D^3}{8\pi} \tag{41}$$
and we recover the result in Eq. (39) for the excess internal energy of an electrolyte solution. From the expression of the excess internal energy, and using conventional thermodynamic relations, the calculation of the rest of the thermodynamic potentials is straightforward. Applying a Gibbs–Helmholtz relation to Eq. (39), we can calculate the excess Helmholtz free energy of the system, $A^{ex}$:

$$\frac{\beta A^{ex}}{V} = -\frac{k_D^3}{12\pi}. \tag{42}$$
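The factor $2/3$ relating Eqs. (39) and (42) can be illustrated numerically. The sketch below assumes the Gibbs–Helmholtz relation in the form $U^{ex} = \partial(\beta A^{ex})/\partial\beta$ at fixed volume and composition, a temperature-independent $\varepsilon$, and $\beta A^{ex} \to 0$ as $\beta \to 0$; under these assumptions $\beta A^{ex}/V$ is the integral of $U^{ex}/V = -k_D^3/(8\pi\beta')$ over $\beta'$ from 0 to $\beta$. The function name `beta_a_excess` and the parameter `s_over_eps` (standing for $\sum_k n_k q_k^2/\varepsilon$) are hypothetical.

```python
import math


def beta_a_excess(beta, s_over_eps, steps=200_000):
    """Integrate U_ex/V = -k_D(b)^3 / (8*pi*b) over b in (0, beta) with the midpoint rule,
    where k_D(b)^2 = b * s_over_eps as in Eq. (15)."""
    h = beta / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * h
        k_d = math.sqrt(b * s_over_eps)
        total += -k_d ** 3 / (8.0 * math.pi * b)
    return total * h


beta = 0.7          # assumed inverse temperature, reduced units
s_over_eps = 2.5    # assumed value of sum_k n_k q_k^2 / eps, reduced units
numeric = beta_a_excess(beta, s_over_eps)
k_d_final = math.sqrt(beta * s_over_eps)
closed_form = -k_d_final ** 3 / (12.0 * math.pi)
print(f"numerical: {numeric:.8f}, -k_D^3/(12 pi): {closed_form:.8f}")
```

Since the integrand scales as $\beta'^{1/2}$, the integral equals $2/3$ of $\beta$ times its end-point value, which is where the change from $8\pi$ to $12\pi$ comes from.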
The osmotic pressure, $P_{osm} = -(\partial A^{ex}/\partial V)_{T, n_j}$, or alternatively the osmotic coefficient, $\phi = P_{osm}/(\sum_i n_i k_B T)$, can be obtained from the above equation and the result is

$$\phi - 1 = -\frac{k_D^3}{24\pi \sum_i n_i}. \tag{43}$$

This result could also be obtained from the virial or pressure equation [103,108]:

$$\frac{\beta P}{n} = 1 - \frac{2\pi \beta n}{3} \sum_{i,j} x_i x_j \int_0^\infty r^3\, dr\, \phi'_{ij}(r)\, g_{ij}(r), \tag{44}$$

where $n = N/V$ stands for the number density of the fluid, $x_j$ is the molar fraction of species $j$, and $\phi'_{ij}(r) = d\phi_{ij}(r)/dr$. The excess Gibbs free energy is defined as the Legendre transformation of the excess Helmholtz free energy with respect to the volume:

$$g^{ex} = a^{ex} + \frac{P}{\sum_i n_i}, \tag{45}$$
where $g^{ex} = G^{ex}/N$ is the excess Gibbs free energy per particle, and $a^{ex}$ is the excess Helmholtz free energy per particle. Combining the above definition with the definition of the chemical potential of species $j$, $\mu_j = (\partial G/\partial n_j)_{T, P, n_{k \neq j}}$, we get for this quantity:

$$\frac{\mu_j}{k_B T} = \ln(n_j \Lambda_j^3) - \frac{1}{2} \frac{k_D q_j^2}{\varepsilon k_B T}, \tag{46}$$

$\beta \mu_j^0 = \ln(n_j \Lambda_j^3)$ being the ideal gas contribution, where $\Lambda_j = \sqrt{2\pi\beta\hbar^2/m_j}$ is the thermal wavelength of species $j$. The second term on the right-hand side contains the contribution of the interparticle interactions and defines the activity coefficient of ionic species $j$:

$$\ln \gamma_j = -\frac{1}{2} \frac{k_D q_j^2}{\varepsilon k_B T}. \tag{47}$$

The impossibility of a direct experimental measurement of the activity coefficient of a single ionic species imposes the introduction of the practical mean activity coefficient, $\ln \gamma_\pm = (\nu_+ \ln \gamma_+ + \nu_- \ln \gamma_-)/(\nu_+ + \nu_-)$, where $\nu_j$ is the stoichiometric coefficient of species $j$. Using Eq. (47) for the activity coefficients of the different species, together with the electroneutrality condition $\nu_+ q_+ + \nu_- q_- = 0$, we get:

$$\ln \gamma_\pm = -\frac{k_D |q_+ q_-|}{2 \varepsilon k_B T}, \tag{48}$$

an expression which constitutes the so-called Debye–Hückel limiting law (DHLL) for the activity coefficient. This universally valid law states that the logarithm of the mean activity coefficient of any ionic solution is proportional to the square root of concentration in the limit of vanishing concentration, and undoubtedly constitutes one of the main theoretical results of 20th century physics. Fig. 1 depicts the vanishing concentration behaviour of the activity coefficients of some electrolyte systems. As shown there, the $c^{1/2}$ dependence of this quantity is a general behaviour in electrolyte systems, and the slope depends only on the ionic charges at constant temperature. The validity of the DHLL for a 1:1 electrolyte solution extends up to concentrations of 0.01 M. At higher concentrations, the approximations involved in the derivation of the above result are totally inadequate, because of the importance of the short range correlations, neglected in the mean-field hypothesis, and the breakdown of the hot and dilute plasma approximation. As demonstrated by the different equations above, the main thermodynamic properties (internal energy, Helmholtz and Gibbs free energies, ...) of the electrolyte system are determined by the decay constant of the fluid, one more piece of evidence of the fundamental importance of an adequate characterization of this phenomenon.

Fig. 1. Activity coefficients of various electrolyte systems at 298.15 K, plotted as $\ln \gamma_\pm$ versus $m^{1/2}$ in $(\mathrm{mol\,kg^{-1}})^{1/2}$. Solid squares and circles correspond to LiCl and LiBr, respectively. Open symbols correspond to the 1:2 electrolytes MgCl$_2$ (up triangles) and MgBr$_2$ (down triangles). Diamonds and stars represent, respectively, the activity coefficients of the 2:2 electrolytes BeSO$_4$ and MgSO$_4$. The straight lines correspond to the predictions of the DHLL in Eq. (48).

2.2.2. Further developments. Extensions of DH theory

As we mentioned previously, DH theory was a genuine revolution in the physics of charged systems. However, its inherent limitations arising from the approximations involved in its derivation, and the fact that it is only directly applicable to point ions, have motivated many theoretical studies that
searched for extensions of this formalism since its appearance. Historically, the first attempt to go beyond the DHLL was made by Gronwall, La Mer and Sandved [12] for symmetric electrolytes and by Gronwall, La Mer and Greif [109] for asymmetric electrolyte solutions. The main point in their formalism is the overcoming of the hot and dilute plasma approximation. This hypothesis allows the linearization of the PB equation and the consequent calculation of the mean electrostatic potential in the classical DH formalism. As the concentration rises, this condition becomes inapplicable, and one has to retain higher order terms in the expansion of the exponential in the PB equation (13). For symmetric electrolyte solutions ($\nu_1 = \nu_2 = \nu$; $q_1 = -q_2 = q$), the sum in this equation can be expanded as

$$\sum_{k=1}^{2} n_k q_k e^{-\beta q_k \bar{\psi}_j} = n \nu q\, [\exp(-\beta q_1 \bar{\psi}_j) - \exp(-\beta q_2 \bar{\psi}_j)] = -2 n \nu q \left[ \beta q \bar{\psi}_j + \frac{1}{3!} (\beta q \bar{\psi}_j)^3 + \frac{1}{5!} (\beta q \bar{\psi}_j)^5 + \cdots \right]. \tag{49}$$
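To see how quickly the linearization degrades, the following sketch compares the exact bracket of Eq. (49), $e^{-x} - e^{x}$ with $x = \beta q \bar{\psi}_j$, against its odd-power expansion truncated at first, third and fifth order. The function names and the sampled values of $x$ are assumptions made for illustration only.

```python
import math


def pb_sum_exact(x):
    """Bracket of Eq. (49) divided by n*nu*q: exp(-x) - exp(x), with x = beta*q*psi."""
    return math.exp(-x) - math.exp(x)


def pb_sum_series(x, max_order):
    """Odd-power expansion -2*(x + x^3/3! + x^5/5! + ...) kept up to max_order."""
    return -2.0 * sum(x ** m / math.factorial(m) for m in range(1, max_order + 1, 2))


for x in (0.1, 0.5, 1.0, 2.0):
    exact = pb_sum_exact(x)
    errors = [abs(pb_sum_series(x, order) / exact - 1.0) for order in (1, 3, 5)]
    formatted = ", ".join(f"{err:.1e}" for err in errors)
    print(f"x = {x}: relative error at orders 1, 3, 5 = {formatted}")
```

For small $x$ the first-order (DH) term is already accurate, whereas for $x$ of order unity the higher odd-order terms are needed, which is the regime addressed by the extensions discussed here.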
In the case of symmetric electrolyte solutions, the hot and dilute plasma approximation consists in neglecting the odd order terms from the third one onwards. For asymmetric electrolyte solutions, even order terms do not cancel, so one has to take them into account. Expanding the exponential in the PB equation up to the seventh order, Gronwall et al. [12] obtained for the rational activity coefficient of symmetric electrolyte solutions, $f_\pm$,

$$\ln f_\pm = -\frac{q^2 k_D}{8\pi\varepsilon k_B T} \frac{1}{1 + k_D \sigma} + \left( \frac{q^2}{4\pi\varepsilon k_B T \sigma} \right)^3 \left[ \frac{1}{2} X_3(k_D \sigma) - 2 Y_3(k_D \sigma) \right] + \left( \frac{q^2}{4\pi\varepsilon k_B T \sigma} \right)^5 \left[ \frac{1}{2} X_5(k_D \sigma) - 4 Y_5(k_D \sigma) \right], \tag{50}$$

where $\sigma = \frac{1}{2}(\sigma_i + \sigma_j)$ is the ionic diameter and $X_i(k_D \sigma)$ and $Y_i(k_D \sigma)$ are functions defined through series tabulated by Gronwall et al. [12]. From the second and third terms on the right-hand side of the above equation one can deduce that the main deviations from the DHLL are registered for electrolytes of high valence type or solvents of low dielectric permittivity. Similar expressions are obtained for asymmetric electrolyte solutions [99]. One of the main shortcomings of DH's original formulation is that it neglects, among other ionic correlations, the effects arising from the finite radius of the ions. This hypothesis is valid at low concentrations, for which the interionic distance is much bigger than the ionic radius, but at finite concentrations the excluded volume effects are no longer negligible, and any model of the ionic system has to take them into account. Various interionic potentials have been used to model the short range interionic forces responsible for the system's stability. The most commonly used is the hard sphere potential in Eq. (2):

$$\phi^0_{ij}(r) = \begin{cases} \infty, & r < \sigma_{ij} = \frac{1}{2}(\sigma_i + \sigma_j), \\ 0, & r > \sigma_{ij}. \end{cases} \tag{51}$$

The choice of the above potential corresponds to the primitive model electrolyte solution which we have previously chosen as our structural model of the electrolyte [103]. The distance $\sigma_{ij}$ represents
the limit of impenetrability for a pair of ions of species $i$ and $j$. This gives rise to an excluded volume in the solution for a given ion, due to the presence of the rest of the ions. The interaction potential in Eq. (2), together with its simplified version where all the ions have the same radius, is the simplest example of short range interionic potentials. More complex and realistic models have been used for electrolyte solutions and molten salts. Frequently, soft core potentials have been used for the description of molten salts. These models allow for a certain penetrability of the external electronic clouds of the ions, a possibility that is radically excluded in the PM electrolyte. Thus, the short range interaction potential has been modelled, for example, by a soft core potential:

$$\phi^{sc}_{ij}(r) = \frac{e^2}{\sigma} \left[ \frac{1}{n} \left( \frac{\sigma}{r} \right)^n + z_i z_j \frac{\sigma}{r} \right], \tag{52}$$

in which the Coulomb interaction between ions of valence $z_k$ is complemented by a soft core that goes like $r^{-n}$ with interionic distance. This potential is especially adequate for alkali halides, particularly for those in which cations and anions have approximately the same size. In these systems, $n \simeq 8$–$10$. Even more realistic potentials have been used for molten salts, among which the Tosi–Fumi potential is probably the most common [110]. In this potential, the short range interaction is modelled by a Born–Mayer repulsion between the electronic clouds and the long range interactions are represented by attractions of the van der Waals type:

$$\phi_{ij}(r) = B_{ij} e^{-r/\lambda} - \frac{C_{ij}}{r^6} - \frac{D_{ij}}{r^8}, \tag{53}$$
where the parameters $\lambda$, $B_{ij}$, $C_{ij}$, $D_{ij}$ are obtained from crystallographic data. For ions of high polarizability, the effect of the induction forces cannot be neglected and the interionic potential has to take into account the ionic polarization. This is the main contribution of the so-called shell model [111,112], where the ionic polarization is modelled by means of a core that represents the ionic nucleus plus the inner electronic shells. The model is completed by a massless shell corresponding to the outer part of the electronic cloud, which is bonded by means of a harmonic potential. All the potentials cited above take into account the excluded volume of the ions. They recognize that the short range forces are of fundamental importance, not only for a realistic description of the system but also as the origin of the short range repulsive forces responsible for the system's stability. However, the DH mean-field formalism considers that the ions have zero volume and on the basis of this naive assumption it is capable of predicting the correct universal asymptotic functional dependence of the effective pair interactions in the limit of vanishing concentration. In this regime the interionic distance is much larger than the ionic radius, so the latter can be neglected. Nevertheless, as we shall see below, the introduction of an excluded volume interaction is of crucial importance, not only for the extension of DH theory to finite concentrations but for the formal consistency of the DH theory itself. Obviously, the distribution functions of the DH theory must satisfy the fundamental equation of the statistical mechanics of fluids, the OZ equation. The correlation functions of an ionic system must also satisfy the Stillinger–Lovett sum rules [113,114]. These are conditions affecting the pair
correlations and are obtained from their low wavenumber expansions. For the calculation of these sum rules, we shall consider the short range direct correlation function in Eq. (34):
$$c^0_{ij}(r) = c_{ij}(r) + \frac{\beta q_i q_j}{4\pi\varepsilon r} .$$
The Fourier transform of the above equation can be written as
$$\hat c^0_{ij}(k) = \hat c_{ij}(k) + \frac{\beta q_i q_j}{\varepsilon k^2} . \tag{54}$$
On the other hand, the Fourier transform of the OZ equation, (23), can be expressed in terms of this short range correlation function as
$$\hat h_{ij}(k) = \hat c^0_{ij}(k) - \frac{\beta q_i q_j}{\varepsilon k^2} + \sum_l n_l\,\hat h_{il}(k)\,\hat c^0_{lj}(k) - \sum_l n_l\,\frac{\beta q_l q_j}{\varepsilon k^2}\,\hat h_{il}(k) . \tag{55}$$
The correlation functions involved in the resolution of the problem, $c^0_{ij}(r)$ and $h_{ij}(r)$, are short ranged (one must remember that $h_{ij}(r)$ is screened in the bulk solution and, therefore, its range is much shorter than that of the Coulomb interaction), so one can Taylor expand $\hat h_{ij}(k)$ for $k \to 0$. In the case of radial functions only even powers of $k$ appear, resulting in
$$\hat h_{ij}(k) \sim h^{(0)}_{ij} + h^{(2)}_{ij}k^2 + h^{(4)}_{ij}k^4 + \cdots . \tag{56}$$
An analogous expression is obtained for $\hat c^0_{ij}(k)$. The coefficients of the expansion read
$$h^{(2n)}_{ij} = \frac{4\pi(-1)^n}{(2n+1)!}\int_0^\infty h_{ij}(r)\,r^{2n+2}\,dr . \tag{57}$$
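The moment formula in Eq. (57) can be checked on a function whose Fourier transform is known in closed form. The sketch below is only an illustration: it assumes a screened function $e^{-r}/r$ (inverse decay length set to 1), whose transform $4\pi/(1+k^2) = 4\pi(1 - k^2 + k^4 - \cdots)$ gives the expected coefficients $4\pi(-1)^n$. The helper names `simpson`, `moment` and `yukawa` are hypothetical, and the upper integration limit is a numerical cutoff.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] using an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def moment(h_of_r, n, r_max=80.0):
    """h^(2n) of Eq. (57): 4*pi*(-1)^n/(2n+1)! times int_0^r_max h(r) r^(2n+2) dr."""
    prefactor = 4.0 * math.pi * (-1) ** n / math.factorial(2 * n + 1)
    return prefactor * simpson(lambda r: h_of_r(r) * r ** (2 * n + 2), 0.0, r_max)

def yukawa(r):
    # Illustrative screened function e^{-r}/r; the integrand h(r) r^(2n+2)
    # vanishes at r = 0, so returning 0 there is harmless.
    return math.exp(-r) / r if r > 0.0 else 0.0

# Coefficients of k^0, k^2 and k^4 in the small-k expansion
moments = [moment(yukawa, n) for n in range(3)]
```

All three moments are finite because the integrand decays exponentially, which is the property the text relies on when it states that every $h^{(2n)}_{ij}$ exists for exponentially decaying correlation functions.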
It is noteworthy that all the moments in the above equation, $h^{(2n)}_{ij}$, exist for exponentially decaying correlation functions. Introducing the expansions of $\hat h_{ij}(k)$ and $\hat c^0_{ij}(k)$ into Eq. (55), and equating the coefficients of the various powers of $k$, the so-called Stillinger–Lovett sum rules are obtained. Equating the coefficients of the terms in $k^{-2}$ one gets
$$q_i = -\sum_k n_k q_k \int h_{ik}(r)\,d\mathbf{r} , \tag{58}$$
which is the Stillinger–Lovett zeroth moment condition. This sum rule is none other than the electroneutrality equation, and it states that the net charge of the solution around an ion must be equal in magnitude to the charge of that ion and of opposite sign. The DH charge density can be calculated by combining Eqs. (12) and (21) to give
$$\rho_j(r) = -\frac{\beta q_j}{4\pi\varepsilon}\sum_{k=1}^{s} n_k q_k^2\,\frac{e^{-k_D r}}{r} . \tag{59}$$
Using the definition of Debye's parameter one can express Eq. (59) as
$$\rho_j(r) = -\frac{q_j k_D^2}{4\pi}\,\frac{e^{-k_D r}}{r} . \tag{60}$$
It is straightforward to show that the above expression only satisfies the electroneutrality condition expressed by the Stillinger–Lovett zeroth moment condition if the exponential decay of the correlation
functions extends over the whole range of distances [59]. However, this is incompatible with a finite size for the ions. Rescaling the DH charge density as
$$\rho_j(r) = -A\,\varepsilon k_D^2\,\frac{e^{-k_D r}}{r} \tag{61}$$
and imposing the electroneutrality condition in Eq. (58), one gets for the RPM electrolyte
$$A = \frac{q_j}{4\pi\varepsilon}\,\frac{e^{k_D\sigma}}{1 + k_D\sigma} . \tag{62}$$
Thus, the total correlation function which satisfies the Stillinger–Lovett zeroth moment condition is given by
$$h_{ij}(r) = \begin{cases} -1, & r < \sigma , \\[4pt] -\dfrac{\beta q_i q_j}{4\pi\varepsilon}\,\dfrac{e^{k_D\sigma}}{1 + k_D\sigma}\,\dfrac{e^{-k_D r}}{r}, & r > \sigma . \end{cases} \tag{63}$$
One could easily show that the solution of the PB equation with boundary conditions that take the finite size of the ions into account leads to the same result. The main conclusion of this equation is that, for DH theory to be compatible with the Stillinger–Lovett electroneutrality condition, one has to rescale the DH correlation functions in order to take into account the excluded volume interactions associated with the finite size of the ions. However, this is not the only inconsistency of the classical DH formalism. The DH correlation functions rescaled for finite ions that we have just obtained do not satisfy the Stillinger–Lovett second moment condition [59]. This constraint can be obtained by equating the coefficients of the terms of order $k^0$ on both sides of the Fourier transform of Eq. (55), and the result is
$$1 = -\frac{4\pi\beta}{6\varepsilon}\sum_{k,m} n_k n_m q_k q_m \int_0^\infty h_{km}(r)\,r^4\,dr . \tag{64}$$
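Both moment conditions can be checked numerically for the rescaled DH correlation function of Eq. (63). The sketch below assumes an RPM electrolyte, works in units where $k_D = 1$ (so `x` stands for $k_D\sigma$), and reads Eq. (64) as the standard second moment condition with an $r^4\,dr$ radial integral. In the RPM the hard core region does not contribute to either sum because of bulk electroneutrality, so each condition reduces to a one-dimensional integral from $\sigma$ to infinity whose value should be 1. The function names are hypothetical.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] using an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def zeroth_moment_ratio(x, r_span=60.0):
    """Right-hand side of Eq. (58) divided by q_i for h_ij of Eq. (63); 1 means satisfied."""
    amplitude = math.exp(x) / (1.0 + x)
    return amplitude * simpson(lambda r: r * math.exp(-r), x, x + r_span)

def second_moment_ratio(x, r_span=60.0):
    """Right-hand side of the second moment condition for h_ij of Eq. (63); 1 means satisfied."""
    amplitude = math.exp(x) / (1.0 + x)
    return amplitude * simpson(lambda r: r ** 3 * math.exp(-r), x, x + r_span) / 6.0

# Electroneutrality holds for any ion size; the second moment only for point ions.
m0_finite = zeroth_moment_ratio(1.0)
m2_point = second_moment_ratio(0.0)
m2_finite = second_moment_ratio(1.0)
```

Under these assumptions the zeroth moment ratio equals 1 for any $k_D\sigma$, while the second moment ratio evaluates to $(x^3 + 3x^2 + 6x + 6)/[6(1 + x)]$, which equals 1 only at $x = 0$. This is the inconsistency that motivates a non-Debye decay constant.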
Eq. (64) is known as the Stillinger–Lovett second moment condition. It is worth pointing out that both the electroneutrality condition and the second moment condition are determined only by the long range tail of the Coulomb interaction, being completely independent of the short range interionic interactions. It is straightforward to show that the original DH pair correlations in Eq. (21) satisfy these conditions only when $\sigma \to 0$. Even the modified total correlation function in Eq. (63) does not satisfy the second moment condition unless the conventional DH screening constant is replaced by an effective decay constant [59]. Thus, we reach the conclusion that, for the DH theory to be consistent with the Stillinger–Lovett sum rules, imposing finite radius boundary conditions on the correlation functions is not enough, and the screening constant must be allowed to take non-Debye values. We shall come back to this problem in Section 5, where an explicit expression for this self-consistent screening length will be derived.

Another important extension of DH theory was reported soon after the formulation of the original results. In 1926, Bjerrum [115] proposed a theory to account for deviations from the DHLL predictions. Conscious as he was of the inherent difficulties of solving the nonlinear PB equation, Bjerrum suggested a simpler and more descriptive approach than that of Gronwall, LaMer and coworkers. The basic idea underlying the Bjerrum theory is the formation of ionic pairs in the bulk, something
that takes place when two ions of opposite sign undergoing Brownian motion come close enough for the electrostatic energy to be more than twice the thermal energy. Under these circumstances, the electrostatic potential energy can stabilize a new entity in the bulk, formed by the two ions and capable of resisting the collisions with the solvent molecules. The phenomenon of ionic association has an important effect on the thermodynamic and transport properties of ionic solutions. In a symmetric electrolyte solution, the formation of an ionic pair cancels the charges of the two ions by forming a neutral entity, probably with a dipole moment. This entity does not contribute to the electric conductivity, and the equilibrium thermodynamic properties of the solution are modified as if a certain number of charges had been replaced by half that number of ionic dipoles. The situation is considerably more complex in the case of asymmetric electrolytes, since the ion pairs then also carry a net charge. Of course, the formation of dipoles in the bulk is of great importance near the critical point, as Fisher and Levin proved [4], as it allows the improvement of the DH predictions for the spinodal line.

The concept of an ionic pair is difficult to pin down. It is usually understood that two ions form an ion pair when the latter is long-lived enough to be a recognizable entity in solution. This means that the ion pair must survive a sufficiently large number of collisions with the solvent particles. Furthermore, no solvent molecule can exist between the two ions in the ion pair [93]. The formation of ion pairs will take place every time one ion passes near another with a kinetic energy lower than its electrostatic energy. This happens in a probabilistic manner, as it corresponds to a thermal process, so the formation of an ionic pair will not take place every time two ions of opposite sign come close to one another.
Bjerrum avoids these complications related to the velocity distribution and assumes that association takes place every time two ions get closer than a critical distance. Below this distance the electrostatic energy is larger than the thermal one, so the ions get trapped in an electrostatic potential energy well. However, one must take into account that the validity of Bjerrum's association theory is restricted to dilute media. Fuoss [116] demonstrated that at concentrations above $1.2 \times 10^{-14}/(4\pi\varepsilon T)$ for 1:1 electrolytes, interactions of higher order than the pair Coulomb interaction cannot be neglected, so the latter cannot be taken as the rule for discriminating between free and associated ions. The probability that two ions in solution of species $i$ and $j$ are at a distance $r$ at temperature $T$ is given by the Boltzmann distribution in Eq. (8):
$$P(r)\,dr = n_i \exp[-\beta w_{ij}(r)]\,4\pi r^2\,dr . \tag{65}$$
At short interionic distances, the potential of mean force $w_{ij}(r)$ can be assumed to be the usual electrostatic potential, $\phi_{ij}(r) \simeq q_i q_j/4\pi\varepsilon r$, so the above probability reads
$$P(r)\,dr = n_i \exp\left[-\frac{\beta q_i q_j}{4\pi\varepsilon r}\right] 4\pi r^2\,dr . \tag{66}$$
For particles of opposite sign this function has a minimum at a distance
$$r_{\min} = \frac{\beta |q_i q_j|}{8\pi\varepsilon} , \tag{67}$$
a distance at which the electrostatic potential energy of the ionic pair equals the sum of the thermal energies of the two ions, $2k_BT$. Half this distance is known as the Bjerrum length, $r_{\min}/2 = l_B$, and it represents the distance at which the electrostatic potential energy of an ionic pair equals the thermal energy at temperature $T$. For 1:1 electrolyte solutions in water at 298.15 K this distance is 3.52 Å, and, consequently, for shorter mean interionic distances, deviations from the DHLL are expected due to the formation of ionic pairs. For distances below this one, $P(r)$ grows rapidly, while at larger distances the growth is very slow. This behaviour is depicted in Fig. 2, where the population of ions of opposite sign in a shell of thickness 0.1 Å around the central ion is shown for an aqueous solution of a 1:1 electrolyte at 298.15 K.

[Fig. 2. Number of ions in a spherical shell of 0.1 Å located at a distance r of a given ion in a 1:1 aqueous electrolyte solution at 298.15 K according to the predictions of Eq. (66). Axes: number of ions × 10⁻²¹ versus r (Å).]

The ionic association process we have just described causes a certain fraction of the ions in solution to be associated, partially or totally neutralizing their electric charges and forming high order multipoles. The degree of association in the bulk, $1 - \alpha$, can be calculated as [93,99]
$$1 - \alpha = \frac{4\pi N_A}{1000}\int_{\sigma}^{2l_B} \exp\left[\frac{\beta |q_i q_j|}{4\pi\varepsilon r}\right] r^2\,dr , \tag{68}$$
a magnitude which is tabulated [99]. The thermodynamic and transport properties of electrolyte solutions are given in terms of the parameter $\alpha$ in Bjerrum's formalism, accounting for the deviations from limiting laws. As an example, we can take the equation for the mean rational activity coefficient of a solution of ions of finite radius:
$$\ln f_\pm = -\frac{A\,|z_1 z_2|\,\sqrt{I}}{1 + B\sigma\sqrt{I}} , \tag{69}$$
where $I$ is the ionic strength of the solution and the constants $A$ and $B$ are the usual ones in DH theory [93]:
$$A = \frac{1.8246 \times 10^{6}}{(4\pi\varepsilon T)^{3/2}} , \qquad B = \frac{50.29 \times 10^{8}}{(4\pi\varepsilon T)^{1/2}} . \tag{70}$$
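The location of the minimum of the pair probability in Eq. (66), which Eq. (67) gives in closed form and which sets the cutoff in Bjerrum's association integral, can be confirmed with a short numerical sketch. For oppositely charged ions the exponent of Eq. (66) equals $2r_{\min}/r$, so in units of $r_{\min}$ the function to minimize is $r^2 e^{2/r}$ up to a constant factor. The golden-section search and the function names below are illustrative choices, not part of the original treatment.

```python
import math

def pair_probability_density(r, r_min=1.0):
    """r^2 * exp(2 r_min / r): Eq. (66) for opposite charges, up to the factor 4*pi*n_i."""
    return r * r * math.exp(2.0 * r_min / r)

def golden_section_min(f, a, b, tol=1e-10):
    """Locate the minimum of a unimodal function f on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

# Search in units of r_min; the minimum is expected at r = 1.
r_star = golden_section_min(pair_probability_density, 0.2, 5.0)

# Coulomb energy at the minimum in units of k_B T (should be 2).
energy_over_kT = 2.0 / r_star
```

The steep rise below the minimum and the slow $r^2$ growth above it reproduce the qualitative behaviour described for Fig. 2.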
Eq. (69) is that of the conventional DH theory for ions of finite size, but the concentration is corrected with a factor $(1 - \alpha)$ due to the existence of ionic association. This fundamental idea of ionic association was soon extended by Fuoss and Kraus [117,118] to account for deviations due to the formation of multipoles of order higher than the second one, particularly triplets and quadruplets. The algebra involved in the derivation is certainly more complicated than in Bjerrum's theory, but the calculation proceeds in a very similar manner.

2.2.3. Guggenheim's theory: contribution of the short range forces

DH theory is based on the neglect of interionic forces other than electrostatic forces. However, in a real solution, short range interionic interactions of attractive and repulsive type exist, as do interactions between the ions and the solvent molecules. All these effects become important at moderate concentrations, where the interionic distance is sufficiently small. These short range interactions depend on the size, polarizabilities and relative positions of the ions and solvent molecules, and their particular form is extremely complex. Guggenheim [119] proved that the contribution of the specific short range interactions to the thermodynamic properties of the system can be phenomenologically described by means of the Gibbs free energy
$$G^s = \frac{1}{V}\sum_{i,j} n_i n_j H_{ij} , \tag{71}$$
where $H_{ij}$ is a function of the short range potential energy $\phi^0_{ij}(r)$, given by [101]
$$H_{ij} = N k_B T \int d\Gamma_{ij}\left(1 - e^{-\phi^0_{ij}(r)/k_B T}\right) , \tag{72}$$
where $d\Gamma_{ij}$ denotes the element of volume occupied by the $j$-ion relative to the $i$-ion, and the integral extends over all relative configurations of the two ions. Obviously, the only relative configurations whose contributions to the integral differ appreciably from zero are those in which the two ions are very near each other, since for all other relative configurations $\phi^0_{ij}(r)$ is effectively zero. Brønsted's specific interaction theory [120] assumes $\phi^0_{ij}(r) = 0$ for $i = j$, because two ions of like charge are highly unlikely to approach each other to within the range of the short range potential. Using this hypothesis, Guggenheim reported an equation for the mean activity coefficient of the ionic solution [119]:
$$\ln f_\pm = -\frac{A\,|z_1 z_2|\,\sqrt{I}}{1 + \sqrt{I}} + bI . \tag{73}$$
The parameter $b$ contains the effect of short range non-electrostatic forces, and in Guggenheim's phenomenological theory it is treated as an adjustable parameter. The above equation represents an extension of the DH limiting law to finite concentrations, and states that the contribution of the short range interionic forces is well represented by a term linear in the ionic strength. Similar expressions were obtained in a phenomenological way by Güntelberg [121] and Davies [122]. These modifications, especially that of Guggenheim, are particularly useful for treating the thermodynamics of ionic surfactants, as they represent the conceptual basis of the theory of Burchfield and Woolley [123].
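As an illustration of how Eq. (73) is used in practice, the sketch below evaluates it next to the DH limiting law. The value of the constant `A_WATER_25C` is an assumption (the familiar 0.509 on a decimal-logarithm basis for water at 25 °C, converted to natural logarithms, with the ionic strength in mol/kg), and the value of `b` passed in the usage lines is purely illustrative, since in Guggenheim's theory it is fitted to data. The function names are hypothetical.

```python
import math

# Assumed DH constant for water at 25 C on a natural-log basis: 0.509 * ln(10).
A_WATER_25C = 0.509 * math.log(10.0)

def ln_f_guggenheim(z1, z2, ionic_strength, b, A=A_WATER_25C):
    """Eq. (73): ln f = -A |z1 z2| sqrt(I) / (1 + sqrt(I)) + b I."""
    root_i = math.sqrt(ionic_strength)
    return -A * abs(z1 * z2) * root_i / (1.0 + root_i) + b * ionic_strength

def ln_f_limiting_law(z1, z2, ionic_strength, A=A_WATER_25C):
    """DH limiting law: ln f = -A |z1 z2| sqrt(I)."""
    return -A * abs(z1 * z2) * math.sqrt(ionic_strength)

# Illustrative 1:1 electrolyte with a hypothetical fitted b = 0.3
ln_f_dilute = ln_f_guggenheim(1, -1, 1e-6, b=0.3)
ln_f_moderate = ln_f_guggenheim(1, -1, 0.1, b=0.3)
ln_f_dhll = ln_f_limiting_law(1, -1, 0.1)
```

At very low ionic strength the two expressions coincide, while at moderate ionic strength the denominator and the linear term both reduce the magnitude of the DHLL prediction.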
2.2.4. Mayer's cluster sum theory

Mayer [22] founded the DHLL on a general statistical treatment of ion–ion interactions based on his cluster theory of real gases. This treatment avoids most shortcomings of PB theory. In Mayer's theory, the central problem is the calculation of the pair correlation function. This quantity is represented by means of an expansion in powers of the ionic number density:
$$g_{ij}(r) = e^{-\beta\phi_{ij}(r)}\sum_l y_l(r)\,n^l , \tag{74}$$
where yl (r) are Mayer’s cluster integrals, built from the irreducible cluster diagrams of l particles [103], and 6ij (r) is the usual pair potential. The rst term in the above series represents the dilute gas approximation, where gij (r) exp[ − 6ij (r)], and corresponds to DH theory. Therefore, Mayer’s expansion provides a general way for the systematic extension of the DH theory, including high order ionic correlations. In practical terms, however, the evaluation of the cluster integrals becomes vey complex for terms of order higher than two. Poirier [124] applied Mayer’s formalism to the calculations of thermodynamic functions of real electrolyte solutions. Particularly, for a specially de ned mean activity coeJcient, y± ; he reported the equation [99]: A (−1)/ n2/ kD n2 e2 − b/ (kD *) ; (75) ln y± = − 2 &kB T n2 /¿0 A/ where / belongs to a family of indexes, and the rst term in the rhs corresponds to DHLL. The various constants in Eq. (75) were evaluated by Poirier using Br6onsted speci c interaction hypothesis [99]. The valence factors, n/ , are de ned as s 1 / z 0i (76) n/ = 0 i=1 i 0i being the stoichiometric coeJcients of the ionic salt. On the other hand, the constant A is 4>&kB T* A= : (77) e2 On the other hand, b/ (kD *) is given by the integral (kD *)2 l/ −/kD *y 2−/ e y (1 − /kD *y) dy ; (78) b/ (kD *) = / 1 where l/ = 0 for / 6 2 and l/ = ∞ for / ¿ 3 [99], and their values have been tabulated by Poirier [124]. The inclusion of progressively higher order terms on the expansion of the distribution function (74) results in more accurate expressions for the thermodynamic functions of the system. Nevertheless, the slow convergence of the cluster integrals [125] demands lengthy numerical calculations that can be simpli ed using an approximate procedure due to Kirkwood [53]. 2.2.5. 
Potential and charge distributions at a flat surface: the Gouy–Chapman model

The spatial separation of charge in the neighbourhood of a surface immersed in a polar solvent by the oppositely charged ions in solution is termed the electric double layer. Although attracted to the surface by the usual Coulomb force, the counterions remain dispersed in the solution due to entropy,
so the essence of the double layer consists of charged surfaces and mobile ions. Several theoretical tools have been employed to study the electric double layer, ranging from classical statistical mechanical methods to Monte Carlo and molecular dynamics simulations (for a review, see Refs. [91,126,127]). The traditional theoretical approach to this inhomogeneous ionic system employs the conventional PB equation. This mean-field method assumes that the density of ions in the diffuse layer is proportional to the Boltzmann factor, as corresponds to a thermally driven process. The application of PB theory to this problem constitutes the basis of the so-called Gouy–Chapman (GC) theory [2]. The structural model underlying GC theory is the conventional PM of charged hard spheres near a charged wall, all embedded in a dielectric continuum, which is the most studied and best understood of all models of the electric double layer. Since the formulation of the original GC theory, there has been considerable progress in the field of PM double layers.

The classical theory of the electric double layer dates back to the first decade of the twentieth century, and it was independently formulated by Gouy and Chapman. Like DH theory, GC theory is a mean-field theory based on the PB equation, a result derived in Eq. (13). The main aim of the GC formalism is the calculation of the potential distribution in the neighbourhood of a wall (electrode or macroion) placed at $z = 0$ and immersed in an ionic solution. In this case the PB equation reads
$$\frac{d^2\psi}{dz^2} = -\frac{1}{\varepsilon}\sum_{l=1}^{s} n_l q_l\,e^{-\beta q_l\psi(z)} . \tag{79}$$
The linearized form of GC theory is obtained by linearizing the PB equation for small potentials ($\beta q_l\psi(z) \ll 1$), subject to the particular boundary conditions imposed by the geometry of the problem:
$$\psi(z) \to 0, \quad z \to \infty ; \qquad \frac{d\psi}{dz} = -\frac{\rho_s}{\varepsilon}, \quad z = 0 , \tag{80}$$
where $\rho_s$ is the surface charge density. The solution of Eq. (79) valid for low surface potentials is
$$\psi(z) = \frac{\rho_s}{\varepsilon k_D}\,e^{-k_D z} , \qquad n_i(z) = n_i\left(1 - \beta q_i\psi_0\,e^{-k_D z}\right) , \tag{81}$$
where $n_i(z)$ is the number density profile of ions of species $i$ in the neighbourhood of the surface and $\psi_0$ is the surface potential. The above equation summarizes the results of the so-called linearized GC theory (LGC). Perhaps the most characteristic feature of this result is the exponential screening of the surface charge by the intervening electrolyte. However, the linearized GC formalism shows several unphysical features, ranging from the linear relationship between $\psi_0$ and $\rho_s$ to the asymmetry between the total correlation functions of each type of ion, $h_+(z)$ and $h_-(z)$. These problems can be solved using the full PB equation, in strict analogy with the relation between LDH and the complete PB equation in the bulk electrolyte case. Despite some thermodynamic inconsistencies, which are also present in successful integral equation theories, the complete GC theory has been almost universally adopted as the appropriate theory of the diffuse double layer, particularly its analytic solutions for 2:1 [128], 2:1:1 [129] and binary symmetric electrolytes [2]. In this last case the PB equation can be written as
$$\beta q\,\frac{d^2\psi}{dz^2} = k_D^2\sinh(\beta q\psi) \tag{82}$$
with the solution
$$\beta q\psi(z) = 4\tanh^{-1}\left(u\,e^{-k_D z}\right) , \qquad u = \tanh\left(\frac{\beta q\psi_0}{4}\right) , \qquad \beta q\psi_0 = 2\sinh^{-1}s , \tag{83}$$
where we have introduced the parameter $s = \beta q\rho_s/2\varepsilon k_D$. As is obvious from the above result, the ion density profiles are monotonic and asymptotically decay exponentially with the same decay length as that of the bulk electrolyte. Therefore, a detailed analysis of the decay length of bulk electrolyte solutions is also of the utmost importance in the case of the electric double layer.

Theories that improve on the GC theory, such as the modified Poisson–Boltzmann (MPB) theory [130], density functional theories [131] and integral equation techniques, as well as Monte Carlo and molecular dynamics techniques [132–135], have been used to reveal new structural features of the electric double layer. The MPB theory of the double layer is based on the application of the Kirkwood hierarchy and the weak-coupling approximation to the case of PM electrolytes. A good unified treatment of this formalism can be found in the review of Carnie and Torrie in Ref. [127]. Density functional techniques are based on the approximation of the free energy by an integral that involves the bulk direct correlation function, and on the subsequent variational minimization of the free energy functional with respect to the ion density profiles. On the other hand, integral equation techniques are based on mathematical relationships between the ion density profiles and the ion pair correlations, and their main aim is the solution of the OZ equation by iteration using some kind of closure relation. The integral equations can be classified, according to Blum [126], into three categories: those deriving from the OZ equation [136], those deriving from the Born–Green–Yvon (BGY) equation [137], and those of Wertheim–Lovett–Mou–Buff (WLMB) [138] and Kirkwood [139]. Monte Carlo and molecular dynamics simulations rely on numerical calculations for systems consisting of a few thousand particles, for which the intermolecular potential is assumed to be known.
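Returning briefly to the nonlinear GC solution of Eq. (83), a quick numerical check can confirm that it satisfies Eq. (82) and that it reduces to the exponential LGC profile for small surface potentials. The sketch below works with the reduced potential $y = \beta q\psi$ and measures $z$ in units of $1/k_D$. The function names and the finite-difference step are illustrative assumptions.

```python
import math

def gc_potential(z, y0):
    """Reduced potential y(z) = 4 atanh(u exp(-z)) of Eq. (83), with u = tanh(y0 / 4)."""
    u = math.tanh(y0 / 4.0)
    return 4.0 * math.atanh(u * math.exp(-z))

def lgc_potential(z, y0):
    """Linearized GC profile y0 exp(-z), the low-potential limit of Eq. (81)."""
    return y0 * math.exp(-z)

def pb_residual(z, y0, dz=1e-3):
    """Central-difference estimate of y'' - sinh(y); close to zero if Eq. (82) holds."""
    y_minus = gc_potential(z - dz, y0)
    y_centre = gc_potential(z, y0)
    y_plus = gc_potential(z + dz, y0)
    return (y_plus - 2.0 * y_centre + y_minus) / dz ** 2 - math.sinh(y_centre)

# Strongly nonlinear case (y0 = 4) and weakly charged case (y0 = 0.01)
residual_nonlinear = pb_residual(0.5, 4.0)
difference_weak = gc_potential(1.0, 0.01) - lgc_potential(1.0, 0.01)
```

For large $z$ the argument of the inverse hyperbolic tangent is small, so $y(z) \approx 4u\,e^{-z}$: the nonlinear profile decays with the bulk Debye length and only its amplitude differs from the linearized result.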
For charged hard spheres, molecular dynamics simulations are not the most appropriate, since the potential is both singular and very long ranged, and therefore the Monte Carlo technique is the most frequently employed. This latter method is more amenable to simulating the PM of the electric double layer [140], despite the problems associated with the long range nature of the interactions [141].

3. Integral equation techniques and computer simulations

In the above sections we have described some of the more important modifications of DH theory which were formulated around the time of its appearance. Until the 1970s, no other important progress was registered in the theory of ionic solutions. It was at the beginning of that decade that the first integral equation theory was solved, the mean spherical approximation (MSA) [142]. The theoretical basis of these theories is the introduction of some closure relation between the direct and total correlation functions which allows the solution of the OZ integral equation. The MSA was originally solved for charged hard spheres, and its correlation functions and thermodynamics were studied. The MSA was the first in a series of integral equations for the distribution functions of the liquid state which were used for the study of the thermodynamic properties of Coulomb fluids. After the appearance of the original theory, successive improvements of the MSA were introduced, like the thermodynamically
consistent generalized mean spherical approximation (GMSA) due to Høye, Lebowitz and Stell [143], the optimized random phase approximation (ORPA) [144], and the $\Gamma$ order theory [19]. Besides, versions for electrolyte solutions of the Percus–Yevick (PY) [145] and hypernetted chain (HNC) [146] closures have been used for the calculation of the ionic distribution functions. In the following sections we shall briefly review the main results of some of these integral equation theories.

3.1. Mean spherical approximation (MSA) and its thermodynamically consistent generalization (GMSA)

In 1970, Waisman and Lebowitz [142] obtained the distribution functions of RPM and PM electrolyte solutions using the MSA, in what constitutes a basic result in the statistical theory of ionic solutions. This model had been introduced in the 1960s by Lebowitz and Percus [147] as a generalization to continuum fluids with hard core interactions of the spherical model of systems isomorphic to the Ising lattice gas. The MSA model assumes that the interaction potential between ions of species $i$ and $j$ inside the ionic core is $\phi_{ij}(r) = \infty$. This condition is equivalent to $g_{ij}(r) = 0$, $r < \sigma_{ij}$, where $\sigma_{ij} = (\sigma_i + \sigma_j)/2$, a relationship for the radial distribution function inside the ionic core that is exact for the PM electrolyte, and that represents the impenetrability of the ions. At the same time, the MSA assumes that the direct correlation function outside the core is the one corresponding to the infinite dilution regime, $c_{ij}(r) = -\beta\phi_{ij}(r)$, $r > \sigma_{ij}$. Thus, the MSA closure relation takes the form
$$g_{ij}(r) = 0, \quad r < \sigma_{ij} ; \qquad c_{ij}(r) = -\frac{\beta q_i q_j}{4\pi\varepsilon r}, \quad r > \sigma_{ij} . \tag{84}$$
For uncharged particles ($q_j = q_i = 0$) the PY closure relation for a mixture of hard spheres is recovered from the above equation, while for point ions the MSA coincides with the conventional DH theory. When supplemented by the OZ equation, the MSA yields an integral equation for the radial distribution function of the system. The first expression is exact, and the second represents an extension of the asymptotic behaviour of the direct correlation function $c_{ij}(r)$ to all $r > \sigma_{ij}$. This approximation, despite its crudeness, improves the PY or HNC descriptions of the properties of the square-well fluid. Its most attractive feature is the fact that the integral equation can be solved analytically for a number of models of physical interest, particularly ionic solutions. Besides, as the short range direct correlation function defined by Eq. (34) cancels out in the MSA outside the hard core, the Stillinger–Lovett moment conditions are trivially satisfied. The MSA represents a more realistic treatment of the hard sphere repulsion than that of the classical DH theory (see Section 2). The PY equation for hard spheres corresponds to the special case when the tail in the potential is absent. In fact, the relation between these two approximations is deeper than this, since they share a common diagrammatic structure [103]. The MSA was originally solved for RPM electrolytes and the direct correlation functions obtained were
$$c_{ij}(r) = \begin{cases} c^{HS}_{ij}(r) - \dfrac{\beta q_i q_j}{4\pi\varepsilon\sigma}\left(2B - \dfrac{B^2 r}{\sigma}\right), & r < \sigma , \\[6pt] -\dfrac{\beta q_i q_j}{4\pi\varepsilon r}, & r > \sigma , \end{cases} \tag{85}$$
where the superscript HS refers to the hard sphere system, and the parameter $B$ is given in terms of $x = k_D\sigma$ by
$$B = \frac{x^2 + x - x(1 + 2x)^{1/2}}{x^2} . \tag{86}$$
Combining the above equation for the MSA direct correlation function with the OZ equation, the radial distribution function can be obtained. Waisman and Lebowitz's solution for the Laplace transform of $r g(r)$ is
$$\tilde g_D(s) = \tilde g_{++}(s) - \tilde g_{+-}(s) = -\frac{1}{\pi n}\,\frac{2p^2 s}{(s^2 + 2ps + 2p^2)\,e^{s\sigma} - 2p^2} , \tag{87}$$
where
$$p = \frac{(1 + 2k_D\sigma)^{1/2} - 1}{2\sigma} . \tag{88}$$
The application of the microthermodynamic equations (energy, virial, compressibility) allows the calculation of the thermodynamic properties. The MSA internal energy takes a particularly simple form:
$$\frac{U - U^{HS}}{V k_B T} = -\frac{k_D^2 B}{4\pi\sigma} . \tag{89}$$
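A simple consistency check on Eqs. (86) and (89) is that the MSA excess energy should reduce to the DH limiting law, $-k_D^3/8\pi$, when $k_D\sigma \to 0$, because $B \simeq x/2$ for small $x$. The sketch below assumes that Eq. (89) is read with a factor $\sigma$ in the denominator, as dimensional consistency requires. The function names are hypothetical.

```python
import math

def msa_B(x):
    """Parameter B of Eq. (86) as a function of x = k_D * sigma."""
    return (x * x + x - x * math.sqrt(1.0 + 2.0 * x)) / (x * x)

def msa_excess_energy(k_d, sigma):
    """(U - U^HS) / (V k_B T) from Eq. (89), read as -k_D^2 B / (4 pi sigma)."""
    return -k_d ** 2 * msa_B(k_d * sigma) / (4.0 * math.pi * sigma)

def dh_excess_energy(k_d):
    """DH limiting-law excess energy density, -k_D^3 / (8 pi)."""
    return -k_d ** 3 / (8.0 * math.pi)

# Ratio of MSA to DH energies for nearly point-like and for finite-size ions
ratio_small = msa_excess_energy(1.0, 1e-3) / dh_excess_energy(1.0)
ratio_finite = msa_excess_energy(1.0, 1.0) / dh_excess_energy(1.0)
```

The ratio tends to 1 as $k_D\sigma \to 0$ and drops below 1 for finite ion sizes, reflecting the weaker electrostatic stabilization of ions that cannot approach each other closer than $\sigma$.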
The pressure (or virial) equation (44) leads to the following equation for the MSA distribution functions:
$$\frac{\beta P^{V}}{n} = 1 + \frac{U - U^{HS}}{3Nk_BT} + \frac{\pi n\sigma^3}{3}\left[g_{11}(\sigma) + g_{12}(\sigma)\right] . \tag{90}$$
The right-hand side of the above equation is made up of a contact term and the contribution of the Coulomb forces. Using a conventional Gibbs–Helmholtz relation, we can obtain the osmotic pressure from the energy equation (89). This route to the pressure is called the energy route, and one can easily show that it leads to results different from those obtained by the pressure route. This thermodynamic inconsistency is one of the major drawbacks of the MSA, and it is due to the unphysical feature of the correlation functions taking negative values at contact distances. This behaviour is a consequence of the overestimation of the short range correlations implicit in the hard sphere potential [143]. This fact, as we shall see in the following, has a major impact on the screening predictions. Besides, the use of the compressibility equation leads to the equation of state of the hard sphere fluid, revealing that charge and number fluctuations are completely uncoupled in the MSA structural scheme, another great drawback of this model. These thermodynamic inconsistencies are also present in the solutions of the MSA integral equation for PM electrolytes, together with another one which affects the screening predictions of the MSA closure relation. Blum [148] solved the OZ equations for the PM electrolyte and proved that the decay constant, $\Gamma$, is given by
$$2\Gamma = \alpha\left[\sum_{i=1}^{n} n_i\left(\frac{z_i - (\pi/2\Delta)\,\sigma_i^2 P_n}{1 + \Gamma\sigma_i}\right)^2\right]^{1/2} , \tag{91}$$
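where $\sigma_i$ are the hard sphere diameters for the symmetric interactions between ions of the same species, and $\sigma_{ij}$ is the usual half-sum of the diameters of ions of species $i$ and $j$. The parameters in the above equation are given by
$$P_n = \frac{1}{\Omega}\sum_k \frac{n_k\sigma_k z_k}{1 + \Gamma\sigma_k} , \qquad \Omega = 1 + \frac{\pi}{2\Delta}\sum_k \frac{n_k\sigma_k^3}{1 + \Gamma\sigma_k} , \qquad \Delta = 1 - \frac{\pi}{6}\sum_k n_k\sigma_k^3 , \qquad \alpha^2 = \frac{4\pi e^2}{\varepsilon k_B T} . \tag{92}$$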
For small ion sizes, the MSA screening factor, $2\Gamma$, recovers its usual DH value, and for the RPM electrolyte ($\sigma_{ij} = \sigma$) it takes the form
$$4\Gamma^2(1 + \Gamma\sigma)^2 = k_D^2 . \tag{93}$$
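The RPM relation in Eq. (93) is easy to explore numerically. The following sketch takes the positive root in closed form, $2\Gamma\sigma = (1 + 2k_D\sigma)^{1/2} - 1$, checks that it satisfies Eq. (93), and compares $2\Gamma$ with the Debye constant and with a third-order low-density series. The function names and the sample values of $k_D$ and $\sigma$ are illustrative assumptions.

```python
import math

def msa_gamma(k_d, sigma):
    """Positive root of 4 G^2 (1 + G sigma)^2 = k_D^2, i.e. 2 G sigma = sqrt(1 + 2 k_D sigma) - 1."""
    return (math.sqrt(1.0 + 2.0 * k_d * sigma) - 1.0) / (2.0 * sigma)

def residual(k_d, sigma):
    """Left-hand side minus right-hand side of Eq. (93) at the closed-form root."""
    g = msa_gamma(k_d, sigma)
    return 4.0 * g * g * (1.0 + g * sigma) ** 2 - k_d * k_d

def low_density_series(k_d, sigma):
    """Small-k_D expansion of 2 Gamma: k_D - k_D^2 sigma / 2 + k_D^3 sigma^2 / 2."""
    return k_d - k_d ** 2 * sigma / 2.0 + k_d ** 3 * sigma ** 2 / 2.0

# MSA screening constant relative to the Debye value for a moderate k_D * sigma
screening_ratio = 2.0 * msa_gamma(1.0, 0.5) / 1.0
series_error = 2.0 * msa_gamma(0.01, 1.0) - low_density_series(0.01, 1.0)
```

Because $(1 + 2x)^{1/2} - 1 < x$ for every $x > 0$, the ratio $2\Gamma/k_D$ stays below 1 at all concentrations in this model and tends to 1 only in the limit $k_D\sigma \to 0$.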
The physically relevant root of the above equation is $2\Gamma\sigma = (1 + 2k_D\sigma)^{1/2} - 1$, which approaches the Debye screening constant from below at low concentrations ($2\Gamma/k_D < 1$). This can be confirmed by performing the low density expansion of the inverse decay length, which can be expressed as [67]
$$2\Gamma(k_D) = k_D - \frac{k_D^2\sigma}{2} + \frac{k_D^3\sigma^2}{2} + O(k_D^4) , \tag{94}$$
where the derivatives of $\Gamma(k_D)$ at the origin ($k_D = 0$) have been evaluated from Eq. (93) using the implicit function theorem. The above expansion confirms the fact that $2\Gamma(k_D)/k_D \to 1^-$ in the limit of vanishing concentrations. This prediction is in marked contrast with the HNC calculations of the decay constant of $1{:}z$ electrolytes [60], which predict that the actual screening constant of an ionic fluid approaches the Debye value from above. Certainly, this is another major shortcoming of this formalism.

The calculation of thermodynamic properties for the PM in the MSA proceeds in the usual manner, making use of the microthermodynamic equations, and they will not be reproduced here. A good review of these properties can be found in Ref. [102]. However, it is worth pointing out that the MSA for the PM shows the same thermodynamic inconsistencies as for the simpler RPM. This unphysical behaviour of the MSA distribution functions is due to the particular description of the short range interaction in the MSA closure relation. The hard sphere potential overestimates the short range repulsion since the ions are not perfect hard spheres, so one has to allow some degree of penetrability (soft core interaction) when modelling the ionic repulsion. Among the improvements of the MSA closure relation, the one by Høye et al. [143] deserves particular attention. These authors made an assumption at the direct correlation function level and introduced a repulsive Yukawa term in the short range interionic potential, postulating a direct correlation function of the form
$$g_{ij}(r) = 0, \quad r < \sigma_{ij} ; \qquad c_{ij}(r) = -\frac{\beta q_i q_j}{4\pi\varepsilon r} + B\,\frac{e^{-\alpha r}}{r}, \quad r > \sigma_{ij} . \tag{95}$$
The function of the Yukawa term is to soften the hard sphere repulsion responsible for the previously mentioned shortcomings of the MSA. The above closure relation is known as the generalized mean
spherical approximation (GMSA) [149,150], and it admits an analytical solution in terms of a coupled system of nonlinear algebraic equations. This set of equations can be solved by means of the same techniques that Wertheim and Thiele employed to solve the PY model for the hard sphere fluid [151,152]. The parameters $B$ and $\alpha$ are functions of the density and temperature that are chosen so as to ensure thermodynamic consistency. The asymptotics of correlation functions in binary symmetric electrolytes in the GMSA have been extensively studied by Leote de Carvalho and Evans [153]. For simplicity, we shall restrict our attention to ionic fluids where ions have equal and opposite charges. Let us define the similar and dissimilar parts of the direct correlation function as
$$\begin{aligned} c_{NN}(r) &= \tfrac{1}{4}\left[c_{++}(r) + c_{--}(r) + 2c_{+-}(r)\right] , \\ c_{NQ}(r) &= \tfrac{1}{2}\left[c_{++}(r) - c_{--}(r)\right] , \\ c_{QQ}(r) &= \tfrac{1}{4}\left[c_{++}(r) + c_{--}(r) - 2c_{+-}(r)\right] , \end{aligned} \tag{96}$$
where the subscripts $+$ and $-$ stand for cations and anions, respectively. $c_{NN}(r)$, $c_{NQ}(r)$ and $c_{QQ}(r)$ are, respectively, the number–number, charge–number and charge–charge direct correlation functions. The above transformation is a particular case of the so-called QN transformation [154]. Leote de Carvalho and Evans proved that only $c_{QQ}(r)$ is not short ranged, so this is the function whose decay behaviour must be analyzed in order to understand the asymptotics of ionic charge–charge correlations. When the short range interaction potentials of cations and anions are the same, $c_{++}(r) = c_{--}(r)$, and consequently
$$\begin{aligned} c_{NN}(r) &= \tfrac{1}{2}\left[c_{++}(r) + c_{+-}(r)\right] , \\ c_{NQ}(r) &= 0 , \\ c_{QQ}(r) &= \tfrac{1}{2}\left[c_{++}(r) - c_{+-}(r)\right] . \end{aligned} \tag{97}$$
The two independent correlation functions $c_{NN}(r)$ and $c_{QQ}(r)$ satisfy the OZ equations in Fourier space [153]. In real space, $c_{NN}(r)$ satisfies the following set of equations:
$$\begin{aligned} h_{NN}(r) &= c_{NN}(r) + n\int d\mathbf{r}'\,h_{NN}(r')\,c_{NN}(|\mathbf{r} - \mathbf{r}'|) , \\ h_{NN}(r) &= g_{NN}(r) - 1 = -1, \quad r < \sigma , \\ c_{NN}(r) &= B\,\frac{e^{-\alpha r}}{r}, \quad r > \sigma . \end{aligned} \tag{98}$$
These equations have the form of the MSA for a Yukawa pair potential. The screening of the interionic potential is determined by the asymptotic behaviour of the charge–charge total correlation
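function, which in the MSA satisfies the following equations:
\[
\begin{aligned}
h_{QQ}(r) &= c_{QQ}(r) + n\int d\mathbf{r}'\, h_{QQ}(r')\, c_{QQ}(|\mathbf{r} - \mathbf{r}'|), \\
h_{QQ}(r) &= g_{QQ}(r) = 0, \qquad r < \sigma, \\
c_{QQ}(r) &= -\frac{\beta q_i q_j}{4\pi\varepsilon r}, \qquad r > \sigma,
\end{aligned}
\tag{99}
\]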
which is simply the MSA for the RPM. Therefore, despite its thermodynamic consistency, the GMSA is expected to exhibit the same deficiencies as the MSA for the screening length of ionic solutions at low concentrations. Although MSA-like theories show some inconsistencies related to the hard sphere interaction and lead to wrong screening predictions at low concentrations, their main advantage is that they admit a completely analytical solution. These shortcomings normally disappear if a soft sphere model is used. In this case, unfortunately, a strictly analytical calculation of the distribution functions is not possible, and numerical calculations are needed to solve more sophisticated integral equations such as the Percus–Yevick (PY) or the hypernetted chain (HNC) equations.

3.2. Percus–Yevick and hypernetted chain approximations

Both the Percus–Yevick (PY) and the hypernetted chain (HNC) approximations have been applied to obtain the structure of electrolyte solutions. However, HNC is better suited to the study of ionic systems, as has been shown by calculations for electrolytes [146], molten salts [155] and plasmas [156]. The PY approximation for a monocomponent system with pair potential $\phi_{ij}(r)$ is given by
\[ g_{ij}(r) = e^{-\beta\phi_{ij}(r)}\,[g_{ij}(r) - c_{ij}(r)]. \tag{100} \]
This closure relation is highly successful in calculating the structure of fluids with short-ranged potentials. Nevertheless, it is not appropriate for explaining the long range exponential screening of the radial distribution function of an ionic system, because the PY radial distribution function always decays like the pair potential $\phi_{ij}(r)$. In contrast, the HNC approximation is able to describe accurately the long range correlations in the bulk. The latter approximation rests on neglecting the bridge diagrams ($b_{ij}(r) = 0$) in the exact closure relation for the correlation functions (Eq. (32)):
\[ h_{ij}(r) = -1 + \exp[h_{ij}(r) - c_{ij}(r) - \beta\phi_{ij}(r) + b_{ij}(r)]. \]
A diagrammatic analysis shows that the HNC sums a larger class of diagrams than its PY counterpart, so its greater success is not surprising at all [103]. The OZ equation subject to the HNC closure relation can be solved numerically for the RPM [146,157,158], and it brings considerable improvement over the MSA. The HNC is also applicable to soft sphere systems, for which the MSA in its classical form is not valid. Using Tosi–Fumi potentials, HNC is capable of accurately predicting the main features of the pair correlations of ionic systems [159]. Thus, in the majority of practical situations HNC results for the pair correlation functions are considered to be exact. It is easy to see that for low coupling the HNC reduces to the MSA, and eventually to DH
Fig. 3. Pair correlation functions for the RPM in the HNC and reference HNC (RHNC) approximations against the reduced distance for a reduced density $n^* = 0.669$ and reduced Bjerrum length $l_B/\sigma = 35.674$ (Ref. [158]).
theory. The neglect of the bridge function is the characteristic approximation of the HNC integral equation theory, and it is equivalent to writing
\[
c_{ij}(r) = h_{ij}(r) - \ln g_{ij}(r) - \beta\phi_{ij}(r) = h_{ij}(r) - \ln[1 + h_{ij}(r)] - \beta\phi_{ij}(r) \simeq -\beta\phi_{ij}(r) + \tfrac{1}{2}h_{ij}^2(r) - \tfrac{1}{3}h_{ij}^3(r).
\tag{101}
\]
Consequently, HNC can be seen as a correction to the MSA. Use of the OZ relation to eliminate $c_{ij}(r)$ in Eq. (32) with $b_{ij}(r) = 0$ leads to the HNC integral equation:
\[
\ln g_{ij}(r) + \beta\phi_{ij}(r) = n\int [g_{ij}(|\mathbf{r} - \mathbf{r}'|) - 1]\,[g_{ij}(r') - 1 - \ln g_{ij}(r') - \beta\phi_{ij}(r')]\, d\mathbf{r}'.
\tag{102}
\]
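The statement that HNC corrects the MSA through the small-$h$ expansion in Eq. (101) can be checked numerically. The following is only a minimal sketch: the function names are hypothetical, and it compares the exact HNC remainder $h - \ln(1+h)$ (that is, $c + \beta\phi$) with its truncation $h^2/2 - h^3/3$.

```python
import math


def hnc_remainder_exact(h: float) -> float:
    """Exact HNC remainder h - ln(1 + h), i.e. c(r) + beta*phi(r) in Eq. (101)."""
    return h - math.log1p(h)


def hnc_remainder_cubic(h: float) -> float:
    """Truncated expansion h^2/2 - h^3/3 of the same quantity."""
    return h * h / 2.0 - h ** 3 / 3.0


if __name__ == "__main__":
    # The difference should scale like h^4/4 for small h (weak coupling).
    for h in (0.01, 0.1, 0.5):
        exact = hnc_remainder_exact(h)
        cubic = hnc_remainder_cubic(h)
        print(f"h={h:4.2f}  exact={exact:.6e}  cubic={cubic:.6e}  diff={exact - cubic:.1e}")
```

For weak correlations the two expressions agree closely, while for $h$ of order unity the truncation is no longer reliable and the full logarithm has to be kept.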
The numerical solution of the OZ equation coupled to the HNC closure relation can be performed by fast Fourier transform techniques, and it yields the pair correlations of the system. Some previously reported HNC results for the pair correlation functions of ionic solutions are shown in Fig. 3. The direct correlation function of a concentrated RPM electrolyte solution is plotted in this figure together with the HNC and its reference version (RHNC) predictions [158]. $l_B/\sigma$ and $n^* = n\sigma^3$ are the reduced Bjerrum length and density of the system. The numerical solution of the HNC equations
yields satisfactory results for the pair correlations of the PM electrolyte and similar ionic systems over a considerable range of thermodynamic states, despite its breakdown at low concentrations and at temperatures above the coexistence region between the gas and liquid phases [160,161].

3.3. Simulation results

Computer simulation techniques (Monte Carlo (MC) and molecular dynamics (MD) simulations) have also been applied to calculate structural and dynamical properties of electrolyte solutions since their appearance in the 1950s. Both the MC simulations of Metropolis and coworkers [162] and the MD simulations introduced, also in the fifties, by Alder and Wainwright [163] have been landmarks in the physics of fluids, as they provide almost empirical data for the correlation functions. The values obtained by means of computer simulation procedures are considered to be exact for a given interaction potential. However, relevant differences between the MC and MD methods exist. While in the MD method the sequence of microstates is obtained by solving Newton's equations to calculate the successive positions of the particles of the system, in the MC method the trajectories in phase space are generated according to a given probability distribution (usually that of the canonical or the grand canonical ensemble). Therefore, the MD method is more suitable for studying dynamical properties, while MC is normally used to calculate thermodynamic properties. The first MC simulations of PM electrolyte solutions were performed in the 1960s [164]. In the following decade, many results were reported for PM electrolyte solutions [26,27,165,166]; they proved invaluable for discriminating between competing theories and provided new insight into several structural features of these systems.
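The way MC generates configurations "according to a given probability distribution" can be illustrated with a toy Metropolis sampler. This is only a hedged sketch, not the procedure of Refs. [162,164]: the system is a single particle in the harmonic potential $U(x) = x^2/2$ (an assumption made for simplicity, in place of an electrolyte model), the function name is hypothetical, and the canonical average $\langle x^2\rangle = 1/\beta$ serves as a check.

```python
import math
import random


def metropolis_mean_x2(beta: float = 1.0, n_steps: int = 200_000,
                       max_step: float = 1.5, seed: int = 1) -> float:
    """Estimate <x^2> for U(x) = x^2/2 in the canonical ensemble.

    Trial moves are accepted with probability min(1, exp(-beta * dU)),
    so that configurations are visited with weight exp(-beta * U).
    """
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_step, max_step)
        d_u = 0.5 * (trial * trial - x * x)
        if d_u <= 0.0 or rng.random() < math.exp(-beta * d_u):
            x = trial  # accept the move; otherwise keep the old state
        total += x * x  # rejected moves also count in the average
    return total / n_steps


if __name__ == "__main__":
    print("MC estimate of <x^2>:", metropolis_mean_x2(), "(canonical value: 1.0)")
```

No equations of motion are integrated here, which is why a chain of this kind gives equilibrium averages but no dynamical information.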
Computer simulations of electrolyte solutions have been performed continuously up to the present day, and the great number of contributions published almost every day makes it impossible to review this matter. To mention only a few results, Zhang and coworkers [30] studied a soft ion model of a symmetric 1:1 electrolyte solution with Grand Canonical Monte Carlo (GCMC) simulations and compared the results to those derived from conventional HNC integral equation theory, using both the standard Ewald summation method and the so-called minimum image (MI) method, a technique where only the interaction with the nearest image of each particle is considered. Comparison of both methods indicates that the less expensive MI gives good results for medium to high electrolyte concentrations. MC calculations have also been reported for the sticky spheres model by Shew et al. [31]. Much effort has been devoted to the study of the phase behaviour and critical parameters of charge and size symmetric and asymmetric electrolyte solutions (see, for example, Caillol et al. [32] and Yan and de Pablo [33]). On the other hand, MD simulations of 1:1 and 1:2 electrolyte solutions have been reported by Heinzinger for several symmetric and asymmetric electrolyte solutions [34] and by Suh et al. [35]. Similarly, a soft sphere model has recently been studied by Zhang et al. [36].

4. Field theory of ionic solutions

Since the original formulation of the PB theory, specific attention has been paid to its failures, as it was known to give reliable results only in the limit of low-valency ions, low densities or high temperatures, as we have seen in Section 2. These shortcomings of the PB theory leave us in a situation where no systematic theory is available for the distribution of counterions around charged
objects in the limit of finite concentrations or high ionic valences. Moreover, the exact nature and origin of the DH term has remained somewhat unclear and has been the object of a great number of improvements, some of which have been summarized in the previous sections. However, it is perhaps in the context of systematic field theory (FT) where the exact nature and meaning of the PB approximation is most adequately understood [23,24,42–50]. A good treatment of the effective FT of ionized systems is that of Brown and Jaffe [167]. In the context of this theoretical framework, it has been proven that the classical DH theory (i.e. the PB approximation) constitutes the saddle point of the exact FT, so the classical theory is adequately contextualized as the Gaussian or one-loop level of the exact FT [43]. At the same order of approximation, Kholodenko and Beyerlein have established the connection between the PB equation and the sine-Gordon equation [42]. Including multiloop diagrams, one can go beyond the PB approximation. At this level, non-trivial multibody interactions appear, so the multibody correlations acquire contributions which cannot be described as superpositions of pair correlations. An extensive series of studies of the FT of ionic systems by Netz and coworkers was inaugurated by Ref. [43]. In this series, the one and two component plasmas have been studied by Moreira and Netz [46], Netz and Orland [44] and, recently, by Brilliantov et al. [50]. Besides, the same authors reported a non-linear FT for a fluctuating counterion distribution in the presence of a fixed, arbitrary charge distribution, and analyzed the fluctuation corrections to the electrostatic potential and counterion distributions [45,47]. The edl has been studied by Netz from the FT perspective in the strong coupling regime, the opposite limit to that of the PB theory [48].
The same author has reported FT results for the contributions to the van der Waals interaction between two dielectric semi-infinite half-spaces in the presence of mobile salt ions [49]. Maybe the most relevant result for our present purpose is the demonstration that systematic FT corrections to DH theory effectively lead to a renormalization or rescaling of the ionic charges as well as of the DH screening length [42]. This result has been derived by both perturbative and non-perturbative methods, and it has been shown in Ref. [42] that these quantities include the effects of high order ionic correlations. As we shall see in the following, this result is extremely important for the formulation of the exact mean-field theory of ionic solutions. Let us consider a symmetric electrolyte solution. As we have seen previously (Section 2.2.3), in this important case the PB equation can be written as
\[ \frac{q}{k_B T}\,\nabla^2\psi = k_D^2 \sinh\!\left(\frac{q\psi}{k_B T}\right). \]
This equation closely resembles the static case of the three-dimensional version of the famous sine-Gordon equation if the hyperbolic sine function is replaced by a sine. This change can be made by the substitution $\psi \to i\psi$, which leads to
\[ \frac{q}{k_B T}\,\nabla^2\psi = k_D^2 \sin\!\left(\frac{q\psi}{k_B T}\right), \tag{103} \]
that is the sine-Gordon three-dimensional equation itself. This is the starting point of the systematic FT of Kholodenko and Beyerlein, who provided a route to study simultaneously the systematic corrections to DH results as well as the possibility of a phase transition in ionic systems. The main features of their derivation, as far as screening is concerned, are as follows.
The grand partition function for the 1:1 electrolyte system can be written as
\[ \Xi = \sum_{N_+=0}^{\infty}\,\sum_{N_-=0}^{\infty} \frac{\lambda_+^{N_+}\,\lambda_-^{N_-}}{N_+!\,N_-!}\; Z(N_+, N_-), \tag{104} \]
where $N_i$ is the number of ions of ionic species $i$ ($N = N_+ + N_-$) and $\lambda_i = \exp(\mu_i/k_B T)$. For a symmetric electrolyte $\lambda_+ = \lambda_- = \lambda$. $Z(N_+, N_-)$ is the canonical partition function of an electrolyte with those numbers of particles of each species. If the translational degrees of freedom are treated classically, we can write
\[ Z(N_+, N_-) = \left(\frac{2\pi m k_B T}{h^2}\right)^{ND/2} \int_V \prod_{k=1}^{N} d^D r_k\, \exp\!\left(-\frac{V_{el}}{k_B T}\right), \tag{105} \]
where $m$ is the mass of the ions (supposed to be approximately equal in the original derivation) and $D$ is the dimensionality of the space. The integral extends over the volume of the system compatible with the short range hard core repulsion forces, the only type of short range contribution to the potential energy of the system in the PM. $V_{el}$ is the potential energy due to the electrostatic interaction and can be written as
\[
V_{el} = \frac{q^2}{8\pi\varepsilon}\left[ \sum_{\substack{i,j=1 \\ i \neq j}}^{N_+} \frac{z_+^2\,\Theta(r_{ij} - \sigma_{ij})}{r_{ij}^{D-2}} + \sum_{\substack{i,j=1 \\ i \neq j}}^{N_-} \frac{z_-^2\,\Theta(r_{ij} - \sigma_{ij})}{r_{ij}^{D-2}} + 2\sum_{i=1}^{N_+}\sum_{j=1}^{N_-} \frac{z_+ z_-\,\Theta(r_{ij} - \sigma_{ij})}{r_{ij}^{D-2}} \right],
\tag{106}
\]
where $\Theta(r_{ij} - \sigma_{ij})$ is the usual Heaviside or step function. The next step in the derivation of the FT action is to write the partition function as a sum on a $D$-dimensional hyperlattice of lattice spacing $a$. In this way one can avoid the step functions, the integration can be replaced by a summation, and the function $1/r_{ij}^{D-2}$ by a lattice propagator $G^{(D)}(i, j)$, where $i$ and $j$ now stand for lattice sites. Each lattice site can be either occupied or unoccupied by an ion, which enables us to introduce an Ising-like variable $s(i) = \pm 1$. Following Glimm and Jaffe [168], the grand partition function can be rewritten as
\[ \Xi = \sum_{N=0}^{\infty} \lambda^N a^{ND} \sum_{\{s_N\}}\; \sum_{\substack{i_k \in V \\ k = 1, 2, \ldots, N}} Z(N), \tag{107} \]
where the sum over the set of $s_N$'s must be understood as the sum over all possible configurations of the lattice spins, and the last sum in the above expression extends over all the lattice sites in the volume $V$. On the other hand, the canonical partition function reads
\[ Z(N) = \exp\left\{ -\frac{q^2}{8\pi\varepsilon k_B T} \sum_{\substack{k,l=1 \\ k \neq l}}^{N} s(i_k)\, G^{(D)}(i_k, i_l)\, s(i_l) \right\}. \tag{108} \]
The lattice version of the grand partition allows a formulation of the FT closely analogous to the original Ising model while avoiding having to take the short range repulsion into account. This
latter interaction enters the continuum formalism in a somewhat obscure manner, as it restricts the volume of integration in a way which is difficult to define. Using conventional FT methods we get
\[ \Xi = \int D[\phi]\, \exp\left\{ -\int d\mathbf{r} \left[ \tfrac{1}{2}(\nabla\phi)^2 - 2\lambda\cos(\beta\phi) \right] \right\}, \tag{109} \]
where $\beta^2 = q^2/(\varepsilon k_B T)$ and the fluctuating field, $\phi$, is nothing but the electrostatic scalar potential created by an ion in the bulk [167], and it generalizes the spin variables of the previous lattice model. More precisely, $-i\phi$ is the normal electrostatic potential, and the rotation of the contour of the functional integral is necessary to obtain an absolutely convergent functional integral. Expanding the exponent on the rhs of the above equation and keeping only the terms up to the lowest order in the field, we get the DH grand partition function
\[ \Xi_{DH} \simeq e^{2\lambda V} \int D[\phi]\, \exp\left\{ -\frac{1}{2}\int d\mathbf{r}\, [(\nabla\phi)^2 + \omega^2\phi^2] \right\}, \tag{110} \]
where $\omega^2 = 2\lambda\beta^2 = k_D^2$. The above functional integral is Gaussian and its evaluation is straightforward, resulting finally in [42]
\[ \frac{PV}{k_B T} = \ln\Xi = 2\lambda V - \frac{V}{2(2\pi)^2} \int_0^{k_0} dk\, k^2 \ln\frac{k^2 + \omega^2}{2\pi}, \tag{111} \]
where $k_0$ is a cutoff of order $a^{-1}$ introduced to exclude the interactions of the charges with their own fields, originally included in the partition function. Performing an analytic continuation to $D = 3$ and using the dimensional regularization procedure, it is straightforward to obtain
\[ \frac{PV}{k_B T} = \ln\Xi = 2\lambda V - \frac{V\omega^3}{24\pi}. \tag{112} \]
One can now recall the usual grand-canonical expression for the mean number of particles in the system,
\[ N = \lambda\,\frac{\partial \ln\Xi}{\partial\lambda} \simeq 2\lambda V, \]
to recover the classical DH equation of state:
\[ \frac{PV}{k_B T} = N - \frac{V k_D^3}{24\pi}. \tag{113} \]
In the above result, the first term on the rhs corresponds to the ideal gas contribution to the pressure, while the second term is the DH one, associated with the electrostatic interactions. The lowest approximation, where all the anharmonic terms in $\phi$ in the potential $-2\lambda\cos(\beta\phi)$ in Eq. (109) are neglected, is thus equivalent to the classical DH theory. In this case, the pair interaction between two test particles is just the DH potential, and the two-point correlation function is just $g_{ij}(r) \propto \exp(-q_i\psi_j/k_B T)$, with a proportionality constant such that it is normalized. Neglecting the high order terms in the potential implies that there is no multibody interaction between test particles and, therefore, high order correlation functions are just products of the pair correlation function [43], $g_{ijk}(r_1, r_2, r_3) = g_{ij}(r_{12})\, g_{jk}(r_{23})\, g_{ik}(r_{13})$. This corresponds to the usual Kirkwood approximation in liquid-state theory, which is exactly obeyed in the DH theory. The DH theory contains correlations of all orders [43], but on a linear level. Corrections to the DH results come from non-trivial higher body effective interactions, i.e. from violations of the superposition principle.
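To give a feeling for the size of the DH term in Eq. (113), the following sketch evaluates $P/(n k_B T) = 1 - k_D^3/(24\pi n)$ for an aqueous salt solution. It is an illustration under stated assumptions rather than part of the FT derivation: SI units are used, the Debye constant is taken as $k_D^2 = 4\pi l_B \sum_i n_i z_i^2$ with $l_B$ the Bjerrum length, the default relative permittivity (78.4) and temperature (298.15 K) are those of water at room temperature, and all function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS_0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_AVOGADRO = 6.02214076e23    # 1/mol


def bjerrum_length(eps_r: float = 78.4, temperature: float = 298.15) -> float:
    """Bjerrum length l_B = e^2 / (4 pi eps_0 eps_r k_B T), in metres."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS_0 * eps_r * K_B * temperature)


def ion_densities(molar: float, stoich):
    """Number densities (1/m^3) of each ionic species for a salt at `molar` mol/L."""
    n_salt = molar * 1.0e3 * N_AVOGADRO
    return [nu * n_salt for nu in stoich]


def debye_constant(molar: float, valences, stoich,
                   eps_r: float = 78.4, temperature: float = 298.15) -> float:
    """Debye constant k_D (1/m) from k_D^2 = 4 pi l_B sum_i n_i z_i^2."""
    densities = ion_densities(molar, stoich)
    sum_nz2 = sum(n * z * z for n, z in zip(densities, valences))
    return math.sqrt(4.0 * math.pi * bjerrum_length(eps_r, temperature) * sum_nz2)


def dh_pressure_ratio(molar: float, valences, stoich,
                      eps_r: float = 78.4, temperature: float = 298.15) -> float:
    """P / (n k_B T) = 1 - k_D^3 / (24 pi n), with n the total ion density (Eq. (113))."""
    n_total = sum(ion_densities(molar, stoich))
    k_d = debye_constant(molar, valences, stoich, eps_r, temperature)
    return 1.0 - k_d ** 3 / (24.0 * math.pi * n_total)


if __name__ == "__main__":
    k_d = debye_constant(0.1, valences=(1, -1), stoich=(1, 1))
    print(f"Debye length at 0.1 M (1:1): {1e9 / k_d:.3f} nm")
    print(f"P/(n k_B T) in the DH limit: {dh_pressure_ratio(0.1, (1, -1), (1, 1)):.3f}")
```

For a 0.1 M 1:1 electrolyte the electrostatic term already lowers the pressure by roughly ten percent relative to the ideal gas value, which indicates how quickly corrections beyond the Gaussian level may become relevant.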
Thus, one recovers the classical PB results in this FT formalism but, recalling the approximation that leads to Eq. (110), one can now clearly see a way to obtain systematic corrections to the DH predictions due to non-linear effects and higher order correlations and fluctuations. Kholodenko and Beyerlein also proved that the DH correlation function, $g(l, m)$, is adequately recovered in this FT formalism. They considered the expression
\[ g(l, m) = \frac{\lambda^2 V^2}{N^2} \left\langle \exp\{ i\beta\,[\phi(l) - \phi(m)] \} \right\rangle_{DH}, \tag{114} \]
where the functional average in the perturbative scheme is understood in the usual sense,
\[ \langle A[\phi] \rangle_{DH} = \frac{\int D[\phi]\, A[\phi]\, e^{-S_{DH}[\phi]}}{\int D[\phi]\, e^{-S_{DH}[\phi]}}. \tag{115} \]
The functional average in Eq. (114) leads to purely Gaussian integrals that can be evaluated by the usual shift procedure, giving
\[ g(r_1, r_2) = C \exp\left[ \frac{q^2}{4\pi\varepsilon k_B T}\, \frac{\exp(-k_D r_{12})}{r_{12}} \right], \tag{116} \]
an expression which recovers the classical DH potential of mean force:
\[ A_{12}(r_{12}) = -\frac{q^2}{4\pi\varepsilon}\, \frac{\exp(-k_D r_{12})}{r_{12}}. \tag{117} \]
Thus, the approximation in Eq. (110) leads to the classical DH result, contextualizing this theoretical scheme as the Gaussian or lowest order level of the systematic field theory. In this formalism one can obtain systematic corrections to this result by retaining higher order terms in the expansion of the action in Eq. (109). The grand-canonical partition function can be written as
\[ \Xi = \Xi_{DH}\, \frac{\Xi}{\Xi_{DH}}, \tag{118} \]
which leads to a pressure equation of the form
\[ \frac{PV}{k_B T} = N - \frac{V k_D^3}{24\pi} + \ln\frac{\Xi}{\Xi_{DH}}. \tag{119} \]
The logarithm on the rhs of the above result can be expressed as
\[ \ln\frac{\Xi}{\Xi_{DH}} = -S[\phi] + S_{DH}[\phi] = -S_{int}[\phi] = -2\lambda \int d\mathbf{r} \left[ \cos(\beta\phi) - 1 + \frac{\beta^2\phi^2}{2} \right], \tag{120} \]
where $S_{DH}[\phi] = \tfrac{1}{2}\int d\mathbf{r}\, [(\nabla\phi)^2 + \omega^2\phi^2]$. Expanding the integrand in the above expression and keeping only the quartic term we get
\[ S_{int}[\phi] = \frac{\lambda\beta^4}{12} \int d\mathbf{r}\, \phi^4, \tag{121} \]
a result that allows a computation strictly analogous to that of the one-component scalar $\phi^4$ field [169]. Performing the integrals associated with the corresponding high order diagrams, Kholodenko and Beyerlein obtained for the pressure [42]:
\[ \frac{P}{n k_B T} = 1 - \frac{k_D^3}{24\pi n} + \frac{3 k_D^6}{16\pi^2 n^2} + \frac{9 k_D^9}{64\pi^3 n^3} + \cdots, \tag{122} \]
where $n$ represents the total number density of the fluid, as usual. The third term on the rhs of the above equation corresponds to a correction of order $c^2$, as predicted in the classical theory of electrolytes. However, in contrast to what happens with the pressure, FT does not generally lead to expressions directly applicable to real experimental situations. For this purpose one is forced to use mean-field results. In this sense, Kholodenko and Beyerlein proved by FT methods that the classical mean-field theory of electrolyte solutions can be preserved if renormalized parameters (charges and screening length) are used in the actual expressions, a rescaling that is also demanded by the Stillinger–Lovett moment conditions. Kholodenko and Beyerlein concentrated on the two-particle potential of mean force and computed the corrections to the DH result in Eq. (117). The result of the diagrammatic expansion for the full inverse propagator (the inverse of the ordinary screened Coulomb potential $\psi(r)$ [44]) can be represented as [170,171]
\[ -\left[ \frac{4\pi\varepsilon}{q^2}\, \hat{\psi}(k; \omega^2) \right]^{-1} = k^2 + \omega^2 + \Sigma(k; \omega^2, \lambda), \tag{123} \]
where $\Sigma(k; \omega^2, \lambda)$ is the so-called mass operator, which comprises the interaction effects. This operator can be decomposed as
\[ \Sigma(k; \omega^2, \lambda) = k^2 f_1(k; \omega^2, \lambda) + \omega^2 f_2(k; \omega^2, \lambda), \tag{124} \]
and the most general functional form of $f_i(k; \omega^2, \lambda)$ can be deduced from dimensional analysis and is
\[ f_i(k; \omega^2, \lambda) = \varphi_i\!\left( \frac{k^2}{\omega^2}, \frac{\omega^3}{\lambda}, \frac{k^2}{\lambda^{2/3}} \right). \tag{125} \]
Expanding $\varphi_i$ in a Taylor series in $k^2$ in the limit $k \to 0$ (long distances, low concentration) and keeping terms up to $k^2$, the inverse propagator can be expressed as
\[ -\left[ \frac{4\pi\varepsilon}{q^2}\, \hat{\psi}(k; \omega^2) \right]^{-1} \simeq k^2 f_1\!\left(\frac{\omega^3}{\lambda}\right) + \omega^2 f_2\!\left(\frac{\omega^3}{\lambda}\right) + O(k^4). \tag{126} \]
From this result it is apparent that corrections to the classical DH result effectively lead to renormalized values of the screening parameter ($k_D$) and coupling constant ($q$). Thus, rigorous FT calculations show that at low but finite concentrations the classical mean-field formalism can be preserved, despite the existence of ionic correlations and high order couplings, if one allows for the renormalization of the interaction parameters. Inverting the Fourier transform in Eq. (126) and using Eq. (123), one gets for the interionic field:
\[ \psi(r) = -\frac{q^{*2}}{4\pi\varepsilon}\, \frac{\exp(-\kappa r)}{r}, \tag{127} \]
where
\[ q^{*2} = \frac{q^2}{f_1(\omega^3/\lambda)}, \qquad \kappa^2 = k_D^2\, \frac{f_2(\omega^3/\lambda)}{f_1(\omega^3/\lambda)} \tag{128} \]
are the expressions of the effective interaction parameters. In Section 6, we shall prove by conventional statistical mechanical methods that the dielectric permittivity of the bulk solution also has to be renormalized in the exact mean-field theory. The above result proves that the classical DH form for the average potential in a bulk electrolyte solution is maintained even in a situation where ionic correlations exist. The only price to pay is the use of effective parameters. The situation is the same for the radial distribution function of the ionic system, which can be obtained by recalling the relation between the potential of mean force and the correlation function itself:
\[ g(r) = C \exp\left[ \frac{q^{*2}}{4\pi\varepsilon k_B T}\, \frac{\exp(-\kappa r)}{r} \right], \tag{129} \]
an expression that must be compared to the classical DH result in Eqs. (20) and (21). This definitively proves that the corrections to DH theory can be adequately represented by substituting effective interaction parameters for the conventional ones. The evaluation of these parameters from statistical mechanics and their application to equilibrium and transport properties of ionic solutions is the main aim of the rest of this report.

5. Calculation of the effective parameters

The inability of the classical mean-field theory to include ionic correlations can be overcome by using effective charges and screening lengths, as we have repeatedly mentioned in the previous sections. Much effort has been devoted to obtaining these quantities, not only because they are powerful theoretical tools for understanding the physics of ionic systems, but also because of their practical relevance in some experimental situations. The most important analytic theories include asymptotic expansions [55,61], self-consistent theories [59], and the conventional MSA.
The derivation of theoretical results for the effective decay length of a fluid dates back to the 1960s, when Stell and Lebowitz [172] obtained an expression for the effective screening parameter in terms of the integral of the total correlation function of a reference fluid. However, their correction cancels out for equal short range interionic interactions. A similar deficiency is found in the asymptotic theory of Mitchell and Ninham [61], but not in its extension to symmetric electrolytes due to Kjellander and Mitchell [55]. Hydrodynamic considerations and the Stillinger–Lovett fourth-order moment condition have also been used to obtain formal expressions for the effective screening length in terms of the isothermal compressibility [173,174]. Parrinello and Tosi [175] analytically obtained the screening length of a Coulomb fluid in the MSA, but their predictions reflect the overestimation of the ionic correlations due to the hard sphere potential. Outhwaite [176] has discussed various transcendental equations for the screening length of PM electrolytes based on a modified version of PB theory [177].
Blum [148] and Blum and Høye [179] found that the various properties of the unrestricted PM electrolyte in the MSA could be expressed in terms of a single parameter, the screening length, which could also be expanded in terms of the Debye screening length and the ionic diameter. For a binary symmetric electrolyte their result was
\[ \frac{\kappa}{k_D} = \frac{-1 + \sqrt{1 + 2 k_D\sigma}}{k_D\sigma}. \tag{130} \]
The major deficiency of this equation is that it does not predict a transition to the oscillatory regime of the pair correlations. This transition is known as the $\kappa$ transition and it corresponds to the transitions first studied by Kirkwood [53]. As the density of ions in the system is increased at fixed temperature, a cross-over from monotonic, charge dominated decay to damped oscillatory, charge dominated decay occurs at a certain value. This is termed the Kirkwood cross-over, as it was Kirkwood who, in 1936, first described this phenomenon in his discussion of the potential of mean force in strong electrolytes. The onset of charge oscillations at a certain concentration is a general property of ionic systems, so any theory which intends to provide a detailed description of screening in this kind of system must predict this transition. The main aim of this section is to analyze in detail the main theoretical frameworks derived for the prediction of non-Debye ionic parameters.

5.1. Asymptotic expansions

Deviations from the Debye length are due to ionic correlations and high order electrostatic coupling, and they can become important even at very low concentrations (e.g. for a 2:1 electrolyte the real decay length is reduced by 18% from the Debye length at $10^{-1}$ M). The theoretical expressions that relate the effective decay constant to the Debye constant are known as asymptotic expansions. The first of these expressions was derived by Mitchell and Ninham in the late 1970s [61]. They showed through a resummation of the diagrammatic expansion that the decay length is given by an asymptotic expansion that depends on concentration. Their arguments were generalized by Knackstedt and Ninham [180] and they are summarized below. Let us consider a mixture of electrolytes containing $s$ species of valences $z_i$ and number densities $n_i$.
The electrostatic interaction in the bulk at large distances is governed by the asymptotic behaviour of the pair distribution function $g_{ij}(r)$, which is assumed to have the asymptotic form $1 - (\tilde{A}_{ij}/r)\exp(-Br)$, with $\tilde{A}_{ij}$ a constant to be determined. The corresponding direct correlation function is obtained through the OZ equation, whose Fourier transform is given in Eq. (23). The general solution of the OZ relations is of the form
\[ \hat{h}_{ij}(k) = \frac{\hat{c}_{ij}(k) + \sum_{l}^{s} n_l\, [\hat{c}_{il}(k)\hat{c}_{lj}(k) - \hat{c}_{ij}(k)\hat{c}_{ll}(k)]}{\left| 1 - \sqrt{n_i n_j}\, \hat{c}_{ij} \right|}, \tag{131} \]
where $|\ldots|$ stands for a determinant. Through a resummation of a diagrammatic expansion, Mitchell and Ninham [61] showed that for $k_D\sigma_{ij} \ll 1$ the Fourier transform of the direct correlation function has the form
\[ \hat{c}_{ij}(k) = -\frac{4\pi\Lambda}{\eta^2 k_D^3}\, z_i z_j + \frac{2\pi\Lambda^2}{\eta\, k_D^3}\, z_i^2 z_j^2 \tan^{-1}(\eta/2), \tag{132} \]
where $\Lambda = \beta k_D e^2/\varepsilon$ is a coupling constant of the fluid and $\eta = \kappa/k_D$ is the renormalization parameter. The decay constant of the pair distribution function is determined by the zero of $|1 - \sqrt{n_i n_j}\, \hat{c}_{ij}|$ with the smallest imaginary part, which is the dominant pole of $\hat{h}_{ij}(k)$. To the lowest order this pole can be shown to be [180]
\[ -i\eta_0 = 1 + \frac{\Lambda \ln 3}{8}\, \frac{\left(\sum_k^s n_k z_k^4\right)\left(\sum_k^s n_k z_k^2\right) - \sum_{k<l} n_k n_l z_k^2 z_l^2 (z_k - z_l)^2}{\left(\sum_k^s n_k z_k^2\right)^2} + O(\Lambda^2 \ln\Lambda), \tag{133} \]
where $i = \sqrt{-1}$. It is noteworthy that the correction term vanishes for symmetric electrolyte solutions, so in the limit of vanishing concentration the correction to the usual Debye constant for this type of electrolyte is linear in the molar concentration of ions. The decay constant of the pair distribution function can be easily obtained from the above result, and reads
\[ B = k_D \left[ 1 + \frac{\Lambda \ln 3}{8}\, \frac{\left(\sum_k^s n_k z_k^4\right)\left(\sum_k^s n_k z_k^2\right) - \sum_{k<l} n_k n_l z_k^2 z_l^2 (z_k - z_l)^2}{\left(\sum_k^s n_k z_k^2\right)^2} + O(\Lambda^2 \ln\Lambda) \right]. \tag{134} \]
This expression does not exhibit any dependence on the hard sphere diameter of the ions, and it recovers the DH decay constant in the limit $c \to 0$. It must be noted that the correction term vanishes identically for symmetric electrolytes. As we mentioned previously, the deviations from the classical Debye length are quite marked for asymmetric electrolytes (18% for a 2:1 electrolyte at a concentration as low as 0.1 M), so they can affect the integration of measured thermodynamic properties and surface forces. Thus, it is of great importance to use the correct asymptotic forms, especially in double-layer problems. The next order term was obtained by Kjellander and Mitchell [55], who used similar diagrammatic techniques to include a term for symmetric electrolytes:
\[ \frac{\kappa}{k_D} = 1 + \frac{\Lambda \ln 3}{8}\, \frac{\left(\sum_k^s n_k z_k^3\right)^2}{\left(\sum_k^s n_k z_k^2\right)^2} + \frac{\Lambda^2 \ln\Lambda}{12}\, \frac{\left(\sum_k^s n_k z_k^4\right)^2}{\left(\sum_k^s n_k z_k^2\right)^2} + O(\Lambda^2). \tag{135} \]
For a one component plasma (OCP) or for asymmetric binary electrolytes this result reduces to the Mitchell and Ninham result, of which it is a direct generalization.
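The leading correction in Eq. (134) is simple enough to evaluate directly. The sketch below is an illustration under assumptions that are not fixed by the text: SI units are used, the coupling constant is taken as $\Lambda = k_D l_B$ with $l_B$ the Bjerrum length of water at 298.15 K (relative permittivity 78.4), the pair sum is restricted to $k < l$, and all function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS_0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_AVOGADRO = 6.02214076e23    # 1/mol


def bjerrum_length(eps_r: float = 78.4, temperature: float = 298.15) -> float:
    """Bjerrum length in metres."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS_0 * eps_r * K_B * temperature)


def valence_factor(densities, valences) -> float:
    """Dimensionless ratio multiplying Lambda*ln(3)/8 in Eq. (134)."""
    s2 = sum(n * z ** 2 for n, z in zip(densities, valences))
    s4 = sum(n * z ** 4 for n, z in zip(densities, valences))
    pair = 0.0
    for k in range(len(densities)):
        for l in range(k + 1, len(densities)):  # each pair counted once (k < l)
            pair += (densities[k] * densities[l] * valences[k] ** 2
                     * valences[l] ** 2 * (valences[k] - valences[l]) ** 2)
    return (s4 * s2 - pair) / s2 ** 2


def decay_ratio(molar: float, valences, stoich) -> float:
    """B / k_D to first order in Lambda, assuming Lambda = k_D * l_B."""
    n_salt = molar * 1.0e3 * N_AVOGADRO
    densities = [nu * n_salt for nu in stoich]
    l_b = bjerrum_length()
    k_d = math.sqrt(4.0 * math.pi * l_b
                    * sum(n * z ** 2 for n, z in zip(densities, valences)))
    coupling = k_d * l_b
    return 1.0 + coupling * math.log(3.0) / 8.0 * valence_factor(densities, valences)


if __name__ == "__main__":
    print("1:1 salt, 0.1 M:", decay_ratio(0.1, (1, -1), (1, 1)))
    print("2:1 salt, 0.1 M:", decay_ratio(0.1, (2, -1), (1, 2)))
```

Under these assumptions the correction vanishes for the 1:1 salt, while for a 2:1 salt at 0.1 M the decay constant increases by a little under 18%, of the same order as the deviation quoted above. With the pair sum restricted to $k < l$, the valence factor equals $(\sum_k n_k z_k^3)^2/(\sum_k n_k z_k^2)^2$, the form appearing in Eq. (135).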
However, the range of validity of the above equations is still limited to the vanishing concentration regime.

5.2. Self-consistent screening length

We have just pointed out the limitations of the asymptotic expansions, inherent to the diagrammatic expansions on which they are based. One way to obtain an expression for the screening length valid at finite concentration is to impose the Stillinger–Lovett moment conditions on the pair correlation functions. As we have seen, these moment conditions are demanded by the OZ equation, and the two most important moments are the zeroth moment or electroneutrality condition,
\[ q_i = -\sum_k n_k q_k \int h_{ik}(r)\, d\mathbf{r}, \]
and the second moment or Stillinger–Lovett condition,
\[ 1 = -\frac{4\pi\beta}{6\varepsilon} \sum_{k,m} n_k n_m q_k q_m \int h_{km}(r)\, r^2\, d\mathbf{r}. \]
We have pointed out in Section 2.2.2 that the LDH theory will only obey the two moment conditions if we have point-sized ions. Besides, the classical electrolyte formalism is only expected to hold for low charge, low density electrolytes, due to the neglect of short range correlations implicit in the assumption $c^0_{ij}(r) = 0$, from which the classical DH formalism follows. However, the exponential form of the pair correlations must be preserved, since $h_{ij}(r)$ must be at least exponentially decaying, as we demonstrated in Section 2.2.1. Thus, one has to include the effects of the moment condition constraints in the screening length. This is the conceptual basis of the analytic result of Attard [59], whose arguments are summarized below. Attard made the assumption that the countercharge profile has the DH form and obtained an approximation that obeys the two moment conditions, together with a bound for the onset of the monotonic–oscillatory transition in the bulk solution. The countercharge profile was assumed to be purely exponential beyond the hard core:
\[ \rho_j(r) = a_j\, \frac{e^{-\kappa r}}{r}, \tag{136} \]
where both $a_j$ and $\kappa$ are treated as adjustable parameters at this level. Under these circumstances, the second moment condition for the RPM electrolyte becomes
\[ 1 = -\frac{(4\pi)^2\beta}{6\varepsilon} \sum_k n_k q_k a_k \int_\sigma^\infty e^{-\kappa r}\, r^3\, dr = \left(\frac{k_D}{\kappa}\right)^2 \frac{1 + \kappa\sigma + (\kappa\sigma)^2/2 + (\kappa\sigma)^3/6}{1 + \kappa\sigma}. \tag{137} \]
The assumption of purely exponential profiles of the correlation functions is expected to be valid for highly dilute solutions only, so, expanding the above equation to second order in $\kappa\sigma$, one obtains
\[ \kappa^2 = k_D^2\, [1 + (\kappa\sigma)^2/2 + O(\kappa\sigma)^3]. \tag{138} \]
The solution of the above equation is given by
\[ \kappa = \frac{k_D}{\sqrt{1 - (k_D\sigma)^2/2}}, \tag{139} \]
an expression which recovers the usual DH length in the limit of infinite dilution. This result gives the actual decay length of the electrolyte in terms of the concentration of ions (implicit in the Debye constant) and the ionic diameter, and can be considered as a linear Padé approximant to the actual decay constant [59]. The limit where oscillations commence can be obtained as the zero of the denominator in Eq. (139), because for higher concentrations the screening constant becomes complex. Therefore, the self-consistent result predicts a crossover from monotonic, charge dominated decay to damped oscillatory, charge dominated decay of the correlation functions when
\[ k_D\sigma = \sqrt{2}, \tag{140} \]
a result close to the GMSA value ($k_D\sigma = 1.228$) [153] and to actual HNC calculations ($k_D\sigma = 1.3$) [60]. In the oscillatory regime, the effective decay constant of the electrolyte may be written as $\kappa = \kappa_r + i\kappa_i$, so the total correlation function is given by
\[ h_{ij}(r) = A \cos[\kappa_i (r - \delta)]\, \frac{e^{-\kappa_r r}}{r}, \tag{141} \]
where $A$ is the amplitude and $\delta$ is the phase. The period of the oscillations, $\lambda = 2\pi/\kappa_i$, is controlled by the imaginary part of the decay constant, and the asymptotic decay of the pair correlations is governed by the corresponding real part.
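As a quick numerical illustration of Eqs. (139) and (140), the following Python sketch evaluates the self-consistent decay constant and flags the onset of oscillations. The function name `attard_decay_constant` and the convention of returning `None` beyond the transition are assumptions made here for illustration; they are not taken from Ref. [59].

```python
import math

# Transition value of k_D * sigma from Eq. (140).
TRANSITION = math.sqrt(2.0)


def attard_decay_constant(k_d, sigma):
    """Real decay constant from Eq. (139): k_D / sqrt(1 - (k_D*sigma)^2 / 2).

    k_d   : Debye constant (inverse length)
    sigma : ionic diameter (same length unit)
    Returns None when k_D*sigma >= sqrt(2), where Eq. (139) no longer
    yields a real decay constant (illustrative convention).
    """
    y = k_d * sigma
    denom = 1.0 - 0.5 * y * y
    if denom <= 0.0:
        return None
    return k_d / math.sqrt(denom)


if __name__ == "__main__":
    for y in (0.0, 0.5, 1.0, 1.4, 1.5):
        print(y, attard_decay_constant(1.0, y))
```

With $k_D = 1$ the function returns the ratio $\kappa/k_D$, which grows from 1 at infinite dilution and has no real value once $k_D\sigma$ exceeds $\sqrt{2}$.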
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111
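The next order solution to Eq. (137) is

\[ \kappa = \frac{k_D}{\sqrt{1 - (k_D\sigma)^2/2 + (k_D\sigma)^3/6}}, \tag{142} \]

a result that has the same functional form as Eq. (139). Applying the zeroth moment, or electroneutrality, condition leads to a countercharge profile of the form

\[ \rho_j(r) = -\frac{q_j\,\kappa^2\, e^{\kappa\sigma}}{4\pi(1 + \kappa\sigma)}\,\frac{e^{-\kappa r}}{r} \tag{143} \]

and to a total correlation function

\[ h_{ij}(r) = -\frac{\beta q_i q_j}{4\pi\epsilon}\,\frac{\kappa^2}{k_D^2}\,\frac{e^{\kappa\sigma}}{1 + \kappa\sigma}\,\frac{e^{-\kappa r}}{r}, \qquad r > \sigma. \tag{144} \]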
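The electroneutrality condition behind the countercharge profile can be checked numerically. The sketch below assumes the profile $\rho_j(r) = -q_j\kappa^2 e^{\kappa\sigma}e^{-\kappa r}/[4\pi(1+\kappa\sigma)r]$ for $r > \sigma$ and zero inside the core, and integrates it with a plain trapezoidal rule. The function names, the cut-off of the integral and the number of grid points are illustrative choices, not part of the original treatment.

```python
import math


def countercharge_profile(r, q_j, kappa, sigma):
    """Screened-exponential countercharge density outside the hard core."""
    if r < sigma:
        return 0.0  # no countercharge inside the core (RPM, assumed)
    amp = -q_j * kappa ** 2 * math.exp(kappa * sigma)
    amp /= 4.0 * math.pi * (1.0 + kappa * sigma)
    return amp * math.exp(-kappa * r) / r


def total_countercharge(q_j, kappa, sigma, tail=40.0, n=20000):
    """Trapezoidal estimate of 4*pi * int_sigma^inf rho_j(r) r^2 dr.

    The upper limit is truncated at sigma + tail/kappa, where the
    integrand is negligible (of order exp(-tail)).
    """
    r_max = sigma + tail / kappa
    h = (r_max - sigma) / n
    total = 0.0
    for i in range(n + 1):
        r = sigma + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * countercharge_profile(r, q_j, kappa, sigma) * r * r
    return 4.0 * math.pi * total * h


if __name__ == "__main__":
    # The cloud should carry charge -q_j, whatever kappa and sigma are.
    print(total_countercharge(1.0, 1.2, 1.0))
```

The integrated charge equals $-q_j$ to within the quadrature error for any choice of $\kappa$ and $\sigma$, which is what the zeroth moment condition demands of the amplitude.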
These results clearly resemble those of the modified DH theory in Section 2.2.2, with the Debye constant replaced by an effective decay constant. It must be noted that imposing the moment conditions does not introduce effective charges into the equations. The appearance of these quantities in the formalism of electrolyte solutions is demanded by FT considerations, as we have seen. It is in the framework of the dressed-ion theory (DIT) that both parameters are renormalized, as we shall see in the following, where we shall also consider the so-called DIT route to the effective parameters.

5.3. Numerical results

The analytical calculation of the effective decay length of an ionic system is restricted to a few theoretical frameworks, including the MSA, asymptotic expansions, self-consistent theories and, as we shall see in the next section, the modified mean spherical approximation (MMSA). Normally, numerical techniques are needed to obtain actual values of this parameter. We shall briefly mention in this report two approaches of different levels of computational complexity. The first one was developed by Attard and is considerably easier to implement than the other, the hypernetted chain theory. The so-called non-linear DH approximation is based on the observation that the linearized approximations underlying the self-consistent theories allow negative values of the co-ion radial distribution function at high electrolyte couplings [59]. It is therefore preferable to impose the exponential form upon the potential of mean force instead of upon the pair correlation function:

\[ w_{ij}(r) = \frac{q_i q_j}{4\pi\epsilon\nu}\,\frac{e^{-\kappa r}}{r}, \qquad r > \sigma, \tag{145} \]

where $\nu$ and $\kappa$ are treated as purely adjustable parameters. The values of these quantities are determined by imposing the two moment conditions upon the total correlation function $h_{ij}(r) = -1 + \exp[-\beta w_{ij}(r)]$. It is straightforward to solve the equations for the parameters using the Newton–Raphson method [59,60]. Doing so, one can see that the non-linear numerical approach has the same functional form as the analytic versions, with slightly different values of the parameters, while
remaining physically realistic at small separations. One must realize that this method provides only the effective decay length and that the effective charges must be calculated using other analytical or numerical approaches. The other numerical technique used for calculating the effective decay length is the HNC approximation. Detailed numerical calculations of the asymptotics of the pair correlations in PM symmetric binary electrolytes have been carried out by different authors in this approximation. Ennis et al. [181] used the HNC to investigate numerically the pole structure of the decay of the pair correlation functions of 1:1 and 2:2 RPM electrolytes, and analyzed the results using the DIT concept of dressed particles. Carvalho and Evans [153] studied the charge-symmetric, size-asymmetric electrolyte solution by means of the HNC and compared the results with the predictions of the GMSA. In the same study, they also performed a detailed analysis of the phase equilibria of these systems. Attard [59] also used HNC calculations to analyze the asymptotic behaviour and the thermodynamic properties of various electrolyte solutions, in both the monotonic and oscillatory regimes. On the other hand, Kjellander and Ulander [154,182] focused on size- and charge-asymmetric electrolyte solutions, while McBride et al. [60] systematically performed calculations for charge-asymmetric electrolytes of valence type 1:z (z = 1, ..., 4) up to concentrations of 1 M. These works mainly used the HNC scheme, an approximation that, for all practical purposes, is regarded as exact for Coulomb fluids at not too high couplings. In its first stages, the method for determining the effective electrolyte parameters by means of the HNC approximation is the same as the conventional determination of the pair correlation functions for any other fluid problem.
The OZ equation for the electrolyte must be solved together with the hypernetted chain closure $b_{ij}(r) = 0$ by, for example, the standard fast Fourier transform approach. One starts with an initial guess for the direct correlation function, solves the OZ equation in Fourier space to obtain the total correlation function, inverts the Fourier transform numerically back to real space, and then uses the HNC closure to obtain a new direct correlation function. The procedure is iterated until the thermodynamic properties converge to sufficient accuracy. Since we are interested in the asymptotic behaviour of the correlation functions, high accuracy is required. Ennis et al. used grids with 16 384 points [181] and a spacing $\Delta r$ such that $\sigma/\Delta r$ lay in the range 100–400, and they estimated the numerical noise by varying the grid size and spacing. Their errors in the tails of $c(r)$ are of the order of $10^{-10}$ [183]. Attard used $2^{14}$ grid points for the Fourier transform at a spacing of 0.005 Å [59]. Once the correlation functions, and consequently their short range parts, have been calculated, the second stage of the implementation is the calculation of the effective parameters. This is done by numerically evaluating the integrals which define these quantities and which will be introduced in the next sections. The DIT framework has also been used by Ulander and Kjellander [183] in a systematic, self-consistent manner to obtain the long-range decay of the pair distributions from Monte Carlo (MC) simulations with a fairly small number of particles. These authors performed extensive simulations for 1:1, 2:2 and 1:2 electrolyte solutions both in a cubic cell with periodic boundary conditions and in a spherical cell (so they could also analyze the impact of cubic periodic boundary conditions on the accuracy of the pair correlations).
They used the standard Metropolis algorithm in the canonical ensemble [184] to obtain the long-range behaviour of the pair correlations from more short-ranged functions and quite small simulations, and they verified the results with much larger calculations. They used the standard Ewald summation and the minimum image (MI) convention for the summation of the electrostatic potential of the periodic lattice generated by translations $\pm mL$ in the $x$, $y$, $z$ directions, where $L$ is the size of the lattice and $m = 1, 2, 3, \ldots$ MI corresponds to the Ewald summation procedure with an effective potential instead of the conventional electrostatic potential. The pair correlations calculated by the Ewald summation are almost identical to those of MI, as the two procedures differ only by a shift in the long range part of the potential. Ulander and Kjellander also performed numerical calculations of the effective parameters of the bulk fluid using the standard HNC procedure. For low coupling systems (1:1 electrolytes), these authors found no significant differences between the MC and HNC values of the effective decay constant and effective dielectric constant of the fluid (see Section 6). However, for high valency systems (high coupling regime) the situation [176] is somewhat different, as the HNC exhibits some well known deficiencies below 0.3 M. In this low concentration regime, the HNC underestimates the value of $g_{+-}(r)$ at contact and predicts a spurious peak between like charges. These artifacts of the HNC affect the long range behaviour of the pair correlations and, therefore, it is not surprising that the HNC and MC results show quantitative differences for these high coupling systems. For 1:2 asymmetric electrolytes, Ulander and Kjellander reported very good agreement between MC and HNC calculations, except at concentrations very close to the oscillatory transition ($k_D\sigma = 0.76$ for the HNC data), and they found that with increasing concentration the effective charge increases for divalent ions and decreases for monovalent ions. Further details about the simulation procedures for the calculation of effective quantities and the numerical technique with a small box size can be found in Ref. [183].

6. Dressed ion theory

In the above sections we have pointed out evidence of failures in the PB mean-field formalism, together with its somewhat paradoxical applicability in experimental situations where it should be completely inaccurate.
This fact, together with the desire to preserve the mean-field formalism, led to the formulation of the so-called dressed ion theory (DIT) by Kjellander and Mitchell [5]. This theoretical framework constitutes the systematic mean-field reformulation of the statistical theory of ionic solutions, and it places the previous developments in the field in an adequate context. The main point developed by Kjellander and Mitchell is the demonstration that the exact statistical mechanical theory of ionic fluids can be cast in a mean-field formalism using effective values of the system parameters. This means that the whole theory of the PM electrolyte can be reformulated to include n-body correlations and non-linear effects in effective quantities while preserving the analytical transparency of mean-field theory. For the formulation of the DIT we shall employ the usual primitive model, so that both ions and colloidal particles are treated as charged hard spheres immersed in a solvent continuum of dielectric permittivity $\epsilon$. The calculation of the pair correlations is based on the solution of the conventional OZ equation for a multicomponent system, whose Fourier transform is given by Eq. (23):

\[ \hat h_{ij}(k) = \hat c_{ij}(k) + \sum_l n_l\,\hat c_{il}(k)\,\hat h_{lj}(k). \]

In compact matrix notation we can rewrite the above equation as

\[ \hat H = \hat C + \hat C N \hat H, \tag{146} \]

where $N = \{\delta_{ij} n_i\}$ is a diagonal matrix whose components are the number densities of the ionic species, and $\delta_{ij}$ is Kronecker's delta. The total charge density in the neighbourhood of ion $j$ is made up of the contributions of the central particle itself, $\rho^c_j(r)$, and of its ionic cloud:

\[ \rho_j(r) = \rho^c_j(r) + \sum_{k=1}^{s} q_k n_k\, g_{jk}(r). \tag{147} \]
It is important to note that the above result is valid for both electrolytes and colloidal particles, the only difference being the charge density of the central particle. For point ions, $\rho^c_j(r)$ is given in terms of Dirac's delta function $\delta^{(3)}(r)$ as $\rho^c_j(r) = q_j\delta^{(3)}(r)$ or, in Fourier space, $\hat\rho^c_j(k) = q_j$. Using the electroneutrality condition and the definition of the total correlation function, we can write Eq. (147) as

\[ \hat\rho^T = q^T + q^T N \hat H, \tag{148} \]

where $q = \{q_j\}$ is a column vector whose components are the charges of the different ionic species present in the medium, and the superscript $T$ denotes the usual matrix transpose. The generalization of the above equations to ions of finite size is straightforward. The average potential created by a charge distribution which interacts by means of a Coulomb pair potential $\phi(r) = 1/(4\pi\epsilon r)$ is the convolution of the charge density and the pair potential itself, as a consequence of the superposition principle:

\[ \bar\psi_j(r) = \int \phi(|r - r'|)\,\rho_j(r')\,d^3r', \tag{149} \]

where the integral extends over all the charge density. Substituting the charge density in the neighbourhood of the central ion $j$ from Eq. (148), we obtain

\[ \hat\psi_j(k) = \hat\phi(k)\,\hat\rho_j(k) = \hat\phi(k)\left[q_j + \sum_{l=1}^{s} q_l n_l\,\hat h_{jl}(k)\right] \tag{150} \]

or, in matrix notation,

\[ \hat\psi^T = \hat\phi\,(q^T + q^T N \hat H). \tag{151} \]
It is precisely at this stage that the fundamental hypothesis of the DIT is introduced. It is an assumption about the distribution functions of the medium: that it is possible to resolve the inner and the outer parts of the ionic atmosphere as two different structures, in marked contrast with the absence of any structure in the diffuse DH atmosphere. This distinction leads to the splitting of the pair correlation functions into short range and long range parts, and it allows, as we shall see, the inclusion of all the effects of ion correlations in the short range part of the distribution functions. The mathematical formulation of this hypothesis can be given in terms of either the direct or the total correlation functions, but we shall use $c_{ij}(r)$, following the original derivation of Kjellander and Mitchell [5,55]. The splitting of the direct correlation function into short range and long range parts is written as

\[ c_{ij}(r) = c^0_{ij}(r) + c^l_{ij}(r), \tag{152} \]

where the superscript $l$ denotes a long range quantity. This splitting of the direct correlation function is totally arbitrary, except for $c^0_{ij}(r)$ being of shorter range than $c^l_{ij}(r)$. The idea of splitting the direct correlation function into two parts of different spatial range had been used previously by Stell et al. in the hypervertex formalism [19]. The hypotheses of the classical DH formalism were restrictive about both the structure and the interactions of the system, as this formalism considers only highly dilute media and Coulomb interionic interactions. The DIT, however, includes any type of ionic correlation in the short range part of the direct correlation function. As the DIT does not make any hypothesis on the particular form of the ionic correlations, it is valid for any type of ionic interaction and, thus, formally valid at any concentration. The main result of the DIT is that, using this hypothesis, it reformulates the exact statistical theory of charged fluids as a formally exact mean-field-like theory. Moreover, in contrast to the classical DH theory, this mean-field formalism is capable of accounting for all the correlations of the system. The long range part of the direct correlation function is taken to be that of a low density Coulomb system:

\[ c^l_{ij}(r) = -\beta q_i q_j\,\phi(r). \tag{153} \]
The total interionic interaction is given by $\phi_{ij}(r) = \phi^0_{ij}(r) + q_i q_j\phi(r)$, where $\phi^0_{ij}(r)$ is assumed to be of shorter range than the electrostatic interaction. For the primitive model electrolyte, $\phi^0_{ij}(r)$ is the hard sphere potential. Fourier transforming Eq. (153) and combining it with Eqs. (151) and (152) one gets

\[ \hat H = \hat C^0 + \hat C^0 N \hat H - \beta q\,\hat\psi^T, \tag{154} \]

whose solution is given by

\[ \hat H = (1 - \hat C^0 N)^{-1}\hat C^0 - \beta(1 - \hat C^0 N)^{-1} q\,\hat\psi^T. \tag{155} \]

Defining the short range charge density as $\hat\rho^0 = (1 - \hat C^0 N)^{-1} q$ and the short range total correlation function as $\hat H^0 = (1 - \hat C^0 N)^{-1}\hat C^0$, the above equation can be re-expressed as

\[ \hat H = \hat H^0 - \beta\hat\rho^0\,\hat\psi^T. \tag{156} \]

$\hat H^0$ contains the effects of the short range ionic correlations exactly as $\hat C^0$ does. The physical meaning of the above equation is clarified if we express it in real space:

\[ h_{ij}(r) = h^0_{ij}(r) - \beta\int d^3r'\,\rho^0_i(|r - r'|)\,\bar\psi_j(r'). \tag{157} \]

As is evident in the above result, the total correlation function is split into two parts by virtue of the DIT structural hypothesis. The short range part of the total correlation function contains, as we mentioned before, the short range ionic correlations. The long range part, on the other hand, is represented by the convolution of the average electrostatic potential and the short range charge density. In particular, $h^l_{ij}(r)$ depends linearly on the average potential and on the short range charge densities $\rho^0_j(r)$. This means that these quantities act as the actual sources of the interaction. Combining the definitions of the short range part of the total correlation function and the short range charge density we can write

\[ \hat\rho^0 = q + \hat H^0 N q \tag{158} \]

or, in real space,

\[ \rho^0_j(r) = \rho^c_j(r) + \sum_{k=1}^{s} q_k n_k\, h^0_{jk}(r). \tag{159} \]
This equation must be compared with Eq. (12) for the DH charge density. From the above result one can deduce that the sources of charge of the long range correlations are made up of the central particle $j$ and a charge distribution surrounding it. This result is the basis of the interpretation of $\rho^0_j(r)$ as the charge density of a dressed particle, which acts as the source of the interaction. As well as the central particle charge, $\rho^0_j(r)$ includes the charge in the inner part of the ionic atmosphere of the central ion through the short range part of the total correlation function. The second term on the rhs of Eq. (159) is the "dress" of the central particle. At this point, it is interesting to compare the above results of the exact theory with those of DH theory. Using the OZ equation, the total correlation function in the DH approximation given by Eq. (21) can be expressed as [103]

\[ h^{(DH)}_{ij}(r) = -\beta q_i\,\bar\psi_j(r) = -\beta\int d^3r'\,\rho^c_i(|r - r'|)\,\bar\psi_j(r'). \tag{160} \]

If we compare the above equation with its DIT equivalent, Eq. (157), we can see that the charge distribution of the bare central particle, $\rho^c_i(r)$, has been replaced by the charge density of the dressed particle, $\rho^0_i(r)$. Otherwise, the analogy between the long range parts of the total correlation function in the two formalisms is evident. Changing from DH theory to DIT amounts to replacing the usual charges of the bare particles by effective or renormalized charges, those of the so-called dressed particles, which comprise all the effects of correlations. The meaning of the renormalized quantities is more easily understood by writing Poisson's equation for the average electrostatic potential created by a particle in the bulk solution. The definition of this quantity in Eq. (151) implies that $\bar\psi(r)$ satisfies Poisson's equation for a Coulomb system, Eq. (10). Using Eqs. (157) and (159), the DIT Poisson equation reads

\[ -\epsilon\nabla^2\bar\psi_j(r) = \rho_j(r) = \rho^0_j(r) - \int dr'\,\alpha(|r - r'|)\,\bar\psi_j(r'), \tag{161} \]

where we have introduced the function

\[ \alpha(r) = \beta\sum_k q_k n_k\,\rho^0_k(r). \tag{162} \]

Eq. (161) is the exact version of the LPB equation. The first term on the right-hand side of this equation is the dressed field source, and includes both the internal charge of the original constituents of the system and the inner part of their ionic cloud. The second term on the right-hand side of Eq. (161) may be interpreted as a polarization response of the medium to the average potential [5]. Thus, the function $-\alpha(r)$ provides the link between the polarization response and the variations in the average potential. In DH theory (Eq. (160)) the sources of the interaction are the bare ionic point charges and the response is linear and local. The response in the exact theory is still linear, but its local character is lost due to the diffuse nature of the short range charge densities. This fact is proved explicitly by calculating the linear response function [55]:

\[ \chi(r, r') = -\beta^{-1}\frac{\delta\rho(r')}{\delta\bar\psi(r)} = -\beta^{-1}\alpha(r) \tag{163} \]
that expresses the response of the charge density of the bulk fluid to variations in the average electrostatic potential. This linear response function determines the static structure of the bulk system, as we shall see below, and is therefore the basic structural quantity. The expression introduced in Eq. (162) is closely analogous to the usual expression for the Debye parameter,

\[ k_D^2 = \frac{\beta}{\epsilon}\sum_k n_k q_k^2, \tag{164} \]

so $\alpha(r)$ plays the same role in the DIT as the Debye parameter in the classical DH theory. Screening is thus controlled in the DIT by a function depending upon concentration and position instead of a concentration dependent constant. In the following we shall see that this function is actually the crucial quantity of the DIT. We shall come back to this point later on. Before continuing with the explanation of the DIT, one must be aware that the DIT expressions must recover the universally valid DH theory at vanishing concentrations. In fact, taking the limit $n_k \to 0$, we see from Eq. (159) that $\rho^0_j(r) \to \rho^c_j(r)$ or, for point ions, $\rho^0_j(r) \to q_j\delta^{(3)}(r)$. Therefore, the exact theory approaches the DH theory in this limit. Equivalently, $\alpha(r) \to \epsilon k_D^2\,\delta^{(3)}(r)$. The potential and distribution functions in the bulk solution can easily be expressed in terms of $\hat h^0_{ij}$ and $\hat\rho^0_j$. Fourier transforming Eq. (161) one gets

\[ \hat\psi^{av}_j(k) = \frac{\hat\rho^0_j(k)}{\epsilon k^2 + \hat\alpha(k)} \tag{165} \]

for the average potential acting on ion $j$. Using Poisson's equation, the total charge density around an ion $j$ can be calculated from the above result, and one finds

\[ \hat\rho_j(k) = \frac{\epsilon k^2\,\hat\rho^0_j(k)}{\epsilon k^2 + \hat\alpha(k)}. \tag{166} \]

Finally, the Fourier transform of the total pair correlation function between ions of species $i$ and $j$ is

\[ \hat h_{ij}(k) = \hat h^0_{ij}(k) - \frac{\beta\,\hat\rho^0_i(k)\,\hat\rho^0_j(k)}{\epsilon k^2 + \hat\alpha(k)}. \tag{167} \]
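The decomposition in Eq. (167) can be verified numerically at a single wavenumber. The sketch below, written for a binary system, solves the matrix OZ equation (146) directly with the full direct correlation matrix $\hat C = \hat C^0 - \beta\hat\phi\, q q^T$, and compares the result with the dressed-ion expression built from $\hat H^0$, $\hat\rho^0$ (Eq. (158)) and $\hat\alpha$ (Eq. (162)). The numerical values chosen for $\hat C^0$ are arbitrary placeholders, not data from any closure, and all function names are illustrative.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def solve_oz(c, n):
    """Solve H = C + C N H (Eq. (146)) as H = (1 - C N)^-1 C."""
    one_minus_cn = [[(1.0 if i == j else 0.0) - c[i][j] * n[j]
                     for j in range(2)] for i in range(2)]
    return mat_mul(mat_inv(one_minus_cn), c)


def dit_check(c0, n, q, beta, eps, k):
    """Return (H from the full OZ equation, H from Eq. (167))."""
    phi = 1.0 / (eps * k * k)  # Fourier transform of 1/(4*pi*eps*r)
    # Full direct correlation: short range part plus Coulomb tail, Eq. (153).
    c = [[c0[i][j] - beta * q[i] * q[j] * phi for j in range(2)]
         for i in range(2)]
    h_direct = solve_oz(c, n)
    # Dressed-ion route: short range correlations only.
    h0 = solve_oz(c0, n)
    rho0 = [q[i] + sum(h0[i][l] * n[l] * q[l] for l in range(2))
            for i in range(2)]                                  # Eq. (158)
    alpha = beta * sum(q[l] * n[l] * rho0[l] for l in range(2))  # Eq. (162)
    h_dit = [[h0[i][j] - beta * rho0[i] * rho0[j] / (eps * k * k + alpha)
              for j in range(2)] for i in range(2)]             # Eq. (167)
    return h_direct, h_dit


if __name__ == "__main__":
    c0 = [[-0.8, -0.3], [-0.3, -0.5]]  # placeholder short range matrix
    h_a, h_b = dit_check(c0, [0.1, 0.1], [1.0, -1.0], 1.0, 1.0, 0.7)
    print(h_a)
    print(h_b)
```

The two matrices agree to rounding error for any symmetric $\hat C^0$, which reflects the fact that Eq. (167) is an exact rearrangement of the OZ equation rather than an approximation.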
The asymptotics of the pair correlation functions are determined by the zeroes of the denominator of the second term on the rhs of the above equation. In fact, the leading asymptotic behaviour of the pair correlation function is determined by the pole of this term with the smallest imaginary part. This pole is given in general by $i\kappa$, where $i$ is the imaginary unit and $\kappa$ is a complex number. Therefore $\hat\alpha(k)$ defines the non-Debye screening length through the equation [5]

\[ \kappa^2 = \frac{\hat\alpha(i\kappa)}{\epsilon}, \tag{168} \]

which is crucial for the investigation of the asymptotic regime of the DIT equations. Thus, the analogy with the Debye parameter is complete. The asymptotic behaviour of the average potential, ionic charge density and total correlation function can be obtained by inverting the Fourier transforms in Eqs. (165)–(167). Thus, in the limit $r \to \infty$ for low concentration systems we have

\[ \psi^{av}_j(r) \sim \frac{q^*_j}{4\pi\epsilon^*}\,\frac{e^{-\kappa r}}{r}, \qquad \rho_j(r) \sim -\frac{\epsilon\kappa^2 q^*_j}{4\pi\epsilon^*}\,\frac{e^{-\kappa r}}{r}, \qquad h_{ij}(r) \sim -\frac{\beta q^*_i q^*_j}{4\pi\epsilon^*}\,\frac{e^{-\kappa r}}{r}, \tag{169} \]

where the asterisk denotes an effective (or dressed ion) quantity. The effective charge of ionic species $j$ and the effective permittivity of the system are given by

\[ q^*_j = \hat\rho^0_j(i\kappa) = \frac{4\pi}{\kappa}\int_0^\infty \rho^0_j(r)\,\sinh(\kappa r)\,r\,dr, \qquad \epsilon^* = \epsilon\nu = \frac{i\kappa}{2}\,\frac{d\hat\epsilon(i\kappa)}{dk} = \epsilon + \frac{1}{2i\kappa}\,\hat\alpha'(i\kappa). \tag{170} \]
One must note that these quantities recover the usual DH ones in the limit of infinite dilution ($n_i \to 0$). The renormalized charges that replace the bare ionic charges $q$ in the DIT are obtained from the short-range charge distribution as [5]:

\[ q^* = \hat\rho^0(i\kappa), \tag{171} \]

where $\kappa$ is determined by the leading singularity of the total correlation function. The parameter $q^*$ plays the role of an effective charge, and can be interpreted as such provided it is evaluated at a purely imaginary pole [55]. In fact, it is equal to the charge a point particle must have in order to yield the same interaction energy as the dressed particle when they are placed at equal $r$ values in the long range tail of the average potential [55]. On the other hand, $q^*$ is seen to depend only on the internal charge distribution of the dressed particle, i.e. the short-range charge distribution. Furthermore, combining Eqs. (162), (168) and (171) one gets the following expression for the effective decay constant:

\[ \hat\alpha(k) = \beta\sum_l q_l n_l\,\hat\rho^0_l(k) \quad\Leftrightarrow\quad \kappa^2 = \frac{\beta}{\epsilon}\sum_i n_i q_i q^*_i. \tag{172} \]

The similarity with the conventional expression of the Debye constant needs no further comment. This result confirms the role of the $\alpha$-function and the effective decay constant as the parameters that control the asymptotic behaviour of the pair correlations in the DIT. In the previous paragraphs we have calculated the decay of the pair correlations in an ionic fluid, the main result being Eq. (169) together with the predictions for the effective charges and screening length. However, a complete knowledge of the decay of pair correlations in ionic fluids demands the analysis of the asymptotic behaviour of the dressed ion correlation function, $h^0_{ij}(r)$. The leading asymptotic terms of the short range part of the total pair correlation function have been calculated by Kjellander and coworkers (see Ref. [183] and references therein) and are

\[ h^0_{ij}(r) \sim A\,d^0_i d^0_j\,\frac{e^{-br}}{r} + \chi^0_i\chi^0_j\,I^0(r)\,\frac{e^{-2\kappa r}}{r^2} + \cdots, \tag{173} \]
where the coefficients $A$, $d^0_m$ and $\chi^0_m$ are constants, and the function $I^0(r)$ behaves as

\[ I^0(r) \sim \int_0^\infty \frac{r\,e^{-tr}}{\ln^2 t + \pi^2}\,dt \tag{174} \]

and decays for large $r$ as $1/\ln^2(r)$ (see Eq. (B25) in Ref. [55]). The first term originates from a simple pole of $\hat h^0_{ij}(k)$ at $k = ib$, similar to the one in Eq. (168), with $b$ satisfying the equation

\[ \big(1 - n_+\hat c^0_{++}(ib)\big)\big(1 - n_-\hat c^0_{--}(ib)\big) - n_+ n_-\big(\hat c^0_{+-}(ib)\big)^2 = 0, \tag{175} \]
which can be solved by Newton–Raphson iteration. The second term on the rhs of Eq. (173) comes from a branch point singularity of $\hat c^0_{ij}(k)$ at $k = 2i\kappa$. Further details about the asymptotics of the short range part of the pair correlation function can be found in Refs. [55,183], particularly in their detailed appendices. The pole defined by Eq. (168) governs the leading asymptotic term of the total correlation function, but in sufficiently concentrated media other terms, due to other poles and branch point singularities of the Fourier transform of the short range part of the direct correlation function, become important for the interpretation of the asymptotic behaviour of the pair correlation functions. Kjellander and Ulander [154,181,183] made several theoretical and numerical studies of the screening of charge and number correlations in both symmetric and asymmetric electrolyte solutions. The asymptotic expression for the charge–charge correlation function, $h_{QQ}(r)$, of a dilute RPM electrolyte is given by

\[ h_{QQ}(r) = -\frac{\beta(q^*_Q)^2}{4\pi\epsilon^*}\,\frac{e^{-\kappa r}}{r} - \frac{\beta(q'^{*}_Q)^2}{4\pi\epsilon'^{*}}\,\frac{e^{-\kappa' r}}{r} + \chi^0_i\chi^0_j\,I^0(r)\,\frac{e^{-2\kappa r}}{r^2} + \cdots, \tag{176} \]

where $q^*_Q = (q^*_+ - q^*_-)/2$ and the primed quantities are evaluated according to Eq. (170) with the DIT functions evaluated at $i\kappa'$ instead of $i\kappa$. At high concentrations the poles $i\kappa$ and $i\kappa'$ merge and form a complex conjugate pair, a fact that marks the transition to the damped oscillatory regime of the ionic correlations. This point is known as the $\kappa$ transition point and corresponds to the transition first studied by Kirkwood [53]. The pole structure of an asymmetric electrolyte solution is qualitatively different from that of symmetric electrolytes, a fundamental difference being that the second pole of the charge–charge structure factor is bounded from above by $2\kappa$ in the monotonic asymptotic regime. Nevertheless, the equation that governs the decay of the pair correlations in asymmetric electrolyte solutions is formally equivalent to the one for symmetric electrolytes, although the third term in Eq. (176) is different in the asymmetric case [183]. It is clear from the above explanation of the DIT that $\alpha(r)$ is the main functional parameter of this theoretical framework, because its knowledge is of fundamental importance for obtaining concrete expressions for the leading renormalized quantities. As one can see in Eq. (163), the $\alpha$-function determines the linear response of the fluid to variations in the average electrostatic potential. By virtue of the fluctuation-dissipation theorem [98,103,185], this linear response function is related to the correlation functions of the fluid, i.e. to the static structure of the fluid. In fact, Varela et al. proved in Ref. [64] that the DIT $\alpha$-function is related to the static structure factor of the fluid. The microscopic structure of a system is given in terms of the static structure factor $S(k)$:

\[ S(k) = \frac{1}{N}\langle n(k)\,n(-k)\rangle, \tag{177} \]
where $n(k)$ are the Fourier components of the number density. In systems with charge and size polydispersity, the previous result is easily extended through the introduction of the partial structure factors [103]:

\[ S_{ij}(k) = \frac{1}{N}\langle n_i(k)\,n_j(-k)\rangle, \tag{178} \]

where $n_l(k)$ is the Fourier component of the number density of species $l$. The partial structure factors of a homogeneous system are related to the pair correlation functions through the equation [103]:

\[ S_{ij}(k) = x_i\delta_{ij} + x_i x_j\,n\,\hat h_{ij}(k), \tag{179} \]

where $x_l$ is the molar fraction of species $l$, $x_l = N_l/N$. Using the expression of the pair correlation functions in Eq. (167) we get

\[ S_{ij}(k) = x_i\delta_{ij} + x_i x_j\,n\left[\hat h^0_{ij}(k) - \frac{\beta\,\hat\rho^0_i(k)\,\hat\rho^0_j(k)}{\epsilon k^2 + \hat\alpha(k)}\right] \tag{180} \]

for the partial structure factors of species $i$ and $j$ in terms of the DIT quantities. To better understand the structure of a system with charge and size polydispersity it is convenient to introduce another set of structure factors besides the partial structure factors $S_{ij}(k)$. The latter represent the correlations of the density fluctuations of species $i$ and $j$. We can also consider the fluctuations of the total density and those of the charge density. The corresponding structure factors are the so-called Bhatia–Thornton structure factors [103,108,186]:

\[ S_{NN}(k) = \frac{1}{N}\langle n(k)\,n(-k)\rangle = \sum_l\sum_m S_{lm}(k), \tag{181} \]

\[ S_{NZ}(k) = \frac{1}{N}\langle n(k)\,\rho^Z(-k)\rangle = \frac{1}{e}\sum_l\sum_m q_m\,S_{lm}(k), \tag{182} \]

\[ S_{ZZ}(k) = \frac{1}{N}\langle \rho^Z(k)\,\rho^Z(-k)\rangle = \frac{1}{e^2}\sum_l\sum_m q_l q_m\,S_{lm}(k), \tag{183} \]

where $\rho^Z(k)$ is the Fourier component of the charge density fluctuations. The number–number structure factor $S_{NN}(k)$ is a measure of the linear response of the system to an external perturbation that couples to the number density irrespective of the charge of the components. Similarly, the charge–charge structure factor $S_{ZZ}(k)$ is a measure of the linear response to perturbations that couple to the charge density, and is especially important in the framework of the DIT for the calculation of the $\alpha$-function, as the latter is also a measure of the response of the charge density to variations of the average potential. If we interpret the average potential as an external perturbation, it will couple to the charge density, so $\hat\alpha(k)$ is expected to be related to the charge–charge structure factor. On the other hand, the charge–number structure factor $S_{NZ}(k)$ measures the correlations between the two types of fluctuations. The calculation of the above structure factors in the framework of the DIT is straightforward using Eq. (180) for the partial structure factors. To simplify the notation, we shall calculate them for a binary Coulomb system (electrolyte or colloid).
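The sums defining the Bhatia–Thornton structure factors in Eqs. (181)–(183) translate directly into code. The sketch below takes a matrix of partial structure factors at one wavenumber and the ionic charges, and returns the three combinations; the function name and the default $e = 1$ are illustrative choices. The example uses the uncorrelated limit $S_{ij} = x_i\delta_{ij}$ of Eq. (179), which is an assumption made here only to have a case with an obvious answer.

```python
def bhatia_thornton(s_partial, charges, e=1.0):
    """Number-number, number-charge and charge-charge structure factors.

    s_partial : square nested list, s_partial[l][m] = S_lm(k) at a fixed k
    charges   : list of ionic charges q_l
    e         : elementary charge in the units used for q_l
    """
    m = len(charges)
    s_nn = sum(s_partial[l][j] for l in range(m) for j in range(m))
    s_nz = sum(charges[j] * s_partial[l][j]
               for l in range(m) for j in range(m)) / e
    s_zz = sum(charges[l] * charges[j] * s_partial[l][j]
               for l in range(m) for j in range(m)) / (e * e)
    return s_nn, s_nz, s_zz


if __name__ == "__main__":
    # Uncorrelated 1:1 mixture: S_ij = x_i * delta_ij with x = (1/2, 1/2).
    ideal = [[0.5, 0.0], [0.0, 0.5]]
    print(bhatia_thornton(ideal, [1.0, -1.0]))
```

For the uncorrelated 1:1 mixture the number and charge fluctuations decouple ($S_{NZ} = 0$) and both $S_{NN}$ and $S_{ZZ}$ equal one.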
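Substituting in Eqs. (181)–(183) the expression of the DIT total correlation function in Eq. (167), one gets [64]

\[ n S_{NN}(k) = \sum_l \hat n^0_l(k) - \frac{\beta n^2\big(\sum_l x_l\,\hat\rho^0_l(k)\big)^2}{\epsilon k^2 + \hat\alpha(k)}, \]

\[ e S_{NZ}(k) = \sum_l x_l\,\hat\rho^0_l(k) - \frac{\beta n}{\epsilon k^2 + \hat\alpha(k)}\sum_l q_l x_l^2\big(\hat\rho^0_l(k)\big)^2 - \frac{\beta n\,(q_1 + q_2)\,x_1 x_2\,\hat\rho^0_1(k)\,\hat\rho^0_2(k)}{\epsilon k^2 + \hat\alpha(k)}, \tag{184} \]

and for the charge–charge static structure factor one gets the simple relation

\[ S_{ZZ}(k) = \frac{\epsilon k^2\,\hat\alpha(k)}{\beta n e^2\big(\epsilon k^2 + \hat\alpha(k)\big)}, \tag{185} \]

where we have introduced the short ranged number densities

\[ \hat n^0_i(k) = n_i + n_i\sum_l n_l\,\hat h^0_{li}(k) \tag{186} \]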
and used the expression of the $\alpha$-function given by Eq. (162). The results for the Bhatia–Thornton structure factors in DH theory are recovered in the usual limit of infinite dilution, where $\hat\alpha(k) \to \epsilon k_D^2$, as follows from Eq. (172). In this limit, $S_{NN}(k) = 1$, $S_{NZ}(k) = 0$ and the charge–charge structure factor is given by

\[ S_{ZZ}(k) = \frac{k^2}{k^2 + k_D^2}. \tag{187} \]
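The reduction of Eq. (185) to the DH form of Eq. (187) can be checked with a few lines of Python. In the sketch below, `s_zz` implements $S_{ZZ}(k) = \epsilon k^2\hat\alpha(k)/[\beta n e^2(\epsilon k^2 + \hat\alpha(k))]$ for an arbitrary user-supplied $\hat\alpha(k)$, and `debye_alpha` supplies the infinite-dilution limit $\hat\alpha(k) = \epsilon k_D^2$. Both function names and the reduced units ($e = 1$) are assumptions made for illustration, and the check is carried out for a 1:1 electrolyte, for which $\sum_l x_l q_l^2 = e^2$.

```python
def debye_alpha(beta, eps, densities, charges):
    """Infinite-dilution limit of the alpha-function: eps * k_D^2 for all k."""
    kd2 = beta / eps * sum(n * q * q for n, q in zip(densities, charges))
    return lambda k: eps * kd2


def s_zz(k, alpha_hat, beta, n_total, eps, e=1.0):
    """Charge-charge structure factor from the alpha-function, Eq. (185)."""
    a = alpha_hat(k)
    return eps * k * k * a / (beta * n_total * e * e * (eps * k * k + a))


if __name__ == "__main__":
    beta, eps = 2.0, 1.5
    densities, charges = [0.05, 0.05], [1.0, -1.0]
    alpha = debye_alpha(beta, eps, densities, charges)
    kd2 = beta / eps * 0.1
    for k in (0.1, 0.3, 1.0, 3.0):
        print(k, s_zz(k, alpha, beta, sum(densities), eps), k * k / (k * k + kd2))
```

Replacing `debye_alpha` by any other model of $\hat\alpha(k)$ gives the corresponding $S_{ZZ}(k)$ through the same function, which is the sense in which the $\alpha$-function fixes the charge structure of the fluid.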
As one can see in Eqs. (184) and (185), the $\alpha$-function determines the static structure of the fluid, either directly, as in the case of the charge–charge structure factor, or implicitly, as in the other two structure factors, through the short-range charges. These equations clarify the relationship between the linear response function and the static structure factor of the medium. The linear response of the system to an external field (alternatively, to variations in the Maxwell or average field) is therefore determined by the density correlation functions. This is a particular case of the fluctuation-dissipation theorem [103]. The close connection of the linear response function to the static structure of the fluid is shown more explicitly by the charge response function $\chi_{ZZ}(k)$ and the static, longitudinal dielectric function $\epsilon(k)$. The fundamental relationship between both functions is given by [103]

\[ \frac{1}{\epsilon(k)} = 1 + \frac{e^2}{\epsilon k^2}\,\chi_{ZZ}(k). \tag{188} \]

Using the charge–response version of the fluctuation-dissipation theorem [103],

\[ \chi_{ZZ}(k) = -\beta n\,S_{ZZ}(k), \tag{189} \]

and combining the two above equations with Eq. (185), we get for the dielectric function

\[ \epsilon(k) = 1 + \frac{\hat\alpha(k)}{\epsilon k^2}. \tag{190} \]

The above equation together with Eq. (185) provides the theoretical framework for the calculation of the linear response function. Both the static structure factor and the dielectric response function of
62
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111
an electrolyte or colloid system may be obtained either from theory (using whatever equilibrium approximation) or experiment, therefore providing expressions for (k) ˆ and, consequently, the key to DIT fundamental renormalized parameters. This statistical procedure for calculating the e2ective quantities has been called the DIT route [64]. The relationship between the linear response function and the static structure of the :uid will be more clearly stated in the following when we study the one component charged spheres case (OCCS) in the next section. We can go no further in the calculus of the renormalized quantities of the ionic system without an explicit form of the linear response function (k) ˆ and the calculus of the latter will be the main aim in the following sections. We shall perform the calculation of the linear response function of the OCCS. This model is a generalization of the one-component plasma (OCP) for nite size ions and is the simplest representation of an ionic :uid containing oppositely charged species, and it consists of replacing one of the species by a uniformly smeared-out, structureless background that ensures electroneutrality [187–190]. Despite the non-physical features of the model, as for example proportionality between mass and charge :uctuations [103], the OCP plays a conceptual role as the prototype ionic :uid. 6.1. Modi2ed MSA approach As we have previously seen, all the information about the renormalized quantities is contained in the DIT response function. Thus, the evaluation of this magnitude becomes the main task of any theoretical approach to the problem. Knowing it, one can obtain expressions for the e2ective charges and decay length, and compare them to actual numerical results. The DIT linear response functions have been calculated by Varela et al. [64] in the RPA approximation. 
Despite its approximate character, this approximation clearly accounts for the DIT basic feature of the splitting of the distribution functions into two well-de ned parts and can be considered as a rst approximation to the problem. A perturbative framework seems to be accurate to take this fact into account because of the characteristic perturbative splitting of the intermolecular potential into a harsh, short-range repulsion and a smoothly varying long-range attraction. If the reference system is an ideal gas and the long range perturbation is the Coulomb potential, the static structure factor can be expressed as [103] S0 (k) S(k) = ; (191) ˆ 1 + 2S0 (k)6(k) ˆ is the Fourier transform of where S0 (k) is the static structure factor of a reference system and 6(k) the long-range perturbation of the potential. The above equation is equivalent to approximating the true direct correlation function by: c(r) c0 (r) − 6(r) :
(192)
The perturbation must be weak enough to ensure that $S(k)$ is positive and to prevent the RPA catastrophe [103]. For an OCCS, $S_{ZZ}(k) = z^2 S(k)$, as is easily proven by means of Eqs. (181)–(183). This expresses the proportionality of mass and charge fluctuations in the OCCS model of ionic fluids. Comparing the RPA static structure factor calculated for the Coulomb interaction ($\hat{\phi}(k) = q^2/\varepsilon k^2$) to the DIT static charge–charge structure factor in Eq. (185), and using this proportionality of the number and charge perturbations of the OCCS model, we get
\begin{equation}
\hat{\alpha}(k) = \varepsilon k_D^2 S_0(k).
\tag{193}
\end{equation}
As can be seen in the above equation, the DIT linear response function calculated in the RPA is directly proportional to the static structure factor of the RPA reference system. This fact is due to the particular form of the Coulomb perturbation and represents a further confirmation of the previously mentioned fluctuation-dissipation theorem, which relates the linear response of the fluid to its structural features. For an ionic system, the linear response function $\hat{\alpha}(k)$ in the RPA is completely determined by the structural features of the reference system. Using as a reference system a fluid of hard spheres of radius $\sigma$ one gets
\begin{equation}
\hat{\alpha}(k) = \varepsilon k_D^2 \left[ 1 + \frac{4\pi n}{k^2} \left( \sigma \cos(k\sigma) - \frac{\sin(k\sigma)}{k} \right) \right],
\tag{194}
\end{equation}
\begin{equation}
\varepsilon(k) = 1 + \frac{k_D^2}{k^2} \left[ 1 + \frac{4\pi n}{k^2} \left( \sigma \cos(k\sigma) - \frac{\sin(k\sigma)}{k} \right) \right].
\tag{195}
\end{equation}
In both cases DH results are recovered in the limit of vanishing $k$ or in the limit of point ions ($\sigma \to 0$). This is an expected result, because in both limits the ideal gas reference system is recovered and we are back in the crude DH approximation. Two different regimes are readily seen in the above equation: one for vanishing wavenumber, where the linear response function is a screened oscillatory function that depends on the wavenumber like $\sin(k\sigma)/k^3$, and the limit $k \to \infty$, where the linear response function behaves as $\cos(k\sigma)/k^2$. In the limit of vanishing wavenumber, $k \to 0$, we can expand the cosine and sine functions up to fourth order in $k$ and obtain
\begin{equation}
\hat{\alpha}(k) = \varepsilon k_D^2 \left[ (1 - \phi) + \frac{1}{8} \phi \sigma^2 k^2 \right],
\tag{196}
\end{equation}
where $\phi$ is the volume fraction of electrolyte, $\phi = \pi n \sigma^3/6$. Thus, the linear response function is completely determined by the only interaction parameter ($\sigma$ or, equivalently, $\phi$) in this approximation. At zero wavevector the linear response function is given in the RPA for a reference system of hard spheres by $\hat{\alpha}(k) = \varepsilon k_D^2 (1 - \phi)$, which represents the short-range interaction correction to the DH result. Unfortunately, the RPA, although formally valuable, predicts an effective decay length which tends to Debye's from below, in contrast to simulation predictions, and a more general model of the electrolyte solution is needed. As we saw previously (see Section 3.1), a convenient starting point for the discussion of hard-sphere ionic systems is the mean-spherical approximation (MSA), either in its classical form or in its generalized version (GMSA). However, the evaluation of the DIT response function in either version of this approximation is not expected to lead to accurate predictions of the effective decay length, because at low concentrations both approximations underestimate the effective decay constant, as we have previously mentioned. To overcome these difficulties, Varela et al. [64] reported a modified version of the MSA that softens the hard-core repulsion, approximating the direct correlation function inside the core by the total correlation function. This is equivalent to ignoring correlations of three or more particles inside the hard core and is therefore expected to describe accurately the low to moderate particle density regime [108]. This version has the advantage over the classical MSA of being easily analytically tractable while retaining all the essential physical features of the system of hard spheres.
The modified MSA (MMSA) direct correlation function for equisized hard spheres is [64]
\begin{equation}
c_{ij}(r) = \begin{cases} -1, & r < \sigma, \\ -\dfrac{q_i q_j}{4\pi\varepsilon r}, & r > \sigma, \end{cases}
\tag{197}
\end{equation}
where we can choose the direct correlation function to take the value $-1$ inside the hard core, as the potential diverges inside this region [103]. As we can see, the MMSA is an approximation at the level of the direct correlation function that, though plausible, can at this level only be justified by its results. Fourier transforming the direct correlation function in the MSA approximation we get
\begin{equation}
\hat{c}(k) = -q^2 \frac{\cos(k\sigma)}{\varepsilon k^2} + \frac{4\pi}{k^2} \left( \sigma \cos(k\sigma) - \frac{\sin(k\sigma)}{k} \right)
\tag{198}
\end{equation}
for the Fourier transform of the direct correlation function of an OCCS. Combining the expression for the static structure factor [103]
\begin{equation}
S(k) = \frac{1}{1 - n \hat{c}(k)}
\tag{199}
\end{equation}
with the MSA expression for the Fourier transform of the direct correlation function, one arrives at
\begin{equation}
S(k) = \frac{(k\sigma)^2}{(k\sigma)^2 + \left((k_D\sigma)^2 - 3\phi\right)\cos(k\sigma) + 3\phi \sin(k\sigma)/k\sigma}.
\tag{200}
\end{equation}
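The step from Eqs. (198) and (199) to the closed form of Eq. (200) can be verified numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: the function names are hypothetical, reduced units with $k_B T = 1$ are assumed (so that $k_D^2 = n q^2/\varepsilon$), and the combination $3\phi$ appearing in Eq. (200) is identified with $4\pi n \sigma^3$, which is what makes the two expressions agree term by term.

```python
import math

def c_hat_mmsa(k, sigma, q, eps):
    """Fourier transform of the MMSA direct correlation function, Eq. (198)."""
    coulomb_tail = -q**2 * math.cos(k * sigma) / (eps * k**2)
    core = (4.0 * math.pi / k**2) * (sigma * math.cos(k * sigma)
                                     - math.sin(k * sigma) / k)
    return coulomb_tail + core

def s_from_c(k, n, sigma, q, eps):
    """Static structure factor from Eq. (199)."""
    return 1.0 / (1.0 - n * c_hat_mmsa(k, sigma, q, eps))

def s_closed_form(k, n, sigma, q, eps):
    """Closed form of Eq. (200)."""
    x = k * sigma
    kd_sigma_sq = (n * q**2 / eps) * sigma**2   # assumption: k_B T = 1
    three_phi = 4.0 * math.pi * n * sigma**3    # assumption: 3*phi = 4*pi*n*sigma^3
    return x**2 / (x**2 + (kd_sigma_sq - three_phi) * math.cos(x)
                   + three_phi * math.sin(x) / x)

for k in (0.5, 1.0, 4.0):
    print(k, s_from_c(k, 0.05, 1.0, 1.0, 1.0), s_closed_form(k, 0.05, 1.0, 1.0, 1.0))
```

The parameter values in the loop are arbitrary and serve only to compare the two routes at a few wavenumbers.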
Using the proportionality of mass and charge fluctuations in the OCCS model and the expression of the DIT charge–charge structure factor in Eq. (185), and comparing it with Eq. (200), we get for the linear response function in the modified MSA scheme
\begin{equation}
\hat{\alpha}(k) = \frac{\varepsilon k_D^2 (k\sigma)^2}{(k\sigma)^2 - (k_D\sigma)^2 + \left((k_D\sigma)^2 - 3\phi\right)\cos(k\sigma) + 3\phi \sin(k\sigma)/k\sigma}
\tag{201}
\end{equation}
that explicitly contains the correction to the DH result due to the finite size of the ions. The finite radius of the ions introduces an oscillatory dependence in the linear response function. In the limit of ionic point charges, $\sigma \to 0$, $\hat{\alpha}(k) \to \varepsilon k_D^2$ and DH results are recovered. Therefore, the MSA scheme provides the finite-radius expression for the linear response function. So far we have calculated the linear response function in the RPA and MMSA for an OCCS model, and we have shown that the DIT route provides a useful way to the non-Debye screening length. In the following we shall analyze the behaviour of both the $\alpha$-function and the dielectric function in the RPA and modified MSA schemes and relate it to the renormalized quasiparticle charges. For an OCCS, the Fourier transform of the DIT linear response function defined by Eq. (162) reads
\begin{equation}
\hat{\alpha}(k) = n q \hat{\rho}^0(k),
\tag{202}
\end{equation}
so the short-range charge distribution is directly proportional to the linear response function. Using Eq. (194) for the RPA $\alpha$-function we get for the short-range charge distribution
\begin{equation}
\hat{\rho}^0(k) = q \left[ 1 + \frac{3\phi}{(k\sigma)^2} \left( \cos(k\sigma) - \frac{\sin(k\sigma)}{k\sigma} \right) \right].
\tag{203}
\end{equation}
Thus the "bare" ionic charges are corrected through a screened oscillatory term in the RPA framework. This suggests an "onion-like" model for the dressed particle charges, that is, the bare charges are renormalized by the superposition of spherically symmetric charge shells around the central ionic charge. The radii of the shells are determined by the short-range interaction. In the hard-sphere RPA framework the size of the different charge shells is determined by the radius of the bare ionic charges.

Fig. 4. RPA response function against wavenumber in arbitrary units, as predicted by Eq. (194), at a volume fraction of 0.2 for different values of the ionic radius. The full curve corresponds to an ionic radius $\sigma = 2$, the long-dashed curve to $\sigma = 3$ and the short-dashed curve to $\sigma = 4$ (Ref. [64]).

In the same way, for the modified MSA scheme, the short-range charge distribution is given by
\begin{equation}
\hat{\rho}^0(k) = \frac{q (k\sigma)^2}{(k\sigma)^2 - (k_D\sigma)^2 + \left((k_D\sigma)^2 - 3\phi\right)\cos(k\sigma) + 3\phi \sin(k\sigma)/k\sigma},
\tag{204}
\end{equation}
where we have used the MSA $\alpha$-function expression in Eq. (201). In Figs. 4 and 5, $\hat{\alpha}(k)/\varepsilon k^2$ is plotted against $k\sigma$ for different values of the volume fraction or ionic radius in the RPA framework and the MSA framework, respectively. In the MSA framework the dependence of the linear response function on the volume fraction is not explicit, so the $\alpha$-function is represented against $k\sigma$ for different ionic radii. Both frameworks lead to oscillatory short-range charge distributions that tend to the bare ionic charge in the free-particle limit ($k \to \infty$), as expected. The main difference between the two frameworks appears in the $k \to 0$ limit, where the RPA scheme leads to a decrease in the internal charge distribution of the quasiparticles, while the MSA scheme leads to the opposite behaviour.

Fig. 5. Behaviour of the MSA response function against wavenumber in arbitrary units, according to Eq. (201), for $\sigma = 2$ and different volume fractions. The full curve corresponds to a volume fraction $\phi = 0.1$, the long-dashed curve to $\phi = 0.2$ and the short-dashed curve to $\phi = 0.3$ (Ref. [64]).

In the RPA, the short-range charge distribution decreases linearly with increasing volume fraction as $q(1 - \phi)$, whereas in the MSA scheme the limit of vanishing $k$ for the charge distribution is obtained from Eq. (204) as
\begin{equation}
\lim_{k \to 0} \hat{\rho}^0(k) = \frac{2q}{2 + (2 - 3z^2 l_B/\sigma)\phi}.
\tag{205}
\end{equation}
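The opposite low-$k$ behaviour of the two schemes can be illustrated with a short numerical sketch of Eqs. (203)–(205). The function names are hypothetical. The sketch works with the dimensionless variable $x = k\sigma$ and treats $k_D\sigma$ and $\phi$ as independent inputs; the last function equals Eq. (205) under the assumption $(k_D\sigma)^2 = 3z^2 (l_B/\sigma)\phi$.

```python
import math

def rho0_rpa(x, phi):
    """Short-range charge rho0(k)/q in the RPA, Eq. (203), with x = k*sigma."""
    return 1.0 + (3.0 * phi / x**2) * (math.cos(x) - math.sin(x) / x)

def rho0_mmsa(x, kd_sigma, phi):
    """Short-range charge rho0(k)/q in the modified MSA, Eq. (204)."""
    a = kd_sigma**2
    den = x**2 - a + (a - 3.0 * phi) * math.cos(x) + 3.0 * phi * math.sin(x) / x
    return x**2 / den

def rho0_mmsa_k0(kd_sigma, phi):
    """k -> 0 limit of Eq. (204); matches Eq. (205) if
    (k_D sigma)^2 = 3 z^2 (l_B/sigma) phi (assumption stated above)."""
    return 2.0 / (2.0 + 2.0 * phi - kd_sigma**2)

phi, kd_sigma = 0.2, 0.8   # arbitrary illustrative values
for x in (1e-2, 1.0, 1e3):
    print(x, rho0_rpa(x, phi), rho0_mmsa(x, kd_sigma, phi))
print("limits:", 1.0 - phi, rho0_mmsa_k0(kd_sigma, phi))
```

For these illustrative inputs the RPA value at small $x$ falls below the bare charge and the MMSA value rises above it, while both approach the bare charge at large $x$.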
As can be readily seen from Eq. (205), in the limit of small wavevector the short-range charge distribution may exceed the value of the bare ion charge when the ionic radius is smaller than $3z^2 l_B/2$, since the denominator of Eq. (205) is then smaller than 2. In a similar way we can study the behaviour of the dielectric function $\varepsilon(k)$. From Eq. (190) and the MSA expression (201) for the $\alpha$-function we obtain
\begin{equation}
\varepsilon(k) = \frac{(k\sigma)^2 + (k_D\sigma)^2 \cos(k\sigma) - 3\phi\left(\cos(k\sigma) - \sin(k\sigma)/k\sigma\right)}{(k\sigma)^2 - (k_D\sigma)^2 + \left((k_D\sigma)^2 - 3\phi\right)\cos(k\sigma) + 3\phi \sin(k\sigma)/k\sigma}.
\tag{206}
\end{equation}
This result implies that the dielectric function satisfies the assumption of perfect screening of the external charge by a conducting fluid, as $\varepsilon(k) \to \infty$ in the limit $k \to 0$, as can be clearly seen in the above equation.
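The consistency of Eq. (206) with Eqs. (190) and (201), and the perfect-screening limit, can be checked with the following Python sketch. It is a minimal illustration with hypothetical function names, written in terms of the dimensionless variable $x = k\sigma$ and the ratio $\hat{\alpha}(k)/\varepsilon k_D^2$; the numerical values of $k_D\sigma$ and $\phi$ are arbitrary.

```python
import math

def mmsa_denominator(x, kd_sigma, phi):
    """Common denominator of Eqs. (201) and (206), with x = k*sigma."""
    a = kd_sigma**2
    return x**2 - a + (a - 3.0 * phi) * math.cos(x) + 3.0 * phi * math.sin(x) / x

def alpha_reduced(x, kd_sigma, phi):
    """alpha(k) / (eps * k_D^2) from Eq. (201)."""
    return x**2 / mmsa_denominator(x, kd_sigma, phi)

def dielectric_from_alpha(x, kd_sigma, phi):
    """Eq. (190) rewritten as eps(k) = 1 + (k_D sigma / x)^2 * alpha/(eps k_D^2)."""
    return 1.0 + (kd_sigma / x)**2 * alpha_reduced(x, kd_sigma, phi)

def dielectric_closed_form(x, kd_sigma, phi):
    """Closed form of Eq. (206)."""
    a = kd_sigma**2
    num = x**2 + a * math.cos(x) - 3.0 * phi * (math.cos(x) - math.sin(x) / x)
    return num / mmsa_denominator(x, kd_sigma, phi)

kd_sigma, phi = 0.8, 0.2   # arbitrary illustrative values
for x in (1e-3, 0.5, 1.0, 3.0):
    print(x, dielectric_from_alpha(x, kd_sigma, phi),
          dielectric_closed_form(x, kd_sigma, phi))
```

The two routes should coincide at every $x$, and the value at the smallest $x$ should be very large, reflecting the divergence of $\varepsilon(k)$ as $k \to 0$.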
As we have previously mentioned, once the $\alpha$-function has been calculated, in whatever equilibrium structural scheme, the effective parameters are easily obtained by the DIT route. One must remember that in the DIT framework the asymptotic decay of the correlation functions and of the charge and potential distributions is determined by $\hat{\alpha}(k)$, the function that provides the singularities of $\hat{h}(k)$, of the average potential and of $\hat{\rho}(k)$ in the complex $k$ plane. The relevant singularities are the poles that occur at the zeros of the denominators of Eqs. (165)–(167). The leading singularity is given by Eq. (168), a result that is equivalent to
\begin{equation}
\varepsilon k^2 = -\hat{\alpha}(k).
\tag{207}
\end{equation}
This equation implies that the poles of the DIT quantities occur at the zeros of the dielectric function, as stated by Eq. (190). As $\varepsilon(k) > 1$, these poles are seen to be complex. Using the MSA expression for the linear response function and particularizing the above equation to the limit of small wavevector to obtain the poles closest to the real axis, we obtain
\begin{equation}
k = \frac{\pm i k_D}{\sqrt{1 + \phi - (k_D\sigma)^2/2}},
\tag{208}
\end{equation}
where only terms up to second order in $k$ have been taken into account. Consequently, the renormalized screening length in the DIT/MMSA is given by
\begin{equation}
\kappa = \frac{k_D}{\sqrt{1 + \phi - (k_D\sigma)^2/2}}.
\tag{209}
\end{equation}
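As a sketch of how the low-$k$ result can be evaluated and checked, the Python fragment below assumes the second-order form $\kappa/k_D = [1 + \phi - (k_D\sigma)^2/2]^{-1/2}$, which is what one obtains by expanding Eq. (207) with the MMSA response function of Eq. (201) to second order in $k$. The function names and the numerical inputs are hypothetical. The residual function is the pole condition of Eq. (207) evaluated at $k = i\kappa$ with the full Eq. (201).

```python
import math

def kappa_over_kd(kd_sigma, phi):
    """Low-k DIT/MMSA estimate of kappa/k_D (second order in k)."""
    denom = 1.0 + phi - 0.5 * kd_sigma**2
    if denom <= 0.0:
        raise ValueError("no purely exponential decay: "
                         "1 + phi - (k_D sigma)^2 / 2 <= 0")
    return 1.0 / math.sqrt(denom)

def pole_residual(y, kd_sigma, phi):
    """Pole condition eps*k^2 = -alpha(k) at k = i*kappa, y = kappa*sigma:
    (a - 3 phi) cosh(y) + 3 phi sinh(y)/y - y^2 = 0 with a = (k_D sigma)^2."""
    a = kd_sigma**2
    return (a - 3.0 * phi) * math.cosh(y) + 3.0 * phi * math.sinh(y) / y - y**2

kd_sigma, phi = 0.3, 0.02          # arbitrary dilute-regime values
ratio = kappa_over_kd(kd_sigma, phi)
y = kd_sigma * ratio               # kappa * sigma
print(ratio, pole_residual(y, kd_sigma, phi))
```

For small $k_D\sigma$ the residual of the full pole condition at the low-$k$ estimate should be of fourth order in $\kappa\sigma$, which gives a quick indication of where the second-order truncation remains adequate.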
The result in Eq. (209) closely resembles Attard's expression for the screening length in terms of the Debye length and hard-core radius [59], obtained from the Stillinger–Lovett second moment condition and a purely exponential decay of the countercharge profile. The only difference is the volume fraction term in the denominator of Eq. (209), which comes from the modified MSA results for the linear response function (201) and renormalized charges. This procedure for obtaining the effective decay length of the ionic solution is termed the DIT/MMSA route. Fig. 6 shows the actual decay length for monovalent electrolytes, as given by the hypernetted chain theory [59] for a 1:1 RPM electrolyte. It compares the results derived from the self-consistent approximation of Attard [59] and from asymptotic expansions [55] to those derived from the modified MSA approximation for an OCCS at the same number density of species. All the results converge to the DH length at low concentrations. In the region of moderate concentration of the bulk system, the self-consistent analytic approximations overestimate the actual value of the screening length, indicating a poor modelling of the ionic correlations in that region. The volume fraction term in Eq. (209) improves the agreement of the theoretically predicted renormalized screening lengths with those derived from the HNC theory. This fact is in agreement with the approximation used in the modified MSA expression in Eq. (197) and confirms its validity in the moderate concentration regime. The DH screening length is recovered in the limit $k_D\sigma \to 0$ or, equivalently, $\phi \to 0$. In the same way, the monotonicity assumption breaks down where the denominator of the renormalized screening length vanishes, and the transition from monotonic exponentially decaying pair correlations to oscillatory behaviour takes place. Hence oscillations commence when
\begin{equation}
k_D = \frac{\sqrt{2(1 + \phi)}}{\sigma},
\tag{210}
\end{equation}
which is closely analogous to the Debye length at which oscillations commence calculated from the second moment condition [59,113,114]. Another conclusion that follows from direct observation of Fig. 6 is that the MMSA does not show the low-concentration deficiency of the MSA and $\kappa/k_D \to 1^+$, in accordance with HNC calculations. This is a direct consequence of the MMSA modelling of the ionic core. Taking this into account, one concludes that the MSA hard-core repulsion is responsible for the underestimation of the effective decay constant in the infinite dilution limit (see Section 3.1).

Fig. 6. Ratio of the inverse effective decay length to the inverse Debye decay length, $\kappa/k_D$, vs. $k_D\sigma$ for 1:1 RPM electrolytes of ionic radius $\sigma = 4.5$ Å. The solid line corresponds to the MMSA result; the dashed line and the dotted line are Attard's second and third order approximations, respectively [59]. The open diamonds correspond to the Kjellander and Mitchell asymptotic expansion [55] and the dot-dashed line is the Mitchell and Ninham expression [61]. Finally, the open squares correspond to a numerical non-linear Debye–Hückel type result [60].

Using the expression for the short-range charge distributions in the MSA in Eq. (204) we obtain
\begin{equation}
q^* = \frac{q (\kappa\sigma)^2}{(\kappa\sigma)^2 + (k_D\sigma)^2 - \left((k_D\sigma)^2 - 3\phi\right)\cosh(\kappa\sigma) - 3\phi \sinh(\kappa\sigma)/\kappa\sigma}.
\tag{211}
\end{equation}
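A brief numerical sketch of Eq. (211) is given below. It is illustrative only: the function names are hypothetical, and $\kappa\sigma$, $k_D\sigma$ and $\phi$ are passed as independent inputs, although in the theory $\kappa$ is fixed by the other two. The small-$\kappa\sigma$ limit used for comparison, $q^*/q \to [1 + \phi - (k_D\sigma)^2/2]^{-1}$, follows from expanding the hyperbolic functions in Eq. (211) to second order.

```python
import math

def q_star_over_q(y, kd_sigma, phi):
    """Renormalized charge q*/q of Eq. (211), with y = kappa*sigma."""
    a = kd_sigma**2
    den = (y**2 + a - (a - 3.0 * phi) * math.cosh(y)
           - 3.0 * phi * math.sinh(y) / y)
    return y**2 / den

def q_star_small_y_limit(kd_sigma, phi):
    """Second-order expansion of Eq. (211) for kappa*sigma -> 0."""
    return 1.0 / (1.0 + phi - 0.5 * kd_sigma**2)

kd_sigma, phi = 0.8, 0.2   # arbitrary illustrative values
for y in (1e-2, 0.5, 1.0):
    print(y, q_star_over_q(y, kd_sigma, phi))
print("small-y limit:", q_star_small_y_limit(kd_sigma, phi))
```

The limit diverges when $1 + \phi - (k_D\sigma)^2/2$ vanishes, which is the same condition that marks the onset of oscillations in Eq. (210).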
The effective charges of Eq. (211) for an OCCS are plotted against $\kappa$ in Fig. 7 for different values of the ionic radius. As can be seen in that figure, $q^*$ increases with increasing $\kappa$, as expected from the relationship between $\kappa$ and $q^*$ in Eq. (172). Expanding the hyperbolic functions in Eq. (211) up to third order in $\kappa$, $q^*$ is seen to diverge in the neighbourhood of the transition from monotonic exponential decay to a damped oscillation ($k_D \simeq \sqrt{2(1 + \phi)}/\sigma$), where $\kappa$ becomes complex, as stated by Eq. (208), and $q^*$ loses its physical significance and cannot be simply interpreted as an effective charge. The OCCS results just presented are valid for low-coupling 1:1 electrolyte solutions. The generalization of the results obtained for an OCCS using the DIT/MMSA to understand the physics of screening in high-coupling electrolyte solutions has been carried out by Varela et al. [69]. These authors developed a formalism for obtaining the correction to the Debye length in general binary electrolytes, going beyond the OCCS model.

Fig. 7. Behaviour of the renormalized charges of the dressed particles, $q^*/q$, in Eq. (211), represented against the renormalized screening length $\kappa$ at a volume fraction of $\phi = 0.2$ in the MSA approximation and arbitrary units. The dash-dotted curve is for an ionic radius $\sigma = 1$, the full curve is for $\sigma = 3$, the dotted curve is for $\sigma = 5$, the long-dashed curve is for $\sigma = 10$ and the short-dashed curve is for $\sigma = 15$. For clarity, the asymptotes are omitted and a constant $q^*/q = 1$ line is included to show the deviations of the renormalized charges from their DH values (Ref. [64]).

The OZ equations for a binary electrolyte system are those of a conventional binary fluid, and the Fourier transforms of the pair correlations can be expressed as [153]
\begin{equation}
\hat{h}_{++}(k) = \frac{k^2 \left[ \hat{c}_{++}(k) + n_- \left( \hat{c}_{+-}^2(k) - \hat{c}_{++}(k)\hat{c}_{--}(k) \right) \right]}{D(k)},
\tag{212}
\end{equation}
\begin{equation}
\hat{h}_{--}(k) = \frac{k^2 \left[ \hat{c}_{--}(k) + n_+ \left( \hat{c}_{+-}^2(k) - \hat{c}_{++}(k)\hat{c}_{--}(k) \right) \right]}{D(k)},
\tag{213}
\end{equation}
\begin{equation}
\hat{h}_{+-}(k) = \frac{k^2 \hat{c}_{+-}(k)}{D(k)},
\tag{214}
\end{equation}
where the common denominator is given by the relationship [153]
\begin{equation}
D(k) = k^2 \left[ \left(1 - n_+ \hat{c}_{++}(k)\right)\left(1 - n_- \hat{c}_{--}(k)\right) - n_+ n_- \hat{c}_{+-}^2(k) \right].
\tag{215}
\end{equation}
The factor $k^2$ has been introduced in these equations to ensure a finite low-$k$ limit of the pair correlations. The asymptotics of the pair correlations are determined by the zeros of the denominator of Eqs. (212)–(214), $D(k) = 0$, so all the total correlation functions exhibit the same pole structure or exponential contributions in real space, and only the amplitudes of these contributions differ from one component to another [153]. In terms of these correlation functions the Bhatia–Thornton static charge–charge structure factor is expressed as Szz (k) = (z12 + z22 ) +
k2 √ [cˆ2+− (k) − cˆ++ (k)cˆ− − (k)] + 2z+ z− n+ n− cˆ+− (k) : D(k)
(216)
As follows from the above equation, the zeros of Szz−1 (k), which were seen to be the e2ective screening length of the spatial correlations in the DIT scheme [64], coincide with the zeros of D(k). Remembering that the DIT relation &kD2 = 0 is equivalent by virtue of Eq. (185) to Szz−1 (k) = 0, we can conclude that the DIT retains all the essential information of the correlation length [69]. Consequently, evaluating the e2ective decay length and charges of the ionic :uid from Eq. (215) is equivalent to the application of the DIT route. Equation (216) de nes the pole structure of the Fourier transforms of the pair correlations, or equivalently the e2ective decay length of the spatial correlation in the DIT. It is in this point where the assumption about the direct correlation function implicit in the MMSA enters the scene, allowing the calculation of the static structure factor and the -function. For general z+ : z− electrolytes, we have to allow a dependence of the cij (r) inside the hard-core on the properties of the particles i and j because the coupling is higher than in 1:1 electrolytes and highly non-linear charge distributions exist around multivalent ions, whose presence in a 1:1 electrolyte solution deeply modify the pair correlations [180]. Thus, the direct correlation function in that zone is modellized as cij (r) − ij , where ij is constant for each pair of ions. This choice allows the inclusion of speci c short-range correlations in the MMSA without loosing the constancy of the direct correlation function in the hard-core. The correlation parameters ij , must depend, in general, on the radii of the ionic species i and j, because of its relation to the short-range ionic interactions, which in MSA-like models are given uniquely in terms of these parameters, and in the ionic charges. 
With these assumptions, the direct correlation function reads: r¡* ; − ij ; cij (r) = (217) qi qj − ; r¿* : 4>&r The correlation parameters ij , must depend on the radii of the ionic species i and j, because of its relation to the short-range ionic interactions, which in MSA-like models are given uniquely in terms of these parameters. Fourier transforming Eq. (217) we get cˆij (k) = A(k)
ij
+ B(k)qi qj ;
where the functions A(k) and B(k) in the MMSA are given by
sin(k*) 4>* ; A(k) = 2 cos(k*) − k k* B(k) = − 2 cos(k*) : &k
(218)
(219) (220)
In the monotonic decay regime, below the Kirkwood critical concentration [53,176,191], the pair correlations are exponentially decaying functions of distance. This forces the zeros of D(k) to be purely imaginary. The asymptotics of the pair correlations of the electrolyte solutions are controlled by the zero of the common denominator in Eqs. (212)–(214), with the smallest imaginary part. Substituting (218) in (215), and evaluating in the pole i+ we obtain the relation: (n+
++
+ n−
− − )A(i+)
2 −n+ n− A(i+)B(i+)(q+
2 2 + (n+ q+ + n − q− )B(i+) − n+ n− A2 (i+)(
−−
2 + q−
++
− 2q+ q−
+− )
=1 :
++ − −
−
2 +− )
(221)
The e2ective decay length can be calculated expanding the above equation for low +. The rst order of this expansion corresponds to the case A(i+) 0 and B(i+) =(&+2 ), and leads to the Debye–H6uckel decay length, + kD . To second order in the expansion of the hyperbolic functions A(i+) and B(i+) we get
2 1 + #3 (qi ; ij )6 + ; (222) = 2 kD 1 + #1 ( ij )6 − (kD *) =2 + #2 ( ij )62 − ((kD *)2 =2)#3 (qi ; ij )6 where the functions #k (qi ; i ) are given by the expressions: #1 ( ij ) = #2 ( ij ) = #3 (qi ;
ij )
(n+
++
n + n− (
=
2 (q+
+ n− n
− −)
++ − − n2
−−
−
; 2 +− )
(223) ;
2 + q− ++ − 2q+ q− (q+ − q− )2
(224) +− )
:
(225)
The decay length derived above depends on the Debye length and on the radius of the ions, as pointed out by Blum [148] and Blum and H`ye [65] for the unrestricted PM. It is also interesting to point out that, at the same time, the MMSA e2ective screening length for general electrolytes has the functional form: + = f(qi ; qj ; kD *) (226) kD a fact that is a consequence of the independence of the e2ective screening length on the size of the ions and which has been pointed out by McBride et al. [60]. This fact is a consequence of the independency of the e2ective screening length on the size of the ions [60]. Assuming that the correlation parameters are the same for all species ij , an assumption that is equivalent to assuming some kind of mean interaction between the ions inside the hard core and, despite its naivety, it allows us to test the formal validity of the GMMSA approximation with di2erent types of electrolyte solutions, while the number of parameters remains tractable. Under these circumstances, Eq. (222) reduces to + 1 =
kD 1 + 6 − (kD *)2 =2
(227)
Fig. 8. Ratio of the inverse effective decay length to the inverse Debye decay length, $\kappa/k_D$, vs. $k_D\sigma$ for 1:z RPM electrolytes of ionic radius $\sigma = 4.5$ Å. Open symbols correspond to HNC calculations: squares, circles, up-triangles and down-triangles correspond to 1:1, 1:2, 1:3 and 1:4 electrolytes, respectively. The curves represent theoretical predictions of the GMMSA.
a result similar to that derived in the original MMSA but with explicit account of specific short-range correlations between particles. This equation represents an extension of the results derived from the self-consistent approximations, and also an improvement on the previously derived screening length for an OCCS in the MMSA [64]. Fig. 8 shows HNC results for the screening length of 1:z electrolytes (z = 1, ..., 4) against $k_D\sigma$ for ions of 4.5 Å diameter, together with the GMMSA predictions [69]. The values obtained for the specific interaction parameter were 1 (MMSA), −1.9, −43.5 and −262.5 for 1:1, 1:2, 1:3 and 1:4 electrolytes, respectively. As these results clearly reflect, the modulus of the mean interaction parameter increases strongly with the asymmetry of the electrolyte, in accordance with the high coupling occurring in asymmetric systems. Therefore the MMSA picture, even with just one mean interaction parameter, allows the modelling of extra deviations due to charge asymmetry while retaining the formal aspect of a self-consistent theory [59]. The MMSA also predicts $\kappa/k_D > 1$ at low concentrations for asymmetric electrolyte solutions, in marked contrast with the exact MSA result. These facts lead us to the conclusion that the GMMSA physical picture is essentially correct, despite some underestimation of the decay constant, which becomes more important as the asymmetry of the electrolyte increases. The success of the MMSA and GMMSA is somewhat surprising, since the constant value of the direct correlation function inside the hard core is incompatible with the physical meaning of $c(r)$. The latter is related to the energy change of ion $i$ at $\mathbf{r}$, $v_i(\mathbf{r})$, due to a change in the local density of particles of species $j$ at $\mathbf{r}'$, $n_j(\mathbf{r}')$, according to the usual relation [55]:
\begin{equation}
c_{ij}(\mathbf{r}, \mathbf{r}') = \frac{\delta v_i(\mathbf{r})}{\delta n_j(\mathbf{r}')} + \delta_{ij} \frac{\delta^{(3)}(|\mathbf{r} - \mathbf{r}'|)}{n_i(\mathbf{r})}.
\tag{228}
\end{equation}
Varela et al. [89] recently reported a formally consistent version of the MMSA which allows for a spatial variation of the direct correlation function inside the ionic core, and which reduces back to the approximations introduced above. The MMSA proved to be a powerful tool for obtaining the
e2ective decay length of ionic systems combined with the DIT -function, but theoretical consistency demands an analysis of the impact on screening results of a spatial variation of the direct correlation function to be overcome, which is one of its main inconveniences: the constancy of c(r) inside the hard-core. Thus, one must allow for a spatial variation of the direct correlation function inside the ionic core and then calculate the screening length and the thermodynamic properties of the system. The most natural choice for this dependence is the linear one, a behaviour shown by the MSA c(r) for a OCCS in the low concentration regime [187]. Consequently, the original MMSA was generalized in the form [67]: −1 + #(n)r; r ¡ * ; 2 c(r) = (229) − q ; r¿* : 4>&r Due to the existence of concentration dependent short-range ionic interactions, the slope of the direct correlation function inside the ionic core, #(n); is assumed to exhibit a dependence in the number density. Fourier transforming Eq. (229) we get
4> 4># 2* sin(k*) 2 2 2 c(k) ˆ = 2 −* cos(k*) + + 2 + * cos(k*) sin(k*) + 2 − k k k k k k2 cos(k*) : (230) &k 2 As we have previously mentioned, the poles of the static structure factor, which were seen to be the e2ective screening length of the spatial correlations in the DIT scheme [64], are given by the expression c(i+) ˆ = 1=n. Thus, it follows from the above equation:
4>n 2# cosh(+*) 2 2#* 1 2 + sinh(+*) + q2 1= 2 *+# * − 2 cosh(+*) + 2 − : + &+2 + + + + (231) − q2
Expanding, as usual [64,69], the hyperbolic functions in the above expression up to the second order in +* and using the expression of the Debye parameter, one gets:
kD2 4>n (+*)2 (+*)2 (+*)2 2 * + #* − 2 + =1 : (232) + +2 6 6 +2 2 Therefore, the screening constant of the ionic :uid is given by the following expression:
2 + 1 − (8>&kB T*2 =q2 )#(n) : = kD 1 + (1 + #(n)*=2)6 − (kD *)2 =2
(233)
It follows from the above equation that the slope of the direct correlation function must vanish in the limit of low concentration, #(n) → 0 when n → 0, if DH theoretical framework is to be recovered in the limit of low ionic density. Evidently, #(n) must tend to zero from below in the high dilution limit for +=kD to be positive in that regime. All these constraints guarantee that the predicted screening properties coincide with the HNC calculated ones, but do not allow us to make any prediction about the decay of correlations in the bulk :uid. For that one has to know a particular form of the concentration dependent slope of the direct correlation function. Because the e2ect of short-range correlations is contained in this
magnitude, it is natural to assume that it increases with concentration, because both excluded volume correlations and higher ionic coupling increase with number density. We shall adopt the simplest form of this magnitude and assume that it depends linearly on concentration, as will follow from a series expansion of the linear term of the MSA polynomial [187]. Thus, we assume that #(n) ∼ #0 6, where #0 is a tting parameter. No independent parameter is allowed in the expression of the #-function as this magnitude must approach zero for negligible ionic concentration. The #0 -parameter can be qualitatively evaluated in terms of a soft sphere model through the following argument: Assuming that the hard core is substituted by an exponential short range repulsion between the ions we would have a potential of the Born–Mayer type: vsr (r) = Ae−r=,
(234)
a form that constitutes the repulsive part of the Tosi–Fumi potential [110] where it accounts for the repulsion of the electronic clouds of the ions. This form is usually employed in molten salt applications and the parameter , is obtained from crystallographic data. This form of the potential also lies in the spirit of the GMSA but we cannot use this scheme directly to get screening results since it leads to the same de ciencies as the MSA in the low concentration regime. The decay constant of the above expression is a measure of the degree of penetrability of the spheres. In the low concentration range the following equality holds: c(r) h(r) −v(r) :
(235)
For $\rho \gg \sigma$ the combination of the above result with Eq. (234) leads to a short-range direct correlation function of the type
$$c(r) \simeq -A\left(1 - \frac{r}{\rho}\right), \qquad r < \sigma, \tag{236}$$
so the linear form introduced in Eq. (229) is recovered, proving that a linear $c(r)$ inside the ionic core can be obtained from soft sphere considerations. Equating the above result with Eq. (229) at $r = \sigma$ one gets
$$\#_0 \simeq \frac{1 - A\,e^{-\sigma/\rho}}{\sigma\phi}, \tag{237}$$
an expression which demonstrates that this magnitude is related to excluded volume effects and to the penetrability of the ions, as we have previously supposed. Substituting the expression of the concentration-dependent slope in Eq. (233) and neglecting second order terms in concentration in the denominator of such expression we get
$$\left(\frac{\kappa}{k_D}\right)^2 \simeq \frac{1 - (8\pi\varepsilon k_B T\sigma^2/q^2)\,\#_0\,\phi}{1 + \phi - (k_D\sigma)^2/2}. \tag{238}$$
It is interesting to point out that this last equation is very similar to the one obtained in [69] for asymmetric electrolytes, both results showing a linear term in the numerator of the renormalization parameter $\kappa/k_D$, which in the previous contribution was related to the charge asymmetry of the ionic species and the specific interaction among species. The predictions of the above equation must be tested with direct HNC data of the decay constant of ionic fluids. For this purpose it is useful
[Figure 9: $\kappa/k_D$ (vertical axis) plotted against $k_D\sigma$ (horizontal axis).]
Fig. 9. Behaviour of the renormalization parameter $\kappa/k_D$ vs. $k_D\sigma$ for 1:1 RPM electrolytes of $\sigma = 4.5$ Å for different values of the slope of the direct correlation function inside the ionic core. Solid line corresponds to $\#_0 = -0.27/\sigma$, dotted line to $\#_0 = 0.08/\sigma$, dashed line to $\#_0 = 0.03/\sigma$, dash-double-dotted line to $\#_0 = 0.084/\sigma$, short dash-dotted line to $\#_0 = 0.45/\sigma$ and short dotted line to the exact MSA solution of Palmer and Weeks.
to introduce the reduced Bjerrum length $\Theta = l_B/\sigma$ in the above equation. In terms of this parameter the above equation can be re-expressed as
$$\left(\frac{\kappa}{k_D}\right)^2 \simeq \frac{1 - (2\sigma\#_0/3\Theta^2)\,(k_D\sigma)^2}{1 + \left[(1/3\Theta) - \tfrac{1}{2}\right](k_D\sigma)^2}, \tag{239}$$
which, once again, has the functional form in Eq. (226):
$$\frac{\kappa}{k_D} = f(q_i, q_j, k_D\sigma). \tag{240}$$
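As a minimal numerical sketch of the rational form above, the snippet below reads Eq. (239) as $(\kappa/k_D)^2 = (1 + a x^2)/(1 - b x^2)$ with $x = k_D\sigma$, $a = -2\sigma\#_0/(3\Theta^2)$ and $b = 1/2 - 1/(3\Theta)$. The function names, the argument `slope0_sigma` (standing for the dimensionless product $\sigma\#_0$) and the value $\Theta = 1.6$ are illustrative assumptions, not quantities taken from the source.

```python
import math


def ab_parameters(theta, slope0_sigma):
    """Coefficients of (kappa/k_D)^2 = (1 + a x^2) / (1 - b x^2).

    theta        : reduced Bjerrum length l_B / sigma
    slope0_sigma : sigma times the slope parameter (#0 in the text), dimensionless
    """
    a = -2.0 * slope0_sigma / (3.0 * theta ** 2)
    b = 0.5 - 1.0 / (3.0 * theta)
    return a, b


def kappa_over_kd(x, theta, slope0_sigma):
    """Ratio kappa/k_D at x = k_D * sigma for the assumed rational form."""
    a, b = ab_parameters(theta, slope0_sigma)
    num = 1.0 + a * x * x
    den = 1.0 - b * x * x
    if num <= 0.0 or den <= 0.0:
        raise ValueError("x lies outside the range where the ratio is real and finite")
    return math.sqrt(num / den)


def slope_threshold(theta):
    """Value of sigma*#0 at which a + b changes sign (a + b > 0 below it)."""
    return 0.5 * theta * (1.5 * theta - 1.0)


if __name__ == "__main__":
    theta = 1.6  # illustrative coupling only, not a fitted value
    for s0 in (-0.27, 0.45, slope_threshold(theta), 2.0):
        row = [round(kappa_over_kd(x, theta, s0), 4) for x in (0.0, 0.5, 1.0)]
        print(f"sigma*#0 = {s0:+.3f}: kappa/k_D at x = 0, 0.5, 1 ->", row)
```

For slopes below the threshold the ratio rises above 1 as $x$ grows, and above the threshold it falls below 1, which is the sign change the rational form implies.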
Fig. 9 shows the behaviour of the ratio $\kappa/k_D$ against Debye's wavenumber in units of the ionic radius. It is evident from this figure and Eq. (239) that $\kappa$ tends to $k_D$ through values of the ratio higher than 1 if there exists a particular relation between $\#_0$ and the coupling parameter $\Theta$. Calculating the derivative of the renormalization parameter, $d(\kappa/k_D)/d(k_D\sigma)$, one gets
$$\frac{d(\kappa/k_D)}{d(k_D\sigma)} = \frac{(a + b)\,k_D\sigma}{(\kappa/k_D)\,[1 - b(k_D\sigma)^2]^2}, \tag{241}$$
where the parameters $a$ and $b$ are given by the expressions
$$a = -\frac{2\sigma\#_0}{3\Theta^2}, \qquad b = \frac{1}{2} - \frac{1}{3\Theta}. \tag{242}$$
Thus, the slope of the renormalization parameter at the origin is always zero, in agreement with HNC calculations [60]. This prediction was also included in the original MMSA, so the introduction of a slope in the correlation function does not change the slope of the effective decay length at zero
concentration. On the other hand, $\kappa/k_D$ increases in the low concentration regime if $a + b > 0$, which means that
$$\#_0 < \frac{\Theta}{2\sigma}\left(\frac{3\Theta}{2} - 1\right). \tag{243}$$
This relation is a restriction on the possible values of the direct correlation function slope if one is to recover HNC calculations. A slope of the direct correlation function that does not satisfy the above relationship would lead to an effective decay constant lower than Debye's classical one in the limit of low ionic density. Finally, we must say that Eq. (239) accurately predicts HNC data for the effective decay constant (see Ref. [67]), and that the improvement of the consistent version over the original MMSA (equivalently GMMSA) version is not very important. Thus, one can confidently use the original (and simpler) versions in practical calculations despite their formal inconsistency.

7. Thermodynamic predictions

As repeatedly mentioned throughout this work, the reconciliation of the exact statistical theory of electrolyte solutions with a mean-field approach, consistent with the moment conditions, demands the substitution of the parameters of the system by effective ones. Thus, the new sources of the interaction are effective charges which interact by means of an average potential with a non-Debye decay length [5,55]. With these new structural features, the classical image of electrolytes must undergo a deep modification that will logically be reflected in the thermodynamic predictions. Varela et al. [70] have obtained the excess internal energy and the osmotic coefficient of monovalent electrolyte solutions using the DIT/MMSA decay length, and compared them to the HNC calculations and analytical results found by Attard [59]. To our knowledge, no other systematic study of thermodynamic properties has been reported in the literature so far. The internal energy of a fluid with an interaction potential $\phi_{ij}(r)$ is given by the energy equation in Eq.
(37):
$$U = \frac{3}{2} N k_B T + 2\pi \sum_i \sum_j n_i n_j \int_0^\infty r^2\,dr\,\phi_{ij}(r)\,g_{ij}(r),$$
where all the symbols have their usual meaning. Particularizing this equation for the case of a binary symmetric electrolyte of charge $q$, hard-core diameter $\sigma$ and number density $n$ we get
$$\beta u^{ex} = \pi n l_B \int_\sigma^\infty [h_{++}(r) - h_{+-}(r)]\,r\,dr, \tag{244}$$
where $u^{ex} = (U - \frac{3}{2} N k_B T)/N$ is the excess internal energy per ion. In the asymptotic regime, the pair correlations of the fluid are given by Eq. (169):
$$h_{ij}(r) \sim -\beta\,\frac{q_i^{*} q_j^{*}}{4\pi\varepsilon^{*} r}\,e^{-\kappa r}, \tag{245}$$
where the effective charges of the fluid and the effective dielectric constant are defined by Eq. (170):
$$q_i^{*} = \hat{\rho}_i^{0}(i\kappa), \qquad \varepsilon^{*} = \varepsilon + \frac{\hat{\alpha}'(i\kappa)}{2i\kappa},$$
[Figure 10: $\beta u^{ex}/\Theta$ (vertical axis) plotted against $k_D\sigma$ (horizontal axis).]
Fig. 10. Excess internal energy of a 1:1 RPM electrolyte solution as a function of the inverse Debye length in units of the ionic radius, $k_D\sigma$, for $\sigma = 4$ Å. The squares correspond to the HNC calculations, the solid line corresponds to the MMSA predictions, the dashed line and the dotted line correspond, respectively, to the second and third order self-consistent approximations. The stars are calculated through the Debye–Hückel non-linear theory (Ref. [59]).
where the prime denotes the derivative of the linear response function. Substituting the pair correlations in Eq. (244) leads to
$$\beta u^{ex} = -\frac{\kappa l_B}{2(1 + \kappa\sigma)}, \tag{246}$$
where $\kappa$ is given by Eq. (209) in the DIT/MMSA scheme. Fig. 10 shows the calculations for the excess internal energy made in the HNC and the non-linear DH type approximation [59], normalized by the coupling constant $\Theta = l_B/\sigma$ [59], together with the predictions of Attard's self-consistent theory and the MMSA results. The predictions of the asymptotic expansions of Mitchell and Ninham [61] and of Kjellander and Mitchell [55] are not shown, since they are only valid for infinite dilution. In this respect it is worth remembering that HNC is regarded, to all intents and purposes, as exact when testing the more approximate theories [59]. As follows from this figure, classical DH theory predicts an excess internal energy for monovalent electrolytes which is lower than the actual one, while the second order self-consistent result of Attard overestimates this thermodynamic quantity from quite low concentrations. Both the MMSA and Attard's third order equation accurately predict the internal energy, improving even on the numerical non-linear DH result. Moreover, the MMSA provides a better fit to the HNC calculations than any other model of the effective decay length up to concentrations near the oscillatory transition. In the limit of high concentrations, near the transition to the oscillatory regime, the MMSA exhibits the same underestimation of the excess internal energy as do the third order self-consistent approximations and the DH non-linear calculations. Thus the MMSA, despite being itself a second order approximation [64], leads to results comparable with those of Attard's third order equation and, as far as internal energy is concerned, also with the DH non-linear numerical approaches. This means that using the MMSA distribution function to model the short-range correlations improves the results obtained by imposing the restrictions of the Stillinger–Lovett moment conditions on the DH pair correlations.
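A short sketch of how the reduced energy plotted in Fig. 10 follows from Eq. (246): dividing $\beta u^{ex} = -\kappa l_B/[2(1 + \kappa\sigma)]$ by $\Theta = l_B/\sigma$ gives $\beta u^{ex}/\Theta = -\kappa\sigma/[2(1 + \kappa\sigma)]$. The function name and the argument `kappa_ratio` (the ratio $\kappa/k_D$, to be supplied by whichever screening model is used) are hypothetical; Eq. (209) itself is not reproduced here.

```python
def beta_uex_over_theta(x, kappa_ratio=1.0):
    """Reduced excess internal energy beta*u_ex/Theta.

    x           : k_D * sigma
    kappa_ratio : kappa / k_D (1.0 recovers the classical Debye decay constant)
    """
    if x < 0.0 or kappa_ratio <= 0.0:
        raise ValueError("x must be non-negative and kappa_ratio positive")
    ks = kappa_ratio * x  # kappa * sigma
    return -ks / (2.0 * (1.0 + ks))


if __name__ == "__main__":
    for x in (0.2, 0.6, 1.0, 1.4):
        classical = beta_uex_over_theta(x)
        renormalized = beta_uex_over_theta(x, kappa_ratio=1.1)  # illustrative ratio only
        print(f"x = {x:.1f}: classical {classical:.4f}, with kappa/k_D = 1.1 {renormalized:.4f}")
```

Keeping the screening model outside the function means the same expression can be evaluated with the Debye value or with any effective decay constant.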
[Figure 11: $\phi_{osm}$ (vertical axis) plotted against $k_D\sigma$ (horizontal axis).]
Fig. 11. Osmotic coefficient of a 1:1 RPM electrolyte solution vs. $k_D\sigma$ for ions of $\sigma = 4$ Å. Symbols are as in the preceding figure.
The internal energy contains the opposing effects of the thermal and potential contributions and so it is generally quite insensitive to approximations. For testing the accuracy of theoretical models the pressure is usually a more adequate quantity. The virial equation in (44),
$$PV = N k_B T - \frac{2\pi}{3} \sum_{i,j} n_i n_j \int_0^\infty r^3\,dr\,\frac{d\phi_{ij}(r)}{dr}\,g_{ij}(r),$$
leads to an osmotic coefficient given by
$$\phi_{osm} = \frac{\beta P}{n} = 1 + \frac{\beta u^{ex}}{3} + \frac{\pi n\sigma^3}{3}\,[2 + h_{11}(\sigma^{+}) + h_{12}(\sigma^{+})], \tag{247}$$
where $h_{ij}(\sigma^{+})$ denotes the limit of the pair correlations through values greater than the hard core of the particles. The contact terms inside the bracket on the right-hand side of the above equation cancel in the linear theories [55,70]. The HNC calculations for the osmotic coefficient of monovalent electrolytes [59] are plotted in Fig. 11 vs. the inverse Debye length in units of the ionic radius. The second order self-consistent predictions show good agreement with the HNC calculations, despite slightly exaggerating the curvature of the predicted osmotic coefficient at moderate concentrations. On the other hand, the DIT/MMSA predicts the HNC results in an essentially correct way, improving even on the predictions of the third order expansion of the self-consistent cubic equation of Attard [59]. Both frameworks (DIT/MMSA
and self-consistent calculations) underestimate the values of the osmotic coefficient up to concentrations near $k_D\sigma = 1$ (the second order self-consistent result) or up to the concentration of transition to the oscillatory regime. This is due to the linear character of these frameworks, which is responsible for the cancellation of the contributions of the pair correlations in the third term on the rhs of Eq. (247). These terms contribute significantly to the pressure in this concentration regime, and they are only included in non-linear frameworks. Therefore, it is not surprising that the numerical predictions of the non-linear DH theory quantitatively improve the predictions in that concentration regime, as shown in Fig. 11. However, it is relevant that this approach overestimates the osmotic coefficient in the high concentration regime to the same extent that the MMSA underestimates it. The calculation of the Helmholtz free energy, $F(T, V, N)$, is done by means of the conventional Gibbs–Helmholtz relation
$$F^{ex} = T \int U^{ex}\,d\!\left(\frac{1}{T}\right), \tag{248}$$
where the superscript ex indicates once again an excess magnitude, $F^{ex} = F - F^{id}$. Substitution of Eq. (246) in the above equation leads to an excess free energy per particle:
$$F^{ex} = -T \int \frac{l_B}{2\beta\sigma}\,\frac{\kappa\sigma}{1 + \kappa\sigma}\,d\!\left(\frac{1}{T}\right). \tag{249}$$
The effective decay constant for a general binary electrolyte is given in the DIT/MMSA scheme by Eq. (227). Thus
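$$F^{ex} = -\frac{T q^2}{8\pi\varepsilon\sigma} \int \frac{k_D\sigma}{\sqrt{1 - \%(k_D\sigma)^2}\,\left(1 + k_D\sigma/\sqrt{1 - \%(k_D\sigma)^2}\right)}\,d\!\left(\frac{1}{T}\right), \tag{250}$$
where $\%$ has the expression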
* 1 − %= : 2 48lB
(251)
Restricting, for the sake of simplicity, the analysis to 1:1 RPM electrolyte solutions, we can assume = 1. Although $\%$ depends on temperature through $l_B$, its multiplication by the square of Debye's parameter cancels this dependence. Taking it into account amounts to a constant change in the radicands of Eq. (250). This correction is of the order of the volume fraction of the system, since $(k_D\sigma)^2 = 48\,l_B\phi/\sigma$ for a 1:1 electrolyte, and consequently we shall neglect its effect. Thus, $\%$ can be considered as temperature independent. The value of this parameter for an OCCS of $\sigma = 4.5$ Å at 298.15 K is $\% = 0.31$ [67]. In this system only one species exists in the bulk. However, for a 1:1 RPM electrolyte of $\sigma = 4.6$ Å one obtains $\% = 0.48$, the value we shall adopt in the rest of this section. Expanding the square root in the denominator of Eq. (250) and retaining terms of the lowest order in concentration in both the numerator and the denominator of such expression, one obtains
$$F^{ex} = -\frac{k_B T V}{4\pi\sigma^3} \int \frac{x^2\,\left(1 - (\%/2)\,x^2\right)}{1 + x}\,dx, \tag{252}$$
where $V$ is the total volume of the system and $x = k_D\sigma$. The first term in the above integral corresponds to the DH result. This formalism is recovered in the limit of low concentration or, equivalently, for $x \to 0$. The second term is the DIT/MMSA correction that accounts for ionic
[Figure 12: $4\pi\sigma^3\beta F/V$ (vertical axis) plotted against $x$ (horizontal axis).]
Fig. 12. Behaviour of the Helmholtz free energy against Debye's wavenumber in units of ionic radius. The dotted line corresponds to DH theory and the solid line to the predictions of Eq. (253) for ions of 4.5 Å.
correlations through the parameter $\%$. The integral in Eq. (252) admits a straightforward analytical solution:
$$\frac{\beta F^{ex}}{V} = -\frac{1}{4\pi\sigma^3}\left(1 - \frac{\%}{2}\right)\left[\ln(1 + x) - x + \frac{x^2}{2}\right] + \frac{\%}{8\pi\sigma^3}\left[\frac{x^4}{4} - \frac{x^3}{3}\right]. \tag{253}$$
This equation clearly states that the energetic effect of the charge renormalization process is not only the increase of the Helmholtz free energy of the system predicted by the second term on the right-hand side, but also modifications in the pure DH term. The leading term of the electrical Helmholtz free energy at finite concentrations is of order $x^4$ or, equivalently, of order $c^2$, where $c$ is the molar concentration, and therefore an increase over the pure DH result is expected. Fig. 12 shows the behaviour of the Helmholtz free energy for 1:1 electrolyte solutions as predicted in the DIT formalism and compares it to the classical DH predictions for the same system. As shown in this figure, the DIT/MMSA renormalization process leads to an increase of the Helmholtz free energy of the electrolyte solution with respect to that of the pure DH theory. This corresponds to a decrease of the electrostatic interactions in the bulk and a simultaneous increase of other types of ionic correlations, something which actually happens in real solutions. So to speak, some sort of compensation occurs in solution between pure electrostatic (long-range) interactions and short-range interactions and higher order electrostatic couplings. As a consequence, the system becomes more "ideal" as the ionic concentration increases. Once the free energy is known it is straightforward to calculate the contribution of the interionic interactions to the chemical potential of species $j$ ($k_B T \ln\gamma_j$):
$$\ln\gamma_j = \frac{\beta q_j^2\sigma^2}{2\varepsilon x}\left(\frac{\partial(\beta F^{ex}/V)}{\partial x}\right)_{T,V}. \tag{254}$$
Thus, using Eq. (253) one gets
$$\ln\gamma_j = -\frac{q_j^2}{4\pi\varepsilon k_B T\sigma}\left(1 - \frac{\%}{2}\right)\frac{x}{2(1 + x)} - \frac{\%\,q_j^2}{16\pi\varepsilon k_B T\sigma}\,x + \frac{\%\,q_j^2}{16\pi\varepsilon k_B T\sigma}\,x^2. \tag{255}$$
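The closed form of Eq. (253) can be checked against the integral in Eq. (252) with a few lines of numerical quadrature. The sketch below works with the dimensionless quantity $4\pi\sigma^3\beta F^{ex}/V$ (the ordinate of Fig. 12); the function names and the argument `pct`, which stands for the parameter written $\%$ in the text, are illustrative choices, and composite Simpson integration is used only as a convenient check.

```python
import math


def integrand(x, pct):
    """Integrand of Eq. (252): x^2 (1 - (pct/2) x^2) / (1 + x)."""
    return x * x * (1.0 - 0.5 * pct * x * x) / (1.0 + x)


def reduced_free_energy_closed(x, pct):
    """4*pi*sigma^3*beta*F_ex/V from the closed form of Eq. (253)."""
    dh_bracket = math.log1p(x) - x + 0.5 * x * x
    return -(1.0 - 0.5 * pct) * dh_bracket + 0.5 * pct * (x ** 4 / 4.0 - x ** 3 / 3.0)


def reduced_free_energy_numeric(x, pct, n=2000):
    """Same quantity from composite Simpson integration of Eq. (252) on [0, x]."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = x / n
    total = integrand(0.0, pct) + integrand(x, pct)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h, pct)
    return -total * h / 3.0


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        closed = reduced_free_energy_closed(x, 0.48)
        numeric = reduced_free_energy_numeric(x, 0.48)
        debye_huckel = reduced_free_energy_closed(x, 0.0)
        print(f"x = {x:.1f}: closed {closed:.6f}, numeric {numeric:.6f}, DH {debye_huckel:.6f}")
```

Setting `pct = 0` recovers the pure DH bracket, so the difference between the two printed columns shows directly how the correction raises the free energy.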
The experimentally accessible magnitude is, of course, the mean activity coefficient, which for a binary electrolyte is obtained as
$$\ln\gamma_\pm = \frac{\nu_+\ln\gamma_+ + \nu_-\ln\gamma_-}{\nu_+ + \nu_-} = -\frac{|q_+ q_-|}{4\pi\varepsilon k_B T\sigma}\left(1 - \frac{\%}{2}\right)\frac{x}{2(1 + x)} - \frac{\%\,|q_+ q_-|}{16\pi\varepsilon k_B T\sigma}\,x + \frac{\%\,|q_+ q_-|}{16\pi\varepsilon k_B T\sigma}\,x^2, \tag{256}$$
where $\nu_j$ is the stoichiometric coefficient of species $j$ and the subscripts $+$ and $-$ denote cations and anions, respectively. We have also used the electroneutrality condition in the derivation of the above equation. Expanding Eq. (256) to the lowest order in concentration one gets
$$\ln\gamma_\pm \simeq -\frac{|q_+ q_-|}{8\pi\varepsilon k_B T\sigma}\,\frac{x}{1 + x} + \frac{\%\,|q_+ q_-|}{16\pi\varepsilon k_B T\sigma}\,x^3. \tag{257}$$
The first term on the rhs of the above expression corresponds to the DH activity coefficient for ions of finite diameter. This result explicitly shows that the DIT/MMSA lowest order correction to the activity coefficient predicts deviations from DHLL at finite concentrations which show a $c^{3/2}$ dependence on molar concentration. The above equation is equivalent to the classical improvement of the limiting law due to Fowler and Guggenheim [93]. These authors included in the description of the system the electrical free energy of the solvent, originating in ion-dipole interactions, and deduced for the rational activity coefficient the expression
$$\ln f_\pm = -\frac{\nu\,|q_+ q_-|}{2\varepsilon k_B T\sigma}\,\frac{x}{1 + x} + \frac{\bar{V}_2}{24\pi N\sigma^3}\,x^3\,\sigma(x), \tag{258}$$
where $\nu$ is the stoichiometric coefficient, $\bar{V}_2$ is the solute partial molar volume and the function $\sigma(x)$ is defined by
$$\sigma(x) = \frac{3}{x^3}\int_0^x \left(\frac{x}{1 + x}\right)^2 dx.$$
This function tends to unity at low concentrations, so the functional dependence of the corrections to DHLL in Eq. (258) is identical to the one obtained in Eq. (257) from the DIT/MMSA scheme. However, this last equation opens the way to more systematic corrections to the classical DH law. Fig. 13 compares the DIT/MMSA predictions for the activity coefficient in Eq. (257) to Rasaiah and Friedman's HNC calculations for 1:1 RPM electrolytes of $\sigma = 4.6$ Å [21] and also to Andersen and Chandler's first (M1) and second (M2) order mode expansions for the same system [18]. As can be seen in that representation, the DIT/MMSA result reproduces quite well the behaviour of the HNC/M2 numerical results for the activity coefficient in a wide concentration range. 
In particular, the DIT/MMSA, despite being itself a mean-field theory with no adjustable parameter, clearly improves on the predictions of the DHLL + B2 approximation (which denotes the sum of all ring diagrams and all two-particle cluster diagrams in the ionic cluster theory) in the moderate to high concentration range, although at low concentration the DIT/MMSA result shows a weak underestimation of the activity coefficient. Thus, we can conclude that the simple DIT/MMSA scheme captures the same physics as the sophisticated HNC and mode expansion formalisms as far as screening and thermodynamic properties of ionic systems are concerned and, consequently, that the former emerges as a powerful tool for the thermodynamic description of these systems.
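The algebra linking Eqs. (256) and (257) can be illustrated numerically. Writing `pref` for $|q_+ q_-|/(4\pi\varepsilon k_B T\sigma)$ and `pct` for the parameter denoted $\%$, the three terms of Eq. (256) combine exactly into $-(\mathrm{pref}/2)\,x/(1+x) + (\mathrm{pct}\cdot\mathrm{pref}/4)\,x^3/(1+x)$, whose lowest order in $x$ is Eq. (257). The sketch also evaluates the Fowler–Guggenheim function through the elementary antiderivative of $[t/(1+t)]^2$. All function and argument names are illustrative assumptions.

```python
import math


def ln_gamma_three_terms(x, pct, pref=1.0):
    """Mean activity coefficient written as the three terms of Eq. (256)."""
    return (-pref * (1.0 - 0.5 * pct) * x / (2.0 * (1.0 + x))
            - 0.25 * pct * pref * x
            + 0.25 * pct * pref * x * x)


def ln_gamma_compact(x, pct, pref=1.0):
    """Algebraically equivalent form: DH term plus a correction x^3/(1+x)."""
    return -0.5 * pref * x / (1.0 + x) + 0.25 * pct * pref * x ** 3 / (1.0 + x)


def ln_gamma_lowest_order(x, pct, pref=1.0):
    """Lowest-order form of Eq. (257): the correction is kept as x^3."""
    return -0.5 * pref * x / (1.0 + x) + 0.25 * pct * pref * x ** 3


def sigma_fg(x):
    """(3/x^3) * integral_0^x [t/(1+t)]^2 dt, using the closed antiderivative."""
    return 3.0 / x ** 3 * (1.0 + x - 1.0 / (1.0 + x) - 2.0 * math.log1p(x))


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0):
        print(f"x = {x:.1f}:",
              f"Eq.(256) {ln_gamma_three_terms(x, 0.48):.5f},",
              f"Eq.(257) {ln_gamma_lowest_order(x, 0.48):.5f},",
              f"sigma(x) {sigma_fg(x):.4f}")
```

The gap between the first two printed columns grows with $x$, which gives a feel for where the lowest-order expansion stops being a good stand-in for the full expression.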
[Figure 13: $\ln\gamma_\pm$ (vertical axis) plotted against $c^{1/2}$ (horizontal axis).]
Fig. 13. Logarithm of the mean ionic activity coefficient of an aqueous 1:1 RPM electrolyte solution at 298.15 K against molar concentration. The solid symbols correspond to HNC calculations of Rasaiah and Friedman and the second order mode expansion of Andersen and Chandler (M2) and the open symbols to the first order mode expansion of the same authors (M1). The long dashed line is DHLL, the short dashed one corresponds to DHLL corrected with all two-particle cluster diagrams and the solid line is the DIT/MMSA predicted activity coefficient in Eq. (257) with $\% = 0.48$.
8. The primitive model double layer: effective surface charge

The main result one should bear in mind when facing the asymptotics of the pair correlations in electrolytes and the electric double layer is that the ion density profiles behave as predicted by the PB approximation but with effective values of the parameters. In the previous sections we have analyzed the effective screening behaviour in bulk electrolyte solutions, and we concluded that there exist deviations from the classical DH charge and screening parameters due to the existence of ionic correlations. As occurred in the homogeneous case, renormalized sources of the potential must be introduced if one wants to preserve the PB formalism (the conceptual basis of the DLVO theory of colloid solutions) for the description of highly charged objects [11]. Numerical data for the effective charge density are available from the singlet hypernetted chain approximation [127,126], an approximation which is reliable for the isolated planar double layer and may be considered accurate at low electrolyte coupling. The singlet HNC results are improved by the inclusion of the first bridge diagram [192]. Both the singlet HNC results and the HNC corrected with the first bridge diagram indicate that the effective surface charge is noticeably lower than the actual surface charge and that the departure is larger for high coupling (high surface charge densities or divalent electrolytes), even at low electrolyte concentrations [193]. Besides, the HNC results predict a maximum of the effective surface charge density at a surface area per unit charge of 100–125 Å$^2$ for monovalent electrolytes at 0.1 M, which leads to a charge reversal at a surface charge $s = q\sigma_s/(\varepsilon k_D) = 11$. 
This behaviour is consistent with other HNC results for the edl, which show that a change from repulsive to attractive of the edl interaction between two charged walls takes place at an area per unit surface charge of about 200 Å$^2$, or the decrease of the potential drop of an isolated double layer with increasing surface charge. The mechanism of charge reversal has been studied in detail and it is caused by a combination of ion size and valence, and consists in
the formation of layers of charge around the wall induced by the high surface charge. The charge reversal has been confirmed by simulations [134], in the singlet MSA [194] and in density functional theories [195]. Of course, several theoretical approaches have been taken to study the behaviour of the effective surface charge of the edl at the McMillan–Mayer (MM) level, i.e. averaging out the solvent degrees of freedom. The limitations of these simplified theories are pointed out using models of the edl that explicitly take into account the solvent structure (Born–Oppenheimer (BO) level). The solvent is modelled as hard spheres with embedded point dipoles and point quadrupoles, with parameters chosen to mimic the properties of real water, and these models are usually solved using a reference HNC method [196–199]. Thus, one finds that the solvent structure controls the short-range response of the ions to the surface. A fast neutralization occurs due to the counterions next to the surface, as they are not screened. At longer range (about four solvent diameters from the surface), a continuum behaviour is predicted [198]. Among the theories of the edl at the MM level, the modified PB approach (MPB), the extended PB theory (EPB, following the nomenclature of Attard, Mitchell and Ninham) [193], and the DIT of Ennis et al. [200] are the most relevant. The first theory is based on the application of the Kirkwood hierarchy and the weak-coupling approximation to the PM and comprises a whole family of related theories, MPB1, ..., MPB5 [201–203]. This formalism was perhaps the first to predict charge reversal for a PM electrolyte at 0.15 M, but the fact that the calculation of ion–ion correlations through electrostatic boundary problems is suitable only for the dilute McMillan–Mayer level, together with the analytical complexity of the equations (see e.g. [127]), has prevented a more widespread use of this formalism. 
The second formalism, EPB, represents an analytical correction to the original GC picture based on the weak-coupling equations. Ion size e2ects are still neglected, but the e2ect of purely electrostatic correlations between ions is included at the level of a perturbation. We shall summarize the main features of the EPB treatment of the edl below. The original GC theory (also termed non-linear PB theory) provides us some insight into the concept of an e2ective surface charge. The solution of the PB equation for an isolated edl immersed in a symmetric electrolyte solution is given by Eq. (83): q O (z) = 4 tanh−1 (ue−kD z ) ;
q 0 ; q 0 = 2 sinh−1 s ; u = tanh 4 where s = q2s =&kD . In the limit of small s (low surface charge density or high electrolyte concentration) we can linearize the above equation to obtain O (z) = se−kD z
(259)
a solution which is formally identical to DH one for electrolyte solutions. This result provides the asymptotic behaviour of the edl potential and it is used as a reference case with which the solutions of more accurate theories are compared [200]. For large z; O (z) will be exponentially small, so the rhs of Eq. (83) can be linearized in the form: " # 2 1=2 8 1 + s O (z) − 1 e− k D z : (260) s 4
An e2ective value of the surface charge density for the GC case can be de ned as [193]: " # 2 1=2 1 + s 8 ∗ = −1 : sGC s 4
(261)
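The Gouy–Chapman effective surface charge can be explored with a short sketch. It reads Eq. (261) as $s^{*}_{GC} = (8/s)\,[(1 + s^2/4)^{1/2} - 1]$ for a dimensionless surface charge $s$, which is the amplitude obtained by linearizing the $\tanh^{-1}$ solution at large $z$, namely $4u$ with $u = \tanh(y_0/4)$ and $y_0 = 2\sinh^{-1}(s/2)$ the reduced surface potential. That reading, the symbol $y_0$ and the function names are assumptions made for illustration.

```python
import math


def s_eff_gc(s):
    """Effective GC surface charge, (8/s)[sqrt(1 + s^2/4) - 1].

    Evaluated in the equivalent form 2s / (1 + sqrt(1 + s^2/4)),
    which avoids cancellation at small s and is finite at s = 0.
    """
    if s < 0.0:
        raise ValueError("s is taken as non-negative here")
    return 2.0 * s / (1.0 + math.sqrt(1.0 + 0.25 * s * s))


def s_eff_from_tanh(s):
    """Same quantity from the far-field amplitude 4*tanh(y0/4), y0 = 2*asinh(s/2)."""
    y0 = 2.0 * math.asinh(0.5 * s)
    return 4.0 * math.tanh(0.25 * y0)


if __name__ == "__main__":
    for s in (0.01, 1.0, 11.0, 1.0e3):
        print(f"s = {s:g}: s_eff = {s_eff_gc(s):.6f}, from tanh = {s_eff_from_tanh(s):.6f}")
```

In this reading the effective charge follows $s$ at low surface charge and levels off below 4 at high surface charge, so the GC picture alone gives saturation but no charge reversal.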
The EPB analytical correction to the GC picture of e2ective surface charge is based on the result for the e2ective surface charge density of a multicomponent electrolyte where one of the components is taken to be a spherical solute of charge q0 and number density n0 . As we pointed out in the beginning of this section, in the limit of in nite dilution (n0 → 0), the isolated macroion can be used to model an isolated wall, and the e2ective surface charge density is [59] " # 1 *+ qi cˆoi (i+) ; (262) 2∗s = 2 i where the 1=2 factor is introduced to preserve the PB form of the asymptote and 2∗s → 2s in the limit of in nite dilution. Assuming that + kD and &∗ &, Attard et al. [193] obtained: 1=2 q2 kD ∗ (2I + ln 2) 1+ ; (263) s∗ = sGC 4& where I is given by 2T2 − 3 2T2 − 3 2 − 2T3 + T 1 1 1+ 1− ln 2 + ln(T + T2 ) − I= 2 (2T2 − 1)3 2T(2T2 − 1)2 2 (2T2 − 1)3 $
2T2 + 1 T2 − 1 T−1 1+ − tan−1 ; 2 3 T (2T − 1) T+1
where T = (1 + s2 )=4. In the limit s → 0, the above result can be approximated as 1=2 q2 kD ∗ ∗ : s = sGC 1 + (1 − ln 2) 4&
(264)
(265)
The fact that EPB assumes that the e2ective decay constant is equal to the Debye constant limits its range of validity to low enough concentrations. This assumption is not correct for the PM electrolyte, as we have previously seen, so the EPB predictions are expected to be valid only at low surface charges and very low electrolyte concentrations. The main result in the DIT theory of the edl was found by Ennis et al. [200]. Once again, the theory for a planar wall is formulated taking as a model system the multicomponent electrolyte in which one species (let us denote it as species 0) represents the colloidal particles. We have to bear in mind that DIT is a formally exact mean eld theory that is expected to apply to solutions of small ions as well as to colloidal dispersions [5,55]. To obtain the correlation function and the e2ective parameters for a planar wall, the limit of large ionic diameter and in nite dilution of the macroionic species has to be taken. The short range part of the macroion-ion correlation function is given by h0i0 (z) = hi0 (z) + qi∗ O (z);
i¿1 ;
(266)
where hi0 (z) is the total macroion-ion correlation function, O (z) is the average potential due to the macroion and qi∗ is the charge of the dressed ions of species i in the bulk. From the de nition introduced above, the short-range charge distribution in the wall can be de ned as 200 (z) = qi ni h0i0 (z) : (267) i=0
At low enough concentrations, the e2ective decay constant is equal to the Debye constant, and the e2ective surface charge density is given by [200]
∞ 1 ∗ 0 − +z 2s + &+ O (0) + 20 (z)e d z : (268) 2s = 2 0 As one can see, to compute the e2ective surface charge density in the wall, the ion densities and the mean electrostatic potential must be determined as functions of distance to the wall. For this task, one must previously determine the wall–ion pair correlation functions and use the bulk e2ective parameters + and qi∗ obtained by any method. Ennis et al. in Ref. [200] have used the Anisotropic HNC (AHNC) [204–206] to analyze the inhomogeneous electrolyte in the di2use double layer. This closure relation is equivalent to its classical HNC version but the pair correlation functions and the pair potential are supposed to be explicit functions of the position: hij (r1 ; r2 ) = −1 + exp[hij (r1 ; r2 ) − cij (r1 ; r2 ) − 6ij (r1 ; r2 )] :
(269)
The pair correlations are related to the density profiles in the neighbourhood of the macroion by means of Kirkwood's equation. The AHNC has been shown to agree with Monte Carlo simulations for monovalent and divalent electrolytes (see Ref. [200] and references therein), and its results can be considered essentially correct except at very high ion concentrations in the edl, where the Anisotropic Reference HNC must be employed. The work of Ennis et al. [200] provides systematic numerical calculations of both the effective surface charge density and the diffuse layer potential for 1:1 and 2:2 electrolytes at various concentrations. For 1:1 electrolytes a saturation of the effective surface charge is detected, with the limiting surface charge depending weakly on ionic diameter at low concentrations. This dependence is more pronounced at higher concentrations. The situation is radically different for 2:2 electrolytes, where the higher ionic coupling of this type of ionic system leads to a reversal of charge at a surface charge which corresponds to a maximum in the edl potential. Ennis et al. did not observe the reversal of charge that actually occurs in 1:1 electrolyte solutions because they limited their analysis to 0.05 M solutions. However, Attard and coworkers [193] reported a reversal of charge in 1:1 electrolyte solutions for concentrations of 0.1 M. Refs. [200,193] demonstrate that EPB theory predicts fairly well the effective surface charge densities for 1:1 electrolyte solutions up to concentrations of 0.1 M, while the agreement of the theoretical results for 2:2 electrolytes is limited to concentrations of $10^{-3}$ M.

9. Transport theory of electrolytes: DITT

The calculation of the correlation functions of a fluid in the dynamical case demands the simultaneous consideration of both space and time scales. The reference magnitudes for a liquid are,
generally, the mean free path ($l$) and the collision time ($\tau$). With reference to these parameters we can distinguish three main regimes in the description of the system dynamics:
(1) The hydrodynamic regime, for which the wavenumber $k$ and frequency $\bar{\omega}$ of the external perturbation verify the relations $kl \ll 1$ and $\bar{\omega}\tau \ll 1$. Under these circumstances, the fluid can be treated as a continuum and its time fluctuations can be averaged out. In this regime the dynamic behaviour of the fluid is described by the usual equations of macroscopic fluid mechanics [207].
(2) The kinetic regime ($kl \sim 1$, $\bar{\omega}\tau \sim 1$), where the molecular structure of the fluid must be considered and microscopic equations of motion must be employed.
(3) The free particle regime, a situation where both the mean interparticle distance and the collision time are large enough ($kl \gg 1$, $\bar{\omega}\tau \gg 1$) to consider that the particles move independently.
In this work we shall only consider the dynamic properties of an ionic system in the first regime. Thus, the system can be considered as a continuum, implying that in any element of the fluid there exists a large enough number of molecules. This image of the fluid breaks down when we consider rapidly varying space or time dependent perturbations. In the hydrodynamic regime, we can assume that the system responds instantly to variations of the external field. In these conditions it is possible to define a set of hydrodynamic (macroscopic) variables that describe the state of motion of the fluid. The system is no longer homogeneous and stationary, so these variables exhibit a dependence on space and time in the form $A(\vec{r}, t)$ and verify relationships between fluxes and gradients of the local densities in terms of the so-called transport coefficients. In particular, in an ionic solution the conductivity coefficient that relates the current density and the external field is of special relevance. 
The existence of an external perturbation acting on an ionic solution modifies the equilibrium picture that we have seen previously. The external field induces a net displacement of charges, breaking the symmetry of the pair correlations and introducing time dependencies in them. The various transport coefficients that give the response of the system are determined by the interionic interactions, and their evaluation is the main aim of the so-called transport theory. However, despite the importance of this theory, many important dynamic phenomena in ionic systems take place under conditions for which the hydrodynamic assumption is not valid. The microscopic dynamics of the ionic solution must then be studied using the correlation function formalism (see for example Refs. [103,208]). These quantities are related to the dynamic structure factors, and their experimental determination demands the use of inelastic particle scattering, or sophisticated theoretical or simulation techniques that are beyond the scope of this report. In the classical Debye–Falkenhagen–Onsager (DFO) theory, hydrodynamic equations of motion are combined with the DH equilibrium theory for calculating the transport coefficients of electrolyte solutions. This formalism is based on the assumption that the ions undergo Brownian motion and that the equilibrium distribution functions are preserved under weak external fields. On the basis of these assumptions Debye and Hückel [1] and Onsager [79] were able to make important contributions to the transport theory of electrolytes. This is one of the oldest problems in physical chemistry and has been widely treated in the literature for both the static and frequency dependent regimes. The DFO treatment was generalized by Debye and Falkenhagen [209] to account for the effect of high frequency fields on the conductance and dielectric constant of the fluid. Besides, Joos and Blumentritt [210] analyzed the effect of high intensity fields on electrolytic conductance, the so-called Wien effect.
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111
As we mentioned at the beginning of this section, transport theory is directly related to the equilibrium properties of the solution, and the equilibrium distribution functions determine the dynamic behaviour of the medium. Consequently, any improvement in the equilibrium distributions of the medium must yield modifications in the related transport theory formalism. The old linear response DFO theory [80–82], based on the extension of the Debye–Hückel equilibrium theory to transport phenomena, has recently been improved using more accurate pair distribution functions. These include the mean-spherical approximation (MSA) for both the restricted primitive model (RPM) [84,85] and the unrestricted primitive model (different ionic sizes) [78,86–88]. These equilibrium theories are extensions of the hard-core DH theory that satisfy the second moment condition [59] and have been shown to provide more accurate expressions for the thermodynamics and transport coefficients of electrolyte systems [59,78].
In the previous sections we have seen that the statistical mechanical theory of electrolyte solutions has undergone a long and complex development, from the early papers of Debye and Hückel (DH) and of Gouy to the modern and successful dressed-ion theory (DIT) of Kjellander and Mitchell [5,55]. Given the previously mentioned relationship between the equilibrium theory and the transport theory, it was necessary to adapt the transport formalism to the dressed-ion equilibrium theory. In this section, we exploit the dressed-ion model to obtain the dynamical response of bulk electrolyte solutions to an external field in the static and time-dependent regimes as an approach to DIT transport theory. In particular, we present the electrical response or relaxation of the ionic cloud due to the perturbation of the atmosphere, and its effect on the ionic mobility, by means of the hydrodynamic continuity equation for the time-dependent radial distribution function.
In addition, in the asymptotic regime, a reformulation of the DFO formalism for the transport theory of ionic systems in terms of the effective quantities of DIT is analyzed and its predictions are compared to actual experimental data. This is the so-called dressed-ion transport theory (DITT). Onsager's classical results are also recovered in the usual vanishing concentration limit.

9.1. Relaxation of the ionic cloud

The dominant forces that determine the deviations from ideal behaviour of the transport processes in electrolyte solutions are the relaxation and the electrophoretic effects, both of them arising from the interaction between the electric charges of the ions. The first one was studied by Debye [1]. When the ionic equilibrium distribution function is perturbed by some external force, internal electrostatic forces derived from the dissymmetry of the ionic atmosphere appear that tend to restore the equilibrium distribution of the ions. This effect changes the field experienced by individual ions and therefore has a direct effect on the mobility and, consequently, on transport-related quantities. In order to obtain the correction for the electric field acting on an ion due to the perturbation of the ionic atmosphere, we have to use the time-dependent distribution function $g_{ij}(\mathbf{r}_i, \mathbf{r}_j, t)$ introduced in Section 2. This function satisfies a continuity equation in the six-dimensional configuration space of two particles [99,211,212]:
\[
\frac{\partial g_{ij}(\mathbf{r}_i, \mathbf{r}_j, t)}{\partial t} + \nabla_i \cdot \mathbf{J}_i + \nabla_j \cdot \mathbf{J}_j = 0 , \tag{270}
\]
where $\mathbf{J}_i$ is the probability flux for $i$ particles. The flux $\mathbf{J}_i$ is given by [100]
\[
\mathbf{J}_i = \mu_i^0 q_i\, g_{ij}(\mathbf{r}_i, \mathbf{r}_j, t)\,(\mathbf{E} - \nabla_i \psi_j) - \beta^{-1} \mu_i^0 \nabla_j g_{ij} . \tag{271}
\]
Here $\mu_i^0$ is the mobility of ions of species $i$ at infinite dilution, where all ion–ion interactions vanish, $\mathbf{E}$ is the external field and $\psi_j(\mathbf{r}_i, \mathbf{r}_j)$ is the potential due to an ion located at $\mathbf{r}_j$ acting on ion $i$. This potential satisfies the Poisson equation:
\[
\nabla_i^2 \psi_j(\mathbf{r}_i, \mathbf{r}_j) = -\frac{4\pi}{\varepsilon} \left[ \sum_l q_l n_l h_{lj}(\mathbf{r}_i, \mathbf{r}_j) + q_i \delta(\mathbf{r}_i - \mathbf{r}_j) \right] , \tag{272}
\]
where we have used the electroneutrality condition $\sum_k n_k q_k = 0$. Making use of the symmetry condition $\nabla_i h_{ij}(\mathbf{r}, t) = -\nabla_j h_{ij}(\mathbf{r}, t)$, and substituting (271) and (272) in the continuity equation (270), we arrive at
\[
-\frac{\partial h_{ij}(\mathbf{r}, t)}{\partial t} + \beta^{-1} (\mu_i^0 + \mu_j^0) \nabla^2 h_{ij}(\mathbf{r}, t) + q_i \mu_i^0 \nabla^2 \psi_j(\mathbf{r}, t) + q_j \mu_j^0 \nabla^2 \psi_i(-\mathbf{r}, t) = (q_i \mu_i^0 - q_j \mu_j^0)\, \mathbf{E}(\mathbf{r}, t) \cdot \nabla h_{ij}(\mathbf{r}, t) , \tag{273}
\]
where we have introduced relative coordinates $\mathbf{r} = \mathbf{r}_j - \mathbf{r}_i$. Eq. (273) relates the total correlation functions to the potentials, and may be easily solved by Fourier analysis. If the electrolyte (or colloid dispersion) is disturbed by an external field $\mathbf{E}(\mathbf{r}, t)$, both the potentials and the distribution functions become asymmetric [99]. The radially symmetric functions of the DIT are perturbed by the field. Supposing the latter to be small, we may write the new potentials and radial distribution functions as [99,100]:
\[
\psi_i(\mathbf{r}, t) = \psi_{i,0}(r) + \delta\psi_i(\mathbf{r}, t) , \qquad h_{ij}(\mathbf{r}, t) = h_{ij,0}(r) + \delta h_{ij}(\mathbf{r}, t) , \tag{274}
\]
where the perturbations $\delta\psi_i$ and $\delta h_{ij}$ are odd functions of $\mathbf{r}$ (Ref. [99]) and the subscript 0 stands here for an equilibrium magnitude. Restricting our attention to a binary electrolyte, and using the symmetry condition for the perturbations $\delta h_{ij}(\mathbf{r}, t) = -\delta h_{ji}(\mathbf{r}, t)$, the linearity of Poisson's equation and $\delta\psi_i(\mathbf{r}, t) = \delta\psi_j(\mathbf{r}, t)$, we obtain to the lowest order in the perturbations:
\[
-\frac{\partial\, \delta h_{ij}(\mathbf{r}, t)}{\partial t} + \beta^{-1} (\mu_i^0 + \mu_j^0) \nabla^2 \delta h_{ij}(\mathbf{r}, t) + (q_i \mu_i^0 - q_j \mu_j^0) \nabla^2 \delta\psi_j(\mathbf{r}, t) = (q_i \mu_i^0 - q_j \mu_j^0)\, \mathbf{E}(\mathbf{r}, t) \cdot \nabla h^L_{ij}(r) , \tag{275}
\]
where we have replaced the total correlation function $h_{ij}$ by its long-range equilibrium value $h^L_{ij}$. This implies a double supposition: ignoring the time dependence of the perturbed part of the equilibrium pair correlation function, and ignoring its equilibrium short-range part. The approximation of the total correlation function on the right-hand side of Eq. (274) by its equilibrium value is justified in electrolyte transport theory with the exception of the Wien effect [99], as only sufficiently weak external fields are considered, so the perturbation in Eq. (274) can be ignored. The second supposition involved on the right-hand side of Eq. (275) is the approximation of the DIT equilibrium pair correlation by its long-range part. This is the same as ignoring the coupling of the external field to the internal part of the ionic cloud. This appears reasonable for weak external perturbations and finite concentrations (the DIT concentration regime), for which there is a considerable screening of
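the internal shells by the external ones, and thus we can consider the short-range charge density as unaltered. Fourier transforming Eq. (275) we get
\[
-i\omega\, \delta\hat h_{ij}(\mathbf{k}, \omega) - k^2 \beta^{-1} (\mu_i^0 + \mu_j^0)\, \delta\hat h^L_{ij}(\mathbf{k}, \omega) - k^2 (q_i \mu_i^0 - q_j \mu_j^0)\, \delta\hat\psi^{\mathrm{av}}_j(\mathbf{k}, \omega) = \frac{-i}{(2\pi)^2} (q_i \mu_i^0 - q_j \mu_j^0) \left[ \mathbf{E}(\mathbf{k}, \omega) * [\mathbf{k} \hat h^L_{ij}(k) \delta(\omega)] \right] , \tag{276}
\]
where we have used that the space–time Fourier transform of the equilibrium pair correlation is $2\pi \hat h^L_{ij}(k) \delta(\omega)$. The term $\mathbf{E}(\mathbf{k}, \omega) * [\mathbf{k} \hat h^L_{ij}(k) \delta(\omega)]$ represents the convolution of the external field with the internal long-range structure of the solution, defined by
\[
[\mathbf{E} * (\mathbf{k} \hat h^L_{ij}) \delta(\omega)](\mathbf{k}, \omega) = \int d\mathbf{k}' \int d\omega'\, \mathbf{E}(|\mathbf{k} - \mathbf{k}'|, \omega - \omega')\, \mathbf{k}' \hat h^L_{ij}(k') \delta(\omega') . \tag{277}
\]
Considering that the external field is weak enough, we can ignore the perturbation of the short-range part of the charge density, i.e. of the internal part of the ionic cloud and the bare ion charge distribution, so Poisson's equation written for the perturbations reads
\[
\varepsilon k^2\, \delta\hat\psi_i(\mathbf{k}, \omega) = 4\pi q_j n_j\, \delta\hat h^L_{ij}(k) \delta(\omega) . \tag{278}
\]
Substituting the above expression in the continuity equation and rearranging gives
\[
\delta\hat\psi_i(\mathbf{k}, \omega) = \frac{i q k_D^2\, \mathbf{E}(\mathbf{k}, \omega) * [\mathbf{k} \hat h^L_{ij}(k) \delta(\omega)]}{(2\pi)^2 k^2 (q k_D^2 + i\omega\tau^* + k^2)} , \tag{279}
\]
where $\tau^* = 1/(\mu_i^0 + \mu_j^0)$. We have also used the electroneutrality condition $\sum_{i=1}^{2} n_i q_i = 0$ and introduced the parameter $q$ given by
\[
q = \frac{q_i \mu_i^0 - q_j \mu_j^0}{(q_i - q_j)(\mu_i^0 + \mu_j^0)} , \tag{280}
\]
which accounts for the different mobilities and ionic charges of both species in solution. The perturbed part of the electric field acting on ion $i$ can be easily obtained using the potential in Eq. (279). Its expression is given in Fourier space by
\[
\delta\mathbf{E}_i(\mathbf{k}, \omega) = -i \mathbf{k}\, \delta\hat\psi_i(\mathbf{k}, \omega) , \tag{281}
\]
so using Eq. (279) and inverting the Fourier transform with $\mathbf{r} = 0$ we obtain
\[
\delta\mathbf{E}(0, t) = \frac{q k_D^2}{(2\pi)^6} \int d\mathbf{k} \int_{-\infty}^{\infty} d\omega\, e^{-i\omega t}\, \frac{\mathbf{k} \left[ \mathbf{E}(\mathbf{k}, \omega) * [\mathbf{k} \hat h^L_{ij}(k) \delta(\omega)] \right]}{k^2 (q k_D^2 + i\omega\tau^* + k^2)} . \tag{282}
\]
We can no longer proceed without an explicit form for the external electric field $\mathbf{E}(\mathbf{k}, \omega)$. In the case of a homogeneous monochromatic field of frequency $\bar\omega$, $\mathbf{E}(\mathbf{k}, \omega) = (2\pi)^4 \mathbf{E}_0\, \delta^{(3)}(\mathbf{k})\, \delta(\omega + \bar\omega)$. Inserting this into Eq. (282) and performing the inverse Fourier transform we get
\[
\delta\mathbf{E}_i(0, t) = \frac{q k_D^2 \mathbf{E}_0}{3\pi k_B T} \int_0^{\infty} dk\, \frac{k^2 \hat\rho_i^0(k) \hat\rho_j^0(k)}{(k^2 + q k_D^2 - i\bar\omega\tau^*)(\varepsilon k^2 + \hat\alpha(k))} , \tag{283}
\]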
where we have used the long-range part of the DIT pair correlation function in Eq. (167) and that the angular average of $\mathbf{k}(\mathbf{k} \cdot \mathbf{E}_0)$ is given by $4\pi k^2 \mathbf{E}_0 / 3$. It is evident from the above equation that the static structure of the fluid determines the response of the medium, as the perturbed part of the electric field acting on the bulk solution is given through the DIT linear response function $\hat\alpha$ and the short-range charge densities. The response of the system also depends on the mobilities present in the denominator of the integrand of Eq. (282), through the dependence on $q$ and $\tau^*$. The integral over the modulus of $k$ is performed via the calculus of residues, extending the values of $k$ to the entire complex plane and carrying out a contour integration around the upper half-plane. The relevant singularities are the poles of the denominator of the integrand in Eq. (279). These are given by $i\xi$ and $i\kappa$, as we have previously mentioned, where $i\xi$ is the zero of $k^2 + q k_D^2 - i\bar\omega\tau^*$ in the upper half-plane, with $\xi$ given by
\[
\xi = (q^2 k_D^4 + \bar\omega^2 \tau^{*2})^{1/4} \exp\left[ -\frac{i}{2} \arctan\left( \frac{\bar\omega\tau^*}{q k_D^2} \right) \right] , \tag{284}
\]
and $\kappa$ is the usual effective decay length that governs the leading asymptotic behaviour of the equilibrium pair correlations. Performing the integral in Eq. (278) we get
\[
\delta\mathbf{E}_i(0, t) = -\frac{q k_D^2 \mathbf{E}}{3 k_B T} \left[ \frac{\xi\, \hat\rho_i^0(i\xi) \hat\rho_j^0(i\xi)}{\hat\alpha(i\xi) - \varepsilon\xi^2} + \frac{\kappa\, q_i^* q_j^*}{\varepsilon^* (\xi^2 - \kappa^2)} \right] . \tag{285}
\]
In the above equation, we can see that the perturbed electric field acting on ion $i$ is made up of two distinct contributions. The first one comes from the renormalized charges $q_m^*$, and the other one involves the Fourier transform of the short-range charge densities evaluated at the complex pole $i\xi$. According to Ref. [55], these quantities cannot be interpreted as renormalized charges unless they are evaluated at a purely imaginary argument. This formal inconvenience disappears when the argument of the exponential in Eq. (284) is zero as, in this case, $\xi$ is real. This condition means that
\[
\frac{\bar\omega\tau^*}{q k_D^2} = m\pi , \tag{286}
\]
where $m$ is an integer. The physical interpretation of the above equation is as follows: the time of relaxation of the ionic atmosphere is given in terms of the solution concentration and the ionic frictional coefficients $\tilde\rho_l = 1/\mu_l^0$ by [99]
\[
\tau^* = \frac{1}{k_D^2 k_B T\, \overline{(1/\rho)}} , \tag{287}
\]
where $\overline{(1/\rho)}$ is defined through the relation:
\[
\overline{(1/\rho)} = \frac{\sum_i n_i q_i^2 / \tilde\rho_i}{\sum_i n_i q_i^2} . \tag{288}
\]
For a binary electrolyte it is straightforward to show that $\overline{(1/\rho)} = q/\tau^*$. Substituting in Eq. (286) and using Eq. (287), we get
\[
\bar\omega\tau^* = m\pi . \tag{289}
\]
The above equation states a relationship between the field frequency and the relaxation time of the ionic atmosphere for which the pole $i\xi$ in Eq. (284) is real and therefore, according to the general DIT
[5,55], the short-range charge distribution $\hat\rho_i^0(i\xi)$ can be interpreted as a real renormalized charge. Thus, for a given relaxation time of the ionic atmosphere, there exists a pole $i\xi_m$, with $\xi_m = \sqrt{q}\, k_D (1 + m^2\pi^2)^{1/4}$, where the response to the external field $\mathbf{E}$ can be expressed in terms of real renormalized charges and screening lengths. Particularizing the Fourier transform of Eq. (162) for $k = i\xi_m$ we get
\[
\hat\alpha(i\xi_m) = \sum_l n_l q_l q^*_{l,m} , \tag{290}
\]
where $q^*_{l,m} = \hat\rho_l^0(i\xi_m)$. The above equation is closely analogous to the normal expression for the Debye length and must be compared to Eq. (172). This result confirms the interpretation of the quantity $\hat\alpha(i\xi_m) = \varepsilon\kappa_m^2$ as defining a renormalized screening length associated with the size of the perturbed ionic cloud, and of $\hat\rho_l^0(i\xi_m)$ as a renormalized charge that comes from the polarization of the charge density of the quasiparticles due to the average field acting on ion $i$. It is given by
\[
q^*_{l,m} = -\frac{2\pi}{\sqrt{q}\, k_D} \int_0^{\infty} r\, dr\, \rho_l^0(r)\, e^{-\xi_m r} , \tag{291}
\]
which represents the charge inside a region of radius $\kappa_m^{-1}$ around the central ion. The distance $\kappa_m^{-1}$ may then be considered as the spatial range of ionic polarization due to the external field at the frequency $\bar\omega = m\pi/\tau$. This implies that the part of the ionic atmosphere of radius $\kappa_m^{-1}$ and charge $q^*_{i,m}$ may be interpreted as the object of the perturbation at those frequencies. For frequencies that do not satisfy the resonance condition in Eq. (289) this is no longer the case, and the response is complex in general. Solving the equation that defines the dynamic screening lengths, $\varepsilon\kappa_m^2 = \hat\alpha(i\xi_m)$, in the limit of small concentrations with the use of the MMSA linear response function [64] we get [90]
\[
\kappa_m = \frac{k_D}{\sqrt{1 + \phi + \left( q (1 + m^2\pi^2)^{1/2}/30 - \tfrac{1}{2} \right) (k_D \sigma)^2}} , \tag{292}
\]
where only terms up to the fourth order in $k$ have been taken into account. The above result closely resembles the one for the static DIT renormalized screening lengths in Eq. (209). In Fig. 14 we represent $\kappa_m \sigma$ against the Debye wavevector for different values of the frequency ($m$-parameter) for a volume fraction $\phi = 0.05$ and $q = 1/2$ (Ref. [99]). In that figure the increasing difference between the dynamic and static screening parameters with increasing concentration is evident. This means that in the presence of a field, the medium response is made up of the contribution of various quasiparticles besides the equilibrium ones.
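As a numerical aside on the dynamic pole of Eq. (284), the sketch below evaluates $\xi$ and checks that $k = i\xi$ is a root of $k^2 + q k_D^2 - i\bar\omega\tau^*$ lying in the upper half-plane. The function names and the single argument `omega_tau` (the product $\bar\omega\tau^*$, assumed to be expressed in the same units as $q k_D^2$) are illustrative choices, not notation from the text.

```python
import cmath
import math


def dynamic_pole(q, k_d, omega_tau):
    """Return xi of Eq. (284) for a given product omega_tau = omega_bar * tau_star.

    Assumes omega_tau is expressed in the same units as q * k_d**2.
    """
    modulus = (q**2 * k_d**4 + omega_tau**2) ** 0.25
    phase = -0.5 * math.atan2(omega_tau, q * k_d**2)
    return modulus * cmath.exp(1j * phase)


def pole_residual(q, k_d, omega_tau):
    """Evaluate k^2 + q k_D^2 - i*omega_tau at k = i*xi (should vanish)."""
    k = 1j * dynamic_pole(q, k_d, omega_tau)
    return k**2 + q * k_d**2 - 1j * omega_tau


q, k_d = 0.5, 2.0
# Static limit: xi reduces to sqrt(q) * k_D
print(dynamic_pole(q, k_d, 0.0), math.sqrt(q) * k_d)
# Finite frequency: the residual vanishes to rounding error
print(abs(pole_residual(q, k_d, 3.0)))
# Modulus at omega_tau = m*pi*q*k_D^2 (m = 1) versus sqrt(q) k_D (1 + pi^2)^(1/4)
print(abs(dynamic_pole(q, k_d, math.pi * q * k_d**2)),
      math.sqrt(q) * k_d * (1 + math.pi**2) ** 0.25)
```

The static value $\sqrt{q}\,k_D$ and the high-frequency phase of $-\pi/4$ both follow directly from Eq. (284), so they serve as quick sanity checks on any implementation of the pole.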
We shall come back to this idea in the next subsection when we study the perturbed potential in the neighbourhood of a bulk ion. The case of an external static field is recovered in the limit of vanishing frequency $\bar\omega$ ($m = 0$). In this limit $\xi \to \sqrt{q}\, k_D$, as follows from Eq. (284). Taking this limit in Eq. (285) for the perturbed electric field we get:
\[
\delta\mathbf{E}_i(0) = \frac{q k_D^2 \mathbf{E}_0}{3 k_B T} \left[ \frac{\sqrt{q}\, k_D\, \hat\rho_i^0(i\sqrt{q}\, k_D) \hat\rho_j^0(i\sqrt{q}\, k_D)}{\hat\alpha(i\sqrt{q}\, k_D) - \varepsilon q k_D^2} + \frac{\kappa\, q_i^* q_j^*}{\varepsilon^* (q k_D^2 - \kappa^2)} \right] . \tag{293}
\]
The pole $i\kappa$ in the second term on the rhs of the last equation comes directly from the distribution function and is a consequence of the structure of the medium and the screened electrostatic interaction that exists in the bulk solution of quasiparticles. Using the pole of the integrand of Eq. (283) with the smallest imaginary part, $i\kappa$, is a consequence of the $i$ ion being under the asymptotic potential created
Fig. 14. Dynamic decay length $\kappa_m$ in units of the ionic radius as a function of the Debye length for a volume fraction of $\phi = 0.05$ and $q = 0.5$. The solid line corresponds to $m = 1$ and the dot-dashed line is the static decay length of the usual DIT as given by the MMSA approximation (Ref. [89]).
by the surrounding ions, and the pole $i\kappa$ is related to the asymptotic behaviour of the DIT functions, so the field $\delta\mathbf{E}(0)$ is the one created by the external field and the medium, i.e. the Maxwell field. The contribution coming from this pole involves the renormalized quantities of the usual DIT. On the other hand, the pole $i\sqrt{q}\, k_D$ in the first term on the rhs of Eq. (293) is purely dynamic in origin and is related to the behaviour of the medium in the presence of an external field. Its contribution to the perturbed electric field acting on ion $i$ involves the DIT charge densities and structure factor evaluated at that pole, $\hat\rho_i^0(i\sqrt{q}\, k_D)$ and $\hat\alpha(i\xi)$ respectively. In the static case, as we have already shown in Eq. (289), there is no problem in interpreting the short-range charge distributions as real renormalized charges for low enough densities. Taking the limit of infinite dilution ($n_i \to 0$) in Eq. (293) we get
\[
\delta\mathbf{E}_i^0(0) = -\frac{q_i q_j\, q k_D \mathbf{E}}{3\varepsilon (1 + \sqrt{q}) k_B T} , \tag{294}
\]
where we have used that in this limit $\rho_i^0(r)$ goes to $q_i \delta^{(3)}(\mathbf{r})$ (equivalently, $\hat\rho_i^0(k)$ tends to $q_i$) and $\hat\alpha(k)$ to $\varepsilon k_D^2$ for all $k$. At the same time, $\varepsilon^*$ goes to $\varepsilon$, the dielectric constant of the solvent, and the renormalized charges $q_i^*$ tend to the bare ion charges. Eq. (294) is the DFO expression
for the perturbed electric field acting on ion $i$, so the exact DIT theory approaches the DFO theory in the limit of infinite dilution, as may be expected. In this limit, both screening lengths are equal to the Debye length and the renormalized quantities recover their classical expressions. In the limit of high frequency, $\bar\omega\tau^* \gg q k_D^2$, $\xi$ goes like $(\bar\omega\tau^*)^{1/2} e^{-i\pi/4}$. As we have previously pointed out, the evaluation of the DIT functions at a complex pole allows no physical interpretation in terms of renormalized charges and screening lengths. This difficulty may be avoided by taking the limit of infinite dilution, where these functions tend to their classical DFO expressions. Taking these limits in Eq. (283) and taking the real part of the average field acting on ion $i$ we get
\[
\delta\mathbf{E}_i(0, t) = -\frac{|q_i q_j|\, q k_D^2 \mathbf{E}_0}{3\varepsilon k_B T}\, \frac{k_D \cos(\bar\omega t) + (\bar\omega\tau^*)^{1/2} \cos(\bar\omega t + \pi/4)}{k_D^2 + \bar\omega\tau^* + (2\bar\omega\tau^*)^{1/2} k_D} . \tag{295}
\]
Considering that the Debye constant is negligible in this regime compared to the frequency-dependent terms, the perturbed electric field acting on ion $i$ becomes
\[
\delta\mathbf{E}_i(0, t) = -\frac{|q_i q_j|\, q k_D^2 \mathbf{E}_0}{3\varepsilon k_B T}\, \frac{\cos(\bar\omega t + \pi/4)}{(\bar\omega\tau^*)^{1/2}} . \tag{296}
\]
As we may see in the above equation, in the limit of infinite dilution and high frequencies, the perturbed field is $\pi/4$ out of phase with the external field. This effect is produced by the relaxation time of the ionic cloud, which acts as a capacitor [102], introducing a temporal delay in the electric response to the external potential. The phase factor is completely independent of structural parameters and even of the nature of the substances in solution. It originates in the behaviour of the poles $\pm i\xi$, which are constrained to a convex region of the complex plane varying between $\pm i\sqrt{q}\, k_D$ at vanishing frequency and the high-frequency limit in which their arguments are $\pm\pi/4$. This fact is due to the form of the continuity equation and its second-order dependence on the space derivatives, which introduces a quadratic dependence on $k$ in the denominator of Eq. (279). An asymptotic decay as $1/\bar\omega^{1/2}$ is also predicted for the frequency dependence of the perturbed average field. The validity of the high-frequency approximation made above is limited to the higher frequencies inside the hydrodynamic regime, where the system is governed by classical fluid mechanics continuity equations like Eq. (270). We shall work in the asymptotic regime for the calculation of the radial dependence of the perturbed potential. Inverting the Fourier transform in Eq. (279) with $\omega = 0$ and using the expression for the pair correlation in Eq. (167), we get:
\[
\delta\psi_i(\mathbf{r}) = \frac{i q k_D^2}{(2\pi)^3 k_B T} \int d\mathbf{k}\, e^{i\mathbf{k} \cdot \mathbf{r}}\, \frac{(\mathbf{k} \cdot \mathbf{E})\, \hat\rho_i^0(k) \hat\rho_j^0(k)}{k^2 (k^2 + q k_D^2)(\varepsilon k^2 + \hat\alpha(k))} . \tag{297}
\]
A straightforward calculation shows that the perturbation vanishes when $\mathbf{E} \cdot \mathbf{r} = 0$. We are mainly interested in the longitudinal perturbation. Taking the electrostatic external field to be parallel to the $z$ direction, and recalling the inversion symmetry of the charge density and the linear response function in $k$-space, we obtain, in the asymptotic regime:
\[
\delta\psi_i(r) = \frac{q k_D^2}{4\pi k_B T} \left[ A \left( 1 + \frac{1}{\sqrt{q}\, k_D r} \right) \frac{e^{-\sqrt{q}\, k_D r}}{r} + B \left( 1 + \frac{1}{\kappa r} \right) \frac{e^{-\kappa r}}{r} - \frac{\hat\rho_i^0(0) \hat\rho_j^0(0)}{q k_D^2\, \hat\alpha(0)}\, \frac{1}{r^2} \right] , \tag{298}
\]
where $A$ and $B$ are constants given in terms of DIT parameters by
\[
A = \frac{\hat\rho_i^0(i\sqrt{q}\, k_D) \hat\rho_j^0(i\sqrt{q}\, k_D)}{\sqrt{q}\, k_D \left( \hat\alpha(i\sqrt{q}\, k_D) - \varepsilon q k_D^2 \right)} , \qquad B = \frac{q_i^* q_j^*}{\varepsilon^* \kappa (q k_D^2 - \kappa^2)} . \tag{299}
\]
The above potential is made up of three terms. For the interpretation of this result we shall assume that the electrostatic field comes from a test charge placed at the origin of the bulk solution and surrounded by its own ionic atmosphere, so that the above perturbed electrostatic potential is the interaction potential between ions. The first term in Eq. (298) contains the contribution of the polarization charge around an $i$ ion and has a dynamical origin, as it is purely due to the external field and involves the ionic mobilities. This term accounts for the effect of the net charge contained in a region of radius $1/(\sqrt{q}\, k_D)$ around the central ion. The second term is made up of the typical DIT quantities (effective charges, renormalized screening lengths and dielectric constant), and thus it can be interpreted as a structural term. It accounts for the contribution to the potential of the effective ionic charges made up from the bare ion (or colloid) charges in the atmosphere of ion $i$ contained in a region of radius $1/\kappa$. The third term corresponds to the net contribution of the bare ion and the internal part of the ionic cloud, since the charges $\hat\rho_i^0(0)$ are the total short-range charge around a given bare particle:
\[
\hat\rho_i^0(0) = \int \rho_i^0(r)\, d\mathbf{r} . \tag{300}
\]
The perturbed electrostatic potential around a given ion is thus composed of different screened terms corresponding to the various layers that form the ionic atmosphere. The relative charge and volume of the layers, as well as the concentration, determine the interaction of the quasiparticles in the medium, and very different behaviours are obtained, ranging from DLVO-like interaction to pure repulsion or attraction between ionic or colloid species. This is shown in Fig. 15 for different values of the parameters $A$, $B$ and the renormalized charges. In that figure we represent the radial dependence of the perturbed electrostatic potential between oppositely charged ions.
The ionic atmospheres of these ions are also oppositely charged on average, so the renormalized charges are expected to be of different signs, just like the long-range parts of the ionic atmospheres responsible for the $e^{-\kappa r}/r$ term and the polarization charges that arise from the $i\sqrt{q}\, k_D$ pole. As these different contributions become important (as distance decreases), a range of behaviours from pure repulsion to attraction between the different layers around the central ions is found.

9.2. Electrophoretic effect

The classical DFO formalism consists of combining the hydrodynamic continuity equation and the Navier–Stokes equation with the DH equilibrium pair correlations to obtain the concentration dependence of the transport coefficients of electrolyte solutions. As we have previously mentioned, the use of the equilibrium DIT in the framework of a hydrodynamic transport scheme is the basis of the DITT. We have just presented the DIT analysis of the relaxation of the ionic atmosphere, based on the combination of the hydrodynamic continuity equation with the DIT pair correlations [89]. Now we use these equilibrium distribution functions and
Fig. 15. Behaviour of the longitudinal part of the perturbed electric potential $\delta\psi(r)$ in the neighbourhood of a bulk ion as a function of the radial distance $r$, at dynamic and renormalized DIT screening lengths of $\sqrt{q}\, k_D = 1$ and $\kappa = 4$, respectively, for various values of the parameters $A$, $B$ and $C = \hat\rho_i^0(0) \hat\rho_j^0(0) / (q k_D^2\, \hat\alpha(0))$ in Eq. (298). The solid line is for $A = 1.2$, $B = -0.83$ and $C = -1$; the long-dashed line is for $A = 1$, $B = 0.83$ and $C = -1$; and the dotted line is for $A = 1$, $B = -0.83$ and $C = -1$.
the Navier–Stokes equation for the analysis of the electrophoretic effect, the other effect responsible for the concentration dependence of the transport coefficients of the ionic solution. The motion of an ion through a viscous medium distorts the velocity field around it, as it tends to drag with it the solution in its vicinity, and therefore the ions in its atmosphere do not move in a medium at rest. This fact constitutes the so-called electrophoretic effect. Obviously it is a concentration-dependent effect, and so its interpretation depends on the degree of accuracy with which the equilibrium distribution function is known. The equilibrium structure of the fluid is governed, as usual, by the DIT equilibrium pair correlation, neglecting the effect of the dissymmetry of the ionic cloud on conduction in weak fields. Thus electrophoresis is due to hydrodynamic interactions between the ions and the solvent molecules. In the linear stationary regime the velocity field $\mathbf{v}$ of a fluid under the effect of an external field $\mathbf{E}$ is related to its structure and hydrostatic pressure $P$ through the Navier–Stokes equation [100]:
\[
\eta \nabla^2 \mathbf{v} - \nabla P + \mathbf{E}\, \delta\rho(\mathbf{r}) = 0 , \tag{301}
\]
where $\eta$ is the solvent viscosity (the contributions of the solute particles to this transport coefficient are neglected), and $\delta\rho(\mathbf{r})$ is the local deviation of the charge density from its average value. Fourier transforming the above equation and using the incompressibility hypothesis ($\nabla \cdot \mathbf{v} = 0$) we get
\[
\hat P(\mathbf{k}) = -\frac{i\, \mathbf{k} \cdot \mathbf{E}}{k^2}\, \delta\hat\rho(\mathbf{k}) . \tag{302}
\]
Substituting the above equation into the Fourier transform of Eq. (301) we get for the velocity field:
\[
\hat{\mathbf{v}}(\mathbf{k}) = \frac{\delta\hat\rho(\mathbf{k})}{\eta}\, \frac{k^2 \mathbf{E} - \mathbf{k}(\mathbf{k} \cdot \mathbf{E})}{k^4} . \tag{303}
\]
As can be seen in this equation, the velocity field of an incompressible fluid under the effect of an applied electric field is determined solely by that field and the static charge structure $\delta\hat\rho(\mathbf{k})$. For weak perturbations, the latter may be expressed as a function of the equilibrium distributions by means of the usual relation (12):
\[
\delta\hat\rho(\mathbf{k}) = \sum_j q_j n_j \hat h_{ij}(k) , \tag{304}
\]
where we have used the electroneutrality hypothesis. Substituting the above expression in Eq. (303), and inverting the Fourier transform evaluated at $\mathbf{r} = 0$, we get for the velocity of the surrounding ionic cloud relative to the bare $i$ particle, $\mathbf{v}_i(0)$:
\[
\mathbf{v}_i(0) = \frac{1}{(2\pi)^3 \eta} \sum_j q_j n_j \int d\mathbf{k}\, \hat h_{ij}(k)\, \frac{k^2 \mathbf{E} - \mathbf{k}(\mathbf{k} \cdot \mathbf{E})}{k^4} . \tag{305}
\]
Evaluating the angular part of the integral for a homogeneous and isotropic medium and a uniform applied field, and using the DIT pair correlation function in Eq. (167), we get
\[
\mathbf{v}_i(0) = \mathbf{v}_i^0 - \frac{\mathbf{E}}{3\pi^2 \eta k_B T} \sum_j q_j n_j \int_0^{\infty} dk\, \frac{\hat\rho_i^0(k) \hat\rho_j^0(k)}{\varepsilon k^2 + \hat\alpha(k)} , \tag{306}
\]
where $\mathbf{v}_i^0$ stands for the contribution of the short-range part of the pair correlation to the velocity field in the neighbourhood of ion $i$. It represents the electrophoretic contribution of the short-range part of the ionic atmosphere. As a first step in the calculation of the electrophoretic velocity correction we shall consider this short-range part to be linked to the bare central particle, therefore ignoring possible contributions to the velocity field of short-range forces, surface conduction and other electric double-layer effects [213] that would be naturally contained in the term $\mathbf{v}_i^0$. This implies that the central ion and the short-range part of the surrounding charge distribution move as one kinetic unit due to a kind of electrostatic physisorption that is different from specific ion binding or ionic association [5]. The integral over the modulus of $k$ in Eq. (306) contains the contributions of the long-range part of the pair correlation and must be performed through the calculus of residues. Once again, we extend the values of $k$ to the entire complex plane and perform a contour integration around the upper half-plane. For low concentrations, the $i$ ion moves in the asymptotic tail of the average potential of the $j$ ions in its neighbourhood, so one must work in the asymptotic regime, and therefore the relevant singularity is the zero of the denominator of Eq. (306) with the smallest imaginary part
[5,55] as defined by Eq. (168). Performing the integral we get for the electrophoretic correction to the velocity:
\[
\mathbf{v}_i(0) = \mathbf{v}_i^0 - \frac{q_i^* \mathbf{E}}{6\pi\eta k_B T \varepsilon^* \kappa} \sum_j q_j n_j q_j^* , \tag{307}
\]
where we have used the usual definitions of the DIT quantities. Combining Eqs. (172), (307) and the above result we get the DITT expression for the electrophoretic increment of the central ion velocity:
\[
\Delta\mathbf{v}_i = \mathbf{v}_i(0) - \mathbf{v}_i^0 = -\frac{q_i^* \kappa \mathbf{E}}{6\pi\eta (\varepsilon^*/\varepsilon)} . \tag{308}
\]
The above equation clearly resembles Onsager's expression [214]:
\[
\Delta\mathbf{v}_i = -\frac{q_i k_D \mathbf{E}}{6\pi\eta} . \tag{309}
\]
This equation is in fact recovered in the limit of vanishing concentration, where the bare parameters are recovered [5,55,64]. This is due to the internal structure of the equilibrium DIT, which in its asymptotic form replaces the DH quantities with the renormalized ones, in what could be called the trivial modifications. In Eq. (308), however, we see that DITT introduces a non-trivial dependence on the renormalized dielectric constant, which is a prediction of the DITT and could not have been predicted in the original DH theoretical scheme. This factor tends to unity as the concentration vanishes and therefore its dependence is lost in the DH expression. It is therefore an exclusive DITT prediction for the ionic velocity increment due to the electrophoretic effect.

9.3. Formulation of the DITT conductance equation

In the previous section we have dealt with the DITT treatment of the electrophoretic velocity increment of an ion immersed in an electrolyte solution. This effect is responsible for the concentration dependence of the transport coefficients of charged fluids, together with the relaxation field induced by the distortion of the ionic atmosphere under the effect of the external field, treated in Section 9.1. The main effect of these processes is a reduction of the mobility of the charged particles in the bulk fluid with respect to its limiting (ideal) value due to the existence of long-range electrostatic interactions. Thus, the macroscopic conductance of the charged fluid is expected to be a decreasing function of concentration and, in fact, this is the predicted limiting-law behaviour. However, as concentration increases, the measured conductance starts to deviate above (complete dissociation) and below (incomplete dissociation) the limiting-law values, suggesting in both cases a modification of the electrostatic interaction.
This behaviour has been explained in terms of extensions of Onsager's limiting law of different nature, depending on whether ionic association exists or not. In the first case the deviations from the limiting law were attributed to the existence of ionic pairs [13] in the medium, and in the second one they have been interpreted as due to the mathematical simplifications involved in the derivation of the limiting law, or to the restricted applicability of the pair distributions derived from DH theory [99]. In the following we shall formulate the DITT conductance equation based on the exact DIT equilibrium formalism.
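Before the conductance equation is built, the electrophoretic increments of Eqs. (308) and (309) can be compared numerically. The sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, all quantities are taken to be in one consistent unit system, and the renormalized inputs (`q_eff` for $q_i^*$, `kappa` for $\kappa$, `eps_ratio` for $\varepsilon^*/\varepsilon$) must be supplied from a separate DIT calculation that is not reproduced here.

```python
import math


def onsager_increment(q_i, k_d, field, eta):
    """Electrophoretic velocity increment of Eq. (309) (bare DH quantities)."""
    return -q_i * k_d * field / (6.0 * math.pi * eta)


def ditt_increment(q_eff, kappa, field, eta, eps_ratio):
    """Electrophoretic velocity increment of Eq. (308).

    q_eff, kappa and eps_ratio stand for q_i^*, kappa and eps^*/eps;
    their values are inputs, not computed here.
    """
    return -q_eff * kappa * field / (6.0 * math.pi * eta * eps_ratio)


# Hypothetical dimensionless inputs, chosen only to exercise the formulas
bare = onsager_increment(q_i=1.0, k_d=1.0, field=1.0, eta=1.0)
dressed = ditt_increment(q_eff=0.9, kappa=1.1, field=1.0, eta=1.0, eps_ratio=1.05)
print(dressed / bare)   # equals (q_eff/q_i) * (kappa/k_d) / eps_ratio
```

Setting `q_eff = q_i`, `kappa = k_d` and `eps_ratio = 1` makes the two functions coincide, which mirrors the vanishing-concentration limit described in the text.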
We shall demonstrate that, maintaining the formal aspect of the limiting law, it predicts the behaviour of electrolyte solutions on the basis of the mathematical formulation of the intuitive idea of the formation of e2ective or renormalized particles in the bulk that act as kinetic entities. The velocity of the central renormalized charge qi∗ in a bulk binary electrolyte solution is given by the combined e2ect of the total eld acting on it, E + "Ei (0), and the electrophoretic correction due to the medium velocity. "Ei (0) is the perturbed part of the electric eld acting on ion i due to the distortion of the ionic atmosphere induced by the static homogeneous electric eld E and is given by Eq. (293). Using this result and the electrophoretic velocity increment in Eq. (308) we get for this velocity:
qi∗ + "Ei (0) 0 − E ; (310) v i = :i E 1 + E 6>'(&∗ =&) where :i0 = qi Di0 =kB T is the mobility of the i species at in nite dilution and Di0 is the limiting di2usion coeJcient. Employing the de nition of the conductances of ions ,i = ,i0 vi =:i0 E where ,i0 is the limiting equivalent conductance of ion i, and adding for both kinds of ions of the electrolyte, we may write . = .0 − S(.) (c)I 1=2 ;
(311)
0 where equivalent conductance at in nite dilution equals 0 . is the equivalent conductance, . , the 2 , by virtue of Kohlrausch relation [99], I = m m m cm zm is the ionic strength, with zm the valence and cm the molar concentration of species m. On the other hand, S(.) (c) is a function of concentration de ned by the following equation: # "√ 1=2
√ √ +|qi∗ qj∗ | qkD |2ˆ0i (i qkD )2ˆ0j (i qkD )| NA e 2 qkD .0 + ∗ 2 S(.) (c) = √ ˆ 2 1000&kB T 3kB T & (qkD − +2 ) (i ˆ qEk D ) − &qkD
+
NA e 2 1000&kB T
1=2
(+=kD ) ∗ |q | : 6>'(&∗ =&) m m
(312)
Eq. (311) is the DITT conductance equation, and it maintains the formal aspect of the universally valid Onsager’s limiting law of conductance of ionic solutions. The difference with the latter arises from the fact that the slope of the above equation is not constant but depends on concentration. In fact it is an extension of the latter to high concentrations, and it is formally exact. DITT predicts an extended limiting law with a concentration-dependent slope that, in the limit of low concentrations, recovers the classical DFO result. This generalized version of the limiting law is capable of accounting for the concentration dependence of the deviations of conductance from its limiting law values in terms of a progressive renormalization of the charge of the kinetic units. This is the transport equivalent of what happens in DIT equilibrium electrolyte theory, which allows the reformulation of the equilibrium DH results in terms of renormalized quantities. The basic assumptions of DITT are that the DIT equilibrium distribution functions are preserved under weak perturbations and that there are effective kinetic entities in the bulk formed by the ions and the inner parts of their ionic clouds. These dressed particles are assumed to be identifiable entities after a number of collisions with solvent molecules. These hypotheses lead to a reformulation of the classical DFO formalism, which constitutes the fundamental aim of DITT.
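The structure of Eq. (311) can be illustrated with a minimal Python sketch. It only shows how a concentration-dependent slope modifies the limiting-law form; the function names, the numerical values and the linear slope model are hypothetical placeholders and are not taken from the DITT expressions.

```python
from typing import Callable


def equivalent_conductance(lambda0: float, ionic_strength: float,
                           slope: Callable[[float], float]) -> float:
    """Evaluate the form of Eq. (311): Lambda = Lambda0 - S * I**0.5.

    `slope` returns S at the given ionic strength. A constant function
    gives the classical limiting law, anything else an extended law.
    """
    if ionic_strength < 0.0:
        raise ValueError("ionic strength must be non-negative")
    return lambda0 - slope(ionic_strength) * ionic_strength ** 0.5


# Hypothetical placeholder values (arbitrary units), not taken from the text.
LAMBDA0 = 150.0
S_LIMIT = 95.0


def constant_slope(ionic_strength: float) -> float:
    """Concentration-independent slope: the classical limiting law."""
    return S_LIMIT


def varying_slope(ionic_strength: float) -> float:
    """Placeholder concentration dependence, NOT the DITT slope of Eq. (312)."""
    return S_LIMIT * (1.0 - 0.5 * ionic_strength)


for i_strength in (0.0, 0.01, 0.1):
    classical = equivalent_conductance(LAMBDA0, i_strength, constant_slope)
    extended = equivalent_conductance(LAMBDA0, i_strength, varying_slope)
    print(f"I={i_strength:5.2f}  limiting law={classical:8.3f}  extended={extended:8.3f}")
```

Both curves coincide at vanishing ionic strength, which mirrors the statement that the extended law recovers the classical result in the low-concentration limit.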
For sufficiently low densities, we may identify the short-range charge densities $\hat{\rho}_m^0(k)$ evaluated at $i\sqrt{q}\,k_D$ and at $i\kappa$, as these functions tend to a constant value in the limit of low $k$ [64]. Thus, we shall approximate $\hat{\rho}_m^0(i\sqrt{q}\,k_D) \simeq \hat{\rho}_m^0(i\kappa)$. This entails no loss of generality, as we are dealing with static fields and the distinction is only relevant for finite frequencies, as we have seen before. On the other hand, the renormalized dielectric constant may also be approximated, without appreciable error, by its usual value (the dielectric constant of the solvent in the primitive model) up to concentrations of 1 M, equivalent to volume fractions of $10^{-2}$ [64]. Using these approximations we may rewrite Eq. (312) in the form:

$$S_{(\Lambda)}(c) = \left(\frac{N_A e^2}{1000\,\varepsilon k_B T}\right)^{1/2}\left[\frac{q\,|q_i^* q_j^*|\,\Lambda^0}{3\varepsilon k_B T}\,\frac{1}{(\kappa/k_D) + \sqrt{q}} + \frac{(\kappa/k_D)}{6\pi\eta}\sum_m |q_m^*|\right]. \tag{313}$$

From the above equation it follows that the whole transport formalism is highly dependent on the particular model of the renormalized non-Debye equilibrium quantities, both charges and screening lengths. A further simplification is possible in the case of symmetric electrolytes, because in these cases there exists a simple relation among renormalized charges and screening lengths given by the equation:

$$\left(\frac{\kappa}{k_D}\right)^2 = \frac{q_m^*}{q_m} = \tau, \tag{314}$$

where $q_m^*/q_m$ is constant [55] for all species, and we have used the definition of the renormalization parameter, $\tau$. Using the above relationship we can rewrite Eq. (312) in the form:

$$S_{(\Lambda)}(c) = \left(\frac{N_A e^2}{1000\,\varepsilon k_B T}\right)^{1/2}\left[\frac{q\,|q_i|^2\,\Lambda^0}{3\varepsilon k_B T}\,\frac{\tau^2}{\tau^{1/2} + \sqrt{q}} + \frac{\tau^{3/2}\,|q_i|}{3\pi\eta}\right]. \tag{315}$$

Thus, in the case of symmetric electrolytes the deviations of conductance from its limiting law predictions can be expressed in terms of the DITT renormalization parameter only. In the limit of low concentrations the renormalized quantities tend to the bare ones, so $\tau \to 1$ and then we recover Onsager’s classical expression [99]:

$$S_{(\Lambda)}(c) = \left(\frac{N_A e^2}{1000\,\varepsilon k_B T}\right)^{1/2}\left[\frac{q\,|q_i|^2\,\Lambda^0}{3\varepsilon k_B T}\,\frac{1}{1 + \sqrt{q}} + \frac{|q_i|}{3\pi\eta}\right]. \tag{316}$$

It is important to point out that the deviations from Onsager’s result predicted in Eq. (313) (or its equivalent for symmetric electrolytes, Eq. (315)) are non-trivial predictions of DITT and could not have been inferred from a direct substitution of renormalized charges in Onsager’s classical equation. The introduction of DIT equilibrium pair correlations in the FO hydrodynamic formalism is therefore essential to obtain the transport equations.

9.4. Comparison to experimental results

As we have proved in the previous sections, DITT consists of a reformulation of the FO transport formalism using the DIT equilibrium distribution functions. As we have seen, this new theoretical scheme allows the interpretation of the concentration-dependent phenomena in bulk solution in terms of renormalized or effective parameters, and leads to a generalized version of Onsager’s limiting law of conductance given by Eq. (311). This equation states that the conductance of an electrolyte
or colloid system may be expressed as a $c^{1/2}$ law with a concentration-dependent slope. This is a formally exact result, and the DITT allows the calculation of this slope in terms of a renormalization of the charge of the kinetic units, as stated in Eqs. (313) and (315). The analysis of the predictions of these DITT results, and the comparison with experimental data, is the main aim of this section. As follows from Eq. (308), the DITT calculation of the electrophoretic velocity correction leads to a non-trivial dependence of this magnitude on the DIT renormalized dielectric permittivity of the medium ($\varepsilon^*$). We have seen previously that the latter is made up of contributions of both the dielectric medium ($\varepsilon$) and the ionic atmosphere polarization through the linear response function term $\hat{\alpha}(i\kappa)/2i\kappa$, and that it is directly related to the susceptibility of the ionic system. Thus, the contributions to the susceptibility of the electrolyte system due to the polarization of the ionic atmosphere in the neighbourhood of an ion can be included in a renormalized dielectric constant through which the dressed ions interact, as well as the contributions of the atmosphere which are present in the renormalized DIT charges and screening length. The renormalized dielectric constant of an OCCS [64] in the PM can be obtained by evaluating the derivative of the DIT/MMSA calculated $\hat{\alpha}$-function in Eq. (201) at the leading singularity of the DIT pair correlations, $i\kappa$:
&∗ +* 1 36 = cosh(+*) − 36 (2) − 1) − 2+* − sinh(+*) ; (317) & 2(kD *)2 +* (+*)2
where $\xi = l_B/\sigma$ is the usual dimensionless coupling parameter that measures the ratio of the Bjerrum length (or association length) to the mean ionic radius. As can be readily seen in the above equation, the behaviour of the renormalized dielectric constant is controlled by the coupling parameter $\xi$. The degree of coupling is therefore assimilated to the nonidealities due to the existence of ionic association.
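As a numerical companion to the symmetric-electrolyte slope of Eq. (315) cited above, the following Python sketch separates the slope into its relaxation and electrophoretic parts and rescales each with the renormalization parameter of Eq. (314). It is an illustration under stated assumptions: `relax0` and `electro0` are hypothetical names for the two Onsager contributions of Eq. (316), their numerical values are placeholders, and the default `q = 0.5` is the value Onsager's theory assigns to symmetric electrolytes, assumed here because the text does not quote it.

```python
import math


def ditt_slope_symmetric(tau: float, relax0: float, electro0: float,
                         q: float = 0.5) -> float:
    """Slope of Eq. (315) written relative to its tau = 1 limit, Eq. (316).

    relax0   -- relaxation part of the Onsager slope (tau = 1)
    electro0 -- electrophoretic part of the Onsager slope (tau = 1)
    tau      -- renormalization parameter of Eq. (314), (kappa/k_D)**2
    """
    if tau <= 0.0:
        raise ValueError("tau must be positive")
    sqrt_q = math.sqrt(q)
    # Relaxation term scales as tau**2 / (tau**0.5 + sqrt(q)).
    relaxation = relax0 * tau ** 2 * (1.0 + sqrt_q) / (math.sqrt(tau) + sqrt_q)
    # Electrophoretic term scales as tau**1.5.
    electrophoretic = electro0 * tau ** 1.5
    return relaxation + electrophoretic


# Placeholder contributions in arbitrary units (hypothetical values).
RELAX0, ELECTRO0 = 34.0, 60.0

for tau in (1.0, 0.95, 0.9):
    print(f"tau={tau:4.2f}  slope={ditt_slope_symmetric(tau, RELAX0, ELECTRO0):7.3f}")
```

Setting `tau = 1` returns the sum of the two Onsager contributions, and values of `tau` below unity lower the slope, which is the qualitative behaviour expected when the renormalized charge is smaller than the bare one.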
Despite its non-physical features [64,103], such as the impossibility of ionic association, the OCCS results for the renormalized dielectric constant exhibit a dependence on the medium polarization through the dependence on the coupling parameter. The MMSA predictions for the effective screening constant lead to the existence of a transition for low $\xi$ values. Fig. 16 shows the behaviour of the renormalized dielectric constant in the MMSA scheme for systems of low coupling parameter. The transition from the low- to the high-coupling regime of $\varepsilon^*$ in the MMSA scheme is registered at $\xi > 1/3$, where the renormalized dielectric constant shows a change in the sign of the second derivative of $\varepsilon^*(\phi)$ due to this transition. The mobility $\mu$ of a DIT quasiparticle, defined through the relation $v = \mu E$, is deeply modified in DITT with respect to that predicted by the classical Henry’s law [215]. Substituting the renormalized charges and screening length calculated through the expressions derived in the MMSA scheme for an OCCS in the RPM in Eq. (308), together with the low volume fraction expansion of the renormalized dielectric constant in Eq. (317), we get

$$\delta = \frac{1}{1 + (1 - 3\xi)\,\phi\,[1 + (3\phi/2)(1 - 2\xi)]}, \tag{318}$$

where $\delta = \mu(\phi)/\mu_0(\phi)$ stands here for the electrophoretic mobility ratio between the DITT mobility and Onsager’s limiting law value, $\mu_0 = k_D/6\pi\eta$. This function is represented against concentration in Fig. 17 for different values of the coupling parameter $\xi$, where it can be seen that the mobility of a DIT renormalized quasiparticle suffers a progressive reduction from its limiting value with concentration. This behaviour can be explained in terms of the decrease of the diffusion constant of the renormalized quasiparticles due to an increment of the interaction parameters such as renormalized charges, screening lengths and effective radius. The mobility ratio $\delta$ tends to its limiting
Fig. 16. DIT renormalized dielectric constant of an electrolyte solution against volume fraction for different values of the coupling constant $\xi = q^2/\varepsilon\sigma$ in the MMSA approximation as predicted by Eq. (317). The long-dashed curve is for $\xi = 0.1$, the full curve is for $\xi = 0.3$, the dotted line is for $\xi = 0.4$ and the dot-dashed line is for $\xi = 0.41$ (Ref. [90]).
value at high-coupling $\xi$ values or, equivalently, for low ionic radius at constant Bjerrum length. Fig. 17 therefore exhibits the effect of DIT renormalization of the colloidal quantities on the particle mobility in the bulk solution. The architecture of the DIT quasiparticles is directly responsible for this behaviour because of the increase in the effective radius of the particle due to the adhesion of the short-range part of the ionic cloud to the bare particle (dressed particle). This produces an increase of the effective hydrodynamic radius within which the material is supposed to move as a rigid body and which represents the limit of validity of the hydrodynamic equations [216]. As in the case of the relaxation effect, the inner part of the ionic cloud surrounding the ionic or colloid particle is supposed to be unperturbed both by the external electric field and by the solvent-mediated hydrodynamic interaction, so the renormalized particle becomes the kinetic entity of the transport phenomena [90]. This provides new physical insight into the construction and structure of the effective transport quantities that were hitherto used as adjustable parameters. Besides this analysis of the electrophoretic mobility, for a complete understanding of the DITT conductance equations, a model of the renormalized parameters is needed. For vanishing concentrations, the asymptotic expansion of Kjellander and Mitchell [55] is valid (see Eq. (135)). As we can see in that result, for charge symmetric electrolytes the leading
Fig. 17. Representation of the mobility ratio of electrolyte solutions in Eq. (318) against volume fraction for different values of the coupling constant $\xi$. The full curve is for $\xi = 0.1$, the long-dashed curve is for $\xi = 0.2$, the short-dashed curve is for $\xi = 0.3$ and the dotted curve is for $\xi = 0.4$ (Ref. [90]).
correction is the term given by the square of the expansion parameter times its logarithm, and the effective charge is the same for all species, whereas for asymmetric electrolytes the first term dominates over the second. At low concentrations, where the expansion parameter is much smaller than unity, the renormalized charge is equal for both species and smaller than the bare ion charge, and for asymmetric electrolytes the effective charge is greater for the ions with the highest valency and smaller for the other ions. The same applies for the decay lengths [55]. It is highly interesting to point out that with this model of the effective charges, the low concentration expansion of Eq. (315) (taken in the limit in which the product of the expansion parameter and its logarithm tends to zero) reads

$$\Lambda = \Lambda^0 - S^0_{(\Lambda)}\,c^{1/2} + A\,c^{3/2}\ln c + B\,c^{3/2}, \tag{319}$$

where $S^0_{(\Lambda)}$ is Onsager’s limiting slope [99] and $A$ and $B$ are given by the relation
2 3 6NA q 0 A= S(.) ; 4 × 103 & &
2 3 ' q 6NA : B = 2A ln 3 4 × 10 &
(320)
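Returning to the mobility ratio of Eq. (318), a short Python sketch may help to check its limiting behaviour numerically. The function and argument names (`mobility_ratio`, `phi` for the volume fraction, `xi` for the coupling parameter) are labels chosen here for illustration, and the sampled values are those quoted in the caption of Fig. 17.

```python
def mobility_ratio(phi: float, xi: float) -> float:
    """Mobility ratio of Eq. (318).

    phi -- volume fraction
    xi  -- dimensionless coupling parameter (Bjerrum length over ionic radius)
    """
    if phi < 0.0:
        raise ValueError("volume fraction must be non-negative")
    correction = (1.0 - 3.0 * xi) * phi * (1.0 + 1.5 * phi * (1.0 - 2.0 * xi))
    return 1.0 / (1.0 + correction)


# Sample the ratio for the coupling values quoted in the caption of Fig. 17.
for xi in (0.1, 0.2, 0.3, 0.4):
    row = ", ".join(f"{mobility_ratio(phi, xi):.4f}" for phi in (0.0, 0.05, 0.1))
    print(f"xi={xi:.1f}: {row}")
```

At zero volume fraction the ratio equals one for any coupling, and for couplings below 1/3 it decreases with volume fraction, in line with the progressive reduction of the mobility described in the text. At a coupling of exactly 1/3 the factor $(1 - 3\xi)$ vanishes and the ratio stays at its limiting value.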
Fig. 18. Comparison of Onsager’s classical extension of the limiting law and the DITT predictions with KCl experimental data. The ordinate is $\Lambda - \Lambda^0 + S_{(\Lambda)}\,c^{1/2}$ (S m$^{-1}$) and the abscissa is $c$ (mol l$^{-1}$). The full curve represents the DITT predictions as stated by Eq. (319), and the dashed line the classical results represented by $\Lambda - \Lambda^0 + S^0_{(\Lambda)}\,c^{1/2} = A_1 c\ln c + B_1 c$ (Ref. [90]).
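The comparison shown in Fig. 18 amounts to fitting two different two-term laws to the same deviation data. The Python sketch below illustrates that procedure with a plain least-squares fit. It is only a sketch under stated assumptions: the data are synthetic, generated from the $c^{3/2}$ form with hypothetical coefficients `A_TRUE` and `B_TRUE`, and none of the numbers correspond to the KCl measurements.

```python
import math


def ditt_basis(c: float) -> tuple:
    """Terms of Eq. (319) beyond the limiting law: c**1.5 * ln c and c**1.5."""
    return (c ** 1.5 * math.log(c), c ** 1.5)


def onsager_basis(c: float) -> tuple:
    """Terms of the classical extension: c * ln c and c."""
    return (c * math.log(c), c)


def fit_two_terms(cs, ys, basis):
    """Least-squares fit y ~ a*f1(c) + b*f2(c); returns (a, b, mean squared residual)."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for c, y in zip(cs, ys):
        f1, f2 = basis(c)
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        t1 += f1 * y
        t2 += f2 * y
    det = s11 * s22 - s12 * s12          # determinant of the normal equations
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    residuals = [y - a * basis(c)[0] - b * basis(c)[1] for c, y in zip(cs, ys)]
    return a, b, sum(r * r for r in residuals) / len(residuals)


# Synthetic deviations generated from the c**1.5 form (hypothetical coefficients).
A_TRUE, B_TRUE = -1.0, 0.5
CONCS = [0.01 * k for k in range(1, 21)]
DEVS = [A_TRUE * f1 + B_TRUE * f2 for f1, f2 in map(ditt_basis, CONCS)]

for name, basis in (("c^(3/2) form", ditt_basis), ("c ln c + c form", onsager_basis)):
    a, b, mse = fit_two_terms(CONCS, DEVS, basis)
    print(f"{name:16s} a={a:9.5f}  b={b:9.5f}  mean squared residual={mse:.3e}")
```

Because the synthetic data follow the $c^{3/2}$ form by construction, that basis reproduces them essentially exactly while the $c\ln c$ plus linear basis leaves a finite residual. With real measurements the two mean squared residuals would be compared in the same way.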
Eq. (319) clearly resembles Onsager’s extension of his limiting law with a non-analytic term in $c\ln c$ plus a term linear in $c$:

$$\Lambda = \Lambda^0 - S^0_{(\Lambda)}\,c^{1/2} + A_1 c\ln c + B_1 c, \tag{321}$$

where $A_1$ and $B_1$ are usually treated as empirical parameters [99]. The only difference between Onsager’s correction and the DITT predictions in Eq. (319) is the dependence on $c^{3/2}$ of the last two terms on the rhs of this equation, instead of the linear dependence of Onsager’s results. In Fig. 18, the DITT predictions and Onsager’s theoretical predictions are tested against direct experimental data of the conductance of KCl, a completely dissociated 1:1 electrolyte system, in water (relative permittivity $\varepsilon_r = 78.3$) at 298.15 K. These data were measured with a Hewlett-Packard HP4285A precision LCR meter equipped with a HP E5050A dielectric probe [217], and the value of the limiting equivalent conductance of KCl used, $\Lambda^0 = 14.98$ S m$^{-1}$, was obtained from the literature [218]. In this figure we represent the deviations of the conductance from its limiting law value, $\Lambda - \Lambda^0 + S^0_{(\Lambda)}\,c^{1/2}$, against molar concentration, together with Onsager’s and DITT theoretical predictions for it. As we can see, the DITT result clearly improves the accuracy of the fitting results, especially in the low concentration range, confirming the dependence on $c^{3/2}$ of the deviations from the limiting law of conductance characteristic of DITT. The calculated squared standard deviations were $3\times10^{-5}$ and $3.2\times10^{-3}$ for DITT and Onsager’s extended limiting law, respectively. Thus, DITT together
with the Kjellander and Mitchell expression of the equilibrium renormalized quantities naturally predicts the first corrections to the limiting law of conductance for point ions, which confirms its validity as an extended formalism of electrolyte transport processes. For finite concentrations, one must employ other approaches to model the concentration dependence of the renormalized quantities. The analytical second and third order self-consistent results of Attard in Eqs. (139) and (142) are valid up to 1 M and considerably extend the range of validity of the DH result. However, Attard’s expression predicts an increase in the renormalized charges and decay length to the first order of approximation, in contradiction with the result of Kjellander and Mitchell for the 1:1 symmetric electrolyte. The result for the renormalized quantities obtained in the DIT scheme through the evaluation of the linear response function $\hat{\alpha}(k)$ for an OCCS (which is equivalent to a 1:1 RPM electrolyte) in the MMSA approach is (Eq. (209))

$$\frac{\kappa}{k_D} = \frac{1}{1 + \phi - (k_D\sigma)^2/2}.$$

In Ref. [90], the DITT equivalent conductance of symmetric 1:1 electrolyte solutions predicted using the DIT/MMSA effective decay length in the above equation was compared to that predicted using Attard’s and Kjellander and Mitchell’s predictions for the decay length. In that work it was proved that the deviations of the transport property from its limiting law value are due to concentration-dependent effects that in mean-field DITT are subsumed in the renormalized quantities. The Kjellander and Mitchell expression was seen to show no deviation from the limiting law up to concentrations of 1 M, due to the very low charge renormalization predicted in this scheme, nor does Attard’s, even for ions of 3 Å diameter. This confirms the low concentration range of applicability of these theoretical expressions, as both the Mitchell and Ninham and the Kjellander and Mitchell equations for the renormalized charge are asymptotic formulae and do not hold for finite concentrations [55], and the same applies to Attard’s formula [59]. However, the MMSA result calculated at the same order of approximation predicts the onset of the deviations of the equivalent conductance from its limiting value at concentrations of 0.04 M for ions of 3 Å diameter, in agreement with the empirically detected behaviour for these systems. This confirms both the DITT formalism and the usage of the MMSA closure relation for explaining the renormalization of charge in bulk electrolyte and colloid solutions. Finally, we compare the DITT conductance equation (311) with our own experimental data of KCl up to concentrations of 1 M, and the results are shown in Fig. 19. This figure shows the measured slope of the equivalent conductance of this system against volume fraction, together with the DITT predictions and the extended limiting law calculations. Theoretical calculations were obtained from Eq. (315), together with the MMSA equilibrium theoretical scheme, which was used for modelling the renormalization parameter, $\tau$, because of its ability to reproduce the HNC results. Thus, the only parameters necessary to define the system, besides the temperature and the permittivity of the solvent, are the ionic radius of the equivalent RPM electrolyte, $\sigma$, for which we have used the crystallographic ionic radii augmented by a factor of 10% as suggested by Durand-Vidal et al. [86], and the limiting equivalent conductance ($\Lambda^0$), for which we have used the values registered in the literature. Thus, no parameter is adjusted and the calculations are real predictions of the DITT. As is evident from the figures, the deviation of the real slope from its constant limiting law value starts at very low concentrations, and this deviation is correctly predicted by the DITT formalism with a simple analytic equilibrium closure relation such as MMSA. For DITT the calculated squared standard deviation was $8\times10^{-4}$ and for Onsager’s extended limiting law it was $4\times10^{-3}$. This
Fig. 19. Behaviour of the concentration-dependent slope of the equivalent conductance of KCl. Squares correspond to empirical data. Full curve represents the theoretical predictions using the static correlation function obtained from DITT Eq. (315) together with the DIT/MMSA renormalization parameter and the dashed line represents the predictions of the extended limiting law of Onsager (Ref. [90]).
excellent agreement is direct empirical evidence of the validity of the DITT fundamental hypotheses on the usage of renormalized or “dressed” DIT quasiparticles as generalized kinetic units in the transport formalism.

10. Conclusions

We have reviewed the different theoretical and numerical schemes that have been developed for analyzing ionic solutions during the last century, starting from the early works of Debye and Hückel and of Gouy and Chapman, both for homogeneous (bulk) and inhomogeneous electrolytes, and including sophisticated integral equation theories and simulation results. However, the main point of this review has been the analysis of the statistical foundations of the formally exact mean-field dressed ion theory (DIT) and its transport homologue, the dressed ion transport theory (DITT). These formalisms have been formulated mainly during the last decade, a period of deep revision of the classical theories of ionic solutions, trying to explain the puzzling success of the PB formalism under conditions for which it ought to be totally inapplicable due to the existence of ionic correlations and higher order electrostatic couplings in the bulk. These theories provide statistically consistent
frameworks for the numerous theoretical and numerical advances in the theory of electrolytes formulated to overcome the PB formalism throughout the century. The undoubted practical importance of the mean-field formalism of ionic solutions, due to the complexities of the numerical schemes, has led to numerous studies of the conditions for the preservation of the mean-field image of ionic systems beyond the conventional DH limit (equivalently GC for the edl). We have seen that FT demands the introduction of renormalized or rescaled system parameters (charges and screening length) to overcome the DH theory, a fact that is nowadays accepted. A process of renormalization of charges and screening lengths occurs in an ionic system, and this allows the whole equilibrium theory of ionic solutions to be interpreted in terms of dressed particles that interact through potentials that are screened in a non-Debye fashion. Moreover, these dressed particles also provide new kinetic entities in terms of which one can reformulate the DFO classical transport formalism, the aim of DITT. The main analytical and numerical results for calculating the effective decay length of ionic solutions have been reviewed, ranging from asymptotic expansions and self-consistent results to the sophisticated HNC and MC simulation. The so-called DIT route to the renormalized quantities has also been a matter of study. The effective charges and screening length can be calculated in the DIT scheme from the knowledge of the DIT linear response function, which is equivalent to knowing the static structure of the fluid in whatever approximation. The so-termed MMSA (and its generalization for asymmetric electrolyte systems) has been introduced and its predictions tested against high accuracy HNC data for the screening length of an electrolyte solution.
The DIT implications in the edl theory were briefly summarized, showing that the commonly reported effective surface charges are naturally predicted in this theoretical framework. Finally, the DITT reformulation of the transport theory of ionic solutions in terms of non-Debye quantities was analyzed. The equilibrium DIT and transport DITT form the formally exact mean-field picture of ionic solutions, and it has proved successful in accounting for experimental and numerical data of thermodynamic and transport properties of electrolyte and colloid solutions, providing new insight and opening new trends in the physics of charged fluids.

Acknowledgements

This work received the financial support of the Xunta de Galicia (PGIDT99PXI 20605B) and of the Ministerio de Ciencia y Tecnología (MAT2001-2877).

References

[1] P. Debye, E. Hückel, Phys. Z. 24 (1923) 185.
[2] G. Gouy, J. Phys. 9 (1910) 457; D.L. Chapman, Phil. Mag. 25 (1913) 475.
[3] B.V. Derjaguin, L.D. Landau, Acta Physicochem. URSS 14 (1941) 6; E.J.W. Verwey, J.Th.G. Overbeek, Trans. Faraday Soc. 42B (1946) 117.
[4] M.E. Fisher, Y. Levin, Phys. Rev. Lett. 71 (1993) 3826; M.E. Fisher, J. Stat. Phys. 75 (1996) 1.
[5] R. Kjellander, D.J. Mitchell, Chem. Phys. Lett. 200 (1992) 76. ∗∗∗
[6] L. Guldbrand, B. Jönsson, H. Wennerström, P. Linse, J. Chem. Phys. 80 (1984) 2221.
[7] R. Kjellander, S. Marčelja, J. Phys. Chem. 90 (1986) 1230.
[8] R. Kjellander, S. Marčelja, R.M. Pashley, J.P. Quirk, J. Chem. Phys. 92 (1990) 4399.
[9] P. Attard, D.J. Mitchell, B.W. Ninham, J. Chem. Phys. 88 (1988) 4987. ∗∗∗
[10] G. Stell, J.L. Lebowitz, J. Chem. Phys. 48 (1968) 3706.
[11] E. Trizac, L. Bocquet, M. Auboy, Los Alamos National Laboratory, Preprint Archive, cond-mat/0201510 (2002) and references therein.
[12] T.H. Gronwall, V.K. LaMer, K. Sandved, Phys. Z. 29 (1928) 358.
[13] N. Bjerrum, Kgl. Danske Vidensk. Selskab. 9 (1926) 7.
[14] L.W. Bahe, J. Phys. Chem. 76 (1972) 1062.
[15] L.M. Varela, M. García, F. Sarmiento, D. Attwood, V. Mosquera, J. Chem. Phys. 107 (1997) 6415.
[16] E. Waisman, J.L. Lebowitz, J. Chem. Phys. 52 (1972) 4037.
[17] J.S. Høye, J.L. Lebowitz, G. Stell, J. Chem. Phys. 61 (1974) 3253.
[18] H.C. Andersen, D. Chandler, J. Chem. Phys. 55 (1971) 1497; H.C. Andersen, D. Chandler, J. Chem. Phys. 56 (1972) 3182.
[19] G. Stell, in: B.J. Berne (Ed.), Statistical Mechanics: Equilibrium Techniques, Plenum Press, New York, 1977.
[20] A.R. Allnatt, Mol. Phys. 8 (1964) 533.
[21] J.C. Rasaiah, H.L. Friedman, J. Chem. Phys. 48 (1968) 2742.
[22] J.E. Mayer, J. Chem. Phys. 18 (1950) 1426.
[23] R. Abe, Prog. Theor. Phys. 31 (1976) 1117.
[24] E.G.D. Cohen, T.J. Murphy, Phys. Fluids 12 (1969) 1404.
[25] J.C. Rasaiah, D.N. Card, J.P. Valleau, J. Chem. Phys. 56 (1972) 248.
[26] B. Larsen, J. Chem. Phys. 65 (1976) 3431.
[27] B. Larsen, S.A. Rodge, J. Chem. Phys. 72 (1980) 2578.
[28] D.N. Card, J.P. Valleau, J. Chem. Phys. 52 (1970) 6323.
[29] B. Larsen, Chem. Phys. Lett. 27 (1974) 47.
[30] L.R. Zhang, H.S. White, H.T. Davis, Report, 1993.
[31] C.-Y. Shew, P. Mills, J. Phys. Chem. 99 (1995) 12988.
[32] J.-M. Caillol, D. Levesque, J.-J. Weis, J. Chem. Phys. 116 (2002) 10794.
[33] Q. Yan, J.J. de Pablo, Phys. Rev. Lett. 88 (2002) 095504/1; Q. Yan, J.J. de Pablo, J. Chem. Phys. 116 (2002) 2967; Q. Yan, J.J. de Pablo, Phys. Rev. Lett. 86 (2001) 2054.
[34] K. Heinzinger, Stud. Phys. Theor. Chem. 27 (1983) 61.
[35] S.H. Suh, L. Mier-y-Teran, H.S. White, H.T. Davis, Chem. Phys. 142 (1990) 203.
[36] L. Zhang, M. Jinno, H.T. Davis, H.S. White, Mol. Simul. 12 (1994) 1.
[37] R.D. Groot, Phys. Rev. A 37 (1988) 3456.
[38] Y. Levin, M.E. Fisher, Physica A 225 (1996) 164. ∗
[39] B.P. Lee, M.E. Fisher, Phys. Rev. Lett. 76 (1996) 2906; B.P. Lee, M.E. Fisher, Europhys. Lett. 39 (1997) 611.
[40] M.N. Tamashiro, Y. Levin, M.C. Barbosa, Physica A 268 (1999) 24.
[41] M.C. Barbosa, J. Phys.: Cond. Matter 14 (2002) 2461.
[42] A.L. Kholodenko, A.L. Beyerlein, Phys. Rev. A 34 (1986) 3309. ∗∗∗
[43] R.R. Netz, H. Orland, Europhys. Lett. 45 (1999) 726. ∗∗
[44] R.R. Netz, H. Orland, Eur. Phys. J. E 1 (2000) 203.
[45] R.R. Netz, H. Orland, Eur. Phys. J. E 1 (2000) 67.
[46] R.R. Netz, H. Orland, Eur. Phys. J. D 8 (1999) 145.
[47] A.G. Moreira, R.R. Netz, Europhys. Lett. 52 (2000) 705.
[48] R.R. Netz, Eur. Phys. J. E 5 (2001) 557.
[49] R.R. Netz, Eur. Phys. J. E 5 (2001) 189.
[50] N.V. Brilliantov, V.V. Malinin, R.R. Netz, Eur. Phys. J. D 18 (2002) 339.
[51] R. Kjellander, S. Marčelja, R.M. Pashley, J.P. Quirk, J. Chem. Phys. 80 (1984) 4399.
[52] G. Stell, J.L. Lebowitz, J. Chem. Phys. 61 (1974) 3253.
[53] J.G. Kirkwood, Chem. Rev. 19 (1936) 275.
[54] R. Lovett, F.H. Stillinger, J. Chem. Phys. 48 (1986) 3869.
[55] R. Kjellander, D.J. Mitchell, J. Chem. Phys. 101 (1994) 603. ∗∗∗
[56] P.V. Giaquinta, M. Parrinello, M.P. Tosi, Phys. Chem. Liq. 5 (1976) 305.
[57] M. Parrinello, M.P. Tosi, Rev. Nuovo Cimento 2 (1979) 6.
[58] R. Kjellander, S. Marčelja, J. Phys. Chem. 90 (1986) 1230.
[59] P. Attard, Phys. Rev. E 48 (1993) 3604. ∗∗∗
[60] A. McBride, M. Kohonen, P. Attard, J. Chem. Phys. 109 (1998) 2423. ∗∗
[61] D.J. Mitchell, B.W. Ninham, Chem. Phys. Lett. 53 (1978) 397. ∗∗
[62] P. Attard, R. Kjellander, D.J. Mitchell, Chem. Phys. Lett. 139 (1987) 219.
[63] P. Attard, R. Kjellander, Bo Jönsson, J. Chem. Phys. 89 (1988) 1664.
[64] L.M. Varela, M. Pérez-Rodríguez, M. García, F. Sarmiento, V. Mosquera, J. Chem. Phys. 109 (1998) 1930. ∗∗∗
[65] L. Blum, J.S. Høye, J. Phys. Chem. 81 (1977) 1311.
[66] A. Lehmani, O. Bernard, P. Turq, J. Stat. Phys. 89 (1997) 379.
[67] L.M. Varela, M. García, V. Mosquera, Physica A 311 (2002) 35. ∗∗
[68] R. van Roij, J.-P. Hansen, Phys. Rev. Lett. 79 (1997) 3082.
[69] L.M. Varela, J.M. Ruso, M. García, V. Mosquera, J. Chem. Phys. 113 (2000) 10174. ∗∗
[70] L.M. Varela, M. Pérez-Rodríguez, M. García, V. Mosquera, J. Chem. Phys. 113 (2000) 292. ∗∗
[71] L.M. Varela, M. García, V. Mosquera, Physica A, in press.
[72] Y. Levin, M.E. Fisher, Physica A 76 (1996) 2906.
[73] J.M.H. Levelt Sengers, J.A. Given, Mol. Phys. 80 (1993) 899.
[74] P. Turq, F. Lantelme, H.L. Friedman, J. Chem. Phys. 66 (1977) 3039.
[75] P. Turq, F. Lantelme, D. Levesque, Mol. Phys. 37 (1979) 223.
[76] D.L. Ermak, J.A. McCammon, J. Chem. Phys. 69 (1978) 1352.
[77] H.L. Friedman, F.O. Ranieri, M.D. Wood, Chem. Scr. 29A (1989) 49.
[78] O. Bernard, W. Kunz, P. Turq, L. Blum, J. Phys. Chem. 96 (1992) 3833.
[79] L. Onsager, Phys. Z. 27 (1926) 388.
[80] L. Onsager, R.M. Fuoss, J. Phys. Chem. 36 (1932) 2689.
[81] L. Onsager, Ann. N. Y. Acad. Sci. 46 (1945) 263.
[82] L. Onsager, S.K. Kim, J. Phys. Chem. 61 (1957) 215.
[83] E. Waisman, J.L. Lebowitz, J. Chem. Phys. 56 (1972) 3086.
[84] W. Ebeling, J.J. Rose, J. Solution Chem. 10 (1981) 599.
[85] W. Ebeling, M.J. Grigo, J. Solution Chem. 11 (1982) 151.
[86] P. Turq, L. Blum, O. Bernard, W. Kunz, J. Phys. Chem. 99 (1995) 822.
[87] S. Durand-Vidal, J.P. Simonin, P. Turq, O. Bernard, J. Phys. Chem. 99 (1995) 6733.
[88] S. Durand-Vidal, P. Turq, O. Bernard, C. Treiner, L. Blum, Physica A 231 (1996) 123.
[89] L.M. Varela, C. Rega, M. Pérez-Rodríguez, M. García, V. Mosquera, F. Sarmiento, J. Chem. Phys. 110 (1999) 4483. ∗
[90] L.M. Varela, M. Pérez-Rodríguez, M. García, F. Sarmiento, V. Mosquera, J. Chem. Phys. 111 (1999) 10986. ∗∗∗
[91] P. Attard, Adv. Chem. Phys. 92 (1996) 1. ∗∗∗
[92] D.A. McQuarrie, Statistical Mechanics, HarperCollins, New York, 1976.
[93] R.A. Robinson, R.H. Stokes, Electrolyte Solutions, 2nd Edition, Butterworths, London, 1959.
[94] A.K. Soper, Physica B 276–278 (2000) 12.
[95] H.S. Frank, M.W. Evans, J. Chem. Phys. 13 (1945) 507.
[96] S. Dixit, A.K. Soper, J.L. Finney, J. Crain, Europhys. Lett. 59 (2002) 377.
[97] G.N. Patey, G.M. Torrie, Chem. Scr. 29A (1989) 39.
[98] See for example D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
[99] H.S. Harned, B.B. Owen, The Physical Chemistry of Electrolyte Solutions, 3rd Edition, Reinhold, New York, 1958.
[100] J.B. Hubbard, in: M.-C. Bellisent-Funel, G.W. Neilson (Eds.), The Physics and Chemistry of Aqueous Ionics Solutions, NATO ASI series, Reidel Publishing, 1987. ∗∗
[101] R.H. Fowler, Statistical Mechanics, Cambridge University Press, Cambridge, 1929.
[102] L.L. Lee, Molecular Thermodynamics of Nonideal Fluids, Butterworths, Boston, 1988.
[103] J.P. Hansen, I.R. McDonald, Theory of Simple Liquids, Academic Press, Oxford, 1986.
[104] G.F. Simmons, Ecuaciones diferenciales con aplicaciones y notas históricas, McGraw-Hill, Madrid, 1985.
[105] A. Markushevich, Teoría de las Funciones Analíticas, Vol. I, Mir, Moscow, 1987.
[106] P. Attard, J. Chem. Phys. 93 (1990) 7301; P. Attard, J. Chem. Phys. 94 (1991) 6936.
[107] G. Stell, Phys. Rev. B 1 (1970) 2265.
[108] R. Klein, in: S.-H. Chen, et al. (Eds.), Structure and Dynamics of Strongly Interacting Colloids and Supramolecular Aggregates in Solution, Kluwer Academic Press, Netherlands, 1992.
[109] V.K. LaMer, T.H. Gronwall, L.J. Greiff, J. Phys. Chem. 35 (1931) 2245.
[110] M.P. Tosi, F.G. Fumi, J. Phys. Chem. Solids 25 (1945) 45.
[111] G. Jaccuci, I.R. McDonald, A. Rahman, Phys. Rev. A 13 (1976) 1581.
[112] M.J.L. Sangster, M. Dixon, Adv. Phys. 25 (1976) 247.
[113] F.H. Stillinger, R. Lovett, J. Chem. Phys. 48 (1968) 3858.
[114] F.H. Stillinger, R. Lovett, J. Chem. Phys. 49 (1968) 1991.
[115] N. Bjerrum, Kgl. Danske Vidensk. Selskab. 7 (1926) No. 9.
[116] R.M. Fuoss, Chem. Rev. 17 (1935) 27; R.M. Fuoss, J. Am. Chem. Soc. 57 (1935) 2064.
[117] R.M. Fuoss, C.A. Krauss, J. Am. Chem. Soc. 55 (1933) 2387; R.M. Fuoss, C.A. Krauss, J. Am. Chem. Soc. 57 (1935) 1.
[118] R.M. Fuoss, J. Am. Chem. Soc. 56 (1933) 2017.
[119] E.A. Guggenheim, Phil. Mag. XIX (1935) 588.
[120] Brönsted, J. Am. Chem. Soc. 44 (1922) 893.
[121] E. Guntelberg, Z. Phys. Chem. 123 (1926) 199.
[122] C.W. Davies, J. Chem. Soc. (1938) 2093.
[123] T. Burchfield, E.M. Woolley, J. Phys. Chem. 88 (1984) 2149.
[124] J.C. Poirier, J. Chem. Phys. 18 (1950) 1426.
[125] G. Scatchard, Electrochemical constants, National Bureau of Standards Circular 524 (1953) 185.
[126] L. Blum, Adv. Chem. Phys. 78 (1990) 171. ∗∗
[127] S.L. Carnie, G.M. Torrie, Adv. Chem. Phys. 56 (1984) 141. ∗∗
[128] D.C. Grahame, J. Chem. Phys. 21 (1953) 1054.
[129] B. Abraham-Schrauner, J. Math. Biol. 2 (1975) 333; B. Abraham-Schrauner, J. Math. Biol. 4 (1978) 975.
[130] S. Levine, G.M. Bell, Discuss. Faraday Soc. 42 (1966) 69; S. Levine, C.W. Outhwaite, J. Chem. Soc. Faraday II 74 (1978) 1670.
[131] R. Evans, T. Sluckin, Mol. Phys. 40 (1980) 413.
[132] W. van Megan, I. Snook, J. Chem. Phys. 73 (1980) 4656.
[133] I. Snook, W. van Megan, J. Chem. Phys. 75 (1981) 4104.
[134] G.M. Torrie, J.P. Valleau, J. Phys. Chem. 86 (1982) 4615.
[135] G.M. Torrie, J.P. Valleau, G.N. Patey, J. Chem. Phys. 76 (1982) 4615.
[136] D. Henderson, F.F. Abraham, J.A. Barker, Mol. Phys. 31 (1976) 1291.
[137] M. Born, H.S. Green, Proc. R. Soc. London 1988 (1946) 10.
[138] R. Lovett, C.Y. Mou, F.P. Buff, J. Chem. Phys. 65 (1976) 2377.
[139] J.G. Kirkwood, E. Monroe, J. Chem. Phys. 9 (1941) 514.
[140] G.M. Torrie, J.P. Valleau, Chem. Phys. Lett. 65 (1979) 343.
[141] S.W. de Leeuw, J.W. Perram, E.R. Smith, Proc. Roy. Soc. A 373 (1980) 27.
[142] E. Waisman, J.L. Lebowitz, J. Chem. Phys. 52 (1970) 4307; E. Waisman, J.L. Lebowitz, J. Chem. Phys. 56 (1972) 3086; E. Waisman, J.L. Lebowitz, J. Chem. Phys. 56 (1972) 3093. ∗∗
[143] J.S. Høye, J.L. Lebowitz, G. Stell, J. Chem. Phys. 61 (1974) 3253. ∗∗
[144] H.C. Andersen, D. Chandler, J. Chem. Phys. 55 (1971) 1497; H.C. Andersen, D. Chandler, J. Chem. Phys. 56 (1972) 3182.
[145] A.R. Allnatt, Mol. Phys. 8 (1964) 533.
[146] J.C. Rasaiah, H.L. Friedman, J. Chem. Phys. 48 (1968) 2742.
110 [147] [148] [149] [150] [151] [152] [153] [154] [155] [156] [157] [158] [159] [160] [161] [162] [163] [164] [165] [166] [167] [168] [169] [170] [171] [172] [173] [174] [175] [176] [177] [179] [180] [181] [182] [183] [184] [185] [186] [187] [188] [189] [190] [191] [192] [193] [194] [195]
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111 J.L. Lebowitz, J. Percus, Phys. Rev. 144 (1966) 251. L. Blum, Mol. Phys. 30 (1975) 1529. ∗∗ J.S. HHye, G. Stell, J. Chem. Phys. 61 (1976) 3253. L. Blum, J. Stat. Phys. 22 (1980) 661. M.S. Wertheim, J. Math. Phys. 5 (1964) 643. E. Thiele, J. Chem. Phys. 39 (1963) 474. R.J.F. Leote de Carvalho, R. Evans, Mol. Phys. 83 (1994) 619. ∗ ∗ ∗ J. Ulander, R. Kjellander, J. Chem. Phys. 109 (1998) 9508. ∗∗ J.P. Hansen, I. R. McDonald, Phys. Rev. A 11 (1975) 2111. M. Baus, J.P., Hansen, Phys. Rep. 59 (1980) 1. D.D. Carley, J. Chem. Phys. 46 (1967) 3783. B. Larsen, J. Chem. Phys. 68 (1978) 4511. G.M. Abertheny, M. Dixon, M.J. Gillan, Phil. Mag. B 43 (1981) 1113. J. HHye, E. Lomba, G. Stell, Mol. Phys. 75 (1992) 1217. J. HHye, E. Lomba, G. Stell, Mol. Phys. 79 (1993) 523. M. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.N. Teller, E. Teller, J. Chem. Phys. 21 (1953) 1087. B.J. Alder, T.E. Wainwright, J. Chem. Phys. 31 (1959) 459. P.N. Vorontsov-Vel’yaminov, V.P. Chasovskikh, High Temperatures (URSS) 13 (1975) 1071; P.N. Vorontsov-Vel’yaminov, H.M. El’yashevich, A.K. Kron, Elektrokhimiya 2 (1966) 708; P.N. Vorontsov-Vel’yaminov, H.M. El’yashevich, Elektrokhimiya 4 (1968) 1430. D.N. Card, J.P. Valleau, J. Chem. Phys. 52 (1970) 6323. J.C. Rasaiah, D.N. Card, J.P. Valleau, J. Chem. Phys. 56 (1972) 248. L.S. Brown, L.G. Ja2e, Phys. Rep. 340 (2001) 1. J. Glimm, A. Ja2e, Functional Integral Methods in Quantum Field Theory, NATO Advanced Study Institute Series, Series B B26 (1977) 35. M. Le Bellac, Quantum and Statistical Field Theory, Clarendon, Oxford, 1991. A.L. Kholodenkho, K.F. Freed, J. Chem. Phys. 78 (1983) 7412. A.L. Kholodenkho, K.F. Freed, J. Chem. Phys. 80 (1984) 900. G. Stell, J.L. Lebowitz, J. Chem. Phys. 48 (1968) 3706. P.V. Giaquinta, M. Parrinello, M.P. Tosi, Phys. Chem. Liq. 5 (1976) 305. P. Viellefosse, J. Phys. (Paris) Lett. 38 (1977) L43. M. Parrinello, M.P. Tosi, Riv. Nuovo Cimento 2 (1979) 1. C.W. 
Outhwaite, in: K. Singer (Ed.), Statistical Mechanics. A Specialist Periodical Report, Vol. 2, The Chemical Society, London, 1975. F.H. Stillinger, R. Lovett, J. Chem. Phys. 48 (1968) 3869. L. Blum, J.S. HHye, J. Phys. Chem. 81 (1977) 1311. M.A. Knackstedt, B.W. Ninham, J. Phys. Chem. 100 (1996) 1330. ∗ J. Ennis, R. Kjellander, D.J. Mitchell, J. Chem. Phys. 102 (1995) 975. ∗∗ R. Kjellander, J. Ulander, Mol. Phys. 95 (1998) 495. ∗ J. Ulander, R. Kjellander, J. Chem. Phys. 114 (2001) 4893. ∗∗ M.P. Allen, D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1990. R. Kubo, M. Toda, N. Hashitsume, Statistical Physics, Vol. II, 2nd Edition, Springer, Berlin, 1991. A.B. Bhatia, D.E. Thornton, Phys. Rev. B 2 (1970) 3004. R.G. Palmer, J.D. Weeks, J. Chem. Phys. 58 (1973) 4171. J.P. Hansen, J.J. Weis, Mol. Phys. 33 (1977) 1379. Z. Badirkhan, G. Pastore, M.P. Tosi, Mol. Phys. 74 (1991) 1089. H.S. Kang, F.H. Ree, J. Chem. Phys. 103 (1995) 9370. J.G. Kirkwood, J.C. Poirier, J. Phys. Chem. 50 (1969) 3756. P. Ballon, G. Pastore, M.P. Tosi, J. Chem. Phys. 85 (1986) 2943. P. Attard, D.J. Mitchell, B.W. Ninham, J. Chem. Phys. 89 (1988) 4358. S.E. Feller, D.A. McQuarrie, J. Phys. Chem. 96 (1992) 3454. L. Mier-y-Teran, S.H. Suh, H.S. White, H.T. Davis, J. Chem. Phys. 92 (1990) 5087.
L.M. Varela et al. / Physics Reports 382 (2003) 1 – 111 [196] [197] [198] [199] [200] [201] [202] [203] [204] [205] [206] [207] [208] [209] [210] [211] [212] [213] [214] [215] [216] [217] [218]
111
G.M. Torrie, P.G. Kusalik, G.N. Patey, J. Chem. Phys. 88 (1988) 7826. G.M. Torrie, P.G. Kusalik, G.N. Patey, J. Chem. Phys. 89 (1988) 3285. G.M. Torrie, P.G. Kusalik, G.N. Patey, J. Chem. Phys. 90 (1989) 4513. D. Wei, G.N. Patey, G.M. Torrie, J. Phys. Chem. 94 (1990) 4260. J. Ennis, S. Maracelja, R. Kjellander, Electrochim. Acta 41 (1996) 2115. ∗ ∗ ∗ C.W. Outhwaite, L.B. Bhuiyan, S. Levine, J. Chem. Soc. Faraday Trans. II 76 (1980) 1388. C.W. Outhwaite, L.B. Bhuiyan, J. Chem. Soc. Faraday Trans. II 78 (1982) 775. C.W. Outhwaite, L.B. Bhuiyan, J. Chem. Soc. Faraday Trans. II 79 (1983) 707. R. Kjellander, S. Maracelja, J. Chem. Phys. 82 (1985) 2122. R. Kjellander, J. Chem. Phys. 88 (1988) 7129. M. Plischke, D. Henderson, J. Chem. Phys. 88 (1988) 2712. L.D. Landau, E.M. Lifshitz, Mec-anica de Fluidos, Curso de F-.sica Te-orica, Vol. 6, Revert-e, Barcelona, 1985. J.P. Boom, S. Yip, Molecular Hydrodynamics, McGraw-Hill, New York, 1980. P. Debye, H. Falkenhagen, Phys. Z. 29 (1928) 121. G. Joos, M. Blumentritt, Phys. Z. 28 (1927) 836. H. Falkenhagen, Electrolytes, Oxford University Press, New York, 1934. R.M. Fuoss, F. Accascina, Electrolytic Conductance, Wiley Interscience, New York, 1959. R. Hidalgo-Alvarez, A. Mart-.n, A. Fern-andez, D. Bastos, F. Mart-.nez, F.J. de las Nieves, Adv. Colloid Interface Sci. 67 (1996) 1. L. Onsager, Phys. Z. 27 (1926) 388. O. Bernard, W. Kunz, P. Turq, L. Blum, J. Phys. Chem. 96 (1992) 3833. R.W. O’Brien, L.R. White, J. Chem. Soc. Faraday Trans. II 74 (1978) 1607. M. P-erez-Rodr-.guez, G. Prieto, C. Rega, L.M. Varela, F. Sarmiento, V. Mosquera, Langmuir 14 (1998) 4422. V.M.M. Lobo, J.L. Quaresma, Handbook of Electrolyte Solutions, Elsevier, Amsterdam, 1989.
Physics Reports 382 (2003) 113 – 302 www.elsevier.com/locate/physrep
Lattice perturbation theory

Stefano Capitani

DESY Zeuthen, John von Neumann-Institut für Computing (NIC), Platanenallee 6, 15738 Zeuthen, Germany

Accepted 17 April 2003; editor: R. Petronzio
Abstract

The consideration of quantum fields defined on a spacetime lattice provides computational techniques which are invaluable for studying gauge theories nonperturbatively from first principles. Perturbation theory is an essential aspect of computations on the lattice, especially for investigating the behavior of lattice theories near the continuum limit. Particularly important is its rôle in connecting the outcome of Monte Carlo simulations to continuum physical results. For these matchings the calculation of the renormalization factors of lattice matrix elements is required. In this review we explain the main methods and techniques of lattice perturbation theory, focusing on the cases of Wilson and Ginsparg–Wilson fermions. We will illustrate, among other topics, the peculiarities of perturbative techniques on the lattice, the use of computer codes for the analytic calculations and the computation of lattice integrals. Methods for the computation of 1-loop integrals with very high precision are also discussed. The review also presents, in a pedagogical fashion, some of the recent developments in this kind of calculation. The coordinate method of Lüscher and Weisz is explained in detail. Also discussed are the novelties that Ginsparg–Wilson fermions have brought from the point of view of perturbation theory. Particular emphasis is given throughout the paper to the rôle of chiral symmetry on the lattice and to the mixing of lattice operators under renormalization. The construction of chiral gauge theories regularized on the lattice, made possible by the recent advances in the understanding of chiral symmetry, is also discussed. Finally, a few detailed examples of lattice perturbative calculations are presented.

© 2003 Elsevier B.V. All rights reserved.

PACS: 12.38.Cy; 12.38.Gc; 11.30.Rd

Keywords: Perturbation theory; Lattice QCD; Renormalization; Chiral symmetry; Ginsparg–Wilson fermions
E-mail address: [email protected] (S. Capitani).
0370-1573/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0370-1573(03)00211-4
S. Capitani / Physics Reports 382 (2003) 113 – 302
Contents

1. Introduction
1.1. Outline of the paper
2. Why lattice perturbation theory
3. Renormalization of operators
4. Discretization
5. Wilson's formulation of lattice QCD
5.1. Fourier transforms
5.2. Pure gauge action
5.2.1. Measure
5.2.2. Gauge fixing and the Faddeev–Popov procedure
5.3. Fermion action
6. Dealing with chiral symmetry on the lattice
7. Staggered fermions
8. Ginsparg–Wilson fermions
8.1. The Ginsparg–Wilson relation
8.2. Overlap fermions
8.3. Domain wall fermions
8.4. Fixed-point fermions
8.5. Concluding remarks
9. Perturbation theory of lattice regularized chiral gauge theories
10. The approach to the continuum limit
11. Improvement
11.1. Improved quarks
11.2. Improved gluons
12. The Schrödinger functional
13. The hypercubic group
14. Operator mixing on the lattice
14.1. Unpolarized structure functions
14.1.1. First moment
14.1.2. Second moment
14.1.3. Third moment
14.1.4. Higher moments
14.2. A mixing due to breaking of chiral symmetry: ΔI = 1/2 operators
15. Analytic computations
15.1. The power counting theorem of Reisz
15.2. Divergent integrals
15.3. General aspects of the calculations
15.4. Example (Wilson): the first moment of the quark momentum distribution
15.4.1. Preliminaries
15.4.2. Vertex
15.4.3. Sails
15.4.4. Operator tadpole
15.4.5. Quark self-energy (sunset diagram)
15.4.6. Quark self-energy (tadpole diagram)
15.4.7. Concluding remarks
15.5. Example of overlap results
15.6. Tadpole improvement
15.7. Perturbation theory for fat links
16. Computer codes
17. Lattice integrals
18. Algebraic method for 1-loop integrals
18.1. The bosonic case
18.2. Examples of bosonic integrals
18.3. Operator tadpoles
18.4. The first moment of the gluon momentum distribution
18.5. The general fermionic case
18.6. The quark self-energy
19. Coordinate space methods
19.1. High-precision integrals
19.2. Coordinate space methods for 2-loop computations
19.2.1. Bosonic case
19.2.2. Fermionic case
20. Numerical perturbation theory
21. Conclusions
Acknowledgements
Appendix A. Notation and conventions
Appendix B. High-precision values of Z0 and Z1
References
1. Introduction

In a lattice field theory the quantum fields are studied and computed using a discretized version of spacetime. The lattice spacing a, the distance between neighboring sites, induces a cutoff on the momenta of order 1/a. A spacetime lattice can be viewed as a nonperturbative regularization. Since the other known regularizations, like dimensional regularization or Pauli–Villars, can be defined only order by order in perturbation theory, the lattice regularization has this unique advantage over them. It is a regularization which is not tied to any specific approximation method, and which allows calculations from first principles employing various numerical and analytical methods, without any need to introduce models for the physics or additional parameters.

In discretizing a continuum field theory one has to give up Lorentz invariance (and in general Poincaré invariance), but the internal symmetries can usually be preserved. In particular, gauge invariance can be kept as a symmetry of the lattice for any finite value of the lattice spacing, and this makes it possible to define QCD. The construction of chiral gauge theories like the electroweak theory on a lattice presents special problems due to chiral symmetry, which have been understood and solved only recently. The fact that one is able to maintain gauge invariance for any nonzero a is of great help in proving the renormalizability of lattice gauge theories.

Lattice gauge theories represent a convenient regularization of QCD where its nonperturbative features, which are essential for the description of the strong interactions, can be systematically studied. The lattice can probe the long-distance physics, which is otherwise inaccessible to investigations which use continuum QCD. It was precisely for the study of low-energy nonperturbative phenomena that the lattice was introduced by Wilson, who went on to prove quark confinement in the strong coupling regime. Confinement means that quarks, the fundamental fermionic fields of the QCD Lagrangian, are not the states observed in experiments, where only hadrons are visible, and so the free theory has no resemblance to the observed physical world. The quark–gluon structure of hadrons is hence
intrinsically different from the structure of other composite systems (like for example electric charges in QED). No description in terms of two-body interactions is possible in QCD. Lattice simulations of QCD show that a large part of the mass of the proton arises from the nonabelian interactions of quarks and gluons and not from the mass of the quarks. Only a small fraction of the proton mass is due to the quark mass. Similarly, the lattice confirms that only about half of the momentum and a small part of the spin of the proton come from the momentum and spin of the constituent quarks. Computations coming from the lattice are thus crucial to our understanding of the strong interactions.

In this review we want to discuss lattice calculations in the weak coupling regime. This is the realm of perturbation theory, which is used to compute the renormalization of the parameters of the Lagrangian and of matrix elements, and to study the approach to the continuum limit. Details of the lattice formulation that are only relevant at the nonperturbative level will not be discussed in this review. For the nonperturbative aspects of lattice field theories we refer the interested reader to the books of Creutz (1983), Montvay and Münster (1994) and Rothe (1997) and the very recent one by Smit (2002). Of these books, Rothe's contains more material on lattice perturbation theory than the others. Useful shorter reviews, which also cover many nonperturbative aspects, sometimes with a pedagogical slant, can also be found in Kogut (1983), Sharpe (1994), Sharpe (1995), DeGrand (1996), DeGrand (1997), Gupta (1999), Sharpe (1999), Wittig (1999), Münster and Walzl (2000), Davies (2002) and Kronfeld (2002) and recently in Lüscher (2002).

Here we would like to explain the main methods and techniques of lattice perturbation theory, particularly when Wilson and Ginsparg–Wilson fermions are used. We will discuss, among other things, Feynman rules, aspects of the analytic calculations and lattice integrals, the structure of the computer codes necessary to carry them out, and the nature of the mixing problem of lattice operators.

Chiral symmetry is a topic which comes up fairly often in the treatment of fermions on the lattice, and we will address some issues related to it in the course of the review. We feel that a discussion of the problems connected with the realization of chiral symmetry on the lattice is needed. The reader might otherwise wonder why one should do such involved calculations as the ones required for Ginsparg–Wilson fermions. We think that it is also interesting to see how the lattice can offer fascinating solutions to the general quantum theoretical problem of defining chiral gauge theories beyond tree level.

Also discussed is an algebraic method for the reduction of any 1-loop lattice integral (in the Wilson case) to a linear combination of a few basic constants. These constants can be calculated with very high precision by making clever use of the behavior of the position space propagators at large distances. This coordinate space method turns out to be a very powerful tool for the computation of lattice integrals. These computations are a necessary prerequisite for computing 2-loop lattice integrals with a large number of significant decimal places, as we will explain in detail. A lot of nice and interesting work has been done using these techniques in the case of bosonic integrals, which can be evaluated with extraordinary precision at 1 loop, and with adequate precision at 2 loops. A nonnegligible part of this review is devoted to a detailed discussion of these calculations in the bosonic case.

The focus of this review is on methods rather than on results. In fact, very few numerical results will be reported. The interested reader can find all the useful perturbative results in the references given. Our objective is to provide the computational tools which are needed to carry out this kind of calculation. Technical details will therefore be explained in a pedagogical fashion. Particular
attention will be paid to certain aspects that only occur in lattice computations, and that physicists expert in continuum perturbative calculations might find surprising.

The main objective of this review is to show how perturbation theory works on the lattice in the most common situations. It is hoped that one can learn from the material presented here. A background in continuum quantum field theory is required, and some acquaintance with continuum perturbative calculations in gauge field theories, the derivation of Feynman rules in continuum QCD and the calculation of Feynman diagrams will be assumed. Familiarity with the path integral formalism, with the quantization of field theories by means of the functional integral approach and with the renormalization of continuum quantum field theories is also desirable. The knowledge of elementary facts, such as the renormalization group equations, the running of the strong coupling constant, the β function and asymptotic freedom of QCD, will also be taken for granted.

This review is not homogeneous. I have given more space to the topics that I believe are more interesting and more likely to be of wider use in the future. Many of the choices made and of the examples reported draw from the experience of the author in doing this kind of calculation. To keep this review to a manageable size, not all important topics or contributions will be covered. One thing that will not be discussed in detail is perturbation theory applied to Symanzik improvement, which, although very interesting and useful, would probably require a review in itself, given also the many important results that have been produced. The Schrödinger functional is also introduced only in a very general way. I will not be able to do justice to other topics like numerical perturbation theory or tadpole improvement.

Many interesting subjects had to be left out entirely because of constraints on space. Among the topics which are not covered at all are nonrelativistic theories, heavy quarks, and anisotropic lattices. I have also omitted everything that concerns finite temperature perturbation theory. Many of these things are treated in detail in the reviews and books cited above, where several topics not covered here can also be found. Moreover, we will not occupy ourselves with phenomenological results, but only with how perturbation theory is useful for extracting phenomenological results from the lattice data. In any case, there are by now so many perturbative calculations that have been made in lattice QCD that it would be impossible to include all of them here.

The main reason for the introduction of the lattice was to study QCD in its nonperturbative aspects, like confinement, and we will confine this review to QCD. Although very interesting, spin models, the φ⁴ theory and the Higgs sector, to name a few, will be left out. Even so, lattice QCD is still quite a broad topic by itself, and thus to keep this review to a reasonable size we have been compelled to discuss only the main actions that have been used to study QCD on the lattice. Given the great number of different actions that have been proposed for studying lattice QCD in the past 30 years, it is necessary to limit ourselves to just the few of them that are more widely used. The Wilson and staggered formulations have been the most popular ones in all this time. Recently the particular kind of chiral fermions known as Ginsparg–Wilson (like the overlap, domain wall and fixed-point fermions) have also begun to find broader application, and they present interesting challenges for lattice perturbation theory. These are the actions that we will cover in this paper.

The main features of the lattice construction and lattice perturbation theory itself will be discussed in detail in the context of Wilson fermions. When the other actions are introduced the discussion will be more general, although we will try to point out the peculiarities of perturbative calculations in these particular cases. The explanations of the various lattice actions will be rather sketchy and aimed mainly at the aspects which are interesting from the point of view of perturbation theory.
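As a concrete illustration of the momentum cutoff of order 1/a mentioned at the beginning of this introduction, the following minimal sketch lists the momenta allowed on a periodic one-dimensional lattice and checks that they stay within |p| ≤ π/a. It is not taken from the review: the one-dimensional setting, the function names and the values of a and n_sites are assumptions chosen only for illustration. It also evaluates the eigenvalue (4/a²) sin²(ap/2) of minus the nearest-neighbour lattice Laplacian on a plane wave, which is bounded by 4/a² and approaches p² for ap ≪ 1.

```python
import numpy as np

def lattice_momenta(n_sites, a):
    """Momenta p = 2*pi*k / (n_sites * a) allowed on a periodic 1D lattice.

    np.fft.fftfreq returns k / (n_sites * a) with the integers k chosen
    in the first Brillouin zone, so that |p| <= pi / a.
    """
    return 2.0 * np.pi * np.fft.fftfreq(n_sites, d=a)

def lattice_p_hat_squared(p, a):
    """Eigenvalue of minus the nearest-neighbour lattice Laplacian,
    -[f(x+a) + f(x-a) - 2 f(x)] / a**2, acting on exp(i p x)."""
    return (2.0 / a * np.sin(0.5 * a * p)) ** 2

# Illustrative (hypothetical) values, not taken from the review.
a = 0.1        # lattice spacing
n_sites = 64   # number of lattice sites

p = lattice_momenta(n_sites, a)
max_abs_p = np.max(np.abs(p))                       # largest momentum on the lattice
max_p_hat_sq = np.max(lattice_p_hat_squared(p, a))  # largest Laplacian eigenvalue

# Smallest nonzero momentum: here a*p << 1 and p_hat**2 is close to p**2.
p_min = p[1]
relative_deviation = abs(lattice_p_hat_squared(p_min, a) - p_min**2) / p_min**2

print("pi / a           =", np.pi / a)
print("max |p|          =", max_abs_p)
print("4 / a^2          =", 4.0 / a**2)
print("max p_hat^2      =", max_p_hat_sq)
print("relative deviation at p_min =", relative_deviation)
```

Halving a in this sketch doubles the largest available momentum, which is the sense in which the lattice spacing acts as an ultraviolet cutoff.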
1.1. Outline of the paper The review is divided in three parts: Sections 2 and 3 are a sort of motivation and discuss why perturbation theory is relevant also in a nonperturbative regularization like the lattice, Sections 4–12 introduce the notion of lattice and discuss various possible actions with their Feynman rules. In Sections 13–20, which are more technical, lattice computations are presented and discussed in detail. We begin with two sections which are meant to stress the importance of lattice perturbation theory and explain what is meant for renormalization of operators on the lattice. In Section 4 we de*ne what a Euclidean lattice is, showing what the discretization of a continuum theory means in practice. In Section 5 we discuss the Wilson action (which is not chirally invariant) and explain how to derive its Feynman rules in momentum space. All the Feynman rules necessary for 1-loop calculations are explicitly given. In Section 6 we focus our attention on the relation between chirality and fermionic modes on the lattice, and the problems which arise when one tries to de*ne chiral fermions on a lattice. In a brief interlude we discuss staggered fermions. They have some chiral symmetry and have been the major alternative to Wilson fermions (at least in the *rst two decades of the lattice developments). Ginsparg–Wilson fermions, the long-awaited reconciliation of chirality with the lattice, are introduced in Section 8. We give details about the known solutions of the Ginsparg– Wilson relation, namely the overlap, the domain wall and the *xed-point actions. In Section 9 we explain how, using Ginsparg–Wilson fermions, it is possible to de*ne on the lattice chirally invariant gauge theories, where gauge invariance and chiral symmetry are maintained together to every order of perturbation theory and for any *nite value of the lattice spacing. 
In Section 10 we deal with the approach of the coupling constant and masses to the continuum limit and talk about the $\beta$ function and the $\Lambda$ parameter of the lattice theory. In Section 11 we briefly introduce the Symanzik improvement, including a short discussion about improved pure gauge actions on the lattice. We conclude the first part of the review with a brief presentation of the Schrödinger functional, which has gained a paramount place in the lattice landscape in recent years.
In Section 13 we begin the more technical part of the review, focused on how to actually carry out perturbative computations on the lattice. We introduce at this point the symmetry group of the lattice, the hypercubic group. Since the lattice symmetries are not as restrictive as those of the continuum theory, more mixings arise in general under renormalization, and we discuss some examples of them in Section 14. How to compute Feynman diagrams on the lattice is explained in great detail in Section 15, where we talk about the lattice power counting theorem of Reisz, which is useful for the computation of divergent integrals, and we present, step by step, the complete calculation of the 1-loop renormalization constant of the operator measuring the first moment of the momentum distribution of quarks in hadrons. This example is rather simple (compared to other cases) but contains all the main interesting features one can think of: a logarithmic divergence, the presence of a covariant derivative, symmetrized indices and of course the peculiar use of Kronecker $\delta$-symbols in lattice calculations. Moreover, it is an example of a calculation in which the various propagators and vertices need an expansion in the lattice spacing $a$ (in this case, to first order). Finally, it includes the computation of the quark self-energy, which is quite interesting and useful by itself. Brief discussions about overlap calculations, tadpole improvement and perturbation theory with fat link actions conclude Section 15.
In Section 16 we discuss the use of computer codes for the automated evaluation of lattice Feynman diagrams.
In Section 17 we explain some advanced techniques for the numerical evaluation of lattice integrals coming from Feynman diagrams (using extrapolations to infinite volume), while in Section 18 we introduce an algebraic method for the exact reduction of any Wilson 1-loop integral to a few basic constants. The bosonic case is thoroughly explained, so that the reader can learn how to use it, and some applications to the exact calculations of operator tadpoles are explicitly given. Section 18 ends with a discussion of the main points of the general fermionic case, and the expression of the 1-loop quark self-energy in terms of the basic constants. The basic constants of the algebraic method can be computed with arbitrary precision, as explained in detail in Section 19. The values of the fundamental bosonic constants, $Z_0$ and $Z_1$, are given with a precision of about 400 significant decimal places in Appendix B. In order to be able to calculate them to this precision, we need to introduce the coordinate space method of Lüscher and Weisz, which will also be used for the computation of 2-loop integrals. The 2-loop bosonic integrals are discussed at length, and the general fermionic case is also addressed. In Section 20 we briefly introduce numerical perturbation theory, which is a promising tool for higher-loop calculations. Finally, conclusions are given in Section 21, while Appendix A summarizes some notational conventions.
2. Why lattice perturbation theory
To some readers the words “lattice” and “perturbation theory” might sound like a contradiction, but we will see that this is not the case and that lattice perturbation theory has grown into a large and well-established subject. Although the main reason for introducing the lattice is that it leads to a nonperturbative regularization scheme and as such allows nonperturbative computations, perturbative calculations on the lattice are rather important, and for many reasons.
Perturbation theory of course cannot reveal the full content of the lattice field theory, but it can still give a lot of valuable information. In fact, there are many applications where lattice perturbative calculations are useful and in some cases even necessary. Among them we can mention the determinations of the renormalization factors of matrix elements of operators and of the renormalization of the bare parameters of the Lagrangian, like coupling constants and masses. The precise knowledge of the renormalization of the strong coupling constant is essential for the determination of the $\Lambda$ parameter of lattice QCD (see Section 10) and its relation to its continuum counterpart, $\Lambda_{\mathrm{QCD}}$. In general perturbation theory is of paramount importance in order to establish the connection of lattice matrix elements to the physical continuum theory. Every lattice action defines a different regularization scheme, and thus for each new action that is considered one needs a complete set of these renormalization computations in order for the results of Monte Carlo simulations to be used and understood properly. Moreover, lattice perturbation theory is important for many other aspects, among which we can mention the study of the anomalies on the lattice, the study of the general approach to the continuum limit, including the recovery in the limit $a \to 0$ of the continuum symmetries broken by the lattice regularization (like Lorentz or chiral symmetry), and the scaling violations, i.e., the $O(a^n)$ corrections to the continuum limit. The latter are lattice artifacts that bring in systematic errors in lattice results, which one can try to reduce by means of an “improvement”, as we will see in Section 11. Perturbative calculations are thus in many cases essential, and are the only possibility to have some analytical control over the continuum limit. As we will see in Section 10, the perturbative
region is the one that must necessarily be “traversed” in order to reach the continuum limit. There is a strong connection between lattice perturbation theory and the continuum limit of the discretized versions of QCD. Because of asymptotic freedom, in fact, one has $g_0 \to 0$ as $a \to 0$. We should also point out that one should not underestimate the role played by perturbative calculations in proving the renormalizability of lattice gauge theories. Finally, perturbation theory will also be important for defining chirally invariant gauge theories on the lattice to all orders in the gauge coupling, as we will see in detail in Section 9. The lattice will be proven to be the only regularization that can preserve chirality and gauge invariance at the same time (without destroying basic features like locality and unitarity). We can say in a nutshell that lattice perturbation theory is important for both conceptual and practical reasons. The phenomenological numbers that are quoted from lattice computations are very often the result of the combined effort of numerical simulations and analytic calculations, usually with some input from theory. In principle all known perturbative results of continuum QED and QCD can also be reproduced using a lattice regularization instead of the more popular ones. However, calculating in such a way the correction to the magnetic moment of the muon (to give an example) would be quite laborious. A lattice cutoff would not be the best choice in most cases, for which regularizations like Pauli–Villars or dimensional regularization are better suited and much easier to employ. The main virtue of the lattice regularization is instead to make nonperturbative investigations possible, which usually need some perturbative calculations to be properly interpreted.
As we have already mentioned, the connection of Monte Carlo results for hadronic matrix elements to their corresponding physical numbers, that is the matching with the continuum physical theory, has to be carried out by performing a lattice renormalization. It is in this context that lattice perturbation theory has a wide and useful range of applications, and we will discuss this important aspect of lattice computations in more detail in the next Section. In this respect, perturbative lattice renormalization is important by itself as well as a hint and a guide for the few cases in which one can also determine the renormalization constants nonperturbatively according to the method proposed in Martinelli et al. (1995) (for a recent review see Sommer (2002)). This is even more important when operator mixing is present. In fact, lattice mixing patterns, generally more complex than in the continuum, become in general more transparent when looked at using perturbative renormalization rather than nonperturbatively. We should also add that perturbative coefficients can usually be computed with rather high accuracy. Perturbative renormalization results can also be quite useful in checking and understanding results coming from nonperturbative methods (where available). When short-distance quantities can be calculated using such diverse techniques as lattice perturbation theory and Monte Carlo simulations, their comparison can give significant hints on the validity of perturbative and nonperturbative methods. In some cases a nonperturbative determination of the renormalization constants can turn out to be rather difficult to obtain. For the method to work, it is necessary that there is a plateau for the signal over a substantial range of momenta so that one can numerically extract the values of the renormalization factors. The nonperturbative renormalization methods can sometimes fail because a window which is large enough cannot be found.
Moreover, where mixings are present these methods could turn out to be useless because certain mixings are too small to be seen numerically, although still not so small that they can be ignored altogether. In these cases the only possibility to compute renormalization factors seems to be provided by the use of lattice perturbative methods. An important exception to this is given by the Schrödinger functional scheme, where using a particular procedure known as
Fig. 1. Perturbative and nonperturbative running of the renormalized strong coupling constant from the Schrödinger functional on the lattice, from Capitani et al. (1999c). In this scheme $\Lambda \sim 116$ MeV.
recursive finite-size scaling technique (which we will explain in Section 12) it is possible to carry out precise nonperturbative determinations of renormalized coupling constants, masses and operators for an extremely wide range of momenta. Computations using the Schrödinger functional are however rather more involved than average and usually require larger computational resources. We would also like to point out that in the case of Ginsparg–Wilson fermions the computational effort required to extract nonperturbative renormalization factors (which comes on top of the already substantial effort needed to determine the bare matrix elements) can turn out to be quite large, especially in the case of complicated operators like the ones that measure moments of parton distributions. We can thus say, after having looked at all these different aspects, that lattice perturbation theory is quite important and sometimes irreplaceable. Of course there are always issues concerning its reliability when a 1-loop perturbative correction happens to be large, especially when the corresponding 2-loop calculation looks rather difficult to carry out.$^{1}$ On the other hand, there are cases in which lattice perturbation theory works rather well. As an example we show in Figs. 1 and 2 the scale evolutions of the renormalized strong coupling constant and masses computed in the Schrödinger functional scheme (Capitani et al., 1999c).$^{2}$ We can see that these scale evolutions are accurately described by perturbation theory for a wide range of energies. The perturbative and nonperturbative results are very close to each other, and almost identical even down to energy scales which are surprisingly low. The dashed curve in Fig. 1 and the continuous curve in Fig. 2 are obtained by including the $b_2 g_0^7$ term of the $\beta$ function and the $d_1 g_0^4$ term of the $\tau$ function (that is, the first nonuniversal coefficients). The other curves are lower-order approximations.
In Section 10 more details about these calculations can be found.
1 In this case mean-field improved perturbation theory, using Parisi’s boosted bare coupling, is known to reduce the magnitude of higher-order corrections in many situations (see Section 15.6).
2 An explanation of the way these nonperturbative evolutions are obtained is given in Section 12.
Fig. 2. Perturbative and nonperturbative running of the renormalized masses from the Schrödinger functional on the lattice, from Capitani et al. (1999c). In this scheme $\Lambda \sim 116$ MeV. The renormalization group invariant mass $M$ is defined in Eq. (10.34).
Fig. 3. Perturbative and nonperturbative running of the renormalized strong coupling constant in the $q\bar{q}$ scheme, from Necco and Sommer (2001).
Although the running of the coupling constant and masses in the figures is computed within the Schrödinger functional scheme and its behavior depends on the details of the computational scheme employed, it is interesting to note how close perturbation theory can come to nonperturbative results. The cases presented are particularly instructive, because the corresponding nonperturbative results are among the best that one can obtain at present. The Schrödinger functional coupled to recursive finite-size techniques allows one to control systematic errors quite accurately. Errors in these calculations are smaller than in other schemes and are fully understood. In other situations we cannot really exclude, when we see a discrepancy between nonperturbative and perturbative results, that at least part of this discrepancy originates from the nonperturbative side. Another nice example of the good behavior of lattice perturbation theory is given in Fig. 3, which comes from the work of Necco and Sommer (2001), and Necco (2002a). These authors have
computed the running coupling constant from the static quark force or potential in three different ways, corresponding to three different definitions of the running coupling constant, i.e.,$^{3}$
$$F(r) = \frac{dV}{dr} = C_F \, \frac{\alpha_{q\bar{q}}(1/r)}{r^2}, \tag{2.1}$$
$$V(r) = -C_F \, \frac{\alpha_{\bar{V}}(1/r)}{r}, \tag{2.2}$$
$$\tilde{V}(Q) = -4\pi C_F \, \frac{\alpha_V(Q)}{Q^2}. \tag{2.3}$$
It is seen that in the first case, called the $q\bar{q}$ scheme, the perturbative expansion of the coupling constant is rather well behaved, that is the coefficients of the expansion are small and rapidly decreasing. This is what is shown in Fig. 3, which indicates that compared to nonperturbative numbers perturbative computations can be trusted for coupling constants up to $\alpha_{q\bar{q}} \approx 0.3$. In the other two schemes, however, the coefficients are somewhat larger (especially in the last case), and the corresponding perturbative expansions look worse than in the first case. The perturbative coupling constants in these last two schemes show a more pronounced difference with respect to the nonperturbative results. The three schemes above differ only by kinematics, and the results show that the choice of a scheme can have a big influence on the perturbative behavior of the coupling constant. Discussions about the validity of lattice perturbation theory cannot then be complete until the dependence on the renormalization scheme (besides the dependence on the form of the action used) is also taken into account and investigated. Not all schemes are equally suitable for lattice perturbation theory. In particular, the $q\bar{q}$ scheme is the best one among the three considered above. From this point of view the definition of the coupling constant yielded by the Schrödinger functional scheme is even better behaved than $\alpha_{q\bar{q}}$. In fact, the coefficient $b_2$ of the $\beta$ function (the first nonuniversal coefficient) is the smallest one, thus in this sense the Schrödinger functional scheme is closest to the $\overline{\mathrm{MS}}$ scheme. We would also like to mention the work of Davies et al. (1997), where the QCD coupling constant is extracted from Wilson loops of different size, which shows another instance of good agreement between perturbation theory and simulations. We can thus trust lattice perturbation theory, under the right conditions. The behavior of lattice perturbation theory is probably not worse than that of QCD in the continuum.
The latter is an asymptotic expansion, which in some cases is also affected by large higher-order corrections. We could even say that perturbation theory can be more accurately tested on the lattice than in the continuum, because in a lattice scheme one also has at one's disposal the nonperturbative results to compare perturbation theory with. When lattice perturbation theory and nonperturbative numerical results do not agree, a look at systematic errors coming from the numerical side can sometimes be worthwhile.
3 Other possible ways of defining a strong coupling constant on the lattice are discussed in Weisz (1996).
3. Renormalization of operators
In general, matrix elements of operators computed on the lattice using numerical simulations require a renormalization in order to be converted into meaningful physical quantities. Monte Carlo matrix elements can be considered as (regulated) bare numbers, and to get physical results one needs to perform a lattice renormalization, which is done by matching bare lattice results to some continuum scheme, usually chosen to be the $\overline{\mathrm{MS}}$ scheme of dimensional regularization. In many physical problems one evaluates matrix elements of operators that appear in an operator product expansion. These matrix elements contain the long-distance physics of the system and are computed numerically on the lattice, while the Wilson coefficients contain the short-distance physics and are obtained from perturbative calculations in the continuum. In this case the operators computed on the lattice must at the end be matched to the same continuum scheme in which the Wilson coefficients are known. Therefore one usually chooses to do the matching from the lattice to the $\overline{\mathrm{MS}}$ scheme of dimensional regularization. A typical example is given by the moments of deep inelastic structure functions, and we will illustrate many features of perturbation theory in the course of this review using lattice operators appearing in operator product expansions which are important for the analysis of structure functions. To perform the matching to a continuum scheme one has to look for numbers which connect the bare lattice results to physical continuum renormalized numbers. We will now discuss how the perturbative matching can be done at 1 loop. Some good introductory material on the matching between lattice and continuum and the basic concepts of lattice perturbation theory can be found in Sachrajda (1990), Sharpe (1994), and Sharpe (1995). A short review of the situation of perturbative calculations around 1995 is given in Morningstar (1996).
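It turns out that to extract physical continuum matrix elements from Monte Carlo simulations one needs lattice as well as continuum perturbative calculations. At tree level, for momenta much lower than the lattice cutoff, $\pi/a$, lattice operators have the same matrix elements as the original continuum operators. Then at 1 loop one has
$$\langle q | O_i^{\mathrm{lat}} | q \rangle = \sum_j \left( \delta_{ij} + \frac{g_0^2}{16\pi^2} \left( -\gamma_{ij}^{(0)} \log a^2 p^2 + R_{ij}^{\mathrm{lat}} \right) \right) \cdot \langle q | O_j^{\mathrm{tree}} | q \rangle, \tag{3.1}$$
$$\langle q | O_i^{\overline{\mathrm{MS}}} | q \rangle = \sum_j \left( \delta_{ij} + \frac{g_{\overline{\mathrm{MS}}}^2}{16\pi^2} \left( -\gamma_{ij}^{(0)} \log \frac{p^2}{\mu^2} + R_{ij}^{\overline{\mathrm{MS}}} \right) \right) \cdot \langle q | O_j^{\mathrm{tree}} | q \rangle. \tag{3.2}$$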
The lattice and continuum 1-loop finite constants, $R_{ij}^{\mathrm{lat}}$ and $R_{ij}^{\overline{\mathrm{MS}}}$, in general do not have the same value.$^{4}$ This happens because lattice propagators and vertices, as will be seen in detail later on, are quite different from their continuum counterparts, especially when the loop momentum is of order $1/a$. Therefore the 1-loop renormalization factors on the lattice and in the continuum are in general not equal. As expected however the 1-loop anomalous dimensions are the same.
4 We note that while $R_{ij}^{\mathrm{lat}}$ is the whole momentum-independent 1-loop correction, $R_{ij}^{\overline{\mathrm{MS}}}$ does not include the pole in $\epsilon$ and the factors proportional to $\gamma_E$ and $\log 4\pi$.
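From Eqs. (3.1) and (3.2) the connection between the original lattice numbers and the final continuum physical results is given by
$$\langle q | O_i^{\overline{\mathrm{MS}}} | q \rangle = \sum_j \left( \delta_{ij} - \frac{g_0^2}{16\pi^2} \left( -\gamma_{ij}^{(0)} \log a^2 \mu^2 + R_{ij}^{\mathrm{lat}} - R_{ij}^{\overline{\mathrm{MS}}} \right) \right) \cdot \langle q | O_j^{\mathrm{lat}} | q \rangle. \tag{3.3}$$
The differences $\Delta R_{ij} = R_{ij}^{\mathrm{lat}} - R_{ij}^{\overline{\mathrm{MS}}}$ then enter in the matching factors
$$Z_{ij}(a\mu, g_0) = \delta_{ij} - \frac{g_0^2}{16\pi^2} \left( -\gamma_{ij}^{(0)} \log a^2 \mu^2 + \Delta R_{ij} \right), \tag{3.4}$$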
and represent the main objectives of the perturbative lattice calculations.$^{5}$ Lattice operators have more mixing options than continuum ones, due to the lower symmetry of the lattice theory. There is no Lorentz invariance (as we will see in more detail in Section 13), and in many cases other symmetries, like chiral symmetry, are also broken. Thus, the matching factors are not in general square mixing matrices (that is one has $i \le j$ in Eq. (3.3)). To include all relevant operators one must be able to determine all the tree-level structures which appear when lattice radiative corrections are evaluated. While $R^{\mathrm{lat}}$ and $R^{\overline{\mathrm{MS}}}$ depend on the state $|q\rangle$, $\Delta R$ is independent of it, thus $Z_{ij}$ depends only on $a\mu$. This is as it should be, since the renormalization factors are a property of the operators and are independent of the particular external states considered. This is the reason why we have left the state $|q\rangle$ unspecified. Furthermore, the matching factors between the lattice and the $\overline{\mathrm{MS}}$ scheme are gauge invariant, and this property can be exploited to make important checks of lattice perturbative calculations. At the end of the process we have just explained, having used both lattice and continuum perturbative techniques, the renormalization factor $Z_O(a\mu)$ which converts the lattice operator $O(a)$ into the physical renormalized operator $\hat{O}(\mu)$,
$$\hat{O}(\mu) = Z_O(a\mu) \, O(a), \tag{3.6}$$
5 The coupling constant that appears in Eq. (3.3) is usually chosen to be the lattice one, $g_0$, as advocated in Sachrajda (1990). Of course choosing one coupling constant or the other makes only a 2-loop difference, but these terms could still be numerically important. The validity of this procedure should be checked by looking at the size of higher-order corrections. Unfortunately on the lattice these terms are known only in very few cases and no definite conclusions can then be reached concerning this point. In the work of Ji (1995) a generalization to higher loops was proposed, which gives an exact matching condition to all orders. This is done using the lattice and continuum renormalization group evolutions (see Section 10). Using the lattice evolution, one goes to very high energies, which means very small coupling constants because of asymptotic freedom, and there the matching to $\overline{\mathrm{MS}}$ is done. The procedure is essentially reduced to a tree-level matching. After this, one goes back to the original scale $\mu$, using the continuum renormalization group evolution backwards. For the matching at the scale $\mu$ one then obtains the overall factor
$$\exp\left( -\int_0^{g_{\overline{\mathrm{MS}}}(\mu)} du \, \frac{\gamma^{\overline{\mathrm{MS}}}(u)}{\beta^{\overline{\mathrm{MS}}}(u)} \right) \cdot \exp\left( -\int_{g_0(a)}^{0} dv \, \frac{\gamma^{\mathrm{lat}}(v)}{\beta^{\mathrm{lat}}(v)} \right), \tag{3.5}$$
where the $\gamma$ function governs the evolution of the renormalized operator. This formula uses the high-order coefficients of the $\beta$ and $\gamma$ functions. This approach has been used in Gupta et al. (1997), where these issues are discussed.
is obtained. The reader should always keep in mind that in this way one does the matching of the bare Monte Carlo results (obtained using a lattice regulator) directly to the physical renormalized results in the $\overline{\mathrm{MS}}$ scheme. As for any general quantum field theory, the process at the end of which physical numbers are determined is accomplished in two steps. First one regularizes the ultraviolet divergences, and in this case the regulator is given by the lattice itself. Then one renormalizes the regulated theory, and on the lattice this results in a matching to a continuum scheme. At this point the lattice cutoff must be removed, which means that one has to go to the continuum limit $a \to 0$ of the lattice theory, keeping some suitable quantity fixed. What remains after all this is only the scale $\mu$ brought in by the renormalization. In our case the scale $\mu$ at which the matrix elements are renormalized should be in the range
$$\Lambda_{\mathrm{QCD}} < \mu < \frac{\pi}{a}. \tag{3.7}$$
The lower bound ensures that perturbation theory is valid, while the upper bound ensures that cutoff effects, proportional to positive powers of the lattice spacing, are small. If one sets
$$\mu = \frac{1}{a}, \tag{3.8}$$
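since the 1-loop anomalous dimensions are the same on the lattice and in the continuum only a finite renormalization connects the lattice to the $\overline{\mathrm{MS}}$ scheme:
$$\langle q | O_i^{\overline{\mathrm{MS}}} | q \rangle = \sum_j \left( \delta_{ij} - \frac{g_0^2}{16\pi^2} \left( R_{ij}^{\mathrm{lat}} - R_{ij}^{\overline{\mathrm{MS}}} \right) \right) \cdot \langle q | O_j^{\mathrm{lat}} | q \rangle. \tag{3.9}$$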
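As an illustration of how the matching factor of Eq. (3.4) is used in practice, the following minimal sketch evaluates it for a single operator without mixing. The function name and all numerical inputs are hypothetical placeholders chosen only for this example; they are not results quoted in this review. The sketch shows that for $a\mu = 1$ the logarithm drops out and only the finite constant $\Delta R$ survives, as in Eq. (3.9).

```python
import math

def matching_factor(g0_sq, gamma0, delta_r, a_mu):
    """1-loop matching factor for one operator with no mixing:
    Z = 1 - g0^2/(16 pi^2) * (-gamma0 * log(a^2 mu^2) + delta_r)."""
    log_term = -gamma0 * math.log(a_mu ** 2)
    return 1.0 - g0_sq / (16.0 * math.pi ** 2) * (log_term + delta_r)

# Hypothetical inputs, for illustration only.
g0_sq = 1.0     # bare coupling squared
gamma0 = 2.0    # assumed 1-loop anomalous dimension coefficient
delta_r = 5.0   # assumed finite constant R^lat - R^MSbar

# At a*mu = 1 the logarithm vanishes and only delta_r contributes.
z_at_cutoff = matching_factor(g0_sq, gamma0, delta_r, a_mu=1.0)

# At a lower scale the logarithm adds to the finite part.
z_below = matching_factor(g0_sq, gamma0, delta_r, a_mu=0.5)

print(z_at_cutoff, z_below)
```

In the presence of mixing the same expression would hold entry by entry for the matrix $Z_{ij}$, with $\gamma_{ij}^{(0)}$ and $\Delta R_{ij}$ replacing the two scalar inputs.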
Every lattice action defines a different regularization scheme, and therefore these finite renormalization factors are in principle different for different actions. The bare numbers, that is the Monte Carlo results for a given matrix element, are also different, and everything adjusts to give the same physical result. We conclude by mentioning that Sharpe (1994) has observed that when the operators come from an operator product expansion one should multiply the 1-loop matching factors introduced above by the 2-loop Wilson coefficients, in order to be consistent. This can be seen by looking at the 2-loop renormalization group evolution for the Wilson coefficients,
$$c(\mu_1) = c(\mu_2) \left( \frac{g_0^2(\mu_1)}{g_0^2(\mu_2)} \right)^{-\gamma^{(0)}/2\beta_0} \left( 1 + \frac{g_0^2(\mu_2) - g_0^2(\mu_1)}{16\pi^2} \left( \frac{\gamma^{(1)}}{2\beta_0} - \frac{\gamma^{(0)} \beta_1}{2\beta_0^2} \right) + O(g_0^4) \right). \tag{3.10}$$
The term proportional to $g_0^2(\mu_2) - g_0^2(\mu_1)$ is analogous to a 1-loop matching factor, but contains $\beta_1$ and $\gamma^{(1)}$, which are 2-loop coefficients. So, it is only by combining 1-loop renormalization matching with 2-loop Wilson coefficients that one is doing calculations in a consistent way. In this section we have learned that we need perturbative lattice calculations in order to extract physical numbers from Monte Carlo simulations of matrix elements of operators (unless one opts
for nonperturbative renormalization, when this is possible). We will try to explain how to perform this kind of calculation in the rest of the review.
4. Discretization
Lattice calculations are done in Euclidean space. A new time coordinate is introduced through a Wick rotation from Minkowski space to imaginary (Euclidean) times:
$$x_0^E = i x_0^M. \tag{4.1}$$
In momentum space this corresponds to $k_0^E = -i k_0^M$, so that the Fourier transforms in Euclidean space are defined by the same phase factor. The reason for working with imaginary times is that the imaginary unit in front of the Minkowski-space action becomes a minus sign in the Euclidean functional integral,
$$e^{i S_M} \to e^{-S_E}, \tag{4.2}$$
and the lattice field theory in Euclidean space acquires many analogies with a statistical system. The path integral of the particular quantum field theory under study becomes the partition function of a statistical system. The transition to imaginary times brings a close connection between field theory and statistical physics which has many interesting facets. In particular, when the Euclidean action is real and bounded from below one can see the functional integral as a probability measure weighted by a Boltzmann-like distribution $e^{-S_E}$. It is this feature that allows Monte Carlo methods to be used.$^{6}$ Furthermore, on a Euclidean lattice of finite volume the path integral is naturally well defined, since the measure contains only a finite number of variables and the exponential factor gives an absolutely convergent multi-dimensional integral. One can then generate configurations with the appropriate probability distribution by sampling the field configuration space with Monte Carlo techniques. This is the practical basis of Monte Carlo simulations. From now on we will work in Euclidean space in four dimensions, with metric (1,1,1,1), and we will drop all Euclidean subscripts from lattice quantities, so that $x_0$ is for example the time component after the Wick rotation. The Dirac matrices in Euclidean space satisfy an anticommutation relation with $g_{\mu\nu}$ replaced by $\delta_{\mu\nu}$:
$$\{ \gamma_\mu, \gamma_\nu \} = 2 \delta_{\mu\nu}, \tag{4.3}$$
and they are all hermitian:
$$(\gamma_\mu)^\dagger = \gamma_\mu. \tag{4.4}$$
The Euclidean $\gamma_5$ matrix is defined by
$$\gamma_5 = \gamma_0 \gamma_1 \gamma_2 \gamma_3; \tag{4.5}$$
6 However, when the action is complex, as in the case of QCD with a finite baryon number density, this is not possible. It is this circumstance that has hampered progress in the lattice studies of finite density QCD.
Fig. 4. A two-dimensional projection of a lattice. A site, a link and a closed loop are also shown.
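it is also hermitian, and satisfies $(\gamma_5)^2 = 1$. The relation between Dirac matrices in Minkowski and Euclidean space is
$$\gamma_0^E = \gamma_0^M, \qquad \gamma_i^E = -i \gamma_i^M. \tag{4.6}$$
This can be inferred from the kinetic term of the Dirac action in the functional integral:
$$\exp\{ i \bar{\psi} \gamma_\mu^M \partial_\mu^M \psi \} \to \exp\{ -\bar{\psi} \gamma_\mu^E \partial_\mu^E \psi \}. \tag{4.7}$$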
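The properties of the Euclidean Dirac matrices listed above can be checked numerically. The sketch below assumes one common chiral representation, $\gamma_0$ with unit off-diagonal blocks and $\gamma_i$ with off-diagonal blocks $\mp i\sigma_i$; this choice is an assumption made for illustration and need not coincide with the convention of Appendix A. All helper names are hypothetical. The checks are the anticommutation relation, hermiticity, and the properties of $\gamma_5 = \gamma_0\gamma_1\gamma_2\gamma_3$.

```python
def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dagger(a):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[a[j][i].conjugate() for j in range(4)] for i in range(4)]

def off_diagonal(upper_right, lower_left):
    """4x4 matrix built from two 2x2 off-diagonal blocks."""
    m = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            m[i][j + 2] = complex(upper_right[i][j])
            m[i + 2][j] = complex(lower_left[i][j])
    return m

def scaled(c, s):
    """Multiply every entry of a matrix by the number c."""
    return [[c * x for x in row] for row in s]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))

IDENTITY2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY4 = [[complex(i == j) for j in range(4)] for i in range(4)]

# Assumed chiral representation (illustrative, not taken from Appendix A).
gamma = [off_diagonal(IDENTITY2, IDENTITY2)]
gamma += [off_diagonal(scaled(-1j, s), scaled(1j, s)) for s in PAULI]

# gamma_5 = gamma_0 gamma_1 gamma_2 gamma_3
gamma5 = matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3]))

clifford_ok = all(
    close(anticommutator(gamma[mu], gamma[nu]),
          scaled(2.0 if mu == nu else 0.0, IDENTITY4))
    for mu in range(4) for nu in range(4)
)
hermitian_ok = all(close(dagger(g), g) for g in gamma)
gamma5_ok = close(matmul(gamma5, gamma5), IDENTITY4) and close(dagger(gamma5), gamma5)
print(clifford_ok, hermitian_ok, gamma5_ok)
```

Any other representation related to this one by a unitary change of basis would pass the same checks, since they only test the algebra and hermiticity.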
The explicit Euclidean Dirac matrices in the chiral representation are given in Appendix A. We want to construct field theories on a hypercubic lattice. This is a discrete subset of the Euclidean spacetime, where the sites are denoted by $x_\mu = a n_\mu$ (with $n_\mu$ integers). We will work in this review only with hypercubic lattices, where the lattice spacing is the same in all directions. A two-dimensional projection of such a (finite) lattice is given in Fig. 4. For convenience we will sometimes omit the lattice spacing $a$, that is we will use units in which $a = 1$. The missing factors of $a$ can always be reinstated by naive dimensional counting. In going from continuum to lattice actions one replaces integrals with sums,
$$\int d^4x \to a^4 \sum_x, \tag{4.8}$$
where on the right-hand side $x$ now means sites: $x = an$.$^{7}$ Lattice actions are then written in terms of sums over lattice sites. The distance between neighboring sites is $a$, and this minimum distance induces a cutoff on the momentum space modes, so that $a$ acts as an ultraviolet regulator. The range of momenta is thus restricted to an interval of range $2\pi/a$, called the first Brillouin zone, which can be chosen to be
$$\mathrm{BZ} = \left\{ k : -\frac{\pi}{a} < k_\mu \le \frac{\pi}{a} \right\}. \tag{4.9}$$
7 We use in general the same symbols for continuum and lattice quantities when no confusion can arise, except for lattice derivatives. For them we will use special symbols.
BZ is the region of the allowed values of $k$, and is the domain of integration in Fourier space. For a lattice of finite volume $V = L_0 L_1 L_2 L_3$ the allowed momenta in the first Brillouin zone become a discrete set, given by
$$(k_n)_\mu = \frac{2\pi}{a} \frac{n_\mu}{L_\mu}, \qquad n_\mu = -L_\mu/2 + 1, \ldots, 0, 1, \ldots, L_\mu/2, \tag{4.10}$$
and so in principle one deals with sums also in momentum space. However, in infinite volume the sums over the modes of the first Brillouin zone become integrals:
$$\frac{1}{V} \sum_k \to \int_{-\pi/a}^{\pi/a} \frac{dk_0}{2\pi} \int_{-\pi/a}^{\pi/a} \frac{dk_1}{2\pi} \int_{-\pi/a}^{\pi/a} \frac{dk_2}{2\pi} \int_{-\pi/a}^{\pi/a} \frac{dk_3}{2\pi}. \tag{4.11}$$
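The discrete momenta of Eq. (4.10) and the replacement of mode sums by integrals in Eq. (4.11) can be made concrete in one dimension. The following is a minimal sketch, assuming an even number of sites $L$ and units with $a = 1$; the function name and the test function $4\sin^2(k/2)$ are choices made only for this illustration. For this particular function the average over the allowed modes reproduces the Brillouin-zone integral $\int_{-\pi}^{\pi} \frac{dk}{2\pi}\, 4\sin^2(k/2) = 2$ exactly, already at finite $L$.

```python
import math

def allowed_momenta(L, a=1.0):
    """Momenta k_n = (2 pi / a) * n / L with n = -L/2 + 1, ..., L/2.

    This sketch assumes an even number of sites L in the given direction.
    """
    if L % 2 != 0:
        raise ValueError("this sketch assumes an even number of sites")
    return [2.0 * math.pi * n / (a * L) for n in range(-L // 2 + 1, L // 2 + 1)]

# One direction of an 8-site lattice, in units a = 1.
ks = allowed_momenta(8)

# All momenta lie in the first Brillouin zone -pi/a < k <= pi/a.
in_zone = all(-math.pi < k <= math.pi for k in ks)

# (1/L) sum_k f(k) for f(k) = 4 sin^2(k/2); the infinite-volume
# integral over the Brillouin zone of the same function equals 2.
mode_average = sum(4.0 * math.sin(k / 2.0) ** 2 for k in ks) / len(ks)

print(len(ks), in_zone, mode_average)
```

For a generic function the finite-$L$ average differs from the integral by finite-volume corrections, which is the reason extrapolations to infinite volume are needed in numerical evaluations.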
The one-sided forward and backward lattice derivatives (also known as right and left derivatives) can be written as
$$ \nabla_\mu \psi(x) = \frac{\psi(x + a\hat\mu) - \psi(x)}{a} , \tag{4.12} $$
$$ \nabla^\star_\mu \psi(x) = \frac{\psi(x) - \psi(x - a\hat\mu)}{a} , \tag{4.13} $$
where $\hat\mu$ denotes the unit vector in the $\mu$ direction. It is easy to check that
$$ (\nabla_\mu)^\dagger = -\nabla^\star_\mu , \tag{4.14} $$
$$ (\nabla^\star_\mu)^\dagger = -\nabla_\mu , \tag{4.15} $$
that is, they are anti-conjugate to each other. Therefore in a lattice theory that is supposed to have a hermitian Hamiltonian only their sum, $\nabla_\mu + \nabla^\star_\mu$, which is anti-hermitian, can be present. It acts as a lattice derivative operator extending over two lattice spacings:
$$ \frac{1}{2}(\nabla_\mu + \nabla^\star_\mu)\,\psi(x) = \frac{\psi(x + a\hat\mu) - \psi(x - a\hat\mu)}{2a} . \tag{4.16} $$
Note that the second-order differential operator $\nabla_\mu\nabla^\star_\mu = \nabla^\star_\mu\nabla_\mu$ is hermitian, and when $\mu$ is summed corresponds to the four-dimensional lattice Laplacian,
$$ \Delta\psi(x) = \sum_\mu \nabla^\star_\mu\nabla_\mu\,\psi(x) = \sum_\mu \frac{\psi(x + a\hat\mu) + \psi(x - a\hat\mu) - 2\psi(x)}{a^2} . \tag{4.17} $$
It is also useful to recall the lattice integration by parts formula
$$ \sum_x (\nabla_\mu f(x))\,g(x) = -\sum_x f(x)\,(\nabla^\star_\mu g(x)) , \tag{4.18} $$
that is,
$$ \sum_x \big( f(x + a\hat\mu)g(x) - f(x)g(x) \big) = \sum_x \big( f(x)g(x - a\hat\mu) - f(x)g(x) \big) , \tag{4.19} $$
which is valid for an infinite lattice, and also for a finite one if $f$ and $g$ are periodic (or their support is smaller than the lattice). The formula above amounts to a shift in the summation variable.
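The relations above are easy to check numerically. The sketch below is a one-dimensional illustration with periodic boundary conditions (an assumption made only to keep the example short): it implements the forward and backward derivatives with `numpy.roll` and tests the integration by parts formula (4.18) and the identity $\nabla^\star\nabla = \Delta$ of Eq. (4.17). The function names are hypothetical.

```python
import numpy as np

def forward(f, a=1.0):
    """Forward derivative (4.12) on a periodic one-dimensional lattice."""
    return (np.roll(f, -1) - f) / a

def backward(f, a=1.0):
    """Backward derivative (4.13) on a periodic one-dimensional lattice."""
    return (f - np.roll(f, 1)) / a

def laplacian(f, a=1.0):
    """One-dimensional version of the lattice Laplacian (4.17)."""
    return (np.roll(f, -1) + np.roll(f, 1) - 2.0 * f) / a**2

def integration_by_parts_gap(f, g, a=1.0):
    """Left-hand side minus right-hand side of Eq. (4.18)."""
    return np.sum(forward(f, a) * g) + np.sum(f * backward(g, a))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f, g = rng.normal(size=16), rng.normal(size=16)
    # Vanishes up to rounding errors for periodic f and g
    print(abs(integration_by_parts_gap(f, g, a=0.1)))
    # Backward derivative of the forward derivative is the Laplacian
    print(np.allclose(backward(forward(f, 0.1), 0.1), laplacian(f, 0.1)))
```

The cancellation in `integration_by_parts_gap` is exactly the shift of the summation variable mentioned in the text: `np.roll` relabels the sites without changing the sum.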
There is in general some freedom in the construction of lattice actions. For the discretization of continuum actions and operators, and for the practical setting of the corresponding lattice theory, many choices are possible. Since the lattice symmetries are less restrictive than the continuum ones, there is more than one possibility in formulating a gauge theory starting from a given continuum gauge theory. In particular, one has quite a number of choices for the precise form of the QCD action on the lattice, depending on the particular features of interest. There is no lattice action that is optimal in all cases, and each action has some advantages and disadvantages which weigh differently in different contexts. This means that choosing one action instead of another depends on whether chiral symmetry, flavor symmetry, locality, or unitarity are more or less relevant to the physical problem under study. There is a special emphasis on symmetry properties. One also has to balance costs and gains from the point of view of computational effort and perturbation theory. All lattice actions which fall in the same universality class are supposed to have the same naive continuum limit, and each of them yields a possible regularization of the same physical theory. As we said, since every lattice action defines a different regularization scheme, one needs for each action that is used a new complete set of renormalization computations of the type discussed in Section 3, in order for the results which come out of Monte Carlo simulations to be used, interpreted and understood properly. Using different actions leads to different numerical results for the matrix elements computed in Monte Carlo simulations, and the values of the renormalization factors, and of the $\Lambda$ parameter, also depend in general on the lattice action chosen. Even the number and type of counterterms required in the renormalization of operators can be different in each case.
For example, for the renormalization of a weak operator more counterterms need to be computed when the Wilson action is used than when the overlap action is used, because chiral invariance is not broken in the second case. Of course all the differences that are seen at finite lattice spacing will disappear in the final extrapolations to the continuum limit, which must lead, within errors, to the same physical results. In the case of QCD there seems to be a lot of room in choosing an action for the fermions, although the pure gauge action also has some popular variants (but the plaquette, or Wilson, action has a clear predominance over all others, except in particular situations where, for example, improved gauge actions may be more convenient). Perturbation theory will first be introduced in the context of the standard Wilson formulation, which is one of the most widely used in applications. Then a few other fermionic actions will be discussed along the way, pointing out the differences in the structure of perturbation theory with respect to the Wilson fermion case.
5. Wilson's formulation of lattice QCD

One of the most popular lattice formulations of QCD is the one invented by Wilson (1974, 1977), which was also the first formulation ever of a lattice gauge theory. Its remarkable feature is that it maintains exact gauge invariance at any nonzero value of the lattice spacing. The discretization of the (Euclidean) QCD action for one quark flavor
$$ S = \int d^4x \left[ \bar\psi(x)\,(\slashed{D} + m_0)\,\psi(x) + \frac{1}{2}\,\mathrm{Tr}\,[F_{\mu\nu}(x)F_{\mu\nu}(x)] \right] \tag{5.1} $$
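that Wilson proposed is the following:
$$ S_W = S_W^f + S_W^g , $$
$$ S_W^f = a^4 \sum_x \left[ -\frac{1}{2a} \sum_\mu \left[ \bar\psi(x)(r - \gamma_\mu)U_\mu(x)\psi(x + a\hat\mu) + \bar\psi(x + a\hat\mu)(r + \gamma_\mu)U_\mu^\dagger(x)\psi(x) \right] + \bar\psi(x)\left( m_0 + \frac{4r}{a} \right)\psi(x) \right] $$
$$ \phantom{S_W^f} = a^4 \sum_x \bar\psi(x) \left[ \frac{1}{2}\sum_\mu \left( \gamma_\mu(\tilde\nabla^\star_\mu + \tilde\nabla_\mu) - a r\,\tilde\nabla^\star_\mu\tilde\nabla_\mu \right) + m_0 \right] \psi(x) , $$
$$ S_W^g = \frac{1}{g_0^2}\, a^4 \sum_{x,\mu\nu} \left[ N_c - \mathrm{Re\,Tr}\left[ U_\mu(x)U_\nu(x + a\hat\mu)U_\mu^\dagger(x + a\hat\nu)U_\nu^\dagger(x) \right] \right] , \tag{5.2} $$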
where $x = an$ and $0 < r \le 1$. We have also introduced the lattice covariant derivative $\tilde\nabla_\mu\psi(x) = (U_\mu(x)\psi(x + a\hat\mu) - \psi(x))/a$. This action has only nearest-neighbor interactions.$^{8}$ The first-order derivative in the Dirac operator is the symmetric one, given by Eq. (4.16) in the free case (after an integration by parts). The fields $U_\mu(x)$ live on the links which connect two neighboring lattice sites, and these variables are naturally defined in the middle point of a link. Each link carries a direction, so that
$$ U_\mu^{-1}(x) = U_\mu^\dagger(x) = U_{-\mu}(x + a\hat\mu) . \tag{5.3} $$
Link variables are unitary matrices that do not depend linearly on the gauge potential $A_\mu(x)$. The reason is that they belong to the group $SU(N_c)$ rather than to the corresponding Lie algebra, as is the case in the continuum. The relation of the $U_\mu(x)$ matrices to the gauge fields $A_\mu(x)$, the variables which have a direct correspondence with the continuum, is then given by
$$ U_\mu(x) = e^{\,i g_0 a T^a A^a_\mu(x)} \qquad (a = 1, \ldots, N_c^2 - 1) , \tag{5.4} $$
where the $T^a$ are the $SU(N_c)$ generators in the fundamental representation. The Wilson action possesses exact local gauge invariance on the lattice at any finite $a$. The gauge-invariant construction is done directly on the lattice, as an extension of a discretized version of the free continuum fermionic action. It is therefore not a trivial straightforward discretization of the whole gauge-invariant continuum QCD action, where gauge invariance would be recovered only in the continuum limit. A naive lattice discretization of the minimal substitution rule $\partial_\mu \to D_\mu$ would in fact result in an action that violates gauge invariance on the lattice, whereas with the choice made by Wilson gauge invariance is kept as a symmetry of the theory for any $a$. It is this requirement that causes the group variables $U_\mu$ to appear in the action instead of the algebra variables $A_\mu$.

$^{8}$ Other actions can have more complicated interactions, like for example overlap fermions (see Section 8).

The lattice gauge transformations are
$$ U_\mu(x) \to \Omega(x)\,U_\mu(x)\,\Omega^{-1}(x + a\hat\mu) , \qquad \psi(x) \to \Omega(x)\,\psi(x) , \qquad \bar\psi(x) \to \bar\psi(x)\,\Omega^{-1}(x) , \tag{5.5} $$
Fig. 5. The plaquette.
with $\Omega \in SU(N_c)$, and it is easy to see that they leave the quark–gluon interaction term in the Wilson action invariant. Note that also in the lattice theory the local character of the invariance is maintained. This form of local gauge invariance imposes strong constraints on the form of the gauge field-strength tensor $F_{\mu\nu}$. Given the above formula for the lattice gauge transformations of the $U_\mu$'s, it is easy to see that the simplest gauge-invariant object that one can build from the link variables involves a path-ordered closed product of links. Indeed, one obtains a gauge-invariant quantity by taking the trace of the product of $U_\mu$'s on links forming a closed path, thanks to the unitarity of the $U_\mu$'s and the cyclic property of the trace. The physical theory is a local one, and so in constructing the pure gauge action we should direct our attention toward small loops. The simplest lattice approximation of $F_{\mu\nu}$ is then the product of the links of an elementary square, called "plaquette":
$$ P_{\mu\nu}(x) = U_\mu(x)\,U_\nu(x + a\hat\mu)\,U_\mu^\dagger(x + a\hat\nu)\,U_\nu^\dagger(x) , \tag{5.6} $$
shown in Fig. 5. This form is not surprising, given that the gauge field-strength tensor is in differential geometry the curvature of the metric tensor. One could also take larger closed loops, but this minimal choice gives better signal-to-noise ratios, and for the standard Wilson action the trace of the plaquette is then used.$^{9}$ This is the expression appearing in the last line of Eq. (5.2). The factor $N_c$ can be understood by looking at the formal expansion of the plaquette Eq. (5.6) in powers of $a$, which reads
$$ P_{\mu\nu}(x) = 1 + i g_0 a^2 F_{\mu\nu}(x) - \frac{1}{2} g_0^2 a^4 F_{\mu\nu}^2(x) + i a^3 G_{\mu\nu}(x) + i a^4 H_{\mu\nu}(x) , \tag{5.7} $$
with $G$ and $H$ hermitian fields.$^{10}$

$^{9}$ Other actions which use different approximations for $F_{\mu\nu}$, with the aim of reducing the discretization errors, are discussed in Section 11.2.

$^{10}$ This expansion can be derived by using
$$ A_\mu(x + a\hat\nu) = A_\mu(x) + a\,\partial_\nu A_\mu(x) + \cdots . \tag{5.8} $$

We have then
$$ \mathrm{Re\,Tr}\,P_{\mu\nu}(x) = N_c - \frac{1}{2} g_0^2 a^4\,\mathrm{Tr}\,F_{\mu\nu}^2(x) + O(a^6) , \tag{5.9} $$
where we have used $\mathrm{Tr}\,F_{\mu\nu} = 0$, because the trace of the $SU(N_c)$ generators is zero. The plaquette action then has the right continuum limit, and the first corrections to the continuum pure gauge action are of order $a^2$. Technically they are called "irrelevant", but they are important for determining the rate of convergence to the continuum physics. It can also be shown that in the fermionic part of the action the corrections with respect to the continuum limit are of order $a$. In Section 11 we will see how to modify the fermion action in order to decrease the error on the fermionic part to order $a^2$. The plaquette action is also often written as
$$ \beta \cdot a^4 \sum_P \left( 1 - \frac{1}{N_c}\,\mathrm{Re\,Tr}\,U_P \right) , \tag{5.10} $$
where $U_P$ is given in Eq. (5.6), and in numerical simulations of lattice QCD the coefficient in front of the action is
$$ \beta = \frac{2N_c}{g_0^2} = \frac{6}{g_0^2} , \qquad N_c = 3 . \tag{5.11} $$
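Two properties of the plaquette used above can be checked numerically on a single elementary square: the invariance of $\mathrm{Tr}\,P_{\mu\nu}$ under the gauge transformations (5.5), and the leading behaviour $N_c - \mathrm{Re\,Tr}\,P_{\mu\nu} \simeq \frac{1}{2} g_0^2 a^4\,\mathrm{Tr}\,F_{\mu\nu}^2$ of Eq. (5.9). The sketch below is only an illustration under stated assumptions: the fields are taken constant in $x$, so that $F_{\mu\nu}$ reduces to $i g_0 [A_\mu, A_\nu]$ (the overall sign convention is an assumption, and only $\mathrm{Tr}\,F^2$ enters), and all function names are hypothetical.

```python
import numpy as np

def random_traceless_hermitian(n, rng):
    """Random element of the su(n) algebra (times -i), with unit spectral norm."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (m + m.conj().T) / 2.0
    h = h - np.trace(h).real / n * np.eye(n)
    return h / np.linalg.norm(h, 2)

def exp_i(h):
    """exp(i h) for hermitian h, via the eigen-decomposition of h."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * w)) @ v.conj().T

def plaquette(u1, u2, u3, u4):
    """U_mu(x) U_nu(x+mu) U_mu(x+nu)^dagger U_nu(x)^dagger, Eq. (5.6)."""
    return u1 @ u2 @ u3.conj().T @ u4.conj().T

def gauge_transform_links(links, omegas):
    """Apply Eq. (5.5) to the four links (same order as in `plaquette`).
    omegas = (Omega(x), Omega(x+mu), Omega(x+nu), Omega(x+mu+nu))."""
    u1, u2, u3, u4 = links
    o0, o1, o2, o3 = omegas
    dag = lambda m: m.conj().T
    return (o0 @ u1 @ dag(o1), o1 @ u2 @ dag(o3),
            o2 @ u3 @ dag(o3), o0 @ u4 @ dag(o2))

def constant_field_ratio(a, g0, A_mu, A_nu):
    """(N_c - Re Tr P) / (g0^2 a^4 Tr F^2 / 2) for x-independent fields,
    with F = i g0 [A_mu, A_nu] (assumed convention; no derivative terms)."""
    u_mu, u_nu = exp_i(g0 * a * A_mu), exp_i(g0 * a * A_nu)
    p = plaquette(u_mu, u_nu, u_mu, u_nu)
    f = 1j * g0 * (A_mu @ A_nu - A_nu @ A_mu)
    n = A_mu.shape[0]
    return (n - np.trace(p).real) / (0.5 * g0**2 * a**4 * np.trace(f @ f).real)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    links = tuple(exp_i(random_traceless_hermitian(3, rng)) for _ in range(4))
    omegas = tuple(exp_i(random_traceless_hermitian(3, rng)) for _ in range(4))
    before = np.trace(plaquette(*links))
    after = np.trace(plaquette(*gauge_transform_links(links, omegas)))
    print(np.isclose(before, after))                 # gauge invariance of Tr P
    A_mu, A_nu = (random_traceless_hermitian(3, rng) for _ in range(2))
    for a in (0.08, 0.04, 0.02):
        print(a, constant_field_ratio(a, 1.0, A_mu, A_nu))   # approaches 1
```

For constant fields the deviation of the ratio from one decreases roughly like $a^2$, in line with the $O(a^6)$ remainder in Eq. (5.9).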
The factor two comes out because here one takes the sum over the oriented plaquettes, that is a sum over ordered indices (for example, $\mu > \nu$), while in Eq. (5.2) the sum over $\mu$ and $\nu$ is free. In the weak coupling regime, where $g_0$ is small, the functional integral is dominated by the configurations which are near the trivial field configuration $U_\mu(x) = 1$. Perturbation theory is then a saddle-point expansion around the classical vacuum configurations, where the relevant degrees of freedom are given by the components of the gauge potential, $A^a_\mu(x)$. Thus, while the fundamental gauge variables for the Monte Carlo simulations are the $U_\mu$'s and the action is relatively simple when expressed in terms of these variables, in perturbation theory the true dynamical variables are the $A_\mu$'s. This mismatch is responsible for the complications of lattice perturbation theory. In fact, when the Wilson action is written in terms of the $A_\mu$'s, using $U_\mu = 1 + i g_0 a A_\mu - \frac{1}{2} g_0^2 a^2 A_\mu^2 + \cdots$, it becomes very complicated. Moreover, it consists of an infinite number of terms, which give rise to an infinite number of different interaction vertices. Fortunately, only a finite number of vertices is needed to any given order in $g_0$. All vertices, except a few, are "irrelevant", that is they are proportional to some positive power of the lattice spacing $a$ and so they vanish in the naive continuum limit. However, this does not mean that they can be thrown away when doing perturbation theory. Quite the contrary, they usually contribute to correlation functions in the continuum limit through divergent ($\sim 1/a^n$) loop corrections. These irrelevant vertices are indeed important in many cases, contributing to mass, coupling constant and wave-function renormalizations (Sharatchandra, 1978). All these vertices are in fact necessary to ensure the gauge invariance of the physical amplitudes. Only when they are included can gauge-invariant Ward Identities be constructed, and the renormalizability of the lattice theory proven.
An example of this fact is given by the diagrams contributing to the 1-loop gluon self-energy (Fig. 6). If one only considers the diagrams on the upper row, that is the ones that would also exist in the continuum, the lattice results would contain an unphysical $1/(am)^2$ divergence.
Fig. 6. Diagrams for the self-energy of the gluon on the lattice. The diagrams on the upper row have a continuum analog, while the diagrams on the lower row are a pure lattice artifact. They are however necessary to maintain the gauge invariance of the lattice theory, and are important for its renormalizability.
This divergence is exactly canceled when the diagrams of the lower row are added, that is only when gauge invariance is fully restored. Notice that for this to happen also a contribution coming from the measure is needed (see Section 5.2.1). In a similar way, terms of the type $p_\mu^2\,\delta_{\mu\nu}$, which are not Lorentz covariant and are often present in individual diagrams, disappear after all diagrams have been considered and summed. From what we have seen so far, we understand that a lattice regularization does not just amount to introducing a momentum cutoff in the theory. In fact, it is a far more complicated regularization than just introducing a cutoff, because one also has to provide a gauge-invariant regularized action. Different discretizations of the same continuum action define different lattice regularizations. Lattice Feynman rules are much more complicated than in the continuum, and as we said new interaction vertices appear which have no analog in the continuum. The structure of lattice integrals is also completely different, due to the overall periodicity which causes the appearance of trigonometric functions. The lattice integrands are then rational functions of trigonometric expressions. Thus, lattice perturbation theory is much more complicated than continuum perturbation theory: there are more fundamental vertices and more diagrams, and propagators and vertices are more complicated than they are in the continuum. All this leads to expressions containing a huge number of terms. Finally, one also has to evaluate more complicated integrals. Lattice perturbative calculations are thus rather involved. As a consequence, for the calculation of all but the simplest matrix elements the help of computers is almost unavoidable (see Section 16). Matrix elements computed in Euclidean space do not always correspond to the analytic continuation of matrix elements of a physical theory in Minkowski space.
For this to happen, the lattice action has to satisfy a property known as reflection positivity, which involves time reflections and complex conjugations (roughly speaking, it is the analog of hermitian conjugation in Minkowski space). In this case the reconstruction theorem of Osterwalder and Schrader (1973, 1975) says that it is possible to reconstruct a Hilbert space in Minkowski space in the usual way starting from the lattice theory. The Wilson action with $r = 1$ is reflection positive, and therefore corresponds to a well-defined physical theory in Minkowski space (Lüscher, 1977; Creutz, 1987). For $r \neq 1$ instead the lattice theory contains additional time doublers, which disappear in the continuum limit.$^{11}$ In the following we will only work with $r = 1$. This is what we will mean by Wilson action in this review.

$^{11}$ This is at variance with the doublers which appear for $r = 0$ (naive fermion action), which do not disappear in the continuum limit, as we will see in Section 6.
5.1. Fourier transforms

To perform calculations of Feynman diagrams in momentum space (the main topic of this review) we need to define the Fourier transforms on the lattice. They are given in infinite volume (which is the standard setting of perturbation theory) by the formulae
$$ \psi(x) = \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, e^{ixp}\,\psi(p) , $$
$$ \bar\psi(x) = \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, e^{-ixp}\,\bar\psi(p) , $$
$$ A_\mu(x) = \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, e^{i(x + a\hat\mu/2)k}\,A_\mu(k) , \tag{5.12} $$
where with a little abuse of notation we have used the same symbol for the fields in $x$ space and for their Fourier transforms. The inverse Fourier transforms are given by
$$ \psi(p) = a^4 \sum_x e^{-ixp}\,\psi(x) , $$
$$ \bar\psi(p) = a^4 \sum_x e^{ixp}\,\bar\psi(x) , $$
$$ A_\mu(k) = a^4 \sum_x e^{-i(x + a\hat\mu/2)k}\,A_\mu(x) , \tag{5.13} $$
with
$$ \delta^{(4)}(p) = \frac{a^4}{(2\pi)^4} \sum_x e^{-ixp} . \tag{5.14} $$
The $p$-space lattice $\delta$-function is zero except at the values $p_n = 2\pi n/a$ of the momenta. The $\delta$-function in position space is
$$ \delta_{xy} = a^4 \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, e^{i(x - y)p} . \tag{5.15} $$
As we already remarked, on a lattice of finite volume the allowed momenta are a discrete set. Here we will mostly consider the case of infinite volume. Notice that the Fourier transform of $A_\mu(x)$ is taken at the point $x + a\hat\mu/2$, halfway between $x$ and the neighboring point $x + a\hat\mu$. This choice turns out to be quite important for the general economy of the calculations, as we will immediately see.
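Let us write down explicitly the quark–quark–gluon vertex coming out from Eq. (5.2). In momentum space we have
$$ S_{qqg} = -\frac{i g_0}{2}\, a^4 \sum_{x,\mu} \left( \bar\psi(x)(r - \gamma_\mu)A_\mu(x)\psi(x + a\hat\mu) - \bar\psi(x + a\hat\mu)(r + \gamma_\mu)A_\mu(x)\psi(x) \right) $$
$$ = -\frac{i g_0}{2}\, a^4 \sum_{x,\mu} \int_{-\pi/a}^{\pi/a}\frac{d^4p}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4p'}{(2\pi)^4}\; e^{ix(p + k - p')}\, e^{iak_\mu/2} $$
$$ \qquad \times \left( \bar\psi(p')(r - \gamma_\mu)A_\mu(k)\psi(p)\,e^{iap_\mu} - \bar\psi(p')\,e^{-iap'_\mu}(r + \gamma_\mu)A_\mu(k)\psi(p) \right) \tag{5.16} $$
$$ = \frac{i g_0}{2} \int_{-\pi/a}^{\pi/a}\frac{d^4p}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4p'}{(2\pi)^4}\; (2\pi)^4\delta^{(4)}(p + k - p') \sum_\mu e^{iak_\mu/2} $$
$$ \qquad \times \left( \bar\psi(p')\gamma_\mu A_\mu(k)\psi(p)\,(e^{iap_\mu} + e^{-iap'_\mu}) + r\,\bar\psi(p')A_\mu(k)\psi(p)\,(-e^{iap_\mu} + e^{-iap'_\mu}) \right) $$
$$ = \frac{i g_0}{2} \int_{-\pi/a}^{\pi/a}\frac{d^4p}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4p'}{(2\pi)^4}\; (2\pi)^4\delta^{(4)}(p + k - p') \sum_\mu e^{iak_\mu/2} $$
$$ \qquad \times \left[ \bar\psi(p')\gamma_\mu A_\mu(k)\psi(p)\; e^{iap_\mu/2}e^{-iap'_\mu/2} \cdot 2\cos\frac{a(p + p')_\mu}{2} + r\,\bar\psi(p')A_\mu(k)\psi(p)\; e^{iap_\mu/2}e^{-iap'_\mu/2} \cdot (-2i)\sin\frac{a(p + p')_\mu}{2} \right] . \tag{5.17} $$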
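The algebra leading to Eq. (5.17), and the way the residual phases combine once momentum conservation $p' = p + k$ is imposed, can be checked numerically for a single direction $\mu$. The following sketch compares the coefficients of $\gamma_\mu$ and of the unit Dirac matrix (in units of $g_0$) computed from the exponential form with the purely trigonometric combinations $i\cos\frac{a(p+p')_\mu}{2}$ and $r\sin\frac{a(p+p')_\mu}{2}$. The function names and the `midpoint` switch are hypothetical, introduced only for this illustration.

```python
import numpy as np

def exponential_form(p, k, r, a, midpoint=True):
    """Coefficients of gamma_mu and of the unit Dirac matrix (in units of g0)
    for one direction mu, before the phases are combined; p' = p + k is imposed.
    midpoint=False drops the factor exp(i a k/2), i.e. the gluon field is
    Fourier transformed at x instead of x + a mu/2."""
    p_out = p + k
    phase = np.exp(0.5j * a * k) if midpoint else 1.0
    gamma_part = 0.5j * phase * (np.exp(1j * a * p) + np.exp(-1j * a * p_out))
    unit_part = 0.5j * r * phase * (-np.exp(1j * a * p) + np.exp(-1j * a * p_out))
    return gamma_part, unit_part

def trigonometric_form(p, k, r, a):
    """i*cos(a(p+p')/2) and r*sin(a(p+p')/2), with p' = p + k."""
    s = 0.5 * a * (2.0 * p + k)
    return 1j * np.cos(s), r * np.sin(s)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, k = rng.uniform(-np.pi, np.pi, size=2)
    r, a = 1.0, 1.0
    print(np.allclose(exponential_form(p, k, r, a), trigonometric_form(p, k, r, a)))
    # Without the midpoint convention a phase exp(-i a k/2) is left over
    left_over = np.array(exponential_form(p, k, r, a, midpoint=False))
    print(np.allclose(left_over, np.exp(-0.5j * a * k) * np.array(trigonometric_form(p, k, r, a))))
```

The second check shows what is lost without the midpoint convention: the same trigonometric structure appears, but multiplied by a momentum-dependent phase.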
We can notice at this point that all exponential phases exactly cancel, thanks to the $\delta$-function expressing the momentum conservation at the vertex (where $p' = p + k$). We are then left with
$$ S_{qqg} = \int_{-\pi/a}^{\pi/a}\frac{d^4p}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a}\frac{d^4p'}{(2\pi)^4}\; (2\pi)^4\delta^{(4)}(p + k - p') $$
$$ \qquad \times\; i g_0 \sum_\mu \bar\psi(p') \left( \gamma_\mu \cos\frac{a(p + p')_\mu}{2} - i r \sin\frac{a(p + p')_\mu}{2} \right) A_\mu(k)\,\psi(p) , \tag{5.18} $$
which gives us the lattice Feynman rule for this vertex. It is easy to see that in the continuum limit this Wilson vertex reduces to the familiar QCD vertex, namely
$$ \int_{-\infty}^{\infty}\frac{d^4p}{(2\pi)^4} \int_{-\infty}^{\infty}\frac{d^4k}{(2\pi)^4} \int_{-\infty}^{\infty}\frac{d^4p'}{(2\pi)^4}\; (2\pi)^4\delta^{(4)}(p + k - p') \cdot i g_0\,\bar\psi(p')\,\gamma_\mu A_\mu(k)\,\psi(p) . \tag{5.19} $$
Had we chosen for the Fourier transform of the gauge potential the expression
$$ A_\mu(x) = \int_{-\pi/a}^{\pi/a}\frac{d^4k}{(2\pi)^4}\, e^{ixk}\,A_\mu(k) , \tag{5.20} $$
the exponential phases would not have canceled, and the $e^{iap_\mu/2}e^{-iap'_\mu/2}$ terms would still be present in the final expression of the vertex. This is a general feature of lattice perturbation theory: if one uses the Fourier transforms as defined in Eq. (5.12), all terms of the type $e^{iak_\mu/2}$ coming from
the various gluons exactly combine to cancel all other phases floating around, and only sine and cosine functions remain in the momentum-space expression of the vertices of the theory. This feature is especially convenient in the case of higher-order vertices containing a large number of gluons. We are now going to give the explicit expressions for the propagators and for the vertices of order $g_0$ and $g_0^2$ of the Wilson action, which is all that is needed for 1-loop calculations. In the following we will not explicitly write the $\delta$-function of momenta present in each vertex and propagator. In our conventions for the vertices, all gluon lines are entering, and when there are quark or ghost lines there will always be an equal number of incoming and outgoing lines.

5.2. Pure gauge action

As we have seen, in the Wilson action the group elements $U_\mu(x)$ appear instead of the algebra elements $A_\mu(x)$, which are the fundamental perturbative variables. To derive the gluon vertices from the pure gauge action, one then has to expand the $U_\mu$'s in the plaquette in terms of the $A_\mu$'s. As a consequence, an infinite number of interaction vertices are generated, expressing the self-interaction of $n$ gluons, with arbitrary $n$. Since the power of the coupling constant which appears in these vertices grows with the number of gluons, only a finite number of them is needed to any given order in $g_0$. The $A_\mu$'s are matrices in color space, and therefore they do not commute with each other.
The expansion of the plaquette in terms of the $A_\mu$'s can be derived by the use of the Baker–Campbell–Hausdorff formula
$$ e^A e^B = \exp\left\{ A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A - B, [A, B]] + \frac{1}{24}[[A, [A, B]], B] + \cdots \right\} . \tag{5.21} $$
Since the color matrices $T^a$ are traceless and are closed under commutation, the exponent in the expansion of the plaquette obtained using this formula is also traceless, so that the knowledge of the cubic terms of this expansion is sufficient to calculate all vertices with a maximum of four gluons, which is what is needed for 1-loop calculations.$^{12}$

$^{12}$ To compute higher-order vertices it is useful to know that the Baker–Campbell–Hausdorff formula can be written as
$$ e^A e^B = \exp\left\{ \sum_{n=1}^{\infty} C_n(A, B) \right\} , \tag{5.22} $$
where the $C_n$'s can be determined recursively:
$$ C_{n+1}(A, B) = \frac{1}{2(n + 1)}\,[A - B, C_n(A, B)] + \sum_{\substack{p \ge 1 \\ 2p \le n}} \frac{B_{2p}}{(2p)!\,(n + 1)} \sum_{\substack{m_1, \ldots, m_{2p} > 0 \\ m_1 + \cdots + m_{2p} = n}} [C_{m_1}(A, B), [\ldots, [C_{m_{2p}}(A, B), A + B] \cdots ]] , \tag{5.23} $$
with $C_1(A, B) = A + B$, and $B_{2p}$ a Bernoulli number. The Bernoulli numbers $B_i$ are defined by
$$ \frac{x}{e^x - 1} = 1 - \frac{x}{2} + \frac{B_1 x^2}{2!} - \frac{B_2 x^4}{4!} + \frac{B_3 x^6}{6!} - \cdots , \qquad |x| < 2\pi . \tag{5.24} $$
The first few Bernoulli numbers are: $B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, $B_4 = 1/30$, $B_5 = 5/66$, $B_6 = 691/2730$, $B_7 = 7/6$.
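The truncation shown in Eq. (5.21) can be tested numerically: if $A$ and $B$ are rescaled by a small parameter $\varepsilon$, the difference between $\log(e^{\varepsilon A}e^{\varepsilon B})$ and the right-hand side truncated at fourth order should decrease like $\varepsilon^5$. The sketch below does this with random anti-hermitian matrices, mimicking the exponents $i g_0 a A_\mu$ of two link variables. The matrix exponential and logarithm are implemented as plain power series, which is adequate only for the small-norm matrices used here; all function names are hypothetical.

```python
import numpy as np

def comm(x, y):
    return x @ y - y @ x

def exp_series(m, terms=40):
    """Matrix exponential from its Taylor series (fine for small norms)."""
    result = np.eye(m.shape[0], dtype=complex)
    term = np.eye(m.shape[0], dtype=complex)
    for j in range(1, terms):
        term = term @ m / j
        result = result + term
    return result

def log_series(u, terms=80):
    """log(u) from the series of log(1 + x) with x = u - 1; needs ||x|| < 1."""
    x = u - np.eye(u.shape[0])
    result = np.zeros_like(x)
    power = np.eye(u.shape[0], dtype=complex)
    for j in range(1, terms):
        power = power @ x
        result = result + (-1) ** (j + 1) * power / j
    return result

def bch_fourth_order(A, B):
    """Right-hand side exponent of Eq. (5.21), truncated as displayed."""
    c = comm(A, B)
    return A + B + c / 2 + comm(A - B, c) / 12 + comm(comm(A, c), B) / 24

def truncation_error(eps, A, B):
    exact = log_series(exp_series(eps * A) @ exp_series(eps * B))
    return np.linalg.norm(exact - bch_fourth_order(eps * A, eps * B), 2)

def random_antihermitian(n, rng):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = (m + m.conj().T) / 2.0
    return 1j * h / np.linalg.norm(h, 2)     # unit spectral norm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B = random_antihermitian(3, rng), random_antihermitian(3, rng)
    e1, e2 = truncation_error(0.04, A, B), truncation_error(0.02, A, B)
    print(e1, e2, e1 / e2)      # ratio close to 2**5 = 32
```

Halving $\varepsilon$ reduces the error by a factor close to $2^5$, as expected if the first neglected term is of fifth order in the matrices.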
Fig. 7. Propagators and vertices needed for 1-loop calculations in lattice QCD.
From Eq. (5.4) it follows that the entries of the matrices $a g_0 A_\mu(x)$ are angular variables, which thus take values between zero and $2\pi$. In perturbation theory the range of integration of the fields $A^a_\mu(x)$ is extended to infinity. It is only after $A^a_\mu(x)$ has been decompactified that the tree-level propagators can be explicitly computed, performing the resulting Gaussian functional integral. The propagators come as usual from the inverse of the quadratic part of the action. For the full expression of the gluon propagator we have to wait until gauge fixing is implemented, and we will report it later. The 3-gluon vertex is
$$ W^{abc}_{\mu\nu\lambda}(p, q, r) = -i g_0 f_{abc}\,\frac{2}{a} \left\{ \delta_{\mu\nu} \sin\frac{a(p - q)_\lambda}{2} \cos\frac{a r_\mu}{2} + \delta_{\nu\lambda} \sin\frac{a(q - r)_\mu}{2} \cos\frac{a p_\nu}{2} + \delta_{\lambda\mu} \sin\frac{a(r - p)_\nu}{2} \cos\frac{a q_\lambda}{2} \right\} , \tag{5.25} $$
where $p + q + r = 0$. Gluons are all incoming and are assigned clockwise (see Fig. 7). In the formal $a \to 0$ limit the lattice vertex (5.25) reduces to the continuum expression
$$ -i g_0 f_{abc} \left\{ \delta_{\mu\nu}(p - q)_\lambda + \delta_{\nu\lambda}(q - r)_\mu + \delta_{\lambda\mu}(r - p)_\nu \right\} . \tag{5.26} $$
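The continuum limit of the 3-gluon vertex can be illustrated numerically. The sketch below evaluates the Lorentz structure of Eq. (5.25), stripped of the common factor $-i g_0 f_{abc}$, and compares it with the structure of Eq. (5.26) for random momenta satisfying $p + q + r = 0$. The function names are hypothetical, and the index assignment follows the equations as written above.

```python
import numpy as np

def lattice_tensor(p, q, r, a):
    """Curly bracket of Eq. (5.25) times 2/a, as an array t[mu, nu, lam]."""
    d = np.eye(4)
    t = np.zeros((4, 4, 4))
    for mu in range(4):
        for nu in range(4):
            for lam in range(4):
                t[mu, nu, lam] = (2.0 / a) * (
                    d[mu, nu] * np.sin(a * (p - q)[lam] / 2) * np.cos(a * r[mu] / 2)
                    + d[nu, lam] * np.sin(a * (q - r)[mu] / 2) * np.cos(a * p[nu] / 2)
                    + d[lam, mu] * np.sin(a * (r - p)[nu] / 2) * np.cos(a * q[lam] / 2)
                )
    return t

def continuum_tensor(p, q, r):
    """Curly bracket of Eq. (5.26), as an array t[mu, nu, lam]."""
    d = np.eye(4)
    t = np.zeros((4, 4, 4))
    for mu in range(4):
        for nu in range(4):
            for lam in range(4):
                t[mu, nu, lam] = (d[mu, nu] * (p - q)[lam]
                                  + d[nu, lam] * (q - r)[mu]
                                  + d[lam, mu] * (r - p)[nu])
    return t

def max_deviation(p, q, a):
    r = -p - q                      # momentum conservation, all gluons incoming
    return np.max(np.abs(lattice_tensor(p, q, r, a) - continuum_tensor(p, q, r)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, q = rng.normal(size=4), rng.normal(size=4)
    for a in (0.04, 0.02, 0.01):
        print(a, max_deviation(p, q, a))    # decreases roughly like a**2
```

The deviation from the continuum expression scales approximately like $a^2$, which is the order of the lattice artifacts of the pure gauge action.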
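It is useful to introduce the shorthand notation (which we will use throughout)
$$ \hat{k}_\mu = \frac{2}{a}\,\sin\frac{a k_\mu}{2} , \tag{5.27} $$
especially in writing the 4-gluon vertex, which is quite complicated. It is given by Rothe (1997):
$$ W^{abcd}_{\mu\nu\lambda\rho}(p, q, r, s) = -g_0^2 \Bigg\{ \sum_e f_{abe} f_{cde} \Bigg[ \delta_{\mu\lambda}\delta_{\nu\rho} \left( \cos\frac{a(q - s)_\mu}{2} \cos\frac{a(k - r)_\nu}{2} - \frac{a^4}{12}\,\hat{k}_\nu \hat{q}_\mu \hat{r}_\nu \hat{s}_\mu \right) $$
$$ \qquad - \delta_{\mu\rho}\delta_{\nu\lambda} \left( \cos\frac{a(q - r)_\mu}{2} \cos\frac{a(k - s)_\nu}{2} - \frac{a^4}{12}\,\hat{k}_\nu \hat{q}_\mu \hat{r}_\mu \hat{s}_\nu \right) $$
$$ \qquad + \frac{1}{6}\,\delta_{\nu\lambda}\delta_{\nu\rho}\, a^2\, \widehat{(s - r)}_\mu\, \hat{k}_\nu \cos\frac{a q_\mu}{2} - \frac{1}{6}\,\delta_{\mu\lambda}\delta_{\mu\rho}\, a^2\, \widehat{(s - r)}_\nu\, \hat{q}_\mu \cos\frac{a k_\nu}{2} $$
$$ \qquad + \frac{1}{6}\,\delta_{\mu\nu}\delta_{\mu\rho}\, a^2\, \widehat{(q - k)}_\lambda\, \hat{r}_\rho \cos\frac{a s_\lambda}{2} - \frac{1}{6}\,\delta_{\mu\nu}\delta_{\mu\lambda}\, a^2\, \widehat{(q - k)}_\rho\, \hat{s}_\lambda \cos\frac{a r_\rho}{2} $$
$$ \qquad + \frac{1}{12}\,\delta_{\mu\nu}\delta_{\mu\lambda}\delta_{\mu\rho}\, a^2 \sum_\sigma \widehat{(q - k)}_\sigma\, \widehat{(s - r)}_\sigma \Bigg] + (b \leftrightarrow c,\ \nu \leftrightarrow \lambda,\ q \leftrightarrow r) + (b \leftrightarrow d,\ \nu \leftrightarrow \rho,\ q \leftrightarrow s) \Bigg\} $$
$$ \qquad + \frac{g_0^2}{12}\, a^4 \left\{ \frac{2}{3}\left( \delta_{ab}\delta_{cd} + \delta_{ac}\delta_{bd} + \delta_{ad}\delta_{bc} \right) + \sum_e \left( d_{abe}d_{cde} + d_{ace}d_{bde} + d_{ade}d_{bce} \right) \right\} $$
$$ \qquad \times \Bigg\{ \delta_{\mu\nu}\delta_{\mu\lambda}\delta_{\mu\rho} \sum_\sigma \hat{k}_\sigma \hat{q}_\sigma \hat{r}_\sigma \hat{s}_\sigma - \delta_{\mu\nu}\delta_{\mu\lambda}\, \hat{k}_\rho \hat{q}_\rho \hat{r}_\rho \hat{s}_\mu - \delta_{\mu\nu}\delta_{\mu\rho}\, \hat{k}_\lambda \hat{q}_\lambda \hat{s}_\lambda \hat{r}_\mu - \delta_{\mu\lambda}\delta_{\mu\rho}\, \hat{k}_\nu \hat{r}_\nu \hat{s}_\nu \hat{q}_\mu $$
$$ \qquad\quad - \delta_{\nu\lambda}\delta_{\nu\rho}\, \hat{q}_\mu \hat{r}_\mu \hat{s}_\mu \hat{k}_\nu + \delta_{\mu\nu}\delta_{\lambda\rho}\, \hat{k}_\lambda \hat{q}_\lambda \hat{r}_\mu \hat{s}_\mu + \delta_{\mu\lambda}\delta_{\nu\rho}\, \hat{k}_\nu \hat{r}_\nu \hat{q}_\mu \hat{s}_\mu + \delta_{\mu\rho}\delta_{\nu\lambda}\, \hat{k}_\nu \hat{s}_\nu \hat{q}_\mu \hat{r}_\mu \Bigg\} . \tag{5.28} $$
In the $a \to 0$ limit this expression becomes the four-gluon vertex of continuum QCD.$^{13}$

$^{13}$ It is interesting to note that also in the abelian case (lattice QED) there are vertices with self-interacting gauge particles, which of course vanish in the formal continuum limit. The lowest-order vertex in pure gauge contains four lattice photons coming from the $a^4 g_0^2 F_{\mu\nu}^4$ term of the expansion of the lattice QED action, i.e.,
$$ \frac{1}{2 g_0^2 a^4} \cdot a^4 \sum_x \sum_{\mu\nu} \left( 1 - \cos\left( a^2 g_0 F_{\mu\nu}(x) \right) \right) , \tag{5.29} $$
where
$$ F_{\mu\nu}(x) = \nabla_\mu A_\nu(x) - \nabla_\nu A_\mu(x) . \tag{5.30} $$
The four-photon vertex contains four derivatives, it is of order $a^4$ and vanishes as $a \to 0$.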
The vertices containing five or more gluons are at least of order $g_0^3$, and thus they are not necessary for 1-loop calculations. To my knowledge, an explicit expression for the five-gluon vertex has not yet been given in the literature. General algorithms for the automated calculation of higher-order vertices (for a given configuration of external momenta) have been reported in Lüscher and Weisz (1986). For nonabelian gauge theories, the calculation of the pure gauge part of course does not end here. One still has to consider the gauge integration measure, which generates an infinite number of vertices with increasing powers of $g_0$, starting with a $1/a^2$ mass counterterm to order $g_0^2$. Furthermore, the Faddeev–Popov procedure has to be implemented on the lattice, from which the Feynman rules for the ghost propagator and the various ghost vertices can be derived. We anticipate that the effective ghost-gauge field interaction, at variance with the continuum, is not linear in the gauge potential $A_\mu$, and thus also in this sector we find an infinite number of new vertices that have no continuum analog, like for example the vertex involving two ghosts and two gluons.

5.2.1. Measure

The definition of the gauge-invariant integration measure on the lattice turns out for nonabelian gauge groups to be nontrivial, and as we said generates an infinite number of vertices. Let us at first consider only the gauge potential, $A_\mu$, at a certain point. We start with the 2-form
$$ d^2s = \mathrm{Tr}\,(dU_\mu^\dagger\, dU_\mu) , \tag{5.31} $$
where $dU_\mu = U(A_\mu + dA_\mu) - U(A_\mu)$, which is invariant under left or right multiplication of $U$ with an $SU(3)$ matrix. When rewritten in terms of $A_\mu$, this 2-form defines a metric $g$ on the space of the $A_\mu$'s,
$$ d^2s = g_{ab}(A_\mu)\, dA^a_\mu\, dA^b_\mu , \tag{5.32} $$
which gives the gauge-invariant measure (the Haar measure)$^{14}$
$$ d\mu(A) = \sqrt{\det g(A)}\; \prod_{a,\mu} dA^a_\mu . \tag{5.35} $$

$^{14}$ The gauge-invariant Haar measure on the group has the properties:
$$ \int [dU] = 1 , \tag{5.33} $$
$$ \int [dU]\, f(U) = \int [dU]\, f(U_0 U) , \tag{5.34} $$
for an arbitrary (sufficiently smooth) function $f$, and for any element $U_0$ of the gauge group.

The calculation of $g(A)$ can be done using the properties of the $SU(3)$ group. It turns out that under an infinitesimal variation of $A_\mu$ one gets (Boulware, 1970)
$$ U(A_\mu + dA_\mu) = U(A_\mu)\left( 1 + i a g_0\, dA^a_\mu\, M_{ab}(A_\mu)\, T^b \right) , \tag{5.36} $$
where the matrix $M$ is given by
$$ M(A_\mu) = \frac{e^{\,i a g_0 \tilde{A}_\mu} - 1}{i a g_0 \tilde{A}_\mu} , \tag{5.37} $$
and $\tilde{A}_\mu$ denotes the gauge potential in which the color matrices are in the adjoint representation:
$$ \tilde{A}_\mu = A^a_\mu t^a , \qquad (t^a)_{bc} = -i f_{abc} , \qquad \mathrm{Tr}\,(t^a t^b) = 3\,\delta_{ab} . \tag{5.38} $$
One can then write (Kawai et al., 1981)
$$ g_{ab}\, dA^a_\mu\, dA^b_\mu = \mathrm{Tr}\,(dU_\mu^\dagger\, dU_\mu) $$
$$ = \mathrm{Tr}\left\{ T^c M^\dagger_{ca}(A_\mu)\,(-i)\,d(a g_0 A^a_\mu)\, U^\dagger(A_\mu) \cdot U(A_\mu)\, i\, d(a g_0 A^b_\mu)\, M_{bd}(A_\mu)\, T^d \right\} $$
$$ = a^2 g_0^2 \cdot \frac{1}{2}\, M^\dagger_{ca}(A_\mu)\, M_{bc}(A_\mu)\, dA^a_\mu\, dA^b_\mu . \tag{5.39} $$
Leaving aside a constant factor that will anyway cancel in the ratios expressing expectation values of operators, the metric is thus given by
$$ g(A) = \frac{1}{2}\left( M^\dagger(A)\,M(A) \right) . \tag{5.40} $$
Explicitly one has the expansion
$$ g(A) = \frac{1 - \cos(a g_0 A^a_\mu t^a)}{(a g_0 A^a_\mu t^a)^2} = \frac{1}{2} + \sum_{l=1}^{\infty} \frac{(-1)^l}{(2l + 2)!}\,(a g_0 A^a_\mu t^a)^{2l} . \tag{5.41} $$
The measure for the Wilson action can then be written as the product over all sites of the above expression, and is given by
$$ DU = \prod_{x,\mu} \sqrt{\det\left( \frac{1}{2}\, M^\dagger(A_\mu(x))\,M(A_\mu(x)) \right)}\; DA , \qquad DA = \prod_{x,\mu,a} dA^a_\mu(x) . \tag{5.42} $$
It is convenient to write this measure term in the form
$$ DU = e^{-S_{meas}[A]}\, DA . \tag{5.43} $$
This can be done by using the identity $\det g = \exp(\mathrm{Tr}\log g)$, so that at the end we obtain, from Eq. (5.41),
$$ S_{meas}[A] = -\frac{1}{2} \sum_{x,\mu} \mathrm{Tr}\log \frac{2\left( 1 - \cos(a g_0 A^a_\mu t^a) \right)}{(a g_0 A^a_\mu t^a)^2} = -\frac{1}{2} \sum_{x,\mu} \mathrm{Tr}\log \left[ 1 + 2 \sum_{l=1}^{\infty} \frac{(-1)^l}{(2l + 2)!}\,(a g_0 A^a_\mu t^a)^{2l} \right] . \tag{5.44} $$
To lowest order one gets
$$ S_{meas}[A] = \frac{g_0^2}{8a^2} \sum_{x,a,\mu} \left( A^a_\mu(x) \right)^2 , \tag{5.45} $$
and this term, which is quadratic in $A_\mu$, is nonetheless part of the interaction and not a kinetic term, because of the presence of the factor $g_0^2$. It acts like a mass counterterm of order $g_0^2$, and is needed to restore gauge invariance in lattice Feynman amplitudes. It cancels, for example, the quadratic
divergence in the 1-loop gluon self-energy (see Fig. 6). In momentum space this mass counterterm is (see Fig. 7)
$$ -\frac{g_0^2}{4a^2}\,\delta_{\mu\nu}\,\delta_{ab} . \tag{5.46} $$
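The lowest-order measure term can be cross-checked numerically for a single link. The sketch below builds the $SU(3)$ structure constants from the Gell-Mann matrices, forms the adjoint generators $(t^a)_{bc} = -i f_{abc}$, verifies $\mathrm{Tr}\,(t^a t^b) = 3\,\delta_{ab}$, and compares $-\frac{1}{2}\mathrm{Tr}\log\left[ 2(1 - \cos X)/X^2 \right]$, with $X = a g_0 A^a_\mu t^a$, against its quadratic approximation $(a g_0)^2 \sum_a (A^a_\mu)^2 / 8$ per link. It is only an illustration: the function names are hypothetical, and the trace of the logarithm is evaluated through the eigenvalues of the hermitian matrix $X$.

```python
import numpy as np

def gell_mann():
    """The eight Gell-Mann matrices, as an array of shape (8, 3, 3)."""
    l = np.zeros((8, 3, 3), dtype=complex)
    l[0][0, 1] = l[0][1, 0] = 1
    l[1][0, 1] = -1j; l[1][1, 0] = 1j
    l[2][0, 0] = 1; l[2][1, 1] = -1
    l[3][0, 2] = l[3][2, 0] = 1
    l[4][0, 2] = -1j; l[4][2, 0] = 1j
    l[5][1, 2] = l[5][2, 1] = 1
    l[6][1, 2] = -1j; l[6][2, 1] = 1j
    l[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return l

def structure_constants():
    """f_abc from [T^a, T^b] = i f_abc T^c with T^a = lambda^a / 2."""
    T = gell_mann() / 2.0
    comm = np.einsum('aij,bjk->abik', T, T) - np.einsum('bij,ajk->abik', T, T)
    return (-2j * np.einsum('abik,cki->abc', comm, T)).real

def adjoint_generators():
    """(t^a)_{bc} = -i f_abc, Eq. (5.38)."""
    return -1j * structure_constants()

def measure_action_one_link(A, a, g0):
    """-1/2 Tr log[2 (1 - cos X)/X^2] with X = a g0 A^a t^a, for one link."""
    X = a * g0 * np.einsum('a,abc->bc', A, adjoint_generators())
    w = np.linalg.eigvalsh(X)
    # 2 (1 - cos w)/w^2 = (sin(w/2)/(w/2))^2 = sinc(w/(2 pi))^2 with numpy's sinc
    return -0.5 * np.sum(np.log(np.sinc(w / (2.0 * np.pi)) ** 2))

def lowest_order_one_link(A, a, g0):
    """Quadratic term (a g0)^2 sum_a (A^a)^2 / 8 for one link."""
    return (a * g0) ** 2 / 8.0 * np.sum(A ** 2)

if __name__ == "__main__":
    t = adjoint_generators()
    print(np.allclose(np.einsum('abc,dcb->ad', t, t), 3.0 * np.eye(8)))
    rng = np.random.default_rng(0)
    A = rng.normal(size=8)
    print(measure_action_one_link(A, 0.01, 1.0), lowest_order_one_link(A, 0.01, 1.0))
```

The positive sign of the quadratic term is what makes it act like a mass term for the gluon at order $g_0^2$.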
The higher orders in Eq. (5.44), which give self-interaction vertices of the gluons, are at least of order $g_0^3$ and thus only relevant for calculations with 2 or more loops. As a last comment, we mention that in lattice QED things are much simpler: the abelian measure is just given by
$$ DU = \prod_{x,\mu,a} dA^a_\mu(x) , \tag{5.47} $$
thus there are no measure counterterms.

5.2.2. Gauge fixing and the Faddeev–Popov procedure

Although in some situations it can be convenient, gauge fixing is not in principle necessary on the lattice when one works with actions which are expressed in terms of the $U_\mu$'s, as is done in Monte Carlo simulations. The reason is that, unlike what happens in the continuum, the whole volume of the gauge group is finite, because it is given by the product of a countable number of factors each equal to the volume of the $SU(3)$ group, $V = \prod_x v(SU(3))$ (Rossi and Testa, 1980a). This factor cancels out in normalizing expectation values of operators. In perturbation theory, where one makes a saddle-point approximation of the functional integral around $U_\mu = 1$ and the $A_\mu$'s become the actual degrees of freedom (which moreover are decompactified), gauge fixing is instead necessary (Baaquie, 1977; Stehr and Weisz, 1983). A gauge has to be fixed in order to eliminate the presence of zero modes in the quadratic part of the action (when expressed in terms of the $A_\mu$'s). We can see why it is necessary to fix a gauge in perturbative lattice QCD also from the following argument. Strictly speaking, perturbation theory arises as an expansion around the minimum of the plaquette action. Looking at the form of the Wilson action we see that $P_{\mu\nu}(x) = 1$ minimizes the pure gauge action, but this does not yet imply $U_\mu(x) = 1$. On the contrary, even if one fixes $U_\mu(x) = 1$ for each link from the beginning, a gauge transformation will lead to $1 \to \Omega(x)\Omega^{-1}(x + a\hat\mu)$, a group element which can take any value. In order to avoid this and for perturbation theory to be a weak coupling expansion around the configuration $U_\mu(x) = 1$, one must fix the gauge. Gauge fixing is thus an essential step in the perturbative calculations made on the lattice, and can be implemented by using a lattice Faddeev–Popov procedure, which goes along lines similar to the continuum. The final result, however, will be rather different.
In fact, as another consequence of the gauge invariance on a lattice, one obtains from the Faddeev–Popov procedure an infinite number of vertices. Although cumbersome, this procedure is perfectly consistent and gives a precise meaning to the lattice functional integral. We illustrate the lattice Faddeev–Popov method in the most commonly used gauge-fixing condition
$$ F^a_x[A, B] = \nabla^\star_\mu A^a_\mu(x) - B^a(x) = 0 , \tag{5.48} $$
with B some arbitrary *elds. Eq. (5.48) is the lattice analog of the covariant Lorentz gauge. One chooses the backward lattice derivative ∇? here because in this way the gluon propagator will take a simple form, as we will see below. The Faddeev–Popov determinant is de*ned by (Fax [Ag ; B]) ; (5.49) 1 = CFP [A ; B] · Dg x;a
where $g$ is a gauge transformation and $A^g$ is the gauge transform of the field $A$. The integration measure $\mathcal{D}g = \prod_x d\mu(g_x)$ is the product of the Haar measures over the lattice sites. From the property (5.34) of the Haar measure it is easy to see that the Faddeev–Popov determinant is gauge invariant: $\Delta_{FP}[A^g,\chi] = \Delta_{FP}[A,\chi]$. One now proceeds as in the continuum: Eq. (5.49) is inserted into the partition function, and the gauge invariance of the Faddeev–Popov determinant is exploited so that at the end one can factorize the $\mathcal{D}g$ integration and drop it altogether. Finally, after adding a gauge-fixing term to the action and integrating over $\chi$, one obtains for the expectation value of a generic operator $^{15}$
$$\langle O \rangle = \frac{\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}A\,\mathcal{D}\chi\;\Delta_{FP}[A,\chi] \prod_{x,a}\delta\big(F^a_x[A,\chi]\big) \cdot O \cdot \exp\Big(-S_{QCD} - S_{meas} - \frac{1}{2\alpha}\,a^4 \sum_{x,a} \chi^a(x)\chi^a(x)\Big)}{\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}A\,\mathcal{D}\chi\;\Delta_{FP}[A,\chi] \prod_{x,a}\delta\big(F^a_x[A,\chi]\big) \cdot \exp\Big(-S_{QCD} - S_{meas} - \frac{1}{2\alpha}\,a^4 \sum_{x,a} \chi^a(x)\chi^a(x)\Big)} , \tag{5.50}$$
where $\alpha$ is the gauge parameter (particular cases are the Feynman gauge $\alpha = 1$ and the Landau gauge $\alpha = 0$). What is left at this point is the computation of the Faddeev–Popov determinant, which turns out to be independent of $\chi$. To carry out this calculation, one only needs to know $A^g$ in a neighborhood of the identity transformation $g = 1$. The infinitesimal gauge transformation with parameter $\epsilon^a(x)$ gives
$$U_\mu(x) \to e^{i\epsilon(x)}\, U_\mu(x)\, e^{-i\epsilon(x+a\hat\mu)} = e^{\,i a g_0 \left(A_\mu(x) + \delta^{(\epsilon)} A_\mu(x)\right)} \tag{5.51}$$
(apart from $O(\epsilon^2)$ terms), where
$$g_0\, \delta^{(\epsilon)} A^a_\mu(x) = -\sum_b \hat D_\mu[A]_{ab}\, \epsilon^b(x) \tag{5.52}$$
with
$$\hat D_\mu[A] = (M^\dagger)^{-1}(A_\mu(x)) \cdot \nabla_\mu + i g_0 A^a_\mu(x)\, t^a . \tag{5.53}$$
We remark that $M$ is the same matrix as in Eq. (5.37) and $t^a$ is a matrix in the adjoint representation of $SU(3)$. The situation is similar to the continuum case. In fact $\hat D_\mu[A]$ is a discretized form of the covariant derivative acting on fields in the adjoint representation. The result for the Faddeev–Popov determinant is indeed very reminiscent of the continuum and we get $^{16}$
$$\Delta_{FP}[A] = \det\big(-\nabla^\star_\mu \hat D_\mu[A]\big) . \tag{5.54}$$
$^{15}$ This formula was questioned in a nonperturbative context. The partition function of a gauge-fixed BRS-invariant theory was in fact argued to be zero on the lattice, because of the contribution of Gribov copies (Neuberger, 1986, 1987). One possible way out of this problem was proposed in Testa (1998b).
$^{16}$ For the detailed derivation, see Rothe (1997).
The important difference with respect to the continuum case is that the lattice operator $\hat D_\mu[A]$ is not linear in $A$, because of the expansion
$$(M^\dagger)^{-1}(A_\mu(x)) = 1 + \frac{i}{2}\, a g_0 A_\mu(x) - \frac{1}{12}\,\big(a g_0 A_\mu(x)\big)^2 + \cdots , \tag{5.55}$$
and an infinite number of ghost-gluon vertices are thus generated. $^{17}$ In fact, using the well-known formula for Grassmann spin-zero variables $c$ and $\bar c$,
$$\int \mathcal{D}(\bar c\, c)\, \exp\Big(-a^4 \sum_{ij} \bar c_i\, Q_{ij}\, c_j\Big) = \det Q , \tag{5.58}$$
we can write the Faddeev–Popov determinant in terms of an action involving ghosts,
$$\Delta_{FP}[A] = \int \prod_{a,x} d\bar c^a(x)\, dc^a(x)\, \exp\Big(a^4 \sum_x \bar c^a(x)\, \nabla^\star_\mu \hat D^{ab}_\mu[A]\, c^b(x)\Big) , \tag{5.59}$$
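so that the expectation value of a generic gauge-invariant operator is
$$\langle O \rangle = \frac{\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}A\,\mathcal{D}\bar c\,\mathcal{D}c \cdot O \cdot \exp\Big(-S_{QCD} + a^4 \sum_x \bar c^a(x)\, \nabla^\star_\mu \hat D^{ab}_\mu[A]\, c^b(x) - S_{meas} - S_{gf}\Big)}{\int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}A\,\mathcal{D}\bar c\,\mathcal{D}c \cdot \exp\Big(-S_{QCD} + a^4 \sum_x \bar c^a(x)\, \nabla^\star_\mu \hat D^{ab}_\mu[A]\, c^b(x) - S_{meas} - S_{gf}\Big)} . \tag{5.60}$$
In Eq. (5.60) the gauge-fixing term has been written, thanks to the $\delta$-function $\delta(F^a_x[A,\chi])$, as
$$S_{gf} = \frac{1}{2\alpha}\, a^4 \sum_x \Big(\sum_\mu \nabla^\star_\mu A_\mu(x)\Big)^2 = \frac{a^2}{2\alpha} \sum_x \Big(\sum_\mu \big(A_\mu(x) - A_\mu(x - a\hat\mu)\big)\Big)^2 . \tag{5.61}$$
Ghosts are spin-zero Grassmann variables transforming according to the adjoint representation of $SU(3)$.

$^{17}$ The part of the lattice Faddeev–Popov determinant in correspondence with the continuum is obtained by putting $M = 1$, as one can easily see by combining the formulae (5.52) and (5.53) in order to reconstruct the continuum formulae
$$g_0\, \delta^{(\epsilon)} A^a_\mu(x) = -\big(\partial_\mu + i g_0 A^c_\mu(x)\, t^c\big)_{ab}\, \epsilon^b(x) = -D_\mu[A](x) \cdot \epsilon(x) \tag{5.56}$$
and
$$\Delta_{FP}[A] = \det\big(-\partial_\mu D_\mu[A]\big) . \tag{5.57}$$

We are now ready to compute the gluon propagator in the covariant gauge $\partial_\mu A_\mu = 0$. One gets
$$G^{ab}_{\mu\nu}(k) = \delta^{ab}\, \frac{1}{\frac{4}{a^2} \sum_\lambda \sin^2 \frac{a k_\lambda}{2}} \left( \delta_{\mu\nu} - (1-\alpha)\, \frac{\sin\frac{a k_\mu}{2}\, \sin\frac{a k_\nu}{2}}{\sum_\lambda \sin^2 \frac{a k_\lambda}{2}} \right) . \tag{5.62}$$
This expression is the result of adding to the free part of the gluon action,
$$S_g = \frac{1}{4}\, a^4 \sum_x (\nabla_\mu A_\nu - \nabla_\nu A_\mu)^2 = \frac{1}{2}\, a^4 \sum_x (\nabla_\mu A_\nu \nabla_\mu A_\nu - \nabla_\mu A_\nu \nabla_\nu A_\mu)$$
$$\phantom{S_g} = -\frac{1}{2}\, a^4 \sum_x (A_\nu \nabla^\star_\mu \nabla_\mu A_\nu - A_\nu \nabla^\star_\mu \nabla_\nu A_\mu) = -\frac{1}{2}\, a^4 \sum_x A_\nu \big(\Box\, \delta_{\mu\nu} - \nabla^\star_\mu \nabla_\nu\big) A_\mu , \tag{5.63}$$
the gauge-fixing term,
$$S_{gf} = \frac{1}{2\alpha}\, a^4 \sum_x (\nabla^\star_\nu A_\nu)(\nabla^\star_\mu A_\mu) = -\frac{1}{2\alpha}\, a^4 \sum_x A_\nu \nabla_\nu \nabla^\star_\mu A_\mu , \tag{5.64}$$
and we have performed a number of integrations by parts. One is forced to choose the backward lattice derivative in the gauge-fixing term because it is precisely the forward derivative that appears in the pure gauge action. In this way the longitudinal components of the lattice propagator have a simple form. In the limit $a \to 0$ the lattice gluon propagator of course reduces to the well-known expression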
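$$\delta^{ab} \cdot \frac{1}{k^2} \left( \delta_{\mu\nu} - (1-\alpha)\, \frac{k_\mu k_\nu}{k^2} \right) . \tag{5.65}$$
From Eqs. (5.59) and (5.53) one can derive the Feynman rules for the ghosts on the lattice. The ghost propagator is
$$\delta^{ab} \cdot \frac{1}{\frac{4}{a^2} \sum_\mu \sin^2 \frac{a k_\mu}{2}} , \tag{5.66}$$
the ghost–ghost–gluon vertex (see Fig. 7) is
$$i g_0\, f_{abc}\, (\hat p_2)_\mu \cos\frac{a (p_1)_\mu}{2} , \tag{5.67}$$
while the ghost–ghost–gluon–gluon vertex,
$$\frac{1}{12}\, g_0^2\, a^2\, \{t^a, t^b\}_{cd}\, \delta_{\mu\nu}\, (\hat p_1)_\mu (\hat p_2)_\mu , \tag{5.68}$$
is a lattice artifact, which vanishes in the formal continuum limit $a \to 0$. Vertices with two ghosts and three or more gluons are at least of order $g_0^3$ and do not enter in 1-loop calculations.

The lattice theory defined in this way has an exact BRS symmetry (Baaquie, 1977; Kawai et al., 1981) which is given by the transformation
$$\delta A^a_\mu(x) = \frac{1}{g_0 a} \Big[ M^{-1}_{ab}(A_\mu(x))\, c^b(x) - M^{-1}_{ba}(A_\mu(x))\, c^b(x + a\hat\mu) \Big] , \tag{5.69}$$
$$\delta c^a(x) = -\frac{1}{2}\, f_{abc}\, c^b(x)\, c^c(x) , \tag{5.70}$$
$$\delta \bar c^a(x) = -\frac{1}{\alpha g_0 a} \sum_\nu \big( A^a_\nu(x) - A^a_\nu(x - a\hat\nu) \big) , \tag{5.71}$$
$$\delta \psi(x) = i\, c^a(x)\, T^a\, \psi(x) , \tag{5.72}$$
$$\delta \bar\psi(x) = i\, \bar\psi(x)\, T^a\, c^a(x) . \tag{5.73}$$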
The BRS variation of the gauge-fixing term in the action is the opposite of the BRS variation of the Faddeev–Popov term, and they cancel. This can be seen from the nilpotence of the BRS transformation, $\delta^2 A^a_\mu(x) = 0$. The transformations of quarks, antiquarks and ghosts are the same as in the continuum BRS. In the transformation of the antighosts, which knows about the gauge-fixing term, the lattice backward derivative replaces the continuum derivative, leaving a difference of $O(a)$ with the continuum. The transformation of the gauge potential is instead quite different from the continuum and, because of $M$, is nonlinear in the gauge potential $A$. It reduces anyway to the continuum BRS transformation as $a \to 0$.

In the abelian case the Faddeev–Popov determinant reduces to a trivial factor that can be eliminated from the path integral when computing expectation values, analogously to what happens in continuum QED. The derivation of the Faddeev–Popov determinant for arbitrary linear gauge-fixing conditions and for lattices with a large class of boundaries can be found in Lüscher and Weisz (1986).

5.3. Fermion action

We now discuss the Feynman rules coming from the fermion part of the action. The quark propagator can be computed by inverting the lattice Dirac operator in momentum space, and is given by
$$S^{ab}(k; m_0) = \delta^{ab} \cdot a\, \frac{-i \sum_\mu \gamma_\mu \sin a k_\mu + a m_0 + 2r \sum_\mu \sin^2 \frac{a k_\mu}{2}}{\sum_\mu \sin^2 a k_\mu + \Big( 2r \sum_\mu \sin^2 \frac{a k_\mu}{2} + a m_0 \Big)^2} . \tag{5.74}$$
In the formal continuum limit it reduces to the well-known expression
$$\delta^{ab} \cdot \frac{-i \slashed{k} + m_0}{k^2 + m_0^2} . \tag{5.75}$$
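As an illustrative numerical check (not part of the original derivation), the approach of Eq. (5.74) to Eq. (5.75) can be made visible by comparing the coefficient functions that multiply $-i\gamma_\mu$ and the unit matrix. The sketch below assumes the free propagator exactly as written in Eq. (5.74), drops the color factor $\delta^{ab}$, and uses hypothetical function names and an arbitrary test momentum.

```python
import math

def wilson_propagator_coeffs(k, m0, a, r=1.0):
    """Coefficient functions of Eq. (5.74), color delta omitted.

    Returns (vector, scalar) such that
    S = -i * sum_mu gamma_mu * vector[mu] + scalar * identity.
    """
    s = [math.sin(a * kmu) for kmu in k]
    # Wilson "mass" term: a*m0 + 2r * sum_mu sin^2(a k_mu / 2)
    w = a * m0 + 2.0 * r * sum(math.sin(a * kmu / 2.0) ** 2 for kmu in k)
    den = sum(x * x for x in s) + w * w
    return [a * x / den for x in s], a * w / den

def continuum_propagator_coeffs(k, m0):
    """Same decomposition for the continuum propagator, Eq. (5.75)."""
    den = sum(kmu * kmu for kmu in k) + m0 * m0
    return [kmu / den for kmu in k], m0 / den

k = (0.3, -0.2, 0.5, 0.1)   # arbitrary test momentum (assumption)
m0 = 0.4                    # arbitrary bare mass (assumption)
deviations = []
for a in (0.5, 0.1, 0.01):
    lat_v, lat_s = wilson_propagator_coeffs(k, m0, a)
    con_v, con_s = continuum_propagator_coeffs(k, m0)
    err = max(abs(lat_s - con_s), *(abs(x - y) for x, y in zip(lat_v, con_v)))
    deviations.append(err)
    print(f"a = {a}: max deviation from continuum = {err:.3e}")
```

The deviation shrinks roughly linearly in $a$, as expected from the $O(a)$ Wilson term in the numerator and denominator.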
The quark–antiquark–gluon vertices are obtained by expanding the fermionic part of the action in powers of $g_0 A$. We get again an infinite tower of vertices, which involve a $q\bar q$ pair and $n$ gluons ($n \ge 1$). Fortunately, only a finite number of them is needed to any given order in $g_0$. Here we only give the explicit expressions for the vertices which are needed for 1-loop calculations. We have already derived the expression of the quark–quark–gluon vertex. From Eq. (5.18) one obtains
$$(V_1^a)^{bc}_\mu(p_1, p_2) = -g_0\, (T^a)^{bc} \left( i\gamma_\mu \cos\frac{a(p_1 + p_2)_\mu}{2} + r \sin\frac{a(p_1 + p_2)_\mu}{2} \right) , \tag{5.76}$$
where $p_1$ and $p_2$ are the quark momenta flowing in and out of the vertex (see Fig. 7). For $a \to 0$ this becomes the familiar continuum QCD vertex
$$-g_0\, (T^a)^{bc}\, i\gamma_\mu . \tag{5.77}$$
With a similar calculation one can derive the quark–quark–gluon–gluon vertex, getting
$$(V_2^{ab})^{cd}_{\mu_1\mu_2}(p_1, p_2) = -\frac{1}{2}\, a g_0^2\, \delta_{\mu_1\mu_2} \left( \frac{1}{N_c}\, \delta^{ab}\delta^{cd} + d^{abe}\, (T^e)^{cd} \right) \times \left( -i\gamma_{\mu_1} \sin\frac{a(p_1 + p_2)_{\mu_1}}{2} + r \cos\frac{a(p_1 + p_2)_{\mu_1}}{2} \right) . \tag{5.78}$$
This vertex is zero as $a \to 0$ and thus has no continuum analog. However it can still give nonvanishing contributions to a Feynman diagram, because of power divergences in loops that can compensate the explicit factor $a$ in front of $V_2$. Vertices with two quarks and $n$ gluons are associated with factors $a^{n-1} g_0^n$.
In spite of the complexities brought in by the fact that the lattice gauge theory has an infinite number of vertices, it turns out that the superficial degree of divergence of a Feynman diagram $D$ depends only on the number of external lines and is given by (Kawai et al., 1981)
$$D = 4 - E_G - E_g - \tfrac{3}{2}\, E_q , \tag{5.79}$$
where $E_g$, $E_G$ and $E_q$ represent the number of external gluons, ghosts and quarks, respectively. The counting is just like in the continuum because it is only the number of $V_1$ vertices that is relevant to this matter. Vertices which are of higher order in $a$ and $g_0$ do not modify this continuum picture.

It is customary to define the (flavor nonsinglet) vector and axial currents on the lattice as follows:
$$V^f_\mu(x) = \bar\psi(x)\, \gamma_\mu\, \frac{\lambda^f}{2}\, \psi(x) , \tag{5.80}$$
$$A^f_\mu(x) = \bar\psi(x)\, \gamma_\mu \gamma_5\, \frac{\lambda^f}{2}\, \psi(x) . \tag{5.81}$$
These currents are not conserved in the Wilson formulation, and therefore they are not protected from renormalization. One then has
$$Z_V \neq 1 , \tag{5.82}$$
$$Z_A \neq 1 , \tag{5.83}$$
at variance with the continuum results. There exist however vector transformations that leave the Wilson action invariant. The corresponding Noether currents are given by
$$V^{f,\mathrm{cons}}_\mu(x) = \frac{1}{2} \left( \bar\psi(x)\, (\gamma_\mu - r)\, U_\mu(x)\, \frac{\lambda^f}{2}\, \psi(x + a\hat\mu) + \bar\psi(x + a\hat\mu)\, (\gamma_\mu + r)\, U^\dagger_\mu(x)\, \frac{\lambda^f}{2}\, \psi(x) \right) . \tag{5.84}$$
The $V^{f,\mathrm{cons}}_\mu$ are point-split operators, which extend over two lattice sites. The renormalization constant of these conserved currents is one. It is not possible to make a corresponding construction which leads to conserved axial currents because, as we will see in detail in the next section, the Wilson action explicitly breaks chiral symmetry.

We want to conclude this section by mentioning that often the so-called quenched approximation is used in Monte Carlo simulations. In order to perform a statistical sampling of the functional integral, the fermion variables (which are Grassmann numbers) are analytically integrated (using Eq. (5.58)) and the numerical simulations are performed using the partition function
$$Z = \int \mathcal{D}U\, \det\big(\slashed{D}[U] + m_0\big)\, e^{-S_g[U]} \equiv \int \mathcal{D}U\, e^{-S_{\mathrm{eff}}[U]} , \tag{5.85}$$
with
$$S_{\mathrm{eff}}[U] = S_g[U] - \log\det\big(\slashed{D}[U] + m_0\big) = S_g[U] - \mathrm{Tr}\log\big(\slashed{D}[U] + m_0\big) . \tag{5.86}$$
Simulations in full QCD have to include the full contribution of the determinant, while quenching amounts to the replacement $\det(\slashed{D}[U] + m_0) \to 1$, which saves a couple of orders of magnitude of computer time. In physical terms, this means that no sea quarks are included in the calculations, i.e., internal quark loops are neglected (see Fig. 8). Quenching is often summarized by saying that $N_f = 0$, because for equal masses $\det(\slashed{D}[U] + m_0)$ is the product of $N_f$ equal factors. Although it looks quite drastic, in many cases this approximation does not turn out to be so bad.
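The identity $\log\det = \mathrm{Tr}\log$ used in Eq. (5.86) can be illustrated on a toy matrix. The sketch below is only an assumption-laden stand-in: a fixed $3\times 3$ matrix $1 + X$ with small $X$ plays the role of $\slashed{D}[U] + m_0$, the trace of the logarithm is evaluated through its power series (valid only for small $X$), and all function names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_log_one_plus(x, terms=80):
    """Tr log(1 + X) = sum_{n>=1} (-1)^(n+1) Tr(X^n) / n, for small X."""
    dim = len(x)
    power = [row[:] for row in x]
    total = 0.0
    for n in range(1, terms + 1):
        total += (-1) ** (n + 1) * sum(power[i][i] for i in range(dim)) / n
        power = matmul(power, x)
    return total

# Toy stand-in for (D[U] + m0): identity plus a small matrix X (assumption).
X = [[0.10, 0.20, -0.05],
     [-0.15, 0.05, 0.10],
     [0.05, -0.10, 0.20]]
M = [[(1.0 if i == j else 0.0) + X[i][j] for j in range(3)] for i in range(3)]

log_det = math.log(det3(M))
tr_log = trace_log_one_plus(X)
print(f"log det M = {log_det:.12f}")
print(f"Tr log M  = {tr_log:.12f}")
```

In this language quenching simply discards the `tr_log` contribution to the effective action, leaving only the gauge part $S_g[U]$ in the sampling weight.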
Fig. 8. The diagram on the left is zero in the quenched approximation, while the diagram on the right does not contain internal quark loops and has to be included also in quenched calculations.
In perturbation theory, quenching means dropping all diagrams which contain an internal quark loop, but the inclusion of these diagrams is usually not as challenging as it is in the simulations. For consistency one should not include diagrams containing internal quark loops when perturbative calculations have to be used in connection with quenched simulations.

6. Dealing with chiral symmetry on the lattice

Due to the presence of the Wilson term (the part of the action proportional to the parameter $r$), the Wilson action explicitly breaks chiral symmetry, so that Wilson fermions do not possess chiral invariance even when the bare mass of the quark is zero. This term turns out however to be necessary in order to get rid of the $2^d - 1$ extra fermions (also called doublers) which are unavoidably present in the naive lattice discretization of the QCD action.

Let us see what would happen putting $r = 0$ in the Wilson action (which corresponds to naive lattice fermions). In this case, the free fermion propagator is just
$$S^{ab}(k; m_0) = \delta^{ab} \cdot a\, \frac{-i \sum_\mu \gamma_\mu \sin a k_\mu + a m_0}{\sum_\mu \sin^2 a k_\mu + (a m_0)^2} . \tag{6.1}$$
Let us for simplicity consider the massless naive propagator, setting $m_0 = 0$ in the above equation. The propagator has a pole at $ak = (0, 0, 0, 0)$, as expected. However, there are also poles at $(\pi, 0, 0, 0), (0, \pi, 0, 0), \ldots, (\pi, \pi, 0, 0), \ldots, (\pi, \pi, \pi, \pi)$, that is at the edges of the first Brillouin zone, because $\sum_\mu \sin^2 a k_\mu$ vanishes at each point where every $k_\mu$ is either $0$ or $\pi/a$. Each pole of the propagator corresponds to a massless fermion in the theory, even if all these extra poles are at the edges of the Brillouin zone. In fact, we can always think of shifting the integration in momentum space, thanks to the periodicity of the lattice, and so bring these poles inside the Brillouin zone. For example, we could shift $\int_{-\pi/a}^{\pi/a}$ to $\int_{-\pi/2a}^{3\pi/2a}$. For $r = 0$ we would then have to take into account all these 16 Dirac particles when doing lattice computations. Although they are a lattice artifact, they would be pair produced as soon as the interaction is switched on. They would appear in internal loops and contribute to intermediate processes.
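The counting of the poles of the massless naive propagator can be reproduced with a few lines of code. The following sketch only inspects the $2^4$ candidate momenta whose components are $0$ or $\pi/a$ (it does not scan the whole Brillouin zone), and the names are hypothetical.

```python
import itertools
import math

def naive_denominator(ak):
    """sum_mu sin^2(a k_mu): massless denominator of Eq. (6.1)."""
    return sum(math.sin(x) ** 2 for x in ak)

# Candidate momenta: every component of a*k is either 0 or pi.
corners = list(itertools.product((0.0, math.pi), repeat=4))
poles = [ak for ak in corners if naive_denominator(ak) < 1e-20]

# Group the poles by how many components sit at pi.
by_count = {}
for ak in poles:
    n_pi = sum(1 for x in ak if x > 0.0)
    by_count[n_pi] = by_count.get(n_pi, 0) + 1

print(len(poles))
print(sorted(by_count.items()))
```

All 16 corners are poles, distributed as 1, 4, 6, 4, 1 according to the number of components equal to $\pi/a$: one physical fermion at the origin and 15 doublers.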
Let us consider in more detail one of these poles, for example, the one at $ak = (\pi, 0, 0, 0)$. To see things more clearly, we make the change of variables
$$k'_0 = \frac{\pi}{a} - k_0 , \qquad k'_i = k_i , \tag{6.2}$$
and correspondingly $^{18}$
$$\gamma'_0 = -\gamma_0 , \qquad \gamma'_i = \gamma_i , \tag{6.3}$$
so that $\gamma'_5 = -\gamma_5$. The propagator in the new variables takes near $ak' = (0, 0, 0, 0)$ the same form as the original propagator Eq. (6.1) near $ak = (0, 0, 0, 0)$. This means that there is a fermion mode also at $ak = (\pi, 0, 0, 0)$, and moreover the chirality of this new particle is opposite to the chirality of the mode at $ak = (0, 0, 0, 0)$, because $\gamma'_5 = -\gamma_5$, so that $(1 + \gamma'_5)/2 = (1 - \gamma_5)/2$. It can be easily seen that the 16 fermion modes split into 8 particles of a given chirality and 8 particles of the opposite chirality, so that even if the massless continuum theory had chiral symmetry and the physical particle (the pole at the origin) was a chiral mode, we end up with a vector theory on the lattice. The main problem then is not that there are more species of fermions than expected, but that the doublers destroy the chiral properties of the continuum theory.

The Wilson term does precisely the work of suppressing these 15 unwanted additional fermions in the continuum limit. In fact, the Wilson action can be written in the form
$$D_W = \frac{1}{2} \Big( \gamma_\mu \big( \tilde\nabla^\star_\mu + \tilde\nabla_\mu \big) - a r\, \tilde\nabla^\star_\mu \tilde\nabla_\mu \Big) , \tag{6.4}$$
where the gauge covariant forward derivative is given by
$$\tilde\nabla_\mu \psi(x) = \frac{1}{a} \big( U_\mu(x)\, \psi(x + a\hat\mu) - \psi(x) \big) . \tag{6.5}$$
The piece
$$-\frac{1}{2}\, a r\, \tilde\nabla^\star_\mu \tilde\nabla_\mu \tag{6.6}$$
in action (6.4) is the so-called Wilson term. It is an irrelevant operator which however modifies the lattice dispersion relation for finite lattice spacing. It contributes a mass of the order of the cutoff to the doublers at the edges of the Brillouin zone. This mass becomes large in the continuum limit and the doublers decouple from the physical fermion. Of course, being a generalized (momentum-dependent) mass term, it necessarily breaks chiral symmetry. The connection between doublers and chiral symmetry is a deep one, and what we have shown is a particular case of a general phenomenon, as we will shortly see.

Thus, if one uses Wilson fermions chiral symmetry is broken and can at best only be recovered in the continuum limit. The breaking of chiral symmetry at finite lattice spacing has serious consequences for the Wilson theory, among them the appearance of an additive renormalization to the quark mass. The nonzero value of the bare quark mass which corresponds to a vanishing renormalized quark mass is called the critical mass. Its value depends on the strength of the interaction. There is then a critical line in the plane of bare parameters,
$$m_0 = m_c(g_0) , \qquad m_c(0) = 0 , \tag{6.7}$$
where the physical quark mass vanishes. It is only the subtracted mass $m_0 - m_c$ that is multiplicatively renormalizable:
$$m_R = Z_m\, (m_0 - m_c) . \tag{6.8}$$
$^{18}$ The Dirac matrices that one has to use in the transformed variables are in fact $\gamma'_\mu = (\gamma_0 \gamma_5)\, \gamma_\mu\, (\gamma_0 \gamma_5)^\dagger$.
This additive mass renormalization is the result of the breaking of chiral symmetry due to the presence of the Wilson term in the action. The bare and the renormalized quark mass cannot vanish at the same time, and the bare mass has to be carefully tuned in order to be able to extract physical information from numerical simulations. Usually Monte Carlo simulations are performed with lattice spacings that are not too small. As a consequence, violations of chiral symmetry due to these lattice artifacts are rather pronounced, and the mass renormalization is also quite large. The quark self-energy at 1 loop has a correction proportional to $1/a$, which corresponds to the mass counterterm which has to be introduced to repair the breaking of chirality. We will show in detail how to compute it in Section 15. This critical mass $m_c$ is one of the best known quantities in lattice perturbation theory, and its high-precision determination at 1 loop, as well as its 2-loop value, are reported in Section 19.2.2.

Another consequence of the loss of chiral symmetry is the appearance of a rather complicated mixing pattern under renormalization, because mixing among operators of different chirality is not a priori forbidden. One example of this kind of mixing will be given in Section 14.2. Moreover, it is problematic to define an axial current which would obey PCAC.

The impossibility of eliminating the doublers in the fermion action without at the same time breaking chiral symmetry or some important property of field theory is a special case of a very important no-go theorem, established by Nielsen and Ninomiya many years ago (Nielsen and Ninomiya, 1981a–c; Friedan, 1982). $^{19}$ In short, the theorem says that a lattice fermion formulation without fermion doubling and with an explicit continuous chiral symmetry is impossible, unless one is prepared to give up some other fundamental properties like locality or unitarity. $^{20}$

It is quite common in quantum field theory that introducing an ultraviolet regulator brings in unphysical features. The chiral symmetry issue is perhaps the most serious and unpleasant drawback of the lattice regularization. The difficulties with chiral symmetry already arise at the level of free fields, and are present even when only the time direction is discretized. We can understand why this happens from general topological considerations on the free fermion propagator, where the close interplay between chirality and doublers follows from the continuity of the energy-momentum relation in the Brillouin zone (Karsten and Smit, 1981) (see also the lectures of Smit, 1986).

$^{19}$ An alternative proof of the theorem which makes use of the Poincaré–Hopf theorem has been given in Karsten (1981).
$^{20}$ This statement only applies to the chiral symmetry which acts on the spinor fields like
$$\psi \to \psi + \epsilon \cdot \gamma_5 \psi , \tag{6.9}$$
$$\bar\psi \to \bar\psi + \epsilon \cdot \bar\psi \gamma_5 . \tag{6.10}$$
As we will shortly see, one of the major theoretical advances of the last years has been the understanding that there are other kinds of transformations that can define a lattice chiral symmetry and which do not necessarily imply fermion doubling.

Fig. 9. Examples of inverse propagator functions: a smooth function $P_\mu(k)$ which gives rise to a particle and its three doublers (top left), $P_\mu(k)$ for the naive fermion propagator (top right) and the SLAC propagator (bottom left), and the inverse of the propagator of a scalar particle (bottom right). The dashed parts of the curves lie outside the first Brillouin zone.

The general form of a massless lattice fermion propagator which is compatible with continuous chiral invariance is
$$\frac{1}{i \sum_\mu \gamma_\mu P_\mu(k)} , \tag{6.11}$$
where the four functions $P_\mu(k)$ are real and each has to vanish only once in the first Brillouin zone if this propagator is meant to describe for $a \to 0$ only one single fermion. Let us assume at first that $P_\mu(k)$ is a continuous function. If the derivative in the Dirac operator in the action is anti-hermitian (like in the Wilson case), the theory is unitary and the function $P_\mu(k)$ is periodic in $ak_\mu$ with period $2\pi$. Since it must have a first order zero at $k_\mu = 0$ (and because of periodicity also at $ak_\mu = 2\pi n$), it must have another zero somewhere else in the first Brillouin zone, but with opposite derivative, hence with opposite chirality (see Fig. 9, top left). This crossing signals the presence of a doubler, therefore it is unavoidable to have these extra particles in the theory. This argument only comes from very general features of the propagator, and is independent of the particular shape of the function $P_\mu(k)$, as long as it is continuous.

Another heuristic proof based on topological considerations, as well as a more general explanation of why the doublers come in pairs of opposite chirality, has been given in Wilczek (1987). In this work a particular choice of the function $P_\mu(k)$ which minimizes the number of doublers is also proposed, which has the drawback of destroying the equivalence of the four directions under discrete permutations, and of requiring new operators (but no $1/a$ counterterm as in the Wilson case).

Let us come back to the naive propagator in Eq. (6.1), i.e., $P_\mu(k) = \sin a k_\mu$. One crucial feature here is that the argument of the sine function is $a k_\mu$, which is a consequence of the fact that one uses the anti-hermitian derivative $\nabla_\mu + \nabla^\star_\mu$. This choice causes all our problems. If the argument of the sine function were $a k_\mu / 2$, things would be completely different, since $\sin(a k_\mu / 2)$ is antiperiodic, that is it has a period of $4\pi$ in $a k_\mu$, and there would be no other crossing of the $k$ axes by the $P$ function inside the first Brillouin zone. Then no doublers would appear. However, $\sin(a k_\mu / 2)$ can only appear in the propagator if one uses either the forward or the backward derivative, but not their sum. These derivatives are not anti-hermitian, and unfortunately in this case the theory turns out not to be unitary, which causes other kinds of serious problems. The fact is that $\nabla$ and $\nabla^\star$ are unphysical, and they propagate the fermion only in the forward or backward direction, and so using only one of them cannot lead to a Lorentz-invariant theory in Minkowski space. It has also been shown, in the abelian case, that if one uses only the forward or backward derivative, then the interactions generate noncovariant contributions to the quark self-energy and vertex function, and the theory is nonrenormalizable (Sadooghi and Rothe, 1997).

Note that a scalar propagator does not have this problem, as it is the solution of a second-order differential equation. Therefore the function $P(k)$ is in this case quadratic (no $\gamma$'s are present), and linear crossings are replaced by second-order zeros. Therefore, even if doublers were present, they would cause no problems to the chiral properties of the lattice theory. In any case, the scalar kinetic operator can be chosen to be discretized using $\nabla^\star \nabla$, which is hermitian and produces a $\sin^2(ak/2)$ function in the propagator, as is well known. Therefore in this case the function has a minimum at the origin without further crossings in the first Brillouin zone (see Fig. 9, bottom right), and there are no doublers. In the fermionic case, the only other way to avoid the second crossing would be to consider a discontinuous function $P_\mu(k)$.
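The zero-crossing argument can be made concrete with a small numerical scan. The sketch below works in the variable $x = a k_\mu$ for a single direction, uses the shifted zone $[-\pi/2, 3\pi/2)$ so that both zeros of the naive function lie in the interior, and records the sign of the slope at each crossing; the helper name and the grid size are arbitrary assumptions.

```python
import math

def zero_crossings(p, lo, hi, n=4001):
    """Locate sign changes of p on [lo, hi).

    Returns a list of (approximate root, slope sign). The odd grid size
    keeps the zeros of the test functions away from the grid points.
    """
    found = []
    step = (hi - lo) / n
    for i in range(n):
        x0, x1 = lo + i * step, lo + (i + 1) * step
        f0, f1 = p(x0), p(x1)
        if f0 * f1 < 0.0:
            root = x0 - f0 * (x1 - x0) / (f1 - f0)  # linear interpolation
            found.append((root, 1 if f1 > f0 else -1))
    return found

lo, hi = -math.pi / 2, 3 * math.pi / 2   # shifted first Brillouin zone in a*k
naive = zero_crossings(lambda x: math.sin(x), lo, hi)
half = zero_crossings(lambda x: math.sin(x / 2), lo, hi)

print("sin(ak):  ", [(round(x, 3), s) for x, s in naive])
print("sin(ak/2):", [(round(x, 3), s) for x, s in half])
```

For $\sin a k_\mu$ the scan finds two crossings, at $ak_\mu = 0$ with positive slope and at $ak_\mu = \pi$ with negative slope (the doubler of opposite chirality), whereas $\sin(a k_\mu/2)$ crosses zero only at the origin.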
The most famous example of this is given by the SLAC propagator (Drell et al., 1976a, b), for which $P_\mu(k) = k_\mu$ throughout the whole Brillouin zone (see Fig. 9, bottom left). SLAC fermions have been studied in perturbation theory in Karsten and Smit (1978, 1979). However, this choice implies a nonlocality in the lattice action (it corresponds to a nonlocal lattice derivative), which leads to many problems regarding the very existence of the continuum limit. The locality assumption is important in order to avoid disasters in the weak coupling expansion. Otherwise, propagators and vertex functions would not be analytic in the momenta. Another type of nonlocal chiral action has been proposed in Rebbi (1987), but we will not discuss it here.

At the end of the day the origin of the fermion doubling lies in the fact that the Dirac equation is first order. Doublers in the naive fermion action are a necessary feature, and we can understand why it is so by also looking at quantum anomalies. We know that in continuum quantum field theory quantum corrections, more precisely the process of regularization, break chiral symmetry. A mass scale appears in the renormalized theory, $^{21}$ because unphysical massive degrees of freedom need to be introduced. Upon the removal of the ultraviolet cutoff it may happen that not all the unphysical degrees of freedom actually decouple. When this occurs not all the symmetries of the formal continuum action can be recovered and quantum anomalies appear. So, even in theories that are chirally symmetric the axial current may acquire an anomalous divergence through quantum effects.

$^{21}$ For example, in dimensional regularization (which preserves gauge symmetry) one needs to introduce a scale to define the coupling constant in noninteger dimensions, and in Pauli–Villars one introduces a heavy particle with wrong metric.

Naive lattice fermions can be thought of as a regularization of Dirac fermions that does not break chiral symmetry for any finite $a$. The naive lattice theory has no anomalies, and this implies that extra particles (doublers) are present in order to cancel the axial anomaly. When one tries to remove the doublers from the game, then the anomaly is back again. This situation must then correspond to a regularization which somehow has to break chiral symmetry, and in fact we end up with the Wilson lattice action. So, everything fits in the general picture.

After these general considerations we are now ready to state the Nielsen–Ninomiya theorem. In one of its formulations it says that the lattice massless Dirac operator $D = \gamma_\mu D_\mu$ in the fermionic action
$$S^F = a^4 \sum_{x,y} \bar\psi(x)\, D(x - y)\, \psi(y) \tag{6.12}$$
cannot satisfy the following four properties at the same time:
(a) $D(x)$ is local (in the sense that it is bounded by $C e^{-\gamma |x|}$);
(b) its Fourier transform has the right continuum behavior for small $p$: $\tilde D(p) = i \gamma_\mu p_\mu + O(a p^2)$;
(c) $\tilde D(p)$ is invertible for $p \neq 0$ (and hence there are no massless doublers);
(d) $\gamma_5 D + D \gamma_5 = 0$ (it is invariant under chiral transformations).
Therefore, for any given lattice action at least one of these conditions has to fail. In particular, naive fermions have doublers and therefore do not satisfy (c), Wilson fermions break chiral symmetry and therefore do not satisfy (d), and SLAC fermions are not local and therefore do not satisfy (a). The case of staggered fermions (another widely used, and old, fermion formulation which has been useful for studying problems in which chiral symmetry is relevant, and which will be discussed in the next section) is more complicated from this point of view: only a $U(1) \otimes U(1)$ subgroup of the full $SU(N_f) \otimes SU(N_f)$ chiral group remains unbroken, and the doublers are removed only partially.

Contrary to what one would naively expect from the Nielsen–Ninomiya theorem, it is still possible to construct a Dirac operator which satisfies (a)–(c) and is also chirally invariant. The solution to this apparent paradox is that the corresponding chiral symmetry is not the one associated with a Dirac operator which anticommutes with $\gamma_5$, and the condition (d) is instead replaced by the Ginsparg–Wilson relation, according to which $\gamma_5 D + D \gamma_5$ is not zero, but is proportional to $a D \gamma_5 D$. Thus, the actual lattice chiral symmetry turns out not to be what one would naively expect. The Nielsen–Ninomiya theorem is valid but one can have a nonpathological formulation of chiral fermions with no doublers. $^{22}$

Ginsparg–Wilson fermions will be discussed in detail in Section 8. Before that, we will briefly present staggered fermions. If one wants to maintain some form of chiral symmetry, but is prepared to give up flavor symmetry, then staggered fermions are the ideal fermions to work with. Otherwise, the only way to maintain chiral symmetry and flavor symmetry at the same time (and of course all other fundamental properties like locality, unitarity etc.) leads again to the Ginsparg–Wilson relation.
$^{22}$ When the condition that the Dirac operator anticommutes with $\gamma_5$ is released (at $a \neq 0$), the lattice quark propagator is not restricted to be of the form (6.11) and the considerations about the presence of the doublers deriving from it are no longer valid. In fact, one finds more general forms of the fermion propagator (see for instance the overlap propagator in Eqs. (8.20) and (8.21)).
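Conditions (c) and (d) of the theorem can be checked explicitly for the free naive and Wilson operators in momentum space. The sketch below is a hedged illustration: it assumes the massless operator read off from the inverse of Eq. (5.74), $\tilde D(p) = \frac{i}{a}\sum_\mu \gamma_\mu \sin a p_\mu + \frac{2r}{a}\sum_\mu \sin^2\frac{a p_\mu}{2}$, one particular hermitian (chiral) representation of the Euclidean $\gamma$ matrices chosen here for convenience, and hypothetical helper names.

```python
import math

# One hermitian Euclidean representation of the gamma matrices (assumption).
G = [
    [[0, 0, 0, -1j], [0, 0, -1j, 0], [0, 1j, 0, 0], [1j, 0, 0, 0]],
    [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(cx, x, cy, y):
    """Linear combination cx*x + cy*y of two 4x4 matrices."""
    return [[cx * x[i][j] + cy * y[i][j] for j in range(4)] for i in range(4)]

def norm(x):
    return max(abs(v) for row in x for v in row)

G5 = mul(mul(G[0], G[1]), mul(G[2], G[3]))  # anticommutes with every G[mu]

def dirac_operator(ap, r, a=1.0):
    """Free massless lattice Dirac operator; r = 0 gives naive fermions."""
    d = [[0j] * 4 for _ in range(4)]
    for g, x in zip(G, ap):
        d = lin(1, d, 1j * math.sin(x) / a, g)
    wilson = 2.0 * r / a * sum(math.sin(x / 2) ** 2 for x in ap)
    return lin(1, d, wilson, I4)

def chiral_violation(d):
    """Largest entry of gamma_5 D + D gamma_5, i.e. the failure of (d)."""
    return norm(lin(1, mul(G5, d), 1, mul(d, G5)))

generic = (0.3, -0.7, 1.1, 0.4)      # arbitrary momentum a*p (assumption)
corner = (math.pi, 0.0, 0.0, 0.0)    # one of the doubler momenta

naive_generic = chiral_violation(dirac_operator(generic, r=0.0))
wilson_generic = chiral_violation(dirac_operator(generic, r=1.0))
naive_corner = norm(dirac_operator(corner, r=0.0))
wilson_corner = dirac_operator(corner, r=1.0)[0][0].real

print(f"naive:  |{{g5, D}}| = {naive_generic:.1e}, |D| at corner = {naive_corner:.1e}")
print(f"Wilson: |{{g5, D}}| = {wilson_generic:.3f}, mass at corner = {wilson_corner:.3f} / a")
```

The naive operator anticommutes with $\gamma_5$ but vanishes at the corner (it fails (c)), while the Wilson operator is invertible there, with a mass $2r/a$ of the order of the cutoff, at the price of a nonzero anticommutator (it fails (d)).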
7. Staggered fermions

Another formulation of fermions on the lattice which is quite popular can be obtained by keeping part of the doublers present in the naive fermion action, interpreting them as extra flavors. One remains with 4 fermionic flavors whose 16 components are scattered over a unit hypercube by assigning only one single fermion field component to each lattice site. This construction, which can only be carried out in an even number of spacetime dimensions, gives the staggered, or Kogut–Susskind, fermions (Kogut and Susskind, 1975; Banks et al., 1976; Susskind, 1977). It turns out that a continuous subgroup of the original chiral transformations remains as a symmetry of the lattice action even at finite lattice spacing, and thus no mass counterterm is needed for the vanishing of the renormalized quark mass. All this is achieved at the expense of a breaking of flavor and translational symmetry, which become in fact all mixed together.

The idea is to use only one spinor component for each Dirac spinor at each site. One has to single out this component from the remaining three, which decouple from the theory. This is accomplished through the following change of variables called spin-diagonalization (Kawamoto and Smit, 1981)
$$\psi(x) = \gamma(x)\, \chi(x) , \tag{7.1}$$
$$\bar\psi(x) = \bar\chi(x)\, \gamma^\dagger(x) , \tag{7.2}$$
where
$$\gamma(x = an) = \gamma_0^{n_0}\, \gamma_1^{n_1}\, \gamma_2^{n_2}\, \gamma_3^{n_3} \tag{7.3}$$
depends only on $n_\mu$ mod 2 (because $\gamma_\mu^2 = 1$). In the new spinor variables $\chi(x)$ and $\bar\chi(x)$ the naive fermion action
$$S^f = a^4 \sum_{x,\mu} \frac{1}{2a}\, \bar\psi(x)\, \gamma_\mu \big[ U_\mu(x)\, \psi(x + a\hat\mu) - U^\dagger_\mu(x - a\hat\mu)\, \psi(x - a\hat\mu) \big] + a^4 \sum_x m_f\, \bar\psi(x)\, \psi(x) \tag{7.4}$$
becomes
$$S^f_{\mathrm{stagg}} = a^4 \sum_{x,\mu} \frac{1}{2a}\, \bar\chi(x)\, \eta_\mu(x) \big[ U_\mu(x)\, \chi(x + a\hat\mu) - U^\dagger_\mu(x - a\hat\mu)\, \chi(x - a\hat\mu) \big] + a^4 \sum_x m_f\, \bar\chi(x)\, \chi(x) . \tag{7.5}$$
Up to now, what we have done is just a rewriting of the usual naive action in terms of new variables. The crucial thing now is that the Dirac matrices have disappeared, and they have been replaced by the phase factors
$$\eta_\mu(x = an) = (-1)^{\sum_{\nu < \mu} n_\nu} . \tag{7.6}$$
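The spin-diagonalization can be verified directly: inserting Eqs. (7.1) and (7.2) into the hopping term of Eq. (7.4) requires $\gamma^\dagger(x)\, \gamma_\mu\, \gamma(x + a\hat\mu) = \eta_\mu(x) \cdot 1$. The sketch below checks this identity on all 16 sites of a unit hypercube and in all four directions. The explicit hermitian $\gamma$-matrix representation and the identification of list index 0 to 3 with $\gamma_0, \ldots, \gamma_3$ are assumptions made for illustration (any four mutually anticommuting matrices with unit square would do), and the function names are hypothetical.

```python
import itertools

# Hermitian Euclidean gamma matrices; G[mu] stands for gamma_mu (assumption).
G = [
    [[0, 0, 0, -1j], [0, 0, -1j, 0], [0, 1j, 0, 0], [1j, 0, 0, 0]],
    [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(4)] for i in range(4)]

def spin_diag_matrix(n):
    """gamma(x = a n) = gamma_0^n0 gamma_1^n1 gamma_2^n2 gamma_3^n3, Eq. (7.3)."""
    out = I4
    for g, power in zip(G, n):
        if power % 2:          # gamma_mu^2 = 1, so only n_mu mod 2 matters
            out = mul(out, g)
    return out

def eta(mu, n):
    """Staggered phase eta_mu(x) = (-1)^(sum_{nu < mu} n_nu), Eq. (7.6)."""
    return (-1) ** sum(n[:mu])

max_dev = 0.0
for n in itertools.product((0, 1), repeat=4):
    for mu in range(4):
        shifted = list(n)
        shifted[mu] += 1       # neighbouring site x + a*mu_hat
        lhs = mul(dagger(spin_diag_matrix(n)),
                  mul(G[mu], spin_diag_matrix(shifted)))
        for i in range(4):
            for j in range(4):
                target = eta(mu, n) if i == j else 0
                max_dev = max(max_dev, abs(lhs[i][j] - target))

print("largest deviation from eta_mu(x) * identity:", max_dev)
```

Since the result is proportional to the unit matrix in spinor space, the four components of $\chi$ never mix in the action, which is what allows three of them to be discarded.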
Thus in action (7.5) the 4 components of the spinor $\chi(x)$ are decoupled from one another. We are then allowed to keep only one spinor component out of four, and forget about the others. Since we started with the action of naive fermions, with 16 doublers, after the spin-diagonalization we end up only with 4 doublers. This theory is a theory of 4 flavors.

The phase factors $\eta_\mu(x)$ bring a minus sign in the action for every translation of one lattice spacing $a$, and divide the lattice into even and odd sites, which then form two independent sublattices. This makes the action invariant only under translations of two lattice spacings. Translations of an even number of lattice spacings correspond to ordinary continuum translations, whereas a translation of an odd number of lattice spacings swaps the chiral components and should be interpreted as a chiral rotation of $\pi/2$. It is then natural to take as fundamental objects hypercubes of linear size $a$ (which contain 16 sites) rather than single sites. The 16 spinor components of the 4 flavors of the theory at each site $x$ can be assigned to the $2^4$ vertices of such a hypercube whose $(0, 0, 0, 0)$ vertex is the point $x$. In the continuum limit each $2^4$-hypercube will then be mapped to a single physical point. In this way one distributes the fermionic degrees of freedom over the lattice, leaving only one degree of freedom per lattice site. Just like the 4 components of each of the 4 continuum quarks, the components of the continuum $\gamma$ matrices are also spread over these $2^4$-hypercubes. With this construction there is an effective doubling of the lattice spacing, which becomes $2a$, and the staggered formulation effectively lives in half the Brillouin zone.

This is the way the problem of doublers is (partially) solved. One works with an action in which there is only one independent Grassmann variable $\chi(x)$ per site. Since in $d$ dimensions the unit hypercube has $2^d$ sites, and a Dirac spinor has $2^{d/2}$ components (for even $d$), one needs $2^{d/2}$ fermion fields to carry out this construction. In four dimensions this corresponds to 4 flavors, and everything fits together.
This counting also shows that it is not possible to remove all the doublers in this way, because 16, and not 4, is the minimal number of sites of a four-dimensional unit hypercube. With the construction above, the 16-fold original degeneracy has been reduced to a 4-fold degeneracy. Actually, strictly speaking the 4 flavors are degenerate only in the continuum limit, while at finite lattice spacing the $SU(4)$ symmetry is broken. For nonzero $a$ the action maintains nevertheless an exact $U(1)_V$ symmetry,
$$\chi(x) \to e^{i\theta_V}\, \chi(x) , \qquad \bar\chi(x) \to \bar\chi(x)\, e^{-i\theta_V} , \tag{7.7}$$
corresponding to fermion number conservation, which in the case of vanishing bare masses is accompanied by a flavor nonsinglet axial $U(1)_A$ symmetry,
$$\chi(x) \to e^{i\theta_A \cdot (-1)^{n_0 + n_1 + n_2 + n_3}}\, \chi(x) , \qquad \bar\chi(x) \to \bar\chi(x)\, e^{i\theta_A \cdot (-1)^{n_0 + n_1 + n_2 + n_3}} . \tag{7.8}$$
This chiral $U(1)_L \otimes U(1)_R$ group is all that remains of the original $SU(4)_L \otimes SU(4)_R$ symmetry, but it is enough to guarantee that quark masses are not additively renormalized. Thus, no mass counterterm is required if one starts with a zero bare mass, and this is a great advantage of staggered over Wilson fermions. The intertwining of spin and flavor is a major disadvantage of staggered fermions, and the correct spin-flavor structure is only recovered in the continuum limit. Much of the work in staggered calculations goes into the reconstruction of the continuum quark fields and operators from the 16 one-component fields (carrying flavor and spin components) spread over the hypercubes. This can be complicated, and calculations with staggered fermions can thus become rather involved. For details about the way staggered perturbation theory is set up, the reader is referred to the works
S. Capitani / Physics Reports 382 (2003) 113 – 302
of Sharatchandra et al. (1981), van den Doel and Smit (1983), Golterman and Smit (1984a, b), Göckeler (1984), Daniel and Sheard (1988), Patel and Sharpe (1993), Sharpe and Patel (1994), and Ishizuka and Shizawa (1994). The difficult part in dealing with staggered fermions lies in explicitly writing down the discretized expression of a continuum operator endowed with certain symmetry properties. The construction of staggered lattice operators and the interpretation of their components in terms of spin and flavor turn out to be quite complicated. There are also different possibilities for the assignments of the various spin and flavor components. These complications are the price to pay for the fact that staggered fermions are numerically quite cheap. The mixing of operators under renormalization when staggered fermions are used can also become quite complicated. Flavor symmetry breaking generates mixings which were not present in the original continuum theory. For example, a four-fermion operator with a certain flavor structure will mix already at 1 loop with many other four-fermion operators with flavor structures different from the original one. This quickly renders perturbative calculations technically involved. Gamma matrices are split as well, and the presence of color matrices (if one has $U$ fields in the operators) or derivatives entangles things even more. To my knowledge no one has carried out the calculation of the renormalization of an operator like $\bar\psi\gamma_\mu D_\nu\psi$ with staggered fermions. We will present the renormalization of this operator with Wilson fermions in Section 15.4. On the other hand, the perturbative calculations for $\epsilon'/\epsilon$ are at a rather advanced stage (for a recent work see Lee (2001); other recent perturbative calculations with staggered fermions can be found in Hein et al. (2002), Nobes et al. (2002) and Lee and Sharpe (2002a, b)).
Of course it makes much more sense to spend a lot of effort on effective weak Hamiltonian operators, given the good chiral properties of staggered fermions. But recently the full understanding of the implications of the Ginsparg–Wilson relation has opened new ways for the investigation of these matrix elements on the lattice.

8. Ginsparg–Wilson fermions

8.1. The Ginsparg–Wilson relation

For many years it was believed that the Nielsen–Ninomiya theorem was the final word on the possibility of having chiral fermions on the lattice. The (now) fundamental paper by Ginsparg and Wilson (1982), in which the mildest breaking of chiral symmetry was introduced in a study of the block-spin renormalization group in lattice QCD, appeared shortly after the work of Nielsen and Ninomiya, but remained almost unnoticed. Its importance was recognized only 15 years later. The Ginsparg–Wilson relation lay dormant for all this time, until it was rediscovered in the context of perfect actions (Hasenfratz, 1998a). Shortly after, overlap fermions and domain wall fermions were also recognized to have a Dirac operator satisfying this relation. All these formulations of chiral fermions had been introduced a few years before following ingenious ideas, and well before anyone had the Ginsparg–Wilson relation specifically in mind. Many interesting new developments have come out after the rediscovery of the paper of Ginsparg and Wilson, and for a general overview of these developments we refer to the excellent reviews of Niedermayer (1999), Creutz (2001), Lüscher (2001), and Neuberger (2001), which cover different aspects of the approach, as well as the shorter and original discussion of the main ideas given in Hernández et al. (2002a).
An up-to-date discussion of the numerical results which have been obtained using Ginsparg–Wilson fermions can be found in Giusti (2002). A Dirac operator $D$ which satisfies the Ginsparg–Wilson relation
$$\gamma_5 D + D\gamma_5 = \frac{a}{\rho}\,D\gamma_5 D \tag{8.1}$$
and the hermiticity condition $D^\dagger = \gamma_5 D\gamma_5$ defines fermions which have exact chiral symmetry, no doublers$^{23}$ and which also satisfy all other fundamental requirements of a sensible field theory like flavor symmetry, locality (Hernández et al., 1999),$^{24}$ unitarity and gauge invariance. Lüscher has shown that fermions obeying the Ginsparg–Wilson relation possess an exact chiral symmetry at finite lattice spacing, which is of the form (Lüscher, 1998)
$$\psi \to \psi + \epsilon\cdot\gamma_5\left(1 - \frac{a}{\rho}D\right)\psi, \tag{8.2}$$
$$\bar\psi \to \bar\psi + \epsilon\cdot\bar\psi\gamma_5. \tag{8.3}$$
Note the asymmetric way in which $\psi$ and $\bar\psi$ appear in the transformation.$^{25}$ The global anomaly of the original continuum fermions is also reproduced at nonzero lattice spacing (as was already noticed in the case of domain wall fermions by Jansen (1992)). In terms of the quark propagator the Ginsparg–Wilson relation reads
$$S(x,y)\gamma_5 + \gamma_5 S(x,y) = \frac{a}{\rho}\,\gamma_5\,\delta(x-y), \tag{8.6}$$
which implies that the propagator is chirally invariant at all nonzero distances, i.e., on the mass shell. This surprising result is a new formulation of chiral symmetry that can coexist with a momentum cutoff. The fact is that chiral symmetry can be realized on the lattice in ways other than the naive expectation, without violating the Nielsen–Ninomiya theorem. Ginsparg–Wilson fermions do not obey the anticommutation relation of the Dirac operator with $\gamma_5$, which is only recovered in the continuum limit. Chirality remains a symmetry of the lattice theory also for nonzero lattice spacing, as are flavor symmetry and the other fundamental symmetries.

$^{23}$ Although in general the Ginsparg–Wilson relation does not guarantee the absence of doublers, we are of course only interested in actions which have no doublers and the solutions that we will discuss are all of this kind.

$^{24}$ Locality in this context does not have the meaning of strict locality, but it is to be understood in the larger sense that the strength of the interaction decays exponentially with the distance in lattice units. It becomes microscopically small when one considers the continuum limit.

$^{25}$ Although to leading order in $\epsilon$ they can be rewritten in the symmetric form
$$\psi \to \psi + \epsilon\cdot\gamma_5\left(1 - \frac{a}{2\rho}D\right)\psi, \tag{8.4}$$
$$\bar\psi \to \bar\psi + \epsilon\cdot\bar\psi\left(1 - \frac{a}{2\rho}D\right)\gamma_5. \tag{8.5}$$

The chiral symmetry associated with the Ginsparg–Wilson relation can be used to define left- and right-handed fermions. We first note that the operator
$$\hat\gamma_5 = \gamma_5\left(1 - \frac{a}{\rho}D\right) \tag{8.7}$$
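satisfies
$$(\hat\gamma_5)^\dagger = \hat\gamma_5, \qquad (\hat\gamma_5)^2 = 1, \qquad D\hat\gamma_5 = -\gamma_5 D. \tag{8.8}$$
The projectors
$$\hat P_\pm = \frac12(1 \pm \hat\gamma_5), \qquad P_\pm = \frac12(1 \pm \gamma_5), \tag{8.9}$$
can then be used to separate the two chiral sectors of the theory. In particular, the constraints
$$\hat P_-\psi = \psi, \tag{8.10}$$
$$\bar\psi P_+ = \bar\psi, \tag{8.11}$$
define the left-handed fermions. Lüscher's chiral symmetry can then be rewritten in the rather appealing form
$$\psi \to \psi + \epsilon\cdot\hat\gamma_5\psi, \tag{8.12}$$
$$\bar\psi \to \bar\psi + \epsilon\cdot\bar\psi\gamma_5. \tag{8.13}$$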
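The last two properties in Eq. (8.8) follow from the Ginsparg–Wilson relation (8.1) in a couple of lines. The short derivation below is a worked check added for convenience; the shorthand $c = a/\rho$ is introduced here only to keep the formulas compact.

```latex
% Shorthand (for this check only): c = a/\rho, so \hat\gamma_5 = \gamma_5 (1 - c D)
% and the Ginsparg-Wilson relation reads \gamma_5 D + D \gamma_5 = c D \gamma_5 D.
\begin{align*}
D\hat\gamma_5 &= D\gamma_5 - c\,D\gamma_5 D
               = D\gamma_5 - (\gamma_5 D + D\gamma_5)
               = -\gamma_5 D , \\
(\hat\gamma_5)^2 &= \gamma_5 (1 - cD)\,\hat\gamma_5
                  = \gamma_5\hat\gamma_5 - c\,\gamma_5 D\hat\gamma_5
                  = (1 - cD) + c\,\gamma_5\gamma_5 D
                  = 1 .
\end{align*}
```

The second line uses the first one, together with $\gamma_5\hat\gamma_5 = 1 - cD$ and $\gamma_5^2 = 1$.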
It is to be remarked here that the definition of left-handed fields depends on the gauge fields $U$. The implications of this fact will be fully appreciated in the construction of lattice chiral gauge theories, outlined in Section 9. Known solutions of the Ginsparg–Wilson relation are overlap, domain wall and fixed-point ("classically perfect") fermions. We will now discuss these three cases in some detail. A few other actions which are approximate solutions of the Ginsparg–Wilson relation have sprung up in recent years. We will not discuss them here. They are much more complicated than these three cases, especially from the point of view of perturbation theory. We will see that overlap, domain wall and fixed-point fermions already bring a lot of complications into the perturbative expansion of the theory. As for current algebra, we recall that using Ginsparg–Wilson fermions it is also possible to define in the massless limit conserved nonsinglet axial currents, and of course conserved vector currents (something which was possible already in the Wilson case). The form of these Noether currents is rather complicated. They extend over all lattice sites, and their kernels decay exponentially with the distance from the "physical" point (where the corresponding continuum current is located). These are then still local currents in the sense explained above. They are given explicitly in Kikukawa and Yamada (1999b) in the case of overlap fermions. Last but not least, in the flavor singlet sector the conservation of the axial current is violated by the $U_A(1)$ quantum anomaly (Hasenfratz et al., 1998; Lüscher, 1998). The related anomalous singlet Ward–Takahashi identities entail the Witten–Veneziano solution (Witten, 1979; Veneziano, 1979, 1980) of the $\eta'$ puzzle (Giusti et al., 2002).

8.2. Overlap fermions

Almost 10 years ago Narayanan and Neuberger (1993a, b, 1994, 1995), motivated by mathematical insights and previous theoretical developments (Callan and Harvey, 1985; Kaplan, 1992; Frolov
and Slavnov, 1993), devised an ingenious construction with which it was shown how to define chiral fermions on the lattice. The chiral mode resulted from the "overlap" between two infinite towers of chiral fermion fields. The existence of an infinite number of fermions for each lattice site was the crucial new feature which was understood to be necessary in order to construct chiral fermions on the lattice. The Dirac operator coming out of this formalism was later recognized by Neuberger (1998a–c) to be a solution of the Ginsparg–Wilson relation, and its action was given a simple form. In the massless case$^{26}$ the overlap-Dirac operator is$^{27}$
$$D_N = \frac{\rho}{a}\left(1 + X\frac{1}{\sqrt{X^\dagger X}}\right), \qquad X = D_W - \frac{\rho}{a}, \tag{8.16}$$
where $D_W$ is the usual Wilson–Dirac operator:
$$D_W = \frac12\left(\gamma_\mu(\tilde\nabla^\star_\mu + \tilde\nabla_\mu) - ar\,\tilde\nabla^\star_\mu\tilde\nabla_\mu\right), \qquad \tilde\nabla_\mu\psi(x) = \frac1a\left(U_\mu(x)\psi(x + a\hat\mu) - \psi(x)\right). \tag{8.17}$$
In the range $0 < \rho < 2r$ (at tree level) a chiral spectrum of massless fermions is obtained. For the pure gauge part the standard Wilson plaquette action is most often used. Since additive mass renormalization is forbidden by chiral symmetry, when using overlap fermions one avoids altogether a source of systematic errors always present with Wilson fermions. Although the overlap interaction range is not limited to nearest-neighbor sites, and not even to next-to-nearest-neighbor sites, but in fact involves all sites, the strength of the interaction falls off exponentially with the distance (in lattice units), and in this sense the theory is still local (though not ultralocal). Let us now have a look at the structure of perturbation theory with overlap fermions. The interaction vertices and the quark propagator are much more complicated than the ones in the Wilson formulation. This causes the perturbative computations to be rather cumbersome, and the help of a computer is necessary even in the simplest cases. The calculations in Capitani (2001a, b), Capitani and Giusti (2000, 2001) and Capitani (2002a) have been carried out using a set of routines written in the symbolic manipulation language FORM. In several cases these routines are an extension of the ones used to perform calculations with the Wilson action.

$^{26}$ When quarks have a nonzero bare mass $m_0$ then the overlap-Dirac operator is given by
$$D_N^{(m_0)} = \left(1 - \frac{1}{2\rho}am_0\right)D_N + m_0. \tag{8.14}$$

$^{27}$ We note that any $X$ which satisfies
$$\gamma_5 X^\dagger = X\gamma_5 \tag{8.15}$$
makes $D_N$ a solution of the Ginsparg–Wilson relation. Using the Wilson action is the standard option, although other actions have sometimes been advocated for $X$, with the aim of "improving" things. Such generalized overlap fermions have been proposed by Bietenholz (1999, 2001, 2002) and Bietenholz and Hip (2000). They showed that when $X$ is a truncated perfect action (see later) the convergence properties of these fermions are improved, together with their locality and other symmetries. The presence in $X$ of fat link actions (Section 15.7) can further improve things (Bietenholz, 2001; DeGrand, 2001). Perturbation theory however becomes much more complicated in all these cases.
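In deriving the Feynman rules we start by noticing that in the free case one gets in momentum space (see Eq. (8.16))
$$X_0(k) = \frac1a\left(i\sum_\mu\gamma_\mu\sin ak_\mu + 2r\sum_\mu\sin^2\frac{ak_\mu}{2} - \rho\right). \tag{8.18}$$
The massless quark propagator, which is computed by inverting the matrix
$$\frac{\rho}{a}\left(1 + \frac{X_0(k)}{\sqrt{X_0^\dagger(k)X_0(k)}}\right), \tag{8.19}$$
is then given by
$$S^{ab}(k) = \delta^{ab}\left(\frac{-i\sum_\mu\gamma_\mu\sin ak_\mu}{2\rho\,(\omega(k) + b(k))} + \frac{a}{2\rho}\right), \tag{8.20}$$
where
$$\omega(k) = \left(\sqrt{X^\dagger X}\right)_0(k) = \frac1a\sqrt{\sum_\mu\sin^2 ak_\mu + \left(2r\sum_\mu\sin^2\frac{ak_\mu}{2} - \rho\right)^2}, \qquad b(k) = \frac1a\left(2r\sum_\mu\sin^2\frac{ak_\mu}{2} - \rho\right). \tag{8.21}$$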
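That Eq. (8.20) is indeed the inverse of the matrix (8.19) can be checked numerically without writing out the gamma matrices. The sketch below is an illustration under a stated assumption: every operator involved has the form $p + qE$ with $E = (i/a)\sum_\mu\gamma_\mu\sin ak_\mu$ and $E^2 = -(1/a^2)\sum_\mu\sin^2 ak_\mu$, so it is enough to track the pair $(p, q)$. The helper names are hypothetical.

```python
import math


def clifford_mul(x, y, s2):
    """Multiply two elements p + q*E, where E*E = -s2."""
    (p1, q1), (p2, q2) = x, y
    return (p1 * p2 - q1 * q2 * s2, p1 * q2 + q1 * p2)


def free_overlap(k, a=1.0, rho=1.0, r=1.0):
    """Return (D0, S, s2) as (p, q) pairs for momentum k = (k_0, ..., k_3).

    D0 is the free operator of Eq. (8.19), S the propagator of Eq. (8.20).
    The massless pole at k = 0 (omega + b = 0) must be avoided.
    """
    s2 = sum(math.sin(a * km) ** 2 for km in k) / a ** 2
    b = (2.0 * r * sum(math.sin(a * km / 2.0) ** 2 for km in k) - rho) / a
    omega = math.sqrt(s2 + b * b)
    d0 = (rho / a * (1.0 + b / omega), rho / (a * omega))
    s = (a / (2.0 * rho), -a / (2.0 * rho * (omega + b)))
    return d0, s, s2


def product_d0_s(k, **params):
    """Return D0 * S as a (p, q) pair; it should equal (1, 0)."""
    d0, s, s2 = free_overlap(k, **params)
    return clifford_mul(d0, s, s2)


if __name__ == "__main__":
    print(product_d0_s((0.3, -1.1, 0.7, 2.0), a=0.5, rho=1.4))
```

For any momentum away from the pole the product comes out as the identity, with a vanishing coefficient of $E$, up to rounding errors.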
To compute the interaction vertices one first expands $X$ order by order in $g_0$:
$$X(p_1,p_2) = X_0(p_1)(2\pi)^4\delta^{(4)}(p_1 - p_2) + X_1(p_1,p_2) + X_2(p_1,p_2) + O(g_0^3), \tag{8.22}$$
where $X_0$ is given in Eq. (8.18), and the $X_i$'s are the interaction vertices of the Wilson action. For 1-loop calculations one needs to compute only the vertices of order $g_0$ and $g_0^2$. It is convenient to write the expansion of the inverse of the square root in the form
$$\frac{1}{\sqrt{X^\dagger X}}(p_1,p_2) = \left(\frac{1}{\sqrt{X^\dagger X}}\right)_0(p_1,p_2) + Y_1(p_1,p_2) + Y_2(p_1,p_2) + O(g_0^3). \tag{8.23}$$
$Y_1$ and $Y_2$ can be obtained by imposing the identity (Kikukawa and Yamada, 1999a)
$$\int_{-\pi/a}^{\pi/a}\frac{d^4q}{(2\pi)^4}\int_{-\pi/a}^{\pi/a}\frac{d^4r}{(2\pi)^4}\,(X^\dagger X)(p_1,q)\,\frac{1}{\sqrt{X^\dagger X}}(q,r)\,\frac{1}{\sqrt{X^\dagger X}}(r,p_2) = (2\pi)^4\delta^{(4)}(p_1 - p_2) \tag{8.24}$$
order by order in the coupling constant. For example, to first order one has
$$(X^\dagger X)_1(p_1,p_2)\left(\left(\frac{1}{\sqrt{X^\dagger X}}\right)_0(p_2)\right)^2 + (X^\dagger X)_0(p_1)\,Y_1(p_1,p_2)\left(\frac{1}{\sqrt{X^\dagger X}}\right)_0(p_2) + \left(\sqrt{X^\dagger X}\right)_0(p_1)\,Y_1(p_1,p_2) = 0, \tag{8.25}$$
which solving for $Y_1$ gives
$$Y_1(p_1,p_2) = -\frac{1}{(\omega(p_1) + \omega(p_2))\,\omega(p_1)\,\omega(p_2)}\,(X^\dagger X)_1(p_1,p_2). \tag{8.26}$$
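The structure of Eq. (8.26) can be checked on a finite-dimensional toy model. The sketch below is an illustration only and makes a loud assumption: the momentum-space kernels are replaced by a $2\times2$ real symmetric matrix whose unperturbed part is diagonal with entries $\omega_i^2$, playing the role of $(X^\dagger X)_0$, while a small symmetric perturbation `a1` plays the role of $(X^\dagger X)_1$. The function names are hypothetical.

```python
import math


def inv_sqrt_2x2(m):
    """Inverse square root of a symmetric positive-definite 2x2 matrix.

    Uses sqrt(m) = (m + s*1) / t with s = sqrt(det m), t = sqrt(tr m + 2 s),
    which follows from the Cayley-Hamilton theorem, then inverts explicitly.
    """
    (a, b), (c, d) = m
    s = math.sqrt(a * d - b * c)
    t = math.sqrt(a + d + 2.0 * s)
    ra, rb, rc, rd = (a + s) / t, b / t, c / t, (d + s) / t
    det = ra * rd - rb * rc
    return [[rd / det, -rb / det], [-rc / det, ra / det]]


def y1_formula(omega, a1):
    """First-order correction in the form of Eq. (8.26)."""
    return [[-a1[i][j] / ((omega[i] + omega[j]) * omega[i] * omega[j])
             for j in range(2)] for i in range(2)]


def y1_numeric(omega, a1, g=1e-4):
    """Central finite difference of (A0 + g*A1)^(-1/2) with respect to g."""
    def at(sign):
        m = [[(omega[i] ** 2 if i == j else 0.0) + sign * g * a1[i][j]
              for j in range(2)] for i in range(2)]
        return inv_sqrt_2x2(m)
    plus, minus = at(+1.0), at(-1.0)
    return [[(plus[i][j] - minus[i][j]) / (2.0 * g) for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    omega = [0.8, 1.7]
    a1 = [[0.3, -0.2], [-0.2, 0.5]]
    print(y1_formula(omega, a1))
    print(y1_numeric(omega, a1))
```

The two printed matrices agree up to the finite-difference error, which is of order $g^2$.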
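One can easily compute $(X^\dagger X)_1$ and thus obtain an explicit expression for $Y_1$. With some extra algebra one can get an explicit expression for $Y_2$. The overlap vertices can be finally computed from the relations
$$\left(X\frac{1}{\sqrt{X^\dagger X}}\right)_1 = X_0Y_1 + X_1\left(\frac{1}{\sqrt{X^\dagger X}}\right)_0, \tag{8.27}$$
$$\left(X\frac{1}{\sqrt{X^\dagger X}}\right)_2 = X_0Y_2 + X_1Y_1 + X_2\left(\frac{1}{\sqrt{X^\dagger X}}\right)_0. \tag{8.28}$$
The interaction vertices can be entirely given in terms of the vertices of the QED Wilson action,
$$W_{1\mu}(p_1,p_2) = -g_0\left(i\gamma_\mu\cos\frac{a(p_1+p_2)_\mu}{2} + r\sin\frac{a(p_1+p_2)_\mu}{2}\right),$$
$$W_{2\mu}(p_1,p_2) = -\frac12 ag_0^2\left(-i\gamma_\mu\sin\frac{a(p_1+p_2)_\mu}{2} + r\cos\frac{a(p_1+p_2)_\mu}{2}\right), \tag{8.29}$$
(where $p_1$ and $p_2$ are the quark momenta flowing in and out of the vertices), and of $X_0(p)$. In fact, the quark–quark–gluon vertex in the overlap theory has the expression
$$(V_1^a)^{bc}_\mu(p_1,p_2) = (T^a)^{bc}\cdot\rho\,\frac{1}{\omega(p_1)+\omega(p_2)}\left[W_{1\mu}(p_1,p_2) - \frac{1}{\omega(p_1)\omega(p_2)}X_0(p_2)W^\dagger_{1\mu}(p_1,p_2)X_0(p_1)\right], \tag{8.30}$$
which in the continuum limit is the usual QCD vertex, $-g_0(T^a)^{bc}\,i\gamma_\mu$. The quark–quark–gluon–gluon vertex takes the rather involved form
$$(V_2^{ab})^{cd}_{\mu\nu}(p_1,p_2) = \left(\frac{1}{N_c}\delta^{ab}\delta^{cd} + d^{abe}(T^e)^{cd}\right)\Bigg\{\rho\,\delta_{\mu\nu}\frac{1}{\omega(p_1)+\omega(p_2)}\left[W_{2\mu}(p_1,p_2) - \frac{1}{\omega(p_1)\omega(p_2)}X_0(p_2)W^\dagger_{2\mu}(p_1,p_2)X_0(p_1)\right]$$
$$+\ \frac12\,\rho\,\frac{1}{\omega(p_1)+\omega(p_2)}\,\frac{1}{\omega(p_1)+\omega(k)}\,\frac{1}{\omega(k)+\omega(p_2)}\Bigg[X_0(p_2)W^\dagger_{1\mu}(p_2,k)W_{1\nu}(k,p_1) + W_{1\mu}(p_2,k)X_0^\dagger(k)W_{1\nu}(k,p_1) + W_{1\mu}(p_2,k)W^\dagger_{1\nu}(k,p_1)X_0(p_1)$$
$$-\ \frac{\omega(p_1)+\omega(k)+\omega(p_2)}{\omega(p_1)\,\omega(k)\,\omega(p_2)}\,X_0(p_2)W^\dagger_{1\mu}(p_2,k)X_0(k)W^\dagger_{1\nu}(k,p_1)X_0(p_1)\Bigg]\Bigg\}. \tag{8.31}$$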
As expected, this vertex vanishes in the formal continuum limit. The calculation of the vertices with more gluons becomes increasingly complicated, but in principle it can be carried through. Perhaps in this case it is more convenient to use an alternative method
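to derive the Feynman rules, which involves less algebra but more integrals. It makes use of the integral representation
$$\frac{1}{\sqrt{X^\dagger X}} = \int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X^\dagger X}. \tag{8.32}$$
The expansion of the inverse square root to second order in this representation is given by
$$\frac{1}{\sqrt{X^\dagger X}} = \int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0} - \int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,(X_0^\dagger X_1 + X_1^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$-\int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,(X_0^\dagger X_2 + X_1^\dagger X_1 + X_2^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$+\int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,(X_0^\dagger X_1 + X_1^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}\,(X_0^\dagger X_1 + X_1^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}, \tag{8.33}$$
so that, using the fact that $1/(t^2 + X_0^\dagger X_0)$ commutes with $X_0$, we have that
$$aD = aD_0 + \int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,(t^2X_1 - X_0X_1^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$+\int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,(t^2X_2 - X_0X_2^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$-\int_{-\infty}^{\infty}\frac{dt}{\pi}\,t^2\,\frac{1}{t^2 + X_0^\dagger X_0}\,X_1\,\frac{1}{t^2 + X_0^\dagger X_0}\,(X_0^\dagger X_1 + X_1^\dagger X_0)\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$-\int_{-\infty}^{\infty}\frac{dt}{\pi}\,t^2\,\frac{1}{t^2 + X_0^\dagger X_0}\,X_0X_1^\dagger\,\frac{1}{t^2 + X_0^\dagger X_0}\,X_1\,\frac{1}{t^2 + X_0^\dagger X_0}$$
$$+\int_{-\infty}^{\infty}\frac{dt}{\pi}\,\frac{1}{t^2 + X_0^\dagger X_0}\,X_0X_1^\dagger\,\frac{1}{t^2 + X_0^\dagger X_0}\,X_0X_1^\dagger X_0\,\frac{1}{t^2 + X_0^\dagger X_0} + \cdots. \tag{8.34}$$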
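The integral representation (8.32) is easy to check for a single positive eigenvalue $w^2$ of $X^\dagger X$. The sketch below is only an illustration: it assumes the substitution $t = \tan\theta$, which maps the integral onto a smooth periodic integrand, and uses the midpoint rule. The function name is hypothetical.

```python
import math


def inv_sqrt_via_integral(w2, n=400):
    """Approximate (1/pi) * integral of dt / (t^2 + w2) over the real line.

    With t = tan(theta), dt / (t^2 + w2) becomes
    d(theta) / (sin(theta)^2 + w2 * cos(theta)^2) on (-pi/2, pi/2).
    The integrand is smooth and periodic, so the midpoint rule converges fast.
    """
    if w2 <= 0.0:
        raise ValueError("w2 must be positive")
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -math.pi / 2.0 + (j + 0.5) * h
        total += 1.0 / (math.sin(theta) ** 2 + w2 * math.cos(theta) ** 2)
    return total * h / math.pi


if __name__ == "__main__":
    for w2 in (0.49, 4.0):
        print(w2, inv_sqrt_via_integral(w2), 1.0 / math.sqrt(w2))
```

For each value of $w^2$ the numerical integral reproduces $1/\sqrt{w^2}$ to high accuracy.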
The $t$ integration can now be performed using the residue theorem, and the various vertices can be obtained in a more systematic way. For more details on 1-loop perturbative calculations with overlap fermions the works of Chiu et al. (1998), Ishibashi et al. (2000) and Yamada (1998) are useful. Many other technical details can also be found in Fujikawa and Ishibashi (2002), although they refer to certain generalizations of the Ginsparg–Wilson relation, and in Chiu and Hsieh (2002). Many 1-loop calculations with overlap fermions have been completed by now, including the relation between the $\Lambda$ parameter in the lattice scheme defined by the overlap operator and in the $\overline{\rm MS}$ scheme (Alexandrou et al., 2000b), and the renormalization factors of the quark bilinears $\bar\psi\Gamma\psi$ (Alexandrou et al., 2000a), of the lowest moments of all structure functions (Capitani, 2001a, b, 2002a), and of the $\Delta S = 2$ and $\Delta S = 1$ effective weak Hamiltonians, which are important for the calculation of $\Delta I = 1/2$ amplitudes (see Section 14.2) and of the parameter $\epsilon'/\epsilon$ on the lattice (Capitani and Giusti, 2000, 2001).
Fig. 10. A background field for the domain wall. [Plot of Φ(s) against s, with the levels M and −M marked.]
8.3. Domain wall fermions

Another solution of the Ginsparg–Wilson relation is given by the so-called domain wall fermions. The construction of domain wall fermions is not peculiar to the lattice, and we begin by discussing a simple continuum case. The main point, which traces back to ideas from Callan and Harvey (1985), is to work in a five-dimensional spacetime where in the fifth dimension, denoted by $s$, there is a domain wall separating the region $s > 0$ from the region $s < 0$. This domain wall can be described by a background field $\Phi(s)$ with a behavior like in Fig. 10. An example is $\Phi(s) = M\tanh(Ms)$, but the precise form is not important. The essential thing is that it behaves like a step function with a height of order $M$ and a width of order $1/M$. The five-dimensional free Dirac operator in this theory is built by adding to the usual four-dimensional piece a derivative term in the fifth dimension proportional to $\gamma_5$, and the background field:
$$D_5 = \gamma_\mu\partial_\mu + \gamma_5\partial_s - \Phi(s). \tag{8.35}$$
The eigenvectors of $D_5$ can be written in the form
$$\chi(x,s) = e^{ipx}\,u(s), \tag{8.36}$$
where the four-dimensional plane waves are solutions of the usual four-dimensional Dirac equation, while $u(s)$ satisfies
$$(\gamma_5\partial_s - \Phi(s))\,u(s) = -i\gamma_\mu p_\mu\,u(s). \tag{8.37}$$
All the solutions $\chi(x,s)$ have a definite chirality and describe fermions which have a mass of order $M$ (the only available scale), except a mode which is massless and which satisfies the equations
$$(\gamma_5\partial_s - \Phi(s))\,u(s) = 0, \qquad \gamma_\mu p_\mu\,u(s) = 0. \tag{8.38}$$
The massless chiral solutions are then given by
$$u(s) = \exp\left(\pm\int_0^s dt\,\Phi(t)\right)v, \qquad \gamma_\mu p_\mu\,v = 0, \qquad P_\pm v = v. \tag{8.39}$$
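As a concrete instance of Eq. (8.39), one can take the example profile $\Phi(s) = M\tanh(Ms)$ mentioned above. The following short worked computation is added for illustration; the labels $u_\pm$ and $v_\pm$ are introduced here only to distinguish the two chiralities.

```latex
% Worked example with the profile \Phi(s) = M \tanh(M s).
% u_\pm and v_\pm (with P_\pm v_\pm = v_\pm) are labels used only in this example.
\begin{align*}
\int_0^s dt\, M\tanh(Mt) &= \ln\cosh(Ms), \\
u_\pm(s) &= \bigl[\cosh(Ms)\bigr]^{\pm 1}\, v_\pm , \\
\int_{-\infty}^{\infty} ds\, \bigl[\cosh(Ms)\bigr]^{-2} &= \frac{2}{M} .
\end{align*}
```

The solution $u_+$ grows like $e^{M|s|}$, while $u_-$ decays like $e^{-M|s|}$ and has a finite norm, with a width of order $1/M$ around the wall.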
From Eq. (8.39) we see that the solution with positive chirality is nonnormalizable (it diverges as $|s| \to \infty$), while the interesting result is the normalizable solution with negative chirality. This is the sought-for massless chiral mode, whose wave function falls off exponentially in the fifth dimension. This mode is thus confined near the domain wall $s = 0$, and at energies which are far below $M$ this is the only particle left in the theory, which has then effectively become, after dimensional reduction, a chiral theory in 4 dimensions.$^{28}$ This chiral mode is present also when the theory is formulated on a lattice (Kaplan, 1992, 1993). The scale $M$ is now meant to be of order $1/a$, so that for any finite lattice spacing only the massless chiral mode can propagate, while the massive particles decouple from the theory. As already noted, the precise form of the background field is not important. Let us consider the half space $s > 0$, in which we can take $\Phi(s) = M$. The Dirac operator is now just
$$D_5 = D_4 + \gamma_5\partial_s - M, \tag{8.40}$$
with $D_4 = \gamma_\mu\partial_\mu$. Imposing the boundary condition $P_+\chi(x,s)|_{s=0} = 0$, it can be shown that the dimensional reduction of the five-dimensional propagator on the domain wall is given by
$$G(x,s;y,t)|_{s=t=0} = 2M\,P_-\,S(x,y)\,P_+. \tag{8.41}$$
In this equation $S(x,y)$ is the inverse of the four-dimensional operator
$$D = M + \frac{D_4 - M}{\sqrt{1 - (D_4/M)^2}}, \tag{8.42}$$
and $D$ satisfies the relation
$$\gamma_5 D + D\gamma_5 = \frac1M\,D\gamma_5 D. \tag{8.43}$$
If we make at this point the identification $M = \rho/a$, Eqs. (8.42) and (8.43) represent the overlap solution (8.16) (with $D_4 = D_W$) of the Ginsparg–Wilson relation (8.1). In this sense, overlap and domain wall fermions are equivalent, and one can consider overlap fermions as domain wall fermions seen from the four-dimensional world. The lattice formulation of domain wall fermions that is commonly used in numerical simulations was proposed in Shamir (1993) and Furman and Shamir (1995). For a review see also Jansen (1996). The five-dimensional action is constructed starting from the four-dimensional Wilson action (to avoid doublers), and is given by$^{29}$
$$S_{DW} = \sum_x\sum_{s=1}^{N_s}\Bigg[\frac12\sum_\mu\Big(\bar\psi_s(x)(-r + \gamma_\mu)U_\mu(x)\psi_s(x+\hat\mu) + \bar\psi_s(x)(-r - \gamma_\mu)U^\dagger_\mu(x-\hat\mu)\psi_s(x-\hat\mu)\Big)$$
$$+\ \frac12\Big(\bar\psi_s(x)(1+\gamma_5)\psi_{s+1}(x) + \bar\psi_s(x)(1-\gamma_5)\psi_{s-1}(x)\Big) + (\rho - 1 + 4r)\,\bar\psi_s(x)\psi_s(x)\Bigg] + m\sum_x\Big(\bar\psi_{N_s}(x)P_+\psi_1(x) + \bar\psi_1(x)P_-\psi_{N_s}(x)\Big). \tag{8.44}$$

$^{28}$ For more details concerning this derivation see also Lüscher (2001).

$^{29}$ Here and in the rest of this Section we put $a = 1$.
There is a parameter $\rho$, which corresponds to the one already seen in the overlap case, and which must take values between zero and two (at tree level). The Wilson parameter is set to $r = -1$, so the sign of the Wilson term is different. This domain wall action can be thought of as a Wilson action with an infinite number of flavors (labeled by the index $s$) and a special mass matrix (Eqs. (8.49) and (8.50) below). The infinite number of flavors corresponds to the infinite tower of fermions for each lattice site of the original overlap formulation. The pure gauge part is the usual Wilson plaquette action. The gauge fields are 4-dimensional, and therefore there is no gauge interaction along the fifth dimension. The measure term, the gauge-fixing term and the Faddeev–Popov term are also the same as in the Wilson case. This means that the gluon propagator and the quark–gluon vertices are the same as in the Wilson action. The quark propagator is instead different, and much more complicated. In practical terms this lattice action can be looked at as describing a theory of $N_s$ fermion flavors which have a complicated propagator. As the reader will have noticed, the range of $s$ in Eq. (8.44) is finite, $1 \le s \le N_s$, because simulations can only be made with a finite extent in the fifth dimension. All modes have a mass of order $\rho/a$ except two modes which are nearly massless (more precisely, their masses are exponentially small in $N_s$) and which are localized near the boundaries $s = 1$ and $s = N_s$. The two modes have opposite chirality. As long as $N_s$ is finite there is a small residual interaction (exponentially small in $N_s$) between them.$^{30}$ It is only in the limit in which the number of points in the fifth dimension goes to infinity that the chiral mode at $s = 1$ becomes massless and fully decouples from the other massless mode, yielding an exact chiral theory.
In the perturbative calculations made so far (Aoki and Hirose, 1996; Aoki and Taniguchi, 1999a, b; Aoki et al., 1999a, b; Aoki and Kuramashi, 2001; Aoki et al., 2002) it has been assumed that one can take the limit $N_s \to \infty$ before performing the Feynman integrals. The quark propagator which one uses in perturbation theory is then the one which is obtained in the limit $N_s = \infty$. The domain wall Dirac operator in momentum space and from $s$ to $t$ in the fifth dimension is
$$D_{st}(p) = \delta_{s,t}\sum_\mu i\gamma_\mu\sin p_\mu + (W^+_{st}(p) + mM^+_{st})P_+ + (W^-_{st}(p) + mM^-_{st})P_- \tag{8.45}$$
with
$$W^+_{st}(p) = -W(p)\,\delta_{s,t} + \delta_{s+1,t}, \tag{8.46}$$
$$W^-_{st}(p) = -W(p)\,\delta_{s,t} + \delta_{s-1,t}, \tag{8.47}$$
$$W(p) = 1 - \rho - 2r\sum_\mu\sin^2\frac{p_\mu}{2}, \tag{8.48}$$
and the mass matrix is
$$M^+_{st} = \delta_{s,N_s}\,\delta_{t,1}, \tag{8.49}$$
$$M^-_{st} = \delta_{s,1}\,\delta_{t,N_s}. \tag{8.50}$$

$^{30}$ The overlap between the chiral modes living on the two walls depends also on the strength of the gauge coupling, and for strong couplings these chiral modes tend to acquire some nonnegligible overlap.
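By inverting $D_{st}(p)$ one gets the quark propagator
$$S_{st}(p) = \langle\psi_s(-p)\bar\psi_t(p)\rangle = \Big(-i\gamma_\mu\sin p_\mu\,\delta_{s,u} + W^-_{su}(p) + mM^-_{su}\Big)G^R_{ut}(p)\,P_+ + \Big(-i\gamma_\mu\sin p_\mu\,\delta_{s,u} + W^+_{su}(p) + mM^+_{su}\Big)G^L_{ut}(p)\,P_-, \tag{8.51}$$
where
$$G^R_{st}(p) = \frac{A}{F}\Big[-(1-m^2)(1 - We^{-\alpha})\,e^{\alpha(-2N_s+s+t)} - (1-m^2)(1 - We^{\alpha})\,e^{-\alpha(s+t)} - 2Wm\big(e^{\alpha(-N_s+s-t)} + e^{\alpha(-N_s-s+t)}\big)\sinh\alpha\Big] + Ae^{-\alpha|s-t|}, \tag{8.52}$$
$$G^L_{st}(p) = \frac{A}{F}\Big[-(1-m^2)(1 - We^{\alpha})\,e^{\alpha(-2N_s+s+t-2)} - (1-m^2)(1 - We^{-\alpha})\,e^{-\alpha(s+t-2)} - 2Wm\big(e^{\alpha(-N_s+s-t)} + e^{\alpha(-N_s-s+t)}\big)\sinh\alpha\Big] + Ae^{-\alpha|s-t|}, \tag{8.53}$$
$$\cosh\alpha = \frac{1 + W^2 + \sum_\mu\sin^2 p_\mu}{2|W|}, \tag{8.54}$$
$$A = \frac{1}{2W\sinh\alpha}, \tag{8.55}$$
$$F = 1 - e^{\alpha}W - m^2(1 - We^{-\alpha}). \tag{8.56}$$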
These formulae are valid only for positive $W$. For $1 < \rho \le 2$ and small momenta $W$ can be negative. In this case the propagator is given by the above equations with the replacements
$$W \to -|W|, \tag{8.57}$$
$$e^{\pm\alpha} \to -e^{\pm\alpha}, \tag{8.58}$$
which also imply $\sinh\alpha \to -\sinh\alpha$. The massless fermion field is given at tree level by the combination
$$\chi_0 = \sqrt{1 - w_0^2}\,\sum_s\left(P_+w_0^{s-1}\psi_s + P_-w_0^{N_s-s}\psi_s\right), \tag{8.59}$$
with $w_0 = 1 - \rho$. We can see that the damping factors $w_0^{s-1}$ and $w_0^{N_s-s}$ confine the two chiralities on the two different domain walls. Since this factor is renormalized by the interactions due to the additive renormalization of $\rho$ (it is like a mass term not protected by chiral symmetry),$^{31}$ it is sufficient for our purposes to work instead with the "physical" field
$$q(x) = P_+\psi_1(x) + P_-\psi_{N_s}(x), \tag{8.60}$$
$$\bar q(x) = \bar\psi_{N_s}(x)P_+ + \bar\psi_1(x)P_-, \tag{8.61}$$

$^{31}$ The renormalizations of $w_0$ and of the wave function turn out to be quite large. The former in particular is of order $O(10^2)$. This is claimed to be cured by tadpole improvement. The wave function at 1 loop has also been computed in Shamir (2000).
Fig. 11. A typical correction to the physical quark propagator showing the various fermionic fields which form the various propagators.
whose renormalization is simpler. The corresponding propagator is given by
$$S_q(p) = \langle q(-p)\bar q(p)\rangle = \frac{-i\gamma_\mu\sin p_\mu + (1 - We^{-\alpha})\,m}{-(1 - e^{\alpha}W) + m^2(1 - We^{-\alpha})}. \tag{8.62}$$
While at 1 loop the propagator $\langle\psi(-p)\bar\psi(p)\rangle$ gets an additive mass correction, to the same order the physical propagator is protected from such renormalizations, as it must be thanks to chiral symmetry. Composite operators in the theory are then constructed using the physical field. The bilinears are for example given by
$$O(x) = \bar q(x)\,\Gamma\,q(x). \tag{8.63}$$
One sees that, when computing Feynman diagrams in which physical fields are present as external states, mixed propagators are also needed. They are given by
$$\langle q(-p)\bar\psi_s(p)\rangle = \frac1F\Big(i\gamma_\mu\sin p_\mu - m(1 - We^{-\alpha})\Big)\Big(e^{-\alpha(N_s-s)}P_+ + e^{-\alpha(s-1)}P_-\Big) + \frac1F\Big[m\big(i\gamma_\mu\sin p_\mu - m(1 - We^{-\alpha})\big) - F\Big]e^{-\alpha}\Big(e^{-\alpha(s-1)}P_+ + e^{-\alpha(N_s-s)}P_-\Big), \tag{8.64}$$
$$\langle\psi_s(-p)\bar q(p)\rangle = \frac1F\Big(e^{-\alpha(N_s-s)}P_- + e^{-\alpha(s-1)}P_+\Big)\Big(i\gamma_\mu\sin p_\mu - m(1 - We^{-\alpha})\Big) + \frac1F\Big(e^{-\alpha(s-1)}P_- + e^{-\alpha(N_s-s)}P_+\Big)e^{-\alpha}\Big[m\big(i\gamma_\mu\sin p_\mu - m(1 - We^{-\alpha})\big) - F\Big]. \tag{8.65}$$
A typical situation in which these propagators are required is depicted in Fig. 11, a self-energy diagram. It is interesting to note that logarithmic divergences, for example in the self-energy, are localized at the boundaries s = 1 and Ns , as they arise only in the massless limit. Given the expressions of the quark propagators, perturbative calculations with domain wall fermions tend to be rather cumbersome, even in the limit Ns = ∞. Many quantities have been calculated so far, using gluon propagators in the Feynman gauge, but no operators with covariant derivatives have been considered, like the ones entering the moments of structure functions.
8.4. Fixed-point fermions

Historically the fixed-point action, developed in the context of perfect actions (Hasenfratz and Niedermayer, 1994; Wiese, 1993; Bietenholz and Wiese, 1994; DeGrand et al., 1995, 1996, 1997), was the actual chiral battleground where the Ginsparg–Wilson relation was rediscovered (Hasenfratz, 1998a). The key idea here goes under the name of classically perfect actions, that is, actions whose tree-level predictions (and in particular the properties of the chiral modes) agree with what is expected to happen in the continuum. This is true even for finite lattice spacing.$^{32}$ The construction of perfect actions employs ideas which strongly resemble Wilson's renormalization group in statistical mechanics (Wilson and Kogut, 1974; Wilson, 1975). A renormalization group step in this context, also called a block transformation, consists in doubling the lattice spacing, so that in lattice units the correlation length is halved after each step. In this way the short-scale fluctuations in the functional integral can be smoothed out and eliminated step by step. The fixed points of these renormalization group transformations, which are reached after an infinite number of iterations,
$$A \to A' \to A'' \to A''' \to \cdots \to A^{FP}, \tag{8.66}$$
are the *xed-point actions. Their properties can be investigated using classical equations. A block variable B(xB ) of the lattice with double lattice spacing can be constructed from the original “*ne” variable (x) by setting B(xB ) = b !(2xB − x)(x); !(2xB − x) = 1 ; (8.67) x
x
where $\omega$ is the averaging function for the blocking. One integrates out the original variables keeping the new block averages fixed. Blocking is then equivalent to moving to a coarse lattice. The action that one obtains after the blocking transformations will in general contain many kinds of interactions, even when one starts with a very simple action. The fixed-point action at which one eventually arrives contains an infinite number of types of interactions, with a corresponding infinite number of couplings. The action at a fixed point is by definition invariant under further renormalization group transformations. There the correlation length is infinite. This means that in asymptotically free theories, and in particular in QCD, the fixed-point action can only be reached for $g_0 \to 0$, that is $\beta \to \infty$. Strictly speaking, simulations of a fixed-point action $A^{FP}$ could only be done at $\beta = \infty$. However, in the vicinity of the fixed point one can consider a linear approximation to the real behavior of the renormalization group flow (see Fig. 12). In practice one uses an action which has the same form as the fixed-point action but is taken at a finite and large $\beta$, so that it will be quite close to the true renormalization group trajectory. Lattice artifacts will then be quite small. In this way one has linearized the fixed-point transformation, and in the only relevant direction this action $A^{FP}$ is classically perfect. [33] Choosing a good blocking strategy using appropriate block transformations is something of an art. A bad choice can lead to a fixed-point action with bad locality properties or even to no fixed point at
[32] For a nice pedagogical review of these ideas and techniques, see Hasenfratz (1998b); a short review can be found in Hasenfratz (1998a).
[33] However, it is not quantum perfect, as 1-loop calculations have shown (Hasenfratz and Niedermayer, 1997).
S. Capitani / Physics Reports 382 (2003) 113 – 302
169
Fig. 12. The renormalization group flow and the classically perfect fixed-point trajectory. The directions $K_1, K_2, \ldots$ represent the parameter space. [Figure labels: axes $1/\beta$, $K_1$, $K_2, \ldots$; renormalization group transformation; fixed point; fixed-point action.]
all. A good choice can instead lead to a good parameterization of the fixed-point action which is convenient for simulations. The task is then to find a good blocking transformation which leads to a fixed-point action that can be well approximated by a small number of terms that are not too difficult to simulate. Let us now consider the Wilson action

$$ \beta A_g(U) + A_f(\bar\psi,\psi,U) = \beta \sum_P \Big(1 - \frac{1}{N_c}\,\mathrm{Re\,Tr}\,U_P\Big) + \frac{1}{2}\sum_{x,\mu}\Big(\bar\psi(x)(\gamma_\mu - 1)U_\mu(x)\psi(x+\hat\mu) - \bar\psi(x+\hat\mu)(\gamma_\mu + 1)U_\mu^\dagger(x)\psi(x)\Big) + \sum_x \bar\psi(x)(m+4)\psi(x). \tag{8.68} $$
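As a quick illustration of the fermionic part of Eq. (8.68) in the free case, the following minimal sketch assumes $U = 1$, lattice spacing $a = 1$ and a plane-wave Fourier transform, for which the Wilson operator becomes $m + \sum_\mu (1 - \cos p_\mu) + i \sum_\mu \gamma_\mu \sin p_\mu$ in $d = 4$. The function names are hypothetical and chosen only for this example. The code evaluates the momentum-dependent mass term at the corners of the Brillouin zone, where $\sin p_\mu$ vanishes.

```python
import itertools
import math


def wilson_mass_term(p, m=0.0):
    """Mass-like term m + sum_mu (1 - cos p_mu) of the free Wilson operator.

    Assumes U = 1, Wilson parameter r = 1 and lattice units (a = 1).
    """
    return m + sum(1.0 - math.cos(p_mu) for p_mu in p)


def corner_masses(d=4, m=0.0):
    """Evaluate the mass term at the 2**d Brillouin-zone corners (p_mu in {0, pi})."""
    result = {}
    for corner in itertools.product((0, 1), repeat=d):
        p = [math.pi * n for n in corner]
        result[corner] = wilson_mass_term(p, m)
    return result


if __name__ == "__main__":
    # Sort the corners by the size of the mass term and list them.
    for corner, mass in sorted(corner_masses().items(), key=lambda kv: kv[1]):
        print(corner, round(mass, 12))
```

In this sketch only the corner at the origin keeps a vanishing mass term for $m = 0$; the other fifteen corners acquire a term equal to twice the number of components sitting at $\pi$, which is the usual way in which the Wilson action lifts the doublers.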
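The renormalization group transformed action is given, after one step, by

$$ e^{-(\beta' A'_g(V) + A'_f(\bar\psi_B,\psi_B,V))} = \int D\bar\psi\, D\psi\, DU\; e^{-(\beta (A_g(U) + T_g(V,U)) + A_f(\bar\psi,\psi,U) + T_f(\bar\psi_B,\psi_B,\bar\psi,\psi,U))}, \tag{8.69} $$

where the blocking kernels are

$$ T_g(V,U) = \sum_{x_B,\mu} \Big( -\frac{\kappa_g}{N_c}\,\mathrm{Re\,Tr}\big(V_\mu(x_B)\,Q_\mu^\dagger(x_B)\big) + N(Q_\mu(x_B)) \Big) \tag{8.70} $$

for the pure gauge part, and

$$ T_f(\bar\psi_B,\psi_B,\bar\psi,\psi,U) = \kappa_f \sum_{x_B} \Big( \bar\psi_B(x_B) - b_f \sum_x \bar\psi(x)\,\omega^\dagger_{x,x_B}(U) \Big) \Big( \psi_B(x_B) - b_f \sum_{x'} \omega_{x_B,x'}(U)\,\psi(x') \Big) \tag{8.71} $$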
for the fermion part. $Q_\mu(x_B)$ is an average of the fine links near $x_B$, while $\omega$ is an averaging function for the quark fields. $N$ is a function of the $Q$'s needed to normalize the block variables. The details of these functions are complicated and will not be of interest to us here. A simple choice is to take $\omega(2x_B - x) = 2^{-d}$ for the points $x$ in the hypercube containing $x_B$. Note that for finite $\kappa_g$, $\kappa_f$ the average of the fine variables in a block is allowed to fluctuate around the block variables. When $\kappa_g, \kappa_f \to \infty$ the blocking kernels become $\delta$-functions. Thus, the parameters $\kappa_g$ and $\kappa_f$ determine the stiffness of the averaging. In QCD the fixed point lies at $\beta = \infty$. In this limit one can perform the computations using a saddle-point approximation. Hence for the pure gauge part we have

$$ A'_g(V) = \min_{\{U\}}\, [A_g(U) + T_g^\infty(V,U)], \tag{8.72} $$

where $T_g^\infty$ is the $\beta = \infty$ limit of the blocking kernel, and the fixed-point action satisfies

$$ A_g^{FP}(V) = \min_{\{U\}}\, [A_g^{FP}(U) + T_g^\infty(V,U)]. \tag{8.73} $$
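The structure of the saddle-point relations (8.72) and (8.73) can be illustrated with a deliberately simple toy model, which is not the gauge theory itself. The sketch below assumes a single variable with a quadratic action $A(u) = r u^2/2$ and a quadratic kernel $T(v,u) = \kappa (v - b u)^2/2$; the names `blocked_coupling`, `minimized_action` and `iterate_to_fixed_point`, as well as the parameters $r$, $\kappa$ and $b$, are hypothetical and introduced only for illustration. Minimizing over $u$ gives again a quadratic action with coupling $r'$ obeying $1/r' = 1/\kappa + b^2/r$, and for $b^2 < 1$ the iteration converges to $r^* = \kappa (1 - b^2)$.

```python
def blocked_coupling(r, kappa, b):
    """One blocking step: coupling r' of min_u [r*u**2/2 + kappa*(v - b*u)**2/2] = r'*v**2/2."""
    return kappa * r / (r + kappa * b * b)


def minimized_action(v, r, kappa, b):
    """Value of min_u [A(u) + T(v, u)], evaluated at the explicit minimizer u_min."""
    u_min = kappa * b * v / (r + kappa * b * b)
    return 0.5 * r * u_min ** 2 + 0.5 * kappa * (v - b * u_min) ** 2


def iterate_to_fixed_point(r0, kappa, b, steps=200):
    """Repeat the blocking step; for b**2 < 1 this approaches kappa * (1 - b**2)."""
    r = r0
    for _ in range(steps):
        r = blocked_coupling(r, kappa, b)
    return r


if __name__ == "__main__":
    kappa, b = 2.0, 0.5
    print(iterate_to_fixed_point(10.0, kappa, b), kappa * (1.0 - b * b))
```

In this toy setting the fixed point is independent of the starting coupling, and a finite $\kappa$ is what keeps the fixed-point coupling finite, in loose analogy with the role of the stiffness parameters in the blocking kernels.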
Substituting the $U_{\min}$ that minimizes Eq. (8.72) into the fermion part one obtains a recursion relation, and finally one arrives at the equation that determines the fixed-point fermion action, namely

$$ h^{FP}_{x_B,x'_B}(V)^{-1} = \frac{1}{\kappa_f}\,\delta_{x_B,x'_B} + b_f^2 \sum_{x,x'} \omega_{x_B,x}(U_{\min})\; h^{FP}_{x,x'}(U_{\min})^{-1}\; \omega^\dagger_{x',x'_B}(U_{\min}), \tag{8.74} $$

where

$$ A_f(\bar\psi,\psi,U) = \sum_{x,x'} \bar\psi(x)\,h_{x,x'}(U)\,\psi(x'). \tag{8.75} $$

So far no approximation has been made. In fact, Eq. (8.69) becomes

$$ e^{-A'_f(\bar\psi_B,\psi_B,V)} = \int D\bar\psi\, D\psi\; e^{-(A_f(\bar\psi,\psi,U_{\min}) + T_f(\bar\psi_B,\psi_B,\bar\psi,\psi,U_{\min}))}, \tag{8.76} $$
and since the blocking kernels in the saddle-point approximation are quadratic and the action (8.75) is also quadratic in the fermion fields, the renormalization group transformations can be computed exactly, because all integrals are Gaussian. Eq. (8.74) is then an exact formula, since Gaussian integrals are equivalent to minimization. Its solutions, $h^{FP}$, are in general obtained using some approximations and truncations, but in the case of the free theory the computation can be carried out exactly. Starting from massless Wilson fermions and using $\omega(2x_B - x) = 2^{-d}$ and $U_{\min} = 1$ the fixed-point fermion propagator takes the form

$$ h^{-1}(q) = \sum_{l \in Z^d} \frac{\gamma_\mu (q_\mu + 2\pi l_\mu)}{(q + 2\pi l)^2}\, \prod_\nu \frac{\sin^2 (q_\nu/2)}{(q_\nu/2 + \pi l_\nu)^2} + \frac{2}{\kappa_f}. \tag{8.77} $$

This is an analytic function and corresponds to a local action which has no doublers. It is then not surprising to learn that this action breaks chiral symmetry. This breaking is due to the term $2/\kappa_f$, which comes entirely from the fermionic blocking kernel $T_f$. The remarkable point is that this fixed-point propagator satisfies

$$ \{h^{-1}, \gamma_5\} = \frac{4}{\kappa_f}\,\gamma_5, \tag{8.78} $$
or equivalently the fixed-point action obeys

$$ \{h, \gamma_5\} = \frac{4}{\kappa_f}\, h \gamma_5 h, \tag{8.79} $$
which we can recognize as a form of the Ginsparg–Wilson relation. Thus this action, although it naively breaks chiral symmetry, retains a remnant of it, namely Lüscher's symmetry, Eq. (8.2). The latter is a good chiral symmetry for any finite value of the lattice spacing. It follows that the pion mass is zero when the bare quark mass in the action is zero, the index theorem is satisfied and the theory has the correct global anomalies. We would like to point out that in the limit $\kappa_f = \infty$ this action becomes chirally symmetric in the naive sense, $\{h, \gamma_5\} = 0$, but then it also becomes nonlocal, in agreement with the Nielsen–Ninomiya theorem. We have thus learned that the fixed-point action satisfies the Ginsparg–Wilson relation. As we recalled at the beginning, it was actually in the study of the solution in Eq. (8.77) that the Ginsparg–Wilson relation was rediscovered in all its strength after 15 years of neglect. In Monte Carlo computations a truncation of the fixed-point action is necessary, since it is not possible to simulate more than the first few terms. It is hoped, however, that the truncated action introduces only small errors with respect to the results that would be obtained with the true fixed-point action. Perturbative calculations must also be carried out using a truncated fixed-point action, and they are even more cumbersome than usual. Even when the truncated action contains just a few terms, these quickly become quite complex, because they contain higher and higher powers of the fundamental fields and of their derivatives. Propagators and vertices are obtained after summing the contributions of all these terms, leading in general to rather complicated expressions. The coefficients of the various terms can only be determined numerically, and this is another limitation on the accuracy of fixed-point results.
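The Ginsparg–Wilson property of the free fixed-point propagator, Eqs. (8.77)–(8.79), can be checked numerically in a small example. The following sketch makes several assumptions that are not taken from the text: it works in $d = 2$, uses the Pauli matrices $\sigma_1, \sigma_2$ as $\gamma_\mu$ and $\sigma_3$ as $\gamma_5$, truncates the lattice sum to $|l_\mu| \le$ `lmax`, and picks arbitrary values for the momentum and for $\kappa_f$. All function names are hypothetical. Since the check only relies on $\gamma_\mu$ anticommuting with $\gamma_5$, it is insensitive to the truncation and to the phase convention of the $\gamma$ matrices.

```python
import math

# 2x2 matrices stored as nested lists; sigma_3 plays the role of gamma_5 in d = 2.
I2 = [[1, 0], [0, 1]]
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
GAMMA5 = [[1, 0], [0, -1]]
GAMMAS = (SIGMA1, SIGMA2)


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def anticommutator(a, b):
    return add(mul(a, b), mul(b, a))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def fp_propagator(q, kappa_f, lmax=15):
    """Truncated version of the lattice sum in Eq. (8.77) for d = 2 (q must be nonzero)."""
    total = scale(2.0 / kappa_f, I2)
    for l1 in range(-lmax, lmax + 1):
        for l2 in range(-lmax, lmax + 1):
            l = (l1, l2)
            p = [q[mu] + 2 * math.pi * l[mu] for mu in range(2)]
            p2 = p[0] ** 2 + p[1] ** 2
            weight = 1.0
            for nu in range(2):
                weight *= math.sin(q[nu] / 2) ** 2 / (q[nu] / 2 + math.pi * l[nu]) ** 2
            for mu in range(2):
                total = add(total, scale(weight * p[mu] / p2, GAMMAS[mu]))
    return total


if __name__ == "__main__":
    kappa_f = 2.0
    h_inv = fp_propagator((0.7, -1.3), kappa_f)
    h = inverse(h_inv)
    # Residuals of Eqs. (8.78) and (8.79); both should be at rounding level.
    print(max_abs_diff(anticommutator(h_inv, GAMMA5), scale(4 / kappa_f, GAMMA5)))
    print(max_abs_diff(anticommutator(h, GAMMA5), scale(4 / kappa_f, mul(mul(h, GAMMA5), h))))
```

The check makes explicit that the whole chiral symmetry breaking of the propagator sits in the constant term $2/\kappa_f$, while the momentum-dependent part anticommutes with $\gamma_5$.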
When one wants to make simulations using a truncated action that has a small residual symmetry breaking and is not too far from the true fixed-point action, a relatively large number of terms is needed, and in this case perturbation theory looks algebraically quite demanding. In fact not many perturbative calculations with fixed-point actions have appeared in the literature. Apart from Hasenfratz and Niedermayer (1997), other examples can be found in Bietenholz and Struckmann (1999), where a perfect action for the anharmonic oscillator was perturbatively constructed, and Farchioni et al. (1995), where the mass gap of the nonlinear $\sigma$-model was computed. However, no renormalization of operators has ever been attempted. Finally, we would like to point out that, although perturbation theory is more complicated here than for overlap and domain wall fermions, the fixed-point action possesses certain advantages over them. The fixed-point action is in fact classically perfect, while overlap and domain wall actions are tailored only to protect chiral symmetry, and not other classical properties, from quantum corrections. Moreover, the strength of the interaction decays more slowly for overlap and domain wall fermions. In this sense the fixed-point action is more local. Of course, the overlap action can be given explicitly in a simple form, while to obtain any reasonable approximation of the fixed-point action one needs to solve complicated equations. At the end of the day domain wall and overlap fermions are equivalent formulations, while fixed-point fermions can be considered as something different because they possess certain good "continuum" properties that the other two do not have.
For recent developments on the subject of fixed-point actions, see Hasenfratz et al. (2002a, b). Many technical details can also be found in Jörg (2002).

8.5. Concluding remarks

Among the various formulations of Ginsparg–Wilson fermions, we have discussed in some detail the perturbative expansion of the overlap solution and also given an elementary introduction to perturbative calculations for domain wall fermions, together with a brief, not too technical discussion of the perfect action solution. The practical aspects of the implementation of these various fermions, and the kind of computations that can be made with them, are quite different. In particular, the amount of numerical chiral symmetry breaking occurring in actual simulations seems to be better controlled for overlap fermions (Hernández et al., 2000) than for the others. In the domain wall formulation, where chirality appears after the reduction of the five-dimensional theory to our four-dimensional world, one has to remember that exact chiral symmetry is attained only when there is an infinite number of sites in the fifth dimension, which is never the case in Monte Carlo simulations. Reducing the amount of chiral breaking in this case means doing new simulations using larger lattices. For the fixed-point action it is the necessary truncation which breaks chiral symmetry, and one has to add more and more terms in order to decrease the amount of breaking. For overlap fermions, on the other hand, the chiral symmetry breaking induced by any numerical implementation can be reduced by using more and more refined methods to compute the square root in the action. Techniques which use polynomial approximations of the action and an exact evaluation of the lowest eigenvalues of the Dirac operator are widely employed. One does not need to repeat the simulations on new and larger lattices, or to add more terms to the action.
Here the road to smaller chiral symmetry breaking effects goes via the computation of more and more eigenvalues and the refinement of the polynomials. Both tasks are easier to accomplish. Making simulations on lattices with a longer and longer domain wall fifth dimension, or including more and more terms in the perfect action calculations, looks instead much more expensive and complicated, with all that this implies in terms of costs and CPU time. Perturbation theory also seems to be more difficult to carry out in the case of domain wall and fixed-point actions.

9. Perturbation theory of lattice regularized chiral gauge theories

The recent developments in the understanding of chiral symmetry on the lattice have also led to very interesting insights into the subject of chiral gauge theories, in which left- and right-handed fermions do not couple to the gauge fields in the same way. This is the case of neutrino interactions, in which as is well known no right-handed neutrino components couple to the electroweak gauge field, and in general of the electroweak theory. We have learned in the previous section that a Dirac operator satisfying the Ginsparg–Wilson relation describes fermions that have an exact chiral symmetry at finite lattice spacing. In this Section we discuss the fact that when this symmetry is gauged, that is when these fermions become
chiral gauge fermions, the lattice regularization can be arranged in such a way that gauge invariance is at the same time preserved to all orders in perturbation theory, and at any finite lattice spacing. It has been shown by Lüscher (2000c) that using Ginsparg–Wilson fermions it is indeed possible to regularize chiral gauge theories so that the regularization does not break gauge invariance and maintains to all orders chiral symmetry, locality and all other fundamental principles of field theory. We think that this has been one of the major theoretical advances in the theory of quantum fields on the lattice in recent times, and we would like to give a short account of it. Ginsparg–Wilson lattice fermions provide in this way the only known regularization of chiral gauge theories which is consistent at the nonperturbative level and which does not violate the gauge symmetry or other fundamental principles. For all other widely used regularization methods (and also the BPHZ finite-part prescription) this is impossible, because chirality and gauge invariance cannot be maintained together as symmetries beyond the classical level. [34] If one wants to keep chiral invariance unbroken, one is forced to introduce new counterterms order by order in perturbation theory to maintain the gauge invariance, making these regularizations much less appealing. Even on the lattice some new counterterms need to be introduced when one just uses Wilson fermions, and a delicate fine tuning of the corresponding coefficients is required in order to recover chiral invariance. This is what was done in the Rome approach (Borrelli et al., 1989, 1990; Testa, 1998a).
The problem here is not only that one has to introduce counterterms that are not gauge invariant, but that the fermion modes couple to the longitudinal modes of the gauge field (because of the lack of gauge invariance), so that the effect of these unwanted gauge modes on the fermions, which can lead to modifications of the spectrum and in particular to the appearance of doublers, has to be controlled and removed in some way. It was checked in perturbation theory (Sarno and Sisto, 1992; Rossi et al., 1993; Travaglini, 1997) that there is no apparent obstruction in carrying out the renormalization program. While in the rest of this section we are only interested in chiral gauge theories which employ Ginsparg–Wilson fermions and which maintain an exact gauge invariance, there is another nonperturbative approach worth mentioning, in which the coupling of the fermions to the unphysical degrees of freedom of the gauge field is controlled by means of a suitable gauge fixing (Bock et al., 1998a, b; Bock et al., 2000). This gauge fixing approach uses Wilson fermions and the naive notion of chirality on the lattice, and skirts the consequences of the Nielsen–Ninomiya theorem because it has no explicit gauge invariance, which can in fact only be recovered in the continuum limit. It thus needs counterterms which are not gauge covariant (and which have to be appropriately tuned), but it still ensures renormalizability and in the end it achieves the decoupling of the longitudinal degrees of freedom while keeping the fermion spectrum intact, so that one can define a lattice chiral gauge theory where the fermions have no doublers. Some nonperturbative considerations are also needed in order to define a valid perturbation theory around $A = 0$ which is still a reliable approximation of the full lattice theory at weak coupling (Bock et al., 1998c).
A naive gauge fixing does not do the job, and an appropriate classical potential containing higher powers of the gauge potential has to be introduced. A gauge fixing action which leads to a chiral gauge theory has been constructed
[34] A well-known 1-loop example in the electroweak theory is given by the divergent triangle diagrams containing $\gamma_5$, which give rise to the chiral anomaly (Adler, 1969; Bell and Jackiw, 1969; Bardeen, 1969). The recent lectures of Zinn-Justin (2002) contain a good deal of material about chiral anomalies in various regularizations and their connection with topology.
in the abelian case (Golterman and Shamir, 1997; Shamir, 1998), and it needs only one counterterm to be tuned nonperturbatively. For a recent presentation of this method see Golterman and Shamir (2002). Reviews have also been given in Shamir (1996) and Golterman (2001). [35] It thus seems that the lattice, when Ginsparg–Wilson fermions are used, is an exception to this vicious pattern which is common to all other regularizations. The lattice acquires in this sense a predominance over other regularizations, as it is the only nonperturbative technique which makes it possible to regularize theories without any breaking of chiral and gauge invariance, and with no need to introduce complicated counterterms. No noninvariant counterterms in the action are needed. One can thus regularize chiral gauge theories without breaking the gauge invariance and using a cutoff, something which was thought to be impossible. Such a powerful chiral regularization, which can only be realized using fermions obeying the Ginsparg–Wilson relation, works because the gauge anomaly cancels when radiative corrections are included. Simple schemes which do not take this fact into account and try to construct chiral fermions only at the tree level cannot work. Many years have indeed been necessary to understand the structure of chiral fermions on the lattice and go beyond the no-go theorem of Nielsen and Ninomiya. Of course in the end there is no conflict with the Nielsen–Ninomiya theorem, because it is the condition (d) (see Section 6) which is not obeyed. This was a nontrivial conceptual advance. We know from general results in algebraic renormalization theory that if the chiral gauge anomalies can be shown to be canceled at 1 loop, then they are absent to all orders because no radiative corrections can be generated at higher orders (Adler and Bardeen, 1969).
A mathematical fact is that the gauge anomaly descends from a topological field in 4 + 2 dimensions, and that the anomaly cancellation at 1 loop can be formulated in terms of a local cohomology problem in 4 + 2 dimensions, which has been studied and solved in Lüscher (1999a, b, 2000a, b) and Suzuki (1999), for abelian gauge theories. In general one needs a classification of the topological fields in six dimensions, and the local cohomology problem is then solved. Let us now explain some of the details of the construction of chiral gauge theories using Ginsparg–Wilson fermions. In chiral theories the fermion integral in the path-integral formulation is restricted to left-handed fields, and the propagator involves a chiral projector (defined in Eq. (8.9)):

$$ \langle \psi(x)\,\bar\psi(y) \rangle_F = \langle 1 \rangle_F \times \hat P_- S(x,y) P_+, \tag{9.1} $$

where $S(x,y)$ is the inverse of the Dirac operator: $\sum_z D(x,z) S(z,y) = a^{-4} \delta_{xy}$. Thus, only the left-handed components of the fermion field propagate. The fundamental point is that the definition of the measure for left-handed fermions, which is needed to construct the quantum theory, is highly nontrivial, because the projectors $\hat P$ contain a $\hat\gamma_5 = \gamma_5 (1 - a/$ …
and changing the basis means that the measure gets multiplied by the determinant of a unitary transformation matrix, which is a phase factor. Since the projectors depend on the gauge fields,
[35] I thank M. Golterman and Y. Shamir for correspondence.
Fig. 13. Dependence of the phase of the fermion measure on the gauge fields. [Figure labels: space of all Dirac fields; the subspace of left-handed fields moves with the gauge fields.]
the basis and the corresponding phase factor also depend on the gauge fields and cannot be fixed independently of them (see Fig. 13). This is the source of the phase ambiguity. Thus, in chiral gauge theories the fermion integration has a nontrivial phase ambiguity and the fermion measure is not a simple product of local factors. Since this phase ambiguity depends on the gauge fields, it does not cancel in ratios of expectation values when one normalizes the path integral with the partition function. The right-handed projectors on the other hand give a constant phase factor, which can be factored out and does cancel in such ratios. The phase problem, that is the definition of the measure, is the key issue in the construction of chiral gauge theories. The phase ambiguity for the left-handed components can be consistently removed only if the fermion multiplet is nonanomalous, that is when

$$ d_R^{abc} = 2i\, \mathrm{Tr}\{ R(T^a) [ R(T^b) R(T^c) + R(T^c) R(T^b) ] \} \tag{9.3} $$

is zero, where $R(T^a)$ are the anti-hermitian generators of the fermion representation of the gauge group. [36] For U(1) gauge theories coupled to $N$ left-handed Weyl fermions of charges $e_\alpha$ this condition becomes $\sum_{\alpha=1}^{N} e_\alpha^3 = 0$. The key step forward is that the phase problem can be equivalently formulated in terms of a local current which is gauge covariant. To make this step let us consider the effective action when the fermionic degrees of freedom are integrated out,

$$ e^{-S_{\rm eff}[U]} = \int D\psi\, D\bar\psi\; e^{-S_F[U,\psi,\bar\psi]}. \tag{9.4} $$

An infinitesimal deformation of the gauge field

$$ \delta_\eta U_\mu(x) = a\, \eta_\mu(x)\, U_\mu(x) \tag{9.5} $$

[36] In this Section we use, as in the original papers, anti-hermitian color matrices, that is they satisfy $\mathrm{Tr}\{T^a T^b\} = -\frac{1}{2}\delta^{ab}$ and $[T^a, T^b] = f^{abc} T^c$. This means that $U_\mu = \exp\{a g_0 A_\mu^a T^a\}$. The interested reader can then turn to these papers for more details without problems.
(where $\eta_\mu(x) = \eta_\mu^a(x) T^a$, and $\mu$ is not summed) induces a variation of the effective action

$$ \delta_\eta S_{\rm eff} = -\mathrm{Tr}\{ \delta_\eta D\, \hat P_- D^{-1} P_+ \} + i L_\eta. \tag{9.6} $$

The first term in the above formula is the naive expression which one would normally obtain, while the second term

$$ L_\eta = a^4 \sum_x \eta_\mu^a(x)\, j_\mu^a(x), \tag{9.7} $$

which is linear in the field variation, is the new term typical of Ginsparg–Wilson fermions which arises because the fermion measure depends on the gauge fields, and in fact this term can also be written as $L_\eta = i \sum_j (v_j, \delta_\eta v_j)$. The axial current $j_\mu(x)$ contains all the information about the phase of the fermion measure, provided it is given by a gauge-covariant local field which satisfies the integrability condition, which in its differential form reads

$$ \delta_\eta L_\zeta - \delta_\zeta L_\eta + a L_{[\eta,\zeta]} = i\, \mathrm{Tr}\{ \hat P_- [ \delta_\eta \hat P_-, \delta_\zeta \hat P_- ] \}, \tag{9.8} $$
for all field variations $\eta_\mu(x)$ and $\zeta_\mu(x)$ that do not depend on the gauge field. The reconstruction theorem then says that a given current with these requirements fixes the phase of the measure, and knowing the current is then equivalent to knowing the fermion measure. Once the measure is fixed, the functional integral is well defined and the effective action is gauge invariant. We have then constructed a chiral gauge theory which is not spoiled by radiative corrections. There is thus a one-to-one correspondence between the current and the measure, up to a constant (irrelevant) phase factor in each topological sector. The current $j_\mu(x)$ defines the chiral gauge theory, and the problem of constructing the fermion measure is reduced to the problem of constructing this current. The measure does not need to be explicitly specified. Let us now imagine expanding the various quantities introduced above in the coupling constant. The exact cancellation of the gauge anomaly can then be proven recursively, order by order in $g_0$. We think it is interesting to sketch how this construction is carried out, skipping the points that are technically more involved, for which the reader is referred to the original papers. A more detailed treatment is beyond the scope of this review. In order to construct a perturbative expansion one has to identify the interaction vertices coming from the measure term $L_\eta$. To this end it is convenient to start by considering the response of the theory under the variation of the gauge potential $A$,

$$ \bar\delta_\eta A_\mu(x) = \eta_\mu(x). \tag{9.9} $$

Since

$$ \delta_\eta = g_0^{-1}\, \bar\delta_\eta + O(1), \tag{9.10} $$

the corresponding variation of the effective action contains an explicit dependence on the coupling constant. One gets

$$ \bar\delta_\eta S_{\rm eff} = -\mathrm{Tr}\{ \bar\delta_\eta D\, \hat P_- D^{-1} P_+ \} + i g_0 L_{\bar\eta}, \tag{9.11} $$

with

$$ \bar\eta_\mu(x) = \Big( 1 + \sum_{k=1}^{\infty} \frac{1}{(k+1)!}\, (g_0 a \tilde A_\mu(x))^k \Big)\, \eta_\mu(x), \tag{9.12} $$
where in the gauge potential entering in $\bar\eta$ the color matrices are in the adjoint representation. To construct the current, one starts from the curvature term in the right-hand side of the integrability condition Eq. (9.8),

$$ F_{\eta\zeta} \equiv i\, \mathrm{Tr}\{ \hat P_- [ \delta_\eta \hat P_-, \delta_\zeta \hat P_- ] \}, \tag{9.13} $$

which has an expansion in the coupling constant whose leading term is of order $g_0^3$. The reason why the term of order $g_0^2$ is zero is the anomaly cancellation condition $d_R^{abc} = 0$. The fact that these contributions are absent will be crucial in the following. The lowest-order piece of the curvature is then

$$ \tilde F_{\eta\zeta} = \frac{1}{3!} \left. \frac{\partial^3 F_{\eta\zeta}}{\partial g_0^3} \right|_{g_0 = 0}, \tag{9.14} $$

and the lowest-order part of the measure term is

$$ \tilde L_\eta = \frac{1}{4!} \left. \frac{\partial^4 L_\eta}{\partial g_0^4} \right|_{g_0 = 0}, \tag{9.15} $$
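where $\tilde L$ is invariant under the linearized gauge transformations [37]

$$ A_\mu(x) \to A_\mu(x) + \nabla_\mu \omega(x). \tag{9.18} $$

In terms of the above quantities the lowest-order form of the integrability condition reads

$$ \bar\delta_\eta \tilde L_\zeta - \bar\delta_\zeta \tilde L_\eta = \tilde F_{\eta\zeta}. \tag{9.19} $$

From the lowest-order term of the curvature $\tilde F$ we now define the functional

$$ H_\eta = -\frac{1}{5} \left. \tilde F_{\eta\lambda} \right|_{\lambda = A}, \tag{9.20} $$

which satisfies the lowest-order form of the integrability condition [38] and is linear in $\eta$, which allows us to define the current $h_\mu(x)$:

$$ H_\eta = a^4 \sum_x \eta_\mu^a(x)\, h_\mu^a(x). \tag{9.23} $$

This current in general is not gauge invariant, but from it we can construct

$$ q(x) = \nabla^*_\mu h_\mu(x), \tag{9.24} $$

[37] One has

$$ \eta_\mu(x) = -\tilde D_\mu \omega(x), \tag{9.16} $$

with

$$ \tilde D_\mu \omega(x) = \frac{1}{a} \big[ U_\mu(x)\, \omega(x + a\hat\mu)\, U_\mu^\dagger(x) - \omega(x) \big]. \tag{9.17} $$

[38] To prove that

$$ \bar\delta_\eta H_\zeta - \bar\delta_\zeta H_\eta = \tilde F_{\eta\zeta} \tag{9.21} $$

one has to make use of the Bianchi identity

$$ \bar\delta_\eta \tilde F_{\zeta\lambda} + \bar\delta_\zeta \tilde F_{\lambda\eta} + \bar\delta_\lambda \tilde F_{\eta\zeta} = 0. \tag{9.22} $$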
which is invariant under the linearized gauge transformations (9.18). It can be shown that $q(x)$ is a topological field, that is it satisfies

$$ a^4 \sum_x \bar\delta_\lambda\, q(x) = 0 \tag{9.25} $$

for all variations $\lambda_\mu(x)$ of the gauge potential. Not surprisingly, this topological field turns out to correspond to the anomaly $(ia/2)\, \mathrm{Tr}\{ \gamma_5 R(T^a) D(x,x) \}$. We stress that $q(x)$ has been derived in a unique way from the current $j_\mu(x)$ (which also determines the fermion measure), via the curvature term $\tilde F$ appearing in the integrability condition. Topological fields like $q(x)$ have been classified in U(1) lattice gauge theories, and in the absence of matter they are equal to a sum of Chern polynomials plus a divergence term which is topologically trivial. The field $q(x)$ that we have just obtained is a homogeneous functional of degree 4 in the gauge potential, and since Chern polynomials in four dimensions have degree 2, they cannot contribute in this case. All that is left is thus the topologically trivial term, so that we have

$$ q(x) = \nabla^*_\mu k_\mu(x), \tag{9.26} $$

where $k_\mu(x)$ turns out to be a local current invariant under linearized gauge transformations. The lowest-order part of the measure term is now given by

$$ \tilde L_\eta = H_\eta + \bar\delta_\eta \cdot \frac{1}{4}\, a^4 \sum_x A_\mu^a(x)\, k_\mu^a(x), \tag{9.27} $$

which satisfies the lowest-order form of the integrability condition, since the last term has zero curvature. This completes the construction to leading order. The higher-order terms of $L$ can then be computed recursively according to the following procedure. If one has already calculated the $O(g_0^n)$ term in the expansion of $L$, then one has to subtract from $L$ another function, $L^{(n)}$, whose first $n$ terms in the expansion in the coupling constant are the same as those of $L$. Applying then the above construction (with some slight changes) to the difference $L - L^{(n)}$, one can compute its leading-order term, which determines $L$ at $O(g_0^{n+1})$. By repeating this procedure one can obtain $L$ to the desired order. This construction is unique up to terms that are of higher orders in the lattice spacing and therefore are irrelevant in the continuum limit (apart from a finite renormalization). As we have already mentioned, it can be carried out only if the fermion multiplet is anomaly-free. If this is not the case, i.e., if $d_R^{abc} \neq 0$, the lowest term of the curvature turns out to be of order $g_0$ instead of $g_0^3$, and the lowest term of the measure is of order $g_0^2$ instead of $g_0^4$. The topological field obtained along the lines explained above is now a homogeneous functional of degree 2 in the gauge potential. This time the Chern polynomials do contribute, and $q(x)$ is topologically nontrivial. It can be shown that this leads to the presence of a lattice field corresponding to $F_{\mu\nu} \tilde F_{\mu\nu}$, namely

$$ q(x) = -\frac{1}{192\pi^2}\, d_R^{abc}\, \epsilon_{\mu\nu\rho\sigma}\, T^a F^b_{\mu\nu}(x)\, F^c_{\rho\sigma}(x + a\hat\mu + a\hat\nu) + \nabla^*_\mu k_\mu(x), \tag{9.28} $$

where $F_{\mu\nu} = \nabla_\mu A_\nu(x) - \nabla_\nu A_\mu(x)$ is the linearized gauge field-strength tensor. This is the well-known covariant anomaly.
Thus, for $d_R^{abc} \neq 0$, the theory is not chirally invariant because of quantum corrections. This corresponds to the fact that the construction of a $\tilde L$ which satisfies the integrability condition, and hence of a gauge-invariant measure term, cannot be accomplished.
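For the abelian case the anomaly-freedom requirement quoted above, $\sum_\alpha e_\alpha^3 = 0$, is easy to test for a given charge assignment. The sketch below is only an illustration: the function names are hypothetical, and the charge lists are examples chosen here (eight fermions of charge 1 with one of charge $-2$, and the hypercharges of one Standard Model generation written as left-handed Weyl fermions in the normalization with $Y = 1/6$ for the quark doublet), not assignments discussed in the text. Exact rational arithmetic is used so that the cancellation is not blurred by rounding.

```python
from fractions import Fraction


def cubic_anomaly(charges):
    """Return sum_alpha e_alpha**3 over the left-handed Weyl fermions."""
    return sum(Fraction(e) ** 3 for e in charges)


def is_anomaly_free(charges):
    """True when the cubic U(1) anomaly coefficient vanishes exactly."""
    return cubic_anomaly(charges) == 0


# Example charge assignments (illustrative choices, see the lead-in).
EIGHT_PLUS_ONE = [1] * 8 + [-2]
SM_HYPERCHARGES = (
    [Fraction(1, 6)] * 6      # quark doublet: 3 colors x 2 isospin components
    + [Fraction(-2, 3)] * 3   # conjugate of the right-handed up quark
    + [Fraction(1, 3)] * 3    # conjugate of the right-handed down quark
    + [Fraction(-1, 2)] * 2   # lepton doublet
    + [Fraction(1)]           # conjugate of the right-handed electron
)

if __name__ == "__main__":
    print(cubic_anomaly(EIGHT_PLUS_ONE), cubic_anomaly(SM_HYPERCHARGES))
    print(cubic_anomaly([1, 2, -3]))
```

A multiplet such as charges $(1, 2, -3)$ has a vanishing sum of charges but a nonvanishing cubic sum, so in the language of this section the measure term could not be constructed for it.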
For nonabelian gauge groups the cohomology problem has not yet been solved, that is a classification of the topological fields in 4 + 2 dimensions has not yet been achieved. As a result, the general structure of the nonabelian anomaly on the lattice is currently not known. However, although we cannot yet prove that $q(x)$ is topologically trivial (in the absence of anomalies) for nonabelian gauge groups, there is no reason to suspect that the theorem could not be valid in that case as well. The cancellation of the anomalies can in fact be proven to be true for topological fields in the continuum limit. Consequently, if the topological structure at finite $a$ matches the one in the continuum limit (which would not be a surprising result), then the theorem could be proven also at finite lattice spacing. [39] Thus, the remaining open issues are mostly technical and not of principle, and one should expect to be able to overcome these difficulties at some point. There is little doubt that the construction is valid also in this case. [40] We conclude this section by discussing the implications of this remarkable construction for perturbative calculations. The measure term

$$ L_\eta = \sum_{k=4}^{\infty} \frac{g_0^k}{k!}\, a^{4k+4} \sum_{x, z_1, \ldots, z_k} L^{(k)}(x, z_1, \ldots, z_k)^{a a_1 \cdots a_k}_{\mu \mu_1 \cdots \mu_k}\; \eta_\mu^a(x)\, A_{\mu_1}^{a_1}(z_1) \cdots A_{\mu_k}^{a_k}(z_k) \tag{9.29} $$

can be considered as a local counterterm to be added to the action. The dependence of the measure on the gauge fields generates additional gauge vertices in the effective action. Acting with $\bar\delta_\eta$ on $L_\eta$ an appropriate number of times one can obtain all these additional vertices. These vertices appear only at the 1-loop level, and the $k$th order term is given by

$$ i g_0\, \bar\delta_\eta \cdots \bar\delta_\eta\, L_{\bar\eta} \big|_{A_\mu = 0} = g_0^k\, a^{4k} \sum_{z_1, \ldots, z_k} V_M^{(k)}(z_1, \ldots, z_k)^{a_1 \cdots a_k}_{\mu_1 \cdots \mu_k}\; \eta_{\mu_1}^{a_1}(z_1) \cdots \eta_{\mu_k}^{a_k}(z_k), \tag{9.30} $$

where the operator $\bar\delta_\eta$ has been applied $k - 1$ times. The expansion of $L_\eta$ as a power series in the coupling constant begins with the $g_0^4$ term, as indicated in Eq. (9.29). The vertices coming from the measure term are only of fifth and higher orders in the gauge coupling (the additional power of $g_0$ can be inferred by comparing Eqs. (9.6) and (9.11)). They can be determined, using a recursive procedure, once a local gauge-covariant current is given that satisfies the integrability condition. These interaction vertices are not explicitly known at present. Luckily, they are not needed in most cases of interest. In fact, since they are only of order $g_0^5$ and higher, and moreover they are proportional to positive powers of the lattice spacing (as can be seen by naive dimensional counting), they do not contribute to the continuum limit of 1-loop diagrams. At the 2-loop level the vertices coming from the measure term can come into play only in some very special circumstances. In fact, the lowest-order vertex, which is a five-point vertex, is totally symmetric in the gauge group
[39] Although in this case one should also show that there are no global topological obstructions at the nonperturbative level.
[40] The only nonabelian chiral gauge theory for which the solution of the cohomology problem is known is the $SU(2)_L \otimes U(1)_Y$ electroweak theory, which is easier to deal with because the representations of SU(2) are pseudo-real (Kikukawa and Nakayama, 2001). Theoretical advances have also been reported in Suzuki (2000) and Igarashi et al. (2000), where the cancellation of the lattice gauge anomaly to all orders in powers of the gauge potential for any compact group was established, and in Kikukawa (2002), where an explicit construction of the chiral measure which employs domain wall fermions was given. In the abelian case a nonperturbative construction has so far been carried out for weak fields satisfying $|F_{\mu\nu}| < \epsilon$ (with $\epsilon < 1/30$). This is not a limitation, because one can make them the only statistically relevant fields in the functional integral by using a modified version of the plaquette action which lies in the same universality class as the standard plaquette action (Lüscher, 1999c).
S. Capitani / Physics Reports 382 (2003) 113 – 302
indices, so that in 2-loop calculations this vertex can contribute only in diagrams with more than three external lines. All propagators and vertices of these chiral gauge theories satisfy the conditions for the validity of the Reisz power counting theorem (see Section 15). Although renormalizability has not yet been explicitly proven, it seems unlikely that these theories are nonrenormalizable. In this section we have discussed only the theory of left-handed fermions. The introduction of Higgs fields does not affect the structure of the measure term, because chiral projectors do not refer to the Higgs sector. Higgs fields or other fields that couple vectorially can then be easily incorporated in the theory. We have seen that lattice Feynman diagrams for Ginsparg–Wilson fermions are more complicated to calculate than continuum ones, and for chiral gauge theories with Ginsparg–Wilson fermions the vertices coming from the fermionic measure are also quite involved. On the other hand, having a nonperturbative regularization that is exactly chirally and gauge invariant is crucial when making calculations within the electroweak theory. It can then be worthwhile to pay the price of a more cumbersome formulation. We have thus shown that, although certainly not simple, there is now a construction of chiral gauge theories, at least in the abelian case, which is suitable for quantum calculations, and which makes use of the lattice regularization. A consistent formulation of the standard model now exists beyond perturbation theory.

10. The approach to the continuum limit

The range of couplings for which perturbation theory is expected to be a reasonable expansion (that is, the region where $g_0$ is small) is closely related to the way the continuum limit of lattice QCD is approached. This approach can be described using lattice-type Callan–Symanzik renormalization group equations similar to the continuum ones.
Since the first two coefficients of the $\beta$ function, which determines the perturbative running of the coupling constant, are (gauge invariant and) scheme independent (Caswell and Wilczek, 1974; Espriu and Tarrach, 1982), they have the same values as for continuum QCD. Lattice QCD is therefore asymptotically free. So, the lattice QCD bare coupling constant goes to zero in the limit in which physical quantities are kept fixed. This is the continuum limit $a \to 0$. Let us consider, in a massless theory, a physical quantity $P$ of mass dimension $n$ computed on a lattice of spacing $a$. The product $a^n P$ is a dimensionless quantity, which can only depend on the bare coupling constant, which in turn depends on the lattice spacing $a$. We thus have $a^n P = f(g_0(a))$. If the physical quantity $P$ is to have a well-defined value in the continuum limit, for $a \to 0$ it must happen that
\[
  \lim_{a \to 0} \frac{f(g_0(a))}{a^n} = \text{finite constant} . \tag{10.1}
\]
Near the continuum limit $g_0$ is a smooth function of $a$ with a stable ultraviolet fixed point $g_c$, i.e., with the property
\[
  \lim_{a \to 0} g_0(a) = g_c . \tag{10.2}
\]
The critical point of QCD is $g_c = 0$.
In the continuum limit the correlation length, i.e., the length scale that governs the exponential falloff of the two-point correlation functions in position space, goes to infinity. In fact, if the correlation length is held fixed in physical units, when $a$ decreases it becomes larger and larger when measured in units of the lattice spacing. For $a \to 0$ the correlation length must diverge in lattice units, so that the discretization effects disappear. The continuum limit is thus a critical point of the theory. What has been shown above is a completely nonperturbative argument. For small enough coupling constants the perturbative $\beta$ function can be a reasonable approximation to the real running of the coupling constant and the approach to the continuum limit can be studied in perturbation theory. One finds that $g_0$ and $a$ tend together to zero along a trajectory determined by renormalization group equations. The $\beta$ function appearing in the renormalization group equation for lattice QCD is defined by
\[
  a \frac{dg_0}{da} = -\beta(g_0) . \tag{10.3}
\]
This function has for small coupling constants the expansion
\[
  \beta(g_0) = -g_0^3 \left[ b_0 + b_1 g_0^2 + b_2 g_0^4 + \cdots \right] . \tag{10.4}
\]
We see that things are similar to the continuum, with $\mu$ replaced by $1/a$. The first two coefficients of the $\beta$ function are universal, and thus $b_0$ and $b_1$ in Eq. (10.4) are the same as in the continuum, where they were computed by Gross and Wilczek (1973), Politzer (1973), Jones (1974) and Caswell (1974):
\[
  b_0 = \frac{1}{(4\pi)^2} \left( 11 - \frac{2}{3} N_f \right) , \tag{10.5}
\]
\[
  b_1 = \frac{1}{(4\pi)^4} \left( 102 - \frac{38}{3} N_f \right) . \tag{10.6}
\]
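As a small numerical illustration of Eqs. (10.4)–(10.6), the following sketch evaluates $b_0$ and $b_1$ for a few values of $N_f$ and the two-loop truncation of the $\beta$ function. The function names are illustrative choices and not part of any established library; the sign of $b_0$ is used only as the usual lowest-order criterion for asymptotic freedom.

```python
import math


def beta_coefficients(n_f):
    """Universal coefficients b0, b1 of Eqs. (10.5)-(10.6) for SU(3) with n_f flavors."""
    b0 = (11.0 - 2.0 * n_f / 3.0) / (4.0 * math.pi) ** 2
    b1 = (102.0 - 38.0 * n_f / 3.0) / (4.0 * math.pi) ** 4
    return b0, b1


def beta_two_loop(g0, n_f):
    """Two-loop truncation of Eq. (10.4): beta(g0) = -g0^3 (b0 + b1 g0^2)."""
    b0, b1 = beta_coefficients(n_f)
    return -g0 ** 3 * (b0 + b1 * g0 ** 2)


if __name__ == "__main__":
    for n_f in (0, 2, 3, 16, 17):
        b0, b1 = beta_coefficients(n_f)
        # b0 > 0 means that g0 decreases as a -> 0 at lowest order
        print(f"N_f={n_f:2d}  b0={b0:+.6f}  b1={b1:+.3e}  b0>0: {b0 > 0}")
```

With these definitions $b_0$ changes sign between $N_f = 16$ and $N_f = 17$, which is the lowest-order bound on the number of flavors compatible with asymptotic freedom.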
The coefficient $b_2$ depends on the scheme. The first calculations of this coefficient for the Wilson action were attempted in Ellis and Martinelli (1984a, b) and Ellis (1984). The coefficient $b_2$ was then fully computed in Lüscher and Weisz (1995a, d), Allés et al. (1997) and Christou et al. (1998), with the result
\[
  b_2 = -0.00159983232(13) + 0.0000799(4)\, N_f - 0.00000605(2)\, N_f^2 \qquad (c_{\rm sw} = 0) , \tag{10.7}
\]
\[
  b_2 = -0.00159983232(13) - 0.0009449(4)\, N_f + 0.00006251(2)\, N_f^2 \qquad (c_{\rm sw} = 1) . \tag{10.8}
\]
In the last line we have also given its value in the tree-level improved theory (which we will introduce in the next section), finally computed by Bode and Panagopoulos (2002) for general $c_{\rm sw}$. The value of $b_2$ in the Schrödinger functional has been computed in Bode (1998), Bode et al. (1999) and Bode et al. (2000a, b). For $T = L$, $\theta = \pi/5$ and $N_f = 0$ one finds
\[
  b_2 = \frac{1}{(4\pi)^3}\, 0.482(7) , \tag{10.9}
\]
while for $N_f = 2$
\[
  b_2 = \frac{1}{(4\pi)^3}\, 0.064(10) . \tag{10.10}
\]
For comparison, $b_2$ is given in the continuum $\overline{\rm MS}$ scheme by (Tarasov et al., 1980)
\[
  b_2 = \frac{1}{(4\pi)^6} \left( \frac{2857}{2} - \frac{5033}{18} N_f + \frac{325}{54} N_f^2 \right) . \tag{10.11}
\]
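To make the scheme dependence of $b_2$ concrete, the sketch below tabulates the central values quoted in Eqs. (10.7)–(10.11). It ignores the quoted numerical uncertainties, and the helper names are illustrative assumptions.

```python
import math


def b2_lattice(n_f, c_sw):
    """Central values of Eqs. (10.7)-(10.8) for the Wilson plaquette action."""
    if c_sw == 0:
        return -0.00159983232 + 0.0000799 * n_f - 0.00000605 * n_f ** 2
    if c_sw == 1:
        return -0.00159983232 - 0.0009449 * n_f + 0.00006251 * n_f ** 2
    raise ValueError("only c_sw = 0 and c_sw = 1 are quoted in the text")


def b2_msbar(n_f):
    """Eq. (10.11), continuum MS-bar scheme."""
    poly = 2857.0 / 2.0 - 5033.0 / 18.0 * n_f + 325.0 / 54.0 * n_f ** 2
    return poly / (4.0 * math.pi) ** 6


# Schroedinger functional values of Eqs. (10.9)-(10.10), T = L, theta = pi/5
B2_SF = {0: 0.482 / (4.0 * math.pi) ** 3, 2: 0.064 / (4.0 * math.pi) ** 3}

if __name__ == "__main__":
    for n_f in (0, 2):
        print(f"N_f={n_f}: lattice(c_sw=0)={b2_lattice(n_f, 0):+.3e}  "
              f"lattice(c_sw=1)={b2_lattice(n_f, 1):+.3e}  "
              f"SF={B2_SF[n_f]:+.3e}  MSbar={b2_msbar(n_f):+.3e}")
```

The pure gauge value is the same for both choices of $c_{\rm sw}$, as it must be, while the sign and size of $b_2$ differ considerably between the bare lattice scheme and the other two.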
In this scheme some higher-order coefficients are also known (van Ritbergen et al., 1997). Solving Eqs. (10.3) and (10.4) to lowest order gives the solution
\[
  g_0^2 \sim -\frac{1}{b_0 \log a^2 \Lambda_{\rm lat}^2} + O\!\left( \frac{1}{\log^2 a^2 \Lambda_{\rm lat}^2} \right) . \tag{10.12}
\]
The evolution of the bare lattice coupling constant with the scale $a$ defines, as in the continuum, a renormalization group invariant $\Lambda$ parameter. The value of the $\Lambda$ parameter cannot be determined using the lowest-order solution, since a rescaling of $\Lambda$ is of the same order as terms which have been dropped. A reasonable definition of the $\Lambda$ parameter has to take into account higher orders. An expression which is exact to all orders is
\[
  \Lambda = a^{-1} \cdot (b_0 g_0^2)^{-b_1/2b_0^2} \cdot e^{-1/2b_0 g_0^2}\, \exp \left\{ - \int_0^{g_0} dt \left[ \frac{1}{\beta(t)} + \frac{1}{b_0 t^3} - \frac{b_1}{b_0^2 t} \right] \right\} . \tag{10.13}
\]
This definition has been used in the calculations of Capitani et al. (1999c) reported in Section 2, which use the Schrödinger functional. A $\Lambda$ parameter in a given scheme specifies the value of the coupling constant in that scheme for any given scale $\mu$, and all dimensionful quantities will be proportional to $\Lambda$. Since the $\Lambda$ parameter depends on the scheme, one has to compute the ratio $\Lambda_{\rm lat}/\Lambda_{\rm cont}$. This was done by different groups using lattice (and continuum) perturbation theory (Hasenfratz and Hasenfratz, 1980, 1981; Dashen and Gross, 1981; Weisz, 1981). For example, for the pure gauge Wilson action one finds
\[
  \frac{\Lambda_{\overline{\rm MS}}}{\Lambda_{\rm lat}} = 28.80934(1) . \tag{10.14}
\]
The huge change in scales between the continuum $\overline{\rm MS}$ renormalization scheme and the lattice is a common phenomenon, which is more pronounced for some lattice actions than for others. Once the two scales $\Lambda_{\rm lat}$ and $\Lambda_{\rm QCD}$ are related, combining the results of Monte Carlo simulations with the knowledge of $\Lambda_{\rm lat}$ and the quark masses allows one in principle to predict all physical quantities. In this way one can predict for example the value of $\alpha_S(M_Z)$ using only nonperturbative lattice data.
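The following sketch evaluates the two-loop approximation to Eq. (10.13). It assumes that the $\beta$ function is truncated after $b_1$, in which case the bracket in the integrand is of order $t$ and the last exponential is dropped; this truncation, the function name and the sample couplings are choices made here for illustration only.

```python
import math


def a_lambda_lat(g0_sq, n_f=0):
    """Two-loop estimate of a * Lambda_lat from Eq. (10.13).

    Assumption: beta(t) = -t^3 (b0 + b1 t^2), so the integral in the
    last exponential of Eq. (10.13) is neglected.
    """
    b0 = (11.0 - 2.0 * n_f / 3.0) / (4.0 * math.pi) ** 2
    b1 = (102.0 - 38.0 * n_f / 3.0) / (4.0 * math.pi) ** 4
    prefactor = (b0 * g0_sq) ** (-b1 / (2.0 * b0 ** 2))
    return prefactor * math.exp(-1.0 / (2.0 * b0 * g0_sq))


if __name__ == "__main__":
    ratio_msbar_over_lat = 28.80934  # Eq. (10.14), pure gauge Wilson action
    for g0_sq in (1.0, 0.95, 0.9):
        al = a_lambda_lat(g0_sq)
        print(f"g0^2={g0_sq:.2f}  a*Lambda_lat={al:.4e}  "
              f"a*Lambda_MSbar={ratio_msbar_over_lat * al:.4e}")
```

The exponential dependence on $1/g_0^2$ is apparent: a modest decrease of the bare coupling corresponds to a sizeable decrease of the lattice spacing in units of $\Lambda$.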
We have seen that the bare coupling constant $g_0$ must be related to the size of the lattice spacing $a$, so that computing lattice quantities near the continuum limit means taking both of them to zero in such a way that Eq. (10.12) (or better Eq. (10.13)) is satisfied. All renormalized physical quantities should remain constant in what is called the “scaling region” near the continuum limit. Continuum physics can be extracted from the Monte Carlo results only in this region of coupling constants and lattice spacings. For sufficiently large scales one can also expand the running coupling constants in a given scheme in terms of the coupling constant defined in a different scheme. We can then approximately relate different coupling constants by matching at a finite scale $p$ (Celmaster and Gonsalves, 1979a, b).
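On the lattice the relation between the bare and the renormalized coupling constant (defined as the three-point function at a certain momentum $p$) is
\[
  g_R(p) = g_0 \left[ 1 + g_0^2 \left( -b_0 \log ap + C^{L} + O(a^2 p^2 \log ap) \right) + O(g_0^4) \right] , \tag{10.15}
\]
while for the continuum coupling constant one has
\[
  g_R(p) = g_{\overline{\rm MS}} \left[ 1 + g_{\overline{\rm MS}}^2 \left( -b_0 \log \frac{p}{\mu} + C^{\overline{\rm MS}} \right) + O(g_{\overline{\rm MS}}^4) \right] . \tag{10.16}
\]
Combining these two equations we have
\[
  g_0 = g_{\overline{\rm MS}} \left[ 1 + g_0^2 \left( C^{\overline{\rm MS}} - C^{L} + b_0 \log a\mu \right) + O(g_0^4) + O(a^2) \right] . \tag{10.17}
\]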
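Eq. (10.17) fixes the ratio of $\Lambda$ parameters in terms of the one-loop matching constants. A short lowest-order argument, sketched here under the assumption that the continuum coupling obeys the analogue of Eq. (10.12), $1/g_{\overline{\rm MS}}^2(\mu) = b_0 \log(\mu^2/\Lambda_{\overline{\rm MS}}^2)$, runs as follows: squaring and inverting Eq. (10.17) gives $1/g_0^2 = 1/g_{\overline{\rm MS}}^2 - 2(C^{\overline{\rm MS}} - C^{L} + b_0 \log a\mu)$, and inserting the two logarithmic forms yields $\Lambda_{\overline{\rm MS}}/\Lambda_{\rm lat} = \exp[(C^{L} - C^{\overline{\rm MS}})/b_0]$. The snippet below simply inverts this relation for the pure gauge ratio of Eq. (10.14); the function name is illustrative.

```python
import math


def matching_constant_from_lambda_ratio(ratio, n_f=0):
    """Return C^L - C^MSbar implied by Lambda_MSbar / Lambda_lat = ratio.

    Uses the lowest-order relation ratio = exp((C^L - C^MSbar) / b0),
    derived from Eqs. (10.12) and (10.17) as described in the lead-in.
    """
    b0 = (11.0 - 2.0 * n_f / 3.0) / (4.0 * math.pi) ** 2
    return b0 * math.log(ratio)


if __name__ == "__main__":
    delta_c = matching_constant_from_lambda_ratio(28.80934)
    print(f"C^L - C^MSbar = {delta_c:.4f}")
```

A large ratio of $\Lambda$ parameters thus corresponds to a one-loop matching constant of quite ordinary size, because the constant is divided by the small number $b_0$ before being exponentiated.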
The most effective way of computing the matching between different coupling constants is provided by background field techniques (De Wit, 1967a, b; Abbott, 1981). In most of the calculations quoted above a background field method was indeed employed. In the calculations of Lüscher and Weisz (1995a, d) the method was supplemented by a clever way of evaluating the Feynman diagrams which we will discuss in detail in Section 19. A gauge theory on the lattice in the presence of a background gauge field is renormalizable to all orders in perturbation theory. Conveniently, no new counterterms are needed besides those already required in the lattice theory without background fields (Lüscher and Weisz, 1995c), just as in the continuum (Kluberg-Stern and Zuber, 1975a, b). Thus, introducing a background field does not affect the renormalization of the lattice theory. The renormalizability of gauge theories on the lattice without background fields has been proven in Reisz (1989). Since the propagator of the background field is proportional to $1/g^2$, for the renormalization of the coupling constant and the perturbative determination of the lowest coefficients of the $\beta$ function the computation of the self-energy of the background field is sufficient. This is the main advantage of using a background field method. It is much easier to compute diagrams with two legs instead of three, as would be the case without a background field. It also turns out that the number of diagrams to be computed is smaller. It is true that to control the renormalization of the gauge-fixing parameter the corrections to the gauge field propagator must also be calculated, but this is only needed to a lower order than the order to which the self-energy is computed. In the continuum the field $A_\mu$ is decomposed as follows:
\[
  A_\mu(x) = B_\mu(x) + g_0\, q_\mu(x) , \tag{10.18}
\]
where the background field $B_\mu$ is a smooth external field (which is not required to satisfy the Yang–Mills equations), while $q_\mu$ is the quantum fluctuation. To the gauge action reexpressed in terms of $B_\mu$ and $q_\mu$ one has to add the background gauge-fixing term
\[
  \frac{1}{\alpha} \int d^4x\, {\rm Tr} \left( D_\mu q_\mu(x)\, D_\nu q_\nu(x) \right) \tag{10.19}
\]
and the ghost action (coming from the Faddeev–Popov procedure)
\[
  2 \int d^4x\, {\rm Tr} \left( D_\mu \bar{c}(x)\, (D_\mu + i g_0 \tilde{q}_\mu(x))\, c(x) \right) , \tag{10.20}
\]
where the covariant derivative is
\[
  D_\mu = \partial_\mu + i \tilde{B}_\mu , \tag{10.21}
\]
and $\tilde{q}_\mu$ and $\tilde{B}_\mu$ denote the quantum field and the background field in which the color matrices are in the adjoint representation: $\tilde{B}_\mu = B_\mu^a t^a$ (as in Section 5.2.1). On the lattice, as expected, there is more than one choice for extending the theory with the introduction of a nonzero background field. A convenient decomposition of the gauge links is
\[
  U_\mu(x) = e^{i a g_0 q_\mu(x)}\, e^{i a B_\mu(x)} . \tag{10.22}
\]
In this case the background gauge transformations take a simple form,
\[
  B_\mu^\Lambda(x) = -\frac{i}{a} \log \left( \Lambda(x)\, e^{i a B_\mu(x)}\, \Lambda^{-1}(x + a\hat{\mu}) \right) , \tag{10.23}
\]
\[
  q_\mu^\Lambda(x) = \Lambda(x)\, q_\mu(x)\, \Lambda^{-1}(x) , \tag{10.24}
\]
with $\Lambda$ an element of the gauge group. The gauge-fixing term is
\[
  \frac{1}{\alpha} \cdot a^4 \sum_x {\rm Tr} \left( D^\star_\mu q_\mu(x)\, D^\star_\nu q_\nu(x) \right) \tag{10.25}
\]
and the ghost action is
\[
  2 a^4 \sum_x {\rm Tr} \left( D_\mu \bar{c}(x)\, \left( (M^\dagger)^{-1}(q_\mu(x)) \cdot D_\mu + i g_0 \tilde{q}_\mu(x) \right) c(x) \right) , \tag{10.26}
\]
where the forward and backward lattice covariant derivatives in this case act as follows:
\[
  D_\mu f(x) = \frac{1}{a} \left( e^{i a B_\mu(x)}\, f(x + a\hat{\mu})\, e^{-i a B_\mu(x)} - f(x) \right) , \tag{10.27}
\]
\[
  D^\star_\mu f(x) = \frac{1}{a} \left( f(x) - e^{-i a B_\mu(x - a\hat{\mu})}\, f(x - a\hat{\mu})\, e^{i a B_\mu(x - a\hat{\mu})} \right) , \tag{10.28}
\]
and $M$ is the same matrix defined in Eq. (5.37). The theory defined in this way is gauge invariant, and has a BRS symmetry and a shift symmetry (Lüscher and Weisz, 1995c). The Feynman rules of the Yang–Mills theory in the presence of a background field are given in Lüscher and Weisz (1995d). The usual lattice gauge theory with the standard covariant gauge-fixing term can be recovered in the limit of zero background field. The $\beta$ function can also be defined so as to describe the scale evolution of the renormalized coupling constant. In this form it has been used to determine the perturbative running of the strong coupling constant in the Schrödinger functional scheme, which we discussed in Section 2. There we also discussed the scale evolution of the renormalized masses of the quarks, which is described in lattice QCD by the $\tau$ function:
\[
  a \frac{dm}{da} = -\tau(g)\, m . \tag{10.29}
\]
The expansion of the $\tau$ function for small $g$ is given by
\[
  \tau(g) = -g^2 \left[ d_0 + d_1 g^2 + d_2 g^4 + \cdots \right] . \tag{10.30}
\]
The leading-order coefficient $d_0$ does not depend on the scheme, and has been computed in Nanopoulos and Ross (1975):
\[
  d_0 = \frac{8}{(4\pi)^2} . \tag{10.31}
\]
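At leading order the mass evolution can be integrated in closed form. Dividing the renormalization group equation for the mass by the one for the coupling, and keeping only $d_0$ and $b_0$, one gets $dm/dg = (d_0/(b_0 g))\, m$, so that $m \propto (g^2)^{d_0/2b_0}$. The sketch below evaluates this exponent, which equals $12/(33 - 2N_f)$; the function names and the sample couplings are illustrative assumptions, and higher-order coefficients are neglected.

```python
import math


def mass_running_exponent(n_f):
    """Leading-order exponent d0 / (2 b0) governing m ~ (g^2)^(d0 / 2 b0)."""
    b0 = (11.0 - 2.0 * n_f / 3.0) / (4.0 * math.pi) ** 2
    d0 = 8.0 / (4.0 * math.pi) ** 2
    return d0 / (2.0 * b0)


def lo_mass_ratio(g_sq_initial, g_sq_final, n_f):
    """m(final) / m(initial) at leading order, given the two couplings squared."""
    return (g_sq_final / g_sq_initial) ** mass_running_exponent(n_f)


if __name__ == "__main__":
    for n_f in (0, 2, 3):
        print(f"N_f={n_f}: exponent={mass_running_exponent(n_f):.4f}")
    # hypothetical couplings, chosen only to show the direction of the running
    print(f"m ratio for g^2: 1.0 -> 0.8 (N_f=0): {lo_mass_ratio(1.0, 0.8, 0):.4f}")
```

Since the exponent is positive, the mass decreases together with the coupling as the continuum limit is approached.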
The coefficient $d_1$ instead depends on the scheme. In the Schrödinger functional $d_1$ has been computed by Sint and Weisz (1999), and is given for $T = L$ and $\theta = 0.5$ by
\[
  d_1 = \frac{1}{(4\pi)^2} \left( 0.217(1) + 0.084(1)\, N_f \right) . \tag{10.32}
\]
In the continuum $\overline{\rm MS}$ scheme it is given by (Nanopoulos and Ross, 1979; Tarrach, 1981; Espriu and Tarrach, 1982)
\[
  d_1 = \frac{1}{(4\pi)^4} \left( \frac{404}{3} - \frac{40}{9} N_f \right) . \tag{10.33}
\]
In the $\overline{\rm MS}$ scheme some higher-order coefficients are also known (Chetyrkin, 1997; Vermaseren et al., 1997). A very useful quantity is the renormalization group invariant mass, defined by (Gasser and Leutwyler, 1982, 1984, 1985)
\[
  M = m \cdot (2 b_0 g^2)^{-d_0/2b_0}\, \exp \left\{ - \int_0^{g} dt \left[ \frac{\tau(t)}{\beta(t)} - \frac{d_0}{b_0 t} \right] \right\} . \tag{10.34}
\]
This quantity, at variance with the $\Lambda$ parameter, is scheme independent, and it is nonperturbatively well defined. This is not the case for masses renormalized at a certain scale in the $\overline{\rm MS}$ scheme.

11. Improvement

The results of lattice simulations are affected by statistical errors (which arise because only a finite number of field configurations can be generated) and systematic errors. The systematic errors are of various kinds. Rather important are those coming from the finiteness of the lattice spacing, but errors due to finite volume effects, quenching and extrapolations to the chiral limit are often non-negligible as well. The techniques which go under the name of “improvement” aim at removing the systematic errors due to the finiteness of the lattice spacing, which are generally of order $a$ with respect to the continuum limit, as in the formal expansion
\[
  \langle p | \hat{O}_L | p' \rangle_{\rm Monte\ Carlo} = a^d \left[ \langle p | \hat{O} | p' \rangle_{\rm phys} + O(a) \right] . \tag{11.1}
\]
We know that at the values of the coupling constant which are presently attainable these discretization errors are significantly large. It is very expensive to reduce these cutoff effects by “brute force”, that is by simply decreasing the lattice spacing $a$, since the simulation time grows with the fifth power of the inverse lattice spacing in the quenched approximation, and even faster in full QCD. This means for example that halving the discretization errors by decreasing $a$ would require a computational effort at least thirty times bigger, all other things being equal. A better way of reducing these errors can be achieved by following the Symanzik “improvement” program. Unphysical $O(a)$ terms are then systematically removed by adding irrelevant terms to the action and the operators. We will now discuss what this implies in terms of perturbation theory. For a pedagogical introduction to these topics the reader is referred to the reviews of Lüscher (1986, 1999a).
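The cost estimate quoted above can be reproduced with a one-line scaling argument. The sketch assumes, as stated in the text for the quenched approximation, that the simulation time scales like $a^{-5}$, and that the discretization error scales like a fixed power of $a$; the function names are illustrative.

```python
def cost_factor(spacing_ratio, cost_exponent=5):
    """Relative cost when the lattice spacing is rescaled, a -> spacing_ratio * a,
    assuming cost proportional to a^(-cost_exponent)."""
    return spacing_ratio ** (-cost_exponent)


def cost_to_reduce_error(error_reduction, error_order, cost_exponent=5):
    """Relative cost of reducing an O(a^error_order) error by the factor
    error_reduction purely by decreasing the lattice spacing."""
    spacing_ratio = error_reduction ** (-1.0 / error_order)
    return cost_factor(spacing_ratio, cost_exponent)


if __name__ == "__main__":
    print("halving an O(a) error:  ", cost_to_reduce_error(2.0, 1))
    print("halving an O(a^2) error:", cost_to_reduce_error(2.0, 2))
```

For $O(a)$ errors the factor is $2^5 = 32$, the "at least thirty times" of the text, whereas for $O(a^2)$ errors the same reduction costs only $2^{5/2} \approx 5.7$, which illustrates why removing the $O(a)$ terms pays off.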
11.1. Improved quarks

A systematic improvement program to reduce the cutoff errors order by order in the lattice spacing $a$ was first proposed by Symanzik (1980, 1982, 1983a, b) and then further developed for Wilson fermions, and applied to on-shell matrix elements, in Lüscher and Weisz (1985a, b), Sheikholeslami and Wohlert (1985) and Heatlie et al. (1991). In this approach, an irrelevant operator is added to the Wilson action in order to cancel, in on-shell matrix elements, all terms that in the continuum limit are effectively of order $a$.$^{41}$ In this way one can reduce the cutoff errors coming from the discretization of the action from $O(a)$ to $O(a^2)$:
\[
  \langle p | \hat{O}_L | p' \rangle_{\rm Monte\ Carlo} = a^d \left[ \langle p | \hat{O} | p' \rangle_{\rm phys} + O(a^2) \right] . \tag{11.2}
\]
This represents a remarkable decrease of the systematic error due to the finiteness of the lattice spacing. The continuum limit is reached much faster, with a rate proportional to $a^2$.$^{42}$ For this purpose one has to introduce the improved “clover” fermion action, first proposed in Sheikholeslami and Wohlert (1985):
\[
  \Delta S_I^f = c_{\rm sw} \cdot i g_0 a^4 \sum_{x,\mu\nu} \frac{r}{4a}\, \bar{\psi}(x)\, \sigma_{\mu\nu} F^{\rm clover}_{\mu\nu}(x)\, \psi(x) , \tag{11.3}
\]
which vanishes in the formal continuum limit and has the same symmetries as the original unimproved Wilson action. This counterterm is the only one required after exploiting all the symmetries and the equations of motion.$^{43}$ The gluon field-strength is here defined as
\[
  F^{\rm clover}_{\mu\nu}(x) = \frac{1}{4} \cdot \frac{1}{2 i g_0 a^2} \sum_{\mu\nu = \pm} \left( P_{\mu\nu}(x) - P^\dagger_{\mu\nu}(x) \right) . \tag{11.8}
\]
$F^{\rm clover}_{\mu\nu}$ is the average of the four plaquettes lying in the plane $\mu\nu$ stemming from the point $x$ (see Fig. 14). This definition maximizes the symmetry of the lattice expression of the gauge field-strength tensor (Mandula et al., 1983b).

41 This means that at $n$ loops these terms, because of Eq. (10.12), have the form $g_0^{2n}\, a \log^n a$.

42 A similar thing can be found for example in numerical integration methods, where the trapezoidal rule reduces the discretization error to the cubic power of the integration step, and Simpson's rule further reduces it to the fifth power.

43 The other dimension-five terms that are gauge invariant and compatible with the symmetries of the Wilson action are
\[
  \bar{\psi}\, \overrightarrow{D}_\mu \overrightarrow{D}_\mu\, \psi + \bar{\psi}\, \overleftarrow{D}_\mu \overleftarrow{D}_\mu\, \psi , \tag{11.4}
\]
\[
  m \left( \bar{\psi}\, \gamma_\mu \overrightarrow{D}_\mu\, \psi - \bar{\psi}\, \overleftarrow{D}_\mu \gamma_\mu\, \psi \right) , \tag{11.5}
\]
which can be eliminated in on-shell matrix elements using the equations of motion, and
\[
  m\, {\rm Tr} (F_{\mu\nu} F_{\mu\nu}) , \tag{11.6}
\]
\[
  m^2\, \bar{\psi} \psi , \tag{11.7}
\]
which can be reabsorbed into a rescaling of the coupling constant and mass.
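The analogy with numerical integration drawn in footnote 42 can be checked directly. The sketch below measures the error made on a single integration step of size $h$ by the trapezoidal rule and by Simpson's rule (the latter using the midpoint of the step), for the test integrand $e^x$; the choice of integrand, base point and step sizes is arbitrary and only serves as an illustration.

```python
import math


def trapezoid_step(f, x, h):
    """Trapezoidal estimate of the integral of f over [x, x + h]."""
    return 0.5 * h * (f(x) + f(x + h))


def simpson_step(f, x, h):
    """Simpson estimate of the integral of f over [x, x + h], using the midpoint."""
    return h / 6.0 * (f(x) + 4.0 * f(x + 0.5 * h) + f(x + h))


def single_step_errors(h, x=0.3):
    """Absolute errors of both rules for the integral of exp over one step."""
    exact = math.exp(x + h) - math.exp(x)
    err_trap = abs(trapezoid_step(math.exp, x, h) - exact)
    err_simp = abs(simpson_step(math.exp, x, h) - exact)
    return err_trap, err_simp


if __name__ == "__main__":
    coarse = single_step_errors(0.2)
    fine = single_step_errors(0.1)
    # halving h should reduce the errors by roughly 2^3 and 2^5 respectively
    print("trapezoid error ratio:", coarse[0] / fine[0])
    print("Simpson error ratio:  ", coarse[1] / fine[1])
```

Halving the step reduces the single-step error by a factor close to $2^3$ for the trapezoidal rule and close to $2^5$ for Simpson's rule, in the same way as an improved action changes the power of $a$ that controls the approach to the continuum limit.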
Fig. 14. The combination of the four plaquettes which builds the clover lattice approximation of the $F_{\mu\nu}$ tensor at the point $x$.
Improved actions like this exploit the fact that more than one lattice action corresponds to a given continuum action, so that one looks for those actions which have smaller discretization errors. This of course implies a change in the Feynman rules (and usually not for the better). Adding the Sheikholeslami–Wohlert clover term to the Wilson Lagrangian means that we have to add to the Wilson quark–quark–gluon interaction vertex
\[
  (V^a)^{bc}_\rho (p_1, p_2) = -g_0\, (T^a)^{bc} \left[ i \gamma_\rho \cos \frac{a(p_1 + p_2)_\rho}{2} + r \sin \frac{a(p_1 + p_2)_\rho}{2} \right] \tag{11.9}
\]
the improved quark–quark–gluon interaction vertex
\[
  (V^a_{\rm imp})^{bc}_\rho (p_1, p_2) = -c_{\rm sw} \cdot g_0\, \frac{r}{2}\, (T^a)^{bc} \cos \frac{a(p_1 - p_2)_\rho}{2} \sum_\lambda \sigma_{\rho\lambda} \sin a(p_1 - p_2)_\lambda . \tag{11.10}
\]
The fermion propagator and the vertices with an even number of gluons are instead not modified, nor is the gluon propagator. In fact, as we have seen in Section 5, the plaquette differs from the continuum $F^a_{\mu\nu} F^a_{\mu\nu}$ term by order $a^2$ and higher. Thus there is no need for improvement to this order. Nevertheless, it is sometimes useful to improve the gluon action and reduce the discretization errors from $O(a^2)$ to $O(a^3)$ (in fact to $O(a^4)$), as we will see shortly. Given $g_0$, the full cancellation of $O(a)$ terms can be achieved only with the appropriate value of the improvement coefficient $c_{\rm sw}$. To lowest order in perturbation theory one just needs the tree-level value $c_{\rm sw} = 1$, and in the work of Heatlie et al. (1991) it has been explicitly shown that all terms that are effectively of order $a$ are absent in the 1-loop matrix elements of the quark currents for $c_{\rm sw} = 1$. Perturbative determinations of $c_{\rm sw}$ to order $g_0^2$ have been made in Wohlert (1987), Naik (1993), and Lüscher and Weisz (1996). Nonperturbative determinations of $c_{\rm sw}$ have been carried out by the ALPHA Collaboration, using the Schrödinger functional formalism and requiring the cancellation of all $O(a)$ effects in the lattice PCAC relation. The formula that summarizes the ALPHA Collaboration
quenched results when $0 \le g_0 \le 1$ is (Lüscher et al., 1997)
\[
  c_{\rm sw} = \frac{1 - 0.656\, g_0^2 - 0.152\, g_0^4 - 0.054\, g_0^6}{1 - 0.922\, g_0^2} , \tag{11.11}
\]
while in the $N_f = 2$ case one has (Jansen and Sommer, 1998)
\[
  c_{\rm sw} = \frac{1 - 0.454\, g_0^2 - 0.175\, g_0^4 + 0.012\, g_0^6 + 0.045\, g_0^8}{1 - 0.720\, g_0^2} . \tag{11.12}
\]
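The two interpolating formulae are easy to evaluate, and their small-coupling behavior can be read off by expanding the ratio: the coefficient of $g_0^2$ is the difference between the denominator and numerator coefficients, $0.922 - 0.656 = 0.720 - 0.454 = 0.266$ in both cases. The sketch below encodes Eqs. (11.11) and (11.12) as functions of $g_0^2$; the function names are illustrative, and the range check applies only to the quenched formula, for which the text states the range of validity.

```python
def c_sw_quenched(g0_sq):
    """Eq. (11.11); the text quotes it for 0 <= g0 <= 1."""
    if not 0.0 <= g0_sq <= 1.0:
        raise ValueError("Eq. (11.11) is quoted only for 0 <= g0 <= 1")
    num = 1.0 - 0.656 * g0_sq - 0.152 * g0_sq ** 2 - 0.054 * g0_sq ** 3
    return num / (1.0 - 0.922 * g0_sq)


def c_sw_two_flavors(g0_sq):
    """Eq. (11.12), N_f = 2."""
    num = (1.0 - 0.454 * g0_sq - 0.175 * g0_sq ** 2
           + 0.012 * g0_sq ** 3 + 0.045 * g0_sq ** 4)
    return num / (1.0 - 0.720 * g0_sq)


def small_coupling_slope(c_sw, eps=1e-6):
    """Numerical estimate of the O(g0^2) coefficient of a c_sw formula."""
    return (c_sw(eps) - 1.0) / eps


if __name__ == "__main__":
    print("quenched slope:", small_coupling_slope(c_sw_quenched))
    print("N_f = 2 slope: ", small_coupling_slope(c_sw_two_flavors))
    print("quenched c_sw at g0^2 = 1:", c_sw_quenched(1.0))
```

Both formulae reduce to the tree-level value $c_{\rm sw} = 1$ at $g_0 = 0$, while at $g_0^2 = 1$ the quenched value is already far from it, which shows how large the nonperturbative corrections are at typical simulation couplings.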
The $O(g_0^2)$ terms in these formulae correspond to the known 1-loop perturbative results. The above machinery is not enough to improve a lattice theory completely. In addition to improving the action, one also has to improve the form of the operators. This means that one must add to each operator appropriate higher-dimensional irrelevant operators (with the same symmetry properties as the original unimproved operator) in order to cancel all $O(a)$ contributions from their matrix elements. Bases of improved operators have been constructed for quark currents (Sint and Sommer, 1996; Sint and Weisz, 1997) and operators measuring unpolarized structure functions (Capitani et al., 2001b). Improved renormalization factors have been computed in Gabrielli et al. (1991), Frezzotti et al. (1992), Borrelli et al. (1993) and Capitani et al. (1998b, 2001b). We will not deal with these calculations here. Of course one can also attempt to improve the theory to the next order, canceling all contributions which are effectively of order $a^2$. For the fermion part this is quite complicated. In this case four-quark operators are also necessary, besides a certain number of two-quark operators of dimension six. We would like briefly to mention what improvement means when overlap fermions are used (Capitani et al., 1999b; Capitani et al., 2000c, 2001b). One of the many nice features of overlap fermions is that the overlap action is already improved to $O(a)$, and thus one needs only to improve operators. Operators of the form $O = \bar{\psi} \tilde{O} \psi$ are immediately improved, to all orders in perturbation theory, by the substitution
\[
  O^{\rm imp} = \bar{\psi} \left( 1 - \frac{1}{2\rho}\, a D_N \right) \tilde{O} \left( 1 - \frac{1}{2\rho}\, a D_N \right) \psi . \tag{11.13}
\]
We remark that $O^{\rm imp}$ and $\tilde{O}$ have the same renormalization constants. Thus full $O(a)$ improvement is much more easily achieved here than for the Wilson case. For example, for the first moment of the quark momentum distribution the relevant operator is
\[
  O_{\{\mu\nu\}} = \bar{\psi}\, \gamma_{\{\mu} D_{\nu\}}\, \psi , \tag{11.14}
\]
where $\{\mu\nu\}$ means symmetrization in $\mu$ and $\nu$. In the standard Wilson theory improvement requires dealing with the combination
\[
  O^{\rm imp}_{\{\mu\nu\}} = \bar{\psi}\, \gamma_{\{\mu} D_{\nu\}}\, \psi - \frac{1}{4}\, a\, i\, c_1 \sum_\lambda \bar{\psi}\, \sigma_{\lambda\{\mu} [D_{\nu\}}, D_\lambda]\, \psi - \frac{1}{4}\, a\, c_2\, \bar{\psi}\, \{D_\mu, D_\nu\}\, \psi \tag{11.15}
\]
with appropriate values of the improvement coefficients
\[
  c_1(g_0^2) = 1 + g_0^2\, c_1^{(1)} + O(g_0^4) , \qquad c_2(g_0^2) = 1 + g_0^2\, c_2^{(1)} + O(g_0^4) . \tag{11.16}
\]
Fig. 15. The planar, twisted and L-shaped six-link loops.
Even in a simple case like this the two improvement coefficients have not yet been computed, because at present only a relation between them is known (Capitani et al., 2001b). One could determine both coefficients using Ward Identities or suitable physical conditions, but this would require a lot of effort. Moreover, for higher moments the relevant operators contain more covariant derivatives and correspondingly the number of operator counterterms becomes larger and larger. This means that more and more improvement coefficients have to be determined, through an adequately large set of constraints. In addition, one also has to compute the contribution of each one of these operator counterterms to the total renormalization constant. All this looks like a formidable task. Thus, improving the theory is incomparably simpler for overlap fermions.

11.2. Improved gluons

The plaquette is not the only possibility for the construction of the discretized version of the gauge action. One can also consider larger closed loops. As we noted in Section 5, the gluon part of the action is already $O(a)$ improved. The next step consists in implementing the improvement to $O(a^2)$, that is adding to the Wilson action some counterterms of dimension 6 that (with the appropriate values of their coefficients) can cancel all $O(a^2)$ effects, so that only corrections of order $a^4$ and higher are left.$^{44}$ There are three terms of dimension 6 with the right quantum numbers, and the improved gauge action can be written as
\[
  S_g = \frac{6}{g_0^2} \left[ c_0(g_0^2)\, L^{(4)} + a^2 \sum_{i=1}^{3} c_i(g_0^2)\, L_i^{(6)} \right] , \tag{11.17}
\]
where $L^{(4)}$ is the usual Wilson plaquette action, and the dimension-6 terms are six-link closed loops which are called planar, twisted and L-shaped loops respectively (see Fig. 15). Each of these terms contains the $\sum_{\mu\nu} {\rm Tr}(F_{\mu\nu} F_{\mu\nu})$ operator plus a linear combination of the following three operators:
\[
  \sum_{\mu\nu} {\rm Tr}(D_\mu F_{\mu\nu}\, D_\mu F_{\mu\nu}) ; \qquad \sum_{\mu\nu\rho} {\rm Tr}(D_\mu F_{\nu\rho}\, D_\mu F_{\nu\rho}) ; \qquad \sum_{\mu\nu\rho} {\rm Tr}(D_\mu F_{\mu\rho}\, D_\nu F_{\nu\rho}) . \tag{11.18}
\]

44 The fact that the corrections to the pure gauge action can only be of order $a^2$ or $a^4$ comes from the fact that one can construct gauge-invariant terms of dimension 6 and 8, but not of dimension 5 and 7.
Lüscher and Weisz have computed the coefficients of these linear combinations and determined the values of the improvement coefficients that accomplish the cancellation of the $O(a^2)$ corrections (Lüscher and Weisz, 1985a, b). At tree level one has
\[
  c_0 = \frac{5}{3} , \qquad c_1 = -\frac{1}{12} , \qquad c_2 = 0 , \qquad c_3 = 0 , \tag{11.19}
\]
so that only the $L_1^{(6)}$ counterterm (the planar loop) is needed. We then have
\[
  S_g = \frac{6}{g_0^2} \left[ \frac{5}{3}\, L^{(4)} - \frac{1}{12}\, a^2 L_1^{(6)} \right] = \frac{1}{2}\, {\rm Tr}\, F_{\mu\nu}^2(x) + O(a^4) . \tag{11.20}
\]
This action is a discretization of the continuum pure Yang–Mills action in which the discretization errors have been reduced to order $a^4$. Although both $L^{(4)}$ and $L_1^{(6)}$ have discretization errors of order $a^2$, in the above combination these contributions exactly cancel. Of course this is only a tree-level cancellation, and the coefficients get corrected by quantum effects. At 1 loop one also needs to include $L_2^{(6)}$ and one gets
\[
  c_0(g_0^2) = \frac{5}{3} + 0.2370\, g_0^2 , \qquad
  c_1(g_0^2) = -\frac{1}{12} - 0.02521\, g_0^2 , \qquad
  c_2(g_0^2) = -0.00441\, g_0^2 , \qquad
  c_3(g_0^2) = 0 . \tag{11.21}
\]
The coefficients (11.21) define the so-called Lüscher–Weisz action. They satisfy the normalization condition
\[
  c_0 + 8 c_1 + 8 c_2 + 16 c_3 = 1 , \tag{11.22}
\]
which must be valid to all orders of perturbation theory. Since at 1 loop $c_2$ is rather small one usually drops $L_2^{(6)}$, and $L_1^{(6)}$ remains the only counterterm (as was the case at tree level). In this case the normalization condition $c_0 + 8 c_1 = 1$ is used to write the action in the form
\[
  S_g = \frac{6}{g_0^2} \left[ (1 - 8 c_1)\, L^{(4)} + a^2 c_1 L_1^{(6)} \right]
      = \frac{6}{g_0^2} \left[ (1 - 8 c_1) \sum_x \sum_{\mu < \nu} P^{1 \times 1}_{\mu\nu} + a^2 c_1 \sum_x \sum_{\mu < \nu} \left( P^{1 \times 2}_{\mu\mu,\nu} + P^{1 \times 2}_{\nu\nu,\mu} \right) \right] , \tag{11.23}
\]
where $P^{1 \times 2}_{\mu\mu,\nu}$ denotes the rectangle which is two lattice spacings long in the $\mu$ direction. These actions have been extensively investigated also in Weisz (1983), Weisz and Wohlert (1984), Wohlert et al. (1985), Curci et al. (1983) and Bernreuther et al. (1984). There are also other actions which go under the name of improved gauge actions where improvement is not done à la Symanzik, but instead following renormalization group arguments (with the result that they still have $O(a^4)$ correction terms). In this case one looks for actions which are close to what one obtains after applying some blocking transformations, that is renormalization group
transformations in which the lattice spacing is doubled at each step. A perturbative calculation gives the action proposed by Iwasaki (1983a, b), which is similar to Eq. (11.23) but with
\[
  c_1 = -0.331 . \tag{11.24}
\]
There have also been nonperturbative calculations which use Schwinger–Dyson equations and lead to the so-called DBW2 action (Takaishi, 1996; de Forcrand et al., 2000), which corresponds to Eq. (11.23) with$^{45}$
\[
  c_1 \simeq -1.40686 . \tag{11.25}
\]
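The different improved gauge actions discussed here differ only in the value of $c_1$ once the normalization condition $c_0 + 8c_1 = 1$ is imposed. The following sketch lists the resulting plaquette coefficient $c_0 = 1 - 8c_1$ for each of them, and also checks the full normalization condition (11.22) on the 1-loop coefficients of Eq. (11.21). The dictionary labels and function names are illustrative choices.

```python
from fractions import Fraction


def c0_from_c1(c1):
    """Plaquette coefficient from the normalization condition c0 + 8 c1 = 1."""
    return 1 - 8 * c1


# Values of c1 quoted in Eqs. (11.19), (11.24) and (11.25); c1 = 0 is the Wilson action
ACTIONS = {
    "Wilson plaquette": 0,
    "tree-level Symanzik": Fraction(-1, 12),
    "Iwasaki": -0.331,
    "DBW2": -1.40686,
}


def one_loop_normalization_defect():
    """O(g0^2) part of c0 + 8 c1 + 8 c2 + 16 c3 for the coefficients of Eq. (11.21)."""
    return 0.2370 + 8 * (-0.02521) + 8 * (-0.00441) + 16 * 0.0


if __name__ == "__main__":
    for name, c1 in ACTIONS.items():
        print(f"{name:20s} c1 = {float(c1):+.5f}   c0 = {float(c0_from_c1(c1)):.5f}")
    print("1-loop defect of Eq. (11.22):", one_loop_normalization_defect())
```

The 1-loop defect vanishes within the rounding of the quoted digits, as required by Eq. (11.22), and the renormalization group inspired actions are seen to weight the plaquette much more strongly than the Symanzik ones.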
There are other proposals in which $c_2$ and $c_3$ are nonzero, which we will not consider here. The main drawback of all these improved gauge actions is that they have no reflection positivity (Lüscher and Weisz, 1984). This means it is not possible to construct a transfer matrix at finite $a$. This also causes problems with numerical simulations. The violation of physical positivity in fact leads to unphysical poles in the propagators, corresponding to unphysical states that create a sizeable disturbance while extracting physical observables (Necco, 2002b). Another problem is that perturbation theory is not very manageable. The gluon propagator in a covariant gauge for generic $c_1$ is given by
\[
  G_{\mu\nu}(k) = \frac{1}{(\hat{k}^2)^2} \left( \alpha\, \hat{k}_\mu \hat{k}_\nu + \sum_\sigma \left( \hat{k}_\sigma \delta_{\mu\nu} - \hat{k}_\nu \delta_{\mu\sigma} \right) \hat{k}_\sigma\, A_{\sigma\nu}(k) \right) \tag{11.26}
\]
with
\[
  A_{\mu\nu}(k) = A_{\nu\mu}(k) = (1 - \delta_{\mu\nu})\, \Delta(k)^{-1} \left[ (\hat{k}^2)^2 - c_1 \hat{k}^2 \left( 2 \sum_\rho \hat{k}_\rho^4 + \hat{k}^2 \sum_{\rho \neq \mu,\nu} \hat{k}_\rho^2 \right) + c_1^2 \left( \Big( \sum_\rho \hat{k}_\rho^4 \Big)^2 + \hat{k}^2 \sum_\rho \hat{k}_\rho^4 \sum_{\tau \neq \mu,\nu} \hat{k}_\tau^2 + (\hat{k}^2)^2 \prod_{\rho \neq \mu,\nu} \hat{k}_\rho^2 \right) \right] \tag{11.27}
\]
and
\[
  \Delta(k) = \left( \hat{k}^2 - c_1 \sum_\rho \hat{k}_\rho^4 \right) \left[ \hat{k}^2 - c_1 \left( (\hat{k}^2)^2 + \sum_\tau \hat{k}_\tau^4 \right) + \frac{1}{2} c_1^2 \left( (\hat{k}^2)^3 + 2 \sum_\tau \hat{k}_\tau^6 - \hat{k}^2 \sum_\tau \hat{k}_\tau^4 \right) \right] - 4 c_1^3 \sum_\rho \hat{k}_\rho^4 \prod_{\tau \neq \rho} \hat{k}_\tau^2 . \tag{11.28}
\]
The gluon vertices are quite complicated, and we will not report them here. They can be found in Weisz and Wohlert (1984). The quark–gluon vertices are of course untouched. Perturbative calculations using improved gauge actions have been recently presented in Aoki et al. (2000) for three-quark operators and in DeGrand et al. (2002) for two- and four-quark operators, and they have even been employed in connection with domain wall calculations (Aoki et al., 2002). The reason for using improved gauge actions in this context is the fact that they seem to lead to a decrease of the residual chiral symmetry breaking left behind when one is working at finite $N_s$.

45 The acronym DBW2 stands for doubly blocked Wilson $1 \times 2$ plaquette. We remark that in this case the relation between the coefficients is not linear, and the coefficients that are used represent rather crude estimates.
Fig. 16. The spacetime setting in which the Schrödinger functional lives. Also shown is a correlation function involving boundary fields.
12. The Schrödinger functional

We present here a short introduction to a powerful framework for lattice calculations that goes under the name of Schrödinger functional. This is a field of research that has grown very much in recent years and would need a separate review in itself, given its peculiarities and technical complexities as well as the number and importance of the results that it has produced. For an introductory review the lectures of Lüscher (1999a) are recommended. The lattice Schrödinger functional was extensively investigated in Symanzik (1981), Lüscher (1985), Lüscher et al. (1992) and Sint (1994) and used in various physical situations.$^{46}$ It has been essential for the calculation of $c_{\rm sw}$ perturbatively and nonperturbatively (Jansen et al., 1996; Lüscher et al., 1996, 1997; Lüscher and Weisz, 1996; Jansen and Sommer, 1998) and for the nonperturbative computation of the running coupling constant in QCD (which in turn has allowed a quite precise determination of the $\Lambda$ parameter) and of the masses of the quarks and their scale evolution (Lüscher et al., 1991, 1993, 1994; Capitani et al., 1998c, 1999c; Bode et al., 2001; Garden et al., 2000; Knechtli et al., 2002). The Schrödinger functional represents a finite volume renormalization scheme. It is a standard functional integral in which fixed boundary conditions are imposed, and where the time direction plays a special role. In fact, in the space directions the fields obey generalized periodic conditions, while in the time direction Dirichlet boundary conditions are imposed (see Fig. 16). In particular, at $x_0 = 0$ and $T$ one fixes the spatial components of the links $U_\mu$ to some particular values (usually constant abelian fields), while the temporal components, $U_0(x)$, remain unconstrained, and they are only defined for $0 \le x_0 \le T - 1$. The pure gauge action is given by the sum of the

46 The use of the Schrödinger functional for the study of continuum gauge theories was advocated in Rossi and Testa (1980a, b, 1984) and shown to be a particularly clean and appealing theoretical framework if the gauge choice $A_0^a = 0$ is made. The fermionic boundary conditions considered in Rossi and Yoshida (1989) and Leroy et al. (1990) differ from the ones we will introduce in the following.
S. Capitani / Physics Reports 382 (2003) 113 – 302
Wilson plaquettes which are fully contained within the timeslices at $x_0 = 0$ and $T$. The spatial plaquettes at the boundaries $x_0 = 0$ and $T$ contribute to the action only with a weight $1/2$ to avoid double counting. The gauge group is local in the bulk but global on the boundaries. The fermion fields are dynamical variables only for $1 \le x_0 \le T - 1$, while at the two temporal boundaries half of their components are fixed to particular values:

$$P_+\psi(x)|_{x_0=0} = \rho(\vec x), \quad P_-\psi(x)|_{x_0=T} = \rho'(\vec x), \quad \bar\psi(x)P_-|_{x_0=0} = \bar\rho(\vec x), \quad \bar\psi(x)P_+|_{x_0=T} = \bar\rho'(\vec x), \qquad (12.1)$$

where

$$P_\pm = \frac{1 \pm \gamma_0}{2}. \qquad (12.2)$$

The complementary components ($P_-\psi(x)|_{x_0=0}$, etc.) must vanish for consistency. In the spatial directions quark fields are periodic up to a phase, and the generalized periodic conditions can be written as

$$\psi(x + L\hat k) = e^{i\theta_k}\,\psi(x), \quad \bar\psi(x + L\hat k) = \bar\psi(x)\,e^{-i\theta_k}, \quad k = 1, 2, 3. \qquad (12.3)$$
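It turns out to be more convenient to work with the equivalent setting in which the fermion fields have strictly periodic boundary conditions in the spatial directions and the phase $\theta$ is moved into the definition of the covariant derivative, namely

$$\nabla_\mu\psi(x) = \frac{1}{a}\left[\lambda_\mu U_\mu(x)\,\psi(x + a\hat\mu) - \psi(x)\right], \quad \nabla^\star_\mu\psi(x) = \frac{1}{a}\left[\psi(x) - \lambda_\mu^{-1} U_\mu^{-1}(x - a\hat\mu)\,\psi(x - a\hat\mu)\right], \qquad (12.4)$$

with

$$\lambda_\mu = e^{ia\theta_\mu/L}, \quad \theta_0 = 0, \quad -\pi < \theta_k \le \pi. \qquad (12.5)$$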
This phase $\theta$ is called a finite-size momentum, and it is not quantized (although one is working in a finite volume). It can then be chosen to be smaller than the minimal quantized momentum, $p_{min} = 2\pi/L$, thereby reducing the lattice artifacts coming from the necessary finiteness of lattices used in simulations. $\theta$ is a free parameter, and can be tuned in such a way that one obtains the best numerical signals or the best perturbative expansions (or both, possibly).

Boundaries do not influence the integration measure of the functional integral. Correlation functions in the path integral formulation can be computed in the usual way, the only difference being that the action has a more complicated form at the boundaries. Correlation functions can then involve functional derivatives also acting on the boundary values of the quark fields,

$$\zeta(\vec x) = \frac{\delta}{\delta\bar\rho(\vec x)}, \quad \bar\zeta(\vec x) = -\frac{\delta}{\delta\rho(\vec x)}, \quad \zeta'(\vec x) = \frac{\delta}{\delta\bar\rho'(\vec x)}, \quad \bar\zeta'(\vec x) = -\frac{\delta}{\delta\rho'(\vec x)}. \qquad (12.6)$$
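The role of the finite-size momentum can be illustrated with a few lines of Python. This is only a sketch for one spatial direction: the values of `L` and `theta` are hypothetical, and the helper name `shifted_momenta` is ours. It compares the smallest nonzero quantized momentum $2\pi/L$ with the smallest shifted momentum $|p + \theta/L|$.

```python
import math

def shifted_momenta(L, theta, n_max=2):
    """Sorted values of |p + theta/L| for p = 2*pi*n/L, n = -n_max..n_max."""
    return sorted(abs(2.0 * math.pi * n / L + theta / L)
                  for n in range(-n_max, n_max + 1))

L = 8.0        # spatial extent (hypothetical value, arbitrary units)
theta = 0.5    # hypothetical angle, chosen in the range -pi < theta <= pi

p_min_periodic = 2.0 * math.pi / L             # smallest nonzero momentum at theta = 0
p_min_twisted = shifted_momenta(L, theta)[0]   # smallest |p + theta/L|

print(p_min_periodic, p_min_twisted)
```

Since $|\theta| \le \pi$, the smallest shifted momentum $|\theta|/L$ is always below $2\pi/L$, which is the sense in which $\theta$ probes momenta smaller than the minimal quantized one.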
[…] a field configuration at time zero to a field configuration at time $T$ (Feynman and Hibbs, 1965). A lattice regularization is of course useful for the study of nonperturbative aspects of QCD from first principles. When the Schrödinger functional is used in conjunction with the lattice regularization, 47 the fact that it can be defined in a finite volume brings interesting features. In fact, the choice of the temporal boundary conditions specified above implies that fermion zero modes are absent to the lowest order of perturbation theory. For zero quark masses the lowest eigenvalue of the Dirac operator is of order $1/L$. The frequency gap for quark and gluon fields remains of order $1/L$ even in the interacting theory, and this means that one can perform simulations with quark masses close to zero without encountering singularities, because of the infrared cutoff provided by the lattice size, $L$.

Working in a finite volume usually leads to systematic errors, but here the situation is completely different. The Schrödinger functional is exploited as a finite volume renormalization scheme, where renormalized quantities are specified at the scale $\mu = 1/L$ and for vanishing quark masses. 48 This is something very different from the usual approach, where the renormalization scale is a parameter independent of the lattice size, and is instead determined by the lattice spacing. Thus, there cannot be by definition any finite volume effects in the Schrödinger functional. The finite volume is used to probe the theory and specify the renormalization prescriptions.

We stress again that the Schrödinger functional is a continuum scheme, because the scale at which the theory is renormalized is independent of the lattice spacing. In practice however everything, from Monte Carlo simulations to perturbation theory, is carried out using lattice techniques.
For instance, the perturbative calculations of the running coupling constant performed in the Schrödinger functional scheme, which we mentioned in Section 2, were all done on the lattice. 49 At the end one can extrapolate the results obtained at different lattice spacings to the continuum limit, where the Schrödinger functional is still well defined (see also footnote 46). Then it helps that the $\Lambda$ parameter is much closer to the continuum than other lattice $\Lambda$ parameters. For the theory with zero flavors one has (Lüscher et al., 1994)

$$\frac{\Lambda_{SF}}{\Lambda_{\overline{MS}}} = 0.48811(1). \qquad (12.7)$$
From this point of view the Schrödinger functional scheme seems “closer” to the $\overline{MS}$ scheme than other lattice regularizations, as we have discussed at the end of Section 2.

The Schrödinger functional scheme can also be supplemented with powerful finite-size recursive techniques, which allow one to perform renormalization calculations over a wide range of energies. Wilson suggested introducing a renormalization group transformation so that one can cover large scale differences in a recursive manner (Wilson, 1980), and these ideas were then developed in Lüscher (1983), Lüscher et al. (1992, 1996) and Jansen et al. (1996). Nice summaries of these techniques are also given in Lüscher (1997, 2002) and Sommer (1997). The evolution of the renormalized coupling constant and masses can then be studied from low to rather high energies. This allows nonperturbative studies to be carried out with a good control of various systematic errors.

47 Usually one takes as lattice action in the interior the Wilson action, which is probably the simplest setting for the lattice Schrödinger functional.
48 It is customary to take $T = 2L$ or $L$, so that everything in the theory is referred to the scale $L$. It seems that some systematic errors are more pronounced in the case $T = 2L$, thus $T = L$ is preferred (Sint and Weisz, 1999).
49 One-loop calculations in dimensional regularization can be found in Sint (1995).
We would like to sketch how this nonperturbative renormalization over many scales is accomplished. One takes a sequence of pairs of lattices, with lattice sizes $L$ and $L' = 2L$. Then, keeping the bare parameters fixed, renormalized quantities are computed at the scales $\mu = 1/L$ and $\mu' = 1/L' = 1/2L$. 50 In this way one defines the step scaling functions $\sigma$. With reference to renormalized coupling constant and masses one has the formulae 51

$$g^2(2L) = \sigma(g^2(L)), \qquad (12.10)$$

$$Z_P(2L) = \sigma_P(Z_P(L)) \cdot Z_P(L). \qquad (12.11)$$
These step scaling functions correspond to a kind of integrated form of the $\beta$ function and of the mass anomalous dimension (see Eqs. (10.3) and (10.29)). Errors on them can be rendered rather small when for fixed $L$ the lattice computations are repeated at different values of $a$ and then extrapolated to the continuum limit, for instance in the O($a$) improved theory. One does not need large lattices, and in fact lattices as small as $L/a = 5$ have been used. For the coupling constant and mass evolutions it has never been necessary to consider lattices larger than $L'/a = 32$.

The step scaling functions $\sigma$ and $\sigma_P$ are used recursively to go from very high to very low scales. Once the renormalized quantities are computed at the new scale $L' = 2L$, this is taken as the starting scale for the computation of the renormalized quantities at $L'' = 2L' = 4L$, and this process is repeated until after $n$ steps one reaches the scale $L^{(n)} = 2^n L$. One has then completed the evolution of the renormalized parameters from the energy scale $\mu$ down to the energy scale $\mu/2^n$.

The low-energy end of this evolution should correspond to a scale at which one can safely match the renormalized quantities to some low-energy hadronic quantities. In this way one fixes the relations between bare coupling constant and masses and the renormalized coupling constant and masses at that scale. This matching can again be done without the use of large lattices. At the high-energy end one uses perturbation theory, which is completely safe there (see Figs. 1 and 2), to compute the $\Lambda$ parameter and the renormalization group invariant masses (Eqs. (10.13) and (10.34), respectively). Once these numbers are known, the matching to the $\overline{MS}$ scheme or to other continuum schemes can be easily carried out by means of continuum calculations only. We recall that the renormalization group invariant masses do not depend on the scheme, and thus it is only the $\Lambda$ parameter which needs a conversion to $\overline{MS}$.
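The recursion described above can be sketched in Python. As a stand-in for the nonperturbative step scaling function, which in practice comes from simulations, the sketch uses the one-loop form obtained by integrating the one-loop $\beta$ function, $1/g^2(2L) = 1/g^2(L) - 2 b_0 \ln 2$, with $b_0 = 11/(16\pi^2)$ for SU(3) with zero flavors. The starting coupling and the number of steps are hypothetical.

```python
import math

# One-loop beta-function coefficient for SU(3) with zero flavors.
B0 = 11.0 / (16.0 * math.pi ** 2)

def sigma_one_loop(u):
    """One-loop stand-in for the step scaling function: u = g^2(L) -> g^2(2L)."""
    return u / (1.0 - 2.0 * B0 * math.log(2.0) * u)

def evolve(u_start, n_steps):
    """Apply sigma recursively; returns [g^2(L), g^2(2L), ..., g^2(2^n L)]."""
    couplings = [u_start]
    for _ in range(n_steps):
        couplings.append(sigma_one_loop(couplings[-1]))
    return couplings

n_steps = 7                                       # hypothetical number of steps
couplings = evolve(u_start=1.0, n_steps=n_steps)  # hypothetical starting value
scale_factor = 2 ** n_steps                       # ratio of final to initial L

for k, u in enumerate(couplings):
    print(f"L -> {2 ** k:4d} L   g^2 = {u:.4f}")
```

Seven doublings already cover a factor of 128 in scale, that is more than two orders of magnitude, while each individual step only requires lattices of sizes $L$ and $2L$.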
50 The renormalized coupling constant is defined as the response of the functional to a variation of the gauge fields at the boundaries.
51 In practice the PCAC relation

$$\partial_\mu A_\mu(x) = 2m \cdot P(x) \qquad (12.8)$$

is used to define the masses, so that the renormalization of the mass is proportional to $Z_A/Z_P$, and since $Z_A$ does not evolve with the scale, the scale evolution of the renormalized mass is proportional to the inverse of the renormalization of the pseudoscalar density:

$$m(\mu) = \frac{Z_A}{Z_P(L)}\, m_0, \quad \mu = \frac{1}{L}. \qquad (12.9)$$

We recall that the local currents are not conserved on the lattice, and they are renormalized by the strong forces, so that their $Z$'s are different from one.

We have thus been able to connect the nonperturbative infrared sector of the theory to the high-energy perturbative regime. The Schrödinger functional coupled with the recursive finite-size scaling techniques allows one to cover a range of scales spanning more than two orders of magnitude. It is the only method by which one can arrive at energy scales of more than 100 GeV (see Figs. 1 and 2). In order to achieve this result with conventional methods it would be necessary to contain all relevant scales in a single lattice, which at the moment is impossible. The Schrödinger functional acts in this process only as an intermediate renormalization scheme. Everything can be computed nonperturbatively and, when improvement is implemented, the more dangerous systematic errors are well under control.

The methods we have just described have yielded the best lattice result that we currently have for the $\Lambda$ parameter of QCD,

$$\Lambda_{\overline{MS}} = 238 \pm 19~\mathrm{MeV}, \qquad (12.12)$$
in the theory with zero flavors (Capitani et al., 1999c). In this work the nonperturbative relation between the bare masses and the renormalization group invariant masses has also been determined with a rather small error.

A spinoff of this last calculation has been the computation of the nonperturbative renormalization of the scalar quark condensate using overlap fermions (Hernández et al., 2001, 2002a, b). The possibility of doing this calculation relies on two things. One is the fact that in a regularization which respects chiral symmetry one has

$$Z_S = Z_P = \frac{1}{Z_M}, \qquad (12.13)$$

and therefore the knowledge of the renormalization of the mass translates immediately into the determination of the renormalization of the scalar and pseudoscalar quark densities. The other is that the renormalization group invariant masses are independent of the scheme, and hence they are the same whichever lattice action is used. This means that one can then compute the renormalization of the mass in the overlap formulation by comparing the two formulae

$$M_{RGI} = Z_M^{ov}(g_0)\, m^{ov}(g_0), \qquad (12.14)$$

$$M_{RGI} = Z_M^{W}(g_0)\, m^{W}(g_0), \qquad (12.15)$$

where $Z_M^W$ gives in the Wilson formulation the relation between the bare mass and the renormalization group invariant mass. $Z_M^W$ has been nonperturbatively determined by finite-size scaling techniques using the Schrödinger functional scheme as an intermediate step (Capitani et al., 1999c). $Z_M^{ov}$ can be evaluated by fixing the ratio of the bare masses in the two schemes, requiring that some renormalized quantity at a certain scale takes the same value in both schemes. Thanks to Eq. (12.13), the knowledge of $Z_M^{ov}$ finally allows the determination of the renormalization constant of the bare condensate.

The strategy illustrated above can be applied to other physical cases and other lattice actions. It shows how important calculations made using the Schrödinger functional can be in practice, and how results obtained using the finite-size scaling techniques can have applications within other regularization schemes.

Let us now turn our attention to the peculiar structure of perturbation theory in the Schrödinger functional formalism. For details the reader can consult the article of Lüscher and Weisz (1996). There is an asymmetry between spatial indices and the temporal index, because of the boundary conditions. There is no periodicity in the time direction, so translational invariance is lost and Fourier transforms can be done only in the spatial directions. Given a function $f(x)$ of coordinate space, one works in the time–momentum representation defining the three-dimensional Fourier transform,

$$f(x_0, \vec p). \qquad (12.16)$$
The presence of boundaries also makes the form of the propagators quite complicated. Moreover, usually the improved action is employed and this causes the calculations to be even more involved. In fact, to improve the Schrödinger functional to order $a$, in addition to the usual Sheikholeslami–Wohlert term one also needs some O($a$) boundary counterterms, both for the gluon part and for the quark part of the action.

Let us discuss the O($a$) improved theory. The quark propagator obeys the equation

$$(D + \delta D_v + \delta D_b + m_0)\, S(x, y) = a^{-4}\,\delta_{xy}, \quad 0 < x_0 < T, \qquad (12.17)$$

together with the boundary conditions

$$P_+ S(x, y)|_{x_0=0} = P_- S(x, y)|_{x_0=T} = 0, \qquad (12.18)$$

where the improvement counterterm in the Dirac operator in the interior of the lattice is the same as for the Wilson action, namely the Sheikholeslami–Wohlert term

$$\delta D_v\,\psi(x) = c_{sw}\,\frac{i}{4}\, a\,\sigma_{\mu\nu}\, F^{clover}_{\mu\nu}(x)\,\psi(x), \qquad (12.19)$$

while there are additional counterterms at the boundaries, namely

$$\delta D_b\,\psi(x) = (\tilde c_t - 1)\,\frac{1}{a}\,\Big[\delta_{x_0,a}\,\big[\psi(x) - U_0^\dagger(x - a\hat 0)\,P_+\,\psi(x - a\hat 0)\big] + \delta_{x_0,T-a}\,\big[\psi(x) - U_0(x)\,P_-\,\psi(x + a\hat 0)\big]\Big]. \qquad (12.20)$$
In Eq. (12.20) $\tilde c_t$ is an improvement coefficient for these fermion boundary counterterms. In perturbation theory (Leroy et al., 1990) it is convenient to make the decomposition

$$\psi(x) = \psi_{cl}(x) + \chi(x), \quad \bar\psi(x) = \bar\psi_{cl}(x) + \bar\chi(x), \qquad (12.21)$$

where $\psi_{cl}$ is a sort of classical field satisfying the Dirac equation

$$(D + \delta D_v + \delta D_b + m_0)\,\psi_{cl}(x) = 0, \quad 0 < x_0 < T, \qquad (12.22)$$

with boundary values

$$P_+\psi_{cl}(x)|_{x_0=0} = \rho(\vec x), \quad P_-\psi_{cl}(x)|_{x_0=T} = \rho'(\vec x) \qquad (12.23)$$

(see Eq. (12.1)). A similar decomposition holds for $\bar\psi_{cl}$. In terms of the boundary values $\psi_{cl}$ has the expression

$$\psi_{cl}(x) = \tilde c_t\, a^3 \sum_{\vec y}\Big[S(x, y)\,U_0^\dagger(y - a\hat 0)\,\rho(\vec y)\big|_{y_0=a} + S(x, y)\,U_0(y)\,\rho'(\vec y)\big|_{y_0=T-a}\Big]. \qquad (12.24)$$
A useful property of this decomposition is that the quantum components $\chi(x)$ are endowed with vanishing boundary conditions. The fermionic action splits as

$$S_f^{imp}[U, \bar\psi, \psi] = S_f^{imp}[U, \bar\psi_{cl}, \psi_{cl}] + S_f^{imp}[U, \bar\chi, \chi], \qquad (12.25)$$

where quantum and classical components are completely separated. The generating functional with fermionic sources $\eta$, $\bar\eta$ takes the form

$$\log Z_f = \log Z_f\big|_{\ldots} \;\ldots\; \sum_{x,y}\bar\eta(x)\,S(x, y)\,\eta(y) + a^4\sum_x\big[\bar\eta(x)\,\psi_{cl}(x) + \bar\psi_{cl}(x)\,\eta(x)\big], \qquad (12.26)$$

where $\tilde c_s$ is another improvement coefficient, corresponding to a fermion counterterm living at the boundaries and entirely contained in the timeslices at $x_0 = 0$ and $T$. Upon differentiation we can derive the basic “contractions”, among which we find 52

$$[\psi(x)\,\bar\psi(y)] = S(x, y), \qquad (12.27)$$

$$[\psi(x)\,\bar\zeta(\vec y)] = \frac{\delta\psi_{cl}(x)}{\delta\rho(\vec y)} = \tilde c_t\, S(x, y)\,U_0^\dagger(y - a\hat 0)\,P_+\big|_{y_0=a}, \qquad (12.28)$$

$$[\zeta(\vec z)\,\bar\psi(x)] = \frac{\delta\bar\psi_{cl}(x)}{\delta\bar\rho(\vec z)} = \tilde c_t\, P_-\,U_0(z - a\hat 0)\,S(z, x)\big|_{z_0=a}. \qquad (12.29)$$

Using these contractions one can construct, for example, the correlation function of the axial current, inserted at the time $x_0$, obtaining (see Fig. 16)

$$f_A(x_0) = a^6\sum_{\vec y,\vec z}\frac{1}{2}\,\Big\langle \mathrm{Tr}\big\{[\zeta(\vec z)\,\bar\psi(x)]\,\gamma_0\gamma_5\,[\psi(x)\,\bar\zeta(\vec y)]\,\gamma_5\big\}\Big\rangle, \qquad (12.30)$$
where $\langle\cdots\rangle$ denotes gluon averages. The complete list of necessary contractions is given in Lüscher and Weisz (1996).

We give here the explicit expression of the quark propagator, which has a quite complicated form, even in the unimproved theory ($\tilde c_t = \tilde c_s = 1$). One has

$$S(x, y) = (D^\dagger + m_0)\,G(x, y), \qquad (12.31)$$

where

$$G(x, y) = \frac{1}{L^3}\sum_{\vec p}\frac{e^{i\vec p\,(\vec x - \vec y)}}{-2i\,a^{-1}\sin ap_0^+\; A(\vec p^{\,+})\,R(p^+)}\,\Big\{\big(M(p^+) - i a^{-1}\sin ap_0^+\big)\,e^{-\omega(\vec p^{\,+})|x_0 - y_0|} + \big(M(p^+) + i a^{-1}\sin ap_0^+\big)\,e^{-\omega(\vec p^{\,+})(2T - |x_0 - y_0|)} - \big(M(p^+) + i\gamma_0\, a^{-1}\sin ap_0^+\big)\,e^{-\omega(\vec p^{\,+})(x_0 + y_0)} - \big(M(p^+) - i\gamma_0\, a^{-1}\sin ap_0^+\big)\,e^{-\omega(\vec p^{\,+})(2T - x_0 - y_0)}\Big\}. \qquad (12.32)$$

52 The square brackets denote fermion integration in a given external gauge field.
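We note that $G(x, y)$ is defined for $0 \le x_0, y_0 \le T$, while $S(x, y)$ only for $0 < x_0, y_0 < T$. The above auxiliary functions are given by

$$p^+ = p + \frac{\theta}{L}, \qquad (12.33)$$

$$A(\vec q) = 1 + a\left(m_0 + \frac{2}{a}\sum_{k=1}^{3}\sin^2\frac{aq_k}{2}\right), \qquad (12.34)$$

$$R(q) = M(q)\left[1 - e^{-2\omega(\vec q)T}\right] - \frac{i}{a}\sin aq_0\left[1 + e^{-2\omega(\vec q)T}\right], \qquad (12.35)$$

$$M(q) = m_0 + \frac{2}{a}\sum_{\mu}\sin^2\frac{aq_\mu}{2}, \qquad (12.36)$$

and

$$p_0 = p_0^+ = i\,\omega(\vec p^{\,+}) \mod \frac{2\pi}{a}, \qquad (12.37)$$

where $\omega(\vec q)$ in Eq. (12.37) is defined through the relation

$$\sinh\left[\frac{a}{2}\,\omega(\vec q)\right] = \frac{a}{2}\left\{\frac{\frac{1}{a^2}\sum_{k=1}^{3}\sin^2 aq_k + \frac{1}{a^2}\big(A(\vec q) - 1\big)^2}{A(\vec q)}\right\}^{1/2}. \qquad (12.38)$$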
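As a numerical cross-check of these auxiliary functions, the following Python sketch evaluates $A(\vec q)$ and $\omega(\vec q)$. It assumes that Eq. (12.34) reads $A = 1 + a\,(m_0 + \frac{2}{a}\sum_k \sin^2\frac{aq_k}{2})$ and that Eq. (12.38) reads $\sinh[\frac{a}{2}\omega] = \frac{a}{2}\{[a^{-2}\sum_k \sin^2 aq_k + a^{-2}(A-1)^2]/A\}^{1/2}$. The function names and the numerical inputs are ours and purely illustrative.

```python
import math

def p_plus(p, theta, L):
    """Spatial momentum shifted by the finite-size momentum, Eq. (12.33)."""
    return tuple(pk + th / L for pk, th in zip(p, theta))

def A(q, a, m0):
    """Auxiliary function A(q) for a spatial momentum q = (q1, q2, q3)."""
    return 1.0 + a * (m0 + (2.0 / a) * sum(math.sin(0.5 * a * qk) ** 2 for qk in q))

def omega(q, a, m0):
    """Energy omega(q) solved from the assumed form of Eq. (12.38)."""
    Aq = A(q, a, m0)
    s2 = sum(math.sin(a * qk) ** 2 for qk in q) / a ** 2
    rhs = 0.5 * a * math.sqrt((s2 + (Aq - 1.0) ** 2 / a ** 2) / Aq)
    return (2.0 / a) * math.asinh(rhs)

# Hypothetical inputs: momentum, bare mass, angles and box size.
q = p_plus((0.2, 0.3, -0.1), theta=(0.1, 0.1, 0.1), L=1.0)
m0 = 0.5

# Coarse lattice: omega is the pole of the free Wilson propagator,
# i.e. cosh(a*omega) = (1 + A^2 + sum_k sin^2(a q_k)) / (2 A).
a_coarse = 1.0
w_coarse = omega(q, a_coarse, m0)
A_coarse = A(q, a_coarse, m0)
pole_rhs = (1.0 + A_coarse ** 2 + sum(math.sin(a_coarse * qk) ** 2 for qk in q)) / (2.0 * A_coarse)

# Fine lattice: omega approaches the continuum energy sqrt(m0^2 + q^2).
w_fine = omega(q, 1.0e-3, m0)
w_continuum = math.sqrt(m0 ** 2 + sum(qk ** 2 for qk in q))
print(w_coarse, w_fine, w_continuum)
```

The two checks are independent: the first is an exact algebraic identity at any lattice spacing, the second only holds up to O($a$) corrections.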
The angle $\theta$ in the formulae above is the finite-size momentum which comes from the boundary conditions in the spatial directions (or rather from the modified covariant derivative, Eq. (12.4)).

As one can imagine, perturbative calculations are rather nontrivial. We also want to report here the expression of the gluon propagator, which in the Feynman gauge is given by the formula

$$D_{\mu\nu}(x, y) = \frac{1}{L^3}\sum_{\vec p} e^{i\vec p\,(\vec x - \vec y)}\, d_{\mu\nu}(x_0, y_0; \vec p), \qquad (12.39)$$

with

$$d_{00}(x_0, y_0; \vec p) = \frac{a}{\sinh(\varepsilon a)\,\sinh(\varepsilon T)}\,\cosh\left[\varepsilon\left(T - x_0 - \tfrac{1}{2}a\right)\right]\cosh\left[\varepsilon\left(y_0 + \tfrac{1}{2}a\right)\right], \qquad (12.40)$$

$$d_{kj}(x_0, y_0; \vec p) = \delta_{kj}\,\frac{a}{\sinh(\varepsilon a)\,\sinh(\varepsilon T)}\,\sinh[\varepsilon(T - x_0)]\,\sinh(\varepsilon y_0) \qquad (12.41)$$

for nonzero $\vec p$, and

$$d_{00}(x_0, y_0; \vec 0) = y_0 + a, \qquad (12.42)$$

$$d_{kj}(x_0, y_0; \vec 0) = \delta_{kj}\,(T - x_0)\,\frac{y_0}{T} \qquad (12.43)$$

for $\vec p = 0$. The “energy” $\varepsilon$ is here given by

$$\cosh(a\varepsilon) = 1 + 2\sum_{k=1}^{3}\sin^2\frac{ap_k}{2}. \qquad (12.44)$$
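A short Python sketch can make these expressions more concrete. It assumes the forms $\cosh(a\varepsilon) = 1 + 2\sum_k \sin^2(ap_k/2)$ for Eq. (12.44) and $d_{kk} = a\,\sinh[\varepsilon(T - x_0)]\sinh(\varepsilon y_0)/[\sinh(\varepsilon a)\sinh(\varepsilon T)]$ for Eq. (12.41), and checks two limits: $\varepsilon \to |\vec p|$ for small lattice spacing, and $d_{kk} \to (T - x_0)\,y_0/T$ for vanishing momentum, Eq. (12.43). All numerical inputs and function names are hypothetical.

```python
import math

def energy(p, a):
    """Solve cosh(a*eps) = 1 + 2*sum_k sin^2(a*p_k/2) for eps >= 0."""
    return math.acosh(1.0 + 2.0 * sum(math.sin(0.5 * a * pk) ** 2 for pk in p)) / a

def d_spatial(x0, y0, p, a, T):
    """Diagonal spatial component d_kk(x0, y0; p), for x0 >= y0 and nonzero p."""
    eps = energy(p, a)
    return (a * math.sinh(eps * (T - x0)) * math.sinh(eps * y0)
            / (math.sinh(eps * a) * math.sinh(eps * T)))

# Limit 1: small lattice spacing, eps approaches |p| (here |p| = 0.5).
eps_fine = energy((0.3, 0.4, 0.0), a=1.0e-3)

# Limit 2: very small momentum, d_kk approaches (T - x0) * y0 / T.
T, x0, y0 = 8.0, 5.0, 3.0
d_small_p = d_spatial(x0, y0, p=(1.0e-4, 0.0, 0.0), a=1.0, T=T)
d_zero_mode = (T - x0) * y0 / T

print(eps_fine, d_small_p, d_zero_mode)
```

Only the spatial components are compared with their zero-momentum form here, since in the expressions above the $\vec p = 0$ value of $d_{00}$ is specified separately.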
The mixed components $d_{0k}$ and $d_{k0}$ vanish. The above expression is valid for $x_0 > y_0$. For $y_0 > x_0$ one must make use of the symmetry property

$$d_{\mu\nu}(y_0, x_0; \vec p) = d_{\nu\mu}(x_0, y_0; \vec p). \qquad (12.45)$$
For more technical details, and for the remaining parts of the perturbative setting like gauge fixing, see Lüscher et al. (1992) and Lüscher and Weisz (1996) and also Kurth (2002), where the renormalization of the quark mass has been computed at 1 loop. The Feynman rules for the gluon vertices are given in Palombi et al. (2002).

The Schrödinger functional formalism was crucial for the computation of many improvement coefficients. In particular, $c_{sw}$ was fixed by imposing the vanishing of the O($a$) corrections to the PCAC relation. Since a mass term is present in the PCAC relation the Schrödinger functional is particularly well suited here.

Calculations of moments of structure functions have also been carried out using the Schrödinger functional approach, coupled to the recursive finite-size scaling technique. These calculations are reported in Bucarelli et al. (1999), Guagnelli et al. (1999a–c, 2000), Jansen (2000) and Palombi et al. (2002). Perturbation theory is in all these cases really cumbersome. Moreover, the covariant derivatives make everything more complicated, because of the presence of phase factors and boundary fields.

We conclude by mentioning that a few 2-loop calculations have also been completed in this formalism (Wolff, 1995; Narayanan and Wolff, 1995; Bode et al., 1999, 2000a, b).

13. The hypercubic group

With this section we begin to explain in a more detailed way how lattice perturbative calculations are actually done. To start with, it is useful to discuss the symmetry group of the lattice and see what the consequences of the breaking of Lorentz invariance are.

On the lattice one inevitably ends up with a discrete group. The symmetry group of the discrete rotations of a four-dimensional hypercubic lattice onto itself is a crystallographic group, denoted by $W_4$ and called the hypercubic group. It consists of $\pi/2$ rotations on the six lattice planes and reflections (so that parity transformations are also included). It has 384 elements and 20 irreducible representations (Baake et al., 1982, 1983; Mandula et al., 1983a). $W_4$ is a subgroup of the orthogonal group O(4), which is the Lorentz group analytically continued to Euclidean space. A major difficulty in doing perturbative calculations on the lattice arises from the fact that the (Euclidean) Lorentz symmetry breaks down to the hypercubic $W_4$ symmetry. Since the lattice has a reduced symmetry with respect to the continuum, more operator mixings are allowed, as we will see in the next section.

Let us first consider the special hypercubic group, $SW_4$, consisting of proper rotations without reflections. It has 192 elements and 13 irreducible representations. Five of these representations (of dimensions 1, 1, 2, 3 and 3) are connected to the 4-element permutation group, $S_4$, because the latter is a subgroup of $SW_4$ and the five representations of $S_4$ can be taken as nonfaithful representations of $SW_4$. There are then four representations of $SW_4$ which can be identified by the fact that they correspond to representations of O(4) which remain irreducible under $SW_4$:

$$(1, 0), \quad (0, 1), \quad \left(\tfrac{1}{2}, \tfrac{1}{2}\right), \quad \left(\tfrac{3}{2}, \tfrac{1}{2}\right). \qquad (13.1)$$
The direct product of each of the first three representations in Eq. (13.1) with the completely antisymmetric representation of the permutation group $S_4$ generates three other irreducible representations of $SW_4$ (which maintain the same dimensionality), while $(\frac{3}{2}, \frac{1}{2})$ is invariant under this operation. We also note that the representation $(\frac{1}{2}, \frac{3}{2})$ turns out to give the same hypercubic representation as $(\frac{3}{2}, \frac{1}{2})$. So far we have then been able to identify 12 representations. There is yet another representation, which has dimension 6. The complete list of the representations of the special hypercubic group $SW_4$ is thus given by:

$$1_1,\ 1_2,\ 2,\ 3_1,\ 3_2,\ 3_3,\ 3_4,\ 3_5,\ 3_6,\ 4_1,\ 4_2,\ 6,\ 8, \qquad (13.2)$$

where the subscripts label different representations with the same dimensionality. This group is a subgroup of SO(4), the special orthogonal group.

We now discuss the irreducible representations of $W_4$. Including the reflections doubles the number of group elements, but not the number of representations. This happens because, contrary to the cubic group in three dimensions, the hypercubic group is not the direct product of the rotation group and the reflection group. The reason is that the reflection of all four axes is still a rotation, which is not true for the reflection of three axes in three dimensions. Therefore, going from $SW_4$ to $W_4$ the number of representations only increases from 13 to 20. What happens is that 9 of these 13 representations just double (generating the representations with opposite parity), while the remaining 4, all of dimension 3, merge into two six-dimensional representations which are reflection invariant. In particular, the $3_3$ and $3_4$ of $SW_4$ merge into the $6_1$ of $W_4$, and the $3_5$ and $3_6$ of $SW_4$ merge into the $6_2$ of $W_4$. We can then give the complete list of the representations of $W_4$: 53

$$1_1,\ 1_2,\ 1_3,\ 1_4,\ 2_1,\ 2_2,\ 3_1,\ 3_2,\ 3_3,\ 3_4,\ 4_1,\ 4_2,\ 4_3,\ 4_4,\ 6_1,\ 6_2,\ 6_3,\ 6_4,\ 8_1,\ 8_2. \qquad (13.3)$$
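The group-theoretical counting quoted above can be verified by brute force. The following Python sketch realizes $W_4$ as the set of $4 \times 4$ signed permutation matrices (a standard realization, assumed here), selects $SW_4$ as the subset with determinant $+1$, counts conjugacy classes (whose number equals the number of irreducible representations), and checks that the squared dimensions in Eqs. (13.2) and (13.3) add up to the group orders. All function names are ours.

```python
from itertools import permutations, product

def signed_permutation_matrices(n):
    """All n x n matrices with exactly one entry +1 or -1 in each row and column."""
    mats = []
    for perm in permutations(range(n)):
        for signs in product((1, -1), repeat=n):
            mats.append(tuple(
                tuple(signs[i] if j == perm[i] else 0 for j in range(n))
                for i in range(n)))
    return mats

def matmul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def transpose(A):
    return tuple(zip(*A))

def det(A):
    """Determinant by Laplace expansion along the first row (fine for n <= 4)."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, entry in enumerate(A[0]):
        if entry:
            minor = tuple(row[:j] + row[j + 1:] for row in A[1:])
            total += (-1) ** j * entry * det(minor)
    return total

def count_conjugacy_classes(group):
    """Number of classes under conjugation h g h^T (h orthogonal, so h^-1 = h^T)."""
    seen, classes = set(), 0
    for g in group:
        if g not in seen:
            classes += 1
            for h in group:
                seen.add(matmul(matmul(h, g), transpose(h)))
    return classes

W4 = signed_permutation_matrices(4)
SW4 = [g for g in W4 if det(g) == 1]

dims_SW4 = [1, 1, 2, 3, 3, 3, 3, 3, 3, 4, 4, 6, 8]               # Eq. (13.2)
dims_W4 = [1] * 4 + [2] * 2 + [3] * 4 + [4] * 4 + [6] * 4 + [8] * 2  # Eq. (13.3)

# Reflecting all axes: a rotation in four dimensions, not in three.
minus_identity_4 = tuple(tuple(-1 if i == j else 0 for j in range(4)) for i in range(4))
minus_identity_3 = tuple(tuple(-1 if i == j else 0 for j in range(3)) for i in range(3))

print(len(W4), len(SW4), count_conjugacy_classes(W4), count_conjugacy_classes(SW4))
```

The determinant of minus the identity is $+1$ in four dimensions and $-1$ in three, which is the reason given above why $W_4$ is not a direct product of $SW_4$ with the reflection group.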
The representation $4_1$ is the canonical one, corresponding to an object with a Lorentz index, like the $(\frac{1}{2}, \frac{1}{2})$ is in the continuum. When interested in the behavior of lattice operators which have more than one Lorentz index, we must identify the representations of the hypercubic group contained in the tensor products of the $4_1$ with itself, and compare the result with what happens in the continuum, where one has to consider the tensor products of the $(\frac{1}{2}, \frac{1}{2})$ with itself. The relation between these two expansions determines what kind of mixings arise when one computes radiative corrections of lattice matrix elements (apart from additional mixings due to the breaking of chiral symmetry or of other symmetries).

53 It could be useful to note that the notation $N_m$ of Eqs. (13.2) and (13.3), used in Mandula et al. (1983a), corresponds to the representation $\tau_m^{(N)}$ (or $s\tau_m^{(N)}$ for $SW_4$) in Baake et al. (1982).

14. Operator mixing on the lattice

Since $W_4$ is a subgroup of O(4), a continuum operator belonging to a given irreducible representation of the (Euclidean) Lorentz group becomes in general a sum of irreducible representations of the hypercubic group. The continuum operator can then belong to various distinct lattice representations, according to the way in which its indices are chosen. This implies that on the lattice the possibilities for mixing under renormalization are larger than in the continuum. Mixings can arise which are pure lattice artifacts and which have to be carefully treated. The number of independent renormalization factors in a lattice calculation is then in general larger than in the continuum. In particular, operators which are multiplicatively renormalizable in the continuum may lose this property on the lattice. 54

For Wilson fermions additional mixings (besides those due to the breaking of Lorentz invariance) can arise because of the breaking of chiral symmetry. For staggered fermions, the loss of flavor invariance also opens the door for more mixings, although of a different kind. All these additional mixings are unphysical, being just lattice artifacts which have to be subtracted in order to get physical results from the lattice. In practical terms the worst situation occurs in the case of mixings with operators of lower dimensions, with lattice renormalization factors containing a power divergent coefficient, proportional to $1/a^n$. These lattice artifacts ought to be subtracted nonperturbatively.

In short, Lorentz breaking, as well as the breaking of chiral, flavor or other symmetries that occur in specific lattice actions, in general may spoil the multiplicative renormalizability of continuum operators, in some cases even with power divergences. The necessary condition for not having any mixing at all is that the operator belongs to an irreducible representation of $W_4$, but this is sometimes not sufficient, as we will see shortly.

14.1. Unpolarized structure functions

Let us present some examples involving operators which measure moments of unpolarized structure functions. 55 These operators appear in the operator product expansion of two electromagnetic or weak hadronic currents, and have the form

$$O_{\{\mu\mu_1\cdots\mu_n\}}(x) = \bar\psi(x)\,\gamma_{\{\mu} D_{\mu_1}\cdots D_{\mu_n\}}\,\psi(x). \qquad (14.1)$$

They are symmetric in all their indices and traceless. The operator $O_{\{\mu\mu_1\cdots\mu_n\}}$ measures the $n$-th moment, $\langle x^n\rangle$, of the unpolarized structure functions (i.e., the distribution of the momentum of the quarks inside the hadrons). The renormalization of these operators, which presents particular computational difficulties due to the presence of the covariant derivatives, was first studied with Wilson fermions in Kronfeld and Photiadis (1985), Corbò et al. (1989, 1990), Caracciolo et al. (1989, 1990) and then in Capitani and Rossi (1995a), Beccarini et al. (1995), Göckeler et al. (1996b), Brower et al. (1997) and Capitani (2001a, b). Recent numerical works in full QCD which have made use of these perturbative renormalization factors are Dolgov et al. (2001, 2002), Dreher et al. (2002) and Göckeler et al. (2002), and recent short reviews of perturbative and nonperturbative methods and results can be found in Capitani (2002b, c) and Negele (2002). Perturbative renormalization factors for all these operators have also been computed using overlap fermions, and are reported in Capitani (2001a, b, 2002a).

We point out that all mixings discussed below, which are artifacts of the lattice, are only due to the breaking of Lorentz invariance. They have nothing to do with the breaking of chiral symmetry for Wilson fermions, and therefore they are still present, in exactly the same form, even when one uses Ginsparg–Wilson fermions.

In the continuum each of these structure function operators belongs to an irreducible representation of the Lorentz group. On the lattice they are instead in general reducible: they become linear combinations of irreducible representations of the hypercubic group, and this is the reason for the mixings which appear when radiative corrections are computed. More detailed analyses of these mixings can be found in Beccarini et al. (1995) and Göckeler et al. (1996a).

54 This feature also occurs in other regularizations. For example, in continuum calculations using dimensional regularization in the version known as DRED, “evanescent” operators, coming from the additional $-2\epsilon$ dimensions, are generated in the intermediate stages of the calculations.
55 It is not possible to compute a complete structure function directly on the lattice. The reason is that the structure functions describe the physics close to the light cone, and this region of Minkowski space shrinks to a point when one goes to Euclidean space, where Monte Carlo simulations are performed. However, on a Euclidean lattice it is possible to compute the moments of the structure functions, using an operator product expansion:

$$\int_0^1 \mathrm{d}x\; x^n\, F^{(i)}(x, Q^2) \sim C^{(n,i)}\!\left(\frac{Q^2}{\mu^2}\right)\cdot\langle h|O^{(n,i)}(\mu)|h\rangle.$$

The Wilson coefficients contain the short-distance physics, and can be perturbatively computed in the continuum. The matrix elements contain the long-distance physics, and can be computed using numerical simulations, supplemented by a lattice renormalization of the relevant operators.

14.1.1. First moment

The operator is $O_{\{\mu\nu\}} = \bar\psi\,\gamma_{\{\mu} D_{\nu\}}\,\psi$, symmetric and traceless. An object with a single Lorentz index belongs in the continuum to the $(\frac{1}{2}, \frac{1}{2})$ representation of the Euclidean Lorentz group O(4), while on the lattice it belongs to the $4_1$ representation of the hypercubic group $W_4$. The general decomposition of the 16 (nonsymmetrized) tensor components is in the continuum

$$\left(\tfrac{1}{2}, \tfrac{1}{2}\right)\otimes\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = (0, 0)\oplus(1, 0)\oplus(0, 1)\oplus(1, 1), \qquad (14.2)$$

while on the lattice it is

$$4_1\otimes 4_1 = 1_1\oplus 3_1\oplus 6_1\oplus 6_3. \qquad (14.3)$$
We have essentially two choices here for the symmetrized operators, that is the two indices can be different or can be equal. In the latter case, one has also to subtract the trace component. The first case can be exemplified by considering the operator $O_{\{01\}}$, which belongs to the $6_1$ and is multiplicatively renormalizable. We will compute its renormalization constant in detail in Section 15.4. A representative of the second case is $O_{\{00\}} - \frac{1}{3}(O_{\{11\}} + O_{\{22\}} + O_{\{33\}})$, which belongs to the $3_1$ and is also multiplicatively renormalizable. The subtracted trace part belongs to the $1_1$. Finally, the antisymmetric components (which do not enter in the operator product expansion for the moments), for example the operator $O_{[01]}$, belong to the remaining representation in the expansion, the $6_3$.

Since they belong to different representations of $W_4$, the lattice renormalization factors of the operators $O_{\{01\}}$ and $O_{\{00\}} - \frac{1}{3}(O_{\{11\}} + O_{\{22\}} + O_{\{33\}})$ are different, as has been verified by explicit calculations; in the continuum however they are the same, as both operators belong to the $(1, 1)$.

We mention here that from the point of view of Monte Carlo simulations the choice of two different indices is worse, because in this case one has to choose one component of the hadron momentum to be different from zero, and this leads to larger systematic effects due to the granularity of the lattice.
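14.1.2. Second moment

The operator is $O_{\{\mu\nu\sigma\}} = \bar\psi\,\gamma_{\{\mu} D_\nu D_{\sigma\}}\,\psi$, symmetric and traceless. The general decomposition of the 64 (nonsymmetrized) tensor components of this rank-three operator is in the continuum:

$$\left(\tfrac{1}{2}, \tfrac{1}{2}\right)\otimes\left(\tfrac{1}{2}, \tfrac{1}{2}\right)\otimes\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = 4\cdot\left(\tfrac{1}{2}, \tfrac{1}{2}\right)\oplus 2\cdot\left(\tfrac{3}{2}, \tfrac{1}{2}\right)\oplus 2\cdot\left(\tfrac{1}{2}, \tfrac{3}{2}\right)\oplus\left(\tfrac{3}{2}, \tfrac{3}{2}\right), \qquad (14.4)$$

while on the lattice it is

$$4_1\otimes 4_1\otimes 4_1 = 4\cdot 4_1\oplus 4_2\oplus 4_4\oplus 3\cdot 8_1\oplus 2\cdot 8_2. \qquad (14.5)$$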
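The continuum decompositions of Eqs. (14.2) and (14.4) follow from the Clebsch–Gordan series applied separately to the two spins labelling each representation, and a dimension count provides a quick consistency check against the lattice decompositions. The Python sketch below does this for rank two and rank three. The function names are ours, and the lattice dimension lists are simply read off from Eqs. (14.3) and (14.5).

```python
from collections import Counter
from fractions import Fraction

def su2_product(j1, j2):
    """Clebsch-Gordan series: list of spins contained in j1 (x) j2."""
    spins, j = [], abs(j1 - j2)
    while j <= j1 + j2:
        spins.append(j)
        j += 1
    return spins

def tensor_power(n):
    """Decompose the n-fold product of (1/2,1/2) into (j_L, j_R) with multiplicities."""
    half = Fraction(1, 2)
    current = Counter({(half, half): 1})
    for _ in range(n - 1):
        nxt = Counter()
        for (jl, jr), mult in current.items():
            for new_l in su2_product(jl, half):
                for new_r in su2_product(jr, half):
                    nxt[(new_l, new_r)] += mult
        current = nxt
    return current

def dimension(decomp):
    """Total dimension: sum of multiplicity * (2 j_L + 1) * (2 j_R + 1)."""
    return sum(mult * (2 * jl + 1) * (2 * jr + 1) for (jl, jr), mult in decomp.items())

rank2 = tensor_power(2)
rank3 = tensor_power(3)

# Dimensions of the lattice representations in Eqs. (14.3) and (14.5).
lattice_rank2 = [1, 3, 6, 6]
lattice_rank3 = 4 * [4] + [4] + [4] + 3 * [8] + 2 * [8]

print(dict(rank2))
print(dict(rank3))
print(dimension(rank2), sum(lattice_rank2), dimension(rank3), sum(lattice_rank3))
```

Both sides account for all 16 and 64 tensor components respectively. The sketch only compares dimensions; it does not determine which hypercubic representations appear.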
We have essentially three choices here for the symmetrized components. One is represented by the operator $O_{\{123\}}$, which belongs to the $4_2$ and is multiplicatively renormalizable. This choice however is quite unsatisfactory from the point of view of simulations, because two components of the hadron momentum have to be different from zero and from each other, leading to rather large systematic errors.

One should minimize these systematic errors by including as few nonzero components of the hadron momentum as possible. From this point of view, the optimal choice is the operator $O_{\{111\}}$, which belongs to the $4_1$. Unfortunately this operator mixes with $\bar\psi\gamma_1\psi$, which is a $4_1$ as well. Moreover, the coefficient of this mixing can be seen from dimensional arguments to be power divergent, $1/a^2$, and thus this mixing cannot be studied in perturbation theory.

There is an intermediate choice between having the indices all different or all equal, and it is given by the operator

$$O_S = O_{\{011\}} - \tfrac{1}{2}\left(O_{\{022\}} + O_{\{033\}}\right), \qquad (14.6)$$
which does not have any power divergences due to the particular combination chosen. This operator belongs to an irreducible representation of W4 , but nonetheless is not multiplicatively renormalizable and undergoes a mixing with another operator. The way in which this happens is not trivial, and was *rst understood in Beccarini et al. (1995). The point is that the operator OS belongs to the 81 , but this representation is present three times in the lattice decomposition of O )? , Eq. (14.5). It turns out that two of these 81 representations mix with each other, at least at the 1-loop level. This mixing can be best seen in the following way: the nonsymmetrized operators OA = O011 − 12 (O022 + O033 ); OB = O101 + O110 − 12 (O202 + O220 + O303 + O330 ) ;
(14.7)
turn out to have diJerent 1-loop corrections on the lattice, and they renormalize with diJerent numerical factors which form a nontrivial mixing matrix: Oˆ A = ZAA OA + ZAB OB ; Oˆ B = ZBA OA + ZBB OB :
(14.8)
Notice that the two covariant derivatives have the same index in OA but two diJerent indices in OB , and the two operators have diJerent tree levels (0 p12 − 12 (0 p22 + 0 p32 ) and 21 p0 p1 − (2 p0 p2 + 3 p0 p3 ), respectively). The operator that we want to measure, OS = O{011} − 12 (O{022} + O{033} ) = 13 (OA + OB ) ;
(14.9)
does not go into itself under 1-loop renormalization,

$$\hat O_S = \tfrac13 \left(Z_{AA} + Z_{BA}\right) O_A + \tfrac13 \left(Z_{AB} + Z_{BB}\right) O_B, \tag{14.10}$$
because on the lattice $Z_{AA} + Z_{BA}$ is not equal to $Z_{AB} + Z_{BB}$, as explicit calculations have shown (Beccarini et al., 1995). In other words, the symmetric combination is lost and $O_S$ mixes with an operator of mixed symmetry (nonsymmetrized). We have thus seen that the choice of indices for this rank-three operator is very important, and has practical consequences for the Monte Carlo simulations as well as for the calculation of renormalization factors. In the continuum all $O_{\{\mu\nu\sigma\}}$ cases discussed above, including $O_{\{111\}}$, belong to the $(\frac32,\frac32)$. Thus, they have the same renormalization constant, and no mixing problem.

14.1.3. Third moment

The operator is $O_{\{\mu\nu\sigma\rho\}} = \bar\psi \gamma_{\{\mu} D_\nu D_\sigma D_{\rho\}} \psi$, symmetric and traceless. The general decomposition of the 256 (nonsymmetrized) tensor components in the continuum is

$$\left(\tfrac12,\tfrac12\right) \otimes \left(\tfrac12,\tfrac12\right) \otimes \left(\tfrac12,\tfrac12\right) \otimes \left(\tfrac12,\tfrac12\right) = 4\cdot(0,0) \oplus 6\cdot(1,0) \oplus 6\cdot(0,1) \oplus 2\cdot(2,0) \oplus 2\cdot(0,2) \oplus 9\cdot(1,1) \oplus 3\cdot(2,1) \oplus 3\cdot(1,2) \oplus (2,2), \tag{14.11}$$

while on the lattice it is

$$4_1 \otimes 4_1 \otimes 4_1 \otimes 4_1 = 4\cdot 1_1 \oplus 1_2 \oplus 1_4 \oplus 3\cdot 2_1 \oplus 2\cdot 2_2 \oplus 7\cdot 3_1 \oplus 3\cdot 3_2 \oplus 3\cdot 3_3 \oplus 3\cdot 3_4 \oplus 10\cdot 6_1 \oplus 6\cdot 6_2 \oplus 10\cdot 6_3 \oplus 6\cdot 6_4 . \tag{14.12}$$
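The continuum decompositions of Eqs. (14.4) and (14.11) can be cross-checked by applying the SU(2) rule $j \otimes \frac12 = (j-\frac12) \oplus (j+\frac12)$ separately to the two labels of each representation. The following Python lines are only a sketch of such a check; the function names (`su2_product`, `tensor_power`, `dimension`) are ours and are not taken from any of the cited works.

```python
from collections import Counter
from fractions import Fraction


def su2_product(j1, j2):
    """Spins contained in j1 x j2: |j1 - j2|, ..., j1 + j2 in unit steps."""
    spins = []
    j = abs(j1 - j2)
    while j <= j1 + j2:
        spins.append(j)
        j += 1
    return spins


def tensor_power(n):
    """Multiplicities of the irreps (jL, jR) contained in (1/2,1/2)^(x n)."""
    half = Fraction(1, 2)
    current = Counter({(half, half): 1})
    for _ in range(n - 1):
        nxt = Counter()
        for (jl, jr), mult in current.items():
            for a in su2_product(jl, half):
                for b in su2_product(jr, half):
                    nxt[(a, b)] += mult
        current = nxt
    return current


def dimension(decomposition):
    """Total number of components, sum of mult * (2 jL + 1) * (2 jR + 1)."""
    return sum(m * (2 * jl + 1) * (2 * jr + 1)
               for (jl, jr), m in decomposition.items())


rank3 = tensor_power(3)   # second moment, 64 components
rank4 = tensor_power(4)   # third moment, 256 components
for name, dec in (("rank 3", rank3), ("rank 4", rank4)):
    print(name, "total dimension:", dimension(dec))
    for (jl, jr), mult in sorted(dec.items()):
        print(f"  {mult} x ({jl}, {jr})")
```

The total dimension is a useful sanity check on any such decomposition, since it has to reproduce $4^n$ for a rank-$n$ tensor.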
Without entering a detailed discussion, we only state that, among all symmetrized operators, only $O_{\{0123\}}$, which belongs to the $1_2$ representation, is multiplicatively renormalizable. Any other choice leads to mixings, in some cases with power divergent coefficients. A special case is given by the operator $O_{\{0011\}} + O_{\{3322\}} - O_{\{0022\}} - O_{\{3311\}}$, which belongs to the $2_1$ and in principle mixes with two other operators of mixed symmetry which have very complicated expressions, as shown in Göckeler et al. (1996a). However, this mixing occurs beyond 1 loop, and we can consider this operator to be multiplicatively renormalizable. The tadpole coming from this operator (as well as the one corresponding to $O_{\{0123\}}$) will be computed in detail in Section 18.3.

14.1.4. Higher moments

The operator for the fourth moment is $O_{\{\mu\nu\sigma\rho\tau\}} = \bar\psi \gamma_{\{\mu} D_\nu D_\sigma D_\rho D_{\tau\}} \psi$, symmetric and traceless. We have seen that going from the first to the second and then to the third moment the mixing structure becomes more and more complicated. For the fourth moment and higher, i.e., for operators of at least rank five, a new feature occurs: mixings which imply power divergent coefficients become unavoidable, because at least two of the indices are bound to be equal. One can always find lower-dimensional operators with the same transformation properties, and it is not possible to avoid a power divergent renormalization factor.
Furthermore, even the finite mixings become much more complicated and quite entangled, since there happen to be a lot of “copies” of the same representations around. For example, the rank-five operator has the decomposition

$$4_1 \otimes 4_1 \otimes 4_1 \otimes 4_1 \otimes 4_1 = 31\cdot 4_1 \oplus 20\cdot 4_2 \oplus 15\cdot 4_3 \oplus 20\cdot 4_4 \oplus 45\cdot 8_1 \oplus 40\cdot 8_2 . \tag{14.13}$$
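A quick consistency check on the lattice decompositions (14.5), (14.12) and (14.13) is again the dimension count: the multiplicities times the dimensions of the $W_4$ representations must add up to $4^n$. The sketch below does only this bookkeeping; the dictionary layout and the name `total_dimension` are our own choices for illustration.

```python
# Each entry: (multiplicity, dimension of the irrep, label as used in the text).
lattice_decompositions = {
    3: [(4, 4, "4_1"), (1, 4, "4_2"), (1, 4, "4_4"),
        (3, 8, "8_1"), (2, 8, "8_2")],                      # Eq. (14.5)
    4: [(4, 1, "1_1"), (1, 1, "1_2"), (1, 1, "1_4"),
        (3, 2, "2_1"), (2, 2, "2_2"),
        (7, 3, "3_1"), (3, 3, "3_2"), (3, 3, "3_3"), (3, 3, "3_4"),
        (10, 6, "6_1"), (6, 6, "6_2"), (10, 6, "6_3"), (6, 6, "6_4")],  # Eq. (14.12)
    5: [(31, 4, "4_1"), (20, 4, "4_2"), (15, 4, "4_3"), (20, 4, "4_4"),
        (45, 8, "8_1"), (40, 8, "8_2")],                    # Eq. (14.13)
}


def total_dimension(terms):
    """Sum of multiplicity * dimension over all irreps in a decomposition."""
    return sum(mult * dim for mult, dim, _label in terms)


for rank, terms in lattice_decompositions.items():
    print(f"rank {rank}: {total_dimension(terms)} components, expected {4 ** rank}")
```

The count also makes the growth in the number of “copies” explicit: the $8_1$ alone appears 45 times at rank five.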
For the higher moments it seems then rather unlikely to find an operator which does not mix and does not have power divergences.

14.2. A mixing due to breaking of chiral symmetry: $\Delta I = 1/2$ operators

We now discuss a case in which some of the operator mixings that take place are entirely due to the breaking of chiral symmetry by Wilson fermions. We then show that the calculations on the lattice become much simpler and more manageable when overlap fermions are used instead. The physics is that of strangeness-changing weak decays, and the operators that we consider appear in the part of the $\Delta S = 1$ effective weak nonleptonic Hamiltonian which is relevant for $\Delta I = 1/2$ transitions (Gaillard and Lee, 1974; Altarelli and Maiani, 1974; Altarelli et al., 1981; Buras and Weisz, 1990; Buras et al., 1992). The $\Delta I = 1/2$ amplitudes, as is well known, are experimentally much greater than the $\Delta I = 3/2$ amplitudes. This phenomenon is called “octet enhancement” or “$\Delta I = 1/2$ rule”, and to this day has not been theoretically understood. The $\Delta I = 1/2$ bare operators that we consider are, for scales between the charm mass and the bottom mass,

$$O_\pm = (O_1 - O_1^c) \pm (O_2 - O_2^c), \tag{14.14}$$
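with$^{56}$

$$O_1 = (\bar s^a \gamma^L_\mu u^b)(\bar u^b \gamma^L_\mu d^a), \tag{14.15}$$

$$O_2 = (\bar s \gamma^L_\mu u)(\bar u \gamma^L_\mu d), \tag{14.16}$$

$$O_1^c = (\bar s^a \gamma^L_\mu c^b)(\bar c^b \gamma^L_\mu d^a), \tag{14.17}$$

$$O_2^c = (\bar s \gamma^L_\mu c)(\bar c \gamma^L_\mu d). \tag{14.18}$$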
Without entering into many details, we sketch the structure of their mixing on the lattice. With Wilson fermions, this operator renormalization requires the subtraction of several other operators of the same and of lower dimensionality. The pattern of mixing is as follows:
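$$\begin{aligned}
\hat O_\pm = {} & Z^W_\pm\, O_\pm + \sum_i C^{6,i}_\pm \cdot O^{6,i}_\pm + (m_c - m_u)\, C^5_\pm \cdot i\,\bar s \sigma_{\mu\nu} F_{\mu\nu} d \\
& + a\,(m_c - m_u)(m_d - m_s)\, \tilde C^5_\pm \cdot i\,\bar s \sigma_{\mu\nu} \tilde F_{\mu\nu} d + \frac{1}{a^2}(m_c - m_u)\, C^3_\pm \cdot \bar s d \\
& + \frac{1}{a}(m_c - m_u)(m_d - m_s)\, \tilde C^3_\pm \cdot \bar s \gamma_5 d + O(a^2) .
\end{aligned} \tag{14.19}$$

56 When color indices are not shown they are trivially contracted, i.e., the operators are color singlets.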
All dimension-6 operators $O^{6,i}_\pm$ have opposite chirality with respect to $O_\pm$. Their precise form, which is given in Martinelli (1984), will not interest us here.$^{57}$ A remarkable thing is that the coefficients of the mixings with the dimension-5 operators, which could in principle diverge like $1/a$, are instead finite, thanks to the Glashow–Iliopoulos–Maiani (GIM) mechanism. This happens because the GIM mechanism states that this mixing should be zero for $m_c = m_u$; consequently, a mass factor $m_c - m_u$ is required. The coefficients of the mixings with the dimension-3 operators could in principle diverge like $1/a^3$. The GIM mechanism and, for the parity-violating operator, also a factor $m_d - m_s$ due to the CPS symmetry (which combines C, P and the exchange of the s and d quarks (Bernard et al., 1985)), render these power divergences less severe. Still, the fact that these mixings remain power divergent impedes the calculation of the full renormalization of the $\Delta I = 1/2$ operators using perturbation theory. What can be computed in perturbation theory is only the overall renormalization $Z^W$, which is logarithmically divergent, the coefficients $C^{6,i}_\pm$, and the coefficients $C^5_\pm$ and $\tilde C^5_\pm$. The last two coefficients however require a 2-loop calculation, which has been done in Curci et al. (1998) for $C^5_\pm$. Let us now see what happens when fermions which respect chiral symmetry are used. The renormalization is in this case given by$^{58}$
$$\hat O_\pm = Z^{ov}_\pm \left[ O_\pm + (m_c^2 - m_u^2)\, C^m_\pm \cdot \left( (m_d + m_s)\, \bar s d + (m_d - m_s)\, \bar s \gamma_5 d \right) \right] + O(a^2) . \tag{14.20}$$
We can see that chiral symmetry has brought a big change in the pattern of subtractions. First of all, chiral symmetry directly forbids any mixings with the other dimension-6 operators, $O^{6,i}_\pm$, which are of opposite chirality. Furthermore, the GIM mechanism when combined with chiral symmetry is now quadratic, and thus gives coefficients proportional to $m_c^2 - m_u^2$, like in the continuum. This mass factor accounts for two powers of $1/a$. Finally, the mixing coefficients with parity-conserving and parity-violating operators are now the same. The parity-conserving operators then acquire an additional factor $(m_d + m_s)$ which mirrors the factor $(m_d - m_s)$ coming from the CPS symmetry for the parity-violating operators. This does away with another factor of $1/a$. As a result, the mixings with the dimension-5 magnetic operators, which were finite in the Wilson case, now become of order $a^2$, and hence one does not have to take them into account, even in the improved theory. Even more remarkably, the mixings with the dimension-3 operators $\bar s d$ and $\bar s \gamma_5 d$ (which were power divergent in the Wilson case) become finite. Thus, the renormalization of the $\Delta I = 1/2$ matrix elements can now be carried out entirely with perturbative methods, because there are no power divergent coefficients when one uses overlap fermions (Capitani and Giusti, 2000).
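The counting behind these statements is purely dimensional: a mixing with an operator of dimension $d$ naively carries a factor $a^{d-6}$, and every mass factor required by a symmetry removes one power of $1/a$. The following Python sketch tabulates this rule for the terms of Eqs. (14.19) and (14.20). The helper names (`power_of_a`, `describe`) and the case labels are ours, introduced only for illustration, and logarithms are ignored.

```python
TARGET_DIM = 6  # dimension of the four-fermion operators O_pm


def power_of_a(op_dim, n_mass_factors):
    """Exponent n such that the mixing term scales like a**n (logs ignored)."""
    return (op_dim - TARGET_DIM) + n_mass_factors


def describe(n):
    if n < 0:
        return f"power divergent, 1/a^{-n}"
    if n == 0:
        return "finite"
    return f"vanishes like a^{n}"


# (operator dimension, number of mass factors); m_c^2 - m_u^2 counts twice.
cases = {
    "Wilson, dim-5, F term, (mc-mu)": (5, 1),
    "Wilson, dim-5, F-tilde term, (mc-mu)(md-ms)": (5, 2),
    "Wilson, dim-3, scalar, (mc-mu)": (3, 1),
    "Wilson, dim-3, pseudoscalar, (mc-mu)(md-ms)": (3, 2),
    "overlap, dim-3, (mc^2-mu^2)(md +/- ms)": (3, 3),
    "overlap, dim-5, (mc^2-mu^2)(md +/- ms)": (5, 3),
}

for label, (dim, n_mass) in cases.items():
    print(f"{label}: {describe(power_of_a(dim, n_mass))}")
```

With three mass factors available in the chirally symmetric case, the dimension-3 mixings become finite and the dimension-5 ones are pushed to order $a^2$, in agreement with the discussion above.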
57 The study of weak operators on the lattice has a long history, and several perturbative calculations have been done using Wilson fermions (Cabibbo et al., 1984; Martinelli, 1984; Maiani et al., 1987; Bernard et al., 1987). Recent results for four-fermion operators can be found in Gupta et al. (1997). Four-fermion operators are also useful for other problems, like the renormalization of higher-twist operators in deep inelastic scattering, and in this case they have a different color, spin and flavor structure (Capitani et al., 1999a, 2000a, b, 2001a).
58 We should mention that to construct overlap operators which have the right chiral properties $\gamma^L_\mu$ has to be replaced by $\gamma^L_\mu (1 - a/2 \ldots)$
15. Analytic computations

Analytic computations of Feynman diagrams in lattice QCD present quite a few new and interesting features with respect to the continuum. Of course standard rules like a minus sign for each fermionic loop, which derive from the general properties of the path integral and the Wick theorem, continue to be valid on the lattice. The combinatorial rules are also similar to the continuum. But there are a few technicalities, many of them connected to the breaking of Lorentz invariance, which the reader should be aware of. We will discuss many of them in this Section. We will first introduce the power counting theorem on the lattice and see how divergent integrals can be treated. We will then show in detail the calculation of a matrix element at 1 loop, using Wilson fermions. A few comments about calculations with overlap fermions and with fat links will also be made.

15.1. The power counting theorem of Reisz

On the lattice the functions to be integrated are periodic (with period $2\pi/a$), and a power counting theorem which is appropriate for this kind of integral, and which accounts for their properties in the continuum limit, has been established by Reisz (1988a–d). This power counting theorem, like the one in the continuum (Hahn and Zimmermann, 1968), is very useful for the treatment of divergent integrals, and is fundamental for proving the renormalizability of lattice gauge theories. We will follow the presentation of Lüscher (1990), where somewhat milder conditions are required than in the original papers. Let us then consider a generic lattice integral at $L$ loops, which will have the general form

$$I = \int_{-\pi/a}^{\pi/a} \frac{d^4k_1}{(2\pi)^4} \cdots \int_{-\pi/a}^{\pi/a} \frac{d^4k_L}{(2\pi)^4}\, \frac{V(k,q,m,a)}{C(k,q,m,a)}, \tag{15.1}$$

where $q_i$ ($i = 1,\ldots,E$) are the external momenta and $m$ stands for the masses of the theory. The numerator $V$ contains all vertices and the numerators of the various propagators, while the denominator $C$ is the product of the denominators of these propagators.
This overall denominator is assumed to have the structure

$$C(k,q,m,a) = \prod_{i=1}^{I} C_i(l_i,m,a), \tag{15.2}$$

where $I$ is the number of internal lines of the diagram, and the line momenta $l_i(k,q)$ carried by them are linear combinations of the integration variables $k_j$ and the external momenta $q_j$. For the power counting theorem to be valid, a few conditions have to be satisfied by the numerator $V$, the denominators $C_i$ and the line momenta $l_i$. These conditions can be stated as follows.

(V1) There exists an integer $\omega$ and a smooth function $F$ such that

$$V(k,q,m,a) = a^{-\omega} F(ak,aq,am), \tag{15.3}$$

and $F$ is periodic in $ak_i$ and a polynomial in $am$.

(V2) The continuum limit of the numerator,

$$P(k,q,m) = \lim_{a\to 0} V(k,q,m,a), \tag{15.4}$$

exists.
(C1) There exist smooth functions $G_i$ satisfying

$$C_i(l_i,m,a) = a^{-2} G_i(al_i,am), \tag{15.5}$$

and the $G_i$'s are periodic in $al_i$ and polynomials in $am$.

(C2) The continuum limit of all $C_i$'s exists, and is given by

$$\lim_{a\to 0} C_i(l_i,m,a) = l_i^2 + m_i^2, \tag{15.6}$$

where the positive masses $m_i$ are combinations of the original masses $m$.

(C3) There exist positive constants $a_0$ and $A$ such that

$$|C_i(l_i,m,a)| \ge A\,(\hat l_i^2 + m_i^2) \tag{15.7}$$

for all $a \le a_0$ and all $l_i$'s.

(L1) All line momenta satisfy

$$l_i(k,q) = \sum_{j=1}^{L} a_{ij} k_j + \sum_{l=1}^{E} b_{il} q_l, \tag{15.8}$$

for $a_{ij}$ integer and $b_{il}$ real.

(L2) Given the linear combinations

$$p_i(k) = \sum_{j=1}^{L} a_{ij} k_j \tag{15.9}$$

and the associated set

$$\mathcal{L} = \{k_1,\ldots,k_L, p_1,\ldots,p_I\}, \tag{15.10}$$

and considering $u_1,\ldots,u_L$ linearly independent elements of $\mathcal{L}$, then

$$k_i = \sum_{j=1}^{L} c_{ij} u_j \tag{15.11}$$

holds, with $c_{ij}$ integer.

The conditions L1 and L2 define a “natural” choice of line momenta. It is important that the coefficients $a_{ij}$ in L1 and $c_{ij}$ in L2 are integers. These conditions guarantee that shifting integration variables by $2\pi/a$ and choosing some of the line momenta as new integration variables still gives a periodic integrand and does not change the domain of integration. The condition C3 is one of the most significant. While the other conditions are rather weak, and are fulfilled by any reasonable theory, this one strongly discriminates against certain types of integrands. Condition C3 is in fact satisfied by scalars as well as by Wilson and overlap fermions, but not by naive fermions and staggered fermions. Essentially this condition asks that the denominators $C_i$ diverge like $1/a^2$ when the momenta $l_i$ are at the edges of the Brillouin zone, which is sufficient to forbid any doublers in that region.

We now need a definition of the degree of divergence of an integrand. The degree of divergence of the numerator is defined from its asymptotic behavior,

$$V(\lambda k, q, m, a/\lambda) \underset{\lambda\to\infty}{=} K \lambda^{\deg V} + O(\lambda^{\deg V - 1}), \tag{15.12}$$
where $K \neq 0$, and similarly for the degree of divergence of the denominator, $\deg C$. The lattice degree of divergence takes into account the behavior of the integrand functions not only for small lattice spacing, but also for large loop momenta $k \sim 1/a$. The degree of divergence of the integral $I$ is then given (in four dimensions) by

$$\deg I = 4 + \deg V - \deg C . \tag{15.13}$$
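To get a feeling for this definition, one can estimate the degree of divergence numerically by rescaling $k \to \lambda k$ and $a \to a/\lambda$ and reading off the exponent of $\lambda$. The Python sketch below does this for a numerator of the form $1 - \cos ak$ and a scalar-type denominator; only one momentum component is kept for brevity, which does not affect the exponent. The function `estimate_degree` and the chosen sample values of $k$, $a$, $m$ and $\lambda$ are assumptions made for this illustration.

```python
import math

M = 1.0  # sample mass, in arbitrary units


def numerator(k, a):
    return 1.0 - math.cos(a * k)


def denominator(k, a):
    return (4.0 / a**2) * math.sin(a * k / 2.0) ** 2 + M**2


def estimate_degree(f, k=0.7, a=0.5, lam1=1.0e3, lam2=1.0e4):
    """Slope of log|f(lam*k, a/lam)| versus log(lam) between lam1 and lam2."""
    ratio = abs(f(lam2 * k, a / lam2)) / abs(f(lam1 * k, a / lam1))
    return math.log(ratio) / math.log(lam2 / lam1)


deg_V = round(estimate_degree(numerator))
deg_C = round(estimate_degree(denominator))
deg_I = 4 + deg_V - deg_C  # four-dimensional 1-loop integral

print("deg V =", deg_V, " deg C =", deg_C, " deg I =", deg_I)
```

Note that the numerator has degree zero although it vanishes in the naive continuum limit: the rescaling keeps the product $ak$ fixed, so the region $k \sim 1/a$ is not suppressed.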
Finally, for integrals beyond 1 loop we need to introduce the notion of Zimmermann subspaces, which are linear subspaces of the momenta. Let us consider $L$ linearly independent elements of the set of momenta $\mathcal{L}$ defined in condition L2,

$$u_1,\ldots,u_d,\ v_1,\ldots,v_{L-d} \qquad (d \ge 1), \tag{15.14}$$

and take them as new integration variables. If we now fix $v_1,\ldots,v_{L-d}$ to some value, we obtain a $4d$-dimensional Zimmermann subspace, spanned by $u_1,\ldots,u_d$.$^{59}$ The degree of divergence for $V$ in this Zimmermann subspace is then defined as

$$V(k(\lambda u, v), q, m, a/\lambda) \underset{\lambda\to\infty}{=} K \lambda^{\deg_Z V} + O(\lambda^{\deg_Z V - 1}), \tag{15.15}$$

where $K \neq 0$, and similarly for $C_i$. This allows one to study the behavior of the integrand when only some of the momenta are large.

59 One does not distinguish between subspaces corresponding to different values of the fixed momenta $v_1,\ldots,v_{L-d}$.

The theorem of Reisz says that the continuum limit of the integral $I$ in Eq. (15.1) exists if $\deg_Z I < 0$ for all its possible Zimmermann subspaces $Z$. In this case it is given by integrating the naive continuum limit of the integrand (as in conditions V2 and C2):

$$\lim_{a\to 0} I = \int_{-\infty}^{\infty} \frac{d^4k_1}{(2\pi)^4} \cdots \int_{-\infty}^{\infty} \frac{d^4k_L}{(2\pi)^4}\, \frac{P(k,q,m)}{\prod_{i=1}^{I} (l_i^2 + m_i^2)} . \tag{15.16}$$

Thus, in this case we have reduced the initial problem to the computation of a simpler continuum integral, which is absolutely convergent. The theorem can also be formulated in the case in which massless propagators are present, but then one must also introduce infrared degrees of divergence (Reisz, 1988b, d). The proof of the power counting theorem is rather complicated and goes beyond the scope of this review. We refer the interested reader to the original papers (Reisz, 1988a–d).

We now discuss a couple of examples which illustrate the meaning of the theorem of Reisz. Let us consider the one-dimensional integral

$$I(N) = \int_{-\pi/a}^{\pi/a} \frac{dk}{2\pi}\, \frac{1}{\frac{N^2}{a^2} \sin^2 \frac{ak}{N} + m^2}, \tag{15.17}$$

with $N = 1$ describing naive fermions and $N = 2$ scalar particles. Naively taking the limit $a \to 0$ of the integrand and of the integration region gives

$$\int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \frac{1}{k^2 + m^2} = \frac{1}{2m}, \tag{15.18}$$
which is independent of $N$. The true value of the integral at finite $a$ can be computed using the Schwinger representation

$$\frac{1}{x^2 + m^2} = \int_0^\infty d\alpha\, e^{-\alpha (x^2 + m^2)}, \tag{15.19}$$

so that it becomes

$$I(N) = \frac{2a}{N^2} \int_0^\infty dy\, e^{-(1 + 2a^2m^2/N^2)\,y}\, I_0(y), \tag{15.20}$$

where $I_0$ is a modified Bessel function, which for large $y$ behaves as

$$I_0(y) \xrightarrow{\,y\to\infty\,} \frac{1}{\sqrt{2\pi y}}\, e^{y} . \tag{15.21}$$

We can now compute the limit $a \to 0$ of $I(N)$ by replacing in the integrand the modified Bessel function with its asymptotic expression. Using $\int_0^\infty e^{-bx}\, dx/\sqrt{x} = \sqrt{\pi/b}$ one obtains

$$I(N) \xrightarrow{\,a\to 0\,} \frac{1}{Nm}, \tag{15.22}$$

which is the correct continuum limit of $I(N)$. We can see that it depends on $N$. For $N = 2$ this result is the same as the naive continuum limit (15.18), whereas for $N = 1$ the naive continuum limit gives only half of the true value (15.22). This mismatch corresponds to a case in which the Reisz theorem cannot be applied, because the propagator of naive fermions ($N = 1$) does not satisfy condition C3. In fact, this propagator has a doubler, and the true value of the integral in the continuum limit is then precisely twice the result which one would obtain just by taking the naive continuum limit.

In the above example all integrals have a negative degree of divergence. The integral

$$\int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \frac{1 - \cos ak_\mu}{\frac{4}{a^2} \sum_\lambda \sin^2 \frac{ak_\lambda}{2} + m^2}, \tag{15.23}$$

instead, is divergent like $1/a^2$ in the continuum limit, as can be seen by dimensional counting using the rescaled variable $\bar k = ka$. Therefore, the theorem of Reisz cannot be used, and in fact a naive continuum limit of the integrand gives a vanishing result, which is incorrect.

The Reisz power counting theorem is quite useful for the calculation of lattice integrals, especially when they are divergent, as we will see in the next Section. It is also essential to the proof of the renormalizability of lattice Yang-Mills theories to all orders of perturbation theory (Reisz, 1989).

15.2. Divergent integrals

For the treatment of divergent integrals on the lattice it is convenient to use a method which was introduced in Kawai et al. (1981). It consists in making an expansion of these integrals in powers of the external momenta, and computing on the lattice only the integrals with vanishing momentum, which are technically much simpler. As an example we consider the case of a quadratically divergent integral depending on two external momenta $p$ and $q$,

$$I = \int dk\, \mathcal{I}(k; p, q) . \tag{15.24}$$
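Before this integral is split, a short numerical aside on the one-dimensional example of Eq. (15.17) may be useful, since the factor of two due to the doubler is easy to reproduce on a computer. The Python sketch below evaluates $I(N)$ at small lattice spacing with a plain rectangle rule, which is adequate here because the integrand is smooth and periodic over the Brillouin zone. The function names and the sample values $a = 0.01$, $m = 1$ are our own choices.

```python
import math


def denominator(k, N, a, m):
    """Denominator of Eq. (15.17): N = 1 naive fermions, N = 2 scalars."""
    return (N / a) ** 2 * math.sin(a * k / N) ** 2 + m**2


def lattice_integral(N, a, m, n_points=20000):
    """Rectangle-rule estimate of I(N) over the zone [-pi/a, pi/a)."""
    h = 2.0 * math.pi / a / n_points
    total = 0.0
    for i in range(n_points):
        k = -math.pi / a + i * h
        total += 1.0 / denominator(k, N, a, m)
    return total * h / (2.0 * math.pi)


a, m = 0.01, 1.0
for N in (1, 2):
    value = lattice_integral(N, a, m)
    edge = denominator(math.pi / a, N, a, m)
    print(f"N = {N}: I(N) = {value:.5f}, 1/(N m) = {1.0 / (N * m):.5f}, "
          f"naive limit = {1.0 / (2.0 * m):.5f}, denominator at zone edge = {edge:.3g}")
```

For $N = 1$ the denominator at the edge of the zone stays of order $m^2$ instead of growing like $1/a^2$, which is the violation of condition C3 responsible for the mismatch with the naive limit. With this check done, we return to the integral of Eq. (15.24).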
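This integral can be split as

$$I = J + (I - J), \tag{15.25}$$

where

$$J = \int dk\, \mathcal{I}(k; 0, 0) + \sum_{\rho,\sigma} \left[ \frac{p_\rho p_\sigma}{2} \int dk \left. \frac{\partial^2 \mathcal{I}(k; p, 0)}{\partial p_\rho\, \partial p_\sigma} \right|_{p=0} + p_\rho q_\sigma \int dk \left. \frac{\partial^2 \mathcal{I}(k; p, q)}{\partial p_\rho\, \partial q_\sigma} \right|_{p=q=0} + \frac{q_\rho q_\sigma}{2} \int dk \left. \frac{\partial^2 \mathcal{I}(k; 0, q)}{\partial q_\rho\, \partial q_\sigma} \right|_{q=0} \right] \tag{15.26}$$

is the Taylor expansion of the original integral to second order. The integrals appearing in $J$ do not depend on the external momenta and are thus much easier to calculate on the lattice. The whole dependence on the external momenta remains in $I - J$ which, because of the subtraction, is ultraviolet-finite for $a \to 0$ and can thus be computed, according to the theorem of Reisz, just by taking the naive continuum limit. Thanks to this fact, only zero-momentum integrals have to be evaluated on the lattice.

Notice that for $p, q \neq 0$ and finite lattice spacing $I$ is well defined, but $J$ and $I - J$ are infrared divergent. To compute $J$ and $I - J$ separately, one must then introduce an intermediate regularization. The associated divergences will at the end cancel out in the sum $J + (I - J)$. This intermediate regularization is completely independent of the main regularization used in the lattice theory, and in particular can be different from it. It arises only because the splitting is somewhat unnatural.

To give an explicit illustration of this method, let us take the logarithmically divergent integral

$$I = \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \frac{1}{\left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{a(k-p)_\mu}{2} \right) \cdot \left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{ak_\mu}{2} \right)} . \tag{15.27}$$

The splitting is then made as follows:

$$J = I(p = 0) = \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \frac{1}{\left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{ak_\mu}{2} \right)^2}, \tag{15.28}$$

$$\begin{aligned}
I - J &= \lim_{a\to 0} \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4} \left[ \frac{1}{\left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{a(k-p)_\mu}{2} \right) \cdot \left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{ak_\mu}{2} \right)} - \frac{1}{\left( \frac{4}{a^2} \sum_\mu \sin^2 \frac{ak_\mu}{2} \right)^2} \right] \\
&= \int_{-\infty}^{\infty} \frac{d^4k}{(2\pi)^4} \left[ \frac{1}{(k-p)^2 \cdot k^2} - \frac{1}{(k^2)^2} \right] .
\end{aligned} \tag{15.29}$$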
Taking common denominators, it is easy to see that the degree of divergence of the above integral is negative, and therefore it can be safely computed in the continuum.
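For completeness, the counting can be written out explicitly. The following lines are only a worked version of the statement above, using the continuum form of the integrand in Eq. (15.29):

```latex
\frac{1}{(k-p)^2\, k^2} - \frac{1}{(k^2)^2}
  = \frac{k^2 - (k-p)^2}{(k-p)^2\, (k^2)^2}
  = \frac{2\, k \cdot p - p^2}{(k-p)^2\, (k^2)^2} ,
\qquad
\deg (I - J) = 4 + 1 - 6 = -1 .
```

The numerator grows only linearly in $k$, because the leading $k^2$ terms cancel in the difference, while the denominator grows like $k^6$. Each of the two terms taken separately would instead give $4 + 0 - 4 = 0$, i.e., a logarithmic divergence.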
If we use dimensional regularization we have the result$^{60}$

$$J = \frac{1}{16\pi^2} \left( \frac{2}{d-4} - \log a^2\mu^2 - \log 4\pi + F_0 \right), \tag{15.30}$$

$$I - J = \frac{1}{16\pi^2} \left( -\frac{2}{d-4} - \log \frac{p^2}{\mu^2} + \log 4\pi - \gamma_E + 2 \right), \tag{15.31}$$

where $\gamma_E = 0.57721566490153286\ldots$ and the lattice constant is $F_0 = 4.369225233874758\ldots$ (see Eqs. (18.25) and (18.26) and Table 2 later), while if we regularize by adding a small mass term $m^2$ to $k^2$ in the denominators we obtain$^{61}$

$$J_m = \frac{1}{16\pi^2} \left( -\log a^2m^2 - \gamma_E + F_0 \right), \tag{15.32}$$

$$(I - J)_m = \frac{1}{16\pi^2} \left( -\log \frac{p^2}{m^2} + 2 \right). \tag{15.33}$$

In both cases, adding up $J$ and $I - J$ we obtain for the original integral the result

$$I = \frac{1}{16\pi^2} \left( -\log a^2p^2 - \gamma_E + F_0 + 2 \right). \tag{15.34}$$
To summarize, for the computation of any divergent integral which depends on external momenta it is sufficient to compute some lattice integrals at zero momenta and some continuum integrals. In computer programs, a convenient way to deal with a generic divergent integral (which has to be processed in an automated way) is to subtract from it a simple integral with the same divergent behavior for which the numerical value is exactly known. The difference is then finite and can be computed with reasonable precision using simple integration routines. This is extremely convenient in the case of actions which give rise to complicated denominators, such as overlap fermions. In this case, Wilson integrals with the same divergence are subtracted from the original overlap integral, and then overlap denominators, which are much more complicated, appear only in the numerical calculation of finite integrals. The calculation of divergent integrals is made only using Wilson fermions. Of course, this is not the only available method for computing divergent integrals. In Section 19.2.1 we will show another technique based on the coordinate space method.

60 See Eq. (18.30) later. Notice that the integral of the second term in $I - J$ is zero in this regularization. We have also included the $\log \mu^2$ terms which derive from the d-dimensional redefinition of the coefficient $g_0^2$ associated with these 1-loop integrals.
61 See Eq. (18.29) later.

15.3. General aspects of the calculations

We have seen that the Feynman rules on the lattice are rather different from the continuum ones. The structure of lattice integrals is also completely different. The integrands are periodic in the momenta, and the basic objects are trigonometric functions and not simple polynomials of the momenta. Many standard methods which are very useful in continuum perturbation theory, like
Feynman parameterization and partial integration, are then not of much relevance to perturbative lattice calculations.$^{62}$ A complete lattice calculation which illustrates the peculiar aspects of lattice perturbation theory is instructive in our opinion. This we do in the next Section.

15.4. Example (Wilson): the first moment of the quark momentum distribution

We describe in this section, as a pedagogical example, a typical lattice perturbative calculation. We explain in detail the main steps of the Wilson action computation of the renormalization constant of the forward matrix element $\langle q| O_{\{\mu\nu\}} |q\rangle$ on single-quark states of the operator

$$O_{\{\mu\nu\}} = \bar\psi \gamma_{\{\mu} D_{\nu\}} \frac{\lambda^f}{2} \psi, \qquad \mu \neq \nu . \tag{15.35}$$

This operator, which is symmetrized in the indices $\mu$ and $\nu$, measures the first moment of the fraction of the momentum of the proton carried by the quarks. The $\lambda$'s are flavor matrices, which means that we are considering a flavor nonsinglet operator. The corresponding singlet operator (proportional to the identity flavor matrix) mixes, when radiative corrections are included, with the gluon operator $\sum_\rho \mathrm{Tr}\,(F_{\mu\rho} F_{\rho\nu})$, which measures the first moment of the momentum distribution of the gluon. In order to avoid the resulting complications we will not consider it here.

This example is rather simple (compared to other operators) and contains all the main interesting features one can think of: a logarithmic divergence, a covariant derivative, symmetrized indices and of course the special use of Kronecker $\delta$-symbols in lattice perturbative calculations. Moreover, it is an example of a calculation which requires an expansion of the various propagators and vertices in the lattice spacing $a$ (in this case, to first order).$^{63}$ As far as the value of the renormalization constant is concerned, the flavor matrices are not important (except for the fact that they forbid the mixing with gluonic operators). We will thus carry out the explicit computations using the operator

$$O_{\{\mu\nu\}} = \tfrac12 \left( \bar\psi \gamma_\mu D_\nu \psi + \bar\psi \gamma_\nu D_\mu \psi \right), \tag{15.36}$$

with $\mu \neq \nu$. This operator belongs to the representation $6_1$ of the hypercubic group (see Section 13). The choice $\mu = \nu$ would compel us to consider the operator $O_{\{00\}} - \frac13 (O_{\{11\}} + O_{\{22\}} + O_{\{33\}})$, which belongs to the representation $3_1$ and has a different lattice renormalization constant. This would render the calculations a bit more cumbersome, without teaching us much new. The covariant derivative is defined as follows:

$$D_\mu = \overleftrightarrow{D}_\mu = \tfrac12 \left( \overrightarrow{D}_\mu - \overleftarrow{D}_\mu \right). \tag{15.37}$$

62 We mention that recently a computational method has been presented (Becher and Melnikov, 2002) which transforms lattice integrals into continuum-like integrals through a change of variables ($t = \tan k/2$) and employs known techniques of continuum calculations based on asymptotic expansions.
63 A simpler calculation with no covariant derivatives and no need of expansions in $a$, which the reader could try as a warm-up, is given by the renormalization of quark currents. This was first computed in Martinelli and Zhang (1983a). For the case of extended currents, defined on more than one lattice site like the conserved vector current of Eq. (5.84), see Martinelli and Zhang (1983b).
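Fig. 17. “Proper” diagrams for the 1-loop correction of the matrix element $\langle q| \bar\psi \gamma_{\{\mu} D_{\nu\}} \psi |q\rangle$. The black squares indicate the insertion of the operator. The choice of momenta used in the calculations is also shown.

Fig. 18. Diagrams for the quark self-energy. On the left the sunset diagram, on the right the tadpole diagram.

In this calculation the discretized form that we opt for is

$$\begin{aligned}
\overrightarrow{D}_\mu \psi(x) &= \frac{1}{2a} \left[ U_\mu(x)\, \psi(x + a\hat\mu) - U^\dagger_\mu(x - a\hat\mu)\, \psi(x - a\hat\mu) \right], \\
\bar\psi(x) \overleftarrow{D}_\mu &= \frac{1}{2a} \left[ \bar\psi(x + a\hat\mu)\, U^\dagger_\mu(x) - \bar\psi(x - a\hat\mu)\, U_\mu(x - a\hat\mu) \right].
\end{aligned} \tag{15.38}$$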
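The role of the link variables in this definition can be illustrated with a toy check of gauge covariance. The Python sketch below uses a one-dimensional periodic lattice with U(1) links instead of SU(3), and assumes the usual transformation rules $\psi(x) \to \Omega(x)\psi(x)$ and $U(x) \to \Omega(x) U(x) \Omega^\dagger(x + a\hat\mu)$; the lattice size, the random fields and all variable names are hypothetical choices made for this illustration.

```python
import cmath
import random


def covariant_derivative(U, psi, a):
    """Right derivative of Eq. (15.38) on a periodic 1D lattice with U(1) links."""
    n = len(psi)
    return [
        (U[x] * psi[(x + 1) % n]
         - U[(x - 1) % n].conjugate() * psi[(x - 1) % n]) / (2.0 * a)
        for x in range(n)
    ]


def random_phase():
    return cmath.exp(1j * random.uniform(-cmath.pi, cmath.pi))


random.seed(1)
n_sites, a = 8, 0.1
U = [random_phase() for _ in range(n_sites)]          # link variables
psi = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_sites)]
omega = [random_phase() for _ in range(n_sites)]      # gauge transformation

# Transformed fields: psi -> Omega psi, U(x) -> Omega(x) U(x) Omega^dagger(x+1).
U_t = [omega[x] * U[x] * omega[(x + 1) % n_sites].conjugate() for x in range(n_sites)]
psi_t = [omega[x] * psi[x] for x in range(n_sites)]

lhs = covariant_derivative(U_t, psi_t, a)
rhs = [omega[x] * d for x, d in enumerate(covariant_derivative(U, psi, a))]
max_violation = max(abs(l - r) for l, r in zip(lhs, rhs))
print("largest deviation from covariance:", max_violation)
```

The deviation is at the level of rounding errors, i.e., the discretized derivative of $\psi$ transforms like $\psi$ itself. Dropping the links (setting all $U = 1$ while keeping a nontrivial $\Omega$) would spoil this property.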
We consider amputated Green’s functions, that is, the external propagators are removed. The tree level of the amputated forward quark matrix element is easily seen to be

$$\langle q| O_{\{\mu\nu\}} |q\rangle \Big|_{\rm tree} = \tfrac12\, i \left( \gamma_\mu p_\nu + \gamma_\nu p_\mu \right), \tag{15.39}$$

and the 1-loop QCD result has, as we will calculate below, the form

$$\langle q| O_{\{\mu\nu\}} |q\rangle \Big|_{\rm 1\ loop} = \tfrac12\, i \left( \gamma_\mu p_\nu + \gamma_\nu p_\mu \right) \cdot \frac{g_0^2}{16\pi^2}\, C_F \left( c_1 \log a^2p^2 + c_2 \right), \tag{15.40}$$

i.e., it is proportional to the tree level. This operator is thus multiplicatively renormalized. The renormalization constant for the matching to the $\overline{\rm MS}$ scheme can then be read off from the above 1-loop result plus the corresponding continuum calculations made in the $\overline{\rm MS}$ scheme (see Eq. (3.3) and Section 3). For the computation of the lattice part it is necessary to evaluate six Feynman diagrams, which are given in Figs. 17 and 18. The two diagrams in Fig. 18 refer to the quark self-energy and give the renormalization of the wave function. The four diagrams in Fig. 17 are specific to the operator considered and we will call them “proper” diagrams.
15.4.1. Preliminaries

We work in the Feynman gauge ($\alpha = 1$), where the form of the gluon propagator is simpler, and we set $r = 1$. We perform the calculations using massless fermions.$^{64}$ This is the simplest situation one can think of, although it already leads to complicated manipulations, as we will shortly see. We carry out these manipulations starting from the operator

$$O_{\mu\nu} = \bar\psi \gamma_\mu D_\nu \psi, \tag{15.41}$$

and implement the symmetrization in $\mu$ and $\nu$ at a later stage. Due to the presence of the link variable $U$ in the covariant derivative, this operator has an expansion in the coupling constant,

$$O_{\mu\nu} = O^{(0)}_{\mu\nu} + g_0\, O^{(1)}_{\mu\nu} + g_0^2\, O^{(2)}_{\mu\nu} + O(g_0^3) . \tag{15.42}$$

64 Calculations in which the quark propagator is massive are more complicated. A few examples of these calculations, which use simpler operators, can be found in Kronfeld and Mertens (1984), El-Khadra et al. (1997), Mertens et al. (1998) and Kuramashi (1998).

To evaluate the 1-loop Feynman diagrams in momentum space one has to compute the Fourier transforms of the operators in this expansion including the term of $O(g_0^2)$. For forward matrix elements it turns out that we can use the operator defined with the right derivative only, instead of the one involving the difference between the right and the left derivative (which would lead to more complicated manipulations). We have then that the expansion of $a^4 \sum_x (\bar\psi \gamma_\mu \overrightarrow{D}_\nu \psi)(x)$ is

$$\begin{aligned}
& \frac{1}{2a}\, a^4 \sum_x \left( \bar\psi(x) \gamma_\mu U_\nu(x) \psi(x + a\hat\nu) - \bar\psi(x) \gamma_\mu U^\dagger_\nu(x - a\hat\nu) \psi(x - a\hat\nu) \right) \\
&\quad = a^4\, \frac{1}{2a} \sum_x \left( \bar\psi(x) \gamma_\mu \psi(x + a\hat\nu) - \bar\psi(x) \gamma_\mu \psi(x - a\hat\nu) \right) \\
&\qquad + a^4\, \frac12\, i g_0 T^a \sum_x \left( \bar\psi(x) \gamma_\mu A^a_\nu(x) \psi(x + a\hat\nu) + \bar\psi(x) \gamma_\mu A^a_\nu(x - a\hat\nu) \psi(x - a\hat\nu) \right) \\
&\qquad - a^4\, \frac14\, a g_0^2\, T^a T^b \sum_x \left( \bar\psi(x) \gamma_\mu A^a_\nu(x) A^b_\nu(x) \psi(x + a\hat\nu) - \bar\psi(x) \gamma_\mu A^a_\nu(x - a\hat\nu) A^b_\nu(x - a\hat\nu) \psi(x - a\hat\nu) \right) \\
&\qquad + O(a^2 g_0^3) .
\end{aligned} \tag{15.43}$$

The Fourier transform of the lowest order is

$$a^4 \sum_x \int_{-\pi/a}^{\pi/a} \frac{d^4k'}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \frac{1}{2a}\, \bar\psi(k') \gamma_\mu \psi(k)\, e^{-ik'x} \left( e^{ik(x + a\hat\nu)} - e^{ik(x - a\hat\nu)} \right) \tag{15.44}$$

$$= \frac{1}{2a} \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \bar\psi(k) \gamma_\mu \psi(k) \left( e^{iak_\nu} - e^{-iak_\nu} \right) \tag{15.45}$$

$$= \frac{i}{a} \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \bar\psi(k) \gamma_\mu \psi(k)\, \sin ak_\nu, \tag{15.46}$$
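where we have used $a^4 \sum_x e^{-ik'x} e^{ikx} = (2\pi)^4 \delta^{(4)}(k - k')$. Similar $\delta$-functions arise at each order of the expansion, expressing the conservation of momentum at the various vertices. The first order term in $g_0$ is

$$\begin{aligned}
& a^4\, \frac12\, i g_0 T^a \sum_x \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4q}{(2\pi)^4}\, \bar\psi(p) \gamma_\mu \psi(k) A^a_\nu(q) \\
&\qquad \times e^{-ipx} e^{iqx} e^{ikx} \left( e^{iq_\nu a/2} e^{ik_\nu a} + e^{-iq_\nu a/2} e^{-ik_\nu a} \right) \\
&\quad = \frac12\, i g_0 T^a \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, \bar\psi(p) \gamma_\mu \psi(k) A^a_\nu(p - k) \\
&\qquad \times \left( e^{i(p-k)_\nu a/2} e^{ik_\nu a} + e^{-i(p-k)_\nu a/2} e^{-ik_\nu a} \right) \\
&\quad = i g_0 T^a \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, \bar\psi(p) \gamma_\mu \psi(k) A^a_\nu(p - k)\, \cos \frac{a(k + p)_\nu}{2} .
\end{aligned} \tag{15.47}$$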
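The trigonometric functions appearing in Eqs. (15.46) and (15.47) come from combining the phase factors of the shifted fields, and the two identities involved are easy to verify numerically. The Python sketch below does so for one momentum component; the function names and the sample momenta are arbitrary choices of ours.

```python
import cmath
import math


def lowest_order_kernel(k, a):
    """Phase factors of Eq. (15.45), including the prefactor 1/(2a)."""
    return (cmath.exp(1j * a * k) - cmath.exp(-1j * a * k)) / (2.0 * a)


def first_order_kernel(k, p, a):
    """Phase factors of Eq. (15.47) with q = p - k, including the prefactor 1/2."""
    q = p - k
    return 0.5 * (cmath.exp(1j * q * a / 2.0) * cmath.exp(1j * k * a)
                  + cmath.exp(-1j * q * a / 2.0) * cmath.exp(-1j * k * a))


a, k, p = 0.3, 1.7, -0.4   # sample lattice spacing and momentum components

diff0 = abs(lowest_order_kernel(k, a) - 1j * math.sin(a * k) / a)
diff1 = abs(first_order_kernel(k, p, a) - math.cos(a * (k + p) / 2.0))
print("lowest order:", diff0, " first order:", diff1)
```

Both differences vanish up to rounding errors. The half-integer shift $q_\nu a/2$ in the gluon phase is what makes the argument of the cosine symmetric in $k$ and $p$.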
With our convention for the Fourier transform of the gauge fields the gluons are always entering the vertices. The calculation above corresponds to the left sail in Fig. 17, and the reader can check that the calculation for the right sail gives the same function:

$$i g_0 T^a \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4} \int_{-\pi/a}^{\pi/a} \frac{d^4p}{(2\pi)^4}\, \bar\psi(k) \gamma_\mu \psi(p) A^a_\nu(k - p)\, \cos \frac{a(k + p)_\nu}{2} . \tag{15.48}$$

Finally, the operator tadpole being the only second-order contribution which we are interested in, this leads to some simplifications. In particular, since the gluon is emitted and reabsorbed at the same vertex, there is a Kronecker $\delta$-symbol in color space coming from the gluon propagator and the color factor becomes $\sum_a (T^a)^2_{bb} = (N_c^2 - 1)/(2N_c) = C_F$, the quadratic Casimir invariant of $SU(N_c)$. The insertion of the operator tadpole is then

$$\begin{aligned}
& -\frac14\, a g_0^2\, C_F \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \bar\psi(p) \gamma_\mu \psi(p) A^a_\nu(k) A^a_\nu(k) \left( e^{ip_\nu a} - e^{-ip_\nu a} \right) \\
&\quad = -\frac12\, i a g_0^2\, C_F \int_{-\pi/a}^{\pi/a} \frac{d^4k}{(2\pi)^4}\, \bar\psi(p) \gamma_\mu \psi(p) A^a_\nu(k) A^a_\nu(k)\, \sin ap_\nu,
\end{aligned} \tag{15.49}$$

where now the color index $a$ is not summed. Note that the factors $\exp(\pm iak_\nu/2)$ coming from the gluons have canceled, again because it is the same gluon that is emitted and absorbed at the vertex. This insertion does not depend on the momentum of the gluon.

We have thus obtained, with the momenta chosen as in Fig. 17, the operator insertions

$$O^{(0)}_{\mu\nu}(k) = \frac{1}{a}\, i \gamma_\mu \sin ak_\nu, \tag{15.50}$$

$$O^{(1)}_{\mu\nu}(k, p) = T^a\, i \gamma_\mu \cos \frac{a(k + p)_\nu}{2}, \tag{15.51}$$

$$O^{(2)}_{\mu\nu}(p) = -\frac{a}{2}\, C_F\, i \gamma_\mu \sin ap_\nu, \tag{15.52}$$
Fig. 19. Operator insertions for $\bar\psi\gamma_{\{\mu}D_{\nu\}}\psi$. Note that in the second-order term the color factors have already been worked out as they occur in the tadpole.
shown in Fig. 19. They generate the vertex, sails and operator tadpole respectively (see Fig. 17). The gluons have the same index as the trigonometric functions.

In the intermediate stages of the calculation many terms originate from the Taylor expansion of the denominators of the propagators. With our choice of momenta$^{65}$ only the gluon propagator has to be expanded in the lattice spacing. The quark propagator does not contain any external momenta and does not need to be expanded. We have found that this leads to simpler manipulations than the choice in which the quark propagators have momentum $p-k$ and the gluon propagator has momentum $k$.

15.4.2. Vertex

The first diagram that we consider, and for which we will give a rather detailed explanation of how it is computed, is the vertex function in Fig. 17. Since this Feynman diagram is divergent, it is convenient to split its computation into two parts, as explained in Section 15.2: the integral at zero momentum, $J$, and the rest of the integral, $I-J$, which can be computed in the continuum thanks to the theorem of Reisz. We use dimensional regularization as a regulator of the intermediate divergences, with $d = 4-2\epsilon$. We have then$^{66}$
\[
J = \int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\sum_{\rho,\sigma} G_{\rho\sigma}(p-k)\cdot V_\rho(k,p)\cdot S(k)\cdot O^{(0)}_{\mu\nu}(k)\cdot S(k)\cdot V_\sigma(p,k)\Big|_{ap=0},
\tag{15.53}
\]
where (in the Feynman gauge)
\[
S^{ab}(k) = \delta^{ab}\cdot a\cdot\frac{-i\sum_\lambda\gamma_\lambda\sin ak_\lambda + 2\sum_\lambda\sin^2\frac{ak_\lambda}{2}}{\sum_\lambda\sin^2 ak_\lambda + 4\bigl(\sum_\lambda\sin^2\frac{ak_\lambda}{2}\bigr)^2},
\tag{15.54}
\]

65 We always denote the external momentum as $p$, the quark propagator momentum as $k$ and the gluon propagator momentum as $p-k$ (except in the tadpoles).

66 We will assume the implicit summation convention for the indices $\rho$ and $\sigma$ in the continuum, while on the lattice even repeated indices, unless explicitly stated, are not summed.
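\[
\begin{aligned}
(V^a)^{bc}_\rho(k,p) = (V^a)^{bc}_\rho(p,k)
&= -g_0(T^a)^{bc}\Bigl[\sin\frac{a(k+p)_\rho}{2} + i\gamma_\rho\cos\frac{a(k+p)_\rho}{2}\Bigr]\\
&= -g_0(T^a)^{bc}\Bigl[\Bigl(\sin\frac{ak_\rho}{2} + i\gamma_\rho\cos\frac{ak_\rho}{2}\Bigr) + \frac{ap_\rho}{2}\Bigl(\cos\frac{ak_\rho}{2} - i\gamma_\rho\sin\frac{ak_\rho}{2}\Bigr)\Bigr] + O(a^2),
\end{aligned}
\tag{15.55}
\]
\[
\begin{aligned}
G^{ab}_{\rho\sigma}(p-k) = \delta^{ab}\delta_{\rho\sigma}\cdot G(p-k)
&= \delta^{ab}\delta_{\rho\sigma}\cdot\frac{1}{\frac{4}{a^2}\sum_\lambda\sin^2\frac{a(p-k)_\lambda}{2}}\\
&= \delta^{ab}\delta_{\rho\sigma}\cdot a^2\Bigl[\frac{1}{4\sum_\lambda\sin^2\frac{ak_\lambda}{2}} + a\,\frac{2\sum_\lambda p_\lambda\sin ak_\lambda}{\bigl(4\sum_\lambda\sin^2\frac{ak_\lambda}{2}\bigr)^2} + O(a^2)\Bigr].
\end{aligned}
\tag{15.56}
\]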
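The expansion of the gluon propagator in Eq. (15.56) can be checked numerically by comparing the exact lattice expression with its first-order Taylor expansion in $a$. The sketch below works with the rescaled loop momentum `kappa` $= ak$ held fixed; the sample momenta and the function names are assumptions made only for this illustration.

```python
import math

def gluon_exact(kappa, p, a):
    """1 / ((4/a^2) sum_l sin^2(a (p - k)_l / 2)), written with kappa = a k."""
    denom = sum(math.sin((kk - a * pp) / 2) ** 2 for kk, pp in zip(kappa, p))
    return a * a / (4 * denom)

def gluon_expanded(kappa, p, a):
    """a^2 [ 1/D0 + a * 2 sum_l p_l sin(kappa_l) / D0^2 ], D0 = 4 sum_l sin^2(kappa_l / 2)."""
    d0 = 4 * sum(math.sin(kk / 2) ** 2 for kk in kappa)
    first = 2 * sum(pp * math.sin(kk) for kk, pp in zip(kappa, p))
    return a * a * (1 / d0 + a * first / d0 ** 2)

# Illustrative momenta (not taken from the text).
kappa = (0.7, -1.2, 2.1, 0.4)
p = (0.3, 0.5, -0.2, 0.9)

def scaled_residual(a):
    """Neglected remainder divided by the overall a^2; should scale like a^2."""
    return abs(gluon_exact(kappa, p, a) - gluon_expanded(kappa, p, a)) / a ** 2

# Halving a should reduce the remainder by roughly a factor of four.
ratio = scaled_residual(0.005) / scaled_residual(0.01)
print(ratio)
```

A ratio close to 1/4 indicates that the neglected terms inside the square bracket are indeed of order $a^2$.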
Putting everything together we have =a 2 2a p sin ak 1 dd k 4 g0 · + J = a · CF d a 4 sin2 ak =2 (4 sin2 ak =2)2 −=a (2) < ak< ak< ap< ak< ak< cos × sin + i< cos + − i< sin 2 2 2 2 2 −i sin ak + 2 sin2 ak =2 × · sin ak) 2 2 2 sin ak + 4( sin ak =2) −i sin ak + 2 sin2 ak =2 · 2 2 2 sin ak + 4( sin ak =2) ak< ak< ap< ak< ak< + i< cos + cos − i< sin + O(a5 ) : × sin 2 2 2 2 2
(15.57)
The color factors are the same as in the continuum,
\[
\sum_a\sum_b (T^a)^{cb}(T^a)^{bc} = \sum_a (T^a)^2_{cc} = \frac{N_c^2-1}{2N_c} = C_F,
\tag{15.58}
\]
and $C_F$ becomes an overall factor in front of the expressions. We now rescale the integration variable
\[
k \to k' = ak,
\tag{15.59}
\]
so that$^{67}$
\[
\int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\,f(ak,ap) = \int_{-\pi}^{\pi}\frac{d^dk'}{(2\pi)^d}\,\frac{1}{a^4}\,f(k',ap).
\tag{15.60}
\]
Note that the domain of integration after the rescaling becomes independent of $a$. The factor $a^4$ in front of Eq. (15.57) cancels the factor $1/a^4$ coming from the rescaling in Eq. (15.60), and thus we have an overall factor $1/a$ left. This means that in order to take the continuum limit of this lattice integral we have to expand the whole integrand including factors of order $ap$. This is not surprising, as it points to the recovery of the tree-level expression $\gamma_\mu p_\nu$. We now drop the prime from $k'$ and use the shorthand notation
\[
\Gamma_\lambda = \sin k_\lambda, \tag{15.61}
\]
\[
W = 2\sum_\lambda\sin^2\frac{k_\lambda}{2}, \tag{15.62}
\]
\[
N_\rho = \sin\frac{k_\rho}{2}, \tag{15.63}
\]
\[
M_\rho = \cos\frac{k_\rho}{2}. \tag{15.64}
\]
Of course we also have
\[
\not\Gamma = \sum_\lambda\gamma_\lambda\sin k_\lambda, \tag{15.65}
\]
and
\[
\Gamma^2 = \sum_\lambda\sin^2 k_\lambda. \tag{15.66}
\]
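It should be noted that $\Gamma$ and $N$ are odd in $k$, while $M$ and $W$ are even. Expanding everything to order $a$ (which compensates the remaining factor $1/a$ in Eq. (15.57)) we have
\[
\begin{aligned}
J = ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\Biggl\{
&\frac{2\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2}\,(N_\rho+i\gamma_\rho M_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\cdot\gamma_\mu\Gamma_\nu\cdot\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(N_\rho+i\gamma_\rho M_\rho)\\
&+\frac{1}{2W}\,\frac{p_\rho}{2}\,(M_\rho-i\gamma_\rho N_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\cdot\gamma_\mu\Gamma_\nu\cdot\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(N_\rho+i\gamma_\rho M_\rho)\\
&+\frac{1}{2W}\,\frac{p_\rho}{2}\,(N_\rho+i\gamma_\rho M_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\cdot\gamma_\mu\Gamma_\nu\cdot\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(M_\rho-i\gamma_\rho N_\rho)\Biggr\}.
\end{aligned}
\tag{15.67}
\]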
67 We have introduced noninteger dimensions because we are using dimensional regularization to compute the finite part of the divergent integrals. While in principle there should be a factor $1/a^d$ on the right-hand side of this equation, the difference between $d$ and 4 is not relevant for the expansion in $a$, and we can set this factor equal to $1/a^4$ from the beginning. The leftover $a^{d-4}$ term does however give a contribution to divergent integrals when dimensional regularization is used, as in Eq. (15.92).
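We emphasize at this point that it has been necessary to perform the Taylor expansion before doing the gamma algebra, because the Dirac structure is "hidden" inside the unexpanded propagators and vertices. After a few manipulations we get
\[
\begin{aligned}
J = ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\Biggl\{
&\frac{2\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2(\Gamma^2+W^2)^2}\,\Gamma_\nu\Bigl[\gamma_\mu N_\rho^2W^2 - \gamma_\rho\gamma_\mu\gamma_\rho M_\rho^2W^2 - \not\Gamma\gamma_\mu\not\Gamma\,N_\rho^2 + \gamma_\rho\not\Gamma\gamma_\mu\not\Gamma\gamma_\rho\,M_\rho^2\\
&\qquad + (\gamma_\rho\not\Gamma\gamma_\mu + \not\Gamma\gamma_\mu\gamma_\rho + \gamma_\rho\gamma_\mu\not\Gamma + \gamma_\mu\not\Gamma\gamma_\rho)\,N_\rho M_\rho W + \cdots\Bigr]\\
&+\frac{1}{2W(\Gamma^2+W^2)^2}\,\frac{p_\rho}{2}\,\Gamma_\nu\Bigl[2\gamma_\mu N_\rho M_\rho W^2 - 2\not\Gamma\gamma_\mu\not\Gamma\,N_\rho M_\rho + 2\gamma_\rho\gamma_\mu\gamma_\rho N_\rho M_\rho W^2 - 2\gamma_\rho\not\Gamma\gamma_\mu\not\Gamma\gamma_\rho\,N_\rho M_\rho\\
&\qquad + (\gamma_\rho\not\Gamma\gamma_\mu + \not\Gamma\gamma_\mu\gamma_\rho + \gamma_\rho\gamma_\mu\not\Gamma + \gamma_\mu\not\Gamma\gamma_\rho)(M_\rho^2-N_\rho^2)W + \cdots\Bigr]\Biggr\},
\end{aligned}
\tag{15.68}
\]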
where the dots denote terms which are odd in $k$ and therefore do not contribute to the final result, as their integral vanishes by parity. Notice that the terms which are even in $k$ possess an odd number of Dirac matrices, and this again points to the recovery of the tree-level expression.

At this stage we can finally perform the gamma algebra. However, this is one of the most delicate points of the whole calculation, and one must be careful here. What happens is that the expressions are not in general tensors in the usual sense. Summed indices can appear more than twice in each monomial, due to the breaking of Lorentz invariance. Thus, the well-known continuum formulae cannot be used straightforwardly on the lattice. The lattice reduction formulae which we need in order to proceed with the calculation of the vertex are:
\[
\not\Gamma\gamma_\mu\not\Gamma = -\gamma_\mu\Gamma^2 + 2\not\Gamma\,\Gamma_\mu,
\tag{15.69}
\]
which is the same as in the continuum,
\[
\sum_\rho\gamma_\rho\gamma_\mu\gamma_\rho = \sum_\rho\gamma_\rho(-\gamma_\rho\gamma_\mu + 2\delta_{\rho\mu}) = \sum_\rho(-\gamma_\mu\gamma_\rho^2 + 2\gamma_\mu\delta_{\mu\rho}),
\tag{15.70}
\]
which is instead different from the continuum result, and
\[
\sum_\rho\gamma_\rho\not\Gamma\gamma_\mu\not\Gamma\gamma_\rho = \sum_\rho\gamma_\rho(-\gamma_\mu\Gamma^2 + 2\not\Gamma\,\Gamma_\mu)\gamma_\rho
= \sum_\rho\bigl(\gamma_\mu\gamma_\rho^2\Gamma^2 - 2\gamma_\mu\delta_{\mu\rho}\Gamma^2 - 2\not\Gamma\gamma_\rho^2\Gamma_\mu + 4\gamma_\rho\Gamma_\rho\Gamma_\mu\bigr).
\tag{15.71}
\]
The Kronecker $\delta$-symbols in the formulae above, as well as the various factors $\gamma_\rho^2$, are very important and for the moment they must be kept, because their effect depends on what is present in the rest of each monomial. This can be seen in the following examples:
\[
\sum_\rho\gamma_\rho\gamma_\mu\gamma_\rho\cos k_\mu\cos k_\rho = \sum_\rho(-\gamma_\mu\gamma_\rho^2 + 2\gamma_\mu\delta_{\mu\rho})\cos k_\mu\cos k_\rho
= -\gamma_\mu\cos k_\mu\sum_\rho\cos k_\rho + 2\gamma_\mu\cos^2k_\mu,
\tag{15.72}
\]
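\[
\sum_\rho\gamma_\rho\gamma_\mu\gamma_\rho\cos^2k_\mu = \sum_\rho(-\gamma_\mu\gamma_\rho^2 + 2\gamma_\mu\delta_{\mu\rho})\cos^2k_\mu
= -(4-2\epsilon)\gamma_\mu\cos^2k_\mu + 2\gamma_\mu\cos^2k_\mu = -(2-2\epsilon)\gamma_\mu\cos^2k_\mu.
\tag{15.73}
\]
Similarly, a term like $\gamma_\rho\Gamma_\rho$ in the $J$ expression above can only be contracted if in the rest of the monomial there are no other functions with index $\rho$. Another gamma algebra relation that we need is
\[
\gamma_\mu\gamma_\nu\gamma_\lambda + \gamma_\lambda\gamma_\nu\gamma_\mu = 2(\delta_{\mu\nu}\gamma_\lambda + \delta_{\nu\lambda}\gamma_\mu - \delta_{\mu\lambda}\gamma_\nu),
\tag{15.74}
\]
so that
\[
\gamma_\rho\not\Gamma\gamma_\mu + \not\Gamma\gamma_\mu\gamma_\rho + \gamma_\rho\gamma_\mu\not\Gamma + \gamma_\mu\not\Gamma\gamma_\rho = 4\gamma_\rho\Gamma_\mu.
\tag{15.75}
\]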
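The reduction formulae (15.69), (15.70) and (15.75) can be verified with explicit matrices. The sketch below is only a consistency check in exactly four dimensions (so $\epsilon = 0$ and $\sum_\rho\gamma_\rho\gamma_\mu\gamma_\rho = -2\gamma_\mu$); the chiral representation of the Euclidean gamma matrices and the sample momentum are choices made for this example and are not prescribed by the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product(*factors):
    """Ordered product of several matrices."""
    result = factors[0]
    for factor in factors[1:]:
        result = matmul(result, factor)
    return result

def combine(terms):
    """Linear combination sum(c * m) of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def max_abs(m):
    return max(abs(x) for row in m for x in row)

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scaled(c, m):
    return [[c * x for x in row] for row in m]

zero = [[0, 0], [0, 0]]
one = [[1, 0], [0, 1]]
pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Hermitian Euclidean gamma matrices with {gamma_mu, gamma_nu} = 2 delta_{mu nu}.
gamma = [blocks(zero, scaled(-1j, s), scaled(1j, s), zero) for s in pauli]
gamma.append(blocks(zero, one, one, zero))

Gam = [math.sin(k) for k in (0.3, -1.2, 2.0, 0.7)]  # Gamma_lambda = sin k_lambda
Gam2 = sum(g * g for g in Gam)
slash = combine([(Gam[l], gamma[l]) for l in range(4)])

# Eq. (15.69): slash gamma_mu slash + gamma_mu Gamma^2 - 2 slash Gamma_mu = 0
dev_69 = max(
    max_abs(combine([(1, product(slash, gamma[mu], slash)),
                     (Gam2, gamma[mu]),
                     (-2 * Gam[mu], slash)]))
    for mu in range(4))

# Eq. (15.70) at d = 4: sum_rho gamma_rho gamma_mu gamma_rho + 2 gamma_mu = 0
dev_70 = max(
    max_abs(combine([(1, product(gamma[rho], gamma[mu], gamma[rho]))
                     for rho in range(4)] + [(2, gamma[mu])]))
    for mu in range(4))

# Eq. (15.75): the four-term combination minus 4 gamma_rho Gamma_mu = 0
dev_75 = max(
    max_abs(combine([(1, product(gamma[rho], slash, gamma[mu])),
                     (1, product(slash, gamma[mu], gamma[rho])),
                     (1, product(gamma[rho], gamma[mu], slash)),
                     (1, product(gamma[mu], slash, gamma[rho])),
                     (-4 * Gam[mu], gamma[rho])]))
    for mu in range(4) for rho in range(4))

print(dev_69, dev_70, dev_75)
```

All three deviations vanish up to rounding, independently of the representation chosen, since only the Clifford algebra is used.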
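Using these formulae for the reduction of $\gamma$ matrices we then obtain
\[
\begin{aligned}
J = ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{
&\frac{2\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2(\Gamma^2+W^2)^2}\,\Gamma_\nu\Bigl[\sum_\rho\Bigl(\gamma_\mu(N_\rho^2W^2 + M_\rho^2W^2) + \gamma_\mu\Gamma^2(N_\rho^2+M_\rho^2) - 2\not\Gamma\,\Gamma_\mu(N_\rho^2+M_\rho^2)\\
&\qquad + 4\gamma_\rho\Gamma_\rho\Gamma_\mu M_\rho^2 + 4\gamma_\rho\Gamma_\mu N_\rho M_\rho W\Bigr) - 2\gamma_\mu\Gamma^2M_\mu^2 - 2\gamma_\mu M_\mu^2W^2\Bigr]\\
&+\frac{1}{2W(\Gamma^2+W^2)^2}\sum_\rho\frac{p_\rho}{2}\,\Gamma_\nu\Bigl[\bigl(4\gamma_\mu\delta_{\mu\rho}\Gamma^2N_\mu M_\mu - 8\gamma_\rho\Gamma_\rho\Gamma_\mu N_\rho M_\rho\\
&\qquad + 4\gamma_\rho\Gamma_\mu(M_\rho^2-N_\rho^2)W\bigr) + 4\gamma_\mu\delta_{\mu\rho}N_\mu M_\mu W^2\Bigr]\Biggr\}.
\end{aligned}
\tag{15.76}
\]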
We now exploit the symmetry k → −k of the integration region once more: odd powers of sine functions of the various Lorentz components of the momentum must be combined in such a way that only even powers, which have a nonzero *nal integral, are left: sin k< sin k = sin2 k< < ; (15.77) <
<
sin k< sin k sin sin) =
<
sin2 k< sin2 k ( < ) + <) )
( = )) :
(15.78)
<
Of course, since we have = ), terms of the kind sin k sin k) are zero when integrated and can be safely dropped (unless they appear with sines of other indices like in sin k sin k) sin k sin k , as in the formula above). Examples which are relevant for the J expression we are computing are: p I I) N<2 W = p) I)2 N<2 W ; (15.79) <
<
2
p< I) N M W = 0 ;
<
<
2 2 p I I ) I , I (N< + M< ) =
(15.80) <
(p) I)2 I2 + p I2 ) I)2 )(N<2 + M<2 ) ;
(15.81)
and another interesting example is 2 2 p< I , I I < N < M < = p I I ) N ) M ) + ) p) I ) I N M :
(15.82)
<
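It is easy to see that after this step all that is left in $J$ is multiplied by either $\gamma_\mu p_\nu$ or $\gamma_\nu p_\mu$, while $p_\rho$ has disappeared altogether:
\[
\begin{aligned}
J = ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{
&\frac{2\gamma_\mu p_\nu\Gamma_\nu^2}{(2W)^2(\Gamma^2+W^2)^2}\Bigl[\sum_\rho\bigl((N_\rho^2+M_\rho^2)W^2 + \Gamma^2(N_\rho^2+M_\rho^2) - 2\Gamma_\mu^2(N_\rho^2+M_\rho^2)\bigr)\\
&\qquad - 2\Gamma^2M_\mu^2 - 2M_\mu^2W^2 + 4\Gamma_\mu^2M_\mu^2 + 4\Gamma_\mu N_\mu M_\mu W\Bigr]\\
&+\frac{2\gamma_\nu p_\mu\Gamma_\mu^2}{(2W)^2(\Gamma^2+W^2)^2}\Bigl[\sum_\rho\bigl(-2\Gamma_\nu^2(N_\rho^2+M_\rho^2)\bigr) + 4\Gamma_\nu^2M_\nu^2 + 4\Gamma_\nu N_\nu M_\nu W\Bigr]\Biggr\}.
\end{aligned}
\tag{15.83}
\]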
Notice also that the whole part proportional to $1/(2W(\Gamma^2+W^2)^2)$ has disappeared, because all terms were of the kind $\sin k_\mu\sin k_\nu$. At this point we remark that we started our calculation using only the operator $\bar\psi\gamma_\mu D_\nu\psi$, and what we have just obtained is a 1-loop expression of the form
\[
\bar\psi\gamma_\mu D_\nu\psi \;\xrightarrow{\ 1\ \mathrm{loop}\ }\; \frac{g_0^2}{16\pi^2}\,i\,(\gamma_\mu p_\nu A + \gamma_\nu p_\mu B).
\tag{15.84}
\]
It appears that we have obtained a 1-loop expression which is not proportional to the tree-level one, which is the necessary condition for the operator to be multiplicatively renormalized. However, if we now take the symmetrization in $\mu$ and $\nu$ into account everything is fine, because the operator $O_{\nu\mu}$ gives
\[
\bar\psi\gamma_\nu D_\mu\psi \;\xrightarrow{\ 1\ \mathrm{loop}\ }\; \frac{g_0^2}{16\pi^2}\,i\,(\gamma_\mu p_\nu B + \gamma_\nu p_\mu A),
\tag{15.85}
\]
with the same $A$ and $B$ of $\bar\psi\gamma_\mu D_\nu\psi$, and this implies that
\[
O_{\{\mu\nu\}} = \frac{1}{2}\bigl(\bar\psi\gamma_\mu D_\nu\psi + \bar\psi\gamma_\nu D_\mu\psi\bigr) \;\xrightarrow{\ 1\ \mathrm{loop}\ }\; \frac{g_0^2}{16\pi^2}\,\frac{1}{2}\,i\,(\gamma_\mu p_\nu + \gamma_\nu p_\mu)(A+B).
\tag{15.86}
\]
The symmetrized operator is thus multiplicatively renormalized (at least, for now, its vertex contribution). Looking at Eq. (3.1) we see that the lattice part of the 1-loop renormalization constant that matches to the continuum is given by
\[
-\gamma^{(0)}_{\mathrm{vert}}\cdot\log a^2p^2 + R^{\mathrm{lat}}_{\mathrm{vert}} = A + B.
\tag{15.87}
\]
In practical terms, in order to get the 1-loop expression for the symmetrized operator, we need to exchange, in the expression in Eq. (15.83), the indices $\mu$ and $\nu$ in the terms proportional to $\gamma_\nu p_\mu$.
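We then have
\[
\begin{aligned}
J = ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{
&\frac{\gamma_\mu p_\nu\Gamma_\nu^2}{2(\Gamma^2+W^2)^2}\Bigl[\sum_\rho(N_\rho^2+M_\rho^2) - 2M_\mu^2\Bigr] + \frac{2\gamma_\mu p_\nu\Gamma_\mu^2\Gamma_\nu^2}{W(\Gamma^2+W^2)^2}\\
&+\frac{2\gamma_\mu p_\nu\Gamma_\nu^2}{(2W)^2(\Gamma^2+W^2)^2}\Bigl[\sum_\rho\bigl(\Gamma^2(N_\rho^2+M_\rho^2) - 4\Gamma_\mu^2(N_\rho^2+M_\rho^2)\bigr) - 2\Gamma^2M_\mu^2 + 8\Gamma_\mu^2M_\mu^2\Bigr]\Biggr\},
\end{aligned}
\]
in which we have also made some simplifications and used the trigonometric identity
\[
N_\rho M_\rho = \tfrac{1}{2}\Gamma_\rho.
\tag{15.88}
\]
Using also
\[
N_\rho^2 + M_\rho^2 = 1
\tag{15.89}
\]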
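we obtain our final expression:
\[
\begin{aligned}
J = ig_0^2C_F\,\gamma_\mu p_\nu\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{
&\frac{\sin^2k_\nu\,\bigl(4-2\epsilon-2\cos^2\frac{k_\mu}{2}\bigr)}{2\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)^2}
+ \frac{\sin^2k_\mu\,\sin^2k_\nu}{\bigl(\sum_\lambda\sin^2\frac{k_\lambda}{2}\bigr)\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)^2}\\
&+ \frac{\sin^2k_\nu\,\bigl(4-2\epsilon-2\cos^2\frac{k_\mu}{2}\bigr)\bigl(\sum_\lambda\sin^2k_\lambda - 4\sin^2k_\mu\bigr)}{8\bigl(\sum_\lambda\sin^2\frac{k_\lambda}{2}\bigr)^2\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)^2}\Biggr\}.
\end{aligned}
\tag{15.90}
\]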
It should be noticed that the function $\cos(k_\mu/2)$ is present in the final expression of the Feynman diagrams only with an even power (even though in the Feynman rules, for example in the quark-gluon vertex, Eq. (5.76), it also appears with the first power), while $\cos k_\mu$ can also be present with an odd power. The same will be true for the sails, and in general for every Feynman diagram in this kind of theory. This is due to the fact that $\cos(k_\mu/2)$ does not have a period of $2\pi$, and hence it is not an admissible function, as opposed to $\cos^2(k_\mu/2)$. Of course the sine functions, due to the symmetry of the integration region, appear in any case only with an even power.

The last term in the final result for $J$ is logarithmically divergent. The coefficient of the divergence is easily extracted, and the remaining finite part can be computed with great precision using the algebraic method that we will introduce in Section 18. We can however proceed here in a much faster (although less precise) way by subtracting a simple lattice integral which has the same divergence, for example
\[
-\frac{1}{3}\int_{-\pi}^{\pi}\frac{d^{4-2\epsilon}k}{(2\pi)^{4-2\epsilon}}\,\frac{1}{\bigl(4\sum_\lambda\sin^2\frac{k_\lambda}{2}\bigr)^2}
= \frac{1}{16\pi^2}\Bigl[-\frac{1}{3}\Bigl(-\frac{1}{\epsilon} - \log 4\pi + F_0\Bigr)\Bigr],
\tag{15.91}
\]
which we have taken from Section 18. The finite difference can then be computed numerically with a simple integration routine. In this way the value of the zero-momentum integral can be obtained, and the result is
\[
\begin{aligned}
J &= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3}\Bigl(\frac{1}{\epsilon} + \log a^2\mu^2 - F_0 + \log 4\pi\Bigr) + 0.473493\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3\epsilon} - \frac{1}{3}\log a^2\mu^2 + 1.086227\Bigr],
\end{aligned}
\tag{15.92}
\]
where the $\log a^2\mu^2$ term derives from the rescaling $k' = ak$ and from the coefficient $g_0^2$ when $d \neq 4$ dimensions are used. The continuum part, $I-J$, is just
\[
\begin{aligned}
I - J &= -g_0^2C_F\int_{-\infty}^{\infty}\frac{d^{4-2\epsilon}k}{(2\pi)^{4-2\epsilon}}\,\frac{1}{(p-k)^2}\,\gamma_\rho\,\frac{-i\not k}{k^2}\,i\gamma_\mu k_\nu\,\frac{-i\not k}{k^2}\,\gamma_\rho\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3}\Bigl(-\frac{1}{\epsilon} + \log\frac{p^2}{\mu^2} + \gamma_E - \log 4\pi\Bigr) + \frac{5}{9}\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[\frac{1}{3\epsilon} - \frac{1}{3}\log\frac{p^2}{\mu^2} + 1.206825\Bigr].
\end{aligned}
\tag{15.93}
\]
In principle $I-J$ also contains terms of the form $p_\mu p_\nu\not p/p^2$, but they belong to the matrix element and not to the renormalization constant. They cancel when the difference between continuum and lattice renormalization factors is taken, having the same coefficient in both cases. The final result for the Feynman diagram is (Capitani, 2001a)
\[
I = i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl(-\frac{1}{3}\log a^2p^2 + 2.293052\Bigr).
\tag{15.94}
\]
We could have chosen a mass regularization instead of dimensional regularization (adding a mass $m^2$ to the gluon propagator). In this case we would have obtained
\[
\begin{aligned}
J_m &= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3}\bigl(\log a^2m^2 + \gamma_E - F_0\bigr) - 0.193173\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3}\log a^2m^2 + 1.070830\Bigr]
\end{aligned}
\tag{15.95}
\]
and
\[
(I-J)_m = i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{1}{3}\log\frac{p^2}{m^2} + \frac{11}{9}\Bigr].
\tag{15.96}
\]
Of course the sum of $J_m$ and $(I-J)_m$ is still given by Eq. (15.94). Expression (15.57) after the $a^4$ rescaling also contains $1/a$ terms. One could ask where the $1/a$ terms have gone. The fact is that they are zero for this diagram. They are also zero for the sails, while in the case of the self-energy the $1/a$ terms are finite and give an important contribution, which is linked to the breaking of chiral symmetry.
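The finite constants of the vertex diagram quoted in Eqs. (15.92) to (15.96) can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard numerical values $F_0 \simeq 4.369225$ and $\gamma_E \simeq 0.577216$ (these digits are not restated in this subsection and are inserted here as an assumption), and the variable names are purely illustrative.

```python
import math

F0 = 4.369225233874758          # lattice constant F_0 (assumed standard value)
GAMMA_E = 0.5772156649015329    # Euler-Mascheroni constant
LOG_4PI = math.log(4 * math.pi)

# Dimensional regularization: finite parts after removing the pole and the logarithm.
j_dimreg = -(1 / 3) * (-F0 + LOG_4PI) + 0.473493
cont_dimreg = -(1 / 3) * (GAMMA_E - LOG_4PI) + 5 / 9

# Mass regularization: finite parts after removing log(a^2 m^2) and log(p^2 / m^2).
j_mass = -(1 / 3) * (GAMMA_E - F0) - 0.193173
cont_mass = 11 / 9

total_dimreg = j_dimreg + cont_dimreg
total_mass = j_mass + cont_mass

print(round(j_dimreg, 6), round(cont_dimreg, 6), round(j_mass, 6))
print(round(total_dimreg, 6), round(total_mass, 6))
```

Both regularizations reproduce the same total finite constant of the vertex within the six quoted decimals, which is the regulator independence stated in the text.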
15.4.3. Sails We now turn to the sails in Fig. 17. In these diagrams there is only one interaction vertex coming from QCD, and the gluon is the contraction of the A) coming from the covariant derivative in the operator and of the A< in the QCD vertex V< , which therefore (in the Feynman gauge) becomes V) . The *rst order expansion of the covariant derivative in the operator gives, as we have seen, ak) ak) ap) a(k + p)) (1) = i cos − sin ; (15.97) O ) (k; p) = i cos 2 2 2 2 and this expression is valid for both sails. We evaluate the two sails together, since some simpli*cations will take place at intermediate stages of the calculations when taking their sum. Again, we *rst compute the J part at zero-momentum, =a dd k (1) J= G)) (p − k) · g0 [V) (k; p) · S(k) · O(1) ) (k; p) + O ) (k; p) · S(k) · V) (p; k)]|ap=0 d (2) −=a ) 2a p I 1 ig02 −iI ap) * dd k , +W CF + N) M (N =− + i M ) − ) ) ) ) d a 2W (2W )2 I2 + W 2 2 − (2)
) ap) * −iI , +W + M) − (N) + i) M) ) ; (15.98) N) 2 I2 + W 2 where we have already rescaled the integration variable as we did in the vertex. It is easy to see that the integral in the *rst line contains an overall factor a3 (a from S(k) and a2 from G)) (p − k)), and after rescaling (which gives a factor 1=a4 ) one is left with an overall factor 1=a. We have then, in the limit a → 0: d d k J = −ig02 CF d (2) − 2 p I −iI −iI , +W , +W (N) + i) M) ) 2 × · M) + M) · 2 (N) + i) M) ) (2W )2 I + W2 I + W2 −iI −iI 1 p) , +W , +W (M) − i) N) ) 2 · M) + M) · 2 (M) − i) N) ) + 2 2W 2 I +W I + W2 −iI −iI , +W , +W + (N) + i) M) ) 2 · (−N) ) + (−N) ) · 2 (N) + i) M) ) I + W2 I + W2 d 2 p I d k 2 2 (2 N) M) W + () I = − ig0 CF , + I , ) )M) + · · ·) d (2W )2 (I2 + W 2 ) − (2) p) 1 2 2 (2 (M) − N) )W − 2() I + (15.99) , + I , ) )N) M) + · · ·) : 2W (I2 + W 2 ) 2 Again we drop terms odd in k and perform the gamma algebra, using ) I , + I , ) = 2( I) + ) I ) ;
(15.100)
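which is valid for $\mu \neq \nu$. After combining the sine functions and exchanging the indices $\mu$ and $\nu$ in the $\gamma_\nu p_\mu$ terms we arrive at
\[
J = -ig_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{\frac{1}{2W^2(\Gamma^2+W^2)}\,\gamma_\mu p_\nu\bigl(\Gamma_\nu^2W + 2\Gamma_\nu^2(M_\nu^2+M_\mu^2)\bigr)
+ \frac{1}{2W(\Gamma^2+W^2)}\,\gamma_\mu p_\nu\bigl((M_\nu^2-N_\nu^2)W - \Gamma_\nu^2\bigr)\Biggr\},
\tag{15.101}
\]
where we have also replaced the $N_\nu M_\nu$ factor with $\Gamma_\nu/2$. At the end, after some further simplifications (which include the replacement of $M_\rho^2-N_\rho^2$ with $\cos k_\rho$), we obtain the explicit form
\[
J = -ig_0^2C_F\,\gamma_\mu p_\nu\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl\{\frac{\cos k_\nu}{2\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)}
+ \frac{\sin^2k_\nu\,\bigl(\cos^2\frac{k_\mu}{2} + \cos^2\frac{k_\nu}{2}\bigr)}{4\bigl(\sum_\lambda\sin^2\frac{k_\lambda}{2}\bigr)^2\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)}\Biggr\}.
\tag{15.102}
\]
This final expression for the sails gives, when integrated,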
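\[
\begin{aligned}
J &= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[2\Bigl(\frac{1}{\epsilon} + \log a^2\mu^2 - F_0 + \log 4\pi\Bigr) + 6.506752\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[\frac{2}{\epsilon} + 2\log a^2\mu^2 + 2.830350\Bigr],
\end{aligned}
\tag{15.103}
\]
and
\[
\begin{aligned}
I - J &= g_0^2C_F\int_{-\infty}^{\infty}\frac{d^{4-2\epsilon}k}{(2\pi)^{4-2\epsilon}}\,\frac{1}{(p-k)^2}\Bigl[\gamma_\nu\,\frac{-i\not k}{k^2}\,\gamma_\mu + \gamma_\mu\,\frac{-i\not k}{k^2}\,\gamma_\nu\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[2\Bigl(-\frac{1}{\epsilon} + \log\frac{p^2}{\mu^2} + \gamma_E - \log 4\pi\Bigr) - 4\Bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl[-\frac{2}{\epsilon} + 2\log\frac{p^2}{\mu^2} - 7.907617\Bigr].
\end{aligned}
\tag{15.104}
\]
The final result for the Feynman diagram is (Capitani, 2001a)
\[
I = i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\bigl(2\log a^2p^2 - 5.077267\bigr).
\tag{15.105}
\]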
With a mass regularization the intermediate results would be:
\[
\begin{aligned}
J_m &= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\bigl[2(\log a^2m^2 + \gamma_E - F_0) + 6.506752\bigr]\\
&= i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\bigl(2\log a^2m^2 - 1.077267\bigr),
\end{aligned}
\tag{15.106}
\]
and
\[
(I-J)_m = i\gamma_\mu p_\nu\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl(2\log\frac{p^2}{m^2} - 4\Bigr).
\tag{15.107}
\]
Also in this diagram the $1/a$ terms are zero.
15.4.4. Operator tadpole

We now consider the diagram arising from the second-order expansion of the covariant derivative in the operator in the special case in which the two gluons are contracted to make a tadpole. Its Fourier transform is
\[
O^{(2)}_{\mu\nu}(p) = -\frac{a}{2}\,C_F\,i\gamma_\mu\sin ap_\nu.
\tag{15.108}
\]
The operator that enters into the tadpole depends only on the external quark momentum, but not on the integration variable, which is carried by the gluon, and so it has a simple continuum limit. The integral is rather easy to compute, and the result is
\[
\begin{aligned}
I &= \int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\,G_{\nu\nu}(k)\cdot g_0^2\,O^{(2)}_{\mu\nu}(p)\\
&= \int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\,\frac{a^2}{4\sum_\lambda\sin^2\frac{ak_\lambda}{2}}\;g_0^2C_F\Bigl(-\frac{a}{2}\,i\gamma_\mu\sin ap_\nu\Bigr)\\
&= g_0^2C_F\Bigl(-\frac{1}{2}\,i\gamma_\mu\,\frac{\sin ap_\nu}{a}\Bigr)\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\,\frac{1}{4\sum_\lambda\sin^2\frac{k_\lambda}{2}}\\
&= -\frac{1}{2}\,g_0^2C_F\,i\gamma_\mu p_\nu\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\,\frac{1}{4\sum_\lambda\sin^2\frac{k_\lambda}{2}}.
\end{aligned}
\tag{15.109}
\]
Although this diagram comes from $O(a^2A^2)$ terms in the action which vanish in the naive continuum limit, the gluon loop gives an extra factor of $1/a^2$ and so these contributions survive. So, this diagram is not zero, and we can see that it is also not divergent. This finite integral is encountered very frequently in lattice calculations. As it is one of the most basic quantities of perturbation theory, it is taken as a fundamental constant, called $Z_0$, in the algebraic method (see Section 18). This constant can in principle be computed with arbitrary precision. It is now known with a very high precision of about 400 significant decimal places, as we will see in Section 19. For our calculation it is enough to know that
\[
\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\,\frac{1}{4\sum_\lambda\sin^2\frac{k_\lambda}{2}} = Z_0 = 0.15493339,
\tag{15.110}
\]
so that the value of the operator tadpole is
\[
I = -\frac{g_0^2}{16\pi^2}\,C_F\,i\gamma_\mu p_\nu\cdot 12.233050.
\tag{15.111}
\]
Of course the symmetrization of this result is trivial. The computation of more complicated operator tadpoles, which can be done in an exact way (that is, expressing the results in terms of only two integrals known with arbitrary precision) using the algebraic method, is discussed in Section 18.
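The value of $Z_0$ in four dimensions can be reproduced with elementary numerics. The sketch below is not the algebraic method of Section 18; it is an independent check that assumes the Schwinger-parameter representation $1/D = \int_0^\infty dt\,e^{-tD}$, which factorizes the four-dimensional integral into $Z_0 = \int_0^\infty dt\,[e^{-2t}I_0(2t)]^4$. The cutoff `t_max`, the grid sizes and the asymptotic tail correction are choices made for this example.

```python
import math

def bessel_factor(t, n=96):
    """e^{-2t} I_0(2t) = (1/pi) int_0^pi exp(-2t (1 - cos k)) dk, trapezoidal rule."""
    h = math.pi / n
    total = 0.5 * (1.0 + math.exp(-4.0 * t))  # endpoints k = 0 and k = pi
    for j in range(1, n):
        total += math.exp(-2.0 * t * (1.0 - math.cos(j * h)))
    return total / n

def z0(t_max=50.0, steps=5000):
    """Simpson integration of the fourth power on [0, t_max] plus the large-t tail."""
    h = t_max / steps
    acc = bessel_factor(0.0) ** 4 + bessel_factor(t_max) ** 4
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        acc += weight * bessel_factor(i * h) ** 4
    body = acc * h / 3.0
    # For large t the integrand behaves like (1 + 1/(4t)) / (16 pi^2 t^2).
    tail = (1.0 / t_max + 1.0 / (8.0 * t_max ** 2)) / (16.0 * math.pi ** 2)
    return body + tail

z0_value = z0()
tadpole_constant = 0.5 * 16.0 * math.pi ** 2 * z0_value  # factor multiplying g0^2/(16 pi^2)

print(z0_value, tadpole_constant)
```

With these settings the result agrees with the quoted $Z_0$ to several decimal places, and multiplying by $16\pi^2/2$ gives the constant appearing in Eq. (15.111).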
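15.4.5. Quark self-energy (sunset diagram)

The zero-momentum part for the sunset diagram of the quark self-energy is:
\[
\begin{aligned}
J &= \int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\sum_\rho G_{\rho\rho}(p-k)\cdot\bigl[V_\rho(k,p)\cdot S(k)\cdot V_\rho(p,k)\bigr]\Big|_{ap=0}\\
&= \frac{g_0^2}{a}\,C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\Bigl[\frac{1}{2W} + \frac{2a\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2}\Bigr]\Bigl[N_\rho + i\gamma_\rho M_\rho + \frac{ap_\rho}{2}(M_\rho - i\gamma_\rho N_\rho)\Bigr]\\
&\qquad\times\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\Bigl[N_\rho + i\gamma_\rho M_\rho + \frac{ap_\rho}{2}(M_\rho - i\gamma_\rho N_\rho)\Bigr],
\end{aligned}
\tag{15.112}
\]
where we have already rescaled the integration variable. After combining the various factors of $a$ coming from the propagator and the vertices, as well as from the rescaling of $k$, we are left with an overall factor $1/a$. This means that we have to keep all terms of order $ap$. We have then
\[
\begin{aligned}
J &= \frac{g_0^2}{a}\,C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\frac{1}{2W}\,(N_\rho+i\gamma_\rho M_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(N_\rho+i\gamma_\rho M_\rho)\\
&\quad + g_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\Biggl\{\frac{2\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2}\,(N_\rho+i\gamma_\rho M_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(N_\rho+i\gamma_\rho M_\rho)\\
&\qquad + \frac{1}{2W}\,\frac{p_\rho}{2}\Bigl[(M_\rho-i\gamma_\rho N_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(N_\rho+i\gamma_\rho M_\rho) + (N_\rho+i\gamma_\rho M_\rho)\,\frac{-i\not\Gamma+W}{\Gamma^2+W^2}\,(M_\rho-i\gamma_\rho N_\rho)\Bigr]\Biggr\},
\end{aligned}
\tag{15.113}
\]
which gives
\[
\begin{aligned}
J &= \frac{g_0^2}{a}\,C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\frac{1}{2W(\Gamma^2+W^2)}\bigl(N_\rho^2W - \gamma_\rho^2M_\rho^2W + (\gamma_\rho\not\Gamma + \not\Gamma\gamma_\rho)N_\rho M_\rho\bigr)\\
&\quad + g_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\Biggl\{\frac{2\sum_\lambda p_\lambda\Gamma_\lambda}{(2W)^2(\Gamma^2+W^2)}\bigl(-i\not\Gamma N_\rho^2 + i\gamma_\rho\not\Gamma\gamma_\rho M_\rho^2 + 2i\gamma_\rho N_\rho M_\rho W\bigr)\\
&\qquad + \frac{1}{2W(\Gamma^2+W^2)}\,\frac{p_\rho}{2}\bigl(2i\gamma_\rho(M_\rho^2-N_\rho^2)W - 2i\not\Gamma N_\rho M_\rho - 2i\gamma_\rho\not\Gamma\gamma_\rho N_\rho M_\rho\bigr)\Biggr\},
\end{aligned}
\tag{15.114}
\]
where again we have dropped terms which are odd in $k$. Here we have also kept the contribution proportional to $1/a$ because, contrary to vertex and sails, it does not vanish. In fact, this is a very important quantity for Wilson fermions. It contributes to the linearly divergent $\Sigma_0/a$ term in the 1-loop self-energy$^{68}$
\[
\frac{g_0^2}{16\pi^2}\Bigl(\frac{\Sigma_0}{a} + i\not p\,\Sigma_1 + m_0\Sigma_2\Bigr),
\tag{15.115}
\]

68 Of course, since in our example we are doing the computations using a massless fermion propagator, we do not obtain the factor $\Sigma_2$.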
which is due to the breaking of chiral symmetry for Wilson fermions (in fact it is proportional to r, and vanishes for naive fermions). It gives the critical mass to 1 loop: mc = T 0 :
(15.116)
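We can see that after reduction there are no Dirac matrices in the contribution to $m_c$ coming from this diagram. The corresponding integral is finite and given by
\[
\begin{aligned}
m_c^{(a)} &= g_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\sum_\rho\frac{1}{2W(\Gamma^2+W^2)}\bigl((N_\rho^2-M_\rho^2)W + \Gamma_\rho^2\bigr)\\
&= g_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\Biggl[-\frac{\sum_\rho\cos k_\rho}{2\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)}
+ \frac{\sum_\rho\sin^2k_\rho}{4\bigl(\sum_\lambda\sin^2\frac{k_\lambda}{2}\bigr)\bigl(\sum_\lambda\sin^2k_\lambda + (2\sum_\lambda\sin^2\frac{k_\lambda}{2})^2\bigr)}\Biggr]\\
&= -\frac{g_0^2}{16\pi^2}\,C_F\cdot 2.502511.
\end{aligned}
\tag{15.117}
\]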
This is the contribution to the critical mass coming from the sunset diagram of the self-energy. Another contribution to it, m(b) c , comes from the tadpole and we will compute it soon. Let us now turn to the rest of the expression (15.114), which contributes to the renormalization of the operator that we are studying. From now on J will only denote the a = 0 part of the zero-momentum expression, in which after reduction there is one Dirac matrix, and which gives T1 . We have then d 2 p I d k 2 2 2 2 J = g 0 CF (−iI , (N< + M< ) + 2i< I< M< + i< I< W ) 2 (I2 + W 2 ) d (2W ) − (2) < 1 2 2 2 (15.118) + p< (i< (M< − N< )W − i< I< ) 2W (I2 + W 2 ) d ip d k 2 2 2 2 2 2 2 , −2I) (N< + M< ) + 4 I< M< + 2I) W = g0 CF d (2W )2 (I2 + W 2 ) − (2) < <
+
ip , ((M)2 − N)2 )W − I)2 ) 2W (I2 + W 2 )
In the last passage we have used the substitution f (k) ; p f (k) = p ,
:
(15.119)
(15.120)
since this kind of integrals does not depend on the direction, with the understanding that the index is *xed and must not appear in the rest of the monomial. This reconstructs the factor ip , and
allows the extraction of the value of T1 . We thus have dd k cos k) 2 J = g0 CF ip , 2 d 2( sin k + (2 sin2 k =2)2 )2 − (2) −(4 − 2j) sin2 k) + 2 < sin2 k< cos2 k< =2 : + 8( sin2 k =2)2 ( sin2 k + (2 sin2 k =2)2 )2 The lattice result at zero momentum is then 1 g02 2 2 + log a J = ip C − F + log 4 + 4:411364 0 , 162 F j 1 g02 2 2 + log a = ip C + 2:573163 : , 162 F j We must still consider the continuum (I − J ): ∞ 4 − 2j d k 1 −ik, I − J = −g02 CF < 4−2j (p − k)2 < k 2 (2) −∞ p2 1 g02 + log − = ip C + − log 4 − 1 E , 162 F 2 j p2 1 g02 = ip , 162 CF − j + log 2 − 2:953809 :
(15.121)
(15.122)
(15.123)
The final result for the $i\not p$ part of the sunset diagram of the lattice self-energy is then
\[
\Sigma_1^{(a)} = \frac{g_0^2}{16\pi^2}\,C_F\bigl(\log a^2p^2 - 0.380646\bigr).
\tag{15.124}
\]
This term, together with the analogous term coming from the self-energy tadpole which we will compute soon, needs to be added to the results of the proper diagrams (vertex, sails and operator tadpole). In fact this is the wave function renormalization, which when inserted into the tree-level diagram for the operator gives a leg correction:
\[
i\not p\,\Sigma_1\cdot S(p)\cdot i\gamma_\mu p_\nu = i\not p\,\Sigma_1\cdot\frac{-i\not p}{p^2}\cdot i\gamma_\mu p_\nu = i\gamma_\mu p_\nu\,\Sigma_1.
\tag{15.125}
\]
If a mass regularization is used one gets
\[
\begin{aligned}
J_m &= i\not p\,\frac{g_0^2}{16\pi^2}\,C_F\bigl(\log a^2m^2 + \gamma_E - F_0 + 5.411364\bigr)\\
&= i\not p\,\frac{g_0^2}{16\pi^2}\,C_F\bigl(\log a^2m^2 + 1.619354\bigr),
\end{aligned}
\tag{15.126}
\]
and
\[
(I-J)_m = i\not p\,\frac{g_0^2}{16\pi^2}\,C_F\Bigl(\log\frac{p^2}{m^2} - 2\Bigr).
\tag{15.127}
\]
15.4.6. Quark self-energy (tadpole diagram)

The last diagram, which completes the lattice self-energy and has no analog in the continuum, is the self-energy tadpole, which originates from the irrelevant vertex in Eq. (5.78). This vertex, when put in the tadpole diagram, depends only on the external momentum. The diagram is then$^{69}$
\[
\begin{aligned}
I &= \frac{1}{2}\int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\sum_\rho G_{\rho\rho}(k)\cdot(V_2^{aa})_{\rho\rho}(p,p)\\
&= \frac{1}{2}\int_{-\pi/a}^{\pi/a}\frac{d^dk}{(2\pi)^d}\,\frac{a^2}{4\sum_\lambda\sin^2\frac{ak_\lambda}{2}}\,\Bigl(-\frac{1}{2}\,ag_0^2\sum_a\{T^a,T^a\}_{cc}\Bigr)\sum_\rho\bigl(-i\gamma_\rho\sin ap_\rho + \cos ap_\rho\bigr)\\
&= -\frac{1}{2}\,g_0^2C_F\int_{-\pi}^{\pi}\frac{d^dk}{(2\pi)^d}\,\frac{1}{4\sum_\lambda\sin^2\frac{k_\lambda}{2}}\,\Bigl(-i\not p + \frac{4}{a}\Bigr)\\
&= -\frac{1}{2}\,g_0^2C_F\,Z_0\Bigl(-i\not p + \frac{4}{a}\Bigr).
\end{aligned}
\tag{15.128}
\]
The last term diverges like $1/a$, and therefore is part of $\Sigma_0$. In fact it gives a substantial contribution to the critical mass:
\[
m_c^{(b)} = -g_0^2C_F\cdot 2Z_0 = -\frac{g_0^2}{16\pi^2}\,C_F\cdot 48.932201.
\tag{15.129}
\]
The 1-loop critical mass for Wilson fermions is then:$^{70}$
\[
m_c = m_c^{(a)} + m_c^{(b)} = -\frac{g_0^2}{16\pi^2}\,C_F\cdot 51.434712 = -g_0^2C_F\cdot 0.325714.
\tag{15.130}
\]
The term proportional to $i\not p$ gives the contribution of the self-energy tadpole to the renormalization of the first moment operator:
\[
\Sigma_1^{(b)} = \frac{g_0^2}{16\pi^2}\,C_F\cdot 12.233050.
\tag{15.131}
\]
The total result for the 1-loop self-energy on the lattice in the massless case is then
\[
\Sigma_1 = \Sigma_1^{(a)} + \Sigma_1^{(b)} = \frac{g_0^2}{16\pi^2}\,C_F\bigl(\log a^2p^2 + 11.852404 + (1-\alpha)(-\log a^2p^2 + 4.792010)\bigr),
\tag{15.132}
\]
where we have also included the part which one would obtain if the calculation were carried out in a general covariant gauge.
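The numbers in Eqs. (15.129) to (15.132) follow from $Z_0$ and from the sunset results by simple arithmetic. The following sketch reproduces them; the variable names are illustrative, and all numerical inputs are the values quoted in the text.

```python
import math

Z0 = 0.15493339                  # Eq. (15.110)
LOOP = 16 * math.pi ** 2         # converts g0^2 to g0^2 / (16 pi^2) units

# Critical mass, in units of g0^2 C_F / (16 pi^2) (and of 1/a).
mc_sunset = -2.502511            # Eq. (15.117)
mc_tadpole = -2.0 * Z0 * LOOP    # Eq. (15.129)
mc_total = mc_sunset + mc_tadpole
mc_total_plain = mc_total / LOOP  # coefficient of g0^2 C_F

# Wave-function renormalization constant in the Feynman gauge.
sigma1_sunset = -0.380646        # finite part of Eq. (15.124)
sigma1_tadpole = 0.5 * Z0 * LOOP  # Eq. (15.131)
sigma1_total = sigma1_sunset + sigma1_tadpole

print(round(mc_tadpole, 6), round(mc_total, 6), round(mc_total_plain, 6))
print(round(sigma1_tadpole, 6), round(sigma1_total, 6))
```

The tadpole dominates both the critical mass and the finite part of $\Sigma_1$, which is the pattern discussed later under the name of tadpole dominance.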
15.4.7. Concluding remarks

We have thus computed all diagrams which are necessary for the 1-loop renormalization of the operator $O_{\{\mu\nu\}} = \bar\psi\gamma_{\{\mu}D_{\nu\}}\psi/2$, with different indices. Collecting the contributions of the various diagrams, we have the final result (Capitani, 2001a)
\[
\langle q|O_{\{01\}}|q\rangle\big|_{1\ \mathrm{loop}} = \frac{1}{2}\,i(\gamma_0p_1 + \gamma_1p_0)\cdot\frac{g_0^2}{16\pi^2}\,C_F\Bigl(\frac{8}{3}\log a^2p^2 - 3.16486\Bigr).
\tag{15.133}
\]

69 Note that we have used $\lim_{a\to 0}\sum_\rho\cos ap_\rho = 4$.

70 We will see later that the 1-loop critical mass can be expressed in terms of only two constants (Eq. (18.117)), which are calculable with very high precision. Its 2-loop value is given at the end of Section 19.
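This allows us to specify Eq. (3.1) for this operator, with $R^{\mathrm{lat}}_{ij} = -3.16486\cdot C_F$. For the 1-loop matching to the $\overline{\mathrm{MS}}$ scheme one also needs to know the factor $R^{\overline{\mathrm{MS}}}_{ij}$ in Eq. (3.2). Its value is $-40/9\cdot C_F$. It can be inferred from our previous calculations of the various $I-J$ integrals, by summing the contributions of vertex, sails and the sunset diagram of the self-energy (which are $5/9$, $-4$ and $-1$ respectively). We then obtain for this operator
\[
\langle q|O^{\mathrm{lat}}_{\{01\}}|q\rangle = \Bigl(1 + \frac{g_0^2}{16\pi^2}\,C_F\Bigl(\frac{8}{3}\log a^2p^2 - 3.16486\Bigr)\Bigr)\cdot\langle q|O^{\mathrm{tree}}_{\{01\}}|q\rangle,
\tag{15.134}
\]
\[
\langle q|O^{\overline{\mathrm{MS}}}_{\{01\}}|q\rangle = \Bigl(1 + \frac{g^2}{16\pi^2}\,C_F\Bigl(\frac{8}{3}\log\frac{p^2}{\mu^2} - \frac{40}{9}\Bigr)\Bigr)\cdot\langle q|O^{\mathrm{tree}}_{\{01\}}|q\rangle,
\tag{15.135}
\]
so that the renormalization factor that converts the raw lattice results (in the Wilson formulation) to the $\overline{\mathrm{MS}}$ scheme is (for $\mu = 1/a$)
\[
\langle q|O^{\overline{\mathrm{MS}}}_{\{01\}}|q\rangle = \Bigl(1 - \frac{g_0^2}{16\pi^2}\,C_F\cdot 1.27958\Bigr)\cdot\langle q|O^{\mathrm{lat}}_{\{01\}}|q\rangle.
\tag{15.136}
\]
For the typical value $g_0 = 1$ (which corresponds to a scale $1/a$ of about 2 GeV in the quenched approximation) we then obtain
\[
\langle q|O^{\overline{\mathrm{MS}}}_{\{01\}}|q\rangle = 0.98920\cdot\langle q|O^{\mathrm{lat}}_{\{01\}}|q\rangle.
\tag{15.137}
\]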
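The bookkeeping that leads from the individual diagrams to Eqs. (15.133) to (15.137) can be summarized in a short script. This is only an arithmetic sketch: the dictionaries collect the finite constants and logarithm coefficients quoted above for each diagram, and $N_c = 3$ (so $C_F = 4/3$) is assumed for the final number.

```python
import math

# Coefficients of log(a^2 p^2) and finite constants, in units of g0^2 C_F / (16 pi^2).
log_coeff = {"vertex": -1 / 3, "sails": 2.0, "operator tadpole": 0.0, "self-energy": 1.0}
finite = {"vertex": 2.293052, "sails": -5.077267,
          "operator tadpole": -12.233050, "self-energy": 11.852404}

gamma_total = sum(log_coeff.values())   # expected 8/3
r_lat = sum(finite.values())            # expected -3.16486

# Continuum constants of vertex, sails and self-energy sunset.
r_msbar = 5 / 9 - 4 - 1                 # expected -40/9

# Matching at mu = 1/a, where the logarithms cancel.
delta = r_msbar - r_lat                 # expected -1.27958
C_F = 4 / 3                             # assumption: SU(3)
g0 = 1.0
matching_factor = 1 + g0 ** 2 / (16 * math.pi ** 2) * C_F * delta

print(round(gamma_total, 6), round(r_lat, 5), round(delta, 5), round(matching_factor, 5))
```

The operator tadpole and the self-energy tadpole cancel exactly in `r_lat`, so the small net constant comes from the vertex, the sails and the sunset diagram alone.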
We would now like to make some comments on these lattice calculations. First we notice that the magnitude of the self-energy tadpole is much larger than the result for the vertex and the sails, and this is a general feature of lattice calculations, which is called "tadpole dominance" of perturbation theory. It has been in some cases used to estimate the value of the radiative corrections of matrix elements, by neglecting all diagrams other than the tadpole. However, sometimes an uncritical application of tadpole dominance can be misleading. For example, in the previous calculations we have encountered a situation in which the operator tadpole exactly cancels the self-energy tadpole, so that the final result for the matrix element is given only by the contributions of the vertex and the sails (plus the sunset diagram of the self-energy). This cancellation only happens for the first moment. For higher moments, when the operator has $n$ covariant derivatives with distinct indices the result of the operator tadpole is $nZ_0/2$ (see also Section 18.3).$^{71}$ So, the final result has even a different sign from the one that would be inferred from computing the self-energy tadpole alone.

As discussed in Section 14, there exists also another class of operators which measure the first moment of the quark momentum distribution and which belong to another representation of the hypercubic group:
\[
O_{00} - \tfrac{1}{3}(O_{11} + O_{22} + O_{33}).
\tag{15.138}
\]

71 This result is only valid for $n$ between 2 and 4. For higher $n$ it is impossible to have all indices distinct, and the contractions of the $A_\mu$'s become more complicated and do not give just $nZ_0/2$. The numerical result for $n \geq 5$ is however not far from this number, and one has again a final positive result when all diagrams are summed, with dominance of the operator tadpole and not of the self-energy tadpole.
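Due to the breaking of Lorentz invariance, on the lattice this operator has a renormalization constant different from $O_{\{01\}}$. We leave as an exercise for the interested reader to reproduce the final 1-loop result$^{72}$
\[
\Bigl\langle q\Bigl|O_{00} - \frac{1}{3}\sum_{i=1}^{3}O_{ii}\Bigr|q\Bigr\rangle\Big|_{1\ \mathrm{loop}} = \frac{1}{2}\,i\Bigl(\gamma_0p_0 - \frac{1}{3}\sum_{i=1}^{3}\gamma_ip_i\Bigr)\times\frac{g_0^2}{16\pi^2}\,C_F\Bigl(\frac{8}{3}\log a^2p^2 - 1.88259\Bigr),
\tag{15.139}
\]
which gives the 1-loop matching factor to the $\overline{\mathrm{MS}}$ scheme
\[
\Bigl\langle q\Bigl|\Bigl(O_{00} - \frac{1}{3}\sum_{i=1}^{3}O_{ii}\Bigr)^{\overline{\mathrm{MS}}}\Bigr|q\Bigr\rangle = 0.97837\cdot\Bigl\langle q\Bigl|\Bigl(O_{00} - \frac{1}{3}\sum_{i=1}^{3}O_{ii}\Bigr)^{\mathrm{lat}}\Bigr|q\Bigr\rangle.
\tag{15.140}
\]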
We want to conclude this section by mentioning that in the case of a calculation with the improved action we have to add the Sheikholeslami–Wohlert improved vertex of Eq. (11.10). Moreover, the operators have to be improved. This renders the calculations much more cumbersome. Since the Sheikholeslami–Wohlert vertex includes a $\sigma_{\mu\nu}$ matrix, more Dirac matrices appear in the manipulations. In the case in which both vertices in the vertex function are taken to be the improved ones, the chains of Dirac matrices can become quite long. The improvement of the operators, in the case that we have just calculated, means that we must also consider the contribution to the renormalization constants coming from the operators in Eq. (11.15). These calculations are quite complicated, and the results are given in (Capitani et al., 2001b).

15.5. Example of overlap results

To follow an overlap calculation step by step in the same way as we did for the Wilson case would be quite cumbersome. We give here, as an example, the analytic expressions for the tadpole of the self-energy of the quark, $\Sigma_1^{(b)}$, and for the vertex of the scalar current $\bar\psi\psi$ (in the Feynman gauge, for $r = 1$). In order to be able to write them in a compact form it is convenient to introduce further abbreviations:
\[
B = b(k) = 2\sum_\lambda\sin^2\frac{k_\lambda}{2} - \rho,
\tag{15.141}
\]
\[
D = 2\rho\,(\omega(k) + b(k)),
\tag{15.142}
\]
\[
A = \frac{\omega^2(k)}{\rho^2} = 1 - \frac{4}{\rho}\sum_\lambda\sin^2\frac{k_\lambda}{2} + \frac{1}{\rho^2}\Bigl(\sum_\lambda\sin^2k_\lambda + \Bigl(2\sum_\lambda\sin^2\frac{k_\lambda}{2}\Bigr)^2\Bigr),
\tag{15.143}
\]

72 It might be useful to stress that the numerical results for all individual diagrams are the same as in the calculation with different indices, except for the vertex diagram, in which the number for the finite part is now 3.575320.
with !(k) given in Eq. (8.21). The 1-loop tadpole of the quark self-energy for overlap fermions is then given by 4 d k 4 1 1 2 d4 k 2 √ + g0 G (k) 1 − G (k) g0 4 4 2 2 2 < 2 < (1 + A)2
√ 1 2+ A 2 2 2 2 2 2 2 2 M + N + 1 + √ × (−I + B(M − N )) + √ (−B(M − N ) + I ) A < A 4 d k 2 1 √ + g02 G (k) 2 24 < (1 + A)2
1 −2I2 N2 + 1 + √ (2I2 (B + 2M 2 − M2 )) : × (15.144) A The *rst term comes from the part of the overlap vertex V2 in Eq. (8.31) containing W2 and W2† and its value for < = 1 is quite large, 4 g2 1 2 g0 · Z0 1 − = − 0 2 36:69915 ; (15.145) 2 < <=1 16 while the 1-loop result for the whole self-energy tadpole in Eq. (15.144) is slightly smaller, but still large: −
\[ \frac{g_0^2}{16\pi^2}\; 23.35975 . \tag{15.146} \]
Adding the value $-14.27088\, g_0^2/(16\pi^2)$ for the sunset diagram of the overlap self-energy of the quark, which is much harder to compute by hand and would produce a very lengthy analytic expression, gives the result $-37.63063\, g_0^2/(16\pi^2)$ for the complete self-energy in the Feynman gauge, for $\rho = 1$ (Alexandrou et al., 2000a, b; Capitani, 2001a). The values of the overlap self-energy for various choices of $\rho$ in a general covariant gauge are given in Table 1. One advantage of using overlap fermions is that the power-divergent part of the self-energy, $\Sigma_0$, which for Wilson fermions gives a nonzero additive mass renormalization, vanishes. The results for the 1-loop vertex diagram of the scalar operator$^{73}$ can be written in the form
4 d k I2 1 1 1 2 √ ·Y (15.147) − 2 + 2 ·X + G(p − k) g0 24 D 4<
1 1 2 2 2 2 2 2 2 2 2 X= −(M − N ) + √ 2(M + N ) + 2 ((I − B )(M − N ) + 2BI ) ;
1 1 2 2 2 2 2 2 2 2 2 √ Y= I + (15.148) 2I (M + N ) + 2 I (−2B(M − N ) + I − B ) :
73 We note that the sails are not present in this case, as there are no covariant derivatives.
Table 1. Results for the finite constant of the quark self-energy with overlap fermions in a general covariant gauge, from (Capitani, 2001a). We have used the abbreviation $\bar\alpha = 1 - \alpha$. The first column (sunset) refers to the diagram on the left in Fig. 18, and the second column (tadpole) to the diagram on the right.

$\rho$   Self-energy (sunset)               Self-energy (tadpole)              Total self-energy
0.2   −27.511695 + 11.911596 $\bar\alpha$   −213.087934 − 7.119586 $\bar\alpha$   −240.599629 + 4.792010 $\bar\alpha$
0.3   −23.687573 + 11.098129 $\bar\alpha$   −131.723110 − 6.306119 $\bar\alpha$   −155.410693 + 4.792010 $\bar\alpha$
0.4   −21.172454 + 10.520210 $\bar\alpha$    −91.817537 − 5.728200 $\bar\alpha$   −112.989991 + 4.792010 $\bar\alpha$
0.5   −19.337313 + 10.071356 $\bar\alpha$    −68.315503 − 5.279346 $\bar\alpha$    −87.652816 + 4.792010 $\bar\alpha$
0.6   −17.912921 +  9.704142 $\bar\alpha$    −52.931363 − 4.912132 $\bar\alpha$    −70.844284 + 4.792010 $\bar\alpha$
0.7   −16.760616 +  9.393275 $\bar\alpha$    −42.140608 − 4.601265 $\bar\alpha$    −58.901224 + 4.792010 $\bar\alpha$
0.8   −15.800204 +  9.123666 $\bar\alpha$    −34.193597 − 4.331656 $\bar\alpha$    −49.993801 + 4.792010 $\bar\alpha$
0.9   −14.981431 +  8.885590 $\bar\alpha$    −28.125054 − 4.093580 $\bar\alpha$    −43.106485 + 4.792010 $\bar\alpha$
1.0   −14.270881 +  8.672419 $\bar\alpha$    −23.359746 − 3.880409 $\bar\alpha$    −37.630627 + 4.792010 $\bar\alpha$
1.1   −13.645294 +  8.479438 $\bar\alpha$    −19.534056 − 3.687428 $\bar\alpha$    −33.179350 + 4.792010 $\bar\alpha$
1.2   −13.087876 +  8.303183 $\bar\alpha$    −16.407174 − 3.511173 $\bar\alpha$    −29.495050 + 4.792010 $\bar\alpha$
1.3   −12.586126 +  8.141044 $\bar\alpha$    −13.813486 − 3.349034 $\bar\alpha$    −26.399612 + 4.792010 $\bar\alpha$
1.4   −12.130497 +  7.991018 $\bar\alpha$    −11.635482 − 3.199008 $\bar\alpha$    −23.765979 + 4.792010 $\bar\alpha$
1.5   −11.713524 +  7.851554 $\bar\alpha$     −9.787582 − 3.059544 $\bar\alpha$    −21.501106 + 4.792010 $\bar\alpha$
1.6   −11.329238 +  7.721442 $\bar\alpha$     −8.206069 − 2.929432 $\bar\alpha$    −19.535307 + 4.792010 $\bar\alpha$
1.7   −10.972744 +  7.599750 $\bar\alpha$     −6.842630 − 2.807740 $\bar\alpha$    −17.815374 + 4.792010 $\bar\alpha$
1.8   −10.639905 +  7.485778 $\bar\alpha$     −5.660084 − 2.693768 $\bar\alpha$    −16.299989 + 4.792010 $\bar\alpha$
1.9   −10.327042 +  7.379023 $\bar\alpha$     −4.629539 − 2.587013 $\bar\alpha$    −14.956581 + 4.792010 $\bar\alpha$
These are the only diagrams that the author has had the patience to calculate by hand using the overlap action (Capitani, 2001a). Other overlap calculations of renormalization factors, made using FORM codes, are reported there and in Capitani (2001b) and Capitani and Giusti (2000, 2001). The renormalization constants of operators computed using overlap fermions are sometimes large, and in general larger than the corresponding Wilson results (Capitani, 2001a, b). For example, for the first moment of the unpolarized quark distribution (operator $O_{\{01\}}$) the constant is −53.25571, while (as we have just seen) in the Wilson case it is −3.16486. However, for the proper diagrams the overlap results show only relatively small differences from the Wilson numbers. The biggest contribution to the renormalization constant comes from the operator tadpole, and it is exactly the same for overlap and Wilson fermions. The difference between overlap and Wilson results is then almost entirely due to the quark self-energy. In the Feynman gauge, the constant of the finite part of the self-energy in the overlap case (for $\rho = 1$) is −37.63063, while for Wilson fermions it is +11.85240; their difference is a large number, −49.48303, quite close to the difference of the total renormalization constants for the two kinds of fermions, −50.09085. We notice from Table 1 that the magnitude of the overlap self-energy decreases when $\rho$ increases, and if one considered overlap fermions with $\rho = 1.9$, the difference between the self-energies would go from −49.48303 down to −26.80898. However, since the quark propagator becomes singular for $\rho = 2$, simulations would become more expensive when approaching this value.
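The arithmetic behind the comparison above can be checked directly. The following is a minimal sketch that only recombines numbers quoted in the text and in Table 1 (all in units of $g_0^2/(16\pi^2)$, Feynman gauge); the variable names are illustrative.

```python
# All numbers are finite constants in units of g0^2 / (16 pi^2), Feynman gauge.
sunset_rho1 = -14.270881        # Table 1, rho = 1.0, sunset
tadpole_rho1 = -23.359746       # Table 1, rho = 1.0, tadpole
total_rho19 = -14.956581        # Table 1, rho = 1.9, total
wilson_self_energy = 11.85240   # Wilson value quoted in the text

# Sum of the two overlap self-energy diagrams at rho = 1.
total_rho1 = sunset_rho1 + tadpole_rho1

# Overlap minus Wilson self-energy, for rho = 1 and rho = 1.9.
diff_rho1 = total_rho1 - wilson_self_energy
diff_rho19 = total_rho19 - wilson_self_energy

# Overlap minus Wilson total constant for the operator O_{01}.
diff_z = -53.25571 - (-3.16486)

print(f"{total_rho1:.5f} {diff_rho1:.5f} {diff_rho19:.5f} {diff_z:.5f}")
```

The self-energy difference accounts for all but about 0.6 of the difference between the total constants, which is the sense in which the gap is "almost entirely" due to the quark self-energy.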
We note that, contrary to the Wilson case, where the tadpole gives a much larger contribution than the sunset, in the overlap case things are more entangled, and for certain values of $\rho$ it is the sunset that gives a larger result than the tadpole.

15.6. Tadpole improvement

We have seen that gluon tadpoles give large numerical results compared to other diagrams (although sometimes they happen to cancel with one another). Quite often the large corrections which occur in lattice perturbation theory are caused by these contributions. Since they are an artifact of the lattice (the corresponding interaction vertex is zero in the naive continuum limit), a tadpole resummation method has been proposed in Parisi (1980) and Lepage and Mackenzie (1993) in an attempt to bring the lattice perturbative expansions closer to the continuum ones. The lectures of Mackenzie (1995) contain a pedagogical introduction to these ideas. This resummation of tadpoles amounts to a mean-field improvement, in which one redefines the links, separating out the contributions of their infrared modes:

\[ U_\mu(x) = e^{i g_0 a A_\mu(x)} = u_0\, e^{i g_0 a A^{IR}_\mu(x)} = u_0\, \tilde U_\mu(x) . \tag{15.149} \]

The rescaling factor $u_0$ (which is a number between 0 and 1) contains the high-energy part of the link variables. One makes in the gluon action the substitution

\[ U_\mu(x) = u_0\, \tilde U_\mu(x) , \tag{15.150} \]

and takes $\tilde U_\mu(x)$ as the new link variable. This implies $\tilde\beta = u_0^4\, \beta$. The effective coupling constant then becomes $\tilde g_0^2 = g_0^2/u_0^4$, and is claimed to be closer to the coupling constant defined in the $\overline{\mathrm{MS}}$ scheme than the original lattice coupling. In fact, the perturbative expansions in terms of $\tilde g_0$ often have smaller coefficients and are better behaved than the standard perturbative expansions. The value of the rescaling factor $u_0$ is taken from the mean value of the link in the Landau gauge,

\[ u_0 = \Big\langle \frac{1}{N_c}\, \mathrm{Re}\, \mathrm{Tr}\, U_\mu(x) \Big\rangle , \tag{15.151} \]

because in this gauge its value is higher (in other gauges $u_0$ can take very small values). However, the fourth root of the mean value of the plaquette,

\[ u_P = \Big\langle \frac{1}{N_c}\, \mathrm{Re}\, \mathrm{Tr}\, P_{\mu\nu}(x) \Big\rangle^{1/4} , \tag{15.152} \]

is easier to compute and is therefore taken in place of $u_0$ in most applications, although one should always keep in mind that this is a good approximation only for small lattice spacings. When $a \sim 0.4$ fm the two definitions already differ by about 10%. In this case, if one sticks to the plaquette definition, every perturbative quantity should be computed at least to 2 loops. There is then a small ambiguity in this choice. In the resummed theory one works with tadpole-improved actions and operators. Of course all quantities that contain a field $U$ have to be rescaled accordingly, and will be multiplied by some positive or negative power of $u_0$. This also applies to the covariant derivatives appearing in operators measuring moments of structure functions. When all variables are properly rescaled the large tadpole contributions can be reabsorbed, and after this tadpole improvement has been implemented the coefficients in the results of perturbative lattice calculations generally become smaller.
The above construction can be seen as a different choice of the gauge coupling (and of other parameters), and is equivalent to a reorganization of lattice perturbation theory. In the end, this is a theory in which a redefinition of the coupling constant takes place, which implies a different value of the $\Lambda$ parameter.$^{74}$ One should therefore be careful and do things consistently, because one is effectively changing scheme. This would not matter if one knew the complete series, but at the lowest orders it makes a difference. It is however difficult to estimate the error that results from this. Moreover, this different definition of the coupling constant does not always give a better perturbative expansion. In some cases the contributions get bigger. It is also possible to set up tadpole improvement on Symanzik-improved theories, rescaling the affected quantities with appropriate powers of $u_0$. For example, the improvement coefficient $c_{sw}$ undergoes the rescaling $\tilde c_{sw} = u_0^3\, c_{sw}$. The coefficients of the Lüscher–Weisz improved gauge action are modified as follows:

\[ c_0 = \frac{5}{3 u_0^4} , \qquad c_1 = -\frac{1}{12 u_0^6} . \tag{15.153} \]
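The rescalings collected in this subsection can be summarized in a few lines of code. This is only a sketch under stated assumptions: the function name and the input plaquette value are hypothetical, the plaquette definition $u_P$ of Eq. (15.152) is used in place of the Landau-gauge link, and the relations are taken literally from the text ($\tilde g_0^2 = g_0^2/u_0^4$, $\tilde c_{sw} = u_0^3 c_{sw}$, and Eq. (15.153)).

```python
def tadpole_rescaling(g0_sq, mean_plaquette, c_sw=1.0):
    """Rescaled parameters for a given bare coupling and mean plaquette.

    mean_plaquette is a hypothetical measured value of <(1/Nc) Re Tr P>,
    expected to lie in (0, 1].
    """
    if not 0.0 < mean_plaquette <= 1.0:
        raise ValueError("mean plaquette must lie in (0, 1]")
    u0 = mean_plaquette ** 0.25          # plaquette definition, Eq. (15.152)
    return {
        "u0": u0,
        "g0_sq_tilde": g0_sq / u0 ** 4,  # effective coupling
        "c_sw_tilde": u0 ** 3 * c_sw,    # relation quoted in the text
        "c0": 5.0 / (3.0 * u0 ** 4),     # Eq. (15.153)
        "c1": -1.0 / (12.0 * u0 ** 6),   # Eq. (15.153)
    }

# Illustrative input only: g0^2 = 1 and a made-up mean plaquette of 0.6.
params = tadpole_rescaling(g0_sq=1.0, mean_plaquette=0.6)
print(params)
```

With the plaquette definition, $u_0^4$ is just the mean plaquette itself, so the effective coupling is the bare coupling divided by the mean plaquette.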
A different kind of improvement of lattice perturbation theory has been proposed and used in Panagopoulos and Vicari (1998, 1999). It works through the resummation of “cactus” diagrams, i.e., tadpoles which become disconnected if any one of their vertices is removed. These diagrams are gauge invariant and it seems that in some cases, for example the renormalization of the topological charge, this kind of resummation achieves better results than standard tadpole improvement. We conclude by mentioning that for overlap fermions simple tadpole improvements like the ones discussed here seem to be of little help, because the sunset diagram of the self-energy also gives large results. This is a diagram that is also present in the continuum. Probably a different kind of diagram resummation has to be devised in this case.

74 This is similar to what happens in continuum QCD renormalized in the minimal subtraction (MS) scheme, where some second-order perturbative corrections can sometimes be large. If however one systematically drops the factors $\gamma_E$ and $\log 4\pi$ these corrections become much smaller, and this defines the $\overline{\mathrm{MS}}$ scheme, where one has a different renormalized coupling constant and a different value of the $\Lambda$ parameter.

15.7. Perturbation theory for fat links

Another way to reduce the magnitude of the large renormalization factors seems to be given, at least in some cases, by fat link actions (DeGrand et al., 1999). The name fat links refers to actions in which the quarks couple to gauge links which are smeared (see Fig. 20). They present some advantages, like suppressed exceptional configurations and small additive mass renormalization. Furthermore, these actions exhibit better chiral properties. From the point of view of perturbation theory the interesting feature is that renormalization factors often have values quite close to unity. This could be worth taking into consideration in the cases in which the 1-loop corrections of matrix elements on the lattice are large compared to their tree-level values. Damping the large perturbative corrections can be particularly useful in the case of overlap fermions. Simulations using overlap-like fermions with fat links have been reported in DeGrand (2001), and the improvement of the locality and topological properties using overlap and overlap-like fermions with fat links has been studied in DeGrand et al. (2002) and Kovács (2002).
Fig. 20. The APE smearing of the thin link $U_\mu(x)$ which produces the fat link.
We follow Bernard and DeGrand (2000), and consider the particular construction known as APE blocking (Albanese et al., 1987), where the smearing of a link is done as follows. Starting with the original link, also known as thin link,

\[ V^{(0)}_\mu(x) = U_\mu(x) , \tag{15.154} \]

the first smearing step is done as in Fig. 20. The general smearing step constructs the fat link recursively as

\[ V^{(m+1)}_\mu(x) = \mathcal{P} \Big[ (1-c)\, V^{(m)}_\mu(x) + \frac{c}{6} \sum_{\nu \neq \mu} \Big( V^{(m)}_\nu(x)\, V^{(m)}_\mu(x+\hat\nu)\, V^{(m)\dagger}_\nu(x+\hat\mu) + V^{(m)\dagger}_\nu(x-\hat\nu)\, V^{(m)}_\mu(x-\hat\nu)\, V^{(m)}_\nu(x-\hat\nu+\hat\mu) \Big) \Big] , \tag{15.155} \]
where $\mathcal{P}$ projects back into SU(3) matrices. Typical parameter values are $c = 0.45$ and a total number of iterations $N = 10$ (see for example, Bernard et al., 2000). In perturbation theory the basic variables are the $A_\mu$'s. For 1-loop computations of quark operators only the linear part of the relation between them,

\[ A^{(1)}_\mu(x) = \sum_{y,\nu} h_{\mu\nu}(y)\, A^{(0)}_\nu(x+y) , \tag{15.156} \]

is relevant, because the quadratic part, being antisymmetric, gives no contributions to the tadpoles (which at 1 loop are the only diagrams that can be constructed from two gluons stemming from the same point). In momentum space this convolution becomes a form factor,

\[ A^{(1)}_\mu(q) = \sum_\nu \tilde h_{\mu\nu}(q)\, A^{(0)}_\nu(q) , \tag{15.157} \]

where

\[ \tilde h_{\mu\nu}(q) = \Big( 1 - \frac{c}{6}\, \hat q^2 \Big) \Big( \delta_{\mu\nu} - \frac{\hat q_\mu \hat q_\nu}{\hat q^2} \Big) + \frac{\hat q_\mu \hat q_\nu}{\hat q^2} . \tag{15.158} \]
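We notice that the longitudinal part of $\tilde h_{\mu\nu}$ is not affected by the smearing, because it is unphysical and does not depend on which path of links between two fixed sites is chosen. After $N$ smearings one gets

\[ A^{(N)}_\mu(q) = \sum_\nu \tilde h^{(N)}_{\mu\nu}(q)\, A^{(0)}_\nu(q) , \tag{15.159} \]

with

\[ \tilde h^{(N)}_{\mu\nu}(q) = \Big( 1 - \frac{c}{6}\, \hat q^2 \Big)^{N} \Big( \delta_{\mu\nu} - \frac{\hat q_\mu \hat q_\nu}{\hat q^2} \Big) + \frac{\hat q_\mu \hat q_\nu}{\hat q^2} . \tag{15.160} \]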
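Eq. (15.160) follows from Eq. (15.158) because the transverse and longitudinal structures are orthogonal projectors, so the $N$-fold matrix product of $\tilde h$ only raises the transverse coefficient to the $N$-th power. A short numerical sketch of this statement is given below; it assumes the usual lattice momentum $\hat q_\mu = 2 \sin(q_\mu/2)$ in units $a = 1$ (a definition not restated in this section), and the test momentum is arbitrary.

```python
import math

def form_factor(q, c, n_smear=1):
    """4x4 matrix of Eq. (15.160); n_smear = 1 reproduces Eq. (15.158)."""
    # Assumed definition of the lattice momentum, in units a = 1.
    qhat = [2.0 * math.sin(0.5 * qi) for qi in q]
    qhat_sq = sum(x * x for x in qhat)
    f = (1.0 - c * qhat_sq / 6.0) ** n_smear
    h = []
    for mu in range(4):
        row = []
        for nu in range(4):
            longitudinal = qhat[mu] * qhat[nu] / qhat_sq
            delta = 1.0 if mu == nu else 0.0
            row.append(f * (delta - longitudinal) + longitudinal)
        h.append(row)
    return h

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

q = [0.3, -1.1, 2.0, 0.7]   # arbitrary test momentum
c, n_smear = 0.45, 10

# Apply the single-step form factor N times ...
single = form_factor(q, c)
iterated = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
for _ in range(n_smear):
    iterated = matmul(iterated, single)

# ... and compare with the closed form for N smearings.
closed = form_factor(q, c, n_smear)
max_dev = max(abs(iterated[i][j] - closed[i][j])
              for i in range(4) for j in range(4))
print(max_dev)
```

The deviation is at the level of floating-point rounding, independently of the momentum chosen.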
Since this is a correction to the gauge interaction of the quarks, the effect is that each quark-gluon vertex is multiplied by a form factor $\tilde h^{(N)}_{\mu\nu}(q)$, where $q$ is the gluon momentum. Now we consider the common case in which all gluon lines of a diagram start and end on quark lines. One can imagine these form factors as attached to the gluon propagator rather than to the vertices. The effect of smearing can then be summarized in a simple modification of the gluon propagator$^{75}$

\[ G_{\mu\nu}(q) \to \tilde h^{(N)}_{\mu\lambda}(q)\, G_{\lambda\rho}(q)\, \tilde h^{(N)}_{\rho\nu}(q) . \tag{15.161} \]
It is now clear that the Landau gauge is the most natural setup for fat link calculations. The Landau gauge propagator kills the longitudinal components of $\tilde h$, and it is then quite easy to convert a thin link calculation made in the Landau gauge to a fat link result, provided the loop integration momentum was chosen to be the same as the gluon momentum $q$ (which we have not done in Section 15.4):

\[ \int \frac{d^4 q}{(2\pi)^4}\, I(q) \to \int \frac{d^4 q}{(2\pi)^4}\, \Big( 1 - \frac{c}{6}\, \hat q^2 \Big)^{2N} I(q) . \tag{15.162} \]

Thus, no matter how many iterations $N$ one makes, the final result is simply given by multiplying the old thin link integrand by a form factor. If the thin link calculation was instead done in the Feynman gauge, one can observe that doing the same calculation using fat links in the Landau gauge will generate new terms, coming from the $(1 - \frac{c}{6} \hat q^2)^N\, \hat q_\mu \hat q_\nu / \hat q^2$ part of $\tilde h^{(N)}_{\mu\nu}(q)$. However, when $(1 - \frac{c}{6} \hat q^2) = 1$, the total contribution of these terms must vanish by gauge invariance. If this cancellation already occurs at the level of the integrands, the further multiplication by $(1 - \frac{c}{6} \hat q^2)^N$ will not spoil it, and one can again use Eq. (15.162) for passing from thin to fat links, even if the thin link calculation was done in the Feynman gauge. This has been verified for the additive and multiplicative renormalization of the mass (Bernard and DeGrand, 2000). However, if the cancellation is more subtle, for example involving integration by parts, then Eq. (15.162) is no longer valid and these new fat link terms have to be computed from scratch.

As we mentioned at the beginning, one of the attractive features of fat links is that renormalization factors seem to be much closer to their tree-level values. The Wilson tadpole diagram $12.233050\, g_0^2/(16\pi^2)\, C_F$ (see Eq. (15.131)), which is responsible for many of the large corrections in lattice perturbation theory, becomes in the fat link case $0.346274\, g_0^2/(16\pi^2)\, C_F$ (when $c = 0.45$ and $N = 10$ are used). Thus tadpole improvement is not necessary in this kind of calculation. Other quantities, like the 1-loop correction to the renormalization of the various currents, have been verified to be smaller by nearly two orders of magnitude with respect to the Wilson results. The reason for these small factors is that if $c$ is not too large, $c < 0.75$, the absolute value of $(1 - \frac{c}{6} \hat q^2)$ is less than one, and so $(1 - \frac{c}{6} \hat q^2)^N$ is a very small factor, except perhaps in the region of very small momenta. Thus, the region of large momenta is completely suppressed, and this is precisely the dominant contribution to the tadpole diagrams. We can then understand why these fat link integrals are strongly suppressed.$^{76}$

This damping however does not necessarily take place when divergent diagrams are considered. In this case one has to perform appropriate subtractions, and not all terms in the original integral remain proportional to $(1 - \frac{c}{6} \hat q^2)^{2N}$. The finite part of a divergent fat link integral is then not necessarily small, and more investigations are needed to understand the general situation of divergent integrals. Recently a new set of perturbative calculations with fat link actions has been reported (DeGrand, 2002; DeGrand et al., 2002). They mainly use another smearing choice known as hypercubic blocking (Hasenfratz and Knechtli, 2001, 2002; Hasenfratz et al., 2002a), which is more complicated and involves three free parameters which need to be fixed. Using this version of fat links the renormalization constants of the quark currents as well as of weak four-fermion operators have been computed, with the fattening of the links applied to the Wilson, the improved (for the quark as well as the gluon part) and the overlap actions. The case of fat links applied to staggered fermions has been extensively studied in Lee and Sharpe (2002a, b) and Lee (2002).

75 Here $G_{\mu\nu}(q)$ is the standard Wilson gluon propagator, because the fat link action changes the quark-gluon interaction but leaves the plaquette action unaltered, and in particular the gluon propagator. This is in some sense complementary to the case of improved gluons (Section 11.2), where the pure gauge action is modified but not the quark-gluon interaction. Recently actions which contain both fat links and an improvement of the pure gauge part have also been considered in perturbation theory (DeGrand et al., 2002).
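The suppression mechanism described above can be illustrated numerically. The sketch below is a toy estimate, not the calculation behind the numbers quoted in the text: it compares the scalar integral $\int d^4q/(2\pi)^4\; 1/\hat q^2$ with the same integral weighted by the factor $(1 - \frac{c}{6}\hat q^2)^{2N}$ of Eq. (15.162), ignoring the Lorentz structure of the propagator. It assumes $\hat q^2 = 4 \sum_\mu \sin^2(q_\mu/2)$ in units $a = 1$ and uses a coarse midpoint grid, so only the first few digits are meaningful; the function name is illustrative.

```python
import itertools
import math

def brillouin_average(weight, n_points=16):
    """Midpoint-rule estimate of int d^4q/(2 pi)^4 weight(qhat^2) / qhat^2.

    The grid is offset by half a step, so q = 0 is never sampled.
    """
    step = 2.0 * math.pi / n_points
    # 4 sin^2(q/2) at each one-dimensional grid point (assumed qhat, a = 1).
    s1 = [4.0 * math.sin(0.5 * (-math.pi + (i + 0.5) * step)) ** 2
          for i in range(n_points)]
    total = 0.0
    for combo in itertools.product(s1, repeat=4):
        qhat_sq = sum(combo)
        total += weight(qhat_sq) / qhat_sq
    return total / n_points ** 4

c, n_smear = 0.45, 10
thin = brillouin_average(lambda s: 1.0)
fat = brillouin_average(lambda s: (1.0 - c * s / 6.0) ** (2 * n_smear))

# 1 - c qhat^2 / 6 is most negative at the zone corner, qhat^2 = 16;
# it stays above -1 there as long as c < 0.75.
corner_factor = abs(1.0 - c * 16.0 / 6.0)
print(thin, fat, fat / thin, corner_factor)
```

The thin value approximates the standard lattice constant $Z_0 \approx 0.1549$, while the weighted integral is smaller by more than an order of magnitude, since only the region of small momenta survives the form factor.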
76 We should however point out that too strong a suppression could render the integrals infrared sensitive.

16. Computer codes

The calculation of the 1-loop renormalization constant of the unimproved matrix element $\langle q| O_{\{\mu\nu\}} |q\rangle$, described in detail in the previous Section, can be done entirely by hand. However, even in this relatively simple case a computer program (written in the FORM language) has also been used to cross-check the correctness of all results. It is in general useful to have this kind of mutual check between calculations made by hand and calculations made using a computer. Computer programs become necessary if one wants to compute the matrix element $\langle q| O_{\{\mu\nu\}} |q\rangle$ in the improved theory, adding the vertex in Eq. (11.10) to the usual Wilson vertex, or even including the improvement of the operator, which means computing the renormalization of the operators in Eq. (11.15). Even worse, if one considers more complicated operators, containing perhaps more covariant derivatives, the huge number of manipulations and the size of the typical monomials make the completion of these calculations entirely by hand almost impossible. In the end these codes also turn out to be necessary because they can provide the result of the analytic manipulations as an output file which is already formatted (for example in Fortran) as an input file for the numerical integration. One of the main reasons for the increasing difficulty in the manipulations related to the moments of unpolarized structure functions is that the covariant derivative is proportional to the inverse of the lattice spacing, $D \sim 1/a$, so that one has

\[ \langle x^n \rangle \sim \langle \bar\psi\, \gamma_\mu D_{\mu_1} \cdots D_{\mu_n} \psi \rangle \sim \frac{1}{a^n} . \tag{16.1} \]
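This means that in order to compute the $n$-th moment, one needs to perform a Taylor expansion in $a$ to order $n$ of every single quantity (propagators, vertices, operator insertions). One does not need too much imagination to see what happens. It is sufficient to have a look at the Wilson quark–quark–gluon vertex to order $a^2$,

\[ (V^a)^{bc}_\mu(k, ap) = -g_0\, (T^a)^{bc} \cdot \Big[ i\gamma_\mu \Big( \cos\frac{k_\mu}{2} - \frac{1}{2}\, a p_\mu \sin\frac{k_\mu}{2} - \frac{1}{8}\, a^2 p_\mu^2 \cos\frac{k_\mu}{2} \Big) + r \Big( \sin\frac{k_\mu}{2} + \frac{1}{2}\, a p_\mu \cos\frac{k_\mu}{2} - \frac{1}{8}\, a^2 p_\mu^2 \sin\frac{k_\mu}{2} \Big) \Big] , \tag{16.2} \]

or at the expansion of the Wilson quark propagator even only to order $a$,

\[ S^{ab}(k + aq, a m_0) = \delta^{ab} \cdot \Bigg\{ \frac{ -i \sum_\mu \gamma_\mu \sin k_\mu + 2r \sum_\mu \sin^2 \frac{k_\mu}{2} }{ \sum_\mu \sin^2 k_\mu + \big[ 2r \sum_\mu \sin^2 \frac{k_\mu}{2} \big]^2 } + a \cdot \Bigg[ \frac{ -i \sum_\mu \gamma_\mu q_\mu \cos k_\mu + r \sum_\mu q_\mu \sin k_\mu + m_0 }{ \sum_\mu \sin^2 k_\mu + \big[ 2r \sum_\mu \sin^2 \frac{k_\mu}{2} \big]^2 } - \Big( -i \sum_\rho \gamma_\rho \sin k_\rho + 2r \sum_\rho \sin^2 \frac{k_\rho}{2} \Big) \times \frac{ \sum_\mu q_\mu \sin 2k_\mu + 4r \sum_\mu \sin^2 \frac{k_\mu}{2}\, \big( r \sum_\nu q_\nu \sin k_\nu + m_0 \big) }{ \big\{ \sum_\mu \sin^2 k_\mu + \big[ 2r \sum_\mu \sin^2 \frac{k_\mu}{2} \big]^2 \big\}^2 } \Bigg] \Bigg\} . \tag{16.3} \]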
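A quick numerical cross-check of an expansion like Eq. (16.2) is to compare it with the unexpanded expression and verify that the residual scales as $a^3$. The sketch below does this for the two scalar factors multiplying $i\gamma_\mu$ and $r$, assuming that the unexpanded vertex depends on the momenta through $\cos(k_\mu/2 + a p_\mu/2)$ and $\sin(k_\mu/2 + a p_\mu/2)$; this form is an assumption made for the check, and the function names and test values are illustrative.

```python
import math

def exact_factors(k, p, a):
    """Assumed unexpanded factors multiplying i*gamma_mu and r."""
    arg = 0.5 * k + 0.5 * a * p
    return math.cos(arg), math.sin(arg)

def expanded_factors(k, p, a):
    """The same factors truncated at order a^2, as in Eq. (16.2)."""
    ap = a * p
    c, s = math.cos(0.5 * k), math.sin(0.5 * k)
    gamma_part = c - 0.5 * ap * s - 0.125 * ap ** 2 * c
    wilson_part = s + 0.5 * ap * c - 0.125 * ap ** 2 * s
    return gamma_part, wilson_part

def truncation_error(k, p, a):
    """Largest deviation between exact and truncated factors."""
    exact = exact_factors(k, p, a)
    approx = expanded_factors(k, p, a)
    return max(abs(e - x) for e, x in zip(exact, approx))

k, p = 0.7, 1.3   # arbitrary test values for one momentum component
# Halving a should reduce an O(a^3) residual by a factor close to 8.
ratio = truncation_error(k, p, 0.1) / truncation_error(k, p, 0.05)
print(ratio)
```

A ratio close to 8 indicates that all terms through order $a^2$ are correct; a sign or coefficient error at order $a^2$ would instead give a ratio close to 4.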
The algebraic manipulations thus become quite complex. The main consequence of all this is the generation of a huge number of terms, at least in the initial stages of the manipulations, even in the case of matrix elements where all Lorentz indices are contracted. The multiplication of two vertices and two quark propagators which are expanded to order $a$ can be seen from the formulae above to give rise to about $4^2 \times 11^2 \sim 2000$ monomial terms. Initial expansions of Feynman diagrams containing operators which measure the second and third moment of structure functions can easily reach the order of $10^6$ terms. This slows down the execution of the codes considerably. Most of these terms become zero after doing the Dirac algebra, or do not contribute to the sought Dirac structure, or are zero after integration. The terms which do not contribute to the final expression have to be killed as early as possible to speed up the computations. Of course this is not easy; for example terms like the ones proportional to 1/(2W (I2 + W 2 )2 ) which we have mentioned after Eq. (15.83) are zero because $\mu \neq \nu$, but this can be seen only after the Dirac algebra has been performed. Thus, the fact that an operator with $n$ covariant derivatives requires Taylor expansions in $a$ to order $n$ also implies a limitation on the number of moments of structure functions that one can practically compute on the lattice. This is something different from the limitation coming from operator mixings,
seen in Section 14, and the combination of these two computational challenges in practice renders the computation of the renormalization of the fourth moment or higher very difficult. The gamma algebra also becomes cumbersome to do by hand. This is even worse when one adds improvement, because, as noted in the previous Section, adding a $\sigma$ matrix for each improved vertex builds up long chains of Dirac matrices. In the end, perturbation theory, even only at 1-loop level, is quite cumbersome on the lattice, and due to the complexity of the calculations, the great number of diagrams, and the huge number of terms for each diagram, computer codes have to be used.

To evaluate the Feynman diagrams and obtain the algebraic expressions for the renormalization factors, the author has developed sets of computer codes written in the symbolic manipulation language FORM (for recent developments see Vermaseren (2000)). These codes are able to take as input the Feynman rules for the particular combination of operators, propagators (Wilson or overlap) and vertices (Wilson, Sheikholeslami–Wohlert-improved, or overlap) appearing in each diagram, to expand them in the lattice spacing $a$ to the appropriate order, to evaluate the gamma algebra on the lattice, and then to work out everything until the final sought-for expressions are obtained. Due to the enormous number of terms in the initial stages of the manipulations one needs in many cases a large working memory. To properly deal with the $\gamma_5$ matrices, additional computer routines have also been written which are able to perform computations in the ’t Hooft–Veltman scheme, the only scheme proven to have consistent trace properties when $\gamma_5$ matrices are involved.$^{77}$ This has been important especially for doing calculations involving weak interactions and four-fermion operators (Capitani et al., 1999a, 2000a, b, 2001a). There is however an increase of about one order of magnitude in computing time when one uses the ’t Hooft–Veltman scheme, due to the sum splittings. Careful memory management is also required. These FORM codes are able to use Dimensional Regularization (NDR or ’t Hooft–Veltman), and a mass regularization, and some independent checks are thus possible. As a further check on the codes, the author has in many cases also performed the calculations by hand.

Computer codes are also often employed nowadays for continuum perturbation theory, where however in general there are more external legs and one can reach higher loops, because the building blocks (the various Feynman rules, the typical monomials, etc.) are much simpler. But there are also differences in the codes themselves. We have already mentioned several times that Lorentz symmetry is broken on the lattice. This gives rise to a whole new series of problems, regarding for example the validity of the Einstein summation convention. One of the biggest challenges of computer codes for lattice perturbation theory is to deal with the fact that the summation convention on repeated indices is suspended. FORM, and other similar programs, have been developed having in mind the usual continuum calculations.$^{78}$ There are therefore many useful built-in features that are in principle somewhat of a hindrance in doing lattice perturbative calculations. These built-in functions cannot be used straightforwardly on the
77 A definition of the Dirac matrices in this scheme is given in Veltman (1989). The work of Jegerlehner (2001) contains an interesting discussion about dimensional regularization, the use of chiral fields and their relation with the properties of $\gamma_5$ in noninteger dimensions.
78 A recent description of this kind of program can be found in Weinzierl (2002).
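lattice. This is for example what FORM would normally do, because two equal indices are assumed to be contracted:

\[ \sum_\mu \gamma_\mu p_\mu \to \slashed{p} , \tag{16.4} \]

\[ \sum_\mu \gamma_\mu p_\mu \sin k_\mu \to \slashed{p}\, \sin k_\mu , \tag{16.5} \]

\[ \sum_\mu \gamma_\mu \sin k_\mu \cos^2 k_\mu \to (\gamma \cdot \sin k)\, \cos^2 k_\mu , \tag{16.6} \]

\[ \sum_{\mu,\rho} \gamma_\rho \gamma_\mu \gamma_\rho \sin k_\mu \cos^2 k_\rho \to -2 \sum_{\mu,\rho} \gamma_\mu \sin k_\mu \cos^2 k_\rho . \tag{16.7} \]

Here however the typical terms are monomials which contain the same index more than twice. Only the first case is then correctly handled by FORM. For example, in the last case the right answer is instead

\[ - \sum_{\mu,\rho} \gamma_\mu \sin k_\mu \cos^2 k_\rho + 2 \sum_\rho \gamma_\rho \sin k_\rho \cos^2 k_\rho . \tag{16.8} \]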
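The difference between the continuum contraction and the correct lattice result in the last case can be verified with explicit matrices. The sketch below assumes a particular Hermitian representation of the Euclidean Dirac matrices satisfying $\{\gamma_\mu, \gamma_\nu\} = 2\delta_{\mu\nu}$ (any such representation would do), and evaluates both sides for an arbitrary momentum; all helper names are illustrative.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]

def scale(m, c):
    return [[c * x for x in row] for row in m]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def accumulate(acc, coeff, m):
    """In-place acc += coeff * m for 4x4 matrices."""
    for i in range(4):
        for j in range(4):
            acc[i][j] += coeff * m[i][j]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def zero4():
    return [[0j] * 4 for _ in range(4)]

# Assumed chiral-type Euclidean representation: gamma_1..3, then gamma_4.
GAMMA = [block(ZERO2, scale(s, -1j), scale(s, 1j), ZERO2) for s in PAULI]
GAMMA.append(block(ZERO2, ONE2, ONE2, ZERO2))

k = [0.3, 1.1, -0.7, 2.0]   # arbitrary test momentum
lhs, naive, correct = zero4(), zero4(), zero4()
for rho in range(4):
    cos2 = math.cos(k[rho]) ** 2
    for mu in range(4):
        w = math.sin(k[mu]) * cos2
        # Left-hand side of (16.7), with both sums written out explicitly.
        accumulate(lhs, w, matmul(matmul(GAMMA[rho], GAMMA[mu]), GAMMA[rho]))
        # Continuum-style contraction, right-hand side of (16.7).
        accumulate(naive, -2.0 * w, GAMMA[mu])
        # First term of the lattice result (16.8).
        accumulate(correct, -w, GAMMA[mu])
    # Second term of (16.8): the mu = rho contribution.
    accumulate(correct, 2.0 * math.sin(k[rho]) * cos2, GAMMA[rho])

print(max_abs_diff(lhs, correct), max_abs_diff(lhs, naive))
```

The explicit sum agrees with Eq. (16.8) to rounding accuracy, while the continuum-style contraction of Eq. (16.7) differs by a term of order one.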
For this reason one needs to develop special routines to deal with the gamma algebra on the lattice (for more details see also Capitani and Rossi, 1995a, b). One solution is to introduce generalized Kronecker $\delta$-symbols (Lüscher and Weisz, 1995d)

\[ \delta_{\mu_1 \mu_2 \ldots \mu_n} \tag{16.9} \]

which are equal to one only if all indices are equal, $\mu_1 = \mu_2 = \cdots = \mu_n$, and are zero otherwise. In general one needs special routines, with appropriate modifications to the usual commands, to properly treat Dirac matrices and handle terms like those in Eq. (15.71). In Feynman diagrams that involve a few gluons the color structure can become quite involved, especially if the 4-gluon vertex is present. A program for the automatic generation of gluon vertices and the reduction of the color structure has been described in Lüscher and Weisz (1986). Expressions containing color tensors can be computed by repeatedly using the identities (Lüscher and Weisz, 1995d)

\[ \mathrm{Tr}(T^a X T^a Y) = \frac{1}{2N_c}\, \mathrm{Tr}(XY) - \frac{1}{2}\, \mathrm{Tr}(X)\, \mathrm{Tr}(Y) , \tag{16.10} \]

\[ \mathrm{Tr}(T^a X)\, \mathrm{Tr}(T^a Y) = \frac{1}{2N_c}\, \mathrm{Tr}(X)\, \mathrm{Tr}(Y) - \frac{1}{2}\, \mathrm{Tr}(XY) , \tag{16.11} \]
where $X$ and $Y$ stand for general complex $N_c \times N_c$ matrices. Of course when the computations become so complicated that it is not possible to carry them out by hand, a number of additional checks on the codes is desirable. One can use different regularizations, like a mass regularization and dimensional regularization (in its various forms), and one can develop various routines which use different methods. The case of overlap fermions is one in which manual checks are much more difficult, because only a few computations can be performed by hand. When computing renormalization factors it is
useful to exploit the cancellation of the gauge-dependent part between the continuum and the lattice results (as we noted in Section 3). In particular, in the calculations made in a covariant gauge, the contributions proportional to $(1 - \alpha)$ must be independent of the parameter $\rho$, and this can only be seen after the numerical integration, since in the thousands of terms which have to be integrated the dependence on $\rho$ cannot be factored out. The dependence on $\rho$ is in fact highly nontrivial, and a look at the quark propagator shows that the monomials in the integrand are not even rational functions of $\rho$. Thus, this is a really nontrivial check. Of course calculations made in a general covariant gauge are much more expensive than those restricted to the Feynman gauge, due to the more complicated form of the gluon propagator, but they are worthwhile. They can give a strong check of the behavior of the FORM codes in the case of overlap fermions, as well as of the integration routines. The contributions proportional to $(1 - \alpha)$, besides being independent of $\rho$, seem to a certain extent to be also independent of the fermion action used. In particular, for overlap fermions they have the same value as for the Wilson case.$^{79}$ These terms are in general equal to their Wilson counterparts at the level of the single diagrams. An exception is given by the self-energy, where only the sum of the sunset and the tadpole diagrams is independent of $\rho$, as shown in Table 1, and has the same value as in the Wilson action, as can be seen looking at Eq. (15.132).$^{80}$ The component of the total self-energy proportional to $(1 - \alpha)$ is proportional only to the combination $F_0 - \gamma_E + 1 = 4.792009568973 \cdots$ (see Eqs. (18.25) and (18.26) and Table 2 later), and therefore is the same for all plaquette actions, irrespective of the fermionic part.
Another case in which we have found that individual overlap diagrams do not correspond to their Wilson result is given by operators measuring the moment of the structure function $g_2$ (Capitani, 2001b). These operators are of the form

\[ \bar\psi\, \gamma_{[\mu} \gamma_5 D_{\{\nu]} D_{\rho} \cdots D_{\lambda\}}\, \psi , \tag{16.12} \]

that is, they involve both a symmetrization and an antisymmetrization of indices. In the Wilson case they mix with other operators of lower dimension, with a $1/a$ divergent coefficient, while in the overlap case these mixings are forbidden by chiral symmetry. For this operator only the sum of vertex and sails is independent of the fermion action used. Lattice perturbative calculations generally involve the manipulation of a huge number of terms, but often a large number of terms also remain in the final analytic expressions which have to be numerically integrated. The integrations then require a lot of computer time, which in some cases can be of the order of hundreds of hours.

79 In the case of overlap fermions however, due to the greater number of terms and the more complicated functional forms, one gets less precise numbers for these contributions (for a given computing time).
80 This is probably connected to the fact that the 1-loop self-energy for Wilson fermions has a nonvanishing $\Sigma_0$ contribution, which gives the additive quark mass renormalization due to the breaking of chiral symmetry, while for overlap fermions $\Sigma_0$ is zero.

17. Lattice integrals

After the analytical manipulations in Feynman diagrams have been carried out, the resulting expressions must be integrated. Lattice integrals are quite complicated rational functions of trigonometric expressions, and so far it has not been possible to compute them analytically. The only way to obtain a number from these expressions is to use numerical integration methods. The results of the analytic calculations obtained with computer programs can be stored as a sum of terms suitable for integration, which can be formatted in an appropriate way and passed on as the input of, say, a Fortran program. Integrals can be approximated by discrete sums. At the basic level, one computes the function to be integrated at a number of points, and uses the sum over these points (sometimes with appropriate weights) as an approximation to the true value of the integral. Since we are working in four dimensions (usually in momentum space), even choosing only 100 points in each direction means that the function has to be evaluated at $10^8$ points, and the computational requirements grow quite rapidly. This becomes much worse for 2-loop integrals, where one has to evaluate sums in eight dimensions. It is therefore invaluable, in order to compute 2-loop integrals, to use the coordinate space methods that we will introduce in Section 19, with which these eight-dimensional integrals can be expressed in terms of only four-dimensional sums.

Using simple integration routines it is possible to evaluate lattice 1-loop integrals numerically with reasonable precision, and one can without much effort obtain results with five or six significant decimal places. There are also some more refined methods which have been devised and used to obtain faster and cheaper evaluations of these integrals and improve on the precision of the integrations. We are going to explain some of them in the rest of this Section. One expects (Symanzik, 1983b) that a lattice Feynman diagram $D$ with $l$ loops will have near the continuum limit the asymptotic behavior

\[ D(a) \overset{a \to 0}{\sim} a^{-\omega} \sum_{n=0}^{\infty} \sum_{m=0}^{l} c_{nm}\, a^n (\log a)^m , \tag{17.1} \]
where the nonnegative exponent $\omega$ is related to the convergence properties of $D$ and its subdiagrams. Lüscher and Weisz have devised a recursive blocking method for the calculation of the first coefficients of the expansion in $a$ of a 1-loop Feynman diagram:
$$D(a) \underset{a\to 0}{\sim} a^{-\omega}\sum_{n=0}^{\infty} a^n\,(c_{n0} + c_{n1}\log a) , \tag{17.2}$$
so that one does not need to evaluate the diagram for very small lattice spacings, which would be numerically quite demanding. To use this method (Lüscher and Weisz, 1986), the diagram has to be known numerically (with good precision) for a sequence of different lattice spacings, which for simplicity will be taken to be $a_k = 1/(\mu k)$, with $k$ integer and $\mu$ a constant. The continuum limit $a \to 0$ means $k \to \infty$. We consider the case in which only even powers of $a$ appear in the expansion, which includes diagrams which are $O(a)$ improved (if one is interested in the coefficients with $n \le 2$). It is also assumed that one knows the coefficients of the logarithmic terms exactly, so that the task is reduced to computing numerically only the coefficient $c_{00}$ (which is the finite part of the limit $a = 0$ of the Feynman diagram) and $c_{20}$.

Let us see how one can compute $c_{00}$. One starts by defining an auxiliary function
$$f_0(k) = \left.\left\{a^{\omega} D(a) + (c_{01} + a^2 c_{21})\log k\right\}\right|_{a=a_k} , \tag{17.3}$$
which can be seen to have the asymptotic expansion
$$f_0(k) \underset{k\to\infty}{\sim} A_0 + \frac{A_1}{(\mu k)^2} + \sum_{n=2}^{\infty}\frac{A_n + B_n\log k}{(\mu k)^{2n}} , \tag{17.4}$$
where
$$A_0 = c_{00} - c_{01}\log\mu , \qquad A_1 = c_{20} - c_{21}\log\mu , \tag{17.5}$$
and so on. We can then consider taking
$$A_0 \simeq f_0(k_{\max}) \tag{17.6}$$
as a first approximation to compute $c_{00}$ (remember that $c_{01}$ is exactly known). It is however easy to do better. In fact, the function
$$f_1(k) = \frac{(k+\delta_0)^2}{4\delta_0 k}\,f_0(k+\delta_0) - \frac{(k-\delta_0)^2}{4\delta_0 k}\,f_0(k-\delta_0) \tag{17.7}$$
has an expansion similar to Eq. (17.4) with the same initial term $A_0$ but no $1/k^2$ term, so that the first correction becomes of order $1/k^4$ only. Therefore the approach to the limiting value for $k \to \infty$ is faster, and
$$A_0 \simeq f_1(k_{\max} - \delta_0) \tag{17.8}$$
gives a better approximation for the computation of $c_{00}$. Notice that $f_1$ is not defined at $k_{\max}$, because it is a discrete difference involving the nearest point, and thus one has to use its value at $k_{\max} - \delta_0$. Usually one chooses $\delta_0 = 1$ or 2. The above blocking transformation can be iterated $i$ times to give increasingly better approximations to the right value of $A_0$:
$$f_i(k) = A_0 + O(1/k^{2i+2}) . \tag{17.9}$$
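The first blocking step can be illustrated with a minimal Python sketch. It is not taken from Lüscher and Weisz (1986): the function `f0_model` is a hypothetical stand-in for the auxiliary function of Eq. (17.3), built from Eq. (17.4) truncated after the $n = 2$ term, and all numerical coefficients in it are arbitrary choices made only for the demonstration.

```python
import math

def f0_model(k, A0=1.5, A1=0.7, A2=0.3, B2=0.2, mu=1.0):
    """Hypothetical data: Eq. (17.4) truncated after the n = 2 term."""
    x = mu * k
    return A0 + A1 / x**2 + (A2 + B2 * math.log(k)) / x**4

def block_once(f0, k, delta=1):
    """First blocking step, Eq. (17.7): cancels the 1/k^2 term of f0."""
    plus = (k + delta) ** 2 / (4 * delta * k) * f0(k + delta)
    minus = (k - delta) ** 2 / (4 * delta * k) * f0(k - delta)
    return plus - minus

k_max = 20
err_f0 = abs(f0_model(k_max) - 1.5)                   # estimate (17.6)
err_f1 = abs(block_once(f0_model, k_max - 1) - 1.5)   # estimate (17.8)
print(err_f0, err_f1)
```

With these made-up coefficients the blocked estimate is closer to $A_0$ than the unblocked one, since its leading correction is of order $1/k^4$ instead of $1/k^2$.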
However, because of the logarithms present in Eq. (17.4) for $n \ge 2$, the transformations now become slightly more complicated:
$$f_{i+1}(k) = w_1 f_i(k+\delta_i) + w_2 f_i(k) + w_3 f_i(k-\delta_i) , \tag{17.10}$$
$$w_j = v_j/(v_1 + v_2 + v_3) \qquad (j = 1, 2, 3) , \tag{17.11}$$
$$v_1 = (k+\delta_i)^{2i+2}\log(1 - \delta_i/k) , \tag{17.12}$$
$$v_2 = k^{2i+2}\left[\log(1 + \delta_i/k) - \log(1 - \delta_i/k)\right] , \tag{17.13}$$
$$v_3 = -(k-\delta_i)^{2i+2}\log(1 + \delta_i/k) . \tag{17.14}$$
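The weights of Eqs. (17.11)–(17.14) can be checked numerically with a short Python sketch. The test function `f1_model` is an assumption made for illustration: it contains only a constant plus a single correction of the form $(A + B\log k)/k^4$, which the step $f_1 \to f_2$ should remove up to rounding errors.

```python
import math

def blocking_weights(k, i, delta):
    """Weights w_1, w_2, w_3 of Eqs. (17.11)-(17.14) for the step f_i -> f_{i+1}."""
    e = 2 * i + 2
    v1 = (k + delta) ** e * math.log(1 - delta / k)
    v2 = k ** e * (math.log(1 + delta / k) - math.log(1 - delta / k))
    v3 = -(k - delta) ** e * math.log(1 + delta / k)
    total = v1 + v2 + v3
    return v1 / total, v2 / total, v3 / total

def block_step(f, k, i, delta=1):
    """One application of Eq. (17.10); requires k > delta."""
    w1, w2, w3 = blocking_weights(k, i, delta)
    return w1 * f(k + delta) + w2 * f(k) + w3 * f(k - delta)

def f1_model(k):
    """Hypothetical f_1: constant 5 plus a pure (A + B log k)/k^4 correction."""
    return 5.0 + (2.0 + 3.0 * math.log(k)) / k**4

value = block_step(f1_model, 10, i=1)
print(value)
```

Because the weights sum to one by construction, the constant term is preserved, while both the $1/k^{2i+2}$ and the $\log k/k^{2i+2}$ pieces cancel.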
While each new iteration gives in principle a better approximation to $A_0$, it must be stressed that the domain of the function $f_i$ becomes smaller with every new blocking step, and therefore after a while (depending on the total number of different lattice measurements available) there is a natural halt to the blocking steps. Moreover, some numerical precision is lost in each iteration; Lüscher and Weisz estimated that one loses one or two decimal places with each blocking step. There is an optimal choice for the number of iterations, and in Lüscher and Weisz (1986) a stopping criterion and an estimate for the total error are given. At the end there is an optimal estimate of $c_{00}$ given by
$$A_0 \simeq f_{i^*}(k^*) , \tag{17.15}$$
where $k^*$ and $i^*$ minimize the error.

Another method which is useful for the extraction of the leading behavior of lattice integrals has been used in Bode et al. (2000a, b). One assumes as before that the integrals $F(k)$ are known for a sequence of different lattice spacings $k_i$ $(i = 1, \ldots, n)$, with the necessary numerical precision. One then wants to determine the first $n_f$ coefficients of the expansion
$$F(k) = \sum_{i=1}^{n_f}\alpha_i f_i(k) + R(k) \tag{17.16}$$
in terms of general functions $f_i(k) = (\log k)^{\mu}/k^{\nu}$ (which also include $f_0(k) = 1$). Of course one always has $n_f \le n$, because the number of parameters to be determined cannot exceed the number of data points. In matrix form the above equation reads
$$F = f\alpha + R , \tag{17.17}$$
where $F$ is the $n$-dimensional vector of the data $F(k_i)$, $\alpha$ is the $n_f$-dimensional vector of the leading coefficients that we want to determine, and $f$ is an $n \times n_f$ matrix. The method calls for $\alpha$ to be determined by minimizing the quadratic form in the residues
$$\chi^2 = (F - f\alpha)^T W^2 (F - f\alpha) , \tag{17.18}$$
where $W$ is a matrix of positive weights.$^{81}$ This leads to
$$f^T W^2 f\,\alpha = f^T W^2 F , \tag{17.19}$$
and if the columns of the matrix $Wf$ are linearly independent, and $P$ is the projector onto the corresponding $n_f$-dimensional subspace, then our task is reduced to finding a solution of
$$Wf\alpha = PWF . \tag{17.20}$$
This can be done using the singular value decomposition of $Wf$,
$$Wf = USV^T , \tag{17.21}$$
where $S$ and $V$ are diagonal and orthonormal $n_f \times n_f$ matrices respectively, and $U$ is a column-orthonormal $n \times n_f$ matrix, i.e., $U^T U = 1$ and $UU^T = P$. The solution for the $n_f$ leading coefficients in Eq. (17.16) is then
$$\alpha = VS^{-1}U^T WF . \tag{17.22}$$
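As an illustration of the fit, the following Python sketch solves the normal equations (17.19) directly for a small synthetic data set. This is a simplification chosen to keep the example dependency-free: the text recommends the singular value decomposition of Eq. (17.21), which is numerically more robust. The basis functions, the values of $k$ and the "true" coefficients are all assumptions made for the demonstration, and the helper names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_leading_coefficients(ks, data, basis, weights=None):
    """Solve f^T W^2 f alpha = f^T W^2 F, Eq. (17.19), with diagonal W."""
    if weights is None:
        weights = [1.0] * len(ks)          # uniform weights, W = 1
    f = [[g(k) for g in basis] for k in ks]
    nf = len(basis)
    lhs = [[sum(w * w * row[i] * row[j] for w, row in zip(weights, f))
            for j in range(nf)] for i in range(nf)]
    rhs = [sum(w * w * row[i] * y for w, row, y in zip(weights, f, data))
           for i in range(nf)]
    return solve_linear(lhs, rhs)

# Assumed basis: 1, 1/k^2 and log(k)/k^2.
basis = [lambda k: 1.0,
         lambda k: 1.0 / k**2,
         lambda k: math.log(k) / k**2]
true_alpha = [0.25, -1.0, 0.5]             # arbitrary test coefficients
ks = [2, 3, 4, 6, 8, 12]
data = [sum(a * g(k) for a, g in zip(true_alpha, basis)) for k in ks]
alpha = fit_leading_coefficients(ks, data, basis)
print(alpha)
```

Since the synthetic data contain no remainder $R(k)$, the fit should return the input coefficients up to rounding errors.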
The blocking procedure of Lüscher and Weisz discussed previously can be considered as a particular case of this method. In fact, if one repeats these improved fits modifying the component proportional to one of the functions $f_j$, then only the corresponding $\alpha_j$ changes. Then, if $n = n_f$ (i.e., one has the minimal data set from which it is possible to determine the coefficients), this method is equivalent to performing blocking transformations which cancel the $n_f - 1$ components not related to $\alpha_j$.

81 These weights could be useful in order to give more importance to particular data points, as for larger $a$ there is less roundoff error, while for smaller $a$ the asymptotic expansions are better satisfied. In Bode et al. (2000a, b) it was however reported that uniform weights $W = 1$ work as well.

We want to conclude this Section by discussing a simple method which is useful to accelerate the convergence of the numerical evaluation of the integral of a periodic analytic function over a compact domain. Such integrals arise in theories with twisted boundary conditions (Lüscher and Weisz, 1986),
$$U_\mu(x + L\hat\nu) = \Omega_\nu\,U_\mu(x)\,\Omega_\nu^{-1} , \tag{17.23}$$
where the gauge fields cross through the lattice boundaries. At least two directions must be twisted, otherwise the twisting is equivalent to a field redefinition of a theory with standard boundary conditions. The twist matrices $\Omega_\nu$ are constant and gauge-field independent, and are $SU(N_c)$ matrices which satisfy the algebra
$$\Omega_\mu\Omega_\lambda = e^{2\pi i/N_c}\,\Omega_\lambda\Omega_\mu . \tag{17.24}$$
Explicit representations of these matrices are not needed. These boundary conditions remove the zero modes and the theory acquires a mass gap, so that the gluon propagator is not singular. Besides inducing an infrared cutoff in Feynman diagrams, they also cause the spectrum of momenta to be continuous, because the momentum components are not quantized by the boundary conditions. Similarly, the ghosts are also twisted periodic fields, and the Faddeev-Popov determinant has no zero modes. One then ends up with integrals of periodic analytic functions. More technicalities and the Feynman rules can be found in Lüscher and Weisz (1986).

For simplicity we consider one-dimensional integrals. We want to compute
$$I = \int_{-\pi}^{\pi}\frac{dk}{2\pi}\,f(k) , \tag{17.25}$$
where $f(k)$ is analytic and periodic, $f(k + 2\pi) = f(k)$. A first approximation of this integral is given by the sum
$$I(N) = \frac{1}{N}\sum_{j=1}^{N} f\!\left(\frac{2\pi j}{N}\right) \tag{17.26}$$
for $N$ large enough. While in general this is not a very efficient approximation, for periodic analytic functions the convergence turns out to be exponential,
$$I(N) - I = O(e^{-\varepsilon N}) , \tag{17.27}$$
where $\varepsilon$ is the absolute value of the imaginary part of the singularity of the integrand which is closest to the real axis. One can see that problems can arise when $\varepsilon$ happens to be very small, because in this case the convergence of $I(N)$ to $I$ becomes quite slow. However, things can be improved by making a change of variable (Lüscher and Weisz, 1986),
$$k = k' - \alpha\sin k' , \qquad 0 \le \alpha(\varepsilon) < 1 , \tag{17.28}$$
where $\alpha$ is chosen to be near to one, so that the singularity in the new variable is pushed away from the real axis: $\hat\varepsilon = O(1)$. Then the sums
$$\hat I(N) = \frac{1}{N}\sum_{j=1}^{N}\hat f\!\left(\frac{2\pi j}{N}\right) , \tag{17.29}$$
calculated using the transformed function
$$\hat f(k') = (1 - \alpha\cos k')\,f(k(k')) , \tag{17.30}$$
converge faster to the integral $I$:
$$\hat I(N) - I = O(e^{-\hat\varepsilon N}) . \tag{17.31}$$
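The effect of the change of variable (17.28) can be seen in a small Python experiment. The integrand below is not a lattice Feynman integrand: it is an assumed toy function, $f(k) = 1/(c - \cos k)$, chosen because its integral over one period is known in closed form and because for $c$ close to 1 its singularities lie close to the real axis. The values of $c$, $\alpha$ and $N$ are arbitrary choices.

```python
import math

def trapezoid(f, n):
    """Eq. (17.26): equally spaced sum over one period."""
    return sum(f(2 * math.pi * j / n) for j in range(1, n + 1)) / n

def transformed(f, alpha):
    """Eq. (17.30): integrand after the substitution k = k' - alpha sin k'."""
    def f_hat(kp):
        return (1 - alpha * math.cos(kp)) * f(kp - alpha * math.sin(kp))
    return f_hat

c = 1.01                                  # singularities at Im k = +-acosh(c), about 0.14
f = lambda k: 1.0 / (c - math.cos(k))
exact = 1.0 / math.sqrt(c * c - 1.0)      # value of Eq. (17.25) for this f

n = 20
err_plain = abs(trapezoid(f, n) - exact)
err_hat = abs(trapezoid(transformed(f, 0.9), n) - exact)
print(err_plain, err_hat)
```

For the same number of points, the transformed sum should be much closer to the exact value than the plain one, in line with Eqs. (17.27) and (17.31).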
Thus, much lower values of $N$, and less computing power, are sufficient to get the same precision in the numerical evaluation of the integral $I$.

18. Algebraic method for 1-loop integrals

If one is content with computing 1-loop lattice integrals with a precision of only five or six significant decimal places, their values can be easily estimated by using a simple rectangle integration. In order to have more precise results one can implement the methods discussed in the previous Section. The algebraic method for Wilson fermions, which we are going to explain in this Section, is an enormous improvement on these techniques. It allows every integral coming from 1-loop Feynman diagrams to be computed in a completely symbolic way and to be reduced to a linear combination of a few basic constants. This means that the generic integral can then be calculated numerically with very high precision and very little effort. In fact, once these few basic constants are determined with the desired precision, the original integral is just an appropriate linear combination of them. It is then possible to compute every integral with a very high precision, for example with seventy significant decimal places (Capitani et al., 1998a), and in some cases even with nearly 400 significant decimal places (see Appendix B). Computing 1-loop integrals with sixty or seventy significant decimal places turns out to be absolutely necessary if one wants to evaluate 2-loop integrals with a precision of at least ten significant decimal places. This can be accomplished using the coordinate space method (Section 19). It is sufficient to apply the algebraic method only to integrals with zero external momenta, since the momentum-dependent part can be evaluated in the continuum, as we have seen in Section 15.2.
Furthermore, the general zero-momentum integral on the lattice can always be written as a linear combination of terms of the form
$$F(p, q; n_x, n_y, n_z, n_t) = \int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{\hat k_x^{2n_x}\hat k_y^{2n_y}\hat k_z^{2n_z}\hat k_t^{2n_t}}{D_F(k, m_f)^p\,D_B(k, m_b)^q} . \tag{18.1}$$
The algebraic method allows one to express these terms through a certain number of basic integrals. The complete reduction of a generic $F(p, q; n_x, n_y, n_z, n_t)$ is achieved using an iteration procedure which makes use of appropriate recursion relations in noninteger dimensions. At the end of this procedure, using the algebraic method every purely bosonic integral can be expressed in terms of 3 basic constants, every purely fermionic integral in terms of 9 basic constants, and every general fermionic-bosonic integral in terms of 15 basic constants.

18.1. The bosonic case

We are now going to explain in detail the recursive algorithm in the bosonic case, when the gluon action is given by the Wilson plaquette (Caracciolo et al., 1991, 1992). We note first of all that it is enough to consider lattice integrals at zero external momenta. We are in fact interested in the continuum limit of
$$G(p) = \int\frac{d^4k}{(2\pi)^4}\,F(k; p) , \tag{18.2}$$
where $p$ is some external momentum. As explained in Section 15.2, we can split this integral into two parts: a subtracted (ultraviolet-finite) integral, which can be evaluated in the continuum, and a certain number of lattice integrals with zero external momenta. We have then
$$G(p) = \int\frac{d^4k}{(2\pi)^4}\left[F(k; p) - (T^{n_F}F)(k; p)\right] + \int\frac{d^4k}{(2\pi)^4}\,(T^{n_F}F)(k; p) , \tag{18.3}$$
where $n_F$ is the degree of the divergence of the integral, and
$$(T^{n_F}F)(k; p) = \sum_{n=0}^{n_F}\frac{1}{n!}\,p_{\mu_1}\cdots p_{\mu_n}\left[\frac{\partial}{\partial p_{\mu_1}}\cdots\frac{\partial}{\partial p_{\mu_n}}F(k; p)\right]_{p=0} . \tag{18.4}$$
If the propagators are massless, we need to introduce at this point an intermediate regularization for $k = 0$. It is convenient to choose an infrared mass cutoff $m$ or to use dimensional regularization. The singularities will cancel when the contribution of the momentum-dependent part is added at the end. Any zero-momentum integral coming from the calculation of lattice Feynman diagrams in the pure gauge Wilson theory can be expressed as a linear combination of terms of the form$^{82}$
$$B(p; n_x, n_y, n_z, n_t) = \int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{\hat k_x^{2n_x}\hat k_y^{2n_y}\hat k_z^{2n_z}\hat k_t^{2n_t}}{D_B(k, m)^p} , \tag{18.5}$$
where $p$ and $n_i$ are positive integers. The inverse bosonic propagator, taken in general to be massive in order to regularize the divergences coming from the separation in $J$ and $I - J$, is
$$D_B(k, m) = \hat k^2 + m^2 . \tag{18.6}$$
Actually, due to the appearance of other kinds of singularities at some intermediate stages of the reductions, we must consider the more general integrals
$$B_\delta(p; n_x, n_y, n_z, n_t) = \int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{\hat k_x^{2n_x}\hat k_y^{2n_y}\hat k_z^{2n_z}\hat k_t^{2n_t}}{D_B(k, m)^{p+\delta}} ,$$
where $p$ is an arbitrary integer (not necessarily positive) and $\delta$ is a real number which will be set to zero at the end of the calculations.

82 It is always possible to cast any numerator in a form containing only factors of $\sin^2(k_\mu/2)$, using $\sin^2 k_\mu = 4\sin^2(k_\mu/2) - 4\sin^4(k_\mu/2)$ (and similar formulae for the cosine functions).
To begin with, each integral $B_\delta(p; n_x, n_y, n_z, n_t)$ can be reduced through purely algebraic manipulations to a sum of integrals of the same type with $n_x = n_y = n_z = n_t = 0$ (i.e., pure denominators). This is done by using the recursion relations$^{83}$
$$B_\delta(p; 1) = \tfrac14\left[B_\delta(p-1) - m^2 B_\delta(p)\right] , \tag{18.7}$$
$$B_\delta(p; x, 1) = \tfrac13\left[B_\delta(p-1; x) - B_\delta(p; x+1) - m^2 B_\delta(p; x)\right] , \tag{18.8}$$
$$B_\delta(p; x, y, 1) = \tfrac12\left[B_\delta(p-1; x, y) - B_\delta(p; x+1, y) - B_\delta(p; x, y+1) - m^2 B_\delta(p; x, y)\right] , \tag{18.9}$$
$$B_\delta(p; x, y, z, 1) = B_\delta(p-1; x, y, z) - B_\delta(p; x+1, y, z) - B_\delta(p; x, y+1, z) - B_\delta(p; x, y, z+1) - m^2 B_\delta(p; x, y, z) , \tag{18.10}$$
which can be obtained from the trivial identity
$$D_B(k, m) = \sum_{i=1}^{4}\hat k_i^2 + m^2 . \tag{18.11}$$
With these recursion relations one can eliminate each numerator argument $n_i$ of the $B_\delta$ function, provided that it has the value 1. When it is greater than 1, one has to lower its value until it reaches 1, so that it is then possible to use the above set of recursion relations. The lowering of $n_i$ is done by using another recursion relation,
$$B_\delta(p; \ldots, r) = \frac{r-1}{p+\delta-1}\,B_\delta(p-1; \ldots, r-1) - \frac{4r-6}{p+\delta-1}\,B_\delta(p-1; \ldots, r-2) + 4B_\delta(p; \ldots, r-1) , \tag{18.12}$$
which is obtained by integrating by parts the equation (for $r > 1$)
$$\frac{(\hat k_w^2)^r}{D_B(k, m)^{p+\delta}} = 4\,\frac{(\hat k_w^2)^{r-1}}{D_B(k, m)^{p+\delta}} + 2\,\frac{(\hat k_w^2)^{r-2}}{p+\delta-1}\,\sin k_w\,\frac{\partial}{\partial k_w}\frac{1}{D_B(k, m)^{p+\delta-1}} . \tag{18.13}$$
Notice that for $p = 1$ some coefficients in this recursion relation diverge as $1/\delta$, and therefore in order to compute $B_\delta(1; \ldots)$ for $\delta = 0$ we need to compute $B_\delta(0; \ldots)$ including terms of order $\delta$. In general one needs to compute the intermediate expressions for the integrals $B_\delta(p; n_x, n_y, n_z, n_t)$ with $p \le 0$ keeping all terms of order $\delta$. Using the recursion relations introduced so far, every integral $B_\delta(p; n_x, n_y, n_z, n_t)$ can thus be reduced to a sum of the form
$$B_\delta(p; n_x, n_y, n_z, n_t) = \sum_{r=p-n_x-n_y-n_z-n_t}^{p} a_r(m, \delta)\,B_\delta(r) , \tag{18.14}$$
where $a_r(m, \delta)$ are polynomials in $m^2$, which may diverge as $1/\delta$ for $p > 0$ and $r \le 0$.

83 In the following, when one of the arguments $n_i$ is zero it will be omitted.
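The recursion relations (18.7)–(18.12) lend themselves to exact rational arithmetic. The Python sketch below is a deliberately restricted illustration and not the full algorithm of Caracciolo et al.: it works at $m = 0$ and $\delta = 0$, it takes $B(0) = 1$, $B(1) = Z_0$ and $B(1; 1, 1) = 4Z_1$ as inputs (see Eqs. (18.20) and (18.21)), and it refuses every case in which a $1/\delta$ pole or an infrared divergence would appear. The representation of a result as a triple of coefficients of $(1, Z_0, Z_1)$ and all function names are choices made for this example only.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

# A value is a tuple (c, z0, z1) standing for c + z0*Z0 + z1*Z1.
def add(*terms):
    return tuple(sum(col) for col in zip(*terms))

def scale(q, t):
    return tuple(Fraction(q) * x for x in t)

@lru_cache(maxsize=None)
def B(p, ns=()):
    """B(p; n_x, ...) at m = 0, delta = 0, for pole-free cases only."""
    ns = tuple(sorted((n for n in ns if n > 0), reverse=True))
    if p == 0:   # no denominator: product of one-dimensional averages of khat^(2n)
        value = 1
        for n in ns:
            value *= comb(2 * n, n)
        return (Fraction(value), Fraction(0), Fraction(0))
    if p == 1 and ns == ():
        return (Fraction(0), Fraction(1), Fraction(0))   # B(1) = Z0
    if p == 1 and ns == (1, 1):
        return (Fraction(0), Fraction(0), Fraction(4))   # B(1;1,1) = 4 Z1
    if p < 0 or (p >= 2 and not ns) or (p == 1 and ns[0] > 1):
        raise NotImplementedError("needs the delta or mass regularization")
    if ns[0] > 1:   # lowering relation (18.12) on the largest argument
        r, rest = ns[0], ns[1:]
        return add(scale(Fraction(r - 1, p - 1), B(p - 1, rest + (r - 1,))),
                   scale(Fraction(-(4 * r - 6), p - 1), B(p - 1, rest + (r - 2,))),
                   scale(4, B(p, rest + (r - 1,))))
    # all arguments equal 1: relations (18.7)-(18.10) remove one of them
    rest = ns[1:]
    terms = [B(p - 1, rest)]
    for j in range(len(rest)):
        raised = rest[:j] + (rest[j] + 1,) + rest[j + 1:]
        terms.append(scale(-1, B(p, raised)))
    return scale(Fraction(1, 4 - len(rest)), add(*terms))

print(B(2, (1, 1)))      # coefficients of (1, Z0, Z1)
print(B(2, (1, 1, 1)))
```

Within these limits the sketch reproduces the values quoted in Section 18.2 for $B(2; 1, 1)$, $B(2; 1, 2)$ and $B(2; 1, 1, 1)$; integrals that require intermediate terms of order $\delta$ are outside its scope.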
At this point, all that remains is to reexpress all $B_\delta(p)$'s appearing in the above formula in terms of a small finite number of them. To accomplish this we need some other recursion relations, which can be obtained by considering the trivial identity
$$B_\delta(p; 1, 1, 1, 1) - 4B_\delta(p+1; 2, 1, 1, 1) - m^2 B_\delta(p+1; 1, 1, 1, 1) = 0 , \tag{18.15}$$
and applying to it the previous procedure until it is reduced to a relation between the $B_\delta(r)$'s only. One then arrives at a nontrivial relation of the form
$$\sum_{r=p-4}^{p} b_r(p; \delta)\,B_\delta(r) + S(p; m, \delta) = 0 , \tag{18.16}$$
where $S(p; m, \delta) = O(m^2)$ for $p \le 2$, while for $p > 2$ it is a polynomial in $1/m^2$ (which is finite for $\delta \to 0$). We can now use the last relation to express all $B_\delta(p)$'s in terms of $B_\delta(r)$'s which are only in the range $0 \le r \le 3$. To do this, when $p \ge 4$ we just write $B_\delta(p)$ in terms of $B_\delta(p-1), \ldots, B_\delta(p-4)$ and iterate as needed. When $p \le -1$, we solve the relation for $B_\delta(p-4)$, make the shift $p \to p+4$, and then use it to write $B_\delta(p)$ in terms of $B_\delta(p+1), \ldots, B_\delta(p+4)$. Again we iterate as needed. Applying these two relations recursively we get, for $p \ne 0, 1, 2, 3$:
$$B_\delta(p) = \sum_{r=0}^{3} c_r(p; \delta)\,B_\delta(r) + T(p; m, \delta) , \tag{18.17}$$
where $T(p; m, \delta)$ is a polynomial in $1/m^2$. The above procedure allows the general bosonic integral to be written, after a finite number of steps, as
$$B_\delta(p; n_x, n_y, n_z, n_t) = A(\delta)B_\delta(0) + B(\delta)B_\delta(1) + C(\delta)B_\delta(2) + D(\delta)B_\delta(3) + E(m, \delta) , \tag{18.18}$$
where $E(m, \delta)$ is a polynomial in $1/m^2$. It can be shown that the limit $\delta \to 0$ is safe at this stage, and, using $B(0) = 1$, one finally obtains
$$B(p; n_x, n_y, n_z, n_t) = A(0) + B(0)B(1) + C(0)B(2) + D(0)B(3) + E(m, 0) , \tag{18.19}$$
in terms of three basic constants, $B(1)$, $B(2)$ and $B(3)$. This is a minimal set, i.e., no further reduction can be done. It is common practice to write the bosonic results in terms of the three constants $Z_0$, $Z_1$ and $F_0$, which are defined by
$$Z_0 = B(1)|_{m=0} , \tag{18.20}$$
$$Z_1 = \tfrac14 B(1; 1, 1)|_{m=0} , \tag{18.21}$$
$$F_0 = \lim_{m\to 0}\left(16\pi^2 B(2) + \log m^2 + \gamma_E\right) . \tag{18.22}$$
Explicitly,
$$Z_0 = \int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{1}{4\sum_\mu\sin^2(k_\mu/2)} , \tag{18.23}$$
Table 2
Numerical values of the basic bosonic constants

Z0    0.154933390231060214084837208
Z1    0.107781313539874001343391550
F0    4.369225233874758

$$Z_1 = \int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{\sin^2(k_1/2)\,\sin^2(k_2/2)}{\sum_\mu\sin^2(k_\mu/2)} , \tag{18.24}$$
and$^{84}$
$$\int_{-\pi}^{\pi}\frac{d^4k}{(2\pi)^4}\,\frac{1}{\left(4\sum_\mu\sin^2(k_\mu/2) + m^2\right)^2} = \frac{1}{16\pi^2}\left(-\log m^2 - \gamma_E + F_0\right) . \tag{18.26}$$
Their values are given in Table 2. Recall that $\gamma_E = 0.57721566490153286\ldots$ is the well-known Euler's constant appearing in continuum integrals. Rewriting $B(1)$ and $B(2)$ in terms of $F_0$ and $Z_0$ is rather trivial. For $B(3)$ one has
$$B(3) = \frac{1}{32\pi^2 m^2} - \frac{1}{128\pi^2}\left(\log m^2 + \gamma_E - F_0\right) - \frac{1}{1024} - \frac{13}{1536\pi^2} + \frac{Z_1}{256} , \tag{18.27}$$
which is a special case of Eq. (18.29). In general $d - 1$ basic constants are enough for all bosonic integrals in $d$ dimensions, and $d - 2$ if one only considers finite integrals. This means that in two spacetime dimensions any finite bosonic integral can be written in terms of rational numbers and factors $1/\pi^2$ only, and one constant has to be introduced for divergent integrals.

This concludes the illustration of the algebraic method. However, if one wants to use it, it is useful to know some formulae regarding divergent integrals, that is the $B_\delta(p)$'s for $p \ge 2$. These formulae can be derived using their expression in terms of the modified Bessel function $I_0(x)$,
$$B_\delta(p) = \frac{1}{2^{p+\delta}\,\Gamma(p+\delta)}\int_0^\infty dx\,x^{p+\delta-1}\,e^{-m^2x/2 - 4x}\,I_0^4(x) , \tag{18.28}$$
obtained using the Schwinger representation. The basic formula for the divergent integrals is then
$$B(r) = \frac{1}{\Gamma(r)}\sum_{i=2}^{r-1}\frac{b_{i-2}\,\Gamma(r-i)}{2^i\,(m^2)^{r-i}} + \frac{b_{r-2}}{2^r\,\Gamma(r)}\left(-\log m^2 - \gamma_E + F_{r-2}\right) + H_{r-2} . \tag{18.29}$$
If dimensional regularization is used instead one has
$$B^d(r) = \frac{b_{r-2}}{2^r\,\Gamma(r)}\left(\frac{2}{d-4} - \log 4\pi + F_{r-2}\right) + G_{r-2} . \tag{18.30}$$
In the latter case, the recursion relations have to be extended to noninteger dimensions, which can be done without too much effort. These formulae can be found in Caracciolo et al. (1992).

84 In case dimensional regularization is used, $F_0$ is defined by
$$\int_{-\pi}^{\pi}\frac{d^{4-2\epsilon}k}{(2\pi)^{4-2\epsilon}}\,\frac{1}{\left(4\sum_\mu\sin^2(k_\mu/2)\right)^2} = \frac{1}{16\pi^2}\left(-\frac{1}{\epsilon} - \log 4\pi + F_0\right) . \tag{18.25}$$
The constants $b_i$ appearing in the formulae for divergent integrals are defined by the asymptotic expansion of the modified Bessel function $I_0$,
$$I_0^d(x) = \frac{e^{dx}}{(2\pi x)^{d/2-2}}\left[\sum_{i=0}^{\infty}\frac{b_i - (d-4)c_i + O((d-4)^2)}{x^{i+2}} + O(e^{-x})\right] , \tag{18.31}$$
and the constants $F_i$ are also related to this modified Bessel function:$^{85}$
$$F_p = \frac{1}{b_p}\left[\int_0^2 dx\,x^{p+1}e^{-4x}I_0^4(x) + \int_2^\infty dx\,x^{p+1}\left(e^{-4x}I_0^4(x) - \sum_{i=0}^{p}\frac{b_i}{x^{i+2}}\right)\right] . \tag{18.33}$$
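The first $F_i$'s are given explicitly by$^{86}$
$$F_1 = F_0 - \frac{1}{8}\pi^2 + \frac{35}{12} + \frac{1}{2}\pi^2 Z_1 , \tag{18.34}$$
$$F_2 = F_0 - \frac{31}{144}\pi^2 + \frac{1349}{216} + \frac{1}{9}\pi^2 Z_0 + \frac{31}{36}\pi^2 Z_1 , \tag{18.35}$$
$$F_3 = F_0 - \frac{523}{1872}\pi^2 + \frac{24257}{2808} + \frac{25}{117}\pi^2 Z_0 + \frac{523}{468}\pi^2 Z_1 , \tag{18.36}$$
$$F_4 = F_0 - \frac{7145}{22176}\pi^2 + \frac{294919}{33264} + \frac{401}{1386}\pi^2 Z_0 + \frac{7145}{5544}\pi^2 Z_1 , \tag{18.37}$$
$$F_5 = F_0 - \frac{27971}{80190}\pi^2 + \frac{3347101}{481140} + \frac{13582}{40095}\pi^2 Z_0 + \frac{55942}{40095}\pi^2 Z_1 , \tag{18.38}$$
$$F_6 = F_0 - \frac{27039607}{74221920}\pi^2 + \frac{448657133}{111332880} + \frac{1708783}{4638870}\pi^2 Z_0 + \frac{27039607}{18555480}\pi^2 Z_1 , \tag{18.39}$$
$$F_7 = F_0 - \frac{751956319}{2016614880}\pi^2 + \frac{3823946741}{3024922320} + \frac{48529351}{126038430}\pi^2 Z_0 + \frac{751956319}{504153720}\pi^2 Z_1 , \tag{18.40}$$
while the first $b_i$'s, together with the first $H_i$'s and $G_i$'s defined by
$$H_p = \frac{1}{\Gamma(p+2)}\sum_{i=0}^{p-1}\frac{b_i}{2^{i+2}\,(i-p)} , \tag{18.41}$$
$$G_p = H_p - \frac{c_p}{\Gamma(p+2)}\,\frac{1}{2^{p+1}} , \tag{18.42}$$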
are given in Table 3. If the integral we have to compute is finite from the start, all $\log m^2$ terms must cancel, and this means that the $F_0$ constant is not present. In fact $F_0$ appears only in divergent integrals ($B(2), B(3), \ldots$), and always in the combination
$$\log m^2 + \gamma_E - F_0 . \tag{18.43}$$

85 These and other related constants were introduced in González-Arroyo and Korthals-Altes (1982). Note that $F_0$ was called $F_{0000}$ there. Many properties of these functions are discussed there and also in Ellis and Martinelli (1984a). We also note that the finite constant $Z_0$ can also be expressed in terms of modified Bessel functions:
$$Z_0 = \frac12\int_0^\infty dx\,e^{-4x}I_0^4(x) . \tag{18.32}$$
86 These relations can be obtained by applying the recursion relations to an appropriate identity like Eq. (18.15).
Table 3
Values of $b_r$, $c_r$, $G_r$ and $H_r$

r   b_r              c_r                      G_r                            H_r
0   1/(4π²)          0                        0                              0
1   1/(8π²)          −1/(32π²)                −7/(256π²)                     −1/(32π²)
2   3/(32π²)         −1/(32π²)                −11/(1536π²)                   −1/(128π²)
3   13/(128π²)       −55/(1536π²)             −793/(589824π²)                −53/(36864π²)
4   77/(512π²)       −5/(96π²)                −311/(1474560π²)               −331/(1474560π²)
5   297/(1024π²)     −1973/(20480π²)          −27251/(943718400π²)           −3653/(117964800π²)
6   5727/(8192π²)    −54583/(245760π²)        −559001/(158544691200π²)       −4261/(1101004800π²)
7   66687/(32768π²)  −8558131/(13762560π²)    −7910171/(20293720473600π²)    −1331861/(2959500902400π²)
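The entries of Table 3 are not independent: Eqs. (18.41) and (18.42) determine $H_p$ and $G_p$ from the $b_i$ and $c_p$. The short Python sketch below checks this for the first few rows with exact rational arithmetic. It is only a consistency check under the assumption that a common factor $1/\pi^2$ is stripped from all entries, and it uses only the first four values of $b_r$ and $c_r$.

```python
from fractions import Fraction
from math import factorial

# Table 3 entries multiplied by pi^2, so that they are exact rationals.
b = [Fraction(1, 4), Fraction(1, 8), Fraction(3, 32), Fraction(13, 128)]
c = [Fraction(0), Fraction(-1, 32), Fraction(-1, 32), Fraction(-55, 1536)]

def H(p):
    """pi^2 H_p from Eq. (18.41); Gamma(p + 2) = (p + 1)!."""
    total = sum((b[i] / (2 ** (i + 2) * (i - p)) for i in range(p)), Fraction(0))
    return total / factorial(p + 1)

def G(p):
    """pi^2 G_p from Eq. (18.42)."""
    return H(p) - c[p] / (factorial(p + 1) * 2 ** (p + 1))

for p in range(4):
    print(p, H(p), G(p))
```

Extending the lists `b` and `c` with further rows of the table allows the same check to be repeated for larger $r$.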
The absence of factors $\log m^2$ implies the cancellation of the $\gamma_E$ and $F_0$ terms, and thus all finite integrals turn out to be functions of $Z_0$ and $Z_1$ only. These two basic constants are now known with very high precision, about 400 significant decimal places, as will be shown in Section 19.1 and Appendix B, where they will be computed using coordinate space methods. Thus, every finite bosonic integral can now be calculated with about 400 significant decimal places.

18.2. Examples of bosonic integrals

As an illustration of the algebraic method, we show here in detail how to compute a few bosonic integrals which will afterwards be used for the calculation of the operator tadpoles necessary for the operator renormalization corresponding to the third moment of the quark momentum distribution. We will start from the simpler integrals, which are in some cases needed as intermediate results. Since all these integrals are finite, it is sufficient in general to use the recursion relations of the previous subsection with $m = 0$. Exceptions will be duly noted. We will not write the subscript $\delta$ explicitly, although the integrals are supposed to be computed at $\delta \ne 0$ when necessary. We also recall that $B(1) = Z_0$, by definition.

1. $B(1; 1)$
This very simple integral is given, using the first recursion relation, by
$$B(1; 1) = \tfrac14 B(0) = \tfrac14 . \tag{18.44}$$
2. $B(1; 1, 1)$
We have
$$B(1; 1, 1) = 4Z_1 , \tag{18.45}$$
by definition.

3. $B(2; 1, 1)$
By applying the various recursion relations we get
$$B(2; 1, 1) = \frac13\left(B(1; 1) - B(2; 2)\right) = \frac13\left[\frac14 - \left(\frac{1}{1+\delta}B(1; 1) - \frac{2}{1+\delta}B(1) + 4B(2; 1)\right)\right] = \frac13\left[\frac14 - \left(\frac14 - 2Z_0 + 4\cdot\frac14 B(1)\right)\right] = \frac13 Z_0 . \tag{18.46}$$
We have taken the limit $\delta = 0$, because it is safe to do so here.

4. $B(2; 2, 1)$
We have to compute this integral because it is necessary for $B(2; 1, 1, 1)$ (the last example). The manipulations are as follows:
$$B(2; 2, 1) = \frac13\left(B(1; 2) - B(2; 3)\right)$$
$$= \frac13\left[\frac{1}{\delta}B(0; 1) - \frac{2}{\delta}B(0) + 4B(1; 1) - \left(2B(1; 2) - 6B(1; 1) + 4B(2; 1)\right)\right]$$
$$= \frac13\left[\frac{1}{\delta}B(0; 1) - \frac{2}{\delta}B(0) + 4B(1; 1) - \frac{2}{\delta}B(0; 1) + \frac{4}{\delta}B(0) - 8B(1; 1) + 6B(1; 1) - 4B(2; 1)\right]$$
$$= \frac13\left[-\frac{1}{\delta}B(0; 1) + \frac{2}{\delta}B(0) + 2B(1; 1) - 4B(2; 1)\right] , \tag{18.47}$$
where we have taken the limit $\delta = 0$ when safe. Now
$$B(0; 1) = \tfrac14 B(-1) , \tag{18.48}$$
and we need the expression of $B(-1)$ including terms of order $\delta$. Applying the recursion relations to the identity (18.15),$^{87}$
$$B_\delta(3; 1, 1, 1, 1) - 4B_\delta(4; 2, 1, 1, 1) - m^2 B_\delta(4; 1, 1, 1, 1) = 0 , \tag{18.49}$$
we obtain
$$B(-1) = 8 + \delta\cdot(-20Z_0 - 48Z_1 + 8) + O(\delta^2) . \tag{18.50}$$
We have then
$$-\frac{1}{\delta}B(0; 1) + \frac{2}{\delta}B(0) = -\frac{1}{4\delta}\left(8 + \delta\,(-20Z_0 - 48Z_1 + 8)\right) + \frac{2}{\delta} = 5Z_0 + 12Z_1 - 2 , \tag{18.51}$$
which is finite, as it should be. The final result is
$$B(2; 2, 1) = \tfrac43 Z_0 + 4Z_1 - \tfrac12 . \tag{18.52}$$

87 We have to keep the $m^2$ term in this identity, because some intermediate integrals will be divergent.
5. $B(2; 1, 2)$
This integral also appears in the computation of $B(2; 1, 1, 1)$. Of course it has to be equal to $B(2; 2, 1)$ by symmetry, but the manipulations are slightly different. Actually, they are much simpler than the previous ones:
$$B(2; 1, 2) = B(1; 1, 1) - 2B(1; 1) + 4B(2; 1, 1) = 4Z_1 - 2\cdot\tfrac14 + 4\cdot\tfrac13 Z_0 = \tfrac43 Z_0 + 4Z_1 - \tfrac12 . \tag{18.53}$$
This example shows that by a judicious choice of the order of the indices it is possible to obtain the same result with less effort.

6. $B(2; 1, 1, 1)$
The first decomposition of the integral gives
$$B(2; 1, 1, 1) = \tfrac12\left(B(1; 1, 1) - B(2; 2, 1) - B(2; 1, 2)\right) . \tag{18.54}$$
Taking the results of the two integrals just computed, we then have
$$B(2; 1, 1, 1) = \frac12\left[4Z_1 - 2\left(\frac43 Z_0 + 4Z_1 - \frac12\right)\right] = -\frac43 Z_0 - 2Z_1 + \frac12 . \tag{18.55}$$
These are all the integrals which will be needed in the next Section. In Panagopoulos and Vicari (1990) the exact expressions of various other bosonic integrals (useful for the renormalization of the trilinear gluon condensate) in terms of the constants $Z_0$ and $Z_1$ can also be found.

18.3. Operator tadpoles

A very important class of diagrams on the lattice is given by the operator tadpoles, which are generated when an operator in a matrix element contains $U$ fields in its definition. In the case of a simple plaquette gauge action these tadpoles can be computed exactly, using the algebraic methods that we have just discussed. We list the results of the operator tadpoles in a general covariant gauge for operators which contain one, two and three covariant derivatives. These operators are useful for example for the calculations of the renormalization constants of operators measuring the lowest moments of the structure functions. Dirac matrices can then be added at will, since they do not influence the calculation of the tadpoles.
What is important instead is the choice of the indices of the derivatives. This is not surprising because these operators fall in different representations of the hypercubic group depending on this choice, and the results will differ accordingly. The operator tadpoles for the various operators in terms of $Z_0$ and $Z_1$ are
$$T_{\bar\psi D_\mu\psi} = -\frac12 Z_0 + (1-\alpha)\,\frac18 Z_0 , \tag{18.56}$$
$$T_{\bar\psi D_\mu D_\nu\psi} = -Z_0 + (1-\alpha)\,\frac16 Z_0 , \tag{18.57}$$
$$T_{\bar\psi D_\mu D_\mu\psi} = -2Z_0 + \frac18 + (1-\alpha)\left(Z_0 - \frac18\right) , \tag{18.58}$$
$$T_{\bar\psi D_\mu D_\nu D_\lambda\psi} = -\frac32 Z_0 + (1-\alpha)\left(-\frac{1}{24}Z_0 - \frac14 Z_1 + \frac{1}{16}\right) , \tag{18.59}$$
$$T_{\bar\psi D_\mu D_\mu D_\nu\psi} = -\frac52 Z_0 + \frac18 + (1-\alpha)\left(\frac98 Z_0 + Z_1 - \frac14\right) , \tag{18.60}$$
$$T_{\bar\psi D_\nu D_\mu D_\mu\psi} = -\frac52 Z_0 + \frac18 + (1-\alpha)\left(\frac98 Z_0 + Z_1 - \frac14\right) , \tag{18.61}$$
$$T_{\bar\psi D_\mu D_\nu D_\mu\psi} = -\frac52 Z_0 - Z_1 + \frac14 + (1-\alpha)\left(\frac98 Z_0 + Z_1 - \frac14\right) , \tag{18.62}$$
where $\mu \ne \nu \ne \lambda$, and repeated indices are not summed. These results are valid (among others) for both Wilson and overlap fermions, because they depend only on the structure of the gluon propagator. This is not true for the self-energy tadpole, which for overlap fermions is different from the Wilson result (as we have seen in Section 15.5), as the interaction vertex is different.

We now show explicitly how these computations are done, deriving the results of Eqs. (18.59)–(18.62), and taking as our final task the calculation of the operator tadpoles of the operators
$$O_{\{0123\}} = \bar\psi\,\gamma_{\{0}D_1D_2D_{3\}}\,\psi \tag{18.63}$$
and
$$O_{\{0011\}} + O_{\{3322\}} - O_{\{0022\}} - O_{\{3311\}} = \bar\psi\,\gamma_{\{0}D_0D_1D_{1\}}\,\psi + \bar\psi\,\gamma_{\{3}D_3D_2D_{2\}}\,\psi - \bar\psi\,\gamma_{\{0}D_0D_2D_{2\}}\,\psi - \bar\psi\,\gamma_{\{3}D_3D_1D_{1\}}\,\psi , \tag{18.64}$$
which was carried out in Capitani (2001b). These operators are multiplicatively renormalizable at 1 loop, and measure the third moment of the unpolarized quark distribution. Each of the four terms in Eq. (18.64) gives the same value for the tadpole. Furthermore
$$O_{\{0011\}} = \tfrac16\left(O_{0011} + O_{0101} + O_{0110} + O_{1001} + O_{1010} + O_{1100}\right) , \tag{18.65}$$
and since for the tadpole what is important is the position of the covariant derivatives, and not of the Dirac matrices, we have
$$T_{O_{\{0011\}}} = \tfrac13\left(T_{O_{0011}} + T_{O_{0101}} + T_{O_{0110}}\right) \tag{18.66}$$
(the remaining terms have the indices 0 and 1 exchanged, and therefore they give the same result). The results corresponding to these three terms are given in Eqs. (18.61), (18.62) and (18.60), respectively.
Let us begin by computing the tadpoles for operators with three covariant derivatives which have general indices $\mu$, $\nu$ and $\sigma$ (that is, for the moment they may be equal or different). Again, for the purpose of computing the tadpole one can consider $\bar\psi\,\vec D\vec D\vec D\,\psi$ instead of $\tfrac18\,\bar\psi\,\overleftrightarrow D\overleftrightarrow D\overleftrightarrow D\,\psi$, which would be a lot more tedious to calculate. Applying the three covariant derivatives in cascade gives
$$\vec D_\sigma\vec D_\nu\vec D_\mu\,\psi(x) = \frac{1}{8a^3}\left[U_\sigma(x)\,U_\nu(x + a\hat\sigma)\,U_\mu(x + a\hat\sigma + a\hat\nu)\,\psi(x + a\hat\sigma + a\hat\nu + a\hat\mu) + \cdots\right] , \tag{18.67}$$
where the dots stand for the remaining combinations of forward and backward links.

For the calculation of the tadpoles we now have to expand the $U$'s in the above expression to second order, and keep the $A^2$ terms coming from the same $U$ as well as the products $A\cdot A$ coming from two different $U$'s. Regarding the latter, if we were doing the calculations in the Feynman gauge only the pairs of $A$'s with the same index would give nonzero contributions. Here however we work in a general covariant gauge, so we must consider all terms. In addition, we have to expand the terms to order $a^3p^3$, which reconstructs the tree level of the operator,
$$-i\,p_\mu p_\nu p_\sigma . \tag{18.68}$$
For example, one of such terms gives 1 1 (x + a ˆ + a)ˆ + a<) ˆ → 3 eiap eiap) eiap< 3 a a i (18.69) ) )< p3 : 2 This factor a3 compensates the factor 1=a3 coming from the covariant derivatives, while the factor a2 coming from the U expansions multiplied with the factor a2 present in the gluon propagator cancels the rescaling factor of the integration variable, a4 . The tadpoles are then *nite in the limit a → 0. Let us *rst consider the terms where both A’s come from the expansion of the same U . This is a simple calculation which gives 2 2 =a 4 ga 1 d k CF 3 − 0 (Aa (k)Aa (−k) + Aa) (k)Aa) (−k) + Aa< (k)Aa< (−k)) · 8a3 ip ip) ip< 4 8a 2 (2) −=a 2 2 =a 3g a d4 k =− 0 G aa (k) · (−ip p) p< ) ; (18.70) 4 2 −=a (2) = − ip p) p< − ) ip2 p< − < ip2 p) − )< ip)2 p −
where we have contracted the $A$'s to form the propagator (note that now the color index $a$ is not summed). Inserting the covariant gauge propagator (Eq. (5.62)) and dividing by the tree-level expression, the above expression is reduced to $g_0^2 C_F$ multiplied by$^{88}$
$$T_1 = -\frac32\left(B(1) - (1-\alpha)B(2; 1)\right) = -\frac32 Z_0 + (1-\alpha)\,\frac38 Z_0 . \tag{18.71}$$
The remaining part of the calculation of the tadpoles involves the terms in which the A’s that have to be contracted come from the expansion of diJerent U ’s, and is much more complicated. In the case in which all indices are diJerent, corresponding to the operator O{0123} , the relevant part of the expansion of Eq. (18.67) is =a 4 d k 1 2 2 (−g0 a ) [(A< (−k)A) (k)eiak< =2 eiak) =2 + A) (−k)A (k)eiak) =2 eiak =2 3 4 8a −=a (2) + A< (−k)A (k)eiak< =2 eiak =2 eiak) )(−ip p) p< − i )< p)2 p − i ) p2 p< − i < p2 p) ) + (A< (−k)A) (k)eiak< =2 eiak) =2 − A) (−k)A (k)eiak) =2 e−iak
=2
− A< (−k)A (k)eiak< =2 e−iak =2 eiak) )(−ip p) p< + i )< p)2 p − i ) p2 p< + i < p2 p) ) + (−A< (−k)A) (k)eiak< =2 e−iak) =2 − A) (−k)A (k)e−iak) =2 eiak
=2
+ A< (−k)A (k)eiak< =2 eiak =2 e−iak) )(−ip p) p< − i )< p)2 p + i ) p2 p< + i < p2 p) ) + (−A< (−k)A) (k)eiak< =2 e−iak) =2 + A) (−k)A (k)e−iak) =2 e−iak
=2
− A< (−k)A (k)eiak< =2 e−iak =2 e−iak) )(−ip p) p< + i )< p)2 p + i ) p2 p< − i < p2 p) ) + (−A< (−k)A) (k)e−iak< =2 eiak) =2 + A) (−k)A (k)eiak) =2 eiak
=2
− A< (−k)A (k)e−iak< =2 eiak =2 eiak) )(−ip p) p< + i )< p)2 p + i ) p2 p< − i < p2 p) ) + (−A< (−k)A) (k)e−iak< =2 eiak) =2 − A) (−k)A (k)eiak) =2 e−iak
=2
+A< (−k)A (k)e−iak< =2 e−iak =2 eiak) )(−ip p) p< − i )< p)2 p + i ) p2 p< + i < p2 p) ) + (A< (−k)A) (k)e−iak< =2 e−iak) =2 − A) (−k)A (k)e−iak) =2 eiak
=2
− A< (−k)A (k)e−iak< =2 eiak =2 e−iak) )(−ip p) p< + i )< p)2 p − i ) p2 p< + i < p2 p) ) +(A< (−k)A) (k)e−iak< =2 e−iak) =2 + A) (−k)A (k)e−iak) =2 e−iak
=2
+ A< (−k)A (k)e−iak< =2 e−iak =2 e−iak) )(−ip p) p< − i )< p)2 p − i ) p2 p< − i < p2 p) )] ;
88 It is easy to see that $B(2,1) = \frac{1}{4}\,B(1) = \frac{1}{4}\,Z_0$.
(18.72)
which gives (−ip
p) p< )g02 a2
=a
−=a
d4 k (2)4
G<) (k)sin
ak< ak) sin 2 2
ak ak ak< ak) sin + G< (k) cos ak) sin sin : (18.73) + G) (k)sin 2 2 2 2 The result is 4 d k ( )?) T2 = −(1 − ) 4 − (2) ' ( 4 sin2 k< =2 sin2 k) =2 + 4 sin2 k) =2 sin2 k =2 + 1 − 2 sin2 k) =2 sin2 k< =2 sin2 k =2 (4 sin2 k =2)2 1 3 = (1 − ) − B(2; 1; 1) + B(2; 1; 1; 1) 4 8 1 1 5 : (18.74) = (1 − ) − Z0 − Z1 + 12 4 16 In the last line we have used the results which we have obtained in the previous subsection applying the recursion relations of the algebraic method. This result, added to T1 , gives the operator tadpole for O{0123} , Eq. (18.59): 3 1 1 1 ( )?) T = − Z0 + (1 − ) − Z0 − Z1 + 2 24 4 16 = 36:69915049 + (1 − ) 4:59514785 :
(18.75)
We now consider the cases in which two of the indices are equal, which are necessary for the computation of the operator tadpole of O{0011} . We have d4 k 2 ak) 2 ak) G (k) sin − cos )) 4 2 2 −=a (2) ak ak ak ak) ak) ak) sin + cos ak) sin sin + sin ak) sin sin + G ) (k) sin 2 2 2 2 2 2 4 2 2 d k sin k) =2 − cos k) =2 − (1 − ) = 4 (2) (4 sin2 k =2) −
T2()) ) = a2
=a
sin4 k) =2 − sin2 k) =2 cos2 k) =2 + sin2 k =2 sin2 k) =2(1 + cos k) ) + sin2 k =2 sin k) =2 cos k) =2 sin k) (4 sin2 k =2)2 1 1 1 = B(1; 1) − B(1) − (1 − ) B(2; 2) − B(2; 1) + B(2; 1; 1) − B(2; 2; 1) 2 2 4 1 1 1 1 4 1 1 1 = −Z0 + − (1 − ) − Z0 − Z0 + Z0 − Z0 + 4Z1 − 8 2 4 4 3 4 3 2 1 1 3 = −Z0 + + (1 − ) Z 0 + Z1 − : 8 4 4
×
(18.76)
Again, in the last line we have substituted the results obtained in the previous subsection using the algebraic method. The interested reader can check that T2(<
)
= T2())
)
;
(18.77)
but the remaining combination gives a diJerent result: 4 d k cos k) (sin2 k) =2 − cos2 k) =2) ( ) ) T2 = 4 (4 sin2 k =2) − (2) cos k) (sin4 k) =2 − sin2 k) =2 cos2 k) =2) + 2 sin2 k =2 sin2 k) =2 − (1 − ) (4 sin2 k =2)2 1 = − B(1; 1; 1) + B(1; 1) − B(1) 4 1 1 B(2; 2) − B(2; 1) + B(2; 1; 1) − B(2; 2; 1) − (1 − ) 2 4 1 1 3 : = −Z0 − Z1 + + (1 − ) Z0 + Z1 − 4 4 4 Adding the term T1 to each of the above cases we have ( ' T(< ) = T()) ) = − 52 Z0 + 18 + (1 − ) 98 Z0 + Z1 − 14 T(
) )
= − 52 Z0 − Z1 +
1 4
+ (1 − )
'9
Z + Z1 − 8 0
1 4
(
;
(18.78)
(18.79) (18.80)
so that the operator tadpole for the operator O{0011} , and hence for the operator O{0011} + O{3322} − O{0022} − O{3311} , is *nally TO{0011} = 13 (T(
) )
+ 2T()) ) )
= − 52 Z0 − 13 Z1 +
1 6
+ (1 − )
(18.81) '9 8
Z0 + Z 1 −
1 4
(
= −40:5196866756 + (1 − )5:0660880895 :
(18.82) (18.83)
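The numerical values quoted for the operator tadpoles can be cross-checked directly from their algebraic form. The following is a minimal sketch, assuming the standard literature values of $Z_0$ and $Z_1$ (truncated here; this section only refers to Appendix B for them) and assuming that the numbers are quoted in units of $1/(16\pi^2)$. The function names are illustrative only. With these inputs the first term for $O_{\{0123\}}$, which is $-\frac{3}{2}Z_0$, comes out negative, like the corresponding term for $O_{\{0011\}}$.

```python
import math

# Truncated literature values of the two bosonic constants (assumption:
# they are not listed in this section, which points to Appendix B).
Z0 = 0.154933390231060214
Z1 = 0.107781313539874001

# Assumption: the decimal numbers are quoted in units of 1/(16 pi^2).
NORM = 16 * math.pi ** 2


def tadpole_0123():
    """Return (gauge-independent part, coefficient of (1 - alpha)) for O_{0123}."""
    const = -1.5 * Z0
    gauge = -Z0 / 24 - Z1 / 4 + 1 / 16
    return NORM * const, NORM * gauge


def tadpole_0011():
    """Return the same decomposition for O_{0011}."""
    const = -2.5 * Z0 - Z1 / 3 + 1 / 6
    gauge = 9 * Z0 / 8 + Z1 - 1 / 4
    return NORM * const, NORM * gauge


if __name__ == "__main__":
    for name, fn in (("O_0123", tadpole_0123), ("O_0011", tadpole_0011)):
        const, gauge = fn()
        print(f"{name}: {const:.8f} + (1 - alpha) * {gauge:.8f}")
```

In double precision this agrees with the quoted decimals to roughly seven decimal places; reproducing more digits would require higher-precision inputs and arithmetic.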
All these results can be given with much higher precision using the numbers reported in Appendix B. The tadpoles of operators with one or two covariant derivatives, which are given at the beginning of this Section, are much simpler to calculate and are left as an exercise for the reader.

18.4. The first moment of the gluon momentum distribution

We now wish to show the 1-loop result of a gluonic matrix element, which is probably the most complicated case in which the resulting integrals have been reduced to an expression containing only the two bosonic constants $Z_0$ and $Z_1$. The operator measuring the first moment of the gluon momentum distribution also corresponds to the gluonic contribution to the energy-momentum tensor. Its matrix element has been calculated analytically, and the integrals have then been reduced using the
Fig. 21. “Proper” diagrams for the 1-loop correction to the matrix element $\langle g|\sum_\rho \mathrm{Tr}(F_{\mu\rho}F_{\rho\nu})|g\rangle$. The black squares indicate the insertion of the operator. Notice that the last two diagrams are quite different: the last diagram contains a 4-gluon vertex (and vanishes for this matrix element), while the previous one is the tadpole coming from the second-order expansion of the operator.
algebraic method (Caracciolo et al., 1992; Capitani and Rossi, 1995a). The renormalization of this operator can be obtained by computing the radiative corrections to the gluonic matrix element
\[ \Big\langle g \Big| \sum_\rho \mathrm{Tr}\big(F_{\mu\rho}F_{\rho\nu}\big) \Big| g \Big\rangle . \tag{18.84} \]
The relevant diagrams are shown in Fig. 21. The vertex function gives 23 13 53 1 3 2 + Z0 + Z1 − − (log p + E − F0 ) ; Nc · − 192 482 144 3 162 the result for the sails is 1 31 19 7 7 2 + (log p + E − F0 ) ; − Z0 − Z1 + Nc · 192 92 24 48 242 and the operator tadpole gives 1 3 4 : Nc · − − Z0 + 64 3 4Nc
(18.85)
(18.86)
(18.87)
The diagram containing the 4-gluon vertex (the rightmost in Fig. 21) is zero for this operator. One must still add the gluon self-energy at 1 loop (Caracciolo, et al., 1992; Capitani and Rossi, 1995a), 7 1 7 5 1 2 + Z + − (log p + − F ) − : (18.88) Nc · 0 E 0 16 362 72 482 8Nc This is the case Nf = 0, that is the quenched approximation. The diagrams are as in Fig. 6, without the quark loops. The numerical result for the gluon self-energy is 1 1 21:679380 Nc − 19:739209 : (18.89) 162 Nc Summing everything together we have that the complete renormalization of the operator < Tr (F < F<) ) at 1 loop is given by 5 1 3 5 1 Z0 − Z1 + − : (18.90) Nc · − − 2 64 48 16 16 8Nc
The divergences of the individual diagrams have canceled, and so the energy-momentum tensor has zero anomalous dimension, as it should. Numerically one has
\[ \frac{1}{16\pi^2}\left( -17.778285\,N_c + 19.739209\,\frac{1}{N_c} \right) . \tag{18.91} \]
Using the values of $Z_0$ and $Z_1$ reported in Appendix B this result (as well as the gluon self-energy, Eq. (18.89)) can be stated with almost 400 significant decimal places.

We want to point out that working out the color structure for the diagrams considered here is more complicated than in the case of the quark matrix elements measuring the moments of the unpolarized quark distributions. The reduction of the color factors, the $f^{abc}$ tensors, etc. can become quite cumbersome. On the other hand, the momentum integrals are much simpler in these gluonic matrix elements than in the quark case, because the fermion propagator is missing. Numerically evaluating integrals that have only gluon propagators poses much less of a computational challenge.

18.5. The general fermionic case

Similarly to the bosonic case, one only needs to compute lattice integrals with vanishing external momenta. Any lattice zero-momentum integral coming from the calculation of lattice Feynman diagrams in the general Wilson case can be written as a linear combination of terms of the form
\[ F(p,q;n_x,n_y,n_z,n_t) = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, \frac{\hat{k}_x^{2n_x}\,\hat{k}_y^{2n_y}\,\hat{k}_z^{2n_z}\,\hat{k}_t^{2n_t}}{D_F(k,m_f)^p\, D_B(k,m_b)^q} , \tag{18.92} \]
where $p$, $q$ and $n_i$ are positive integers, the inverse bosonic propagator is
\[ D_B(k,m_b) = \hat{k}^2 + m_b^2 , \tag{18.93} \]
and the denominator appearing in the propagator of Wilson fermions is taken to be
\[ D_F(k,m_f) = \sum_i \sin^2 k_i + \frac{r^2}{4}\,(\hat{k}^2)^2 + m_f^2 . \tag{18.94} \]
Actually, the correct Wilson denominator would be
\[ \hat{D}_F(k,m_f) = \sum_i \sin^2 k_i + \Big( \frac{r}{2}\,\hat{k}^2 + m_f \Big)^2 ; \tag{18.95} \]
however, in this algorithm $m_f$ only plays the role of an infrared regulator and thus it does not need to be the true fermion mass. This form of the fermion propagator turns out to be much easier to handle than the true one. Moreover, for integrals which are only logarithmically divergent, the two forms give exactly the same result in the limit of small quark masses.

As we did in the bosonic case, we first generalize the integrals to noninteger dimensions, introducing
\[ F_\delta(p,q;n_x,n_y,n_z,n_t) = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, \frac{\hat{k}_x^{2n_x}\,\hat{k}_y^{2n_y}\,\hat{k}_z^{2n_z}\,\hat{k}_t^{2n_t}}{D_F(k,m_f)^{p+\delta}\, D_B(k,m_b)^q} , \tag{18.96} \]
where $p$ and $q$ are arbitrary integers (not necessarily positive) and $\delta$, having been used in the intermediate steps of the calculations, will be safely set to zero at the end.
It turns out that every $F_\delta(p,q;n_x,n_y,n_z,n_t)$ with $q \le 0$ (i.e., a purely fermionic integral) can be expressed iteratively in terms of nine purely fermionic integrals, in our case $F(1,0)$, $F(1,-1)$, $F(1,-2)$, $F(2,0)$, $F(2,-1)$, $F(2,-2)$, $F(3,-2)$, $F(3,-3)$ and $F(3,-4)$. Purely fermionic integrals can always be expressed in terms of integrals of the same type. This is a general property of all recursion relations. The integral $F(2,0)$ appears only in the case of divergent integrals. Only eight constants are then needed if the original purely fermionic integral is finite (i.e., $q \le 0$ and $p \le 1$). In the general case in which $q$ can be positive one needs three additional constants, called $Y_1$, $Y_2$, and $Y_3$, to describe the mixed fermionic–bosonic integrals, plus the constants $Z_0$, $Z_1$ and $F_0$ which already appeared in the purely bosonic case.

While for the bosonic case we could give a complete treatment of the reduction steps, for fermions, due to the complexity of the procedure, we can only sketch them. Readers interested in the method can find all details in Burgio et al. (1996).

There are four steps in this general fermionic method. The first step consists in expressing each integral $F_\delta(p,q;n_x,n_y,n_z,n_t)$ in terms of $F_\delta(p,q)$ only (that is, pure denominators). For this we need three sets of recursion relations. From the trivial identity $D_B(k,m_b) = \sum_i \hat{k}_i^2 + m_b^2$ we obtain the first set of recursion relations:
\[ F_\delta(p,q;1) = \frac{1}{4}\,\big[ F_\delta(p,q-1) - m_b^2\, F_\delta(p,q) \big] , \tag{18.97} \]
\[ F_\delta(p,q;x,1) = \frac{1}{3}\,\big[ F_\delta(p,q-1;x) - m_b^2\, F_\delta(p,q;x) - F_\delta(p,q;x+1) \big] , \tag{18.98} \]
\[ F_\delta(p,q;x,y,1) = \frac{1}{2}\,\big[ F_\delta(p,q-1;x,y) - m_b^2\, F_\delta(p,q;x,y) - F_\delta(p,q;x+1,y) - F_\delta(p,q;x,y+1) \big] , \tag{18.99} \]
\[ F_\delta(p,q;x,y,z,1) = F_\delta(p,q-1;x,y,z) - m_b^2\, F_\delta(p,q;x,y,z) - F_\delta(p,q;x+1,y,z) - F_\delta(p,q;x,y+1,z) - F_\delta(p,q;x,y,z+1) . \tag{18.100} \]
From the identity
\[ \sum_i \hat{k}_i^4 = 4\,\big( D_B(k,m_b) - D_F(k,m_f) - m_b^2 + m_f^2 \big) + r^2\,\big( D_B(k,m_b) - m_b^2 \big)^2 \tag{18.101} \]
we get a second set of recursion relations. Here we give only some examples:
\[ F_\delta(p,q;x,y,2) = 2\,\Big[ F_\delta(p,q-1;x,y) - F_\delta(p-1,q;x,y) + (m_f^2 - m_b^2)\,F_\delta(p,q;x,y) - \frac{1}{4}\,F_\delta(p,q;x+2,y) - \frac{1}{4}\,F_\delta(p,q;x,y+2) \Big] + \frac{r^2}{2}\,\Big[ F_\delta(p,q-2;x,y) - 2 m_b^2\, F_\delta(p,q-1;x,y) + m_b^4\, F_\delta(p,q;x,y) \Big] , \tag{18.102} \]
\[ F_\delta(p,q;x,y,z,2) = 4\,\Big[ F_\delta(p,q-1;x,y,z) - F_\delta(p-1,q;x,y,z) + (m_f^2 - m_b^2)\,F_\delta(p,q;x,y,z) - \frac{1}{4}\,F_\delta(p,q;x+2,y,z) - \frac{1}{4}\,F_\delta(p,q;x,y+2,z) - \frac{1}{4}\,F_\delta(p,q;x,y,z+2) \Big] + r^2\,\Big[ F_\delta(p,q-2;x,y,z) - 2 m_b^2\, F_\delta(p,q-1;x,y,z) + m_b^4\, F_\delta(p,q;x,y,z) \Big] . \tag{18.103} \]
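The first two sets of recursion relations rest on pointwise identities for the integrand plus the permutation symmetry of the four directions, so they already hold for a symmetric momentum grid. The following is a rough sketch of such a check at $\delta = 0$ with finite masses. The function `grid_integral`, its default masses and the grid size are illustrative choices, not taken from the text, and a coarse midpoint grid is of course not an accurate evaluation of the integrals themselves.

```python
import math
from itertools import product


def grid_integral(p, q, n=(), m_f=0.5, m_b=0.7, r=1.0, N=6):
    """Midpoint-grid estimate of F(p, q; n_x, n_y, n_z, n_t) at delta = 0.

    Trailing zeros in n may be omitted, as in the text.
    """
    ks = [-math.pi + (j + 0.5) * 2 * math.pi / N for j in range(N)]
    khat2 = [4 * math.sin(k / 2) ** 2 for k in ks]   # \hat{k}_i^2
    sin2 = [math.sin(k) ** 2 for k in ks]
    n = tuple(n) + (0,) * (4 - len(n))
    total = 0.0
    for idx in product(range(N), repeat=4):
        k2 = sum(khat2[i] for i in idx)
        d_b = k2 + m_b ** 2                                  # Eq. (18.93)
        d_f = sum(sin2[i] for i in idx) + 0.25 * r ** 2 * k2 ** 2 + m_f ** 2  # Eq. (18.94)
        num = 1.0
        for i, n_i in zip(idx, n):
            num *= khat2[i] ** n_i
        total += num / (d_f ** p * d_b ** q)
    return total / N ** 4


def check_first_and_second_set(p=2, q=1, m_f=0.5, m_b=0.7, r=1.0):
    """Return (lhs, rhs) pairs for Eq. (18.97) and for Eq. (18.102) with x = y = 1."""
    F = lambda pp, qq, n=(): grid_integral(pp, qq, n, m_f=m_f, m_b=m_b, r=r)
    first = (F(p, q, (1,)), 0.25 * (F(p, q - 1) - m_b ** 2 * F(p, q)))
    rhs = 2 * (F(p, q - 1, (1, 1)) - F(p - 1, q, (1, 1))
               + (m_f ** 2 - m_b ** 2) * F(p, q, (1, 1))
               - 0.25 * F(p, q, (3, 1)) - 0.25 * F(p, q, (1, 3)))
    rhs += 0.5 * r ** 2 * (F(p, q - 2, (1, 1)) - 2 * m_b ** 2 * F(p, q - 1, (1, 1))
                           + m_b ** 4 * F(p, q, (1, 1)))
    second = (F(p, q, (1, 1, 2)), rhs)
    return first, second
```

Both pairs returned by `check_first_and_second_set()` should agree up to rounding, which is a quick way to catch sign or coefficient slips when coding the relations.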
Integrating by parts, for $s \ge 3$, the equation
\[ \frac{(\hat{k}_w^2)^s}{D_F(k,m_f)^{p+\delta}} = \frac{4\,(\hat{k}_w^2)^{s-1} - 4\,(2 + r^2 \hat{k}^2)\,(\hat{k}_w^2)^{s-3} \sin^2 k_w}{D_F(k,m_f)^{p+\delta}} - \frac{4\,(\hat{k}_w^2)^{s-3} \sin k_w}{p+\delta-1}\, \frac{\partial}{\partial k_w}\, \frac{1}{D_F(k,m_f)^{p-1+\delta}} , \tag{18.104} \]
we obtain the third set of recursion relations:
\[ F_\delta(p,q;\ldots,s) = 6\,F_\delta(p,q;\ldots,s-1) - 8\,F_\delta(p,q;\ldots,s-2) - 4r^2\,F_\delta(p,q-1;\ldots,s-2) + 4r^2 m_b^2\,F_\delta(p,q;\ldots,s-2) + r^2\,F_\delta(p,q-1;\ldots,s-1) - r^2 m_b^2\,F_\delta(p,q;\ldots,s-1) + \frac{4}{p+\delta-1}\,\Big[ -2q\,F_\delta(p-1,q+1;\ldots,s-2) + \frac{q}{2}\,F_\delta(p-1,q+1;\ldots,s-1) + (2s-5)\,F_\delta(p-1,q;\ldots,s-3) - \frac{1}{2}\,(s-2)\,F_\delta(p-1,q;\ldots,s-2) \Big] . \tag{18.105} \]
As before, by looking at this relation for $p = 1$ we see that in general we have to keep contributions of order $\delta$ in the intermediate stages of the algebraic reductions. Using the recursion relations introduced up to now, each integral $F_\delta(p,q;n_x,n_y,n_z,n_t)$ can be reduced after the first step to a sum of pure denominator integrals of the form
\[ F_\delta(p,q;n_x,n_y,n_z,n_t) = \sum_{r=p-k+1}^{p}\ \sum_{s=q-k}^{q+k} a_{rs}(m,\delta)\, F_\delta(r,s) , \tag{18.106} \]
where $k = n_x + n_y + n_z + n_t$, and $m = m_b = m_f$. In the limit $m \to 0$ this becomes
\[ F_\delta(p,q;n_x,n_y,n_z,n_t) = \sum_{r=p-k+1}^{p}\ \sum_{s=q-k}^{q+k} a_{rs}(0,\delta)\, F_\delta(r,s) + R(m,\delta) + O(m^2) , \tag{18.107} \]
where $R(m,\delta)$ depends on the values of $p$ and $q$ as follows:
1. $p > 0$, $q \le 0$: $R(m,\delta)$ is a polynomial in $1/m^2$, finite for $\delta \to 0$;
2. $p > 0$, $q > 0$: $R(m,\delta) = \frac{1}{\delta}\,(1 - \delta \log m^2)\,R^{(1)}(m) + R^{(2)}(m) + O(\delta)$, where $R^{(i)}(m)$ are polynomials in $1/m^2$;
3. $p \le 0$, $q \le 0$: $R(m,\delta) = 0$;
4. $p \le 0$, $q > 0$: $R(m,\delta) = (1 - \delta \log m^2)\,R^{(1)}(m) + \delta\,R^{(2)}(m) + O(\delta^2)$.

In the second step, with the systematic use of the identity
\[ F_\delta(p,q;1,1,1,1) - 4\,F_\delta(p,q+1;2,1,1,1) - m^2\,F_\delta(p,q+1;1,1,1,1) = 0 , \tag{18.108} \]
one obtains a nontrivial relation of the form
\[ \sum_{r,s} f_{rs}(p,q;\delta)\, F_\delta(r,s) + R(p,q;m,\delta) = 0 , \tag{18.109} \]
where $p - 4 \le r \le p$. As in the bosonic case, it can be used to obtain new recursion relations; in this case, one can exploit it in three different ways. After the second step one is then able to reduce every $F_\delta(p,q)$ in terms of only the $F_\delta(r,s)$'s with $0 \le r \le 3$ and arbitrary $s$, or $r \le -1$ and $s = 1, 2, 3$, or $r \ge 4$ and $s = 0, -1, -2$.

In the third step we systematically use the identity
\[ F_\delta(p,q;1,1,1,1) - F_\delta(p+1,q-1;1,1,1,1) + F_\delta(p+1,q;3,1,1,1) - \frac{1}{4}\,\Big[ F_\delta(p+1,q-2;1,1,1,1) - 2m^2\,F_\delta(p+1,q-1;1,1,1,1) + m^4\,F_\delta(p+1,q;1,1,1,1) \Big] = 0 , \tag{18.110} \]
which is applied in four different ways according to the particular properties of the four regions: $q \le 0$ and $0 \le p \le 3$; $q > 0$ and $0 \le p \le 3$; $q = 1, 2, 3$ and $p \le -1$; $q = -2, -1, 0$ and $p \ge 4$. After the third step we can express the remaining $F_\delta(p,q)$ in terms of only the $F_\delta(r,s)$'s with $r = 3$ and $-4 \le s \le 0$, or $r = 2$ and $-4 \le s \le 2$, or $r = 1$ and $-4 \le s \le 4$, or $r = 0$ and $-4 \le s \le 6$, or $r = -1$ and $s = 2$.

In the fourth step the identities in Eqs. (18.108) and (18.110), which so far have not been used for all possible values of $p$ and $q$, are further exploited to provide additional relations between the remaining integrals. One has to look systematically for further values of $p$ and $q$ for which these two identities are not trivially satisfied. At the end of this process one can then achieve a further decrease of the number of independent constants. We give an example of these additional relations:
1 19 1 2 F (0; 4) = (1 − log m ) − + 962 m4 46082 m2 92162 31 13 1 F (0; 3) − F (0; 2) + F (0; 1) 144 1152 9216 61 80989 347 5 F (0; 3) + + + + − 5762 m4 184322 m2 884736002 1440 +
−
83 137 689 1139 F (0; 1) − F (1; 2) + F (1; 1) F (0; 2) + 2560 184320 2880 23040
−
23 329 13283 415 F (1; 0) + F (1; −1) − F (2; 0) + F (2; −1) 147456 4423680 11520 1105920
Table 4
New constants appearing in the general fermionic case.

Constant     Value
F(1, 0)       0.08539036359532067914
F(1, −1)      0.46936331002699614475
F(1, −2)      3.39456907367713000586
F(2, −1)      0.05188019503901136636
F(2, −2)      0.23874773756341478520
F(3, −2)      0.03447644143803223145
F(3, −3)      0.13202727122781293085
F(3, −4)      0.75167199030295682254
Y0           −0.01849765846791657356
Y1            0.00376636333661866811
Y2            0.00265395729487879354
Y3            0.00022751540615147107
391 30479 437 F (2; −2) − F (3; −2) + F (3; −3) 221184 2211840 245760
161 + F (3; −4) : 589824
−
(18.111)
At the end of these complicated reductions, one is able to write a generic purely fermionic integral ($p > 0$ and $q \le 0$) in terms of eight infrared-finite integrals: $F(1,0)$, $F(1,-1)$, $F(1,-2)$, $F(2,-1)$, $F(2,-2)$, $F(3,-2)$, $F(3,-3)$ and $F(3,-4)$, plus a constant, $Y_0$, which appears in the logarithmically divergent integral
\[ F(2,0) = -\frac{1}{16\pi^2}\,\big( \log m^2 + \gamma_E - F_0 \big) + Y_0 . \tag{18.112} \]
For the mixed fermionic–bosonic integrals ($p > 0$ and $q > 0$) one must introduce three additional constants, which have been chosen as
\[ Y_1 = \frac{1}{8}\,F(1,1;1,1,1) , \tag{18.113} \]
\[ Y_2 = \frac{1}{16}\,F(1,1;1,1,1,1) , \tag{18.114} \]
\[ Y_3 = \frac{1}{16}\,F(1,2;1,1,1) . \tag{18.115} \]
In Table 4 we report the values of the new basic constants introduced in the general case.

Most of the basic recursion relations for fermions presented above can be easily generalized to integrals which are dimensionally regularized. However, some of them are intrinsically four-dimensional identities. For this reason it is more convenient to apply the above reductions, which use a mass as a regulator. We do not give here the complicated relation between the two schemes, but only say that fortunately the case of logarithmically divergent integrals can still be dealt with rather easily, because it is enough to make the substitution $2/(d-4) + \log 4\pi$ for $\log m^2 + \gamma_E$.

The relation between integrals computed using the true fermion propagator and integrals computed using the propagator (18.94) is also rather complicated. However, if the integrals are only logarithmically divergent, for
$m \to 0$ the two results coincide. On the other hand, for power-divergent expressions, if one uses the true fermion propagator the divergent part is a polynomial in $1/m$ instead of $1/m^2$, and the expressions are in general more cumbersome than those involving the propagator in Eq. (18.94).

The method discussed here depends on the form of the quark propagator, but not on the vertices. It can thus be applied to $O(a)$ improved fermions as well. A version for, say, overlap fermions, which involves more complicated denominators, has not yet been developed. In this case one has to find a generalization of the method, but it is expected that the corresponding recursion relations will turn out to be much more complicated than the already cumbersome Wilson case. A convenient method to reduce all integrals could be that of Laporta (2000), which has been used in Becher and Melnikov (2002). It uses a brute-force approach which reduces integrals to simpler ones (with lower values of the indices) by means of a classification based on a lexicographic order, without the need to find a complicated system of recurrence relations like the one presented above.

18.6. The quark self-energy

Using the algebraic methods explained above, it is possible to give a purely algebraic result for the 1-loop quark self-energy (for $r = 1$, in the Feynman gauge). This was first computed in González-Arroyo et al. (1982), Hamber and Wu (1983) and Groot et al. (1984). The result for the 1-loop quark self-energy
\[ \Sigma(p^2,m^2) = g_0^2\, C_F\, \big( \tilde{m}_c + i \not{p}\, \tilde{\Sigma}_1(p^2,m^2) + m\, \tilde{\Sigma}_2(p^2,m^2) \big) , \tag{18.116} \]
in terms of the basic constants is (Burgio et al., 1996)
\[ \tilde{m}_c = -Z_0 - 2F(1,0) \approx -0.32571411742170157236 , \]
$\tilde{\Sigma}_1(p^2,m^2) =$
≈
(18.117)
1 1 1 (2G(p2 a2 ; m2 a2 ) + E − F0 ) + Z0 + 162 8 192 −
1 1 1 1 Y2 + 12 Y3 − F(1; −2) − Y0 + Y1 − 4 16 768 322
−
109 1 25 1 F(1; −1) + F(1; 0) − F(2; −2) + F(2; −1) 192 192 768 48
1 G(p2 a2 ; m2 a2 ) + 0:0877213749 ; 82
(18.118)
1 1 1 − 2 T˜ 2 (p2 ; m2 ) = 2 (F(p2 a2 ; m2 a2 ) + E − F0 ) + 4 48 4 −4 Y0 + Y1 − − ≈
1 1 1 Y2 − F(1; −2) − F(1; −1) 4 192 48
1 49 83 F(1; 0) − F(2; −2) + F(2; −1) 48 192 12
1 F(p2 a2 ; m2 a2 ) + 0:0120318529 ; 42
(18.119)
where
\[ F(p^2a^2, m^2a^2) = \int_0^1 dx\, \log\big[ (1-x)\,(p^2 x + m^2)\,a^2 \big] , \tag{18.120} \]
\[ G(p^2a^2, m^2a^2) = \int_0^1 dx\, x\, \log\big[ (1-x)\,(p^2 x + m^2)\,a^2 \big] . \tag{18.121} \]
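Two small consistency checks can be made on these expressions. The sketch below first recomputes the critical mass constant $-Z_0 - 2F(1,0)$ with exact decimal arithmetic, assuming the standard 20-digit value of $Z_0$ (not listed in this section) together with the Table 4 value of $F(1,0)$. It then evaluates the two Feynman-parameter integrals by a plain midpoint rule, which for $m = 0$ can be compared with the elementary closed forms $\log(p^2a^2) - 2$ and $\frac{1}{2}\log(p^2a^2) - 1$. The function names and the quadrature are illustrative choices only.

```python
import math
from decimal import Decimal

# Assumption: 20-digit literature value of Z0 (this section refers to Appendix B).
Z0 = Decimal("0.15493339023106021408")
# Value of F(1,0) as listed in Table 4.
F_1_0 = Decimal("0.08539036359532067914")


def critical_mass():
    """Return -Z0 - 2 F(1,0), the constant of Eq. (18.117)."""
    return -Z0 - 2 * F_1_0


def feynman_log_integral(p2a2, m2a2, weight_x=False, steps=200_000):
    """Midpoint rule for int_0^1 dx [x] log[(1 - x)(p^2 x + m^2) a^2].

    weight_x=False gives the integral of Eq. (18.120), weight_x=True the
    one of Eq. (18.121). The midpoints never hit the endpoint singularities.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        val = math.log((1.0 - x) * (p2a2 * x + m2a2))
        total += x * val if weight_x else val
    return total * h
```

The midpoint rule converges slowly near the logarithmic endpoints, so this is adequate for a sanity check but not for high-precision work.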
The importance of having explicit expressions like these should not be underestimated. Thanks to them, the 1-loop self-energy $\Sigma(p^2,m^2)$ can now be computed with many significant decimal places, provided the basic constants are determined with sufficient accuracy.

19. Coordinate space methods

In this section we illustrate the coordinate space method, developed by Lüscher and Weisz (1995b), which is very powerful for a variety of reasons. In particular it turns out to be very important for the calculation of 1-loop integrals with very high precision. Having 1-loop integrals determined with such precision is necessary for the implementation of the only known method (which will also be presented here) with which the computation of 2-loop integrals with good precision can be carried out.

A fundamental object for the coordinate space method is the mixed fermionic–bosonic propagator in position space, which will be denoted by
\[ G_F(p,q;x) = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, \frac{e^{ikx}}{D_F(k,m)^p\, D_B(k,m)^q} . \tag{19.1} \]
Any of these propagators $G_F(p,q;x)$ can be expressed as a linear combination of the integrals $F$ introduced in Section 18.5, and consequently as a linear combination of the 15 basic constants $Z_0, Z_1, F_0, \ldots, Y_0, Y_1, Y_2, Y_3$. Of course the general position space propagator $G_F(p,q;x)$ can also be expressed in terms of a different set of 15 basic constants, and general recursion relations between the $G_F$'s can be derived. We can always consider the $G_F$'s instead of the $F$'s as an intermediate representation of the general integral coming from the calculations of Feynman diagrams, and then express every $G_F$ in terms of a chosen set of fifteen constants.

To begin with we consider the finite bosonic case, in which, as we know, only two basic constants are needed. In the bosonic case a simple reduction algorithm was developed by Lüscher and Weisz (1995b), following ideas by Vohwinkel. For $(p,q) = (0,1)$ (i.e., the standard boson propagator), a recursion relation which involves only terms with the same $(p,q) = (0,1)$ was obtained. This simple recursion relation avoids the introduction of noninteger dimensions and nonpositive values of $q$, as opposed to the case of algorithms for general $p$ and $q$. The free lattice gluon propagator in position space can then be evaluated recursively. It is a linear function of its values near the origin. Lüscher and Weisz chose as basic constants two values of the propagator close to the origin:
\[ G_F(0,1;(0,0,0,0)) = Z_0 , \qquad G_F(0,1;(1,1,0,0)) = Z_0 + Z_1 - \frac{1}{4} , \tag{19.2} \]
where the relation between these two constants and the constants of Section 18 is also shown. We denote the gluon propagator by $G(x) = G_F(0,1;x)$. The key observation in the Lüscher-Weisz algorithm is that
\[ (\nabla^*_\mu + \nabla_\mu)\, G(x) = x_\mu\, H(x) , \tag{19.3} \]
where the function
\[ H(x) = \int_{-\pi}^{\pi} \frac{d^4p}{(2\pi)^4}\, e^{ipx}\, \log \hat{p}^2 \tag{19.4} \]
is independent of $\mu$. Summing now this key formula over $\mu$ and using $-\Delta G(x) = \delta_{x,0}$ (where $\Delta = \sum_{\mu=0}^{3} \nabla^*_\mu \nabla_\mu$) we get
\[ H(x) = \frac{2}{\rho}\, \sum_{\mu=0}^{3} \big[ G(x) - G(x - \hat{\mu}) \big] , \qquad \rho = \sum_{\mu=0}^{3} x_\mu , \tag{19.5} \]
which can be substituted back to eliminate $H(x)$ from the key formula. We then obtain the fundamental recursion relation for the gluon propagator in position space:
\[ G(x + \hat{\mu}) = G(x - \hat{\mu}) + \frac{2 x_\mu}{\rho}\, \sum_{\nu=0}^{3} \big[ G(x) - G(x - \hat{\nu}) \big] . \tag{19.6} \]
This formula is to be used for $\rho \neq 0$. Since the propagator is independent of the sign and order of the four coordinates, we can restrict ourselves to $x_0 \ge x_1 \ge x_2 \ge x_3 \ge 0$. In this sector, the recursion relation allows one to express $G(x)$ in terms of its values at five points: $G(0,0,0,0)$, $G(1,0,0,0)$, $G(1,1,0,0)$, $G(1,1,1,0)$, $G(1,1,1,1)$. Now, using the properties of the propagator, three more relations between these five constants can be found,
\[ G(0,0,0,0) - G(1,0,0,0) = \frac{1}{8} , \]
\[ G(0,0,0,0) - 3\,G(1,1,0,0) - 2\,G(1,1,1,0) = \frac{1}{\pi^2} , \]
\[ G(0,0,0,0) - 6\,G(1,1,0,0) - 8\,G(1,1,1,0) - 3\,G(1,1,1,1) = 0 , \tag{19.7} \]
and thus we can finally write the generic bosonic propagator $G(x)$ in terms of only two constants:
\[ G(x) = r_1(x)\, G(0,0,0,0) + r_2(x)\, G(1,1,0,0) + \frac{r_3(x)}{\pi^2} + r_4(x) . \tag{19.8} \]
The coefficients $r_k(x)$ are rational numbers which can be computed in a recursive manner. We have thus expressed the free gluon propagator in terms of two basic constants, which can be reinterpreted as the values of the propagator near the origin.

The generalization of the coordinate method by Lüscher and Weisz to fermions is more complicated, and we will not discuss it here. For fermions a recursion relation analogous to Eq. (19.6) which only involves the fermion propagator itself (i.e., without excursions to other values of $p$ and $q$) has not yet been found, and the reduction to the basic constants has to be carried out along more complicated procedures. The free fermion propagator can be written at the end in terms of eight basic constants, which can be reinterpreted as some values of the quark propagator near the origin.

19.1. High-precision integrals

The reduction of the coordinate space propagators to a small set of basic constants, which we have just described, has interesting properties that turn out to be very useful for the high-precision computation of these basic constants. One can show that the coefficients $r_k$ in Eq. (19.8) increase
exponentially with the distance $x$, while the propagator $G(x)$ remains bounded. There are therefore huge cancellations and loss of accuracy at large $x$. Instead of looking at this numerical instability as a nuisance, one can exploit it in order to compute the basic constants with very high precision (Lüscher and Weisz, 1995b). Let us consider the boson propagator at the points
\[ x_1 = (n,0,0,0) , \qquad x_2 = (n,1,0,0) \tag{19.9} \]
for large $n$. The associated sets of coefficients $r_k(x_1)$ and $r_k(x_2)$ in Eq. (19.8),
\[ G(x_1) = r_1(x_1)\,G(0,0,0,0) + r_2(x_1)\,G(1,1,0,0) + \frac{r_3(x_1)}{\pi^2} + r_4(x_1) , \]
\[ G(x_2) = r_1(x_2)\,G(0,0,0,0) + r_2(x_2)\,G(1,1,0,0) + \frac{r_3(x_2)}{\pi^2} + r_4(x_2) , \tag{19.10} \]
are of order $10^n$. If this $2 \times 2$ algebraic system is now inverted in terms of the unknowns $G(0,0,0,0)$ and $G(1,1,0,0)$, the coefficients multiplying $G(x_1)$ and $G(x_2)$ will be of order $10^{-n}$. Therefore the constants $G(0,0,0,0)$ and $G(1,1,0,0)$ can be determined to this level of precision if we neglect $G(x_1)$ and $G(x_2)$ in the inverted $2 \times 2$ system. $G(0,0,0,0)$ and $G(1,1,0,0)$ are then fixed entirely by the rational coefficients $r_k$ and by $1/\pi^2$. It costs very little to go to very high $n$ and obtain the values of these two constants with very high precision. One can then systematically improve on the accuracy until the desired level of precision is reached. The method is exponentially convergent, and the error can also be estimated with good accuracy. By contrast, a direct evaluation of the integrals defining the basic constants can never provide such accurate results.

Using this method we have calculated $Z_0$ and $Z_1$, the two constants which are the basis of finite bosonic integrals, with almost 400 significant decimal places. These new results are given in Appendix B.

In the general mixed fermionic case, the expressions for a generic $G_F(p,q;x)$ in terms of fifteen values of it near the origin are also numerically unstable for $|x| \to \infty$. If we consider for example the fermion propagator, we can choose eight points with $|x| \approx n$ (say $y_1, \ldots, y_8$) and then we can express the propagator for $|x| < n$ in terms of $G_F(1,0;y_i)$. These expressions are numerically stable: considering for example the set of eight points $X^{(n)} \equiv \{(n,[0-3],0,0),\ (n+1,[0-3],0,0)\}$, for high $n$ one can obtain the values of the propagator near the origin with great precision. It is also possible to apply the same procedure to $G_F(1,q;x)$ with $q < 0$. The main advantage is that, using larger negative values of $q$, one can obtain more precise estimates of $G_F(1,q;x)$ at this set of points. Once one has extracted the fifteen values for the mixed propagators near the origin, it is straightforward to change basis and compute the fifteen basic constants for 1-loop integrals, $Z_0, Z_1, \ldots, Y_2, Y_3$. All constants (except $F_0$) are now known with a precision of sixty decimal places (Caracciolo et al., 2001).
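The bosonic recursion of Eq. (19.6) and the inversion trick of this subsection can be sketched in a few lines. The code below stores each $G(x)$ as exact rational coefficients $(r_1, r_2, r_3, r_4)$ of Eq. (19.8), starting from the five points near the origin and the three relations of Eq. (19.7). It then solves the $2 \times 2$ system of Eq. (19.10) with $G(x_1)$ and $G(x_2)$ set to zero. The names `coeffs` and `estimate_constants`, the choice of always stepping in the direction of the largest coordinate, and the 50-digit rational stand-in for $\pi$ are assumptions made for this illustration; this is not the implementation used to obtain the Appendix B values.

```python
from fractions import Fraction
from functools import lru_cache

F = Fraction

# Coefficients (r1, r2, r3, r4) of Eq. (19.8) at the five points near the
# origin; the last three entries follow from solving Eq. (19.7).
BASE = {
    (0, 0, 0, 0): (F(1), F(0), F(0), F(0)),
    (1, 0, 0, 0): (F(1), F(0), F(0), F(-1, 8)),
    (1, 1, 0, 0): (F(0), F(1), F(0), F(0)),
    (1, 1, 1, 0): (F(1, 2), F(-3, 2), F(-1, 2), F(0)),
    (1, 1, 1, 1): (F(-1), F(2), F(4, 3), F(0)),
}


def canon(x):
    """Map x to the sector x0 >= x1 >= x2 >= x3 >= 0 (G is symmetric)."""
    return tuple(sorted((abs(c) for c in x), reverse=True))


@lru_cache(maxsize=None)
def coeffs(x):
    """Exact coefficients r_k(x), from Eq. (19.6) applied at y = x - e_0."""
    x = canon(x)
    if x in BASE:
        return BASE[x]
    y = (x[0] - 1,) + x[1:]          # here x[0] >= 2, so rho = sum(y) >= 1
    rho = sum(y)
    g_y = coeffs(y)
    acc = [F(0)] * 4                  # sum over nu of [G(y) - G(y - e_nu)]
    for nu in range(4):
        shifted = list(y)
        shifted[nu] -= 1
        g_s = coeffs(tuple(shifted))
        acc = [a + (b - c) for a, b, c in zip(acc, g_y, g_s)]
    back = coeffs((y[0] - 1,) + y[1:])
    factor = F(2 * y[0], rho)
    return tuple(b + factor * a for b, a in zip(back, acc))


# Rational approximation of pi (50 digits), enough for moderate n.
PI = F("3.14159265358979323846264338327950288419716939937510")


def estimate_constants(n):
    """Solve Eq. (19.10) for G(0,0,0,0) and G(1,1,0,0), neglecting G(x1), G(x2)."""
    a1, b1, c1, d1 = coeffs((n, 0, 0, 0))
    a2, b2, c2, d2 = coeffs((n, 1, 0, 0))
    rhs1 = -(c1 / PI ** 2 + d1)
    rhs2 = -(c2 / PI ** 2 + d2)
    det = a1 * b2 - a2 * b1
    g0 = (rhs1 * b2 - rhs2 * b1) / det
    g11 = (a1 * rhs2 - a2 * rhs1) / det
    return g0, g11
```

By Eq. (19.2), `g0` estimates $Z_0$ and `g11 - g0 + 1/4` estimates $Z_1$. Increasing `n` should gain roughly one decimal digit per step, until the accuracy of the rational approximation of $\pi$ becomes the limiting factor.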
19.2. Coordinate space methods for 2-loop computations

The determination of 1-loop integrals with high precision is an essential component of refined 2-loop calculations which are carried out using coordinate space methods. These methods reach a precision unmatched by more conventional ones.
A few 2-loop calculations have also been performed by means of sophisticated extrapolations to infinite $L$ of momentum sums, using fitting functions and blocking transformations of the kind that we have discussed in Section 17.$^{89}$ The precision reached with these conventional methods can be good in some cases, but the precision which can be achieved using the coordinate method is already higher at present, and can be easily increased (by computing the 1-loop “building blocks” with higher precision). Sometimes the absolute values of the integrals to be computed are rather small, less than $10^{-6}$,$^{90}$ and in order to get a good relative error the coordinate method seems to be the most suitable.

89 In a couple of cases even 3-loop calculations have been carried out (Allés et al., 1994a, b, 1998).
90 An example of this is given by the calculation of the coefficients $C^5_\pm$ of Eq. (14.19), which has been carried out using conventional 2-loop integration methods (Curci et al., 1988).

The 2-loop calculations completed so far, using either conventional or coordinate space methods, only concern quantities that are finite, like the coefficient $b_2$ of the $\beta$ function, mentioned in Section 10. No divergent matrix elements have yet been computed. Although some of the individual diagrams for finite matrix elements can contain subdivergences, the situation in which the matrix elements are themselves divergent is certainly more challenging.

19.2.1. Bosonic case

We will now illustrate the calculation of 2-loop integrals using the coordinate space method of Lüscher and Weisz (1995b). At first we discuss the case of integrals with zero external momenta, which is simpler. Let us then consider the 2-loop integral in momentum space
\[ I_1 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{1}{\hat{k}^2\, \hat{q}^2\, \widehat{(k+q)}^2} . \tag{19.11} \]
This integral can be reexpressed as follows:
\[ I_1 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, (2\pi)^4\, \delta^{(4)}(k+q+r)\, \frac{1}{\hat{k}^2\, \hat{q}^2\, \hat{r}^2} \]
\[ \phantom{I_1} = \sum_x \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, e^{ikx}\, \frac{1}{\hat{k}^2}\, e^{iqx}\, \frac{1}{\hat{q}^2}\, e^{irx}\, \frac{1}{\hat{r}^2} \]
\[ \phantom{I_1} = \sum_x \left[ \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, e^{ikx}\, \frac{1}{\hat{k}^2} \right]^3 = \sum_x G^3(x) . \tag{19.12} \]
The 2-loop integral in momentum space has thus been written in terms of a sum of a simple function of the position-space propagator $G(x)$ over the lattice sites. We have changed from an eight-dimensional sum in momentum space to a four-dimensional sum in position space. Since $G(x) \sim 1/x^2$, this sum is absolutely convergent. An evaluation of the sum over a finite domain containing the origin, say the region $|x| < 20$, can then be taken as a first approximation of the integral $I_1$. Of course one must also know the propagator $G(x)$ with very good accuracy, because
some precision will be lost when computing the sums. It is here that the high-precision technique which we have just discussed turns out to be handy. In particular, the recursion relations in Eq. (19.6) allow the computation of G(x) in terms of two constants (which can be chosen to be Z0 and Z1 ). These can be computed with arbitrary precision, and thus G(x) can be computed with that precision. If G(x) can be determined (for any x in the domain) with arbitrary precision, the only remaining challenge is to evaluate the sums over a reasonable domain of sites in the most eJective way. Of course the domain cannot be too large because of computational limitations. The sums can then be better evaluated by exploiting the knowledge of the asymptotic expansion of the propagator for large x, which is given by 91 1 x4 1 x4 x6 (x4 )2 x→∞ 1 1 − 2 + 2 2 3 − 4 2 2 + 16 2 4 − 48 2 5 + 40 2 6 + · · · : (19.13) G(x) → 42 x2 x (x ) (x ) (x ) (x ) (x ) Here and in the following the notation: xn =
3
(x )n
(19.14)
=0
is used. We have then 1 1 G (x) → 2 3 (4 ) (x2 )3 3
x→∞
3 x4 1 x4 x6 (x4 )2 1 − 2 + 6 2 3 − 9 2 2 + 36 2 4 − 144 2 5 + 132 2 6 x (x ) (x ) (x ) (x ) (x )
+ O(|x|−12 )
1 1 3 3 24 = h0 (x) + h1 (x) + + (42 )3 (x2 )3 10(x2 )5 (x2 )6 7(x2 )7 3 33 − h2 (x) + h3 (x) + O(|x|−12 ) ; 4(x2 )8 140(x2 )9
(19.15)
(19.16)
where in the last line we have rewritten $G^3(x)$ in terms of the homogeneous harmonic polynomials
\[ h_0(x) = 1 , \tag{19.17} \]
\[ h_1(x) = 2x^4 - (x^2)^2 , \tag{19.18} \]
\[ h_2(x) = 16x^6 - 20x^2x^4 + 5(x^2)^3 , \tag{19.19} \]
\[ h_3(x) = 560(x^4)^2 - 560x^2x^6 + 60(x^2)^2x^4 - 9(x^2)^4 . \tag{19.20} \]
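The rewriting of Eq. (19.15) as Eq. (19.16) and the harmonicity of $h_1$, $h_2$, $h_3$ can be checked mechanically. The following is a minimal sketch in exact rational arithmetic, not part of the original calculation: the helper names (`power_sum`, `combine`, `laplacian`, and so on) are illustrative, polynomials are stored as dictionaries from exponent tuples to coefficients, and the overall factor $1/(4\pi^2)^3$ and the $O(|x|^{-12})$ remainder are dropped on both sides.

```python
from fractions import Fraction
from itertools import product

DIM = 4  # four-dimensional hypercubic lattice


def power_sum(n):
    """Polynomial x^n := sum_mu (x_mu)^n, stored as {exponents: coefficient}."""
    poly = {}
    for mu in range(DIM):
        exps = [0] * DIM
        exps[mu] = n
        poly[tuple(exps)] = Fraction(1)
    return poly


def combine(*terms):
    """Linear combination c1*p1 + c2*p2 + ... given as (c, p) pairs."""
    out = {}
    for coeff, poly in terms:
        for exps, c in poly.items():
            out[exps] = out.get(exps, Fraction(0)) + coeff * c
    return {e: c for e, c in out.items() if c != 0}


def multiply(p, q):
    """Product of two polynomials."""
    out = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        exps = tuple(i + j for i, j in zip(e1, e2))
        out[exps] = out.get(exps, Fraction(0)) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def laplacian(p):
    """Continuum Laplacian sum_mu d^2/dx_mu^2 of a polynomial."""
    out = {}
    for exps, c in p.items():
        for mu in range(DIM):
            n = exps[mu]
            if n >= 2:
                lowered = list(exps)
                lowered[mu] = n - 2
                key = tuple(lowered)
                out[key] = out.get(key, Fraction(0)) + c * n * (n - 1)
    return {e: c for e, c in out.items() if c != 0}


def evaluate(p, x):
    """Value of the polynomial at the integer lattice point x."""
    total = Fraction(0)
    for exps, c in p.items():
        term = c
        for xi, e in zip(x, exps):
            term *= Fraction(xi) ** e
        total += term
    return total


s, a, b = power_sum(2), power_sum(4), power_sum(6)  # x^2, x^4, x^6
s2 = multiply(s, s)
h0 = {(0, 0, 0, 0): Fraction(1)}
h1 = combine((2, a), (-1, s2))
h2 = combine((16, b), (-20, multiply(s, a)), (5, multiply(s2, s)))
h3 = combine((560, multiply(a, a)), (-560, multiply(s, b)),
             (60, multiply(s2, a)), (-9, multiply(s2, s2)))


def lhs_19_15(x):
    """Braces of Eq. (19.15) divided by (x^2)^3, without 1/(4 pi^2)^3."""
    S, A, B = evaluate(s, x), evaluate(a, x), evaluate(b, x)
    series = (1 - 3 / S + 6 * A / S**3 - 9 / S**2 + 36 * A / S**4
              - 144 * B / S**5 + 132 * A**2 / S**6)
    return series / S**3


def rhs_19_16(x):
    """Braces of Eq. (19.16), without 1/(4 pi^2)^3."""
    S = evaluate(s, x)
    H0, H1, H2, H3 = (evaluate(h, x) for h in (h0, h1, h2, h3))
    return ((1 / S**3 + Fraction(3, 10) / S**5) * H0
            + (3 / S**6 + Fraction(24, 7) / S**7) * H1
            - Fraction(3, 4) / S**8 * H2
            + Fraction(33, 140) / S**9 * H3)


if __name__ == "__main__":
    print([laplacian(h) == {} for h in (h1, h2, h3)])
    print(lhs_19_15((1, 2, 3, 4)) == rhs_19_16((1, 2, 3, 4)))
```

Because the arithmetic is exact, the two sides must agree identically at any nonzero lattice point, which makes a transcription error in one of the rational coefficients easy to spot.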
It is useful at this point to define a generalized zeta function
\[
Z(s,h) = \sum_{x:\,x\neq 0} h(x)\,(x^2)^{-s} , \tag{19.21}
\]
91 For the derivation of this and similar asymptotic expansions, see Lüscher and Weisz (1995b).
where the site $x = 0$ is not to be included in the sum. These zeta functions can be calculated with the necessary accuracy. To this end one introduces the heat kernel
\[
k(t,h) = \sum_{x} h(x)\, e^{-\pi t x^2} , \tag{19.22}
\]
so that the generalized zeta function can be reexpressed as
\[
Z(s,h) = \frac{\pi^s}{\Gamma(s)} \int_0^\infty dt\, t^{s-1}\, [k(t,h) - h(0)] \tag{19.23}
\]
\[
\phantom{Z(s,h)} = \frac{\pi^s}{\Gamma(s)} \left\{ \frac{2h(0)}{s(s-2)} + \int_1^\infty dt\, \left[ t^{s-1} + (-1)^{d/2}\, t^{d-s+1} \right] [k(t,h) - h(0)] \right\} , \tag{19.24}
\]
where in the last line the integration has been split into two parts, and the formula
\[
k(t,h) = (-1)^{d/2}\, t^{-d-2}\, k(1/t,h) , \tag{19.25}
\]
with $d$ the degree of the homogeneous polynomial $h$, has been used (footnote 92).
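The inversion formula (19.25) is what allows the integral over $0 < t < 1$ to be folded onto $t > 1$, and it can be checked numerically. The sketch below is an illustration under stated assumptions rather than the method used in the original work: the heat kernel is taken as the sum over all lattice sites of $h(x)\,e^{-\pi t x^2}$, and the sum is truncated to $|x_\mu| \le 6$ (the `cutoff` parameter is an illustrative choice, adequate for $t \ge 0.5$). The function names are hypothetical.

```python
import math
from itertools import product


def h0(x):
    return 1.0


def h1(x):
    x2 = sum(c**2 for c in x)
    x4 = sum(c**4 for c in x)
    return 2.0 * x4 - x2**2


def h2(x):
    x2 = sum(c**2 for c in x)
    x4 = sum(c**4 for c in x)
    x6 = sum(c**6 for c in x)
    return 16.0 * x6 - 20.0 * x2 * x4 + 5.0 * x2**3


def heat_kernel(t, h, cutoff=6):
    """k(t,h): sum over sites with |x_mu| <= cutoff of h(x) exp(-pi t x^2)."""
    total = 0.0
    for x in product(range(-cutoff, cutoff + 1), repeat=4):
        x2 = sum(c * c for c in x)
        total += h(x) * math.exp(-math.pi * t * x2)
    return total


def inversion_residual(t, h, degree):
    """Relative difference between the two sides of Eq. (19.25)."""
    lhs = heat_kernel(t, h)
    rhs = (-1) ** (degree // 2) * t ** (-degree - 2) * heat_kernel(1.0 / t, h)
    return abs(lhs - rhs) / abs(lhs)


if __name__ == "__main__":
    for name, h, degree in (("h0", h0, 0), ("h1", h1, 4), ("h2", h2, 6)):
        print(name, inversion_residual(0.5, h, degree))
```

The residuals should be at the level of floating-point rounding. Note the sign $(-1)^{d/2}$, which is negative for $h_2$ (degree 6).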
We have then
\[
\sum_x G^3(x) = \sum_{x:\,x\neq 0} \left[ G^3(x) - G^3_{\rm as}(x) \right] + \frac{1}{(4\pi^2)^3} \left[ Z(3,h_0) + \frac{3}{10}\, Z(5,h_0) + 3\, Z(6,h_1) + \frac{24}{7}\, Z(7,h_1) - \frac{3}{4}\, Z(8,h_2) + \frac{33}{140}\, Z(9,h_3) \right] + G^3(0) , \tag{19.27}
\]
where $G^3_{\rm as}(x)$ denotes the asymptotic form (19.16) without the remainder. We can see that once the values of the zeta function are known, it only remains to compute $\sum_{x:\,x\neq 0} [G^3(x) - G^3_{\rm as}(x)]$, which is rapidly convergent since each term goes at least like $|x|^{-12}$. This sum can then be evaluated with a reasonable approximation using domains which are not too large, and in this way one can obtain
\[
I_1 = 0.0040430548122(3) . \tag{19.28}
\]
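The coordinate space method can be used to compute more complicated integrals in which nontrivial numerators are present, like
\[
\begin{aligned}
I_2 &= \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2}{\hat k^2\, \hat q^2\, \widehat{(k+q)}^2} \\
&= \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, (2\pi)^4 \delta^{(4)}(k+q+r)\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2}{\hat k^2\, \hat q^2\, \hat r^2} \\
&= \sum_x \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, e^{ikx} e^{iqx} e^{irx}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2}{\hat k^2\, \hat q^2\, \hat r^2} \\
&= \sum_x \sum_{\mu=0}^{3} \left( \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, e^{ikx}\, \frac{\hat k_\mu^2}{\hat k^2} \right)^{2} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, e^{irx}\, \frac{1}{\hat r^2} \\
&= \sum_x \sum_{\mu=0}^{3} \left( -\nabla^*_\mu \nabla_\mu G(x) \right)^2 G(x) \\
&= 0.0423063684(1) .
\end{aligned}
\tag{19.29}
\]
92 This formula derives from the fact that $h(x)$ is harmonic and from the Poisson summation formula
\[
\sum_x e^{-iqx}\, e^{-\pi t x^2} = t^{-2} \sum_x e^{-(q+2\pi x)^2/(4\pi t)} . \tag{19.26}
\]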
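Here the derivatives of the propagator in coordinate space generate the factors $\hat k_\mu^2$ and $\hat q_\mu^2$ in the numerators of the momentum-space integrals. Similarly,
\[
\begin{aligned}
I_3 &= \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2\, \widehat{(k+q)}_\mu^2}{\hat k^2\, \hat q^2\, \widehat{(k+q)}^2} \\
&= \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, (2\pi)^4 \delta^{(4)}(k+q+r)\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2\, \hat r_\mu^2}{\hat k^2\, \hat q^2\, \hat r^2} \\
&= \sum_x \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, e^{ikx} e^{iqx} e^{irx}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2\, \hat r_\mu^2}{\hat k^2\, \hat q^2\, \hat r^2} \\
&= \sum_x \sum_{\mu=0}^{3} \left( \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, e^{ikx}\, \frac{\hat k_\mu^2}{\hat k^2} \right)^{3} \\
&= \sum_x \sum_{\mu=0}^{3} \left( -\nabla^*_\mu \nabla_\mu G(x) \right)^3 \\
&= 0.054623978180(1) .
\end{aligned}
\tag{19.30}
\]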
In general, integrals where the numerator is a polynomial in sines and cosines can be computed using these techniques. However, in more complicated cases, where for example the denominator has a higher power than in the integrals discussed so far, it is necessary to introduce auxiliary functions. One of these cases is given by the calculation of
\[
I_4 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^2\, \hat q_\mu^2}{(\hat k^2)^2\, \hat q^2\, \widehat{(k+q)}^2} . \tag{19.31}
\]
The problem here is that the factor $1/(\hat k^2)^2$ cannot be related in a simple way to $G(x)$. In this case, the auxiliary function
\[
K(x) = \int_{-\pi}^{\pi} \frac{d^4p}{(2\pi)^4}\, \frac{e^{ipx} - 1}{(\hat p^2)^2} \tag{19.32}
\]
is just what we need. In fact,
\[
\nabla^*_\mu \nabla_\mu K(x) = - \int_{-\pi}^{\pi} \frac{d^4p}{(2\pi)^4}\, e^{ipx}\, \frac{\hat p_\mu^2}{(\hat p^2)^2} , \tag{19.33}
\]
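so that
\[
I_4 = \sum_x \sum_{\mu=0}^{3} \left( -\nabla^*_\mu \nabla_\mu K(x) \right) \left( -\nabla^*_\mu \nabla_\mu G(x) \right) G(x) = 0.006603075727(1) . \tag{19.34}
\]
The function $K(x)$ is related to $G(x)$ by
\[ -\Delta K(x) = G(x) , \tag{19.35} \]
\[ (\nabla^*_\mu + \nabla_\mu) K(x) = -x_\mu\, G(x) , \tag{19.36} \]
so that it can be recursively computed in terms of $G(x)$ and of the values of $K(x)$ at the corners of the unit hypercube. Finally, the integral
\[
I_5 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{\sum_{\mu=0}^{3} \hat k_\mu^4}{(\hat k^2)^3\, \hat q^2\, \widehat{(k+q)}^2} \tag{19.37}
\]
can be computed introducing the auxiliary function
\[
L(x) = \int_{-\pi}^{\pi} \frac{d^4p}{(2\pi)^4}\, \frac{e^{ipx} - 1 - i \sum_\mu x_\mu \sin p_\mu + \frac{1}{2} \left( \sum_\mu x_\mu \sin p_\mu \right)^2}{(\hat p^2)^3} , \tag{19.38}
\]
which has the properties
\[ -\Delta L(x) = K(x) + \frac{1}{8}\, G(0) - \frac{1}{32\pi^2} , \tag{19.39} \]
\[ (\nabla^*_\mu + \nabla_\mu) L(x) = -\frac{1}{2}\, x_\mu \left[ K(x) + \frac{1}{8}\, G(0) \right] , \tag{19.40} \]
and thus can be computed recursively. The integral is then given by
\[
I_5 = \sum_x \sum_{\mu=0}^{3} \left( \nabla^*_\mu \nabla_\mu \nabla^*_\mu \nabla_\mu L(x) \right) G^2(x) = 0.00173459425(1) . \tag{19.41}
\]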
We now turn to 2-loop integrals which also depend on an external momentum. Of course these integrals can also be computed using the decomposition explained in Section 15.2 and the theorem of Reisz, thanks to which only zero-momentum integrals have in the end to be computed on the lattice. However, it is also possible to compute these 2-loop integrals using the coordinate space methods directly, as we are going to show. To understand how external momenta are incorporated into the method, let us first discuss the simpler case of a 1-loop integral depending on an external momentum $p$. The logarithmically divergent integral
\[
I_6 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, \frac{1}{\hat k^2\, \widehat{(k-p)}^2} \tag{19.42}
\]
can be rewritten in position space as follows:
\[
\begin{aligned}
I_6 &= \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, (2\pi)^4 \delta^{(4)}(q+k-p)\, \frac{1}{\hat k^2\, \hat q^2} \\
&= \sum_x e^{-ipx} \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4}\, e^{ikx}\, \frac{1}{\hat k^2} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, e^{iqx}\, \frac{1}{\hat q^2} \\
&= \sum_x e^{-ipx}\, G^2(x) .
\end{aligned}
\tag{19.43}
\]
The integral can then be computed by evaluating the sums
\[
I_6 = \lim_{\epsilon\to 0} \sum_x e^{-\epsilon x^2}\, e^{-ipx}\, G^2(x) . \tag{19.44}
\]
It is now useful to employ the function $H(x)$ which we have introduced in Eq. (19.4) when deriving the recursion relations for $G(x)$ in coordinate space:
\[
H(x) = \int_{-\pi}^{\pi} \frac{d^4p}{(2\pi)^4}\, e^{ipx} \log \hat p^2 . \tag{19.45}
\]
Its asymptotic expansion for large $x$ is
\[
H(x) \xrightarrow{x\to\infty} -\frac{1}{\pi^2 (x^2)^2} \left\{ 1 - \frac{4}{x^2} + 8\,\frac{x^4}{(x^2)^3} - 7\,\frac{1}{(x^2)^2} + 40\,\frac{x^4}{(x^2)^4} - 288\,\frac{x^6}{(x^2)^5} + 280\,\frac{(x^4)^2}{(x^2)^6} + \cdots \right\} , \tag{19.46}
\]
and thus its leading order term has (up to a constant) the same asymptotic behavior as $G^2(x)$. We now subtract and add to the original integral an appropriate expression containing $H(x)$:
\[
I_6 = -\frac{1}{16\pi^2} \lim_{\epsilon\to 0} \sum_x e^{-\epsilon x^2}\, e^{-ipx}\, H(x) + \sum_x e^{-ipx} \left[ G^2(x) + \frac{1}{16\pi^2}\, H(x) \right] . \tag{19.47}
\]
The first part is just the Fourier transform of $H(x)$, which can be read off from its definition above, and is thus given by
\[
-\frac{1}{16\pi^2} \lim_{\epsilon\to 0} \sum_x e^{-\epsilon x^2}\, e^{-ipx}\, H(x) = -\frac{1}{16\pi^2} \log \hat p^2 = -\frac{1}{16\pi^2} \log p^2 + O(p^2) . \tag{19.48}
\]
The finite constant is obtained from the second part, in which the leading $1/|x|^4$ terms of $G^2(x)$ and $H(x)$ cancel in the subtraction, so that this part goes like the Fourier transform of $1/|x|^6$ (and is therefore finite). In the limit $p \to 0$ we have then
\[
I_6 = -\frac{1}{16\pi^2} \log p^2 + \sum_x \left[ G^2(x) + \frac{1}{16\pi^2}\, H(x) \right] . \tag{19.49}
\]
The sum can be computed using generalized zeta functions as explained before. The result is
\[
I_6 = -\frac{1}{16\pi^2} \log p^2 + \frac{1}{8\pi^2} + 0.02401318111946489(1) . \tag{19.50}
\]
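The asymptotic expansions (19.13) and (19.46) are not independent. Differentiating $\log \hat p^2$ with respect to $p_\mu$ in the definition (19.45) gives the exact identity $x_\mu H(x) = G(x+\hat\mu) - G(x-\hat\mu)$, so the two series must agree order by order. The sketch below checks this with exact rational arithmetic. It is only a consistency test of the quoted coefficients: the factors of $\pi^2$ are divided out by hand, the function names are illustrative, and the test point is an arbitrary choice far from the origin, where the neglected third-order terms are much smaller than $1/(x^2)^2$.

```python
from fractions import Fraction


def power_sums(x):
    """Return x^2, x^4, x^6 in the notation of Eq. (19.14)."""
    return (Fraction(sum(c**2 for c in x)),
            Fraction(sum(c**4 for c in x)),
            Fraction(sum(c**6 for c in x)))


def g_times_4pi2(x):
    """4 pi^2 times the truncated expansion (19.13) of G(x)."""
    s, a, b = power_sums(x)
    series = (1 - 1 / s + 2 * a / s**3 - 4 / s**2 + 16 * a / s**4
              - 48 * b / s**5 + 40 * a**2 / s**6)
    return series / s


def h_times_pi2(x):
    """pi^2 times the truncated expansion (19.46) of H(x)."""
    s, a, b = power_sums(x)
    series = (1 - 4 / s + 8 * a / s**3 - 7 / s**2 + 40 * a / s**4
              - 288 * b / s**5 + 280 * a**2 / s**6)
    return -series / s**2


def relative_mismatch(x, mu):
    """|x_mu H(x) - (G(x+mu) - G(x-mu))| / |x_mu H(x)| for the truncated series."""
    up, down = list(x), list(x)
    up[mu] += 1
    down[mu] -= 1
    lhs = x[mu] * h_times_pi2(x)                               # pi^2 x_mu H(x)
    rhs = (g_times_4pi2(up) - g_times_4pi2(down)) / 4          # pi^2 [G(x+mu) - G(x-mu)]
    return abs(lhs - rhs) / abs(lhs)


if __name__ == "__main__":
    point = (40000, 30000, 20000, 10000)
    s = power_sums(point)[0]
    for mu in range(4):
        # Scaled by (x^2)^2: a wrong second-order coefficient would give O(1) here.
        print(mu, float(relative_mismatch(point, mu) * s**2))
```

If one of the second-order coefficients in either series were mistyped, the scaled mismatch would be of order one instead of being suppressed by a further power of $1/x^2$.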
Let us now illustrate this powerful method for a 2-loop integral with an external momentum $p$. We consider
\[
I_7 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4}\, \frac{1}{\hat k^2\, \hat q^2\, \widehat{(k+q-p)}^2} , \tag{19.51}
\]
which by dimensional arguments will give a result of the form
\[
I_7 = c_1 + c_2\, p^2 \log p^2 + c_3\, p^2 + O(p^4) . \tag{19.52}
\]
We would like to determine the coefficients $c_2$ and $c_3$. We already know the constant $c_1$, which can be computed setting $p = 0$ and is given by $I_1$, Eq. (19.11). This integral can be written in position space as
\[
I_7 = \int_{-\pi}^{\pi} \frac{d^4k}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4q}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4s}{(2\pi)^4}\, (2\pi)^4 \delta^{(4)}(s+q+k-p)\, \frac{1}{\hat k^2\, \hat q^2\, \hat s^2} = \sum_x e^{-ipx}\, G^3(x) . \tag{19.53}
\]
A function that has the same leading asymptotic behavior (up to a constant) as $G^3(x)$, and for which the Fourier transform is easily calculable in an exact way, is the four-dimensional Laplacian of $H(x)$. We thus make the decomposition
\[
I_7 = -\frac{1}{2(4\pi)^4} \sum_x e^{-ipx}\, \Delta H(x) + \sum_x e^{-ipx} \left[ G^3(x) + \frac{1}{2(4\pi)^4}\, \Delta H(x) \right] . \tag{19.54}
\]
The first part is the Fourier transform of $\Delta H(x)$, which can be easily computed and gives
\[
-\frac{1}{2(4\pi)^4} \sum_x e^{-ipx}\, \Delta H(x) = -\frac{1}{2(4\pi)^4}\, \hat p^2 \log \hat p^2 = -\frac{1}{2(4\pi)^4}\, p^2 \log p^2 + O(p^4) . \tag{19.55}
\]
We have thus computed the coefficient $c_2$. The coefficient $c_3$ can be obtained from the second part, in which the leading $1/|x|^6$ terms of $G^3(x)$ and $\Delta H(x)$ cancel in the subtraction, which then goes like $1/|x|^8$. This fact is important, because to compute $c_3$ we have to expand the exponential to order $p^2$, and then the second part becomes
\[
-\frac{1}{8}\, p^2 \sum_x x^2 \left[ G^3(x) + \frac{1}{2(4\pi)^4}\, \Delta H(x) \right] , \tag{19.56}
\]
and only in this way does the function to be summed go like $1/|x|^6$ again, so that the sum is finite and can be computed using generalized zeta functions. Putting everything together we have the result
\[
\begin{aligned}
I_7 &= I_1 - \frac{1}{2(4\pi)^4}\, p^2 \log p^2 - \frac{1}{8}\, p^2 \sum_x x^2 \left[ G^3(x) + \frac{1}{2(4\pi)^4}\, \Delta H(x) \right] + O(p^4) \\
&= 0.0040430548122(3) - \frac{1}{2(4\pi)^4}\, p^2 \log p^2 - p^2 \cdot 0.00007447695(1) + O(p^4) .
\end{aligned}
\tag{19.57}
\]
A more complicated example can be found in Lüscher and Weisz (1995b), to which we also refer for further details on the method. These and other integrals have been used for the calculation of the coefficient $b_2$ of the $\beta$ function in the pure gauge Wilson theory (Lüscher and Weisz, 1995a, d).
The complete 2-loop calculation of this coefficient requires a combination of momentum space and coordinate space methods.
To summarize, coordinate space is of great help for the calculation of important 2-loop momentum space integrals. One takes advantage of the fact that only four-dimensional lattice sums must be performed, instead of eight-dimensional ones, and one can also exploit the asymptotic expansion of the gluon propagator $G(x)$ for large values of $x$ to improve the convergence.

19.2.2. Fermionic case

We now briefly discuss the computations of 2-loop lattice diagrams with Wilson fermions based on the coordinate space method by Lüscher and Weisz, which have been presented in (Capitani et al., 1998a; Caracciolo et al., 2001). An essential ingredient of these 2-loop calculations is the high-precision determination of 1-loop mixed fermionic-bosonic propagators (Section 18.5). The algebraic method for general Wilson fermions, thanks to which any 1-loop lattice integral can be written as a combination of fifteen basic constants, known with arbitrarily high precision, allows the implementation of the coordinate space method for 2-loop integrals. We recall that the algebraic method depends only on the structure of the Wilson propagator, and not on the vertices, and thus it can be applied in calculations with the Wilson action as well as with the improved clover action. Let us consider a very simple integral
\[
I = \int_{-\pi}^{\pi} \frac{d^4l}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, \frac{1}{D_F(l)\, D_F(r)\, D_F(l+r)} , \tag{19.58}
\]
where $D_F$ is the denominator of the quark propagator. The standard alternative (in momentum space) would consist in replacing each integration with a discrete sum over $L$ points and then extrapolating to infinite $L$:
\[
I = \frac{1}{L^8} \sum_{l,\,r;\; l+r \neq 0} \frac{1}{D_F(l)\, D_F(r)\, D_F(l+r)} . \tag{19.59}
\]
Here $l$ and $r$ run over the set $(n + 1/2) \cdot 2\pi/L$, $n = 0, \ldots, L-1$, and one excludes from the sum the points $l + r = 0 \bmod 2\pi$ (where the third propagator would diverge). Increasing the values of $L$ one gets the following approximations for the integral $I$:
L = 10:  0.000799652 ,  (19.60)
L = 18:  0.000848862 ,  (19.61)
L = 20:  0.000853822 ,  (19.62)
L = 26:  0.000863064 .  (19.63)
Then, using an extrapolation function of the form
\[
a_0 + \frac{a_1 \log L + a_2}{L^2} + \frac{a_3 \log L + a_4}{L^4} \tag{19.64}
\]
and fitting with it the results of the sum for $6 \le L \le 26$, one gets the estimate $I \approx 0.000879776$. Note that the computation at $L = 26$ requires the evaluation of the function on an integration grid of
$26^8 \sim 2 \times 10^{11}$ points. With the 2-loop techniques based on the coordinate space method one can instead obtain without much effort $I \approx 0.0008797779181(12)$. In order to compute 2-loop momentum-space integrals with high precision, we must first rewrite them in coordinate space along the lines shown by Lüscher and Weisz, trading an additional $x$ integration for the $\delta^{(4)}(l+r+s)$ which expresses the vanishing of external momenta (Capitani et al., 1998a). For example one can write
\[
\begin{aligned}
I &= \int_{-\pi}^{\pi} \frac{d^4l}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4}\, \frac{1}{D_F^{p_1}(l)\, D_B^{q_1}(l)}\, \frac{1}{D_F^{p_2}(r)\, D_B^{q_2}(r)}\, \frac{1}{D_F^{p_3}(l+r)\, D_B^{q_3}(l+r)} \\
&= \int_{-\pi}^{\pi} \frac{d^4l}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4r}{(2\pi)^4} \int_{-\pi}^{\pi} \frac{d^4s}{(2\pi)^4}\, \frac{1}{D_F^{p_1}(l)\, D_B^{q_1}(l)}\, \frac{1}{D_F^{p_2}(r)\, D_B^{q_2}(r)}\, \frac{1}{D_F^{p_3}(s)\, D_B^{q_3}(s)} \times (2\pi)^4 \delta^{(4)}(l+r+s) \\
&= \sum_x G_F(p_1,q_1;x)\, G_F(p_2,q_2;x)\, G_F(p_3,q_3;x) .
\end{aligned}
\tag{19.65}
\]
To compute this simple integral we have to evaluate in coordinate space the lattice sums in the last line. The price to pay is that all necessary $G_F(p_j,q_j;x)$ integrals must be computed with huge precision. This does not constitute a particular challenge. These $G_F(p_j,q_j;x)$ integrals can be determined with the desired precision (for a sufficiently large domain of values of $x$) by using the 1-loop algebraic algorithm, and one can exploit their asymptotic expansions for large values of $x$. After the subtraction of the asymptotic behavior, the sums converge much better. The task is then to evaluate with enough accuracy sums of the kind
\[
\Sigma = \sum_{x \in \Lambda} f(x) , \tag{19.66}
\]
where $\Lambda$ should in principle be the whole lattice, using only a finite lattice domain that is not too big, and to be able to estimate the error. The sums are restricted to domains defined as
\[
D_p = \{ x \in \Lambda : |x|_1 \le p \} , \tag{19.67}
\]
where $|x|_1 = \sum_\mu |x_\mu|$. In Capitani et al. (1998a) it was found that the domain $D_{21}$ is a reasonable choice. For the computation of the generic Feynman diagram one then chooses some convenient representation of 2-loop integrals, and assembles a database of them. To do this, the 1-loop integrals necessary for their computation can be recursively decomposed in terms of the fifteen basic 1-loop constants and then evaluated with very high precision. Recently with this method the 2-loop critical mass $m_c$, defined by the vanishing of the inverse renormalized propagator $S^{-1}(p,m_0)$ at $p = 0$,
\[
S^{-1}(0,m_c) = 0 , \tag{19.68}
\]
has been computed in the Wilson case. Using one-dimensional integrals calculated with high accuracy (about 60 significant decimal places), the 2-loop diagrams relevant for the critical mass have been obtained with a precision of about ten significant decimal places. The result for $SU(N_c)$ with $N_f$ fermion flavors can be put in the form
\[
m_c = g_0^2\, \frac{N_c^2 - 1}{N_c}\, c_1 + g_0^4\, (N_c^2 - 1) \left[ c_{2,1} + \frac{1}{N_c^2}\, c_{2,2} + \frac{N_f}{N_c}\, c_{2,3} \right] , \tag{19.69}
\]
where
\[ c_1 = -0.16285705871085078618 \tag{19.70} \]
is the 1-loop result, and
\[ c_{2,1} = -0.0175360218(2) , \tag{19.71} \]
\[ c_{2,2} = 0.0165663304(2) , \tag{19.72} \]
\[ c_{2,3} = 0.001186203(6) \tag{19.73} \]
are the results of the 2-loop computations made with the coordinate space method. These 2-loop numbers are in agreement with the less precise results given in Follana and Panagopoulos (2001), obtained using conventional momentum-space methods, which have also produced the value of the critical mass for improved fermions (Panagopoulos and Proestos, 2002).

20. Numerical perturbation theory

We want to conclude this review by mentioning some numerical methods that are completely different from all that we have presented so far, but represent interesting alternative ideas for the computation of perturbative expansions, and in principle can tackle high-loop calculations, at least in some cases. In the last decade a numerical approach to perturbation theory, in which one extracts perturbative coefficients using Monte Carlo simulations instead of calculating Feynman diagrams analytically, has emerged and has produced a few interesting results. One of these techniques is given by the so-called numerical stochastic perturbation theory (Di Renzo et al., 1994, 1995), which is based on the stochastic quantization by Parisi and Wu (1981). It uses numerical simulations of the Langevin equation. An additional parameter, the stochastic time $\tau$, is introduced in the theory, and the gauge field is taken to be a random variable, $U_\mu(x;\tau)$, which evolves according to the Langevin equation:
\[
\frac{\partial}{\partial\tau}\, U_\mu(x;\tau) = -\frac{\delta S[U]}{\delta U} + \eta(x;\tau) . \tag{20.1}
\]
The last term, $\eta(x;\tau)$, is a Gaussian noise matrix, i.e., its expectation value is
\[
\langle \eta(x;\tau)\, \eta(x';\tau') \rangle = 2\, \delta_{x,x'}\, \delta_{\tau,\tau'} , \tag{20.2}
\]
and is the stochastic part of the equation. This approach lends itself quite naturally to lattice investigations. One has to discretize the stochastic time $\tau$, introducing a nonzero Langevin time step $\epsilon$. This causes an $O(\epsilon)$ systematic error, but for $\epsilon \to 0$ and $\tau \to \infty$ the time average is expected to reach asymptotically the expectation values corresponding to a path integral with action $S[U]$. At the end of the lattice calculations one has then to make extrapolations to the limit $\epsilon = 0$. The numerical solution to the Langevin equation consists in updating the gauge field according to Batrouni et al. (1985),
\[
U_\mu(x;\tau+\epsilon) = e^{-F[U(\tau),\eta]}\, U_\mu(x;\tau) , \tag{20.3}
\]
where the driving function is
\[
F[U,\eta] = \sum_i T^i \left( \epsilon\, \nabla^i_{x,\mu} S[U] + \sqrt{\epsilon}\, \eta^i \right) , \tag{20.4}
\]
$\nabla$ being the Lie derivative on the group. For the Wilson plaquette action one has
\[
\sum_i T^i\, \nabla^i_{x,\mu} S_G[U] = \frac{\beta}{4N_c} \sum_{U_P \ni U_\mu(x)} \left( U_P - U_P^\dagger \right)_{\rm Tr} , \tag{20.5}
\]
where Tr stands for the traceless part. Using stochastic perturbation theory in simulations involving only gluons, it has been possible to reach much higher orders in $g_0^2$ than in conventional perturbative calculations, where 2 loops have been reached only in a few cases. Simulations done using stochastic perturbation theory have instead reached something like the 10-loop order in the case of the plaquette (Di Renzo et al., 1995). Where results are available for stochastic and conventional perturbation theory, they agree within errors. The inclusion of fermions in stochastic perturbation theory has been accomplished only recently (Di Renzo and Scorzato, 2001; Di Renzo et al., 2002). Although passing from quenched to unquenched calculations requires little computational overhead, one needs to use a fast Fourier transform. To perform the field updating in the unquenched theory one has to include a more complicated derivative term in the driving function,
\[
\nabla S_G \to \nabla S_G - \nabla({\rm Tr} \log M) = \nabla S_G - {\rm Tr}\left( (\nabla M)\, M^{-1} \right) , \tag{20.6}
\]
where $M$ is the fermionic matrix. We see that the Lie derivative generates an inverse fermionic matrix $M^{-1}$, which is nonlocal and quite expensive to compute numerically. An ingenious way to simulate this term was proposed long ago (Batrouni et al., 1985), and consists in taking instead
\[
\nabla S_G - {\rm Re}\left( \xi^\dagger (\nabla M)\, M^{-1} \xi \right) , \tag{20.7}
\]
where $\xi$ is a Gaussian random variable: $\langle \xi_i \xi_j \rangle = \delta_{ij}$, so that after averaging over this new variable one recovers the fermionic term
\[
\langle \xi^\dagger (\nabla M)\, M^{-1} \xi \rangle = {\rm Tr}\left( (\nabla M)\, M^{-1} \right) . \tag{20.8}
\]
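The averaging step in Eq. (20.8) is an instance of stochastic trace estimation. The following is a minimal sketch, assuming real Gaussian noise vectors with unit variance and a small fixed matrix `A` that merely stands in for $(\nabla M) M^{-1}$. The matrix entries, sample size, seed, and function name are illustrative choices and do not come from the original simulations.

```python
import random


def noisy_trace(matrix, n_samples, rng):
    """Average of xi^T A xi over Gaussian vectors with <xi_i xi_j> = delta_ij."""
    n = len(matrix)
    total = 0.0
    for _ in range(n_samples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += sum(xi[i] * matrix[i][j] * xi[j]
                     for i in range(n) for j in range(n))
    return total / n_samples


# Toy stand-in for (nabla M) M^{-1}; any square matrix works.
A = [[2.0, 1.0, 0.0, 0.0],
     [0.0, 3.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [1.0, 0.0, 0.0, 4.0]]

exact_trace = sum(A[i][i] for i in range(len(A)))
estimate = noisy_trace(A, 40000, random.Random(12345))

if __name__ == "__main__":
    print("exact trace:", exact_trace)
    print("noise estimate:", estimate)
```

The off-diagonal entries average to zero because $\langle \xi_i \xi_j \rangle$ vanishes for $i \neq j$, so only the trace survives. The statistical error decreases like the inverse square root of the number of noise vectors.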
This random fermionic term can be computed recursively, expanding the relevant quantities order by order in the coupling constant. In practical terms one generates the random variable $\xi_0$ (which does not depend on the coupling constant and is then of order zero) and then recursively computes the variable $\psi$ in
\[
M \psi = \xi . \tag{20.9}
\]
This is a lot easier than doing the inversion of the Dirac operator, and gives a local evolution, because Eq. (20.7) becomes
\[
\nabla S_G - {\rm Re}\left( \xi^\dagger (\nabla M)\, \psi \right) . \tag{20.10}
\]
To compute $\psi$ order by order one needs the expansions
\[
M = M^{(0)} + \sum_{k>0} \beta^{-k/2}\, M^{(k)} \tag{20.11}
\]
and its inverse
\[
M^{-1} = {M^{(0)}}^{-1} + \sum_{k>0} \beta^{-k/2}\, {M^{-1}}^{(k)} . \tag{20.12}
\]
Note that inverting the zeroth order of the fermionic matrix is trivial: ${M^{-1}}^{(0)} = {M^{(0)}}^{-1}$. The nontrivial orders of $M^{-1}$ can be obtained recursively as follows:
\[
\begin{aligned}
{M^{-1}}^{(1)} &= -{M^{(0)}}^{-1} M^{(1)} {M^{(0)}}^{-1} , \\
{M^{-1}}^{(2)} &= -{M^{(0)}}^{-1} M^{(2)} {M^{(0)}}^{-1} - {M^{(0)}}^{-1} M^{(1)} {M^{-1}}^{(1)} , \\
{M^{-1}}^{(3)} &= -{M^{(0)}}^{-1} M^{(3)} {M^{(0)}}^{-1} - {M^{(0)}}^{-1} M^{(2)} {M^{-1}}^{(1)} - {M^{(0)}}^{-1} M^{(1)} {M^{-1}}^{(2)} ,
\end{aligned}
\tag{20.13}
\]
and so forth. Since ${M^{(0)}}^{-1}$ is diagonal in momentum space, one can perform its computation going to momentum space, back and forth, and for this reason a fast Fourier transform code is needed. At the end one has
\[
\begin{aligned}
\psi^{(0)} &= {M^{(0)}}^{-1} \xi_0 , \\
\psi^{(1)} &= {M^{-1}}^{(1)} \xi_0 = -{M^{(0)}}^{-1} M^{(1)} \psi^{(0)} , \\
\psi^{(2)} &= {M^{-1}}^{(2)} \xi_0 = -{M^{(0)}}^{-1} M^{(2)} \psi^{(0)} - {M^{(0)}}^{-1} M^{(1)} \psi^{(1)} ,
\end{aligned}
\tag{20.14}
\]
and so forth. These are the lowest-order terms in the expansion
\[
\psi^{(i)} = {M^{-1}}^{(i)} \xi_0 , \tag{20.15}
\]
with which one can compute recursively Eq. (20.7), and hence simulate the fermion system using the Langevin equation. Every quantity that one wants to compute with stochastic perturbation theory has to be expanded in powers of
\[
\beta^{-1/2} = \frac{g_0}{\sqrt{6}} . \tag{20.16}
\]
For example,
\[
U_\mu(x;\tau) = 1 + \sum_{k>0} \beta^{-k/2}\, U_\mu^{(k)}(x;\tau) , \qquad A_\mu(x;\tau) = \sum_{k>0} \beta^{-k/2}\, A_\mu^{(k)}(x;\tau) , \tag{20.17}
\]
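where
\[
U_\mu(x;\tau) = \exp\left( A_\mu(x;\tau)/\sqrt{\beta} \right) . \tag{20.18}
\]
Observables are composite operators, and every observable depending on $U$ can be expanded in the coupling constant. For example the first order of the plaquette (Eq. (5.6)) is given by
\[
\begin{aligned}
P^{(1)}_{\mu\nu}(x;\tau) ={}& U_\mu^{(1)}(x;\tau)\, U_\nu^{(0)}(x+a\hat\mu;\tau)\, (U_\mu^\dagger)^{(0)}(x+a\hat\nu;\tau)\, (U_\nu^\dagger)^{(0)}(x;\tau) \\
&+ U_\mu^{(0)}(x;\tau)\, U_\nu^{(1)}(x+a\hat\mu;\tau)\, (U_\mu^\dagger)^{(0)}(x+a\hat\nu;\tau)\, (U_\nu^\dagger)^{(0)}(x;\tau) \\
&+ U_\mu^{(0)}(x;\tau)\, U_\nu^{(0)}(x+a\hat\mu;\tau)\, (U_\mu^\dagger)^{(1)}(x+a\hat\nu;\tau)\, (U_\nu^\dagger)^{(0)}(x;\tau) \\
&+ U_\mu^{(0)}(x;\tau)\, U_\nu^{(0)}(x+a\hat\mu;\tau)\, (U_\mu^\dagger)^{(0)}(x+a\hat\nu;\tau)\, (U_\nu^\dagger)^{(1)}(x;\tau) .
\end{aligned}
\tag{20.19}
\]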
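Both the recursion (20.13) for the inverse and the expansion (20.19) of a product of links amount to arithmetic on power series in $\beta^{-1/2}$ whose coefficients do not commute. The sketch below illustrates this with exact $2 \times 2$ rational matrices. The matrices are arbitrary toy data, not lattice operators, and the helper names (`series_product`, `series_inverse`, and so on) are hypothetical.

```python
from fractions import Fraction

ZERO = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
ONE = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]


def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_neg(a):
    return [[-a[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def series_product(a, b, order):
    """Coefficients of the product of two series, truncated at the given order."""
    out = []
    for n in range(order + 1):
        acc = ZERO
        for j in range(n + 1):
            acc = mat_add(acc, mat_mul(a[j], b[n - j]))
        out.append(acc)
    return out


def series_inverse(m, order):
    """Recursion of Eq. (20.13): inv_n = -m0^{-1} sum_{j=1..n} m_j inv_{n-j}."""
    m0_inv = mat_inv(m[0])
    inv = [m0_inv]
    for n in range(1, order + 1):
        acc = ZERO
        for j in range(1, n + 1):
            acc = mat_add(acc, mat_mul(m[j], inv[n - j]))
        inv.append(mat_neg(mat_mul(m0_inv, acc)))
    return inv


# Toy "fermionic matrix" coefficients M^(0), ..., M^(3).
M = [mat([[2, 1], [0, 1]]), mat([[0, 1], [1, 0]]),
     mat([[1, 0], [2, 1]]), mat([[0, 0], [1, 1]])]
M_inv = series_inverse(M, 3)
check = series_product(M, M_inv, 3)  # should be the identity series

# Toy first-order link coefficients; the zeroth order of each link is 1.
first_orders = [mat([[0, 1], [-1, 0]]), mat([[1, 0], [0, -1]]),
                mat([[0, 2], [2, 0]]), mat([[3, 0], [0, 1]])]
plaquette = [ONE, ZERO]
expected_first_order = ZERO
for u1 in first_orders:
    plaquette = series_product(plaquette, [ONE, u1], 1)
    expected_first_order = mat_add(expected_first_order, u1)

if __name__ == "__main__":
    print(check[0] == ONE, all(check[n] == ZERO for n in (1, 2, 3)))
    print(plaquette[1] == expected_first_order)
```

With unit zeroth-order links, the first-order coefficient of the product of four links is the sum of the four first-order terms, which is the structure of Eq. (20.19).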
So far the calculations which use stochastic perturbation theory have been limited to finite quantities. The computation of the quark currents, of operators measuring structure functions and also of weak operators, which in general have nonvanishing anomalous dimensions, still seems a long way off. There is another way to compute the coefficients of perturbative expansions numerically. It makes use of Monte Carlo simulations at very weak couplings to measure short-distance quantities (Dimm et al., 1995; Trottier et al., 2002). The perturbative coefficients are then extracted by making fits to the results of these numerical simulations. In this method some input from conventional perturbation theory is still required, to define a physical coupling constant. Monte Carlo results are thus fitted to truncated polynomial expansions in the coupling constant. One must be careful in doing this, because if the truncation is too short a poor fit to the Monte Carlo data will come out, while if it is too long then the lowest coefficients, which are the dominant contributions, become poorly constrained. These fits can be improved by constraining some of the parameters by means of the techniques known as "constrained curve fitting" (Lepage et al., 2002). They are especially useful in order to constrain parameters which are poorly determined statistically. Coupling constants and volumes are chosen in such a way that the lattice momenta are perturbative. The computations are done at very small lattice spacings and couplings. For example in Trottier et al. (2002) Wilson fermions have been used with coupling constants ranging from $\alpha_{\rm lat} = 0.008$ to $\alpha_{\rm lat} = 0.053$. The lattice spacing and the volume must satisfy
\[
\frac{1}{q^\star} \ll \frac{aL}{2\pi} \ll \frac{1}{\Lambda_{\rm QCD}} , \tag{20.20}
\]
where $q^\star$ is a typical gluonic momentum scale. This condition also ensures that the density of the discrete momenta is not small. In recent works the lattice spacing spans a wide range of values,
\[
10^{-29} < a\Lambda_{\rm QCD} < 10^{-3} . \tag{20.21}
\]
Unfortunately when doing these simulations one has to take into account the problem of the appearance of zero modes, and these infrared effects are potentially dangerous. Twisted boundary conditions, like the ones in Eq. (17.23), can however eliminate these zero modes, and they are also useful to suppress nonperturbative finite-volume effects. With this method one has studied problems like the mass renormalization, small Wilson loops and the static-quark self-energy. In the case of Wilson loops, their 3-loop coefficients have been extracted and very good agreement has been found with existing conventional perturbative results at 2 loops (Hattori and Kawai, 1981; Di Giacomo and Paffuti, 1982; Curci and Petronzio, 1983; Curci et al., 1984; Hasenfratz et al., 1984; Heller and Karsch, 1985; Bali and Boyle, 2002). Numerical perturbation theory is still in its infancy. Studies have been mostly limited to gluonic quantities, and moreover finite ones, although some progress has been seen recently in the fermionic case. It seems that much work still needs to be done before one can think of reproducing for example the results of Section 15.4 for the renormalization of the first moment of the unpolarized quark distribution, which involves a fermionic operator with a nonzero anomalous dimension.
21. Conclusions

We have discussed in this review many different aspects of the perturbative calculations made with gauge fields and fermions defined on a hypercubic lattice. Much progress has been made in the last decade. Perturbative calculations have been carried out using a variety of actions and in a variety of physical situations, and recently they have been of great help in the study of chiral fermions on the lattice. This long-standing issue has been solved, and the construction of chiral gauge theories on the lattice presents features that are theoretically relevant also for general quantum field theories. We have seen the consequences of the loss of exact Lorentz invariance, and discussed the mixings that derive from this, as well as mixings caused by the breaking of chiral symmetry. In this respect Ginsparg-Wilson fermions represent a great step forward in lattice calculations, and they promise to solve long-standing problems like the calculation from first principles of $\Delta I = 1/2$ weak amplitudes and of the CP-violating parameter $\epsilon'/\epsilon$. On the more technical side, new methods for the calculation of 1- and 2-loop integrals, with at times incredible precision, have been invented. It is now possible to compute 1-loop bosonic integrals with very high precision, and also fermionic integrals with rather good precision. More challenging have been the calculations of 2-loop Feynman diagrams. Here new methods based on the calculation of propagators in position space have been very useful. These coordinate space methods present many advantages for the computation of 2-loop integrals, and it is likely that more developments along these lines will make higher-precision 2-loop calculations much easier to perform.
Acknowledgements

I would like to thank Martin Lüscher for having motivated and encouraged me to write this article, and Karl Jansen for much useful advice during its preparation. I profited from discussions with them and with Rainer Sommer and Oleg Tarasov. I finally thank Karl Jansen, Martin Lüscher and Giancarlo Rossi for useful comments on a first version of this article and for suggesting some improvements and corrections at a later time. For the latter I am also indebted to Philipp Hägler, Andrea Shindler and Anastassios Vladikas.
Appendix A. Notation and conventions

Generally we use Greek letters $\mu, \nu, \lambda, \ldots$ for the four-dimensional Lorentz (or Euclidean) indices, which run from 0 to 3, and Latin letters $i, j, k, \ldots$ for the three-dimensional indices. Latin letters $a, b, c, \ldots$ are used for color indices. Group generators in the fundamental representation of $SU(N_c)$ are denoted by $T$, while in the adjoint representation by $t$. They satisfy
\[
[T^a, T^b] = i f^{abc}\, T^c , \qquad {\rm Tr}(T^a T^b) = \frac{1}{2}\, \delta^{ab} . \tag{A.1}
\]
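The symbol $\partial_\mu$ is used for continuum derivatives, while the forward and backward lattice derivatives are
\[
\nabla_\mu f(x) = \frac{f(x+\hat\mu) - f(x)}{a} , \qquad \nabla^\star_\mu f(x) = \frac{f(x) - f(x-\hat\mu)}{a} . \tag{A.2}
\]
The gauge covariant lattice derivatives are denoted by $\widetilde\nabla_\mu$, $\widetilde\nabla^\star_\mu$ and are given in the main text. The shorthand notation
\[
\widehat{ak}_\mu = \frac{2}{a} \sin \frac{ak_\mu}{2} \tag{A.3}
\]
is also often used. The Euclidean Dirac matrices in the chiral representation are
\[
\gamma_0 = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} , \qquad \gamma_i = \begin{pmatrix} 0 & -i\sigma_i \\ i\sigma_i & 0 \end{pmatrix} , \qquad \gamma_5 = \gamma_0 \gamma_1 \gamma_2 \gamma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \tag{A.4}
\]
where $\sigma_i$ are the Pauli matrices, and 1 is the $2 \times 2$ identity matrix. The chiral projectors are
\[
\frac{1+\gamma_5}{2} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \qquad \frac{1-\gamma_5}{2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} , \tag{A.5}
\]
while the projectors for boundary fields in the Schrödinger functional formalism are
\[
P_\pm = \frac{1 \pm \gamma_0}{2} = \frac{1}{2} \begin{pmatrix} 1 & \mp 1 \\ \mp 1 & 1 \end{pmatrix} . \tag{A.6}
\]
Also,
\[
\sigma_{\mu\nu} = \frac{i}{2}\, [\gamma_\mu, \gamma_\nu] . \tag{A.7}
\]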
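The conventions (A.4) to (A.6) can be verified with a few lines of matrix arithmetic. The following sketch builds the $4 \times 4$ matrices from their $2 \times 2$ blocks and checks the Euclidean Clifford algebra $\{\gamma_\mu, \gamma_\nu\} = 2\delta_{\mu\nu}$, the diagonal form of $\gamma_5$, and the projector property of $P_\pm$. The helper names are illustrative only.

```python
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def scale(c, m):
    return [[c * v for v in row] for row in m]


def block(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


IDENTITY4 = block(I2, Z2, Z2, I2)

# gamma_0 and gamma_i of Eq. (A.4).
GAMMA = [block(Z2, scale(-1, I2), scale(-1, I2), Z2)]
GAMMA += [block(Z2, scale(-1j, s), scale(1j, s), Z2) for s in PAULI]

gamma5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

# Boundary projectors of Eq. (A.6).
P_PLUS = scale(0.5, add(IDENTITY4, GAMMA[0]))
P_MINUS = scale(0.5, add(IDENTITY4, scale(-1, GAMMA[0])))

if __name__ == "__main__":
    clifford_ok = all(
        anticommutator(GAMMA[mu], GAMMA[nu])
        == scale(2 if mu == nu else 0, IDENTITY4)
        for mu in range(4) for nu in range(4)
    )
    print(clifford_ok, gamma5 == block(I2, Z2, Z2, scale(-1, I2)))
```

All entries are small integers or half-integers times $1$ or $i$, so the floating-point comparisons are exact.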
Appendix B. High-precision values of Z0 and Z1

We give here the new results of a high-precision calculation of the fundamental bosonic constants $Z_0$ and $Z_1$, defined in Eqs. (18.20) and (18.21). They have been obtained using the recursion relation Eq. (19.6) starting with the gluon propagator at $x = 396$ (footnote 93). The resulting values, with 396 significant decimal places, are:

Z0 = 0.15493339023106021408483720810737508876916113364521
       98321191752313395351673319454163790491630919236741
       07489754149497376290387736082594941817577598499678
       92951387264251940296570608026229566408322643387967
       84914774223913881583813529174816118783903355821052
       96552782448948240231078335735055832848473775143559
       80401738187671539786446652153505144942596811258480
       8043251280463983474068128158341212164145185669(1) ,   (B.1)

Z1 = 0.10778131353987400134339155028381651483289553031166
       39233465607465024738935201734450177503973077057462
       05621844437484365688328635749227594895147284883092
       15513746596880936011669949517608632177321337226921
       37793898141534158628242723648006344189984806003623
       35938862675314675890326849096822755901010023056671
       38098768018557508588203302625651590185630674198459
       5509105334593113724314740915504203882005031989(1) .   (B.2)

93 This was the maximum value compatible with the computational resources at my disposal.
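The leading digits of $Z_0$ can be cross-checked by brute force. The sketch below assumes that $Z_0$ is the massless lattice propagator at the origin, $Z_0 = \int d^4k/(2\pi)^4\, 1/\hat k^2$ with $\hat k^2 = \sum_\mu 4\sin^2(k_\mu/2)$ (the definition in Eq. (18.20) is not reproduced in this section, so this is an assumption), and replaces the integral by a discrete sum on the shifted momentum grid $k_\mu = (n_\mu + 1/2)\, 2\pi/L$, in the spirit of Eq. (19.59). It is only an illustration of how slowly such sums converge compared with the recursion used above. The function name is hypothetical.

```python
import math
from itertools import product

# Leading digits of Eq. (B.1), used only for comparison.
Z0_REFERENCE = 0.15493339023106021


def z0_momentum_sum(L):
    """(1/L^4) sum over the shifted grid of 1/khat^2, khat^2 = sum_mu 4 sin^2(k_mu/2)."""
    khat2_1d = [4.0 * math.sin(math.pi * (n + 0.5) / L) ** 2 for n in range(L)]
    total = 0.0
    for combo in product(khat2_1d, repeat=4):
        total += 1.0 / sum(combo)
    return total / L**4


if __name__ == "__main__":
    for L in (4, 8, 16):
        approx = z0_momentum_sum(L)
        print(L, approx, approx - Z0_REFERENCE)
```

The half-integer shift keeps the grid away from the singular point $k = 0$. The remaining discretization error decreases only like a power of $1/L$, so reaching even ten digits this way would require an impractically large grid.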
Thus, every 1-loop Feynman diagram in the pure gauge theory can always be given with such numerical accuracy (if the plaquette action is used), since it can be expressed as a linear combination of $Z_0$ and $Z_1$ only.

References

Abbott, L.F., 1981. The background field method beyond one loop. Nucl. Phys. B 185, 189.
Adler, S.L., 1969. Axial vector vertex in spinor electrodynamics. Phys. Rev. 177, 2426.
Adler, S.L., Bardeen, W.A., 1969. Absence of higher order corrections in the anomalous axial vector divergence equation. Phys. Rev. 182, 1517.
Albanese, M., et al., 1987. Glueball masses and string tension in lattice QCD. Phys. Lett. B 192, 163.
Alexandrou, C., Follana, E., Panagopoulos, H., Vicari, E., 2000a. One-loop renormalization of fermionic currents with the overlap-Dirac operator. Nucl. Phys. B 580, 394.
Alexandrou, C., Panagopoulos, H., Vicari, E., 2000b. Λ-parameter of lattice QCD with the overlap-Dirac operator. Nucl. Phys. B 571, 257.
Allés, B., Campostrini, M., Feo, A., Panagopoulos, H., 1994a. Lattice perturbation theory by computer algebra: a three loop result for the topological susceptibility. Nucl. Phys. B 413, 553.
Allés, B., Campostrini, M., Feo, A., Panagopoulos, H., 1994b. The three loop lattice free energy. Phys. Lett. B 324, 433.
Allés, B., Feo, A., Panagopoulos, H., 1997. The three-loop beta function in SU(N) lattice gauge theories. Nucl. Phys. B 491, 498.
Allés, B., Feo, A., Panagopoulos, H., 1998. Asymptotic scaling corrections in QCD with Wilson fermions from the 3-loop average plaquette. Phys. Lett. B 426, 361. [Erratum-ibid. B 553, 337 (2003)].
Altarelli, G., Curci, G., Martinelli, G., Petrarca, S., 1981. QCD nonleading corrections to weak decays as an application of regularization by dimensional reduction. Nucl. Phys. B 187, 461.
Altarelli, G., Maiani, L., 1974. Octet enhancement of nonleptonic weak interactions in asymptotically free gauge theories. Phys. Lett. B 52, 351.
Aoki, S., Hirose, H., 1996. Perturbative study for domain-wall fermions in 4+1 dimensions. Phys. Rev. D 54, 3471.
Aoki, S., Izubuchi, T., Kuramashi, Y., Taniguchi, Y., 1999a. Perturbative renormalization factors of quark bilinear operators for domain-wall QCD. Phys. Rev. D 59, 094505.
Aoki, S., Izubuchi, T., Kuramashi, Y., Taniguchi, Y., 1999b. Perturbative renormalization factors of three- and four-quark operators for domain-wall QCD. Phys. Rev. D 60, 114504.
Aoki, S., Izubuchi, T., Kuramashi, Y., Taniguchi, Y., 2002. Perturbative renormalization factors in domain-wall QCD with improved gauge actions, arXiv:hep-lat/0206013.
Aoki, S., Kuramashi, Y., 2001. Perturbative renormalization factors of ΔS = 1 four-quark operators for domain-wall QCD. Phys. Rev. D 63, 054504.
Aoki, S., Kuramashi, Y., Onogi, T., Tsutsui, N., 2000. Perturbative renormalization factors of baryon number violating operators for improved quark and gauge actions in lattice QCD. Int. J. Mod. Phys. A 15, 3521.
Aoki, S., Taniguchi, Y., 1999a. One loop calculation in lattice QCD with domain-wall quarks. Phys. Rev. D 59, 054510.
Aoki, S., Taniguchi, Y., 1999b. One loop renormalization for the axial Ward-Takahashi identity in domain-wall QCD. Phys. Rev. D 59, 094506.
Baake, M., Gemünden, B., Oedingen, R., 1982. Structure and representations of the symmetry group of the four-dimensional cube. J. Math. Phys. 23, 944. [Erratum-ibid. 23, 2595 (1982)].
Baake, M., Gemünden, B., Oedingen, R., 1983. On the relations between irreducible representations of the hyperoctahedral group and O(4) and SO(4). J. Math. Phys. 24, 1021.
Baaquie, B.E., 1977. Gauge fixing and mass renormalization in the lattice gauge theory. Phys. Rev. D 16, 2612.
Bali, G.S., Boyle, P., 2002. Perturbative Wilson loops with massive sea quarks on the lattice, arXiv:hep-lat/0210033.
Banks, T., Susskind, L., Kogut, J.B., 1976. Strong coupling calculations of lattice gauge theories: (1+1)-dimensional exercises. Phys. Rev. D 13, 1043.
Bardeen, W.A., 1969. Anomalous Ward identities in spinor field theories. Phys. Rev. 184, 1848.
Batrouni, G.G., Katz, G.R., Kronfeld, A.S., Lepage, G.P., Svetitsky, B., Wilson, K.G., 1985. Langevin simulations of lattice field theories. Phys. Rev. D 32, 2736.
Beccarini, G., Bianchi, M., Capitani, S., Rossi, G., 1995. Deep inelastic scattering in improved lattice QCD. 2. The second moment of structure functions. Nucl. Phys. B 456, 271.
Becher, T., Melnikov, K., 2002. The asymptotic expansion of lattice loop integrals around the continuum limit. Phys. Rev. D 66, 074508.
Bell, J.S., Jackiw, R., 1969. A PCAC Puzzle: π⁰ → γγ in the σ-Model. Nuovo Cim. A 60, 47.
Bernard, C.W., DeGrand, T., 2000. Perturbation theory for fat-link fermion actions. Nucl. Phys. Proc. Suppl. B 83, 845.
Bernard, C.W., DeGrand, T.A., DeTar, C.E., Gottlieb, S., Heller, U.M., Hetrick, J.E., McNeile, C., Orginos, K., Sugar, R.L., Toussaint, D., 2000. Semileptonic decays of heavy mesons with the fat clover action. Nucl. Phys. Proc. Suppl. B 83, 274.
Bernard, C.W., Draper, T., Soni, A., Politzer, H.D., Wise, M.B., 1985. Application of chiral perturbation theory to K → 2π decays. Phys. Rev. D 32, 2343.
Bernard, C.W., Soni, A., Draper, T., 1987. Perturbative corrections to four-fermion operators on the lattice. Phys. Rev. D 36, 3224.
Bernreuther, W., Wetzel, W., Wohlert, R., 1984. Λ-Parameters for lattice Yang–Mills actions containing plaquettes with six links. Phys. Lett. B 142, 407.
Bietenholz, W., 1999. Solutions of the Ginsparg–Wilson relation and improved domain wall fermions. Eur. Phys. J. C 6, 537.
Bietenholz, W., 2001. Approximate Ginsparg–Wilson fermions for QCD, arXiv:hep-lat/0007017. In: Luo, X.-Q., Gregory, E.B. (Eds.), Proceedings of the International Workshop on Non-Perturbative Methods and Lattice QCD, Guangzhou, China, 15–21 May 2000, World Scientific, Singapore.
Bietenholz, W., 2002. Convergence rate and locality of improved overlap fermions. Nucl. Phys. B 644, 223.
Bietenholz, W., Hip, I., 2000. The scaling of exact and approximate Ginsparg–Wilson fermions. Nucl. Phys. B 570, 423.
Bietenholz, W., Struckmann, T., 1999. Perfect lattice perturbation theory: a study of the anharmonic oscillator. Int. J. Mod. Phys. C 10, 531.
Bietenholz, W., Wiese, U.J., 1994. Fixed point actions for lattice fermions. Nucl. Phys. Proc. Suppl. B 34, 516.
Bock, W., Golterman, M.F., Shamir, Y., 1998a. On the phase diagram of a lattice U(1) gauge theory with gauge fixing. Phys. Rev. D 58, 054506.
Bock, W., Golterman, M.F., Shamir, Y., 1998b. Lattice chiral fermions through gauge fixing. Phys. Rev. Lett. 80, 3444.
Bock, W., Golterman, M.F., Shamir, Y., 1998c. Chiral fermions on the lattice through gauge fixing: Perturbation theory. Phys. Rev. D 58, 034501.
Bock, W., Leung, K.C., Golterman, M.F., Shamir, Y., 2000. The phase diagram and spectrum of gauge-fixed Abelian lattice gauge theory. Phys. Rev. D 62, 034507.
Bode, A., 1998. Two loop expansion of the Schrödinger functional coupling α_SF in SU(3) lattice gauge theory. Nucl. Phys. Proc. Suppl. B 63, 796.
Bode, A., Frezzotti, R., Gehrmann, B., Hasenbusch, M., Heitger, J., Jansen, K., Kurth, S., Rolf, J., Simma, H., Sint, S., Sommer, R., Weisz, P., Wittig, H., Wolff, U., 2001. First results on the running coupling in QCD with two massless flavors. Phys. Lett. B 515, 49.
Bode, A., Panagopoulos, H., 2002. The three-loop β-function of QCD with the clover action. Nucl. Phys. B 625, 198.
Bode, A., Weisz, P., Wolff, U., 2000a. Two loop lattice expansion of the Schrödinger functional coupling in improved QCD. Nucl. Phys. Proc. Suppl. B 83, 920.
Bode, A., Weisz, P., Wolff, U., 2000b. Two loop computation of the Schrödinger functional in lattice QCD. Nucl. Phys. B 576, 517. [Erratum-ibid. B600, 453 (2001); B608, 481 (2001)].
Bode, A., Wolff, U., Weisz, P., 1999. Two-loop computation of the Schrödinger functional in pure SU(3) lattice gauge theory. Nucl. Phys. B 540, 491.
Borrelli, A., Maiani, L., Sisto, R., Rossi, G.C., Testa, M., 1989. Yukawa lattice theory and nonperturbative upper bounds to the fermion mass. Phys. Lett. B 221, 360.
Borrelli, A., Maiani, L., Sisto, R., Rossi, G.C., Testa, M., 1990. Neutrinos on the lattice: the regularization of a chiral gauge theory. Nucl. Phys. B 333, 335.
Borrelli, A., Pittori, C., Frezzotti, R., Gabrielli, E., 1993. New improved operators: a convenient redefinition. Nucl. Phys. B 409, 382.
Boulware, D.G., 1970. Renormalizeability of massive non-abelian gauge fields: a functional integral approach. Ann. Phys. 56, 140.
Brower, R.C., Huang, S., Negele, J.W., Pochinsky, A., Schreiber, B., 1997. Calculation of moments of nucleon structure functions. Nucl. Phys. Proc. Suppl. B 53, 318.
Bucarelli, A., Palombi, F., Petronzio, R., Shindler, A., 1999. Moments of parton evolution probabilities on the lattice within the Schrödinger functional scheme. Nucl. Phys. B 552, 379.
Buras, A.J., Jamin, M., Lautenbacher, M.E., Weisz, P.H., 1992. Effective hamiltonians for ΔS = 1 and ΔB = 1 non-leptonic decays beyond the leading logarithmic approximation. Nucl. Phys. B 370, 69.
Buras, A.J., Weisz, P.H., 1990. QCD nonleading corrections to weak decays in dimensional regularization and ’t Hooft–Veltman schemes. Nucl. Phys. B 333, 66.
Burgio, G., Caracciolo, S., Pelissetto, A., 1996. Algebraic algorithm for the computation of one-loop Feynman diagrams in lattice QCD with Wilson fermions. Nucl. Phys. B 478, 687.
Cabibbo, N., Martinelli, G., Petronzio, R., 1984. Weak interactions on the lattice. Nucl. Phys. B 244, 381.
Callan, C.G., Harvey, J.A., 1985. Anomalies and fermion zero modes on strings and domain walls. Nucl. Phys. B 250, 427.
Capitani, S., 2001a. Perturbative renormalization of the first two moments of non-singlet quark distributions with overlap fermions. Nucl. Phys. B 592, 183.
Capitani, S., 2001b. Perturbative renormalization of moments of quark momentum, helicity and transversity distributions with overlap and Wilson fermions. Nucl. Phys. B 597, 313.
Capitani, S., 2002a. Perturbative renormalization for overlap fermions. Nucl. Phys. Proc. Suppl. B 106, 826.
Capitani, S., 2002b. Status of lattice structure function calculations. Acta Phys. Polon. B 33, 3025.
Capitani, S., 2002c. Perturbative and non-perturbative lattice calculations for the study of parton distributions, arXiv:hep-ph/0210076. Presented at the 6th International Symposium on Radiative Corrections, “Application of Quantum Field Theory Phenomenology” (RADCOR 2002) and 6th Zeuthen Workshop on Elementary Particle Theory, “Loops and Legs in Quantum Field Theory”, Kloster Banz, Germany, 8–13 September 2002.
Capitani, S., Caracciolo, S., Pelissetto, A., Rossi, G., 1998a. High-precision computation of two-loop Feynman diagrams with Wilson fermions. Nucl. Phys. Proc. Suppl. B 63, 802.
Capitani, S., Giusti, L., 2000. Perturbative renormalization of weak Hamiltonian four-fermion operators with overlap fermions. Phys. Rev. D 62, 114506.
Capitani, S., Giusti, L., 2001. Analysis of the ΔI = 1/2 rule and ε′/ε with overlap fermions. Phys. Rev. D 64, 014506.
Capitani, S., Göckeler, M., Horsley, R., Klaus, B., Kürzinger, W., Petters, D., Pleiter, D., Rakow, P.E.L., Schaefer, S., Schäfer, A., Schierholz, G., 2001a. Four-quark operators in hadrons. Nucl. Phys. Proc. Suppl. B 94, 299.
Capitani, S., Göckeler, M., Horsley, R., Klaus, B., Linke, V., Rakow, P.E.L., Schäfer, A., Schierholz, G., 2000a. Higher-twist contributions to the structure functions coming from 4-fermion operators. Nucl. Phys. Proc. Suppl. B 83, 232.
Capitani, S., Göckeler, M., Horsley, R., Klaus, B., Linke, V., Rakow, P.E.L., Schäfer, A., Schierholz, G., 2000b. Higher-twist contribution to pion structure function: 4-Fermi operators. Nucl. Phys. B 570, 393.
Capitani, S., Göckeler, M., Horsley, R., Perlt, H., Rakow, P.E.L., Schierholz, G., Schiller, A., 1998b. Perturbative renormalization of improved lattice operators. Nucl. Phys. Proc. Suppl. B 63, 874.
Capitani, S., Göckeler, M., Horsley, R., Perlt, H., Rakow, P.E.L., Schierholz, G., Schiller, A., 1999a. Renormalization of four-fermion operators for higher twist calculations. Nucl. Phys. Proc. Suppl. B 73, 285.
Capitani, S., Göckeler, M., Horsley, R., Perlt, H., Rakow, P.E.L., Schierholz, G., Schiller, A., 2001b. Renormalisation and off-shell improvement in lattice perturbation theory. Nucl. Phys. B 593, 183.
Capitani, S., Göckeler, M., Horsley, R., Rakow, P.E.L., Schierholz, G., 1999b. Operator improvement for Ginsparg–Wilson fermions. Phys. Lett. B 468, 150.
Capitani, S., Göckeler, M., Horsley, R., Rakow, P.E.L., Schierholz, G., 2000c. On-shell and off-shell improvement for Ginsparg–Wilson fermions. Nucl. Phys. Proc. Suppl. B 83, 893.
Capitani, S., Guagnelli, M., Lüscher, M., Sint, S., Sommer, R., Weisz, P., Wittig, H., 1998c. Non-perturbative quark mass renormalization. Nucl. Phys. Proc. Suppl. B 63, 153.
Capitani, S., Lüscher, M., Sommer, R., Wittig, H., 1999c. Non-perturbative quark mass renormalization in quenched lattice QCD. Nucl. Phys. B 544, 669.
Capitani, S., Rossi, G., 1995a. Deep inelastic scattering in improved lattice QCD. 1. The first moment of structure functions. Nucl. Phys. B 433, 351.
Capitani, S., Rossi, G., 1995b. The use of SCHOONSCHIP and FORM in perturbative lattice calculations, arXiv:hep-lat/9504014. In: Denby, B., Perret-Gallix, D. (Eds.), “New Computing Techniques in Physics Research IV”, Proceedings of the 4th International Workshop on Software Engineering and Artificial Intelligence for High-energy and Nuclear Physics (AIHENP95), Pisa, Italy, 3–8 April 1995, World Scientific, Singapore.
Caracciolo, S., Curci, G., Menotti, P., Pelissetto, A., 1989. Renormalization of the energy momentum tensor and the trace anomaly in lattice QED. Phys. Lett. B 228, 375.
Caracciolo, S., Curci, G., Menotti, P., Pelissetto, A., 1990. The energy momentum tensor for lattice gauge theories. Ann. Phys. 197, 119.
Caracciolo, S., Menotti, P., Pelissetto, A., 1991. Analytic determination at one loop of the energy momentum tensor for lattice QCD. Phys. Lett. B 260, 401.
Caracciolo, S., Menotti, P., Pelissetto, A., 1992. One loop analytic computation of the energy momentum tensor for lattice gauge theories. Nucl. Phys. B 375, 195.
Caracciolo, S., Pelissetto, A., Rago, A., 2001. Two-loop critical mass for Wilson fermions. Phys. Rev. D 64, 094506.
Caswell, W.E., 1974. Asymptotic behavior of nonabelian gauge theories to two loop order. Phys. Rev. Lett. 33, 244.
Caswell, W.E., Wilczek, F., 1974. On the gauge dependence of renormalization group parameters. Phys. Lett. B 49, 291.
Celmaster, W., Gonsalves, R.J., 1979a. QCD perturbation expansions in a coupling constant renormalized by momentum space subtraction. Phys. Rev. Lett. 42, 1435.
Celmaster, W., Gonsalves, R.J., 1979b. The renormalization prescription dependence of the QCD coupling constant. Phys. Rev. D 20, 1420.
Chetyrkin, K.G., 1997. Quark mass anomalous dimension to O(α_s⁴). Phys. Lett. B 404, 161.
Chiu, T.W., Hsieh, T.H., 2002. A perturbative calculation of the axial anomaly of a Ginsparg–Wilson Dirac operator. Phys. Rev. D 65, 054508.
Chiu, T.W., Wang, C.W., Zenkin, S.V., 1998. Chiral structure of the solutions of the Ginsparg–Wilson relation. Phys. Lett. B 438, 321.
Christou, C., Feo, A., Panagopoulos, H., Vicari, E., 1998. The three-loop beta-function of SU(N) lattice gauge theories with Wilson fermions. Nucl. Phys. B 525, 387. [Erratum-ibid. B608, 479 (2001)].
Corbò, G., Franco, E., Rossi, G.C., 1989. Perturbative renormalization of the lowest moment operators of DIS in lattice QCD. Phys. Lett. B 221, 367. [Erratum-ibid. B225, 463 (1989)].
Corbò, G., Franco, E., Rossi, G.C., 1990. Mixing of DIS operators in lattice QCD. Phys. Lett. B 236, 196.
Creutz, M., 1983. Quarks, gluons and lattices. Cambridge Monographs On Mathematical Physics. Cambridge University Press, Cambridge.
Creutz, M., 1987. Species doubling and transfer matrices for fermionic fields. Phys. Rev. D 35, 1460.
Creutz, M., 2001. Aspects of chiral symmetry and the lattice. Rev. Mod. Phys. 73, 119.
Curci, G., Franco, E., Maiani, L., Martinelli, G., 1988. Mixing coefficients of the lattice weak Hamiltonian with dimension five operators. Phys. Lett. B 202, 363.
Curci, G., Menotti, P., Paffuti, G., 1983. Symanzik’s improved lagrangian for lattice gauge theory. Phys. Lett. B 130, 205. [Erratum-ibid. B135, 516 (1984)].
Curci, G., Paffuti, G., Tripiccione, R., 1984. Perturbative background to Monte Carlo calculations in lattice gauge theories. Nucl. Phys. B 240, 91.
Curci, G., Petronzio, R., 1983. On the perturbative contributions to the string tension estimates. Phys. Lett. B 132, 133.
Daniel, D., Sheard, S.N., 1988. Perturbative corrections to staggered fermion lattice operators. Nucl. Phys. B 302, 471.
Dashen, R.F., Gross, D.J., 1981. The relationship between lattice and continuum definitions of the gauge theory coupling. Phys. Rev. D 23, 2340.
Davies, C.T.H., 2002. Lattice QCD, arXiv:hep-ph/0205181. In: Davies, C.T.H., Playfer, S.M. (Eds.), Lectures given at the 55th Scottish Universities Summer School in Physics on “Heavy Flavor Physics”, St. Andrews, Scotland, 7–23 August 2001. Scottish Graduate Textbook Series, Institute of Physics 2002.
Davies, C.T.H., Hornbostel, K., Lepage, G.P., McCallum, P., Shigemitsu, J., Sloan, J.H., 1997. Further precise determinations of α_s from lattice QCD. Phys. Rev. D 56, 2755.
de Forcrand, P., García Pérez, M., Hashimoto, T., Hioki, S., Matsufuru, H., Miyamura, O., Nakamura, A., Stamatescu, I.O., Takaishi, T., Umeda, T., 2000. Renormalization group flow of SU(3) lattice gauge theory: numerical studies in a two coupling space. Nucl. Phys. B 577, 263.
DeGrand, T., 1996. Nonperturbative quantum field theory on the lattice, arXiv:hep-th/9610132. Talk given at the 1996 Theoretical Advanced Study Institute in Elementary Particle Physics (TASI 96): “Fields, Strings, and Duality”, Boulder, CO, 2–28 June 1996.
DeGrand, T., 1997. Lattice gauge theory for QCD, arXiv:hep-ph/9610391. In: Chan, J., DePorcel, L., Dixon, L. (Eds.), Proceedings of the 24th SLAC Summer Institute on Particle Physics: “The Strong Interaction, From Hadrons to Partons”, Stanford, CA, August 1996, pp. 19–30.
DeGrand, T., 2001. A variant approach to the overlap action. Phys. Rev. D 63, 034503.
DeGrand, T., 2002. One loop matching coefficients for a variant overlap action – and some of its simpler relatives, arXiv:hep-lat/0210028.
DeGrand, T., Hasenfratz, A., Hasenfratz, P., Kunszt, P., Niedermayer, F., 1997. Fixed-point action for fermions in QCD. Nucl. Phys. Proc. Suppl. B 53, 942.
DeGrand, T., Hasenfratz, A., Hasenfratz, P., Niedermayer, F., 1995. The classically perfect fixed point action for SU(3) gauge theory. Nucl. Phys. B 454, 587.
DeGrand, T., Hasenfratz, A., Hasenfratz, P., Niedermayer, F., 1996. Fixed point actions for SU(3) gauge theory. Phys. Lett. B 365, 233.
DeGrand, T., Hasenfratz, A., Kovács, T.G., 1999. Instantons and exceptional configurations with the clover action. Nucl. Phys. B 547, 259.
DeGrand, T., Hasenfratz, A., Kovács, T.G., 2002. Improving the chiral properties of lattice fermions, arXiv:hep-lat/0211006.
De Wit, B., 1967a. Quantum theory of gravity. 2. The manifestly covariant theory. Phys. Rev. 162, 1195.
De Wit, B., 1967b. Quantum theory of gravity. 3. Applications of the covariant theory. Phys. Rev. 162, 1239.
Di Giacomo, A., Paffuti, G., 1982. Some results related to the continuum limit of lattice gauge theories. Nucl. Phys. B 205, 313.
Dimm, W., Lepage, G.P., Mackenzie, P.B., 1995. Nonperturbative lattice perturbation theory. Nucl. Phys. Proc. Suppl. B 42, 403.
Di Renzo, F., Miccio, V., Scorzato, L., 2002. Unquenched numerical stochastic perturbation theory. arXiv:hep-lat/0209018.
Di Renzo, F., Onofri, E., Marchesini, G., 1995. Renormalons from eight loop expansion of the gluon condensate in lattice gauge theory. Nucl. Phys. B 457, 202.
Di Renzo, F., Onofri, E., Marchesini, G., Marenzoni, P., 1994. Four loop result in SU(3) lattice gauge theory by a stochastic method: lattice correction to the condensate. Nucl. Phys. B 426, 675.
Di Renzo, F., Scorzato, L., 2001. Fermionic loops in numerical stochastic perturbation theory. Nucl. Phys. Proc. Suppl. B 94, 567.
Dolgov, D., Brower, R., Capitani, S., Dreher, P., Negele, J.W., Pochinsky, A., Renner, D.B., Eicker, N., Lippert, T., Schilling, K., Edwards, R.G., Heller, U.M., 2002. Moments of nucleon light cone quark distributions calculated in full lattice QCD. Phys. Rev. D 66, 034506.
Dolgov, D., Brower, R., Capitani, S., Negele, J.W., Pochinsky, A., Renner, D.B., Eicker, N., Lippert, T., Schilling, K., Edwards, R.G., Heller, U.M., 2001. Moments of structure functions in full QCD. Nucl. Phys. Proc. Suppl. B 94, 303.
Dreher, P., Brower, R., Capitani, S., Dolgov, D., Edwards, R.G., Eicker, N., Heller, U.M., Lippert, T., Negele, J.W., Pochinsky, A., Renner, D.B., Schilling, K., 2002. Continuum extrapolation of moments of nucleon quark distributions in full QCD. arXiv:hep-lat/0211021.
Drell, S.D., Weinstein, M., Yankielowicz, S., 1976a. Strong-coupling field theory. 1. Variational approach to φ⁴ theory. Phys. Rev. D 14, 487.
Drell, S.D., Weinstein, M., Yankielowicz, S., 1976b. Strong-coupling field theories. 2. Fermions and gauge fields on a lattice. Phys. Rev. D 14, 1627.
El-Khadra, A.X., Kronfeld, A.S., Mackenzie, P.B., 1997. Massive fermions in lattice gauge theory. Phys. Rev. D 55, 3933.
Ellis, R.K., 1984. Perturbative corrections to universality and renormalization group behavior. Preprint FERMILAB-CONF-84/41. Presented at the ANL Workshop on Gauge Theory on a Lattice, Argonne, IL, 5–7 April, 1984.
Ellis, R.K., Martinelli, G., 1984a. Two loop corrections to the parameters of one-plaquette actions. Nucl. Phys. B 235, 93. [Erratum-ibid. B249, 750 (1985)].
Ellis, R.K., Martinelli, G., 1984b. Perturbative corrections to renormalization group behavior in lattice QCD. Phys. Lett. B 141, 111.
Espriu, D., Tarrach, R., 1982. On prescription dependence of renormalization group functions. Phys. Rev. D 25, 1073.
Farchioni, F., Hasenfratz, P., Niedermayer, F., Papa, A., 1995. The absence of cutoff effects for the fixed point action in one loop perturbation theory. Nucl. Phys. B 454, 638.
Feynman, R.P., Hibbs, A.R., 1965. Quantum Mechanics and Path Integrals. McGraw-Hill, New York.
Follana, E., Panagopoulos, H., 2001. The critical mass of Wilson fermions: a comparison of perturbative and Monte Carlo results. Phys. Rev. D 63, 017501.
Frezzotti, R., Gabrielli, E., Pittori, C., Rossi, G.C., 1992. Four fermion operators with improved nearest neighbor action. Nucl. Phys. B 373, 781.
Friedan, D., 1982. A proof of the Nielsen–Ninomiya Theorem. Commun. Math. Phys. 85, 481.
Frolov, S.A., Slavnov, A.A., 1993. An invariant regularization of the standard model. Phys. Lett. B 309, 344.
Fujikawa, K., Ishibashi, M., 2002. A perturbative study of a general class of lattice Dirac operators. Phys. Rev. D 65, 114504.
Furman, V., Shamir, Y., 1995. Axial symmetries in lattice QCD with Kaplan fermions. Nucl. Phys. B 439, 54.
Gabrielli, E., Martinelli, G., Pittori, C., Heatlie, G., Sachrajda, C.T., 1991. Renormalization of lattice two fermion operators with improved nearest neighbor action. Nucl. Phys. B 362, 475.
Gaillard, M.K., Lee, B.W., 1974. ΔI = 1/2 Rule for nonleptonic decays in asymptotically free field theories. Phys. Rev. Lett. 33, 108.
Garden, J., Heitger, J., Sommer, R., Wittig, H., 2000. Precision computation of the strange quark’s mass in quenched QCD. Nucl. Phys. B 571, 237.
Gasser, J., Leutwyler, H., 1982. Quark masses. Phys. Rept. 87, 77.
Gasser, J., Leutwyler, H., 1984. Chiral perturbation theory to one loop. Ann. Phys. 158, 142.
Gasser, J., Leutwyler, H., 1985. Chiral perturbation theory: expansions in the mass of the strange quark. Nucl. Phys. B 250, 465.
Ginsparg, P.H., Wilson, K.G., 1982. A remnant of chiral symmetry on the lattice. Phys. Rev. D 25, 2649.
Giusti, L., 2002. Exact chiral symmetry on the lattice: QCD applications. arXiv:hep-lat/0211009.
Giusti, L., Rossi, G.C., Testa, M., Veneziano, G., 2002. The U_A(1) problem on the lattice with Ginsparg–Wilson fermions. Nucl. Phys. B 628, 234.
Golterman, M.F., 2001. Lattice chiral gauge theories. Nucl. Phys. Proc. Suppl. B 94, 189.
Golterman, M.F., Shamir, Y., 1997. A gauge-fixing action for lattice gauge theories. Phys. Lett. B 399, 148.
Golterman, M.F., Shamir, Y., 2002. Lattice chiral gauge theories through gauge fixing. arXiv:hep-lat/0205001. Talk given at the NATO Advanced Research Workshop on Confinement, Topology, and other Nonperturbative Aspects of QCD, Stara Lesna, Slovakia, 21–27 January 2002.
Golterman, M.F., Smit, J., 1984a. Relation between QCD parameters on the lattice and in the continuum. Phys. Lett. B 140, 392.
Golterman, M.F., Smit, J., 1984b. Selfenergy and flavor interpretation of staggered fermions. Nucl. Phys. B 245, 61.
González-Arroyo, A., Korthals-Altes, C.P., 1982. Asymptotic freedom scales for any lattice action. Nucl. Phys. B 205, 46.
González-Arroyo, A., Ynduráin, F.J., Martinelli, G., 1982. Computation of the relation between the quark masses in lattice gauge theories and on the continuum. Phys. Lett. B 117, 437. [Erratum-ibid. B122, 486 (1983)].
Göckeler, M., 1984. Mass terms and mass renormalization for Susskind fermions. Phys. Lett. B 142, 197.
Göckeler, M., Horsley, R., Ilgenfritz, E.M., Perlt, H., Rakow, P., Schierholz, G., Schiller, A., 1996a. Lattice operators for moments of the structure functions and their transformation under the hypercubic group. Phys. Rev. D 54, 5705.
Göckeler, M., Horsley, R., Ilgenfritz, E.M., Perlt, H., Rakow, P., Schierholz, G., Schiller, A., 1996b. Perturbative renormalization of lattice bilinear quark operators. Nucl. Phys. B 472, 309.
Göckeler, M., Horsley, R., Pleiter, D., Rakow, P.E., Schäfer, A., Schierholz, G., 2002. Calculation of moments of structure functions, arXiv:hep-lat/0209160.
Groot, R., Hoek, J., Smit, J., 1984. Normalization of currents in lattice QCD. Nucl. Phys. B 237, 111.
Gross, D.J., Wilczek, F., 1973. Ultraviolet behavior of non-abelian gauge theories. Phys. Rev. Lett. 30, 1343.
Guagnelli, M., Jansen, K., Petronzio, R., 1999a. Non-perturbative running of the average momentum of non-singlet parton densities. Nucl. Phys. B 542, 395.
Guagnelli, M., Jansen, K., Petronzio, R., 1999b. Universal continuum limit of non-perturbative lattice non-singlet moment evolution. Phys. Lett. B 457, 153.
Guagnelli, M., Jansen, K., Petronzio, R., 1999c. Renormalization group invariant average momentum of non-singlet parton densities. Phys. Lett. B 459, 594.
Guagnelli, M., Jansen, K., Petronzio, R., 2000. Lattice hadron matrix elements with the Schrödinger functional: the case of the first moment of non-singlet quark density. Phys. Lett. B 493, 77.
Gupta, R., 1999. Lattice QCD, arXiv:hep-lat/9807028. In: Gupta, R., Morel, A., de Rafael, E., David, F. (Eds.), Lectures given at the Les Houches Summer School “Probing the Standard Model of Particle Interactions”, Session LXVIII, 1997, Elsevier, Amsterdam.
Gupta, R., Bhattacharya, T., Sharpe, S.R., 1997. Matrix elements of four-fermion operators with quenched Wilson fermions. Phys. Rev. D 55, 4036.
Hahn, Y., Zimmermann, W., 1968. An elementary proof of Dyson’s power counting theorem. Commun. Math. Phys. 10, 330.
Hamber, H.W., Wu, C.M., 1983. Some predictions for an improved fermion action on the lattice. Phys. Lett. B 133, 351.
Hasenfratz, A., Hasenfratz, P., 1980. The connection between the parameters of lattice and continuum QCD. Phys. Lett. B 93, 165.
Hasenfratz, A., Hasenfratz, P., 1981. The scales of Euclidean and Hamiltonian lattice QCD. Nucl. Phys. B 193, 210.
Hasenfratz, A., Hasenfratz, P., Heller, U.M., Karsch, F., 1984. The β-function of the SU(3) Wilson action. Phys. Lett. B 143, 193.
Hasenfratz, A., Hoffmann, R., Knechtli, F., 2002a. The static potential with hypercubic blocking. Nucl. Phys. Proc. Suppl. B 106, 418.
Hasenfratz, A., Knechtli, F., 2001. Flavor symmetry and the static potential with hypercubic blocking. Phys. Rev. D 64, 034504.
Hasenfratz, A., Knechtli, F., 2002. Simulation of dynamical fermions with smeared links. Comput. Phys. Commun. 148, 81.
Hasenfratz, P., 1998a. Prospects for perfect actions. Nucl. Phys. Proc. Suppl. B 63, 53.
Hasenfratz, P., 1998b. The theoretical background and properties of perfect actions. arXiv:hep-lat/9803027. Prepared for the Advanced Summer School on “Nonperturbative Quantum Field Physics”, Peñíscola, Spain, 2–6 June 1997.
Hasenfratz, P., Hauswirth, S., Jörg, T., Niedermayer, F., Holland, K., 2002b. Testing the fixed-point QCD action and the construction of chiral currents. Nucl. Phys. B 643, 280.
Hasenfratz, P., Laliena, V., Niedermayer, F., 1998. The index theorem in QCD with a finite cut-off. Phys. Lett. B 427, 125.
Hasenfratz, P., Niedermayer, F., 1994. Perfect lattice action for asymptotically free theories. Nucl. Phys. B 414, 785.
Hasenfratz, P., Niedermayer, F., 1997. Fixed-point actions in 1-loop perturbation theory. Nucl. Phys. B 507, 399.
Hattori, T., Kawai, H., 1981. Weak coupling perturbative calculations of Wilson loop in lattice gauge theory. Phys. Lett. B 105, 43.
Heatlie, G., Martinelli, G., Pittori, C., Rossi, G.C., Sachrajda, C.T., 1991. The improvement of hadronic matrix elements in lattice QCD. Nucl. Phys. B 352, 266.
Hein, J., Mason, Q., Lepage, G.P., Trottier, H., 2002. Mass renormalisation for improved staggered quarks. Nucl. Phys. Proc. Suppl. B 106, 236.
Heller, U.M., Karsch, F., 1985. One loop perturbative calculation of Wilson loops on finite lattices. Nucl. Phys. B 251, 254.
Hernández, P., Jansen, K., Lellouch, L., 2002a. From enemies to friends: chiral symmetry on the lattice. arXiv:hep-lat/0203029. Contributed to the NIC Symposium 2001, Jülich, Germany, 5–6 December 2001.
Hernández, P., Jansen, K., Lellouch, L., Wittig, H., 2001. Non-perturbative renormalization of the quark condensate in Ginsparg–Wilson regularizations. J. High Energy Phys. 0107, 018.
Hernández, P., Jansen, K., Lellouch, L., Wittig, H., 2002b. Scalar condensate and light quark masses from overlap fermions. Nucl. Phys. Proc. Suppl. B 106, 766.
Hernández, P., Jansen, K., Lüscher, M., 1999. Locality properties of Neuberger’s lattice Dirac operator. Nucl. Phys. B 552, 363.
Hernández, P., Jansen, K., Lüscher, M., 2000. A note on the practical feasibility of domain-wall fermions. arXiv:hep-lat/0007015. Talk given at the Workshop on Current Theoretical Problems in Lattice Field Theory, Ringberg, Germany, 2–8 April 2000.
Igarashi, H., Okuyama, K., Suzuki, H., 2000. Errata and addenda to ‘Anomaly cancellation condition in lattice gauge theory’, arXiv:hep-lat/0012018.
Ishibashi, M., Kikukawa, Y., Noguchi, T., Yamada, A., 2000. One-loop analyses of lattice QCD with the overlap Dirac operator. Nucl. Phys. B 576, 501.
Ishizuka, M., Shizawa, Y., 1994. Perturbative renormalization factors for bilinear and four quark operators for Kogut-Susskind fermions on the lattice. Phys. Rev. D 49, 3519.
Iwasaki, Y., 1983a. Renormalization group analysis of lattice theories and improved lattice action. 1. Two-dimensional nonlinear O(N) sigma model. University of Tsukuba Report UTHEP-117.
Iwasaki, Y., 1983b. Renormalization group analysis of lattice theories and improved lattice action. 2. Four-dimensional nonabelian SU(N) gauge model. University of Tsukuba Report UTHEP-118.
Jansen, K., 1992. Chiral fermions and anomalies on a finite lattice. Phys. Lett. B 288, 348.
Jansen, K., 1996. Domain wall fermions and chiral gauge theories. Phys. Rept. 273, 1.
Jansen, K., 2000. Structure functions on the lattice. arXiv:hep-lat/0010038. In: Lim, C.S., Yamanaka, T. (Eds.), Proceedings of the 30th International Conference on High-Energy Physics (ICHEP 2000), Osaka, Japan, 27 July–2 August 2000, World Scientific, Singapore.
Jansen, K., Liu, C., Lüscher, M., Simma, H., Sint, S., Sommer, R., Weisz, P., Wolff, U., 1996. Non-perturbative renormalization of lattice QCD at all scales. Phys. Lett. B 372, 275.
Jansen, K., Sommer, R., 1998. O(a) improvement of lattice QCD with two flavors of Wilson quarks. Nucl. Phys. B 530, 185.
Jegerlehner, F., 2001. Facts of life with γ₅. Eur. Phys. J. C 18, 673.
Ji, X.D., 1995. Exact matching condition for matrix elements in lattice and MS schemes, arXiv:hep-lat/9506034. Preprint MIT-CTP-2447, unpublished.
Jones, D.R., 1974. Two loop diagrams in Yang–Mills theory. Nucl. Phys. B 75, 531.
Jörg, T., 2002. Chiral measurements in quenched lattice QCD with fixed point fermions, arXiv:hep-lat/0206025. Ph.D. Thesis (Bern), Preprint BUTP-2002-9.
Kaplan, D.B., 1992. A method for simulating chiral fermions on the lattice. Phys. Lett. B 288, 342.
Kaplan, D.B., 1993. Chiral fermions on the lattice. Nucl. Phys. Proc. Suppl. B 30, 597.
Karsten, L.H., 1981. Lattice fermions in Euclidean space-time. Phys. Lett. B 104, 315.
Karsten, L.H., Smit, J., 1978. Axial symmetry in lattice theories. Nucl. Phys. B 144, 536.
Karsten, L.H., Smit, J., 1979. The vacuum polarization with SLAC lattice fermions. Phys. Lett. B 85, 100.
Karsten, L.H., Smit, J., 1981. Lattice fermions: species doubling, chiral invariance, and the triangle anomaly. Nucl. Phys. B 183, 103.
Kawai, H., Nakayama, R., Seo, K., 1981. Comparison of the lattice Λ parameter with the continuum Λ parameter in massless QCD. Nucl. Phys. B 189, 40.
Kawamoto, N., Smit, J., 1981. Effective lagrangian and dynamical symmetry breaking in strongly coupled lattice QCD. Nucl. Phys. B 192, 100.
Kikukawa, Y., 2002. Domain wall fermion and chiral gauge theories on the lattice with exact gauge invariance. Phys. Rev. D 65, 074504.
Kikukawa, Y., Nakayama, Y., 2001. Gauge anomaly cancellations in SU(2)_L × U(1)_Y electroweak theory on the lattice. Nucl. Phys. B 597, 519.
Kikukawa, Y., Yamada, A., 1999a. Weak coupling expansion of massless QCD with a Ginsparg–Wilson fermion and axial U(1) anomaly. Phys. Lett. B 448, 265.
Kikukawa, Y., Yamada, A., 1999b. Axial vector current of exact chiral symmetry on the lattice. Nucl. Phys. B 547, 413.
Kluberg-Stern, H., Zuber, J.B., 1975a. Renormalization of non-Abelian gauge theories in a background-field gauge. 1. Green’s functions. Phys. Rev. D 12, 482.
Kluberg-Stern, H., Zuber, J.B., 1975b. Renormalization of non-Abelian gauge theories in a background-field gauge. 2. Gauge-invariant operators. Phys. Rev. D 12, 3159.
Knechtli, F., Della Morte, M., Rolf, J., Sommer, R., Wetzorke, I., Wolff, U., 2002. Running quark mass in two flavor QCD, arXiv:hep-lat/0209025.
Kogut, J.B., 1983. The lattice gauge theory approach to quantum chromodynamics. Rev. Mod. Phys. 55, 775.
Kogut, J.B., Susskind, L., 1975. Hamiltonian formulation of Wilson’s lattice gauge theories. Phys. Rev. D 11, 395.
Kovács, T.G., 2002. Locality and topology with fat link overlap actions, arXiv:hep-lat/0209125.
Kronfeld, A.S., 2002. Progress in lattice QCD. arXiv:hep-ph/0209231. Presented at the 22nd Physics in Collision Conference (PIC 2002), Stanford, California, 20–22 June 2002.
Kronfeld, A.S., Mertens, B.P., 1984. Renormalization of massive lattice fermions. Nucl. Phys. Proc. Suppl. B 34, 495.
Kronfeld, A.S., Photiadis, D.M., 1985. Phenomenology on the lattice: composite operators in lattice gauge theory. Phys. Rev. D 31, 2939.
Kuramashi, Y., 1998. Perturbative renormalization factors of bilinear operators for massive Wilson quarks on the lattice. Phys. Rev. D 58, 034507.
Kurth, S., 2002. The renormalised quark mass in the Schrödinger functional of lattice QCD – a one-loop calculation with a non-vanishing background field. arXiv:hep-lat/0211011. Ph.D. Thesis, Humboldt Universität, Berlin.
Laporta, S., 2000. High-precision calculation of multi-loop Feynman integrals by difference equations. Int. J. Mod. Phys. A 15, 5087.
Lee, W.J., 2001. Perturbative matching of the staggered four-fermion operators for ε′/ε. Phys. Rev. D 64, 054505.
Lee, W.J., 2002. Perturbative improvement of staggered fermions using fat links, arXiv:hep-lat/0208032.
Lee, W.J., Sharpe, S.R., 2002a. One-loop matching coefficients for improved staggered bilinears. arXiv:hep-lat/0208018.
Lee, W.J., Sharpe, S.R., 2002b. Matching coefficients for improved staggered bilinears. arXiv:hep-lat/0208036.
Lepage, G.P., Clark, B., Davies, C.T., Hornbostel, K., Mackenzie, P.B., Morningstar, C., Trottier, H., 2002. Constrained curve fitting. Nucl. Phys. Proc. Suppl. B 106, 12.
Lepage, G.P., Mackenzie, P.B., 1993. On the viability of lattice perturbation theory. Phys. Rev. D 48, 2250.
Leroy, J.P., Micheli, J., Rossi, G.C., Yoshida, K., 1990. QCD perturbation theory in the temporal gauge. Z. Phys. C 48, 653.
Lüscher, M., 1977. Construction of a selfadjoint, strictly positive transfer matrix for Euclidean lattice gauge theories. Commun. Math. Phys. 54, 283.
Lüscher, M., 1983. Project proposal for the EMC 2 collaboration. Unpublished notes.
Lüscher, M., 1985. Schrödinger representation in quantum field theory. Nucl. Phys. B 254, 52.
Lüscher, M., 1986. Improved lattice gauge theories. In: Osterwalder, K., Stora, R. (Eds.), Lectures given at the Les Houches Summer School “Critical Phenomena, Random Systems, Gauge Theories”, Session XLIII, 1984. Elsevier, Amsterdam.
Lüscher, M., 1990. Selected topics in lattice field theory. In: Brézin, E., Zinn-Justin, J. (Eds.), Lectures given at the Les Houches Summer School “Fields, Strings and Critical Phenomena”, Session XLIX, 1988. Elsevier, Amsterdam.
298
S. Capitani / Physics Reports 382 (2003) 113 – 302
L6uscher, M., 1997. Theoretical advances in lattice QCD, arXiv:hep-ph/9711205. Talk given at the 18th International Symposium on Lepton–Photon Interactions, Hamburg, 28 July–1 August 1997. L6uscher, M., 1998. Exact chiral symmetry on the lattice and the Ginsparg–Wilson relation. Phys. Lett. B 428, 342. L6uscher, M., 1999a. Advanced lattice QCD. arXiv:hep-lat/9802029. In: Gupta, R., Morel, A., de Rafael, E., David, F. (Eds.), Lectures given at the Les Houches Summer School “Probing the Standard Model of Particle Interactions”, Session LXVIII, 1997, Elsevier, Amsterdam. L6uscher, M., 1999b. Topology and the axial anomaly in abelian lattice gauge theories. Nucl. Phys. B 538, 515. L6uscher, M., 1999c. Abelian chiral gauge theories on the lattice with exact gauge invariance. Nucl. Phys. B 549, 295. L6uscher, M., 2000a. Weyl fermions on the lattice and the non-abelian gauge anomaly. Nucl. Phys. B 568, 162. L6uscher, M., 2000b. Chiral gauge theories on the lattice with exact gauge invariance. Nucl. Phys. Proc. Suppl. B 83, 34. L6uscher, M., 2000c. Lattice regularization of chiral gauge theories to all orders of perturbation theory. J. High Energy Phys. 0006, 028. L6uscher, M., 2001. Chiral gauge theories revisited, arXiv:hep-th/0102028. Lectures given at the International School of Subnuclear Physics, 38th Course: “Theory and Experiment Heading for New Physics”, Erice, Italy, 27 August–5 September 2000. L6uscher, M., 2002. Lattice QCD—from quark con*nement to asymptotic freedom, arXiv:hep-ph/0211220. Plenary talk at the International Conference on Theoretical Physics (TH 2002), Paris, UNESCO, 22–27 July 2002. L6uscher, M., Narayanan, R., Weisz, P., WolJ, U., 1992. The Schr6odinger functional – a renormalizable probe for non-abelian gauge theories. Nucl. Phys. B 384, 168. L6uscher, M., Sint, S., Sommer, R., Weisz, P., 1996. Chiral symmetry and O(a) improvement in lattice QCD. Nucl. Phys. B 478, 365. L6uscher, M., Sint, S., Sommer, R., Weisz, P., WolJ, U., 1997. 
Non-perturbative O(a) improvement of lattice QCD. Nucl. Phys. B 491, 323. L6uscher, M., Sommer, R., Weisz, P., WolJ, U., 1994. A precise determination of the running coupling in the SU (3) Yang–Mills theory. Nucl. Phys. B 413, 481. L6uscher, M., Sommer, R., WolJ, U., Weisz, P., 1993. Computation of the running coupling in the SU (2) Yang–Mills theory. Nucl. Phys. B 389, 247. L6uscher, M., Weisz, P., 1984. De*nition and general properties of the transfer matrix in continuum limit improved lattice gauge theories. Nucl. Phys. B 240, 349. L6uscher, M., Weisz, P., 1985a. On-shell improved lattice gauge theories. Commun. Math. Phys. 97, 59. [Erratum-ibid. 98, 433 (1985)]. L6uscher, M., Weisz, P., 1985b. Computation of the action for on-shell improved lattice gauge theories at weak coupling. Phys. Lett. B 158, 250. L6uscher, M., Weisz, P., 1986. EQcient numerical techniques for perturbative lattice gauge theory computations. Nucl. Phys. B 266, 309. L6uscher, M., Weisz, P., 1995a. Two loop relation between the bare lattice coupling and the MS coupling in pure SU (N ) gauge theories. Phys. Lett. B 349, 165. L6uscher, M., Weisz, P., 1995b. Coordinate space methods for the evaluation of Feynman diagrams in lattice *eld theories. Nucl. Phys. B 445, 429. L6uscher, M., Weisz, P., 1995c. Background *eld technique and renormalization in lattice gauge theory. Nucl. Phys. B 452, 213. L6uscher, M., Weisz, P., 1995d. Computation of the relation between the bare lattice coupling and the MS coupling in SU (N ) gauge theories to two loops. Nucl. Phys. B 452, 234. L6uscher, M., Weisz, P., 1996. O(a) improvement of the axial current in lattice QCD to one-loop order of perturbation theory. Nucl. Phys. B 479, 429. L6uscher, M., Weisz, P., WolJ, U., 1991. A numerical method to compute the running coupling in asymptotically free theories. Nucl. Phys. B 359, 221. Mackenzie, P.B., 1995. Standard model phenomenology using lattice QCD. In: Kilcup, G., Sharpe, S.R. 
(Eds.), Phenomenology and Lattice QCD—Proceedings of the 1993 Uehling Summer School, Seattle, WA, 21 June–2 July 1993. World Scienti*c, Singapore. Maiani, L., Martinelli, G., Rossi, G.C., Testa, M., 1987. The octet nonleptonic Hamiltonian and current algebra on the lattice with Wilson fermions. Nucl. Phys. B 289, 505.
S. Capitani / Physics Reports 382 (2003) 113 – 302
299
Mandula, J.E., Zweig, G., Govaerts, J., 1983a. Representations of the rotation reNection symmetry group of the four-dimensional cubic lattice. Nucl. Phys. B 228, 91. Mandula, J.E., Zweig, G., Govaerts, J., 1983b. Covariant lattice glueball *elds. Nucl. Phys. B 228, 109. Martinelli, G., 1984. The four fermion operators of the Weak Hamiltonian on the lattice and in the continuum. Phys. Lett. B 141, 395. Martinelli, G., Pittori, C., Sachrajda, C.T., Testa, M., Vladikas, A., 1995. A general method for nonperturbative renormalization of lattice operators. Nucl. Phys. B 445, 81. Martinelli, G., Zhang, Y.C., 1983a. The connection between local operators on the lattice and in the continuum and its relation to meson decay constants. Phys. Lett. B 123, 433. Martinelli, G., Zhang, Y.C., 1983b. One loop corrections to extended operators on the lattice. Phys. Lett. B 125, 77. Mertens, B.P., Kronfeld, A.S., El-Khadra, A.X., 1998. The self energy of massive lattice fermions. Phys. Rev. D 58, 034505. Montvay, I., M6unster, G., 1994. Quantum Fields on a Lattice. Cambridge Monographs on Mathematical Physics. Cambridge University Press, Cambridge. Morningstar, C.J., 1996. Lattice perturbation theory. Nucl. Phys. Proc. Suppl. B 47, 92. M6unster, G., Walzl, M., 2000. Lattice gauge theory: a short primer. arXiv:hep-lat/0012005. Lectures given at the Zuoz Summer School on “Phenomenology of Gauge Interactions”, Zuoz, Engadin, Switzerland, 13–19 August 2000. Naik, S., 1993. O(a) perturbative improvement for Wilson fermions. Phys. Lett. B 311, 230. Nanopoulos, D.V., Ross, D.A., 1979. Limits on the number of Navors in grand uni*ed theories from higher order corrections to fermion masses. Nucl. Phys. B 157, 273. Nanopoulos, D.V., Ross, G.G., 1975. Rare decay modes of the K mesons and KL −KS mass diJerence in an asymptotically free gauge theory. Phys. Lett. B 56, 279. Narayanan, R., Neuberger, H., 1993a. In*nitely many regulator *elds for chiral fermions. Phys. Lett. B 302, 62. 
Narayanan, R., Neuberger, H., 1993b. Chiral fermions on the lattice. Phys. Rev. Lett. 71, 3251. Narayanan, R., Neuberger, H., 1994. Chiral determinant as an overlap of two vacua. Nucl. Phys. B 412, 574. Narayanan, R., Neuberger, H., 1995. A construction of lattice chiral gauge theories. Nucl. Phys. B 443, 305. Narayanan, R., WolJ, U., 1995. Two loop computation of a running coupling lattice Yang-Mills theory. Nucl. Phys. B 444, 425. Necco, S., 2002a. The Nf = 0 heavy quark potential and perturbation theory. Nucl. Phys. Proc. Suppl. B 106, 862. Necco, S., 2002b. Universality and RG-improved gauge actions, arXiv:hep-lat/0208052. Necco, S., Sommer, R., 2001. Testing perturbation theory on the Nf = 0 static quark potential. Phys. Lett. B 523, 135. Negele, J.W., 2002. Understanding parton distributions from lattice QCD: present limitations and future promise, arXiv:hep-lat/0211022. In the Proceedings of the European Workshop on the QCD structure of the Nucleon (QCD-N’02), Ferrara, Italy, 3–6 April 2002, to appear. Neuberger, H., 1986. Nonperturbative BRS invariance. Phys. Lett. B 175, 69. Neuberger, H., 1987. Nonperturbative BRS invariance and the Gribov problem. Phys. Lett. B 183, 337. Neuberger, H., 1998a. Exactly massless quarks on the lattice. Phys. Lett. B 417, 141. Neuberger, H., 1998b. Vector like gauge theories with almost massless fermions on the lattice. Phys. Rev. D 57, 5417. Neuberger, H., 1998c. More about exactly massless quarks on the lattice. Phys. Lett. B 427, 353. Neuberger, H., 2001. Exact chiral symmetry on the lattice. Ann. Rev. Nucl. Part. Sci. 51, 23. Niedermayer, F., 1999. Exact chiral symmetry, topological charge and related topics. Nucl. Phys. Proc. Suppl. B 73, 105. Nielsen, H.B., Ninomiya, M., 1981a. Absence of neutrinos on a lattice. 1. Proof by Homotopy theory. Nucl. Phys. B 185, 20. [Erratum-ibid. B195, 541 (1981)]. Nielsen, H.B., Ninomiya, M., 1981b. Absence of neutrinos on a lattice. 2. Intuitive topological proof. Nucl. Phys. 
B 193, 173. Nielsen, H.B., Ninomiya, M., 1981c. No go theorem for regularizing chiral fermions. Phys. Lett. B 105, 219. Nobes, M.A., Trottier, H.D., Lepage, G.P., Mason, Q., 2002. Second order perturbation theory for improved gluon and staggered quark actions. Nucl. Phys. Proc. Suppl. B 106, 838. Osterwalder, K., Schrader, R., 1973. Axioms for Euclidean Green’s functions. Commun. Math. Phys. 31, 83. Osterwalder, K., Schrader, R., 1975. Axioms For Euclidean Green’s functions. 2. Commun. Math. Phys. 42, 281. Palombi, F., Petronzio, R., Shindler, A., 2002. Moments of singlet parton densities on the lattice in the Schr6odinger functional scheme. Nucl. Phys. B 637, 243.
300
S. Capitani / Physics Reports 382 (2003) 113 – 302
Panagopoulos, H., Proestos, Y., 2002. The critical hopping parameter in O(a) improved lattice QCD. Phys. Rev. D 65, 014511. Panagopoulos, H., Vicari, E., 1990. The trilinear gluon condensate on the lattice. Nucl. Phys. B 332, 261. Panagopoulos, H., Vicari, E., 1998. Resummation of cactus diagrams in lattice QCD. Phys. Rev. D 58, 114501. Panagopoulos, H., Vicari, E., 1999. Resummation of cactus diagrams in the clover improved lattice formulation of QCD. Phys. Rev. D 59, 057503. Parisi, G., 1980. Recent progresses in gauge theories. Preprint LNF-80/52-P. Presented at the 20th International Conference on High Energy Physics, Madison, Wisconsin, 17–23 July, 1980. In C80-07-17.140. Parisi, G., Wu, Y.s., 1981. Perturbation theory without gauge *xing. Sci. Sin. 24, 483. Patel, A., Sharpe, S.R., 1993. Perturbative corrections for staggered fermion bilinears. Nucl. Phys. B 395, 701. Politzer, H.D., 1973. Reliable perturbative results for strong interactions? Phys. Rev. Lett. 30, 1346. Rebbi, C., 1987. Chiral invariant regularization of fermions on the lattice. Phys. Lett. B 186, 200. Reisz, T., 1988a. A power counting theorem for Feynman integrals on the lattice. Commun. Math. Phys. 116, 81. Reisz, T., 1988b. A convergence theorem for lattice Feynman integrals with massless propagators. Commun. Math. Phys. 116, 573. Reisz, T., 1988c. Renormalization of Feynman integrals on the lattice. Commun. Math. Phys. 117, 79. Reisz, T., 1988d. Renormalization of lattice Feynman integrals with massless propagators. Commun. Math. Phys. 117, 639. Reisz, T., 1989. Lattice gauge theory: renormalization to all orders in the loop expansion. Nucl. Phys. B 318, 417. Rossi, G.C., Sarno, R., Sisto, R., 1993. Regularization of chiral gauge theories. Nucl. Phys. B 398, 101. Rossi, G.C., Testa, M., 1980a. The structure of Yang–Mills theories in the temporal gauge. 1. General formulation. Nucl. Phys. B 163, 109. Rossi, G.C., Testa, M., 1980b. 
The structure of Yang–Mills theories in the temporal gauge. 2. Perturbation theory. Nucl. Phys. B 176, 477. Rossi, G.C., Testa, M., 1984. The structure of Yang–Mills theories in the temporal gauge. 3. The instanton sector. Nucl. Phys. B 237, 442. Rossi, G.C., Yoshida, K., 1989. Fermionic functional integral and the temporal gauge of QCD. Nuovo Cim. 11D, 101. Rothe, H.J., 1997. Lattice Gauge Theories: An Introduction, 2nd Edition. Lecture Notes in Physics, Vol. 59, World Scienti*c, Singapore. Sachrajda, C.T., 1990. Lattice perturbation theory. Preprint SHEP 89/90-4. In: DeGrand, T., Toussaint, D. (Eds.), From Actions to Answers, Proceedings of the 1989 Theoretical Advanced Study Institute (TASI, 1989), Boulder, CO, 4-30 June, 1989. World Scienti*c, Singapore. Sadooghi, N., Rothe, H.J., 1997. Continuum behaviour of lattice QED, discretized with one-sided lattice diJerences, in one-loop order. Phys. Rev. D 55, 6749. Sarno, R., Sisto, R., 1992. Regularization of a chiral gauge theory: evidence for a ghost counterterm. Nucl. Phys. Proc. Suppl. B 29, 152. Shamir, Y., 1993. Chiral fermions from lattice boundaries. Nucl. Phys. B 406, 90. Shamir, Y., 1996. Lattice chiral fermions. Nucl. Phys. Proc. Suppl. B 47, 212. Shamir, Y., 1998. The standard model from a new phase transition on the lattice. Phys. Rev. D 57, 132. Shamir, Y., 2000. New domain-wall fermion actions. Phys. Rev. D 62, 054513. Sharatchandra, H.S., 1978. The continuum limit of lattice gauge theories in the context of renormalized perturbation theory. Phys. Rev. D 18, 2042. Sharatchandra, H.S., Thun, H.J., Weisz, P., 1981. Susskind fermions on a Euclidean lattice. Nucl. Phys. B 192, 205. Sharpe, S.R., 1994. Phenomenology from the lattice. arXiv:hep-ph/9412243. In: Donoghue, J. (Ed.), CP Violation and the limits of the Standard Model, Proceedings of the 1994 Theoretical Advanced Study Institute (TASI, 1994), Boulder, CO, 29 May–24 June, 1994, World Scienti*c, Singapore. Sharpe, S.R., 1995. 
Introduction to lattice *eld theory. In: Kilcup, G., Sharpe, S.R. (Eds.), Phenomenology and Lattice QCD - Proceedings of the 1993 Uehling Summer School, Seattle, WA, 21 June—2 July 1993, World Scienti*c, Singapore. Sharpe, S.R., 1999. Progress in lattice gauge theory, arXiv:hep-lat/9811006. In: Astbury, A., Axen, D., Robinson, J. (Eds.), Proceedings of the 29th International Conference on High-Energy Physics (ICHEP, 1998), Vancouver, Canada, 23–29 July 1998, World Scienti*c, Singapore.
S. Capitani / Physics Reports 382 (2003) 113 – 302
301
Sharpe, S.R., Patel, A., 1994. Perturbative corrections for staggered four fermion operators. Nucl. Phys. B 417, 307. Sheikholeslami, B., Wohlert, R., 1985. Improved continuum limit lattice action for QCD with Wilson fermions. Nucl. Phys. B 259, 572. Sint, S., 1994. On the Schr6odinger functional in QCD. Nucl. Phys. B 421, 135. Sint, S., 1995. One loop renormalization of the QCD Schr6odinger functional. Nucl. Phys. B 451, 416. Sint, S., Sommer, R., 1996. The running coupling from the QCD Schr6odinger functional—a one loop analysis. Nucl. Phys. B 465, 71. Sint, S., Weisz, P., 1997. Further results on O(a) improved lattice QCD to one-loop order of perturbation theory. Nucl. Phys. B 502, 251. Sint, S., Weisz, P., 1999. The running quark mass in the SF scheme and its two-loop anomalous dimension. Nucl. Phys. B 545, 529. Smit, J., 1986. Fermions on a lattice. Acta Phys. Polon. B 17, 531. Smit, J., 2002. Introduction to quantum *elds on a lattice: a robust mate. In: Cambridge Lecture Notes in Physics, Vol. 15. Cambridge University Press, Cambridge. Sommer, R., 1997. Non-perturbative renormalization of QCD. arXiv:hep-ph/9711243. Talk given at the 36th Internationale Universit6atswochen F6ur Kernphysik und Teilchenphysik “Computing particle properties”, Schladming, Austria, 1–8 March 1997. Sommer, R., 2002. Non-perturbative renormalization of HQET and QCD. arXiv:hep-lat/0209162. Stehr, J., Weisz, P.H., 1983. Note on gauge *xing in lattice QCD. Lett. Nuovo Cim. 37, 173. Susskind, L., 1977. Lattice fermions. Phys. Rev. D 16, 3031. Suzuki, H., 1999. Gauge invariant eJective action in Abelian chiral gauge theory on the lattice. Prog. Theoret. Phys. 101, 1147. Suzuki, H., 2000. Anomaly cancellation condition in lattice gauge theory. Nucl. Phys. B 585, 471. Symanzik, K., 1980. CutoJ dependence in lattice 44 theory. In: ’t Hooft, G., et al. (Eds.), Recent Developments in Gauge Theories (Carg[ese 1979). Plenum, New York. Symanzik, K., 1981. 
Schr6odinger representation and Casimir eJect in renormalizable quantum *eld theory. Nucl. Phys. B 190, 1. Symanzik, K., 1982. Some topics in quantum *eld theory. Presented at the 6th International Conference on Mathematical Physics, Berlin, West Germany, 11–21 August, 1981. In: Schrader, R., et al. (Ed.), Mathematical Problems in Theoretical Physics, Lecture Notes in Physics, Vol. 153. Springer, New York. Symanzik, K., 1983a. Continuum limit and improved action in lattice theories. 1. Principles and 4 theory. Nucl. Phys. B 226, 187. Symanzik, K., 1983b. Continuum limit and improved action in lattice theories. 2. O(N) non-linear sigma model in perturbation theory. Nucl. Phys. B 226, 205. Takaishi, Y., 1996. Heavy quark potential and eJective actions on blocked con*gurations. Phys. Rev. D 54, 1050. Tarasov, O.V., Vladimirov, A.A., Zharkov, A.Y., 1980. The Gell–Mann–Low function of QCD in the three loop approximation. Phys. Lett. B 93, 429. Tarrach, R., 1981. The pole mass in perturbative QCD. Nucl. Phys. B 183, 384. Testa, M., 1998a. The Rome approach to chirality. arXiv:hep-lat/9707007. In: Cho, Y.M., Virasoro, M. (Eds.), Recent Developments in Nonperturbative Quantum Field Theory, Proceedings of the APCTP-ICTP Joint International Conference, Seoul, Korea, 26–30 May 1997, World Scienti*c, Singapore. Testa, M., 1998b. Lattice gauge *xing, Gribov copies and BRST symmetry. Phys. Lett. B 429, 349. Travaglini, G., 1997. A Wilson–Majorana regularization for lattice chiral gauge theories. Nucl. Phys. B 507, 709. Trottier, H.D., Shakespeare, N.H., Lepage, G.P., Mackenzie, P.B., 2002. Perturbative expansions from Monte Carlo simulations at weak coupling: Wilson loops and the static-quark self-energy. Phys. Rev. D 65, 094502. van den Doel, C., Smit, J., 1983. Dynamical symmetry breaking in two Navor SU (N ) and SO(N ) lattice gauge theories. Nucl. Phys. B 228, 122. van Ritbergen, T., Vermaseren, J.A., Larin, S.A., 1997. The four-loop beta function in quantum chromodynamics. 
Phys. Lett. B 400, 379. Veltman, M.J., 1989. Gammatrica. Nucl. Phys. B 319, 253. Veneziano, G., 1979. U (1) without instantons. Nucl. Phys. B 159, 213. Veneziano, G., 1980. Goldstone mechanism from gluon dynamics. Phys. Lett. B 95, 90.
302
S. Capitani / Physics Reports 382 (2003) 113 – 302
Vermaseren, J.A.M., 2000. New features of FORM. arXiv:math-ph/0010025. Preprint NIKHEF-00-032. Vermaseren, J.A.M., Larin, S.A., van Ritbergen, T., 1997. The 4-loop quark mass anomalous dimension and the invariant quark mass. Phys. Lett. B 405, 327. Weinzierl, S., 2002. Computer algebra in particle physics, arXiv:hep-ph/0209234. Weisz, P., 1981. On the connection between the -parameters of Euclidean lattice and continuum QCD. Phys. Lett. B 100, 331. Weisz, P., 1983. Continuum limit improved lattice action for pure Yang–Mills theory 1. Nucl. Phys. B 212, 1. Weisz, P., 1996. Lattice investigations of the running coupling. Nucl. Phys. Proc. Suppl. B 47, 71. Weisz, P., Wohlert, R., 1984. Continuum limit improved lattice action for pure Yang–Mills theory 2. Nucl. Phys. B 236, 397. [Erratum-ibid. B247, 544 (1984)]. Wiese, U.J., 1993. Fixed point actions for Wilson fermions. Phys. Lett. B 315, 417. Wilczek, F., 1987. Lattice fermions. Phys. Rev. Lett. 59, 2397. Wilson, K.G., 1974. Con*nement of quarks. Phys. Rev. D 10, 2445. Wilson, K.G., 1975. The renormalization group: critical phenomena and the Kondo problem. Rev. Mod. Phys. 47, 773. Wilson, K.G., 1977. Quarks and strings on a lattice. In: Zichichi, A. (Ed.), New Phenomena In Subnuclear Physics, Part A, Proceedings of the First Half of the 1975 International School of Subnuclear Physics, Erice, Sicily, 11 July–1 August, 1975, Plenum Press, New York. CLNS-321; C75-07-11.10. Wilson, K.G., 1980. Monte Carlo calculations for the lattice gauge theory. In: ’t Hooft, G., et al. (Ed.), Recent Developments in Gauge Theories (Carg[ese, 1979). Plenum, New York. Wilson, K.G., Kogut, J.B., 1974. The renormalization group and the R expansion. Phys. Rep. 12, 75. Witten, E., 1979. Current algebra theorems for the U (1) ‘Goldstone boson’. Nucl. Phys. B 156, 269. Wittig, H., 1999. Lattice gauge theory. arXiv:hep-ph/9911400. 
Presented at the International Europhysics Conference on High-Energy Physics (EPS-HEP 99), Tampere, Finland, 15 –21 July 1999. Wohlert, R., 1987. Improved continuum limit lattice action for quarks, Ph.D. Thesis (Hamburg), Preprint DESY 87-069, unpublished. Wohlert, R., Weisz, P., Wetzel, W., 1985. Weak coupling perturbative calculations of the Wilson loop for the standard action. Nucl. Phys. B 259, 85. WolJ, U., 1995. Two loop computation of a *nite volume running coupling on the lattice. Nucl. Phys. Proc. Suppl. B 42, 291. Yamada, A., 1998. Lattice perturbation theory in the overlap formulation for the Yukawa and gauge interactions. Nucl. Phys. B 529, 483. Zinn-Justin, J., 2002. Chiral anomalies and topology. arXiv:hep-th/0201220. Contributed to the Autumn School 2001 “Topology and Geometry in Physics”, Rot an der Rot, Germany, 24–28 September 2001.
Physics Reports 382 (2003) 303 – 380 www.elsevier.com/locate/physrep
Supernova remnants and γ-ray sources

Diego F. Torres (a,*), Gustavo E. Romero (b), Thomas M. Dame (c), Jorge A. Combi (b), Yousaf M. Butt (c)

(a) Lawrence Livermore Laboratory, 7000 East Ave. L-413, Livermore, CA 94550, USA
(b) Instituto Argentino de Radioastronomía, C.C.5, 1894 Villa Elisa, Buenos Aires, Argentina
(c) Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138, USA
Accepted 15 April 2003 editor: M.P. Kamionkowski
Abstract

A review of the possible relationship between γ-ray sources and supernova remnants (SNRs) is presented. Particular emphasis is given to the analysis of the observational status of the problem of cosmic ray acceleration at SNR shock fronts. All positional coincidences between SNRs and unidentified γ-ray sources listed in the Third EGRET Catalog at low Galactic latitudes are discussed on a case by case basis. For several coincidences of particular interest, new CO(J = 1−0) and radio continuum maps are shown, and the mass content of the SNR surroundings is determined. The contribution to the γ-ray flux observed that might come from cosmic ray particles (particularly nuclei) locally accelerated at the SNR shock fronts is evaluated. We discuss the prospects for future research in this field and remark on the possibilities for observations with forthcoming γ-ray instruments. © 2003 Published by Elsevier B.V.

PACS: 95.85.Pw; 98.58.Mj
Keywords: Gamma-rays, observations; Gamma-rays, theory; ISM, supernova remnants; ISM, clouds; Cosmic rays
* Corresponding author. E-mail address: [email protected] (D.F. Torres).
0370-1573/03/$ - see front matter © 2003 Published by Elsevier B.V. doi:10.1016/S0370-1573(03)00201-1
D.F. Torres et al. / Physics Reports 382 (2003) 303 – 380

Contents

1. Introduction
2. Phenomenological model for the hadronic γ-ray emission in SNRs and their environs
3. Relativistic Bremsstrahlung
4. Diffusion of CRs and γ-ray spectral evolution
5. Sample and correlation analysis
6. SNRs coincident with γ-ray sources
7. Pulsars within the EGRET error boxes
8. Variability
9. Observations and data analysis
   9.1. CO data
   9.2. Radio continuum data and diffuse background filtering
10. Case by case analysis
   10.1. γ-ray source 3EG J0542+2610 – SNR G180.0−1.7
   10.2. γ-ray source 3EG J0617+2238 – SNR G189.1+3.0 (IC443)
   10.3. γ-ray source 3EG J0631+0642 and 3EG J0634+0521 – SNR G205.5+0.5 (Monoceros nebula)
   10.4. γ-ray source 3EG J1013−5915 – SNR G284.3−1.8 (MSH 10−53)
   10.5. γ-ray source 3EG J1102−6103 – SNR G290.1−0.8 (MSH 11−61A)/289.7−0.3
   10.6. γ-ray source 3EG J1410−6147 – SNR G312.4−0.4
   10.7. γ-ray source 3EG J1639−4702 – SNR G337.8−0.1/338.1+0.4/338.3+0.0
   10.8. γ-ray source 3EG J1714−3857 – SNR G348.5+0.0/348.5+0.1/347.3−0.5
   10.9. γ-ray source 3EG J1734−3232 – SNR G355.6+0.0
   10.10. Near the Galactic Center: γ-ray source 3EG J1744−3011 – SNR G359.0−0.9/359.1−0.5 and γ-ray source 3EG J1746−2851 – SNR G0.0+0.0/0.3+0.0
   10.11. γ-ray source 3EG J1800−2338 – SNR G6.4−0.1 (W28)
   10.12. γ-ray source 3EG J1824−1514 – SNR G16.8−1.1
   10.13. γ-ray source 3EG J1837−0423 – SNR G27.8+0.6
   10.14. γ-ray source 3EG J1856+0114 – SNR G34.7−0.4 (W44)
   10.15. γ-ray source 3EG J1903+0550 – SNR G39.2−0.3
   10.16. γ-ray source 3EG J2016+3657 – SNR G74.9+1.2 (CTB 87)
   10.17. γ-ray source 3EG J2020+4017 – SNR G78.2+2.1 (γ-Cygni Nebula, W66)
   10.18. An example beyond |b| > 10: γ-ray source 3EG J0010+7309 – SNR G119.5+10.2 (CTA 1)
11. SNRs discovered by their likely associated high-energy radiation
12. SNRs and their neighborhoods as TeV sources
13. Concluding remarks
Acknowledgements
Appendix A. Reviewing the prospects for the forthcoming GeV satellites
   A.1. INTEGRAL
   A.2. AGILE
   A.3. GLAST
Appendix B. Future TeV telescopes and their look at SNRs – adapted from Petry (2001)
References
1. Introduction

Gamma-ray astronomy has unveiled some of the most exotic and energetic objects in the universe: from supermassive black-holes in distant radio galaxies to radio-quiet pulsars and the still enigmatic γ-ray bursts. However, it has been conspicuously less successful in achieving one of its original goals of shedding light on the sources of Galactic cosmic ray nuclei. In this report we focus on the remnants of galactic supernovae, and their possible association with discrete sources of (> 70 MeV) γ-rays, as seen by the Energetic Gamma-Ray Experiment Telescope (EGRET). In doing so, we attempt to lay a framework in which the long-standing question of the supernova remnant origin of Galactic cosmic rays may be addressed.

The first firm detection of celestial high-energy γ-rays was achieved by Clark, Garmire and Kraushaar using the Orbiting Solar Observatory (OSO-3), when they discovered that the plane of the Galaxy was a source of photons with E > 70 MeV (Clark et al., 1968; Kraushaar et al., 1972). Higher spatial resolution studies made with the SAS-2 satellite, launched in 1972, revealed individual sources of γ-rays from the Vela pulsar (Thompson et al., 1975), and confirmed the high-energy emission from the Crab (Kniffen et al., 1974). The long life of ESA's COS-B satellite (1975–1982) produced another major breakthrough in γ-ray astronomy: for the first time a significant number of γ-ray sources were seen which could not be identified with objects known at other wavelengths (see Bignami and Hermsen, 1983, for a review of COS-B results). Fig. 1 shows the region surveyed by COS-B and the point sources discovered, as reported in the second COS-B Catalog (Hermsen et al., 1981; Swanenburg et al., 1981).

Fig. 1. Plot of source locations (galactic coordinates) in the COS-B catalog. The undashed region of the sky was surveyed by the satellite. Most of the sources had fluxes higher than 1.3 × 10⁻⁶ photons cm⁻² s⁻¹ above 100 MeV. From Bignami and Hermsen (1983).

In 1991, the EGRET telescope was launched onboard the Compton Gamma-Ray Observatory (see Gehrels and Shrader, 2001, for a recent review). The Compton satellite (1991–2000), the heaviest orbital scientific payload at the time of its launch, had three other experiments apart from EGRET. All of them have contributed to our understanding of the γ-ray sky, although we shall particularly focus on EGRET results in this report. The Third EGRET (3EG) Catalog, whose point-like detections are shown in Fig. 2, is now the latest and most complete source of information on high-energy γ-ray sources.
It contains 271 detections with high significance, including 5 pulsars, 1 solar flare, 66 blazar identifications, 1 radio galaxy (Cen A), 1 normal galaxy (LMC), and almost two hundred unidentified sources, ∼ 80 of them located at low galactic latitudes (see Grenier, 2001; Romero, 2001 for recent reviews). The detection of pulsed high-energy emission from some γ-ray sources, on one hand, and the identification of Geminga as a radio quiet pulsar, on the other, have prompted several authors to explore the possibility that all unidentified low-latitude sources contained in earlier versions of the EGRET Catalog (i.e. the Second EGRET, 2EG, Catalog; Thompson et al., 1995, 1996) could
[Figure 2 image: all-sky map in galactic coordinates (longitude +180 to -180, latitude +90 to -90), Third EGRET Catalog, E > 100 MeV. Legend: Active Galactic Nuclei; Unidentified EGRET Sources; Pulsars; LMC; Solar Flare.]
Fig. 2. Plot of source locations (galactic coordinates) in the Third EGRET Catalog. Different populations are marked in different grades of light colours. The size of the dots gives a qualitative idea of the detected flux. From Hartman et al. (1999).
be pulsars as well (excepting a small extragalactic and isotropic component which should be seen through the disc of the Galaxy). In particular, Kaaret and Cottam (1996) used OB associations as pulsar tracers, finding a significant positional correlation with 2EG unidentified sources. A similar study, including SNRs and HII regions (considered as tracers of star forming regions and, hence, of possible pulsar concentrations) has been carried out by Yadigaroglu and Romani (1997), who also concluded that the pulsar hypothesis for the unidentified 2EG sources was consistent with the available information. However, spectral analysis by Merck et al. (1996) and Zhang and Cheng (1998) showed that several 2EG sources were at odds with the pulsar explanation; the spectra of many sources are too different from what is expected from outer or polar gap models of pulsar emission. Time variability in the γ-ray flux of many sources (discussed below) also argued against a unique population behind the unidentified galactic γ-ray sources.

Most likely, the unidentified γ-ray sources at low galactic latitudes are related to several different galactic populations (e.g. Grenier, 1995, 2000; Gehrels et al., 2000; Romero, 2001). Among them there are surely several new γ-ray pulsars (e.g. Kaspi et al., 2000; Zhang et al., 2000; Torres et al., 2001d; Camilo et al., 2001; D'Amico et al., 2001; Mirabal et al., 2000; Mirabal and Halpern, 2001; Halpern et al., 2002). Pulsars remain the only confirmed low-latitude population, since pulsed γ-ray radiation has already been detected for at least six different sources (Thompson et al., 1999; Thompson, 2001).
Other populations might include X-ray transients (Romero et al., 2001), persistent microquasars (Paredes et al., 2000; Grenier, 2001; Kaufman-Bernadó et al., 2002), massive stars with strong stellar winds (Benaglia et al., 2001; Benaglia and Romero, 2003), isolated and magnetized stellar-size black holes (Punsly, 1998a,b; Punsly et al., 2000), and middle-mass black holes (Dermer, 1997). Finally, there is a possibility that some γ-ray sources could be generated
by supernova remnants (SNRs), especially those interacting with, or located close to, molecular clouds (e.g. Montmerle, 1979; Dorfi, 1991, 2000; Aharonian et al., 1994; Naito and Takahara, 1994; Combi and Romero, 1995; Aharonian and Atoyan, 1996; Sturner et al., 1996; Esposito et al., 1996; Combi et al., 1998, 2001; Butt et al., 2001). This review is devoted to discussing this latter possibility in light of recent observations.

SNRs are thought to be the main source of both cosmic ray (CR) ions and electrons with energies below the knee in the galactic CR spectrum, at ∼ $10^{15}$ eV (however, see Plaga (2002) for alternate theories). The particle acceleration mechanism in individual SNRs is usually assumed to be diffusive shock acceleration, which naturally leads to a power-law population of relativistic particles. In the standard version of this mechanism (e.g. Bell, 1978), particles are scattered by magnetohydrodynamic waves repeatedly through the shock front. Electrons suffer synchrotron losses, producing the non-thermal emission from radio to X-rays usually seen in shell-type SNRs. The maximum energy achieved depends on the shock speed and age as well as on any competing loss processes. In young SNRs, electrons can easily reach energies in excess of 1 TeV, where they produce X-rays by the synchrotron mechanism (see, for example, Reynolds, 1996, 1998). Non-thermal X-ray emission associated with shock acceleration has been clearly observed in at least 11 SNRs, and this number seems to be steadily increasing with time. In the case of the very nearby remnant RX J0852.0-4622 (also known as Vela Jr.) the discovery was originally made at X-rays (Aschenbach, 1998) and only then was the source detected at radio wavelengths (Combi et al., 1999).

As early as 1979, Montmerle suggested that SNRs within OB stellar associations, i.e. star forming regions with plenty of molecular gas, could generate observable γ-ray sources.
Montmerle himself provided statistical evidence for a correlation between COS-B sources and OB associations. Pollock (1985) presented further analysis of some COS-B sources in the same vein. Statistical correlation studies of EGRET sources and SNRs have been presented by Sturner and Dermer (1995), Sturner et al. (1996), Yadigaroglu and Romani (1997), and Romero et al. (1999a). These studies show that there is a high-confidence correlation between remnants and γ-ray sources. Fig. 3 shows the
[Figure 3 image: sky map in galactic coordinates. Marked sources: SN1006, RXJ1713-39, Cas A, γ Cyg, W44, W28, Mon, IC443.]
Fig. 3. Distribution of the Green's SNRs (circles) together with EGRET unidentified sources (diamonds), shown in galactic coordinates. Some of the coincident pairs that are studied in this report are marked. The top marks are SNRs for which TeV radiation has been detected. From Mori (2001).
distribution of the SNRs in Green's Catalog (2000) along with the 3EG unidentified sources. Some of the coincident pairs that are studied in this report are marked. It is worth noticing, however, that the EGRET observations may contain important information about the hadronic component of cosmic rays concerning only the low-energy domain (typically less than several tens of GeV). Hence, these observations alone cannot solve the problem of galactic cosmic rays. The latter effort requires TeV observations, and that is why we also discuss future TeV observations of EGRET sources in this review. Note, in passing, that whereas an all-sky EGRET map is available, only pointed TeV observations are possible.

SNRs can produce high-energy γ-rays through nucleus–nucleus interactions leading to π⁰ production and subsequent γ-decays. The resulting γ-ray luminosity will depend on the local enhancement of the CR energy density as well as on the density of the ambient media. However, as shown below in Eq. (10), the expected fluxes of π⁰-decay γ-rays from a SNR are generally well below the EGRET sensitivity (quite importantly, the estimate given by Eq. (10) is almost independent of the proton spectrum). Thus, any detection of γ-ray flux from a supernova remnant neighborhood would imply a significant enhancement of γ-ray production. Such an amplification would be possible only through the interaction of CRs accelerated by the shell of the SNR with a nearby high density environment, e.g. a dense molecular cloud.

GeV γ-rays (and also TeV photons, see e.g. Pohl, 1996) can also be produced by electrons through relativistic Bremsstrahlung and inverse Compton upscattering of cosmic microwave background photons, diffuse Galactic infrared/optical radiation, and/or the radiation field of the remnant itself (e.g. Mastichiadis, 1996; De Jager and Mastichiadis, 1997). Gaisser et al. (1998) modeled these processes in detail in order to fit the observational data for the SNRs IC 443 and γ-Cygni.
Sturner et al. (1997) and Baring et al. (1999) also modeled IC 443 with synchrotron emission in the radio band and relativistic Bremsstrahlung in γ-rays. De Jager and Mastichiadis (1997) included inverse Compton scattering in their model of SNR W44. Bykov et al. (2000) have recently analyzed the non-thermal emission from a SNR interacting with a molecular cloud, modeling it as a highly inhomogeneous structure consisting of a forward shock of moderate Mach number, a cooling layer, a dense radiative shell, and an interior region filled with hot tenuous plasma. Particularly for SNRs with mixed morphology (remnants which are shell-like in radio and dominated by central emission in X-rays, Rho and Petre, 1998), they found that Bremsstrahlung, synchrotron, and inverse Compton radiation of the relativistic electron population produce multiwavelength photon spectra in quantitative agreement with radio and high-energy observations. These are only some of the works devoted to high-energy emission from SNRs published in recent years. Differentiating the γ-ray emission produced by ions from that originating in leptons is crucial for determining the origin of cosmic-ray nuclei (for some recent reviews and more references the reader is referred to Völk, 2001, 2002; Drury et al., 2001; Kirk and Dendy, 2001).

After introducing a simple theoretical model for evaluating the possible hadronic γ-ray emission from SNRs and nearby clouds, we characterize the sample to be investigated, discuss the γ-ray flux variability of the sources, and study the possibility that pulsars might be counterparts. For each SNR-EGRET source pair we review and analyze the different scenarios proposed as an explanation of the γ-ray emission. We present CO(J = 1-0) mm wavelength observations to evaluate whether there are molecular clouds in the vicinity of the SNRs and estimate the γ-ray flux that would be produced in each case via π⁰-decays. Some new radio continuum maps are also presented.
These latter maps have been processed to eliminate, as far as possible, the galactic contaminating diffuse emission.
Fig. 4. Expected distribution of source locations (Galactic coordinates) for one year survey of the LAT experiment onboard GLAST. Courtesy of the GLAST Science Team and NASA.
Our aim with this review is to provide a quantitative basis to analyze the possible γ-ray production in SNRs, providing the reader with useful information to guide future studies. Specifically, the role of the INTEGRAL, AGILE, and GLAST satellites, and of the Čerenkov telescopes HESS, VERITAS, MAGIC, and CANGAROO III is discussed. Several target candidates for observations with all these telescopes and satellites are mentioned. As an example, in the GeV band, the γ-ray large area telescope (GLAST), which will be launched in a few years, is expected to detect ∼ $10^{4}$ high-energy γ-ray sources, thousands of them belonging to our Galaxy (see Fig. 4). A technical Appendix quotes the main features of GLAST, as well as its predecessors, AGILE and INTEGRAL, for quick reference.

It is also important to clarify what this review is not about. In recent years, many authors have studied the evolution of SNRs or their emission properties by means of very sophisticated numerical modeling. We shall not particularly focus on those, except briefly when dealing with the case by case analysis for specific SNR-EGRET source pairs.

The rest of this work is organized as follows. In the next section we introduce a simple model to account for the hadronic γ-ray emission in SNRs and their neighborhoods. Section 3 refers to Relativistic Bremsstrahlung as a competing process. Section 4 analyzes the spectral changes that diffusion could produce on the observed γ-ray spectrum. The general characteristics of the SNR sample that we shall analyze and the possible pulsar counterparts are presented in Sections 5–7. The variability in the γ-ray emission for the Third EGRET sources under study is assessed in Section 8. Section 9 gives a brief account of the observations and data extraction techniques used in this review. A case by case analysis of all coincident pairs between SNRs and γ-ray sources is given in Section 10.
Some particular cases in which SNRs were discovered by their high energy emission are discussed in Section 11. The TeV-emission properties and the prospects for new observations using new TeV-telescopes are discussed in Section 12. Finally, Section 13 presents a very brief overview and some concluding remarks.
2. Phenomenological model for the hadronic γ-ray emission in SNRs and their environs

We first present a simple model for the hadronic γ-ray emission from "bare" SNRs, and those interacting with molecular clouds. Further details can be obtained from Dorfi (1991, 2000), Drury et al. (1994), Aharonian et al. (1994), and Aharonian and Atoyan (1996). We shall partially follow Morfill et al. (1984) and Combi and Romero (1995) in our presentation. Our intention is not to arrive at the most precise theoretical model for an individual SNR, but rather to have a simple, straightforward and robust, albeit crude, method of obtaining and inter-comparing γ-ray fluxes due to nucleus–nucleus interactions in interacting SNRs.

Let us consider the expansion of a SNR in a homogeneous medium. If this expansion is adiabatic, we can use Sedov's solutions (Sedov, 1959), which give the time since the explosion and the velocity of the shock front, respectively, as

$$t \sim 1.5 \times 10^{3}\, n_{-1}^{1/2}\, E_{51}^{-1/2}\, R_{1}^{5/2}\ \mathrm{yr}, \tag{1}$$

$$v_s \sim 21.6 \times 10^{2}\, n_{-1}^{-1/2}\, E_{51}^{1/2}\, R_{1}^{-3/2}\ \mathrm{km\,s^{-1}}. \tag{2}$$

Here, $E_{51}$ is the energy of the SN explosion in units of $10^{51}$ erg, $R_1$ is the SNR radius in units of 10 pc, and $n_{-1}$ is the medium density in units of 0.1 cm$^{-3}$. The CR energy per time unit incoming to the SNR is

$$\dot{E}(t) = k_s\, j_{CR}\, 4\pi R_s(t)^2\, v_s(t), \tag{3}$$

where the dot means derivative with respect to time, $R_s$ is the SNR time-dependent radius, $j_{CR}$ is the background CR ambient density (∼ 1 eV cm$^{-3}$ in the solar neighborhood), and $k_s$ is the enhancement factor due to re-acceleration by the Fermi mechanism at the shock front (see Jones, 2001 for a recent review and references on acceleration details). If we assume equipartition, i.e. that the energy flux from the unshocked medium is converted in equal parts into electromagnetic energy, thermal energy, and CR enhancement (Morfill et al., 1984), we can write the previous expression as

$$\dot{E} \sim \frac{4\pi}{3}\,(1-\zeta)\,\frac{1}{2}\,\mu n\, v_s^{3}\, R_s^{2}, \tag{4}$$

where $\zeta$ is the downstream to upstream ratio of kinetic energy flux in the shock frame ($\zeta \sim 0.06$ for strong shocks), $\mu$ is the mean molecular weight, and $n$ is the unshocked particle density. This expression states that the power available for accelerating CRs is 1/3 of the mechanical energy flux across the shock. When the SNR expands from a radius comprising a volume $V(t_1)$ to a volume $V(t_2)$, the energy decreases accordingly as

$$\frac{E(t_2)}{E(t_1)} = \left[\frac{V(t_1)}{V(t_2)}\right]^{\gamma-1}, \tag{5}$$

with the adiabatic index being $\gamma = 4/3$. Using the previous equations, the CR energy in the SNR between times $t_1$ and $t_2$ is

$$E_{CR}(t_1, t_2) = \frac{2\pi \mu n (1-\zeta)}{3 R_s(t_2)^{3(\gamma-1)}} \int_{t_1}^{t_2} dt\; v_s^{3}\, R_s(t)^{3(\gamma-1)}. \tag{6}$$
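The Sedov scalings of Eqs. (1) and (2) are simple enough to evaluate directly. The following is a minimal sketch in Python, with the numerical prefactors copied from those two equations; the function and argument names are illustrative choices made here and are not part of the original text.

```python
def sedov_age_yr(n_cm3, e_sn_erg, radius_pc):
    """Time since explosion from Eq. (1), in years."""
    n_m1 = n_cm3 / 0.1        # density in units of 0.1 cm^-3
    e51 = e_sn_erg / 1e51     # explosion energy in units of 1e51 erg
    r1 = radius_pc / 10.0     # radius in units of 10 pc
    return 1.5e3 * n_m1**0.5 * e51**-0.5 * r1**2.5


def sedov_shock_speed_kms(n_cm3, e_sn_erg, radius_pc):
    """Shock-front velocity from Eq. (2), in km/s."""
    n_m1 = n_cm3 / 0.1
    e51 = e_sn_erg / 1e51
    r1 = radius_pc / 10.0
    return 21.6e2 * n_m1**-0.5 * e51**0.5 * r1**-1.5


if __name__ == "__main__":
    # Reference case: n = 0.1 cm^-3, E_SN = 1e51 erg, R = 10 pc
    print(sedov_age_yr(0.1, 1e51, 10.0))
    print(sedov_shock_speed_kms(0.1, 1e51, 10.0))
```

For the reference values all scaled quantities equal one, so the functions simply return the prefactors of Eqs. (1) and (2).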
Through the Sedov solutions, this leads to a ratio

$$\theta = \frac{E_{CR}}{E_{SN}} \sim \frac{1}{5}\left[1 - \left(\frac{t_1}{t_2}\right)^{2/5}\right]. \tag{7}$$
We shall adopt $t_2$ as the actual age of the SNR, estimated from observational data and the Sedov solutions, and we shall assume the initial time $t_1$ as that obtained when the SNR has swept up about 5 $M_\odot$ of interstellar material, thus starting the Sedov phase (Lozinskaya, 1992, pp. 205ff). The radius at which this happens is $R_s = (3 \cdot 5 M_\odot / 4\pi m_H n_0)^{1/3}$, and typical values for $t_1$ are in the range 200–2000 years.

In simplified models of SNRs, the remnant is divided into three regions: an interior region filled with hot gas and accelerated particles but very little mass, an immediate post shock region where most of the matter is concentrated, and a shock precursor region where the accelerated particles diffusing ahead of the shock affect the ambient medium. Following Drury et al. (1994) the production rate of γ-rays per unit volume can be written as

$$Q_\gamma = E_\gamma\, n = q_\gamma\, n\, E_{CR}, \tag{8}$$

where $n$ is the number density of the gas, $E_{CR}$ is the CR energy density, and $q_\gamma$ is the γ-ray emissivity normalized to the CR energy density, $q_\gamma = E_\gamma / E_{CR}$. The total γ-ray luminosity is given by $\int q_\gamma\, n\, E_{CR}\, d^3r$, which can be written as $q_\gamma (M_1 E_{CR1} + M_2 E_{CR2})$, where $M_1$ is the total mass in the precursor region, $M_2$ that in the immediate post-shock region, and $E_{CR1,2}$ are the corresponding CR energy densities. Since particle diffusion occurs across the shock front, we have $E_{CR1} = E_{CR2}$. This value is also probably not very different from $E_{CR3}$, the energy density in the interior of the remnant, for two reasons. First, there is diffusive coupling between the acceleration region around the shock and the interior of the remnant (Drury et al., 1994). Second, if the acceleration is efficient, CRs provide a substantial, if not the dominant, part of the interior pressure and the interior of the remnant has, for dynamical reasons, to be in pressure equilibrium. It follows that, to order of magnitude, the CR energy density throughout the remnant and in the shock precursor can be taken as $E_{CR1} = E_{CR2} \approx E_{CR3} \approx 3\theta E_{SN} / 4\pi R^3$, where $\theta$ is, again, the fraction of the total supernova explosion energy, $E_{SN}$, converted to CR energy and $R$ is the remnant radius. Thus, the γ-ray luminosity results

$$L_\gamma = q_\gamma (M_1 + M_2)\,\frac{3\theta E_{SN}}{4\pi R^3} \approx \theta\, q_\gamma\, E_{SN}\, n \approx 10^{38}\,\theta \left(\frac{E_{SN}}{10^{51}\ \mathrm{erg}}\right) \left(\frac{n}{1\ \mathrm{cm^{-3}}}\right)\ \mathrm{ph\,s^{-1}}, \tag{9}$$

where $n$ is the ambient density. The exact value of $\theta$ depends on the details of the model, for which Eq. (7) gives an example. For different plausible injection models, $\theta$ is roughly constant throughout the Sedov phase with only a moderate dependence on external parameters such as the ambient density (Markiewicz et al., 1990). If the SNR is located at a distance $d$, the hadronic γ-ray flux is

$$F(>100\ \mathrm{MeV})_{\rm SNR} \sim 4.4 \times 10^{-7}\,\theta \left(\frac{E_{SN}}{10^{51}\ \mathrm{erg}}\right) \left(\frac{d}{\mathrm{kpc}}\right)^{-2} \left(\frac{n}{\mathrm{cm^{-3}}}\right)\ \mathrm{ph\,cm^{-2}\,s^{-1}}, \tag{10}$$

where $d$ is the distance to the remnant. Only for very high densities can the usually observed γ-ray sources be due to the remnant itself.
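Two of the estimates above lend themselves to a quick numerical check: the radius at which the remnant has swept up about 5 solar masses (the adopted onset of the Sedov phase) and the bare-remnant flux of Eq. (10). The sketch below is only illustrative. The physical constants are standard rounded values supplied here, the function names are hypothetical, and the efficiency factor of Eq. (10) is passed in as the argument `theta`, which is a naming assumption made for this example.

```python
import math

M_SUN_G = 1.989e33    # solar mass in grams (rounded)
M_H_G = 1.6735e-24    # hydrogen atom mass in grams (rounded)
PC_CM = 3.0857e18     # one parsec in centimetres (rounded)


def sedov_onset_radius_pc(n0_cm3, swept_mass_msun=5.0):
    """Radius (pc) at which the swept-up mass equals swept_mass_msun."""
    mass_g = swept_mass_msun * M_SUN_G
    radius_cm = (3.0 * mass_g / (4.0 * math.pi * M_H_G * n0_cm3)) ** (1.0 / 3.0)
    return radius_cm / PC_CM


def snr_flux_above_100mev(theta, e_sn_erg, distance_kpc, n_cm3):
    """Hadronic flux of a bare SNR from Eq. (10), in ph cm^-2 s^-1."""
    return 4.4e-7 * theta * (e_sn_erg / 1e51) * distance_kpc**-2 * n_cm3


if __name__ == "__main__":
    print(sedov_onset_radius_pc(0.1))                  # onset radius for n0 = 0.1 cm^-3
    print(snr_flux_above_100mev(0.1, 1e51, 1.0, 1.0))  # flux for an assumed theta = 0.1
```

The value `theta = 0.1` in the usage lines is an arbitrary illustrative input, not a number taken from the text.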
In general, the flux provided by the previous equation is far too low to produce a detectable EGRET source (Drury et al., 1994), at typical galactic distances. However, stronger emission can be produced if there are molecular clouds in the vicinity of the SNR where the locally accelerated protons interact with target ions, producing pions and hence
enhancing the γ-ray flux (e.g. Montmerle, 1979; Dorfi, 1991, 2000; Aharonian et al., 1994). The expected total flux is

$$F_\gamma = \frac{1}{4\pi d^2} \int_{V_0} n(\vec{r})\, q_\gamma(\vec{r})\, d^3r. \tag{11}$$

Neglecting all possible gradients within the cloud, this equation reduces to

$$F_\gamma = \frac{M_{cl}}{m_p}\,\frac{q_\gamma}{4\pi d^2}, \tag{12}$$

where $M_{cl}$ is the mass of the cloud. In particular, we may write

$$F(>100\ \mathrm{MeV})_{\rm cloud} \sim 10^{-9}\, M_3 \left(\frac{d}{\mathrm{kpc}}\right)^{-2} q_\gamma(>100\ \mathrm{MeV})\ \mathrm{ph\,cm^{-2}\,s^{-1}}, \tag{13}$$

where $M_3$ is the mass of the target cloud in units of $10^{3}\,M_\odot$, and $q_\gamma$ is the γ-emissivity in units of $10^{-25}$ s$^{-1}$ (H-atom)$^{-1}$. The factor $q_\gamma$ will be enhanced in comparison with its normal value because of the local CR source. In a passive giant molecular cloud exposed to the same proton flux measured at the Earth, the γ-ray emissivity above 100 MeV is equal to $1.53\,\eta\,q_{-25}(>100\ \mathrm{MeV})$ (H-atom)$^{-1}$ s$^{-1}$, where the parameter $\eta \approx 1.5$ takes into account the contribution of nuclei both in CRs and in the interstellar medium (Dermer, 1986; Aharonian, 2001). In clouds near CR accelerators, it may be much higher than this value. If the shape of the CR spectrum in the cloud does not differ much from that existing near the Earth, we can approximate

$$\frac{q_\gamma}{q_{\gamma,0}} \sim \frac{j_{CR}}{j_{CR,0}} \sim k_s. \tag{14}$$

Following Morfill and Tenorio-Tagle (1983), we can use Eq. (6) to obtain

$$k_s = \frac{3 E_{SN}}{20\pi R_s(t_2)^{3}\, j_{CR}} \left[1 - \frac{R_s(t_1)}{R_s(t_2)}\right]. \tag{15}$$
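The cloud estimate of Eq. (13) and the energetic consistency test mentioned in the text can be sketched as follows. This is a hedged illustration rather than the authors' procedure. It assumes that the enhanced emissivity, in units of 1e-25 s^-1 (H-atom)^-1, is the local value 1.53 times the nuclear factor (about 1.5) times the enhancement `k_s`, following Eq. (14). It also reads the consistency condition as "CR energy density times remnant volume must stay below the explosion energy". The enhancement `k_s` is taken as an input, and all function names and rounded constants are choices made here.

```python
import math

PC_CM = 3.0857e18    # one parsec in centimetres (rounded)
EV_ERG = 1.602e-12   # one electron-volt in erg (rounded)


def cloud_flux_above_100mev(cloud_mass_msun, distance_kpc, k_s, eta=1.5):
    """Flux from a cloud near a CR accelerator, Eq. (13), in ph cm^-2 s^-1.

    Assumption: emissivity in 1e-25 units = 1.53 * eta * k_s (Eq. (14)).
    """
    m3 = cloud_mass_msun / 1e3
    q_25 = 1.53 * eta * k_s
    return 1e-9 * m3 * distance_kpc**-2 * q_25


def passes_energy_check(k_s, radius_pc, e_sn_erg=1e51, j_cr_ev_cm3=1.0):
    """True if k_s * j_CR * (4/3) pi R^3 is below the explosion energy."""
    volume_cm3 = (4.0 / 3.0) * math.pi * (radius_pc * PC_CM) ** 3
    e_cr_erg = k_s * j_cr_ev_cm3 * EV_ERG * volume_cm3
    return e_cr_erg < e_sn_erg


if __name__ == "__main__":
    print(cloud_flux_above_100mev(1e3, 1.0, k_s=1.0))  # passive cloud, 1e3 Msun at 1 kpc
    print(passes_energy_check(k_s=100.0, radius_pc=10.0))
```

If the check fails for a given remnant, the text's conclusion applies: one or more simplifying assumptions of the model do not hold for that case.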
The Sedov solutions for each of the SNRs considered below should be used in this latter expression to obtain $k_s$, and thus $F(E > 100\ \mathrm{MeV})_{\rm cloud}$. One immediate test of energetic consistency is to check that $E_{CR} = k_s\, j_{CR}\,(4/3)\pi R^3 < E_{SN}$, for the obtained value of $k_s$ and the assumed value of $E_{SN}$. If the previous inequality is not valid, one or more of the simplifying assumptions of the model are not correct for the particular case under analysis.

The expected γ-ray flux in the TeV region by a SNR is (Aharonian et al., 1994)

$$F_\gamma(>E) = f_\Gamma\, 10^{-10} \left(\frac{E}{\mathrm{TeV}}\right)^{-\Gamma+1} A\ \mathrm{cm^{-2}\,s^{-1}}, \tag{16}$$

where the factor $A$ is

$$A = \theta \left(\frac{E_{SN}}{10^{51}\ \mathrm{erg}}\right) \left(\frac{d}{\mathrm{kpc}}\right)^{-2} \left(\frac{n}{\mathrm{cm^{-3}}}\right)\ \mathrm{ph\,cm^{-2}\,s^{-1}}, \tag{17}$$

$n$ is the medium density, and $f_\Gamma$ is a function of the index $\Gamma$ in the differential power-law proton spectrum, equal to 0.9, 0.43, and 0.19 for $\Gamma$ = 2.1, 2.2, and 2.3, respectively. This estimate, however, usually exceeds that obtained when the GeV spectral index is extrapolated up to TeV energies.
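As a rough illustration of Eqs. (16) and (17), the sketch below tabulates the three quoted values of the spectral function and evaluates the integral TeV flux. It assumes that the factor A carries the same efficiency factor as Eq. (10), here called `theta`. The names are hypothetical, and only the three tabulated proton indices are supported because the text gives no interpolation rule.

```python
# Values of the spectral function quoted in the text for three proton indices
F_GAMMA = {2.1: 0.9, 2.2: 0.43, 2.3: 0.19}


def a_factor(theta, e_sn_erg, distance_kpc, n_cm3):
    """Factor A of Eq. (17); theta is assumed to be the efficiency of Eq. (10)."""
    return theta * (e_sn_erg / 1e51) * distance_kpc**-2 * n_cm3


def tev_flux(e_tev, proton_index, a):
    """Integral flux above e_tev (TeV) from Eq. (16), in cm^-2 s^-1."""
    if proton_index not in F_GAMMA:
        raise ValueError("only indices 2.1, 2.2 and 2.3 are tabulated")
    return F_GAMMA[proton_index] * 1e-10 * e_tev ** (-proton_index + 1.0) * a


if __name__ == "__main__":
    a = a_factor(theta=0.1, e_sn_erg=1e51, distance_kpc=2.0, n_cm3=1.0)
    for index in (2.1, 2.2, 2.3):
        print(index, tev_flux(1.0, index, a))
```

The inputs in the usage lines (`theta = 0.1`, a distance of 2 kpc) are arbitrary example values.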
When that is the case, the extrapolated flux (with the same spectral index) will be considered a safer estimate. To extrapolate the GeV flux up to TeV energies we assume (Thompson et al., 1996)

$$\frac{dN}{dE} = K \left(\frac{E}{E_0}\right)^{-\Gamma}, \tag{18}$$

where $K$ is a constant, $E$ is given in MeV, and $E_0$ is a reference energy. This constant can be obtained simply by integrating the flux,

$$K \equiv F_{ph} \left[\int_{100\ \mathrm{MeV}}^{10\ \mathrm{GeV}} \left(\frac{E}{E_0}\right)^{-\Gamma} dE\right]^{-1}, \tag{19}$$

where $F_{ph}$ is the observed total flux (that quoted in the 3EG Catalog, for instance). Once $K$ is known, the flux in any given energy interval $E_1$–$E_2$ is just

$$F(E_1, E_2) = K \int_{E_1}^{E_2} \left(\frac{E}{E_0}\right)^{(-\Gamma+1)} dE. \tag{20}$$

We are extrapolating, then, the measured spectral index at MeV–GeV energies assuming that there is no spectral change at higher energies. Actually, this assumption is a simplification not compatible with TeV observations of several sources (see below). In any case, this extrapolation will always provide an upper bound to the high-energy photon flux.

3. Relativistic Bremsstrahlung

Since relativistic electron Bremsstrahlung and nucleus–nucleus induced pion-decay are competing processes in the generation of γ-rays from molecular clouds exposed to a nearby CR accelerator, it is necessary to assess the relative weight of each contribution if we are to quantitatively address the question of the possible SNR origin of nucleonic CRs. The γ-ray emissivity at a given energy $E$ from relativistic Bremsstrahlung is (e.g. Longair, 1994, p. 267–269)

$$q_B(E) = \frac{10^{-21}}{p-1} \left(\frac{n}{\mathrm{m^{-3}}}\right) K E^{-p}\ \mathrm{m^{-3}\,s^{-1}\,GeV^{-1}}, \tag{21}$$
where an electron power-law distribution, $N_e(E) = K E^{-p}$, is assumed. We are interested in the γ-ray radiation above 100 MeV, so we integrate the previous expression to obtain

$$q_B(E > 0.1\ \mathrm{GeV}) = \frac{10^{-21}}{p-1} \left(\frac{n}{\mathrm{m^{-3}}}\right) K \int_{0.1\ \mathrm{GeV}}^{\infty} E^{-p}\, dE\ \ \mathrm{m^{-3}\,s^{-1}} = \frac{10^{-21}}{(p-1)^{2}} \left(\frac{n}{\mathrm{m^{-3}}}\right) K\; 0.1^{-(p-1)}\ \mathrm{m^{-3}\,s^{-1}}. \tag{22}$$

This same population of relativistic electrons will also radiate at radio wavelengths, via the synchrotron mechanism. The synchrotron spectrum of a power-law electron energy distribution is (Longair, 1994, p. 261)

$$J(\nu) = 23.44\, a(p)\, B^{(p+1)/2} \left(\frac{1.253 \times 10^{37}}{\nu}\right)^{(p-1)/2} K\ \ \mathrm{Jy\,m^{-1}}, \tag{23}$$

where $B$ is the magnetic field measured in Tesla, and

$$a(p) = \frac{\sqrt{\pi}}{2}\, \frac{\Gamma\!\left(\frac{p}{4} + \frac{19}{12}\right) \Gamma\!\left(\frac{p}{4} - \frac{1}{12}\right) \Gamma\!\left(\frac{p}{4} + \frac{5}{4}\right)}{(p+1)\, \Gamma\!\left(\frac{p}{4} + \frac{7}{4}\right)} \tag{24}$$

is a numerical coefficient depending on the spectral index. Then, the ratio between the γ-ray flux emitted by relativistic Bremsstrahlung and the synchrotron emission results

$$R = \frac{q_B(E > 100\ \mathrm{MeV})}{J(\nu)} = \frac{F(E > 100\ \mathrm{MeV})}{F_\nu\,[\mathrm{Jy}]} = \frac{4.3 \times 10^{-21}}{b(p)}\; n_{\mathrm{cm^{-3}}}\; B_{\mu\mathrm{G}}^{-(1+p)/2}\; \nu_{\mathrm{Hz}}^{(p-1)/2}\ \ \mathrm{Jy^{-1}\,cm^{-2}\,s^{-1}}, \tag{25}$$

where we have defined

$$b(p) = 10^{-5(1+p)}\, (3.2 \times 10^{15})^{(p-1)/2}\, (p-1)^{2}\, a(p), \tag{26}$$
and converted units to the cgs system. If $F(E > 100\ \mathrm{MeV})$ is known, estimating the right hand side of Eq. (25) for the measured spectral photon index and the derived density and magnetic field, the expected value of $F_\nu\,[\mathrm{Jy}]$ can be obtained. This is the radio emission that should be observed if the γ-rays are from relativistic Bremsstrahlung. If the γ-ray source is not superposed with the bulk of the synchrotron radio/X-ray emission from the SNR, this tends to favor a nucleus–nucleus origin of the high-energy flux, rather than an electron Bremsstrahlung scenario.

In general, at the high densities found in molecular clouds, inverse Compton scattering can be ruled out as the main mechanism contributing to the γ-ray emission (e.g. Gaisser et al., 1998). Above 100 MeV, the relevant cross sections and estimates of the electron-nucleon density ratio show that relativistic Bremsstrahlung dominates over inverse Compton processes (see, for instance, Stecker, 1977, Figs. 1 and 2). In particular, De Jager and Mastichiadis (1997) have shown that for molecular densities above 10 cm$^{-3}$, γ-ray fluxes above 70 MeV are dominated by Bremsstrahlung when electrons are considered (see their Fig. 4). In what follows, since we shall mainly consider high-density scenarios, relativistic Bremsstrahlung will be the main alternative to nucleonic interactions in evaluating the origin of the observed γ-ray flux in the MeV–TeV energy range.

4. Diffusion of CRs and γ-ray spectral evolution

The spectrum of γ-rays generated through π⁰-decay at a source of proton density $n_p$ is

$$F_\gamma(E_\gamma) = 2 \int_{E_\pi^{\min}}^{\infty} \frac{F_\pi(E_\pi)}{\sqrt{E_\pi^{2} - m_\pi^{2}}}\, dE_\pi, \tag{27}$$

where

$$E_\pi^{\min}(E_\gamma) = E_\gamma + \frac{m_\pi^{2}}{4 E_\gamma}, \tag{28}$$

and

$$F_\pi(E_\pi) = 4\pi n_p \int_{E_p^{\min}}^{E_p^{\max}} J_p(E_p)\, \frac{d\sigma_\pi(E_\pi, E_p)}{dE_\pi}\, dE_p. \tag{29}$$
Here, $d\sigma_\pi(E_\pi, E_p)/dE_\pi$ is the differential cross-section for the production of π⁰-mesons of energy $E_\pi$ by a proton of energy $E_p$ in a p–p collision. If the proton spectrum $J_p(E_p)$ at the γ-ray production site is

$$J_p(E_p) = K E_p^{-\Gamma}, \tag{30}$$

we can also expect a power-law spectrum at γ-rays:

$$F_\gamma(E_\gamma) \propto E_\gamma^{-\Gamma}. \tag{31}$$
However, the spectrum given by Eq. (30) is not necessarily the same proton spectrum at the acceleration site. If there is diffusion, we shall have, instead,

$$J_p(E_p, r, t) = \frac{c}{4\pi}\, f, \tag{32}$$

where $f(E_p, r, t)$ is the distribution function of protons at an instant $t$ and distance $r$ from the source. The distribution function satisfies the well-known diffusion equation (Ginzburg and Syrovatskii, 1964):

$$\frac{\partial f}{\partial t} = \frac{D(E_p)}{r^{2}}\, \frac{\partial}{\partial r}\!\left(r^{2} \frac{\partial f}{\partial r}\right) + \frac{\partial}{\partial E_p}(P f) + Q, \tag{33}$$

where $P = -dE_p/dt$ is the continuous energy loss rate of the particles, $Q = Q(E_p, r, t)$ is the source function, and $D(E_p)$ is the diffusion coefficient, for which we assume here no dependency on $r$ or $t$, i.e. the particles diffuse through a homogeneous, quasi-stationary medium. We assume that $D(E_p) \propto E_p^{\delta}$ and $f \propto E_p^{-\Gamma}$ with continuous injection given by $Q(E_p, t) = Q_0 E_p^{-\Gamma} q(t)$, which is appropriate for a supernova remnant (Aharonian and Atoyan, 1996). Further simplicity can be achieved assuming that the source is constant after turning on at some instant, i.e. $q(t) = 0$ for $t < 0$ and $q(t) = 1$ for $t > 0$. Atoyan et al. (1995) have found a general solution for Eq. (33) with arbitrary injection spectrum, which with the listed assumptions leads to

$$f(E_p, r, t) = \frac{Q_0 E_p^{-\Gamma}}{4\pi D(E_p)\, r}\; \frac{2}{\sqrt{\pi}} \int_{r/R_{\rm diff}}^{\infty} e^{-x^{2}}\, dx. \tag{34}$$

In this expression, $R_{\rm diff} = R_{\rm diff}(E_p, t)$ is the diffusion radius, which corresponds to the radius of the sphere up to which the particles of energy $E_p$ propagate during the time $t$ after the injection. Now, for $D(E_p) = a D_{28} E_p^{\delta}$, where $D_{28} = D/10^{28}$ cm$^{2}$ s$^{-1}$, and for $R_{\rm diff} \gg r$, i.e. when the target is well immersed in the cosmic ray flux, Eq. (34) reduces to

$$f(E_p, r) = \frac{Q_0 E_p^{-(\Gamma+\delta)}}{4\pi a D_{28}\, r}, \tag{35}$$

and then, from Eq. (32), we get

$$J_p(E_p, r) = \frac{c\, Q_0 E_p^{-(\Gamma+\delta)}}{(4\pi)^{2}\, a D_{28}\, r}. \tag{36}$$
Hence, as has been emphasized by Aharonian and Atoyan (1996), the observed γ-ray flux $F_\gamma(E_\gamma) \propto E_\gamma^{-(\Gamma+\delta)}$ can have a significantly different spectrum from that expected from the particle population
Fig. 5. Temporal and spectral evolution of CR fluxes at different (10, 30 and 100 pc) distances from an impulsive proton accelerator. A power-law proton spectrum with $\Gamma = 2.2$ and total energy $W_p = 10^{50}$ erg are assumed. Curves 1, 2, 3, and 4 correspond to an age of the source of $t = 10^{3}$ yr, $10^{4}$ yr, $10^{5}$ yr, and $10^{6}$ yr, respectively. An energy-dependent diffusion coefficient $D(E)$ with power-law index $\delta = 0.5$ is adopted. The left panel presents results for $D = D_{28}$ cm$^{2}$ s$^{-1}$, whereas the right panel corresponds to $D = 10^{-2} D_{28}$ cm$^{2}$ s$^{-1}$. The hatched curve shows the local (directly measured) flux of CR protons. More details are given in Aharonian and Atoyan (1996).
at the source (the SNR). Standard diffusion coefficients $\delta \sim 0.3$–$0.6$ can explain γ-ray spectra as steep as $\Gamma \sim 2.3$–$2.6$ in sources with particles accelerated to a power-law $J_p(E_p) \propto E^{-2}$, if the targets in which the π⁰-decays are produced are at a sufficient distance from the accelerator. This can explain observed discrepancies in the particle spectral indices inferred from SNRs at different frequencies, even if all particles, leptons and hadrons, are accelerated to the same power-law in the source.

CRs with total energy $W_p$ and injected in the interstellar medium by some local accelerator reach a radius $R(t)$ at instant $t$. Their mean energy density is $w_p \approx 0.5\,(W_p/10^{50}\ \mathrm{erg})(R/100\ \mathrm{pc})^{-3}$ eV cm$^{-3}$. Thus, in regions up to 100 pc around a CR accelerator with $W_p \sim 10^{50}$ erg, the density of relativistic particles may significantly exceed the average level of the "sea" of galactic CRs, $w_{GCR} \sim 1$ eV cm$^{-3}$. In Fig. 5, the differential fluxes of protons at distances $R$ = 10, 30, and 100 pc from an "impulsive" accelerator with total energy $W_p = 10^{50}$ erg are shown. The spectrum of CRs at the given time and spatial location can differ significantly from the source spectrum. The diffusion coefficient in
Fig. 6. Gamma-ray emissivities in terms of $E_\gamma^{2} \times q_\gamma(E_\gamma)$ at different times $t$ and distances $R$ from a proton accelerator. The right hand side axis shows the γ-ray fluxes, $E_\gamma^{2} \times F_\gamma(E_\gamma)$, which are expected from a cloud with parameter $M_5/d_{\rm kpc}^{2} = 1$. The thin and bold curves correspond to times $t = 10^{3}$ and $10^{5}$ yr, respectively. Fluxes at distances $R$ = 10 and 30 pc are shown by solid and dashed lines. The power-law index, the total energy of protons, and the diffusion coefficient are the same as in the right panel of Fig. 5. The curve shown by full dots corresponds to the γ-ray emissivity (and flux) calculated for local CR protons. In order to take into account the contribution of nuclei, all curves should be increased by a factor of ≈ 1.5. More details can be found in Aharonian and Atoyan (1996).
this figure is assumed in a power-law form, $D(E) \propto E^{0.5}$ above 10 GeV, and constant below 10 GeV. The commonly adopted value at 10 GeV is about $10^{28}$ cm$^{2}$ s$^{-1}$; however, smaller values, e.g. $10^{26}$ cm$^{2}$ s$^{-1}$, cannot be excluded, especially in active star forming regions (Aharonian, 2001). The existence of massive gas targets like molecular clouds in these regions may result in γ-ray fluxes detectable by EGRET, if $(W_{50} \cdot M_5)/d_{\rm kpc}^{2} \gtrsim 0.1$, where $M_5$ is the mass of the cloud in units of $10^{5}\,M_\odot$ (Aharonian, 2001).

In the case of energy-dependent propagation of CRs, a large variety of γ-ray spectra is then expected, depending on the age of the accelerator, duration of injection, the diffusion coefficient, and the location of the cloud with respect to the accelerator. The comparison of γ-ray fluxes from clouds located at different distances from an accelerator may provide unique information about the CR diffusion coefficient $D(E)$. Similar information may be obtained from a single γ-ray emitting cloud, but in different energy domains. For the energy-dependent propagation of CRs the probability for simultaneous detection of a cloud in GeV and TeV γ-rays is not very high, because the maximum fluxes at these energies are reached at different epochs (see Fig. 6). The higher energy particles propagate faster and reach the cloud earlier, therefore the maximum of GeV γ-radiation appears at the epoch when the maximum of the TeV γ-ray flux is already over. In the case of energy-independent propagation (e.g. due to strong convection) the ratio $F_\gamma(> 100\ \mathrm{MeV})/F_\gamma(> 100\ \mathrm{GeV})$ is independent of time, therefore the clouds that are visible for EGRET at GeV energies would be detectable also at higher energies, provided that the spectral index of the accelerated protons is $\Gamma \leq 2.3$.
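The two scalings quoted above, the mean CR energy density around an accelerator and the approximate EGRET detectability condition for a nearby cloud, can be combined in a few lines. This is a minimal sketch: the function names are hypothetical, and the "greater than or of order 0.1" condition is implemented as a plain threshold comparison, which is a simplification adopted here.

```python
def mean_cr_energy_density_ev_cm3(w_p_erg, radius_pc):
    """w_p ~ 0.5 (W_p / 1e50 erg) (R / 100 pc)^-3, in eV cm^-3."""
    return 0.5 * (w_p_erg / 1e50) * (radius_pc / 100.0) ** -3


def egret_detectable(w_p_erg, cloud_mass_msun, distance_kpc, threshold=0.1):
    """Rough check of (W_50 * M_5) / d_kpc^2 against the quoted threshold."""
    w50 = w_p_erg / 1e50
    m5 = cloud_mass_msun / 1e5
    return (w50 * m5) / distance_kpc**2 >= threshold


if __name__ == "__main__":
    for radius in (10.0, 30.0, 100.0):
        print(radius, mean_cr_energy_density_ev_cm3(1e50, radius))
    print(egret_detectable(1e50, 1e5, 2.0))
```

Comparing the printed densities with the galactic level of about 1 eV cm^-3 quoted in the text shows how quickly the local enhancement falls off with distance from the accelerator.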
Summing up, special care must be taken in analyzing the a priori expectations for the SNR-cloudy medium scenario: when particles diffuse into the ISM before reaching the cloud, a quite different spectrum from that of the injecting source can finally emerge, depending on the distance to the cloud and the diffusion index. Even when molecular clouds are overtaken by the expanding shell of the
SNR and there is clear evidence of interaction, a strong magnetic field could produce differences in the spectra.
5. Sample and correlation analysis

Table 1 shows those 3EG sources that are positionally coincident with SNRs listed in the latest version of Green’s Catalog (2000). From left to right, the columns give the γ-ray source name, the measured flux in the summed EGRET phases P1234 (in units of $10^{-8}$ ph cm$^{-2}$ s$^{-1}$), the photon spectral index Γ, the EGRET class of source (em for possibly extended and C for confused), the variability indices $I$ (as in Torres et al., 2001a, c) and $\tau^{\rm upper\ limit}_{\rm lower\ limit}$ (as in Tompkins, 1999), information about coincidences with radio pulsars (“y” stands for a pulsar within the error box), the SNR identification (including other usual names when available), the angular distance between the centre of the γ-ray source position and the centre of the remnant (in degrees), the size of the remnant (in arcmin), and finally the SNR type T (S for shell-like emission, F for filled-center or plerionic remnant, and C for composite). A separate section below analyzes the possible pulsar associations. All remnants were considered as circles with a radius equal to the major axis of the ellipse that best fits their shape, when such is given in Green (2000). It is interesting to see that many of the 3EG sources involved in the associations are classified as extended, and all of them as confused. Also, it was already noted (Romero et al., 1999a) that in the 3EG catalog not all the positional coincidences with SNRs are SNOBs (SNRs in OB associations), as was the case in the studies by Montmerle (1979) and Yadigaroglu and Romani (1997) using previous samples. The adoption of the 2000 edition (footnote 1) of Green’s Catalog (2000) does not produce any substantial statistical difference with respect to the previous edition of 1998. Only 5 SNRs were added. However, there is a notable particular difference in the case of the EGRET source 3EG J1714−3857, which now coincides with three supernova remnants instead of two.
One of these SNRs (the new one in Green’s catalog, G347.3−0.5) appears to be amongst the strongest cases for SNR shock/γ-ray/nucleonic cosmic-ray source associations known to date (Butt et al., 2001, see below). In addition, 2EG J1801−2312 shifted its position by half a degree when converting into 3EG J1800−2338, also affecting previous positional coincidences. The evolution in the number of coincidences between SNRs and EGRET sources from the First EGRET Catalog to the current situation is shown in Table 2 (Torres et al., 2001b). The Poisson probability for the 19 coincidences to be a chance effect is $1.05 \times 10^{-5}$, i.e., there is an a priori 0.99998 probability that at least one of the positional associations in Table 1 is physical. This expected chance association was computed using thousands of simulated sets of EGRET sources, by means of a numerical code described elsewhere (Romero et al., 1999a,b; Sigl et al., 2001). Fig. 7 shows the result of numerical simulations for these random populations of γ-ray sources. Fig. 8 shows the distribution of the γ-ray photon spectral index for the sample of 3EG sources coincident with SNRs. Some cases of possible physical associations mentioned in the literature and discussed in the next sections are indicated.

Footnote 1: A new edition (2001) of the catalog has recently appeared. The results presented herein are not changed when the 2001 edition is considered.
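To illustrate the kind of simulation referred to above, the following is a toy Monte Carlo sketch only. It is not the numerical code of Romero et al. (1999a,b), whose details are described elsewhere; the uniform sky distribution, the circular error regions, the overlap criterion and all names and numbers below are assumptions made for illustration.

```python
import math
import random

def count_coincidences(sources, remnants):
    """Count sources whose error circle overlaps at least one remnant circle.

    Each entry is (l_deg, b_deg, radius_deg); a small-angle, flat-sky
    separation is used, which is only adequate close to the Galactic plane.
    """
    n = 0
    for ls, bs, rs in sources:
        for lr, br, rr in remnants:
            dl = abs(ls - lr)
            dl = min(dl, 360.0 - dl)  # wrap around in longitude
            sep = math.hypot(dl * math.cos(math.radians(0.5 * (bs + br))), bs - br)
            if sep <= rs + rr:
                n += 1
                break
    return n

def simulate_chance(n_sources, remnants, error_radius, b_max, n_trials, seed=0):
    """Mean and standard deviation of chance coincidences over random source sets."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):
        sources = [(rng.uniform(0.0, 360.0), rng.uniform(-b_max, b_max), error_radius)
                   for _ in range(n_sources)]
        counts.append(count_coincidences(sources, remnants))
    mean = sum(counts) / n_trials
    var = sum((c - mean) ** 2 for c in counts) / (n_trials - 1)
    return mean, math.sqrt(var)

# Hypothetical inputs: 40 remnants of 0.3 deg radius near the plane,
# 30 simulated sources with 0.5 deg error radius at |b| < 10 deg.
_rng = random.Random(1)
remnants = [(_rng.uniform(0.0, 360.0), _rng.uniform(-2.0, 2.0), 0.3) for _ in range(40)]
mean_chance, sd_chance = simulate_chance(30, remnants, 0.5, 10.0, 200, seed=2)
print(f"chance coincidences: {mean_chance:.2f} +/- {sd_chance:.2f}")
```

The real number of coincidences is then compared with the simulated mean and its dispersion, as done in Fig. 7 for 5000 simulated cases.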
Table 1
Positional coincidences between supernova remnants quoted in Green’s Catalog (2000) and unidentified 3EG EGRET sources. See text for the meaning of the different columns. The three values of τ are given as value [lower limit, upper limit]; "--" marks a missing entry.

γ-source | F_γ | Γ | Class | I | τ [lower, upper] | P? | SNR | Other name | Δθ (deg) | Size (arcmin) | T
0542+2610 | 14.7 ± 3.2 | 2.67 ± 0.22 | em C | 3.16 | 0.70 [0.34, 1.40] | | G180.0−1.7 | | 2.04 | 180 | S
0617+2238 (a,b) | 51.4 ± 3.5 | 2.01 ± 0.06 | C | 1.68 | 0.26 [0.15, 0.38] | | G189.1+3.0 | IC443 | 0.11 | 45 | S
0631+0642 (a,c) | 14.3 ± 3.4 | 2.06 ± 0.15 | C | 1.52 | 75.8 [7.89, ∞] | | G205.5+0.5 | Monoceros | 1.97 | 220 | S
0634+0521 | 15.0 ± 3.5 | 2.03 ± 0.26 | em C | 1.02 | 72.0 [5.15, ∞] | | G205.5+0.5 | Monoceros | 2.03 | 220 | S
1013−5915 | 33.4 ± 6.0 | 2.32 ± 0.13 | em C | 1.63 | 0.22 [0.00, 0.46] | y | G284.3−1.8 | MSH 10−53 | 0.65 | 24 | S
1102−6103 | 32.5 ± 6.2 | 2.47 ± 0.21 | C | 1.86 | 0.00 [0.00, 0.90] | | G290.1−0.8 | MSH 11−61A | 0.12 | 19 | S
 | | | | | | | G289.7−0.3 | | 0.75 | 18 | S
1410−6147 (d) | 64.2 ± 8.8 | 2.12 ± 0.14 | C | 1.22 | 0.33 [0.16, 0.55] | y | G312.4−0.4 | | 0.23 | 38 | S
1639−4702 | 53.2 ± 8.7 | 2.50 ± 0.18 | em C | 1.95 | 0.00 [0.00, 0.38] | y | G337.8−0.1 | Kes 41 | 0.07 | 9 | S
 | | | | | | | G338.1+0.4 | | 0.65 | 15 | S
 | | | | | | | G338.3+0.0 | | 0.57 | 8 | S
1714−3857 | 43.6 ± 6.5 | 2.30 ± 0.20 | em C | 2.17 | 0.15 [0.00, 0.38] | y | G348.5+0.0 | | 0.47 | 10 | S
 | | | | | | | G348.5+0.1 | CTB 37A | 0.50 | 15 | S
 | | | | | | | G347.3−0.5 | | 0.85 | 65 | S
1734−3232 (e) | 40.3 ± 6.7 | -- | C | 2.90 | 0.00 [0.00, 0.24] | | G355.6+0.0 | | 0.16 | 8 | S
1744−3011 | 63.9 ± 7.1 | 2.17 ± 0.08 | C | 1.80 | 0.38 [0.20, 0.62] | | G359.0−0.9 | | 0.41 | 23 | S
 | | | | | | | G359.1−0.5 | | 0.25 | 24 | S
1746−2851 (f) | 119.9 ± 7.4 | 1.70 ± 0.07 | em C | 2.00 | 0.50 [0.36, 0.69] | | G0.0+0.0 | | 0.12 | 3.5 | S
 | | | | | | | G0.3+0.0 | | 0.19 | 16 | S
1800−2338 (a,g) | 61.3 ± 6.7 | 2.10 ± 0.10 | C | 1.60 | 0.03 [0.00, 0.32] | y | G6.4−0.1 | W28 | 0.17 | 42 | C
1824−1514 | 35.2 ± 6.5 | 2.19 ± 0.18 | C | 3.00 | 0.00 [0.00, 0.51] | y | G16.8−1.1 | | 0.43 | 30 | --
1837−0423 | < 19.1 | 2.71 ± 0.44 | C | 5.41 | 12.0 [2.17, ∞] | y | G27.8+0.6 | | 0.58 | 50 | F
1856+0114 (h) | 67.5 ± 8.6 | 1.93 ± 0.10 | em C | 2.92 | 0.80 [0.50, 1.51] | y | G34.7−0.4 | W44 | 0.17 | 35 | S
1903+0550 (d) | 62.1 ± 8.9 | 2.38 ± 0.17 | em C | 2.28 | 0.35 [0.18, 0.60] | y | G39.2−0.3 | 3C396, HC24 | 0.41 | 8 | S
2016+3657 | 34.7 ± 5.7 | 2.09 ± 0.11 | C | 2.06 | 0.37 [0.08, 0.75] | | G74.9+1.2 | CTB 87 | 0.26 | 8 | F
2020+4017 (a,i) | 123.7 ± 6.7 | 2.08 ± 0.04 | C | 1.12 | 0.07 [0.00, 0.18] | ? | G78.2+2.1 | W66, γ-Cygni | 0.15 | 60 | S

(a) Association proposed by Sturner and Dermer (1995) and Esposito et al. (1996).
(b) GeV J0617+2237.
(c) GeV J0633+0645.
(d) Association proposed by Sturner and Dermer (1995).
(e) GeV J1732−3130.
(f) GeV J1746−2854.
(g) GeV J1800−2328.
(h) GeV J1856−0115.
(i) GeV J2020+4023.
GeV sources compiled in the GeV ASCA Catalog (Roberts et al., 2001).
If some SNRs interact, as expected, with nearby massive clouds, producing enhanced γ-ray emission through hadronic/Bremsstrahlung interactions, cases in which there is just a marginal coincidence between the centre-points of the SNRs and the centres of the EGRET sources should also be considered, since the peak γ-ray emissivity will likely be biased towards the adjacent cloud. So we have also looked at the positional coincidences between unidentified EGRET sources and the region just around the SNRs. We have done so by artificially enlarging the size of the SNR by half a degree. We found that there are 26 coincidences of this kind, including those in Table 1. Then, there are
Table 2
Evolution of the number of positional coincidences between SNRs and unidentified EGRET sources. In the last row, six likely artifacts are disregarded in the 3EG Catalog.

Catalog | Unidentified EGRET detections | Real coincidences | Number of SNRs in Green’s catalog | Significance (statistical)
First EGRET Catalog (a) | 37 | 13 (35%) | 182 | 1.8σ
2EG (b) | 32 | 7 (22%) | 194 | ?
2EG (c) | 32 | 5 (16%) | 14 (g) | ?
2EG (d) | 33 | 10 (30%) | 194 | (h)
3EG (e) | 81 | 22 (27%) | 220 | 5.7σ
3EG (f) | 75 | 19 (25%) | 220 | 4.8σ

(a) Sturner and Dermer (1995).
(b) Sturner et al. (1996).
(c) Esposito et al. (1996).
(d) Yadigaroglu and Romani (1997).
(e) Romero et al. (1999a).
(f) Torres et al. (2001b).
(g) Only radio-bright SNRs, with flux at 1 GHz greater than 100 Jy, were considered.
(h) Computed for pairs.
Fig. 7. Left: Statistical results for the random association between SNRs and EGRET sources at low latitudes (horizontal axis: number of coincidences; vertical axis: number of occurrences out of 5000 cases; the real result is marked). Right: Distribution of the γ-ray fluxes (in units of $10^{-8}$ ph cm$^{-2}$ s$^{-1}$) as a function of the photon spectral index. Two sources seem to differentiate from the rest. One of these sources (see below) has been recently identified with a blazar.
7 new cases perhaps worthy of further study. These cases are shown in Table 3. Interestingly, the expected chance coincidence in this case is at the level of 14.8 ± 3.14, still 4σ lower than the real result, implying a probability of $2.5 \times 10^{-3}$ for the real result to be a random (Poisson) fluctuation.
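The numbers quoted above can be checked with a few lines. This is a sketch under the assumption that the chance coincidences follow a Poisson distribution with mean 14.8 and that the real result is the 26 coincidences found for the enlarged remnants; the function names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_tail(k_min, lam):
    """Probability of k_min or more events for a Poisson mean lam."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(k_min))

observed = 26      # real coincidences with the enlarged SNRs
expected = 14.8    # simulated chance level
sd = 3.14          # simulated standard deviation

excess_sigma = (observed - expected) / sd
p_exact = poisson_pmf(observed, expected)
p_tail = poisson_tail(observed, expected)

print(f"excess: {excess_sigma:.2f} sigma")
print(f"P(N = 26)  = {p_exact:.2e}")
print(f"P(N >= 26) = {p_tail:.2e}")
```

With these inputs the excess is about 3.6 standard deviations, which rounds to the quoted 4σ, and the probability of exactly 26 coincidences comes out close to the quoted $2.5 \times 10^{-3}$; the cumulative tail probability is somewhat larger.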
Fig. 8. Distribution of the γ-ray photon spectral index (horizontal axis: photon spectral index; vertical axis: number of EGRET sources). Some SNR-3EG coincidences for which the physical association has been suggested in the literature are indicated: Monoceros, IC443, W28, G312.4−0.4, G347.3−0.5, MSH 10−53, G39.2−0.3 and W44.
In order to quantify the role played by the 7 new sources in this result, new simulations were carried out, now considering only these sources. The chance result is 2.3 ± 1.2, again 4σ below the real number of coincidences. In the present review, however, we shall focus only on those SNRs that present positional correlation with γ-ray sources, i.e. only the cases listed in Table 1.

6. SNRs coincident with γ-ray sources

We now analyze the sample of SNRs in Table 1 in more detail. Table 4 presents the radio fluxes of the SNRs, together with their spectral index and known distance estimates (with the corresponding references). Distances are only approximate since several different values for the same SNR can be found in the literature. When no direct determination is available, estimates can be made using the radio surface brightness-to-diameter relationship, known as Σ–D (Clark and Caswell, 1976; Milne, 1979; Case and Bhattacharya, 1998). For these cases, marked with a star in Table 4, the distances given by the new Σ–D relation introduced by Case and Bhattacharya (1998) will, unless otherwise noted, be adopted. A double star symbol means that neither a distance determination nor an estimate is available. There is only one such case in Table 4, for which the distance to a coincident OB
Table 3
Supernova remnants and nearby (but not coincident) unidentified 3EG sources. The three values of τ are given as value [lower limit, upper limit]; "--" marks a missing entry.

γ-source | F_γ | Γ | Class | I | τ [lower, upper] | SNR | Δθ (deg) | Size (arcmin) | T | Other
0229+6151 | 39.9 ± 6.2 | 2.29 ± 0.18 | C | 1.3 | 0.37 [0.16, 0.74] | G132.7+1.3 | 1.50 | 82 | S | Of/OB
1736−2908 | 51.5 ± 9.1 | 2.18 ± 0.12 | C | 2.4 | 0.66 [0.40, 1.09] | G359.1+0.9 | 0.73 | 12 | S | --
1741−2050 | 24.1 ± 3.9 | 2.25 ± 0.12 | C | 2.1 | 0.41 [0.14, 0.70] | G6.4+4.0 | 1.00 | 31 | S | --
1823−1314 (a,b) | 102.6 ± 12.5 | 2.69 ± 0.19 | C | 2.9 | 0.72 [0.40, 1.37] | G18.8+0.3 | 0.88 | 17 | S | OB
1826−1302 | 66.7 ± 10.1 | 2.00 ± 0.11 | C | 2.6 | 0.75 [0.49, 1.28] | G18.8+0.3 | 0.88 | 17 | S | OB
1928+1733 | 157.0 ± 36.9 | 2.23 ± 0.32 | em C | 3.9 | 0.82 [0.43, 2.01] | G54.1+0.3 | 1.21 | 1.5 | F? | --
1958+2909 (c) | 26.9 ± 4.8 | 1.85 ± 0.20 | em C | 1.6 | 0.43 [0.15, 0.98] | G65.1+0.6 | 1.36 | 90 | S | --

(a) Association proposed by Sturner and Dermer (1995), and Esposito et al. (1996).
(b) GeV J1825−1310.
(c) GeV J1957−8859.
association was assumed. Thus, distances in Table 4 marked with one or two stars are more uncertain than the others. Using the estimated distance to each remnant in Table 4, we have calculated the approximate intrinsic γ-ray luminosity of the putative region producing the high-energy source in the energy range 100 MeV–10 GeV using the observed EGRET flux and photon indices (see Table 1), assuming isotropic emission. We have also re-ordered Green’s Catalog according to descending radio flux, and the rank in this list is given for each remnant. It is interesting that of the first 20 SNRs with the highest radio fluxes, only six appear to be correlated with EGRET sources. Sturner and Dermer (1995) noted that SNRs not correlated with 2EG sources were either more distant than γ-Cygni (G78.2+2.1) and IC 443 (G189.1+3.0), or had a far smaller radio flux. This trend is not observed now with the larger 3EG sample (Table 3).

7. Pulsars within the EGRET error boxes

Since both molecular clouds and pulsars can produce γ-rays, and because both often lie close to SNRs, it is important to explore the possible origin of the high-energy flux in neighboring pulsars. In order to firmly identify a pulsar as the origin of the γ-rays from an EGRET source, γ-ray pulsations must be detected at the pulsar period. However, this is not always possible because of the usually low photon counts. There are only 6 confirmed high-energy γ-ray pulsars, 5 of which have associated γ-ray sources in the 3EG Catalog. We give their properties in Table 5. Other candidates are usually judged by comparison with the properties of the known EGRET pulsars. The results of a correlation analysis between the 3EG sources superposed to SNRs (Table 1) and pulsars are presented in this section.
The latter were extracted from the Princeton Catalog (Taylor et al., 1993, available online at http://pulsar.princeton.edu/ftp/pub/catalog/) and from the recently and partially released Parkes Multibeam Survey (Manchester et al., 2001, http://www.atnf.CSIRO.AU/research/pulsar/pmsurv/). Adding up both surveys, there are more than 1000 known pulsars (Table 6).
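The isotropic γ-ray luminosities mentioned in the previous section (and listed in Table 4) can be reproduced approximately with the sketch below. It assumes a single power-law photon spectrum with index Γ between 100 MeV and 10 GeV, normalised to the P1234 photon flux; the function names and the unit constants are choices of this illustration, not taken from the original analysis.

```python
import math

KPC_CM = 3.0857e21    # one kiloparsec in cm
MEV_ERG = 1.6022e-6   # one MeV in erg

def energy_flux(photon_flux, gamma, e_min=100.0, e_max=1.0e4):
    """Energy flux (erg cm^-2 s^-1) of dN/dE ~ E^-gamma between e_min and e_max (MeV),
    normalised to the integrated photon flux (ph cm^-2 s^-1) in the same band."""
    if abs(gamma - 1.0) < 1e-9:
        n_int = math.log(e_max / e_min)
    else:
        n_int = (e_max ** (1.0 - gamma) - e_min ** (1.0 - gamma)) / (1.0 - gamma)
    if abs(gamma - 2.0) < 1e-9:
        e_int = math.log(e_max / e_min)
    else:
        e_int = (e_max ** (2.0 - gamma) - e_min ** (2.0 - gamma)) / (2.0 - gamma)
    return photon_flux * (e_int / n_int) * MEV_ERG

def isotropic_luminosity(photon_flux, gamma, d_kpc):
    """L = 4 pi d^2 S for an isotropic emitter at distance d_kpc."""
    d_cm = d_kpc * KPC_CM
    return 4.0 * math.pi * d_cm ** 2 * energy_flux(photon_flux, gamma)

# 3EG J0617+2238 / IC 443: flux 51.4e-8 ph cm^-2 s^-1, index 2.01, d = 1.5 kpc
l_ic443 = isotropic_luminosity(51.4e-8, 2.01, 1.5)
# 3EG J1856+0114 / W44: flux 67.5e-8 ph cm^-2 s^-1, index 1.93, d = 2.5 kpc
l_w44 = isotropic_luminosity(67.5e-8, 1.93, 2.5)
print(f"IC 443: {l_ic443:.2e} erg/s,  W44: {l_w44:.2e} erg/s")
```

Under these assumptions the two examples land within a few per cent of the corresponding Table 4 entries ($1.01 \times 10^{35}$ and $4.14 \times 10^{35}$ erg s$^{-1}$).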
Table 4
Properties of the SNRs coincident with 3EG sources. The distances quoted are reported or discussed in the cited references. Radio fluxes and spectral indices (α, such that $S_\nu \propto \nu^{\alpha}$) are taken from Green’s (2000) Catalog. "--" marks a missing entry.

γ-source | SNR | d (kpc) | Ref. | F_radio at 1 GHz (Jy) | α | Rank | L_γ (erg s^-1)
0542+2610 | G180.0−1.7 | 0.8–1.6 | 1 | 65 | Varies | 24 | 9.65 × 10^33
0617+2238 | G189.1+3.0 | 1.5 | 2 | 160 | 0.36 | 11 | 1.01 × 10^35
0631+0642 | G205.5+0.5 | 0.8–1.6 | 3 | 160 | 0.5 | 12 | 1.70 × 10^34
0634+0521 | G205.5+0.5 | 0.8–1.6 | 3 | 160 | 0.5 | 12 | 1.85 × 10^34
1013−5915 | G284.3−1.8 | 2.9 | 4 | 11 | 0.3? | 91 | 1.71 × 10^35
1102−6103 | G290.1−0.8 | 7 | 5 | 42 | 0.4 | 34 | 8.46 × 10^35
 | G289.7−0.3 | 8.2 | * | 6.2 | 0.2? | 123 | 1.11 × 10^36
1410−6147 | G312.4−0.4 | 1.9–3.1 | 6 | 45 | 0.26 | 32 | 3.06 × 10^35
1639−4702 | G337.8−0.1 | 12.3 | 7 | 18 | 0.5 | 67 | 4.17 × 10^36
 | G338.1+0.4 | 9.9 | * | 4? | 0.4 | 153 | 2.70 × 10^36
 | G338.3+0.0 | 8.6 | * | 7? | ? | 121 | 2.04 × 10^36
1714−3857 | G348.5+0.0 | 11.3 | 8 | 9 | 0.4? | 96 | 3.47 × 10^36
 | G348.5+0.1 | 11.3 | 8 | 72 | 0.3 | 21 | 3.47 × 10^36
 | G347.3−0.5 | 6.3 | 9 | ? | ? | ? | 1.07 × 10^36
1734−3232 | G355.6+0.0 | 12.6 | * | 3? | ? | 170 | --
1744−3011 | G359.0−0.9 | 6 | 10 | 23 | 0.5 | 54 | 1.65 × 10^36
 | G359.1−0.5 | 8.5–9.2 | 8–11 | 14 | 0.4? | 81 | 3.56 × 10^36
1746−2851 | G0.0+0.0 | 8.5 | 8 | 100? | 0.8? | 18 | 1.20 × 10^37
 | G0.3+0.0 | 8.5 | 12 | 22 | 0.6 | 55 | 1.20 × 10^37
1800−2338 | G6.4−0.1 | 1.6–4.2 | 13 | 310 | Varies | 6 | 4.04 × 10^35
1824−1514 | G16.8−1.1 | 1.48 | ** | 2? | ? | 186 | 5.42 × 10^34
1837−0423 | G27.8+0.6 | 2 | 14 | 30 | Varies | 44 | 3.40 × 10^34
1856+0114 | G34.7−0.4 | 2.5 | 15 | 230 | 0.30 | 7 | 4.14 × 10^35
1903+0550 | G39.2−0.3 | 7.7–9.6 | 15 | 18 | 0.6 | 68 | 2.12 × 10^36
2016+3657 | G74.9+1.2 | 10 | 15 | 9 | Varies | 104 | 2.75 × 10^36
2020+4017 | G78.2+2.1 | 1.7 | 16 | 340 | 0.5 | 5 | 3.20 × 10^35

Note:
1. Anderson et al. (1996)
2. Fesen (1984)
3. Jaffe et al. (1997) and Hensberge et al. (2000)
4. Ruiz and May (1986)
5. Kaspi et al. (1997)
6. Caswell and Barnes (1985), Case and Bhattacharya (1999)
7. Koralesky et al. (1998)
8. Green et al. (1997), see also Reynoso and Mangum (2000)
9. Slane et al. (1999)
10. Bamba et al. (2000)
11. Uchida et al. (1992)
12. Kassim and Frail (1996)
13. Frail et al. (1993) and Clark and Caswell (1976)
14. Reich et al. (1984)
15. Green (2000) and Caswell et al. (1975)
16. Lozinskaya et al. (2000)
* From the Σ–D relationship presented by Case and Bhattacharya (1998)
** Distance assumed equal to that of a coincident OB association, Romero et al. (1999a).
Table 6 presents the results of the correlative 3EG-radio pulsar spatial coincidence analysis: name of the 3EG source, name of the pulsar found within the error box, their angular separation, and the size of the 95% confidence contour of the 3EG source. Apart from the results quoted in Table 6, Torres and Nuza (2003b) recently discovered five new coincidences between EGRET sources and pulsars in the 2003 version of the Parkes Catalog, but they discarded their possible association based on simple spin-down energetics. We refer the reader to their paper for further details (footnote 2). We provide

Footnote 2: A discussion of how γ-ray observations of Parkes’ pulsars can help to distinguish the outer gap from other models for γ-ray emission can be found in Torres and Nuza (2003b) and references cited therein.
Table 5
Properties of the γ-ray pulsars detected by EGRET. Pulsar parameters and distances are taken from Kaspi et al. (2000), except for PSR B1055−52, for which a smaller value of the distance was also considered (Ögelman and Finley, 1993; Combi et al., 1997), and Vela (Caraveo et al., 2001 and references therein). $\tau = P/2\dot P$, and $\dot E = 4\pi^2 I \dot P/P^3$, with $I = 10^{45}$ g cm$^2$. The “P1234” γ-ray fluxes and spectral indices are from the 3EG catalog (Hartman et al., 1999). "--" marks a missing entry.

Pulsar/3EG source | P (ms) | τ (kyr) | Ė (erg s^-1) | d (kpc) | F_3EG (10^-8 ph cm^-2 s^-1) | Γ_3EG | η (100 MeV–10 GeV) (%)
Crab/0534+2200 | 33 | 1.2 | 5.0 × 10^38 | 2.0 | 226.2 ± 4.7 | 2.19 ± 0.02 | 0.01
Vela/0834−4511 | 89 | 12.5 | 6.3 × 10^36 | 0.25 | 834.3 ± 11.2 | 1.69 ± 0.01 | 0.08
B1951+32/-- | 39 | 100.0 | 3.7 × 10^36 | 2.4 | -- | -- | 0.3
B1706−44/1710−4439 | 102 | 15.8 | 3.1 × 10^36 | 1.8 | 111.2 ± 6.2 | 1.86 ± 0.04 | 1
Geminga/0633+1751 | 237 | 316.2 | 3.1 × 10^34 | 0.16 | 352.9 ± 5.7 | 1.66 ± 0.01 | 3
B1055−52/1058−5234 | 197 | 501.1 | 3.1 × 10^34 | 0.5/1.5 | 33.3 ± 3.8 | 1.94 ± 0.10 | 2/19
Table 6
Positional coincidences between 3EG unidentified sources superposed to SNRs and pulsars in the Princeton Catalog and in the recently and partially released Parkes Multibeam Survey. We show measured and derived pulsar parameters as well. See text for details. In the case of 3EG J1410−6147, second and third values of efficiency are given for two different estimates of the distance to G312.4−0.4, 1.9 kpc and 3.1 kpc, respectively. All efficiencies are based on the new data of the 3EG Catalog (Hartman et al., 1999). "--" marks a missing entry.

3EG J | PSR J | Δθ (deg) | θ (deg) | (l, b) (deg) | d (kpc) | τ (kyr) | P (ms) | Ṗ (10^-15) | Ė (erg s^-1) | η (%) (beamed)

Princeton
1013−5915 | 1012−5857 | 0.29 | 0.72 | 283.7, −2.1 | 10.1 | 741 | 820 | 17.69 | 1.2 × 10^33 | > 100
1639−4702 | 1640−4715 | 0.29 | 0.56 | 337.7, −0.4 | 7.2 | 188 | 518 | 42.02 | 1.3 × 10^34 | 100
1800−2338 | 1800−2343 | 0.12 | 0.32 | 6.1, −0.1 | 4.8 | -- | 1030 | -- | -- | --
1824−1514 | 1825−1446 | 0.46 | 0.52 | 16.8, −1.0 | 5.4 | 195 | 279 | 22.68 | 4.1 × 10^34 | > 100
1837−0423 | 1836−0436 | 0.28 | 0.52 | 27.1, 1.1 | 4.6 | 33 | 354 | 1.66 | 1.5 × 10^33 | 100
1856+0114 | 1856+0113 | 0.05 | 0.19 | 34.5, −0.5 | 3.3 | 10 | 267 | 208.408 | 4.3 × 10^35 | 13
1903+0550 | 1902+0556 | 0.26 | 0.64 | 39.5, 0.2 | 3.9 | 912 | 746 | 12.896 | 1.2 × 10^33 | 100
 | 1902+0615 | 0.48 | -- | 39.8, 0.3 | 10.1 | -- | 673 | -- | -- | --

Parkes
1013−5915 | 1016−5857 | 0.16 | 0.72 | 284.1, −1.9 | 3.0 | 21 | 107 | 0.806 | 2.6 × 10^36 | 0.5
 | 1013−5934 | 0.31 | -- | 284.1, −2.6 | 11.3 | 12561 | 442 | 0.557 | 2.5 × 10^32 | 100
1014−5705 | 1015−5719 | 0.30 | 0.67 | 283.69, −0.58 | 4.9 | 39 | 140 | 57.4 | 8.2 × 10^35 | 5
1410−6147 | 1412−6145 | 0.15 | 0.36 | 312.3, −0.3 | 9.3 | 50 | 315 | 98.7 | 1.2 × 10^35 | > 100/12/30
1420−6038 | 1420−6048 | 0.16 | 0.33 | 313.54, +0.23 | 8 | 13 | 68.2 | 82.85 | 1.0 × 10^37 | 2
1410−6147 | 1413−6141 | 0.28 | 0.36 | 312.4, −0.3 | 11.0 | 13 | 286 | 333.4 | 5.7 × 10^35 | 80/3/6
1639−4702 | 1637−4642 | 0.46 | 0.56 | 337.8, +0.3 | 5.8 | 41 | 154 | 59.2 | 6.4 × 10^35 | 12
 | 1640−4648 | 0.37 | -- | 338.1, −0.2 | 6.1 | 3501 | 178 | 0.806 | 5.6 × 10^33 | 100
 | 1637−4721 | 0.45 | -- | 337.3, −0.1 | 5.9 | 4160 | 1165 | 4.44 | 1.1 × 10^32 | 100
1714−3857 | 1713−3844 | 0.30 | 0.51 | 348.1, 0.2 | 6.5 | 143 | 1600 | 177.41 | 1.7 × 10^33 | 100
 | 1715−3903 | 0.23 | -- | 348.1, −0.3 | 4.8 | 117 | 278 | 37.688 | 6.9 × 10^34 | 72
1837−0423 | 1838−0453 | 0.50 | 0.52 | 27.1, +0.7 | 8.2 | 52 | 381 | 115.7 | 8.3 × 10^34 | < 55
1837−0606 | 1837−0559 | 0.14 | 0.19 | 26.0, +0.38 | 5.0 | 1045 | 201.0 | 3.304 | 1.5 × 10^34 | > 100
 | 1837−0604 | 0.16 | -- | 25.96, +0.27 | 6.2 | 34 | 96.3 | 45.170 | 2.0 × 10^36 | 7
also the available information about the pulsar: its galactic coordinates, distance, characteristic time $\tau = P/2\dot P$, with $P$ and $\dot P$ the period and period derivative, respectively, spin-down energy release $\dot E = 4\pi^2 I \dot P/P^3$ (assuming a neutron star moment of inertia $I = 10^{45}$ g cm$^2$), and the efficiency in converting the spin-down luminosity into γ-rays, if the pulsar alone were responsible for generating the 3EG source flux. The γ-ray efficiency was estimated as

$$\eta \equiv L_\gamma/\dot E = f\,4\pi d^2 F_\gamma/\dot E, \qquad (37)$$
where $F_\gamma$ is the observed γ-ray flux between 100 MeV and 10 GeV, and $f$ is the γ-ray beaming fraction ($0 < f \le 1$). This fraction is essentially unknown (see, e.g., Yadigaroglu and Romani, 1997; Romero, 1998), but it is common practice to assume $f \equiv 1/4\pi$ (e.g. Thompson, 2001; Kaspi et al., 2000; D’Amico et al., 2001; Torres et al., 2001d). These efficiencies are also uncertain because they depend quadratically on the imprecisely known distance to the pulsar. For the confirmed γ-ray pulsars, the observed efficiencies are in the range $\eta \in (\sim 0.01, \sim 3$–$19\%)$, where the interval of upper limits is caused by the different estimates of the distance to PSR B1055−52 (see Ögelman and Finley, 1993; Combi et al., 1997; Romero, 1998). It could be considered reasonable that a pulsar generates γ-rays with efficiencies in the range $\eta \in (\sim 0.01, \sim 10\%)$. For higher values the pulsar would be too close to the so-called death line, where the high-energy emission is quenched (Usov, 1994). Two of the pulsars in Table 6, PSR J1800−2343 and PSR J1902+0615, lack a confident determination of the period derivative; for these it is impossible to assess their expected efficiencies. Based on the required efficiencies and derived spin-down luminosities of the other members of the group shown in the first (Princeton) panel, none of the positional associations seems likely except that between 3EG J1856+0114 and PSR J1856+0113. For this case, the pulsar should have an efficiency of 13% in converting rotational energy into γ-rays. PSR J1856+0113 was already mentioned as possibly associated with the EGRET source by De Jager and Mastichiadis (1997). The bottom panel of Table 6 contains data that were analyzed elsewhere (Torres et al., 2001d; D’Amico et al., 2001; Camilo et al., 2001); here are the main conclusions.
(1) The physical association between 3EG J1013−5915 and PSR J1016−5857 is possible, as first noted by Camilo et al. (2001).
(2) The source 3EG J1410−6147 and either PSR J1412−6145 or J1413−6141 might be physically associated if either of the latter lies closer than its dispersion measure distance, say at the estimated distance to G312.4−0.4 (2–3 kpc).
(3) The pulsar PSR J1637−4642 could contribute part of the high-energy budget of the source 3EG J1639−4702.
In addition to this, the pulsars J1015−5719, J1420−6048 and J1837−0604 are also possible counterparts for their respective EGRET sources.
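As a worked illustration of Eq. (37), the sketch below evaluates the spin-down power and the required efficiency for PSR J1856+0113, using the Table 6 parameters and the flux and photon index of 3EG J1856+0114 from Table 1. It assumes $f = 1/4\pi$ and a single power-law spectrum between 100 MeV and 10 GeV to convert the photon flux into an energy flux; the function names and unit constants are choices of this example.

```python
import math

KPC_CM = 3.0857e21    # one kiloparsec in cm
MEV_ERG = 1.6022e-6   # one MeV in erg

def spin_down_power(p_s, pdot, inertia=1.0e45):
    """E_dot = 4 pi^2 I Pdot / P^3 in erg/s (P in s, I in g cm^2)."""
    return 4.0 * math.pi ** 2 * inertia * pdot / p_s ** 3

def energy_flux(photon_flux, gamma, e_min=100.0, e_max=1.0e4):
    """Energy flux (erg cm^-2 s^-1) for dN/dE ~ E^-gamma; valid for gamma != 1, 2."""
    n_int = (e_max ** (1.0 - gamma) - e_min ** (1.0 - gamma)) / (1.0 - gamma)
    e_int = (e_max ** (2.0 - gamma) - e_min ** (2.0 - gamma)) / (2.0 - gamma)
    return photon_flux * (e_int / n_int) * MEV_ERG

def efficiency(photon_flux, gamma, d_kpc, e_dot, beaming=1.0 / (4.0 * math.pi)):
    """eta = f 4 pi d^2 F / E_dot, with F the energy flux."""
    d_cm = d_kpc * KPC_CM
    return beaming * 4.0 * math.pi * d_cm ** 2 * energy_flux(photon_flux, gamma) / e_dot

# PSR J1856+0113: P = 267 ms, Pdot = 208.408e-15, d = 3.3 kpc (Table 6)
e_dot = spin_down_power(0.267, 208.408e-15)
# 3EG J1856+0114: flux 67.5e-8 ph cm^-2 s^-1, photon index 1.93 (Table 1)
eta = efficiency(67.5e-8, 1.93, 3.3, e_dot)
print(f"E_dot = {e_dot:.2e} erg/s,  eta = {100.0 * eta:.0f}%")
```

Under these assumptions the result is consistent with the spin-down power of $4.3 \times 10^{35}$ erg s$^{-1}$ and the 13% efficiency quoted for this pulsar. Because η scales as $d^2$, a factor of two in distance changes it by a factor of four.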
8. Variability

If SNRs, molecular clouds, or pulsars are responsible for some EGRET sources, we would expect them to be non-variable on the time scale of EGRET observations, i.e. from weeks to a few years. Hence, variability analysis of the γ-ray emission is an important tool to test the original hypothesis, in the sense that variable sources could be ruled out as being produced by SNRs (or pulsars).
Three variability indices for EGRET sources have been introduced in the literature so far. The first of them, dubbed $V$, was presented by McLaughlin et al. (1996), who computed it for the sources contained in the Second EGRET Catalog. This method was later also used for short timescale studies by Wallace et al. (2000, 2002). The basic idea behind $V$ is to find $\chi^2$ from the measured fluxes, and to compute $V = -\log Q$, where $Q$ is the probability of obtaining such a $\chi^2$ if the source were constant. Several critiques have been raised concerning this procedure, among them that the scheme gets complicated when the fluxes are just upper limit detections. It can be shown that sources which have upper limits included in the analysis will have a lower $V$ than what is implied by the data (Tompkins, 1999). In addition, a source can have a large value of $V$ because of intrinsic reasons or because of small error bars in the flux measurements. Similarly, a small value of $V$ can imply a constant flux or big error bars. Each value of $V$ is obtained disregarding those of a control population. Then, there could be pulsars with very high values of $V$, or variable AGNs with very low ones. Hence, the use of $V$ to classify the variability of γ-ray sources seems not to produce confident results. Two other indices have been computed for all γ-ray sources: the $I$ index and the τ index (Torres et al., 2001a; Tompkins, 1999, respectively). See Torres et al. (2001c) for a comparison among the results obtained with them. The idea behind the index $I$ is to carry out a direct comparison of the flux variation of any given source with that shown by pulsars, which is considered as instrumental. It basically establishes how variable a source is with respect to the pulsar population. Contrary to Tompkins’ index, the $I$-scheme uses only the publicly available data of the 3EG Catalog, and is defined as follows. Firstly, a mean weighted value for the EGRET flux is computed as

$$\langle F \rangle = \left[ \sum_{i=1}^{N_{\rm vp}} \frac{F(i)}{\epsilon(i)^2} \right] \times \left[ \sum_{i=1}^{N_{\rm vp}} \epsilon(i)^{-2} \right]^{-1}. \qquad (38)$$

Here $N_{\rm vp}$ is the number of single viewing periods for each γ-ray source, $F(i)$ is the observed flux in the $i$th period, whereas $\epsilon(i)$ is the corresponding error in the observed flux. These data are taken directly from the 3EG Catalog. A fluctuation index, μ, is defined as (e.g. Romero et al., 1994) $\mu = 100 \times \sigma_{\rm sd} \times \langle F \rangle^{-1}$, where $\sigma_{\rm sd}$ is the standard deviation of the flux measurements. This fluctuation index is also computed for the confirmed γ-ray pulsars in the 3EG Catalog, and then the averaged statistical index of variability, $I$, is introduced by $I = \mu_{\rm source}/\langle\mu\rangle_{\rm pulsars}$. We refer the reader to the work by Torres et al. (2001a, c) for details on the error estimates. Since the $I$-scheme is a relative classification, a result like $I = 3.0$ says that the flux evolution is three times more variable than (equivalently, 4σ above) the mean flux evolution for pulsars. Torres et al. (2001c) mentioned that in order to get more reliable results under the $I$-scheme at low latitudes it seems safer to consider a restrictive criterion. For instance, a source will be considered very variable only when $I - \delta I > \langle I \rangle_p + 3\sigma_{p,I}$. Here $\delta I \sim 0.5 I$, $\langle I \rangle_p$ is the mean value of $I$ for pulsars, $\sigma_{p,I}$ is their standard deviation, and $\langle I \rangle_p + 3\sigma_{p,I} = 2.5$. Rephrasing the previous constraints just in terms of $I$, a source will be very variable if $I > 5$. This represents a deviation of 8σ from the mean $I$-value for pulsars. With this criterion, 3EG J1837−0423, with $I = 5.41$, is a very variable source. We have 7 non-variable sources, those having $I < 1.7$ in Table 1. The rest of the sources listed there are dubious under this classification scheme. Tompkins (1999) used the 145 marginal sources that were detected but not included in the final official 3EG list, and, simultaneously, all the detections within 25 deg of the source of interest. The
maximum likelihood set of source fluxes was then re-computed. From these fluxes, a new statistic measuring the variability was defined as $\tau = \sigma/\mu$, where σ is the standard deviation of the fluxes and μ their average value. The final result of Tompkins’ analysis is a table listing the name of the EGRET source and three values for τ: a mean, a lower, and an upper limit (68% error bars). The mean value of τ for pulsars (again an assumed non-variable population) is very low, 0.09, but the mean of the upper limits is ∼ 0.2. So pulsars are consistent with having values of τ up to 0.2. The deviation for the mean value of pulsars is 0.08. A source will be likely variable under the τ scheme when the lower limit is at least 0.6, 3σ above the mean value of the τ upper limit for pulsars. A source will be considered non-variable when the upper limit for τ is below that threshold. Sources not fulfilling either classification will be considered as dubious. This is also consistent with the fact that the mean τ value for the known population of AGNs is 0.9. Using this scheme, too, 3EG J1837−0423 is a variable source. In addition, 3EG J0631+0642 and 3EG J0634+0521 are also variable under τ. However, Tompkins (1999) adds a word of caution for these two sources: the fitted flux for them is zero. Such sources are most likely variable, but unknown instrument systematics or numerical problems within the τ scheme could conceivably change these results. It is also important to note that many sources have a dubious classification: within the 68% error bars on τ, they can be as variable as an AGN, or as non-variable as a pulsar. This is, unfortunately, a common situation for many of the sources. For those, in particular, the $I$-index scheme can provide some additional information. Based on the variability of the γ-ray flux, then, it is highly unlikely that 3EG J1837−0423 is caused by SNR G27.8+0.6, or by the pulsars PSR J1836−0436 (from the Princeton Catalog) and PSR J1838−0453 (from the Parkes Catalog).
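The $I$-scheme described above (weighted mean flux of Eq. (38), fluctuation index μ, and the ratio to the pulsar population) can be sketched as follows. The input fluxes, errors and pulsar fluctuation index are hypothetical, and the standard deviation is taken here as the unweighted sample standard deviation about the weighted mean, which is an assumption of this example; the error treatment of Torres et al. (2001a, c) is not reproduced.

```python
import math

def weighted_mean_flux(fluxes, errors):
    """Eq. (38): inverse-variance weighted mean of the viewing-period fluxes."""
    weights = [1.0 / e ** 2 for e in errors]
    return sum(f * w for f, w in zip(fluxes, weights)) / sum(weights)

def fluctuation_index(fluxes, errors):
    """mu = 100 * sigma_sd / <F> (sigma_sd: sample standard deviation, assumed)."""
    mean = weighted_mean_flux(fluxes, errors)
    n = len(fluxes)
    sd = math.sqrt(sum((f - mean) ** 2 for f in fluxes) / (n - 1))
    return 100.0 * sd / mean

def variability_index(fluxes, errors, mu_pulsars):
    """I = mu_source / <mu>_pulsars."""
    return fluctuation_index(fluxes, errors) / mu_pulsars

def classify(i_index):
    """Thresholds quoted in the text: I > 5 very variable, I < 1.7 non-variable."""
    if i_index > 5.0:
        return "very variable"
    if i_index < 1.7:
        return "non-variable"
    return "dubious"

# Hypothetical viewing-period fluxes (1e-8 ph cm^-2 s^-1), errors, and pulsar mean index
fluxes = [42.0, 55.0, 30.0, 61.0]
errors = [8.0, 9.0, 7.0, 12.0]
i_value = variability_index(fluxes, errors, mu_pulsars=20.0)
print(f"I = {i_value:.2f} -> {classify(i_value)}")
```

For the values in Table 1, `classify(5.41)` flags 3EG J1837−0423 as very variable, while a source with $I = 1.12$ is classified as non-variable.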
In addition, the spectral index for this 3EG source is very steep, possibly arguing against the pulsar hypothesis (Fierro et al., 1993), although it should be noted that Halpern et al. (2001a) have recently presented a strong argument for the association of PSR J2229+6114 with 3EG J2227+6122, which has a relatively soft index of 2.24 ± 0.14. Out of 19 sources in Table 1, 12 have the same classification under the two schemes; in Table 3, 6 out of 7 have the same classification. This confirms, for these sources, that the schemes are statistically correlated and that it is safe to consider both indices to smooth out any particular problem with singular sources, as apparently is the case for 3EG J0631+0642 and 3EG J0634+0521. Fig. 9 presents a comparison between the variability indices for those 3EG sources superposed to SNRs and the set of ‘A’ AGNs identified in the same catalog. The distributions are quite different, showing a more non-variable population in the case of the sources herein investigated. The most variable sources in the set are identified for reference.

9. Observations and data analysis

9.1. CO data

CO is a polar molecule with strong dipole rotational emission in the mm waveband, and is considered a reliable tracer of molecular hydrogen, H$_2$, which, though much more abundant, has only a weak quadrupole signature. Throughout this paper, molecular gas masses are derived from observations of the $J = 1$–$0$ rotational transition of CO at 115 GHz, assuming a proportionality between velocity-integrated CO intensity, $W_{\rm CO}$, and molecular hydrogen column density $N({\rm H_2})$. Specifically,
Fig. 9. Comparison of the variability indices of the 3EG sources superposed with SNRs and identified 3EG AGNs. Both panels show the number of sources as a function of the variability index $I$. Left panel (3EG-SNRs): mean $I$-value = 2.1, with 26% of the sources having $I > 2.5$; the most variable sources are 3EG J1856+0114, 3EG J1734−3232, 3EG J0542+2610, 3EG J1824−1514 and 3EG J1837−0423. Right panel (3EG AGNs): mean $I$-value = 3.3, with 53% of the sources having $I > 2.5$.
we adopt $N({\rm H_2})/W_{\rm CO} = 1.8 \times 10^{20}$ cm$^{-2}$ K$^{-1}$ km$^{-1}$ s, the value derived by Dame et al. (2001) from an intercomparison of large-scale far-infrared, 21 cm, and CO surveys. To estimate total nucleon densities in molecular clouds, we account for elements heavier than hydrogen by assuming a mean molecular weight per H$_2$ molecule of 2.76. All CO(1–0) data presented here are from the whole-Galaxy survey of Dame et al. (2001). This survey is a composite of 37 separate CO surveys carried out over the past 20 years with two nearly identical 1.2 m telescopes, one located at the Harvard-Smithsonian Center for Astrophysics in Cambridge, Massachusetts, and the other at the Cerro Tololo Interamerican Observatory in Chile. The angular resolution of the composite survey is ∼ 8.5 arcmin and the velocity resolution ∼ 1 km s$^{-1}$. Many of the 37 separate surveys have been published previously and, where appropriate, the earlier papers are cited instead of, or in addition to, Dame et al. (2001).
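The conversion from CO intensity to molecular mass can be sketched as below. The conversion factor and the mean molecular weight are those adopted in the text; the cloud parameters in the usage lines, the physical constants and the function names are hypothetical choices for illustration.

```python
X_CO = 1.8e20        # N(H2)/W_CO in cm^-2 (K km s^-1)^-1, as adopted in the text
MU_H2 = 2.76         # mean molecular weight per H2 molecule, as adopted in the text
M_H_G = 1.6735e-24   # mass of a hydrogen atom in g
M_SUN_G = 1.989e33   # solar mass in g
KPC_CM = 3.0857e21   # one kiloparsec in cm

def h2_column_density(w_co):
    """N(H2) in cm^-2 from the velocity-integrated CO intensity (K km/s)."""
    return X_CO * w_co

def cloud_mass_msun(w_co_mean, solid_angle_sr, d_kpc):
    """Total gas mass (solar masses) of a cloud subtending solid_angle_sr at d_kpc,
    with mean integrated intensity w_co_mean over that solid angle."""
    area_cm2 = solid_angle_sr * (d_kpc * KPC_CM) ** 2
    return MU_H2 * M_H_G * h2_column_density(w_co_mean) * area_cm2 / M_SUN_G

# Hypothetical cloud: <W_CO> = 20 K km/s over 1e-4 sr at 2 kpc
mass = cloud_mass_msun(20.0, 1.0e-4, 2.0)
print(f"N(H2) = {h2_column_density(20.0):.2e} cm^-2,  M ~ {mass:.2e} Msun")
```

The mass scales linearly with $W_{\rm CO}$ and quadratically with the assumed distance, so distance uncertainties dominate the error budget for most clouds.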
In a few cases, observations of the CO(J = 2−1) transition at the same angular resolution, obtained with the Tokyo-NRO 60 cm Survey Telescope (Sakamoto et al., 1995), are used to search for enhancements of the CO(J = 2−1)/CO(J = 1−0) ratio as an indicator of SNR-molecular cloud interaction (e.g., Seta et al., 1998).

9.2. Radio continuum data and diffuse background filtering

In general, the diffuse radio emission of the Galaxy hampers the detection of weak and extended sources with low surface brightness, like SNRs. This large-scale diffuse emission can be removed through different techniques for astronomical data analysis. These techniques can range from detailed modeling of the non-thermal emission in the Galaxy to model-independent filtering algorithms. In this work, many of the radio continuum maps have been cleaned of background diffuse contamination using the method originally introduced by Sofue and Reich (1979). Basically, the technique consists of convolving the continuum map with a Gaussian filtering beam, producing a new map with a different brightness temperature, $T_0^1$. A new temperature distribution is computed as $T_1^0 = T - \Delta T^1$ for $\Delta T^1 > 0$ and $T_1^0 = T$ for $\Delta T^1 < 0$. In these expressions, $T$ is the temperature distribution of the original map, and $\Delta T^1 = T - T_0^1$ are the residuals between the original and the convolved maps. The procedure is repeated, now convolving the $T_1^0$ map in order to obtain $T_0^2$, $\Delta T^2$ and finally $T_2^0$. After $n$ iterations, the difference $|T_0^n - T_0^{n-1}|$ becomes smaller than the rms noise, and a map of residuals $\Delta T^n = T - T_0^n$ is obtained where all diffuse emission with size scales larger than the original filtering beam has been removed. The result is completely independent of the original mechanism that produced the large-scale emission. In the following sections we apply this technique to Effelsberg 100 m-telescope data and to the MOST Galactic plane survey in order to get images of the SNRs as clean as possible at low galactic latitudes.
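The iterative background filtering described above can be illustrated with a one-dimensional sketch. It is a simplified stand-in under stated assumptions, not the authors' implementation: it works on a 1D profile rather than a 2D map, repeats edge values at the boundaries, and uses hypothetical function names. At each step the working profile is smoothed with a Gaussian, pixels where the original exceeds the smoothed value are replaced by the smoothed value, and the loop stops once successive smoothed profiles differ by less than the rms noise.

```python
import math


def gaussian_kernel(sigma_pix):
    """Normalized Gaussian weights, truncated at 4 sigma."""
    half = int(math.ceil(4 * sigma_pix))
    weights = [math.exp(-0.5 * (k / sigma_pix) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile, kernel):
    """Convolve a 1D profile with the kernel, repeating the edge values."""
    half = len(kernel) // 2
    n = len(profile)
    out = []
    for i in range(n):
        acc = 0.0
        for k, wk in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)
            acc += wk * profile[j]
        out.append(acc)
    return out


def filter_background(t_map, sigma_pix, rms, max_iter=200):
    """Return (residual, background) for a 1D brightness-temperature profile."""
    kernel = gaussian_kernel(sigma_pix)
    work = list(t_map)
    previous = None
    background = work
    for _ in range(max_iter):
        background = smooth(work, kernel)
        # Positive residuals are removed: the working profile keeps the original
        # value only where the original lies below the smoothed one.
        work = [min(t, b) for t, b in zip(t_map, background)]
        if previous is not None:
            change = max(abs(b - p) for b, p in zip(background, previous))
            if change < rms:
                break
        previous = background
    residual = [t - b for t, b in zip(t_map, background)]
    return residual, background


if __name__ == "__main__":
    # Synthetic test profile: flat 10 K background plus a narrow 5 K source
    profile = [10.0 + 5.0 * math.exp(-0.5 * ((i - 100) / 3.0) ** 2) for i in range(200)]
    residual, background = filter_background(profile, sigma_pix=15.0, rms=0.01)
    print(max(residual))
```

The stopping threshold matters in practice: structures broader than the filtering beam are eroded slowly, so iterating far beyond the noise level would eventually start removing them as well.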
10. Case by case analysis

10.1. γ-ray source 3EG J0542+2610 – SNR G180.0−1.7

The possible counterparts of this γ-ray source were explored in detail in a recent paper (Romero et al., 2001). No known radio pulsar coincides with this source (see Table 3). Additionally, a radio-quiet Geminga-like pulsar origin is disfavored a priori because of the high variability and the steep spectral index that this source presents (Γ = 2.67 ± 0.22, Hartman et al., 1999). We have searched the EGRET location error box for other compact radio sources. Although 29 point-like radio sources were detected, none of them is strong enough to be considered a likely counterpart (Romero et al., 2001). The strongest of the sources detected have a radio flux one order of magnitude lower than those of known γ-ray blazars detected by EGRET. Moreover, the absence of an X-ray counterpart to this source suggests that it is not an accreting source, like a microquasar. Some of us suggested that the only object within the 95% error box capable of producing the required γ-ray flux is the X-ray transient A0535+26. This Be/accreting pulsar, not detected at all in the radio band, can produce variable hadronic γ-ray emission through the mechanism originally proposed by Cheng and Ruderman (1989, 1991). See Romero et al. (2001) for further details.
On the basis of results discussed in that paper we conclude that 3EG J0542+2610 and G180.0−1.7 are most likely unrelated. An interesting comparison between this case and one of the EGRET sources coincident with the Monoceros Loop is made below.

10.2. γ-ray source 3EG J0617+2238 – SNR G189.1+3.0 (IC443)

A detailed description of SNR IC443 was given by Chevalier (1999). We have recently reviewed the spatially-resolved multiwavelength spectrum of IC443 and argued that the morphology and spectrum of the γ-ray emission make it a likely hadronic cosmic-ray accelerator (Butt et al., 2002b). Seta et al. (1998) have provided an analysis of the CO environment of this remnant. They concluded, as was previously reported by Scoville et al. (1977) and Cornett et al. (1977), that IC443 is interacting with several ambient molecular clouds with a total mass of about $10^4\,M_\odot$. They also analyzed the ratio R = CO(J = 2−1)/(J = 1−0) in the environs of IC443 and concluded that parts of the clouds presented an abnormally high value, consistent with shock interaction. The detected value of R exceeds 3 in some regions (the average galactic value is ∼0.6). Interestingly, the peak of the CO(J = 2−1)/(J = 1−0) ratio is coincident with the location of the pulsar wind nebula newly discovered by Olbert et al. (2001), which may indicate an alternate way of exciting molecular gas. Dickman et al. (1992) have estimated that the total perturbed molecular gas has a mass of 500–2000 $M_\odot$ (Fig. 10).

In recent years, X-ray observations of IC443 have been carried out. IC443 was a target for X-ray observations with HEAO 1 (Petre et al., 1998), Ginga (Wang et al., 1992), ROSAT (Asaoka and Aschenbach, 1994), ASCA (Keohane et al., 1997), and more recently, with Chandra and BeppoSAX; we discuss the latter in more detail below.
IC443 was believed to be mostly thermal in the X-ray band (Petre et al., 1998; Asaoka and Aschenbach, 1994), although hard X-ray emission has also been detected (Wang et al., 1992). Keohane et al. (1997) later found that the hard X-ray emission was localized and non-thermal. They concluded that most of the 2–10 keV photons came from an isolated emitting feature and from the south-east elongated ridge of hard emission. Even more recently, Preite-Martinez et al. (2000) and Bocchino and Bykov (2000) reported a hard component detected with the Phoswich Detector System (PDS) on BeppoSAX, and two compact X-ray sources, corresponding to the ASCA sources, detected with the BeppoSAX Medium-Energy Concentrator Spectrometer (MECS): 1SAX J0617.1+2221 and 1SAX J0618.0+2227. 1SAX J0617.1+2221 has also been observed with the Chandra satellite by Olbert et al. (2001), who also obtained VLA observations at 1.46, 4.86 and 8.46 GHz and a polarization measurement. The hard radio spectral index, the amount of polarization, and the overall X-ray and radio morphology led them to suggest that the source is a plerion nebula containing a point source whose characteristic cometary shape is due to supersonic motion of the neutron star. Bocchino and Bykov (2001) have, in addition, recently observed IC443 with the XMM-Newton Observatory (see Fig. 11). They resolved the structure of the nebula into a compact core with a hard spectrum of photon index $\gamma = 1.63^{+0.11}_{-0.10}$ in the 2–10 keV energy range, and found that the nebula also has an extended (∼8 arcmin × 5 arcmin) X-ray halo, much larger than the radio emission extension. The photon index softens with distance from the centroid, a behavior also found in other X-ray plerions such as 3C58 and G21.5−0.9. Bocchino and Bykov (2001) also looked for periodic signals from the NS but found none at the 99% confidence level in the $10^{-4}$–6.5 Hz range. Assuming that $L_X/\dot{E} \sim 0.002$, the spin-down
Fig. 10. CO distribution around the remnant IC443 (G189.1+3.0). The 3EG γ-ray source J0617+2238 is superposed. Note the positional coincidence of the contours of the latter with part of the densest regions of the CO distribution. The optical boundary of the SNR is superposed as a black contour. The optical emission seems to fade in regions where CO emission increases; this indicates that the molecular material is likely located on the foreground side of the remnant, absorbing the optical radiation. Optical contours are from Lasker et al. (1990).
[Figure 11: X-ray image with declination (22d15m00s to 22d35m00s) and right ascension (6h16m30s to 6h18m00s) axes; the EGRET 95% contour is labeled.]
Fig. 11. Hard energy band (3–10 keV) of the IC443 nebula. Most of the thermal emission associated with IC443 is not present in this band. The image shows several point sources, besides the plerion nebula itself. The nebula can be represented by an ellipse of 8 arcmin × 5 arcmin. The 95% confidence EGRET error circle for 3EG J0617+2238 is also shown. From Bocchino and Bykov (2001).
luminosity of the central object results in $\dot{E} = 1.3 \times 10^{36}$ erg s$^{-1}$. If the power-law spectrum of the nebula core region were extrapolated up to the GeV regime, it would provide a flux of 2.0 (0.4–15.2) $\times 10^{-7}$ ph cm$^{-2}$ s$^{-1}$, which is consistent with the EGRET flux from the IC443 region (see Table 1). However, the fact that the nebula lies outside the 95% confidence circle of the source argues against an association. Further, the EGRET source luminosity would require a substantial fraction of the estimated spin-down power.

The H$_2$ mass near SNR IC443 that we report here, $1.1 \times 10^4\,M_\odot$, is the total molecular mass in the IC443 velocity range, $v = -40$ to $+20$ km s$^{-1}$, within a rectangle enclosing the main clump coincident with the SNR: $l = 188.75$ to 189.5, $b = 2.5$ to 3.375 (see Fig. 10). The average density of that region is about 840 nucleons cm$^{-3}$. Assuming that the energy of the explosion was $E_{51} = 0.27$, and an unshocked density of 0.21 cm$^{-3}$ (from a 2D dynamic model by Hnatyk and Petruk, 1998), we found that the hadronic flux would be $4.7 \times 10^{-6}$ photons cm$^{-2}$ s$^{-1}$. Indeed, just 10% of the ambient mass is necessary to produce the observed flux of 3EG J0617+2238. The energy of the explosion and the unshocked density yield a CR enhancement factor $k_s \sim 600$ within the SNR, which appears to be unusually high. However, computation of the energy transformed into cosmic rays by the direct product $k_s\,\omega_{\rm CR}\,(4/3)\pi R^3$ gives $0.4E_{51}$, a value compatible with that obtained for the efficiency of SN energy conversion to CRs in the Morfill et al. (1984) prescription discussed above. Alternatively, by direct use of Eq. (13), and using the mass considered above and the observed γ-ray flux, we can obtain an estimate of the enhancement factor $k_s$ of 66. The difference between the CR enhancement factor of the cloud ($k_s = 66$) and the SNR ($k_s = 600$) could be explained in a variety of ways: most simply, that the CR enhancement predicted by the Morfill et al.
(1984) prescription for SNRs overestimates the value within the adjacent cloud. Particularly if the cloud abuts the remnant, its enhancement will naturally be smaller than that calculated for the SNR interior. As Fig. 10 shows, the optical emission seems to fade in regions where CO emission increases, which perhaps indicates that the molecular material is absorbing the optical radiation, abutting the remnant on the near side. The report by Cornett et al. (1977) also argues that the molecular mass is located between us and the SNR. In any case, we remark that the previous estimations of the CR enhancement factors assume that the explosion proceeds in a homogeneous medium, something which we know is not true in this case (Chevalier, 1999). We note that an electronic Bremsstrahlung hypothesis for the origin of the GeV flux (e.g. Bykov et al., 2000) is difficult to reconcile with the fact that the radio synchrotron emission is concentrated towards the rims of the remnant, whereas the GeV source is centrally located (Fig. 10). It is clear that IC443 will continue to be a primary target for future satellite missions and telescopes. A better localization of the EGRET sources by AGILE or GLAST, as well as the already approved INTEGRAL observations, could help much in determining the ultimate nature of 3EG J0617+2238. The reader is referred to Butt et al. (2002b) for further analysis of the likely hadronic origin of the γ-ray emission.

10.3. γ-ray sources 3EG J0631+0642 and 3EG J0634+0521 – SNR G205.5+0.5 (Monoceros nebula)

The large SNR G205.5+0.5 (Monoceros Loop nebula, 220 arcmin in size) has been thoroughly studied in the past. Various papers have proposed that the Monoceros Loop SNR is interacting with the Rosette Nebula (e.g., Odegard, 1986). A recent study of the stars in NGC 2244, the cluster
Fig. 12. Left: CO contours (white) on a background image from the Digital Sky Survey. The map covers a somewhat larger region than the right panel, and shows the HII region and young cluster NGC 2244 producing a hole in the cloud. Most of the Monoceros Loop is also seen faintly. EGRET source contours are marked in black. Right: CO emission plus contours (black) of 1.4 GHz emission to mark the Rosette Nebula. The peak of the CO emission is near 3EG J0634+0521. The position of the X-ray source SAX J0635+0533 is marked with a star. White lines are the confidence levels of the EGRET sources.
within the Rosette, finds a distance of 1.39 ± 0.1 kpc (Hensberge et al., 2000); we assume this distance in the following computations. In the left panel of Fig. 12, we show an overlay of the CO contours on an image from the Digital Sky Survey (Lasker et al., 1990). It shows very nicely the HII region and young cluster apparently carving a hole in the cloud. Most of the Monoceros Loop is also seen very faintly. This justifies the distance adopted, under the assumption that the Nebula and the Monoceros Loop are equally distant from Earth.

Bloemen et al. (1997) presented COMPTEL observations of the Monoceros region, and found excess 3–7 MeV emission which they attributed to nuclear deexcitation lines at 4.44 and 6.13 MeV from accelerated $^{12}$C and $^{16}$O nuclei. Monoceros was already suggested by Esposito et al. (1996) as a source of γ-rays, and it was also mentioned by Sturner and Dermer (1995) and Sturner et al. (1996) as a possible case of γ-ray production by hadronic interactions. The age of the remnant is not well determined: $(3$–$20) \times 10^4$ yr. A study of Einstein IPC data (Leahy et al., 1986) shows diffuse X-ray thermal emission in a region corresponding to the detection of optical filaments (Odegard, 1986). This is only possible if the gas is sufficiently hot, and thus if the remnant is sufficiently young. This would be in contradiction with the age one obtains from the homogeneous Sedov solutions, which would give an age in excess of 100,000 yr. One direct interpretation of this discrepancy (Leahy et al., 1986) is that the expansion of the remnant proceeds in a non-homogeneous multi-component medium, where the homogeneous Sedov solutions are not valid.
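As a rough illustration of the homogeneous Sedov age mentioned above, the sketch below converts the quoted angular size (220 arcmin) and distance (1.39 kpc) into a physical radius and evaluates the standard Sedov-Taylor relation $R = \xi (E t^2/\rho)^{1/5}$. The explosion energy ($E_{51} = 1$), the ambient density (1 cm$^{-3}$), the mean mass per particle factor (1.4) and the constant $\xi = 1.15$ are assumptions chosen for illustration and are not values taken from the text; with them the age comes out at roughly $2 \times 10^5$ yr.

```python
import math

PC_CM = 3.0857e18   # cm per parsec (assumed constant)
YR_S = 3.156e7      # seconds per year (assumed constant)
M_H = 1.6735e-24    # g, hydrogen atom mass (assumed constant)
XI = 1.15           # Sedov-Taylor constant for an adiabatic index of 5/3


def physical_radius_pc(angular_diameter_arcmin, distance_kpc):
    """Radius in pc from an angular diameter (small-angle approximation)."""
    theta = math.radians(angular_diameter_arcmin / 60.0)
    return 0.5 * theta * distance_kpc * 1.0e3


def sedov_age_yr(radius_pc, e51=1.0, n_cm3=1.0, mu=1.4):
    """Age in yr of a Sedov-phase remnant in a homogeneous medium.

    e51, n_cm3 and mu are illustrative assumptions, not values from the text.
    """
    rho = mu * M_H * n_cm3
    radius_cm = radius_pc * PC_CM
    t_sec = math.sqrt(rho * radius_cm ** 5 / (XI ** 5 * e51 * 1.0e51))
    return t_sec / YR_S


if __name__ == "__main__":
    r_pc = physical_radius_pc(220.0, 1.39)
    print(r_pc, sedov_age_yr(r_pc))
```

The age scales as $R^{5/2} (\rho/E)^{1/2}$, so the result is far more sensitive to the adopted radius than to the assumed density or energy.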
A large region covering the spatial extent of both 3EG sources was studied by Jaffe et al. (1997): (198 < l < 214, and −6 < b < 8). They presented an image reconstruction of the region around the Rosette Nebula and Monoceros using high-energy (> 100 MeV) γ-ray data from EGRET. The resulting image showed a 7σ extended feature in excess of the expected diffuse emission located at the point-source position listed in the EGRET catalog (2EG at that time). These authors proposed that this excess could be evidence of an interaction between the Monoceros remnant and the Rosette Nebula. They concluded that if the γ-ray emission arises solely in the interaction between the two nebulae then the cosmic-ray enhancement would be around $k_s = 300$. This value appears to be excessively high in this case, should the enhancement be the same for all the SNR region. The energy in cosmic rays, computed using $k_s\,\omega_{\rm CR}\,(4/3)\pi R^3$ together with the size of the Monoceros remnant (∼60 pc), implies an energy of the explosion about one order of magnitude larger than the assumed $E_{51} \sim 1$. This may indicate that hadronic γ-ray production in the interaction of the Monoceros SNR and the Rosette Nebula (i.e. 3EG J0634+0521) cannot be responsible for the entire observed flux. Indeed, within the 95% contour of 3EG J0634+0521 there exists the X-ray source SAX J0635+0533, a Be-star/neutron-star X-ray binary pulsar, probably with a relatively short orbital period (Kaaret et al., 1999; Cusumano et al., 2000; Nicastro et al., 2000). The hard X-ray source SAX J0635+0533 shows pulsations at a period of 33.8 ms but no radio flux was detected at the Be-star position (see below). SAX J0635+0533 might be, as in the case of A0535+26 mentioned above, a source of γ-rays through hadronic processes. Kaaret et al. (1999) suggested that SAX J0635+0533 and 3EG J0634+0521 are related.
One fact favoring this physical association is that the SAX satellite has not detected extended emission in the region of 3EG J0634+0521, as would be the case if the bulk of the radiation were produced in a SNR shock. Additionally, the probability for chance positional coincidence between a Be/X-ray binary and an EGRET source is less than 4% (Kaaret et al., 1999). The situation, however, is far from resolved. The EGRET source, for instance, is non-variable, whereas variability would be expected for a binary with an eccentric orbit. Recent results reported by Kaaret et al. (2000), comparing observations obtained with BeppoSAX and RXTE separated in time by 2 years, showed that the period derivative of the pulsar has a lower bound equal to $3.8 \times 10^{-13}$. This value is 30 times larger than values found for accreting neutron stars (Bildsten, 1997), and it implies a mass accretion rate of $6 \times 10^{-7}\,M_\odot$ yr$^{-1}$ (Kaaret et al., 2000; Bildsten, 1997), which far exceeds the expected mass capture rate of a neutron star, ∼$10^{-11}\,M_\odot$ yr$^{-1}$, and is even slightly larger than the average mass loss rate of Be-stars (Kaaret et al., 2000). Apparently, this would indicate that the X-ray luminosity does not originate in the accretion disk, and argues in favor of SAX J0635+0533 being a rotation-powered pulsar. In this case, the value of $\dot{P}$ would imply a characteristic age of only 1400 years and a high spin-down luminosity of $5 \times 10^{38}$ erg s$^{-1}$, out of which less than 0.05% could make a noticeable contribution to the observed γ-ray flux (assuming a distance of 4 kpc, Kaaret et al., 1999). Additionally, arguing against an accretion origin of the radiation, the derived X-ray luminosity ($7.7 \times 10^{34}\,(d_{\rm kpc}/4\ {\rm kpc})^2$ erg s$^{-1}$) and magnetic field strength (∼$10^9$ G) are too low in comparison to other Be-X-ray binaries such as A0535+26 (Cusumano et al., 2000).

Very recently, Monoceros was the target of the HEGRA Čerenkov telescopes (see Lucarelli et al., 2001).
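The rotation-powered interpretation of SAX J0635+0533 can be checked with the standard pulsar relations $\tau = P/(2\dot{P})$ and $\dot{E} = 4\pi^2 I \dot{P}/P^3$. The sketch below is a minimal illustration with hypothetical function names; the moment of inertia of $10^{45}$ g cm$^2$ is a conventional assumption that the text does not state, so the luminosity agrees with the quoted value only to within a factor of order unity.

```python
import math

YR_S = 3.156e7  # seconds per year (assumed constant)


def characteristic_age_yr(period_s, pdot):
    """Characteristic age P / (2 Pdot), in years."""
    return period_s / (2.0 * pdot) / YR_S


def spin_down_luminosity(period_s, pdot, inertia=1.0e45):
    """Spin-down luminosity in erg/s; inertia in g cm^2 is an assumed value."""
    return 4.0 * math.pi ** 2 * inertia * pdot / period_s ** 3


if __name__ == "__main__":
    period = 33.8e-3   # s, pulsation period quoted in the text
    pdot = 3.8e-13     # lower bound on the period derivative quoted in the text
    print(characteristic_age_yr(period, pdot))   # about 1400 yr
    print(spin_down_luminosity(period, pdot))    # a few times 1e38 erg/s
```

Since the quoted period derivative is a lower bound, the age obtained this way is an upper limit and the spin-down luminosity a lower limit.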
HEGRA observed the Monoceros–Rosette region for about 120 h, with an energy threshold of 500 GeV and an angular resolution of 0.1 deg, and mapped a 2 × 2 deg$^2$ region centered on the source SAX J0635+0533. The EGRET source 3EG J0634+0521 is also within the field of view.
Although the flux and spectrum have not yet been officially reported, HEGRA found a tentative excess of counts in four different pixels (0.2 × 0.2 deg$^2$) within the 3EG contours; interestingly, none of them coincides with SAX J0635+0533 (Lucarelli et al., 2001). It is possible that TeV emission coming from the binary is being re-absorbed in its neighborhood, as in the case studied by Romero et al. (2001). What the HEGRA observations seem to imply is that the marginally significant TeV radiation has an extended origin, different from that producing GeV photons in the binary system SAX J0635+0533, but this remains to be confirmed.

In the right panel of Fig. 12 we show the CO emission plus contours of 1.4 GHz continuum emission marking the Rosette. Using the standard CO-to-H$_2$ mass conversion and a mean molecular weight per H$_2$ molecule of 2.76, we calculate a total H$_2$ mass for the associated cloud (in the region $l = 205$ to 209, $b = -3$ to $-1$, and velocity range $v = -5$ to 30 km s$^{-1}$) of $1.2 \times 10^5\,M_\odot$. We have calculated the molecular masses in small rectangles enclosing the two 3EG sources near the Rosette. Using a distance of 1.39 kpc we get the following results. In the case of 3EG J0634+0521 the region considered is $l = 205.5$ to 206.875 and $b = -2.125$ to $-0.375$, and the mass is $2.0 \times 10^4\,M_\odot$. For 3EG J0631+0642, in the region $l = 204.25$ to 205.5 and $b = -2.125$ to $-0.75$, the mass is $4.7 \times 10^4\,M_\odot$. In both cases, the velocity range considered is $v = -5$ to 30 km s$^{-1}$ and the formal error on these masses, based on the instrumental noise, is ∼$0.2 \times 10^3\,M_\odot$. Because of uncertainties on the nature of this SNR (for instance, the controversy on the SNR age), Morfill et al.'s method is unreliable for estimating the SNR GeV flux. However, using Eq. (13) directly with the observed flux (Table 1), we can estimate the value of $k_s$ needed to generate the observed flux, resulting in $k_s \sim 6.5$ for 3EG J0634+0521.
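The inversion for $k_s$ can be sketched as below. Eq. (13) is not reproduced in this section, so the code assumes a commonly used scaling for hadronic emission above 100 MeV, $F \approx 2.4 \times 10^{-9}\,(M/10^3 M_\odot)\,(d/{\rm kpc})^{-2}\,k_s$ photons cm$^{-2}$ s$^{-1}$; both the functional form and the coefficient are assumptions and should be replaced by those of Eq. (13). The function names are hypothetical.

```python
COEFF = 2.4e-9  # photons cm^-2 s^-1; ASSUMED normalization, stands in for Eq. (13)


def hadronic_flux(mass_msun, distance_kpc, k_s, coeff=COEFF):
    """Flux above 100 MeV from a cloud of given mass under the assumed scaling."""
    return coeff * (mass_msun / 1.0e3) * k_s / distance_kpc ** 2


def required_enhancement(flux, mass_msun, distance_kpc, coeff=COEFF):
    """CR enhancement factor k_s needed to reproduce a given flux."""
    return flux / hadronic_flux(mass_msun, distance_kpc, 1.0, coeff)


if __name__ == "__main__":
    # Mass, distance and k_s quoted in the text for 3EG J0634+0521
    flux = hadronic_flux(2.0e4, 1.39, 6.5)
    print(flux)
    print(required_enhancement(flux, 2.0e4, 1.39))
```

Under this assumed normalization, the quoted $k_s \sim 6.5$, mass of $2.0 \times 10^4\,M_\odot$ and distance of 1.39 kpc correspond to a flux of about $1.6 \times 10^{-7}$ photons cm$^{-2}$ s$^{-1}$. Because the flux is linear in the product $k_s M$, a larger cloud mass lowers the required enhancement in proportion.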
Because of the high molecular density, just a modest enhancement of the cosmic-ray density can explain a substantial part of the detected γ-ray flux. We suggest, then, that 3EG J0634+0521 might be a composite source: SAX J0635+0533 might be responsible for part of the GeV flux, as well as the bulk of the emission at X-ray energies. The interacting SNR and Rosette Nebula might also contribute to the flux in the GeV range, and would provide the bulk of the possibly detected TeV emission from the region. One direct way to test this scenario would be through an analysis of the spectrum from GeV to TeV. In the case of a composite source, there should be a break in the spectrum between GeV and TeV energies, the latter corresponding only to the accelerated particles in the SNR. In the case of 3EG J0631+0642, a CR enhancement value of just $k_s \sim 3$ can explain the observed GeV flux. New high-sensitivity radio measurements of the region would be of great value in determining the relative importance of hadronic and leptonic γ-ray emission.

10.4. γ-ray source 3EG J1013−5915 – SNR G284.3−1.8 (MSH 10−53)

For this 3EG source, a natural candidate to generate a significant part of the γ-ray emission seems to be the recently discovered pulsar PSR J1013−5915; see Table 3 (Camilo et al., 2001). This pulsar has a characteristic age of τ = 21 kyr and a spin-down luminosity of $\dot{E} = 2.6 \times 10^{36}$ erg s$^{-1}$. If only the pulsar is considered, the efficiency required for converting spin-down luminosity into γ-rays is ∼0.5% (Camilo et al., 2001), well within the range of efficiencies for previously confirmed γ-ray pulsars detected by EGRET. Another Parkes pulsar, PSR J1013−5934, is also coincident with 3EG J1013−5915, but it can be ruled out on the basis of energetic arguments. So also the Princeton pulsar PSR J1012−5857
(Table 3), which requires an efficiency of about 100% in the generation of the γ-ray emission from the spin-down losses. It is also interesting to note that 3EG J1013−5915 has a photon spectral index softer than typical for pulsars: Γ = 2.32 ± 0.13.

The remnant is located in the near side of the Carina spiral arm, in a region with a high density of molecular clouds. Ruiz and May (1986) found filamentary optical emission associated with the remnant. These authors also found clear evidence of at least three small CO clouds interacting with G284.3−1.8. Other small clouds could have been disrupted by the supernova blast wave and are now forming the shell. Their CO(J = 1−0) mm line data show sudden changes with position in radial velocity, and the presence of broad asymmetric lines with peak-shoulder profiles, both of which indicate a shock wave disruption of the CO clouds.

Since Ruiz and May (1986) gave only an upper limit for the mass of the shell, we have re-analyzed the gas content for this region. As the longitude-velocity map in Fig. 13a shows, nearly all CO emission in the general direction of the 3EG source and the SNR G284.3−1.8 lies in the velocity range −22 to 3 km s$^{-1}$. A CO map integrated over this range is shown in Fig. 13b. The mean velocity of the emission is ∼ −9 km s$^{-1}$, which is consistent with the terminal velocity in this direction, suggesting a distance of approximately 2.1 kpc. However, since radial velocity changes very slowly with distance in this direction, the uncertainty on the kinematic distance is large, approximately ±1 kpc. We adopt a distance of 2.9 kpc, the value inferred by Ruiz and May (1986) based on optical observations of the SNR filaments, the Σ–D distance, as well as the CO kinematics. There is no CO detected at any velocity toward the nominal center of the 3EG source ($l = 283.93$, $b = -2.34$), although the total molecular mass within the 95% confidence radius of the 3EG source (dotted circle in Fig. 13b) is $5.9 \times 10^4\,M_\odot$.
Most of this mass does not coincide with the SNR, which is completely included within the 3EG source. Since this emission does not form a single well-defined cloud, it is possible that it arises from gas spread over quite a large distance along the line of sight, perhaps 1–2 kpc. If so, the gas density near the SNR could be quite low. Our study therefore reinforces the idea that it is most likely the pulsar, and not the hadronic or Bremsstrahlung emission from the SNR neighborhood, that is responsible for 3EG J1013−5915.

10.5. γ-ray source 3EG J1102−6103 – SNR G290.1−0.8 (MSH 11–61A)/G289.7−0.3

Sturner and Dermer (1995) proposed that this γ-ray source, in its 2EG J1103−6106 incarnation, may have been related to SNR G291.0−0.1. However, the more precise localization in the 3EG catalog shifted the source's position such that it is no longer superposed with that SNR. Zhang and Cheng (1998) argued against a newly discovered young radio pulsar, PSR J1105−6107 (Kaspi et al., 1997), as the source of the observed high-energy γ-ray emission. Its age (∼$6.3 \times 10^4$ yr) also seems high for a Vela-like pulsar. In addition, the photon spectral index is very soft, Γ = 2.47 ± 0.21, though this in itself does not disqualify a possible pulsar origin, as seen in the case of PSR J2229+6114/3EG J2227+6122 (Halpern et al., 2001b).

The line of sight to 3EG J1102−6103 intersects both the near and far sides of the Carina spiral arm, at velocities near −20 and +20 km s$^{-1}$, respectively. There is a distinct gap in the near side of the Carina arm in the direction of the 3EG source, with almost no molecular gas within ∼1 deg of the source direction (see, e.g. Fig. 2 of Dame et al., 2001). On the other hand, as Fig. 14 shows, there is a very massive molecular complex in the far Carina Arm overlapping the direction of the 3EG source; this complex is No. 13 in the Carina Arm cloud catalog of Grabelsky et al. (1988). There is
[Figure 13: (a) longitude–velocity map, LSR velocity (−30 to 20 km/s) versus Galactic longitude (286 to 282), color scale log I_CO (K-deg) from −1.0 to 0.0, with the direction of G284.3–1.8 marked; (b) spatial map, Galactic latitude (−3 to 0) versus Galactic longitude (286 to 282), color scale W_CO (K km/s) from 10 to 40, with G284.3–1.8 marked by a plus sign.]
Fig. 13. (a) Longitude–velocity map of CO integrated over 1 deg of Galactic latitude roughly centered on the SNR G284.3−1.8 (MSH 10−53), $b = -2.5$ to $-1.5$. The longitude of the SNR is indicated by the dotted vertical line. (b) Spatial map of CO integrated over the velocity range −22 to 3 km s$^{-1}$. The plus sign marks the center position of the SNR G284.3−1.8, whose size is 25 arcmin. The dotted circle is the 95% confidence radius about the position of 3EG J1013−5915 (Hartman et al., 1999). Note that the longitude range (x-axis) of both maps is the same.
little doubt that the two component clouds labeled A and B in Fig. 14 are part of the same complex, since they have approximately the same velocity of 22 km s$^{-1}$, and are connected smoothly by weaker emission, also at the same velocity. Also, the HII regions are evidence of abundant on-going star formation in this molecular complex, which additionally supports the association of the SNR. Assuming a flat rotation curve beyond the solar circle, the kinematic distance of the complex is 8.0 kpc and its total molecular mass is $2.1 \times 10^6\,M_\odot$.
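The kinematic distance quoted above can be reproduced approximately with a flat rotation curve, for which $v_{\rm LSR} = \Theta_0\,(R_0/R - 1)\,\sin l\,\cos b$. The sketch below is illustrative only: the Galactic constants $R_0 = 8.5$ kpc and $\Theta_0 = 220$ km s$^{-1}$ are assumptions (the text does not state the values used), and the function name is hypothetical. For $l = 290$ and $v = 22$ km s$^{-1}$ it gives about 8.1 kpc, close to the quoted 8.0 kpc.

```python
import math


def outer_galaxy_kinematic_distance(l_deg, v_lsr, b_deg=0.0, r0=8.5, theta0=220.0):
    """Return (distance_kpc, galactocentric_radius_kpc) for a flat rotation curve.

    Intended for gas beyond the solar circle, where the distance is unique.
    r0 (kpc) and theta0 (km/s) are assumed Galactic constants.
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    # Invert v_lsr = theta0 * (r0 / R - 1) * sin(l) * cos(b) for R
    ratio = 1.0 + v_lsr / (theta0 * math.sin(l) * math.cos(b))
    if ratio <= 0.0:
        raise ValueError("velocity not allowed by the flat rotation curve")
    r_gal = r0 / ratio
    disc = r_gal ** 2 - (r0 * math.sin(l)) ** 2
    if disc < 0.0:
        raise ValueError("no line-of-sight solution for this velocity")
    d_plane = r0 * math.cos(l) + math.sqrt(disc)
    return d_plane / math.cos(b), r_gal


if __name__ == "__main__":
    print(outer_galaxy_kinematic_distance(290.0, 22.0))
```

A positive velocity in the fourth quadrant places the gas outside the solar circle, which is why only one root of the quadratic is physical here.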
[Figure 14: CO map, Galactic latitude (−2 to 1) versus Galactic longitude (291 to 288), color scale W_CO (K km/s) from 10 to 40; labeled features: G290.1–0.8, clouds A and B, and two HII regions.]
Fig. 14. CO integrated over the velocity range of the far Carina arm, 0–45 km s$^{-1}$. Contours: 843 MHz continuum from the MOST Galactic plane survey (Green, 1997); the survey has been smoothed to a resolution of 3 arcmin to highlight extended sources. The contour interval is 0.04 Jy/beam, starting at 0.04 Jy/beam. The dotted circle is the 95% confidence radius about the position of 3EG J1102−6103 (Hartman et al., 1999). G290.1−0.8 is the SNR MSH 11–61A (Kirshner and Winkler, 1979). Both HII regions are in the catalog of Georgelin and Georgelin (1970) and have recombination line velocities in rough agreement with that of the complex. The component clouds A and B are discussed in the text.
It is worth noting that the composite CO line profile of cloud B is very broad and complex, suggesting possible interaction with SNR G290.1−0.8. In the case of cloud A, its radius (∼48 pc) and composite linewidth (17 km s$^{-1}$ FWHM) are roughly consistent with the radius-linewidth relation found for large molecular complexes by Dame et al. (1986). For cloud B, however, its linewidth (∼27 km s$^{-1}$) is about a factor of 3 too large compared to its radius (∼28 pc). We can also see that the coinciding SNR G289.7−0.3 is far from Cloud B, in a region of low molecular density. It is extremely unlikely that this SNR is related to the 3EG source in question. The only remaining candidate is, then, G290.1−0.8.

The total molecular mass within the 95% confidence radius of the 3EG source (dotted circle in Fig. 14) is $7.7 \times 10^5\,M_\odot$ and most of it is localized in Cloud B ($4.5 \times 10^5\,M_\odot$). Assuming typical values for the energy of the explosion ($E_{51} = 1$) and the unshocked ambient density ($n = 0.1$ cm$^{-3}$) we obtain a CR enhancement factor of ∼250. Assuming that the same CR enhancement is applicable to the cloud overpredicts the EGRET flux by about a factor of 10. Thus, it is likely that the average CR enhancement factor within the cloud is ten times lower than within the SNR, a reasonable result. It is possible, then, that 3EG J1102−6103 and SNR G290.1−0.8 are indeed related. Note that Bremsstrahlung, which we have neglected here, will contribute still more
to the predicted flux from SNR-cloud interactions. If the outlined scenario is correct, GLAST and AGILE ought to observe a strong, compact γ-ray source coincident with the position of Cloud B.

An alternative, and promising, hypothesis for explaining the high-energy emission is stellar wind collisions, as developed by Eichler and Usov (1993) and Benaglia and Romero (2003). Recently, Contreras et al. (1997) have provided convincing evidence for non-thermal radio emission from the colliding winds region in the stellar system Cygnus OB2 No 5.³ The position of their radio-imaged shocked region is consistent with the inferred location of the contact discontinuity of the wind–wind interaction of the constituent stars. Benaglia et al. (2001) have argued that the source 3EG J2033+4118 could be mainly due to inverse Compton scattering of the stellar photons by the locally accelerated electrons. Similarly, in the present case, 3EG J1102−6103 might be the result of γ-ray production by the stellar winds of the early-type stars WR37, WR38, WR38B and WR39, all located within the 95% confidence contour of the γ-ray source and at 2 kpc from the Sun. In particular, WR39 presents an unusually strong wind with a terminal velocity of about 3600 km s$^{-1}$ (Romero et al., 1999a and references therein). Non-thermal radio emission at the mJy level has been recently detected at ∼3 arcsec from the optical position of the star by Chapman et al. (1999). This emission is a clear indication of the existence of a population of relativistic electrons in the region. Chapman et al. (1999) have suggested that particle acceleration could be occurring at the region where the wind of WR39 collides with the wind of the neighboring Wolf–Rayet star WR38B. This hypothesis is supported by the fact that the synchrotron radiation is located between both stars (2 arcsec from WR38B).
The relativistic electrons should interact with UV photons from the star, producing IC γ-rays that could explain part of the emission of 3EG J1102−6103. Fortunately, the peak of the spectral energy distribution should be in the energy range of IBIS, the Imager on-Board INTEGRAL. Using the model of Benaglia et al. (2001) with a spectral index of −2, we get an integrated flux for the energy interval 100–200 keV of 1.2 × 10⁻⁴ ph s⁻¹ cm⁻². For the entire IBIS energy range (20 keV–10 MeV) the expected value is 1.2 × 10⁻³ ph s⁻¹ cm⁻². In addition to this wind–wind contribution, single stars might also be sources of γ-rays in the IBIS energy band through the IC emission of electrons locally accelerated in shocks at the base of the winds. These shocks are produced by line-driven instabilities (see Chen and White, 1991, for a discussion of γ-ray emission from single stars; Benaglia et al., 2001, for a particular application). Consequently, further study of the possible association of 3EG J1102−6103 with the SNR G290.1−0.8 (MSH 11–61A) is of the utmost importance, since there are at least two scenarios (aside from the pulsar possibility) that might well contribute to the observed γ-ray flux.
Footnote 3: The HEGRA Cherenkov telescope array group recently reported a steady and extended unidentified TeV γ-ray source lying at the outskirts of Cygnus OB2, the most massive stellar association known in the Galaxy, estimated to contain 2600 OB-type members alone (Aharonian et al., 2002). Butt et al. (2003) reported on near-simultaneous follow-up observations of the extended TeV source region with the Chandra X-ray Observatory and the Very Large Array (VLA) radio telescope. The broadband spectrum of the TeV source region favors a predominantly nucleonic rather than electronic origin of the high-energy flux, possibly in a way similar to that proposed by Romero and Torres (2003) in the case of NGC 253.
10.6. γ-ray source 3EG J1410−6147 – SNR G312.4−0.4
Although there is no correlated Princeton pulsar within the contours of 3EG J1410−6147, there are two Parkes pulsars near the γ-ray source (see Table 6 above). Both of them require an unreasonable efficiency, at their dispersion-measure distances, to explain the observed γ-ray flux. Both pulsars are located (at least in projection) well within the boundaries of the incomplete shell of SNR G312.4−0.4 (Caswell and Barnes, 1985), for which Yadigaroglu and Romani (1997) estimate a Σ–D distance of 1.9 kpc, whereas Case and Bhattacharya (1999) find 3.1 ± 1.0 kpc. At either of these distances, the required efficiencies would be substantially smaller. The photon spectral index of the γ-ray source, Γ = 2.12 ± 0.14, seems to be in the range of other pulsar cases. More recently, Doherty et al. (2003) have provided important new HI absorption measurements toward SNR G312.4−0.4 which indicate that it may be much further away, at 8.1 kpc (see below). Case and Bhattacharya (1999) have made an in-depth study of the possible association between the remnant and the γ-ray source and concluded that the γ-ray data alone cannot at present provide conclusive evidence to decide whether the γ-ray emission from 2EG J1412−6211 is due to a pulsar, to SNR-molecular cloud interaction, or to both. They suggested that CO observations of the environment surrounding G312.4−0.4 would help in determining whether a molecular cloud of sufficient mass is present in the right location to produce the observed γ-ray intensity. Such observations are presented here. The CO emission toward 3EG J1410−6147 is extremely bright and complex, arising mainly from the tangent region of the Centaurus spiral arm at v < −30 km s⁻¹. The line of sight also intersects the near side of the Carina arm at less negative velocities and the far side of Carina at positive velocities. The cloud in the general direction of the 3EG source is also by far the most massive and dense. This cloud, with a velocity of −49 km s⁻¹, is labeled A in Fig. 15.
The near and far kinematic distances of the cloud are 3.3 and 8.1 kpc, respectively (Clemens, 1985); we will adopt the far kinematic distance since it agrees with the HI absorption measurements of Doherty et al. (2003) towards SNR G312.4−0.4. The mass of Cloud A is quite uncertain because molecular gas on both the near and far sides of the Centaurus arm probably contributes to the CO emission at the cloud velocity; the high-velocity limit of the cloud is also uncertain owing to blending with emission at higher (less negative) velocities. If we assume that all of the emission in the velocity range of Fig. 15 (−75 to −35 km s⁻¹) arises from cloud A at 8.1 kpc, the total molecular mass within the 95% confidence radius of the 3EG source (dotted circle in Fig. 15) is 3.3 × 10⁵ M⊙; given the uncertainties just discussed, the actual mass might be as much as a factor of 2 lower. Even at such a distance, the large quantity of molecular material is sufficient to explain a significant part of the observed γ-ray flux. As in the case of 3EG J1102−6103, with the usual assumptions, a CR enhancement factor of ∼100 is necessary to explain the bulk of the γ-ray emission from 3EG J1410−6147. It is most likely, however, that this EGRET source is a composite. New X-ray observations could be used to study the pulsars and evaluate their γ-ray emissivities. An extrapolation of the GeV spectrum to the TeV regime, if there is no break, would give a flux of 1 × 10⁻¹⁰ erg cm⁻² s⁻¹, which is above the HESS sensitivity in the range 500 GeV–10 TeV. Even with a substantial break in the spectrum the 3EG source should be observable by HESS (see the study on SNR TeV observability below, Section 15).
10.7. γ-ray source 3EG J1639−4702 – SNR G337.8−0.1/338.1+0.4/338.3+0.0
Fig. 16 shows the relative positions of these SNRs within the large location contours of 3EG J1639−4702. One Princeton pulsar is within the contour of 3EG J1639−4702, but it can be ruled
[Figure 15 image: CO map, Galactic longitude (311 to 314 deg) versus Galactic latitude (−1 to 1 deg); intensity scale W_CO from 20 to 100 K km s⁻¹; labeled features: cloud A and G312.4−0.4.]
Fig. 15. CO integrated over the velocity range of the Centaurus arm, −75 to −35 km s⁻¹. Contours: 843 MHz continuum from the MOST Galactic plane survey (Green et al., 1999); the survey has been smoothed to a resolution of 3′ to highlight extended sources. The contour interval is 0.01 Jy/beam, starting at 0.01 Jy/beam. The dotted circle is the 95% confidence radius about the position of 3EG J1410−6147 (Hartman et al., 1999). Many of the other radio sources in the map are unrelated HII regions discussed by Caswell and Barnes (1985).
out as a possible counterpart because of the required energetics. In addition, three Parkes pulsars coincide with the same 3EG source (see Table 3). Two of them can be immediately discarded on the same grounds: the required efficiencies are unphysically high. However, PSR J1637−4642 seems to be a plausible candidate: an efficiency of only 12% would be needed for it to account for the γ-ray emission. Although the spectral index, Γ = 2.50 ± 0.18, seems quite soft in comparison with detected EGRET pulsars, the work of Halpern et al. (2001b) suggests that a soft spectral index does not automatically rule out a pulsar origin of the γ-rays: they present a strong case for PSR J2229+6114 being responsible for 3EG J2227+6122, even though it has a high-energy spectral index of 2.24 ± 0.14. Based on HI absorption seen all the way up to the terminal velocity, Caswell et al. (1975) placed the SNR G337.8−0.1 beyond the tangent point at 7.9 kpc. Koralesky et al. (1998) detected maser emission in the SNR at −45 km s⁻¹, implying a far kinematic distance of 12.4 kpc. As Fig. 17 shows, there is a very massive giant molecular cloud adjacent to the SNR in direction and close to the associated maser in velocity (−56 km s⁻¹). The far kinematic distance for this giant molecular cloud is favored by (1) its likely association with both the far-side maser just mentioned and a group of far-side HII regions (group 5 in Georgelin and Georgelin, 1976); (2) its location very close to the Galactic plane; and (3) the radius-linewidth relation for giant molecular clouds (Dame et al., 1986). The mean velocity of the complex is −56 km s⁻¹, implying a far kinematic distance of 11.8 kpc.
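The kinematic distances quoted in this subsection can be illustrated with a short sketch. It assumes a flat Galactic rotation curve with R0 = 8.5 kpc and Θ0 = 220 km s⁻¹; these constants, and the function name `kinematic_distances`, are assumptions of this example and are not stated in the text, which elsewhere cites the rotation curve of Clemens (1985). Under these assumptions the sketch gives far distances of about 11.8 kpc for v = −56 km s⁻¹ and about 12.4 kpc for v = −45 km s⁻¹ at l = 337.8 deg.

```python
import math

R0_KPC = 8.5        # assumed Sun to Galactic Center distance (not given in the text)
THETA0_KMS = 220.0  # assumed circular speed, taken to be constant with radius


def kinematic_distances(l_deg, v_lsr_kms, r0=R0_KPC, theta0=THETA0_KMS):
    """Return (near, far) kinematic distances in kpc for a flat rotation curve.

    Uses v_lsr = theta0 * sin(l) * (r0 / r - 1), solved for the
    galactocentric radius r, and then the line-of-sight geometry
    d = r0 * cos(l) -/+ sqrt(r**2 - (r0 * sin(l))**2).
    """
    l = math.radians(l_deg)
    denom = 1.0 + v_lsr_kms / (theta0 * math.sin(l))
    if denom <= 0.0:
        raise ValueError("velocity not reachable in this rotation model")
    r = r0 / denom
    disc = r ** 2 - (r0 * math.sin(l)) ** 2
    if disc < 0.0:
        raise ValueError("velocity exceeds the terminal velocity of the model")
    root = math.sqrt(disc)
    return r0 * math.cos(l) - root, r0 * math.cos(l) + root


if __name__ == "__main__":
    for v in (-56.0, -45.0):
        near, far = kinematic_distances(337.8, v)
        print(f"v = {v:+.0f} km/s: near = {near:.1f} kpc, far = {far:.1f} kpc")
```

A flat curve is only a rough stand-in for a fitted rotation curve, so results for other lines of sight can differ from published values by several tenths of a kpc or more.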
Fig. 16. Relative positions of 3EG J1639−4702 (which occupies the entire box) and the SNRs G337.8−0.1, G338.1+0.4, and G338.3+0.0, which are very small in comparison. The superposed grey-scale levels show the radio emission at 843 MHz of the three SNRs as reported in the MOST Catalog prepared by Whiteoak and Green (1996).
The giant molecular cloud has recently been discussed by Corbel et al. (1999) because an adjacent cloud (just outside the velocity integration range of Fig. 17) apparently harbors the soft γ-ray repeater SGR 1627−41. Corbel et al. suggest that collision or tidal interaction between these two giant molecular clouds may have set off the burst of star formation evident in both. Taking the total CO luminosity of the giant molecular cloud to be that in the range l = 337.625–338.25, b = −0.25–0.25, and v = −65 to −45 km s⁻¹, the total molecular mass is 5 × 10⁶ M⊙; this mass may be overestimated by 10–20% owing to the inclusion of emission from gas at the same velocity at the near kinematic distance. Even with this correction, this giant molecular cloud ranks among the few most massive GMCs in the Galaxy (see, e.g., Dame et al., 1986); its composite CO linewidth of ∼20 km s⁻¹ is correspondingly very large. Adopting a mean radius of 0.31 deg, or 65 pc at 11.8 kpc, the mean nucleon density of the cloud is 176 cm⁻³. The total mass within the 95% confidence radius of the 3EG source is 7.6 × 10⁶ M⊙; this mass too may be overestimated by 10–20% owing to inclusion of near-side emission. The enhancement factor obtained from Eq. (15) is large for typical parameter values. However, since the SNRs seem to be immersed in the molecular cloud, the Morfill et al. (1984) prescription may be an oversimplification in this case. This case is similar to that of 3EG J1903+0550 in that we have a distant SNR, which we would not expect to be able to detect with EGRET, in the neighborhood of a very large molecular cloud. AGILE observations, in advance of GLAST, would greatly elucidate the origin of this 3EG source, since even a factor of 2 improvement in resolution would be enough to favor or reject the SNR connection.
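The mean nucleon density quoted above follows from the cloud mass and radius if the cloud is treated as a uniform sphere. The sketch below makes that assumption explicit; the function names and the adopted physical constants are choices of this example, not taken from the text. With a mass of 5 × 10⁶ M⊙ and a radius of 65 pc it returns a value close to the quoted 176 cm⁻³, and the small-angle conversion of 0.31 deg at 11.8 kpc gives roughly 64 pc, close to the adopted 65 pc.

```python
import math

PC_CM = 3.0857e18          # centimetres per parsec
MSUN_G = 1.989e33          # grams per solar mass
M_NUCLEON_G = 1.6726e-24   # proton mass in grams, used as the nucleon mass


def radius_pc(theta_deg, distance_kpc):
    """Physical radius in pc subtended by an angular radius at a given distance."""
    return distance_kpc * 1.0e3 * math.tan(math.radians(theta_deg))


def mean_nucleon_density(mass_msun, radius_pc_value):
    """Mean nucleon density (cm^-3) of a uniform sphere of given mass and radius."""
    r_cm = radius_pc_value * PC_CM
    volume_cm3 = 4.0 / 3.0 * math.pi * r_cm ** 3
    return mass_msun * MSUN_G / (M_NUCLEON_G * volume_cm3)


if __name__ == "__main__":
    print(f"radius  = {radius_pc(0.31, 11.8):.1f} pc")
    print(f"density = {mean_nucleon_density(5.0e6, 65.0):.0f} cm^-3")
```

Because the density scales as the inverse cube of the radius, the 10–20% mass uncertainty mentioned in the text matters less than even a modest error in the adopted distance.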
10.8. γ-ray source 3EG J1714−3857 – SNR G348.5+0.0/348.5+0.1/347.3−0.5
The supernova remnant RX J1713.7−3946 is probably the most convincing case for a hadronic cosmic-ray accelerator detected so far in the Galaxy. Butt et al. (2001) noted the positional coincidence of the nearby γ-ray source 3EG J1714−3857 with a very massive (∼3 × 10⁵ M⊙) and dense (∼500 nucleons cm⁻³) molecular cloud that is clearly interacting with the SNR RX J1713.7−3946 (G347.3−0.5) (Slane et al., 1999; Butt et al., 2001). Fig. 18 shows the CO(J = 1–0) line intensity distribution in the vicinity of the SNR. The remnant is a strong X-ray source (Slane et al., 1999) whose ROSAT contours are indicated in the figure. Two massive clouds, which we call Clouds A and B, can be seen. The first one is coincident with the γ-ray source 3EG J1714−3857, whose location confidence contours are also superposed in the figure. The X-ray emission is produced by TeV-range electrons radiating by the synchrotron mechanism in the local magnetic field. These same electrons were suggested to be responsible, through IC up-scattering of cosmic microwave background photons, for the TeV γ-ray emission detected from the NW rim of the remnant by Muraishi et al. (2000) and Butt et al. (2002a). We have previously provided measurements of the line intensity ratio R = CO(J = 2–1)/(J = 1–0) for the entire region, demonstrating that the SNR is likely interacting with Cloud A (Butt et al., 2001). For this cloud, R ∼ 2.4, more than 3.5σ above the average Galactic value. Upper limits to the continuum radio emission of Cloud A, together with Eq. (25) above, allow us to rule out a Bremsstrahlung origin of the GeV radiation from 3EG J1714−3857 (see Fig. 19): the electron flux needed to explain the GeV source in terms of Bremsstrahlung emission overpredicts the radio synchrotron emission for any reasonable molecular cloud magnetic field (Crutcher, 1988, 1994, 1999).
The GeV γ-rays seem to be the result of π⁰-decays produced when the population of cosmic rays accelerated at the remnant shock is injected into the dense medium of Cloud A. We estimate a cosmic-ray enhancement factor in the range 24 < k_s < 36 given the parameters of the SNR (Slane et al., 1999). In addition, the γ-ray spectrum of 3EG J1714−3857 (Hartman et al., 1999) is consistent with a narrow spectral bump at ∼70 MeV that could correspond to the signature of the pion decays resulting from an enhanced population of low-energy (E ∼ 1 GeV) protons. This apparent peak, although highly suggestive, is not statistically significant, and improved observations are needed to confirm its existence. Uchiyama et al. (2002a) have reported the discovery of extended (10′ × 15′) and hard (spectral shape described by a flat power-law photon index $\Gamma = 1.0^{+0.4}_{-0.3}$) X-ray emission from the position of Cloud A, using ASCA data. This emission is interpreted as Bremsstrahlung from a Coulomb-loss-flattened distribution of nonthermal low-energy protons in the cloud or from mildly relativistic electrons (see also Uchiyama et al., 2002b). Uchiyama et al. (2002a) estimate that the energy content in subrelativistic protons within the cloud far exceeds that in the relativistic protons, by a factor of ∼80. The explanation could be that the bulk of the more energetic particles have already diffused out of the cloud whereas the subrelativistic population is captured there. Alternatively, energetic secondary leptons may also be producing low-level non-thermal X-ray and radio emission in the clouds. Regarding the highest energy particles produced in the SNR, a CANGAROO re-observation of the NW rim of RX J1713.7−3946 with a new 10-m reflector (CANGAROO II) has allowed a determination of the TeV γ-ray spectrum, which can be fitted with a power-law of photon index
[Figure 17 image: CO map, Galactic longitude (336.0 to 339.5 deg) versus Galactic latitude (−1.0 to 1.0 deg); intensity scale W_CO from 25 to 150 K km s⁻¹; labeled feature: G337.8−0.1.]
Fig. 17. CO integrated over the velocity range −65 to −45 km s⁻¹. Contours: 843 MHz continuum from the MOST Galactic plane survey (Green et al., 1999); the contour interval is 0.2 Jy/beam, starting at 0.1 Jy/beam. The dotted circle is the 95% confidence radius about the position of 3EG J1639−4702 (Hartman et al., 1999).
Fig. 18. Intensity map of the CO(J = 1–0) transition in the region around RX J1713.7−3946, from Butt et al. (2002a). Two massive clouds, called Clouds A and B, are indicated. The X-ray contours of the SNR (Slane et al., 1999) are superposed in black, as well as the location confidence contours of the GeV γ-ray source 3EG J1714−3857 (coincident with Cloud A) and the significance contours of the TeV detection of the remnant by Enomoto et al. (2002), mostly coincident with the X-ray radiation.
Γ ∼ −2.8 (Enomoto et al., 2002). Such a steep spectrum is hard to explain by IC emission, and Enomoto et al. (2002) have claimed that the TeV emission is also of hadronic origin. If this were the case, however, the GeV γ-ray flux would be much higher than observed, as recently noted
[Figure 19 image: flux density (Jy) versus frequency (MHz), both on logarithmic axes; spectral slopes labeled ν^(5/2) and ν^(−0.65).]
Fig. 19. The radio synchrotron spectrum which would be expected from the region of the shocked molecular material located towards the NE of the remnant RX J1713.7−3946 (Cloud A), under the assumption that the observed GeV flux were due to electron/positron Bremsstrahlung. Since this spectrum violates the upper limit (dark) derived from the non-detection of the cloud in the radio band by a factor of ∼20 at 843 MHz (Slane et al., 1999), we can rule out a predominantly leptonic origin of the GeV luminosity. Furthermore, if the GeV flux of 3EG J1714−3857 were of electronic origin, the cloud region would outshine even the radio-brightest NW rim of the remnant, which is found to be emitting only 4 ± 1 Jy at 1.36 GHz (Ellison et al., 2001), as shown by the light data point. An assumed low-frequency turnover at ∼100 MHz in the radio spectrum is shown by the dotted line.
independently by Reimer and Pohl (2002) and by Butt et al. (2002a). (It is conceivable that protons with a hard index of ∼ −1.8 may alleviate the discrepancy, but then it is more difficult to explain how such a population of protons could produce the measured −2.8 index TeV γ-rays; see footnote 4.) The SNR RX J1713.7−3946 is perhaps the best natural laboratory available today for studying the acceleration and diffusion of cosmic rays. The unique combination of a relatively close SNR and a group of well-defined molecular clouds in its surroundings, none of them in front of the remnant itself, makes this source a priority target for the forthcoming generation of high-energy instruments
Footnote 4: One possible solution would be to introduce the effects of diffusion, if the γ-rays originate in one of the nearby clouds and not in the SNR itself as suggested by the CANGAROO team. See also Uchiyama et al. (2002c).
Fig. 20. Upper panel: MOST image of the SNR G355.6+0.0 at 0.843 GHz (Gray, 1994). The gray-scale representation ranges from 0.4 × 10⁻² to 10 × 10⁻² Jy beam⁻¹. Radio contours are shown in steps of 1 Jy beam⁻¹; the resolution is 43 arcsec. Part of the γ-ray probability contours of 3EG J1734−3232 are superposed. Lower panel: Detailed radio image of the SNR at the same frequency. Radio contours are shown in steps of 1 × 10⁻² Jy beam⁻¹, starting from 0.5 × 10⁻² Jy beam⁻¹.
such as HESS, AGILE, INTEGRAL, and GLAST, as well as for infrared, radio, mm and sub-mm observatories.
10.9. γ-ray source 3EG J1734−3232 – SNR G355.6+0.0
The shell-type supernova remnant G355.6+0.0 (Fig. 20) was first identified in the MOST Galactic Center survey (Gray, 1994) as a compact (∼0.1 deg × ∼0.1 deg) radio source with both thermal and non-thermal emission. The western limb shows possible indications of an interaction with the
ambient diffuse thermal gas present at that location. X-ray emission from this SNR has also recently been detected with the ASCA X-ray satellite, under the designation AX J173518−3237 (Sugizaki et al., 2001). The physical relation of even part of the γ-ray flux of 3EG J1734−3232 with SNR G355.6+0.0, however, is unclear because of the lack of information about both the SNR and its environs. The lack of a γ-ray spectral index for 3EG J1734−3232 (Hartman et al., 1999, see Table 1) further complicates any attempt to connect the SNR with the γ-ray emission. The γ-ray error box is also coincident with a very young open cluster, NGC 6383, at (l, b) = (355.66, 0.05), which is centered around the bright spectroscopic binary HD 159176 (O7V + O7V) (e.g., van den Ancker et al., 2000). Together with NGC 6530 and NGC 6531, NGC 6383 belongs to the Sgr OB1 association. The nearby radio source G355.3+0.1, also within the γ-ray error box, is most likely an HII region at a distance of ∼10 kpc (Crovisier et al., 1973). Although the Third EGRET catalog lists GeV 1732−3130 (Lamb and Macomb, 1997) as an alternate name for this source, the large positional offsets indicate that these two may be separate γ-ray sources (Roberts et al., 2001). Interestingly, “bridging” these two sources is a “possibly variable” COS-B source, 2CG 356+00, located at (l, b) = (356.5, +0.3) (Swanenburg et al., 1981), which may be related to one or both of them. The report of a bright, transient hard X-ray source, KS/GRS 1730−312, at (l, b) = (356.6, +1.06) (Vargas et al., 1996), within the 95% error ellipse of GeV 1732−3130, may also explain part of the detected γ-ray emission from this region. It is possible that KS/GRS 1730−312 was actually seen in its quiet state in ASCA data, as “src 1” in Roberts et al. (2001).
A mild indication of variability for 3EG J1734−3232 (I = 2.9; $\tau = 0.00^{+0.24}_{-0.00}$) appears within the I-scheme, and would support the flaring hard X-ray source being associated with it, although this is not conclusive with the data now at hand. The transient source could also be separately connected to GeV 1732−3130/2CG 356+00. It is clear that no conclusive determination can be made regarding possible association among SNR G355.6+0.0, GeV 1732−3130, 2CG 356+00, and 3EG J1734−3232 until the sizes of the γ-ray error boxes are significantly reduced by future GeV telescopes. It is almost certain that many GeV sources, especially those located towards the inner Galaxy, will eventually be resolved into several separate sources.
10.10. Near the Galactic Center: γ-ray source 3EG J1744−3011 – SNR G359.0−0.9/359.1−0.5 and γ-ray source 3EG J1746−2851 – SNR G0.0+0.0/0.3+0.0
Since the 3EG sources J1744−3011 and J1746−2851 lie near the very confused Galactic Center region, an analysis of the sort we present for other SNR-EGRET source pairs is not possible in this case. Galactic Center region sources must be considered as possibly extended (most likely composite) and confused, embedded in a high and structured background, and the analysis procedures used for other sources in the EGRET catalogs may not apply (Mayer-Hasselwander et al., 1998). A detailed discussion of the Galactic Center region is beyond the scope of this report; for this the reader is referred to Markoff et al. (1999), Yusef-Zadeh et al. (2000) and Melia and Falcke (2001), as well as the book edited by Falcke et al. (1999). Very recently, Goldwurm (2001) has presented a review of the high-energy emission detected from the direction of the Galactic Center; the reader is referred to that paper for details on different X-ray observations of the Galactic Center region.
A complete discussion of EGRET observations of the Galactic Center region was presented by Mayer-Hasselwander et al. (1998). They found 5488 ± 516 counts above 30 MeV, representing a high-significance excess. The region analyzed was larger than the position of the central EGRET sources by several degrees (see Fig. 1 in Mayer-Hasselwander et al., 1998). They also re-analyzed COS-B data, showing that the COS-B observations of the region were in agreement, despite previous claims, with the more recent EGRET data. Several objects in the region are potential counterparts for the γ-ray radiation detected, including, for example, GRO 1744−28, 2S 1743−2941, E1740−2942, PSR 1742−30, GRS 1736−297, GRS 1739−278, and GX 359+02. However, not all of these coincide with the positions of the 3EG sources we are considering. The γ-ray fluxes in different energy bands give only a hint of variability, which is consistent with 3EG Catalog estimates for I and τ (Table 1). The photon spectrum of the diffuse emission of the Galactic Center region shows a clear break at energies of about 1 GeV, with a significant steepening thereafter (it shifts from −1.3 to −3.1). The hard spectrum at energies above 100 MeV has to be compared with the already hard values obtained with standard EGRET analysis techniques, quoted in Table 1 for the sources of interest. These hard values argue against γ-rays being produced in diffusive processes, i.e., by ambient matter–cosmic-ray interactions. However, this is yet to be confirmed. Yusef-Zadeh et al. (2002) have suggested that the central γ-ray source may be due to the interaction of the G0.13−0.13 molecular cloud with the diffuse and filamentary X-ray features discovered using Chandra, all lying within the 95% confidence location contours of 3EG J1746−2851. The hard spectrum of the EGRET source (−1.7) seems to match the cloud spectrum at about 10 keV when extended down to X-ray energies.
Electron Bremsstrahlung and inverse Compton scattering may be responsible for the GeV emission. Pohl (1997) raised the possibility that the radio arc at the Galactic Center could be the counterpart of the high-energy γ-ray source. Existing radio data on the arc support the view that its synchrotron emission originates from cooling, initially monoenergetic electrons that diffuse and convect from their sources to the outer extensions of the arc. If the source of high-energy electrons coincides with the Sickle region (G0.18−0.04), as indicated by the radio data, then the ambient far-infrared photons could be subject to inverse Compton interaction by high-energy electrons. Pohl (1997) showed that the predicted γ-ray emission depends mainly on the magnetic field strength in the arc and that both the flux and the spectrum of the central source could be explained by such a process. On the other hand, the starving state of the accretion flow around the supermassive black hole makes it a dubious counterpart for the high-energy radiation. Note, however, that from a statistical point of view the probability of such a good agreement in the positions of the Galactic Center and the 3EG source is about 10⁻⁴ (Mayer-Hasselwander et al., 1998). Early scenarios were presented in which the γ-ray emission is produced by wind accretion from the nearby IRS16 cluster (Melia, 1992; Mastichiadis and Ozernoy, 1994). More recently, Markoff et al. (1997) were able to reproduce the observed spectrum with a combination of synchrotron radiation and pion decay. If the γ-ray flux is directly related to the dissipation of gravitational energy, i.e. if it is produced by relativistic particles energized by a shock within the infalling plasma, Sgr A* could still be the source of the γ-rays observed. However, in a refined analysis, Markoff et al. (1999), using data from the 3EG catalog and an improved physical treatment, concluded that this was not the case.
Forthcoming satellites, particularly INTEGRAL, will scrutinize the Galactic center region and hopefully resolve the nature of this interesting region.
Fig. 21. Relative positions of 3EG J1800−2338 and SNR W28. Contours are in steps of 5 mK, starting from 15 mK, from the 2695-MHz map obtained with the Effelsberg 100-m single-dish telescope (Fürst et al., 1990).
10.11. γ-ray source 3EG J1800−2338 – SNR G6.4−0.1 (W28)
This EGRET-SNR positional coincidence was originally proposed by Sturner et al. (1996) and Esposito et al. (1996). However, the previous 2EG J1801−2321 source does not coincide with 3EG J1800−2338, since the new position has shifted by about half a degree. W28 was also presented as a possible candidate for an association with a COS-B source by Pollock (1985), after noticing that SNR W28 appears to be interacting with molecular clouds. Dubner et al. (2000) have made a recent study of the remnant using the VLA. They concluded that the remnant is indeed interacting with molecular clouds in the vicinity, and observed maser emission, as earlier reported by Claussen et al. (1997, 1999). Arikawa et al. (1999) observed W28 and mapped the CO(J = 3–2) and CO(J = 1–0) rotational transition lines toward the remnant, and also concluded that the remnant is interacting with the clouds. The mass of the clouds was found to be 2 × 10³ M⊙. This mass, however, is with respect to the 2EG source position. A new molecular mass estimate with respect to 3EG J1800−2338 is given below (Fig. 21). The H-alpha filaments in W28 have a mean velocity of 18 ± 5 km s⁻¹ (Lozinskaya, 1974). HI absorption measurements by Radhakrishnan et al. (1972) are consistent with this velocity, since absorption features at 7.3 and 17.6 km s⁻¹ are seen against the SNR continuum. As Fig. 22 (lower panel) shows, there is a molecular cloud at about the same velocity (∼19 km s⁻¹), which is most likely the birthplace of the supernova progenitor. Directly toward the radio-bright rim of W28 (l ∼ 6.6 deg, b ∼ −0.3 deg), which has been proposed by Wootten (1981) and Arikawa et al. (1999) as the site of SNR-giant molecular cloud interaction, a jet-like CO feature is seen in Fig. 22, extending to a smaller cloud at ∼7 km s⁻¹. Arikawa et al. (1999) proposed the 7 km s⁻¹ component as the systemic velocity of W28, whereas Claussen et al. (1997) suggested that the entire 7 km s⁻¹ cloud has been accelerated by the SNR from the main cloud, which is ∼10 km s⁻¹ higher in velocity.
[Figure 22 image. Panel (a): CO map for v = 0 to 28 km s⁻¹, Galactic longitude (5.5 to 7.5 deg) versus Galactic latitude (−1.5 to 1.0 deg); intensity scale W_CO from 20 to 120 K km s⁻¹; labeled sources: M20, W28A-2, M8. Panel (b): longitude-velocity map for b = −0.5 to −0.125 deg, Galactic longitude (5.5 to 7.5 deg) versus LSR velocity (0 to 30 km s⁻¹); intensity scale I_CO from 0 to 3 K deg.]
Fig. 22. (a) CO integrated over the velocity range 0–28 km s⁻¹ corresponding to the distance of the SNR W28. Contours: 4850 MHz continuum from the survey of Condon et al. (1991); the lowest contour is at 0.32 Jy/beam and the contours are logarithmically spaced by a factor of 2. The three labeled sources are HII regions: M20 the Trifid Nebula and M8 the Lagoon Nebula. The dotted circle is the 95% confidence radius about the position of 3EG J1800−2338 (Hartman et al., 1999). (b) CO intensity integrated over the latitude range of the molecular gas associated with W28, b = −0.5 to −0.125 deg.
The alignment of the 7 km s⁻¹ cloud with the bright interacting rim of W28 and the jet-like feature linking it to the larger cloud at higher velocity supports the Claussen et al. proposal. The low longitude of W28 (6.5 deg) makes both the kinematic distance and the mass of the associated molecular cloud difficult to measure. Assuming a systemic velocity of 19 km s⁻¹, the near kinematic distance is 3.7 ± 1.5 kpc. If all emission in the velocity range 0–28 km s⁻¹ is associated with W28 (see Fig. 22b), the total molecular mass within the 95% confidence radius of the 3EG source is 3.9 × 10⁵ M⊙. Owing to the severe velocity crowding at this Galactic longitude, this mass may be overestimated by as much as 50%. Even reduced by 50%, the ambient molecular material can still account for the observed EGRET flux. Ignoring Bremsstrahlung, a CR enhancement factor of about k_s ∼ 20, similar to that found in other cases, is necessary to explain the emission by hadronic interactions. This result is stable against reasonable variations in the input parameters and is compatible with consistency tests. Note that the distance used here is near the upper limit of those shown in Table 4; a smaller distance would make a physical association even more likely. Velázquez et al. (2002), in a recent analysis of large-scale neutral hydrogen around W28, adopt a distance of ∼1.9 kpc. They concluded that the SN released ∼1.6 × 10⁵⁰ erg about 3.3 × 10⁴ years ago. An intriguing possibility in this case is that there are actually two (or more) γ-ray sources, each associated with a different molecular cloud in Fig. 22 and/or with the pulsar PSR B1758−23 (which coincides with the GeV source as reported in the Roberts et al. (2001) catalog). Investigation of this possibility will require the next generation of GeV and TeV telescopes, with their improved sensitivity and angular resolution. W28 is one of the few remnants observed with Čerenkov telescopes. Rowell et al. (2000) observed it with the CANGAROO 3.8-m telescope and were able to set an upper limit on the flux (for photons with E > 1.5 TeV) of a diffuse source encompassing the clouds discovered by Arikawa et al. (1999) and part of the 3EG source: 6.64 × 10⁻¹² ph cm⁻² s⁻¹. A simple extrapolation of the 3EG flux, with the same spectral index, up to TeV energies yields a value higher than this upper limit by more than an order of magnitude (see Fig. 5 of Rowell et al., 2000). This implies the existence of a break in the spectrum in the GeV–TeV region. But even considering such a break (see Appendix B and Table 10) the 3EG J1800−2338 region could be visible to an observatory such as HESS in a matter of hours. The possibility of a leptonic origin for the γ-ray source cannot be ruled out in this case. The SNR is ranked sixth when ordered by radio flux among all entries in Green’s (2000) Catalog (Table 4). With standard values of magnetic fields appropriate to molecular clouds (e.g. Crutcher, 1988, 1994, 1999), the radio flux that would be generated by the same electronic population producing the GeV emission would not overpredict the currently detected radio flux. A model such as the one presented by Bykov et al. (2000) could explain the EGRET source without invoking the dominance of hadronic interactions.
10.12. γ-ray source 3EG J1824−1514 – SNR G16.8−1.1
Paredes et al. (2000) proposed that the massive star LS 5039 is part of a newly discovered microquasar system, and that it can be identified with the γ-ray source 3EG J1824−1514. The detection of radio jets and strong variability at different wavelengths (including γ-rays) supports their claim. Gamma-ray emission from microquasars with high-mass companions has been recently discussed by
[Figure 23 image: radio contour map; both axes are labeled in milliarcseconds.]
Fig. 23. High-resolution radio map of the nearby star LS 5039 obtained with the VLBA and the VLA in phased array mode at 6 cm wavelength. The presence of radio jets in this high-mass X-ray binary is the main evidence supporting its microquasar nature. The contours shown correspond to 6, 8, 10, 12, 14, 16, 18, 20, 25, 30, 40, and 50 times 0.085 mJy beam⁻¹. From Paredes et al. (2000).
Kaufman-Bernadó et al. (2002), who show that GeV γ-rays can result from the up-scattering of UV stellar photons by the relativistic jet (see also Georganopoulos et al., 2002). Variability is naturally produced by the changing viewing angle as the jet precesses due to tidal forces from the accretion disk (Fig. 23). The supernova remnant G16.8−01.1, also within the 95% confidence contour of the γ-ray source, has been recently studied by Ribó et al. (2002). The radio structure of the remnant is not well resolved because of contamination from the partially superposed HII region RCW 164 (Rodgers et al., 1960), but its size is ∼ 30 arcmin and its total flux at 5 GHz is ∼ 1 Jy (Ribó et al., 2002). The distance to the source is not known, but a lower limit of ∼ 2 kpc has been established by Ribó et al. (2002) through H166α line observations of the foreground HII region. HI observations by the same authors indicate that the ambient density around the remnant is ∼ 5 cm⁻³, with no evidence of interacting clouds. For a SN energy release E₅₁ ∼ 1 and a distance d ∼ 3 kpc, we found that the expected pion-decay γ-ray flux from this SNR should be F(E > 100 MeV) ∼ 10⁻⁷ ph cm⁻² s⁻¹ (θ ∼ 0.5). This is about one third of the observed flux from 3EG J1824−1514. It is possible that Bremsstrahlung from SNR shell–cloud interactions could also generate a fraction of the observed γ-rays. In addition, there is one Princeton pulsar superposed on this 3EG source, but it is not energetic enough to be a plausible alternative to the microquasar found by Paredes et al. (see Section 4). The apparent (though marginal) variability of the γ-ray source also argues weakly against a pulsar origin of the GeV flux.
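The order of magnitude quoted above for the pion-decay flux of G16.8−1.1 can be checked with a short calculation. The sketch below assumes the scaling F(E > 100 MeV) ≈ 4.4 × 10⁻⁷ θ E₅₁ d⁻² n ph cm⁻² s⁻¹ of Drury et al. (1994), which is not quoted in this section. It also assumes that the value 0.5 in the text is the fraction θ of the explosion energy converted into cosmic rays. The function name is hypothetical.

```python
def snr_pion_decay_flux(theta, e51, d_kpc, n_cm3):
    """Approximate pion-decay flux F(E > 100 MeV) in ph cm^-2 s^-1.

    Uses the scaling 4.4e-7 * theta * E51 * n / d^2, an ASSUMED form
    (Drury et al. 1994) that is not quoted in this section.
    theta : fraction of the explosion energy converted into cosmic rays
    e51   : explosion energy in units of 1e51 erg
    d_kpc : distance in kpc
    n_cm3 : ambient density in cm^-3
    """
    return 4.4e-7 * theta * e51 * n_cm3 / d_kpc**2


# Values quoted in the text for G16.8-1.1.
flux = snr_pion_decay_flux(theta=0.5, e51=1.0, d_kpc=3.0, n_cm3=5.0)
# Moving the remnant to the 2 kpc lower limit raises the flux by (3/2)^2.
flux_at_lower_limit = snr_pion_decay_flux(0.5, 1.0, 2.0, 5.0)

print(f"F(E > 100 MeV) at 3 kpc: {flux:.2e} ph cm^-2 s^-1")
print(f"F(E > 100 MeV) at 2 kpc: {flux_at_lower_limit:.2e} ph cm^-2 s^-1")
```

Under these assumptions the 3 kpc value comes out slightly above 10⁻⁷ ph cm⁻² s⁻¹, in line with the estimate in the text. The d⁻² dependence shows how strongly the conclusion depends on the poorly known distance.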
A variability analysis (Torres et al., 2001a, b, c, d) suggests that 3EG J1824−1514 is marginally variable, but this is not confirmed by Tompkins’s τ index, which is compatible with a steady source (although the difference between the upper and the lower limit on Tompkins’s τ is large). These results are not conclusive, but it seems unlikely that most of the γ-ray
[Figure 24 image. Panel title: "3EG J1837-0423 Flux History From July 12, 1991 to September 20, 1995". Vertical axis: Flux [10⁻⁸ photons cm⁻² s⁻¹]. Horizontal axis: viewing periods in the 3EG Catalog.]
Fig. 24. Flux history of 3EG J1837−0423. The source was unambiguously detected in only one viewing period (423.0); all other points are upper limits. The X-axis of the figure is not a linear time scale. Rather, each point represents the measurement for a different single viewing period given in Hartman et al. (1999).
flux could come from either the SNR or pulsar alone, especially when the uncertainties in distance are taken into account. The source 3EG J1824−1514 remains the best candidate for a γ-ray emitting microquasar. Future tests of this interesting source will surely be carried out with AGILE and GLAST.

10.13. γ-ray source 3EG J1837−0423 – SNR G27.8+0.6

This source is variable between EGRET observations, which argues against a SNR or pulsar origin (see above). Indeed, 3EG J1837−0423 was detected only once, in viewing period 423.0, at the 5.8σ significance level. In all other single viewing periods in which it was observed, only an upper limit to the flux could be established (see Fig. 24). This behavior is compatible with objects presenting flares, such as AGNs. A microlensing model might be a plausible alternative given the absence of a strong radio emitter in the 3EG field (Torres et al., 2002, 2003a). Another possibility is a non-pulsating black hole of the sort discussed by Punsly (1998a,b) and Punsly et al. (2000).
In any case, the variability clearly indicated by both the I and τ indices makes 3EG J1837−0423 incompatible with a SNR or pulsar origin.

10.14. γ-ray source 3EG J1856+0114 – SNR G34.7−0.4 (W44)

The possible association of the γ-ray source 3EG J1856+0114 and SNR G34.7−0.4 (W44) has already been proposed by Esposito et al. (1996) and by Dermer et al. (1997). A comprehensive review of the morphological properties of W44 at different frequencies is given by De Jager and Mastichiadis (1997). The radio emission of the SNR is shell-like, but the X-ray emission is centrally peaked (Rho et al., 1994). A pulsar is found near the center of the 3EG source, PSR B1853+01 (quoted as PSR J1856+0113 in Table 6). Frail et al. (1996) discovered a corresponding radio pulsar wind nebula, with a tail pointing back to the center of W44. In addition, they found that the transverse velocity of the pulsar is compatible with the expansion speed of the radio shell. Thus, it is probable that this pulsar is the compact remnant of the supernova. Recently, two new studies concluded that the remnant is interacting with molecular clouds: a new radio and optical study by Giacani et al. (1997) and a complete CO study by Seta et al. (1998). Fig. 25 shows a map of CO(1-0) integrated over the velocity range 30 to 65 km/s. There are six giant molecular clouds, with masses of (0.3–3) × 10⁵ M⊙, in the vicinity of W44. Three of them, with a total mass of 4.1 × 10⁵ M⊙ (Seta et al., 1998), are apparently interacting with the remnant. The total molecular mass in the vicinity of the 3EG source in W44, specifically in the region l = 34.5 to 34.875, b = −0.75 to −0.375, and v = 30–65 km s⁻¹, is 6.2 × 10⁴ M⊙. Based on the Clemens (1985) rotation curve, the main clump near the 3EG source has a kinematic distance of 2.8 kpc. The molecular mass is obtained based on the usual assumption that the H₂ column density is proportional to the CO velocity-integrated intensity.
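The proportionality between H₂ column density and velocity-integrated CO intensity translates into a mass once a distance is adopted. The following is a minimal sketch of that conversion, not the authors' procedure. The conversion factor `x_co`, the helium correction, the pixel size, and the mean W_CO value are all assumptions chosen for illustration, and none of them is given in this section.

```python
import math

M_H_G = 1.6735e-24    # mass of a hydrogen atom, g
M_SUN_G = 1.989e33    # solar mass, g
PC_CM = 3.0857e18     # parsec, cm


def molecular_mass_msun(wco_sum, pixel_deg, d_kpc,
                        x_co=1.8e20, helium_factor=1.36):
    """Molecular mass (solar masses) from a CO map region.

    wco_sum       : sum over pixels of W_CO, in K km/s
    pixel_deg     : side of a square map pixel, in degrees
    d_kpc         : adopted distance, in kpc
    x_co          : ASSUMED N(H2)/W_CO in cm^-2 (K km/s)^-1
    helium_factor : ASSUMED correction for helium by mass
    """
    pixel_cm = math.radians(pixel_deg) * d_kpc * 1.0e3 * PC_CM
    n_h2 = x_co * wco_sum * pixel_cm**2          # number of H2 molecules
    return helium_factor * 2.0 * M_H_G * n_h2 / M_SUN_G


# Hypothetical input: a 3 x 3 pixel box (0.125 deg pixels) with a mean
# W_CO of 47 K km/s, placed at the 2.8 kpc kinematic distance of W44.
mass_near = molecular_mass_msun(9 * 47.0, 0.125, 2.8)
mass_far = molecular_mass_msun(9 * 47.0, 0.125, 5.6)

print(f"mass at 2.8 kpc: {mass_near:.2e} Msun")
print(f"mass at 5.6 kpc: {mass_far:.2e} Msun")
```

The mass scales as the square of the adopted distance, so an error in the kinematic distance propagates directly into the target mass available for hadronic interactions.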
The mass in the velocity-perturbed wings can be a factor ∼ 100 less than the total mass of the cloud. With these values for the masses of passive targets and a CR enhancement factor of ∼ 40, the entire GeV flux could, in principle, be explained by hadronic interactions, although an additional Bremsstrahlung component will also be present in the SNR shell–cloud interactions. The enhancement factor of ∼ 40 is obtained assuming E₅₁ = 0.67 and a pre-shock ambient density n₀ = 1, but the conclusions are robust for reasonable variations in these parameters. We can compute mean densities for the well-defined clumps (lighter colors in Fig. 25) by assuming they are roughly spherical. However, the 3EG source contours enclose less intense emission that is spread throughout the whole W44 complex, which is about 1 deg across (49 pc at the assumed distance of 2.8 kpc). Thus it is reasonable to assume that the molecular gas enclosed by the 3EG contours is spread over about 49 pc along the line of sight as well. The mean density would then be n ∼ 6.2 × 10⁴ M⊙/[π (9 pc)² × 49 pc] = 188 H cm⁻³; here, 9 pc is the radius of the 95% location contour of the 3EG source. This density is 6 times larger than that used by De Jager and Mastichiadis (1997) in computing the hadronic emission. De Jager and Mastichiadis (1997) developed a specific model for this EGRET source (at the time, 2EG J1857+0118) and proposed that the γ-ray radiation could be accounted for by relativistic Bremsstrahlung and inverse Compton scattering. One of their main motivations was to explain the hard spectral index of the corresponding γ-ray source, ∼ −1.80, as well as the hard index of the radio source, ∼ −0.3, both difficult to reconcile with the standard first-order Fermi acceleration process.
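The mean-density estimate above is simple enough to reproduce. The sketch below assumes a uniform cylinder of radius 9 pc whose length equals the 1 deg extent of the complex at 2.8 kpc, with all of the mass counted as hydrogen atoms. The physical constants and the function name are choices made here, not values taken from the text.

```python
import math

M_SUN_G = 1.989e33    # solar mass, g
M_H_G = 1.6735e-24    # mass of a hydrogen atom, g
PC_CM = 3.0857e18     # parsec, cm


def mean_density_cylinder(mass_msun, radius_pc, length_pc):
    """Mean density (H atoms per cm^3) of a uniform cylinder.

    ASSUMES all of the mass is in hydrogen atoms.
    """
    volume_cm3 = math.pi * radius_pc**2 * length_pc * PC_CM**3
    n_atoms = mass_msun * M_SUN_G / M_H_G
    return n_atoms / volume_cm3


# Linear size subtended by 1 deg at the assumed distance of 2.8 kpc.
extent_pc = 2.8e3 * math.radians(1.0)
# 6.2e4 Msun inside the 9 pc (95% contour) radius, spread over that depth.
n_mean = mean_density_cylinder(6.2e4, 9.0, extent_pc)

print(f"extent: {extent_pc:.1f} pc")
print(f"mean density: {n_mean:.0f} H cm^-3")
```

With these constants the result is close to 200 H cm⁻³, within about 10% of the 188 H cm⁻³ quoted in the text. The small difference presumably reflects the constants or composition adopted by the authors, which this sketch does not attempt to reproduce.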
[Figure 25 image. Axes: Galactic Longitude (33.5 to 36.5) and Galactic Latitude (−2.0 to 1.0). Color scale: W_CO (K km/s), 20 to 100.]
Fig. 25. The colors are CO(1-0) integrated over the velocity range 30–65 km/s. The solid contours are 4.85 GHz continuum from the survey of Condon et al. (1991). The dashed contours are the 50%, 68%, 95%, and 99% confidence contours for 3EG J1856+0114. The white “+” marks the position with the highest ratio CO(2-1)/CO(1-0) as determined by Seta et al. (1998).
The model of De Jager and Mastichiadis (1997) proposes the pulsar PSR B1853+01 as the source of the γ-rays; indeed, the required efficiency to produce all the γ-ray radiation is 13%, which appears marginally plausible, given the uncertainties. However, the luminosity of the pulsar wind nebula (PWN) is negligible in comparison with the total X-ray luminosity of W44 (Harrus et al., 1996). The bulk of the X-ray emission from W44 is thermal (Jones et al., 1993; Rho et al., 1994). But even though the X-ray luminosity of the PWN is negligible, the pulsar could have injected a significant amount of electrons in an earlier stage. In De Jager and Mastichiadis’ paper, good fits of the 2EG spectrum were obtained for a range of particle density, whereas the magnetic field was required to be ∼ 10 μG and the synchrotron cutoff frequency ν_b ∼ 10^12.5 Hz. A reasonable field strength could then explain the γ-ray spectrum of W44 as originating in leptonic processes. W44 is then a complicated case. While the γ-ray source may be due to SNR shock–cloud interactions (hadronic and leptonic), a significant part of the observed γ-ray flux, perhaps even all of it, could be due to a pulsar. Application of Eq. (25) does not allow us to discard, as in the case
[Figure 26 image. Right panel: horizontal axis Galactic Longitude (38.0 to 40.5), vertical axis ticks from −1.5 to 1.0. Color scale: W_CO (K km/s), 10 to 50. The SNR G39.2−0.3 is labeled.]
Fig. 26. Left: Relative positions of 3EG J1903+0550 and the SNR G39.2−0.3. Radio contours in steps of 2 mK, starting from 5 mK, from the 2695-MHz map obtained with the Effelsberg 100-m single dish telescope (Fürst et al., 1990). Right: CO integrated over the velocity range of the far Sagittarius arm, 48 to 70 km s⁻¹. Contours: 4.85 GHz continuum from the survey of Condon et al. (1991). The angular resolution is 7 arcmin; the contour interval is 0.5 Jy/beam, starting at 0.3 Jy/beam. The dotted circle is the 95% radius about the central position of 3EG J1903+0550 (Hartman et al., 1999).
of SNR G347.3−0.5, a leptonic origin for the high-energy radiation. Here, then, future satellites and telescopes will play an essential role in disentangling the different possibilities. W44 was not observed to be emitting at TeV energies, where an upper limit has been imposed by the Whipple observatory (F(E > 250 GeV) = 8.5 × 10⁻¹¹ cm⁻² s⁻¹; Leslard et al., 1995), implying that there is a possible break in the spectrum from GeV to TeV. This can be tested by forthcoming higher-sensitivity TeV telescopes.

10.15. γ-ray source 3EG J1903+0550 – SNR G39.2−0.3

Fig. 26 shows the relative positions of 3EG J1903+0550 and the SNR G39.2−0.3. The 3EG source is also coincident with two Princeton pulsars (Table 3). One of them can be readily discarded as the origin of the γ-ray emission because of the unrealistically high value required for the efficiency. The other pulsar, PSR J1902+0615, lacks a measurement of the period derivative and so we cannot judge the likelihood of this particular association. The SNR G39.2−0.3 has been searched for OH maser emission by Koralesky et al. (1998), but none was detected, although, of course, this does not imply a lack of possible interaction. The SNR is more than 8 kpc away, which seems to be, a priori, a problem for generating the requisite flux via shock interactions. We find, however, that the large distance may be, at least partially, compensated for by the large amount of molecular material in the neighborhood.
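The trade-off between distance and target mass mentioned above follows from the fact that the hadronic flux of a passive cloud scales as kₛ M/d². The sketch below is illustrative only. The emissivity `Q_GAMMA` (photons above 100 MeV per second per hydrogen atom for the local cosmic-ray spectrum) is an assumed value not given in this section, and the function names are hypothetical. The W44 numbers from the previous subsection are used as the reference case.

```python
import math

M_SUN_G = 1.989e33    # solar mass, g
M_H_G = 1.6735e-24    # mass of a hydrogen atom, g
KPC_CM = 3.0857e21    # kiloparsec, cm
Q_GAMMA = 2.2e-25     # ASSUMED emissivity, ph s^-1 per H atom, E > 100 MeV


def hadronic_flux(mass_msun, d_kpc, k_s):
    """Flux F(E > 100 MeV) in ph cm^-2 s^-1 from a cloud of given mass.

    k_s is the cosmic-ray enhancement factor relative to the local value.
    """
    n_atoms = mass_msun * M_SUN_G / M_H_G
    return k_s * Q_GAMMA * n_atoms / (4.0 * math.pi * (d_kpc * KPC_CM)**2)


def mass_for_same_flux(mass_msun, d_near_kpc, d_far_kpc):
    """Mass needed at d_far to match the flux of mass_msun at d_near."""
    return mass_msun * (d_far_kpc / d_near_kpc)**2


# Reference case: 6.2e4 Msun at 2.8 kpc with an enhancement factor of 40.
flux_ref = hadronic_flux(6.2e4, 2.8, 40.0)
# Mass required for the same flux and enhancement at 8 kpc.
mass_far = mass_for_same_flux(6.2e4, 2.8, 8.0)

print(f"reference flux: {flux_ref:.1e} ph cm^-2 s^-1")
print(f"mass needed at 8 kpc: {mass_far:.1e} Msun")
```

Moving the same cloud from 2.8 to 8 kpc requires roughly eight times more target mass for an unchanged flux, which is why a massive molecular complex can offset a large distance.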
Caswell et al. (1975) detected HI absorption all the way up to the terminal velocity towards the SNR G39.2−0.3, with almost continuous strong absorption between 60 km s⁻¹ and the terminal velocity. They therefore concluded that the remnant was certainly beyond the tangent point, and most likely at the far distance corresponding to 60 km s⁻¹, ∼ 9.6 kpc. Such a large distance is consistent with the high foreground hydrogen column inferred both by Becker and Helfand (1987), based on 21 cm absorption measurements with the VLA, and by Harrus and Slane (1999), based on ASCA observations. A distance of 9.6 kpc would place the SNR in the far Sagittarius arm, where, as Fig. 26 shows, the remnant is nearly coincident with a massive molecular complex. The complex is (40,59) in the catalog of Dame et al. (1986), who assigned the far kinematic distance based on 2 associated HII regions. The mass of this complex is estimated to be 2.1 × 10⁶ M⊙. The mass within the 95% confidence radius of the 3EG source (dotted circle in Fig. 26) is even higher, 3.4 × 10⁶ M⊙, because the radius also includes part of another molecular complex at higher longitude. If part of the mass contained in the molecular complex could serve as target material for the relativistic particles accelerated in the SNR shock, this 3EG detection could plausibly be produced by a combination of Bremsstrahlung and pion decay. With the mass quoted, a CR enhancement factor of less than 10 is all that is needed to produce the bulk of the observed γ-ray emission. However, it is clear that not all of the molecular mass can be illuminated by the SNR shock front. The SNR itself is less than 8 arcmin in size, while the 3EG source is ∼ 1 deg in size. Only 0.1% of the molecular material needs to serve as a target for the particles accelerated in G39.2−0.3 in order to produce the 3EG source. In this case, however, as in the case of 3EG J1639−4702, the enhancement factor (Eq. 
(15)) is very large (∼ 1000), as a result of the use of the Sedov solutions with typical values for the energy of the explosion and ambient unshocked density. There is an additional nearby SNR, G40.5−0.5 (Downes et al., 1980), which is also associated with the same cloud complex as G39.2−0.3 and lies at a similar distance. Though the center of G40.5−0.5 does not coincide with the 3EG source analyzed here, it is suggestive that the 3EG source lies between the two SNRs, and could thus have a composite origin. (SNR G40.5−0.5 does not appear in Table 1 due to the high ellipticity of the confidence level contours of 3EG J1903+0550.) A large region, −10 < b < 5, 38 < l < 43, comprising this latter SNR as well as the 3EG source was the subject of a search for γ-ray emission using the HEGRA system of imaging atmospheric telescopes (Aharonian et al., 2001a). No evidence for emission from point sources was detected, and the upper limits imposed were typically below 0.1 Crab for the flux above 1 TeV.

10.16. γ-ray source 3EG J2016+3657 – SNR G74.9+1.2 (CTB 87)

Although it has been suggested that SNR G74.9+1.2 (CTB 87) may be interacting with ambient clouds (Huang et al., 1983; Huang and Thaddeus, 1986), the coincident 3EG J2016+3657 source has been proposed as a counterpart of the blazar-like radio source G74.87+1.22 (B2013+370) (Halpern et al., 2001a; Mukherjee et al., 2000). B2013+370 is a compact, flat-spectrum, 2 Jy radio source at 1 GHz. Its multiwavelength properties were compiled by Mukherjee et al. (2000); since they resemble those of other blazars detected by EGRET, they make B2013+370 an interesting possible counterpart for this 3EG source. Optical photometry of B2013+370 shows that it is variable, providing additional evidence of its blazar nature (Halpern et al., 2001a). Additionally, the same authors presented a complete set of classifications for the 14 brightest ROSAT X-ray sources in the error circle of the 3EG source,
[Figure 27 image. Axes: Galactic Longitude (74.5 to 75.5) and Galactic Latitude (0.5 to 2.0). Color scale: W_CO (K km/s), 1 to 5. Labeled objects: G74.87+1.22, WR138, CTB 87.]
Fig. 27. CTB 87 region: The dashed contours are the usual EGRET 50%, 68%, 95%, and 99% confidence levels. The solid contours are taken from the 4.85 GHz continuum survey of Condon et al. (1991). The color is CO integrated over the range v = −65 to −50 km s⁻¹. The positions of the blazar G74.87+1.22 and of the WR-star WR138 are marked.
of which B2013+370 remained the most likely source of the γ-rays, should these come from a point-like source. The Crab-like supernova remnant CTB 87 is located at more than 10 kpc (Green, 2000), seemingly disfavoring its shell interactions as the cause of the EGRET source. There are also WR stars in the field (Romero et al., 1999a), which might produce γ-ray emission. This possibility remains to be analyzed. INTEGRAL observations would help in determining whether there is γ-ray emission coming from the stars. Fig. 27 shows a CO map for the CTB 87 region. One clearly defined molecular cloud appears in the map. The mean velocity of the molecular cloud is −57 km/s. Assuming a flat rotation curve beyond the solar circle, the cloud’s kinematic distance is 10.4 kpc. The total molecular mass within the 95% confidence radius of the 3EG source is 1.7 × 10⁵ M⊙. With such a high value for the molecular mass, there is still a chance that hadronic or leptonic γ-ray emission contributes to 3EG J2016+3657.
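The kinematic distance quoted for the CTB 87 cloud can be reproduced with a flat rotation curve. The sketch below assumes Galactic constants R₀ = 8.5 kpc and Θ₀ = 220 km s⁻¹, which are not stated in this section. It also takes the longitude of the remnant (74.9 deg) as the cloud longitude and neglects the small latitude. The function name is hypothetical.

```python
import math


def outer_galaxy_kinematic_distance(l_deg, v_lsr, r0_kpc=8.5, theta0=220.0):
    """Kinematic distance (kpc) for gas beyond the solar circle.

    ASSUMES a flat rotation curve with speed theta0 (km/s) and a
    Sun-Galactic center distance r0_kpc; both defaults are assumptions.
    l_deg : Galactic longitude in degrees
    v_lsr : radial velocity with respect to the LSR, km/s
    """
    l_rad = math.radians(l_deg)
    # Flat curve in the plane: v_lsr = theta0 * sin(l) * (R0 / R - 1).
    r_gal = r0_kpc / (1.0 + v_lsr / (theta0 * math.sin(l_rad)))
    # Law of cosines, R^2 = R0^2 + d^2 - 2 R0 d cos(l); outside the solar
    # circle only the positive root is physical.
    half_b = r0_kpc * math.cos(l_rad)
    return half_b + math.sqrt(half_b**2 + r_gal**2 - r0_kpc**2)


d_cloud = outer_galaxy_kinematic_distance(74.9, -57.0)
print(f"kinematic distance: {d_cloud:.1f} kpc")
```

With these assumed constants the distance comes out close to the 10.4 kpc quoted in the text. Because the velocity is negative at this longitude the gas lies outside the solar circle, so there is no near/far ambiguity.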
As in previous cases, only a precise determination of the γ-ray source position will disentangle the origin of this γ-ray source. Contrary to other EGRET-SNR pairs, though, this one has the particularity of enclosing a good candidate for an extra-galactic origin of the radiation.

10.17. γ-ray source 3EG J2020+4017 – SNR G78.2+2.1 (γ-Cygni Nebula, W66)

The SNR G78.2+2.1 lies in a very complex region of the sky, where more than forty HII regions and a large number of shell structures exist. Recently, Lozinskaya et al. (2000) have made an in-depth analysis of the SNR, which included new optical observations and a re-analysis of archival X-ray data. We refer the reader to their paper for details on the SNR structure and on features, other than those discussed here, related to the possible association with the 3EG source. The main result of Lozinskaya et al. is that X-ray observations lead to a self-consistent model of a young SNR at an early stage of adiabatic expansion into a medium of relatively low density (t = (5–6) × 10³ yr, n₀ = 0.14–0.3 cm⁻³). The possible association between 3EG J2020+4017 (with the largest signal-to-background ratio of any of the sources in the third EGRET catalog that are positionally coincident with shell-type supernova remnants) and G78.2+2.1 was previously suggested by Pollock (1985), Sturner and Dermer (1995) and Esposito et al. (1996), each using the γ-ray catalog available at the time. Brazier et al. (1996) studied this source (2EG 2020+4626 then) and reported the discovery of a point-like X-ray source, RX J2020.2+4026, which lies close to the center of the remnant. If one assumes that this source and the 3EG detection are related, the ratio of the γ-ray to X-ray fluxes is about 6000, similar to what is detected for the radio-quiet Geminga pulsar. This prompted Brazier et al. 
to suggest that 3EG J2020+4017 can indeed be a new Geminga-like radio-quiet pulsar, something which would be in tune with the hard spectrum, the low variability index, and the absence of ‘em’ classification in the 3EG Catalog (see Table 1). This possible association remains, then, very suggestive. However, Brazier et al. have reported no evidence for pulsations in the range 2 < f (Hz) < 13.3, ḟ < 10⁻¹¹ s⁻². Also, no radio pulsar is known to be superposed on the 3EG contours (see Table 6). The absence of pulsed γ-ray radiation can well be due to a statistical limitation of the EGRET data. Analysis of GLAST (LAT) performance shows that periodicities should be detectable if present in any of the known low-latitude EGRET sources (Carramiñana, 2001; Thompson et al., 2001). In particular, Brazier et al. considered that the absence of em (i.e. extended) classification in the 3EG Catalog was definitive in disregarding the SNR as a possible site of cosmic ray acceleration: the SNR is about 1 deg in size, and it should be visible as extended by EGRET if the ambient matter were uniformly distributed. This, however, may not be the case if the shock-accelerated particles interact and emit γ-rays most intensely at a localized nearby concentration of molecular material. Yamamoto et al. (1999) have studied the possible SNR–cloud interaction here and observed a very high CO(J = 2-1)/CO(J = 1-0) intensity ratio (∼ 1.5), very suggestive of an interacting cloud, at (l, b) = (78, 2.3). Interestingly, this position exactly coincides with that of the 3EG source (see Fig. 28). After re-analyzing the same set of data used by Yamamoto et al. (1999), we agree with their result, but have also found another high CO(J = 2-1)/CO(J = 1-0) ratio at an adjacent position (l = 77.875, b = 2.25), and yet another moderately high ratio (∼ 1) at another adjacent position: (l, b) = (78.00, 2.25). These high ratios coincide nicely with the γ-ray source and with
Fig. 28. The filtered radio emission at 2.7 GHz of the SNR G78.2+2.1 is shown in black. Radio contours, from the 2695-MHz map obtained with the Effelsberg 100-m single dish telescope (Fürst et al., 1990), are labeled in steps of 1 K in brightness temperature, starting at 3.0 K. The superposed white levels represent the 99%, 95%, 68%, and 50% statistical probability that a γ-ray source lies within each contour according to the EGRET catalog (Hartman et al., 1999). The background shows a CO(J = 1-0) map of the region, integrated over the range v = −20 to 20 km s⁻¹, which includes all the emission in this direction except for a small amount in the range −50 to −40 km s⁻¹ that is probably associated with the Perseus Arm.
a fairly well-defined CO cloud, and make a good case for the interaction between the SNR and the cloud. Assuming a distance of 1.7 kpc (Lozinskaya et al., 2000), the molecular mass of this cloud is 4700 M⊙ (calculated over the range l = 77.75–78.125 deg, b = 2.00–2.375 deg, and v = −10 to 0 km s⁻¹). This mass is rather uncertain since W66 lies in the so-called Cygnus X region of the Galaxy where the CO emission is very complex and strong, probably because we are viewing the Local spiral arm tangentially. It is hard to judge what amount of CO might be associated with the SNR and what amount is seen in projection. The Local arm emission is mainly in the range −20 to 20 km s⁻¹, and the Perseus Arm is seen at more negative velocities. Kinematic distances, particularly at low velocity, are very unreliable because our line of sight is almost tangent to the solar circle. Using the data obtained for the mass of the cloud in the vicinity of the 3EG source and apparently in interaction with the SNR, as well as the mean value for the unshocked density (n₀ = 0.22 cm⁻³) and distance (from Lozinskaya et al., 2000), it is possible to explain the γ-rays by hadronic processes. The use of Eq. (25) with plausible values of the magnetic fields in molecular clouds (Crutcher, 1988, 1994, 1999) rules out leptonic processes as the source of most of the radiation.
We cannot discard, however, a composite origin for the γ-rays, with part of the radiation coming from the putative γ-ray pulsar proposed by Brazier et al. (1996) and part from cloud interactions with the nucleonic component of freshly accelerated CRs.

10.18. An example beyond |b| > 10: γ-ray source 3EG J0010+7309 – SNR G119.5+10.2 (CTA 1)

The SNR-EGRET connection that we have explored so far was based on the sample shown in Table 1, constructed using 3EG sources within 10° of the Galactic plane. Although most SNRs fall into this latitude range, there is the possibility of finding a few nearby SNRs related to γ-ray sources at higher latitudes. One such example is briefly described in this section. CTA 1 is a shell-type SNR, but its shell is incomplete and broken out towards the NW. This breakout may be caused by more rapid expansion of the blast wave shock into a lower density region toward the NW. HI observations support this interpretation (Pineault et al., 1993, 1997). Since CTA 1 is located at a relatively high latitude (10 deg) and is nearby (1.4 ± 0.3 kpc, Pineault et al., 1993), it has a large angular size (90 arcmin), little foreground or background confusion, and it can be observed at exceptionally high linear resolution. The age of the SNR was estimated to be 10⁴ yr by Pineault et al. (1993), but it could be younger by a factor of 2 (Slane et al., 1997). CTA 1 has been the subject of intense observational campaigns in the past years. There have been both ROSAT and ASCA X-ray observations (Seward et al., 1995; Slane et al., 1997), as well as optical, infrared and radio observations (see Pineault et al., 1997; Brazier et al., 1998 for a review). The ROSAT observation confirmed that CTA 1 belongs to the class of composite SNRs, which show a shell-type morphology in the radio band and are centre-filled in X-rays. Five point sources were detected with ROSAT, one of which was found to coincide with the EGRET source (at the time of the analysis, 2EG J0008+7307).
ASCA data later revealed that this source, named RX J0007.0+7302, has a non-thermal spectrum, suggesting that it is the pulsar left from the supernova explosion (Slane et al., 1997). Optical observations were carried out by Brazier et al. (1998), with a 2.12-m telescope, but no object was found within the positional error box of the X-ray source. This allowed an upper limit to be set on the optical magnitude of any counterpart to the putative pulsar. 3EG J0010+7309 is a non-variable source under the I and τ schemes, and has a hard spectral index of 1.85 ± 0.10, compatible with that of the Vela pulsar. This source was also detected in the second EGRET Catalog, but with a shifted position which made it coincide with a nearby AGN, which was at the time proposed as a possible counterpart (Nolan et al., 1996). This AGN is not coincident with the 3EG source and cannot be considered a plausible counterpart any longer. Based on positional coincidence, on the hard spectral index, and on physical similarities between the Vela pulsar and RX J0007.0+7302, Brazier et al. (1998) proposed that the 3EG source and this X-ray source were related. For an assumed 1 sr beaming, the observed 100–2000 MeV flux corresponds to a luminosity of 4 × 10³³ erg s⁻¹, compatible with other γ-ray pulsar detections (see Table 5). Although a thorough CO investigation has not yet been done for this SNR, current data from HI observations do not support the presence of very dense and massive molecular clouds in its neighborhood (Pineault et al., 1997). It seems CTA 1 and 3EG J0010+7309 might be related only through the compact object left by the former (RX J0007.0+7302). A better localization of the γ-ray source with AGILE and GLAST will certainly test this suggestion.
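The luminosity quoted for an assumed 1 sr beam follows from L = Ω d² F_E, where F_E is the energy flux. The sketch below inverts this relation to show the energy flux implied by the quoted luminosity at 1.4 kpc, and how the luminosity changes across the ±0.3 kpc distance uncertainty. The function names are hypothetical, and the implied energy flux is a derived illustration, not a value given in the text.

```python
KPC_CM = 3.0857e21    # kiloparsec, cm


def beamed_luminosity(energy_flux, d_kpc, beam_sr=1.0):
    """Luminosity (erg/s) for an energy flux (erg cm^-2 s^-1) beamed
    into a solid angle beam_sr (sr) at distance d_kpc (kpc)."""
    return beam_sr * (d_kpc * KPC_CM)**2 * energy_flux


def implied_energy_flux(luminosity, d_kpc, beam_sr=1.0):
    """Inverse of beamed_luminosity: energy flux in erg cm^-2 s^-1."""
    return luminosity / (beam_sr * (d_kpc * KPC_CM)**2)


# Energy flux implied by 4e33 erg/s at 1.4 kpc for a 1 sr beam.
f_energy = implied_energy_flux(4.0e33, 1.4)
# Luminosity range spanned by the 1.1 to 1.7 kpc distance interval.
lum_low = beamed_luminosity(f_energy, 1.1)
lum_high = beamed_luminosity(f_energy, 1.7)

print(f"implied energy flux: {f_energy:.2e} erg cm^-2 s^-1")
print(f"luminosity range: {lum_low:.1e} to {lum_high:.1e} erg/s")
```

The distance uncertainty alone changes the inferred luminosity by more than a factor of two, comparable to the uncertainty introduced by the unknown beaming solid angle.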
11. SNRs discovered by their likely associated high-energy radiation

The SNR catalog compiled by Green (2000) is by no means complete. A large number of low surface brightness remnants remain to be discovered, hidden in the diffuse non-thermal continuum emission produced by the diffuse component of the cosmic-ray electrons in the Galaxy. In recent years, the application of filtering techniques to radio continuum data has led to the detection of several new SNRs (e.g. Duncan et al., 1995, 1997; Jonas, 1999; Combi et al., 1999; Klothes et al., 2001). The use of unidentified non-variable γ-ray sources as tracers of interacting remnants can result in new SNRs being discovered by their (plausibly associated) high-energy emission. As was emphasized in the previous sections, not all SNRs generate observable γ-rays (say, at EGRET sensitivity) at typical Galactic distances. A second ingredient is necessary: a target, i.e., a dense medium such as a molecular cloud. If a SN explodes in a cloudy medium, then more than a single cloud could be illuminated by p–p collisions and thus multiple γ-ray sources can emerge. Clusters of steady unidentified γ-ray sources could trace these situations and lead to the discovery of new, very extended SNRs of low radio surface brightness. This approach of considering EGRET sources as tracers of SNRs has been applied by Combi et al. (1998, 2001), leading to the discovery of two new SNRs. The basic technique consists of making HI line observations in the direction of clearly non-variable γ-ray sources. If some well-defined but small clouds (M ∼ 10³–10⁴ M⊙) are found within the 95% contours of the 3EG sources at velocities that correspond, according to the galactic rotation curve, to distances of less than 1 kpc, then large-scale (i.e. several degrees) radio continuum observations are carried out. These observations aim at detecting, through image filtering techniques, large SNRs of low surface brightness.
Observations at more than a single frequency are necessary in order to determine if the radio spectral index is non-thermal, as expected from such remnants. The SNR candidates discovered by this procedure are G327−12.0, located in the ARA region (Combi et al., 1998), and G06.5−12.0, located in Capricornus (Combi et al., 2001); see Fig. 29.
Fig. 29. The Capricornus SNR uncovered by the likely associated high-energy emission. Left: background-filtered radio emission at 408 MHz (radio data from Haslam et al., 1982) of the region surrounding three 3EG sources, whose contours are marked. Middle: radio map at 2.3 GHz (data from Jonas et al., 1998). The use of both maps shows that the SNR is a non-thermal radio source. Right: A map of the integrated column density of HI (velocity interval −3 to +5 km s⁻¹). Label units are 10¹⁹ atoms cm⁻². Use of this map shows that small enhancement factors could produce all three EGRET sources. From Combi et al. (2001).
The first one appears to be responsible for the γ-ray source 3EG J1659−6251, whereas the second could be related to a cluster of three 3EG sources: 3EG J1834−2803, 3EG J1847−3219, and 3EG J1850−2652. HI clouds have been found at the positions of all these sources except for 3EG J1847−3219. Both remnants are thought to be so close that cosmic-ray enhancement factors in the range 5 < kₛ < 45 are sufficient to explain the γ-ray sources through π⁰-decays alone. Their low radio surface brightnesses rule out electronic Bremsstrahlung as the source of the GeV flux. We expect that the use of filtering techniques in interferometric radio observations could lead to the discovery of more such remnants in the near future (see, e.g., Klothes et al., 2001), and with the improved capabilities of the next generation of satellites, γ-ray emission from far more distant interacting SNRs could be detected.

12. SNRs and their neighborhoods as TeV sources

As can be seen in Table 7, several observations of SNRs such as W44, W51, γ-Cygni, W63 and Tycho’s SNR, selected because of their possible association with molecular clouds and/or EGRET sources, have only produced TeV emission upper limits (Buckley et al., 1998). This, as discussed below, could indicate spectral cutoffs or breaks in the GeV–TeV energy range. For instance, the required differential source spectrum would have to steepen to ∼ E^−2.5 for γ-Cygni in order to escape detection at TeV energies (Fegan, 2001). Very recently (Aharonian et al., 2002a), the HEGRA system of imaging atmospheric Cherenkov telescopes reported a survey of one quarter of the Galactic plane (−2 < l < 85). TeV γ-ray emission from point sources and moderately extended sources (diameter less than 0.8 deg), including 86 known pulsars (PSR), 63 known supernova remnants (SNR) and 9 GeV sources, was searched for, with negative results.
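The argument that TeV upper limits imply a steepening of the spectrum rests on how strongly an extrapolated integral flux depends on the photon index. The sketch below extrapolates a single power law dN/dE ∝ E^−Γ from 100 MeV to 1 TeV. The normalization of 10⁻⁶ ph cm⁻² s⁻¹ is a hypothetical round number, not the measured flux of any source discussed here, and the function name is also hypothetical.

```python
def integral_flux(f0, e0_gev, e_gev, photon_index):
    """Integral flux F(>E) for a single power law dN/dE ~ E^-photon_index.

    f0 is the integral flux above e0_gev; valid for photon_index > 1.
    """
    return f0 * (e_gev / e0_gev) ** (1.0 - photon_index)


F_100MEV = 1.0e-6    # HYPOTHETICAL flux above 100 MeV, ph cm^-2 s^-1

# Extrapolate to 1 TeV with a hard index and with a steepened index.
flux_hard = integral_flux(F_100MEV, 0.1, 1000.0, 2.0)
flux_steep = integral_flux(F_100MEV, 0.1, 1000.0, 2.5)

print(f"F(>1 TeV), index 2.0: {flux_hard:.1e} ph cm^-2 s^-1")
print(f"F(>1 TeV), index 2.5: {flux_steep:.1e} ph cm^-2 s^-1")
print(f"ratio: {flux_hard / flux_steep:.0f}")
```

Over four decades in energy, a change of 0.5 in the photon index lowers the extrapolated TeV flux by two orders of magnitude. This is why a non-detection at TeV energies can be reconciled with a bright GeV source by a modest steepening or a spectral break.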
Upper limits range from 0.15 Crab units up to several Crab units, depending on the observation time and zenith angles covered: no TeV source was detected above 45σ in a total observation time of 115 h. At the same time, a search for point sources of radiation above 15 TeV has been conducted with the HEGRA AIROBICC array (Aharonian et al., 2002b), but only flux upper limits of around 1.3 times the flux of the Crab Nebula were obtained for candidate sources (including SNRs), depending, again, on the observation time and zenith angles covered. Positive detections, however, already exist. SN1006 (Tanimori et al., 1998) and RX J1713.7−3946 (Muraishi et al., 2000; Enomoto et al., 2002) were detected by the CANGAROO telescopes. Observations of SN1006 in 1996 and 1997 show a significant excess from the NW rim of the SNR. The excess is consistent with the location of non-thermal X-rays detected by ASCA (Koyama et al., 1995). Theoretical modeling for SN1006 has also been performed (e.g. Berezhko et al., 2003). Cassiopeia A was recently announced as a TeV source by the HEGRA collaboration (Aharonian et al., 2001b). Two hundred and thirty-two hours of observations yielded an excess at the 4.9σ level, and a flux of F = (5.8 ± 1.2_stat ± 2_syst) × 10^-13 cm^-2 s^-1 at E > 1 TeV. The origin of the γ-rays has been recently discussed by Berezhko et al. (2002). Cas A has already been associated with a bright source of hard X-rays, which indicates a population of non-thermal electrons with energies up to 100 TeV (Allen et al., 1999b). Cas A is nevertheless not a 3EG source, although this could simply be due to the limited sensitivity of EGRET. There have also been observations of unidentified γ-ray sources whose identifications are, perhaps, more tentative. These observations were made with the Whipple 10-m telescope (Buckley et al., 1997).
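Table 7. TeV observations of SNRs. Plerions are shown in the first panel and shell-type SNRs in the second. The third panel shows results for two 3EG sources coincident with SNRs for which there have also been TeV observations. Fluxes and upper limits are in units of 10^-11 cm^-2 s^-1; spectra are in units of 10^-11 cm^-2 s^-1 TeV^-1. Partially adapted from Fegan (2001) and Mori (2001).

First panel (plerions)

| Observatory | Object name | Exposure time (hours) | Flux/upper limit or spectrum |
|---|---|---|---|
| All TeV observatories | Crab Nebula | → ∞ | 7.0 (> 400 GeV) |
| CANGAROO | PSR 1706−44 | 60 | 0.15 (> 1 TeV) |
| CANGAROO | Vela pulsar | 116 | 0.26 (E/2 TeV)^-2.4 TeV^-1 |
| Durham | PSR 1706−44 | 10 | 1.2 (> 300 GeV) |
| Durham | Vela pulsar | 8.75 | < 5.0 (> 300 GeV) |

Second panel (shell-type SNRs)

| Observatory | Object name | Exposure time (hours) | Flux/upper limit or spectrum |
|---|---|---|---|
| CANGAROO | RX J1713.7−3946 | 66 | 0.53 (> 1.8 TeV) |
| CANGAROO | RX J1713.7−3946 (a) | 32 | 0.53; (1.63 ± 0.15 ± 0.32) E^(-2.84 ± 0.15 ± 0.20) |
| CANGAROO | SN1006 | 34 | 0.46 (> 1.7 TeV) |
| CANGAROO | W28 | 58 | < 0.88 (> 5 TeV) (b) |
| HEGRA | Cas A | 232 | 0.058 (> 1 TeV) (c) |
| HEGRA | γ-Cygni | 47 | < 1.1 (> 500 GeV) (d) |
| HEGRA | Monoceros | 120 | ? (e) |
| Durham | SN1006 | 41 | < 1.7 (> 300 GeV) |
| Whipple | Monoceros | 13.1 | < 4.8 (> 500 GeV) |
| Whipple | Cas A | 6.9 | < 0.66 (> 500 GeV) |
| Whipple | W44 | 6 | < 3.0 (> 300 GeV) |
| Whipple | W51 | 7.8 | < 3.6 (> 300 GeV) |
| Whipple | γ-Cygni | 9.3 | < 2.2 (> 300 GeV) |
| Whipple | W63 | 2.3 | < 6.4 (> 300 GeV) |
| Whipple | Tycho | 14.5 | < 0.8 (> 300 GeV) |
| CAT | Cas A | 24.4 | < 0.74 (> 400 GeV) |

Third panel (3EG sources coincident with SNRs)

| Object name | Exposure time (hours) | Flux/upper limit or spectrum |
|---|---|---|
| J2016+3657 | 287 | 5.8 (f) |
| J2020+4017 | 513 | 0.990 (f) |

(a) Observations made with CANGAROO-II (Enomoto et al., 2002).
(b) A different definition of energy threshold is used.
(c) Evidence for emission at the 4.9σ level (Pühlhofer et al., 2001).
(d) Limits converted from Crab units using the flux of Hillas et al. (1998).
(e) Not yet reported.
(f) Integral flux above 400 GeV.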
Two of them were related to the 3EG sources in our sample, and details are provided in Table 7. The Crab Nebula has been detected by several Cherenkov observatories and it is often used to provide a check on the calibration of new instruments (e.g. Aharonian et al., 2000). STACEE and CELESTE, on which we comment below, have published significant detections of the Crab Nebula in an energy range lower than that reached by other ground-based instruments, E > 190 ± 60 GeV (Oser et al., 2000) and E > 50 GeV (De Naurois et al., 2001), respectively. The Crab's energy spectrum between 300 GeV and 50 TeV has been well established (see below), and it steepens with energy (Hillas et al., 1998; Aharonian et al., 2000). No pulsation has yet been seen (Gillanders et al., 1997; Burdett et al., 1999; Aharonian et al., 1999). The CANGAROO team has detected the pulsar PSR 1706−44 in 60 h of observations (Kifune et al., 1995). They have also detected the Vela pulsar at the 6σ level, based on ∼120 h of observation. Observations by the Durham group (Chadwick et al., 1997) also confirmed these detections. In the case of Vela, the VHE signal, which is offset from the location of the pulsar by 0.14°, is thought to originate from a synchrotron nebula, powered by a population of relativistic electrons which were created in the supernova explosion and which have survived since then due to the low magnetic field in the nebula.

13. Concluding remarks

The coming years will again be exciting times for γ-ray astronomy, after the unfortunate forced demise of the Compton Observatory. INTEGRAL, AGILE, MAGIC, and the new stereo-IACTs are or should be on-line soon; and in the longer term, GLAST will surely bring about another revolution, answering existing questions, such as the one reviewed here, and posing further challenges. 
Ultimately, it will take a sensitive, high-spatial-resolution MeV–GeV detector such as GLAST, working in close consort with the ground-based TeV and radio telescopes, to address the Galactic cosmic ray origin problem: the single most definitive test that SNRs (or any other putative sources) are accelerating CR nuclei would be the statistically significant detection of the signature neutral pion γ-ray hump centered at 67.5 MeV (in log E), as has been seen already by EGRET in the diffuse γ-ray background (Hunter et al., 1997). Though such a detection, by itself, only proves the existence of lower energy (∼1 GeV/n) nuclei, it allows one to normalize the hadronic vs. electronic contributions in a model-independent fashion at those 'lower' energies. Thus, such a detection, together with the extension of the spectrum into the TeV regime and a multiwavelength spectrum inconsistent with an electronic origin, ought to be sufficient evidence to conclude that nuclei are being accelerated by SNRs. It remains to be seen if this will cleanly be borne out by the data. In any case, the SNR-EGRET connection looks stronger than ever. Several cases for a physical association, based on the analysis of γ-ray data and the molecular environment of the SNR, have been shown to be promising, and a definite conclusion about the origin of cosmic rays in SNR shocks seems to be reachable in the near future. Very recently, Erlykin and Wolfendale (2003) have shown, in addition, that for some nearby SNRs, even when there is no γ-ray signature, the idea of cosmic ray production in the shocks cannot be discarded, since the ambient gas could have been evacuated by the stellar wind of the progenitor star, or by the explosion of a nearby earlier
supernova. The case-by-case analysis shows, moreover, that it is at least plausible that EGRET has detected distant (more than 6 kpc) SNRs. There are 5 coinciding pairs of 3EG sources and SNRs for which the latter apparently lie at such large distances (disregarding those related to SNRs spatially close to the Galactic center). For all these cases, we have uncovered the existence of nearby, large, in some cases giant, molecular clouds that could enhance the GeV signal through pion decay. It is possible that the physical relationship between the 3EG source and the coinciding SNR could provide for these pairs a substantial part of the GeV emission observed. This does not preclude, however, composite origins for the total amount of the radiation detected. Some of these cases present other plausible scenarios. AGILE observations, in advance of GLAST, would greatly elucidate the origin of these 3EG sources, since even a factor of 2 improvement in spatial resolution would be enough to reject the SNR connection. Kilometer-scale neutrino telescopes have also been proposed as viable detectors of hadronic cosmic ray sources (Halzen and Hooper, 2002; Anchordoqui et al., 2003), and will be a welcome addition to the arsenal of space- and ground-based detectors that ought to be lined up by the time the large-scale neutrino telescopes are functional. Surely, with all this instrumentation focused on the problem we will finally be able to test Shklovskii's suspicion that "it is possible that ionized interstellar atoms are accelerated in the moving magnetic fields connected with an expanding [SNR] nebula" (Shklovskii, 1953).

Acknowledgements

We thank Fumio Yamamoto, Masumichi Seta, Toshihiro Handa, and Tetsuo Hasegawa for providing CO(2-1) survey data toward the SNRs IC443, W44, and W66. We acknowledge F. Aharonian, M. Mori, F. Bocchino, J. Paredes and S. Digel for their kind permission to reproduce Figs. 5, 6, 3, 11, 23, and 31, respectively. We thank D. 
Petry for his permission to adapt part of his work (Petry, 2001) in Appendix B. We further acknowledge Felix Aharonian, Seth Digel, Dave Thompson, Don Ellison, Paula Benaglia, Mischa Malkov, Olaf Reimer, and Isabelle Grenier for useful comments. We remain grateful to the Referee, who provided a detailed review that helped us improve the manuscript. D.F.T. was supported by Princeton University, CONICET, and Fundación Antorchas during different stages of this research. Also, part of his work was performed under the auspices of the U.S. Department of Energy (NNSA) by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48. He thanks Princeton University, SISSA, and the University of Barcelona for their kind hospitality. G.E.R. and J.A.C. were supported by CONICET (under grant PIP No 0430/98), ANPCT (PICT 03-04881), as well as by Fundación Antorchas. G.E.R. also thanks the Max Planck Association for additional support at the MPIfK, Heidelberg, as well as the University of Paris VII and the Service d'Astrophysique, Saclay, for kind hospitality. Y.M.B. acknowledges the support of the High Energy Astrophysics division at the CfA and the Chandra project through NASA contract NAS8-39073. This research would have been impossible without the effort of D.A. Green at the Mullard Radio Astronomy Observatory, Cambridge (UK) in providing a web-based SNR catalog. NASA's HEASARC, SIMBAD, Goddard's EGRET archive and MPE's ROSAT All-Sky Survey were also invaluable to this study. The optical (and part of the radio) data are from the Digitized Sky Survey, accessible through http://skyview.gsfc.nasa.gov/. Parts of this work were based on photographic data obtained using The UK Schmidt Telescope. The UK Schmidt
Telescope was operated by the Royal Observatory Edinburgh, with funding from the UK Science and Engineering Research Council, until 1988 June, and thereafter by the Anglo-Australian Observatory. Original plate material is copyright (c) the Royal Observatory Edinburgh and the Anglo-Australian Observatory. The plates were processed into the present compressed digital form with their permission. The Digitized Sky Survey was produced at the Space Telescope Science Institute under US Government grant NAG W-2166.

Appendix A. Reviewing the prospects for the forthcoming GeV satellites

A.1. INTEGRAL

The International Gamma-ray Astrophysics Laboratory (INTEGRAL) is now in orbit. It has two main scientific instruments, named SPI (Spectrometer of INTEGRAL) and IBIS (Imager onBoard the Integral Satellite). SPI will work in the range 20 keV–8 MeV and will perform high resolution spectroscopy; it can measure γ-ray line profiles with an accuracy of 2 keV. The sensitivity of IBIS lies in the range 10 keV–10 MeV, and it will focus on achieving good angular resolution (12 arcmin). INTEGRAL will also have three monitors, the twin JEM-Xs (3 to 35 keV) and an OMC in the optical band (500–600 nm). The JEM-Xs will have an angular resolution of 3 arcmin and a 4.8° field of view. The OMC will have a pixel resolution of 16.6 arcsec and a 5° × 5° field of view (Schönfelder, 2001). INTEGRAL could identify compact objects with good spatial resolution that are likely counterparts of EGRET sources, and will help confirm or reject interpretations based, for instance, on microquasars and γ-ray blazars. The INTEGRAL galactic plane exposure will help to clarify the confused region near Vela. Another 1 Ms exposure, part of the satellite Open Program, will focus on the Carina region. That part of the sky is populated with five EGRET sources, two of which were analyzed above in the case-by-case study: 3EG J1102−6103 and 3EG J1013−5915. 
As an example of its potential, we will briefly discuss what INTEGRAL can do in establishing the nature of these sources. The former case, 3EG J1102−6103, was briefly mentioned in the corresponding section above. Although a hadronic origin for the γ-ray emission is possible for this source, it could also be the result of inverse Compton up-scattering of UV photons by electrons accelerated in the winds of one or several Wolf–Rayet stars (particularly in the region where the winds of W39 and WR38B collide). If the latter explanation is correct, INTEGRAL should see a source where the winds collide. In the second case, the source 3EG J1013−5915 is likely mostly produced by PSR J1013−5915 (Camilo et al., 2001). The JEM-X monitor, in particular, could then probe the putative non-thermal emission from this region, especially the tail of the synchrotron spectrum and, in this way, help to determine the high-energy cutoff of the electron population in the source. In addition, there have been suggestions that INTEGRAL could detect the 0.511 MeV line from giant molecular clouds (see e.g. Guessoum et al., 2001). Giant molecular clouds are typically surrounded by HII regions, ionized in an uncertain fraction. If cosmic rays are diffusing into the cloud, the cores, too, may be ionized. If a sufficient density of cosmic rays is present, they could excite CNO nuclei as well as lead to the production of nuclear γ-ray lines and positron production, the latter being emitted by radioactive nuclei. The fluxes and line separations predicted are on the verge
Table 8. Instrumental parameters of some of the forthcoming satellite missions in the MeV–GeV range. EGRET data (first row) are shown for comparison. The second row corresponds to AGILE-GRID and the third to GLAST-LAT.

| Instrument | Energy range | Energy resolution [ΔE/E] | Effective area (cm^2) | Field of view (sr) | Angular resolution | Minimum flux (ph cm^-2 s^-1) | Source location (arcmin) | Mission life (yr) |
|---|---|---|---|---|---|---|---|---|
| EGRET | 20 MeV–30 GeV | ∼0.1 | 1500 | 0.5 | 100 MeV: 5.8°; 1 GeV: 1.7° | ∼10^-7 | ∼30 | 91–96 |
| AGILE-GRID | 30 MeV–50 GeV | ∼1 | 540 | 3.0 | 1 GeV: 0.6° | > 6 × 10^-8 | 5–20 | 03–06 |
| GLAST-LAT | 20 MeV–300 GeV | ∼0.1 | 8000 | 2.5 | 100 MeV: ∼3.5°; 10 GeV: ∼0.1° | ∼4 × 10^-9 | < 1 | 06–11 |
of detectability by INTEGRAL (Guessoum et al., 2001), but it will be an interesting arena to explore with forthcoming observations, particularly for those giant molecular clouds closer to Earth (regrettably not the ones we find superposed with the SNRs in Table 1). Thus, although it is expected that no direct observations with INTEGRAL can prove that cosmic rays are accelerated in SNR shocks (the energy range of the relevant phenomena, around 100 MeV and beyond, is outside the INTEGRAL energy band), such observations can be used to explore alternative explanations for the unidentified EGRET sources, and so are important for obtaining an overall picture of the SNR-EGRET source connection. Nuclear γ-ray line signatures in the 0.1–10 MeV band will certainly also provide important hints regarding the CR acceleration processes in the Galaxy.

A.2. AGILE

AGILE (Astro-rivelatore Gamma a Immagini LEggero) is expected to be launched in 2003 (for a recent review see Tavani et al., 2001). AGILE will have a very large field of view, covering at one time approximately 1/5 of the sky at energies between 30 MeV and 50 GeV. The angular resolution will be a factor of two better than that of EGRET, specifically due to the GRID instrument, whose parameters are given in Table 8. However, the sensitivity for point-like sources will remain comparable to that of EGRET. AGILE will also have detection and imaging capabilities in the hard X-ray range provided by the Super-AGILE detector. The main goals of Super-AGILE are the simultaneous γ-ray and hard X-ray detection of astrophysical sources (which was never achieved by previous γ-ray instruments), improved source positioning (1–3 arcmin, depending on intensity), fast burst alert, and on-board triggering capability. AGILE is thus very well suited for studying compact objects, particularly those presenting γ-ray variability or pulsed emission. 
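As a rough cross-check of the fields of view listed in Table 8, a solid angle in steradians can be converted into a fraction of the full sky (4π sr). The snippet below is only an illustrative sketch; the dictionary and function names are hypothetical, and the input values are the ones tabulated in Table 8.

```python
import math

# Fields of view in steradians, as listed in Table 8.
FIELD_OF_VIEW_SR = {"EGRET": 0.5, "AGILE-GRID": 3.0, "GLAST-LAT": 2.5}


def sky_fraction(fov_sr: float) -> float:
    """Fraction of the full sky (4*pi sr) covered by a field of view."""
    return fov_sr / (4.0 * math.pi)


if __name__ == "__main__":
    for name, fov in FIELD_OF_VIEW_SR.items():
        print(f"{name}: {100.0 * sky_fraction(fov):.0f}% of the sky")
```

With the tabulated 3.0 sr, AGILE-GRID covers somewhat more than a fifth of the sky at any one time, of the same order as the fraction quoted in the text, while EGRET's 0.5 sr corresponds to only a few per cent.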
AGILE will search for pulsed γ-ray emission from all recently discovered Parkes pulsars coincident with EGRET sources (D'Amico et al., 2001; Torres et al., 2001d; Camilo et al., 2001), thereby establishing their contribution to the EGRET γ-ray flux. Unpulsed γ-ray emission from plerionic SNRs, and searches for time variability in pulsar-wind nebula interactions, will also be possible targets for AGILE. Finally, AGILE will be essential in
assessing the possible existence of new populations of variable γ-ray sources in the Galaxy, such as non-pulsating black holes, X-ray binaries, and microquasars. In the case of 3EG J0542+2610, for instance, AGILE could test the hypothesis that the γ-ray emission is produced in A0535+26 (Romero et al., 2001). The key prediction of this model is an anti-correlation of the X-ray and γ-ray emissions.

A.3. GLAST

The Large Area Telescope (LAT) on the upcoming Gamma-ray Large Area Space Telescope (GLAST) mission will be suitable for studying the relationship between γ-ray sources and SNRs. The instrument's parameters are given in Table 8 (Michelson, 2001). GLAST, expected to be launched in 2006, will explore the energy range from 30 MeV to greater than 100 GeV with 10% energy resolution between 100 MeV and 10 GeV. GLAST uniquely combines high angular resolution with superb sensitivity, and has a moderate effective area. Sources below the EGRET threshold (∼6 × 10^-8 photons cm^-2 s^-1) will be localized to arcmin scales. This clearly will improve our understanding of the SNR shock acceleration of cosmic rays and hadronic γ-ray production. In what follows, we provide a brief example of GLAST capabilities as applied to the SNR γ-Cygni (W66), courtesy of Seth Digel and NASA (see also Ormes et al., 2000, astro-ph/0003270, from which this example is adapted). Allen et al. (1999a, b) studied what information could be obtained from GLAST observations based on a 1-year all-sky survey, assuming that 60% of the γ-ray flux was produced by a pulsar at the location proposed by Brazier et al. (1996). This pulsar was assumed to have a differential photon index of 2.08, that of the coincident source, 3EG J2020+4017. The remainder of the photon flux was assumed to come from relativistically accelerated particles through both leptonic and hadronic processes, with the large dominance of the latter in agreement with what we found in the corresponding section above. 
The position of the molecular cloud assumed in the simulations corresponds well with the CO clouds actually found (see Fig. 28). Allen et al. additionally assumed that the electron and proton spectra of W66 have the shape specified by Bell (1978, see their Eq. (5)), with a common relativistic spectral index of Γ = 2.08. The normalization of the electron spectrum was determined from the radio data, by assuming that the magnetic field strength is 100 μG. The normalization of the proton spectrum, instead, was determined by assuming that the total number of non-thermal electrons is a factor of 1.2 larger than the number of non-thermal protons. The spectral results of the simulations are shown in Fig. 30. It shows the γ-ray spectra produced by the putative pulsar, by the decay of neutral pions, as well as by the leptonic processes: Bremsstrahlung radiation of the electrons, and inverse Compton scattering of electrons on the cosmic microwave background radiation. The latter three of these four spectra were obtained using the γ-ray emissivity results of Gaisser et al. (1998) and Baring et al. (1999). The non-thermal Bremsstrahlung and neutral pion spectra were obtained assuming that the average density of the material with which the cosmic rays interact was n0 = 190 atoms cm^-3, which should be compared with the actual density we found, of 188 atoms cm^-3 (see Section 10.17). Perhaps most illustrative is Fig. 31, which demonstrates that the resolution of GLAST will allow one to distinguish the contribution of the pulsar from that of the interacting molecular cloud. Should a picture like Fig. 31 be the result of an actual GLAST observation, the proposed composite origin for 3EG J2020+4017 would be proved.
Fig. 30. Models of the neutral pion decay (π0), non-thermal Bremsstrahlung (NB), inverse Compton (IC), and pulsar (PSR) γ-ray spectra associated with W66. The sum of the three cosmic-ray components (π0 + NB + IC) of the shell and the sum of all four components (π0 + NB + IC + pulsar emission) are shown. EGRET spectral data are also included. From Allen et al. (1999a, b).
[Fig. 31 image: two panels, "EGRET Data (E > 1 GeV)" and "GLAST Simulation"; axes are Galactic longitude (77° to 79°) and Galactic latitude (1° to 3°); in-figure labels: "X-ray Source", "SNR Shell", "Shock-Accelerated CRs Interacting with ISM".]
Fig. 31. Comparison between the observed EGRET data and a GLAST simulation for the W66 SNR region. The large circle shows the position and extent of the radio shell of the SNR. GLAST will be able to localize distinctly the position of the putative γ-ray pulsar (marked here as the X-ray source) proposed by Brazier et al., as well as the SNR shock interacting with the molecular cloud, should both contribute as assumed to the observed γ-ray flux. Courtesy of S. Digel and NASA; adapted from Ormes et al. (2000), astro-ph/0003270.
Appendix B. Future TeV telescopes and their look at SNRs (adapted from Petry, 2001)

At higher and higher energies the sky appears darker and darker. There are just 57 sources detected between 1 and 10 GeV (Lamb and Macomb, 1997) and higher frequencies remain largely unobserved. In order to reach the highest energy γ-rays, new ground-based telescopes are being built and older ones upgraded. A very brief description of them, and of their impact on the SNR-γ-ray source association problem, is given here. One approach to constructing high-energy γ-ray detectors is to use existing solar farms, which have fields of large heliostats focusing sunlight on a central tower; such facilities lie unused at night. The arrival of the Cherenkov wavefront at groups of heliostats is precisely measured, and this information is used to differentiate γ-rays from cosmic-ray primaries (Ong, 1998). STACEE (Chantell et al., 1998), CELESTE (Smith et al., 1997), Solar-2 (Tümer et al., 1999), and GRAAL (Arqueros et al., 1999) are all examples of such facilities. MILAGRO (Sinnis et al., 1995) and Tibet HD (Amenomori et al., 1999) are examples of air-shower detectors. These arrays can in principle operate 24 h a day and are expected to achieve a larger energy range, comparable to the next generation imagers discussed below. Imaging atmospheric Cherenkov telescopes (IACTs) are ground-based γ-ray detectors using the atmosphere as a tracker and calorimeter and having good all-around performance, both in sensitivity and angular resolution, for sources above 100 GeV. They typically have point spread functions with θ68 < 0.16°. The number N of detected primary γ-ray photons will typically be > 100, and therefore source locations, with uncertainties computed as θ68/√N, can reach arcmin accuracies. Generically, two nearby point sources can be separated if their angular distance is > 3θ68. Some forthcoming IACT telescopes in the TeV regime are detailed in Table 9. 
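The localization and separation rules quoted above lend themselves to a quick numerical check. The following is a minimal Python sketch, not taken from Petry (2001); the function names are hypothetical, and the inputs (a 68% containment radius of 0.16° and 100 detected photons) are simply the typical values mentioned in the text.

```python
import math


def localization_error_deg(theta68_deg: float, n_photons: int) -> float:
    """Statistical source-location error, theta68 / sqrt(N), in degrees."""
    if n_photons <= 0:
        raise ValueError("n_photons must be positive")
    return theta68_deg / math.sqrt(n_photons)


def are_separable(angular_distance_deg: float, theta68_deg: float) -> bool:
    """Rule of thumb from the text: two point sources can be separated
    if their angular distance exceeds 3 * theta68."""
    return angular_distance_deg > 3.0 * theta68_deg


if __name__ == "__main__":
    theta68 = 0.16  # deg, typical IACT point spread function (from the text)
    n = 100         # detected primary gamma-ray photons (from the text)
    err_arcmin = 60.0 * localization_error_deg(theta68, n)
    print(f"location error: {err_arcmin:.2f} arcmin")
    print("0.5 deg pair separable:", are_separable(0.5, theta68))
```

For these inputs the location error comes out at roughly one arcminute, which is the sense in which the text speaks of arcmin accuracies; the separation threshold for the same point spread function is about half a degree.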
The successful introduction of stereo imaging technology in the HEGRA instrument has paved the way for bigger and more complex instruments. The unprecedented accuracy in air shower stereo reconstruction of the HESS telescopes in Namibia will allow one to obtain an angular resolution of ∼0.1° with a source location accuracy of only 10 arcseconds (Konopelko, 2001). After 50 h of observation of a point source, HESS will detect a flux of ∼10^-11 ph cm^-2 s^-1 at E > 100 GeV. In the case of extended sources of angular size Θ, the flux detected will be ∼(Θ/0.1 deg) × 10^-11 ph cm^-2 s^-1 in the same energy range. It is clear that such a powerful instrument will provide answers to many of the pending questions mentioned in this review, at least for the case of southern SNRs. A natural next step in the development of stereo imaging arrays for TeV astronomy is to place one of these systems at high altitude, where it could detect much lower-energy γ-rays. It has been estimated that at an altitude of 5 km, a threshold of 5 GeV might be achieved (Aharonian et al., 2001c). These systems, then, might replace orbital observatories in the GeV band in the foreseeable future. In an attempt to establish an agenda for the new TeV telescopes, Petry (2001) and Petry and Reimer (2001) have analyzed the observational possibilities for each of the unidentified and tentatively identified EGRET sources. Several problems have to be taken into account. The first is that of the apparent spectral steepening of several 3EG sources between the GeV and the TeV band (see Reimer and Bertsch, 2001). This could be why strong sources such as γ-Cygni, IC443, W28, and CTA 1 were not observed in the TeV band (e.g. Buckley et al., 1998). In order to get an estimate of the impact of this effect on any 3EG source, Petry (2001) proposed to use the same spectral steepening as that of the Crab Nebula. 
The differential spectral index of the Crab Nebula at 0.1 GeV is Γ_0.1 GeV = 2.19 ± 0.02 (Hartman et al., 1999), while it was measured by the Whipple telescope
Table 9. Next-generation imaging Cherenkov telescopes and their main characteristics. The locations of the sites chosen for these observatories and their estimated minimum energy thresholds are given. Roman numerals after some of the project names denote the project phases. The Type refers to the number of individual telescopes and the diameter of their mirror dish. The last column is the predicted energy threshold for γ-ray photons at zenith angle ϑ = 0°. (From Petry, 2001.)

| Project name | Latitude | Longitude | Altitude (m) | Type (m) | E_thr(0°) (GeV) |
|---|---|---|---|---|---|
| CANGAROO III | 31° S | 137° E | 160 | 4 × 10 | 80 |
| HESS I | 23° S | 17° E | 1800 | 4 × 13 | 40 |
| MAGIC I | 29° N | 17° W | 2200 | 1 × 17 | 30 |
| VERITAS | 32° N | 111° W | 1300 | 7 × 10 | 60 |
at 500 GeV to be Γ_500 GeV = 2.49 ± 0.06 ± 0.04 (Hillas et al., 1998). The latter authors showed that the steepening towards higher energies can be described by an increase in spectral index of 0.15 per decade of energy. Petry (2001) showed that this would then imply, if a similar situation is the case for the unidentified EGRET sources, that

$$F(E > x\,[\mathrm{GeV}]) = F_0 \times 10^{-\alpha} \times 10^{-(\alpha+0.15)} \left(\frac{x}{10}\right)^{-(\alpha+0.30)} = F_0 \times 10^{-(\alpha-0.15)}\, x^{-(\alpha+0.30)} \quad \text{for } 10 < x < 100,$$

$$F(E > x\,[\mathrm{GeV}]) = F_0 \times 10^{-\alpha} \times 10^{-(\alpha+0.15)} \times 10^{-(\alpha+0.30)} \left(\frac{x}{100}\right)^{-(\alpha+0.45)} = F_0 \times 10^{-(\alpha-0.45)}\, x^{-(\alpha+0.45)} \quad \text{for } 100 < x < 1000, \qquad (39)$$

where F_0 is the integrated flux above 0.1 GeV and α is the flux spectral index (obtained by subtracting 1.0 from the differential spectral index Γ, quoted for instance in Table 1) of a given source taken from the Third EGRET Catalog. An extrapolation without taking into account possible breaks in the spectrum could give a large over-estimation of the high-energy flux. Although not valid for all sources, the previous scheme can be used to give an idea of the expected flux from interesting 3EG detections (Petry, 2001; Petry and Reimer, 2001). Nonetheless, not only the spectral cutoffs, but also other technical problems affect actual observation [5].

[5] Cherenkov telescopes typically cannot observe at zenith angles much larger than 70°. The zenith angle ϑ at the upper culmination of an astronomical object depends on the latitude φ of the observatory and the declination DEC of the object according to ϑ = |φ − DEC|. Therefore, the condition |φ − DEC| ≤ 70° has to be imposed in the selection of observable objects. The area perpendicular to the optical axis illuminated by the Cherenkov light at the position of the telescope is proportional to the square of the distance d to the shower maximum, and d grows with ϑ as d ∝ 1/cos(ϑ). The energy threshold E_thr of a Cherenkov telescope is equivalent to a Cherenkov photon density threshold ρ_thr at the position of the telescope; ρ_thr is an instrumental constant determined by the trigger condition of the data acquisition system. These quantities depend on the zenith angle ϑ and the primary photon energy, and impact directly on the required flux sensitivity for an observation to be plausible. Under reasonable assumptions, ρ(E, ϑ) ∝ E · cos^2(ϑ) (Petry, 2001), and then, to satisfy the trigger condition ρ(E, ϑ) > ρ_thr, E has to increase with ϑ as E_thr(ϑ) = E_thr(0°) · cos^-2(ϑ), with the values for E_thr(0°) shown in Table 9. An increase in effective collection area is accompanied by a proportional increase in hadronic background rate, so that the gain in flux sensitivity is only the square root of the gain in area. If we define F_5σ(E_thr) as the integral flux above the energy threshold E_thr which results in a 5σ detection after 50 h of observation time, then F_5σ(E_thr(ϑ), ϑ) = F_5σ(E_thr(0°), 0°) · cos^-1(ϑ). The last main ingredient that has to be considered is the needed observation time itself. It can be computed as (Petry, 2001) T_5σ(E_thr) = (F(E_thr)/F_5σ(E_thr))^-2 × 50 h. Objects requiring much more than 50–100 h will probably be excluded from the first years of operation of the next generation IACTs, since, typically, the maximum expected duty cycle for these telescopes will be about 1000 h yr^-1.
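The chain of estimates described here (culmination zenith angle, zenith-dependent threshold and sensitivity, extrapolated flux from Eq. (39), and required observing time) can be strung together in a few lines. The sketch below is an illustration under stated assumptions rather than Petry's actual code: all function names are hypothetical, the declination is read off the source name 3EG J0617+2238, and the values chosen for F_0 and for the zenith sensitivity are placeholders, not catalog or instrument numbers.

```python
import math


def integral_flux_above(x_gev: float, f0: float, alpha: float) -> float:
    """Eq. (39): integral flux above x GeV, given the integral flux f0 above
    0.1 GeV and the flux spectral index alpha (steepening 0.15 per decade).
    x = 100 GeV is assigned to the first branch; the two branches agree there."""
    if 10.0 < x_gev <= 100.0:
        return f0 * 10.0 ** (-(alpha - 0.15)) * x_gev ** (-(alpha + 0.30))
    if 100.0 < x_gev <= 1000.0:
        return f0 * 10.0 ** (-(alpha - 0.45)) * x_gev ** (-(alpha + 0.45))
    raise ValueError("Eq. (39) is only stated for 10 < x < 1000 GeV")


def min_zenith_angle_deg(latitude_deg: float, dec_deg: float) -> float:
    """Zenith angle at upper culmination, |latitude - DEC|, in degrees."""
    return abs(latitude_deg - dec_deg)


def energy_threshold_gev(ethr_zenith_gev: float, zenith_deg: float) -> float:
    """E_thr(zenith) = E_thr(0 deg) * cos^-2(zenith)."""
    return ethr_zenith_gev / math.cos(math.radians(zenith_deg)) ** 2


def flux_sensitivity(f5sigma_zenith: float, zenith_deg: float) -> float:
    """F_5sigma(zenith) = F_5sigma(0 deg) * cos^-1(zenith)."""
    return f5sigma_zenith / math.cos(math.radians(zenith_deg))


def hours_for_5sigma(flux: float, f5sigma: float) -> float:
    """T_5sigma = (F / F_5sigma)^-2 * 50 h."""
    return 50.0 * (flux / f5sigma) ** -2


if __name__ == "__main__":
    latitude = -23.0          # HESS I site, 23 deg S (Table 9)
    ethr_zenith = 40.0        # GeV, HESS I threshold at zenith (Table 9)
    dec = 22.0 + 38.0 / 60.0  # deg, ASSUMED from the name 3EG J0617+2238
    alpha = 1.01              # flux spectral index of this source (Table 10)
    f0 = 5.0e-7               # ph cm^-2 s^-1 above 0.1 GeV, HYPOTHETICAL value
    f5_zenith = 1.0e-11       # ph cm^-2 s^-1, HYPOTHETICAL 50 h sensitivity

    zenith = min_zenith_angle_deg(latitude, dec)
    if zenith > 70.0:
        raise SystemExit("object culminates too low for this site")
    ethr = energy_threshold_gev(ethr_zenith, zenith)
    flux = integral_flux_above(ethr, f0, alpha)
    hours = hours_for_5sigma(flux, flux_sensitivity(f5_zenith, zenith))
    print(f"zenith angle {zenith:.1f} deg, threshold {ethr:.0f} GeV")
    print(f"extrapolated flux {flux:.2e} ph cm^-2 s^-1, T_5sigma {hours:.2f} h")
```

With the Table 9 values for HESS I, the sketch gives a culmination zenith angle of about 46° and a threshold of about 82 GeV for this source, in line with the corresponding entries of Table 10; the flux and observing time it prints depend entirely on the placeholder inputs and should not be read as predictions.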
Table 10. TeV observability of EGRET-SNR pairs, adapted from Petry (2001).

CANGAROO III

| Object name (3EG) | θ95 (°) | α | ϑ_min (°) | E_thr (GeV) | F(E_thr) (cm^-2 s^-1) | α(E_thr) | T_5σ (h) |
|---|---|---|---|---|---|---|---|
| 0617+2238 | 0.13 | 1.01 ± 0.06 | 54 | 228 | 5.13 × 10^-11 (3.22 × 10^-11) | 1.46 (1.52) | 22 (55) |
| 0631+0642 | 0.46 | 1.06 ± 0.15 | 38 | 128 | 4.12 × 10^-11 (1.41 × 10^-11) | 1.51 (1.66) | 19 (161) |
| 0634+0521 | 0.67 | 1.03 ± 0.26 | 36 | 123 | 5.39 × 10^-11 (8.47 × 10^-12) | 1.48 (1.74) | 11 (430) |
| 1410−6147 | 0.36 | 1.12 ± 0.14 | 31 | 108 | 8.78 × 10^-11 (3.30 × 10^-11) | 1.57 (1.71) | 3.51 (25) |
| 1714−3857 | 0.51 | 1.30 ± 0.20 | 8 | 82 | 2.70 × 10^-11 (7.06 × 10^-12) | 1.60 (1.80) | 28 (409) |
| 1744−3011 | 0.32 | 1.17 ± 0.08 | 1 | 80 | 9.72 × 10^-11 (5.70 × 10^-11) | 1.47 (1.55) | 2.86‡ (6.17) |
| 1746−2851 | 0.13 | 0.70 ± 0.07 | 2 | 80 | 4.22 × 10^-9 (2.64 × 10^-9) | 1.00 (1.07) | 0.07‡ (0.11‡) |
| 1800−2338 | 0.32 | 1.10 ± 0.10 | 7 | 81 | 1.46 × 10^-10 (7.45 × 10^-11) | 1.40 (1.50) | 1.91‡ (3.73‡) |
| 1824−1514 | 0.52 | 1.19 ± 0.18 | 16 | 86 | 4.18 × 10^-11 (1.24 × 10^-11) | 1.49 (1.67) | 12 (141) |
| 1826−1302 | 0.46 | 1.00 ± 0.11 | 18 | 88 | 2.78 × 10^-10 (1.32 × 10^-10) | 1.30 (1.41) | 1.00‡ (2.11‡) |
| 1837−0606 | 0.19 | 0.82 ± 0.14 | 25 | 97 | 6.30 × 10^-10 (2.40 × 10^-10) | 1.12 (1.26) | 0.44‡ (1.16‡) |
| 1856+0114 | 0.19 | 0.93 ± 0.10 | 32 | 112 | 3.33 × 10^-10 (1.65 × 10^-10) | 1.38 (1.48) | 0.83‡ (1.68‡) |

HESS I

| Object name (3EG) | θ95 (°) | α | ϑ_min (°) | E_thr (GeV) | F(E_thr) (cm^-2 s^-1) | α(E_thr) | T_5σ (h) |
|---|---|---|---|---|---|---|---|
| 0617+2238 | 0.13 | 1.01 ± 0.06 | 46 | 82 | 2.21 × 10^-10 (1.48 × 10^-10) | 1.31 (1.37) | 4.22 (9.44) |
| 0631+0642 | 0.46 | 1.06 ± 0.15 | 30 | 53 | 1.42 × 10^-10 (5.53 × 10^-11) | 1.36 (1.51) | 6.69 (44) |
| 0634+0521 | 0.67 | 1.03 ± 0.26 | 28 | 52 | 1.77 × 10^-10 (3.49 × 10^-11) | 1.33 (1.59) | 4.17 (107) |
| 1410−6147 | 0.36 | 1.12 ± 0.14 | 39 | 66 | 1.80 × 10^-10 (7.28 × 10^-11) | 1.42 (1.56) | 5.11 (31) |
| 1714−3857 | 0.51 | 1.30 ± 0.20 | 16 | 43 | 7.44 × 10^-11 (2.21 × 10^-11) | 1.60 (1.80) | 20 (224) |
| 1744−3011 | 0.32 | 1.17 ± 0.08 | 7 | 41 | 2.63 × 10^-10 (1.63 × 10^-10) | 1.47 (1.55) | 1.48 (3.88) |
| 1746−2851 | 0.13 | 0.70 ± 0.07 | 6 | 40 | 8.36 × 10^-9 (5.49 × 10^-9) | 1.00 (1.07) | 0.03‡ (0.05‡) |
| 1800−2338 | 0.32 | 1.10 ± 0.10 | 1 | 40 | 3.93 × 10^-10 (2.16 × 10^-10) | 1.40 (1.50) | 0.71‡ (2.17) |
| 1824−1514 | 0.52 | 1.19 ± 0.18 | 8 | 41 | 1.28 × 10^-10 (4.34 × 10^-11) | 1.49 (1.67) | 6.28 (55) |
| 1856+0114 | 0.19 | 0.93 ± 0.10 | 24 | 48 | 9.55 × 10^-10 (5.15 × 10^-10) | 1.23 (1.33) | 0.29‡ (0.54‡) |

MAGIC I

| Object name (3EG) | θ95 (°) | α | ϑ_min (°) | E_thr (GeV) | F(E_thr) (cm^-2 s^-1) | α(E_thr) | T_5σ (h) |
|---|---|---|---|---|---|---|---|
| 0010+7309 | 0.24 | 0.85 ± 0.10 | 44 | 58 | 7.87 × 10^-10 (4.16 × 10^-10) | 1.15 (1.25) | 0.35‡ (0.67‡) |
| 0617+2238 | 0.13 | 1.01 ± 0.06 | 6 | 30 | 8.11 × 10^-10 (5.75 × 10^-10) | 1.31 (1.37) | 0.34‡ (0.48‡) |
| 0631+0642 | 0.46 | 1.06 ± 0.15 | 22 | 35 | 2.49 × 10^-10 (1.03 × 10^-10) | 1.36 (1.51) | 1.12‡ (3.15) |
| 0634+0521 | 0.67 | 1.03 ± 0.26 | 24 | 36 | 2.89 × 10^-10 (6.26 × 10^-11) | 1.33 (1.59) | 0.96‡ (8.75) |
| 1744−3011 | 0.32 | 1.17 ± 0.08 | 59 | 114 | 5.64 × 10^-11 (3.21 × 10^-11) | 1.62 (1.70) | 35 (107) |
| 1746−2851 | 0.13 | 0.70 ± 0.07 | 58 | 106 | 3.16 × 10^-9 (1.94 × 10^-9) | 1.15 (1.22) | 0.09‡ (0.14‡) |
| 1800−2338 | 0.32 | 1.10 ± 0.10 | 53 | 82 | 1.45 × 10^-10 (7.42 × 10^-11) | 1.40 (1.50) | 3.71 (14) |
| 1824−1514 | 0.52 | 1.19 ± 0.18 | 44 | 58 | 7.48 × 10^-11 (2.38 × 10^-11) | 1.49 (1.67) | 10 (99) |
| 1837−0423 | 0.52 | 1.71 ± 0.44 | 33 | 43 | 4.44 × 10^-11 (3.08 × 10^-12) | 2.01 (2.45) | 21 (> 500) |
| 1856+0114 | 0.19 | 0.93 ± 0.10 | 28 | 38 | 1.26 × 10^-9 (6.97 × 10^-10) | 1.23 (1.33) | 0.22‡ (0.40‡) |
| 1903+0550 | 0.64 | 1.38 ± 0.17 | 23 | 35 | 9.10 × 10^-11 (3.35 × 10^-11) | 1.68 (1.85) | 4.12 (30) |
| 2016+3657 | 0.55 | 1.09 ± 0.11 | 8 | 31 | 3.43 × 10^-10 (1.83 × 10^-10) | 1.39 (1.50) | 0.81‡ (1.52‡) |
| 2020+4017 | 0.16 | 1.08 ± 0.04 | 11 | 31 | 1.26 × 10^-9 (1.00 × 10^-9) | 1.38 (1.42) | 0.22‡ (0.28‡) |

VERITAS

| Object name (3EG) | θ95 (°) | α | ϑ_min (°) | E_thr (GeV) | F(E_thr) (cm^-2 s^-1) | α(E_thr) | T_5σ (h) |
|---|---|---|---|---|---|---|---|
| 0010+7309 | 0.24 | 0.85 ± 0.10 | 41 | 141 | 2.70 × 10^-10 (1.31 × 10^-10) | 1.30 (1.40) | 1.03‡ (2.12‡) |
| 0617+2238 | 0.13 | 1.01 ± 0.06 | 9 | 82 | 2.20 × 10^-10 (1.47 × 10^-10) | 1.31 (1.37) | 1.26‡ (1.89‡) |
| 0631+0642 | 0.46 | 1.06 ± 0.15 | 25 | 98 | 6.16 × 10^-11 (2.19 × 10^-11) | 1.36 (1.51) | 4.51‡ (13) |
374
D.F. Torres et al. / Physics Reports 382 (2003) 303 – 380
Table 10 (continued) Object name
95 (◦ )
5
#min (◦ )
Ethr (GeV)
F(Ethr ) (cm−2 s−1 )
0634 + 0521 1800 − 2338 1856 + 0114 1903 + 0550 2016 + 3657 2020 + 4017
0.67 0.32 0.19 0.64 0.55 0.16
1:03 ± 0:26 1:10 ± 0:10 0:93 ± 0:10 1:38 ± 0:17 1:09 ± 0:11 1:08 ± 0:04
27 56 31 26 5 8
100 251 108 99 81 82
7:34 × 10−11 2:61 × 10−11 3:48 × 10−10 1:62 × 10−11 8:92 × 10−11 3:34 × 10−10
(1:22 × 10−11 ) (1:19 × 10−11 ) (1:73 × 10−10 ) (5:00 × 10−12 ) (4:27 × 10−11 ) (2:55 × 10−10 )
5(Ethr )
T5* (h)
1.48(1.74) 1.55(1.65) 1.38(1.48) 1.68(1.85) 1.39(1.50) 1.38(1.42)
3:78‡ (42) 23(110) 0:80‡ (1:61‡ ) 24(249) 3:11‡ (6:50‡ ) 0:83‡ (1:09‡ )
Taking these considerations into account, it is possible to determine which of the 3EG sources analyzed here might be detected by the next generation IACTs. Twelve out of the 19 SNR-EGRET cases listed in Table 1 can be considered likely candidates from a practical point of view. The complete results for all 3EG sources superposed on SNRs that might be detected in less than 50 h are compiled in Table 10. Columns are as follows: 3EG name, radius of the 95% confidence contour, and spectral index at 0.1 GeV, all from Hartman et al. (1999); minimum zenith angle; the IACT's minimum energy threshold; expected flux at the minimum energy threshold; the integral spectral index at the IACT's minimum energy threshold for this source; and the observation time needed to obtain a detection with $5\sigma$ significance (in brackets are the corresponding data if the spectrum at 0.1 GeV were steeper by one standard deviation). A ‡ mark indicates that the detection is photon-flux limited: the observation time was increased such that 100 photons are detected. Many of the SNR-EGRET pairs studied will be primary candidates for one or several of the forthcoming IACTs. A combination of TeV and GeV observations, together with an understanding of the molecular material distribution of the region, will be crucial in determining the nature of several of the unidentified EGRET sources analyzed here.

References
Aharonian, F.A., Drury, L.O’C., Völk, H.J., 1994. Astron. Astrophys. 285, 645.
Aharonian, F.A., Atoyan, A.M., 1996. Astron. Astrophys. 309, 91.
Aharonian, F.A., et al., 1999. Astron. Astrophys. 346, 913.
Aharonian, F.A., et al., 2000. Astrophys. J. 539, 317.
Aharonian, F.A., 2001. Space Sci. Rev. 99, 187.
Aharonian, F.A., et al., 2001a. Astron. Astrophys. 375, 1008.
Aharonian, F.A., et al., 2001b. Astron. Astrophys. 370, 112.
Aharonian, F.A., Konopelko, A.K., Völk, H.J., Quintana, H., 2001c. Astropart. Phys. 15, 335.
Aharonian, F.A., et al., 2002a. Astron. Astrophys. 395, 803.
Aharonian, F.A., et al., 2002b. Astron. Astrophys. 390, 39.
Aharonian, F., et al., 2002. Astron. Astrophys. 393, L37.
Allen, G.E., Digel, S.W., Ormes, J.F., 1999a. Proceedings of the International Cosmic Ray Conference, Vol. 5, Utah, p. 515.
Allen, G.E., Gotthelf, E.V., Petre, R., 1999b. Proceedings of the International Cosmic Ray Conference, Vol. 5, Utah, p. 480.
Amenomori, M., et al., 1999. Astrophys. J. 525, L93.
Anchordoqui, L.A., Torres, D.F., McCauley, T., Romero, G.E., Aharonian, F.A., 2003. Astrophys. J. 589, 481.
Anderson, S.B., et al., 1996. Astrophys. J. 468, L55.
Arikawa, Y., Tatematsu, K., Sekimoto, Y., Takahashi, T., 1999. Pub. Astron. Soc. Japan 51, L7.
Arqueros, F., et al., 1999. Proceedings of the International Cosmic Ray Conference, Vol. 5, Utah, p. 211.
Asaoka, I., Aschenbach, B., 1994. Astron. Astrophys. 284, 573.
Aschenbach, B., 1998. Nature 396, 141.
Atoyan, A.M., Aharonian, F.A., Völk, H.J., 1995. Phys. Rev. D 52, 3265.
Bamba, A., Yokogawa, J., Sakano, M., Koyama, K., 2000. Astron. J. 52, 259.
Baring, M.G., et al., 1999. Astrophys. J. 513, 311.
Becker, R.H., Helfand, D.J., 1987. Astron. J. 94, 1629.
Bell, A.R., 1978. Mon. Not. R. Astron. Soc. 182, 147.
Benaglia, P., Romero, G.E., Stevens, I., Torres, D.F., 2001. Astron. Astrophys. 366, 605.
Benaglia, P., Romero, G.E., 2003. Astron. Astrophys. 399, 1121.
Berezhko, E.G., Ksenofontov, L.T., Völk, H.J., 2002. Astron. Astrophys. 395, 943.
Berezhko, E.G., Pühlhofer, G., Völk, H.J., 2003. Astron. Astrophys. 400, 971.
Bignami, G.F., Hermsen, W., 1983. Annu. Rev. Astron. Astrophys. 21, 67.
Bildsten, L., 1997. Astrophys. J. (Suppl.) 113, 367.
Bloemen, H., et al., 1997. Astrophys. J. 475, L25.
Bocchino, F., Bykov, A.M., 2000. Astron. Astrophys. 362, L29.
Bocchino, F., Bykov, A.M., 2001. Astron. Astrophys. 376, 248.
Brazier, K.T.S., Kanbach, G., Carramiñana, A., Guichard, J., Merck, M., 1996. Mon. Not. R. Astron. Soc. 281, 1033.
Brazier, K.T.S., Reimer, O., Kanbach, G., Carramiñana, A., 1998. Mon. Not. R. Astron. Soc. 295, 819.
Buckley, J.H., et al., 1997. Proceedings of the 25th International Cosmic Ray Conference, Vol. 3, Durban, p. 237.
Buckley, J.H., et al., 1998. Astron. Astrophys. 329, 639.
Burdett, A., et al., 1999. Proceedings of the International Cosmic Ray Conference, Vol. 5, Utah, p. 448.
Butt, Y., Torres, D.F., Combi, J.A., Dame, T., Romero, G.E., 2001. Astrophys. J. 562, L167.
Butt, Y., Torres, D.F., Romero, G.E., Dame, T., Combi, J.A., 2002a. Nature 418, 499.
Butt, Y., Torres, D.F., Combi, J.A., Dame, T., Romero, G.E., 2002b. Proceedings of the 22nd Moriond Astrophysics Meeting: The Gamma Ray Universe, Les Arcs, Savoie, France, 9–16 March 2002. astro-ph/0206132, to appear.
Butt, Y., et al., 2003. astro-ph/0302342, Astrophys. J., submitted for publication.
Bykov, A.M., Chevalier, R.A., Ellison, D.C., Uvarov, Y.A., 2000. Astrophys. J. 538, 203.
Camilo, F., et al., 2001. Astrophys. J. 557, L51.
Caraveo, P.A., De Luca, A., Mignani, R.P., Bignami, G.F., 2001. Astrophys. J. 561, 930.
Carramiñana, A., 2001. In: Carramiñana, O., Reimer, O., Thomson, D. (Eds.), Proceedings of the International Workshop on The Nature of Galactic Unidentified Gamma-ray Sources. Kluwer Academic Press, Dordrecht, p. 107.
Case, G., Bhattacharya, D., 1998. Astrophys. J. 504, 761.
Case, G., Bhattacharya, D., 1999. Astrophys. J. 521, 246.
Caswell, J.L., Barnes, P.J., 1985. Mon. Not. R. Astron. Soc. 216, 753.
Caswell, J.L., Murray, J.D., Roger, R.S., Cole, D.J., Cooke, D.J., 1975. Astron. Astrophys. 45, 239.
Chadwick, P.M., et al., 1997. Proceedings of the 25th International Cosmic Ray Conference, Vol. 3, Durban, p. 189.
Chantell, M.C., et al., 1998. Nucl. Instr. Meth. A 408, 468.
Chapman, J.M., Leitherer, C., Koribalski, B., Bouter, R., Storey, M., 1999. Astrophys. J. 518, 890.
Chen, W., White, R., 1991. Astrophys. J. 381, L63.
Cheng, K.S., Ruderman, M., 1989. Astrophys. J. 337, L77.
Cheng, K.S., Ruderman, M., 1991. Astrophys. J. 373, 187.
Chevalier, R.A., 1999. Astrophys. J. 511, 798.
Clark, D.H., Caswell, J.L., 1976. Mon. Not. R. Astron. Soc. 174, 267.
Clark, G.W., Garmire, G.P., Kraushaar, W.L., 1968. Astrophys. J. 153, L203.
Claussen, M.J., Frail, D.A., Goss, W.M., Gaume, R.A., 1997. Astrophys. J. 489, 143.
Clemens, D.P., 1985. Astrophys. J. 295, 422.
Combi, J.A., Romero, G.E., 1995. Astron. Astrophys. 303, 872.
Combi, J.A., Romero, G.E., Azacárate, I., 1997. Astrophys. Space Sci. 250, 1.
Combi, J.A., Romero, G.E., Benaglia, P., 1998. Astron. Astrophys. 333, L91.
Combi, J.A., Romero, G.E., Benaglia, P., 1999. Astrophys. J. 519, L177.
Combi, J.A., Romero, G.E., Benaglia, P., Jonas, J., 2001. Astron. Astrophys. 366, 1047.
Condon, J.J., Broderick, J.J., Seielstad, G.A., 1991. Astron. J. 102, 2041.
Contreras, M.E., et al., 1997. Astrophys. J. 488, L153.
Corbel, S., Chapuis, C., Dame, T.M., Durouchoux, P., 1999. Astrophys. J. 526, L29.
Cornett, R.H., Chin, G., Knapp, G.R., 1977. Astron. Astrophys. 54, 889.
Crovisier, J., Fillit, R., Kazes, I., 1973. Astron. Astrophys. 27, 417.
Crutcher, R.M., 1988. In: Dickman, R., Snell, R., Young, J. (Eds.), Molecular Clouds, Milky-Way & External Galaxies. Springer, New York, p. 105.
Crutcher, R.M., 1994. Clouds, cores and low mass stars. In: Clemens, D.P., Barvainis, R. (Eds.), Astronomical Society of the Pacific Conference Series, Vol. 65, Proceedings of the Fourth Haystack Observatory, San Francisco, p. 87.
Crutcher, R.M., 1999. Astrophys. J. 520, 706.
Cusumano, G., Maccarone, M.C., Nicastro, L., Sacco, B., Kaaret, P., 2000. Astrophys. J. 528, L25.
Dame, T.M., Elmegreen, B.G., Cohen, R.S., Thaddeus, P., 1986. Astrophys. J. 305, 892.
Dame, T.M., Hartmann, D., Thaddeus, P., 2001. Astrophys. J. 547, 792.
D’Amico, N., et al., 2001. Astrophys. J. 552, L45.
De Jager, O.C., Mastichiadis, A., 1997. Astrophys. J. 482, 874.
De Naurois, M., et al., 2001. In: Aharonian, F., Völk, H.J. (Eds.), Proceedings of the International Symposium on High Energy Gamma-Ray Astro. (Heidelberg). AIP, New York, p. 540.
Dermer, C.D., 1986. Astron. Astrophys. 157, 223.
Dermer, C.D., et al., 1997. Astron. J. 113, 1379.
Dermer, C.D., 1997. In: Dermer, C.D., Strickman, M.S., Kurfess, J.D., Williamsburg, V.A. (Eds.), Proceedings of the Fourth Compton Symposium, Williamsburg, April 1997, AIP Conference Proceedings, Vol. 410, p. 1275.
Dickman, R.L., Snell, R.L., Ziurys, L.M., Huang, Y.-L., 1992. Astrophys. J. 400, 203.
Doherty, et al., 2003. Mon. Not. R. Astron. Soc. 399, 1048.
Dorfi, E.A., 1991. Astron. Astrophys. 251, 597.
Dorfi, E.A., 2000. Astrophys. Space Sci. 272, 227.
Downes, A.J.B., Pauls, T., Salter, C.J., 1980. Astron. Astrophys. 92, 47.
Drury, L.O’C., Aharonian, F., Völk, H.J., 1994. Astron. Astrophys. 287, 959.
Drury, L.O’C., et al., 2001. Report of working group number four at the ISSI workshop on Astrophysics of Galactic Cosmic Rays: astro-ph/0106046.
Dubner, G.M., Velázquez, P.F., Goss, W.M., Holdaway, M.A., 2000. Astron. J. 120, 1933.
Duncan, A.R., Stewart, R.T., Haynes, R.F., Jones, K.L., 1995. Mon. Not. R. Astron. Soc. 277, 36.
Duncan, A.R., Stewart, R.T., Haynes, R.F., Jones, K.L., 1997. Mon. Not. R. Astron. Soc. 287, 722.
Eichler, D., Usov, V.V., 1993. Astrophys. J. 402, 271.
Ellison, D.C., Slane, P., Gaensler, B.M., 2001. Astrophys. J. 563, 191.
Enomoto, R., et al., 2002. Nature 416, 823.
Erlykin, A.D., Wolfendale, A.W., 2003. J. Phys. G: Nucl. Part. Phys. 29, 641–664.
Esposito, J.A., Hunter, S.D., Kanbach, G., Sreekumar, P., 1996. Astrophys. J. 461, 820.
Falcke, H., Cotera, A., Duschl, W.J., Melia, F., Rieke, M.J., 1999. The Central Parsecs of the Galaxy. ASP Conference Series, Vol. 186.
Fegan, S., 2001. In: Carramiñana, O., Reimer, O., Thomson, D. (Eds.), Proceedings of the International Workshop on The Nature of Galactic Unidentified Gamma-ray Sources. Kluwer Academic Press, Dordrecht, p. 285.
Fesen, R.A., 1984. Astrophys. J. 281, 658.
Fierro, J.M., et al., 1993. Astrophys. J. 413, L27.
Frail, D.A., Kulkarni, S.R., Vasisht, G., 1993. Nature 365, 136.
Frail, D.A., Giacani, E.B., Goss, W.M., Dubner, G., 1996. Astrophys. J. 464, L165.
Fürst, E., Reich, W., Reich, P., Reif, K., 1990. Astrophys. J. Suppl. 85, 691.
Gaisser, T.K., Protheroe, R.J., Stanev, T., 1998. Astrophys. J. 492, 219.
Gehrels, N., Macomb, D.J., Bertsch, D.L., Thompson, D.J., Hartman, R.C., 2000. Nature 404, 363.
Gehrels, N., Shrader, C.R., 2001. Gamma 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), AIP Conference Proceedings, New York, p. 3.
Georganopoulos, M., Aharonian, F.A., Kirk, J., 2002. Astron. Astrophys. 388, L25.
Georgelin, Y.P., Georgelin, Y.M., 1970. Astron. Astrophys. 7, 133.
Georgelin, Y.P., Georgelin, Y.M., 1976. Astron. Astrophys. 49, 57.
Giacani, E.B., et al., 1997. Astron. J. 113, 1379.
Gillanders, G., et al., 1997. Proceedings of the 25th International Cosmic Ray Conference, Vol. 3, Durban, p. 185.
Ginzburg, V.L., Syrovatskii, S.I., 1964. The Origin of Cosmic Rays. Pergamon Press, London.
Goldwurm, A., 2001. Exploring the gamma-ray universe. In: Battrick, B., Gimenez, A., Reglero, V., Winkler, C. (Eds.), Proceedings of the Fourth INTEGRAL Workshop. ESA SP-459, Noordwijk: ESA Publications Division, p. 455.
Grabelsky, D.A., Cohen, R.S., Bronfman, L., Thaddeus, P., 1988. Astrophys. J. 331, 181.
Gray, A.D., 1994. Mon. Not. R. Astron. Soc. 270, 847.
Green, A.J., Frail, D.A., Goss, W.M., Otrupcek, R., 1997. Astrophys. J. 114, 2058.
Green, A.J., Cram, L.E., Large, M.I., Ye, T., 1999. Astrophys. J. Suppl. 122, 207.
Green, D.A., 1997. Proc. Astron. Soc. Australia 14, 73.
Green, D.A., 2000. A Catalogue of Galactic Supernova Remnants, Mullard Radio Astronomy Observatory, Cambridge, UK (available at http://www.mrao.cam.ac.uk/surveys/snrs/).
Grenier, I.A., 1995. Adv. Space Res. 15, 73.
Grenier, I.A., 2000. Astron. Astrophys. 364, L93.
Grenier, I.A., 2001. In: Carramiñana, A., Reimer, O., Thompson, D. (Eds.), The Nature of Unidentified Galactic Gamma-Ray Sources. Kluwer Academic Publishers, Dordrecht, p. 51.
Guessoum, N., Von Ballmoos, P., Knodleseder, J., Vedrenne, G., 2001. Gamma 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), AIP Conference Proceedings, New York, p. 16.
Halpern, J.P., Eracleous, M., Mukherjee, R., Gotthelf, E.V., 2001a. Astrophys. J. 551, 1016.
Halpern, J.P., et al., 2001b. Astrophys. J. 552, L125.
Halpern, J.P., Gotthelf, E.V., Mirabal, N., Camilo, F., 2002. Astrophys. J. 573, L41.
Halzen, F., Hooper, D., 2002. Rept. Prog. Phys. 65, 1025.
Hartman, R.C., et al., 1999. Astrophys. J. Suppl. 123, 79.
Harrus, I.M., Hughes, J.P., Helfand, D.J., 1996. Astrophys. J. 464, L161.
Harrus, I.M., Slane, P.O., 1999. Astrophys. J. 516, 811.
Haslam, C.G.T., Salter, C.J., Stoffel, H., Wilson, W.E., 1982. Astrophys. J. Suppl. 47, 1.
Hensberge, H., Pavlovski, K., Verschueren, W., 2000. Astron. Astrophys. 258, 553.
Hermsen, W., et al., 1981. In: Proceedings of the International Cosmic Ray Conference 17, Paris, 1, 320.
Hillas, A.M., et al., 1998. Astrophys. J. 503, 774.
Hnatyk, B., Petruk, O., 1998. Condens. Matter Phys. 1, 655.
Huang, Y.-L., Dame, T.M., Thaddeus, P., 1983. Astrophys. J. 272, 609.
Huang, Y.-L., Thaddeus, P., 1986. Astrophys. J. 309, 804.
Hunter, et al., 1997. Astrophys. J. 481, 205.
Jaffe, T.R., Bhattacharya, D., Dixon, D.D., Zych, A.D., 1997. Astrophys. J. 484, L129.
Jonas, J.L., 1999. Ph.D. Thesis, Rhodes University.
Jones, L.R., Smith, A., Angellini, L., 1993. Mon. Not. R. Astron. Soc. 265, 631.
Jones, T.W., 2001. In: Chung-Ming Ko (Ed.), Proceedings of the Seventh Taipei Astrophysics Workshop on Cosmic Rays in the Universe, ASP Conference Proceedings, Vol. 241. Astronomical Society of the Pacific, San Francisco, astro-ph/0012483.
Kaaret, P., Cottam, J., 1996. Astrophys. J. 492, L35.
Kaaret, P., Piraino, S., Halpern, J., Eracleous, M., 1999. Astrophys. J. 523, 197.
Kaaret, P., Cusumano, G., Sacco, B., 2000. Astrophys. J. 542, L41.
Kaspi, V.M., et al., 1997. Astrophys. J. 485, 820.
Kaspi, V.M., et al., 2000. Astrophys. J. 528, 445.
Kassim, N.E., Frail, D.A., 1996. Mon. Not. R. Astron. Soc. 283, L51.
Kaufman-Bernadó, M.M., Romero, G.E., Mirabel, I.F., 2002. Astron. Astrophys. 385, L10.
Keohane, J.W., Petre, R., Gotthelf, E.V., Ozaki, M., Koyama, K., 1997. Astrophys. J. 484, 350.
Kifune, T., et al., 1995. Astrophys. J. 438, L91.
Kirk, J.G., Dendy, R.O., 2001. J. Phys. G 27, 1589.
Kirshner, R.P., Winkler, P.F., 1979. Astrophys. J. 227, 853.
Klothes, R., Landecker, T.L., Foster, T., Leahy, D.A., 2001. Astron. Astrophys. 376, 641.
Kniffen, D.A., et al., 1974. Nature 25, 397.
Koralesky, B., Frail, D.A., Goss, W.M., Claussen, M.J., Green, A.J., 1998. Astron. J. 116, 1323.
Konopelko, A.K., 2001. In: Aharonian, F.A., Völk, H.J. (Eds.), High Energy Gamma-Ray Astronomy. AIP, Melville, p. 568.
Kraushaar, W.L., et al., 1972. Astrophys. J. 177, 341.
Lamb, R.C., Macomb, D.J., 1997. Astrophys. J. 488, 872.
Lasker, B.M., et al., 1990. Astron. J. 99, 2019.
Leahy, D.A., Naranan, S., Singh, K.P., 1986. Mon. Not. R. Astron. Soc. 220, L501.
Leslard, R.W., et al., 1995. Proceedings of the International Cosmic Ray Conference, Rome, 2, 475.
Longair, M.S., 1994. High Energy Astrophysics, Vol. 2, Stars, the Galaxy and the Interstellar Medium. 2nd edition. Cambridge University Press, Cambridge.
Lozinskaya, T.A., 1974. Sov. Astron. 17, 603.
Lozinskaya, T.A., 1992. Supernovae and Stellar Wind in the Interstellar Medium. AIP, New York.
Lozinskaya, T.A., Pravdikova, V.V., Finoguenov, A.V., 2000. Astronomy Lett. 26, 77.
Lucarelli, F., Konopelko, A., Rowell, G., Fonseca, V., 2001. The HEGRA collaboration, High-energy gamma-ray astronomy. In: Aharonian, F., Völk, H. (Eds.), AIP Conference Proceedings, New York, p. 779.
Manchester, R.N., et al., 2001. Mon. Not. R. Astron. Soc. 328, 17.
Markiewicz, W.J., Drury, L.O’C., Völk, H.J., 1990. Astron. Astrophys. 236, 487.
Markoff, S., Melia, F., Sarcevic, I., 1997. Astrophys. J. 489, L47.
Markoff, S., Melia, F., Sarcevic, I., 1999. Astrophys. J. 522, 870.
Mastichiadis, A., 1996. Astron. Astrophys. 305, L53.
Mastichiadis, A., Ozernoy, L.M., 1994. Astrophys. J. 426, 599.
Mayer-Hasselwander, H.A., et al., 1998. Astron. Astrophys. 335, 161.
McLaughlin, M.A., Mattox, J.R., Cordes, J.M., Thompson, D.J., 1996. Astrophys. J. 473, 763.
Melia, F., 1992. Astrophys. J. 387, L25.
Melia, F., Falcke, H., 2001. Annu. Rev. Astron. Astrophys. 39, 309.
Merck, M., et al., 1996. Astron. Astrophys. Suppl. 120, 465.
Michelson, P.F., 2001. Gamma 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), AIP Conference Proceedings, New York, p. 713.
Milne, D.K., 1979. Aust. J. Phys. 32, 83.
Mirabal, N., Halpern, J.P., 2001. Astrophys. J. 547, L137.
Mirabal, N., Halpern, J.P., Eracleous, M., Becker, R.H., 2000. Astrophys. J. 541, 180; Ray Conference, Vol. 1, Paris, p. 17.
Montmerle, T., 1979. Astrophys. J. 231, 95.
Morfill, G.E., Forman, M., Bignami, G., 1984. Astrophys. J. 284, 856.
Morfill, G.E., Tenorio-Tagle, G., 1983. Space Sci. Rev. 36, 93.
Mori, M., 2001. J. Phys. Soc. Japan B 70 (Suppl.), 22.
Mukherjee, R., Gotthelf, E.V., Halpern, J., Tavani, M., 2000. Astrophys. J. 542, 740.
Muraishi, H., et al., 2000. Astron. Astrophys. 354, L57.
Naito, T., Takahara, F., 1994. J. Phys. G 20, 477.
Nicastro, L., Gaensler, B.M., McLaughlin, M.A., 2000. Astron. Astrophys. 362, L5.
Nolan, P.L., et al., 1996. Astrophys. J. 459, 100.
Odegard, N., 1986. Astrophys. J. 301, 813.
Ögelman, H., Finley, J.P., 1993. Astrophys. J. 413, L31.
Olbert, C., Clearfield, R.C., Williams, N., Keohane, J., Frail, D.A., 2001. Astrophys. J. 554, L205.
Ong, R.A., 1998. Phys. Rep. 305, 93.
Oser, S., et al., 2000. Astrophys. J. 547, 949.
Paredes, J.M., Martí, J., Ribó, M., Massi, M., 2000. Science 288, 2341.
Petre, R., Keohane, J., Hwang, U., Allen, G., Gotthelf, E., 1998. The Hot Universe. In: Katsuji Koyama, Shunji Kitamoto, Masayuki Itoh (Eds.), Proceedings of the IAU Symposium 188. Kluwer Academic Press, Dordrecht, p. 117.
Petry, D., Reimer, O., 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), Proceedings of the Gamma 2001 Workshop. AIP Conference Proceedings, New York, p. 696.
Petry, D., 2001. In: Carramiñana, O., Reimer, O., Thomson, D. (Eds.), Proceedings of the International Workshop on The Nature of Galactic Unidentified Gamma-ray Sources. Kluwer Academic Press, Dordrecht, p. 299.
Pineault, S., et al., 1993. Astron. J. 105, 1060.
Pineault, S., et al., 1997. Astron. Astrophys. 324, 1152.
Plaga, R., 2002. New Astron. 7, 317.
Pohl, M., 1996. Astron. Astrophys. 307, 57.
Pohl, M., 1997. Astron. Astrophys. 317, 441.
Pollock, A.M.T., 1985. Astron. Astrophys. 150, 339.
Preite-Martinez, A., Feroci, M., Strom, R.G., Mineo, T., 2000. In: McConnell, M.L., Ryan, J.M. (Eds.), Proceedings of the Fifth Compton Symposium, American Institute of Physics (AIP), Williamsburg, VA. AIP Conference Proceedings, Vol. 510, p. 73.
Punsly, B., 1998a. Astrophys. J. 498, 640.
Punsly, B., 1998b. Astrophys. J. 498, 660.
Punsly, B., Romero, G.E., Torres, D.F., Combi, J.A., 2000. Astron. Astrophys. 364, 556.
Radhakrishman, V., Goss, W.M., Murray, J.D., Brooks, J.W., 1972. Astrophys. J. Suppl. 24, 49.
Ribó, M., et al., 2002. Astron. Astrophys. 384, 954.
Reich, W., Fürst, E., Sofue, Y., 1984. Astron. Astrophys. 133, L4.
Reimer, O., Bertsch, D.L., 2001. Proceedings of the 27th International Cosmic Ray Conference, Hamburg, pp. 2566–2569.
Reimer, O., Pohl, M., 2002. Astron. Astrophys. 390, L43.
Reynoso, E., Mangum, J.G., 2000. Astrophys. J. 545, 874.
Reynolds, S.P., 1996. Astrophys. J. 459, L13.
Reynolds, S.P., 1998. Astrophys. J. 493, 375.
Rho, J., Petre, R., 1998. Astrophys. J. 503, L167.
Rho, J., Petre, R., Schlegel, E.M., Hester, J.J., 1994. Astrophys. J. 430, 757; 1999, Astrophys. J. 515, 712.
Roberts, M.S.E., Romani, R.W., Kawai, N., 2001. Astrophys. J. Suppl. 133, 451.
Rodgers, A.W., Campbell, C.T., Whiteoak, J.B., 1960. Mon. Not. R. Astron. Soc. 121, 103.
Romero, G.E., Combi, J.A., Colomb, F.R., 1994. Astron. Astrophys. 288, 731.
Romero, G.E., 1998. Rev. Mex. Astron. Astrophys. 34, 29.
Romero, G.E., Benaglia, P., Torres, D.F., 1999a. Astron. Astrophys. 348, 868.
Romero, G.E., Torres, D.F., Andruchow, I., Anchordoqui, L.A., Link, B., 1999b. Mon. Not. R. Astron. Soc. 308, 799.
Romero, G.E., Kaufman-Bernadó, M., Combi, J., Torres, D.F., 2001. Astron. Astrophys. 376, 599.
Romero, G.E., 2001. In: Carramiñana, A., Reimer, O., Thompson, D. (Eds.), The Nature of Unidentified Galactic Gamma-Ray Sources. Kluwer Academic Press, Dordrecht, p. 65.
Romero, G.E., Torres, D.F., 2003. Astrophys. J. 586, L33.
Rowell, G.P., et al., 2000. Astron. Astrophys. 359, 337.
Ruiz, M.T., May, J., 1986. Astrophys. J. 309, 667.
Sakamoto, S., Hasegawa, T., Hayashi, M., Handa, T., Oka, T., 1995. Astrophys. J. Suppl. 100, 125.
Seta, M., et al., 1998. Astrophys. J. 505, 286.
Seward, F.D., Schmidt, B., Slane, P., 1995. Astrophys. J. 453, 284.
Schönfelder, V., 2001. Gamma 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), AIP Conference Proceedings, New York, p. 809.
Scoville, N.Z., Irvine, W.M., Wannier, P.G., Predmore, C.R., 1977. Astrophys. J. 216, 320.
Sedov, L.I., 1959. Similarities and Dimensional Methods in Mechanics. Academic Press, New York.
Stecker, F.W., 1977. Astrophys. J. 212, 60.
Shklovskii, I.S., 1953. Dokl. Akad. Nauk SSSR 91, No. 3, 475.
Sigl, G., Torres, D.F., Anchordoqui, L.A., Romero, G.E., 2001. Phys. Rev. D 63, 081302.
Sinnis, G., et al., 1995. Nucl. Phys. B (Proc. Suppl.) 43, 141.
Slane, P., et al., 1997. Astrophys. J. 485, 221.
Slane, P., et al., 1999. Astrophys. J. 525, 357.
Smith, D.A., et al., 1997. Nucl. Phys. (Proc. Suppl.) 54, 362.
Sofue, Y., Reich, W., 1979. Astron. Astrophys. Suppl. 38, 251.
Sturner, S.J., Dermer, C.D., 1995. Astron. Astrophys. 293, L17.
Sturner, S.J., Dermer, C.D., Mattox, J.R., 1996. Astron. Astrophys. Suppl. 120, 445.
Sturner, S.J., Skibo, J.G., Dermer, C.D., Mattox, J.R., 1997. Astrophys. J. 490, 617.
Sugizaki, M., et al., 2001. Astrophys. J. Suppl. 134, 77.
Swanenburg, et al., 1981. Astrophys. J. 243, L69.
Tanimori, T., et al., 1998. Astrophys. J. 497, L25.
Tavani, M., et al., 2001. Gamma 2001. In: Ritz, S., Gehrels, N., Schader, C.R. (Eds.), AIP Conference Proceedings, New York, p. 729.
Taylor, J.H., et al., 1993. Astrophys. J. Suppl. 88, 529 (updated at ftp://pulsar.princeton.edu).
Thompson, D.J., et al., 1975. Astrophys. J. 200, L79.
Thompson, D.J., et al., 1995. Astrophys. J. Suppl. 101, 259.
Thompson, D.J., et al., 1996. Astrophys. J. Suppl. 107, 227.
Thompson, D.J., et al., 1999. Astrophys. J. 516, 297.
Thompson, D.J., 2001. In: Aharonian, F., Völk, H.J. (Eds.), Proceedings of the International Symposium on High Energy Gamma-Ray Astro. (Heidelberg). AIP, New York, p. 103.
Thompson, D.J., Digel, S.W., Nolan, P.L., Reimer, O., 2001. In: Slane, P.O., Gaensler, B.M. (Eds.), Proceedings of the Neutron Stars in Supernova Remnants, ASP Conference Series, Vol. 9999, 2002. astro-ph/0112518, in press.
Tompkins, W., 1999. Ph.D. Thesis, Stanford University.
Torres, D.F., et al., 2001a. Astron. Astrophys. 370, 468.
Torres, D.F., Combi, J.A., Romero, G.E., Benaglia, P., 2001b. In: Carramiñana, O., Reimer, O., Thomson, D. (Eds.), Proceedings of the International Workshop on The Nature of Galactic Unidentified Gamma-ray Sources. Kluwer Academic Press, Dordrecht, p. 97.
Torres, D.F., Pessah, M.E., Romero, G.E., 2001c. Astron. Nachr. 322, 223.
Torres, D.F., Butt, Y.M., Camilo, F., 2001d. Astrophys. J. 560, L155.
Torres, D.F., Romero, G.E., Eiroa, E.F., 2002. Astrophys. J. 560, 600.
Torres, D.F., Romero, G.E., Eiroa, E.F., Wambsganss, J., Pessah, M.E., 2003a. Mon. Not. R. Astron. Soc. 339, 335.
Torres, D.F., Nuza, S.E., 2003b. Astrophys. J. 583, L25.
Tümer, T., et al., 1999. Astropart. Phys. 11, 271.
Uchida, K., Morris, M., Yusef-Zadeh, F., 1992. Astron. J. 104, 1533.
Uchiyama, Y., Takahashi, T., Aharonian, F.A., 2002a. Pub. Astron. Soc. Japan 54, 73.
Uchiyama, Y., Takahashi, T., Aharonian, F.A., Mattox, J.R., 2002b. Astrophys. J. 571, 866.
Uchiyama, Y., Aharonian, F.A., Takahashi, T., 2002c. Astron. Astrophys., to appear.
Usov, V.V., 1994. Astrophys. J. 427, 394.
van den Ancker, M.E., Thé, P.S., de Winter, D., 2000. Astron. Astrophys. 362, 580.
Vargas, M., et al., 1996. Astron. Astrophys. 313, 828.
Velázquez, P.F., Dubner, G.M., Goss, W.M., Green, A., 2002. astro-ph/0207530.
Völk, H.J., 2001. In: Proceedings of the 21st Moriond Astrophysics Meeting: High Energy Astrophysical Phenomena, Les Arcs, Savoie, France, 20–27 January 2001. astro-ph/0105356, to appear.
Völk, H.J., 2002. In: Proceedings of the International Cosmic Ray Conference, Hamburg, astro-ph/0202421, to appear.
Wallace, P.M., et al., 2000. Astrophys. J. 540, 184.
Wallace, P.M., Halpern, J.P., Magalhaes, A.M., Thompson, D.J., 2002. Astrophys. J. 569, 36.
Wang, Z.R., Asaoka, I., Hayakawa, S., Koyama, K., 1992. PASJ 44, 303.
Whiteoak, J.B.Z., Green, A.J., 1996. Astron. Astrophys. Suppl. 118, 329.
Wootten, A., 1981. Astrophys. J. 245, 105.
Yadigaroglu, I.-A., Romani, R.W., 1997. Astrophys. J. 476, 356.
Yamamoto, F., Hasegawa, T., Morino, J., Handa, T., Sawada, T., Dame, T.M., 1999. In: Nakamoto, T. (Ed.), Proceedings of the Star Formation 1999. Nobeyama Radio Observatory, p. 110.
Yusef-Zadeh, F., Melia, F., Wardle, M., 2000. Science 287, 85.
Yusef-Zadeh, F., Law, C., Wardle, M., 2002. Astrophys. J. 568, L121.
Zhang, L., Cheng, K.S., 1998. Astron. Astrophys. 335, 234.
Zhang, L., Zhang, Y.J., Cheng, K.S., 2000. Astron. Astrophys. 357, 957.
CONTENTS VOLUME 382
L.M. Varela, M. García, V. Mosquera. Exact mean-field theory of ionic solutions: non-Debye screening, 1
S. Capitani. Lattice perturbation theory, 113
D.F. Torres, G.E. Romero, T.M. Dame, J.A. Combi, Y.M. Butt. Supernova remnants and γ-ray sources, 303
PII: S0370-1573(03)00270-9