ORDER-N METHODOLOGIES AND THEIR APPLICATIONS
S.Y. WU, C.S. JAYANTHI
Physics Reports 358 (2002) 1–74
Order-N methodologies and their applications
S.Y. Wu ∗, C.S. Jayanthi
Department of Physics, University of Louisville, Louisville, KY 40292, USA
Received January 2001; editor: A.A. Maradudin
Contents
0. Introduction
1. Localization of electronic degree of freedom: the “nearsightedness” of an electron in a many-electron system
2. Issues relevant to the development of O(N) procedures
3. The direct approach
3.1. The divide-and-conquer (DC) method [1,37,43]
3.2. The Fermi operator expansion (FOE) method [27,30]
3.3. The kernel polynomial method (KPM) [36]
3.4. Order-N non-orthogonal tight-binding molecular dynamics (O(N)/NOTB-MD) schemes [44]
3.5. Recursion method-related O(N) schemes
4. Order-N methods based on variational approaches
4.1. The density matrix (DM) method [21,22,28]
4.2. Self-consistent LDA-based density matrix method [31,73–75]
4.3. Penalty function-based energy minimization approach [39]
4.4. Variational approaches using localized orbitals minimization (LOM)
4.5. Absolute energy minimum approach to linear scaling [41]
5. Issues affecting the implementation of O(N) algorithms
5.1. Issues related to tight-binding approaches
5.2. Construction of the Hamiltonian in first principles O(N) algorithms
6. Applications
6.1. The shape of large fullerenes [35,123–125]
6.2. Dimensional stability of single-walled carbon nanotubes [119]
6.3. Initial stages of growth of Si/Si(001)
6.4. Liquid carbon structures
6.5. Extended Si{311} defects
6.6. Controllable reversibility in the mechanical deformation of a single-walled nanotube by a local probe
∗ Corresponding author. Tel.: +1-502-852-3335; fax: +1-502-852-0742. E-mail address: [email protected] (S.Y. Wu).
© 2002 Elsevier Science B.V. All rights reserved. 0370-1573/02/$ - see front matter. PII: S0370-1573(01)00035-7
S.Y. Wu, C.S. Jayanthi / Physics Reports 358 (2002) 1–74
7. Choosing an O(N) scheme
7.1. The DC method
7.2. The FOE method
7.3. The O(N)/NOTB-MD scheme
7.4. The DM method
7.5. The LOM methods
7.6. Some general remarks
7.7. Recent refinement on the FOE method: energy renormalization method [157]
8. Epilogue
Acknowledgements
References
Abstract
An exhaustive inventory of existing order-N methodologies for the calculation of the total energy as well as the atomic forces up to 1999 has been conducted. These methodologies are discussed in terms of the key approximations involved in each method. Emphasis is placed on the roles played by these approximations and how they affect the accuracy and efficiency of the method. Issues affecting the implementation of various order-N procedures, such as the choice of the tight-binding model in the order-N tight-binding approaches and the construction of the Hamiltonian in the order-N ab initio approaches, are also discussed. Some typical examples of applications of the order-N methods to study problems of realistic sizes are presented to provide a glimpse of the capability of utilizing the order-N methods to predict the stable structures and properties of complex systems with reduced symmetry. This review is expected to serve as a clearinghouse where a single resource is provided to help guide the reader to decide, among the existing methodologies, which method can best fulfill the task at hand.
© 2002 Elsevier Science B.V. All rights reserved.
PACS: 71.15.Nc; 71.15.Pd
0. Introduction
Modern, high technology-related theoretical materials research emphasizes the understanding of structure-dependent, in particular defect-controlled, properties of materials. The goal is to predict accurately the structural, electronic, vibrational, optical, and magnetic properties of materials. The formulation of these kinds of problems invariably involves a large number of degrees of freedom. For situations where a quantum-mechanically correct understanding of system properties is essential, the methods most frequently used range from semi-empirical tight-binding (TB) approaches to methods based on the density functional theory (DFT) in the local density approximation (LDA). The quantum mechanical description of electrons in a large multi-atom system requires the calculation of the ground state energy either by solving the eigenvalue equation or by minimizing the total energy functional with respect to single-particle orbitals. The number of computational operations for both the TB and the DFT/LDA approaches scales as $N^3$, where $N$ is the number of atoms in the system under consideration. The $N^3$-scaling presents a severe challenge to the computational methodologies if one has to deal with systems of realistic sizes that usually involve a large number of degrees of freedom.
The year 1991, in some respects, marks the beginning of an endeavor to develop a methodology which provides the framework to calculate the total energy and, in the broader context of molecular dynamics (MD) simulations, the atomic forces with a computational effort that scales linearly with respect to the size of the system. In that year, W.-T. Yang [1] proposed a method of calculating the electronic structure with a linear scaling behavior. The approach divides a system into overlapping subsystems. The local charge density is determined in each subsystem, with the subsystems connected by a common chemical potential.
In this way, one can achieve a linear scaling for the calculation of the charge density. Since then, a great surge of activity has been generated in the development of methodologies for the calculation of the total energy and the atomic forces of systems with a large number of degrees of freedom which scale linearly with the size of the system. Within a short span of less than ten years, many publications devoted to this endeavor had appeared in the literature, topical conferences/workshops, titled-program sections in national and international meetings, etc. A subfield, appropriately named the order-N (O(N)) method, is thus emerging in the field of computational condensed matter physics and materials science. At this moment, there are many different approaches to the construction of the O(N) procedures for both TB Hamiltonians and DFT/LDA-based methods. These procedures cover the spectrum from the direct calculation of the local charge density from the local Hamiltonian using certain approximations, to unconstrained variational approaches in the framework of either the one-particle density matrix (OPDM) or generalized Wannier functions (GWF). Efforts have also been made to identify key issues and to assess their roles with respect to both the efficiency and the accuracy of various O(N) procedures [2,3]. While most of the implementations of the O(N) procedures are for TB Hamiltonians, there are now instances where linear scaling algorithms in self-consistent LDA have been developed. Among the many different approaches, no single encompassing O(N) method has emerged which can be applied to systems of realistic sizes with the assurance of both efficiency and reliability. The stage is therefore set for a critical examination of various aspects of the existing O(N) procedures. This review is undertaken with that in mind. We have conducted an exhaustive inventory of the existing O(N)
procedures, which have appeared in the literature up to 1999. In this review, we shall discuss these methodologies, distill key approximations involved in each method, state their roles and how they affect the efficiency of the calculation and the reliability of the result. In this respect, we are providing a clearinghouse where a practitioner will have a single source and can decide, among the available methodologies, which method can best fulfill the task at hand. In addition, we hope that the critical analysis of these various aspects of the existing procedures will promote further refinement on the twin themes of efficiency and reliability.
This review is organized as follows. In Section 1, the feasibility for the development of O(N) schemes is discussed in terms of the localization of the electronic degree of freedom. Issues relevant to the development of various O(N) procedures are presented in Section 2. O(N) schemes based on the direct approach, in which the density matrix is calculated directly using various approximations, are given in Section 3. O(N) algorithms developed using the variational approach are presented in Section 4. Section 5 deals with issues affecting the implementation of the O(N) schemes. Section 6 gives some typical examples of the applications of O(N) schemes to study properties of systems of realistic sizes which cannot be treated by conventional methods. The factors helpful in the choice of a particular O(N) scheme for a certain problem at hand are discussed in Section 7.
1. Localization of electronic degree of freedom: the “nearsightedness” of an electron in a many-electron system
The major part of the computational effort of the calculation of the total energy of a system of $N$ atoms is the determination of the band structure energy of the electrons in the system. The scaling behavior for the calculation of the electronic ground state energy of the system can usually be expressed as $N^{\gamma}$.
In the case of calculations based on the configuration interaction (CI) methods, $\gamma$ is of the order 7 and the scaling approaches exponential behavior for very large $N$. For calculations using methods based on DFT/LDA or TB Hamiltonians, $\gamma$ is of the order 3. These scaling behaviors present a serious bottleneck for the study of systems of realistic sizes, in particular when large-scale molecular dynamics (MD) simulations are involved. In the last several years, a concerted effort has been devoted to the development of methodologies, now referred to as the order-N (O(N)) method, to circumvent the bottleneck present in the conventional total energy and atomic force calculations by reducing the scaling of computational effort from $N^{\gamma}$ with $\gamma \geq 3$ to $\gamma = 1$.
The development of an O(N) procedure for the calculation of the total energy of a multi-atom system is critically dependent on the localization of the electronic degrees of freedom. The realization that the electronic degree of freedom is short-ranged can be traced to the works of Laue [4] and Friedel [5]. In Kittel’s version [6], the Laue theorem states that “the particle density per unit energy range is approximately independent of the form of the boundary, at distances from the boundary greater than a characteristic particle wave length at the energy considered”. Using the analogy between the mathematics describing the energy density of black body radiation $\rho(\omega,\vec r)$ and that for the quantum local density of states (LDOS) $\rho(E,\vec r)$, Friedel pointed out that the LDOS is independent of the boundary conditions provided that the locality is a few wavelengths from the boundary. Although there is no complete formal proof of the
localization of LDOS for the general case, Kittel had provided the following proof at the suggestion by Dyson for the case of free electron gas. Consider a system of free electron gas described by the eigenvalue equation
$$ -\frac{1}{2m}\nabla^2 \psi_{\vec k}(\vec r) = E_{\vec k}\,\psi_{\vec k}(\vec r), \tag{1.1} $$
where $m$ is the mass of the electron and $\hbar$ has been taken to be 1. Introducing the function
$$ u(\vec r,t) = \sum_{\vec k} \psi^{*}_{\vec k}(\vec r)\,\psi_{\vec k}(0)\,e^{-E_{\vec k}t}, \tag{1.2} $$
it can be shown that $u(\vec r,t)$ satisfies the diffusion equation
$$ \frac{\partial u}{\partial t} = D\,\nabla^2 u \tag{1.3} $$
with $D = 1/2m$. At the origin, $u(0,t) = \sum_{\vec k} \psi^{*}_{\vec k}(0)\,\psi_{\vec k}(0)\,e^{-E_{\vec k}t}$. Replacing the summation over the wave vector $\vec k$ by integration over the eigenenergy, one obtains
$$ u(0,t) = \int dE_{\vec k}\,|\psi_{\vec k}(0)|^2\,g(E_{\vec k})\,e^{-E_{\vec k}t}, \tag{1.4} $$
where $g(E_{\vec k})$ is the density of states. Identifying the LDOS at the origin by $\rho(E,0) = |\psi_{\vec k}(0)|^2 g(E_{\vec k})$, Eq. (1.4) indicates that $u(0,t)$ is simply the Laplace transform of the LDOS at the origin. From the theory of diffusion, it is known that, after a time $t > t_c$, the quantity $u(0,t)$ at the origin will feel the presence of a boundary at a distance $\xi$ away where
$$ t_c \approx \frac{\xi^2}{D} = 2m\xi^2. \tag{1.5} $$
Since $u(0,t)$ is the Laplace transform of the LDOS, $|\psi_{\vec k}(0)|^2 g(E_{\vec k})$, the dominant contributions to $u(0,t)$ must be from those LDOS components with $E_{\vec k}\,t_c < 1$. From Eq. (1.5), this leads to
$$ E_{\vec k} < \frac{1}{t_c} = \frac{1}{2m\xi^2} \tag{1.6} $$
or
$$ k < \frac{1}{\xi} \tag{1.7} $$
because $E_{\vec k} = k^2/2m$. Using $k = 2\pi/\lambda$, we have
$$ \xi < \frac{\lambda}{2\pi}. \tag{1.8} $$
Thus, $u(0,t)$ will be affected by the presence of a boundary if the boundary is at a distance $\lambda/2\pi$ away or less. For electrons at the Fermi surface of a typical metal, $\lambda_F/2\pi \approx 1/k_F$ is therefore of the order of magnitude of a lattice constant. Hence, the electron density per unit energy range at the Fermi surface will only be weakly perturbed by the presence of an impurity (viewed as a “boundary”) about a lattice constant from the point under consideration.
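The step from Eq. (1.2) to Eq. (1.3), stated above without proof, can be checked in a few lines. The sketch below assumes only that the eigenenergies are real, so that the complex conjugate of Eq. (1.1) holds for each $\psi^{*}_{\vec k}$; the symbols follow Eqs. (1.1)–(1.3).

```latex
\begin{align*}
\frac{\partial u}{\partial t}
  &= -\sum_{\vec k} E_{\vec k}\,\psi^{*}_{\vec k}(\vec r)\,\psi_{\vec k}(0)\,e^{-E_{\vec k}t}, \\
\nabla^2 u
  &= \sum_{\vec k} \bigl[\nabla^2\psi^{*}_{\vec k}(\vec r)\bigr]\,\psi_{\vec k}(0)\,e^{-E_{\vec k}t}
   = -2m\sum_{\vec k} E_{\vec k}\,\psi^{*}_{\vec k}(\vec r)\,\psi_{\vec k}(0)\,e^{-E_{\vec k}t}, \\
\Rightarrow\quad
\frac{\partial u}{\partial t}
  &= \frac{1}{2m}\,\nabla^2 u = D\,\nabla^2 u , \qquad D = \frac{1}{2m}.
\end{align*}
```

The second line uses $\nabla^2\psi^{*}_{\vec k} = -2mE_{\vec k}\,\psi^{*}_{\vec k}$, which is Eq. (1.1) conjugated.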
The insensitivity of the LDOS to the presence of a boundary (or an impurity) only a “few” wavelengths away is described by Heine [7] as the invariance property of the LDOS. It was used by Heine to provide a simple but succinct explanation of the magnetic moment of the iron atoms in Fe3Al as compared to that in pure iron [7]. However, it should be noted that this invariance theorem of the LDOS is still an approximate theorem. The distance from the locality in question to a boundary (or an impurity) determines the extent of the influence of the boundary on the degree of accuracy desired for the LDOS [8]. The invariance theorem becomes more approximate when the locality gets closer to the boundary. The localization of the electron degree of freedom is apparently not a consequence of screening because the example discussed above is for a system of free electron gas. It however does depend on the presence of many electrons. Therefore it must be a consequence of quantum mechanical destructive interference in a many-body system.
The implication of this property of insensitivity of the electron degree of freedom with respect to the perturbation “far” away on properties of a system of many particles is summarized in the “Principle of Nearsightedness” formulated by Kohn [9] for a system of many quantum mechanical particles moving in an external potential $v(\vec r)$. It states: “Let $F(\vec r_1,\vec r_2,\ldots,\vec r_n)$ be a static property depending on $n$ coordinates $\vec r_1,\vec r_2,\ldots,\vec r_n$, all within a linear dimension $\approx \lambda$, a typical de Broglie wavelength occurring in the ground state wave function or finite temperature ensemble. Denote by $\bar{\vec r}$ the center of mass, $\bar{\vec r} = n^{-1}\sum_i \vec r_i$. Then, at a fixed chemical potential $\mu$, a change of the external potential $\Delta v(\vec r\,')$, no matter how large, has a small effect on $F$, provided only that $\Delta v(\vec r\,')$ is limited to a “distant” region, in the sense that for all $\vec r\,'$, $|\vec r\,' - \bar{\vec r}| \gg \lambda$. Thus $F$ does not “see” $\Delta v(\vec r\,')$ if $\vec r\,'$ is “far”.”
The principle of nearsightedness, therefore, provides the working basis for the formulation of order-N methods. In the calculation of the total energy of a system of many electrons, the electron band structure energy can be expressed in terms of the density operator $\hat\rho$ such that
$$ E_{BS} = 2\,\mathrm{Tr}(H\hat\rho), \tag{1.9} $$
where $H$ is the Hamiltonian of the system under consideration and the density operator $\hat\rho$ may be expressed as
$$ \hat\rho = \sum_{\lambda}^{\mathrm{occ}} |\psi_\lambda\rangle\langle\psi_\lambda| \tag{1.10} $$
with $|\psi_\lambda\rangle$ being the eigenvector of, say, the Kohn–Sham Hamiltonian of the system corresponding to the eigenenergy $E_\lambda$. The summation is over the $N_e/2$ states of the lowest energies where $N_e$ is the total number of electrons in the system. To take advantage of the “nearsightedness” of the density matrix, it is convenient to expand the wave functions in terms of linear combinations of a set of localized orbitals $\{|\phi_i\rangle\}$ such that
$$ \hat\rho = \sum_{ij} \rho_{ij}\,|\phi_i\rangle\langle\phi_j|, \tag{1.11} $$
$$ E_{BS} = 2\sum_{ij} \rho_{ij}H_{ji}, \tag{1.12} $$
and
$$ N_e = 2\sum_{ij} \rho_{ij}S_{ji}, \tag{1.13} $$
where the Hamiltonian matrix element $H_{ij}$ is given by
$$ H_{ij} = \int \phi_i^{*}(\vec r)\,H\,\phi_j(\vec r)\,d\vec r, $$
and the overlap matrix element $S_{ij}$ is given by
$$ S_{ij} = \int \phi_i^{*}(\vec r)\,\phi_j(\vec r)\,d\vec r. \tag{1.14} $$
The localized orbitals can be chosen to be centered at atomic sites. In this way, $\rho_{ij}$ will be functions of $\vec R_{ij} = \vec R_j - \vec R_i$ where $\vec R_i$ is the position vector of the $i$th atom. The electron band structure energy can be rewritten as
$$ E_{BS} = 2\sum_i\sum_j \rho_{ij}H_{ji}. \tag{1.15} $$
If $\rho_{ij}$ decays as a function of $R_{ij}$ as suggested by the principle of nearsightedness, then one can truncate the summation over $j$ in Eq. (1.15) to a region about the $i$th atom. Such a region of localization can be defined by a cut off distance $R_c$ such that
$$ \rho_{ij}(\vec R_{ij}) \approx 0 \quad \text{for } R_{ij} > R_c. \tag{1.16} $$
Thus the summation over $j$ in Eq. (1.15) will be size-independent, and the calculation of the band structure energy will scale as $N$, the size of the system. The decay of the density matrix in real space depends on the system under consideration. The following behavior patterns have been demonstrated for crystalline solids [10–12]:
(1) Systems with a large band gap (insulators and semiconductors). For systems with a sufficiently large band gap, it had been shown that the decay behavior of the density matrix in real space is exponential. Specifically [9–13],
$$ \rho(\vec r,\vec r\,') \approx e^{-K|\vec r-\vec r\,'|} \tag{1.17} $$
with
$$ K = c_{TB}\,E_{gap} \quad \text{in the tight-binding limit}, \tag{1.18} $$
and
$$ K = c_{WB}\,a\,E_{gap} \quad \text{in the weak-binding limit}. \tag{1.19} $$
In Eqs. (1.18) and (1.19), $a$ is the lattice constant, $E_{gap}$ is the energy gap, and $c_{TB}$ and $c_{WB}$ are constants in the tight-binding and weak-binding limits, respectively. These constants are of the order one when the distance is given in the unit of angstrom, and the energy in the unit of eV. It should be noted here that the tight-binding limit refers to the situation where
the atoms of the system under consideration are far apart while the weak-binding limit refers to the situation when the electronic structure can be approximated by free-electron like behavior.
(2) Metals. For metals with a free-electron like band structure, it can be shown that [14]
$$ \rho(\vec r,\vec r\,') \approx k_F\,\frac{\cos(k_F|\vec r-\vec r\,'|)}{|\vec r-\vec r\,'|^2} \tag{1.20} $$
at zero temperature where $k_F$ is the Fermi wave vector. The algebraic dependence on the inverse of the distance of the density matrix suggests an unfavorable decay behavior for metals in the implementation of an O(N) method. However, at finite temperatures, the decay of the density matrix with respect to the distance becomes much faster due to the destructive interference. It has been shown that, for finite temperatures, the density matrix can be described by [13,15]
$$ \rho(\vec r,\vec r\,') \approx k_F\,\frac{\cos(k_F|\vec r-\vec r\,'|)}{|\vec r-\vec r\,'|^2}\,\exp\!\left(-c\,\frac{k_BT}{k_F}\,|\vec r-\vec r\,'|\right), \tag{1.21} $$
where $T$ is the temperature, $k_B$ the Boltzmann constant, and $c$ a constant of the order of one. This behavior pattern, together with the fact that the ratio of the deviation of the band structure energy at a finite temperature $T$ from that at 0 K with respect to the Fermi energy ($\Delta E/E_F$) is proportional to $(k_BT/E_F)^2$, provides the opportunity to manipulate between the requirement of choosing an appropriate temperature for the truncation of the density matrix and the accuracy of the calculation. Specifically, the temperature should be chosen sufficiently high so that the density matrix decays quite rapidly with the distance and yet the deviation of the band structure energy associated with the finite temperature used is still negligible. In this way, the calculation of the total energy and the atomic forces can be carried out in an order-N fashion while the accuracy of these calculations can be maintained at an acceptable level.
For disordered systems, the wave functions of electrons are more localized as compared to those in their respective crystalline counterparts. This then suggests that the decay of the density matrix for a disordered system with respect to the distance may be faster than that of the corresponding case in the crystalline form. Therefore, an exponential decay behavior is expected for the density matrix of a disordered system [8,9].
The decay behavior of the density matrix in real space discussed above clearly indicates that the implementation of O(N) methods is feasible [16]. However, there are many issues relevant to the development of efficient and reliable O(N) procedures. These issues are discussed in the following section.
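The contrast between the gapped and metallic cases can be illustrated numerically. The following is a minimal sketch, not taken from the works reviewed here: it assumes a hypothetical one-dimensional, one-orbital, orthogonal tight-binding ring with alternating hoppings `t1` and `t2` (a gap opens when they differ), builds the zero-temperature density matrix by brute-force diagonalization, and compares the largest element of $\rho_{0j}$ beyond a chosen distance. The function names and parameter values are illustrative assumptions.

```python
import numpy as np

def chain_hamiltonian(n_sites, t1, t2):
    """Periodic nearest-neighbour chain with alternating hoppings t1, t2."""
    h = np.zeros((n_sites, n_sites))
    for i in range(n_sites):
        j = (i + 1) % n_sites
        t = t1 if i % 2 == 0 else t2
        h[i, j] = h[j, i] = -t
    return h

def density_matrix(h, n_occ):
    """rho_ij summed over the n_occ lowest eigenstates (T = 0, orthogonal basis)."""
    evals, vecs = np.linalg.eigh(h)
    occ = vecs[:, :n_occ]
    return occ @ occ.T, evals

def tail(rho, d_min):
    """Largest |rho_0j| over sites at periodic distance >= d_min from site 0."""
    n = rho.shape[0]
    return np.abs(rho[0, d_min:n - d_min + 1]).max()

n_sites, n_occ = 202, 101                       # half filling, closed shell
h_gap = chain_hamiltonian(n_sites, 1.5, 0.5)    # dimerized chain: finite gap
h_metal = chain_hamiltonian(n_sites, 1.0, 1.0)  # uniform chain: no gap

rho_gap, e_gap = density_matrix(h_gap, n_occ)
rho_metal, e_metal = density_matrix(h_metal, n_occ)

# Band structure energy from Eq. (1.12), with the factor 2 for spin.
e_bs = 2.0 * np.sum(rho_gap * h_gap)

print("tail beyond 21 sites, gapped  :", tail(rho_gap, 21))
print("tail beyond 21 sites, metallic:", tail(rho_metal, 21))
print("E_BS from Tr(rho H):", e_bs, " from eigenvalues:", 2.0 * e_gap[:n_occ].sum())
```

In this toy model the gapped chain's density matrix is negligible well inside the distance at which the metallic one is still of the order of a percent, which is the practical content of Eqs. (1.16), (1.17) and (1.20).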
2. Issues relevant to the development of O(N) procedures
The development of O(N) procedures for the calculation of the total energy of a many-atom system is based on Eq. (1.15), namely,
$$ E_{BS} = 2\sum_i\sum_j \rho_{ij}H_{ji}. $$
In this equation, the calculation of the band structure energy must be carried out in a real space representation to take the full advantage of the decay behavior of the density matrix in real space. The key ingredient in the development of an O(N) procedure is then the truncation of $\rho_{ij}$ such that the summation over $j$ in Eq. (1.15) is only over a small number of terms and is size-independent. The feasibility of implementing the O(N) procedure based on the decay behavior of $\rho_{ij}(\vec R_{ij})$ has been discussed in Section 1. It has been established that a cut-off distance $R_{cut}$ can be found so that
$$ \rho_{ij}(\vec R_{ij}) \approx 0 \quad \text{for } R_{ij} > R_{cut} $$
to the degree of accuracy consistent with the problem under consideration. Therefore, the general requirement for a reliable and efficient O(N) procedure must include: (1) an accurate determination of $\rho_{ij}(\vec R_{ij})$ for $R_{ij} < R_{cut}$ and (2) an efficient algorithm for calculating $\rho_{ij}$ for $R_{ij} < R_{cut}$. In the literature, there are many different approaches [17–45] to the development of an O(N) procedure for the calculation of the total energy as well as atomic forces, which are designed to fulfill these two requirements. These approaches are developed for either tight-binding (TB) Hamiltonians [46,47] or first principles methods [48,49]. In the former case, the Hamiltonian matrix elements appearing in Eq. (1.15) are given. The focus of the O(N) procedure is therefore entirely on the efficient and accurate calculation of the density matrix elements $\rho_{ij}$ for $R_{ij} < R_{cut}$. The only complication occurs when one has to deal with issues related to orthogonal tight-binding (OTB) Hamiltonians vs non-orthogonal tight-binding (NOTB) Hamiltonians. In the latter case, while the calculation of total energy still scales linearly with the size of the system because of the decay property of the density matrix, the prefactor involved in the construction of the Hamiltonian matrix elements could potentially overwhelm the computational effort. In this regard, issues such as the choice of the basis set, the evaluation of the Coulomb potential, and the self-consistency in the calculation of the potential, are all important players as they may cause computational logjams. These issues must be addressed if one is to develop an efficient and workable O(N) procedure. For large-scale quantum mechanical molecular dynamics simulations, an O(N) procedure for the calculation of atomic forces is a necessity. The force acting on the $i$th atom can be calculated as the negative gradient of the total energy with respect to the $i$th atomic position vector, $-\nabla_{\vec R_i}E_{tot}$.
It would be convenient and efficient if the electronic contribution to this force can be determined using the Hellmann–Feynman theorem [50]. In this situation, the calculation of atomic forces can be treated simply as a by-product of the calculation of the total energy with minimum extra effort. While there are cases where the Hellmann–Feynman theorem will yield directly the electronic contribution to the atomic forces, there are other situations where corrections due to the Pulay force or those associated with the truncation of the density matrix must be taken into consideration [51]. The inclusion of these corrections often gives rise to complications in the calculation of atomic forces, resulting in an increase in the computational effort. In this review, O(N) procedures developed for the calculation of the total energy as well as atomic forces will be discussed in terms of the issues raised above. These O(N) procedures will be categorized into two groups according to their underlying methodology, namely, the direct
approach and the variational approach. In the direct approach, the density matrix elements are calculated directly using various approximations. In the variational approach, the truncated density matrix elements and/or the orbitals used to construct the Hamiltonian matrix elements are treated as variational parameters for the minimization of some energy functionals. Those O(N) methods that share the same underlying concept will be discussed as a unit, using a representative approach.
3. The direct approach
In the implementation of an O(N) procedure based on the decay behavior of the density matrix in real space, it is most convenient to construct the system Hamiltonian in a basis set of local orbitals $\phi_{i\alpha}(\vec r-\vec R_i)$ where $\vec R_i$ is often chosen to be the coordinate of the $i$th atom and $\alpha$ the orbital. In this way, $\phi_{i\alpha}$ is the $\alpha$ orbital about the $i$th site. The eigenfunction $\psi_\lambda$ of the system matrix $H$ corresponding to the eigenenergy $E_\lambda$ can be expanded in terms of $\phi_{i\alpha}$ as
$$ \psi_\lambda = \sum_{i\alpha} c_{i\alpha,\lambda}\,\phi_{i\alpha}, \tag{3.1} $$
where the summation is over all the orbitals $\alpha$ at all the sites $i$. The column vector $c_\lambda$ of the coefficients of expansion $c_{i\alpha,\lambda}$ then satisfies the general eigenvalue equation
$$ H c_\lambda = E_\lambda\,S c_\lambda \tag{3.2} $$
with the Hamiltonian and overlap matrices given by
$$ H_{i\alpha,j\beta}(\vec R_j-\vec R_i) = \int \phi_{i\alpha}^{*}(\vec r)\,H\,\phi_{j\beta}(\vec r)\,d\vec r \tag{3.3} $$
and
$$ S_{i\alpha,j\beta}(\vec R_j-\vec R_i) = \int \phi_{i\alpha}^{*}(\vec r)\,\phi_{j\beta}(\vec r)\,d\vec r. \tag{3.4} $$
A basis set with $S_{i\alpha,j\beta} = \delta_{ij}\delta_{\alpha\beta}$ is referred to as an orthogonal basis set. Otherwise, it is referred to as a non-orthogonal basis set. The finite basis set of localized orbitals $\{\phi_{i\alpha}\}$ is in general not complete. Hence the result of calculation often improves when more local orbitals are included. At $T = 0$ K, the density operator of a system can be defined as
$$ \hat\rho = \sum_{\lambda}^{\mathrm{occ}} |\psi_\lambda\rangle\langle\psi_\lambda|, \tag{3.5} $$
where the summation is over all the occupied states. Using Eq. (3.1), we obtain
$$ \hat\rho = \sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,|\phi_{i\alpha}\rangle\langle\phi_{j\beta}|, \tag{3.6} $$
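where
$$ \rho_{i\alpha,j\beta} = \sum_{\lambda}^{\mathrm{occ}} c_{i\alpha,\lambda}\,c^{*}_{j\beta,\lambda}. \tag{3.7} $$
From Eqs. (3.6) and (3.7), we have
$$ \begin{aligned}
\hat\rho^{2} &= \sum_{i\alpha,j\beta}\;\sum_{k\gamma,l\delta}\;\sum_{\lambda,\mu} c_{i\alpha,\lambda}\,c^{*}_{j\beta,\lambda}\,|\phi_{i\alpha}\rangle\langle\phi_{j\beta}|\phi_{k\gamma}\rangle\langle\phi_{l\delta}|\,c_{k\gamma,\mu}\,c^{*}_{l\delta,\mu} \\
&= \sum_{i\alpha,l\delta}\;\sum_{\lambda,\mu} c_{i\alpha,\lambda}\,c^{*}_{l\delta,\mu}\,|\phi_{i\alpha}\rangle\langle\phi_{l\delta}| \sum_{j\beta,k\gamma} c^{*}_{j\beta,\lambda}\,S_{j\beta,k\gamma}\,c_{k\gamma,\mu} \\
&= \sum_{i\alpha,l\delta}\;\sum_{\lambda,\mu} c_{i\alpha,\lambda}\,c^{*}_{l\delta,\mu}\,|\phi_{i\alpha}\rangle\langle\phi_{l\delta}|\,\delta_{\lambda\mu}
 = \sum_{i\alpha,l\delta}\;\sum_{\lambda} c_{i\alpha,\lambda}\,c^{*}_{l\delta,\lambda}\,|\phi_{i\alpha}\rangle\langle\phi_{l\delta}| \\
&= \sum_{i\alpha,l\delta} \rho_{i\alpha,l\delta}\,|\phi_{i\alpha}\rangle\langle\phi_{l\delta}| = \hat\rho .
\end{aligned} \tag{3.8} $$
In arriving at Eq. (3.8), the orthonormal condition of the eigenfunction, $\langle\psi_\lambda|\psi_\mu\rangle = \delta_{\lambda\mu}$, is used. Eq. (3.8) states that the density operator $\hat\rho$ as expressed by Eqs. (3.6) and (3.7) satisfies the condition of idempotency. From Eq. (3.6), the density matrix in real space is given by
$$ \rho(\vec r,\vec r\,') = \langle\vec r\,|\hat\rho|\vec r\,'\rangle = \sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,\phi_{i\alpha}(\vec r)\,\phi^{*}_{j\beta}(\vec r\,'). \tag{3.9} $$
Specifically, the electron density is given by
$$ \rho(\vec r) = \langle\vec r\,|\hat\rho|\vec r\rangle = \sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,\phi_{i\alpha}(\vec r)\,\phi^{*}_{j\beta}(\vec r). \tag{3.10} $$
It should be noted that, in general, $\rho_{i\alpha,j\beta} \neq \langle\phi_{i\alpha}|\hat\rho|\phi_{j\beta}\rangle$. This can be seen from Eq. (3.6) by calculating $\langle\phi_{i\alpha}|\hat\rho|\phi_{j\beta}\rangle$, i.e.,
$$ \langle\phi_{i\alpha}|\hat\rho|\phi_{j\beta}\rangle = \sum_{k\gamma,l\delta} \rho_{k\gamma,l\delta}\,S_{i\alpha,k\gamma}\,S_{l\delta,j\beta}. \tag{3.11} $$
Using the relation
$$ I = \sum_{i\alpha,j\beta} |\phi_{i\alpha}\rangle\,(S^{-1})_{i\alpha,j\beta}\,\langle\phi_{j\beta}|, \tag{3.12} $$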
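we have
$$ \begin{aligned}
\hat\rho &= \sum_{i\alpha,k\gamma}\;\sum_{l\delta,j\beta} |\phi_{i\alpha}\rangle\,(S^{-1})_{i\alpha,k\gamma}\,\langle\phi_{k\gamma}|\hat\rho|\phi_{l\delta}\rangle\,(S^{-1})_{l\delta,j\beta}\,\langle\phi_{j\beta}| \\
&= \sum_{i\alpha,j\beta} \Bigl[\sum_{k\gamma,l\delta} (S^{-1})_{i\alpha,k\gamma}\,\langle\phi_{k\gamma}|\hat\rho|\phi_{l\delta}\rangle\,(S^{-1})_{l\delta,j\beta}\Bigr]\,|\phi_{i\alpha}\rangle\langle\phi_{j\beta}| .
\end{aligned} $$
Comparing this with Eq. (3.6), we obtain
$$ \rho_{i\alpha,j\beta} = \sum_{k\gamma,l\delta} (S^{-1})_{i\alpha,k\gamma}\,\langle\phi_{k\gamma}|\hat\rho|\phi_{l\delta}\rangle\,(S^{-1})_{l\delta,j\beta}. \tag{3.13} $$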
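The matrix relations among Eqs. (3.7), (3.8), (3.11) and (3.13) are easy to verify numerically. The sketch below is an illustration under stated assumptions rather than part of any published scheme: `h` and `s` are hypothetical random test matrices (real symmetric, with `s` positive definite), and the generalized eigenvalue problem of Eq. (3.2) is solved by symmetric (Löwdin) orthogonalization using only NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_occ = 6, 3                      # basis size and number of occupied states

# Hypothetical test matrices: symmetric H, positive-definite overlap S.
a = rng.normal(size=(n, n))
h = 0.5 * (a + a.T)
b = rng.normal(size=(n, n))
s = np.eye(n) + 0.1 * (b @ b.T)

# Solve H c = E S c through S^(-1/2) H S^(-1/2) y = E y, with c = S^(-1/2) y.
w, u = np.linalg.eigh(s)
s_inv_half = (u / np.sqrt(w)) @ u.T
evals, y = np.linalg.eigh(s_inv_half @ h @ s_inv_half)
c = s_inv_half @ y                   # columns satisfy c^T S c = 1

rho = c[:, :n_occ] @ c[:, :n_occ].T  # Eq. (3.7), real coefficients
m = s @ rho @ s                      # <phi|rho_hat|phi>, Eq. (3.11)
s_inv = np.linalg.inv(s)

print("idempotency, rho S rho = rho:", np.allclose(rho @ s @ rho, rho))
print("Eq. (3.13) recovers rho     :", np.allclose(s_inv @ m @ s_inv, rho))
print("<phi|rho_hat|phi> equals rho:", np.allclose(m, rho))
print("2 Tr(rho S) =", 2.0 * np.trace(rho @ s))
```

In matrix form the idempotency of Eq. (3.8) reads $\rho S\rho = \rho$, and the third printed check shows that $\langle\phi_{i\alpha}|\hat\rho|\phi_{j\beta}\rangle$ and $\rho_{i\alpha,j\beta}$ differ once $S$ is not the unit matrix.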
From Eq. (3.11), it can be seen that, if $\{\phi_{i\alpha}\}$ is an orthogonal basis set so that $S_{i\alpha,j\beta} = \delta_{ij}\delta_{\alpha\beta}$, we have
$$ \langle\phi_{i\alpha}|\hat\rho|\phi_{j\beta}\rangle = \rho_{i\alpha,j\beta}. \tag{3.14} $$
Thus $\rho_{i\alpha,j\beta}$ may be referred to as pseudo-density matrix elements in a non-orthogonal basis representation. At a finite temperature, the expression of the density matrix as given by Eq. (3.7) may be generalized to
$$ \rho_{i\alpha,j\beta} = \sum_{\lambda} c_{i\alpha,\lambda}\,c^{*}_{j\beta,\lambda}\,f(x_\lambda), \tag{3.15} $$
where $x_\lambda = (E_\lambda - E_F)/k_BT$ with $E_F$ being the Fermi energy of the system, and $f$ the Fermi–Dirac distribution function. The band structure energy is the key component of the total energy, which consumes the majority of the computational effort. It can be expressed in terms of the density matrix elements through
$$ E_{BS} = 2\sum_{\lambda}^{\mathrm{occ}} \langle\psi_\lambda|H|\psi_\lambda\rangle = 2\sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,H_{j\beta,i\alpha}. \tag{3.16} $$
The total number of electrons of the system, $N$, is related to the density matrix by
$$ N = 2\sum_{\lambda}^{N/2} \langle\psi_\lambda|\psi_\lambda\rangle = 2\sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,S_{j\beta,i\alpha}, \tag{3.17} $$
through which the Fermi energy $E_F$ of the system is determined. Eqs. (3.16) and (3.17) in fact constitute the working equations to implement the O(N) procedure. The direct approach to the implementation of an O(N) procedure is based on the direct calculation of matrix elements $\rho_{i\alpha,j\beta}$ for sites $i$ with $\rho_{i\alpha,j\beta}$ truncated at $R_{ij} > R_{cut}$ using various
approximations. In the following subsections, these methods will be discussed in detail. For the case of first principles O(N) methods, substantial computational effort has to be used to construct the Hamiltonian matrix elements $H_{i\alpha,j\beta}$. Since the situation is the same for the variational approach, the discussion of efficient methods to construct Hamiltonian matrix elements will be deferred until Section 5.
3.1. The divide-and-conquer (DC) method [1,37,43]
In the divide-and-conquer (DC) method proposed by Yang, a system under consideration is divided into disjoint subsystems in real space through the use of partition matrices $P^{(I)}_{i\alpha,j\beta}$ associated with subsystems $I$, in the space defined by localized orbitals $\{\phi_{i\alpha}\}$, such that
$$ \sum_{I} P^{(I)}_{i\alpha,j\beta} = 1, \tag{3.18} $$
where the summation is over all the subsystems $I$. One possible way of constructing the partition matrix is
$$ P^{(I)}_{i\alpha,j\beta} = q^{(I)}_{i\alpha} + q^{(I)}_{j\beta} \tag{3.19} $$
with
$$ q^{(I)}_{i\alpha} = \begin{cases} \tfrac{1}{2} & \text{if } i \in I, \\ 0 & \text{if } i \notin I. \end{cases} \tag{3.20} $$
In Eq. (3.20), $i \in I$ means that the site $i$ is in the subsystem $I$. Using the partition matrix as defined by Eq. (3.18), one can write
$$ \rho_{i\alpha,j\beta} = \sum_{I} P^{(I)}_{i\alpha,j\beta}\,\rho_{i\alpha,j\beta} = \sum_{I} \rho^{(I)}_{i\alpha,j\beta}, \tag{3.21} $$
where
$$ \rho^{(I)}_{i\alpha,j\beta} = P^{(I)}_{i\alpha,j\beta}\,\rho_{i\alpha,j\beta}. \tag{3.22} $$
Eqs. (3.19) and (3.20) indicate that $\rho^{(I)}_{i\alpha,j\beta}$ vanishes if both $i$ and $j$ do not belong in the subsystem $I$. This then suggests that the calculation of $\rho_{i\alpha,j\beta}$ may proceed “locally” in a systematic manner. Specifically, one may define a local Hamiltonian $H^{(I)}$ that is the projection of the system Hamiltonian in the local orbitals of atoms in the subsystem $I$ and their neighboring atoms. These neighboring atoms are referred to as the buffer atoms and are included in the buffer zone adjacent to the subsystem $I$ so as to facilitate a better representation of the local density matrix. The buffer zone is defined by a truncation radius $R_b$ which is a distance from the “center” of the subsystem within which all the neighboring atoms to the atoms in the subsystem $I$ are included as buffer atoms. The accuracy of the calculation of the density matrix can be improved in a systematic manner by increasing $R_b$.
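The partition matrices of Eqs. (3.18)–(3.20) and the buffer selection by $R_b$ can be sketched in a few lines. The example below assumes one orbital per site, so that the orbital labels $\alpha,\beta$ drop out, and takes the “center” of a subsystem to be the mean position of its atoms; both choices, and the helper names `partition_matrices` and `buffer_atoms`, are illustrative assumptions rather than a prescription from the DC literature.

```python
import numpy as np

def partition_matrices(n_sites, subsystems):
    """P^(I)_ij = q^(I)_i + q^(I)_j with q = 1/2 inside subsystem I, 0 outside."""
    mats = []
    for sites in subsystems:
        q = np.zeros(n_sites)
        q[list(sites)] = 0.5
        mats.append(q[:, None] + q[None, :])
    return mats

def buffer_atoms(positions, sites, r_b):
    """Atoms outside the subsystem lying within r_b of its mean position."""
    positions = np.asarray(positions, dtype=float)
    centre = positions[list(sites)].mean(axis=0)
    dist = np.linalg.norm(positions - centre, axis=1)
    return [i for i in range(len(positions)) if i not in sites and dist[i] <= r_b]

# Twelve atoms on a line with unit spacing, split into three disjoint subsystems.
positions = np.arange(12.0).reshape(-1, 1)
subsystems = [range(0, 4), range(4, 8), range(8, 12)]

p = partition_matrices(len(positions), subsystems)
total = sum(p)                                   # Eq. (3.18): should be all ones
buf = buffer_atoms(positions, subsystems[1], r_b=3.0)

print("sum over I of P^(I) is 1 everywhere:", np.allclose(total, 1.0))
print("P^(1) between atoms 4 and 5 (both inside):", p[1][4, 5])
print("P^(1) between atoms 3 and 4 (one inside) :", p[1][3, 4])
print("P^(1) between atoms 0 and 10 (none inside):", p[1][0, 10])
print("buffer atoms of the middle subsystem:", buf)
```

Because every site belongs to exactly one subsystem, the $q^{(I)}$ values of a site sum to 1/2 over $I$, which is why the partition matrices sum to one for every pair of sites.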
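Under the truncation defined by $R_b$, the local Hamiltonian matrix and the local overlap matrix are given by
$$ H^{(I)}_{i\alpha,j\beta} = \langle\phi^{(I)}_{i\alpha}|H|\phi^{(I)}_{j\beta}\rangle \tag{3.23} $$
and
$$ S^{(I)}_{i\alpha,j\beta} = \langle\phi^{(I)}_{i\alpha}|\phi^{(I)}_{j\beta}\rangle, \tag{3.24} $$
where $\phi^{(I)}_{i\alpha}$ is the local orbital centered at site $i$ with the site $i$ being either in the subsystem $I$ or in its buffer zone. The local density matrix, $\rho^{(I)}_{i\alpha,j\beta}$, can then be approximated by
$$ \rho^{(I)}_{i\alpha,j\beta} = P^{(I)}_{i\alpha,j\beta} \sum_{\lambda} f(E^{(I)}_{\lambda} - E_F)\,c^{(I)}_{i\alpha,\lambda}\,c^{(I)*}_{j\beta,\lambda} \tag{3.25} $$
with $c^{(I)}_{i\alpha,\lambda}$ satisfying the general eigenvalue equation defined in the subsystem $I$ and its associated buffer zone, i.e.,
$$ H^{(I)} c^{(I)}_{\lambda} = E^{(I)}_{\lambda}\,S^{(I)} c^{(I)}_{\lambda}. \tag{3.26} $$
The Fermi energy $E_F$ in Eq. (3.25) is determined by
$$ N = 2\sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,S_{j\beta,i\alpha} = 2\sum_{i\alpha,j\beta}\sum_{I} \rho^{(I)}_{i\alpha,j\beta}\,S_{j\beta,i\alpha} = 2\sum_{I}\sum_{i\alpha,j\beta} \rho^{(I)}_{i\alpha,j\beta}\,S_{j\beta,i\alpha} \tag{3.27} $$
with $\rho^{(I)}_{i\alpha,j\beta}$ approximated by Eq. (3.25). The band structure energy can then be calculated by
$$ E_{BS} = 2\sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta}\,H_{j\beta,i\alpha} = 2\sum_{I}\sum_{i\alpha,j\beta} \rho^{(I)}_{i\alpha,j\beta}\,H_{j\beta,i\alpha}. \tag{3.28} $$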
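Eqs. (3.25)–(3.28) can be put together in a compact toy calculation. What follows is a minimal sketch under several assumptions that are not part of the source: a hypothetical open, dimerized, one-orbital chain in an orthogonal basis ($S = 1$), contiguous blocks of sites as subsystems, a buffer measured in sites rather than by a radius, a bisection search for $E_F$, and an energy bracket of $[-10, 10]$ that is assumed to contain the spectrum. It is meant to show the flow of the method, not to reproduce Yang's implementation.

```python
import numpy as np

def fermi(e, e_f, beta):
    """Fermi-Dirac occupation, written with tanh to avoid overflow."""
    return 0.5 * (1.0 - np.tanh(0.5 * beta * (e - e_f)))

def dc_fragments(h, block, buffer):
    """Diagonalize H^(I) on each subsystem plus buffer sites (S = 1 assumed)."""
    n = h.shape[0]
    frags = []
    for start in range(0, n, block):
        lo, hi = max(0, start - buffer), min(n, start + block + buffer)
        idx = np.arange(lo, hi)
        q = np.where((idx >= start) & (idx < start + block), 0.5, 0.0)
        p = q[:, None] + q[None, :]                        # Eq. (3.19)
        evals, vecs = np.linalg.eigh(h[np.ix_(idx, idx)])  # Eq. (3.26)
        frags.append((idx, p, evals, vecs))
    return frags

def local_rho(frag, e_f, beta):
    """Local density matrix of Eq. (3.25) on the fragment's own sites."""
    _, p, evals, vecs = frag
    return p * ((vecs * fermi(evals, e_f, beta)) @ vecs.T)

def electron_count(frags, e_f, beta):
    """Eq. (3.27) with S = 1: twice the sum of the traces of rho^(I)."""
    return 2.0 * sum(np.trace(local_rho(f, e_f, beta)) for f in frags)

def dc_band_energy(h, n_elec, block, buffer, beta=50.0):
    frags = dc_fragments(h, block, buffer)
    lo, hi = -10.0, 10.0          # assumed to bracket the spectrum
    for _ in range(100):          # bisection for the common Fermi energy
        mid = 0.5 * (lo + hi)
        if electron_count(frags, mid, beta) < n_elec:
            lo = mid
        else:
            hi = mid
    e_f = 0.5 * (lo + hi)
    e_bs = 0.0
    for f in frags:
        idx = f[0]
        e_bs += 2.0 * np.sum(local_rho(f, e_f, beta) * h[np.ix_(idx, idx)])  # Eq. (3.28)
    return e_bs, e_f

# Hypothetical test system: open chain of 64 sites, hoppings 1.5 / 0.5.
n = 64
h = np.zeros((n, n))
for i in range(n - 1):
    h[i, i + 1] = h[i + 1, i] = -(1.5 if i % 2 == 0 else 0.5)

exact = 2.0 * np.linalg.eigvalsh(h)[: n // 2].sum()
e_dc, e_f = dc_band_energy(h, n_elec=n, block=8, buffer=8)
e_nobuf, _ = dc_band_energy(h, n_elec=n, block=8, buffer=0)

print("exact E_BS            :", exact)
print("DC, buffer of 8 sites :", e_dc)
print("DC, no buffer         :", e_nobuf)
```

Each fragment is diagonalized once, at a cost fixed by the block and buffer sizes, so the work grows with the number of subsystems only. Comparing the two DC energies with the exact value shows the role of the buffer zone in this toy model.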
The linear scaling in the calculation of the band structure energy in the divide-and-conquer method stems from the “truncation” of $\rho_{i\alpha,j\beta}$ according to Eqs. (3.21) and (3.25). For a given subsystem $I$, the summation over $i\alpha$ and $j\beta$ is finite and independent of the size of the system. The computational effort for calculating the band structure energy is therefore dependent linearly on the size of the system through the summation over $I$. To carry out molecular dynamics (MD) simulations, an efficient method for calculating the forces acting on atoms is a necessity. Forces acting on atoms are determined by the energy
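gradients with respect to nuclear coordinates. For tight-binding approaches, the electronic contributions to the atomic forces can usually be calculated using the Hellmann–Feynman theorem. In first-principles approaches, in addition to the Hellmann–Feynman contribution, the Pulay correction due to the dependence of the local orbitals on nuclear coordinates must be included. The Pulay correction term depends on the gradients of the wave functions with respect to nuclear coordinates. Zhao and Yang [52] formulated an O(N) approach for the Pulay correction using the expression
\[
\vec{F}^{\,\rm Pulay}_i = 2\sum_{\lambda} n_{\lambda} \int (\nabla_{\vec{R}_i}\psi_{\lambda})(H - E_{\lambda})\psi_{\lambda}\, d\vec{r} , \tag{3.29}
\]
where $n_{\lambda}$ is the occupation number for the eigenstate $\psi_{\lambda}$ and $\vec{R}_i$ is the position vector of the $i$th atom. Using the divide-and-conquer scheme, we have
\[
\begin{aligned}
\vec{F}^{\,\rm Pulay}_i &= 2\sum_{I} P^{(I)} \sum_{\lambda} n_{\lambda} \int (\nabla_{\vec{R}_i}\psi_{\lambda})(H - E_{\lambda})\psi_{\lambda}\, d\vec{r} \\
&\approx 2\sum_{I} P^{(I)} \sum_{\lambda} f(E^{(I)}_{\lambda} - E_F) \int (\nabla_{\vec{R}_i}\psi^{(I)}_{\lambda})(H - E^{(I)}_{\lambda})\psi^{(I)}_{\lambda}\, d\vec{r} \\
&= 2\sum_{j\alpha,k\beta} \left\{ \rho_{j\alpha,k\beta} \int (\nabla_{\vec{R}_i}\phi_{k\beta}) H \phi_{j\alpha}\, d\vec{r} - w_{j\alpha,k\beta} \int (\nabla_{\vec{R}_i}\phi_{k\beta}) \phi_{j\alpha}\, d\vec{r} \right\} ,
\end{aligned} \tag{3.30}
\]
where
\[
w_{j\alpha,k\beta} = \sum_{I} P^{(I)}_{j\alpha,k\beta} \sum_{\lambda} f(E^{(I)}_{\lambda} - E_F)\, c^{(I)}_{j\alpha,\lambda}\, c^{(I)*}_{k\beta,\lambda}\, E^{(I)}_{\lambda} , \tag{3.31}
\]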
and $\rho_{j\alpha,k\beta}$ is given by Eqs. (3.21) and (3.25). The approximation in Eq. (3.30) allows the summations over $j$ and $k$ for a given $I$ to be truncated within the region defined by $R_b$. Thus the calculation of the Pulay correction term also scales linearly with the system size. The DC method was tested by calculating the total energy and the energy gradients of 4-glycine, 8-glycine, and 12-glycine polypeptides [53]. The subsystem was chosen to be one glycine residue for all these polypeptides, so that the systems were divided into 4, 8, and 12 subsystems, respectively. The buffer atoms were included according to the number of chemical bonds between the subsystem atoms and the neighboring atoms. The implementation of the divide-and-conquer method was carried out using an LDA/LCAO program. Three types of local orbitals were used as the local basis set: S (single zeta), D (double zeta), and P (polarization) orbitals. The calculation was found to be insensitive to the choice of the inverse temperature $\beta$ ($\beta = 1/k_B T$) used in the Fermi distribution function as long as $\beta$ is larger than 100 a.u. The test calculations indicated that the DC method indeed scales linearly with the size of the system. Table 1 compares the total energies and the energy gradients obtained using the self-consistent DC method with those from the “exact” LDA/LCAO calculation for the 4-glycine polypeptide. It can be seen that the result of the calculation improves systematically as the size of the buffer zone ($R_b$) increases. For the calculation of the total
Table 1
Comparisons of the results of the self-consistent calculation of the energy (E) and the energy gradient (G) of 4-glycine polypeptide from the DC method and those from the Kohn–Sham (KS) calculation for three different types of atomic basis: S (single zeta), D (double zeta), and P (polarization). The results are given in atomic units. The comparison is presented in terms of the number of nearest neighbors included in the region of localization (e.g., 1 ≡ first nearest neighbors, 2 ≡ second nearest neighbors, and so on). The table is reproduced from Table I in Ref. [53].

Density-matrix formulation

Neighbors    S           D            P

[E(dc)−E(KS)]/N
1            5.73E-03    7.56E-03     1.42E-02
2            8.40E-04    1.14E-03     3.30E-03
3            1.11E-05    9.86E-05     5.59E-04
4            2.22E-05    2.88E-05     1.78E-04
5            4.41E-06    −3.28E-06    3.57E-05
6            6.09E-07    −1.80E-06    1.02E-05
7            4.42E-07    –            –

rms[G(dc)−G(KS)]
1            7.43E-02    7.51E-02     1.63E-01
2            1.21E-02    5.72E-02     9.01E-02
3            3.78E-03    9.12E-03     1.58E-02
4            9.78E-04    1.45E-03     3.62E-03
5            2.31E-04    3.54E-04     1.09E-03
6            4.20E-05    6.46E-05     3.26E-04
7            1.71E-05    –            –
energy, the inclusion of the third nearest neighbors in the buffer zone has already achieved an accuracy of the order of $10^{-3}$ a.u. per atom. In the implementation of self-consistent O(N) methods such as the divide-and-conquer method, a substantial computational effort must be spent to construct the Hamiltonian and overlap matrices, to calculate the electron density, and to determine the electrostatic effects. These calculations in general scale as $N^2$. Efficient, linear scaling solutions to these problems are discussed in Section 5.

3.2. The Fermi operator expansion (FOE) method [27,30]

At a finite temperature $T$, the density operator can be expressed as
\[
\hat{\rho} = \sum_{\lambda} f(E_{\lambda}) |\psi_{\lambda}\rangle\langle\psi_{\lambda}|
           = f(H) \sum_{\lambda} |\psi_{\lambda}\rangle\langle\psi_{\lambda}|
           = f(H) , \tag{3.32}
\]
where $f(x)$ is the Fermi–Dirac distribution so that
\[
f(H) = \frac{1}{e^{(H - E_F)/k_B T} + 1} \tag{3.33}
\]
with $E_F$ being the Fermi energy of the system. The operator $f(H)$ as defined by Eq. (3.33) is referred to as the Fermi operator. The band structure energy can then be written as
\[
E_{BS} = 2\,{\rm Tr}(H\hat{\rho}) = 2\sum_{i\alpha,j\beta} f_{i\alpha,j\beta}(H)\, H_{j\beta,i\alpha} , \tag{3.34}
\]
where
\[
f_{i\alpha,j\beta} = \langle \phi_{i\alpha} | f(H) | \phi_{j\beta} \rangle \tag{3.35}
\]
with $\{\phi_{i\alpha}\}$ being some orthogonal local basis set. The Fermi–Dirac distribution $f(x)$ can be approximated by a polynomial function of $x$, as was done by Goedecker and Colombo [27]. However, polynomials of high degree are often numerically unstable. Goedecker and Teter overcame this instability by using the Chebyshev polynomial representation [30]. In this way, the Fermi operator is approximated by
\[
f(H) \approx \frac{a_0}{2} I + \sum_{j=1}^{m} a_j T_j(H) , \tag{3.36}
\]
where $I$ is a unit matrix and the Chebyshev polynomial matrices $T_j(H)$ satisfy
\[
T_0(H) = I , \qquad T_1(H) = H , \qquad T_j(H) = 2 H T_{j-1}(H) - T_{j-2}(H) , \tag{3.37}
\]
and the expansion coefficients $a_j$ are determined by
\[
a_j = \frac{2}{\pi} \int_{-1}^{1} \frac{f(x)\, T_j(x)}{\sqrt{1 - x^2}}\, dx . \tag{3.38}
\]
It should be noted that the Chebyshev polynomials $T_j(x)$ are defined only in the interval $[-1, 1]$ of $x$. To implement the Chebyshev polynomial representation of the Fermi operator $f(H)$, one must scale the eigenvalue spectrum of $H$ so that all the eigenvalues of $H$ fall within this interval. For an orthogonal basis set $\{\phi_{i\alpha}\}$, the Fermi operator can be constructed according to Eq. (3.36) in the following manner. Denote by $|t_l^j\rangle$ the $l$th column of the matrix $T_j(H)$. From Eq. (3.37), we have
\[
|t_l^0\rangle = |e_l\rangle , \qquad |t_l^1\rangle = H |e_l\rangle , \qquad |t_l^j\rangle = 2 H |t_l^{j-1}\rangle - |t_l^{j-2}\rangle , \tag{3.39}
\]
where $|e_l\rangle$ is a column vector whose elements are all zero except for the $l$th entry, which is one. The $l$th column of the Fermi operator, $|f_l\rangle$, can then be obtained as
\[
|f_l\rangle = \frac{a_0}{2} |t_l^0\rangle + \sum_{j=1}^{m} a_j |t_l^j\rangle . \tag{3.40}
\]
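The column-by-column recursion of Eqs. (3.39) and (3.40) can be sketched as follows. This is a minimal illustration rather than a linear scaling implementation: it uses dense NumPy arrays and no localization region. The spectral bounds are taken from Gershgorin circles and the coefficients $a_j$ from Gauss–Chebyshev quadrature; both are choices made here for convenience, not prescriptions of Ref. [30]. The function names are hypothetical.

```python
import numpy as np

def fermi(x, mu, kT):
    """Fermi-Dirac distribution; the exponent is clipped to avoid overflow."""
    return 1.0 / (1.0 + np.exp(np.clip((x - mu) / kT, -500.0, 500.0)))

def chebyshev_coefficients(func, m):
    """a_0..a_m of func(x) ~ a_0/2 + sum_j a_j T_j(x) by Gauss-Chebyshev quadrature."""
    n_quad = 4 * (m + 1)
    theta = np.pi * (np.arange(n_quad) + 0.5) / n_quad
    fx = func(np.cos(theta))
    j = np.arange(m + 1)
    return (2.0 / n_quad) * np.cos(np.outer(j, theta)) @ fx

def fermi_operator(H, mu, kT, m):
    """Chebyshev approximation to f(H) for an orthogonal basis (requires m >= 1)."""
    n = H.shape[0]
    # Gershgorin bounds on the spectrum (an assumption of this sketch).
    radius = np.abs(H).sum(axis=1) - np.abs(np.diag(H))
    emin = np.min(np.diag(H) - radius)
    emax = np.max(np.diag(H) + radius)
    a, b = 0.5 * (emax - emin), 0.5 * (emax + emin)
    Hs = (H - b * np.eye(n)) / a                     # spectrum mapped into [-1, 1]
    coef = chebyshev_coefficients(lambda x: fermi(a * x + b, mu, kT), m)
    F = np.zeros((n, n))
    for l in range(n):
        t_prev = np.zeros(n)
        t_prev[l] = 1.0                              # |t_l^0> = |e_l>
        t_curr = Hs @ t_prev                         # |t_l^1> = H |e_l>
        f_l = 0.5 * coef[0] * t_prev + coef[1] * t_curr
        for j in range(2, m + 1):
            t_prev, t_curr = t_curr, 2.0 * (Hs @ t_curr) - t_prev   # Eq. (3.39)
            f_l += coef[j] * t_curr                  # accumulate Eq. (3.40)
        F[:, l] = f_l
    return F
```

Restricting each vector `f_l` to the entries inside a localization region around orbital `l`, and storing `Hs` as a sparse matrix, would turn this dense sketch into the truncated scheme described in the text.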
Using the recipe given by Eqs. (3.39) and (3.40), the number of operations needed to construct the Fermi matrix is proportional to $N^2$. This can be seen as follows. The Hamiltonian in a real-space representation is often a sparse matrix. Let $n_H$ be the maximum number of non-zero off-diagonal elements (either per row or per column) of $H$, $m$ the degree of the Chebyshev polynomial used to approximate the Fermi distribution function, and $N_b$ the number of basis vectors in $\{\phi_{i\alpha}\}$. The number of operations needed to construct $|f_l\rangle$ is then $m n_H N_b$. Since there are $N_b$ column vectors $|f_l\rangle$, the total number of operations required to construct $f(H)$ is $m n_H N_b^2$. Because $N_b$ is proportional to the number of atoms in the system, the calculation of $f(H)$ according to Eqs. (3.39) and (3.40) scales as $N^2$ with respect to the size of the system. To achieve linear scaling for the calculation of $f(H)$, Goedecker and Teter pointed out that, because of the decay of the density matrix in real space, the column vector $|f_l\rangle$ may be viewed as a localized orbital. They then introduced a localization region for each column vector $|f_l\rangle$ about the $l$th basis function such that the elements of $|f_l\rangle$ are taken to be zero if they are outside the localization region. In this way, only those elements of $|f_l\rangle$ within the localization region need to be calculated. Corresponding to a given region of localization, there is a given number of basis vectors, $N_{loc}$, which is independent of the size of the system under consideration. Therefore the number of operations needed for the construction of $f(H)$ becomes $m n_H N_b N_{loc}$, resulting in linear scaling for its calculation. Two more issues play important roles in the implementation of a Chebyshev representation of the Fermi distribution: the mapping of the eigenvalue spectrum of $H$ onto the interval $[-1, 1]$ and the determination of the Fermi energy.
For the former, one needs to know the maximum and minimum eigenvalues of $H$. These quantities can be determined with the help of auxiliary functions of $H$. Since the majority of the computational effort in constructing $|f_l\rangle$ is spent on the recursive calculations of $|t_l^j\rangle$, and these calculations can also be used in building up the auxiliary functions of $H$, evaluating these auxiliary functions requires relatively little additional effort. The determination of the Fermi energy can be accomplished using the condition of charge neutrality, i.e.,
\[
N_e = \sum_{i\alpha} f(H)_{i\alpha,i\alpha} , \tag{3.41}
\]
where $N_e$ is the number of electrons in the system under consideration. In the implementation of Eq. (3.41) to determine the Fermi energy, the only extra effort needed is the computation of the Chebyshev coefficients according to Eq. (3.38) for a series of “guess” values of $E_F$, so that the correct value of $E_F$ can be obtained by satisfying Eq. (3.41). This expense is insignificant compared to the effort for calculating $|f_l\rangle$. The process of
determining $E_F$ scales linearly with the size of the system when the region of localization is imposed in the calculation of $|f_l\rangle$. Once the density matrix $f(H)_{i\alpha,j\beta}$ is determined by Eqs. (3.39) and (3.40) within an appropriately chosen localization region, the band structure energy can be calculated by
\[
E_{BS} = 2\sum_{i\alpha,j\beta} f(H)_{i\alpha,j\beta}\, H_{j\beta,i\alpha} . \tag{3.42}
\]
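The Fermi-energy search based on Eq. (3.41) can be sketched as below. The point illustrated is that the traces of $T_j(H)$ are computed once, while each trial value of $E_F$ only requires a new set of Chebyshev coefficients. The sketch assumes an orthogonal basis, uses dense matrix products (so it is not itself linear scaling), takes spectral bounds from Gershgorin circles, and locates $E_F$ by bisection; these choices and all function names are assumptions of this example.

```python
import numpy as np

def fermi(x, mu, kT):
    """Fermi-Dirac distribution; the exponent is clipped to avoid overflow."""
    return 1.0 / (1.0 + np.exp(np.clip((x - mu) / kT, -500.0, 500.0)))

def chebyshev_coefficients(func, m):
    """a_0..a_m of func(x) ~ a_0/2 + sum_j a_j T_j(x) by Gauss-Chebyshev quadrature."""
    n_quad = 4 * (m + 1)
    theta = np.pi * (np.arange(n_quad) + 0.5) / n_quad
    return (2.0 / n_quad) * np.cos(np.outer(np.arange(m + 1), theta)) @ func(np.cos(theta))

def chebyshev_traces(Hs, m):
    """Tr T_j(Hs) for j = 0..m, from the recursion of Eq. (3.37)."""
    n = Hs.shape[0]
    t_prev, t_curr = np.eye(n), Hs.copy()
    traces = [np.trace(t_prev), np.trace(t_curr)]
    for _ in range(2, m + 1):
        t_prev, t_curr = t_curr, 2.0 * Hs @ t_curr - t_prev
        traces.append(np.trace(t_curr))
    return np.array(traces)

def find_fermi_energy(H, n_occ, kT, m=150, iters=60):
    """Bisection on E_F so that sum_i f(H)_ii = n_occ, as in Eq. (3.41)."""
    n = H.shape[0]
    radius = np.abs(H).sum(axis=1) - np.abs(np.diag(H))
    emin = np.min(np.diag(H) - radius)
    emax = np.max(np.diag(H) + radius)
    a, b = 0.5 * (emax - emin), 0.5 * (emax + emin)
    traces = chebyshev_traces((H - b * np.eye(n)) / a, m)   # done once

    def count(mu):
        # Only the coefficients depend on the trial Fermi energy.
        coef = chebyshev_coefficients(lambda x: fermi(a * x + b, mu, kT), m)
        coef[0] *= 0.5
        return coef @ traces

    lo, hi = emin, emax
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count(mid) < n_occ:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Here `n_occ` is the target value of the right-hand side of Eq. (3.41) as written, i.e. without any additional spin factor.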
The linear scaling for the calculation of $E_{BS}$ is achieved because, for a given site $i$, the summation over $j$ runs only over the sites within the localized region about $i$. Hence the summations over $j$, $\alpha$, and $\beta$ are limited and independent of the size of the system. The effort of calculating $E_{BS}$ therefore depends linearly on the size. For the purpose of carrying out molecular dynamics (MD) simulations, the forces acting on each individual atom in the system must be calculated. The most time-consuming part of the calculation of atomic forces is the electronic contribution. This contribution is determined by the gradient of the band structure energy with respect to the atomic coordinates. Specifically,
\[
\vec{F}^{(el)}_i = -\nabla_{\vec{R}_i} E_{BS} .
\]
Using Eq. (3.42), we obtain
\[
\vec{F}^{(el)}_i = -2\sum_{j\alpha,k\beta} \left\{ f(H)_{j\alpha,k\beta} (\nabla_{\vec{R}_i} H_{k\beta,j\alpha}) + \sum_{l\delta} f'(H)_{j\alpha,l\delta} (\nabla_{\vec{R}_i} H_{l\delta,k\beta}) H_{k\beta,j\alpha} \right\} . \tag{3.43}
\]
With the introduction of a region of localization, the summation over $j$ and $k$ in Eq. (3.43) is limited and independent of the size of the system. Hence the calculation of the atomic forces scales linearly with the system size. However, the introduction of the region of localization makes the compact form of Eq. (3.43) no longer applicable, because a different local Hamiltonian $H(j)$ must be used for the atom at each site $j$. Therefore the gradient of the band structure energy must be calculated term by term. For example, consider a term proportional to $H^3$ in the expansion of the band structure energy as given by Eq. (3.42). Its gradient must be calculated according to
\[
\begin{aligned}
\sum_{j\alpha,k\beta,l\gamma} \big\{ & H(j)_{j\alpha,k\beta}\, H(k)_{k\beta,l\gamma}\, \nabla_{\vec{R}_i} H(l)_{l\gamma,j\alpha}
 + H(j)_{j\alpha,k\beta}\, \nabla_{\vec{R}_i} H(k)_{k\beta,l\gamma}\, H(l)_{l\gamma,j\alpha} \\
 & + \nabla_{\vec{R}_i} H(j)_{j\alpha,k\beta}\, H(k)_{k\beta,l\gamma}\, H(l)_{l\gamma,j\alpha} \big\} .
\end{aligned} \tag{3.44}
\]
Similar procedures must be followed for all the terms in the expansion of $E_{BS}$ as given by Eq. (3.42). Using Eq. (3.44) and similar procedures for the other terms in the band structure energy, the calculation of the atomic forces scales approximately as $N_b N_{loc} n_H m$. When the density matrix is approximated by a Chebyshev polynomial representation, a recursion relation for the gradient of the Chebyshev polynomials, derived by Voter et al. [36], can be used for the calculation of the force; it is discussed in Section 3.3.
While the linear scaling of the method of Fermi operator expansion with respect to the size of the system is achieved through the introduction of the region of localization, an efficient implementation of the method to calculate the total energy and atomic forces still hinges on the degree $m$ of the Chebyshev polynomial representation. Numerical experiments have indicated that a polynomial representation of degree $m = 1.5(\epsilon_{max} - \epsilon_{min})/k_B T$ is sufficient to give a reasonably converged result. This suggests that efficiency favors a high temperature. However, a higher temperature introduces larger errors in the calculations of the total energy and atomic forces. Therefore, a trade-off must be made to balance accuracy against efficiency. Some of the errors can be corrected by using the energy functional $E_{tot} - \frac{1}{2}TS$, where $S$ is the entropy of the electrons in the system [54,55]. This energy functional gives an approximate extrapolation of the total energy to $T = 0$.

The procedure outlined above is applicable for the implementation of the FOE method only in an orthogonal basis set. However, in most real-space representations of systems of interest, in particular when MD simulations are involved, a non-orthogonal basis set is often the basis set of choice (see the discussion in Section 3.4). The generalization of the FOE method to a non-orthogonal basis set has been developed by Stephan and Drabold [45]. In a non-orthogonal basis representation, the electronic structure is the solution to a generalized eigenvalue problem given by
\[
H c = E S c ,
\]
where $H$ is the Hamiltonian of the system expressed in the non-orthogonal basis set and $S$ the overlap matrix.
Introducing an effective Hamiltonian
\[
\bar{H} = S^{-1} H ,
\]
one obtains
\[
\bar{H} c = E c .
\]
In this way, the procedure for the implementation of the Chebyshev polynomial representation of the Fermi operator developed for the orthogonal basis set can now be applied to the effective Hamiltonian $\bar{H}$, i.e., to $f(\bar{H})$. However, even if $S$ is a sparse matrix, $S^{-1}$ is usually a full matrix, and hence so is $\bar{H}$. Thus the implementation of the Chebyshev polynomial representation of $f(\bar{H})$ is computationally costly. Furthermore, in an MD simulation, $\bar{H}$ must be constructed at every time step, which makes the computation of the Fermi operator of $\bar{H}$ even more cumbersome. Hence the representation of the Fermi operator $f(\bar{H})$ by a Chebyshev polynomial in a non-orthogonal basis set is in general not a very efficient procedure, in particular in comparison with either the DC method or the order-N non-orthogonal tight-binding molecular dynamics scheme to be discussed in Section 3.4, as both methods are designed for the non-orthogonal basis set. Goedecker and Teter [30] tested the FOE method by calculating the energies for a pair of screw and antiscrew dislocations in silicon at different separations, using a tight-binding scheme. Their results, summarized in Table 2, show good agreement with the ab initio results [56].
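The two statements made above about $\bar{H} = S^{-1}H$, that it has the same eigenvalues as the generalized problem $Hc = ESc$ and that it is typically much denser than $H$ and $S$, can be checked on a small example. The tridiagonal matrices below are hypothetical toy inputs chosen only for illustration.

```python
import numpy as np

def toy_matrices(n=12, t=1.0, s=0.2, e0=0.3):
    """Hypothetical tridiagonal H and S for a chain with nearest-neighbor overlap."""
    H = np.diag(e0 * (-1.0) ** np.arange(n))
    S = np.eye(n)
    for i in range(n - 1):
        H[i, i + 1] = H[i + 1, i] = -t
        S[i, i + 1] = S[i + 1, i] = s
    return H, S

def generalized_eigenvalues(H, S):
    """Eigenvalues of H c = E S c via a Cholesky factorization S = L L^T."""
    L = np.linalg.cholesky(S)
    Linv = np.linalg.inv(L)
    return np.linalg.eigvalsh(Linv @ H @ Linv.T)

def effective_hamiltonian(H, S):
    """Hbar = S^{-1} H, obtained by solving S Hbar = H."""
    return np.linalg.solve(S, H)

def fill_fraction(A, tol=1e-12):
    """Fraction of matrix elements larger than tol in magnitude."""
    return float(np.mean(np.abs(A) > tol))

H, S = toy_matrices()
Hbar = effective_hamiltonian(H, S)
e_eff = np.sort(np.linalg.eigvals(Hbar).real)   # Hbar is not symmetric in general
e_gen = generalized_eigenvalues(H, S)
```

Comparing `fill_fraction(S)` with `fill_fraction(Hbar)` shows the loss of sparsity that makes the polynomial expansion of $f(\bar{H})$ expensive.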
Table 2
Comparisons of energies for a pair of screw and antiscrew dislocations at various distances obtained by an ab initio method [56] for a 324-atom cell, and those by the FOE-TB method for 648- and 1296-atom cells, reproduced from Table 1 in Ref. [30].

Distance    ab initio    TB 648    TB 1296
3.3 Å       4.36         3.77      3.88
9.9 Å       7.70         6.72      6.93
16.5 Å      10.76        8.73      9.00
23.1 Å      13.32        10.63     10.91
3.3. The kernel polynomial method (KPM) [36]

The kernel polynomial method, developed by Voter and coworkers [36], is similar in its conceptual framework to the FOE method. The key to the method is to approximate either the electronic density of states (DOS) or the zero-temperature Fermi distribution by convoluting the exact DOS or the step function with the kernel polynomial, an expansion of the delta function as a polynomial. The Gibbs oscillations associated with a polynomial fitting of the delta function are damped by Jackson damping [57–59]. The procedure of the method is outlined as follows. The band structure energy and the total number of electrons of a system can be determined by
\[
E_{BS} = 2\int_{-\infty}^{\infty} \epsilon\, \Theta(\epsilon - \epsilon_F)\, \rho(\epsilon)\, d\epsilon \tag{3.45}
\]
and
\[
N_e = 2\int_{-\infty}^{\infty} \Theta(\epsilon - \epsilon_F)\, \rho(\epsilon)\, d\epsilon , \tag{3.46}
\]
respectively, where $\Theta(\epsilon)$ is the zero-temperature Fermi distribution function, namely,
\[
\Theta(\epsilon) = \begin{cases} 1 & \text{if } \epsilon < 0 , \\ 0 & \text{if } \epsilon > 0 . \end{cases}
\]
If one approximates the DOS by convoluting it with the kernel polynomial, one has
\[
\rho_K(\theta) = \int_0^{2\pi} \delta_K(\theta - \theta')\, \rho(\theta')\, d\theta' , \tag{3.47}
\]
where $\rho_K$ is the kernel polynomial-convoluted DOS, and $\delta_K(\theta)$, the kernel polynomial, is the polynomial expansion of the $\delta$ function
\[
\delta_K(\theta) = \sum_{m=0}^{M} \frac{1}{2 q_m}\, g_m \cos(m\theta) \tag{3.48}
\]
with $q_m = \pi$ when $m = 0$, and $\pi/2$ otherwise. The Gibbs factors $\{g_m\}$, in the form derived by Jackson, are introduced in Eq. (3.48) to minimize the Gibbs oscillations due to the finite
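truncation ($M$) of the series. The parameter $\theta$ is defined by
\[
\theta = \cos^{-1}\left( \frac{\epsilon - b}{a} \right) , \tag{3.49}
\]
where $a = \frac{1}{2}(\epsilon_{max} - \epsilon_{min})$ and $b = \frac{1}{2}(\epsilon_{max} + \epsilon_{min})$, with $\epsilon_{max}$ and $\epsilon_{min}$ being the maximum and the minimum eigenenergy, respectively. Eq. (3.49) defines the transformation from the $\epsilon$-space to the $\theta$-space which gives rise to Eq. (3.47). It should be noted that the kernel polynomial is expressed as an expansion in terms of the Chebyshev polynomials since
\[
T_m\left( \frac{\epsilon - b}{a} \right) = \cos\left[ m \cos^{-1}\left( \frac{\epsilon - b}{a} \right) \right] .
\]
Transforming the integration in Eq. (3.46) from the $\epsilon$-space to the $\theta$-space, we have
\[
N_e = 2\int_0^{2\pi} \bar{\Theta}(\theta, \theta_F)\, \rho(\theta)\, d\theta , \tag{3.50}
\]
where $\bar{\Theta}(\theta, \theta_F)$ is the transform of $\Theta$ and is a periodic function of $\theta$ with period $2\pi$ such that it equals $1/2$ for $\theta_F \le \theta \le 2\pi - \theta_F$ and is zero elsewhere. The Fermi angle $\theta_F$ is defined by $E_F = a\cos(\theta_F) + b$. Substituting Eq. (3.47) into Eq. (3.50) yields
\[
N_e = \mu_0 g_0 \left( 1 - \frac{\theta_F}{\pi} \right) - \sum_{m=1}^{M} \frac{2 g_m \mu_m}{m\pi} \sin(m\theta_F) , \tag{3.51}
\]
where
\[
\mu_m = \sum_i \langle i | T_m(\bar{H}) | i \rangle \tag{3.52}
\]
with $i$ summed over all basis functions $|i\rangle$, and $\bar{H} = (H - b)/a$ being the scaled Hamiltonian so that its eigenvalues are in the interval $[-1, 1]$. Similarly, the band structure energy obtained by using the smeared DOS leads to
\[
\begin{aligned}
E^{SD}_{BS} &= 2\int_0^{2\pi} \bar{\Theta}(\theta, \theta_F) \cos(\theta)\, \rho(\theta)\, d\theta \\
&\approx -\frac{\mu_0 g_0 \sin(\theta_F)}{\pi} - \mu_1 g_1 \left[ \frac{\theta_F}{\pi} - 1 + \frac{\sin(2\theta_F)}{2\pi} \right] \\
&\quad - \sum_{m=3}^{M+1} \mu_{m-1} g_{m-1} \left[ \frac{\sin(m\theta_F)}{m\pi} + \frac{\sin[(m-2)\theta_F]}{(m-2)\pi} \right] .
\end{aligned} \tag{3.53}
\]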
One may also approximate the Fermi distribution function by convoluting $\bar{\Theta}(\theta, \theta_F)$ with the kernel polynomial such that
\[
\bar{\Theta}_K(\theta, \theta_F) = \int_0^{2\pi} \delta_K(\theta - \theta')\, \bar{\Theta}(\theta', \theta_F)\, d\theta' . \tag{3.54}
\]
Substituting $\bar{\Theta}_K$ for $\bar{\Theta}$ in Eq. (3.45), one obtains, for the smeared Fermi distribution approximation,
\[
E^{SF}_{BS} = \mu_1 g_0 \left( 1 - \frac{\theta_F}{\pi} \right) - \sum_{m=1}^{M} (\mu_{m-1} + \mu_{m+1})\, g_m \frac{\sin(m\theta_F)}{m\pi} . \tag{3.55}
\]
It should be noted that the average of $E^{SD}_{BS}$ and $E^{SF}_{BS}$ gives a better approximation to the true energy than either one alone. The gradient of the band structure energy needed for the calculation of the atomic forces can be determined using a recursive relation for the gradient of the Chebyshev polynomials derived by Voter et al. [36], namely,
\[
\nabla_{\vec{R}_i} T_j(\bar{H}) = \nabla_{\vec{R}_i} T_{j-2}(\bar{H}) + \sum_{l=0}^{j-1} (1 + k_l)(1 + k_{j-1-l})\, T_l(\bar{H})\, (\nabla_{\vec{R}_i} \bar{H})\, T_{j-1-l}(\bar{H}) , \tag{3.56}
\]
where $k_l = 0$ if $l \le 0$ and 1 otherwise. The number of steps for the calculation of the atomic forces scales as $N_b N_{loc} n_H M^2$ when Eq. (3.56) is used. Here, $M$ denotes the degree of truncation of the kernel polynomial. The kernel polynomial method was tested for a silicon system using the orthogonal tight-binding Hamiltonian of Goodwin et al. [60]. In this calculation, two atoms are considered as “H-linked” if they have a non-zero interacting Hamiltonian matrix element. The region of localization is set up in terms of a given number ($L$) of Hamiltonian links from a certain atom at the site $i$. For example, the region of localization corresponding to $L = 1$ contains all the atoms (neighbors) directly H-linked to the atom at site $i$. The region for $L = 2$ is composed of these immediate neighbors and the atoms which are directly H-linked to the immediate neighbors, and so on. Fig. 1 shows the convergence of the unrelaxed (100) surface energy with respect to $M$ for a series of localization regions ($L$ = 2–4). The calculation was carried out for a supercell with 216 atoms and with a global Fermi energy. It can be seen that the curve for each $L$ has converged by $M = 100$, and the series of asymptotes converges toward the exact tight-binding result. The kernel polynomial method has been extended by Röder et al. [61] to the general case of a non-orthogonal tight-binding basis set. The implementation of the method requires the application of $S^{-1}H$. Instead of the explicit inversion of $S$, the multiplication by $S^{-1}$ is performed using a preconditioned conjugate-gradient method. While O(N) scaling with respect to the size is achieved, the method, just like the FOE method, is not very efficient when applied to a non-orthogonal basis set in MD simulations because of the prefactors involved.
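As an illustration of Eqs. (3.51) and (3.52), the following sketch evaluates the Jackson-damped state count for a small orthogonal tight-binding matrix. It is a dense, non-truncated toy calculation. The closed form used for the Jackson factors is the one commonly quoted in the kernel polynomial literature and is an assumption here, since the text above does not spell it out; the function names are likewise hypothetical.

```python
import numpy as np

def jackson_factors(M):
    """Jackson damping factors g_0..g_M (assumed standard closed form, g_0 = 1)."""
    N = M + 1
    m = np.arange(N)
    q = np.pi / (N + 1)
    return ((N - m + 1) * np.cos(q * m) + np.sin(q * m) / np.tan(q)) / (N + 1)

def chebyshev_moments(Hs, M):
    """mu_m = Tr T_m(Hs) for m = 0..M, Eq. (3.52), by dense recursion."""
    n = Hs.shape[0]
    t_prev, t_curr = np.eye(n), Hs.copy()
    mu = [np.trace(t_prev), np.trace(t_curr)]
    for _ in range(2, M + 1):
        t_prev, t_curr = t_curr, 2.0 * Hs @ t_curr - t_prev
        mu.append(np.trace(t_curr))
    return np.array(mu)

def kpm_state_count(theta_F, mu, g):
    """Right-hand side of Eq. (3.51) for a given Fermi angle theta_F."""
    m = np.arange(1, len(mu))
    series = np.sum(2.0 * g[1:] * mu[1:] * np.sin(m * theta_F) / (m * np.pi))
    return mu[0] * g[0] * (1.0 - theta_F / np.pi) - series

def scaled_hamiltonian(H, emin, emax):
    """Hbar = (H - b) / a with a, b as defined below Eq. (3.49)."""
    a, b = 0.5 * (emax - emin), 0.5 * (emax + emin)
    return (H - b * np.eye(H.shape[0])) / a, a, b
```

For a gapped spectrum with $E_F$ inside the gap, `kpm_state_count(np.arccos((E_F - b) / a), mu, g)` approaches the number of eigenvalues below $E_F$ as `M` grows, since the Jackson-smeared delta functions then have little weight across the gap.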
Fig. 1. The surface energy for the unrelaxed Si(100) surface calculated by the kernel polynomial method vs. the number of moments M at different levels of logical truncation L, reproduced from Fig. 7 in Ref. [36].
3.4. Order-N non-orthogonal tight-binding molecular dynamics (O(N)/NOTB-MD) schemes [44]

Starting from a non-orthogonal basis with only nearest-neighbor overlaps, McKinnon and Choy recently showed that the process of orthogonalization gives rise to terms beyond the nearest-neighbor hopping terms in the tight-binding Hamiltonian [62]. They then concluded that a more appropriate scheme for tight-binding calculations is the non-orthogonal tight-binding approach. Furthermore, in a tight-binding (TB) MD simulation, it is impossible to maintain the orthogonality of the basis set at every time step, because the condition of orthogonality is environment-dependent. Although the orthogonality condition for a given configuration may be achieved via the Löwdin transformation [63], the orthogonality of the basis set corresponding to a given configuration will no longer hold for the configuration at the next time step in an MD simulation, as the system relaxes under the action of its own atomic forces. Thus, to develop a transferable set of tight-binding parameters, applicable to the different local environments encountered in MD simulations, it is more advantageous to use the framework of a non-orthogonal tight-binding approach. These considerations have created the need to develop an O(N) procedure within the framework of a non-orthogonal tight-binding Hamiltonian. Recently, Jayanthi et al. developed just such a procedure, an O(N) non-orthogonal tight-binding molecular dynamics (O(N)/NOTB-MD) scheme [44]. While the pseudo-density matrix $\rho_{i\alpha,j\beta}$ can be calculated using Eq. (3.7), it can also be determined from the generalized Green’s function $\tilde{G}_{i\alpha,j\beta}$ according to
\[
\rho_{i\alpha,j\beta} = -\frac{1}{\pi} \lim_{\epsilon \to 0} \int_{-\infty}^{E_F} {\rm Im}\, \tilde{G}_{i\alpha,j\beta}(E + i\epsilon)\, dE , \tag{3.57}
\]
where $\tilde{G}$ is defined by
\[
\tilde{G} \equiv (ES - H)^{-1} . \tag{3.58}
\]
As shown in Eq. (3.16), the calculation of the band structure energy in the context of a non-orthogonal Hamiltonian is given by
\[
E_{BS} = 2\sum_{i\alpha} \sum_{j\beta} \rho_{i\alpha,j\beta} H_{j\beta,i\alpha} . \tag{3.16$'$}
\]
The electronic contribution to the force acting on the $i$th atom can be determined by the Hellmann–Feynman theorem and is given by
\[
\vec{F}_{i,el} = -\nabla_{\vec{R}_i} E_{BS} = -2\sum_{j\alpha,k\beta} \left\{ \rho_{j\alpha,k\beta} \nabla_{\vec{R}_i} H_{k\beta,j\alpha}(\vec{R}_{kj}) - \tau_{j\alpha,k\beta} \nabla_{\vec{R}_i} S_{k\beta,j\alpha}(\vec{R}_{kj}) \right\} , \tag{3.59}
\]
where
\[
\tau_{i\alpha,j\beta} \equiv -\frac{1}{\pi} \lim_{\epsilon \to 0} \int_{-\infty}^{E_F} E\, {\rm Im}\, \tilde{G}_{i\alpha,j\beta}(E + i\epsilon)\, dE . \tag{3.60}
\]
The feasibility of developing a linear scaling algorithm for the calculation of the band structure energy and the electronic contribution to the atomic forces within the framework of a NOTB Hamiltonian depends on the decay of the pseudo-density matrix in real space, namely, $\rho_{i\alpha,j\beta}(\vec{R}_{ij}) \to 0$ as $R_{ij} \to \infty$. The O(N) procedure for the atomic force requires in addition that $\tau_{i\alpha,j\beta}(\vec{R}_{ij}) \to 0$ as $R_{ij} \to \infty$. These conditions have been checked for both semiconductors and metals. It was found that, for semiconductors, $\rho_{i\alpha,j\beta}$ and $\tau_{i\alpha,j\beta}$ approach zero for $R_{ij}$ greater than the third-nearest-neighbor distance. In the case of metals, these conditions are satisfied for somewhat larger $R_{ij}$. Thus the summation over $j$ in both Eqs. (3.16$'$) and (3.59) can be truncated within a region defined by $R_0$ such that $\rho_{i\alpha,j\beta}(\vec{R}_{ij}) \to 0$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij}) \to 0$ for $R_{ij} > R_0$. In this way, an O(N) procedure for the calculation of the band structure energy and atomic forces is established.

However, the accuracy of the band structure energy and of the atomic force calculated using the truncated summation depends critically on how accurately $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij})$ can be determined for $R_{ij} \le R_0$. To ensure an accurate determination of these quantities, it is imperative to attach a buffer zone, defined by $R_b$, to the region defined by $R_0$. The key is to ensure that there is sufficient input from the buffer zone beyond the interior region ($R_0$) to render a reliable calculation of $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij})$ for $R_{ij} \le R_0$. Thus, one has to deal with a local Hamiltonian in the region corresponding to $R_{loc} = R_0 + R_b$ about a given site $i$, even though one only needs to determine $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij})$ accurately for $R_{ij} \le R_0$. In this regard, it is advantageous to use the method of real-space Green’s function (RSGF) to calculate $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij})$ according to Eqs. (3.57) and (3.60).
This method shifts the computational effort of inverting a large matrix to matrix multiplications and inversions of a sequence of matrices of dimension smaller than the original matrix. This feature
allows the calculation of $G_{i\alpha,j\beta}(\vec{R}_{ij})$ for $R_{ij} \le R_0$ without actually having to invert the entire matrix corresponding to $R_{loc}$. Hence this method leads to an efficient calculation of $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ and $\tau_{i\alpha,j\beta}(\vec{R}_{ij})$. The implementation of the O(N)/NOTB-MD scheme starts with the determination of $E_F$ using the following equation:
\[
N_e = \sum_{i\alpha} \sum_{j\beta} \rho_{i\alpha,j\beta} S_{j\beta,i\alpha} . \tag{3.7$'$}
\]
For a given configuration of the system under consideration and an appropriately chosen $R_{loc}$, an initial guess of $E_F$ is used in Eq. (3.57) to calculate $\rho_{i\alpha,j\beta}$. The result is then substituted into Eq. (3.7) to determine the total number of electrons, $N_e$. If this number is not equal to the given number of electrons in the system, the process is repeated for a series of $E_F$ until consistency between $E_F$ and $N_e$ is achieved. The resulting $\rho_{i\alpha,j\beta}$, together with the $\tau_{i\alpha,j\beta}$ calculated with the final $E_F$ (Eq. (3.60)), is then used to calculate the band structure energy and the electronic contribution to the atomic forces (Eqs. (3.16) and (3.59), respectively). The total energy of the system within the framework of the tight-binding approach is calculated by
\[
E_{tot} = E_{BS} + E_{rep} , \tag{3.61}
\]
where $E_{rep}$ is the sum of pairwise repulsive terms which are usually parameterized by fitting. The force acting on a given atom $i$ is determined by
\[
\vec{F}_i = \vec{F}_{i,el} - \nabla_{\vec{R}_i} E_{rep} . \tag{3.62}
\]
The series of equations, Eqs. (3.7), (3.16), (3.57)–(3.62), constitute the working equations for the O(N)/NOTB-MD scheme. The accuracy of the O(N)/NOTB-MD scheme was checked by calculating the atomic forces on every atom of an unstable silicon (Si$_{80}$) cluster in the tetrahedral network structure, using the NOTB Hamiltonian developed by Menon and Subbaswamy [64] for silicon. Fig. 2 compares the result of the calculation with the exact result obtained by direct diagonalization. It can be seen that the overall agreement is excellent. The efficiency of the scheme was demonstrated by using it to determine the stable configuration of a Si$_{1000}$ cluster. The initial configuration of the cluster was set up in a regular tetrahedral network. This unstable configuration was relaxed under the action of the atomic forces. A stable configuration was considered to have been reached if the atomic forces are of the order of $10^{-2}$ eV and the energy is at a minimum. Fig. 3 shows the top view along the ⟨100⟩ direction of the stable configuration of the Si$_{1000}$ cluster. The interior of the cluster is seen to exhibit the bulk-like structure. On the “surface”, the appearance of dimers associated with the Si(100) reconstruction is in evidence. The calculation of atomic forces at 1000 sites takes approximately 1.35 min on a Convex/Exemplar with 16 processors (HP 735). Therefore 1000 MD steps for a Si$_{1000}$ cluster can be accomplished in less than a day on that machine.
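The Green's-function definitions of Eqs. (3.57), (3.58) and (3.60) can be checked numerically on a small non-orthogonal model. The sketch below is only an illustration of those definitions, not of the real-space Green's function algorithm itself: it inverts $(ES - H)$ directly on an energy grid, uses a small but finite $\epsilon$ and a finite lower integration limit (`eps` and `span`, both assumptions of this example), and compares the result with the density matrices built from the generalized eigenvectors. The toy chain and all names are hypothetical.

```python
import numpy as np

def toy_model(n=8, t=(1.0, 0.4), s=(0.2, 0.08)):
    """Hypothetical non-orthogonal chain with alternating hoppings and overlaps."""
    H = np.zeros((n, n))
    S = np.eye(n)
    for i in range(n - 1):
        k = i % 2
        H[i, i + 1] = H[i + 1, i] = -t[k]
        S[i, i + 1] = S[i + 1, i] = s[k]
    return H, S

def green_density_matrices(H, S, e_fermi, eps=0.01, span=30.0):
    """rho and tau from Eqs. (3.57) and (3.60) by midpoint quadrature at finite eps."""
    de = eps / 5.0
    n_pts = int(round(span / de))
    E = e_fermi - span + (np.arange(n_pts) + 0.5) * de
    z = E + 1j * eps
    G = np.linalg.inv(z[:, None, None] * S - H)      # (ES - H)^(-1), Eq. (3.58)
    rho = -np.sum(G.imag, axis=0) * de / np.pi
    tau = -np.sum(E[:, None, None] * G.imag, axis=0) * de / np.pi
    return rho, tau

def exact_density_matrices(H, S, n_occ):
    """Reference rho and tau from the generalized eigenproblem H c = E S c."""
    L = np.linalg.cholesky(S)
    Linv = np.linalg.inv(L)
    E, U = np.linalg.eigh(Linv @ H @ Linv.T)
    C = Linv.T @ U                                   # S-orthonormal eigenvectors
    C_occ = C[:, :n_occ]
    rho = C_occ @ C_occ.T
    tau = (C_occ * E[:n_occ]) @ C_occ.T
    return rho, tau, E
```

With the Fermi energy placed in the gap, the two routes agree to within an error of order `eps` divided by the distance from $E_F$ to the nearest level, which is why $\epsilon$ must be taken small compared with the gap.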
Fig. 2. Comparisons of the forces acting on every atom in an unstable (tetrahedral) Si$_{80}$ cluster calculated by the O(N)/NOTB method and those obtained by direct diagonalization, reproduced from Fig. 2 in Ref. [44].
3.5. Recursion method-related O(N) schemes

The recursion method developed by Haydock et al. [65], in its original framework, is a well-established method for calculating the local density of states (LDOS). In this method, a diagonal element of the Green’s function is calculated in terms of a continued fraction whose coefficients are determined according to the Lanczos transformation [66]. Because the LDOS at a given site is not expected to depend sensitively on the part of the system far from the site in question, the continued fraction can be truncated at a finite step. In this way, the calculation of the LDOS is size-independent, rendering a linear scaling for the calculation of the DOS. Since the band structure energy can be calculated as an integration of the energy weighted by the DOS, several workers have taken advantage of the linear scaling behavior of the LDOS in the framework of the recursion method to devise linear scaling schemes for the calculation of the total energy [17,19,34,67,68]. It should be noted that the recursion method-based schemes are applicable only for orthogonal basis sets. In fact, all the recursion method-based O(N) schemes except one [17] have been developed for orthogonal tight-binding Hamiltonians. While it is quite convenient to implement the recursion scheme for the calculation of the band structure energy, the calculation of the electronic contribution to the atomic force is not at all straightforward. This is because the determination of $\vec{F}_{el,i}$ requires the calculation of the derivatives of the local Green’s function with respect to atomic coordinates, a process that is cumbersome to implement and slow in its evaluation. Horsfield developed a scheme, referred to as the global density of
Fig. 3. The top view (along the ⟨100⟩ direction) of the relaxed Si$_{1000}$ cluster with the dimer associated with the reconstruction of the Si(100) “surface” highlighted, reproduced from Fig. 4(a) in Ref. [44].
states (GDOS) method [69], to speed up the calculation of the derivatives. An alternative way to circumvent this problem, the bond order potentials (BOP) method [34], was developed by Pettifor and coworkers. In this approach, instead of differentiating the band structure energy, the Hellmann–Feynman theorem is used to determine the electronic force. The essential points of these two methods are described in the following.

3.5.1. The global density of states (GDOS) method [69]

For the orthogonal basis set, the band structure energy can be determined by
\[
E_{BS} = -\frac{1}{\pi} \sum_{i\alpha} \lim_{\epsilon \to 0} \int dE\, {\rm Im}\, G_{i\alpha,i\alpha}(E + i\epsilon)\, E f(E) . \tag{3.63}
\]
The electronic contribution to the atomic force can then be obtained as the gradient of the band structure energy. Specifically,
\[
\vec{F}_{el,j} = -\frac{1}{\pi} \sum_{i\alpha} \lim_{\epsilon \to 0} \int dE\, \nabla_{\vec{R}_j} {\rm Im}\, G_{i\alpha,i\alpha}(E + i\epsilon)\, E f(E) , \tag{3.64}
\]
where $\vec{F}_{el,j}$ is the electronic contribution to the force acting on the $j$th atom. Within the framework of the recursion method, the diagonal element of the Green’s function is determined as a
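continued fraction $G_{00}(z)$ in terms of the set of recursion coefficients $a_n$ and $b_n$ generated from $|u_0\rangle = |i\alpha\rangle$. Thus the gradient of the diagonal element can be written as

$$\nabla_{\vec{R}_j} G_{00}(z) = \sum_n \frac{\partial G_{00}(z)}{\partial a_n} \sum_{p=1}^{2n+1} \frac{\partial a_n}{\partial \mu_{i\alpha}^{(p)}} \nabla_{\vec{R}_j} \mu_{i\alpha}^{(p)} + \sum_n \frac{\partial G_{00}(z)}{\partial b_n} \sum_{p=1}^{2n} \frac{\partial b_n}{\partial \mu_{i\alpha}^{(p)}} \nabla_{\vec{R}_j} \mu_{i\alpha}^{(p)}, \tag{3.65}$$

where

$$\nabla_{\vec{R}_j} \mu_{i\alpha}^{(p)} = \nabla_{\vec{R}_j} \langle i\alpha|H^p|i\alpha\rangle = \nabla_{\vec{R}_j} \sum_{i_1\alpha_1 \ldots i_{p-1}\alpha_{p-1}} H_{i\alpha,i_1\alpha_1} H_{i_1\alpha_1,i_2\alpha_2} \cdots H_{i_{p-1}\alpha_{p-1},i\alpha}$$

$$= \sum_{i_1\alpha_1 \ldots i_{p-1}\alpha_{p-1}} \{ (\nabla_{\vec{R}_j} H_{i\alpha,i_1\alpha_1}) H_{i_1\alpha_1,i_2\alpha_2} \cdots H_{i_{p-1}\alpha_{p-1},i\alpha} + H_{i\alpha,i_1\alpha_1} (\nabla_{\vec{R}_j} H_{i_1\alpha_1,i_2\alpha_2}) \cdots H_{i_{p-1}\alpha_{p-1},i\alpha} + \cdots \}. \tag{3.66}$$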
The evaluation of the gradient of the diagonal element of the Green's function using Eqs. (3.65) and (3.66) is in general a very slow process, mainly because of the calculation of the gradient of the local moment, $\mu_{i\alpha}^{(p)} = \langle i\alpha|H^p|i\alpha\rangle$. Therefore, instead of working with the local moments, Horsfield considered the global moment

$$\mu^{(p)} = \sum_{i\alpha,\, i_1\alpha_1 \ldots i_{p-1}\alpha_{p-1}} H_{i\alpha,i_1\alpha_1} H_{i_1\alpha_1,i_2\alpha_2} \cdots H_{i_{p-1}\alpha_{p-1},i\alpha} = \mathrm{Tr}\{H^p\}. \tag{3.67}$$

Since the matrix multiplication can be cyclically permuted in a trace, we have

$$\nabla_{\vec{R}_j} \mu^{(p)} = p \sum_{i\alpha,\, i_1\alpha_1 \ldots i_{p-1}\alpha_{p-1}} (\nabla_{\vec{R}_j} H_{i\alpha,i_1\alpha_1}) H_{i_1\alpha_1,i_2\alpha_2} \cdots H_{i_{p-1}\alpha_{p-1},i\alpha}. \tag{3.68}$$
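The trace identity behind Eq. (3.68) can be checked numerically. The sketch below is an illustration under stated assumptions, not Horsfield's implementation: the three-orbital Hamiltonian, its exponential dependence on a single coordinate `x`, and all function names are hypothetical. It compares $p\,\mathrm{Tr}[(dH/dx)H^{p-1}]$ with a central finite difference of $\mathrm{Tr}\{H^p\}$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def matpow(h, p):
    """H**p for integer p >= 1 by repeated multiplication."""
    out = h
    for _ in range(p - 1):
        out = matmul(out, h)
    return out

def hamiltonian(x):
    """Toy 3-orbital H; only the 0-1 hopping depends on the coordinate x."""
    t = -math.exp(-x)
    return [[0.0, t, 0.0], [t, 0.0, -1.0], [0.0, -1.0, 0.3]]

def dhamiltonian(x):
    """Analytic derivative dH/dx of the toy Hamiltonian."""
    dt = math.exp(-x)
    return [[0.0, dt, 0.0], [dt, 0.0, 0.0], [0.0, 0.0, 0.0]]

def global_moment(x, p):
    """Global moment Tr{H^p}."""
    return trace(matpow(hamiltonian(x), p))

def global_moment_gradient(x, p):
    """p * Tr[(dH/dx) H^(p-1)]: a single term, thanks to the trace."""
    dh = dhamiltonian(x)
    if p == 1:
        return trace(dh)
    return p * trace(matmul(dh, matpow(hamiltonian(x), p - 1)))

def finite_difference(x, p, step=1e-5):
    """Central-difference estimate of d Tr{H^p} / dx."""
    return (global_moment(x + step, p) - global_moment(x - step, p)) / (2 * step)

for p in range(1, 6):
    print(p, global_moment_gradient(0.4, p), finite_difference(0.4, p))
```

For the local moment of Eq. (3.66) no such shortcut exists, because the derivative can act on any of the p factors and the p resulting terms are all different.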
Eq. (3.68) is, compared to Eq. (3.66), much easier to implement on a computer, thus allowing a more efficient way of calculating the gradient of the Green's function. The conversion of the moments into recursion coefficients has been found to be unstable if the order p of the moment is greater than 21. This restriction severely limits the usefulness of the GDOS method, since many realistic situations require moments of order much higher than the limit $p \le 21$.

3.5.2. The bond order potential (BOP) [34]

As given by Eq. (3.16), the band structure energy can also be calculated according to $E_{BS} = 2 \sum_{i\alpha,j\beta} \rho_{i\alpha,j\beta} H_{j\beta,i\alpha}$, where the density matrix $\rho_{i\alpha,j\beta}$ is expressed as

$$\rho_{i\alpha,j\beta} = -\frac{1}{\pi} \lim_{\varepsilon \to 0} \int dE\, \mathrm{Im}\, G_{i\alpha,j\beta}(E + i\varepsilon) f(E). \tag{3.69}$$
The only difference between this expression and the corresponding expression in Eq. (3.57) is the presence of the Fermi distribution function. For the orthogonal basis set employed in the bond order potential (BOP), the electronic contribution to the atomic force determined from the Hellmann–Feynman theorem and given by Eq. (3.59) reduces to

$$\vec{F}_{i,el} = -2 \sum_{j\alpha,k\beta} \rho_{j\alpha,k\beta} \nabla_{\vec{R}_i} H_{k\beta,j\alpha}. \tag{3.70}$$
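Thus, the burden of calculating the force is shifted to the determination of the density matrix. Within the framework of the recursion method, the off-diagonal Green's function is determined from two diagonal Green's functions, namely,

$$G_{i\alpha,j\beta}(z) = \tfrac{1}{2} \{ G_{++}(z) - G_{--}(z) \}, \tag{3.71}$$

where

$$G_{++}(z) = \langle +|G(z)|+\rangle, \quad G_{--}(z) = \langle -|G(z)|-\rangle, \quad \text{and} \quad G(z) = (z - H)^{-1} \tag{3.72}$$

with $|+\rangle = \frac{1}{\sqrt{2}}(|i\alpha\rangle + |j\beta\rangle)$ and $|-\rangle = \frac{1}{\sqrt{2}}(|i\alpha\rangle - |j\beta\rangle)$ being the bonding and anti-bonding orbitals, respectively. Thus the off-diagonal density matrix can be expressed as

$$\rho_{i\alpha,j\beta} = \tfrac{1}{2}(N_{++} - N_{--}), \tag{3.73}$$

where

$$N_{++} = -\frac{1}{\pi} \lim_{\varepsilon \to 0} \int dE\, \mathrm{Im}\, G_{++}(E + i\varepsilon) f(E), \qquad N_{--} = -\frac{1}{\pi} \lim_{\varepsilon \to 0} \int dE\, \mathrm{Im}\, G_{--}(E + i\varepsilon) f(E).$$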
The quantity $(N_{++} - N_{--})$, which gives the difference between the number of electrons per spin in the bonding orbital and that in the anti-bonding orbital, is referred to as the bond order. The calculation of the density matrix using Eq. (3.73) turns out to be a slow process, as it requires high levels of recursion to achieve an accurate bond order. Through the efforts of Pettifor and coworkers, a more efficient expansion scheme was developed. Let

$$|u_0^{\lambda}\rangle = \frac{1}{\sqrt{2}} (|i\alpha\rangle + e^{i\theta} |j\beta\rangle), \tag{3.74}$$

where $\theta = \cos^{-1}\lambda$. It can be seen that $|+\rangle = |u_0^{1}\rangle$ and $|-\rangle = |u_0^{-1}\rangle$. In general,

$$G_{00}^{\lambda} = \tfrac{1}{2}(G_{i\alpha,i\alpha} + G_{j\beta,j\beta}) + \lambda G_{i\alpha,j\beta} \tag{3.75}$$
in the context of the tight-binding approach where the Hamiltonian is real and symmetric. The off-diagonal Green's function can then be determined by

$$G_{i\alpha,j\beta}(z) = \frac{G_{00}^{\lambda_1}(z) - G_{00}^{\lambda_2}(z)}{\lambda_1 - \lambda_2}. \tag{3.76}$$

Eq. (3.76) indicates that the calculation of the off-diagonal Green's function requires the difference of two diagonal Green's functions. It is this relation that slows down the calculation. If we take the limit $\lambda_1 \to \lambda_2$, we obtain

$$G_{i\alpha,j\beta} = \frac{\partial G_{00}^{\lambda}}{\partial \lambda}. \tag{3.77}$$
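In this way, the off-diagonal Green's function is expressed as the derivative of a single diagonal Green's function. Substituting Eq. (3.77) into Eq. (3.69), one obtains, after some tedious manipulations with the introduction of an "auxiliary" space,

$$\rho_{i\alpha,j\beta} = -\left[ \sum_{n=0}^{\infty} \chi^{\Lambda}_{0n,n0} (\delta a_n^{\Lambda})_{i\alpha,j\beta} + 2 \sum_{n=1}^{\infty} \chi^{\Lambda}_{0n,(n-1)0} (\delta b_n^{\Lambda})_{i\alpha,j\beta} \right], \tag{3.78}$$

where the response functions $\chi^{\Lambda}_{0m,n0}(N_{el}, T)$ for a given number of electrons $N_{el}$ and electron temperature T are given by

$$\chi^{\Lambda}_{0m,n0}(N_{el}, T) = \frac{1}{\pi} \lim_{\varepsilon \to 0} \mathrm{Im} \int G^{\Lambda}_{0m}(E + i\varepsilon) G^{\Lambda}_{n0}(E + i\varepsilon) f(E)\, dE. \tag{3.79}$$

The superscript $\Lambda$ denotes a vector $|e^{\Lambda}_{i\alpha})$ in the auxiliary vector space spanned by a set of orthonormal basis vectors, such that $\Lambda_{i\alpha,j\beta} = (e^{\Lambda}_{i\alpha}|e^{\Lambda}_{j\beta})$ is the inner product between two such vectors. The vector space is referred to as the "auxiliary" vector space because the Hamiltonian does not operate in this space, i.e., $H|e^{\Lambda}_{i\alpha}) = |e^{\Lambda}_{i\alpha})H$. This relation leads to $(e^{\Lambda}_{i\alpha}|f(H)|e^{\Lambda}_{j\beta}) = \Lambda_{i\alpha,j\beta} f(H)$. It should be noted that $\Lambda_{i\alpha,j\beta}$ really has no physical meaning. However, since the auxiliary space will always appear in conjunction with the atomic orbitals $|i\alpha\rangle$, the presence of $\Lambda_{i\alpha,j\beta}$ allows one to label the bond between the atomic orbitals under consideration. The quantities $(\delta a_n^{\Lambda})_{i\alpha,j\beta}$ and $(\delta b_n^{\Lambda})_{i\alpha,j\beta}$ are expressed as

$$(\delta a_n^{\Lambda})_{i\alpha,j\beta} = \frac{\partial a_n^{\Lambda}}{\partial \Lambda_{i\alpha,j\beta}} = \sum_{s=1}^{2n+1} \frac{\partial a_n^{\Lambda}}{\partial \mu_s^{\Lambda}} \langle i\alpha|H^s|j\beta\rangle, \tag{3.80}$$

$$(\delta b_n^{\Lambda})_{i\alpha,j\beta} = \frac{\partial b_n^{\Lambda}}{\partial \Lambda_{i\alpha,j\beta}} = \sum_{s=1}^{2n} \frac{\partial b_n^{\Lambda}}{\partial \mu_s^{\Lambda}} \langle i\alpha|H^s|j\beta\rangle, \tag{3.81}$$

where

$$\mu_s^{\Lambda} = \sum_{i\alpha,j\beta} \Lambda_{i\alpha,j\beta} \langle i\alpha|H^s|j\beta\rangle.$$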
The set of equations (3.78)–(3.81) constitutes the working equations for the bond order expansion. The truncation of the expansion at a finite number of terms requires the introduction of auxiliary matrices, referred to as O-matrices; the details can be found in Ref. [34]. In the implementation of this method, it has been found that there can be an inconsistency between the total energy and the atomic forces [3]. This discrepancy might lead to an incorrect prediction of the stable structure of the system under consideration. Another factor to be noted is that the method is designed for orthogonal tight-binding basis sets. Extending it to the general case of a non-orthogonal basis set would not be a trivial task, and most likely would not be worth the effort.
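The relation between the off-diagonal Green's function and the diagonal element taken in the mixed orbital of Eq. (3.74) can be illustrated on a two-orbital toy problem. The sketch below is only a consistency check under stated assumptions, not the BOP expansion itself: the Green's function is obtained by direct 2 × 2 inversion rather than by recursion, and the Hamiltonian values and function names are hypothetical.

```python
import cmath
import math

def green_2x2(h, z):
    """G(z) = (z - H)^-1 for a real symmetric 2x2 Hamiltonian."""
    (e1, t), (_, e2) = h
    det = (z - e1) * (z - e2) - t * t
    return [[(z - e2) / det, t / det], [t / det, (z - e1) / det]]

def g00_lambda(h, z, lam):
    """<u|G(z)|u> with |u> = (|i> + exp(i*theta)|j>)/sqrt(2), cos(theta) = lam."""
    theta = math.acos(lam)
    u = [complex(1 / math.sqrt(2)), cmath.exp(1j * theta) / math.sqrt(2)]
    g = green_2x2(h, z)
    return sum(u[a].conjugate() * g[a][b] * u[b]
               for a in range(2) for b in range(2))

def off_diagonal(h, z, lam1, lam2):
    """Difference quotient of two diagonal elements; exact for real symmetric H."""
    return (g00_lambda(h, z, lam1) - g00_lambda(h, z, lam2)) / (lam1 - lam2)

h = [[0.0, -1.0], [-1.0, 0.5]]
z = complex(0.3, 0.2)
print(green_2x2(h, z)[0][1])
print(off_diagonal(h, z, 1.0, -1.0))   # bonding/anti-bonding pair
print(off_diagonal(h, z, 0.3, 0.1))    # any other pair gives the same value
```

Because the diagonal element is linear in the mixing parameter, any pair of values gives the same quotient; the choice of +1 and -1 corresponds to the bonding and anti-bonding orbitals.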
4. Order-N methods based on variational approaches

In the literature, there are various approaches to the development of an O(N) algorithm using the variational principle [18,20–26,28,29,31–33,35,39,41]. All these approaches concern the minimization of some energy functional. They either reformulate the usual constrained variational scheme as an unconstrained one [70,71], or include the constraints (e.g., the orthogonality of the eigenfunctions or the idempotency of the density operator) through a penalty function [18,39]. The variational schemes are carried out in terms of either the density matrix elements or localized orbitals. The linear scaling of the calculation of the total energy and atomic forces with respect to the size of the system stems from the truncation of the density matrix or the localized orbitals in real space.

4.1. The density matrix (DM) method [21,22,28]

From Eqs. (1.9)–(1.13), we have

$$E = 2\, \mathrm{Tr}(\hat{\rho} H) \tag{4.1}$$

and

$$N = 2\, \mathrm{Tr}(\hat{\rho}). \tag{4.2}$$

For a given configuration of the system under consideration, the density matrix $\hat{\rho}$, and thus the energy E, can be determined variationally by minimizing E with respect to $\hat{\rho}$ under the constraint given by Eq. (4.2). Li, Nunes, and Vanderbilt, instead of minimizing E directly, considered the grand potential $\Omega$ given by [21]

$$\Omega = 2\, \mathrm{Tr}[\hat{\rho}(H - \mu)], \tag{4.3}$$
where $\mu$ is the chemical potential (Fermi level) of the system. By minimizing the grand potential $\Omega$ with respect to $\hat{\rho}$ for a $\mu$ in the interval $(\varepsilon_{N/2}, \varepsilon_{(N/2)+1})$, with $\varepsilon_i$ being the ith eigenenergy of H, the constraint on N as given by Eq. (4.2) is eliminated. However, this process leads to unphysical results for $\lambda$, the eigenvalues of $\hat{\rho}$. Specifically, for states below the Fermi level
Fig. 4. The function $f(x) = 3x^2 - 2x^3$ vs. x, reproduced from Fig. 1 in Ref. [21].
$\lambda$ will approach $\infty$, while above the Fermi level $\lambda$ will approach $-\infty$. This is certainly not the expected result of $\lambda = 1$ for states below the Fermi level and $\lambda = 0$ for states above the Fermi level. Since the idempotency of $\hat{\rho}$ is responsible for $\lambda$ being 1 for the occupied states and zero for the unoccupied states, Li et al. introduced the purification transformation (McWeeny transformation) [72]

$$\tilde{\rho} = 3\hat{\rho}^2 - 2\hat{\rho}^3 \tag{4.4}$$

to control the situation. From Eq. (4.4), it can be seen that an idempotent matrix is invariant under the purification transformation. Because the function $f(x) = 3x^2 - 2x^3$ is stationary at x = 0 and 1, if $\lambda$ is close to 0 or 1 ($\lambda = \delta$ or $\lambda = 1 + \delta$, $|\delta| \ll 1$), then $\tilde{\lambda} \approx O(\delta^2)$ or $1 - O(\delta^2)$. This simply means that if $\hat{\rho}$ is nearly idempotent, then $\tilde{\rho}$ is more nearly idempotent. Furthermore, the function f(x) is concave upwards at x = 0 and concave downwards at x = 1. The eigenvalues of $\tilde{\rho}$, $\tilde{\lambda}$, are therefore constrained to the interval [0,1] if $\lambda$ is in the neighborhood of 0 or 1 (see Fig. 4). In this way, the minimization of $\Omega$ will more likely drive $\lambda \to 1$ for the occupied states and $\lambda \to 0$ for the unoccupied states. Thus, the scheme is to minimize the energy functional $\tilde{\Omega}$,

$$\tilde{\Omega} = 2\, \mathrm{Tr}[(3\hat{\rho}^2 - 2\hat{\rho}^3)(H - \mu)] \tag{4.5}$$
with respect to $\hat{\rho}$. The development of an O(N) scheme for the minimization of $\tilde{\Omega}$ again hinges on the decay behavior of the density matrix in real space. By imposing the region of localization ($R_c$) for the density matrix such that

$$\rho_{ij}(\vec{R}_{ij}) = 0 \quad \text{for } R_{ij} > R_c, \tag{4.6}$$

the minimization procedure for $\tilde{\Omega}$ will scale linearly with the size. Once $\tilde{\Omega}_{min}$ is obtained for a given configuration of the system under consideration, the corresponding energy can be determined by

$$E(N) = \tilde{\Omega}_{min} + \mu N. \tag{4.7}$$
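The scheme of Eqs. (4.4), (4.5), and (4.7) can be sketched on a two-orbital dimer. This is a minimal illustration under stated assumptions, not the implementation of Ref. [21]: it uses plain steepest descent with a fixed step rather than conjugate gradients, applies no real-space truncation (the matrix is only 2 × 2), and the function names, step size, and starting point are hypothetical choices.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, s=1.0):
    """Element-wise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def purify(rho):
    """McWeeny purification 3*rho^2 - 2*rho^3, Eq. (4.4)."""
    r2 = matmul(rho, rho)
    r3 = matmul(r2, rho)
    return [[3 * x - 2 * y for x, y in zip(ra, rb)] for ra, rb in zip(r2, r3)]

def omega(rho, hp):
    """Functional of Eq. (4.5), with hp = H - mu*I."""
    return 2 * trace(matmul(purify(rho), hp))

def gradient(rho, hp):
    """Derivative of Eq. (4.5) with respect to rho (symmetric matrices)."""
    r2 = matmul(rho, rho)
    first = combine(matmul(rho, hp), matmul(hp, rho))
    second = combine(combine(matmul(r2, hp), matmul(matmul(rho, hp), rho)),
                     matmul(hp, r2))
    return [[2 * (3 * f - 2 * s) for f, s in zip(rf, rs)]
            for rf, rs in zip(first, second)]

def minimize(h, mu, nsteps=80, step=0.05):
    """Steepest descent from rho = I/2; returns (rho, omega_min)."""
    n = len(h)
    hp = [[h[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    rho = [[0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(nsteps):
        rho = combine(rho, gradient(rho, hp), -step)
    return rho, omega(rho, hp)

h = [[0.0, -1.0], [-1.0, 0.0]]   # dimer with levels -1 and +1
mu, nelec = 0.0, 2
rho, omega_min = minimize(h, mu)
print(rho, omega_min + mu * nelec)   # energy via Eq. (4.7)
```

For this dimer the iteration drives the occupation of the bonding level to 1 and that of the anti-bonding level to 0, which is the behavior the purified functional is designed to produce.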
The time-consuming part of the procedure is the matrix multiplication involving $\rho$. If there are M orbitals per site and L sites in the region of localization, then the calculation will scale as $N L^2 M^3$. The atomic forces can be determined as the gradient of $\tilde{\Omega}$. For a fixed chemical potential, one has

$$\nabla_{\vec{R}_i} \tilde{\Omega} = \frac{\partial \tilde{\Omega}}{\partial \rho} \nabla_{\vec{R}_i} \rho + \frac{\partial \tilde{\Omega}}{\partial H} \nabla_{\vec{R}_i} H. \tag{4.8}$$
The variational nature of the process of determining $\tilde{\Omega}_{min}$ for a given configuration requires $\partial \tilde{\Omega} / \partial \rho = 0$. Hence

$$\vec{F}_i = -\nabla_{\vec{R}_i} \tilde{\Omega} = -\frac{\partial \tilde{\Omega}}{\partial H} \nabla_{\vec{R}_i} H = -2\, \mathrm{Tr}[(3\rho^2 - 2\rho^3) \nabla_{\vec{R}_i} H]. \tag{4.9}$$

Eq. (4.9) is just the Hellmann–Feynman force. It can be seen that the calculation of atomic forces scales linearly with the size of the system. The implementation of the O(N) procedure based on Eqs. (4.5) and (4.9) is straightforward for a finite orthogonal basis set. Nunes and Vanderbilt [28] have extended the algorithm to a non-orthogonal basis set. In this situation, the functional $\tilde{\Omega}$ takes the form

$$\tilde{\Omega} = 2\, \mathrm{Tr}[(3\rho S \rho - 2\rho S \rho S \rho)(H - \mu S)], \tag{4.10}$$
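where S is the overlap matrix. For the calculation of atomic forces, the gradient of $\tilde{\Omega}$ is given by

$$\nabla_{\vec{R}_i} \tilde{\Omega} = 2\, \mathrm{Tr}(\tilde{\rho}\, \nabla_{\vec{R}_i} H') + 2\, \mathrm{Tr}[\rho H' \rho (3 - 4 S \rho) \nabla_{\vec{R}_i} S], \tag{4.11}$$

where $\tilde{\rho} = 3\rho S \rho - 2\rho S \rho S \rho$ and $H' = H - \mu S$. The O(N) algorithm for calculating $\tilde{\Omega}$ and the atomic forces depends on the decay behavior of $\rho$ in real space. Here $\rho$ is defined by

$$\rho = S^{-1} X S^{-1}, \tag{4.12}$$

where

$$X_{i\alpha,j\beta} = \langle \phi_{i\alpha}|\hat{\rho}|\phi_{j\beta}\rangle \tag{4.13}$$

with $\{\phi_{i\alpha}\}$ being the non-orthogonal basis set and $\hat{\rho}$ the density operator. A comparison with Eq. (3.13), together with Eq. (3.7), shows that, with the sum running over the occupied eigenstates $\lambda$,

$$\rho_{i\alpha,j\beta} = \sum_{\lambda} c_{i\alpha,\lambda}\, c^{*}_{j\beta,\lambda},$$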
precisely the same as the pseudo-density matrix discussed in Section 3. The test conducted by Jayanthi et al. [44] on the decay behavior of the pseudo-density matrix demonstrated that it has the same localization properties as the conventional density matrix. Hence $\rho_{i\alpha,j\beta}(\vec{R}_{ij})$ can be truncated for $R_{ij} > R_c$, leading to linear scaling for the calculation of $\tilde{\Omega}$. The density matrix method has been successfully implemented for orthogonal tight-binding Hamiltonians. Its implementation for self-consistent density functional theory (DFT) calculations has been carried out by Hernández and Gillan. Their procedure is reviewed in the following subsection.

4.2. Self-consistent LDA-based density matrix method [31,73–75]

For the implementation of the O(N) procedure of the density matrix method in LDA-based calculations, Hernández and Gillan expressed $\rho(\vec{r},\vec{r}\,')$ as [31]

$$\rho(\vec{r},\vec{r}\,') = \sum_{\alpha,\beta} \phi_{\alpha}(\vec{r}) L_{\alpha\beta} \phi_{\beta}(\vec{r}\,'), \tag{4.14}$$

where $\phi_{\alpha}(\vec{r})$ is referred to as the support function. From Eq. (4.4), one obtains

$$\tilde{\rho}(\vec{r},\vec{r}\,') = \sum_{\alpha,\beta} \phi_{\alpha}(\vec{r}) K_{\alpha\beta} \phi_{\beta}(\vec{r}\,') \tag{4.15}$$

with

$$K = 3LSL - 2LSLSL \tag{4.16}$$

and

$$S_{\alpha\beta} = \int d\vec{r}\, \phi_{\alpha}(\vec{r}) \phi_{\beta}(\vec{r}). \tag{4.17}$$
Hernández and Gillan's strategy is to minimize the total energy with respect to the support functions as well as to the $L_{\alpha\beta}$ under the constraint of a fixed number of electrons. The linear scaling of the procedure results from the requirement that the support functions are nonzero only within localized regions (support regions centered on the atoms) and that the $L_{\alpha\beta}$ are nonzero only if the corresponding regions are separated by less than a chosen cutoff $R_c$. In the numerical implementation of the O(N) procedure, each support function is represented by its values $\phi_{\alpha}(\vec{r}_l)$ at the grid points $\vec{r}_l$ in its own support region. To perform the minimization, the gradients of the total energy with respect to the support functions and with respect to $L_{\alpha\beta}$ are given respectively by

$$\frac{\partial E_{tot}}{\partial \phi_{\alpha}(\vec{r}_l)} = 4 \sum_{\beta} [K_{\alpha\beta} (H\phi_{\beta})(\vec{r}_l) + 3(LHL)_{\alpha\beta} \phi_{\beta}(\vec{r}_l) - 2(LSLHL + LHLSL)_{\alpha\beta} \phi_{\beta}(\vec{r}_l)], \tag{4.18}$$

and

$$\frac{\partial E_{tot}}{\partial L_{\alpha\beta}} = 6(SLH + HLS)_{\alpha\beta} - 4(SLSLH + SLHLS + HLSLS)_{\alpha\beta}, \tag{4.19}$$
where $(H\phi_{\beta})(\vec{r}_l)$ is the function obtained when H acts on $\phi_{\beta}(\vec{r})$, evaluated at $\vec{r}_l$, and $H_{\alpha\beta} = \int d\vec{r}\, \phi_{\alpha}(\vec{r}) H \phi_{\beta}(\vec{r})$. In the calculation of the atomic forces, care must be exercised to include the appropriate Pulay-type correction terms. There is also the problem of constructing the Hamiltonian and overlap matrices in the implementation of this O(N) procedure. This is a common problem shared by all the first principles approaches. It is closely related to the choice of the localized orbitals in real space. We shall therefore defer the discussion of these issues to Section 5.2.2. The implementation of the procedure described above has resulted in the development of the computer code CONQUEST [76].

4.3. Penalty function-based energy minimization approach [39]

In the variational scheme to determine the density matrix by energy minimization, the condition on the idempotency of the density matrix may be imposed by a penalty function. Kohn [39] proposed just such a scheme by introducing an energy functional

$$Q[\tilde{\rho}(\vec{r},\vec{r}\,')] = E[\rho] - \mu N[\rho] + \alpha P[\tilde{\rho}], \tag{4.20}$$

where

$$E[\rho] = \int \left\{ -\frac{\hbar^2}{2m} [\nabla_{\vec{r}} \cdot \nabla_{\vec{r}}\, \rho(\vec{r},\vec{r}\,')]_{\vec{r}\,' = \vec{r}} + v(\vec{r}) \rho(\vec{r}) \right\} d\vec{r}, \tag{4.21}$$

$$N[\rho] = \int \rho(\vec{r},\vec{r})\, d\vec{r}, \tag{4.22}$$

and the penalty function $P[\tilde{\rho}]$ is given by

$$P[\tilde{\rho}] = \left[ \int [\tilde{\rho}^2 (1 - \tilde{\rho})^2]_{\vec{r}\,' = \vec{r}}\, d\vec{r} \right]^{1/2} \tag{4.23}$$

with $\rho = \tilde{\rho}^2$, $\tilde{\rho}$ being the trial density matrix, and $\alpha$ a positive number. Let $\tilde{n}_j$ be the eigenvalues of $\tilde{\rho}$; then

$$P[\tilde{\rho}] = \left[ \sum_{j=1}^{\infty} \tilde{n}_j^2 (1 - \tilde{n}_j)^2 \right]^{1/2}. \tag{4.24}$$
Eq. (4.24) indicates that $P[\tilde{\rho}] = 0$ only if all the $\tilde{n}_j$ are either 1 or 0, thus satisfying the condition of idempotency; otherwise $P[\tilde{\rho}]$ is positive, and it therefore acts as a penalty function. When $P[\tilde{\rho}] = 0$, one has

$$Q[\tilde{\rho}] = E[\rho] - \mu N[\rho] = \sum_j \tilde{n}_j^2 (\tilde{\varepsilon}_j - \mu), \tag{4.25}$$
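where $\tilde{n}_j$ is either 1 or 0. Thus, for a given set of $\tilde{\varepsilon}_j$, the energy functional Q will be at its minimum when $\tilde{n}_j = 1$ for $\tilde{\varepsilon}_j \le \mu$ and 0 for $\tilde{\varepsilon}_j > \mu$. The minimization process for the functional Q may be viewed as

$$\min Q[\tilde{\rho}] = \min_{P'} \min_{P[\tilde{\rho}] = P'} Q[\tilde{\rho}] = \min_{P'} [\Omega(P') + \alpha P'], \tag{4.26}$$

where

$$\Omega(P') = \min_{P[\tilde{\rho}] = P'} \Omega(\tilde{\rho}) = \min_{P(\{\tilde{n}_j\}) = P';\ \tilde{n}_j, \tilde{\varepsilon}_j} \sum_j \tilde{n}_j^2 (\tilde{\varepsilon}_j - \mu). \tag{4.27}$$

Hence, the number $\alpha$ must be chosen according to

$$\alpha > \alpha_c = \max_{P'} \left| \frac{d\Omega(P')}{dP'} \right|, \tag{4.28}$$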
such that for any $\alpha > \alpha_c$, $d[\Omega(P') + \alpha P']/dP' > 0$ for all $P'$. In this way, for given $\mu$ and $\alpha > \alpha_c$, minimization of $Q[\tilde{\rho}]$ yields the correct density matrix and other relevant quantities such as energy, forces, etc. The linear scaling of the calculation of the energy functional $Q[\tilde{\rho}]$ is again brought about by the truncation of $\tilde{\rho}$ in the region of localization. The implementation of the variational scheme can be handled by the conventional conjugate gradient method.

4.4. Variational approaches using localized orbitals minimization (LOM)

In electronic structure calculations, it is well known that methods based on iterative diagonalization are more efficient than methods using direct diagonalization, even though both approaches scale as $O(N^3)$ with respect to the system size [48,49]. There are two types of iterative approaches, namely, constrained and unconstrained minimization. For the constrained approach, the condition of orthogonality of the wave functions is explicitly imposed. The imposition of the orthogonality leads to $O(N^3)$ scaling behavior. In the unconstrained approach, the calculation of the inverse of the overlap matrix S scales as $O(N^3)$. However, it has been shown by Galli and Parrinello [71] that no significant loss in accuracy will occur if localized orbitals are used for the unconstrained minimization. This then opens the way to take advantage of the local nature of the density matrix in real space. Thus, instead of determining the eigenfunctions, one searches for the localized wave functions that may be viewed as appropriate linear combinations of eigenfunctions. In terms of these localized wave functions, the implementation of the unconstrained minimization will scale linearly with respect to the size. There are two very similar variational approaches using the localized wave functions. In this subsection, we discuss the approach developed by Mauri et al. [20], later modified by Kim et al. [32], as well as the scheme developed by Ordejón et al. [23].
We also discuss a variational approach in terms of the localized orbitals but using a penalty function to eliminate the constraint on orthogonality [18].
4.4.1. Unconstrained variational approach of Mauri et al. [20,32]

Mauri et al. [20] considered an energy functional given by

$$E[A, \{\phi\}] = 2 \sum_{ij}^{N/2} A_{ij} \left\langle \phi_i \left| -\frac{\hbar^2}{2m} \nabla^2 \right| \phi_j \right\rangle + F[\tilde{\rho}] + \eta \left[ N - \int d\vec{r}\, \tilde{\rho}(\vec{r}) \right], \tag{4.29}$$

where $\{\phi\}$ defines a set of N/2 localized orbitals treated as variational variables,

$$\tilde{\rho}(\vec{r}) = 2 \sum_{ij}^{N/2} A_{ij} \phi_j(\vec{r}) \phi_i(\vec{r}), \tag{4.30}$$

$F[\tilde{\rho}]$ is the sum of the Hartree, exchange-correlation, and external potential energy functionals, and $\eta$ is a parameter. If $A = S^{-1}$, with $S_{ij} = \langle \phi_i|\phi_j\rangle$ being the overlap matrix, then $\tilde{\rho}[S^{-1}] = \rho(\vec{r})$. In this case, the last term in Eq. (4.29) is zero and the energy functional becomes the total energy of the system according to the DFT. The calculation of $S^{-1}$ scales as $O(N^3)$ and therefore should be avoided. Since

$$S^{-1} = (I - (I - S))^{-1} = \sum_{n=0}^{\infty} (I - S)^n,$$

Mauri et al. substituted [20]

$$A = Q = \sum_{n=0}^{\mathcal{N}} (I - S)^n, \tag{4.31}$$
where the upper limit of the sum, $\mathcal{N}$, is an odd number. They proved that the absolute minimum of the functional $E[Q, \{\phi\}]$ is $E_0$, the Kohn–Sham ground state energy. The set of localized wave functions, $\{\phi\}$, is constrained to be nonzero only within appropriately chosen regions of localization. In the calculation of the energy functional $E[Q, \{\phi\}]$ using the set of localized orbitals $\{\phi\}$, the sums entering Eq. (4.29) and its derivatives extend only to orbitals belonging to overlapping regions of localization. Hence, the procedure for the minimization of the energy functional scales linearly with the size.

The use of the localized wave functions to carry out the minimization of the energy functional leads to multiple shallow local minima [26,77]. This can cause the minimization to be trapped in unphysical situations instead of yielding the minimum that corresponds to the ground state. Kim et al. [32] proposed a solution to this problem by allowing the number of localized orbitals to exceed the number of occupied states. In their approach, the energy functional is expressed as

$$E[\{\phi\}; \eta; M] = 2 \sum_{i,j=1}^{M} Q_{ij} \langle \phi_j|(H - \eta)|\phi_i\rangle + \eta N, \tag{4.32}$$

where $\{\phi\}$ is a set of M overlapping orbitals and Q is an $M \times M$ ($M > N/2$) matrix given by

$$Q = 2I - S. \tag{4.33}$$
A comparison with Eq. (4.31) indicates that Eq. (4.33) corresponds to truncating the expansion after the n = 1 term, since $I + (I - S) = 2I - S$.
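The truncated expansion of the inverse overlap matrix in Eq. (4.31) is easy to examine numerically. The following sketch, with a hypothetical 3 × 3 overlap matrix and illustrative function names, builds the truncated sum for several orders and reports how far its product with S is from the identity; it assumes that the eigenvalues of I − S lie inside the unit circle, which is what makes the series converge.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def truncated_inverse(s, order):
    """Q = sum_{n=0}^{order} (I - S)^n, a truncated expansion of S^-1."""
    n = len(s)
    eye = identity(n)
    delta = [[eye[i][j] - s[i][j] for j in range(n)] for i in range(n)]
    q = identity(n)
    term = identity(n)
    for _ in range(order):
        term = matmul(term, delta)
        q = [[q[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return q

def max_deviation_from_identity(s, q):
    """Largest element of |S Q - I|, a measure of how well Q inverts S."""
    prod = matmul(s, q)
    n = len(s)
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# Hypothetical overlap matrix of three orbitals with nearest-neighbor overlap 0.2.
s = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.2], [0.0, 0.2, 1.0]]
for order in (1, 3, 5):
    print(order, max_deviation_from_identity(s, truncated_inverse(s, order)))
```

With `order=1` the function returns exactly 2I − S. Each extra term costs one more sparse matrix product, which is why low-order truncations are attractive when the overlaps are small.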
It can be shown that the energy functional given by Eqs. (4.32)–(4.33) possesses the following properties: (1) it is invariant under unitary transformations; (2) orbitals with vanishing norms do not contribute to the value of the energy functional; (3) the ground state energy is a stationary point of the energy functional; (4) if $\eta$ is equal to the chemical potential $\mu$, the stationary point is a minimum of the energy functional. Using an orthogonal tight-binding Hamiltonian, Kim et al. [32] demonstrated numerically that the minimization of the energy functional $E[\{\phi\}; \eta; M]$ with respect to the set of localized orbitals $\{\phi\}$ can lead to an approximate ground state energy $E_0$. The severity of the problem associated with the multiple local minima appears to be reduced, if not eliminated, by allowing the number of localized orbitals to exceed the number of occupied states ($M > N/2$). The minimization of the functional can be carried out using a conjugate gradient procedure, leading to structural optimization via molecular dynamics simulations.

Kim et al. [32] tested their method by performing calculations for various carbon systems (bulk solids, surfaces, clusters, and liquids), based on an orthogonal tight-binding model. They used a region of localization extending up to second neighbors. In the calculation, they included three local orbitals per site, leading to M = 3N/4 local orbitals in the expansion rather than M = N/2 orbitals. They found that this was sufficient to overcome the multiple-minima problem associated with the case when M = N/2 orbitals were used. Fig. 5 shows the energy and the charge per atom during a conjugate gradient minimization of the energy functional $E[\{\phi\}; \eta; M]$ for a 256-carbon atom slab. The parameter $\eta$ was varied from 20 to 3.1 eV, corresponding to the value of the chemical potential of the system under consideration. From Fig. 5, it can be seen that both the energy and the charge converge to their respective values after about 100 iterations.
They also established that the computational effort for the MD simulations indeed scales linearly with the system size.

4.4.2. Unconstrained minimization scheme of Ordejón et al. [23]

Ordejón et al. proposed an unconstrained minimization scheme by introducing an energy functional [23]

$$E = 2 \left[ \sum_{i=1}^{N/2} H_{ii} - \sum_{i,j=1}^{N/2} H_{ji} (S_{ij} - \delta_{ij}) \right]. \tag{4.34}$$
An examination of Eq. (4.34) reveals that it can be rewritten as

$$E = 2 \sum_{i,j=1}^{N/2} Q_{ij} H_{ji} \tag{4.35}$$

with $Q = 2I - S$ (Eq. (4.33)). Thus Eq. (4.34) corresponds to $E = 2 \sum_{i,j=1}^{N/2} S^{-1}_{ij} H_{ji}$ with $S^{-1}$ replaced by $Q = 2I - S$, the truncation of the expansion of $S^{-1}$ after its first-order term. By requiring that the set of local wave functions $\{\phi\}$ be truncated within regions of localization defined by some cut-off radius $R_c$, the sum in Eq. (4.34) only extends to terms between overlapping regions of localization. In this way, the calculation of the energy functional scales
Fig. 5. Total energy and total charge per atom for a 256-carbon atom slab obtained by the minimization of the generalized energy functional $E[\{\phi\}; \eta; M]$ vs. the number of iterations. Three states per atom were used. The chemical potential $\eta$ was varied from 20 to 3.1 eV. The plot is reproduced from Fig. 4 in Ref. [32].
linearly with the size. The conjugate gradient scheme can be used to carry out the minimization of the energy functional as well as the molecular dynamics simulations in the structural determination. Ordejón et al. demonstrated the order-N scaling of their method by calculating the band structure energy for silicon supercells in the diamond structure with different numbers of atoms, based on an orthogonal tight-binding model. The calculation was carried out at the $\Gamma$ point. In the calculation, the local wave functions were centered at the bonds. Fig. 6 shows the scaling of the CPU time with the number of atoms in the supercell for two regions of localization with two different cut-off radii, corresponding to 26 and 38 atoms included in the localization region, respectively. As can be seen from Fig. 6, the results indicate linear scaling in both cases. Numerical tests also suggest that the scheme suffers from the problem of multiple local minima [77].

4.4.3. Unconstrained minimization via localized orbitals using a penalty function [18]

Wang and Teter introduced an energy functional

$$E = \sum_{i=1}^{n} \langle \phi_i|H|\phi_i\rangle + \gamma \sum_{i=1}^{n} \sum_{j} |\langle \phi_j|\phi_i\rangle|^2, \tag{4.36}$$
Fig. 6. The CPU time for the calculation of the band structure energy of a 216-atom silicon supercell in the diamond structure using the variational O(N) scheme of Ordejón et al. [23]. The scaling behavior for the case with the localization region containing 26 atoms and that with the region containing 38 atoms are shown. The $N^3$ scaling corresponding to the case of exact diagonalization is also shown for comparison. The figure is reproduced from Fig. 1 in Ref. [23].
where $\phi_i$ is the local wave function for bond i, n the total number of bonds, $\gamma$ a positive constant, and the sum over j extends to all neighbors of i. The second term in Eq. (4.36) is the penalty function that is intended to control the lack of orthogonality. From Eq. (4.36), it can be seen that the minimum of the energy functional will yield the ground state energy $E_0$ only if $\gamma$ approaches infinity. It is of course difficult to minimize the energy functional for a large $\gamma$. However, Wang and Teter observed that for a tight-binding basis set, a $\gamma$ whose value is greater than, but of the same order of magnitude as, the Hamiltonian matrix elements can lead to a reasonable approximation to the ground state energy. The "smallness" of $\gamma$ makes it possible to use a scheme such as the conjugate gradient method to carry out the variational minimization.

4.5. Absolute energy minimum approach to linear scaling [41]

For a set of N/2 linearly independent orbitals $\{\phi\}$, one can always construct a density operator

$$\rho = \sum_{i,j}^{N/2} |\phi_i\rangle S^{-1}_{ij} \langle \phi_j|, \tag{4.37}$$
where $S_{ij} = \langle \phi_i|\phi_j\rangle$. It can be shown that this density operator satisfies the conditions of Hermiticity ($\rho^{\dagger} = \rho$), idempotency ($\rho^2 = \rho$), and normalization ($N = 2 \int \rho(\vec{r})\, d\vec{r}$). The minimization of the energy functional, $E[\rho] = 2\, \mathrm{Tr}(\rho H)$, with respect to the set of orbitals $\{\phi\}$ will lead to the correct ground state energy without any constraint. As pointed out in Section 3.4, the computing effort for calculating $S^{-1}$ is $O(N^3)$. The methods discussed in Section 4.4 are all designed to circumvent this $O(N^3)$ bottleneck. The key ingredients of these methodologies are the construction of the energy functional by replacing $S^{-1}$ with its truncated expansion, and the truncation of the orbitals. The remedy for the problem of multiple local minima associated with the truncation of the orbitals is to allow the number of orbitals to exceed the number of occupied states (N/2). However, this remedy still does not eliminate the problem of multiple minima.
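The stated properties of the density operator of Eq. (4.37) can be verified directly for a small case. The sketch below is illustrative only: it takes two hypothetical non-orthogonal real vectors in a three-dimensional space as the "orbitals", inverts the 2 × 2 overlap matrix in closed form, and checks symmetry, idempotency, and the trace. All names are assumptions of this example.

```python
def overlap(orbitals):
    """S_ij = <phi_i|phi_j> for real vectors."""
    return [[sum(x * y for x, y in zip(u, v)) for v in orbitals]
            for u in orbitals]

def inverse_2x2(s):
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return [[s[1][1] / det, -s[0][1] / det],
            [-s[1][0] / det, s[0][0] / det]]

def density_operator(orbitals):
    """rho = sum_ij |phi_i> (S^-1)_ij <phi_j| for exactly two orbitals."""
    s_inv = inverse_2x2(overlap(orbitals))
    dim = len(orbitals[0])
    return [[sum(orbitals[i][r] * s_inv[i][j] * orbitals[j][c]
                 for i in range(2) for j in range(2))
             for c in range(dim)] for r in range(dim)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Two linearly independent but non-orthogonal "orbitals" (hypothetical values).
orbitals = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
rho = density_operator(orbitals)
print(rho)
print(max_abs_difference(matmul(rho, rho), rho))   # idempotency error
print(sum(rho[i][i] for i in range(3)))            # trace = number of orbitals
```

The result is the projector onto the space spanned by the two vectors, independent of how non-orthogonal they are; the cost of that independence is the inverse of S, which is the step the truncated expansions are meant to avoid.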
Recently, Yang proposed an absolute energy minimum approach to address the issue of local minima [41]. In this approach, the energy functional is given by

$$E(N) = \min_{\{\phi\},\ 2\,\mathrm{rank}(S) = N}\ \min_{X = X^{\dagger}} \Omega[(2X - XSX), \{\phi\}] + \eta N, \tag{4.38}$$

where $\{\phi\}$ is a set of $M > N/2$ arbitrary (possibly linearly dependent) orbitals, X an auxiliary matrix that at the minimum becomes a generalized inverse of S, $\eta$ a constant, and $\Omega$ is given by

$$\Omega[A, \{\phi\}] = E[\rho_A] - 2\eta\, \mathrm{Tr}(\rho_A) \tag{4.39}$$

with

$$\rho_A = \sum_{i,j}^{M} |\phi_i\rangle A_{ij} \langle \phi_j|. \tag{4.40}$$

The constant $\eta$ is chosen such that the matrix $\bar{H} - \eta S$ is negative definite, where

$$\bar{H}_{ij} = \frac{1}{2} \int_0^1 d\lambda\, \frac{\partial E[\rho_{A(\lambda)}]}{\partial A_{ij}(\lambda)}, \tag{4.41}$$

and

$$A(\lambda) = \lambda [(2X - XSX) - S^{-}] + S^{-} \tag{4.42}$$

with $S^{-}$ being the (1) inverse of a singular matrix [78] when the orbitals are linearly dependent. In this case, $S^{-}$ is defined by

$$S S^{-} S = S. \tag{4.43}$$

The minimization of the energy functional via Eq. (4.38) has a constraint on the normalization specified by

$$2\, \mathrm{Tr}(\rho) = 2\, \mathrm{rank}(S) = N. \tag{4.44}$$

Note that rank(S) is equal to the number of linearly independent orbitals in the set $\{\phi_i;\ i = 1, \ldots, M\}$. Yang also proposed an energy functional which eliminates the constraint on normalization, namely,

$$E(N) = \min_{\{\phi\}}\ \min_{X = X^{\dagger}} \Omega[(2X - XSX), \{\phi\}] + 2(\eta - \mu)\, \mathrm{rank}(S) + \mu N, \tag{4.45}$$

where $\mu$ is the chemical potential. The variational scheme based on either Eq. (4.38) or Eq. (4.45) defines the ground state energy as the absolute minimum. Therefore, it should provide
more robust minimization algorithms. The linear scaling of the scheme hinges on the truncation of the set of orbitals as well as the truncation of the auxiliary matrix X. However, no O(N) scheme based on it has yet been implemented.

5. Issues affecting the implementation of O(N) algorithms

5.1. Issues related to tight-binding approaches

It is straightforward to implement the various O(N) algorithms described in Sections 3 and 4 using tight-binding Hamiltonians. This is because the Hamiltonian matrix and, in the case of NOTB approaches, also the overlap matrix are given as parameterized functions of $\vec{R}_{ij}$. Thus, no overhead is needed to construct the Hamiltonian and overlap matrix elements. In fact, most of the applications of the O(N) methods have been carried out within the framework of tight-binding approaches. The main issue is therefore not the process of implementation itself, but rather how reliable the Hamiltonian is, so that accurate results can be expected in the context of the O(N) approach. To achieve the kind of reliable results expected for the prediction of stable structures and properties of complex systems with reduced symmetry, the tight-binding Hamiltonian must have the appropriate conceptual framework for the understanding of chemical trends in both the structural and electronic properties of condensed matter systems. For example, for the study of the growth and stability of nanostructures, it is critical that one can reliably calculate the energetics of the system, because it determines the local and global minima that a system can attain as well as the saddle points which define the energy barriers between these states. This usually requires differentiating states which differ in energy by the order of 0.1 eV. Accurate prediction of energy differences of this magnitude demands an extremely accurate theory, such as state-of-the-art density functional calculations.
But the size and the complexity of the systems of interest preclude, at least presently, an MD strategy based on ab initio methods. On the other hand, there is evidence that tight-binding approaches which carefully incorporate chemical trends in the determination of the tight-binding parameters can correctly predict relative energy differences, if not the absolute values, for covalently bonded systems [64,79–83]. Therefore, the key issue in the implementation of O(N) algorithms based on tight-binding approaches is to select an appropriate TB Hamiltonian that has no intrinsic bias towards ionic, covalent, or metallic bonding. Only in this way will the TB Hamiltonian have good transferability over the wide range of coordination and local environments needed for the correct prediction of the stable structures and properties of the system under consideration. Conventional TB Hamiltonians are developed within a two-center framework [47,83]. The parameters defining the matrix elements are usually fitted to equilibrium properties of dimers and bulk crystals. It is unrealistic to expect that parameters obtained this way can correctly describe the properties of a system as its size increases from a few atoms to a few hundred, or a few thousand, atoms. To improve the range of transferability of the TB parameters, the following ingredients must be taken into consideration: (1) self-consistent charge transfer effects, (2) environment-dependent multicenter terms, and (3) an environment-dependent effective repulsive potential. In recent years, there has been a concerted effort to address these issues in the
context of orthogonal as well as non-orthogonal TB approaches [84–93]. However, there is as yet no encompassing standard approach that effectively includes all these factors. Hence, the applications of O(N) methodologies based on TB approaches have mostly relied on TB Hamiltonians constructed specifically to treat particular issues of the system under consideration. Thus, the limiting factor for using O(N)/TB methodologies as a predictive tool is the reliability and transferability of the TB Hamiltonian.

5.2. Construction of the Hamiltonian in first principles O(N) algorithms

In the implementation of first principles O(N) procedures, a substantial computational effort is spent on the construction of Kohn–Sham or Fock matrices. The most frequently used scheme for the construction of the Kohn–Sham matrix in self-consistent electronic structure calculations is the plane-wave method. In this approach, a pseudopotential formalism is assumed and a supercell approximation is adopted [48]. The wave functions and other related quantities can then be expanded in terms of plane waves. In this way, a powerful tool, the fast Fourier transform (FFT) [94], can be used to construct the Hamiltonian matrix. Specifically, the kinetic energy is calculated in momentum space while the potential energy is determined from the charge density in real space (expressed in terms of grid points) and then transformed into momentum space. The operation scales as O(M log M), where M is the number of plane waves (or grid points). The pseudolinear scaling of the construction of the Hamiltonian matrix using plane-wave methods has been reviewed in Ref. [2]. Localized basis sets such as Gaussian basis sets have also been used to calculate the electronic structure of condensed matter systems, in particular for all-electron calculations in molecular systems.
The key to the popularity of Gaussian orbitals is that many of the integrals involved in the evaluation of the Coulomb and exchange-correlation interactions can be determined analytically if Gaussian basis functions are used [95]. Furthermore, the product of two Gaussians can be expressed as a Gaussian centered in between the two original Gaussians. These two properties of Gaussian orbitals have been used profitably to simplify the construction of the Hamiltonian matrix. To achieve O(N) scaling for the construction of the Hamiltonian matrix for localized basis sets, further manipulations are needed. These usually involve a hierarchical partition of the charge density, the separation of partitions based on the decay behavior of the density matrix, and the use of multipole expansions for the interaction between well-separated partitions [96–99]. This setup then allows methods such as the fast multipole method (FMM) [100] to be used for the evaluation of energy integrals such as the Coulomb interaction. A detailed discussion of this and related approaches can also be found in Ref. [2]. In general, it is not convenient to implement an O(N) methodology in a plane-wave-based algorithm because a large number of plane waves is needed to expand the localized basis functions. An alternative is to perform the plane-wave calculation in adaptive coordinates [101,102]. In recent years, several schemes based on a real space approach have been developed, including the real space method [103] and the wavelet methods [104–106]. In this subsection, we focus on two such approaches which form the basis of the only two first principles O(N) codes currently in existence, namely, SIESTA and CONQUEST.
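The Gaussian product property mentioned earlier in this subsection can be checked numerically. The following is a minimal sketch with arbitrarily chosen exponents and centers; the function names are illustrative and are not taken from any of the cited codes. It verifies that the product of two s-type Gaussians equals a constant prefactor times a single Gaussian centered at the exponent-weighted mean of the two original centers.

```python
import numpy as np

def gaussian(r, center, exponent):
    """Unnormalized s-type Gaussian exp(-exponent * |r - center|^2)."""
    diff = r - center
    return np.exp(-exponent * np.sum(diff * diff, axis=-1))

def gaussian_product(center_a, exp_a, center_b, exp_b):
    """Return (prefactor, center_p, exp_p) of the Gaussian equal to the product."""
    exp_p = exp_a + exp_b
    # The new center lies on the line joining the two original centers.
    center_p = (exp_a * center_a + exp_b * center_b) / exp_p
    sep2 = np.sum((center_a - center_b) ** 2)
    prefactor = np.exp(-exp_a * exp_b / exp_p * sep2)
    return prefactor, center_p, exp_p

# Illustrative (hypothetical) parameters.
center_a = np.array([0.0, 0.0, 0.0])
center_b = np.array([1.2, -0.4, 0.3])
exp_a, exp_b = 0.8, 1.5
prefactor, center_p, exp_p = gaussian_product(center_a, exp_a, center_b, exp_b)

# Compare both sides at a few random sample points.
rng = np.random.default_rng(0)
points = rng.uniform(-2.0, 2.0, size=(5, 3))
lhs = gaussian(points, center_a, exp_a) * gaussian(points, center_b, exp_b)
rhs = prefactor * gaussian(points, center_p, exp_p)
max_error = np.max(np.abs(lhs - rhs))
```

Because the product is again a Gaussian, a four-center integral over Gaussians reduces to a two-center one, which is the simplification referred to in the text.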
5.2.1. Implementation of an O(N) procedure using LCAO basis sets

Recently, Ordejón et al. [35] implemented the O(N) scheme developed by Ordejón et al. (Section 4.4.2) using linear combinations of atomic orbitals (LCAO) basis sets (SIESTA). The scheme is a fully self-consistent DFT/LDA-based approach, with the core-electron contribution replaced by appropriate pseudopotentials. It allows the practitioner to use either localized minimal basis sets such as the fire-ball orbitals [107] or expanded bases, depending on the size of the system under consideration, the required accuracy of the property under study, and the available computational power. We focus our discussion of the implementation procedure on the fire-ball orbitals because they can be implemented easily and require only modest computational platforms such as workstations. The extension to more extended bases (e.g., multiple-$\zeta$ bases) is, in principle, straightforward. In Ordejón et al.'s approach [35], the Kohn–Sham (KS) Hamiltonian is rewritten as

\[ H^{KS} = \frac{p^2}{2m} + \sum_i \left[ V_{nl}(\vec r - \vec R_i) + V_{na}(\vec r - \vec R_i) \right] + V_H^{\delta}(\vec r) + V_{xc}(\vec r), \tag{5.1} \]

where $V_{nl}$ is the short-ranged nonlocal part of the pseudopotential. The long-range local part of the pseudopotential, $V_l$, is absorbed in $V_{na}$, the neutral atom potential, such that

\[ V_{na}(\vec r - \vec R_i) = V_l(\vec r - \vec R_i) + e^2 \int \frac{n_i^{na}(\vec r\,' - \vec R_i)}{|\vec r - \vec r\,'|}\, d\vec r\,', \tag{5.2} \]

where $n_i^{na}$ is the atomic charge density of atom i in its neutral, isolated state. In this way, the neutral-atom charge density of the system, $n_0(\vec r)$, can be expressed as

\[ n_0(\vec r) = \sum_i n_i^{na}(\vec r - \vec R_i). \tag{5.3} \]
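A toy model can illustrate why the combination in Eq. (5.2) is convenient. The sketch below assumes, purely for illustration, a point-like local pseudopotential $-Z/r$ and a Gaussian neutral-atom charge density with total charge Z (atomic units, e = 1); neither assumption is taken from Ref. [35], and real pseudoatomic densities differ. Under these assumptions the electron repulsion is $Z\,\mathrm{erf}(\sqrt{\alpha}\,r)/r$, so the sum is $-Z\,\mathrm{erfc}(\sqrt{\alpha}\,r)/r$, which decays much faster than the bare Coulomb tail.

```python
import math

def bare_coulomb(r, z):
    """Long-range attraction of a point charge z (toy stand-in for V_l)."""
    return -z / r

def gaussian_hartree(r, z, alpha):
    """Potential of a Gaussian charge cloud of total charge z and exponent alpha."""
    return z * math.erf(math.sqrt(alpha) * r) / r

def neutral_atom_potential(r, z, alpha):
    """Toy V_na = V_l + electron repulsion = -z * erfc(sqrt(alpha) * r) / r."""
    return -z * math.erfc(math.sqrt(alpha) * r) / r

# Hypothetical parameters chosen only for illustration.
z, alpha = 4.0, 1.0
radii = [0.5, 1.0, 2.0, 4.0]
table = [(r, bare_coulomb(r, z), neutral_atom_potential(r, z, alpha)) for r in radii]
for r, v_bare, v_na in table:
    print(f"r = {r:4.1f}   bare = {v_bare:10.6f}   neutral atom = {v_na:12.3e}")
```

In this toy model the neutral-atom potential is already negligible a few decay lengths from the nucleus, whereas the bare attraction still falls off only as 1/r.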
Let $\delta n(\vec r) = n(\vec r) - n_0(\vec r)$, where $n(\vec r)$ is the actual charge density of the system. The Hartree potential can be decomposed into two components, $V_H^{\delta}$ and $V_H^0$, associated with $\delta n(\vec r)$ and $n_0(\vec r)$, respectively. From Eq. (5.3), it can be seen that $V_H^0$ can be expressed as a sum of atomic contributions. From Eq. (5.2), it can be seen that $V_{na}$ is short-ranged because the core attraction is cancelled by the electron Coulomb repulsion of the neutral atom charge beyond some cut-off distance. The fire-ball orbitals $\{\phi_\mu\}$ are pseudoatomic orbitals defined by Sankey and Niklewski [107]. They are slightly excited orbitals obtained by solving the valence electron problem for the isolated atom with the same pseudopotential and LDA approximations as used in the system Hamiltonian, but with the boundary condition that the orbitals vanish beyond a cut-off radius $r_c$. The construction of the KS Hamiltonian matrix and overlap matrix with respect to $\{\phi_\mu\}$ proceeds as follows. The matrix elements of the overlap matrix ($S_{\mu\nu} = \langle\phi_\mu|\phi_\nu\rangle$), the kinetic energy ($\langle\phi_\mu|p^2/2m|\phi_\nu\rangle$), $V_{nl}$ ($\langle\phi_\mu|V_{nl}(\vec r - \vec R_i)|\phi_\nu\rangle$), and $V_{na}$ ($\langle\phi_\mu|V_{na}(\vec r - \vec R_i)|\phi_\nu\rangle$) are calculated only once beforehand and tabulated as functions of the relative positions of the “atomic” centers. During the simulation, when the atomic positions change, these tables are used as the basis for interpolation. The matrix elements of $V_H^{\delta}$ and $V_{xc}$ depend on the charge density. They are calculated in terms of the self-consistent charge density. For an
initial input of the LCAO density matrix, $n(\vec r)$ and $\delta n(\vec r)$ are computed on a real space grid. Poisson's equation for $V_H^{\delta}$ associated with $\delta n(\vec r)$ can be solved by the standard FFT (with a scaling behavior of N log N) or by the multigrid method. It should be noted that only two FFTs are necessary per self-consistent cycle, in contrast with plane wave-based calculations where an FFT is required for each state. $V_{xc}$ is computed at each grid point using $n(\vec r)$ at that point. The non-zero matrix elements $\langle\phi_\mu|V_H^{\delta}|\phi_\nu\rangle$ and $\langle\phi_\mu|V_{xc}|\phi_\nu\rangle$ are obtained by direct summation on the grid for orbitals with their “atomic” centers closer than $2r_c$ apart. The KS Hamiltonian constructed with the initial $n(\vec r)$ and $\delta n(\vec r)$ is then used in the O(N) scheme of Ordejón et al. [23] to minimize the energy functional with respect to the localized orbitals to obtain the band structure energy. The resulting orbitals are used to calculate the new charge density, which is compared with the input charge density. This completes one self-consistent cycle. The process is repeated until the input and output charge densities agree within the desired accuracy. The total energy of the system can then be calculated as

\[ E_{tot} = E_{BS} - \frac{e^2}{2}\int V_H(\vec r)\, n(\vec r)\, d\vec r + \frac{e^2}{2}\int V_H^0(\vec r)\, n_0(\vec r)\, d\vec r + \int [\varepsilon_{xc}(n) - V_{xc}(n)]\, n(\vec r)\, d\vec r + U_{ii-ee}, \tag{5.4} \]

where

\[ U_{ii-ee} = \frac{e^2}{2(4\pi\varepsilon_0)} \sum_{jj'} \frac{Z_j Z_{j'}}{|\vec R_j - \vec R_{j'}|} - \frac{e^2}{2}\int V_H^0(\vec r)\, n_0(\vec r)\, d\vec r. \tag{5.5} \]
The term $U_{ii-ee}$ is introduced in the calculation of the total energy to circumvent the problem associated with the long-range nature of the Coulomb interaction between the ions. As given by Eq. (5.5), $U_{ii-ee}$ can be computed as a sum of short-ranged contributions because the terms corresponding to ions which are far apart cancel. In MD simulations to determine the stable structures, the force acting on the ith atom can be calculated as

\[ \vec F_i = -\sum_{\mu\nu} \left[ \rho_{\mu\nu} \nabla_{\vec R_i} H^0_{\mu\nu} - E_{\mu\nu} \nabla_{\vec R_i} S_{\mu\nu} \right] - \nabla_{\vec R_i} U_{ii-ee} + 2\sum_{\mu} n^0_{\mu} \langle \nabla_{\vec R_i}\phi_\mu | V_H^{\delta} | \phi_\mu \rangle - 2\sum_{\mu\nu} \rho_{\mu\nu} \langle \nabla_{\vec R_i}\phi_\mu | (V_H^{\delta} + V_{xc}) | \phi_\nu \rangle, \tag{5.6} \]

where $H^0 = p^2/2m + V_{nl} + V_{na}$, and $n^0_\mu$ is the electron occupation of the state $\phi_\mu$. The first two terms in Eq. (5.6) can again be calculated by interpolating the tabulated matrix element data. The last two terms are the Pulay-like corrections. They must be integrated numerically based on $V_H^{\delta}$, $V_{xc}$, $\phi_\mu$, and $\nabla_{\vec R_i}\phi_\mu$.

5.2.2. Real space implementation of the self-consistent LDA-based DM method

Gillan and coworkers have implemented the self-consistent LDA-based DM method discussed in Section 4.2 in real space [31,73–75]. They have used both a real space grid representation
[31,103,108] and a basis of B-splines [76]. For the approach based on the real space grid, the support functions $\phi_\alpha(\vec r)$ are represented by their values at the grid points. They have non-zero values only within their own regions of localization. Real space integration is replaced by summation over grid points. For example, the overlap matrix element $S_{\alpha\beta}$ is given by

\[ S_{\alpha\beta} = \delta\omega \sum_i \phi_\alpha(\vec r_i)\, \phi_\beta(\vec r_i), \tag{5.7} \]

where $\delta\omega$ is the volume per grid point, and the summation is over all the grid points inside the region common to the regions of localization of $\phi_\alpha$ and $\phi_\beta$. The Laplacian is represented by a finite difference representation, namely,

\[ \frac{\partial^2 \phi_\alpha(n_x, n_y, n_z)}{\partial x^2} = \frac{1}{h^2} \sum_{m=-n}^{n} c_{|m|}\, \phi_\alpha(n_x + m, n_y, n_z), \tag{5.8} \]

where h is the grid spacing, $n_x$, $n_y$, and $n_z$ are integer indices of the grid point $\vec r_i$, and $c_{|m|}$ is determined according to the order $|m|$, with similar expressions for $\partial^2/\partial y^2$ and $\partial^2/\partial z^2$. In this way, the kinetic energy can be calculated by (see Eq. (4.15))

\[ E_K = 2\sum_{\alpha\beta} K_{\alpha\beta} T_{\beta\alpha}, \tag{5.9} \]

where $T_{\beta\alpha}$ is calculated as the sum over the grid points common to the regions of localization of $\phi_\alpha$ and $\phi_\beta$,

\[ T_{\beta\alpha} = \delta\omega \sum_i \phi_\beta(\vec r_i) \left( -\frac{\hbar^2}{2m} \nabla^2_{\vec r} \right) \phi_\alpha(\vec r_i). \tag{5.10} \]

The charge density needed for the calculation of the Hartree and exchange-correlation potentials is determined by

\[ n(\vec r_i) = 2\sum_{\alpha\beta} \phi_\alpha(\vec r_i)\, K_{\alpha\beta}\, \phi_\beta(\vec r_i). \tag{5.11} \]
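The grid sums in Eqs. (5.7)–(5.11) can be sketched in a few lines. The example below is a one-dimensional analogue written for brevity, not the three-dimensional implementation of Refs. [31,73–75]. It assumes units with $\hbar = m = 1$, the lowest-order stencil ($c_0 = -2$, $c_1 = 1$), a single Gaussian standing in for a support function, and periodic wrap-around at the grid edges where the function is negligible; all of these are assumptions of the sketch.

```python
import numpy as np

def overlap(phi_a, phi_b, dw):
    """Grid overlap, Eq. (5.7): dw * sum_i phi_a(r_i) * phi_b(r_i)."""
    return dw * np.sum(phi_a * phi_b)

def second_derivative(phi, h, coeffs):
    """Finite-difference form of Eq. (5.8): (1/h^2) * sum_m c_|m| * phi(i + m)."""
    result = coeffs[0] * phi
    for m in range(1, len(coeffs)):
        # np.roll wraps around the grid ends (periodic boundaries assumed).
        result = result + coeffs[m] * (np.roll(phi, m) + np.roll(phi, -m))
    return result / h**2

def kinetic(phi_b, phi_a, h, coeffs):
    """Kinetic matrix element, Eq. (5.10), in units with hbar = m = 1."""
    return h * np.sum(phi_b * (-0.5) * second_derivative(phi_a, h, coeffs))

def density(phis, k_matrix):
    """Charge density on the grid, Eq. (5.11): 2 * sum_ab phi_a K_ab phi_b."""
    return 2.0 * np.einsum("ai,ab,bi->i", phis, k_matrix, phis)

h = 0.05                           # grid spacing (1D "volume" per grid point)
x = np.arange(-8.0, 8.0, h)
phi = np.exp(-0.5 * x**2)          # toy support function
coeffs = [-2.0, 1.0]               # lowest-order stencil, n = 1

s_aa = overlap(phi, phi, h)
t_aa = kinetic(phi, phi, h, coeffs)
n_grid = density(np.array([phi]), np.array([[0.5]]))
```

For this Gaussian the continuum values are $\sqrt{\pi}$ for the overlap and $\sqrt{\pi}/4$ for the kinetic element, so the grid results can be compared directly; higher-order stencils reduce the remaining finite-difference error in the kinetic term.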
Using FFT, its Fourier component $n_{\vec G_i}$ can be calculated so that the Hartree potential in $\vec G$ space can be conveniently determined as

\[ V_H(\vec G_i) = \frac{4\pi\Omega e^2\, n(\vec G_i)}{G_i^2}, \tag{5.12} \]

where $\Omega$ is the volume of the simulation cell. The Hartree energy is then calculated as

\[ E_H = 2\pi\Omega e^2 \sum_{\vec G_i \neq 0} \frac{|n_{\vec G_i}|^2}{G_i^2}. \tag{5.13} \]
The Hartree potential in real space can be obtained by another FFT, using Eq. (5.12). The exchange-correlation energy is determined by

\[ E_{xc} = \delta\omega \sum_i n(\vec r_i)\, \varepsilon_{xc}[n(\vec r_i)]. \tag{5.14} \]

The pseudopotential energy is calculated by

\[ E_{ps} = \delta\omega \sum_i V_{ps}(\vec r_i)\, n(\vec r_i), \tag{5.15} \]

where

\[ V_{ps}(\vec r) = \sum_i v_{ps}(|\vec r - \vec R_i|), \tag{5.16} \]

with the ionic pseudopotential expressed as the sum of the Coulomb potential due to a Gaussian charge distribution and a short-range potential $v^0_{ps}(r)$, i.e.,

\[ v_{ps}(r) = -\frac{Z e^2}{r}\, \mathrm{erf}(\alpha^{1/2} r) + v^0_{ps}(r). \tag{5.17} \]

Gillan and coworkers have also used an alternative way to represent the support functions. B-splines, or blip functions, are piecewise polynomial functions which can be set up to be localized on the points of a grid (blip grid) rigidly attached to each atom [76]. Using a basis of B-splines, $L(\vec r)$, one can write

\[ \phi_i(\vec r) = \sum_s b_{is}\, L(\vec r - \vec R_{is}), \tag{5.18} \]

where $\vec R_{is}$ denotes the grid points associated with atom i. The energy functional can then be minimized with respect to the coefficients of the B-splines, $b_{is}$.
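The blip expansion of Eq. (5.18) can be illustrated in one dimension. The sketch below assumes a standard cubic B-spline scaled to unit height at its center; the normalization and exact form used in Ref. [76] may differ. The coefficients are arbitrary illustrative numbers rather than the result of an energy minimization.

```python
import numpy as np

def blip(x):
    """Cubic B-spline with support |x| < 2, scaled so that blip(0) = 1 (assumed form)."""
    ax = np.abs(x)
    inner = 1.0 - 1.5 * ax**2 + 0.75 * ax**3      # used for |x| <= 1
    outer = 0.25 * (2.0 - ax)**3                  # used for 1 < |x| <= 2
    return np.where(ax <= 1.0, inner, np.where(ax <= 2.0, outer, 0.0))

def support_function(x, blip_centers, coeffs, spacing):
    """1D analogue of Eq. (5.18): phi(x) = sum_s b_s * blip((x - R_s) / spacing)."""
    scaled = (x[:, None] - blip_centers[None, :]) / spacing
    return blip(scaled) @ coeffs

spacing = 0.5                                     # blip-grid spacing (hypothetical)
blip_centers = spacing * np.arange(-4, 5)         # blip grid attached to one atom
coeffs = np.exp(-blip_centers**2)                 # illustrative coefficients b_s
x = np.linspace(-4.0, 4.0, 161)
phi = support_function(x, blip_centers, coeffs, spacing)
```

Because each blip has finite support, the resulting support function vanishes identically beyond the outermost blip centers plus two grid spacings, so strict localization is built into the basis and the variational freedom resides entirely in the coefficients.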
6. Applications

Most of the applications of O(N) methodologies to the study of systems of realistic sizes up to this point are based on TB or semi-empirical approaches. These applications include: a study of C60 impacts on a diamond surface by Galli and Mauri [109]; a general study of carbon systems including crystalline, amorphous, and liquid carbon by Qiu et al. [110]; studies of the 90° partial dislocation in silicon by Hanson et al. [111] and by Nunes et al. [112]; calculations of the electronic structure, solvation free energy, and heats of formation for protein and DNA by York et al. [113,114]; a study of stable geometries of icosahedral fullerenes by Xu and Scuseria [115]; a study of the structural properties and energetics of the extended {311} defects in silicon by Kim et al. [116]; a study of C28 deposition on a semiconducting surface by Canning et al. [117]; a study of gas phase growth of a disordered solid of C28 fullerenes by
Kim et al. [118]; a study of surface reconstructions and dimensional changes in single-walled carbon nanotubes by Ajayan et al. [119]; a study of the edge-driven transition in the surface of a silicon nanorod by Ismail-Beigi and Arias [120]; a study of the behavior of shock-compressed methane at high temperatures and pressures by Kress et al. [121]; and a study of the initial stages of growth of Si/Si(001) [122]. There are also a few instances where first principles O(N) methodologies have been applied to investigate properties of large systems. They include: studies of the shape of large single and multi-shelled fullerenes by York et al. [123,124], by Itoh et al. [125], and by Ordejón et al. [35]; a study of the geometry and energetics of DNA basepairs and triplets by Lewis et al. [126]; and a study of the stable structure of a large DNA molecule by Sánchez-Portal et al. [127]. All the applications cited involve systems of large size. Thus they can only be studied by O(N) methods. The above list is by no means exhaustive. It is meant to provide a flavor of the wide range of applications of the O(N) methods. In this section, we discuss a few typical examples of these applications.

6.1. The shape of large fullerenes [35,123–125]

The discovery of concentric spherical graphite shells by Ugarte [128] prompted a series of theoretical studies on the equilibrium shape of large single-shell fullerenes [123–125,129–132]. The key question in these endeavors is whether the spherical shape of the multiple-shelled fullerenes is due to the intrinsic stability of the single-shelled fullerenes or to the van der Waals interactions between the shells. Isolated, large, and defect-free single-shell fullerenes have not been observed experimentally. Therefore, one must rely on the results of theoretical studies to shed light on the underlying physics for the existence of spherical multiple-shell fullerenes.
Results of studies using elastic theory [129,130] as well as empirical potentials [129,131,132] suggested that large single-shell fullerenes are not spherical but markedly polyhedrally faceted. The implication is then that the spherical shape of the multi-shell fullerenes is the consequence of the inter-shell interactions. However, a more reliable answer on this issue is expected if a calculation based on quantum-mechanical simulation is carried out. York et al., using the O(N) divide-and-conquer methodology, carried out just such a calculation [123,124]. In their approach, a large single-shell fullerene was divided into subsystems, each with one carbon atom. The local basis set for each subsystem included the atomic orbitals of the atom defining the subsystem and those of up to its third nearest neighbors. The atomic orbitals were obtained as numerical LDA solutions for a spherical carbon atom. The Goldberg type I fullerenes with $I_h$ symmetry were assumed for the system under consideration. The non-self-consistent Harris functional was used to construct the Hamiltonian. These restrictions allowed the calculation of fullerenes of up to about 1000 atoms. York et al. investigated the shape of C240, which is the “spherical” cluster next to C60. In their simulations, they considered several different initial configurations, two being “spherical” (sph1 and sph2) and three faceted (fac1, fac2, and fac4). After relaxation, they found an almost spherical configuration ($S_{York}$) with the lowest energy, followed by a polyhedral structure ($P_{York}$) with an energy higher than that of $S_{York}$ by 0.07 eV/atom. They also studied the shapes of C540 and C960. Their findings indicate that the spherical shape also has the lower energy for these clusters.
The findings of York et al. are very different from the conclusions drawn from the results obtained using empirical potentials. Itoh et al. [125] carried out another investigation of the shapes of large fullerenes using the unconstrained orbital minimization scheme of Ordejón et al. [23]. To ensure that the variational scheme would not be trapped in one of the local minima, they incorporated information about the chemistry of the local bonding configuration in constructing the local wave functions (LWFs) used in the O(N) scheme. Specifically, they defined 3N/2 LWFs to correspond to $\sigma$-type orbitals, and N/2 LWFs to $\pi$-type orbitals. The initial guesses for the $\sigma$ functions were linear combinations of $sp^2$ orbitals, while those for the $\pi$ functions were combinations of $p_\perp$ orbitals. The $\sigma$ functions were centered at each of the 3N/2 bonds of the cage network. The $\pi$ functions were centered at each of the bonds pointing radially from the pentagons, resulting in a total of N/2 such functions. The distance between the center of an LWF and an atom was expressed in terms of a number $N_d$, defined as the minimum integer number of bonds between the center and the atom. With the centers of the regions of localization of the LWFs assigned, the cut-off distance for these LWFs was then expressed in terms of a cut-off number $N_c$. If the “distance” $N_d$ between the center of a certain LWF and an atom is less than $N_c$, the atom in question is inside the region of localization corresponding to that LWF. The advantage of using this definition of the distance is that the number of atoms within the cut-off depends only on the topology of the bonds, not on the curvature of the network. Itoh et al. found that, for the case of C240, an error of less than one percent relative to the exact calculation can be achieved by the O(N) procedure for $N_c = 4$. For larger clusters, no further degradation of the accuracy was detected. Itoh et al.
studied the shapes of large fullerenes including C240, C540, C960, and C2160, using the O(N) procedure outlined above. They found that, in every case, the minimum energy configuration is markedly polyhedral rather than spherical. This is definitely in contrast to the findings of York et al. For the case of C240, they calculated the energy of all the structures considered by York et al. and compared them with their optimized structure. They found that their faceted optimized structure is significantly lower in energy than all the structures considered by York et al., including the optimized spherical structure $S_{York}$ obtained by York et al. after relaxation (see Table 3). The two groups used different O(N) approaches, but both approaches were based on first principles DFT/LDA within the framework of the non-self-consistent Harris functional. To shed light on the discrepancy between these two studies, Ordejón et al. carried out an LCAO-based self-consistent O(N) calculation of the equilibrium structures of C60, C240, and C540 [35]. Their findings are in basic agreement with those obtained by Itoh et al. (see Table 4). This result seems to suggest that the shape of single-shell carbon clusters tends to be polyhedral, except for the fullerene.

6.2. Dimensional stability of single-walled carbon nanotubes [119]

The dimensional stability of carbon nanotubes is of central importance for their potential technological applications. Experiments on single-walled nanotubes indicated that nanotubes can be severely deformed locally by focused electron irradiation, leading to fracture at the necks developed along the tube [133,134]. Irradiation knocks carbon atoms off the surface of the tube. This process of atom removal produces vacancies and holes, thus creating an
Table 3
Geometric parameters (Å) and energies per atom (in eV, with respect to a graphitic sheet) for single-shell C240 clusters of various structures calculated with the O(N) ab initio tight-binding method of Ordejón et al. [23], reproduced from Table III in Ref. [125]

| Morphology | Bonds (b1, b2, b3, b4, b5)^a | Radii (r1, r2, r3)^a | r̄ (σ)^b | E_O(N) | E_exact | E_York |
|---|---|---|---|---|---|---|
| sph1^c | (1.44, 1.43, 1.44, 1.43, 1.44) | (7.12, 7.12, 7.12) | 7.120 (0.000) | 0.185 | 0.169 | 0.128 |
| sph2^c | (1.43, 1.44, 1.43, 1.43, 1.44) | (7.12, 7.12, 7.12) | 7.120 (0.000) | 0.194 | 0.176 | 0.128 |
| fac1^c | (1.48, 1.44, 1.48, 1.44, 1.48) | (7.03, 7.42, 6.97) | 7.098 (0.188) | 0.502 | 0.488 | 0.248 |
| fac2^c | (1.47, 1.43, 1.47, 1.43, 1.47) | (7.63, 7.21, 6.75) | 7.085 (0.367) | 0.241 | 0.232 | 0.278 |
| fac4^c | (1.45, 1.40, 1.47, 1.45, 1.46) | (7.49, 7.19, 7.05) | 7.195 (0.180) | 0.141 | 0.131 | 0.208 |
| S_York^c | (1.43, 1.43, 1.45, 1.42, 1.44) | (7.01, 7.13, 7.14) | 7.106 (0.056) | 0.210 | 0.195 | 0.108 |
| P_York^c | (1.43, 1.42, 1.51, 1.47, 1.46) | (7.66, 7.19, 7.07) | 7.247 (0.244) | 0.212 | 0.200 | 0.178 |
| YO^d | (1.43, 1.38, 1.45, 1.42, 1.43) | (7.36, 7.06, 6.92) | 7.065 (0.180) | 0.122 | 0.111 | |
| This work | (1.42, 1.38, 1.45, 1.42, 1.43) | (7.32, 7.06, 6.94) | 7.065 (0.153) | 0.120 | 0.108 | |

^a Inequivalent bonds and radii. See Ref. [123] for the definition.
^b Average radius and standard deviation (in parentheses).
^c Optimized structures obtained by York et al. [123].
^d Optimized structure obtained by Yoshida and Osawa [131].
Table 4
Comparison of the average radii ($\bar r$), the standard ($\sigma_s$) and maximum ($\sigma_m = (r_{max} - r_{min})/2$) deviations of the radii, and the non-planarity angle (around pentagons, from 0° for a planar pentagonal site to 12° for a truncated icosahedron) of fullerene clusters obtained by the self-consistent O(N)-LCAO method of Ordejón et al. [35] and those by Itoh et al. [125], reproduced from Table I in [35]

| Cluster | r̄ (Å), this work | σ_s/r̄, this work | σ_m/r̄, this work | Angle, this work | r̄ (Å), Itoh et al. | σ_s/r̄, Itoh et al. | σ_m/r̄, Itoh et al. | Angle, Itoh et al. |
|---|---|---|---|---|---|---|---|---|
| C60 | 3.59 | 0.000 | 0.000 | 12.0° | 3.55 | 0.000 | 0.000 | 12.0° |
| C240 | 7.18 | 0.023 | 0.027 | 8.5° | 7.06 | 0.021 | 0.028 | 7.9° |
| C540 | 10.69 | 0.038 | 0.054 | 9.6° | 10.53 | 0.033 | 0.053 | 9.2° |
unstable configuration. The tube may mend itself through a rearrangement of atoms and thus shrink in diameter. Recently, Ajayan et al. [119] studied, both experimentally and theoretically, the surface reconstruction of single-walled nanotubes under low fluxes of irradiation. They found that, under low fluxes of irradiation, a typical tube shrank from an original diameter of 1.4 nm to a value as small as 0.4 nm in about half an hour of irradiation. The mended tubes were found to be stable and the overall shape of the tubes remained cylindrical. To understand the physics involved in the mending process, Ajayan et al. carried out a simulation of the surface reconstruction in single-walled nanotubes using the O(N) Fermi operator expansion method based on a tight-binding model. They simulated a homogeneous removal of carbon atoms from a (10,10) nanotube of diameter 1.36 nm at an extraction rate of 5 atoms/ps. The MD cell contains 399 carbon atoms, with the periodic boundary condition imposed along the axis of the tube. The time step of the simulation is 0.7 fs and the
Fig. 7. A highly defected rough cylinder due to surface reconstruction associated with a random extraction of 200 carbon atoms from the surface of a (10,10) nanotube, reproduced from Fig. 2(b) in Ref. [119].
simulation runs for a total time of 70 ps. In the simulation, the tube was gradually heated to 700 K to accelerate the process of surface reconstruction. They observed the following mending process. Initially, the two-coordinated carbon atoms created by the removal of atoms from the surface recombined to saturate the dangling bonds, resulting in a mainly three-coordinated, highly defective carbon network, with nonhexagonal rings including squares, pentagons, heptagons, octagons, nonagons, and decagons scattered in the network. The unstable high-membered rings then disappeared, leading to a structure mainly composed of five-, six-, and seven-membered rings. After a random extraction of 200 carbon atoms from the surface, the surface reconstruction yielded a highly defective rough cylinder, with the diameter reduced from the original value of 1.36 nm to an average value of around 0.7 nm (see Fig. 7). They also found that the cohesive energy of this reconstructed narrow nanotube was reduced by only 0.55 eV/atom compared to a perfect (5,5) nanotube of the same diameter. They also conducted a simulation with an inhomogeneous atom removal. They found that the defective surface was not able to reconstruct into a disordered $sp^2$ network, but yielded linear atomic chains connecting undefected regions of the nanotube. In this study, Ajayan et al. showed that the experimental results and the theoretical simulations led to the same conclusion regarding the surface reconstruction of a single-walled nanotube under atom removal by irradiation. Specifically, the mechanisms determining the surface reconstruction are the saturation of dangling bonds and the Stone–Wales mechanism [135]. The outcome of the mending process is the shrinking of the tube.

6.3. Initial stages of growth of Si/Si(001)

Recent scanning tunneling microscope (STM) studies of the initial stages of growth of Si/Si(001) [136], Ge/Si(001) [137], and Si/Ge(001) [138] revealed a new type of growth structure.
These structures are abundant and stable near room temperature. They appear as chains of “adatom units” intersecting the substrate dimer rows obliquely at a specific angle, and thus differ from the dilute dimer rows previously observed above room temperature, which intersect the substrate dimer rows at a right angle [139]. The chain structure is found to be faint in the filled-state STM image and bright in the empty-state image, similar to the behavior of monomers on the Si(001) substrate. In contrast, the dilute dimer row is found to be bright in both situations.
Based on the experimental results, it was suggested that the building unit of the chain structure is, perhaps, not a dimer, but a pair of atoms [137]. Liu et al. [122] recently carried out a simulation study of the low-coverage (∼0.01 ML) initial stage of growth of Si/Si(001). Such a study requires the use of a large surface unit cell. This requirement renders first principles MD schemes infeasible for the simulation study. Liu et al. used the O(N)/NOTB-MD scheme for their study. Slabs with sizes ranging from 4×4 to 8×8 in the lateral direction and from 12 to 24 layers in the z direction were used in the simulations. The bottom of the slab (2–4 layers) was fixed at bulk positions while the rest of the system (including adatoms) was fully relaxed. The NOTB Hamiltonian developed by Menon and Subbaswamy [64] for silicon was used to calculate the total energy and the atomic forces. However, Liu et al. included an on-site Hubbard-like term to correct the charge transfer [122]. In their test calculation using this modified Hamiltonian, they found the correct trend for the ordering as well as the values of the surface energies for all the commonly studied reconstructed surfaces of Si(001). They next performed MD simulations for two Si adatoms placed in the trough between the substrate dimers of the c(4×2) surface of Si(001). STM studies revealed that the adsorption of such an “adatom pair” breaks the symmetry of the underlying substrate by interrupting the “antiferromagnetic” buckling of the dimer rows [136]. In the simulations, they accommodated this reconstruction and allowed the relaxation of substrate atoms in the vicinity of the adatom unit, using a slab of size 8×8×12. Fig. 8 shows the result of the MD simulation. The simulation yielded an equilibrium separation of 2.52 Å between the adatoms, a separation very close to the equilibrium separation of an isolated Si dimer. This result seemed to suggest that the two adatoms in the trough are chemically bonded.
They then calculated the bond charge between the two adatoms using the method of local analysis [140]. The bond charge was found to be 0.4e, a substantial value compared with the bond charge of 0.5e for bulk Si. This analysis confirms that the two adatoms are chemically bonded. The pair is indeed a dimer (referred to as the C dimer). To understand why C dimers appear dark in the filled-state and bright in the empty-state STM images, they calculated the local density of states (LDOS) at the location of the C dimer. The result is shown in Fig. 9. It can be seen that the LDOS of an isolated C dimer at and below the Fermi energy (∼ −6.9 eV) is very small, whereas there is a broad pronounced peak above the Fermi energy, centered around −4.9 eV. Hence the C dimer will appear dark in the filled-state and very bright in the empty-state STM images. To understand larger growth structures, they examined possible three-adatom structures. Using the C dimer as the seed, they placed the third adatom at sites 3, 4, or 5 in Fig. 8 and allowed the system to relax. Let $E_b(i) = E_{slab} + 3E_{atom} - E_{sys}$ be the binding energy of a three-adatom configuration with the C dimer at sites 1 and 2 and the third adatom at site i. They found, with $E_b(5) = 0$ as the reference, $E_b(3) = 0.65$ eV and $E_b(4) = 0.15$ eV. Since site 3 has the highest binding energy, the configuration with the three adatoms at sites 1, 2, and 3 is expected to be favored, thus promoting the growth of a dilute dimer row rather than a chain structure. However, they also calculated the barriers to diffusion between sites. Their calculation yielded the diffusion barriers between sites 3, 4, and 5 as E(5 → 4) = 0.78 eV, E(4 → 5) = 0.93 eV, and E(4 → 3) = 1.03 eV. Assuming a diffusion rate of $K(i \to j) \sim e^{-E(i \to j)/k_B T}$, the relative rates of diffusion between these sites at room temperature were estimated as $K(4 \to 5)/K(5 \to 4) = 3 \times 10^{-3}$
Fig. 8. Top view of the Si(100) surface with a pair of adatoms in the trough. Open circles: adatoms. Solid circles: top layer substrate atoms. Diamonds: second layer atoms. Atoms denoted by larger symbols are higher than those denoted by smaller ones. Note that the interruption of the antiferromagnetic buckling of the c(4 × 2) reconstruction due to the adsorption of adatoms is included in the simulation. Locations 3, 4, and 5 mark the positions of the third adatom as it moves along the adjacent trough (see text). The figure is reproduced from Fig. 1 in Ref. [122].
Fig. 9. Local electron density of states at site 1 (see Fig. 8) for an isolated C dimer and those at sites 1 and 2 for a chain, reproduced from Fig. 2 in Ref. [122].
and $K(4 \to 3)/K(5 \to 4) = 5 \times 10^{-5}$. Hence, it is relatively easy for the third adatom to reach site 4 from site 5, but much harder for the third adatom to move from site 4 to site 3 at room temperature. Thus, at room temperature, the third adatom will be trapped at site 4 sufficiently long to wait for the arrival of a fourth adatom at site 6. This will lead to the formation of two units of the chain structure. This analysis suggests that the competition between kinetics and thermodynamics eventually dictates the type of large adatom structures found on Si(001).

6.4. Liquid carbon structures

Qiu et al. [110] implemented the O(N) density matrix variational scheme of Li et al. [21], using the tight-binding Hamiltonian developed by Xu et al. [141], to study carbon systems such as crystalline, amorphous, and liquid carbon. In their implementation, they used a two-stage steepest-descent minimization algorithm for the minimization of the grand potential $\Omega$ to ensure the consistency between the chemical potential $\mu$ and the total number of electrons $N_e$, where

$$\Omega = E_{TB} - \mu N_e = 2\,\mathrm{Tr}[\tilde{\rho}(H_{TB} - \mu)] \qquad (4.5')$$

with $\tilde{\rho} = 3\rho^2 - 2\rho^3$. Here $H_{TB}$ is the tight-binding Hamiltonian and $E_{TB}$ is the corresponding tight-binding band structure energy. In the first stage, the line minimization proceeds along the direction of $-\nabla\Omega|_{\rho=\rho_n} \equiv A_n$, where

$$A_n = -\nabla E_{TB}|_{\rho_n} - \mu_n[-\nabla N_e|_{\rho_n}] = -\nabla\,\mathrm{Tr}[\tilde{\rho} H_{TB}]|_{\rho=\rho_n} - \mu_n\{-\nabla\,\mathrm{Tr}[\tilde{\rho}]|_{\rho=\rho_n}\} \qquad (6.1)$$

with $\rho_n$ and $\mu_n$ being the density matrix and the chemical potential at the nth iteration, respectively. The variational matrix at the (n + 1)th step in the first stage of minimization is given by

$$(\rho_1)_{n+1} = \rho_n + \lambda A_n, \qquad (6.2)$$

where $\lambda$ denotes the step size. Substituting Eq. (6.2) into Eq. (4.5'), one obtains

$$(\Omega_1)_{n+1} = c_0 + c_1\lambda + c_2\lambda^2 + c_3\lambda^3, \qquad (6.3)$$

where

$$c_0 = \Omega_n, \quad c_1 = -\mathrm{Tr}[A_n^2], \quad c_2 = \mathrm{Tr}[3A_n^2 H_{TB} - 2A_n^2(H_{TB}\rho_n + \rho_n H_{TB}) - 2A_n\rho_n A_n H_{TB}], \quad c_3 = -2\,\mathrm{Tr}[A_n^3 H_{TB}]. \qquad (6.4)$$
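Because Eq. (6.3) is a cubic polynomial in the step size, the first-stage line minimization reduces to solving a quadratic for the stationary points and keeping the one with positive curvature. A minimal sketch follows. The coefficient values are made up for illustration (only the negative sign of c1, which follows from Eq. (6.4), is taken from the text), and `cubic_line_minimum` is a hypothetical helper name, not a routine from Ref. [110].

```python
import math

def cubic_line_minimum(c1, c2, c3):
    """Step at which c0 + c1*x + c2*x**2 + c3*x**3 has its local minimum.

    Stationary points solve c1 + 2*c2*x + 3*c3*x**2 = 0; the minimum is the
    one with positive curvature 2*c2 + 6*c3*x.
    """
    if c3 == 0.0:
        if c2 <= 0.0:
            raise ValueError("no local minimum along this line")
        return -c1 / (2.0 * c2)
    disc = 4.0 * c2 * c2 - 12.0 * c1 * c3
    if disc <= 0.0:
        raise ValueError("no local minimum along this line")
    # At this root the curvature equals sqrt(disc) > 0, so it is the minimum.
    return (-2.0 * c2 + math.sqrt(disc)) / (6.0 * c3)

# Illustrative coefficients only (c0 does not affect the location of the minimum).
c1, c2, c3 = -1.0, 1.0, -0.2
step_min = cubic_line_minimum(c1, c2, c3)
slope = c1 + 2.0 * c2 * step_min + 3.0 * c3 * step_min**2
curvature = 2.0 * c2 + 6.0 * c3 * step_min
print(step_min, slope, curvature)
```

In an actual calculation the coefficients would be evaluated from the traces in Eq. (6.4) at every iteration.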
The value of $\lambda$ (denoted by $\lambda_{\min}$) at which $(\Omega_1)_{n+1}$ is at its minimum is computed using Eqs. (6.3) and (6.4). However, after the first-stage line minimization, the number of electrons calculated according to $N = \mathrm{Tr}[(\tilde{\rho}_1)_{n+1}]$ is in general not equal to $N_e$, the total number of electrons of the system under consideration. To achieve the consistency between the chemical potential and the total number of electrons, $\rho$ is adjusted at the second stage of the line minimization along the direction of $-\nabla N|_{\rho=\rho_n} \equiv B_n$ with a step size $\zeta$ so that the density matrix at the (n + 1)th step is given by

$$(\rho_2)_{n+1} = \rho_n + \lambda_{\min} A_n + \zeta B_n. \qquad (6.5)$$

By substituting Eq. (6.5) into $N_e = \mathrm{Tr}[(\tilde{\rho}_2)_{n+1}]$, one obtains

$$N_e = d_0 + d_1\zeta + d_2\zeta^2 + d_3\zeta^3, \qquad (6.6)$$

where

$$d_0 = \mathrm{Tr}[(\tilde{\rho}_1)_{n+1}], \quad d_1 = -\mathrm{Tr}[B_n (B_1)_{n+1}], \quad d_2 = 3\,\mathrm{Tr}[B_n^2 - 2B_n^2(\rho_1)_{n+1}], \quad d_3 = -2\,\mathrm{Tr}[B_n^3] \qquad (6.7)$$

with $(B_1)_{n+1} = -\nabla N|_{\rho=(\rho_1)_{n+1}}$. There are three roots for Eq. (6.6). Qiu et al. found that it is most convenient to use the root with the smallest absolute value, denoted by $\zeta_{\min}$. In this way, the combination of the two stages of line minimization leads to the expression for the determination of the variational density matrix at the (n + 1)th step given by

$$\rho_{n+1} = \rho_n + \lambda_{\min} A_n + \zeta_{\min} B_n. \qquad (6.8)$$

The chemical potential at the (n + 1)th step is likewise given by

$$\mu_{n+1} = \mu_n - \frac{\zeta_{\min}}{\lambda_{\min}}. \qquad (6.9)$$
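The purified matrix, three times the square of the density matrix minus twice its cube, that enters Eq. (4.5') is the McWeeny transformation. Since the purified matrix shares its eigenvectors with the original one, the effect of the transformation can be seen on a single eigenvalue (occupation). The sketch below uses illustrative starting occupations (assumed values) and shows that repeated application drives an occupation above 1/2 towards 1 and one below 1/2 towards 0, that is, towards idempotency.

```python
def mcweeny(x):
    """McWeeny purification map applied to one eigenvalue of the density matrix."""
    return 3.0 * x**2 - 2.0 * x**3

def purify(x, iterations=20):
    """Apply the McWeeny map repeatedly to a single occupation number."""
    for _ in range(iterations):
        x = mcweeny(x)
    return x

occupied = purify(0.6)  # illustrative start above 1/2
empty = purify(0.4)     # illustrative start below 1/2
print(occupied, empty)
```

The fixed points 0 and 1 are the only stable ones, which is why a variational search in terms of the purified matrix favors idempotent solutions.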
Thus, the two-stage steepest-descent minimization of the grand potential allows first the minimization of $\Omega$ along the direction of $-\nabla\Omega$, and then the adjustment of $\rho$ along the direction of $-\nabla N$ so that the density matrix at each step is always on the surface of $\mathrm{Tr}[\tilde{\rho}] = N_e$. At the first MD step, Qiu et al. used the suggestion of Li et al. [21] as the initial variational density matrix, namely, 0.5 for the diagonal elements and zero for the off-diagonal elements. For the subsequent MD steps, they obtained the initial density matrix by extrapolating forward from the electron configurations of previous time steps. Specifically, the initial guess for the density matrix of the (n + 1)th MD step, $\rho(\{r(t_{n+1})\})$, is expressed as

$$\rho(\{r(t_{n+1})\}) = \rho(\{r(t_n)\}) + \alpha[\rho(\{r(t_n)\}) - \rho(\{r(t_{n-1})\})] + \beta[\rho(\{r(t_{n-1})\}) - \rho(\{r(t_{n-2})\})], \qquad (6.10)$$
where $\{r(t_n)\}$ denotes the set of ionic coordinates at the time step $t_n$. If the parameters $\alpha$ and $\beta$ are chosen to be 1 and 0, respectively, Eq. (6.10) corresponds to the first-order extrapolation. For $\alpha = 2$ and $\beta = -1$, it corresponds to the second-order extrapolation. Qiu et al. found that, for carbon systems, the system energy increases monotonically if the second-order extrapolation is used, while the system energy decreases monotonically if the first-order extrapolation is used. Thus by using the first- and second-order extrapolations alternately, they were able to achieve conservation of the total energy using a relatively large tolerance for the minimization process.

Using the procedure described above, Qiu et al. performed simulations of liquid carbon with a density of 2.0 g/cm$^3$. They used the orthogonal tight-binding Hamiltonian developed by Xu et al. [141] to model the liquid carbon. The MD cell was composed of 64 atoms with cubic periodic boundary conditions imposed. Only the $\Gamma$ point was used for the electronic structure calculation. The cut-off of the variational density matrix was set by $N_c = 46$, where $N_c$ is the number of atoms included in the region of localization about a given atom. The MD simulations ran with a time step of 0.7 fs. After 3.5 ps of thermalization at 5000 K, the temperature control was released and the simulations ran another 1.4 ps. The results for the pair-correlation functions, atomic distributions, and the partial radial distribution functions and bond-angle distribution functions of variously coordinated atoms all agree quite well with those obtained by the method of direct diagonalization (Fig. 10).

6.5. Extended Si{311} defects

Ion implantation is a tool to introduce specifically chosen atomic particles into a substrate to effect changes in the electrical, chemical and metallurgical properties of the substrate. However, this process may also induce transient enhanced diffusion (TED) of dopants [142].
For example, the diffusion of boron in ion-implanted silicon during annealing is many orders of magnitude greater than the diffusion of boron in a sample in thermal equilibrium [143]. The transient enhanced diffusion of boron in silicon is a limiting factor in the fabrication of electronic devices. Experimental evidence and theoretical studies indicate that the transient enhanced diffusion of boron is related to the pairing of boron (B) and silicon (Si) interstitials introduced by B$^+$ implantation. Hence, understanding the profile of Si interstitials after ion implantation and during the thermal annealing is important for the determination of the distribution of B in the ion-implanted samples. Recent experimental studies [144,145] suggested that emission of Si interstitials from extended {311} defects is mainly responsible for the transient enhanced diffusion of B in Si samples. The Si {311} defects are rodlike structures along the $\langle 011\rangle$ direction that may extend to as much as 1 μm [146,147]. The width of the {311} defects is along the $\langle 233\rangle$ direction and covers the range from 1 to 100 nm. The defects reside on the {311} plane formed by the $\langle 011\rangle$ and $\langle 233\rangle$ directions, thus the name. The formation of {311} defects has been observed by GeV-electron irradiation, ion implantation, and surface oxidation. There are suggestions that {311} defects are formed as a consequence of the condensation of Si interstitials. Recently, Kim et al. [116] carried out simulations to study the structural properties and energetics of extended {311} defects using the unconstrained O(N) orbital variation approach of Kim et al. [32]. They calculated the total energy of the defect structure using the orthogonal tight-binding Hamiltonian developed by Kwon et al. [148]. Because of the size of
Fig. 10. Comparisons of the results obtained by the density matrix-TBMD simulation with those obtained by the direct diagonalization-TBMD simulation. (a) Pair correlation; (b) partial radial distribution; (c) angular distribution function; reproduced from Fig. 5 in Ref. [110].
Fig. 11. Supercell used to study {311} defects, reproduced from Fig. 1 in Ref. [116].
the extended {311} defects, the calculation of the total energy requires a very large supercell to take into consideration the long-range structural relaxation. Kim et al. implemented their calculation on a Cray T3D. The calculation was done at the $\Gamma$ point in the supercell approach. Within the O(N) scheme, they used a spherical localization with a cut-off $R_c = 6$ Å. Imposing this cut-off yields an error of less than 1% in the calculation of the total energy. The chemical potential is adjusted so that the total number of electrons is within $10^{-5}$ of the exact value. To control possible overestimate of the charge transfer, a Hubbard-like term with U = 4 eV is introduced. This extra term is observed not to alter to any appreciable extent the relaxed structures or the total energies. The supercell used in Kim et al.'s simulations is constructed as follows (see Fig. 11). The directions $[01\bar{1}]$ and $[2\bar{3}\bar{3}]$ are chosen as the x- and y-axis, respectively, with the direction [311] designated as the z axis. In terms of the three unit lengths $L_{x0} = a/\sqrt{2}$, $L_{y0} = \sqrt{11}\,a/\sqrt{2}$, and
$L_{z0} = \sqrt{11}\,a$, the orthorhombic supercell used in Kim et al.'s simulations is defined by choosing $L_x = n_x L_{x0}$, $L_y = n_y L_{y0}$, and $L_z = 2L_{z0}$, where a is the lattice constant for Si in its diamond phase, and $n_x$ and $n_y$ are integers. $n_x$ and $n_y$ are chosen so that the displacement of the atoms far from the defect core from their regular position in the diamond structure is less than 0.02 Å. Periodic boundary conditions are imposed along all three axes. Kim et al. carried out the structure optimization for the defect structure by first using constant-temperature MD simulations. These simulations were performed at 300–600 K for about 1 ps so that the structure would not be trapped in local minimum configurations. The equilibrium configuration of the defect structure was then obtained by fully relaxing the atomic positions using the steepest descent method until the force acting on each atom is less than 0.01 eV/Å. The effective temperature of the equilibrium configuration of the defect structure is less than 0.1 K. The formation energy per interstitial is the quantity used by Kim et al. to determine the relative stability of stable defect structures. Let $N_{int}$ and $N_{bulk}$ be the number of interstitial Si atoms and bulk Si atoms in the MD cell (supercell), respectively. The formation energy for a given structure may be written as
$$E^f = E_{tot}[N_{int}, N_{bulk}] - \frac{N_{int} + N_{bulk}}{N_{bulk}}\,E_{tot}[N_{bulk}]. \qquad (6.11)$$

The formation energy per interstitial is then given by $E^f_{int} = E^f/N_{int}$. Alternatively, one may define the binding energy of the defect structure as

$$-E_b = E^f - N_{int}\,E^f_{\langle 011\rangle}, \qquad (6.12)$$

where $E^f_{\langle 011\rangle}$ is the formation energy of an isolated $\langle 011\rangle$ interstitial. In terms of $E^f_{int}$, the smaller $E^f_{int}$ is, the more stable is the defect structure. On the other hand, a positive $E_b$ indicates that the defect structure is stable compared to the structure with the same number of isolated interstitials.

Using the procedure outlined above, Kim et al. studied the extended {311} defects systematically. They found that interstitial chain structures along the $\langle 011\rangle$ direction are stable defect structures compared with isolated interstitials. They also found that the side-by-side condensation of these interstitial chain structures along the $\langle 233\rangle$ direction leads to the formation of the extended {311} defects. Their studies also suggested that successive rotations of pairs of atoms in the {011} plane provide the means for the propagation of interstitial chains because these rotations have a relatively small energy barrier. The growth of the {311} defects can then be explained in terms of the stability of the interstitial chain structure and the mechanism for their propagation (see Fig. 12).
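The energy bookkeeping of Eqs. (6.11) and (6.12) can be sketched in a few lines. All numerical inputs below are hypothetical and chosen only to exercise the formulas; in particular the total energies and the formation energy of an isolated interstitial are assumed values, not results from Ref. [116], and the function names are illustrative.

```python
def formation_energy(e_defect, e_bulk, n_int, n_bulk):
    """Eq. (6.11): formation energy of the defect structure (eV)."""
    return e_defect - (n_int + n_bulk) / n_bulk * e_bulk

def binding_energy(e_form, n_int, e_form_isolated):
    """Eq. (6.12) rearranged: E_b = N_int * E_f(isolated) - E_f."""
    return n_int * e_form_isolated - e_form

# Hypothetical inputs (eV), for illustration only.
n_bulk, n_int = 100, 2
e_bulk = -500.0     # E_tot[N_bulk], perfect-crystal supercell
e_defect = -506.6   # E_tot[N_int, N_bulk], supercell with the defect
e_isolated = 3.0    # assumed formation energy of one isolated interstitial

e_form = formation_energy(e_defect, e_bulk, n_int, n_bulk)
e_form_per_int = e_form / n_int
e_bind = binding_energy(e_form, n_int, e_isolated)
print(e_form, e_form_per_int, e_bind)
```

With these inputs the binding energy is positive, which by the criterion above would mark the defect as stable relative to the same number of isolated interstitials.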
6.6. Controllable reversibility in the mechanical deformation of a single-walled nanotube by a local probe

Most recently, an experimental investigation of the effect of mechanical deformation on the electrical properties of SWNT using an AFM tip to reversibly deflect a suspended SWNT
Fig. 12. (a) An interstitial chain, formed by stacking pairs of interstitials, is inserted into bulk silicon along the $\langle 011\rangle$ direction. The solid atoms are $L_{x0}/2$ into the plane with respect to the open-circle atoms. The pair of interstitials is surrounded by two adjacent five-member rings. The dotted lines highlight the broken bonds due to the insertion of the interstitial chain. The structure has a formation energy ($E^f_{int}$) of 2.2 eV per interstitial. (b) A more stable structure ($E^f_{int} = 1.7$ eV) is obtained by rotating a bond (denoted by an arrow). (c) An interstitial chain configuration of $E^f_{int} = 1.7$ eV is obtained with the rotation of the other bond (the other arrow). The figures are reproduced from Fig. 2 in Ref. [116].
was carried out [149]. Fig. 13 gives a schematic drawing of the experimental setup. The SWNT bridging a pair of metal electrodes (20 nm thick Ti placed on top of the SWNT) was suspended across a trench (typically 100–1000 nm wide and 175 nm deep) prefabricated in between the catalyst islands on a SiO$_2$/Si substrate. Placing an AFM tip above the center of the suspended SWNT, the sample-stage containing the SWNT was moved upward and then retracted. The up-and-down cyclic movement was repeated many times while the AFM cantilever deflection and the resistance of the SWNT were simultaneously recorded as a function of time. In situ
Fig. 13. (a) A schematic view of the experimental setup. (b) A schematic (side) view of the pushing-and-retracting action of the AFM tip. The figures are reproduced from Fig. 1 in Ref. [153].
measurements of the conductance found an unexpected decrease in conductance of two orders of magnitude when the AFM tip deflected the center of a suspended SWNT to a seemingly small bending angle ($\theta = 13^\circ$). It was also found that the conductance and the structure recovered as the tip retracted. This controllable reversibility of the deformation-induced change in conductance of two orders of magnitude clearly indicates the feasibility of utilizing this process in the design of a nanoscale switch. This experimental observation is obviously very different from the results of previous studies [150–152]. A comparison of the relevant factors characterizing this experimental procedure with those defining previous theoretical studies immediately brings out a key factor which plays a crucial role in the experimental procedure yet is missing in the theoretical consideration. This is the pushing-and-retracting action of the AFM tip as it is manipulated to induce the mechanical deformation of the SWNT. In previous theoretical studies [8,9], the bending of the SWNT was modeled by holding the ends of the SWNT at positions defining the angle of bending. This initial configuration was then allowed to relax to its equilibrium configuration while the ends were kept at the initial fixed positions. The equilibrium configuration of the bent SWNT obtained in this way, in particular in the neighborhood of the center of the SWNT, is certainly not expected to be able to model that of the bent SWNT obtained in our experiment under the pushing action of an AFM tip. This is because the pushing action of the AFM tip will give rise to a concentrated local strain in the section of the SWNT in the immediate neighborhood of the tip which would otherwise not exist without the presence of the tip. Thus, to shed light on the physics behind the unexpected reduction in conductance by two orders of
magnitude at a relatively small angle of deflection of $\theta = 13^\circ$, one must model the experimental procedure carefully by explicitly involving the AFM tip, in particular the pushing-and-retracting action of the tip. Liu et al. [153] recently carried out just such a simulation of the deformation of a SWNT under the pushing and retracting action of an AFM tip. They used a metallic (5,5) SWNT containing 960 carbon atoms ($l \approx 120$ Å) as the "sample". The AFM tip was modeled by a capped (5,5) SWNT with 110 carbon atoms. The tip was first placed above the center of the suspended SWNT and then pushed downward vertically at a uniform speed in a continuous manner to deflect the SWNT. The MD simulation of the continuous deflection of the SWNT was carried out at 300 K. Because of the size of the system under consideration, an order-N/non-orthogonal tight-binding MD (O(N)/NOTB-MD) scheme was used to carry out the simulation [44]. The NOTB Hamiltonian developed by Menon et al. [154] was used in the calculation. This Hamiltonian is constructed in terms of s and p orbitals, and hence is equipped to take into account any effect associated with $\sigma$–$\pi$ hybridization. To model the pushing action of the tip, the 50 atoms at the far end (from the sample) of the tip were held rigidly as a unit and they were moved downward at a speed of 0.002 Å/step (383 m/s with a time step of 0.522 fs). Forty atoms at each end of the suspended SWNT were held at their fixed positions during the simulation. The rest of the atoms, including the 60 atoms in the bottom portion of the tip (adjacent to the SWNT) and 880 atoms in the SWNT, were allowed to move under the action of the atomic forces (calculated by the O(N)/NOTB scheme). It should be noted that the downward speed is actually about two orders of magnitude smaller than the thermal speed of atoms at 300 K. The deflection process went on until the bending angle $\theta$ reached $15^\circ$.
To check the observed reversibility of the SWNT, the tip was then pulled back in the same manner. In Fig. 14, the equilibrium configurations of the system, i.e., the SWNT and the tip, during the pushing-and-retracting action of the tip are shown. These equilibrium configurations were obtained after the system was relaxed at the specified bending angle. The top four figures in the panel show the equilibrium configurations corresponding to $\theta = 0^\circ$, $7^\circ$, $11^\circ$, and $15^\circ$, respectively, as the tip pushes down on the SWNT. The bottom three figures give the configurations corresponding to $\theta = 11^\circ$, $7^\circ$, and $0^\circ$ during the retracting stage of the tip's pushing-and-retracting cycle. It can be seen that, for $\theta = 7^\circ$, the deformation of the SWNT in the vicinity of the tip is basically elastic in nature. However, for $\theta \ge 11^\circ$, there is a noticeable change in the bonding configuration for the atoms in the central section of the SWNT in the proximity of the tip. Using either the distance or the bond charge criterion [140], one can determine the average number of bonds per atom in the central section of the SWNT. At $\theta = 11^\circ$, this number has already changed from 3 to 3.3 while at $\theta = 15^\circ$, this number has reached 3.6. This dramatic change in the average number of bonds per atom in the central section of the SWNT (near the tip) where the bend is located signifies that a change in the nature of bonding, namely from sp$^2$ to sp$^3$ bonding, has occurred for atoms in the bending region. It should be noted that no such change had been observed even for bending angles up to $\theta = 45^\circ$ in previous simulations where no tip was involved in deforming the SWNT. Since electrons in sp$^3$ bonding are localized, this change could bring about a drastic change in the electrical properties of the SWNT. The bottom three figures show that, as the tip was being withdrawn in a continuous manner from $\theta = 15^\circ$, the SWNT returns to its original unbent structure.
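The distance criterion for the average number of bonds per atom can be sketched as a simple neighbor count. The code below is a minimal illustration on a toy planar fragment, not the analysis of Ref. [153]: the 1.7 Å cutoff is an assumed value, periodic images are ignored, and `average_coordination` is a hypothetical helper name.

```python
import math

def average_coordination(positions, cutoff):
    """Mean number of neighbors per atom within `cutoff` (distance criterion)."""
    counts = []
    for i, p in enumerate(positions):
        n = sum(1 for j, q in enumerate(positions)
                if j != i and math.dist(p, q) <= cutoff)
        counts.append(n)
    return sum(counts) / len(counts)

# Toy planar fragment: one threefold (sp2-like) atom and its three neighbors.
bond = 1.42  # Angstrom, typical C-C distance in graphitic carbon
positions = [(0.0, 0.0, 0.0)] + [
    (bond * math.cos(math.radians(a)), bond * math.sin(math.radians(a)), 0.0)
    for a in (90.0, 210.0, 330.0)
]
avg = average_coordination(positions, cutoff=1.7)  # cutoff is an assumed value
print(avg)
```

In this fragment the central atom has three neighbors and each outer atom has one. Restricting the average to atoms in the bending region of a tube is what would reveal a shift from threefold towards fourfold coordination.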
This observation of reversibility of the SWNT under the manipulation of an AFM tip for small bending angles is
Fig. 14. Simulation of the deformation of a (5,5) nanotube under the manipulation of an AFM tip. The top four figures are the equilibrium configurations at $\theta = 0^\circ$, $7^\circ$, $11^\circ$, and $15^\circ$, respectively, during the pushing action of the tip, while the bottom three figures give the equilibrium configurations corresponding to $\theta = 11^\circ$, $7^\circ$, and $0^\circ$, respectively, as the tip is being withdrawn. The accompanying figures in each case show the side view and top view of the bending region. The figures are reproduced from Fig. 2 in Ref. [153].
consistent with the experimental observation. It indicates that while the system as a whole (the bent SWNT and the tip) is in its equilibrium configuration, the structure of the bent SWNT by itself, at least for $\theta \ge 11^\circ$, is very unstable. It exists entirely due to the anchoring of the tip nearby. As the tip is being pulled back, the extremely unstable structure of the bent SWNT immediately starts to recover from its precarious configuration, thus putting stress on the extra bond formed in the change from an sp$^2$ to an sp$^3$ bonding configuration for the atoms in the central section of the SWNT and in the proximity of the tip. The stress eventually breaks the extra bond and the SWNT returns to its unbent configuration as the tip pulls away.
The conductance of the SWNT was calculated by connecting it to two semi-infinite leads, left (L) and right (R). In the calculation, both leads are chosen to be ideal (5,5) SWNTs. In this way, the conductance can be expressed as [155]

$$G = \frac{2e^2}{h}\,\mathrm{Tr}(\Gamma_L G^r_S \Gamma_R G^a_S), \qquad (6.13)$$

where $G^{a(r)}_S$ is the advanced (retarded) Green's function of the sample (the bent NT) and is given by [153]

$$G^{a(r)}_S = \{E S_S - h_S - \bar{\tau}_{SL}\, g^{a(r)}_L\, \bar{\tau}_{LS} - \bar{\tau}_{SR}\, g^{a(r)}_R\, \bar{\tau}_{RS}\}^{-1}, \quad \text{where } \bar{\tau}_{SL(SR)} = E S_{SL(SR)} - \tau_{SL(SR)}, \qquad (6.14)$$

with $h_S$ being the Hamiltonian of the bent NT (the sample), $S_S$ the overlap matrix of the sample, $S_{SL(SR)}$ the overlap matrix between the sample and the left (right) lead, $\tau_{SL}$ ($\tau_{SR}$) the coupling between the sample and the left (right) lead, and $g^{a(r)}_L$ ($g^{a(r)}_R$) the advanced (retarded) Green's function for the semi-infinite left (right) lead. Since the leads are ideal semi-infinite (5,5) NTs, one obtains

$$g^{a(r)}_L = g^{a(r)}_R = \{(E \pm i\varepsilon) S_0 - h_0\}^{-1}, \qquad (6.15)$$

where $h_0$ and $S_0$ are the Hamiltonian and the overlap matrix for the semi-infinite (5,5) NT, respectively. The coupling term $\Gamma_{L(R)}$ can be expressed as [155]

$$\Gamma_{L(R)} = i\{\bar{\tau}_{SL(R)}\, g^r_{L(R)}\, \bar{\tau}_{L(R)S} - \bar{\tau}_{SL(R)}\, g^a_{L(R)}\, \bar{\tau}_{L(R)S}\}. \qquad (6.16)$$

Using Eqs. (6.13)–(6.16), the conductance corresponding to $\theta = 0^\circ$, $7^\circ$, $11^\circ$, and $15^\circ$ was computed. The result of the calculation is shown in Fig. 15. It can be seen that the conductance (in units of $2e^2/h$) at the Fermi energy, $E_F$, changes from 2.0 for $\theta = 0^\circ$ to 0.01 for $\theta = 15^\circ$, a change of two orders of magnitude, consistent with the experimental observation. Apparently the change from an sp$^2$ to an sp$^3$ bonding configuration for atoms in the bending region due to the inclusion of the tip and the action of the tip in the simulation had induced this drastic reduction in the conductance.
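To illustrate how Eqs. (6.13)–(6.16) fit together, the sketch below evaluates them for the simplest possible case: a single-orbital "sample" site between two ideal semi-infinite chains in an orthogonal basis, so that all overlap matrices reduce to the identity and the effective coupling is just the hopping. This toy model is an assumption made for illustration and is not the (5,5) nanotube calculation of Ref. [153]. The closed-form lead self-energy is the standard result for a one-dimensional nearest-neighbor chain, and the function names are hypothetical.

```python
import cmath

def lead_self_energy(energy, hopping, eta=1e-9):
    """Retarded self-energy of a semi-infinite 1D chain (on-site energy 0)."""
    z = complex(energy, eta)
    root = cmath.sqrt(z * z - 4.0 * hopping**2)
    sigma = (z - root) / 2.0
    # Pick the retarded branch, whose imaginary part is not positive.
    return sigma if sigma.imag <= 0.0 else (z + root) / 2.0

def transmission(energy, site_energy, hopping=1.0):
    """Tr(Gamma_L G^r Gamma_R G^a) for one site between two ideal chains."""
    sigma_l = lead_self_energy(energy, hopping)
    sigma_r = lead_self_energy(energy, hopping)
    g_ret = 1.0 / (energy - site_energy - sigma_l - sigma_r)
    g_adv = g_ret.conjugate()
    # Gamma = i (Sigma^r - Sigma^a), with Sigma^a the conjugate of Sigma^r.
    gamma_l = 1j * (sigma_l - sigma_l.conjugate())
    gamma_r = 1j * (sigma_r - sigma_r.conjugate())
    return (gamma_l * g_ret * gamma_r * g_adv).real

t_perfect = transmission(0.0, site_energy=0.0)  # defect-free chain
t_defect = transmission(0.0, site_energy=2.0)   # shifted on-site level
t_outside = transmission(3.0, site_energy=0.0)  # energy outside the lead band
print(t_perfect, t_defect, t_outside)
```

The conductance is the transmission multiplied by 2e²/h. The defect-free chain transmits one full channel, a local perturbation of the sample site reduces the transmission, and there is no transmission outside the band of the leads.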
7. Choosing an O(N) scheme

In this review, we have discussed various O(N) algorithms and some examples of their applications. It is now time to take stock of the situation and to address the question of whether some useful guidelines can be set up regarding the choice of a particular O(N) scheme to be used for a certain problem and under a given set of conditions. Disregarding various approximations involved in the implementation of different O(N)
Fig. 15. Conductance vs. energy for a (5,5) nanotube at various bending angles ($\theta$) and strains (O). The Fermi energy is at E = 0. Reproduced from Fig. 4 in Ref. [149].
procedures for the moment, an O(N) procedure can only be as good as the methodologies, namely either first principles or tight-binding/semi-empirical Hamiltonians, that form the basis of its approach. In this regard, an O(N) procedure based on first principles methods is expected to yield more reliable prediction of system properties than the one based on tight-binding/semi-empirical Hamiltonians. However, the calculation steps in the self-consistent DFT/first principles methods for systems of large but finite sizes are often dominated by the nearly O(N) procedures for constructing the Hamiltonian and calculating the energy integrals (see Section 5.2). Thus the implementation of the O(N) methodologies for the calculations of the total energy and atomic forces alone will not bring about a substantial speedup of the computations because of this overhead. Hierse and Stechel [29,156] addressed this issue by exploring the possibility of the transferability of local electronic structure information. However, they had not had any concrete success. This is probably why most of the applications of O(N) methods have been implemented in the framework of tight-binding approaches. The drawback here is simply the transferability of the tight-binding Hamiltonian (see Section 5.2). The development of tight-binding Hamiltonians with a wide range of transferability is now being pursued by many workers in the field [87,89,90,92], but with only very limited success. Hence, in some sense, the usefulness of the O(N) methods as predictive tools for studying systems of realistic sizes is hampered, not by the O(N) procedures, but by the shortcomings of the methodologies which the O(N) procedure is built on. As discussed in Section 6, five O(N) procedures have been applied to study real problems which explicitly involve structure optimization via MD simulations.
These five procedures are: the divide and conquer (DC) method, the Fermi operator expansion (FOE) method, the O(N)/NOTB-MD method, the density matrix minimization (DMM) method, and the localized
orbitals minimization (LOM) method. We discuss, in this section, the key features of these methods relevant to their implementation, efficiency, and accuracy, while disregarding the computational costs related to the construction of the Hamiltonian.

7.1. The DC method

The key features of the DC method can be summarized as follows:
• The system under consideration is divided into disjoint subsystems via partition matrices. Local Hamiltonians are defined as the projection of the system Hamiltonian in the subsystems. The calculation of the truncated local density matrix in real space is implemented by direct diagonalization of the local Hamiltonians.
• Its framework allows the implementation in semi-empirical/tight-binding approaches as well as DFT/first principles approaches.
• The accuracy can be improved by a systematic increase in the size of the region of localization. The application of the method to a particular problem is guided by a compromise between accuracy and efficiency.

7.2. The FOE method

For the FOE method, the key features in its implementation are:
• The density operator is replaced by the Fermi operator at a finite temperature. The Fermi operator is then represented by a truncated Chebyshev polynomial. Numerical experiments suggest that a Chebyshev polynomial of degree $m \approx 1.5(\varepsilon_{\max} - \varepsilon_{\min})/k_B T$ is sufficient to yield a converged result.
• For metallic and small gap systems, the elegant energy renormalization group method proposed by Baer and Head-Gordon [157] (see Section 7.7) may be used to circumvent the problem associated with choosing an appropriate temperature.
• The generalization of the FOE method for a non-orthogonal basis set could be cumbersome and computationally more costly.
• The accuracy and the efficiency of the method depend on the combination of the choices of the size of the region of localization and the temperature.

7.3. The O(N)/NOTB-MD scheme

Key features related to the O(N)/NOTB-MD scheme are:
• The density matrix is identified with the general Green's function in real space. The evaluation of the local density matrix can be carried out using the method of real space Green's function or the direct method within the region of localization.
• While the method has only been implemented in the context of a non-orthogonal tight-binding Hamiltonian, its framework allows convenient implementation of DFT/first principles approaches.
• The efficiency as well as the accuracy of the method is controlled by the size of the region of localization. A judicious choice of the size of the region of localization is arrived at as a compromise between the accuracy and the efficiency.

7.4. The DM method

The issues in the implementation of the DMM method include:
• The variational scheme involves the minimization of the grand potential $\Omega$ with respect to truncated local density matrix elements.
• The requirement of the idempotency of the density matrix is facilitated by the introduction of the McWeeny purification scheme within the framework of the grand potential.
• The possibility of having runaway solutions can be lessened [76] by first using a McWeeny iterative search which enables the truncated density matrix to converge quadratically towards idempotency. The resulting density matrix elements are then used as inputs to Li et al.'s variational scheme which maintains the idempotency to the first order in the minimization process.
• The extension of the scheme for a non-orthogonal basis set involves a significant degree of complication.

7.5. The LOM methods

Key features of the various schemes of the LOM methods are:
• The construction of the energy functional to be minimized is guided by the expansion of $S^{-1}$ in terms of S.
• The minimization of the energy functional is carried out in terms of localized orbitals.
• Utilizing orthogonal localized orbitals often leads to multiple local minima and flat regions where the minimization procedure can be trapped. Incorporation of information about the local bonding nature may lessen these problems but may diminish the effectiveness of the methods as predictive tools.
• Problems associated with multiple local minima can be reduced by allowing the number of localized orbitals to exceed the number of occupied states.

7.6. Some general remarks

There are three common factors which are important for a reliable and efficient implementation of any O(N) algorithm. These factors are: the quantity defining the region of localization; the efficient and reliable calculation of the atomic forces; the convenience of implementing parallel programming. We discuss in the following the relevant issues concerning these factors. In principle, the region of localization can be conveniently defined by a cutoff radius $R_c$ so that the local density matrix elements or the local orbitals are non-vanishing only for $R_{ij} \le R_c$ or $r \le R_c$. An appropriate choice of $R_c$ depends on the problem at hand. The accuracy of
the calculation can be systematically improved by increasing $R_c$. However, a constant $R_c$ may create unnecessary complications in determining properties of a system under consideration [110]. For example, if one were to use a constant $R_c$ to obtain the energy vs. volume curve, one may encounter situations where there are discontinuities in the curve. These discontinuities are the consequences of the fact that, for a constant $R_c$, the number of atoms seen by the atom in question is larger for a smaller volume than for a larger volume. This means that the energies calculated for smaller volumes will be more accurate than those for larger volumes, hence resulting in energy discontinuities in the energy vs. volume curve. The energy discontinuity can also occur in MD simulations at high temperatures as atoms can diffuse into and out of the region of localization with a constant $R_c$. The simplest remedy to this type of problem is to use a cutoff $N_c$ which is defined as the number of closest neighbors of a given atom.

In MD simulations, it is important to maintain consistency between the total energy and atomic forces. For tight-binding and non-self-consistent approaches, an accurate determination of atomic forces can be achieved in an efficient manner because the forces can be calculated by the Hellmann–Feynman theorem. However, the procedures for an accurate determination of atomic forces become cumbersome and complicated for self-consistent tight-binding/semi-empirical and first principles approaches.

The O(N) algorithms based on the direct approach (the DC method, the FOE method, and the O(N)/NOTB-MD scheme) are intrinsically parallel algorithms. Therefore, it is quite convenient to implement parallel programming for these schemes. There are also reported works on parallel algorithms for variational approach-based O(N) schemes. But the procedures are much more complicated and their implementations are more cumbersome.
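The difference between a fixed cutoff radius and a fixed number of neighbors for the localization region can be seen in a toy example. The sketch below uses a one-dimensional chain of equally spaced atoms, which is an illustrative assumption and not a system studied in Ref. [110]; the function names and the numerical values of the cutoffs are hypothetical.

```python
def neighbors_within_radius(spacing, r_cut, n_side=20):
    """Number of atoms within r_cut of the central atom of a 1D chain."""
    return sum(1 for k in range(-n_side, n_side + 1)
               if k != 0 and abs(k) * spacing <= r_cut)

def nearest_neighbors(spacing, n_c, n_side=20):
    """Distances to the n_c closest atoms; the count is fixed by construction."""
    distances = sorted(abs(k) * spacing
                       for k in range(-n_side, n_side + 1) if k != 0)
    return distances[:n_c]

R_CUT, N_C = 2.5, 4  # assumed cutoff radius and neighbor count
for spacing in (1.0, 1.2, 1.3):
    in_radius = neighbors_within_radius(spacing, R_CUT)
    in_count = len(nearest_neighbors(spacing, N_C))
    print(f"spacing={spacing}: fixed R_c -> {in_radius} atoms, "
          f"fixed N_c -> {in_count} atoms")
```

As the chain is expanded, the number of atoms inside the fixed radius drops abruptly once a shell of neighbors crosses the cutoff, whereas the neighbor-count definition keeps the size of the localization region constant.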
There are reported works in the literature on the comparison of some of the O(N) algorithms in terms of their accuracy and efficiency [158–160]. Since our emphasis is on the O(N) methods which scale linearly with respect to the size for the calculation of the total energy as well as the atomic forces, the work which is most relevant is the one reported by Scuseria and coworkers [150,151]. In this work, Scuseria et al. concluded that, for (orthogonal) tight-binding Hamiltonians, the FOE method is slightly faster than a method based on the DMM approach while requiring slightly more memory. However, when they implemented the O(N) schemes based on a semi-empirical approach (an AM1 Hamiltonian), they found that the FOE method is much slower than the DMM-based methods. This is because the calculation based on the semi-empirical methods requires the self-consistent determination of the density matrix. This comparison indicates that the choice of a particular O(N) method to be used in a given situation depends critically on the choice of the "Hamiltonian". Therefore caution must be exercised in deciding on the most appropriate "Hamiltonian" for a given situation before one decides on a particular O(N) scheme.

7.7. Recent refinement on the FOE method: energy renormalization method [157]

As this review article is being written, there is a most recent work [157] on the refinement of one of the existing O(N) algorithms, namely the FOE method. We discuss this work as follows.
S.Y. Wu, C.S. Jayanthi / Physics Reports 358 (2002) 1–74
Within the context of the FOE method, maintaining a precision of calculation of the order $10^{-D}$ requires [16]

$$\beta\,\Delta\varepsilon/2 > D\ln(10), \tag{7.1}$$

where $\Delta\varepsilon$ is the HOMO-LUMO gap. For systems with small $\Delta\varepsilon$, or for metals, this requirement becomes a source of trouble, since the expansion length $P$ for a uniformly convergent Chebyshev representation of the Fermi operator is given by [16]

$$P = \tfrac{2}{3}(D - 1)\,\beta\,\Delta E. \tag{7.2}$$
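To see why a small gap is troublesome, Eqs. (7.1) and (7.2) can be combined numerically. In this sketch $\Delta E$ is assumed to be the energy range spanned by the spectrum, the bound (7.1) is taken at equality, and the numbers (a 40 eV range, six digits of precision) are hypothetical.

```python
import math

def min_inverse_temperature(gap, digits):
    """Inverse temperature at which Eq. (7.1) holds with equality: beta = 2 D ln(10) / gap."""
    return 2.0 * digits * math.log(10.0) / gap

def chebyshev_length(beta, energy_range, digits):
    """Expansion length of Eq. (7.2): P = (2/3) (D - 1) beta * dE."""
    return (2.0 / 3.0) * (digits - 1) * beta * energy_range

DIGITS = 6           # hypothetical target precision 10^-6
ENERGY_RANGE = 40.0  # hypothetical spectral range in eV

# Expansion length for a 1 eV gap and for a 0.1 eV gap.
p_gap_1 = chebyshev_length(min_inverse_temperature(1.0, DIGITS), ENERGY_RANGE, DIGITS)
p_gap_01 = chebyshev_length(min_inverse_temperature(0.1, DIGITS), ENERGY_RANGE, DIGITS)
ratio = p_gap_01 / p_gap_1
```

Under these assumptions the required expansion length grows inversely with the gap, so a tenfold smaller gap needs a tenfold longer Chebyshev expansion.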
Baer and Head-Gordon [157] proposed a telescopic series for the expansion of the density operator

$$\rho = \theta(\mu - H) \tag{7.3}$$

such that

$$\rho = F_{\beta_0} + (F_{\beta_1} - F_{\beta_0}) + (F_{\beta_2} - F_{\beta_1}) + \cdots. \tag{7.4}$$

Note that $\theta(\varepsilon)$ is the Heaviside function, $F$ the Fermi operator, and $\beta_n = q^n\beta_0$ with $q > 1$. When the initial inverse temperature $\beta_0$ is chosen small, $F_{\beta_0}$ will be strongly localized. Long range correlations of the density operator quenched by $\beta_0$ are systematically corrected by the successive terms $\Delta_n = F_{\beta_n} - F_{\beta_{n-1}}$. These terms are progressively more delocalized. However, they are non-zero only in an energy range of the order $\beta_{n-1}^{-1}$ about the Fermi level. Hence the quenching of the temperature by a factor of $q$ also allows the energy to scale down by the factor $1/q$. Using Eq. (7.4), the band structure energy can be expressed as

$$E_{BS} = \mathrm{Tr}\{H F_{\beta_0}\} + \mathrm{Tr}\{H\Delta_1\} + \mathrm{Tr}\{H\Delta_2\} + \cdots. \tag{7.5}$$
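The telescopic decomposition of Eqs. (7.4) and (7.5) can be checked on a toy model. The sketch below is only an illustration under stated assumptions: the Hamiltonian is a hypothetical 40-site tight-binding chain with alternating on-site energies (gap of 1 around $\mu = 0$), and each Fermi operator is built by exact diagonalization rather than by the Chebyshev expansion and subspace reduction used in Ref. [157].

```python
import numpy as np

def fermi_operator(h, mu, beta):
    """F_beta = 1 / (1 + exp(beta (H - mu))), built by exact diagonalization."""
    eps, vecs = np.linalg.eigh(h)
    x = np.clip(beta * (eps - mu), -700.0, 700.0)  # avoid overflow in exp
    occ = 1.0 / (1.0 + np.exp(x))
    return (vecs * occ) @ vecs.T

# Hypothetical gapped chain: hopping -1, on-site energies +0.5 / -0.5.
n_sites = 40
idx = np.arange(n_sites)
h = np.zeros((n_sites, n_sites))
h[idx[:-1], idx[1:]] = -1.0
h[idx[1:], idx[:-1]] = -1.0
h[idx, idx] = np.where(idx % 2 == 0, 0.5, -0.5)
mu = 0.0

# beta_n = q^n beta_0; the terms are F_{beta_0} followed by Delta_1, Delta_2, ...
beta0, q, n_terms = 0.5, 2.0, 10
betas = [beta0 * q**n for n in range(n_terms + 1)]
terms = [fermi_operator(h, mu, betas[0])]
for n in range(1, n_terms + 1):
    terms.append(fermi_operator(h, mu, betas[n]) - fermi_operator(h, mu, betas[n - 1]))

rho_series = sum(terms)                              # Eq. (7.4), truncated
e_bs_series = sum(np.trace(h @ t) for t in terms)    # Eq. (7.5), truncated

# Reference: zero-temperature density operator theta(mu - H).
eps, vecs = np.linalg.eigh(h)
occupied = mu > eps
rho_exact = (vecs * occupied) @ vecs.T
e_bs_exact = float(eps[occupied].sum())
```

Because the sum telescopes, the truncated series equals $F_{\beta_n}$ at the last inverse temperature, which for a gapped spectrum is indistinguishable from the step function once $\beta_n\,\Delta\varepsilon$ is large.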
Eq. (7.5) indicates that the progressively longer range correlations are accounted for by the consecutive terms involving $\Delta_n = F_{\beta_n} - F_{\beta_{n-1}}$. These terms can, however, be evaluated in increasingly smaller subspaces. Specifically, because the change in $\beta$ scales as $q$ while the change in the relevant energy interval scales as $1/q$ in each step, the calculation of $\Delta_n$ can be accomplished by a Chebyshev expansion of the same length $P$. The calculation as prescribed by Eq. (7.5) can be terminated with a final explicit diagonalization when the subspace at the final step is sufficiently small. The CPU time for the energy renormalization method has been found to scale as $N(\ln N)^2$.

8. Epilogue

The development of O(N) methods opens up the possibility of predicting the equilibrium structure of systems of realistic sizes. Once the equilibrium structure of a system is determined, many other properties of the system can be calculated. This development is timely, as materials research has entered the realm of nano-scale materials, where the lack of symmetry dictates that systems with a large number of degrees of freedom must be treated. The calculation of the total energy and the energy optimization process for
the determination of the structure for such systems can only be handled using the O(N) approach. The possibility of using a quantum mechanics-based O(N)-MD scheme to predict the stable structures of nano-materials will provide guidelines for the fabrication of new materials with designed properties. For ab initio method-based O(N) schemes, it is now feasible to carry out MD simulations for systems of up to several hundred atoms. The limiting factor here is the construction of the Hamiltonian and the calculation of the energy integrals. For tight-binding/semi-empirical approach-based O(N) schemes, MD simulations of systems of thousands of atoms can routinely be carried out on workstations. Much larger systems can be studied on supercomputers. The limiting factor is the transferability of the Hamiltonian used in the study. While it is important to continue to improve the existing O(N) schemes by reducing or eliminating these limiting factors, these O(N) schemes can nevertheless be profitably used as predictive tools for studying systems of realistic sizes as long as care is exercised with respect to these limitations.

Acknowledgements

We would like to thank Professor Alexei A. Maradudin for his encouragement and patience. We would also like to acknowledge the comments by Dr. Shudun Liu, the assistance in preparing the tables by Dr. Ming Yu, and the assistance in preparing the figures by Chris Leahy. This work was supported by an NSF grant (DMR-9802274) and a US Department of Energy grant (DE-FG02-00ER45832).

References

[1] W. Yang, Phys. Rev. Lett. 66 (1991) 1438.
[2] W. Yang, J.M. Pérez-Jordá, in: P. Schleyer (Ed.), Encyclopedia of Computational Chemistry, Wiley, New York, 1998, pp. 1496–1513.
[3] S. Goedecker, Rev. Mod. Phys. 71 (1999) 1085.
[4] M.Y. Laue, Ann. Phys. (Leipzig) 44 (4) (1914) 1197.
[5] J. Friedel, Adv. Phys. 3 (1954) 446.
[6] C. Kittel, Quantum Theory of Solids, Wiley, New York, 1963, p. 338.
[7] V. Heine, in: H. Ehrenreich, F. Seitz, D. Turnbull (Eds.), Solid State Physics, Vol. 35, Academic Press, New York, 1980, pp. 1–128.
[8] P.W. Anderson, Phys. Rev. Lett. 21 (1968) 13.
[9] W. Kohn, Int. J. Quantum Chem. 56 (1995) 229.
[10] W. Kohn, Phys. Rev. 115 (1959) 809.
[11] J. des Cloizeaux, Phys. Rev. 135 (1964) A685; ibid (1964) A698.
[12] W. Kohn, Chem. Phys. Lett. 208 (1993) 167.
[13] S. Ismail-Beigi, T. Arias, Phys. Rev. Lett. 82 (1999) 2127.
[14] See, for example, N. March, W. Young, S. Sampanthar, The Many-Body Problem in Quantum Mechanics, Cambridge University Press, Cambridge, England, 1967.
[15] S. Goedecker, O. Ivanov, Solid State Commun. 105 (1998) 665.
[16] R. Baer, M. Head-Gordon, Phys. Rev. Lett. 79 (1997) 3962.
[17] S. Baroni, P. Giannozzi, Europhys. Lett. 17 (1992) 547.
[18] L.-W. Wang, M. Teter, Phys. Rev. B 44 (1992) 12 798.
[19] W. Zhang, D. Tománek, G.F. Bertsch, Solid State Commun. 86 (1993) 607.
[20] F. Mauri, G. Galli, R. Car, Phys. Rev. B 47 (1993) 9973.
[21] X.-P. Li, R.W. Nunes, D. Vanderbilt, Phys. Rev. B 47 (1993) 10 891.
[22] M.S. Daw, Phys. Rev. B 47 (1993) 10 895.
[23] P. Ordejón, D. Drabold, M. Grumbach, R.M. Martin, Phys. Rev. B 48 (1993) 14 646.
[24] W. Kohn, Chem. Phys. Lett. 208 (1993) 167.
[25] E.B. Stechel, A.R. Williams, P.J. Feibelman, Phys. Rev. B 49 (1994) 10 088.
[26] F. Mauri, G. Galli, Phys. Rev. B 50 (1994) 4316.
[27] S. Goedecker, L. Colombo, Phys. Rev. Lett. 73 (1994) 122.
[28] R.W. Nunes, D. Vanderbilt, Phys. Rev. B 50 (1994) 17 611.
[29] W. Hierse, E.B. Stechel, Phys. Rev. B 50 (1994) 17 811.
[30] S. Goedecker, M. Teter, Phys. Rev. B 51 (1995) 9455.
[31] E. Hernández, M.J. Gillan, Phys. Rev. B 51 (1995) 10 157.
[32] J. Kim, F. Mauri, G. Galli, Phys. Rev. B 52 (1995) 1640.
[33] K.C. Pandey, A.R. Williams, J.F. Janak, Phys. Rev. B 52 (1995) 14 415.
[34] A.P. Horsfield, A.M. Bratkovsky, M. Fearn, D.G. Pettifor, M. Aoki, Phys. Rev. B 53 (1996) 12 694.
[35] P. Ordejón, E. Artacho, J.M. Soler, Phys. Rev. B 53 (1996) R10 441.
[36] A.F. Voter, J.D. Kress, R.N. Silver, Phys. Rev. B 53 (1996) 12 733.
[37] S.L. Dixon, K.M. Merz, J. Chem. Phys. 104 (1996) 6643.
[38] R.T. Gallant, A. St-Amant, Chem. Phys. Lett. 256 (1996) 569.
[39] W. Kohn, Phys. Rev. Lett. 76 (1996) 3168.
[40] S.K. Goh, A. St-Amant, Chem. Phys. Lett. 264 (1997) 9.
[41] W. Yang, Phys. Rev. B 56 (1997) 9294.
[42] J.M. Millam, G.E. Scuseria, J. Chem. Phys. 106 (1997) 5569.
[43] S.L. Dixon, K.M. Merz, J. Chem. Phys. 107 (1997) 879.
[44] C.S. Jayanthi, S.Y. Wu, J.A. Cocks, N. Luo, Z.-L. Xie, M. Menon, G. Yang, Phys. Rev. B 57 (1998) 3799.
[45] U. Stephan, D. Drabold, Phys. Rev. B 57 (1998) 6391.
[46] D.W. Bullett, in: H. Ehrenreich, F. Seitz, D. Turnbull (Eds.), Solid State Physics, Vol. 35, Academic Press, New York, 1980, pp. 129–215.
[47] W.A. Harrison, Electronic Structure and the Properties of Solids, Dover Publications, Inc., New York, 1989.
[48] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, J.D. Joannopoulos, Rev. Mod. Phys. 64 (1993) 1045.
[49] G. Galli, A. Pasquarello, in: M.P. Allen, D.J. Tildesley (Eds.), Computer Simulation in Chemical Physics, Kluwer, Dordrecht, 1993, p. 281.
[50] R.P. Feynman, Phys. Rev. 56 (1939) 340.
[51] P. Pulay, Mol. Phys. 17 (1969) 197.
[52] Q. Zhao, W. Yang, J. Chem. Phys. 102 (1995) 9598.
[53] W. Yang, T.-S. Lee, J. Chem. Phys. 103 (1995) 5674.
[54] M.J. Gillan, J. Phys.: Condens. Matter 1 (1989) 689.
[55] A.P. Horsfield, A.M. Bratkovsky, Phys. Rev. B 53 (1996) 15 381.
[56] T.A. Arias, J.D. Joannopoulos, Phys. Rev. Lett. 73 (1994) 680.
[57] R.N. Silver, H. Röder, Int. J. Mod. Phys. C 5 (1994) 735.
[58] R.N. Silver, H. Röder, A.F. Voter, J.D. Kress, in: A. Teutner (Ed.), Simulation Multi-Conference '95 Proceedings, High Performance Computing '95, Society for Computer Simulation, San Diego, 1995, p. 200.
[59] R.N. Silver, H. Röder, A.F. Voter, J.D. Kress, J. Comput. Phys. 124 (1996) 115.
[60] L. Goodwin, A.J. Skinner, D.G. Pettifor, Europhys. Lett. 9 (1989) 701.
[61] H. Röder, R.N. Silver, D.A. Drabold, J.J. Dong, Phys. Rev. B 55 (1995) 15 382.
[62] B.A. McKinnon, T.C. Choy, Phys. Rev. B 52 (1995) 14 531.
[63] P.-O. Löwdin, J. Chem. Phys. 18 (1950) 365.
[64] M. Menon, K.R. Subbaswamy, Phys. Rev. B 55 (1997) 9231.
[65] R. Haydock, V. Heine, M.J. Kelly, J. Phys. C 8 (1975) 2591.
[66] C. Lanczos, J. Res. Nat. Bur. Stand. 45 (1950) 255.
[67] D.G. Pettifor, Phys. Rev. Lett. 63 (1989) 2480.
[68] M. Aoki, Phys. Rev. Lett. 71 (1993) 3842.
[69] A.P. Horsfield, Mat. Sci. Eng. B 37 (1996) 219.
[70] T.A. Arias, M.C. Payne, J.D. Joannopoulos, Phys. Rev. Lett. 69 (1992) 1077.
[71] G. Galli, M. Parrinello, Phys. Rev. Lett. 69 (1992) 3547.
[72] R. McWeeny, Rev. Mod. Phys. 32 (1960) 335.
[73] E. Hernández, M.J. Gillan, C.M. Goringe, Phys. Rev. B 53 (1996) 7147.
[74] C.M. Goringe, E. Hernández, M.J. Gillan, I.J. Bush, Comput. Phys. Commun. 102 (1997) 1.
[75] M.J. Gillan, D.R. Bowler, C.M. Goringe, E. Hernández, in: F. Yonezawa, K. Tsuzi, K. Kaji, M. Doi, T. Fujiwara (Eds.), The Physics of Complex Liquids, Proceedings of the International Symposium, 10–12 November, 1997, Nagoya, Japan, World Scientific, Singapore, 1998.
[76] D.R. Bowler, I.J. Bush, M.J. Gillan (Preprint).
[77] P. Ordejón, D. Drabold, M. Grumbach, R.M. Martin, Phys. Rev. B 51 (1995) 1456.
[78] S. Barnett, Matrices: Methods and Applications, Oxford University Press, New York, 1990.
[79] D.G. Pettifor, R. Podloucky, Phys. Rev. Lett. 53 (1984) 1080.
[80] A.T. Paxton, A.P. Sutton, C.M. Nex, J. Phys. C 20 (1987) L263.
[81] Th. Frauenheim, F. Weich, Th. Köhler, S. Uhlmann, D. Porezag, G. Seifert, Phys. Rev. B 52 (1995) 11 492.
[82] N. Bernstein, E. Kaxiras, Phys. Rev. B 56 (1997) 10 488.
[83] D. Porezag, Th. Frauenheim, Th. Köhler, G. Seifert, R. Kaschner, Phys. Rev. B 51 (1995) 12 947.
[84] G. Fabricious, A.M. Llois, N. Weissmann, Phys. Rev. B 44 (1991) 6870.
[85] A. Vega, J. Dorantes-Davila, L.C. Balbas, G.M. Pastor, Phys. Rev. B 47 (1993) 4742.
[86] G. Fabricious, A.M. Llois, N. Weissmann, M.A. Khan, Phys. Rev. B 49 (1994) 2121.
[87] R. Cohen, M. Mehl, D.A. Papaconstantopoulos, Phys. Rev. B 50 (1994) 14 694.
[88] X.S. Chen, J.J. Zhao, G.H. Wang, Z. Phys. D 35 (1995) 149.
[89] M.S. Tang, C.Z. Wang, C.T. Chan, K.M. Ho, Phys. Rev. B 53 (1996) 979.
[90] A.F. Kohan, G. Ceder, Phys. Rev. B 54 (1996) 805.
[91] S. Bouarab, A. Vega, J.A. Alonso, M.P. Iniguez, Phys. Rev. B 54 (1996) 3003.
[92] M. Mehl, D.A. Papaconstantopoulos, Phys. Rev. B 54 (1996) 4519.
[93] H. Haas, C.Z. Wang, M. Fähnle, C. Elsässer, K.M. Ho, Phys. Rev. B 57 (1998) 1461.
[94] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes, Cambridge University Press, Cambridge, 1992.
[95] S. Obara, A. Saika, J. Chem. Phys. 84 (1986) 3963.
[96] J.M. Pérez-Jordá, W. Yang, Chem. Phys. Lett. 241 (1995) 469.
[97] C.A. White, B.G. Johnson, P.M.W. Gill, M. Head-Gordon, Chem. Phys. Lett. 253 (1996) 268.
[98] M.C. Strain, G.E. Scuseria, M.J. Frisch, Science 271 (1996) 51.
[99] M. Challacombe, E. Schwegler, J. Almlöf, J. Chem. Phys. 104 (1996) 4685.
[100] L. Greengard, Science 265 (1994) 909.
[101] F. Gygi, Europhys. Lett. 19 (1992) 617.
[102] F. Gygi, Phys. Rev. B 48 (1993) 11 692.
[103] J.R. Chelikowsky, N. Troullier, Y. Saad, Phys. Rev. Lett. 72 (1994) 1240; see also J. Bernholc, E.L. Briggs, D.J. Sullivan, C.J. Brabec, M.B. Nardelli, K. Rapcewicz, C. Roland, M. Wensell, Int. J. Quantum Chem. 65 (1997) 531.
[104] K. Cho, T.A. Arias, J.D. Joannopoulos, P.K. Lam, Phys. Rev. Lett. 71 (1993) 1808.
[105] S. Wei, M.Y. Chou, Phys. Rev. Lett. 76 (1996) 2650.
[106] S. Goedecker, O. Ivanov, Solid State Commun. 105 (1998) 665.
[107] O.F. Sankey, D.J. Niklewski, Phys. Rev. B 40 (1989) 3979; see also D. Sanchez-Portal, E. Artacho, J.M. Soler, J. Phys.: Condens. Matter 8 (1996) 3859.
[108] T. Hoshi, M. Arai, T. Fujiwara, Phys. Rev. B 52 (1995) R5459.
[109] G. Galli, F. Mauri, Phys. Rev. Lett. 73 (1994) 3471.
[110] S.Y. Qiu, C.Z. Wang, K.M. Ho, C.T. Chan, J. Phys.: Condens. Matter 6 (1994) 9153.
[111] L.K. Hansan, B. Stokbro, B. Lundquist, K. Jacobsen, D. Deaven, Phys. Rev. Lett. 75 (1995) 4444.
[112] R. Nunes, J. Bennetto, D. Vanderbilt, Phys. Rev. Lett. 77 (1996) 1516.
[113] D.M. York, T.-S. Lee, W. Yang, J. Am. Chem. Soc. 118 (1996) 10 940.
[114] D.M. York, T.-S. Lee, W. Yang, Chem. Phys. Lett. 263 (1996) 297.
[115] C. Xu, G. Scuseria, Chem. Phys. Lett. 262 (1996) 219.
[116] J. Kim, J. Wilkins, F. Khan, A. Canning, Phys. Rev. B 55 (1997) 16 186.
[117] A. Canning, G. Galli, J. Kim, Phys. Rev. Lett. 78 (1997) 4442.
[118] J. Kim, G. Galli, J.W. Wilkins, A. Canning, J. Chem. Phys. 108 (1998) 2631.
[119] P. Ajayan, V. Ravikumar, C. Charlier, Phys. Rev. Lett. 81 (1998) 1437.
[120] S. Ismail-Beigi, T. Arias, Phys. Rev. B 57 (1998) 11 923.
[121] J.D. Kress, S.R. Bickham, L.A. Collins, B.L. Holian, S. Goedecker, Phys. Rev. Lett. 83 (1999) 3896.
[122] S. Liu, C.S. Jayanthi, S.Y. Wu, X. Qin, Z. Zhang, M. Lagally, Phys. Rev. B 61 (2000) 4421.
[123] D. York, J.P. Lu, W. Yang, Phys. Rev. B 49 (1994) 8526.
[124] J.P. Lu, W. Yang, Phys. Rev. B 49 (1994) 11 421.
[125] S. Itoh, P. Ordejón, D. Drabold, R.M. Martin, Phys. Rev. B 53 (1996) 2132.
[126] J. Lewis, P. Ordejón, O. Sankey, Phys. Rev. B 55 (1997) 6880.
[127] D. Sanchez-Portal, P. Ordejón, E. Artacho, J.M. Soler, Int. J. Quantum Chem. 65 (1997) 453.
[128] D. Ugarte, Nature (London) 359 (1992) 707; Europhys. Lett. 22 (1993) 45.
[129] J. Tersoff, Phys. Rev. B 46 (1992) 15 546.
[130] T.A. Witten, H. Li, Europhys. Lett. 23 (1993) 51.
[131] M. Yoshida, E. Osawa, Fullerene Sci. Tech. 1 (1993) 55.
[132] A. Maiti, C. Brabec, J. Bernholc, Phys. Rev. Lett. 70 (1993) 3023.
[133] P. Ajayan, C. Colliex, P. Bernier, J.M. Lambert, Microsc. Microanal. Microstruct. 4 (1993) 501.
[134] C.H. Kiang, W.A. Goddard, R. Beyers, D.S. Bethune, J. Phys. Chem. 100 (1996) 3749.
[135] A.J. Stone, D.J. Wales, Chem. Phys. Lett. 128 (1986) 501.
[136] J. van Wingarden, A. Van Dam, M.J. Haye, P.M.L.O. Scholte, F. Tuinstra, Phys. Rev. B 55 (1997) 4723.
[137] X.R. Qin, M. Lagally, Science 278 (1997) 1444.
[138] W. Wulfhekel, B.J. Hattink, H.J.W. Zandvliet, G. Rosenfeld, B. Poelsema, Phys. Rev. Lett. 79 (1997) 2494.
[139] P.J. Bedrossian, Phys. Rev. Lett. 74 (1995) 3648.
[140] D.R. Alfonso, S.Y. Wu, C.S. Jayanthi, E. Kaxiras, Phys. Rev. B 59 (1999) 7745.
[141] C.H. Xu, C.Z. Wang, C.T. Chan, K.M. Ho, J. Phys.: Condens. Matter 4 (1992) 6047.
[142] P.M. Fahey, P.B. Griffin, J.D. Plummer, Rev. Mod. Phys. 61 (1989) 289.
[143] A.E. Michel, W. Rausch, P.A. Ronsheim, R.H. Kastl, Appl. Phys. Lett. 50 (1987) 416.
[144] D.J. Eaglesham, P.A. Stolk, H.-J. Gossmann, J.M. Poate, Appl. Phys. Lett. 65 (1994) 2305.
[145] P.A. Stolk, H.-J. Gossmann, D.J. Eaglesham, J.M. Poate, Nucl. Instrum. Methods Phys. Sect. B 96 (1995) 187.
[146] L.G. Salisbury, M.H. Loretto, Philos. Mag. A 39 (1979) 317.
[147] C.A. Ferreira Lima, A. Howie, Philos. Mag. 34 (1976) 1057.
[148] I. Kwon, R. Biswas, C.Z. Wang, K.M. Ho, C.M. Soukoulis, Phys. Rev. B 49 (1994) 7242.
[149] T. Tombler, C. Zhou, L. Alexseyev, J. Kong, H. Dai, L. Liu, C.S. Jayanthi, M. Tang, S.Y. Wu, Nature 405 (2000) 771.
[150] S. Paulson, M. Falvo, N. Snider, A. Helser, T. Hudson, A. Seeger, R. Taylor, R. Superfine, S. Washburn, Appl. Phys. Lett. 75 (1999) 2936.
[151] M. Nardelli, J. Bernholc, Phys. Rev. B 60 (1999) 16 338.
[152] A. Rochefort, D. Salahub, P. Avouris, Chem. Phys. Lett. 297 (1998) 45.
[153] L. Liu, C.S. Jayanthi, M. Tang, S.Y. Wu, T. Tombler, C. Zhou, L. Alexseyev, J. Kong, H. Dai, Phys. Rev. Lett. 84 (2000) 4950.
[154] M. Menon, K.R. Subbaswamy, M. Sawtarie, Phys. Rev. B 48 (1993) 8398.
[155] S. Datta, Electronic Transport in Mesoscopic Systems, Cambridge University Press, Cambridge, 1995.
[156] W. Hierse, E.B. Stechel, Phys. Rev. B 54 (1996) 16 515.
[157] R. Baer, M. Head-Gordon, Phys. Rev. B 58 (1998) 15 296.
[158] D. Bowler, M. Aoki, C. Goringe, A. Horsfield, D. Pettifor, Modelling Simul. Mater. Sci. Eng. 5 (1997) 199.
[159] K.R. Bates, A.D. Daniels, G.E. Scuseria, J. Chem. Phys. 109 (1998) 3308.
[160] A.D. Daniels, G.E. Scuseria, J. Chem. Phys. 110 (1999) 1321.
INTRODUCTION TO THE THEORY OF ELECTRONIC NON-ADIABATIC COUPLING TERMS IN MOLECULAR SYSTEMS
Michael BAER
AMSTERDAM – LONDON – NEW YORK – OXFORD – PARIS – SHANNON – TOKYO
Physics Reports 358 (2002) 75–142
Introduction to the theory of electronic non-adiabatic coupling terms in molecular systems

Michael Baer
Applied Physics Division, Soreq NRC, Yavne 81800, Israel
Received May 2001; editor: S. Peyerimhoff

Contents
1. Introduction
2. The Born–Oppenheimer treatment
2.1. The Born–Oppenheimer equations for a complete Hilbert space
2.2. The Born–Oppenheimer equation for a (finite) sub-Hilbert space
3. The adiabatic-to-diabatic transformation
3.1. The derivation of the adiabatic-to-diabatic transformation matrix
3.2. The necessary condition for having a solution for the adiabatic-to-diabatic transformation matrix
4. The adiabatic-to-diabatic transformation matrix and the line integral approach
4.1. The necessary conditions for obtaining single-valued diabatic potentials and the introduction of the topological matrix
4.2. The approximate adiabatic-to-diabatic transformation matrix
5. The quantization of the non-adiabatic coupling matrix
5.1. The quantization as applied to model systems
5.2. The treatment of the general case
6. The construction of sub-Hilbert spaces and sub–sub-Hilbert spaces
6.1. The construction of sub-Hilbert spaces
6.2. The construction of sub–sub-Hilbert spaces
7. The topological spin
8. An analytical derivation for the possible sign flips in a three-state system
9. The geometrical interpretation for sign flips
10. The multi-degenerate case
11. The extended approximate Born–Oppenheimer equation
11.1. Introductory remarks
11.2. The Born–Oppenheimer approximation as applied to an M-dimensional model
11.3. The gauge invariance condition for the approximate Born–Oppenheimer equations and the Bohr–Sommerfeld quantization of the non-adiabatic coupling matrix
12. The adiabatic-to-diabatic transformation matrix and the Wigner rotation matrix
12.1. Wigner rotation matrices
12.2. The adiabatic-to-diabatic transformation matrix and Wigner's $d^j$-matrix
13. Studies of specific systems
13.1. The study of 'real' two-state molecular systems
13.2. The study of a tri-state model system
14. Summary and conclusions
Acknowledgements
Appendix A. The Jahn–Teller model and the Longuet-Higgins phase
Appendix B. The sufficient conditions for having an analytic adiabatic-to-diabatic transformation matrix
B.1. Orthogonality
B.2. Analyticity
Appendix C. On the single/multi-valuedness of the adiabatic-to-diabatic transformation matrix
Appendix D. The diabatic representation
References

E-mail address: [email protected] (M. Baer).
© 2002 Elsevier Science B.V. All rights reserved. 0370-1573/02/$ - see front matter PII: S0370-1573(01)00052-7
Abstract

The Born–Oppenheimer treatment leads to the adiabatic framework where the non-adiabatic terms are the physical entities responsible for the coupling between adiabatic states. The main disadvantage of this treatment is in the fact that these coupling terms frequently become singular, thus causing difficulties in solving the relevant Schroedinger equation for the motion of the nuclei that make up the molecular systems. In this review, we present the line integral approach which enables the formation of the adiabatic-to-diabatic transformation matrix that yields the friendlier diabatic framework. The review concentrates on the mathematical conditions that allow the rigorous derivation of the adiabatic-to-diabatic transformation matrix and its interesting physical properties. One of the findings of this study is that the non-adiabatic coupling terms have to be quantized in a certain manner in order to yield single-valued diabatic potentials. Another important feature revealed is the existence of the topological matrix, which contains all the topological features of a given molecular system related to a closed contour in configuration space. Finally, we present an approximation that results from the Born–Oppenheimer treatment which, in contrast to the original Born–Oppenheimer approximation, contains the effect of the non-adiabatic coupling terms. The various derivations are accompanied by examples which in many cases are interesting by themselves. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 31.15.Kb; 31.50.+w; 31.70.−f; 34.20.−b; 34.20.Gj; 34.20.Mq

Keywords: Born–Oppenheimer treatment; Electronic non-adiabatic coupling; Diabatization; Line integral; Quantization; Topological matrix
1. Introduction

Electronic non-adiabatic effects are an outcome of the Born–Oppenheimer (BO) treatment and as such are a result of the distinction between the fast moving electrons and the slow moving nuclei [1,2]. The non-adiabatic coupling terms (NACT), together with the potential energy surfaces (PES), which are also an outcome of the BO treatment, are the origin of the driving forces which govern the motion of the atoms in molecular systems. The NACTs couple the various adiabatic PESs just like potential coupling terms do within a diabatic framework. Indeed, they are considered as such, for instance, while studying charge transfer processes during atomic and molecular collisions [3].

In the late 1950s and the beginning of the 1960s Longuet-Higgins (LH) and colleagues [4–7] discovered one of the more fundamental features in molecular physics related to the BO electronic adiabatic eigenfunctions. They found that these functions, when surrounding a point of degeneracy in configuration space (CS), may acquire a phase which leads to a flip of sign of these functions. Later this feature was explicitly demonstrated by Herzberg and Longuet-Higgins [6] for the Jahn–Teller conical intersection (CI) model [8–11] (see Appendix A). This interesting observation implies that if a molecular system possesses a CI at a point in CS, the relevant electronic eigenfunctions, which are parametrically dependent on the nuclear coordinates, are multi-valued (this finding was later confirmed in a numerical calculation [12]). No hints were given to the fact that this phenomenon is connected, in some way, to the BO NACTs.
In 1960, Hobey and McLachlan [13] discussed a transformation (henceforth to be termed the 'adiabatic-to-diabatic transformation' or, concisely, the ADT) to eliminate the NACTs from the BO close-coupling equations with the aim of reaching (what a few years later was termed) the diabatic framework [14], and got as far as generating the first order differential equations to determine the transformation matrix elements. In a subsequent publication, McLachlan [15] dropped the whole idea as being 'inconsistent' and tried other ways to achieve his goal. In 1969, Smith [16] considered a di-atom system, eliminated the radial NACT from the BO close coupling equations and obtained the corresponding diabatic representation. In 1975, the present author suggested deriving the ADT matrix for a tri-atom system by solving an integral equation along a two-dimensional contour [17]. This integral equation, henceforth to be termed the line integral, reduces, for the two-state case, to an ordinary integral along a contour, over the corresponding NACT, to calculate the ADT angle. In addition, the sufficient conditions that guarantee the existence and uniqueness of the integral equation solution (along a contour in a given region in CS) were derived. Moreover, it was shown that these conditions, later termed the 'Curl' conditions, are fulfilled by the system of BO eigenfunctions which span a full Hilbert space. In 1980, the LH phase [5] was employed in order to form what can be termed an extended BO approximation [19]. In 1982, Mead and Truhlar [20], who followed in the footsteps of McLachlan [15], stated that the diabatic framework is out of reach because the 'Curl' condition, just mentioned, can never be fulfilled in a molecular system since the electronic manifold forms an unbreakable infinite Hilbert space.
In 1984, the LH phase got a tremendous boost by the exciting exposure of the novel adiabatic phase—termed the topological (Berry) phase [21]—an unavoidable feature for a system which contains fast moving parts (e.g. electrons) and which is driven by a slowly moving external
field (e.g. vibrating/rotating molecules). Berry suggested that the LH phase is a good example for the existence of such a phase in molecular systems. In 1988, Pacher et al. [18] made the ansatz that one can always find a group of states which are strongly coupled to each other but only weakly coupled to states outside this group. In 1992, Baer and Englman suggested that the topological phase related to molecular systems, as well as the LH phase, should be identified with the ADT angle as calculated for a two-state system [22]. A similar idea was expressed, independently, by Aharonov et al. [23]. In 1997, Baer and Englman [24] presented their version of the extended BO approximation expressed in terms of the NACTs and, following that, the present author showed that, up to an additive constant, the ADT angle is identical to the LH phase [25]. In 2000, Baer and Alijah [26] showed that in order for the ADT to yield single-valued diabatic potentials, the corresponding NACT matrix cannot be arbitrary but has to be 'quantized' (as will be discussed later). The 'quantized' NACT for the two-state system yields an ADT angle with features identical to the LH phase. In other words, demanding single-valuedness for the diabatic potentials forces the ADT angle, once it is calculated along a closed contour, to be a multiple of $\pi$ (or zero). In 1996, the first verification of the relevance of the line integral approach for a realistic molecular system was published by Yarkony [27]. He calculated ab initio NACTs for the two lowest states of the H$_3$ system and used them to obtain the corresponding ADT angle along a circle with a given radius centered at the point of the CI. The calculations were done for circles of different radii. He found that as long as the radius is smaller than 0.7 Bohr the final ADT angle is, up to the third decimal place, equal to $\pi$. Increasing the radius led to values smaller than $\pi$, which he interpreted as a drawback of the theory.
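The closed-contour value of the two-state ADT angle can be illustrated with a minimal sketch. It assumes the linear two-state conical-intersection (Jahn–Teller type) model $H_e = q\,[[\cos\varphi, \sin\varphi],[\sin\varphi, -\cos\varphi]]$ in polar coordinates $(q,\varphi)$ about the CI; this is an illustrative model, not the ab initio H$_3$ data of Ref. [27], and the sign convention of the angle is an assumption.

```python
import numpy as np

def upper_state(phi):
    """Upper adiabatic eigenvector (eigenvalue +q) of the model Hamiltonian."""
    return np.array([np.cos(phi / 2.0), np.sin(phi / 2.0)])

def lower_state(phi):
    """Lower adiabatic eigenvector (eigenvalue -q) of the model Hamiltonian."""
    return np.array([-np.sin(phi / 2.0), np.cos(phi / 2.0)])

def angular_nact(phi, h=1e-5):
    """Angular coupling term: lower state projected on d(upper state)/d(phi)."""
    d_upper = (upper_state(phi + h) - upper_state(phi - h)) / (2.0 * h)
    return float(lower_state(phi) @ d_upper)

# Line integral of the NACT along a closed circle around the CI (trapezoid rule).
phis = np.linspace(0.0, 2.0 * np.pi, 2001)
tau = np.array([angular_nact(p) for p in phis])
adt_angle = float(np.sum(0.5 * (tau[1:] + tau[:-1]) * np.diff(phis)))

# The eigenfunction changes sign after one full revolution.
sign_flip = np.allclose(upper_state(2.0 * np.pi), -upper_state(0.0))
```

In this model the angular coupling is constant (1/2), so the closed-contour integral is $\pi$ for any radius, and the accompanying sign flip of the eigenfunction is the Longuet-Higgins observation mentioned earlier.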
In a more recent publication, Alijah and Baer [28] showed that these deviations from $\pi$ are most likely due to a third electronic state interfering with the two-state system, and therefore one has to apply the line integral approach for a three-state system [28].

We return now to the LH findings regarding the multi-valuedness of the electronic eigenfunctions. LH proposed to correct for this 'deficiency' by multiplying the wave functions of the two states responsible for forming the CI by an identical phase factor which ensures their uniqueness without affecting the ortho-normal features of the original eigenfunctions [5]. This modification seemed to be the right thing to do, at least at that time, but two questions arise: (a) Was it really necessary to incorporate into the quantum mechanical theory of atoms and molecules an ad hoc correction of this type? (b) Is it guaranteed that such a modification will not lead, in some cases, to conflicting results? Among other things we shall try to answer these questions. At this stage we just ascertain that, irrespective of what the answers will be, the importance of the LH observation is in pointing out that something may go wrong if the whole system of the electrons and nuclei is not treated with care. This is essential in particular if, once the electronic eigenvalue problem is solved, the resulting nuclear Schroedinger equation (SE) is treated employing approximations.

As mentioned above, the starting point in this field is the Born–Oppenheimer (BO) treatment. Here, this derivation is carried out for a finite sub-Hilbert space (SHS), which is defined by making use of the NACTs. It will be shown that this particular SHS behaves for all practical purposes as a full Hilbert space [29]. Among other things it is characterized by a well-defined ADT matrix. These subjects are treated in Sections 2 and 3. The connection between the
non-adiabatic coupling matrix (NACM) and the uniqueness of the relevant diabatic potential matrix is presented in Section 4; the quantization of the NACM is discussed in Section 5 and the conditions for breaking up the complete Hilbert space into sub-Hilbert spaces (SHS) and sub–sub-Hilbert spaces (SSHS) are given in Section 6. Three subjects related to topological effects are presented in Sections 7–9, the multi-degenerate case is discussed in Section 10, the extended BO approximation is treated in Section 11 and the relation between the ADT matrix and Wigner's rotation matrix is elaborated in Section 12. Analytic and numerical examples are given in Section 13 and a summary and conclusions are presented in Section 14.

2. The Born–Oppenheimer treatment

2.1. The Born–Oppenheimer equations for a complete Hilbert space

The Hamiltonian, $H$, of the nuclei and the electrons is usually written in the following form:

$$H = T_n + H_e(e\,|\,n), \tag{1}$$

where $T_n$ is the nuclear kinetic energy, $H_e(e\,|\,n)$ is the electronic Hamiltonian, which also contains the nuclear Coulomb interactions and depends parametrically on the nuclear coordinates, and $e$ and $n$ stand for the electronic and the nuclear coordinates, respectively. The Schroedinger equation (SE) to be considered is of the form

$$(H - E)\,\Psi(e, n) = 0, \tag{2}$$

where $E$ is the total energy and $\Psi(e, n)$ is the complete wave function which describes the motions of both the electrons and nuclei. Next we employ the BO expansion:

$$\Psi(e, n) = \sum_{i=1}^{N} \psi_i(n)\,\zeta_i(e\,|\,n), \tag{3}$$

where the $\psi_i(n)$, $i = 1, \ldots, N$, are nuclear-coordinate dependent coefficients (recognized later as the nuclear wave functions) and $\zeta_i(e\,|\,n)$, $i = 1, \ldots, N$, are the electronic eigenfunctions of the above introduced electronic Hamiltonian:

$$\bigl(H_e(e\,|\,n) - u_i(n)\bigr)\,\zeta_i(e\,|\,n) = 0, \quad i = 1, \ldots, N. \tag{4}$$

Here $u_i(n)$, $i = 1, \ldots, N$, are the electronic eigenvalues recognized, later, as the (adiabatic) PESs that govern the motion of the nuclei. In this treatment we assume that our Hilbert space is of dimension $N$. Substituting Eq. (3) into Eq. (2), multiplying it from the left by $\zeta_j(e\,|\,n)$ and integrating over the electronic coordinates while recalling Eqs. (1) and (4), yields the following set of coupled equations:

$$\sum_{i=1}^{N} \langle \zeta_j \,|\, T_n \psi_i(n) \,|\, \zeta_i \rangle + (u_j(n) - E)\,\psi_j(n) = 0, \quad j = 1, \ldots, N, \tag{5}$$
where the bracket notation means integration over electronic coordinates. To continue we recall that the kinetic operator Tn can be written (in terms of mass-scaled coordinates) as Tn = −
1 2 ∇ ; 2m
(6)
where $m$ is the mass of the system and $\nabla$ is the gradient (vector) operator. Substituting Eq. (6) into Eq. (5) yields the more explicit form of the BO system of coupled equations:

$$-\frac{1}{2m}\nabla^2\psi_j + (u_j(n) - E)\,\psi_j - \frac{1}{2m}\sum_{i=1}^{N}\left(2\tau^{(1)}_{ji}\cdot\nabla\psi_i + \tau^{(2)}_{ji}\,\psi_i\right) = 0; \quad j = 1, \ldots, N, \tag{7}$$
where $\tau^{(1)}$ is the non-adiabatic (vector) matrix of the first kind (henceforth termed the non-adiabatic matrix), with the elements

$$\tau^{(1)}_{ji} = \langle \zeta_j \mid \nabla\zeta_i \rangle \tag{8a}$$

and $\tau^{(2)}$ is the non-adiabatic (scalar) matrix of the second kind, with the elements

$$\tau^{(2)}_{ji} = \langle \zeta_j \mid \nabla^2\zeta_i \rangle. \tag{8b}$$
For a system of real electronic wave functions $\tau^{(1)}$ is an antisymmetric matrix. Eq. (7) can also be written in a matrix form as follows:

$$-\frac{1}{2m}\nabla^2\Psi + (u - E)\,\Psi - \frac{1}{2m}\left(2\tau^{(1)}\cdot\nabla + \tau^{(2)}\right)\Psi = 0, \tag{9}$$
where $\Psi$ is the column vector that contains the nuclear functions.

2.2. The Born–Oppenheimer equation for a (finite) sub-Hilbert space

Next, the full Hilbert space is broken up into two parts: a finite part, designated as the P-space, with dimension $M$, and the complementary part, the Q-space (which is allowed to be of infinite dimension). The breakup is done according to the following criterion [29]:

$$\tau^{(1)}_{ij} \cong 0 \quad \text{for } i \le M,\ j > M. \tag{10}$$
In other words, the non-adiabatic coupling terms between P-states and Q-states are all assumed to be zero. These requirements will later be reconsidered for a relaxed situation where these coupling terms are assumed to be not necessarily identically zero but small, i.e., of the order $\varepsilon$ in regions of interest. To continue we define the following two relevant Feshbach projection operators [30], namely, $P_M$, the projection operator for the P-space,

$$P_M = \sum_{j=1}^{M} |\zeta_j\rangle\langle\zeta_j| \tag{11a}$$
and $Q_M$, the projection operator for the Q-space,

$$Q_M = I - P_M. \tag{11b}$$
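Having introduced these operators we are now in a position to express the P-part of the $\tau^{(2)}$-matrix (to be designated as $\tau^{(2)}_P$) in terms of the P-part of $\tau^{(1)}$ (to be designated as $\tau^{(1)}_P$). To do that we consider Eq. (8a) and derive the following expression:

$$\nabla\tau^{(1)}_{ji} = \nabla\langle \zeta_j \mid \nabla\zeta_i \rangle = \langle \nabla\zeta_j \mid \nabla\zeta_i \rangle + \langle \zeta_j \mid \nabla^2\zeta_i \rangle$$

or, recalling Eq. (8b), we get

$$\tau^{(2)}_{ji} = -\langle \nabla\zeta_j \mid \nabla\zeta_i \rangle + \nabla\tau^{(1)}_{ji}. \tag{12}$$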
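The first term on the right-hand side can be further treated as follows:

$$\langle \nabla\zeta_j \mid \nabla\zeta_i \rangle = \langle \nabla\zeta_j \mid P_M + Q_M \mid \nabla\zeta_i \rangle,$$

which for $i, j \le M$ becomes

$$\langle \nabla\zeta_j \mid \nabla\zeta_i \rangle\big|_P = \langle \nabla\zeta_j \mid P_M \mid \nabla\zeta_i \rangle = \sum_{k=1}^{M} \langle \nabla\zeta_j \mid \zeta_k \rangle\langle \zeta_k \mid \nabla\zeta_i \rangle \tag{13}$$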
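(the contribution due to $Q_M$ can be shown to be zero), or also, since $\langle \nabla\zeta_j \mid \zeta_k \rangle = -\tau^{(1)}_{jk}$ for real eigenfunctions,

$$\langle \nabla\zeta_j \mid \nabla\zeta_i \rangle\big|_P = -\left(\tau^{(1)}_P\right)^2_{ij}; \quad i, j \le M, \tag{13'}$$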
where $\tau^{(1)}_P$ is, as mentioned above, of dimension $M$. Therefore within the Pth subspace the matrix $\tau^{(2)}_P$ can be presented in terms of $\tau^{(1)}_P$ in the following form:

$$\tau^{(2)}_P = \left(\tau^{(1)}_P\right)^2 + \nabla\tau^{(1)}_P. \tag{14}$$
Substituting the matrix elements of Eq. (14) into Eq. (7) yields the final form of the BO equation for the P-subspace:

$$-\frac{1}{2m}\nabla^2\Psi + \left(u - \frac{1}{2m}\tau_P^2 - E\right)\Psi - \frac{1}{2m}\left(2\tau_P\cdot\nabla + \nabla\tau_P\right)\Psi = 0, \tag{15}$$

where the dot designates the scalar product, $\Psi$ is a column matrix which contains the nuclear functions $\{\psi_i;\ i = 1, \ldots, M\}$, $u$ is a diagonal matrix which contains the adiabatic potentials and $\tau_P$, for reasons of convenience, replaces $\tau^{(1)}_P$. Eq. (15) can also be written in the form

$$-\frac{1}{2m}(\nabla + \tau_P)^2\,\Psi + (u - E)\,\Psi = 0, \tag{16}$$
which is a more compact way of writing the SE (a similar Hamiltonian was employed by Pacher et al. [31] within their block-diagonalized approach to obtain quasi-diabatic states).
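As a numerical sanity check of Eq. (14), the sketch below evaluates both sides for a hypothetical two-state model in which the adiabatic states are a rotation, by an angle θ(x), of a fixed orthonormal basis, so that the P-space is complete and no Q-space contribution arises. The function `theta`, the single coordinate `x` and the finite-difference step `H` are assumptions made for illustration only; derivatives are taken by central differences.

```python
import math

H = 1e-3  # finite-difference step (an assumption of this sketch)

def theta(x):
    # Hypothetical smooth mixing angle; any smooth function would do.
    return 0.7 * math.sin(x) + 0.2 * x * x

def zeta(i, x):
    """Coefficients of adiabatic state i in a fixed orthonormal two-state basis."""
    c, s = math.cos(theta(x)), math.sin(theta(x))
    return (c, s) if i == 0 else (-s, c)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def grad(i, x):
    """Central-difference first derivative of state i."""
    return [(p - m) / (2 * H) for p, m in zip(zeta(i, x + H), zeta(i, x - H))]

def lap(i, x):
    """Central-difference second derivative of state i."""
    return [(p - 2 * c + m) / H**2
            for p, c, m in zip(zeta(i, x + H), zeta(i, x), zeta(i, x - H))]

def tau1(x):
    """Eq. (8a): tau1[j][i] = <zeta_j | grad zeta_i>."""
    return [[dot(zeta(j, x), grad(i, x)) for i in range(2)] for j in range(2)]

def tau2(x):
    """Eq. (8b): tau2[j][i] = <zeta_j | lap zeta_i>."""
    return [[dot(zeta(j, x), lap(i, x)) for i in range(2)] for j in range(2)]

def rhs_eq14(x):
    """Right-hand side of Eq. (14): tau1^2 + d(tau1)/dx."""
    t, tp, tm = tau1(x), tau1(x + H), tau1(x - H)
    sq = [[sum(t[j][k] * t[k][i] for k in range(2)) for i in range(2)]
          for j in range(2)]
    return [[sq[j][i] + (tp[j][i] - tm[j][i]) / (2 * H) for i in range(2)]
            for j in range(2)]

def max_deviation(x):
    """Largest element-wise difference between the two sides of Eq. (14)."""
    a, b = tau2(x), rhs_eq14(x)
    return max(abs(a[j][i] - b[j][i]) for j in range(2) for i in range(2))
```

Calling `max_deviation(0.3)` should return a value at the level of the finite-difference error, and `tau1(0.3)` should be antisymmetric, as stated above for real electronic wave functions.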
3. The adiabatic-to-diabatic transformation

3.1. The derivation of the adiabatic-to-diabatic transformation matrix

The aim in performing what is termed the ADT is to eliminate from Eq. (16) the somewhat problematic matrix, $\tau_P$. This is done by replacing, in Eq. (16), the column matrix $\Psi$ by another column matrix $\Phi$ where the two are related as follows:

$$\Psi = A\Phi. \tag{17}$$

At this stage, we would like to emphasize that the same transformation has to be applied to the electronic adiabatic basis set in order not to affect the total wave function of both the electrons and the nuclei. Thus if $\xi$ is the electronic basis set that is attached to $\Phi$ then $\zeta$ and $\xi$ are related to each other as

$$\xi = A^\dagger\zeta. \tag{18}$$
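Here $A$ is an as yet undetermined matrix of the coordinates ($A^\dagger$ is its Hermitian conjugate). Our next step is to obtain an $A$-matrix which will eventually simplify Eq. (16) by eliminating the $\tau_P$-matrix. For this purpose we consider the following expression:

$$\begin{aligned}
(\nabla + \tau_P)^2 A\Phi &= (\nabla + \tau_P)(\nabla + \tau_P)A\Phi = (\nabla + \tau_P)\left(A\nabla\Phi + (\nabla A)\Phi + \tau_P A\Phi\right) \\
&= 2(\nabla A)\cdot\nabla\Phi + A\nabla^2\Phi + (\nabla^2 A)\Phi + (\nabla\tau_P)A\Phi + 2\tau_P(\nabla A)\Phi + 2\tau_P A(\nabla\Phi) + \tau_P^2 A\Phi,
\end{aligned}$$

which can be further developed to become

$$(\nabla + \tau_P)^2 A\Phi = A\nabla^2\Phi + 2(\nabla A + \tau_P A)\cdot\nabla\Phi + \{(\tau_P + \nabla)\cdot(\nabla A + \tau_P A)\}\Phi,$$

where the $\nabla$'s, in the third term, do not act beyond the curly brackets $\{\ \}$. Now, if $A$ (henceforth to be designated as $A_P$ in order to remind us that it belongs to the P-sub-space) is chosen to be a solution of the following equation:

$$\nabla A_P + \tau_P A_P = 0, \tag{19}$$

then the above (kinetic energy) expression becomes

$$(\nabla + \tau_P)^2 A_P\Phi = A_P\nabla^2\Phi$$

and so Eq. (16) becomes

$$-\frac{1}{2m}A_P\nabla^2\Phi + (u_P - E)A_P\Phi = 0. \tag{16'}$$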
In Appendix B, $A_P$ is proved to be an orthogonal matrix. Consequently, multiplying Eq. (16') from the left by $A_P^\dagger$, it becomes

$$-\frac{1}{2m}\nabla^2\Phi + (W_P - E)\,\Phi = 0, \tag{20}$$

where $W_P$, the diabatic potential matrix, is

$$W_P = (A_P)^\dagger u_P A_P. \tag{21}$$
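For two states, a real orthogonal ADT matrix can be parameterized by a single rotation angle, and Eq. (21) then reduces to a 2 × 2 transformation of the diagonal adiabatic potential matrix. The sketch below illustrates this under that assumption; the function names, the angle `gamma` and the sample potential values are hypothetical and are not taken from the text.

```python
import math

def adt_matrix(gamma):
    """Two-state orthogonal ADT matrix parameterized by one angle (assumed form)."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s], [s, c]]

def diabatic_potential(u1, u2, gamma):
    """Eq. (21) for real A: W = A^T diag(u1, u2) A."""
    a = adt_matrix(gamma)
    u = [u1, u2]
    # W[i][j] = sum_k A[k][i] * u[k] * A[k][j]
    return [[sum(a[k][i] * u[k] * a[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

With this parameterization the off-diagonal diabatic coupling is `(u2 - u1) * sin(gamma) * cos(gamma)`, so it vanishes when the angle is zero and is largest in magnitude at `gamma = pi/4`, while the trace of `W` always equals `u1 + u2`.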
Eq. (20) is the diabatic Schroedinger equation. In what follows, the $A$-matrix (or the $A_P$-matrix) will be called the ADT matrix.

3.2. The necessary condition for having a solution for the adiabatic-to-diabatic transformation matrix

The $A$-matrix has to fulfill Eq. (19). It is obvious that all features of $A$ are dependent on the features of the $\tau$-matrix elements. Thus, for instance, if we want the ADT matrix to have second derivatives or more in a given region, the $\tau$-matrix elements have to be analytic functions in this region, namely, they themselves have to have well defined derivatives. However, this is not enough to guarantee the analyticity of $A$. In order for it to be fully analytic, there are additional conditions that the elements of this matrix have to fulfill, namely, that the result of two (or more) mixed derivatives should not depend on the order of the differentiation. In other words, if $p$ and $q$ are any two coordinates, then the following condition has to hold:

$$\frac{\partial^2}{\partial p\,\partial q}A = \frac{\partial^2}{\partial q\,\partial p}A. \tag{22}$$
We derived the conditions for that to happen on various occasions [17,32] and this derivation is repeated in Appendix B (under Section B.2). The result is the fulfillment of the following relation:

$$\frac{\partial}{\partial p}\tau_q - \frac{\partial}{\partial q}\tau_p = [\tau_q, \tau_p], \tag{23}$$

which can also be written more compactly as a vector equation:

$$\mathrm{Curl}\,\tau = [\tau \times \tau]. \tag{24}$$
In what follows, Eq. (24) will be referred to as the 'Curl' condition. In Appendix C, it is proved, employing the integral representation, that the same condition guarantees that the $A$-matrix will be single-valued throughout this region.

The importance of the ADT matrix lies in the fact that, given the adiabatic potential matrix, it yields the diabatic potential matrix. Since the potentials that govern the motion of atomic species have to be analytic and single-valued, and since the adiabatic potentials usually have these features, we expect the ADT to yield diabatic potentials with the same features. Whereas analyticity is guaranteed because the ADT matrix is usually analytic, the uniqueness requirement is of greater concern. The reason is that in cases where the electronic eigenfunctions become degenerate in CS, the corresponding NACTs become singular (as is well known from the Hellmann–Feynman theorem [32]) and this, as is proved in Appendix C, may cause the ADT matrix to become multi-valued. Thus we have to make sure that the relevant diabatic potentials stay single-valued also in cases where the ADT matrix does not. All these aspects will be discussed in the next section.
Returning to the diabatic potentials as defined in Eq. (21), the condition expressed in Eq. (24) also guarantees well behaved (namely single-valued) diabatic potentials. However, it is known (as was already discussed above) that the $\tau$-matrix elements are not always well behaved because they may become singular, implying that in such regions Eq. (24) is not satisfied at every point. Such a situation still guarantees an analytic ADT matrix (except in the close vicinity of these singular points) but no longer its single-valuedness. The question is to what extent this 'new' difficulty is going to affect the single-valuedness of the diabatic potentials (which have to be single-valued if a solution for the corresponding SE is required). The next section is devoted to this problem.

4. The adiabatic-to-diabatic transformation matrix and the line integral approach

Eq. (19) is the main subject of this section. From now on the index $P$ will be omitted and it will be understood that any subject to be treated will refer to a finite SHS of dimension $M$. Eq. (19) can also be written as an integral equation along a contour in the following way:

$$A(s, s_0 \mid \Gamma) = A(s_0 \mid \Gamma) - \int_{s_0}^{s} ds'\cdot\tau(s' \mid \Gamma)\,A(s', s_0 \mid \Gamma), \tag{25}$$
where $\Gamma$ is a contour in the multi-dimensional CS, the points $s$ and $s_0$ are located on this contour, $ds'$ is a differential vector along this contour and the dot is a scalar product between this differential vector and the (vectorial) NACM $\tau$. It is noticed that the $\tau$-matrix is the kernel of this equation and since, as mentioned above, some of the NACTs may be singular in CS (but not necessarily along the contour itself), this has implications for the multi-valuedness of both the $A$-matrix and the diabatic potentials.

4.1. The necessary conditions for obtaining single-valued diabatic potentials and the introduction of the topological matrix

The solution of Eq. (19) can be written in the form [32,33]:

$$A(s, s_0) = \wp\exp\left(-\int_{s_0}^{s} ds\cdot\tau\right)A(s_0), \tag{26}$$
where the symbol $\wp$ is introduced to indicate that this integral has to be carried out in a given order [33] (see also Ref. [31]). In other words, $\wp$ is a path ordering operator. The solution in Eq. (26) is well defined as long as $\tau$, along $\Gamma$, is well defined. However, as mentioned earlier, the solution may not be uniquely defined at every point in CS. Still, we claim that under certain conditions such a solution is of importance because it will lead to uniquely defined diabatic potentials. This claim brings us to formulate the necessary condition for obtaining uniquely defined diabatic potentials. Let us consider a closed path $\Gamma$ defined in terms of a continuous parameter $\lambda$ so that the starting point $s_0$ of the contour is at $\lambda = 0$. Next, $\beta$ is defined as the value attained by $\lambda$ once the contour completes a full cycle and returns to its starting point. For instance, in the case of a circle $\lambda$ is an angle and $\beta = 2\pi$.

With these definitions, we can now look for the necessary condition(s). Thus, we assume that at each point $s_0$ in CS, the diabatic potential matrix $W(\lambda)$ ($\equiv W(s, s_0)$) fulfills the condition:

$$W(\lambda = 0) = W(\lambda = \beta). \tag{27}$$
Following Eq. (21) this requirement implies that for every point $s_0$, we have

$$A^\dagger(0)\,u(0)\,A(0) = A^\dagger(\beta)\,u(\beta)\,A(\beta). \tag{28}$$

Next, another transformation matrix, $B$, is introduced, defined as

$$B = A(\beta)\,A^\dagger(0), \tag{29}$$

which, for every $s_0$ and a given contour $\Gamma$, connects $u(\beta)$ with $u(0)$:

$$u(\beta) = B\,u(0)\,B^\dagger. \tag{30}$$
The $B$-matrix is, by definition, a unitary matrix (it is a product of two unitary matrices) and at this stage, except for being dependent on $\Gamma$ and, eventually, on $s_0$, it is rather arbitrary. In what follows we shall derive some features of $B$. Since the electronic eigenvalues (the adiabatic PESs) are uniquely defined at each point in CS we have $u(0) \equiv u(\beta)$ and therefore Eq. (30) implies the following commutation relation:

$$[B, u(0)] = 0 \tag{31}$$

or more explicitly:

$$\sum_{j=1}^{M}\left(B^*_{kj}B_{kj} - \delta_{kj}\right)u_j(0) = 0. \tag{32}$$

Eq. (32) has to hold for every arbitrary point $s_0$ ($\equiv \lambda = 0$) on the path $\Gamma$ and for an essentially arbitrary set of non-zero adiabatic eigenvalues, $u_j(s_0)$; $j = 1, \ldots, M$. Due to the arbitrariness of $s_0$, and therefore also of the $u_j(s_0)$'s, Eq. (32) can be satisfied if and only if the $B$-matrix elements fulfill the relation:

$$B^*_{kj}B_{kj} = \delta_{kj}; \quad j, k \le M \tag{33}$$

or

$$B_{jk} = \delta_{jk}\exp(i\chi_k). \tag{34}$$
Thus $B$ is a diagonal matrix which contains in its diagonal (complex) numbers whose norm is 1 (this derivation holds as long as the adiabatic potentials are non-degenerate along the path $\Gamma$). From Eq. (29) we obtain that the $B$-matrix transforms the $A$-matrix from its initial value to its final value while tracing a closed contour:

$$A(\beta) = B\,A(0). \tag{35}$$

Let us now return to Eq. (26) and define the following matrix:

$$D = \wp\exp\left(-\oint_\Gamma ds\cdot\tau\right). \tag{36}$$

From Eq. (26) it is noticed that if the contour $\Gamma$ is a closed loop (which returns to $s_0$) the $D$-matrix transforms $A(s_0)$ to its value $A(s = s_0 \mid s_0)$ obtained once we reach the end of the closed contour, namely:

$$A(s = s_0 \mid s_0) = D\,A(s_0). \tag{37}$$
Now comparing Eq. (35) with Eq. (37), it is noticed that $B$ and $D$ are identical. This implies that all the features that were found to exist for the $B$-matrix also apply to the matrix $D$ as defined in Eq. (36). Returning to the beginning of this section, we established the following: The necessary condition for the $A$-matrix to yield single-valued diabatic potentials is that the $D$-matrix, defined in Eq. (36), be diagonal and have, in its diagonal, numbers of norm 1. Since we consider only real electronic eigenfunctions these numbers can be either (+1)s or (−1)s. Following Eq. (37) it is also obvious that the $A$-matrix is not necessarily single-valued because the $D$-matrix, as was just proved, is not necessarily a unit matrix. In what follows, the number of (−1)s in a given matrix $D$ will be designated as $K$.

The $D$-matrix plays an important role in the forthcoming theory because it contains all topological features of an electronic manifold in a region surrounded by its contour $\Gamma$, as will be explained next. That the electronic adiabatic manifold can be multi-valued is a well known fact, going back to LH et al. [4–7]. In this section, we just proved that the same applies to the ADT matrix, and for this purpose the diabatic framework is introduced. The diabatic manifold is, by definition, a manifold independent of the nuclear coordinates and therefore single-valued in CS. Such a manifold always exists for a complete Hilbert space [32] (see Appendix D). Next we assume that an approximate (partial) diabatic manifold like that can be found for the present SHS defined with respect to a certain (usually finite) region in CS. This approximate diabatic manifold is, by definition, single-valued. Next we consider Eq. (18), in which the electronic diabatic manifold is presented in terms of the product $A^\dagger\zeta$, where $\zeta$ is the adiabatic electronic manifold. Since this product is single-valued in CS (because it produces a diabatic manifold), it remains single-valued while tracing a closed contour. In order for this product to remain single-valued the number of wave functions that flip sign in this process has to be identical to the topological number $K$. Moreover, the positions of the (−1)s in the $D$-matrix have to correspond with the electronic eigenfunctions that flip their sign. Thus, for instance, if the third element in the $D$-matrix is (−1) this implies that the electronic eigenfunction that belongs to the third state has to flip sign.

It is known that multi-valued adiabatic electronic manifolds create topological effects [34]. Since the newly introduced $D$-matrix contains the information relevant for this manifold (the number of functions that flip sign and their identification) we shall define it as the topological matrix. Accordingly, $K$ will be defined as the topological number. Since $D$ is dependent on the contour $\Gamma$ the same applies to $K$; thus $K = K(\Gamma)$.
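The ordered exponential of Eq. (36) can be approximated numerically as an ordered product of short-segment exponentials, with later segments multiplying from the left. The following sketch does this for a circular contour parameterized by an angle `phi`. It is a minimal illustration under stated assumptions: the helper names, the midpoint discretization, and the two-state model `two_state_tau` (an angular coupling `t(phi)` times a constant antisymmetric matrix) are hypothetical choices, not part of the original treatment.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm_small(m, terms=12):
    """Taylor series for exp(m); adequate when the entries of m are small."""
    result, term = identity(len(m)), identity(len(m))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def topological_matrix(tau_phi, steps=2000):
    """Ordered product approximating Eq. (36) on a circle, phi in [0, 2*pi]."""
    dphi = 2 * math.pi / steps
    d = identity(len(tau_phi(0.0)))
    for k in range(steps):
        phi = (k + 0.5) * dphi  # midpoint of the k-th segment
        step = expm_small([[-x * dphi for x in row] for row in tau_phi(phi)])
        d = matmul(step, d)     # later segments act from the left
    return d

def topological_number(d, tol=1e-6):
    """K: the number of (-1)s on the diagonal of D."""
    return sum(1 for i in range(len(d)) if abs(d[i][i] + 1.0) < tol)

def two_state_tau(t):
    """Hypothetical model: tau_phi = t(phi) * [[0, 1], [-1, 0]]."""
    return lambda phi: [[0.0, t(phi)], [-t(phi), 0.0]]
```

For instance, `topological_matrix(two_state_tau(lambda phi: 0.5))` should come out close to minus the unit matrix, so that `topological_number` returns 2, whereas a coupling of `1.0` should give the unit matrix and K = 0.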
4.2. The approximate adiabatic-to-diabatic transformation matrix

In the previous section, the ADT matrix as well as the diabatic potentials were derived for the relevant SHS without running into any theoretical conflicts. In other words, the conditions in Eq. (10) led to a finite SHS which, for all practical purposes, behaves like a full (infinite) Hilbert space. However, it is inconceivable that such strict conditions as presented in Eq. (10) are fulfilled for real molecular systems. Thus the question is: to what extent will the results of the present approach, namely, the diabatic potentials as well as the ADT matrix and the 'Curl' relations, be affected if the conditions in Eq. (10) are replaced by more realistic ones? This subject will be treated next. We shall also briefly discuss other approaches and examine their ability to yield relevant diabatic potentials.

4.2.1. The quasi-diabatic framework

The quasi-diabatic framework is defined as the framework for which the conditions in Eq. (10) are replaced by the following more realistic ones [35]:

$$\tau^{(1)}_{ij} \cong O(\varepsilon) \quad \text{for } i \le M,\ j > M. \tag{10'}$$
Thus, we still relate to the same SHS but it is now defined for P-states which are weakly coupled to Q-states. We shall prove the following lemma: If the interaction between any P-state and Q-state is of the order $O(\varepsilon)$, the resultant P-ADT matrix elements and the diabatic potentials become perturbed to $O(\varepsilon^2)$. The same applies to the Curl conditions in Eqs. (23) and (24) which, in this case, are fulfilled up to $O(\varepsilon^2)$.

4.2.1.1. The ADT matrix and the diabatic potentials. We prove our statement in two steps: First, we consider the special case of a (finite) Hilbert space of three states, the two lowest of which are coupled strongly to each other but the third state is only weakly coupled to them. Then we extend it to the case of a complete Hilbert space of $N$ states where $M$ states are strongly coupled to each other, and $L$ ($= N - M$) states are only loosely coupled to these $M$ original states (but can be strongly coupled among themselves).

We start with the first case where the components of two of the $\tau$-matrix elements, namely, $\tau_{13}$ and $\tau_{23}$, are of the order of $O(\varepsilon)$ (see Eq. (10')). The 3 × 3 $A$-matrix has nine elements of which we are interested in only four, namely, $a_{11}$, $a_{12}$, $a_{21}$ and $a_{22}$. However, these four elements are coupled to $a_{31}$ and $a_{32}$ and therefore we consider the following six line integrals (see Eq. (25)):

$$a_{ij}(s) = a_{ij}(s_0) - \sum_{k=1}^{3}\int_{s_0}^{s} ds\cdot\tau_{ik}(s)\,a_{kj}(s); \quad i = 1, 2, 3;\ j = 1, 2. \tag{38}$$

Next, we estimate the magnitudes of $a_{31}$ and $a_{32}$ and for this purpose we consider the equations for $a_{31}$ and $a_{32}$. Thus, assuming $a_{1j}$ and $a_{2j}$ are given, the solution of the relevant equations in Eq. (38) is

$$a_{3j}(s) = a_{3j}(s_0) - \int_{s_0}^{s} ds\cdot\left(\tau_{31}a_{1j} + \tau_{32}a_{2j}\right). \tag{39}$$
For obvious reasons we assume $a_{3j}(s_0) = 0$. Since both $a_{1j}$ and $a_{2j}$ are at most (in absolute value) unity, it is noticed that the magnitudes of $a_{31}$ and $a_{32}$ are of the order of $O(\varepsilon)$, just like the assumed magnitude of the components of $\tau_{i3}$ for $i = 1, 2$. Now, returning to Eq. (38) and substituting Eq. (39) in the last term in each summation, one can see that the integral over $\tau_{i3}a_{3j}$; $j = 1, 2$ is of the second order in $\varepsilon$, which can be specified as $O(\varepsilon^2)$. In other words, ignoring the coupling between the two-state system and a third state introduces a second order error in the calculation of each of the elements of the two-state $A$-matrix.

To get to the general case we assume $A$ and $\tau$ to be of the following form:

$$A = \begin{pmatrix} A^{(M)} & A^{(M,L)} \\ A^{(L,M)} & A^{(L)} \end{pmatrix} \tag{40a}$$

and

$$\tau = \begin{pmatrix} \tau^{(M)} & \tau^{(M,L)} \\ \tau^{(L,M)} & \tau^{(L)} \end{pmatrix}, \tag{40b}$$
where we recall that $M$ is the dimension of the P-SHS. As before, the only parts of the $A$-matrix which are of interest to us are $A^{(M)}$ and $A^{(L,M)}$. Substituting Eqs. (40) into Eq. (25) we find for $A^{(M)}$ the following integral equation:

$$A^{(M)} = A^{(M)}_0 - \int_{s_0}^{s} ds\cdot\tau^{(M)}A^{(M)} - \int_{s_0}^{s} ds\cdot\tau^{(M,L)}A^{(L,M)}, \tag{41}$$
where $A$ stands for $A(s)$ and $A_0$ for $A(s_0)$. Our next task is to get an estimate for $A^{(L,M)}$. For this purpose, we substitute Eqs. (40) into Eq. (19) and consider the first order differential equation for this matrix:

$$\nabla A^{(L,M)} + \tau^{(L,M)}A^{(M)} + \tau^{(L)}A^{(L,M)} = 0, \tag{42}$$

which will be written in a slightly different form

$$\nabla A^{(L,M)} + \tau^{(L)}A^{(L,M)} = -\tau^{(L,M)}A^{(M)} \tag{42'}$$
in order to show that it is an inhomogeneous equation for $A^{(L,M)}$ (assuming the elements of $A^{(M)}$ are known). Eq. (42') will be solved for the initial conditions where the elements of $A^{(L,M)}$ are zero (this is the obvious choice in order for the isolated SHS to remain as such in the diabatic framework as well). For these initial conditions, the solution of Eq. (42') can be shown to be

$$A^{(L,M)} = -\exp\left(-\int_{s_0}^{s} ds\cdot\tau^{(L)}\right)\left\{\int_{s_0}^{s}\exp\left(\int_{s_0}^{s'} ds''\cdot\tau^{(L)}\right)ds'\cdot\tau^{(L,M)}A^{(M)}\right\}. \tag{43}$$
In performing this series of integrations, it is understood that they are carried out in the correct order and always for consecutive infinitesimal sections along the given contour $\Gamma$ [17,33]. Eq. (43) shows that all elements of $A^{(L,M)}$ are linear combinations of the (components of the) $\tau^{(L,M)}$ elements, which are all assumed to be of first order in $\varepsilon$. We also reiterate that the absolute values of all elements of $A^{(M)}$ are limited by the value of unity.

Returning now to Eq. (41) and replacing $A^{(L,M)}$ by the expression in Eq. (43) we find that the line integral to solve $A^{(M)}$ is perturbed to the second order, namely:

$$A^{(M)} = A^{(M)}_0 - \int_{s_0}^{s} ds\cdot\tau^{(M)}A^{(M)} + O(\varepsilon^2). \tag{44}$$
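The second-order estimate of Eq. (44) can be illustrated numerically for the three-state special case discussed above. The sketch below is an assumption-laden toy: it takes a constant $\tau$-matrix along a straight contour of length `s` (so that the ordered exponential reduces to an ordinary matrix exponential), a strong coupling `t` between states 1 and 2, and weak couplings `eps * a` and `eps * b` to the third state. All numerical values and function names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=60):
    """Taylor series for exp(m); fine for the moderate norms used here."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def adt(tau, s):
    """A(s) = exp(-tau * s) with A(0) = I, valid for a constant tau."""
    return expm([[-x * s for x in row] for row in tau])

def p_block_error(eps, s=2.0, t=1.0, a=0.8, b=-0.5):
    """Largest deviation of the 2x2 P-block of the three-state ADT matrix
    from the ADT matrix of the isolated two-state system."""
    tau_full = [[0.0, t, eps * a],
                [-t, 0.0, eps * b],
                [-eps * a, -eps * b, 0.0]]
    tau_p = [[0.0, t], [-t, 0.0]]
    full, isolated = adt(tau_full, s), adt(tau_p, s)
    return max(abs(full[i][j] - isolated[i][j])
               for i in range(2) for j in range(2))
```

Doubling `eps` should increase `p_block_error` by a factor close to four, which is the behavior expected of an $O(\varepsilon^2)$ perturbation.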
This concludes our derivation regarding the ADT matrix for a finite $N$. The same applies for an infinite Hilbert space (but finite $M$) if the coupling to the higher Q-states decays fast enough. Once there is an estimate for the error in calculating the ADT matrix, it is possible to estimate the error in calculating the diabatic potentials. For this purpose we apply Eq. (21). It is seen that the error is of the second order in $\varepsilon$, namely of $O(\varepsilon^2)$, just like for the ADT matrix.

4.2.1.2. The curl condition. Next we analyze the P-Curl condition with the aim of examining to what extent it is affected when the weak coupling is ignored as described in the previous section [35]. For this purpose, we consider two components of the (unperturbed) $\tau$-matrix, namely, the matrices $\tau_q$ and $\tau_p$, which are written in the following form (see Eq. (40b)):

$$\tau_x = \begin{pmatrix} \tau_x^{(M)} & \tau_x^{(M,L)} \\ \tau_x^{(L,M)} & \tau_x^{(L)} \end{pmatrix}; \quad x = q, p. \tag{45}$$

Here $\tau_x^{(M)}$ (and eventually $\tau_x^{(L)}$); $x = p, q$ are the matrices that contain the strong NACTs whereas $\tau_x^{(M,L)}$ (and $\tau_x^{(L,M)}$); $x = p, q$ are the matrices that contain the weak NACTs, all being of the order $O(\varepsilon)$. Employing Eqs. (23) and (24) and substituting Eq. (45) for $\tau_q$ and $\tau_p$, it can be seen by algebraic manipulations that the following relation holds:

$$\frac{\partial\tau_p^{(M)}}{\partial q} - \frac{\partial\tau_q^{(M)}}{\partial p} = \left[\tau_p^{(M)}, \tau_q^{(M)}\right] + \left\{\tau_p^{(M,L)}\tau_q^{(L,M)} - \tau_q^{(M,L)}\tau_p^{(L,M)}\right\}. \tag{46}$$
As is noticed, all terms in the curly brackets are of order $\varepsilon^2$, which implies that the Curl condition becomes

$$\mathrm{Curl}\,\tau^{(M)} = \left[\tau^{(M)} \times \tau^{(M)}\right] + O(\varepsilon^2) \tag{47}$$
or, in other words, the Curl condition within the SHS is fulfilled up to $O(\varepsilon^2)$. Obviously, the fact that the solution of the ADT matrix is only perturbed to the second order makes the present approach rather attractive. It not only results in a very efficient approximation but also yields an estimate for the error made in applying the approximation.

4.2.2. The diabatization due to other approaches

Although the procedure described so far to reach the diabatic framework (to be termed 'diabatization') is, in principle, the most straightforward one, other approaches have also been developed [36–45]. As is noticed, the present approach is based on the NACTs, which are computationally expensive to obtain and quite often are not available. Other methods were developed to achieve approximate diabatization without explicitly referring to the NACTs.
One procedure, due to Macias and Riera [36], is based on the behavior of certain operators around the avoided crossing region (it was originally suggested for diatomic molecules). The main idea is to identify a symmetric operator some of whose terms behave 'violently' in the vicinity of this region but become mild following the ADT. Meyer and Werner [37], while applying this approach to LiF, considered the electronic dipole moment operator; Peric et al. [38,39], while studying the C2H system, suggested for this purpose the transition dipole moment operator; and Petrongolo et al. [40], while studying the NH2 system, considered the quadrupole moment and, for NO2, one of the dipole moments. These studies were all performed for two-state systems and since the ADT matrix is expressed, in such cases, by a single angle, the ADT angle (to be discussed later), the information available from the regular ab initio calculation suffices to determine, in this way, the ADT matrix. In all cases reported so far the calculated ADT angles exhibit a reasonable functional form. In particular, this procedure yields the value of $\pi/4$ while passing directly through the avoided crossing point. This is very clearly shown in Fig. 15 of Ref. [39]. However, it has to be emphasized that these calculations were carried out for two-state systems having in the region of interest one isolated CI. Thus, additional studies are necessary to find out whether this approach can be extended to a system with several CIs.

A different approach is utilized by Pacher et al. [41], Romero et al. [42], Sidis [43], and others [44,45], who developed recipes for constructing ab initio diabatic states. These methods can be efficient as long as one encounters, at most, one isolated CI in a given region in CS but have to be further developed if several CIs are located in the region of interest.

5. The quantization of the non-adiabatic coupling matrix

One of the main outcomes of the analysis so far is that the topological matrix $D$, presented in Eq. (36), is identical to an ADT matrix calculated at the end point of a closed contour. From Eq. (36) it is noticed that $D$ does not depend on any particular point along the contour but on the contour itself. Since the integration is carried out over the NACM, $\tau$, and since $D$ has to be a diagonal matrix with numbers of norm 1 for any contour in CS, these two facts impose severe restrictions on the NACTs. In the next section, we present a few analytical examples showing that the restrictions on the $\tau$-matrix elements are indeed quantization conditions that go back to the early days of quantum theory. Section 5.2 will be devoted to the general case.

5.1. The quantization as applied to model systems

In this section, we intend to show that for a certain type of models the above imposed 'restrictions' become the ordinary well known Bohr–Sommerfeld quantization conditions [46]. For this purpose we consider the following NACM $\tau$:

$$\tau(s) = g\,t(s), \tag{48}$$
where $t(s)$ is a vector whose components are functions in CS and $g$ is a constant antisymmetric matrix of dimension $M$. For this case, one can evaluate the ordered exponential in Eq. (36). Thus substituting Eq. (48) into Eq. (36) yields the following solution for the $D$-matrix:

$$D = G\exp\left(-\omega\oint_\Gamma ds\cdot t(s)\right)G^\dagger, \tag{49}$$
where $\omega$ is a diagonal matrix which contains the eigenvalues of the $g$-matrix and $G$ is a matrix that diagonalizes $g$ ($G^\dagger$ is the Hermitian conjugate of $G$). Since $g$ is an antisymmetric matrix, all its eigenvalues are either imaginary or zero. Next we concentrate on a few special cases:

5.1.1. The two-state case

The $g$-matrix in this case is given in the form

$$g = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{50a}$$

The matrix $G$ that diagonalizes it is

$$G = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} \tag{50b}$$

and the corresponding eigenvalues are $\pm i$. Substituting Eq. (50b) in Eq. (49) and replacing the two $\omega$'s by $\pm i$ yields the following $D$-matrix:

$$D = \begin{pmatrix} \cos\left(\oint_\Gamma t(s)\cdot ds\right) & -\sin\left(\oint_\Gamma t(s)\cdot ds\right) \\ \sin\left(\oint_\Gamma t(s)\cdot ds\right) & \cos\left(\oint_\Gamma t(s)\cdot ds\right) \end{pmatrix}. \tag{51}$$
Next we refer to the requirements to be fulfilled by the matrix $D$, namely, that it is diagonal and that it has in the diagonal numbers which are of norm 1. In order for that to happen the vector-function $t(s)$ has to fulfill along a given (closed) path $\Gamma$ the condition:

$$\oint_\Gamma t(s)\cdot ds = n\pi, \tag{52}$$

where $n$ is an integer. These conditions are essentially the Bohr–Sommerfeld quantization conditions [46] (as applied to the single term of the two-state $\tau$-matrix). Eq. (52) presents the condition for the extended CI case. It is noticed that if $n$ is an odd integer, the diagonal of the $D$-matrix contains two (−1)s, which means that the elements of the ADT matrix flip sign while tracing the closed contour in Eq. (52) (see Eq. (37)). This case is reminiscent of what happened in the simplified Jahn–Teller model as was studied by HLH [6], in which they showed that if two eigenfunctions that belong to the two states that form a CI trace a closed contour around that CI, both of them flip sign (see Appendix A). If the value of $n$, in Eq. (52), is an even integer the diagonal of the $D$-matrix contains two (+1)s, which implies that in this case none of the elements of the ADT matrix flip sign
while tracing the closed contour. This situation will be identified as the case where the above mentioned two eigenfunctions trace a closed contour but do not flip sign, the case known as the Renner–Teller model [5,47]. Eq. (52) is the extended version of the Renner–Teller case. In principle, we could have a situation where one of the diagonal elements is (+1) and one (−1) but from the structure of the $D$-matrix, one can see that this case can never happen.

In our introductory remarks, we said that this section would be devoted to model systems. Nevertheless, it is important to emphasize that although this case is treated within a group of model systems this 'model' stands for the general case of a two-state SHS. Moreover, this is the only case for which we can show, analytically, for a non-model system, that the restrictions on the $D$-matrix indeed lead to a quantization of the relevant NACT.

5.1.2. The three-state case

The NACM $\tau$ will be defined in a way similar to that in the previous section (see Eq. (48)), namely, as a product between a vector-function $t(s)$ and a constant antisymmetric matrix $g$ written in the form

$$g = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & \eta \\ 0 & -\eta & 0 \end{pmatrix}, \tag{53}$$

where $\eta$ is a (constant) parameter. Employing this form of $g$ we assumed that $g_{13}$ and $g_{31}$ are zero. (The more general case is treated elsewhere [29].) The eigenvalues of this matrix are

$$\omega_{1,2} = \pm i\omega, \quad \omega_3 = 0; \quad \omega = \sqrt{1 + \eta^2} \tag{54}$$

and the corresponding matrix, $G$, that diagonalizes the matrix $g$ is

$$G = \frac{1}{\omega\sqrt{2}}\begin{pmatrix} 1 & 1 & \eta\sqrt{2} \\ i\omega & -i\omega & 0 \\ -\eta & -\eta & \sqrt{2} \end{pmatrix}. \tag{55}$$

Employing, again, Eq. (49) we find for the $D$-matrix the following result:

$$D = \omega^{-2}\begin{pmatrix} \eta^2 + C & -\omega S & \eta(1 - C) \\ \omega S & \omega^2 C & -\eta\omega S \\ \eta(1 - C) & \eta\omega S & 1 + \eta^2 C \end{pmatrix}, \tag{56}$$

where

$$C = \cos\left(\omega\oint_\Gamma t(s)\cdot ds\right) \quad \text{and} \quad S = \sin\left(\omega\oint_\Gamma t(s)\cdot ds\right). \tag{57}$$
It is noticed that the necessary and sufficient condition for this matrix to become diagonal is that the following condition

$$\omega\oint_\Gamma t(s)\cdot ds = \sqrt{1 + \eta^2}\oint_\Gamma t(s)\cdot ds = 2n\pi \tag{58}$$
be fulfilled. Moreover, this condition leads to a $D$-matrix that contains in its diagonal numbers of norm 1 as required. However, in contrast to the previous two-state case, all three of them are positive, namely (+1). In other words, the 'quantization' of the $\tau$-matrix as expressed in Eq. (58) leads to a $D$-matrix that is a unit matrix and therefore will maintain the ADT matrix single-valued along any contour that fulfills this 'quantization'. This is, to a certain extent, an unexpected result but, as we shall see in the next section, it is not the typical result. Still it is an interesting result and we shall return to it in Sections 10 and 12.

5.1.3. The four-state case

The $g$-matrix in this case will be written in the form

$$g = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & \eta & 0 \\ 0 & -\eta & 0 & \sigma \\ 0 & 0 & -\sigma & 0 \end{pmatrix}, \tag{59}$$

where $\eta$ and $\sigma$ are the two parameters. The matrix $G$ that diagonalizes $g$ is

$$G = \frac{1}{\sqrt{2}}\begin{pmatrix} i\lambda_q & i\lambda_q & -i\lambda_p & -i\lambda_p \\ p\lambda_q & -p\lambda_q & -q\lambda_p & q\lambda_p \\ i\lambda_p & i\lambda_p & i\lambda_q & i\lambda_q \\ q\lambda_p & -q\lambda_p & p\lambda_q & -p\lambda_q \end{pmatrix}, \tag{60}$$

where $p$ and $q$ are defined as

$$p = \frac{1}{\sqrt{2}}\left(\varpi^2 + \sqrt{\varpi^4 - 4\sigma^2}\right)^{1/2}, \quad q = \frac{1}{\sqrt{2}}\left(\varpi^2 - \sqrt{\varpi^4 - 4\sigma^2}\right)^{1/2} \tag{61}$$

and $\varpi$ as

$$\varpi = \sqrt{1 + \eta^2 + \sigma^2} \tag{62a}$$

and $\lambda_p$ and $\lambda_q$ are defined as

$$\lambda_p = \sqrt{\frac{p^2 - 1}{p^2 - q^2}}, \quad \lambda_q = \sqrt{\frac{1 - q^2}{p^2 - q^2}}. \tag{62b}$$

From Eq. (61) it is obvious that $p > q$. The four eigenvalues are:

$$(\omega_1, \omega_2, \omega_3, \omega_4) \equiv (ip, -ip, iq, -iq). \tag{63}$$
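Again employing Eq. (49) we find for the $D$-matrix elements the following expressions:

$$\begin{aligned}
D_{11}(\alpha) &= \lambda_q^2 C_p + \lambda_p^2 C_q, & D_{12}(\alpha) &= p\lambda_q^2 S_p + q\lambda_p^2 S_q, \\
D_{13}(\alpha) &= \lambda_p\lambda_q(-C_p + C_q), & D_{14}(\alpha) &= \lambda_p\lambda_q(-qS_p + pS_q), \\
D_{22}(\alpha) &= p^2\lambda_q^2 C_p + q^2\lambda_p^2 C_q, & D_{23}(\alpha) &= \lambda_p\lambda_q(pS_p - qS_q), \\
D_{24}(\alpha) &= pq\lambda_p\lambda_q(C_p - C_q), & D_{33}(\alpha) &= \lambda_p^2 C_p + \lambda_q^2 C_q, \\
D_{34}(\alpha) &= -(q\lambda_p^2 S_p + p\lambda_q^2 S_q), & D_{44}(\alpha) &= q^2\lambda_p^2 C_p + p^2\lambda_q^2 C_q, \\
D_{21}(\alpha) &= -D_{12}(\alpha), \quad D_{31}(\alpha) = D_{13}(\alpha), & D_{32}(\alpha) &= -D_{23}(\alpha), \\
D_{41}(\alpha) &= -D_{14}(\alpha), \quad D_{42}(\alpha) = D_{24}(\alpha), & D_{43}(\alpha) &= -D_{34}(\alpha),
\end{aligned} \tag{64}$$

where

$$C_p = \cos(p\alpha) \quad \text{and} \quad S_p = \sin(p\alpha) \tag{65}$$

and similar expressions for $C_q$ and $S_q$. Here $\alpha$ stands for

$$\alpha = \oint_\Gamma f(s')\cdot ds'. \tag{66}$$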
Next we determine the conditions for this matrix to become diagonal (with numbers of norm 1 in the diagonal). This will happen if and only if p and q fulfill the following relations:
\[
p\alpha = p \oint_\Gamma f(s') \cdot ds' = 2n\pi, \tag{67a}
\]
\[
q\alpha = q \oint_\Gamma f(s') \cdot ds' = 2\ell\pi, \tag{67b}
\]
where n (> 1) and $\ell$, defined in the range n > $\ell$ > 0, are allowed to be either integers or half-integers but m (= n − $\ell$) can only attain integer values. The difference between the case where n and $\ell$ are integers and the case where both are half-integers is as follows: Examining the expressions in Eq. (64), it is noticed that in the first case all diagonal elements of D are (+1), so that D is, in fact, the unit matrix and therefore the elements of the ADT matrix are single-valued in CS. In the second case, we get from Eq. (64) that all four diagonal elements are (−1). In this case, when the ADT traces a closed contour all its elements flip sign. Since p and q are directly related to the NACTs $\eta$ and $\sigma$ (see Eqs. (61) and (62)), the two conditions in Eqs. (67) imply, again, ‘quantization’ conditions for the values of the $\tau$-matrix elements, namely for $\eta$ and $\sigma$, as well as for the vectorial function f(s). It is interesting to note that this is the first time that in the present framework the quantization is formed by two quantum numbers: a number n to be termed the principal quantum number and a number $\ell$, to be termed the secondary quantum number. This case is reminiscent of the two quantum numbers that characterize the hydrogen atom.
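The quantization conditions of Eqs. (58) and (67) can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes that for a constant g the D-matrix is the matrix exponential exp(−gα); the sign of the exponent is an assumption here and does not affect whether D equals ±I. The helper names (`mat_exp`, `g_matrix`, `d_matrix`, `four_state_parameters`) and the numerical values of p, q and η are chosen purely for illustration.

```python
import math

def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=25, squarings=10):
    """exp(m) via scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def g_matrix(couplings):
    """Antisymmetric tridiagonal g: couplings[j] is the (j, j+1) element."""
    n = len(couplings) + 1
    g = [[0.0] * n for _ in range(n)]
    for j, c in enumerate(couplings):
        g[j][j + 1] = c
        g[j + 1][j] = -c
    return g

def d_matrix(couplings, alpha):
    """Assumed form D = exp(-g * alpha) for a constant g-matrix."""
    return mat_exp([[-x * alpha for x in row] for row in g_matrix(couplings)])

def is_multiple_of_identity(d, sign, tol=1e-8):
    """True if d equals sign * I within the tolerance."""
    n = len(d)
    return all(abs(d[i][j] - (sign if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

def four_state_parameters(p, q):
    """Invert Eqs. (61)-(62a): eta and sigma giving a chosen p > 1 > q > 0."""
    return math.sqrt((p * p - 1.0) * (1.0 - q * q)), p * q

# Three-state case, Eq. (58): omega * alpha = 2*pi (n = 1).
eta3 = 0.7
omega = math.sqrt(1.0 + eta3 ** 2)
d3 = d_matrix([1.0, eta3], 2.0 * math.pi / omega)

# Four-state case, integers n = 2, l = 1: p = 1.6, q = 0.8, p * alpha = 4*pi.
eta, sigma = four_state_parameters(1.6, 0.8)
d4_int = d_matrix([1.0, eta, sigma], 2.0 * math.pi * 2.0 / 1.6)

# Four-state case, half-integers n = 3/2, l = 1/2: p = 1.5, q = 0.5.
eta, sigma = four_state_parameters(1.5, 0.5)
d4_half = d_matrix([1.0, eta, sigma], 2.0 * math.pi * 1.5 / 1.5)
```

Under these assumptions `d3` and `d4_int` come out as the unit matrix and `d4_half` as its negative, in line with the integer and half-integer cases discussed above.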
5.1.4. Comments concerning extensions

In the last three sub-sections, we treated one particular group of $\tau$-matrices as presented in Eq. (48) where g is an antisymmetric matrix with constant elements. The general theory demands that the matrix D as presented in Eq. (49) be diagonal and that as such it contains (+1)s and (−1)s in its diagonal. In the three examples that were worked out, we found that for this particular class of $\tau$-matrices the corresponding D-matrices contained either (+1)s or (−1)s but never a mixture of the two types. In other words, the D-matrix can be represented in the following way:
\[
D = (-1)^n I, \tag{68}
\]
where n is either even or odd and I the unit matrix. Indeed, for the two-state case, n was found to be either odd or even, for the three-state case it was found to be only even and for the four-state case it was again found to be either odd or even. It seems to us (without proof) that this pattern applies to any dimension. If this really is the case, then we can make the following two statements:

(a) In case the dimension of the $\tau$-matrix is an odd number, the D-matrix will always be the unit matrix I, namely n must be an even number. This is so because an odd dimensional g-matrix always has zero as an eigenvalue and this eigenvalue produces the (+1) in the D-matrix which ‘dictates’ the value of n in Eq. (68).

(b) In case the dimension of the $\tau$-matrix is an even number the D-matrix will (always) be equal either to I or to (−I).

(c) These two facts lead to the conclusions that in case of an odd dimension the ‘quantization’ is characterized by (a series of) integers only but in case of an even dimension it is characterized either by (a series of) integers or by (a series of) half-integers.

5.2. The treatment of the general case

The derivation of the D-matrix for the general case is based on first deriving the ADT matrix, A, as a function of $\lambda$ and then obtaining its value at the end of the arbitrary closed contours (when $\lambda$ becomes $\beta$).
Since A is a real unitary matrix, it can be expressed in terms of cosine and sine functions of given angles [32,48,49]. We shall, first, briefly consider the two special cases with M = 2 and 3. The case of M = 2 was treated by us in the previous section. Here this treatment is repeated with the aim of emphasizing different aspects and also for reasons of completeness. The matrix $A^{(2)}$ takes the form
\[
A^{(2)} =
\begin{pmatrix}
\cos\gamma_{12} & \sin\gamma_{12} \\
-\sin\gamma_{12} & \cos\gamma_{12}
\end{pmatrix},
\tag{69}
\]
where $\gamma_{12}$, the ADT angle, can be shown to be [17]
\[
\gamma_{12} = \int_{s_0}^{s} \tau_{12}(s') \cdot ds'. \tag{70}
\]
Designating $\alpha_{12}$ as the value of $\gamma_{12}$ for a closed contour, namely:
\[
\alpha_{12} = \oint_\Gamma \tau_{12}(s') \cdot ds', \tag{71}
\]
the corresponding $D^{(2)}$ matrix becomes accordingly (see also Eq. (51)):
\[
D^{(2)} =
\begin{pmatrix}
\cos\alpha_{12} & \sin\alpha_{12} \\
-\sin\alpha_{12} & \cos\alpha_{12}
\end{pmatrix}.
\tag{72}
\]
Since for any closed contour $D^{(2)}$ has to be a diagonal matrix with (+1)s and (−1)s, it is seen that $\alpha_{12} = n\pi$ where n is either odd or even (or zero) and therefore the only two possibilities for $D^{(2)}$ are as follows:
\[
D^{(2)} = (-1)^n I, \tag{73}
\]
where I is the unit matrix and n is either even or odd. The case of M = 3 is somewhat more complicated because the corresponding orthogonal matrix is expressed in terms of three angles, namely, $\gamma_{12}$, $\gamma_{13}$, and $\gamma_{23}$ [32,48]. This case was recently studied by us in detail [28] and here we briefly repeat the main points. The matrix $A^{(3)}$ is presented as a product of three rotation matrices of the form
\[
Q_{13}^{(3)}(\gamma_{13}) =
\begin{pmatrix}
\cos\gamma_{13} & 0 & \sin\gamma_{13} \\
0 & 1 & 0 \\
-\sin\gamma_{13} & 0 & \cos\gamma_{13}
\end{pmatrix}
\tag{74}
\]
(the other two, namely, $Q_{12}^{(3)}(\gamma_{12})$ and $Q_{23}^{(3)}(\gamma_{23})$, are of a similar structure with the respective cosine and sine functions in the appropriate positions) so that $A^{(3)}$ becomes
\[
A^{(3)} = Q_{12}^{(3)} Q_{23}^{(3)} Q_{13}^{(3)} \tag{75}
\]
or, following the multiplication, the more explicit form
\[
A^{(3)} =
\begin{pmatrix}
c_{12}c_{13} - s_{12}s_{23}s_{13} & s_{12}c_{23} & c_{12}s_{13} + s_{12}s_{23}c_{13} \\
-s_{12}c_{13} - c_{12}s_{23}s_{13} & c_{12}c_{23} & -s_{12}s_{13} + c_{12}s_{23}c_{13} \\
-c_{23}s_{13} & -s_{23} & c_{23}c_{13}
\end{pmatrix}.
\tag{76}
\]
Here $c_{ij} = \cos(\gamma_{ij})$ and $s_{ij} = \sin(\gamma_{ij})$. The three angles are obtained by solving the following three coupled first order differential equations which follow from Eq. (19) [28,48]:
\[
\begin{aligned}
\nabla\gamma_{12} &= -\tau_{12} - \tan\gamma_{23}\,(-\tau_{13}\cos\gamma_{12} + \tau_{23}\sin\gamma_{12}), \\
\nabla\gamma_{23} &= -(\tau_{23}\cos\gamma_{12} + \tau_{13}\sin\gamma_{12}), \\
\nabla\gamma_{13} &= -(\cos\gamma_{23})^{-1}(-\tau_{13}\cos\gamma_{12} + \tau_{23}\sin\gamma_{12}).
\end{aligned}
\tag{77}
\]
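The product form of Eq. (75) can be checked against an explicit element-by-element expression. The sketch below is illustrative only. It assumes that $Q_{12}$ and $Q_{23}$ carry the same sign pattern as $Q_{13}$ in Eq. (74) (plus sine above the diagonal, minus sine below); the function names are hypothetical, and the explicit elements are those obtained by carrying out the multiplication under that assumption.

```python
import math

def rotation(m, i, j, gamma):
    """Elementary m x m rotation: cos at (i,i) and (j,j), +sin at (i,j), -sin at (j,i)."""
    q = [[float(r == c) for c in range(m)] for r in range(m)]
    q[i][i] = q[j][j] = math.cos(gamma)
    q[i][j] = math.sin(gamma)
    q[j][i] = -math.sin(gamma)
    return q

def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adt_matrix_3(g12, g23, g13):
    """A(3) = Q12 Q23 Q13, the ordering of Eq. (75); indices are 0-based here."""
    q12 = rotation(3, 0, 1, g12)
    q23 = rotation(3, 1, 2, g23)
    q13 = rotation(3, 0, 2, g13)
    return mat_mul(mat_mul(q12, q23), q13)

def adt_matrix_3_explicit(g12, g23, g13):
    """Elements obtained by multiplying out Q12 Q23 Q13 by hand."""
    c12, s12 = math.cos(g12), math.sin(g12)
    c23, s23 = math.cos(g23), math.sin(g23)
    c13, s13 = math.cos(g13), math.sin(g13)
    return [
        [c12 * c13 - s12 * s23 * s13, s12 * c23, c12 * s13 + s12 * s23 * c13],
        [-s12 * c13 - c12 * s23 * s13, c12 * c23, -s12 * s13 + c12 * s23 * c13],
        [-c23 * s13, -s23, c23 * c13],
    ]

def max_abs_diff(a, b):
    """Largest element-wise deviation between two matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))

def transpose(a):
    return [list(col) for col in zip(*a)]

# Arbitrary test angles (radians), chosen only for illustration.
angles = (0.3, -1.1, 2.0)
product_form = adt_matrix_3(*angles)
explicit_form = adt_matrix_3_explicit(*angles)
identity = [[float(i == j) for j in range(3)] for i in range(3)]
orthogonality_error = max_abs_diff(mat_mul(product_form, transpose(product_form)), identity)
```

Here `max_abs_diff(product_form, explicit_form)` and `orthogonality_error` should both vanish to rounding accuracy, which is a quick way to catch a misplaced sine or cosine in the explicit form.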
These equations were integrated as a function of $\varphi$ (where $0 \le \varphi \le 2\pi$), for a model potential [28] along a circular contour of radius $\rho$ (for details see Section 13.2). The $\varphi$-dependent $\gamma$ angles, i.e., $\gamma_{ij}(\varphi \mid \rho)$, for various values of $\rho$ and $\Delta\varepsilon$ ($\Delta\varepsilon$ is the potential-energy shift defined as the shift between the two original coupled adiabatic states and a third state, at the origin, i.e. at $\rho = 0$) are presented in Fig. 1. Thus for each $\varphi$ we get, employing Eq. (76), the $A^{(3)}(\varphi)$-matrix elements. The relevant $D^{(3)}$-matrix is obtained from $A^{(3)}$ by substituting $\varphi = 2\pi$. If $\alpha_{ij}$ are defined as
\[
\alpha_{ij} = \gamma_{ij}(\varphi = 2\pi), \tag{78}
\]
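Fig. 1. The three adiabatic–diabatic-transformation angles (obtained by solving Eqs. (77) for a 3 × 3 diabatic model potential presented in Section 13.2) $\gamma_{12}(\varphi)$, $\gamma_{23}(\varphi)$, $\gamma_{13}(\varphi)$ as calculated for different values of $\rho$ and $\Delta\varepsilon$: (a) $\gamma = \gamma_{12}$, $\Delta\varepsilon = 0.0$; (b) $\gamma = \gamma_{12}$, $\Delta\varepsilon = 0.05$; (c) $\gamma = \gamma_{12}$, $\Delta\varepsilon = 0.25$; (d) $\gamma = \gamma_{23}$, $\Delta\varepsilon = 0.0$; (e) $\gamma = \gamma_{23}$, $\Delta\varepsilon = 0.05$; (f) $\gamma = \gamma_{23}$, $\Delta\varepsilon = 0.25$; (g) $\gamma = \gamma_{13}$, $\Delta\varepsilon = 0.0$; (h) $\gamma = \gamma_{13}$, $\Delta\varepsilon = 0.05$; (i) $\gamma = \gamma_{13}$, $\Delta\varepsilon = 0.25$. (——–) $\rho = 0.01$; (——) $\rho = 0.1$; (- - - - - -) $\rho = 0.5$.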
then, as is noticed from Fig. 1, the values of $\alpha_{ij}$ are either zero or $\pi$. A simple analysis of Eq. (76), for these values of $\alpha_{ij}$, shows that $D^{(3)}$ is a diagonal matrix with two (−1)s and one (+1) in the diagonal. This result will now be generalized for an arbitrary $D^{(3)}$-matrix in the following way: Since a general $A^{(3)}$-matrix can always be written as in Eq. (76) the corresponding $D^{(3)}$-matrix will become diagonal if and only if:
\[
\alpha_{ij} = \gamma_{ij}(\varphi = 2\pi) = n_{ij}\pi. \tag{79}
\]
The diagonal terms can then, explicitly, be represented as
\[
D_{ij}^{(3)} = \delta_{ij} \cos\alpha_{jn} \cos\alpha_{jm}; \qquad j \ne n \ne m; \quad j = 1, 2, 3. \tag{80}
\]
This expression shows that the $D^{(3)}$-matrix, in the most general case, can have either three (+1)s in the diagonal or two (−1)s and one (+1). In the first case, the contour does not surround any CI whereas in the second case, it surrounds either one or two CIs (a more general discussion regarding this ‘geometrical’ aspect will be given in Section 9). It is important to emphasize that this analysis, although it is supposed to hold for a general three-state case, contradicts the analysis we performed of the three-state model in Section 5.1.2. The reason is that the ‘general (physical) case’ applies to an (arbitrary) aggregation of CIs
whereas the previous case applies to a special (probably unphysical) situation. In Section 10, the discussion on this subject is extended. In what follows, the cases for an aggregation of CIs will be termed the ‘breakable’ situations (the reason for choosing this name will be given later) in contrast to the type of models which were discussed in Sections 5.1.2 and 5.1.3 and which are termed as the ‘unbreakable’ situation.

Before discussing the general case, we would like to refer to the present choice of the rotation angles. It is well noticed that they differ from the ordinary Euler angles which are routinely used to present three-dimensional orthogonal matrices [50]. In fact, we could apply the Euler angles for this purpose and get identical results for $A^{(3)}$ (and for $D^{(3)}$). The main reason we prefer the ‘democratic’ choice of the angles is that this set of angles can be extended to an arbitrary dimension without any difficulty as will be done next. The M-dimensional ADT matrix $A^{(M)}$ will be written as a product of elementary rotation matrices similar to that given in Eq. (75) [32]:
\[
A^{(M)} = \prod_{i=1}^{M-1} \prod_{j>i}^{M} Q_{ij}^{(M)}(\gamma_{ij}), \tag{81}
\]
where $Q_{ij}^{(M)}(\gamma_{ij})$ (like in Eq. (74)) is an M × M matrix with the following terms: In its (ii) and (jj) positions (along the diagonal) are located the two relevant cosine functions and at the rest of the (M − 2) positions are located (+1)s; in the (ij) and (ji) off-diagonal positions are located the two relevant ±sine functions and at all other remaining positions are zeros. From Eq. (81) it can be seen that the number of matrices contained in this product is M(M − 1)/2 and that this is also the number of independent $\gamma_{ij}$-angles which are needed to describe an M × M unitary matrix (we recall that the missing M(M + 1)/2 conditions follow from the orthonormality conditions). The matrix $A^{(M)}$ as presented in Eq. (81) is characterized by two important features:

(a) Every diagonal element contains at least one term which is a product of cosine functions only.

(b) Every off-diagonal element is a summation of products of terms where each product contains at least one sine function.

These two features will lead to conditions to be imposed on the various $\gamma_{ij}$-angles to ensure that the topological matrix, $D^{(M)}$, is diagonal as discussed in Section 4.1. To obtain the $\gamma_{ij}$-angles, one usually has to solve the relevant first order differential equations of the type given in Eq. (77). Next, like before, the $\alpha_{ij}$-angles are defined as the $\gamma_{ij}$-angles at the end of a closed contour. In order to obtain the matrix $D^{(M)}$ one has to replace, in Eq. (81), the angles $\gamma_{ij}$ by the corresponding $\alpha_{ij}$-angles. Since $D^{(M)}$ has to be a diagonal matrix with (+1)s and (−1)s in the diagonal this can be achieved if and only if all $\alpha_{ij}$-angles are zero or multiples of $\pi$. It is straightforward to show that with this structure the elements of $D^{(M)}$ become
\[
D_{ij}^{(M)} = \delta_{ij} \prod_{k \ne i}^{M} \cos\alpha_{ik} = \delta_{ij}\,(-1)^{\sum_{k \ne i}^{M} n_{ik}}; \qquad i = 1, \ldots, M, \tag{82}
\]
where $n_{ik}$ are integers which fulfill $n_{ik} = n_{ki}$. From Eq. (82), it is noticed that along the diagonal of $D^{(M)}$ we may encounter K numbers which are equal to (−1) and the rest which are equal to (+1). It is important to emphasize that in case a contour does not surround any CI the value of K is 0.
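Eq. (82) lends itself to a direct numerical check. The sketch below is illustrative only: it builds the product of elementary rotations of Eq. (81) with every angle set to an integer multiple of π and compares the resulting diagonal with the sign rule of Eq. (82). The sign pattern of the rotations is taken from Eq. (74), the indices are 0-based, and all function names are hypothetical. Since every factor is diagonal at these angles, the ordering of the product is immaterial.

```python
import math
from itertools import product as cartesian

def rotation(m, i, j, gamma):
    """Elementary m x m rotation with the sign pattern of Eq. (74)."""
    q = [[float(r == c) for c in range(m)] for r in range(m)]
    q[i][i] = q[j][j] = math.cos(gamma)
    q[i][j] = math.sin(gamma)
    q[j][i] = -math.sin(gamma)
    return q

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def topological_matrix(m, n):
    """Product of Eq. (81) evaluated at alpha_ij = n[(i, j)] * pi, with i < j."""
    d = [[float(r == c) for c in range(m)] for r in range(m)]
    for i in range(m - 1):
        for j in range(i + 1, m):
            d = mat_mul(d, rotation(m, i, j, n[(i, j)] * math.pi))
    return d

def predicted_diagonal(m, n):
    """Diagonal from Eq. (82): (-1) raised to the sum of n_ik over k != i."""
    diagonal = []
    for i in range(m):
        exponent = sum(n[(min(i, k), max(i, k))] for k in range(m) if k != i)
        diagonal.append(1 if exponent % 2 == 0 else -1)
    return diagonal

def count_minus_ones(diagonal):
    """K, the number of (-1)s along the diagonal."""
    return sum(1 for x in diagonal if x == -1)

# Example for M = 4: only alpha_12 and alpha_23 equal pi (1-based labels).
example_n = {(0, 1): 1, (0, 2): 0, (0, 3): 0, (1, 2): 1, (1, 3): 0, (2, 3): 0}
example_d = topological_matrix(4, example_n)
example_diagonal = [round(example_d[i][i]) for i in range(4)]

# K for every choice of n_ij in {0, 1} with M = 4.
pairs = sorted(example_n)
all_k = [count_minus_ones(predicted_diagonal(4, dict(zip(pairs, values))))
         for values in cartesian((0, 1), repeat=len(pairs))]
```

In this example the first and third eigenfunctions flip sign (K = 2), and the exhaustive loop illustrates that K is even for every choice of the $n_{ij}$, because each $n_{ij}$ enters exactly two diagonal elements.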
6. The construction of sub-Hilbert spaces and sub–sub-Hilbert spaces

In Section 2.2, it was shown that the condition in Eq. (10) or its relaxed form in Eq. (10′) (in Section 4) enables the construction of SHS. Based on this possibility we consider a prescription first for constructing the SHS which extends to the full CS and then, as a second step, constructing the sub-SHS that extends only to (a finite) portion of CS.

In the study of (electronic) curve crossing problems one distinguishes between a situation where two electronic curves, $E_j(R)$; j = 1, 2, approach each other at a point $R = R_0$ so that the difference $\Delta E(R = R_0) = E_2(R = R_0) - E_1(R = R_0) \equiv 0$, and a situation where the two electronic curves interact so that $\Delta E(R) \sim$ Const. (> 0). The first case is usually treated by the Landau–Zener (LZ) formula [52–56] and the second is based on the Demkov-approach [57]. It is well known that whereas the LZ-type interactions are strong enough to cause transitions between two adiabatic states the Demkov-type interactions are usually weak and affect the motion of the interacting molecular species relatively slightly. The LZ situation is the one that becomes the Jahn–Teller conical intersection (CI) in two dimensions [5–11]. We shall also include the Renner–Teller parabolic intersection (PI) [5,58,61], although it is characterized by two interacting potential energy surfaces which behave quadratically (and not linearly as in the LZ case) in the vicinity of the above mentioned degeneracy point.

6.1. The construction of sub-Hilbert spaces

Following Section 2.2 we shall be more specific about what is meant by ‘strong’ and ‘weak’ interactions. It turns out that such a criterion can be assumed, based on whether two consecutive states do, or do not, form a CI or a PI (it is important to mention that only consecutive states can form CI/PIs).
The two types of intersections are characterized by the fact that the non-adiabatic coupling terms, at the points of the intersection, become infinite (these points can be considered as the ‘black holes’ in molecular systems and it is mainly through these ‘black holes’ that electronic states interact with each other). Based on what was said so far, we suggest breaking up a complete Hilbert space of size N into L SHSs of varying sizes $N_P$; P = 1, …, L, where
\[
N = \sum_{P=1}^{L} N_P \tag{83}
\]
with L being finite or infinite. Before we continue with the construction of the SHSs we would like to make the following comment: Usually, when two given states form CI/PIs, one thinks of isolated points in CS. In fact, CI/PI points are not isolated points but form (finite or infinite) seams which ‘cut’ through the molecular CS. However, since our studies are carried out for planes, these planes, usually, contain isolated CI/PI points only. The criterion according to which the break-up is carried out is based on the NACTs $\tau_{ij}$ as were defined in Eq. (8a). In what follows, we distinguish between two kinds of non-adiabatic coupling terms: (a) The intra non-adiabatic coupling terms $\tau_{ij}^{(P)}$ which are formed between two eigenfunctions belonging to a given SHS, namely, the Pth SHS:
\[
\tau_{ij}^{(P)} = \langle \zeta_i^{(P)} | \nabla \zeta_j^{(P)} \rangle; \qquad i, j = 1, \ldots, N_P \tag{84}
\]
Fig. 2. A schematic picture describing the three consecutive sub-Hilbert spaces, namely, the (P − 1)th, the Pth and the (P + 1)th. The dotted lines are separation lines.

and (b) Inter non-adiabatic coupling terms $\tau_{ij}^{(P,Q)}$ which are formed between two eigenfunctions, the first belonging to the Pth SHS and the second to the Qth SHS:
\[
\tau_{ij}^{(P,Q)} = \langle \zeta_i^{(P)} | \nabla \zeta_j^{(Q)} \rangle; \qquad i = 1, \ldots, N_P; \quad j = 1, \ldots, N_Q. \tag{85}
\]
The Pth SHS is defined through the following two requirements:

(1) All $N_P$ states belonging to the Pth SHS interact strongly with each other in the sense that each pair of consecutive states have at least one point where they form an LZ-type interaction. In other words, all $\tau_{j\,j+1}^{(P)}$; j = 1, …, $N_P$ − 1 form, at least at one point in CS, a CI/PI.

(2) The range of the Pth SHS is defined in such a way that the lowest (or the first) state and the highest (the $N_P$th) state which belong to this SHS form Demkov-type interactions with the highest state belonging to the lower (P − 1)th SHS and with the lowest state belonging to the upper (P + 1)th SHS, respectively (see Fig. 2). In other words, the two non-adiabatic coupling terms fulfill the conditions (see Eq. (10′) in Section 4.2):
\[
\tau_{N_{P-1}\,1}^{(P-1,P)} \sim O(\varepsilon) \quad \text{and} \quad \tau_{N_P\,1}^{(P,P+1)} \sim O(\varepsilon). \tag{86}
\]
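Requirements (1) and (2) amount to a simple grouping rule once it is known which pairs of consecutive states form a CI/PI somewhere in CS. The following sketch is only an illustration of that bookkeeping. It assumes the information is available as a list of booleans, one per consecutive pair; the function name and the 0-based state labels are hypothetical, and pairs flagged `False` are taken to be coupled by weak, Demkov-type terms of order ε.

```python
def split_into_subspaces(forms_ci_or_pi):
    """Group states 0..N-1 into sub-Hilbert spaces.

    forms_ci_or_pi[j] is True when states j and j+1 form at least one
    CI/PI in configuration space (strong, LZ-type coupling) and False
    when only weak couplings connect them.
    """
    subspaces = [[0]]
    for j, strong in enumerate(forms_ci_or_pi):
        if strong:
            # State j+1 joins the sub-Hilbert space of state j.
            subspaces[-1].append(j + 1)
        else:
            # A weakly coupled pair marks the border between two SHSs.
            subspaces.append([j + 1])
    return subspaces

# Hypothetical seven-state example: the pairs (2,3) and (4,5) are only weakly coupled.
flags = [True, True, False, True, False, True]
subspaces = split_into_subspaces(flags)
sizes = [len(s) for s in subspaces]
```

For this example the sizes are $N_P$ = 3, 2, 2, which add up to N = 7 as required by Eq. (83).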
At this point we make two comments: (a) Conditions (1) and (2) lead to a well defined SHS which for any further treatments (in spectroscopy or scattering processes) has to be treated as a whole (and not on a ‘state-by-state’ level). (b) Since all states in a given SHS are adiabatic
states strong interactions of the LZ-type can occur between two consecutive states only. However Demkov-type interactions may exist between any two states.

6.2. The construction of sub–sub-Hilbert spaces

As we have seen the sub-Hilbert spaces are defined for the whole CS and this requirement could lead, in certain cases, to situations where it will be necessary to include the complete Hilbert space. However, it frequently happens that the dynamics we intend to study takes place in a given, isolated, region which contains only part of the CI/PI points and the question is whether the effects of the other CI/PIs can be ignored. The answer to this question can be given following a careful study of these effects employing the line integral approach as presented in terms of Eq. (25). For this purpose we analyze what happens along a certain line $\Gamma$ which surrounds one CI or more. To continue we employ the same procedure as discussed in Section 4.2: We break up the ADT matrix and the $\tau$-matrix as written in Eq. (40) and then continue like in Eq. (41), etc. In this way we can show that if, along the particular line $\Gamma$, the ‘non-interesting’ parts of the $\tau$-matrix are of order $\varepsilon$, the error expected for the interesting part in the ADT matrix is of order $O(\varepsilon^2)$. If this happens for any contour in this region then we can just ignore the effects of CIs which are outside this region and carry out the dynamic calculations with this reduced set of states.

7. The topological spin

Before we continue and in order to avoid confusion two matters have to be clarified: (a) We distinguished between two types of LZ situations, which form (in two dimensions) the Jahn–Teller CI and the Renner–Teller PI. The main difference between the two is that the PIs do not produce topological effects and therefore, as far as this subject is concerned, they can be ignored.
Making this distinction leads to the conclusion that the more relevant magnitude to characterize topological effects, for a given SSHS, is not its dimension M but $N_J$, the number of CIs. (b) In general, one may encounter more than one CI between a pair of states [76,80]. However, to simplify the study, we assume one CI for a pair of states so that ($N_J$ + 1) stands for the number of states that form the CIs.

So far we introduced three different integers M, $N_J$ and K. As mentioned earlier, M is a characteristic number of the SSHS (see Section 6.2) but is not relevant for topological effects; instead $N_J$, as just mentioned, is a characteristic number of the SSHS and relevant for topological effects, and K, the number of (−1)s in the diagonal of the topological matrix D (or the number of eigenstates that flip sign while the electronic manifold traces a closed contour) is relevant for topological effects but may vary from one contour to another and therefore is not a characteristic feature for a given SSHS.

Our next task is to derive all possible K-values for a given $N_J$. Let us first refer to a few special cases: It was shown before that in case of $N_J$ = 1 the D-matrix contains two (−1)s in its diagonal in case the contour surrounds the CI(29) and no (−1)s when the contour does not surround the CI. Thus the allowed values of K are either 2 or zero. The value K = 1 is not allowed. A similar inspection of the case $N_J$ = 2 reveals that K, as before, is equal either to
2 or to zero (see Section 5.2). Thus the values K = 1 or 3 are not allowed. From here we continue to the general case and prove the following statement: In any molecular system, K can attain only even integers in the range:
\[
K = \{0, 2, \ldots, K_J\}; \qquad
K_J =
\begin{cases}
N_J, & N_J = 2p, \\
N_J + 1, & N_J = 2p + 1,
\end{cases}
\tag{87}
\]
where p is an integer. The proof is based on Eq. (82). Let us assume that a certain closed contour yields a set of $\alpha_{ij}$-angles which produce a number K. Next we consider a slightly different closed contour, along which one of these $\alpha_{ij}$’s, say $\alpha_{st}$, changed its value from zero to $\pi$. From Eq. (82) it can be seen that only two D-matrix elements contain $\cos(\alpha_{st})$, namely, $D_{ss}$ and $D_{tt}$. Now if these two matrix elements were, following the first contour, positive then changing $\alpha_{st}$ from $0 \to \pi$ would produce two additional (−1)s, thus increasing K to K + 2; if these two matrix elements were negative, this change would cause K to decrease to K − 2; and if one of these elements was positive and the other negative then changing $\alpha_{st}$ from $0 \to \pi$ would not affect K. Thus, irrespective of the value of $N_J$, the various K values differ from each other by even integers only. Now since any set of K’s also contains the value K = 0 (the case when the closed loop does not surround any CIs), this implies that K can attain only even integers. The final result is the set of values presented in Eq. (87).

The fact that there is a one-to-one relation between the (−1)s in the diagonal of the topological matrix and the fact that the eigenfunctions flip sign along closed contours (see discussion at the end of Section 4.1) hints at the possibility that these sign flips are related to a kind of a spin quantum number and in particular to its magnetic components. The spin in quantum mechanics was introduced because experiments indicated that individual particles are not completely identified in terms of their three spatial coordinates [51].
Here we encounter, to some extent, a similar situation: A system of items (i.e., distributions of electrons) in a given point in CS is usually described in terms of its set of eigenfunctions. This description is incomplete because the existence of CIs causes the electronic manifold to be multi-valued. For instance, in case of two (isolated) CIs we may encounter at a given point in CS four different sets of eigenfunctions (see next section):
\[
\text{(a) } (\zeta_1, \zeta_2, \zeta_3); \quad
\text{(b) } (-\zeta_1, -\zeta_2, \zeta_3); \quad
\text{(c) } (\zeta_1, -\zeta_2, -\zeta_3); \quad
\text{(d) } (-\zeta_1, \zeta_2, -\zeta_3).
\tag{88}
\]
In case of three CIs we have as many as eight different sets of eigenfunctions, etc. Thus, we have to refer to an additional characterization of a given SSHS. This characterization is related to the number $N_J$ of CIs and the associated possible number of sign flips due to different contours in the relevant region of CS, traced by the electronic manifold. In Refs. [26,29], it was shown that in a two-state system the non-adiabatic coupling term, $\tau_{12}$, has to be ‘quantized’ in the following way:
\[
\oint_\Gamma \tau_{12}(s') \cdot ds' = n\pi, \tag{89}
\]
where n is an integer (in order to guarantee that the 2 × 2 diabatic potential be single-valued in configuration space). In case of CIs this number has to be an odd integer and for our purposes it is assumed to be n = 1. Thus each conical intersection can be considered as a ‘spin’. Since in
a given SSHS $N_J$ conical intersections are encountered, we could define the spin, J, of this SSHS as ($N_J$/2). However, this definition may lead to more sign flips than we actually encounter (see next section). In order to make a connection between J and $N_J$ as well as with the ‘magnetic components’ $M_J$ of J and the number of the actual sign flips, the spin J has to be defined as
\[
J = \frac{1}{2}\,\frac{K_J}{2}; \qquad
K_J =
\begin{cases}
N_J, & N_J = 2p, \\
N_J + 1, & N_J = 2p + 1,
\end{cases}
\tag{90a}
\]
and, accordingly, the various $M_J$-values will be defined as
\[
M_J = J - K/2 \quad \text{where} \quad K = \{0, 2, \ldots, K_J\}. \tag{90b}
\]
For the eight lowest $N_J$ values, we have the following assignments:

for $N_J$ = 0: {J = 0; $M_J$ = 0};
for $N_J$ = 1: {J = 1/2; $M_J$ = 1/2, −1/2};
for $N_J$ = 2: {J = 1/2; $M_J$ = 1/2, −1/2};
for $N_J$ = 3: {J = 1; $M_J$ = 1, 0, −1};
for $N_J$ = 4: {J = 1; $M_J$ = 1, 0, −1};
for $N_J$ = 5: {J = 3/2; $M_J$ = 3/2, 1/2, −1/2, −3/2};
for $N_J$ = 6: {J = 3/2; $M_J$ = 3/2, 1/2, −1/2, −3/2};
for $N_J$ = 7: {J = 2; $M_J$ = 2, 1, 0, −1, −2}. (90c)
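The assignments in Eq. (90c) follow mechanically from Eqs. (90a) and (90b), as the short sketch below illustrates. The function names are hypothetical and exact fractions are used only to keep the half-integer values readable.

```python
from fractions import Fraction

def k_max(n_ci):
    """K_J of Eq. (90a): N_J for even N_J, N_J + 1 for odd N_J."""
    return n_ci if n_ci % 2 == 0 else n_ci + 1

def spin_assignment(n_ci):
    """Return (J, [M_J values]) for a given number of conical intersections N_J."""
    kj = k_max(n_ci)
    j = Fraction(kj, 4)  # J = (1/2) * (K_J / 2)
    # M_J = J - K/2 for every allowed (even) K in {0, 2, ..., K_J}
    m_values = [j - Fraction(k, 2) for k in range(0, kj + 1, 2)]
    return j, m_values

# Assignments for the values of N_J listed in Eq. (90c).
table = {n_ci: spin_assignment(n_ci) for n_ci in range(8)}
```

Each entry of `table` holds the spin J and its list of magnetic components for one value of $N_J$; the number of components equals the number of allowed K-values, $K_J$/2 + 1.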
The general formula and the individual cases as presented in the above list indicate that indeed the number of conical intersections in a given SSHS and the number of possible sign flips within this SSHS are interrelated, similar to a spin J with respect to its magnetic components $M_J$. In other words, each decoupled SSHS is now characterized by a spin quantum number J which connects between the number of conical intersections in this system and the topological effects which characterize it.

8. An analytical derivation for the possible sign flips in a three-state system

In the next section, we intend to present a geometrical analysis to permit us to gain some insight with respect to the phenomenon of sign flips in an M-state system (M > 2). This can be done without the support of a parallel mathematical study [60]. In this section, we intend to supply the mathematical foundation (and justification) for this analysis [59]. Thus employing the LI approach, we intend to prove the following statement: If a contour in a given plane surrounds two CIs belonging to two different (adjacent) pairs of states, only two eigenfunctions flip sign: one that belongs to the lowest state and the other that belongs to the highest one. To prove this we consider the three following regions (see Fig. 3): In the first region, designated $\sigma_{12}$, is located the main portion of the interaction, $t_{12}$, between states 1 and 2 with the point of the CI/PI at $C_{12}$. In the second region, designated as $\sigma_{23}$, is located the main
Fig. 3. The breaking up of a region $\sigma$, that contains two CIs (at $C_{12}$ and $C_{23}$), into three sub-regions: (a) The full region $\sigma$ defined in terms of the closed contour $\Gamma$. (b) The region $\sigma_{12}$ which contains a CI at $C_{12}$ and is defined by the closed contour $\Gamma_{12}$. (c) The region $\sigma_0$, which is defined by the closed contour $\Gamma_0$ and does not contain any CI. (d) The region $\sigma_{23}$ which contains a CI at $C_{23}$ and is defined by the closed contour $\Gamma_{23}$. It can be seen that $\Gamma = \Gamma_{12} + \Gamma_0 + \Gamma_{23}$.
portion of the interaction, $t_{23}$, between states 2 and 3 with the point of the CI/PI at $C_{23}$. In addition, we assume a third region, $\sigma_0$, which is located in-between the two and is used as a buffer zone. Next, it is assumed that the intensity of the interactions due to the components of $t_{23}$ in $\sigma_{12}$ and due to $t_{12}$ in $\sigma_{23}$ is $\sim 0$. This situation can always be achieved by shrinking $\sigma_{12}$ ($\sigma_{23}$) towards its corresponding center $C_{12}$ ($C_{23}$). In $\sigma_0$ the components of both, $t_{12}$ and $t_{23}$, may be of arbitrary magnitude but no CI/PI of any pair of states is allowed to be there. To prove our statement, we consider the line integral (see Eq. (25)):
\[
A = A_0 - \oint_\Gamma ds \cdot \tau A, \tag{91}
\]
where the integration is carried out along a closed contour $\Gamma$, A is the 3 × 3 ADT matrix to be calculated, the dot stands for a scalar product and $\tau$ is the matrix of 3 × 3 that contains the two non-adiabatic coupling terms, namely:
\[
\tau(s) =
\begin{pmatrix}
0 & t_{12} & 0 \\
-t_{12} & 0 & t_{23} \\
0 & -t_{23} & 0
\end{pmatrix}.
\tag{92}
\]
(It is noticed that the components of $t_{13} \sim 0$.) This assumption is not essential for the proof, but makes it more straightforward.
The integral in Eq. (91) will now be presented as a sum of three integrals (for a detailed discussion on that subject see Appendix C), namely:
\[
A = A_0 - \oint_{\Gamma_{12}} ds \cdot \tau A - \oint_{\Gamma_0} ds \cdot \tau A - \oint_{\Gamma_{23}} ds \cdot \tau A. \tag{93}
\]
Since there is no CI in the buffer zone, $\sigma_0$, the second integral is zero and can be deleted so that we are left with the first and the third integrals. In general, the calculation of each integral is independent of the other; however, the two calculations have to yield the same result and therefore, they have to be interdependent to some extent. Thus, we do each calculation separately but for different (yet unknown) boundary conditions: the first integral will be done for $G_{12}$ as a boundary condition and the second for $G_{23}$. Thus A will be calculated twice:
\[
A = G_{ij} - \oint_{\Gamma_{ij}} ds \cdot \tau A. \tag{94}
\]
Next, the topological matrices D, $D_{12}$ and $D_{23}$ are introduced which are related to A in the following way (see Eq. (37)):
\[
A = D A_0, \tag{95a}
\]
\[
A = D_{12} G_{12}, \tag{95b}
\]
\[
A = D_{23} G_{23}. \tag{95c}
\]
The three equalities can be fulfilled if and only if the two G-matrices, namely, $G_{12}$ and $G_{23}$, are chosen to be
\[
G_{12} = D_{23} A_0 \quad \text{and} \quad G_{23} = D_{12} A_0. \tag{96}
\]
Since all D-matrices are diagonal the same applies to $D_{12}$ and $D_{23}$ so that D becomes
\[
D = D_{13} = D_{12} D_{23}. \tag{97}
\]
Our next task will be to obtain $D_{12}$ and $D_{23}$. For this purpose, we consider the two partial $\tau$-matrices, $\tau_{12}$ and $\tau_{23}$:
\[
\tau_{12}(s) =
\begin{pmatrix}
0 & t_{12} & 0 \\
-t_{12} & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\quad \text{and} \quad
\tau_{23}(s) =
\begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & t_{23} \\
0 & -t_{23} & 0
\end{pmatrix},
\tag{98}
\]
so that
\[
\tau = \tau_{12} + \tau_{23}. \tag{99}
\]
We start with the first of Eq. (94), namely:
\[
A = G_{12} - \oint_{\Gamma_{12}} ds \cdot \tau_{12} A, \tag{100}
\]
where $\tau_{12}$ replaces $\tau$ because $\tau_{23}$ is assumed to be negligibly small in $\sigma_{12}$. The solution and the corresponding D-matrix, namely, $D_{12}$ are well known (see discussion in Section 5.1.1). Thus,
\[
D_{12} =
\begin{pmatrix}
-1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 0 & 1
\end{pmatrix}
\tag{101}
\]
which implies (as already explained in Section 4.1) that the first (lowest) and the second functions flip sign. In the same way, it can be shown that $D_{23}$ is equal to
\[
D_{23} =
\begin{pmatrix}
1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 0 & -1
\end{pmatrix}
\tag{102}
\]
which shows that the second and the third (the highest) eigenfunctions flip sign. Substituting Eqs. (101) and (102) into Eq. (97) yields the following result for $D_{13}$:
\[
D_{13} =
\begin{pmatrix}
-1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & -1
\end{pmatrix}.
\tag{103}
\]
In other words, surrounding the two CIs indeed leads to the flip of sign of the first and the third eigenfunctions, as was claimed. This idea can be extended, in a straightforward way, to various situations as will, indeed, be done in the next section.

9. The geometrical interpretation for sign flips

In Sections 5 and 7, we discussed the possible K-values of the D-matrix and made the connection to the number of sign flips based on the analysis given in Section 4.1. Here we intend to present a geometrical approach in order to gain more insight into the phenomenon of sign flips in the M-state system (M > 2). As was already mentioned before, CIs can take place only between two adjacent states (see Fig. 4). Next we make the following definitions:

(a) Having two consecutive states j and (j + 1), the two form the CI to be designated as $C_j$ as shown in Fig. 5, where $N_J$ CIs are presented.

(b) The contour that surrounds a CI at $C_j$ will be designated as $\Gamma_{j\,j+1}$ (see Fig. 5a).

(c) A contour that surrounds two consecutive CIs, i.e., $C_j$ and $C_{j+1}$ will be designated as $\Gamma_{j\,j+2}$ (see Fig. 5b). In the same way, a contour that surrounds n consecutive CIs, namely $C_j, C_{j+1}, \ldots, C_{j+n-1}$, will be designated as $\Gamma_{j\,j+n}$ (see Fig. 5c for $N_J$ = 3).

(d) In case of three CIs or more: a contour that surrounds $C_j$ and $C_k$ but not the in-between CIs will be designated as $\Gamma_{j,k}$.
Thus, for instance, $\Gamma_{1,3}$ surrounds $C_1$ and $C_3$ but not $C_2$ (see Fig. 5d).
Fig. 4. Four interacting adiabatic surfaces presented in terms of four adiabatic Landau–Zener-type curves. The points $C_j$; j = 1, 2, 3, stand for the three conical intersections.
We also introduce an ‘algebra’ of closed contours based on the previous section (see also Appendix C):
\[
\Gamma_{jn} = \sum_{k=j}^{n-1} \Gamma_{k\,k+1} \tag{104}
\]
and also:
\[
\Gamma_{j,k} = \Gamma_{j\,j+1} + \Gamma_{k\,k+1} \quad \text{where} \quad k > j + 1. \tag{105}
\]
This algebra implies that in case of Eq. (104) the only two functions that flip sign are $\zeta_j$ and $\zeta_n$ because all in-between $\zeta$-functions get their sign flipped twice. In the same way, Eq. (105) implies that all four electronic functions mentioned in the expression, namely, the jth and the (j + 1)th, the kth and the (k + 1)th, all flip sign. In what follows, we give a more detailed explanation based on the mathematical analysis of Section 8.

In the last two sections, it was mentioned that K yields the number of eigenfunctions which flip sign when the electronic manifold traces certain closed paths. In what follows, we shall show how this number is formed for various $N_J$-values. The situation is obvious for $N_J$ = 1. Here the path either surrounds or does not surround $C_1$. In case it surrounds it, two functions, i.e., $\zeta_1$ and $\zeta_2$, flip sign so that K = 2 and if it does not surround it, no $\zeta$-function flips sign and K = 0. In case of $N_J$ = 2, we encounter two CIs, namely, $C_1$ and $C_2$ (see Fig. 5a and b). Moving the electronic manifold along the path $\Gamma_{12}$ will change the signs of $\zeta_1$ and $\zeta_2$ whereas moving it along the path $\Gamma_{23}$ will change the signs of $\zeta_2$ and $\zeta_3$. Next moving the electronic manifold along the path $\Gamma_{13}$ (see Fig. 5b) causes the sign of $\zeta_2$ to be flipped twice (once when surrounding $C_1$ and once when surrounding $C_2$) and
Fig. 5. The four interacting surfaces, the three points of conical intersection and the various contours leading to sign conversions: (a) The contours $\Gamma_{jj+1}$ surrounding the respective $C_j$, $j = 1, 2, 3$, leading to the sign conversions of the $j$th and the $(j+1)$th eigenfunctions. (b) The contours $\Gamma_{jj+2}$ surrounding the two (respective) conical intersections, namely $C_j$ and $C_{j+1}$, $j = 1, 2$, leading to the sign conversions of the $j$th and the $(j+2)$th eigenfunctions but leaving unchanged the sign of the middle, the $(j+1)$th, eigenfunction. Also shown, using partly dotted lines, are the contours $\Gamma_{jj+1}$ surrounding the respective $C_j$, $j = 1, 2, 3$. It can be seen that $\Gamma_{jj+2} = \Gamma_{jj+1} + \Gamma_{j+1j+2}$. (c) The contour $\Gamma_{14}$ surrounding the three conical intersections, leading to the sign conversions of the first and the fourth eigenfunctions but leaving unchanged the signs of the second and the third eigenfunctions. Based on Fig. 5(b) we have $\Gamma_{14} = \Gamma_{12} + \Gamma_{23} + \Gamma_{34}$. (d) The contour $\Gamma_{1,3}$ surrounding the two external conical intersections but not the middle one, leading to the sign conversions of all four eigenfunctions, i.e., $(\zeta_1, \zeta_2, \zeta_3, \zeta_4) \to (-\zeta_1, -\zeta_2, -\zeta_3, -\zeta_4)$. Based on Fig. 5(b) we have $\Gamma_{1,3} = \Gamma_{12} + \Gamma_{34}$.
therefore, altogether, its sign remains unchanged. Thus in the case of $N_J = 2$, we can have either no change of sign (when the path does not surround any CI) or three cases where two different functions change sign. A somewhat different situation is encountered in the case $N_J = 3$ and therefore we shall briefly discuss it as well (see Fig. 5d). It is now obvious that contours of the type $\Gamma_{jj+1}$, $j = 1, 2, 3$, surround the relevant $C_j$ (see Fig. 5a) and will flip the signs of the two corresponding
eigenfunctions. From Eq. (104) we get that surrounding two consecutive CIs, namely $C_j$ and $C_{j+1}$, with $\Gamma_{jj+2}$, $j = 1, 2$ (see Fig. 5b), will flip the signs of the two external eigenfunctions, namely $\zeta_j$ and $\zeta_{j+2}$, but leave the sign of $\zeta_{j+1}$ unchanged. We have two such cases: the first and the second CIs, and the second and the third ones. Then we have a contour $\Gamma_{14}$ that surrounds all three CIs (see Fig. 5c) and here, as in the previous case where $N_J = 2$ (see also Eq. (104)), only the two external functions, namely $\zeta_1$ and $\zeta_4$, flip sign, whereas the two internal ones, namely $\zeta_2$ and $\zeta_3$, are left unchanged. Finally, we have the case where the contour $\Gamma_{1,3}$ surrounds $C_1$ and $C_3$ but not $C_2$ (see Fig. 5d). In this case, all four functions flip sign (see Eq. (105)). We briefly summarize what we found in this $N_J = 3$ case: we revealed six different contours that lead to the sign flip of six (different) pairs of functions and one contour that leads to a sign flip of all four functions. The analysis of Eq. (82) shows that indeed we should have seven different cases of sign flip and one case without sign flip (not surrounding any CI).

10. The multi-degenerate case

The emphasis in our previous studies was on isolated two-state CIs. Here we would like to refer to cases where at a given point three (or more) states become degenerate. This can happen, for instance, when two seams cross each other at a point so that at this point we have three surfaces crossing each other. The question is how to incorporate this situation into our theoretical framework. To start with, we restrict our treatment to a tri-state degeneracy (the generalization is straightforward) and consider the following situation:

(1) The two lowest states form a CI, presented in terms of $\tau_{12}(\rho)$, located at the origin, namely at $\rho = 0$.
(2) The second and the third states form a CI, presented in terms of $\tau_{23}(\rho, \varphi \,|\, \rho_0, \varphi_0)$, located at $\rho = \rho_0$, $\varphi = \varphi_0$ [61].
(3) The tri-state degeneracy is formed by letting $\rho_0 \to 0$, namely

\[ \lim_{\rho_0 \to 0} \tau_{23}(\rho, \varphi \,|\, \rho_0, \varphi_0) = \tau_{23}(\rho, \varphi) \tag{106} \]
so that the two CIs coincide. Since the two CIs are located at the same point, every closed contour that surrounds one of them will surround the other, so that this situation is the case of one contour $\Gamma$ ($= \Gamma_{13}$) surrounding two CIs (see Fig. 5b). According to the discussion of the previous section, only two functions will flip sign, i.e., the lowest and the highest one. Extending this case to an intersection point of $n$ surfaces will not change the final result, namely, only two functions will flip sign: the lowest one and the highest one. This conclusion contradicts the findings discussed in Sections 5.1.2 and 5.1.3. In Section 5.1.2, we treated a three-state model and found that in this case the functions can never flip sign. In Section 5.1.3, we treated a similar four-state case and found that either all four functions flip sign or none of them does. The situation where two functions flip sign is not allowed under any conditions.
Although the models mentioned here are of a very specialized form (all the non-adiabatic coupling terms ought to have identical spatial dependence), the fact that such contradictory results are obtained for the two situations could hint at the possibility that in the transition from the non-degenerate to the degenerate situation, in Eq. (106), something is not continuous. This contradiction has not been resolved so far, but we would still like to make the following suggestion. We may encounter in molecular physics two types of multi-degeneracy situations:

(1) The one described above is formed from an aggregation of two-state CIs and depends on 'outside' coordinates (the coordinates that yield the seam). Thus this multi-degeneracy is formed by varying these external coordinates in a proper way. In the same way, the multi-degeneracy can be removed by varying these coordinates. This kind of degeneracy is not an essential degeneracy because the main features of the individual CIs are expected to be unaffected while assembling or disassembling it. We shall term this degeneracy the breakable multi-degeneracy.
(2) The other type, mentioned above, is the one which is not formed from an aggregation of CIs and probably will not break up under any circumstances. Therefore this degeneracy is termed the unbreakable multi-degeneracy.
11. The extended approximate Born–Oppenheimer equation

11.1. Introductory remarks

In this section, we derive the (extended) BO approximation which, in contrast to the original BO approximation, contains the effect of the non-adiabatic coupling term [64–67]. The starting point of approximate treatments of molecular dynamics is the BO treatment, which is based on the fact that within molecular systems the fast moving electrons can be distinguished from the slow moving nuclei. Within the BO approximation, one assumes that the non-adiabatic coupling terms are negligibly small and that therefore the upper electronic states do not affect the nuclear wave function on the lower surface. The relevance of this assumption is not considered to be dependent on the energy of the system. However, the ordinary BO approximation was also employed for cases where these coupling terms are not necessarily small, assuming that the energy can be made as low as required. The justification for applying the approximation in such cases is that for a low enough energy the upper adiabatic surfaces are classically forbidden, implying that the components of the total wave function related to these states are negligibly small. As a result, the terms that contain the product of these components with the non-adiabatic coupling terms are also small and will have a minor effect on the dynamical process. This assumption, which underlies many of the single-state dynamical calculations performed during the last three decades, becomes questionable when some of the non-adiabatic coupling terms are large or infinitely large. The reason is that although the components of the total wave function may be negligibly small, their product with the large non-adiabatic coupling terms will result in non-negligible values, sometimes even indefinitely large values. In that case, this aspect of the BO approximation will break down for any energy, no matter how low.
As is well known (and as follows from their definition), the non-adiabatic coupling terms appear in the off-diagonal positions in the SE (see Eq. (7)). In order to form a single approximate BO equation that contains the non-adiabatic coupling terms, these terms must first be shifted from their original off-diagonal positions to the diagonal. In a first publication on this subject the present author and Englman [64] showed that such a possibility exists and they derived, for the two-state case, an approximate version of the BO equation which indeed contains the non-adiabatic coupling term. There are also two other derivations [65,66]. In particular, the latest version [66] treats an M-state model which is an extension of the two-state (general) case. Here, we shall briefly present this last derivation with some modifications. Particular emphasis will be placed on the two-state case.

11.2. The Born–Oppenheimer approximation as applied to an M-dimensional model

Our starting equation is the BO equation as presented in Eq. (15) or, more compactly, in Eq. (16):

\[ -\frac{1}{2m} (\nabla + \tau)^2 \Psi + (u - E)\Psi = 0. \tag{107} \]

As we may recall, the matrix $u$ is a diagonal matrix and $\tau$ is an antisymmetric vector matrix. The model aspect of this system is with regard to the form of the $\tau$-matrix. This matrix is assumed to be of the type presented in Eq. (48), in Section 5.1, namely

\[ \tau(s) = g\, t(s), \tag{48$'$} \]
where $t(s)$ is a vector whose components are functions of the coordinates and $g$ is a constant antisymmetric matrix. Due to its particular form, it presents the multi-degeneracy case as discussed in Section 10. The advantage of this choice is that the unitary matrix that diagonalizes it is a constant matrix $G$. Thus, returning to Eq. (107), replacing $\Psi$ by $\chi$, where the two are related as

\[ \Psi = G\chi, \tag{108} \]

and continuing in the usual way leads to the following equation for $\chi$:

\[ -\frac{1}{2m} (\nabla + i\omega t)^2 \chi + (W - E)\chi = 0. \tag{109} \]

Here $\omega$ is a diagonal matrix which contains the eigenvalues of the $g$-matrix (up to the factor $i$) and $W$ is the matrix that contains the diabatic potentials, thus:

\[ W = G^{\dagger} u G, \tag{110} \]
where $G^{\dagger}$ is the Hermitian conjugate of $G$. Considering Eq. (109), it is seen that the first term in front of the (column) vector $\chi$ is a diagonal matrix because $t$ is a vector of functions (not of matrices) and $\omega$, as mentioned above, is a diagonal matrix. However, due to the transformation a new non-diagonal matrix is formed, i.e., the potential matrix $W$, which couples the various differential equations. It is important to emphasize that so far the derivation has been rigorous and no approximations have been imposed. Thus, the solution of Eq. (109) will be the same as the solution of Eq. (107). Having
arrived at Eq. (109) we are now in a position to impose the BO approximation. As was already stated earlier, since for low enough energies all upper adiabatic states are energetically closed, each of the corresponding adiabatic functions $\psi_j$, $j = 2, \dots, N$, is expected to fulfill the condition

\[ |\psi_1| \gg |\psi_j|, \qquad j = 2, \dots, N \tag{111} \]
in those regions of CS where the lower surface is energetically allowed. This assumption has to be employed with some care because so far it was proven, numerically, to hold for a two-state system ($N = 2$) only [62,63]. (In Ref. [67] we showed, employing this assumption in a three-state model, that it is fulfilled there as well.) Our next step is to analyze the product $W\chi$ for the $j$th equation of Eq. (109). Recalling Eqs. (108) and (110), we have

\[ (W\chi)_j = \{(G^{\dagger} u G)(G^{\dagger}\Psi)\}_j = (G^{\dagger} u \Psi)_j = \sum_{k=1}^{N} (G^{\dagger})_{jk}\, u_k \psi_k = u_1 \chi_j - u_1 \sum_{k=1}^{N} (G^{\dagger})_{jk}\, \psi_k + \sum_{k=1}^{N} (G^{\dagger})_{jk}\, u_k \psi_k \]

or

\[ (W\chi)_j = u_1 \chi_j + \sum_{k=2}^{N} (G^{\dagger})_{jk}\, (u_k - u_1)\, \psi_k, \qquad j = 1, \dots, N. \tag{112} \]
It is noticed from Eq. (112) that for each $j$ ($= 1, \dots, N$) this equation contains the product of the function $\chi_j$ and the lowest adiabatic potential surface $u_1$ and a summation of products of the negligibly small $\psi_k$'s (namely, only those for $k \ge 2$) with potential terms. Substituting Eq. (112) into Eq. (109) and deleting these summations in each of the equations yields the following system of equations:

\[ -\frac{1}{2m} (\nabla + i t \omega_j)^2 \chi_j + (u_1 - E)\chi_j = 0, \qquad j = 1, \dots, N. \tag{113} \]

This system of $N$ equations for the $N$ $\chi$-functions is uncoupled and therefore each equation stands on its own and can be solved independently of all other equations. However, all these equations are solved for the same (adiabatic) potential energy surface $u_1$ but for different $\omega_j$'s. Once the $\chi$-functions are derived, the final adiabatic vector $\Psi_f$ is obtained from Eq. (108) when applied to $\chi_f$. In particular, the final nuclear wave function $\psi_{1f}$ follows from the explicit expression:

\[ \psi_{1f} = \sum_{k=1}^{N} G_{1k}\, \chi_{kf}. \tag{114} \]
A potential difficulty associated with this approach is due to the fact that the calculated $\chi$-functions yield, through the transformation in Eq. (108), also all other $\psi_f$-functions, namely $\psi_{jf}$, $j > 1$. These functions, by definition, have to be identically zero at all asymptotes, because they belong to the upper adiabatic states which are classically forbidden in all these regions. At this stage nothing in the theory guarantees that the calculated $\chi$-functions fulfill this demand.
There is, however, at least one case for which the calculated $\chi$-functions will produce $\psi_{jf}$'s, $j > 1$, which are all identically zero, and that is the case when Eqs. (113) are gauge invariant [68–70] (we shall elaborate on the gauge invariance property in the next section). At this stage let us assume that the functions $t\omega_j$, $j = 1, \dots, N$ ($i t \omega_j$ are the eigenvalues of the $\tau$-matrix) are such that these equations are gauge invariant, so that the various $\chi$-functions, if calculated for the same boundary conditions, are all identical. Thus, our next step will be to determine the boundary conditions for the $\chi$-functions in order to solve Eq. (113). To find those we need to impose boundary conditions on the $\psi$-functions. We assume that at the given (initial) asymptote all $\psi$-functions are zero except for the ground state function $\psi_{1\mathrm{in}}$. Thus,

\[ \psi_1 = \psi_{1\mathrm{in}}; \qquad \psi_j = \psi_{j\mathrm{in}} = 0, \qquad j = 2, \dots, N. \tag{115} \]
Due to Eq. (108) the boundary conditions for the $\chi$-functions are given in the form

\[ \chi_{j\mathrm{in}} = (G^{\dagger})_{j1}\, \psi_{1\mathrm{in}}, \qquad j = 1, \dots, N. \tag{116} \]
It is seen that the boundary conditions for the $\chi$-functions are all identical, up to a constant ($= (G^{\dagger})_{j1}$), and therefore the same applies to the $\chi$-functions at every point in CS. Thus, if at a given asymptotic region we define $\chi_f$ as the $\chi$-function calculated for a $\chi_{\mathrm{in}}$ which is identical to $\psi_{1\mathrm{in}}$ (not proportional to it), then it can be shown (see also Eq. (114)) that the $\psi$-functions at this particular asymptote, namely $\psi_{jf}$, $j = 1, \dots, N$, become

\[ \psi_{1f} = \chi_f; \qquad \psi_{jf} = 0, \qquad j = 2, \dots, N. \tag{117} \]
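The step from Eq. (116) to Eq. (117) can be spelled out as follows. This is a sketch under the gauge-invariance assumption made above, in which every $\chi_{kf}$ equals the common solution $\chi_f$ scaled by its boundary constant, i.e. $\chi_{kf} = (G^{\dagger})_{k1}\chi_f$; beyond that it uses only Eq. (108) and the unitarity of $G$.

```latex
\begin{align*}
\psi_{jf} &= \sum_{k=1}^{N} G_{jk}\,\chi_{kf}
           = \sum_{k=1}^{N} G_{jk}\,(G^{\dagger})_{k1}\,\chi_{f} \\
          &= (G G^{\dagger})_{j1}\,\chi_{f}
           = \delta_{j1}\,\chi_{f}, \qquad j = 1,\dots,N .
\end{align*}
```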
Thus, for the particular case of a gauge-invariant set of uncoupled equations, we indeed obtain a solution which is compatible with the assumptions.

11.3. The gauge invariance condition for the approximate Born–Oppenheimer equations and the Bohr–Sommerfeld quantization of the non-adiabatic coupling matrix

The system of uncoupled equations presented in Eq. (113) does not necessarily yield solutions which are related to each other. In other words, unless we establish the conditions that guarantee that all these equations yield the same solution, this procedure cannot be accepted. In what follows we show, based on studies performed and assumptions made in Section 5.1 (in particular, in Section 5.1.4), that if solved for the same initial conditions all equations indeed yield identical results. As was already mentioned, the various equations differ only because their imaginary components, i.e., $\omega_j t(s)$, $j = 1, \dots, M$, are not necessarily identical. However, it is well known that if these functions are related to each other in a way that will be discussed next, Eqs. (113) are gauge invariant [68–70] and their solutions, therefore, are all identical. The gauge invariance condition is fulfilled if the products $\omega_j t(s)$ satisfy the following condition:

\[ \omega_j \oint_{\Gamma} ds \cdot t(s) = 2\pi n_j, \qquad j = 1, \dots, M, \tag{118} \]
where the $n_j$ numbers form either a series of integers or a series of half-integers (i.e., $(2p_j + 1)/2$, where $p_j$ is an integer). These conditions are very similar to those discussed in Section 5.1.4 for the type of $\tau$-matrices treated in the previous section. In Section 5.1 we found
that in order for the topological matrix $D$ to be diagonal (which also implies that it must have $(+1)$s and $(-1)$s on its diagonal) the products $\omega_j t(s)$, $j = 1, \dots, M$, have to fulfill Eq. (118). Moreover, for the cases $M = 2, 4$ the series has to be either of integers or of half-integers (but not a mixture of them) and in the case of $M = 3$ it has to be a series of integers only. In Section 5.1.4 we speculated (but did not prove) that these features apply for any $M$-value, namely, for even $M$-values we have both series (of integers and of half-integers) but for odd $M$-values we have only integers. The difference between having a series of integers and a series of half-integers is that in the first case all electronic eigenfunctions are single-valued and in the second case all of them are multi-valued. Thus in the first case, while tracing any closed contour, these functions do not produce topological effects, whereas in the second case all of them produce topological effects (see Section 5.1). To conclude this subject we would like to make two remarks:

(1) For the model treated in this section, we derived an extended BO approximated equation, i.e. Eq. (113). Its validity is guaranteed by the fact that the eigenvalues of the non-adiabatic coupling matrix have to fulfill the Bohr–Sommerfeld quantization rules.
(2) For the case $M = 2$, the model presents, in fact, the realistic, most general, case.

12. The adiabatic-to-diabatic transformation matrix and the Wigner rotation matrix

The ADT matrix, in the way it is presented in Eq. (26), is somewhat reminiscent of the Wigner rotation matrix [71] (assuming that $A(s_0) \equiv I$). In order to see that, we first present a few well-known facts related to the definition of ordinary angular momentum operators (we follow the presentation by Rose [72]) and the corresponding Wigner matrices, and then return to discuss the similarities between Wigner's $d^j(\beta)$ matrix and the ADT matrix.

12.1. Wigner rotation matrices

The ordinary angular rotation operator $R(\mathbf{k}, \theta)$ in the limit $\theta \to 0$ is written as

\[ R(\mathbf{k}, \theta) = \exp(-iS(\mathbf{k}, \theta)), \tag{119} \]
where $\mathbf{k}$ is a unit vector in the direction of the axis of rotation, $\theta$ the angle of rotation and $S(\mathbf{k}, \theta)$ is an operator that has to fulfill the condition $S(\mathbf{k}, \theta) \to 0$ for $\theta \to 0$ to guarantee that in this situation (i.e. when $\theta \to 0$) $R(\mathbf{k}, \theta) \to I$. Moreover, since $R(\mathbf{k}, \theta)$ has to be unitary, the operator $S(\mathbf{k}, \theta)$ has to be Hermitian. Next, it is shown that $S(\mathbf{k}, \theta)$ is related to the total angular momentum operator, $\mathbf{J}$, in the following way:

\[ S(\mathbf{k}, \theta) = (\mathbf{k} \cdot \mathbf{J})\,\theta, \tag{120} \]

where the dot stands for scalar product. Substituting Eq. (120) in Eq. (119) yields, for $R(\mathbf{k}, \theta)$, the following expression:

\[ R(\mathbf{k}, \theta) = \exp(-i(\mathbf{k} \cdot \mathbf{J})\,\theta). \tag{119$'$} \]
It has to be emphasized that in this framework $\mathbf{J}$ is the angular momentum operator in the ordinary coordinate space (i.e., CS) and $\theta$ is a (differential) ordinary angular polar coordinate.
Next, the Euler angles are employed for deriving the outcome of a general rotation of a system of coordinates [50]. It can be shown that $R(\mathbf{k}, \theta)$ is, accordingly, presented as

\[ R(\mathbf{k}, \theta) = e^{-i\alpha J_z}\, e^{-i\beta J_y}\, e^{-i\gamma J_z}, \tag{121} \]

where $J_y$ and $J_z$ are the $y$ and $z$ components of $\mathbf{J}$ and $\alpha$, $\beta$ and $\gamma$ are the corresponding three Euler angles. The explicit matrix elements of the rotation operator are given in the form

\[ D^{j}_{m'm}(\theta) = \langle jm' | R(\mathbf{k}, \theta) | jm \rangle = e^{-i(m'\alpha + m\gamma)} \langle jm' | e^{-i\beta J_y} | jm \rangle, \tag{122} \]

where $m'$ and $m$ are the components of $\mathbf{J}$ along the $J_{z'}$ and $J_z$ axes, respectively, and $|jm\rangle$ is an eigenfunction of the Hamiltonian, of $J^2$ and of $J_z$. Eq. (122) will be written as

\[ D^{j}_{m'm}(\theta) = e^{-i(m'\alpha + m\gamma)}\, d^{j}_{m'm}(\beta). \tag{123} \]

The $D^j$-matrix as well as the $d^j$-matrix are called the Wigner matrices and they are the subject of the present section. If we are interested in finding a relation between the ADT matrix and Wigner's matrices, we should mainly concentrate on the $d^j$-matrix. Wigner derived a formula for these matrix elements (see Ref. [72], Eq. (4.13)) and this formula was used by us to obtain the explicit expression for $j = 3/2$ (the matrix elements for $j = 1$ are given in Ref. [72], p. 72).
12.2. The adiabatic-to-diabatic transformation matrix and Wigner's $d^j$-matrix

The obvious way to form a similarity between the Wigner rotation matrix and the ADT matrix defined in Eq. (26) is to consider the (unbreakable) multi-degeneracy case which is based, just like the Wigner rotation matrix, on a single axis of rotation. For this purpose, we consider the particular set of $\tau$-matrices as defined in Eq. (48) and derive the relevant ADT matrices. In what follows, the degree of similarity between the two types of matrices will be presented for three special cases, namely the two-state case, which in Wigner's notation is the case $j = 1/2$, the tri-state case (i.e., $j = 1$) and the tetra-state case (i.e., $j = 3/2$). However, before going into a detailed comparison between the two types of matrices, it is important to remind the reader what the elements of the $J_y$-matrix look like. Employing Eqs. (2.18) and (2.28) of Ref. [72], it can be shown that

\[ \langle jm | J_y | jm + k \rangle = \frac{1}{2i}\, \delta_{1k} \sqrt{(j + m + 1)(j - m)}, \tag{124a} \]

\[ \langle jm + k | J_y | jm \rangle = -\frac{1}{2i}\, \delta_{1k} \sqrt{(j - m + 1)(j + m)}. \tag{124b} \]

Defining now $\tilde{J}_y$ as

\[ \tilde{J}_y = iJ_y, \tag{125} \]

it is seen that the $\tilde{J}_y$-matrix is an antisymmetric matrix just like the $\tau$-matrix. Since the $d^j$-matrix is defined as

\[ d^j(\beta) = \exp(-i\beta J_y) = \exp(-\beta \tilde{J}_y), \tag{126} \]

it is expected that for a certain choice of parameters (that define the $\tau$-matrix) the ADT matrix becomes identical to the corresponding Wigner rotation matrix. To see the connection, we substitute Eq. (48) into Eq. (26) and assume $A(s_0)$ to be the unity matrix.
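The antisymmetry of $\tilde{J}_y$ and the ratios of its off-diagonal elements can be checked numerically. The sketch below is an illustration only: it assumes the standard ladder-operator convention $J_y = (J_+ - J_-)/2i$ with $\hbar = 1$ and the basis ordering $m = j, j-1, \dots, -j$, which may differ by signs from the convention of Ref. [72]; the function name `jy_tilde` is hypothetical.

```python
import math


def jy_tilde(j):
    """Return the real matrix i*J_y (hbar = 1) in the basis m = j, j-1, ..., -j.

    Assumes J_y = (J_+ - J_-)/(2i), so that i*J_y = (J_+ - J_-)/2.
    """
    dim = int(round(2 * j)) + 1
    ms = [j - k for k in range(dim)]
    mat = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # <j, m+1 | J_+ | j, m> = sqrt((j - m)(j + m + 1))
        amp = math.sqrt((j - m) * (j + m + 1))
        row = col - 1  # index of the state with projection m + 1
        mat[row][col] = 0.5 * amp    # contribution of J_+
        mat[col][row] = -0.5 * amp   # contribution of -J_-
    return mat


if __name__ == "__main__":
    for j in (0.5, 1.0, 1.5):
        print("j =", j)
        for row in jy_tilde(j):
            print(["%.4f" % x for x in row])
```

For $j = 3/2$ the three super-diagonal elements are in the ratio $1 : 2/\sqrt{3} : 1$, and for $j = 1$ the two super-diagonal elements are equal.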
The three matrices of interest were already derived and presented in Section 5.1. There they were termed the D-(topological) matrices (not related to the above-mentioned Wigner $D^j$-matrix) and were used to show the kind of quantization one should expect for the relevant NACTs. The only difference between these topological matrices and the ADT matrices required here is that in Eqs. (51), (57) and (67), the closed line integral (see Eq. (71)) is replaced by $\gamma(s)$ defined along an (open) contour (see Eq. (70)). For the three cases studied in Section 5.1, the similarity to the three corresponding Wigner matrices is achieved in the following way:

(1) For the two-state case (i.e. $j = 1/2$), $d^{1/2}(\beta)$ is identified with the corresponding ADT matrix (see Eq. (69)), for which $\beta = \gamma$.

(2) For the tri-state case ($j = 1$) we consider Eq. (56). The corresponding $d^1(\beta)$-matrix is obtained by assuming $\eta = 1$ (see Eq. (53)) and therefore $\omega = \sqrt{2}$. From Eq. (57) or (58) it is seen that $\beta = \gamma\sqrt{2}$. For the sake of completeness, we present the corresponding $d^1(\beta)$-matrix [72]:

\[ d^1(\beta) = \frac{1}{2} \begin{pmatrix} 1 + C(\beta) & -\sqrt{2}\,S(\beta) & 1 - C(\beta) \\ \sqrt{2}\,S(\beta) & 2C(\beta) & -\sqrt{2}\,S(\beta) \\ 1 - C(\beta) & \sqrt{2}\,S(\beta) & 1 + C(\beta) \end{pmatrix}, \tag{127} \]

where $C(\beta) = \cos\beta$ and $S(\beta) = \sin\beta$.

(3) For the tetra-state case ($j = 3/2$), we consider Eq. (64). The corresponding $d^{3/2}(\beta)$ is obtained by assuming $\eta = \sqrt{4/3}$ and $\sigma = 1$ (see Eq. (59)). This will yield for $\varpi$ the value $\varpi = \sqrt{10/3}$ (see Eq. (62b)). Since $\beta = p\gamma$ (see Eq. (67)), we have to determine the value of $p$, which can be shown to be $p = \sqrt{3}$ (see Eq. (61)), and therefore $\beta = \gamma\sqrt{3}$. For the sake of completeness we present the $d^{3/2}(\beta)$ matrix:

\[ d^{3/2}(\beta) = \begin{pmatrix} C^3 & -\sqrt{3}\,C^2 S & -\sqrt{3}\,S^2 C & S^3 \\ \sqrt{3}\,C^2 S & C(1 - 3S^2) & -S(1 - 3C^2) & -\sqrt{3}\,S^2 C \\ -\sqrt{3}\,S^2 C & S(1 - 3C^2) & C(1 - 3S^2) & -\sqrt{3}\,C^2 S \\ -S^3 & -\sqrt{3}\,S^2 C & \sqrt{3}\,C^2 S & C^3 \end{pmatrix}, \tag{128} \]

where $C = \cos(\beta/2)$ and $S = \sin(\beta/2)$.

The main difference between the ADT and the Wigner matrices is that whereas the Wigner matrix is defined for an ordinary spatial coordinate, the ADT matrix is defined for a rotation coordinate in a different space.

13. Studies of specific systems

In this section, we concentrate on a few examples to show the degree of relevance of the theory presented in the previous sections. For this purpose, we analyze the conical intersections of two 'real' two-state systems and one model system resembling a tri-state case.
13.1. The study of 'real' two-state molecular systems

We start by mentioning the pioneering studies of Yarkony et al. Yarkony was the first to apply the line integral approach to reveal the existence of a CI for a 'real' molecular system, the H3 system [27], by calculating the relevant NACTs from first principles and then deriving the topological angle $\alpha$ (see Eq. (71)). Later he and co-workers applied this approach to study other tri-atom systems such as CH2 [73,76], Li3 [74], HeH2 [75], H2S [76] and AlH2 [77]. Such studies became feasible only thanks to the efficient methods developed by Yarkony et al. [78] to calculate the NACTs. Recently, Xu et al. [79] studied in detail the H3 molecule as well as its two isotopic analogs, namely H2D and D2H, mainly with the aim of testing the ability of the line integral approach to distinguish between the situations when the contour surrounds and/or does not surround the CI point. Some time later Mebel et al. [80,81] employed ab initio NACTs and the line integral approach to study some features related to the C2H molecule. In the next two sections the results of these two studies will be presented.

13.1.1. The H3 system and its isotopic analogs

Although the study to be described is for a 'real' system, the starting point was not the ab initio adiabatic PESs and the ab initio NACTs but the diabatic double-many-body-expansion (DMBE) potentials [82–84]. These were used to calculate the ADT angle $\gamma$ by employing the Hellmann–Feynman theorem [32,85]. However, we shall present our results in terms of the diabatic-to-adiabatic transformation (DAT) angle $\beta$ (as will be explained next). So we have to prove that the two angles are identical. We consider a two-dimensional diabatic framework which is characterized by an angle, $\beta(s)$, associated with the orthogonal transformation which diagonalizes the diabatic potential matrix.
Thus, if $V$ is the diabatic potential matrix and $u$ is the adiabatic one, the two are related by the orthogonal transformation matrix $A$ [32,85]:

\[ u = A^{\dagger} V A, \tag{129} \]

where $A^{\dagger}$ is the Hermitian conjugate of the $A$-matrix. For the present two-state case, $A$ can be written in the form

\[ A = \begin{pmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{pmatrix}, \tag{130} \]

where $\beta$, the above-mentioned DAT angle (or mixing angle, as it is recently termed), is given by [32]

\[ \beta = \frac{1}{2} \tan^{-1} \frac{2V_{12}}{V_{11} - V_{22}}. \tag{131} \]

Recalling $\gamma(s)$, the ADT angle (see Eqs. (69) and (70)), it is expected that the two angles are related. The connection is formed by the Hellmann–Feynman theorem [32,85], which yields the relation between the $s$ component of the NACT, $t$, namely $t_s$, and the characteristic diabatic magnitudes:

\[ t_s(s) = (u_2 - u_1)^{-1} A_1^{*} \frac{\partial V}{\partial s} A_2 = \frac{\sin 2\beta}{2V_{12}}\, A_1^{*} \frac{\partial V}{\partial s} A_2, \tag{132} \]
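where $A_i$, $i = 1, 2$, are the two columns of the $A$-matrix in Eq. (130). Replacing the two $A_i$-columns by their explicit expressions yields for $t_s$ the expression [86]:

\[ t_s(s) = \frac{\sin 2\beta}{2V_{12}} \left\{ -\frac{\sin 2\beta}{2} \frac{\partial}{\partial s}(V_{11} - V_{22}) + \cos 2\beta\, \frac{\partial}{\partial s} V_{12} \right\}. \tag{133} \]

Next, differentiating Eq. (131) with respect to $s$,

\[ \frac{\partial}{\partial s}(V_{11} - V_{22}) = 2 \left\{ V_{12} \frac{\partial}{\partial s} \cot 2\beta + \cot 2\beta\, \frac{\partial}{\partial s} V_{12} \right\}, \tag{134} \]

and substituting Eq. (134) into Eq. (133) yields the following result for $t_s(s)$:

\[ t_s(s) = \frac{\partial \beta}{\partial s}. \tag{135} \]

Comparing this equation with Eq. (70), it is seen that the DAT angle $\beta$ is, up to an additive constant, identical to the relevant ADT angle $\gamma$:

\[ \gamma(s) = \beta(s) - \beta(s_0). \tag{136} \]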
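Eqs. (131) and (136) lend themselves to a short numerical illustration. The sketch below uses a hypothetical linear two-state model, $V_{11} = x$, $V_{22} = -x$, $V_{12} = y$, whose CI sits at the origin; it is not the DMBE potential discussed in the text. It accumulates $\gamma$ along a closed circle by following $\beta$ continuously, and also checks that the rotation of Eq. (130) removes the off-diagonal element of the potential matrix. All function names are illustrative.

```python
import math


def mixing_angle(v11, v22, v12):
    """DAT (mixing) angle of Eq. (131); atan2 keeps track of the quadrant."""
    return 0.5 * math.atan2(2.0 * v12, v11 - v22)


def offdiag_after_rotation(v11, v22, v12):
    """Off-diagonal element of A^T V A for the rotation matrix of Eq. (130)."""
    beta = mixing_angle(v11, v22, v12)
    c, s = math.cos(beta), math.sin(beta)
    return -c * s * (v11 - v22) + (c * c - s * s) * v12


def adt_angle_around_loop(center_x, center_y, radius, n_points=2000):
    """Accumulate gamma = beta(end) - beta(start), Eq. (136), along a circle.

    Hypothetical model: V11 = x, V22 = -x, V12 = y (CI at the origin).
    """
    total = 0.0
    prev = None
    for i in range(n_points + 1):
        phi = 2.0 * math.pi * i / n_points
        x = center_x + radius * math.cos(phi)
        y = center_y + radius * math.sin(phi)
        beta = mixing_angle(x, -x, y)
        if prev is not None:
            step = beta - prev
            # beta is defined modulo pi; remove the artificial branch jumps
            step -= math.pi * round(step / math.pi)
            total += step
        prev = beta
    return total


if __name__ == "__main__":
    print(adt_angle_around_loop(0.0, 0.0, 1.0))  # circle surrounding the CI
    print(adt_angle_around_loop(3.0, 0.0, 1.0))  # circle not surrounding the CI
```

In this model a circle that surrounds the CI accumulates $\gamma = \pi$, whereas a circle that leaves the CI outside accumulates zero, which is the behavior the line integral is used to detect.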
This relation will be used to study geometrical phase effects within the diabatic framework for the H3 system and its two isotopic analogs. Since our starting point is the $2 \times 2$ diabatic potential matrix, we do not need to obtain the ADT angle by solving a line integral; it is obtained simply by applying Eqs. (131) and (136). The forthcoming study is carried out by presenting $\beta(\varphi)$ as a function of an angle $\varphi$ to be introduced next. In what follows, we shall be interested in the location of the seam defined by the conditions $r_{AB} = r_{BC} = r_{AC}$ [4–7], where $r_{AB}$, $r_{BC}$ and $r_{AC}$ are the inter-atomic distances. Since we intend to study the geometrical properties produced by this seam, we follow a suggestion by Kuppermann and coworkers [34,90] and employ the hyperspherical coordinates $(\rho, \theta, \varphi)$, which are known to be convenient for treating quantum mechanical reactive (i.e. exchange) processes. These coordinates were found to be suitable for this purpose as well because one of the angular coordinates surrounds the seam in the pure hydrogenic case. Consequently, following previous studies [87–90], we express the three above-mentioned distances in terms of these coordinates, i.e.

\[ \begin{aligned} r_{AB}^2 &= \tfrac{1}{2}\, d_C^2\, \rho^2 \left[ 1 + \sin\tfrac{\theta}{2} \cos(\varphi + \chi_{AC}) \right], \\ r_{BC}^2 &= \tfrac{1}{2}\, d_A^2\, \rho^2 \left[ 1 + \sin\tfrac{\theta}{2} \cos\varphi \right], \\ r_{AC}^2 &= \tfrac{1}{2}\, d_B^2\, \rho^2 \left[ 1 + \sin\tfrac{\theta}{2} \cos(\varphi - \chi_{AB}) \right], \end{aligned} \tag{137} \]

where

\[ d_X^2 = \frac{m_X}{\mu} \left( 1 - \frac{m_X}{M} \right), \qquad \chi_{XY} = 2 \tan^{-1} \frac{m_Z}{\mu}, \qquad \mu = \sqrt{\frac{m_A m_B m_C}{M}}, \qquad M = m_A + m_B + m_C. \tag{138} \]
Here $X, Y, Z$ stand for $A, B, C$ and

\[ \rho = \sqrt{r_{AB}^2 + r_{BC}^2 + r_{AC}^2}. \tag{139} \]
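Equating the three inter-atomic distances with each other, we find that the seam is a straight line, for which $\rho$ is arbitrary but $\varphi$ and $\theta$ have fixed values $\varphi_s$ and $\theta_s$ determined by the masses only:

\[ \varphi_s = \tan^{-1} \left\{ \frac{\cos\chi_{AC} - t \cos\chi_{AB} - (d_A/d_C)^2 + t\,(d_A/d_B)^2}{\sin\chi_{AC} + t \sin\chi_{AB}} \right\} \tag{140} \]

and

\[ \theta_s = 2 \sin^{-1} \left\{ \frac{(d_A/d_B)^2 - 1}{\cos(\varphi_s - \chi_{AB}) - (d_A/d_B)^2 \cos\varphi_s} \right\}, \tag{141} \]

where $t$ is given in the form

\[ t = \left[ \left( \frac{d_A}{d_C} \right)^2 - 1 \right] \left[ \left( \frac{d_A}{d_B} \right)^2 - 1 \right]^{-1}. \tag{142} \]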
Eqs. (139)–(142) are valid when all three masses are different. In case two masses are equal, namely $m_B = m_C$, we get for $\theta_s$ the simplified expression

\[ \theta_s = 2 \sin^{-1} \left| \frac{m_B - m_A}{m_B + 2m_A} \right| \tag{143} \]

and for $\varphi_s$ the value $\pi$ when $m_A > m_B$ and the value zero when $m_A < m_B$. In case all three masses are equal (then $t = 1$) we get $\theta_s = 0$ and $\varphi_s = \pi$. In what follows, we discuss the H2D system. Employing Eq. (143), we find that the straight-line seam is defined by $\theta_s = 0.4023$ rad and $\varphi_s = \pi$. In the H3 case, the value of $\theta_s$ is zero and this guarantees that all the circles with constant $\rho$ and $\theta$ encircle the seam. The fact that $\theta_s$ is no longer zero implies that not all the circles with constant $\rho$ and $\theta$ encircle the seam; thus, circles for which $\theta > \theta_s$ will encircle the seam and those with $\theta < \theta_s$ will not. Fig. 6 presents $\beta(\varphi)$ curves for H2D, all calculated for $\rho = 6a_0$. In this calculation, the hyperspherical angle $\varphi$, defined along the $[0, 2\pi]$ interval, is the independent angular variable. Fig. 6a shows two curves for the case where the contour does not encircle the seam, namely for $\theta = 0.2$ and $0.4$ rad, and Fig. 6b shows two curves for the case where it encircles the seam, namely for $\theta = 0.405$ and $2.0$ rad. The curves in Fig. 6a reach the value of zero and those in Fig. 6b the value of $\pi$. In particular, two curves, one in Fig. 6a for $\theta = 0.4$ rad and the other in Fig. 6b for $\theta = 0.405$ rad, were calculated along very close contours (that approach the locus of the seam) and indeed their shapes are similar (they both yield an abrupt step), but one curve reaches the value of zero and the other the value $\pi$. Both types of results justify the use of the line integral to uncover the locus of the seam. More detailed results as well as the proper analysis can be found in Ref. [79]. These results, as well as others presented in Ref. [79], are important because on various occasions it was implied that the line integral approach is suitable only for cases when relatively short radii around the CI are applied. In Ref. [79], it was shown for the first time that this
M. Baer / Physics Reports 358 (2002) 75–142
Fig. 6. The mixing angle for the H$_2$D system as a function of the hyperspherical angle $\varphi$, calculated for hyperspherical radius $\rho = 6a_0$: (a) results for $\theta = 0.2$ rad (solid line) and $\theta = 0.4$ rad (dashed line); (b) the same as (a) but for $\theta = 2.0$ rad (solid line) and $\theta = 0.405$ rad (dashed line).
approach can be useful even for large radii. This does not mean that it is relevant for any assumed contour surrounding a CI (or, for that matter, a group of CIs), but it means that we can always find contours with large radii that will reveal the CI location for a given pair of states. We shall return to this problem in our next section.
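The seam angle quoted above for H$_2$D can be checked with a few lines of arithmetic. The sketch below evaluates Eq. (143), taking the magnitude of the mass ratio (an assumption made here so that the angle comes out positive when $m_A > m_B$). The atomic masses and the function names `seam_theta`, `seam_phi` and `encircles_seam` are illustrative and are not taken from the text.

```python
import math


def seam_theta(m_a: float, m_b: float) -> float:
    """theta_s of Eq. (143) for an A-B-B system (m_B = m_C), in radians.

    The magnitude of the mass ratio is used (assumption, see lead-in).
    """
    return 2.0 * math.asin(abs(m_b - m_a) / (m_b + 2.0 * m_a))


def seam_phi(m_a: float, m_b: float) -> float:
    """phi_s: pi when m_A > m_B (and for equal masses), zero when m_A < m_B."""
    return math.pi if m_a >= m_b else 0.0


def encircles_seam(theta: float, theta_s: float) -> bool:
    """A circle of constant rho and theta encircles the seam when theta > theta_s."""
    return theta > theta_s


# Assumed atomic masses in atomic mass units (not given in the text).
M_H, M_D = 1.00783, 2.01410

theta_h2d = seam_theta(M_D, M_H)  # A = D, B = C = H
theta_h3 = seam_theta(M_H, M_H)   # three equal masses

print(f"H2D: theta_s = {theta_h2d:.4f} rad, phi_s = {seam_phi(M_D, M_H):.4f} rad")
print(f"H3:  theta_s = {theta_h3:.4f} rad")
for theta in (0.2, 0.4, 0.405, 2.0):
    print(f"theta = {theta}: encircles seam = {encircles_seam(theta, theta_h2d)}")
```

With these assumed masses the two contours at 0.4 and 0.405 rad fall on opposite sides of the seam angle, which is the situation shown in Fig. 6.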
13.1.2. The C$_2$H molecule

As expected, the C$_2$H molecule is the focus of intensive studies, experimental [91–96] as well as theoretical/numerical [80,81,97–105]. As far as our subject is concerned, it was Peyerimhoff and coworkers [97–102] who revealed the existence of a CI between the $1^2A'$ and the $2^2A'$ states (to be designated the (1,2) CI) in the collinear configuration, and Cui and Morokuma [103] who found a CI between the $2^2A'$ and the $3^2A'$ states in the $C_{2v}$ configuration (henceforth designated the (2,3) CI). Recently, Mebel et al. found two twin CIs between the $3^2A'$ and the $4^2A'$ states [105] (henceforth designated (3,4) CIs). These twins are located on the two sides of the $C_{2v}$ line at relatively short distances [0.0, 0.3 Å]. Since the study of the twins has not yet been completed, we refer to the two lower CIs only.

Employing the MOLPRO program package [106], the six relevant (Cartesian) non-adiabatic coupling terms between the $1^2A'$ and $2^2A'$ states, as well as between the $2^2A'$ and $3^2A'$ electronic states, were calculated for the configurations of interest. These non-adiabatic coupling elements were then transformed, employing chain rules [48], to non-adiabatic coupling elements with respect to the internal coordinates of the C$_2$H molecule, namely $\langle\zeta_i|\partial\zeta_j/\partial r_1\rangle\ (= \tau_{r_1})$, $\langle\zeta_i|\partial\zeta_j/\partial r_2\rangle\ (= \tau_{r_2})$ and $\langle\zeta_i|\partial\zeta_j/\partial\varphi\rangle\ (= \tau_\varphi)$, where $r_1$ and $r_2$ are the CC and CH distances, respectively, and $\varphi$ is the relevant CC···CH angle. Next, the ADT angle $\gamma(\varphi|r_1, r_2)$ is derived employing the following line integral (see Eq. (70)), where the contour is an arc of a circle with radius $r_2$:

$$\gamma(\varphi|r_1, r_2) = \int_0^{\varphi} d\varphi'\, \tau_\varphi(\varphi'|r_1, r_2). \qquad (144)$$

The corresponding topological phase $\alpha(r_1, r_2)$ (see Eq. (71)), defined as $\gamma(\varphi = 2\pi|r_1, r_2)$, was also obtained for various values of $r_1$ and $r_2$.

First we refer to the (1,2) CI. A detailed inspection of the non-adiabatic coupling terms revealed, indeed, the existence of a CI between these two states, for instance at the point $\{\varphi = 0,\ r_1 = 1.35$ Å, $r_2 = 1.60$ Å$\}$, as was established before [97,102]. More CIs of this kind are expected at other $r_1$ values. Next, the $\gamma(\varphi|r_1, r_2)$ angles were calculated as a function of $\varphi$ for various $r_2$. The $\tau_\varphi(\varphi|r_1, r_2)$ functions as well as the ADT angles are presented in Figs. 7a–c for three different $r_2$-values, namely $r_2 = 1.8$, 2.0 and 3.35 Å. Mebel et al. also calculated the topological angle $\alpha(r_1, r_2)$ for these three $r_2$-values as well as for other $r_2$-values. In all cases they got for $\alpha(r_1, r_2)$ either the value of $\pi$ (when $r_2$ was in the interval $\{r_2\} = \{1.60, 2.95$ Å$\}$) or zero (when $r_2$ was outside it). The reason is that as long as $r_2$ is in this interval, it forms a circle that contains one single CI (see also Figs. 7a and b for the cases $r_2 = 1.8$, 2.0 Å); however, when $r_2 > 2.95$ Å, it forms a circle that contains two (symmetric) CIs (see Fig. 7c), and in case $r_2 < 1.60$ Å, it forms a circle that does not contain any CI [105]. The fact that the value of the integral is zero when no CIs are surrounded by the circle was proved in Appendix C. Thus, in this sense the present calculation confirms this derivation.

In this series of results we encounter a somewhat unexpected result, namely that when the circle surrounds two CIs the value of the line integral is zero. This does not contradict any statements made regarding the general theory (which asserts that in such a case the value of the line integral is either a multiple of $2\pi$ or zero), but it is still somewhat unexpected, because it implies that the two CIs behave like vectors and that they arrange themselves in such a way as to reduce the effect of the NACTs.
This result has important consequences regarding the cases where a pair of electronic states are coupled by more than one CI [105].
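The line integral of Eq. (144) is a one-dimensional quadrature once $\tau_\varphi$ is known on a circle. The following is a minimal sketch using the trapezoidal rule. The two model coupling functions are hypothetical stand-ins for ab initio data: a constant $\tau_\varphi = 1/2$, which is what the Jahn–Teller model of Appendix A gives for a circle around a single CI, and an oscillating function that integrates to zero, mimicking a circle that surrounds no CI. The names `adt_angle` and `topological_phase` are illustrative.

```python
import math
from typing import Callable


def adt_angle(tau_phi: Callable[[float], float], phi: float, n: int = 2000) -> float:
    """Trapezoidal estimate of gamma(phi) = integral_0^phi tau_phi(phi') dphi'."""
    h = phi / n
    total = 0.5 * (tau_phi(0.0) + tau_phi(phi))
    for i in range(1, n):
        total += tau_phi(i * h)
    return total * h


def topological_phase(tau_phi: Callable[[float], float], n: int = 2000) -> float:
    """alpha = gamma(phi = 2*pi), the ADT angle accumulated over the closed circle."""
    return adt_angle(tau_phi, 2.0 * math.pi, n)


def tau_single_ci(phi: float) -> float:
    """Hypothetical model: constant coupling 1/2 around one Jahn-Teller-type CI."""
    return 0.5


def tau_no_ci(phi: float) -> float:
    """Hypothetical model: a coupling whose integral over a full cycle vanishes."""
    return 0.3 * math.cos(phi)


alpha_single = topological_phase(tau_single_ci)
alpha_none = topological_phase(tau_no_ci)
print(f"one CI inside:  alpha / pi = {alpha_single / math.pi:.6f}")
print(f"no CI inside:   alpha / pi = {alpha_none / math.pi:.6f}")
```

In a real application `tau_phi` would be an interpolation of the computed coupling terms along the chosen circle, and the grid would have to be refined near the sharp peaks visible in Fig. 7.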
Fig. 7. Results for the C$_2$H molecule as calculated along a circle surrounding the $1^2A'$–$2^2A'$ conical intersection. Shown are the geometry, the non-adiabatic coupling matrix elements $\tau_\varphi(\varphi|r_2)$ and the ADT angles $\gamma(\varphi|r_2)$ as calculated for $r_1$ (= CC distance) = 1.35 Å and for three $r_2$-values ($r_2$ is the CH distance): (a) $r_2 = 1.80$ Å; (b) $r_2 = 2.00$ Å; (c) $r_2 = 3.35$ Å.
On this occasion, we also want to refer to a wrong statement that we made more than once [32b,80], namely that the (1,2) results indicate 'that for any value of $r_1$ and $r_2$ the two states under consideration form an isolated two-state SHS'. We now know that in fact they do not form an isolated system, because the second state is coupled to the third state via a CI, as will be discussed next. Still, the fact that the topological angles, as calculated for the various values of $r_1$ and $r_2$, are all either multiples of $\pi$ or zero indicates that we can form, for this adiabatic two-state system, single-valued, namely physical, diabatic potentials. Thus, if for some numerical treatment only the two lowest adiabatic states are required, the results obtained here suggest that it is possible to form from these two adiabatic surfaces single-valued diabatic potentials employing the line integral approach.

Ref. [81] presents the first line integral study between the two excited states, namely between the second and the third states in this series of states. Here, like before, the calculations are done for a fixed value of $r_1$ (results are reported for $r_1 = 1.251$ Å) but, in contrast to the previous study, the origin of the system of coordinates is located at the point of the CI. Accordingly,
Fig. 8. Results for the C$_2$H molecule as calculated along a circle surrounding the $2^2A'$–$3^2A'$ conical intersection. The CI is located on the $C_{2v}$ line at a distance of 1.813 Å from the CC axis, where $r_1$ (= CC) = 1.2515 Å. The circle is centered at the point of the CI and defined in terms of a radius $q$. Shown are the non-adiabatic coupling matrix elements $\tau_\varphi(\varphi|q)$ and the ADT angles $\gamma(\varphi|q)$ as calculated for: (a) and (b) $q = 0.2$ Å; (c) and (d) $q = 0.3$ Å; (e) and (f) $q = 0.4$ Å.
the two polar coordinates $(\varphi, q)$ are defined. Next, the $\varphi$th non-adiabatic coupling term, i.e., $\tau_\varphi\ (= \langle\zeta_1|\partial\zeta_2/\partial\varphi\rangle)$, is derived, again employing chain rules for the transformation from the original coordinates to $\varphi$ ($q$ is not required because the integral is performed along a circle with a fixed radius $q$).

In Fig. 8 are presented $\tau_\varphi(\varphi|q)$ and $\gamma(\varphi|q)$ for three values of $q$, i.e., $q = 0.2$, 0.3 and 0.4 Å. The main features to be noticed are:

(1) The function $\tau_\varphi(\varphi|q)$ exhibits the following symmetry properties: $\tau_\varphi(\varphi) = \tau_\varphi(\pi - \varphi)$ and $\tau_\varphi(\pi + \varphi) = \tau_\varphi(2\pi - \varphi)$, where $0 \le \varphi \le \pi$. In fact, since the origin is located on the $C_{2v}$ axis, we should expect only $|\tau_\varphi(\varphi)| = |\tau_\varphi(\pi - \varphi)|$ and $|\tau_\varphi(\pi + \varphi)| = |\tau_\varphi(2\pi - \varphi)|$, where $0 \le \varphi \le \pi$, but since the line integral has to supply the value of $\pi$, this function cannot alternate in sign.

(2) The ADT angle $\gamma(\varphi|q)$ increases monotonically, for the two smaller $q$-values, and reaches at the end of the closed contour the topological angle $\alpha$ with the value of $\pi$ (in fact we got $0.986\pi$ and $1.001\pi$ for $q = 0.2$ and 0.3 Å, respectively). The two-state assumption seems to break down in case $q = 0.4$ Å because the calculated value of $\alpha$ is only $0.63\pi$. The reason is that the $q = 0.4$ Å circle passes very close to the two (3,4) CIs (the distance at the closest points is ∼0.04 Å). A detailed analysis of this situation is given elsewhere [81].
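The way the closed-contour value of the line integral is used above as a diagnostic for the two-state assumption can be sketched as a small helper. It measures how far a computed topological angle lies from the nearest integer multiple of $\pi$. The three input values are the ones reported in the text; the tolerance of $0.05\pi$ and the function names are assumptions chosen only for illustration.

```python
import math


def deviation_from_multiple_of_pi(alpha: float) -> float:
    """Distance, in units of pi, between alpha and the nearest integer multiple of pi."""
    x = alpha / math.pi
    return abs(x - round(x))


def two_state_assumption_holds(alpha: float, tol: float = 0.05) -> bool:
    """True when alpha is within tol*pi of a multiple of pi (tol is an assumed threshold)."""
    return deviation_from_multiple_of_pi(alpha) <= tol


# Topological angles reported in the text for the (2,3) CI, keyed by radius q in angstrom.
reported = {0.2: 0.986 * math.pi, 0.3: 1.001 * math.pi, 0.4: 0.63 * math.pi}

for q, alpha in reported.items():
    dev = deviation_from_multiple_of_pi(alpha)
    ok = two_state_assumption_holds(alpha)
    # cos(alpha) is the diagonal element of the two-state D-matrix; it should be +1 or -1.
    print(f"q = {q} A: deviation = {dev:.3f} pi, cos(alpha) = {math.cos(alpha):+.3f}, two-state ok = {ok}")
```

For the two smaller radii the deviation is at the percent level, whereas for $q = 0.4$ Å the check fails, in line with the discussion of the nearby (3,4) CIs.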
13.2. The study of a tri-state model system

In Section 5.2 we discussed to some extent the 3 × 3 ADT matrix $\mathbf{A}$ ($\equiv \mathbf{A}^{(3)}$) for a tri-state system. This matrix was expressed in terms of three (Euler-type) angles $\gamma_{ij}$, $i > j = 1, 2, 3$ (see Eq. (76)), which fulfill a set of three coupled first-order differential equations (see Eq. (77)).

In what follows, we treat a tri-state model system defined in a plane in terms of two polar coordinates $(\rho, \varphi)$ [28]. In order to guarantee that the non-adiabatic coupling matrix $\boldsymbol{\tau}$ yields single-valued diabatic potentials, we shall start with a 3 × 3 diabatic potential matrix and form, employing the Hellmann–Feynman theorem [32,85], the corresponding non-adiabatic coupling matrix $\boldsymbol{\tau}$. The main purpose of studying this example is to show that the A-matrix may not be uniquely defined in CS although the diabatic potentials are all single-valued.

The tri-state diabatic potential that is employed in this study is closely related (but not identical) to the one used by Cocchini et al. [107,108] to study the excited states of Na$_3$. It is of the following form (for more details see Ref. [28]):

$$\mathbf{V} = \begin{pmatrix} \varepsilon_E + U_1 & U_2 & W_1 - W_2 \\ U_2 & \varepsilon_E - U_1 & W_1 + W_2 \\ W_1 - W_2 & W_1 + W_2 & \varepsilon_A \end{pmatrix}. \qquad (145)$$

Here $\varepsilon_E$ and $\varepsilon_A$ are the values of two electronic states (an E-type state and an A-type state, respectively), and $U_i$, $i = 1, 2$, are two potentials defined as

$$U_1 = k\rho\cos\varphi + \tfrac{1}{2} g\rho^2\cos(2\varphi), \qquad (146a)$$

$$U_2 = k\rho\sin\varphi - \tfrac{1}{2} g\rho^2\sin(2\varphi), \qquad (146b)$$

and $W_i$, $i = 1, 2$, are potentials of the same functional form as the $U_i$'s but defined in terms of a different set of parameters $f$ and $p$, which replace $g$ and $k$, respectively. The numerical values for these four parameters are [107]

$$k = \sqrt{2}\,p = 5.53\ \text{a.u.} \quad\text{and}\quad g = \sqrt{2}\,f = 0.152\ \text{a.u.}$$

Eqs. (77) are solved for fixed $\rho$-values but for a varying angular coordinate $\varphi$, defined along the interval $(0, 2\pi)$. Thus, $\rho$ serves as a parameter and the results will be presented for different $\rho$-values. A second parameter that will be used is the potential energy shift $S\ (= \varepsilon_E - \varepsilon_A)$, defined as the shift between the two original coupled adiabatic states and the third state at the origin, i.e., at $\rho = 0$ (in case $S = 0$, all three states are degenerate at the origin). The results will be presented for several of its values.

In Fig. 9 are shown the three non-adiabatic coupling terms $\tau_{\varphi ij}(\varphi)$; $i, j = 1, 2, 3$ ($i > j$) as calculated for different values of $\rho$ and $S$. The main feature to be noticed is the well-defined (sharp) tri-peak structure of $\tau_{\varphi 12}$ and $\tau_{\varphi 23}$ as a function of $\varphi$. There are other interesting features to be noticed but these are of less relevance to the present study (for a more extensive discussion see Ref. [28]).

In Fig. 1 are presented the three $\gamma$-angles as a function of $\varphi$ for various values of $\rho$ and $S$. The two main features which are of interest for the present study are: (1) Following
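Fig. 9. The three non-adiabatic coupling terms (obtained for the model potential described in Section 13.2) $\tau_{\varphi 12}(\varphi)$, $\tau_{\varphi 23}(\varphi)$ and $\tau_{\varphi 13}(\varphi)$ as a function of $\varphi$, calculated for different values of $\rho$ and $S$: (a) $\tau = \tau_{12}$, $S = 0.0$; (b) $\tau = \tau_{12}$, $S = 0.05$; (c) $\tau = \tau_{12}$, $S = 0.5$; (d) $\tau = \tau_{23}$, $S = 0.0$; (e) $\tau = \tau_{23}$, $S = 0.05$; (f) $\tau = \tau_{23}$, $S = 0.5$; (g) $\tau = \tau_{13}$, $S = 0.0$; (h) $\tau = \tau_{13}$, $S = 0.05$; (i) $\tau = \tau_{13}$, $S = 0.5$. Line styles: solid, $\rho = 0.01$; long-dashed, $\rho = 0.1$; short-dashed, $\rho = 0.5$; dotted, $\rho = 1.0$.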
a full cycle, all three angles in all situations obtain the values either of $\pi$ or of zero. (2) In each case (namely for each set of $\rho$ and $S$), following a full cycle, two angles become zero and one becomes $\pi$. From Eq. (76) it is noticed that the A-matrix is diagonal at $\varphi = 0$ and $\varphi = 2\pi$, but in case of $\varphi = 0$ the matrix A is the unit matrix and in the second case it has two (−1)s and one (+1) in its diagonal. Again recalling Eq. (37), this implies that the D-matrix is indeed diagonal and has in its diagonal numbers of norm 1. However, the most interesting fact is that D is not the unit matrix. In other words, the ADT matrix presented in Eq. (76) is not single-valued in CS although the corresponding diabatic potential matrix is single-valued by definition (see Eqs. (145) and (146)). The fact that D has two (−1)s and one (+1) in its diagonal implies that the present $\boldsymbol{\tau}$-matrix produces topological effects, as was explained in the last two paragraphs of Section 4.1: two electronic eigenfunctions flip sign upon tracing a closed path and one electronic function remains with its original sign.
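The statement that one angle equal to $\pi$ and two angles equal to zero give a diagonal matrix with two (−1)s and one (+1) can be illustrated with a product of three planar rotations. This is only a sketch: the ordering of the product in `adt_matrix` is an assumption and may differ from the one used in Eq. (76), although for the end-of-cycle values considered here the result does not depend on the ordering, because the two rotations with zero angle are unit matrices. All function names are illustrative.

```python
import math


def rotation(i: int, j: int, angle: float, n: int = 3) -> list:
    """n x n planar rotation acting in the (i, j) plane (0-based indices)."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    m[i][i] = c
    m[j][j] = c
    m[i][j] = s
    m[j][i] = -s
    return m


def matmul(a: list, b: list) -> list:
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def adt_matrix(g12: float, g23: float, g13: float) -> list:
    """Product of three planar rotations (assumed ordering, see lead-in)."""
    return matmul(matmul(rotation(0, 1, g12), rotation(1, 2, g23)), rotation(0, 2, g13))


def diagonal_signs(m: list) -> list:
    """Diagonal of m rounded to the nearest integer."""
    return [int(round(m[k][k])) for k in range(len(m))]


# End-of-cycle values: exactly one of the three angles equals pi, the others are zero.
cases = {
    "gamma12 = pi": (math.pi, 0.0, 0.0),
    "gamma23 = pi": (0.0, math.pi, 0.0),
    "gamma13 = pi": (0.0, 0.0, math.pi),
}
print("start of cycle:", diagonal_signs(adt_matrix(0.0, 0.0, 0.0)))
for label, angles in cases.items():
    print(label, "->", diagonal_signs(adt_matrix(*angles)))
```

Which pair of diagonal elements changes sign identifies the two electronic eigenfunctions that flip sign along the closed path.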
Much as the results in the last section are interesting, the rather more interesting case is the one for $S = 0$, namely, the case where the three states are degenerate at one point. What we found is that even in this case D is not the unit matrix but keeps the features it had for $S \neq 0$. In other words, the transition from the $S \neq 0$ situation to the $S = 0$ situation is continuous, as was discussed in Section 11. However, the present $S = 0$ D-matrix is in contradiction with the D-matrix in Section 5.1.2, which was derived for a particular type of 3 × 3 $\boldsymbol{\tau}$-matrix that also refers to a tri-fold degeneracy at a single point. In this case, as we may recall, it was proved that it has to be a unit matrix if it is expected to yield single-valued diabatic potentials. These two examples support the finding of Section 11 where we distinguished between breakable and unbreakable multi-degeneracy. The Cocchini et al. model [107] belongs, of course, to those models that yield the breakable degeneracy.

14. Summary and conclusions

Part of the subjects presented here have already been discussed in a previous review article [32b], but since that article was published not only has progress been made but also, having gained perspective, we can now weigh various ideas in a different way. Moreover, in this composition we have been able to verify some of the assumptions which, in previous publications, looked more like ansatz. In what follows, we shall summarize the main points.

(1) In the previous review article [32b], the diabatic framework was derived, employing projection operators, for a sub-space of a Hilbert space which was assumed to be decoupled from the remaining part of the Hilbert space. In other words, it was assumed that the coupling terms between the internal states of the SHS and outside states are zero. Here we showed that if these coupling terms are small enough, e.g. of $O(\varepsilon)$, all equations derived for establishing the SHS diabatic representation are correct to the $O(\varepsilon^2)$ level. In particular, the diabatic potentials (as calculated by the present approach) are perturbed only to the second order. This important feature, and the fact that the calculation of NACTs, once a formidable task, has recently become more of a routine [78,106] (although still very time consuming and quite approximate), make the combined approach an attractive procedure to eliminate the unpleasant NACTs and in this way to form the diabatic framework. We also briefly referred to other approaches [36–45] which were developed to achieve the same goal without explicitly employing the NACTs. Their obvious advantage is in avoiding the troublesome NACTs; however, their accuracy cannot always be estimated and the extension to three (or more) states is not guaranteed.

(2) In the previous review article [29], the topological D-matrix was introduced. Here its topological features were presented in much more detail than before. We discussed the explicit connection between the number of (−1)s and their positions along the diagonal of this matrix and the particular electronic eigenfunctions that flip sign while tracing a closed contour. For instance, one interesting finding is that only an even number of functions can flip sign, and another one is that, in case of a multi-degeneracy, at most two functions will flip sign. As a by-product we derived the topological spin and discussed to some extent the idea of a new assignment for molecular systems.

(3) Another subject discussed at some length is the extended BO approximation. This type of the BO approximation, in contrast to the ordinary one, contains the effect of the NACTs.
In the previous review article [29], this approximation was presented for a two-state case. Here the derivation is extended to a model system of $M$ states. The extension can be accomplished if and only if the NACTs fulfill certain gauge conditions, as discussed in Section 11. In contrast to the previous review, we omitted the results of a numerical example. The numerical study for a two-state and a tri-state model is described in Ref. [67].

(4) In the present review, we considered the case of multi-degeneracy (namely, situations where more than two states cross at one point in CS) and we distinguished between a breakable and an unbreakable degeneracy. The breakable degeneracy is formed by an aggregation of conical intersections which originally were located at different points in CS but were then shifted one with respect to the other (for instance by varying one of the indirect coordinates of the molecular system) to form one single point of degeneracy. The unbreakable degeneracy is a situation where the multi-fold degeneracy is treated as a fait accompli (see Section 5.1). Within this context (i.e., the unbreakable degeneracy), we discussed to some extent the fact that the ADT matrix may become, for a certain chosen set of parameters, the corresponding Wigner rotation matrix [71,72] (see Section 12).

(5) A few examples are presented in order to clarify the ideas and strengthen the confidence in the derivations and their outcome. For this purpose we studied analytically various models (as presented in Section 5.1) and also solved, numerically, a well-known tri-state model (see Section 13.2), which led to the first three calculated ADT angles. This approach was also applied to realistic cases, in particular to tri-atom molecules (as well as other types of tri-atom systems). One important example which was mentioned is the first successful application of the line integral approach to a realistic system, the H$_3$ system [27]. A study of the H$_3$ system and its isotopic analogs was also performed by Xu et al. [79], who pointed out the capability of the line integral to distinguish between (closed) contours that surround and do not surround a CI. Mebel et al. [80,81] considered the CIs between the two lowest states and between the second and the third states of the C$_2$H molecule. In particular, in the first study it was demonstrated that although the second state is strongly coupled to the third state, the two lower adiabatic states can be transformed, correctly, to the corresponding two diabatic states. This can be accomplished because the two strong interaction regions (namely the interaction region between the first and the second states and the region between the second and the third states) do not overlap. In this way, the two-state 'Curl' condition is approximately fulfilled throughout CS (see Section 4.2), which ensures the single-valuedness of the resulting diabatic potentials.

We shall finish this review with some practical conclusions. The subject of topological effects has its roots in the interesting studies of Longuet-Higgins (LH) and his collaborators [4–7]. The importance of these studies in revealing the unusual phenomenon related to the possible non-uniqueness of the electronic eigenfunctions in configuration space is incontrovertible. However, we question the way this subject was subsequently treated. Our main hesitation is connected with the ad hoc correction of the deficiency by the introduction of a phase factor to restore the uniqueness of the electronic wave function [5]. Such modifications may help to temporarily overcome encountered difficulties, but then may cause confusion and prevent uncovering the real cause of the observed phenomenon. Indeed, for two or three decades, the LH phase was often treated as an independent self-standing entity, and this in spite of the fact that no recipe was given for its calculation. In the present review,
it is shown that the LH phase has its origin in the NACTs and that it should be identified, in case of an isolated two-state system, with the ADT angle.

The study of NACTs has to be considered as an important topic in molecular systems. Nevertheless, the NACTs are still ignored in many studies, mainly because of difficulties in obtaining them numerically. The inclusion of the NACTs will produce topological effects which may affect some of the measured magnitudes [34], but their main importance is that in many cases they couple adiabatic states to such an extent that they cannot be treated as isolated states. This, however, does not imply that every time they show up they have to be incorporated. In order to be able to decide when NACTs have to be included and in what way to include them, we first have to know their locations and to be aware of their spatial dependence. At this stage, it is accepted that the strong NACTs have their origin in a CI/PI. Here means were presented to expose the point of the CI/PI in given situations. Moreover, it was proved that, in case of a single CI/PI, they decrease as $q^{-1}$. However, information on their dependence on other nuclear coordinates is very scarce.

One of the most common ways to include NACTs is to transform to the diabatic framework. (In fact, this is the only way to guarantee their correct inclusion, because not eliminating them from the SE enforces solving differential equations that contain singularities.) Indeed, there are many publications that report on numerical studies for diabatic potentials (a sample is also given here [17,31–33,36–45,48,79,97,109–125]) and a variety of methods were proposed to derive them, but the relevance of these potentials was only rarely tested. For instance, these potentials, when transformed back to the adiabatic framework, have to produce not only the correct adiabatic potentials but also the correct NACTs.

Finally, we would like to call attention to the recent study of the NACTs between the two lowest states of the C$_2$H system [80] already mentioned earlier. It was shown, employing ab initio numerical results, that no matter which closed contour one assumes (large or small), the result of the two-state line integral is either ∼0.0 or ∼$\pi$. This does not necessarily mean that the two-state system is decoupled from the other states, but it means that it is coupled with these higher states in such a way that the transformation to the (two-state) diabatic framework can be made by simply ignoring these additional coupling terms. In other words, the line integral approach is not only a way to produce the correct diabatic potentials but it also probes to what extent a given system of adiabatic states can be safely transformed to the diabatic framework.

Acknowledgements

The author would like to thank Professors Y.T. Lee, S.H. Lin and A. Mebel for their warm hospitality at the Institute of Atomic and Molecular Science, Taipei, where the main parts of this review were written, and the Academia Sinica of Taiwan for partly supporting this research. The author thanks Professor R. Englman for many years of scientific collaboration and for his encouragement, Professors A.J.C. Varandas, G.D. Billing, A. Alijah and S. Adhikari for joining his efforts at various stages of the research and for many illuminating discussions, and Professor A. Mebel for his recent intensive collaboration. Finally the author thanks his son, Dr. Roi Baer, for being the devil's advocate and forcing him to try harder.
Appendix A. The Jahn–Teller model and the Longuet-Higgins phase

We consider a case where, in the vicinity of a point of degeneracy between two electronic states, the diabatic potentials behave linearly as a function of the coordinates in the following way [8–10]:

$$\mathbf{W} = k\begin{pmatrix} y & x \\ x & -y \end{pmatrix},$$

where $(x, y)$ are some generalized nuclear coordinates and $k$ is a force constant. The aim is to derive the eigenvalues and the eigenvectors of this potential matrix. The eigenvalues are the adiabatic potential energy states and the eigenvectors form the columns of the ADT matrix. In order to perform this derivation, we shall employ polar coordinates $(q, \varphi)$, namely:

$$y = q\cos\varphi \quad\text{and}\quad x = q\sin\varphi. \qquad (A.1)$$

Substituting for $x$ and $y$ we get $\varphi$-independent eigenvalues of the form

$$u_1 = kq \quad\text{and}\quad u_2 = -kq, \quad\text{where } q = \{0, \infty\} \text{ and } \varphi = \{0, 2\pi\}. \qquad (A.2)$$

As noticed from Fig. 10, the two surfaces $u_1$ and $u_2$ are cone-like PESs with a common apex. The corresponding eigenvectors, for $u_1$ and $u_2$ respectively, are

$$\left(\frac{1}{\sqrt{\pi}}\cos\frac{\varphi}{2},\ \frac{1}{\sqrt{\pi}}\sin\frac{\varphi}{2}\right) \quad\text{and}\quad \left(\frac{1}{\sqrt{\pi}}\sin\frac{\varphi}{2},\ -\frac{1}{\sqrt{\pi}}\cos\frac{\varphi}{2}\right). \qquad (A.3)$$

The components of these two vectors, when multiplied by the electronic (diabatic) basis set $(|\chi_1\rangle, |\chi_2\rangle)$, form the corresponding electronic adiabatic basis set $(|\eta_1\rangle, |\eta_2\rangle)$:

$$|\eta_1\rangle = \frac{1}{\sqrt{\pi}}\cos\frac{\varphi}{2}\,|\chi_1\rangle + \frac{1}{\sqrt{\pi}}\sin\frac{\varphi}{2}\,|\chi_2\rangle,$$

$$|\eta_2\rangle = \frac{1}{\sqrt{\pi}}\sin\frac{\varphi}{2}\,|\chi_1\rangle - \frac{1}{\sqrt{\pi}}\cos\frac{\varphi}{2}\,|\chi_2\rangle. \qquad (A.4)$$
The adiabatic functions are characterized by two interesting features: (a) they depend only on the angular coordinate (but not on the radial coordinate) and (b) they are not single-valued in CS because when $\varphi$ is replaced by $(\varphi + 2\pi)$, a rotation which brings the adiabatic wave functions back to their initial position, both of them change sign. This last feature, which was revealed by LH [4–7], may be, in certain cases, very crucial because multi-valued electronic eigenfunctions cause the corresponding nuclear wave functions to be multi-valued as well, a feature which has to be incorporated explicitly (through specific boundary conditions) while
Fig. 10. The two interacting cones within the Jahn–Teller model.
solving the nuclear SE. In this respect, it is important to mention that ab initio electronic wave functions indeed possess the multi-valuedness feature as described by LH [12]. One way to get rid of the multi-valuedness of the electronic eigenfunctions is by multiplying them by a phase factor [5], namely:

$$\zeta_j(\varphi) = \exp(i\alpha)\,\eta_j(\varphi); \quad j = 1, 2, \qquad (A.5)$$

where

$$\alpha = \varphi/2. \qquad (A.6)$$
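The sign change of the real eigenvectors under $\varphi \to \varphi + 2\pi$, and its removal by the phase factor with $\alpha = \varphi/2$, can be verified numerically. The sketch below uses the upper-cone eigenvector $(\cos\frac{\varphi}{2}, \sin\frac{\varphi}{2})$ of the matrix $k\begin{pmatrix} y & x \\ x & -y\end{pmatrix}$ with $y = q\cos\varphi$, $x = q\sin\varphi$; an overall normalization constant is omitted since it does not affect the sign. The function names and the test angle are illustrative choices.

```python
import cmath
import math


def adiabatic_vector(phi: float) -> tuple:
    """Real eigenvector (cos(phi/2), sin(phi/2)) belonging to u1 = k*q (unnormalized)."""
    return (math.cos(phi / 2.0), math.sin(phi / 2.0))


def eigen_residual(phi: float, q: float = 1.0, k: float = 1.0) -> float:
    """Largest component of |W v - u1 v| for W = k [[y, x], [x, -y]] and u1 = k*q."""
    x, y = q * math.sin(phi), q * math.cos(phi)
    v = adiabatic_vector(phi)
    wv = (k * (y * v[0] + x * v[1]), k * (x * v[0] - y * v[1]))
    return max(abs(wv[0] - k * q * v[0]), abs(wv[1] - k * q * v[1]))


def phased_vector(phi: float) -> tuple:
    """Eigenvector multiplied by exp(i*alpha) with alpha = phi/2."""
    phase = cmath.exp(1j * phi / 2.0)
    return tuple(phase * c for c in adiabatic_vector(phi))


phi0 = 0.7  # arbitrary starting angle in radians
real_start = adiabatic_vector(phi0)
real_end = adiabatic_vector(phi0 + 2.0 * math.pi)
cplx_start = phased_vector(phi0)
cplx_end = phased_vector(phi0 + 2.0 * math.pi)

print("eigen-equation residual:", eigen_residual(phi0))
print("real vector, start vs end of cycle:   ", real_start, real_end)
print("phased vector, start vs end of cycle: ", cplx_start, cplx_end)
```

The real vector returns with the opposite sign after a full cycle, whereas the phased vector returns to itself, at the price of becoming complex.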
It is noticed that $\zeta_j(\varphi)$, $j = 1, 2$, are indeed single-valued eigenfunctions; however, instead of being real, they become complex.

The fact that the electronic eigenfunctions are modified as presented in Eq. (A.5) has a direct effect on the non-adiabatic coupling terms as introduced in Eqs. (8a) and (8b). In particular, we consider the term $\tau^{(1)}_{11}$ (which for the case of real eigenfunctions is identically zero) for the case presented in Eq. (A.5):

$$\tau^{(1)}_{11} = \langle\zeta_1|\nabla\zeta_1\rangle = i\nabla\alpha + \langle\eta_1|\nabla\eta_1\rangle,$$

but since

$$\langle\eta_1|\nabla\eta_1\rangle = 0,$$

it follows that $\tau^{(1)}_{11}$ becomes

$$\tau^{(1)}_{11} = i\nabla\alpha. \qquad (A.7)$$

In the same way, we obtain

$$\tau^{(2)}_{11} = i\nabla^2\alpha - (\nabla\alpha)^2. \qquad (A.8)$$

The fact that now $\tau^{(1)}_{11}$ is not zero will affect the ordinary BO approximation. To show that, we consider Eq. (15) for $M = 1$, once for a real eigenfunction and once for a complex eigenfunction. In the first case, we get from Eq. (16) the ordinary BO equation:

$$-\frac{1}{2m}\nabla^2\psi + (u - E)\psi = 0 \qquad (A.9)$$

because for real electronic eigenfunctions $\tau^{(1)}_{11} \equiv 0$, but in the second case, for which $\tau^{(1)}_{11} \neq 0$, the BO-SE becomes

$$-\frac{1}{2m}(\nabla + i\nabla\alpha)^2\psi + (u - E)\psi = 0, \qquad (A.10)$$
which can be considered as an extended BO approximation [19] for a case of a single isolated state expressed in terms of a complex electronic eigenfunction. This equation was interpreted for some time as the adequate SE to describe the effect of the Jahn–Teller CI which originates from the two interacting states. As it stands, it contains an effect due to an ad hoc phase related to a single (the lowest-state) electronic eigenfunction. Moreover, no prescription is given on how to calculate it. If no other information is available, it is inconceivable that this equation bears any relevance to non-adiabatic coupling effects [23].

Appendix B. The sufficient conditions for having an analytic adiabatic-to-diabatic transformation matrix

The adiabatic-to-diabatic transformation (ADT) matrix, $\mathbf{A}_P$, fulfills the following first-order differential vector equation (see Eq. (19)):

$$\nabla\mathbf{A}_P + \boldsymbol{\tau}_P\mathbf{A}_P = 0. \qquad (B.1)$$

In order for $\mathbf{A}_P$ to be a regular matrix at every point in the assumed region of CS, it has to have an inverse and its elements have to be analytic functions in this region. In what follows, we prove that if the elements of the components of $\boldsymbol{\tau}_P$ are analytic functions in this region and have derivatives to any order, and if the P-subspace is decoupled from the corresponding Q-subspace, then, indeed, $\mathbf{A}_P$ will have the above two features.
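The first of the two features, the existence of an inverse, follows from the fact that a real antisymmetric coupling matrix generates an orthogonal solution of Eq. (B.1). This can be checked numerically along a one-dimensional path. The sketch below integrates $d\mathbf{A}/ds = -\boldsymbol{\tau}(s)\mathbf{A}$ with a fourth-order Runge–Kutta scheme and measures the deviation of $\mathbf{A}^{T}\mathbf{A}$ from the unit matrix. The coupling matrix `tau`, the path parameter `s` and the step count are hypothetical choices made only for this illustration.

```python
import math


def tau(s: float) -> list:
    """Hypothetical real antisymmetric 3x3 coupling matrix along a path parameter s."""
    t12 = 0.5 + 0.2 * math.cos(s)
    t23 = 0.3 * math.sin(2.0 * s)
    t13 = 0.1
    return [[0.0, t12, t13],
            [-t12, 0.0, t23],
            [-t13, -t23, 0.0]]


def matmul(a: list, b: list) -> list:
    """Plain matrix product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def rhs(s: float, a: list) -> list:
    """Right-hand side of dA/ds = -tau(s) A."""
    return [[-x for x in row] for row in matmul(tau(s), a)]


def add_scaled(a: list, b: list, h: float) -> list:
    """Return a + h*b element by element."""
    return [[a[r][c] + h * b[r][c] for c in range(3)] for r in range(3)]


def propagate(s_end: float, steps: int = 2000) -> list:
    """RK4 integration of dA/ds = -tau(s) A from s = 0, starting with A = I."""
    a = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    h = s_end / steps
    for n in range(steps):
        s = n * h
        k1 = rhs(s, a)
        k2 = rhs(s + h / 2.0, add_scaled(a, k1, h / 2.0))
        k3 = rhs(s + h / 2.0, add_scaled(a, k2, h / 2.0))
        k4 = rhs(s + h, add_scaled(a, k3, h))
        a = [[a[r][c] + h / 6.0 * (k1[r][c] + 2.0 * k2[r][c] + 2.0 * k3[r][c] + k4[r][c])
              for c in range(3)] for r in range(3)]
    return a


def orthogonality_error(a: list) -> float:
    """Largest element of |A^T A - I|."""
    ata = [[sum(a[k][r] * a[k][c] for k in range(3)) for c in range(3)] for r in range(3)]
    return max(abs(ata[r][c] - (1.0 if r == c else 0.0)) for r in range(3) for c in range(3))


a_final = propagate(2.0 * math.pi)
err = orthogonality_error(a_final)
print("max |A^T A - I| after a full cycle:", err)
```

The residual reflects only the discretization error of the integrator; it shrinks further when the number of steps is increased.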
B.1. Orthogonality

We start by proving that $\mathbf{A}_P$ is a unitary matrix and as such it will have an inverse (the proof is given here again for the sake of completeness). Let us consider the complex conjugate of Eq. (B.1):

$$\nabla\mathbf{A}_P^{\dagger} - \mathbf{A}_P^{\dagger}\boldsymbol{\tau}_P = 0, \qquad (B.2)$$

where we recall that $\boldsymbol{\tau}_P$, the non-adiabatic coupling matrix, is a real antisymmetric matrix. Multiplying Eq. (B.2) from the right by $\mathbf{A}_P$ and Eq. (B.1) from the left by $\mathbf{A}_P^{\dagger}$ and combining the two expressions, we get

$$\mathbf{A}_P^{\dagger}\nabla\mathbf{A}_P + (\nabla\mathbf{A}_P^{\dagger})\mathbf{A}_P = \nabla(\mathbf{A}_P^{\dagger}\mathbf{A}_P) = 0 \ \Rightarrow\ \mathbf{A}_P^{\dagger}\mathbf{A}_P = \text{Const}.$$

For a proper choice of boundary conditions, the above-mentioned constant matrix can be assumed to be the identity matrix, namely:

$$\mathbf{A}_P^{\dagger}\mathbf{A}_P = \mathbf{I}. \qquad (B.3)$$

Thus $\mathbf{A}_P$ is a unitary matrix at any point in CS.

B.2. Analyticity

From basic calculus, it is known that a function of a single variable is analytic in a given interval if and only if it has well-defined derivatives, to any order, at any point in that interval. In the same way, a function of several variables is analytic in a region if at any point in this region, in addition to having well-defined derivatives for all variables to any order, the result of the differentiation with respect to any two different variables does not depend on the order of the differentiation.

The fact that the $\mathbf{A}_P$ matrix fulfills Eq. (B.1) ensures the existence of derivatives to any order for any variable, in a given region in CS, if $\boldsymbol{\tau}_P$ is analytic in that region. In what follows, we assume that this is, indeed, the case. Next we have to find the conditions for a mixed differentiation of the $\mathbf{A}_P$ matrix elements to be independent of the order. For that purpose, we consider the $p$ and $q$ components of Eq. (B.1) (the subscript P will be omitted to simplify notation):

$$\frac{\partial}{\partial p}\mathbf{A} + \boldsymbol{\tau}_p\mathbf{A} = 0; \qquad \frac{\partial}{\partial q}\mathbf{A} + \boldsymbol{\tau}_q\mathbf{A} = 0. \qquad (B.4)$$

Differentiating the first equation with respect to $q$, we find

$$\frac{\partial}{\partial q}\frac{\partial}{\partial p}\mathbf{A} + \left(\frac{\partial}{\partial q}\boldsymbol{\tau}_p\right)\mathbf{A} + \boldsymbol{\tau}_p\frac{\partial}{\partial q}\mathbf{A} = 0$$

or

$$\frac{\partial}{\partial q}\frac{\partial}{\partial p}\mathbf{A} + \left(\frac{\partial}{\partial q}\boldsymbol{\tau}_p\right)\mathbf{A} - \boldsymbol{\tau}_p\boldsymbol{\tau}_q\mathbf{A} = 0. \qquad (B.5a)$$
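In the same way, we get from the second equation the following expression:

$$\frac{\partial}{\partial p}\frac{\partial}{\partial q}\mathbf{A} + \left(\frac{\partial}{\partial p}\boldsymbol{\tau}_q\right)\mathbf{A} - \boldsymbol{\tau}_q\boldsymbol{\tau}_p\mathbf{A} = 0. \qquad (B.5b)$$

Requiring that the mixed derivative is independent of the order of the differentiation yields:

$$\left(\frac{\partial}{\partial p}\boldsymbol{\tau}_q - \frac{\partial}{\partial q}\boldsymbol{\tau}_p\right)\mathbf{A} = (\boldsymbol{\tau}_q\boldsymbol{\tau}_p - \boldsymbol{\tau}_p\boldsymbol{\tau}_q)\mathbf{A} \qquad (B.6)$$

or (since $\mathbf{A}$ is a unitary matrix):

$$\frac{\partial}{\partial p}\boldsymbol{\tau}_q - \frac{\partial}{\partial q}\boldsymbol{\tau}_p = [\boldsymbol{\tau}_q, \boldsymbol{\tau}_p]. \qquad (B.7)$$

Thus, in order for the $\mathbf{A}_P$ matrix to be analytic in a region, any two components of $\boldsymbol{\tau}_P$ have to fulfill Eq. (B.7). Eq. (B.7) can also be written in a more compact way:

$$\text{Curl}\,\boldsymbol{\tau}_P = \boldsymbol{\tau}_P \times \boldsymbol{\tau}_P, \qquad (B.8)$$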
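where $\times$ stands for a vector product.

The question to be asked is: under what conditions (if at all) do the components of $\boldsymbol{\tau}_P$ fulfill Eq. (B.8)? In Ref. [17], it is proved that this relation holds for any full Hilbert space. Here we shall show that this relation holds also for the P-SHS of dimension $M$, as defined by Eq. (10) in Section 2.2. To show that we employ, again, the Feshbach projection operator formalism [30] (see Eqs. (11)).

We start by considering the $p$th and $q$th components of Eqs. (8) in Section 2.1:

$$\left(\frac{\partial\boldsymbol{\tau}_q}{\partial p}\right)_{jk} = \left\langle\frac{\partial\zeta_j}{\partial p}\bigg|\frac{\partial\zeta_k}{\partial q}\right\rangle + \left\langle\zeta_j\bigg|\frac{\partial^2\zeta_k}{\partial p\,\partial q}\right\rangle; \quad j, k \le M, \qquad (B.9a)$$

$$\left(\frac{\partial\boldsymbol{\tau}_p}{\partial q}\right)_{jk} = \left\langle\frac{\partial\zeta_j}{\partial q}\bigg|\frac{\partial\zeta_k}{\partial p}\right\rangle + \left\langle\zeta_j\bigg|\frac{\partial^2\zeta_k}{\partial q\,\partial p}\right\rangle; \quad j, k \le M. \qquad (B.9b)$$

Subtracting Eq. (B.9b) from Eq. (B.9a) and assuming that the electronic eigenfunctions are analytic functions with respect to the nuclear coordinates yields the following result:

$$\left(\frac{\partial}{\partial p}\boldsymbol{\tau}_q - \frac{\partial}{\partial q}\boldsymbol{\tau}_p\right)_{jk} = \left\langle\frac{\partial\zeta_j}{\partial p}\bigg|\frac{\partial\zeta_k}{\partial q}\right\rangle - \left\langle\frac{\partial\zeta_j}{\partial q}\bigg|\frac{\partial\zeta_k}{\partial p}\right\rangle; \quad j, k \le M. \qquad (B.10)$$

Eq. (B.10) stands for the $(j, k)$ matrix element of the left-hand side of Eq. (B.7). Next we consider the $(j, k)$ element of the first term on the right-hand side of Eq. (B.7), namely:

$$(\boldsymbol{\tau}_q\boldsymbol{\tau}_p)_{jk} = \sum_{i=1}^{M}\left\langle\zeta_j\bigg|\frac{\partial\zeta_i}{\partial q}\right\rangle\left\langle\zeta_i\bigg|\frac{\partial\zeta_k}{\partial p}\right\rangle.$$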
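Since for real functions

\[ \left\langle \zeta_j \middle| \frac{\partial\zeta_i}{\partial q} \right\rangle = -\left\langle \frac{\partial\zeta_j}{\partial q} \middle| \zeta_i \right\rangle, \]

we get for this matrix element the result

\[ (\tau_q\tau_p)_{jk} = -\sum_{i=1}^{M} \left\langle \frac{\partial\zeta_j}{\partial q} \middle| \zeta_i \right\rangle \left\langle \zeta_i \middle| \frac{\partial\zeta_k}{\partial p} \right\rangle = -\left\langle \frac{\partial\zeta_j}{\partial q} \right| \left( \sum_{i=1}^{M} |\zeta_i\rangle\langle\zeta_i| \right) \left| \frac{\partial\zeta_k}{\partial p} \right\rangle. \]

Recalling that the summation within the round parentheses can be written as $[1 - Q_M]$, where $Q_M$ is the projection operator for the Q-subspace, we obtain

\[ (\tau_q\tau_p)_{jk} = -\left\langle \frac{\partial\zeta_j}{\partial q} \middle| \frac{\partial\zeta_k}{\partial p} \right\rangle + \sum_{i=M+1}^{N} \left\langle \frac{\partial\zeta_j}{\partial q} \middle| \zeta_i \right\rangle \left\langle \zeta_i \middle| \frac{\partial\zeta_k}{\partial p} \right\rangle; \quad j, k \le M. \]

Since under the summation sign each term is zero (no coupling between the inside and the outside subspaces, see Eq. (10) in Section 2.2), we finally get that

\[ (\tau_q\tau_p)_{jk} = -\left\langle \frac{\partial\zeta_j}{\partial q} \middle| \frac{\partial\zeta_k}{\partial p} \right\rangle. \tag{B.11a} \]

A similar result is obtained for the second term on the right-hand side of Eq. (B.7), namely

\[ (\tau_p\tau_q)_{jk} = -\left\langle \frac{\partial\zeta_j}{\partial p} \middle| \frac{\partial\zeta_k}{\partial q} \right\rangle. \tag{B.11b} \]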
Subtracting Eq. (B.11b) from Eq. (B.11a) yields Eq. (B.10), thus proving Eq. (B.7).

Summary: In a region where the $\tau_P$ elements are analytic functions of the coordinates, $A_P$ is an orthogonal matrix with elements which are analytic functions of the coordinates.

Appendix C. On the single/multi-valuedness of the adiabatic-to-diabatic transformation matrix

In this appendix, we discuss the case where two components of $\tau_P$, namely $\tau_p$ and $\tau_q$ ($p$ and $q$ are the Cartesian coordinates), are singular in the sense that at least one element in each of them is singular at the point $B(p = a, q = b)$ located on the plane formed by $p$ and $q$. We shall show that in such a case the ADT matrix may become multi-valued. We consider the integral representation of the two relevant first order differential equations (namely the $p$ and the $q$ components of Eq. (19)):

\[ \frac{\partial}{\partial p}A_P + \tau_{Pp}A_P = 0, \qquad \frac{\partial}{\partial q}A_P + \tau_{Pq}A_P = 0. \tag{C.1} \]
In what follows, the subscript $P$ will be omitted to simplify the notation. If the initial point is $P(p_0, q_0)$ and we are interested in deriving the value of $A\,(= A_P)$ at a final point $Q(p, q)$ then one integral equation to be solved is

\[ A(p, q) = A(p_0, q_0) - \int_{p_0}^{p} dp'\, \tau_p(p', q_0)A(p', q_0) - \int_{q_0}^{q} dq'\, \tau_q(p, q')A(p, q'). \tag{C.2a} \]

Another way of obtaining the value of $A(p, q)$ (we shall designate it as $\tilde{A}(p, q)$) is by solving the following integral equation:

\[ \tilde{A}(p, q) = A(p_0, q_0) - \int_{q_0}^{q} dq'\, \tau_q(p_0, q')\tilde{A}(p_0, q') - \int_{p_0}^{p} dp'\, \tau_p(p', q)\tilde{A}(p', q). \tag{C.2b} \]
In Eq. (C.2a), we derive the solution by solving it along the path $\Gamma'$ characterized by two straight lines and three points (see Fig. 11a):

\[ \Gamma':\ P(p_0, q_0) \to P'(p_0, q) \to Q(p, q) \tag{C.3a} \]

and in Eq. (C.2b) by solving it along the path $\Gamma''$, also characterized by two (different) straight lines and three points (see Fig. 11b):

\[ \Gamma'':\ P(p_0, q_0) \to Q'(p, q_0) \to Q(p, q). \tag{C.3b} \]

It is noticed that $\Gamma$, formed by $\Gamma'$ and $\Gamma''$ and written schematically as

\[ \Gamma = \Gamma' - \Gamma'' \tag{C.4} \]
is a closed path. Since the two solutions of Eq. (C.1) presented in Eqs. (C.2a) and (C.2b) may not be identical, we shall derive sufficient conditions for them to be identical. To start this study we assume that the four points $P$, $P'$, $Q$ and $Q'$ are at small distances from each other, so that if

\[ p = p_0 + \Delta p, \qquad q = q_0 + \Delta q \]

then $\Delta p$ and $\Delta q$ are small enough distances as required for the derivation. Subtracting Eq. (C.2b) from Eq. (C.2a) yields the following expression:

\[ \Delta A(p, q) = -\int_{q_0}^{q_0 + \Delta q} dq' \left( \tau_q(p_0, q')A(p_0, q') - \tau_q(p, q')A(p, q') \right) + \int_{p_0}^{p_0 + \Delta p} dp' \left( \tau_p(p', q_0)A(p', q_0) - \tau_p(p', q)A(p', q) \right), \tag{C.5} \]

where

\[ \Delta A(p, q) = A(p, q) - \tilde{A}(p, q). \tag{C.6} \]
Fig. 11. The rectangular paths $\Gamma'$ and $\Gamma''$ connecting the points $(p_0, q_0)$ and $(p, q)$ in the $(p, q)$ plane.
Fig. 12. The differential closed paths $\Gamma$ and the singular point $B(a, b)$ in the $(p, q)$ plane: (a) The point $B$ is not surrounded by $\Gamma$. (b) The point $B$ is surrounded by $\Gamma$.
Next we consider two cases:

(a) The case where the point $B(a, b)$ is not surrounded by the path $\Gamma$ (see Fig. 12a). In this case, both $\tau_p$ and $\tau_q$ are analytic functions of the coordinates in the region enclosed by $\Gamma$ and therefore the integrands of the two integrals can be replaced by the corresponding derivatives calculated at the respective intermediate points, namely:

\[ \Delta A(p, q) = \Delta p \int_{q_0}^{q_0 + \Delta q} dq'\, \frac{\partial\left(\tau_q(\tilde{p}, q')A(\tilde{p}, q')\right)}{\partial p} - \Delta q \int_{p_0}^{p_0 + \Delta p} dp'\, \frac{\partial\left(\tau_p(p', \tilde{q})A(p', \tilde{q})\right)}{\partial q}. \tag{C.7} \]
Fig. 13. The closed (rectangular) path $\Gamma$ as a sum of three partially closed paths $\Gamma_1$, $\Gamma_2$, $\Gamma_3$.
To continue the derivation we recall that $\Delta p$ and $\Delta q$ are small enough so that the two integrands vary only slightly along the interval of integration, so that $\Delta A$ becomes

\[ \Delta A(p, q) = \Delta p\,\Delta q \left\{ \frac{\partial\left(\tau_q(\tilde{p}, \tilde{\tilde{q}})A(\tilde{p}, \tilde{\tilde{q}})\right)}{\partial p} - \frac{\partial\left(\tau_p(\tilde{\tilde{p}}, \tilde{q})A(\tilde{\tilde{p}}, \tilde{q})\right)}{\partial q} \right\}. \tag{C.8} \]

Assuming again that all relevant functions are smooth enough, the expression in the curly brackets can be evaluated further to become

\[ \Delta A(p, q) = \left\{ \frac{\partial\tau_q(p, q)}{\partial p} - \frac{\partial\tau_p(p, q)}{\partial q} - [\tau_q, \tau_p] \right\} A(p, q)\,\Delta p\,\Delta q, \tag{C.9} \]

where Eqs. (C.1) were used to express the derivatives of $A(p, q)$. Since the expression within the curly brackets is identically zero due to Eq. (23), $\Delta A$ becomes identically zero or, in other words, the two infinitesimal paths $\Gamma'$ and $\Gamma''$ yield identical solutions for the $A$ matrix. The same applies to ordinary (namely not necessarily small) closed paths because they can be constructed by 'integrating' over closed infinitesimal paths (see Fig. 13).

(b) The case when one of the differential closed paths surrounds the point $B(a, b)$ (see Fig. 12b). Here the derivation breaks down at the transition from Eq. (C.5) to (C.7) and later, from Eq. (C.7) to (C.8), because $\tau_p$ and $\tau_q$ become infinitely large in the close vicinity of $B(a, b)$ and therefore their intermediate values cannot be estimated. As a result it is not clear whether the two solutions of the $A$ matrix calculated along the two different differential paths are identical or not. The same applies to a regular size (i.e. not necessarily small) path $\Gamma$ that surrounds the point $B(a, b)$. This closed path can be constructed from a differential path $\Gamma_d$ that surrounds $B(a, b)$, a path $\Gamma_p$ that does not surround $B(a, b)$, and a third, connecting path $\Gamma_i$, which also does not surround $B(a, b)$ (see Fig. 14). It is noted that the small region surrounded by $\Gamma_d$ governs the features of the $A$ matrix in the entire region surrounded by $\Gamma$, no matter how large $\Gamma$ is.
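The two cases above can be illustrated numerically with a two-state toy model. The sketch below is not taken from the text: it assumes that the coupling matrix has the form $\tau = t\,J$, with $J$ the 2×2 antisymmetric generator, so that the commutator in Eq. (C.9) vanishes and the solution of Eq. (C.1) after a closed path is a rotation by the line integral of $t$. The regular field `t_regular`, the singular field `t_singular` (one half of the gradient of the polar angle around $B$), the two rectangles and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

# Antisymmetric generator: in this assumed two-state model tau_p = t_p * J, tau_q = t_q * J.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def t_regular(p, q):
    # Gradient of the smooth function f = p*q, so t is analytic everywhere.
    return q, p

def t_singular(p, q, a=0.0, b=0.0):
    # One half of grad(phi), phi being the polar angle around B(a, b); singular at B.
    dp, dq = p - a, q - b
    r2 = dp * dp + dq * dq
    return -0.5 * dq / r2, 0.5 * dp / r2

def loop_angle(t, corners, n=4000):
    """Midpoint-rule line integral of t along the closed polygon through `corners`."""
    pts = np.asarray(list(corners) + [corners[0]], dtype=float)
    s = (np.arange(n) + 0.5) / n  # midpoints of n equal sub-intervals of [0, 1]
    total = 0.0
    for start, end in zip(pts[:-1], pts[1:]):
        mid = start + s[:, None] * (end - start)
        tp, tq = t(mid[:, 0], mid[:, 1])
        total += np.sum(tp * (end[0] - start[0]) + tq * (end[1] - start[1])) / n
    return total

def adt_after_loop(gamma):
    # Solves dA/dgamma = -J A with A(0) = 1, i.e. A = exp(-gamma * J).
    return np.cos(gamma) * np.eye(2) - np.sin(gamma) * J

LOOP_AROUND_B = [(-1, -1), (1, -1), (1, 1), (-1, 1)]    # rectangle enclosing B(0, 0)
LOOP_AWAY_FROM_B = [(2, -1), (4, -1), (4, 1), (2, 1)]   # rectangle not enclosing B

for name, field, loop in [
    ("regular field, any loop", t_regular, LOOP_AROUND_B),
    ("singular field, B outside", t_singular, LOOP_AWAY_FROM_B),
    ("singular field, B inside", t_singular, LOOP_AROUND_B),
]:
    gamma = loop_angle(field, loop)
    print(name, np.round(adt_after_loop(gamma), 6).tolist())
```

In this toy model the loop that surrounds $B$ accumulates an angle of $\pi$, so the $A$ matrix returns as minus the unit matrix rather than the unit matrix, which is one simple way in which the multi-valuedness discussed in case (b) can appear.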
Fig. 14. The closed path $\Gamma$ as a sum of three closed paths $\Gamma_d$, $\Gamma_p$, $\Gamma_i$. (a) The closed (rectangular) paths, i.e., the large path $\Gamma$ and the differential path $\Gamma_d$, both surrounding the singular point $B(a, b)$. (b) The closed path $\Gamma_p$ which does not surround the point $B(a, b)$. (c) The closed path $\Gamma_i$ which does not surround the point $B(a, b)$.
Appendix D. The diabatic representation

Our starting equation is Eq. (3) in Section 2.1 with one difference, namely we replace $\zeta_i(e|n)$ by $\zeta_i(e|n_0)$, $i = 1, \ldots, N$, where $n_0$ stands for a fixed set of nuclear coordinates. Thus

\[ \Psi(e, n|n_0) = \sum_{i=1}^{N} \psi_i(n)\,\zeta_i(e|n_0). \tag{D.1} \]

Here $\zeta_i(e|n_0)$, like $\zeta_i(e|n)$, is an eigenfunction of the following Hamiltonian:

\[ \left(H_e(e|n_0) - u_i(n_0)\right)\zeta_i(e|n_0) = 0; \quad i = 1, \ldots, N, \tag{D.2} \]
where $u_i(n_0)$, $i = 1, \ldots, N$, are the electronic eigenvalues at this fixed set of nuclear coordinates. Substituting Eq. (1) (of Section 2.1) and Eq. (D.1) in Eq. (2) yields the following expression:

\[ \sum_{i=1}^{N} T_n\,\psi_i(n)\,|\zeta_i(e|n_0)\rangle + \sum_{i=1}^{N} \psi_i(n)\,[H_e(e|n) - E]\,|\zeta_i(e|n_0)\rangle = 0. \tag{D.3} \]

It has to be emphasized that whereas $n_0$ is fixed, $n$ is a variable. Substituting Eq. (6) for $T_n$, multiplying Eq. (D.3) by $\langle\zeta_j(e|n_0)|$ and integrating over the electronic coordinates yields the following result:

\[ \left(-\frac{1}{2m}\nabla^2 - E\right)\psi_j(n) + \sum_{i=1}^{N} \langle\zeta_j(e|n_0)|H_e(e|n)|\zeta_i(e|n_0)\rangle\,\psi_i(n) = 0. \tag{D.4} \]

Recalling

\[ H_e(e|n) = T_e + u(e|n) \tag{D.5} \]

and also

\[ H_e(e|n_0) = T_e + u(e|n_0), \tag{D.5$'$} \]

we can replace $H_e(e|n)$ in Eq. (D.4) by the following expression:

\[ H_e(e|n) = H_e(e|n_0) + \{u(e|n) - u(e|n_0)\}. \tag{D.6} \]

Eq. (D.6) is valid because the electronic coordinates are independent of the nuclear coordinates. Having this relation, we can calculate the following matrix element:

\[ \langle\zeta_j(e|n_0)|H_e(e|n)|\zeta_i(e|n_0)\rangle = u_j(n_0)\,\delta_{ji} + v_{ij}(n|n_0), \tag{D.7} \]

where

\[ v_{ij}(n|n_0) = \langle\zeta_j(e|n_0)|u(e|n) - u(e|n_0)|\zeta_i(e|n_0)\rangle. \tag{D.8} \]

Defining

\[ V_{ij}(n|n_0) = v_{ij}(n|n_0) + u_j(n_0)\,\delta_{ji} \tag{D.9} \]

and recalling Eq. (D.7) we get for Eq. (D.4) the expression

\[ \left(-\frac{1}{2m}\nabla^2 - E\right)\psi_j(n) + \sum_{i=1}^{N} V_{ji}(n|n_0)\,\psi_i(n) = 0. \tag{D.10} \]

This equation can be also written in the matrix form

\[ -\frac{1}{2m}\nabla^2\Psi + (V - E)\Psi = 0. \tag{D.11} \]

Here $V$, the diabatic potential matrix, in contrast to $u$ in Eq. (9) of Section 2.1, is a full matrix. Thus Eq. (D.11) is the SE within the diabatic representation.
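The statement that the diabatic potential matrix is a full matrix can be made concrete with a small numerical sketch. The example below assumes a hypothetical 2×2 electronic Hamiltonian written in a fixed basis (the form of `electronic_hamiltonian` and its parameters `k` and `c` are illustrative and not from the text), freezes the eigenfunctions at $n_0$ as in Eq. (D.2), and builds the matrix elements of Eq. (D.7).

```python
import numpy as np

def electronic_hamiltonian(n, k=1.0, c=0.5):
    """Toy 2x2 electronic Hamiltonian H_e(e|n) in a fixed basis (assumed model)."""
    return np.array([[k * n, c], [c, -k * n]])

def diabatic_potential(n, n0):
    """V(n|n0): matrix of H_e(e|n) in the eigenfunctions frozen at n0, cf. Eq. (D.7)."""
    _, zeta0 = np.linalg.eigh(electronic_hamiltonian(n0))  # columns play the role of zeta_i(e|n0)
    return zeta0.T @ electronic_hamiltonian(n) @ zeta0

n0 = 0.0
V_at_n0 = diabatic_potential(n0, n0)   # diagonal: just the eigenvalues u_i(n0)
V_away = diabatic_potential(1.0, n0)   # off-diagonal elements appear once n differs from n0

print(np.round(V_at_n0, 6))
print(np.round(V_away, 6))
```

Because the frozen basis is complete in this two-state toy, the eigenvalues of `V_away` coincide with the adiabatic eigenvalues of the Hamiltonian at the same $n$; in a truncated sub-space that would only hold approximately.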
References

[1] M. Born, J.R. Oppenheimer, Ann. Phys. (Leipzig) 84 (1927) 457 ∗∗∗.
[2] M. Born, K. Huang, Dynamical Theory of Crystal Lattices, Oxford University, New York, 1954 ∗.
[3] M. Baer, C.Y. Ng (Eds.), State Selected and State-to-State Ion–Molecule Reaction Dynamics: Theory, Vol. 82, Wiley, New York, 1992.
[4] H.C. Longuet-Higgins, U. Opik, M.H.L. Pryce, R.A. Sack, Proc. R. Soc. London A 244 (1958) 1.
[5] H.C. Longuet-Higgins, Adv. Spectrosc. 2 (1961) 429 ∗∗∗.
[6] G. Herzberg, H.C. Longuet-Higgins, Discuss. Faraday Soc. 35 (1963) 77 ∗∗∗.
[7] H.C. Longuet-Higgins, Proc. R. Soc. London Ser. A 344 (1975) 147 ∗∗.
[8] H.A. Jahn, E. Teller, Proc. R. Soc. London Ser. A 161 (1937) 220 ∗∗∗.
[9] E. Teller, J. Phys. Chem. 41 (1937) 109 ∗.
[10] E. Teller, Isr. J. Chem. 7 (1969) 227.
[11] R. Englman, The Jahn–Teller Effect in Molecules and Crystals, Wiley-Interscience, New York, 1972 ∗.
[12] A.J.C. Varandas, J. Tennyson, J.N. Murrell, Chem. Phys. Lett. 61 (1979) 431.
[13] W.D. Hobey, A.D. Mclachlan, J. Chem. Phys. 33 (1960) 1695.
[14] W. Lichten, Phys. Rev. 164 (1967) 131 ∗∗.
[15] A.D. Mclachlan, Mol. Phys. 4 (1961) 417.
[16] F.T. Smith, Phys. Rev. 179 (1969) 111 ∗∗.
[17] M. Baer, Chem. Phys. Lett. 35 (1975) 112 ∗∗∗.
[18] T. Pacher, L.S. Cederbaum, H. Koppel, J. Chem. Phys. 89 (1988) 7367.
[19] C.A. Mead, Chem. Phys. 49 (1980) 23.
[20] C.A. Mead, D.G. Truhlar, J. Chem. Phys. 77 (1982) 6090.
[21] M.V. Berry, Proc. R. Soc. London A 392 (1984) 45 ∗∗∗.
[22] M. Baer, R. Englman, Mol. Phys. 75 (1992) 293 ∗∗.
[23] Y. Aharonov, E. Ben-Reuven, S. Popescu, D. Rohrlich, Nucl. Phys. B 350 (1991) 818 ∗∗.
[24] M. Baer, R. Englman, Chem. Phys. Lett. 265 (1997) 105 ∗.
[25] M. Baer, J. Chem. Phys. 107 (1997) 2694 ∗.
[26] M. Baer, A. Alijah, Chem. Phys. Lett. 319 (2000) 489 ∗∗.
[27] D.R. Yarkony, J. Chem. Phys. 105 (1996) 10 456 ∗∗∗.
[28] A. Alijah, M. Baer, J. Phys. Chem. A 104 (2000) 389 ∗.
[29] M. Baer, Chem. Phys. 259 (2000) 123 ∗.
[30] H. Feshbach, Ann. Phys. (NY) 5 (1958) 357.
[31] T. Pacher, L.S. Cederbaum, H. Koppel, Adv. Chem. Phys. 84 (1993) 293 ∗∗.
[32] M. Baer, in: M. Baer (Ed.), Theory of Chemical Reaction Dynamics, Vol. II, CRC Press, Boca Raton, FL, 1985 (Chapter 4).
[33] M. Baer, Mol. Phys. 40 (1980) 1011 ∗.
[34] Y.-S.M. Wu, B. Lepetit, A. Kuppermann, Chem. Phys. Lett. 186 (1991) 319 ∗∗.
[35] M. Baer, R. Englman, Chem. Phys. Lett. 335 (2001) 85.
[36] A. Macias, A. Riera, J. Phys. B 11 (1978) L489; Int. J. Quantum Chem. 17 (1980) 181 ∗.
[37] H.-J. Werner, W. Meyer, J. Chem. Phys. 74 (1981) 5802.
[38] M. Peric, R.J. Buenker, S.D. Peyerimhoff, Mol. Phys. 71 (1990) 673.
[39] M. Peric, S.D. Peyerimhoff, R.J. Buenker, Z. Phys. D 24 (1992) 177 ∗∗.
[40] C. Petrongolo, G. Hirsch, R. Buenker, Mol. Phys. 70 (1990) 825, 835 ∗.
[41] T. Pacher, H. Koppel, L.S. Cederbaum, J. Chem. Phys. 95 (1991) 6668.
[42] T. Romero, A. Aguilar, F.X. Gadea, J. Chem. Phys. 110 (1999) 6219.
[43] V. Sidis, in: M. Baer, C.Y. Ng (Eds.), State-to-State Ion Molecule Reaction Dynamics, Vol. II, p. 73; Adv. Chem. Phys. 82 (1992).
[44] W. Domcke, A.L. Sobolewski, C. Woywod, Chem. Phys. Lett. 203 (1993) 220.
[45] W. Domcke, G. Stock, Adv. Chem. Phys. 100 (1997) 1.
[46] D. Bohm, Quantum Theory, Dover Publications, Inc., New York, 1989, p. 41.
[47] R. Renner, Z. Phys. 92 (1934) 172.
[48] Z.H. Top, M. Baer, J. Chem. Phys. 66 (1977) 1363.
[49] E.S. Kryanchko, D.R. Yarkony, Int. J. Quantum Chem. 76 (2000) 235.
[50] H. Goldstein, Classical Mechanics, Addison-Wesley Publishing Company, Inc., Reading, MA, 1966, p. 107.
[51] L.D. Landau, E.M. Lifshitz, Quantum Mechanics, Pergamon Press, Oxford, 1965, p. 188.
[52] C. Zener, Proc. R. Soc. London, Ser. A 137 (1932) 696.
[53] L.D. Landau, Phys. Z. Sowjetunion 2 (1932) 46.
[54] H. Nakamura, C. Zhu, Comm. At. Mol. Phys. 32 (1996) 249.
[55] D. Elizaga, L.F. Errea, A. Macias, L. Mendez, A. Riera, A. Rojas, J. Phys. B 32 (1999) L697.
[56] A. Alijah, E.E. Nikitin, Mol. Phys. 96 (1999) 1399.
[57] Yu.N. Demkov, Sov. Phys. JETP 18 (1964) 138.
[58] J.W. Zwanziger, E.R. Grant, J. Chem. Phys. 87 (1987) 2954.
[59] M. Baer, J. Phys. Chem. A 105 (2001) 2198.
[60] M. Baer, Chem. Phys. Lett. 329 (2000) 450.
[61] M. Baer, A. Yahalom, R. Englman, J. Chem. Phys. 109 (1998) 6550.
[62] S. Adhikari, G.D. Billing, J. Chem. Phys. 111 (1999) 40.
[63] R. Baer, D. Charutz, R. Kosloff, M. Baer, J. Chem. Phys. 105 (1996) 9141.
[64] M. Baer, R. Englman, Chem. Phys. Lett. 265 (1997) 105 ∗.
[65] M. Baer, J. Chem. Phys. 107 (1997) 10662.
[66] M. Baer, S.H. Lin, A. Alijah, S. Adhikari, G.D. Billing, Phys. Rev. A 62 (2000) 032506-1 ∗.
[67] S. Adhikari, G.D. Billing, A. Alijah, S.H. Lin, M. Baer, Phys. Rev. A 62 (2000) 032507-1.
[68] V. Fock, Z. Phys. 39 (1927) 226.
[69] H. Weyl, Z. Phys. 56 (1929) 330.
[70] K. Huang, Quarks, Leptons and Gauge Fields, World Scientific, Singapore, 1982.
[71] E.P. Wigner, Gruppentheorie, Friedrich Vieweg, Braunschweig, 1931.
[72] M.E. Rose, Elementary Theory of Angular Momentum, Wiley, New York, 1957.
[73] D.R. Yarkony, J. Chem. Phys. 110 (1999) 701.
[74] R.G. Sadygov, D.R. Yarkony, J. Chem. Phys. 110 (1999) 3639.
[75] R.G. Sadygov, D.R. Yarkony, J. Chem. Phys. 109 (1998) 20.
[76] N. Matsunaga, D.R. Yarkony, J. Chem. Phys. 107 (1997) 7825.
[77] G. Chaban, M.S. Gordon, D.R. Yarkony, J. Phys. Chem. 101A (1997) 7953.
[78] P. Saxe, B.H. Lengsfield, D.R. Yarkony, Chem. Phys. Lett. 113 (1985) 159 ∗∗.
[79] Z.R. Xu, M. Baer, A.J.C. Varandas, J. Chem. Phys. 112 (2000) 2746.
[80] A. Mebel, M. Baer, S.H. Lin, J. Chem. Phys. 112 (2000) 10 703 ∗∗.
[81] A. Mebel, M. Baer, S.H. Lin, Chem. Phys. Lett. 336 (2001) 135.
[82] A.J.C. Varandas, F.B. Brown, C.A. Mead, D.G. Truhlar, N.C. Blaise, J. Chem. Phys. 86 (1987) 6258.
[83] A.J.C. Varandas, Adv. Chem. Phys. 74 (1988) 255.
[84] A.J.C. Varandas, A.I. Voronin, J. Mol. Phys. 95 (1995) 497.
[85] R.K. Preston, J.C. Tully, J. Chem. Phys. 54 (1971) 4297.
[86] Z.R. Xu, M. Baer, A.J.C. Varandas, unpublished.
[87] R.C. Whitten, F.T. Smith, J. Math. Phys. 9 (1968) 1103.
[88] B.R. Johnson, J. Chem. Phys. 73 (1980) 5051.
[89] G.D. Billing, N. Markovic, J. Chem. Phys. 99 (1993) 2674.
[90] A. Kuppermann, in: R.E. Wyatt, J.Z.H. Zhang (Eds.), Dynamics of Molecules and Chemical Reactions, Marcel Dekker, Inc., New York, 1996, p. 411.
[91] Y.-C. Hsu, Y.-J. Shiu, C.-M. Lin, J. Chem. Phys. 103 (1995) 5919.
[92] J.-H. Wang, Y.-T. Hsu, K. Liu, J. Phys. Chem. 101A (1997) 6593.
[93] V.M. Blunt, H. Lin, O. Sorkhabi, W.M. Jackson, Chem. Phys. Lett. 257 (1996) 347.
[94] B.A. Balko, J. Zhang, Y.T. Lee, J. Chem. Phys. 94 (1991) 7958.
[95] P. Löffler, E. Wrede, L. Schneider, J.B. Halpern, W.M. Jackson, K.H. Welge, J. Chem. Phys. 109 (1998) 5231.
[96] Y.-C. Hsu, J.J.-M. Lin, D. Papousek, J.-J. Tsai, J. Chem. Phys. 98 (1993) 6690.
[97] H. Thümmel, M. Peric, S.D. Peyerimhoff, R.J. Buenker, Z. Phys. D 13 (1989) 307 ∗∗∗.
[98] M. Peric, R.J. Buenker, S.D. Peyerimhoff, Mol. Phys. 71 (1990) 673.
[99] M. Peric, S.D. Peyerimhoff, R.J. Buenker, Mol. Phys. 71 (1990) 693.
[100] M. Peric, S.D. Peyerimhoff, R.J. Buenker, J. Mol. Spectrosc. 148 (1991) 180.
[101] M. Peric, W. Reuter, S.D. Peyerimhoff, J. Mol. Spectrosc. 148 (1991) 201.
[102] M. Peric, S.D. Peyerimhoff, R.J. Buenker, Z. Phys. D 24 (1992) 177.
[103] Q. Cui, K. Morokuma, J. Chem. Phys. 108 (1998) 626 ∗.
[104] A.M. Mebel, M. Hayashi, W.M. Jackson, J. Wrobel, M. Green, D. Xu, S.H. Lin, J. Chem. Phys., in press.
[105] A. Mebel, M. Baer, S.H. Lin, J. Chem. Phys. 114 (2001) 5109.
[106] MOLPRO is a package of ab initio programs written by H.-J. Werner and P.J. Knowles, with contributions from J. Almlöf, R.D. Amos, M.J.O. Deegan, S.T. Elbert, C. Hampel, W. Meyer, K. Peterson, R. Pitzer, A.J. Stone, P.R. Taylor, and R. Lindh.
[107] F. Cocchini, T.H. Upton, W.J. Andreoni, Chem. Phys. 88 (1988) 6068.
[108] R. Meiswinkel, H. Köppel, Chem. Phys. 144 (1990) 117.
[109] A.J.C. Varandas, Z.R. Xu, J. Chem. Phys. 112 (2000) 2121.
[110] X. Chapuisat, A. Nauts, D. Dehareug-Dao, Chem. Phys. Lett. 95 (1983) 139.
[111] D. Hehareug-Dao, X. Chapuisat, J.C. Lorquet, C. Galloy, G. Raseev, J. Chem. Phys. 78 (1983) 1246.
[112] L.S. Cederbaum, H. Koppel, W. Domcke, Int. J. Quantum Chem. Symp. 15 (1981) 251.
[113] H. Koppel, W. Domcke, L.S. Cederbaum, Adv. Chem. Phys. 57 (1984) 59 ∗.
[114] Z.H. Top, M. Baer, Chem. Phys. 25 (1977) 1.
[115] M. Baer, A.J. Beswick, Phys. Rev. A 19 (1979) 1559.
[116] M. Baer, G. Niedner-Schatteburg, J.P. Toennies, J. Chem. Phys. 91 (1989) 4169.
[117] M. Baer, C.-L. Liao, R. Xu, G.D. Flesch, S. Nourbakhsh, C.Y. Ng, J. Chem. Phys. 93 (1990) 4845.
[118] G.J. Tawa, S.L. Mielke, D.G. Truhlar, D.W. Schwenke, in: J.M. Bowman (Ed.), Advances in Molecular Vibrations Collision Dynamics, Vol. 2B, JAI Press, Greenwich CT, 1993, p. 45.
[119] S.L. Mielke, D.G. Truhlar, D.W. Schwenke, J. Phys. Chem. 99 (1995) 16 210.
[120] I. Last, M. Gilibert, M. Baer, J. Chem. Phys. 107 (1997) 1451.
[121] M. Chajia, R.D. Levine, Phys. Chem. Chem. Phys. 1 (1999) 1205.
[122] T. Takayanki, Y. Kurasaki, A. Ichihara, J. Chem. Phys. 112 (2000) 2615.
[123] L.C. Wang, Chem. Phys. 237 (1998) 305.
[124] C. Shin, S. Shin, J. Chem. Phys. 113 (2000) 6528.
[125] T. Takayanki, Y. Kurasaki, J. Chem. Phys. 113 (2000) 7158.
Physics Reports 358 (2002) 143–226
Static properties of chiral models with SU(3) group structure
Soon-Tae Hong a,b,∗, Young-Jai Park a
a Department of Physics, Sogang University, Seoul 100-611, South Korea
b W.K. Kellogg Radiation Laboratory, California Institute of Technology, Pasadena, CA 91125, USA
Received April 2001; editor: G.J. Brown
Contents
1. Introduction
2. Outline of the chiral models
2.1. Chiral symmetry and currents
2.2. WZW action and baryon number
2.3. Hedgehog solution
2.4. Collective coordinate quantization
2.5. Cheshire cat principle
3. Baryon octet magnetic moments
3.1. Coleman–Glashow sum rules
3.2. Strangeness in Yabu–Ando scheme
4. Baryon decuplet magnetic moments
4.1. Model-independent sum rules
4.2. Multiquark structure
5. SAMPLE experiment and baryon strange form factors
5.1. SAMPLE experiment and proton strange form factor
5.2. Strange form factors of baryons in chiral models
6. Unification of chiral bag model with other models
6.1. Connection to naive nonrelativistic quark model
6.2. Connection to other models
7. Improved Dirac quantization of Skyrmion model
7.1. Modified mass spectrum in SU(2) Skyrmion
7.2. Phenomenology in SU(3) Skyrmion
7.3. Berry phase and Casimir energy in SU(3) Skyrmion
8. Superqualiton model
8.1. Color–flavor-locking phase and Q-matter
8.2. Bosonization of QCD at high density
Acknowledgements
Appendix A. Spin symmetries in the SU(3) group
Appendix B. Inertia parameters in the chiral bag model
B.1. Angular part of the matrix element
B.2. Quark phase inertia parameter
Appendix C. Batalin–Fradkin–Tyutin quantization scheme
C.1. BRST symmetries in Skyrmion model
C.2. SU(3) Skyrmion with flavor symmetry breaking effects
References

∗ Corresponding author. Department of Physics, Sogang University, Seoul 100-611, South Korea.
© 2002 Elsevier Science B.V. All rights reserved. 0370-1573/02/$ - see front matter PII: S0370-1573(01)00057-6
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226
Abstract

We investigate the strangeness in the framework of chiral models, such as the Skyrmion, MIT bag, chiral bag and superqualiton models, with SU(3) flavor group structure. We review the recent progress in both the theoretical paradigm and experimental verification for strange hadron physics, and in particular, the SAMPLE experiment results on the proton strange form factor. We study the color–flavor locking phase in the color superconducting quark matter at high density, which might exist in the core of neutron stars, in the soliton-like superqualiton description. We explain the difficulties encountered in the application of the standard Dirac quantization to the Skyrmion and superqualiton models and treat the geometrical constraints of these soliton models properly to yield the relevant mass spectrum including the Weyl ordering corrections and the BRST symmetry structures. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 21.60.Fw; 12.39.Dc; 13.40.Gp; 14.20.−c; 11.10.Ef; 12.20.Ds
Keywords: Chiral models; Skyrmions; Form factors; Dirac formalism; Superqualiton
1. Introduction

Recently there has been significant discussion concerning the possibility of sizable strange quark matrix elements in the nucleon. In particular, the measurement of the spin structure function of the proton, given by the European Muon Collaboration (EMC) experiments on deep inelastic muon scattering [1], has suggested, as a lingering question among physicists, that the effect of strange quarks on nucleon structure is not small. The EMC result has been interpreted as the possibility of a strange quark sea strongly polarized opposite to the proton spin. Similarly, such an interpretation of the strangeness has been brought to other analyses of low-energy elastic neutrino–proton scattering [2]. Quite recently, the SAMPLE Collaboration [3,4] reported the experimental data of the proton strange form factor through parity-violating electron scattering [5–7]. To be more precise, they measured the neutral weak form factors at a small momentum transfer $Q_S^2 = 0.1\ (\mathrm{GeV}/c)^2$ to yield the proton strange magnetic form factor [3,4]:

\[ G_M^s(Q_S^2) = +0.14 \pm 0.29\ (\mathrm{stat}) \pm 0.31\ (\mathrm{sys}). \]

This result is contrary to the negative values of the proton strange form factor which result from most of the model calculations [8–22] except those of Hong et al. [23], Hong [24] and Hong and Park [25] based on the SU(3) chiral bag model (CBM) [26–43] and the recent predictions of the chiral quark soliton model [44] and the heavy baryon chiral perturbation theory [45,46]. Recently, the anapole moment effects associated with the parity-violating electron scattering have been intensively studied to yield more theoretical predictions [46–50]. (For details see Ref. [50].) In fact, if the strange quark content in the nucleon is substantial, then kaon condensation can be induced at a matter density lower than that of the chiral phase transition [51,52], affecting the scenarios for relativistic heavy-ion reactions [53], neutron star cooling [54–63] and so on.

On the other hand, it is well known that baryons can be obtained from topological solutions, known as SU(2) Skyrmions, since the homotopy group $\pi_3(SU(2)) = Z$ admits fermions [31,64–67]. Using the collective coordinates of the isospin rotation of the Skyrmion, Witten and co-workers [64] have performed semiclassical quantization, obtaining the static properties of baryons within 30% of the corresponding experimental data.

Phenomenologically, the MIT bag model [68,69] first incorporated confinement and asymptotic freedom of QCD. However, this model lacks chiral symmetry so that it cannot be directly applied to the nuclear interaction description. Moreover, in order for the bag to be stable, the bag size should be approximately 1 fm, which is simply too big to naively exploit the MIT bag model in describing nuclear systems.

To overcome these difficulties, Brown and Rho proposed a “little bag” [26,27] where they implemented the spontaneously broken chiral symmetry and brought in a Goldstone pion cloud to yield pressure sufficient to squeeze the bag to a smaller size so that the bag can accommodate the nuclear physics of meson exchange interactions. Here the method of squeezing the bag without violating the uncertainty principle will be discussed later in accordance with the Cheshire cat principle [70]. On the other hand, the pion cloud was introduced outside the MIT bag to yield a “chiral bag” [71] by imposing chiral invariant boundary conditions associated with the chiral invariance and confinement.
As shown in the next section, based on an analogy to the monopole–isomultiplet system [72], the baryon number was first noticed [73] to be fractionalized into the quark and pion phase contributions, and later established [31] for the special case of the “magic angle” of the pionic hedgehog field and then generalized for arbitrary chiral angle [74]. Here one notes that, in the “cloud bag” model [75], the hedgehog component of the pion field was ignored so that the baryon number could be lodged entirely inside the bag. The CBM, which is a hybrid of two different models, the MIT bag model at infinite bag radius on one hand and the SU(3) Skyrmion model at vanishing radius on the other hand, has enjoyed considerable success in predictions of the baryon static properties such as the EMC experiments and the magnetic moments of the baryon octet and decuplet, as well as the strange form factors of baryons [23], to confirm the SAMPLE Collaboration experiments. After the discovery of the Cheshire cat principle [70], the CBM has been also regarded as a candidate which unifies the MIT bag and Skyrmion models and gives model-independent relations insensitive to the bag radius.

On the other hand, Brown and co-workers [28] calculated the pion cloud contributions to the baryon magnetic moments by using the SU(2) CBM as an effective nonrelativistic quark model (NRQM). The Coleman–Glashow sum rules of the magnetic moments of the baryon octet were investigated in the SU(3) CBM so that the bag was proposed as an effective NRQM with meson cloud inside and outside the bag surface [37]. The possibility of unification of the NRQM and Skyrmion and MIT bag models through the chiral bag was proposed again for the baryon decuplet [38], as well as the baryon octet [37].

In the Skyrmion model [76,66], many properties of baryons containing light u- and d-quarks have suggested that they can be described in terms of solitons. Provided the Wess–Zumino term [77] is included in the nonlinear sigma model Lagrangian, the solitons have the correct quantum numbers to be QCD baryons [78] with many predictions of their static properties [64]. Moreover, the $N_c$ counting suggests that the baryons with a single heavy quark ($m_q \gg \Lambda_{QCD}$) can be described as solitons, just as baryons containing only light quarks. Meanwhile, there has been considerable progress in understanding the properties of baryons containing single heavy quarks [79,80]. Callan and Klebanov (CK) [79] suggested an interpretation of baryons containing heavy quarks as bound states of solitons of the pion chiral Lagrangian with mesons containing a heavy quark. In their formalism, the fluctuations in the strangeness direction are treated differently from those in the isospin directions [79,80]. Jenkins and Manohar [81] recently reconsidered the model in terms of the heavy quark symmetry to conclude that a doublet of mesons containing the heavy quark can take place in the bound state if both the soliton and meson are taken as infinitely heavy. On the other hand, in the scheme of the SU(3) cranking, Yabu and Ando [82] proposed the exact diagonalization of the symmetry breaking terms by introducing the higher irreducible representation (IR) mixing in the baryon wave function, which was later interpreted in terms of the multiquark structure [83,84] in the baryon wave function.

On the other hand, the Dirac method [85] is a well known formalism to quantize physical systems with constraints. The string theory is known to be restricted to obey the Virasoro conditions, and thus it is quantized [86] by the Dirac method. The Dirac quantization scheme has been also applied to the nuclear phenomenology [87,88]. In this method, the Poisson brackets in a second-class constraint system are converted into Dirac brackets to attain self-consistency. The
Dirac brackets, however, are generically field-dependent, nonlocal and contain problems related to ordering of field operators. These features are unfavorable for finding canonically conjugate pairs. However, if a first-class constraint system can be constructed, one can avoid introducing the Dirac brackets and can instead use Poisson brackets to arrive at the corresponding quantum commutators. To overcome the above problems, Batalin, Fradkin, and Tyutin (BFT) [89] developed a method which converts the second-class constraints into first-class ones by introducing auxiliary fields. Recently, this BFT formalism has been applied to several models of current interest [90–93], especially to the Skyrmion to obtain the modified mass spectrum of the baryons by including the Weyl ordering correction [94–97].

Furthermore, due to asymptotic freedom [98,99], the stable state of matter at high density will be quark matter [100], which has been shown to exhibit color superconductivity at low temperature [101,102]. The color superconducting quark matter [103–136] might exist in the core of neutron stars, since the Cooper-pair gap and the critical temperature turn out to be quite large, of the order of 10–100 MeV, compared to the core temperature of the neutron star, which is estimated to be up to ∼0.7 MeV [137]. On the other hand, it is found that, when the density is large enough for the strange quark to participate in Cooper-pairing, not only color symmetry but also chiral symmetry are spontaneously broken due to the so-called color–flavor locking (CFL) [115]: At low temperature, Cooper pairs of quarks form to lock the color and flavor indices as

\[ \langle \psi_{Li}^{a}(\vec{p})\,\psi_{Lj}^{b}(-\vec{p}) \rangle = -\langle \psi_{Ri}^{a}(\vec{p})\,\psi_{Rj}^{b}(-\vec{p}) \rangle = \epsilon^{abI}\epsilon_{ijI}\,\Delta(p_F), \tag{1.1} \]

where $a, b = 1, 2, 3$ and $i, j = 1, 2, 3$ are color and flavor indices, respectively, and we ignore the small color sextet component in the condensate. In this CFL phase, the particle spectrum can be precisely mapped into that of the hadronic phase at low density. Observing this map, Schäfer and Wilczek [109,108] have further conjectured that the two phases are in fact continuously connected to each other. The CFL phase at high density is complementary to the hadronic phase at low density. This conjecture was subsequently supported by showing that quarks in the CFL phase are realized as Skyrmions, called superqualitons, just like baryons are realized as Skyrmions in the hadronic phase [113,134].
2. Outline of the chiral models

2.1. Chiral symmetry and currents

For a fundamental theory of hadron physics, we will consider in this work the chiral models such as the Skyrmion, MIT bag and chiral bag models. Especially, the CBM can be described as a topological extended object with hybrid phase structure: the quark fields surrounded by the meson cloud outside the bag. In the CBM, a surface coupling with the meson fields is introduced to restore the chiral invariance [71] which was broken in the MIT bag [68,69]. To discuss the symmetries of the CBM and to derive the vector and axial currents, which are
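crucial ingredients for the physical operators for the magnetic moments and EMC experiments, we introduce the realistic chiral bag Lagrangian

\[ \mathcal{L} = \mathcal{L}_{CS} + \mathcal{L}_{CSB} + \mathcal{L}_{FSB} \tag{2.1} \]

with the chiral symmetric (CS) part, chiral symmetry breaking (CSB) mass terms and SU(3) flavor symmetry breaking (FSB) pieces due to the corrections $m_\pi \neq m_K$ and $f_\pi \neq f_K$:

\[ \begin{aligned} \mathcal{L}_{CS} &= \bar{\psi}\, i\gamma^\mu\partial_\mu \psi\,\Theta_B - \frac{1}{2}\bar{\psi}\, U_5\, \psi\,\Delta_B + \left( -\frac{1}{4} f_\pi^2\, \mathrm{tr}(l_\mu l^\mu) + \frac{1}{32e^2}\, \mathrm{tr}[l_\mu, l_\nu]^2 + \mathcal{L}_{WZW} \right)\bar{\Theta}_B, \\ \mathcal{L}_{CSB} &= -\bar{\psi} M \psi\,\Theta_B + \frac{1}{4} f_\pi^2 m_\pi^2\, \mathrm{tr}(U + U^\dagger - 2)\,\bar{\Theta}_B, \\ \mathcal{L}_{FSB} &= \frac{1}{6}\left(f_\pi^2 m_K^2 - f_\pi^2 m_\pi^2\right) \mathrm{tr}\left((1 - \sqrt{3}\lambda_8)(U + U^\dagger - 2)\right)\bar{\Theta}_B \\ &\quad - \frac{1}{12} f_\pi^2 (\chi^2 - 1)\, \mathrm{tr}\left((1 - \sqrt{3}\lambda_8)(U l_\mu l^\mu + l_\mu l^\mu U^\dagger)\right)\bar{\Theta}_B. \end{aligned} \tag{2.2} \]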
Here the quark field $\psi$ has SU(3) flavor degrees of freedom and the chiral field $U = e^{i\lambda_a\pi_a/f_\pi} \in$ SU(3) is described by the pseudoscalar meson fields $\pi_a$ ($a = 1, \ldots, 8$; see footnote 1) and the Gell-Mann matrices $\lambda_a$ with $\lambda_a\lambda_b = \frac{2}{3}\delta_{ab} + (if_{abc} + d_{abc})\lambda_c$, and $l_\mu = U^\dagger\partial_\mu U$. In the numerical calculation in the CBM we will use the parameter fixing $e = 4.75$, $f_\pi = 93$ MeV and $f_K = 114$ MeV. The interaction term crucial for the chiral symmetry restoration is given by

$$U_5 = \frac{1 + \gamma_5}{2}\,U\,\frac{1 + \gamma_5}{2} + \frac{1 - \gamma_5}{2}\,U^\dagger\,\frac{1 - \gamma_5}{2}, \tag{2.3}$$
and $\Delta_B = -n^\mu\partial_\mu\Theta_B$, where $\Theta_B$ is the bag theta function, which is normalized to unity inside the bag and vanishes outside, and $n^\mu$ is the outward normal unit four-vector. The Skyrmion term is included to stabilize the soliton solution of the meson phase Lagrangian in $\mathcal{L}_{CS}$. The WZW term, which will be discussed in terms of the topology in the next section, is described by the action

$$\Gamma_{WZW} = -\frac{iN_c}{240\pi^2}\int_{\bar M} d^5r\,\epsilon^{\mu\nu\alpha\beta\gamma}\,\mathrm{tr}(l_\mu l_\nu l_\alpha l_\beta l_\gamma), \tag{2.4}$$

where $N_c$ is the number of colors and the integral is done on the five-dimensional manifold $\bar M = \bar V \times S^1 \times I$ with the three-space volume $\bar V$ outside the bag, the compactified time $S^1$ and the unit interval $I$ needed for a local form of the WZW term. The chiral symmetry is explicitly broken by the quark mass term with $M = \mathrm{diag}(m_u, m_d, m_s)$ and by the pion mass term, which is chosen such that it vanishes for $U = 1$.
1 In this work, we use the convention that the indices $a, b, c, \ldots$ run over $1, 2, \ldots, 8$, the indices $i, j, k, \ldots$ over $1, 2, 3$ and the indices $p, q, \ldots$ over $4, 5, 6, 7$. The Greek indices $\mu, \nu, \ldots$ are used for space–time with the metric $g_{\mu\nu} = \mathrm{diag}(+, -, -, -)$.
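The product rule $\lambda_a\lambda_b = \frac{2}{3}\delta_{ab} + (if_{abc} + d_{abc})\lambda_c$ quoted above can be checked numerically. The following is only a minimal sketch in plain Python, assuming the standard textbook form of the Gell-Mann matrices; the helper names (`LAM`, `f_and_d`, `product_identity_residual`) are hypothetical and are not taken from this work. The structure constants are extracted from traces using $\mathrm{tr}(\lambda_a\lambda_b) = 2\delta_{ab}$.

```python
import itertools
import math

S3 = 1 / math.sqrt(3)
# Standard Gell-Mann matrices lambda_1 ... lambda_8 (stored at indices 0..7).
LAM = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]


def matmul(x, y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(x):
    return sum(x[i][i] for i in range(3))


def f_and_d(a, b, c):
    """Return (f_abc, d_abc) from tr([la, lb] lc) = 4i f_abc and tr({la, lb} lc) = 4 d_abc."""
    t_abc = trace(matmul(matmul(LAM[a], LAM[b]), LAM[c]))
    t_bac = trace(matmul(matmul(LAM[b], LAM[a]), LAM[c]))
    f = ((t_abc - t_bac) / 4j).real
    d = ((t_abc + t_bac) / 4).real
    return f, d


def product_identity_residual():
    """Largest entry of lam_a lam_b - [(2/3) delta_ab + (i f_abc + d_abc) lam_c]."""
    worst = 0.0
    for a, b in itertools.product(range(8), repeat=2):
        lhs = matmul(LAM[a], LAM[b])
        coeffs = [f_and_d(a, b, c) for c in range(8)]
        for i in range(3):
            for j in range(3):
                rhs = 2 / 3 if (a == b and i == j) else 0
                rhs += sum((1j * f + d) * LAM[c][i][j]
                           for c, (f, d) in enumerate(coeffs))
                worst = max(worst, abs(lhs[i][j] - rhs))
    return worst
```

Calling `product_identity_residual()` returns the largest deviation from the identity over all 64 index pairs, and `f_and_d(0, 1, 2)` gives the pair $(f_{123}, d_{123})$ in the zero-based indexing assumed here.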
Now we want to construct the Noether currents under the SU(3)$_L$ × SU(3)$_R$ local group transformation. Under the infinitesimal isospin transformation in the SU(3) flavor channel

$$\psi \to \psi' = (1 - i\epsilon_a\hat Q_a)\psi, \qquad U \to U' = (1 - i\epsilon_a\hat Q_a)\,U\,(1 + i\epsilon_a\hat Q_a), \tag{2.5}$$
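where $\epsilon_a(x)$ are the local angle parameters of the group transformation and $\hat Q_a = \lambda_a/2$ are the SU(3) flavor charge operators given by the generators of the symmetry, the Noether theorem yields the flavor octet vector currents (FOVC) from the derivative terms in $\mathcal{L}_{CS}$ and $\mathcal{L}_{FSB}$

$$\begin{aligned}
J_V^{\mu a} &= \bar\psi\gamma^\mu\hat Q_a\psi\,\Theta_B + \left(-\frac{i}{2}f_\pi^2\,\mathrm{tr}(\hat Q_a l^\mu) + \frac{i}{8e^2}\,\mathrm{tr}[\hat Q_a, l^\nu][l^\mu, l_\nu] + U \leftrightarrow U^\dagger\right)\bar\Theta_B \\
&\quad - \frac{i}{12}(f_K^2 - f_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)(U\{\hat Q_a, l^\mu\} + \{\hat Q_a, l^\mu\}U^\dagger) + U \leftrightarrow U^\dagger\big)\,\bar\Theta_B \\
&\quad + \frac{N_c}{48\pi^2}\,\epsilon^{\mu\nu\alpha\beta}\,\mathrm{tr}(\hat Q_a l_\nu l_\alpha l_\beta - U \leftrightarrow U^\dagger)\,\bar\Theta_B
\end{aligned} \tag{2.6}$$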
with $\epsilon^{0123} = 1$. Of course, the $J_V^{\mu a}$ are conserved as expected in the chiral limit, but the mass terms in $\mathcal{L}_{CSB}$ and $\mathcal{L}_{FSB}$ give rise to the nontrivial four-divergence

$$\begin{aligned}
\partial_\mu J_V^{\mu a} &= -\frac{i}{6}(f_K^2 m_K^2 - f_\pi^2 m_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)[\hat Q_a, U + U^\dagger]\big)\,\bar\Theta_B \\
&\quad + \frac{i}{12}(f_K^2 - f_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)[\hat Q_a, U l_\mu l^\mu + l_\mu l^\mu U^\dagger]\big)\,\bar\Theta_B - i\bar\psi[\hat Q_a, M]\psi\,\Theta_B.
\end{aligned} \tag{2.7}$$

In addition, one can see that the electromagnetic (EM) currents $J^\mu_{EM}$ can be easily constructed by replacing the SU(3) flavor charge operators $\hat Q_a$ with the EM charge operator $\hat Q_{EM} = \hat Q_3 + (1/\sqrt{3})\hat Q_8$ in the FOVC (2.6) and that the four-divergence (2.7) vanishes to yield the conserved EM currents. Similarly, under the infinitesimal chiral transformation in the SU(3) flavor channel:
$$\psi \to \psi' = (1 - i\epsilon_a\gamma_5\hat Q_a)\psi, \qquad U \to U' = (1 + i\epsilon_a\hat Q_a)\,U\,(1 + i\epsilon_a\hat Q_a), \tag{2.8}$$
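one obtains the flavor octet axial currents (FOAC)

$$\begin{aligned}
J_A^{\mu a} &= \bar\psi\gamma^\mu\gamma_5\hat Q_a\psi\,\Theta_B + \left(-\frac{i}{2}f_\pi^2\,\mathrm{tr}(\hat Q_a l^\mu) + \frac{i}{8e^2}\,\mathrm{tr}[\hat Q_a, l^\nu][l^\mu, l_\nu] - U \leftrightarrow U^\dagger\right)\bar\Theta_B \\
&\quad - \frac{i}{12}(f_K^2 - f_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)(U\{\hat Q_a, l^\mu\} + \{\hat Q_a, l^\mu\}U^\dagger) - U \leftrightarrow U^\dagger\big)\,\bar\Theta_B \\
&\quad + \frac{N_c}{48\pi^2}\,\epsilon^{\mu\nu\alpha\beta}\,\mathrm{tr}(\hat Q_a l_\nu l_\alpha l_\beta + U \leftrightarrow U^\dagger)\,\bar\Theta_B.
\end{aligned} \tag{2.9}$$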
Here one notes that the FOAC are conserved only in the chiral limit, but one has the nontrivial four-divergence from the mass terms of $\mathcal{L}_{CSB}$ and $\mathcal{L}_{FSB}$

$$\begin{aligned}
\partial_\mu J_A^{\mu a} &= \frac{i}{6}(f_K^2 m_K^2 - f_\pi^2 m_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)[\hat Q_a, U - U^\dagger]\big)\,\bar\Theta_B \\
&\quad - \frac{i}{12}(f_K^2 - f_\pi^2)\,\mathrm{tr}\big((1 - \sqrt{3}\lambda_8)\{\hat Q_a, U l_\mu l^\mu - l_\mu l^\mu U^\dagger\}\big)\,\bar\Theta_B + i\bar\psi\gamma_5\{\hat Q_a, M\}\psi\,\Theta_B.
\end{aligned} \tag{2.10}$$

In the meson phase currents of (2.6) and (2.9), one should note that the terms with $U \leftrightarrow U^\dagger$ in the FOAC have the opposite sign of those in the FOVC. Moreover, the mesonic currents from the WZW term and from the nontopological terms also differ in the sign in front of the term with $U \leftrightarrow U^\dagger$. On the other hand, one can define the 16 vector and axial vector charges [138–140] of SU(3)$_L$ × SU(3)$_R$

$$\hat Q_a = \int d^3x\,J_V^{0a}, \qquad \hat Q_a^5 = \int d^3x\,J_A^{0a}, \tag{2.11}$$
where $J_V^{\mu a}$ and $J_A^{\mu a}$ are the octets of the FOVC and FOAC in (2.6) and (2.9), respectively. In the quantized theory discussed later, these generators are the charge operators and satisfy the equal-time commutator relations of the Lie algebra of SU(3)$_L$ × SU(3)$_R$

$$[\hat Q_a, \hat Q_b] = if_{abc}\hat Q_c, \qquad [\hat Q_a, \hat Q_b^5] = if_{abc}\hat Q_c^5, \qquad [\hat Q_a^5, \hat Q_b^5] = if_{abc}\hat Q_c, \tag{2.12}$$
and the chiral charges $\hat Q_a^R$ and $\hat Q_a^L$ defined as

$$\hat Q_a^{R,L} = \frac{1}{2}(\hat Q_a \pm \hat Q_a^5) \tag{2.13}$$

form a disjoint Lie algebra of SU(3)'s

$$[\hat Q_a^R, \hat Q_b^R] = if_{abc}\hat Q_c^R, \qquad [\hat Q_a^L, \hat Q_b^L] = if_{abc}\hat Q_c^L, \qquad [\hat Q_a^R, \hat Q_b^L] = 0 \tag{2.14}$$
from which the Adler–Weisberger sum rules [141,142] can be obtained in terms of off-mass-shell pion–nucleon cross sections.
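The step from the algebra (2.12) to the decoupled chiral algebra (2.14) can be illustrated with explicit matrices. The sketch below is a toy model and not the field-theoretic charges of this work: it assumes, for brevity, the SU(2) isospin subalgebra ($a = 1, 2, 3$, so that $f_{abc} \to \epsilon_{abc}$) and represents $\hat Q_a$ by $(\tau_a/2) \otimes 1$ and $\hat Q_a^5$ by $(\tau_a/2) \otimes \gamma_5$ with a diagonal two-by-two stand-in for $\gamma_5$. All names are hypothetical.

```python
import itertools

# Pauli matrices; tau_a / 2 generate the SU(2) isospin subgroup (a = 1, 2, 3).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]
GAMMA5 = [[1, 0], [0, -1]]  # assumed chirality-diagonal stand-in for gamma_5


def kron(x, y):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(x), len(y)
    return [[x[i // m][j // m] * y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def add(x, y):
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def commutator(x, y):
    return add(matmul(x, y), scale(-1, matmul(y, x)))


def max_abs(x):
    return max(abs(v) for row in x for v in row)


def levi_civita(a, b, c):
    """epsilon_abc for indices 0, 1, 2; plays the role of f_abc in SU(2)."""
    return (a - b) * (b - c) * (c - a) / 2


Q = [kron(scale(0.5, t), IDENTITY) for t in TAU]          # vector charges
Q5 = [kron(scale(0.5, t), GAMMA5) for t in TAU]           # axial charges
QR = [scale(0.5, add(q, q5)) for q, q5 in zip(Q, Q5)]     # (Q + Q5) / 2
QL = [scale(0.5, add(q, scale(-1, q5))) for q, q5 in zip(Q, Q5)]  # (Q - Q5) / 2


def algebra_residual(left, right, result):
    """Largest entry of [left_a, right_b] - i eps_abc result_c over all a, b."""
    worst = 0.0
    for a, b in itertools.product(range(3), repeat=2):
        expected = [[0] * 4 for _ in range(4)]
        for c in range(3):
            expected = add(expected, scale(1j * levi_civita(a, b, c), result[c]))
        diff = add(commutator(left[a], right[b]), scale(-1, expected))
        worst = max(worst, max_abs(diff))
    return worst
```

In this toy representation `algebra_residual(Q, Q5, Q5)` and `algebra_residual(Q5, Q5, Q)` probe the mixed relations of (2.12), while `algebra_residual(QR, QR, QR)` together with the commutators of `QR` with `QL` probe (2.14); the right and left charges decouple because the chiral projectors $(1 \pm \gamma_5)/2$ are orthogonal.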
2.2. WZW action and baryon number

More than 30 years ago, Skyrme [76] proposed a picture of the nucleon as a soliton in the otherwise uniform vacuum configuration of the nonlinear sigma model. Quantizing the topologically twisted soliton, he suggested that the topological charge or winding number could be identified with the baryon number $B$. His conjecture for the definition of $B$ has been revived [78,143] in terms of quantum chromodynamics (QCD). In particular, Witten [78] has established a unique relation between the topological charge and the baryon number, with the number of colors $N_c$ playing a crucial role. In the large-$N_c$ limit of QCD [144], meson interactions are described by the tree approximation to an effective local field theory of mesons, and baryons behave as if they were solitons [145], so that the identification of the Skyrmion with a baryon can be consistent with QCD. In this section, we will briefly review and summarize the fermionization of the Skyrmion with the WZW action [78] to obtain the baryon number in the CBM.

Now we consider the pure Skyrmion on a space–time manifold compactified to be $S^4 = S^3 \times S^1$, where $S^3$ and $S^1$ are the compactified Euclidean three-space and time, respectively. The chiral field $U$ is then a mapping of $S^4$ into the SU(3) group manifold to yield the homotopy group $\pi_4(\mathrm{SU}(3)) = 0$, so that the four-sphere in SU(3) defined by $U(x)$ is the boundary of a five-dimensional manifold $M = S^3 \times S^1 \times I$ with the two-dimensional disc $D = S^1 \times I$, where $I$ is the unit interval. Here one notes that $M$ is not unique, so that the compactified space–time $S^4$ is also the boundary of another five-disc $M'$ with opposite orientation. On the SU(3) manifold there is a unique fifth-rank antisymmetric tensor $\omega_{\mu\nu\alpha\beta\gamma}$ invariant under SU(3)$_L$ × SU(3)$_R$, which enables us to define an action

$$\Gamma_{M,M'} = \pm\int_{M,M'} d^5x\,\epsilon^{\mu\nu\alpha\beta\gamma}\,\omega_{\mu\nu\alpha\beta\gamma}, \tag{2.15}$$

where the signs $\pm$ are due to the orientations of the five-discs $M$ and $M'$, respectively. As in the Dirac quantization for the monopole [146,147], one should demand the uniqueness condition in a Feynman path integral, $e^{i\Gamma_M} = e^{i\Gamma_{M'}}$, to yield $\Gamma_M - \Gamma_{M'} = \int_{M+M'}\omega = 2\pi \times$ integer for any five-sphere constructed from $M + M'$ in the SU(3) group manifold. Here one notes that every five-sphere in SU(3) is topologically a multiple of a basic five-sphere $S^5$ due to $\pi_5(\mathrm{SU}(3)) = Z$. Normalizing $\omega$ on the basic five-sphere $S^5$ such that $\int_{S^5}\omega = 2\pi$, one can use in the quantum field theory the action of the form $n\Gamma$, where $n$ is an arbitrary integer. On the other hand, one can obtain the fifth-rank antisymmetric tensor $\omega_{\mu\nu\alpha\beta\gamma}$ on the five-disc $M$ [78]

$$\omega_{\mu\nu\alpha\beta\gamma} = -\frac{i}{240\pi^2}\,\mathrm{tr}(l_\mu l_\nu l_\alpha l_\beta l_\gamma), \tag{2.16}$$

which leads us to the condition that $n\Gamma_M$ is nothing but the WZW term in the pure Skyrmion model if $n = N_c$. Here one notes in the weak field approximation that the right-hand side of (2.16) can be reduced to a total divergence, so that by Stokes's theorem $\int_M\omega$ can be rewritten as an integral over the boundary of $M$, namely the compactified space–time $S^4$. In the CBM the five-disc $M = S^3 \times S^1 \times I$ is modified into $\bar M = \bar V \times S^1 \times I$, where $\bar V = S^3 - V$ with $V$ being the three-space volume inside the bag. On the modified five-manifold, one can construct the WZW term (2.4) in the CBM.
Also it is shown [78] that the above action $\Gamma_M$ is a homotopy invariant under SU(2) mappings with the homotopy group $\pi_4(\mathrm{SU}(2)) = Z_2$ and that, for a $2\pi$ adiabatic rotation of a soliton, the action gains the value $\Gamma_M = \pi$ corresponding to the nontrivial homotopy class in $\pi_4(\mathrm{SU}(2))$, so that one obtains an extra phase $e^{in\pi} = (-1)^n$ in the amplitude with respect to a soliton at rest with $\Gamma_M = 0$, which belongs to the trivial homotopy class. Here the factor $(-1)^n$ indicates that the soliton is a fermion (boson) for odd (even) $n$. On the other hand, one remembers that a baryon constructed with $n$ quarks is a fermion (boson) if $n$ is odd (even). With the three-flavor WZW term and $N_c = 3$, one then concludes that the Skyrmion can be fermionized. Here one notes that the nontrivial homotopy class in $\pi_4(\mathrm{SU}(2))$ can be depicted [78] by the creation and annihilation mechanism of a Skyrmion–anti-Skyrmion pair in the vacuum through the channel of a $2\pi$ rotation of the Skyrmion, and that it corresponds to the quantization of the Skyrmion as a fermion. Such a mechanism has also been used [148] in the (2+1)-dimensional nonlinear sigma model to discuss the Hopf topological invariant and linking number [149]. In fact, since the (2+1)-dimensional O(3) nonlinear sigma model (NLSM) was first discussed by Belavin and Polyakov [150], there have been many attempts to improve this soliton model associated with the homotopy group $\pi_2(S^2) = Z$. In particular, the configuration space in the O(3) NLSM is infinitely connected to yield the fractional spin statistics, which was first shown by Wilczek and Zee [151,149] via the additional Hopf term. Moreover, the O(3) NLSM with the Hopf term was canonically quantized [152], and the CP$^1$ model with the Hopf term [153–157], which can be related to the O(3) NLSM via the Hopf map projection from $S^3$ to $S^2$, was also canonically quantized later [154].
In fact, the CP$^1$ model has better features than the O(3) NLSM, in the sense that the action of the CP$^1$ model with the Hopf invariant has a desirable manifest locality, since the Hopf term has a local integral representation in terms of the physical fields of the CP$^1$ model [149]. Furthermore, this manifest locality in time is crucial for a consistent canonical quantization [158]. Recently, the geometrical constraints in the O(3) NLSM and the CP$^1$ model were systematically analyzed to yield the first-class Hamiltonian and the corresponding BRST invariant effective Lagrangian [158–160]. Meanwhile, the CP$^N$ model was studied [161] on the noncommutative geometry [162], which was quite recently analyzed in the framework of the improved Dirac quantization scheme [163]. Now using the Noether theorem as in the previous section, one can obtain the conserved flavor singlet vector currents (FSVC) $J_V^\mu$, which can be practically derived by the simple replacement of $\hat Q_a$ with 1 in the FOVC (2.6). If one defines the baryon number of a quark to be $1/N_c$, so that a baryon constructed from $N_c$ quarks has baryon number one, then the baryon number current $B^\mu$ can be shown to be $(1/N_c)J_V^\mu$, namely

$$B^\mu = \frac{1}{N_c}\bar\psi\gamma^\mu\psi\,\Theta_B + \frac{1}{24\pi^2}\,\epsilon^{\mu\nu\alpha\beta}\,\mathrm{tr}(l_\nu l_\alpha l_\beta)\,\bar\Theta_B, \tag{2.17}$$

and the baryon number of the chiral bag is given by

$$B = \int d^3x\,B^0 = \frac{1}{N_c}\int_B d^3x\,\psi^\dagger\psi + \frac{1}{24\pi^2}\int_{\bar V} d^3x\,\epsilon_{ijk}\,\mathrm{tr}(l_i l_j l_k), \tag{2.18}$$

which will be discussed in terms of the hedgehog solution ansatz in the next section.
2.3. Hedgehog solution

Since the Euler equation for the meson fields in the nonlinear sigma model was analytically investigated [71] to obtain a specific classical solution for the meson fields whose isospin index points radially, $\pi_i(\vec r)/f_\pi = \hat r_i\theta(r)$, the so-called hedgehog solution, this spherically symmetric classical solution has been commonly used as a prototype ansatz in the literature on Skyrmion-related hadron physics. In this section, we will consider the classical configuration in the meson and quark phases to review and summarize briefly the baryon number fractionization [31,74] in the CBM. Assuming maximal symmetry in the meson phase of the chiral bag, we describe the hedgehog solution $U_0$ embedded in the SU(2) isospin subgroup of SU(3):

$$U_0 = \begin{pmatrix} e^{i\vec\tau\cdot\hat r\,\theta(r)} & 0 \\ 0 & 1 \end{pmatrix}, \tag{2.19}$$

where
ning a stationary point of the chiral bag action. Together with the boundary term $-\frac{1}{2}\bar\psi U_5\psi\,\Delta_B$, the static mass $M$ also yields the boundary condition for the chiral angle at $z = ef_\pi R$:

$$\left(1 + \frac{2\sin^2\theta}{z^2}\right)\frac{d\theta}{dz} = \frac{1}{2ef_\pi^3}\,\bar\psi\,i\gamma_5\vec\tau\cdot\hat r\,e^{i\gamma_5\vec\tau\cdot\hat r\,\theta}\,\psi, \tag{2.22}$$

which allows the flow of currents between the two phases via the bag boundary. Here one notes that the baryon number (2.18) obtained from the topological WZW term and the quark fields remains constant [31,74] regardless of the bag radius through the continuity of the current at the bag boundary, even though one has the additional pion mass term in the static mass $M$.
On the other hand, in the chiral symmetric limit, the conventional variation scheme with respect to the quark fields yields the Dirac equation inside the bag and the boundary condition on the bag surface:

$$i\gamma^\mu\partial_\mu\psi = 0, \qquad r < R, \tag{2.23}$$

$$i\gamma^\mu n_\mu\psi = U_5\psi, \qquad r = R, \tag{2.24}$$
where the missing quark masses will be discussed later after the collective coordinate quantization is performed. Due to the coupling of spin and isospin in the boundary condition (2.24), the u and d quarks can be coalesced [165] into hedgehog (h) quark states, eigenstates of the grand spin $\vec K = \vec I + \vec J$, not of the isospin $\vec I$ and the spin $\vec J = \vec L + \vec S$ separately, while the s quark is decoupled from the hedgehog quark states. The h quark state is then specified by a set of quantum numbers $(K, m_K, P, m)$, where $K(K + 1)$ and $m_K$ (see footnote 2) are the eigenvalues of the squared operator $\vec K^2$ and of $K_3$, the third component of $\vec K$, and $P$ and $m$ are the parity and radial excitation quantum numbers, respectively. Similarly the s quark states are labeled by another set $(j, m_j, P, n)$ with $j(j + 1)$, $m_j$ and $n$, the eigenvalues of $\vec J^2$, $J_3$ and the radial quantum number. Now the quark field can be expanded in terms of the wave functions of the hedgehog and strange quark states

$$\psi(\vec r, t) = \sum_n\left(\psi_n^h(\vec r)e^{-i\epsilon_n t}a_n + \psi_n^{h*}(\vec r)e^{i\epsilon_n t}b_n^\dagger + \psi_n^s(\vec r)e^{-i\omega_n t}c_n + \psi_n^{s*}(\vec r)e^{i\omega_n t}d_n^\dagger\right), \tag{2.25}$$
where the hedgehog quark states are expressed by the spatial wave functions $\psi_n^h(\vec r)$ with grand spin quantum numbers, whose explicit forms will be given in Appendix B, the annihilation operator $a_n$ ($b_n^\dagger$) for the positive (negative) energy fulfills the usual anticommutator rules and also defines the vacuum $a_n|0\rangle = b_n|0\rangle = 0$, and the strange quark states are analogously described. Here we do not bother to include the color index explicitly since every particle is a color singlet. The energy spectrum of the hedgehog quark states [165] depends on $\theta(R)$, the chiral angle at the bag surface, while the strange quark states remain intact regardless of the chiral angle. Finally, in the framework of the previous literature [31,74] we reconsider the baryon number (2.18) in the hedgehog ansatz to see that the total baryon number is still an integer in the CBM. Using the hedgehog solution (2.19) in the meson piece in (2.18), one can obtain the fractional baryon number in terms of the chiral angle at the bag surface $\theta = \theta(R) \in [-\pi, 0]$ [74]:

$$B_m = -\frac{1}{2\pi}\chi_E(\theta - \sin\theta\cos\theta), \tag{2.26}$$

where $\chi_E$ is the Euler characteristic, which takes the integer value two for the spherical bag surface.
2 Here we have used the same symbol $m_K$ for the quantum number and the kaon mass. However, the reader can easily recognize the meaning of the symbol from the context.
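The way the baryon number is shared between the two phases can be made concrete with a short numerical sketch. It evaluates $B_m$ of Eq. (2.26) together with the quark phase counterpart $B_q$ of Eq. (2.30), which is derived later in this section, and checks that their sum stays at unity for any chiral angle at the bag surface. The function names, the default $\chi_E = 2$ and the midpoint integration are illustrative assumptions; the numerical integral of $dB_0/d\theta = (\chi_E/\pi)\sin^2\theta$ is only meant for $-\pi/2 < \theta \le 0$, away from the jump of $B_0$ at the magic angle.

```python
import math

CHI_SPHERE = 2  # Euler characteristic of a spherical bag surface


def euler_characteristic(vertices, edges, faces):
    """v - e + f for a decomposition of a compact surface."""
    return vertices - edges + faces


def meson_baryon_number(theta, chi=CHI_SPHERE):
    """B_m of Eq. (2.26): -(chi / 2 pi) (theta - sin theta cos theta)."""
    return -chi / (2 * math.pi) * (theta - math.sin(theta) * math.cos(theta))


def quark_baryon_number(theta, chi=CHI_SPHERE):
    """B_q of Eq. (2.30): 1 + (chi / 2 pi) (theta - sin theta cos theta)."""
    return 1 + chi / (2 * math.pi) * (theta - math.sin(theta) * math.cos(theta))


def vacuum_baryon_number_numeric(theta, chi=CHI_SPHERE, steps=20000):
    """Midpoint-rule integral of (chi / pi) sin^2(t) for t from 0 to theta."""
    h = theta / steps
    total = sum(math.sin((k + 0.5) * h) ** 2 for k in range(steps))
    return chi / math.pi * total * h


if __name__ == "__main__":
    for theta in (0.0, -math.pi / 4, -math.pi / 2, -math.pi):
        b_m = meson_baryon_number(theta)
        b_q = quark_baryon_number(theta)
        print(f"theta = {theta:+.4f}  B_m = {b_m:.4f}  B_q = {b_q:.4f}  sum = {b_m + b_q:.4f}")
```

For a cube-like decomposition of the sphere, `euler_characteristic(8, 12, 6)` reproduces $\chi_E = 2$, and a three-by-three periodic grid on the torus, `euler_characteristic(9, 18, 9)`, gives zero.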
In general, the Euler characteristic of a compact surface is the topological invariant defined by the integer $v - e + f$ [166], with $v$, $e$ and $f$ the numbers of vertices, edges and faces in a decomposition of the surface, so that one can easily see $\chi_E(\mathrm{sphere}) = 2$ and $\chi_E(\mathrm{torus}) = 0$, for instance. Also, it is interesting to see that adding a handle $H$, or a torus with the interior of one face removed, to a compact surface $S$ reduces its Euler characteristic by two, since to obtain the coalesced surface $S'$ one needs the surgery of removing the interior of a face of $S$, so that $S'$ has two faces less than $S$ and $H$ combined. For a coalesced surface with $h$ handles, one has the generalized identity $\chi_E(S') = \chi_E(S) - 2h$ [166]. On the other hand, it has been noted [31] in the CBM that the quark phase spectrum is asymmetric about zero energy to yield the nonvanishing vacuum contribution to the baryon number

$$B_0 = -\frac{1}{2}\lim_{s\to 0}\sum_n \mathrm{sgn}(E_n)\,e^{-s|E_n|}, \tag{2.27}$$

where the sum runs over all positive and negative energy eigenstates and the symmetrized operator $\frac{1}{2}[\psi^\dagger, \psi]$ is used in the quark part of (2.18). Here one notes that the regularized factor is closely related [74] to the eta invariant of Atiyah et al. [167],

$$\eta(s) = \frac{1}{2}\lim_{s\to 0}\sum_n \mathrm{sgn}(E_n)\,|E_n|^{-s}, \tag{2.28}$$
which has also been discussed in connection with the phase factor of the path integral in quantum field theory associated with the Jones polynomial and knot theory [168], and recently has been exploited in the investigation of the semiclassical partition functions and the Jacobi fields in the framework of the Morse theory of differential geometry [169]. Except at the magic angle $\theta = -\pi/2$, where the baryon number is shared equally by the quark and meson phases and $B_0$ jumps by unity due to the Dirac sea [31], the chiral angle dependence of the quark vacuum baryon number, $dB_0/d\theta = \frac{1}{2}\lim_{s\to 0}\sum_n s\,(dE_n/d\theta)\,e^{-s|E_n|}$, is given in terms of the integration of the Gaussian curvature $\kappa$ on the bag surface $S$ [74]

$$\frac{dB_0}{d\theta} = \frac{1}{2\pi^2}\sin^2\theta\int_S d^2x\,\kappa, \tag{2.29}$$

where a multiple-reflection expansion of the Euclidean Green's function, as well as the Dirac equation (2.23) and the boundary condition (2.24), has been used [74]. Using the Gauss–Bonnet theorem [170], one can rewrite (2.29) in terms of the Euler characteristic, $dB_0/d\theta = (\chi_E/\pi)\sin^2\theta$, to yield the total quark phase baryon number

$$B_q = 1 + \frac{1}{2\pi}\chi_E(\theta - \sin\theta\cos\theta), \tag{2.30}$$
where, in addition to the $\theta$-dependent vacuum contribution $B_0$, one has the unity factor contributed by the $N_c$ degenerate valence quarks to fill the $K^P = 0^+$ h-quark and $j^P = \frac{1}{2}^+$ s-quark eigenstates. In the $K^P = 0^+$ level we can define the static hedgehog ground state $|H\rangle_0$: $a_v^\dagger|0\rangle$
($a_v^\dagger$ being the valence quark creation operator with the quantum number $K^P = 0^+$) for $-\pi/2 \le \theta \le 0$ and $|0\rangle$ in $-\pi \le \theta < -\pi/2$, since the quarks in the positive energy level are the valence quarks while those in the negative energy level can be considered to sink into the vacuum. Here one notes that in the MIT bag limit at $\theta = 0$, where there are no vacuum and meson contributions, only the $N_c$ degenerate valence quarks yield the baryon number. Also for $-\pi/2 \le \theta < 0$ the valence quarks and the $\theta$-dependent vacuum contribute to $B_q$, while for $-\pi \le \theta < -\pi/2$ only the quark vacuum does in the static hedgehog ground state.

2.4. Collective coordinate quantization

Until now, we have considered the baryon quantum number in the classical static hedgehog solution in the meson phase of the CBM. As in the Skyrmion model [64], the other quantum numbers such as spin, isospin and hypercharge can be obtained in the CBM by quantizing the zero modes associated with the slow collective rotation,

$$U_0 \to AU_0A^\dagger, \qquad \psi \to A\psi, \tag{2.31}$$
on the SU(3)$_F$ group manifold, where $A(t) \in$ SU(3)$_F$ is the time-dependent collective variable restrained by the WZW constraint. In the $\lambda$-dimensional irreducible representation (IR), the baryon is then described by a wave function of the form

$$|B\rangle_\lambda = \Phi_B^\lambda(A) \otimes |\mathrm{intrinsic}\rangle, \tag{2.32}$$

where $\Phi_B^\lambda(A)$ is the baryon-dependent collective coordinate wave function

$$\Phi_B^\lambda(A) = \sqrt{\dim(\lambda)}\,D^\lambda_{ab}(A) \tag{2.33}$$
with the quantum numbers $a = (Y, I, I_3)$ ($Y$: hypercharge, $I$: isospin) and $b = (Y_R, J, J_3)$ ($Y_R$: right hypercharge, $J$: spin) in the Wigner D-matrix. In the eight-dimensional adjoint representation the matrix is given by $D^8_{ab}(A) = \frac{1}{2}\mathrm{tr}(A^\dagger\lambda_a A\lambda_b)$. On the other hand, the intrinsic state degenerate to all the baryons is described by the classical meson configuration approximated by a rotated hedgehog solution $AU_0A^\dagger$ and a rotated hedgehog ground state discussed later. With the introduction of the collective rotation, the Dirac equation (2.23) is modified and the boundary condition (2.24) is rewritten in the hedgehog ansatz as below [33]:

$$\left(i\gamma^\mu\partial_\mu + \frac{1}{2}\dot q_a\gamma^0\lambda_a\right)\psi = 0, \qquad r < R, \tag{2.34}$$

$$\left(i\hat r\cdot\vec\gamma + e^{i\gamma_5\lambda_i\hat r_i\theta}\right)\psi = 0, \qquad r = R, \tag{2.35}$$

where we have used the collective coordinates $q_a$ defined by $A^\dagger\dot A = -(i/2)\lambda_a\dot q_a$. The collective rotation of the chiral bag induces [33] the particle–hole excitations, which will be treated perturbatively in this work to yield the correction to the wave functions $\psi_n^h(\vec r)$ and
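$\psi_n^s(\vec r)$ in (2.25):

$$\begin{aligned}
\psi_n^h(\vec r) &= \psi_n^{0h}(\vec r) + \frac{1}{2}\dot q_i\sum_{m\neq n}\frac{{}_h\langle m|\lambda_i|n\rangle_h}{\epsilon_m - \epsilon_n}\,\psi_m^{0h}(\vec r) + \frac{1}{2}\dot q_p\sum_m\frac{{}_s\langle m|\lambda_p|n\rangle_h}{\omega_m - \epsilon_n}\,\psi_m^{0s}(\vec r), \\
\psi_n^s(\vec r) &= \psi_n^{0s}(\vec r) + \frac{1}{2}\dot q_p\sum_{m\neq n}\frac{{}_h\langle m|\lambda_p|n\rangle_s}{\epsilon_m - \omega_n}\,\psi_m^{0h}(\vec r),
\end{aligned} \tag{2.36}$$

where the matrix elements with the unperturbed states $\psi_n^{0h}(\vec r)$ and/or $\psi_n^{0s}(\vec r)$ are defined in the following Dirac notation:

$$\begin{aligned}
{}_h\langle m|\lambda_i|n\rangle_h &= \int_V d^3x\,\psi_m^{\dagger 0h}(\vec r)\,\lambda_i\,\psi_n^{0h}(\vec r), \\
{}_s\langle m|\lambda_p|n\rangle_h &= \int_V d^3x\,\psi_m^{\dagger 0s}(\vec r)\,\lambda_p\,\psi_n^{0h}(\vec r).
\end{aligned} \tag{2.37}$$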
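Here one notes that since $\lambda_8\dot q_8$ related to the WZW term plays the role of a constraint, it does not appear explicitly in the above quark wave functions. With the collective rotation, the Fock space should then be modified for the $N_c$ quarks to fill up the new single states (2.36) with the minimum energy, so that the rotated hedgehog ground state has a form analogous to the cranking formula in nuclear physics [171]

$$\begin{aligned}
|H\rangle &= \Bigg[1 + \frac{1}{2}\sum_{m,n}\left(\dot q_i\frac{{}_h\langle m|\lambda_i|n\rangle_h}{\epsilon_m - \epsilon_n}a_m^\dagger b_n^\dagger + \dot q_p\frac{{}_h\langle m|\lambda_p|n\rangle_s}{\epsilon_m - \omega_n}a_m^\dagger d_n^\dagger + \dot q_p\frac{{}_s\langle m|\lambda_p|n\rangle_h}{\omega_m - \epsilon_n}c_m^\dagger b_n^\dagger\right) \\
&\qquad + \frac{1}{2}\sum_m\left(\dot q_i\frac{{}_h\langle m|\lambda_i|v\rangle}{\epsilon_m - \epsilon_v}a_m^\dagger a_v + \dot q_p\frac{{}_s\langle m|\lambda_p|v\rangle}{\omega_m - \epsilon_v}c_m^\dagger a_v\right)\Bigg]|H\rangle_0,
\end{aligned} \tag{2.38}$$

where $|v\rangle$ stands for the valence quark state for $-\pi/2 \le \theta \le 0$. To obtain the chiral bag Hamiltonian in the chiral symmetric limit (see Section 2.2 for the symmetry breaking case) we can construct the canonical momenta $\pi_a$ conjugate to the collective variables $q_a$

$$\pi_a = I_1\dot q_i\delta_{ia} + I_2\dot q_p\delta_{pa} + \frac{\sqrt{3}}{2}B\,\delta_{8a}. \tag{2.39}$$

Here we have used the parameter fixing $N_c = 3$ and the identity $B_q + B_m = 1$ discussed before, where $B_m$ comes from the WZW term and $B_q$ is calculated from the equation $\langle H|\int_V d^3x\,\psi^\dagger(\lambda_8/2)\psi|H\rangle = (\sqrt{3}/2)B_q$. The moments of inertia $I_1$ and $I_2$ are explicitly given by the sum of two contributions from the quark and meson phases as below:

$$\begin{aligned}
I_1 &= \frac{3}{2}\sum_{m,n}\frac{|{}_h\langle m|\lambda_3|n\rangle_h|^2}{\epsilon_m - \epsilon_n} + \frac{3}{2}\sum_m\frac{|{}_h\langle m|\lambda_3|v\rangle|^2}{\epsilon_m - \epsilon_v} + \frac{8\pi}{3e^3f_\pi}\int_{ef_\pi R}^\infty dz\,z^2\sin^2\theta\left[1 + \left(\frac{d\theta}{dz}\right)^2 + \frac{\sin^2\theta}{z^2}\right], \\
I_2 &= \frac{3}{2}\sum_{m,n}\left(\frac{|{}_h\langle m|\lambda_4|n\rangle_s|^2}{\epsilon_m - \omega_n} + \frac{|{}_s\langle m|\lambda_4|n\rangle_h|^2}{\omega_m - \epsilon_n}\right) + \frac{3}{2}\sum_m\frac{|{}_s\langle m|\lambda_4|v\rangle|^2}{\omega_m - \epsilon_v} \\
&\quad + \frac{2\pi}{e^3f_\pi}\int_{ef_\pi R}^\infty dz\,z^2(1 - \cos\theta)\left[1 + \left(\frac{d\theta}{dz}\right)^2 + \frac{2\sin^2\theta}{z^2}\right],
\end{aligned} \tag{2.40}$$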
where we have used the symmetry properties of the matrix elements to employ only $\lambda_3$ and $\lambda_4$. The chiral bag Hamiltonian is then given by

$$H_0 = M + \frac{1}{2}\left(\frac{1}{I_1} - \frac{1}{I_2}\right)\hat J^2 + \frac{1}{2I_2}\left(\hat C_2 - \frac{3}{4}\hat Y_R^2\right), \tag{2.41}$$

where $M$ is the static mass (2.20), $\hat J^2$ and $\hat C_2$ are the Casimir operators in the SU(2) and SU(3) groups, respectively, and $\hat Y_R$ is the right hypercharge operator, which yields the WZW constraint $\hat Y_R|\mathrm{phys}\rangle = +1|\mathrm{phys}\rangle$ for any physical state $|\mathrm{phys}\rangle$.
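As a worked illustration of Eq. (2.41), the rotational energy $H_0 - M$ can be evaluated for the baryon octet and decuplet. The sketch below assumes the standard normalization of the quadratic SU(3) Casimir, $C_2(p, q) = (p^2 + q^2 + pq)/3 + p + q$, so that $C_2 = 3$ for the octet $(1, 1)$ and $C_2 = 6$ for the decuplet $(3, 0)$, together with $Y_R = 1$; the function names and the sample moments of inertia are hypothetical and are not values from this work. Under these assumptions the decuplet–octet splitting reduces to $3/(2I_1)$, independent of $I_2$.

```python
from fractions import Fraction


def su3_casimir(p, q):
    """Quadratic Casimir of the SU(3) irrep (p, q), standard normalization (assumed)."""
    return Fraction(p * p + q * q + p * q, 3) + p + q


def rotational_energy(i1, i2, spin, p, q, y_right=1):
    """H_0 - M from Eq. (2.41) for spin J in the SU(3) irrep (p, q)."""
    j_squared = spin * (spin + 1)
    c2 = su3_casimir(p, q)
    return (Fraction(1, 2) * (1 / i1 - 1 / i2) * j_squared
            + (c2 - Fraction(3, 4) * y_right ** 2) / (2 * i2))


if __name__ == "__main__":
    # Hypothetical moments of inertia in arbitrary units, for illustration only.
    i1, i2 = Fraction(5), Fraction(2)
    octet = rotational_energy(i1, i2, Fraction(1, 2), 1, 1)
    decuplet = rotational_energy(i1, i2, Fraction(3, 2), 3, 0)
    print("octet:", octet, "decuplet:", decuplet, "splitting:", decuplet - octet)
```

Exact rational arithmetic is used so that the cancellation of the $I_2$ dependence in the splitting can be seen without rounding noise.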
2.5. Cheshire cat principle

As we have seen in the previous sections, the CBM can be considered as a hybrid or combination of two different models: the MIT bag model at infinite bag radius on one hand and the Skyrmion model at vanishing radius on the other hand. Of course, the meson phase Lagrangian in (2.1) can be generalized to a more complicated version including vector meson fields such as $\rho$ and $\omega$ [172]. In the hybrid model, there has been considerable discussion concerning the conjecture that the bag itself has only notational but no physical significance, the so-called Cheshire cat principle (CCP) [70,32,39,40,42] (see footnote 3). The term Cheshire cat originates from the quotation in the fable “Alice in Wonderland” [173]: “Well, I’ve often seen a cat without a grin”, thought Alice, “but a grin without a cat! It is the most curious thing, I ever saw in my life!” According to the Cheshire cat viewpoint, the bag wall (Cheshire cat) tends to fade away when examined closely, leaving behind the bag boundary conditions translating the fermionic and bosonic descriptions into one another (the grin of the Cheshire cat) [70]. In (1+1) dimensions, where exact bosonization and fermionization relations are known [174], Nadkarni and co-workers proposed the Cheshire cat model where the CCP is exactly obeyed, so that physics is invariant under changes in bag shape and/or size [70]. Namely, in a simple model with a free massless fermion inside the bag and the equivalent free massless boson outside the bag, the bag boundary conditions are shown, via bosonization relations, to yield a clue to the CCP: shifting the bag wall has no physical effect. Now, we briefly recapitulate the CCP in the (1+1)-dimensional CBM, by introducing a massless free single-flavored fermionic quark $\psi$ confined to a region of volume $V$ (inside) and

3 Based on phenomenology, a similar idea of the CCP was proposed by Brown and co-workers, simultaneously with and independently of Ref. [70].
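a massless free bosonic meson $\phi$ located in a region $\bar V$ (outside). Here we assume that these two fields are coupled to each other via the surface $\partial V$. Now we consider the following action $S$, which is invariant under global chiral rotations and parity (see footnote 4):

$$S = S_V + S_{\bar V} + S_{\partial V}, \tag{2.42}$$

$$S_V = \int_V d^2x\,\bar\psi\,i\gamma^\mu\partial_\mu\psi + \cdots, \tag{2.43}$$

$$S_{\bar V} = \int_{\bar V} d^2x\,\frac{1}{2}(\partial_\mu\phi)^2 + \cdots, \tag{2.44}$$

$$S_{\partial V} = \int_{\partial V} d\Sigma^\mu\,\frac{1}{2}n_\mu\,\bar\psi e^{i\gamma_5\phi/f}\psi, \tag{2.45}$$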
where the ellipsis stands for other terms such as interactions, masses and so on. Here we have assumed that chiral symmetry holds on the boundary even if, as in nature, it is broken both inside and outside due to mass terms, and that the boundary term does not break the discrete symmetries P, C and T. In the boundary action (2.45), $f = 1/\sqrt{4\pi}$ is the $\phi$ meson decay constant and $d\Sigma^\mu$ is an area element with the normal vector $n^\mu$, namely, $n^2 = -1$ and picked outward-normal. From the action (2.42), one can obtain the classical equations of motion:

$$i\gamma^\mu\partial_\mu\psi = 0, \tag{2.46}$$

$$\partial_\mu\partial^\mu\phi = 0, \tag{2.47}$$

and the boundary conditions associated with the MIT confinement condition

$$in^\mu\gamma_\mu\psi = -e^{i\gamma_5\phi/f}\psi, \tag{2.48}$$

$$n^\mu\partial_\mu\phi = \frac{1}{2f}\bar\psi n^\mu\gamma_\mu\gamma_5\psi. \tag{2.49}$$
Here one can have the conserved vector current $j_\mu = \bar\psi\frac{1}{2}\gamma_\mu\psi$ with $\partial_\mu j^\mu = 0$, or $\bar\psi\frac{1}{2}n^\mu\gamma_\mu\psi = 0$ at the surface from Eq. (2.48), and the conserved axial vector current $j_\mu^5 = \bar\psi\frac{1}{2}\gamma_\mu\gamma_5\psi$ with $\partial^\mu j_\mu^5 = 0$ from Eq. (2.49). Note that at the quantum level the vector current is not conserved due to the quantum anomaly, in contrast to the usual open space case where the anomaly is in the axial current. For simplicity, we assume that the quark is confined to the space $-\infty \le r \le R$ with a boundary at $r = R$. The vector current $j_\mu$ is then conserved inside the bag

$$\partial_\mu j^\mu = 0, \tag{2.50}$$

4 Here we have used the metric $g_{\mu\nu} = \mathrm{diag}(1, -1)$ and the Weyl representation for the gamma matrices, $\gamma_0 = \gamma^0 = \sigma_1$, $\gamma_1 = -\gamma^1 = -i\sigma_2$, $\gamma_5 = \gamma^5 = \sigma_3$ with the Pauli matrices $\sigma_i$.
to, after integration, yield the time-rate change of the fermion (quark) number

$$\frac{dB}{dt} = 2\int_{-\infty}^R dr\,\partial_0 j_0 = 2\int_{-\infty}^R dr\,\partial_1 j_1 = 2j_1(R), \tag{2.51}$$

so that one can obtain on the boundary

$$\frac{dB}{dt} = \bar\psi n^\mu\gamma_\mu\psi, \tag{2.52}$$

which vanishes classically as mentioned above. However, at the quantum level the above quantity is not well-defined locally in time since $\psi^\dagger(t)\psi(t + \epsilon)$ is singular as $\epsilon \to 0$ due to vacuum fluctuation. Now we regulate this bilinear operator by exploiting the following point-splitting ansatz at $r = R$:

$$j_1 = \bar\psi(t)\frac{1}{2}\gamma_1\psi(t + \epsilon) = -\frac{i}{4f}\,\epsilon\,\dot\phi(t)\,\psi^\dagger(t)\psi(t + \epsilon) = \frac{1}{4\pi f}\dot\phi(t) + O(\epsilon), \tag{2.53}$$
where we have used the boundary condition (2.48), the commutation relation $[\phi(t), \phi(t + \epsilon)] = i\,\mathrm{sgn}\,\epsilon$ and $\psi^\dagger(t)\psi(t + \epsilon) = i/(\pi\epsilon)$ + regular terms [39]. The quarks can then flow in or out if the meson fields change in time. In order to understand the leakage of the quarks from the bag, we consider the surface tangent $t^\mu = \epsilon^{\mu\nu}n_\nu$ to obtain at $r = R$

$$t^\mu\partial_\mu\phi = -\frac{1}{2f}\bar\psi n^\mu\gamma_\mu\psi = \frac{1}{2f}\bar\psi t^\mu\gamma_\mu\gamma_5\psi, \tag{2.54}$$
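where we have used the relation $\bar\psi\gamma_\mu\gamma_5\psi = \epsilon_{\mu\nu}\bar\psi\gamma^\nu\psi$ valid in (1+1) dimensions. Combination of Eqs. (2.49) and (2.54) yields the bosonization relation at the boundary $r = R$ and time $t$

$$\partial_\mu\phi = \frac{1}{2f}\bar\psi\gamma_\mu\gamma_5\psi, \tag{2.55}$$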
which is a unique feature of (1+1)-dimensional fields [174]. Moreover, the quark field can be written in terms of the meson field as follows:

$$\psi(x) = \exp\left\{-\frac{i}{2f}\int_{x_0}^x dz\left[\pi(z) + \gamma_5\frac{d\phi}{dz}\right]\right\}\psi(x_0), \tag{2.56}$$

where $\pi(z)$ is the momentum field conjugate to $\phi(z)$. Here one notes that the nonvanishing vector current (2.53) is not conserved due to quantum effects to yield the vector anomaly as shown in Eq. (2.51), and that the amount of fermion number $\Delta t\,\dot\phi/(\pi f)$ is pushed into the Dirac sea through the bag boundary to yield the following fractional fermion numbers $B_V$ inside the bag and $B_{\bar V}$ outside the bag, respectively,

$$B_V = 1 - \frac{\theta}{\pi}, \qquad B_{\bar V} = \frac{\theta}{\pi}, \tag{2.57}$$

with $\theta = \phi(R)/f$. Note that due to the identity $B_V + B_{\bar V} = 1$, the total fermion number $B$ is invariant under such changes of the bag location and/or size, so that one can conclude that the CCP in (1+1) dimensions is realized. Until now, we have considered the colorless fermions without introducing a gauge field $A_\mu$. If one includes the additional gauge degrees of freedom
Fig. 1. The flavor singlet axial current of the proton as a function of bag radius: (a) the quark and $\eta$ meson contribution $a^0_{BQ} + a^0_\eta$, (b) the gluon contributions $a^0_{G,\mathrm{static}}$ and $a^0_{G,\mathrm{vac}}$ from the static gluon due to the quark source and from the gluon vacuum, respectively, (c) the total contribution $a^0_{\mathrm{total}}$. The shaded area stands for the range admitted by experiments.
inside the bag, one can have another type of anomaly, the so-called color anomaly [175,176], which also appears in the realistic (3+1)-dimensional CBM. (For more details, see Ref. [39].) Now, we would like to briefly comment on the case of the CCP in (3+1) dimensions. One remembers in (2.26) and (2.30) that the fractional baryon numbers $B_m$ and $B_q$ are described in terms of the Euler characteristic and the chiral angle, which depend on the bag shape and size, respectively, so that one can enjoy the freedom to fix the fractional baryon numbers in both phases by adjusting these bag parameters. Moreover, due to the identity $B_m + B_q = 1$, the total baryon number $B$ is invariant under such changes of the bag shape and/or size, so that one can conclude that the CCP in the CBM is realized at least in the physical quantity $B$ in (3+1) dimensions as in the above case of (1+1) dimensions. This fact supports the CCP in the (3+1)-dimensional CBM even though there is still no rigorous verification for this principle in other physical quantities evaluated in the CBM. For instance, one can see the approximate CCP in the flavor singlet axial current evaluated in the (3+1)-dimensional CBM, as shown in Fig. 1. (For more details, see Ref. [141].) In the following sections, we will see that the CBM can be regarded as a candidate unifying the MIT bag and Skyrmion models since the other physical quantities are also insensitive enough to suggest the CCP.

3. Baryon octet magnetic moments

3.1. Coleman–Glashow sum rules

Since Coleman and Glashow [177] predicted the magnetic moments of the baryon octet about 40 years ago, there has been a lot of progress in both the theoretical paradigm and the experimental verification for the baryon magnetic moments.
In this section, we will investigate the explicit Coleman–Glashow sum rules and spin symmetries of the magnetic moments of the baryon octet in the adjoint representation of the SU(3) flavor group by assuming that the chiral bag has the SU(3) flavor symmetry with $m_u = m_d = m_s$, $m_\pi = m_K$ and $f_\pi = f_K$. Even though the quark and pion masses in $\mathcal{L}_{CSB}$ in (2.2) break both the SU(3)$_L$ × SU(3)$_R$ and the diagonal SU(3) symmetry so that chiral symmetry cannot be conserved, these terms without derivatives yield no explicit contribution to the EM currents $J^\mu_{EM}$ obtainable from (2.6), and at least in the adjoint representation of the SU(3) group the EM currents are conserved and of the same form as the chiral limit result $J^\mu_{EM,CS}$ to preserve the U-spin symmetry. The higher representation mixing in the baryon wave functions, induced by the different pseudoscalar meson masses and decay constants outside and the different quark masses inside the bag, will be discussed in the next section in terms of the multiquark structure scheme where the chiral bag has an additional meson contribution from the $q\bar q$ content inside the bag. In the collective quantization scheme of the CBM which was discussed in the previous section, the EM currents yield the magnetic moment operators of the same form as the chiral symmetric limit consequence $\hat\mu^i_{CS} = \hat\mu^{i(3)}_{CS} + (1/\sqrt{3})\,\hat\mu^{i(8)}_{CS}$, where

$$\hat\mu^{i(a)}_{CS} = -\mathcal{N}D^8_{ai} - \mathcal{N}'d_{ipq}D^8_{ap}\hat T^R_q + \frac{N_c}{2\sqrt{3}}\mathcal{M}D^8_{a8}\hat J_i. \tag{3.1}$$
R R R Here Jˆi = − Tˆi are the SU(2) spin operators, and Tˆi and Tˆp are the right SU(3) isospin operators along the isospin and strangeness directions, respectively, and the inertia parameters are of complicated forms given by 3 N= sgn(jm )h m|"3(3) |mh − 3v|"3(3) |v 2 m 2 ∞ 4 d; sin2 ; 2 2 + 3 d z z sin ; 1 + + 2 ; 3e f ef R dz z (4) (4) 1 3 h m|)4 |nss n|"3 |mh s m|)4 |nhh n|"3 |ms N =− + I2 2 m; n jm − !n !m − jn
−
(4) 1 1 3 v|)4 |mss m|"3 |v + I2 2 m !m − jv 3e2 f2
∞
ef R
d z z 2 sin2 ;
d; ; dz
(0) (0) 1 3 h m|)3 |nhh n|"3 |mh 1 3 v|)3 |mhh m|"3 |v − I1 2 m; n jm − j n I1 2 m jm − j v ∞ 1 d; 1 + d z z 2 sin2 ; ; 2 2 I1 3e f ef R dz
M=−
(3.2)
˜ , "(4) = 1 V3 and "(0) = 1 2 V3 where Vi = ijk xj !0 !k and the Hermitian conjugate with "3(3) = 14 · 13 ˜) ·V 3 3 4 43 matrix elements are understood in the quark phase parts of N and M. The numerical values
Table 1
The inertia parameters as a function of the bag radius R with $f_\pi$ = 93 MeV, $f_K$ = 114 MeV and e = 4.75

R (fm)   M       N       N'      P       Q       γ
0.00     0.671   5.028   0.908   0.762   0.986    5.372
0.10     0.671   5.088   0.835   0.772   1.000    6.008
0.20     0.669   5.371   0.791   0.822   1.062    7.144
0.30     0.660   5.660   0.752   0.886   1.125    8.290
0.40     0.647   5.697   0.699   0.944   1.159    8.991
0.50     0.643   5.834   0.615   1.022   1.205   10.133
0.60     0.656   6.000   0.519   1.112   1.265   11.875
0.70     0.693   6.128   0.424   1.184   1.305   14.022
0.80     0.768   6.167   0.335   1.212   1.302   16.550
0.90     0.886   6.130   0.266   1.185   1.249   19.280
1.00     1.042   6.056   0.222   1.114   1.156   21.987
[35,25] of these inertia parameters are summarized in Table 1 and their quark phase inertia parameters are discussed in Ref. [35] and Appendix B. Here one notes that $M$ and $N'$ originate from the topological WZW term along the isospin and strangeness directions, respectively.

With respect to the octet baryon wave function $\Phi^{\lambda}_{B}$ discussed in (2.33), the spectrum of the magnetic moment operator $\hat{\mu}^{i}$ in the adjoint representation of the SU(3) flavor symmetric limit has the following U-spin symmetric Coleman–Glashow sum rules [177–179] due to the degenerate d- and s-flavor charges in the SU(3) EM charge operator $\hat{Q}_{EM}$ in the EM currents:

$$\begin{aligned}
\mu_{\Sigma^+} = \mu_{p} &= \tfrac{1}{10}M + \tfrac{4}{15}\left(N + \tfrac{1}{2}N'\right),\\
\mu_{\Xi^0} = \mu_{n} &= \tfrac{1}{20}M - \tfrac{1}{5}\left(N + \tfrac{1}{2}N'\right),\\
\mu_{\Xi^-} = \mu_{\Sigma^-} &= -\tfrac{3}{20}M - \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right),\\
\mu_{\Sigma^0} = -\mu_{\Lambda} &= -\tfrac{1}{40}M + \tfrac{1}{10}\left(N + \tfrac{1}{2}N'\right).
\end{aligned} \tag{3.3}$$

Here one should note that the U-spin symmetry originates from the SU(3) group theoretical fact that the matrix elements of the magnetic moment operators in (3.1) in the adjoint representation, such as $\langle 8|D^{8}_{38} + (1/\sqrt{3})D^{8}_{88}|8\rangle$, have degenerate values for the U-spin multiplets $(p, \Sigma^+)$, $(n, \Xi^0)$ and $(\Xi^-, \Sigma^-)$ with the same electric charges. In (3.3), one can easily see that $\mu_{\Sigma}(I_3) = \mu_{\Sigma^0} + I_3\,\Delta\mu_{\Sigma}$ where $\Delta\mu_{\Sigma} = \tfrac{1}{8}M + \tfrac{1}{6}(N + \tfrac{1}{2}N')$, so that the summation $\mu_{\Sigma^+} + \mu_{\Sigma^-}$ is independent of the third component of the isospin $I_3$ and one obtains the other Coleman–Glashow sum rule [177–180]

$$\mu_{\Sigma^0} = \tfrac{1}{2}\left(\mu_{\Sigma^+} + \mu_{\Sigma^-}\right). \tag{3.4}$$
Since there is no SU(3) singlet contribution to the magnetic moment, the summation of the magnetic moments over the octet baryons vanishes to yield the identity [179]:

$$\sum_{B\in\text{octet}} \mu_{B} = 0. \tag{3.5}$$
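The sum rules (3.3)–(3.5) can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original derivation: the coefficient table is a transcription of (3.3), the dictionary and function names are illustrative, and the numerical inputs are the R = 0 row of Table 1.

```python
from fractions import Fraction as F

# Coefficients (c_M, c_K) multiplying M and K = N + N'/2, transcribed from (3.3).
# The baryon labels and this layout are illustrative choices, not the authors' notation.
SYMMETRIC_COEFFS = {
    "p":      (F(1, 10),  F(4, 15)),
    "Sigma+": (F(1, 10),  F(4, 15)),
    "n":      (F(1, 20), -F(1, 5)),
    "Xi0":    (F(1, 20), -F(1, 5)),
    "Sigma-": (-F(3, 20), -F(1, 15)),
    "Xi-":    (-F(3, 20), -F(1, 15)),
    "Sigma0": (-F(1, 40),  F(1, 10)),
    "Lambda": (F(1, 40), -F(1, 10)),
}

def moment(name, M, N, N_prime):
    """Numerical magnetic moment in the SU(3) flavor symmetric limit."""
    c_m, c_k = SYMMETRIC_COEFFS[name]
    return float(c_m) * M + float(c_k) * (N + 0.5 * N_prime)

def combination(weights):
    """Exact (c_M, c_K) of a weighted sum of octet moments."""
    return tuple(
        sum(w * SYMMETRIC_COEFFS[b][i] for b, w in weights.items())
        for i in range(2)
    )

# (3.5): the sum over the octet vanishes identically.
OCTET_SUM = combination({b: 1 for b in SYMMETRIC_COEFFS})
# (3.4): 2*mu(Sigma0) - mu(Sigma+) - mu(Sigma-) vanishes identically.
SIGMA_RULE = combination({"Sigma0": 2, "Sigma+": -1, "Sigma-": -1})

if __name__ == "__main__":
    M, N, N_prime = 0.671, 5.028, 0.908  # Table 1, R = 0
    for name in SYMMETRIC_COEFFS:
        print(f"{name:7s} {moment(name, M, N, N_prime):+.3f}")
    print("octet sum coefficients:", OCTET_SUM)
    print("Sigma rule coefficients:", SIGMA_RULE)
```

Because the check is done on the coefficients rather than on numbers, it holds for any values of the inertia parameters, which is what "model-independent" means here.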
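Introducing in the meson pieces of the CBM Lagrangian (2.1) the minimal photon coupling to the derivative terms, $\partial_{\mu}U \rightarrow \nabla_{\mu}U = \partial_{\mu}U + ieA_{\mu}[\hat{Q}_{EM}, U]$ with the SU(3) EM charge operator $\hat{Q}_{EM}$, one obtains the $\Sigma^0\Lambda$ transition matrix element for the decay $\Sigma^0 \rightarrow \Lambda + \gamma$:

$$\tfrac{1}{\sqrt{3}}\mu_{\Sigma^0\Lambda} = -\tfrac{1}{40}M + \tfrac{1}{10}\left(N + \tfrac{1}{2}N'\right), \tag{3.6}$$

which, in incorporating an SU(3) singlet contribution of the photon, satisfies the modified Coleman–Glashow sum rules [180,181]:

$$\begin{aligned}
\tfrac{1}{\sqrt{3}}\mu_{\Sigma^0\Lambda} &= \mu_{\Lambda} - \mu_{n},\\
\tfrac{6}{\sqrt{3}}\mu_{\Sigma^0\Lambda} &= \mu_{\Sigma^0} - 2\mu_{\Xi^0} + 3\mu_{\Lambda} - 2\mu_{n}.
\end{aligned} \tag{3.7}$$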
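It is also interesting to note that the hyperon and transition magnetic moments in the SU(3) flavor symmetric limit can be expressed in terms of the nucleon magnetic moments only [177,178,182]:

$$\begin{aligned}
\mu_{\Lambda} &= \tfrac{1}{2}\mu_{n},\\
\mu_{\Xi^-} &= -(\mu_{p} + \mu_{n}),\\
\mu_{\Sigma^+} - \mu_{\Sigma^-} + \mu_{\Xi^0} - \mu_{\Xi^-} &= 3(\mu_{p} + \mu_{n}),\\
\tfrac{1}{\sqrt{3}}\mu_{\Sigma^0\Lambda} &= -\tfrac{1}{2}\mu_{n}.
\end{aligned} \tag{3.8}$$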
Here one should note that the transition magnetic moment possesses an arbitrary global phase factor in itself, while the other octet magnetic moments have a definite overall sign. In (3.8), we have used the phase convention of Ref. [183], which is consistent with the de Swart convention [184] of the SU(3) isoscalar factors used in the CBM.

3.2. Strangeness in Yabu–Ando scheme

In the previous section, we have considered the CBM in the adjoint representation with the SU(3) flavor symmetry, where the U-spin symmetry is conserved even though we have the chiral symmetry breaking mass terms. Now we include the SU(3) flavor symmetry breaking terms $\mathcal{L}_{FSB}$ in (2.2) to yield the magnetic moment operators $\hat{\mu}^{i(a)}_{FSB}$ of (3.9) induced by the symmetry breaking kinetic terms. However, the symmetry is also broken nonperturbatively by
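the mass terms via the higher-dimensional IR channels where the CBM can be treated in the Yabu–Ando scheme [82] to yield the multiquark structure with the meson cloud inside the bag. The quantum mechanical perturbative scheme for the symmetry breaking effects in the multiquark structure will be discussed in terms of the V-spin symmetry in the next section.

Assuming that the CBM includes the kinetic term in $\mathcal{L}_{FSB}$ in the collective quantization, the Noether scheme gives rise to the U-spin symmetry breaking conserved EM currents $J^{\mu}_{EM,FSB}$ so that $J^{\mu}_{EM} = J^{\mu}_{EM,CS} + J^{\mu}_{EM,FSB}$. With the spinning CBM ansatz, the EM currents yield the magnetic moment operators $\hat{\mu}^{i} = \hat{\mu}^{i(3)} + \frac{1}{\sqrt{3}}\hat{\mu}^{i(8)}$ where $\hat{\mu}^{i(a)} = \hat{\mu}^{i(a)}_{CS} + \hat{\mu}^{i(a)}_{FSB}$. Here $\hat{\mu}^{i(a)}_{CS}$ is given in (3.1) and $\hat{\mu}^{i(a)}_{FSB}$ is described as below:

$$\hat{\mu}^{i(a)}_{FSB} = -P D^{8}_{ai}\left(1 - D^{8}_{88}\right) + \tfrac{\sqrt{3}}{2} Q\, d_{ipq} D^{8}_{ap} D^{8}_{8q}, \tag{3.9}$$

where $P$ and $Q$ are the inertia parameters along the isospin and strangeness directions obtained from the mesonic Lagrangian $\mathcal{L}_{FSB}$:

$$\begin{aligned}
P &= \frac{8\pi}{9e^{3}f_{\pi}^{3}}\left(f_{K}^{2} - f_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}\sin^{2}\theta\,\cos\theta,\\
Q &= \frac{8\pi}{9e^{3}f_{\pi}^{3}}\left(f_{K}^{2} - f_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}\sin^{2}\theta,\\
mI_{2} &= \frac{8\pi}{3e^{3}f_{\pi}^{3}}\left(f_{K}^{2}m_{K}^{2} - f_{\pi}^{2}m_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}(1 - \cos\theta)\\
&\quad + \frac{4\pi}{3ef_{\pi}}\left(f_{K}^{2} - f_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}\left[\left(\frac{d\theta}{dz}\right)^{2} + \frac{2\sin^{2}\theta}{z^{2}}\right]\cos\theta + \tfrac{1}{3}m_{s}N_{c}\sum_{n}{}_{h}\langle n|\gamma^{0}|n\rangle_{h},
\end{aligned} \tag{3.10}$$

whose numerical values are shown in Table 1. Breaking up the tensor product of the Wigner D functions into a sum of the single D functions [184],

$$D^{8}_{a_1 b_1} D^{8}_{a_2 b_2} = \sum_{a,b,\lambda,\gamma}\begin{pmatrix} 8 & 8 & \lambda_{\gamma}\\ a_1 & a_2 & a\end{pmatrix}\begin{pmatrix} 8 & 8 & \lambda_{\gamma}\\ b_1 & b_2 & b\end{pmatrix} D^{\lambda}_{ab}, \tag{3.11}$$

one can rewrite the isovector and isoscalar parts of the operator $\hat{\mu}^{i(a)}_{FSB}$ as

$$\begin{aligned}
\hat{\mu}^{i(3)}_{FSB} &= P\left(-\tfrac{4}{5}D^{8}_{3i} + \tfrac{1}{4}\left(D^{10}_{3i} + D^{\overline{10}}_{3i}\right) + \tfrac{3}{10}D^{27}_{3i}\right) + Q\left(\tfrac{3}{10}D^{8}_{3i} - \tfrac{3}{10}D^{27}_{3i}\right),\\
\hat{\mu}^{i(8)}_{FSB} &= P\left(-\tfrac{6}{5}D^{8}_{8i} + \tfrac{9}{20}D^{27}_{8i}\right) + Q\left(-\tfrac{3}{10}D^{8}_{8i} - \tfrac{9}{20}D^{27}_{8i}\right).
\end{aligned} \tag{3.12}$$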
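Here the $10$ and $\overline{10}$ IRs, which are absent in the isoscalar channel due to their nonvanishing hypercharge, come out together to conserve the Hermitian property of the operator in the isovector channel, while the singlet operator constructed in the singlet IR $1$ cannot allow the quantum number $(Y_R; J, -J_3) = (0; 1, 0)$ [179] so that the operator does not occur in either channel. Using the octet baryon wave function (2.33) for the matrix elements of the full magnetic moment operator $\hat{\mu}^{i}$, one can obtain the hyperfine structure in the adjoint representation:

$$\begin{aligned}
\mu_{p} &= \tfrac{1}{10}M + \tfrac{4}{15}\left(N + \tfrac{1}{2}N'\right) + \tfrac{8}{45}P - \tfrac{2}{45}Q,\\
\mu_{n} &= \tfrac{1}{20}M - \tfrac{1}{5}\left(N + \tfrac{1}{2}N'\right) - \tfrac{1}{9}P + \tfrac{7}{90}Q,\\
\mu_{\Lambda} &= \tfrac{1}{40}M - \tfrac{1}{10}\left(N + \tfrac{1}{2}N'\right) - \tfrac{1}{10}P - \tfrac{1}{20}Q,\\
\mu_{\Xi^0} &= \tfrac{1}{20}M - \tfrac{1}{5}\left(N + \tfrac{1}{2}N'\right) - \tfrac{11}{45}P - \tfrac{1}{45}Q,\\
\mu_{\Xi^-} &= -\tfrac{3}{20}M - \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{4}{45}P - \tfrac{2}{45}Q,\\
\mu_{\Sigma^+} &= \tfrac{1}{10}M + \tfrac{4}{15}\left(N + \tfrac{1}{2}N'\right) + \tfrac{13}{45}P - \tfrac{1}{45}Q,\\
\mu_{\Sigma^0} &= -\tfrac{1}{40}M + \tfrac{1}{10}\left(N + \tfrac{1}{2}N'\right) + \tfrac{11}{90}P + \tfrac{1}{36}Q,\\
\mu_{\Sigma^-} &= -\tfrac{3}{20}M - \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{2}{45}P + \tfrac{7}{90}Q.
\end{aligned} \tag{3.13}$$

Here one notes that the Coleman–Glashow sum rules (3.4) and (3.5) are still valid while the other relations (3.3) and (3.8) are no longer retained due to the SU(3) flavor symmetry breaking effects of $m_u = m_d \neq m_s$, $m_\pi \neq m_K$ and $f_\pi \neq f_K$ through the inertia parameters $P$ and $Q$.

By substituting the EM charge operator $\hat{Q}_{EM}$ with the q-flavor EM charge operator $\hat{Q}_{q}$, one can obtain the q-flavor currents $J^{\mu(q)}_{EM} = J^{\mu(q)}_{EM,CS} + J^{\mu(q)}_{EM,FSB}$ in the SU(3) flavor symmetry broken case to yield the EM currents with three flavor pieces $J^{\mu}_{EM} = J^{\mu(u)}_{EM} + J^{\mu(d)}_{EM} + J^{\mu(s)}_{EM}$. Here one notes that by defining the flavor projection operators

$$\begin{aligned}
\hat{P}_{u} &= \tfrac{1}{3} + \tfrac{1}{2}\lambda_{3} + \tfrac{1}{2\sqrt{3}}\lambda_{8},\\
\hat{P}_{d} &= \tfrac{1}{3} - \tfrac{1}{2}\lambda_{3} + \tfrac{1}{2\sqrt{3}}\lambda_{8},\\
\hat{P}_{s} &= \tfrac{1}{3} - \tfrac{1}{\sqrt{3}}\lambda_{8},
\end{aligned} \tag{3.14}$$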
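satisfying $\hat{P}_{q}^{2} = \hat{P}_{q}$ and $\sum_{q}\hat{P}_{q} = 1$, one can easily construct the q-flavor EM charge operators $\hat{Q}_{q} = \hat{Q}_{EM}\hat{P}_{q} = Q_{q}\hat{P}_{q}$. As in the previous section, one can then find the magnetic moment operator in the u-flavor channel,

$$\begin{aligned}
\hat{\mu}^{i(u)} ={}& \tfrac{2N_c}{9}M\left(1 + \tfrac{\sqrt{3}}{2}D^{8}_{38} + \tfrac{1}{2}D^{8}_{88}\right)\hat{J}_{i} - \tfrac{2}{3}N\left(D^{8}_{3i} + \tfrac{1}{\sqrt{3}}D^{8}_{8i}\right) - \tfrac{2}{3}N' d_{ipq}\left(D^{8}_{3p} + \tfrac{1}{\sqrt{3}}D^{8}_{8p}\right)\hat{T}^{R}_{q}\\
& - \tfrac{2}{3}P\left(D^{8}_{3i} + \tfrac{1}{\sqrt{3}}D^{8}_{8i}\right)\left(1 - D^{8}_{88}\right) + \tfrac{1}{\sqrt{3}}Q\, d_{ipq}\left(D^{8}_{3p} + \tfrac{1}{\sqrt{3}}D^{8}_{8p}\right)D^{8}_{8q},
\end{aligned} \tag{3.15}$$

to yield the u-components of the baryon octet magnetic moments in the adjoint representation:

$$\begin{aligned}
\mu^{(u)}_{p} &= \tfrac{2}{5}M + \tfrac{8}{45}\left(N + \tfrac{1}{2}N'\right) + \tfrac{16}{135}P - \tfrac{4}{135}Q,\\
\mu^{(u)}_{n} &= \tfrac{11}{30}M - \tfrac{2}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{2}{27}P + \tfrac{7}{135}Q,\\
\mu^{(u)}_{\Lambda} &= \tfrac{7}{20}M - \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{1}{15}P - \tfrac{1}{30}Q,\\
\mu^{(u)}_{\Xi^0} &= \tfrac{11}{30}M - \tfrac{2}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{22}{135}P - \tfrac{2}{135}Q,\\
\mu^{(u)}_{\Xi^-} &= \tfrac{7}{30}M - \tfrac{2}{45}\left(N + \tfrac{1}{2}N'\right) - \tfrac{8}{135}P - \tfrac{4}{135}Q,\\
\mu^{(u)}_{\Sigma^+} &= \tfrac{2}{5}M + \tfrac{8}{45}\left(N + \tfrac{1}{2}N'\right) + \tfrac{26}{135}P - \tfrac{2}{135}Q,\\
\mu^{(u)}_{\Sigma^0} &= \tfrac{19}{60}M + \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) + \tfrac{11}{135}P + \tfrac{1}{54}Q,\\
\mu^{(u)}_{\Sigma^-} &= \tfrac{7}{30}M - \tfrac{2}{45}\left(N + \tfrac{1}{2}N'\right) - \tfrac{4}{135}P + \tfrac{7}{135}Q.
\end{aligned} \tag{3.16}$$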
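Similarly, one can construct the s-flavor magnetic moment operator,

$$\hat{\mu}^{i(s)} = -\tfrac{N_c}{9}M\left(1 - D^{8}_{88}\right)\hat{J}_{i} - \tfrac{2}{3\sqrt{3}}N D^{8}_{8i} - \tfrac{2}{3\sqrt{3}}N' d_{ipq}D^{8}_{8p}\hat{T}^{R}_{q} - \tfrac{2}{3\sqrt{3}}P D^{8}_{8i}\left(1 - D^{8}_{88}\right) + \tfrac{1}{3}Q\, d_{ipq}D^{8}_{8p}D^{8}_{8q}, \tag{3.17}$$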
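to obtain the baryon octet magnetic moments in the s-flavor channel:

$$\begin{aligned}
\mu^{(s)}_{N} &= -\tfrac{7}{60}M + \tfrac{1}{45}\left(N + \tfrac{1}{2}N'\right) + \tfrac{1}{45}P + \tfrac{1}{90}Q,\\
\mu^{(s)}_{\Lambda} &= -\tfrac{3}{20}M - \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) - \tfrac{1}{15}P - \tfrac{1}{30}Q,\\
\mu^{(s)}_{\Xi} &= -\tfrac{1}{5}M - \tfrac{4}{45}\left(N + \tfrac{1}{2}N'\right) - \tfrac{1}{9}P - \tfrac{1}{45}Q,\\
\mu^{(s)}_{\Sigma} &= -\tfrac{11}{60}M + \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) + \tfrac{11}{135}P + \tfrac{1}{54}Q.
\end{aligned} \tag{3.18}$$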
Here one notes that all the baryon magnetic moments satisfy the model-independent relations in the u- and d-channels and the I-spin symmetry in the s-flavor channel, where the isomultiplets have the same strangeness number:

$$\mu^{(d)}_{B} = \frac{Q_d}{Q_u}\,\mu^{(u)}_{\bar{B}}, \tag{3.19}$$

$$\mu^{(s)}_{B} = \mu^{(s)}_{\bar{B}}. \tag{3.20}$$
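As a consistency check on the flavor decomposition, the relation $\mu_B = \mu^{(u)}_B + (Q_d/Q_u)\,\mu^{(u)}_{\bar{B}} + \mu^{(s)}_B$, which follows from combining (3.19) with the three-flavor splitting of the EM current, should reproduce the full moments of (3.13). The sketch below is illustrative only: the coefficient tables are a transcription of (3.13), (3.16) and (3.18) as read here, the $N$ coefficient of $\mu^{(u)}_{\Xi^-}$ is taken as $-2/45$ (an assumption fixed by this very consistency requirement), and all names are hypothetical.

```python
from fractions import Fraction as F

# Each tuple holds the coefficients of (M, K, P, Q) with K = N + N'/2.
TOTAL = {  # transcription of (3.13)
    "p":      (F(1, 10),  F(4, 15),  F(8, 45),  -F(2, 45)),
    "n":      (F(1, 20), -F(1, 5),  -F(1, 9),    F(7, 90)),
    "Lambda": (F(1, 40), -F(1, 10), -F(1, 10),  -F(1, 20)),
    "Xi0":    (F(1, 20), -F(1, 5),  -F(11, 45), -F(1, 45)),
    "Xi-":    (-F(3, 20), -F(1, 15), -F(4, 45), -F(2, 45)),
    "Sigma+": (F(1, 10),  F(4, 15),  F(13, 45), -F(1, 45)),
    "Sigma0": (-F(1, 40), F(1, 10),  F(11, 90),  F(1, 36)),
    "Sigma-": (-F(3, 20), -F(1, 15), -F(2, 45),  F(7, 90)),
}
U_PART = {  # transcription of (3.16); Xi- N coefficient assumed to be -2/45
    "p":      (F(2, 5),   F(8, 45),  F(16, 135), -F(4, 135)),
    "n":      (F(11, 30), -F(2, 15), -F(2, 27),   F(7, 135)),
    "Lambda": (F(7, 20),  -F(1, 15), -F(1, 15),  -F(1, 30)),
    "Xi0":    (F(11, 30), -F(2, 15), -F(22, 135), -F(2, 135)),
    "Xi-":    (F(7, 30),  -F(2, 45), -F(8, 135), -F(4, 135)),
    "Sigma+": (F(2, 5),   F(8, 45),  F(26, 135), -F(2, 135)),
    "Sigma0": (F(19, 60), F(1, 15),  F(11, 135),  F(1, 54)),
    "Sigma-": (F(7, 30),  -F(2, 45), -F(4, 135),  F(7, 135)),
}
S_PART = {  # transcription of (3.18), one entry per isomultiplet
    "N":      (-F(7, 60),  F(1, 45),  F(1, 45),   F(1, 90)),
    "Lambda": (-F(3, 20), -F(1, 15), -F(1, 15),  -F(1, 30)),
    "Xi":     (-F(1, 5),  -F(4, 45), -F(1, 9),   -F(1, 45)),
    "Sigma":  (-F(11, 60), F(1, 15),  F(11, 135), F(1, 54)),
}
ISOSPIN_CONJUGATE = {"p": "n", "n": "p", "Lambda": "Lambda", "Xi0": "Xi-",
                     "Xi-": "Xi0", "Sigma+": "Sigma-", "Sigma-": "Sigma+",
                     "Sigma0": "Sigma0"}
MULTIPLET = {"p": "N", "n": "N", "Lambda": "Lambda", "Xi0": "Xi", "Xi-": "Xi",
             "Sigma+": "Sigma", "Sigma0": "Sigma", "Sigma-": "Sigma"}
QD_OVER_QU = F(-1, 3) / F(2, 3)  # ratio of d- and u-quark charges, equal to -1/2

def recombine(baryon):
    """u + d + s pieces, with the d piece obtained from (3.19)."""
    u = U_PART[baryon]
    u_conj = U_PART[ISOSPIN_CONJUGATE[baryon]]
    s = S_PART[MULTIPLET[baryon]]
    return tuple(u[i] + QD_OVER_QU * u_conj[i] + s[i] for i in range(4))

MISMATCHES = [b for b in TOTAL if recombine(b) != TOTAL[b]]

if __name__ == "__main__":
    print("baryons failing the decomposition check:", MISMATCHES)
```

An empty mismatch list means the three flavor pieces add up to (3.13) coefficient by coefficient for every member of the octet.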
Here $\bar{B}$ is the isospin conjugate baryon in the isomultiplets of the baryon. For the $\Sigma^0\Lambda$ transition, one can obtain the u- and d-flavor components given by the different pattern

$$\tfrac{1}{\sqrt{3}}\mu^{(u)}_{\Sigma^0\Lambda} = \tfrac{2}{\sqrt{3}}\mu^{(d)}_{\Sigma^0\Lambda} = -\tfrac{1}{60}M + \tfrac{1}{15}\left(N + \tfrac{1}{2}N'\right) + \tfrac{8}{135}P - \tfrac{7}{270}Q, \tag{3.21}$$

and the vanishing s-flavor component.

Until now, we have considered the explicit SU(3) flavor symmetry breaking effects in the magnetic moment operators $\hat{\mu}^{i}$ of the CBM in the adjoint representation, where the mass terms in $\mathcal{L}_{CSB}$ and $\mathcal{L}_{FSB}$ cannot contribute to $\hat{\mu}^{i}$ due to the absence of the derivative term. Treating the mass terms as the representation-dependent fraction in the Hamiltonian approach, one can see that the term with $D^{8}_{88}$ induces the representation mixing effects in the baryon wave functions. In order to investigate explicitly the mixing effects in the Yabu–Ando scheme, we quantize the collective variables $A(t)$ so that we can obtain the Hamiltonian of the form:

$$H = M + \frac{1}{2}\left(\frac{1}{I_1} - \frac{1}{I_2}\right)\hat{J}^{2} + \frac{1}{2I_2}\left(h_{SB} - \frac{3}{4}Y_R^{2}\right), \tag{3.22}$$

where $I_1$ and $I_2$ are the moments of inertia of the CBM along the isospin and the strangeness directions, respectively, and their explicit expressions are given in (2.40). Here one remembers that the static mass $M$ obtainable from (2.20) satisfies the equation of motion for the chiral angle (2.21). The pion mass in (2.21) also yields deviation from the chiral
limit chiral angle for a fixed bag radius so that the numerical results in the massive CBM can be worsened when one uses the experimental decay constant. In order to obtain the numerical results in Table 1, we use the massless chiral angle and the experimental data $f_\pi = 93$ MeV, $f_K = 114$ MeV and $e = 4.75$ since $(m_u + m_d)/m_s \approx m_\pi^{2}/m_K^{2} \approx 0.1$, so that we can neglect the light quark and pion masses. This approximation would not be contradictory to our main purpose to investigate the massive kaon contributions to the baryon magnetic moments.

On the other hand, the chiral and SU(3) flavor symmetry breaking induces the representation-dependent part⁵

$$h_{SB} = \hat{C}_{2}^{2} + \tfrac{2}{3}\gamma\left(1 - D^{8}_{88}\right), \tag{3.23}$$

where $\hat{C}_{2}^{2}$ is the Casimir operator in the $SU(3)_L$ group and the symmetry breaking strength is given by

$$\begin{aligned}
\gamma ={}& \frac{8\pi}{e^{3}f_{\pi}^{3}}I_{2}\left(f_{K}^{2}m_{K}^{2} - f_{\pi}^{2}m_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}(1 - \cos\theta) + I_{2}m_{s}N_{c}\sum_{n}{}_{h}\langle n|\gamma^{0}|n\rangle_{h}\\
& + \frac{4\pi}{ef_{\pi}}I_{2}\left(f_{K}^{2} - f_{\pi}^{2}\right)\int_{ef_{\pi}R}^{\infty} dz\, z^{2}\left[\left(\frac{d\theta}{dz}\right)^{2} + \frac{2\sin^{2}\theta}{z^{2}}\right]\cos\theta
\end{aligned} \tag{3.24}$$
with the numerical values in Table 1. Of course, one can easily see that, in the vanishing $\gamma$ limit, the Hamiltonian (3.22) approaches the previous one $H_0$ in (2.41) with the SU(3) flavor symmetry. Now one can directly diagonalize the Hamiltonian $h_{SB}$ in the eigenvalue equation $h_{SB}|B\rangle = \varepsilon_{SB}|B\rangle$ of the Yabu–Ando scheme [82] with the eigenstate denoted by $|B\rangle = \sum_{\lambda}C^{B}_{\lambda}|B\rangle^{\lambda}$, where $C^{B}_{\lambda}$ is the representation mixing coefficient and $|B\rangle^{\lambda}$ is the octet baryon wave function in the $\lambda$-dimensional IR discussed in (2.32). The possible SU(3) representations of the minimal multiquark Fock space $qqq + qqqq\bar{q}$ are restricted by the Clebsch–Gordan series $8 \oplus \overline{10} \oplus 27$⁶ in the baryon octet with $Y_R = 1$ and $J = \tfrac{1}{2}$, so that the representation mixing coefficients can be evaluated by solving the eigenvalue equation of the 3 × 3 Hamiltonian matrix in (3.22). Since in the multiquark scheme of the CBM the baryon wave functions act nonperturbatively on the magnetic moment operators with the quark and meson phase contributions in their inertia parameters, one could have the meson cloud content $q\bar{q}$ inside the bag via the channel of the $qqqq\bar{q}$ multiquark Fock space. Here, in order to construct the pseudoscalar mesons inside the bag, the $q\bar{q}$ contents refer to all the appropriate flavor combinations.

⁵ To be consistent with the massless chiral angle approximation, we also neglect the u- and d-quark contributions, $\tfrac{2}{3}\gamma_{u,d}\left(1 \pm \tfrac{\sqrt{3}}{2}D^{8}_{38} + \tfrac{1}{2}D^{8}_{88}\right)$ with $\gamma_{u,d} = I_{2}m_{u,d}N_{c}\sum_{n}\langle n|\gamma^{0}|n\rangle$, which can break the I-spin symmetry through $D^{8}_{38}$.

⁶ Because of the baryon constraint $Y_R = 1$ originated from the WZW term, the spin-$\tfrac{1}{2}$ decuplet baryons ... to $10 \oplus 27 \oplus 35$. In the $qqqq\bar{q}$ multiquark structure the Clebsch–Gordan decomposition of the tensor product of the two IRs is given by $(3 \otimes 3 \otimes 3) \otimes (\bar{3} \otimes 3) = (1 \oplus 8^{2} \oplus 10) \otimes (1 \oplus 8) = 1^{3} \oplus 8^{8} \oplus 10^{4} \oplus \overline{10}^{2} \oplus 27^{3} \oplus 35$, where the superscript stands for the number of different IRs with the same dimension.
Table 2
The baryon octet magnetic moments in the U-spin symmetry broken case in the Yabu–Ando scheme of the CBM, compared with the SU(2) CBM and naive NRQM predictions and the experimental data

R (fm)   μ_p    μ_n     μ_Λ     μ_Ξ0    μ_Ξ−    μ_Σ+   μ_Σ0   μ_Σ−    μ_Σ0Λ
0.00     1.69   −1.28   −0.56   −1.22   −0.47   1.73   0.69   −0.36   1.19
0.10     1.71   −1.30   −0.56   −1.22   −0.46   1.73   0.69   −0.36   1.20
0.20     1.80   −1.39   −0.56   −1.27   −0.45   1.80   0.72   −0.36   1.28
0.30     1.89   −1.48   −0.58   −1.32   −0.45   1.86   0.75   −0.36   1.36
0.40     1.91   −1.50   −0.57   −1.33   −0.44   1.87   0.76   −0.35   1.38
0.50     1.96   −1.54   −0.57   −1.36   −0.42   1.89   0.77   −0.35   1.43
0.60     2.02   −1.59   −0.57   −1.38   −0.40   1.90   0.78   −0.34   1.48
0.70     2.07   −1.62   −0.55   −1.39   −0.37   1.89   0.78   −0.34   1.52
0.80     2.10   −1.62   −0.53   −1.37   −0.34   1.85   0.76   −0.34   1.51
0.90     2.10   −1.59   −0.49   −1.33   −0.31   1.79   0.73   −0.34   1.48
1.00     2.10   −1.55   −0.45   −1.26   −0.28   1.73   0.69   −0.35   1.26
SU(2)    2.27   −1.35   −0.61   −1.33   −0.60   2.28   0.82   −0.64   1.26
Naive    2.79   −1.86   −0.61   −1.43   −0.50   2.68   0.82   −1.04   1.61
Exp      2.79   −1.91   −0.61   −1.25   −0.65   2.46   −      −1.16   1.61
In the SU(3) flavor sector of the CBM, the mechanism explaining the meson cloud inside the bag surface seems [37] closely related to the pseudoscalar composite operators $\bar{\psi}i\gamma_{5}\lambda_{a}\psi \sim \pi_{a}$ $(a = 1, \ldots, 8)$ since the pseudoscalar quark bilinears transform like $(3, \bar{3}) \oplus (\bar{3}, 3)$, while in the U(1) flavor sector the mechanism is supposed [37] to be described with the anomalous gluon effect in the quark–antiquark annihilation channel [185]. In the SU(3) CBM with the minimal multiquark Fock space, the meson cloud content $q\bar{q}$ inside the bag surface can then be phenomenologically illustrated [37] by the sum of two topologically different Feynman diagrams. One notes here that, in the multiquark scheme of the SU(3) CBM, the baryon magnetic moments have a two-body operator effect as well as a one-body self-interaction in the sense of the quasi-particle model in the many-body problem. The gluons are supposed to mediate the pseudoscalar $\eta_{0}$ meson cloud via the $q\bar{q}$ pair creation and annihilation process.

As shown in Table 2, the U-spin symmetry breaking effect, through the explicit operator $\hat{\mu}^{i(a)}_{FSB}$ and the Yabu–Ando scheme in the multiquark structure, improves the fit to most of the baryon octet magnetic moments. However, if the experimental data [186] are correct, the fit to $\mu_{\Sigma^-}$ seems slightly worsened. Here one should note that $\mu_{\Lambda}$ seems to be well predicted in the CBM as in the naive NRQM since $\mu_{\Lambda}$ could be mainly determined from the strange quark and kaon whose masses are kept in our massless profile approximation. From the numerical values in Table 2, one can see that the SU(3) CBM could be regarded as a good candidate for the unification of the bag and Skyrmion models with predictions almost independent of the bag radius. For the $\Sigma^0 \rightarrow \Lambda + \gamma$ transition matrix element, we obtain the numerical prediction of the CBM $\mu^{CBM}_{\Sigma^0\Lambda} = 1.19$–$1.53$, comparable to the experimental data $\mu^{exp}_{\Sigma^0\Lambda} = 1.61$ [186].

In the q-flavor channels, the I-spin symmetry and model-independent relations (3.20) hold in the multiquark scheme since the Hamiltonian $h_{SB}$ has the eigenstates degenerate with the isomultiplets in our approximation, where the I-spin symmetry breaking light quark masses are neglected.
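The statement that the predictions are almost independent of the bag radius can be made concrete by tabulating, for each moment, the spread of the CBM values over R and, for each R, the root-mean-square deviation from experiment. The sketch below is only an illustration: the numbers are transcribed from Table 2 (the $\Sigma^0$ and $\Sigma^0\Lambda$ columns are omitted), and the two summary measures are choices made here, not quantities used by the authors.

```python
import math

R_VALUES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # fm

# CBM predictions transcribed from Table 2, one list per baryon, ordered as R_VALUES.
CBM = {
    "p":      [1.69, 1.71, 1.80, 1.89, 1.91, 1.96, 2.02, 2.07, 2.10, 2.10, 2.10],
    "n":      [-1.28, -1.30, -1.39, -1.48, -1.50, -1.54, -1.59, -1.62, -1.62, -1.59, -1.55],
    "Lambda": [-0.56, -0.56, -0.56, -0.58, -0.57, -0.57, -0.57, -0.55, -0.53, -0.49, -0.45],
    "Xi0":    [-1.22, -1.22, -1.27, -1.32, -1.33, -1.36, -1.38, -1.39, -1.37, -1.33, -1.26],
    "Xi-":    [-0.47, -0.46, -0.45, -0.45, -0.44, -0.42, -0.40, -0.37, -0.34, -0.31, -0.28],
    "Sigma+": [1.73, 1.73, 1.80, 1.86, 1.87, 1.89, 1.90, 1.89, 1.85, 1.79, 1.73],
    "Sigma-": [-0.36, -0.36, -0.36, -0.36, -0.35, -0.35, -0.34, -0.34, -0.34, -0.34, -0.35],
}
# Experimental values from the last row of Table 2.
EXP = {"p": 2.79, "n": -1.91, "Lambda": -0.61, "Xi0": -1.25,
       "Xi-": -0.65, "Sigma+": 2.46, "Sigma-": -1.16}

def spread(values):
    """Difference between the largest and smallest prediction over all radii."""
    return max(values) - min(values)

def rms_deviation(index):
    """Root-mean-square deviation from experiment at R_VALUES[index]."""
    squares = [(CBM[b][index] - EXP[b]) ** 2 for b in EXP]
    return math.sqrt(sum(squares) / len(squares))

if __name__ == "__main__":
    for baryon, values in CBM.items():
        print(f"{baryon:7s} spread over R: {spread(values):.2f}")
    for i, r in enumerate(R_VALUES):
        print(f"R = {r:.1f} fm  rms deviation: {rms_deviation(i):.3f}")
```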
4. Baryon decuplet magnetic moments

4.1. Model-independent sum rules

In the previous section, we have calculated the magnetic moments of the baryon octet in the SU(3) flavor case [37], where the Coleman–Glashow sum rules [174] including the U-spin symmetry hold up to the SU(3) flavor symmetric limit of the adjoint representation to suggest the possibility of a unification of the SU(3) CBM and the naive NRQM. The measurements of the magnetic moments of the decuplet baryons were reported for $\mu_{\Delta^{++}}$ [187] and $\mu_{\Omega^-}$ [188] to yield a new avenue for understanding hadronic structure. In this section, we will calculate the magnetic moments of the baryon decuplet [38] to compare with the known experimental data, to make new predictions in the CBM for the unknown experiments and to derive the model-independent sum rules which will be used later to generalize the CBM conjecture [37] for the baryon decuplet.

In order to estimate the magnetic moments of the decuplet baryons in the U-spin broken symmetry case, we have at first derived the explicit magnetic moment operators $\hat{\mu}^{i(a)}_{FSB}$ from the flavor symmetry breaking Lagrangian $\mathcal{L}_{FSB}$ in the adjoint representation where $\hat{\mu}^{i(a)}_{CSB}$ vanishes. In the SU(3) cranking scheme described in the previous sections, the magnetic moment operators $\hat{\mu}^{i}$ are then given by (3.1) and (3.9), and the tensor product of the Wigner D functions in $\hat{\mu}^{i(a)}_{FSB}$ can be decomposed into a sum of the single D functions to yield the isovector and isoscalar parts as below:

$$\begin{aligned}
\hat{\mu}^{i(3)}_{FSB} &= P\left(-\tfrac{4}{5}D^{8}_{3i} + \tfrac{3}{10}D^{27}_{3i}\right) + Q\left(\tfrac{3}{10}D^{8}_{3i} - \tfrac{3}{10}D^{27}_{3i}\right),\\
\hat{\mu}^{i(8)}_{FSB} &= P\left(-\tfrac{6}{5}D^{8}_{8i} + \tfrac{9}{20}D^{27}_{8i}\right) + Q\left(-\tfrac{3}{10}D^{8}_{8i} - \tfrac{9}{20}D^{27}_{8i}\right).
\end{aligned} \tag{4.1}$$

Here one notes that, to conserve the Hermitian property of the magnetic moment operator, the $10$ and $\overline{10}$ IRs appear together in the isovector channel of the baryon octet as discussed in the
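previous section while the $1$, $10$ and $\overline{10}$ IRs do not take place in the decuplet baryons.

With respect to the decuplet baryon wave function $\Phi^{\lambda}_{B}$ in (2.33), the magnetic moment operator $\hat{\mu}^{i}$ has the spectrum for the decuplet in the adjoint representation:

$$\begin{aligned}
\mu_{\Delta^{++}} &= \tfrac{1}{8}M + \tfrac{1}{2}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{3}{7}P - \tfrac{3}{56}Q,\\
\mu_{\Delta^{+}} &= \tfrac{1}{16}M + \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{5}{21}P + \tfrac{1}{84}Q,\\
\mu_{\Delta^{0}} &= \tfrac{1}{21}P + \tfrac{13}{168}Q,\\
\mu_{\Delta^{-}} &= -\tfrac{1}{16}M - \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{1}{7}P + \tfrac{1}{7}Q,\\
\mu_{\Sigma^{*+}} &= \tfrac{1}{16}M + \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{19}{84}P - \tfrac{17}{168}Q,\\
\mu_{\Sigma^{*0}} &= \tfrac{1}{84}P - \tfrac{1}{84}Q,\\
\mu_{\Sigma^{*-}} &= -\tfrac{1}{16}M - \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{17}{84}P + \tfrac{13}{168}Q,\\
\mu_{\Xi^{*0}} &= -\tfrac{1}{42}P - \tfrac{17}{168}Q,\\
\mu_{\Xi^{*-}} &= -\tfrac{1}{16}M - \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{11}{42}P + \tfrac{1}{84}Q,\\
\mu_{\Omega^{-}} &= -\tfrac{1}{16}M - \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{9}{28}P - \tfrac{3}{56}Q.
\end{aligned} \tag{4.2}$$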
In the SU(3) flavor symmetric limit with the chiral symmetry breaking masses $m_u = m_d = m_s$, $m_K = m_\pi$ and decay constants $f_K = f_\pi$, the magnetic moments of the decuplet baryons are simply given by [189]

$$\mu_{B} = Q_{EM}\left[\tfrac{1}{16}M + \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right)\right], \tag{4.3}$$

where $Q_{EM}$ is the EM charge. Here one remembers that for the case of the CBM in the adjoint representation, the prediction of the baryon magnetic moments with the chiral symmetry is the same as that with the SU(3) flavor symmetry since the mass-dependent terms in $\mathcal{L}_{CSB}$ and $\mathcal{L}_{FSB}$ do not yield any contribution to $J^{\mu}_{FSB}$, so that there are no terms with $P$ and $Q$ in (4.2). Due to the degenerate d- and s-flavor charges in the SU(3) EM charge operator $\hat{Q}_{EM}$, the CBM possesses the generalized U-spin symmetry relations in the baryon decuplet magnetic moments, similar to those in the octet baryons (3.3),

$$\begin{aligned}
\mu_{\Delta^{-}} &= \mu_{\Sigma^{*-}} = \mu_{\Xi^{*-}} = \mu_{\Omega^{-}},\\
\mu_{\Delta^{0}} &= \mu_{\Sigma^{*0}} = \mu_{\Xi^{*0}},\\
\mu_{\Delta^{+}} &= \mu_{\Sigma^{*+}},
\end{aligned} \tag{4.4}$$
which will be shown to be shared with the naive NRQM, to support the effective NRQM conjecture of the CBM.

Since the SU(3) FSB quark masses do not affect the magnetic moments of the baryon decuplet in the adjoint representation of the CBM, in the more general SU(3) flavor symmetry broken case with $m_u = m_d \neq m_s$, $m_\pi \neq m_K$ and $f_\pi \neq f_K$, the decuplet baryon magnetic moments with $P$ and $Q$ satisfy the other sum rules [38]:

$$\mu_{\Sigma^{*0}} = \tfrac{1}{2}\left(\mu_{\Sigma^{*+}} + \mu_{\Sigma^{*-}}\right), \tag{4.5}$$

$$\mu_{\Delta^{-}} + \mu_{\Delta^{++}} = \mu_{\Delta^{0}} + \mu_{\Delta^{+}}, \tag{4.6}$$

$$\sum_{B\in\text{decuplet}} \mu_{B} = 0. \tag{4.7}$$
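The decuplet relations (4.4)–(4.7) can be verified directly on the coefficients of (4.2). The following sketch uses exact fractions; the table is a transcription of (4.2) as read here, with each moment stored as coefficients of $(M, K', P, Q)$ where $K' = N - N'/(2\sqrt{3})$, and all names are illustrative rather than taken from the source.

```python
from fractions import Fraction as F

# Coefficients of (M, K', P, Q) with K' = N - N'/(2*sqrt(3)), transcribed from (4.2).
DECUPLET = {
    "Delta++": (F(1, 8),   F(1, 2),  F(3, 7),   -F(3, 56)),
    "Delta+":  (F(1, 16),  F(1, 4),  F(5, 21),   F(1, 84)),
    "Delta0":  (F(0),      F(0),     F(1, 21),   F(13, 168)),
    "Delta-":  (-F(1, 16), -F(1, 4), -F(1, 7),   F(1, 7)),
    "Sigma*+": (F(1, 16),  F(1, 4),  F(19, 84), -F(17, 168)),
    "Sigma*0": (F(0),      F(0),     F(1, 84),  -F(1, 84)),
    "Sigma*-": (-F(1, 16), -F(1, 4), -F(17, 84), F(13, 168)),
    "Xi*0":    (F(0),      F(0),    -F(1, 42),  -F(17, 168)),
    "Xi*-":    (-F(1, 16), -F(1, 4), -F(11, 42), F(1, 84)),
    "Omega-":  (-F(1, 16), -F(1, 4), -F(9, 28), -F(3, 56)),
}

def combination(weights):
    """Exact coefficients of a weighted sum of decuplet moments."""
    return tuple(
        sum(w * DECUPLET[b][i] for b, w in weights.items()) for i in range(4)
    )

ZERO = (0, 0, 0, 0)
RULE_4_5 = combination({"Sigma*0": 2, "Sigma*+": -1, "Sigma*-": -1})
RULE_4_6 = combination({"Delta-": 1, "Delta++": 1, "Delta0": -1, "Delta+": -1})
RULE_4_7 = combination({b: 1 for b in DECUPLET})

def symmetric_part(baryon):
    """The (M, K') coefficients, i.e. the moment with P = Q = 0 as in (4.3)."""
    return DECUPLET[baryon][:2]

# (4.4): equal-charge decuplet members coincide once P and Q are switched off.
U_SPIN_HOLDS = (
    symmetric_part("Delta-") == symmetric_part("Sigma*-")
    == symmetric_part("Xi*-") == symmetric_part("Omega-")
    and symmetric_part("Delta0") == symmetric_part("Sigma*0") == symmetric_part("Xi*0")
    and symmetric_part("Delta+") == symmetric_part("Sigma*+")
)

if __name__ == "__main__":
    print("(4.5):", RULE_4_5 == ZERO)
    print("(4.6):", RULE_4_6 == ZERO)
    print("(4.7):", RULE_4_7 == ZERO)
    print("(4.4) in the symmetric limit:", U_SPIN_HOLDS)
```

The three sum rules hold with $P$ and $Q$ included, while the U-spin relations hold only for the $(M, K')$ part, which mirrors the distinction drawn in the text.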
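Here one notes that the $\Sigma^{*}$ hyperons satisfy the identity $\mu_{\Sigma^{*}}(I_3) = \mu_{\Sigma^{*0}} + I_3\,\Delta\mu_{\Sigma^{*}}$, where $\Delta\mu_{\Sigma^{*}} = \tfrac{1}{16}M + \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{3}{14}P - \tfrac{5}{56}Q$, such that $\mu_{\Sigma^{*+}} + \mu_{\Sigma^{*-}}$ is independent of $I_3$ as in (4.5). For the $\Delta$ baryons, one can formulate the relation $\mu_{\Delta}(I_3) = \mu^{0}_{\Delta} + I_3\,\Delta\mu_{\Delta}$ with $\mu^{0}_{\Delta} = \tfrac{1}{32}M + \tfrac{1}{8}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{1}{7}P + \tfrac{5}{112}Q$ and $\Delta\mu_{\Delta} = \tfrac{1}{16}M + \tfrac{1}{4}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{4}{21}P - \tfrac{11}{168}Q$, so that the $\Delta$ baryons can be easily seen to fulfill the sum rule (4.6). Also the summation of the magnetic moments over all the decuplet baryons vanishes to yield the model-independent relation, namely the third sum rule in (4.7), since there is no SU(3) singlet contribution to the magnetic moments as in the baryon octet magnetic moments.

In the SU(3) flavor symmetry broken case, by using the projection operators in (3.14) we can decompose the EM currents into three flavor pieces to obtain the baryon decuplet magnetic moments in the u-flavor channels of the adjoint representation:

$$\begin{aligned}
\mu^{(u)}_{\Delta^{++}} &= \tfrac{5}{12}M + \tfrac{1}{3}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{2}{7}P - \tfrac{1}{28}Q,\\
\mu^{(u)}_{\Delta^{+}} &= \tfrac{3}{8}M + \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{10}{63}P + \tfrac{1}{126}Q,\\
\mu^{(u)}_{\Delta^{0}} &= \tfrac{1}{3}M + \tfrac{2}{63}P + \tfrac{13}{252}Q,\\
\mu^{(u)}_{\Delta^{-}} &= \tfrac{7}{24}M - \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{2}{21}P + \tfrac{2}{21}Q,\\
\mu^{(u)}_{\Sigma^{*+}} &= \tfrac{3}{8}M + \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{19}{126}P - \tfrac{17}{252}Q,\\
\mu^{(u)}_{\Sigma^{*0}} &= \tfrac{1}{3}M + \tfrac{1}{126}P - \tfrac{1}{126}Q,\\
\mu^{(u)}_{\Sigma^{*-}} &= \tfrac{7}{24}M - \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{17}{126}P + \tfrac{13}{252}Q,\\
\mu^{(u)}_{\Xi^{*0}} &= \tfrac{1}{3}M - \tfrac{1}{63}P - \tfrac{17}{252}Q,\\
\mu^{(u)}_{\Xi^{*-}} &= \tfrac{7}{24}M - \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{11}{63}P + \tfrac{1}{126}Q,\\
\mu^{(u)}_{\Omega^{-}} &= \tfrac{7}{24}M - \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{3}{14}P - \tfrac{1}{28}Q.
\end{aligned} \tag{4.8}$$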
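Similarly, the baryon decuplet magnetic moments in the s-flavor channels are given as follows:

$$\begin{aligned}
\mu^{(s)}_{\Delta} &= -\tfrac{7}{48}M + \tfrac{1}{12}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) + \tfrac{2}{21}P + \tfrac{5}{168}Q,\\
\mu^{(s)}_{\Sigma^{*}} &= -\tfrac{1}{6}M + \tfrac{1}{126}P - \tfrac{1}{126}Q,\\
\mu^{(s)}_{\Xi^{*}} &= -\tfrac{3}{16}M - \tfrac{1}{12}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{2}{21}P - \tfrac{5}{168}Q,\\
\mu^{(s)}_{\Omega} &= -\tfrac{5}{24}M - \tfrac{1}{6}\left(N - \tfrac{1}{2\sqrt{3}}N'\right) - \tfrac{3}{14}P - \tfrac{1}{28}Q.
\end{aligned} \tag{4.9}$$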
In general, all the baryon magnetic moments in the CBM also satisfy the model-independent relations in the u- and d-flavor components and the I-spin symmetry in the s-flavor channel of (3.20), as shown in (4.8) and (4.9). Moreover, one notes that the relations (3.20) are satisfied even in the multiquark ... decay constants $f_\pi \neq f_K$ do not affect the relations (3.20) in the u- and d-flavor channel without any strangeness and in the s-flavor channel with the same strangeness.

4.2. Multiquark structure

Until now, we have considered the CBM in the adjoint representation where the U-spin symmetry is broken only through the magnetic moment operators $\hat{\mu}^{i(a)}_{FSB}$ induced by the symmetry breaking derivative term. To take into account the missing chiral symmetry breaking mass effect from $\mathcal{L}_{CSB}$ and $\mathcal{L}_{FSB}$, in this section we will treat nonperturbatively the symmetry breaking mass terms via the higher-dimensional IR channels where the CBM can be handled in the Yabu–Ando scheme [82] with the higher IR mixing in the baryon wave function to yield the minimal multiquark structure with meson cloud inside the bag.

The possible SU(3) representations of the minimal multiquark Fock space are restricted by the Clebsch–Gordan series $10 \oplus 27 \oplus 35$ for the baryon decuplet with $Y_R = 1$ and $J = \tfrac{3}{2}$ through the decomposition of the tensor product of the two IRs in the $qqqq\bar{q}$, so that the representation mixing coefficients in the eigenstate $|B\rangle = \sum_{\lambda}C^{B}_{\lambda}|B\rangle^{\lambda}$ can be determined by diagonalizing the 3 × 3 Hamiltonian matrix $h_{SB}$ given by (3.23). Here one should note that in the Yabu–Ando approach the meson cloud, or $q\bar{q}$ content with all the possible flavor combinations to construct the pseudoscalar mesons inside the bag through the channel of the $qqqq\bar{q}$ multiquark Fock space, contributes to the baryon decuplet magnetic moments since the baryon wave functions in the multiquark scheme of the CBM act nonperturbatively on the magnetic moment operators with both the quark and meson phase pieces in their inertia parameters.
The U-spin symmetry breaking effect shown in Fig. 2 through the explicit operator $\hat{\mu}^{i(a)}_{FSB}$ and the multiquark structure yields meson cloud contributions to the baryon decuplet magnetic moments, comparable to those in the naive NRQM. The vertical lines show that even though nature does not preserve the perfect Cheshire catness [70,40,42] at least in the SU(3) CBM, the model could be considered to be a good candidate which unifies the MIT bag and Skyrmion
Fig. 2. Baryon decuplet magnetic moments. The effective NRQM results with bag radius 0.0 fm ≤ R ≤ 1.0 fm in the U-spin symmetric (thin vertical lines) and symmetry broken (thick vertical lines) cases are compared with the naive NRQM (thick lines) and the experimental data (thin vertical lines with a cross).

Table 3
The baryon decuplet magnetic moments of the CBM in the U-spin symmetry broken case [38] compared with the naive NRQM and the experimental data (a)

R (fm)   μ_Δ++   μ_Δ+   μ_Δ0    μ_Δ−    μ_Σ*+   μ_Σ*0   μ_Σ*−   μ_Ξ*0   μ_Ξ*−   μ_Ω−
0.00     2.81    1.22   −0.38   −1.97   1.64    −0.17   −1.88   0.14    −1.72   −1.45
0.10     2.87    1.23   −0.40   −2.03   1.70    −0.18   −1.94   0.17    −1.77   −1.47
0.20     3.05    1.30   −0.45   −2.20   1.87    −0.20   −2.10   0.22    −1.90   −1.54
0.30     3.24    1.38   −0.49   −2.36   2.04    −0.21   −2.26   0.28    −2.04   −1.61
0.40     3.30    1.40   −0.50   −2.40   2.11    −0.21   −2.31   0.30    −2.09   −1.62
0.50     3.43    1.46   −0.52   −2.49   2.23    −0.21   −2.40   0.34    −2.17   −1.66
0.60     3.58    1.52   −0.53   −2.59   2.39    −0.21   −2.50   0.40    −2.27   −1.70
0.70     3.71    1.59   −0.54   −2.67   2.55    −0.19   −2.58   0.46    −2.34   −1.72
0.80     3.79    1.63   −0.53   −2.69   2.67    −0.17   −2.60   0.51    −2.36   −1.70
0.90     3.81    1.65   −0.52   −2.68   2.74    −0.14   −2.58   0.56    −2.33   −1.65
1.00     3.78    1.65   −0.49   −2.63   2.78    −0.11   −2.52   0.60    −2.26   −1.57
Naive    5.58    2.79   0.00    −2.79   3.11    0.32    −2.47   0.64    −2.15   −1.83

(a) For the experimental data $\mu^{exp}_{\Delta^{++}} = 4.52 \pm 0.50$ and $\mu^{exp}_{\Omega^-} = -1.94 \pm 0.17 \pm 0.14$, we have referred to Refs. [187] and [188], respectively.
models with predictions almost independent of the bag radius. One can also easily see in Fig. 2 that the full symmetry breaking effects induce the magnetic moments of the baryon decuplet to pull the U-spin symmetric predictions back to the experimental data. In Table 3, the SU(3) CBM predictions in the SU(3) symmetry breaking case in the multiquark structure are
Table 4
The strange flavor baryon decuplet magnetic moments in the naive NRQM and the CBM [38]

R (fm)   μ(s)_Δ   μ(s)_Σ*   μ(s)_Ξ*   μ(s)_Ω
0.00     0.31     −0.21     −0.64     −1.08
0.10     0.31     −0.22     −0.65     −1.09
0.20     0.33     −0.22     −0.67     −1.14
0.30     0.34     −0.22     −0.70     −1.18
0.40     0.35     −0.22     −0.70     −1.19
0.50     0.37     −0.21     −0.72     −1.22
0.60     0.39     −0.21     −0.73     −1.25
0.70     0.41     −0.20     −0.74     −1.26
0.80     0.44     −0.19     −0.74     −1.26
0.90     0.46     −0.18     −0.74     −1.25
1.00     0.48     −0.18     −0.73     −1.22
Naive    0.00     −0.61     −1.22     −1.83
explicitly listed to be compared with the naive NRQM and the experimental data. For the known experimental data we obtain $\mu^{CBM}_{\Delta^{++}} = (1.01$–$1.37)\,\mu_p$ to be compared with the experimental value $\mu^{exp}_{\Delta^{++}} = (1.62 \pm 0.18)\,\mu_p$ [187] and the naive NRQM prediction $\mu^{naive}_{\Delta^{++}} = 2\mu_p$. Since $\mu_{\Omega^-}$ could be dominantly achieved from the strange quark and kaon whose masses are kept in our massless chiral angle approximation, the prediction $\mu_{\Omega^-} = -(1.45$–$1.72)$ n.m. in the CBM seems to be fairly consistent with the experimental data $\mu^{exp}_{\Omega^-} = -(1.94 \pm 0.17 \pm 0.14)$ n.m. [188] and the naive NRQM prediction $\mu^{naive}_{\Omega^-} = -1.83$ n.m.

Since the Hamiltonian $h_{SB}$ has eigenstates degenerate with the isomultiplets in our approximation, where the I-spin symmetry breaking light quark masses are neglected so that the relations (3.20) are derived in the same strangeness sector, the multiquark structure in the q-flavor channels conserves the I-spin symmetry and model-independent relations (3.20). The s-flavor magnetic moments $\mu^{(s)}_{B}$ in Table 4 reveal a stronger Cheshire catness than $\mu_{B}$ and a fairly good consistency with the naive NRQM.

In Fig. 2 and Table 3, the meson cloud contributions to the magnetic moments in the SU(3) effective NRQM are obtained with respect to the naive NRQM and experimental values. With the help of the naive NRQM data, one can also easily see the meson cloud contributions, which originate from the $q\bar{q}$ content and strange quarks inside the bag, as well as the massive kaons outside the bag.

5. SAMPLE experiment and baryon strange form factors

5.1. SAMPLE experiment and proton strange form factor

In this section, we consider the SAMPLE experiment and the corresponding theoretical paradigms in the chiral models to connect the chiral model predictions with the recent experimental data for the proton strange form factor. As discussed in Section 1, there have been
lots of theoretical predictions with varied values for the SAMPLE experimental results associated with the proton strange form factor through parity-violating electron scattering. Especially, the positive value of the proton strange form factor predicted in the framework of the CBM is quite comparable to the recent SAMPLE experimental data. The SAMPLE experiment was performed at the MIT=Bates Linear Accelerator Center using a 200 MeV polarized electron beam incident on a liquid hydrogen target. The scattered electrons were detected in a large solid angle (∼ 1:5 sr) air Cherenkov detector at backward angles ◦ ◦ 130 ¡ ; ¡ 170 . The parity-violating asymmetry A was determined from the asymmetries in ratios of integrated detector signal to beam intensity for left- and right-handed beam pulses. (For details of the SAMPLE experiment, see Refs. [190,191].) On the other hand, there have been considerable discussions concerning the strangeness in hadron physics. Beginning with Kaplan and Nelson’s work [51] on the charged kaon condensation, the theory of condensation in dense matter has become one of the central issues in nuclear physics and astrophysics together with the supernova collapse. The K − condensation at a few times nuclear matter density was later interpreted [192] in terms of cleaning of qq T condensates from the quantum chromodynamics (QCD) vacuum by a dense nuclear matter and also was further theoretically investigated [52] in chiral phase transition. Now, the internal structure of the nucleon is still a subject of great interest to experimentalists as well as theorists. In 1933, Frisch and Stern [193] performed the >rst measurement of the magnetic moment of the proton and obtained the earliest experimental evidence for the internal structure of the nucleon. However, it was not until 40 years later that the quark structure of the nucleon was directly observed in deep inelastic electron scattering experiments. 
The development of QCD followed soon thereafter, and is now the accepted theory of the strong interactions governing the behavior of quarks and gluons associated with hadronic structure. Nevertheless, we still lack a quantitative theoretical understanding of these properties (including the magnetic moments), and additional experimental information is crucial in our effort to understand the internal structure of the nucleons. For example, a satisfactory quantitative understanding of the magnetic moment of the proton has still not been achieved, now more than 60 years after the first measurement was performed. Quite recently, the SAMPLE experiment [3,4] reported the proton’s neutral weak magnetic form factor, which has been suggested by the neutral weak magnetic moment measurement through parity-violating electron scattering [5,6]. Moreover, McKeown [194] has shown that the strange form factor of the proton should be positive by using the conjecture that the up-quark effects are generally dominant in the flavor dependence of the nucleon properties. In fact, at a small momentum transfer $Q^2 = 0.1\,(\mathrm{GeV}/c)^2$, the SAMPLE Collaboration obtained a positive experimental value for the proton strange magnetic form factor [3,4]:
$$G_M^s(Q^2 = 0.1\,(\mathrm{GeV}/c)^2) = +0.14 \pm 0.29\,(\mathrm{stat}) \pm 0.31\,(\mathrm{sys}). \tag{5.1}$$
This positive experimental value is contrary to the negative values of the proton strange form factor which result from most of the model calculations [8–22], except those of Hong et al. [23] and Hong and Park [25] based on the SU(3) chiral bag model (CBM) [26,27,30,32] and the recent predictions of the chiral quark soliton model [44] and the heavy baryon chiral perturbation theory [45,46]. Recently, the anapole moment effects associated with parity-violating electron scattering have been intensively studied to yield more theoretical predictions [46–50]. (For details of the anapole effects, see, for instance, Ref. [50].) Through further investigations including gluon effects, one can also obtain somewhat realistic predictions for the proton strange form factor. On the other hand, a number of parity-violating electron scattering experiments, such as the SAMPLE experiment associated with a second deuterium measurement [195], the HAPPEX experiment [196], the PVA4 experiment [197], the G0 experiment [198] and other recently approved parity-violating measurements [199,200] at the Jefferson Laboratory, are planned for the near future. (For details of the future experiments, see Ref. [50].)

Table 5
Electroweak quark couplings

Flavor   $\gamma$ (vector)   $Z$ (vector)                      $Z$ (axial vector)
$u$      $2/3$               $1/4 - (2/3)\sin^2\theta_W$       $-1/4$
$d$      $-1/3$              $-1/4 + (1/3)\sin^2\theta_W$      $1/4$
$s$      $-1/3$              $-1/4 + (1/3)\sin^2\theta_W$      $1/4$

Now we consider the form factors of the baryon octet with internal structure. If a particle is point-like, with no internal structure due to interactions other than EM, the photon couples to the EM current,
$$\hat{V}_\gamma^\mu = \tfrac{2}{3}\bar{u}\gamma^\mu u - \tfrac{1}{3}\bar{d}\gamma^\mu d - \tfrac{1}{3}\bar{s}\gamma^\mu s, \tag{5.2}$$
and according to the Feynman rules, the matrix element of $\hat{V}_\gamma^\mu$ for the particle with transition from momentum state $p$ to momentum state $p+q$ is given by
$$\langle p+q|\hat{V}_\gamma^\mu|p\rangle = \bar{u}(p+q)\gamma^\mu u(p), \tag{5.3}$$
where $u(p)$ is the spinor for the particle states. However, if the particle has internal structure caused by interactions not described by QED, the Feynman rules cannot yield the explicit coupling of the particle to an external or internal photon line. The standard electroweak model couplings to the up, down and strange quarks are listed in Table 5. The baryons are definitely extended objects with internal structure, for which the coupling can be described in terms of form factors, which are real Lorentz scalar functions associated with the internal structure and fixed by the properties of the EM currents such as current conservation, covariance under Lorentz transformations and hermiticity. The above matrix element is then generalized to have the covariant decomposition
$$\langle p+q|\hat{V}_\gamma^\mu|p\rangle = \bar{u}(p+q)\left[F_1^\gamma(q^2)\gamma^\mu + \frac{i}{2M_B}F_2^\gamma(q^2)\sigma^{\mu\nu}q_\nu\right]u(p), \tag{5.4}$$
where $q$ is the momentum transfer, $\sigma^{\mu\nu} = (i/2)(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu)$, $M_B$ is the baryon mass, and $F_1^\gamma$ and $F_2^\gamma$ are the Dirac and Pauli EM form factors. These are Lorentz scalars, and since $p^2 = (p+q)^2 = M_B^2$ on shell, they depend only on the Lorentz scalar variable $q^2$.
With these form factors, the differential cross section in the laboratory system for electron scattering on the baryon is given as
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2\cos^2(\theta/2)}{4E^2\sin^4(\theta/2)}\,\frac{1}{1 + 2(E/m_B)\sin^2(\theta/2)}\left[F_1^2(t) - \frac{t}{4m_B^2}F_2^2(t) - \frac{t}{2m_B^2}\bigl(F_1(t) + F_2(t)\bigr)^2\tan^2(\theta/2)\right], \tag{5.5}$$
where $\alpha$ is the fine structure constant, $E$ and $\theta$ are the energy and scattering angle of the electron, and $t = q^2$ is the Mandelstam variable. In order to see the physical interpretation of these EM form factors, it is convenient to consider the matrix element (5.4) in the reference frame with $\vec{p} + (\vec{p} + \vec{q}) = 0$, which becomes the rest frame in the vanishing $q^2$ limit. In this rest frame of the baryon, we can associate the EM form factors at zero momentum transfer, $F_1(0)$ and $F_2(0)$, with the static properties of the baryon such as electric charge, magnetic moment and charge radius. Next, we will also use the Sachs form factors, which are linear combinations of the Dirac and Pauli form factors:
$$G_E = F_1 - \tau F_2, \qquad G_M = F_1 + F_2, \tag{5.6}$$
where $\tau = -q^2/4M_B^2 > 0$. The quark flavor structure of the form factors can be revealed by writing the matrix elements of individual quark currents in terms of form factors:
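$$\langle p+q|\bar{q}^f\gamma^\mu q^f|p\rangle \equiv \bar{u}(p+q)\left[F_1^f(q^2)\gamma^\mu + \frac{i}{2M_N}F_2^f(q^2)\sigma^{\mu\nu}q_\nu\right]u(p), \qquad (f = u, d, s), \tag{5.7}$$
which defines the form factors $F_1^f$ and $F_2^f$. Then using definitions analogous to Eq. (5.6), we can write
$$G_E^\gamma = \tfrac{2}{3}G_E^u - \tfrac{1}{3}G_E^d - \tfrac{1}{3}G_E^s \tag{5.8}$$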
with a similar expression for $G_M^\gamma$. The neutral weak current operator is given by an expression analogous to Eq. (5.2) but with different coefficients:
$$\hat{V}_Z^\mu = \left(\tfrac{1}{4} - \tfrac{2}{3}\sin^2\theta_W\right)\bar{u}\gamma^\mu u + \left(-\tfrac{1}{4} + \tfrac{1}{3}\sin^2\theta_W\right)\bar{d}\gamma^\mu d + \left(-\tfrac{1}{4} + \tfrac{1}{3}\sin^2\theta_W\right)\bar{s}\gamma^\mu s. \tag{5.9}$$
Here the coefficients depend on the weak mixing angle, which has recently been determined [186] with high precision: $\sin^2\theta_W = 0.2315 \pm 0.0004$. In direct analogy to Eq. (5.8), we have expressions for the neutral weak form factors $G_E^Z$ and $G_M^Z$ in terms of the different quark flavor components:
$$G_{E,M}^Z = \left(\tfrac{1}{4} - \tfrac{2}{3}\sin^2\theta_W\right)G_{E,M}^u + \left(-\tfrac{1}{4} + \tfrac{1}{3}\sin^2\theta_W\right)G_{E,M}^d + \left(-\tfrac{1}{4} + \tfrac{1}{3}\sin^2\theta_W\right)G_{E,M}^s. \tag{5.10}$$
Fig. 3. Examples of amplitudes contributing to electroweak radiative corrections to the coefficients in Eq. (5.10).

An important point is that the form factors $G_{E,M}^f$ ($f = u, d, s$) appearing in this expression are exactly the same as those in the EM form factors, as in Eq. (5.8). Utilizing isospin symmetry, one can then eliminate the up and down quark contributions to the neutral weak form factors by using the proton and neutron EM form factors and obtain the expressions:
$$G_{E,M}^{Z,p} = \left(\tfrac{1}{4} - \sin^2\theta_W\right)G_{E,M}^{\gamma,p} - \tfrac{1}{4}G_{E,M}^{\gamma,n} - \tfrac{1}{4}G_{E,M}^s. \tag{5.11}$$
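The elimination leading to Eq. (5.11) can be cross-checked with exact rational arithmetic. The following is a minimal sketch, assuming isospin symmetry in the form $G^{u,n} = G^{d,p}$, $G^{d,n} = G^{u,p}$ and $G^{s,n} = G^{s,p}$; the helper names and the representation of each coefficient as a (constant, $\sin^2\theta_W$) pair are illustrative choices, not notation from the text.

```python
from fractions import Fraction as Fr

# A coefficient c0 + c1*sin^2(theta_W) is stored as the pair (c0, c1).

# EM couplings of Eq. (5.8), expressed on proton flavor form factors.
em_proton = {"u": Fr(2, 3), "d": Fr(-1, 3), "s": Fr(-1, 3)}
# Neutron EM couplings under the isospin assumption (u <-> d swapped).
em_neutron = {"u": Fr(-1, 3), "d": Fr(2, 3), "s": Fr(-1, 3)}

# Neutral weak couplings of Eq. (5.10).
weak = {
    "u": (Fr(1, 4), Fr(-2, 3)),
    "d": (Fr(-1, 4), Fr(1, 3)),
    "s": (Fr(-1, 4), Fr(1, 3)),
}

def rhs_5_11(flavor):
    """Coefficient of G^flavor on the right-hand side of Eq. (5.11)."""
    # (1/4 - sin^2) * G^{gamma,p}
    const = Fr(1, 4) * em_proton[flavor]
    sin2 = Fr(-1) * em_proton[flavor]
    # -(1/4) * G^{gamma,n}
    const += Fr(-1, 4) * em_neutron[flavor]
    # -(1/4) * G^s contributes only to the strange coefficient
    if flavor == "s":
        const += Fr(-1, 4)
    return (const, sin2)

eq_5_11_holds = all(rhs_5_11(f) == weak[f] for f in ("u", "d", "s"))
print(eq_5_11_holds)
```

Because the comparison is done flavor by flavor, a mismatch in any single coupling of Table 5 would make the check fail.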
This result shows how the neutral weak form factors are related to the EM form factors plus a contribution from the strange (electric or magnetic) form factor. Thus measurement of the neutral weak form factor will allow (after combination with the EM form factors) determination of the strange form factor of interest. It should be mentioned that there are electroweak radiative corrections to the coefficients in Eq. (5.10) due to processes such as those shown in Fig. 3. These are generally small corrections, of order 1–2%, and can be reliably calculated [201,202].

The EM form factors present in Eq. (5.11) are very accurately known (1–2%) for the proton in the momentum transfer region $Q^2 < 1\,(\mathrm{GeV}/c)^2$. The neutron form factors are not known as accurately as the proton form factors (the electric form factor $G_E^n$ is at present rather poorly constrained by experiment), although considerable work to improve our knowledge of these quantities is in progress. In view of this ongoing work, the present lack of knowledge of the neutron form factors is not expected to significantly hinder the interpretation of the neutral weak form factors.

The properties of the Sachs form factors $G_E$ and $G_M$ near $Q^2 = 0$ are of particular interest in that they represent static physical properties of the baryon. At zero momentum transfer, the EM form factors are related to the static physical quantities of the baryon octet by $G_E(0) = Q$ and $G_M(0) = \mu$, where $Q$ and $\mu$ are the electric charge and magnetic moment operators of the baryon. $F_2(0)$ is thus interpreted as the anomalous magnetic moment of the baryon octet, $\mu_{an} = \mu - Q$. In the strange flavor sector, the fractional EM charge $Q^s$ of the baryon can be obtained from the strange flavor fractional EM charges in the baryon to yield $Q_N^s = 0$, $Q_\Lambda^s = Q_\Sigma^s = -\frac{1}{3}$ and $Q_\Xi^s = -\frac{2}{3}$. The strange flavor anomalous magnetic moments, degenerate in isomultiplets, are then given by $\mu_{an}^{(s)} = \mu^{(s)} - Q^s$, so that the strange form factors at zero momentum transfer, defined as $F_2^{(s)} = -3\mu_{an}^{(s)}$, can be calculated to yield
$$F_{2B}^s(0) = G_M^s(0) - G_E^s(0). \tag{5.12}$$
Since the nucleon has no net strangeness, we find $G_E^s(0) = 0$. However, one can express the slope of $G_E^s$ at $Q^2 = 0$ in the usual fashion in terms of a “strangeness radius” $r_s$:
$$r_s^2 \equiv -6\left[dG_E^s/dQ^2\right]_{Q^2=0}. \tag{5.13}$$
Now we consider the parity-violating asymmetry for elastic scattering of right- vs. left-handed electrons from nucleons at backward scattering angles, which is quite sensitive to $G_M^Z$, as discussed in Refs. [5,203,204]. The SAMPLE experiment measured the parity-violating asymmetry in the elastic scattering of 200 MeV polarized electrons at backward angles with an average $Q^2 \simeq 0.1\,(\mathrm{GeV}/c)^2$. For $G_M^s = 0$, the expected asymmetry in the SAMPLE experiment is about $-7 \times 10^{-6}$, or $-7$ ppm, and the asymmetry depends linearly on $G_M^s$. The neutral weak axial form factor $G_A^Z$ contributes about 20% to the asymmetry in the SAMPLE experiment. In parity-violating electron scattering, $G_A^Z$ is modified by a substantial electroweak radiative correction. The corrections were estimated in Refs. [201,202], but there is considerable uncertainty in the calculations. The uncertainty in these radiative corrections substantially limits the ability to determine $G_M^s$, as will be discussed below.

The elastic scattering asymmetry for the proton is measured to yield
$$A = -4.92 \pm 0.61 \pm 0.73\ \mathrm{ppm}, \tag{5.14}$$
where the first uncertainty is statistical and the second is the estimated systematic error. This value is in good agreement with the previously reported measurement [191]. On the other hand, the quantities $G_{E,M}^Z$ for the proton can be determined via elastic parity-violating electron scattering [5,6]. The difference in cross sections for right- and left-handed incident electrons arises from interference of the EM and neutral weak amplitudes, and so contains products of EM and neutral weak form factors. At the mean kinematics of the experiment ($Q^2 = 0.1\,(\mathrm{GeV}/c)^2$ and $\theta = 146.1^\circ$), the theoretical asymmetry for elastic scattering from the proton is given by
$$A = \left(-5.72 + 3.49\,G_M^s + 1.55\,G_A^e(T=1)\right)\ \mathrm{ppm}, \tag{5.15}$$
where
$$G_A^e = G_A^Z + \eta F_A + R_e, \tag{5.16}$$
where $G_A^Z$ is the contribution from a single $Z$-exchange, as would be measured in neutrino–proton elastic scattering, given as
$$G_A^Z = -(1 + R_A^1)G_A + R_A^0 + G_A^s, \tag{5.17}$$
and $\eta = 8\pi\sqrt{2}\,\alpha/(1 - 4\sin^2\theta_W) = 3.45$ with the fine-structure constant $\alpha$, $F_A$ is the nucleon anapole moment [205] and $R_e$ is a radiative correction. Here $G_A$ is the charged current nucleon form factor: we use $G_A = G_A(0)/(1 + Q^2/M_A^2)^2$, with $G_A(0) = -(g_A/g_V) = 1.267 \pm 0.035$ [186] and $M_A = 1.061 \pm 0.026$ (GeV/c) [206]. Furthermore, $G_A^s(Q^2 = 0) = \Delta s = -0.12 \pm 0.03$ (see, e.g., Ref. [207]), and $R_A^{0,1}$ are the isoscalar and isovector axial radiative corrections. The radiative corrections were estimated in Ref. [201] to be $R_A^1 = -0.34$ and $R_A^0 = -0.12$, but with nearly 100% uncertainty (see footnote 7).

Footnote 7. The notation used here is $R_A^0 = (1/2)(3F - D)R_A^{T=0}$, where $\sqrt{3}R_A^{T=0} = -0.62$ in Ref. [202].

Fig. 4. A result of a combined analysis of the data from the two SAMPLE measurements. The two error bands from the hydrogen experiment [4] and the deuterium experiment are indicated. The inner hatched region includes the statistical error and the outer represents the systematic uncertainty added in quadrature. The ellipse represents the allowed region for both form factors at the $1\sigma$ level. Also plotted is the estimate of the isovector axial $e$–$N$ form factor $G_A^e(T=1)$, obtained by using the anapole form factor and radiative corrections of Zhu et al. [208].

For the case of a deuterium target, a separate measurement was performed with the same apparatus, where both elastic and quasi-elastic scattering from the deuteron were measured due to the large energy acceptance of the detector. Based on the appropriate fractions of the yield, the elastic scattering and threshold electrodisintegration contributions were estimated to change the measured asymmetry by only about 1%. The asymmetry for the deuterium is measured to yield
$$A = -6.79 \pm 0.64 \pm 0.51\ \mathrm{ppm}. \tag{5.18}$$
On the other hand, the theoretical asymmetry for the deuterium is given by
$$A = \left(-7.27 + 0.75\,G_M^s + 1.78\,G_A^e(T=1)\right)\ \mathrm{ppm}. \tag{5.19}$$
Note that in this case the expected asymmetry is $-8.8$ ppm, again assuming zero strange quark contribution and the axial corrections of Ref. [208]. Combining this measurement with the previously reported hydrogen asymmetry [4] and with the expressions in Eqs. (5.15) and (5.19) leads to the two sets of diagonal bands in Fig. 4. The inner portion of each band corresponds to the statistical error, and the outer portion corresponds to statistical and systematic errors combined in quadrature. The best experimental value for the strange magnetic form factor is given by (5.1). As noted in recent papers [209,210], most model calculations tend to produce negative values of $G_M^s(0)$, typically about $-0.3$. A recent calculation using lattice QCD techniques (in the quenched approximation) reports a result $G_M^s(0) = -0.36 \pm 0.20$ [210]. A recent study using a constrained Skyrme-model Hamiltonian that fits the baryon magnetic moments yields a positive value of $G_M^s(0) = +0.37$ [23].
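The combined hydrogen and deuterium analysis can be illustrated numerically. The sketch below assumes only the central values of the measured asymmetries in Eqs. (5.14) and (5.18), ignores all uncertainties and correlations, and solves the two linear relations (5.15) and (5.19) for $G_M^s$ and $G_A^e(T=1)$ by Cramer's rule. The function name `solve_sample` is hypothetical. With the rounded coefficients quoted above, the resulting $G_M^s$ lies close to the central value in Eq. (5.1).

```python
def solve_sample(a_h, a_d):
    """Solve Eqs. (5.15) and (5.19) for (G_M^s, G_A^e(T=1)).

    a_h, a_d: hydrogen and deuterium asymmetries in ppm.
    Central values only; no error propagation is attempted.
    """
    # Hydrogen:  a_h = -5.72 + 3.49*gms + 1.55*gae
    # Deuterium: a_d = -7.27 + 0.75*gms + 1.78*gae
    b1 = a_h + 5.72
    b2 = a_d + 7.27
    det = 3.49 * 1.78 - 1.55 * 0.75
    gms = (b1 * 1.78 - 1.55 * b2) / det
    gae = (3.49 * b2 - 0.75 * b1) / det
    return gms, gae

gms, gae = solve_sample(-4.92, -6.79)
print(f"G_M^s = {gms:+.3f}, G_A^e(T=1) = {gae:+.3f}")
```

The two bands in Fig. 4 correspond to the two linear equations solved here; their intersection is the point returned by the function.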
5.2. Strange form factors of baryons in chiral models

In this section, we will revisit the symmetry breaking mass effects to investigate the V-spin symmetric Coleman–Glashow sum rules [25] in the framework of the perturbative scheme, where the representation mixing coefficients can be obtained from quantum mechanical perturbation theory, in contrast to the Yabu–Ando approach with direct diagonalization discussed in the previous section. In the perturbative method, the Hamiltonian is split up into $H = H_0 + H_{SB}$, where $H_0$ is the SU(3) flavor symmetric part given by (2.41) and the symmetry breaking part is described by
$$H_{SB} = m(1 - D_{88}^8) \tag{5.20}$$
with $m$ the inertia parameter corresponding to $\gamma$ of (3.24) in the Yabu–Ando method, where the Hamiltonian has been divided into the representation-independent and -dependent parts. Provided one includes the representation mixing as in the previous section, the baryon wave function is described in terms of the higher representations:
$$|B\rangle = |B\rangle_8 - C_{\overline{10}}^B|B\rangle_{\overline{10}} - C_{27}^B|B\rangle_{27}, \tag{5.21}$$
where the representation mixing coefficients are explicitly calculated as
$$C_\lambda^B = \frac{{}_\lambda\langle B|H_{SB}|B\rangle_8}{E_\lambda - E_8} \tag{5.22}$$
with the eigenvalues $E_\lambda$ and eigenfunctions $|B\rangle_\lambda = \Phi_B^\lambda \otimes |\mathrm{intrinsic}\rangle$ of the equation $H_0|B\rangle_\lambda = E_\lambda|B\rangle_\lambda$. Here $\Phi_B^\lambda$ is the collective wave function discussed above, and the intrinsic state, degenerate for all the baryons, is described by a Fock state of the quark operator and the classical meson configuration. Using the octet wave functions with the higher representation mixing coefficients (5.22), the additional hyperfine structure of the magnetic moment spectrum in the quantum mechanical perturbative scheme is given by
$$\delta\mu_B^i = -2\sum_{\lambda = \overline{10},\,27} {}_8\langle B|\hat{\mu}^i|B\rangle_\lambda\,\frac{{}_\lambda\langle B|H_{SB}|B\rangle_8}{E_\lambda - E_8} \tag{5.23}$$
up to the first order of $m$, the strength of the symmetry breaking in (5.20). It is interesting to note here that one has the off-diagonal matrix elements of the magnetic moment operators $\hat{\mu}^i$ with the higher representations $\overline{10}$ and $27$, in contrast to the diagonal matrix elements of the chiral symmetric magnetic moments in Section 3.1. This fact is presumably related to the existence of exotic states [211] belonging to $\overline{10}$ and $27$, which decay to the initial states in $8$ through the channel of the operator $D_{88}^8$ related to the symmetry breaking mass effects.
Table 6 The strange form factors of baryon octet (s); 0 F2N
Fit CBM SM
0.16
−0:19 −0:13
(s); 1 +F2N
(s); 2 +F2N
(s) F2N
(s) F2
(s) F2O
(s) F2K
F20
0.28
−0:07
0.37 0.30 −0:02
1.37 0.49 0.51
1.22 0.25 0.09
−0:99 −1:54 −1:74
−0:67 −0:67
−0:12 −0:09
0.61 0.20
0.26
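One can then obtain the V-spin symmetry relations in the perturbative corrections of the octet magnetic moments:
$$\delta\mu_p = \delta\mu_{\Xi^-} = mI_2\left(\tfrac{2}{125}\mathcal{M} + \tfrac{8}{1125}(\mathcal{N} - 2\mathcal{N}')\right),$$
$$\delta\mu_n = \delta\mu_{\Sigma^-} = mI_2\left(\tfrac{31}{750}\mathcal{M} - \tfrac{46}{1125}(\mathcal{N} - \tfrac{21}{23}\mathcal{N}')\right),$$
$$\delta\mu_{\Sigma^+} = \delta\mu_{\Xi^0} = mI_2\left(\tfrac{1}{125}\mathcal{M} + \tfrac{4}{1125}(\mathcal{N} - 2\mathcal{N}')\right),$$
$$\delta\mu_\Lambda = mI_2\left(\tfrac{9}{500}\mathcal{M} + \tfrac{1}{125}(\mathcal{N} - 2\mathcal{N}')\right),$$
$$\delta\mu_{\Sigma^0} = mI_2\left(\tfrac{37}{1500}\mathcal{M} - \tfrac{7}{375}(\mathcal{N} - \tfrac{17}{21}\mathcal{N}')\right), \tag{5.24}$$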
where the operator $\hat{\mu}_{FSB}^i$ is neglected due to its small contributions. Here one notes that the above V-spin symmetric relations come from the SU(3) group theoretical fact that the matrix elements of the operators in (5.23), such as ${}_8\langle B|D_{38}^8 + (1/\sqrt{3})D_{88}^8|B\rangle_\lambda$ and ${}_\lambda\langle B|D_{88}^8|B\rangle_8$, have degeneracy for the V-spin multiplets $(p, \Xi^-)$, $(n, \Sigma^-)$ and $(\Xi^0, \Sigma^+)$, as in the U-spin symmetry of Section 3.1. Also, as in the Yabu–Ando approach, since the baryon wave functions in the multiquark structure of the CBM act on the magnetic moment operators with the quark and meson phase contributions in their inertia parameters, one could have the meson cloud content in the $qqqq\bar{q}$ multiquark Fock subspace of the chiral bag.

Now we consider the form factors of the baryon octet with internal structure in the framework of the CBM. The baryons in the CBM are definitely extended objects with internal structure characterized by the bag radius and dressed by the meson cloud. As discussed before, $F_2(0)$ is interpreted as the anomalous magnetic moment of the baryon octet, $\mu_{an} = \mu - Q$, whose numerical values can be easily obtained from Table 6 by subtracting the corresponding electric charges. Here one should note that the EM currents $J_{EM}^\mu$ obtainable from (2.6) are conserved as mentioned before, and the charge density operator is a constant of motion, so that the EM charge operator can be quantized in a conventional way even though the EM charge density is modified due to the derivative-dependent symmetry breaking terms. In the strange flavor sector, the strange form factors [8] at zero momentum transfer can be calculated from Eqs. (3.18) and (5.12) to yield
$$F_{2N}^{(s)}(0) = -3\mu_N^{(s)}, \qquad F_{2\Lambda}^{(s)}(0) = -3\mu_\Lambda^{(s)} - 1,$$
$$F_{2\Sigma}^{(s)}(0) = -3\mu_\Sigma^{(s)} - 1, \qquad F_{2\Xi}^{(s)}(0) = -3\mu_\Xi^{(s)} - 2. \tag{5.25}$$
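The coefficients in Eq. (5.24) can be cross-checked against isospin symmetry. The sketch below assumes the standard isospin relation $\delta\mu_{\Sigma^0} = \frac{1}{2}(\delta\mu_{\Sigma^+} + \delta\mu_{\Sigma^-})$, which is not stated explicitly in the text, and verifies that it holds coefficient by coefficient in $\mathcal{M}$, $\mathcal{N}$ and $\mathcal{N}'$. The helper `coeffs` and the dictionary layout are illustrative only.

```python
from fractions import Fraction as Fr

def coeffs(c_m, c_n, ratio):
    """Expand c_m*M + c_n*(N - ratio*N') into (M, N, N') coefficients.

    All values are in units of m*I_2; pass a negative c_n for a minus sign.
    """
    return (c_m, c_n, -c_n * ratio)

# Perturbative corrections of Eq. (5.24).
delta_mu = {
    "p":      coeffs(Fr(2, 125),   Fr(8, 1125),   Fr(2)),
    "n":      coeffs(Fr(31, 750),  Fr(-46, 1125), Fr(21, 23)),
    "Sigma+": coeffs(Fr(1, 125),   Fr(4, 1125),   Fr(2)),
    "Lambda": coeffs(Fr(9, 500),   Fr(1, 125),    Fr(2)),
    "Sigma0": coeffs(Fr(37, 1500), Fr(-7, 375),   Fr(17, 21)),
}
# V-spin partners share the same correction.
delta_mu["Xi-"] = delta_mu["p"]
delta_mu["Sigma-"] = delta_mu["n"]
delta_mu["Xi0"] = delta_mu["Sigma+"]

average = tuple((a + b) / 2
                for a, b in zip(delta_mu["Sigma+"], delta_mu["Sigma-"]))
isospin_ok = average == delta_mu["Sigma0"]
print(isospin_ok)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.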
Note that the I-spin symmetric relation in Eq. (3.19) can be expressed in a simpler form as
$$F_{2B}^{(u)}(0) = F_{2\bar{B}}^{(d)}(0). \tag{5.26}$$
Now the baryon octet strange form factors in Eq. (5.25) can be explicitly split into three pieces as follows:
$$F_{2B}^{(s)} = F_{2B}^{(s),0}(\mathcal{M}, \mathcal{N}, \mathcal{N}') + F_{2B}^{(s),1}(\mathcal{P}, \mathcal{Q}) + \delta F_{2B}^{(s),2}(mI_2). \tag{5.27}$$
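In the adjoint representation, one can obtain the CS and explicit current FSB contributions to the strange form factors:
$$F_{2N}^{(s),0} = \tfrac{7}{20}\mathcal{M} - \tfrac{1}{15}(\mathcal{N} + \tfrac{1}{2}\mathcal{N}'),$$
$$F_{2\Lambda}^{(s),0} = \tfrac{9}{20}\mathcal{M} + \tfrac{1}{5}(\mathcal{N} + \tfrac{1}{2}\mathcal{N}') - 1,$$
$$F_{2\Xi}^{(s),0} = \tfrac{3}{5}\mathcal{M} + \tfrac{4}{15}(\mathcal{N} + \tfrac{1}{2}\mathcal{N}') - 2,$$
$$F_{2\Sigma}^{(s),0} = \tfrac{11}{20}\mathcal{M} - \tfrac{1}{5}(\mathcal{N} + \tfrac{1}{2}\mathcal{N}') - 1, \tag{5.28}$$
and
$$F_{2N}^{(s),1} = -\tfrac{1}{15}\mathcal{P} - \tfrac{1}{30}\mathcal{Q}, \qquad F_{2\Lambda}^{(s),1} = \tfrac{1}{5}\mathcal{P} + \tfrac{1}{10}\mathcal{Q},$$
$$F_{2\Xi}^{(s),1} = \tfrac{1}{3}\mathcal{P} + \tfrac{1}{15}\mathcal{Q}, \qquad F_{2\Sigma}^{(s),1} = -\tfrac{11}{45}\mathcal{P} - \tfrac{1}{18}\mathcal{Q}. \tag{5.29}$$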
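Treated in the quantum mechanical perturbative scheme of the previous section, the representation mixing contributions from the multiquark structure can be explicitly given as
$$\delta F_{2N}^{(s),2} = mI_2\left(-\tfrac{43}{750}\mathcal{M} + \tfrac{38}{1125}\mathcal{N} - \tfrac{26}{1125}\mathcal{N}'\right),$$
$$\delta F_{2\Lambda}^{(s),2} = mI_2\left(-\tfrac{9}{250}\mathcal{M} - \tfrac{2}{125}\mathcal{N} + \tfrac{4}{125}\mathcal{N}'\right),$$
$$\delta F_{2\Xi}^{(s),2} = mI_2\left(-\tfrac{3}{125}\mathcal{M} - \tfrac{4}{375}\mathcal{N} + \tfrac{8}{375}\mathcal{N}'\right),$$
$$\delta F_{2\Sigma}^{(s),2} = mI_2\left(-\tfrac{37}{750}\mathcal{M} + \tfrac{14}{375}\mathcal{N} - \tfrac{34}{1125}\mathcal{N}'\right). \tag{5.30}$$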
Next, using the flavor singlet vector currents $J_V^{\mu 0}$, which can be constructed by replacing $\hat{Q}_a$ by 1 in (2.6), instead of the EM currents in the matrix element (5.4), we can also obtain the flavor singlet form factors [8,35,6]
$$F_2^0 = \tfrac{1}{2}\mathcal{M} - 1, \tag{5.31}$$
which are degenerate for all the baryon octet even in the multiquark structure, regardless of whether one uses the Yabu–Ando or perturbative methods.
In Table 6, one can find the numerical values for the strange form factors and flavor singlet form factors.

6. Unification of chiral bag model with other models

6.1. Connection to naive nonrelativistic quark model

Until now, we have considered static properties, such as the magnetic moments and form factors of the baryon octet, in the CBM, which unifies the MIT bag and Skyrmion models through the bag radius parameter. In this section, we will relate the CBM to the naive NRQM by investigating the model-independent sum rules in the magnetic moments, which have already been derived in the CBM in the previous sections for the baryon octet and decuplet, in order to find a clue for the unification of the naive NRQM into the CBM. In the naive NRQM, the wave function of a baryon consists of several degrees of freedom [212],
$$\psi(\mathrm{baryon}) = \psi(\mathrm{space})\,\psi(\mathrm{spin})\,\psi(\mathrm{flavor})\,\psi(\mathrm{color}), \tag{6.1}$$
where the spatial wave function is symmetric in the ground state, the spin state can either be completely symmetric ($J = \frac{3}{2}$) or of mixed symmetry ($J = \frac{1}{2}$), and there are $3^3$ flavor combinations which can be reshuffled into irreducible representations of SU(3) (see footnote 8). The color wave function is antisymmetric and common to all the baryons since every naturally occurring baryon is a color singlet, and the full baryon wave function is antisymmetric under the interchange of any two quarks. The baryon octet wave function is then constructed from the nontrivial spin/flavor wave function of the form
$$\psi(\text{baryon octet}) = \frac{\sqrt{2}}{3}\left(\psi_{12}(\mathrm{spin})\,\psi_{12}(\mathrm{flavor}) + \text{two other terms}\right), \tag{6.2}$$
where $\psi_{ij}(\mathrm{spin})$ and $\psi_{ij}(\mathrm{flavor})$ are the states with mixed symmetry such that their product is completely symmetric in quarks $i$ and $j$. Since each quark has its own intrinsic spin, the baryon magnetic moments in the naive NRQM are obtained by linearly adding the magnetic angular momentum quantum numbers of the wave function. The baryon octet magnetic moments and the $\Sigma^0\Lambda$ transition matrix element can then be constructed in terms of the linear vector sum of the three constituent quark magnetic moments $\mu_q$ ($q = u, d, s$) [213]:
$$\mu_p = \tfrac{1}{3}(4\mu_u - \mu_d), \qquad \mu_n = \tfrac{1}{3}(-\mu_u + 4\mu_d),$$
$$\mu_{\Sigma^-} = \tfrac{1}{3}(4\mu_d - \mu_s), \qquad \mu_{\Sigma^0} = \tfrac{1}{3}(2\mu_u + 2\mu_d - \mu_s),$$
$$\mu_{\Sigma^+} = \tfrac{1}{3}(4\mu_u - \mu_s), \qquad \mu_{\Xi^-} = \tfrac{1}{3}(-\mu_d + 4\mu_s),$$
$$\mu_{\Xi^0} = \tfrac{1}{3}(-\mu_u + 4\mu_s), \qquad \mu_\Lambda = \mu_s,$$
$$\tfrac{1}{\sqrt{3}}\mu_{\Sigma^0\Lambda} = \tfrac{1}{3}(-4\mu_u - \mu_d), \tag{6.3}$$

Footnote 8. In terms of group theory, the combination of three quark flavors yields a decuplet, a singlet and two octets, since the direct product of three fundamental representations of SU(3) decomposes according to the Clebsch–Gordan series $3 \otimes 3 \otimes 3 = 1 \oplus 8 \oplus 8 \oplus 10$.
where $\mu_q = Q_q(m_N/m_q)$ in units of nuclear magnetons ($= e\hbar/2m_N c$), with $m_q$ the $q$-flavor quark mass and $m_N$ the nucleon mass. Here one notes that due to $\mu_u/\mu_d = -2$ one has the ratio $\mu_n/\mu_p = -\frac{2}{3}$, comparable to the experimental value $-0.69$ and the CBM prediction $-\frac{3}{4}$ in the leading order of $N_c$. Since the d- and s-flavor charges are degenerate in the SU(3) EM charge operator $\hat{Q}_{EM}$, the baryon magnetic moments in the SU(3) flavor symmetric limit with the chiral symmetry breaking masses $m_u = m_d = m_s$ satisfy the U-spin symmetric Coleman–Glashow sum rules in the naive NRQM, the analog of the U-spin symmetry relations (3.3) in the CBM:
$$\mu_{\Sigma^+} = \mu_p = \frac{m_N}{m_u},$$
$$\mu_{\Xi^0} = \mu_n = -\frac{2}{3}\frac{m_N}{m_u},$$
$$\mu_{\Xi^-} = \mu_{\Sigma^-} = -\frac{1}{3}\frac{m_N}{m_u},$$
$$\mu_{\Sigma^0} = -\mu_\Lambda = \frac{1}{3}\frac{m_N}{m_u}, \tag{6.4}$$
and the other Coleman–Glashow sum rule (3.5) for the summation of the magnetic moments over all the octet baryons. The naive NRQM also predicts the other sum rules (3.7) and the relations of the hyperon and transition magnetic moments in terms of the nucleon magnetic moments (3.8) in the CBM. Using the projection operators (3.14), one can easily see that the nucleon magnetic moments in the u-flavor channel of the naive NRQM are given by $\mu_p^{(u)} = \frac{4}{3}Q_u(m_N/m_u)$ and $\mu_n^{(u)} = -\frac{1}{3}Q_u(m_N/m_u)$, and the d-flavor components of the nucleon magnetic moments are given by (3.20) as in the CBM, but $\mu_N^{(s)} = 0$ due to the absence of strange quarks in the nucleon of the naive NRQM. In general, one can easily see that the SU(3) flavor components of hyperon magnetic moments also satisfy the identities (3.20) in the naive NRQM. In the more general SU(3) flavor symmetry breaking case with $m_u = m_d \neq m_s$, one can easily see that the baryon octet magnetic moments fulfill the Coleman–Glashow sum rule (3.4), since $\mu_{\Sigma^+} + \mu_{\Sigma^-}$ is independent of the third component of the isospin, and the last model-independent relation in (3.8) and the identities in (3.20) hold since they are relations derived in the same strangeness sector.

Together with the above model-independent Coleman–Glashow sum rules shared by the two models, the CBM predictions propose the unification of the naive NRQM and the CBM, which has the meson cloud, around the quarks of the naive NRQM, located both in the quark and meson phases. In other words, the CBM can be phenomenologically proposed as an effective NRQM in the adjoint representation, and the model-independent relations and Cheshire cat properties are shown to support the effective NRQM conjecture with meson cloud. In Table 2, the SU(2) CBM predictions [28] are explicitly listed to be compared with the naive NRQM and SU(3) CBM, so that the pure kaon contributions to the baryon magnetic moments can be explicitly calculated with respect to the naive NRQM. Next, starting with the symmetric spin configuration in the ground state with symmetric $\psi(\mathrm{space})$, one can have the spin-$\frac{3}{2}$ baryon decuplet wave function in the naive NRQM with the symmetric flavor state to yield
$$\psi(\text{baryon decuplet}) = \psi_s(\mathrm{spin})\,\psi_s(\mathrm{flavor}). \tag{6.5}$$
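In the naive NRQM, the baryon magnetic moments are then obtained as the linear sum of the three constituent quark magnetic moments, similar to (6.3):
$$\mu_{\Delta^-} = 3\mu_d, \qquad \mu_{\Delta^0} = \mu_u + 2\mu_d, \qquad \mu_{\Delta^+} = 2\mu_u + \mu_d, \qquad \mu_{\Delta^{++}} = 3\mu_u,$$
$$\mu_{\Sigma^{*-}} = 2\mu_d + \mu_s, \qquad \mu_{\Sigma^{*0}} = \mu_u + \mu_d + \mu_s, \qquad \mu_{\Sigma^{*+}} = 2\mu_u + \mu_s,$$
$$\mu_{\Xi^{*-}} = \mu_d + 2\mu_s, \qquad \mu_{\Xi^{*0}} = \mu_u + 2\mu_s, \qquad \mu_{\Omega^-} = 3\mu_s. \tag{6.6}$$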
In the SU(3) flavor symmetric limit with the chiral symmetry breaking masses $m_u = m_d = m_s$, the decuplet baryons with the EM charge $Q_{EM}$ are described by [189]
$$\mu_B = Q_{EM}\frac{m_N}{m_u} \tag{6.7}$$
and satisfy the U-spin symmetry relations (4.4) and (4.7). On the other hand, the $\Delta$ magnetic moments in the u- and s-flavor channels are given by
$$\mu_\Delta^{(u)} = (Q_\Delta + 1)\,\frac{2}{3}\frac{m_N}{m_u}, \qquad \mu_\Delta^{(s)} = 0, \tag{6.8}$$
and in general, all the baryon decuplet magnetic moments fulfill the model-independent relations (3.20) in the u- and d-flavor components and the I-spin symmetry in the s-flavor channel, where the isomultiplets have the degenerate strangeness number. Finally, one should note that the other sum rules (4.5) and (4.6) and the identities in (3.20) hold even in the SU(3) flavor symmetry breaking case ($m_u = m_d \neq m_s$), since they are relations derived in the same strangeness sector. The above model-independent sum rules in the baryon decuplet, satisfied by both the naive NRQM and the CBM, support the effective NRQM conjecture as in the baryon octet. The effective NRQM conjecture discussed for the baryon octet and decuplet supports the possibility of the unification of the CBM with the naive NRQM, while the Cheshire catness suggests another clue to the unification of the CBM with the Skyrmion model. In the next section, we will proceed to consider the other plausible unification of the CBM with the NJL model, chiral perturbation theory, the CK model and the chiral quark soliton model.

6.2. Connection to other models

So far, chiral soliton models such as the Skyrmion model have been constructed mainly on the basis of low-energy meson phenomenology, since the effective meson Lagrangian underlying QCD is not known. There has been some progress in deriving the effective meson Lagrangian either directly from QCD [214] or from the quark flavor dynamics [215] of the NJL model [216]. In particular, it has been claimed that the Skyrmion model can be derived [217,218] from the NJL model in the limit of large vector and axial–vector meson masses. Consequently, the unification of the CBM with the NJL model appears plausible. Next, in the strong chiral symmetry breaking limit, the Yabu–Ando approach to the Skyrmion model has suggested [82] a mass formula similar to the one derived in the bound state scheme of the CK model, so that one may conclude that the perturbation and bound state schemes are two extreme limits of the Yabu–Ando approach. Similarly, in the limit of large symmetry breaking strength $\gamma$ of (3.24), the CBM results are comparable to those of Refs. [219,220] estimated in the bound state scheme of the CK model. Finally, in the chiral quark soliton model [221], the baryon decuplet magnetic moments satisfy the model-independent sum rules (4.4) and (4.7) as in the naive NRQM. Moreover, one can easily see that the CBM shares with the naive NRQM and chiral quark soliton model the following sum rules:
$$-4\mu_{\Delta^{++}} + 6\mu_{\Delta^+} + 3\mu_{\Sigma^{*+}} - 6\mu_{\Sigma^{*0}} + \mu_{\Omega^-} = 0, \tag{6.9}$$
$$-2\mu_{\Delta^{++}} + 3\mu_{\Delta^+} + 2\mu_{\Sigma^{*+}} - 4\mu_{\Sigma^{*0}} + \mu_{\Xi^{*-}} = 0, \tag{6.10}$$
$$-\mu_{\Delta^{++}} + 2\mu_{\Delta^+} - 2\mu_{\Sigma^{*0}} + \mu_{\Xi^{*0}} = 0, \tag{6.11}$$
$$\mu_{\Delta^{++}} - 2\mu_{\Delta^+} + \mu_{\Delta^0} = 0 \tag{6.12}$$
and
$$\mu_{\Delta^0} - \mu_{\Sigma^{*-}} = \mu_{\Sigma^{*+}} - \mu_{\Xi^{*0}} = \tfrac{1}{2}(\mu_{\Delta^+} - \mu_{\Xi^{*-}}) = \tfrac{1}{3}(\mu_{\Delta^{++}} - \mu_{\Omega^-}). \tag{6.13}$$
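For the naive NRQM, the sum rules (6.9)–(6.13) can be verified directly. The sketch below assumes that each decuplet moment is the sum of its three constituent quark moments, following the quark content of each state (for example $uud$ for $\Delta^+$ and $dds$ for $\Sigma^{*-}$), and checks each rule as an identity in $\mu_u$, $\mu_d$ and $\mu_s$, so no flavor symmetry is assumed. The names `MOMENTS`, `SUM_RULES` and `combine` are hypothetical.

```python
# Each NRQM decuplet moment is n_u*mu_u + n_d*mu_d + n_s*mu_s, so it is
# stored as the integer triple (n_u, n_d, n_s) given by the quark content.
MOMENTS = {
    "Delta++": (3, 0, 0), "Delta+": (2, 1, 0),
    "Delta0": (1, 2, 0), "Delta-": (0, 3, 0),
    "Sigma*+": (2, 0, 1), "Sigma*0": (1, 1, 1), "Sigma*-": (0, 2, 1),
    "Xi*0": (1, 0, 2), "Xi*-": (0, 1, 2), "Omega-": (0, 0, 3),
}

def combine(terms):
    """Return the (mu_u, mu_d, mu_s) coefficients of sum(c * mu_B)."""
    total = [0, 0, 0]
    for c, name in terms:
        for k in range(3):
            total[k] += c * MOMENTS[name][k]
    return tuple(total)

SUM_RULES = {
    "6.9": [(-4, "Delta++"), (6, "Delta+"), (3, "Sigma*+"),
            (-6, "Sigma*0"), (1, "Omega-")],
    "6.10": [(-2, "Delta++"), (3, "Delta+"), (2, "Sigma*+"),
             (-4, "Sigma*0"), (1, "Xi*-")],
    "6.11": [(-1, "Delta++"), (2, "Delta+"), (-2, "Sigma*0"), (1, "Xi*0")],
    "6.12": [(1, "Delta++"), (-2, "Delta+"), (1, "Delta0")],
}
results = {label: combine(terms) == (0, 0, 0)
           for label, terms in SUM_RULES.items()}

# Eq. (6.13): four equal differences, multiplied by 6 to clear fractions.
chain = [
    combine([(6, "Delta0"), (-6, "Sigma*-")]),
    combine([(6, "Sigma*+"), (-6, "Xi*0")]),
    combine([(3, "Delta+"), (-3, "Xi*-")]),
    combine([(2, "Delta++"), (-2, "Omega-")]),
]
results["6.13"] = all(c == chain[0] for c in chain)
print(results)
```

Each difference in Eq. (6.13) reduces to $\mu_u - \mu_s$, which is why the chain of equalities holds for arbitrary quark masses.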
These sum rules also suggest the possibility of unification of the CBM with the naive NRQM and chiral quark soliton model.

7. Improved Dirac quantization of Skyrmion model

7.1. Modified mass spectrum in SU(2) Skyrmion

In this section, we will apply the Batalin–Fradkin–Tyutin (BFT) method to the Skyrmion to obtain the modified mass spectrum of the baryons by including the Weyl ordering correction.
We will next canonically quantize the SU(2) Skyrme model by using the Dirac quantization method, which will be shown to be consistent with the BFT one after the adjustable parameters are introduced to define the generalized momenta without any loss of generality [94]. Now we start with the SU(2) Skyrmion Lagrangian of the form
$$L_{SM} = \int d^3r\left[-\frac{1}{4}f_\pi^2\,\mathrm{tr}(l_\mu l^\mu) + \frac{1}{32e^2}\,\mathrm{tr}[l_\mu, l_\nu]^2\right], \tag{7.1}$$
where $l_\mu = U^\dagger\partial_\mu U$ and $U$ is an SU(2) matrix satisfying the boundary condition $\lim_{r\to\infty}U = I$, so that the pion field vanishes as $r$ goes to infinity. On the other hand, in the Skyrmion model, since the hedgehog ansatz has maximal or spherical symmetry, it is easily seen that spin plus isospin equals zero, so that isospin transformations and spatial rotations are related to each other. Furthermore, spin and isospin states can be treated by collective coordinates $a^\mu = (a^0, \vec{a})$ ($\mu = 0, 1, 2, 3$) corresponding to the spin and isospin rotations,
$$A(t) = a^0 + i\vec{a}\cdot\vec{\tau}, \tag{7.2}$$
which is the time-dependent collective variable defined on the SU(2)$_F$ group manifold and is related to the zero modes associated with the collective rotation (2.31) in the SU(3) CBM. With the hedgehog ansatz described in Section 1.4 and the collective rotation $A(t) \in$ SU(2), the chiral field can be given by
$$U(\vec{x}, t) = A(t)U_0(\vec{x})A^\dagger(t) = e^{i\ldots}$$
(7.3)
(7.4) (7.5)
with the dimensionless quantity $z = ef_\pi r$. Introducing the canonical momenta $\pi^\mu = 4\mathcal{I}_{10}\dot{a}^\mu$ conjugate to the collective coordinates $a^\mu$, one can then obtain the canonical Hamiltonian,
$$H = M_0 + \frac{1}{8\mathcal{I}_{10}}\pi^\mu\pi^\mu, \tag{7.6}$$
and the spin and isospin operators:
$$J^i = \tfrac{1}{2}(a^0\pi^i - a^i\pi^0 - \epsilon_{ijk}a^j\pi^k), \qquad I^i = \tfrac{1}{2}(a^i\pi^0 - a^0\pi^i - \epsilon_{ijk}a^j\pi^k). \tag{7.7}$$
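On the other hand, our system has the second class constraints^9
\[
\Omega_1 = a^\mu a^\mu - 1 \approx 0, \qquad \Omega_2 = a^\mu \pi^\mu \approx 0 , \tag{7.8}
\]
to yield the Poisson algebra, with $\epsilon^{12} = -\epsilon^{21} = 1$,
\[
\Delta_{kk'} = \{\Omega_k, \Omega_{k'}\} = 2\,\epsilon^{kk'} a^\mu a^\mu . \tag{7.9}
\]
We now recapitulate the construction of the first-class SU(2) Hamiltonian. Following the BFT formalism [89,94,222] we introduce two auxiliary fields $(\theta, \pi_\theta)$ with the Poisson brackets
\[
\{\theta, \pi_\theta\} = 1 . \tag{7.10}
\]
One can then obtain the first-class constraints
\[
\tilde\Omega_1 = \Omega_1 + 2\theta, \qquad \tilde\Omega_2 = \Omega_2 - a^\mu a^\mu\,\pi_\theta , \tag{7.11}
\]
satisfying the first-class constraint Lie algebra $\{\tilde\Omega_i, \tilde\Omega_j\} = 0$. Demanding that they are strongly involutive in the extended phase space, i.e., $\{\tilde\Omega_i, \tilde{\mathcal F}\} = 0$, one can construct the first-class BFT physical fields $\tilde{\mathcal F} = (\tilde a^\mu, \tilde\pi^\mu)$ corresponding to the original fields $\mathcal F = (a^\mu, \pi^\mu)$, as a power series of the auxiliary fields $(\theta, \pi_\theta)$:
\[
\tilde a^\mu = a^\mu \left( \frac{a^\nu a^\nu + 2\theta}{a^\nu a^\nu} \right)^{1/2}, \qquad
\tilde\pi^\mu = (\pi^\mu - a^\mu \pi_\theta) \left( \frac{a^\nu a^\nu}{a^\nu a^\nu + 2\theta} \right)^{1/2} . \tag{7.12}
\]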
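The first-class property stated around Eqs. (7.11) and (7.12) can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Poisson-bracket convention $\{f, g\} = \partial_q f\,\partial_p g - \partial_p f\,\partial_q g$ with canonical pairs $(a^\mu, \pi^\mu)$ and $(\theta, \pi_\theta)$, reads the constraints and BFT fields as $\tilde\Omega_1 = a^\mu a^\mu - 1 + 2\theta$, $\tilde\Omega_2 = a^\mu\pi^\mu - a^\mu a^\mu \pi_\theta$, $\tilde a^\mu = a^\mu[(a^\nu a^\nu + 2\theta)/a^\nu a^\nu]^{1/2}$ and $\tilde\pi^\mu = (\pi^\mu - a^\mu\pi_\theta)[a^\nu a^\nu/(a^\nu a^\nu + 2\theta)]^{1/2}$, and uses central finite differences at a random phase-space point. All function names are illustrative.

```python
import math
import random

# Flat phase-space point z = [a0, a1, a2, a3, pi0, pi1, pi2, pi3, theta, pi_theta].
# Canonical (coordinate, momentum) index pairs, with {theta, pi_theta} = 1.
PAIRS = [(0, 4), (1, 5), (2, 6), (3, 7), (8, 9)]


def unpack(z):
    return z[0:4], z[4:8], z[8], z[9]


def omega1_tilde(z):
    # First-class constraint: a.a - 1 + 2*theta
    a, _, theta, _ = unpack(z)
    return sum(x * x for x in a) - 1.0 + 2.0 * theta


def omega2_tilde(z):
    # First-class constraint: a.pi - (a.a)*pi_theta
    a, pi, _, pi_theta = unpack(z)
    a_sq = sum(x * x for x in a)
    return sum(x * p for x, p in zip(a, pi)) - a_sq * pi_theta


def a_tilde(mu):
    def field(z):
        a, _, theta, _ = unpack(z)
        a_sq = sum(x * x for x in a)
        return a[mu] * math.sqrt((a_sq + 2.0 * theta) / a_sq)
    return field


def pi_tilde(mu):
    def field(z):
        a, pi, theta, pi_theta = unpack(z)
        a_sq = sum(x * x for x in a)
        return (pi[mu] - a[mu] * pi_theta) * math.sqrt(a_sq / (a_sq + 2.0 * theta))
    return field


def partial(f, z, i, h=1e-5):
    """Central finite-difference derivative of f with respect to z[i]."""
    up, down = list(z), list(z)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2.0 * h)


def poisson(f, g, z):
    """Numerical Poisson bracket {f, g} summed over all canonical pairs."""
    return sum(
        partial(f, z, q) * partial(g, z, p) - partial(f, z, p) * partial(g, z, q)
        for q, p in PAIRS
    )


def max_bracket(seed=0):
    """Largest |{constraint, X}| at a random point; X is the other constraint or a BFT field."""
    rng = random.Random(seed)
    z = [rng.uniform(0.5, 1.5) for _ in range(8)]
    z += [rng.uniform(0.1, 0.3), rng.uniform(-1.0, 1.0)]
    constraints = (omega1_tilde, omega2_tilde)
    values = [abs(poisson(omega1_tilde, omega2_tilde, z))]
    for mu in range(4):
        for field in (a_tilde(mu), pi_tilde(mu)):
            values.extend(abs(poisson(c, field, z)) for c in constraints)
    return max(values)


print(max_bracket())  # expected to be zero up to finite-difference error
```

A value at the level of the finite-difference error is consistent with strong involution; by contrast, the same routine applied to the original constraints $\Omega_1$, $\Omega_2$ returns $2a^\mu a^\mu$, in line with Eq. (7.9).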
As discussed in Ref. [92], any functional $K(\tilde{\mathcal F})$ of the first-class fields $\tilde{\mathcal F}$ is also first-class, namely, $\tilde K(\mathcal F; \Phi) = K(\tilde{\mathcal F})$. Using this property, we construct a first-class Hamiltonian in terms of the above BFT physical variables. The result is
\[
\tilde H = M_0 + \frac{1}{8\mathcal{I}_{10}}\,\tilde\pi^\mu \tilde\pi^\mu . \tag{7.13}
\]
We then directly rewrite this Hamiltonian in terms of the original as well as auxiliary fields [95],
\[
\tilde H = M_0 + \frac{1}{8\mathcal{I}_{10}}\,(\pi^\mu - a^\mu\pi_\theta)(\pi^\mu - a^\mu\pi_\theta)\,\frac{a^\nu a^\nu}{a^\nu a^\nu + 2\theta} , \tag{7.14}
\]
which is also strongly involutive with the first-class constraints, $\{\tilde\Omega_i, \tilde H\} = 0$. However, with the first-class Hamiltonian (7.14), one cannot naturally generate the first-class Gauss law constraint from the time evolution of the primary constraint $\tilde\Omega_1$. Now, by introducing an additional term proportional to the first-class constraint $\tilde\Omega_2$ into $\tilde H$, we obtain an equivalent first-class Hamiltonian,
\[
\tilde H' = \tilde H + \frac{1}{4\mathcal{I}_{10}}\,\pi_\theta\,\tilde\Omega_2 , \tag{7.15}
\]
which naturally generates the Gauss law constraint,
\[
\{\tilde\Omega_1, \tilde H'\} = \frac{1}{2\mathcal{I}_{10}}\,\tilde\Omega_2 , \qquad \{\tilde\Omega_2, \tilde H'\} = 0 . \tag{7.16}
\]
Here one notes that $\tilde H$ and $\tilde H'$ act on physical states in the same way since such states are annihilated by the first-class constraints. Using the first-class constraints in this Hamiltonian (7.15), one can obtain the Hamiltonian of the form
\[
\tilde H' = M_0 + \frac{1}{8\mathcal{I}_{10}}\,(a^\mu a^\mu\,\pi^\nu \pi^\nu - a^\mu \pi^\mu\,a^\nu \pi^\nu) . \tag{7.17}
\]

^9 Here one notes that, due to the commutator $\{\pi^\mu, \Omega_1\} = -2a^\mu$, one can obtain the algebraic relation $\{\Omega_1, H\} = (1/2\mathcal{I})\,\Omega_2$.
Following the symmetrization procedure, the first-class Hamiltonian yields the slightly modified energy spectrum with the Weyl ordering correction [223,94,222,95],
\[
\langle \tilde H' \rangle = M_0 + \frac{1}{2\mathcal{I}_{10}}\left[ I(I+1) + \tfrac14 \right] , \tag{7.18}
\]
where $I$ is the isospin quantum number of baryons. Next, using the Weyl ordering corrected energy spectrum (7.18), we easily obtain the hyperfine structure of the nucleon and delta hyperon masses to yield the static mass and the moment of inertia
\[
M_0 = \tfrac13\,(4M_N - M_\Delta), \qquad \mathcal{I}_{10} = \tfrac32\,(M_\Delta - M_N)^{-1} . \tag{7.19}
\]
Substituting the experimental values $M_N = 939$ MeV and $M_\Delta = 1232$ MeV into Eq. (7.19) and using the expressions (7.5), one can predict the pion decay constant $f_\pi$ and the Skyrmion parameter $e$ as follows:
\[
f_\pi = 63.2\ \text{MeV}, \qquad e = 5.48 .
\]
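The inversion leading to Eq. (7.19) can be made explicit. The sketch below is only illustrative: it assumes the Weyl ordering corrected spectrum (7.18) in the form $M_0 + [I(I+1) + 1/4]/(2\mathcal{I}_{10})$, evaluated at $I = 1/2$ (nucleon) and $I = 3/2$ (delta), and solves the two resulting linear equations. It does not reproduce $f_\pi$ and $e$, which require the expressions (7.5). The function names are hypothetical.

```python
def fit_static_mass_and_inertia(m_nucleon, m_delta):
    """Solve Eq. (7.18) at I = 1/2 and I = 3/2 for (M0, I10), as in Eq. (7.19)."""
    m0 = (4.0 * m_nucleon - m_delta) / 3.0
    inertia = 1.5 / (m_delta - m_nucleon)
    return m0, inertia


def woc_mass(m0, inertia, isospin):
    """Weyl ordering corrected mass, Eq. (7.18)."""
    return m0 + (isospin * (isospin + 1.0) + 0.25) / (2.0 * inertia)


M_N, M_DELTA = 939.0, 1232.0  # MeV, the input values quoted in the text
M0, I10 = fit_static_mass_and_inertia(M_N, M_DELTA)
print(f"M0 = {M0:.1f} MeV, I10 = {I10:.6f} MeV^-1")
print(woc_mass(M0, I10, 0.5), woc_mass(M0, I10, 1.5))  # should recover the inputs
```

Feeding the fitted pair back into `woc_mass` recovers the two input masses, which is the consistency condition behind Eq. (7.19).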
With these fixed values of $f_\pi$ and $e$, one can then proceed to yield the predictions for the other static properties of the baryons. The isoscalar and isovector mean square (magnetic) charge radii and the baryon and transition magnetic moments are contained in Table 7, together with the experimental data and the standard Skyrmion predictions [64,66,224].^10 It is remarkable that the effects of the Weyl ordering correction in the baryon energy spectrum are propagated through the model parameters $f_\pi$ and $e$ to modify the predictions of the baryon static properties.

^10 For the delta magnetic moments, we use the experimental data of Nefkens et al. [225].

Table 7
The static properties of baryons in the standard and Weyl ordering corrected (WOC) Skyrmions compared with the experimental data^a

Quantity                                  Standard      WOC           Experiment
$M_N$                                     939 MeV^a     939 MeV^a     939 MeV
$M_\Delta$                                1232 MeV^a    1232 MeV^a    1232 MeV
$f_\pi$                                   64.5 MeV      63.2 MeV      93.0 MeV
$e$                                       5.44          5.48          -
$\langle r^2\rangle^{1/2}_{M,I=0}$        0.92 fm       0.94 fm       0.81 fm
$\langle r^2\rangle^{1/2}_{M,I=1}$        $\infty$      $\infty$      0.80 fm
$\langle r^2\rangle^{1/2}_{I=0}$          0.59 fm       0.60 fm       0.72 fm
$\langle r^2\rangle^{1/2}_{I=1}$          $\infty$      $\infty$      0.88 fm
$\mu_p$                                   1.87          1.89          2.79
$\mu_n$                                   -1.31         -1.32         -1.91
$\mu_{\Delta^{++}}$                       3.72          3.75          4.7-6.7
$\mu_{N\Delta}$                           2.27          2.27          3.29
$\mu_p - \mu_n$                           3.18          3.21          4.70

^a The quantities used as input parameters.

Moreover, one can show that, by fixing a free adjustable parameter $c$ introduced to define generalized momenta, the baryon energy eigenvalues obtained by the standard Dirac method are consistent with the above BFT result. To be more specific, we can obtain the modified quantum energy spectrum of the baryons [94]
\[
\langle H \rangle_N = M_0 + \frac{1}{8\mathcal{I}_{10}}\left[ l(l+2) + \tfrac94 - c^2 \right] \tag{7.20}
\]
which is consistent with the BFT result (7.18) if the adjustable parameter $c$ is fixed with the values $c = \pm\sqrt{5}/2$. Here one notes that these values for the parameter $c$ relate the Dirac bracket scheme with the BFT one to yield the desired quantization in the SU(2) Skyrmion model so that one can achieve the unification of these two different formalisms. (For details see Ref. [94].) On the other hand, we can obtain the BRST invariant Lagrangian in the framework of the BFV formalism [226–228], which is applicable to theories with first-class constraints, by introducing two canonical sets of ghosts and antighosts together with auxiliary fields. Following the procedure in Appendix C.1, one can arrive at the BRST invariant Lagrangian [95]:
\[
L_{\rm eff} = -M_0 + \frac{2\mathcal{I}_{10}}{1-2\theta}\,\dot a^\mu \dot a^\mu - \frac{2\mathcal{I}_{10}}{(1-2\theta)^2}\,\dot\theta^2 - 2\mathcal{I}_{10}(1-2\theta)^2 (B + 2\bar C C)^2 - \frac{\dot\theta \dot B}{1-2\theta} + \dot{\bar C}\dot C , \tag{7.21}
\]
which is invariant under the BRST transformation:
\[
\delta_B a^\mu = \lambda a^\mu C, \qquad \delta_B \theta = -\lambda(1-2\theta)C, \qquad \delta_B \bar C = -\lambda B, \qquad \delta_B C = \delta_B B = 0 . \tag{7.22}
\]
Here $C$ ($\bar C$) and $B$ are the (anti)ghosts and the corresponding auxiliary fields. (For details see Appendix C.1.)

7.2. Phenomenology in SU(3) Skyrmion

Now let us consider the hyperfine splittings for the SU(3) Skyrmion [78,229,211], which has been studied in two main schemes as discussed in the previous chapters. Firstly, the SU(3) cranking method exploits rigid rotation of the Skyrmion in the collective space of SU(3) Euler angles with full diagonalization of the flavor symmetry breaking (FSB) terms [25,37,38]. In particular, Yabu and Ando [82] proposed the exact diagonalization of the symmetry breaking terms by introducing higher irreducible representation mixing in the baryon wave function, which was later interpreted in terms of the multiquark structure [83,84] in the baryon wave function. Secondly, Callan and Klebanov [79] suggested an interpretation of baryons containing a heavy quark as bound states of solitons of the pion chiral Lagrangian with mesons. In their formalism, the fluctuations in the strangeness direction are treated differently from those in the isospin directions [79,80].

In order to generalize the standard flavor symmetric (FS) SU(3) Skyrmion rigid rotator approach [230,231] to the SU(3) Skyrmion case with the pion mass and FSB terms, we will now investigate the chiral breaking pion mass and FSB effects on $c$, the ratio of the strange–light to light–light interaction strengths, and on $\bar c$, that of the strange–strange to light–light. Now we start with the SU(3) Skyrmion Lagrangian of the form
\[
\begin{aligned}
\mathcal{L} &= -\tfrac14 f_\pi^2\,\mathrm{tr}(l_\mu l^\mu) + \frac{1}{32e^2}\,\mathrm{tr}[l_\mu, l_\nu]^2 + \mathcal{L}_{WZW} + \tfrac14 f_\pi^2\,\mathrm{tr}\,M(U + U^\dagger - 2) + \mathcal{L}_{FSB} , \\
\mathcal{L}_{FSB} &= \tfrac16 (f_K^2 m_K^2 - f_\pi^2 m_\pi^2)\,\mathrm{tr}\big((1 - \sqrt3\lambda_8)(U + U^\dagger - 2)\big) - \tfrac{1}{12}(f_K^2 - f_\pi^2)\,\mathrm{tr}\big((1 - \sqrt3\lambda_8)(U l_\mu l^\mu + l_\mu l^\mu U^\dagger)\big) ,
\end{aligned} \tag{7.23}
\]
where $f_\pi$ ($f_K$) and $e$ are the pion (kaon) decay constants and the dimensionless Skyrme parameter as before. Here $l_\mu = U^\dagger \partial_\mu U$ with an SU(3) matrix $U$, and $M$ is proportional to the quark mass matrix given by
\[
M = \mathrm{diag}(m_\pi^2,\ m_\pi^2,\ 2m_K^2 - m_\pi^2) ,
\]
where $m_\pi = 138$ MeV and $m_K = 495$ MeV. Note that $\mathcal{L}_{FSB}$ is the FSB correction term due to the relations $m_\pi \neq m_K$ and $f_\pi \neq f_K$ [232,25], and the Wess–Zumino–Witten (WZW) term [78] is described by the action
\[
\Gamma_{WZW} = -\frac{iN_c}{240\pi^2} \int_{\mathcal M} d^5r\,\epsilon^{\mu\nu\alpha\beta\gamma}\,\mathrm{tr}(l_\mu l_\nu l_\alpha l_\beta l_\gamma) , \tag{7.24}
\]
where $N_c$ is the number of colors and the integral is done on the five-dimensional manifold $\mathcal M = V \times S^1 \times I$ with the three-space volume $V$, the compactified time $S^1$ and the unit interval $I$ needed for a local form of the WZW term. Here note that we have used the three-space volume $V$ instead of $\bar V$ of the CBM case.
Using Eq. (C.20) in Appendix C.2 and following the Klebanov and Westerberg’s quantization scheme [230] for the strangeness Aavor direction in the BFT formalism, one can obtain the Hamiltonian of the form Nc † ("K − 1)a a 8I20
1 1 -2 + − 1 + (" − 1) a†˜I · ˜
-22 + 2I10 (-1 − -2 ) − ("K − 1) ("K − 1) (a† a)2 ; 4I10 I20
H = M0 + 12 -0 m2 +
1 ˜2 2I10 (I
+ 14 ) +
(7.25)
where
\[
\mu_K = \left( 1 + \frac{\chi^2 m_K^2 - m_\pi^2 + \Gamma_3/\Gamma_0}{m_0^2} \right)^{1/2}, \qquad
m_0 = \frac{N_c}{4(\Gamma_0 \mathcal{I}_{20})^{1/2}}
\]
and a† is creation operator for constituent strange quarks and we have ignored the irrelevant creation operator b† for strange antiquarks [230]. Then, introducing the angular momentum of the strange quarks ˜Js = 12 a†˜
Here note that the FSB effects are included in $c$ and $\bar c$ through $\Gamma_1$, $\Gamma_2$, $\mathcal{I}_{20}$, and through $\chi$ and $\Gamma_3$ in $\mu_K$. The Hamiltonian (7.26) then yields the structure of the hyperfine splittings as follows:
\[
\delta M = \frac{1}{2\mathcal{I}_{10}} \left[ cJ(J+1) + (1-c)\left( I(I+1) - \frac{Y^2-1}{4} \right) + (1 + \bar c - 2c)\,\frac{Y^2-1}{4} + \frac14 (1 + \bar c - c) \right] , \tag{7.27}
\]
where $\vec J = \vec I + \vec J_s$ is the total angular momentum of the quarks, and $c$ and $\bar c$ are the modified quantities due to the existence of the FSB effect as shown above. Now using the experimental values of the pion and kaon decay constants $f_\pi = 93$ MeV and $f_K = 114$ MeV, we fix the value of the Skyrmion parameter $e$ to fit the experimental data of $c_{\rm exp} = 0.67$ to yield the predictions for the values of $c$ and $\bar c$
\[
c = 0.67, \qquad \bar c = 0.56 \tag{7.28}
\]
which are contained in Table 8, together with the experimental data and the SU(3) rigid rotator predictions without pion mass. For the massless and massive rigid rotator approaches, we have used the above values for the decay constants $f_\pi$ and $f_K$ to obtain both the predictions in the FS and FSB cases. As a result, we have explicitly shown that the more realistic physics considerations via the pion mass and the FSB terms improve both the $c$ and $\bar c$ values, as shown in Table 8 [97].

Table 8
The values of $c$ and $\bar c$ in the massless pion and massive pion rigid rotator approaches to the SU(3) Skyrmions compared with the experimental data^a

Source                               $c$     $\bar c$
Rigid rotator, massless and FS       0.92    0.86
Rigid rotator, massless and FSB      0.82    0.69
Rigid rotator, massive and FS        0.79    0.66
Rigid rotator, massive and FSB       0.67    0.56
Experiment                           0.67    0.27

^a For the rigid rotator approaches, both the predictions in the flavor symmetric (FS) case and the flavor symmetry breaking (FSB) one are listed.

7.3. Berry phase and Casimir energy in SU(3) Skyrmion

Now we investigate the relations between the Hamiltonian (7.26) and the Berry phases [233]. In the Berry phase approach to the SU(3) Skyrmion, the Hamiltonian takes the simple form [39]
\[
H^* = \epsilon_K + \frac{1}{8\mathcal{I}_1} \left( \vec R^2 - 2g_K\,\vec R\cdot\vec T_K + g_K^2\,\vec T_K^2 \right) , \tag{7.29}
\]
where $\epsilon_K$ is the eigenenergy in the $K$ state, $g_K$ is the Berry charge, $\vec R$ ($\vec L$) are the right (left) generators of the group SO(4) $\approx$ SU(2) $\times$ SU(2) and $\vec T_K$ is the angular momentum of the "slow" rotation. We recall that $\vec I = \vec L/2 = -\vec R/2$ and $\vec L^2 = \vec R^2$ on $S^3$. Applying the BFT scheme to the Hamiltonian (7.29), we can obtain the Hamiltonian of the form
\[
\tilde H^* = \epsilon_K + \frac{1}{2\mathcal{I}_1} \left( \vec I^2 + g_K\,\vec I\cdot\vec T_K + \left( \frac{g_K}{2} \right)^2 \vec T_K^2 + \frac14 \right) . \tag{7.30}
\]
In the case with the relation $\bar c = c^2$, the Hamiltonian (7.29) is equivalent to $\tilde H^*$ in the Berry phase approach, where the corresponding physical quantities can be read off as follows:
\[
\epsilon_K = M_0 + \tfrac12 \Gamma_0 m_\pi^2 + \omega\,a^\dagger a, \qquad \vec T_K = \vec J_s, \qquad g_K = 2c . \tag{7.31}
\]
The same case with the Hamiltonian (7.30) follows from the quark model and the bound state approach with the quartic terms in the kaon field neglected. In fact, the strange–strange interactions in the Hamiltonian (7.26) break these relations to yield the numerical values of $\bar c$ in Table 8.

Next, the baryon mass spectrum in the chiral models can be described in powers of $N_c$ as follows:
\[
H = E_1 N_c + E_0 N_c^0 + E_{-1} N_c^{-1} + \cdots , \tag{7.32}
\]
where the ellipsis stands for the contributions from the higher-order terms of $N_c^{-1}$. Note that, for instance in Eq. (7.26), $E_1$ and $E_{-1}$ correspond to $M_0 + \tfrac12 \Gamma_0 m_\pi^2 + \omega\,a^\dagger a$ and the terms from the rotational degrees of freedom associated with the moment of inertia $1/\mathcal{I}_{10}$, respectively. Moreover, in fitting the values of the pion and kaon decay constants $f_\pi$ and $f_K$ and the value of the Skyrmion parameter $e$, as in the numerical evaluations of Tables 7 and 8 for instance, we have missed the Casimir effect contributions, with which one can improve the predictions to obtain more realistic phenomenology. Now, in order to take into account the missing order $N_c^0$ effects, we consider the Casimir energy contributions to the Hamiltonian (7.26). The Casimir energy originating from the meson fluctuation can be given by the phase shift formula [234,235] 1 ∞ a T p 2 E0 = dp − (+(p) − aT0 p3 − aT1 p) + 2 2 i=;K 2 p + "2 2 0 p +m i
3 8
− aT0 m4i
"2
3 1 + ln 4 2 m2i
"2
1 + aT1 m2i 1 + ln 2 4 mi
− mi +(0) + · · · ;
where the ellipsis denotes the contributions from the counterterms and the bound states (if any). Here $\mu$ is the energy scale, $\delta(p)$ is the phase shift with the momentum $p$, and the coefficients $\bar a_i$ ($i = 0, 1, 2$) are defined by the asymptotic expansion of $\delta'(p)$, namely,
\[
\delta'(p) = 3\bar a_0 p^2 + \bar a_1 - \bar a_2/p^2 + \cdots .
\]
Even though the Casimir energy correction does not contribute to the ratios $c$ and $\bar c$, since these ratios are associated with the order $1/N_c$ piece of the Hamiltonian (7.26), these effects are significant in the baryon mass itself [234,235] given in Eqs. (3.22) and (7.26), and also seem to be significant in other physical quantities such as the H dibaryon mass [231].

Now, we would like to briefly comment on the numerical estimation of the Casimir energy. Even though it is difficult to determine the magnitude of the Casimir energy due to the ambiguity in using the derivative expansion in the chiral soliton models, the magnitude is known to depend on the dynamical details of the Lagrangian and loop corrections, and its sign is estimated to be negative. The preliminary calculations produce a Casimir energy in the range $-(200$–$1000)$ MeV [236], and later, more reliable estimations yield $-(500$–$600)$ MeV [237].
8. Superqualiton model

8.1. Color–flavor-locking phase and Q-matter

So far we have studied the phenomenology of hadron physics without introducing matter density degrees of freedom. In this section, we consider the possibilities of applying chiral models such as the superqualiton model to dense matter physics. Here note that there is a somewhat intriguing similarity between the hadron–quark continuity [109] and the CCP. In other words, Schäfer and Wilczek proposed that the three-flavor color–flavor locking (CFL) operative at asymptotic density continues up to the chiral transition density, in which case there will be hadron–quark continuity since there will be a one-to-one mapping between hadrons and quarks/gluons. Now, we consider quark matter with a finite baryon number described by QCD with a chemical potential, which restricts the system to have a fixed baryon number,
\[
\mathcal{L} = \mathcal{L}_{QCD} - \mu\,\bar\psi_i \gamma^0 \psi_i , \tag{8.1}
\]
where $\bar\psi_i \gamma^0 \psi_i$ is the quark number density and equal chemical potentials are assumed for different flavors, for simplicity. The ground state in the CFL phase is nothing but the Fermi sea where all quarks are gapped by Cooper-pairing; the octet has a gap $\Delta$ while the singlet has $2\Delta$. Equivalently, this system can be described in terms of bosonic degrees of freedom, which are small fluctuations of Cooper pairs. Following Ref. [113], we introduce bosonic variables, defined as
\[
U_{L\,ai}(x) \equiv \lim_{y\to x} \frac{|x-y|^{\gamma_m}}{\Delta(p_F)}\,\epsilon_{abc}\,\epsilon_{ijk}\,\psi_L^{bj}(-\vec v_F, x)\,\psi_L^{ck}(\vec v_F, y) , \tag{8.2}
\]
where $\gamma_m$ ($\sim \alpha_s$) is the anomalous dimension of the diquark field and $\psi(\vec v_F, x)$ denotes a quark field with momentum close to a Fermi momentum $\mu\vec v_F$ [118]. Similarly, we define $U_R$ in terms of right-handed quarks to describe the small fluctuations of the condensate of right-handed quarks. Since the bosonic fields, $U_{L,R}$, are colored, they will interact with gluons. In fact, the colored massless excitations will constitute the longitudinal components of gluons through the Higgs mechanism. Thus, the low-energy effective Lagrangian density for the bosonic fields in the CFL phase can be written as
\[
\mathcal{L}_{\rm eff} = \left[ \tfrac14 F^2\,\mathrm{tr}(\partial_\mu U_L^\dagger \partial^\mu U_L) + n_L \mathcal{L}_{WZW} + (L \leftrightarrow R) \right] + \mathcal{L}_m - \tfrac14 F^A_{\mu\nu} F^{\mu\nu A} + g_s G^A_\mu J^{\mu A} + \cdots , \tag{8.3}
\]
where $\mathcal{L}_m$ is the meson mass term and the ellipsis denotes the higher-order terms in the derivative expansion, including mixing terms between $U_L$ and $U_R$. The gluons couple to the bosonic fields through a minimal coupling with a conserved current, given as
\[
J^{A\mu} = \frac{i}{2} F^2\,\mathrm{tr}\,U_L^{-1} T^A \partial^\mu U_L + \frac{1}{24\pi^2}\,\epsilon^{\mu\nu\rho\sigma}\,\mathrm{tr}\,T^A U_L^{-1}\partial_\nu U_L\,U_L^{-1}\partial_\rho U_L\,U_L^{-1}\partial_\sigma U_L + (L \leftrightarrow R) + \cdots , \tag{8.4}
\]
where the ellipsis denotes the currents from the higher-order derivative terms in Eq. (8.3). $F$ is a quantity analogous to the pion decay constant, calculated to be $F \sim \mu$ in the CFL color superconductor [125]. The Wess–Zumino–Witten (WZW) term [78] is described by the action (7.24) in the previous section. The coefficients of the WZW term in the effective Lagrangian (8.3) have been shown to be $n_{L,R} = 1$ by matching the flavor anomalies [113], which was later confirmed by an explicit calculation [127]. Among the small fluctuations of condensates, the colorless excitations correspond to genuine Nambu–Goldstone (NG) bosons, which can be described by a color singlet combination of $U_{L,R}$ [122,112], given as
\[
\Sigma_i^{\,j} \equiv U_{L\,ai}\,U_R^{*aj} . \tag{8.5}
\]
The NG bosons transform under the SU(3)$_L$ $\times$ SU(3)$_R$ chiral symmetry as
\[
\Sigma \to g_L \Sigma g_R^\dagger , \qquad \text{with } g_{L,R} \in \mathrm{SU}(3)_{L,R} . \tag{8.6}
\]
Since the chiral symmetry is explicitly broken by current quark mass, the instanton effects, and the electromagnetic interaction, the NG bosons will get mass, which has been calculated by various groups [125,122,119]. Here we focus on the meson mass due to the current strange quark mass ($m_s$), since it will be dominant for the intermediate density. Then, the meson mass term is simplified as
\[
\mathcal{L}_m = C\,\mathrm{tr}(M^T \Sigma)\cdot\mathrm{tr}(M^* \Sigma^\dagger) + O(M^4) , \tag{8.7}
\]
where $M = \mathrm{diag}(0, 0, m_s)$ and $C \sim \Delta^4/\mu^2 \cdot \ln(\mu^2/\Delta^2)$. (Note that in general, there will be two more mass terms quadratic in $M$. But they all vanish if we neglect the current mass of up and down quarks and also the small color-sextet component of the Cooper pair [122].)

Now, let us try to describe the CFL color superconductor in terms of the bosonic variables. We start with the effective Lagrangian described above, which is good at low energy, without putting in the quark fields. As in the Skyrmion model of baryons, we anticipate that the gapped quarks come out as solitons, made of the bosonic degrees of freedom. That the Skyrme picture can be realized in the CFL color superconductor is already shown in Ref. [113], but there the mass of the soliton is not properly calculated. Here, by identifying the correct ground state of the CFL superconductor in the bosonic description, we find that the superqualitons have the same quantum numbers as quarks, with mass of the order of the gap, showing that they are really the gapped quarks in the CFL color superconductor. Furthermore, upon quantizing the zero modes of the soliton, we find that high spin excitations of the soliton have energy of order $\mu$, way beyond the scale where the effective bosonic description is applicable, which we interpret as the absence of high-spin quarks, in agreement with the fermionic description. It is interesting to note that, as we will see below, by calculating the soliton mass in the bosonic description, one finds the coupling and the chemical potential dependence of the Cooper-pair gap, at least numerically, which gives us a complementary way, if not a better one, of estimating the gap.

As the baryon number (or the quark number) is conserved, though spontaneously broken,^11 the ground state in the bosonic description should have the same baryon (or quark) number as the ground state in the fermionic description. Under the U(1)$_Q$ quark number symmetry, the bosonic fields transform as
\[
U_{L,R} \to e^{i\theta Q}\,U_{L,R}\,e^{-i\theta Q} = e^{2i\theta}\,U_{L,R} , \tag{8.8}
\]
where $Q$ is the quark number operator, given in the bosonic description as
\[
Q = i\,\frac{F^2}{4} \int d^3x\,\mathrm{tr}\left[ U_L^\dagger \partial_t U_L - \partial_t U_L^\dagger U_L + (L \leftrightarrow R) \right] , \tag{8.9}
\]
neglecting the quark number coming from the WZW term, since the ground state has no nontrivial topology. The energy in the bosonic description is
\[
E = \frac{F^2}{4} \int d^3x\,\mathrm{tr}\left[ |\partial_t U_L|^2 + |\vec\nabla U_L|^2 + (L \leftrightarrow R) \right] + E_m + \delta E , \tag{8.10}
\]
where $E_m$ is the energy due to meson mass and $\delta E$ is the energy coming from the higher derivative terms. Assuming the meson mass energy is positive and $E_m + \delta E \geq 0$, which is reasonable because $\Delta/F \ll 1$, we can take, dropping the positive terms due to the spatial derivative,
\[
E \geq \frac{F^2}{4} \int d^3x\,\mathrm{tr}\left[ |\partial_t U_L|^2 + (L \leftrightarrow R) \right] \ (\equiv E_Q) . \tag{8.11}
\]
Since for any number $\alpha$
\[
\int d^3x\,\mathrm{tr}\left[ |U_L + i\alpha\,\partial_t U_L|^2 + (L \leftrightarrow R) \right] \geq 0 , \tag{8.12}
\]

^11 The spontaneously broken baryon number just means that the states in the Fock space do not have a well-defined baryon number. But the baryon number current is still conserved in the operator sense [238].
we get the following Schwarz inequality,
\[
Q^2 \leq I\,E_Q , \tag{8.13}
\]
where we defined
\[
I = \frac{F^2}{4} \int d^3x\,\mathrm{tr}\left[ U_L U_L^\dagger + (L \leftrightarrow R) \right] . \tag{8.14}
\]
Note that the lower bound in Eq. (8.13) is saturated for $E_Q = \omega Q$ or
\[
U_{L,R} = e^{i\omega t} \qquad \text{with } \omega = \frac{Q}{I} . \tag{8.15}
\]
The ground state of the color superconductor, which has the lowest energy for a given quark number $Q$, is nothing but the so-called Q-matter, or the interior of a very large Q-ball [239,240]. Since in the fermionic description the system has the quark number $Q = \mu^3/\pi^2 \int d^3x = \mu^3/\pi^2 \cdot I/F^2$, we find, using $F \simeq 0.209\mu$ [125],
\[
\omega = \frac{1}{\pi^2} \left( \frac{\mu}{F} \right)^3 F \simeq 2.32\mu . \tag{8.16}
\]
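The numerical value in Eq. (8.16) follows directly from the quoted ratio $F \simeq 0.209\mu$. The short sketch below assumes Eq. (8.16) is read as $\omega = (1/\pi^2)(\mu/F)^3 F$ and works in units of $\mu$; it also evaluates $4\pi F$ in the same units for comparison. The constant and function names are illustrative.

```python
import math

F_OVER_MU = 0.209  # F / mu as quoted in the text from Ref. [125]


def omega_over_mu(f_over_mu):
    """omega / mu, reading Eq. (8.16) as omega = (1/pi^2) (mu/F)^3 F."""
    return (1.0 / math.pi ** 2) * (1.0 / f_over_mu) ** 3 * f_over_mu


def four_pi_f_over_mu(f_over_mu):
    """The scale 4*pi*F in units of mu."""
    return 4.0 * math.pi * f_over_mu


omega = omega_over_mu(F_OVER_MU)
scale = four_pi_f_over_mu(F_OVER_MU)
print(f"omega = {omega:.2f} mu, 4*pi*F = {scale:.2f} mu, ratio = {omega / scale:.3f}")
```

With the rounded input 0.209 the two scales come out within roughly 12% of each other, so the energy per unit quark number of the Q-matter is of the order of $4\pi F$.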
In passing, we note that numerically $\omega$ is very close to $4\pi F$. The ground state of the system in the bosonic description is a Q-matter whose energy per unit quark number is $\omega$. Now, let us suppose we consider creating a $Q = 1$ state out of the ground state. In the fermionic description, this corresponds to exciting a gapped quark in the Fermi sea into a free state, which costs energy at least $2\Delta$. In the bosonic description, this amounts to creating a superqualiton out of the Q-matter, while reducing the quark number of the Q-matter by one. Therefore, since we gain energy $\omega$ by reducing the quark number of the Q-matter by one, the energy cost to create a gapped quark from the ground state in the bosonic description is
\[
\delta E = M_Q - \omega , \tag{8.17}
\]
where $M_Q$ is the energy of the superqualiton configuration. From the relation $2\Delta = M_Q - \omega$, we later estimate numerically the coupling and the chemical potential dependence of the Cooper gap.

8.2. Bosonization of QCD at high density

It is sometimes convenient to describe a system of interacting fermions in terms of bosonic variables, since often in that description the interaction of elementary excitations becomes weak and perturbative approaches are applicable (see, e.g., Ref. [241]). Now, we attempt to bosonize cold quark matter of three light flavors, where the low-lying energy states are bosonic. Following the Skyrme picture of baryons in QCD at low density, we now investigate how gapped quarks in high density QCD are realized in its bosonic description with the Lagrangian given in Eq. (8.3) [113,134]. Assuming the maximal symmetry in the superqualiton, we seek a static configuration for the field $U_L$ which is the SU(2) hedgehog in color–flavor in SU(3) as in (2.19)
\[
U_{Lc}(\vec x) = \begin{pmatrix} e^{i\vec\tau\cdot\hat x\,\theta(r)} & 0 \\ 0 & 1 \end{pmatrix} , \tag{8.18}
\]
where $\theta(r)$ is the chiral angle determined by minimizing the static mass $M_0^Q$ given below, and for unit winding number we take $\lim_{r\to\infty}\theta(r) = 0$ and $\theta(0) = \pi$. The static configuration for the other fields is described as
\[
U_R = 0, \qquad G_0^A = \frac{x^A}{r}\,\omega(r), \qquad G_i^A = 0 . \tag{8.19}
\]
Now we consider the zero modes of the SU(3) superqualiton as follows:
\[
U(\vec x, t) = A(t)\,U_{Lc}(\vec x)\,A(t)^\dagger . \tag{8.20}
\]
The Lagrangian for the zero modes is then given by
\[
L = -M_0^Q + \frac12\,I_{ab}\,\mathrm{tr}\left( A^\dagger \dot A\,\frac{\lambda_a}{2} \right) \mathrm{tr}\left( A^\dagger \dot A\,\frac{\lambda_b}{2} \right) - \frac{i}{2}\,\mathrm{tr}(Y A^\dagger \dot A) , \tag{8.21}
\]
where Iab is an invariant tensor on M = SU(3)=U(1) and Y is the hypercharge 1 0 0 1 )8 Y = √ = 0 1 0 : 3 3 0 0 −2 Using the above static con>guration, we obtain the static mass M0 and the tensor Iab as follows: 4 2 ∞ s 1 2 d; 2 ; − sin ; cos ; − 2 −2mE r Q 2 M0 = F dr r + sin ; + 3 2 e ; 3 2 dr 2 F 2r 0
Iab =
32 2 ∞ dr r 2 sin2 ; = − 4I1 ; − 9 F 0 ∞
8
− F2 3
0
a = b = 1; 2; 3
dr r 2 (1 − cos ;) = − 4I2 ; a = b = 4; 5; 6; 7
0;
(8.22)
a=b=8
where $\alpha_s$ is the strong coupling constant and $m_E = \mu(6\alpha_s/\pi)^{1/2}$ is the electric screening mass for the gluons. As in Appendix C.2, since $A$ belongs to SU(3), $A^\dagger \dot A$ is anti-Hermitian and traceless, and can be expressed as a linear combination of $i\lambda_a$ as follows:
\[
A^\dagger \dot A = iF v^a \lambda_a = iF \begin{pmatrix} \vec v\cdot\vec\tau + \nu\,1 & V \\ V^\dagger & -2\nu \end{pmatrix} ,
\]
where $\vec v$, $V$ and $\nu$ are given by Eq. (C.11). The Lagrangian is then expressed as
\[
L = -M_0^Q + 2F^2 \mathcal{I}_1 \vec v^{\,2} + 2F^2 \mathcal{I}_2 V^\dagger V + \tfrac13 N_c F \nu . \tag{8.23}
\]
In order to separate the SU(2) rotations from the deviations into strange directions, we write the time-dependent rotations as in Eq. (C.13). Furthermore, we exploit the time-dependent collective coordinates a" = (a0 ;˜a) (" = 0; 1; 2; 3) as in the SU(2) Skyrmion [64], via A(t) = a0 + i˜a · ˜<, and the small rigid oscillations S described by Eqs. (C.14) and (C.15). After some algebra, one can obtain the relations among the variables in (C.11) and the SU(2) collective coordinates a" and the strange deviations D such as i † F& = (D† D˙ − D˙ D) − D† (a0˜a˙ − a˙0˜a + ˜a × ˜a) ˙ · ˜
(8.24)
to yield the superqualiton Lagrangian to order 1=Nc i † † L = −M0Q + 2I1 a˙" a˙" + 4I2 D˙ D˙ + Nc (D† D˙ − D˙ D) − 4I2 m2K D† D 6 ˙ · ˜
†
− D˙ (a0˜a˙ − a˙0˜a + ˜a × ˜a) ˙ · ˜
8 i † † + 2I2 (D† D˙ − D˙ D)2 − Nc (D† D˙ − D˙ D)D† D + I2 m2K (D† D)2 ; 9 3
(8.25)
where we have included the kaon mass terms proportional to the strange quark mass which is not negligible. The momenta " and s , conjugate to the collective coordinates a" and the strange deviation D† are given by 1 † 0 = 4I1 a˙0 − 2i(I1 − 2I2 )(D†˜a · ˜
1 3
− Nc D† (a0˜< − ˜a × ˜<)D ;
i s = 4I2 D˙ − Nc D − 2i(I1 − 2I2 )(a0˜a˙ − a˙0˜a + ˜a × ˜a) ˙ · ˜
i 9
− 4I2 (D† D˙ − D˙ D)D + Nc (D† D)D
satisfying the Poisson brackets {a" ; & } = +"& , {D† ; s } = {D ; s;† } = + . Performing Legendre transformation, we obtain the Hamiltonian to order 1=Nc as follows: 2 1 " " 1 † Nc Nc Q † † 2 + s − i (D s − s D) + + 4I2 mK D† D H = M0 + 8I1 4I2 s 24I2 144I2 1 1 {D† (a0˜ − ˜a0 + ˜a × ˜) · ˜<s − s† (a0˜ − ˜a0 + ˜a × ˜) · ˜
1 (D† s − s† D)2 8I2 2 Nc 8 Nc † † † 2 −i (D s − s D)(D D) + − I2 mK (D† D)2 : (8.26) 24I2 108I2 3 Applying the BFT scheme [89,91,94] to the above result with the auxiliary >elds (;; ; ), one can obtain the >rst-class Hamiltonian 1 " a& a& H˜ = M0Q + ( − a" ; )(" − a" ; ) & & 8I1 a a + 2; 2 Nc Nc 1 † † † 2 s − i (D s − s D) + + 4I2 mK D† D + 4I2 s 24I2 144I2 1 1 {D† (a0˜ − ˜a0 + ˜a × ˜) · ˜<s − +i 4I1 8I2 Nc † 0 − s† (a0˜ − ˜a0 + ˜a × ˜) · ˜
1 1 − 12I2 8I1
(D† s + s† D)2 −
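where the ellipsis stands for the strange–strange interaction terms of order $1/N_c$, which can be readily read off from Eq. (8.26). Following the Klebanov and Westerberg quantization scheme [230] for the strangeness flavor direction, one can obtain the Hamiltonian of the form
\[
\tilde H = M_0^Q + \nu_K\,a^\dagger a + \frac{1}{2\mathcal{I}_1} \left( \vec I^2 + 2c\,\vec I\cdot\vec J_s + \bar c\,\vec J_s^2 + \frac14 \right) , \tag{8.28}
\]
where $\vec I$ and $\vec J_s$ are the isospin and angular momentum for the strange quarks and
\[
\nu_K = \frac{N_c}{24\mathcal{I}_2}(\mu_Q - 1), \qquad
c = 1 - \frac{\mathcal{I}_1}{2\mathcal{I}_2\mu_Q}(\mu_Q - 1), \qquad
\bar c = 1 - \frac{\mathcal{I}_1}{\mathcal{I}_2\mu_Q^2}(\mu_Q - 1)
\]
with
\[
\mu_Q = \left( 1 + \frac{m_K^2}{m_Q^2} \right)^{1/2}, \qquad m_Q = \frac{N_c}{24\mathcal{I}_2} .
\]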
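To make the dependence of the hyperfine parameters on the kaon mass concrete, the definitions following Eq. (8.28) can be coded directly. This is a sketch under the assumption that those definitions read $m_Q = N_c/(24\mathcal{I}_2)$, $\mu_Q = (1 + m_K^2/m_Q^2)^{1/2}$, $\nu_K = N_c(\mu_Q - 1)/(24\mathcal{I}_2)$, $c = 1 - \mathcal{I}_1(\mu_Q - 1)/(2\mathcal{I}_2\mu_Q)$ and $\bar c = 1 - \mathcal{I}_1(\mu_Q - 1)/(\mathcal{I}_2\mu_Q^2)$. The numerical inputs in the usage line are hypothetical and are not values from the text.

```python
import math


def superqualiton_couplings(i1, i2, m_k, n_c=3):
    """Return m_Q, mu_Q, nu_K, c and c_bar from the definitions after Eq. (8.28).

    i1, i2 : moments of inertia (inverse energy units)
    m_k    : kaon mass (energy units)
    n_c    : number of colors
    """
    m_q = n_c / (24.0 * i2)
    mu_q = math.sqrt(1.0 + (m_k / m_q) ** 2)
    nu_k = n_c / (24.0 * i2) * (mu_q - 1.0)
    c = 1.0 - i1 / (2.0 * i2 * mu_q) * (mu_q - 1.0)
    c_bar = 1.0 - i1 / (i2 * mu_q ** 2) * (mu_q - 1.0)
    return {"m_Q": m_q, "mu_Q": mu_q, "nu_K": nu_k, "c": c, "c_bar": c_bar}


# Hypothetical inputs, chosen only to exercise the formulas.
print(superqualiton_couplings(i1=1.0, i2=0.5, m_k=0.1))
```

In the limit $m_K \to 0$ one has $\mu_Q \to 1$, so $\nu_K \to 0$ and $c = \bar c = 1$: the strange direction then costs no extra energy and the three rotational terms in (8.28) combine into $(\vec I + \vec J_s)^2$.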
Here note that $a^\dagger$ is the creation operator for constituent strange quarks and the factor $\tfrac14$ originates from BFT corrections [94], which are applicable only to u- and d-superqualitons. The Hamiltonian (8.28) then yields the mass spectrum of the superqualiton as follows [134]:
\[
M_Q = M_0^Q - \left( Y - \frac13 \right) \nu_K + \frac{1}{2\mathcal{I}_1} \left[ cJ(J+1) + (1-c)I(I+1) + (\bar c - c)\,\frac{(Y - 1/3)(Y - 7/3)}{4} + \frac14\,\delta_{I,1/2} \right] , \tag{8.29}
\]
with the total angular momentum of the quark $\vec J = \vec I + \vec J_s$.
Table 9
The dependence of superqualiton masses on the coupling $\alpha_s$ with $m_K/F = 0.3$

$\alpha_s$    $M_Q(u)/4\pi F$    $M_Q(s)/4\pi F$    $M_u/4\pi F$    $M_s/4\pi F$
0.050         1.040              1.061              0.078           0.089
0.100         1.040              1.061              0.078           0.089
0.150         1.041              1.061              0.079           0.089
0.200         1.041              1.061              0.079           0.089
0.250         1.041              1.061              0.079           0.089
0.300         1.041              1.062              0.079           0.089
0.350         1.041              1.062              0.079           0.089
0.400         1.042              1.062              0.079           0.089
0.450         1.042              1.062              0.079           0.089
0.500         1.042              1.062              0.079           0.089
0.550         1.042              1.062              0.079           0.089
0.600         1.042              1.062              0.079           0.089
0.650         1.042              1.062              0.079           0.090
0.700         1.042              1.063              0.079           0.090
0.750         1.042              1.062              0.079           0.090
0.800         1.042              1.063              0.079           0.090
0.850         1.042              1.063              0.079           0.090
0.900         1.042              1.063              0.079           0.090
0.950         1.042              1.063              0.080           0.090
1.000         1.043              1.063              0.080           0.090
Unlike creating Skyrmions out of the Dirac vacuum, in dense matter the energy cost to create a superqualiton should be compared with the Fermi sea. By creating a superqualiton, we have to remove one quark in the Fermi sea since the total baryon number has to remain unchanged. Similar to the Cooper pair mechanism [242], from Eq. (8.17), twice the u- and s-superqualiton masses are then given by
\[
2M_u = M_0^Q + \frac{1}{2\mathcal{I}_1} - \omega , \qquad
2M_s = M_0^Q + \nu_K + \frac{3}{8\mathcal{I}_1}\,\bar c - \omega \tag{8.30}
\]
to yield the predictions for the values of $M_u$ ($= M_d$) and $M_s$
\[
\begin{aligned}
M_u &= 0.079 \times 4\pi F, & M_s &= 0.081 \times 4\pi F, & &\text{for } m_K/F = 0.1 , \\
M_u &= 0.079 \times 4\pi F, & M_s &= 0.089 \times 4\pi F, & &\text{for } m_K/F = 0.3 , \\
M_u &= 0.079 \times 4\pi F, & M_s &= 0.109 \times 4\pi F, & &\text{for } m_K/F = 0.8 ,
\end{aligned} \tag{8.31}
\]
which are comparable to the Cooper gap [134,131]. To see if the estimated superqualiton mass is indeed the Cooper gap, one needs to compare our numerical results with the analytic expression for the coupling dependence of the gap. In Table 9, we show the dependence of superqualiton masses on the strong coupling constant $\alpha_s$.
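As a rough consistency check on Eq. (8.17) and Table 9, one can subtract $\omega$ from the tabulated superqualiton energies. The sketch below assumes that the columns $M_Q(u)/4\pi F$ and $M_Q(s)/4\pi F$ are the full superqualiton energies entering $2M = M_Q - \omega$, and that $\omega/(4\pi F) = 1/(4\pi^3 (F/\mu)^3)$ follows from Eq. (8.16) with the rounded input $F/\mu = 0.209$. Both readings are assumptions, and the names are illustrative.

```python
import math

F_OVER_MU = 0.209  # rounded value quoted in the text from Ref. [125]

# omega / (4 pi F), from omega = mu^3 / (pi^2 F^2) divided by 4 pi F
OMEGA_OVER_4PIF = 1.0 / (4.0 * math.pi ** 3 * F_OVER_MU ** 3)


def gap_from_energy(mq_over_4pif):
    """Half of (M_Q - omega), in units of 4 pi F, following Eq. (8.17)."""
    return 0.5 * (mq_over_4pif - OMEGA_OVER_4PIF)


# First and last rows of Table 9 (m_K / F = 0.3).
for alpha_s, mq_u, mq_s in [(0.050, 1.040, 1.061), (1.000, 1.043, 1.063)]:
    print(alpha_s, round(gap_from_energy(mq_u), 3), round(gap_from_energy(mq_s), 3))
```

Within the rounding of the tabulated entries, this reproduces the last two columns of Table 9, which suggests that the weak $\alpha_s$ dependence of $M_u$ and $M_s$ simply tracks that of $M_Q$.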
We fit the numerical results for the gap, in units of $4\pi F$, to the form
\[
\log(M_u) = a \log(\alpha_s) + b\,\alpha_s^{-1/2} + c , \tag{8.32}
\]
and get $a = 0.00085$, $b = -0.00233$, and $c = 0.1193$. This is very different from the analytic expression obtained in the literature,
\[
\Delta \sim \frac{\mu}{g_s^5} \exp\left( -\frac{3\pi^2}{\sqrt2\,g_s} \right) . \tag{8.33}
\]
As suggested in Ref. [130], the weak coupling result, Eq. (8.33), is applicable only when the coupling is extremely small or the chemical potential is very large. In our numerical analysis, we are unable to probe this region.

Acknowledgements

STH would like to deeply thank R.D. McKeown for the warm hospitality at the Kellogg Radiation Laboratory, Caltech, where a part of this work has been done. The authors are grateful to G.E. Brown, D.K. Hong, W.T. Kim, Y.W. Kim, B.H. Lee, H.K. Lee, S.H. Lee, R.D. McKeown, C.M. Maekawa, D.P. Min, B.Y. Park and M. Rho for helpful discussions. The work of STH was supported by Grant No. 2000-2-11100-002-5 from the Basic Research Program of the Korea Science Engineering Foundation and that of Y.J. Park was supported by the Korea Research Foundation Grant No. KRF-2000-015-DP0070.

Appendix A. Spin symmetries in the SU(3) group

In order to discuss the I-, U- and V-spin symmetries associated with the SU(3) group, we will briefly review the root diagram approach to the construction of the Lie algebra of the SU(3) group, which has eight generators. Since the rank of the SU(3) group is two, one can have the Cartan subalgebra [243,244], the set of two commuting generators $H_i$ ($i = 1, 2$) corresponding to $\lambda_3$ and $\lambda_8$
\[
[H_1, H_2] = 0 , \tag{A.1}
\]
and the other generators $E_\alpha$ ($\alpha = \pm1, \pm2, \pm3$) satisfying the commutator relations
\[
[H_i, E_\alpha] = e_i^\alpha E_\alpha , \qquad [E_\alpha, E_\beta] = N_{\alpha\beta} E_{\alpha+\beta} , \qquad [E_\alpha, E_{-\alpha}] = e_i^\alpha H_i , \tag{A.2}
\]
where $e_i^{\alpha}$ is the $i$th component of the root vector $\hat{e}^{\alpha}$ in a two-dimensional root space and $N_{\alpha\beta}$ is a normalization constant to be fixed. Normalizing the root vectors such that $\sum_{\alpha} e_i^{\alpha} e_j^{\alpha} = \delta_{ij}$, one can choose the root vectors

\[
\hat{e}^{1} = -\hat{e}^{-1} = \left( \frac{1}{\sqrt{3}},\, 0 \right), \qquad
\hat{e}^{2} = -\hat{e}^{-2} = \left( \frac{1}{2\sqrt{3}},\, \frac{1}{2} \right), \qquad
\hat{e}^{3} = -\hat{e}^{-3} = \left( -\frac{1}{2\sqrt{3}},\, \frac{1}{2} \right), \tag{A.3}
\]

as illustrated in Fig. 5, where one has two simple roots $\hat{e}^{2}$ and $\hat{e}^{-3}$ of equal length separated by an angle $2\pi/3$, so that one can obtain the Dynkin diagram [243,244] for the SU(3) Lie algebra given by ◦–◦. Substituting the root vectors in Fig. 5, normalized as in (A.3), into the relations (A.1) and (A.2), one can easily derive the commutator relations:

\[
\begin{aligned}
&[H_1, H_2] = 0, \qquad [H_1, E_1] = \frac{1}{\sqrt{3}} E_1, \qquad [H_1, E_2] = \frac{1}{2\sqrt{3}} E_2, \qquad [H_1, E_3] = -\frac{1}{2\sqrt{3}} E_3, \\
&[H_2, E_1] = 0, \qquad [H_2, E_2] = \frac{1}{2} E_2, \qquad [H_2, E_3] = \frac{1}{2} E_3, \\
&[E_1, E_{-1}] = \frac{1}{\sqrt{3}} H_1, \qquad [E_2, E_{-2}] = \frac{1}{2\sqrt{3}} H_1 + \frac{1}{2} H_2, \qquad [E_3, E_{-3}] = -\frac{1}{2\sqrt{3}} H_1 + \frac{1}{2} H_2, \\
&[E_1, E_3] = \frac{1}{\sqrt{6}} E_2, \qquad [E_2, E_{-3}] = \frac{1}{\sqrt{6}} E_1, \qquad [E_{-1}, E_2] = \frac{1}{\sqrt{6}} E_3 .
\end{aligned} \tag{A.4}
\]

Fig. 5. Root diagram for the SU(3) group. The simple root vectors $\hat{e}^{2}$ and $\hat{e}^{-3}$ can produce all the other root vectors through the operations of addition and $\hat{e}^{\alpha} = -\hat{e}^{-\alpha}$.
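The geometric statements made about the root vectors (normalization, the $2\pi/3$ angle between the simple roots, and the closure of the root system under addition) can be checked numerically. The sketch below assumes the root vectors of (A.3) as read here; the dictionary `roots` and the helper names are illustrative only.

```python
import math

SQRT3 = math.sqrt(3.0)

# Positive roots as in (A.3); the negative roots are their opposites.
roots = {1: (1 / SQRT3, 0.0), 2: (1 / (2 * SQRT3), 0.5), 3: (-1 / (2 * SQRT3), 0.5)}
roots.update({-k: (-x, -y) for k, (x, y) in list(roots.items())})

def gram(i, j):
    """Sum over all six roots of e_i^alpha * e_j^alpha."""
    return sum(r[i] * r[j] for r in roots.values())

def angle(u, v):
    """Angle between two root vectors in radians."""
    dot = u[0] * v[0] + u[1] * v[1]
    return math.acos(dot / (math.hypot(*u) * math.hypot(*v)))

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def close(u, v, tol=1e-12):
    return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol

# Normalization condition sum_alpha e_i^alpha e_j^alpha = delta_ij.
normalized = all(
    abs(gram(i, j) - (1.0 if i == j else 0.0)) < 1e-12
    for i in range(2) for j in range(2)
)
# Angle between the simple roots e^2 and e^{-3}.
simple_angle = angle(roots[2], roots[-3])
# Root additions underlying the last line of (A.4).
additions_ok = (
    close(add(roots[1], roots[3]), roots[2])
    and close(add(roots[2], roots[-3]), roots[1])
    and close(add(roots[-1], roots[2]), roots[3])
)
```

The three additions tested correspond to the commutators $[E_1, E_3]$, $[E_2, E_{-3}]$ and $[E_{-1}, E_2]$, whose right-hand sides must carry the summed root.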
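Associating the generators $H_i$ ($i = 1, 2$) and $E_\alpha$ ($\alpha = \pm 1, \pm 2, \pm 3$) with the physical operators $Y$, $I_3$, $I_\pm$, $U_\pm$ and $V_\pm$ through the definitions

\[
H_1 = \frac{1}{\sqrt{3}} I_3, \qquad H_2 = \frac{1}{2} Y, \qquad E_{\pm 1} = \frac{1}{\sqrt{6}} I_\pm, \qquad E_{\pm 3} = \frac{1}{\sqrt{6}} U_\pm, \qquad E_{\pm 2} = \frac{1}{\sqrt{6}} V_\pm, \tag{A.5}
\]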
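one can use the commutator relations (A.4) to yield the explicit expressions for the eigenvalue equations of the spin operators in the SU(3) group [184]:

\[
\begin{aligned}
I_+ |Y, I, I_3\rangle &= \left( (I - I_3)(I + I_3 + 1) \right)^{1/2} |Y, I, I_3 + 1\rangle, \\
U_+ |Y, I, I_3\rangle &= \left( a_+ (I - I_3 + 1) \right)^{1/2} |Y + 1, I + \tfrac{1}{2}, I_3 - \tfrac{1}{2}\rangle - \left( a_- (I + I_3) \right)^{1/2} |Y + 1, I - \tfrac{1}{2}, I_3 - \tfrac{1}{2}\rangle, \\
V_+ |Y, I, I_3\rangle &= \left( a_+ (I + I_3 + 1) \right)^{1/2} |Y + 1, I + \tfrac{1}{2}, I_3 + \tfrac{1}{2}\rangle + \left( a_- (I - I_3) \right)^{1/2} |Y + 1, I - \tfrac{1}{2}, I_3 + \tfrac{1}{2}\rangle,
\end{aligned} \tag{A.6}
\]

where the de Swart phase convention [184] is used and

\[
\begin{aligned}
a_+ &= \frac{\left( Y_+ + \tfrac{1}{3}(p - q) + 1 \right) \left( Y_+ + \tfrac{1}{3}(p + 2q) + 2 \right) \left( -Y_+ + \tfrac{1}{3}(2p + q) \right)}{(2I + 1)(2I + 2)}, \\
a_- &= \frac{\left( Y_- + \tfrac{1}{3}(p - q) \right) \left( Y_- + \tfrac{1}{3}(p + 2q) + 1 \right) \left( Y_- - \tfrac{1}{3}(2p + q) - 1 \right)}{2I(2I + 1)}
\end{aligned} \tag{A.7}
\]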
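with $Y_\pm = \tfrac{1}{2} Y \pm I$. Here $p$ and $q$ are nonnegative coefficients needed to construct bases for the IR $D(p, q)$ of the SU(3) group. The dimension $n$ of $D(p, q)$, namely the number of the basis vectors, is then given by $(1 + p)(1 + q)(1 + \tfrac{1}{2}(p + q))$ [184], so that one can denote the IRs of interest as follows:

\[
\begin{aligned}
\mathbf{1} &= D(0, 0), & \mathbf{3} &= D(1, 0), & \bar{\mathbf{3}} &= D(0, 1), \\
\mathbf{8} &= D(1, 1), & \mathbf{10} &= D(3, 0), & \overline{\mathbf{10}} &= D(0, 3), \\
\mathbf{27} &= D(2, 2), & \mathbf{35} &= D(4, 1), & \overline{\mathbf{35}} &= D(1, 4), \\
\mathbf{28} &= D(6, 0), & \mathbf{64} &= D(3, 3), & \mathbf{81} &= D(5, 2), \\
\overline{\mathbf{81}} &= D(2, 5) .
\end{aligned} \tag{A.8}
\]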
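The dimension formula quoted above fixes the labels in (A.8), which gives a quick consistency check on the list. The following sketch evaluates $(1 + p)(1 + q)(1 + \tfrac{1}{2}(p + q))$ with exact arithmetic; the function name `su3_dimension` and the string labels in `irreps` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def su3_dimension(p, q):
    """Dimension of the SU(3) irreducible representation D(p, q)."""
    value = (1 + p) * (1 + q) * (1 + Fraction(p + q, 2))
    # The product is always an integer, so the conversion is exact.
    return int(value)

# Representations listed in (A.8), keyed by an illustrative label.
irreps = {
    "1": (0, 0), "3": (1, 0), "3bar": (0, 1),
    "8": (1, 1), "10": (3, 0), "10bar": (0, 3),
    "27": (2, 2), "35": (4, 1), "35bar": (1, 4),
    "28": (6, 0), "64": (3, 3), "81": (5, 2), "81bar": (2, 5),
}

dims = {label: su3_dimension(p, q) for label, (p, q) in irreps.items()}
```

Each computed dimension should coincide with the numerical part of its label, for example `dims["35bar"]` is 35.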
Using the relations for the raising spin operators (A.6) and the similarly constructed relations for the lowering spin operators $I_-$, $U_-$ and $V_-$, one can derive the isoscalar factors [184] of the SU(3) group for the Clebsch–Gordan series which have been used in the previous sections. Fig. 6 depicts the spin symmetry operation diagram for the decuplet baryons.
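As a plausibility check on the coefficients $a_\pm$ of (A.7), one can compare the squared norm of $V_+ |Y, I, I_3\rangle$ from (A.6) with the value $V(V+1) - V_3(V_3+1)$ expected for an SU(2) raising operator. The sketch below assumes the forms of (A.6) and (A.7) as written above; the function names are illustrative, and $a_-$ is set to zero for $I = 0$, where no $I - \tfrac{1}{2}$ state exists.

```python
from fractions import Fraction as F

def a_plus(Y, I, p, q):
    """Coefficient a_+ of (A.7) with Y_+ = Y/2 + I."""
    yp = Y / 2 + I
    num = (yp + F(p - q, 3) + 1) * (yp + F(p + 2 * q, 3) + 2) * (-yp + F(2 * p + q, 3))
    return num / ((2 * I + 1) * (2 * I + 2))

def a_minus(Y, I, p, q):
    """Coefficient a_- of (A.7) with Y_- = Y/2 - I (taken as zero when I = 0)."""
    if I == 0:
        return F(0)
    ym = Y / 2 - I
    num = (ym + F(p - q, 3)) * (ym + F(p + 2 * q, 3) + 1) * (ym - F(2 * p + q, 3) - 1)
    return num / (2 * I * (2 * I + 1))

def v_plus_norm2(Y, I, I3, p, q):
    """Squared norm of V_+ |Y, I, I3> read off from (A.6)."""
    return a_plus(Y, I, p, q) * (I + I3 + 1) + a_minus(Y, I, p, q) * (I - I3)

# Triplet D(1,0): V_+ maps the s quark to the u quark and annihilates the u quark.
norm_s = v_plus_norm2(F(-2, 3), F(0), F(0), 1, 0)
norm_u = v_plus_norm2(F(1, 3), F(1, 2), F(1, 2), 1, 0)
# Octet D(1,1): Xi^- is the lowest member of a V-spin triplet.
norm_xi = v_plus_norm2(F(-1), F(1, 2), F(-1, 2), 1, 1)
# Decuplet D(3,0): Omega^- is the lowest member of a V-spin quartet.
norm_omega = v_plus_norm2(F(-2), F(0), F(0), 3, 0)
```

The values 1, 0, 2 and 3 obtained for these four states agree with $V(V+1) - V_3(V_3+1)$ for $V = \tfrac{1}{2}$, $1$ and $\tfrac{3}{2}$.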
Fig. 6. Spin symmetry operations in the baryon decuplet.
Appendix B. Inertia parameters in the chiral bag model

B.1. Angular part of the matrix element

In this section, we will derive the explicit expression of the quark phase inertia parameters in the CBM, which are to some extent abstractly described in the above sections, by considering one of the most complicated quantities, $N$, whose meson phase contribution is already explicitly given in the previous section. (For the other inertia parameters, see Refs. [25,33,35,36].) To obtain the explicit description of the quark phase inertia parameter $N_q$, we will first calculate the angular part of the matrix element ${}_h\langle m|\lambda_4 V_3|n\rangle_s$ in this section. Now one notes that the vector operator $V_i = \epsilon_{ijk} x_j \gamma^0 \gamma^k$ can be given in terms of $v_i = \epsilon_{ijk} \hat{r}_j \sigma_k$ as follows:

\[ V_i = \begin{pmatrix} 0 & r v_i \\ r v_i & 0 \end{pmatrix}, \tag{B.1} \]

where the unit vectors $\hat{r}_i$ ($i = 1, 2, 3$) can be expressed in terms of the spherical harmonics $Y_{l,m}(\theta, \phi)$:

\[
\begin{aligned}
\hat{r}_1 &= \sin\theta \cos\phi = \left( \frac{2\pi}{3} \right)^{1/2} (Y_{1,-1} - Y_{1,1}), \\
\hat{r}_2 &= \sin\theta \sin\phi = i \left( \frac{2\pi}{3} \right)^{1/2} (Y_{1,-1} + Y_{1,1}), \\
\hat{r}_3 &= \cos\theta = \left( \frac{4\pi}{3} \right)^{1/2} Y_{1,0} .
\end{aligned} \tag{B.2}
\]
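The decomposition (B.2) can be verified numerically on a grid of angles. This is a minimal sketch that assumes the Condon–Shortley phase convention for $Y_{1,\pm 1}$; the function names `y1`, `unit_vector_from_harmonics` and `unit_vector_direct` are illustrative.

```python
import cmath
import math

def y1(m, theta, phi):
    """Spherical harmonic Y_{1,m} with the Condon-Shortley phase (assumed convention)."""
    if m == 0:
        return complex(math.sqrt(3 / (4 * math.pi)) * math.cos(theta))
    amplitude = math.sqrt(3 / (8 * math.pi)) * math.sin(theta)
    if m == 1:
        return -amplitude * cmath.exp(1j * phi)
    if m == -1:
        return amplitude * cmath.exp(-1j * phi)
    raise ValueError("m must be -1, 0 or 1")

def unit_vector_from_harmonics(theta, phi):
    """Components of the unit vector built from Y_{1,m} as in (B.2)."""
    c = math.sqrt(2 * math.pi / 3)
    r1 = c * (y1(-1, theta, phi) - y1(1, theta, phi))
    r2 = 1j * c * (y1(-1, theta, phi) + y1(1, theta, phi))
    r3 = math.sqrt(4 * math.pi / 3) * y1(0, theta, phi)
    return r1, r2, r3

def unit_vector_direct(theta, phi):
    """Components of the unit vector in spherical polar coordinates."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )

# Largest deviation between the two expressions over a coarse angular grid.
thetas = [k * math.pi / 8 for k in range(9)]
phis = [k * math.pi / 6 for k in range(12)]
max_error = max(
    abs(h - d)
    for theta in thetas
    for phi in phis
    for h, d in zip(unit_vector_from_harmonics(theta, phi), unit_vector_direct(theta, phi))
)
```

The deviation stays at the level of floating-point rounding, which also confirms that the imaginary parts of the reconstructed components cancel.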
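Acting the unit vector operators on the eigenstate of the angular momentum $|j, m_j\rangle$, one can obtain the identities

\[
\begin{aligned}
\hat{r}_{1,2} |j, m_j\rangle &= \left[ \frac{(j - m_j + 1)(j - m_j + 2)}{4(2j + 1)(2j + 3)} \right]^{1/2} |j + 1, m_j - 1\rangle \mp \left[ \frac{(j + m_j + 1)(j + m_j + 2)}{4(2j + 1)(2j + 3)} \right]^{1/2} |j + 1, m_j + 1\rangle \\
&\quad - \left[ \frac{(j + m_j - 1)(j + m_j)}{4(2j - 1)(2j + 1)} \right]^{1/2} |j - 1, m_j - 1\rangle \pm \left[ \frac{(j - m_j - 1)(j - m_j)}{4(2j - 1)(2j + 1)} \right]^{1/2} |j - 1, m_j + 1\rangle, \\
\hat{r}_3 |j, m_j\rangle &= \left[ \frac{(j - m_j + 1)(j + m_j + 1)}{(2j + 1)(2j + 3)} \right]^{1/2} |j + 1, m_j\rangle + \left[ \frac{(j - m_j)(j + m_j)}{(2j - 1)(2j + 1)} \right]^{1/2} |j - 1, m_j\rangle .
\end{aligned} \tag{B.3}
\]

Now the angular parts of the s-quark eigenstates with $\kappa = \pm 1$ corresponding to $j = l \pm \tfrac{1}{2}$ are given in terms of the quantum numbers $j$ and $m_j$ and the spin states $|\uparrow\rangle$ and $|\downarrow\rangle$:

\[
\begin{aligned}
|j, m_j\rangle_{+1} &= \left( \frac{j + m_j}{2j} \right)^{1/2} |j - \tfrac{1}{2}, m_j - \tfrac{1}{2}\rangle |\uparrow\rangle + \left( \frac{j - m_j}{2j} \right)^{1/2} |j - \tfrac{1}{2}, m_j + \tfrac{1}{2}\rangle |\downarrow\rangle, \\
|j, m_j\rangle_{-1} &= -\left( \frac{j - m_j + 1}{2j + 2} \right)^{1/2} |j + \tfrac{1}{2}, m_j - \tfrac{1}{2}\rangle |\uparrow\rangle + \left( \frac{j + m_j + 1}{2j + 2} \right)^{1/2} |j + \tfrac{1}{2}, m_j + \tfrac{1}{2}\rangle |\downarrow\rangle,
\end{aligned} \tag{B.4}
\]

which satisfy the relations

\[
{}_{i'}\langle j', m_j' | j, m_j\rangle_{i} = \delta_{ii'} \delta_{jj'} \delta_{m_j m_j'}, \qquad \vec{\sigma} \cdot \hat{r}\, |j, m_j\rangle_{+1} = -|j, m_j\rangle_{-1} . \tag{B.5}
\]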
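Applying the identities (B.2) and (B.3) to the s-quark eigenstate angular parts (B.4), one can evaluate the following relations:

\[
\begin{aligned}
v_3 |j, m_j\rangle_{+1} &= -i \left( \frac{j - m_j + 1}{2j + 2} \right)^{1/2} \frac{j - m_j}{2j}\, |j + \tfrac{1}{2}, m_j - \tfrac{1}{2}\rangle |\uparrow\rangle - i \left( \frac{j + m_j + 1}{2j + 2} \right)^{1/2} \frac{j + m_j}{2j}\, |j + \tfrac{1}{2}, m_j + \tfrac{1}{2}\rangle |\downarrow\rangle, \\
v_3 |j, m_j\rangle_{-1} &= i \left( \frac{j + m_j}{2j} \right)^{1/2} \frac{j + m_j + 1}{2j + 2}\, |j - \tfrac{1}{2}, m_j - \tfrac{1}{2}\rangle |\uparrow\rangle - i \left( \frac{j - m_j}{2j} \right)^{1/2} \frac{j - m_j + 1}{2j + 2}\, |j - \tfrac{1}{2}, m_j + \tfrac{1}{2}\rangle |\downarrow\rangle,
\end{aligned} \tag{B.6}
\]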
which are crucial in the following calculation of the angular part of the matrix element involved in the inertia parameter in the quark phase K + mK 1=2 K − mK + 1 K K +j+1=2 +m 1 K; mK |)4 v3 |j; mj +1 |s = i mj +1=2 ; 2K + 2 2K + 1 2 K; mK |)4 v3 |j; mj +1 |s = 4 K; mK |)4 v3 |j; mj −1 |s
K + mK =i 2K
1=2
2K(2mK − 1) ; +K +mK (2K − 1)(2K + 1) j+1=2 mj +1=2
3 K; mK |)4 v3 |j; mj +1 |s = −1 K; mK |)4 v3 |j; mj −1 |s
K − mK + 1 =i 2K + 2
1=2
(2K + 2)(2mK − 1) K +mK ; + (2K + 1)(2K + 3) j−1=2 mj +1=2
4 K; mK |)4 v3 |j; mj +1 |s =3 K; mK |)4 v3 |j; mj −1 |s = 0
K − mK + 1 2 K; mK |)4 v3 |j; mj −1 |s = i 2K
1=2
;
K + mK K +mK + 2K + 1 j−1=2 mj +1=2
(B.7)
with |s = (0; 0; 1)T the s-quark eigenstate in the SU(3) Aavor space. Here one can easily see that the angular parts of the hedgehog quark eigenstates, constructed with |j; mj ±1 and the isospin eigenstates | ⇑ = (1; 0; 0)T and | ⇓ = (0; 1; 0)T , are given by 1 K − mK + 1 1=2 1 |K; mK 1 = − | ⇑ K + 2 ; mK − 2 2K + 2 +1 1 1 K + mK + 1 1=2 | ⇓ ; − K + 2 ; mK + 2 2K + 2 +1 K + mK 1=2 1 1 |K; mK 2 = − K − ; mK − | ⇑ 2K 2 2 −1 1 K − mK 1=2 1 − K − ; mK + | ⇓ ; 2K 2 2 −1 1 1 K − mK + 1 1=2 |K; mK 3 = − K + ; mK + | ⇑ 2K + 2 2 2 −1 K + mK + 1 1=2 1 1 K + ; mK + | ⇓ ; + 2K + 2 2 2 −1
K + mK 1=2 1 1 |K; mK 4 = K − ; mK − | ⇑ 2K 2 2 +1 K − mK 1=2 1 1 + K − ; mK + | ⇓ 2K 2 2 +1
(B.8)
which ful>ll the relations i K
m
; mK |K; mK i = +ii +jj +mKK ;
˜M · rˆi |K; mK i = (−1)i |K; mK i+2 :
(B.9)
Here one notes that |K; mK 1 and |K; mK 2 (|K; mK 3 and |K; mK 4 ) have the quantum number D = + 1 (D = − 1) where D = P(−1)K . B.2. Quark phase inertia parameter In this section, we will combine the angular part of the matrix element derived in the previous section with the radial wave functions of the quark eigenstates so that one can calculate the quark phase inertia parameter Nq . Now the unperturbed hedgehog and strange quark eigenstates in terms of the quantum numbers D and D are described as follows: jK (jm r) 0h |K; mK 1 m = c1 n1 i˜M · rj ˆ K+1 (jm r) −jK (jm r) |K; mK 2 for D = + 1 ; − c2 n2 i˜M · rj ˆ K−1 (jm r) −jK+1 (jm r) 0h |K; mK 3 m = −c1 n1 i˜M · rj ˆ K (jm r) jK−1 (jm r) |K; mK 4 for D = − 1 ; + c2 n2 i˜M · rj ˆ K (jm r) jl (!n r) 0s |j; mj +1 |s for D = + 1 ; n = n1 i˜M · rj ˆ l+1 (!n r) −jl (!n r) 0s |j; mj −1 |s for D = − 1 ; (B.10) n = − n2 i˜M · rj ˆ l−1 (!n r) where jK (jm r) and jl (!n r) are the spherical Bessel functions with the energy eigenvalues jm and !n , respectively, and the constants c1 and c2 are the normalization constants satisfying c12 + c22 = 1 and the constants n1 and n2 are normalized as −3 2 2 n−2 1 R Em = Em (jK (Em ) + jK+1 (Em )) − 2(K + 1)jK (Em )jK+1 (Em ) −3 2 2 n−2 2 R Em = Em (jK (Em ) + jK−1 (Em )) − 2KjK (Em )jK−1 (Em )
(B.11)
and the other constants n1 and n2 also satisfy the above conditions with (Pn = !n R; l) instead of (Em = jm R; K). Using the angular parts of the matrix elements (B.7) and the full quark eigenstate wave functions (B.10), one can now calculate the matrix element h m|)4 V3 |ns as below: ! K − mK + 1 1=2 (2K + 2)(2mK − 1) −c1 N1 N1 (S1 + S2 ) h m|)4 V3 |ns = C 2K + 2 (2K + 1)(2K + 3) " K + 1 1=2 K + mK K + c2 N2 N1 S1 +Kj−1=2 +m mj +1=2 2K + 1 K ! 1=2 K + mK + 1 1=2 K − mK + 1 K +C c1 N1 N2 S3 2K 2K + 1 K +1 " 2K(2mK − 1) K + c2 N2 N2 (S3 − S4 ) +Kj+1=2 +m mj +1=2 (2K − 1)(2K + 1) ! 1=2 K + mK 1=2 K K − mK + 1 +C c1 N1 N1 S3 2K 2K + 1 K +1 " 2K(2m − 1) K K + c2 N2 N1 (S3 + S4 ) +Kj+1=2 +m mj +1=2 (2K − 1)(2K + 1) ! K − mK + 1 1=2 (2K + 2)(2mK − 1) −c1 N1 N2 +C (S1 − S2 ) 2K + 2 (2K + 1)(2K + 3) " K + 1 1=2 K + mK K + c2 N2 N2 S1 +Kj−1=2 +m (B.12) mj +1=2 2K + 1 K where C = jK (Em )jK (En )= |jK (Em )jK (En )|; N1 = R3=2 jK (En )n1 and N2 = R3=2 jK (En )n2 and N1 and N2 are similarly de>ned for the strange quark eigenstates. The radial integrations here are given as R dr r 3 jK (jm r)jK+1 (!n r) S1 = 0 3 ; R jK (Em )jK+1 (Pn ) R dr r 3 jK+1 (jm r)jK (!n r) ; S2 = 0 R3 jK (Em )jK (Pn ) R dr r 3 jK (jm r)jK−1 (!n r) ; S3 = 0 3 R jK (Em )jK−1 (Pn ) R dr r 3 jK−1 (jm r)jK (!n r) S4 = 0 : (B.13) R3 jK (Em )jK (Pn )
Similarly, one can calculate the other matrix element ${}_h\langle m|\lambda_4|n\rangle_s$, which is also needed for the inertia parameter $N_q$,
K − mK + 1 1=2 1 − v1 K c1 N1 N1 + +mK 2K + 2 Em − Pn j−1=2 mj +1=2 K + mK 1=2 1 + v2 K +C c2 N2 N2 + +mK 2K Em − Pn j+1=2 mj +1=2 K + mK 1=2 1 − v2 K +C c2 N2 N1 + +mK 2K Em − Pn j+1=2 mj +1=2 K − mK + 1 1=2 1 + v1 K +C c1 N1 N2 + +mK 2K + 2 Em − Pn j−1=2 mj +1=2
h m|)4 |ns = C
(B.14)
where v1 = jK+1 (Em )=jK (Em ) and v2 = jK+1 (Em )=jK (Em ). Combining the above two matrix elements (B.12) and (B.14), one can obtain the explicit expression for the quark phase inertia parameter Nq , 1 h m|)4 |nss n|)4 V3 |mh R m; n jm − !n =
m; n;K
+
# $ K +1 1 − v1 c1 N1 N1 K+ c2 N2 N1 S1 + c1 N1 N1 (S1 + S2 ) (Em − Pn )2 3
m; n;K
+
m; n;K
+
m; n;K
# $ K 1 + v2 c2 N2 N2 K− c1 N1 N2 S3 + c2 N2 N2 (S3 − S4 ) (Em − Pn )2 3 # $ K 1 − v2 c N N K c N N S + N N (S + S ) c − 1 1 2 3 2 2 1 2 2 1 3 4 (Em − Pn )2 3 # $ K +1 1 + v1 c1 N1 N2 K+ c2 N2 N2 S1 + c1 N1 N2 (S1 − S2 ) ; (Em − Pn )2 3
(B.15)
where K+ and K− are de>ned as
K+ =
K +1 K
1=2
K ; 3
K− =
K K +1
1=2
K +1 3
and the summation over the index $m_K$ has been carried out. Here the summation indices $m$ and $n$ on the left-hand side are shorthand for the sets of quantum numbers $(K, m_K, \kappa, m)$ and $(j, m_j, \kappa', n)$ associated with the hedgehog and strange quark eigenstates, respectively.
Appendix C. Batalin–Fradkin–Tyutin quantization scheme

C.1. BRST symmetries in Skyrmion model

In this section, we will obtain the BRST invariant Lagrangian in the framework of the BFV formalism [226–228], which is applicable to theories with first-class constraints, by introducing two canonical sets of ghosts and antighosts together with auxiliary fields $(C^i, \bar{P}_i)$, $(P^i, \bar{C}_i)$, $(N^i, B_i)$ ($i = 1, 2$) which satisfy the super-Poisson algebra¹²

\[ \{C^i, \bar{P}_j\} = \{P^i, \bar{C}_j\} = \{N^i, B_j\} = \delta^i_j . \tag{C.1} \]

Here the super-Poisson bracket is defined as

\[ \{A, B\} = \left. \frac{\delta A}{\delta q} \right|_r \left. \frac{\delta B}{\delta p} \right|_l - (-1)^{\eta_A \eta_B} \left. \frac{\delta B}{\delta q} \right|_r \left. \frac{\delta A}{\delta p} \right|_l , \tag{C.2} \]
where $\eta_A$ denotes the number of fermions, called the ghost number, in $A$ and the subscripts $r$ and $l$ denote the right and left derivatives. In the SU(2) Skyrmion model, the nilpotent BRST charge $Q$, the fermionic gauge fixing function $\Psi$ and the BRST invariant minimal Hamiltonian $H_m$ are given by

\[ Q = C^i \tilde{\Omega}_i + P^i B_i, \qquad \Psi = \bar{C}_i \chi^i + \bar{P}_i N^i, \qquad H_m = \tilde{H} - \frac{1}{2 I_{10}} C^1 \bar{P}_2, \tag{C.3} \]
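which satisfy the relations $\{Q, H_m\} = 0$, $Q^2 = \{Q, Q\} = 0$, $\{\{\Psi, Q\}, Q\} = 0$. The effective quantum Lagrangian is then described as

\[ L_{\rm eff} = \pi^{\mu} \dot{a}^{\mu} + \pi_\theta \dot{\theta} + B_2 \dot{N}^2 + \bar{P}_i \dot{C}^i + \bar{C}_2 \dot{P}^2 - H_{\rm tot} \tag{C.4} \]

with $H_{\rm tot} = H_m - \{Q, \Psi\}$. Here the $B_1 \dot{N}^1 + \bar{C}_1 \dot{P}^1 = \{Q, \bar{C}_1 \dot{N}^1\}$ terms are suppressed by replacing $\chi^1$ with $\chi^1 + \dot{N}^1$. Now we choose the unitary gauge

\[ \chi^1 = \Omega_1, \qquad \chi^2 = \Omega_2 \tag{C.5} \]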
and perform the path integration over the >elds B1 , N 1 , CT 1 , P1 , PT 1 and C1 , by using the equations of motion, to yield the eDective Lagrangian of the form LeD = " a˙" + ; ;˙ + BN˙ + PT C˙ + CT P˙
12
Here one notes that the BRST symmetry can be also constructed by using the residual gauge symmetry interpretation of the BRST invariance [245,246].
1 1 a M aM (" − a" ; )(" − a" ; ) M M − P˜ 2 8I10 a a + 2; 4I10 ; T T + P˜ 2 N + BP2 + PP + 2a" a" ; CC − M0 −
(C.6)
with rede>nitions: N ≡ N 2 , B ≡ B2 , CT ≡ CT 2 , C ≡ C2 , PT ≡ PT 2 , P ≡ P2 . Next, using the variations with respect to " , ; , P and PT , one obtains the relations 1 1 " " " M M " a˙ = ( − a ; )a a + a −N −B ; 4I10 4I10 ; 1 " " 1 " " 1 " M M " " ˙ T ;=− a ( − a ; )a a + a a − ; − 2CC + N + a ; 4I10 2I10 4I10 P = − C˙ ;
T = CT˙ P
(C.7)
to yield the eDective Lagrangian
2 2I10 " " ;˙ M M T )a a LeD = −M0 + M M a˙ a˙ − 2I10 M M + (B + 2CC + BN˙ a a a a ˙ 4I10 " " ; T )aM aM + M M a a˙ + a" + (B + 2CC (B + N ) + CT˙ C˙ : a a aM aM
˙ M aM , one can arrive at the BRST invariant Finally, with the identi>cation N = − B + ;=a Lagrangian [95] LeD = − M0 +
˙˙ 2I10 " " 2I10 ˙ 2 T )2 − ;B + CT˙ C˙ ; ; − 2I10 (1 − 2;)2 (B + 2CC a˙ a˙ − 2 1 − 2; (1 − 2;) 1 − 2; (C.8)
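which is invariant under the BRST transformation

\[ \delta_B a^{\mu} = \lambda a^{\mu} C, \qquad \delta_B \theta = -\lambda (1 - 2\theta) C, \qquad \delta_B \bar{C} = -\lambda B, \qquad \delta_B C = \delta_B B = 0 . \]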
C.2. SU(3) Skyrmion with flavor symmetry breaking effects

In this section, our starting SU(3) Skyrmion Lagrangian in Eq. (7.23) is given by

\[ L = -\frac{1}{4} f_\pi^2\, {\rm tr}(l_\mu l^\mu) + \frac{1}{32 e^2}\, {\rm tr}[l_\mu, l_\nu]^2 + L_{\rm WZW} + \frac{1}{4} f_\pi^2\, {\rm tr}\, M (U + U^\dagger - 2) + L_{\rm FSB}, \tag{C.9} \]

\[ L_{\rm FSB} = \frac{1}{6} (f_K^2 m_K^2 - f_\pi^2 m_\pi^2)\, {\rm tr}\left( (1 - \sqrt{3} \lambda_8)(U + U^\dagger - 2) \right) - \frac{1}{12} (f_K^2 - f_\pi^2)\, {\rm tr}\left( (1 - \sqrt{3} \lambda_8)(U l_\mu l^\mu + l_\mu l^\mu U^\dagger) \right), \tag{C.10} \]
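where the WZW action is given by Eq. (7.24). Now we consider only the rigid motions of the SU(3) Skyrmion

\[ U(\vec{x}, t) = \mathcal{A}(t) U_0(\vec{x}) \mathcal{A}(t)^\dagger . \]

Assuming maximal symmetry in the Skyrmion, we can use the hedgehog solution $U_0$ given in Eq. (2.19) embedded in the SU(2) isospin subgroup of SU(3) with the chiral angle $\theta(r)$, which is determined by minimizing the static mass $M_0$ in Eq. (7.4) and, for unit winding number, satisfies the boundary conditions $\lim_{r \to \infty} \theta(r) = 0$ and $\theta(0) = \pi$. Since $\mathcal{A}$ belongs to SU(3), $\mathcal{A}^\dagger \dot{\mathcal{A}}$ is anti-Hermitian and traceless, and can be expressed as a linear combination of $\lambda_a$ as follows:

\[ \mathcal{A}^\dagger \dot{\mathcal{A}} = i e f_\pi v^a \lambda_a = i e f_\pi \begin{pmatrix} \vec{v} \cdot \vec{\tau} + \nu 1 & V \\ V^\dagger & -2\nu \end{pmatrix}, \]

where

\[ \vec{v} = (v^1, v^2, v^3), \qquad V = \begin{pmatrix} v^4 - i v^5 \\ v^6 - i v^7 \end{pmatrix}, \qquad \nu = \frac{v^8}{\sqrt{3}} . \tag{C.11} \]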
After tedious algebraic manipulations, the FSB contribution to the Skyrmion Lagrangian is then expressed as [97] LFSB = −(fK2 m2K − f2 m2 )(1 − cos ;) sin2 d
1 + (fK2 − f2 ) sin2 d 2
−(fK2
−
8 2 2 2 2 2 sin2 ; − e f ˜v sin ; − 3 r2
2 2 2 2 sin d f )e f ((1 2
d
d; dr
2
cos ;
− cos ;)2 D† V 2 − sin2 ;D† < · rV ˆ 2 )
√ i 2 2 sin 2d 2 + (fK − f2 )e2 f2 sin ;(D†˜v ·
(C.12)
In order to separate the SU(2) rotations from the deviations into strange directions, the time-dependent rotations can be written as [247]

\[ \mathcal{A}(t) = \begin{pmatrix} A(t) & 0 \\ 0 & 1 \end{pmatrix} S(t) \tag{C.13} \]
with A(t) ∈ SU(2) and the small rigid oscillations S(t) around the SU(2) rotations. 13 Furthermore, we exploit the time-dependent angular velocity of the SU(2) rotation through i A† A˙ = ˙ · ˜< : 2 Note that one can use the Euler angles for the parameterization of the rotation [249]. On the other hand, the small rigid oscillations S, which were also used in Ref. [230], can be described as 7 a S(t) = exp i d )a = exp(iD) ; (C.14) a=4
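where

\[ \mathcal{D} = \begin{pmatrix} 0 & \sqrt{2} D \\ \sqrt{2} D^\dagger & 0 \end{pmatrix}, \qquad D = \frac{1}{\sqrt{2}} \begin{pmatrix} d^4 - i d^5 \\ d^6 - i d^7 \end{pmatrix} . \tag{C.15} \]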
Including the FSB correction terms in Eq. (C.12), the Skyrmion Lagrangian to order 1=Nc is then given in terms of the angular velocity i and the strange deviations D i 1 † † L = −M0 + I10 ˙ · ˙ + (4I20 + -1 )D˙ D˙ + Nc (D† D˙ − D˙ D) 2 2 1 † + i I10 − 2I20 − -1 + -2 (D† ˙ · ˜
˙ D˙ D) ; − 2(-1 − -2 )(D† D)(
(C.16)
where $\chi = f_K / f_\pi$. Here the soliton energy $M_0$ and the moment of inertia $I_{10}$ are given by Eqs. (7.4) and (7.5), and the other moment of inertia $I_{20}$, the strength $\Gamma_0$ of the chiral symmetry breaking and the inertia parameters $\Gamma_i$ ($i = 1, 2, 3$) originating from the FSB term are, respectively, given by

13 Here one notes that fluctuations $\phi_a$ from collective rotations $A$ can be also separated by the other suitable parameterization [248] $U = A \sqrt{U_0} A^\dagger \exp\left( i \sum_{a=1}^{8} \phi_a \lambda_a \right) A \sqrt{U_0} A^\dagger$.
∞ 2 2 2 1 2 sin d; ; I20 = 3 d z z 2 (1 − cos ;) 1 + + ; e f 0 4 dz z2 ∞ 8 -0 = 3 d z z 2 (1 − cos ;) ; e f 0
-1 = (*2 − 1)-0 ; 8 -2 = (* − 1) 3 3e f 2
4f -3 = (*2 − 1) e
∞
0
0
∞
d z z 2 sin2 ; ;
d z z2
d; dz
2
2 sin2 ; + z2
cos ;
(C.17)
with the dimensionless quantities z = ef r. The momenta hi and s , conjugate to the collective coordinates i and the strange deviation † D are given by
1 1 † ˜h = I10 ˙ + i I10 − 2I20 − -1 + -2 D†˜
i 3
˙ − (4I20 + -1 )(D† D˙ − D˙ D)D + Nc (D† D)D − 2(-1 − -2 )(D† D)D;
(C.18)
which satisfy the Poisson brackets {i ; hj } = +ij , {D† ; s } = {D ; s;† } = + . Performing the Legendre transformation, we obtain the Hamiltonian to order 1=Nc as follows: 1 1 2 1 † Nc ˜h + s s − i (D† s − s† D) H = M0 + -0 m2 + 2 2I10 4I20 8I20
Nc2 2 2 2 † + + -0 (* mK − m ) + -3 D D 16I20 -2 1 1 +i · (D†˜h · ˜<s − s†˜h · ˜
-22 + I10 (-1 − -2 ) 1 1 3 -2 + − 1+ + (D† D)(s† s ) 2 2I10 3I20 2 I10 8I10 I20
-22 − I10 (-1 − -2 ) 1 1 3 -2 − 1+ − (D† s + s† D)2 + 2 12I20 2 I10 8I10 32I10 I20 1 -1 − -2 − (D† s − s† D)2 + 2 8I20 32I20
-22 + 2I10 (-1 − -2 ) -2 Nc 1 −i 1 − + (D† s − s† D)(D† D) 2 8 I20 I10 2I10 I20
Nc2 2 2 Nc2 -22 + 2I10 (-1 − -2 ) 2 2 2 (D† D)2 ; + − 3 -0 (* mK − m ) − 3 -3 + 32 2 12I20 I10 I20
(C.19)
= I + 1- . where I20 20 4 1 Through the symmetrization procedure [222,94], we can obtain the Hamiltonian of the form 1 1 1 1 † Nc 2 2 † † ˜ H = M0 + -0 m + I + + s s − i 8I (D s − s D) 2 2I10 4 4I20 20
Nc2 2 2 2 † + + -0 (* mK − m ) + -3 D D 16I20 -2 1 1 − 1+ (D†˜I · ˜<s − s†˜I · ˜
where the isospin operator $\vec{I}$ is given by $\vec{I} = \vec{\pi}_h$ and the ellipsis stands for the strange–strange interaction terms of order $1/N_c$, which can be readily read off from Eq. (C.19). Here one notes that the overall energy shift $1/(8 I_{10})$ originates from the Weyl ordering correction in the BFT Hamiltonian scheme as discussed before.
References

[1] J. Ashman, et al., Phys. Lett. B 206 (1988) 364.
[2] D.B. Kaplan, A. Manohar, Nucl. Phys. B 310 (1988) 527.
[3] R. Hasty, et al., Science 290 (2000) 2117. ∗∗∗
[4] D.T. Spayde, et al., Phys. Rev. Lett. 84 (2000) 1106; B. Mueller, et al., Phys. Rev. Lett. 78 (1997) 3824.
[5] R.D. McKeown, Phys. Lett. B 219 (1989) 140.
[6] E.J. Beise, R.D. McKeown, Comm. Nucl. Part. Phys. 20 (1991) 105.
[7] R.D. McKeown, in: C.R. Ji, D.P. Min (Eds.), New Directions in Quantum Chromodynamics, AIP, Melville, New York, 1999.
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226 [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] [52] [53] [54] [55]
R.L. JaDe, Phys. Lett. B 229 (1989) 275. ∗∗ M.J. Musolf, M. Burkardt, Z. Phys. C 61 (1994) 433. W. Koepf, E.M. Henley, S.J. Pollock, Phys. Lett. B 288 (1992) 11. B.R. Holstein, in: E.J. Beise, R.D. McKeown (Eds.), Parity Violation in Electron Scattering, World Scienti>c, Singapore, 1990, p. 27. N.W. Park, J. Schechter, H. Weigel, Phys. Rev. D 43 (1991) 869. S.C. Phatak, S. Sahu, Phys. Lett. B 321 (1994) 11. C.V. Christov, et al., Prog. Part. Nucl. Phys. 37 (1996) 1. H.-W. Hammer, U.-G. Meissner, D. Drechsel, Phys. Lett. B 367 (1996) 323. H. Ito, Phys. Rev. C 52 (1995) R1750. H. Weigel, et al., Phys. Lett. B 353 (1995) 20. D. Leinweber, Phys. Rev. D 53 (1996) 5115. P. Geiger, N. Isgur, Phys. Rev. D 55 (1997) 229. M.J. Musolf, H.-W. Hammer, D. Drechsel, Phys. Rev. D 55 (1997) 2741. M.J. Musolf, H. Ito, Phys. Rev. C 55 (1997) 3066. U.-G. Meissner, et al., Phys. Lett. B 408 (1997) 381. S.T. Hong, B.Y. Park, D.P. Min, Phys. Lett. B 414 (1997) 229. ∗∗∗ S.T. Hong, in: C.R. Ji, D.P. Min (Eds.), New Directions in Quantum Chromodynamics, AIP, Melville, New York, 1999. S.T. Hong, B.Y. Park, Nucl. Phys. A 561 (1993) 525. ∗∗∗ G.E. Brown, M. Rho, Phys. Lett. B 82 (1979) 177. ∗∗∗ G.E. Brown, M. Rho, V. Vento, Phys. Lett. B 84 (1979) 383. G.E. Brown, M. Rho, V. Vento, Phys. Lett. B 97 (1980) 423. ∗∗ R.L. JaDe, in: A. Zichichi (Ed.), Pointlike Structures Inside and Outside Hadrons, Plenum Press, New York, 1982. A.D. Jackson, M. Rho, Phys. Rev. Lett. 51 (1983) 751. M. Rho, A.S. Goldhaber, G.E. Brown, Phys. Rev. Lett. 51 (1983) 747. G.E. Brown, A.D. Jackson, M. Rho, V. Vento, Phys. Lett. B 140 (1984) 285. B.Y. Park, M. Rho, Z. Phys. A 331 (1988) 151. B.Y. Park, V. Vento, M. Rho, G.E. Brown, Nucl. Phys. A 504 (1989) 829. B.Y. Park, D.P. Min, M. Rho, Nucl. Phys. A 517 (1990) 561. S.T. Hong, B.Y. Park, Int. J. Mod. Phys. E 1 (1992) 131. S.T. Hong, G.E. Brown, Nucl. Phys. A 564 (1993) 491. ∗∗∗ S.T. Hong, G.E. Brown, Nucl. Phys. A 580 (1994) 408. ∗∗∗ M.A. Nowak, M. Rho, I. 
Zahed, Chiral Nuclear Dynamics, World Scienti>c, Singapore, 1996. M. Rho, Nucl. Phys. A 622 (1997) 538. H.J. Lee, D.P. Min, B.Y. Park, M. Rho, V. Vento, Nucl. Phys. A 657 (1999) 75. M. Rho, in: C.R. Ji, D.P. Min (Eds.), New Directions in Quantum Chromodynamics, AIP, Melville, New York, 1999. H.J. Lee, D.P. Min, B.Y. Park, M. Rho, Phys. Lett. B 491 (2000) 257. H.C. Kim, M. Praszalowicz, M.V. Polyakov, K. Goeke, Phys. Rev. D 58 (1998) 114 027. U.-G. Meissner, Nucl. Phys. A666 –A667 (2000) 51. C.M. Maekawa, U. van Kolck, Phys. Lett. B 478 (2000) 73. C.M. Maekawa, J.S. Veiga, U. van Kolck, Phys. Lett. B 488 (2000) 167. S.-L. Zhu, S.J. Puglia, B.R. Holstein, M.J. Ramsey-Musolf, Phys. Rev. D 62 (2000) 033 008. W.C. Haxton, C.P. Liu, M.J. Ramsey-Musolf, Phys. Rev. Lett. 86 (2001) 5247. D.H. Beck, R.D. McKeown, hep-ph=0102334. ∗ D.B. Kaplan, A.E. Nelson, Phys. Lett. B 175 (1986) 57. C.H. Lee, G.E. Brown, D.P. Min, M. Rho, Nucl. Phys. A 585 (1995) 401. A.E. Nelson, D.B. Kaplan, Phys. Lett. B 192 (1987) 273. G.E. Brown, K. Kubodera, D. Page, P. Pizzochero, Phys. Rev. D 37 (1988) 2042. G.E. Brown, V. Thorsson, K. Kubodera, M. Rho, Phys. Lett. B 291 (1992) 355.
222 [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] [76] [77] [78] [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [100]
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226 G.E. Brown, Nucl. Phys. A 574 (1994) 217. G.E. Brown, Astrophys. J. 440 (1995) 270. H.A. Bethe, G.E. Brown, Astrophys. J. 445 (1995) L129. G.E. Brown, M. Rho, Phys. Rep. 269 (1996) 333. G.Q. Li, C.H. Lee, G.E. Brown, Nucl. Phys. A 625 (1997) 372. G.Q. Li, C.H. Lee, G.E. Brown, Phys. Rev. Lett. 79 (1997) 5214. G.E. Brown, C.H. Lee, R. Rapp, Nucl. Phys. A 639 (1998) 455. G.Q. Li, G.E. Brown, C.H. Lee, Phys. Rev. Lett. 81 (1998) 2177; Nucl. Phys. A 574 (1994) 217. G.S. Adkins, C.R. Nappi, E. Witten, Nucl. Phys. B 228 (1983) 552. ∗∗ G.S. Adkins, C.R. Nappi, Nucl. Phys. B 233 (1984) 109. I. Zahed, G.E. Brown, Phys. Rep. 142 (1986) 1. S.T. Hong, Phys. Lett. B 417 (1998) 211. ∗∗∗ A. Chodos, R.L. JaDe, K. Johnson, C.B. Thorn, Phys. Rev. D 10 (1974) 10. ∗∗ T. DeGrand, R.L. JaDe, K. Johnson, J. Kiskis, Phys. Rev. D 12 (1975) 2060. S. Nadkarni, H.B. Nielsen, I. Zahed, Nucl. Phys. B 253 (1985) 308; H.B. Nielsen, A. Wirzba, in: J.-M. Richard, E. Aslanides, N. Boccara (Eds.), The Elementary Structure of Matter, Springer, Berlin, 1988, p. 467. ∗ A. Chodos, C.B. Thorn, Phys. Rev. D 12 (1975) 2733. R. Jackiw, C. Rebbi, Phys. Rev. Lett. 36 (1976) 1116. V. Vento, J.H. Jun, E.M. Nyman, M. Rho, G.E. Brown, Nucl. Phys. A 345 (1980) 413. J. Goldstone, R.L. JaDe, Phys. Rev. Lett. 51 (1983) 1518. A.W. Thomas, Adv. Nucl. Phys. 13 (1984) 1. T.H.R. Skyrme, Proc. R. Soc. A 260 (1961) 127. ∗∗ J. Wess, B. Zumino, Phys. Lett. B 37 (1971) 95. E. Witten, Nucl. Phys. B 223 (1983) 422; E. Witten, Nucl. Phys. B 223 (1983) 433. ∗∗∗ C.G. Callan, I. Klebanov, Nucl. Phys. B 262 (1985) 365. ∗∗ N.N. Scoccola, H. Nadeau, M.A. Nowak, M. Rho, Phys. Lett. B 201 (1988) 425. E. Jenkins, A.V. Manohar, Nucl. Phys. B 396 (1993) 27. H. Yabu, K. Ando, Nucl. Phys. B 301 (1988) 601. J.H. Kim, C.H. Lee, H.K. Lee, Nucl. Phys. B 501 (1989) 835. H.K. Lee, D.P. Min, Phys. Lett. B 219 (1989) 1. P.A.M. Dirac, Lectures in Quantum Mechanics, Yeshiva University, 1964. 
∗∗ M.B. Green, J.H. Schwarz, E. Witten, Superstring Theory, Vol. 1, Cambridge University Press, Cambridge, 1987. S.H. Lee, I. Zahed, Phys. Rev. D 37 (1988) 1963. ∗ S.H. Lee, I. Zahed, Phys. Lett. B 215 (1988) 583. I.A. Batalin, E.S. Fradkin, Phys. Lett. B 180 (1986) 157; Nucl. Phys. B 279 (1987) 514; I.A. Batalin, I.V. Tyutin, Int. J. Mod. Phys. A 6 (1991) 3255. ∗∗ R. Banerjee, Phys. Rev. D 48 (1993) R5467. W.T. Kim, Y.J. Park, Phys. Lett. B 336 (1994) 376. Y.W. Kim, Y.J. Park, K.D. Rothe, J. Phys. G 24 (1998) 953; Y.W. Kim, K.D. Rothe, Nucl. Phys. B 510 (1998) 511; M.I. Park, Y.J. Park, Int. J. Mod. Phys. A 13 (1998) 2179. ∗ S. Ghosh, Phys. Rev. D 49 (1994) 2990; R. Amorim, J. Barcelos-Neto, Phys. Rev. D 53 (1996) 7129; M. Fleck, H.O. Girotti, Int. J. Mod. Phys. A 14 (1999) 4287; C.P. Nativiade, H. Boschi-Filho, Phys. Rev. D 62 (2000) 025 016. S.T. Hong, Y.W. Kim, Y.J. Park, Phys. Rev. D 59 (1999) 114 026. ∗∗ S.T. Hong, Y.W. Kim, Y.J. Park, Mod. Phys. Lett. A 15 (2000) 55. S.T. Hong, Y.J. Park, Mod. Phys. Lett. A 15 (2000) 913. S.T. Hong, Y.J. Park, Phys. Rev. D 63 (2001) 054 018. ∗∗∗ D.J. Gross, F. Wilczek, Phys. Rev. Lett. 30 (1973) 1343. H.D. Politzer, Phys. Rev. Lett. 30 (1973) 1346. J.C. Collins, M.J. Perry, Phys. Rev. Lett. 34 (1975) 1353.
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226 [101] [102] [103] [104] [105] [106] [107] [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] [119] [120] [121] [122] [123] [124] [125] [126] [127] [128] [129] [130] [131] [132] [133] [134] [135] [136] [137] [138] [139] [140] [141] [142] [143] [144] [145] [146] [147] [148]
B.C. Barrois, Nucl. Phys. B 129 (1977) 390. D. Bailin, A. Love, Phys. Rep. 107 (1984) 325, and references therein. R. Rapp, T. SchSafer, E.V. Shuryak, M. Velkovsky, Phys. Rev. Lett. 81 (1998) 53. ∗∗∗ M. Alford, K. Rajagopal, F. Wilczek, Phys. Lett. B 422 (1998) 247. ∗∗∗ N. Evans, S.D. Hsu, M. Schwetz, Nucl. Phys. B 551 (1999) 275. N. Evans, S.D. Hsu, M. Schwetz, Phys. Lett. B 449 (1999) 281. T. SchSafer, F. Wilczek, Phys. Lett. B 450 (1999) 325. T. SchSafer, F. Wilczek, Phys. Rev. D 60 (1999) 074 014. T. SchSafer, F. Wilczek, Phys. Rev. Lett. 82 (1999) 3956. ∗∗ R.D. Pisarski, D.H. Rischke, Phys. Rev. Lett. 83 (1999) 37. D.T. Son, Phys. Rev. D 59 (1999) 094 019. R. Casalbuoni, R. Gatto, Phys. Lett. B 464 (1999) 111. D.K. Hong, M. Rho, I. Zahed, Phys. Lett. B 468 (1999) 261. ∗ K. Langfeld, M. Rho, Nucl. Phys. A 660 (1999) 475. M. Alford, K. Rajagopal, F. Wilczek, Nucl. Phys. B 537 (1999) 443. ∗∗ R.D. Pisarski, D.H. Rischke, Phys. Rev. D 60 (1999) 094 013; Phys. Rev. D 61 (2000) 051 501; Phys. Rev. D 61 (2000) 074 017. T. SchSafer, F. Wilczek, Phys. Rev. D 60 (1999) 114 033. D.K. Hong, Phys. Lett. B 473 (2000) 118. M. Rho, A. Wirzba, I. Zahed, Phys. Lett. B 473 (2000) 126. C. Manuel, M.H. Tytgat, Phys. Lett. B 479 (2000) 190. S.R. Beane, P.F. Bedaque, M.J. Savage, Phys. Lett. B 483 (2000) 131. D.K. Hong, T. Lee, D.P. Min, Phys. Lett. B 477 (2000) 137. R. Rapp, T. SchSafer, E.V. Shuryak, M. Velkovsky, Ann. Phys. 280 (2000) 35. S.D. Hsu, M. Schwetz, Nucl. Phys. B 572 (2000) 211. D.T. Son, M.A. Stephanov, Phys. Rev. D 61 (2000) 074 012; Erratum, D 62 (2000) 059 902. W.E. Brown, J.T. Liu, H. Ren, Phys. Rev. D 61 (2000) 114 012; Phys. Rev. D 62 (2000) 054 016; Phys. Rev. D 62 (2000) 054 013. M.A. Nowak, M. Rho, A. Wirzba, I. Zahed, Phys. Lett. B 497 (2001) 85. M. Rho, E. Shuryak, A. Wirzba, I. Zahed, Nucl. Phys. A 676 (2000) 273. B.Y. Park, M. Rho, A. Wirzba, I. Zahed, Phys. Rev. D 62 (2000) 034 015. K. Rajagopal, E. Shuster, Phys. Rev. D 62 (2000) 085 007. K. 
Rajagopal, F. Wilczek, in: M. Shifman (Ed.), Handbook of QCD, World Scienti>c, Singapore, 2001, hep-ph=0011333. Y. Kim, M. Rho, nucl-th=0004054. D.K. Hong, H.K. Lee, M.A. Nowak, M. Rho, hep-ph=0010156. D.K. Hong, S.T. Hong, Y.J. Park, Phys. Lett. B 499 (2001) 125. ∗∗∗ G.E. Brown, M. Rho, nucl-th=0101015. G.E. Brown, M. Rho, hep-ph=0103102. D. Pines, M.A. Alpar, in: D. Pines, R. Tamagaki, S. Tsuruta (Eds.), The Structure and Evolution of Neutron Stars, Addison-Wesley, Reading, MA, 1992. M. Gell-Mann, Phys. Rev. 125 (1962) 1067. S.L. Adler, R. Dashen, Current Algebras, Benjamin, New York, 1968. B.W. Lee, Chiral Dynamics, Gordon and Breach, London, 1972. S.L. Adler, Phys. Rev. Lett. 14 (1965) 1051. W.I. Weisberger, Phys. Rev. Lett. 14 (1965) 1047. L.D. Faddeev, Lett. Math. Phys. 1 (1976) 289. G. ’t Hooft, Nucl. Phys. B 72 (1974) 462; G. ’t Hooft, Nucl. Phys. B 75 (1975) 461. E. Witten, Nucl. Phys. B 160 (1979) 57. P.A.M. Dirac, Proc. R. Soc. A 133 (1931) 60; P.A.M. Dirac, Phys. Rev. 74 (1948) 817. R. Jackiw, Comm. Nucl. Part. Phys. 13 (1984) 141. T. Jaroszewicz, Phys. Lett. B 159 (1985) 299.
[149] F. Wilczek, A. Zee, Phys. Rev. Lett. 51 (1983) 2250.
[150] A.A. Belavin, A.M. Polyakov, JETP Lett. 22 (1975) 245.
[151] F. Wilczek, Phys. Rev. Lett. 48 (1982) 1144; F. Wilczek, Phys. Rev. Lett. 49 (1982) 957; Y.S. Wu, A. Zee, Phys. Lett. B 147 (1984) 325.
[152] M. Bowick, D. Karabali, L.C.R. Wijewardhana, Nucl. Phys. B 271 (1986) 417; M. Kimura, H. Kobayashi, I. Tsutsui, Nucl. Phys. B 527 (1998) 624.
[153] G.W. Semenoff, Phys. Rev. Lett. 61 (1988) 517; P.K. Panigrahi, S. Roy, W. Scherer, Phys. Rev. Lett. 61 (1988) 2827; A. Foerster, H.O. Girotti, Phys. Lett. B 230 (1989) 83; P. Voruganti, Phys. Rev. D 39 (1989) 1179.
[154] A. Kovner, Phys. Lett. B 224 (1989) 229.
[155] M. Bergeron, G.W. Semenoff, Int. J. Mod. Phys. A 7 (1992) 2417.
[156] N. Banerjee, S. Ghosh, R. Banerjee, Phys. Rev. D 49 (1994) 1996.
[157] Y. Kim, P. Oh, C. Rim, Mod. Phys. Lett. A 12 (1997) 3169; P. Oh, J. Pachos, Phys. Lett. B 444 (1998) 469; T. Itoh, P. Oh, Phys. Lett. B 491 (2000) 362.
[158] S.T. Hong, B.H. Lee, Y.J. Park, SOGANG-HEP 277/00, 2000.
[159] S.T. Hong, W.T. Kim, Y.J. Park, Phys. Rev. D 60 (1999) 125 005. ∗∗∗
[160] S.T. Hong, Y.J. Park, K. Kubodera, F. Myhrer, Mod. Phys. Lett. A 16 (2001) 1361.
[161] B.H. Lee, K. Lee, H.S. Yang, Phys. Lett. B 498 (2001) 277.
[162] N. Seiberg, E. Witten, JHEP 09 (1999) 032.
[163] S.T. Hong, W.T. Kim, Y.J. Park, M.S. Yoon, Phys. Rev. D 62 (2000) 085 010.
[164] S. Nam, Phys. Rev. D 41 (1990) 2323.
[165] P.J. Mulders, Phys. Rev. D 30 (1984) 1073.
[166] B. O’Neill, Elementary Differential Geometry, Academic Press, New York, 1966.
[167] M.F. Atiyah, V. Patodi, I. Singer, Math. Proc. Camb. Phil. Soc. 77 (1975) 43.
[168] E. Witten, Comm. Math. Phys. 121 (1989) 351.
[169] S.T. Hong, hep-th/0104149.
[170] S. Kobayashi, K. Nomizu, Foundations of Differential Geometry, Wiley, New York, 1969.
[171] D. Inglis, Phys. Rev. 96 (1954) 1059; D.J. Thouless, J.G. Valatin, Nucl. Phys. 31 (1961) 211.
[172] U.G. Meissner, et al., Phys. Rev. Lett. 57 (1986) 1676.
[173] L. Caroll, Alice’s Adventures in Wonderland, Macmillan, New York, 1865.
[174] S. Coleman, Phys. Rev. D 11 (1975) 2088; S. Mandelstam, Phys. Rev. D 11 (1975) 3026; A. Luther, Phys. Rev. C 49 (1979) 261. ∗∗
[175] H.B. Nielsen, M. Rho, A. Wirzba, I. Zahed, Phys. Lett. B 269 (1991) 389.
[176] H.B. Nielsen, M. Rho, A. Wirzba, I. Zahed, Phys. Lett. B 281 (1992) 345.
[177] S. Coleman, S.L. Glashow, Phys. Rev. Lett. 6 (1961) 423. ∗∗∗
[178] S. Okubo, Prog. Theor. Phys. 27 (1962) 949.
[179] G.S. Adkins, in: A. Chodos, et al. (Eds.), Solitons in Nuclear and Elementary Particle Physics, World Scientific, Singapore, 1984.
[180] S. Okubo, Phys. Lett. 4 (1963) 14.
[181] V. Gupta, R. Kogerler, Phys. Lett. B 56 (1975) 473.
[182] H.J. Lipkin, Nucl. Phys. B 241 (1984) 477.
[183] F. Dydak, et al., Nucl. Phys. B 118 (1977) 1.
[184] J.J. de Swart, Rev. Mod. Phys. 35 (1963) 916.
[185] A. De Rujula, H. Georgi, S.L. Glashow, Phys. Rev. D 12 (1975) 147.
[186] Particle Data Group, Eur. Phys. J. C 15 (2000) 1.
[187] A. Bosshard, et al., Phys. Rev. D 44 (1991) 1962.
[188] H.T. Diehl, et al., Phys. Rev. Lett. 67 (1991) 804.
[189] M.A.B. Beg, B.W. Lee, A. Pais, Phys. Rev. Lett. 13 (1964) 514.
[190] E.J. Beise, et al., Nucl. Instr. and Meth. A 378 (1996) 383.
[191] B.A. Mueller, et al., Phys. Rev. Lett. 78 (1997) 3824.
[192] G.E. Brown, K. Kubodera, M. Rho, Phys. Lett. B 192 (1987) 273.
[193] R. Frisch, O. Stern, Z. Physik 85 (1933) 4.
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226 [194] [195] [196] [197] [198] [199] [200] [201] [202] [203] [204] [205] [206] [207] [208] [209] [210] [211] [212] [213] [214] [215] [216] [217] [218] [219] [220] [221] [222] [223] [224] [225] [226] [227] [228] [229] [230] [231] [232] [233] [234] [235] [236] [237] [238] [239]
225
R.D. McKeown, hep-ph=9607340 (1996). ∗ MIT-Bates Lab experiment 00-04, T. Ito, spokesperson. JeDerson Lab experiment 99-115, K. Kumar, D. Lhuillier, spokespersons. Mainz experiment PVA4, D. von Harrach, spokesperson; F. Maas, contact person. JeDerson Lab experiment 00-006, D. Beck, spokesperson, 383 (1996). JeDerson Lab experiment 00-003, R. Michaels, P. Souder, R. Urciuoli, spokespersons. JeDerson Lab experiment 00-114, D. Armstrong, spokesperson. M.J. Musolf, B.R. Holstein, Phys. Lett. B 242 (1990) 461. M.J. Musolf, et al., Phys. Rep. 239 (1994) 1. D.H. Beck, Phys. Rev. D 39 (1989) 3248. T. Averett, et al., Nucl. Instr. and Meth. A 438 (1999) 246. I. Zel’dovich, JETP Lett. 33 (1957) 1531. G.T. Garvey, W.C. Louis, D.H. White, Phys. Rev. C 48 (1993) 761. H. Lipkin, M. Karliner, Phys. Lett. B 461 (1999) 280, and references therein. S.-L. Zhu, et al., Phys. Rev. D 62 (2000) 033 008; M.J. Musolf, B.R. Holstein, Phys. Rev. D 43 (1991) 2956. R.D. McKeown, in: B. Frois, M.A. Bouchiat (Eds.), Parity Violation in Atoms and Polarized Electron Scattering, World Scienti>c, Singapore, 1999, p. 423. S.J. Dong, K.F. Liu, A.G. Williams, Phys. Rev. D 58 (1998) 074 504. M. Jezabek, M. Praszalowicz (Eds.), SU(3) Skyrmion, Skyrmions and Anomalies, World Scienti>c, Singapore, 1987. K. Huang, Quarks Leptons and Gauge Fields, World Scienti>c, Singapore, 1982. D.H. Perkins, Introduction to High Energy Physics, Addison-Wesley, Reading, MA, 1987. R. Cahill, C. Roberts, Phys. Rev. D 32 (1985) 2418; P. Simic, Phys. Rev. D 34 (1986) 1903; D.W. McKay, H.J. Munczek, B.L. Young, Phys. Rev. D 37 (1988) 195. A. Dhar, R. Shankar, S.R. Wadia, Phys. Rev. D 31 (1985) 3256. Y. Nambu, G. Jona-Lasinio, Phys. Rev. 122 (1961) 345; Y. Nambu, G. Jona-Lasinio, Phys. Rev. 124 (1961) 246. D. Ebert, H. Reinhardt, Nucl. Phys. B 271 (1986) 188. H. Reinhardt, B.V. Dang, Nucl. Phys. A 500 (1989) 563. J. Kunz, P.J. Mulders, Phys. Lett. B 231 (1989) 335. Y.S. Oh, D.P. Min, M. Rho, Nucl. Phys. A 534 (1991) 493. H.C. 
Kim, M. Praszalowicz, K. Goeke, Phys. Rev. D 57 (1998) 2859. W. Oliveira, J.A. Neto, Int. J. Mod. Phys. A 12 (1997) 4895. T.D. Lee, Particle Physics and Introduction to Field Theory, Harwood, New York, 1981. G.S. Adkins, in: K.F. Liu (Ed.), Chiral Solitons, World Scienti>c, Singapore, 1983. B.M.K. Nefkens, et al., Phys. Rev. D 18 (1978) 3911. E.S. Fradkin, G.A. Vilkovisky, Phys. Lett. B 55 (1975) 224; M. Henneaux, Phys. Rev. C 126 (1985) 1. ∗∗ T. Fujiwara, Y. Igarashi, J. Kubo, Nucl. Phys. B 341 (1990) 695; Y.W. Kim, S.K. Kim, W.T. Kim, Y.J. Park, K.Y. Kim, Y. Kim, Phys. Rev. D 46 (1992) 4574. ∗ C. Bizdadea, S.O. Saliu, Nucl. Phys. B 456 (1995) 473; S. Hamamoto, Prog. Theor. Phys. 95 (1996) 441. P.O. Mazur, M.A. Nowak, M. Praszalowicz, Phys. Lett. B 147 (1984) 137. K.M. Westerberg, I.R. Klebanov, Phys. Rev. D 50 (1994) 5834. ∗∗ I.R. Klebanov, K.M. Westerberg, Phys. Rev. D 53 (1996) 2804. G. Pari, B. Schwesinger, H. Walliser, Phys. Lett. B 255 (1991) 1. M.V. Berry, Proc. R. Soc. (London) A 392 (1984) 45. B. Moussallam, Ann. Phys. (NY) 225 (1993) 264. J.I. Kim, B.Y. Park, Phys. Rev. D 57 (1998) 2853. I. Zahed, A. Wirzba, U.-G. Meissner, Phys. Rev. D 33 (1986) 830. G. Holzwarth, Phys. Lett. B 291 (1992) 218. S. Coleman, Aspects of Symmetry, Cambridge University Press, Cambridge, 1985. S. Coleman, Nucl. Phys. B 262 (1985) 263.
226 [240] [241] [242] [243] [244] [245] [246] [247] [248] [249]
S.-T. Hong, Y.-J. Park / Physics Reports 358 (2002) 143–226 D.K. Hong, J. Low Temp. Phys. 71 (1988) 483. M. Stone, Bosonization, World Scienti>c, Singapore, 1994. L.N. Cooper, Phys. Rev. 104 (1956) 1189. H. Georgi, Algebras in Particle Physics, Benjamin=Cummings, Menlo Park, CA, 1982. B.G. Wybourne, Classical Groups for Physicists, Wiley, New York, 1974. H.J. Lee, J.H. Yee, Phys. Rev. D 47 (1993) 4608. H.J. Lee, J.H. Yee, Phys. Lett. B 320 (1994) 52. D. Kaplan, I.R. Klebanov, Nucl. Phys. B 335 (1990) 45. B. Schwesinger, Nucl. Phys. A 537 (1992) 253. B. Schwesinger, H. Weigel, Phys. Lett. B 267 (1991) 438; B. Schwesinger, H. Weigel, Nucl. Phys. A 540 (1992) 461.
Physics Reports 358 (2002) 227–308
Strangeness in the nucleon: neutrino–nucleon and polarized electron–nucleon scattering

W.M. Alberico$^{a,b,*}$, S.M. Bilenky$^{a,b,c,1}$, C. Maieron$^{a,b}$

$^{a}$ Dipartimento di Fisica Teorica, Università di Torino, Via P. Giuria 1, 10125 Torino, Italy
$^{b}$ INFN, Sezione di Torino, Italy
$^{c}$ Scuola Internazionale Superiore di Studi Avanzati (SISSA), I-34014 Trieste, Italy

Received May 2001; editor: W. Weise
Contents

1. Introduction
2. The standard Lagrangian of the interaction of leptons and quarks with vector bosons
   2.1. The charged current Lagrangian
   2.2. The electromagnetic interaction Lagrangian
   2.3. The neutral current Lagrangian
3. One-nucleon matrix elements of the neutral current
4. Strange form factors of the nucleon
5. P-odd effects in the elastic scattering of polarized electrons on the nucleon
6. The experiments on the measurement of P-odd asymmetry in elastic e–p scattering
7. P-odd asymmetry in the elastic scattering of electrons on nuclei with S = 0, T = 0
8. Inelastic parity violating (PV) electron scattering on nuclei
9. Elastic NC scattering of neutrinos (antineutrinos) on the nucleon
10. Neutrino–antineutrino asymmetry in elastic neutrino (antineutrino)–nucleon scattering
11. The elastic scattering of neutrinos (antineutrinos) on nuclei with S = 0 and T = 0
12. Neutrino (antineutrino)–nucleus inelastic scattering
13. Summary and conclusions
Acknowledgements
References
Abstract

After the EMC and subsequent experiments at CERN, SLAC and DESY on the deep inelastic scattering of polarized leptons on polarized nucleons, it is now established that the $Q^2 = 0$ value of the axial strange form factor of the nucleon, a quantity which is connected with the spin of the proton and is quite relevant from the theoretical point of view, is relatively large. In this review, we consider different methods and observables that allow one to obtain information on the strange axial and vector form factors of the nucleon at different values of $Q^2$. These methods are based on the investigation of the neutral current induced effects such as the P-odd asymmetry in the scattering of polarized electrons on protons and nuclei, the elastic neutrino (antineutrino) scattering on protons and the quasi-elastic neutrino (antineutrino) scattering on nuclei. We discuss in detail the phenomenology of these processes and the existing experimental data. © 2002 Elsevier Science B.V. All rights reserved.

PACS: 12.15.−y; 13.10.+q; 14.20.Dh; 14.65.Bt; 25.30.−c

Keywords: Strangeness; Strange form factors; Neutrino scattering; Polarized electron scattering

$^{*}$ Corresponding author. Fax: +39116707214. E-mail address: [email protected] (W.M. Alberico).
$^{1}$ On leave of absence from Joint Institute for Nuclear Research, Dubna, Russia.

0370-1573/02/$ - see front matter. PII: S0370-1573(01)00058-8

W.M. Alberico et al. / Physics Reports 358 (2002) 227–308
1. Introduction

In this review, we will discuss the strange form factors of the nucleon. As is well known, the net strangeness of the nucleon is equal to zero. However, according to quantum field theory, in the cloud of a physical nucleon there must be pairs of strange particles. From the point of view of QCD the nucleon consists of valence u and d quarks and of a sea of quark–antiquark pairs $u\bar u$, $d\bar d$, $s\bar s$, etc. produced by virtual gluons.

In the region of large $Q^2$, information about the $s\bar s$ sea can be obtained from the experiments on the production of charmed particles in charged current interactions of neutrinos and antineutrinos with nucleons in the deep inelastic region. The charmed particles can be produced in d–c and s–c transitions. The probability of d–c transitions is proportional to $\sin^2\theta_C$, while the probability of s–c transitions is proportional to $\cos^2\theta_C$ ($\theta_C$ being the Cabibbo angle). Due to the smallness of $\theta_C$ ($\sin^2\theta_C \simeq 4\times 10^{-2}$) the d–c transition is a Cabibbo-suppressed one. This enhances the possibility of studying the strange sea in the nucleon by observing two-muon neutrino events (one muon is produced by neutrinos and another muon is produced in the decay of charmed particles) [1–5]. In the latest NuTeV experiment at Fermilab [4], the following value was found for the ratio of the total momentum fraction carried by the strange (and antistrange) sea quarks in the nucleon to the total momentum fraction carried by $\bar u$ and $\bar d$:
\[ \kappa = \frac{S + \bar S}{\bar U + \bar D} = 0.42 \pm 0.07 \pm 0.06 . \tag{1} \]
Here, $\bar Q = \int_0^1 dx\, x\,\bar q(x)$, $\bar q(x)$ being the number density of antiquarks $\bar q$ which carry the fraction x of the proton momentum p (in the infinite momentum frame). A recent analysis of deep inelastic scattering data found a larger value of $\kappa$ [6].

The investigation of the matrix elements $\langle p'|\bar s O s|p\rangle$ ($|p\rangle$ being the state of a nucleon with momentum p and O some spin operator) in the confinement region $Q^2 \lesssim 1\ \mathrm{GeV}^2$ is a very important subject [7–9].
Some information on this matrix element can be obtained from the pion–nucleon scattering data and from the masses of strange baryons (see Ref. [10]). Let us consider the scalar form factor
\[ \hat m\,\langle p'|(\bar u u + \bar d d)|p\rangle = \bar u(p')\,u(p)\,\sigma_{\pi N}(t) , \tag{2} \]
where
\[ \hat m = \tfrac12 (m_u + m_d), \qquad t = (p' - p)^2 . \]
Chiral perturbation theory allows one to connect the value of the scalar form factor at the Cheng–Dashen point $s = u = M^2$, $t = 2m_\pi^2$ with the isospin-even amplitude of pion–nucleon scattering [11–14] (s, u, t are the customary Mandelstam variables, M is the mass of the nucleon, $m_\pi$ the mass of the pion). From the results of the phase-shift analysis of the low-energy pion–nucleon data it is possible to obtain the value of the form factor at the point t = 0, $\sigma_{\pi N}(0)$, which is called the $\sigma$-term. The calculation of the $\sigma$-term requires an extrapolation from the point $t = 2m_\pi^2$ to the point t = 0. This procedure is based on dispersion relations and chiral perturbation theory. In Ref. [15],
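the value$^{2}$
\[ \sigma_{\pi N} \simeq 45 \pm 8\ \mathrm{MeV} \tag{3} \]
was found for the $\sigma$-term. Let us define the quantity
\[ y_N = \frac{\langle p|\bar s s|p\rangle}{\tfrac12 \langle p|(\bar u u + \bar d d)|p\rangle} , \tag{4} \]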
which characterizes the strange content of the nucleon. If one assumes that the breaking of SU(3) is due to the quark masses, the following relation, which connects the parameters $y_N$ and $\sigma_{\pi N}$ with the mass difference of the $\Lambda$ and $\Xi$ hyperons, can be derived [17]:
\[ \frac13\left(1 - \frac{m_s}{\hat m}\right)(1 - y_N)\,\sigma_{\pi N} \simeq M_\Lambda - M_\Xi . \tag{5} \]
Taking into account higher-order corrections and assuming that the mass ratio $m_s/\hat m$ has the standard value 25, from (5) we have
\[ (1 - y_N)\,\sigma_{\pi N} \simeq 31.8\ \mathrm{MeV} \tag{6} \]
and by combining (3) and (6) we obtain $y_N \simeq 0.3$. Let us stress that there are many uncertainties in the determination of the value of the $\sigma_{\pi N}$-term and of the parameter $y_N$. They are mainly connected with the pion–nucleon experimental data and the extrapolation procedure. Larger values for the parameter $y_N$, up to $y_N = 0.5$, were obtained by different authors (for a recent discussion, see Ref. [18]).

The most convincing evidence in favor of a non-zero value of the axial strange constant $g_A^s$, which characterizes the matrix element $\langle p|\bar s\gamma^\alpha\gamma_5 s|p\rangle$, was found from the data of experiments on deep inelastic scattering of polarized leptons on polarized nucleons. The first indication in favor of $g_A^s \neq 0$ was obtained in the EMC experiment at CERN [19]. Subsequent experiments at CERN [20], SLAC [21–24] and DESY [25,26] confirmed the EMC result. These experiments triggered a large number of theoretical papers in which the problem of the strangeness of the nucleon was investigated in detail (see the recent reviews [8–10,27,28]).

The cross section of the scattering of longitudinally polarized leptons on polarized nucleons is characterized by four dimensionless structure functions of the variables x and $Q^2$: $F_1(x,Q^2)$, $F_2(x,Q^2)$, $g_1(x,Q^2)$ and $g_2(x,Q^2)$ (here $x = Q^2/2p\cdot q$, $Q^2 = -q^2$, p is the momentum of the initial nucleon, q the momentum of the virtual photon). The functions $F_1(x,Q^2)$ and $F_2(x,Q^2)$ determine the unpolarized cross section, while the functions $g_1(x,Q^2)$ and $g_2(x,Q^2)$ characterize the part of the cross section which is proportional to the product of the polarizations of leptons and nucleons.

$^{2}$ Let us notice that in a recent lattice calculation [16], made within two-flavor QCD, the range 45–55 MeV for the value of the $\sigma$-term was obtained. The authors of Ref. [16] obtained this range of values, compatible with the one given by Eq. (3) and derived from experimental data, by using an extrapolation procedure in the quark masses which, at variance with previous attempts, respects the correct chiral behavior of QCD.
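As a rough numerical cross-check of the estimate $y_N \simeq 0.3$, Eqs. (3) and (6) can be combined directly. The short Python sketch below is only illustrative: the function name `strange_fraction` and the choice of scanning the $\pm 8$ MeV range of Eq. (3) are assumptions made here, and the uncertainty of the 31.8 MeV value in Eq. (6) is ignored.

```python
def strange_fraction(sigma_term_mev, reduced_sigma_mev=31.8):
    """Return y_N from (1 - y_N) * sigma_piN = reduced_sigma, cf. Eq. (6).

    sigma_term_mev    : value of the sigma-term in MeV, cf. Eq. (3)
    reduced_sigma_mev : right-hand side of Eq. (6) in MeV
    """
    if sigma_term_mev <= 0:
        raise ValueError("the sigma-term must be positive")
    return 1.0 - reduced_sigma_mev / sigma_term_mev


if __name__ == "__main__":
    # Central value of Eq. (3) and the edges of its +-8 MeV range.
    for sigma in (37.0, 45.0, 53.0):
        print(f"sigma_piN = {sigma:4.1f} MeV -> y_N = {strange_fraction(sigma):.2f}")
```

The strong dependence of $y_N$ on the input $\sigma$-term seen in this scan illustrates why the text stresses the uncertainties of the extraction.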
The measurement of the asymmetry in the deep inelastic scattering of longitudinally polarized leptons on longitudinally polarized nucleons allows one to determine the structure function $g_1(x,Q^2)$. In the framework of the naive parton model, based on the assumption that in the infinite momentum frame ($|\vec p| \to \infty$) partons (quarks) can be considered as free particles, all structure functions depend only on the scaling variable x. Let us consider, in the infinite momentum frame, a nucleon with helicity equal to one. For the function $g_1(x)$, we have
\[ g_1(x) = \frac12 \sum_q e_q^2\,\bigl[q^{(+1)}(x) + \bar q^{(+1)}(x) - q^{(-1)}(x) - \bar q^{(-1)}(x)\bigr] , \tag{7} \]
where $e_q$ is the charge of the quark (in units of the proton charge), $q^{(\pm 1)}(x)$ ($\bar q^{(\pm 1)}(x)$) is the number density of the quarks q (antiquarks $\bar q$) with momentum xp and helicity equal (respectively, opposite) to the helicity of the nucleon. Thus, the structure function $g_1$ is determined by the differences of the numbers of quarks and antiquarks with positive and negative helicities. Notice that in the naive parton model the structure functions $F_1(x)$ and $F_2(x)$ are given by
\[ F_2(x) = x \sum_q e_q^2\, q(x) = 2x F_1(x) , \tag{8} \]
where
\[ q(x) = q^{(+1)}(x) + q^{(-1)}(x), \qquad \bar q(x) = \bar q^{(+1)}(x) + \bar q^{(-1)}(x) \]
are the total numbers of quarks and antiquarks with momentum xp.

From the theoretical point of view, the important quantity is the first moment of the structure function $g_1$:
\[ \Gamma_1(Q^2) = \int_0^1 g_1(x,Q^2)\, dx . \tag{9} \]
In the region $Q^2 \lesssim 10\ \mathrm{GeV}^2$, the main contribution to $\Gamma_1$ comes from the light u, d, s quarks. In the naive parton model, for the first moment of the proton we have
\[ \Gamma_1^p = \tfrac12\left(\tfrac49 \Delta u + \tfrac19 \Delta d + \tfrac19 \Delta s\right) , \tag{10} \]
where
\[ \Delta q = \int_0^1 \sum_{r=\pm 1} r\,\bigl[q^{(r)}(x) + \bar q^{(r)}(x)\bigr]\, dx \tag{11} \]
is the difference of the total numbers of quarks and antiquarks in the nucleon with helicity equal and opposite to the helicity of the nucleon. Thus, $\Delta q$ is the contribution of the q-quarks and $\bar q$-antiquarks to the spin of the proton.

The first moment $\Gamma_1^p$ can be determined from the measurement of the deep inelastic scattering of polarized leptons on longitudinally polarized protons. In the EMC experiment at
$Q^2 = 10.7\ \mathrm{GeV}^2$, the value
\[ \Gamma_1^p(10.7) = 0.126 \pm 0.010 \pm 0.015 \tag{12} \]
was found, while in the latest CERN SMC [20], SLAC E155 [24] and DESY HERMES [25,26] experiments the following values of $\Gamma_1^p$ were determined:
\[ \Gamma_1^p(10) = 0.120 \pm 0.005 \pm 0.006 \pm 0.014 , \qquad \Gamma_1^p(5) = 0.118 \pm 0.004 \pm 0.007 , \qquad \Gamma_1^p(3) = 0.122 \pm 0.003 \pm 0.010 . \tag{13} \]
Let us stress that the experimental data can be obtained only in a limited interval of the variable x, which does not include the regions of very small and very large values of x. In order to determine the value of $\Gamma_1^p$, it is necessary to extrapolate the data to the points x = 0 and x = 1. The small-x behavior of the structure function $g_1$ is the most controversial issue (see Ref. [8]). Usually, a Regge behavior of the function $g_1$ is assumed at small x. Recent extrapolations are based on next-to-leading order (NLO) QCD fits. Notice that in some non-perturbative approaches a singular behavior of the function $g_1(x)$ at small x was obtained [8].

Let us now discuss the possibilities of determining the axial strange constant $g_A^s$ from these data. In the framework of the naive parton model the quantities $\Delta q$, which enter into the sum rule (10), are determined by the one-nucleon matrix element of the axial quark current $\bar q\gamma^\alpha\gamma_5 q$ (see Section 3):
\[ {}_p\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle_p = 2M s^\alpha\, \Delta q . \tag{14} \]
Here, $s^\alpha$ is the polarization vector of the nucleon and $|p\rangle_p$ ($|p\rangle_n$) is the state vector of a proton (neutron) with momentum p. Relation (14) allows one to obtain two constraints on the quantities $\Delta u$, $\Delta d$ and $\Delta s$. The first one comes from the isotopic SU(2) invariance of strong interactions, which implies
\[ {}_p\langle p|\bar u\gamma^\alpha\gamma_5 d|p\rangle_n = {}_p\langle p|(\bar u\gamma^\alpha\gamma_5 u - \bar d\gamma^\alpha\gamma_5 d)|p\rangle_p . \tag{15} \]
From (15) and (14) it follows that
\[ \Delta u - \Delta d = g_A , \tag{16} \]
where $g_A$ is the weak axial constant. From the data on the $\beta$-decay of the neutron it follows that [29]
\[ g_A = 1.2670 \pm 0.0035 . \tag{17} \]
The second constraint follows from SU(3) symmetry. Assuming exact SU(3) symmetry we have
\[ \Delta u + \Delta d - 2\Delta s = 3F - D \equiv g_A^8 , \tag{18} \]
where F and D are the constants which determine the matrix elements of the axial weak current for the states of different hyperons belonging to the SU(3) octet. From the fit of the experimental data it was found that [29]
\[ F = 0.463 \pm 0.008, \qquad D = 0.804 \pm 0.008 ; \tag{19} \]
hence,
\[ g_A^8 = 0.585 \pm 0.025 . \tag{20} \]
Now from Eqs. (10), (17) and (19) we can express the first moment as follows:
\[ \Gamma_1^p = 0.187 \pm 0.004 + \tfrac13 \Delta s . \tag{21} \]
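The numbers entering Eqs. (20) and (21) can be reproduced with a few lines of arithmetic. The Python sketch below is a hedged illustration rather than the procedure of the original analysis: the helper names (`g_a8`, `gamma1_offset`, `delta_s`) are invented here, and the uncertainties are simply combined in quadrature, which is an assumption. The constant term of Eq. (21) follows from eliminating $\Delta u$ and $\Delta d$ in Eq. (10) with the help of Eqs. (16) and (18), which gives $g_A/12 + 5g_A^8/36$.

```python
import math

# Inputs quoted in the text as (central value, uncertainty).
G_A = (1.2670, 0.0035)   # Eq. (17)
F = (0.463, 0.008)       # Eq. (19)
D = (0.804, 0.008)       # Eq. (19)


def g_a8(f, d):
    """g_A^8 = 3F - D, Eq. (18); errors added in quadrature (assumption)."""
    value = 3 * f[0] - d[0]
    error = math.hypot(3 * f[1], d[1])
    return value, error


def gamma1_offset(g_a, g8):
    """Constant term of Eq. (21): g_A/12 + 5 g_A^8/36."""
    value = g_a[0] / 12 + 5 * g8[0] / 36
    error = math.hypot(g_a[1] / 12, 5 * g8[1] / 36)
    return value, error


def delta_s(gamma1, gamma1_err, offset):
    """Solve Eq. (21) for Delta s: Delta s = 3 (Gamma_1^p - offset)."""
    value = 3 * (gamma1 - offset[0])
    error = 3 * math.hypot(gamma1_err, offset[1])
    return value, error


if __name__ == "__main__":
    g8 = g_a8(F, D)
    offset = gamma1_offset(G_A, g8)
    # EMC value of Eq. (12); its two errors are combined in quadrature here.
    ds = delta_s(0.126, math.hypot(0.010, 0.015), offset)
    print(f"g_A^8   = {g8[0]:.3f} +- {g8[1]:.3f}")
    print(f"offset  = {offset[0]:.3f} +- {offset[1]:.3f}")
    print(f"Delta s = {ds[0]:.2f} +- {ds[1]:.2f}")
```

With the EMC input of Eq. (12), this naive parton model estimate gives a negative $\Delta s$ of about $-0.18$; the precise size of the quoted uncertainty depends on how the individual errors are combined.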
If we compare (21) with the values of $\Gamma_1^p$ which were obtained in experiments [see (12) and (13)], we come to the conclusion that the quantity $\Delta s$, which determines the matrix element of the strange axial current, is different from zero and negative. Using, for example, the EMC result (12), from (21) we find
\[ \Delta s = -0.18 \pm 0.05 . \]
This conclusion is based on the naive parton model and was obtained about 10 years ago (see Ref. [7]). After the EMC result was obtained, many experimental and theoretical works were published. The leading order (LO) and NLO QCD corrections to the sum rule (10) and to relation (15) were calculated and many different effects were taken into account (see the reviews [8–10,27,28,30–33]).

Let us introduce the constants $A_i$:
\[ {}_p\langle p|\bar\psi\gamma^\alpha\gamma_5\lambda_i\psi|p\rangle_p = 2M s^\alpha A_i \qquad (i = 3, 8) \tag{22} \]
and
\[ {}_p\langle p|\bar\psi\gamma^\alpha\gamma_5\psi|p\rangle_p = 2M s^\alpha A_0 , \tag{23} \]
where
\[ \psi = \begin{pmatrix} u \\ d \\ s \end{pmatrix} \]
is the flavor SU(3) triplet, $\lambda_i$ are the Gell–Mann matrices, $\bar\psi\gamma^\alpha\gamma_5\lambda_i\psi$ is the SU(3) octet of axial currents and $\bar\psi\gamma^\alpha\gamma_5\psi$ is the axial singlet current. In the naive parton model the following relations hold:
\[ A_3 = \Delta u - \Delta d = g_A^3 \equiv g_A , \qquad A_8 = \Delta u + \Delta d - 2\Delta s \equiv g_A^8 , \qquad A_0 = \Delta u + \Delta d + \Delta s = g_A^0 \equiv \Delta\Sigma , \tag{24} \]
the quantity $\Delta\Sigma$ being the total contribution of quarks and antiquarks to the spin of the proton. The sum rule (10) can now be rewritten in the form
\[ \Gamma_1^p = \tfrac{1}{12} A_3 + \tfrac{1}{36} A_8 + \tfrac19 A_0 . \tag{25} \]
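The equivalence of Eqs. (25) and (10) is a matter of collecting coefficients. As a short worked check (a sketch added here, using only the naive parton model relations (24)):

```latex
\begin{align*}
\tfrac{1}{12}A_3 + \tfrac{1}{36}A_8 + \tfrac{1}{9}A_0
  &= \tfrac{1}{12}(\Delta u - \Delta d)
   + \tfrac{1}{36}(\Delta u + \Delta d - 2\Delta s)
   + \tfrac{1}{9}(\Delta u + \Delta d + \Delta s) \\
  &= \tfrac{3+1+4}{36}\,\Delta u + \tfrac{-3+1+4}{36}\,\Delta d
   + \tfrac{-2+4}{36}\,\Delta s \\
  &= \tfrac{1}{2}\left(\tfrac{4}{9}\Delta u + \tfrac{1}{9}\Delta d
   + \tfrac{1}{9}\Delta s\right) = \Gamma_1^p .
\end{align*}
% Eliminating A_0 through A_0 = A_8 + 3\Delta s gives the form used in Eq. (21):
\begin{equation*}
\Gamma_1^p = \tfrac{1}{12}\,g_A + \tfrac{5}{36}\,g_A^8 + \tfrac{1}{3}\,\Delta s .
\end{equation*}
```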
The NLO QCD corrections modify the above expression as follows (see [8] and references therein):
\[ \Gamma_1^p(Q^2) = \left(1 - \frac{\alpha_s(Q^2)}{\pi}\right)\left[\frac{1}{12} A_3 + \frac{1}{36} A_8 + \frac19 A_0(Q^2)\right] , \tag{26} \]
where $\alpha_s(Q^2)$ is the strong coupling constant and the quantities $A_i$ (i = 3, 8, 0) are given by relations (22) and (23). The quantities $A_3$ and $A_8$ are determined by the one-nucleon matrix elements of the corresponding conserved currents (we have assumed SU(3) flavor symmetry). These quantities do not depend on $Q^2$ and turn out to be
\[ A_3 = g_A , \qquad A_8 = g_A^8 , \tag{27} \]
where the numerical values of the constants $g_A$ and $g_A^8$ are given by Eqs. (17) and (20), respectively. The quantity $A_0$, instead, is determined by the matrix element of the non-conserved singlet current $\bar\psi\gamma^\alpha\gamma_5\psi$. If higher-order QCD corrections are taken into account, then this quantity depends on the renormalization scheme and on the renormalization scale, which is usually taken to be equal to $Q^2$.

Two renormalization schemes are commonly employed: the $\overline{\mathrm{MS}}$ scheme [34] and the Adler–Bardeen (AB) scheme [35]. In the $\overline{\mathrm{MS}}$ scheme $A_0(Q^2)$ is determined by the renormalization scale-dependent contribution of quarks to the spin of the nucleon,
\[ A_0(Q^2) = \Delta\Sigma(Q^2) = \sum_{q=u,d,s} \Delta q(Q^2) . \tag{28} \]
In the AB scheme $A_0(Q^2)$ is determined by the contribution of quarks and gluons to the spin of the proton,
\[ A_0(Q^2) = \Delta\Sigma_{AB} - 3\,\frac{\alpha_s(Q^2)}{2\pi}\,\Delta G(Q^2) , \tag{29} \]
where $\Delta\Sigma_{AB}$ does not depend on $Q^2$ and all renormalization scale dependence is absorbed by the gluon contribution $\Delta G(Q^2)$. The latter, due to the triangle anomaly [36,37], behaves as $1/\alpha_s$ and can give a sizable contribution to $A_0(Q^2)$.$^{3}$

$^{3}$ Let us notice that relation (29) offers the possibility of explaining the data by a large gluon contribution [36,37]. In fact, $A_0$ can be written in the form
\[ A_0 = g_A^8 + 3\Delta s - 3\,\frac{\alpha_s}{2\pi}\,\Delta G . \]
Even if we assume that $\Delta s = 0$, the experimental data can be explained by a large positive $\Delta G$.
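In the $\overline{\mathrm{MS}}$ scheme, from Eqs. (14), (24) and (26), for the matrix element of the axial strange current in NLO approximation we have
\[ \frac{1}{2M}\,\langle p|\bar s\gamma^\alpha\gamma_5 s|p\rangle\, s_\alpha = -3\left(1 - \frac{\alpha_s(Q^2)}{\pi}\right)^{-1}\Gamma_1^p(Q^2) + \frac14\, g_A + \frac{5}{12}\, g_A^8 . \tag{30} \]
Let us stress that the matrix element $\langle p|\bar s\gamma^\alpha\gamma_5 s|p\rangle$ depends on the renormalization scale. Using the E155 data [24] at $Q^2 = 5\ \mathrm{GeV}^2$ and the values (17), (20) of the axial constants $g_A$ and $g_A^8$, from (30) we find
\[ \frac{1}{2M}\,\langle p|\bar s\gamma^\alpha\gamma_5 s|p\rangle\, s_\alpha = 0.12 \pm 0.03 . \tag{31} \]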
Thus, if we take into account higher-order QCD corrections, from the data on deep inelastic scattering of polarized leptons on polarized protons we can conclude that the one-nucleon matrix element of the axial strange current is relatively large. This conclusion does not depend on the renormalization scheme (matrix elements are measurable quantities). Similar considerations can be drawn from the operator product expansion (OPE) approach [38].

In this review, we will consider possibilities of obtaining information on the strange vector and axial form factors of the nucleon from the investigation of neutral current effects. We will consider in detail the P-odd asymmetry in elastic and quasi-elastic scattering of polarized electrons on nucleons and nuclei (Sections 5, 6, 7, 8) and the elastic and quasi-elastic scattering of neutrinos (antineutrinos) on nucleons and nuclei (Sections 9, 10, 11, 12). We will discuss the existing experimental data and future experiments. Derivations of many basic relations will be presented.

2. The standard Lagrangian of the interaction of leptons and quarks with vector bosons

In the standard SU(2)×U(1) electroweak model (SM) [39–41], the Lagrangian of the interaction of the fundamental fermions (neutrinos, charged leptons and quarks) with vector bosons contains three parts: charged current (CC), electromagnetic (em) and neutral current (NC) interactions [14,42–46].

2.1. The charged current Lagrangian

The Lagrangian of the CC interaction of leptons and quarks with the charged vector bosons $W^\pm$ reads
\[ \mathcal{L}_I^{CC} = -\frac{g}{2\sqrt2}\, j_\alpha^{CC} W^\alpha + \mathrm{h.c.} \tag{32} \]
Here, g is a coupling constant which is connected with the Fermi constant $G_F$ by the relation
\[ \frac{G_F}{\sqrt2} = \frac{g^2}{8 m_W^2} \tag{33} \]
($m_W$ being the mass of the W-boson) and
\[ j_\alpha^{CC} = 2\,(j_\alpha^1 + i j_\alpha^2) \equiv 2 j_\alpha^{1+i2} \tag{34} \]
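is the charged current. In Eq. (34), $j_\alpha^{1,2}$ are the components of the isovector current:$^{4}$
\[ j_\alpha^i = \sum_a \bar\psi_{aL}\,\gamma_\alpha\,\tfrac12\tau_i\,\psi_{aL} , \tag{35} \]
where $\psi_{aL} = \tfrac12(1 - \gamma_5)\psi_a$ are left-handed doublets of the SU(2)×U(1) gauge group of the Standard Model:
\[ \psi_{eL} = \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix}, \quad \psi_{\mu L} = \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix}, \quad \psi_{\tau L} = \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix}, \quad \psi_{1L} = \begin{pmatrix} u_L \\ d_L \end{pmatrix}, \quad \psi_{2L} = \begin{pmatrix} c_L \\ s_L \end{pmatrix}, \quad \psi_{3L} = \begin{pmatrix} t_L \\ b_L \end{pmatrix} . \tag{36} \]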
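In terms of the fields of leptons and quarks with definite masses the charged current (34) reads
\[ j_\alpha^{CC} = 2\sum_{\ell=e,\mu,\tau} \bar\nu_{\ell L}\gamma_\alpha \ell_L + 2\,\bigl[\bar u_L\gamma_\alpha d_L^{mix} + \bar c_L\gamma_\alpha s_L^{mix} + \bar t_L\gamma_\alpha b_L^{mix}\bigr] , \tag{37} \]
where now
\[ d_L^{mix} = \sum_{q=d,s,b} V_{uq}\, q_L , \qquad s_L^{mix} = \sum_{q=d,s,b} V_{cq}\, q_L , \qquad b_L^{mix} = \sum_{q=d,s,b} V_{tq}\, q_L \tag{38} \]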
and V is the unitary 3×3 Cabibbo–Kobayashi–Maskawa mixing matrix.

2.2. The electromagnetic interaction Lagrangian

The Lagrangian of the electromagnetic interaction has the form
\[ \mathcal{L}_I^{em} = -e\, j_\alpha^{em} A^\alpha , \tag{39} \]
with e being the charge of the proton and
\[ j_\alpha^{em} = \sum_{\ell=e,\mu,\tau} (-1)\,\bar\ell\gamma_\alpha\ell + \sum_{q=u,d,\ldots} e_q\,\bar q\gamma_\alpha q \tag{40} \]
the electromagnetic current (with $e_u = 2/3$, $e_d = -1/3$, etc.).

$^{4}$ We use the Feynman–Bjorken–Drell metric. In this metric $g^{00} = 1$, $g^{ii} = -1$ (i = 1, 2, 3), the non-diagonal elements of $g^{\alpha\beta}$ being equal to zero. Thus, the scalar product of the vectors A and B is $A\cdot B \equiv A^\alpha B_\alpha = A^0 B^0 - \vec A\cdot\vec B$. Moreover, the Dirac matrices satisfy the anticommutation relations $\gamma^\alpha\gamma^\beta + \gamma^\beta\gamma^\alpha = 2g^{\alpha\beta}$ and we adopt the definition $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ for the matrix $\gamma_5$ and the definition $\epsilon^{0123} = 1$ for the antisymmetric tensor $\epsilon^{\alpha\beta\rho\sigma}$. For the spinors u(p) we will use the covariant normalization $\bar u(p)\gamma^\alpha u(p) = 2p^\alpha$. In this metric $\gamma^{\alpha\dagger} = \gamma^0\gamma^\alpha\gamma^0$. Notice also that the state vectors are normalized in such a way that $\langle p'|p\rangle = 2p_0\,(2\pi)^3\,\delta^{(3)}(\vec p\,' - \vec p)$ (see, for example, Ref. [47]). With this choice the normalizing factors do not appear in the matrix elements of the currents, but only in the final expression of the cross sections.

2.3. The neutral current Lagrangian

The Lagrangian of the NC interaction of leptons and quarks with the neutral vector boson $Z^0$ is
\[ \mathcal{L}_I^{NC} = -\frac{g}{2\cos\theta_W}\, j_\alpha^{NC} Z^\alpha , \tag{41} \]
where $\theta_W$ is the weak (Weinberg) angle, the characteristic parameter of the electroweak unification, and $j_\alpha^{NC}$ is the neutral current. The structure of the latter in the Standard Model is determined by the requirements of unification of the weak and electromagnetic interactions into the unified electroweak interaction. We have$^{5}$
\[ j_\alpha^{NC} = 2 j_\alpha^3 - 2\sin^2\theta_W\, j_\alpha^{em} . \tag{42} \]
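From Eqs. (35), (36) and (42) the neutral current can be rewritten in the following form:
\[ j_\alpha^{NC} = \sum_{q=u,c,t} \bar q\gamma_\alpha(1-\gamma_5)\,\tfrac12\, q + \sum_{q=d,s,b} \bar q\gamma_\alpha(1-\gamma_5)\left(-\tfrac12\right) q + \sum_{\ell=e,\mu,\tau} \bar\nu_\ell\gamma_\alpha(1-\gamma_5)\,\tfrac12\,\nu_\ell + \sum_{\ell=e,\mu,\tau} \bar\ell\gamma_\alpha(1-\gamma_5)\left(-\tfrac12\right)\ell - 2\sin^2\theta_W\, j_\alpha^{em} . \tag{43} \]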
In this review, we will focus on processes at relatively small energies (less than a few GeV). Therefore, it is convenient to separate, in (43), the contribution of the lightest u and d quarks. One obtains
\[ j_\alpha^{NC;q} = v_\alpha^3 - a_\alpha^3 - \tfrac12\,(v_\alpha^s - a_\alpha^s) - 2\sin^2\theta_W\, j_\alpha^{em} . \tag{44} \]
Here, we define
\[ v_\alpha^3 = \bar u\gamma_\alpha\tfrac12 u - \bar d\gamma_\alpha\tfrac12 d \equiv \bar N\gamma_\alpha\tfrac12\tau_3 N , \]
\[ a_\alpha^3 = \bar u\gamma_\alpha\gamma_5\tfrac12 u - \bar d\gamma_\alpha\gamma_5\tfrac12 d \equiv \bar N\gamma_\alpha\gamma_5\tfrac12\tau_3 N , \]
where $N = \begin{pmatrix} u \\ d \end{pmatrix}$. Indeed, the mass difference between d and u quarks ($m_d - m_u \simeq 3$ MeV) [29] is much smaller than the QCD constant $\Lambda_{QCD} \simeq 200$–$300$ MeV and can be neglected: in this case N is a doublet of the isotopic SU(2) group and the currents $v_\alpha^3$ and $a_\alpha^3$ are the third components of the isovectors
\[ v_\alpha^i = \bar N\gamma_\alpha\tfrac12\tau_i N , \qquad a_\alpha^i = \bar N\gamma_\alpha\gamma_5\tfrac12\tau_i N . \tag{45} \]

$^{5}$ We notice that in the literature different definitions of the NC are used. In particular, a frequently used notation differs from (42) by a factor of 2: $\tilde j_\alpha^{NC} = 2 j_\alpha^{NC}$.
Instead, the currents $v_\alpha^s$ and $a_\alpha^s$ are isoscalars: they represent the contributions to $j_\alpha^{NC;q}$ of the s, c and heavier quarks. Taking into account only s-quarks we have
\[ v_\alpha^s = \bar s\gamma_\alpha s , \qquad a_\alpha^s = \bar s\gamma_\alpha\gamma_5 s . \tag{46} \]
The quark electromagnetic current is given by [see Eq. (40)]
\[ j_\alpha^{em;q} = \sum_{q=u,d,\ldots} e_q\,\bar q\gamma_\alpha q . \tag{47} \]
Also in this case it is convenient to separate, in the above current, the contributions of the lightest u, d quarks. Taking into account that $e_q = I_3^q + \tfrac16$ (q = u, d) we have
\[ j_\alpha^{em;q} = v_\alpha^3 + v_\alpha^0 , \tag{48} \]
where $v_\alpha^0$ is the isoscalar current which, in the u, d, s approximation, reads
\[ v_\alpha^0 = \tfrac16\,\bar N\gamma_\alpha N + \left(-\tfrac13\right)\bar s\gamma_\alpha s . \tag{49} \]
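The bookkeeping behind Eqs. (44), (48) and (49) can be checked with exact rational arithmetic. The following Python sketch is illustrative only: the dictionaries `V3` and `V0`, the helper names and the sample value $\sin^2\theta_W = 23/100$ are assumptions introduced here, not quantities taken from the text. It verifies that $v^3 + v^0$ reproduces the quark charges and evaluates the vector coefficient of each light quark in the current (44).

```python
from fractions import Fraction as Fr

# Coefficient of (q-bar gamma_alpha q) in the isovector current v^3
# (defined below Eq. (44)) and in the isoscalar current v^0 of Eq. (49).
V3 = {"u": Fr(1, 2), "d": Fr(-1, 2), "s": Fr(0)}
V0 = {"u": Fr(1, 6), "d": Fr(1, 6), "s": Fr(-1, 3)}


def em_charge(quark):
    """Quark charge implied by j^em = v^3 + v^0, Eq. (48)."""
    return V3[quark] + V0[quark]


def vector_nc_coupling(quark, sin2_theta_w):
    """Vector coefficient of the quark in Eq. (44):
    v^3 - (1/2) v^s - 2 sin^2(theta_W) j^em."""
    vs = Fr(1) if quark == "s" else Fr(0)   # v^s contains only s-bar gamma s
    return V3[quark] - vs / 2 - 2 * sin2_theta_w * em_charge(quark)


if __name__ == "__main__":
    sin2 = Fr(23, 100)   # illustrative value only (assumption)
    for q in ("u", "d", "s"):
        print(q, "charge:", em_charge(q),
              "vector NC coefficient:", vector_nc_coupling(q, sin2))
```

In this normalization the d and s quarks come out with the same vector coefficient, as expected for two quarks with identical charge and weak isospin.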
3. One-nucleon matrix elements of the neutral current

We will consider here in detail the one-nucleon matrix elements of the neutral current as well as those of the electromagnetic current. Let us consider, for example, the process of the elastic scattering of a muon neutrino on the nucleon:
\[ \nu_\mu + p \to \nu_\mu + p . \tag{50} \]
The amplitude of this process is given by the expression
\[ \langle f|S|i\rangle = -i\,\frac{G_F}{\sqrt2}\,\bar u(k')\gamma^\alpha(1-\gamma_5)u(k)\;{}_{out}\langle p'|J_\alpha^{NC}(0)|p\rangle_{in}\,(2\pi)^4\,\delta^{(4)}(p' - p - q) , \tag{51} \]
where k and p (k' and p') are the four-momenta of the initial (final) neutrino and nucleon, respectively, $q = k - k'$ and
\[ {}_{out}\langle p'|J_\alpha^{NC}(0)|p\rangle_{in} = \langle p'|\,T\{ j_\alpha^{NC}(0)\, e^{-i\int \mathcal{H}_I^{had}(x)\, d^4x} \}\,|p\rangle \tag{52} \]
is the hadronic matrix element.$^{6}$

$^{6}$ The indexes “in” and “out” will be dropped hereafter.
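In Eq. (52) $\mathcal{H}_I^{had}(x)$ is the Hamiltonian density of the strong interactions, and $J_\alpha^{NC}(0)$, $|p\rangle_{in}$ and $|p'\rangle_{out}$ are the neutral current operator and the initial and final nucleon states in the Heisenberg representation. From (44) we get the following expression for the matrix element of the neutral current:
\[ \langle p'|J_\alpha^{NC}(0)|p\rangle = \langle p'|(V_\alpha^3 - A_\alpha^3)|p\rangle - \tfrac12\,\langle p'|(V_\alpha^s - A_\alpha^s)|p\rangle - 2\sin^2\theta_W\,\langle p'|J_\alpha^{em}|p\rangle , \tag{53} \]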
where $V_\alpha^{3(s)}$, $A_\alpha^{3(s)}$ are the currents in the Heisenberg representation. From the isotopic SU(2) invariance of strong interactions it follows that $V_\alpha^3$, $A_\alpha^3$ are the third components of the isovector currents $V_\alpha^i$ and $A_\alpha^i$ (i = 1, 2, 3) while $V_\alpha^s$, $A_\alpha^s$ are isoscalar currents.

The isotopic invariance of strong interactions allows one to determine the one-nucleon matrix elements of the current $V_\alpha^3$ from the one-nucleon matrix elements of the electromagnetic current $J_\alpha^{em}$. In fact, from (48) it follows that
\[ {}_{p(n)}\langle p'|J_\alpha^{em}|p\rangle_{p(n)} = {}_{p(n)}\langle p'|V_\alpha^3|p\rangle_{p(n)} + {}_{p(n)}\langle p'|V_\alpha^0|p\rangle_{p(n)} , \tag{54} \]
where $|p\rangle_p$ ($|p\rangle_n$) is the state of a proton (neutron) with momentum p. Furthermore, we have
\[ U V_\alpha^3 U^{-1} = -V_\alpha^3 , \qquad U V_\alpha^0 U^{-1} = V_\alpha^0 , \tag{55} \]
where the charge symmetry operator $U = \exp(i\pi I_2)$ (rotation by $\pi$ around the second axis in the isotopic space) transforms proton states into neutron states and vice versa, according to
\[ U|p\rangle_p = -|p\rangle_n , \qquad U|p\rangle_n = |p\rangle_p . \]
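The following relations then hold:
\[ {}_p\langle p'|V_\alpha^3|p\rangle_p = -\,{}_n\langle p'|V_\alpha^3|p\rangle_n , \qquad {}_p\langle p'|V_\alpha^0|p\rangle_p = +\,{}_n\langle p'|V_\alpha^0|p\rangle_n . \tag{56} \]
From (54) and (56) we then have
\[ {}_p\langle p'|V_\alpha^3|p\rangle_p = \tfrac12\,\bigl[{}_p\langle p'|J_\alpha^{em}|p\rangle_p - {}_n\langle p'|J_\alpha^{em}|p\rangle_n\bigr] . \tag{57} \]
Moreover,
\[ {}_p\langle p'|V_\alpha^0|p\rangle_p = \tfrac12\,\bigl[{}_p\langle p'|J_\alpha^{em}|p\rangle_p + {}_n\langle p'|J_\alpha^{em}|p\rangle_n\bigr] . \tag{58} \]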
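The step from Eqs. (54) and (56) to Eqs. (57) and (58) can be spelled out explicitly. The following short derivation is a sketch added for convenience; it uses only the relations quoted above.

```latex
% Eq. (54) written for the proton and for the neutron, with Eq. (56)
% used to express the neutron matrix elements through the proton ones:
\begin{align*}
{}_p\langle p'|J^{em}_\alpha|p\rangle_p
  &= {}_p\langle p'|V^3_\alpha|p\rangle_p + {}_p\langle p'|V^0_\alpha|p\rangle_p , \\
{}_n\langle p'|J^{em}_\alpha|p\rangle_n
  &= {}_n\langle p'|V^3_\alpha|p\rangle_n + {}_n\langle p'|V^0_\alpha|p\rangle_n
   = -\,{}_p\langle p'|V^3_\alpha|p\rangle_p + {}_p\langle p'|V^0_\alpha|p\rangle_p .
\end{align*}
% Subtracting the two lines isolates the isovector part, Eq. (57);
% adding them isolates the isoscalar part, Eq. (58):
\begin{align*}
{}_p\langle p'|J^{em}_\alpha|p\rangle_p - {}_n\langle p'|J^{em}_\alpha|p\rangle_n
  &= 2\,{}_p\langle p'|V^3_\alpha|p\rangle_p , \\
{}_p\langle p'|J^{em}_\alpha|p\rangle_p + {}_n\langle p'|J^{em}_\alpha|p\rangle_n
  &= 2\,{}_p\langle p'|V^0_\alpha|p\rangle_p .
\end{align*}
```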
Let us now discuss the one-nucleon matrix elements of the electromagnetic current. The conservation law of the electromagnetic current, $\partial^\alpha J_\alpha^{em} = 0$, entails
\[ (p' - p)^\alpha\,\langle p'|J_\alpha^{em}|p\rangle = 0 . \tag{59} \]
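From this relation, it follows that the one-nucleon matrix elements of the electromagnetic current are characterized by two form factors and have the general form
\[ \langle p'|J_\alpha^{em}|p\rangle = \bar u(p')\left[\gamma_\alpha F_1(Q^2) + \frac{i}{2M}\,\sigma_{\alpha\beta}\, q^\beta F_2(Q^2)\right] u(p) . \tag{60} \]
Here, $q = p' - p$ is the four-momentum transfer, $Q^2 = -q^2$, $\sigma_{\alpha\beta} = (i/2)(\gamma_\alpha\gamma_\beta - \gamma_\beta\gamma_\alpha)$; $F_1(Q^2)$ and $F_2(Q^2)$ are the Dirac and Pauli form factors. At $Q^2 = 0$, we have
\[ F_1(0) = e_N , \qquad F_2(0) = \kappa_N , \tag{61} \]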
where $e_N$ is the nucleon charge (in units of the proton charge) and $\kappa_N$ is the anomalous magnetic moment of the nucleon (in units of the nucleon Bohr magneton). Notice that from the invariance of strong interactions under time reversal it follows that the form factors are real functions of $Q^2$. With the help of the Dirac equation, the matrix element (60) can be rewritten in the form
\[ \langle p'|J^{em}_\alpha|p\rangle = \bar u(p')\left[\gamma_\alpha G_M(Q^2) - \frac{1}{2M}\,n_\alpha\,\frac{G_M(Q^2) - G_E(Q^2)}{1+\tau}\right]u(p) . \tag{62} \]
Here, $n = p + p'$, $\tau = Q^2/4M^2$ and
\[ G_M(Q^2) = F_1(Q^2) + F_2(Q^2) , \qquad G_E(Q^2) = F_1(Q^2) - \tau F_2(Q^2) \tag{63} \]
are, correspondingly, the magnetic and electric (charge) Sachs form factors. In the $Q^2 = 0$ limit they yield
\[ G_M(0) = e_N + \kappa_N = \mu_N , \qquad G_E(0) = e_N , \]
$\mu_N$ being the total magnetic moment of the nucleon (in units of the nucleon Bohr magneton). Let us notice that the magnetic and electric form factors $G_M$ and $G_E$ characterize the matrix elements of the operators $\vec J^{em}$ and $J_0^{em}$, respectively, in the Breit system (the system in which $\vec n = \vec p + \vec p\,' = 0$). In fact, from (62) it follows that
\[ \langle p'|\vec J^{em}|p\rangle = \bar u(p')\,\vec\gamma\,u(p)\,G_M(Q^2) . \tag{64} \]
Furthermore, in the Breit system $n_0 = 2p_0$ and we have
\[ \langle p'|J_0^{em}|p\rangle = \bar u(p')\left[\gamma_0 G_M(Q^2) - \frac{p_0}{M}\,\frac{G_M(Q^2) - G_E(Q^2)}{1+\tau}\right]u(p) , \tag{65} \]
while, from the Dirac equation, it follows that
\[ \bar u(p')(\slashed{p}' + \slashed{p})u(p) = 2p_0\,\bar u(p')\gamma_0 u(p) = 2M\,\bar u(p')u(p) . \tag{66} \]
The quantity $p_0^2/M^2$ in the Breit system can be expressed through
\[ \frac{p_0^2}{M^2} = 1 + \frac{Q^2}{4M^2} \equiv 1 + \tau \]
and combining (65) and (66) we have
\[ \langle p'|J_0^{em}|p\rangle = \bar u(p')\gamma_0 u(p)\,G_E(Q^2) . \tag{67} \]
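As a numerical illustration of Eqs. (63)–(67), the following minimal Python sketch converts Dirac and Pauli form factors into Sachs form factors and checks that the Breit-frame combination appearing in (65) reduces to $G_E$. The function names, the nucleon mass and the proton anomalous moment 1.793 used in the check are assumptions introduced here for illustration only.

```python
M_NUCLEON = 0.938272  # GeV, assumed nucleon mass


def tau_of(q2, mass=M_NUCLEON):
    """Kinematic variable tau = Q^2 / (4 M^2)."""
    return q2 / (4.0 * mass ** 2)


def sachs_from_dirac_pauli(f1, f2, q2, mass=M_NUCLEON):
    """Eq. (63): G_E = F1 - tau * F2 and G_M = F1 + F2; returns (G_E, G_M)."""
    tau = tau_of(q2, mass)
    return f1 - tau * f2, f1 + f2


def breit_charge_coefficient(g_e, g_m, q2, mass=M_NUCLEON):
    """Coefficient of ubar(p') gamma_0 u(p) in Eq. (65), after using Eq. (66)
    (ubar u = (p0 / M) ubar gamma_0 u) and p0^2 / M^2 = 1 + tau."""
    tau = tau_of(q2, mass)
    p0_sq_over_m_sq = 1.0 + tau
    return g_m - p0_sq_over_m_sq * (g_m - g_e) / (1.0 + tau)


# Proton at Q^2 = 0: F1(0) = e_N = 1, F2(0) = kappa_p (assumed value 1.793).
g_e0, g_m0 = sachs_from_dirac_pauli(1.0, 1.793, 0.0)
```

With these inputs `g_e0` is the proton charge and `g_m0` its total magnetic moment, as stated below Eq. (63); for any $Q^2$ the function `breit_charge_coefficient` returns $G_E$, which is the content of Eq. (67).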
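Let us consider now the one-nucleon matrix elements of the vector current. From relation (57) it follows that
\[ {}_p\langle p'|V^3_\alpha|p\rangle_p = -\,{}_n\langle p'|V^3_\alpha|p\rangle_n = \bar u(p')\left[\gamma_\alpha F_1^V(Q^2) + \frac{i}{2M}\,\sigma_{\alpha\beta}q^\beta F_2^V(Q^2)\right]u(p) , \tag{68} \]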
where
\[ F_1^V(Q^2) = \tfrac{1}{2}\bigl(F_{1,p}(Q^2) - F_{1,n}(Q^2)\bigr) , \qquad F_2^V(Q^2) = \tfrac{1}{2}\bigl(F_{2,p}(Q^2) - F_{2,n}(Q^2)\bigr) \]
are the isovector Dirac and Pauli form factors. Alternatively, we can use the isovector magnetic and electric (charge) form factors:
\[ G_M^V(Q^2) = \tfrac{1}{2}\bigl(G_{M,p}(Q^2) - G_{M,n}(Q^2)\bigr) , \qquad G_E^V(Q^2) = \tfrac{1}{2}\bigl(G_{E,p}(Q^2) - G_{E,n}(Q^2)\bigr) . \]
Let us consider now the one-nucleon matrix elements of the operator $A^3_\alpha$. Information about these matrix elements can be obtained from the data on the quasi-elastic processes:
\[ \nu_\mu + n \to \mu^- + p , \tag{69} \]
\[ \bar\nu_\mu + p \to \mu^+ + n . \tag{70} \]
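In the region $Q^2 \ll m_W^2$ we are interested in, the matrix elements of the processes (69) and (70) have the following form:
\[ \langle f|S|i\rangle = -i\,\frac{G_F}{\sqrt{2}}\,\bar u(k')\gamma^\alpha(1-\gamma_5)u(k)\;{}_p\langle p'|J^{CC}_\alpha|p\rangle_n\,(2\pi)^4\delta^{(4)}(p'-p-q) , \]
\[ \langle f|S|i\rangle = -i\,\frac{G_F}{\sqrt{2}}\,\bar u(k')\gamma^\alpha(1+\gamma_5)u(k)\;{}_n\langle p'|J^{CC\,\dagger}_\alpha|p\rangle_p\,(2\pi)^4\delta^{(4)}(p'-p-q) . \]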
Here, $k$ is the momentum of the initial $\nu_\mu$ ($\bar\nu_\mu$), $k'$ the momentum of the final $\mu^-$ ($\mu^+$); $p$ and $p'$ are the momenta of the initial $n$ ($p$) and of the final $p$ ($n$), respectively. The quark current which contributes to the matrix element of the process (69) has the form
\[ j^{CC}_\alpha = \bar u\gamma_\alpha(1-\gamma_5)d\,V_{ud} , \tag{71} \]
where $V_{ud}$ is an element of the CKM mixing matrix. From the existing data $|V_{ud}|^2 = 0.9735 \pm 0.0008$ [29] and hereafter we will not take into account small corrections due to $|V_{ud}| \neq 1$. The current (71) can be expressed in terms of the above introduced $u, d$ quark iso-doublet $N$ as follows:
\[ j^{CC}_\alpha = \bar N\gamma_\alpha(1-\gamma_5)\tfrac{1}{2}(\tau_1 + i\tau_2)N \equiv v^{1+i2}_\alpha - a^{1+i2}_\alpha , \tag{72} \]
where $v^{1+i2}_\alpha$ and $a^{1+i2}_\alpha$ are the "plus-components" of the isovectors (45). For the Heisenberg currents we have
\[ J^{CC}_\alpha = V^{1+i2}_\alpha - A^{1+i2}_\alpha , \tag{73} \]
where $V^{1+i2}_\alpha$ and $A^{1+i2}_\alpha$ are the "plus-components" of the isovectors $V^i_\alpha$ and $A^i_\alpha$.

Let us consider now the one-nucleon matrix elements of the axial current. The charge symmetry operator $U$ introduced in (55) transforms the isovector $A^i_\alpha$ as follows:
\[ U A^{1,3}_\alpha U^{-1} = -A^{1,3}_\alpha , \qquad U A^2_\alpha U^{-1} = A^2_\alpha . \tag{74} \]
From these relations we have
\[ {}_p\langle p'|A^{1+i2}_\alpha|p\rangle_n = {}_n\langle p'|A^{1-i2}_\alpha|p\rangle_p . \tag{75} \]
Eq. (75) implies
\[ {}_p\langle p'|A^{1+i2}_\alpha|p\rangle_n = {}_p\langle p|A^{1+i2}_\alpha|p'\rangle_n^{*} . \tag{76} \]
The one-nucleon matrix element of the CC axial current has the following general structure:
\[ {}_p\langle p'|A^{1+i2}_\alpha|p\rangle_n = \bar u(p')\left[\gamma_\alpha\gamma_5 G_A(Q^2) + \frac{1}{2M}\,q_\alpha\gamma_5 G_P^{CC}(Q^2) + \frac{1}{2M}\,n_\alpha\gamma_5 G_T^{CC}(Q^2)\right]u(p) . \tag{77} \]
Due to the invariance under time reversal the form factors $G_A$, $G_P^{CC}$ and $G_T^{CC}$ are real quantities. Moreover, taking into account relation (76), one easily finds that
\[ G_T^{CC}(Q^2) = 0 . \tag{78} \]
Let us notice that for the quasi-elastic processes $\bar u(k')\gamma^\alpha(1-\gamma_5)u(k)\,q_\alpha = -m_\mu\,\bar u(k')(1-\gamma_5)u(k)$. Thus, the contribution of the pseudoscalar form factor $G_P^{CC}$ to the matrix element of processes (69) and (70) is proportional to the muon mass $m_\mu$ and in the region of neutrino energies $\geq 1$ GeV can be neglected.
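Isotopic invariance of the strong interactions provides the relation between the one-nucleon matrix elements of the operators $A^3_\alpha$ and $A^{1+i2}_\alpha$. In fact for the isovector $A^i_\alpha$, we have
\[ [I_k, A^j_\alpha] = i\,\epsilon_{kj\ell}\,A^\ell_\alpha \tag{79} \]
with $I_k$ being the total isotopic spin operator (here $\epsilon_{kj\ell}$ is the totally antisymmetric tensor, with $\epsilon_{123} = 1$). This relation implies
\[ A^3_\alpha = \tfrac{1}{2}\,[A^{1+i2}_\alpha, I_{1-i2}] \tag{80} \]
and taking into account that
\[ I_{1-i2}|p\rangle_p = |p\rangle_n , \qquad {}_p\langle p'|I_{1-i2} = 0 , \]
from (80) the following relation holds:
\[ {}_p\langle p'|A^3_\alpha|p\rangle_p = -\,{}_n\langle p'|A^3_\alpha|p\rangle_n = \tfrac{1}{2}\,{}_p\langle p'|A^{1+i2}_\alpha|p\rangle_n . \tag{81} \]
The one-nucleon matrix elements of the axial current $A^3_\alpha$ have the general structure
\[ {}_p\langle p'|A^3_\alpha|p\rangle_p = \bar u(p')\left[\gamma_\alpha\gamma_5 G_A^V(Q^2) + \frac{1}{2M}\,q_\alpha\gamma_5 G_P^V(Q^2) + \frac{1}{2M}\,n_\alpha\gamma_5 G_T^V(Q^2)\right]u(p) , \tag{82} \]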
where due to the T-invariance of strong interactions the form factors $G_A^V$, $G_P^V$ and $G_T^V$ are real. Furthermore,
\[ {}_p\langle p'|A^3_\alpha|p\rangle_p = {}_p\langle p|A^3_\alpha|p'\rangle_p^{*} . \tag{83} \]
From (82) and (83) it follows that the form factor $G_T^V(Q^2)$ vanishes. Moreover, the contribution of the pseudoscalar form factor $G_P^V(Q^2)$ to the matrix elements of the NC-induced processes is proportional to the lepton mass and can be neglected (both for neutrino- and electron-induced processes). Finally, from (81) and (82) we have the following relation for the axial form factor:
\[ G_A^V(Q^2) = \tfrac{1}{2}\,G_A(Q^2) . \tag{84} \]
Thus, summarizing, the form factors that characterize the proton matrix elements of the u–d part of the NC are connected with the electromagnetic form factors of proton and neutron and with the CC axial nucleon form factor by the relations
\[ G^V_{M,E}(Q^2) = \tfrac{1}{2}\{G^p_{M,E}(Q^2) - G^n_{M,E}(Q^2)\} , \qquad G_A^V(Q^2) = \tfrac{1}{2}\,G_A(Q^2) . \]
The matrix elements of proton and neutron are connected by the charge-symmetry relations
\[ {}_p\langle p'|V^3_\alpha|p\rangle_p = -\,{}_n\langle p'|V^3_\alpha|p\rangle_n , \qquad {}_p\langle p'|A^3_\alpha|p\rangle_p = -\,{}_n\langle p'|A^3_\alpha|p\rangle_n . \]
4. Strange form factors of the nucleon

In this section, we will consider the strange form factors of the nucleon. Let us start by considering the one-nucleon matrix element of the vector, $v^s_\alpha = \bar s\gamma_\alpha s$, and axial, $a^s_\alpha = \bar s\gamma_\alpha\gamma_5 s$, strange currents.$^{7}$ They have the following general structure:
\[ \langle p'|V^s_\alpha|p\rangle = \bar u(p')\left[\gamma_\alpha F_1^s(Q^2) + \frac{i}{2M}\,\sigma_{\alpha\beta}q^\beta F_2^s(Q^2) + \frac{1}{2M}\,q_\alpha F_3^s(Q^2)\right]u(p) , \tag{85} \]
\[ \langle p'|A^s_\alpha|p\rangle = \bar u(p')\left[\gamma_\alpha\gamma_5 G_A^s(Q^2) + \frac{1}{2M}\,q_\alpha\gamma_5 G_P^s(Q^2) + \frac{1}{2M}\,n_\alpha\gamma_5 G_T^s(Q^2)\right]u(p) , \tag{86} \]
where, again, $q = p' - p$, $n = p' + p$, the $F_i^s(Q^2)$ ($i = 1, 2, 3$) are the strange vector form factors of the nucleon and $G_a^s(Q^2)$ ($a = A, P, T$) the strange axial ones, respectively.

From the invariance of strong interactions under time reversal it follows that the form factors $F_3^s(Q^2)$ and $G_T^s(Q^2)$ are equal to zero. In fact, for the axial current T-invariance implies:
\[ \langle p'|A^s_\alpha|p\rangle = \langle p_T|A^s_\alpha|p'_T\rangle\,\eta_\alpha \tag{87} \]
(repeated indexes, in the r.h.s., are not summed), where $\eta = (1, -1, -1, -1)$ and the vector $|p_T\rangle$ describes a nucleon with momentum $p_T = (p_0, -\vec p)$ and in a spin state
\[ u(p_T) = T\,\bar u^{T}(p) , \tag{88} \]
the matrix $T$ satisfying the condition
\[ T\gamma_\alpha^{T}T^{-1} = \gamma_\alpha\,\eta_\alpha . \tag{89} \]
With the help of (87) and (89) it is easy to see that
\[ G_T^s(Q^2) = 0 . \]
Analogously, from T-invariance it follows
\[ F_3^s(Q^2) = 0 . \]
Furthermore, from the hermiticity of the neutral currents we have
\[ \langle p'|A^s_\alpha|p\rangle = \langle p|A^s_\alpha|p'\rangle^{*} , \qquad \langle p'|V^s_\alpha|p\rangle = \langle p|V^s_\alpha|p'\rangle^{*} . \tag{90} \]
From (85), (86) and (90), it follows that the form factors $F^s_{1,2}(Q^2)$ and $G^s_{A,P}(Q^2)$ are real.$^{8}$ For the same reasons mentioned above we shall hereafter omit the pseudoscalar form factor $G_P^s$.

$^{7}$ Notice that the present, general discussion is also valid if the currents of the c and the other heavier (isoscalar) quarks are included.

$^{8}$ In general, the vector strange current is not conserved. However, due to T-invariance, the one-nucleon matrix element of the vector strange current satisfies the condition $(p'-p)^\alpha\langle p'|V^s_\alpha|p\rangle = 0$.
As an alternative to $F_1^s(Q^2)$ and $F_2^s(Q^2)$, one can define the magnetic and electric strange form factors of the nucleon, which are connected with $F^s_{1,2}(Q^2)$ by the relations
\[ G_M^s(Q^2) = F_1^s(Q^2) + F_2^s(Q^2) , \tag{91} \]
\[ G_E^s(Q^2) = F_1^s(Q^2) - \tau F_2^s(Q^2) , \tag{92} \]
which, in the $Q^2 = 0$ limit, assume the values
\[ G_M^s(0) = \mu_s , \tag{93} \]
\[ G_E^s(0) = 0 , \tag{94} \]
$\mu_s$ being the strange magnetic moment of the nucleon in units of the nuclear Bohr magneton. Obviously, relation (94) follows from the fact that the net strangeness of the nucleon is equal to zero.$^{9}$ In the region of small $Q^2$, we have
\[ G_E^s(Q^2) = -\tfrac{1}{6}\,\langle r_s^2\rangle\,Q^2 , \tag{95} \]
where $\langle r_s^2\rangle = -6\,(dG_E^s/dQ^2)_{Q^2=0}$ is a parameter which can be interpreted as the mean square strangeness radius of the nucleon.

As already mentioned in the Introduction, in the framework of the parton model the matrix element $\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle$ gives the contribution of the $q$-quark and $\bar q$-antiquark to the spin of the proton. In fact, assuming that the proton is in a state with momentum $p$ and helicity equal to one, we have
\[ {}_p\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle_p = \bar u(p)\gamma^\alpha\gamma_5 u(p)\,g_A^q , \tag{96} \]
where the spinor $u(p)$ satisfies the equation
\[ \gamma_5\,\slashed{s}\,u(p) = u(p) \tag{97} \]
and $s^\alpha$ is the unit vector which obeys the condition $s\cdot p = 0$. In the rest frame of the nucleon $s = (0, \vec\kappa)$, where $\vec\kappa$ is the unit vector in the direction of the proton momentum. From (96) and (97) we obtain
\[ {}_p\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle_p = \mathrm{Tr}\left[\gamma^\alpha\gamma_5\,\tfrac{1}{2}(1 + \gamma_5\slashed{s})(\slashed{p} + M)\right]g_A^q = 2M s^\alpha g_A^q . \tag{98} \]

$^{9}$ In fact, in the Breit system, for the one-nucleon matrix element of the strangeness operator $S = \int V_0^s(x)\,d^3x$ we have
\[ \langle p'|\int V_0^s(x)\,d^3x\,|p\rangle = (2\pi)^3\delta^{(3)}(\vec p\,' - \vec p)\,\langle p'|V_0^s(0)|p\rangle = (2\pi)^3\delta^{(3)}(\vec p\,' - \vec p)\,\bar u(p)\gamma_0 u(p)\,G_E^s(0) = (2\pi)^3\,2p_0\,\delta^{(3)}(\vec p\,' - \vec p)\,G_E^s(0) . \]
On the other hand, since the net strangeness of the nucleon is equal to zero, we have $\langle p'|\int V_0^s(x)\,d^3x\,|p\rangle = 0$.
Notice that, by combining Eq. (98) for $q = s$ with Eq. (30), one obtains the following value for $g_A^s$:
\[ G_A^s(0) \equiv g_A^s = -0.12 \pm 0.03 , \tag{99} \]
which represents the present direct estimate of this parameter from deep inelastic scattering experiments.

Now, let us consider the matrix element of the axial quark current in the parton approximation, in the infinite momentum frame. We have
\[ {}_p\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle_p = \int_0^1 \frac{p^0}{p_x^0}\sum_r \bar u^r(p_x)\gamma^\alpha\gamma_5 u^r(p_x)\,\bigl(q^r(x) + \bar q^r(x)\bigr)\,dx = \int_0^1 \frac{1}{x}\,2m_q s_q^\alpha \sum_r r\,\bigl(q^r(x) + \bar q^r(x)\bigr)\,dx , \tag{100} \]
where $q^r(x)$ ($\bar q^r(x)$) is the density of $q$-quarks ($\bar q$-antiquarks) with momentum $p_x = xp$ and helicity $r$; $x = Q^2/(2p\cdot q)$ is the Bjorken variable ($0 \leq x \leq 1$) and $m_q$ the mass of the $q$-quark. Taking into account that
\[ s_q^\alpha = x\,\frac{M}{m_q}\,s^\alpha , \]
from (100) we obtain
\[ {}_p\langle p|\bar q\gamma^\alpha\gamma_5 q|p\rangle_p = 2M s^\alpha \int_0^1 \sum_r r\,\bigl(q^r(x) + \bar q^r(x)\bigr)\,dx . \tag{101} \]
Now by comparing (98) with (101) one finds that in the parton approximation
\[ g_A^q = \int_0^1 \left[q^{(+)}(x) + \bar q^{(+)}(x) - \{q^{(-)}(x) + \bar q^{(-)}(x)\}\right]dx \equiv \Delta q . \tag{102} \]
Thus, the constant $g_A^q \equiv \Delta q$ is the contribution of $q$-quarks and $\bar q$-antiquarks to the spin of the nucleon.

There exists a large number of papers in which the strange magnetic moment $\mu_s$ and the strange radius $r_s$ of the nucleon are calculated within different models (pole models, chiral quark models, soliton models, Skyrme models, lattice QCD and others). The values of $\mu_s$ and $r_s$ predicted by different models differ widely in magnitude and in sign. It is not our aim here to review these papers and we refer the interested reader to the original literature [48–81].
Summarizing the contents of this and of the previous sections, for the one-nucleon matrix elements of the vector and axial NC of the Standard Model we have
\[ {}_{p(n)}\langle p'|V^{NC}_\alpha|p\rangle_{p(n)} = \bar u(p')\left[\gamma_\alpha F_1^{NC;p(n)}(Q^2) + \frac{i}{2M}\,\sigma_{\alpha\beta}q^\beta F_2^{NC;p(n)}(Q^2)\right]u(p) , \tag{103} \]
\[ {}_{p(n)}\langle p'|A^{NC}_\alpha|p\rangle_{p(n)} = \bar u(p')\gamma_\alpha\gamma_5\,G_A^{NC;p(n)}\,u(p) , \tag{104} \]
where the NC form factors are given by
\[ F_{1,2}^{NC;p(n)}(Q^2) = \pm\tfrac{1}{2}\{F^p_{1,2}(Q^2) - F^n_{1,2}(Q^2)\} - 2\sin^2\theta_W\,F^{p(n)}_{1,2}(Q^2) - \tfrac{1}{2}F^s_{1,2}(Q^2) , \tag{105} \]
\[ G_A^{NC;p(n)}(Q^2) = \pm\tfrac{1}{2}G_A(Q^2) - \tfrac{1}{2}G_A^s(Q^2) . \tag{106} \]
Equivalently, one can consider the NC Sachs form factors:
\[ G_E^{NC;p(n)}(Q^2) = \pm\tfrac{1}{2}\{G_E^p(Q^2) - G_E^n(Q^2)\} - 2\sin^2\theta_W\,G_E^{p(n)}(Q^2) - \tfrac{1}{2}G_E^s(Q^2) , \tag{107} \]
\[ G_M^{NC;p(n)}(Q^2) = \pm\tfrac{1}{2}\{G_M^p(Q^2) - G_M^n(Q^2)\} - 2\sin^2\theta_W\,G_M^{p(n)}(Q^2) - \tfrac{1}{2}G_M^s(Q^2) . \tag{108} \]
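The structure of relations (107) and (108) can be made explicit with a short Python sketch. It evaluates the NC Sachs form factor of the proton or neutron from the electromagnetic and strange form factors, and also in the rearranged form used later in Eq. (132), so that the two can be compared. The function names and the sample input values are assumptions introduced for illustration; the value of $\sin^2\theta_W$ is the one quoted in the text.

```python
SIN2_THETA_W = 0.23117  # on-shell value quoted in the text


def nc_sachs(g_p, g_n, g_s, nucleon="p", sin2w=SIN2_THETA_W):
    """Eqs. (107)/(108): upper sign for the proton, lower sign for the neutron."""
    sign = 1.0 if nucleon == "p" else -1.0
    g_self = g_p if nucleon == "p" else g_n
    return sign * 0.5 * (g_p - g_n) - 2.0 * sin2w * g_self - 0.5 * g_s


def nc_sachs_rearranged(g_p, g_n, g_s, nucleon="p", sin2w=SIN2_THETA_W):
    """Same quantity written as in Eq. (132)."""
    g_self, g_other = (g_p, g_n) if nucleon == "p" else (g_n, g_p)
    return 0.5 * ((1.0 - 4.0 * sin2w) * g_self - g_other) - 0.5 * g_s


# Electric form factors at Q^2 = 0: G_E^p = 1, G_E^n = 0, G_E^s = 0 (Eq. (94)).
weak_charge_half = nc_sachs(1.0, 0.0, 0.0, nucleon="p")
```

At $Q^2 = 0$ the proton value reduces to $\tfrac{1}{2}(1 - 4\sin^2\theta_W)$, which shows why the NC electric form factor of the proton is small and why the strange and neutron terms are comparatively important.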
Relations (105) [or (107), (108)] and (106) are the basic ones. From these relations it is obvious that the investigation of NC-induced processes allows one to obtain direct information on the strange form factors of the nucleon, provided one can utilize "a priori" information on the value of the parameter $\sin^2\theta_W$ (which is obtained from the measurement of different NC processes), on the electromagnetic form factors of the nucleons (which are obtained from the measurement of elastic scattering of electrons on nucleons) and on the axial form factor of the nucleon (which is obtained from the measurement of quasi-elastic CC neutrino scattering on nucleons).

The investigation of NC-induced processes in the region $Q^2 \ll 1$ GeV$^2$ allows one to determine the strange magnetic moment of the nucleon $\mu_s$ and the strange axial constant $g_A^s$ directly from experimental data. At larger momentum transfers one could obtain information on the $Q^2$ behavior of the strange form factors of the nucleon.

In the next sections, we shall discuss possible experiments from which direct information on the strange form factors of the nucleon can be obtained. We will also present the existing experimental data.

5. P-odd effects in the elastic scattering of polarized electrons on the nucleon

There are two types of NC-induced effects which allow one to obtain direct information on the strange form factors of the nucleon (see for example Ref. [82]):
(1) The P-odd asymmetry in the elastic scattering of polarized electrons on unpolarized nucleons.
(2) The NC-induced elastic scattering of neutrinos and antineutrinos on nucleons.

In this section, we will discuss the P-odd asymmetry in the process
\[ \vec e + p \to e + p . \tag{109} \]

Fig. 1. Diagrams of the process $\vec e + p \to e + p$.
The diagrams of process (109) in lowest order in the constants $e$ and $g$ are shown in Fig. 1, where both the exchange of a photon and of the vector boson $Z^0$ are considered. For the matrix element of this process we have the following expression:
\[ \langle f|S|i\rangle = i(2\pi)^4\delta^{(4)}(p'-p-q)\,\frac{4\pi\alpha}{Q^2}\left[\bar u(k')\gamma^\alpha u(k)\,\langle p'|J^{em}_\alpha|p\rangle - \frac{G_F Q^2}{2\sqrt{2}\pi\alpha}\,\bar u(k')\gamma^\alpha(g_V - g_A\gamma_5)u(k)\,\langle p'|J^{NC}_\alpha|p\rangle\right] . \tag{110} \]
Here, $k$ and $k'$ are the momenta of the initial and final electron, $p$ and $p'$ the momenta of the initial and final nucleon, $q = k - k'$, $\alpha = e^2/4\pi$ and the weak NC vector and axial couplings for the electron are
\[ g_V = -\tfrac{1}{2} + 2\sin^2\theta_W , \qquad g_A = -\tfrac{1}{2} . \tag{111} \]
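A quick numerical evaluation of the couplings (111) is sketched below in Python. The function name is an assumption made for illustration; the input is the on-shell value $\sin^2\theta_W = 0.23117$ quoted later in this section.

```python
def electron_nc_couplings(sin2w):
    """Eq. (111): vector and axial NC couplings of the electron."""
    g_v = -0.5 + 2.0 * sin2w
    g_a = -0.5
    return g_v, g_a


g_v, g_a = electron_nc_couplings(0.23117)
suppression = 1.0 - 4.0 * 0.23117   # equals -2 * g_v
```

The vector coupling comes out more than an order of magnitude smaller than the axial one, and the combination $1 - 4\sin^2\theta_W = -2g_V$ is the suppression factor that multiplies the axial hadronic term in the asymmetry discussed below.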
Since we will consider polarized electrons, let us introduce the density matrix of electrons with momentum $k$ and polarization $P$. It is given by
\[ \rho(k) = \tfrac{1}{2}(1 + \gamma_5\slashed{P})(\slashed{k} + m) . \tag{112} \]
Here, $P^\alpha$ is the four-vector of polarization, satisfying the condition $P\cdot k = 0$. In the electron rest frame we have $P = (0, \vec P^0)$, the vector $\vec P^0$ being usually written in the form of the sum of longitudinal and transverse components:
\[ \vec P^0 = P^0_\parallel\,\vec\kappa + \vec P^0_\perp , \tag{113} \]
where $\vec\kappa$ is the unit vector in the direction of the electron momentum. We shall consider scattering of high-energy electrons on nucleons. Thus, $k_0 \gg m$ and the polarization vector can be approximated by the expression:
\[ P^\alpha = P^0_\parallel\,\frac{k^\alpha}{m} + P^\alpha_\perp , \tag{114} \]
where $P_\perp = (0, \vec P^0_\perp)$. From (112) and (114) it follows that the density matrix of ultrarelativistic electrons has the form
\[ \rho(k) = \tfrac{1}{2}(1 + \lambda\gamma_5 + \gamma_5\slashed{P}_\perp)\,\slashed{k} , \tag{115} \]
where we have introduced the notation $\lambda = P^0_\parallel$. Notice that for the V–A interaction the contribution of the transverse polarization to the cross section is proportional to the electron mass and at high energies can be neglected.

The lowest order contribution to the cross section of the process stemming from the second (NC) term of matrix element (110) is determined by the quantity
\[ \mathcal{A}_0 \equiv \frac{G_F Q^2}{2\sqrt{2}\pi\alpha} = 1.798\times 10^{-4}\,\frac{Q^2}{\mathrm{GeV}^2} , \tag{116} \]
which is small in the region $Q^2 \lesssim M^2$ we are interested in. Thus, in the calculation of the cross section we shall only take into account the square of the first (electromagnetic) term of (110) and the interference of the electromagnetic and NC terms. Let us notice that the interference of the electromagnetic and P-even part of the NC term ($v\cdot V$ and $a\cdot A$) gives a very small correction to the electromagnetic term and can be neglected. We shall be interested in the pseudoscalar term of the cross section which is proportional to $\lambda$ and is due to the interference of the electromagnetic amplitude and P-odd part of the NC amplitude ($v\cdot A$ and $a\cdot V$). It is obvious that
\[ \tfrac{1}{4}\,\mathrm{Tr}\left[\gamma^\alpha\,\lambda\gamma_5\slashed{k}\,\gamma^\beta(g_V - g_A\gamma_5)\slashed{k}'\right] = \lambda\{g_V L_5^{\alpha\beta}(k,k') - g_A L^{\alpha\beta}(k,k')\} , \tag{117} \]
where
\[ L^{\alpha\beta}(k,k') = k^\alpha k'^\beta + k'^\alpha k^\beta - g^{\alpha\beta}\,k\cdot k' , \tag{118} \]
\[ L_5^{\alpha\beta}(k,k') = i\,\epsilon^{\alpha\beta\rho\sigma}k_\rho k'_\sigma , \tag{119} \]
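$\epsilon^{\alpha\beta\rho\sigma}$ being the antisymmetric (under the exchange of any two indexes) tensor, with $\epsilon^{0123} = -\epsilon_{0123} = -1$. With the help of Eq. (117) the cross section of process (109) can be expressed as follows:
\[ d\sigma_\lambda = \frac{4\alpha^2}{Q^4}\,\frac{M}{p\cdot k}\left\{L^{\alpha\beta}W^{em}_{\alpha\beta} + \lambda\,\frac{G_F Q^2}{2\sqrt{2}\pi\alpha}\left[g_V L_5^{\alpha\beta}W^I_{\alpha\beta}(A) + g_A L^{\alpha\beta}W^I_{\alpha\beta}(V)\right]\right\}\frac{1}{2}\,\frac{d\vec k'}{k'_0} . \tag{120} \]
In the above, the hadronic electromagnetic tensor $W^{em}_{\alpha\beta}$ is given by
\[ W^{em}_{\alpha\beta} = \frac{1}{2M}\,\langle p'|J^{em}_\alpha|p\rangle\langle p|J^{em}_\beta|p'\rangle\,\delta^{(4)}(p'-p-q)\,\frac{d\vec p\,'}{2p'_0} , \tag{121} \]
while the tensor $W^I_{\alpha\beta}(V)$ (pseudotensor $W^I_{\alpha\beta}(A)$) arises from the interference of the electromagnetic and vector (axial) part of the hadronic NC:
\[ W^I_{\alpha\beta}(V) = \frac{1}{2M}\left\{\langle p'|J^{em}_\alpha|p\rangle\langle p|V^{NC}_\beta|p'\rangle + \langle p'|V^{NC}_\alpha|p\rangle\langle p|J^{em}_\beta|p'\rangle\right\}\delta^{(4)}(p'-p-q)\,\frac{d\vec p\,'}{2p'_0} , \tag{122} \]
\[ W^I_{\alpha\beta}(A) = \frac{1}{2M}\left\{\langle p'|J^{em}_\alpha|p\rangle\langle p|A^{NC}_\beta|p'\rangle + \langle p'|A^{NC}_\alpha|p\rangle\langle p|J^{em}_\beta|p'\rangle\right\}\delta^{(4)}(p'-p-q)\,\frac{d\vec p\,'}{2p'_0} . \tag{123} \]
In (121)–(123) the integration over the final nucleon momentum $\vec p\,'$, the sum over the final nucleon spin states and the average over the initial ones are understood.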
From Eqs. (120), (122) and (123) it follows that information on the one-nucleon matrix elements of the NC can be obtained by investigating the dependence of the cross section of process (109) on the longitudinal polarization $\lambda$.

The SM values of the constants $g_V$ and $g_A$ are given by (111). The parameter $\sin^2\theta_W$ is known, at present, with very high accuracy. Its on-shell value is given by [29]
\[ \sin^2\theta_W = 0.23117 \pm 0.00016 . \]
For the constant $g_V$, Eq. (111) then gives
\[ g_V = -0.0377 \pm 0.0003 . \]
Thus in the SM $|g_V| \ll |g_A|$. Taking into account this inequality we can conclude from the general expression for the cross section (120) that the main contribution to the $\lambda$-dependent part of the cross section is given by the interference of the electromagnetic term and the vector part of the NC term. Nevertheless, the axial part of the NC term is not negligible under specific kinematical conditions.

The tensors $W^{em}_{\alpha\beta}$, $W^I_{\alpha\beta}(V)$ and the pseudotensor $W^I_{\alpha\beta}(A)$ have the following general form:
\[ W^{em}_{\alpha\beta} = -\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right)W_1^{em} + \frac{1}{4M^2}\,n_\alpha n_\beta\,W_2^{em} , \]
\[ W^I_{\alpha\beta}(V) = -\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right)W_1^{I} + \frac{1}{4M^2}\,n_\alpha n_\beta\,W_2^{I} , \]
\[ W^I_{\alpha\beta}(A) = \frac{i}{2M^2}\,\epsilon_{\alpha\beta\rho\sigma}\,p^\rho q^\sigma\,W_3^{I} , \tag{124} \]
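where $n = p + p'$. Calculating the traces in Eqs. (121), (122) and (123), one obtains
\[ W_1^{em} = \tau G_M^2\,\delta\!\left(\nu - \frac{Q^2}{2M}\right) , \qquad W_2^{em} = \frac{G_E^2 + \tau G_M^2}{1+\tau}\,\delta\!\left(\nu - \frac{Q^2}{2M}\right) \tag{125} \]
and
\[ W_1^{I} = 2\tau G_M G_M^{NC}\,\delta\!\left(\nu - \frac{Q^2}{2M}\right) , \qquad W_2^{I} = 2\,\frac{G_E G_E^{NC} + \tau G_M G_M^{NC}}{1+\tau}\,\delta\!\left(\nu - \frac{Q^2}{2M}\right) , \qquad W_3^{I} = 2 G_M G_A^{NC}\,\delta\!\left(\nu - \frac{Q^2}{2M}\right) . \tag{126} \]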
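Here, $\nu = p\cdot q/M$ and $\tau = Q^2/4M^2$. With the help of Eqs. (120), (125) and (126), for the cross section of the scattering of electrons with polarization $\lambda$ on unpolarized nucleons we find the following general expression:
\[ \left(\frac{d\sigma}{d\Omega}\right)_\lambda = \left(\frac{d\sigma}{d\Omega}\right)_0 (1 + \lambda\mathcal{A}) , \tag{127} \]
where $(d\sigma/d\Omega)_0$ is the cross section for the scattering of unpolarized electrons on nucleons and is given by the Rosenbluth formula
\[ \left(\frac{d\sigma}{d\Omega}\right)_0 = \sigma_{Mott}\left[\frac{G_E^2 + \tau G_M^2}{1+\tau} + 2\tan^2\frac{\theta}{2}\,\tau G_M^2\right] . \tag{128} \]
Here, $\sigma_{Mott}$ is the Mott cross section
\[ \sigma_{Mott} = \frac{\alpha^2\cos^2(\theta/2)}{4E^2\sin^4(\theta/2)\,\bigl(1 + (2E/M)\sin^2(\theta/2)\bigr)} , \tag{129} \]
where $M$ is the mass of the target nucleon, $E$ and $\theta$ are the energy and scattering angle of the electron in the laboratory system. From Eq. (127) it follows that the P-odd asymmetry is given by
\[ \mathcal{A} = \frac{1}{\lambda}\,\frac{(d\sigma/d\Omega)_\lambda - (d\sigma/d\Omega)_{-\lambda}}{(d\sigma/d\Omega)_\lambda + (d\sigma/d\Omega)_{-\lambda}} . \tag{130} \]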
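With the help of (120), (124), (125) and (126) we find the following expression for the asymmetry $\mathcal{A}$ in Born approximation:
\[ \mathcal{A} = -\mathcal{A}_0\,\frac{\tau G_M G_M^{NC} + \epsilon\,G_E G_E^{NC} + (1 - 4\sin^2\theta_W)\,\epsilon' G_M G_A^{NC}}{\tau G_M^2 + \epsilon\,G_E^2} , \tag{131} \]
where
\[ \epsilon = \frac{1}{1 + 2(1+\tau)\tan^2(\theta/2)} , \qquad \epsilon' = \sqrt{\tau(1+\tau)(1-\epsilon^2)} . \]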
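To give a feeling for the size of the asymmetry (131), the following Python sketch evaluates it for the proton, using (107), (108) and (106) for the NC form factors. It is an illustration under stated assumptions, not the analysis of any experiment: the function names are hypothetical, the Fermi constant, fine structure constant, nucleon mass and magnetic moments are standard values inserted here, the electromagnetic form factors are taken from the ratios (140) quoted in Section 6 together with $G_M^p/\mu_p \simeq 0.36$, and the axial form factor is an assumed dipole with $g_A = 1.267$ and $M_A = 1.03$ GeV without radiative corrections.

```python
import math

G_FERMI = 1.16639e-5         # GeV^-2 (assumed standard value)
ALPHA = 1.0 / 137.036        # fine structure constant (assumed standard value)
M_NUCLEON = 0.938272         # GeV
SIN2_THETA_W = 0.23117       # value quoted in the text
MU_P, MU_N = 2.793, -1.913   # nucleon magnetic moments in nuclear magnetons


def a0(q2):
    """Eq. (116): A_0 = G_F Q^2 / (2 sqrt(2) pi alpha), with Q^2 in GeV^2."""
    return G_FERMI * q2 / (2.0 * math.sqrt(2.0) * math.pi * ALPHA)


def kinematic_factors(q2, theta):
    """tau, epsilon and epsilon' as defined below Eq. (131); theta in radians."""
    tau = q2 / (4.0 * M_NUCLEON ** 2)
    eps = 1.0 / (1.0 + 2.0 * (1.0 + tau) * math.tan(theta / 2.0) ** 2)
    eps_prime = math.sqrt(tau * (1.0 + tau) * (1.0 - eps ** 2))
    return tau, eps, eps_prime


def nc_sachs_proton(g_p, g_n, g_s):
    """Eqs. (107)/(108) for the proton (upper sign)."""
    return 0.5 * (g_p - g_n) - 2.0 * SIN2_THETA_W * g_p - 0.5 * g_s


def pv_asymmetry(q2, theta, ge_p, gm_p, ge_n, gm_n, g_axial,
                 ge_s=0.0, gm_s=0.0, ga_s=0.0):
    """Eq. (131) for elastic scattering on the proton."""
    tau, eps, eps_prime = kinematic_factors(q2, theta)
    ge_nc = nc_sachs_proton(ge_p, ge_n, ge_s)
    gm_nc = nc_sachs_proton(gm_p, gm_n, gm_s)
    ga_nc = 0.5 * g_axial - 0.5 * ga_s            # Eq. (106), proton
    numerator = (tau * gm_p * gm_nc + eps * ge_p * ge_nc
                 + (1.0 - 4.0 * SIN2_THETA_W) * eps_prime * gm_p * ga_nc)
    denominator = tau * gm_p ** 2 + eps * ge_p ** 2
    return -a0(q2) * numerator / denominator


# HAPPEX-like kinematics; form factor inputs follow the ratios (140).
q2, theta = 0.477, math.radians(12.3)
g_ref = 0.36                                       # G_M^p / mu_p
ge_p, gm_p = 0.99 * g_ref, MU_P * g_ref
ge_n, gm_n = 0.16 * g_ref, 1.05 * MU_N * g_ref
g_axial = 1.267 / (1.0 + q2 / 1.03 ** 2) ** 2      # assumed dipole form
asym_no_strange = pv_asymmetry(q2, theta, ge_p, gm_p, ge_n, gm_n, g_axial)
asym_with_gms = pv_asymmetry(q2, theta, ge_p, gm_p, ge_n, gm_n, g_axial,
                             gm_s=0.1)
```

With vanishing strange form factors the sketch gives an asymmetry of order $-10^{-5}$, and a positive $G_M^s$ makes it less negative, since the strange form factors enter $G^{NC}_{M,E}$ with the coefficient $-\tfrac{1}{2}$.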
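We remind the reader that the NC vector and axial form factors [see expressions (107), (108) and (106)] can be written in the following form:
\[ G^{NC;p(n)}_{M,E} = \tfrac{1}{2}\{(1 - 4\sin^2\theta_W)\,G^{p(n)}_{M,E} - G^{n(p)}_{M,E}\} - \tfrac{1}{2}G^s_{M,E} \equiv G^{0;p(n)}_{M,E} - \tfrac{1}{2}G^s_{M,E} , \tag{132} \]
\[ G_A^{NC;p(n)} = \pm\tfrac{1}{2}G_A - \tfrac{1}{2}G_A^s \equiv G_A^{0;p(n)} - \tfrac{1}{2}G_A^s . \tag{133} \]
Using expressions (132) and (133) we can explicitly separate the terms proportional to strange form factors in the expression of the P-odd asymmetry. Indeed, the r.h.s. of Eq. (131) can be split as follows:
\[ \mathcal{A} = \mathcal{A}^{(0)} + \mathcal{A}^{(s)} , \tag{134} \]
where
\[ \mathcal{A}^{(0)} = -\mathcal{A}_0\,\frac{\tau G_M G_M^{0} + \epsilon\,G_E G_E^{0} + (1 - 4\sin^2\theta_W)\,\epsilon' G_M G_A^{0}}{\tau G_M^2 + \epsilon\,G_E^2} , \tag{135} \]
\[ \mathcal{A}^{(s)} = \frac{\mathcal{A}_0}{2}\,\frac{\tau G_M G_M^{s} + \epsilon\,G_E G_E^{s} + (1 - 4\sin^2\theta_W)\,\epsilon' G_M G_A^{s}}{\tau G_M^2 + \epsilon\,G_E^2} . \tag{136} \]
The factor $\tfrac{1}{2}$ and the positive sign in (136) follow from inserting the $-\tfrac{1}{2}G^s$ terms of (132) and (133) into (131).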
According to this separation, the asymmetry $\mathcal{A}^{(0)}$ is determined by the non-strange electromagnetic and axial form factors of the nucleon and by the electroweak parameter $\sin^2\theta_W$. The asymmetry $\mathcal{A}^{(s)}$, instead, is the contribution to the P-odd asymmetry of the strange form factors. As can be seen from these expressions, the contribution of the axial strange form factor to the asymmetry is suppressed by the factor $1 - 4\sin^2\theta_W \simeq 0.075$.

Let us stress that in order to obtain information on the strange vector form factors of the nucleon from the measurement of the P-odd asymmetry it is necessary to know the nucleonic electromagnetic form factors with sufficient accuracy. Due to the isovector nature of the u–d part of the neutral current, even if we limit ourselves to the P-odd asymmetry for the scattering on the proton, this quantity contains the electromagnetic form factors of both proton and neutron. At present, the electromagnetic form factors of the neutron, and particularly its charge form factor, are rather poorly known. New measurements of the electromagnetic form factors of the nucleon are under way or planned at the Thomas Jefferson National Accelerator Laboratory (Jefferson Lab) [83].

6. The experiments on the measurement of P-odd asymmetry in elastic e–p scattering

We will discuss here the results of recent experiments on the measurement of the P-odd asymmetry $\mathcal{A}$ in elastic electron–proton scattering.

In the experiment of the HAPPEX collaboration at Jefferson Lab [84] the elastic scattering of electrons with an energy of 3.3 GeV at the average scattering angle $\theta = 12.3^\circ$ was measured. Consequently, the average value of the square of the momentum transfer was $Q^2 = 0.477$ GeV$^2$. The longitudinal polarization of the electrons was in the range 67–76%. A 15 cm liquid hydrogen target was used. In order to select elastic scattering events two high-resolution spectrometers were used in the experiment. Only 0.2% of the events were due to background processes. Electrons with a polarization of about 70% were obtained by irradiation of GaAs crystals by circularly polarized laser light. The polarization of the electron beam was continuously monitored by a Compton polarimeter and was also measured by Møller scattering.

The combined asymmetry obtained from the results of the 1998 and 1999 data taking is equal to
\[ \mathcal{A}(Q^2 = 0.477) = \{-15.05 \pm 0.98(\mathrm{stat}) \pm 0.56(\mathrm{syst})\}\times 10^{-6} . \tag{137} \]
The measured value of the asymmetry allows one to obtain information on the following combination of the strange form factors (at $Q^2 = 0.477$ GeV$^2$):
\[ G_E^s + 0.392\,G_M^s . \tag{138} \]
From (136) and (137) it was found that
\[ \frac{G_E^s + 0.392\,G_M^s}{G_M^p/\mu_p} = 0.069 \pm 0.056 \pm 0.039 , \tag{139} \]
where the first error is the combined (in quadrature) statistical and systematic errors and the second error is determined by the uncertainties on the electromagnetic form factors. For the HAPPEX kinematics $(G_M^p/\mu_p) \simeq 0.36$. In accordance with the existing data, the ratios of the electromagnetic form factors of proton and neutron to the magnetic form factor of the proton were taken to be
\[ \frac{G_E^p}{(G_M^p/\mu_p)} = 0.99 \pm 0.02 , \qquad \frac{G_E^n}{(G_M^p/\mu_p)} = 0.16 \pm 0.03 , \qquad \frac{G_M^n/\mu_n}{(G_M^p/\mu_p)} = 1.05 \pm 0.02 . \tag{140} \]
The estimated contribution to the asymmetry of the axial form factor $G_A^{NC}$ was
\[ \mathcal{A}_A = (0.56 \pm 0.23)\times 10^{-6} , \tag{141} \]
where the main uncertainty is due to radiative corrections [82,85,86]. Taking into account (94) one can put
\[ \frac{G_E^s}{G_M^p/\mu_p} = \tau\rho_s , \]
where $\rho_s$ is a constant. Furthermore, in Ref. [84] the same $Q^2$-dependence of $G_M^p(Q^2)$ was assumed for the strange magnetic form factor; in this case, from (139) one finds
\[ \rho_s + 2.9\,\mu_s = 0.51 \pm 0.41 \pm 0.29 , \tag{142} \]
where $\mu_s$ is the strange magnetic moment of the nucleon [see Eq. (93)].
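The step from (139) to (142) is a division by $\tau$, which can be checked with a few lines of Python. The variable names and the nucleon mass value are assumptions introduced here; the inputs are the numbers quoted in (139) and the HAPPEX value of $Q^2$.

```python
M_NUCLEON = 0.938272   # GeV, assumed nucleon mass
Q2_HAPPEX = 0.477      # GeV^2

tau = Q2_HAPPEX / (4.0 * M_NUCLEON ** 2)

# Eq. (139) reads tau * rho_s + 0.392 * mu_s = 0.069 +- 0.056 +- 0.039
# once G_E^s / (G_M^p / mu_p) = tau * rho_s and G_M^s / (G_M^p / mu_p) = mu_s.
mu_s_coefficient = 0.392 / tau
central_value = 0.069 / tau
first_error = 0.056 / tau
second_error = 0.039 / tau
```

Rounding these four numbers to two significant figures reproduces the coefficient 2.9 and the values $0.51 \pm 0.41 \pm 0.29$ of Eq. (142).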
Fig. 2. Band: allowed region from the results of Ref. [84] with the assumptions discussed in the text. Points are the theoretical estimates from various models. The numbers refer to the list of references for the models as they appear in [84] (the corresponding reference numbers in the present work are listed in [87]). (Taken from Ref. [84].)
The allowed region of values of the parameters $\rho_s$ and $\mu_s$, obtained from (142), is shown in Fig. 2. Points are the predictions of different models [87].

The main uncertainties in the determination of quantity (138) are connected with the electromagnetic form factors of the neutron, which are not known, at present, with sufficient accuracy. The value (139) was obtained [88] by using (140) for the magnetic form factor of the neutron. If, instead, we take the value [89]
\[ \frac{G_M^n/\mu_n}{G_M^p/\mu_p} = 1.12 \pm 0.04 , \tag{143} \]
then
\[ \frac{G_E^s + 0.392\,G_M^s}{G_M^p/\mu_p} = 0.122 \pm 0.056 \pm 0.047 . \tag{144} \]
Thus, the new measurements of electromagnetic form factors of the nucleon which are in progress at Jefferson Lab will have an important impact on the possibility of obtaining more precise information on the strange form factors of the nucleon from future measurements of the P-odd asymmetry.

An extension of the HAPPEX experiment (HAPPEX2) [90] is planned at Jefferson Lab: it will measure the P-odd asymmetry at a scattering angle of $6^\circ$, corresponding to $Q^2 \simeq 0.1$ GeV$^2$, thus smaller than in the HAPPEX measurement. The motivation for this extension is to explore the possibility that the strange form factors can be large at small $Q^2$ but then fall off significantly at the current HAPPEX kinematics.

Another experiment on the measurement of the P-odd asymmetry in elastic e–p scattering was carried out by the SAMPLE collaboration at the MIT/Bates Linear Accelerator Center [91]. In this experiment, longitudinally polarized electrons with an energy of 200 MeV were scattered in the backward direction, at scattering angles $130^\circ \leq \theta \leq 170^\circ$. The average value of the momentum transfer squared was $Q^2 = 0.1$ GeV$^2$. A liquid hydrogen target was used in the experiment. The scattered electrons were detected by air Cherenkov counters. The average polarization of the electron beam was equal to $36.3 \pm 1.4\%$. In the latest measurements the following value of the P-odd asymmetry was obtained:
\[ \mathcal{A}_p(Q^2 = 0.1) = \{-4.92 \pm 0.61(\mathrm{stat}) \pm 0.73(\mathrm{syst})\}\times 10^{-6} . \tag{145} \]
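The quoted SAMPLE kinematics can be cross-checked with the standard relations for elastic scattering of a (massless) electron on a nucleon at rest, $E' = E/[1 + (2E/M)\sin^2(\theta/2)]$ and $Q^2 = 4EE'\sin^2(\theta/2)$; the recoil factor is the same one that appears in the Mott cross section (129). The Python sketch below is an illustration only; the function name and the nucleon mass value are assumptions introduced here.

```python
import math

M_NUCLEON = 0.938272  # GeV, assumed nucleon mass


def elastic_q2(beam_energy, theta_deg, mass=M_NUCLEON):
    """Q^2 (GeV^2) for elastic scattering of a massless electron off a
    nucleon at rest; beam_energy in GeV, theta_deg in degrees."""
    sin2_half = math.sin(math.radians(theta_deg) / 2.0) ** 2
    recoil = 1.0 + (2.0 * beam_energy / mass) * sin2_half
    scattered_energy = beam_energy / recoil
    return 4.0 * beam_energy * scattered_energy * sin2_half


# Edges of the SAMPLE angular acceptance at a beam energy of 200 MeV.
q2_low = elastic_q2(0.200, 130.0)
q2_high = elastic_q2(0.200, 170.0)
tau_sample = 0.1 / (4.0 * M_NUCLEON ** 2)
```

Over the whole angular range $Q^2$ stays close to 0.1 GeV$^2$, and the corresponding $\tau$ is about 0.03, which is the small value that kinematically enhances the axial term in backward scattering.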
When electrons are scattered in the backward direction, the parameter $\epsilon$ in expressions (135) and (136) is small and the contribution to the asymmetry of the electric strange form factor $G_E^s$ is suppressed. The measurement of the P-odd asymmetry allows one in this case to obtain information on the strange magnetic form factor of the nucleon, $G_M^s$. From Eq. (131) for $\theta = \pi$ we obtain the following expression for the asymmetry:
\[ \mathcal{A}_p = -\frac{G_F Q^2}{2\sqrt{2}\pi\alpha}\left[\frac{G_M^{NC}}{G_M} + (1 - 4\sin^2\theta_W)\sqrt{1 + \frac{1}{\tau}}\;\frac{G_A^{NC}}{G_M}\right] . \tag{146} \]
The last, axial, term in the above expression is multiplied by the factor $(1 - 4\sin^2\theta_W)$ which is small ($\simeq 0.07$). However, in the SAMPLE experiment the value of $\tau$ is small ($\tau \simeq 0.03$); hence, the contribution of the axial form factor turns out to be kinematically enhanced. In Eq. (146) the weak axial form factor of the proton is given, at tree level in the Standard Model, by
\[ G_A^{NC} = \tfrac{1}{2}(G_A - G_A^s) . \tag{147} \]
However, as was pointed out in Ref. [85], the contribution to the P-odd asymmetry of the radiative corrections can be large. Taking the latter into account, expression (147) can be written in the form
\[ G_A^{NC} = \tfrac{1}{2}\left[(1 + R_A^1)G_A - R_A^0 - G_A^s\right] , \tag{148} \]
where $R_A^1$ and $R_A^0$ are the radiative corrections to the isovector and isoscalar parts of the matrix element. They were calculated to be [85]
\[ R_A^1 = -0.34 \pm 0.28 , \qquad R_A^0 = -0.12 \pm 0.12 . \tag{149} \]
The electroweak corrections to the nucleon vertex induce the following anapole axial term in the matrix element of the electromagnetic current:
\[ \langle p'|J^{em}_\alpha|p\rangle = e\,\bar u(p')\,\frac{a(Q^2)\,Q^2}{M^2}\left(\gamma_\alpha - \frac{\slashed{q}\,q_\alpha}{q^2}\right)\gamma_5\,u(p) . \tag{150} \]
Here, $a(0)$ is the anapole moment of the nucleon [92]. We recall that the anapole moment of Cs nuclei was measured in a recent experiment [93]. In Ref. [86] the contribution to the P-odd asymmetry of the anapole moment of the nucleon has been calculated in the framework of chiral perturbation theory, both for the isovector [$(R_A^1)_a$] and isoscalar [$(R_A^0)_a$] terms. They are given by [86]
\[ (R_A^I)_a = -\frac{8\sqrt{2}\pi\alpha}{G_F\Lambda^2}\,\frac{1}{(1 - 4\sin^2\theta_W)}\,\frac{a_I}{G_A} \qquad (I = 0, 1) , \tag{151} \]
where $\Lambda$ is the scale of chiral symmetry breaking. In Ref. [86] for the contribution of the anapole moments to $R_A^1$ and $R_A^0$ it was found that
\[ (R_A^1)_a = -0.06 \pm 0.24 , \qquad (R_A^0)_a = 0.01 \pm 0.14 \tag{152} \]
and for the total radiative corrections to the axial form factor $G_A^{NC}$ the following values were obtained:
\[ R_A^1 = -0.41 \pm 0.24 , \qquad R_A^0 = 0.06 \pm 0.14 . \tag{153} \]
The SAMPLE data for the proton were first studied by assuming the values (149) for the radiative corrections and the value $g_A^s = -0.1$ for the axial strange form factor (in agreement with the data of the experiments on the deep inelastic scattering of polarized leptons on polarized protons). Under these assumptions the following value of the strange magnetic form factor at $Q^2 = 0.1$ GeV$^2$ was obtained [91]:
\[ G_M^s = 0.61 \pm 0.17 \pm 0.21 \pm 0.19 , \tag{154} \]
where the last error is due to uncertainties in the radiative corrections.

Recently, the SAMPLE collaboration has published the first results of the experiment on the measurement of the P-odd asymmetry in the quasi-elastic scattering of polarized electrons on deuterium [94,95] in the same kinematical region as in the proton case. The P-odd asymmetry in $\vec e$–d scattering is given by the following expression [95]:
\[ \mathcal{A}_d = \left(-7.27 + 1.78\,G_A^e(T=1) + 0.75\,G_M^s\right)\times 10^{-6} , \tag{155} \]
where the term
\[ G_A^e(T=1) = -G_A\,(1 + R_A^1) \tag{156} \]
includes the axial form factor and the isovector part of the radiative corrections. The (small) isoscalar part of the radiative corrections and the contribution of $G_A^s$ are included in the constant term in Eq. (155).

The P-odd asymmetry in the scattering of polarized electrons on protons can be expressed as follows [95]:
\[ \mathcal{A}_p = \left(-5.72 + 1.55\,G_A^e + 3.49\,G_M^s\right)\times 10^{-6} . \tag{157} \]
The measured value of the asymmetry in the SAMPLE $\vec e$–p experiment [91] is given by (145), while the P-odd asymmetry measured in $\vec e$–d scattering turned out to be [95]
\[ \mathcal{A}_d(Q^2 = 0.1) = \{-6.79 \pm 0.64(\mathrm{stat}) \pm 0.55(\mathrm{syst})\}\times 10^{-6} . \tag{158} \]
Fig. 3. Allowed regions for the form factors $G_M^s$ and $-(1+R_A^1)G_A \equiv G_A^e$ (in the notation of the figure), corresponding to the measurements of Refs. [91,95] of the P-odd asymmetries for the proton (H$_2$, unshaded region) and for the deuteron (D$_2$, hatched region), respectively. The inner regions include statistical errors only, the outer ones include statistical and systematic uncertainties added in quadrature. The vertical shaded band corresponds to the calculated value of $-(1+R_A^1)G_A$ using the theoretical estimate of Ref. [86]. The isoscalar corrections $R_A^0$ are assumed to be the ones calculated in Ref. [86]. (Taken from Ref. [95].)
By combining Eqs. (155) and (157) with the corresponding experimental values, the authors of Ref. [95] obtained two bands in the $(G_A^e, G_M^s)$ plane, which are shown in Fig. 3. The inner parts of the bands include only statistical errors, while the outer bounds take into account statistical and systematic errors combined in quadrature. The shaded ellipse in Fig. 3 corresponds to the $1\sigma$ allowed region for both quantities. The best-fit values of the form factors $G_M^s$ and $G_A^e$ at $Q^2 = 0.1$ GeV$^2$ are given by

$$G_M^s = 0.14 \pm 0.29_{\mathrm{stat}} \pm 0.31_{\mathrm{syst}}, \tag{159}$$
$$G_A^e(T=1) = 0.22 \pm 0.45_{\mathrm{stat}} \pm 0.39_{\mathrm{syst}}. \tag{160}$$
In order to obtain from Eq. (159) the strange magnetic moment of the nucleon it is necessary to assume a $Q^2$-dependence of the strange form factor. In Ref. [95] a model proposed by Hemmert et al. [96], based on heavy baryon chiral perturbation theory, was used. For the strange magnetic
moment the following value was then obtained:

$$\mu_s = [0.01 \pm 0.29_{\mathrm{stat}} \pm 0.31_{\mathrm{syst}} \pm 0.07_{\mathrm{theor}}]\,\mu_N, \tag{161}$$
where the third error takes into account the theoretical uncertainty coming from different theoretical predictions. Thus, from the latest SAMPLE data it is impossible to draw any definite conclusion on the value of the strange magnetic moment of the nucleon.

Let us now discuss the value (160) of the axial constant. At $Q^2 = 0.1$ GeV$^2$, in Born approximation and assuming the usual axial dipole form factor, we have $G_A^e(T=1) = -1.071 \pm 0.005$. Taking into account the radiative corrections calculated in Ref. [86], the following value of the form factor $G_A^e(T=1)$ was obtained [95]:

$$G_A^e(T=1) = -0.83 \pm 0.26. \tag{162}$$
This value corresponds to the vertical band in Fig. 3. Thus, the predicted value of $G_A^e(T=1)$ differs considerably from the experimental value, Eq. (160). One possible origin of this disagreement could be connected with a large anapole moment of the nucleon [95]. The surprising results obtained in the SAMPLE measurements of the P-odd asymmetry in $\vec{e}$–p and $\vec{e}$–d scattering call for further theoretical effort in the calculation of the radiative corrections, and for further experiments that will allow these results to be checked (see also the recent review [97]). At present, the SAMPLE collaboration has proposed a new experiment [98,99] to measure the P-odd asymmetry in the scattering on deuterium of polarized electrons with an energy of 120 MeV (thus lower than that in the previous run). At this energy the asymmetry will be smaller, but the cross section will be significantly larger.

One final remark about the measurement of the strange magnetic moment of the nucleon is in order: with the help of expression (40) for the electromagnetic current, we can present the Pauli form factors of the proton and neutron in the following form:

$$F_2^p = \tfrac{2}{3}F_2^u + (-\tfrac{1}{3})F_2^d + (-\tfrac{1}{3})F_2^s, \qquad F_2^n = \tfrac{2}{3}F_2^d + (-\tfrac{1}{3})F_2^u + (-\tfrac{1}{3})F_2^s, \tag{163}$$
where $F_2^q$ is the contribution of the $q$-quark ($q = u, d, s$) to the Pauli form factor of the proton. In Eqs. (163) we have used the isotopic SU(2) symmetry, from which it follows that

$$(F_2^{u,d})_p = (F_2^{d,u})_n.$$
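If we set $Q^2 = 0$ and take into account that $F_2^p(0) \equiv \kappa_p = 1.79$, $F_2^n(0) \equiv \kappa_n = -1.91$, we obtain

$$1.79 = \tfrac{2}{3}F_2^u(0) + (-\tfrac{1}{3})F_2^d(0) + (-\tfrac{1}{3})\mu_s, \qquad -1.91 = \tfrac{2}{3}F_2^d(0) + (-\tfrac{1}{3})F_2^u(0) + (-\tfrac{1}{3})\mu_s.$$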
These relations can be combined to give

$$F_2^u(0) = \mu_s + 1.67, \qquad F_2^d(0) = \mu_s - 2.03. \tag{164}$$
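The step from the two relations at $Q^2 = 0$ to Eq. (164) is a 2×2 linear solve. The following is a minimal sketch of that arithmetic, assuming only the numerical inputs 1.79 and −1.91 quoted above; the function name `quark_pauli_form_factors` and its argument names are hypothetical and chosen for illustration.

```python
from fractions import Fraction as Fr


def quark_pauli_form_factors(mu_s, kappa_p=Fr(179, 100), kappa_n=Fr(-191, 100)):
    """Solve the Q^2 = 0 limit of Eqs. (163) for F2u(0) and F2d(0).

    mu_s is the strange magnetic moment F2s(0); kappa_p and kappa_n are the
    values of F2p(0) and F2n(0) quoted in the text.
    """
    # Move the strange term to the left-hand side:
    #   kappa_p + mu_s/3 = (2/3) F2u - (1/3) F2d
    #   kappa_n + mu_s/3 = (2/3) F2d - (1/3) F2u
    a = kappa_p + Fr(mu_s) / 3
    b = kappa_n + Fr(mu_s) / 3
    # The inverse of [[2/3, -1/3], [-1/3, 2/3]] is [[2, 1], [1, 2]].
    f2u = 2 * a + b
    f2d = a + 2 * b
    return f2u, f2d


if __name__ == "__main__":
    for mu_s in (Fr(0), Fr(18, 100), Fr(1, 2)):
        f2u, f2d = quark_pauli_form_factors(mu_s)
        print(float(mu_s), float(f2u), float(f2d))
```

Exact rational arithmetic is used so that the offsets 1.67 and −2.03 of Eq. (164) are reproduced without rounding noise; at $\mu_s = 0.18$ the two solutions have equal magnitude.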
Thus, the measurement of the strange magnetic moment of the nucleon will allow one to determine the contribution of the u and d quarks to the magnetic moments of the proton and neutron. From (164) it follows that if $\mu_s > 0.18$, then $F_2^u(0) > |F_2^d(0)|$, while in the case $\mu_s < 0.18$ the opposite inequality holds, $F_2^u(0) \le |F_2^d(0)|$.

A new experiment on the measurement of the P-odd asymmetry in elastic electron–proton scattering is under way at the Mainz Microtron Facility [100]. The energy of the electron beam in this experiment is 855 MeV. The scattered electrons are detected at a scattering angle of $35^\circ$ ($Q^2 = 0.227$ GeV$^2$). The polarization of the electron beam is 80%. It is expected that the P-odd asymmetry will be measured with a statistical accuracy of 3% and a systematic error of 4%. The combination of strange Dirac and Pauli form factors $F_1^s + 0.13F_2^s$ at $Q^2 = 0.227$ GeV$^2$ will be determined from this experiment with an accuracy of 0.02.

Finally, the G0 collaboration is now measuring the P-odd asymmetry in elastic electron–proton scattering at Jefferson Lab [101]. It is expected that from the data of this experiment the strange form factors will be determined with a few percent accuracy at different values of $Q^2$ in the interval $0.1 \le Q^2 \le 1$ GeV$^2$. Thus, in the coming years we will have new information on the strange vector form factors of the nucleon, information which could have an important impact on our understanding of the nucleon structure and of the strong interactions.
7. P-odd asymmetry in the elastic scattering of electrons on nuclei with S = 0, T = 0

In this section, we shall consider the elastic scattering of polarized electrons on nuclei with spin and isotopic spin equal to zero (like $^4$He, $^{12}$C, etc.). Interest in this case was raised in previous works [102–105],

$$\vec{e} + A \to e + A. \tag{165}$$
It is evident from the SU(2) isotopic invariance of strong interactions that in this case the matrix elements of the isovector currents $V^3_\alpha$ and $A^3_\alpha$ are equal to zero. Also, the matrix element of the axial strange current $A^s_\alpha$ is equal to zero.$^{10}$ The matrix element of process (165) is given by the general expression (110), in which $p$ ($p'$) now refers to the initial (final) nucleus and, as before, $q = p' - p$ and $Q^2 = -q^2$. Process (165) can be represented by the same diagrams of Fig. 1, with the exchange of $\gamma$ and $Z^0$ between the electron and the nucleus. For the matrix element of the electromagnetic current $J^{em}_\alpha$ we now have

$$\langle p'|J^{em}_\alpha|p\rangle = \langle p'|V^0_\alpha|p\rangle = (p + p')_\alpha\,F(Q^2), \tag{166}$$
$^{10}$ In fact, the matrix element $\langle p'|A^s_\alpha|p\rangle$ is a pseudovector and depends only on $p$ and $p'$ ($p$ and $p'$ being the momenta of the initial and final nucleus). It is obvious that it is impossible to build a pseudovector from two vectors.
where $V^0_\alpha$ is the isoscalar component of the electromagnetic current and $F(Q^2)$ is the electromagnetic form factor of the nucleus (there is only one form factor in the case of a spin zero nucleus). Similarly, the nuclear matrix element of the NC reads

$$\langle p'|J^{NC}_\alpha|p\rangle = \langle p'|(-2\sin^2\theta_W\,J^{em}_\alpha - \tfrac{1}{2}V^s_\alpha)|p\rangle = (p + p')_\alpha\,F^{NC}(Q^2), \tag{167}$$
where

$$F^{NC}(Q^2) = -2\sin^2\theta_W\,F(Q^2) - \tfrac{1}{2}F^s(Q^2) \tag{168}$$

and $F^s(Q^2)$ is the strange form factor of the nucleus.$^{11}$ At $Q^2 = 0$ the form factor $F(Q^2)$ is equal to the total charge of the nucleus, $F(Q^2 = 0) = Z$, while for the strange form factor we have, in the same limit, $F^s(Q^2 = 0) = 0$ (the net strangeness of the nucleus is equal to zero). At small $Q^2$, we can expand the two form factors as follows:

$$F(Q^2) = Z\,(1 - \tfrac{1}{6}\langle r^2\rangle Q^2 + \cdots), \qquad F^s(Q^2) = -\tfrac{1}{6}\langle r_s^2\rangle Q^2 + \cdots,$$

where $\langle r^2\rangle$ is the mean square of the electromagnetic radius of the nucleus and $\langle r_s^2\rangle$ is the mean square of the nuclear “strangeness radius”. Let us notice that in the impulse approximation, for nuclei with $N = Z$, we have

$$\frac{F^s(Q^2)}{F(Q^2)} = \frac{2G_E^s(Q^2)}{G_E^p(Q^2) + G_E^n(Q^2)}. \tag{169}$$
The general expression for the cross section of the scattering of electrons with longitudinal polarization $\lambda$ on nuclei with zero spin can be obtained from Eq. (120) by setting $W^I_{\alpha\beta}(A) = 0$. Then one obtains

$$d\sigma_\lambda = \frac{4\alpha^2}{Q^4}\,\frac{M_A}{p\cdot k}\,\frac{1}{2}\left[L^{\alpha\beta}W^{em}_{\alpha\beta} + \lambda\,\frac{G_F Q^2}{2\sqrt{2}\,\pi\alpha}\,g_A\,L^{\alpha\beta}W^I_{\alpha\beta}(V)\right]\frac{d\vec{k}'}{k'_0}, \tag{170}$$

$^{11}$ Let us notice that the current $V^s_\alpha$ is not conserved and the matrix element of the strange vector current has the following general form:

$$\langle p'|V^s_\alpha|p\rangle = (p' + p)_\alpha\,F^s(Q^2) + (p' - p)_\alpha\,G^s(Q^2).$$

However, from T-invariance of the strong interactions it follows that the form factor $G^s$ is equal to zero.
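where $M_A$ is the mass of the nucleus and the tensors $W^{em}_{\alpha\beta}$ and $W^I_{\alpha\beta}(V)$ are given by Eqs. (121) and (122). For a spin zero nucleus the tensors $W^{em}_{\alpha\beta}$ and $W^I_{\alpha\beta}(V)$ have the following general form:

$$W^{em}_{\alpha\beta} = \frac{n_\alpha n_\beta}{4M_A^2}\,W_2^{em}, \qquad W^I_{\alpha\beta}(V) = \frac{n_\alpha n_\beta}{4M_A^2}\,W_2^I, \tag{171}$$

where $n = p + p'$. It is then easy to show that

$$W_2^{em} = F^2(Q^2)\,\delta\!\left(\nu - \frac{Q^2}{2M_A}\right), \qquad W_2^I = 2F^{NC}(Q^2)\,F(Q^2)\,\delta\!\left(\nu - \frac{Q^2}{2M_A}\right). \tag{172}$$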
Here, $\nu = E - E' = p\cdot q/M_A$ is the energy transferred to the nucleus in the laboratory system. By inserting (171) and (172) into (170), we obtain the following expression for the cross section of the scattering of electrons with longitudinal polarization $\lambda$ on spin zero nuclei:

$$\left(\frac{d\sigma}{d\Omega}\right)_\lambda = \left(\frac{d\sigma}{d\Omega}\right)_0 (1 + \lambda A). \tag{173}$$

Here,

$$\left(\frac{d\sigma}{d\Omega}\right)_0 = \sigma_{Mott}\,F^2(Q^2) \tag{174}$$

is the cross section for the scattering of unpolarized electrons on nuclei, $\sigma_{Mott}$ being the Mott cross section for a target nucleus of mass $M_A$:

$$\sigma_{Mott} = \frac{\alpha^2\cos^2(\theta/2)}{4E^2\sin^4(\theta/2)\,\bigl(1 + (2E/M_A)\sin^2(\theta/2)\bigr)}. \tag{175}$$
In the above, $\theta$ is the scattering angle and $E$ the initial energy of the electron in the laboratory system. The P-odd asymmetry $A$ is then given by [106]

$$A(Q^2) = -\frac{G_F Q^2}{2\sqrt{2}\,\pi\alpha}\,\frac{F^{NC}(Q^2)}{F(Q^2)} = \frac{G_F Q^2}{2\sqrt{2}\,\pi\alpha}\left[2\sin^2\theta_W + \frac{F^s(Q^2)}{2F(Q^2)}\right]. \tag{176}$$
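Expression (176) is simple enough to evaluate directly. The sketch below does so under stated assumptions: the numerical values of the Fermi constant, the fine-structure constant and $\sin^2\theta_W$ are standard reference values supplied here for illustration (they are not quoted in this section), and the function name `asymmetry_spin0` is hypothetical.

```python
import math

# Assumed input constants (illustrative reference values, not taken from the text).
G_F = 1.16639e-5       # Fermi constant, GeV^-2
ALPHA = 1.0 / 137.036  # fine-structure constant
SIN2_THETA_W = 0.231   # sin^2 of the weak mixing angle


def asymmetry_spin0(q2, fs_over_f=0.0):
    """P-odd asymmetry of Eq. (176) for a spin-0, isospin-0 nucleus.

    q2        : squared momentum transfer Q^2 in GeV^2
    fs_over_f : ratio F^s(Q^2) / F(Q^2) of strange to electromagnetic form factor
    """
    prefactor = G_F * q2 / (2.0 * math.sqrt(2.0) * math.pi * ALPHA)
    return prefactor * (2.0 * SIN2_THETA_W + fs_over_f / 2.0)


if __name__ == "__main__":
    # With F^s = 0 the asymmetry is strictly proportional to Q^2.
    print(asymmetry_spin0(1.0))
    print(asymmetry_spin0(0.6, fs_over_f=-0.1))
```

With these inputs and $F^s = 0$, the asymmetry per unit $Q^2$ comes out close to $8.31\times10^{-5}$ GeV$^{-2}$; any $Q^2$ dependence of $A/Q^2$ then has to come from the ratio $F^s/F$.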
As is clearly seen from (176), the measured value of the asymmetry can be different from

$$A^{(0)}(Q^2) = \frac{G_F Q^2}{\sqrt{2}\,\pi\alpha}\,\sin^2\theta_W = 8.309\times10^{-5}\,\frac{Q^2}{\mathrm{GeV}^2} \tag{177}$$
only if the strange form factor $F^s(Q^2)$ is different from zero. Important information on the strange form factor of the nucleus can be obtained from the investigation of the $Q^2$ dependence of the asymmetry: if the quantity $A(Q^2)/Q^2$ depends on $Q^2$, this would prove that the strange nuclear form factor is different from zero. Finally, should it occur that the P-odd asymmetry (176) is negative, it would imply that the strange form factor of the nucleus is large and negative. From (176) it follows that the strange form factor of a nucleus with $S = 0$ and $T = 0$ is determined by quantities that can be experimentally measured. In fact, we have
$$F^s(Q^2) = 2\sqrt{\frac{1}{\sigma_{Mott}}\left(\frac{d\sigma}{d\Omega}\right)_0}\;\left[\frac{2\sqrt{2}\,\pi\alpha}{G_F Q^2}\,A(Q^2) - 2\sin^2\theta_W\right]. \tag{178}$$

An experiment on the measurement of the P-odd asymmetry in the elastic scattering of polarized electrons on $^4$He is under preparation at Jefferson Lab [107]. The square of the momentum transfer in this experiment is expected to be $Q^2 = 0.6$ GeV$^2$. Two high-resolution spectrometers will be employed. The target will be a circulating $^4$He gas system. Thus, this experiment will be able to measure the strange form factor of a spin zero nucleus discussed above.

We would like to mention here that the P-odd asymmetry in the elastic scattering of polarized electrons on nuclei represents an almost direct measurement of the Fourier transform of the neutron density, since the $Z^0$-boson preferentially couples to neutrons. Indeed, for $0^+ \to 0^+$ transitions, it is easy to show that the P-odd asymmetry can be expressed in the following form [108]:

$$A = -A_0\,\frac{1}{2}\left[(1 - 4\sin^2\theta_W) - \frac{\int d\vec{r}\,j_0(qr)\,\rho_n(\vec{r})}{\int d\vec{r}\,j_0(qr)\,\rho_p(\vec{r})}\right], \tag{179}$$

which is valid both for isospin symmetric ($Z = N$) and asymmetric ($Z \neq N$) nuclei. In Eq. (179) $\rho_n$ ($\rho_p$) is the neutron (proton) density and $j_0(x)$ the spherical Bessel function of order zero. Taking into account the value of $\sin^2\theta_W$, the last term on the right-hand side of Eq. (179) dominates the asymmetry and directly gives information on the neutron distribution. In fact, the denominator coincides with the form factor $F(Q^2)$, which can be measured independently [see Eq. (174)]. The Parity Radius Experiment (PREX) at Jefferson Laboratory plans to measure the neutron radius $R_n$ in $^{208}$Pb through parity violating (PV) electron scattering [109]. The measurement of the neutron skin in a heavy nucleus ($R_n$ is generally assumed to be a few percent larger than the proton radius) will have important implications for our knowledge of the structure of neutron stars, which are expected to have a solid, neutron-rich crust [110].

8. Inelastic parity violating (PV) electron scattering on nuclei

In addition to PV elastic electron scattering on the proton, deuterium and $S = 0$, $T = 0$ nuclei, the P-odd asymmetry can be considered in the process of inelastic scattering of polarized electrons on nuclei. Several basic ideas motivate this investigation: the scattering on the single proton is not sufficient to determine the various unknown form factors which enter into the PV hadronic response; one is thus immediately led to consider also neutrons (namely deuterium) and, more generally, nuclei [82,104,111].
As we have discussed in the previous section, the special case of elastic scattering on spin-zero, isospin-zero targets offers an unambiguous possibility to measure the strange form factor of the nucleus. However, this type of investigation is confined to modest momentum transfers, since the elastic nuclear form factors fall off rapidly with increasing momentum, with the exception of the very light nuclei. Therefore, one would like to have additional, complementary information from other electron scattering measurements. One possibility is the inelastic excitation of discrete states in nuclei, but most probably the corresponding cross sections are not large enough to permit high precision information to be extracted.

A more promising case is quasi-elastic scattering, namely the inelastic scattering of electrons in the region of the so-called quasi-elastic peak [112]. This process roughly corresponds to “knocking” individual nucleons out of the nucleus without too much complication in the final nuclear state, in particular from final state interactions. Quasi-elastic scattering occurs, for a given three-momentum transfer $q \equiv |\vec{q}|$, approximately at energy transfer $\omega = Q^2/(2M)$.$^{12}$ The width of the peak is characterized by the Fermi momentum $p_F$ of the specific nucleus under study. In this kinematical region, the cross sections are generally proportional to the number of nucleons in the nucleus, and thus are prominent features in the inelastic spectrum. One might then hope to perform high precision studies, which would complement work on PV elastic electron scattering [82,113].

The aims of this investigation are manifold: on the one hand, one wishes to understand the role played by the various single-nucleon form factors in the total asymmetry. By changing the kinematics ($q$, $\omega$ and the scattering angle $\theta$) and by adjusting $N$ and $Z$ through the choice of different targets, one can hope to alter the sensitivity of the asymmetry to the underlying form factors.
Of course, a precise study of nucleonic form factors from the scattering on nuclei is possible only if nuclear model uncertainties are well under control. On the other hand, the measurement of the asymmetry in PV quasi-elastic electron scattering brings into play new aspects of nuclear many-body physics, namely the ones related to the nuclear response functions to NC probes. This might involve a sensitivity of the cross sections to specific dynamical aspects which cannot be revealed with the customary reactions employed in nuclear structure studies. These issues were extensively discussed in Ref. [114], where PV quasi-elastic electron scattering was studied within the context of the relativistic Fermi gas (RFG).

Let us consider the inclusive process in which a polarized electron with four-momentum $k$ and longitudinal polarization $\lambda$ is scattered through an angle $\theta$ to four-momentum $k'$, exchanging a photon or a $Z^0$ with the target nucleus:

$$\vec{e} + A \to e + A^*. \tag{180}$$
We generically indicate with $A^*$ the final nucleus in an excited state, in which one (or more) nucleons are ejected. The leading order diagrams contributing to the amplitude of process (180) are illustrated in Fig. 4. Only the final electron is detected, and it fixes the kinematics of the process.
$^{12}$ Here, we adopt the customary notation $q_0 = \omega$ for the energy transferred to the nucleus; hence $Q^2 = -q^2 = \vec{q}^{\,2} - \omega^2$.
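The quasi-elastic peak condition $\omega = Q^2/(2M)$, combined with $Q^2 = \vec{q}^{\,2} - \omega^2$, fixes $\omega$ once the three-momentum transfer is chosen. A minimal sketch of this kinematics follows; the nucleon mass value and the function name `quasi_elastic_peak` are assumptions made for illustration.

```python
import math

NUCLEON_MASS = 0.939  # GeV; assumed average nucleon mass


def quasi_elastic_peak(q, mass=NUCLEON_MASS):
    """Return (omega, Q^2) at the quasi-elastic peak for three-momentum transfer q.

    Solving omega = Q^2 / (2 M) together with Q^2 = q^2 - omega^2 gives
    omega = sqrt(M^2 + q^2) - M, i.e. scattering on a free nucleon at rest.
    All quantities are in GeV (Q^2 in GeV^2).
    """
    omega = math.sqrt(mass**2 + q**2) - mass
    q2 = q**2 - omega**2
    return omega, q2


if __name__ == "__main__":
    for q in (0.3, 0.5, 2.0):
        omega, q2 = quasi_elastic_peak(q)
        print(f"q = {q} GeV: omega = {omega:.4f} GeV, Q^2 = {q2:.3f} GeV^2")
```

The closed form makes it explicit that larger three-momentum transfers push the peak to larger $\omega$ and larger $Q^2$, which is why a scan in $q$ probes the form factors over a range of $Q^2$.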
Fig. 4. Single photon exchange and $Z^0$ exchange diagrams for electron scattering from nuclei.
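The inclusive cross section for the scattering of polarized electrons on unpolarized nuclei can be written as

$$d\sigma_\lambda = \frac{4\alpha^2}{Q^4}\,\frac{1}{2k_0}\left\{L^{\alpha\beta}W^{em}_{\alpha\beta} + \lambda\,\frac{G_F Q^2}{2\sqrt{2}\,\pi\alpha}\left[g_V\,L_5^{\alpha\beta}W^I_{\alpha\beta}(A) + g_A\,L^{\alpha\beta}W^I_{\alpha\beta}(V)\right]\right\}\frac{d\vec{k}'}{k'_0}, \tag{181}$$

where the leptonic tensor and pseudotensor, $L^{\alpha\beta}$ and $L_5^{\alpha\beta}$, are given in Eqs. (118) and (119). The hadronic (electromagnetic and interference) tensors are defined as

$$W^{em}_{\alpha\beta} = \frac{3\mathcal{N}M^2}{4\pi p_F^3}\int\frac{d\vec{p}\,'}{E_{p'}}\,\frac{d\vec{p}}{E_p}\;\delta^{(4)}(p' - p - q)\;\theta(p_F - |\vec{p}|)\;\theta(|\vec{p}\,'| - p_F)\;(W^{em}_{\alpha\beta})_{s.n.} \tag{182}$$

and

$$W^I_{\alpha\beta}(V;A) = \frac{3\mathcal{N}M^2}{4\pi p_F^3}\int\frac{d\vec{p}\,'}{E_{p'}}\,\frac{d\vec{p}}{E_p}\;\delta^{(4)}(p' - p - q)\;\theta(p_F - |\vec{p}|)\;\theta(|\vec{p}\,'| - p_F)\;[W^I_{\alpha\beta}(V;A)]_{s.n.}. \tag{183}$$

In Eqs. (182) and (183) $E_p = \sqrt{\vec{p}^{\,2} + M^2}$ and the single-nucleon (s.n.) tensors are

$$[W^{em}_{\alpha\beta}]_{s.n.} = -\tau G_M^2\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right) + \frac{X_\alpha X_\beta}{M^2}\,\frac{G_E^2 + \tau G_M^2}{1+\tau}, \tag{184}$$

$$[W^I_{\alpha\beta}(V)]_{s.n.} = -2\tau G_M G_M^{NC}\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right) + 2\,\frac{X_\alpha X_\beta}{M^2}\,\frac{G_E G_E^{NC} + \tau G_M G_M^{NC}}{1+\tau}, \tag{185}$$

$$[W^I_{\alpha\beta}(A)]_{s.n.} = \frac{i}{2M^2}\,\epsilon_{\alpha\beta\rho\sigma}\,p^\rho q^\sigma\,4G_M G_A^{NC}, \tag{186}$$

where $X_\alpha = p_\alpha - q_\alpha\,(q\cdot p)/q^2$. Expressions (182) and (183) for the nuclear hadronic tensors are obtained in the impulse approximation (IA), which amounts to considering the electron–nucleus interaction as an incoherent superposition of electron–nucleon scattering processes. Moreover, the nucleus is described as a gas of non-interacting, relativistic nucleons, with momentum distribution $3\mathcal{N}/(4\pi p_F^3)\,\theta(p_F - |\vec{p}|)$, $p_F$ being the Fermi momentum. In Eqs. (182) and (183) $\mathcal{N} = Z, N$ is the number of protons or neutrons in the nucleus, and the function $\theta(|\vec{p}\,'| - p_F)$ ensures that the final nucleon is excited above the Fermi level (Pauli blocking effect). We also notice that the total cross section is obtained from the sum of the contributions from protons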
and neutrons, each of them being calculated by using the pertinent nucleonic form factors in the single-nucleon tensors (184)–(186). From the above equations one can derive the expression for the double-differential (with respect to the energy, $k'_0 \equiv \epsilon'$, and scattering angle, $\Omega$, of the final electron) inclusive cross section for the inelastic scattering of polarized electrons on nuclei. The sum of the cross sections for electrons with opposite polarization,

$$\left(\frac{d^2\sigma}{d\Omega\,d\epsilon'}\right)^{pc} = \frac{d^2\sigma^+}{d\Omega\,d\epsilon'} + \frac{d^2\sigma^-}{d\Omega\,d\epsilon'} = \sigma_{Mott}\,\{v_L R_L(q,\omega) + v_T R_T(q,\omega)\}, \tag{187}$$

coincides with the inclusive, parity-conserving cross section for unpolarized electrons, which is obtained from the electromagnetic hadronic tensor only. Their difference, instead,

$$\left(\frac{d^2\sigma}{d\Omega\,d\epsilon'}\right)^{pv} = \frac{d^2\sigma^-}{d\Omega\,d\epsilon'} - \frac{d^2\sigma^+}{d\Omega\,d\epsilon'} = -\sigma_{Mott}\,\frac{G_F Q^2}{2\sqrt{2}\,\pi\alpha}\,\{v_L R^L_{AV}(q,\omega) + v_T R^T_{AV}(q,\omega) + v_{T'} R^{T'}_{VA}(q,\omega)\} \tag{188}$$
denotes the parity-violating inclusive cross section, which is obtained from the interference hadronic tensors $W^I_{\alpha\beta}(V;A)$. It corresponds to the interference between the matrix element for the exchange of one photon and the one for the exchange of a $Z^0$ boson. In Eqs. (187), (188) $\sigma_{Mott}$ is the Mott cross section and

$$v_L = \left(\frac{Q^2}{\vec{q}^{\,2}}\right)^2, \qquad v_T = \frac{1}{2}\,\frac{Q^2}{\vec{q}^{\,2}} + \tan^2\frac{\theta}{2}, \tag{189}$$

$$v_{T'} = \sqrt{\frac{Q^2}{\vec{q}^{\,2}} + \tan^2\frac{\theta}{2}}\;\tan\frac{\theta}{2} \tag{190}$$

are lepton kinematical factors. The functions $R_{L(T)}(q,\omega)$ ($q = |\vec{q}|$) are the longitudinal (transverse) electromagnetic nuclear response functions, which are given by

$$R_L(q,\omega) = W^{em}_{00}, \tag{191}$$

$$R_T(q,\omega) = W^{em}_{11} + W^{em}_{22}, \tag{192}$$

the direction of the three-momentum transfer $\vec{q}$ being taken as the z-axis. The corresponding parity-violating nuclear response functions are defined as

$$R^L_{AV}(q,\omega) = g_A\,W^I_{00}(V), \tag{193}$$

$$R^T_{AV}(q,\omega) = g_A\,[W^I_{11}(V) + W^I_{22}(V)], \tag{194}$$

$$R^{T'}_{VA}(q,\omega) = i\,g_V\,W^I_{12}(A). \tag{195}$$
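The lepton kinematical factors (189), (190) control which response dominates at a given angle. The sketch below evaluates them and can be used to inspect their backward- and forward-angle behaviour; the function and argument names are hypothetical.

```python
import math


def lepton_factors(q2_four, q_three, theta_deg):
    """Lepton kinematical factors v_L, v_T, v_T' of Eqs. (189), (190).

    q2_four   : four-momentum transfer squared Q^2
    q_three   : magnitude of the three-momentum transfer |q| (same units as sqrt(Q^2))
    theta_deg : electron scattering angle in degrees
    """
    rho = q2_four / q_three**2                   # Q^2 / |q|^2
    t = math.tan(math.radians(theta_deg) / 2.0)  # tan(theta / 2)
    v_l = rho**2
    v_t = 0.5 * rho + t**2
    v_tp = math.sqrt(rho + t**2) * t
    return v_l, v_t, v_tp


if __name__ == "__main__":
    # Illustrative kinematics: |q| = 0.5 GeV, Q^2 = 0.234 GeV^2.
    for theta in (10.0, 90.0, 150.0):
        v_l, v_t, v_tp = lepton_factors(0.234, 0.5, theta)
        print(theta, v_l / v_t, v_tp / v_t)
```

At backward angles $\tan^2(\theta/2)$ dominates both $v_T$ and $v_{T'}$, so the ratio $v_L/v_T$ tends to zero and $v_{T'}/v_T$ tends to one; at very forward angles $v_{T'}/v_T$ vanishes and $v_L/v_T$ approaches $2Q^2/\vec{q}^{\,2}$.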
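By measuring the cross sections for the scattering of electrons with both polarizations one can determine the asymmetry:

$$A = \frac{\dfrac{d^2\sigma^-}{d\Omega\,d\epsilon'} - \dfrac{d^2\sigma^+}{d\Omega\,d\epsilon'}}{\dfrac{d^2\sigma^-}{d\Omega\,d\epsilon'} + \dfrac{d^2\sigma^+}{d\Omega\,d\epsilon'}} = A_0\,\frac{v_L R^L_{AV}(q,\omega) + v_T R^T_{AV}(q,\omega) + v_{T'} R^{T'}_{VA}(q,\omega)}{v_L R_L(q,\omega) + v_T R_T(q,\omega)}, \tag{196}$$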
where $A_0$ is defined in Eq. (116). Within the RFG model the nuclear response functions defined above can be evaluated analytically. By performing the integrals over $\vec{p}$ one obtains

$$R_{L,T}(q,\omega) = R_0(q,\omega)\,U^{L,T}(q,\omega), \tag{197}$$

$$R^{L,T}_{AV}(q,\omega) = R_0(q,\omega)\,g_A\,U^{L,T}_{AV}(q,\omega), \tag{198}$$

$$R^{T'}_{VA}(q,\omega) = R_0(q,\omega)\,g_V\,U^{T'}_{VA}(q,\omega), \tag{199}$$
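where

$$R_0(q,\omega) = \frac{3\mathcal{N}M^2}{2qp_F^3}\,(E_F - \Gamma)\,\theta(E_F - \Gamma). \tag{200}$$

In Eq. (200) $E_F = \sqrt{p_F^2 + M^2}$ is the Fermi energy and

$$\Gamma(q,\omega) = \max\left\{(E_F - \omega),\ \frac{1}{2}\left(q\sqrt{1 + \frac{1}{\tau}} - \omega\right)\right\}.$$

Two regimes exist: (i) $q < 2p_F$, where $\Gamma = E_F - \omega$ and Pauli blocking occurs; and (ii) $q > 2p_F$, where $\Gamma = \frac{1}{2}\bigl(q\sqrt{1 + 1/\tau} - \omega\bigr)$ and the responses are not Pauli blocked. The remaining dependence on $q$ and $\omega$ in Eqs. (197)–(199) is contained in the reduced responses:

$$U^L(q,\omega) = \frac{q^2}{Q^2}\left[G_E^2 + \frac{1}{1+\tau}\,(G_E^2 + \tau G_M^2)\,\Delta\right],$$

$$U^T(q,\omega) = 2\tau G_M^2 + \frac{1}{1+\tau}\,(G_E^2 + \tau G_M^2)\,\Delta,$$

$$U^L_{AV}(q,\omega) = 2\,\frac{q^2}{Q^2}\left[G_E G_E^{NC} + \frac{1}{1+\tau}\,(G_E G_E^{NC} + \tau G_M G_M^{NC})\,\Delta\right],$$

$$U^T_{AV}(q,\omega) = 4\tau G_M G_M^{NC} + \frac{2}{1+\tau}\,(G_E G_E^{NC} + \tau G_M G_M^{NC})\,\Delta,$$

$$U^{T'}_{VA}(q,\omega) = \sqrt{\tau(1+\tau)}\;4G_M G_A^{NC}\,(1 + \tilde{\Delta}'). \tag{201}$$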
Fig. 5. Relativistic Fermi gas response functions and asymmetry for $^{12}$C at different values of $q$, shown as a function of $\omega$. The location of $\omega = Q^2/(2M)$ is indicated by the arrows. Further explanations are given in the text (taken from Ref. [114]).
Here, the following functions of $q$, $\omega$ and $p_F$ have been introduced:

$$\Delta = \frac{Q^2}{q^2}\left[\frac{1}{3M^2}\,(E_F^2 + E_F\Gamma + \Gamma^2) + \frac{\omega}{2M^2}\,(E_F + \Gamma) + \frac{\omega^2}{4M^2}\right] - (1+\tau), \tag{202}$$

$$\tilde{\Delta}' = \frac{1}{2q}\sqrt{\frac{\tau}{1+\tau}}\;\{E_F + \Gamma + \omega\} - 1. \tag{203}$$

In most kinematical situations the quantities $\Delta$ and $\tilde{\Delta}'$ are small and their effect on the P-odd asymmetry is negligible, below the percent level. It is interesting to notice that by setting $\Delta = \tilde{\Delta}' = 0$ the presence of the nuclear medium in the response functions (197)–(199) is felt through the function $R_0(q,\omega)$ only. The latter obviously cancels in expression (196) of the asymmetry, thus leading to the same combination of form factors which was obtained in the case of elastic electron–proton scattering. This fact ensures the possibility of using measurements of quasi-elastic cross sections in the scattering of polarized electrons on nuclei to extract information on the strange form factors of the nucleon. Indeed, as discussed in Ref. [114], the nuclear physics dependence of the P-odd asymmetry which emerges from the calculations in the RFG model is rather weak.

Typical results for $^{12}$C are shown in Fig. 5. The Fermi momentum is taken to be $p_F = 225$ MeV and the strange form factors are set to zero. In the upper panels the two electromagnetic
response functions ($R_L$, $R_T$) are displayed as a function of the energy transfer $\omega$, for three typical three-momentum transfers, $q = 0.3$, 0.5 and 2 GeV. The intermediate panels show the corresponding PV responses $R^L_{AV}$, $R^T_{AV}$ and $R^{T'}_{VA}$. Quasi-elastic scattering allows one to explore these quantities (as well as the P-odd asymmetry) in different kinematical ranges. The $Q^2$ values corresponding to the peak of the responses ($\omega = Q^2/2M$) are 0.09, 0.24 and 2.4 GeV$^2$, respectively, but for each case a range of different $Q^2$ is explored (for example $1.9 \le Q^2 \le 2.9$ GeV$^2$ in the right panels), which is one of the advantages of this investigation. Once these response functions are multiplied by the (angle dependent) lepton kinematical factors (189), (190) and combined as in Eq. (196), one obtains the asymmetry shown in the lower panels of Fig. 5. It ranges from a few $\times 10^{-6}$ at forward angles and low $q$ to a few $\times 10^{-4}$ for a broad range of angles at $q = 2$ GeV.

Further, one can examine the sensitivity of the P-odd asymmetry to the nucleon strange and non-strange form factors. At backward scattering angles $v_L/v_T \to 0$, $v_{T'}/v_T \to 1$ and the terms containing magnetic and axial form factors dominate (in spite of the $g_V$ factor which penalizes the interference with the axial nuclear current). In Ref. [114] the electromagnetic form factors of the proton were parameterized with the usual dipole form, with a cutoff mass $M_V = 843$ MeV; for the neutron the following form factors were used:

$$G_M^n(Q^2) = \frac{\rho_{Mn}\,\mu_n}{(1 + Q^2/M_V^2)^2}, \tag{204}$$

$$G_E^n(Q^2) = \frac{-\mu_n\tau}{(1 + Q^2/M_V^2)^2\,(1 + \lambda_n\tau)}, \tag{205}$$
where the Galster [115,116] parameterization for $G_E^n$ was assumed, with $\mu_n = -1.913$ and $\lambda_n = 5.6$. The standard value of the parameter $\rho_{Mn}$ is unity; it accounts for possible deviations as in Eq. (140). The axial isovector form factor was parameterized as

$$G_A(Q^2) = \frac{g_A}{(1 + Q^2/M_A^2)^2} \tag{206}$$
with $M_A = 1$ GeV. For the strange form factors the following parameterization was adopted:

$$G_E^s(Q^2) = \frac{\rho_s\tau}{(1 + Q^2/M_V^2)^2\,(1 + \lambda_E^s\tau)}, \tag{207}$$

$$G_M^s(Q^2) = \frac{\mu_s}{(1 + Q^2/M_V^2)^2\,(1 + \lambda_M^s\tau)}, \tag{208}$$

$$G_A^s(Q^2) = \frac{g_A^s\tau}{(1 + Q^2/M_A^2)^2\,(1 + \lambda_A^s\tau)}, \tag{209}$$
where the second factor in the denominators accounts for possible deviations from the high-$\tau$ dipole fall-off. In Ref. [114] the values $\lambda_E^s = \lambda_n$, $\lambda_M^s = \lambda_A^s = 0$ were used.

The correlations between different parameters used in modeling the form factors were examined by looking at the dependence of one parameter on a second one (all the remaining ones
Fig. 6. Correlation plots for $^{12}$C at $q = 500$ MeV, $\omega = Q^2/(2M)$ and $\theta = 150^\circ$. The asymmetry is constant for any pair of parameters corresponding to the lines marked 0%. For lines marked ±1% (±5%) the asymmetry remains constant at values 1% (5%) larger or smaller than that obtained with the canonical choice of parameters. The panels show correlations of $g_A$ with $\lambda_n$, $\rho_s$, $\mu_s$ and $g_A^s$. Further explanations are given in the text (taken from Ref. [114]).
being fixed) keeping the P-odd asymmetry constant. In Fig. 6 a few examples of these correlation plots are shown for $^{12}$C at $q = 500$ MeV, $\omega = Q^2/(2M)$ and $\theta = 150^\circ$: the lines marked 0% correspond to a (constant) value of $A$ which is obtained starting with the “standard” values (e.g. $g_A = 1.26$, $\rho_s = 0$ in the upper right panel). Lines marked +1% (−1%) have asymmetries $1.01A$ ($0.99A$), with a corresponding meaning for the lines marked ±5%. From these curves one observes, for example, that at this particular choice of kinematics a ±1% determination of the asymmetry would permit a ±5.6% determination of $g_A$ if everything else were known (we refer the reader to the above discussion on the uncertainties on this parameter due, e.g., to radiative corrections). A ±5% determination of $A$ would likewise translate into a ±28% uncertainty in $g_A$. In practice, there are also uncertainties in the other parameters which enter into the problem and in the nuclear model itself.
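The two numbers quoted above (±5.6% and ±28%) are related by a simple linear scaling of the relative uncertainty. The sketch below makes that arithmetic explicit; it assumes the response of $g_A$ to the asymmetry stays linear over this range, and the names `parameter_uncertainty` and `G_A_SENSITIVITY` are hypothetical.

```python
# Relative change of g_A (in percent) per 1% change of the asymmetry, as read
# off the correlation plot at q = 500 MeV and theta = 150 degrees.
G_A_SENSITIVITY = 5.6


def parameter_uncertainty(asymmetry_precision_pct, sensitivity=G_A_SENSITIVITY):
    """Linear propagation of a relative asymmetry precision to one parameter.

    Assumes all other parameters are held fixed and that the parameter
    responds linearly to the asymmetry over the range considered.
    """
    return sensitivity * asymmetry_precision_pct


if __name__ == "__main__":
    for precision in (1.0, 5.0):
        print(precision, parameter_uncertainty(precision))
```

The linearity is an approximation suggested by the quoted numbers; for larger excursions the contour lines of the correlation plots would have to be used directly.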
Fig. 7. Correlation plots as in Fig. 6, showing $g_A$ correlated with (a) $\rho_{Mn}$, (b) $\mu_s$ and (c) $g_A^s$, for $q = 500$ MeV and $\theta = 150^\circ$. Three cases are shown: elastic scattering from hydrogen (labelled H) and quasi-elastic scattering from carbon (C) and tungsten (W). Only the ±1% contour lines are shown. For tungsten two different Fermi momenta are used: $p_F^p = 250$ MeV for protons and $p_F^n = 285$ MeV for neutrons (taken from Ref. [114]).
The parameter $\lambda_n$ characterizes the high-$\tau$ fall-off of the electric form factor of the neutron, $G_E^n$: we see from the upper left panel of Fig. 6 that, if the latter is determined in future experiments to ±10%, this only translates into a ±0.6% uncertainty in $g_A$. Obviously, this relatively minor effect is due to the backward kinematics, which suppresses the longitudinal contributions. The lower left panel of Fig. 6 shows the correlation between $g_A$ and the strength of $G_M^s$, $\mu_s$: if this parameter goes from 0 to −1 (from 0 to 1), then $g_A$ would decrease (increase) by 34.7%. This correlation is rather important: for example, if $g_A$ were known to ±10%, then $\mu_s$ would be constrained to ±0.29. Finally, the lower right panel shows a non-negligible correlation between $g_A$ and $g_A^s$.

One should also notice here the potential offered by the use of different nuclear targets. Indeed, the measurement of $A$ in inelastic electron–nucleus scattering can give information not only on the strange nucleon form factors, but also on the non-strange parts of the weak neutral form factors of the nucleon. This can be achieved by filtering the latter with a suitable choice of $Z$ and $N$ such that, for example, it enhances the isovector contributions to the nuclear response and eventually cancels the isoscalar ones. As an illustration, let us consider the product $G_M G_M^{NC}$ in $U^T_{AV}$ or $G_M G_A^{NC}$ in $U^{T'}_{VA}$, in Eqs. (201): in a $(Z, N)$ nucleus both $G_M^s$ and $G_A^s$ enter multiplied by the combination $ZG_M^p + NG_M^n$, whereas the non-strange contributions (e.g. the isovector part of $G_A^{NC}$) enter with $ZG_M^p - NG_M^n$. Hence, the ratio of the strange to the non-strange transverse pieces is $(ZG_M^p + NG_M^n)/(ZG_M^p - NG_M^n)$. This ratio is 1 for the proton, 0.187 for a $Z = N$ nucleus, and can be further reduced to be nearly zero by choosing a nucleus whose $N/Z$ ratio is very close to $-\mu_p/\mu_n$. The case of tungsten ($^{184}$W), which has advantages for high luminosity experiments, gives a ratio of −0.009.
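The three ratios quoted above follow from the nucleon magnetic moments alone if one assumes that $G_M^p$ and $G_M^n$ share the same $Q^2$ dependence, so that the form factors can be replaced by $\mu_p$ and $\mu_n$. The sketch below works under that assumption; the values of the magnetic moments and the function name are supplied here for illustration.

```python
MU_P = 2.793   # proton magnetic moment in nuclear magnetons (assumed input)
MU_N = -1.913  # neutron magnetic moment in nuclear magnetons (assumed input)


def strange_to_nonstrange_ratio(z, n, mu_p=MU_P, mu_n=MU_N):
    """Ratio (Z G_M^p + N G_M^n) / (Z G_M^p - N G_M^n) for a (Z, N) target.

    Assumes a common Q^2 dependence of the proton and neutron magnetic form
    factors, so that only the magnetic moments survive in the ratio.
    """
    return (z * mu_p + n * mu_n) / (z * mu_p - n * mu_n)


if __name__ == "__main__":
    targets = {"H": (1, 0), "12C": (6, 6), "184W": (74, 110)}
    for name, (z, n) in targets.items():
        print(name, round(strange_to_nonstrange_ratio(z, n), 3))
    # N/Z value at which the ratio would vanish:
    print(-MU_P / MU_N)
```

For tungsten $N/Z \approx 1.49$ lies very close to $-\mu_p/\mu_n \approx 1.46$, which is why the strange contribution is almost completely filtered out in that target.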
In Fig. 7 correlation plots are shown for $q = 500$ MeV and $\theta = 150^\circ$; the results for $^{12}$C and $^{184}$W are obtained by integrating over the entire quasi-elastic response region, although there
is no significant difference between the peak asymmetry and the integrated result. In (a) a significant interplay appears between $g_A$ and $\rho_{Mn}$, more pronounced in H than in C and W, but not negligible even in the last case. In (b) and (c), instead, it is clearly shown that no correlation exists in W between $g_A$ and the strangeness parameters $\mu_s$ and $g_A^s$: thus an $N > Z$ nucleus such as tungsten appears to have advantages for the determination of $G_A$ in backward-angle scattering.

A special case with $N = Z$ is the deuteron, which we have already mentioned in Section 6 when discussing the SAMPLE experiment at MIT/Bates. In the kinematic region around the quasi-elastic peak the P-odd asymmetry for the deuteron can be evaluated in the so-called “static” approximation, in which the contributions to the cross sections of protons and neutrons at rest are summed incoherently [117]:

$$\left(\frac{A}{A_0}\right)_d = 2\left\{g_A\left[v_L\,\frac{q^2}{2Q^2}\,(G_E^p G_E^{NC,p} + G_E^n G_E^{NC,n}) + v_T\,\tau\,(G_M^p G_M^{NC,p} + G_M^n G_M^{NC,n})\right] + g_V\left[v_{T'}\sqrt{\tau(1+\tau)}\,(G_M^p G_A^{NC,p} + G_M^n G_A^{NC,n})\right]\right\}$$

$$\qquad\times\left\{v_L\,\frac{q^2}{2Q^2}\,\bigl((G_E^p)^2 + (G_E^n)^2\bigr) + v_T\left[\tau\bigl((G_M^p)^2 + (G_M^n)^2\bigr)\right]\right\}^{-1}. \tag{210}$$
This formula can be obtained from the RFG, setting the terms $\Delta$ and $\tilde{\Delta}'$ in Eqs. (201) to zero and taking $N = Z = 1$. The dependence of the P-odd asymmetry on the deuteron structure was studied in Ref. [117] under different conditions. The authors concluded that in the kinematical region around the quasi-elastic peak deviations from the static model are within 1% or 2%. A more recent study of the deuteron structure effects in $A_d$ for the specific kinematic conditions of the SAMPLE experiment can be found in [118]. To get a flavor of the general sensitivity of $A_d$ to the single nucleon form factors one can consider the transverse contributions in Eq. (210), which are dominant at large scattering angles, as in the SAMPLE experiment. As already stated for the general case $N = Z$, the strange form factors $G_M^s$ and $G_A^s$ enter the transverse contributions to the asymmetry multiplied by the combination $G_M^p + G_M^n$ and are suppressed by a factor 0.187 with respect to the contribution coming from the isovector axial form factor $G_A$, which is multiplied by $G_M^p - G_M^n$.

As a final issue in this section let us consider forward-angle scattering: at small $\theta$ and fixed $q$, $\omega$ one has $v_L/v_T \to 2Q^2/q^2$ and $v_{T'}/v_T \to 0$, so that the contribution of the response $R^{T'}_{VA}(q,\omega)$ is suppressed. In this situation considerable sensitivity of the asymmetry to the electric strange form factor can be achieved. In Fig. 8 the correlation plot of $\rho_s$ versus $\mu_s$ is shown for $^{12}$C at $q = 500$ MeV and $\theta = 10^\circ$. From all the above considerations, quasi-elastic scattering of polarized electrons on nuclei can be considered, with appropriate choices of the kinematical conditions, as a useful tool to determine the strange form factors of the nucleon; it can also provide important information on the non-strange components of various nucleonic form factors, which are still waiting for a precise determination.
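The reduction of the RFG result to the static approximation rests on the medium correction of Eq. (202) being small near the quasi-elastic peak. The sketch below evaluates Eq. (202), together with the function defined after Eq. (200), at the peak; the nucleon mass value and the Python names `gamma_rfg` and `delta_rfg` are assumptions made for illustration, and the Fermi momentum of 225 MeV is the value quoted earlier for $^{12}$C.

```python
import math

NUCLEON_MASS = 939.0  # MeV; assumed average nucleon mass


def gamma_rfg(q, omega, p_f, mass=NUCLEON_MASS):
    """Function defined after Eq. (200): the larger of the two lower bounds."""
    e_f = math.hypot(p_f, mass)                 # Fermi energy sqrt(p_F^2 + M^2)
    tau = (q**2 - omega**2) / (4.0 * mass**2)   # tau = Q^2 / (4 M^2)
    return max(e_f - omega, 0.5 * (q * math.sqrt(1.0 + 1.0 / tau) - omega))


def delta_rfg(q, omega, p_f, mass=NUCLEON_MASS):
    """Medium correction of Eq. (202); all energies and momenta in MeV."""
    e_f = math.hypot(p_f, mass)
    g = gamma_rfg(q, omega, p_f, mass)
    q2_four = q**2 - omega**2
    tau = q2_four / (4.0 * mass**2)
    bracket = ((e_f**2 + e_f * g + g**2) / (3.0 * mass**2)
               + omega * (e_f + g) / (2.0 * mass**2)
               + omega**2 / (4.0 * mass**2))
    return q2_four / q**2 * bracket - (1.0 + tau)


if __name__ == "__main__":
    q = 500.0
    omega_peak = math.hypot(q, NUCLEON_MASS) - NUCLEON_MASS  # omega = Q^2 / (2M)
    print(delta_rfg(q, omega_peak, p_f=225.0))
    print(delta_rfg(q, omega_peak, p_f=1e-9))
```

In the limit of vanishing Fermi momentum the correction is zero at the peak, which is the free-nucleon (static) limit; for a finite Fermi momentum it stays at the level of a few percent at these kinematics.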
Although it goes beyond the scope of the present review, we also mention that the measurement of PV nuclear response functions would open new and interesting possibilities to explore the nuclear dynamics as viewed by weak neutral probes and to test nuclear models in the domain of medium–high excitation energy [119,120].
Fig. 8. Correlation plots as in Fig. 6, showing $\rho_s$ correlated with $\mu_s$, for $q = 500$ MeV, $\omega = Q^2/(2M)$ and $\theta = 10^\circ$ (taken from Ref. [114]).
9. Elastic NC scattering of neutrinos (antineutrinos) on the nucleon

Direct information on the strange form factors of the nucleon can be obtained from the investigation of the NC processes [121,122]

$$\nu_\mu(\bar\nu_\mu) + N \to \nu_\mu(\bar\nu_\mu) + N . \tag{211}$$

The amplitude for the process of neutrino (antineutrino) scattering is given by the following expression:

$$\langle f|S|i\rangle = \mp \frac{G_F}{\sqrt{2}}\, \bar u(k')\gamma^\alpha(1 \mp \gamma_5)u(k)\, \langle p'|J_\alpha^{NC}|p\rangle\, (2\pi)^4 \delta^{(4)}(p' - p - q) , \tag{212}$$

where $k$ and $k'$ are the momenta of the initial and final neutrino (antineutrino), $p$ and $p'$ the momenta of the initial and final nucleon, $q = k - k'$ and $J_\alpha^{NC} = V_\alpha^{NC} - A_\alpha^{NC}$ is the hadronic neutral current in the Heisenberg representation. The matrix elements of the vector and axial NC are given in the Standard Model by expressions (103) and (104). The cross sections of processes (211) turn out to be

$$d\sigma_{\nu(\bar\nu)} = \frac{G_F^2}{(2\pi)^2}\, \frac{M}{p\cdot k}\, \{ L^{\alpha\beta}(k, k') \mp L_5^{\alpha\beta}(k, k') \}\, W^{NC}_{\alpha\beta}(p, q)\, \frac{d\vec k'}{k'_0} , \tag{213}$$

where

$$W^{NC}_{\alpha\beta}(p, q) = \frac{1}{2M} \int \frac{d\vec p\,'}{2p'_0}\, \langle p'|J_\alpha^{NC}|p\rangle \langle p|J_\beta^{NC}|p'\rangle\, \delta^{(4)}(p' - p - q) , \tag{214}$$

while the tensor $L^{\alpha\beta}(k, k')$ and pseudotensor $L_5^{\alpha\beta}(k, k')$ are given by (118) and (119), respectively.
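It is evident that $W^{NC}_{\alpha\beta}$ has the following structure:

$$W^{NC}_{\alpha\beta} = W^{VV+AA}_{\alpha\beta} - W^{VA}_{\alpha\beta} , \tag{215}$$

where, with obvious notation, the tensor $W^{VV+AA}_{\alpha\beta}$ is due to the contribution of the vector–vector and axial–axial NC, whereas the pseudotensor $W^{VA}_{\alpha\beta}$ is due to the interference of the vector and axial NC. After performing the traces over spin states, they become ($\tau = Q^2/4M^2$, $n = p + p'$):

$$W^{VV+AA}_{\alpha\beta} = \left\{ -\left[\tau (G_M^{NC})^2 + (1+\tau)(G_A^{NC})^2\right]\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right) + \left[\frac{(G_E^{NC})^2 + \tau (G_M^{NC})^2}{1+\tau} + (G_A^{NC})^2\right]\frac{n_\alpha n_\beta}{4M^2} - \frac{q_\alpha q_\beta}{q^2}(G_A^{NC})^2 \right\} \delta\!\left(\nu - \frac{Q^2}{2M}\right) \tag{216}$$

and

$$W^{VA}_{\alpha\beta} = \frac{i}{M^2}\, \epsilon_{\alpha\beta\rho\sigma}\, p^\rho p'^\sigma\, G_M^{NC} G_A^{NC}\, \delta\!\left(\nu - \frac{Q^2}{2M}\right) , \tag{217}$$

respectively. Taking into account that

$$\frac{d\vec k'}{k'_0} = \pi\, \frac{M}{p\cdot k}\, dQ^2\, d\nu , \tag{218}$$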
from Eqs. (213), (216) and (217) we obtain, correspondingly, the following expressions for the differential cross sections of the processes (211):

$$\left(\frac{d\sigma}{dQ^2}\right)^{NC}_{\nu(\bar\nu)} = \frac{G_F^2}{2\pi}\left[\frac12 y^2 (G_M^{NC})^2 + \left(1 - y - \frac{M}{2E}y\right)\frac{(G_E^{NC})^2 + \frac{E}{2M}y\,(G_M^{NC})^2}{1 + \frac{E}{2M}y} + \left(\frac12 y^2 + 1 - y + \frac{M}{2E}y\right)(G_A^{NC})^2 \pm 2y\left(1 - \frac12 y\right)G_M^{NC}G_A^{NC}\right] . \tag{219}$$

Here,

$$y = \frac{p\cdot q}{p\cdot k} = \frac{Q^2}{2\,p\cdot k} \tag{220}$$

and $E$ is the energy of the neutrino (antineutrino) in the laboratory system. In order to obtain information on the strange form factors of the nucleon from the investigation of processes (211) it is necessary to know the axial CC form factor $G_A$ [see relation (106)]. The latter can be determined by investigating the quasi-elastic processes

$$\nu_\mu + n \to \mu^- + p , \qquad \bar\nu_\mu + p \to \mu^+ + n . \tag{221}$$
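As an illustration, Eq. (219) can be evaluated numerically once values for the NC form factors are supplied. The sketch below is not taken from the original analysis: the function name, the numerical constants (Fermi constant, nucleon mass, unit conversion) and the laboratory-frame relation $p\cdot k = ME$ used to compute $y$ are assumptions made for the example.

```python
import math

G_F = 1.1664e-5            # Fermi constant in GeV^-2 (assumed value)
GEV2_TO_CM2 = 0.3894e-27   # (hbar c)^2: converts GeV^-2 to cm^2 (assumed value)

def dsigma_dq2_nc(Q2, E, GE, GM, GA, M=0.939, antineutrino=False):
    """Eq. (219): elastic NC cross section dsigma/dQ^2 in cm^2/GeV^2.

    Q2 in GeV^2, E (lab neutrino energy) in GeV; GE, GM, GA are the NC
    form factors evaluated at Q2. Upper sign: neutrino, lower: antineutrino.
    """
    y = Q2 / (2.0 * M * E)        # Eq. (220) with p.k = M E in the lab frame
    tau = E * y / (2.0 * M)       # equals Q2 / (4 M^2)
    sign = -1.0 if antineutrino else 1.0
    bracket = (
        0.5 * y**2 * GM**2
        + (1.0 - y - M * y / (2.0 * E)) * (GE**2 + tau * GM**2) / (1.0 + tau)
        + (0.5 * y**2 + 1.0 - y + M * y / (2.0 * E)) * GA**2
        + sign * 2.0 * y * (1.0 - 0.5 * y) * GM * GA
    )
    return G_F**2 / (2.0 * math.pi) * bracket * GEV2_TO_CM2
```

Only the interference term $G_M^{NC}G_A^{NC}$ changes sign between neutrinos and antineutrinos, so the difference of the two cross sections isolates it.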
The amplitudes of these processes are given, respectively, by the expressions

$$\langle f|S|i\rangle = -i\,\frac{G_F}{\sqrt{2}}\, \bar u(k')\gamma^\alpha(1 - \gamma_5)u(k)\, \langle p'|J_\alpha^{CC}|p\rangle\, (2\pi)^4 \delta^{(4)}(p' - p - q) , \tag{222}$$

$$\langle f|S|i\rangle = i\,\frac{G_F}{\sqrt{2}}\, \bar u(k')\gamma^\alpha(1 + \gamma_5)u(k)\, \langle p'|J_\alpha^{CC\,\dagger}|p\rangle\, (2\pi)^4 \delta^{(4)}(p' - p - q) , \tag{223}$$

where $k$ and $k'$ are the momenta of the initial $\nu_\mu$ ($\bar\nu_\mu$) and final $\mu^-$ ($\mu^+$) lepton, $p$ is the momentum of the initial $n$ ($p$) and $p'$ the momentum of the final $p$ ($n$). The Heisenberg vector and axial charged currents are the “plus”-components of the isovectors $V_\alpha^i$ and $A_\alpha^i$ [see Eq. (73)]. In Section 3, we have considered the one-nucleon matrix elements of the axial current $A_\alpha^{1+i2}$. Let us discuss now the matrix element of the vector current $V_\alpha^{1+i2}$. Due to isotopic invariance of the strong interactions the vector current $V_\alpha^i$ is conserved (conserved vector current, CVC) [123]:

$$\partial^\alpha V_\alpha^i = 0 . \tag{224}$$

Thus, the matrix element of the vector current satisfies the condition $(p' - p)^\alpha\, {}_p\langle p'|V_\alpha^{1+i2}|p\rangle_n = 0$ and has the following general form:

$${}_p\langle p'|V_\alpha^{1+i2}|p\rangle_n = \bar u(p')\left[\gamma_\alpha F_1^{CC}(Q^2) + \frac{i}{2M}\,\sigma_{\alpha\beta}\, q^\beta F_2^{CC}(Q^2)\right] u(p) , \tag{225}$$

where $F_{1,2}^{CC}(Q^2)$ are CC form factors. The corresponding Sachs CC form factors are given by

$$G_M^{CC}(Q^2) = F_1^{CC}(Q^2) + F_2^{CC}(Q^2) , \tag{226}$$

$$G_E^{CC}(Q^2) = F_1^{CC}(Q^2) - \frac{Q^2}{4M^2}\, F_2^{CC}(Q^2) . \tag{227}$$

An important property of the isovector current $V_\alpha^i$ is given by its commutation relation with the isospin operator

$$[I_k, V_\alpha^j] = i\,\epsilon_{kj\ell}\, V_\alpha^\ell , \tag{228}$$

where $I_k$ is the total isotopic spin operator. From Eq. (228) it follows that

$$V_\alpha^{1+i2} = [V_\alpha^3, I^{1+i2}] . \tag{229}$$

Taking into account the charge symmetry of strong interactions, from (229) the following relations hold:

$${}_p\langle p'|V_\alpha^{1+i2}|p\rangle_n = {}_n\langle p'|V_\alpha^{1-i2}|p\rangle_p = {}_p\langle p'|J_\alpha^{em}|p\rangle_p - {}_n\langle p'|J_\alpha^{em}|p\rangle_n .$$
Let us notice that in the derivation of these relations we have used expression (48) for the e.m. current. Thus, the CC vector form factors are connected with the electromagnetic form factors of proton and neutron by

$$G_M^{CC}(Q^2) = G_M^p(Q^2) - G_M^n(Q^2) , \tag{230}$$

$$G_E^{CC}(Q^2) = G_E^p(Q^2) - G_E^n(Q^2) . \tag{231}$$

The cross sections of processes (221) are given by expression (213) in which $W^{NC}_{\alpha\beta}$ has to be replaced by

$$W^{CC}_{\alpha\beta}(p, q) = \frac{1}{2M} \int \frac{d\vec p\,'}{2p'_0}\, \langle p'|J_\alpha^{CC}|p\rangle \langle p|J_\beta^{CC\,\dagger}|p'\rangle\, \delta^{(4)}(p' - p - q) . \tag{232}$$

It is obvious that in order to obtain the cross sections of the quasi-elastic processes (221) it is necessary to replace the NC form factors in expression (219) by the CC ones (we are neglecting the muon mass). One obtains

$$\left(\frac{d\sigma}{dQ^2}\right)^{CC}_{\nu(\bar\nu)} = \frac{G_F^2}{2\pi}\left[\frac12 y^2 (G_M^{CC})^2 + \left(1 - y - \frac{M}{2E}y\right)\frac{(G_E^{CC})^2 + \frac{E}{2M}y\,(G_M^{CC})^2}{1 + \frac{E}{2M}y} + \left(\frac12 y^2 + 1 - y + \frac{M}{2E}y\right)(G_A)^2 \pm 2y\left(1 - \frac12 y\right)G_M^{CC}G_A\right] . \tag{233}$$

The most detailed study of the elastic NC scattering of neutrinos (antineutrinos) on protons was done in experiment 734 at BNL in 1987 [124]. A 170 ton high-resolution liquid-scintillator target-detector was used in this experiment. The liquid-scintillator cells were segmented by proportional drift tubes. About 79% of the target protons were bound in carbon and aluminum nuclei and about 21% were free protons. The neutrino beam was a horn-focused wide band beam. The average energy of neutrinos was 1.3 GeV and the average energy of antineutrinos was 1.2 GeV. The spectrum of neutrinos and antineutrinos was determined from the detection of quasi-elastic $\nu_\mu + n \to \mu^- + p$ and $\bar\nu_\mu + p \to \mu^+ + n$ events. The angle between the momenta of the recoil protons and of the incident neutrinos as well as the range and energy loss were measured. The measurement of the range and energy loss provided an effective particle identification and the determination of the kinetic energy of the recoil protons. The background from neutrons entering the detector was eliminated by restricting the fiducial volume to about 19% of the total volume of the detector. After all cuts, 951 neutrino events and 776 antineutrino events were selected. The differential cross sections

$$\left\langle \frac{d\sigma}{dQ^2} \right\rangle^{NC}_{\nu(\bar\nu)} = \frac{\int dE_{\nu(\bar\nu)}\, (d\sigma/dQ^2)^{NC}_{\nu(\bar\nu)}\, \Phi_{\nu(\bar\nu)}(E_{\nu(\bar\nu)})}{\int dE_{\nu(\bar\nu)}\, \Phi_{\nu(\bar\nu)}(E_{\nu(\bar\nu)})} , \tag{234}$$
Fig. 9. The data points are the flux-averaged differential cross sections for $\nu_\mu p \to \nu_\mu p$ and $\bar\nu_\mu p \to \bar\nu_\mu p$ measured in the experiment of Ref. [124]. The solid curves are best fits to the combined data with the values $M_A = 1.06$ GeV and $\sin^2\theta_W = 0.220$. The error bars represent statistical errors and also include $Q^2$-dependent systematic errors (taken from Ref. [124]).
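obtained by folding the cross sections (219) with the BNL neutrino and antineutrino spectra $\Phi_{\nu(\bar\nu)}(E_{\nu(\bar\nu)})$, were determined from the data of the experiment [124]. Their values are presented in Fig. 9 by points. For the ratios of (flux-averaged) total elastic and quasi-elastic cross sections, for $Q^2$ in the interval $0.5 \le Q^2 \le 1$ GeV$^2$, the following values were obtained:

$$R_\nu^{BNL} = \frac{\langle\sigma\rangle(\nu_\mu p \to \nu_\mu p)}{\langle\sigma\rangle(\nu_\mu n \to \mu^- p)} = 0.153 \pm 0.007\ (\mathrm{stat}) \pm 0.017\ (\mathrm{syst}) , \tag{235}$$

$$R_{\bar\nu}^{BNL} = \frac{\langle\sigma\rangle(\bar\nu_\mu p \to \bar\nu_\mu p)}{\langle\sigma\rangle(\bar\nu_\mu p \to \mu^+ n)} = 0.218 \pm 0.012\ (\mathrm{stat}) \pm 0.023\ (\mathrm{syst}) , \tag{236}$$

$$R^{BNL} = \frac{\langle\sigma\rangle(\bar\nu_\mu p \to \bar\nu_\mu p)}{\langle\sigma\rangle(\nu_\mu p \to \nu_\mu p)} = 0.302 \pm 0.019\ (\mathrm{stat}) \pm 0.037\ (\mathrm{syst}) . \tag{237}$$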
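The flux averaging of Eq. (234) is a ratio of two one-dimensional integrals over the beam energy, which can be sketched with a simple trapezoidal rule. In the example below the helper names and the flux shape `toy_flux` are hypothetical placeholders chosen for illustration; they do not represent the measured BNL spectrum.

```python
import math

def flux_average(dsigma, flux, e_min, e_max, n=2000):
    """Eq. (234): integral of dsigma(E)*flux(E) dE divided by integral of flux(E) dE.

    Both integrals use the trapezoidal rule on n equal steps in [e_min, e_max].
    """
    h = (e_max - e_min) / n
    num = 0.0
    den = 0.0
    for i in range(n + 1):
        energy = e_min + i * h
        weight = 0.5 if i in (0, n) else 1.0   # trapezoid end-point weights
        phi = flux(energy)
        num += weight * dsigma(energy) * phi
        den += weight * phi
    return num / den                           # the step h cancels in the ratio

def toy_flux(energy):
    """Hypothetical wide-band spectrum peaking below 1 GeV (not the BNL flux)."""
    return energy * math.exp(-energy / 0.65)
```

In use, `dsigma` would be a function of $E$ at fixed $Q^2$, for instance Eq. (219) with the form factors evaluated at that $Q^2$.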
The fit of the data presented in Ref. [124] was done under the assumption that the contribution of the strange form factors of the nucleon can be neglected and that the axial CC form factor is given by the dipole formula

$$G_A^{NC}(Q^2) = \frac12\, G_A(Q^2) = \frac12\, \frac{G_A(0)}{(1 + Q^2/M_A^2)^2} \tag{238}$$
with $G_A(0) = 1.26$. The parameters $M_A$ and $\sin^2\theta_W$ were considered as free parameters. From the simultaneous fit of the neutrino and antineutrino data, the values

$$\sin^2\theta_W = 0.218^{+0.039}_{-0.047} , \qquad M_A = 1.06 \pm 0.05\ \mathrm{GeV}$$

were found (with $\chi^2 = 15.8$ at 14 DOF). The value of the axial cutoff $M_A$ was in good agreement with the world-average value existing at that time,

$$M_A = 1.032 \pm 0.036\ \mathrm{GeV} , \tag{239}$$

which was found from the data of the experiments on quasi-elastic neutrino and antineutrino scattering. The solid curves in Fig. 9 were obtained with the above best-fit values of the parameters. Ref. [124] also reported the result of a fit of the data on NC elastic neutrino (antineutrino)–proton scattering under the assumption that the contribution of the strange vector form factors can be neglected and that the axial strange form factor has the same $Q^2$ dependence as the CC axial form factor:

$$G_A^{NC}(Q^2) = \frac12\, \frac{G_A(0) - G_A^s(0)}{(1 + Q^2/M_A^2)^2} . \tag{240}$$

For the parameter $\sin^2\theta_W$ the value 0.22 was taken. The parameter $M_A$ was constrained to the world-average value (239). From this fit it was found

$$G_A^s(0) = -0.12 \pm 0.07 . \tag{241}$$
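The parameterization (240) can be written as a one-line function, which makes it easy to see why the fitted values of $G_A^s(0)$ and $M_A$ are correlated: a more negative $G_A^s(0)$ and a larger $M_A$ both increase the axial NC form factor of the proton at a given $Q^2$. The function name and default arguments below are illustrative assumptions; the defaults are the values quoted in the text.

```python
def ga_nc_proton(Q2, MA=1.032, GA0=1.26, GAs0=0.0):
    """Eq. (240): proton axial NC form factor with a dipole Q^2 dependence.

    Q2 in GeV^2, MA in GeV; GA0 = G_A(0) and GAs0 = G_A^s(0).
    """
    return 0.5 * (GA0 - GAs0) / (1.0 + Q2 / MA**2) ** 2

# Compare a reference point with the two shifts discussed in the text.
reference = ga_nc_proton(0.75)
with_strangeness = ga_nc_proton(0.75, GAs0=-0.12)   # central value of Eq. (241)
with_larger_mass = ga_nc_proton(0.75, MA=1.06)      # best-fit M_A without strangeness
```

Both shifted values lie above `reference`, so a fit to cross-section data alone can trade one parameter against the other.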
Thus, from the results of the fit described above, it follows that $-0.25 \le G_A^s(0) \le 0$ at 90% CL. It is necessary to stress, however, that there is a strong correlation between the values of the parameters $G_A^s(0)$ and $M_A$ (see Fig. 10).

The data of the BNL experiment were re-analyzed in Ref. [125]. In this work not only strange axial but also strange vector form factors, $F_1^s(Q^2)$ and $F_2^s(Q^2)$, were taken into account. It was assumed that all non-strange form factors have the same dipole $Q^2$-dependence, with $M_V = 0.843$ GeV. The parameter $M_A$ was considered as a free parameter. It was also assumed that the electric form factor of the neutron is given by Eq. (205). These authors made several fits of the BNL data under different assumptions on the values of the parameters that characterize the strange form factors. For the latter the following parameterizations were chosen:

$$F_1^s(Q^2) = \frac{F_1^s\, Q^2}{(1+\tau)(1 + Q^2/M_V^2)^2} , \qquad F_2^s(Q^2) = \frac{\mu_s}{(1+\tau)(1 + Q^2/M_V^2)^2} , \qquad G_A^s(Q^2) = \frac{G_A^s(0)}{(1 + Q^2/M_A^2)^2} , \tag{242}$$
Fig. 10. Simultaneous fit of $d\sigma/dQ^2$ for the neutrino and antineutrino elastic scattering samples in the $(M_A, \eta)$-plane ($\eta \equiv -G_A^s(0)$), with $\sin^2\theta_W$ fixed at 0.220. In the fit $M_A$ has been constrained at the world-average value $M_A = 1.032 \pm 0.036$ GeV (taken from Ref. [124]).
which have the same (dominant) $Q^2$-dependence as the non-strange form factors (see footnote 13). For the value of the parameter $\sin^2\theta_W$ the world-average value $\sin^2\theta_W = 0.2325$ was taken. If we neglect the contribution of all strange form factors and keep $M_A$ as the only variable parameter, then an acceptable fit to the data can be found with

$$M_A = 1.086 \pm 0.015\ \mathrm{GeV}$$

($\chi^2 = 14.12$ at 14 DOF). This value of $M_A$ is in good agreement with the world-average value

$$M_A = 1.061 \pm 0.026\ \mathrm{GeV} .$$

If we neglect the contribution of the vector strange form factors only, the best fit to the data is obtained with

$$G_A^s(0) = -0.15 \pm 0.07 , \qquad M_A = 1.049 \pm 0.019\ \mathrm{GeV}$$

($\chi^2 = 9.73$ at 13 DOF).

13 We notice that some authors, both in the study of PV electron scattering and of neutrino scattering, have assumed parameterizations similar to the ones of the non-strange form factors directly for the Sachs form factors $G_{E,M}^s$. The parameterizations used here for $F_{1,2}^s$ correspond to

$$G_E^s(Q^2) = \frac{(4M^2 F_1^s - \mu_s)\,\tau}{1+\tau}\, \frac{1}{(1 + Q^2/M_V^2)^2} , \qquad G_M^s(Q^2) = \frac{4M^2 F_1^s\, \tau + \mu_s}{1+\tau}\, \frac{1}{(1 + Q^2/M_V^2)^2} .$$
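The correspondence stated in footnote 13 can be checked numerically by building the Sachs combinations $G_E^s = F_1^s - \tau F_2^s$ and $G_M^s = F_1^s + F_2^s$ from the parameterizations (242) and comparing with the closed forms. This is a sketch under assumptions: the function names and the nucleon mass value are chosen for the example, and the Sachs combinations are taken to have the same form as Eqs. (226) and (227).

```python
M = 0.939     # nucleon mass in GeV (assumed value)
M_V = 0.843   # vector dipole mass in GeV, as quoted in the text

def tau(Q2):
    return Q2 / (4.0 * M**2)

def dipole(Q2, mass):
    return 1.0 / (1.0 + Q2 / mass**2) ** 2

def f1s(Q2, f1s_slope):
    """Eq. (242): Dirac strange form factor; f1s_slope is the parameter F_1^s in GeV^-2."""
    return f1s_slope * Q2 * dipole(Q2, M_V) / (1.0 + tau(Q2))

def f2s(Q2, mu_s):
    """Eq. (242): Pauli strange form factor with F_2^s(0) = mu_s."""
    return mu_s * dipole(Q2, M_V) / (1.0 + tau(Q2))

def sachs_from_dirac_pauli(Q2, f1s_slope, mu_s):
    """Sachs combinations G_E = F_1 - tau F_2 and G_M = F_1 + F_2."""
    f1, f2 = f1s(Q2, f1s_slope), f2s(Q2, mu_s)
    return f1 - tau(Q2) * f2, f1 + f2

def sachs_footnote13(Q2, f1s_slope, mu_s):
    """Closed forms for G_E^s and G_M^s quoted in footnote 13."""
    t, d = tau(Q2), dipole(Q2, M_V)
    ge = (4.0 * M**2 * f1s_slope - mu_s) * t / (1.0 + t) * d
    gm = (4.0 * M**2 * f1s_slope * t + mu_s) / (1.0 + t) * d
    return ge, gm
```

With this parameterization $G_E^s(0) = 0$ and $G_M^s(0) = \mu_s$ for any choice of the two parameters.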
Hence, there is a strong correlation between the values of the parameters $M_A$ and $G_A^s(0)$. This correlation is connected with the fact that both negative values of $G_A^s(0)$ and larger values of $M_A$ increase the cross sections of neutrino– and antineutrino–proton scattering. Finally, if we consider all strange form factors and $M_A$ to be variable parameters, then from the fit of the BNL data we get

$$\mu_s = -0.39 \pm 0.70 , \qquad F_1^s = 0.49 \pm 0.70\ \mathrm{GeV}^{-2} , \qquad G_A^s(0) = -0.13 \pm 0.09 , \qquad M_A = 1.049 \pm 0.023\ \mathrm{GeV}$$

($\chi^2 = 9.28$ at 11 DOF). The authors of Ref. [125] concluded that from the data of the BNL experiment it is not possible to draw firm conclusions on the values of the strange form factors of the nucleon. The result of the fit strongly depends on the value of the parameter $M_A$, which determines the $Q^2$ behavior of the axial CC form factor. Satisfactory fits were obtained for values of the strange axial form factor $G_A^s(0)$ in the range from 0 to $-0.15 \pm 0.07$, depending on the value of $M_A$. There is no doubt that new experiments on the measurement of NC elastic neutrino (antineutrino)–proton scattering are necessary.

Information on the strange axial constant of the nucleon $G_A^s(0) \equiv g_A^s$ can be obtained from the measurement of the ratio $R$ of the total cross sections for the production of protons and neutrons in quasi-elastic scattering of neutrinos (antineutrinos) on nuclei with isotopic spin equal to zero. A detailed calculation was done for the nucleus $^{12}$C and for the neutrino beam of the Los Alamos Meson Physics Facility (LAMPF) (with neutrino energies $< 200$ MeV) [126,127]. The nuclear structure effects were taken into account in the framework of the random phase approximation; the final state interaction of the ejected nucleon was included through a finite-range force derived from the Bonn potential. Although this issue goes somewhat beyond the subject of this section, we present it here since the original suggestion was founded on the hypothesis that nuclear structure and/or final state interaction effects appreciably cancel in the ratio of quasi-elastic $\nu(\bar\nu)$–nucleus cross sections. In order to illustrate the sensitivity of the method proposed in [126,127], let us notice that the main contribution to the cross sections of quasi-elastic neutrino scattering comes from the axial NC form factor:

$$\sigma_\nu^{p,n} \propto |G_A^{NC}|^2 \simeq \frac14\, G_A^2 \left(1 \mp 2\,\frac{G_A^s}{G_A}\right) . \tag{243}$$

For the ratio $R$ we have then (see footnote 14):

$$R = \frac{\sigma_\nu^p}{\sigma_\nu^n} \simeq 1 - 4\,\frac{G_A^s}{G_A} \simeq 1 - \frac{16}{5}\, G_A^s . \tag{244}$$
14 We draw the reader's attention to a possible source of confusion existing in the literature: this approximate expression for the ratio $R$ suggests a “reference” value, without strangeness effects, of $R = 1$. In fact, even if the axial NC form factor contribution is dominant, the vector NC form factors (which are different for protons and neutrons even in the absence of strange form factors) also contribute to the neutrino scattering cross section, producing a deviation of the reference value from 1. The size of this deviation depends on the kinematical conditions and on the specific kinematical cuts applied in calculating the cross sections.
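A short numerical sketch shows the size of the effect implied by Eqs. (243) and (244). The function names are illustrative; the second function is an assumption made for this example, namely the ratio obtained by keeping the full squares of the axial NC form factors $\frac12(\pm G_A - G_A^s)$ instead of linearizing in $G_A^s/G_A$, and it still ignores the vector contributions discussed in footnote 14.

```python
def ratio_linear(GAs, GA=1.26):
    """Eq. (244): R ~ 1 - 4 G_A^s / G_A, first order in the strange axial constant."""
    return 1.0 - 4.0 * GAs / GA

def ratio_axial_only(GAs, GA=1.26):
    """Axial-only ratio without linearization: ((G_A - G_A^s) / (G_A + G_A^s))^2."""
    return ((GA - GAs) / (GA + GAs)) ** 2

for gas in (0.0, -0.05, -0.10, -0.15):
    print(f"G_A^s = {gas:+.2f}  linear R = {ratio_linear(gas):.3f}  "
          f"axial-only R = {ratio_axial_only(gas):.3f}")
```

For $G_A^s = -0.15$ the linear estimate is about 1.48, so a strange axial constant of roughly 12% of $G_A$ shifts the ratio by almost 50%.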
Thus, in the ratio of the cross sections the effect of the strange form factor is more than doubled. An important advantage of measuring the ratio $R$ is that this quantity is not affected by the absolute normalization of the cross sections. A measurement of the above ratio is planned at the LSND detector at Los Alamos. The detector is made of mineral oil (CH$_2$) and detects NC neutrino-induced knockout reactions by measuring the energy deposition of the recoiling nucleons, with an energy resolution of 5% for nucleon kinetic energies $T_N > 50$ MeV. The available neutrino beam is produced by pion in-flight decay and is composed of about 80% neutrinos and 20% antineutrinos. The yields of protons and neutrons can be measured as a function of the recoiling nucleon kinetic energy, with an average over the nucleon angle, as the detector has no angular resolution. The total cross sections are then obtained by integrating the differential yields over the nucleon kinetic energy. While all neutron events obviously come from quasi-elastic scattering, the proton events can be produced both in free scattering on protons and in quasi-elastic scattering on $^{12}$C; therefore a kinematical cut has to be applied in order to exclude free proton contributions when forming the proton over neutron ratio. The maximum kinetic energy that can be transferred by the neutrinos available at Los Alamos to a free proton at rest in the laboratory frame is about 60 MeV; therefore, the lower limit $T_N > 60$ MeV is adopted in calculating the cross sections. In Ref. [126] it was shown that in the range of neutrino energies of LAMPF, the ratio $R$ practically does not depend on the neutrino energy and, consequently, on the uncertainties in the neutrino spectrum.
In particular, the ratio calculated by averaging the cross sections over the expected neutrino spectrum was found to be practically the same as the one obtained for fixed $E_\nu = 200$ MeV, a value which has been used in further studies (see Section 12). It was also shown that the ratio of the cross sections for the scattering of neutrinos on free protons and neutrons differs from the ratio of the cross sections of the quasi-elastic knockout of protons and neutrons from $^{12}$C, calculated within the random phase approximation, by no more than 10% (see footnote 15). This fact was presented as a reason for using the free expression as a qualitative guidance in understanding the role of the different contributions to the ratio. This small difference can also be taken as a first indication that the ratio $R$ weakly depends on nuclear effects. This argument will be further developed in Section 12. Besides the axial form factor $G_A^s$, the effects of the strange form factor $F_2^s$ were also studied, while $F_1^s$ was not considered since its effects are expected to be small at the energies of interest for the Los Alamos experiment. Since the Los Alamos beam is in fact a mixture of neutrinos and antineutrinos, the experimentally measured quantity will be obtained as the ratio of a “weighted average” of both types
15 More specifically, for the case of free nucleons the following limits on the outgoing nucleon kinetic energy were assumed in the calculation of the total cross sections: $50 \le T_N \le 59.7$ MeV. With this choice the neutrino $p/n$ ratio was found to be

$$R = \frac{0.66 - 0.84\, G_A^s(0) + 0.25\, (G_A^s(0))^2 - 0.254\, F_2^s(0) + 0.2\, G_A^s(0) F_2^s(0)}{0.75 + 0.92\, G_A^s(0) + 0.25\, (G_A^s(0))^2 + 0.254\, F_2^s(0) + 0.2\, G_A^s(0) F_2^s(0)} .$$
Fig. 11. Ratio of the integrated proton to neutron yield for the quasi-elastic neutrino-induced reactions on $^{12}$C as a function of $-G_A^s(0) \equiv -G^s(0)$ in the conditions of the LAMPF decay-in-flight neutrino beam. The various curves correspond to different values of $F_2^s(0) \equiv \mu_s$ (taken from Ref. [127]).
of cross sections, namely,

$$\bar R_{LAMPF} = \frac{\langle\sigma_\nu^p\rangle + \langle\sigma_{\bar\nu}^p\rangle}{\langle\sigma_\nu^n\rangle + \langle\sigma_{\bar\nu}^n\rangle} , \tag{245}$$

$\langle\sigma\rangle = \int \Phi(E)\,\sigma(E)\, dE$ being the cross section averaged over the neutrino or antineutrino flux $\Phi_{\nu(\bar\nu)}(E)$. The effects of $F_2^s$ are of opposite sign for neutrinos and antineutrinos, hence the sensitivity of the experimental ratio to $F_2^s$ can be reduced in the measured quantity (245). Due to the different rate of change of the separate $\nu$ and $\bar\nu$ cross sections with the strange axial form factor, the effects of $g_A^s$ on the experimental ratio can also be altered. In Ref. [127], it was found that, assuming a spectrum made of 80% neutrinos and 20% antineutrinos, the sensitivity of (245) to $g_A^s$ was increased with respect to the pure neutrino ratio, while the sensitivity to $F_2^s$ was decreased. The corresponding results are presented in Fig. 11, where the dependence of the ratio $\bar R$ on the axial strange constant $-G_A^s(0)$ is shown. Notice, however, that a subsequent study [128], which used an updated spectrum for the antineutrinos, showed that the sensitivity of $\bar R$ to both $g_A^s$ and $F_2^s$ is close to that of the pure neutrino ratio.

An analysis of the experiment on the measurement of the ratio $R$ is under way at LAMPF [129]. It is planned that in this experiment the ratio $R$ should be measured with an accuracy of 10%. If $-G_A(0) < -0.2$, an effect of $G_A(0)$ will be seen.

10. Neutrino–antineutrino asymmetry in elastic neutrino (antineutrino)–nucleon scattering

As we have seen before, the precise measurement of the cross sections of the NC neutrino (antineutrino) scattering on nucleons allows one to obtain direct information on the strange form factors of the nucleon if the electromagnetic form factors of proton and neutron and the CC axial form factor are known. At present, there exists detailed information on the electromagnetic form factors of proton and neutron, which was obtained from the data of numerous experiments
on elastic electron–proton and electron–deuteron scattering [130,131]. New information on the electromagnetic form factors of the proton was recently obtained by the Jefferson Lab Hall A Collaboration [132], which measured the polarization of recoil protons in the scattering of polarized electrons on unpolarized protons. The axial form factor of the proton is rather poorly known and, as was pointed out in the previous section, the main problems in extracting information on the strange form factors from the existing neutrino (antineutrino)–proton data are connected with the uncertainty on the axial form factor. In Ref. [133] a method was proposed which allows one to obtain information on the strange form factors of the nucleon in a model-independent way. Let us consider the NC processes:

$$\nu_\mu + N \to \nu_\mu + N , \tag{246}$$

$$\bar\nu_\mu + N \to \bar\nu_\mu + N . \tag{247}$$

The difference of the cross sections of processes (246) and (247) is given by [see Eq. (219)]

$$\left(\frac{d\sigma}{dQ^2}\right)^{NC}_{\nu} - \left(\frac{d\sigma}{dQ^2}\right)^{NC}_{\bar\nu} = \frac{2G_F^2}{\pi}\, y\left(1 - \frac12 y\right) G_M^{NC} G_A^{NC} , \tag{248}$$

$y$ being defined as in Eq. (220). Hence, the above difference of the neutrino and antineutrino cross sections is determined only by the magnetic and axial NC form factors. The axial and magnetic NC form factors of the nucleon are given by Eqs. (106) and (108), respectively. The latter can be rewritten in the form

$$G_M^{NC;p(n)} = \pm\, G_M^3 - 2\sin^2\theta_W\, G_M^{p(n)} - \frac12\, G_M^s , \tag{249}$$

where $G_M^3 = \frac12 (G_M^p - G_M^n)$ is the isovector form factor of the nucleon.

Let us consider now the CC processes:

$$\nu_\mu + n \to \mu^- + p , \tag{250}$$

$$\bar\nu_\mu + p \to \mu^+ + n . \tag{251}$$

From Eq. (233) it follows that the difference of the cross sections of reactions (250) and (251) is given by

$$\left(\frac{d\sigma}{dQ^2}\right)^{CC}_{\nu} - \left(\frac{d\sigma}{dQ^2}\right)^{CC}_{\bar\nu} = \frac{4G_F^2}{\pi}\, y\left(1 - \frac12 y\right) G_M^3\, G_A . \tag{252}$$

Thus, this difference is determined by the isovector electromagnetic form factor and the CC axial form factor, which are the $u - d$ part of the magnetic and axial NC form factors of the nucleon. Let us now define the following neutrino–antineutrino asymmetry:

$$A(Q^2) = \frac{(d\sigma/dQ^2)^{NC}_{\nu} - (d\sigma/dQ^2)^{NC}_{\bar\nu}}{(d\sigma/dQ^2)^{CC}_{\nu} - (d\sigma/dQ^2)^{CC}_{\bar\nu}} . \tag{253}$$
From Eqs. (248) and (252), we have (see footnote 16)

$$A_{p(n)} = \frac14 \left(\pm 1 - \frac{G_A^s}{G_A}\right)\left(\pm 1 - 2\sin^2\theta_W\, \frac{G_M^{p(n)}}{G_M^3} - \frac12\, \frac{G_M^s}{G_M^3}\right) . \tag{254}$$

Thus, in the asymmetry $A$ the strange axial and vector form factors enter in the form of the ratios $G_A^s/G_A$ and $G_M^s/G_M^3$. Taking into account only terms which depend linearly on the strange form factors we can rewrite Eq. (254) in the following form:

$$A_{p(n)} = A^0_{p(n)} \mp \frac18\, \frac{G_M^s}{G_M^3} \mp \frac{G_A^s}{G_A}\, A^0_{p(n)} , \tag{255}$$

where

$$A^0_{p(n)} = \frac14 \left(1 \mp 2\sin^2\theta_W\, \frac{G_M^{p(n)}}{G_M^3}\right) \tag{256}$$
is the expected asymmetry in the case that all strange form factors are equal to zero. Thus, should it turn out that the measured asymmetry is different from $A^0$, it would be a model-independent proof that the strange form factors are not negligible with respect to the non-strange ones.

Usually, in neutrino experiments nuclear targets are used. Accordingly, we can average the asymmetry (253) over the protons and neutrons; then, we obtain (assuming, e.g., an isospin-symmetric nucleus) the following expression:

$$\langle A \rangle = A_0 + \frac12 \sin^2\theta_W\, \frac{G_M^0}{G_M^3}\, \frac{G_A^s}{G_A} . \tag{257}$$

Here, $A_0 = \frac14 (1 - 2\sin^2\theta_W)$ and $G_M^0 = (G_M^p + G_M^n)/2$ is the isoscalar magnetic form factor of the nucleon. In the expression for the averaged asymmetry $\langle A \rangle$ only the axial strange form factor enters: in fact, the interference between the (isoscalar) strange vector form factor and the isovector axial form factor vanishes after averaging over $p$ and $n$.

We notice that the electromagnetic form factors of the nucleon enter into the expression for the asymmetry $A_{p(n)}(Q^2)$ in the form of the ratio $G_M^{p(n)}/G_M^3$. It is well known that the electromagnetic form factors satisfy the approximate scaling relation

$$\frac{G_M^p(Q^2)}{G_M^n(Q^2)} = \frac{\mu_p}{\mu_n} , \tag{258}$$
where $\mu_p$ and $\mu_n$ are the total magnetic moments of proton and neutron. Using the values

$$\mu_p = 2.79 , \qquad \mu_n = -1.91 , \qquad \sin^2\theta_W = 0.23 ,$$

we obtain the following expressions for the asymmetries in the scaling approximation:

$$A_p = 0.12 - 0.12\, \frac{G_A^s}{G_A} - 0.13\, \frac{G_M^s}{G_M^3} ,$$

$$A_n = 0.16 + 0.16\, \frac{G_A^s}{G_A} + 0.13\, \frac{G_M^s}{G_M^3} ,$$

$$\langle A \rangle = 0.14 + 0.02\, \frac{G_A^s}{G_A} .$$

16 More precisely, $A|V_{ud}|^2$ enters in Eq. (254); however, we do not write the CKM matrix element explicitly since we neglect the small deviation of $|V_{ud}|$ from unity.

Thus, the asymmetries $A_p$ and $A_n$ are rather sensitive to both axial and magnetic strange form factors. The contribution of the axial strange form factor to the averaged asymmetry $\langle A \rangle$, instead, is suppressed due to the smallness of the isoscalar magnetic moment of the nucleon with respect to the isovector one.

The neutrino–antineutrino asymmetry depends on the ratio of the magnetic form factors of the proton and neutron. There exists rather detailed information on the magnetic form factor of the proton in a wide range of $Q^2$ up to 30 GeV$^2$ [134–139]. The experimental data show that the behavior of the form factor at high $Q^2$ cannot be described by the standard dipole formula. The magnetic form factor of the neutron is less well known than that of the proton. A large part of the information on the neutron form factors has been obtained from experiments on the measurement of quasi-elastic scattering of electrons on deuterium and other nuclei [140,141]. In order to extract the electromagnetic form factors of the neutron from these data it is necessary to take into account the neutron binding, final state interaction, contribution of meson exchange currents and other effects which rely on theoretical models.

In order to calculate the expected neutrino–antineutrino asymmetry with an error band connected with the uncertainties of the electromagnetic form factors, in Ref. [133] a fit of the data on the magnetic form factors of proton and neutron was done. The range $0.5\ \mathrm{GeV}^2 \le Q^2 \le 10\ \mathrm{GeV}^2$ was considered and the following two-pole formula was adopted for the magnetic form factor of the proton:

$$\frac{G_M^p}{\mu_p} = \frac{a_1}{1 + a_2 Q^2} + \frac{1 - a_1}{1 + a_3 Q^2} . \tag{259}$$

It generalizes the dipole formula and was previously proposed in Ref. [142]. For the magnetic form factor of the neutron the expression

$$\frac{G_M^n}{\mu_n} = \frac{G_M^p}{\mu_p}\, (1 + a_4 Q^2) \tag{260}$$

was taken, where the parameter $a_4$ takes into account the deviation from the scaling relation (258). From the fit of the data the following values for the parameters were found:

$$a_1 = -0.50 \pm 0.04 , \qquad a_2 = 0.71 \pm 0.02 , \qquad a_3 = 2.20 \pm 0.04 , \qquad a_4 = -0.019 \pm 0.004 . \tag{261}$$
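The fit (259)–(261) and the strangeness-free asymmetry (256) can be combined in a few lines to see how the deviation from scaling, controlled by $a_4$, makes $A^0_{p(n)}$ vary slowly with $Q^2$. This is a minimal sketch using only the central values of the parameters; the function names are illustrative, and it is assumed that $Q^2$ is given in GeV$^2$ with $a_2$, $a_3$, $a_4$ in GeV$^{-2}$.

```python
MU_P, MU_N, SIN2W = 2.79, -1.91, 0.23     # values used in the text
A1, A2, A3, A4 = -0.50, 0.71, 2.20, -0.019  # central values of Eq. (261)

def gm_proton(Q2):
    """Eq. (259): two-pole fit of the proton magnetic form factor."""
    return MU_P * (A1 / (1.0 + A2 * Q2) + (1.0 - A1) / (1.0 + A3 * Q2))

def gm_neutron(Q2):
    """Eq. (260): neutron magnetic form factor with a scaling-violation term."""
    return MU_N * (gm_proton(Q2) / MU_P) * (1.0 + A4 * Q2)

def asymmetry_no_strangeness(Q2, target="p"):
    """Eq. (256): A^0 for a proton ('p') or neutron ('n') target."""
    gm3 = 0.5 * (gm_proton(Q2) - gm_neutron(Q2))   # isovector form factor
    if target == "p":
        return 0.25 * (1.0 - 2.0 * SIN2W * gm_proton(Q2) / gm3)
    return 0.25 * (1.0 + 2.0 * SIN2W * gm_neutron(Q2) / gm3)

for q2 in (0.0, 0.5, 1.0, 2.0):
    print(q2, asymmetry_no_strangeness(q2, "p"), asymmetry_no_strangeness(q2, "n"))
```

At $Q^2 = 0$ the functions reduce to the scaling approximation, and the average of the proton and neutron values equals $\frac14(1 - 2\sin^2\theta_W)$ at every $Q^2$, as expected from Eq. (257) without strangeness.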
Fig. 12. Plot of $4|V_{ud}|^2 A_p$ as a function of $Q^2$. The shadowed area corresponds to the uncertainty induced by the errors in the magnetic form factors in the absence of strange contributions. All the curves were obtained using a dipole form for $F_A^s(Q^2)$ with $g_A^s = -0.15$. The dashed (dotted) curve was obtained with $G_M^s(Q^2) = 0$, utilizing the fit of Eqs. (259) and (260) (respectively, the WT2 fit [143]) for the magnetic form factors of the nucleon. The solid (dot–dashed) line was obtained using a dipole form for $G_M^s(Q^2)$ with $\mu_s = -0.3$ and utilizing the fit of Eqs. (259) and (260) (respectively, the WT2 fit) for the magnetic form factors of the nucleon (taken from Ref. [133]).
In the calculation of the expected asymmetry the parameterizations proposed in Refs. [130,143] were also used. The $Q^2$-behavior of the strange form factors is unknown. Different parameterizations of the strange form factors were thus considered in Ref. [133]. For the sake of illustration, Fig. 12 shows the result of the calculation of the asymmetry in elastic neutrino (antineutrino)–proton scattering. Here, it was assumed that the strange form factors are given by the dipole formulas

$$G_A^s(Q^2) = \frac{g_A^s}{(1 + Q^2/M_A^{s\,2})^2} , \qquad G_M^s(Q^2) = \frac{\mu_s}{(1 + Q^2/M_V^{s\,2})^2} , \tag{262}$$

in which the following values were used for $M_A^s$ and $M_V^s$:

$$M_A^s = M_A = 1.032\ \mathrm{GeV} , \qquad M_V^s = M_V = 0.84\ \mathrm{GeV} . \tag{263}$$

The shadowed area was obtained under the assumption that $g_A^s = \mu_s = 0$: it corresponds to the error band associated (at a confidence level of 90%) with the uncertainties (261) in the parameterization of the magnetic proton and neutron form factors. The dashed and dotted curves display the effect of the axial strange form factor (with $g_A^s = -0.15$) and correspond to different parameterizations ([133] and [143], respectively) of the electromagnetic form factors. These curves were obtained under the assumption that $\mu_s = 0$. The solid and dot–dashed curves demonstrate the effect of both strange axial and vector form factors. They were obtained with $g_A^s = -0.15$ and $\mu_s = -0.3$. It is worth noticing that the slow variation of the asymmetry with $Q^2$ is due to the deviation of the magnetic form factors from the scaling relation (258).
Thus, the measurement of the neutrino–antineutrino asymmetry could allow one to resolve the contribution of the strange form factors. Let us notice that the combined effect of the axial and vector strange form factors depends on the relative sign of $g_A^s$ and $\mu_s$. If the signs are the same (as is assumed in Fig. 12), the contributions of both form factors to the asymmetry add up. Should the signs be opposite (which could be the case if $\mu_s > 0$ [91]), the contribution of the vector form factor to the asymmetry would tend to compensate the effect of the axial one. We have assumed in Eq. (262) that strange and non-strange form factors have the same $Q^2$ behavior. According to the asymptotic quark counting rule [144,145], it would be natural to expect that the strange form factors decrease with $Q^2$ more rapidly than the non-strange ones. In this case the contribution of the strange form factors to the asymmetry will disappear at high $Q^2$. Calculations which consider this situation were made in Ref. [133] and show that the region $Q^2 \simeq 1$–2 GeV$^2$ is probably the optimal one to look for the effects of strangeness in neutrino–nucleon scattering.
11. The elastic scattering of neutrinos (antineutrinos) on nuclei with S = 0 and T = 0

In this section, we will consider the processes of elastic NC scattering of neutrinos (antineutrinos) on nuclei with $S = 0$, $T = 0$ [104,106,146,147]:

$$\nu(\bar\nu) + A \to \nu(\bar\nu) + A . \tag{264}$$

The matrix element of the process for the scattering of neutrinos (antineutrinos) is given by the expression

$$\langle f|S|i\rangle = \mp \frac{G_F}{\sqrt{2}}\, \bar u(k')\gamma^\alpha(1 \mp \gamma_5)u(k)\, \langle p'|J_\alpha^{NC}|p\rangle\, (2\pi)^4 \delta^{(4)}(p' - p - q) . \tag{265}$$

Here, $k$ and $k'$ are the momenta of the initial and final neutrino (antineutrino), $q = k - k'$, and $p$ and $p'$ are the momenta of the initial and final nucleus. The cross sections of processes (264) are given by the general expression (213). The axial current, $A_\alpha^{NC}$, and the isovector part of the vector NC, $V_\alpha^3 (1 - 2\sin^2\theta_W)$, do not contribute to the matrix element of processes (264). For the isoscalar part of the vector NC we have, in general,

$$\langle p'|V_\alpha^{NC}|p\rangle = n_\alpha F^{NC}(Q^2) + q_\alpha G^{NC}(Q^2) . \tag{266}$$

Here, $n = p + p'$, $q = p' - p$ and $Q^2 = 2M_A T$ ($T$ is the kinetic energy of the final nucleus). The form factor $G^{NC}(Q^2)$ is equal to zero due to T-invariance of the strong interactions. The remaining form factor $F^{NC}(Q^2)$ can be written in the form

$$F^{NC}(Q^2) = -2\sin^2\theta_W\, F(Q^2) - \frac12\, F^s(Q^2) , \tag{267}$$
where F(Q2 ) is the isoscalar electromagnetic form factor and F s (Q2 ) the strange form factor of the nucleus. From Eqs. (266) and (214), we have
Q2 1 NC NC 2 2 W% = n n% [F (Q )] 8 0 − : (268) 4M 2 2MA2 Then, from Eqs. (213) and (268) it follows that the cross sections of the scattering of neutrinos and antineutrinos on a nucleus with S = 0, T = 0 are equal to each other and are given by the expression
$$\frac{d\sigma_\nu}{dQ^2} = \frac{d\sigma_{\bar\nu}}{dQ^2} = \frac{G_F^2}{2\pi}\left(1 - \frac{p\cdot q}{M_A E} - \frac{Q^2}{4E^2}\right)[F^{NC}(Q^2)]^2, \tag{269}$$

where $E$ is the neutrino energy in the laboratory system. Thus, the strange form factor of the nucleus can be determined from the measurement of the cross section of process (264) if the electromagnetic form factor of the nucleus, $F(Q^2)$, is known. The latter can be determined from the measurement of the cross section for the elastic scattering of unpolarized electrons on the nucleus, which is given by the expression
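$$\frac{d\sigma_e}{dQ^2} = \frac{4\pi\alpha^2}{Q^4}\left(1 - \frac{p\cdot q}{M_A E} - \frac{Q^2}{4E^2}\right)[F(Q^2)]^2. \tag{270}$$

From Eqs. (269) and (270) we can find the relation connecting the strange form factor of the nucleus with the corresponding (measurable) cross sections:

$$F^s(Q^2) = \pm 2F(Q^2)\left[\frac{2\sqrt{2}\,\pi\alpha}{G_F Q^2}\sqrt{\frac{d\sigma_\nu/dQ^2}{d\sigma_e/dQ^2}} \mp 2\sin^2\theta_W\right] \tag{271}$$

or, equivalently, by extracting the elastic form factor from (270):

$$F^s(Q^2) = \pm\frac{2}{\sqrt{1 - p\cdot q/M_A E - Q^2/4E^2}}\left[\sqrt{\frac{2\pi}{G_F^2}\frac{d\sigma_\nu}{dQ^2}} \mp 2\sin^2\theta_W\sqrt{\frac{Q^4}{4\pi\alpha^2}\frac{d\sigma_e}{dQ^2}}\,\right]. \tag{272}$$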
In Eqs. (271) and (272) the upper (lower) signs refer to a positive (negative) value of the quantity $4\sin^2\theta_W F + F^s$. We remind the reader that the P-odd asymmetry in the scattering of polarized electrons on nuclei with $S = 0$ and $T = 0$ is determined by the ratio $F^{NC}/F$ [see Eq. (176)]. On the other hand, the ratio of cross sections (269) and (270) is determined by the ratio $(F^{NC}/F)^2$. Hence, by comparing (178) with (271), we find the following general relation between quantities which are measurable in the scattering of neutrinos and electrons on nuclei with $S = 0$ and $T = 0$:

$$A(Q^2) = \pm\sqrt{\frac{d\sigma_\nu/dQ^2}{d\sigma_e/dQ^2}}, \tag{273}$$

where the plus (minus) sign has the same correspondence as in Eqs. (271) and (272).
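The extraction procedure of Eqs. (269)–(271) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it builds the two cross sections from assumed form-factor values and then recovers $F^s$ through Eq. (271). The function names, the value $\sin^2\theta_W = 0.2312$ and the sample inputs ($^4$He mass, $E$, $Q^2$, $F$, $F^s$) are illustrative assumptions; the elastic relation $p\cdot q = Q^2/2$ is used in the kinematic factor.

```python
import math

G_F = 1.16638e-5        # Fermi constant in GeV^-2
ALPHA = 1.0 / 137.036   # fine-structure constant
SIN2_THETA_W = 0.2312   # assumed value of sin^2(theta_W)


def kinematic_factor(Q2, E, M_A):
    """1 - p.q/(M_A E) - Q^2/(4 E^2), using p.q = Q^2/2 (elastic case)."""
    return 1.0 - Q2 / (2.0 * M_A * E) - Q2 / (4.0 * E * E)


def dsigma_nu(Q2, E, M_A, F, F_s):
    """Neutrino cross section of Eq. (269), with F^NC taken from Eq. (267)."""
    F_nc = -2.0 * SIN2_THETA_W * F - 0.5 * F_s
    return G_F**2 / (2.0 * math.pi) * kinematic_factor(Q2, E, M_A) * F_nc**2


def dsigma_e(Q2, E, M_A, F):
    """Electron cross section of Eq. (270)."""
    return 4.0 * math.pi * ALPHA**2 / Q2**2 * kinematic_factor(Q2, E, M_A) * F**2


def extract_F_s(Q2, ds_nu, ds_e, F, upper_sign=True):
    """Eq. (271). upper_sign=True assumes 4 sin^2(theta_W) F + F^s > 0."""
    s = 1.0 if upper_sign else -1.0
    root = (2.0 * math.sqrt(2.0) * math.pi * ALPHA / (G_F * Q2)
            * math.sqrt(ds_nu / ds_e))
    return s * 2.0 * F * (root - s * 2.0 * SIN2_THETA_W)


# Hypothetical inputs: 4He target, E = 0.2 GeV, Q^2 = 0.01 GeV^2.
M_HE4, E_NU, Q2 = 3.727, 0.2, 0.01
F_true, Fs_true = 0.8, 0.05
ds_n = dsigma_nu(Q2, E_NU, M_HE4, F_true, Fs_true)
ds_el = dsigma_e(Q2, E_NU, M_HE4, F_true)
print(extract_F_s(Q2, ds_n, ds_el, F_true))
```

The sign ambiguity of Eq. (271) is exposed through the `upper_sign` flag, since the two cross sections alone determine only $|F^{NC}/F|$.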
Let us stress that relation (273) holds for any reactions of type (165) and (264) in which the initial and final nuclei have $S = 0$, $T = 0$. It can be violated only if the neutral current contains scalar and/or tensor terms. The observation of neutrino scattering on nuclei requires the measurement of the small recoil energy of the final nucleus. It could be easier to detect the scattering of neutrinos and electrons on nuclei if the nucleus undergoes a transition to an excited state. Let us consider, for example, the processes [106]

$$\nu + {}^4\mathrm{He} \to \nu + {}^4\mathrm{He}^*, \qquad e + {}^4\mathrm{He} \to e + {}^4\mathrm{He}^*,$$

where ${}^4\mathrm{He}^*$ is the excited state of ${}^4\mathrm{He}$ with $S = 0$ and $T = 0$ and an excitation energy of 20.1 MeV. This state can decay into $p$ and radioactive ${}^3\mathrm{H}$ [148]. The matrix element of the neutral current for the above processes is given by

$$\langle p'|J_\alpha^{NC}|p\rangle = \langle p'|V_\alpha^{NC}|p\rangle = 2\left(p_\alpha - \frac{p\cdot q}{q^2}\,q_\alpha\right)F_{in}^{NC}(Q^2) + q_\alpha\,G_{in}^{NC}(Q^2). \tag{274}$$

It is obvious that the contribution of the form factor $G_{in}^{NC}$ can be neglected. For the form factor $F_{in}^{NC}$ we have

$$F_{in}^{NC}(Q^2) = -2\sin^2\theta_W\,F_{in}(Q^2) - \tfrac{1}{2}F_{in}^s(Q^2), \tag{275}$$
where $F_{in}(Q^2)$ and $F_{in}^s(Q^2)$ are the inelastic electromagnetic and strange form factors. All the relations that were obtained for the elastic case are also valid for the inelastic processes, provided that we replace everywhere the elastic form factors by the inelastic ones and use for $p\cdot q$ [e.g. in Eqs. (269) and (270)] the relation

$$p\cdot q = \tfrac{1}{2}Q^2 + \tfrac{1}{2}\left(M_{A^*}^2 - M_A^2\right), \tag{276}$$
$M_{A^*}$ being the mass of the excited state. Other $S = 0$, $T = 0$ nuclei, like ${}^{12}\mathrm{C}$ and ${}^{16}\mathrm{O}$, could offer similar (perhaps better) possibilities of detecting $\nu$-induced transitions to $S = 0$, $T = 0$ excited states. Admittedly, the detection of the decay products of these excited states could be fairly difficult [149,150]. Moreover, according to theoretical calculations [151], the strength of these isoscalar transitions is generally much weaker than that of the corresponding isovector ones. Yet, the measurement of cross sections (269) and (270) (or the analogous ones for inelastic processes) would allow a model-independent determination of the vector strange form factor of the nucleus.

12. Neutrino (antineutrino)–nucleus inelastic scattering

As we have seen in considering the future Los Alamos experiment and the BNL measurement of the axial strange form factor (see Section 9), one can consider neutrino scattering on both free nucleons and nucleons bound inside complex nuclei. The latter are convenient targets, as many neutrino detectors contain nuclei as well as free protons. In fact, even the Brookhaven “free” scattering data were mostly obtained from the scattering of $\nu$, $\bar\nu$ on carbon, corrected for the Fermi motion and other nuclear effects.
In considering bound nucleons, especially at low energies such as those available at Los Alamos, it is important to be able to describe the effects of the nuclear many-body structure and to evaluate the impact they can have on the measured quantities, in order to interpret the experimental results correctly in terms of single-nucleon properties. Results from quasi-elastic electron scattering, which has been widely studied, can provide guidance for selecting reliable nuclear models. However, NC neutrino processes involve additional complications with respect to inclusive electron scattering. In fact, as the outgoing neutrino cannot be measured experimentally, a hadronic signature that the reaction has taken place has to be detected. The corresponding cross sections are therefore exclusive (or semi-inclusive) in the hadronic sector, similar to a coincidence electron scattering process, but inclusive in the leptonic sector, so that a detailed balance of the energy and momentum transfer is no longer possible. In this section, we summarize the studies which have been proposed in order to evaluate the impact of nuclear uncertainties on the extraction of the strange axial form factor of the nucleon from neutrino–nucleus scattering, in particular from the measurement of the ratio which is under study at Los Alamos. A general review of neutrino–nucleus scattering processes can be found in [104]. We will consider the following processes:

$$\nu_\mu(\bar\nu_\mu) + A \to \nu_\mu(\bar\nu_\mu) + N + (A - 1), \tag{277}$$
in which a neutrino (antineutrino) of four-momentum $k$ interacts with a nucleus $A$ in its ground state and a nucleon is emitted, and detected, in the final state, with four-momentum $p_N = (E_N, \vec p_N)$, while both the state of the daughter nucleus and the outgoing neutrino remain undetected. Most of the models used in the literature describe the above quasi-free processes within the impulse approximation (IA), where the neutrino is assumed to interact with only one nucleon, which is then emitted, the remaining nucleons being spectators. After the interaction with the neutrino, the struck nucleon can be considered as free (plane wave impulse approximation, PWIA), or the residual interaction with the recoiling system can be taken into account, either directly or indirectly, through distorted wave functions. The different calculations proposed in the literature then differ in the models employed to describe the bound nucleus as well as the emitted nucleons. A schematic representation of the neutrino–nucleus scattering amplitude and of its description within the PWIA is given in Figs. 13 and 14, respectively. A model which has been employed very often in the literature is the relativistic Fermi gas (RFG), in which the bound and the outgoing nucleon are described by plane waves. Although the Fermi gas is not completely realistic, owing to its simplicity it is a useful guide for understanding the behavior of the different cross sections. Therefore, before illustrating various results, we briefly present the relevant formalism.$^{17}$

17 The same model was considered in Section 7 for the description of the P-odd asymmetry in quasi-elastic electron–nucleus scattering. Here, however, the kinematic situation is different, since the final neutrino is not detected, while the nucleon ejected from the nucleus is the only observable final particle in the process.
Fig. 13. Schematic representation for the amplitude, in Born approximation, of the neutrino–nucleus scattering.
Fig. 14. Representation of the $\nu$–nucleus scattering in the plane wave impulse approximation.
The RFG cross section is obtained by averaging the cross section for scattering on a free, moving nucleon of four-momentum $p$ over the nucleon momentum distribution. Its general expression can thus be written as

$$\left(\frac{d^2\sigma}{dE_N\,d\Omega_N}\right)_{\nu(\bar\nu)} = \frac{G_F^2}{(2\pi)^2}\,\frac{3\mathcal{N}}{4\pi p_F^3}\,\frac{M^2|\vec p_N|}{k_0}\int\frac{d^3k'}{k_0'}\,\frac{d^3p}{p_0}\;\delta^{(3)}(\vec k - \vec k' + \vec p - \vec p_N)\,\delta(k_0 - k_0' + p_0 - E_N)\;\theta(p_F - |\vec p|)\,\theta(|\vec p_N| - p_F)\,\left(L^{\alpha\beta} \mp L_5^{\alpha\beta}\right)\left(W^{NC}_{\alpha\beta}\right)_{s.n.}, \tag{278}$$
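where the integration over the momentum of the outgoing neutrino, $\vec k'$, has been included explicitly. Here, $(W^{NC}_{\alpha\beta})_{s.n.}$ is the single-nucleon NC hadronic tensor, which is given by the following expression:

$$\left(W^{NC}_{\alpha\beta}\right)_{s.n.} = -\left[\tau(G_M^{NC})^2 + (1+\tau)(G_A^{NC})^2\right]\left(g_{\alpha\beta} - \frac{q_\alpha q_\beta}{q^2}\right) + \left[\frac{(G_E^{NC})^2 + \tau(G_M^{NC})^2}{1+\tau} + (G_A^{NC})^2\right]\frac{X_\alpha X_\beta}{M^2} - \frac{q_\alpha q_\beta}{q^2}\left[G_A^{NC}\right]^2 + \frac{i}{M^2}\,\epsilon_{\alpha\beta\mu\nu}\,p^\mu q^\nu\,G_A^{NC}G_M^{NC} \tag{279}$$

with$^{18}$

$$X_\alpha = p_\alpha - \frac{(p\cdot q)\,q_\alpha}{q^2}.$$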
In Eq. (278) $L^{\alpha\beta}$ and $L_5^{\alpha\beta}$ are the leptonic tensor and pseudotensor, given in Eqs. (118) and (119), respectively; $p_F$ is the Fermi momentum of the nucleus under consideration; and $(3\mathcal{N}/4\pi p_F^3)\,\theta(p_F - |\vec p|)$ is the momentum distribution of the nucleons in the RFG, $\mathcal{N} = Z, N$ being the number of protons or neutrons. We notice that the NC form factors which are contained in the hadronic tensor $(W^{NC}_{\alpha\beta})_{s.n.}$ are different for protons and neutrons: hence, the proton form factors have to be used in Eq. (279) when the quasi-elastic emission of a proton is considered and, conversely, the neutron form factors have to be used when the emitted particle is a neutron. The function $\theta(|\vec p_N| - p_F)$ takes into account the effects of Pauli blocking on the ejected nucleon. The effects of an average binding energy of the bound nucleons can be taken into account by subtracting a (constant) term from the initial nucleon energy in the argument of the energy-conserving delta function, $p_0 \to p_0 - B$. Writing explicitly the contraction between the leptonic and the hadronic tensors, the cross section becomes
$$\left(\frac{d^2\sigma}{dE_N\,d\Omega_N}\right)_{\nu(\bar\nu)} = \frac{G_F^2}{(2\pi)^2}\,\frac{3\mathcal{N}}{4\pi p_F^3}\,\frac{|\vec p_N|}{k_0}\int\frac{d^3k'}{k_0'}\,\frac{d^3p}{p_0}\;\delta(k_0 - k_0' + p_0 - E_N)\,\delta^{(3)}(\vec k - \vec k' + \vec p - \vec p_N)\;\theta(p_F - |\vec p|)\,\theta(|\vec p_N| - p_F)\left\{V_M(G_M^{NC})^2 + V_{EM}\,\frac{(G_E^{NC})^2 + \tau(G_M^{NC})^2}{1+\tau} + V_A(G_A^{NC})^2 \pm V_{AM}\,G_A^{NC}G_M^{NC}\right\}, \tag{280}$$

18 We notice that expression (279) is analogous to the sum of the ones contained in Eqs. (216) and (217), but without the energy-conserving delta functions (here explicitly included in the integral) and with $X_\alpha$ instead of $n_\alpha = p_\alpha + p_{N\alpha} \equiv 2p_\alpha + q_\alpha$. The two notations are equivalent for the scattering of massless leptons, since the contraction of $L^{\alpha\beta}$ with terms proportional to $q_\alpha$ ($q_\beta$) vanishes.
where the form factors have been grouped as in Eq. (219) and

$$\begin{aligned}
V_M &= 2M^2\tau\,(k\cdot k'),\\
V_{EM} &= 2(k\cdot p)(k'\cdot p) - M^2(k\cdot k'),\\
V_A &= M^2(k\cdot k') + 2M^2\tau\,(k\cdot k') + 2(k\cdot p)(k'\cdot p),\\
V_{AM} &= 2(k\cdot k')(k\cdot p + k'\cdot p).
\end{aligned} \tag{281}$$
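As a numerical cross-check of the kinematic factors in Eq. (281), the sketch below evaluates them for a massless lepton scattering elastically off a free nucleon at rest (an illustrative kinematics chosen here, not the RFG average). It compares $V_{EM}$ and $V_{AM}$ with their expressions in terms of $z = k\cdot p/M^2$ and $\tau = k\cdot k'/2M^2$, and verifies that $V_{EM}$ equals the contraction of $X_\alpha X_\beta$ with a symmetric leptonic tensor. The normalization $L^{\alpha\beta} = k^\alpha k'^\beta + k'^\alpha k^\beta - g^{\alpha\beta}\,k\cdot k'$ is an assumption, since Eq. (118) lies outside this section; all function names are hypothetical.

```python
import math


def mdot(a, b):
    """Minkowski product a.b with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def kinematic_factors(k, kp, p, M):
    """V_M, V_EM, V_A, V_AM of Eq. (281) for four-vectors k, k', p."""
    kk = mdot(k, kp)
    kp_prod = mdot(k, p) * mdot(kp, p)
    tau = kk / (2.0 * M * M)
    V_M = 2.0 * M * M * tau * kk
    V_EM = 2.0 * kp_prod - M * M * kk
    V_A = M * M * kk + 2.0 * M * M * tau * kk + 2.0 * kp_prod
    V_AM = 2.0 * kk * (mdot(k, p) + mdot(kp, p))
    return V_M, V_EM, V_A, V_AM


def free_nucleon_kinematics(E, theta, M):
    """Massless lepton of energy E scattered elastically through the angle
    theta by a nucleon at rest; returns the four-vectors k, k', p."""
    Ep = E / (1.0 + (E / M) * (1.0 - math.cos(theta)))
    k = (E, 0.0, 0.0, E)
    kp = (Ep, Ep * math.sin(theta), 0.0, Ep * math.cos(theta))
    p = (M, 0.0, 0.0, 0.0)
    return k, kp, p


def lepton_contraction_XX(k, kp, p):
    """L^{ab} X_a X_b for the assumed symmetric tensor
    L^{ab} = k^a k'^b + k'^a k^b - g^{ab} k.k', with X = p - (p.q/q^2) q."""
    q = tuple(a - b for a, b in zip(k, kp))
    c = mdot(p, q) / mdot(q, q)
    X = tuple(pa - c * qa for pa, qa in zip(p, q))
    return 2.0 * mdot(k, X) * mdot(kp, X) - mdot(X, X) * mdot(k, kp)


M = 0.939  # nucleon mass in GeV
k, kp, p = free_nucleon_kinematics(E=1.0, theta=0.7, M=M)
V_M, V_EM, V_A, V_AM = kinematic_factors(k, kp, p, M)
z = mdot(k, p) / M**2
tau = mdot(k, kp) / (2.0 * M**2)

# Each printed pair should agree to rounding accuracy.
print(V_EM, 2.0 * M**4 * ((z - tau) ** 2 - tau * (tau + 1.0)))
print(V_AM, 8.0 * M**4 * tau * (z - tau))
print(lepton_contraction_XX(k, kp, p), V_EM)
```

The combination $(z-\tau)^2 - \tau(\tau+1)$ recovered here is the same bracket that multiplies the electric and magnetic terms in the approximate expressions (287) and (288).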
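The interesting quantities, related to the determination of the strange form factors of the nucleon, are then the single differential cross section

$$\left(\frac{d\sigma}{dT_N}\right)_{\nu(\bar\nu)N} \equiv \left(\frac{d\sigma}{dE_N}\right)_{\nu(\bar\nu)N} = \int d\Omega_N\left(\frac{d^2\sigma}{dE_N\,d\Omega_N}\right)_{\nu(\bar\nu)N}, \tag{282}$$

where $T_N$ is the outgoing nucleon kinetic energy, the total (integrated over the nucleon energy) cross section

$$\sigma_{\nu(\bar\nu)N} = \int dT_N\left(\frac{d\sigma}{dT_N}\right)_{\nu(\bar\nu)N} \tag{283}$$

and the corresponding “proton over neutron” ratios

$$R_{\nu(\bar\nu)} = \frac{(d\sigma/dT_N)_{\nu(\bar\nu)p}}{(d\sigma/dT_N)_{\nu(\bar\nu)n}}, \tag{284}$$

$$R_{\nu(\bar\nu)} = \frac{\sigma_{\nu(\bar\nu)p}}{\sigma_{\nu(\bar\nu)n}}. \tag{285}$$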
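Given tabulated differential cross sections, the quantities (283)–(285) reduce to a quadrature and a division. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the integration is a simple trapezoidal rule with a lower cut in $T_N$, and the linear “spectra” are toy inputs invented for the example, not results of any nuclear model. Separate thresholds for protons and neutrons are allowed, so that different conventions for the kinetic-energy cut can be compared.

```python
def integrate_above(T, dsdT, T_min):
    """Trapezoidal estimate of Eq. (283): integral of a tabulated
    dsigma/dT_N over T_N >= T_min (T must be increasing)."""
    total = 0.0
    for i in range(len(T) - 1):
        a, b = T[i], T[i + 1]
        fa, fb = dsdT[i], dsdT[i + 1]
        if b <= T_min:
            continue  # interval lies entirely below the cut
        if a < T_min:
            # clip the interval at T_min using linear interpolation
            fa = fa + (fb - fa) * (T_min - a) / (b - a)
            a = T_min
        total += 0.5 * (fa + fb) * (b - a)
    return total


def differential_ratio(dsdT_p, dsdT_n):
    """Point-by-point proton/neutron ratio, Eq. (284)."""
    return [sp / sn for sp, sn in zip(dsdT_p, dsdT_n)]


def integrated_ratio(T, dsdT_p, dsdT_n, Tp_min, Tn_min):
    """Ratio of integrated cross sections, Eq. (285)."""
    return integrate_above(T, dsdT_p, Tp_min) / integrate_above(T, dsdT_n, Tn_min)


# Toy inputs (arbitrary units): linear spectra on a 10 MeV grid.
T_grid = [10.0 * i for i in range(21)]      # 0 ... 200 MeV
ds_p = [2.0 * (250.0 - t) for t in T_grid]  # hypothetical proton spectrum
ds_n = [250.0 - t for t in T_grid]          # hypothetical neutron spectrum

print(integrated_ratio(T_grid, ds_p, ds_n, 60.0, 60.0))   # common threshold
print(integrated_ratio(T_grid, ds_p, ds_n, 60.0, 57.23))  # shifted neutron threshold
```

Even for these toy spectra, lowering the neutron threshold by a few MeV changes the integrated ratio at the level of a few percent, so the convention adopted for the cut has to be stated when results are compared.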
An approximate expression for the RFG cross section can be obtained by inserting the explicit forms [Eqs. (106)–(108)] of the nucleonic NC form factors into Eq. (280) and neglecting both the terms proportional to $(1 - 4\sin^2\theta_W) \simeq 0.075$ and the terms quadratic in the strange form factors. Under these approximations, and using $\delta^{(3)}(\vec k - \vec k' + \vec p - \vec p_N)$ to integrate over $\vec k'$, one obtains

$$\left(\frac{d^2\sigma}{dE_N\,d\Omega_N}\right)^{NC}_{\nu p(n)\,[\bar\nu p(n)]} = \frac{3Z(N)}{4\pi p_F^3}\,\frac{G_F^2}{(2\pi)^2}\,\frac{|\vec p_N|\,M^4}{k_0}\,\theta(|\vec p_N| - p_F)\int\frac{d^3p}{k_0'\,p_0}\;\theta(p_F - |\vec p|)\,\delta(k_0 - k_0' + p_0 - E_N)\,I_{p(n)}(k, p, Q^2), \tag{286}$$
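where (the plus/minus signs refer to the $\nu$ and $\bar\nu$ cases, respectively)

$$I_{p(n)}(k, p, Q^2) = M_{p(n)} + 2\tau^2\,G_M^{n(p)}G_M^s + \left[(z-\tau)^2 - \tau(\tau+1)\right]\frac{G_E^{n(p)}G_E^s + \tau\,G_M^{n(p)}G_M^s}{1+\tau} - \delta_{p(n)}\left\{\left[(z-\tau)^2 + 2\tau(\tau+1)\right]G_A G_A^s \pm 2\tau(z-\tau)\left(G_M^{n(p)}G_A + G_A G_M^s - G_M^{n(p)}G_A^s\right)\right\}, \tag{287}$$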
with $\delta_p = 1$, $\delta_n = -1$, $z = k\cdot p/M^2$ and, for scattering of massless leptons, $\tau = Q^2/4M^2 = k\cdot k'/2M^2 = (k\cdot p - k\cdot p_N)/2M^2$. The terms $M_{p(n)}$ contain only the electromagnetic and axial form factors and are given by

$$M_{p(n)} = \tau^2\left[(G_M^{n(p)})^2 + G_A^2\right] + \left[(z-\tau)^2 - \tau(\tau+1)\right]\frac{1}{2}\left[\frac{(G_E^{n(p)})^2 + \tau\,(G_M^{n(p)})^2}{1+\tau} + G_A^2\right]. \tag{288}$$

The above equations, although approximate, can be useful for understanding the interplay between strange and non-strange form factors in the cross sections (282), (283) and thus in ratios (284), (285). We notice, however, that all the results presented in this section were obtained without introducing any approximation.

The RFG was first applied to the study of neutrino–nucleon scattering in connection with the problem of nucleon strangeness by Horowitz et al. [152], and was later employed by other authors [153,154,160]. The nucleus considered was ${}^{12}\mathrm{C}$, for which the values $p_F = 225$ MeV and $B = 25$–$27$ MeV are typically used. A detailed analysis of the uncertainty induced by nuclear structure effects on the determination of the axial strange form factor in $\nu$–nucleus quasi-elastic scattering was proposed by Barbaro et al. [153], both for the Los Alamos and the Brookhaven kinematical conditions. In this paper, the authors consider processes (277) on ${}^{12}\mathrm{C}$, using two different nuclear models: the RFG and a hybrid model (HM), in which the bound nucleons are described by harmonic oscillator shell model wave functions, while the outgoing nucleon is described by a plane wave. These choices are considered as two “extremes” among the available realistic nuclear models: in fact the RFG, using plane waves for the bound nucleons, can be seen as a “maximally unconfined” model, while the HM, whose bound single-nucleon wave functions decrease more rapidly than the expected exponential behavior, is somewhat “over-confined”.
Moreover, while the RFG involves on-mass-shell single-nucleon amplitudes, the HM allows one to consider (half-)off-shell nucleonic currents, thus providing an estimate of off-shellness effects. Therefore, the difference between the effects of the strange form factors on the considered quantities calculated with these two models can provide an upper bound to the uncertainty induced by nuclear structure effects. These two models were used to calculate both the differential and total cross sections [Eqs. (282) and (283)], for the neutrino-induced emission of a proton and of a neutron, for fixed neutrino energy. In this calculation the axial and axial strange form factors, $G_A$ and $G_A^s$, were assumed to have the same dipole $Q^2$-dependence, while strangeness contributions to the nucleonic vector current were not considered. It is important to notice that, contrary to what happens at the relatively high energies of BNL, under the Los Alamos kinematics the correlation between the value of the axial dipole cut-off mass and the strange axial constant $g_A^s$ is
negligible, as shown in Refs. [82,152], and thus at low energies the dipole parameterization can be used without introducing significant uncertainties.$^{19}$ The results of Ref. [153] for the kinematic conditions of Los Alamos ($E_\nu = 200$ MeV) are shown in the upper and central panels of Figs. 15 and 16. From the comparison of their results on the separate cross sections at $E_\nu = 200$ MeV, Barbaro and collaborators derive an uncertainty on $g_A^s$ due to nuclear model dependencies of $\delta_{nucl}(g_A^s) = \pm 0.25$, which is larger than the expected value of $g_A^s$ itself. However, when the ratio of either differential or total cross sections is considered, this uncertainty is reduced by an order of magnitude, down to $\delta_{nucl}(g_A^s) = \pm 0.015$. Including, in a worst-case scenario, additional uncertainties due to off-shellness effects, a final estimate $\delta_{nucl}(g_A^s) = \pm 0.03$ is provided. This estimate is much smaller than the uncertainties on measurements of $g_A^s$ in deep inelastic scattering processes, and therefore these authors conclude that the proposed Los Alamos measurement is a very good method to determine $g_A^s$. The same analysis was applied in [153] to the single cross section for the neutrino-induced quasi-elastic emission of a proton at $E_\nu = 1.3$ GeV, the average energy of the BNL neutrino beam, as shown in the lower panels of Figs. 15 and 16. In this case nuclear structure effects are much smaller and the estimated uncertainty on the strange axial constant drops to $\delta_{nucl}(g_A^s) = \pm 0.015$. Both low and intermediate energy results are in agreement with the previous analysis proposed by Horowitz et al. [152] within the RFG, with and without binding energy effects. The analysis of Barbaro et al. was carried out within the PWIA and thus did not include the effects of final state interaction (FSI).
The authors, however, concluded that the estimated value of $\delta_{nucl}(g_A^s)$ for the $p/n$ ratio can be considered as an upper bound also for the uncertainty associated with the use of more realistic nuclear models, which include FSI effects. Nonetheless, a detailed study of these effects has to be done in order to be able to extract strangeness parameters from the proposed quantities. As already mentioned in Section 9, FSI effects on ratios (284) and (285) under the kinematical conditions typical of the Los Alamos facility were studied by Garvey et al. [127], within the random phase approximation. In their calculations the ground state of carbon was described as a Slater determinant of Woods–Saxon (WS) wave functions, the parameters of the WS potential being chosen in order to reproduce the ground state properties of ${}^{12}\mathrm{C}$. FSI was included by means of a finite-range G-matrix interaction, derived from the Bonn NN potential. It was found that FSI can sizeably affect the separate cross sections, resulting in an increase of the latter of up to 40%, but these effects largely cancel in the ratio, where they amount to $< 10\%$ and do not interfere with the effects of $g_A^s$. Similar results were obtained in a later calculation by Alberico et al. [154], who evaluated the ratio of differential cross sections, Eq. (284), within both the RFG and a relativistic shell model (RSM). The latter also included final state interaction effects through a relativistic optical potential (ROP). Again, FSI effects were found to be large on the separate cross sections, resulting in a decrease of the latter of up to 40%, mainly due to the absorption into different channels described by the imaginary part of the optical potential. However, these effects on the $p/n$ ratio were reduced to less than 10% and were found to be due mainly to the Coulomb repulsion, which is present for the protons only. In Ref. [154], FSI effects on the $p/n$ ratio at Los Alamos kinematics were shown to be comparable with the effects

19 Since very little is known about the $Q^2$-dependence of strange form factors, in the literature about neutrino scattering they are often assumed to have the same $Q^2$-dependence as the corresponding non-strange ones. Only in Ref. [128] were strange form factors obtained within an SU(3) Skyrme model used to predict the effects of strangeness on the Los Alamos ratio $\bar R_{\mathrm{LAMPF}}$ [see Eq. (245)].

Fig. 15. Differential cross sections $(d\sigma/dE_N)_{\nu N}$ for the emission of a proton and of a neutron [panels (a) and (b), respectively] under the Los Alamos kinematic conditions, and for the emission of a proton at the average energy $E_\nu = 1.3$ GeV of Brookhaven (panel c). The solid and dot–dashed lines correspond to the hybrid model mentioned in the text, without strangeness and with the strange axial constant set to $g_A^s = -0.2$, respectively. The dashed and dotted lines are obtained within the RFG, again with $g_A^s = 0$ and $g_A^s = -0.2$, respectively (taken from Ref. [153]).

Fig. 16. Total cross sections $\sigma_{\nu N}$ integrated over the nucleon kinetic energy ($T_N > 60$ MeV for low-energy neutrinos and $T_N > 200$ MeV for higher-energy neutrinos) as a function of the strange axial constant $-g_A^s$, for the emission of a proton (a) and of a neutron (b) at Los Alamos kinematics and for protons at Brookhaven kinematics (c). The solid lines correspond to the hybrid model described in the text, while the dashed lines are the RFG results (taken from Ref. [153]).
of the magnetic strange form factor $G_M^s$. It is worth noticing that in Ref. [154] it was found that final state interaction can still be relevant, for the separate cross sections, even at relatively high energies ($E_\nu \simeq 1$ GeV). Here, we want to comment on a possible source of confusion: the authors [126,127] who originally proposed the measurement of the ratio $R$ use the following definition for the nucleon kinetic energy: $T_N = T_p = T_n + 2.77$ MeV, 2.77 MeV being the average Coulomb repulsion for the protons. For the total cross sections (283) this translates into different values for the lower limit of integration, namely $T_p^{min} = 60$ MeV and $T_n^{min} = 57.23$ MeV. Other authors [128,153,154], instead, use the same values $T_N = T_p = T_n$. While the choice of either definition does not affect the general considerations on the sensitivity to the strange form factors and to the nuclear model effects, it was noticed in [128] that, when the experimental ratio given in Eq. (245) is considered, its numerical value can depend appreciably on the choice adopted. This has to be remembered when comparing results from different authors and especially when examining future experimental data. All calculations of the Los Alamos ratio considered so far were performed in the framework of the IA. Going beyond the IA, Umino et al. [155,156] evaluated the effects of two-body relativistic meson exchange currents (MEC) in neutrino–nucleus scattering at low and intermediate energies, within a soft-pion dominance model. The exchange current effects on the single differential cross sections (282) were calculated for both neutrinos and antineutrinos, for a fixed incident energy of 200 MeV and under various assumptions for the strange form factors $G_A^s$ and $F_2^s$. These effects are strongly dependent on the value of the Fermi momentum and can be relatively large for $p_F > 350$ MeV.
As an example, with $p_F = 300$ MeV the $(\nu, \nu' p)$ cross sections are reduced by MEC effects by about 20% at the peak; these corrections, however, become much less important (as already discussed for FSI) in the ratio of cross sections, Eq. (284). For carbon, assuming $p_F = 225$ MeV, MEC contributions were found to be small for neutrino scattering and somewhat larger for the antineutrino cross sections, resulting in a reduction of the latter of about 15%. Moreover, these effects are mainly confined to low values of the outgoing nucleon kinetic energy, $T_N \le 50$ MeV, which are excluded by the kinematical cuts applied in the proposed LAMPF experiment in order to select quasi-elastic events. Correspondingly, MEC corrections to the proton over neutron ratio were found to be limited to a few percent for the neutrino case and about 10% for the antineutrino one.$^{20}$ The possibility of extracting information on the nucleonic strange form factors from a measurement of the ratios in Eqs. (284), (285) at energies higher than the ones available at LAMPF has been studied in Ref. [160], for neutrino and antineutrino energies of 1 GeV. Separate neutrino and antineutrino ratios were considered, both for differential and integrated cross sections. It was found that while the separate cross sections can still be rather sensitive to FSI effects, the nuclear model dependence is very weak in the ratios. The sensitivity of the ratios to the strange axial, magnetic and electric form factors as well as to the axial cut-off mass $M_A$ was studied in detail.
20 We mention here that MEC effects have also been considered in connection with the P-odd asymmetry in electron–nucleus scattering: both in the case of $\vec e$–$d$ quasi-elastic processes [157] and of $\vec e$–${}^4\mathrm{He}$ elastic scattering [158,159], the MEC contributions are small and below experimental detectability; hence, they do not significantly affect the investigation of nucleon strange form factors.
For the axial and magnetic strange form factors the dipole parameterizations in Eq. (262) were used, while the electric strange form factor was assumed to be

$$G_E^s(Q^2) = \frac{\rho_s\,\tau}{(1 + Q^2/M_V^2)^2}. \tag{289}$$

It was found that for the neutrino ratio the dominant effect is still due to $g_A^s$ and that this effect can still be large enough to allow an extraction of $g_A^s$ from a possible experiment, although its interplay with the strange magnetic form factor $G_M^s$ has to be carefully considered. On the other hand, the antineutrino ratio can be rather sensitive to the electric strange form factor, $G_E^s$. However, although important as preliminary indications, these results can depend on the specific kinematical cuts, chosen somewhat arbitrarily in the calculation, and further investigation would be needed in case of a future experiment. An illustration of these results is given in Fig. 17, where the ratio (285) of integral cross sections, calculated in the RFG, is plotted as a function of the parameter $\mu_s$ for different values of $g_A^s$ and $\rho_s$. Finally, let us go back to the neutrino–antineutrino asymmetry, introduced in Section 10 for the case of free nucleons. A more realistic approach would require using the quasi-elastic NC processes (277) together with the CC processes

$$\nu_\mu(\bar\nu_\mu) + A \to \mu^-(\mu^+) + p(n) + (A - 1) \tag{290}$$
for the denominator. Since, as already noticed, the momentum transfer $Q^2$ cannot be determined in quasi-elastic NC processes, the following asymmetry should be considered:

$$A(T_N) = \frac{(d\sigma/dT_N)^{NC}_{\nu} - (d\sigma/dT_N)^{NC}_{\bar\nu}}{(d\sigma/dT_N)^{CC}_{\nu} - (d\sigma/dT_N)^{CC}_{\bar\nu}} \tag{291}$$
with $T_N$ being, as usual, the kinetic energy of the ejected nucleon (proton or neutron).$^{21}$ Here, the CC cross sections $(d\sigma/dT_N)^{CC}_{\nu(\bar\nu)}$ in the denominator have been defined in analogy with the NC cross sections of Eq. (282). The effects of nuclear structure on this asymmetry were studied in Ref. [154], by comparing the results obtained within two nuclear models: the RFG and an RSM. The shell model wave functions were obtained as mean field solutions of a Dirac equation derived from a linear Lagrangian, which includes nucleons and scalar ($\sigma$), vector-isoscalar ($\omega$) and vector-isovector ($\rho$) mesons. This model had been previously used in the study of $(e, e'N)$ reactions on different nuclei [161–163]. The effects of final state interaction were described within the distorted wave impulse approximation, using, for the outgoing nucleon wave functions, the solutions of a Dirac equation with a phenomenological ROP, which includes scalar, vector and (for protons) Coulomb components [164]. When dealing with the CC cross sections entering the denominator of the asymmetry (291), another type of “final state interaction” has to be considered, namely the Coulomb distortion of the outgoing leptons. This is expected to be small, but it has to be taken into account since it is enhanced in the $\nu$–$\bar\nu$ difference in the denominator of (291) and can become relevant if other effects such as FSI cancel in the ratio.

21 We remind the reader that in the free nucleon case the following relation holds: $Q^2 = 2MT_N$.
Fig. 17. The ratios (285) of the integrated NC neutrino–nucleus and antineutrino–nucleus cross sections (283), as a function of $\mu_s$, evaluated in the RFG. The incident energy is $E_{\nu(\bar\nu)} = 1$ GeV and the integration limits for the cross sections are $100 \le T_p \equiv T_n \le 400$ MeV. The solid line corresponds to $g_A^s = \rho_s = 0$; in the other three curves [both in (a) and in (b)] $g_A^s = -0.15$ and $\rho_s = 0$ (dashed line), $-2$ (dot–dashed line) and $+2$ (dotted line) (taken from Ref. [160]).
In Ref. [154] Coulomb distortion was taken into account within the “effective impulse approximation”, which prescribes modifying the outgoing lepton plane wave $e^{i\vec k'\cdot\vec r}$ according to

$$e^{i\vec k'\cdot\vec r} \to \frac{|\vec k'_{eff}|}{|\vec k'|}\,e^{i\vec k'_{eff}\cdot\vec r}, \tag{292}$$

where

$$\vec k'_{eff} = \vec k'\left(1 \pm \frac{3}{2}\,\frac{Z\alpha}{R\,|\vec k'|}\right). \tag{293}$$

Here, the plus (minus) sign refers to the lepton (anti-lepton), $Z$ is the number of protons and $R \simeq 1.2\,A^{1/3}$ fm is the effective charge radius of the nucleus under investigation. The validity of this approximation has been studied in electron scattering processes [165].
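To give a feeling for the size of the correction in Eqs. (292) and (293), the sketch below evaluates the effective momentum and the amplitude rescaling factor for a carbon nucleus. It is a minimal illustration, not the calculation of Ref. [154]: the function names are hypothetical, $R$ is taken in fm, and the conversion constant $\hbar c = 197.327$ MeV fm is introduced here to express the shift in MeV.

```python
ALPHA = 1.0 / 137.036   # fine-structure constant
HBARC = 197.327         # hbar*c in MeV fm (converts 1/fm to MeV)


def coulomb_shift(Z, A, r0=1.2):
    """Momentum shift 3 Z alpha / (2 R) in MeV, with R = r0 * A^(1/3) fm."""
    R = r0 * A ** (1.0 / 3.0)
    return 1.5 * Z * ALPHA * HBARC / R


def effective_momentum(k_abs, Z, A, lepton=True):
    """|k'_eff| of Eq. (293): plus sign for a lepton, minus for an anti-lepton.
    k_abs is the asymptotic momentum |k'| in MeV."""
    sign = 1.0 if lepton else -1.0
    return k_abs * (1.0 + sign * coulomb_shift(Z, A) / k_abs)


def amplitude_factor(k_abs, Z, A, lepton=True):
    """Rescaling |k'_eff| / |k'| of the plane wave in Eq. (292)."""
    return effective_momentum(k_abs, Z, A, lepton) / k_abs


# Carbon target (Z = 6, A = 12), outgoing lepton with |k'| = 500 MeV.
print(coulomb_shift(6, 12))                    # shift in MeV
print(effective_momentum(500.0, 6, 12, True))  # mu^- case
print(amplitude_factor(500.0, 6, 12, False))   # mu^+ case
```

For carbon the shift is a few MeV, so the correction to a lepton momentum of several hundred MeV is at the percent level, which is why it matters mainly in the $\nu$–$\bar\nu$ difference.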
Fig. 18. The asymmetry $A$ for an ejected proton, Eq. (291), versus $T_N = T_p = T_n$, at incident $\nu(\bar\nu)$ energy $E_\nu = 1.0$ GeV. The solid lines correspond to the RSM calculation without final state interaction, the dot–dashed lines include final state interaction effects through the ROP, and the dotted lines represent the RSM corrected by both the final state interaction and the Coulomb distortion of the outgoing muon. The three sets of curves correspond to different choices of strangeness parameters: $g_A^s = \mu_s = 0$ (lower lines), $g_A^s = -0.15$, $\mu_s = 0$ (intermediate lines) and $g_A^s = -0.15$, $\mu_s = -0.3$ (upper lines) (taken from Ref. [154]).
Asymmetry (291) was calculated in Ref. [154] for three typical values of the neutrino energy, 200 MeV, 500 MeV and 1 GeV, using dipole-like parameterizations for both strange and non-strange form factors and the Galster parameterization for the electric form factor of the neutron [see Eq. (205)]. It was found that at low energies (200 MeV) the impact of nuclear uncertainties is too large to allow an unambiguous determination of the strange form factors from a measurement of $A$. However, at larger energies (already at 500 MeV and especially at 1 GeV) these effects are strongly reduced. Results for $E_\nu = 1$ GeV are shown in Fig. 18, where the asymmetry $A$, for the emission (in the NC processes) of a proton, is plotted as a function of $T_N$ for three different choices of the strange form factors, as indicated in the caption. Here, the solid curves represent the shell model calculations without FSI effects, while RFG results are not displayed, since they coincide with the RSM ones. The dot–dashed curves show the effects of FSI as described by the ROP. The effect on $A$ of both FSI and Coulomb distortion is illustrated by the dotted lines, which show that, except for the region of small $T_N$, this combined effect is still small compared with the effects of strangeness. A comment is required about the divergent behavior of the asymmetry in Fig. 18 for large $T_N$: this is associated with the effect of the mass of the outgoing lepton (a muon in this case), which brings the CC cross sections down to zero more rapidly than the NC ones.$^{22}$

22 The outgoing lepton mass was neglected in the formalism presented in Section 10, its effects being irrelevant for the general considerations made there, but it has been included in the results presented in this section.
Fig. 19. The integral asymmetry $A_I$, Eq. (294), for an ejected proton versus $g_A^s$, at incident $\nu(\bar\nu)$ energy $E_\nu = 1.0$ GeV. The lower limit for the integration is $T_N = 100$ MeV. The solid lines correspond to the RSM calculation, the dashed lines to the RFG with an average binding energy of 25 MeV, and the dot–dashed lines include final state interaction effects. The two sets of curves correspond to different choices of the magnetic strangeness parameter: $\mu_s = 0$ (lower lines) and $\mu_s = -0.3$ (upper lines) (taken from Ref. [154]).
The authors of Ref. [154] also considered an “integral asymmetry”, obtained as the ratio of NC and CC differences between the total cross sections (283):

$$A_I = \frac{\sigma^{NC}_{\nu} - \sigma^{NC}_{\bar\nu}}{\sigma^{CC}_{\nu} - \sigma^{CC}_{\bar\nu}}. \tag{294}$$

The corresponding curves for the emission of a proton are shown in Fig. 19. Since the specific sensitivity of the asymmetries in Eqs. (291), (294) to the parameters characterizing the strange form factors of the nucleon derives from the cancellation of various contributions in the difference between neutrino and antineutrino cross sections, a crucial point, in considering realistic measurements, is the role played by the different shapes of the available neutrino and antineutrino spectra. This issue was addressed in Ref. [166] for the Brookhaven kinematical conditions: the integrated cross sections (283) were considered both for quasi-elastic scattering on ${}^{12}\mathrm{C}$ and for elastic scattering on free nucleons. The integral asymmetry $A_I$ (294), calculated for $E_\nu = E_{\bar\nu} = 1$ GeV, was compared with the following “flux-averaged asymmetry”:

$$\langle A_I\rangle = \frac{\langle\sigma\rangle^{NC}_{\nu} - \langle\sigma\rangle^{NC}_{\bar\nu}}{\langle\sigma\rangle^{CC}_{\nu} - \langle\sigma\rangle^{CC}_{\bar\nu}}, \tag{295}$$

where the quantities $\langle\sigma\rangle_{\nu(\bar\nu)}$ are the total cross sections averaged over the BNL neutrino and antineutrino spectra [see Eq. (234)] and the limits for the integrations over the outgoing nucleon kinetic energy and over the neutrino energy correspond to the kinematics of the BNL-734 experiment [124], considered in Section 9. It was found that the effects of the folding amount to $< 2\%$, owing to the similarity of the BNL $\nu$ and $\bar\nu$ spectra. In Fig. 20 the integral asymmetry $A_I$ is plotted as a function of the strange axial constant $-g_A^s$. The solid line represents the
W.M. Alberico et al. / Physics Reports 358 (2002) 227–308
Fig. 20. The integral asymmetry $A_I$ for an ejected proton versus $-g_A^s$. The magnetic and electric strange form factors have been set to zero, while dipole parameterizations (with the same value of the cutoff mass) have been assumed for the axial CC and the axial strange form factors. The solid line corresponds to the “flux-averaged” $\nu(\bar\nu)$–proton elastic scattering asymmetry, the empty dots to elastic scattering without folding at $E_{\nu(\bar\nu)} = 1$ GeV. Results for the quasi-elastic asymmetry on $^{12}$C are shown by the following curves: dashed line (RFG, with folding), dotted line (RFG, unfolded), dot–dashed line (RSM, unfolded) and three-dot–dashed line (RSM+ROP, unfolded); all unfolded curves are evaluated at $E_{\nu(\bar\nu)} = 1$ GeV (taken from Ref. [166]).
folded asymmetry (295) for scattering on free protons, which has to be compared with the unfolded elastic asymmetry shown by the empty dots. The dashed and dotted lines show results for the quasi-elastic emission of protons, within the RFG, with and without folding. The curves obtained (without folding) using the RSM described previously in this section are also shown, confirming that at these energies nuclear effects on the asymmetry are small. In Ref. [166] an indirect “experimental value” of the folded asymmetry (295) for $\nu p$ elastic scattering was derived from the measured ratios of total elastic and quasi-elastic cross sections given in Eqs. (235)–(237):

$$\langle A_I \rangle = \frac{R^{\rm BNL}_{\nu}\,(1 - R^{\rm BNL})}{1 - R^{\rm BNL} R^{\rm BNL}_{\nu}/R^{\rm BNL}_{\bar\nu}} = 0.136 \pm 0.008\ ({\rm stat}) \pm 0.019\ ({\rm syst}) . \qquad (296)$$
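As an illustration of how Eq. (296) combines the three measured ratios, the following minimal Python sketch evaluates the asymmetry from them. It assumes, as the algebra of Eq. (296) implies, that the first two ratios are NC/CC ratios for neutrinos and antineutrinos and that the third is the ratio of antineutrino to neutrino NC cross sections. The function name and the numerical inputs are assumptions made for this example (the values of Eqs. (235)–(237) are not reproduced in this part of the text, so the central values below are illustrative placeholders), and no error propagation is attempted.

```python
def folded_asymmetry(r_nu, r_nubar, r):
    """Neutrino-antineutrino asymmetry built from three cross-section ratios.

    r_nu    : NC/CC cross-section ratio for neutrinos
    r_nubar : NC/CC cross-section ratio for antineutrinos
    r       : ratio of antineutrino to neutrino NC cross sections
    """
    denominator = 1.0 - r * r_nu / r_nubar
    if denominator == 0.0:
        raise ZeroDivisionError("neutrino and antineutrino CC cross sections coincide")
    return r_nu * (1.0 - r) / denominator


# Illustrative central values (assumed for this sketch, not quoted from the text).
R_NU, R_NUBAR, R = 0.153, 0.218, 0.302

if __name__ == "__main__":
    print(f"A_I = {folded_asymmetry(R_NU, R_NUBAR, R):.3f}")
```

Dividing numerator and denominator of the asymmetry by the neutrino CC cross section is what turns the ratio of differences into this combination of ratios; the sketch simply evaluates the resulting expression.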
The sensitivity of the ratios $R^{\rm BNL}_{\nu}$, $R^{\rm BNL}_{\bar\nu}$, $R^{\rm BNL}$ and of the folded asymmetry $\langle A_I \rangle$ to the strange form factors of the nucleon was then studied, by assuming the usual dipole parameterization for the strange and non-strange form factors. The results for neutrino and antineutrino proton scattering in the kinematical conditions of the BNL-734 experiment are shown in Fig. 21. The quantities $R^{\rm BNL}_{\nu}$, $R^{\rm BNL}_{\bar\nu}$, $R^{\rm BNL}$ and $\langle A_I \rangle$ are plotted as functions of the strange magnetic moment $\mu_s$ for different values of the axial strange constant $g_A^s$ and of $\rho_s$, while the axial cutoff mass is assumed to be $M_A = 1.032$ GeV. The horizontal dotted lines represent the experimental values and the shaded areas are the corresponding error bands. The latter are rather large and comparable with the effects of strangeness, and it is clear that more precise measurements are needed in order to extract information on the strange form factors. Nevertheless, we can observe that relatively large negative values of $g_A^s$ seem to be excluded by the antineutrino ratio $R^{\rm BNL}_{\bar\nu}$. The other ratios and the asymmetry are compatible with relatively large, negative values of
Fig. 21. The ratios $R^{\rm BNL}_{\nu}$, $R^{\rm BNL}_{\bar\nu}$ and $R^{\rm BNL}$ (here indicated without the superscript BNL) and the folded asymmetry $\langle A_I \rangle$ as functions of $\mu_s$. All curves correspond to $\nu(\bar\nu)p$ elastic scattering in the kinematic conditions of Brookhaven. Results are shown for $g_A^s = 0, -0.15$ and for $\rho_s = 0$ (solid line), $\rho_s = -2$ (dot–dashed line) and $\rho_s = +2$ (dashed line). The shadowed regions correspond to the experimental data [Eqs. (235), (236), (237) and (296)] measured in the BNL-734 experiment (taken from Ref. [166]).
$g_A^s$: in this case a negative value of $\mu_s$ seems to be favored, in agreement with the findings of Ref. [125] but in contradiction with the SAMPLE results. In Ref. [166] the sensitivity of the above quantities to the axial cutoff mass was also studied, in the range $M_A = 1.032 \pm 0.036$ GeV: in accordance with previous analyses [124,125] the ratios $R^{\rm BNL}_{\nu}$, $R^{\rm BNL}_{\bar\nu}$, $R^{\rm BNL}$ were found to be rather sensitive to $M_A$, while (assuming the same $Q^2$ dependence for $G_A$ and $G_A^s$) the asymmetry turned out to be practically independent of this parameter. Noticing that, as shown in Fig. 21, the asymmetry does not depend on the electric strange form factor $G_E^s$, we can conclude that a more precise measurement of this quantity could allow one to obtain more “clean” information on $G_A^s$ and $G_M^s$.$^{23}$

$^{23}$ We observe that the independence of the asymmetry of both $M_A$ and $G_E^s$ can be understood by looking at Eq. (253). This property is maintained when the cross sections which contribute to the asymmetry are integrated over the momentum transfer and averaged over the BNL neutrino flux.

Finally, we notice that most of the studies of neutrino quasi-elastic scattering processes we presented here consider the effects of negative values of $\mu_s$ only. This choice appeared to be favored by model calculations before the SAMPLE result relative to the proton data (see Section 6) was published. A positive $\mu_s$ would instead interfere with a negative $g_A^s$, reducing the effect of the latter on both the $p/n$ neutrino ratio and the $\nu$–$\bar\nu$ asymmetry, the size of this reduction being dependent on the actual value of $\mu_s$. However, as we have seen considering also the deuteron data, the SAMPLE result is still strongly affected by uncertainties due to radiative corrections, and the error bands of the BNL-734 experiment do not allow one to draw any definite
conclusion. More stringent results have to be obtained before evaluating their quantitative impact on the neutrino cross sections and ratios.

13. Summary and conclusions

The strange axial and vector form factors of the nucleon are an important issue which has been under intensive experimental and theoretical investigation during the last decade. The first experimental evidence that the $Q^2 = 0$ value of the axial strange form factor of the nucleon, $g_A^s$, is unexpectedly large was obtained in 1989 in the famous EMC experiment (CERN) on the measurement of deep inelastic scattering of polarized muons on polarized protons. The EMC result triggered new DIS experiments at CERN, SLAC and DESY and hundreds of theoretical papers in which the connection of the constant $g_A^s$ with the strange-quark polarization and the polarization of gluons in the nucleon was elaborated in detail. The EMC result also intensified the experimental and theoretical investigation of the problem of the strange form factors of the nucleon at relatively low values of $Q^2$ ($\leq 1$ GeV$^2$).

In order to obtain information on the strange form factors of the nucleon it is necessary to investigate neutral current (NC)-induced processes. In the region of small $Q^2$ only the $u$, $d$ and $s$ parts of the NC should be taken into account. In the standard model there are three different components of the NC: (1) isovector (vector and axial) $u$–$d$ currents; (2) electromagnetic current; (3) strange (vector and axial) currents. Thus, by using the available information on the electromagnetic and CC axial form factors of the nucleon, from the investigation of the NC-induced lepton (electron and/or neutrino) nucleon processes we can obtain information on the strange axial and vector form factors of the nucleon. This strategy can be realized by the investigation of
• the P-odd asymmetry in the elastic scattering of polarized electrons on nucleons;
• neutrino (antineutrino) elastic scattering on nucleons.
In the case of electron–nucleon scattering in the region of small $Q^2$ the diagram with the exchange of a $\gamma$-quantum gives a much larger contribution to the matrix element of the process than the diagram with $Z^0$ exchange. Thus, the P-odd asymmetry, which arises from the interference between $\gamma$ and $Z^0$ exchanges, is very small ($\sim 10^{-5}$). The measurement of such small asymmetries became possible only when very intense beams of highly polarized electrons and high-resolution spectrometers were developed at MIT/Bates, Jefferson Lab and Mainz. In Section 6, we have discussed the interesting results that were obtained in the latest experiments. The P-odd asymmetry in polarized electron–proton scattering is determined by the interference of the scalar electromagnetic and pseudoscalar $Z^0$-exchange parts of the amplitude, $a\,V^{\rm NC}$ and $v\,A^{\rm NC}$: the latter, however, occurs in the asymmetry multiplied by the small coefficient $g_V = -1/2 + 2\sin^2\theta_W$ (from the electron NC). Hence, the P-odd asymmetry in $\vec e + p$ scattering is mainly sensitive to the strange vector form factors of the proton. On the other
hand, the elastic and quasi-elastic scattering of polarized electrons on nuclei allows one to obtain complementary information: different choices of kinematics (e.g. backward versus forward scattering) and of nuclear targets can be used to select and eventually enhance the various hadronic NC components. At the moment this possibility has not been experimentally tested, but several experiments are under way or foreseen in the near future, which will measure the P-odd asymmetry in light nuclei (deuterium, $^4$He, etc.).

The NC neutrino (antineutrino) nucleon scattering allows one to obtain information on both the axial and vector strange form factors (see Section 9). Moreover, none of the contributions of the strange form factors to the cross sections of these processes is, in principle, suppressed. By investigating neutrino (antineutrino) nucleon scattering (as well as the P-odd asymmetry in electron–nucleon scattering) one can also hope to obtain information on the $Q^2$ behavior of the strange form factors. The most detailed investigation of neutrino (antineutrino) nucleon scattering was done in 1987 in the BNL-734 Brookhaven experiment, whose results we discussed here in detail. In order to obtain information on the strange form factors of the nucleon from the data of neutrino experiments it is necessary to know with good accuracy the $Q^2$ behavior of the CC axial form factor. There is no such information at present. Yet, we have discussed in Section 10 a method that allows one to obtain model-independent information on the strange form factors of the nucleon. For this purpose it is necessary to measure the neutrino–antineutrino asymmetry. There is no doubt that, with the many new neutrino experiments and the suggested new neutrino facility, the neutrino factory [167], the problem of the strange form factors of the nucleon will see new developments. It is worth mentioning that nuclear targets are usually used in neutrino experiments.
A large part of this review is devoted to the detailed consideration of nuclear effects in the processes of neutrino (antineutrino) scattering on nuclei (see Sections 11 and 12). These effects are usually quite relevant for the individual cross sections, but they can be drastically reduced when one considers ratios of cross sections. This fact has been widely underlined also in connection with the P-odd asymmetry in the (elastic or inelastic) scattering of polarized electrons on nuclei (see Sections 7 and 8). Finally, we mention that in the introductory parts (see Sections 1–4) many basic phenomenological relations are derived in sufficient detail that they can easily be followed by the reader. We believe that this review will be useful for many physicists who are or will be interested in the problem of strangeness in the nucleon.
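As a rough numerical check of two statements made in this summary (the smallness of the electron vector coupling $g_V = -1/2 + 2\sin^2\theta_W$ and the $\sim 10^{-5}$ size of the P-odd asymmetry at small $Q^2$), one can evaluate $g_V$ and the characteristic scale $G_F Q^2/(4\sqrt{2}\pi\alpha)$ of the $\gamma$–$Z^0$ interference. The sketch below is only an order-of-magnitude estimate: the numerical constants ($\sin^2\theta_W \approx 0.231$, $G_F \approx 1.166\times 10^{-5}$ GeV$^{-2}$, $\alpha \approx 1/137.036$) and the function names are assumptions of this example, not values taken from the text, and all form-factor dependence is omitted.

```python
import math

SIN2_THETA_W = 0.231      # assumed value of sin^2(theta_W)
G_FERMI = 1.166e-5        # Fermi constant in GeV^-2 (assumed)
ALPHA = 1.0 / 137.036     # fine-structure constant (assumed)


def electron_vector_coupling(sin2_theta_w=SIN2_THETA_W):
    """g_V = -1/2 + 2 sin^2(theta_W), the electron NC vector coupling."""
    return -0.5 + 2.0 * sin2_theta_w


def asymmetry_scale(q2_gev2):
    """Order-of-magnitude scale G_F Q^2 / (4 sqrt(2) pi alpha) of the P-odd asymmetry."""
    return G_FERMI * q2_gev2 / (4.0 * math.sqrt(2.0) * math.pi * ALPHA)


if __name__ == "__main__":
    print(f"g_V = {electron_vector_coupling():+.3f}")
    for q2 in (0.1, 0.5, 1.0):
        print(f"Q^2 = {q2:.1f} GeV^2 -> scale ~ {asymmetry_scale(q2):.1e}")
```

With these inputs $g_V$ is a few per cent in magnitude, and the scale at $Q^2 = 0.1$ GeV$^2$ is close to $10^{-5}$, in line with the orders of magnitude quoted in the text.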
Acknowledgements

The authors are deeply grateful to A. Molinari, T.W. Donnelly, E. Barone, A. De Pace and C. Giunti for helpful discussions during the preparation of the manuscript. S.M.B. and C.M. acknowledge support from the Department of Theoretical Physics, University of Torino and INFN. S.M.B. thanks the Physics Department of the Helsinki University for the hospitality during the initial stage of this work and the Alexander von Humboldt Foundation for support. This work was supported in part by Italian MURST under contract N. 9902198839.
References

[1] P. Vilain, et al., CERN-EP/98-128, 1998.
[2] S.A. Rabinowitz, et al., Phys. Rev. Lett. 70 (1993) 134.
[3] M. Abramowitz, et al., Z. Phys. C 15 (1982) 19.
[4] T. Adams, et al. [NuTeV Collaboration], in: Bloomington 1999, Physics with a high luminosity polarized electron ion collider, p. 337, hep-ex/9906038.
[5] A.O. Bazarko, et al. [CCFR Collaboration], Z. Phys. C 65 (1995) 189.
[6] V. Barone, C. Pascaud, F. Zomer, Eur. Phys. J. C 12 (2000) 243.
[7] M. Anselmino, A. Efremov, E. Leader, Phys. Rep. 261 (1995) 1.
[8] B. Lampe, E. Reya, Phys. Rep. 332 (2000) 1.
[9] B.W. Filippone, Xiangdong Ji, The spin structure of the nucleon, hep-ph/0101224.
[10] J. Ellis, Strangeness and hadron structure, hep-ph/0005322.
[11] L.S. Brown, W.J. Pardee, R.D. Peccei, Phys. Rev. D 4 (1971) 2801.
[12] T.P. Cheng, R.F. Dashen, Phys. Rev. Lett. 26 (1971) 594.
[13] J. Gasser, M.E. Sainio, A. Svarc, Nucl. Phys. B 307 (1988) 779.
[14] T.P. Cheng, L.F. Li, Gauge Theory of Elementary Particle Physics, Clarendon, Oxford, 1984.
[15] J. Gasser, M. Leutwyler, M.E. Sainio, Phys. Lett. B 253 (1991) 252.
[16] S.V. Wright, D.B. Leinweber, A.W. Thomas, Nucl. Phys. A 680 (2000) 137.
[17] T.P. Cheng, Phys. Rev. D 38 (1988) 2869.
[18] A. Bottino, F. Donato, N. Fornengo, S. Scopel, Astropart. Phys. 13 (2000) 215.
[19] J. Ashman, et al. [European Muon Collaboration], Nucl. Phys. B 328 (1989) 1.
[20] B. Adeva, et al. [Spin Muon Collaboration], Phys. Rev. D 58 (1998) 112002.
[21] K. Abe, et al. [E143 Collaboration], Phys. Rev. Lett. 74 (1995) 346.
[22] K. Abe, et al. [E143 Collaboration], Phys. Rev. Lett. 75 (1995) 25.
[23] K. Abe, et al. [E143 Collaboration], Phys. Rev. D 58 (1998) 12 003.
[24] P.L. Anthony, et al. [E155 Collaboration], Phys. Lett. B 493 (2000) 19.
[25] A. Airapetian, et al. [HERMES Collaboration], Phys. Lett. B 442 (1998) 484.
[26] K. Ackerstaff, et al. [HERMES Collaboration], Phys. Lett. B 404 (1997) 383.
[27] Hai-Yang Cheng, Chin. J. Phys. 38 (2000).
[28] E.W. Hughes, R. Voss, Annu. Rev. Nucl. Part. Sci. 49 (1999) 303.
[29] D.E. Groom, et al. [Particle Data Group], Eur. Phys. J. C 15 (2000) 1.
[30] G.M. Shore, The proton spin crisis: another ABJ anomaly?, in: Erice 1998, From the Planck Length to the Hubble Radius, p. 79, hep-ph/9812355.
[31] C.H. Llewellyn Smith, hep-ph/9812301.
[32] G.K. Mallot, in: J.A. Jaros, M.E. Peskin (Eds.), Proceedings of the 19th International Symposium on Photon and Lepton Interactions at High Energy LP99, Int. J. Mod. Phys. A 15S1 (2000) 521.
[33] K. Liu, J. Phys. G 27 (2001) 511.
[34] G. ’t Hooft, M. Veltman, Nucl. Phys. B 44 (1972) 189.
[35] R.D. Ball, S. Forte, G. Ridolfi, Phys. Lett. B 378 (1996) 255.
[36] A.V. Efremov, O.V. Teryaev, Spin structure of the nucleon and triangle anomaly, JINR-E2-88-287, Published in Czech. Hadron Symposium, 1988, p. 302.
[37] G. Altarelli, G.G. Ross, Phys. Lett. B 212 (1988) 391.
[38] See H.D. Politzer, Phys. Rep. 140 (1974) 129.
[39] S.L. Glashow, Nucl. Phys. 22 (1961) 579.
[40] S. Weinberg, Phys. Rev. Lett. 19 (1967) 1264.
[41] A. Salam, in: Svartholm: Elementary Particle Theory, Proceedings of the Nobel Symposium, Lerum, Sweden, Stockholm, 1968, p. 367.
[42] S. Weinberg, The Quantum Theory of Fields, Vols. I and II, Cambridge University Press, Cambridge, 1996.
[43] M.E. Peskin, D.V. Schroeder, An introduction to Quantum Field Theory, Addison-Wesley Publ. Company, Reading, MA, 1996.
[44] C. Quigg, Gauge Theories of the Strong, Weak and Electromagnetic Interactions, Benjamin/Cummings, Reading, USA, 1983 (Frontiers in Physics, 56).
[45] E. Leader, E. Predazzi, An Introduction to Gauge Theories and Modern Particle Physics, Vol. 1, Cambridge University Press, Cambridge, UK, 1996.
[46] S.M. Bilenky, Introduction to Feynman Diagrams and Electroweak Interactions Physics, Editions Frontieres, 1994.
[47] C. Itzykson, J.B. Zuber, Quantum Field Theory, McGraw-Hill Book Co., New York, 1988.
[48] R.L. Jaffe, Phys. Lett. B 229 (1989) 275.
[49] B.R. Holstein, in: E.J. Beise, R.D. McKeown (Eds.), Proceedings of the Caltech Workshop on Parity Violation in Electron Scattering, World Scientific, Singapore, 1990, p. 27.
[50] N.W. Park, J. Schechter, H. Weigel, Phys. Rev. D 43 (1991) 869.
[51] N.W. Park, H. Weigel, Nucl. Phys. A 541 (1992) 453.
[52] W. Koepf, E.M. Henley, J.S. Pollock, Phys. Lett. B 288 (1992) 11.
[53] S. Hong, B. Park, Nucl. Phys. A 561 (1993) 525.
[54] T.D. Cohen, H. Forkel, M. Nielsen, Phys. Lett. B 316 (1993) 1.
[55] S.C. Phatak, S. Sahu, Phys. Lett. B 321 (1994) 11.
[56] M.J. Musolf, M. Burkardt, Z. Phys. C 61 (1994) 433.
[57] W. Koepf, E.M. Henley, Phys. Rev. C 49 (1994) 2219.
[58] H. Forkel, et al., Phys. Rev. C 50 (1994) 3108.
[59] H. Ito, Phys. Rev. C 52 (1995) R1750.
[60] H. Weigel, et al., Phys. Lett. B 353 (1995) 20.
[61] H.W. Hammer, U. Meissner, D. Drechsel, Phys. Lett. B 367 (1996) 323.
[62] D. Leinweber, Phys. Rev. D 53 (1996) 5115.
[63] H. Forkel, Prog. Part. Nucl. Phys. 36 (1996) 229.
[64] Chr. V. Christov, et al., Prog. Part. Nucl. Phys. 37 (1996) 91.
[65] P. Geiger, N. Isgur, Phys. Rev. D 55 (1997) 299.
[66] M.J. Ramsey-Musolf, H. Ito, Phys. Rev. C 55 (1997) 3066.
[67] M.J. Musolf, H.W. Hammer, D. Drechsel, Phys. Rev. D 55 (1997) 2741.
[68] W. Melnitchouk, M. Malheiro, Phys. Rev. C 55 (1997) 431.
[69] S.-T. Hong, B.-Y. Park, D.-P. Min, Phys. Lett. B 414 (1997) 229.
[70] U. Meissner, V. Mull, J. Speth, J.W. Van Orden, Phys. Lett. B 408 (1997) 381.
[71] B.-Q Ma, Phys. Lett. B 408 (1997) 387.
[72] H.W. Hammer, M.J. Ramsey-Musolf, Phys. Lett. B 416 (1998) 5.
[73] L.L. Barz, et al., Nucl. Phys. A 640 (1998) 259.
[74] S.J. Dong, K.F. Liu, A.G. Williams, Phys. Rev. D 58 (1998) 074504.
[75] H.W. Hammer, M.J. Ramsey-Musolf, Phys. Rev. C 60 (1999) 045205.
[76] T. Hemmert, B. Kubis, U. Meissner, Phys. Rev. C 60 (1999) 04550.
[77] H. Forkel, F.S. Navarra, M. Nielsen, Phys. Rev. C 61 (2000) 055206.
[78] D.B. Leinweber, A.W. Thomas, Phys. Rev. D 62 (2000) 074505.
[79] L. Hannelius, D.O. Riska, Phys. Rev. C 62 (2000) 045204.
[80] L. Hannelius, D.O. Riska, L.Y. Glozman, Nucl. Phys. A 665 (2000) 353.
[81] S. Dubnicka, A.Z. Dubnickova, P. Weisenpacher, hep-ph/0102171.
[82] M.J. Musolf, T.W. Donnelly, J. Dubach, S.J. Pollock, S. Kowalski, E.J. Beise, Phys. Rep. 239 (1994) 1.
[83] See http://www.jlab.org/ and http://www.jlab.org/highlights/nuclear/Nuclear.html
[84] K.A. Aniol, et al. [HAPPEX Collaboration], Phys. Lett. B 509 (2001) 211, nucl-ex/0006002.
[85] M.J. Musolf, B.R. Holstein, Phys. Lett. B 242 (1990) 461.
[86] S. Zhu, S.J. Puglia, B.R. Holstein, M.J. Ramsey-Musolf, Phys. Rev. D 62 (2000) 033008.
[87] The numbers associated with the points in Fig. 2 correspond to the following reference numbers in the present work: [65, 48, 61, 56, 60, 50, 51, 74, 75, 69, 59, 52, 70, 71].
[88] H. Anklin, et al., Phys. Lett. B 428 (1998) 248.
[89] E.E.W. Bruins, et al., Phys. Rev. Lett. 75 (1995) 21.
[90] P.A. Souder, in: Bates 25 Symposium, Proceedings, November 3–5, 1999, Cambridge, p. 291; K. Kumar, D. Lhuillier (spokespersons), JLab Experiment 99–115.
[91] D.T. Spayde, et al. [SAMPLE Collaboration], Phys. Rev. Lett. 84 (2000) 1106.
[92] Ya.B. Zeldovich, Sov. Phys. JETP 12 (1961) 777.
[93] C.S. Wood, et al., Science 275 (1997) 759.
[94] E.J. Beise, in: Bates 25 Symposium, Proceedings, November 3–5, 1999, Cambridge, p. 305.
[95] R. Hasty, et al. [SAMPLE Collaboration], Science 290 (2000) 2117, nucl-ex/0102001.
[96] T.R. Hemmert, U. Meissner, S. Steininger, Phys. Lett. B 437 (1998) 184.
[97] D.H. Beck, B.R. Holstein, Int. J. Mod. Phys. E 10 (2001) 1, hep-ph/0102053.
[98] Opportunities for Nuclear Science at the MIT-Bates Accelerator Center, November 2000 (available at http://mitbates.mit.edu/news.stm).
[99] T. Ito (spokesperson), MIT-Bates Experiment 00-04.
[100] F.E. Maas [A4 Collaboration], in: B. Frois, M.A. Bouchiai (Eds.), Parity Violation in Atoms and Polarized Electron Scattering, World Scientific Publ. Co., Singapore, 1999, p. 491.
[101] D.H. Beck [G0 Collaboration], in: B. Frois, M.A. Bouchiai (Eds.), Parity Violation in Atoms and Polarized Electron Scattering, World Scientific Publ. Co., Singapore, 1999, p. 502.
[102] G. Feinberg, Phys. Rev. D 12 (1975) 3575.
[103] J.D. Walecka, Nucl. Phys. A 285 (1977) 349.
[104] T.W. Donnelly, R.D. Peccei, Phys. Rep. 50 (1979) 1.
[105] D.H. Beck, Phys. Rev. D 39 (1989) 3248.
[106] J. Bernabeu, S.M. Bilenky, J. Segura, S.K. Singh, Phys. Lett. B 282 (1992) 177.
[107] E.J. Beise (spokesperson), CEBAF experiment 91-004.
[108] W.M. Alberico, A. Molinari, in: B. Frois, M.A. Bouchiai (Eds.), Parity Violation in Atoms and Polarized Electron Scattering, World Scientific Publ. Co., Singapore, 1999, p. 389, nucl-th/9904026.
[109] R. Michaels, P.A. Souder, G.M. Urciuoli (spokespersons), Jefferson Laboratory Experiment E-00-003.
[110] C.J. Horowitz, J. Piekarewicz, Phys. Rev. Lett. 86 (2001) 5647, astro-ph/0010227.
[111] M.J. Musolf, T.W. Donnelly, Nucl. Phys. A 546 (1992) 509.
[112] W. Bertozzi, E.J. Moniz, R.W. Lourie, in: B. Frois, I. Sick (Eds.), Modern Topics in Electron Scattering, World Scientific, Singapore, 1991, p. 419.
[113] M.J. Musolf, T.W. Donnelly, Z. Phys. C 57 (1993) 559.
[114] T.W. Donnelly, M.J. Musolf, W.M. Alberico, M.B. Barbaro, A. De Pace, A. Molinari, Nucl. Phys. A 541 (1992) 525.
[115] S. Galster, Nucl. Phys. B 32 (1971) 221.
[116] S. Platchkov, et al., Nucl. Phys. A 508 (1990) 343.
[117] E. Hadjimichael, G.I. Poulis, T.W. Donnelly, Phys. Rev. C 45 (1992) 2666.
[118] L. Diaconescu, R. Schiavilla, U. van Kolck, nucl-th/0011034.
[119] M.B. Barbaro, A. De Pace, T.W. Donnelly, A. Molinari, Nucl. Phys. A 569 (1994) 701.
[120] M.B. Barbaro, A. De Pace, T.W. Donnelly, A. Molinari, Nucl. Phys. A 598 (1996) 503.
[121] D.B. Kaplan, A. Manohar, Nucl. Phys. B 310 (1988) 527.
[122] J. Ellis, M. Karliner, Phys. Lett. B 213 (1988) 73.
[123] V. de Alfaro, S. Fubini, G. Furlan, C. Rossetti, Currents in Hadron Physics, North-Holland, Amsterdam, 1973.
[124] L.A. Ahrens, et al., Phys. Rev. D 35 (1987) 785.
[125] G.T. Garvey, W.C. Louis, D.H. White, Phys. Rev. C 48 (1993) 761.
[126] G.T. Garvey, S. Krewald, E. Kolbe, K. Langanke, Phys. Lett. B 289 (1992) 249.
[127] G.T. Garvey, S. Krewald, E. Kolbe, K. Langanke, Phys. Rev. C 48 (1993) 1919.
[128] E. Kolbe, S. Krewald, H. Weigel, Z. Phys. A 358 (1997) 445.
[129] W.C. Louis, LSND Collaboration, LAMPF proposal 1173.
[130] See P.E. Bosted, Phys. Rev. C 51 (1995) 509.
[131] K. de Jager, in: Bates 25 Symposium, Proceedings, November 3–5, 1999, Cambridge, p. 225, hep-ex/0003034.
[132] M.K. Jones, et al. [Jefferson Lab Hall A Collaboration], Phys. Rev. Lett. 84 (2000) 1398.
[133] W.M. Alberico, S.M. Bilenky, C. Giunti, C. Maieron, Z. Phys. C 70 (1996) 463.
[134] L. Andivahis, et al., Phys. Rev. D 50 (1994) 5491.
[135] R.G. Arnold, et al., Phys. Rev. Lett. 57 (1986) 174.
[136] P.E. Bosted, et al., Phys. Rev. C 42 (1990) 38.
[137] P.N. Kirk, et al., Phys. Rev. D 8 (1973) 63.
[138] D. Krupa, et al., J. Phys. G 10 (1984) 455.
[139] W. Bartel, et al., Nucl. Phys. B 58 (1973) 429.
[140] A. Lung, et al., Phys. Rev. Lett. 70 (1993) 718.
[141] S. Rock, et al., Phys. Rev. Lett. 49 (1982) 1139.
[142] S.I. Bilen’kaya, Yu.M. Kazarinov, Sov. J. Nucl. Phys. 32 (1980) 382.
[143] K. Watanabe, M. Takahashi, Phys. Rev. D 51 (1995) 1423.
[144] S.J. Brodsky, B.T. Chertok, Phys. Rev. D 14 (1976) 3003.
[145] S.J. Brodsky, J.F. Gunion, Phys. Rev. Lett. 37 (1976) 402.
[146] T. Suzuki, T. Kohyama, K. Yazaki, Phys. Lett. B 252 (1990) 323.
[147] E.M. Henley, G. Krein, S.J. Pollock, A.G. Williams, Phys. Lett. B 269 (1991) 31.
[148] S. Fiarman, W.E. Meyerhoff, Nucl. Phys. A 206 (1973) 1.
[149] F. Ajzenberg-Selove, Nucl. Phys. A 506 (1990) 1.
[150] F. Ajzenberg-Selove, Nucl. Phys. A 460 (1986) 1.
[151] J. Blomqvist, A. Molinari, Nucl. Phys. A 106 (1968) 545.
[152] C.J. Horowitz, Hungchong Kim, D.P. Murdock, S. Pollock, Phys. Rev. C 48 (1993) 3078.
[153] M.B. Barbaro, A. De Pace, T.W. Donnelly, A. Molinari, M.J. Musolf, Phys. Rev. C 54 (1996) 1954.
[154] W.M. Alberico, M.B. Barbaro, S.M. Bilenky, J.A. Caballero, C. Giunti, C. Maieron, E. Moya de Guerra, J.M. Udías, Nucl. Phys. A 623 (1997) 471.
[155] Y. Umino, J.M. Udías, P.J. Mulders, Phys. Rev. Lett. 74 (1995) 4993.
[156] Y. Umino, J.M. Udías, Phys. Rev. C 52 (1995) 3399.
[157] S. Schramm, C.J. Horowitz, Phys. Rev. C 49 (1994) 2777.
[158] M.J. Musolf, T.W. Donnelly, Phys. Lett. B 318 (1993) 263.
[159] M.J. Musolf, R. Schiavilla, T.W. Donnelly, Phys. Rev. C 50 (1994) 2173.
[160] W.M. Alberico, M.B. Barbaro, S.M. Bilenky, J.A. Caballero, C. Giunti, C. Maieron, E. Moya de Guerra, J.M. Udías, Phys. Lett. B 438 (1998) 9.
[161] J.M. Udías, P. Sarriguren, E. Moya de Guerra, E. Garrido, J.A. Caballero, Phys. Rev. C 48 (1993) 2731.
[162] J.M. Udías, P. Sarriguren, E. Moya de Guerra, E. Garrido, J.A. Caballero, Phys. Rev. C 53 (1996) R1488.
[163] J.M. Udías, Ph.D. Thesis, Universidad Autónoma de Madrid, 1993.
[164] For more details on the Relativistic Optical Potential used here see also S. Hama, B.C. Clark, E.D. Cooper, H.S. Sherif, R.L. Mercer, Phys. Rev. C 41 (1990) 2737; E.D. Cooper, S. Hama, B.C. Clark, R.L. Mercer, Phys. Rev. C 47 (1993) 297.
[165] C. Giusti, F.D. Pacati, Nucl. Phys. A 473 (1987) 717.
[166] W.M. Alberico, M.B. Barbaro, S.M. Bilenky, J.A. Caballero, C. Giunti, C. Maieron, E. Moya de Guerra, J.M. Udías, Nucl. Phys. A 651 (1999) 277.
[167] See C. Albright, et al., Physics at a Neutrino Factory, hep-ex/0008064.
Physics Reports 358 (2002) 309–440
Quantum effects in Coulomb blockade
I.L. Aleiner (a), P.W. Brouwer (b), L.I. Glazman (c, ∗)
(a) Department of Physics, SUNY at Stony Brook, Stony Brook, NY 11794, USA
(b) Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY 14853-2501, USA
(c) School of Physics and Astronomy, Theoretical Physics Institute, University of Minnesota, 116 Church Str. SE, Minneapolis, MN 55455, USA
Received July 2001; editor: C.W.J. Beenakker

Contents
1. Introduction
2. The model
2.1. Non-interacting electrons in an isolated dot: status of RMT
2.2. Effect of a weak magnetic field on statistics of one-electron states in the dot
2.3. Interaction between electrons: the universal description
2.4. Inclusion of the leads
3. Strongly blockaded quantum dots
3.1. Mesoscopic
4.2. Finite-size open dot: introduction to the bosonization technique and the relevant energy scales
4.3. The limit of low temperature: the effective exchange Hamiltonian and Kondo effect
4.4. Simplified model for mesoscopic
∗ Corresponding author. Tel.: +1-612-6260724; fax: +1-612-6268606. E-mail address: [email protected] (L.I. Glazman).
© 2002 Elsevier Science B.V. All rights reserved. 0370-1573/02/$ - see front matter. PII: S0370-1573(01)00063-1
I.L. Aleiner et al. / Physics Reports 358 (2002) 309–440
Appendix C. Scattering states and derivation of Eq. (99)
Appendix D. Mesoscopic
Appendix G. Correlation functions
Appendix H. Derivation of Eqs. (340)–(343)
Appendix I. Derivation of Eqs. (360)–(361)
References
Abstract

We review the quantum interference effects in a system of interacting electrons confined to a quantum dot. The review starts with a description of an isolated quantum dot. We discuss the random matrix theory (RMT) of the one-electron states in the dot, present the universal form of the interaction Hamiltonian compatible with the RMT, and derive the leading corrections to the universal interaction Hamiltonian. Next, we discuss a theoretical description of a dot connected to leads via point contacts. Having established the theoretical framework to describe such an open system, we discuss its transport and thermodynamic properties. We review the evolution of the transport properties with the increase of the contact conductances from small values to values $\sim e^2/\hbar$. In the discussion of transport, the emphasis is put on mesoscopic
1. Introduction

Conventionally, electric transport in bulk materials is characterized by the conductivity $\sigma$. Then, the conductance $G$ of a finite-size sample of dimensions $L_x \times L_y \times L_z$ can be found by combining the conductances of its smaller parts, $G = \sigma L_y L_z/L_x$. This description, however, is applicable only at sufficiently high temperatures, at which the conductivity can be treated as a local quantity. It was discovered about two decades ago [1,2] that the quantum corrections to the conductivity are non-local on the scale of the (temperature dependent) dephasing length $L_\varphi$, which is much larger than the elastic mean free path. If the sample is small, or the temperature low, so that $L_\varphi$ exceeds the sample size, the concept of conductivity loses its meaning. Moreover, the conductance $G$ acquires significant sample-to-sample
We refer the reader to Ref. [11] for a pedagogical presentation of the theory of adiabatic electron transport.
Fig. 1. Micrograph of the quantum dot [9]. By changing the voltages on the gates, the conductances of the point contacts and the shape of the dot can be changed.
A statistical ensemble of quantum dots can be obtained by slightly varying the shape of the quantum dot, the Fermi energy, or the magnetic field, keeping the properties of its contacts to the outside world fixed. Statistical properties of the conductance for such an ensemble depend on the number of modes, $N_{\rm ch}$, in the junctions and their transparency. The “usual” UCF theory adequately describes the conductance through a disordered or chaotic dot in the limit of large number of modes, $N_{\rm ch} \gg 1$. When all channels in the two point contacts are transparent, the universal value of the root mean square (r.m.s.) conductance
Fig. 2. (a) Schematic view of the dot with the gate electrode; (b) conductance of the dot as a function of the gate voltage (from Ref. [21]).
A first step to approach this problem is to disregard the mesoscopic
charge
There is a certain overlap with Ref. [15].
main results for the conductance through a dot and for its thermodynamics, see Section 4.1. Along with providing the general picture for the conductance and differential capacitance, this subsection points to formulas valid in various important limiting cases, which are derived later in Sections 4.6 and 4.7. Sections 4.2–4.4 review the main physical ideas built into the theory, while the details of the rigorous theory are presented in Section 4.5. Last but not least, we illustrate the application of theoretical results by brie
We put $\hbar = 1$ in all the intermediate formulae.
In this orthonormal basis

$$\hat H_F = \sum_\alpha \varepsilon_\alpha \hat\psi_\alpha^\dagger \hat\psi_\alpha , \qquad (4)$$
where the fermionic operators ˆ are de8ned as ˆ ≡ d˜r ˆ (˜r)∗ (˜r) and have the usual anticommutation relations, cf. Eq. (2). Each particular eigenstate depends sensitively on the details of the random potential U (˜r), which is determined by the shape of the quantum dot. However, we will not be interested in the precise value of observables that depend on the detailed realization of the potential U (˜r). Instead, our goal is a statistical description of the various response functions of the system with respect to external parameters such as magnetic 8eld, gate voltage, etc., and of the correlations between the response functions at dierent values of those parameters. Hereto, the statistical properties of a response function are 8rst related to the correlation function of the eigenstates of Hamiltonian (3). Then, we can employ the known results for statistics of the eigenvalues and eigenvectors in a disordered or a chaotic system [41– 44]. For a disordered dot, the correlation functions are found by an average of the proper quantities over the realizations of the random potential. Such averaging can be done by means of the standard diagrammatic technique for the electron Green functions ∗ (˜r2 ) (˜r1 ) GR; A (j;˜r1 ;˜r2 ) = ; (5) j − j ± i0
where plus and minus signs correspond to the retarded (R) and advanced (A) Green functions, respectively. The spectrum of one-electron energies is fully characterized by the density of states
$$\nu(\varepsilon) = \sum_\alpha \delta(\varepsilon - \varepsilon_\alpha) = \frac{1}{2\pi i}\int d\vec r\, [G^A(\varepsilon; \vec r, \vec r) - G^R(\varepsilon; \vec r, \vec r)] , \tag{6}$$
where the last equality follows immediately from definition (5). Even though the density of states is a strongly oscillating function of energy, its average is smooth,
$$\langle \nu(\varepsilon) \rangle = \frac{1}{\Delta(\varepsilon)} , \tag{7}$$
where $\Delta(\varepsilon)$ is the mean one-electron level spacing. It varies on the characteristic scale of the order of the Fermi energy $E_F$, measured, say, from the conduction band edge, which is the largest energy scale in the problem. Since we are interested in quantities associated with a much smaller energy scale, we can neglect the energy dependence of the mean level spacing. The average in Eq. (7), denoted by brackets $\langle \ldots \rangle$, is performed over the different realizations of the random potential for the case of a disordered dot, or over an energy strip of width $\gg \Delta$ but $\ll E_F$ for a clean (i.e., ballistic) system. The average density of states does not carry any information about the correlations between the energies of different eigenstates. Such information is contained in the correlation functions for the electron energy spectrum. Probably the most important example is the two-point correlation function $R^{(2)}(\omega)$,
$$R^{(2)}(\omega) = \Delta^2 \langle \nu(\varepsilon)\,\nu(\varepsilon + \omega) \rangle - 1 . \tag{8}$$
Fig. 3. Diagrammatic expansion of the correlation of the density of states for the disordered system [41]. In the presence of magnetic field (unitary ensemble, $\beta = 2$) the Cooperon diagram (b) is suppressed. In the figure, the symbols $P$ and $Q$ refer to the momenta of the electrons; $P$ is the large momentum, $|P - k_F| \ll k_F$, $Q$ is the small momentum, $|Q| \ll k_F$; $\xi = P^2/2m - \mu$, where $\mu$ is the chemical potential; in two dimensions, the mean free time $\tau$ is related to the diffusion constant $D$ through $D = v_F^2\tau/2$, $v_F$ being the Fermi velocity.
To calculate $R^{(2)}(\omega)$, one substitutes expression (6) for the density of states in terms of the Green functions $G^R$ and $G^A$ into Eq. (8),
$$R^{(2)}(\omega) = \frac{\Delta^2}{2\pi^2}\,\mathrm{Re}\int d\vec r_1\, d\vec r_2\, \langle G^R(\varepsilon+\omega; \vec r_1, \vec r_1)\, G^A(\varepsilon; \vec r_2, \vec r_2)\rangle - 1 . \tag{9}$$
When the dot contains many electrons, which is the case of interest here, the size of the dot and the region of integration in Eq. (9) greatly exceed the Fermi wavelength. In that case, averaging of the products $G^R G^A$ using the diagrammatic technique yields two contributions to the integrand in (9), proportional to the squares of diffusion and Cooperon propagators in coinciding points, respectively, see Fig. 3. The result of such a calculation [41],
$$R^{(2)}(\omega) = \frac{\Delta^2}{\beta\pi^2}\,\mathrm{Re}\sum_{\gamma_n}\frac{1}{(i\omega+\gamma_n)^2} , \tag{10}$$
is expressed in terms of the eigenvalues $\gamma_n$ of the classical diffusion operator,
$$-D\nabla^2 f_n(\vec r) = \gamma_n f_n(\vec r) , \tag{11}$$
supplemented by von Neumann boundary conditions at the boundary of the dot. Here $D$ is the electron diffusion coefficient in the dot, and the Dyson symmetry parameter $\beta = 1\,(2)$ in the presence (absence) of time reversal symmetry. Ensembles of random systems possessing and not possessing this symmetry are called orthogonal and unitary ensembles, respectively. Being derived by means of diagrammatic perturbation theory, Eqs. (10) and (12) below are valid for energy differences $\omega \gg \Delta$ only. The exact results, valid for all $\omega$, are also available, see, e.g., Refs. [43–47]; however, approximation (10) is sufficient for the present discussion. An expression similar to Eq. (10) is believed to be valid for a chaotic and ballistic quantum dot. The only difference with the diffusive case is that instead of the eigenvalues of the diffusion
operator one has to use eigenvalues $\gamma_n$ of a more general (Perron–Frobenius) operator⁴ of the classical relaxational dynamics [49];⁵ in this case the eigenvalues $\gamma_n$ can be complex with $\mathrm{Re}\,\gamma_n > 0$, see also Appendix A. The relation of $R^{(2)}(\omega)$ to the dynamics of a particle in a closed volume allows one to draw an important conclusion regarding the universality of $R^{(2)}(\omega)$. Note that a spatially uniform particle density satisfies the relaxational dynamics or the diffusion equation. Because of particle number conservation, such a solution is time independent. Therefore, the lowest eigenvalue $\gamma_0$ in Eq. (11) must be zero, independent of the shape of the dot and the strength of the disorder,
$$R^{(2)}(\omega) = -\frac{\Delta^2}{\beta\pi^2\omega^2} + \frac{\Delta^2}{\beta\pi^2}\,\mathrm{Re}\sum_{\gamma_n \neq 0}\frac{1}{(i\omega+\gamma_n)^2} . \tag{12}$$
The first term in Eq. (12) is universal. One can immediately see that if $\hbar\omega$ is much smaller than the Thouless energy $E_T \equiv \hbar\,\mathrm{Re}\,\gamma_1$, the remaining terms in Eq. (12) are small compared to the first one and can be neglected. The universal part of the correlation function (12) can be reproduced within a model where Hamiltonian (1) is replaced by a Hermitian matrix with random entries
$$\hat H_F = \sum_{\alpha,\gamma} H_{\alpha\gamma}\, \hat\psi_\alpha^\dagger \hat\psi_\gamma . \tag{13}$$
The coefficients $H_{\alpha\gamma}$ in Eq. (13) form a real ($\beta = 1$) or complex ($\beta = 2$) random Hermitian matrix $H$ of size $M \times M$, belonging to the Gaussian ensemble
$$\langle H_{\alpha\gamma} H_{\alpha'\gamma'} \rangle = \frac{M\Delta^2}{\pi^2}\left[\delta_{\alpha\gamma'}\delta_{\alpha'\gamma} + \left(\frac{2}{\beta} - 1\right)\delta_{\alpha\alpha'}\delta_{\gamma\gamma'}\right], \qquad \beta = 1, 2 . \tag{14}$$
Hamiltonian (13), with the distribution of matrix elements (14), reproduces the universal part of the spectral statistics of the microscopic Hamiltonian (1) in the limit $M \to \infty$ of large matrix size. The condition $\Delta \ll E_T$ provides the significant region $\omega \ll E_T$ where the non-universal part of the two-level correlator $R^{(2)}(\omega)$ can be neglected and where the replacement of the microscopic Hamiltonian by the matrix model (13) is meaningful. This condition may be reformulated as the requirement that the dimensionless conductance $g$ of the dot be large, where
$$g \equiv \frac{E_T}{\Delta} = \frac{\hbar\gamma_1}{\Delta} \gg 1 . \tag{15}$$
A large value of $g$ indicates that the dot can be treated as a good conductor. We now present explicit expressions for the dimensionless conductance $g$ for a disc-shaped dot for two simple models: (i) a dot of radius $R$ exceeding the electron mean free path $l$, and (ii) a ballistic dot of radius $R$ with a boundary which scatters electrons diffusively, see Refs. [51,52] and references therein. For a diffusive dot, one needs to solve Eq. (11) to obtain $\gamma_1 = D x_1/R^2$, where $x_1 \approx 2.40$ is the first zero [53] of the Bessel function, $J_0(x_1) = 0$. Using the electron density of states in
⁴ The corresponding non-linear $\sigma$-model was first suggested in Ref. [48]; however, this classical (Perron–Frobenius) operator was erroneously identified with the Liouville operator.
⁵ This identification does not properly handle the repetitions in periodic orbits [50], and therefore is applicable only for systems where all the periodic orbits are unstable.
two-dimensional electron gas, one finds $\Delta = 2\hbar^2/mR^2$, and the conductance is
$$g = \frac{x_1}{4}\,k_F l = \frac{x_1}{4}\,\frac{2\pi\hbar}{e^2 R_\square} .$$
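The diffusive estimate can be cross-checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates $g = \hbar\gamma_1/\Delta$ from $\gamma_1 = Dx_1/R^2$ and $\Delta = 2\hbar^2/mR^2$ and compares it with the closed form $(x_1/4)k_F l$. It assumes the two-dimensional relation $D = v_F l/2$ (i.e. $l = v_F\tau$) and $k_F = mv_F/\hbar$; all function names and the choice of units are illustrative.

```python
import math

X1 = 2.40  # first zero of J0, the value quoted in the text


def level_spacing_disc(hbar, mass, radius):
    """Mean level spacing of a 2D disc, Delta = 2 hbar^2 / (m R^2)."""
    return 2.0 * hbar**2 / (mass * radius**2)


def g_from_definition(hbar, mass, v_fermi, mean_free_path, radius):
    """g = hbar * gamma_1 / Delta with gamma_1 = D x1 / R^2.

    Assumption: D = v_F * l / 2 in two dimensions (l = v_F * tau).
    """
    diffusion = v_fermi * mean_free_path / 2.0
    gamma_1 = diffusion * X1 / radius**2
    return hbar * gamma_1 / level_spacing_disc(hbar, mass, radius)


def g_closed_form(k_fermi, mean_free_path):
    """Closed form g = (x1 / 4) k_F l; note that the radius drops out."""
    return X1 / 4.0 * k_fermi * mean_free_path
```

With $\hbar = m = 1$, $v_F = 3$ (so $k_F = 3$), $l = 20$ and $R = 100$, both routes give $g = 36$, and changing $R$ leaves the result unchanged.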
In this expression, $k_F$ is the Fermi wave vector, and $R_\square$ is the resistance per square of the two-dimensional electron gas. Note that $g$ is independent of the radius of the dot in the case of diffusive electron motion. For model (ii), the eigenvalue $\mathrm{Re}\,\gamma_1 \approx 0.38\,v_F/R$ of the Liouville operator with the diffusive boundary conditions was found in Ref. [51], and one finds $g = 0.19\,k_F R$.

2.2. Effect of a weak magnetic field on statistics of one-electron states in the dot

Physical properties of a mesoscopic system are manifestly random. The statistics of the random behavior can be studied by a measurement of the [...]
(16)
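Here $H^{s(a)}_{mn}$ is the random realization of independent Gaussian real symmetric (antisymmetric) $M \times M$ matrices,
$$\langle H^{s(a)}_{\alpha\gamma} H^{s(a)}_{\alpha'\gamma'} \rangle = \frac{M\Delta^2}{\pi^2}\left[\delta_{\alpha\alpha'}\delta_{\gamma\gamma'} \pm \delta_{\alpha\gamma'}\delta_{\alpha'\gamma}\right] , \tag{17}$$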
where the “+” (“−”) sign corresponds to the symmetric (antisymmetric) part of the Hamiltonian. In Eq. (16), $X$ is a real parameter proportional to the magnetic field (the precise relation between $X$ and $B$ is given below), so that the time reversal symmetry is preserved,
$$H_{\alpha\gamma}(B) = H_{\gamma\alpha}(-B) . \tag{18}$$
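The prefactor $(1 + X^2)^{-1/2}$ in Eq. (16) is chosen in such a way that the mean level spacing remains unaffected by the magnetic field. The correlation function of elements of Hamiltonian (16) at different values of the magnetic field $B_{1,2}$ can be conveniently written similarly to Eq. (14) as
$$\langle H_{\alpha\gamma}(B_1)\, H_{\alpha'\gamma'}(B_2)\rangle = \frac{M\Delta^2}{\pi^2}\left[\delta_{\alpha\gamma'}\delta_{\alpha'\gamma}\left(1 - \frac{N_h^D}{4M}\right) + \delta_{\alpha\alpha'}\delta_{\gamma\gamma'}\left(1 - \frac{N_h^C}{4M}\right)\right] . \tag{19}$$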
The dimensionless quantities $N_h^{D,C}$, which characterize the effect of the magnetic field on the wave functions and the spectrum of the closed dot, are related to the original parameters $X$ as
$$N_h^D = 2M(X_1 - X_2)^2 , \qquad N_h^C = 2M(X_1 + X_2)^2 ,$$
where the normalization is chosen in such a way that the size of the matrix $M$ does not enter into physical quantities in the physically relevant limit $M \gg 1$. The relationship between the parameter $X$ and the real magnetic field applied to the dot is best expressed with the help of $N_h^D$ and $N_h^C$,
$$N_h^D = \chi g\left(\frac{\Phi_1 - \Phi_2}{\Phi_0}\right)^2 , \qquad N_h^C = \chi g\left(\frac{\Phi_1 + \Phi_2}{\Phi_0}\right)^2 , \tag{20}$$
where $g \gg 1$ is the dimensionless conductance of the closed dot, see Eq. (15), $\Phi_{1(2)}$ is the magnetic [...]
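$$\langle \psi^*_\alpha(i)\,\psi_\gamma(j)\rangle = \frac{\delta_{ij}\delta_{\alpha\gamma}}{M} , \qquad \langle \psi_\alpha(i)\,\psi_\gamma(j)\rangle = \left(\frac{2}{\beta} - 1\right)\frac{\delta_{ij}\delta_{\alpha\gamma}}{M} . \tag{21}$$
In other words, an eigenvector can be represented as
$$\psi_\alpha(n) = \frac{1}{\sqrt{M}}\,\frac{1}{\sqrt{1 + t^2}}\,(u_n + i t v_n) , \tag{22}$$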
where $M$ is the size of the matrix, $u$, $v$ are independent real Gaussian variables, $\langle u_n u_m\rangle = \delta_{nm}$, $\langle v_n v_m\rangle = \delta_{nm}$, $\langle u_n v_m\rangle = 0$, and $t = 0\,(1)$ for the orthogonal (unitary) ensemble [59,54]. Eq. (22) is a consequence of the symmetry of the distribution of the random matrices with respect to an arbitrary rotation of the basis. At the crossover between those two ensembles, this symmetry no longer exists, which leads to a departure of the distribution of $\psi_\alpha(n)$ from a (real or complex) Gaussian [60,61], and to correlations of the values of the wave functions at different sites [62,63]. It was noticed in Refs. [64,65] that the crossover between the orthogonal and unitary ensembles can be described by using decomposition (22) where the parameter $t$ is no longer fixed at the extremal values $t = 0$ or $1$, but a [...]
function [60 – 63]
2 NhC NhC 1 − t4 1 − t2 exp − W (t) = 8 t3 16 t
2
NhC NhC 1 + t2 4 + − 1− ; × 4 2t 4 NhC
(x) =
0
1
dy exp[ − x(1 − y2 )] ;
(23)
where $N_h^C$ is given by Eq. (20) with $\Phi_1 = \Phi_2 \equiv \Phi$, being the magnetic [...]
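$$\langle G^R_{jk}(\varepsilon + \omega; B_1)\, G^A_{lm}(\varepsilon; B_2)\rangle = \frac{2\pi}{M^2\Delta}\left[\frac{\delta_{jm}\delta_{kl}}{-i\omega + N_h^D\Delta/2\pi} + \frac{\delta_{jl}\delta_{km}}{-i\omega + N_h^C\Delta/2\pi}\right] , \tag{25}$$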
where $N_h^C$ and $N_h^D$ are defined in Eq. (19). The irreducible averages $\langle G^A G^A\rangle$ or $\langle G^R G^R\rangle$ are smaller than Eq. (25) by a factor of $\omega/(M\Delta)$ and can be disregarded in the thermodynamic limit $M \to \infty$. All the higher moments can be expressed in terms of the second moments (25) with the help of the Wick theorem.⁶ Eq. (25) can serve as a starting point to relate the quantities $N_h^D$, $N_h^C$ to the real magnetic field, thus providing a derivation of Eq. (20). To establish such a relation, one needs to recalculate average (25) using the microscopic Hamiltonian (1). The result is again given by Eq. (10), where now the magnetic field is introduced into the diffusion equation of Eq. (11) by the replacement of $\partial/\partial\vec r$ by the covariant derivative $\partial/\partial\vec r + i(e/c\hbar)\vec A_\pm(\vec r)$, and a modification of the boundary condition to ensure that the particle [...]
⁶ It is noteworthy that such a decomposition is legitimate only at energy scales greatly exceeding the level spacing $\Delta$. It is not valid for the calculation of the properties of a single wave function.
Fig. 4. (a) Analytic expressions for the lines on the diagrams; (b) diagrams for the self-energy $\hat\Sigma$. The second term in the self-energy includes an intersection of the dashed lines and it is smaller than the first term by a factor $1/M$. (c), (d) Diagrammatic representation for irreducible averages (25).
Fig. 5. Example of the self-returning trajectories.
We close this subsection with a more elaborate discussion of Eq. (25) and its physical interpretation. To this end, we consider the case of equal magnetic fields $B_1$ and $B_2$, i.e., $N_h^D = 0$. Then, one sees from Eq. (25) that a weak magnetic field introduces a new energy scale
$$\frac{\hbar}{\tau_h} = \frac{\Delta N_h^C}{2\pi} . \tag{26}$$
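To make the scales concrete, the sketch below evaluates the crossover parameters of Eq. (20) and the energy scale of Eq. (26) for given fluxes. It is only an illustration: the geometry-dependent numerical prefactor of Eq. (20) is passed in as `chi`, fluxes are measured in units of the flux quantum by default, and the function names are hypothetical.

```python
import math


def crossover_parameters(chi, g, flux_1, flux_2, flux_quantum=1.0):
    """Return (N_h^D, N_h^C) from Eq. (20).

    chi is the geometry-dependent numerical prefactor of Eq. (20), taken
    here as an input; g is the dimensionless conductance of the closed dot.
    """
    n_d = chi * g * ((flux_1 - flux_2) / flux_quantum) ** 2
    n_c = chi * g * ((flux_1 + flux_2) / flux_quantum) ** 2
    return n_d, n_c


def magnetic_energy_scale(n_c, level_spacing):
    """hbar / tau_h = Delta * N_h^C / (2 pi), Eq. (26)."""
    return level_spacing * n_c / (2.0 * math.pi)
```

For equal fluxes the diffuson parameter vanishes, $N_h^D = 0$, while $N_h^C$ grows as $g\Phi^2$: with $g = 100$ and `chi = 1`, a flux of only $0.05\,\Phi_0$ already gives $N_h^C = 1$, i.e. $\hbar/\tau_h = \Delta/2\pi$.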
If this scale is smaller than the Thouless energy $E_T$, the universal description holds, and we find that the system under consideration is at the crossover between the orthogonal and unitary ensembles. Phenomena associated with energy scales smaller than $\hbar/\tau_h$ are described effectively by the unitary ensemble, whereas phenomena associated with larger energy scales can still be approximately described by the orthogonal ensemble. This conclusion allows for a simple semiclassical interpretation of relation (20) between $N_h^C$ and the real magnetic field $B$, as we now explain. Consider a classical trajectory of an electron starting from a point $\vec r$ and returning to the same point, see Fig. 5. The quantum mechanical amplitude for this process contains an oscillating term $e^{iS_{cl}/\hbar}$, where $S_{cl}$ is the classical action along this trajectory. In a weak magnetic field the classical trajectory does not change, while the classical action acquires an Aharonov–Bohm phase:
$$S_{cl} \to S_{cl} + \frac{eB}{c}A_{cl} , \tag{27}$$
where $A_{cl}$ is the directed area swept by the trajectory, see Fig. 5. If the time $t$ it takes for the electron to return along the trajectory is of the order of the ergodic time $\hbar/E_T$, the characteristic value of this area is of the order of the geometrical area of the dot, $|A_{cl}| \simeq A_{dot}$ (the meaning of the Thouless energy $E_T$ is discussed in previous subsections). If, however, $t \gg \hbar/E_T$, the electron trajectory covers the dot $N_t \simeq tE_T/\hbar$ times. Each winding through the dot adds a value of the order of $A_{dot}$ to $A_{cl}$; these contributions, however, are of random signs (an example is shown in Fig. 5). Therefore, the total area accumulated scales with $\sqrt{N_t}$,
$$|A_{cl}| \simeq A_{dot}\sqrt{N_t} \simeq A_{dot}\left(\frac{tE_T}{\hbar}\right)^{1/2} . \tag{28}$$
The orthogonal ensemble differs from the unitary ensemble in that, in the orthogonal ensemble, the quantum mechanical amplitudes corresponding to a pair of time reversed trajectories have the same phase, while these phases are unrelated for the unitary ensemble. For a small magnetic field, this means that the trajectories which acquire an Aharonov–Bohm phase smaller than unity still effectively belong to the orthogonal ensemble, whereas those which acquire a larger phase are described by the unitary ensemble. To obtain the characteristic time scale separating these two regimes, we require
$$\frac{eB|A_{cl}|}{c} \simeq \hbar . \tag{29}$$
Substituting estimate (28) into Eq. (29), and using $\Phi = BA_{dot}$, one finds
$$\frac{1}{t} \simeq \frac{E_T}{\hbar}\left(\frac{\Phi}{\Phi_0}\right)^2 . \tag{30}$$
Up to a numerical coefficient, this estimate coincides with the energy scale given by Eqs. (26) and (20). The small energy scale physics is governed by long trajectories with a typical accumulated [...]
are proportional to the inverse dimensionless conductance $1/g$, and thus small [66–70]. As a result, Hamiltonian (31) will be separated into two pieces:
$$\hat H_{int} = \hat H^{(0)}_{int} + \hat H^{(1/g)}_{int} . \tag{33}$$
The first term here is universal, does not depend on the geometry of the dot, and it does not [...]
(35)
The three terms in Hamiltonian (35) have different meanings. The first two terms represent the dependence of the energy of the system on the total number of electrons and total spin, respectively. Because both total charge and spin commute with the free-electron Hamiltonian, these two terms do not have any dynamics for a closed dot. We will see that the situation will change with the opening of contacts to the leads. Finally, the third term corresponds to the interaction in the Cooper channel, and it does not commute with the free-electron Hamiltonian (1) or (13). This term is renormalized if one considers contributions in higher order perturbation theory in the interaction. For an attractive interaction, $J_c < 0$, the renormalization enhances this interaction, eventually leading to the superconducting instability. (We will not consider the case of an attractive interaction here; for a recent review on the physics of small superconducting grains, see Ref. [71]. Effects in larger grains, $E_c$, are reviewed in Ref. [72].) For the repulsive case, $J_c > 0$, this term renormalizes logarithmically to zero. We will be dealing with the latter case throughout the paper. Constants in Hamiltonian (35) are model-dependent. In the remainder of this subsection we will show how to calculate them for some particular interactions and discuss the structure of the non-universal part $\hat H^{(1/g)}_{int}$.
At this point it is important to mention that the universal Hamiltonian (35) is defined within the Hilbert space of one-electron states with energies of the order of or less than the Thouless energy. The matrix elements of this Hamiltonian are not just the matrix elements of the interaction potential (32). This potential is defined in a much wider energy strip which includes one-electron states with energies $\sim E_F$. Virtual transitions between the low-energy ($\lesssim E_T$) sector and these high-energy states renormalize the matrix elements of the universal Hamiltonian. It turns out [13] that the third term in Hamiltonian (35), corresponding to the Cooper channel of interaction, can be strongly renormalized by such virtual transitions. In order to avoid this complication, we will consider the case of the unitary ensemble in the discussion below, where the “bare” matrix elements corresponding to the Cooper channel are already suppressed by a weak magnetic field (it is sufficient to thread a [...]
At given energy j, at most one eigenstate contributes to the sum in Eq. (36). Furthermore, it is known that there is no correlation between the statistics of levels and that of wave functions in the lowest order in 1=g, see, e.g., Ref. [41], so we can neglect the level correlations and average the -function in Eq. (36) independently. As a result, we can estimate 1 j+=2 (˜r1 ) (˜r2 )∗ B(=2 − |j − j|) ≈ d j1 [G(j1 ;˜r1 ;˜r2 )]− ; ; (37) 2 j−=2 where we introduced the notation [G(j;˜r1 ;˜r2 )]− = − i[GA (j;˜r1 ;˜r2 ) − GR (j;˜r1 ;˜r2 )] :
(38)
Let us 8rst calculate the average of the matrix element (32). Because of the randomness of the wave functions, the corresponding product in Eq. (32) does not vanish only if its indices are equal pairwise. It is then readily expressed with the help of Eq. (37), 2 (˜r1 )! (˜r2 )∗( (˜r3 )∗ (˜r4 ) 2 =2 =2 1 = 2 d j1 d j2 [ !( [G(j + j1 ;˜r1 ;˜r4 )]− [G(j! + j2 ;˜r2 ;˜r3 )]− 4 −=2 −=2 + ( ! [G(j + j1 ;˜r1 ;˜r3 )]− [G(j! + j2 ;˜r2 ;˜r4 )]− ] :
(39)
In deriving Eq. (39) for = ! we have used the relation (˜r1 ) (˜r2 )∗ ! (˜r1 )! (˜r2 )∗ B(=2 − |j − j|)B(=2 − |j! − j|)
= (˜r1 ) (˜r2 )∗ ! (˜r1 )! (˜r2 )∗ B(=2 − |j − j|)B(=2 − |j! − j|) 2 (˜r1 ) (˜r2 )∗ ! (˜r1 )! (˜r2 )∗ : 2 Result (39) can be justi8ed also for = ! and kF |˜r1 − ˜r2 |1 (with kF being the Fermi wave vector), see Ref. [68]. The corrections to Eq. (39) are of the order or smaller than 1=g2 , and will be neglected since we are interested in the leading in 1=g terms. (0) In the leading approximation in 1=g needed to derive Hˆ int , the Green function entering into the products in the above formula may be averaged independently. Substituting 2 ˜ [G(j + j1 ;˜r1 ;˜r2 )]− = F(kF |˜r1 − ˜r2 |); F(kF |˜r |) = eik·˜r FS (40) TVd with : : :FS denoting the average over the electron momentum on the Fermi surface, into Eq. (39), we 8nd =
$$V_d^2\,\langle \phi_\alpha(\vec r_1)\,\phi_\beta(\vec r_2)\,\phi^*_\gamma(\vec r_3)\,\phi^*_\delta(\vec r_4)\rangle^{(0)} = \delta_{\alpha\delta}\delta_{\beta\gamma}F_{14}F_{23} + \delta_{\alpha\gamma}\delta_{\beta\delta}F_{13}F_{24} , \tag{41}$$
where we introduced the short-hand notation
$$F_{ij} \equiv F(k_F|\vec r_i - \vec r_j|) , \tag{42}$$
and $V_d$ is the volume of a $d$-dimensional grain. Eq. (41) does not depend on the disorder strength and corresponds to the universal limit. The coordinate dependence in function $F$ indicates simply that all the plane waves that are allowed by the conservation of energy are represented in the wave function.

2.3.1. Universal description for the case of short-range interaction

We start with the model case of a weak short-range interaction
$$V(\vec r) = \lambda\Delta V_d\,\delta(\vec r) , \tag{43}$$
to illustrate the principle, and then discuss the realistic long-range Coulomb interaction. In Eq. (43), $V_d$ is the volume of the dot in dimension $d = 3$ or its area for $d = 2$, and $\lambda \ll 1$ is the interaction constant. Averaging of the matrix elements (32) with the help of Eq. (41) allows us to derive the Hamiltonian $\hat H^{(0)}_{int}$ from Eq. (31). Indeed, such averaging yields
$$H^{(0)}_{\alpha\beta\gamma\delta} = \lambda\Delta\,(\delta_{\alpha\delta}\delta_{\beta\gamma} + \delta_{\alpha\gamma}\delta_{\beta\delta}) . \tag{44}$$
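Now we substitute Eq. (44) into Eq. (31) and rearrange the summation over spin indices with the help of the identity
$$2\,\delta_{\sigma_1\sigma_2}\delta_{\sigma_3\sigma_4} = \delta_{\sigma_1\sigma_3}\delta_{\sigma_2\sigma_4} + \vec\sigma_{\sigma_1\sigma_3}\cdot\vec\sigma_{\sigma_4\sigma_2} ,$$
where $\vec\sigma = (\hat\sigma_x, \hat\sigma_y, \hat\sigma_z)$ and $\hat\sigma_k$ are the Pauli matrices in the spin space. As a result, the interaction Hamiltonian takes the universal form (34), (35), with the coupling constants
$$E_c = \tfrac{1}{4}\lambda\Delta , \qquad J_S = -\lambda\Delta , \qquad J_c = 0 . \tag{45}$$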
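The spin identity used in this rearrangement is easy to verify by brute force. The sketch below, which is only an illustrative check and not part of the derivation, enumerates all sixteen combinations of the spin indices $\sigma_1, \ldots, \sigma_4$ and compares both sides; the helper names are hypothetical.

```python
import itertools

# Pauli matrices sigma_x, sigma_y, sigma_z as nested lists; index 0 = up, 1 = down.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kronecker(a, b):
    return 1 if a == b else 0


def left_side(s1, s2, s3, s4):
    """2 * delta_{s1 s2} * delta_{s3 s4}."""
    return 2 * kronecker(s1, s2) * kronecker(s3, s4)


def right_side(s1, s2, s3, s4):
    """delta_{s1 s3} delta_{s2 s4} + sum_k (sigma_k)_{s1 s3} (sigma_k)_{s4 s2}."""
    spin_term = sum(PAULI[k][s1][s3] * PAULI[k][s4][s2] for k in range(3))
    return kronecker(s1, s3) * kronecker(s2, s4) + spin_term


def identity_holds():
    """True if both sides agree for every choice of the four spin indices."""
    return all(
        abs(left_side(*idx) - right_side(*idx)) < 1e-12
        for idx in itertools.product(range(2), repeat=4)
    )
```

Calling `identity_holds()` returns `True`; note the index order $\sigma_4\sigma_2$ in the second Pauli factor, which is what allows the exchange term to be rewritten through the total spin.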
Fig. 6. Diagrammatic expansion for the correlation of the Green functions of the disordered system $\langle G^R G^A\rangle$.
The vanishing of $J_c$ is a feature of the unitary ensemble, see also Appendix B. We see that for a weak short-range repulsive interaction all the matrix elements in the universal part of the Hamiltonian are smaller than the level spacing. Later we will see that for the Coulomb interaction this is not the case, and the constant $E_c$ is large compared with $\Delta$. We derived the universal Hamiltonian (35) in the leading approximation, which corresponds to the limit $1/g \to 0$. In this approximation, only the “most diagonal” matrix elements of the interaction Hamiltonian are finite. In the first order in $1/g$, there are two types of corrections $H^{(1/g)}_{\alpha\beta\gamma\delta}$ to the universal Hamiltonian. First, the matrix elements of the diagonal part of the Hamiltonian, Eq. (35), acquire a correction $\propto 1/g$. This correction exhibits mesoscopic [...]
(0)
H!( = H!( − H!( :
(46)
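To this end, we have to find the $1/g$ contribution to the averages of the products (39) of wave functions. This contribution can be found from the corresponding irreducible product of the Green functions. Using diagrams for the diffusive systems, see Fig. 6, we find
$$\langle [G(\varepsilon + \omega; \vec r_1, \vec r_2)]_-\,[G(\varepsilon; \vec r_3, \vec r_4)]_-\rangle_{ir} = 2\,\mathrm{Re}\,\langle G^R(\varepsilon + \omega; \vec r_1, \vec r_2)\,G^A(\varepsilon; \vec r_3, \vec r_4)\rangle_{ir} = \frac{4\pi}{\Delta V_d}\,F_{14}F_{23}\sum_{\gamma_n > 0}\frac{\gamma_n f_n(\vec r_1) f_n(\vec r_2)}{\omega^2 + \gamma_n^2} . \tag{47}$$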
Here $\gamma_n$ and $f_n$ are defined in Eq. (11), and the normalization condition
$$\int d\vec r\, f_n(\vec r) f_m(\vec r) = \delta_{nm} \tag{48}$$
is imposed. Substituting Eq. (47) into Eq. (39), and using the condition (n we will obtain the 1=g correction to the average product of the wave functions (41): (n fn (˜r1 )fn (˜r4 ) (˜r1 )! (˜r2 )∗( (˜r3 )∗ (˜r4 )(1=g) = !( F13 F24 Vd (j − j! )2 + (2n (n ¿0 (n fn (˜r1 )fn (˜r3 ) + ( ! F14 F23 ; (49) Vd (j − j! )2 + (2n (n ¿0
where we used the short-hand notation (42).
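The function $F$ behind the short-hand (42) is defined in Eq. (40) as the average of the plane wave $e^{i\vec k\cdot\vec r}$ over the Fermi surface. As an illustration only, the sketch below evaluates this average for the two dimensionalities considered in the text: in $d = 3$ the angular average has the standard closed form $\sin x/x$, while in $d = 2$ it is computed by direct averaging over the angle (which reproduces the Bessel function $J_0(x)$). Here $x = k_F|\vec r|$; the function names and the number of quadrature points are arbitrary choices.

```python
import math


def fermi_surface_average_3d(x):
    """<exp(i k.r)> over the sphere |k| = k_F, with x = k_F |r|: sin(x)/x."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def fermi_surface_average_2d(x, n_angles=2000):
    """<exp(i k.r)> over the circle |k| = k_F, by midpoint averaging over the angle.

    The imaginary part cancels by symmetry, so only cos(x cos(theta)) is summed.
    """
    total = 0.0
    for j in range(n_angles):
        theta = 2.0 * math.pi * (j + 0.5) / n_angles
        total += math.cos(x * math.cos(theta))
    return total / n_angles
```

Both functions equal 1 at $x = 0$ and decay with oscillations on the scale of the Fermi wavelength, which is why products such as $F_{14}F_{23}$ effectively pin the coordinates pairwise.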
Fig. 7. Diagrammatic expansion for the
We are interested in the interactions between the electrons in states sufficiently close to the Fermi surface (within an energy strip $\sim E_T$ around the Fermi level). This means that the differences of one-electron eigenenergies entering the denominators of Eq. (49) are much smaller than $\gamma_n$ and can be neglected. Substituting Eq. (49) into Eqs. (32) and (43), and assuming $|\varepsilon_\alpha - \varepsilon_\beta| \ll E_T$, we find
$$\langle H^{(1/g)}_{\alpha\beta\gamma\delta}\rangle = c_1\lambda\,\frac{\Delta}{g}\,(\delta_{\alpha\delta}\delta_{\beta\gamma} + \delta_{\alpha\gamma}\delta_{\beta\delta}) , \qquad c_1 = \frac{1}{\pi}\sum_{\gamma_n > 0}\frac{\gamma_1}{\gamma_n} , \tag{50}$$
where the dimensionless conductance of the dot $g$ is defined in Eq. (15). Thus, we have shown that in the metallic regime, $g \gg 1$, the non-universal corrections to the average interaction matrix elements are small indeed. Now we turn to the mesoscopic [...]
2
2 (1 2 (1=g) 2 2 [H!( ] = c2 D ; c2 = 2 : (51) g (n (m =0
(This result is for the generic case where all indices $\alpha$, $\beta$, $\gamma$, and $\delta$ are different. If they are not all different, the [...]
The constant $\kappa$ does not account for the screening provided by the electrons of the dot. This screening will be considered in detail below.
Fig. 8. Matrix element in random phase approximation. The polarization operator $\Pi$ involves all the transitions where at least one of the states $\gamma$, $\delta$ is outside the energy strip of the width of the Thouless energy around the Fermi level.
For the simplest geometry, the corresponding matrix elements $H_{\alpha\beta\gamma\delta}$ are inversely proportional to the linear size of the dot, rather than to its volume (or area in dimension $d = 2$). If the linear size of the dot exceeds the screening radius (or the effective Bohr radius $a_B = \kappa\hbar^2/e^2m^*$ in $d = 2$), the matrix elements $H_{\alpha\beta\gamma\delta}$ exceed $\Delta$, and the lowest-order perturbation theory in the interaction Hamiltonian fails. In the theory of linear screening, it is well known how to deal with this difficulty. When calculating an observable quantity, one should take into account the virtual transitions of low-energy electrons into the high-energy states. If the gas parameter
$$r_s \equiv \frac{e^2}{\kappa\hbar v_F} \tag{53}$$
(with $v_F$ being the electron Fermi velocity) of the electron system in the dot is small, then the random phase approximation (RPA) allows one to adequately account for such virtual transitions. Accounting for these transitions yields, in general, a retarded electron–electron interaction [18]. However, the characteristic scale of the corresponding frequency dependence is of the order of $\gamma_1$, see Eq. (11). That is why at the energy scale $|\varepsilon| \lesssim E_T$ we can consider the interaction as instantaneous, and derive the effective interaction Hamiltonian acting in this truncated space. A modified matrix element in the RPA scheme is shown in Fig. 8. It involves substitution of the bare potential (52) in Eq. (32) with the renormalized potential $V_{sc}(\vec r_1, \vec r_2)$. This potential is the solution of the equation
$$V_{sc}(\vec r_1, \vec r_2) = V(\vec r_1 - \vec r_2) - \int d\vec r_3\, d\vec r_4\, V(\vec r_1 - \vec r_3)\,\Pi(\vec r_3, \vec r_4)\,V_{sc}(\vec r_4, \vec r_2) , \tag{54}$$
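where the integration over the intermediate coordinates is performed within the dot only. The polarization operator $\Pi$ should include only the states where at least one electron (or hole) is outside the energy strip of width $\varepsilon^* \lesssim E_T$ for which we derive the effective Hamiltonian,
$$\Pi(\vec r_1, \vec r_2) = 2\sum_{|\varepsilon_\beta - \varepsilon_\alpha| > \varepsilon^*}\theta(-\varepsilon_\alpha\varepsilon_\beta)\,\frac{\phi_\alpha(\vec r_1)\,\phi^*_\beta(\vec r_1)\,\phi_\beta(\vec r_2)\,\phi^*_\alpha(\vec r_2)}{|\varepsilon_\beta - \varepsilon_\alpha|} . \tag{55}$$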
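Here all energies are measured from the Fermi level, and $\theta(x)$ is the step function. The factor of two accounts for spin degeneracy. We can compare Eq. (55) with the usual frequency-dependent operator $\Pi^R(\omega; \vec r_1, \vec r_2)$ that involves all the electron states,
$$\Pi^R(\omega; \vec r_1, \vec r_2) = 2\sum_{\varepsilon_\beta, \varepsilon_\alpha}\frac{\phi_\alpha(\vec r_1)\,\phi^*_\beta(\vec r_1)\,\phi_\beta(\vec r_2)\,\phi^*_\alpha(\vec r_2)}{\varepsilon_\beta - \varepsilon_\alpha - \omega - i0}\,\theta(-\varepsilon_\alpha\varepsilon_\beta)\,\mathrm{sgn}\,\varepsilon_\beta , \tag{56}$$
and obtain
$$\Pi(\vec r_1, \vec r_2) = \frac{1}{\pi}\,\mathrm{Im}\int\frac{d\omega}{\omega}\,\Pi^R(\omega; \vec r_1, \vec r_2)\,\theta(|\omega| - \varepsilon^*) . \tag{57}$$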
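In the next step, we replace the polarization operator with its average value [18]
$$\langle \Pi^R(\omega; \vec r_1, \vec r_2)\rangle = \frac{2}{\Delta V_d}\sum_{\gamma_n > 0}\frac{\gamma_n f_n(\vec r_1) f_n(\vec r_2)}{-i\omega + \gamma_n} , \tag{58}$$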
where $\gamma_n$ and $f_n$ for a diffusive system are defined in Eq. (11). The prefactor in Eq. (58) is nothing but the thermodynamic density of states per unit volume (area).⁸ Substituting Eq. (58) into Eq. (57) and taking into account that $\varepsilon^* \lesssim \hbar\gamma_1$, we find
$$\Pi(\vec r_1, \vec r_2) = \frac{2}{\Delta V_d}\sum_{\gamma_n > 0} f_n(\vec r_1) f_n(\vec r_2) . \tag{59}$$
Now we can use the completeness of the solution set of the diffusion equation,
$$\sum_n f_n(\vec r_1) f_n(\vec r_2) = \delta(\vec r_1 - \vec r_2) ,$$
and the explicit form of the zero-mode solution $f_0(\vec r) = \theta_{dot}(\vec r)/\sqrt{V_d}$ for this equation, in order to present Eq. (59) in the form
$$\Pi(\vec r_1, \vec r_2) = \frac{2}{\Delta V_d}\left[\delta(\vec r_1 - \vec r_2) - \frac{1}{V_d}\,\theta_{dot}(\vec r_1)\,\theta_{dot}(\vec r_2)\right] . \tag{60}$$
Here $\theta_{dot}(\vec r) = 1$ if $\vec r$ belongs to the dot and $\theta_{dot}(\vec r) = 0$ otherwise. The structure of Eq. (60) is easy to understand. The first term in brackets characterizes local screening in the Thomas–Fermi approximation. The last term in brackets subtracts the constant-potential contribution of the zero mode, which cannot induce electron transitions between the levels and therefore cannot be screened.
⁸ The mesoscopic [...]
Substituting the polarization operator (60) in Eq. (54), we find the equation for the self-consistent potential in the Thomas–Fermi approximation,
$$V_{sc}(\vec r_1, \vec r_2) = V(\vec r_1 - \vec r_2) - \frac{2}{\Delta V_d}\int d\vec r_3\, V(\vec r_1 - \vec r_3)\left[V_{sc}(\vec r_3, \vec r_2) - \frac{1}{V_d}\int d\vec r_4\, V_{sc}(\vec r_4, \vec r_2)\right] . \tag{61}$$
The use of the Thomas–Fermi approximation is justified if the linear size of the dot exceeds the screening radius. The integrals here are taken over the volume of the dot (or over the corresponding area in the 2d case). We can rewrite the integral equation (61) in the more familiar differential form,
$$-\nabla^2_{\vec r}\,V_{sc}(\vec r, \vec r_1) = \frac{4\pi e^2}{\kappa}\left[\delta(\vec r - \vec r_1) - \frac{2}{\Delta V_d}\,\theta_{dot}(\vec r)\,V_{sc}(\vec r, \vec r_1) + \frac{2}{\Delta V_d^2}\,\theta_{dot}(\vec r)\int d\vec r_2\, V_{sc}(\vec r_2, \vec r_1)\,\theta_{dot}(\vec r_2)\right] , \tag{62}$$
supplemented with the requirement that the solution vanishes at $r \to \infty$. In the case of a 2d dot, the $\nabla^2$ operator still acts in the three-dimensional space; the right-hand side of Eq. (62) must be multiplied by $\delta(z)$, where the coordinate $z$ is directed normal to the plane of the dot. The second term in the right-hand side of Eq. (62) represents the familiar Thomas–Fermi screening. In the three-dimensional case, this term is $\propto V_{sc}(\vec r)/r_D^2$, with the screening radius $r_D$. First, we find an approximate solution of Eq. (62) inside the dot. We consider only length scales larger than the screening radius. This enables us to neglect the left-hand side of Eq. (62) altogether. In addition, we will neglect the corrections of the order of the level spacing, which are generated by the last term in the right-hand side of Eq. (62). Then the solution takes the form
$$V_{sc}(\vec r, \vec r_1) = \frac{\Delta V_d}{2}\,\delta(\vec r - \vec r_1) + \bar V\,\theta_{dot}(\vec r)\,\theta_{dot}(\vec r_1) , \tag{63}$$
where the constant $\bar V$ will be determined later. The easiest way to find $\bar V$ is to consider Eq. (62) for $\vec r$ outside the dot (keeping $\vec r_1$ inside the dot). The right-hand side of Eq. (62) then vanishes,
$$\nabla^2_{\vec r}\,V_{sc}(\vec r, \vec r_1) = 0 . \tag{64}$$
The constant $\bar V$ defines for this Laplace equation the boundary condition at the surface of the dot,
$$V_{sc}(\vec r, \vec r_1)|_{\vec r \in S} = \bar V \tag{65}$$
(the surface is defined unambiguously in the limit of zero screening radius). After the solution $V_{sc}(\vec r, \vec r_1)$ of the Laplace equation (64) is found, the constant $\bar V$ can be obtained by integration
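of Eq. (62) along the surface of the dot,
$$-\oint_S d\vec S\cdot\nabla V_{sc}(\vec r, \vec r_1) = \frac{4\pi e^2}{\kappa} . \tag{66}$$
For the two-dimensional case, Eq. (66) obviously reduces to
$$\int d\vec r\,\overleftrightarrow{\partial}_z V_{sc}(\vec r, \vec r_1) = \frac{4\pi e^2}{\kappa} , \qquad \overleftrightarrow{\partial}_z \equiv \left.\frac{\partial}{\partial z}\right|_{z \to -0} - \left.\frac{\partial}{\partial z}\right|_{z \to +0} , \tag{67}$$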
where the two-dimensional integration is performed over the area of the dot ($z = 0$). Now the solution of Eqs. (64) and (65) can be formally found with the help of the Green function of the Laplace equation,
$$-\nabla^2_{\vec r_1} D(\vec r_1, \vec r_2) = \delta(\vec r_1 - \vec r_2) , \qquad D(\vec r_1, \vec r_2)|_{\vec r_1 \in S} = 0 , \tag{68}$$
describing the electrostatic field outside the dot. If the system contains metallic gates, Eq. (68) should be supplied with additional Dirichlet boundary conditions on the surface $S_i$ of those gates,
$$D(\vec r_1, \vec r_2)|_{\vec r_1 \in S_i} = 0 , \tag{69}$$
where the index $i$ enumerates the corresponding gates. One immediately finds from Eqs. (65) and (68)
$$V_{sc}(\vec r, \vec r_1) = \bar V\oint_S d\vec S_2\cdot\nabla_2 D(\vec r, \vec r_2)\,[1 - \theta_{dot}(\vec r)] \tag{70}$$
for the potential $V_{sc}$ outside the dot. Substitution of Eq. (70) into Eq. (66) yields
$$\bar V = \frac{e^2}{C} \tag{71}$$
for the constant $\bar V$. The geometrical capacitance of the dot is given by a formula valid for any shape of the dot,
$$C = \frac{\kappa}{4\pi}\oint_S d\vec S_2\cdot\nabla_2\oint_S d\vec S_1\cdot\nabla_1 D(\vec r_1, \vec r_2) . \tag{72}$$
In a two-dimensional system, Eq. (72) is replaced with
$$C = \frac{\kappa}{4\pi}\int d\vec r_1\, d\vec r_2\,\overleftrightarrow{\partial}_{z_1}\overleftrightarrow{\partial}_{z_2} D(\vec r_1, \vec r_2) , \tag{73}$$
where the two-dimensional integration is performed over the area of the dot ($z = 0$), and the operator $\overleftrightarrow{\partial}_z$ is defined in Eq. (67). The concrete value of the capacitance $C$ is geometry-dependent. If the size of the dot is characterized by a single parameter $L$, then $C$ is proportional to $L$ with some coefficient, which depends on details of the geometry. In the practically important case of a gated dot, the capacitance is $C \simeq \kappa L^2/4\pi d$, where $L^2$ is the area of the dot, and $d \lesssim L$ is the distance from the gate to the plane of the dot.
The charging energy, $e^2/2C$, is the dominant energy scale for the matrix elements of the interaction Hamiltonian, as we will now show. Corrections to this scale appear from terms of the order of $\Delta$ in the effective interaction potential $V_{sc}$. In order to find the matrix elements $V_{\alpha\beta\gamma\delta}$, we need to know the screened potential $V_{sc}(\vec r_1, \vec r_2)$ within the dot. The constant part of this potential, $\sim e^2/C$, contributes only to the “diagonal” matrix elements (32), with $\alpha = \delta$, $\beta = \gamma$. This part does not contribute to any other matrix element, because of the orthogonality of the corresponding one-electron wave functions. In order to find the non-diagonal matrix elements, we need to account for smaller but coordinate-dependent terms in $V_{sc}(\vec r_1, \vec r_2)$. To accomplish that goal, we substitute potential (63), (70) into the left-hand side of Eq. (62) and treat it in the first order of perturbation theory in $\Delta C/e^2 \ll 1$. As a result, we obtain with the help of Eq. (71)
$$V_{sc}(\vec r_1, \vec r_2) = \frac{e^2}{C} + \frac{\Delta V_d}{2}\,\delta(\vec r_1 - \vec r_2) + \tilde V(\vec r_1) + \tilde V(\vec r_2) . \tag{74}$$
Here, the potential $\tilde V(\vec r)$ appears due to the finite size of the dot: a charge e "expelled" from the point $\vec r = \vec r_{1,2}$ in the process of screening cannot be pushed away to infinity because of the dot's boundary. In the 3d case, this charge forms a thin layer near the boundary of the dot; in the 2d case, it creates an inhomogeneous distribution over the whole area of the dot. The explicit form of this potential [in units of level spacing $\Delta$, cf. Eq. (74)] is V d ˜1 ˜ 2 D(˜r1 ;˜r2 ); d = 3 ; V˜ (˜r) = (˜r − ˜r1 ) d˜S1 ∇ d˜S2 ∇ 8C S S ↔ ↔ Vd ˜ V (˜r) = d˜r1 9 z 9 z1 D(˜r;˜r1 ); d = 2 ; (75) where the Green function $D(\vec r,\vec r_1)$ and the dot capacitance C are given by Eqs. (68) and (71), and $\overleftrightarrow{\partial}_z$ is defined in Eq. (67). The importance of the additional potential (75) for the structure of the matrix elements of the interaction Hamiltonian was first noticed in Ref. [74]. The substitution of the first term of Eq. (74) into Eq. (32) immediately yields the new value of the interaction constant $E_c$ in the $\hat n$-dependent part of the universal Hamiltonian (35). This constant is in fact the single-electron charging energy
$$E_c = \frac{e^2}{2C}.$$
(76)
Note that this value is much larger than the single-electron level spacing:
$$\frac{E_c}{\Delta} \simeq r_s\,(k_F L)^{d-1}, \qquad d = 2, 3,$$
because of the large factor $k_F L \gg 1$. We intend to show now that this large scale appears only in the charge part of the universal Hamiltonian and that it enters neither the spin channel nor the non-universal part $\hat H^{(1/g)}_{int}$. The constant $J_S$ in Eq. (35) originates from the two-body interaction term of the screened potential. The use of the corresponding (second) term of Eq. (74) would yield $J_S = -\Delta/2$,
which is an overestimation of the spin-dependent interaction term. The correct result for JS in the random phase approximation reads: 9
r ln 1 + ; d = 3; s 2rs −JS = d˜r F(kF |˜r |)2 Vsc (˜r) = × (77) 2 1 + 1 − r r s s 1 − r 2 ln 1 − 1 − r 2 ; d = 2 ; s s where the gas parameter rs is given by Eq. (53) and the function F was introduced in Eq. (40). It is worthwhile to notice that at rs 1, the RPA result (77) gives JS = − =2, which forbids, in particular, the Stoner instability. However, in the regime rs & 1, the RPA scheme becomes non-reliable and the constant (77) should be replaced with the corresponding Fermi liquid constant. A universal description still holds in this case provided that the Fermi liquid description does not breakdown at length scales smaller than the system size. We now turn to the discussion of 1=g corrections. The 8rst type of such corrections results from the substitution of the third and fourth terms of Eq. (74) into Eq. (32). One thus 8nds [74]: (1=g) 0 0 0 [H1 ]!( = X!( + !( X ; X! = d˜r V˜ (˜r) (˜r2 )∗! (˜r2 ) : (78) The matrix elements in Eq. (78) are random. Their characteristic values can be easily found from Eqs. (74) and (49), with the result 0 2 [X! ] = b00
2 ; g
(79)
where b00 is a geometry-dependent numerical coeKcient, 1 | d˜r V˜ (˜r)fn (˜r)|2 (1 b00 = (n
(80)
(n ¿0
and the potential $\tilde V$ is defined by Eq. (75). The coefficient $b_{00}$ is of the order of unity. According to Eq. (78), the matrix elements $[H^{(1/g)}]_{\alpha\beta\gamma\delta}$ of the Hamiltonian $\hat H^{(1/g)}$ with $\alpha = \delta$ or $\beta = \gamma$ are of the order of $\Delta/\sqrt{g}$, contrary to the assumptions of Ref. [75]. The rest of these matrix elements are smaller. One can use the two-body part of the screened interaction potential (74) to evaluate them [67]. This part coincides with the simple model (43), up to the constant D, which should be substituted by 1/2. After the substitution, one can use results (50) and (51). In all the previous considerations, we assumed the potential on the external gates to be fixed, see Eq. (69). Such an assumption was valid because additional potentials created by those gates 9
The additional smallness $r_s$ in the expression for $J_S$ arises because the main contribution to this constant comes from distances of the order of $1/k_F$; at such distances the approximation of the local part of the potential by a $\delta$-function in Eq. (74) is no longer valid and one should replace
$$\frac{\Delta V_d}{2}\,\delta(\vec r) \to \int \frac{d\vec k}{(2\pi)^d}\,\frac{V(k)\,e^{i\vec k\cdot\vec r}}{1+2V(k)/(\Delta V_d)}, \qquad V(k)=\begin{cases} \dfrac{4\pi e^2}{\kappa k^2}, & d=3,\\[2mm] \dfrac{2\pi e^2}{\kappa k}, & d=2.\end{cases}$$
were implicitly included into the confining potential $U(\vec r)$ from Eq. (1). One, however, may be interested in a comparison of the properties of the dot at two different sets of the gate voltages. To address such a problem one has to include the gates into the electrostatic problem, i.e., to find the correction $\delta U(\vec r)$ to the self-consistent confining potential due to the variation of the gate potentials. This requires taking into account the Thomas–Fermi screening of the charge on the external gates by the electrons in the dot. The resulting equation for this correction is similar to Eq. (62): 2 1 2 −∇ U (˜r) = − B (˜r)U (˜r) − d˜r1 Bdot (˜r1 )U (˜r1 ) ; (81) TVd dot Vd supplemented with the requirement that the solution vanishes at $r \to \infty$ and has a fixed value on the surface of each gate, $S_i$ [cf. Eq. (69)]:
$$\delta U(\vec r)\big|_{\vec r\in S_i} = -eV_g^{(i)}, \qquad i = 1, 2, \dots,$$
(82)
where indices i enumerate the gates, and $V_g^{(i)}$ is the electrostatic potential on the ith gate. Solving Eqs. (81) and (82) involves the same steps as in the derivation of Eq. (74) [the only difference is that now there is no $\delta$-function in Eq. (63), the right-hand side of Eq. (66) vanishes, and the boundary conditions on the gate surfaces are given by Eq. (82)]. The solution for the potential outside the dot is obtained in terms of the Green function (68)–(69), similarly to Eq. (70):
$$\delta U(\vec r) = \bar U \oint_S d\vec S_2\cdot\vec\nabla_2 D(\vec r,\vec r_2) - \sum_i eV_g^{(i)} \oint_{S_i} d\vec S_2\cdot\vec\nabla_2 D(\vec r,\vec r_2),$$
(83)
where the zero-mode part of the potential of the dot is given by
$$\bar U = -2E_c\,\mathcal N, \qquad \mathcal N = \sum_i \mathcal N_i, \qquad e\,\mathcal N_i = C_i V_g^{(i)}.$$
(84)
The geometrical capacitance of the dot C is defined by Eqs. (72) and (73), and the geometrical mutual capacitances between the dot and the gates are
$$C_i = \frac{\kappa}{4\pi}\oint_{S_i} d\vec S_2\cdot\vec\nabla_2 \oint_S d\vec S_1\cdot\vec\nabla_1 D(\vec r_1,\vec r_2)$$
(85)
for three-dimensional dots. To obtain the result for the two-dimensional case, the surface integral over S here should be replaced as in Eq. (67). The resulting potential inside the dot is found similarly to Eq. (74) in the form (i) U (˜r) = − 2Ec N − NV˜ (˜r) + Ni V˜ (˜r) ; (86) i (i) where $\tilde V(\vec r)$ is defined in Eq. (75) and the potentials $\tilde V^{(i)}(\vec r)$ are given by V (i) ˜1 ˜ 2 D(˜r1 ;˜r2 ); d = 3 ; V˜ (˜r) = d (˜r − ˜r1 ) d˜S1 ∇ d˜S2 ∇ 8Ci S Si ↔ ↔ V (i) d˜r1 9 z 9 z1 D(˜r;˜r1 ); d = 2 : V˜ (˜r) = d 8Ci Si
(87)
We see that the effect of the external gates has the same hierarchical structure as the electronic interaction inside the dot: the largest scale $E_c$ [the first term in Eq. (86)] corresponds to a simple uniform shift of the potential, while the much smaller energy scale $\Delta$ characterizes the changes in the potential affecting the shape of the dot. Therefore, the effect of the external gates can be separated into a universal part U 0 and a
(88)
j
j
Nj X! :
(89)
j are random Gaussian variables with second moments In Eq. (89), the X! j
i X! X! = bij
2 ; g
(90)
where the geometry-dependent coefficient $b_{00}$ is given by Eq. (80) and all other coefficients are given by 1 (1 (i) ( j) ˜ bij = d˜r V (˜r)fn (˜r) d˜r V˜ (˜r)fn (˜r) ; (n (n ¿0 1 (1 ( j) ˜ b0j = bj0 = d˜r V (˜r)fn (˜r) d˜r V˜ (˜r)fn (˜r) (91) (n (n ¿0
for i, j = 1, 2, .... In general, all these coefficients are of the order of unity.
2.3.3. Final form of the effective Hamiltonian
To summarize this subsection, we note that in the absence of superconducting correlations, the universal part of the interaction Hamiltonian, Eq. (35), consists of two parts. The dominant part depends on the dot's charge $\hat n$. The corresponding energy scale $E_c$, Eq. (76), is related to the geometric capacitance of the dot C, Eq. (72), and exceeds parametrically the level spacing $\Delta$. [There are corrections of order $\Delta$ to Eq. (76) arising from the potentials $\tilde V$ in Eq. (74) and from the local part of the screened interaction.] The next part depends on the total spin $\vec S$ of the dot. The corresponding energy scale $J_S$ is smaller than $\Delta$. If the level spacing did not
1=g have the following form: Hˆ (0) ˆ 2 − E N2 ; H( ;† (; + Ec (nˆ − N)2 + JS (˜S) Hˆ = c ;( 1 (1=g) (1=g) j † 0 Hˆ = ( n ˆ − N ) X − N X + H!( j ; !; ! ! 2 j ;!
!(
(92) † † ; 1 !; 2 (; 2 ; 1
:
(93)
The last term in Eq. (92) is a c-number and it can be disregarded in all subsequent considera( j) (1=g) are given by Eq. (90), the elements H!( tions. The
(94)
where the dimensionless conductance of the dot is defined by Eq. (15). Since $g \gg 1$, condition (94) is met even if more than one channel is open. The conductance of a point contact connecting two clean conducting continua can be related, by the Landauer formula [80–84], to a scattering problem for electron waves incident on the contact. These waves can be labeled by a continuous wave number k and by a set of discrete quantum numbers j. The number k accounts for the continuous energy spectrum of the incoming waves, and j defines their spatial structure in the directions transverse to the direction of incidence. Although in principle the number of discrete modes j participating in scattering is infinite, the number of relevant modes contributing to the conductance is confined to the 10
Accounting for such trajectories would result in an additive contribution to the mesoscopic
$N_{ch}$ modes that are propagating through the contact; all other modes are evanescent and hardly contribute to the conductance. An adiabatic point contact [5,10,85] is an example adequately described by a model having $N_{ch}$ modes in a lead. Let us now turn to a quantitative description of the theory. The total Hamiltonian of the system is given as a sum of three terms [86],
$$\hat H_t = \hat H + \hat H_L + \hat H_{LD},$$
(95)
where the Hamiltonian of the closed dot, $\hat H$, will be taken in the universal limit, see Eq. (92), $\hat H_L$ describes the leads, and $\hat H_{LD}$ couples the leads and the dot. A separation into three terms, like Eq. (95), is well known in the context of the tunneling Hamiltonian formalism [87], and in nuclear physics [88]. Its application to point contacts coupled to quantum dots can be found, e.g., in Refs. [14,86,89,90]. The Hamiltonian of the leads reads
$$\hat H_L = v_F \sum_{j=1}^{N_{ch}} \int \frac{dk}{2\pi}\, k\, \hat\psi_j^\dagger(k)\,\hat\psi_j(k),$$
(96)
where we have linearized the electron spectrum in the leads and measure all the energies from the Fermi level. The vector $k \ll k_F$ is the deviation of the longitudinal momentum in a propagating mode from the Fermi wave vector $k_F$. [For the sake of simplicity, we assume the Fermi velocity to be the same in all the modes; the general case can be reduced to Eq. (96) by a simple rescaling.] Hamiltonian (96) accounts for all the leads attached to the dot; the lead index (in case the dot is coupled to more than one lead), the transverse mode index, and the spin index have all been combined into the single index j, which is summed from 1 to $N_{ch}$, $N_{ch}$ being the total number of propagating channels in all the leads. (In this subsection, we reserve Greek and Latin letters for labeling the fermionic states in the dot and in the leads, respectively.) The Hamiltonian $\hat H_{LD}$ in Eq. (95) describes the coupling of the dot to the leads,
$$\hat H_{LD} = \sum_{j=1}^{N_{ch}}\sum_{\alpha=1}^{M}\int\frac{dk}{2\pi}\left[W_{\alpha j}\,\hat\psi_\alpha^\dagger\,\hat\psi_j(k) + {\rm h.c.}\right].$$
(97)
Here the coupling constants $W_{\alpha j}$ form a real $M \times N_{ch}$ matrix W, and $M \to \infty$ is the size of the random matrix describing the Hamiltonian of the dot. We emphasize that the matrix W describes the point contacts, not the dot. That is why this matrix is not random: all the randomness is included in the matrix H from Eq. (92). In the absence of electron–electron interactions in the dot, the Hamiltonian (95) of the combined system of dot and leads can be easily diagonalized and the one-electron eigenstates can be found. These eigenstates are best described by the $N_{ch}\times N_{ch}$ scattering matrix $S(\varepsilon)$, which relates the amplitudes $a_j^{out}$ of out-going waves and $a_j^{in}$ of in-going waves in the leads,
$$a_i^{out} = \sum_{j=1}^{N_{ch}} S_{ij}(\varepsilon)\, a_j^{in}.$$
(98)
(For a precise definition of the amplitudes $a_j^{in}$ and $a_j^{out}$ in terms of the lead states $\psi_j(k)$, and for a derivation of the formulae presented below, we refer to Appendix C.) The matrix S is unitary, as required by particle conservation. It can be expressed in terms of the matrices W
and H that define the non-interacting part of the Hamiltonian and a matrix U that describes the boundary condition at the lead–dot interface,¹¹
$$S(\varepsilon) = U\left[1 - 2\pi i\nu\, W^\dagger\left(\varepsilon - H + i\pi\nu WW^\dagger\right)^{-1}W\right]U^T,$$
(99)
where $\nu = 1/(2\pi v_F)$ is the one-dimensional density of states in the leads, the $M\times M$ matrix H is formed by the elements $H_{\alpha\gamma}$ of the Hamiltonian of the dot (92), and $UU^T$ is the scattering matrix in the absence of the coupling between the leads and the dot (i.e., when $H_{LD} = 0$). The scattering matrix $S_{ij}(\varepsilon)$ describes scattering of electrons from one lead to another, as well as backscattering of an electron into the same lead. Eq. (99) can also be rewritten in terms of the (matrix) Green function of the closed dot
$$G(\varepsilon) = (\varepsilon - H)^{-1},$$
(100)
as
$$S(\varepsilon) = U\,\frac{1 - i\pi\nu W^\dagger G(\varepsilon)W}{1 + i\pi\nu W^\dagger G(\varepsilon)W}\,U^T.$$
(101)
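The equivalence of Eqs. (99) and (101), and the unitarity of S, can be checked numerically in the smallest non-trivial case. The sketch below assumes a dot with M = 2 levels, a single channel ($N_{ch} = 1$), U = 1, and arbitrary illustrative values for H, W, $\nu$ and $\varepsilon$; none of these numbers or function names come from the text.

```python
import math

def inv2(m):
    """Inverse of a 2x2 (possibly complex) matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(w, m):
    """w^T m w for a real 2-vector w and a 2x2 matrix m."""
    return sum(w[i] * m[i][j] * w[j] for i in range(2) for j in range(2))

def s_eq99(eps, H, W, nu):
    """Eq. (99) with U = 1: S = 1 - 2*pi*i*nu * W^T (eps - H + i*pi*nu*W W^T)^-1 W."""
    A = [[(eps if i == j else 0.0) - H[i][j] + 1j * math.pi * nu * W[i] * W[j]
          for j in range(2)] for i in range(2)]
    return 1 - 2j * math.pi * nu * quad_form(W, inv2(A))

def s_eq101(eps, H, W, nu):
    """Eq. (101) with U = 1: S = (1 - iK)/(1 + iK), K = pi*nu * W^T G W, G = (eps - H)^-1."""
    G = inv2([[(eps if i == j else 0.0) - H[i][j] for j in range(2)]
              for i in range(2)])
    K = math.pi * nu * quad_form(W, G)
    return (1 - 1j * K) / (1 + 1j * K)

# Hypothetical illustrative parameters (not taken from the text)
H = [[0.3, 0.2], [0.2, -0.5]]   # real symmetric dot Hamiltonian, M = 2
W = [0.7, 0.4]                  # real coupling vector, N_ch = 1
nu, eps = 0.5, 0.1

S99 = s_eq99(eps, H, W, nu)
S101 = s_eq101(eps, H, W, nu)
print("|S99 - S101| =", abs(S99 - S101), " |S| =", abs(S99))
```

For a single channel S is a pure phase, so both forms should agree and have unit modulus up to rounding errors.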
The matrix S describes both reflection from the point contacts (i.e., scattering processes that involve backscattering from the contacts only, not the dot), and scattering that involves (ergodic) exploration of the dot. Alternatively, the backreflection from the point contacts into the leads can be described in terms of a reflection matrix $r_c$, which is related to the matrix W as
$$r_c = U\,\frac{M\Delta - \pi^2\nu W^\dagger W}{M\Delta + \pi^2\nu W^\dagger W}\,U^T.$$
(102)
A contact is called ideal if $r_c = 0$, or, equivalently, $\pi^2\nu W^\dagger W = M\Delta$. For a single mode contact, the transparency $|t_c|^2$ is given by $|t_c|^2 = 1 - |r_c|^2$. For non-interacting electrons Eqs. (99)–(102) essentially solve the physical part of the problem since all the observable quantities are expressed in terms of the scattering matrix. We now consider three observables in more detail: the two-terminal conductance, the tunneling density of states, and the ground state energy of the dot in contact with the leads.
Two-terminal conductance: The two-terminal conductance is defined for a quantum dot that is connected to electron reservoirs via two leads (numbered 1 and 2) with $N_1$ and $N_2$ channels each (where $N_1 + N_2 = N_{ch}$), see Fig. 9. The electron reservoirs are held at a constant chemical potential $\mu + eV_i$ (i = 1, 2). Labeling the group of channels belonging to lead 1 (2) with the index $1 \le j \le N_1$ ($N_1 + 1 \le j \le N_{ch}$), the total current I through, say, lead no. 1 is given by
$$I = G(V_2 - V_1),$$
11 The states $\psi_j(k)$ in Eq. (96) represent scattering states in the leads, which are defined with the help of a suitably chosen boundary condition at the lead–dot interface. These boundary conditions may lead to a nontrivial scattering matrix $S_0 = UU^T$ even in the absence of any lead–dot coupling $H_{LD}$, see Appendix C for details.
Fig. 9. Setup for a measurement of the two-terminal conductance. The quantum dot (gray) is connected to two leads, numbered 1 and 2, that are connected to electron reservoirs. A current I
where the conductance G is expressed in terms of the scattering matrix S by the Landauer formula [93,80–82]
$$G = \frac{e^2}{2\pi\hbar}\int d\varepsilon\left(-\frac{\partial f_F}{\partial\varepsilon}\right)\sum_{i=1}^{N_1}\sum_{j=N_1+1}^{N_{ch}} |S_{ij}(\varepsilon)|^2,$$
(103)
$f_F(\varepsilon) = 1/(1+e^{\varepsilon/T})$ being the Fermi distribution function. It is convenient to use the unitarity of the scattering matrix and rewrite Eq. (103) as
$$G = \frac{e^2}{2\pi\hbar}\left[\frac{N_1N_2}{N_{ch}} - \int d\varepsilon\left(-\frac{\partial f_F}{\partial\varepsilon}\right){\rm Tr}\,\Lambda S\Lambda S^\dagger\right],$$
(104)
where the $N_{ch}\times N_{ch}$ traceless matrix $\Lambda$ is defined as
$$\Lambda_{ij} = \delta_{ij}\times\begin{cases} -N_2/N_{ch}, & i = 1,\dots,N_1,\\ \phantom{-}N_1/N_{ch}, & i = N_1+1,\dots,N_{ch}.\end{cases}$$
(105)
The advantage of Eq. (104) over the more conventional form (103) of the Landauer formula is that it separates the classical conductance $(2e^2/h)(N_1N_2/N_{ch})$ and the quantum interference correction, the second term on the r.h.s. of Eq. (104).
Tunneling density of states: An important particular case of Eq. (103) is the strongly asymmetric setup, where one of the contacts, say the right one (no. 1 in Fig. 9), has only one channel ($N_1 = 1$), with a very small transmission amplitude, $|t_c|^2 = 1 - |r_c|^2 \ll 1$. The corresponding coupling matrix W from Eq. (97) then acquires the form, see also Eq. (102)
$$W_{\alpha j} = \delta_{\alpha 1}\delta_{j1}\,\frac{t_c}{2\pi}\sqrt{\frac{\Delta M}{\nu}} + (1-\delta_{1j})\,W^{(2)}_{\alpha j}.$$
(106)
Here the channel j = 1 corresponds to the right contact, while the channels $j = 2, \dots, N_{ch}$ correspond to the left contact, and we have chosen the basis of the states inside the dot such that the right contact is connected to "site" $\alpha = 1$. The matrix $W^{(2)}_{\alpha j}$ describes the coupling to the left contact. Substituting Eq. (106) into Eq. (99) and using Eq. (106) one obtains, with the help of Eq. (103)
$$G = G_1\,\Delta M\int d\varepsilon\left(-\frac{\partial f_F}{\partial\varepsilon}\right)\nu_T(\varepsilon),$$
(107)
where the tunneling conductance of the left contact is given by¹²
$$G_1 = \frac{e^2}{\pi\hbar}\,|t_c|^2.$$
The quantity $\nu_T(\varepsilon)$ in Eq. (107) is the tunneling density of states and is given by (in the next two formulae we omit the superscript (2) in $W^{(2)}$):
$$\nu_T(\varepsilon) = \left[\frac{1}{\varepsilon - H + i\pi\nu WW^\dagger}\,\nu WW^\dagger\,\frac{1}{\varepsilon - H - i\pi\nu WW^\dagger}\right]_{11} = -\frac{1}{\pi}\,{\rm Im}\left[\frac{1}{\varepsilon - H + i\pi\nu WW^\dagger}\right]_{11} = -\frac{1}{\pi}\,{\rm Im}\,[G_o^R(\varepsilon)]_{11},$$
(108)
a result which one can obtain also by treating the right contact in the tunneling Hamiltonian approximation. Here, we introduced the Green functions for the dot connected to the leads, cf. Eq. (100)
$$G_o^{R(A)} = [\varepsilon - H \pm i\pi\nu WW^\dagger]^{-1},$$
(109)
where the "+" ("−") sign corresponds to the retarded (advanced) Green functions, respectively. An equivalent form of Eq. (108) is obtained by noticing from Eq. (100) that $\partial G_{\alpha\beta}/\partial H_{11} = G_{\alpha 1}G_{1\beta}$. One immediately finds from Eq. (108)
$$\nu_T(\varepsilon) = \frac{1}{2\pi i}\,\frac{\partial}{\partial H_{11}}\,{\rm Tr}\ln\frac{1 + i\pi\nu W^\dagger G(\varepsilon)W}{1 - i\pi\nu W^\dagger G(\varepsilon)W},$$
and with the help of Eq. (101) and $SS^\dagger = 1$, arrives at [94]
$$\nu_T(\varepsilon) = -\frac{1}{2\pi i}\,{\rm Tr}\,S^\dagger\frac{\partial S}{\partial H_{11}}.$$
(110)
This indicates that the tunneling density of states can be related to the parametric derivative of the scattering matrix of the dot without the left (tunneling) contact.
12 The extra factor 2 accounts for spin degeneracy. Strictly speaking, for electrons with spin 1/2 one should set $N_1 = 2$ and sum contributions to the tunneling density of states for the two spin directions. In the presence of spin-rotation symmetry, however, the result is the same as that of Eqs. (108) and (110) below.
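The relation between the Green-function form (108) and the parametric-derivative form (110) of the tunneling density of states can be illustrated numerically. The sketch below assumes M = 2, a single channel in the remaining (non-tunneling) contact, U = 1, and arbitrary illustrative values of H, W, $\nu$ and $\varepsilon$; the derivative with respect to $H_{11}$ is approximated by a central finite difference. All names and numbers are hypothetical.

```python
import math

def inv2(m):
    """Inverse of a 2x2 (possibly complex) matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def open_green(eps, H, W, nu):
    """Retarded Green function of the open dot, Eq. (109), for M = 2."""
    A = [[(eps if i == j else 0.0) - H[i][j] + 1j * math.pi * nu * W[i] * W[j]
          for j in range(2)] for i in range(2)]
    return inv2(A)

def s_matrix(eps, H, W, nu):
    """Single-channel scattering matrix from Eq. (99) with U = 1."""
    G = open_green(eps, H, W, nu)
    q = sum(W[i] * G[i][j] * W[j] for i in range(2) for j in range(2))
    return 1 - 2j * math.pi * nu * q

def nu_T_green(eps, H, W, nu):
    """Eq. (108): -(1/pi) Im [G_o^R]_11 (site 1 is list index 0)."""
    return -open_green(eps, H, W, nu)[0][0].imag / math.pi

def nu_T_smatrix(eps, H, W, nu, h=1e-6):
    """Eq. (110): -(1/(2*pi*i)) S^* dS/dH_11, via a central finite difference."""
    def s_shifted(delta):
        Hs = [row[:] for row in H]
        Hs[0][0] += delta
        return s_matrix(eps, Hs, W, nu)
    dS = (s_shifted(h) - s_shifted(-h)) / (2 * h)
    value = -(s_matrix(eps, H, W, nu).conjugate() * dS) / (2j * math.pi)
    return value.real

# Hypothetical illustrative parameters (not taken from the text)
H = [[0.3, 0.2], [0.2, -0.5]]
W = [0.7, 0.4]
nu, eps = 0.5, 0.1

a = nu_T_green(eps, H, W, nu)
b = nu_T_smatrix(eps, H, W, nu)
print("Eq. (108):", a, " Eq. (110):", b)
```

The two numbers should agree to within the finite-difference error, which is set by the assumed step h.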
Free energy for non-interacting electrons: The thermodynamic potential $\Omega$ of the dot at chemical potential $\mu$ can be found in terms of the Green functions as
$$\Omega = -T\int d\varepsilon\,\ln\left[1+e^{-(\varepsilon-\mu)/T}\right]\nu(\varepsilon),$$
(111)
where the total density of states (i.e., the local density of states integrated over the volume of the dot) $\nu(\varepsilon)$ is given by [compare with Eq. (108)]
$$\nu(\varepsilon) = -\frac{1}{\pi}\,{\rm Im\,Tr}\,\frac{1}{\varepsilon - H + i\pi\nu WW^\dagger}.$$
(112)
Similarly to the derivation of Eq. (110), one can rewrite the latter formula in terms of the energy derivative of the scattering matrix as [94]
$$\nu(\varepsilon) = \frac{1}{2\pi i}\,{\rm Tr}\,S^\dagger\frac{\partial S}{\partial\varepsilon}.$$
(113)
The average number of particles in the dot $\langle\hat n\rangle_q = -\partial\Omega/\partial\mu$ (here $\langle\dots\rangle_q$ indicates the average over the quantum state without disorder average) can then be found as
$$\langle\hat n\rangle_q = \int d\varepsilon\, f_F(\varepsilon)\,\nu(\varepsilon),$$
(114)
where $f_F(\varepsilon)$ is the Fermi function.
Statistical properties of S: The calculation of the statistical properties of the two-terminal conductance, the tunneling density of states, and the ground state energy is now reduced to the analysis of the properties of the scattering matrix for a chaotic quantum dot, which is a doable, though not straightforward task. The available results are collected in an excellent review [14] (see, in particular, Chapter 2 of that reference). Here, we mention a few results that are needed in Section 4. For particles with spin, but in the absence of spin–orbit scattering, the scattering matrix is block diagonal, $S = {\rm diag}(S^o, S^o)$, where each block $S^o$ is a matrix of size $N^o_{ch} = N_{ch}/2$, the total number of orbital channels. In the presence of spin–orbit scattering such a block structure does not exist, and one commonly describes S as an $N^o_{ch}$-dimensional matrix of quaternions (denoted as $S^o$), which are 2 × 2 matrices with special rules for complex conjugation and transposition [54]. For reflectionless contacts all averages that involve S (or $S^\dagger$) only vanish,
$$\langle S_{i_1j_1}(\varepsilon_1)S_{i_2j_2}(\varepsilon_2)\cdots S_{i_nj_n}(\varepsilon_n)\rangle = 0, \qquad n = 1, 2, \dots.$$
(115)
For non-ideal contacts, the average of S does not vanish, and is given by the reflection matrix $r_c$ of the contact,
$$\langle S_{i_1j_1}(\varepsilon_1)S_{i_2j_2}(\varepsilon_2)\cdots S_{i_nj_n}(\varepsilon_n)\rangle = \prod_{k=1}^{n} r_{c;\,i_kj_k}, \qquad n = 1, 2, \dots.$$
(116)
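Moments that involve both S and $S^\dagger$ have a rather complicated dependence on energy (see, e.g., Ref. [95] for $\beta = 1$). Here, we mention an approximate result for the second moment $\langle S^o_{i_1j_1}(\varepsilon_1)S^{o*}_{i_2j_2}(\varepsilon_2)\rangle$ in case $r_c = 0$,
$$\langle S^o_{i_1j_1}(\varepsilon_1)S^{o*}_{i_2j_2}(\varepsilon_2)\rangle = \frac{\delta_{i_1i_2}\delta_{j_1j_2} + (2/\beta-1)\,\delta_{i_1j_2}\delta_{i_2j_1}}{N^o_{ch} + 2/\beta - 1 + 2\pi i(\varepsilon_2-\varepsilon_1)/\Delta},$$
(117)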
where $\Delta$ is the mean level spacing of the dot.¹³ Approximation (117) is valid for $\varepsilon_1 - \varepsilon_2 = 0$ and for large values of $|\varepsilon_2 - \varepsilon_1|$. It is also valid for arbitrary $|\varepsilon_1 - \varepsilon_2|$ if $N_{ch} \gg 1$. This is sufficient to estimate average and
+ +
(1 − rc rc† )i1 i2 (1 − rc† rc )j1 j2
o + 2=! − 1 − Tr r r † + 2i(j − j )= Nch c c 2 1
(2=! − 1)(1 − rc rc† )i1 j2 (1 − rc† rc )j1 i2
o + 2=! − 1 − Tr r r † + 2i(j − j )= Nch c c 2 1
:
(118)
This approximate result is for large energy differences or for the case $N_{ch} - {\rm Tr}\,r_cr_c^\dagger \gg 1$. In Section 4 we use the Fourier transform S(t) of the scattering matrix,
$$S(\varepsilon) = \int_0^\infty dt\, S(t)\,e^{i\varepsilon t},$$
(119)
for $\varepsilon$ in the upper half of the complex plane. The Fourier transform of the Hermitian conjugate is defined with negative times and for $\varepsilon$ in the lower half of the complex plane,
$$S^\dagger(\varepsilon) = \int_0^\infty dt\, S^\dagger(-t)\,e^{-i\varepsilon t}.$$
(120)
Statistical averages for the scattering matrix in time-representation can be obtained by Fourier transform of Eqs. (115)–(117), recalling that ensemble and energy averages are equivalent. In particular, we find that the scattering from a non-ideal point contact with energy-independent $r_c \neq 0$ is instantaneous, $\langle S(t)\rangle = r_c\,\delta(t) +$
(121)
where the reflection matrix of the point contact $r_c$ is defined in Eq. (C.9), and the ensemble average of the “
(122)
where $\beta = 1\,(2, 4)$ corresponds to the orthogonal (unitary, symplectic) ensemble.
Statistical properties of $G_o$: We will see in Section 4.8 that the tunneling conductance in the presence of the interaction cannot be expressed in terms of the parametric derivative of 13
For the symplectic ensemble, for which $\beta = 4$, $S^o_{ij}$ is a quaternion and the left-hand side of Eq. (117) has to be interpreted as the quaternion modulus of $S^o_{i_1j_1}S^{o\dagger}_{i_2j_2}$, where $S^{o\dagger}_{i_2j_2}$ is the Hermitian conjugate of $S^o_{i_2j_2}$. The same interpretation applies to Eqs. (118) and (122) below. In the r.h.s. of Eq. (118), the complex conjugate $(r_c)^*_{i_2j_2}$ should be replaced by the Hermitian conjugate $(r_c)^\dagger_{i_2j_2}$ for $\beta = 4$.
S, as it was done in Eq. (110). The transport in this case will be related to the statistics of the Green functions for the open dot (109), and we give those properties below. It is more convenient to write the results in the time domain. For averages which include only retarded or only advanced components one finds [GoR (t)]( = − [GoA (t)]( = − i(t)( ; TM GoR (tj ) = GoR (tj ) ; (123) j
j
which means that the attachment of the leads does not change the average level spacing in the dot. For averages involving the retarded and advanced components, we have for the case of reflectionless contacts 2 [GoR (t1 )]1 (1 [GoA (t2 )]2 (2 = 2 [1 (2 2 (1 + (2=! − 1)1 (1 2 (1 ] M ×(t1 + t2 )e−(Nch +2=!−1)t1 =(2) :
(124)
The derivation of Eq. (124) may be performed similarly to that of Eq. (25). This concludes our brief review of the two-terminal conductance, the tunneling density of states, and the ground state energy for a quantum dot in the absence of electron–electron interactions. The interaction between electrons leads to the Coulomb blockade, and simple formulas such as Eqs. (103), (110) and (113) cease to be valid. With interactions, the answer depends very substantially on the conductance of the dot–lead junctions. The consideration of this regime will be the subject of the two remaining sections.
3. Strongly blockaded quantum dots
In this section, we discuss the regime of strong Coulomb blockade in quantum dots. To allow for this regime, the conductances of the contacts of the dot to the leads $G_{1,2}$ must be small in units of $e^2/\hbar$. As was already explained in the Introduction, junctions to a quantum dot formed in a two-dimensional electron gas of a semiconductor heterostructure are well described by the adiabatic point contact model. Normally, a small conductance is realized only in single-mode junctions. Since there are two leads connected to the dot, labeled 1 and 2, we have total number of (orbital) channels $N^o_{ch} = 2$, and the transmission block of the S-matrix of the point contacts has the form
$$t = \begin{pmatrix} t_1 & 0\\ 0 & t_2\end{pmatrix}.$$
(125)
The transmission amplitudes for the separate left and right point contacts $t_{1,2}$ are related to the corresponding conductances by Landauer formula (103)
$$G_{1,2} = \frac{e^2}{\pi\hbar}\,g_{1,2}, \qquad g_{1,2} = |t_{1,2}|^2,$$
(126)
where the extra factor of two comes from the spin degeneracy. We recall that the coupling coefficients $t_{1,2}$ characterize only the point contact and are not random, see discussion after
Eq. (97); they characterize the average conductance of each point contact and not the total conductance of a given sample consisting of both the two contacts and the quantum dot. Since the conductances $G_1$ and $G_2$ are small, we can express the coupling matrix W relating the Hamiltonian of the closed quantum dot to its scattering matrix in terms of $G_1$ and $G_2$, see Eq. (102),
$$W_{\alpha j} = \frac{1}{2\pi}\sqrt{\frac{\Delta M g_1}{\nu}}\;\delta_{\alpha j}\delta_{j1} + \frac{1}{2\pi}\sqrt{\frac{\Delta M g_2}{\nu}}\;\delta_{\alpha j}\delta_{j2}.$$
(127)
Because the coupling is weak, it is possible to construct a perturbation theory of the conductance in terms of this coupling. This perturbation theory differs substantially for the peaks and the valleys of the Coulomb blockade. These two cases will be considered separately.
3.1. Mesoscopic fluctuations of Coulomb blockade peaks
In the weak tunneling regime, the charge on the dot is well defined; quantum
(128)
The upper bound in Eq. (128) ensures that only the last occupied discrete level in the dot contributes to the electron transport. In the absence of the interaction this level can be in four states: we denote by $P_\sigma$ the probability of this level to be occupied with one electron of given spin projection $\sigma = \uparrow, \downarrow$, by $P_0$ to be empty, and by $P_2$ to be occupied with two electrons. Those probabilities are normalized as $P_0 + P_2 + P_\uparrow + P_\downarrow = 1$. If the level is adjusted in resonance with the Fermi level in the leads, then due to the electron transfer through the dot, the state of the level changes between all of those states. The picture is changed when the interaction between electrons is included. As can be seen from Eq. (93), only three states with the number of electrons on the dot differing by 1 can be in resonance. States with a different charge always have an activation energy of the order of the charging energy $E_c$. As long as $E_c \gg T$, these states do not participate in the electronic transport through the dot. A conductance peak is characterized by the condition $|\mathcal N - \mathcal N^*| \lesssim T/E_c \ll 1$. Depending on $\mathcal N^*$, one can distinguish two cases:
(i) $\mathcal N^* = 2j + 1/2$, with j being an integer, which corresponds to states $|0\rangle$, $|\!\uparrow\rangle$, and $|\!\downarrow\rangle$ in resonance, and $|2\rangle$ having an extra energy $E_c$.
(ii) $\mathcal N^* = 2j - 1/2$, which corresponds to states $|2\rangle$, $|\!\uparrow\rangle$, and $|\!\downarrow\rangle$ in resonance and $|0\rangle$ having an extra energy $E_c$.
These limitations on the Hilbert space of the level lead to the deviation from the simple Breit–Wigner formula even in the temperature regime (128). The form of the conductance peaks can be calculated using rate equations. We will present the master equation describing case (i). Case (ii) is considered analogously and we will state the result after Eq. (133). Considering Hamiltonian (97) with the coupling matrix W given by Eq. (127), and in the Fermi Golden rule approximation, one finds the following rate equations [96–98]
$$\frac{dP_\uparrow}{dt} = \frac{\Delta}{2\pi\hbar}\left\{\tilde g_1[f_1P_0 - (1-f_1)P_\uparrow] + \tilde g_2[f_2P_0 - (1-f_2)P_\uparrow]\right\},$$
$$\frac{dP_\downarrow}{dt} = \frac{\Delta}{2\pi\hbar}\left\{\tilde g_1[f_1P_0 - (1-f_1)P_\downarrow] + \tilde g_2[f_2P_0 - (1-f_2)P_\downarrow]\right\},$$
$$\frac{dP_0}{dt} = \frac{\Delta}{2\pi\hbar}\left\{\tilde g_1[(1-f_1)(P_\uparrow+P_\downarrow) - 2f_1P_0] + \tilde g_2[(1-f_2)(P_\uparrow+P_\downarrow) - 2f_2P_0]\right\},$$
(129)
where $\tilde g_{1,2}$ are the exact conductances of the point contacts,
$$\tilde g_1 = g_1 M|\psi(1)|^2, \qquad \tilde g_2 = g_2 M|\psi(2)|^2,$$
(130)
and $\psi(n)$ are the RMT components of the M-component one-electron wave function of the partially occupied level responsible for the electron transport through the dot. It is readily seen that $\tilde g_{1,2}$ experience strong mesoscopic
The conductances g1 (g2 ) are conductances for electron transport from reservoir 1 (2) through the point contact into a dot state with wave function . They are
Among the three equations (129) only two are independent, and we have to supplement those equations with the normalization condition $P_0 + P_\downarrow + P_\uparrow = 1$ (recall that $P_2 = 0$, since double occupancy has an energy cost $E_c \gg T$). In the presence of a small bias $eV = \mu_1 - \mu_2$ applied between the leads, the occupation factors $f_1$ and $f_2$ are given by the Fermi distribution functions of electrons taken at the energy $E \pm eV/2$, where $E = 2E_c(\mathcal N - \mathcal N^*)$ is the energy it takes to put an electron into the unoccupied state with the lowest energy in the dot. The current through, say, junction no. 1 is
$$I = \frac{e}{2\pi\hbar}\,\Delta\,\tilde g_1\left[2f_1P_0 - (1-f_1)(P_\uparrow+P_\downarrow)\right].$$
(131)
The stationary solutions for $P_0$, $P_\uparrow$ and $P_\downarrow$ can be found from the rate equations (129). We substitute the result into Eq. (131), take the limit of low bias V, and find that the conductance of the system is given by
$$G(\mathcal N) = \left.\frac{I}{V}\right|_{V\to0} = -\frac{e^2}{\pi\hbar}\,\frac{\tilde g_1\tilde g_2}{\tilde g_1+\tilde g_2}\,\frac{\Delta}{T}\,\frac{\partial f_F/\partial x}{1+f_F(x)}.$$
(132)
The dependence of the conductance on $\mathcal N$ comes through the energy dependence of the Fermi distribution function $f_F(x)$:
$$f_F(x) = \frac{1}{e^x+1}, \qquad x = \frac{2E_c(\mathcal N-\mathcal N^*)}{T}.$$
(133)
Note that the maximum of the conductance is slightly shifted (by $\sim T/E_c$) away from the point $\mathcal N = \mathcal N^*$, and more importantly, the conductance peak is not symmetric. This is the consequence of correlations in transport of electrons with opposite spins through a single discrete state on the dot. Because of the repulsion, no more than one electron can reside in that state at any time ($P_2 = 0$). Case (ii) is also described by Eq. (132) after the replacement $x \to -x$. At the maximum, the function (132) takes the value¹⁵
$$G_{peak} = \frac{e^2}{\pi\hbar}\,\frac{\tilde g_1\tilde g_2}{\tilde g_1+\tilde g_2}\,\frac{\Delta}{T}\,\left(3-2^{3/2}\right).$$
(134)
Eqs. (132) and (134) relate the conductance through an interacting-electron system to the single-particle electron wave functions; the height of the conductance peaks is expressed in terms of $\tilde g_{1,2}$. According to Eq. (130), these quantities experience mesoscopic
(135)
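The line shape (132) and the peak value (134) can be cross-checked by solving the rate equations (129) directly. The sketch below is a minimal illustration under stated assumptions: it works in dimensionless units (current in units of $e\Delta/2\pi\hbar$, conductance in units of $(e^2/\pi\hbar)\,\Delta/T$, bias $v = eV/T$), it assumes the sign convention $f_{1,2} = f_F(x \mp v/2)$ for the two leads, and the values of the conductances are hypothetical. The names g1 and g2 in the code stand for the exact conductances $\tilde g_{1,2}$ of Eq. (130).

```python
import math

def fermi(x):
    """Fermi function f_F(x) = 1 / (e^x + 1), Eq. (133)."""
    return 1.0 / (math.exp(x) + 1.0)

def stationary_probabilities(g1, g2, f1, f2):
    """Stationary solution of Eqs. (129) with P0 + P_up + P_down = 1 and P2 = 0."""
    a = g1 * f1 + g2 * f2                # total rate into a singly occupied state
    b = g1 * (1 - f1) + g2 * (1 - f2)    # total rate back to the empty state
    p0 = b / (b + 2 * a)
    p_up = a / (b + 2 * a)
    return p0, p_up, p_up

def current(g1, g2, x, v):
    """Eq. (131) in units of e*Delta/(2*pi*hbar); v = eV/T (assumed sign convention)."""
    f1, f2 = fermi(x - v / 2), fermi(x + v / 2)
    p0, p_up, p_down = stationary_probabilities(g1, g2, f1, f2)
    return g1 * (2 * f1 * p0 - (1 - f1) * (p_up + p_down))

def conductance_numeric(g1, g2, x, v=1e-4):
    """Linear conductance I/V in units of (e^2/(pi*hbar)) * Delta/T, central difference."""
    return (current(g1, g2, x, v) - current(g1, g2, x, -v)) / (4 * v)

def conductance_formula(g1, g2, x):
    """Eq. (132) in the same units, using -df/dx = f (1 - f)."""
    f = fermi(x)
    return g1 * g2 / (g1 + g2) * f * (1 - f) / (1 + f)

g1, g2 = 0.03, 0.01                 # hypothetical values of the exact conductances
x_peak = 0.5 * math.log(2.0)        # maximum of Eq. (132): e^x = sqrt(2)
peak = conductance_formula(g1, g2, x_peak)
print("peak / [g1*g2/(g1+g2)] =", peak * (g1 + g2) / (g1 * g2))
```

The maximum sits at $e^x = \sqrt 2$, i.e. at $\mathcal N - \mathcal N^* = T\ln 2/(4E_c)$, which is the small shift of order $T/E_c$ mentioned above, and the printed ratio should reproduce the factor $3 - 2^{3/2}$ of Eq. (134).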
Most easily this task is accomplished for the pure orthogonal $\beta = 1$ or unitary $\beta = 2$ ensemble [99]. In these cases, the wave functions are distributed according to Eq. (21). Substituting 15
The numerical prefactor in Eq. (134) is different from that used in Refs. [43,62,99–102]. The difference arises because these references did not properly account for the two-fold spin degeneracy in the conditions of the Coulomb blockade. However, this does not affect the results for the functional form of the distribution functions for the
Eq. (21) into Eq. (130) and the result into Eq. (134), we obtain Gpeak in terms of the average conductances g1; 2 and a single random variable ,
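$$G_{\rm peak} = (3 - 2^{3/2})\,\frac{e^2}{\pi\hbar}\,\frac{\Delta}{T}\,\frac{2\langle g_1\rangle\langle g_2\rangle}{\left(\langle g_1\rangle^{1/2} + \langle g_2\rangle^{1/2}\right)^2}\,\alpha. \qquad (136)$$

The distribution function for $\alpha$ is very sensitive to the presence ($\beta = 2$) or absence ($\beta = 1$) of a time-reversal symmetry breaking magnetic field,

$$W_{\beta=1}(\alpha) = \frac{e^{-\alpha}}{\sqrt{\pi\alpha}}, \qquad W_{\beta=2}(\alpha) = (1 - a)^2\,\alpha\, e^{-(1+a)\alpha}\left\{K_0[(1 - a)\alpha] + \frac{1+a}{1-a}\,K_1[(1 - a)\alpha]\right\}. \qquad (137)$$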
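As a sanity check on the normalization convention of $\alpha$, one can sample it directly instead of integrating Eq. (137). The sketch below assumes symmetric contacts and assumes that Eq. (130) has the form $g_i = \langle g_i\rangle u_i$, where $u_i = M|\psi(i)|^2$ is a Porter–Thomas variable with unit mean (the square of a unit Gaussian for $\beta = 1$, an exponential variable for $\beta = 2$). Under these assumptions, integrating Eq. (137) at $a = 0$ gives $\langle\alpha\rangle = 1/2$ for $\beta = 1$ and $\langle\alpha\rangle = 2/3$ for $\beta = 2$. The function names and sample size are illustrative.

```python
import math
import random

def sample_alpha(beta, g1_avg=1.0, g2_avg=1.0, rng=random):
    """Draw one value of alpha as defined through Eq. (136).

    Assumption: g_i = <g_i> * u_i with u_i a Porter-Thomas variable of unit mean.
    """
    if beta == 1:
        u1, u2 = rng.gauss(0.0, 1.0) ** 2, rng.gauss(0.0, 1.0) ** 2
    elif beta == 2:
        u1, u2 = rng.expovariate(1.0), rng.expovariate(1.0)
    else:
        raise ValueError("beta must be 1 (orthogonal) or 2 (unitary)")
    g1, g2 = g1_avg * u1, g2_avg * u2
    if g1 + g2 == 0.0:  # guard against a vanishing denominator
        return 0.0
    # Eq. (136): g1*g2/(g1+g2) = 2<g1><g2> alpha / (sqrt<g1> + sqrt<g2>)^2
    norm = 2.0 * g1_avg * g2_avg / (math.sqrt(g1_avg) + math.sqrt(g2_avg)) ** 2
    return (g1 * g2 / (g1 + g2)) / norm

def mean_alpha(beta, n=200_000, seed=12345):
    """Monte Carlo estimate of <alpha> for symmetric contacts."""
    rng = random.Random(seed)
    return sum(sample_alpha(beta, rng=rng) for _ in range(n)) / n

print("beta = 1:", mean_alpha(1), "(analytic value 1/2)")
print("beta = 2:", mean_alpha(2), "(analytic value 2/3)")
```

The larger mean for $\beta = 2$ reflects the weaker probability of small wave-function amplitudes once time-reversal symmetry is broken.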
Here $K_0(x)$, $K_1(x)$ are modified Bessel functions of the second kind [103], while the parameter $a$ characterizes the asymmetry of the point contacts:

$$a = \left(\frac{\langle g_1\rangle^{1/2} - \langle g_2\rangle^{1/2}}{\langle g_1\rangle^{1/2} + \langle g_2\rangle^{1/2}}\right)^2. \qquad (138)$$

It is interesting that the distribution function in the orthogonal case $\beta = 1$ remains universal no matter how asymmetric the contacts are. In the unitary ensemble such universality does not hold. In the case of symmetric contacts, $\langle g_1\rangle = \langle g_2\rangle$, this result was first obtained in [99]. The general case was considered in [104]. It is important to emphasize that the probability distributions (137) are strongly non-Gaussian and that they are sensitive to the magnetic field. These two features were checked experimentally in [105,21]. The results were in reasonable agreement with the theory,$^{16}$ see Fig. 10. It is also noteworthy that the average conductance depends on the magnetic field,

$$\langle G_{\rm peak}\rangle = (3 - 2^{3/2})\,\frac{e^2}{\pi\hbar}\,\frac{\Delta}{T}\,\frac{\langle g_1\rangle\langle g_2\rangle}{\left(\langle g_1\rangle^{1/2} + \langle g_2\rangle^{1/2}\right)^2}\,F[2 - \beta + (\beta - 1)a], \qquad (139)$$

where the asymmetry parameter $a$ was defined in Eq. (138) and the dimensionless function $F(x)$ is given by
$$F(x) = \frac{1 + x}{2x} - \frac{(1 - x)^2}{4x^{3/2}}\,\mathrm{arcosh}\,\frac{1+x}{1-x}; \qquad F(1) = 1, \quad F(0) = \frac{4}{3}. \qquad (140)$$

We see from Eqs. (139) and (140) that the average conductance in a magnetic field (unitary case, $\beta = 2$) is larger than in the absence of the field (orthogonal case, $\beta = 1$). This phenomenon is analogous to the negative magnetoresistance due to weak localization in bulk systems [106], with the same physics involved. For the average peak heights, the magnetoresistance is sensitive to the asymmetry of the contacts (138), and in the limiting case of very asymmetric contacts, $a \to 1$, the magnetoresistance vanishes.

$^{16}$ The symmetry breaking occurs due to the orbital effect of the magnetic field, and one can neglect the role of the Zeeman energy. Indeed, in experiments [21] the crossover between the ensembles corresponds to a field $B \sim 10$ mT. The Zeeman energy at such a field would be comparable to temperature only at $T \lesssim 2.5$ mK, which is 10 times lower than the base temperature in the measurements.

Fig. 10. Histograms of conductance peak heights for orthogonal (a,c) and unitary ensembles (b,d). Data (a,b) and (c,d) are taken from [105,21], respectively. An example of the raw data used for the graphs (c,d) is shown in Fig. 2(b).

So far, we have presented results for the peak height distributions for ensembles of Coulomb-blockaded quantum dots with either fully preserved or fully broken time-reversal symmetry. To investigate how the crossover between those two ensembles happens for the temperature regime (128), one has to use the statistics of the wave functions in the crossover ensemble of random Hamiltonians distributed according to Eq. (19). These statistics are described by Eqs. (22) and (23). Obtaining the moments of the conductance in the crossover regime is a lengthy, albeit straightforward, calculation involving Eqs. (130), (134), (22) and (23). We refer the readers to the original papers [100,107] and the review [15] for the results. (The peak height distribution function for the limit $a \to 1$, corresponding to the case of strongly asymmetric contacts, was obtained earlier in Ref. [61], see also Ref. [43].)

Besides the distribution functions of the peak heights at a fixed value of the magnetic field, another quantity of interest is the correlation function of the peak heights for different values of the magnetic field. It is defined as

$$C(B_1, B_2) = \frac{\langle\delta G(B_1)\,\delta G(B_2)\rangle}{\left(\langle\delta G(B_1)^2\rangle\langle\delta G(B_2)^2\rangle\right)^{1/2}}, \qquad \delta G(B) = G_{\rm peak}(B) - \langle G_{\rm peak}(B)\rangle. \qquad (141)$$
Physically, the correlator $C$ characterizes “how fast” the ensemble changes under the effect of the magnetic field.
The peak height correlation function (141) was studied numerically in Refs. [101,102]. For the unitary ensemble, $N_h^C \gg 1$, the results were found to be well approximated as

$$C(B_1, B_2) \approx \frac{1}{(1 + 0.25\,N_h^D)^2}, \qquad (142)$$

where the parameters $N_h^{D,C}$ are related to the difference and sum of the magnetic fields by Eq. (20). Once again, the characteristic magnetic field is determined by the condition $N_h \simeq 1$.

3.2. Mesoscopic fluctuations of Coulomb blockade valleys

In this subsection, we study the mesoscopic fluctuations
Fig. 11. Schematic picture of the second order processes contributing to the conductance in the Coulomb blockade valleys. Processes (a), (b) are electron-like and (c), (d) are hole-like. Elastic co-tunneling [28] corresponds to the processes (a) and (c), and inelastic co-tunneling [27,28] is described by (b) and (d).
The tunneling conductance can be calculated according to the Golden rule as

$$G = 2\,\frac{2\pi e^2}{\hbar}\,\nu^2\,|A_e + A_h|^2, \qquad (145)$$

where $\nu$ is the density of states per one spin in the leads and the extra factor of 2 takes care of the spin degeneracy. The amplitudes $A_e$ and $A_h$ correspond to the processes of Fig. 11(a) and (c), respectively. They can be calculated with the help of the Hamiltonian (97), with Eq. (127) for the matrix $W$, in second-order perturbation theory in the tunneling probabilities. The result is

$$A_e = \frac{\sqrt{g_1 g_2}}{4\pi^2\nu}\,M\Delta\sum_j \frac{\psi_j^*(1)\,\psi_j(2)}{\varepsilon_j + E_e}\,\theta(\varepsilon_j), \qquad A_h = -\frac{\sqrt{g_1 g_2}}{4\pi^2\nu}\,M\Delta\sum_j \frac{\psi_j^*(1)\,\psi_j(2)}{-\varepsilon_j + E_h}\,\theta(-\varepsilon_j), \qquad (146)$$

where $\varepsilon_j$ and $\psi_j(i)$ are the exact eigenenergies and eigenvectors of the non-interacting Hamiltonian (13), and the argument $i = 1, 2$ of the wave function labels the sites coupled to contacts 1 and 2, respectively, see Eqs. (127). The eigenenergies $\varepsilon_j$ are measured from the upper filled level in the dot, so that the step function $\theta(\pm\varepsilon_j)$ selects empty (occupied) orbital states for the electron-like (hole-like) processes of Fig. 11(a) [(c)]. The relative sign difference between the electron, $A_e$, and hole, $A_h$, amplitudes comes from the commutation relation of the fermion operators.
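Substituting Eq. (146) into Eq. (145), we find

$$G = \frac{e^2}{4\pi^3\hbar}\,g_1 g_2\,|F_e + F_h|^2, \qquad (147)$$

where the dimensionless functions $F$ are given by

$$F_e = M\Delta\sum_j \frac{\psi_j^*(1)\,\psi_j(2)}{\varepsilon_j + E_e}\,\theta(\varepsilon_j), \qquad F_h = M\Delta\sum_j \frac{\psi_j^*(1)\,\psi_j(2)}{\varepsilon_j - E_h}\,\theta(-\varepsilon_j). \qquad (148)$$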
It is also possible, and more convenient for some applications, to express $F$ in terms of the exact one-electron Green functions of the dot,

$$F_e = M\Delta\int_0^\infty \frac{d\varepsilon}{2\pi i}\,\frac{G^A_{12}(\varepsilon) - G^R_{12}(\varepsilon)}{\varepsilon + E_e}, \qquad F_h = M\Delta\int_{-\infty}^0 \frac{d\varepsilon}{2\pi i}\,\frac{G^A_{12}(\varepsilon) - G^R_{12}(\varepsilon)}{\varepsilon - E_h}, \qquad (149)$$

where the Green function is given by Eq. (24). Eqs. (147)–(149) solve the part of the problem depending on the interactions; the conductance is expressed in terms of the single-electron eigenfunctions and eigenvalues of the isolated dot. What remains is to perform the statistical analysis of the conductance in a fashion similar to what was done in Section 3.1 for the peak heights. Note, however, that there is an important difference from the case of the peak height statistics: while the height of a conductance peak depended on the wave function of one level only, many levels with energies of the order of $E_{e(h)}$ contribute to the tunneling. The superposition of such a large number of tunneling amplitudes significantly affects the conductance
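expressions are

$$\langle F_e F_e^*\rangle = \frac{\Delta}{E_e}, \qquad \langle F_h F_h^*\rangle = \frac{\Delta}{E_h}, \qquad \langle F_e F_e\rangle = \left(\frac{2}{\beta} - 1\right)\frac{\Delta}{E_e}, \qquad \langle F_h F_h\rangle = \left(\frac{2}{\beta} - 1\right)\frac{\Delta}{E_h}, \qquad \langle F_e F_h^*\rangle = \langle F_e F_h\rangle = 0, \qquad (151)$$

where all the averages are found in the same fashion as (150). Using Eq. (151), we can find the average of the conductance (147):

$$\langle G\rangle = \frac{e^2}{4\pi^3\hbar}\,g_1 g_2\left(\frac{\Delta}{E_e} + \frac{\Delta}{E_h}\right). \qquad (152)$$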
This expression was first obtained$^{17}$ in [28] by a different method of ensemble averaging. (To calculate quantities of the type of Eq. (151), the authors of Ref. [28] used the technique developed in the semiclassical theory of inhomogeneous superconductors [109,110].) Notice that a magnetic field has no effect on the average conductance; $\langle G\rangle$ remains the same for the unitary and orthogonal ensembles. We remind the reader that Eq. (152) is obtained within RMT, and therefore assumes $E_c \ll E_T$. However, as we show below, the
(154)
where the irreducible averages are given in Eq. (151). Eq. (154) indicates that the amplitudes F entering the conductance are random Gaussian variables with zero average and variance (151). (This can be proven by explicit consideration of the higher moments.) This observation immediately allows one to compute the
(155)
With the breaking of time reversal symmetry the
$^{17}$ Eq. (152) differs from Eq. (13) of [108] by a factor of 2, because of an algebraic error in that reference.
It is clear from Eq. (155) that the
$$W(\alpha) = \begin{cases} \dfrac{\theta(\alpha)}{\sqrt{2\pi\alpha}}\,e^{-\alpha/2}, & \beta = 1, \\[6pt] \theta(\alpha)\,e^{-\alpha}, & \beta = 2, \end{cases} \qquad \alpha = \frac{G}{\langle G\rangle}, \qquad (156)$$

which coincides with the Porter–Thomas distribution [59]. This was to be expected from the central limit theorem, because the conductance is determined by a large number of random amplitudes, which is exactly the assumption behind the Porter–Thomas distribution. Now we turn to the study of the effect of the magnetic field on the valley conductance. Our purpose is to find the correlation function for the conductance
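$$\langle F_e F_e^*\rangle = \frac{\Delta}{E_e}\,H\!\left(\frac{N_h^D\Delta}{2\pi E_e}\right), \qquad \langle F_e F_e\rangle = \frac{\Delta}{E_e}\,H\!\left(\frac{N_h^C\Delta}{2\pi E_e}\right), \qquad (157)$$

$$\langle F_h F_h^*\rangle = \frac{\Delta}{E_h}\,H\!\left(\frac{N_h^D\Delta}{2\pi E_h}\right), \qquad \langle F_h F_h\rangle = \frac{\Delta}{E_h}\,H\!\left(\frac{N_h^C\Delta}{2\pi E_h}\right). \qquad (158)$$

Here the dimensionless function $H(x)$ is given by

$$H(x) = \frac{1}{\pi x}\left[\ln x\,\ln(1 + x^2) + \pi\arctan x + \frac{1}{2}\,\mathrm{Li}_2(-x^2)\right] \qquad (159)$$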
with $\mathrm{Li}_2(x)$ being the second polylogarithm function [103]. The asymptotic behavior of the function $H$ is $H(x) = 1 + (x\ln x)/\pi$ for $x \ll 1$, and $H(x) = (\pi x)^{-1}\ln^2 x$ for $x \gg 1$. The parameters $N_h^D$ and $N_h^C$ are defined in Section 2.2. The limits of the pure orthogonal (unitary) ensembles (151) are recovered by putting $N_h^D = 0$ and $N_h^C = 0\ (\infty)$ in Eq. (157).

$^{18}$ This statement is valid for the case when the quantum dot is coupled to the electron reservoirs via tunneling point contacts, which we consider here. In the case of a wide junction (i.e., a contact with many contributing channels, each with small transparency), the conductance is a sum over all channels in the contacts of the transmission probabilities for these channels. In that case, the relative size of the
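The scaling function of Eq. (159) can be evaluated with nothing more than a real dilogarithm for negative arguments. The sketch below assumes the form $H(x) = (\pi x)^{-1}[\ln x\,\ln(1+x^2) + \pi\arctan x + \tfrac{1}{2}\mathrm{Li}_2(-x^2)]$, which reproduces both asymptotic limits quoted in the text; the helper `dilog_neg` is written for this example and uses the standard inversion and Landen identities for $\mathrm{Li}_2$.

```python
import math

def dilog_neg(z):
    """Real dilogarithm Li2(z) for z <= 0."""
    if z > 0.0:
        raise ValueError("this helper only handles z <= 0")
    if z == 0.0:
        return 0.0
    if z < -1.0:
        # Inversion: Li2(z) = -pi^2/6 - ln^2(-z)/2 - Li2(1/z)
        return -math.pi ** 2 / 6.0 - 0.5 * math.log(-z) ** 2 - dilog_neg(1.0 / z)
    # Landen: Li2(z) = -Li2(w) - ln^2(1 - z)/2 with w = z/(z - 1) in (0, 1/2]
    w = z / (z - 1.0)
    total, power = 0.0, 1.0
    for k in range(1, 200):
        power *= w
        term = power / (k * k)
        total += term
        if term < 1e-17:
            break
    return -total - 0.5 * math.log1p(-z) ** 2

def H(x):
    """Scaling function of Eq. (159), in the form assumed in the lead-in."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    if x == 0.0:
        return 1.0  # limiting value H(0) = 1
    bracket = (math.log(x) * math.log1p(x * x)
               + math.pi * math.atan(x)
               + 0.5 * dilog_neg(-x * x))
    return bracket / (math.pi * x)

for x in (1e-4, 1e-2, 1.0, 1e2, 1e6):
    print(f"x = {x:g}: H = {H(x):.6f}")
```

At $x = 1$ the closed form reduces to $5\pi/24$, since $\mathrm{Li}_2(-1) = -\pi^2/12$, which gives a convenient spot check. The decay of $H$ at large argument is what suppresses the Cooperon-type average $\langle F F\rangle$ once time-reversal symmetry is broken.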
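The correlation function acquires a universal form [108] (i.e., all the dependences for different $E_{e(h)}$ can be collapsed to a single curve upon rescaling of the magnetic field):

$$C(B_1, B_2) = \frac{\langle\delta G(B_1)\,\delta G(B_2)\rangle}{\langle G\rangle^2} = \left[H\!\left(\left(\frac{\Phi_1 + \Phi_2}{\Phi_c}\right)^2\right)\right]^2 + \left[H\!\left(\left(\frac{\Phi_1 - \Phi_2}{\Phi_c}\right)^2\right)\right]^2, \qquad (160)$$

where the scaling function $H(x)$ is defined in Eq. (159), $\Phi_i$ is the magnetic flux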
$$\Phi_c = \frac{1}{\sqrt{\chi g}}\left(\frac{2\pi E}{\Delta}\right)^{1/2}\Phi_0, \qquad E = \min(E_e, E_h), \qquad (161)$$

where $\Phi_0$ is the
$$W(\alpha) = \frac{\theta(\alpha)}{\sqrt{1 - D^2}}\,\exp\!\left(-\frac{\alpha}{1 - D^2}\right) I_0\!\left(\frac{D\alpha}{1 - D^2}\right); \qquad (162)$$

here $I_0(x)$ is the zeroth-order modified Bessel function of the first kind. In the limiting cases $D = 1$ and $D = 0$, the distribution function $W(\alpha)$ coincides with the Porter–Thomas distribution (156) for the orthogonal and unitary ensembles, respectively. The dependence of the crossover parameter $D$ on the magnetic flux

$$D = H\!\left(\frac{4\Phi^2}{\Phi_c^2}\right), \qquad (163)$$

where the correlation
Fig. 12. The correlation function $C(\Delta B = B_1 - B_2)$ for the conductance
agrees reasonably well with the theory. Extracted with the help of Eq. (160), the dependence of the correlation magnetic field on the gate voltage across a valley is shown in Fig. 13. One can see that the correlation magnetic field increases with the deviation from a conductance peak, in agreement with Eq. (161). However, the ratio of the correlation fields for the peaks, see Section 3.1, and for the valleys was found to be somewhat smaller than the theoretical prediction. We recall that a relatively large correlation magnetic field in a valley results from a wide energy band ($\sim E_c/\Delta$) of virtual discrete states in the dot. By the same token, the conductance in adjacent valleys is also correlated, because all but one of the virtual states participating in transport are the same for such valleys. In fact, the conductance remains correlated over a large number $\sim E_c/\Delta$ of valleys. The corresponding correlation function was calculated in Ref. [112], and the valley–valley correlation function for the differential capacitances was found in Ref. [113].

Considering electron transport through a blockaded dot, we have used so far finite-order perturbation theory in the dot–lead tunneling amplitudes. If the dot carries a non-zero spin, as in the case of an odd number of electrons on the dot, such a treatment may miss specific effects caused by the degeneracy of the spin state of an isolated dot. Tunneling results in the exchange interaction between the spins of the dot and leads. The exchange, in turn, leads to a many-body phenomenon, the Kondo effect. This effect results in an unexpected temperature dependence of the conductance across the dot at temperatures $T \ll \Delta$.

Fig. 13. Ensemble-averaged characteristic correlation field (solid) and average conductance (dashed) across peak–valley–peak for $\sim 14$ independent data sets.

3.3. Kondo effect in a strongly blockaded dot

The Kondo effect is one of the most studied and best understood problems of many-body physics. Initially, the theory was developed to explain the increase of resistivity of a bulk metal with magnetic impurities at low temperatures [114]. Soon it was realized that Kondo’s mechanism works not only for electron scattering, but also for tunneling through barriers with magnetic impurities [115–117]. A non-perturbative theory of the Kondo effect has predicted that the cross-section of scattering off a magnetic impurity in the bulk reaches the unitary limit at zero temperature [118]. Similarly, the tunneling cross-section should approach the unitary limit at low temperature and bias [119,120] in the Kondo regime. The Kondo problem can be discussed in the framework of Anderson’s impurity model [121]. The three parameters defining this model are: the on-site electron repulsion energy $U$, the one-electron on-site energy $\varepsilon_0$, and the level width $\Gamma$ formed by hybridization of the discrete level with the states in the bulk. The non-trivial behavior of the conductance occurs if the level is singly occupied, $\Gamma < |\varepsilon_0| < U$, and the temperature $T$ is below the Kondo temperature

$$T_K \simeq \sqrt{U\Gamma}\,\exp\left\{\frac{\pi\varepsilon_0(\varepsilon_0 + U)}{2\Gamma U}\right\}, \qquad (164)$$

where $\varepsilon_0 < 0$ is measured from the Fermi level [122]. Before considering the Kondo effect in a quantum dot, we first review it for tunneling through a single localized level with on-site repulsion $U$ using the Hamiltonian of the Anderson
model,

$$\hat H_A = \sum_{q,\sigma}\xi_q\left(a^\dagger_{1q\sigma}a_{1q\sigma} + a^\dagger_{2q\sigma}a_{2q\sigma}\right) + \sum_\sigma \varepsilon_0\,a^\dagger_{0\sigma}a_{0\sigma} + U\hat n_\uparrow\hat n_\downarrow \qquad (165)$$

$$\qquad + \sum_{q,\sigma}\left[(t_1 a^\dagger_{1q\sigma} + t_2 a^\dagger_{2q\sigma})\,a_{0\sigma} + a^\dagger_{0\sigma}(t_1 a_{1q\sigma} + t_2 a_{2q\sigma})\right], \qquad \hat n_\sigma = a^\dagger_{0\sigma}a_{0\sigma}. \qquad (166)$$

Here $a^\dagger_{1q\sigma}$, $a^\dagger_{2q\sigma}$ and $a^\dagger_{0\sigma}$ are the electron creation operators in the left and right leads (1 and 2) and on the localized level, respectively; $\xi_q$ are the corresponding energies in the electron continuum. For brevity, the tunneling matrix elements $t_1$ and $t_2$ connecting the localized state in the dot with the states in the leads are taken to be $q$-independent. The widths $\Gamma_i$ are related to these tunneling matrix elements: $\Gamma_i = 2\pi\nu_i|t_i|^2$, where $\nu_i$ is the density of states of lead $i = 1, 2$. At first sight, an Anderson impurity in a two-band model (166) may be associated with a two-channel Kondo problem [123]. However, it is easy to show that in our case the parameters of this two-channel problem are such that it can be reduced to the conventional single-channel one. Indeed, by a unitary transformation

$$\alpha_{q\sigma} = u\,a_{1q\sigma} + v\,a_{2q\sigma}, \qquad \beta_{q\sigma} = v\,a_{1q\sigma} - u\,a_{2q\sigma}, \qquad \begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{\sqrt{|t_1|^2 + |t_2|^2}}\begin{pmatrix} t_1 \\ t_2 \end{pmatrix}, \qquad (167)$$

Hamiltonian (166) can be converted [120] to the conventional one-band model with Anderson impurity [121]. The localized state and the band described by the fermion operators $\alpha_{q\sigma}$ form the usual Anderson impurity model, which is characterized by three parameters: $U$, $\varepsilon_0$, and $\Gamma = \Gamma_1 + \Gamma_2$. The band described by the operators $\beta_{q\sigma}$ is entirely decoupled from the impurity. The scattering amplitude $A(1, q \to 2, q')$ between two states in the opposite leads is related to the scattering amplitudes $A(\alpha, q \to \alpha, q')$ and $A(\beta, q \to \beta, q')$ within the bands $\alpha$ and $\beta$, respectively, by the relation

$$A(1, q \to 2, q') = uv\,[A(\alpha, q \to \alpha, q') - A(\beta, q \to \beta, q')].$$
The scattering amplitude in the $\alpha$-band is directly related to the scattering phase $\delta_K$ for the conventional Kondo problem, $A(\alpha, q \to \alpha, q) = \exp(2i\delta_K)$; the scattering problem for the $\beta$-band is trivial, $A(\beta, q \to \beta, q) = 1$. Using Eq. (167), we finally find

$$|A(1, q \to 2, q)|^2 = 4|uv|^2\sin^2\delta_K = \frac{4g_1 g_2}{(g_1 + g_2)^2}\,\sin^2\delta_K.$$

This way, the “Kondo conductance” $G_K$ associated with the spin-degenerate localized level can be expressed in terms of the problem of one-channel scattering off a single Kondo impurity in a bulk material,$^{19}$

$$G_K = \frac{e^2}{\pi\hbar}\,\frac{4g_1 g_2}{(g_1 + g_2)^2}\,f\!\left(\frac{T}{T_K}\right). \qquad (168)$$

$^{19}$ This mapping is possible due to the complete decoupling of the mode $\beta$ from the Anderson impurity spin. The mapping also works for quantum dots which have spin 1/2 and are described by the universal model (92). Corrections to this model, see Eq. (93), violate the mapping, but do not qualitatively change the behavior of the Kondo conductance in the case of $S = 1/2$. If the spin $S > 1/2$, then the temperature and magnetic field dependence of $G_K$ may be quite different from the predictions of Eq. (168), even for a dot described by the universal model, see Ref. [124].
Fig. 14. Plot of the universal function y = f(x) versus x = T=TK from Ref. [125].
Here $T_K$ is the characteristic Kondo temperature, and $f(x)$ is a universal function. This function, found with the help of the numerical renormalization group in Ref. [125], is plotted in Fig. 14. A remarkable property of scattering on a Kondo impurity is that the corresponding cross-section approaches the unitary limit at low energies, $f(0) = 1$. The low-temperature correction to the unitary limit is proportional to $T^2$ and is described by Nozières’ Fermi liquid theory [118],

$$f\!\left(\frac{T}{T_K}\right) = 1 - \frac{\pi^2 T^2}{T_K^2}, \qquad T \ll T_K. \qquad (169)$$

So far we considered the Kondo-enhanced tunneling through a single level. The same situation can be realized in tunneling through a quantum dot. The number of electrons in the dot is controlled by the applied gate voltage. If this number is an odd integer, $2n + 1$, i.e., the dimensionless gate voltage $N = C_g V_g/e$ lies in the interval

$$|N - (2n + 1)| < \tfrac{1}{2}, \qquad (170)$$

the Kondo effect leads to a dramatic increase of the conductance at low temperatures. The consequence is an even–odd alternation of the valley conductance through the quantum dot: in “even” valleys, the low-temperature conductance is due to elastic co-tunneling, while in “odd” valleys, the presence of an unpaired spin in the highest occupied level leads to the Kondo effect, which causes a remarkable increase of the low-temperature conductance compared to the even valleys. The advantage of using quantum dots for experimental studies of the Kondo effect stems from the opportunity to effectively control the parameters of the system, which is hardly possible for a magnetic impurity embedded in a host material. Quantitatively, the low-temperature limit for the conductance in the odd valleys is given by Eqs. (168) and (169). The Kondo temperature $T_K$, Eq. (164), depends on the gate voltage through the parameter $\varepsilon_0$:

$$\varepsilon_0 = 2E_c(N - N^*) < 0, \qquad U = 2E_c, \qquad (171)$$
Fig. 15. (a) Linear conductance at $T \gg T_K$ (solid line) and $T \lesssim T_K$ (dotted line). Due to the Kondo effect, the function $G(N)$ develops plateaux in place of the “odd” valleys of the Coulomb blockade. (b) Sketch of the temperature dependence of the linear conductance, $G(T)$, in the “odd” valley. The conductance decreases with the decrease of temperature from $T \sim E_c$ down to $T \sim T_{el} \simeq \sqrt{\Delta E_c}$, see, e.g., [28,108], and Eq. (181). At very low temperatures, $T \lesssim T_K$, the conductance grows again. If the singly occupied level, which gives rise to the Kondo effect, has equal partial widths, $\Gamma_1 = \Gamma_2$, the conductance reaches the unitary limit $G = e^2/\pi\hbar$ at low temperatures.
where $N^* = n + 1/2$ is the degeneracy point at which two different charge states of the dot have the same energy. At first glance, it appears that the Kondo temperature for a quantum dot may be obtained by substituting the parameters of Eq. (171) into Eq. (164). However, this is not exactly the case, as we explain below. Unlike the single-level Anderson impurity model, the discrete energy spectrum of a dot is dense, $\Delta \ll E_c$. Still, if the junction conductances are small, $G_i \ll e^2/\pi\hbar$ (strong Coulomb blockade), those levels of the dot which are doubly filled or empty are important only at scales larger than the one-electron level spacing, and only the single-occupied level contributes to the dynamics of the system at energy scales smaller than $\Delta$. As a result, the model of the dot attached to two leads can be truncated to the Anderson impurity model; however, the high-energy cut-off is determined by $\Delta$ rather than by the charging energy. (See Section 4.3 for more discussion.) This change in the energy cut-off replaces the first factor $U$ in Eq. (164) by $\Delta$. The level width $\Gamma$ can be related to the sum $g_1 + g_2$ for a given level in the quantum dot. Finally, Eq. (171) establishes the relation between $\varepsilon_0$ and $U$ on the one hand, and the charging energy $E_c$ and gate voltage $N$ on the other hand. The resulting Kondo temperature (for the interval $0 < N - N^* \ll 1/2$) is found as

$ 2 Ec ∗ TK (g1 + g2 ) exp − (172) (N − N ) : Ec g1 + g2

Eqs. (168) and (169) tell us that upon sufficiently deep cooling, the gate voltage dependence of the conductance through a quantum dot should exhibit a drastic change. Instead of the “odd” valleys, which correspond to the intervals of gate voltage (170), plateaux in the function $G(N)$ develop, see Fig. 15a. In other words, the temperature dependence of the conductance in the “odd” valleys should be very different from the one in the “even” valleys.
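The single-level formulas of this subsection can be tabulated directly. The sketch below evaluates the Kondo temperature in the form $T_K \simeq \sqrt{U\Gamma}\exp[\pi\varepsilon_0(\varepsilon_0 + U)/2\Gamma U]$ and the low-temperature conductance $G_K$ of Eqs. (168) and (169) in units of $e^2/\pi\hbar$. The factors of $\pi$ are assumptions of this illustration, as are the function names and the sample parameter values; the dot-specific expression (172) is not used.

```python
import math

def kondo_temperature(U, Gamma, eps0):
    """T_K ~ sqrt(U*Gamma) * exp(pi*eps0*(eps0 + U) / (2*Gamma*U)), cf. Eq. (164).

    All energies are in the same (arbitrary) units; eps0 is measured from
    the Fermi level and must satisfy -U < eps0 < 0 (singly occupied level).
    """
    if not (-U < eps0 < 0.0):
        raise ValueError("single occupancy requires -U < eps0 < 0")
    exponent = math.pi * eps0 * (eps0 + U) / (2.0 * Gamma * U)
    return math.sqrt(U * Gamma) * math.exp(exponent)

def asymmetry_factor(g1, g2):
    """Factor 4*g1*g2/(g1 + g2)^2 of Eq. (168); equals 1 for symmetric contacts."""
    return 4.0 * g1 * g2 / (g1 + g2) ** 2

def kondo_conductance_low_t(g1, g2, T, TK):
    """G_K in units of e^2/(pi*hbar), Eqs. (168)-(169); meaningful only for T << TK."""
    f = 1.0 - (math.pi * T / TK) ** 2
    return asymmetry_factor(g1, g2) * f

U, Gamma = 1.0, 0.05  # illustrative values
for eps0 in (-0.1, -0.3, -0.5):
    print(f"eps0 = {eps0:+.1f}: T_K = {kondo_temperature(U, Gamma, eps0):.3e}")
print("unitary limit, symmetric contacts:", kondo_conductance_low_t(1.0, 1.0, 0.0, 1.0))
print("T = 0, contacts with g1:g2 = 1:3: ", kondo_conductance_low_t(1.0, 3.0, 0.0, 1.0))
```

The exponential sensitivity of $T_K$ to $\varepsilon_0$ is the reason the Kondo effect is much easier to observe near a charge degeneracy point than in the middle of a valley, and the asymmetry factor shows that the unitary limit is reached only for equal partial widths.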
The conductance, determined by the activation and inelastic co-tunneling mechanisms, decreases monotonically with the decrease of temperature, and then saturates at the value of $G_{el}$ in the even valleys, see the previous section. On the contrary, if the gate voltage is tuned to one of the intervals (170), the $G(T)$ dependence is non-monotonic, see Fig. 15b. When the temperature is lowered, the conductance first drops due to the Coulomb blockade phenomenon. A further decrease of temperature results in the increase of the conductance at $T \lesssim T_K$. Its $T = 0$ saturation value
Fig. 16. The gate voltage dependence of the conductance at various temperatures (a), and the temperature dependence of the conductance in the valleys labeled 4 and 5 (b), see [127]. Reprinted with permission from Science 281 (1998) 540. Copyright 1998 American Association for the Advancement of Science.
depends on the ratio of the partial level widths $\Gamma_1$ and $\Gamma_2$ of a particular discrete level. The partial widths are related to the tunneling matrix elements, $\Gamma_i = 2\pi\nu|t_i|^2 \propto g_i$, where the matrix elements $t_i$, in turn, are proportional to the values of the electron eigenfunctions within the dot, $t_i \propto \psi_n(i)$, where $n$ labels the eigenstate and the argument $i$ of the wave function $\psi_n$ refers to the site in the dot adjacent to junction $i$, see Eqs. (126) and (130). In a disordered or chaotic dot, the electron wave functions are random quantities, described by the Porter–Thomas distribution [59]. As we already discussed in the previous section, the randomness of the wave functions results in mesoscopic fluctuations
increase the junction conductances, so that $G_{1,2}$ come close to $e^2/\pi\hbar$, which is the maximal conductance of a single-mode quantum point contact. The junctions in the experiments [126] were tuned to $G \simeq (0.3$–$0.5)\,e^2/\pi\hbar$. Under these conditions, one would expect $T_K \gtrsim 100$ mK only if $N - N^* \lesssim 0.07$, see Eq. (172); in the middle of the valley ($N - N^* = 0.5$) the Kondo temperature is unobservably small. Indeed, evidence for the Kondo effect was found [126] in a relatively narrow interval, $N - N^* \lesssim 0.15$. Only in this domain of gate voltages was the anomalous increase of the conductance $G(T)$ with lowering the temperature $T$ clearly observed; the anomalous temperature dependence of the conductance was not seen in the middle of the Coulomb blockade valley. In conventional Kondo systems, a signature of the effect is the characteristic minimum of the resistivity. The minimum comes from the competition of the two contributions to the resistivity: the phonon contribution decreases with lowering the temperature, while the Kondo contribution has the opposite temperature dependence. A similar feature in tunneling through a quantum dot would be a minimum in the conductance in an “odd” valley. However, such a minimum is quite shallow at $G_1, G_2 \ll e^2/\pi\hbar$. Indeed, the minimum comes from the competition of the temperature dependencies of the inelastic co-tunneling and of the Kondo contribution. In the domain $T \ll \Delta$ the inelastic contribution to the co-tunneling is exponentially small, $G_{in} \propto \exp(-\Delta/T)$. The proper expansion of the function $f(T/T_K)$ in Eq. (168) at $T \gg T_K$ and the use of (172) yield the following temperature dependence of the Kondo contribution to the valley conductance
˝(G1 + G2 ) 2 G(T ) = Gel 1 + ln + ··· ; (173) e2 Ec T

where $G_{el} = (\hbar G_1 G_2/4\pi e^2)(\Delta/E_c)$ is the average elastic co-tunneling conductance, see Eq. (152). As one can see from Eq. (173), the Kondo correction to the conductance remains particularly small everywhere in the temperature region $T \gtrsim T_K$. In order to increase the Kondo temperature and to observe the anomaly of the temperature-dependent conductance $G(T)$ under these unfavorable conditions, one may try to make the junction conductances larger. However, if $G_{1,2}$ come close to $e^2/\pi\hbar$, the discreteness of the number of electrons on the dot is almost completely washed out [39]. Exercising this option, therefore, raises a question about the nature of the Kondo effect in the absence of strong charge quantization. We address this question later on, in Section 4.3.

3.4. Overall temperature and gate voltage dependence of the conductance

Closing this section, we briefly
Ec (N − N∗ )=T e2 ; gU 2˝ sinh[Ec (N − N∗ )=T ]
(174)
where N∗ is a half-integer number, and the parameter gU is determined by the conductances of the point contacts g1; 2 , the statistical properties of the wave functions of the dot, and by the rate of inelastic processes. If the rate of inelastic processes 1=’ is small compared to the inverse electron dwell time in the dot (g1 + g2 ), then the parameter gU is given by ! " g1 g2 g1 g2 gU = = F[2 − ! + (! − 1)a] : (175) 1=2 g1 + g2 (g1 + g2 1=2 )2 Here, ! = 1; 2 in the limits of no and strong magnetic 8elds, respectively; the parameter of asymmetry of the point contacts a and the function F are de8ned in Eqs. (138) and (139). The conductance (174) has the same dependence on magnetic 8eld as the averaged peak conductance in the low-temperature (T ) regime, see Eq. (139). Notice, that unlike the lowtemperature peak conductance, the mesoscopic
g1 g2 : g1 + g2
(176)
The two cases of rapid and slow inelastic relaxation can be distinguished through their magnetoconductance, which, if normalized by the average peak height, is temperature independent as long as inelastic processes are not slow, 1=’ (g1 + g2 ), but quickly disappears for T & when 1=’ (g1 + g2 ) [130]. Recent measurements of the magnetoconductance of Coulomb 20
Higher order in g1 and g2 corrections to Eq. (176) were considered in Refs. [30,40] for the cases of single-channel and multichannel junctions, respectively. The latter case (g1 ; g2 1 at Nrmch ) is typical for the metallic Coulomb blockade systems. Summation of the leading series in (g1 + g2 ) ln(Ec =T ), performed in the limit Nch → ∞ results U + (g1 + g2 ) ln(Ec =T )]. It is noteworthy that the increase of the contacts [30] in the replacement gU → 2g=[2 conductances result in the decrease of the peak conductance G through the device.
blockade peak heights show values between the two extremes of rapid and slow inelastic relaxation [131]. At lower temperatures, $T \ll \Delta$, the peak heights are proportional to $1/T$ and exhibit strong mesoscopic fluctuations
$$G_{in} = \frac{4e^2}{3\pi\hbar}\,g_1 g_2\left(\frac{T}{E_c}\right)^2. \qquad (179)$$

One can see that the crossover between the two regimes occurs at temperatures

$$T \simeq T_{in} \equiv E_c/|\ln(g_1 + g_2)|. \qquad (180)$$

A comparison of Eq. (180) with the result (152) for elastic co-tunneling shows that the crossover to the temperature-independent conductance occurs at

$$T \simeq T_{el} \equiv \sqrt{E_c\Delta}. \qquad (181)$$

If the dot carries a finite spin, then at much lower temperatures, $T \lesssim T_K$, the conductance increases again due to the Kondo effect, see Section 3.3 and Fig. 15. In a typical experiment with a quantum dot formed in a semiconductor heterostructure, the five relevant energy scales are related as

$$T_K \lesssim \Delta \lesssim T_{el} \lesssim T_{in} \lesssim E_c. \qquad (182)$$
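The ordering of scales in Eq. (182) can be made concrete with a small helper. The sketch below assumes $T_{in} = E_c/|\ln(g_1 + g_2)|$ and $T_{el} = \sqrt{E_c\Delta}$, as in Eqs. (180) and (181), and labels the temperature windows as activation, inelastic co-tunneling and elastic co-tunneling. Both the regime labels and the sample numbers (energies in kelvin) are assumptions made for this illustration; the Kondo scale is left out.

```python
import math

def inelastic_crossover(E_c, g1, g2):
    """T_in = E_c / |ln(g1 + g2)|, Eq. (180); requires 0 < g1 + g2 < 1."""
    g = g1 + g2
    if not (0.0 < g < 1.0):
        raise ValueError("expected weak tunneling, 0 < g1 + g2 < 1")
    return E_c / abs(math.log(g))

def elastic_crossover(E_c, delta):
    """T_el = sqrt(E_c * delta), Eq. (181)."""
    return math.sqrt(E_c * delta)

def valley_regime(T, E_c, delta, g1, g2):
    """Name the mechanism expected to dominate the valley conductance at temperature T."""
    if T > inelastic_crossover(E_c, g1, g2):
        return "activation"
    if T > elastic_crossover(E_c, delta):
        return "inelastic co-tunneling"
    return "elastic co-tunneling"

# Illustrative parameters only (kelvin): charging energy, level spacing, contact conductances.
E_c, delta, g1, g2 = 12.0, 0.3, 0.05, 0.05
print("T_in =", inelastic_crossover(E_c, g1, g2))
print("T_el =", elastic_crossover(E_c, delta))
for T in (8.0, 3.0, 1.0):
    print(f"T = {T} K: {valley_regime(T, E_c, delta, g1, g2)}")
```

Because $T_{in}$ depends only logarithmically on the contact conductances while $T_{el}$ scales as the geometric mean of $E_c$ and $\Delta$, the inelastic co-tunneling window is typically well under a decade wide in temperature.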
Temperatures $T \lesssim T_{el}$ are easily accessible; at such temperatures the conductance
Width of the peak at larger values of conductances is discussed in Ref. [132].
4. Weakly blockaded dots

Charge quantization deteriorates gradually with the increase of the conductance of the junctions connecting the quantum dot to the leads. The characteristic resistance at which a substantial deterioration occurs can be estimated from the following heuristic argument. The energy of a state with one extra electron in the dot is $E_c$. Dispensing with the discreteness of the quasiparticle spectrum ($\Delta \to 0$), one can estimate the lifetime of a state with a given charge as the RC-constant of a circuit consisting of a capacitor with capacitance $C = e^2/2E_c$, imitating the dot, and a resistor with resistance $R = 1/G$, imitating the dot-lead contact. According to the Heisenberg uncertainty principle, this state is well-defined as long as $E_c \gtrsim \hbar G/C$. Therefore charge quantization requires a small lead-dot conductance, $G \lesssim e^2/\pi\hbar$. We refer to dots with $G \lesssim e^2/h$, for which the charge on the dot is quantized, as “strongly blockaded” dots. Dots with $G \gtrsim e^2/h$, for which charge is not quantized, are referred to as “weakly blockaded”. While the constraint on $G$ involves only the universal quantity $e^2/\pi\hbar$, the crossover from “strong” to “weak” Coulomb blockade occurring at $G \sim e^2/\pi\hbar$ is not universal. The detailed behavior of a partially open dot depends on the number of modes propagating through the junction and on their transparency, even in the limit $\Delta \to 0$. In the case of a tunnel junction with a large number of modes, each of which is characterized by a small transmission coefficient, charging effects vanish gradually at $G \gtrsim e^2/\pi\hbar$. At large $G$, the remaining periodic oscillations of the average charge of the dot with the varying gate voltage were shown to be exponentially small in the parameter $\hbar G/e^2$ [133] (a controllable calculation of the pre-exponential factor still remains an open problem). In the case of a single-mode junction, Coulomb blockade oscillations of the average charge vanish exactly at $G = e^2/\pi\hbar$, see Refs. [38,39].
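The heuristic RC estimate at the beginning of this section can be turned into a few lines of arithmetic. The sketch below takes $C = e^2/2E_c$ and the lifetime $\tau = RC = C/G$, and compares $E_c$ with $\hbar/\tau$. Solving $E_c = \hbar G/C$ gives $G = e^2/2\hbar$, in which the charging energy cancels. The function names are choices made for this example, and the estimate is parametric only, so the numerical factor should not be taken literally.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
HBAR = 1.054571817e-34      # reduced Planck constant, J s

def dot_capacitance(E_c):
    """Capacitance C = e^2 / (2 E_c) for a charging energy E_c in joules."""
    return E_CHARGE ** 2 / (2.0 * E_c)

def charge_lifetime(G, E_c):
    """RC time of the dot-contact circuit, tau = C / G, with G in siemens."""
    return dot_capacitance(E_c) / G

def charge_is_well_defined(G, E_c):
    """Heisenberg criterion of the text: E_c must exceed hbar / tau = hbar * G / C."""
    return E_c > HBAR / charge_lifetime(G, E_c)

def critical_conductance():
    """Conductance at which E_c = hbar * G / C; the charging energy drops out."""
    return E_CHARGE ** 2 / (2.0 * HBAR)

G_crit = critical_conductance()
G_quantum = E_CHARGE ** 2 / (2.0 * math.pi * HBAR)  # e^2/h
print(f"critical conductance: {G_crit:.4e} S = {G_crit / G_quantum:.4f} e^2/h")
for E_c in (1e-22, 1e-20):  # two very different charging energies, in joules
    print(E_c, charge_is_well_defined(0.5 * G_crit, E_c), charge_is_well_defined(2.0 * G_crit, E_c))
```

The threshold is the same for both charging energies, which is the sense in which the constraint on $G$ involves only a universal combination of $e$ and $\hbar$.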
Here we concentrate on the case of a narrow channel$^{22}$ supporting a few propagating modes and connected adiabatically [10] to the lead and quantum dot at its ends. This is the conventional geometry of a quantum dot device formed in a two-dimensional electron gas of a semiconductor heterostructure [8]. The theory for such a model is quite well developed, but the calculations are very involved.$^{23}$ In this review, our aim is twofold. On the one hand, we want to present the detailed derivations that transform the problem into one that can be solved using well-developed formalisms. On the other hand, we also want to give the general picture of the thermodynamic and transport properties of a weakly blockaded dot, as it emerges from the theory. Therefore, this section is organized as follows. We start with Section 4.1, where an overview of the final results is given. For those readers who wish to enter into the more detailed derivations, this subsection may also serve as a guide to the more detailed presentations, which can be found in Sections 4.5–4.8 and Appendices F–H, and which use a combination of the bosonization and perturbation theory methods. Before presenting a rigorous treatment of the mesoscopic fluctuations
22 We use the word “channel” instead of “contact” to emphasize that the structure of the wave function in and near the contact is one dimensional, see Section 4.2.
23 Coulomb blockade at $G \sim e^2/\hbar$ in the case of a large number of channels, $N_{ch} \gg 1$, is, in our opinion, less understood. General scaling ideas regarding the evolution of the Coulomb blockade with the increase of $G$ in this case are presented in Ref. [134].
I.L. Aleiner et al. / Physics Reports 358 (2002) 309–440
most important observables. Going beyond the estimates, however, requires a rigorous theory. It is necessary not only for fixing the numerical values of the coefficients in formulas for various measurable characteristics, but also for finding their dependence on control parameters. One can view the result for the correlation function of mesoscopic conductance
(183)
and obey the condition
$$E_c |r|^4 \lesssim \Delta. \qquad (184)$$
For such an open dot the conductance $G$ depends only weakly on temperature and gate voltage as long as $T \gtrsim \Delta$. In the regime $E_c \gg T \gg \Delta$, the disorder averaged conductance $\langle G \rangle$ is
1=2
2M(3=4) Ec eC 2 e2 1 0:24 G = 1− (|r1 |2 + |r2 |2 ) + 1− 1+ ; (185) 2˝ M(1=4) T 3 ! T where C ≈ 0:577 is the Euler constant, and M(x) is the Gamma function. This result summarizes Eqs. (346) and (349) below, see also Ref. [40]. The symmetry index ! = 1(2; 4) for the orthogonal (unitary, symplectic) ensemble. The two temperature-independent terms in Eq. (185) correspond to the classical resistance of the quantum dot and the interference (weak-localization) correction to it, and are known in the context of mesoscopic systems without electronic interactions. The temperature-dependent terms appear as a result of electron–electron interactions. However, they are small compared to the temperature-independent terms at temperatures T & . As a matter of fact, the amplitude of “usual” mesoscopic
In the case of stronger reflection,
$$E_c |r|^4 \gg \Delta, \qquad (186)$$
the above consideration and Eq. (185) are applicable, though in a narrower temperature interval,
$$E_c \gg T \gtrsim E_c |r|^4. \qquad (187)$$
At lower temperatures and for contacts with equal reflection probability, $|r_1| = |r_2|$, the conductance shows peaks with maxima of order $G \sim e^2/\hbar$, see Ref. [40]. The maxima are well pronounced if Eq. (186) is satisfied: in the valleys of Coulomb blockade the estimate for the conductance is
$$G_{\min} \sim \frac{e^2}{\hbar}\,\frac{\Delta}{E_c |r|^4}. \qquad (188)$$
At even lower temperatures, $ T . TK ;
TK ∼
Ec |r |4 exp −(N) ; Ec |r |4
(189)
the Kondo effect develops, and the conductance in the “odd” valleys approaches the unitary limit. The function of $\mathcal{N}$ in the exponent of Eq. (189) is positive with values $\sim 1$; the pre-exponential numerical factor is $\sim 1$. While the estimates (188) and (189) are easily obtainable by the combination of methods reviewed in Sections 3.2 and 4.3, the explicit dependence of the conductance on gate voltage across a valley and the explicit form of the functional dependence on $\mathcal{N}$ in the exponential of Eq. (189) are not known.

If the reflections in the junctions are of different strengths, then several temperature intervals appear, and the behavior of the conductance is significantly different in each interval. We will review here only the case of a strongly asymmetric setup: in one channel, backscattering is strong, $G_1 \equiv 1 - |r_1|^2 \ll 1$, while in the other channel, backscattering is weak,$^{24}$ but still not infinitesimally small:
$$\Delta/E_c \ll |r|^2 \ll 1. \qquad (190)$$
(Here we wrote $r$ for the backscattering amplitude $r_2$ in the weakly backscattering channel.) In this configuration, the first contact can be treated within the tunneling Hamiltonian formalism, and all the results of Section 4.8 are applicable. We find from Eqs. (358) and (359) of that section
$$G(T) \approx G_1 \times \begin{cases} 1, & T \gg E_c, \\ \dfrac{\pi^3 T}{8 E_c e^{\mathbf{C}}}, & E_c \gg T \gg T_0(\mathcal{N}), \\ \dfrac{8\pi^2 T^2}{3 e^{\mathbf{C}} E_c T_0(\mathcal{N})}, & T \ll T_0(\mathcal{N}), \end{cases} \qquad (191)$$
24
The case of |r2 ||r1 |1 is considered in Ref. [135].
where $T_0(\mathcal{N})$ is the energy scale at which the crossover to the Fermi liquid behavior, $G \propto T^2$, occurs [cf. Eq. (221)],
$$T_0(\mathcal{N}) = \frac{8 e^{\mathbf{C}}}{\pi^2}\, E_c |r|^2 \cos^2 \pi\mathcal{N}. \qquad (192)$$
The transport processes contributing to Eq. (191) are inelastic. At temperatures $T \ll T_0(\mathcal{N})$ the inelastic mechanism results in the Coulomb blockade oscillations. However, this contribution to the conductance vanishes at $T \to 0$. At very low temperatures the elastic co-tunneling becomes dominant in the electron transport. This mechanism yields, see Eq. (366),
$$G_{el} = G_1\, \frac{\Delta e^{-\mathbf{C}}}{2 E_c} \ln\frac{E_c}{T_0(\mathcal{N})}, \quad T \ll T_0(\mathcal{N}). \qquad (193)$$
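The regimes of Eqs. (191) and (192) can be tabulated with a short script. This is only a sketch: the function names are hypothetical, the prefactors are taken as they read in Eqs. (191) and (192), and the sharp switching between regimes at $T = E_c$ and $T = T_0(\mathcal{N})$ is an assumption made for illustration, since the actual conductance crosses over smoothly.

```python
import math

EULER_C = 0.5772156649015329  # Euler constant C entering Eqs. (191), (192)


def T0(N, Ec, r):
    """Crossover scale of Eq. (192): (8 e^C / pi^2) * Ec * |r|^2 * cos^2(pi N)."""
    return (8.0 * math.exp(EULER_C) / math.pi**2
            * Ec * abs(r)**2 * math.cos(math.pi * N)**2)


def conductance_ratio(T, N, Ec, r):
    """G(T)/G1 in the three asymptotic regimes of Eq. (191).

    The hard boundaries between regimes are an illustrative simplification.
    """
    t0 = T0(N, Ec, r)
    if T >= Ec:
        return 1.0
    if T >= t0:
        return math.pi**3 * T / (8.0 * Ec * math.exp(EULER_C))
    return 8.0 * math.pi**2 * T**2 / (3.0 * math.exp(EULER_C) * Ec * t0)


if __name__ == "__main__":
    Ec, r = 1.0, 0.1   # hypothetical values, energies in units of Ec
    for N in (0.0, 0.25, 0.5):
        print(N, T0(N, Ec, r), conductance_ratio(1e-3, N, Ec, r))
```

Because $T_0(\mathcal{N})$ vanishes at half-integer $\mathcal{N}$, the low-temperature branch is reached only away from those points, which is how the gate-voltage dependence enters the sketch.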
Here, in accordance with the left inequality of Eq. (190), we assumed $\Delta \ll T_0(\mathcal{N})$. In this temperature regime, the mesoscopic
e−C Ec 32 T0 (N) T G(T ) = G1 1 ln + 2 f : (194) 2Ec T0 (N) Ec eC TK (N) Here, the 8rst term represents elastic co-tunneling, cf. Eq. (193), while the second term is due to the Kondo eect. Further, f is the universal scaling function [125], see Fig. 14, and the Kondo temperature is given by $ T0 (N) TK : (195) exp − T0 (N) 3 The parameters 1 ; 2 and 3 are independent random quantities 25 obeying Porter–Thomas statistics [59], see Eq. (156). Therefore, the Kondo contribution also experiences mesoscopic
Note that in the case of an asymmetric setup, the $\mathcal{N}$-dependence of the exponential in Eq. (195) is known, unlike in the case $|r_1|^2 = |r_2|^2$, cf. Eq. (189).
Differential capacitance of an open dot: In addition to complications stemming from quantum
(196)
the result for the average capacitance can be found in Ref. [39],
$$\frac{\delta C_{\rm diff}(\mathcal{N})}{C} \sim \ln(E_c/T)\, |r|^2 \cos(2\pi\mathcal{N}). \qquad (197)$$
The derivation of that result and its generalization to the case of many modes (see also Ref. [138]) is given in Section 4.6, see Eq. (323) and the discussion around it. Upon lowering the temperature it becomes important whether the energy scale $E_c |r|^2$ is larger or smaller than the mean level spacing $\Delta$. If the condition
$$T_0(\mathcal{N}) \gg \Delta \qquad (198)$$
with $T_0(\mathcal{N})$ from Eq. (192) is satisfied, then oscillations of the differential capacitance follow the law [39]
$$\frac{\delta C_{\rm diff}(\mathcal{N})}{C} \sim \ln\left(\frac{1}{|r|^2 \cos^2 \pi\mathcal{N}}\right) |r|^2 \cos(2\pi\mathcal{N}). \qquad (199)$$
In the opposite case,
$$E_c |r|^2 \ll \Delta, \qquad (200)$$
mesoscopic
2
16 Ec K(s) = 2 2 cos(2s) ln4 ; (201) 3 ! Ec T see Eq. (321) in Section 4.6; here ! = 1(2) for a system with (without) time-reversal symmetry. The periodic oscillations of K(s) are due to the Coulomb blockade; they remain correlated over an interval s ∼ Ec =. Note, that for the case of one single-mode junction we consider here,
Fig. 17. Schematic view of a quantum dot with a single junction.
the periodic part of the correlation function exceeds the non-periodic part [see Eq. (337) with $N_{ch} = 2$] by a large factor $\sim 10\,(T/\Delta) \ln^4(E_c/T)$. If the temperature becomes lower than $\Delta$, the periodic part of $K(s)$ can be estimated by Eq. (201) with $T$ replaced by $\Delta$ under the logarithm, and for the estimate of the non-periodic part one may replace $T$ by $\Delta$ in Eq. (337). The periodic part of the correlation function dominates at the lowest temperatures as well. With the increase of the number of channels $N_{ch}$, however, the relative magnitude of the periodic part drops rapidly (see Section 4.6 for details).

4.2. Finite-size open dot: introduction to the bosonization technique and the relevant energy scales

We start with a single-junction system, see Fig. 17. The relevant electron states which facilitate the electron transfer through the junction and thus determine the charge
the spectrum of the fermions. Writing $\psi(x) = e^{-ik_F x}\psi_L(x) + e^{ik_F x}\psi_R(x)$, where $\psi_L$ and $\psi_R$ are left- and right-moving fermions respectively, we obtain from Eq. (202)
$$\hat H_0 = i v_F \int_{-\infty}^{\infty} dx\, \left(\psi_L^\dagger \partial_x \psi_L - \psi_R^\dagger \partial_x \psi_R\right). \qquad (203)$$
Here $v_F$ is the Fermi velocity of one-dimensional fermions. The possibility of reflection in the channel, which results in a deviation of the channel conductance from its perfect value $e^2/\pi\hbar$, can be taken into account by addition of a backscattering term to the Hamiltonian (203),
$$\hat H_{bs} = |r| v_F \left(\psi_L^\dagger(0)\psi_R(0) + \psi_R^\dagger(0)\psi_L(0)\right), \qquad (204)$$
where $|r|^2 \ll 1$ is the reflection coefficient.$^{26}$ The second part of the effective Hamiltonian represents the charging energy. Until now, we were expressing it in terms of the number of electrons inside the dot. Using the conservation of the total number of electrons in the entire system, it is convenient here to express this energy in terms of the charge outside the dot,
$$\hat H_C = E_c \left( \int_{-\infty}^{0} dx\, :\!\psi_L^\dagger \psi_L + \psi_R^\dagger \psi_R\!: \; + \; \mathcal{N} \right)^2. \qquad (205)$$
(Here $:\ldots:$ denotes normal ordering of the fermion operators.) We would like to treat the charging energy (205) non-perturbatively, while developing a perturbation theory in $\hat H_{bs}$. Refs. [38,39] suggest a scheme in which this can be done, using the bosonized representation for the one-dimensional fermions in the channel. The boson representation makes the interaction Hamiltonian (205) quadratic in the new variables, thus removing the main obstacle in building the desired perturbation theory in the backscattering Hamiltonian (204). In the boson representation, the two parts $\hat H_0$ and $\hat H_{bs}$ of the Hamiltonian take the form
$$\hat H_0 = \frac{v_F}{2} \int_{-\infty}^{\infty} dx \sum_{\gamma = \rho, s} \left[ \frac{1}{2} (\nabla \varphi_\gamma)^2 + 2 (\nabla \theta_\gamma)^2 \right], \qquad (206)$$
$$\hat H_{bs} = -\frac{2}{\pi} |r| D \cos[2\sqrt{\pi}\, \theta_\rho(0)] \cos[2\sqrt{\pi}\, \theta_s(0)], \qquad (207)$$
where $D$ is the energy bandwidth for the one-dimensional fermions, which are related to the boson variables by the transformation [139]
$$\psi^\dagger_{L,\sigma}(x) = \hat\eta_\sigma \sqrt{\frac{D}{2\pi v_F}} \exp\left\{ i\sqrt{\pi} \left[ \frac{1}{2}\varphi_\rho(x) + \frac{\sigma}{2}\varphi_s(x) + \theta_\rho(x) + \sigma\theta_s(x) \right] \right\},$$
$$\psi^\dagger_{R,\sigma}(x) = \hat\eta_\sigma \sqrt{\frac{D}{2\pi v_F}} \exp\left\{ i\sqrt{\pi} \left[ \frac{1}{2}\varphi_\rho(x) + \frac{\sigma}{2}\varphi_s(x) - \theta_\rho(x) - \sigma\theta_s(x) \right] \right\}.$$
26 The operators $\psi(0)$ and $\psi^\dagger(0)$ in Eq. (204) are regularized as $\int dx\, f(x)\psi(x)$ and $\int dx\, f(x)\psi^\dagger(x)$, respectively, where $f$ is a positive function, symmetric around $x = 0$ with $\int dx\, f(x) = 1$, and with support close to the origin. This same regularization is used throughout the remainder of this section.
Here we introduced Majorana fermions $\hat\eta_{\pm 1}$ to satisfy the anticommutation relations for fermions with opposite spins, $\{\hat\eta_{+1}, \hat\eta_{-1}\} = 0$, $\hat\eta_{\pm 1}^2 = 1$. Anticommutation of electrons of the same spin ($\sigma = 1$ or $-1$) is ensured by the following commutation relations between the canonically conjugated Bose fields:
$$[\nabla\varphi_\gamma(x'), \theta_{\gamma'}(x)] = [\nabla\theta_\gamma(x'), \varphi_{\gamma'}(x)] = -i\,\delta_{\gamma\gamma'}\,\delta(x - x'), \quad \gamma, \gamma' = \rho, s. \qquad (208)$$
The interaction term (205) also becomes quadratic in the boson representation:
$$\hat H_C = E_c \left( \frac{2\theta_\rho(0)}{\sqrt{\pi}} - \mathcal{N} \right)^2. \qquad (209)$$
The operators $(2e/\sqrt{\pi})\nabla\theta_\rho(x)$ and $(2/\sqrt{\pi})\nabla\theta_s(x)$ are the smooth parts of the electron charge ($\rho$) and spin ($s$) densities, respectively.$^{27}$ The average charge of the dot can be determined from the $\mathcal{N}$-dependence of the thermodynamic potential $\Omega$ of the full Hamiltonian. Taking into account that only one of its parts, the charging energy, depends on $\mathcal{N}$, we find
$$\langle \hat N \rangle_q = \mathcal{N} - \frac{1}{2E_c} \frac{\partial \Omega}{\partial \mathcal{N}} \qquad (210)$$
(hereinafter $\langle \cdots \rangle_q$ indicates the average over the quantum state without disorder average).

In the absence of backscattering, $r = 0$, the Hamiltonian of the system $\hat H_0 + \hat H_C$ is quadratic in the spin and charge densities. The explicit $\mathcal{N}$-dependence of the Hamiltonian can be easily removed by the transformation $\theta_\rho(x) \to \theta_\rho(x) + (1/2)\mathcal{N}\pi^{1/2}$. Hence, in this case the thermodynamic potential $\Omega$, and with it the ground state energy, has no $\mathcal{N}$-dependence, the average charge is linear in $\mathcal{N}$ and is not quantized, and
$$\frac{2}{\sqrt{\pi}} \langle \theta_\rho(0) \rangle_q = \langle \hat N \rangle_q = \mathcal{N}. \qquad (211)$$
Note that scattering between the right- and left-moving states occurs only at the point of a barrier in the channel, $x = 0$. Hamiltonian (202)–(204) completely ignores the fact that the dot has a finite size, and that a particle that entered it eventually ought to exit. However, these two events are separated by the dwell time $\sim \hbar/\Delta$, which makes model (206)–(209) applicable at $\Delta \to 0$. To illustrate the effect of a finite dwell time, we replace the infinite interval $(0, \infty)$ by a finite segment $(0, L]$, choosing the length of the effective one-dimensional system in such a way that the time delay of a particle entering this segment at $x = 0$ and bouncing off the “wall” at $x = L$ equals $\hbar/\Delta$. In other words, throughout the remainder of this section we replace the upper limit of the integration in Eq. (206) by $L$,
$$\hat H_0 = \frac{v_F}{2} \int_{-\infty}^{L} dx \sum_{\gamma = \rho, s} \left[ \frac{1}{2} (\nabla \varphi_\gamma)^2 + 2 (\nabla \theta_\gamma)^2 \right], \qquad (212)$$
where $L \simeq \hbar v_F/\Delta$, and set zero boundary conditions for the displacement fields in the spin and charge modes:
$$\theta_\rho(L) = \theta_s(L) = 0. \qquad (213)$$
When the reflection amplitude $r$ is finite, the Hamiltonian of the system is
$$\hat H = \hat H_0 + \hat H_C + \hat H_{bs}. \qquad (214)$$
27 Eq. (209) contains no contribution from the boson fields at $-\infty$, because of the order of limits implied in the interaction Hamiltonian (205). First, the range of the interaction is taken to infinity, then the length of the one-dimensional channel.
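The relation (210) between the average charge and the thermodynamic potential is easy to explore numerically. The sketch below is an illustration only: the helper names and the oscillating model potential $\Omega(\mathcal{N}) = -A\cos^2(\pi\mathcal{N})$ with a hypothetical amplitude $A$ are assumptions, chosen to contrast with the $r = 0$ case in which $\Omega$ does not depend on $\mathcal{N}$ and the charge simply follows the gate voltage, Eq. (211).

```python
import math


def average_charge(omega, N, Ec, h=1e-5):
    """<N>_q = N - (1/(2 Ec)) dOmega/dN, Eq. (210), by central finite difference."""
    d_omega = (omega(N + h) - omega(N - h)) / (2.0 * h)
    return N - d_omega / (2.0 * Ec)


def omega_flat(N):
    """r = 0: the thermodynamic potential has no N-dependence."""
    return -3.0


def make_oscillating_omega(A):
    """Hypothetical model potential -A cos^2(pi N), for illustration only."""
    return lambda N: -A * math.cos(math.pi * N)**2


if __name__ == "__main__":
    Ec = 1.0
    omega = make_oscillating_omega(0.1)
    for N in (0.0, 0.25, 0.5, 0.75):
        print(N, average_charge(omega_flat, N, Ec), average_charge(omega, N, Ec))
```

For the model potential the derivative is $A\pi\sin(2\pi\mathcal{N})$, so the charge acquires a weak periodic modulation on top of the linear dependence, $\mathcal{N} - (A\pi/2E_c)\sin(2\pi\mathcal{N})$.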
The largest energy scale appearing in the Hamiltonian (214) is the charging energy Ec . At energies smaller than Ec the charge is pinned to the value of gate voltage N, and does not exhibit quantum
T ; D sin T
which is readily performed with the help of the quadratic Hamiltonian (215). The result of this calculation is
$$\delta\Omega(\mathcal{N}) = -\frac{4 e^{\mathbf{C}}}{\pi^3}\, |r|^2 E_c \cos^2(\pi\mathcal{N}) \ln\frac{E_c}{T}. \qquad (218)$$
At low temperatures this correction diverges, which signals the breakdown of the simple perturbation theory in the backscattering Hamiltonian. In order to find the low-energy cut-off of the logarithm, we invoke the following renormalization group arguments.$^{28}$ Hamiltonian (215), (216) represents spin
(221)
The numerical coeKcient in Eq. (221) could not be found from the RG scheme but results from the exact consideration. We will not describe the calculation of this coeKcient. The renormalization procedure must be stopped when the bandwidth D˜ reaches T0 (N). At lower energies the dynamics of the spin mode is described by the renormalized Hamiltonian s ˆ H = K {s } + U {Bs } with the “potential energy” L √ cos(N) U {Bs (x)} = − D2 T0 (N) cos[2 Bs (0)] + vF d x(∇Bs )2 : (222) |cos(N)| −∞ The spin mode Bs (0) is strongly pinned by the potential energy. Physically, it means that the
28 The model is exactly soluble [39]; however, the arguments we use seem to be more physically transparent and suitable for further purposes.
Fig. 18. The potential $U_{\min}[\theta_s(0)]$ at various gate voltages. (a) At $\cos(\pi\mathcal{N}) > 0$ the potential has one minimum (spin singlet state). (b) At $\cos(\pi\mathcal{N}) < 0$ the minimum is doubly degenerate (spin doublet state). Tunneling between the two minima corresponds to the hybridization of the spin states of the dot and lead. This hybridization and the consequent lifting of the degeneracy is essentially the Kondo effect.
The “potential energy” (222) can be easily minimized with respect to the possible spatial con8gurations of the 8eld Bs (x), which satisfy the boundary condition (213), and have the value Bs (0) at x = 0: √ cos(N) cos[2 Bs (0)] + Bs (0)2 = : (223) Umin [Bs (0)] = − D2 T0 (N) |cos(N)| In the limit L → ∞ (i.e., → 0) the second term of Eq. (223) does not contribute, and the minima of the potential energy correspond to √ 2 Bs (0) = 2n if cos(N) ¿ 0 ; (224) √ 2 Bs (0) = (2n + 1) if cos(N) ¡ 0 with integer n. In each of these two sets, the energy is the same within a set for all n. At large but 8nite L, when the condition T0 (N)
(225)
is satis8ed, the minima of energy (223) hardly shift, but their degeneracy is reduced strongly. If the number of electrons on the dot is close to an even number, cos(N) ¿ 0, then the degeneracy is removed completely. If the charge is closer to an odd number of electrons, cos(N) ¡ 0, the energy minimum preserves double degeneracy (n = − 1 and 0), see Fig. 18. s The “kinetic” part K {s } of the Hamiltonian Hˆ is quadratic in ∇s , see Eq. (215); it does not commute with the “potential” part, and causes tunneling between the potential minima. The tunneling amplitude is energy-dependent [141], and small at E T0 (N). To describe the low-energy dynamics of the spin mode, it is convenient to project out all the states of the Luttinger liquid that are not pinned to the minima of the potential (223). Transitions within one of the sets (224) then can be described [140] by a tunneling Hamiltonian 2 √ D˜ ˆ H± = − (226) cos{ [s (+0) − s (−0)]} ; 2T0 (N) which operates in an energy band D˜ T0 (N). Here a discontinuity of the variable s (x) at x = 0 is allowed, and the point x = 0 is excluded from the region of integration in Eq. (215).
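The counting of degenerate minima described above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the cosine term of Eq. (223) takes the same value at every minimum of a given set in Eq. (224) and is therefore dropped, and the finite-size term is taken to be quadratic, $\theta_s(0)^2\Delta/\pi$; its normalization is an assumption and does not affect the degeneracy pattern. The function names are hypothetical.

```python
import math


def minima_energies(cos_sign, delta, n_range=3):
    """Finite-size energies of the pinned states of Eq. (224).

    cos_sign : +1 for cos(pi N) > 0, -1 for cos(pi N) < 0
    delta    : mean level spacing (sets the scale of the quadratic term)
    Returns a dict {n: energy}, keeping only the term theta^2 * delta / pi.
    """
    energies = {}
    for n in range(-n_range, n_range + 1):
        if cos_sign > 0:
            theta = math.sqrt(math.pi) * n           # 2 sqrt(pi) theta = 2 pi n
        else:
            theta = math.sqrt(math.pi) * (n + 0.5)   # 2 sqrt(pi) theta = (2n + 1) pi
        energies[n] = theta**2 * delta / math.pi
    return energies


def ground_state_labels(cos_sign, delta, tol=1e-12):
    """Labels n of the lowest-energy minima."""
    energies = minima_energies(cos_sign, delta)
    e_min = min(energies.values())
    return sorted(n for n, e in energies.items() if abs(e - e_min) < tol)


if __name__ == "__main__":
    print(ground_state_labels(+1, 1.0))  # single minimum: singlet
    print(ground_state_labels(-1, 1.0))  # two minima: doublet
```

With these assumptions the even case leaves the single minimum $n = 0$, while the odd case keeps the pair $n = -1$ and $n = 0$, in line with the singlet and doublet states of Fig. 18.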
The Hamiltonian Hˆ xy , which is a sum of two operators of 8nite shifts for the 8eld Bs (0), √ represents hops Bs (0) → Bs (0) ± between pinned states. In other words, Hˆ ± is the 8nite shift operator with respect to the variable B as can be checked with the help of the commutation relations (208). These hops correspond to a change by 1 of the z-projection of the dot’s spin. At energy scales D˜ the hops are predominantly inelastic. The variation of the transition amplitude with energy can be found by continuing the same RG procedure, until the running cut-o D˜ reaches the value . At energies below ,
vF2 ∇Bs (−0)∇Bs (+0) : 2T0 (N)
(227)
This term is usually omitted in the DC tunneling problem for a Luttinger liquid [140]. In agreement with the general idea about the SU(2) symmetry of the $g = 1/2$ Luttinger liquid [142], both operators (226) and (227) have the same scaling dimension. To find the numerical factor in Eq. (227), a comparison with the exact solution is again necessary. We have seen in this subsection that there is a hierarchy of the energy scales characterizing an open dot. The charge of the dot
below the intra-dot excitations are also suppressed. Under these conditions, the spin of the dot is $S = 0$ or $1/2$, depending on the sign of $\cos \pi\mathcal{N}$. If this cosine is positive, then the dot is in a singlet state, and there are no peculiarities in the temperature dependence of observable quantities at $T \lesssim \Delta$. However, if $\cos \pi\mathcal{N} < 0$, then the dot carries a spin, and the Kondo effect develops at sufficiently low temperatures. In the next subsection, we will study the temperature domain $T \lesssim \Delta$, find the Kondo temperature, and discuss the transport properties of the dot in the Kondo regime. Starting from Section 4.4, we return to the intermediate temperature scale, $\Delta \lesssim T \lesssim E_c$, in order to discuss the dominant transport mechanisms there.

4.3. The limit of low temperature: the effective exchange Hamiltonian and Kondo effect

We could continue using the same RG method applied to the bosonized representation of the Hamiltonian, down to the scale $\tilde D \sim \Delta$; it would allow us to estimate the order of magnitude of the effective exchange constant $J$ responsible for the hybridization. It is possible, however, to develop a different method applicable in the energy domain $T_0(\mathcal{N}) \gtrsim \tilde D \gtrsim \Delta$, which allows a more accurate determination of $J$ and of the Kondo temperature $T_K$. At energies $E \ll T_0(\mathcal{N})$, the spin field $\theta_s(0)$ is pinned at the point contact. Recalling that $\theta_s(L) = 0$, we see that the spin of the dot indeed takes discrete values only, as was mentioned above. At such an energy scale, it is instructive to return to the fermionic description of the problem. After the two parts of the Hamiltonian, Eqs. (226) and (227), are found, one can explicitly see that the initial SU(2) symmetry of the problem is preserved. Therefore, the effective Hamiltonian controlling the spin degrees of freedom of the system dot+lead at low energies corresponds to the isotropic exchange interaction,
$$\hat H_{ex} = J(\tilde D)\, \hat{\vec S}\, \hat{\vec S}_d. \qquad (228)$$
† Here ˜Sˆ d = 12 5ˆ†1 (˜R0 )˜1 2 5ˆ2 (˜R0 ) and ˜Sˆ = 12 ˆ 1 (˜R0 )˜1 2 ˆ 2 (˜R0 ) are the operators of spin density in the dot and in the lead, respectively, at the point ˜R0 of their contact; ˜1 2 are the Pauli matrices. In what follows, we adopt this point to be the origin, ˜R0 = 0. The dimension of ˜ depends on the way we normalize the electron wave functions; the exchange constant J (D) ˜ ˙ [d T0 ]−1 , where d and are the densities of dimension-wise, it can be written as J (D) states in the dot and lead, respectively. In fact, the form of Hamiltonian (228) could be derived without resorting to the explicit calculation performed above within the bosonization technique. The only important notion needed for the derivation, is the low-energy decoupling of the dot from the lead. Pinning of both charge and spin modes is the manifestation of this decoupling. The decoupling is complete, however, only in the absence of excitations. To describe the behavior of the system at low but 8nite temperatures, one needs to establish the least irrelevant operators that violate the complete decoupling. The decoupled dot and lead are described by independent Fermi liquids. The number of electrons in the dot cannot be changed, as such a variation of charge would result in a large energy increase of order Ec . That is why the leading irrelevant operators preserving the initial SU (2) symmetry are the interaction in the charge and spin channels between the dot and the
lead at the point of contact, ˜ NˆNˆd + Hˆ ex ; Hˆ irr = A(D) † where Hˆ ex is de8ned in Eq. (228) and Nˆd = 5ˆ† (˜R0 )5ˆ (˜R0 ) and Nˆ = ˆ (˜R0 ) ˆ (˜R0 ) are the operators of the charge density in the dot (x ¿ 0) and in the lead (x ¡ 0) respectively, at the point ˜R0 of their contact. The 8rst term here represents the interaction in the singlet channel (inelastic scattering between an electron in the dot an electron in the lead). Because the charge
˜ ∗n (0)’n2 (0)’∗k (0)’k2 (0). The detailed structure of the with the matrix elements J˜ n1 n2 k1 k2 = J (D)’ 1 1 one-electron functions in the continuum of states {k } is not important in the discussion of the Kondo eect. An examination of the standard calculation [114] shows that only the densities |’k (0)|2 , averaged over the continuum of states k, matter. Therefore we will suppress the indices k1 and k2 in the exchange constants hereinafter, assuming that the functions ’k are normalized so that |’k (0)|2 = |’k (0)|2 = 1 :
(230)
If the dot is in the spin-doublet state, the exchange constant gets renormalized at low energies. Unlike the “ordinary” Kondo model with only one localized orbital state involved, here the renormalization occurs due to virtual transitions in both the continuum spectrum of the lead and the discrete spectrum of the dot. To see the signature of the Kondo eect, we calculate the second order correction to the exchange constant J00 where 0 labels the singly occupied level of the quantum dot. This second order correction, diagrammatically shown in Fig. 19, has the form D˜ |’l (0)|2 ˜ = J 2 (D) ˜ |’0 (0)|2 J00 (E; D) d% ; (231) (l) (0) | % | + | % − % | |E| (l) d d ˜ |%d |¡D
Fig. 19. The diagrammatic representation for the second order correction to the exchange interaction constant. The electron states in the intermediate lines include excitations within the energy strip $[E, \tilde D]$.
where the integral is taken over the continuum energy spectrum % in the leads, is the corresponding density of electron states. The sum over the discrete energy levels of electrons in the dot includes the term involving the singly occupied level l = 0. This term, which corresponds to a spin-
meaning in the limit → 0. In a Fermion representation, one 8nds S ∼ hz Jd for the spin of the dot by a straightforward 8rst order perturbation theory in the exchange Hamiltonian (229). Comparing the two results for S, we 8nd ˜ = J ≡ [d T0 (N)]−1 ; J (E; D)
D˜ E :
(233)
˜ in Hamiltonian (228) with T0 (N). With the help of Eq. (233) we can relate the constant J (D) After we incorporated all ultraviolet contributions into the de8nition of the exchange constant, we are ready to study the Kondo eect per se. Taking the dierence of Eqs. (231) and (232) to 8nd the second order correction to the “new” coupling constant (233), we 8nd that the correction now only involves a contribution from the dierence of the true (discrete) spectrum of the dot and the approximation where the density of states is constant (and continuous): D˜ D˜ 2 |’l (0)| d d%d J00 = J |’0 (0)|2 1 + J d% ; (234) − (l) (0) | % | + |%d | ˜ | % | + | % − % | |E| − D (l) d d ˜ |%d |¡D
where d = 1= is the one particle density of states in the dot. The right-hand side of Eq. (234) is now perfectly convergent in the high energy. On the other hand, the contribution of the term with l = 0 gives a logarithmic divergence at E → 0. This is precisely the term which describes the Kondo eect, since it corresponds to a spin
˜ = J |’0 (0)|2 + J 2 |’0 (0)|4 ln + H ; J00 (E; D) (235) E D˜ D˜ 2 |’l (0)| d d%d H = lim lim d% : (236) − (l) E→0 |E| ˜ D→∞ |%| + |% | −D˜ |%| + |%d | (l) ˜ =0 |%d |¡D;l
d
The limit E = 0 exists in Eq. (236), and the constant H is of the order of unity and
The only dierence developing at larger conductances, is that the parameters de8ning the low-temperature behavior dier from the bare conductances of the junction. Comparison of the results for Kondo temperature (172) and (237) illustrates this point. In both cases, TK is controlled by the value of J, TK (J)1=2 e−1=J : In the case of strong Coulomb blockade, this parameter is simply related to the conductance of the junctions and to the ratio Ec = by second order perturbation theory in tunneling amplitudes, see Eq. (172). In the case of strong charge
induced by only the terms ˙ J12 in Hamiltonian (238). The conductance can be calculated by applying the standard Fermi golden rule to the problem. Because we neglect the level spacing at the moment, and because of the four-fermion structure of Hamiltonian (238), the electron transition rate and the conductance G at low temperatures is proportional to T 2 , 3 e 2 2 2 2 2 G(T ) = J T : 4˝ 12 d The comparison with the exact result [40], see also Eq. (359), yields 32 ˝ 2 = 2 C 2 G1 [Ec T0 (N)2 2d ]−1 : (239) J12 3 e e The smallest exchange constant in the problem, J11 , does not aect the conductance in the strongly asymmetric setup which we consider here. For calculating the conductance in the absence of strong asymmetry, one should know the value of J11 . For that purpose, some other response function should be calculated with the help of Hamiltonian (238) and then compared with an exact result; we will not address this problem in this Review. Instead, we utilize Eq. (239) to determine the conductance through the dot in the limit of low temperatures. If the gate voltage is close to an odd integer, cos N ¡ 0, the lowest discrete energy level in the dot is spin-degenerate. At T . , only this level remains important. This way, the initial problem of the dot, which has a dense spectrum of discrete levels, and is strongly coupled to the leads, is reduced to the problem of a single-level Kondo impurity considered in Section 3.3. We can adapt result (168) to 8nd the conductance. First, we express the temperature-independent factor in terms of the exchange constants, 29 rather than in terms of the conductances g1 and g2 : 2 4J12 4g1 g2 → : (g1 + g2 )2 (J11 + J22 )2 Next, we neglect the small term J11 in the denominator here, and 8nd:
T T 4e2 J12 2 GK ;N = f : TK ˝ J22 TK Now we are ready to 8nd the Kondo conductance through the dot in the regime of strong charge
3 16 4 1024 T 2 2 GK G1 |r2 | (cos N) f : (240) 3 33 TK (N) Here, TK (N) is given by Eq. (237), and f(x) is the universal scaling function plotted in Fig. 14. Note that the Kondo conductance (240) in a strongly asymmetric setup is proportional to the product of the small conductance G1 and small re<ection coeKcient |r2 |2 and therefore is signi8cantly smaller than the conductance quantum e2 =˝ even at T = 0. Similar to the case of the closed quantum dot, the dominant mechanisms switch from inelastic co-tunneling at high temperatures to the elastic temperature-independent mechanism at lower 29 2 The following formula is valid only for the single-channel Kondo model, i.e., if the condition J12 = J11 J22 holds. √ Within the g1 -accuracy of our calculation this relation trivially holds. Checking it to higher order in g1 requires calculation of J11 .
temperatures, and finally to the regime of the Kondo effect at the lowest temperatures. We postpone a more detailed discussion of those crossovers until after we have constructed a theory of elastic co-tunneling in open quantum dots. We turn to the outline of this theory in the following subsections.

4.4. Simplified model for mesoscopic fluctuations

As was already mentioned in the beginning of this section, the main difficulty for building a theory of interacting electrons in an open dot stems from the need to treat both the Coulomb interaction and the dot–lead conductance non-perturbatively. In the limit $\Delta \to 0$, the bosonization procedure allows one to treat exactly the interaction in a dot connected to a lead by a perfect quantum channel. Then, the effect of a finite reflection amplitude $r$ in the channel can be treated by a systematic perturbation theory [39,40]. An important conceptual conclusion drawn from this theory is that in the low-energy sector (defined by $\varepsilon \lesssim E_c |r|^2$), the excitations inside and outside the dot are independent from each other. This understanding allowed us to consider the low-temperature effects ($T \lesssim \Delta$) in a partially open dot with $E_c |r|^2 \gg \Delta$, because in this case one is able to map the properties of a partially open dot onto the “conventional” case of strong Coulomb blockade (see the previous subsection). Such a mapping allows one to consider the Kondo effect, but is insufficient for evaluating the co-tunneling contribution to the conductance, for which a broad range of energies up to $E_c$ is involved. (This was shown, e.g., in Section 3.2, where we considered the valley conductance through a strongly blockaded quantum dot.) The aim of this and the following sections is to build a theory capable of describing simultaneously many-body effects and mesoscopic
$1/E_c$ inside the dot, after which it bounces back into the channel without creating excitations. The same consideration is applicable to an electron attempting to leave the dot. The fact that within the energy range $|\varepsilon| \lesssim E_c$ from the Fermi level the electron processes are predominantly elastic implies that the low-energy properties of the system can be mapped onto a dot effectively decoupled from the channel, that there is a well-defined scattering amplitude from the entrance of the dot, and that the phase of this scattering amplitude is given by the Friedel sum rule
$$\delta = \pi Q/e = \pi\mathcal{N}. \qquad (241)$$
Eq. (241) can be applied to electrons incident from inside the dot, as well as to electrons incident from the channel. The description outlined here closely resembles Nozières’ Fermi liquid description of the unitary limit in the one-channel Kondo problem [118]. We now demonstrate that the above considerations are sufficient to reproduce the result
$$E_g(\mathcal{N}) \simeq |r| E_c \cos 2\pi\mathcal{N}, \qquad (242)$$
obtained by Matveev [39] for the ground state energy $E_g(\mathcal{N})$ of spinless fermions in the limit of zero level spacing in the dot and with a finite reflection amplitude $r$ in the contact, and then apply the same scheme to find the corrections to the ground state energy arising from a finite $\Delta$, but without $r$. Those corrections will result in the mesoscopic
where vF and kF are the Fermi velocity and Fermi wave vector, respectively. (Here we omitted the irrelevant constant part of the electron density; the restriction of wave vectors k to the range Ec =vF around kF is because only for those wave vectors elastic re<ection from the contacts takes place.) We thus obtain Ec v cos(2kF |x| − 2); |x| ¡ vF =Ec ; F N(x) = (243) sin(2kF |x| − 2) ; |x| ¿ vF =Ec : |x| Next, we take into account the eect of a scattering potential V (x) in the contact that generates a 8nite re<ection amplitude r = 0. In the 8rst order of perturbation theory, the presence of the potential V gives a shift to the ground state energy given by Eg (N) = d xN(x)V (x) : (244) Substituting Eq. (243) into Eq. (244), assuming that the eective range of the potential around x = 0 is smaller than vF =Ec , and using the standard expression |r | = |V˜ (2kF )|=vF , we obtain Eq. (242). Here V˜ (k) is the Fourier transform of the potential V (x).
I.L. Aleiner et al. / Physics Reports 358 (2002) 309–440
Fig. 20. Evolution of the energy levels of the quantum dot with the scattering phase $\delta$.
Having verified the relation between scattering phase shifts and the ground state energy for the case $\Delta = 0$, we proceed with evaluation of the ground state energy of a finite dot connected to a reservoir by a perfect channel. According to the discussion preceding Eq. (241), the channel is effectively decoupled from the dot due to the charging effect even though $r = 0$. We now find the ground state energy of the system by relating the ground state energy of the closed dot to the scattering phase $\delta$ of Eq. (241). For a chaotic dot, this problem is equivalent to finding a variation of the eigenenergies by introduction of an arbitrary scatterer [143] with the same phase shift $\delta$. The relevant contribution to the ground state energy is given by
$$E_g = \sum_{-E_c \lesssim \xi_i < 0} [\xi_i(\delta) + \mu], \tag{245}$$
where $\xi_i$ are the eigenenergies measured from the Fermi level $\mu$. (Again, we have restricted the summation to a window of size $\sim E_c$ around the Fermi level, since the phase shift $\delta$ applies only to particles in this energy range.) As soon as the scattering phase $\delta$ changes by $\pi$, one more level enters under the Fermi level. Evolution of the energy levels with changing $\delta$ is shown schematically in Fig. 20. The position of the level $\xi_i(\delta)$ satisfies the gluing condition
$$\xi_i(\delta + \pi) = \xi_{i+1}(\delta). \tag{246}$$
From Eqs. (245) and (246) we see that the ground state energy depends almost periodically on $\delta$:
$$E_g(\delta) = E_g(\delta + \pi) + O(\Delta). \tag{247}$$
As we will see below, the amplitude of the oscillation of the ground state energy with the variation of $\delta$ is much larger than the mean level spacing $\Delta$, so that we can neglect the last term in Eq. (247).
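In order to estimate the magnitude of the oscillations of the ground state energy, we recall that the correlation function of the level velocities is given by [47]
$$\langle \partial_\delta \varepsilon_i \rangle = \frac{\Delta}{\pi}; \qquad \langle \partial_\delta \varepsilon_i\, \partial_\delta \varepsilon_j \rangle = \frac{2}{\beta}\left(\frac{\Delta}{\pi}\right)^2 \delta_{ij}, \tag{248}$$
where $\beta = 1, 2$ for the orthogonal and unitary ensembles, respectively, and $\langle \cdots \rangle$ stands for the ensemble averaging. Formula (248) can be easily understood from the first order perturbation theory: At $\delta \ll 1$, we have $\varepsilon_i(\delta) \approx \varepsilon_i(0) + (\delta\Delta/\pi)|\psi_i^2(0)|$. Using the fact that $\psi_i(0)$ is a Gaussian random variable, see Eq. (21), one obtains Eq. (248). Estimating the mesoscopic
$$\langle [E_g(\delta) - E_g(0)]^2 \rangle = \frac{2}{\beta}\left(\frac{\delta\Delta}{\pi}\right)^2 \sum_{-E_c \lesssim \xi_i, \xi_j < 0} \delta_{ij} \approx \frac{2}{\beta}\left(\frac{\delta}{\pi}\right)^2 \Delta E_c. \tag{249}$$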
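The factor $2/\beta$ in Eq. (248) can be illustrated with a small Monte Carlo sketch. It assumes, as the text does, that the wave-function amplitude is a Gaussian random variable (real for $\beta = 1$, complex for $\beta = 2$), normalized here so that $\langle|\psi|^2\rangle = 1$; the function name, sample size, and seed are illustrative choices, not part of the original.

```python
import random


def intensity_variance(beta, samples=200_000, seed=0):
    """Sample variance of |psi|^2 for a Gaussian amplitude with <|psi|^2> = 1.

    beta = 1: real amplitude (orthogonal ensemble).
    beta = 2: complex amplitude (unitary ensemble).
    The normalization <|psi|^2> = 1 is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        if beta == 1:
            x = rng.gauss(0.0, 1.0)
            values.append(x * x)
        elif beta == 2:
            # Real and imaginary parts each carry half of the unit intensity.
            re = rng.gauss(0.0, 0.5 ** 0.5)
            im = rng.gauss(0.0, 0.5 ** 0.5)
            values.append(re * re + im * im)
        else:
            raise ValueError("beta must be 1 or 2")
    mean = sum(values) / samples
    return sum((v - mean) ** 2 for v in values) / samples


var_orthogonal = intensity_variance(1)  # expected to be close to 2/beta = 2
var_unitary = intensity_variance(2)     # expected to be close to 2/beta = 1
print(var_orthogonal, var_unitary)
```

Multiplying this variance by $(\Delta/\pi)^2$ gives the fluctuation of the level velocity in this toy picture.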
As we have already explained, energy (245) is a periodic function of $\delta$ with period $\pi$. On the other hand, for $\delta \lesssim 1$, Eq. (249) is valid. Therefore, the characteristic amplitude of the oscillations is of the order of $\sqrt{\Delta E_c}$, and it is plausible to assume that the correlation function of energies at two different parameters $\mathcal{N}_1, \mathcal{N}_2$ takes the form
$$\langle E_g(\mathcal{N}_1) E_g(\mathcal{N}_2) \rangle \approx \frac{\Delta E_c}{\beta} \cos 2\pi(\mathcal{N}_1 - \mathcal{N}_2), \tag{250}$$
where we use Eq. (241). It is important to notice that the variation of the energy of the ground state is much larger than the mean level spacing $\Delta$. A different, though equivalent, way to arrive at Eq. (250) is as follows. We have seen that because of the presence of the charging energy $E_c$, charge excitations in the dot relax elastically after a time $\sim \hbar/E_c$. If the dot size is infinitely big (corresponding to $\Delta = 0$), no backscattering from the dot's walls occurs within this short time interval, and the only source of elastic scattering is an eventual reflection amplitude $r$ in the contact. However, if the dot has a finite size, there is a finite probability that the incoming particle will scatter from its walls, or from impurities inside the dot, and exit elastically before the time $\hbar/E_c$. The corresponding amplitude $r$ for reflection from the dot in a time shorter than $\hbar/E_c$ can be estimated as
$$r \sim \int_0^{\hbar/E_c} dt\, S(t), \tag{251}$$
where $S(t)$ is the scattering matrix of the dot in time representation, cf. Eq. (119). Hence, in order to include elastic scattering from the dot, we have to add amplitude (251) to the reflection amplitude $r_c$ from the contact in Eq. (242). The mesoscopic
As a result, the amplitude of
(252)
Here, the Hamiltonian of the dot is, in turn, written as a sum of the Hamiltonian $\hat{H}_D$ of the closed dot in the absence of electron–electron interactions and the charging energy $\hat{H}_C$ [see also Eq. (92); here spin is included in the indices $\alpha$ and $\gamma$]
$$\hat{H}_D = \sum_{\alpha,\gamma} H_{\alpha\gamma} \hat{\psi}_\alpha^\dagger \hat{\psi}_\gamma; \qquad \hat{H}_C = E_c(\hat{n} - \mathcal{N})^2. \tag{253}$$
The Hamiltonian $\hat{H}_L$ of the leads is given by Eq. (96); the term $\hat{H}_{LD}$ that couples the dot with the leads is given in Eq. (97). The operator $\hat{n}$ in Eq. (253) is for the total charge $e\hat{n}$ on the dot. Conventionally, it is defined in terms of the creation and annihilation operators $\hat{\psi}^\dagger$ and $\hat{\psi}$ of fermions in the dot, as in Eq. (34). For the purpose of the formal manipulations of this section, it is more convenient, however, to change definition (34) of $\hat{n}$, and reformulate it in terms of the operators $\hat{\psi}_j(k)$ and $\hat{\psi}_j^\dagger(k)$ for the channel. This is possible because the total number of particles in the entire system (leads and dot) is an integer number which can be added to the parameter $\mathcal{N}$ at no cost, so that we can write
$$\hat{n} = -\sum_{j=1}^{N_{ch}} \int \frac{dk}{2\pi}\, \hat{\psi}_j^\dagger(k) \hat{\psi}_j(k). \tag{254}$$
Strictly speaking, the use of Eq. (254) instead of (34) is only permitted for a canonical description, in which the total number of particles in the system is held fixed. Below, however, we use Eq. (254) in a grand canonical description. The difference between the two descriptions is only important if non-periodic
$$\Omega = -T \ln(\mathrm{Tr}\, e^{-\hat{H}/T}). \tag{255}$$
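We evaluate the trace in two steps, $\mathrm{Tr} \ldots = \mathrm{Tr}_L \mathrm{Tr}_D \ldots$, where $L$ and $D$ indicate the fermionic operators belonging to the leads and dot, respectively. With the redefinition (254) of the charge $\hat{n}$, the interaction Hamiltonian $\hat{H}_C$ is now attributed to the leads. As a result, the Hamiltonian of the system becomes quadratic in the fermionic operators of the dot, so that this part of the system can be integrated out:
$$\mathrm{Tr}_D\, e^{-\hat{H}/T} = \mathrm{Tr}_D\, e^{-(\hat{H}_L + \hat{H}_C + \hat{H}_D + \hat{H}_{LD})/T} = e^{-(\hat{H}_L + \hat{H}_C)/T}\, e^{-\Omega_D/T}\, T_\tau \exp\left[\frac{1}{2} \int_0^{1/T} d\tau_1\, d\tau_2\, \langle \hat{H}_{LD}(\tau_1) \hat{H}_{LD}(\tau_2) \rangle_D\right]. \tag{256}$$
Here $\hat{H}_{LD}(\tau)$ is the interaction representation of the dot–lead coupling operator,
$$\hat{H}_{LD}(\tau) = e^{\tau(\hat{H}_L + \hat{H}_C + \hat{H}_D)}\, \hat{H}_{LD}\, e^{-\tau(\hat{H}_L + \hat{H}_C + \hat{H}_D)};$$
$T_\tau$ denotes time ordering for the imaginary time $\tau$; $\Omega_D$ is the thermodynamic potential of the closed dot,
$$\Omega_D = -T \ln \mathrm{Tr}\, e^{-\hat{H}_D/T}, \tag{257}$$
and the average $\langle \ldots \rangle_D$ over the Hamiltonian $\hat{H}_D$ of the dot is defined as
$$\langle \ldots \rangle_D = e^{\Omega_D/T}\, \mathrm{Tr}_D (e^{-\hat{H}_D/T} \ldots).$$
Note that, as the interaction has been shifted to the leads, the thermodynamic potential $\Omega_D$ of the closed dot is formally independent of the gate voltage $\mathcal{N}$. It remains to compute the average $\langle \hat{H}_{LD}(\tau_1) \hat{H}_{LD}(\tau_2) \rangle_D$ in Eq. (256). Hereto we use the explicit form (97) of $\hat{H}_{LD}$ and the Green function $G_{\alpha\beta}(\tau)$ of the closed dot,
$$G_{\alpha\beta}(\tau) \equiv -\langle T_\tau \hat{\psi}_\alpha(\tau) \hat{\bar{\psi}}_\beta(0) \rangle_D = T \sum_{n=-\infty}^{\infty} e^{-i\omega_n\tau} \left[\frac{1}{i\omega_n - \hat{H}_D}\right]_{\alpha\beta}, \tag{258}$$
where $\omega_n = \pi T(2n + 1)$ is a fermionic Matsubara frequency and $\hat{\bar{\psi}}(\tau) = \hat{\psi}^\dagger(-\tau)$. Hence we find
$$\langle \hat{H}_{LD}(\tau_1) \hat{H}_{LD}(\tau_2) \rangle_D = -2 \sum_{j_1, j_2} \hat{\bar{\psi}}_{j_1}(\tau_1; 0)\, [W^\dagger G(\tau_1 - \tau_2) W]_{j_1 j_2}\, \hat{\psi}_{j_2}(\tau_2; 0), \tag{259}$$
where the operators $\hat{\psi}_j(\tau; x)$ are the Fourier transform of $\hat{\psi}_j(\tau; k)$,
$$\hat{\psi}_j(x) = \int \frac{dk}{2\pi}\, e^{-ikx}\, \hat{\psi}_j(k).$$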
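It is convenient to introduce a new matrix, $L_{j_1 j_2}$, by the following relation:
$$4L(i\omega_n)_{j_1 j_2} \equiv -\frac{\mathrm{sign}\, \omega_n}{i\pi\nu}\, \delta_{j_1 j_2} + [W^\dagger G(i\omega_n) W]_{j_1 j_2}, \tag{260}$$
where $\nu = 1/2\pi v_F$ is the density of states in the leads. With this definition, the dot Green function $G$ and the coupling matrix $W$ can be eliminated from Eq. (259) in favor of the matrix $L$, so that we obtain
$$\langle \hat{H}_{LD}(\tau_1) \hat{H}_{LD}(\tau_2) \rangle_D = \sum_j \frac{2T}{\pi\nu \sin[\pi T(\tau_1 - \tau_2)]}\, \hat{\bar{\psi}}_j(\tau_1; 0) \hat{\psi}_j(\tau_2; 0) - 8 \sum_{j_1, j_2} \hat{\bar{\psi}}_{j_1}(\tau_1; 0)\, L_{j_1 j_2}(\tau_1 - \tau_2)\, \hat{\psi}_{j_2}(\tau_2; 0). \tag{261}$$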
The first term on the r.h.s. comes from the diagonal part of the Green function $G$ (i.e., the density of states in the dot); this term can be obtained by multiplying the ensemble-averaged Green function by the product $W^\dagger W = M\Delta/\pi^2\nu$ characterizing an ideal dot–lead coupling. The second term, which contains the matrix $L_{j_1 j_2}$, also has components off-diagonal in $j_1$ and $j_2$. It accounts for the combined effect of electron trajectories backscattered into the lead. Backscattering may be caused by electron reflection off a barrier in the contact, impurities in the contact and the dot, and the boundaries of the dot. This term is responsible for the mesoscopic
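$$\mathrm{Tr}_D\, e^{-\hat{H}/T} \propto \mathrm{Tr}_b\, e^{-\Omega_D/T}\, e^{-\hat{H}_{\rm eff}/T}\, T_\tau e^{-S_{\rm eff}}, \tag{262}$$
where $\mathrm{Tr}_b$ denotes a trace over the chiral fermion fields $\hat{b}_j$; $\hat{H}_{\rm eff}$ is an effective Hamiltonian that contains the fermion fields $\hat{\psi}_j(x)$ in the lead, the fictitious fermion fields $\hat{b}_j(x)$, and their coupling at $x = 0$,
$$\begin{aligned} \hat{H}_{\rm eff} = {} & i v_F \sum_j \int dx\, [\hat{\psi}_j^\dagger(x) \partial_x \hat{\psi}_j(x) + \hat{b}_j^\dagger(x) \partial_x \hat{b}_j(x)] \\ & + \frac{1}{\pi\nu} \sum_j [\hat{b}_j^\dagger(0) \hat{\psi}_j(0) + \hat{\psi}_j^\dagger(0) \hat{b}_j(0)] \\ & + E_c \Big( \sum_j \int dx\, {:}\hat{\psi}_j^\dagger(x) \hat{\psi}_j(x){:} + \mathcal{N} \Big)^2, \end{aligned} \tag{263}$$
and $S_{\rm eff}$ is an effective action that contains the fields $\hat{\psi}_j(x)$ at $x = 0$ only,
$$S_{\rm eff} = 4 \sum_{j_1, j_2} \int_0^{1/T} d\tau_1 \int_0^{1/T} d\tau_2\, \hat{\bar{\psi}}_{j_1}(\tau_1; 0)\, L_{j_1 j_2}(\tau_1 - \tau_2)\, \hat{\psi}_{j_2}(\tau_2; 0). \tag{264}$$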
The equivalence of representations (256) and (262)–(264) can be easily checked by tracing out the fermions $\hat{b}_j$ in Eq. (262), with the help of the relation [cf. the first term on the r.h.s. of Eq. (260)]
$$\langle T_\tau \hat{b}_{j_1}(\tau; 0) \hat{\bar{b}}_{j_2}(0; 0) \rangle = i\pi\nu T\, \delta_{j_1 j_2} \sum_{\omega_n} e^{-i\omega_n\tau}\, \mathrm{sgn}\, \omega_n.$$
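As a worked step, the Matsubara sum in the relation above can be evaluated in closed form, which shows where the $1/\sin$ dependence of the first term in Eq. (261) comes from. The sketch assumes fermionic frequencies $\omega_n = \pi T(2n+1)$, $0 < \tau < 1/T$, and an Abel regularization of the conditionally convergent series; the prefactor $i\pi\nu$ is taken from the relation as written here.

$$T \sum_{\omega_n} e^{-i\omega_n\tau}\, \mathrm{sgn}\, \omega_n = -2iT \sum_{n \ge 0} \sin[(2n+1)\pi T\tau] = -\frac{iT}{\sin \pi T\tau},$$

where $\sum_{n \ge 0} \sin[(2n+1)x] = 1/(2\sin x)$ follows from $\lim_{r \to 1} \mathrm{Im}\, [z/(1 - z^2)]$ with $z = re^{ix}$. Under these assumptions the correlator of the fictitious fermions becomes $\pi\nu T\, \delta_{j_1 j_2}/\sin \pi T\tau$, the same time dependence as the diagonal term of Eq. (261).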
The operators $\hat{\psi}_j(0)$ and $\hat{b}_j(0)$ appearing in the effective Hamiltonian (263) and in the effective action (264) are the fermion fields evaluated at the origin $x = 0$. To simplify notation, from now on we omit the coordinate in the argument of the one-dimensional operators $\psi_j$ (or, later, of the bosonic operators $\varphi_j$), when they are taken at the origin,
$$\psi_j(\tau) \equiv \psi_j(\tau; x = 0); \qquad \varphi_j(\tau) \equiv \varphi_j(\tau; x = 0). \tag{265}$$
It can be checked that the operator in the second line of Eq. (263) connecting the fermion fields $\hat{\psi}$ and $\hat{b}$ at $x = 0$ corresponds to an ideal coupling between those fields. This coupling means that a $\psi$-particle is scattered to a $b$-particle with unit probability. This motivates the following substitution of right and left moving fermion fields,
$$\hat{\psi}_j(x) = \hat{\psi}_{L,j}(x)\theta(-x) + \hat{\psi}_{R,j}(-x)\theta(x); \qquad \hat{b}_j(x) = i[\hat{\psi}_{R,j}(-x)\theta(-x) - \hat{\psi}_{L,j}(x)\theta(x)]. \tag{266}$$
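[Here $\theta(x) = 1$ if $x > 0$, $0$ if $x < 0$, and $\theta(0) = 1/2$.] With this substitution we obtain from Eqs. (263) and (264)
$$\hat{H}_{\rm eff} = i v_F \sum_j \int_{-\infty}^{\infty} dx\, (\hat{\psi}_{L,j}^\dagger \partial_x \hat{\psi}_{L,j} - \hat{\psi}_{R,j}^\dagger \partial_x \hat{\psi}_{R,j}) + E_c \Big( \sum_j \int_{-\infty}^{0} dx\, {:}\hat{\psi}_{L,j}^\dagger \hat{\psi}_{L,j} + \hat{\psi}_{R,j}^\dagger \hat{\psi}_{R,j}{:} + \mathcal{N} \Big)^2, \tag{267}$$
$$S_{\rm eff} = \sum_{j_1 j_2} \int_0^{1/T} d\tau_1 \int_0^{1/T} d\tau_2\, L_{j_1 j_2}(\tau_1 - \tau_2)\, [\hat{\bar{\psi}}_{L,j_1}(\tau_1) + \hat{\bar{\psi}}_{R,j_1}(\tau_1)] [\hat{\psi}_{L,j_2}(\tau_2) + \hat{\psi}_{R,j_2}(\tau_2)], \tag{268}$$
where the time dependence of any operator $\hat{A}$ is defined as
$$\hat{A}(\tau) = e^{\tau\hat{H}_{\rm eff}}\, \hat{A}\, e^{-\tau\hat{H}_{\rm eff}}. \tag{269}$$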
In this formulation, all backscattering from the dot and from the contact is characterized by the kernel $L$. To make contact with the theory of Section 2.4 of a non-interacting quantum dot, it is advantageous to express $L$ in terms of the scattering matrix $S$ of the non-interacting system. Comparing Eq. (101) with Eq. (260), one immediately finds
$$L(\omega_n) = \int_0^{1/T} e^{i\omega_n\tau} L(\tau)\, d\tau = \begin{cases} -\dfrac{1}{2\pi i\nu}\, \dfrac{S(i\omega_n)}{1 + S(i\omega_n)}, & \omega_n > 0, \\[3mm] \dfrac{1}{2\pi i\nu}\, \dfrac{S^\dagger(i\omega_n)}{1 + S^\dagger(i\omega_n)}, & \omega_n < 0. \end{cases} \tag{270}$$
This relation allows us to deduce the statistical distribution of the kernel $L(\tau)$ from the better known statistical distribution of the scattering matrix $S$, see Section 2.4. For ideal contacts the ensemble average of $S^n$ (with integer $n$) is zero, cf. Eq. (115), which implies that the ensemble average of $L$ is zero as well. Eqs. (267) and (268) provide an effective description of the combined system of the lead and the dots in terms of fermions moving in one-dimensional wires, where the interaction only exists on one end of the wire. The effective action $S_{\rm eff}$ provides a coupling between the wires and between left and right moving fermions. The one-dimensional description for the charging effect, which was already introduced in Section 4.2, was first proposed by Flensberg [38] and Matveev [39]. Their theory can be interpreted as an approximation of the effective action theory presented here, corresponding to the replacement of $S$ (and $L$) by its ensemble average,
$$S_{j_1 j_2} \to \langle S_{j_1 j_2} \rangle = -i r_{j_1} \delta_{j_1 j_2}, \tag{271}$$
where $r_j$ is the backscattering amplitude in the contact for the $j$th channel, see Section 2.4. At weak backscattering, $r_i \ll 1$, Eq. (270) gives, cf. Eq. (121),
$$L_{ij}(\tau) = v_F\, \delta(\tau)\, \delta_{ij}\, r_j, \tag{272}$$
so that the action loses its time dependence and acquires the form of a backscattering Hamiltonian. Approximation (271) loses all the information about the trajectories where the electron returns to the entrance of the dot, after having been reflected by the dot's walls or by impurities inside the dot. Such returns are the cause of mesoscopic
$$\Omega = \Omega_D - T \ln \mathrm{Tr}\, \{e^{-\hat{H}_{\rm eff}/T}\, T_\tau e^{-S_{\rm eff}}\}, \tag{273}$$
where $\Omega_D$ is defined by Eq. (257), and the effective Hamiltonian $\hat{H}_{\rm eff}$ and the action $S_{\rm eff}$ are given by Eq. (267). Eq. (273) allows the calculation of all thermodynamic properties of the system within the framework of the effective action. Such quantities involve, among others, the specific heat and the differential capacitance,
$$C_{\rm diff}(\mathcal{N}) = C \left(1 - \frac{1}{2E_c} \frac{\partial^2 \Omega}{\partial \mathcal{N}^2}\right). \tag{274}$$
We will perform the explicit calculation of this quantity in the next subsubsection.³⁰ It is not difficult to extend the effective action approach to the calculation of kinetic quantities, such as the two-terminal conductance and the tunneling densities of states. We will sketch these extensions below. The corresponding expressions at $E_c = 0$ were discussed in Section 2.4.

³⁰ In Eq. (273) the thermodynamic potential of the dot $\Omega_D$ does not depend on gate voltage $\mathcal{N}$ and, thus, does not contribute to the differential capacitance. More careful analysis, see Appendix F, reveals the dependence, which gives a contribution small as $\Delta/E_c$ to $C_{\rm diff}$. We relegate the corresponding discussion to the end of Section 4.6 and Appendix F.
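Eq. (274) can be illustrated with a minimal finite-difference sketch. As a toy input it uses an oscillating potential of the Matveev form, Eq. (242), with the prefactor set to exactly one; that choice, the function names, the value of $r$, and the step size are assumptions made for illustration only.

```python
import math


def omega_toy(n_gate, r=0.01, e_c=1.0):
    """Toy thermodynamic potential |r| E_c cos(2 pi N); unit prefactor is assumed."""
    return r * e_c * math.cos(2.0 * math.pi * n_gate)


def c_diff_ratio(n_gate, omega=omega_toy, e_c=1.0, h=1e-4):
    """C_diff / C = 1 - (1 / 2 E_c) d^2 Omega / dN^2, via a central difference."""
    second = (omega(n_gate + h) - 2.0 * omega(n_gate) + omega(n_gate - h)) / h ** 2
    return 1.0 - second / (2.0 * e_c)


# For the toy potential the closed form is 1 + 2 pi^2 |r| cos(2 pi N).
for n_gate in (0.0, 0.25, 0.5):
    print(n_gate, c_diff_ratio(n_gate))
```

Any other model for $\Omega(\mathcal{N})$ can be passed through the `omega` argument, as long as it is smooth on the scale of the step `h`.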
Two-terminal conductance, the general formulation: We will use the usual Kubo formula to express the two-terminal conductance $G$ in terms of the Matsubara correlation function
$$G = \lim_{\omega \to 0} \frac{F(i\Omega_n \to \omega + i0) - F(i\Omega_n \to \omega - i0)}{2i\omega}; \qquad F(i\Omega_n) = \int_0^{1/T} d\tau\, F(\tau)\, e^{i\Omega_n\tau}; \qquad F(\tau) = \langle T_\tau \hat{I}(\tau) \hat{I}(0) \rangle_q, \tag{275}$$
where $\Omega_n = 2\pi nT$ is the bosonic Matsubara frequency, and
$$\hat{I}(\tau) = \exp(\tau\hat{H})\, \hat{I}\, \exp(-\tau\hat{H}) \tag{276}$$
is the operator of the electric current in the Matsubara representation. Because the total current through any cross-section of the system is conserved, we can define the current operator in the leads. For future technical convenience, we define the current $I$ through the dot as a weighted difference of the currents $I_1$ and $I_2$ entering the dot through the two leads 1 and 2,
$$\hat{I} = \frac{1}{N_1 + N_2} (N_2 \hat{I}_1 - N_1 \hat{I}_2), \tag{277}$$
where $N_1$ ($N_2$) is the number of channels in the corresponding lead and
$$\hat{I}_\alpha = e v_F \sum_{j \in \alpha} \left(\psi_{Lj}^\dagger(x) \psi_{Lj}(x) - \psi_{Rj}^\dagger(x) \psi_{Rj}(x)\right)\Big|_{x \to 0}; \qquad \alpha = 1, 2 \tag{278}$$
is the operator of the current in lead $\alpha$ ($\alpha = 1, 2$). The fact that the current operator $\hat{I}$ is defined inside the leads allows us to copy the derivation of the effective action from the thermodynamic potential: we shift the interaction from the quantum dot to the lead and then take a partial trace over all dot states. As before, the result is represented by means of the same effective action for the one-dimensional fermions $\hat{\psi}_L$, $\hat{\psi}_R$. We thus find
$$F(\tau) = \frac{\mathrm{Tr}\, [e^{-\hat{H}_{\rm eff}/T}\, T_\tau \hat{I}(\tau) \hat{I}(0)\, e^{-S_{\rm eff}}]}{\mathrm{Tr}\, [e^{-\hat{H}_{\rm eff}/T}\, T_\tau e^{-S_{\rm eff}}]}, \tag{279}$$
where the effective Hamiltonian and action are defined in Eq. (267), and the time dependence of the operator $\hat{I}$ is given by Eq. (269). Eqs. (275), (279) and (267) constitute the complete formulation of the two-terminal conductance problem within the effective action theory. The practical execution of the analytic continuation in Eq. (275) is more easily achieved in the time domain. As we will see below, the function $F(\tau)$ can be analytically continued from the real axis to the complex plane, so that the result is analytic in a strip $0 < \mathrm{Re}\, \tau < 1/T$, and has branch cuts along the lines $\mathrm{Re}\, \tau = 0, 1/T$. This allows one to deform the contour of integration as shown in Fig. 21, thus obtaining
$$F(i\Omega_n) = i \int_{-\infty}^{\infty} dt\, e^{-\Omega_n t}\, [\theta(\Omega_n)\theta(t) - \theta(-\Omega_n)\theta(-t)]\, [F(it + 0) - F(it - 0)].$$
Now, the analytic continuation (275) can be performed, with the result
$$G = \frac{i}{2} \int_{-\infty}^{\infty} dt\, t\, [F(it + 0) - F(it - 0)]. \tag{280}$$
Fig. 21. The integration contour used in the evaluation of the conductance, see Eq. (275), for (a) $\Omega_n > 0$, and (b) $\Omega_n < 0$. Branch cuts of the analytic continuation of $F(\tau)$ are shown by thick lines.

Fig. 22. Schematic view of the asymmetric two-terminal setup. The left point contact has one channel almost open and the conductance of the right point contact $G_1$ is much smaller than $e^2/2\pi\hbar$. One of the electron trajectories contributing to elastic co-tunneling is also shown.
Next, we use the analyticity of $F(\tau)$ in the strip $0 < \mathrm{Re}\, \tau < 1/T$, and shift the integration variable $t \to t - i/2T$ in the first term in brackets in Eq. (280), and $t \to t + i/2T$ in the second term. Bearing in mind that $F(\tau) = F(\tau + 1/T)$, we find
$$G = \frac{1}{2T} \int_{-\infty}^{\infty} dt\, F(it + 1/2T). \tag{281}$$
This formula is the most convenient for practical calculations.

Tunneling density of states: In the case of a very asymmetric setup [$G_1 \ll e^2/\pi\hbar$ and arbitrary $G_2$], the general formula for two-terminal conductance (281) can be further simplified. One can expand the scattering matrix in terms of the small transmission amplitude corresponding to $G_1$, as was done for the case of non-interacting electrons, see Section 2.4, Eqs. (106)–(108). This way the problem of calculating the two-terminal conductance is reduced to that of evaluating the tunneling density of states for a dot strongly coupled to one lead, see Fig. 22. For the non-interacting system, the tunneling density of states is defined in Eq. (107). Because the transmission coefficient $|t_1|^2 \ll 1$, we can consider the tunneling current $I$ as a function of applied voltage $V$ in second order perturbation theory in the tunneling Hamiltonian [first term in Eq. (106)]. This gives us the standard result [87]
$$I(eV) = i[J(i\Omega_n \to eV + i0) - J(i\Omega_n \to eV - i0)], \tag{282}$$
where $\Omega_n = 2\pi Tn$ is the bosonic Matsubara frequency, and the Matsubara current $J$ is defined as
$$J(i\Omega_n) = \frac{G_1}{2\pi e}\, M\Delta \int_0^{1/T} d\tau\, e^{-i\Omega_n\tau}\, F_{\rm tun}(\tau)\, \frac{\pi T}{\sin \pi T\tau}, \tag{283}$$
where the last factor corresponds to the one-particle Green function of the electrons in the right lead (see Fig. 22; the right lead is not affected by interactions or mesoscopic
1
Here we wrote $\hat{\bar{\psi}}_1$ and $\hat{\psi}_1$ for the fermion creation and annihilation operators in the dot at the location of the tunnel junction, in keeping with the RMT formulation of Section 2.4, see Eq. (106). Analytic continuation in Eq. (282) is performed similarly to the derivation of Eq. (281), and one obtains
$$I = T \sinh\left(\frac{eV}{2T}\right) M\Delta\, G_1 \int_{-\infty}^{\infty} dt\, e^{-ieVt}\, \frac{F_{\rm tun}(it + 1/2T)}{\cosh \pi Tt}. \tag{285}$$
Here $F_{\rm tun}(\tau)$ must be analytic in the strip $0 < \mathrm{Re}\, \tau < 1/T$. The linear conductance $G$ is therefore given by
$$G = G_1 M\Delta \int_{-\infty}^{\infty} dt\, \frac{F_{\rm tun}(it + 1/2T)}{2\cosh \pi Tt}. \tag{286}$$
One can check that at $E_c = 0$, Eqs. (286) and (284) are equivalent to Eqs. (107) and (108). The Green function (284) is dramatically affected by the interactions, and we would like to construct an effective action theory to describe these effects. Once again, we wish to get rid of the fermionic degrees of freedom of the dot. Similar to Eq. (254), it is convenient to rewrite the charge operator $e\hat{n}$ in terms of the variables of the channel. However, here we have to keep in mind that the tunneling events described by the operators $\psi_1^\dagger$ and $\psi_1$ change the charge in the dot by an amount $+e$ and $-e$. This is not taken into account in the simple redefinition (254) of $\hat{n}$ that we used for the effective action theory for the ground state energy and the two-terminal conductance. Instead, to account for the fact that the total number of particles in the system changes upon a tunneling event, one has to introduce three additional operators [40]: a Hermitian operator $\hat{m}$ and two unitary operators $\hat{F}$ and $\hat{F}^\dagger$, with the commutation relations
$$[\hat{m}, \hat{F}^\dagger] = \hat{F}^\dagger. \tag{287}$$
These operators serve to keep track of the total number of particles in the system. They commute with all the fermionic degrees of freedom. We now include $\hat{m}$ into the definition of the charge operator,
$$\hat{n} = -\sum_j \int \frac{dk}{2\pi}\, \hat{\psi}_j^\dagger(k) \hat{\psi}_j(k) + \hat{m}, \tag{288}$$
and rewrite Eq. (284) as
$$F_{\rm tun}(\tau) = \langle T_\tau \hat{\bar{F}}(\tau) \hat{\bar{\psi}}_1(\tau) \hat{F}(0) \hat{\psi}_1(0) \rangle. \tag{289}$$
It is easy to see from Eq. (287) that the operators $\hat{F}^\dagger$, $\hat{F}$ in Eq. (289) change the charge, as defined by Eq. (288), by $+e$ and $-e$, respectively, in accordance with the initial definition (34) of charge in terms of the fermionic operators of the dot.
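The normalization of Eq. (286) can be checked numerically in the non-interacting limit. The sketch below assumes that at $E_c = 0$ the local propagator takes the free form $M\Delta\, F_{\rm tun}(it + 1/2T) = \pi T/\cosh \pi Tt$; this form is an assumption made here for illustration (it is not quoted from the text), and the function name, cutoff, and step count are likewise hypothetical. Under that assumption the integral should return $G = G_1$ at any temperature.

```python
import math


def linear_conductance_ratio(temperature, t_max_factor=40.0, steps=200_000):
    """G / G_1 from Eq. (286) for an assumed free propagator.

    Assumption: M * Delta * F_tun(it + 1/2T) = pi T / cosh(pi T t).
    The integral over t is done with the midpoint rule on a window
    wide enough that the integrand has decayed to negligible values.
    """
    t_max = t_max_factor / (math.pi * temperature)
    h = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps):
        t = -t_max + (k + 0.5) * h
        c = math.cosh(math.pi * temperature * t)
        propagator = math.pi * temperature / c  # assumed free form
        total += propagator / (2.0 * c)         # integrand of Eq. (286)
    return total * h


print(linear_conductance_ratio(0.1), linear_conductance_ratio(2.0))
```

The temperature independence of the result reflects the fact that the integrand is a total derivative of $\tanh(\pi Tt)/2$.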
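After this manipulation, the Hamiltonian of the system becomes quadratic in the fermionic operators of the dot, so that this part of the system can be integrated out. Manipulations analogous to the derivation of Eq. (273) give [70]
$$F_{\rm tun}(\tau) = F_{\rm in}(\tau) + F_{\rm el}(\tau), \tag{290}$$
$$F_{\rm in}(\tau) = -G_{11}(-\tau)\, \frac{\langle T_\tau e^{-S_{\rm eff}} \hat{\bar{F}}(\tau) \hat{F}(0) \rangle_q}{\langle T_\tau e^{-S_{\rm eff}} \rangle_q}, \tag{291}$$
$$\begin{aligned} F_{\rm el}(\tau) &= \sum_{\gamma, \gamma'; ij} \int_0^{1/T} \int_0^{1/T} d\tau_1\, d\tau_2\, G_{1\gamma}(\tau_1 - \tau)\, W_{\gamma j}^{(2)}\, M_{ij}(\tau; \tau_1, \tau_2)\, W_{\gamma' i}^{(2)*}\, G_{\gamma' 1}(-\tau_2), \\ M_{ij}(\tau; \tau_1, \tau_2) &= \frac{1}{4 \langle T_\tau e^{-S_{\rm eff}} \rangle_q} \langle T_\tau e^{-S_{\rm eff}} \hat{\bar{F}}(\tau) \hat{F}(0)\, [\hat{\bar{\psi}}_{L,i}(\tau_1) + \hat{\bar{\psi}}_{R,i}(\tau_1)] [\hat{\psi}_{L,j}(\tau_2) + \hat{\psi}_{R,j}(\tau_2)] \rangle_q, \end{aligned} \tag{292}$$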
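where the effective action is given by Eq. (267), the Green function of the dot, $G_{\alpha\beta}(\tau)$, is defined by Eq. (258), and the coupling matrix $W_{\gamma j}^{(2)}$ corresponds to the stronger (left) junction connecting the dot and a lead. Note also that the scattering matrix $S$ entering in the expressions for the effective action $S_{\rm eff}$, see Eqs. (268) and (270), includes this stronger junction only. The averages in Eqs. (290)–(292) are performed as
$$\langle \cdots \rangle_q = \mathrm{Tr}\, (e^{-\hat{H}_{\rm eff}/T} \cdots)$$
with the effective Hamiltonian
$$\hat{H}_{\rm eff} = i v_F \sum_j \int_{-\infty}^{\infty} dx\, (\hat{\psi}_{L,j}^\dagger \partial_x \hat{\psi}_{L,j} - \hat{\psi}_{R,j}^\dagger \partial_x \hat{\psi}_{R,j}) + E_c \Big( \sum_j \int_{-\infty}^{0} dx\, {:}\hat{\psi}_{L,j}^\dagger \hat{\psi}_{L,j} + \hat{\psi}_{R,j}^\dagger \hat{\psi}_{R,j}{:} + \mathcal{N} - \hat{m} \Big)^2. \tag{293}$$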
The interaction representation of the operators is defined in Eq. (269), with $\hat{H}_{\rm eff}$ from Eq. (293). The difference between Eqs. (293) and (267) is in the different definitions of the charge operators in Eqs. (254) and (288). There is a good physical reason to distinguish the two contributions to the tunneling density of states in Eq. (290). The first contribution, Eq. (291), is inelastic: this term does not allow the introduced electron to leave the dot; the charge of the dot at the moment of tunneling suddenly changes by $+e$ and all the other electrons have to redistribute to accommodate this charge. We will see that the logarithmic divergence of the imaginary time action corresponding to such evolution (orthogonality catastrophe) completely suppresses this contribution at $T \to 0$. Conversely, the second contribution, $F_{\rm el}$, cf. Eq. (292), contains the kernel $R(\tau)$ which promotes the tunneled electron through the dot into the channel (left lead). Because the very
same tunneling electron is introduced to and then removed from the dot, there is no need for the redistribution of other electrons, so no orthogonality catastrophe occurs. As a result, the elastic contribution survives at $T \to 0$, analogously to the elastic co-tunneling contribution to the conductance of a strongly blockaded dot, see Section 3.2. It can be shown that if the charging energy vanishes, $E_c = 0$, all physical results of the effective action theory contained in Eqs. (267)–(270) and (293) are equivalent to the non-interacting theory of Eqs. (103), (108) and (111). The advantage of the effective action representation becomes clear when one has to treat the effects of charging. The effective Hamiltonian (267) or (293) can then be diagonalized exactly and the effective action $S_{\rm eff}$ treated perturbatively. The technique for such treatment is described in the following subsection.

4.5.2. Bosonization of the effective Hamiltonian

The interaction is treated by bosonization of the chiral fermions. After bosonization, the Hamiltonian $\hat{H}_{\rm eff}$ is quadratic in the boson fields $\hat{\varphi}$. The price we pay, however, is that the effective action $S_{\rm eff}$ is not quadratic in the boson fields. We introduce boson fields $\hat{\varphi}_{Lj}$ and $\hat{\varphi}_{Rj}$ for left moving and right moving particles by³¹
$$\hat{\psi}_{Lj} = \frac{\hat{\eta}_j}{\sqrt{2\pi\lambda}}\, e^{-i\hat{\varphi}_{Lj}}; \qquad \hat{\psi}_{Rj} = \frac{\hat{\eta}_j}{\sqrt{2\pi\lambda}}\, e^{i\hat{\varphi}_{Rj}}; \qquad j = 1, \ldots, N_{ch}. \tag{294}$$
Here $N_{ch}$ is the total number of channels (including the spin degeneracy), and $\hat{\eta}_j = \hat{\eta}_j^\dagger$ is a Majorana fermion, $\{\hat{\eta}_j, \hat{\eta}_i\} = 2\delta_{ij}$. Since the Majorana fermions do not enter into the Hamiltonian, their average is given by
$$\langle T_\tau \hat{\eta}_j(\tau_1) \hat{\eta}_i(\tau_2) \rangle_q = \delta_{ij}\, \mathrm{sgn}(\tau_1 - \tau_2). \tag{295}$$
The boson fields $\hat{\varphi}_{Lj}$ and $\hat{\varphi}_{Rj}$ satisfy the commutation relations
$$[\hat{\varphi}_{Lj}(x), \hat{\varphi}_{Li}(y)] = -i\pi\delta_{ij}\, \mathrm{sgn}(x - y), \tag{296}$$
$$[\hat{\varphi}_{Rj}(x), \hat{\varphi}_{Ri}(y)] = i\pi\delta_{ij}\, \mathrm{sgn}(x - y), \tag{297}$$
$$[\hat{\varphi}_{Rj}(x), \hat{\varphi}_{Li}(y)] = -i\pi\delta_{ij}. \tag{298}$$
The high-energy cut-off $1/\lambda$ is of the order of $1/\lambda_F$ and is chosen consistently with the high-energy cut-off in the bosonic correlation functions, see below Eq. (303). The products $\hat{\psi}_{Lj}^\dagger \hat{\psi}_{Lj}$ and $\hat{\psi}_{Rj}^\dagger \hat{\psi}_{Rj}$ read
$${:}\hat{\psi}_{Lj}^\dagger(x) \hat{\psi}_{Lj}(x){:} = \frac{1}{2\pi} \frac{\partial}{\partial x} \hat{\varphi}_{Lj}(x); \qquad {:}\hat{\psi}_{Rj}^\dagger(x) \hat{\psi}_{Rj}(x){:} = \frac{1}{2\pi} \frac{\partial}{\partial x} \hat{\varphi}_{Rj}(x). \tag{299}$$

³¹ These bosonic fields are related to the fields of Eq. (208) as $\theta_j(x) = (\hat{\varphi}_{Rj} - \hat{\varphi}_{Lj})/\sqrt{2}$, $\hat{\varphi}_j(x) = (\hat{\varphi}_{Rj} + \hat{\varphi}_{Lj})/\sqrt{2}$. The linear scale $\lambda$ is related to the cut-off $D$ there by $\lambda = D/v_F$.
It is easy to check, using Eqs. (296)–(298), that the fermionic fields (294) obey the standard anticommutation relations. The commutation relation (298) guarantees the anticommutation relations $\{\hat{\psi}_{L,j}, \hat{\psi}_{R,j}\} = 0$, $\{\hat{\psi}_{L,j}^\dagger, \hat{\psi}_{R,j}\} = 0$ and the correct commutation relations between the operator of the number of particles in the dot, see Eqs. (254) and (299),³²
$$\hat{n} = \sum_{j=1}^{N_{ch}} \int_{-\infty}^{0} dx\, {:}\hat{\psi}_{R,j}^\dagger \hat{\psi}_{R,j} + \hat{\psi}_{L,j}^\dagger \hat{\psi}_{L,j}{:} = \frac{1}{2\pi} \sum_{j=1}^{N_{ch}} [\hat{\varphi}_{R,j}(x = 0) + \hat{\varphi}_{L,j}(x = 0)] \tag{300}$$
and the fermionic operators,
$$[\hat{n}, \hat{\psi}_{R,L}^\dagger(x)] = \hat{\psi}_{R,L}^\dagger(x)\, \theta(-x).$$
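The role of Eq. (298) can be made explicit with a short worked step. It assumes the value $-i\pi\delta_{ij}$ for the commutator (298) as written above, and uses the identity $e^{A}e^{B} = e^{B}e^{A}e^{[A,B]}$, valid when $[A,B]$ is a c-number, together with $\hat{\eta}_j^2 = 1$ from $\{\hat{\eta}_j, \hat{\eta}_i\} = 2\delta_{ij}$.

$$\hat{\psi}_{L,j}(x)\hat{\psi}_{R,j}(y) = \frac{\hat{\eta}_j^2}{2\pi\lambda}\, e^{-i\hat{\varphi}_{Lj}(x)} e^{i\hat{\varphi}_{Rj}(y)} = \frac{\hat{\eta}_j^2}{2\pi\lambda}\, e^{i\hat{\varphi}_{Rj}(y)} e^{-i\hat{\varphi}_{Lj}(x)}\, e^{[\hat{\varphi}_{Lj}(x),\, \hat{\varphi}_{Rj}(y)]} = e^{i\pi}\, \hat{\psi}_{R,j}(y)\hat{\psi}_{L,j}(x),$$

since $[\hat{\varphi}_{Lj}(x), \hat{\varphi}_{Rj}(y)] = -[\hat{\varphi}_{Rj}(y), \hat{\varphi}_{Lj}(x)] = i\pi$. The factor $e^{i\pi} = -1$ gives $\{\hat{\psi}_{L,j}, \hat{\psi}_{R,j}\} = 0$. Replacing $e^{-i\hat{\varphi}_{Lj}}$ by $e^{i\hat{\varphi}_{Lj}}$ changes the exponent to $-i\pi$, which again yields $-1$ and hence $\{\hat{\psi}_{L,j}^\dagger, \hat{\psi}_{R,j}\} = 0$. For different channels, $i \neq j$, the sign comes from the Majorana fermions instead.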
In terms of the bosonic fields, the effective Hamiltonian (267) acquires the form
$$\hat{H}_0 = \frac{v_F}{4\pi} \sum_j \int_{-\infty}^{\infty} dx \left[\left(\frac{\partial \hat{\varphi}_{R,j}}{\partial x}\right)^2 + \left(\frac{\partial \hat{\varphi}_{L,j}}{\partial x}\right)^2\right] + \frac{E_c}{4\pi^2} \Big( \sum_j [\hat{\varphi}_{L,j}(0) + \hat{\varphi}_{R,j}(0)] + 2\pi\mathcal{N} \Big)^2. \tag{301}$$
The bosonized form of the Hamiltonian (293) is obtained by the replacement $\mathcal{N} \to (\mathcal{N} - \hat{m})$. As Hamiltonian (301) is quadratic in the bosonic fields, it can be easily diagonalized. One can immediately notice [38,39] that the eigenenergies of (301) do not depend on the gate voltage $\mathcal{N}$ because the latter can be removed by the transformation $\hat{\varphi}_{L(R),j} \to \hat{\varphi}_{L(R),j} - \pi\mathcal{N}/N_{ch}$. On the other hand, the effective action (267) becomes a non-linear functional of the bosonic operators
$$S_{\rm eff} = \frac{1}{2\pi\lambda} \sum_{ij} \int_0^{1/T} d\tau_1 \int_0^{1/T} d\tau_2\, L_{ij}(\tau_1 - \tau_2)\, \hat{\eta}_i(\tau_1) \hat{\eta}_j(\tau_2)\, [e^{-i\hat{\varphi}_{Li}(\tau_1)} + e^{i\hat{\varphi}_{Ri}(\tau_1)}] [e^{i\hat{\varphi}_{Lj}(\tau_2)} + e^{-i\hat{\varphi}_{Rj}(\tau_2)}]$$
(hereinafter we use convention (265) for the operators taken at $x = 0$). Our general strategy will be to expand the quantity of interest in powers of the effective action, and then to average over the quadratic Hamiltonian (301). All the relevant averages have the form $\langle \exp(\hat{a}) \rangle$, where $\hat{a}$ is a linear combination of the boson fields $\hat{\varphi}_{Lj}$ and $\hat{\varphi}_{Rj}$; $j = 1, \ldots, N_{ch}$. Such an average can be found according to the rule
$$\langle \exp(\hat{a}) \rangle_q = \exp\left\{\langle \hat{a} \rangle_q + \frac{1}{2} [\langle \hat{a}^2 \rangle_q - \langle \hat{a} \rangle_q^2]\right\}. \tag{302}$$

³² The absence of a term proportional to the bosonic fields at $x = -\infty$ in Eq. (300) follows from the order of limits taken with respect to the range of the interaction and the system size: First, the interaction is defined in terms of the charge of the entire length of the channel for $x < 0$, and only then the length of the channel is sent to infinity.
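Rule (302) is the operator version of a familiar identity for Gaussian variables. The sketch below checks its classical analogue by Monte Carlo sampling: for a real Gaussian variable $a$, the average of $e^{a}$ should equal $\exp\{\langle a\rangle + \tfrac12[\langle a^2\rangle - \langle a\rangle^2]\}$. The mean, width, sample size, and seed are arbitrary illustrative values, and a classical random variable stands in for the operator $\hat{a}$.

```python
import math
import random


def check_gaussian_average(mean=0.3, sigma=0.5, samples=200_000, seed=1):
    """Compare <exp(a)> with exp(<a> + (<a^2> - <a>^2) / 2) for Gaussian a."""
    rng = random.Random(seed)
    draws = [rng.gauss(mean, sigma) for _ in range(samples)]
    # Left-hand side of the rule: direct average of the exponential.
    lhs = sum(math.exp(a) for a in draws) / samples
    # Right-hand side: built from the first two sampled moments only.
    first = sum(draws) / samples
    second = sum(a * a for a in draws) / samples
    rhs = math.exp(first + 0.5 * (second - first * first))
    return lhs, rhs


lhs, rhs = check_gaussian_average()
print(lhs, rhs)
```

Both numbers should agree with $\exp(0.3 + 0.125)$ up to sampling noise for the default parameters.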
The corresponding bosonic correlation functions can be found from the Hamiltonian (301), see e.g. Ref. [70]. For readers wishing to reproduce the calculation, we notice only that the commutation relation (298) should be taken into account to obtain the Green functions with correct analytic properties. The result is
$$\begin{aligned} \langle \hat{\varphi}_{L,j} \rangle_q &= \langle \hat{\varphi}_{R,j} \rangle_q = -\frac{\pi\mathcal{N}}{N_{ch}}; \qquad i, j = 1, \ldots, N_{ch}, \\ \langle T_\tau \hat{\varphi}_{Li}(\tau_1) \hat{\varphi}_{Lj}(\tau_2) \rangle &= \delta_{ij} \ln \frac{\pi\lambda T}{v_F |\sin \pi T(\tau_1 - \tau_2)|} + A + \left(\delta_{ij} - \frac{1}{N_{ch}}\right) B, \\ \langle T_\tau \hat{\varphi}_{Ri}(\tau_1) \hat{\varphi}_{Rj}(\tau_2) \rangle &= \langle T_\tau \hat{\varphi}_{Li}(\tau_1) \hat{\varphi}_{Lj}(\tau_2) \rangle, \\ \langle T_\tau \hat{\varphi}_{Li}(\tau_1) \hat{\varphi}_{Rj}(\tau_2) \rangle &= \delta_{ij} \frac{i\pi}{2}\, \mathrm{sgn}(\tau_1 - \tau_2) - \frac{1}{N_{ch}} \ln f(\tau_2 - \tau_1) - A. \end{aligned} \tag{303}$$
Here all of the bosonic operators are taken at the origin $x = 0$. One should take the limit $A \to +\infty$; $B \to +\infty$ at the end of the calculation of each physical propagator.³³ The function $\ln f(\tau)$ has the integral representation
$$\ln f(\tau) = \ln \frac{\lambda E_c N_{ch} e^{C}}{\pi v_F} - \int_0^{\infty} dx\, e^{-x} \ln \frac{\sinh[(i\tau + \pi x/E_c N_{ch})\pi T]}{\sinh(\pi^2 Tx/E_c N_{ch})}, \tag{304}$$
where $C \approx 0.577$ is the Euler constant. We note that $f(\tau)$ is analytic for $\mathrm{Im}\, \tau < 0$. For $T \ll E_c$, to a good approximation, we set
$$f(\tau) \approx \frac{\pi\lambda T}{i v_F \sin[\pi(\tau - it_0)T]}; \qquad t_0 = \frac{\pi}{E_c N_{ch} e^{C}}. \tag{305}$$
This equation is valid both for $\tau = 0$ and for $\tau \gg 1/N_{ch}E_c$. It also provides the correct upper cut-off for the arising logarithmic divergences. Approximation (305) fails when contributions from times $\tau \sim 1/N_{ch}E_c$ are important, as is the case for spinless fermions with $N_{ch} = 1$. Using Eqs. (294), (295) and (303) one can find all the relevant correlation functions of the fermionic operators. We will discuss some of them here in order to demonstrate the analytic structure of the perturbation theory. First of all, from Eq. (303) we immediately find that any Green's function which involves only left (L) or only right (R) moving fermions is not sensitive to the interaction, i.e.,
$$(2\pi v_F) \langle T_\tau \hat{\bar{\psi}}_{L,i}(\tau_1) \hat{\psi}_{L,j}(\tau_2) \rangle_q = \delta_{ij} \frac{\pi T}{\sin[\pi T(\tau_1 - \tau_2)]} \tag{306}$$
and all the averages are given by the Wick theorem. All such Green functions may have only pole-type singularities in the plane of the complex variable $i\tau$. The insensitivity of this type of Green function to the interaction can be understood from the chirality of the fermions and the causality principle.

³³ The precise meaning of these quantities is $N_{ch}A/2 = \sum_{ij} \langle \hat{\varphi}_{L,i}(\tau) \hat{\varphi}_{L,j}(\tau) \rangle_q$, $(N_{ch} - 1)B + A/2 = \sum_i \langle \hat{\varphi}_{L,i}(\tau) \hat{\varphi}_{L,i}(\tau) \rangle_q$.
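The statement that approximation (305) reproduces the integral representation (304) at $\tau = 0$ and at $\tau \gg 1/N_{ch}E_c$ can be probed numerically. The sketch below assumes the factors of $\pi$ as written in Eqs. (304) and (305) above, units with $\hbar = 1$, and the illustrative values $\lambda = v_F = E_c N_{ch} = 1$; the function names and the midpoint quadrature (after the substitution $x = -\ln u$) are choices made here, not part of the original.

```python
import cmath
import math

EULER_C = 0.5772156649015329  # Euler constant C of Eq. (304)


def ln_f_integral(tau, temperature, ec_nch=1.0, lam=1.0, v_f=1.0, steps=100_000):
    """ln f(tau) from the integral representation, Eq. (304) as written above.

    The weight exp(-x) dx is absorbed by x = -ln(u), u in (0, 1), and the
    resulting integral is evaluated with the midpoint rule.
    """
    a = math.pi / ec_nch
    prefactor = math.log(lam * ec_nch * math.exp(EULER_C) / (math.pi * v_f))
    total = 0j
    for k in range(steps):
        x = -math.log((k + 0.5) / steps)
        y = math.pi * temperature * a * x
        # sinh[(i tau + pi x / E_c N_ch) pi T] / sinh(pi^2 T x / E_c N_ch)
        ratio = cmath.sinh(complex(y, math.pi * temperature * tau)) / math.sinh(y)
        total += cmath.log(ratio)
    return prefactor - total / steps


def ln_f_approx(tau, temperature, ec_nch=1.0, lam=1.0, v_f=1.0):
    """ln f(tau) from the low-temperature approximation, Eq. (305)."""
    t0 = math.pi / (ec_nch * math.exp(EULER_C))
    denom = 1j * v_f * cmath.sin(math.pi * temperature * complex(tau, -t0))
    return cmath.log(math.pi * lam * temperature / denom)


for tau in (0.0, 50.0):
    print(tau, ln_f_integral(tau, 1e-3), ln_f_approx(tau, 1e-3))
```

With these conventions the two expressions coincide at $\tau = 0$ and differ only by a correction of relative order $1/(\tau E_c N_{ch})$ at long times.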
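Fig. 23. Analytic structure of correlator (307) as a function of $\tau_1$. The dot indicates the pole singularity at $\tau_1 = \tau_2$; the thick lines indicate the branch cuts along the lines $\mathrm{Re}\, \tau_1 = \tau_3$, $\mathrm{Im}\, \tau_1 < 0$ and $\mathrm{Re}\, \tau_1 = \tau_4$, $\mathrm{Im}\, \tau_1 < 0$.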
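Let us now consider a more complicated average involving left movers together with right movers,
$$(2\pi v_F)^2 \langle T_\tau \hat{\bar{\psi}}_{L,i}(\tau_1) \hat{\psi}_{L,i}(\tau_2) \hat{\bar{\psi}}_{R,j}(\tau_3) \hat{\psi}_{R,j}(\tau_4) \rangle_q = \frac{\pi T}{\sin[\pi T(\tau_1 - \tau_2)]}\, \frac{\pi T}{\sin[\pi T(\tau_3 - \tau_4)]} \left[\frac{f(\tau_4 - \tau_1) f(\tau_3 - \tau_2)}{f(\tau_3 - \tau_1) f(\tau_4 - \tau_2)}\right]^{1/N_{ch}}. \tag{307}$$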
We fix $\tau_2, \tau_3, \tau_4$ to be real numbers and consider correlator (307) as a function of the complex parameter $\tau_1$. It has a pole singularity at $\tau_1 = \tau_2$ and branch cuts along the lines $\mathrm{Re}\, \tau_1 = \tau_3$, $\mathrm{Im}\, \tau_1 < 0$, and $\mathrm{Re}\, \tau_1 = \tau_4$, $\mathrm{Im}\, \tau_1 < 0$, see Fig. 23. It is readily seen that the contribution to an integral over $\tau_1$ from the pole does not depend on the interaction (it no longer involves the function $f$): the fractional-power factor in Eq. (307) turns into unity at the pole. Thus it is the branch cut contribution that comes from the interaction. The same is true for any higher order correlation function. Imagine now that we expand an observable quantity in a power series in $L_{ij}$ and account in each order of perturbation theory only for the pole contributions in the integrals over the intermediate times $\tau_1, \tau_2, \ldots$. The pole contribution in each order does not depend on the interaction, and constitutes just a Green function for non-interacting fermions. Such contributions can be easily summed up, resulting in a geometric series in $L$.³⁴ Since a geometric series of the matrix $L$ is precisely the scattering matrix $S$, cf. Eq. (270), we thus arrive at the following simple result: in order to take into account the pole contributions in all subsequent orders of perturbation theory, one has to simply replace $L$ in the first non-vanishing contribution by
$$L(\omega_n) \to \begin{cases} -\dfrac{1}{2\pi i\nu}\, S(i\omega_n), & \omega_n > 0, \\[3mm] \dfrac{1}{2\pi i\nu}\, S^\dagger(i\omega_n), & \omega_n < 0. \end{cases} \tag{308}$$
(Here $\omega_n$ is the fermionic Matsubara frequency.)

³⁴ For the calculation of the ground state energy, the geometric series appears only when one calculates the thermodynamic potential (273) of the entire system, including the contribution $\Omega_D$ from the closed dot.
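The connection between the geometric series in $L$ and the scattering matrix can be written out as a short worked step. It assumes Eq. (270) in the form given above, with the prefactor $1/2\pi i\nu$, and introduces the shorthand $X$ purely for illustration. For $\omega_n > 0$, define $X \equiv -2\pi i\nu\, L(\omega_n)$. Then Eq. (270) reads $X = S(1 + S)^{-1}$, and inverting it gives

$$S = X(1 - X)^{-1} = \sum_{k=1}^{\infty} X^k = \sum_{k=1}^{\infty} \left[-2\pi i\nu\, L(\omega_n)\right]^k.$$

The leading term of the series is $-2\pi i\nu L$, so summing all orders amounts to the substitution $-2\pi i\nu L \to S$, that is, $L \to -S/2\pi i\nu$, which is Eq. (308) for $\omega_n > 0$. The case $\omega_n < 0$ follows in the same way with $S^\dagger$ and the opposite sign.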
Consider now a correlation function which includes a branch cut in the leading order (e.g., as function of the times 1 ; 2 ; 3 ; 4 in second order in the eective action). Then, in each of the subsequent higher orders we select the contributions including only one branch cut, all other singularities in these terms being single poles. Since such a contribution is factorized into a branch-cut factor involving two times, and a product of Green functions for non-interacting fermions for the remaining times, we can again sum up all such contributions as a geometric series. As before, the result is that one has to take the 8rst non-vanishing term in an expansion in powers of L that contains a single branch cut and then make replacement (308) to account for all higher orders in L with the same branch cut singularity. In such a way, we take into account all the orders of perturbation theory in L that contain only one branch cut singularity in each order. The same analytic structure occurs in all higher orders of perturbation theory: poles give a “reducible” average, whereas the “irreducible” part of the propagator has only branch cut singularities. This leads to a very important statement: the perturbation theory in action (267) can be constructed as an expansion in powers of the scattering matrix S rather than of the kernel L. In this expansion all the reducible averages (poles) are included automatically in all the orders of the perturbation theory, whereas the branch cuts are considered perturbatively; each order of the perturbation theory in S is characterized by the number of branch cuts it includes. 35 Correlators (306) and Eq. (307) do not depend explicitly on the gate voltage N. One can see from Eq. (303), that a correlator having such a dependence must include at least creation operators of left (right) moving particles and annihilation operator for right (left) moving particles in all channels. 
For a total number of Nch channels, the lowest order N-dependent correlator is 1=Nch N Nch R − L ) ch Nch f( i − i j Uˆ (Lj ) ˆ (Rj ) = (2vF )Nch T e−i2N : (309) L; j R; j t0 f(0) j=1
q
i; j=1
The expression for the correlator of an arbitrary number of fermionic operators, as well as other necessary correlation functions, are given in Appendix G.

To conclude this subsubsection, let us prove the assumption of Section 4.4 about the Fermi liquid behavior of the one-channel system, $N_{ch} = 1$, at low energies. To this end we substitute Eq. (305) into Eqs. (307) and (309), and discover that the Green functions now have only pole-type singularities, which is the hallmark of the Fermi liquid. Moreover, one finds from Eq. (309) at $\tau \gg 1/E_c$

$$\langle T_\tau\,\hat{\bar\psi}_L(0)\,\hat\psi_R(\tau)\rangle_q = -\frac{1}{2\pi v_F}\,\frac{\pi T}{\sin \pi T\tau}\,e^{-i2\pi\mathcal{N}}, \qquad N_{ch} = 1,$$

which is exactly the Green function of a one-channel system with an impenetrable barrier and the scattering phase $\delta = \pi(\mathcal{N} + 1/2)$, see also Eq. (241).

Footnote 35: Regarding this statement, we point to the fact that for non-interacting particles, the ground state energy, Eq. (111), and the conductance, Eq. (103), are of second order in the scattering matrix $S$. In the effective action theory, this result follows from the summation of the pole contributions to all orders in $L$, which produces a single term proportional to $S^2$. The interaction corrections, however, can be of higher order in $S$, depending on the number of branch cut singularities that are included in the perturbation theory.
4.6. Results for the differential capacitance

As we have already seen in the discussion of weak tunneling, the existence of periodic oscillations of measurable quantities with the gate voltage $\mathcal{N}$ is a signature of Coulomb blockade. In this section we consider such oscillations, which occur due to the incremental charging of the dot, in the case of the differential capacitance of a partially open dot. First we derive the relation between the $\mathcal{N}$-dependent thermodynamic potential and the scattering matrix $S$ for an interacting system, see Eqs. (313)–(316) and (331)–(332) below. These formulae replace the known results for non-interacting electrons, see Eqs. (111)–(114) of Section 2.4. Using the general result, we then find the ensemble-averaged differential capacitance. The average capacitance oscillates periodically with $\mathcal{N}$ as long as some reflection exists in the channels connecting the dot with the leads. The amplitude of the oscillations, however, decays with the number of channels as $|r|^{N_{ch}}$, where $r$ is the reflection amplitude in a channel, and the number of channels $N_{ch}$ also accounts for the spin degree of freedom. The detailed formulae for the oscillations of the average capacitance are given below in Eqs. (122), (321) and (323). In the absence of reflection in the channels, the average differential capacitance does not oscillate with $\mathcal{N}$. However, mesoscopic
Cdi (N1 )Cdi (N2 )
C2 and show that it consists of two additive parts,
(310)
K(s) ≡ KQ (s) + K& (s) : The 8rst part, KQ (s), originates from the discreteness of charge, and can be represented as the product of an oscillatory function with unit period and an envelope function YNch (s=Ec ) which decays on a scale |s| ∼ Ec =,
s KQ (s) ∼ cos(2s) YNch : (311) Ec Ec The explicit results for the special cases of a junction with one propagating mode of spinless fermions and real spin 1=2 electrons are given below in Eqs. (319) and (321), respectively. The second part, K& (s), comes from the mesoscopic
K& (s) ∼ ; s&1: (312) Ec Nch s The derivation of K& (s) within the general formalism is also outlined at the end of this section. With the increase of the number of channels Nch connecting the dot and leads, the oscillatory part of the correlation function quickly decays. It is experimentally detectable only for a small number of channels.
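The statement that the oscillatory part is detectable only for a small number of channels can be made concrete with a minimal sketch. It assumes only the scaling quoted in this section, namely that the amplitude of the oscillations of the average capacitance decays as $|r|^{N_{ch}}$; all numerical prefactors and logarithmic factors are deliberately left out, and the function name and the sample value of $|r|$ are illustrative choices, not quantities taken from the text.

```python
def relative_oscillation_amplitude(r_abs: float, n_ch: int) -> float:
    """Return |r|**N_ch, the channel-number scaling of the oscillation amplitude.

    Prefactors (numerical constants, logarithms of E_c/T) are omitted on purpose,
    so only ratios between different N_ch are meaningful.
    """
    if not 0.0 <= r_abs <= 1.0:
        raise ValueError("|r| must lie between 0 and 1")
    if n_ch < 1:
        raise ValueError("N_ch must be a positive integer")
    return r_abs ** n_ch


if __name__ == "__main__":
    r_abs = 0.3  # hypothetical reflection amplitude, chosen only for illustration
    for n_ch in (1, 2, 4, 8):
        print(n_ch, relative_oscillation_amplitude(r_abs, n_ch))
```

Since $N_{ch}$ counts spin as well, a single orbital mode with spin already corresponds to `n_ch = 2` in this sketch.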
The oscillatory part $K_Q(s)$ of the capacitance autocorrelation function: As we have seen in the previous subsection, the $\mathcal{N}$-dependent contribution to the thermodynamic potential appears in $N_{ch}$th order perturbation theory ($N_{ch}$ is the total number of channels):

$$\Omega(\mathcal{N}) = T\,\frac{(-1)^{N_{ch}+1}}{N_{ch}!}\,\langle [S_{eff}]^{N_{ch}}\rangle_q. \qquad (313)$$
Substituting Eq. (267) into Eq. (313) and using Eq. (309), we obtain I(N) = −2 (−1)Nch Re e−i2N T 1=Nch 1=T Nch f(Ri − Lj ) × det[S] dRi dLj ; 2t f(0) 0 0
(314)
i; j=1
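where the function $f(\tau)$ is defined by Eq. (304). The determinant of the scattering matrix in the imaginary time domain is defined by

$$\det[S] \equiv \begin{vmatrix} S_{11}(\tau^L_1 - \tau^R_1) & S_{12}(\tau^L_1 - \tau^R_2) & \dots & S_{1N_{ch}}(\tau^L_1 - \tau^R_{N_{ch}}) \\ S_{21}(\tau^L_2 - \tau^R_1) & S_{22}(\tau^L_2 - \tau^R_2) & \dots & S_{2N_{ch}}(\tau^L_2 - \tau^R_{N_{ch}}) \\ \dots & \dots & \dots & \dots \\ S_{N_{ch}1}(\tau^L_{N_{ch}} - \tau^R_1) & S_{N_{ch}2}(\tau^L_{N_{ch}} - \tau^R_2) & \dots & S_{N_{ch}N_{ch}}(\tau^L_{N_{ch}} - \tau^R_{N_{ch}}) \end{vmatrix}. \qquad (315)$$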
The kernel $L$ was substituted with the scattering matrix $S$ in Eq. (315), in accordance with the previous subsubsection, and

$$S(\tau) = T \sum_{\omega_n = \pi T(2n+1)} e^{-i\omega_n\tau}\left[S(i\omega_n)\theta(\omega_n) - S^\dagger(i\omega_n)\theta(-\omega_n)\right].$$

The latter formula can also be written in the form of the Lehmann representation

$$S(\tau) = \frac{T}{2i}\int_{-\infty}^{\infty} dt\,\frac{S(t) + S^\dagger(t)}{\sin[\pi(\tau - it)T]}. \qquad (316)$$
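The step from the Matsubara sum to the Lehmann form (316) rests on a geometric series. As a hedged illustration, assume that for $\omega_n > 0$ the scattering matrix is the one-sided transform $S(i\omega_n) = \int_0^\infty dt\, S(t)e^{-\omega_n t}$ (an assumption about conventions, not something stated explicitly above). Each time slice $t > 0$ then contributes $T\sum_{\omega_n>0} e^{-i\omega_n\tau - \omega_n t} = T/\{2i\sin[\pi T(\tau - it)]\}$, which is the kernel of Eq. (316). The sketch below checks this identity numerically; the function names are illustrative.

```python
import cmath
import math


def matsubara_partial_sum(tau: float, t: float, temp: float, n_terms: int = 400) -> complex:
    """T * sum over positive fermionic frequencies of exp(-i w_n tau - w_n t)."""
    total = 0j
    for n in range(n_terms):
        omega_n = math.pi * temp * (2 * n + 1)  # fermionic Matsubara frequency
        total += cmath.exp(-1j * omega_n * tau) * math.exp(-omega_n * t)
    return temp * total


def lehmann_kernel(tau: float, t: float, temp: float) -> complex:
    """Closed form T / (2i sin[pi T (tau - i t)]) obtained from the geometric series."""
    return temp / (2j * cmath.sin(math.pi * temp * (tau - 1j * t)))


if __name__ == "__main__":
    tau, t, temp = 0.3, 0.7, 0.5  # arbitrary sample values with t > 0
    print(matsubara_partial_sum(tau, t, temp))
    print(lehmann_kernel(tau, t, temp))
```

The series converges geometrically with ratio $e^{-2\pi T t}$, so a few hundred terms are more than enough for $t$ of order $1/T$; the negative-frequency term of the sum gives the same kernel for $t < 0$.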
Eqs. (314) and (315) express the gate-dependent interaction to the thermodynamic potential in a most general form. It is valid for any scattering matrix of the dot. In what follows we apply those equations for studying the dierent limiting cases. The oscillatory part of di9erential capacitance and KQ (s) for the one-channel geometry (spinless fermions): In this case the scattering matrix is formed by its only element S(t). Using the fact that the function f() is analytic in the lower complex semiplane, see Eq. (304), we 8nd ∞ f(−it) Ec eC I(N) = 2 Re e−i2N S(t) dt : f(0) 0 With the help of Eq. (304), one obtains for T Ec
∞ ∞ x + tEc = Ec eC −i2N −x I(N) = 2 Re e S(t)exp − d xe ln dt : x 0 0
(317)
One can use Eq. (317) to analyze the statistics of the dierential capacitance (274). In approximation (271), one obtains [39] Cdi (N) (318) = 2eC |r |cos 2N ; C where · · · stands for the ensemble averaging and r is the re<ection amplitude of the contact between the dot and the lead. The statistics of the mesoscopic
Cdi (N) 4eC 2 Ec |r | cos 2N ln ; = C T
16 Ec 3 Ec C 2 KQ (s) = 2 cos(2s) ln ln + 4e |r | : (321) 3 ! Ec T !Ec T The result for the averaged capacitance was 8rst obtained in Ref. [39]. The correlation function of the mesoscopic
non-periodic
(322)
Substitution of Eq. (322) into Eq. (314) gives a periodic contribution to the thermodynamic potential,
4 Ec I(N) = 2 cNch ln Re e−2iN det r ; (323) t0 T where the numerical coeKcient cNch depends weakly on the number Nch of channels; c2 = 1=4, c3 = ( 98 )−3=2 21=3 M( 13 )M( 76 ). Eq. (323) demonstrates that the dependence of the amplitude of the Coulomb blockade oscillations on the contact’s conductance G is not universal; more detailed characteristics of the contact are important. The determinant det r in Eq. (323) can be evaluated in terms of the re<ection eigenvalues Rj of the contact, |det r | = j R1=2 (here Rj are de8ned as the eigenvalj † ues of rr ). The distribution of Rj is known for a number of relevant models, including the model of a short disordered wire [14]. With the help of that distribution, one can 8nd from Eq. (323) the amplitude of the oscillations of I and of the dierential capacitance, Cdi (N) ∼ ln(Ec =T ) exp(−23 ˝G=e2 ) cos(2N). This exponential dependence, and the more general formula expressing the amplitude of oscillations in terms of j Rj1=2 (but both without the logarithmically divergent prefactor) were found in Ref. [138]. There a dierent method, based on an extension of the instanton technique of Ref. [145] to the case of an arbitrary (though energy independent) S-matrix of the contact was used. The same results were reproduced within a non-linear -model for interacting systems in Ref. [146]. Non-periodic interaction corrections to the capacitance, K& (s): In addition to the periodic-in-N mesoscopic
dfF 1 9S (j; T ) = d j − ; Tr S † (j) d j 2i 9j
fF (j) = 1=(1 + exp(j=T )) being the Fermi distribution function, and the N-dependence of & follows from the self-consistency condition d& 1 −1 Cdi =− : (325) = (&; T ) + dN C(&; T ) 2Ec The mesoscopic
I = ID − T ln Tr e−H e =T T e−Se ;
(273)
where ID is the thermodynamic potential of the closed dot without interactions and the second term accounts for the dierence between the thermodynamic potentials of an open dot with interactions and a closed dot without. The thermodynamic potential ID of the closed dot without interactions can be written as an integral over the density of states closed (j) of the closed dot, ID = − T d j closed (j)ln(1 + e−j=T ) : (327) The density of states closed (j) is a sum of delta functions, and thus diers from the density of states (j) of the dot when it is coupled to the leads, see Eq. (113), which is a continuous function of energy. It was the latter density of states that determined the thermodynamic potential (111) of the open quantum dot in the absence of interactions, and that entered into the mean-8eld description, see Eqs. (324) and (325) above. Using the equality 9 1 1 + S(j − i0+ ) closed (j) = ; (328) Tr ln 9j 2i 1 + S(j + i0+ ) together with unitarity of S, we can write closed (j) as 1 9 closed (j) = (j) − Im ln[1 + S(j + i0+ )] : (329) 9j 36 The dierence between Eqs. (326) and (325) is in the
Hence, we can represent $\Omega_D$, in turn, as

$$\Omega_D = \Omega_{lb} + \frac{1}{\pi}\int_{-\infty}^{\infty} d\varepsilon\, f_F(\varepsilon)\,\mathrm{Im\,Tr}\,\ln[1 + S(\varepsilon + i0^+)], \qquad (330)$$

where $\Omega_{lb}$ is the thermodynamic potential of the dot coupled to the leads in the absence of interactions, see Eq. (111). (The subscript "lb" has been written in analogy with the non-interacting contribution $G_{lb}$ to the two-terminal conductance, see Section 4.7 below.)

The second term in Eq. (273) can be calculated by expansion in the effective action $S_{eff}$, followed by an average over the fermionic operators using Eqs. (306) and (307). As explained below Eq. (307), one can distinguish two types of contributions: the "pole contributions", which are already present in the absence of interactions, and the "branch cut contributions", which represent the corrections to $\Omega$ due to the electron–electron interactions in the dot. Since, in the absence of interactions, the thermodynamic potential of the dot is given by $\Omega_{lb}$, the pole contributions to the second term in Eq. (273) and the second term of Eq. (330) must cancel to all orders in the effective action $S_{eff}$. Hence, the thermodynamic potential is given by

$$\Omega = \Omega_{lb} + \Omega_{ee}, \qquad (331)$$
where the interaction correction Iee represents the “branch cut contributions” to −T ln Tr e−H e =T T e−Se . (The calculation of Iee proceeds along the same lines as the calculation of the interaction correction to the two-terminal conductance; Details of the latter calculation can be found in Section 4.7 and Appendix H.) To second order in the action Se (i.e., up to one branch cut contribution), we 8nd ∞ ∞ Tr S(t1 )S † (t2 ) T2 Iee = − sin dt1 dt2 ds 2 Nch 0 sinh[T (t1 + s)] sinh[T (t2 + s)] t0 sinh[T (s − t0 )] sinh[T (s + t1 + t2 + t0 )] 1=Nch × : (332) sinh[T (t1 + t0 )] sinh[T (t2 + t0 )] In order to 8nd the dierential capacitance (274) from Eqs. (331), (113) and (332), we need to include the dependence on the chemical &, which is done via the relation S(&; t) = S(0; t)e−i&t :
(333)
Hence, with help of Eq. (326) we 8nd for Ec Cdi = Cdi ; lb + Cdi ; ee ; Cdi ; lb 2 + =1− C 2Ec 8Ec ×
(334) 0
∞
dt
∞
0
T (t 2 − t 2 ) iN(t−t ) e ; sinh[T (t − t )]
dt Tr [S † (−t )S(t) − S † (−t )S(t)] (335)
Cdi ; ee 2 = − 2 sin C 4 Ec Nch
0
∞
dt
0
∞
dt
∞
t0
ds Tr S † (−t )S(t)
2 T 2 (t − t )2 eiN(t−t ) sinh[T (t + s)] sinh[T (t + s)] sinh[T (s − t0 )] sinh[T (s + t + t + t0 )] × sinh[T (t + t0 )] sinh[T (t + t0 )] ×
1=Nch
:
(336)
In the 8rst term, Cdi ; lb , we have separated the ensemble average and the mesoscopic
2 Nch c Nch 3 2 var (Cdi =C) = 1− ; (337) 12! Nch Ec 22 T 4Nch 2 T where c is a numerical factor that takes the value c ≈ 3:42 for Nch → ∞. The 8rst term between brackets {· · ·} represents the
3 var (Cdi =C) K& (s) = s coth s −1 : 2T sinh2 (s=2T ) 2T The
2 2 K& (s) = (N)((N + s)) 2Ec
4 2 2 1 − 4(2s=Nch )2 = : (338) ! Nch Ec [1 + 4(2s=Nch )2 ]2 where the correlator of the density of states was taken from Ref. [150]. 37 (The argument of the density of states is N, since
To reproduce this result from Eq. (335) it is not sufficient to use a Gaussian average with correlator (122), as non-Gaussian contributions are also important for times of order $t \sim 1/(N_{ch}\Delta)$.
4.7. Conductance through a dot with two almost reflectionless junctions

This subsection is devoted to the effects of interaction on the two-terminal conductance of a dot with junctions to the leads that are either completely reflectionless, or have only a small reflection coefficient. The statistics of the two-terminal conductance in the absence of the interactions is reviewed in Ref. [14]. The effect of interaction on this quantity was analyzed in Ref. [151] and we follow this reference in this subsection.

The Kubo formula for the conductance $G$ is given by Eqs. (281), (279) and (277). Similarly to the consideration of the capacitance, we expand the current–current correlation function (281) in powers of the effective action $S_{eff}$ and calculate the arising fermionic correlation functions with the help of Eqs. (306)–(309) and (G.3)–(G.4). The lowest non-vanishing contribution to the current–current correlation function $F(\tau)$, and hence to the two-terminal conductance, is of the second order in the action,

$$F(\tau) = \langle I(\tau)I(0)\rangle_q + \tfrac{1}{2}\langle I(\tau)I(0)S_{eff}^2\rangle_q - \tfrac{1}{2}\langle I(\tau)I(0)\rangle_q\langle S_{eff}^2\rangle_q + \cdots. \qquad (339)$$
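We substitute into Eq. (339) the explicit form of the current operators (277) and the effective action (267). After calculation of the averages of the fermionic operators with the help of Eqs. (306)–(309) and (G.3)–(G.4), and with approximation (305), we find

$$G = G_{lb} + G_{ee} + G_{osc}, \qquad (340)$$

$$G_{lb} = \frac{e^2}{2\pi\hbar}\left[\frac{N_1N_2}{N_{ch}} - \int_0^\infty dt_1\int_0^\infty dt_2\,\frac{\pi(t_1 - t_2)T}{\sinh[\pi(t_1 - t_2)T]}\,\mathrm{Tr}\,S^\dagger(-t_1)\Lambda S(t_2)\Lambda\right], \qquad (341)$$

$$G_{ee} = -\frac{e^2}{2\pi\hbar}\,\frac{1}{\pi}\sin\frac{\pi}{N_{ch}}\int_0^\infty dt_1\int_0^\infty dt_2\,\mathrm{Tr}\,S^\dagger(-t_1)\Lambda S(t_2)\Lambda\int_{t_0}^\infty ds\,\frac{(2s + t_2 + t_1)\,\pi^2T^2}{\sinh[\pi(s + t_1)T]\,\sinh[\pi(s + t_2)T]}$$
$$\qquad\times\left[\frac{\sinh[\pi(t_1 + t_2 + s + t_0)T]\,\sinh[\pi(s - t_0)T]}{\sinh[\pi(t_1 + t_0)T]\,\sinh[\pi(t_2 + t_0)T]}\right]^{1/N_{ch}}. \qquad (342)$$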
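The thermal factor $\pi T(t_1 - t_2)/\sinh[\pi T(t_1 - t_2)]$ in Eq. (341) can be read as the Fourier transform of the derivative of the Fermi function, $\int d\varepsilon\,(-df_F/d\varepsilon)\,e^{i\varepsilon t} = \pi Tt/\sinh(\pi Tt)$, which is how thermal smearing enters a Landauer-type formula written in the time domain. The following minimal sketch checks this standard identity numerically with a trapezoidal rule; the function names, integration window and step count are illustrative choices.

```python
import math


def thermal_factor_closed_form(t: float, temp: float) -> float:
    """pi*T*t / sinh(pi*T*t), with the t -> 0 limit equal to 1."""
    x = math.pi * temp * t
    return 1.0 if x == 0.0 else x / math.sinh(x)


def thermal_factor_numeric(t: float, temp: float, width: float = 60.0, steps: int = 200_000) -> float:
    """Integrate (-df_F/de) * cos(e*t) over e in [-width*T, width*T].

    -df_F/de = 1 / (4 T cosh^2(e / 2T)) is even in e, so only the cosine part survives.
    """
    h = 2.0 * width * temp / steps
    total = 0.0
    for k in range(steps + 1):
        e = -width * temp + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * math.cos(e * t) / (4.0 * temp * math.cosh(e / (2.0 * temp)) ** 2)
    return total * h


if __name__ == "__main__":
    print(thermal_factor_closed_form(0.5, 1.0), thermal_factor_numeric(0.5, 1.0))
```

For $T|t_1 - t_2| \ll 1$ the factor tends to unity, and for instantaneous reflection Eq. (341) then contains no temperature dependence at all.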
The charging energy enters into the expressions only through the time t0 , see Eq. (305), which has the meaning of the characteristic time of the classical recharging of the dot (RC-time). Notice, that the 8nal results are written in terms of the scattering matrix S of the dot in the absence of the interactions, see Section 4.5.2. In this order of perturbation theory, the term Gosc oscillating with gate voltage is present only for the case N2 = N1 = 1 (spinless fermions, and both contacts carry a single channel each): ∞ ∞ e2 (2s + t2 + t1 )2 T 2 Gosc = 2 dt1 dt2 ds 4 ˝ 0 sinh[(t1 + t0 )T ] sinh[(t2 + t0 )T ] t0
×
1 sinh[(t2 + t1 + s + t0 )T ] sinh[(s − t0 )T ]
×Re e−2iN [S11 (t1 )S22 (t2 ) + S12 (t1 )S21 (t2 )] :
(343)
The derivation of Eqs. (340)–(343) is relegated to Appendix H. Eq. (341) coincides with the Landauer formula for the conductance of non-interacting electrons (104). Eq. (342) is the leading interaction correction at temperatures much smaller than the charging energy. The power-law form-factor (the last factor in this formula) describes the non-Fermi liquid behavior of the dot when $N_{ch} > 1$. Finally, the Coulomb blockade-like contribution (343) for $N_1 = N_2 = 1$ is somewhat similar to the oscillation of the capacitance, Eq. (314). To gain intuition about the structure of the interaction correction, we start with the simplest example of instantaneous reflection from the contact, when $S(t)$ is given by Eq. (322). Neglecting the
$$G_{ee} = -\frac{e^2}{2\pi\hbar}\,\mathrm{Tr}\,r\Lambda r^\dagger\Lambda\left[\frac{c_{N_{ch}}N_{ch}\sin(\pi/N_{ch})}{4\pi}\left(\frac{N_{ch}E_ce^{C}}{\pi^2T}\right)^{2/N_{ch}} - 1\right], \qquad (345)$$

where $c_{N_{ch}}$ is a numerical coefficient that depends only weakly on $N_{ch}$, ranging from $c_4 \approx 5.32$ to $c_\infty = 4$. Adding Eqs. (344) and (345), we obtain

$$G = \frac{e^2}{2\pi\hbar}\left[\frac{N_1N_2}{N_{ch}} - \frac{c_{N_{ch}}N_{ch}\sin(\pi/N_{ch})}{4\pi}\left(\frac{N_{ch}E_ce^{C}}{\pi^2T}\right)^{2/N_{ch}}\mathrm{Tr}\,r\Lambda r^\dagger\Lambda\right]. \qquad (346)$$

Comparing Eq. (346) with Eq. (344), we see that the "bare" reflection amplitude entering the Landauer formula is renormalized by the interaction. The renormalization is temperature-dependent, manifesting the non-Fermi liquid behavior of the open dot on the energy scale $E_c \gtrsim T \gtrsim \Delta$. For $N_1 = N_2 = 2$ (electrons with spin, and both contacts carry one channel each) the result (346) has been obtained previously in Ref. [40].

The above example (322) of reflection at the moment $t = 0$ corresponds to the case where the electron is backscattered instantaneously, without entering the dot. Now we are going to estimate the interaction-induced renormalization of the amplitude for a backscattering event occurring somewhere within the dot. In the language of electron trajectories, such an event corresponds to a trajectory returning to the contact at some time $t_s > 0$,

$$S_{ij}(t) = r_{ij}\,\delta(t - t_s), \qquad t_s > 0. \qquad (347)$$
It is easy to see that the non-interacting contribution (344) remains intact, whereas the interaction contribution depends signi8cantly on ts : e2 Tr rHr † H Gee = − 2˝
Nch Ec 2=Nch − 1; ts (Nch Ec )−1 ; T
× (348) 1 2=Nch −1 −1 − 1; (N E ) t T ; c s ch ts T −2Tts e ; ts T −1 (for brevity, we omitted here the numerical coeKcients of the order of unity). We see that for short trajectories, with return times ts smaller than the RC-time for the open dot, see Eq. (305), the renormalization of the scattering amplitude is complete, i.e., the same as for the direct re<ection from the contact we studied before. On the other hand, the eect of interaction on the scattering events occurring “deep inside” the dot, ts 1=T is exponentially weak. As long as the temperature is suKciently high, T TNch , and the scattering from the dot is ergodic, the majority of the scattering events is not renormalized by the interaction. Therefore, the Landauer part of the conductance (344) remains intact at such temperatures. The interaction correction aects only a small portion of scattering events, and thus only slightly aects the statistics of the two terminal conductance. Note that the characteristic energy Nch that appears here is nothing but the inverse escape time for a quasiparticle from the dot into one of the leads. To study the statistics of the two-terminal conductance, one should use the statistical properties of the scattering matrix (115), and Eqs. (340) – (342). The result substantially depends on whether we are dealing with spinless fermions, denoted by the spin-degeneracy factor gs = 1, or electrons with spin, denoted by gs = 2. In the latter case, the results are written in terms of o = N =g (and, similarly, N o = N =g , N o = N =g ). Below we the number of orbital channels Nch 1 s 2 s ch s 1 2 concentrate on the case of the re<ectionless contacts; all backscattering events occur within the dot. For the averaged conductance, we 8nd [151]
o N1o N2o gs e2 N1o N2o cNch Nch 2 G = + 1 − 1 + ; (349) o o o o 2 2˝ Nch ! Nch (Nch + 2=! − 1) Nch gs T where ! = 1 (2; 4) for the orthogonal (unitary, symplectic) ensemble, and cNch is an Nch -dependent numerical constant, ranging from c4 ≈ 4:77 to c∞ = 2 =6. Eq. (349) is valid for an arbitrary number of channels except the case of gs = 1; N1 = N2 = 1. For the latter case, one 8nds
e2 1 1 4Ec eC G = − !; 1 ; (350) + ln 2˝ 2 6 8T c2 T where c ≈ 2:81. Similar to Eq. (344), here the 8rst term in brackets is the classical resistance of two point contacts connected in series. The second term is the familiar weak localization correction for non-interacting electrons [this term replaces the correction to the Kircho law coming from the backscattering in the junctions, see Eq. (344)]. Finally the last term is the high-temperature expansion of the interaction correction. This correction comes from the renormalization of the
amplitudes of the backscattering events occurring in the dot within “time horizon” ts ∼ 1=T , cf. Eq. (348). At high temperature, the interaction correction is less signi8cant than the correction from the weak localization. Notice, however, that at T = TNch =, which is the lower limit of the applicability of the theory, the interaction correction is smaller than the weak localization correction only by parameter 1=Nch , and may be still important for the temperature dependence of magnetoresistance. One notices also that for the unitary ensemble, ! = 2, both the weak localization and interaction corrections to the average conductance vanish, in contrast with the usual situation in disordered metals [18]. The reason is that the interaction enhances any electron scattering, see Eq. (348), no matter whether it is scattering from one lead to the other (1 ↔ 2) or backscattering into the same lead (1 ↔ 1 and 2 ↔ 2). For the unitary ensemble, the scattering 1 ↔ 1 and 1 ↔ 2 occur with equal probabilities and therefore neither re<ection nor transmission are enhanced in average. On the other hand, for the orthogonal ensemble, backscattering into the same lead is more probable due to the weak localization, and the interaction further increases the backscattering probability. In a similar manner, the mesoscopic
o2 2 Nch cN ch Nch N1 N2 1 gs e 2 var G = ; (351) + o o2 ! 2˝ 6T Nch gs 4 T 2 Nch with c ≈ 6:49 for Nch 1. The 8rst term in Eq. (351) represents the high-temperature asymptote of the mesoscopic
2 2
2 e 2 Ec G(N)G(N ) = (1 + 2!; 1 ) {1 + cos[2(N − N )]} ln ; (352) 2˝ 64 2 T 2 T where Ec is the charging energy of the dot and C ≈ 0:577 is the Euler constant. In this order of perturbation theory, such an explicit N-dependence, which re<ects the discreteness of charge, exists only for spinless fermions, but is absent for particles with spin. For spin 1=2 electrons, eects of the discreteness of charge appear only in the Nch th order in perturbation theory in S, similarly to Eq. (314). For a large number of re<ectionless channels, the characteristic amplitude of mesoscopic
4.8. Conductance in the strongly asymmetric setup We consider here a dot with two leads, one of which is connected to the dot by a tunneling contact with a small transmission coeKcient, while the other is coupled to the dot via an ideal (no re<ection) or almost ideal (small r) point contact. The 8rst lead serves as a tunneling probe of the “open” quantum dot. See the discussion in Sections 2.4 and 4.5.1. Inelastic tunneling: We begin with the analysis of the inelastic contribution to the tunneling conductance. According to Section 4.5.2, the order of perturbation theory is characterized by the number of the branch cuts the corresponding fermionic correlator includes. In the leading order, we will allow only one branch cut in the complex plane and collect all the pole contributions in Eqs. (291) and (292). Physically, this corresponds to neglecting the process of escape of the tunneled electron into another lead. With the help of Eqs. (G.6) and (G.8), we 8nd f() 2=Nch (0) Fin () = − G11 (−) ; f(0) f() 2=Nch i! ( − ) e n 1 2 [i sgn !n + 2 2 W † Go (i!n )W ]ij ; Mij (; 1 ; 2 ) = T f(0) ! n
where !n is the fermionic Matsubara frequency, and Green function for the open dot Go , is de8ned in Eq. (109). 38 This gives with the help of Eq. (108) f() 2=Nch (0) ; (353) Ftunn () = − T (−) f(0) where the dimensionless function f() is de8ned in Eq. (304), and the Matsubara counterpart of the tunneling density of states T () is related to the exact tunneling density of state of non-interacting system T (j), see Eq. (108), by 1 j −j T () = d j T (j)e sgn + tanh ; (354) 2 2T at −1=T ¡ ¡ 1=T . It is noteworthy, that the inelastic co-tunneling contribution is expressed in terms of the Green functions of the open dot, rather than that of the closed dot in accordance with the discussion in the previous subsection. In the absence of the interaction, the prefactor f()=f(0) equals unity and Eqs. (353), (354) and (286) reproduce the formula for non-interacting model (107). It follows from Eqs. (304), (353) and (354) that F() is an analytic function of for 0 ¡ Re ¡ 1=T , so that we can use Eq. (286) for the tunneling conductance. For T Nch Ec we 8nd
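$$G_{in} = G_1\left(\frac{\pi^2T}{E_cN_{ch}e^{C}}\right)^{2/N_{ch}}\int_{-\infty}^{\infty} d\varepsilon\,\frac{\Delta M\,\nu_T(\varepsilon)}{4\cosh(\varepsilon/2T)}\int_{-\infty}^{\infty} dt\,e^{i\varepsilon t}\left(\frac{1}{\cosh \pi tT}\right)^{2/N_{ch}+1}, \qquad (355)$$

where $C \approx 0.577$ is the Euler constant. Here we used approximation (305), justified at $T \ll N_{ch}E_c$.

Footnote 38: We recall that the Matsubara Green function is expressed in terms of its retarded and advanced counterparts as $G(i\omega_n) = \theta(\omega_n)G^R(i\omega_n) + \theta(-\omega_n)G^A(i\omega_n)$.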
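As a numerical cross-check of the integrals in Eq. (355), assume the reading in which the ensemble average replaces $\Delta M\nu_T(\varepsilon)$ by unity. The energy integral then gives $\int d\varepsilon\, e^{i\varepsilon t}/[4\cosh(\varepsilon/2T)] = (\pi T/2)/\cosh(\pi Tt)$, and with $u = \pi Tt$ the remaining time integral becomes $\tfrac12\int du\,\cosh^{-(2/N_{ch}+2)}u = \sqrt{\pi}\,\Gamma(1 + 1/N_{ch})/[2\Gamma(3/2 + 1/N_{ch})]$. The sketch below compares this Gamma-function ratio with a direct trapezoidal integration; the function names and integration parameters are illustrative.

```python
import math


def prefactor_closed_form(n_ch: int) -> float:
    """sqrt(pi) * Gamma(1 + 1/N_ch) / (2 * Gamma(3/2 + 1/N_ch))."""
    a = 1.0 / n_ch
    return math.sqrt(math.pi) * math.gamma(1.0 + a) / (2.0 * math.gamma(1.5 + a))


def prefactor_numeric(n_ch: int, u_max: float = 40.0, steps: int = 200_000) -> float:
    """0.5 * integral of cosh(u)**-(2/N_ch + 2) over [-u_max, u_max] (trapezoidal rule)."""
    power = 2.0 / n_ch + 2.0
    h = 2.0 * u_max / steps
    total = 0.0
    for k in range(steps + 1):
        u = -u_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * math.cosh(u) ** (-power)
    return 0.5 * total * h


if __name__ == "__main__":
    for n_ch in (1, 2, 4):
        print(n_ch, prefactor_closed_form(n_ch), prefactor_numeric(n_ch))
```

For $N_{ch} = 1$ the ratio equals $2/3$ and for $N_{ch} = 2$ it equals $\pi/4$, the numerical coefficients multiplying the power of $\pi^2T/(E_cN_{ch}e^{C})$ in the one- and two-channel cases.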
To find the averaged conductance, we notice that $\langle\nu_T\rangle = 1/(M\Delta)$. Then, the integration in Eq. (355) immediately yields

$$\langle G_{in}\rangle = G_1\,\frac{\sqrt{\pi}\,\Gamma(1 + 1/N_{ch})}{2\,\Gamma(3/2 + 1/N_{ch})}\left(\frac{\pi^2T}{E_cN_{ch}e^{C}}\right)^{2/N_{ch}}, \qquad (356)$$

where $\Gamma(x)$ is the Gamma function. Eq. (356) is valid for reflectionless point contacts, provided that $\pi^2T \gg N_{ch}\Delta$.

Note that Eq. (356) predicts a power-law temperature dependence of the tunneling conductance. The power-law factor can be understood in terms of the well-known orthogonality catastrophe [136]. The addition of an electron without a redistribution of charge between the dot and the lead would cost an energy $\sim E_c$. The tunneling spectrum would then have a "hard" gap with a width of the order of the charging energy, and the temperature dependence of the conductance would display activation behavior. The power law (356) indicates that a redistribution of charge indeed occurs when an additional electron tunnels onto the dot. In fact, each channel of the ideal dot–lead junction carries away a charge $e/N_{ch}$. According to the Friedel sum rule, this corresponds to an additional phase shift $\delta = \pi/N_{ch}$ for the scattering in each of the channels. Tunneling of an electron with energy $\varepsilon \ll E_c$ creates an excited state of the electron system consisting of the dot and the lead. The overlap of this state with the initial state of the system defines the probability of tunneling. The smaller the energy $\varepsilon$, the smaller this overlap is (orthogonality catastrophe [136]). The overlap can be expressed in terms of the phase shifts [136,153], and is proportional to $(\varepsilon/E_c)^{\chi}$, where $\chi = \sum(\delta/\pi)^2$, and the sum is taken over all the modes. In our case, the charge is "shifted" through $N_{ch}$ channels inside the dot, and through the same number of channels in the lead, yielding $2N_{ch}$ equal contributions to the sum; $\chi = 2/N_{ch}$. This exponent indeed coincides with the one in the power law (356).

A particular case of Eq. (356) for one-channel junctions deserves additional discussion. For spinless fermions, $N_{ch} = 1$, one obtains
$$\langle G_{in}\rangle = \frac{2G_1}{3}\left(\frac{\pi^2T}{E_ce^{C}}\right)^{2}, \qquad N_{ch} = 1. \qquad (357)$$

We see that in this case the temperature dependence is similar to that for a strongly blockaded dot, see Eq. (179). The $T^2$ dependence can be obtained from the usual phase space argument for the quasiparticle lifetime, and is a manifestation of the Fermi liquid behavior of a one-channel dot, see Section 4.4 for a qualitative discussion. The temperature dependence of the tunneling conductance for electrons with spin, $N_{ch} = 2$, is qualitatively different [40]:

$$\langle G_{in}\rangle = G_1\,\frac{\pi^3T}{8E_ce^{C}}, \qquad N_{ch} = 2. \qquad (358)$$

The temperature exponent here is smaller than the one given by the phase space argument. This indicates non-Fermi liquid behavior, which we have already discussed in Section 4.2. If there is no scattering in the junction ($r = 0$), and, in addition, one neglects backscattering from within the dot (corresponding to the limit $\Delta/E_c \to 0$), then result (358) is valid at arbitrarily low temperature. The existence of a finite backscattering, however, restores the Fermi liquid behavior at sufficiently low temperature. We have already discussed in Section 4.2 the mechanism that recovers the Fermi liquid behavior in the case of $r \neq 0$. Reflection is a relevant perturbation;
at low energies it becomes stronger, reaching $r_{eff} \sim 1$ at $\varepsilon \sim T_0(\mathcal{N})$. The characteristic energy scale $T_0(\mathcal{N})$ is given by Eq. (221). At smaller energy scales the dot is effectively weakly coupled to both leads, and the strong Coulomb blockade behavior [cf. Eq. (179)] is restored. The Fermi liquid result $G_{in}$ =
T2 82 G ; 1 3eC Ec T0 (N)
T T0 (N)
(359)
matches Eq. (358) at the upper limit of the allowed temperature interval. We refer the reader to the original reference [40] for the rigorous derivation of the numerical coeKcient in Eq. (359). We see that inelastic contribution vanishes at T = 0 in a close resemblance of the strong Coulomb blockade, see Section 3.2. Pursuing the analogy with the strongly blockaded regime further, one may expect, that there should be the another contribution analogous to the elastic co-tunneling for the dots weakly connected to the leads, and we turn to the analysis of this mechanism now. Elastic tunneling: To obtain the leading elastic contribution we calculate correlator Eq. (292) taking into account all terms which include a branch cut in the complex plane and two additional branch cuts: the cuts in the complex 1 and 2 planes in the correlation function (G.7), and similarly in higher order correlation functions, see Eq. (G.8). After analytic continuation (286), see also Appendix I, we 8nd for Nch ¿ 2 T G1 M ∞ Gel = dt dt1 dt2 4 cosh(Tt) cosh[T (t + t1 − t2 )] −∞
1=Nch f(−it1 )f(it2 ) × −1 ; f(it2 − it − 1=2T )f(−it1 + it + 1=2T ) . / f(it + 1=2T ) 2=Nch R † A G × Re (t )WW G (t ) o 1 o 2 f(0) 11
(360)
where Go (t)R; A are the exact Green functions for the open dot without interactions, see Eq. (109). For the case of a one channel contact (Nch = 1; spinless fermions) an additional contribution appears, which is reminiscent of the Coulomb blockade oscillations . / G1 M ∞ Gel = dt dt1 dt2 Re e−i2N−i=2 GoR (t1 )WW † GoR (t2 ) 4 11 −∞ f(it + it1 + it2 + 1=2T ) t0 f(0) cosh Tt f(it + 1=2T ) 2 f(−it1 )f(−it2 ) × f(−it2 − it − 1=2T )f(−it1 + it + 1=2T ) : f(0) ×
(361)
Similar to the discussion of the case of two-terminal conductance, we apply Eq. (360) to the simplest model situation first and only then consider the statistical properties of the conductance.
Imagine, there is a short semiclassical trajectory connecting the tunneling contact and the lead ( [GoR (t1 )WW † GoA (t2 )]11 (t1 − te )(t2 + te ) ; (362) M where te is the time for an electron to travel along this trajectory and ( is the constant characterizing the coupling of the tunneling contact with this trajectory. Substituting Eq. (362) into Eq. (360), we obtain for T Ec
1; te Nch Ec ¡ 1 ; Gel (G1 × (363) (te Nch Ec )−2=Nch ; t0 Nch Ec ¿ 1 ; where we omitted the non-essential numerical factors. Eq. (363) indicates that the contribution of the processes where the time that an electron spends in the dot time is smaller than the recharging time ˝=(Nch Ec ) is not aected by the interaction at all. This was to be expected, because during such a short time the other electrons do not manage to redistribute themselves. On the other hand, the longer trajectories experience a power-law suppression. The physical reason for this is the orthogonality catastrophe with the index 5 = 2=Nch which we already faced in the discussion of the inelastic contribution. We are ready now to perform the statistical analysis for the tunneling conductance of the system. We will limit ourselves to the case of a single mode point contact and electrons with spin (i.e., Nch = 2; results for spinless fermions can be found in Ref. [70]). To 8nd the average conductance we use Eq. (124), and the fact that for re<ectionless contacts Tr WW † = 2 Nch TM . Integration in Eq. (360) is then easily performed and gives for TT Ec
e−C Ec Gel = G1 ln ; (364) 2Ec T where C ≈ 0:577 is the Euler constant. At very low temperatures, T should be substituted by T ln(Ec =). The reason for the logarithmic divergence in Eq. (364) can be understood from Eq. (363): the contribution of trajectories with characteristic time te scales as 1=te , therefore the sum over all the trajectories logarithmically diverges. Similar to the case of the strong Coulomb blockade, the elastic co-tunneling is a strongly
e−C Ec Gel = G1 ln : (366) 2Ec T0 (N) Relationships (365) for mesoscopic
Fig. 24. Conductance showing Coulomb blockade oscillations as a function of gate voltage Vg in the open-channel regime [(a), (b)] and weak-tunneling regime [(c), (d)] at B = 0 and 100 mT (the crossover between the orthogonal and unitary ensembles occurs at B ≈ 20 mT for this quantum dot), from Ref. [154]. Note that the oscillations in the open-channel case are stronger at B = 0 mT compared to B = 100 mT, unlike the weak-tunneling Coulomb blockade.
they are smaller than the aperiodic part of the
Gel (N1 )Gel (N2 ) 0:33 Ec = cos 2(N1 − N2 ) : ln Gel 2 !2 T T
(367)
It is important to emphasize that this part is more sensitive to the magnetic field than the aperiodic part: it reduces by a factor of four rather than by a factor of two. (Non-interacting models for the quantum dot predict the aperiodic
5. Conclusions

In the beginning of this review, see Sections 2.1–2.3.3, we have derived the model Hamiltonian of an isolated dot. One of the main assumptions in that derivation was that the electron states are random. Randomness of the one-electron states, a large dimensionless conductance inside the dot ($g \equiv E_T/\Delta \gg 1$), and relative weakness of the electron–electron interaction ($r_s \lesssim 1$) allowed for a relatively simple description of an isolated dot.$^{39}$ The main part of the Hamiltonian of the dot is universal; it consists of a free-electron part described by random matrix theory and an interaction term, which is not random, see Eq. (93). Corrections to this universal part of the Hamiltonian are small at $g \gg 1$. In the discussion of the transport phenomena and of other effects which result from the contact of the dot with external electron reservoirs (leads), we have considered that universal part of the Hamiltonian only. Despite this extremely simple description of the physics inside the quantum dot, a large variety of intricate physical effects is uncovered when the dot is coupled to the outside world via point contacts. These effects range from quantum interference transport phenomena in the regime of strong Coulomb blockade (Sections 3.1 and 3.2) to the effects of quantum charge
39 This simple model is not applicable to dots with a small number of electrons where the dimensionless conductance is small. It is also not applicable for dots of integrable shape [9], where the shell structure of the orbital states is dominant.
40 This result can be obtained by considering the second term of Eq. (93) in second order perturbation theory.
connected to the five particle state, so that the quantum dot will return to the original state. In that case, the level is broadened much less than is found in Ref. [155].$^{41}$ Inelastic processes within the dot affect the conductance through the dot. This is exemplified by the temperature dependence of the conductance of an open quantum dot, Eq. (349). It follows from this formula that the weak localization correction (second term) does not depend on temperature. This is, however, a consequence of neglecting the inelastic processes within the dot. Only due to those processes is the phase coherence inside the dot destroyed. Quantitatively, it is described approximately [156–158] by the replacement $N_{ch}^{o} + 1 \to N_{ch}^{o} + 1 + 2\pi/(\tau_\varphi \Delta)$, with $\tau_\varphi$ equal to the inelastic relaxation time for an electron with energy $T$. From the theory, one expects $\tau_\varphi \propto T^{-2}$. Experiments with open quantum dots [159] give the result $\tau_\varphi \propto T^{-1}$. The reasons for this discrepancy have not yet been identified. The second area of research omitted in this review is related to the spin–spin interactions. These effects become significant at $r_s \sim 1$, and may result in the formation of a dot ground state with spin exceeding 1/2. Sufficiently strong electron–electron interaction (large $r_s$) should result in a complete spin polarization (Stoner instability). However,
Clearly, in a finite system each “broadened” one-electron level is in fact a bunch of many discrete many-body levels enveloped around that one-particle state. By the one-particle level width one means the width of that envelope.
We also acknowledge numerous discussions of the subjects reflected in the review with our collaborators, O. Agam, B.L. Altshuler, A.V. Andreev, H.U. Baranger, C.W.J. Beenakker, M. Büttiker, B.I. Halperin, F.W.J. Hekking, A. Kamenev, A.I. Larkin, K.A. Matveev, Yu. V. Nazarov, and M. Pustilnik. This work was supported by A.P. Sloan and Packard fellowships at SUNY at Stony Brook, NSF grants no. DMR-9416910, DMR-9630064, and DMR-9714725 at Harvard University (where PWB has started working on this review), and by NSF under grants no. DMR-9731756 and DMR-9812340 at the University of Minnesota.
Appendix A. Correlation of the wave functions in ballistic dots

Consider a system confined by a potential $U(\vec r)$ that is smooth on the scale $a$ much larger than the Fermi wavelength. The Hamiltonian function $H(\vec p, \vec r)$,

$$H(\vec p, \vec r) = \frac{\vec p^{\,2}}{2m} + U(\vec r), \tag{A.1}$$
serves as a classical counterpart of the Hamiltonian of the system. The classical diffusion in space, Eq. (11), is replaced by the classical evolution on the energy shell $H(\vec p, \vec r) = E_F$. This classical evolution is described by the Perron–Frobenius equation
9H 9 1 9H 9 9 2 − − ˜n × fm (˜n;˜r) = (m fm (˜n;˜r) ; (A.2) 9p ˜ 9˜r 9˜r 9p ˜ 9˜n where ˜n is the unit vector in the momentum direction. The last term on the left-hand side of Eq. (A.1) is necessary for the regularization of the otherwise singular eigenfunctions of the Perron–Frobenius operator. The calculation of the quantum corrections (e.g., weak localization) to the chaotic dynamics requires a 8nite value of 1= ˝=(ma2 ), [157,170], however, for the leading in 1=g approximation the limit → ∞ should be taken only in the end of the calculation. The normalization condition (48) is replaced with d˜r d˜nfm (˜n;˜r)fm (−˜n;˜r) = Id mm ; (A.3) where Id is the surface area of the unit sphere in d-dimensional space, I2 = 2; I3 = 4. The product of two Green functions, compare with Eq. (47), averaged over energy is given by, see e.g. Ref. [157], GR (j + !;˜r1+ ;˜r2− )GA (j;˜r1− ;˜r2+ )
=
2 TVd
(m
d˜n1 d˜n2 ikF˜n1˜r1 ikF˜n2˜r2 fm (˜n1 ; ˜R1 )fm (˜n2 ; ˜R2 ) e e ; Id Id (−i! + (m )
˜r1; 2 : ˜r1;±2 = ˜R1; 2 ± 2
(A.4)
Repeating all the steps in the derivation of Eq. (50) from Eq. (39) we obtain instead of Eq. (50) 1 0 !( + ( ! ; g 1 Re (1 d˜n1 d˜n2 c1 = lim d˜rfm (˜n1 ;˜r − ˜n1 r)fm (˜n2 ;˜r) ; (m Id Id r→+0 (1=g)
H!( = c1 D
(A.5)
(m =0
where the dimensionless conductance of the dot g is de8ned similarly to Eq. (15), Re (1 g= : It is important to emphasize that, unlike in the diusive case, the expression for the matrix element involves also the eigenfunctions of the classical evolution operator. This can be traced to the dierence of the normalization conditions (48) and (A.3). Analogously, Eq. (51) acquires the form
2 (1=g) 2 2 [H!( ] = c2 D ; g Re (1 Re (1 d˜n1 d˜n2 d˜n3 d˜n4 2 c2 = 2 lim r→+0 (m (m Id Id Id Id ×
(m ;(m =0
d˜r1 d˜r2 fm (˜n1 ;˜r1 − ˜n1 r)fm (˜n2 ;˜r2 )fm (˜n3 ;˜r2 − ˜n3 r)fm (˜n2 ;˜r1 ) :
(A.6)
Appendix B. Effect of the interaction in the Cooper channel

As we already emphasized in Section 2.3, the universal description of the interaction by Hamiltonian (35) is valid provided that the coupling constants in this Hamiltonian are calculated taking into account virtual transitions to the excited states which are beyond the energy strip in which Hamiltonian (35) is defined. For a weak short-range interaction, these transitions only insignificantly change the interaction constants $E_c$ and $J_S$ for the charge and spin channels, respectively. The situation is different for the Cooper channel, where the renormalization is significant even for a weak interaction. Consider the lowest order perturbation theory correction to the interaction matrix element M!! shown in Fig. 25a. The analytic expression for this correction is H( H(!! H!! ≈ − B(j( j ) : (B.1) |j( + j | |j(; |¿ET
According to the structure of the matrix elements discussed in Section 2.3.1, the largest contribution comes from the terms with ( = , and one 8nds
∗ j −1 2 H!! ≈ − H!! ln ; ET
Fig. 25. Diagrammatic expansion for the renormalization of the matrix elements in the Cooper channel: (a) the first order correction; (b) summation of the leading logarithm series for the renormalized coupling constant. Intermediate states $\gamma$ are only those lying outside the energy strip $E_T$.
where $\varepsilon^*$ is a high-energy cut-off above which the approximation of the instantaneous interaction and equidistant spectrum becomes inapplicable. Because of the large logarithmic factor here, the renormalization of the coupling constants in the Cooper channel is significant, even if the interaction is weak, H!! . To take into account this renormalization, one has to sum the leading logarithmically divergent series shown in Fig. 25b. This results in the renormalization of the interaction constant in comparison with its bare value,

$$J_c \to \tilde J_c = \frac{J_c}{1 + (J_c/\Delta)\,\ln(\varepsilon^*/E_T)}. \tag{B.2}$$
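The behaviour of Eq. (B.2) can be checked numerically. The following is a minimal sketch, assuming that the bare coupling enters through the dimensionless ratio $J_c/\Delta$ (with $\Delta$ the mean level spacing) and that the divergence scale for an attractive coupling is $\varepsilon^* e^{-\Delta/|J_c|}$. The function names and the numerical values are illustrative assumptions and are not taken from the text.

```python
import math

def denominator(j_c, delta, eps_star, e_t):
    """Denominator of Eq. (B.2): 1 + (J_c/Delta) * ln(eps*/E_T)."""
    return 1.0 + (j_c / delta) * math.log(eps_star / e_t)

def renormalized_coupling(j_c, delta, eps_star, e_t):
    """Renormalized Cooper-channel coupling from the leading-logarithm sum."""
    return j_c / denominator(j_c, delta, eps_star, e_t)

def divergence_scale(j_c, delta, eps_star):
    """Energy scale at which an attractive coupling (J_c < 0) diverges."""
    return eps_star * math.exp(-delta / abs(j_c))

# Illustrative numbers (assumptions): all energies in units of Delta.
delta, eps_star, e_t = 1.0, 1000.0, 10.0

# Repulsive case: the renormalized coupling is weaker than the bare one.
j_rep = renormalized_coupling(0.3, delta, eps_star, e_t)

# Attractive case: the denominator vanishes at the divergence scale.
gap = divergence_scale(-0.3, delta, eps_star)
den_at_gap = denominator(-0.3, delta, eps_star, gap)

print(j_rep, gap, den_at_gap)
```

For a repulsive coupling the renormalized value stays positive and below the bare one; for an attractive coupling the denominator crosses zero when $E_T$ is lowered to the divergence scale, which is where the leading-logarithm sum stops being meaningful.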
One can see that for repulsive interaction, $J_c > 0$, the effective interaction in the Cooper channel is always weak, and for the majority of effects it can be neglected from the very beginning. For the attractive interaction, $J_c < 0$, the interaction constant renormalizes to the strong coupling limit and diverges at the energy scale $\Delta_c = \varepsilon^* e^{-\Delta/|J_c|}$, which is nothing but the superconducting gap in BCS theory. If the inequality $\Delta_c < E_T$ holds, one can still use Eq. (35) for the description of low-energy physics of a dot; in particular, effects of finite level spacing on superconductivity can be accounted for in the formalism described here. On the other hand, if $\Delta_c > E_T$, the random matrix description for the interaction effects is not applicable.

Appendix C. Scattering states and derivation of Eq. (99)

In this appendix, we discuss the precise definitions of the scattering states $\psi_j(k)$ in the lead Hamiltonian $H_L$ of Eq. (96) and the derivation of relation (99) between the scattering matrix $S$ and the matrices $W$, $H$, and $U$ parameterizing the Hamiltonian of the dot–lead system. Our discussion follows that of Refs. [86,88–91]. We illustrate a part of Hamiltonian (96) corresponding to one of the leads in Fig. 26. For each particular lead, we take the direction towards the dot as positive $x$, and locate the lead–dot interface near $x = 0$. The electron operators then take the form

$$\hat\psi_e(\vec r) = \sum_{j=1}^{N_{ch}} \sum_{l=1}^{N_{ch}} \Phi_j(\vec r_\perp) \int \frac{dk}{2\pi} \left[ U^*_{jl}\, e^{i(k_F+k)x} + U_{jl}\, e^{-i(k_F+k)x} \right] \hat\psi_j(k), \tag{C.1}$$
Fig. 26. Schematic picture of the dot with one of the leads.
where $\Phi_j(\vec r_\perp)$ characterizes the structure of the wave function in the transverse direction. The unitary matrix $U_{jl}$ describes the scattering in the absence of any coupling to the quantum dot [86,91]. If the contact between the lead and the dot is defined as in Fig. 26, with Dirichlet boundary conditions at the lead–dot interface, electrons pick up a phase shift at the lead–dot interface at $x = 0$, but otherwise remain in the same transverse mode, hence $U_{jl} = i\delta_{jl}$ in that case. In the absence of electron–electron interactions in the dot, the wave function in the leads at the energy $\varepsilon = v_F k$ acquires the form

$$\psi_e(\vec r) = \Phi_j(\vec r_\perp) \left[ a_j^{\rm in}\, e^{i(k_F + \varepsilon/v_F)x} + a_j^{\rm out}\, e^{-i(k_F + \varepsilon/v_F)x} \right]. \tag{C.2}$$
The amplitudes of the out-going waves $a_j^{\rm out}$ are related to the amplitudes of in-going waves $a_j^{\rm in}$ by the $N_{ch} \times N_{ch}$ scattering matrix $S(\varepsilon)$,

$$a_i^{\rm out} = \sum_{j=1}^{N_{ch}} S_{ij}(\varepsilon)\, a_j^{\rm in}. \tag{C.3}$$
The matrix $S$ is unitary, as required by particle conservation. In order to express the scattering matrix $S$ in terms of the Hamiltonian of the closed dot, the coupling matrix $W$ and the unitary matrix $U$, we represent Hamiltonian (95) in terms of a Schrödinger equation for the wave function $\psi_j(x)$, which corresponds to the Fourier transform of the fermion fields $\hat\psi_j(k)$ over the entire real axis,

$$\hat\psi_j(x) = \int \frac{dk}{2\pi}\, \hat\psi_j(k)\, e^{-ikx}, \qquad j = 1, \ldots, N_{ch}. \tag{C.4}$$
The operator ˆj (x) is a mathematical construction that is related, but not identical to the true electron operator ˆe (˜r) de8ned in Eq. (C.1). Then, denoting the wave function inside the dot as & , we 8nd from Eqs. (92) and (95) – (97), that the Schr[odinger equation for the entire system reads M
9 j (x) ∗ j j (x) = ivF Wj (x) ; + 9x =1
j& =
M
H& +
=1
Nch
W&l l (0) :
(C.5)
l=1
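For $x \neq 0$, Eq. (C.5) corresponds to a left moving particle at velocity $v_F$. To find the scattering matrix $S_{jl}$, we set, cf. Eq. (C.2),

$$\psi_j(x) = \begin{cases} \displaystyle\sum_{l=1}^{N_{ch}} e^{-ikx}\, U_{jl}\, a_l^{\rm in}, & x > 0, \\[2ex] \displaystyle\sum_{l=1}^{N_{ch}} e^{-ikx}\, U^*_{jl}\, a_l^{\rm out}, & x < 0, \end{cases} \tag{C.6}$$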
where k = j=vF . At x = 0 we use the standard regularization j (0) = [ j (−0) + Substitution of Eq. (C.6) into the Schr[odinger equation (C.5) yields j& =
j (+0)]=2.
N
ch 1 ∗ out H& + W&j (Ujl ain l + Ujl al ) ; 2
j;l=1
∗ out 0 = ivF (Ujl ain l − Ujl al ) +
Wj∗ :
The wave function of the dot can be eliminated from these equations, resulting in the condition (C.3) for the amplitudes $a_j^{\rm in}$ and $a_j^{\rm out}$, with the $N_{ch} \times N_{ch}$ matrix $S$ given by

$$S(\varepsilon) = U \left[ 1 - 2\pi i\, W^\dagger \left( \varepsilon - H + i\pi W W^\dagger \right)^{-1} W \right] U^T .$$

This is Eq. (99) of the main text. The coupling matrix $W$ can be represented in the form

$$W = \sqrt{\frac{\Delta M}{\pi^2}}\; V O \tilde W, \tag{C.7}$$

where $V$ is an orthogonal $M \times M$ matrix, $\tilde W$ is a real $N_{ch} \times N_{ch}$ matrix, and $O$ is an $M \times N_{ch}$ projection matrix, $O_{\mu j} = \delta_{\mu j}$, $\mu = 1, \ldots, M$ and $j = 1, \ldots, N_{ch}$. The first factor in Eq. (C.7) ensures a convenient normalization of the matrix $\tilde W$. Because the distribution function of the random matrix $H$ is invariant under rotations $H \to V H V^\dagger$, the matrix $V$ in Eq. (C.7) can be omitted. The matrix $\tilde W$ can be related to the $(2N_{ch}) \times (2N_{ch})$ scattering matrix $S_c$ of the point contacts between the lead and the dot. Unlike the scattering matrix $S(\varepsilon)$ which connects the incoming and outgoing states for the entire device, the matrix $S_c$ characterizes the properties of each point
contact connecting a lead to the dot separately. The corresponding transmission and re<ection amplitudes are de8ned in Nch × Nch matrices rc , rc , and tc ,
rc tcT Sc = ; (C.8) tc rc which can be expressed [92] in terms of the coupling matrix W˜ and the matrix U of Eq. (C.1), † 1 − W˜ W˜ T U ; rc = U † 1 + W˜ W˜
rc
† 1 − W˜ W˜ ˜ −1 ˜ W ; =−W † 1 + W˜ W˜
tc = W˜
2 U : † 1 + W˜ W˜
(C.9)
The matrix $r_c$ expresses the reflection from a point contact to the leads, $r_c'$ corresponds to the reflection of an electron entering the point contact from the dot, and the matrix $t_c$ describes the transmission through the point contact from a lead to the dot. A contact is called ideal if $r_c = r_c' = 0$, or, equivalently, $\tilde W^\dagger \tilde W = 1$.
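The unitarity of the scattering matrix of Eq. (99) can be verified numerically. The sketch below is an illustration under stated assumptions: it takes $U = 1$, draws a real symmetric random $H$ and a real random coupling matrix $W$ (no particular normalization is imposed), and uses NumPy. The function name, matrix sizes and random seed are hypothetical choices made only for this example.

```python
import numpy as np

def scattering_matrix(eps, H, W):
    """S = 1 - 2*pi*i W^dagger (eps - H + i*pi W W^dagger)^(-1) W, taking U = 1."""
    M, n_ch = W.shape
    A = eps * np.eye(M) - H + 1j * np.pi * (W @ W.conj().T)
    # Solve A X = W instead of forming the inverse explicitly.
    X = np.linalg.solve(A, W)
    return np.eye(n_ch) - 2j * np.pi * (W.conj().T @ X)

rng = np.random.default_rng(0)
M, n_ch = 40, 3                      # illustrative sizes (assumptions)

G = rng.normal(size=(M, M))
H = (G + G.T) / 2.0                  # real symmetric "closed dot" Hamiltonian
W = rng.normal(size=(M, n_ch))       # real dot-lead coupling matrix

S = scattering_matrix(0.1, H, W)

# Deviation from unitarity, S^dagger S = 1, and from symmetry, S = S^T.
unitarity_error = np.max(np.abs(S.conj().T @ S - np.eye(n_ch)))
symmetry_error = np.max(np.abs(S - S.T))
print(unitarity_error, symmetry_error)
```

Unitarity holds for any Hermitian $H$ and any $W$, since the expression in brackets equals $(1 - iK)(1 + iK)^{-1}$ with the Hermitian matrix $K = \pi W^\dagger (\varepsilon - H)^{-1} W$. The symmetry $S = S^T$ is specific to the real (time-reversal symmetric) choice of $H$ and $W$ made here.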
Appendix D. Mesoscopic fluctuations of elastic co-tunneling far from the peaks

The calculation of the magnetic field dependence of the mesoscopic
D Nh Nh E Ee Fe Fh∗ = H1 ; 1 + h + H1 ;1 + ; Ee + Eh 2Eh Ee 2Eh Eh C
C Nh Nh E Ee Fe Fh = H1 ; 1 + h + H1 ;1 + ; (D.1) Ee + Eh 2Eh Ee 2Eh Eh where the dimensionless functions H1 (x) is given by 2 3 1 x 2 x H1 (x; y) = − arctan xy − xLi2 (1 − y) + + ln xy (1 + x2 ) 6 2 1 3 0 12 − x ln x + x + ln 1 + x2 y2 − Re[(i + x)Li2 (−ixy)] 2 2
;
(D.2)
with Li2 (x) being the second polylogarithm function [103]. The asymptotic behavior of functions H1 is H1 (x; y) = (xy=)ln(xy) for x1, and H1 (x; y) = [(1=2) ln2 (xy) − ln2 x]=(x) for x1. The limits of the pure orthogonal (unitary) ensembles (151) are recovered by putting NhD = 0 and NhC = 0(∞) in Eq. (D.1).
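For readers who wish to evaluate expressions containing the second polylogarithm numerically, the following minimal sketch computes $\mathrm{Li}_2(x)$ for real arguments in a restricted range. It assumes the standard series $\mathrm{Li}_2(x) = \sum_{k \ge 1} x^k/k^2$ and the reflection identity $\mathrm{Li}_2(x) + \mathrm{Li}_2(1-x) = \pi^2/6 - \ln x \,\ln(1-x)$. The function name, the tolerance and the restriction to $-1/2 \le x \le 1$ are choices made for this illustration, not part of the text.

```python
import math

def dilog(x, tol=1e-16):
    """Second polylogarithm Li_2(x) for real x in [-0.5, 1]."""
    if x < -0.5 or x > 1.0:
        raise ValueError("this sketch only covers -0.5 <= x <= 1")
    if x == 1.0:
        return math.pi ** 2 / 6.0
    if x > 0.5:
        # Reflection identity maps (0.5, 1) onto (0, 0.5), where the series converges fast.
        return math.pi ** 2 / 6.0 - math.log(x) * math.log(1.0 - x) - dilog(1.0 - x, tol)
    total, k = 0.0, 1
    while True:
        term = x ** k / k ** 2
        total += term
        if abs(term) < tol:
            return total
        k += 1

# Known closed form: Li_2(1/2) = pi^2/12 - (ln 2)^2 / 2
li2_half = dilog(0.5)
li2_half_exact = math.pi ** 2 / 12.0 - math.log(2.0) ** 2 / 2.0
print(li2_half, li2_half_exact)
```

The expression for $H_1$ also involves the dilogarithm of a complex argument, which this real-valued sketch does not cover; a routine for complex arguments would be needed for a full evaluation.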
The correlation function of the conductances at different values of the magnetic field is somewhat involved,
D
D 2 Nh Nh G(B1 )G(B2 ) Eh Eh Ee Eh = H + H ; 1 + + E ↔ E e 1 h G 2 Ee + Eh 2Ee (Ee + Eh )2 2Eh Ee + NhD ↔ NhC :
(D.3)
In this general case, where the gate voltage can be anywhere in the Coulomb blockade valley, it is impossible to point out a single scale for the magnetic field. However, in the vicinity of the peaks, $E_e \ll E_h$ or $E_e \gg E_h$, the situation simplifies significantly, and one obtains the universal dependence (160). The dependence of the crossover parameter D from Eq. (162) on the magnetic
C
C Nh Nh Eh Eh Ee Eh D= H + H1 ;1 + + Ee ↔ Eh ; (D.4) Ee + Eh 2Ee 2Eh Ee (Ee + Eh ) 2 where the functions H; H1 are de8ned in Eqs. (159) and (D.2), and NhC is de8ned in Eq. (20) with 61 = 62 = 6. As before, there is no universal dependence on the magnetic 8eld. In the vicinity of the peak, however, we recover the universal result (163).
Appendix E. Derivation of Hamiltonian (238) To include the second junction we should modify Hamiltonian given in Eqs. (207), (209) and (212): in the charging energy, Eq. (209), the variable BN (0) should be replaced by the dierence BN (0) − BN (L); we should also return from Eq. (212) to the generic form (206) of the free-8eld Hamiltonian, and replace the boundary condition Eq. (213) by a barrier described by the Hamiltonian √ √ 2 Hˆ r1 = − |r1 |D cos[2 BN (L)]cos[2 Bs (L)] ;
(E.1)
which is similar to Eq. (207). Similar to the one-junction case [cf. Eq. (211)], in the energy range E Ec the
√ [BN (0) − BN (L)]q = eN ;
(E.2)
and
of the eective action [140], rather than a Hamiltonian, |!n |[4|B˜N (!n )|2 + 2|Bs (!n )|2 + 2|Bs1 (!n )|2 ] S= i!n
1=2 √ √ e−C E D d{|r |cos[N − 2 B˜N ()] cos[2 Bs ()] c 3 √ √ (E.3) + |r1 |cos[2 B˜N ()] cos[2 Bs1 ()]} ; where is the imaginary time, and !n is the Matsubara frequency. The spin degrees of freedom at each junction are described by g = 1=2 Luttinger liquids, and the mode corresponding to the remaining charge degree of freedom, is a Luttinger liquid with g = 1=4; its increased stiness comes from the low-energy constraint equation (E.2) of the charge neutrality of the dot. It is easy to check that because of this increased stiness [i.e., because of (g ¡ 1=2)], the re<ection at the point contacts is a relevant perturbation. 42 If the bare re<ection amplitudes |r | and |r1 | are small, then with the reduction of the energy scale E they grow as (Ec =E)1=4 , independently of each other. We intend to consider the asymmetric geometry: the bare re<ection amplitudes are very dierent: |r ||r1 |. Then, the crossover to weak tunneling through the x = L junction is either occurs upon reaching the energy scale ∼ Ec |r1 |4 , or tunneling is weak from the very beginning. At energies E . Ec |r1 |4 , the variables Bs1 and B˜N are 8xed most of the time, so that √ √ cos(2 B˜N ) cos (2 Bs1 ) = 1, to ensure the energy minimum of the junction. The system makes rare hops between various con8gurations satisfying the energy minimum condition. Upon further reduction of the bandwidth, the eective backscattering at x = 0 follows the renormalization prescribed by Eq. (220), and reaches the limit of the weak tunneling at energies ∼ Ec |r1 r cos N|2 . This energy scale coincides with Eq. (221) if |r1 | ∼ 1, and replaces it if |r1 | is small. Note that similar to condition (225), the spatial quantization of the electron states in the dot does not aect the crossover to the regime of weak tunneling, if Ec |r1 r cos N|2 . In that regime, the system performs rare hops between the minima of the “potential” √ √ U {Bs ; Bs1 ; B˜N } = −U cos(N − 2 B˜N ) cos(2 Bs ) √ √ − U1 cos(2 B˜N ) cos(2 Bs1 ) ; (E.4)
+4
represented by the last two lines in the action (E.3). [Here the renormalization-dependent energies U ¿ 0 and U1 ¿ 0 replace the corresponding √ bare values entering in Eq. (E.3).] In the minima of potential (E.4), the spin of the dot ( =2)(Bs − Bs1 ) is integer or half-integer, depending on the value of N, cf. Eq. (224). There are three types √ of hops which connect √the minima of potential (E.4). Two of the types, Bs (0) → Bs (0) ± and Bs1 (0) → Bs1 (0) ± , are already familiar to us from the single-junction case, see Eq. (226) and the discussion which follows that equation. These hops correspond to a change by 1 of the z-projection of the dot’s spin via spin exchange with one of the leads. The third type of hops involves a simultaneous change of all three 8elds √ √ √ Bs (0) → Bs (0) ± =2; Bs1 (0) → Bs1 (0) ± =2; B˜N (0) → B˜N (0) ± =2 : 42
The quadratic part of the effective action (E.3) represents the fixed point of a four-channel Kondo problem [40], and the reflection represents a relevant perturbation near the fixed point.
In such a hop, one electron (charge $e$ and spin $s_z = \pm 1/2$) is transferred across the dot; the spin of this electron and the dot’s spin may
√
√
√ ± 0L ± ± ˜ Hˆ ± ˙ −cos cos cos ; (E.5) 2 N 2 s1 2 s supplements Eq. (226) and a similar equation for the x = L junction, √ L (E.6) Hˆ ± ˙ −cos{ [s1 (+0) − s1 (−0)]} : ˜ ˜± ˜ Here ± s1 = s1 (x = L + 0) − s1 (x = L − 0) and N = N (x = + 0) − N (x = − 0) are the discontinuities of the respective 8elds at the junction (for the 8eld ˜ N the discontinuity is the same at both junctions). All three Hamiltonians, Eqs. (E.5), (E.6) and Eq. (226), have the same scaling exponent, as can be easily checked with the help of action (E.3). These three Hamiltonians represent the easy-plane part of the exchange interaction. The SU (2) symmetry guarantees the existence of terms like Eq. (227), which restore the isotropy of the exchange. The exchange interaction changes the spin of the dot by an integer. If we account for the 8nite level spacing, then the lowest energy states would correspond to the smallest possible spin of the dot, like in the one-junction geometry considered above. Kondo eect develops only if the spin of the dot is 1=2. This doublet state is realized in the dot periodically with the gate voltage, when cos N ¡ 0 (we assume here |r1 ||r |). In Section 4.3 we concentrate on the doublet state only. Returning to the fermionic variables at E Ec |r1 r cos N|2 , we 8nd the exchange Hamiltonian (238) which generalizes Eq. (228) to the case of two junctions. Appendix F. Canonical versus grand canonical ensembles A crucial step in the eective action formalism is rede8nition (254) of the charge operator nˆ in terms of the fermion operators in the leads. As was discussed below Eq. (254), strictly speaking, such a rede8nition requires a canonical description of the entire system, i.e., that the total number of particles in the leads and the quantum dot is kept 8xed in the ensemble. 
A grand-canonical approach, as in Section 4.5.1, can be used only when all observables are considered at a single value of the gate voltage N, or when physical observables are a periodic function of N. For a non-periodic N-dependence (i.e., for mesoscopic
Here Tr Np denotes a trace over all states with a 8xed number of particles Np ; Tr is a trace over all states with all numbers of particles, and Nˆ p is the operator for the total number of particles. Hence, what we need is the dependence of I on a complex chemical potential & in a strip T ¡ Im & ¡ T . The modi8cation of the eective action theory of Section 4.5.1 to include a complex chemical potential is straightforward, because all intermediate steps are analytic in & for T ¡ Im & ¡ T . In the 8nal result, the &-dependence appears through the &-dependence of the scattering matrix, S(&; t) → S(0; t)e−i&t and through the &-dependence of the thermodynamic potential of the lead. To 8nd the latter, we include & into the charging energy, ∞ † † Hˆ e − &Nˆ p; L = ivF d x( ˆ L; j 9x ˆ L; j − ˆ R; j 9x ˆ R; j ) −∞
j
+ Ec
j
−
0
−∞
2
dx :
ˆ† L; j
ˆ
L; j
+
ˆ† R; j
ˆ
R; j
: +N
&2 − &N − &Nˆ p; L q ; 4Ec
(F.2)
where N = N + &=(2Ec ) and Nˆ p; L is the number of particles in the leads. Hence, by the second line of Eq. (F.2), wherever we found an explicit N-dependence, N should be replaced by N + &=(2Ec ). The term &Nˆ p; L q denotes the quantum mechanical average of the number of particles in the lead and arises due to the normal ordering of the ˆ -8elds. Let us now discuss the implication of this scheme for the calculation of the mesoscopic
I(&) = I(0) −
(F.3)
where Nˆ p is the expectation number of the total number of particles in the system (leads and dot) at & = 0. Mesoscopic
8nd the leading &-dependence of I(&), one can replace I by its ensemble average, for which one 8nds, 43 I(&) = I(0) −
&2 &2 : − &N − &Np − 4Ec 2
(F.4)
With this, the &-integration in Eq. (F.1) is straightforward when we choose Np = Nˆ p ; for T Nch the integration can be done by the saddle point method and amounts to substitution (326). Interaction and
j=1
L; j
k=1
1 = (2vF )Nf
1 f(0)t0
|m|Nch
k=1
−i2m(N+Nch =4)
e
Nch
nR
j j sinT (Ljk − Ljl ) sinT (RjkU − RjlU )
T
k¿l
R; j
k=1
sinT (LjkU − LjlU )
T
k¿l
nRj +mNch
sin T (Rjk − Rjl )
T
k¿l
q
nLj +mNch
j=1
nL
×
R; j
k=1
T
k¿l
nRj nRj +mNch T T LU − L ) RU − R ) sinT ( sinT ( jl k=1 l=1 jl jk jk l=1
nLj +mNch nLj
×
k=1
×
L R Nch nj +mN ch ni +mN ch
i; j=1
k=1
nLj +mNch nRi
×
k=1
43
l=1
[f(Ril
−
U Ljk )]1=Nch
l=1
f(RilU
1 − LjkU )
nL
R
ni j
U
[f(Ril − Ljk )]1=Nch
k=1 l=1
1=Nch
nL
R
j ni +mNch
k=1
l=1
f(Ril
1 − Ljk )
1=Nch
;
(G.1)
In this case, the approximate formula (122) for the average of the scattering matrix gives a result that is wrong o , because time scales ∼ Nch , for which approximation (122) does not hold, are by a factor 1 + (2 − !)=!Nch important. Exact evaluation makes use of the relation (2i)−1 Tr S † (9S= 9j) = −1 .
# R where nL; are arbitrary non-negative integers, m is an arbitrary integer, Nf = Nch m+ j (nLj +nRj ), j and we use the convention Nj=1 aj = 1, for N = 0, and Nj=1 aj = 0, for N ¡ 0. The function f() and the time scale t0 are de8ned by Eq. (305). One can easily check that Eqs. (306), (307) and (309) follow from Eqs. (G.1). Correlators of fermionic operators and current operator: For the calculation of the twoterminal conductance, we need correlators that involve both the fermionic operators and the current operator I , which is linear in the boson 8elds, evF 9 I= Hjj [’ˆ Lj (x) − ’ˆ Rj (x)] ; (G.2) 2 9x j x↑0
Here the diagonal matrix H is de8ned in Eq. (105). The necessary bosonic correlators are obtained from those of the free boson 8elds ’ˆ by substitution → ± ix=vF , where the + (−) sign is for left (right) movers. For the correlators of the current operators one thus 8nds
2 T 2 1 e2 N1 N2 : (G.3) T I ()I (0)q = 2 2 Nch ± sin2 [T ( ± i0)] The correlators of the current operators and two or four fermionic operators are most conveniently expressed in the correlators of the fermionic operators themselves, NLU NRU NL NR U U ˆ ˆ L R L R U (i ) U (k ) ˆ (j ) ˆ (l ) T I ()I (0) i=1
L; ni
j=1
L; nj
k=1
R; nk
l=1
R; nl
q
NLU NRU NL NR U U ˆ ˆ L R L R U (i ) U (k ) ˆ (j ) ˆ (l ) T L; ni L; nj R; nk R; nl
= T I ()I (0)q
i=1
1 + 2
e2 2
j=1
k=1
l=1
NLU NRU NL NR U U ˆ ˆ L R L R ˆ (j ) ˆ (l ) U (i ) U (k ) T L; ni L; nj R; nk R; nl
i=1
j=1
k=1
l=1
q
q
NLU NRU NL NR U U × gni (Li − ) − gnj (Lj − ) + gnk ( − Rk ) − gnl ( − Rl ) i=1
j=1
k=1
l=1
NLU NRU NL NR U U gni (Li ) − gnj (Lj ) + gnk (−Rk ) − gnl (−Rl ) ; × i=1
j=1
gn () ≡ Hnn T cot[T ( + i0)] ;
k=1
l=1
(G.4)
where NLU ; NLU ; NRU ; NR are the arbitrary non-negative integers, and ni = 1; : : : ; Nch labels the channels. The expressions for the average of the fermionic operators were given earlier, see Eqs. (306) – (309), and (G.1).
ˆ Fˆ † : Calculation of the Correlators of fermionic operators and the charging operators F, tunneling density of states Eqs. (290) – (292) involves not only the correlation function of ˆ Fˆ † that increase or decrease the the fermionic operators but also the products of the operators F; charge of the dot by one electron. The corresponding calculation is facilitated by the observation that † −Hˆ e † ˆ ˆ H e U F(0) ˆ ˆ F() =e Fˆ e Fˆ = T exp d1 [Hˆ e − Fˆ Hˆ e (1 )F] ; 0
where the one-dimensional Hamiltonian is given by Eq. (293) and the commutation relation for ˆ Fˆ † are given by Eq. (287). In bosonized representation one 8nds F; Ec † [’ˆ L; j (0) + ’ˆ R; j (0)] + 2(N − mˆ − 1=2) : (G.5) Hˆ e − Fˆ Hˆ e Fˆ = j
Because this term is linear in the bosonic 8elds, further calculation proceeds without any diKculty with the help of Eq. (303). The operator mˆ in Eq. (G.5) does not have its own dynamics and can be 8xed to be any integer, ei2mˆ = 1. We obtain f() 2=Nch ˆ U ˆ T F()F(0)q = ; (G.6) f(0) where the dimensionless function f() is de8ned in Eq. (304). For the averages involving fermionic operators as well as the charge changing operators we 8nd [compare with Eq. (306)], Uˆ ( ) ˆ ( ) Uˆ F(0) ˆ (2v )T F() F
L; i
1
L; j
2
q
f() 2=Nch T f( − 1 )f(−2 ) 1=Nch = ij ; f(0) sin[T (1 − 2 )] f(−1 )f( − 2 )
Uˆ (1 ) ˆ (2 )q Uˆ F(0) ˆ (2vF )T F() R; i R; j
f() 2=Nch T f(1 − )f(2 ) 1=Nch = ij : f(0) sin[T (1 − 2 )] f(1 )f(2 − )
(G.7)
Notice that the structure of this expression is equivalent to the form factors in the problem of the orthogonality catastrophe [153]. Averages involving larger number of the fermionic operators are found in the same manner. We will give the formula expressing correlation function involving the charging operator and the arbitrary number of the fermionic operators in terms of the correlation function involving the fermionic operators only: NLU NRU NL NR U U ˆ ˆ ˆ L R L R ˆ (j ) ˆ (l ) U (i ) U (k ) U F(0) ˆ T F() i=1
L; ni
j=1
L; nj
k=1
R; nk
l=1
R; nl
q
NU NRU NL NR L f() 2=Nch U U ˆ ˆ L R L R ˆ (j ) ˆ (l ) U (i ) U (k ) = T L; ni L; nj R; nk R; nl f(0) i=1
j=1
k=1
l=1
1=Nch NLU NRU NL NR U U L) R) R L f( − f( f(k − ) f( − i ) j j × U L) U R − ) L R f( − f( f( − f( ) ) j j i k i=1 j=1 k=1 l=1
q
(G.8)
where NLU ; NLU ; NRU ; NR are the arbitrary non-negative integers, and ni = 1; : : : ; Nch labels the channels. The expressions for the correlation functions of fermionic operators were given earlier, see Eqs. (306) – (309) and (G.1).
Appendix H. Derivation of Eqs. (340) – (343) Here we present the details of the calculation of the current–current correlator F() and the two-terminal conductance G up to second order in the eective action Se . To zeroth order in the interaction, F() is given by Eq. (G.3). The correction F()(1) to 8rst order in the action follows from Eqs. (306) and (G.4) as F()(1) = T I ()I (0)0 Se 0 − T I ()I (0)Se 0 1=T e2 T 3 1=T Tr S(1 − 2 )H2 = d1 d2 8i ± 0 sin[T (1 − 2 )] 0 ×{cot[T (1 − ± i0)] − cot[T (1 − ± i0)]} ×{cot[T (1 ± i0)] − cot[T (2 ± i0)]} :
(H.1)
We change to variables 1 = + and 2 = − and integrate over . Since the integrand is analytic in the upper (lower) half of the complex plane for the + (−) signs in the denominators, one directly 8nds F()(1) = 0 :
(H.2)
The second order correction to F() reads 2 2 F()(2) = 12 I ()I (0)Se 0 − 12 I ()I (0)0 Se 0 :
Here we 8nd for Nch ¿ 2 (2)
F()
e2 T 4 1=T =− d1 d2 d3 d4 Tr S(1 − 2 )HS(3 − 4 )H 16 ± 0 ×
1 sin[(1 ± + i0)T ]sin[(4 ± + i0)T ]
(H.3)
Fig. 27. Shift of the integration contours for the variable in Eq. (H.6). Branch cuts of the integrand are indicated by thick lines, poles by dots. The integration contour for the first term between brackets {· · ·} in Eq. (H.6) is shifted below the real axis (a); the integration contour for the second term is shifted above the real axis (b).
1 sin[(2 − i0)T ]sin[(3 − i0)T ] f(2 − 1 )f(3 − 4 ) 1=Nch × : f(3 − 1 )f(2 − 4 ) ×
(H.4)
We left out terms that have all poles on the same side of the real axis, because they vanish after integration over the times $\tau_1, \ldots, \tau_4$. We now parameterize $\tau_1$, $\tau_2$, $\tau_3$, and $\tau_4$ according to
1 = + =2 + 1 ; 2 = + =2 ; 3 = − =2 + 2 ; 4 = − =2 ;
(H.5)
perform the integration over , and use the Lehmann representation (316) for S(), 0 1=T e2 T 3 i ∞ Tr S(t1 )HS † (t2 )H (2) F () = dt1 dt2 d 8 sin[(it2 − )T ] 0 −∞ 0 ± 1 × sin[(± + it1 )T ]sin[(± − + i0)T ] 1 sin[(± − it2 )T ]sin[(± + + it1 − it2 )T ] f(it2 − it1 − )f() 1=Nch × : f(−it1 )f(it2 ) −
(H.6)
Finally, we shift the remaining imaginary-time integration into the complex plane, using the approximation (305) for $f$. For the first term between brackets, we choose the integration path below the real axis, while for the second term the integration path is chosen above the real axis; see Fig. 27.
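The result is

$$F^{(2)}(\tau) = \frac{e^2 T^2}{4} \sum_\pm \int_0^\infty dt_1 \int_{-\infty}^0 dt_2 \, \mathrm{Tr}\, S(t_1) \Lambda S^\dagger(t_2) \Lambda \Bigg\{ \frac{1}{\sinh[(t_1 \pm i\tau)\pi T] \sinh[(-t_2 \pm i\tau)\pi T]} + T \int_{t_0}^\infty ds \, \frac{\sin(\pi/N_{\rm ch})}{\sinh[(s + t_1 - t_2 \pm i\tau)\pi T]} \left( \frac{1}{\sinh[(t_1 \pm i\tau)\pi T] \sinh[(s + t_1)\pi T]} + \frac{1}{\sinh[(-t_2 \pm i\tau)\pi T] \sinh[(s - t_2)\pi T]} \right) \times \left[ \frac{\sinh[(t_1 - t_2 + s + t_0)\pi T] \sinh[(s - t_0)\pi T]}{\sinh[(t_1 + t_0)\pi T] \sinh[(-t_2 + t_0)\pi T]} \right]^{1/N_{\rm ch}} \Bigg\} .$$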
(H.7)
The first term is the pole contribution; the second term contains a contribution from the branch cut. Corrections to $F(\tau)$ of higher order in perturbation theory can be classified in the same way, as terms with pole contributions, one branch cut, two branch cuts, etc. As discussed in Section 4.5.2, there are no corrections of higher order in the scattering matrix $S$ with zero or one branch cut (these have been accounted for here); all higher-order corrections have more than one branch cut. Now, using the conductance formula (281), one directly finds the interaction correction (342) to the conductance.

For $N_{\rm ch} = 2$ there is an additional contribution to the current–current correlator $F(\tau)$, periodic in $N$, to second order in the action $S_{\rm eff}$. The origin of this extra $N$-dependent contribution is the fact that, for $N_{\rm ch} = 2$, the product (309) of four different fermion operators has a non-zero average. We thus find

$$F(\tau)_{\rm osc} = -\frac{e^2 T^2 e^{-2\pi i N}}{64 \pi^2 t_0^2} \int_0^{1/T} d\tau_1 \, d\tau_2 \, d\tau_3 \, d\tau_4 \, f(0)^{-2} [f(\tau_2 - \tau_1) f(\tau_4 - \tau_3) f(\tau_4 - \tau_1) f(\tau_2 - \tau_3)]^{1/2} \times [S_{11}(\tau_1 - \tau_2) S_{22}(\tau_3 - \tau_4) + S_{12}(\tau_1 - \tau_2) S_{21}(\tau_3 - \tau_4)] \times \sum_\pm \frac{\sin[(\tau_1 - \tau_3)\pi T]}{\sin[(\tau_1 \pm \tau + i0)\pi T] \sin[(\tau_3 \pm \tau + i0)\pi T]} \times \frac{\sin[(\tau_2 - \tau_4)\pi T]}{\sin[(\tau_2 - i0)\pi T] \sin[(\tau_4 - i0)\pi T]} + {\rm c.c.}$$
(H.8)

[Note that the products $S_{11} S_{22}$ and $S_{12} S_{21}$ appear with the same sign, because the latter product has an extra minus sign both in the fermionic correlator (309) and in the current-fermion correlators (G.4).] Again, terms that vanish after integration have been left out.
We now use the parameterization (H.5) of the times $\tau_1, \ldots, \tau_4$, integrate over the time variable common to all four, and implement the Lehmann representation (316), to find
1=T e2 T e−2iN ∞ F()osc = dt dt d[S11 (t1 )S22 (t2 ) + S12 (t1 )S21 (t2 )] 1 2 32i2 t02 0 0 ×f(0)−2 [f(−it1 )f(−it2 )f(− − it1 )f( − it2 )]1=2 sin[( + it1 − it2 )T ] × sin[( + it1 ± )T ]sin[(it2 ± )T ] ± −
sin[( + it1 − it2 )T ] sin[(− + it2 ± )T ]sin[(it1 ± )T ]
+ c:c:
(H.9)
Next we integrate over the remaining imaginary-time variable by shifting the integration contour into the upper (lower) half of the complex plane for the first (second) term between brackets {...}. The only contribution is from the branch cut of the integrand, which reads, with approximation (305) for $f$,
∞ e2 T 3 e−2iN ∞ F()osc = dt1 dt2 ds[S11 (t1 )S22 (t2 ) + S12 (t1 )S21 (t2 )] 16 0 t0 ×
1 sinh[(t1 + t0 )T ]sinh[(t2 + t0 )T ]
1 sinh[(t1 + t2 + t0 + s)T ]sinh[(s − t0 )T ] sinh[(s + t1 − t2 )T ] × sin[(is + it1 ± )T ]sin[(it2 ± )T ] ± ×
+
sinh[(s + t1 − t2 )T ] sin[(is + it2 ± )T ]sin[(it1 ± )T ]
+ c:c:
(H.10)
Finally, using the conductance formula (281), we find the oscillating contribution (343) to the conductance.
Appendix I. Derivation of Eqs. (360)–(361)

To obtain the leading elastic contribution we calculate the correlator (292), taking into account all the terms that include the branch cut in the complex $\tau$ plane and the cuts in the complex $\tau_1$ and $\tau_2$ planes in the correlation function (G.7), and similarly in the higher-order correlation function, see Eq. (G.8). This gives [70] for $N_{\rm ch} \geq 2$
1 f() 2=Nch 1=T (1) d1 d2 [Go (1 )WW † Go (−2 )]11 Fel () = 4 f(0) 0
Fig. 28. Deformation of the integration contours for the variables $\tau_1$ (a) and $\tau_2$ (b) in Eq. (I.1). Branch cuts coming from the functions $f(\tau)$ are shown by thick lines, and the integration contour is represented by arrowed vertical thin lines.
T × sinT ( + 1 − 2 )
+
f(1 )f(2 ) f(1 + )f(2 − )
f(−1 )f(−2 ) f(−1 − )f( − 2 ) 2=Nch
−2
;
2=Nch
(I.1)
where $G_o(\tau)$ is the Matsubara Green function for the open dot. One can easily see that the pole contribution at $\tau + \tau_1 - \tau_2 = 0$ is excluded from Eq. (I.1), because it is already taken into account in the inelastic contribution (353). One can now deform the contours of the integration over $\tau_1$ and $\tau_2$ in Eq. (I.1), using the analyticity of the function $f(\tau)$ at ${\rm Im}\, \tau < 0$; see Fig. 28.
2=Nch ∞ 2=Nch 1 f() f( − it )f( − it ) 1 2 Fel(1) () = dt1 dt2 −1 2 f(0) f(−it1 − )f( − it2 ) 0 . / T Re GoR (t1 )WW † GoA (−t2 ) ; (I.2) 11 sin T ( + it1 − it2 )

where $G_o^{R,A}(t)$ are the retarded and advanced Green functions for the open dot. In deriving Eq. (I.2), we use the facts that $f(\tau + 1/T) = f(\tau)$, $G(\tau + 1/T) = -G(\tau)$, and $G(it + 0) - G(it - 0) = i[G^R(t) - G^A(t)]$. The resulting function (I.2) is analytic for $0 < \tau < 1/T$, and we can use the analytic continuation Eq. (286) to find the tunneling conductance Eq. (360).

For the case of a one-channel contact, $N_{\rm ch} = 1$ (spinless fermions), an additional contribution appears due to the anomalous average (309).
G1 M f() 2 1=T f( + 1 − 2 ) f(−1 )f(2 ) osc Gel () = d1 d2 2 f(0) t0 f(0) f(2 − )f(− − 1 ) 0 −i2N−i=2 † Re{e [Go (1 )WW Go (−2 )]11 } : (I.3)

After deformations of the integration contours analogous to those in the derivation of Eq. (I.2), and analytic continuation, we arrive at Eq. (360).
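The periodicity relations used in deriving Eq. (I.2) can be checked numerically on model functions. The sketch below is illustrative only: the form $\pi T t_0/|\sin(\pi T \tau)|$ for $f$ and the form $\pi T/\sin(\pi T \tau)$ for the fermionic function are assumed stand-ins, not expressions quoted from Eq. (305) or from the open-dot Green function, and the values of `T` and `t0` are arbitrary.

```python
import numpy as np

T = 0.5     # temperature in arbitrary units; the period is 1/T = 2
t0 = 0.01   # short-time cutoff, hypothetical value

def f_model(tau):
    """Assumed bosonic-type model: periodic under tau -> tau + 1/T."""
    return np.pi * T * t0 / np.abs(np.sin(np.pi * T * tau))

def g_model(tau):
    """Assumed fermionic-type model: antiperiodic under tau -> tau + 1/T."""
    return np.pi * T / np.sin(np.pi * T * tau)

# Sample points strictly inside (0, 1/T), away from the singular end points.
tau = np.linspace(0.1, 1.9, 7)

f_periodic = np.allclose(f_model(tau + 1.0 / T), f_model(tau))
g_antiperiodic = np.allclose(g_model(tau + 1.0 / T), -g_model(tau))
print(f_periodic, g_antiperiodic)
```

Both checks hold for these model forms because shifting $\tau$ by $1/T$ changes the sign of $\sin(\pi T \tau)$, which the absolute value in the first function removes and the second function retains.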
References

[1] B.L. Altshuler, Pis’ma Zh. Exp. Teor. Fiz. 41 (1985) 530 [Sov. Phys. JETP Lett. 41 (1985) 648].
[2] P.A. Lee, A.D. Stone, Phys. Rev. Lett. 55 (1985) 1622.
[3] A.G. Aronov, Y.V. Sharvin, Rev. Mod. Phys. 59 (1987) 755.
[4] B.L. Altshuler, P.A. Lee, R.A. Webb (Eds.), Mesoscopic Phenomena in Solids, North-Holland, Amsterdam, 1991.
[5] C.W.J. Beenakker, H. van Houten, Solid State Phys. 44 (1991) 1.
[6] E. Akkermans, G. Montambaux, J.-L. Pichard, J. Zinn-Justin (Eds.), Mesoscopic Quantum Physics, North-Holland, Amsterdam, 1995.
[7] R.A. Webb, S. Washburn, Phys. Today 41 (1988) 46; S. Washburn, R.A. Webb, Rep. Prog. Phys. 55 (1992) 1311.
[8] L.P. Kouwenhoven, C.M. Marcus, P.L. McEuen, S. Tarucha, R.M. Westervelt, N.S. Wingreen, in: L.L. Sohn, L.P. Kouwenhoven, G. Schön (Eds.), Mesoscopic Electron Transport, Kluwer, Dordrecht, 1997.
[9] See, e.g., L. Kouwenhoven, C.M. Marcus, Phys. World 11 (1998) 35 and references therein.
[10] L.I. Glazman, G.B. Lesovik, D.E. Khmelnitskii, R.I. Shekhter, Pis’ma Zh. Exp. Teor. Fiz. 48 (1988) 218 [Sov. Phys. JETP Lett. 48 (1988) 238].
[11] B. Kramer, Quantization of the transport, in: T. Dittrich, P. Hänggi, G.-L. Ingold, B. Kramer, G. Schön, W. Zwerger (Eds.), Quantum Transport and Dissipation, WILEY-VCH, Weinheim, 1998, p. 79.
[12] L.I. Glazman, I.A. Larkin, Semicond. Sci. Technol. 6 (1991) 32.
[13] A.A. Abrikosov, L.P. Gorkov, I.E. Dzyaloshinskii, Methods of Quantum Field Theory in Statistical Physics, Prentice-Hall, Englewood Cliffs, NJ, 1963.
[14] C.W.J. Beenakker, Rev. Mod. Phys. 69 (1997) 731.
[15] Y. Alhassid, Rev. Mod. Phys. 72 (2000) 895.
[16] H.U. Baranger, P.A. Mello, Phys. Rev. Lett. 73 (1994) 142.
[17] R.A. Jalabert, J.-L. Pichard, C.W.J. Beenakker, Europhys. Lett. 27 (1994) 255.
[18] B.L. Altshuler, A.G. Aronov, in: A.L. Efros, M. Pollak (Eds.), Electron–Electron Interactions in Disordered Systems, North-Holland, Amsterdam, 1985.
[19] D.V. Averin, K.K. Likharev, in: B.L. Altshuler, P.A. Lee, R.A. Webb (Eds.), Mesoscopic Phenomena in Solids, North-Holland, Amsterdam, 1991.
[20] M.A. Kastner, Rev. Mod. Phys. 64 (1992) 849.
[21] J.A. Folk, S.R. Patel, S.F. Godijn, A.G. Huibers, S.M. Cronenwett, C.M. Marcus, K. Campman, A.C. Gossard, Phys. Rev. Lett. 76 (1996) 1699.
[22] H.R. Zeller, I. Giaever, Phys. Rev. Lett. 20 (1968) 1504; Phys. Rev. 181 (1969) 789; J. Lambe, R.C. Jaklevic, Phys. Rev. Lett. 22 (1961) 1371. The charging effects in electron transport through a granular medium were discussed theoretically in C.J. Gorter, Physica 8 (1951) 777.
[23] D.V. Averin, K.K. Likharev, J. Low Temp. Phys. 62 (1986) 345.
[24] R.I. Shekhter, Zh. Exp. Teor. Fiz. 63 (1972) 1410 [Sov. Phys. JETP 36 (1973) 747].
[25] I.O. Kulik, R.I. Shekhter, Zh. Exp. Teor. Fiz. 68 (1975) 623 [Sov. Phys. JETP 41 (1975) 308].
[26] T.A. Fulton, G.J. Dolan, Phys. Rev. Lett. 59 (1987) 109.
[27] D.V. Averin, A.A. Odintsov, Phys. Lett. A 140 (1989) 251.
[28] D.V. Averin, Yu.N. Nazarov, Phys. Rev. Lett. 65 (1990) 2446.
[29] L.I. Glazman, K.A. Matveev, Zh. Exp. Teor. Fiz. 98 (1990) 1834 [Sov. Phys. JETP 71 (1990) 1031].
[30] H. Schoeller, G. Schön, Phys. Rev. B 50 (1994) 18436.
[31] D.V. Averin, Physica B 194–196 (1994) 979.
[32] J. König, H. Schoeller, G. Schön, Phys. Rev. Lett. 78 (1997) 4482.
[33] K.A. Matveev, Zh. Exp. Teor. Fiz. 99 (1991) 1598 [Sov. Phys. JETP 72 (1991) 892].
[34] A.I. Larkin, V.I. Melnikov, Zh. Exp. Teor. Fiz. 61 (1971) 1231 [Sov. Phys. JETP 34 (1972) 656].
[35] P. Nozières, A. Blandin, J. Phys. 41 (1980) 193.
[36] N. Andrei, C. Destri, Phys. Rev. Lett. 52 (1984) 364.
[37] A.M. Tsvelick, P.B. Wiegmann, Z. Phys. B 54 (1984) 201.
[38] K. Flensberg, Phys. Rev. B 48 (1993) 11156; Physica B 203 (1994) 432.
[39] K.A. Matveev, Phys. Rev. B 51 (1995) 1743.
[40] A. Furusaki, K.A. Matveev, Phys. Rev. Lett. 75 (1995) 709; Phys. Rev. B 51 (1995) 16878.
[41] B.L. Altshuler, B.I. Shklovskii, Zh. Exp. Teor. Fiz. 91 (1986) 220 [Sov. Phys. JETP 64 (1986) 127].
[42] M.V. Berry, Proc. R. Soc. London A 400 (1985) 229.
[43] K.B. Efetov, Supersymmetry in Disorder and Chaos, Cambridge University Press, New York, 1997.
[44] A.D. Mirlin, Phys. Rep. 326 (2000) 259.
[45] V.E. Kravtsov, A.D. Mirlin, Pis’ma Zh. Exp. Teor. Fiz. 60 (1994) 645 [Sov. Phys. JETP Lett. 60 (1994) 656].
[46] A.V. Andreev, B.L. Altshuler, Phys. Rev. Lett. 75 (1995) 902.
[47] B.L. Altshuler, B.D. Simons, in: E. Akkermans, G. Montambaux, J.-L. Pichard, J. Zinn-Justin (Eds.), Mesoscopic Quantum Physics, North-Holland, Amsterdam, 1995.
[48] B.A. Muzykantskii, D.E. Khmelnitskii, Pis’ma Zh. Eksp. Teor. Fiz. 62 (1995) 68 [JETP Lett. 62 (1995) 76].
[49] A.V. Andreev, O. Agam, B.D. Simons, B.L. Altshuler, Phys. Rev. Lett. 76 (1996) 3497; Nucl. Phys. B 482 (1996) 536.
[50] E.B. Bogomolny, J.P. Keating, Phys. Rev. Lett. 77 (1996) 1472.
[51] Ya.M. Blanter, A.D. Mirlin, B.A. Muzykantskii, Phys. Rev. Lett. 80 (1998) 4161.
[52] Ya.M. Blanter, A.D. Mirlin, B.A. Muzykantskii, preprint, cond-mat/0011498.
[53] M. Abramowitz, I. Stegun (Eds.), Handbook of Mathematical Functions, Dover, New York, 1972, p. 409.
[54] M.L. Mehta, Random Matrices, Academic Press, New York, 1991.
[55] M.L. Mehta, A. Pandey, Comm. Math. Phys. 87 (1983) 449.
[56] M.L. Mehta, A. Pandey, J. Phys. A 16 (1983) 2655.
[57] K. Frahm, J.-L. Pichard, J. Phys. (France) I 5 (1995) 847.
[58] O. Bohigas, M.-J. Giannoni, A.M. Ozorio de Almeida, C. Schmit, Nonlinearity 8 (1995) 203.
[59] C.E. Porter, R.G. Thomas, Phys. Rev. 104 (1956) 483.
[60] H.-J. Sommers, S. Iida, Phys. Rev. B 49 (1994) 2513.
[61] V.I. Fal’ko, K.B. Efetov, Phys. Rev. B 50 (1994) 11267.
[62] V.I. Fal’ko, K.B. Efetov, Phys. Rev. Lett. 77 (1996) 912.
[63] V.I. Fal’ko, K.B. Efetov, J. Math. Phys. 37 (1996) 4935.
[64] J.B. French, V.K.B. Kota, A. Pandey, S. Tomsovic, Ann. Phys. (NY) 181 (1988) 198.
[65] S.A. van Langen, P.W. Brouwer, C.W.J. Beenakker, Phys. Rev. E 55 (1997) R1.
[66] Ya.M. Blanter, Phys. Rev. B 54 (1996) 12807.
[67] B.L. Altshuler, Y. Gefen, A. Kamenev, L.S. Levitov, Phys. Rev. Lett. 78 (1997) 2803.
[68] Ya.M. Blanter, A.D. Mirlin, Phys. Rev. E 55 (1997) 6514.
[69] O. Agam, N.S. Wingreen, B.L. Altshuler, D.C. Ralph, M. Tinkham, Phys. Rev. Lett. 78 (1997) 1956.
[70] I.L. Aleiner, L.I. Glazman, Phys. Rev. B 57 (1998) 9608.
[71] J. von Delft, D.C. Ralph, Phys. Rep. 345 (2001) 61.
[72] K.A. Matveev, L.I. Glazman, R.I. Shekhter, Mod. Phys. Lett. 8 (1994) 1007.
[73] B.N. Narozhny, I.L. Aleiner, A.I. Larkin, Phys. Rev. B 62 (2000) 14898.
[74] Y.M. Blanter, A.D. Mirlin, B.A. Muzykantskii, Phys. Rev. Lett. 78 (1997) 2449.
[75] R.O. Vallejos, C.H. Lewenkopf, E.R. Mucciolo, Phys. Rev. Lett. 81 (1998) 677.
[76] P.W. Brouwer, Y. Oreg, B.I. Halperin, Phys. Rev. B 60 (1999) 13977.
[77] H.U. Baranger, D. Ullmo, L.I. Glazman, Phys. Rev. B 61 (2000) 2425.
[78] I.L. Kurland, I.L. Aleiner, B.L. Altshuler, Phys. Rev. B 62 (2000) 14886.
[79] J.M. Ziman, Principles of the Theory of Solids, 2nd Edition, Cambridge University Press, Cambridge, 1972, p. 339.
[80] D.S. Fisher, P.A. Lee, Phys. Rev. B 23 (1981) 6851.
[81] Y. Imry, in: G. Grinstein, G. Mazenko (Eds.), Directions in Condensed Matter Physics, World Scientific, Singapore, 1986.
[82] M. Büttiker, Phys. Rev. Lett. 57 (1986) 1761; IBM J. Res. Dev. 32 (1988) 317.
[83] A.D. Stone, A. Szafer, IBM J. Res. Dev. 32 (1988) 384.
[84] H.U. Baranger, A.D. Stone, Phys. Rev. B 40 (1989) 8199.
[85] B.J. van Wees, H. van Houten, C.W.J. Beenakker, J.G. Williamson, L.P. Kouwenhoven, D. van der Marel, C.T. Foxon, Phys. Rev. Lett. 60 (1988) 848.
[86] C.H. Lewenkopf, H.A. Weidenmüller, Ann. Phys. 212 (1991) 53.
[87] See, e.g., G.D. Mahan, Many-Particle Physics, Plenum Press, New York, 1990.
[88] C. Mahaux, H.A. Weidenmüller, Shell-model Approach to Nuclear Reactions, North-Holland, Amsterdam, 1969.
[89] S. Iida, H.A. Weidenmüller, J.A. Zuk, Ann. Phys. (NY) 200 (1990) 219.
[90] T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Phys. Rep. 299 (1998) 189.
[91] H. Nishioka, H.A. Weidenmüller, Phys. Lett. B 157 (1985) 101.
[92] P.W. Brouwer, Phys. Rev. B 51 (1995) 16878.
[93] R. Landauer, IBM J. Res. Dev. 1 (1957) 223.
[94] M. Büttiker, J. Phys.: Condens. Matter 5 (1993) 9361.
[95] J.J.M. Verbaarschot, H.A. Weidenmüller, M.R. Zirnbauer, Phys. Rep. 129 (1985) 367.
[96] L.I. Glazman, K.A. Matveev, Pis’ma Zh. Exp. Teor. Fiz. 48 (1988) 403 [Sov. Phys. JETP Lett. 48 (1988) 445].
[97] C.W.J. Beenakker, Phys. Rev. B 44 (1991) 1646.
[98] D.V. Averin, A.N. Korotkov, K.K. Likharev, Phys. Rev. B 44 (1991) 6199.
[99] R.A. Jalabert, A.D. Stone, Y. Alhassid, Phys. Rev. Lett. 68 (1992) 3468.
[100] Y. Alhassid, Phys. Rev. B 58 (1998) 13383.
[101] Y. Alhassid, H. Attias, Phys. Rev. Lett. 76 (1996) 1711.
[102] Y. Alhassid, H. Attias, Phys. Rev. B 54 (1996) 2696.
[103] I.S. Gradshteyn, I.M. Ryzhik, Tables of Integrals, Series, and Products, 5th Edition, Academic Press, New York, 1994.
[104] V.N. Prigodin, K.B. Efetov, S. Iida, Phys. Rev. Lett. 71 (1993) 1230.
[105] A.M. Chang, H.U. Baranger, L.N. Pfeiffer, K.W. West, T.Y. Chang, Phys. Rev. Lett. 76 (1996) 1695.
[106] B.L. Altshuler, D.E. Khmelnitskii, A.I. Larkin, P.A. Lee, Phys. Rev. B 22 (1980) 5142; S. Hikami, A.I. Larkin, Y. Nagaoka, Prog. Theor. Phys. 63 (1980) 707.
[107] Y. Alhassid, J.N. Hormuzdiar, N.D. Whelan, Phys. Rev. B 58 (1998) 4876.
[108] I.L. Aleiner, L.I. Glazman, Phys. Rev. Lett. 77 (1996) 2057.
[109] P.G. de Gennes, Rev. Mod. Phys. 36 (1964) 225.
[110] E.A. Shapoval, Sov. Phys. JETP 22 (1966) 647 [Zh. Eksp. Teor. Fiz. 49 (1965) 930].
[111] S.M. Cronenwett, S.R. Patel, C.M. Marcus, K. Campman, A.C. Gossard, Phys. Rev. Lett. 79 (1997) 2312.
[112] R. Baltin, Y. Gefen, Phys. Rev. B 61 (2000) 10247.
[113] A. Kaminski, I.L. Aleiner, L.I. Glazman, Phys. Rev. Lett. 81 (1998) 685.
[114] J. Kondo, Prog. Theor. Phys. 32 (1964) 37.
[115] J. Appelbaum, Phys. Rev. Lett. 17 (1966) 91.
[116] P.W. Anderson, Phys. Rev. Lett. 17 (1966) 95.
[117] J.M. Rowell, in: E. Burstein, S. Lundquist (Eds.), Tunneling Phenomena in Solids, Plenum Press, New York, 1969, p. 385.
[118] P. Nozières, J. Low Temp. Phys. 17 (1974) 31.
[119] T.K. Ng, P.A. Lee, Phys. Rev. Lett. 61 (1988) 1768.
[120] L.I. Glazman, M.E. Raikh, JETP Lett. 47 (1988) 452.
[121] P.W. Anderson, Phys. Rev. 124 (1961) 41.
[122] F.D.M. Haldane, Phys. Rev. Lett. 40 (1979) 416.
[123] P. Nozières, A. Blandin, J. Phys. 41 (1980) 193.
[124] M. Pustilnik, L.I. Glazman, preprint cond-mat/0105155.
[125] T.A. Costi, Phys. Rev. Lett. 85 (2000) 1504.
[126] D. Goldhaber-Gordon, H. Shtrikman, D. Mahalu, D. Abusch-Magder, U. Meirav, M.A. Kastner, Nature (London) 391 (1988) 156; D. Goldhaber-Gordon, J. Göres, M.A. Kastner, H. Shtrikman, D. Mahalu, U. Meirav, Phys. Rev. Lett. 81 (1998) 5225.
[127] S.M. Cronenwett, T.H. Oosterkamp, L.P. Kouwenhoven, Science 281 (1998) 540.
[128] J. Schmid, J. Weis, K. Eberl, K. von Klitzing, Physica B 256–258 (1998) 182.
[129] L.I. Glazman, R.I. Shekhter, J. Phys.: Condens. Matter 1 (1989) 5811.
[130] C.W.J. Beenakker, H. Schomerus, P.G. Silvestrov, preprint cond-mat/0010387.
[131] J.A. Folk, S.R. Patel, C.M. Marcus, C.I. Duruöz, J.S. Harris Jr., preprint cond-mat/0008052.
[132] A. Kaminski, L.I. Glazman, Phys. Rev. B 61 (2000) 15927.
[133] S.V. Panyukov, A.D. Zaikin, Phys. Rev. Lett. 67 (1991) 3168.
[134] G. Falci, G. Schön, G. Zimanyi, Phys. Rev. Lett. 74 (1995) 3257.
[135] L.I. Glazman, F.W.J. Hekking, A.I. Larkin, Phys. Rev. Lett. 83 (1999) 1830.
[136] P.W. Anderson, Phys. Rev. Lett. 18 (1967) 1049.
[137] D. Berman, N.B. Zhitenev, R.C. Ashoori, M. Shayegan, Phys. Rev. Lett. 82 (1999) 161.
[138] Yu.V. Nazarov, Phys. Rev. Lett. 82 (1999) 1245.
[139] F.D.M. Haldane, J. Phys. C 14 (1981) 2585.
[140] C.L. Kane, M.P.A. Fisher, Phys. Rev. B 46 (1992) 15233.
[141] A.I. Larkin, P.A. Lee, Phys. Rev. B 17 (1978) 1596.
[142] A.O. Gogolin, A.A. Nersesyan, A.M. Tsvelik, Bosonization and Strongly Correlated Systems, Cambridge University Press, New York, 1998.
[143] I.L. Aleiner, K.A. Matveev, Phys. Rev. Lett. 80 (1998) 814.
[144] P.W. Brouwer, M. Büttiker, Europhys. Lett. 37 (1997) 441.
[145] S.E. Korshunov, Pis’ma Zh. Eksp. Teor. Fiz. 45 (1987) 342 [JETP Lett. 45 (1987) 434].
[146] A. Kamenev, Phys. Rev. Lett. 85 (2000) 4160.
[147] M. Büttiker, H. Thomas, A. Prêtre, Phys. Lett. A 180 (1993) 364.
[148] M. Büttiker, A. Prêtre, H. Thomas, Phys. Rev. Lett. 70 (1993) 4114.
[149] V.A. Gopar, P.A. Mello, M. Büttiker, Phys. Rev. Lett. 77 (1996) 3005.
[150] Y.V. Fyodorov, H.-J. Sommers, J. Math. Phys. 38 (1997) 1918.
[151] P.W. Brouwer, I.L. Aleiner, Phys. Rev. Lett. 82 (1999) 390.
[152] K.B. Efetov, Phys. Rev. Lett. 74 (1995) 2299.
[153] P. Nozières, C.T. de Dominicis, Phys. Rev. 178 (1969) 1097.
[154] S.M. Cronenwett, S.M. Maurer, S.R. Patel, C.M. Marcus, C.I. Duruöz, J.S. Harris Jr., Phys. Rev. Lett. 81 (1998) 5904.
[155] U. Sivan, Y. Imry, A.G. Aronov, Europhys. Lett. 28 (1994) 115.
[156] H.U. Baranger, P.A. Mello, Phys. Rev. B 51 (1995) 4703.
[157] I.L. Aleiner, A.I. Larkin, Phys. Rev. B 54 (1996) 14423.
[158] P.W. Brouwer, C.W.J. Beenakker, Phys. Rev. B 55 (1997) 4695.
[159] A.G. Huibers, S.R. Patel, C.M. Marcus, P.W. Brouwer, C.I. Duruöz, J.S. Harris Jr., Phys. Rev. Lett. 81 (1998) 1917.
[160] S. Tarucha, D.G. Austing, Y. Tokura, W.G. van der Wiel, L.P. Kouwenhoven, Phys. Rev. Lett. 84 (2000) 2485.
[161] S. Sasaki, S. De Franceschi, J.M. Elzerman, W.G. van der Wiel, M. Eto, S. Tarucha, L.P. Kouwenhoven, Nature 405 (2000) 764.
[162] M. Eto, Yu.V. Nazarov, Phys. Rev. Lett. 85 (2000) 1306.
[163] M. Pustilnik, L.I. Glazman, Phys. Rev. Lett. 85 (2000) 2993.
[164] S. Lüscher, T. Heinzel, K. Ensslin, W. Wegscheider, M. Bichler, preprint cond-mat/0002226.
[165] M.G. Vavilov, I.L. Aleiner, Phys. Rev. B 60 (1999) 16311.
[166] M. Switkes, C.M. Marcus, K. Campman, A.C. Gossard, Science 283 (1999) 1905.
[167] F. Zhou, B. Spivak, B. Altshuler, Phys. Rev. Lett. 82 (1999) 608.
[168] P.W. Brouwer, Phys. Rev. B 58 (1998) 10135.
[169] T.A. Shutenko, I.L. Aleiner, B.L. Altshuler, Phys. Rev. B 61 (2000) 10366.
[170] I.L. Aleiner, A.I. Larkin, Phys. Rev. E 55 (1997) 1243.
CONTENTS VOLUME 358

S.Y. Wu, C.S. Jayanthi. Order-N methodologies and their applications, p. 1
M. Baer. Introduction to the theory of electronic non-adiabatic coupling terms in molecular systems, p. 75
S.-T. Hong, Y.-J. Park. Static properties of chiral models with SU(3) group structure, p. 143
W.M. Alberico, S.M. Bilenky, C. Maieron. Strangeness in the nucleon: neutrino–nucleon and polarized electron–nucleon scattering, p. 227
I.L. Aleiner, P.W. Brouwer, L.I. Glazman. Quantum effects in Coulomb blockade, p. 309
Contents of volume, p. 441
Forthcoming issues, p. 442