Physics of Carbon Nanotubes
Tsuneya Ando
Department of Physics, Tokyo Institute of Technology, 2–12–1 Ookayama, Meguro-ku, Tokyo 152-8551, Japan
Abstract. A brief review is given of the electronic and transport properties of carbon nanotubes, mainly from a theoretical point of view. The topics include a description of electronic states in a tight-binding model and in an effective-mass or k·p scheme. Transport properties are discussed, including the absence of backward scattering except for scatterers with a potential range smaller than the lattice constant, and its extension to multi-band cases.
1
Introduction
Graphite needles called carbon nanotubes (CNs) have been the subject of extensive study since their discovery in 1991 [1,2]. A multi-wall CN consists of a few concentric tubes of two-dimensional (2D) graphite, each made of carbon-atom hexagons arranged in a helical fashion about the axis. The diameter of CNs is usually between 20 and 300 Å and their length can exceed 1 µm. The distance between adjacent sheets or walls is larger than the distance between nearest-neighbor atoms in a graphite sheet, and therefore the electronic properties of CNs are dominated by those of a single-layer CN. Single-wall nanotubes were produced in the form of ropes a few years later [3,4].

The purpose of this paper is to give a brief review of recent theoretical studies on the electronic and transport properties of carbon nanotubes. Carbon nanotubes can be either metallic or semiconducting, depending on their diameter and helical arrangement. The condition for a CN to be metallic or semiconducting can be obtained from the band structure of a 2D graphite sheet and periodic boundary conditions along the circumference direction. This result was first predicted by means of a tight-binding model [5,6,7,8,9,10,11,12,13,14]. These properties can be well reproduced in a k·p method or an effective-mass approximation [15]. The theoretical predictions have been confirmed by Raman experiments [16] and by direct measurements of the local density of states by scanning tunneling spectroscopy [17,18,19].

The k·p scheme has been used successfully in the study of a wide variety of electronic properties. Examples are magnetic properties including the Aharonov-Bohm (AB) effect on the band gap [20], optical absorption spectra [21], exciton effects [22], lattice instabilities in the absence [23,24] and presence of a magnetic field [25], magnetic properties of ensembles of nanotubes [26], effects of spin-orbit interaction [27], junctions [28], vacancies [29], topological defects [30], and properties of nanotube caps [31].

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 3–18, 2003. © Springer-Verlag Berlin Heidelberg 2003
Transport properties of CNs are interesting because of their unique topological structure. There have been reports on experiments in CN bundles [32] and ropes [33,34]. Transport measurements became possible for a single multi-wall nanotube [35,36,37,38,39] and a single single-wall nanotube [40,41,42,43,44]. Single-wall nanotubes usually exhibit large charging effects, presumably due to nonideal contacts [45,46,47,48,49]. Almost ideal contacts have also been realized [50].

In this paper we shall mainly discuss electronic states and transport properties of nanotubes obtained theoretically in the k·p method combined with a tight-binding model. It is worth mentioning that several general reviews of the electronic properties of nanotubes have already been published [51,52,53,54,55,56].

In section 2 electronic states are discussed in a nearest-neighbor tight-binding model. In section 3 the effective-mass equation is introduced. In section 4 the absence of backscattering in the presence of scatterers is discussed. In section 5 the discussion is extended to the presence of a perfectly conducting channel when several bands coexist at the Fermi level. Brief discussions are given of the effects of lattice vacancies in section 6 and of Tomonaga-Luttinger liquid behavior in section 7. A short summary is given in section 8.
2
Electronic States
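The structure of a 2D graphite sheet is shown in Fig. 1. We have the primitive translation vectors $\mathbf{a} = a(1, 0)$ and $\mathbf{b} = a(-1/2, \sqrt{3}/2)$, and the vectors connecting nearest-neighbor carbon atoms $\boldsymbol{\tau}_1 = a(0, 1/\sqrt{3})$, $\boldsymbol{\tau}_2 = a(-1/2, -1/2\sqrt{3})$, and $\boldsymbol{\tau}_3 = a(1/2, -1/2\sqrt{3})$. Note that $\mathbf{a}\cdot\mathbf{b} = -a^2/2$. The primitive reciprocal lattice vectors $\mathbf{a}^*$ and $\mathbf{b}^*$ are given by $\mathbf{a}^* = (2\pi/a)(1, 1/\sqrt{3})$ and $\mathbf{b}^* = (2\pi/a)(0, 2/\sqrt{3})$. The K and K' points are given as $\mathbf{K} = (2\pi/a)(1/3, 1/\sqrt{3})$ and $\mathbf{K}' = (2\pi/a)(2/3, 0)$, respectively. We have $\exp(i\mathbf{K}\cdot\boldsymbol{\tau}_1) = \omega$, $\exp(i\mathbf{K}\cdot\boldsymbol{\tau}_2) = \omega^{-1}$, $\exp(i\mathbf{K}\cdot\boldsymbol{\tau}_3) = 1$, $\exp(i\mathbf{K}'\cdot\boldsymbol{\tau}_1) = 1$, $\exp(i\mathbf{K}'\cdot\boldsymbol{\tau}_2) = \omega^{-1}$, and $\exp(i\mathbf{K}'\cdot\boldsymbol{\tau}_3) = \omega$, with $\omega = \exp(2\pi i/3)$.

In a tight-binding model, the wave function is written as
$$\psi(\mathbf{r}) = \sum_{\mathbf{R}_A} \psi_A(\mathbf{R}_A)\,\phi(\mathbf{r}-\mathbf{R}_A) + \sum_{\mathbf{R}_B} \psi_B(\mathbf{R}_B)\,\phi(\mathbf{r}-\mathbf{R}_B), \tag{1}$$
where $\phi(\mathbf{r})$ is the wave function of the $p_z$ orbital of a carbon atom located at the origin, $\mathbf{R}_A = n_a\mathbf{a} + n_b\mathbf{b} + \boldsymbol{\tau}_1$, and $\mathbf{R}_B = n_a\mathbf{a} + n_b\mathbf{b}$ with integer $n_a$ and $n_b$. Let $-\gamma_0$ be the transfer integral between nearest-neighbor carbon atoms and choose the energy origin at that of the carbon $p_z$ level. Then, we have
$$\varepsilon\,\psi_A(\mathbf{R}_A) = -\gamma_0 \sum_{l} \psi_B(\mathbf{R}_A - \boldsymbol{\tau}_l), \qquad \varepsilon\,\psi_B(\mathbf{R}_B) = -\gamma_0 \sum_{l} \psi_A(\mathbf{R}_B + \boldsymbol{\tau}_l). \tag{2}$$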
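The phase factors quoted above can be checked numerically. The following is a minimal sketch, assuming a lattice constant of 1 in arbitrary units; the variable names are illustrative and not taken from the original paper.

```python
import numpy as np

a = 1.0  # lattice constant in arbitrary units (illustrative choice)

# Nearest-neighbor vectors tau_1, tau_2, tau_3 as defined in the text.
tau = a * np.array([
    [0.0, 1.0 / np.sqrt(3.0)],
    [-0.5, -0.5 / np.sqrt(3.0)],
    [0.5, -0.5 / np.sqrt(3.0)],
])

# K and K' points as defined in the text.
K = (2.0 * np.pi / a) * np.array([1.0 / 3.0, 1.0 / np.sqrt(3.0)])
K_prime = (2.0 * np.pi / a) * np.array([2.0 / 3.0, 0.0])

omega = np.exp(2j * np.pi / 3.0)

# exp(i K . tau_l) and exp(i K' . tau_l) for l = 1, 2, 3.
phases_K = np.exp(1j * (tau @ K))
phases_K_prime = np.exp(1j * (tau @ K_prime))

# Compare with the values stated in the text.
print(np.allclose(phases_K, [omega, 1.0 / omega, 1.0]))
print(np.allclose(phases_K_prime, [1.0, 1.0 / omega, omega]))
```

In both cases the three phase factors are the three cube roots of unity, so their sum vanishes.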
Fig. 1. The lattice structure of a 2D graphite sheet and various quantities (primitive vectors a and b, nearest-neighbor vectors τ1, τ2, and τ3, sublattice sites A and B, chiral vector L, translation vector T, and chiral angle η), the corresponding Brillouin zone with the K and K' points, and the coordinate system (x, y, z) on the cylinder surface. We shall consider the case that 0 ≤ η ≤ π/6. The zigzag nanotube corresponds to η = 0 and the armchair nanotube to η = π/6
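Assuming $\psi_A(\mathbf{R}_A) \propto f_A(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{R}_A)$ and $\psi_B(\mathbf{R}_B) \propto f_B(\mathbf{k}) \exp(i\mathbf{k}\cdot\mathbf{R}_B)$, we have
$$\begin{pmatrix} 0 & -\gamma_0 \sum_l \exp(-i\mathbf{k}\cdot\boldsymbol{\tau}_l) \\ -\gamma_0 \sum_l \exp(+i\mathbf{k}\cdot\boldsymbol{\tau}_l) & 0 \end{pmatrix} \begin{pmatrix} f_A(\mathbf{k}) \\ f_B(\mathbf{k}) \end{pmatrix} = \varepsilon \begin{pmatrix} f_A(\mathbf{k}) \\ f_B(\mathbf{k}) \end{pmatrix}. \tag{3}$$
The energy bands are given by
$$\varepsilon_\pm(\mathbf{k}) = \pm\gamma_0 \left| \sum_l \exp(-i\mathbf{k}\cdot\boldsymbol{\tau}_l) \right|. \tag{4}$$
It is clear that $\varepsilon_\pm(\mathbf{K}) = \varepsilon_\pm(\mathbf{K}') = 0$. Near the K and K' points, we have $\varepsilon_\pm(\mathbf{k}+\mathbf{K}) = \varepsilon_\pm(\mathbf{k}+\mathbf{K}') = \pm\gamma\sqrt{k_x^2 + k_y^2}$ with $\gamma = \sqrt{3}a\gamma_0/2$. The band structure is shown in Fig. 2.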
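As a numerical check of Eq. (4), the sketch below evaluates the band energy at the K and K' points and verifies the linear dispersion with slope γ = √3aγ0/2 for a small displacement. The units (a = γ0 = 1) and the test displacement are assumptions made for illustration.

```python
import numpy as np

A_LATTICE = 1.0  # lattice constant a (illustrative units)
GAMMA0 = 1.0     # transfer integral gamma_0 (illustrative units)

# Nearest-neighbor vectors tau_1, tau_2, tau_3 from Sect. 2.
TAU = A_LATTICE * np.array([
    [0.0, 1.0 / np.sqrt(3.0)],
    [-0.5, -0.5 / np.sqrt(3.0)],
    [0.5, -0.5 / np.sqrt(3.0)],
])

K = (2.0 * np.pi / A_LATTICE) * np.array([1.0 / 3.0, 1.0 / np.sqrt(3.0)])
K_PRIME = (2.0 * np.pi / A_LATTICE) * np.array([2.0 / 3.0, 0.0])


def band_energy(k):
    """Conduction-band energy eps_+(k) = gamma_0 |sum_l exp(-i k . tau_l)|."""
    k = np.asarray(k, dtype=float)
    return GAMMA0 * abs(np.exp(-1j * (TAU @ k)).sum())


# Band parameter of the linear dispersion near K and K'.
gamma = np.sqrt(3.0) * A_LATTICE * GAMMA0 / 2.0

# Small displacement in an arbitrary direction (hypothetical test value).
q = 1e-3 * np.array([np.cos(0.7), np.sin(0.7)])

ratio_K = band_energy(K + q) / (gamma * np.linalg.norm(q))
ratio_K_prime = band_energy(K_PRIME + q) / (gamma * np.linalg.norm(q))

print(band_energy(K), band_energy(K_PRIME))  # both vanish up to rounding
print(ratio_K, ratio_K_prime)                # both close to 1
```

The ratios approach 1 as the displacement shrinks; the small deviation comes from terms of higher order in |k|a that the linear approximation neglects.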
Fig. 2. Calculated band structure of two-dimensional graphite along K → Γ → M → K shown in the inset. Vertical axis: energy (units of γ0); horizontal axis: wave vector; E_F marks the Fermi level
Every structure of a single-wall CN can be derived from a monatomic layer of graphite as shown in Fig. 1. Each hexagon is denoted by the chiral vector
$$\mathbf{L} = n_a\mathbf{a} + n_b\mathbf{b} = a\left(n_a - \tfrac{1}{2}n_b,\ \tfrac{\sqrt{3}}{2}n_b\right). \tag{5}$$
In another convention for the choice of primitive translation vectors, $\mathbf{L}$ is characterized by two integers $(p, q)$ with $p = n_a - n_b$ and $q = n_b$, and the corresponding CN is often called a $(p, q)$ nanotube. We shall construct a nanotube in such a way that the hexagon at $\mathbf{L}$ is rolled onto the origin. For convenience, we introduce another set of unit basis vectors $(\mathbf{e}_x, \mathbf{e}_y)$ as shown in Fig. 1. The direction of $\mathbf{e}_x$ or $x$ is along the circumference of the CN, i.e., $\mathbf{e}_x = \mathbf{L}/L$ with $L = |\mathbf{L}| = a\sqrt{n_a^2 + n_b^2 - n_a n_b}$, and $\mathbf{e}_y$ or $y$ is along the axis of the CN. A primitive translation vector in the $\mathbf{e}_y$ direction is written as
$$\mathbf{T} = m_a\mathbf{a} + m_b\mathbf{b}, \tag{6}$$
with integer $m_a$ and $m_b$. Now, $\mathbf{T}$ is determined by the condition $\mathbf{T}\cdot\mathbf{L} = 0$, or $m_a(2n_a - n_b) - m_b(n_a - 2n_b) = 0$. This can be solved as
$$p\,m_a = n_a - 2n_b, \qquad p\,m_b = 2n_a - n_b, \tag{7}$$
where $p$ is now the greatest common divisor of $n_a - 2n_b$ and $2n_a - n_b$. The first Brillouin zone of the nanotube is given by the region $-\pi/T \le k_y < \pi/T$ with $T = a\sqrt{m_a^2 + m_b^2 - m_a m_b}$. The unit cell is formed by the rectangular region determined by $\mathbf{L}$ and $\mathbf{T}$.

For nanotubes with sufficiently large diameter, effects of mixing between π bands and σ bands and the change in the coupling between π orbitals can safely be neglected. Then, the energy bands of a nanotube are obtained simply by imposing periodic boundary conditions along the circumference direction, i.e., $\psi(\mathbf{r}+\mathbf{L}) = \psi(\mathbf{r})$. This leads to the condition $\exp(i\mathbf{k}\cdot\mathbf{L}) = 1$, which makes the wave vector along the circumference direction discrete, i.e., $k_x = 2\pi j/L$ with integer $j$, while the wave vector perpendicular to $\mathbf{L}$ remains arbitrary except that $-\pi/T \le k < \pi/T$. The number of one-dimensional bands is given by the total number of carbon atoms in the unit cell determined by $\mathbf{L}$ and $\mathbf{T}$.

The band structure of a nanotube depends critically on whether the K and K' points in the Brillouin zone of the 2D graphite are included in the allowed wave vectors when the 2D graphite is rolled into a nanotube. This can be understood by considering $\exp(i\mathbf{K}\cdot\mathbf{L})$ and $\exp(i\mathbf{K}'\cdot\mathbf{L})$. We have
$$\exp(i\mathbf{K}\cdot\mathbf{L}) = \exp\left(+i\frac{2\pi\nu}{3}\right), \qquad \exp(i\mathbf{K}'\cdot\mathbf{L}) = \exp\left(-i\frac{2\pi\nu}{3}\right), \tag{8}$$
where $\nu$ is an integer (0 or ±1) determined by
$$n_a + n_b = 3N + \nu, \tag{9}$$
with integer $N$. This shows that for $\nu = 0$ the nanotube becomes metallic because two bands cross at the wave vector corresponding to the K and K' points without a gap. When $\nu = \pm 1$, on the other hand, there is a nonzero gap between the valence and conduction bands and the nanotube is semiconducting.

For the translation $\mathbf{r} \to \mathbf{r} + \mathbf{T}$ the Bloch function at the K and K' points acquires the phase
$$\exp(i\mathbf{K}\cdot\mathbf{T}) = \exp\left(+i\frac{2\pi\mu}{3}\right), \qquad \exp(i\mathbf{K}'\cdot\mathbf{T}) = \exp\left(-i\frac{2\pi\mu}{3}\right), \tag{10}$$
where $\mu = 0$ or ±1 is determined by
$$m_a + m_b = 3M + \mu, \tag{11}$$
with integer $M$. When $\nu = 0$, therefore, the K and K' points are mapped onto $k_0 = +2\pi\mu/3T$ and $k_0 = -2\pi\mu/3T$, respectively, in the one-dimensional Brillouin zone of the nanotube. At these points two one-dimensional bands cross each other without a gap.

A nanotube has a helical structure for general $\mathbf{L}$. There are two kinds of nonhelical nanotubes, zigzag with $(n_a, n_b) = (m, 0)$ and armchair with $(n_a, n_b) = (2m, m)$. A zigzag nanotube is metallic when $m$ is divisible by three and semiconducting otherwise. We have $p\,m_a = m$ and $p\,m_b = 2m$, which give $m_a = 1$, $m_b = 2$, $\mu = 0$, and $T = \sqrt{3}a$. When a zigzag nanotube is metallic, the conduction and valence bands having a linear dispersion cross at the Γ point of the one-dimensional Brillouin zone. On the other hand, an armchair nanotube is always metallic. We have $p\,m_a = 0$ and $p\,m_b = 3m$, which give $m_a = 0$, $m_b = 1$, $\mu = 1$, and $T = a$. Thus, the conduction and valence bands always cross each other at $k_0 = \pm 2\pi/3a$.

It is straightforward to calculate the band structure of nanotubes when a tight-binding model is used. Figures 3 and 4 show some examples for zigzag and armchair nanotubes.
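The classification rules of Eqs. (5)–(11) are simple integer arithmetic, and the sketch below collects them in one place. The function name `classify_nanotube` and the returned dictionary keys are hypothetical; the lattice constant defaults to 1 in arbitrary units.

```python
from math import gcd, sqrt


def _mod3_symmetric(n):
    """Map an integer to its residue modulo 3 chosen from {-1, 0, +1}."""
    residue = n % 3
    return -1 if residue == 2 else residue


def classify_nanotube(na, nb, a=1.0):
    """Return nu, L, (ma, mb), T and mu for the chiral vector L = na*a + nb*b."""
    if na == 0 and nb == 0:
        raise ValueError("the chiral vector must be nonzero")
    nu = _mod3_symmetric(na + nb)                      # Eq. (9)
    circumference = a * sqrt(na * na + nb * nb - na * nb)
    p = gcd(abs(na - 2 * nb), abs(2 * na - nb))        # divisor in Eq. (7)
    ma = (na - 2 * nb) // p
    mb = (2 * na - nb) // p
    period = a * sqrt(ma * ma + mb * mb - ma * mb)     # length of T
    mu = _mod3_symmetric(ma + mb)                      # Eq. (11)
    return {
        "nu": nu,
        "metallic": nu == 0,
        "L": circumference,
        "ma": ma,
        "mb": mb,
        "T": period,
        "mu": mu,
    }


zigzag = classify_nanotube(9, 0)      # zigzag tube with m = 9
armchair = classify_nanotube(10, 5)   # armchair tube with m = 5
print(zigzag)
print(armchair)
```

For these two inputs the function reproduces the values quoted in the text: (m_a, m_b) = (1, 2), µ = 0, T = √3a for the zigzag tube and (m_a, m_b) = (0, 1), µ = 1, T = a for the armchair tube.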
Fig. 3. Some examples of the band structure obtained in a tight-binding model for zigzag nanotubes (8,0), (9,0), and (10,0). Vertical axis: energy (eV); horizontal axis: wave vector (units of π/√3a); E_F marks the Fermi level
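The qualitative content of Fig. 3 can be reproduced by zone folding: the 2D band energy of Eq. (4) is sampled on the allowed lines k_x = 2πj/L. The sketch below is a minimal illustration for zigzag tubes (m, 0), assuming a = γ0 = 1, a uniform grid in k_y, and 2m values of j (from the 4m carbon atoms in the zigzag unit cell); it is not the calculation used for the figure.

```python
import numpy as np

A_LATTICE = 1.0  # lattice constant a (illustrative units)
GAMMA0 = 1.0     # transfer integral gamma_0 (illustrative units)

# Nearest-neighbor vectors tau_1, tau_2, tau_3 from Sect. 2.
TAU = A_LATTICE * np.array([
    [0.0, 1.0 / np.sqrt(3.0)],
    [-0.5, -0.5 / np.sqrt(3.0)],
    [0.5, -0.5 / np.sqrt(3.0)],
])


def zigzag_gap(m, nk=201):
    """Band gap of an (m, 0) zigzag tube from zone folding of Eq. (4)."""
    circumference = m * A_LATTICE          # L = m a for (na, nb) = (m, 0)
    period = np.sqrt(3.0) * A_LATTICE      # T = sqrt(3) a
    # Discrete wave vectors along the circumference; 2m values of j
    # with two bands each give the 4m bands of the unit cell.
    kx = 2.0 * np.pi * np.arange(2 * m) / circumference
    # Odd nk keeps k_y = 0 (the Gamma point) on the grid.
    ky = np.linspace(-np.pi / period, np.pi / period, nk)
    kx_grid, ky_grid = np.meshgrid(kx, ky, indexing="ij")
    k = np.stack([kx_grid, ky_grid], axis=-1)          # shape (2m, nk, 2)
    f = np.exp(-1j * (k @ TAU.T)).sum(axis=-1)         # sum over tau_l
    return 2.0 * GAMMA0 * np.abs(f).min()


for m in (8, 9, 10):
    print(m, zigzag_gap(m))
```

The gap vanishes for m = 9 and is finite for m = 8 and m = 10, in line with the rule that a zigzag nanotube is metallic only when m is divisible by three.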
Fig. 4. Some examples of the band structure obtained in a tight-binding model for armchair nanotubes. Vertical axis: energy (units of γ0); horizontal axis: wave vector (units of 2π/a); the K and K' points are marked
3
Neutrino on Cylinder Surface
Essential features of the electronic states become transparent when we use a k·p scheme to describe states in the vicinity of the K and K' points of 2D graphite. The effective-mass equation for the K point is given by
$$\gamma(\boldsymbol{\sigma}\cdot\hat{\mathbf{k}})\mathbf{F}(\mathbf{r}) = \varepsilon\mathbf{F}(\mathbf{r}), \qquad \mathbf{F}(\mathbf{r}) = \begin{pmatrix} F_A(\mathbf{r}) \\ F_B(\mathbf{r}) \end{pmatrix}, \tag{12}$$
where the origin of energy $\varepsilon$ is chosen at the K or K' point, $\boldsymbol{\sigma} = (\sigma_x, \sigma_y)$ is the Pauli spin matrix, $\gamma$ is a band parameter, $\hat{\mathbf{k}} = -i\nabla$, and $F_A$ and $F_B$ represent the amplitudes at the two carbon sites A and B, respectively [15]. The above equation is the same as Weyl's equation for a neutrino with vanishing rest mass and constant velocity independent of the wave vector. The energy becomes $\varepsilon_s(\mathbf{k}) = \pm\gamma|\mathbf{k}|$ and the velocity is given by $|\mathbf{v}| = \gamma/\hbar$, independent of $\mathbf{k}$ and $\varepsilon$. The density of states becomes $D(\varepsilon) = |\varepsilon|/2\pi\gamma^2$. Figure 5 shows the energy dispersion and the density of states schematically.

An important feature is the presence of a topological singularity at $\mathbf{k} = 0$. A neutrino has a helicity and its spin is quantized along the direction of its motion. The spin eigenfunction changes its sign under a 2π rotation because of Berry's phase. Therefore the wave function acquires a phase $-\pi$ when the wave vector $\mathbf{k}$ is rotated around the origin along a closed contour [57,58]. The sign change occurs only when the closed contour encircles the origin $\mathbf{k} = 0$, not when the contour does not contain $\mathbf{k} = 0$. This topological singularity at $\mathbf{k} = 0$ causes a zero-mode anomaly in the conductivity [59,60]. It is also the origin of the absence of backscattering and the perfect conductance in metallic carbon nanotubes, as discussed below.

Fig. 5. The energy dispersion and density of states in the vicinity of K and K' points obtained in a k·p scheme

The wave function in the k·p scheme is written as a product of the neutrino wave function and the Bloch function at the K point. The Bloch function $\psi_K$ acquires a phase under translation, $\psi_K(\mathbf{r}+\mathbf{L}) = \psi_K(\mathbf{r}) \exp(i\mathbf{K}\cdot\mathbf{L})$. Therefore, the boundary condition for the neutrino wave function is [15]
$$\mathbf{F}(\mathbf{r}+\mathbf{L}) = \mathbf{F}(\mathbf{r}) \exp\left(-\frac{2\pi i\nu}{3}\right). \tag{13}$$
This corresponds to the presence of a fictitious AB magnetic flux along the axis. In the k·p scheme, therefore, electrons in a nanotube can be regarded as neutrinos on a cylinder surface with an AB flux.
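The sign change associated with Berry's phase can be illustrated numerically. The sketch below multiplies the overlaps of neighboring eigenvectors of σ·k along a closed contour in k space; the product is negative when the contour encircles k = 0 and positive otherwise. The spinor phase convention, the circular contours, and the number of steps are assumptions chosen for illustration.

```python
import numpy as np


def spinor(kx, ky):
    """Positive-energy eigenvector of sigma . k (one possible phase choice)."""
    phase = (kx + 1j * ky) / np.hypot(kx, ky)
    return np.array([1.0, phase], dtype=complex) / np.sqrt(2.0)


def loop_overlap_product(center, radius, steps=400):
    """Product of overlaps <F_i|F_{i+1}> around a circular contour in k space."""
    angles = np.linspace(0.0, 2.0 * np.pi, steps, endpoint=False)
    states = [
        spinor(center[0] + radius * np.cos(t), center[1] + radius * np.sin(t))
        for t in angles
    ]
    states.append(states[0])  # close the contour
    product = 1.0 + 0.0j
    for left, right in zip(states[:-1], states[1:]):
        product *= np.vdot(left, right)  # vdot conjugates its first argument
    return product


encircling = loop_overlap_product(center=(0.0, 0.0), radius=1.0)
not_encircling = loop_overlap_product(center=(2.0, 0.0), radius=1.0)

# A negative real product corresponds to an accumulated phase of pi
# (equivalent to -pi); a positive one corresponds to no phase.
print(np.sign(encircling.real), np.sign(not_encircling.real))
```

The product of overlaps does not depend on the phase convention of the individual spinors as long as the contour is closed on the same state, which is why this discretized form is a convenient check.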
4
Absence of Backscattering
The neutrino wave function is written as $\mathbf{F}(\mathbf{r}) \propto \exp[i\kappa_\nu(n)x + iky]$ with $\kappa_\nu(n) = (2\pi/L)(n - \nu/3)$, where the $x$ axis and $y$ axis are chosen in the circumference and axis directions, respectively, $n$ is an integer, and $k$ is the wave vector in the axis direction. The corresponding energy levels are
$$\varepsilon^{(\pm)}_\nu(n, k) = \pm\gamma\sqrt{\kappa_\nu(n)^2 + k^2}, \tag{14}$$
where + and − stand for conduction and valence bands, respectively. Figure 6 shows a schematic illustration of the bands for $\nu = 0$ and $+1$. When $\nu = 0$, there are bands with a linear dispersion without a gap and the CN becomes a metal, whereas when $\nu = \pm 1$ there is a nonzero gap and the CN becomes a semiconductor. There is a one-to-one correspondence between the AB flux and the band gap. The effective-mass equation for the K' point is obtained by replacing $\sigma_y$ by $-\sigma_y$ in Eq. (12), and the boundary conditions are obtained by replacing $\nu$ by $-\nu$.

Fig. 6. Energy bands of a nanotube obtained in the effective-mass approximation for ν = 0 (left) and ν = +1 (right)

The nontrivial Berry's phase leads to the unique property of a metallic nanotube that there exists no backscattering and the tube is a perfect conductor even in the presence of scatterers [58,61]. It has been proved that the Born series for backscattering vanishes identically [61]. Furthermore, the conductance has been calculated exactly for finite-length nanotubes containing many impurities using Landauer's formula [62]. Figure 7 shows an example of such a conductance as a function of a magnetic field applied perpendicular to the axis. The conductance is given by $2e^2/\pi\hbar$ independent of the length in the absence of a magnetic field. It decreases with length in a magnetic field inconsistent with Ohm's law. The absence of backward scattering has been confirmed by numerical calculations in a tight-binding model [63].

Fig. 7. Calculated conductance (units of 2e²/πħ) of finite-length nanotubes at ε = 0 as a function of the effective strength of a magnetic field (L/2πl)² in the case that the effective mean free path Λ is much larger than the circumference L. Curves: lengths 10, 20, 50, and 100 (units of L); parameters ν = 0, φ/φ0 = 0, Λ/L = 10.0, u/2Lγ = 0.10. The conductance is always given by the value in the absence of impurities at H = 0 [61]

As shown in Fig. 8, backscattering corresponds to a rotation of the $\mathbf{k}$ direction by ±π (in general $(2n + 1)\pi$ with integer $n$). In the absence of a magnetic field, there exists a time-reversal process corresponding to each backscattering process. The time-reversal process corresponds to a rotation of the $\mathbf{k}$ direction by ±π in the opposite direction. The scattering amplitudes of these two processes are equal in absolute value but opposite in sign because of Berry's phase. As a result, the backscattering amplitude cancels out completely. In semiconducting nanotubes, on the other hand, backscattering appears because the symmetry is destroyed by a nonzero AB magnetic flux.

Fig. 8. Schematic illustration of a backscattering process (solid arrows) and the corresponding time-reversal process (dashed arrows)

Important information on the mean free path in nanotubes has been obtained by single-electron tunneling experiments [40,44]. The Coulomb oscillation in semiconducting nanotubes is quite irregular and can be explained only if the nanotubes are divided into many separate spatial regions, in contrast to that in metallic nanotubes [64]. This behavior is consistent with the presence of backscattering leading to a localization of the wave function. In metallic nanotubes, the wave function is extended over the whole nanotube because of the absence of backscattering. With the use of electrostatic force microscopy, the voltage drop in a metallic nanotube has been shown to be negligible [65].

Single-wall nanotubes usually exhibit large charging effects due to nonideal contacts. Contact problems were investigated theoretically in various models and an ideal contact was suggested to be possible under certain conditions [45,46,47,48,49]. Recently, almost ideal contacts were realized and a Fabry-Perot type oscillation due to reflection by contacts was claimed to be observed [50].
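The band energies of Eq. (14) are easy to evaluate directly. The following sketch, assuming γ = L = 1 in arbitrary units and a hypothetical cutoff on the band index n, computes the quantized wave vector κ_ν(n), the band energies, and the resulting gap. From Eq. (14) the smallest |κ_ν(n)| is 0 for ν = 0 and 2π/3L for ν = ±1, so the gap is 0 or 4πγ/3L, respectively.

```python
import math


def kappa(n, nu, L):
    """Quantized circumferential wave vector (2*pi/L) * (n - nu/3)."""
    return (2.0 * math.pi / L) * (n - nu / 3.0)


def band_energy(n, k, nu, L, gamma, sign=+1):
    """Energy of Eq. (14); sign=+1 for conduction, -1 for valence bands."""
    return sign * gamma * math.hypot(kappa(n, nu, L), k)


def band_gap(nu, L, gamma, n_max=5):
    """Gap between the lowest conduction and highest valence band at k = 0.

    n_max is an illustrative cutoff on |n|; the minimum is always reached
    at n = 0 for nu in {-1, 0, +1}.
    """
    smallest = min(abs(kappa(n, nu, L)) for n in range(-n_max, n_max + 1))
    return 2.0 * gamma * smallest


for nu in (0, 1, -1):
    print(nu, band_gap(nu, L=1.0, gamma=1.0))
```

Because the gap scales as 1/L, it is inversely proportional to the tube circumference in this approximation.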
5
Presence of Perfectly Transmitting Channel
When the Fermi level moves away from the energy range where only linear bands are present, interband scattering appears because of the presence of several bands at the Fermi level. Let $r_{\bar\beta\alpha}$ be the reflection coefficient from a state with wave vector $\mathbf{k}_\alpha$ to a state with $\mathbf{k}_{\bar\beta} \equiv -\mathbf{k}_\beta$ in a 2D graphite sheet. Here, $\bar\beta$ stands for the state with wave vector opposite to $\beta$. The only difference arising in nanotubes is the discretization of the wave vector. Apart from a trivial phase factor arising from the choice of the phase of the wave function, the reflection coefficients satisfy the symmetry relation [66]:
$$r_{\bar\beta\alpha} = -r_{\bar\alpha\beta}. \tag{15}$$
This leads to the absence of backward scattering, $r_{\bar\alpha\alpha} = -r_{\bar\alpha\alpha} = 0$, in the single-channel case as discussed in the previous section. Define the reflection matrix $r$ by $[r]_{\alpha\beta} = r_{\bar\alpha\beta}$. Then, we have $r = -{}^{t}r$, where ${}^{t}r$ is the transpose of $r$. In general we have $\det {}^{t}P = \det P$ for any matrix $P$, where $\det P$ is the determinant of $P$. In metallic nanotubes, the number of traveling modes $n_c$ is always given by an odd integer and therefore $\det(-r) = -\det(r)$, leading to $\det(r) = 0$.
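The linear-algebra step in this argument can be checked with a small numerical experiment. The sketch below builds a random complex antisymmetric matrix as a stand-in for r (it is not a physical reflection matrix, only a matrix with the same symmetry) and confirms that its determinant vanishes for odd dimension, together with a nontrivial vector that it maps to zero. The helper name `random_antisymmetric` and the random seed are illustrative.

```python
import numpy as np


def random_antisymmetric(n, rng):
    """Random complex n x n matrix satisfying r = -r^T."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return m - m.T


rng = np.random.default_rng(0)

r_odd = random_antisymmetric(3, rng)    # odd number of channels, n_c = 3
r_even = random_antisymmetric(2, rng)   # even dimension, for contrast

det_odd = np.linalg.det(r_odd)
det_even = np.linalg.det(r_even)

# The right singular vector belonging to the smallest singular value
# is annihilated by r_odd (up to rounding).
_, singular_values, vh = np.linalg.svd(r_odd)
a = vh[-1].conj()

print(abs(det_odd), abs(det_even))
print(np.linalg.norm(r_odd @ a))
```

For even dimension the determinant is generally nonzero, which is why the odd number of traveling modes in metallic nanotubes matters for the argument.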
By definition, $r_{\bar\beta\alpha}$ represents the amplitude of an out-going mode $\bar\beta$ with wave function $\psi_{\bar\beta}(\mathbf{r})$ for the reflected wave corresponding to an in-coming mode $\alpha$ with wave function $\psi_\alpha(\mathbf{r})$. The vanishing determinant of $r$ shows that there exists at least one nontrivial solution of the equation
$$\sum_{\alpha=1}^{n_c} r_{\bar\beta\alpha}\, a_\alpha = 0. \tag{16}$$
Then, there is no reflected wave for the incident wave function $\sum_\alpha a_\alpha \psi_\alpha(\mathbf{r})$, demonstrating the presence of a mode which is transmitted through the system with probability one without being scattered back.

Figure 9 shows some examples of the length dependence of the calculated conductance for different values of the energy. There are three and five traveling modes for $1 < \varepsilon L/2\pi\gamma < 2$ and $2 < \varepsilon L/2\pi\gamma < 3$, respectively. The arrows show the mean free path of traveling modes obtained by solving the Boltzmann transport equation (see below). The relevant length scale over which the conductance decreases down to the single-channel result is given by the mean free path.

Fig. 9. Examples of the length dependence of the calculated conductance (units of 2e²/πħ; length in units of L) for εL/2πγ = 1.1, 1.5, 1.9, 2.1, 2.5, and 2.9, with W⁻¹ = 100.0 and u/2γL = 0.10. Arrows show the mean free path of traveling modes obtained by solving the Boltzmann transport equation. For the lower three curves there are three bands with n = 0 and ±1. For the upper three curves there are five bands with n = 0, ±1, and ±2. The mean free path is the same for bands with the same |n| and decreases with increasing |n| [66]

Figure 10 shows an example of the conductivity obtained by solving the Boltzmann equation. The Boltzmann transport equation can be converted into an equation in terms of the mean free path of each band in quasi-one-dimensional systems [67,68], and the conductivity becomes the sum of the mean free paths of the bands. The Boltzmann equation gives an infinite conductivity as long as the Fermi level lies in the energy range $-1 < \varepsilon L/2\pi\gamma < +1$ where only the linear metallic bands are present. However, the conductivity becomes finite when the Fermi energy moves away from this energy range into the range where other bands are present. Further, the increase in the number of conducting modes gives no enhancement of electronic conduction, because inter-band scattering becomes more important than the increase in the number of conducting modes.

Fig. 10. Energy dependence of the conductivity (units of 2e²L/πħW; energy in units of 2πγ/L) calculated using the Boltzmann transport equation. The conductivity is infinite in the energy range −1 < εL/2πγ < +1 but is finite in the other region where several bands coexist at the Fermi level. Dashed line: density of states (units of 1/2πγ); thin solid lines: contribution to the conductivity of each band (lowest curve: the contribution of m = 0, the next two m = ±1, . . . ) [66]

This conclusion is in marked contrast to the exact prediction that there is at least one channel which transmits with probability unity, so that the conductance is given by $2e^2/\pi\hbar$ independent of the energy for sufficiently long nanotubes. The difference originates from the absence of phase coherence in the approach based on a transport equation. In the transport equation, scattering from each impurity is treated as a completely independent event after which an electron loses its phase memory, while in the transmission approach phase coherence is maintained throughout the system.

Effects of inelastic scattering can be considered in a model in which the nanotube is separated into segments with a length of the order of the phase coherence length and the electron loses its phase information after transmission through each segment. Figure 11 shows some examples of the calculated conductance for $\varepsilon(2\pi\gamma/L)^{-1} = 1.5$ with $n_c = 3$. As long as the length is smaller than or comparable to the inelastic scattering length $L_\phi$, the conductance is close to the ideal value $2e^2/\pi\hbar$ corresponding to the presence of a perfect channel. When the length becomes much larger than $L_\phi$, the conductance decreases in proportion to the inverse of the length. When $L_\phi$ becomes comparable to the mean free path ($\Lambda_0/L \sim 6$ in the present case), the conductance becomes close to the Boltzmann result given by the dotted line. These results correspond to the fact that a conductivity can be defined for finite $L_\phi$. The figure shows that this conductivity is roughly proportional to $L_\phi$.
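For orientation, the ideal conductance 2e²/πħ can be converted to SI units. Since ħ = h/2π, it equals 4e²/h. The sketch below is plain arithmetic using the exact SI values of e and h; the variable names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)
PLANCK_H = 6.62607015e-34   # Planck constant in J s (exact SI value)
HBAR = PLANCK_H / (2.0 * math.pi)

# Ideal conductance of a metallic nanotube, 2e^2/(pi*hbar) = 4e^2/h.
g_ideal = 2.0 * E_CHARGE**2 / (math.pi * HBAR)

# Half of that value, e^2/(pi*hbar) = 2e^2/h.
g_half = E_CHARGE**2 / (math.pi * HBAR)

print(f"2e^2/(pi hbar) = {g_ideal:.4e} S, resistance = {1.0 / g_ideal:.1f} ohm")
print(f"e^2/(pi hbar)  = {g_half:.4e} S, resistance = {1.0 / g_half:.1f} ohm")
```

The ideal value corresponds to roughly 1.55 × 10⁻⁴ S, i.e. a resistance of about 6.45 kΩ.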
Fig. 11. An example of the conductance (units of 2e²/πħ) in the presence of inelastic scattering as a function of the length (units of L) for different values of the phase coherence length, Lφ/L = 10.0, 20.0, 50.0, and 100.0 (100 samples each), with εL/2πγ = 1.5, W⁻¹ = 100.0, and u/2γL = 0.10. The Boltzmann result is also shown [66]
6
Lattice Vacancies – Short-Range Scatterers
So far, we have exclusively considered the case in which the scattering potential has a range larger than the lattice constant. When the range becomes smaller than the lattice constant, the effective potential for the A and B sites in a unit cell can be different. In this case the scatterer becomes dependent on "pseudo-spin" and causes backscattering due to "pseudo-spin-flip scattering." Further, it causes intervalley scattering between the K and K' points, which also causes backscattering and leads to a nonzero resistance [61].

One typical example of such a short-range and strong scatterer is a lattice vacancy. Effects of scattering by a lattice vacancy in armchair nanotubes have been studied within a tight-binding model [69,70]. It has been shown that the conductance at zero energy in the absence of a magnetic field is quantized into zero, one, or two times the conductance quantum $e^2/\pi\hbar$ for a vacancy consisting of three B carbon atoms around an A atom, of a single A atom, and of a pair of A and B atoms, respectively [70]. Numerical calculations were performed for about $1.5\times10^5$ different kinds of vacancies and demonstrated that such quantization is quite general [71]. This rule was later derived analytically in a k·p scheme [29,72].
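The difference between long-range and short-range scatterers can be seen already in the lowest-order matrix element between a state with wave vector k and its backscattered partner −k. The sketch below is a simplified illustration, not the full calculation of [61]: it uses the plane-wave spinor of Eq. (12) in one possible phase convention, models the potential as a diagonal matrix with hypothetical strengths `u_A` and `u_B` on the two sublattices, and ignores intervalley scattering.

```python
import numpy as np


def spinor(theta):
    """Positive-energy eigenvector of sigma . k for k at angle theta."""
    return np.array([1.0, np.exp(1j * theta)], dtype=complex) / np.sqrt(2.0)


def backscattering_element(theta, u_A, u_B):
    """Matrix element <F(-k)| diag(u_A, u_B) |F(k)> within one valley."""
    potential = np.diag([u_A, u_B]).astype(complex)
    # Reversing k corresponds to theta -> theta + pi.
    return np.vdot(spinor(theta + np.pi), potential @ spinor(theta))


theta = 0.3  # arbitrary direction of k (illustrative value)

# Long-range scatterer: same potential on A and B sites.
long_range = backscattering_element(theta, u_A=1.0, u_B=1.0)

# Short-range scatterer: potential on the A site only.
short_range = backscattering_element(theta, u_A=1.0, u_B=0.0)

print(abs(long_range), abs(short_range))
```

In this simplified picture the element equals (u_A − u_B)/2, so it vanishes for a potential that does not distinguish the two sublattices and is nonzero as soon as the A and B sites see different potentials.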
7
Tomonaga-Luttinger Liquid
A metallic CN has linear bands in the vicinity of the Fermi level. In such a one-dimensional metal, the Coulomb interactions induce a breakdown of the Fermi liquid theory by causing strong perturbations. The resulting system is known as a Tomonaga-Luttinger liquid, which is predicted to show non-Fermi-liquid behavior such as the absence of Landau quasi-particles, spin-charge separation, suppression of the tunneling density of states, and interaction-dependent power laws for transport quantities. For armchair CNs, in particular, there have been many theoretical works in which an effective low-energy theory was formulated and explicit predictions were made on various quantities such as the energy gap at the Fermi level and the tunneling conductance between the CN and a metallic contact [75,76,77,78,79].

Some experiments suggested the presence of such many-body effects [42,43,80]. In [80], for example, the conductance of bundles (ropes) of single-wall CNs is measured as a function of temperature and voltage. Electrical connections to nanotubes can be achieved either by depositing electrode metal over the top of the tubes (end contacted) or by placing the tubes on top of predefined metal leads (bulk contacted). The measured differential conductance displays a power-law dependence on temperature and applied bias. The different values of the power obtained for the bulk-contacted and end-contacted samples roughly agree with the theoretical predictions for the exponents of electron tunneling into the bulk and the end of a Tomonaga-Luttinger liquid.
8
Summary and Conclusion
In summary, a brief review has been given of the electronic and transport properties of carbon nanotubes, mainly from a theoretical point of view. The topics include a description of energy bands in a tight-binding model and in an effective-mass or k·p scheme. In the latter scheme, electrons in nanotubes are regarded as neutrinos on a cylinder surface with a fictitious Aharonov-Bohm flux passing through the cross section. The k·p description is particularly useful for revealing extraordinary properties of metallic nanotubes. In fact, in metallic carbon nanotubes, there is at least a single channel transmitting through the system without backscattering, independent of energy, for scatterers with a potential range comparable to or larger than the lattice constant. This channel is sensitive to inelastic scattering when several bands coexist at the Fermi level, however. This has been demonstrated in a model in which an electron loses its coherence over a distance determined by a phase coherence length. The effective-mass scheme is also useful for understanding effects of lattice vacancies, junctions, and topological defects.

Acknowledgements

This work has been supported in part by Grants-in-Aid for COE (12CE2004 "Control of Electrons by Quantum Dot Structures and Its Application to Advanced Electronics") and Scientific Research from the Ministry of Education, Science and Culture, Japan.
References
1. S. Iijima, Nature (London) 354, 56 (1991).
2. S. Iijima, T. Ichihashi, and Y. Ando, Nature (London) 356, 776 (1992).
3. S. Iijima and T. Ichihashi, Nature (London) 363, 603 (1993).
4. D. S. Bethune, C. H. Kiang, M. S. de Vries, G. Gorman, R. Savoy, J. Vazquez, and R. Beyers, Nature (London) 363, 605 (1993).
5. N. Hamada, S. Sawada, and A. Oshiyama, Phys. Rev. Lett. 68, 1579 (1992).
6. J. W. Mintmire, B. I. Dunlap, and C. T. White, Phys. Rev. Lett. 68, 631 (1992).
7. R. Saito, M. Fujita, G. Dresselhaus, and M. S. Dresselhaus, Phys. Rev. B 46, 1804 (1992).
8. M. S. Dresselhaus, G. Dresselhaus, and R. Saito, Phys. Rev. B 45, 6234 (1992).
9. M. S. Dresselhaus, G. Dresselhaus, R. Saito, and P. C. Eklund, in Elementary Excitations in Solids, ed. J. L. Birman, C. Sebenne, and R. F. Wallis (Elsevier Science Publishers B. V., Amsterdam, 1992), p. 387.
10. R. A. Jishi, M. S. Dresselhaus, and G. Dresselhaus, Phys. Rev. B 47, 16671 (1993).
11. K. Tanaka, K. Okahara, M. Okada, and T. Yamabe, Chem. Phys. Lett. 191, 469 (1992).
12. Y. D. Gao and W. C. Herndon, Mol. Phys. 77, 585 (1992).
13. D. H. Robertson, D. W. Brenner, and J. W. Mintmire, Phys. Rev. B 45, 12592 (1992).
14. C. T. White, D. C. Robertson, and J. W. Mintmire, Phys. Rev. B 47, 5485 (1993).
15. H. Ajiki and T. Ando, J. Phys. Soc. Jpn. 62, 1255 (1993).
16. A. M. Rao, E. Richter, S. Bandow, B. Chase, P. C. Eklund, K. W. Williams, M. Menon, K. R. Subbaswamy, A. Thess, R. E. Smalley, G. Dresselhaus, and M. S. Dresselhaus, Science 275, 187 (1997).
17. C. H. Olk and J. P. Heremans, J. Mater. Res. 9, 259 (1994).
18. J. W. Wildoer, L. C. Venema, A. G. Rinzler, R. E. Smalley, and C. Dekker, Nature (London) 391, 59 (1998).
19. A. Hassanien, M. Tokumoto, Y. Kumazawa, H. Kataura, Y. Maniwa, S. Suzuki, and Y. Achiba, Appl. Phys. Lett. 73, 3839 (1998).
20. H. Ajiki and T. Ando, J. Phys. Soc. Jpn. 62, 2470 (1993). [Errata, J. Phys. Soc. Jpn. 63, 4267 (1994).]
21. H. Ajiki and T. Ando, Physica B 201, 349 (1994); Jpn. J. Appl. Phys. Suppl. 34-1, 107 (1995).
22. T. Ando, J. Phys. Soc. Jpn. 66, 1066 (1997).
23. N. A. Viet, H. Ajiki, and T. Ando, J. Phys. Soc. Jpn. 63, 3036 (1994).
24. H. Suzuura and T. Ando, in Proceedings of 25th International Conference on the Physics of Semiconductors, edited by N. Miura and T. Ando (Springer, Berlin, 2001), p. 1525.
25. H. Ajiki and T. Ando, J. Phys. Soc. Jpn. 64, 260 (1995); 65, 2976 (1996).
26. H. Ajiki and T. Ando, J. Phys. Soc. Jpn. 64, 4382 (1995).
27. T. Ando, J. Phys. Soc. Jpn. 69, 1757 (2000).
28. H. Matsumura and T. Ando, J. Phys. Soc. Jpn. 67, 3542 (1998).
29. T. Ando, T. Nakanishi, and M. Igami, J. Phys. Soc. Jpn. 68, 3994 (1999).
30. H. Matsumura and T. Ando, J. Phys. Soc. Jpn. 70, 2657 (2001).
31. T. Yaguchi and T. Ando, J. Phys. Soc. Jpn. 70, 3641 (2001); 21, 2224 (2002).
32. S. N. Song, X. K. Wang, R. P. H. Chang, and J. B. Ketterson, Phys. Rev. Lett. 72, 697 (1994).
33. J. E. Fischer, H. Dai, A. Thess, R. Lee, N. M. Hanjani, D. L. Dehaas, and R. E. Smalley, Phys. Rev. B 55, R4921 (1997).
34. M. Bockrath, D. H. Cobden, P. L. McEuen, N. G. Chopra, A. Zettl, A. Thess, and R. E. Smalley, Science 275, 1922 (1997).
35. L. Langer, V. Bayot, E. Grivei, J.-P. Issi, J. P. Heremans, C. H. Olk, L. Stockman, C. Van Haesendonck, and Y. Bruynseraede, Phys. Rev. Lett. 76, 479 (1996).
36. A. Yu. Kasumov, I. I. Khodos, P. M. Ajayan, and C. Colliex, Europhys. Lett. 34, 429 (1996).
37. T. W. Ebbesen, H. J. Lezec, H. Hiura, J. W. Bennett, H. F. Ghaemi, and T. Thio, Nature (London) 382, 54 (1996).
38. H. Dai, E. W. Wong, and C. M. Lieber, Science 272, 523 (1996).
39. A. Yu. Kasumov, H. Bouchiat, B. Reulet, O. Stephan, I. I. Khodos, Yu. B. Gorbatov, and C. Colliex, Europhys. Lett. 43, 89 (1998).
40. S. J. Tans, M. H. Devoret, H. Dai, A. Thess, R. E. Smalley, L. J. Geerligs, and C. Dekker, Nature (London) 386, 474 (1997).
41. S. J. Tans, R. M. Verschuren, and C. Dekker, Nature (London) 393, 49 (1998).
42. D. H. Cobden, M. Bockrath, P. L. McEuen, A. G. Rinzler, and R. E. Smalley, Phys. Rev. Lett. 81, 681 (1998).
43. S. J. Tans, M. H. Devoret, R. J. A. Groeneveld, and C. Dekker, Nature (London) 394, 761 (1998).
44. A. Bezryadin, A. R. M. Verschueren, S. J. Tans, and C. Dekker, Phys. Rev. Lett. 80, 4036 (1998).
45. J. Tersoff, Appl. Phys. Lett. 74, 2122 (1999).
46. M. P. Anantram, S. Datta, and Y.-Q. Xue, Phys. Rev. B 61, 14219 (2000).
47. K.-J. Kong, S.-W. Han, and J.-S. Ihm, Phys. Rev. B 60, 6074 (1999).
48. H. J. Choi, J. Ihm, Y.-G. Yoon, and S. G. Louie, Phys. Rev. B 60, R14009 (1999).
49. T. Nakanishi and T. Ando, J. Phys. Soc. Jpn. 69, 2175 (2000).
50. J. Kong, E. Yenilmez, T. W. Tombler, W. Kim, and H. J. Dai, R. B. Laughlin, L. Liu, C. S. Jayanthi, and S. Y. Wu, Phys. Rev. Lett. 87, 106801 (2001).
51. M. S. Dresselhaus, G. Dresselhaus, and P. C. Eklund, Science of Fullerenes and Carbon Nanotubes (Academic Press, 1996).
52. T. W. Ebbesen, Phys. Today 49, No. 6, p. 26 (1996).
53. H. Ajiki and T. Ando, Solid State Commun. 102, 135 (1997).
54. R. Saito, G. Dresselhaus, and M. S. Dresselhaus, Physical Properties of Carbon Nanotubes (Imperial College Press, 1998).
55. C. Dekker, Phys. Today 52, No. 5, p. 22 (1999).
56. T. Ando, Solid State Commun. 15, R13 (2000).
Turbulence
Siegfried Grossmann
Fachbereich Physik, Philipps-Universität, Renthof 6, D-35032 Marburg, Germany
[email protected]

Abstract. Recent progress in the physics of developed turbulence is presented. After summarizing relevant notions such as the Reynolds number Re and the dissipation rate $\varepsilon$, results concerning intermittency, scaling of structure functions, breakdown of dynamical scaling, and multifractality are discussed. A unified, dissipation-based theory of thermal convection has found rather detailed experimental confirmation. For anomalous turbulent particle dispersion, mode coupling theory proves to be applicable. Recent results on rigorous bounds on the dissipation rate still leave a gap to the data. Finally, the nonnormal-nonlinear mechanism for the onset of shear flow turbulence despite stable laminarity is summarized.
1 Introduction
Turbulent flow is a nonlinear, complex state of fluid motion, confined in a controlled geometry and stirred by shear or a pressure drop. It expresses the unique signature of the hydrodynamic nonlinearity (u · grad) u. After an introduction of the basic notions, in particular the Reynolds number (Sect. 2), our present understanding of the physics of turbulence is highlighted by a number of examples. These include the multifractality of incompressible turbulent fluid flow (Sect. 3), the recently developed unified theory of turbulent thermal convection (Sect. 4), the mode coupling theory explaining the unusually strong particle pair dispersion (Sect. 5), an overview of rigorous dissipation bounds with the still considerable gap to the data (Sect. 6), and the mechanism of the onset of turbulence despite linear stability of the preceding laminar flow (Sect. 7). A short summarizing section (Sect. 8) concludes this plenary talk on recent progress in the physics of turbulence. References are offered which the reader might wish to consult for more details.
2 Developed Turbulence
2.1 The Reynolds Number
Exerting a pressure drop between the inlet and the outlet of a pipe containing water, shearing the two parallel plates of a water channel against each other, stirring a fluid with a paddle, etc., results in fluid flow, which for weak forcing is laminar and time independent. The fluid sticks to the boundaries. Global or local shear profiles develop.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 19–35, 2003. © Springer-Verlag Berlin Heidelberg 2003
Due to the gradient of the downstream (x-direction) momentum density $\rho_0 u_x(z)$ in the wall-normal direction z, the molecular interaction provides momentum transport between the plates, one moving with U, the other at rest. We consider incompressible flow, therefore the constant mass density $\rho_0$ is skipped henceforth. The molecular transport velocity $u_{mol,\perp}$ is of the order of the thermal velocity $v_{th}$ of the molecules, but reduced by the factor mean free path $\lambda_{mfp}$ / distance L. Therefore $u_{mol,\perp} \sim v_{th}\lambda_{mfp}/L$. The product $v_{th}\lambda_{mfp} \sim \nu$ is the kinematic viscosity, macroscopically defined by $\nu = \eta_d/\rho_0$, with $\eta_d$ the dynamic viscosity of the fluid. The flow becomes turbulent, leading to additional perpendicular momentum transport by advection with a velocity of the order of the relative velocity U, if U exceeds $u_{mol,\perp}$ sufficiently, $U \gg u_{mol,\perp} = \nu/L$. The ratio

$$ \frac{U}{\nu L^{-1}} \equiv \mathrm{Re} = \frac{UL}{\nu} \qquad (1) $$
is the Reynolds number, introduced by Osborne Reynolds about 120 years ago [1], Fig. 1. Typically $\nu \approx 15\ \mathrm{mm^2\,s^{-1}}$ for air, $\nu \approx 1\ \mathrm{mm^2\,s^{-1}}$ for water. Some Reynolds numbers are:

water droplet (1/10 mm) falling (2 cm/s) through air        0.07
micro fluids                                                 O(1)
wind (10 m/s) blowing over telegraph wires (1.5 mm)          10^3
baseball (15 cm) propelled at 40 m/s                         4 × 10^5
shark (0.4 m width) swimming at 5 m/s                        2 × 10^6
wind tunnels (DLR, ONERA, Moscow, etc.)                      10^6 – 10^7
ships (15 m – 60 m) with 30 knots (16 m/s)                   10^8 – 10^9
planetary boundary layer (10 km, 15 m/s)                     10^10
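Several entries of this list can be reproduced directly from Eq. (1). The following is a minimal sketch, using only the viscosities and the sizes and speeds quoted above; the function and variable names are illustrative and not part of the original text.

```python
# Kinematic viscosities quoted in the text, converted to SI units (m^2/s).
NU_AIR = 15e-6    # 15 mm^2/s
NU_WATER = 1e-6   # 1 mm^2/s


def reynolds_number(velocity, length, nu):
    """Re = U L / nu, Eq. (1); all arguments in SI units."""
    return velocity * length / nu


# Sizes and speeds as listed in the text.
examples = {
    "wind over telegraph wire": reynolds_number(10.0, 1.5e-3, NU_AIR),
    "baseball": reynolds_number(40.0, 0.15, NU_AIR),
    "shark": reynolds_number(5.0, 0.4, NU_WATER),
    "ship, 60 m": reynolds_number(16.0, 60.0, NU_WATER),
}

for name, value in examples.items():
    print(f"{name:26s} Re = {value:.1e}")
```

The results agree with the orders of magnitude given in the list; the ship entry corresponds to the upper end of the quoted range.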
Fig. 1. Osborne Reynolds, 1842–1912. The picture can be found under www.eng.man.ac.uk/historic/reynolds/orey1904.jpg
The following approximate Re ranges are useful to know:

≤ O(1)          laminar
20–80           structure formation, laminar
100–1300        loss of spatial coherence
O(1500)–∞       turbulence

2.2 Dissipation Rate
The velocity difference $v_r(x) \propto u(x+r) - u(x)$ increases linearly with r for sufficiently small r due to viscosity ("viscous subrange"), with a fluctuating slope $\partial u/\partial x$. For larger r it increases much less, on average $v_r \propto r^{1/3}$. This r-range is called the "inertial subrange". It is here that the hydrodynamic nonlinearity (u · grad) u, the inertia term, dominates the linear, viscous term $\nu\Delta u$ in the equation of motion. The Reynolds number can also be understood as the order of magnitude ratio of the inertia to the viscous terms,

$$ \mathrm{Re} \,\hat{=}\, \frac{(u\cdot\mathrm{grad})\,u}{\nu\Delta u} . \qquad (2) $$
Flow structures appear if Re > O(1), under the influence of the nonlinearity. Turbulence sets in for Re > O(1500–2000) under strong action of the nonlinearity. The relevant physical parameter, obtained from the experimental observation of the 1/3 scaling in the inertial subrange, evidently must have the units $\mathrm{m^{2/3}/s}$ or $(\mathrm{m^2/s^3})^{1/3}$. This parameter $\varepsilon$ can be understood as the energy ($\propto U^2$) input rate ($\propto U/L$). In stationary turbulence it is equal to the output rate $\varepsilon = \langle\varepsilon(x,t)\rangle = \langle\nu(\partial u_i/\partial x_j)(\partial u_i/\partial x_j)\rangle$, the energy dissipation rate. Thus $\varepsilon \propto U^3/L$. The importance of $\varepsilon$ as the control parameter for the open non-equilibrium system "turbulent flow" was discovered independently by Kolmogorov [2], Oboukhov [3], von Weizsaecker [4], Heisenberg [5], and Onsager [6]. It was Lewis Fry Richardson, 1881–1953, who in essence, although not explicitly, had found it already in 1926 [7,8], when he detected the first scaling law in physics, the dependence of the turbulent diffusivity $K(\ell)$, in cm²/s, on the diffusing particle cloud's diameter $\ell$, in cm, to be $K(\ell) \propto \ell^{4/3}$, cf. Section 5. Apparently the units of the factor of proportionality must be $\mathrm{cm^{2/3}/s}$, i.e., the factor is $\propto \varepsilon^{1/3}$.
3 Anomalous Scaling
3.1 Structure Functions
The turbulence wave number spectrum E(k) and the structure function $S(r) = \langle v_r^2\rangle$, in the mean field form

$$ S(r) = b\,\varepsilon^{2/3} r^{2/3}, \qquad (3) $$

$$ E(k) \propto \varepsilon^{2/3} k^{-5/3}, \qquad (4) $$
have been known since the 1940s [2,3,4,5,6]. Meanwhile, S(r) and its Fourier transform E(k), including the prefactor $b_{exp} \approx 8.4$, can be explained and derived from the Navier-Stokes equations of motion in mean field approximation [9]. (For the Navier-Stokes equations the reader may consult textbooks, cf. [10,11,12,13,14]. The last reference [14] is the completely updated version of [13], in a series of books for each chapter.) The deep physical problem, which was noticed by Landau (cf. [10]) very soon after Kolmogorov's discovery of the 2/3 scaling of the structure function, is the experimental observation that the local energy dissipation rates $\varepsilon(x,t)$ strongly fluctuate in space and time. For large Reynolds number flows, huge activities of about 20 to 60 and more times the mean are followed by calm periods of irregularly varying lengths. This behavior of the turbulent activity is denoted as "intermittency". Since S(r) depends on $\varepsilon^{2/3}$, and the other order structure functions

$$ S_p(r) \equiv \langle v_r^p\rangle \propto (\varepsilon r)^{p/3}\left(\frac{r}{L}\right)^{\delta\zeta_p} \propto r^{\zeta_p} \qquad (5) $$

on $\varepsilon^{p/3}$, the local $\varepsilon$-fluctuations influence the r-dependence of the structure functions $S_p(r)$ and therefore contribute to the value of the power law exponents $\zeta_p$. The physical idea is that the relevant dissipation rate for eddies of size r will be the r-average of the locally fluctuating dissipation rates, $\varepsilon_r = \langle\varepsilon(x,t)\rangle_r$. Apparently, the probability density function (pdf) of $\varepsilon_r$ will be narrow and strongly peaked at the mean value $\varepsilon$ if r is large; the intermittent fluctuations will mostly be averaged out. But for decreasing r the pdf will become broader and broader, because the intermittent fluctuations become more and more visible. Since $\varepsilon_r^{p/3}$ for p > 3 probes the tail of the pdf, its average increases with decreasing r; therefore the deviations $\delta\zeta_p$ in Eq. (5) must be negative. If p < 3, the shrinking center of the pdf is probed, the average of $\varepsilon_r^{p/3}$ shrinks, and the corresponding $\delta\zeta_p$ is positive.
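The sign pattern of the deviations $\delta\zeta_p = \zeta_p - p/3$ can be illustrated with one of the phenomenological models cited in Sect. 3.3, the She-Leveque model [17]. The sketch below assumes its commonly quoted closed form $\zeta_p = p/9 + 2[1 - (2/3)^{p/3}]$, which is taken from the literature and is not derived in this text; the function names are illustrative.

```python
def zeta_she_leveque(p):
    """She-Leveque model exponent (assumed closed form, see lead-in)."""
    return p / 9.0 + 2.0 * (1.0 - (2.0 / 3.0) ** (p / 3.0))


def delta_zeta(p):
    """Deviation from the Kolmogorov value p/3."""
    return zeta_she_leveque(p) - p / 3.0


for p in (1, 2, 3, 4, 6, 8):
    print(f"p = {p}:  zeta_p = {zeta_she_leveque(p):.4f}, "
          f"delta zeta_p = {delta_zeta(p):+.4f}")
```

In this model the deviation is positive for p < 3, vanishes at p = 3, and is negative for p > 3, in line with the argument above.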
The structure function exponents $\zeta_p$ of Eq. (5) pose the problem of the physics of turbulence. They are difficult to measure, because experimental flows, and even more so numerical simulations, have rather limited power law ranges and limited statistics to determine the moment exponents sufficiently reliably. Although the data are still not strongly convincing, the turbulence community believes in a Re-asymptotic power law behavior (5) and, if so, in a nonlinear dependence of $\zeta_p$ on the order p, i.e., in deviations from the Kolmogorov exponents p/3, which are linear in p.

3.2 Reference Scale of Intermittency
The Kolmogorov exponents p/3 exhaust the balance of the units on the lhs and rhs of Eq.(5) completely. Thus any deviation δζp requires the explicit presence of a characteristic length scale. Two candidates are on the stage:
The outer, stirring scale L or the inner, dissipative, smallest eddies' scale $\eta = (\nu^3/\varepsilon)^{1/4}$. Now, we can say something concerning the relevant scale from one of the few available rigorous results,

$$ \langle v_r^{(p+q)/2}\rangle^2 \le \langle v_r^p\rangle\,\langle v_r^q\rangle . \qquad (6) $$

This is a direct consequence of Schwarz's inequality. It leads to

$$ \zeta_{(p+q)/2} \ge \tfrac{1}{2}\,(\zeta_p + \zeta_q) , \qquad (7) $$
if and only if the relevant explicit length scale in the structure functions is the outer one, L. Equation (7) characterizes a concave curve $\zeta_p$ versus p, in agreement with the data, cf. [11], thus confirming the choice of L as the relevant explicit scale.

3.3 Multifractality
Theory, for a long time, could not do more than suggest models which describe the measured $\zeta_p$ versus p. These models started with Kolmogorov and Oboukhov's log-normal model in 1962 [15,16] and may not have finished with the very successful She-Leveque model in 1994 [17]; for a summary see [11]. A frame of sufficient generality which comprises these models is the multifractal representation of the structure functions $S_p(r)$,

$$ S_p(r) \equiv \langle v_r^p\rangle = (\varepsilon L)^{p/3}\int\left(\frac{r}{L}\right)^{hp+3-D(h)} d\mu(h) \propto r^{\zeta_p} , \qquad (8) $$

as a superposition of sets, whose dimension is D(h) and whose measure is $d\mu(h)$, with different power law exponents h. Not only h = 1/3 contributes, but (probably) a continuum of other values too. $(r/L)^{3-D(h)}$ represents the probability that an r-eddy is in the h-scaling set. The exponent then is

$$ \zeta_p = \inf_h\,\bigl(hp + 3 - D(h)\bigr) , \qquad (9) $$
i.e., $\zeta_p$ and D(h) are mutual Legendre transforms. Different models for $\zeta_p$ yield different fractal dimensions D(h) and vice versa. Equations (8),(9) give the frame, not the solution. The very existence of intermittency implies $D(h) \ne 3$; the dimension of the set of active, intermittent turbulence must be fractal. The real problem is to calculate $\zeta_p$ or D(h) from the Navier-Stokes equations (10),(11), less so to invent further more or less convincing models.

$$ \partial_t u(x,t) = -(u\cdot\mathrm{grad})\,u - \mathrm{grad}\,p + \nu\Delta u , \quad \text{b.c.} , \qquad (10) $$

$$ \mathrm{div}\,u = 0 . \qquad (11) $$
b.c. denotes the (no slip) boundary conditions for the differential equations, introducing shear or pressure gradients into the solutions.
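The Legendre relation between $\zeta_p$ and D(h) in Sect. 3.3 can be checked numerically for a simple case. The sketch below assumes, purely for illustration, a parabolic (log-normal type) dimension spectrum $D(h) = 3 - (h-h_0)^2/(2\sigma^2)$ with $h_0 = 1/3 + \mu/6$ and $\sigma^2 = \mu/9$; the value $\mu = 0.2$ and all names are hypothetical choices, not taken from the text. With the exponent $hp + 3 - D(h)$ of Eq. (8), the infimum over h can be done by hand and gives $\zeta_p = p/3 - \mu\,p(p-3)/18$, which the brute-force minimization reproduces.

```python
MU = 0.2                     # assumed intermittency parameter (illustrative)
H0 = 1.0 / 3.0 + MU / 6.0    # location of the maximum of D(h)
SIGMA2 = MU / 9.0            # width parameter of the parabola


def dimension_spectrum(h):
    """Assumed parabolic D(h); D = 3 only at h = H0."""
    return 3.0 - (h - H0) ** 2 / (2.0 * SIGMA2)


def zeta_numeric(p, h_min=-1.0, h_max=2.0, n=30001):
    """zeta_p = inf_h (h p + 3 - D(h)), evaluated on a uniform grid in h."""
    step = (h_max - h_min) / (n - 1)
    best = float("inf")
    for i in range(n):
        h = h_min + i * step
        best = min(best, h * p + 3.0 - dimension_spectrum(h))
    return best


def zeta_exact(p):
    """Closed form of the infimum for the parabolic D(h)."""
    return p / 3.0 - MU * p * (p - 3.0) / 18.0


for p in (2, 3, 4, 6):
    print(p, round(zeta_numeric(p), 5), round(zeta_exact(p), 5))
```

For $\mu \to 0$ the spectrum collapses onto the single point h = 1/3 with D = 3, and the Kolmogorov values p/3 are recovered.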
3.4 Anomalous Exponents
Recent exciting progress in calculating the exponents of the passive scalar structure functions, $\langle (T(x+r) - T(x))^p\rangle \propto r^{\zeta_p^T}$, supports our hope that a solution of the full turbulence problem might now appear above the horizon. The passive scalar field T(x,t), e.g. the temperature field, solves

$$ \partial_t T(x,t) = -(u\cdot\mathrm{grad})\,T + \kappa\Delta T . \qquad (12) $$
κ is the thermal diffusivity, characterizing the fluid. u either is a turbulent solution of (10) or a flow field which properly models turbulence. The passive scalar exponents $\zeta_p^T$ could be analysed using the Kraichnan [18] model flow $u_K(x,t)$, see [19] for a detailed review. The essence is that the corrections $\delta\zeta_p^T$ are not evaluated by dimension counting but, in the linear passive scalar problem, by the so-called zero mode solutions or the statistically conserved quantities in turbulent flow, together with the shape evolution of multiparticle sets. This is promising and fascinating progress in understanding intermittency in turbulence from the hydrodynamic equations of motion.

3.5 Breakdown of Dynamical Scaling
There are old conjectures about the possibility of a renormalization group analysis of the structure function scaling, see e.g. [20] and other references. In contrast to many-particle critical behavior, in turbulence there seems to be an infinity of independent critical exponents. Also of interest is another recently derived structurally rigorous result [21]: There is no dynamical scaling of the time dependent turbulent correlation functions. The reason is multifractality. Although all (static) structure functions $S_p(r)$ have, in the inertial subrange, nice power law behavior $\propto r^{\zeta_p}$, the (Lagrangian) dynamical correlations $S_p(r,\tau)$ do not share dynamical scaling behavior. Instead,

$$ S_p(\lambda r, \lambda^z\tau) \ne \lambda^{\zeta_p} S_p(r,\tau) . \qquad (13) $$
The proof [21] uses continued fraction decomposition and the analysis of its coefficients. This elucidates the reason: The dynamics is influenced by all $\zeta_p$ simultaneously, because higher order continued fraction coefficients are determined by higher order structure functions and thus by higher and higher order $\zeta_p$. Therefore intermittency, the nonlinear dependence of $\zeta_p$ on p, destroys dynamical scaling, which holds if and only if $\zeta_p = hp$ is linear in p. Then z = 1 − h and D(h) = 3, nonfractal, nonintermittent.

3.6 Degrees of Freedom, Numerical Work
In turbulent flow the number of relevant degrees of freedom is exceptionally large. An estimate of the Lyapounov exponents gives $\lambda_k \approx -\nu k^2 + \sqrt{\varepsilon/2\nu}$. This allows one to determine the Kaplan-Yorke dimension $d_{KY}$ (the largest d with $\sum_{k=1}^{d}\lambda_k \ge 0$), leading to the following number of degrees of freedom of the turbulent flow,

$$ d_{KY} \approx 0.022\,(V/\eta)^3 \approx (L/4\eta)^3 \approx 0.015\,\mathrm{Re}^{9/4} . \qquad (14) $$
Already at onset, Re ≈ 1300, there are $d_{KY} \approx 10^5$ active modes. $L/4\eta$ is about 50; this indicates the number of necessary modes in each spatial direction. Note the steep increase of $d_{KY}$ with Re. In developed turbulence, say Re = 5 × 10^6, we find $d_{KY} \approx 2 \times 10^{13}$; the density of turbulent modes is of the order of 1/cm³. Near instabilities, on the other hand, there are only a few modes, $d_{KY} \approx 8$. This huge number of active modes in turbulent flow is the reason why direct numerical simulations are still restricted to rather limited Reynolds numbers. The "numerical work" W to calculate turbulent flow of a chosen Reynolds number Re has been estimated [22] to be

$$ W \approx \mathrm{Re}^3 \log_2 \mathrm{Re} \approx 4.8\,\mathrm{Re}^{3.1}\ \text{flops} . \qquad (15) $$

This leads to $W(\mathrm{Re} = 5\times10^6) \approx 2.8\times10^{21}$ floating point operations. Even with today's best rates of about 0.5 Gigaflops/s this work would need $t(\mathrm{Re} = 5\times10^6) \approx 5.6\times10^{12}$ s ≈ 180 000 a. To phrase it alternatively: to handle this amount W of numerical work in, say, 1 day requires tact times $\Delta t = 3.1\times10^{-17}$ s, i.e., 32 Petaflops/s or 32 000 Teraflops/s. These numbers explain why direct numerical simulations of Navier-Stokes turbulence are still restricted to some $10^2$ for $\mathrm{Re}_\lambda$ or several $10^4$ for Re, despite impressive hardware progress.
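The arithmetic of this subsection is easy to retrace. The following minimal sketch evaluates Eqs. (14) and (15) for the Reynolds numbers quoted in the text; the function names are illustrative, and the machine speed of 0.5 Gigaflops/s is the figure given above.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600


def kaplan_yorke_dimension(reynolds):
    """d_KY ~ 0.015 Re^(9/4), last form of Eq. (14)."""
    return 0.015 * reynolds ** 2.25


def modes_per_direction(reynolds):
    """L / (4 eta), from d_KY ~ (L / 4 eta)^3 in Eq. (14)."""
    return kaplan_yorke_dimension(reynolds) ** (1.0 / 3.0)


def numerical_work(reynolds):
    """W ~ Re^3 log2(Re) floating point operations, Eq. (15)."""
    return reynolds ** 3 * math.log2(reynolds)


print(f"onset:     d_KY = {kaplan_yorke_dimension(1300):.2e}, "
      f"L/4eta = {modes_per_direction(1300):.0f}")
print(f"developed: d_KY = {kaplan_yorke_dimension(5e6):.2e}")

work = numerical_work(5e6)
years_at_half_gigaflops = work / 0.5e9 / SECONDS_PER_YEAR
rate_for_one_day = work / SECONDS_PER_DAY   # operations per second
print(f"W = {work:.2e} operations")
print(f"time at 0.5 Gigaflops/s: {years_at_half_gigaflops:.2e} years")
print(f"rate needed for one day: {rate_for_one_day:.2e} operations/s")
```

Since $d_{KY} \approx (L/4\eta)^3 \approx 0.015\,\mathrm{Re}^{9/4}$, Eq. (14) implies $L/\eta \approx \mathrm{Re}^{3/4}$, the scale ratio that also follows from $\eta = (\nu^3/\varepsilon)^{1/4}$ with $\varepsilon \propto U^3/L$.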
4 Thermal Convection
One of the most intensely studied turbulent flows is thermal Rayleigh-Bénard convection in a container heated from below. Because of gravitation g, the thermal expansion $\alpha_p = -\rho_0^{-1}(\partial\rho_0/\partial T)_p$ leads to buoyancy $\propto g\alpha_p\Delta T/L$. Here L is the height of the container and $\Delta T = T_{bottom} - T_{top}$ is the temperature difference. The resulting convection is slowed down by viscosity ν and by thermal diffusivity κ. The ratio of the buoyant forcing and the molecular slow down is the dimensionless Rayleigh number

$$ \mathrm{Ra} = \frac{g\,\alpha_p\,\Delta T\,L^3}{\nu\kappa} . \qquad (16) $$
The onset of convection occurs at $\mathrm{Ra}_c = 1708$. For larger Ra the convection patterns become time dependent, then chaotic, then lose spatial coherence, and at about Ra ≈ 5 × 10^7 the flow becomes turbulent. Considerable experimental progress in high Rayleigh number thermal convection, up to Ra ≈ 10^15 – 10^17, has stimulated theory and vice versa. A unifying theory [23,24,25] has passed several experimental tests and found confirmation.
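To get a feeling for the magnitudes in Eq. (16), the following sketch evaluates the Rayleigh number for a water layer. The material constants are assumed round values for water near room temperature (thermal expansion about 2.1 × 10⁻⁴ K⁻¹, thermal diffusivity about 1.4 × 10⁻⁷ m²/s; only ν = 1 mm²/s is quoted in the text), and the function names are illustrative.

```python
G = 9.81            # gravitational acceleration, m/s^2
ALPHA_P = 2.1e-4    # thermal expansion of water, 1/K (assumed)
NU = 1.0e-6         # kinematic viscosity of water, m^2/s (from the text)
KAPPA = 1.4e-7      # thermal diffusivity of water, m^2/s (assumed)
RA_CRITICAL = 1708.0


def rayleigh_number(delta_t, height):
    """Ra = g alpha_p Delta T L^3 / (nu kappa), Eq. (16)."""
    return G * ALPHA_P * delta_t * height ** 3 / (NU * KAPPA)


def onset_temperature_difference(height):
    """Temperature difference at which Ra reaches Ra_c = 1708."""
    return RA_CRITICAL * NU * KAPPA / (G * ALPHA_P * height ** 3)


print(f"Ra(1 K, 10 cm) = {rayleigh_number(1.0, 0.1):.2e}")
print(f"onset Delta T for a 1 cm layer = "
      f"{onset_temperature_difference(0.01):.3f} K")
```

Because of the factor L³, a 10 cm layer with a temperature difference of only 1 K is already several orders of magnitude above onset under these assumptions.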
Quite different fluids have been measured besides water and cryogenic helium, from organic fluids with large Prandtl numbers Pr = ν/κ, e.g. dipropylene glycol with Pr ≈ 1350, to liquid metals like mercury with very small Pr = 0.025 or sodium with even smaller Pr = 0.005. The accuracy of the experiments has been improved quite remarkably [26]. The scatter of the data is less than 0.1%, and systematic effects are controlled to ≤ 1%. Recent attention is devoted to the dependences on the container's geometry [27], its aspect ratio Γ = width B / height L, as well as its shape, rectangular or cylindrical. Corrections due to imperfect side wall conditions are also taken care of [26,28]. The main elements of the unified theory are exact relations between the thermal forcing Ra, the fluid's property Pr, and the container geometry Γ, on the one hand, and the response of the system, on the other. The response is the effective heat flux $\mathrm{Nu} = J/(\kappa\Delta T L^{-1})$, the dimensionless Nusselt number, and the large scale motion, the circulating "wind". Its velocity U defines the Reynolds number Re = UL/ν. J is the total temperature flux, the sum of the advective and molecular fluxes. It is nondimensionalized with the molecular flux of the fluid at rest, $\kappa\Delta T L^{-1}$. The mentioned exact relations are

$$ \varepsilon_u/(\kappa^3 L^{-4}) = \mathrm{Ra}\,\mathrm{Pr}\,(\mathrm{Nu} - 1) \quad \text{and} \quad \varepsilon_\theta/(\kappa\,\Delta T^2 L^{-2}) = \mathrm{Nu} . \qquad (17) $$
$\varepsilon_u$ and $\varepsilon_\theta$ are the kinetic and the thermal dissipation rates. The idea is to decompose them into the dissipation contributions from the boundary layers and those from the bulk. Both are then modeled fluid dynamically in terms of Re and Pr. This enables one to calculate a complete turbulent Ra–Pr parameter state diagram. It is of considerable complexity but seems to be well supported by experiment. Some of the results are [23,24,25]:
i) Experimental precision and theory give evidence for deviations from simple power laws $\mathrm{Nu} \propto \mathrm{Ra}^\beta\,\mathrm{Pr}^\alpha$. To identify these deviations, compensated plots have to be made.
ii) The temperature flux Nu and the circulating wind Re cannot be described by power laws. Global pure power law scaling is conceptually insufficient because of the varying weights of the boundary layer and the bulk contributions. If one introduces effective, local exponents, these depend on Ra and Pr.
iii) Nu increases with Pr for Pr ≤ O(1). For larger Pr it decreases again, the details depending on Ra. There is a Prandtl number independent upper bound of Nu versus Ra.
iv) The kinetic boundary layers seem to follow laminar boundary layer theory up to Ra ≈ 10^14 – 10^15 for Pr ≈ 1 – 10. The asymptotic turbulent range lies only beyond this (for fixed Pr).
v) The wind increases roughly as $\mathrm{Re} \sim \mathrm{Ra}^{0.50}$ – $\mathrm{Ra}^{0.43}$, depending on Pr, and $\sim \mathrm{Pr}^{-0.67}$ – $\mathrm{Pr}^{-0.76}$, depending on Ra.
vi) The famous β = 2/7 exponent [29] in a power law fit $\mathrm{Nu} \propto \mathrm{Ra}^\beta$ is not confirmed. The heat flux exponent for larger Ra is nearer to β = 1/3.
vii) The boundary layer thicknesses are described by $\lambda_\theta = L/(2\,\mathrm{Nu})$ and $\lambda_u = 0.482\,L/\sqrt{\mathrm{Re}}$.
For more details see [23,24,25] and [30,31].
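The exact relations (17) and the boundary layer thicknesses of item vii) translate directly into a few helper functions. The sketch below is only an illustration; the sample values of Ra, Pr, Nu and Re used in the printout are hypothetical placeholders, not results of the theory or of any experiment.

```python
import math


def kinetic_dissipation(ra, pr, nusselt, kappa, height):
    """eps_u = (kappa^3 / L^4) Ra Pr (Nu - 1), first relation of Eq. (17)."""
    return kappa ** 3 / height ** 4 * ra * pr * (nusselt - 1.0)


def thermal_dissipation(nusselt, kappa, delta_t, height):
    """eps_theta = kappa (Delta T / L)^2 Nu, second relation of Eq. (17)."""
    return kappa * (delta_t / height) ** 2 * nusselt


def thermal_boundary_layer(height, nusselt):
    """lambda_theta = L / (2 Nu), item vii)."""
    return height / (2.0 * nusselt)


def kinetic_boundary_layer(height, reynolds):
    """lambda_u = 0.482 L / sqrt(Re), item vii)."""
    return 0.482 * height / math.sqrt(reynolds)


# Hypothetical sample state, for illustration only.
L, KAPPA, DELTA_T = 1.0, 1.4e-7, 10.0
RA, PR, NU_NUMBER, RE = 1e9, 7.0, 50.0, 1e4
print(kinetic_dissipation(RA, PR, NU_NUMBER, KAPPA, L))
print(thermal_dissipation(NU_NUMBER, KAPPA, DELTA_T, L))
print(thermal_boundary_layer(L, NU_NUMBER), kinetic_boundary_layer(L, RE))
```

For Nu = 1, i.e. pure molecular heat conduction in the fluid at rest, the kinetic dissipation vanishes, as it must.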
5 Turbulent Particle Dispersion
Contaminants in air are very effectively dispersed under turbulent advection. The mean extension $\delta R_t$ of particle clouds, with $\delta R_t^2 = \langle (R(t) - R(0))^2\rangle$, where R(0) is the initial pair separation r and R(t) is the pair distance a time t later, increases strongly with t,

$$ \delta R_t^2 = C_D\,\varepsilon\,t^3 . \qquad (18) $$
Note that the corresponding law for Brownian diffusion is $\delta R_t^2 \propto t$. For correlated spreading $\delta R_t^2 \propto t^2$ holds, with the largest known exponent 2 here. It was Lewis Fry Richardson, Fig. 2, in 1926 [7,8], who explained the exceptionally effective turbulent spreading (18) by observing that the turbulent diffusivity is scale dependent,

$$ K(r) = \kappa_D\,\varepsilon^{1/3} r^{4/3} . \qquad (19) $$
Equation (19) was the first scaling law in physics. In the atmosphere the scales range from the order of mm to thousands of km. To this r-range of 8 orders of magnitude corresponds a K-range of about 11 orders. The turbulent, advective diffusivity can be as huge as $K \approx 10^9\,D_{mol,air} \approx 10^{13}\,D_{microfluids}$. How can one physically understand this giant and effective turbulent diffusivity? Consider the Kubo type representation of the variance $\delta R_t^2$ in terms of the Lagrangian, time dependent velocity correlation function [32,33],

$$ \delta R_t^2 = \langle (R(t) - r)^2\rangle = \int_0^t d\tau_1 \int_0^t d\tau_2\, \langle v(\tau_1|\ldots)\cdot v(\tau_2|\ldots)\rangle . \qquad (20) $$
Fig. 2. Lewis Fry Richardson, 1881–1953. The picture can be found under www-gap.dcs.st-and.ac.uk/history/PictDisplay/Richardson.html
Here $v(\tau; r|x,0)$ denotes the Lagrangian velocity difference of a pair of particles, initially at position x, having separation r, observed at a time τ later. From Eq. (20) we find the known diffusion laws as well as the anomalous turbulent advective diffusion:

normal diffusion           $\delta R_t^2 \propto t\,\dfrac{2k_BT}{\gamma m_0}\,d$ ,
correlated diffusion       $\delta R_t^2 \propto t^2\,\langle v^2\rangle\,d$ ,
anomalous turb. diff.      $\delta R_t^2 \propto t^3\,d$ , since $\langle v^2\rangle \propto \delta R_t^{2/3} \propto t$ .
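How the first two laws of this list emerge from Eq. (20) can be made explicit under a simple assumption: an exponentially decaying velocity correlation $\langle v(\tau_1)v(\tau_2)\rangle = \langle v^2\rangle\exp(-\gamma|\tau_1-\tau_2|)$ with constant $\langle v^2\rangle$ and γ (one velocity component; the constants are arbitrary illustrative values). The double integral is then $2\langle v^2\rangle\,[t/\gamma - (1-e^{-\gamma t})/\gamma^2]$. The sketch below compares this closed form with a direct midpoint-rule evaluation and checks both limits.

```python
import math


def variance_closed_form(t, v2, gamma):
    """Double integral of v2 * exp(-gamma |tau1 - tau2|) over [0, t]^2."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return 2.0 * v2 * (t / gamma + math.expm1(-gamma * t) / gamma ** 2)


def variance_numeric(t, v2, gamma, n=400):
    """Midpoint-rule evaluation of the same double integral."""
    h = t / n
    total = 0.0
    for i in range(n):
        tau1 = (i + 0.5) * h
        for j in range(n):
            tau2 = (j + 0.5) * h
            total += math.exp(-gamma * abs(tau1 - tau2))
    return v2 * total * h * h


V2, GAMMA = 1.0, 1.0   # arbitrary illustrative constants
print(variance_closed_form(5.0, V2, GAMMA), variance_numeric(5.0, V2, GAMMA))

# gamma t >> 1: normal diffusion, variance ~ 2 <v^2> t / gamma  (linear in t)
print(variance_closed_form(1e3, V2, GAMMA) / (2.0 * V2 * 1e3 / GAMMA))
# gamma t << 1: correlated diffusion, variance ~ <v^2> t^2
print(variance_closed_form(1e-3, V2, GAMMA) / (V2 * 1e-6))
```

With $\langle v^2\rangle = k_BT/m_0$ per component, the long-time limit reproduces the prefactor $2k_BT/(\gamma m_0)$ of the first row.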
The peculiarity of turbulent advective diffusion is that the pairs' relative velocity (which in Brownian motion is the thermal velocity, $\langle v^2\rangle \sim k_BT/m_0$) is not a given constant velocity, but increases with increasing particle separation $\delta R_t$ as $\langle v^2\rangle \propto \delta R_t^{2/3} \propto t^{3/3} = t$. This adds the third t-power. One might doubt the correlated type of diffusion, i.e., the second t-power. Closer analysis [34] shows that in the relaxation approximation $\langle v(r)\,v_\tau(r)\rangle \approx \langle v^2\rangle \exp(-\gamma t)$ the typical turbulent advection argument comes in again. With increasing t the decay rate $\gamma = \gamma(\delta R_t)$ decreases and the correlation $\approx \langle v^2\rangle$ can persist. Since numerical evaluation of the Lagrangian time correlation [35] gives strong indication that the memory contributions are large, leading to much larger decay rates $\gamma_{renorm}$, an improved theory succeeded in calculating $\gamma_{renorm} \gg \gamma$ in a mode coupling approach [34]. Then, instead of $t^2$ for the double integral in (20), one finds $t\,\gamma^{-1} \propto t\,D(\delta R_t) \propto t\cdot t$, thus again a $t^2$ behavior. Once more the decreasing decay rate due to an increasing pair separation leads to the characteristic signature of turbulent diffusion, the $t^3$ spreading. In mode coupling approximation the relevant numerical factors $C_D$ and $\kappa_D$ can also be calculated. They are found as

$$ C_D \approx 0.5 - 0.8 \quad \text{and} \quad \kappa_D \approx 0.8 - 1.0 . \qquad (21) $$
This is in good agreement with recent experiments and simulations [36,37,38,39]. Combining these values of the Richardson constant $C_D$, or rather $\kappa_D$, with Richardson's 1929 measurement $(K/\mathrm{cm^2\,s^{-1}})/(r/\mathrm{cm})^{4/3} = 0.6$ allows an estimate of the mean atmospheric energy dissipation rate: $\varepsilon_{atmosphere} \approx 20\ \mathrm{mm^2\,s^{-3}}$. This average is a factor of about $10^{-5}$ less than wind tunnel dissipation and about $10^{-2}$ below high Rayleigh-Bénard thermal convection dissipation. (In turbulent thermal convection in large Ra cryogenic helium the dissipation rate is about $\varepsilon \approx 2\times10^3\ \mathrm{mm^2\,s^{-3}}$.) The Lagrangian correlation decay time can be calculated [34] as

$$ \tau_{renorm} = \gamma_{renorm}^{-1} \approx 0.5\,(r/L)^{2/3}\ \text{large eddy turnovers} . \qquad (22) $$
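The estimate of the atmospheric dissipation rate follows from inverting Eq. (19), $\varepsilon = [(K/r^{4/3})/\kappa_D]^3$. The following minimal sketch uses the measured value 0.6 (in cgs units) and the range of $\kappa_D$ from Eq. (21); the function name is illustrative.

```python
CM2_TO_MM2 = 100.0   # 1 cm^2 = 100 mm^2


def dissipation_from_richardson(k_over_r43, kappa_d):
    """Invert K = kappa_D eps^(1/3) r^(4/3); cgs input, eps in cm^2/s^3."""
    return (k_over_r43 / kappa_d) ** 3


for kappa_d in (0.8, 1.0):
    eps_mm = dissipation_from_richardson(0.6, kappa_d) * CM2_TO_MM2
    print(f"kappa_D = {kappa_d}:  eps = {eps_mm:.1f} mm^2/s^3")

# An r-range of 8 orders of magnitude maps, via K ~ r^(4/3),
# onto 8 * 4/3 orders of magnitude in K.
orders_in_k = 8 * 4.0 / 3.0
print(f"orders of magnitude in K: {orders_in_k:.1f}")
```

For $\kappa_D$ between 0.8 and 1.0 this gives a few tens of mm² s⁻³, the order of magnitude quoted above, and the K-range comes out as about 11 orders of magnitude.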
6 Dissipation Bounds
The characteristic control parameter of turbulence is the energy flux through the open nonequilibrium system "turbulent fluid flow". In order to maintain turbulence one continuously has to feed in energy, which is dissipated again on the small scales by the molecular viscosity. In stationary flow the input and the output rates are the same. While the dissipation rate $\varepsilon = \langle\nu(\partial u_i/\partial x_j)^2\rangle_{V,t}$ is expressed in terms of the kinematic viscosity ν and the flow field gradients $\partial u_i/\partial x_j$, spatially (V) and temporally (t) averaged, the input rate is given by the external boundary conditions, $\propto U^2/(LU^{-1})$, the kinetic energy fed in during a large eddy turnover time $LU^{-1}$. Hence $\varepsilon = c_\varepsilon U^3L^{-1}$. The dimensionless dissipation coefficient $c_\varepsilon$ depends on Re. The input rate $U^3L^{-1}$ is controlled by the experimentalist. The dissipation rate is a consequence of the flow realization under given U, L, and Re, and is not known immediately. Therefore $\varepsilon$, or rather $\varepsilon/(U^3L^{-1}) = c_\varepsilon$, is a quantity of high interest. Beginning with E. Hopf in 1941 and L. N. Howard in 1972, rigorous upper bounds for the dissipation coefficient $c_\varepsilon$ have been studied. In 1970 F. H. Busse [40] succeeded in formulating and Re-asymptotically calculating (within an approximation) the bound $c_\varepsilon(\mathrm{Re}\to\infty) = 0.0101$. In 1992 Doering and Constantin [41] introduced another variational upper bound, which could be improved considerably in [42,43,44]. Using this improvement, the equivalence of the bounds from the Busse and the Doering-Constantin variational principles could be shown in [45]. Quite recently the Nicodemus et al. bound [43,44] could be improved still further asymptotically in Re [46]; for finite Re the curve $c_\varepsilon$ versus Re was confirmed. All these rigorous $\varepsilon$-bounds asymptotically approach a constant value $c_\varepsilon(\mathrm{Re}\to\infty)$. They refer to plane Couette flow, driven by shearing the opposite plates. These bounds, though they are rigorous consequences of the Navier-Stokes equations, are still not fully satisfactory, because
i. available data [47,48] are still an order of magnitude below the presently known bounds, and
ii. there are hints that experimentally $c_\varepsilon(\mathrm{Re})$ might decrease with Re, while the known bounds approach a constant level.
In [49] it could be shown why $c_\varepsilon(\mathrm{Re})$ can indeed decrease. But there is still a gap of a factor of about 10 between theory and experiment. The recent developments and improvements of rigorous upper dissipation bounds can be summarized as follows (DC = Doering-Constantin, etc.):
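$c_\varepsilon(\mathrm{Re}\to\infty)/10^{-3}$             = 88.4    DC, 1992
                                                          = 83.3    DC, 1994
                                                          = 78.6    GGHL, 1995
                                                          = 66.3    NGH, 1997
$c_\varepsilon(\mathrm{Re}=\mathrm{Re}_{ES})/10^{-3}$     = $\mathrm{Re}_{ES}^{-1}$ = 12.1 at $\mathrm{Re}_{ES}$ = 82.64
$c_\varepsilon(\mathrm{Re}\to\infty)/10^{-3}$             = 10.9    NGH, 1997/98
                                                          = 10.1    B, 1970/78
                                                          = 8.6     PK, 2003
$c_\varepsilon(\mathrm{Re})/10^{-3}$                      = 1.16 for Re = 2 × 10^4 – 0.53 for Re = 10^6    LFS, 1992 (Taylor-Couette flow)
$c_\varepsilon(\mathrm{Re}\approx10^4)/10^{-3}$           = 1.0     R, 1959
$c_\varepsilon(\mathrm{Re}=\mathrm{Re}_{turb})/10^{-3}$   = $\mathrm{Re}_{turb}^{-1}$ = 0.77 at $\mathrm{Re}_{turb}$ = 1300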
To close the data/theory gap and to derive the scaling of c( with Re remains an exciting challenge for our physical understanding as well as for the mathematical hydrodynamics.
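Two entries of the summary, those of the form $\mathrm{Re}^{-1}$, can be retraced by hand. The sketch below assumes laminar plane Couette flow with a linear profile, relative plate velocity U and plate distance L, for which $\varepsilon = \nu(U/L)^2$ and hence $c_\varepsilon = \varepsilon L/U^3 = 1/\mathrm{Re}$; this reading of the table and the function names are illustrative assumptions.

```python
def dissipation_coefficient(eps, velocity, length):
    """c_eps = eps L / U^3, the dimensionless dissipation coefficient."""
    return eps * length / velocity ** 3


def laminar_couette_coefficient(reynolds):
    """Assumed linear profile: eps = nu (U/L)^2, so that c_eps = 1/Re."""
    return 1.0 / reynolds


print(f"c_eps(Re_ES = 82.64) = {laminar_couette_coefficient(82.64) * 1e3:.1f}e-3")
print(f"c_eps(Re_turb = 1300) = {laminar_couette_coefficient(1300.0) * 1e3:.2f}e-3")

# Gap between the lowest asymptotic bound and the data quoted in the summary.
lowest_bound, data_at_1e4, data_at_1e6 = 8.6e-3, 1.0e-3, 0.53e-3
print(f"bound/data: {lowest_bound / data_at_1e4:.1f} at Re ~ 1e4, "
      f"{lowest_bound / data_at_1e6:.1f} at Re = 1e6")
```

The ratio of roughly 10 between the best asymptotic bound and the measured coefficients is the gap referred to in the text.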
7 Onset of Shear Turbulence
One of the oldest and most mysterious problems in the physics of turbulence has been the question why laminar flow through a pipe or a channel, or any other Poiseuille or Couette shear flow, becomes turbulent. After Osborne Reynolds' [1] pioneering experiments in 1883 it took about 100 years until we now believe we have understood the mechanism [50,51,52,53,54]. The puzzle is that all disturbances of the laminar parabolic or linear profiles in pipe or planar shear flows should decay, because all eigenvalues of small disturbances turn out to have negative (damping) real parts, $e^{\lambda t} \propto e^{-|\lambda_{real}|t} \to 0$. The only exception is pressure driven channel flow, but this observation pushes the puzzle even further: Here a positive eigenvalue appears for Re = 5772, but the laminar flow "does not wait for that"; it becomes turbulent already for much smaller Reynolds numbers. Our present understanding of why laminar shear flow develops turbulence despite stable laminarity can be summarized as follows; for details the reader is referred to [54] and further references therein. Indeed, laminar shear flow is mathematically linearly stable. It needs disturbances to trigger the turbulence onset. For smaller Re stability prevails even under disturbances. For larger Re the disturbance amplitude $\delta u_{dist}$ can be the smaller, the larger the Reynolds number Re already is,

$$ \delta u_{dist}/U \propto \mathrm{Re}^{\gamma} . \qquad (23) $$
Experiments say γ ≈ −1.5 ± 0.3 (Trefethen et al. 2001), theory leads to −3 ≤ γ ≤ −1. Note that U in (23) is ∝ Re, thus in experiment

$$ \delta u_{dist} \propto \mathrm{Re}^{\gamma+1} = 1/\sqrt{\mathrm{Re}} . \qquad (24) $$
With some care in avoiding disturbances one can succeed in delaying the onset of turbulence to Re much beyond Re ≈ 2 000. We know about onset delay until even 100 000 (cf. [55]). This needs a noise suppression by a factor of about 1/7, which is conceivable. The exponent γ in Eq. (23) describes the "double threshold" (which means that both the disturbance and the Reynolds number must exceed a threshold) in the sense of enveloping the $\delta u_{dist}$ versus Re curves for different fixed types of disturbances. For a given type of disturbance the double threshold may have a complicated, seemingly fractal structure as a function of $\delta u_{dist}$ [56]. How does the Navier-Stokes equation (10) manage onset? It is its characteristic nonlinearity again which is responsible. The flow field consists of the laminar flow U and an additional field u, thus U + u. The nonlinearity then produces the terms (U · grad) u + (u · grad) U + (u · grad) u. The first one just describes the advection of the disturbance with the laminar flow. The second one perturbs the laminar profile; this linear matrix operator $\partial U_j/\partial x_i$ generically is neither symmetric nor diagonal. Its presence implies that the linear part of the Navier-Stokes operator, L, must be nonnormal, defined by $LL^+ \ne L^+L$. This property implies that the eigenvectors are no longer orthogonal to each other, which generically can lead to transient growth of the disturbances u. The amplification can be calculated in model systems to be ∝ Re. Thus for sufficiently large Re the third term, (u · grad) u, is no longer negligible. It rearranges the meanwhile amplified disturbance field in such a way that it can be amplified again. This feedback loop makes u grow. The total field U + u on average has less shear, switching the nonnormality of the shear matrix down. This subtle interplay between nonnormality and nonlinearity thus saturates the fluctuating u-field on a finite (turbulent) level.
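Transient growth under a nonnormal but linearly stable operator can be illustrated with a two-mode toy model. The model is an assumption chosen here for illustration and is not the model used in the text or in the cited references: $du/dt = A\,u$ with $A = \begin{pmatrix} -1/\mathrm{Re} & 1 \\ 0 & -2/\mathrm{Re} \end{pmatrix}$. Both eigenvalues are negative, yet the off-diagonal coupling, which plays the role of the shear term (u · grad) U, lets the first component grow to a maximum ∝ Re before it finally decays.

```python
import math


def propagator(t, reynolds):
    """exp(t A) for the upper triangular toy matrix A, in closed form."""
    a, b = 1.0 / reynolds, 2.0 / reynolds
    ea, eb = math.exp(-a * t), math.exp(-b * t)
    return ((ea, (ea - eb) / (b - a)),
            (0.0, eb))


def max_amplification(reynolds, samples=20000):
    """Largest response of component 1 to a unit initial value of component 2."""
    t_end = 10.0 * reynolds
    best = 0.0
    for i in range(samples + 1):
        t = t_end * i / samples
        best = max(best, propagator(t, reynolds)[0][1])
    return best


for reynolds in (100.0, 1000.0, 10000.0):
    amp = max_amplification(reynolds)
    print(f"Re = {reynolds:7.0f}:  max amplification = {amp:8.1f} "
          f"= {amp / reynolds:.3f} Re")
```

In this toy model the maximum is Re/4, reached at t = Re ln 2, although every eigenmode decays; a normal matrix with the same eigenvalues would show no growth at all.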
The intricate nonnormal-nonlinear mechanism needs many degrees of freedom, because, from the very structure of the Navier-Stokes nonlinearity, the interaction and the transient amplification occur in different sectors of the flow modes. Subsets of modes in pipe or channel flow are either nonnormal but do not interact, or they interact and then are not nonnormal to each other. Therefore many and not too small such subsets are necessary for the mechanism to become effective. This finding is in agreement with the above mentioned large dimension $d_{KY}$ of turbulent flow and with the observation of many scales present already at the onset of turbulence. One can well imagine that in a proper nonnormal-nonlinear balance stationary (or periodic) virtual solutions of the Navier-Stokes equation with u ≠ 0, thus different from the laminar flow U, might exist, though unstable. They correspond to the unstable periodic orbits generically present in nonlinear chaotic systems. Indeed various such solutions with finite u-field energy have meanwhile been found [56,57]. Following them as functions of decreasing Re allows one to study the lowest Reynolds number for which disturbances can trigger an onset of turbulence. This happens for considerably larger Re than the energy instability.
The bounds for the double threshold scaling exponent γ + 1 in (24) can be estimated from the described mechanism [51]. Take a disturbance $\delta u_{\mathrm{dist}}$ and let it be amplified to $\propto Re\,\delta u_{\mathrm{dist}}$; the quadratic interaction leads to $\propto (Re\,\delta u_{\mathrm{dist}})^2$; if this exceeds the original amplitude $\delta u_{\mathrm{dist}}$, a positive feedback loop is established. Growth thus is triggered if $\delta u_{\mathrm{dist}} \geq \mathrm{const} \times Re^{-2}$. Now, due to the other mentioned interaction term with the laminar background flow, (U · grad) u, which also contributes ∝ Re, the nonnormal amplification can saturate. With the same argument as before one then finds $\delta u_{\mathrm{dist}} \propto Re^{0}$, i.e., γ + 1 lies between −2 and 0. Note, then, that the onset of shear flow turbulence is not a common instability due to eigenvalues crossing the imaginary axis. It is another, new type of mechanism involving many degrees of freedom, based on non-orthogonal eigenfunctions of the linearised Navier-Stokes operator (describing the interaction with the laminar flow) and its nonnormality. The interplay with the nonlinearity determines the growth mechanism. That the Navier-Stokes equations indeed support this picture has been quantitatively confirmed in [53].
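The feedback argument and the scaling (24) can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: all proportionality constants are set to 1 (the text does not give them), and the function names are hypothetical.

```python
def feedback_grows(du, re, c=1.0):
    """True if the quadratic interaction (c*re*du)**2 reaches the original amplitude du."""
    return (c * re * du) ** 2 >= du

def threshold(re, exponent, const=1.0):
    """Threshold disturbance const * re**exponent, where exponent plays the role of gamma + 1."""
    return const * re ** exponent

# Lower bound of the argument: growth is triggered for du >= Re**-2.
re = 1000.0
du_c = threshold(re, -2.0)
above = feedback_grows(1.01 * du_c, re)  # slightly above threshold: loop closes
below = feedback_grows(0.99 * du_c, re)  # slightly below threshold: loop does not close

# Experimental exponent gamma + 1 = -1/2: noise reduction needed to stay
# laminar up to Re = 100 000 instead of Re = 2 000.
suppression = threshold(100000.0, -0.5) / threshold(2000.0, -0.5)

if __name__ == "__main__":
    print(above, below, suppression)
```

With the exponent −1/2 the required suppression is 1/√50, which is close to the factor of about 1/7 quoted in the text.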
8
Summary
In this plenary talk an overview has been offered on recent exciting progress in the physics of turbulence. We have clearly left behind us the period of modeling and now approach, instead, an understanding based on the Navier-Stokes equations.
• Promising steps are results on passive scalar scaling exponents in developed turbulence. We have elaborated characteristic differences to renormalization group approaches of equilibrium phase transition physics, namely the hydrodynamic multifractality and the breakdown of dynamical scaling. The present limits of direct numerical simulations can be located in the necessarily large number of degrees of freedom.
• Thermal convection in Rayleigh-Bénard cells has been realized for very large Ra and Re numbers. Still one is not really asymptotic in Re. The boundary layer effects are superimposed on the bulk turbulence. Heat conduction and thermal wind can no longer be described in terms of pure power laws. A theory which is completely based on the kinetic and the thermal dissipation rates has been successful in dealing with the bulk of presently available data.
• The dynamical mode coupling approach has allowed the derivation of the anomalous turbulent particle dispersion including the numerical prefactors. The Richardson ideas have been largely confirmed.
• Rigorous bounds on the energy dissipation rate have been improved considerably. A remaining factor of about 1/10 to the measured dissipation rates still has to be explained, an interesting challenge.
• The onset of turbulence in shear flow can be understood by an intricate balance between transient nonnormal growth and nonlinear interactions rearranging the disturbances to establish a positive feedback loop. The signatures of this mechanism are linearly stable laminarity, the necessity of disturbances, many degrees of freedom right from the start, the existence of an (asymptotically rather flat) double threshold, and a finite Reynolds number gap before onset can happen at all.
As usual, research progress opens a plethora of new questions. But the mood is very positive; we are making progress in this interesting field of condensed matter physics: turbulence.
Acknowledgements
It is my great pleasure to thank my enthusiastic coworkers who shared the fun of working on turbulence as far as reported here: Ulrich Brosa, David Daems, Bruno Eckhardt, Hans Effinger, Hirokazu Fujisaka, Thomas Gebhardt, Martin Holthaus, Detlef Lohse, Victor L’vov, Rolf Nicodemus, Itamar Procaccia, Stefan Thomae, Christian Wiele.
References
1. O. Reynolds, Phil. Trans. R. Soc. London 174, 935 (1883).
2. A. N. Kolmogorov, C. R. Akad. Nauk USSR 30, 301 (1941).
3. A. M. Oboukhov, C. R. Akad. Nauk USSR 32, 19 (1941).
4. C. F. von Weizsäcker, Z. Phys. 124, 614 (1948).
5. W. Heisenberg, Z. Phys. 124, 628 (1948).
6. L. Onsager, Phys. Rev. 68, 286 (1945).
7. L. F. Richardson, Proc. R. Soc. London A110, 709 (1926).
8. L. F. Richardson, Beitr. Phys. Atm. 15, 24 (1929).
9. H. Effinger, S. Grossmann, Z. Phys. B66, 289 (1987).
10. L. D. Landau, E. M. Lifschitz, Fluid Mechanics (Pergamon Press, Oxford 1987); Hydrodynamik (Akademie Verlag, Berlin 1991).
11. U. Frisch, Turbulence (Cambridge University Press, Cambridge 1995).
12. S. B. Pope, Turbulent Flows (Cambridge University Press, Cambridge 2000).
13. A. S. Monin, A. M. Yaglom, Statistical Fluid Mechanics (The MIT Press, Cambridge Massachusetts 1975).
14. A. S. Monin, A. M. Yaglom, Statistical Fluid Mechanics: The Mechanics of Turbulence, revised by A. M. Yaglom: Volume I, Chapters 2-5 (CTR Monographs, Ames Research Center and Stanford University 1997, 1998, 1999, 2001).
15. A. N. Kolmogorov, J. Fluid Mech. 12, 82 (1962).
16. A. M. Oboukhov, J. Fluid Mech. 12, 77 (1962).
17. Z.-S. She, E. Leveque, Phys. Rev. Lett. 72, 336 (1994); E. Leveque, Z.-S. She, Phys. Rev. E55, 2789 (1997).
18. R. H. Kraichnan, Phys. Fluids 11, 945 (1968).
19. G. Falkovich, K. Gawędzki, M. Vergassola, Rev. Mod. Phys. 73, 913 (2001).
20. S. Grossmann, E. Schnedler, Z. Phys. B26, 307 (1977).
21. D. Daems, S. Grossmann, V. S. L’vov, I. Procaccia, Phys. Rev. E60, 6656 (1999).
22. S. A. Orszag, J. Fluid Mech. 41, 363 (1970); S. A. Orszag, V. Yakhot, Phys. Rev. Lett. 56, 1691 (1986).
23. S. Grossmann, D. Lohse, J. Fluid Mech. 407, 27 (2000).
24. S. Grossmann, D. Lohse, Phys. Rev. Lett. 86, 3316 (2001).
25. S. Grossmann, D. Lohse, Phys. Rev. E66, 016305 (2002).
26. X. Xu, K. M. Bajaj, G. Ahlers, Phys. Rev. Lett. 84, 4357 (2000).
27. S. Grossmann, D. Lohse, J. Fluid Mech. 486, 105 (2003).
28. G. Ahlers, X. Xu, Phys. Rev. Lett. 86, 3320 (2001).
29. B. Castaing, G. Gunaratne, F. Heslot, L. Kadanoff, A. Libchaber, S. Thomae, X. Wu, S. Zaleski, G. Zanetti, J. Fluid Mech. 204, 1 (1989).
30. J. Niemela, L. Skrbek, K. R. Sreenivasan, R. Donnelly, Nature 404, 837 (2000).
31. J. Niemela, K. R. Sreenivasan, J. Fluid Mech. Mode coupling theory of turbulent dispersion, Preprint, Marburg 2003.
32. S. Grossmann, I. Procaccia, Phys. Rev. A29, 1358 (1984).
33. S. Grossmann, Ann. d. Physik (Leipzig) 47, 577 (1990).
34. S. Grossmann, J. Fluid Mech. 481, 355 (2003).
35. S. Grossmann, C. Wiele, Z. Phys. B103, 469 (1997).
36. M.-C. Jullien, J. Paret, P. Tabeling, Phys. Rev. Lett. 82, 2872 (1999).
37. S. Ott, J. Mann, J. Fluid Mech. 422, 207 (2000).
38. T. Ishihara, Y. Kaneda, Phys. Fluids 14, L69 (2002).
39. G. Boffetta, I. M. Sokolov, Phys. Rev. Lett. 88, 094501 (2002).
40. F. H. Busse, J. Fluid Mech. 41, 219 (1970); Adv. Appl. Mech. 18, 77 (1978).
41. C. R. Doering, P. Constantin, Phys. Rev. Lett. 69, 1648 (1992); Phys. Rev. E49, 4087 (1994).
42. T. Gebhardt, S. Grossmann, M. Holthaus, M. Loehden, Phys. Rev. 51, 360 (1995).
43. R. Nicodemus, S. Grossmann, M. Holthaus, Physica D101, 178 (1997); Phys. Rev. Lett. 79, 4170 (1997); Phys. Rev. E56, 6774 (1997).
44. R. Nicodemus, S. Grossmann, M. Holthaus, J. Fluid Mech. 363, 281 and 301 (1998).
45. R. R. Kerswell, Physica D100, 355 (1997).
46. S. C. Plasting, R. R. Kerswell, J. Fluid Mech. 2003, (preprint 2002).
47. H. Reichardt, Gesetzmässigkeiten der geradlinigen turbulenten Couette Strömung, Mitteilungen aus dem MPI für Strömungsforschung 22, Göttingen 1959.
48. D. P. Lathrop, J. Fineberg, H. L. Swinney, Phys. Rev. Lett. 68, 1515 (1992); Phys. Rev. A46, 6390 (1992).
49. R. Nicodemus, S. Grossmann, M. Holthaus, Eur. Phys. J. 10, 385 (1999).
50. L. Boberg, U. Brosa, Z. Naturforsch. 43a, 697 (1988).
51. L. N. Trefethen, A. E. Trefethen, S. C. Reddy, T. A. Driscol, Science 261, 578 (1993).
52. T. Gebhardt, S. Grossmann, Phys. Rev. E50, 3705 (1994).
53. U. Brosa, S. Grossmann, Eur. Phys. J. B9, 343 (1999).
54. S. Grossmann, Rev. Mod. Phys. 72, 603 (2000).
55. I. J. Wygnanski, F. H. Champagne, J. Fluid Mech. 59, 281 (1973).
56. B. Eckhardt, K. Marzinzik, A. Schmiegel, in A Perspective Look at Nonlinear Media, J. Parisi, S. C. Mueller, W. Zimmermann (Eds.), Lecture Notes in Physics, Vol. 503, Springer, Berlin etc., pp 327–338 (1998).
57. B. Eckhardt, H. Faisst have found also travelling wave solutions, private communication, preprint Marburg (2003).
Another Semiconductor Revolution: This Time It’s Lighting! Roland Haitz Retired from Agilent Technologies
Abstract. A 40 year old semiconductor technology, Light Emitting Diodes (LEDs), has steadily improved performance and cost to a point where it will move from its home turf, signaling applications, to the much larger market of general lighting. White LEDs are building momentum at such a rapid rate that we predict a revolution in lighting comparable to the blowing out of the gaslights by Edison’s incandescent lamp 100 years ago. One technology will compete for all applications from the smallest indicator lamp to the lighting system for sports stadiums. LEDs will provide superior performance and lower cost of ownership at any point in this dynamic range of 11 orders of magnitude. A complete conversion to LED based lamps could reduce electricity consumption for lighting by up to 75% and reduce global coal production by approximately 600 Mtons/year. There is no single technology investment on the horizon with a better ratio of environmental benefit to cost.
1
Introduction
This paper is about revolution, a revolution that is expected to change the way we are lighting our homes, shops, offices, streets and even sports stadiums. This revolutionary lighting technology is based on a 40 year evolution of tiny semiconductor based lights (Light Emitting Diodes or LEDs) that could not even be seen in direct sunlight before 1975. Today, the technology is at a transition point from signaling a message to illuminating a small area. Over the next 20 years, LEDs will improve performance and cost to a point where they will compete for practically every light-based signaling and lighting application on the surface of the earth! There will be some exceptions:
• LEDs will not match the low weight and un-tethered mobility of the firefly,
• LEDs will not compete with the spectacular display of a thunderstorm, and
• LEDs will not match the high flux and the low energy cost of the sun.
Every other application between those extremes will be fair game! LEDs will bring some unique advantages to the “lighting table”. LEDs are very efficient and are beating incandescent lamps today. The efficiency of compact fluorescent lamps (CFL) will be matched in a couple of years. A decade from today, LEDs will surpass the efficiency of the most efficient white lamps such as fluorescent lamps (FL) and high intensity discharge lamps (HID).
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 35–50, 2003. © Springer-Verlag Berlin Heidelberg 2003
Efficiency, and its tie to energy cost, is only one of the critical parameters in lighting. Other issues are: initial lamp cost, color control, color-rendering index (CRI) and life-time systems cost including maintenance. Today LEDs excel on maintenance cost (lamp life, ruggedness, etc.) but lag on color control (CRI over temperature, time, drive current, etc.). The existing lighting technologies have spoiled users with superb color control. LEDs will face an up-hill battle on CRI against the entrenched technologies. But an active color control system could turn this handicap into an advantage: unlimited color and dimming control without a loss of efficiency. At this point of the revolution, initial lamp cost is the decisive parameter for market conversion. Today, LEDs have a cost structure that will let them dominate all monochrome signaling applications within a few years. The situation for white LED lighting looks much worse: LED lamps are at least two orders of magnitude more expensive than equivalent conventional lamps. But this disadvantage shrinks significantly when life-time energy costs are included. Expected cost improvements for LEDs will bring the initial lamp cost to the range of today’s CFL lamps within 5–10 years. At that price level, energy costs will become the dominant cost factor and favor LED lamps. This paper will start with a brief description of the physics behind the light generating process in LEDs. Next comes a description of the important milestones in the evolution of signaling LEDs. The meat of the paper will concentrate on today’s challenges in the transition from signaling to lighting and on the vision of revolutionizing today’s lighting technology. A non-trivial side benefit will be energy savings for the consumer and a significant reduction in the emission of greenhouse gases.
2
The LED Light-Generation Process
The essential element of an LED is a p-n junction, the most basic element of the semiconductor industry. A p-n junction (Fig. 1) is the transition layer between two classes of semiconductor materials: the electron carrying n-layer and the hole carrying p-layer. When a forward voltage is applied to the structure (positive to p-layer and negative to n-layer), electrons are injected from the n-layer into the p-layer and holes from the p-layer into the n-layer. These injected carriers are called minority carriers: a relatively small number of electrons surrounded by a large number of holes on the p-side and vice versa on the n-side. These electrons and holes can be in the same physical space, but they are separated in energy and momentum space. The separation in energy is called the energy gap $E_g$ (see Fig. 1). A positively charged hole is nothing else but a missing electron in the crystal lattice. Both electrons and holes can move freely through the crystal lattice. By applying a positive voltage $V_F$ to the p-side electrode (Fig. 2), electrons will diffuse from the n-side to the p-side of the conduction band. Similarly, holes will diffuse to the n-side of the valence band.
Fig. 1. Schematic representation of an un-biased pn junction
Fig. 2. Junction biased with a forward voltage $V_F$ resulting in minority carrier injection into both conduction and valence bands
A positively charged hole can attract a negatively charged electron and the electron can recombine with the hole. However, this process has to obey two fundamental laws of physics: energy and momentum conservation. The energy conservation law is readily met by emitting a photon with a quantum energy $h\nu = E_g$. This process results in the conversion of an injected electron or hole into a visible photon as long as the energy gap is in the range of $E_g = 1.9$ eV (red) to $E_g = 3.0$ eV (violet). The momentum conservation law is more difficult to meet. Without going into all possible variations, let us consider the most important case. In practically all semiconductors, the holes occupy states near zero momentum. In Si and Ge, the electrons occupy the lowest energy states, which happen to be far away from zero momentum, and recombination accompanied by the emission of a photon is practically impossible. In other semiconductor materials, like the alloys between elements of Group III and Group V of the periodic system, very often (but not always) the electrons also occupy states near
zero momentum. In these so-called direct bandgap materials, such as GaAs, GaAlAs, GaInN, GaAlInP, etc., injected electrons recombine readily with holes by emitting infrared or visible light with a wavelength depending on the gap energy $E_g$. But electrons also have a chance to recombine without emitting light. To recombine radiatively, the electron (hole) must find a hole (electron) with exactly opposite momentum to meet the law of momentum conservation. This process will take some time. During this time delay, the electron (hole) has a finite probability to drop into an electron (hole) trap such as a crystal defect. While being trapped, the electron (hole) will eventually recombine with a hole (electron), but instead of generating a photon, the recombination process will meet the energy conservation law by emitting multiple phonons or lattice vibrations (heat). Considering these two recombination paths, radiative and non-radiative, the efficiency of the recombination process can be described by a simple equation
$\eta_{\mathrm{int}} = \tau_n/(\tau_n + \tau_r)$ (1)
Here, $\eta_{\mathrm{int}}$ denotes the internal quantum efficiency, $\tau_r$ the mean time to recombine radiatively and $\tau_n$ the mean time to recombine non-radiatively. The ideal case is $\tau_r \ll \tau_n$, for which $\eta_{\mathrm{int}} \approx 1$. In this case, the electrons recombine radiatively long before they have a chance to get trapped. In some III-V materials, like GaAlAs (850 nm) and GaAlInP (650 nm), we approach the condition of $\eta_{\mathrm{int}} \sim 100\%$. In Si and Ge, we have the opposite condition, $\tau_r \gg \tau_n$, and the radiative recombination is essentially zero. Most practical LED materials have an internal quantum efficiency in the range of 10–50%. The losses are dominated by crystal defects, inter-valley electron scattering, and lack of perfect confinement. The latter two loss mechanisms go beyond the scope of this paper and are discussed in great detail in the relevant LED literature. Generating photons inside the LED chip is only half the story of making efficient LEDs. The chip itself is a formidable photon trap. The light emitted by the p-n junction is directed equally into all directions (isotropic emitter). The LED material has an index of refraction in the range of $n_0 = 2.9$–$3.6$. When the photons emitted towards the top surface reach the chip-air interface, total internal reflection sends most of them back into the chip and only 1.5–2.0% of the internally generated photons escape through the top surface (Fig. 3). If the bulk of the chip is transparent to the generated light (Fig. 4), then we have essentially 5 transmission surfaces and 7.5–10% of the photons can escape directly into air. The remaining 90–92.5% will be absorbed within the chip. If the chip is embedded in a material with an index of refraction of $n_1 = 1.5$ (epoxy), then the escape probability increases by a factor of approximately $n_1^2$ and the extraction efficiency for a transparent substrate chip is in the 20–25% range. We can express this situation by defining an extraction efficiency $\eta_{\mathrm{extraction}}$. The external quantum efficiency of an LED, $\eta_{\mathrm{ext}}$, is given by
$\eta_{\mathrm{ext}} = \eta_{\mathrm{int}} \cdot \eta_{\mathrm{extraction}}$ (2)
Fig. 3. LED chip based on an absorbing substrate and a thick window layer allowing for top emission and some emission through the side walls
Fig. 4. LED chip based on a transparent substrate with top emission and substantial emission through the side walls. A partly reflecting back contact further improves extraction efficiency
The extraction efficiency cannot be easily quantified in mathematical terms. The usually low extraction efficiency is caused by several factors that contribute to the internal absorption of light. In a cubic chip with walls that are perpendicular to each other, only low order modes that fill a 17° cone perpendicular to each surface can escape the chip. For higher order modes, the chip behaves like a corner reflector: it preserves the modal structure and all photons outside of an escape cone will eventually be absorbed. For the case of $n_1 = 1.5$ (epoxy encapsulation), the escape cone expands to 26° and 20–25% of the flux can escape. The higher order modes keep bouncing around the chip until they are absorbed at low reflectance surfaces, such as electrical contacts, by the active layer itself, by free-carrier absorption in the bulk substrate or by crystal defects.
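The escape-cone numbers can be reproduced approximately from Snell's law. The sketch below is an illustration under stated assumptions: a chip index of 3.5 is picked from the quoted 2.9–3.6 range, emission is isotropic, Fresnel reflection inside the cone is ignored (so the results sit slightly above the quoted 1.5–2.0% per surface), and the function names are hypothetical.

```python
import math

def critical_angle_deg(n_chip, n_outside=1.0):
    """Half-angle of the escape cone from Snell's law, in degrees."""
    return math.degrees(math.asin(n_outside / n_chip))

def escape_fraction(n_chip, n_outside=1.0):
    """Fraction of isotropically emitted light that falls inside one escape cone.

    Solid angle of the cone divided by the full sphere; Fresnel losses ignored.
    """
    theta_c = math.asin(n_outside / n_chip)
    return 0.5 * (1.0 - math.cos(theta_c))

if __name__ == "__main__":
    n_chip = 3.5  # assumed value within the quoted range
    for name, n_out in (("air", 1.0), ("epoxy", 1.5)):
        angle = critical_angle_deg(n_chip, n_out)
        one_face = escape_fraction(n_chip, n_out)
        print(f"{name}: cone {angle:.1f} deg, one surface {one_face:.1%}, five surfaces {5 * one_face:.1%}")
```

For these assumed values the cone half-angle is close to 17° in air and 26° in epoxy, and the gain from epoxy encapsulation is close to the factor of $n_1^2 = 2.25$ mentioned in the text.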
One way to enhance extraction efficiency is by means of rapid mode conversion. So far, the most effective chip design is a chip that looks like an inverted truncated pyramid. Non-perpendicular surfaces do not conserve modes, and after a couple of bounces, even the light in high order modes can be converted into low order escape modes. Embedding these chips into epoxy [1] has resulted in an external quantum efficiency record for LEDs of 53%. With an internal quantum efficiency in the 90–100% range, the extraction efficiency must be between 53% and 59%. This experiment has been an important step in raising our hopes for further improvements.
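Equations (1) and (2) are simple enough to evaluate directly. The following sketch uses hypothetical function names and arbitrary illustrative lifetimes; only the 53% record and the 90–100% internal efficiency range are taken from the text.

```python
def internal_efficiency(tau_r, tau_n):
    """Eq. (1): internal quantum efficiency from radiative and non-radiative lifetimes."""
    return tau_n / (tau_n + tau_r)

def extraction_from_external(eta_ext, eta_int):
    """Eq. (2) solved for the extraction efficiency."""
    return eta_ext / eta_int

if __name__ == "__main__":
    # Lifetimes in arbitrary units, chosen only to show the two limits.
    print(internal_efficiency(tau_r=1.0, tau_n=100.0))   # tau_r << tau_n: close to 1
    print(internal_efficiency(tau_r=100.0, tau_n=1.0))   # tau_r >> tau_n: close to 0
    # Record external efficiency of 53% with 90-100% internal efficiency.
    print(extraction_from_external(0.53, 1.0), extraction_from_external(0.53, 0.9))
```

The last line reproduces the 53% to 59% bracket for the extraction efficiency of the truncated-pyramid chip.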
3
Brief History of LED Evolution
The first recorded observation of light emission from a “semiconductor” material was made in 1907 by H. J. Round [2]. Passing current through wire point contacts on SiC produced yellow light. There are no recorded consequences to this observation. The second recorded light emission from SiC was described by O. V. Losev in 1928–30. His detailed experiments clearly show a pn junction structure, at a time long before pn junctions were discovered and understood. Losev’s observations are described in a historical reference by E. E. Loebner [3]. The potential invention fell through the cracks of the political instability during the Stalin years and the upcoming war. In the late 1950s, Welker’s proposal [4] that compound semiconductors from the III and V columns of the periodic system should have properties comparable to Ge and Si led to the detection of infrared (IR) emission from GaAs crystals with quantum efficiencies in the range of 0.01–0.1%. The observation of IR emission and the understanding of band structures of semiconductor materials (momentum conservation) were soon followed by a quest for visible emission. Bandgap widening in the ternary GaAsP compound led to the first engineered structure for visible emission, by N. Holonyak [5] in 1962. The LED chip was placed in a conventional diode package of the time. The device lit up, but it was useless as a product since only a very small fraction of the light was exiting the LED package and its distribution was uncontrolled. By the mid 1960s, Hewlett-Packard (HP) was the largest user of the only digital display of the time, the Nixie tube. This device had its share of disadvantages, from angular reading problems to expensive, high-voltage drive electronics. HP was determined to find another solution. It teamed up with Monsanto Co. and in 1968 both introduced the first usable LED products: digital displays by HP and indicator lamps by Monsanto.
Around the same time, Bell Labs introduced LED products and used them to replace incandescent lamps in multi-line telephone sets. 1968 constitutes the first year in which LED products were designed into end-user equipment.
In 1972/74, HP had a phenomenal success with its HP 35 calculator. In the early years, calculator sales were limited by the availability of LED displays. Other display applications for character sizes from 5–15 mm emerged. The early years of LED technology were dominated by numerical display applications. Indicator lamps played a minor role. By 1975, a new display technology, Liquid Crystal Displays (LCD), killed the battery powered part of the LED display products for reasons of lower power consumption. By 1980, the indicator market for LEDs overtook display products and started to influence performance requirements. The government-forced introduction of the center high-mount stop light (CHMSL) for automobiles in 1986 generated the first LED signaling application with a multi-lumen flux requirement. Its first solution using 72 conventional 5 mm LEDs looked like an unnatural act. The need for power packaging became apparent. In the late 1980s, HP introduced the first LED lamps that were rated beyond the conventional 20-mA limit: 50-mA, 70-mA and then 150-mA. The lamp count for a CHMSL was reduced to 20 in the early 1990s and to 12 by 1998. Better optical light distribution might bring the lamp count down to 1–2. The first automobiles with complete LED lighting on the rear end (except the back-up light) were introduced in 2000. What about white light for illumination? 1992–93 saw the first introduction of GaAlInP and GaInN based LEDs, both with surprising performance. Combining these technologies, LEDs could now cover the entire visible spectrum with unprecedented efficiency for monochromatic sources and with competitive performance against incandescent white lamps. With this improved capability, will LEDs be able to compete with white light sources that are used for lighting? The answer is complex and affirmative. It requires a convoluted analysis based on efficiency, cost, color performance, time and available industry investment.
4
LED Performance
The first GaAsP based LED products in the 1968–72 period were characterized by their ability to be recognized in a well-lit office environment. The target was a brightness of 30 ft-Lambert ≈ 100 nits for a signal to be recognized in ambient light. This target translates to a flux of approximately 0.1 mlm for a segment in the HP 35 calculator display. Such a segment is not visible in direct sunlight or in any bright environment. In 1972–74, LEDs had a major breakthrough in efficiency, nearly an order of magnitude. Figure 5 describes this development in the line labeled “Flux/Package”. To understand this breakthrough we have to look at Eq. (3) for luminous efficacy
$\eta_{\mathrm{lum}} = 683 \cdot R_{\mathrm{eye}} \cdot \eta_{\mathrm{int}} \cdot \eta_{\mathrm{extraction}}$ (3)
The two new factors are: (1) Peak eye response of 683 lm/W at 555nm and (2) eye response relative to 555nm.
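Because Eq. (3) is a plain product, an equal gain in each of the last three factors multiplies through. The sketch below is illustrative only: the function name is hypothetical, and the sample inputs (an eye response of 0.35 and an external quantum efficiency of 53%) are values quoted elsewhere in this paper that are combined here as an assumption, not as a measured device.

```python
PEAK_EYE_RESPONSE_LM_PER_W = 683.0  # peak eye response at 555 nm

def luminous_efficacy(r_eye, eta_int, eta_extraction):
    """Eq. (3): luminous efficacy in lm/W from the relative eye response
    and the internal and extraction efficiencies."""
    return PEAK_EYE_RESPONSE_LM_PER_W * r_eye * eta_int * eta_extraction

# Assumed example: r_eye = 0.35 and eta_int * eta_extraction = 0.53.
example = luminous_efficacy(0.35, 1.0, 0.53)

# Three equal contributions that together give a tenfold improvement.
gain = 10.0 ** (1.0 / 3.0)
ratio = luminous_efficacy(gain * 0.1, gain * 0.1, gain * 0.1) / luminous_efficacy(0.1, 0.1, 0.1)

if __name__ == "__main__":
    print(f"example efficacy: {example:.1f} lm/W, per-factor gain: {gain:.2f}, total: {ratio:.1f}x")
```

A gain of about 2.15× in each of the three factors is enough to account for an order of magnitude in luminous efficacy.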
Fig. 5. Evolution of red LEDs over more than three decades. The line labeled “Flux/Package” represents the most powerful commercially available LED in a given year. The line labeled “Cost/Lumen” describes the price decline per unit of flux for the highest volume LED customers in that year
In our analysis of the progress of red LEDs over the last three decades (Fig. 5) we have to recognize contributions from the last three factors of Eq. (3). The LED performance started around 1 mlm in 1968. In 1973–74 we had a nearly 10× improvement, divided roughly equally among three contributions from $R_{\mathrm{eye}}$, $\eta_{\mathrm{int}}$ and $\eta_{\mathrm{extraction}}$. The first factor resulted from a wavelength shift from 655 nm to 635 nm with improved eye sensitivity. The second factor was caused by a shift from direct to exciton recombination (not explained in this paper). The third factor was due to a shift from absorbing to transparent substrate (Figs. 3–4). The next step in LED development was limited to long wavelength red LEDs and was based on the GaAlAs system. This system is very efficient in the IR but drops off very rapidly in the red. From 1985 to 1992 this was the most efficient material for red LEDs. It lost out to GaAlInP in the mid 1990s because of its sensitivity to Al oxidation in a high temperature, high humidity environment that is typical for traffic lights in southern parts of the USA. In 1992, Toshiba and HP introduced the GaAlInP material system. This system is today, and will be for the near to distant future, the dominant system for red and yellow LEDs. During the mid 1990s, the new GaAlInP technology fell behind the trend line. The cooperation between HP and Philips Lighting led to the first serious recognition that LED technology will play a major role in lighting. This recognition accelerated the interest in power LEDs. The introduction of lamps based on truncated pyramid chips set new efficiency and power records [1] and moved the performance ahead of the now 30 year old trend line with a slope of 20×/decade. The second trend line in Fig. 5 describes the continued reduction in cost/lumen for the same red LED technologies of the flux line. It is based on historical data from my HP files. In the early years, the cost/lamp was more important than the cost/lumen.
With the start of power signaling devices such as CHMSL, the cost/lumen emerged as an important issue. For higher flux applications such as traffic lights, the cost/lumen becomes a competitive issue. Figure 5 represents a 25 year old trend line with a downward slope of 10×/decade. How will this trend continue into the future? To answer this question we are segmenting the flux trend line into its two most important contributions for red LEDs: external quantum efficiency and drive power. The data in Fig. 6 are only approximate. They ignore wavelength shifts (eye response) between materials. The important issues are: (1) the start in 1968–75 and (2) the near term performance in 2000–05. In the early years all performance improvements were due to increases in efficiency. The drive current for indicator lamps was flat at a rated current of 20 mA. As efficiencies increased during the next decades, the efficiency curve started to saturate. At the same time the need for increased flux accelerated drive power. Between 1995 and 2005, the efficiency will increase by approximately 2.5× while drive power will increase by 30×. 1 W lamps were introduced in 2000, 5 W lamps in 2003, and practically every major LED supplier has a 10 W LED lamp on its roadmap for 2004–2005. From here on out, efficiency improvements will be relatively slow but still extremely important in achieving energy savings. Increased drive power will open up new applications but without any impact on energy savings. Power increases will be most important for a continued reduction of the cost/lumen. As LED technology matures, packaging cost will dominate and higher power LEDs will translate into lower cost/lumen.
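The two trend lines are straight lines on a logarithmic scale, so they can be extrapolated with a single exponential. The sketch below assumes exact slopes of 20×/decade for flux and 1/10 per decade for cost/lumen and a starting flux of 1 mlm in 1968; the function name and the normalised cost scale are hypothetical, and the results are rough illustrations rather than data from Fig. 5.

```python
def trend(value0, year0, year, factor_per_decade):
    """Extrapolate along a trend line with a fixed multiplicative factor per decade."""
    return value0 * factor_per_decade ** ((year - year0) / 10.0)

# Flux per package: about 1 mlm (0.001 lm) in 1968, growing 20x per decade.
flux_1998_lm = trend(0.001, 1968, 1998, 20.0)

# Cost per lumen relative to 1975, falling 10x per decade.
relative_cost_2000 = trend(1.0, 1975, 2000, 0.1)

# Flux is efficiency times drive power: 2.5x and 30x between 1995 and 2005.
flux_gain_1995_2005 = 2.5 * 30.0

if __name__ == "__main__":
    print(f"flux in 1998: {flux_1998_lm:.1f} lm")
    print(f"cost/lumen in 2000 relative to 1975: {relative_cost_2000:.4f}")
    print(f"flux gain 1995-2005: {flux_gain_1995_2005:.0f}x")
```

The product of the efficiency and drive-power gains for 1995–2005 comes out above 20×, in line with the statement that recent lamps moved ahead of the 30 year old trend line.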
5
White LEDs and Their Impact on Illumination
Fig. 6. The two main drivers of the flux trend line in Fig. 5 are efficiency and drive power. In the early years, practically all improvements are due to efficiency. In the later years, efficiency saturates and drive power becomes the dominant factor
The addition of the GaInN material system by Nichia and Toyoda Gosei in 1993 added another dimension to the LED ball game. Now, LEDs could cover the entire visible spectrum from red to blue and generate white light by either mixing red, green and blue LEDs or by using a blue LED to pump a phosphor. Both approaches will coexist in the generation of white LED light sources. At this point in the technology cycle, we face two fundamental questions:
• Where is white relative to red, both in flux and cost/lumen?
• Will white continue on a trend line comparable to red or will its flux and cost level off over time?
It has been our speculation for several years that the flux performance of white should track red quite closely. For red at 620 nm, the eye response is around 0.35; for green at 535 nm it is 0.85 and for blue at 470 nm it is 0.09. The red eye sensitivity lies in between the arithmetic and geometric means of the green and blue sensitivities. Therefore, white is expected to be close to the performance of red LEDs. In Fig. 7 we have added recent flux and cost data to Fig. 5. Indeed, white is close to red. White starts below the red flux trend line in 2001 and above the cost trend line. By 2003, white exceeds the red flux trend line. Cost for white is still above the red trend line but is expected to drop close to the trend line by 2005. How will this trend for white continue? Figure 8 combines our forecast for white flux/lamp and its underlying drivers, efficacy and drive power. This time we chose luminous efficacy rather than quantum efficiency. For efficacy we use the current data for 2002–03 and the projected data from the OIDA roadmap [6]. Since the OIDA roadmap may be hyped for reasons of generating government support and because implementation is usually late, we slowed down its progress by 2–5 years. Drive power for the 2001–05 time frame is based on current products or published plans by the leading LED manufacturers. From 2005–10, we expect a slowing trend in drive power for a couple of reasons. During this period cost will still lag market expectations and prevent wide-spread adoption. With an expected performance of 100 W and 70 lm/W in 2010, LEDs will exceed the flux of practically all incandescent or fluorescent lamps used in home or office applications.
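The comparison of eye sensitivities and the 2010 flux estimate are quick to verify. The sketch below only restates numbers given in the text; the dictionary keys and function names are hypothetical.

```python
import math

# Relative eye response values quoted in the text.
EYE_RESPONSE = {"red_620nm": 0.35, "green_535nm": 0.85, "blue_470nm": 0.09}

def means(x, y):
    """Return the arithmetic and geometric means of two positive numbers."""
    return (x + y) / 2.0, math.sqrt(x * y)

def lamp_flux_lm(power_w, efficacy_lm_per_w):
    """Luminous flux of a lamp from drive power and luminous efficacy."""
    return power_w * efficacy_lm_per_w

arithmetic, geometric = means(EYE_RESPONSE["green_535nm"], EYE_RESPONSE["blue_470nm"])
red_between = geometric < EYE_RESPONSE["red_620nm"] < arithmetic

if __name__ == "__main__":
    print(f"arithmetic mean {arithmetic:.2f}, geometric mean {geometric:.2f}, red between: {red_between}")
    print(f"projected 2010 lamp: {lamp_flux_lm(100.0, 70.0):.0f} lm")
```

The red sensitivity of 0.35 does fall between the two means (about 0.28 and 0.47), and a 100 W lamp at 70 lm/W delivers 7000 lm, well above typical home or office lamps.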
Market penetration to a level beyond 3000 lm LED lamps will be limited by high lamp costs and
Fig. 7. Superposition of early white flux and cost data on the red trend lines of Fig. 5. White flux is crossing the red trend line in 2002 while cost remains a factor of 2× above the trend line
Fig. 8. Near term flux performance of white LEDs and extrapolation to 2020. Efficacy is expected to increase by 10× to 150 lm/W while drive power will increase by 500× to 500W in 2020. By 2010, white flux will be ahead of the red trend line by 20×
lack of energy savings in competition with FL and HID lamps. Continued increases in efficacy from 70–150 lm/W will be absolutely crucial during the 2010–20 period to penetrate the medium power market (300–3000 lm) and to establish a beach-head in high lumen applications such as street lights and sports stadiums.
Comparing white flux data with the red trend line, we see a crossover around 2002 and a performance advantage of more than an order of magnitude by 2010. In other words, the white lamps will stay ahead of the red trend line with its 30 years of history. This is an extremely important finding because of its impact on cost predictions for the 2005–20 time frame. If flux performance is ahead of the red trend line by >20× in 2010, we should be able to meet or beat the red cost trend line during the 2010–20 period.
The future battle between LED lamps and conventional lamps is illustrated in Fig. 9. White flux performance from Fig. 8 and trend lines for red flux and cost are extrapolated to 2020. LED lamps will enter the sweet spot for conventional lighting – defined by a flux range from 300 lm for a 25W incandescent light bulb to 3500 lm for a 4 ft fluorescent tube – during the period of 2005–08. Initially, their penetration will be limited by lamp cost, standards and the general unfamiliarity of a new technology. As LED lamps start to penetrate the sweet spot on cost towards the end of the decade and into the 2010–20 period, we will see a steady conversion towards LED lighting.
So far we have compared LEDs with conventional light sources on a lumen for lumen basis. LEDs bring another advantage to the table. LEDs are cold point sources and reflective or refractive optical surfaces can be placed in close proximity. The result is a superior control of light distribution expressed by the distribution efficiency in Eq. (4):
$$\eta_{\mathrm{lighting}} = \eta_{\mathrm{lum}} \cdot \eta_{\mathrm{distribution}} \qquad (4)$$
Fig. 9. A superposition of the flux projection of Fig. 8 onto the trend line of Fig. 5. Since white LEDs will be far ahead of the red trend line, we expect that LED costs will meet or beat the red cost trend line. Later in this decade, LEDs will enter the sweet spot for lighting: flux packages of 300–3000 lm and costs of < 10$/klm
Our current data with experimental LED sources indicate a 20–50% improvement in distribution efficiency over conventional light sources. This result should not surprise any designers of luminaires. Hot sources require remote and large surfaces to shape the light distribution. High quality surfaces are expensive and low quality surfaces quickly dominate the light distribution efficiency. The same argument holds for 4ft FL lamps. The surface of the lamp is cold, but very large. The arguments of the above paragraphs bring us to an important prediction: During the 2015–20 period, LED technology will drive lighting. Its value proposition of energy savings, lamp life even in extreme environments, maintenance, safety and a host of other beneficial features will make LED lighting a disruptive technology. In addition, LED lighting will bring a few other features to the table: • Practically unlimited dimming range without sacrificing efficiency. • Color control, not only along the Planckian white point line but over the entire color gamut of the primary light sources. • Instant turn-on after a power failure (streetlights, sports stadiums). • Longer stand-by operation for a given battery size (emergency lights including dimming options). This list could be extended by including the needs of a variety of lighting applications. It only serves to demonstrate the unmatched and nearly universal lighting contributions that LED lamps will be able to offer. Unfortunately, the transition to LED lighting will hit a few bumps. The conventional lighting technologies have had decades to hone their lighting quality and to spoil their customers’ expectations. Color rendering, white
point control, flux maintenance, variations over temperature, etc. are issues that LED lamps have not addressed at all or are only starting to address. We expect that LEDs will be able to successfully resolve these issues, but it will take 1–2 decades before LEDs can deliver a comparable lighting quality.
6 Energy Savings
The enormous energy savings potential from LED lighting was first outlined in a White Paper by Haitz, Kish, Tsao and Nelson at the OIDA Forum in 1999 [7]. LED lamps will have a significant impact on the amount of electricity that is consumed by lighting. The savings will first come from replacing incandescent and halogen lamps with an average efficacy of 15 lm/W by LEDs with 50 lm/W by 2005. During the 2005–15 period, LEDs will reach 150 lm/W and attack FL and HID lamps with an average efficacy around 75 lm/W. Today’s electricity consumption for lighting in the USA is approximately 60GW averaged over 24 hours: 24GW for incandescent/halogen lamps and 36GW for FL/HID lamps. Let us estimate the maximum energy conservation in a simple-minded model that replaces all existing lamps with LED lamps having an efficacy of 150 lm/W by 2020 (or earlier/later depending on the availability of R&D funds for economic or political reasons). In this model we can save 90% of the electricity used in incandescent/halogen lamps and 50% of that used in FL/HID lamps. For the USA, this energy savings amounts to 40GW, or approximately $40B/year at an electricity cost of $0.11/kWh. By adding a modest improvement in light distribution efficiency of 25%, the projected savings increase to 48GW. In other words, the maximum energy savings potential in lighting could be as high as 80%. This estimate for the USA is based on today’s lighting use and does not include a 2% increase per year. Since the USA lighting amounts to roughly 33% of global use, we estimate the global energy savings at >140GW, or $140B. Should the global society choose to shut down – or not to build new – nuclear reactors, we could reduce the nuclear reactor count by > 100 of the largest 1.4 GW reactors!
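The savings estimate above is a chain of simple multiplications, sketched below. This is only an illustration built from the figures quoted in the text. The 48 GW value that includes improved light distribution is taken as quoted rather than re-derived, and all variable names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h

# US lighting load averaged over 24 hours, as quoted in the text (GW).
incandescent_gw = 24.0
fl_hid_gw = 36.0
total_gw = incandescent_gw + fl_hid_gw

# Simple-minded model: save 90% of incandescent/halogen and 50% of FL/HID power.
saved_gw = 0.90 * incandescent_gw + 0.50 * fl_hid_gw

# Annual value at $0.11/kWh (1 GW = 1e6 kW).
price_usd_per_kwh = 0.11
annual_value_usd = saved_gw * 1e6 * HOURS_PER_YEAR * price_usd_per_kwh

# Quoted figure including better light distribution (assumption: used as given).
saved_with_distribution_gw = 48.0
savings_fraction = saved_with_distribution_gw / total_gw

# Scale to global use (USA is roughly 33% of global lighting) and to 1.4 GW reactors.
global_saved_gw = saved_with_distribution_gw / 0.33
reactor_equivalents = global_saved_gw / 1.4

print(f"US savings from efficacy alone: {saved_gw:.1f} GW")
print(f"annual value: ${annual_value_usd / 1e9:.1f}B")
print(f"savings fraction with distribution gain: {savings_fraction:.0%}")
print(f"global savings: {global_saved_gw:.0f} GW, about {reactor_equivalents:.0f} reactors")
```

The rounded values in the text (40 GW, $40B/year, >140 GW, >100 reactors) follow from these intermediate results.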
7 Lighting Revolution
The above described energy savings potential is not going to be implemented overnight. LED lamps will have to systematically improve performance and cost. But the light at the end of the tunnel is very bright! Today’s lighting technology is divided between more than 10 branches of different lamp technologies: Incandescent, halogen, neon, CFL, FL, low pressure sodium, high pressure sodium, high pressure mercury and variations of the HID family. Each technology requires different lamp manufacturing
processes, electrical drive circuits, luminaire design, and other technology specific features. With LED lamps, we need only one technology to cover all signaling/lighting applications from the dimmest indicators (0.1mlm) to the most demanding sports stadiums (10Mlm). By 2020, no other electricity-based signaling/lighting technology can cover such a dynamic range of 11 orders of magnitude and/or match the efficacy and cost of LEDs at any point within this range! Today, LED lamps represent the only lighting technology with a significant annual slope in performance and cost improvements. Incandescent lamps have practically not improved in efficiency since Edison’s lamp more than 100 years ago. Halogen, FL, CFL, and HID lamps have been similarly stagnant for the past 2–5 decades. The conventional technologies are stuck with the fundamental laws of physics and significant improvements in efficiency are unlikely, if not impossible. LED lamps, and their potential Laser variant, provide the only hope for substantial energy savings in lighting. For this simple reason, LEDs will win the battle for lighting in the 21st century!
8 Government Support and Incentives
As long as this LED story looks so good and so convincing, why is the LED industry requesting substantial government support instead of making the required investment on its own nickel? The answer to this question is complex and has many facets, pointed out by the following arguments:
• The global power LED industry is fairly small, maybe $600–800M in 2002 revenue, and burdened with current R&D expenses around $150M. Profits for most “clean-play” companies are negative and the industry over-all is at break-even, at best.
• The established light bulb companies with annual revenues of $13B consider LEDs to be an intruding technology and therefore the established management is antagonistic and fighting a turf battle.
• Most benefits of LED lighting will flow to other segments of the economy and not to the LED industry. By far the biggest beneficiary will be the electricity consumer through energy savings (approx. 60–70% of the potential benefits). Next come the general economy with a reduction in demand for new power plants and the environment with substantial reductions in the emission of greenhouse gases. And last in this ranking is the LED industry: modestly increased revenues, but no profits in the early, investment dominated years. The LED industry is stuck with the hope for future benefits while the rest of the economy enjoys the energy cost reduction of the LED investments! Would you invest in such an industry, let alone recommend it to your pension fund?
• At the same time, the LED industry shoulders all the investment and market risk. Based on my 33 years of LED experience, I estimate that the
USA based industry will have to invest at least $1B in R&D before the 150 lm/W efficacy goal is reached in 2010–20. The investment sunk to date in these new technologies is in the $150–250M range. That amounts to $750–850M to go. In addition, the LED industry has to carry the market risk: Lighting is a standards-based industry. Such industries innovate only slowly because of the established building codes, union rules, buyer hesitation, fear of exposure to new unproven lights, etc.
• Consumers like to see broad-based trials for new technologies such as the effects on health, cost, and reliability. The consumers ask questions like “Does a government agency really agree with the hyped cost reductions or other advertised benefits?” Who pays to build this customer confidence?
Considering the huge energy savings potential for consumers, the impact on the general economy and the risk exposure for the LED industry, we have no choice but to propose a risk-sharing arrangement:
• Government investments to accelerate research toward the 150 lm/W goal by supporting University, National Labs, and industry investments.
• Government support to reduce the high-risk industry investments in performance improvement, manufacturing cost reduction, standards, and market acceptance.
• Government support to remove or modify acceptance inhibitors such as building codes, union/trade rules, safety/health regulations, etc.
9 Impact on the Environment
Today in the USA, 56% of electricity is generated by the most polluting technology: coal-fired power plants. It is a fair question to ask “What reduction in carbon emissions can we achieve, if all the energy savings from lighting is applied to reduce the use of coal-fired power plants by 2020 at today’s rate of lighting consumption?” The answer is quite amazing. Again we use a simple-minded analysis of complete conversion to LED lamps that have an efficacy of 150 lm/W. The US savings discussed above are estimated at 48GW of average power. At a coal burn rate of 4Mtons/GWyear, we are looking at a potential saving of 192Mtons of coal in 2020. This US saving represents nearly 25% of annual US coal production in 2000 and its implementation would amount to a significant contribution to the Kyoto agreement on greenhouse gas emission.
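The coal estimate is a single product of the saved average power and the quoted burn rate. A minimal sketch follows, with a hypothetical helper name and the two inputs taken from the text.

```python
def coal_saved_mtons(power_saved_gw, burn_rate_mtons_per_gw_year, years=1.0):
    """Coal avoided when a constant power saving displaces coal-fired generation."""
    return power_saved_gw * burn_rate_mtons_per_gw_year * years


# 48 GW saved and 4 Mtons of coal per GW-year, both as quoted in the text.
coal_2020 = coal_saved_mtons(48.0, 4.0)
print(f"coal saved in one year: {coal_2020:.0f} Mtons")
```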
10 Summary
We have a lighting revolution on our hands! This revolution is not an evolution from existing lighting technologies. It is a disruptive revolution based on an emerging light emitting technology that will sweep from indicator lamps to
lighting applications within a couple of decades. Over the last three decades, LEDs have conquered practically all small signal indicator and large area emissive display applications. Within a decade, LEDs will dominate power signaling applications from traffic lights to the brake lights on automobiles. The large lighting market is the next target. Today, LED lamps can beat incandescent lamps on efficacy, but not on initial purchase cost. Given a decade or two with the current performance and cost slope, LEDs will dominate all lighting applications because of their superior value proposition offered to the consumer, the economic infrastructure and the environment. The governments around the globe carry the responsibility to accelerate this revolution through substantial financial support and a speed-up of the regulatory process!
References
1. M. R. Krames et al., “High-power truncated-inverted-pyramid AlGaInP/GaP light-emitting diodes exhibiting >50% external quantum efficiency,” Appl. Phys. Lett., vol. 75, p. 2365, 1999.
2. H. J. Round, “A note on carborundum,” Elec. World, vol. 49, p. 308, 1907.
3. E. E. Loebner, “Subhistories of the Light Emitting Diode,” IEEE Trans. Electron Devices, vol. ED-23, p. 675, 1976.
4. H. J. Welker, “Discovery and Development of III-V Compounds,” IEEE Trans. Electron Devices, vol. ED-23, p. 664, 1976.
5. N. Holonyak and S. F. Bevacqua, “Coherent (visible) light emission from GaAsP junctions,” Appl. Phys. Lett., vol. 1, p. 82, 1962.
6. J. Y. Tsao, “Light Emitting Diodes for General Illumination,” OIDA technology roadmap update 2002, available from Optoelectronic Industry Development Association (www.oida.org).
7. R. H. Haitz et al., “The Case for a National Research Program on Semiconductor Lighting,” White Paper presented at the 1999 OIDA Forum and available from OIDA (www.oida.org).
Quantum Tunneling of Carbon Monoxide in Molecule Cascades
A. J. Heinrich, C. P. Lutz, J. A. Gupta, and D. M. Eigler
IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, CA 95120, USA
Abstract. We combine atom-manipulation, in-situ isotope selection, and molecule cascades to study the quantum tunneling of molecules bound to a surface. The hopping rate of individual CO molecules was measured with scanning tunneling microscopy between 0.5 K and 10 K. We find hopping rates that are independent of temperature below 6 K and exhibit a pronounced isotope effect, hallmarks of quantum tunneling. This hopping rate can be tuned over several orders of magnitude by tailoring interactions with neighboring molecules. At higher temperatures we observe thermally-activated hopping with an anomalously low Arrhenius prefactor that we interpret as tunneling from excited vibrational states.
1 Introduction
Quantum tunneling of atoms and molecules has been observed in many physical, chemical, and biological systems [1,2,3]. The high spatial resolution of the scanning tunneling microscope (STM) provides a unique opportunity to study the motion of single atoms and molecules on surfaces in a well-characterized environment [4]. Tunneling in adsorbate diffusion [5] was recently observed using STM to study hydrogen atoms on a Cu(100) surface [6,7]. The present work investigates the hopping mechanism of CO molecules in molecule cascades where, in contrast to the random walk of diffusion, the hopping direction and rate can be engineered [8]. We study the hopping rates of CO in molecule cascades as a function of temperature, isotope and local environment.
2 Instrumentation
The data presented here were acquired with a variable-temperature STM with a base temperature of 0.5 K. This STM uses a continuous-circulation ³He refrigerator for temperatures from 1.0 K to 4.5 K. Below 1.0 K the STM operates in a single-shot mode with a hold time of 10 hr. ³He is liquefied without the need for a pumped ⁴He bath (1 K pot) through the use of the Joule-Thomson effect. Operation up to ∼40 K is easily achieved with a small heater on the STM. A clean Cu(111) surface was prepared by repeated Ar sputtering and heating of single-crystal Cu in ultra-high vacuum. The crystal was then transferred to the cold STM. CO molecules totaling ∼0.01 of a monolayer were adsorbed onto the surface from a gaseous source with the samples held at ∼15 K. We find that CO does not chemisorb on the surface below ∼10 K. We co-adsorbed ¹²C¹⁶O and ¹³C¹⁶O by mixing isotope-selected gases in the room-temperature vacuum chamber. Residual gas analysis of the CO isotope distribution showed an upper bound of 0.02 for the relative amount of ¹⁷O and ¹⁸O. STM images were recorded at a sample bias voltage of V = 10 mV and a tunnel current of I = 1 nA. The typical conditions for sliding CO were V = 10 mV and I = 40 nA.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 51–60, 2003. © Springer-Verlag Berlin Heidelberg 2003
3 Results
Figure 1 shows an STM image of the Cu(111) surface where 5 CO molecules were positioned with the STM. The locations of CO molecules (substrate Cu atoms) are indicated by filled red circles (blue dots). In addition, the orientation of the second layer Cu atoms is indicated with black dots. It will become evident later that the exact knowledge of the sample orientation is critical for an understanding of the results presented below. An isolated CO (on the left) appears as a single dip, typically 0.05 nm deep [9]. The azimuthal symmetry is consistent with CO adsorption on Cu(111) atop sites; the bonding between C and Cu orients the CO molecule upright over Cu atoms on the surface [10]. Two CO molecules positioned at nearest neighbor sites (middle of Fig. 1) are imaged as dips centered at Cu lattice sites with a peak between them. We refer to this arrangement as a dimer although the molecules can be separated easily with the STM tip [4]. A single CO apparently decreases the local electronic density of states (LDOS) at the adsorption site at the Fermi level which causes the feedback mechanism to move the STM tip closer to the surface in order to keep a constant tunnel current. In contrast the dimer shows an increase in the LDOS right between the two CO molecules.
Fig. 1. STM image (2.5 nm × 1.5 nm) of CO molecules on Cu(111). All STM images in this paper are high-pass filtered to emphasize the local contrast, light areas represent local peaks. Distances labeled in the image: d0 and √3 d0
A CO trimer is shown on the right side of Fig. 1. We call this open, bent arrangement of 3 CO molecules a chevron. Chevrons always spontaneously decay in the same way: the central CO molecule hops by exactly one lattice site (denoted as d0) as indicated in the figure by the arrow. After the CO hops, all 3 CO are at a spacing of √3 d0, with d0 = 0.255 nm the nearest neighbor distance. Chevrons therefore offer a way to control the direction of hopping of an adsorbate on a surface.
3.1 Isotope Selection
Figures 2 and 3 illustrate how inelastic electron tunneling spectroscopy (IETS) [6,11,12] of CO vibrational excitations [14,15] was used to construct structures in which the carbon isotope of each molecule is individually selected. Each curve in Fig. 2A shows two peaks at positive and two inverted peaks at negative voltage. These peaks are due to vibrational excitations corresponding to the frustrated translation mode (|V| = 4 mV) and the frustrated rotation mode (|V| = 35 mV). We observe a shift with carbon isotope (¹²C vs. ¹³C) of 1.0 meV for the CO frustrated rotation at 35 meV and no isotope shift for the frustrated translation at 4 meV. The step in conductance is a factor of 0.2 at |V| = 35 mV and about 0.15 at |V| = 4 mV for CO arranged in the lattice, but only half as large for isolated CO molecules. We do not observe the CO to Cu external stretch mode. The isotope selection works by assembling an array of CO molecules (Fig. 3A), imaging the array with IETS (Fig. 3B) and using the isotope-dependence of the frustrated rotation to determine the carbon isotope of each CO molecule. Each bright (dark) peak in the IETS map of Fig. 3B corresponds to a ¹³C¹⁶O (¹²C¹⁶O) molecule. This information is then used to
Fig. 2. IETS. (A) Inelastic scanning tunneling spectroscopy (d²I/dV²) of CO molecules in a √3 × √3 array. The green (red) curve was taken on top of a ¹³C¹⁶O (¹²C¹⁶O) molecule. (B) The frustrated rotation mode at positive voltage in the dI/dV spectrum. Axes: (A) d²I/dV² [a.u.] versus sample voltage [mV], −40 to 40 mV; (B) dI/dV [a.u.] versus sample voltage [mV], 30 to 40 mV, VAC = 2 mVRMS
Fig. 3. Sorting CO isotopes. (A) STM topograph of a √3 × √3 array of CO. (B) dI/dV image acquired simultaneously with the topograph at VDC = 35.5 mV, IDC = 3.55 nA, VAC = 1.5 mVRMS
build isotope-selected structures as illustrated in Fig. 3B. Vibrational peaks were initially correlated to carbon isotopes by dosing ¹²C¹⁶O and ¹³C¹⁶O separately on clean surfaces.
3.2 Timing Molecule Cascades
It is possible to link chevrons into molecule cascades of arbitrary lengths [8]. The initial configuration of a linked chevron cascade consists of staggered chains of CO dimers and an additional trigger CO molecule (Fig. 4A). To start the cascade one CO molecule is manually moved with the STM tip to set up a chevron which decays after a certain time by the indicated hop (marked 0). The cascade spontaneously proceeds one hop at a time until all chevrons have decayed. The final configuration is shown in Fig. 4B. The CO molecules are now evenly spaced by √3 d0. In order to operate the cascade again, the molecules are moved back to their initial positions one at a time with the tip of the STM. Since the molecules hop spontaneously after the trigger is moved with the STM, molecule cascades represent a new method to manipulate molecules remote from the location of the tip. Inclusion of additional helper CO molecules (those not part of the dimers in Fig. 4A) increases the chevron decay rate by a factor of ∼70 at 7 K. This is one way that the rate of molecule cascades can be adjusted to a desired value. A single timing measurement of a linked-chevron molecule cascade is illustrated in Fig. 4C. The tip height is shown as a function of time during
Fig. 4. Timing a linked-chevron cascade. (A) 1.7 nm × 3.4 nm STM image of the initial arrangement of CO molecules. (B) STM image after all chevrons have decayed. (C) Example of a single timing measurement showing changes in tip height as a function of time. Axes in (C): relative tip height [Å] versus time [s]; annotations: manual move, park tip at F, hop 0, hop 4, hop 5
the different stages of the timing process. The initial chevron is formed by bringing the tip closer to the surface (at t = −1.5 s). The decay of this chevron is marked by a sharp drop in tip height (hop 0 in the inset) and defines t = 0. The tip is then retracted and quickly parked at the final position marked "F", chosen so that the decay of the 4th chevron (hop 4 at t = 11.6 s) can be observed as a step up in tip height. Hops 4 and 5 can be seen at the end of the tip-height trace as steps up and down, respectively. We find that the presence of the STM tip has a significant influence on the decay rate of chevrons at all accessible tunneling conditions, so these rates cannot be measured reliably by repeated imaging. The hopping rates were influenced by the presence of the tip (either slowed or sped up depending on the tip position) even at low tunnel current (I = 50 pA) and bias voltage between V = 5 mV and 50 mV. The size of the influence depends primarily on the tip height (rather than on I or V) when V is kept well below the 35 meV vibrational excitation. As a result of this rather strong tip influence, for the data presented here, we use the STM only to start the cascade by moving the trigger molecule,
and to detect a characteristic signal near the end of the cascade when the last CO hops (Fig. 4C). The measured hopping rate is equal to the number of hops in the cascade, divided by the average measured propagation time. The choice of tip trajectory during each timing measurement ensures that the molecules under test, with the exception of the last hop, are at least 0.7 nm laterally away from the tip at all times. Tip influence can thus be reduced just by increasing the length of the cascade. The fastest hopping rate that can be measured by this method is determined by how fast the tip can be moved from the start to the end of the cascade, which takes about 4 s in our current setup and can be decreased to about 1 s. The statistical distribution of measured times reveals information about the individual underlying hops. Analysis of the propagation times for 160 measurements of an 18-hop cascade shows a variance that is consistent with 16 ± 2 independent hops per measurement, where each hop is modelled to occur with a constant probability per unit time. We therefore conclude that the hops are most likely independent of each other and that the tip influences at most the first and last hop of the measured cascades. A comparison of hopping rates for ¹²C¹⁶O and ¹³C¹⁶O linked-chevron cascades is shown in Fig. 5. To minimize uncontrolled environmental influences such as surface-state standing waves, the 18-hop cascades were fully enclosed by walls of CO with a spacing of d = 5.5d0 on the left and d = 6d0 on the right. All timing measurements were performed in the same location by swapping the CO isotopes in and out. We could not confidently distinguish oxygen isotopes with IETS, and rely on a low initial concentration of the undesired isotopes ¹⁷O and ¹⁸O. To further rule out significant influence from oxygen isotopes we routinely performed consistency checks between hop rates of different groups of CO, and observed no discrepancies.
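The variance argument above can be illustrated with a small simulation. If each hop occurs with a constant probability per unit time, the waiting time for one hop is exponentially distributed, and the total time for n independent hops has mean n/r and variance n/r², so the ratio mean²/variance estimates n. The sketch below assumes exactly this model; the hop rate of 0.4 s⁻¹, the seed, and all function names are illustrative choices, not the authors' analysis code.

```python
import random
import statistics


def simulate_cascade_times(n_hops, hop_rate, n_runs, rng):
    """Total propagation times for cascades of independent exponential hops."""
    return [
        sum(rng.expovariate(hop_rate) for _ in range(n_hops))
        for _ in range(n_runs)
    ]


def effective_hop_count(times):
    """Estimate the number of independent hops as mean^2 / variance."""
    mean = statistics.fmean(times)
    return mean * mean / statistics.variance(times)


def measured_hopping_rate(n_hops, times):
    """Number of hops divided by the average propagation time."""
    return n_hops / statistics.fmean(times)


rng = random.Random(0)  # fixed seed so the run is repeatable
times = simulate_cascade_times(n_hops=18, hop_rate=0.4, n_runs=160, rng=rng)
print(f"estimated independent hops: {effective_hop_count(times):.1f}")
print(f"estimated hopping rate:     {measured_hopping_rate(18, times):.3f} 1/s")
```

With only 160 runs the estimate of the hop count scatters by roughly ±2 around the true value, which is comparable to the uncertainty quoted in the text.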
Fig. 5. Comparison of ¹²C¹⁶O and ¹³C¹⁶O hopping rates in a molecule cascade. Each point represents the average for 8 measurements of an 18 hop cascade. Solid and dotted curves are least-square fits to a function that is the sum of a constant rate due to quantum-tunneling and a thermally activated rate. Axes: rate [1/s] versus T [K]
Figure 5 shows that the hopping rate for each isotope is independent of temperature below ∼6 K. A pronounced isotope effect is evident in the low temperature rates of RQT = 0.396 ± 0.018 s⁻¹ for ¹²C¹⁶O and RQT = 0.110 ± 0.008 s⁻¹ for ¹³C¹⁶O, a ratio of 3.6. Both findings are characteristic of quantum tunneling.
4 Discussion
We will first take a closer look at the tunneling rate at low temperatures. We analyze the isotope-dependent hopping rates in the WKB approximation without isotope shifts of the zero-point energy (ZPE). The exponent in the WKB approximation for the tunneling rate can be evaluated for one isotope by using the measured rate and assuming an attempt rate for tunneling on the order of the external vibrational frequencies (AQT = 5 × 10¹² s⁻¹). With an assumed mass difference of 1 amu, we then compare the measured rates to obtain the tunneling masses. We find the surprising result that the tunneling mass is 13 amu for ¹³C¹⁶O tunneling and 12 amu for ¹²C¹⁶O tunneling. Hence, if isotope ZPE shifts are small enough to be ignored, the relevant tunneling mass is that of the C atom rather than the entire CO molecule. A mixed-isotope cascade in which the molecules that hop are ¹²C¹⁶O and the rest are ¹³C¹⁶O gives a tunneling rate RQT = 0.41 ± 0.03 s⁻¹, identical within error to that for a cascade in which all molecules are ¹²C¹⁶O. This indicates that motion of the C atoms on neighboring molecules does not significantly influence the tunneling process. The temperature dependence of the hopping rate above 6 K approaches a linear slope in an Arrhenius plot (Fig. 5 uses a linear T scale), which indicates a thermally activated process. We model the hopping rate R of each CO isotope as the sum of the quantum tunneling rate RQT and an Arrhenius (thermally activated) rate:
$$R = R_{QT} + A \exp(-E/k_B T),$$
where A is the Arrhenius prefactor, E is the activation energy, and kB is the Boltzmann constant. Fitting the hopping rates for both isotopes using the same values for A and E yields a poor fit (dotted lines in Fig. 5). This implies that the thermally activated process is significantly dependent on the carbon isotope. The isotope dependence of the thermally activated hopping can be attributed either to the prefactor or to the activation energy.
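The WKB mass estimate described at the start of this section can be sketched numerically. The sketch assumes a tunneling rate of the form R = AQT exp(−c√m), with the same barrier constant c and the same attempt rate for both isotopes, so that the ratio of the two WKB exponents equals √((m + 1)/m). This functional form and the helper name are assumptions made for illustration; only the rates and the attempt rate come from the text.

```python
import math


def tunneling_mass_amu(rate_light, rate_heavy, attempt_rate, mass_difference_amu=1.0):
    """Mass of the lighter tunneling particle from two isotope rates.

    Assumes rate = attempt_rate * exp(-c * sqrt(mass)) with identical c for
    both isotopes, so (S_heavy / S_light)^2 = (m + dm) / m.
    """
    exponent_light = math.log(attempt_rate / rate_light)
    exponent_heavy = math.log(attempt_rate / rate_heavy)
    ratio_squared = (exponent_heavy / exponent_light) ** 2
    return mass_difference_amu / (ratio_squared - 1.0)


R_QT_12C = 0.396  # 1/s, measured low-temperature rate for 12C16O
R_QT_13C = 0.110  # 1/s, measured low-temperature rate for 13C16O
A_QT = 5e12       # 1/s, assumed attempt rate from the text

isotope_ratio = R_QT_12C / R_QT_13C
mass_light = tunneling_mass_amu(R_QT_12C, R_QT_13C, A_QT)
print(f"isotope rate ratio: {isotope_ratio:.1f}")
print(f"tunneling mass of the lighter isotope: {mass_light:.1f} amu")
```

Under these assumptions the result lies close to 12 amu and far from the 28 amu of a whole ¹²C¹⁶O molecule, in line with the conclusion that the carbon atom sets the tunneling mass.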
Allowing a separate prefactor (A) for each isotope produces an excellent fit (Fig. 5, solid lines). The fit yields E = 9.5 ± 0.9 meV, $A = 10^{5.8 \pm 0.5}$ s⁻¹ for ¹²C¹⁶O and $A = 10^{5.4 \pm 0.5}$ s⁻¹ for ¹³C¹⁶O. Instead permitting separate activation energies (E) produces a fit that is nearly indistinguishable from the solid lines in Fig. 5. In that case, fit parameters for the activation energy are E = 8.9 ± 0.9 meV for ¹²C¹⁶O and E = 9.6 ± 0.9 meV for ¹³C¹⁶O. The shared Arrhenius prefactor
is $A = 10^{5.5 \pm 0.5}$ s⁻¹. Allowing both A and E to depend on the isotope does not improve the fit significantly. Therefore, these measurements cannot directly distinguish the relative importance of these two parameters in the isotope dependence. The ZPE of the molecule in its initial potential well should depend on the molecule’s mass and thus produce an isotope shift of the activation energy E: the lighter molecule will have a higher initial energy so less thermal energy is needed to carry it over the barrier. It should be noted that tunneling and classical over-the-barrier processes both depend on the stiffness of the full many-dimensional potential energy surface in the coordinates perpendicular to the reaction path. This makes it difficult to determine the value, and even the sign, of isotope shifts. For example, modes that are stiffer in the barrier than in the initial state favor passage of heavier isotopes [16]. The entire range of hopping rates in Fig. 5 can be modeled with an isotope shift in the ZPE of about 0.7 meV, and a prefactor (for both isotopes) of $10^{5.5 \pm 0.5}$ s⁻¹. This prefactor, if interpreted as an attempt rate, is anomalously low however: vibrational frequencies for CO are at least six orders of magnitude higher. Such low prefactors have been observed in several STM atom-tracking measurements of diffusion on surfaces and it was found that they also coincide with activation energies below ∼100 meV [17]. We suggest that low prefactors can be the result of tunneling from thermally excited states. In such a model, the activation energy of ∼9.5 meV corresponds to the energy of an excited vibrational state of the CO molecule in the chevron configuration. It is not possible to measure the vibrational spectrum of a chevron since it decays much faster than the time needed for the measurement of an IETS spectrum. However, chevrons can be stabilized through their interaction with additional CO molecules.
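The rate model and the size of the prefactor anomaly can be made concrete with a short calculation. The sketch below evaluates the sum of a constant tunneling rate and an Arrhenius term using the fit values quoted for ¹²C¹⁶O (E = 9.5 meV, prefactor 10^5.8 s⁻¹), and compares the shared prefactor 10^5.5 s⁻¹ with the assumed attempt rate of 5 × 10¹² s⁻¹. The function name and the chosen temperatures are illustrative.

```python
import math

K_B_MEV_PER_K = 0.08617333  # Boltzmann constant in meV/K


def model_hopping_rate(temperature_k, r_qt, prefactor, activation_mev):
    """Constant quantum-tunneling rate plus a thermally activated Arrhenius rate."""
    thermal = prefactor * math.exp(-activation_mev / (K_B_MEV_PER_K * temperature_k))
    return r_qt + thermal


R_QT_12C = 0.396        # 1/s, low-temperature rate for 12C16O
A_12C = 10 ** 5.8       # 1/s, fitted prefactor for 12C16O
E_ACTIVATION_MEV = 9.5  # meV, fitted activation energy

for temperature in (2.0, 4.0, 6.0, 8.0, 10.0):
    rate = model_hopping_rate(temperature, R_QT_12C, A_12C, E_ACTIVATION_MEV)
    print(f"T = {temperature:4.1f} K  ->  R = {rate:.3f} 1/s")

# Gap between the assumed vibrational attempt rate and the shared fitted prefactor.
orders_of_magnitude = math.log10(5e12 / 10 ** 5.5)
print(f"prefactor is lower than the attempt rate by 10^{orders_of_magnitude:.1f}")
```

In this model the Arrhenius term is negligible at the lowest temperatures, which is why the rate is flat there, and it takes over as the temperature approaches 10 K.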
IETS of such a structure yielded some evidence for a vibrational mode with detectable cross section at 6–9 meV and is still under investigation. The frustrated rotation mode shifts up by about 1 meV in comparison to isolated CO molecules [8]. In such a model, the low prefactor is the product of the tunneling attempt rate and the probability of tunneling through the barrier when starting from the excited state [18]. This tunneling probability, and hence the prefactor, should show a mass dependence. At higher temperatures, this thermally-assisted tunneling model predicts a further increase of the slope in the Arrhenius plot when direct over-the-barrier processes or higher vibrational states become thermally excited. We note that the data presented here may be the result of isotope ZPE shift and thermally-assisted tunneling acting together. Quantum tunneling rates are exponentially sensitive to factors that influence the height or width of the energy barrier. It is therefore imperative to control the details of the local environment of the molecule cascades when determining their hopping rates. We observed that walls of CO molecules in proximity to a CO cascade systematically change the hopping rate. This is demonstrated in Fig. 6, where the hopping rate of a 16-hop linked-chevron
Fig. 6. Environmental influence on hopping rates. (A) STM image (5 nm × 3 nm) of the CO timing structure (center) with walls made of CO molecules on both sides. The wall spacing is d = 3d0. A spacing of d = 0 corresponds to a continuous √3 × √3 array. (B) The hopping rate (in 1/s, at T = 7 K) of 13C16O as a function of the wall spacing d/d0. Error bars are based on the measured spread in 5 runs of a cascade of length 16 hops
cascade is measured as a function of wall spacing d. While the hopping rate is independent of wall spacing for d > 4d0 , it increases by a factor of 10 as d is reduced to d = 1d0 . The crystal orientation (indicated in Fig. 1 via the second layer Cu atoms) also significantly affects CO hopping rates. Linked-chevron cascades made to propagate in the direction opposite to that in Fig. 4 proceed at an 8 times slower rate (at 6 K), indicating that the orientation of second layer copper atoms is important.
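The size of the isotope effect implied by the Arrhenius picture used above can be estimated with a short calculation. The following Python sketch assumes the simple form rate = A exp(−E/k_B T) with the quoted prefactor of 10^5.5 s^-1, an activation energy of about 9.5 meV, and a ZPE shift of about 0.7 meV. Assigning 9.5 meV to the heavier isotope, and the temperatures used, are illustrative assumptions and not measured values.

```python
import math

K_B_MEV_PER_K = 0.0861733  # Boltzmann constant in meV/K

PREFACTOR_HZ = 10 ** 5.5   # prefactor quoted in the text (same for both isotopes)
E_HEAVY_MEV = 9.5          # assumption: quoted activation energy taken for the heavier isotope
ZPE_SHIFT_MEV = 0.7        # isotope shift of the zero-point energy quoted in the text
E_LIGHT_MEV = E_HEAVY_MEV - ZPE_SHIFT_MEV  # lighter isotope starts higher in the well


def arrhenius_rate(prefactor_hz, e_act_mev, temp_k):
    """Arrhenius rate A * exp(-E / (k_B T)) in 1/s."""
    return prefactor_hz * math.exp(-e_act_mev / (K_B_MEV_PER_K * temp_k))


def isotope_rate_ratio(temp_k):
    """Ratio (light isotope rate) / (heavy isotope rate) at temperature temp_k."""
    light = arrhenius_rate(PREFACTOR_HZ, E_LIGHT_MEV, temp_k)
    heavy = arrhenius_rate(PREFACTOR_HZ, E_HEAVY_MEV, temp_k)
    return light / heavy


if __name__ == "__main__":
    for temp in (6.0, 8.0, 10.0):  # illustrative temperatures in kelvin
        print(temp, arrhenius_rate(PREFACTOR_HZ, E_HEAVY_MEV, temp), isotope_rate_ratio(temp))
```

Because the prefactor is shared, the ratio reduces to exp(ΔE/k_B T), so a sub-meV shift already changes the rate by a factor of a few at these temperatures.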
5
Summary
We have demonstrated a novel system in which the quantum tunneling of molecules can be studied and controlled on the atomic scale. The well characterized influences of neighboring CO molecules (helpers and walls), isotope selection, and choice of propagation direction allow us to tune the tunneling rate over several orders of magnitude. Conversely, the cascade tunneling rate can be exploited to sensitively probe local environments. At higher temperatures, tunneling from thermally-excited vibrational states readily accounts for our observations of a low Arrhenius prefactor and a mass dependence. We suggest that thermally-assisted tunneling can play an important role in chemical kinetics when Arrhenius behavior exhibits anomalously low prefactors.
References
1. J.O. Alben et al., Phys. Rev. Lett. 44, 1157 (1980).
2. A.A. Louis, J.P. Sethna, Phys. Rev. Lett. 74, 1363 (1995).
3. A.C. Luntz, J. Harris, Surf. Sci. 258, 397 (1991).
4. B.G. Briner, M. Doering, H.P. Rust, A.M. Bradshaw, Science 278, 257 (1997).
5. R. Baer, Y. Zeiri, R. Kosloff, Surf. Sci. 411, L783 (1998).
6. L.J. Lauhon, W. Ho, Phys. Rev. Lett. 85, 4566 (2000).
7. J. Kua, L.J. Lauhon, W. Ho, W.A. Goddard III, J. Chem. Phys. 115, 5620 (2001).
8. A.J. Heinrich, C.P. Lutz, J.A. Gupta, D.M. Eigler, Science 298, 1287 (2002).
9. L. Bartels, G. Meyer, K.-H. Rieder, Appl. Phys. Lett. 71, 213 (1997).
10. G. Witte, Surf. Sci. 502–503, 405 (2002).
11. B.C. Stipe, M.A. Rezaei, W. Ho, Science 280, 1732 (1998).
12. J.I. Pascual et al., Surf. Sci. 502–503, 1 (2002).
13. L.J. Lauhon, W. Ho, Phys. Rev. B 60, 8525 (1999).
14. R. Raval et al., Surf. Sci. 203, 353 (1988).
15. J. Braun et al., J. Chem. Phys. 105, 3258 (1996).
16. R. Baer, Y. Zeiri, R. Kosloff, Phys. Rev. B 54, R5287 (1996).
17. J.V. Barth, H. Brune, B. Fischer, J. Weckesser, K. Kern, Phys. Rev. Lett. 84, 1732 (2000).
18. P.J. Price, Am. J. Phys. 66, 1119 (1998).
Boson Cavities: From Electronic Transport to Quantum Chaos
Tobias Brandes
Department of Physics, UMIST, PO Box 88, Manchester M60 1QD, UK
Abstract. This is a ‘mini-review’ of some recent results on electron transport through two-level systems (e.g., double quantum dots) and simple mesoscopic scatterers (delta barrier), interacting with dissipative boson baths and single boson modes (phonons, photons). The relevant models (Spin-Boson system, Rabi-Hamiltonian) and their stationary properties (electron current, boson distribution) are investigated. For a single boson mode interacting with N two-level systems, the relation between quantum chaos and a quantum phase transition for N → ∞ is discussed.
1
Introduction
A large part of solid state physics deals with the interactions between fermions (electrons) and bosons (phonons, photons, magnons etc.). There is a recent trend towards studying these interactions in their ‘purest form’, i.e. in quantum systems with only very few effective degrees of freedom, to be controlled from the ‘outside’ by external parameters such as magnetic fields or gate voltages. Typically, nanoscales and low temperatures are required in order to master the complexity of condensed matter (many-body effects, decoherence) if one wishes to realise quantum optical effects in quantum transport, or to achieve control over the two key elements of quantum mechanics (quantum superpositions and entanglement) in an ‘artificial’, man-made system. On the theoretical side, simple models for fermion-boson interactions continue to be fascinating, as often very non-trivial results can be obtained from even the most primitive Hamiltonians. In this short overview, I discuss models that describe electron transport and two-level systems interacting with dissipative boson baths and single boson modes. The main focus will be on electron-phonon interactions, motivated by recent experiments in coupled quantum dots [1,2,3,4,5,6,7,8,9], ‘recoil’ effects in free-standing semiconductor quantum dots [10], or systems where phonons start to become controllable (phonon confinement, ‘phonon cavity QED’ [11,12,13,14,15,16,17,18]). More or less closely related (although not reviewed here) are situations where vibrational degrees of freedom play a big role, such as in experiments on transport through single molecules [19,20,21,22,23,24], electron ‘shuttles’, free-standing and movable nanostructures [25,26,27,28,29,30,31,32], or theories dealing with macroscopic ‘quantum mechanics’ of, e.g., cantilevers coupled to Cooper pair boxes [33].
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 63–77, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Phonon Cavities
The simplest model for a phonon cavity is an infinitely extended, homogeneous thin plate of thickness 2b, where phonons are described by a displacement field u(r), disregarding the microscopic crystal structure. The interactions of dilatational and flexural phonon modes with quantum dot electrons have been investigated by S. Debald et al. [18]. The determination of the phonon-subband dispersion relation $\omega_{n,q}$ (q is the in-plane phonon wave vector) from the Rayleigh-Lamb equations is a well-known problem from elasticity theory [34,35,36], although non-trivial due to the boundary conditions at the surfaces that mix longitudinal and transversal propagation. The numerically determined $\omega_{n,q}$ curves have minima at values for q that correspond to van Hove singularities in the phonon density of states at certain finite energies $\hbar\omega$. These singularities (which are a geometrical effect and not due to, e.g., the crystal structure) are ‘nanomechanical fingerprints’ of confinement and lead to a strong increase in electron-phonon scattering at the corresponding energies. This has been predicted to be observable in energy-dependent non-linear transport spectroscopy in coupled quantum dots. Another surprising feature of the thin plate model is the vanishing of the deformation potential (DP) interaction between electrons and phonons for $q = q_{t,n}$, where the $q_{t,n}$ denote the solutions for the transversal wavevectors associated with the transversal speed of sound $c_t$. For dilatational modes, the displacement field has zero divergence at the phonon energy
\[ \hbar\omega_0 = \frac{\pi}{\sqrt{2}}\,\frac{\hbar c_t}{b}, \tag{1} \]
and the electron-phonon DP vanishes at this energy. If electrons are confined symmetrically in the midplane between the two plate surfaces, flexural phonon modes are decoupled due to symmetry. Then, the dominant second order DP electron-phonon scattering is completely ‘switched off’ for energy transfers $\Delta = \hbar\omega_0$ (although contributions from fourth and higher order processes still remain possible). Since piezoelectric coupling is weaker than DP interaction for small b, the condition $\Delta = \hbar\omega_0$ defines a ‘dissipation-free manifold’ for midplane electrons in, e.g., double quantum dots (see below). It should be emphasised that in contrast to phononic crystals, the vanishing of the electron-phonon interaction and the van Hove singularities discussed here occur in a simple homogeneous, infinitely extended plate that is confined in just one direction.
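To get a feeling for the energy scale in Eq. (1), the following Python sketch evaluates $\hbar\omega_0 = \pi\hbar c_t/(\sqrt{2}\,b)$. The numerical inputs are assumptions chosen only for illustration: a transversal sound velocity of 3350 m/s (roughly that of GaAs) and a half-thickness b = 0.5 µm; neither value is taken from the text.

```python
import math

HBAR_J_S = 1.054571817e-34      # reduced Planck constant in J s
EV_IN_J = 1.602176634e-19       # one electron volt in joule


def dp_free_energy_ev(c_t_m_per_s, half_thickness_m):
    """Energy hbar*omega_0 of Eq. (1), in eV, for a plate of thickness 2b."""
    energy_j = math.pi / math.sqrt(2.0) * HBAR_J_S * c_t_m_per_s / half_thickness_m
    return energy_j / EV_IN_J


if __name__ == "__main__":
    c_t = 3350.0   # assumed transversal speed of sound in m/s (illustrative)
    b = 0.5e-6     # assumed half-thickness of the plate in m (illustrative)
    print("hbar*omega_0 in micro-eV:", 1e6 * dp_free_energy_ev(c_t, b))
```

With these assumed values the energy comes out at roughly 10 µeV, and it scales as 1/b, so thinner plates push the ‘dissipation-free’ energy transfer upwards.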
3 Non-linear Transport: Dissipative Spin-Boson System
Double quantum dots are sensitive phonon emitters and detectors [3,4,37] and can be described by a (pseudo) spin-boson model [38]
\[ H = \frac{\varepsilon}{2}\sigma_z + T_c\sigma_x + \frac{1}{2}\sigma_z A + \sum_Q \omega_Q a_Q^\dagger a_Q, \tag{2} \]
\[ A := \sum_Q g_Q\left(a_{-Q} + a_Q^\dagger\right), \tag{3} \]
where one additional ‘transport’ electron tunnels between a left (L) and a right (R) dot with energy difference ε and inter-dot coupling $T_c$, where $\sigma_z = |L\rangle\langle L| - |R\rangle\langle R|$ and $\sigma_x = |L\rangle\langle R| + |R\rangle\langle L|$. Here, $\omega_Q$ are the frequencies of phonons, and the $g_Q$ denote interaction constants. The coupling to external leads offers the possibility to study spin-boson dynamics in transport properties such as the (non-)stationary electronic current or shot noise. The simplest description is that for non-linear transport with the lead chemical potentials $\mu_L \to \infty$ and $\mu_R \to -\infty$ [39,40,41,38], allowing for an additional ‘empty’ state and tunneling from a left reservoir at rate $\Gamma_L$ into the left dot, and from the right dot to the right reservoir at rate $\Gamma_R$. Lowest order perturbation theory in these rates (neglecting higher order terms in $\Gamma_{L/R}$ [42,43]) yields an equation of motion for the reduced statistical operator ρ(t) [38,39],
\[ \frac{\partial}{\partial t}\rho_{LL}(t) = -iT_c\left[\rho_{LR}(t) - \rho_{RL}(t)\right] + \Gamma_L\left[1 - \rho_{LL}(t) - \rho_{RR}(t)\right], \]
\[ \frac{\partial}{\partial t}\rho_{RR}(t) = -iT_c\left[\rho_{RL}(t) - \rho_{LR}(t)\right] - \Gamma_R\,\rho_{RR}(t). \tag{4} \]
For the remaining equation for the off-diagonal element $\rho_{LR} = \rho_{RL}^*$, one has to choose between perturbation theory in $g_Q$ (weak coupling, PER), or in $T_c$ in a polaron-transformed frame (strong coupling, POL) [44]. In general, no exact solution of the model is available even for the simplest case of only one bosonic mode (see below). The standard Born and Markov approximation with respect to A yields
\[ \frac{d}{dt}\rho_{LR}^{\rm PER}(t) = \left[i\varepsilon - \gamma_p - \Gamma_R/2\right]\rho_{LR}(t) + \left[iT_c - \delta_-\right]\rho_{RR}(t) - \left[iT_c - \delta_+\right]\rho_{LL}(t). \tag{5} \]
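Here, the rates are
\[ \gamma_p \equiv 2\pi\frac{T_c^2}{\Delta^2}\,\rho(\Delta)\coth(\beta\Delta/2), \qquad \rho(\omega) \equiv \sum_Q |g_Q|^2\,\delta(\omega - \omega_Q), \tag{6} \]
\[ \delta_\pm \equiv -\frac{\varepsilon T_c}{\Delta^2}\,\frac{\pi}{2}\,\rho(\Delta)\coth(\beta\Delta/2) \mp \frac{T_c}{\Delta}\,\frac{\pi}{2}\,\rho(\Delta), \tag{7} \]
where $\Delta := \sqrt{\varepsilon^2 + 4T_c^2}$ is the energy difference of the hybridized levels, and $\beta = 1/k_BT$ the inverse phonon equilibrium bath temperature. Note that besides the off-diagonal decoherence rate $\gamma_p$, there appear terms $\propto \delta_\pm$ in the diagonals which turn out to be important for the stationary current. On the other hand, the polaron transformation [38] leads to an integral equation
\[ \rho_{LR}^{\rm POL}(t) = -\int_0^t dt'\, e^{i\varepsilon(t-t')}\left[\frac{\Gamma_R}{2}\,C(t-t')\rho_{LR}(t') + iT_c\left\{C(t-t')\rho_{LL}(t') - C^*(t-t')\rho_{RR}(t')\right\}\right], \tag{8} \]
where
\[ C(t) = \langle X_t X^\dagger\rangle, \qquad C(t) := \exp\left\{-\int_0^\infty d\omega\,\frac{\rho(\omega)}{\omega^2}\left[(1 - \cos\omega t)\coth(\beta\omega/2) + i\sin\omega t\right]\right\}, \tag{9} \]
is the phonon equilibrium correlation function of the displacement operators X ($X_t$ is the time evolution of X with respect to the phonon system),
\[ X = \prod_Q D_Q\!\left(\frac{g_Q}{\omega_Q}\right), \qquad D_Q(z) = e^{z a_Q^\dagger - z^* a_Q}, \tag{10} \]
where $D_Q(z)$ is the unitary displacement operator for the phonon mode Q.
3.1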
Dissipative Landau-Zener Problem and Quantum Pump
For $\Gamma_L = \Gamma_R = 0$, one can study adiabatic transfer [45,46,47,48] of electrons from, e.g., the left to the right dot under the influence of a Hamiltonian with a slow time-dependence $T_c(t) = -\frac{\Delta}{2}\sin\Omega t$, $\varepsilon(t) = -\Delta\cos\Omega t$. This is relevant, e.g., for adiabatic quantum computation schemes in solids [49,50,51], where one must be fast enough in order to avoid dissipation, and slow enough in order to avoid undesired Landau-Zener transitions to excited states. This trade-off can be quantified [52,53] by calculating the inversion change $\delta\langle\sigma_z\rangle_f$ from the ideal value $\langle\sigma_z\rangle_f = -1$, which for a slow half-period sweep (duration π/Ω) yields
\[ \delta\langle\sigma_z\rangle_f \approx 1 - \left[\left(\frac{\Delta}{\omega_R}\right)^2 + \left(\frac{\Omega}{\omega_R}\right)^2\cos\left(\frac{\pi\omega_R}{\Omega}\right)\right] + \frac{2c}{\Omega}\,\rho(\Delta)\,n_B(\Delta), \tag{11} \]
where $\omega_R \equiv \sqrt{\Omega^2 + \Delta^2}$, $n_B$ is the Bose distribution, and $c = \pi^3 J_{3/2}(\pi)/(4\sqrt{2}) = 2.4674$. The ground state of the system ‘rotates’ on a curve in ε-$T_c$-space with constant energy difference ∆ to the excited level, such that for $\Delta = \hbar\omega_0$ dissipation due to phonon absorption can be switched off in a thin plate cavity as discussed above. The effect of dissipation on adiabatic rotation of quantum states [54,53] can in principle be measured as a time-averaged current in a ‘quantum pump’:
One pump cycle starts with an additional electron in the left dot and an adiabatic rotation of the parameters (ε(t), $T_c(t)$) by changing, e.g., gate voltages as a function of time. This completely quantum-mechanical part of the cycle is performed in the ‘safe haven’ of the Coulomb- and the Pauli-blockade [55], i.e., with the left and right energy levels of the two dots well below the chemical potentials of the leads. The cycle continues with closed tunnel barrier $T_c = 0$ and increasing $\varepsilon_R(t)$; the two dots then are still in a superposition of the left and the right state. The subsequent lifting of the right level above the chemical potential of the right lead constitutes a measurement of that superposition: the electron is either in the right dot (with a high probability $1 - \frac{1}{2}\delta\langle\sigma_z\rangle_f$) and tunnels out, or the electron is in the left dot (and nothing happens because the left level is still below µ and the system is Coulomb blocked). For $\Gamma_R, \Gamma_L \gg t_{\rm cycle}^{-1}$ the decharging of the right dot and the recharging of the left dot from the left lead is fast enough to bring the system back into its initial state with one additional electron on the left dot, and the average electron current is
\[ I_{\rm pump} = -\frac{e}{t_{\rm cycle}}\left[1 - \frac{1}{2}\,\delta\langle\sigma_z\rangle_f\right]. \tag{12} \]
Here, the leads act as classical measurement devices of the quantum-mechanical time-evolution between the two dots. Note that the present scheme combines the ‘classical’ pumping aspect in Coulomb blockaded systems [56,57,58,59] with tunneling/quantum interference in mesoscopic pumping [60,61,62,63,64,65]. Similar schemes for adiabatic transfer have been suggested by Silvestrini and Stodolsky [66], Barnes and Milburn [67], and realised experimentally in a superconducting Cooper pair box [68].
3.2
Stationary Current
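The stationary electron current $I_{\rm stat} = -e\,2T_c\,\mathrm{Im}\,\hat\rho_{LR}(z = 0)$ through the double quantum dot is obtained from Laplace transforming the equations of motion as an infinite sum of contributions $G_+$ (= $I_{\rm stat}/e$ in lowest order in $T_c$) and $G_-$,
\[ I_{\rm stat}(\varepsilon) = \frac{-e\,\Gamma_L\Gamma_R\,G_+(\varepsilon)}{\Gamma_L G_-(\varepsilon) + (\Gamma_L + \Gamma_R)\,G_+(\varepsilon) - \Gamma_L\Gamma_R}. \tag{13} \]
Here, the expressions
\[ G_\pm^{\rm (PER)} \equiv 2T_c\,\mathrm{Im}\,\frac{iT_c - \delta_\pm}{i\varepsilon - \gamma_p - \Gamma_R/2}, \qquad \begin{pmatrix} G_+^{\rm (POL)} \\ G_-^{\rm (POL)} \end{pmatrix} \equiv 2T_c\,\mathrm{Im}\left[\frac{-iT_c}{1 + \frac{1}{2}\Gamma_R C_\varepsilon}\begin{pmatrix} C_\varepsilon \\ C^*_{-\varepsilon} \end{pmatrix}\right] \tag{14} \]
are obtained in perturbation theory in $g_Q$ (PER) and from the polaron transformation (POL, perturbation theory in $T_c$) and the subsequent decoupling of the correlation function C(t), Eq. (9), with Laplace transform
\[ C_\varepsilon := \int_0^\infty dt\, e^{i\varepsilon t}\,C(t). \tag{15} \]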
Note that PER works in the correct eigenstate base of the hybridized system (level splitting ∆), whereas the energy scale ε in POL is that of the two isolated dots ($T_c = 0$) and therefore does not incorporate the square-root hybridization form of $\Delta = \sqrt{\varepsilon^2 + 4T_c^2}$. However, for large $|\varepsilon| \gg T_c$, $\Delta \to |\varepsilon|$, and POL and PER coincide: in this limit and for small electron-phonon coupling we find
\[ G_\pm(\varepsilon) \to -2T_c^2\,\frac{\Gamma_R/2 + \gamma(\pm\varepsilon)}{\Gamma_R^2/4 + \varepsilon^2}, \tag{16} \]
\[ \gamma(\varepsilon) := \frac{\pi}{2}\,\rho(|\varepsilon|)\left[\coth(\beta|\varepsilon|/2) + \mathrm{sgn}(\varepsilon)\right]. \tag{17} \]
For $\varepsilon \gg T_c$, the stationary current Eq. (13) is determined by the function γ(ε), showing the broad ‘shoulder’ on the spontaneous phonon emission side of the resonant tunneling peak, as observed by Fujisawa et al. [3].
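The weak-coupling (PER) expressions can be combined into a small numerical routine. The Python sketch below evaluates Eqs. (6), (7), (14, PER) and (13) in units e = ħ = 1. The ohmic spectral density with exponential cutoff, and all parameter values, are hypothetical choices made only for illustration; the text does not specify ρ(ω).

```python
import math


def ohmic_density(omega, alpha=0.01, omega_c=10.0):
    """Assumed spectral density rho(w) = 2*alpha*w*exp(-w/omega_c) (illustrative only)."""
    return 2.0 * alpha * omega * math.exp(-omega / omega_c) if omega > 0 else 0.0


def stationary_current_per(eps, t_c, gamma_l, gamma_r, beta, rho=ohmic_density):
    """Stationary current of Eq. (13) with G_+/- from the PER part of Eq. (14)."""
    delta = math.hypot(eps, 2.0 * t_c)               # Delta = sqrt(eps^2 + 4 Tc^2)
    r = rho(delta)
    coth = 1.0 / math.tanh(0.5 * beta * delta)
    gamma_p = 2.0 * math.pi * (t_c / delta) ** 2 * r * coth           # Eq. (6)
    g = {}
    for sign in (+1, -1):
        # Eq. (7): delta_+ carries the upper (minus) sign of the second term
        d = (-0.5 * math.pi * eps * t_c / delta ** 2 * r * coth
             - sign * 0.5 * math.pi * t_c / delta * r)
        g[sign] = 2.0 * t_c * ((1j * t_c - d) / (1j * eps - gamma_p - 0.5 * gamma_r)).imag
    g_plus, g_minus = g[+1], g[-1]
    return (-gamma_l * gamma_r * g_plus
            / (gamma_l * g_minus + (gamma_l + gamma_r) * g_plus - gamma_l * gamma_r))


if __name__ == "__main__":
    for eps in (-1.0, -0.5, 0.5, 1.0):
        print(eps, stationary_current_per(eps, 0.05, 0.1, 0.1, beta=50.0))
```

Without phonons (ρ = 0) the routine reduces to a Lorentzian resonant-tunneling line in ε; with the assumed bath at low temperature the magnitude of the current is larger for ε > 0 (phonon emission) than for ε < 0, in line with the ‘shoulder’ described above.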
4
Non-linear Transport: Single Boson Mode
The Rabi Hamiltonian [69,70] is given by the single boson mode version $\omega_Q = \omega\delta_{Q,Q_0}$, $g_Q = g\delta_{Q,Q_0}$, of Eq. (2) or canonically equivalent forms of it. It is probably one of the best studied models for the interaction of matter with light [70] and can be used, e.g., to study the transfer of quantum coherence from light to matter (control of tunneling by electromagnetic fields [71,54]) and vice versa [72,73,74]. Non-equilibrium physics of single or few vibration modes in molecular electronic transport has already been studied experimentally and theoretically. Phonon cavity experiments with quantum dots in free-standing semiconductor structures give indications of ‘recoil’ effects related to confined phonon modes [10]. One could furthermore envisage transport experiments through coupled quantum dots interacting with single phonon or photon cavity modes, or single vibrational degrees of freedom of macroscopic mechanical devices (such as cantilevers) coupled to microscopic charges [75,33]. The Rabi Hamiltonian is probably the simplest model for the interaction of light with matter, and yet it is only exactly solvable at certain values of the coupling constant g (‘Juddian points’), or in the rotating wave approximation. We have therefore numerically solved the master equation for non-linear transport through the coupled single mode (pseudo) spin-boson system (again corresponding to, e.g., double quantum dots in the Coulomb blockade regime), without invoking any further approximations such as decoupling schemes or perturbation theory in g. However, the bosonic Hilbert
space has to be truncated at a finite number N of boson states. The total number of equations for the stationary dot-boson density operator
\[ \rho^{ij}_{nm} := \langle n, i|\rho|j, m\rangle, \qquad i, j = 0, L, R, \tag{18} \]
then is $5N^2 + 10N + 5$, remembering that there is always an equation for n = m = 0.
4.1
Boson Damping and Boson Distribution
Damping of the boson mode a† can be described within the master equation approach in the standard Lindblad form [76],
\[ \dot\rho\,|_{\rm damping} = \frac{\gamma_b}{2}\left(2a\rho a^\dagger - a^\dagger a\rho - \rho a^\dagger a\right), \tag{19} \]
which can easily be incorporated either in the numerical approach, or used in the analytical expression for the stationary current $I_{\rm stat}(\varepsilon)$, Eq. (13), from the polaron transformation and subsequent decoupling of the boson degree of freedom with the correlation function
\[ C(t) = \exp\left\{-\left(\frac{g}{\omega}\right)^2\left[1 - e^{-(\gamma_b/2 + i\omega)t}\right]\right\}, \qquad t \ge 0. \tag{20} \]
The analytical result Eq. (13), together with Eqs. (14), (15), and (20), agrees relatively well with the numerical solution, both showing phonon emission peaks at ε = nω for small $T_c$ and small damping $\gamma_b$, cf. Fig. 1. There is, however, a more fundamental issue with the description of a damped oscillator mode by the simple Lindblad form Eq. (19), which fails [77], e.g., to reproduce power-law tails in correlation functions at low temperatures as obtained from exact solutions of microscopic models [78]. These models can be incorporated into single electron-boson transport theories [24], although decoupling approximations have to be invoked there, too. One of the interesting questions about ‘molecular transport’ through this spin-boson system is how the boson state can be controlled by ε, $T_c$, and $\Gamma_{L/R}$, i.e., parameters of the electronic subsystem. The somewhat intuitive picture of ‘controlling’ the boson mode by the stationary electron current is not appropriate here, because the coupled electron and boson have to be dealt with on an equal footing. In fact, we have not been able to find simple limiting analytical solutions for the stationary reduced density operator of the boson,
\[ \rho_b \equiv \lim_{t\to\infty}\mathrm{Tr}_{\rm dot}\,\rho(t) = \lim_{t\to\infty}\sum_{i=0,L,R}\rho^{ii}(t), \tag{21} \]
except for $\varepsilon \ll -|T_c|$, where $\langle\sigma_z\rangle \approx 1$ (‘electron on left dot’) and the boson is in the coherent state $|z = -g/2\omega\rangle$. The boson state can be visualised from the numerical result, using the Wigner function [79]
\[ W(x, p) = \frac{1}{\pi}\sum_{n,m=0}^{\infty}(-1)^n\,\langle n|\rho_b|m\rangle\langle m|D(2\alpha)|n\rangle, \tag{22} \]
[Figure 1. Left panel: log10[I/eω] versus ε = εL − εR (100 µeV), with curves for N = 10, γb = 0.05; POL, γb = 0.05; N = 5, γb = 0.5; POL, γb = 0.5. Right panels: Wigner functions over (x, p) for eps = 0.00, 0.25, 0.50, 0.75, 1.00, 1.25.]
Fig. 1. Left: Stationary current through double quantum dot coupled to single boson mode (coupling constant g = 0.2) with tunnel rates to left/right leads ΓL = ΓR = 0.1, interdot coupling Tc = 0.01, boson damping γb = 0.05 (in units of boson frequency ω). Numerical results for truncation at N boson states, alongside the corresponding polaron transformation result. Right: Wigner distribution functions for the bosonic mode (ΓL = ΓR = Tc = 0.1, γb = 0.005, g = 0.8, N = 20)
where $D(z) = \exp[za^\dagger - z^*a]$ and $\alpha \equiv (x + ip)/\sqrt{2}$. Figure 1 (right) shows that the distribution in phase space spreads out close to the resonance energies, with corresponding peaks in position (x) and momentum (p) variances (not shown here).
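The analytical polaron (POL) current for the damped single mode can be evaluated in a few lines. The Python sketch below expands the correlation function of Eq. (20) as a power series in $(g/\omega)^2$, Laplace-transforms it term by term according to Eq. (15), and inserts the result into the POL part of Eq. (14) and into Eq. (13), in units e = ħ = 1. The series cutoff `n_terms` and the sample energies are illustrative assumptions; the n = 0 term $i/\varepsilon$ requires ε ≠ 0.

```python
import math


def c_laplace(eps, g, omega, gamma_b, n_terms=60):
    """C_eps of Eq. (15) for C(t) of Eq. (20), summed term by term (eps != 0)."""
    s = (g / omega) ** 2
    weight = math.exp(-s)          # exp(-s) * s**n / n!, starting at n = 0
    total = 0j
    for n in range(n_terms):
        # integral of exp(i*eps*t) * exp(-n*(gamma_b/2 + i*omega)*t) over t >= 0
        total += weight / complex(0.5 * n * gamma_b, n * omega - eps)
        weight *= s / (n + 1)
    return total


def stationary_current_pol(eps, t_c, gamma_l, gamma_r, g, omega, gamma_b):
    """Stationary current of Eq. (13) with G_+/- from the POL part of Eq. (14)."""
    c_eps = c_laplace(eps, g, omega, gamma_b)
    c_minus_conj = c_laplace(-eps, g, omega, gamma_b).conjugate()
    pref = -1j * t_c / (1.0 + 0.5 * gamma_r * c_eps)
    g_plus = 2.0 * t_c * (pref * c_eps).imag
    g_minus = 2.0 * t_c * (pref * c_minus_conj).imag
    return (-gamma_l * gamma_r * g_plus
            / (gamma_l * g_minus + (gamma_l + gamma_r) * g_plus - gamma_l * gamma_r))


if __name__ == "__main__":
    # parameters of Fig. 1 (left), in units of the boson frequency omega = 1
    for eps in (0.5, 1.0, 1.5, 2.0):
        print(eps, abs(stationary_current_pol(eps, 0.01, 0.1, 0.1, 0.2, 1.0, 0.05)))
```

Each term n of the series contributes a Lorentzian of width $n\gamma_b/2$ centred at ε = nω, which is the origin of the phonon emission peaks mentioned in the text.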
5
Single Particle Scattering in a Cavity
I now turn from single electron tunneling (‘zero-d transport’) to mesoscopic transport in 1d systems (quantum wires), which leads me to quantum mechanical scattering and the transmission properties of particles in the presence of coupling to a cavity boson mode. Even for non-interacting fermions at finite density this is a very complex problem due to the possibility of induced many-body effects (Kondo physics, superconducting correlations, ...) in the presence of a Fermi sea. We have therefore started with re-considering the simplest model for 1d single electron scattering in a boson cavity:
\[ H = \frac{p^2}{2m} + \delta(x)\left\{g_0 + g_1\left[a^\dagger + a\right]\right\} + \Omega a^\dagger a. \tag{23} \]
The electron-boson coupling is via a ‘dynamical’ delta-barrier. Although at first sight this model might seem a bit too simple to yield interesting physics, quite a few authors have actually investigated this Hamiltonian or its lattice version in order to study tunneling in the presence of phonons [80], Fano-type resonances [81,82] or the behaviour of transmission amplitudes in the complex energy plane [83,84], and time-dependent Hamiltonians as classical limits of fully quantised models [85].
5.1
Recursion Scheme
Scattering states of the Hamiltonian can be written as highly entangled wave functions of the coupled electron-boson system, $\langle x|\Psi\rangle = \sum_{n=0}^{\infty}\psi_n(x)|n\rangle$, where $\{|n\rangle\}$ is the harmonic oscillator basis. The total transmission coefficient T(E) is obtained from the sum over all propagating modes,
\[ T(E) = \sum_{n=0}^{[E/\Omega]}\frac{k_n(E)}{k_0(E)}\,|t_n(E)|^2, \qquad k_n \equiv \sqrt{E - n\Omega}, \tag{24} \]
where $\hbar = 2m = 1$ and the sum runs up to the largest n such that $k_n$ remains real. Continuity of the wave function at x = 0 leads to an infinite recursion relation for a self energy $\Sigma^{(N)}(E)$ which can be written in an intuitive form that, e.g., for the zero-channel transmission amplitude $t_0(E)$ reads
\[ t_0(E) = \frac{-2i\gamma_0(E)}{G_0^{-1}(E) - \Sigma^{(1)}(E)}, \tag{25} \]
\[ \Sigma^{(N)}(E) = \frac{N g_1^2}{G_0^{-1}(E - N\Omega) - \Sigma^{(N+1)}(E)}, \tag{26} \]
where the ‘Green function’ $G_0(E) \equiv [-2i\gamma_0(E) + g_0]^{-1}$ and $\gamma_0(E) = \sqrt{E}\,\theta(E) + i\sqrt{-E}\,\theta(-E)$. The recursion can be truncated by setting $\Sigma^{(N)}(E) = 0$ for a fixed N > 0 and recursively solving Eq. (26) down to $\Sigma^{(1)}(E)$ which, however, fails to work for too large coupling constants $g_1$. In the strong coupling regime, one has to start from polarons as new quasi-particles. In a lattice version of the Hamiltonian Eq. (23), this is easily accomplished by a canonical transformation and a subsequent perturbation theory in the coupling to the localised polaron level, similar to the spin-boson problem discussed above. On the other hand, in the original Hamiltonian Eq. (23) this would correspond to an inconvenient perturbation theory in the kinetic energy $p^2/2m$ of the electron.
5.2
Resonances and Transparency Points
Barriers with an attractive static part, $g_0 < 0$, are most interesting in that they exhibit Fano type resonances with zero transmission coefficient below the first threshold E = Ω. If the term (a† + a) in the Hamiltonian Eq. (23) is replaced by an oscillating term ∝ cos(Ωt), Fano-resonances in this classical limit are known to appear [82] when the energy of the electron in the first (non-propagating) channel n = 1 coincides with the bound state of the attractive delta barrier potential, $E - \Omega = -g_0^2/4$. In the single boson mode case Eq. (23), zero transmission corresponds to a diverging self energy $\Sigma^{(1)}(E)$ in Eq. (25). For small $g_1$, this condition can be written as
\[ 0 = [\Sigma^{(1)}(E)]^{-1} \approx \left(2\sqrt{\Omega - E} + g_0\right)/g_1^2, \tag{27} \]
which coincides with the resonance condition for the classical case. A new feature of Eq. (23) (which does not appear for its classical, time-dependent counterpart) is the existence of perfect transmission T(E) = 1 at an energy below the opening of the first channel (n = 1) threshold. There, $t_0(E) = -2ik_0/(-2ik_0 + g_0 - \Sigma^{(1)}(E))$, which means that $t_0(E) = 1$ for $g_0 = \Sigma^{(1)}(E)$. Since $\Sigma^{(1)}(E)$ is real for 0 < E < Ω, the self energy exactly renormalizes the static part $g_0$ of the scattering potential to zero at this point. Using the perturbative expression in Eq. (27) for $\Sigma^{(1)}(E)$, one finds two perfect transparency points
\[ g_0 = -\sqrt{\Omega - E} \pm \sqrt{\Omega - E + g_1^2}, \tag{28} \]
i.e., for both attractive and repulsive static barrier strengths $g_0$.
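The truncated recursion of Eqs. (25) and (26) is straightforward to iterate numerically. The following Python sketch is restricted to energies 0 < E < Ω, where only the channel n = 0 propagates so that $T(E) = |t_0(E)|^2$, and uses units ħ = 2m = 1. The truncation depth `n_trunc` and the parameter values (g0 = −1, g1 = 0.1, Ω = 1) are assumptions chosen for illustration in the weak-coupling regime.

```python
import math


def gamma0(energy):
    """gamma_0(E) = sqrt(E) for E >= 0 and i*sqrt(-E) for E < 0."""
    if energy >= 0:
        return complex(math.sqrt(energy), 0.0)
    return complex(0.0, math.sqrt(-energy))


def g0_inverse(energy, g0):
    """Inverse 'Green function' G_0^{-1}(E) = -2i*gamma_0(E) + g0."""
    return -2j * gamma0(energy) + g0


def self_energy_1(energy, g0, g1, omega, n_trunc=40):
    """Sigma^(1)(E) from Eq. (26), truncated by Sigma^(n_trunc) = 0."""
    sigma = 0j
    for n in range(n_trunc - 1, 0, -1):
        sigma = n * g1 ** 2 / (g0_inverse(energy - n * omega, g0) - sigma)
    return sigma


def transmission_below_threshold(energy, g0, g1, omega, n_trunc=40):
    """T(E) = |t_0(E)|^2 from Eq. (25), valid for 0 < E < omega."""
    if not 0.0 < energy < omega:
        raise ValueError("only the single-channel window 0 < E < omega is covered")
    sigma1 = self_energy_1(energy, g0, g1, omega, n_trunc)
    t0 = -2j * gamma0(energy) / (g0_inverse(energy, g0) - sigma1)
    return abs(t0) ** 2


if __name__ == "__main__":
    g0, g1, omega = -1.0, 0.1, 1.0
    grid = [0.6 + 0.3 * i / 20000 for i in range(20001)]
    t_min, e_min = min((transmission_below_threshold(e, g0, g1, omega), e) for e in grid)
    print("minimum T =", t_min, "at E =", e_min, "; classical estimate:", omega - g0 ** 2 / 4)
```

For these assumed parameters the transmission dips towards zero close to the classical resonance energy $\Omega - g_0^2/4$, slightly shifted by the higher self energies, while setting g1 = 0 recovers the static delta-barrier result $4E/(4E + g_0^2)$.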
6 Quantum Chaos and Quantum Phase Transition: Single Mode Dicke Model
In this final section, I return to the single-mode (pseudo) spin boson problem but now consider not only a single two-level system (as represented by the Pauli matrices $\sigma_i$), but an array of N = 2j identical (but distinguishable) two-level systems represented by angular momentum operators $J_z$, $J_\pm$ for a pseudo-spin of length j. The corresponding Dicke Hamiltonian [86]
\[ H = \omega_0 J_z + \omega a^\dagger a + \frac{\lambda}{\sqrt{2j}}\,(a^\dagger + a)(J_+ + J_-), \tag{29} \]
generalises the Rabi Hamiltonian to j > 1/2, is well-known from quantum optics (single mode superradiance) and can be regarded as a simple model for solid state qubit arrays interacting via a common cavity boson (photon/phonon). Our original motivation to study this model was to find a relation between quantum chaos (showing up for finite N) and quantum phase transitions (N → ∞) in systems of N interacting particles as a function of some coupling constant λ. For non-interacting systems, the Anderson (localisation-delocalisation) transition is an example of such a relation; other examples include chaos in interacting spin systems [87], the Lipkin model [88], and the interacting boson model [89]. A common feature of the models in the previous sections is the difficulty of moving continuously from their weak-coupling to their strong-coupling limits. In fact, a typical feature of models like the (isolated) Rabi-Hamiltonian (j = 1/2) is the breakdown of numerical approaches for too large coupling constants (coming from the weak coupling side). The idea therefore is that such instabilities have a ‘deeper’ reason, i.e., an underlying quantum phase transition that only becomes apparent if the system is regarded as a finite-size version of some ‘larger’ system in the thermodynamic limit. In the example here, the Rabi Hamiltonian is the finite size version of the Dicke Hamiltonian.
We have proven and studied this connection [90,91] in great detail for Eq.(29). One might speculate that similar strong connections between quantum chaos and quantum phase transitions are a general feature of many more classes of physically interesting systems. 6.1
Spectrum and Wave Functions
We have derived [90] exact analytical solutions for the spectrum and wave functions of this Hamiltonian for N → ∞ and found a localisation-delocalisation transition in a cross-over between Poissonian and Wigner level-spacing distribution, using numerical diagonalisation for finite N. The ground state bifurcates into a Schrödinger cat above the critical point $\lambda = \lambda_c = \sqrt{\omega\omega_0}/2$, which can be related to a transition between non-chaos and chaos in the classical, canonical limit of the Hamiltonian Eq. (29) and its non-linear, momentum-dependent potential energy. A Holstein-Primakoff transformation [92] of the Hamiltonian leads to a representation in terms of two oscillator modes a† and b† (the latter represents the pseudo spin). For N → ∞, the ground state energy
\[ \frac{E_G}{j} = \begin{cases} -\omega_0, & \lambda < \lambda_c \\[4pt] -\left[\dfrac{2\lambda^2}{\omega} + \dfrac{\omega_0^2\,\omega}{8\lambda^2}\right], & \lambda > \lambda_c \end{cases} \tag{30} \]
and the two collective excitation energies $\varepsilon_\pm$ are obtained exactly. From the vanishing of $\varepsilon_-$ at the critical point one obtains the critical exponents ν = 1/4, z = 2 on resonance $\omega = \omega_0$. Finite-j precursors of the phase transitions can be identified in the cross-over of the level spacing distribution P(S) from Poissonian ($\lambda < \lambda_c$) to Wigner-Dyson, which we have calculated numerically [90]. The ground state wave function $\Psi_0$ can be represented in a 2d position space with coordinates $x \equiv (1/\sqrt{2\omega})(a^\dagger + a)$ and $y \equiv (1/\sqrt{2\omega_0})(b^\dagger + b)$. The splitting of $\Psi_0$ into a superposition of two peaks (separated by a distance of the order $\sqrt{j}$) is related to the existence of a conserved parity $\Pi = \exp\{i\pi[a^\dagger a + J_z + j]\}$ of the Hamiltonian. In fact, Eq. (29) is equivalent to a single particle on a lattice with points (n, m), |m| ≤ j, n = 0, 1, 2, ..., and the eigenvalues ±1 of Π correspond to the two independent sublattices. For j → ∞, the effective tunnel barrier between the two lobes of $\Psi_0$ becomes infinitely strong, Π is spontaneously broken (the cat is ‘broken into two pieces’), and each of the two lobes acquires its own effective Hamiltonian [90] above $\lambda_c$.
6.2
Classical Limit, Chaos
The above discussion shows that the simple one-boson mode Hamiltonian Eq.(29) is an attractive model to study an exact solution for a quantum phase transition. Moreover, for finite j it exhibits a well-defined transition
from integrable to chaotic behavior. One can derive a canonical, classical Hamilton function corresponding to Eq. (29) by using the Holstein-Primakoff expressions for the spin, which leads to the problem of a single particle in a momentum dependent potential
\[ U(x, y, p_y) = \frac{1}{2}\left(\omega^2 x^2 + \omega_0^2 y^2\right) + 2\lambda\sqrt{\omega\omega_0}\;xy\,\sqrt{1 - \frac{\omega_0^2 y^2 + p_y^2 - \omega_0}{4j\omega_0}}. \tag{31} \]
Poincaré sections for the classical model [91] show the transition between regular ($\lambda < \lambda_c$) and chaotic ($\lambda > \lambda_c$) behavior which agrees with the transition in P(S) of the quantum model.
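Two simple checks connect the formulas of this section. The Python sketch below evaluates the N → ∞ ground-state energy per j of Eq. (30), which is continuous at $\lambda_c = \sqrt{\omega\omega_0}/2$, and the determinant of the Hessian of U(x, y, p_y) from Eq. (31) at the origin. For the Hessian, the square-root factor is set to one (j → ∞ at x = y = p_y = 0), which is a simplifying assumption of this sketch; the origin then turns from a minimum into a saddle exactly at $\lambda_c$.

```python
import math


def critical_coupling(omega, omega0):
    """lambda_c = sqrt(omega * omega0) / 2."""
    return 0.5 * math.sqrt(omega * omega0)


def ground_state_energy_per_j(lam, omega, omega0):
    """E_G / j of Eq. (30) in the limit N -> infinity."""
    if lam <= critical_coupling(omega, omega0):
        return -omega0
    return -(2.0 * lam ** 2 / omega + omega0 ** 2 * omega / (8.0 * lam ** 2))


def origin_hessian_det(lam, omega, omega0):
    """det of the Hessian of U at the origin for j -> infinity.

    There U ~ (omega^2 x^2 + omega0^2 y^2)/2 + 2*lam*sqrt(omega*omega0)*x*y,
    so det = omega^2 * omega0^2 - 4 * lam^2 * omega * omega0.
    """
    return omega ** 2 * omega0 ** 2 - 4.0 * lam ** 2 * omega * omega0


if __name__ == "__main__":
    omega = omega0 = 1.0   # resonance, illustrative choice
    lam_c = critical_coupling(omega, omega0)
    for lam in (0.5 * lam_c, lam_c, 2.0 * lam_c):
        print(lam, ground_state_energy_per_j(lam, omega, omega0), origin_hessian_det(lam, omega, omega0))
```

A positive determinant (together with positive diagonal entries) means the origin is a stable minimum; a negative one signals the instability that accompanies the bifurcation of the ground state above $\lambda_c$.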
7
Conclusions
In the above overview, I have presented spin-boson models from a ‘mesoscopic’ point of view. The coupling to external electron reservoirs allows one to study these models under transport (non-equilibrium) conditions. In spite of their simplicity, the Hamiltonians discussed here have some very non-trivial properties that became apparent in particular in the last two sections (mesoscopic ‘quantum’ scatterer, Dicke model). In the future, photon and phonon cavities can be expected to yield further insight into the dynamics of coupled quantum systems, in particular if they are combined with electron transport.
Acknowledgements
Collaborations and discussions with R. H. Blick, Y.-N. Chen, S. Debald, C. Emary, E. M. Höhberger, J. Kotthaus, B. Kramer, N. Lambert, F. Renzoni, J. Robinson, and T. Vorrath are acknowledged. This work was supported by EPSRC grant GR44690/01, DFG project Br1528/4, the WE Heraeus foundation and the UK Quantum Circuits Network.
References
1. N. C. van der Vaart, S. F. Godjin, Y. V. Nazarov, C. J. P. M. Harmans, J. E. Mooij, L. W. Molenkamp, and C. T. Foxon, Phys. Rev. Lett. 74, 4702 (1995).
2. R. H. Blick, R. J. Haug, J. Weis, D. Pfannkuche, K. v. Klitzing, and K. Eberl, Phys. Rev. B 53, 7899 (1996).
3. T. Fujisawa, T. H. Oosterkamp, W. G. van der Wiel, B. W. Broer, R. Aguado, S. Tarucha, and L. P. Kouwenhoven, Science 282, 932 (1998).
4. S. Tarucha, T. Fujisawa, K. Ono, D. G. Austin, T. H. Oosterkamp, W. G. van der Wiel, Microelectr. Engineer. 47, 101 (1999).
5. H. Qin, F. Simmel, R. H. Blick, J. P. Kotthaus, W. Wegscheider, M. Bichler, Phys. Rev. B 63, 035320 (2001).
6. T. Fujisawa, D. G. Austing, Y. Tokura, Y. Hirayama, and S. Tarucha, Nature 419, 278 (2002).
7. T. H. Oosterkamp, T. Fujisawa, W. G. van der Wiel, K. Ishibashi, R. V. Hijman, S. Tarucha, and L. P. Kouwenhoven, Nature 395, 873 (1998).
8. R. H. Blick, D. Pfannkuche, R. J. Haug, K. v. Klitzing, and K. Eberl, Phys. Rev. Lett. 80, 4032 (1998).
9. R. H. Blick, D. W. van der Weide, R. J. Haug, and K. Eberl, Phys. Rev. Lett. 81, 689 (1998).
10. E. M. Höhberger, J. Kirschbaum, R. H. Blick, T. Brandes, W. Wegscheider, M. Bichler, and J. P. Kotthaus (unpublished), 2002.
11. J. Seyler, M. N. Wybourne, Phys. Rev. Lett. 69, 1427 (1992).
12. N. Bannov, V. Mitin, M. Stroscio, phys. stat. sol. (b) 183, 131 (1994).
13. N. Bannov, V. Aristov, V. Mitin, M. A. Stroscio, Phys. Rev. B 51, 9930 (1995).
14. A. Greiner, L. Reggiani, T. Kuhn, L. Varani, Phys. Rev. Lett. 78, 1114 (1997).
15. N. Nishiguchi, Y. Ando, M. N. Wybourne, J. Phys.: Conden. Matter 9, 5751 (1997).
16. L. G. C. Rego, G. Kirczenow, Phys. Rev. Lett. 81, 232 (1998).
17. I. Wilson-Rae and A. Imamoglu, Phys. Rev. B 65, 235311 (2002).
18. S. Debald, T. Brandes, B. Kramer, Phys. Rev. B (Rapid Comm.) 66, 041301(R) (2002).
19. H. Park, J. Park, A. K. L. Lim, E. H. Anderson, A. P. Alivisatos, and P. L. McEuen, Nature 407, 57 (2000).
20. C. Joachim, J. K. Gimzewski, A. Aviram, Nature 408, 541 (2000).
21. Reichert, R. Ochs, D. Beckmann, H. B. Weber, M. Mayor, and H. v. Löhneysen, Phys. Rev. Lett. 88, 176804 (2002).
22. D. Boese and H. Schoeller, Europhys. Lett. 54, 668 (2001).
23. A. O. Gogolin and A. Komnik, cond-mat/0207513, 2002.
24. K. Flensberg, cond-mat/0302193 (2003).
25. L. Y. Gorelik, A. Isacsson, M. V. Voinova, B. Kasemo, R. I. Shekhter, and M. Jonson, Phys. Rev. Lett. 80, 4526 (1998).
26. C. Weiss and W. Zwerger, Europhys. Lett. 47, 97 (1999).
27. A. Erbe, C. Weiss, W. Zwerger, and R. H. Blick, Phys. Rev. Lett. 87, 096106 (2001).
28. A. D. Armour and A. MacKinnon, Phys. Rev. B 66, 035333 (2002).
29. N. F. Schwabe, A. N. Cleland, M. C. Cross, and M. I. Roukes, Phys. Rev. B 52, 12911 (1995).
30. A. N. Cleland, M. L. Roukes, Nature 392, 160 (1998).
31. R. H. Blick, M. L. Roukes, W. Wegscheider, M. Bichler, Physica B 249, 784 (1998).
32. R. H. Blick, F. G. Monzon, W. Wegscheider, M. Bichler, F. Stern, and M. L. Roukes, Phys. Rev. B 62, 17103 (2000).
33. A. D. Armour, M. D. Blencowe, and K. C. Schwab, Phys. Rev. Lett. 88, 148301 (2002).
34. L. D. Landau and E. M. Lifshitz, Theory of Elasticity, Vol. 7 of Landau and Lifshitz, Course of Theoretical Physics (Pergamon Press, Oxford, 1970).
35. T. Meeker and A. Meitzler, in Physical Acoustics (Academic, New York 1964), Vol. 1, Part A; B. Auld, Acoustic Fields and Waves (Wiley, New York 1973), Vol. 2.
36. N. Bannov et al., Phys. Rev. B 51, 9930 (1995); N. Bannov et al., phys. stat. sol. (b) 183, 131 (1994).
37. R. Aguado and L. Kouwenhoven, Phys. Rev. Lett. 84, 1986 (2000).
38. T. Brandes and B. Kramer, Phys. Rev. Lett. 83, 3021 (1999).
39. T. H. Stoof and Yu. V. Nazarov, Phys. Rev. B 53, 1050 (1996).
40. S. A. Gurvitz and Ya. S. Prager, Phys. Rev. B 53, 15932 (1996).
41. S. A. Gurvitz, Phys. Rev. B 57, 6602 (1998).
42. U. Hartmann and F. K. Wilhelm, phys. stat. sol. (b) 233, 385 (2002); U. Hartmann and F. K. Wilhelm, cond-mat/0212063.
43. M. Keil and H. Schoeller, Phys. Rev. B 66, 155314 (2002).
44. T. Brandes, T. Vorrath, in Recent Progress in Many Body Physics, Advances in Quantum Many Body Theory, edited by R. Bishop, T. Brandes, K. Gernoth, N. Walet, and Y. Xian (World Scientific, Singapore, 2001).
45. N. H. Bonadeo, J. Erland, D. Gammon, D. Park, D. S. Katzer, and D. G. Steel, Science 282, 1473 (1998).
46. T. Brandes, F. Renzoni, and R. H. Blick, Phys. Rev. B 64, 035319 (2001).
47. J. Schliemann, D. Loss, and A. H. MacDonald, Phys. Rev. B 63, 085311 (2001).
48. F. Renzoni and T. Brandes, Phys. Rev. B 64, 245301 (2001).
49. D. V. Averin, Solid State Comm. 105, 659 (1998).
50. D. V. Averin, in Quantum computing and quantum communications, Vol. 1509 of Lecture Notes in Computer Science (Springer, Berlin, 1999), p. 413.
51. A. M. Childs, E. Farhi, and J. Preskill, Phys. Rev. A 65, 012322 (2001).
52. M. Thorwart and P. Hänggi, Phys. Rev. A 65, 012309 (2002).
53. T. Brandes and T. Vorrath, Phys. Rev. B 66, 075341 (2002).
54. M. Grifoni and P. Hänggi, Phys. Rep. 304, 229 (1998).
55. T. Brandes and F. Renzoni, Phys. Rev. Lett. 85, 4148 (2000).
56. L. J. Geerligs, V.F. Anderegg, P.A.M. Holweg, J.E. Mooij, H. Pothier, D. Esteve, C. Urbina, and M.H. Devoret, Phys. Rev. Lett. 64, 2691 (1990).
57. L. P. Kouwenhoven, A. T. Johnson, N. C. van der Vaart, C. J. P. M. Harmans, and C. T.
Foxon, Phys. Rev. Lett. 67, 1626 (1991). 67 58. H. Pothier, P. Lafarge, C. Urbina, D. Esteve, and M.H. Devoret, Europhys. Lett. 17, 249 (1992). 67 59. (Ed.) H. Grabert and M. H. Devoret, Single Charge Tunneling, Vol. 294 of NATO ASI Series B (Plenum Press, New York, 1991). 67 60. P. W. Brouwer, Phys. Rev. B 58, R10135 (1998). 67 61. M. Switkes, C. M. Marcus, K. Campman, and A. C. Gossard, Science 283, 1905 (1999). 67 62. M. L. Polianski and P. W. Brouwer, Phys. Rev. B 64, 075304 (2001). 67 63. J. N. H. J. Cremers and P. W. Brouwer, Phys. Rev. B 65, 115333 (2002). 67 64. M. Moskalets and M. B¨ uttiker, Phys. Rev. B 64, 201305 (2001). 67 65. E. R. Mucciolo, C. Chamon, and C. M. Marcus, cond-mat/0112157, 2001. 67 66. P. Silvestrini and L. Stodolsky, Phys. Lett. A 280, 17 (2001). 67 67. S. D. Barrett and G. J. Milburn, cond-mat/0302238 (2003). 67 68. Y. Nakamura, Yu. A. Pashkin, and J. S. Tsai, Nature 398, 786 (1999). 67 69. I. I. Rabi, Phys. Rev. 51, 652 (1937). 68 70. L. Allen and J. H. Eberly, Optical Resonance and Two-Level Atoms (Dover, New York, 1987). 68 71. P. Neu and R. J. Silbey, Phys. Rev. A 54, 5323 (1996). 68
Boson Cavities: From Electronic Transport to Quantum Chaos
77
72. M. Ueda, T. Wakabayashi, and M. Kuwata-Gonokami, Phys. Rev. Lett. 76, 2045 (1996). 68 73. H. Saito and M. Ueda, Phys. Rev. Lett. 79, 3869 (1997). 68 74. H. Saito and M. Ueda, Phys. Rev. A 59, 3959 (1999). 68 75. A. D. Armour and M. D. Blencowe, Phys. Rev. B 64, 035311 (2001). 68 76. D. F. Walls and G. J. Milburn, Quantum Optics (Springer, Berlin, 1994). 69 77. F. Bloch, Phys. Rev. 105, 1206 (1957). 69 78. U. Weiss, Quantum Dissipative Systems, Vol. 2 of Series of Modern Condensed Matter Physics (World Scientific, Singapore, 1993). 69 79. K. E. Cahill and R. J. Glauber, Phys. Rev. 177, 1882 (1969). 69 80. B. Y. Gelfand, S. Schmitt-Rink, and A. F. J. Levi, Phys. Rev. Lett. 62, 1683 (1989). 70 81. P. F. Bagwell, Phys. Rev. B 41, 10354 (1990). 70 82. P. F. Bagwell and R. K. Lake, Phys. Rev. B 46, 15329 (1992). 70, 71 83. D. F. Martinez and L. E. Reichl, Phys. Rev. B 64, 245315 (2001). 70 84. S. W. Kim, H-K. Park, H.-S. Sim, and H. Schomerus, cond-mat/0203391. 70 85. J.-M. Lopez-Castillo, C. Tannous, and J.-P. Jay-Gerin, Phys. Rev. A 41, 2273 (1990). 70 86. R. H. Dicke, Phys. Rev. 93, 99 (1954). 72 87. G. Georgeot and D. L. Shepelyansky, Phys. Rev. Lett. 81, 5129 (1998); Phys. Rev. E 62, 3504 (2000). 72 88. W. D. Heiss and A. L. Sannino, Phys. Rev. A 43, 4159 (1991); W. D. Heiss and M. M¨ uller, Phys. Rev. E 66, 016217 (2002). 72 89. Y. Alhassid and N. Whelan, Nucl. Phys. A556, 42 (1993); P. Cejnar and J. Jolie, Phys. Rev. E 58, 387 (1998). 72 90. C. Emary, T. Brandes, Phys. Rev. Lett. 90, 044101 (2003). 73 91. C. Emary, T. Brandes, cond-mat/0301273 (2003). 73, 74 92. M. Hillery and L. D. Mlodinow, Phys. Rev. A 31, 797 (1985). 73
Nanoscopic Quantum Rings: A New Perspective
Tapash Chakraborty
Institute of Mathematical Sciences, Chennai 600 113, India
Abstract. With rapid advances in the fabrication of nano-scale devices, quantum rings of nanometer dimensions that are disorder free and contain only a few (interacting) electrons have gained increasing attention. Accordingly, the emphasis of theoretical research has shifted from problems involving the persistent current, which is only indirectly related to the energy levels, to a direct probe of the low-lying energy spectrum of a single quantum ring. Transport and optical spectroscopies have revealed many interesting aspects of the energy spectra that are in good agreement with the theoretical picture presented here.
1
Introduction
A metallic ring of mesoscopic dimension in an external magnetic field is known to exhibit a wide variety of interesting physical phenomena. One spectacular effect that has fascinated researchers for decades is that the ring can carry an equilibrium current (the so-called persistent current) [1,2], which is periodic in the Aharonov-Bohm (AB) flux $\Phi$ [3,4] with a period $\Phi_0 = hc/e$, the flux quantum. This effect is a direct consequence of the properties of the eigenfunctions of isolated rings, which make all physical quantities periodic in the flux. The reason for this behavior is well known and is briefly as follows. In a ring that encloses a magnetic flux $\Phi$, the vector potential can be eliminated from the Schrödinger equation by a gauge transformation. As a result, the boundary condition is modified to $\psi_n(x + L) = e^{2\pi i\Phi/\Phi_0}\,\psi_n(x)$, where $L$ is the ring circumference.$^1$ The situation is then analogous to the one-dimensional Bloch problem with Bloch wave vector $k_n = (2\pi/L)\,\Phi/\Phi_0$. The energy levels $E_n$ and other related physical quantities are therefore periodic in $\Phi$ with period $\Phi_0$. For a time-independent flux $\Phi$, the equilibrium current (at $T = 0$) associated with state $n$ is

$$I_n = -\frac{e v_n}{L} = -c\,\frac{\partial E_n}{\partial \Phi}, \qquad (1)$$

where $v_n = \partial E_n/\hbar\,\partial k_n = (Lc/e)\,\partial E_n/\partial\Phi$ is the velocity of state $n$. An important condition for $I_n$ to be nonzero is that the wave functions of the charge
$^1$ The magnetic flux $\Phi$ is assumed to thread the ring axially, but the electron motion is uninfluenced by the magnetic field. The $\Phi_0$ periodicity of the electron wave function is then strictly of AB type.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 79–94, 2003. © Springer-Verlag Berlin Heidelberg 2003
carriers should stay coherent along the circumference $L$ of the ring. The ring geometry and the thermodynamic current have also played a central role in the gauge-invariance interpretation of the integer quantum Hall effect and the current-carrying edge states [5]. The phenomenon is restricted to mesoscopic rings, i.e., rings so small that the orbital motion of electrons in the ring remains quantum phase coherent throughout.

Strictly one-dimensional rings: Büttiker et al. [2] were the first to propose the possibility of observing a persistent current in diffusive normal-metal rings. They considered a one-dimensional ring enclosing a magnetic flux $\Phi$ and noted the analogy between the boundary condition and the one-dimensional Bloch problem discussed above. Let us consider the impurity-free single-electron Hamiltonian in a magnetic field. Without the field, the Schrödinger equation is simply

$$\frac{\hat{p}^2}{2m^*}\,\psi(x) = E\,\psi(x)$$

with the usual periodic boundary conditions. The solutions are, of course, plane waves with wave vector $k = p/\hbar$, and due to the periodic boundary conditions $k_n = 2\pi n/L$, where $n$ is an integer. If we apply a magnetic field $B$ perpendicular to the ring, then $\hat{p} \to \hat{p} + eA/c$ and the vector potential is $A = \frac{1}{2} B r_0 = BL/4\pi$, where $r_0$ is the ring radius. The magnetic flux through the ring is then $\Phi = BL^2/4\pi$. The wave vector is modified accordingly:

$$k_n = \frac{2\pi}{L}\,n + \frac{e}{\hbar c}\,A = \frac{2\pi}{L}\left(n + \frac{e}{\hbar c}\,\frac{BL^2}{8\pi^2}\right) = \frac{2\pi}{L}\left(n + \frac{\Phi}{\Phi_0}\right).$$

The energy levels are then readily obtained from

$$E_n(B) = \frac{\hbar^2 k_n^2}{2m^*} = \frac{\hbar^2}{2m^* r_0^2}\left(n + \frac{\Phi}{\Phi_0}\right)^2, \qquad (2)$$

which are parabolas as a function of $\Phi$. The equilibrium current (which is not a transport current) carried by the $n$-th level is then (Eq. 1)

$$I_n = -\frac{2\pi e\hbar}{m^* L^2}\left(n + \frac{\Phi}{\Phi_0}\right). \qquad (3)$$

The total persistent current is $I = \sum_n I_n$, where the summation is over the $N$ lowest occupied levels.$^2$ For weak impurity potentials, the degeneracies at the level crossings are lifted (Fig. 1). The magnetic moment for each occupied level is

$$M = -\frac{dE}{dB} = -\frac{e\hbar}{2m^*}\left(n + \frac{\Phi}{\Phi_0}\right). \qquad (4)$$

$^2$ We consider the case of $T = 0$ and, unless otherwise specified, only spinless particles.
[Figure: energy (arbitrary units) versus $\Phi/\Phi_0$]
Fig. 1. Energy levels of a one-dimensional electron gas (non-interacting) on a ring as a function of the magnetic flux. The dashed lines correspond to the disorder-free case and the solid lines are for the weak impurity case
As we see from Fig. 1, because $\partial E/\partial\Phi$ alternates in sign between consecutive levels, the total moment is of the order of the moment of the last occupied level near $E_F$. The total persistent current is therefore

$$I(\Phi) = -I_0 \begin{cases} 2\Phi/\Phi_0 & \text{for } N \text{ odd and } -\tfrac{1}{2} \le \Phi/\Phi_0 < \tfrac{1}{2}, \\ 2\Phi/\Phi_0 - 1 & \text{for } N \text{ even and } 0 \le \Phi/\Phi_0 < 1, \end{cases} \qquad (5)$$

where $I_0 = e v_F/L$ and $v_F = \pi\hbar N/m^* L$ is the Fermi velocity. It is periodic in $\Phi/\Phi_0$ with period 1.

Early experiments were carried out on relatively large (µm-size) metallic rings containing a large number of electrons and impurities [6]. The observed results have not yet been explained to everyone’s satisfaction [7]. A semiconductor ring in a GaAlAs/GaAs heterojunction [8] (also of µm size but in the ballistic regime) displayed a persistent current that is periodic in the flux with period $\Phi_0$ and has an amplitude of $(0.8 \pm 0.4)\,e v_F/L$, in agreement with the theoretical predictions. The electron-electron interaction did not change the value of the persistent current. These experiments inspired a large number of theoretical studies involving various averaging procedures, dependence on the chemical potential and temperature, and different realizations of disorder, often with conflicting conclusions about the role of electron-electron interactions. We will not go further into those aspects of the work on mesoscopic rings.
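As a consistency check on Eq. 5, the total current can be obtained by summing the single-level currents of Eq. 3 over the $N$ lowest levels. The following Python sketch does this for spinless electrons at $T = 0$. The function names are illustrative and not taken from the original work; the flux `phi` is measured in units of $\Phi_0$ and currents in units of $I_0 = e v_F/L$.

```python
def total_current(num_electrons, phi):
    """Total T = 0 persistent current of spinless electrons, in units of I0.

    phi is the flux in units of the flux quantum.
    """
    # Level n has energy proportional to (n + phi)**2 (Eq. 2); occupy the
    # num_electrons lowest ones.
    span = num_electrons + 2
    levels = sorted(range(-span, span + 1), key=lambda n: (n + phi) ** 2)
    occupied = levels[:num_electrons]
    # Eq. 3 reads I_n = -(2*I0/N) * (n + phi) once I0 = e*v_F/L is factored out.
    return -2.0 / num_electrons * sum(n + phi for n in occupied)


def eq5(num_electrons, phi):
    """Right-hand side of Eq. 5 in units of I0."""
    if num_electrons % 2 == 1:
        return -2.0 * phi           # N odd, -1/2 <= phi < 1/2
    return -(2.0 * phi - 1.0)       # N even, 0 <= phi < 1


if __name__ == "__main__":
    for n_el, phi in [(3, 0.2), (4, 0.3), (5, -0.4), (6, 0.9)]:
        print(n_el, phi, total_current(n_el, phi), eq5(n_el, phi))
```

Away from the level crossings (where the choice of occupied levels is ambiguous), the two functions should agree, which illustrates why the sawtooth of Eq. 5 is shifted by half a flux quantum between odd and even $N$.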
2
Electronic Structure of a Parabolic Quantum Ring
In an attempt to clearly understand the role of electron-electron interactions in a quantum ring (QR), without getting encumbered by the variety of other issues mentioned above, we have constructed a model of a QR [9] that is disorder free, contains only a few interacting electrons and, most importantly, can be solved via the exact diagonalization method to obtain the energy levels very accurately. One advantage of this model is that, with the energy levels thus calculated, other physical quantities in addition to the persistent current, such as the optical absorption and the role of electron spin, can also be studied very accurately [10,11]. Over the years, the model has received a large following. It is also highly relevant for recent experiments on nano-rings [12,13,14,15,16,17,18,19].
2.1
Theoretical Model
The Hamiltonian for an electron in a ring with parabolic confinement and subjected to a perpendicular magnetic field is

$$H = \frac{1}{2m^*}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 + \frac{1}{2} m^* \omega_0^2 (r - r_0)^2, \qquad (6)$$

where the vector potential is $\mathbf{A} = \frac{1}{2}(-By, Bx, 0)$ (symmetric gauge). The Schrödinger equation (in polar coordinates) is then written as

$$-\frac{\hbar^2}{2m^*}\left[\frac{\partial^2\psi}{\partial r^2} + \frac{1}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\theta^2}\right] - \frac{ieB\hbar}{2m^* c}\frac{\partial\psi}{\partial\theta} + \left[\frac{e^2B^2r^2}{8m^*c^2} + \frac{1}{2} m^*\omega_0^2 (r - r_0)^2 - E\right]\psi = 0. \qquad (7)$$

Introducing the ansatz

$$\psi = \frac{1}{\sqrt{2\pi}}\, f(r)\, e^{-il\theta}$$

and the quantities $N = BeA/hc = \Phi/\Phi_0$, $\alpha = \omega_0 m^* A/h$, $x = r/r_0$, $\mathcal{E} = 2m^*\pi A E/h^2$, where $A = \pi r_0^2$ is the area of the ring, the radial part of the Schrödinger equation is

$$f'' + \frac{1}{x} f' + \left[4\mathcal{E} + 2Nl - 4\alpha^2 - (N^2 + 4\alpha^2)x^2 + 8\alpha^2 x - \frac{l^2}{x^2}\right] f = 0, \qquad (8)$$

where $l$ is the usual orbital angular momentum of the single-particle level. The parameter $\alpha$ is inversely related to the width of the ring. As shown in Fig. 2, large $\alpha$ means a narrow path for the electrons to traverse, and hence the electron motion is close to that of a strictly one-dimensional ring. For small $\alpha$, the electron motion is almost two-dimensional. These two limits can indeed be achieved in our model [9].

Fig. 2. Confinement potential $\alpha (r - r_0)^2$ for various values of $\alpha$ (curves for $\alpha$ = 5, 20, 50, and 100)

For a δ-function confinement ($x = 1$), the radial equation (Eq. 8) becomes

$$\left[4\mathcal{E} + 2Nl - 4\alpha^2 - (N^2 + 4\alpha^2) + 8\alpha^2 - l^2\right] f = 0 \qquad (9)$$

with the solution $\mathcal{E} = \frac{1}{4}(l - N)^2$, derived above for a strictly one-dimensional ring (Eq. 2). In the other limit, i.e., when the magnetic field is large compared to the confinement $\alpha$, the radial equation reduces to

$$f'' + \frac{1}{x} f' + \left[4\mathcal{E} + 2Nl - N^2x^2 - \frac{l^2}{x^2}\right] f = 0$$
with the solution [9] $\mathcal{E} = \left(n + \frac{1}{2}\right) N$, corresponding to the Fock-Darwin levels [20]. The single-electron energy levels are obtained from numerical solutions of the radial equation (Eq. 8) and are shown in Fig. 3. For $\alpha = 20$ (Fig. 3(b)), the lower set of energy levels is still similar to that of the ideally narrow ring and is given by a set of translated parabolas (as in Fig. 3(a)). As $\alpha$ decreases, i.e., as the ring becomes wider, the sawtooth behavior of the narrow ring is gradually replaced by the formation of Fock-Darwin levels, as in quantum dots.

In a quantum ring, or in any cylindrically symmetric system, the wave functions are of the form

$$\psi_\lambda = R_{nl}(r)\, e^{il\theta}, \qquad n = 0, 1, 2, \ldots, \qquad l = 0, \pm 1, \pm 2, \ldots, \qquad (10)$$

and the interaction matrix elements are evaluated from [9]

$$V_{\lambda_1\lambda_2\lambda_3\lambda_4} = \delta_{l_1+l_2,\,l_3+l_4}\; 2\pi \int_0^\infty dq\, q\, \mathcal{V}(q) \int_0^\infty dr_1\, r_1\, J_{|l_1-l_4|}(qr_1)\, R_{n_1l_1}(r_1) R_{n_4l_4}(r_1) \int_0^\infty dr_2\, r_2\, J_{|l_2-l_3|}(qr_2)\, R_{n_2l_2}(r_2) R_{n_3l_3}(r_2) \qquad (11)$$
Fig. 3. Energy levels of a single electron versus the magnetic field for (a) ideally narrow ring, and (b)–(d) parabolic confinement model with various values of the confinement potential strength. The second Fock-Darwin level is plotted as dotted lines
where $\lambda$ represents the quantum number pair $\{n, l\}$ and $J_i$ is the Bessel function of order $i$. All our numerical results correspond to the case of Coulomb interaction in a plane, $\mathcal{V}(q) = 2\pi\kappa/q$, where $\kappa = e^2/4\pi\varepsilon_0\varepsilon$. We have considered the case of $m^* = 0.07m$, $\varepsilon = 13$, and the ring radius $r_0 = 10$ nm. Energies of a QR containing four interacting and non-interacting electrons [9] are shown in Fig. 4. It is clear that for spinless electrons, the only effect of the inter-electron interaction is an upward shift of the total energy. This is because in a narrow ring all the close-lying states are in the lowest Landau band and cannot be coupled by the interaction because of the conservation of angular momentum. The situation is, however, different once the spin degree of freedom is included, as described in Sect. 4.
2.2
Persistent Current
The magnetic moment (proportional to the persistent current) (Eq. 4) is calculated in our model from its thermodynamic expression

$$M = -\frac{\sum_m \dfrac{\partial E_m}{\partial B}\, e^{-E_m/kT}}{\sum_m e^{-E_m/kT}}, \qquad (12)$$

where $\partial E_m/\partial B$ are evaluated as the expectation values of the magnetization operator in the interacting states $|m\rangle$. In our ring geometry, the magnetization operator is $\mathcal{M} = \frac{1}{2m^*}\left[\frac{e}{c} L_z + \frac{e^2}{2c^2} B r^2\right]$. Our calculations revealed that the interaction has no effect on the magnetization, which remains periodic with period $\Phi_0$ [9].
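To make the thermodynamic expression (Eq. 12) concrete, the sketch below evaluates the same Boltzmann-weighted average for a deliberately simple stand-in: a single electron on an ideally narrow ring, with levels $E_l = (l - \phi)^2$ in units of $\hbar^2/2m^* r_0^2$ and $\phi = \Phi/\Phi_0$. This is an assumption made for illustration only; it is not the interacting many-electron calculation of [9]. The derivative is taken with respect to $\phi$ rather than $B$, which differs only by a constant factor, and `temperature` stands for $kT$ in the same energy units.

```python
import math


def ring_levels(phi, l_max=30):
    """Return (energy, dE/dphi) pairs for the levels E_l = (l - phi)**2."""
    return [((l - phi) ** 2, 2.0 * (phi - l)) for l in range(-l_max, l_max + 1)]


def thermal_moment(phi, temperature):
    """Boltzmann-weighted average of -dE/dphi, the analogue of Eq. 12."""
    pairs = ring_levels(phi)
    e_min = min(energy for energy, _ in pairs)
    # Subtract e_min before exponentiating for numerical stability.
    weights = [math.exp(-(energy - e_min) / temperature) for energy, _ in pairs]
    numerator = sum(w * slope for w, (_, slope) in zip(weights, pairs))
    return -numerator / sum(weights)


if __name__ == "__main__":
    for phi in (0.0, 0.2, 0.5, 1.2):
        print(phi, thermal_moment(phi, 0.01), thermal_moment(phi, 0.2))
```

At low temperature the result reduces to the slope of the ground-state parabola, and at any temperature it is periodic in $\phi$ with period 1 and vanishes at the level crossing $\phi = 1/2$, where the two degenerate levels carry opposite moments.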
Fig. 4. Energy spectrum of a QR containing four non-interacting and interacting spinless electrons and for two different widths of the ring
We also studied the persistent current of a QR in the presence of a Gaussian impurity and/or with Coulomb interaction included [21]. The impurity potential of our choice was

$$V^{\rm imp}(\mathbf{r}) = V_0\, e^{-(\mathbf{r} - \mathbf{R})^2/d^2}, \qquad (13)$$

where $V_0$ is the potential strength and $d$ is the width. The impurity matrix element in our model is then

$$T_{\lambda,\lambda'} = 2\pi V_0\, e^{im\theta_0} \int R_\lambda(r)\, R_{\lambda'}(r)\, e^{-(R^2 + r^2)/d^2}\, I_m\!\left(\frac{2rR}{d^2}\right) r\, dr, \qquad (14)$$

where $m = l' - l$, $(R, \theta_0)$ is the impurity position and $I_m$ is the modified Bessel function. We found that the effect of the impurity is simply to lift the degeneracies in the energy spectrum and reduce the persistent current (Fig. 5), but it does not change the phase of the oscillations as a function of the magnetic flux. Even for the strongest impurity, the inter-electron interaction had no effect on the persistent current (except to shift the spectrum to higher energies). Our result verified the conjecture of Leggett [22]. Based on variational arguments and two important properties of the many-particle wave function in a mesoscopic ring, viz., antisymmetry and single-valuedness, he proposed that, for arbitrary electron-electron interactions and an arbitrary external potential, the maxima and minima of the energy curves for even and odd numbers of electrons would be the same as for non-interacting systems.
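The modified Bessel function and the phase factor in Eq. 14 come from the angular part of the Gaussian: writing $(\mathbf{r} - \mathbf{R})^2 = r^2 + R^2 - 2rR\cos(\theta - \theta_0)$, the angular integral is $\int_0^{2\pi} e^{im\theta}\, e^{z\cos(\theta - \theta_0)}\, d\theta = 2\pi I_m(z)\, e^{im\theta_0}$ with $z = 2rR/d^2$. The short Python check below verifies this identity numerically. It is only a sketch: the helper names are hypothetical, the Bessel function is evaluated from its power series (adequate for moderate $z$), and the sign convention of the phase is assumed to match that of Eq. 14.

```python
import cmath
import math


def bessel_i(order, z, terms=40):
    """Modified Bessel function I_m(z) from its power series (moderate z)."""
    m = abs(order)
    return sum((z / 2.0) ** (2 * k + m) / (math.factorial(k) * math.factorial(k + m))
               for k in range(terms))


def angular_integral(m, z, theta0, steps=2000):
    """Integral of exp(i*m*theta) * exp(z*cos(theta - theta0)) over 0..2*pi.

    The integrand is smooth and periodic, so an equally spaced sum converges
    very quickly.
    """
    h = 2.0 * math.pi / steps
    total = 0j
    for j in range(steps):
        theta = j * h
        total += cmath.exp(1j * m * theta) * math.exp(z * math.cos(theta - theta0))
    return total * h


if __name__ == "__main__":
    m, z, theta0 = 2, 1.5, 0.7
    print(angular_integral(m, z, theta0))
    print(2.0 * math.pi * bessel_i(m, z) * cmath.exp(1j * m * theta0))
```

Once the angular integration is done in closed form, only the one-dimensional radial integral of Eq. 14 has to be evaluated numerically.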
Fig. 5. Single-electron energy spectrum and magnetization (unit of energy is $2\hbar^2/m^* r_0^2$) versus $\Phi/\Phi_0$ for (a) $\alpha = 20$, $V_0 = 1$, $d = 0.2$; (b) $\alpha = 20$, $V_0 = 4$, $d = 0.5$; (c) $\alpha = 5$, $V_0 = 0.5$, $d = 0.2$; and (d) $\alpha = 5$, $V_0 = 1$, $d = 0.5$ [21]
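The single-electron spectra of this section rest on a numerical solution of the radial equation (Eq. 8). The original text does not say which numerical method was used, so the following Python sketch is only one possible approach for the impurity-free case. It assumes the substitution $f = u/\sqrt{x}$, which turns Eq. 8 into $-u'' + W(x)\,u = 4\mathcal{E}u$ with $W = (l^2 - \tfrac{1}{4})/x^2 + N^2x^2 + 4\alpha^2(x-1)^2 - 2Nl$, discretizes it by finite differences with $u = 0$ at $x = 0$ and at a cutoff `x_max`, and finds the lowest eigenvalue of the resulting tridiagonal matrix by Sturm-sequence bisection. The grid parameters and function name are illustrative choices, and the hard-wall boundaries are harmless only for reasonably narrow rings (large $\alpha$).

```python
def lowest_level(l, flux, alpha, x_max=3.0, points=1500):
    """Lowest dimensionless energy of Eq. 8 for angular momentum l.

    With f = u / sqrt(x), Eq. 8 becomes -u'' + W(x) u = 4 * E * u, where
    W = (l^2 - 1/4)/x^2 + N^2 x^2 + 4 alpha^2 (x - 1)^2 - 2 N l and N = flux.
    """
    h = x_max / points
    inv_h2 = 1.0 / (h * h)
    diag = []
    for i in range(1, points):          # interior points; u = 0 at both ends
        x = i * h
        w = ((l * l - 0.25) / (x * x) + (flux * x) ** 2
             + 4.0 * alpha ** 2 * (x - 1.0) ** 2 - 2.0 * flux * l)
        diag.append(2.0 * inv_h2 + w)
    off_sq = inv_h2 * inv_h2            # squared off-diagonal element

    def count_below(lam):
        """Sturm count: number of matrix eigenvalues smaller than lam."""
        count, q = 0, 1.0
        for i, d in enumerate(diag):
            q = d - lam - (off_sq / q if i else 0.0)
            if q == 0.0:
                q = -1e-30
            if q < 0.0:
                count += 1
        return count

    # Gershgorin-type bounds bracket the lowest eigenvalue.
    lo, hi = min(diag) - 2.0 * inv_h2, min(diag)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.125 * (lo + hi)            # E = eigenvalue / 4


if __name__ == "__main__":
    alpha = 20.0
    for l in (-1, 0, 1, 2):
        print(l, lowest_level(l, 0.0, alpha))
```

For a narrow ring such as $\alpha = 20$, the level spacings obtained this way should stay close to the ideal-ring result $\mathcal{E} = \frac{1}{4}(l - N)^2$ of Eq. 9, on top of a radial zero-point energy of roughly $\alpha/2$; the deviations grow as $\alpha$ is reduced.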
3
Optical Absorption Spectra
After watching the not so spectacular performance of the persistent current, and given our past experience of the important effect of the inter-electron interaction on a few-electron quantum dot [20], we chose to turn our attention to optical spectroscopy of QRs [10], well before any such experiments on nano-rings were reported. We found that optical spectroscopy is indeed a direct route to explore impurity and interaction effects in a quantum ring. In our work, intensities of the optical absorption are evaluated within the electric-dipole approximation. For parabolic quantum dots (obtained by setting $r_0 = 0$ in Eq. 6), the radial part of the wave function (Eq. 10) is given explicitly as

$$R_{nl} = C \exp\left[-r^2/(2a^2)\right] r^{|l|}\, L_n^{|l|}(r^2/a^2),$$

where $C$ is the normalization constant, $a = \sqrt{\hbar/(m^*\Omega)}$, $\Omega = \sqrt{\omega_0^2 + \omega_c^2/4}$, $L_n^k(x)$ is the associated Laguerre polynomial, and $\omega_c$ is the cyclotron frequency. In our QR model ($r_0 \neq 0$) the radial part $R_{nl}$ has to be determined numerically. We define the single-particle matrix elements as

$$d_{\lambda,\lambda'} = \langle\lambda'|\, r e^{i\theta}\, |\lambda\rangle = 2\pi\, \delta_{l+1,l'} \int_0^\infty r^2 R_{\lambda'}(r) R_\lambda(r)\, dr.$$

The dipole operators are then

$$X = \frac{1}{2}\sum_{\lambda\lambda'} \left[d_{\lambda'\lambda} + d_{\lambda\lambda'}\right] a^\dagger_{\lambda'} a_\lambda, \qquad Y = \frac{1}{2i}\sum_{\lambda\lambda'} \left[d_{\lambda'\lambda} - d_{\lambda\lambda'}\right] a^\dagger_{\lambda'} a_\lambda. \qquad (15)$$

The probability of absorption from the ground state $|0\rangle$ to an excited state $|f\rangle$ will then be proportional to $I = |\langle f|\mathbf{r}|0\rangle|^2 = |\langle f|X|0\rangle|^2 + |\langle f|Y|0\rangle|^2$. In the calculated absorption spectra presented below, the areas of the filled circles are proportional to $I$.
3.1
Quantum Dots with a Repulsive Scattering Center
Far-infrared (FIR) spectroscopy on µm-size QR arrays created in GaAs/AlGaAs heterojunctions was first reported by Dahl et al. [23]. The rings were of two different sizes: the outer diameter was ≈ 50 µm in both cases, but the inner diameters were 12 µm (“broad rings”) and 30 µm (“narrow rings”). These rings were also described by the authors as disks with a repulsive scatterer at the center. The observed resonance frequencies as a function of the applied magnetic field are shown in Fig. 6. Around zero magnetic field the resonances are similar to what one expects for a circular disk. At larger $B$, two modes with negative magnetic field dispersion were found, which were interpreted as edge magnetoplasmons at the inner and outer boundaries.
Fig. 6. Frequencies of magnetoplasma resonances for an array of (a) “broad” and (b) “narrow” rings [23]
Our theoretical results for the absorption energies and intensities [10] of a quantum dot containing an impurity (modelled by a Gaussian potential) and one to three electrons are shown in Fig. 7 as a function of the magnetic field. We have included spin but ignored the Zeeman energy. The two upper modes of the one-electron spectrum behave very similarly to the experimental results of Dahl et al. [23]. However, the two lower modes behave differently (in the one-electron case) from those experimental results. The lower modes, i.e., the edge magnetoplasmon modes, reveal a periodic structure similar to the case of a parabolic ring discussed below. That is true only for the one-electron system. When the number of electrons in the system is increased, the periodic structure of the edge modes (the two lowest modes) starts to disappear due to the electron-electron interaction. Since the spin degree of freedom is also included in our calculations, the difference between the one- and two-electron results in Fig. 7 is entirely due to the Coulomb force. The lowest mode (which is also the strongest) behaves, even for only three electrons, in much the same way as the lowest mode in the experiment (where the system consists of the order of one million electrons).

Fig. 7. Absorption energies and intensities of a quantum dot ($r_0 = 0$), $\hbar\omega_0 = 4$ meV, including a Gaussian repulsive scatterer (Eq. 13) with $V_0 = 32$ meV, $d = 5$ nm, as a function of the applied magnetic field. The dot contains one to three electrons. The areas of the filled circles are proportional to the calculated intensity [10]
3.2
A Parabolic Quantum Ring
The results reported below correspond to a narrow parabolic ring with $\alpha = 20$ and $r_0 = 10$ nm. In a pure one-electron ring, dipole-allowed absorption from the ground state can occur with equal probability to the first two excited states, and all other transitions are forbidden. An impurity in the ring will mix the angular-momentum eigenstates of the pure system into new states between which dipole transitions are allowed. For an impurity of moderate strength (Fig. 8(a)), an appreciable part of the transition probability still goes to the first two excited states, while for a strong impurity, absorptions taking the electron to the lowest excited state are more favorable (Fig. 8(b)). One important result here is that in a system with broken rotational symmetry, the transition probability depends strongly on the polarization of the incident light. For example, if instead of the unpolarized light considered here we were to consider light polarized along the diameter passing through the impurity [24], the absorption would prefer the second excited state. The other interesting feature observed in Fig. 8 is the periodic behavior of the absorption energies as a function of the applied field, which closely follows the behavior of the persistent current. Blocking of this current by a strong impurity is reflected in the flat curves of absorption energies versus the field.

Fig. 8. Absorption energies of a single electron in a parabolic QR versus $\Phi/\Phi_0$ for $\alpha = 20$ and (a) $V_0 = 1.0$, $d = 0.2$ and (b) $V_0 = 4.0$, $d = 0.5$ [10]

In order to study the effect of the impurity potential and electron correlations, we have considered a ring with four spinless electrons. In contrast to the impurity-free single-electron results, here the dipole transitions to the first excited state are forbidden ($|\Delta L| > 1$). However, impurities will again permit transitions to the forbidden states. In general, the effect of an impurity on the absorption spectrum as a function of the external magnetic field can be qualitatively explained by the single-particle properties. For example, when we compare Figs. 9(a) and (b), we notice that the lifting of the degeneracy in the energy spectra of non-interacting electrons is reflected in a smoother behavior as a function of the applied field. The sole effect of the Coulomb interaction on the energy spectrum is to shift it upwards and to increase the gap between the ground state and the excited states (see Sect. 2). As a result, the Coulomb interaction moves the absorption to higher frequencies (Fig. 9(c) and (d)). The effect of the electron-electron interaction is evident in the intensity: for the non-interacting system (Fig. 9(a) and (b)), the intensity of each absorption mode does not depend on the magnetic field, but for the interacting system (Fig. 9(c) and (d)) there is a strong variation of intensity as a function of the field.
Fig. 9. Dipole-allowed absorption energies for four non-interacting [(a) and (b)] and interacting [(c) and (d)] electrons in a QR versus Φ/Φ0 . The parameters are the same as in Fig. 8
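The periodic behavior of the absorption energies is easiest to see in the impurity-free, ideally narrow one-electron ring, where the levels are $\mathcal{E}_l = \frac{1}{4}(l - N)^2$ (Eq. 9) and the dipole selection rule allows only $\Delta l = \pm 1$. The Python sketch below tabulates the two allowed transition energies as a function of $N = \Phi/\Phi_0$. It is a simplified illustration under these stated assumptions and does not reproduce the finite-width ($\alpha = 20$) or impurity calculations of [10]; the function name is hypothetical.

```python
import math


def absorption_energies(flux):
    """Dipole-allowed transition energies (dimensionless) of an ideal ring.

    Levels are E_l = (l - flux)**2 / 4; the ground state has l0 equal to the
    integer nearest to flux, and only l0 -> l0 +/- 1 is dipole allowed.
    """
    l0 = math.floor(flux + 0.5)                 # ground-state angular momentum
    e0 = 0.25 * (l0 - flux) ** 2
    return sorted(0.25 * (l0 + dl - flux) ** 2 - e0 for dl in (-1, 1))


if __name__ == "__main__":
    for flux in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
        print(flux, absorption_energies(flux))
```

At integer flux the two transitions are degenerate, consistent with absorption going with equal probability to the first two excited states, and the pattern repeats with period $\Phi_0$; the lower branch goes to zero at half-integer flux, where the ground state changes its angular momentum.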
4
Role of Electron Spin
Until now, we have considered only spinless particles in our theoretical model of a parabolic QR. However, at low fields, electron spins are expected to play an important role. In Fig. 10, we show the ground state energies calculated for up to ten non-interacting electrons on a ring [11]. Clearly, the major consequence of the spin degree of freedom is a halving of the period and amplitude of the energy oscillations with increasing number of flux quanta. The behavior also depends strongly on the particle number. This result for a non-interacting system was reported earlier [25] and can be accounted for by simply counting the number of spins, following the Pauli principle, and noting that the up and down spins contribute identically to the energy. The particle number (modulo 4) dependence can also be trivially explained in this way.

Fig. 10. Ground state energy versus $\Phi/\Phi_0$ for up to ten non-interacting electrons. The energies are scaled to illustrate how the periodicity depends upon the number of electrons on the ring [11]

The Coulomb interaction in our model was found to have a profound effect on the energy spectra when the spin degree of freedom is included [11]. The calculated low-lying energy states for a system of two non-interacting (a) and interacting (b) electrons are shown in Fig. 11. As the flux is increased, the angular momentum quantum number ($L$) of the ground state of the non-interacting system increases in steps of two, i.e., the ground state changes as 0, 2, 4, ... The period is, as usual, one flux quantum. The ground state of the non-interacting system is always a spin singlet. The first excited state is spin degenerate. As the Coulomb interaction is turned on, the singlet-triplet degeneracy is lifted. This is because, once the interaction is turned on, states with the highest possible symmetry in the spin part of the wave function are favored, since that way one gains the exchange energy. As a result, the triplet state comes down in energy with respect to the others, and therefore the period (as well as the amplitude) of the ground state oscillations is halved. Similar results are also evident with three electrons ($S_z = \frac{1}{2}$) (Fig. 11(c) non-interacting and (d) interacting electron systems), where the ground state oscillates with a period $\Phi_0/3$. We found similar behavior for QRs containing up to four electrons [11].
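The level-counting argument for non-interacting electrons can be made explicit with a few lines of Python. The sketch below fills the ideal-ring levels $(l - \phi)^2$ (in units of $\hbar^2/2m^* r_0^2$, with $\phi = \Phi/\Phi_0$) with up to two electrons each and returns the ground state energy. It assumes an ideally narrow, impurity-free ring and neglects the Zeeman energy; the function name and the `with_spin` switch are illustrative and not part of the original work.

```python
def ground_state_energy(num_electrons, flux, with_spin=True):
    """Ground state energy of non-interacting electrons on an ideal ring.

    Single-particle levels are (l - flux)**2; each holds two electrons when
    spin is included (Pauli principle) and one electron otherwise.
    """
    per_level = 2 if with_spin else 1
    span = num_electrons + 2 + int(abs(flux))
    energies = sorted((l - flux) ** 2 for l in range(-span, span + 1))
    total, remaining = 0.0, num_electrons
    for energy in energies:
        take = min(per_level, remaining)
        total += take * energy
        remaining -= take
        if remaining == 0:
            break
    return total


if __name__ == "__main__":
    for n_el in (2, 3, 4):
        row = [round(ground_state_energy(n_el, k / 6.0), 4) for k in range(7)]
        print(n_el, row)
```

Scanning the flux over one period shows the particle-number dependence directly: with spin, two electrons have their energy minimum at integer flux, four electrons at half-integer flux, and three electrons show two minima per flux quantum, near $\phi = 1/3$ and $\phi = 2/3$.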
Fig. 11. Few low-lying energy states for a ring containing two (a) non-interacting and (b) interacting electrons. Three-electron results are given in (c) and (d) for non-interacting and interacting systems, respectively

In the four-electron system, we noticed that without the Zeeman energy included, the spin configuration $S_z = 0$ has the lowest energy and a $\Phi_0/2$ periodicity is observed. However, if the Zeeman energy is taken into account, the $S_z = 1$ configuration becomes lower in energy and the $\Phi_0/4$ periodicity is recovered. The final message that emerges from these studies is that the Coulomb interaction (in fact, any type of repulsive interaction) favors spin-triplet ground states for the reasons explained above. In the absence of any interaction, the ground states are spin singlets and, as a function of $\Phi/\Phi_0$, are parabolas with minima at about the integer values (exactly at integer values for an ideal ring). When a repulsive interaction is turned on, the singlet states rise in energy more than the triplet state, and for strong enough repulsion a decrease in the period of the oscillations is observed.

Can one observe the fractional oscillations of the ground state energy in optical spectroscopy? We present in Fig. 12 our theoretical results for the optical absorption in a QR containing two electrons and also an impurity of moderate strength [10] ($V_0 = 1.0$, $d = 0.2$). The strong effect of the Coulomb interaction on the electrons with spin is quite obvious. The non-interacting, impurity-free system ($S_z = 0$) is shown in (a), while the interacting (impurity-free) case is shown in (b). A system containing an impurity of moderate strength is shown for non-interacting (c) and interacting (d) electrons. The absorption spectra clearly reflect the behavior of the energy levels, and the impurity does not destroy the fractional periodicity of the electronic states.
Fig. 12. Absorption spectra of a QR containing two (a), (c) non-interacting or (b), (d) interacting electrons. In (c) and (d), the system additionally has an impurity of moderate strength. The size of the filled circles is proportional to the calculated absorption intensities
5
Recent Works on Nano-Rings
Quantum rings of nanoscopic dimensions are usually either created as self-organized structures [12,13,14,15] or fabricated on AlGaAs/GaAs heterojunctions containing a two-dimensional electron gas [16,17,18]. Self-organized QRs in InAs/GaAs systems were created by growing InAs quantum dots on GaAs, followed by a process involving a growth interruption during which In migrates to the edge of the dot. This creates nanostructures that resemble a volcano crater, with a center hole of 20 nm diameter and an outer diameter of 60-120 nm [12]. Only one or two electrons are admitted in these clean nano-rings, and FIR spectroscopy was performed to investigate the ground state and low-lying excitations in a magnetic field oriented perpendicular to the plane of the rings. The low-lying excitations were found to be unique to QRs, as first explored theoretically by us [9,10,11]. Similarly, the observed ground state transition from angular momentum $l = 0$ to $l = -1$, when one flux quantum threads the ring, is also well described by our parabolic ring model. Warburton et al. [13] reported a successive population of these self-assembled structures by up to five electrons. They investigated the exciton luminescence of charged rings. Recombination-induced emission by the systems from a neutral exciton to a quintuply charged exciton was reported. In the magnetotransport experiments in the Coulomb blockade regime reported in Refs. [16,17], the QRs contain a few hundred electrons. The deduced energy spectra were, however, well described by the single-electron picture. Keyser et al. [18] recently reported transport spectroscopy on a small ring containing fewer than ten electrons. The deduced energy spectrum was found to be strongly influenced by the electron-electron interaction. They also observed a reduction of the AB period that is in line with our theoretical results of Sect. 4.
Following our original work on parabolic quantum rings, many theoretical works on such systems have been reported [26,27,28,29,30,31]. Chaplik [26] investigated a parabolic QR having a small finite width. He noted that the magnetic-field-dependent component of the energy does not depend on the inter-electron interaction. He also investigated the behavior of charged and neutral magnetic excitons in a QR. In Ref. [28], a single QR containing up to eight electrons was investigated using path integral Monte Carlo techniques. Addition energies, spin correlations, etc. were evaluated for different values of the ring radius, the particle number and the temperature. There are many other issues related to the QR that were investigated theoretically in recent years. With the advent of nanoscopic quantum rings, research on parabolic QRs has taken on a new dimension. Like all other nanostructures investigated in recent years, QRs have proven to be very useful devices for investigating many fundamental physical phenomena, and perhaps in the near future they will also be found to be important for practical applications.
Acknowledgements
Many thanks to Rolf Haug (Hannover) for inviting me to present the talk. I also thank Christian Schüller and Detlef Heitmann for their hospitality during a visit at the University of Hamburg as a guest Professor in March 2003.
References
1. F. Hund, Ann. Phys. (Leipzig) 32, 102 (1938).
2. M. Büttiker, Y. Imry, R. Landauer, Phys. Lett. A96, 365 (1985).
3. N. Byers, C.N. Yang, Phys. Rev. Lett. 7, 46 (1961).
4. F. Bloch, Phys. Rev. 21, 1241 (1968).
5. R.B. Laughlin, Phys. Rev. B 23, 5632 (1981); B.I. Halperin, Phys. Rev. B 25, 2185 (1982); for a review, see T. Chakraborty, P. Pietiläinen: The Quantum Hall Effects (Springer, N.Y. 1995), second edition.
6. L.P. Levy et al., Phys. Rev. Lett. 64, 2074 (1990); V. Chandrasekhar, et al., Phys. Rev. Lett. 67, 3578 (1991).
7. U. Eckern, P. Schwab, Adv. Phys. 44, 387 (1995); H.F. Cheung, et al., Phys. Rev. B 37, 6050 (1988); G. Montambaux, H. Bouchiat, et al., Phys. Rev. B 42, 7647 (1990).
8. D. Mailly, C. Chapelier, A. Benoit, Phys. Rev. Lett. 70, 2020 (1993).
9. T. Chakraborty, P. Pietiläinen, Phys. Rev. B 50, 8460 (1994); and also in Transport Phenomena in Mesoscopic Systems. Eds. H. Fukuyama and T. Ando (Springer, Heidelberg, 1992).
10. V. Halonen, P. Pietiläinen, T. Chakraborty, Europhys. Lett. 33, 377 (1996).
11. K. Niemelä, P. Pietiläinen, P. Hyvönen, T. Chakraborty, Europhys. Lett. 36, 533 (1996).
12. A. Lorke, et al., Phys. Rev. Lett. 84, 2223 (2000).
13. R.J. Warburton, et al., Nature 405, 926 (2000).
14. H. Pettersson, et al., Physica E 6, 510 (2000).
15. D. Haft, et al., Physica E 13, 165 (2002).
16. A. Fuhrer, et al., Nature 413, 822 (2001).
17. T. Ihn, et al., Physica E (2003).
18. U.F. Keyser, et al., Phys. Rev. Lett. (2003).
19. J. Liu, A. Zaslavsky, L.B. Freund, Phys. Rev. Lett. 89, 096804 (2002).
20. P.A. Maksym, T. Chakraborty, Phys. Rev. Lett. 65, 108 (1990); T. Chakraborty, Comments Condens. Matter Phys. 16, 35 (1992); T. Chakraborty, Quantum Dots (North-Holland, Amsterdam, 1999); T. Chakraborty, F. Peeters, U. Sivan (Eds.), Nano-Physics & Bio-Electronics: A New Odyssey (Elsevier, Amsterdam, 2002); S.M. Reimann, M. Manninen, Rev. Mod. Phys. 74, 1283 (2002).
21. T. Chakraborty, P. Pietiläinen, Phys. Rev. B 52, 1932 (1995).
22. A.J. Leggett, in Granular Nanoelectronics, vol. 251 of NATO Advanced Study Institute, Series B: Physics, edited by D.K. Ferry, J.R. Berker, and C. Jacoboni (Plenum, New York, 1992), p. 297.
23. C. Dahl, J.P. Kotthaus, H. Nickel, W. Schlapp, Phys. Rev. B 48, 15480 (1993).
24. P. Pietiläinen, V. Halonen, T. Chakraborty, in Proc. of the International Workshop on Novel Physics in Low-Dimensional Electron Systems, T. Chakraborty (Ed.), Physica B 212 (1995).
25. D. Loss, P. Goldbart, Phys. Rev. B 43, 13762 (1991).
26. A.V. Chaplik, JETP 92, 169 (2001).
27. S. Viefers, et al., Phys. Rev. B 62, 10668 (2000).
28. P. Borrmann, J. Harting, Phys. Rev. Lett. 80, 3120 (2001).
29. A. Puente, L. Serra, Phys. Rev. B 63, 125334 (2001).
30. Z. Barticevic, M. Pacheco, A. Latge, Phys. Rev. B 62, 6963 (2000); Z. Barticevic, G. Fuster, M. Pacheco, Phys. Rev. B 65, 193307 (2002).
31. A. Emperador, M. Pi, M. Barranco, E. Lipparini, Physica E 12, 787 (2002).
Optical Spectroscopy of Low-Dimensional Quantum Spin Systems
Markus Grüninger1, Marco Windt1, Eva Benckiser1, Tamara S. Nunner2, Kai P. Schmidt3, Götz S. Uhrig3, and Thilo Kopp2
1 II. Physikalisches Institut, Universität zu Köln, Zülpicher Str. 77, 50937 Cologne, Germany
2 Experimentalphysik VI, Universität Augsburg, Universitätsstraße 1, 86135 Augsburg, Germany
3 Institut für Theoretische Physik, Universität zu Köln, Zülpicher Str. 77, 50937 Cologne, Germany
Abstract. Low-dimensional quantum spin systems display fascinating excitation spectra. In recent years, optical spectroscopy was shown to be a powerful tool for the study of these spectra by means of phonon-assisted infrared absorption. We discuss the results of antiferromagnetic S=1/2 cuprates with various topologies: the spinon continuum observed in the weakly coupled chains of CaCu2O3, two-triplet bound states and the continuum of the two-leg ladders in (La,Ca)14Cu24O41, and the bimagnon-plus-phonon spectrum of the bilayer YBa2Cu3O6, an undoped parent compound of the 2D high-Tc cuprates. Various theoretical approaches (dynamical DMRG, continuous unitary transformations (CUT), and spin-wave theory) are used for a quantitative analysis. Particular attention is paid to the role of the cyclic four-spin exchange.
1
Introduction
The dawn of phonon-assisted infrared absorption of magnetic excitations dates back to 1959, when Newman and Chrenko [1] observed an infrared absorption band at 0.24 eV in the classic three-dimensional (3D) S=1 antiferromagnet NiO. A connection with the antiferromagnetic order was suggested on the basis of the observed temperature dependence. In 1964 Mizuno and Koide [2] proposed that this absorption band reflects the simultaneous excitation of two magnons and one phonon. At that time, the spin dynamics were analyzed only qualitatively on a mean-field level. In 1966, phonon-assisted two-magnon absorption was also reported in cubic KNiF3 [3], which is still considered to be the best realization of the 3D S=1 Heisenberg model [4]. Only in 1995 did Lorenzana and Sawatzky [5] rediscover this idea and propose that the mid-infrared absorption features observed in the undoped parent compounds of the high-Tc cuprates [6], the best realization of the 2D S=1/2 square-lattice Heisenberg model, had to be explained in terms of bimagnon-plus-phonon absorption. On the basis of spin-wave theory they performed the first quantitative analysis and obtained an excellent description of the dominant peak [5]. The spectral weight at higher energies (see below, Fig. 13) was tentatively ascribed to higher-order multi-magnon contributions. Due to the reduced dimensionality and the small spin value S=1/2, quantum effects should be pronounced. Therefore the nature of the spin excitations in the 2D cuprates is still controversial, in particular at high energies [7,8,9,10,11,12]. The detailed information on the spectral density offered by the infrared data should make it possible to clarify this point, but it still poses a challenge to theory. The problems in the description of the line shape of the 2D cuprates appear particularly conspicuous in comparison with the excellent description obtained (a) in spin-wave theory for the isostructural S=1 compound La2NiO4 [5,13] and (b) in a two-spinon analysis of the S=1/2 chain Sr2CuO3 [14,15]. In 1D good agreement is obtained because quantum fluctuations are fully included, and in the 2D S=1 nickelate because fluctuations beyond spin-wave theory are small. Here, we try to shed some light on this issue by comparing S=1/2 cuprate compounds with different topologies: weakly coupled chains, two-leg ladders and 2D layers. Before doing so, we discuss bimagnon-plus-phonon absorption in general, the experimental determination of the magnetic contribution to the optical conductivity σ(ω) and in particular the magnetic excitations of S=1/2 two-leg spin ladders.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 95–111, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Bimagnon-Plus-Phonon Absorption
This technique allows one to study the spin–spin correlation function via a measurement of the dipole–dipole correlation function, i.e., the optical conductivity σ(ω). Since spin is conserved, σ(ω) reflects S=0 excitations (neglecting spin–orbit coupling), e.g., the excitation of two S=1 magnons with total spin S_tot=0 or the appropriate combination of two elementary triplets (henceforth called triplons [16]) or two spinons. However, in the cuprates direct absorption of, e.g., two magnons is not infrared active due to inversion symmetry. This selection rule can effectively be circumvented by simultaneously exciting a Cu-O bond-stretching phonon that breaks the symmetry. Hence, the lowest-order infrared-active magnetic absorption is a two-magnon-plus-phonon process. The phonon participation was verified experimentally by the observation of a frequency shift induced by oxygen isotope substitution in YBa2Cu3O6 [7]. The quantitative analysis [5] starts from a three-band Peierls–Hubbard model in the presence of an electric field E. The dominant contribution to the dipole moment arises from displacements of the oxygen ions (only Einstein phonons are considered). These modulate the hopping matrix elements and the on-site energies, whereas the electric field contributes only to the on-site energies. In perturbation theory a low-energy Hamiltonian is derived, in which the relevant term corresponds to a nearest-neighbor Heisenberg Hamiltonian, where the exchange coupling J_{i,δ} depends on the electric field E and on the displacements of the oxygen ions u_j
$$H = \sum_{i,\delta} J_{i,\delta}(E,\{u_j\})\, \mathbf{S}_i \cdot \mathbf{S}_{i+\delta} . \tag{1}$$
Here, i labels the Cu sites and δ runs over nearest-neighbor sites. We expand J(E, u) to order ∂²J/∂E∂u, which entails the coupling of a photon to a phonon and two neighboring spins. The dipole moment associated with two-magnon-plus-phonon absorption then results from the Fourier transform of the product of neighboring spin operators weighted by a momentum-dependent vertex function γ(k), which corresponds to the Fourier transform of ∂²J/∂E∂u. To zeroth order in the magnon–phonon coupling, the magnetic system and the phonon system can be decoupled. The role of the phonon is thus reduced to (a) breaking the symmetry, (b) a shift of the energy scale by the phonon energy ω_ph, and (c) a contribution k_ph to the total momentum k_tot. Since k_ph runs over the entire Brillouin zone, the selection rule 0 = k_tot = k_mag + k_ph tells us that the magnetic excitations have to be integrated over all momenta, where the form factor is given by |γ(k)|² ≡ f_ph(k).
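The three roles of the phonon listed above can be sketched numerically. The following minimal Python example assumes that a momentum-resolved S=0 spectral density is already available on a grid; the function name, the argument layout and the omission of all prefactors are illustrative assumptions, not part of the analysis in [5].

```python
import math

def phonon_assisted_spectrum(k_points, omega_mag, rho, form_factor, omega_ph):
    """Momentum-integrated, phonon-shifted magnetic spectrum (prefactors omitted).

    k_points    : list of momenta sampling the Brillouin zone
    omega_mag   : list of magnetic excitation energies
    rho         : rho[ik][iw], S=0 spectral density at k_points[ik], omega_mag[iw]
    form_factor : callable f_ph(k) = |gamma(k)|^2 (hypothetical input)
    omega_ph    : phonon energy, in the same units as omega_mag
    """
    n_k = len(k_points)
    weights = []
    for iw in range(len(omega_mag)):
        # (c) integrate over all magnetic momenta, weighted by the form factor
        total = sum(form_factor(k) * rho[ik][iw] for ik, k in enumerate(k_points))
        weights.append(total / n_k)
    # (b) shift the energy scale by the phonon energy
    frequencies = [w + omega_ph for w in omega_mag]
    return frequencies, weights

# Toy input: two momenta, two energies, and a sin^4(k/2) form factor.
toy_k = [0.0, math.pi]
toy_rho = [[1.0, 0.0], [0.0, 2.0]]
toy_freq, toy_weight = phonon_assisted_spectrum(
    toy_k, [1.0, 2.0], toy_rho, lambda k: math.sin(k / 2.0) ** 4, 0.5)
```

In the toy input the weight at k = 0 is removed entirely by the form factor, which illustrates how strongly the choice of f_ph(k) can reshape the integrated spectrum.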
3
Experimental Determination of the Magnetic Contribution to σ(ω)
Since the considered bimagnon-plus-phonon absorption is a higher-order process, one expects only a weak dipole moment or a small spectral weight in σ(ω). This can be determined very accurately by measuring both the transmittance T(ω) and the reflectivity R(ω). The optical conductivity σ(ω) = nκω/2π results from inverting [17,18]
$$R(\omega) = \frac{(n-1)^2+\kappa^2}{(n+1)^2+\kappa^2}, \qquad T(\omega) = \frac{(1-R)^2\,\Phi}{1-(R\Phi)^2}, \tag{2}$$
$$\Phi(\omega) = \exp(-2\omega\kappa d/c) = \exp(-\alpha d), \tag{3}$$
where n denotes the index of refraction, κ the extinction coefficient, α the absorption coefficient, c the velocity of light and d the thickness of the transmittance sample [R(ω) denotes the single-bounce reflectivity and hence needs to be measured on a thick ("semi-infinite"), opaque sample]. These equations are obtained for a sample with parallel surfaces by adding up the intensities of all multiply reflected beams incoherently, i.e., by neglecting interference effects. Experimentally, this condition is realized either if the sample surfaces are not perfectly parallel or by smoothing out the Fabry–Perot interference fringes by means of Fourier filtering. In the case of weak absorption, κ ≪ n, the reflectivity is entirely determined by n and is therefore not suitable for deriving κ via a Kramers–Kronig transformation. At the same time, κ can be determined very accurately from the transmittance. Since T(ω) depends exponentially on κ · d, the appropriate choice of d is essential. As an example we plot in Fig. 1 the data of the two-leg S=1/2 ladder LaxCa14−xCu24O41 [19] (see also Sect. 4). The top panel shows the reflectivity measured on a 0.8 mm thick sample (x=4) for two different polarizations of the electric field, namely, parallel to the rungs and parallel to the legs. The feature at about 600–700 cm−1 corresponds to the Cu-O bond-stretching
phonon mode. At higher frequencies, the reflectivity is featureless, which is characteristic of an insulator in the regime of weak absorption below the gap. The different absolute values of the two polarization directions reflect the difference in n, namely, n_a ≈ 2.3 and n_c ≈ 2.6. The bottom panel of Fig. 1 shows T(ω) measured on thin single crystals with x=5.2 for two different thicknesses, d=6 µm and 28 µm. Fabry–Perot interference fringes have been removed by Fourier filtering. In contrast to the reflectivity, the transmittance reveals the weak absorption features we are looking for, in this case in the range from about 2000 to 6000 cm−1. The spectra can be divided into three different regimes. The absorption below ≈ 1300 cm−1 can be attributed to phonons and multi-phonon bands. The strong absorption at high frequencies is due to an electronic background, which has to be identified with the onset of charge-transfer excitations or with the absorption of localized carriers (located in the CuO2 chains but not in the ladders [20]). In order to analyze the magnetic excitations in the intermediate frequency range, they have to be separated from the background, which thus needs to be known precisely. This requires the measurement of a thin sample, which is still transparent at high frequencies, whereas the weaker magnetic features can be determined more precisely from the data of the thicker sample (see bottom panel of Fig. 1). The spectrum of the optical conductivity σ(ω) is shown in Fig. 2 for polarization parallel to the legs (top) and parallel to the rungs (middle). In the latter case, the features between 2000 and 6000 cm−1 are clearly separated from the high-frequency background, which can be determined unambiguously by a Gaussian fit for ω > 7000 cm−1 (dashed line in the middle panel of Fig. 2, see also [17]). For polarization parallel to the legs, the stronger absorption complicates the determination of the background considerably [21].
Fig. 1. Mid-infrared reflectivity and transmittance of LaxCa14−xCu24O41 at T=4 K for polarization parallel to the rungs (E ∥ a) and to the legs (E ∥ c), respectively. Top panel: Reflectivity for x=4. Bottom panel: Transmittance of two single crystals (x=5.2) with thickness d=28 µm (solid lines) and 6 µm (dashed lines). Horizontal axes: frequency (cm−1) and energy (eV)
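The inversion of Eqs. (2) and (3) can be sketched at a single frequency as follows. This is a minimal Python illustration, not the analysis code of [17,18]: it assumes the frequency is given as a wavenumber ν in cm−1 (so that ω/c = 2πν), solves the quadratic equation for Φ in closed form, and converts to practical units via σ = 4π c ε0 ν n κ, the SI counterpart of σ = nκω/2π. All function and variable names are hypothetical.

```python
import math

EPS0_F_PER_CM = 8.8541878128e-14   # vacuum permittivity in F/cm
C_CM_PER_S = 2.99792458e10         # speed of light in cm/s

def forward_RT(n, kappa, wavenumber, d_cm):
    """Eqs. (2), (3): single-bounce R and incoherent transmittance T."""
    R = ((n - 1.0) ** 2 + kappa ** 2) / ((n + 1.0) ** 2 + kappa ** 2)
    phi = math.exp(-4.0 * math.pi * kappa * wavenumber * d_cm)
    T = (1.0 - R) ** 2 * phi / (1.0 - (R * phi) ** 2)
    return R, T

def invert_RT(R, T, wavenumber, d_cm):
    """Recover n, kappa and sigma (in 1/(Ohm cm)) from measured R and T."""
    # T = (1-R)^2 phi / (1 - R^2 phi^2)  <=>  T R^2 phi^2 + (1-R)^2 phi - T = 0.
    b = (1.0 - R) ** 2
    root = math.sqrt(b * b + 4.0 * (T * R) ** 2)
    phi = 2.0 * T / (b + root)          # positive root, written in a stable form
    kappa = -math.log(phi) / (4.0 * math.pi * wavenumber * d_cm)
    # R (n+1)^2 + R kappa^2 = (n-1)^2 + kappa^2, solved for the root with n > 1.
    disc = (1.0 + R) ** 2 - (1.0 - R) ** 2 * (1.0 + kappa ** 2)
    n = ((1.0 + R) + math.sqrt(disc)) / (1.0 - R)
    sigma = 4.0 * math.pi * C_CM_PER_S * EPS0_F_PER_CM * wavenumber * n * kappa
    return n, kappa, sigma

# Round trip with illustrative numbers (n close to n_a, a 28 micron sample).
R_demo, T_demo = forward_RT(2.3, 1.0e-3, 3000.0, 28.0e-4)
n_demo, kappa_demo, sigma_demo = invert_RT(R_demo, T_demo, 3000.0, 28.0e-4)
```

Because κ enters T only through Φ, the accuracy of κ is set by how far Φ lies from 0 and from 1, which is why the choice of the sample thickness d matters.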
The data clearly reveal a comparably strong absorption peak at about 6000 cm−1, which can already be seen in T(ω) of the 6 µm sample (Fig. 1). We obtained an excellent fit by using a Gaussian line shape for this peak plus a quadratic frequency dependence of the absorption at higher frequencies (dashed lines in Fig. 2). Subtracting the background fits we obtain the magnetic contribution to σ(ω) (bottom panel of Fig. 2).
Fig. 2. Optical conductivity σ(ω) of La5.2Ca8.8Cu24O41 at T=4 K (solid lines) and fits of the high-frequency background (dashed lines). Top panel: Polarization parallel to the legs (E ∥ c). The total fit (long-dashed line) is the sum of a Gaussian and of a quadratic part, A(ω − ω_c)². Middle panel: Polarization parallel to the rungs (E ∥ a), with a Gaussian fit. Bottom panel: Magnetic contribution to σ(ω) resulting after subtraction of the background. Horizontal axes: frequency (cm−1) and energy (eV); vertical axes: σ(ω) (1/Ωcm)
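The background subtraction for the rung polarization can be illustrated with a small Python sketch. It assumes that the background is a single Gaussian fitted only to the data above 7000 cm−1, and it replaces a nonlinear fit by a simpler technique: a linear least-squares fit of ln σ to a parabola, which is exact for noise-free Gaussian data. The function names are hypothetical, and the additional quadratic term used for the leg polarization is not included.

```python
import math

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_gaussian(omega, sigma):
    """Fit sigma = A exp(-(omega - mu)^2 / (2 s^2)) via ln(sigma) = a + b x + c x^2."""
    mid = sum(omega) / len(omega)
    scale = max(abs(w - mid) for w in omega)
    xs = [(w - mid) / scale for w in omega]      # rescaled for numerical stability
    ys = [math.log(s) for s in sigma]
    S = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    M = [[S[i + j] for j in range(3)] for i in range(3)]   # normal equations
    D = _det3(M)
    coeffs = []
    for j in range(3):                                     # Cramer's rule
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = t[i]
        coeffs.append(_det3(Mj) / D)
    a, b, c = coeffs
    if c >= 0.0:
        raise ValueError("data do not resemble the flank of a Gaussian")
    amplitude = math.exp(a - b * b / (4.0 * c))
    center = mid + scale * (-b / (2.0 * c))
    width = scale * math.sqrt(-1.0 / (2.0 * c))
    return amplitude, center, width

def gaussian(w, amplitude, center, width):
    return amplitude * math.exp(-(w - center) ** 2 / (2.0 * width ** 2))

def magnetic_contribution(omega, sigma, fit_from=7000.0):
    """Subtract a Gaussian background fitted to the points with omega > fit_from."""
    tail = [(w, s) for w, s in zip(omega, sigma) if w > fit_from]
    params = fit_gaussian([w for w, _ in tail], [s for _, s in tail])
    return [s - gaussian(w, *params) for w, s in zip(omega, sigma)]
```

With real data the logarithm amplifies noise where σ is small, so a weighted or nonlinear fit would be the more robust choice; the sketch only shows the order of the steps (fit the tail, then subtract everywhere).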
4
Magnetic Excitations of Two-Leg Spin-1/2 Ladders
Two-leg spin-1/2 ladders show fascinating properties such as a spin-liquid ground state with a spin gap to the lowest excited state and superconductivity under pressure upon hole doping [22]. This possibility of hole doping has placed the so-called telephone-number compounds A14Cu24O41 in the focus of attention. Here, we are interested in the magnetic properties of nominally undoped samples, i.e., Cu2+, which corresponds to LaxCa14−xCu24O41 with x=6. Single-phase crystals could only be synthesized for x ≤ 5.2 [23]. However, polarized x-ray absorption data [20] show that at least for x > 2 the holes are located within the second structural unit of these compounds, the CuO2 chains. Thus, we consider the ladders to be undoped [17,19].
The minimal model for S=1/2 cuprate ladders consists of an antiferromagnetic Heisenberg Hamiltonian plus an additional cyclic exchange term H_cyc [24]
$$H = J_\parallel \sum_i \left( \mathbf{S}_{i,l}\cdot\mathbf{S}_{i+1,l} + \mathbf{S}_{i,r}\cdot\mathbf{S}_{i+1,r} \right) + J_\perp \sum_i \mathbf{S}_{i,l}\cdot\mathbf{S}_{i,r} + H_{\rm cyc} , \tag{4}$$
where J⊥ and J∥ denote the rung and leg couplings, i refers to the rungs, and l, r label the two legs. The cyclic exchange term corresponds to the cyclic permutation of four spins on a plaquette and emerges as the dominant correction to the nearest-neighbor Heisenberg model in an expansion of the three-band Hubbard model [28]. (Note that the formulations of H_cyc used in the DMRG and in the CUT calculations are slightly different [24,25]. The resulting Hamiltonian is identical except for couplings along the diagonals if J⊥ and J∥ are suitably redefined [26].) Rewriting the Hamiltonian in terms of rung singlets and rung triplets, one can easily see that the strongest effect of the cyclic exchange coupling J_cyc is a renormalization of the other terms in the Hamiltonian, causing a redshift of the entire one-triplon dispersion (see Fig. 3a) [24]. Correspondingly, the lower edge of the two-triplon continuum also shifts to lower energies (open symbols in Fig. 3b). Below the continuum, there exists an S=0 two-triplon bound state (full symbols in Fig. 3b), which results from an attractive interaction between two triplons. The only qualitatively new contribution of the cyclic exchange is a competing repulsive interaction between triplets on neighboring rungs. Increasing J_cyc thus reduces the binding energy of the two-triplon bound state and also the width of the bound-state dispersion. The existence of two-triplon bound states in undoped two-leg ladders was predicted theoretically by a number of groups [30,31,32,33,34,35]. Jurecka and Brenig [33] predicted that the S=0 bound state dominates the optical conductivity spectrum for small values of J∥/J⊥. The evolution of σ(ω) for 0.2 ≤ J∥/J⊥ ≤ 1.15 was discussed in [17]. The existence of the S=0 two-triplon bound state was confirmed experimentally by measuring
σ(ω) of (La,Ca)14Cu24O41 [19]. As shown in Fig. 3b, the bound state shows a maximum at k ≈ π/2 and a minimum at the Brillouin zone boundary. Both give rise to van Hove singularities in the density of states which cause peaks in σ(ω). Knowledge of these two peak frequencies and of the spin gap is sufficient to determine the three coupling constants J∥, J⊥ and J_cyc (see Fig. 4). For La5.2Ca8.8Cu24O41 we find J∥/J⊥ ≈ 1.25–1.35, J_cyc/J⊥ ≈ 0.20–0.27, and J⊥ ≈ 950–1100 cm−1 [24]. Inclusion of a sizable J_cyc is thus indeed necessary for a consistent description of the experimental data. However, the parameters were determined from three discrete energies. A comparison of the entire spectral density calculated for this parameter set with the experimentally determined line shape of σ(ω) provides a thorough test of whether this minimal model captures all relevant properties. In fact, the agreement between theory and experiment is excellent (see Fig. 5). The two van Hove singularities of the bound state cause the two peaks at about 2140 cm−1 and 2780 cm−1 for leg polarization. For polarization parallel to the rungs, the lower bound state is suppressed due to a selection rule [19], and the upper bound-state peak is about 60 cm−1 higher, which reflects the different frequencies of the phonons involved in the two polarization directions [36]. The continuum above ≈ 3000 cm−1 is reproduced almost perfectly, in particular for the rung polarization, which experimentally can be identified unambiguously and with high precision. Due to the complications arising from the background subtraction for polarization parallel to the legs (see Sect. 3), the small deviations within the continuum range in the top panel of Fig. 5 are certainly within the experimental accuracy. In this polarization, σ(ω) contains two contributions, where the two legs are excited in-phase (p_y = 0) or out-of-phase (p_y = π).
Fig. 3. DMRG results for an 80-site ladder with J∥ = J⊥ and 0 ≤ J_cyc/J⊥ ≤ 0.3 [24]. (a) Dispersion of the elementary triplet (triplon) branch. (b) Corresponding lower edge of the two-triplon continuum (open symbols) and the S=0 two-triplon bound state (full symbols). Axes: ω/J⊥ versus p_x/π
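The link between the extrema of the bound-state dispersion and peaks in the density of states can be illustrated with a toy calculation. The Python sketch below uses a purely hypothetical model dispersion (a cosine with a maximum at k = π/2 and a minimum at k = π, restricted to k ≥ 0.3π); it is not the DMRG or CUT dispersion, and the numbers carry no physical meaning beyond the qualitative shape.

```python
import math

def model_bound_state_energy(k):
    """Hypothetical dispersion (units of J_perp): maximum at pi/2, minimum at pi."""
    return 2.0 - 0.3 * math.cos(2.0 * k)

def density_of_states(dispersion, k_min, k_max, n_k=20000, n_bins=30):
    """Histogram of the energies on a uniform k grid (midpoint sampling)."""
    energies = [dispersion(k_min + (k_max - k_min) * (i + 0.5) / n_k)
                for i in range(n_k)]
    e_lo, e_hi = min(energies), max(energies)
    width = (e_hi - e_lo) / n_bins
    counts = [0] * n_bins
    for e in energies:
        index = min(int((e - e_lo) / width), n_bins - 1)
        counts[index] += 1
    centers = [e_lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, counts

# The lowest and highest bins collect far more states than a bin in between.
centers, counts = density_of_states(model_bound_state_energy, 0.3 * math.pi, math.pi)
```

The histogram piles up in the lowest and highest energy bins, where the dispersion is flat; after the momentum integration these are the two energies at which σ(ω) shows peaks.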
The two-triplon weight and thus also the bound state is contained in the in-phase contribution, whereas the continuum is dominated by the out-of-phase mode, which reflects the excitation of three (or more) triplons. In [17,19] we have compared σ(ω) of (La,Ca)14Cu24O41 with the results of two further theoretical approaches, namely, Jordan–Wigner fermions [37] and continuous unitary transformations (CUT). There, the cyclic exchange was not included. Here, we report for the first time on CUT results for the spectral densities including four-spin interactions. In the CUT approach, the Hamiltonian H is mapped to an effective Hamiltonian H_eff which conserves the number of rung triplons [38,39]. The ground state of H_eff is the rung-triplon vacuum. The CUT is implemented perturbatively in J∥/J⊥ and J_cyc/J⊥. The resulting plain series are represented in terms of the variable 1 − Δ_s/(J∥ + J⊥) [40,41], where Δ_s is the one-triplon gap, which is proportional to the inverse correlation length of the system. Then standard Padé extrapolations yield reliable results up to J∥/J⊥ ≈ 1–1.5 depending on the value of J_cyc/J⊥. The CUT result for the two-triplon contribution to σ(ω) is given in Fig. 6 for J∥/J⊥=1.25, J_cyc/J⊥=0.18 and J⊥=1060 cm−1 [26]. For the comparison with the experimental data one has to bear in mind that the spectral weight of three and more triplons is missing (see Fig. 10 below; note that the rung polarization contains only excitations of an even number of triplons).
Fig. 4. DMRG results for the bound states and for the spin gap [24]. All results were extrapolated to an infinite ladder. (a) Frequency of the maximum of the S=0 bound-state dispersion at p_x ≈ π/2 (full symbols) and of the minimum at p_x = π (open symbols). (b) Spin gap Δ_s as a function of J∥/J⊥ for different values of J_cyc/J⊥
Fig. 5. Comparison of σ(ω) of La5.2Ca8.8Cu24O41 at T=4 K (gray lines) with DMRG results (symbols) [24] for an 80-site ladder with J∥/J⊥=1.3, J_cyc/J⊥=0.2, J⊥=1000 cm−1, ω_ph^leg=570 cm−1, ω_ph^rung=620 cm−1 [29] and a finite broadening of δ = 0.1 J⊥. Top panel: For polarization of the electric field E parallel to the legs, σ(ω) contains two contributions, in which the two legs are in-phase (p_y=0) or out-of-phase (p_y=π) with each other. Bottom panel: Polarization parallel to the rungs. Inset: One-triplon dispersion (dashed line), lower edge of the two-triplon continuum (thin solid line) and S=0 two-triplon bound state (thick solid line) for the above parameters
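As a generic illustration of why a Padé extrapolation can outperform the plain truncated series, the following Python sketch builds the lowest non-trivial approximant, [1/1], from three series coefficients. It is a textbook construction under the stated assumptions and not the extrapolation scheme of [40,41], which works with higher orders and with the variable 1 − Δ_s/(J∥ + J⊥); the function names are hypothetical.

```python
def pade_1_1(c0, c1, c2):
    """[1/1] Pade approximant (a0 + a1 x) / (1 + b1 x) of c0 + c1 x + c2 x^2.

    Matching powers of x in (a0 + a1 x) = (c0 + c1 x + c2 x^2)(1 + b1 x)
    up to second order gives a0 = c0, a1 = c1 + c0 b1 and b1 = -c2 / c1.
    """
    if c1 == 0:
        raise ValueError("c1 must be non-zero for a [1/1] approximant")
    b1 = -c2 / c1
    a0 = c0
    a1 = c1 + c0 * b1

    def approximant(x):
        return (a0 + a1 * x) / (1.0 + b1 * x)

    return approximant

def truncated_series(c0, c1, c2):
    """Plain second-order series, for comparison."""
    return lambda x: c0 + c1 * x + c2 * x * x

# Example: the first three coefficients of 1/(1 - x) are 1, 1, 1.
pade = pade_1_1(1.0, 1.0, 1.0)
plain = truncated_series(1.0, 1.0, 1.0)
```

For this example the approximant reproduces 1/(1 − x) exactly, whereas the truncated series falls increasingly short as x grows, which is the qualitative reason for extrapolating a perturbative series before using it at couplings of order one.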
Fig. 6. Comparison of σ(ω) of La5.2Ca8.8Cu24O41 (gray lines) at T=4 K with the two-triplon contribution calculated by CUT (black lines) for J∥/J⊥ = 1.25, J_cyc/J⊥ = 0.18 [26], J⊥ = 1060 cm−1, ω_ph^leg = 570 cm−1, ω_ph^rung = 620 cm−1 [29] and a finite broadening of δ = 0.02 J⊥. Top panel: Polarization parallel to the legs. The missing weight is mainly due to the sizeable three-triplon part (see Fig. 10). Bottom panel: Rung polarization. Here, the leading correction is the small four-triplon part
Roughly speaking, the two-triplon contribution calculated by CUT is equivalent to the in-phase contribution of the DMRG result (for leg polarization). Qualitatively, the spectra agree very well with each other. In principle no extra broadening is needed in CUT, thus sharp features are better resolved. However, CUT slightly underestimates the splitting of the two peaks of the bound state and also the position of the continuum in the rung polarization (the frequency of the maximum is too low by about 10%). The optical conductivity σ(ω) reflects a weighted superposition of the momentum-resolved spectral densities with S = 0 (see above). The k-resolved CUT data for the two-triplon contribution are shown in Fig. 7. In both panels the spectral densities are dominated by the bound state, which leaves the continuum at k ≈ 0.3π. In both polarizations, the continua show interesting structures. For 0 ≤ k ≤ π/2 there is a dominant ridge which develops from the two-triplon Raman peak at k=0. For large k this ridge moves to higher energies, becoming almost an anti-bound state at k = π. In the rung polarization, there is a pronounced feature concentrated around k = π/2 below the ridge. This feature survives the k integration as a small shoulder in σ(ω), which may correspond to the experimentally observed peak at about 3200 cm−1 (see bottom panel of Fig. 6). The strength of this feature depends on the weight factor or phonon form factor f_ph. To lowest order (4th order in the Cu-O hopping t_pd), the dominant contribution to f_ph^leg comes from the in-phase and the out-of-phase stretching modes of the oxygen ions on the legs, whereas for f_ph^rung the out-of-phase stretching mode and the vibration of the oxygen ion on the rung are taken into account (see also [36])
$$f_{\rm ph}^{\rm leg} = 8\sin^4\!\left(\frac{p_x}{2}\right), \qquad f_{\rm ph}^{\rm rung} = 8\sin^2\!\left(\frac{p_x}{2}\right) + 4 . \tag{5}$$
Fig. 7. Momentum-resolved two-triplon spectral densities with S = 0 as obtained by the CUT approach for J∥/J⊥=1.25 and J_cyc/J⊥=0.18 [26]. Top panel: Polarization parallel to the legs. Bottom panel: Polarization parallel to the rungs. Axes: k (π) versus ω (J⊥)
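Reading Eq. (5) as f_ph^leg = 8 sin⁴(p_x/2) and f_ph^rung = 8 sin²(p_x/2) + 4, the two form factors can be tabulated with a few lines of Python. The function names are hypothetical; the sketch only evaluates the two expressions at selected momenta to show how differently they weight small and large p_x.

```python
import math

def f_ph_leg(p_x):
    """Leg form factor, read from Eq. (5) as 8 sin^4(p_x / 2)."""
    return 8.0 * math.sin(p_x / 2.0) ** 4

def f_ph_rung(p_x):
    """Rung form factor, read from Eq. (5) as 8 sin^2(p_x / 2) + 4."""
    return 8.0 * math.sin(p_x / 2.0) ** 2 + 4.0

# Tabulate both form factors at the zone center, mid-zone and zone boundary.
for label, p_x in (("0", 0.0), ("pi/2", math.pi / 2.0), ("pi", math.pi)):
    print(label, round(f_ph_leg(p_x), 6), round(f_ph_rung(p_x), 6))
```

The leg form factor vanishes at p_x = 0 and rises steeply towards the zone boundary, so it suppresses the dominant small-k ridge, whereas the rung form factor stays finite at all momenta because of the constant term.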
The effect of different form factors on the line shape of σ(ω) is visualized in Figs. 8 and 9. The small shoulder discussed above at about 2.3 J⊥ is more pronounced for form factors which suppress the dominant ridge at small k. Obviously, both the spectral weight and the line shape depend sensitively on the form factor. Finally, we discuss the influence of J_cyc on the spectral weight and the line shape of σ(ω). The relative spectral weights I_n^rel of the n-triplon contributions to the S=0 spectral density calculated by CUT for leg polarization are plotted in Fig. 10 for three different values of J_cyc/J∥ (see [39] for the S=1 channel). For J∥=0 the system consists of local rung singlets which can only be excited to local rung triplets. Due to the local nature, the S=0 weight is exhausted entirely by the two-triplon part, I_2^rel = 1. For finite J∥ the two-triplon spectral weight is reduced and the multi-triplon weight is enhanced. In σ(ω) this translates into a spectral weight transfer from low energies to high energies, i.e., to an increase of the high-energy continuum weight. An
additional suppression of the leading two-triplon part takes place upon increasing J_cyc (see Fig. 10). This reflects the fact that the rung-singlet phase is destabilized by J_cyc, resulting in a quantum phase transition to the topologically different staggered dimer phase [41,42,43,44]. For J∥/J⊥ = 1.25 we find a reduction from I_2^rel = 0.64 for J_cyc = 0 to I_2^rel = 0.50 for J_cyc/J∥ = 0.2, i.e., the two-triplon contribution loses about 20% of its weight. The sum of I_2^rel, I_3^rel, and I_4^rel is very close to 1 at least for J∥/J⊥ ≤ 1.5 and J_cyc/J∥ ≤ 0.2, and I_4^rel remains small in this parameter range. It will thus be sufficient to determine the two- and three-triplon contributions in order to obtain a reliable description of the line shape of σ(ω). The influence of J_cyc on the line shape of the two-triplon contribution to σ(ω) is visualized in Figs. 11 and 12 for J∥/J⊥=1.0 and 1.3. The spectra follow the trends apparent in Fig. 3b. With increasing J_cyc, the dispersion of the bound state and thus also the splitting of the two sharp bound-state peaks in σ(ω) are significantly reduced. Furthermore, an increase of J_cyc causes a redshift of the entire spectrum. Additionally, J_cyc gives rise to new fine structure in the spectral densities. For instance, for J∥/J⊥=1.3 and J_cyc/J⊥=0.2 there appears a shoulder between the two bound-state peaks in the leg polarization (see lower left panel of Fig. 11). This can be traced back to a matrix-element effect. The momentum-resolved spectral densities plotted in Fig. 7 reveal that the spectral weight of the two-triplon bound state shows a maximum for a value of k close to but not identical with k = π. After the integration over k the van Hove singularities dominate the spectra, but the maximum still survives as a clear peak. Pronounced effects of J_cyc can also be observed in the continua, where the tendency towards an anti-bound state is enhanced by J_cyc.
Fig. 8. Influence of the phonon form factor f_ph on the line shape of the two-triplon contribution to σ_leg(ω) as calculated by CUT for J∥/J⊥=1.25 and J_cyc/J⊥=0.18 [26]. (Inset) Continuum on an enlarged scale
Fig. 9. Influence of the phonon form factor f_ph on the line shape of the two-triplon contribution to σ_rung(ω) as calculated by CUT for J∥/J⊥=1.25 and J_cyc/J⊥=0.18 [26]
Fig. 10. Relative spectral weights I_n^rel of the n-triplon contributions to the S=0 spectral density in leg polarization as a function of J∥/J⊥ for three different values of J_cyc/J∥ [26]
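The weight transfer quoted above for J∥/J⊥ = 1.25 can be checked with a short calculation. The Python sketch below uses only the two values of I_2^rel given in the text; the multi-triplon weights are obtained under the stated approximation that the relative weights sum to 1, and the variable names are hypothetical.

```python
def relative_reduction(before, after):
    """Fraction of the original weight that is lost."""
    return (before - after) / before

I2_WITHOUT_CYCLIC = 0.64   # I_2^rel for J_cyc = 0 (from the text)
I2_WITH_CYCLIC = 0.50      # I_2^rel for J_cyc / J_par = 0.2 (from the text)

two_triplon_loss = relative_reduction(I2_WITHOUT_CYCLIC, I2_WITH_CYCLIC)

# Assuming I_2 + I_3 + I_4 is close to 1, the remainder is multi-triplon weight.
multi_triplon_without = 1.0 - I2_WITHOUT_CYCLIC
multi_triplon_with = 1.0 - I2_WITH_CYCLIC

print(f"two-triplon weight lost: {100.0 * two_triplon_loss:.1f} %")
print(f"multi-triplon weight: {multi_triplon_without:.2f} -> {multi_triplon_with:.2f}")
```

Under this assumption the weight that has to be supplied by the three- and four-triplon channels grows from roughly a third to a half of the total, which is why the three-triplon part cannot be neglected when comparing with the leg-polarization data.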
5
From Weakly Coupled Chains to 2D Layers
In Fig. 13, we compare the magnetic contribution to σ(ω) of the undoped S=1/2 two-leg ladder (La,Ca)14Cu24O41 [19] discussed in the previous sections with the spectra of the 2D bilayer YBa2Cu3O6 [7] (bottom panel) and of CaCu2O3 (top panel) [45]. The latter compound was thought to represent a two-leg ladder with J∥ ≫ J⊥, but it rather has to be viewed as a 3D system of weakly coupled chains [45,46]. In order to facilitate the comparison, the spectra are shifted by the respective phonon frequency ωph, and the frequency is plotted on the scale of the exchange coupling J (where J reflects the coupling along the chains, along the legs and within the layers, respectively). The values of J were determined by comparison with theoretical results. The apparent trend in Fig. 13 is that the spectral weight is shifted to higher energies on going from 1D chains via ladders to 2D layers. This reflects the increase of the number of nearest-neighbor spins ν from 2 in the chain to 3 in the ladder to 4 in a 2D layer. At the same time, the spectral weight is smeared out over a broader frequency range. In the 2D cuprates, the high-energy continuum at about 4J and above remains puzzling [7,8]. The first question one has to address is which part of the weight reflects magnetic excitations and which part has to be subtracted as a background as in the ladders. On the scale of J used in Fig. 13, the very steep onset of charge-transfer excitations occurs only at about 13J in YBa2Cu3O6 [7], and it is rather unlikely that this causes a significant contribution in the range plotted in Fig. 13. At first one might expect
Optical Spectroscopy of Low-Dimensional Quantum Spin Systems
[Figure 11: two panels, J∥/J⊥ = 1.0 and J∥/J⊥ = 1.3, for E ∥ leg; σ(ω) (1/Ωcm) versus ω (J⊥), with curves labelled by Jcyc/J⊥.]
Fig. 11. Influence of Jcyc on the line shape of the two-triplon contribution to σleg(ω) as calculated by CUT [26]. For a direct comparison with the experimental result for σ(ω), one must bear in mind that the sizeable three-triplon contribution is missing (see Fig. 10)
[Figure 12: CUT results for E ∥ rung; two panels, J∥/J⊥ = 1.0 and J∥/J⊥ = 1.3, each with curves for Jcyc/J⊥ = 0.2, 0.1, and 0; σ(ω) (1/Ωcm) versus ω (J⊥).]
Fig. 12. Influence of Jcyc on the line shape of the two-triplon contribution to σrung(ω) as calculated by CUT [26]. Here, the leading correction is the four-triplon contribution, which is of minor importance
[Figure 13: three panels showing σ(ω) (1/Ωcm) versus (ω−ωph)/J. Top: CaCu2O3, E ∥ b, with DMRG. Middle: La5.2Ca8.8Cu24O41, E ∥ c, with DMRG. Bottom: YBa2Cu3O6, E ∥ a, with spin-wave theory.]
Fig. 13. Evolution of the optical conductivity from weakly coupled chains via two-leg ladders to 2D layers at T = 4 K. Top: σ(ω) of CaCu2O3 for E ∥ b (solid line), DMRG result (circles) for J∥/J⊥ = 5 and J = 1300 cm⁻¹ [45]. Middle: σ(ω) of La5.2Ca8.8Cu24O41 for E ∥ c (solid), DMRG data (symbols) for J∥/J⊥ = 1.3, Jcyc/J⊥ = 0.2 and J = 1000 cm⁻¹ (see Fig. 5) [24]. Bottom: σ(ω) of the 2D bilayer YBa2Cu3O6 for E ∥ a (solid). In a bilayer, the two-magnon contribution from spin-wave theory (dashed) contains an in-plane part (dotted) and an inter-plane part (dash-dotted). Here, the in-plane exchange is J = 780 cm⁻¹ and the inter-plane exchange amounts to J12/J = 0.1 [7]. The two-magnon peak corresponds to 2.88J for J12/J = 0.1, and to 2.73J for J12 = 0
that the unexplained spectral weight above the two-magnon part reflects the multi-magnon contribution. However, the high-energy weight is missing also in exact diagonalization results for the square-lattice Heisenberg model [47]. One may speculate whether this failure is due to the still rather small cluster sizes. Lorenzana et al. claimed that inclusion of a cyclic exchange term offers a remedy to this problem [47]. We have shown above that Jcyc enhances the high-energy weight also in the ladders (see Fig. 10), but only by about 20%. Since Jcyc /J is very similar in the ladders and in the 2D cuprates [47,48,49], it seems unlikely that Jcyc alone can explain the large discrepancy shown in the bottom panel of Fig. 13. Note that in the ladders the discrepancy between the full spectrum and the two-triplon contribution is very similar to the discrepancy observed in the 2D case (see also Fig. 6), which indicates
that multi-particle excitations are relevant. Interestingly, all three compounds show a contribution from an incoherent continuum at about 2νSJ, where ν denotes the number of nearest-neighbor spins. The position of the continuum thus may reflect the rather local nature of the incoherent excitations. However, in 2D a quantitative description of the total weight still poses a demanding challenge for future research.

Acknowledgements

The single crystals studied here were provided by A. Erb (YBa2Cu3O6), C. Sekar and G. Krabbes (CaCu2O3) and U. Ammerahl, M. Hücker, B. Büchner and A. Revcolevschi (telephone-number compounds). It is a great pleasure to acknowledge their very valuable contribution to this work. We thank D. van der Marel, G. A. Sawatzky, A. Freimuth, C. Knetter and P. Brune for many fruitful discussions. This project is supported by the DFG (SFB 608, SFB 484 and SP 1073) and by the BMBF (13N6918/1).
References

1. R. Newman, R.M. Chrenko, Phys. Rev. 114, 1507 (1959).
2. Y. Mizuno, S. Koide, Phys. Kond. Mat. 2, 166 (1964).
3. A. Tsuchida, J. Phys. Soc. Jpn. 21, 2497 (1966).
4. H. Yamaguchi, K. Katsumata, M. Hagiwara, M. Tokunaga, H.L. Liu, A. Zibold, D.B. Tanner, Y.J. Wang, Phys. Rev. B 59, 6021 (1999).
5. J. Lorenzana, G.A. Sawatzky, Phys. Rev. Lett. 74, 1867 (1995); Phys. Rev. B 52, 9576 (1995).
6. J.D. Perkins, J.M. Graybeal, M.A. Kastner, R.J. Birgeneau, J.P. Falck, M. Greven, Phys. Rev. Lett. 71, 1621 (1993).
7. M. Grüninger, D. van der Marel, A. Damascelli, A. Erb, Th. Wolf, T. Nunner, T. Kopp, Phys. Rev. B 62, 12422 (2000).
8. C.-M. Ho, V.N. Muthukumar, M. Ogata, P.W. Anderson, Phys. Rev. Lett. 86, 1626 (2001).
9. P.W. Anderson, Science 288, 480 (2000).
10. R.B. Laughlin, Phys. Rev. Lett. 79, 1726 (1997).
11. G. Aeppli, S.M. Hayden, P. Dai, H.A. Mook, R.D. Hunt, T.G. Perring, F. Dogan, phys. stat. sol. b 215, 519 (1999).
12. A.W. Sandvik, R.R.P. Singh, Phys. Rev. Lett. 86, 528 (2001).
13. J.D. Perkins, D.S. Kleinberg, M.A. Kastner, R.J. Birgeneau, Y. Endoh, K. Yamada, S. Hosoya, Phys. Rev. B 52, R9863 (1995).
14. H. Suzuura, H. Yasuhara, A. Furusaki, N. Nagaosa, Y. Tokura, Phys. Rev. Lett. 76, 2579 (1996).
15. J. Lorenzana, R. Eder, Phys. Rev. B 55, R3358 (1997).
16. K.P. Schmidt, G.S. Uhrig, cond-mat/0211627.
17. M. Grüninger, M. Windt, T. Nunner, C. Knetter, K.P. Schmidt, G.S. Uhrig, T. Kopp, A. Freimuth, U. Ammerahl, B. Büchner, A. Revcolevschi, J. Phys. Chem. Sol. 63, 2167 (2002).
18. H.S. Choi, E.J. Choi, Y.J. Kim, Physica C 304, 66 (1998).
19. M. Windt, M. Grüninger, T. Nunner, C. Knetter, K.P. Schmidt, G.S. Uhrig, T. Kopp, A. Freimuth, U. Ammerahl, B. Büchner, A. Revcolevschi, Phys. Rev. Lett. 87, 127002 (2001).
20. N. Nücker et al., Phys. Rev. B 62, 14384 (2000).
21. In [19] we subtracted an exponential background because T(ω) was restricted to a narrower frequency range. This is not sufficient to fit the more recent high-frequency data of Fig. 2 [17]. For E ∥ a, this has only a marginal effect on our estimate of the magnetic contribution to σ(ω). The only relevant difference is an improved description of the high-energy continuum for E ∥ c.
22. M. Uehara, T. Nagata, J. Akimitsu, H. Takahashi, N. Môri, K. Kinoshita, J. Phys. Soc. Jpn. 65, 2764 (1996).
23. U. Ammerahl, A. Revcolevschi, J. Crystal Growth 197, 825 (1999); U. Ammerahl, PhD thesis, Univ. of Cologne, 2000.
24. T.S. Nunner, P. Brune, T. Kopp, M. Windt, M. Grüninger, Phys. Rev. B 66, 180404(R) (2002).
25. K.P. Schmidt, C. Knetter, M. Grüninger, G.S. Uhrig, Phys. Rev. Lett. 90, 167201 (2003).
26. Neglecting two-spin exchange terms along the diagonals, the CUT notation $J_\perp^{\mathrm{CUT}}$, $J_\parallel^{\mathrm{CUT}}$, $J_{\mathrm{cyc}}^{\mathrm{CUT}}$ is given in terms of the other notation $J_\perp$, $J_\parallel$, $J_{\mathrm{cyc}}$ using cyclic permutations by [27] $J_\parallel^{\mathrm{CUT}} = J_\parallel + \tfrac{1}{4} J_{\mathrm{cyc}}$, $J_\perp^{\mathrm{CUT}} = J_\perp + \tfrac{1}{2} J_{\mathrm{cyc}}$, and $J_{\mathrm{cyc}}^{\mathrm{CUT}} = J_{\mathrm{cyc}}$.
27. S. Brehmer, H.-J. Mikeska, M. Müller, N. Nagaosa, S. Uchida, Phys. Rev. B 60, 329 (1999).
28. E. Müller-Hartmann, A. Reischl, Eur. Phys. J. B 28, 173 (2002), and references therein.
29. Experimentally, the frequency of the upper bound state ω2 is ≈ 60 cm⁻¹ higher in σrung(ω) than in σleg(ω), thus two different phonon frequencies are used.
30. G.S. Uhrig, H.J. Schulz, Phys. Rev. B 54, R9624 (1996); erratum ibid. 58, 2900 (1998).
31. K. Damle, S. Sachdev, Phys. Rev. B 57, 8307 (1998).
32. O.P. Sushkov, V.N. Kotov, Phys. Rev. Lett. 81, 1941 (1998); V.N. Kotov, O.P. Sushkov, R. Eder, Phys. Rev. B 59, 6266 (1999).
33. C. Jurecka, W. Brenig, Phys. Rev. B 61, 14307 (2000).
34. S. Trebst, H. Monien, C.J. Hamer, Z. Weihong, R.R.P. Singh, Phys. Rev. Lett. 85, 4373 (2000).
35. W. Zheng, C.J. Hamer, R.R.P. Singh, S. Trebst, H. Monien, Phys. Rev. B 63, 144410 (2001).
36. T.S. Nunner, P. Brune, T. Kopp, M. Windt, M. Grüninger, Acta Phys. Pol. B 34, 1545 (2003).
37. T.S. Nunner, T. Kopp, cond-mat/0210103.
38. C. Knetter, G.S. Uhrig, Eur. Phys. J. B 13, 209 (2000).
39. C. Knetter, K.P. Schmidt, M. Grüninger, G.S. Uhrig, Phys. Rev. Lett. 87, 167204 (2001).
40. K.P. Schmidt, C. Knetter, G.S. Uhrig, Acta Phys. Pol. B 34, 1481 (2003) (cond-mat/0208358).
41. K.P. Schmidt, H. Monien, G.S. Uhrig, to be publ. in Phys. Rev. B (cond-mat/0211429).
42. M. Müller, T. Vekua, H.-J. Mikeska, Phys. Rev. B 66, 134423 (2002).
43. K. Hijii, K. Nomura, Phys. Rev. B 65, 104413 (2002).
44. A. Läuchli, G. Schmid, M. Troyer, Phys. Rev. B 67, 100409 (2003).
45. E. Benckiser, M. Grüninger, T. Nunner, T. Kopp, C. Sekar, G. Krabbes, to be published.
46. T.K. Kim, H. Rosner, S.-L. Drechsler, Z. Hu, C. Sekar, G. Krabbes, J. Malek, M. Knupfer, J. Fink, H. Eschrig, Phys. Rev. B 67, 024516 (2003).
47. J. Lorenzana, J. Eroles, S. Sorella, Phys. Rev. Lett. 83, 5122 (1999).
48. R. Coldea, S.M. Hayden, G. Aeppli, T.G. Perring, C.D. Frost, T.E. Mason, S.-W. Cheong, Z. Fisk, Phys. Rev. Lett. 86, 5377 (2001).
49. A.A. Katanin, A.P. Kampf, Phys. Rev. B 66, 100403 (2002).
Fractional Aharonov-Bohm Oscillations in a Kondo Correlated Few-Electron Quantum Ring
Ulrich F. Keyser1, Claus Fühner1, Sebastian Borck1, Rolf J. Haug1, Werner Wegscheider2, and Max Bichler3
1 Institut für Festkörperphysik, Universität Hannover, Appelstrasse 2, 30167 Hannover, Germany
2 Angewandte und Experimentelle Physik, Universität Regensburg, 93040 Regensburg, Germany
3 Walter Schottky Institut, TU München, 85748 Garching, Germany
Abstract. We investigate a small quantum ring fabricated by local oxidation with an atomic force microscope. For strong tunnel coupling to the leads we observe a Kondo-effect which shows the temperature-, gate voltage-, and magnetic field dependence predicted for a spin-1/2 Anderson impurity. Using transport spectroscopy in a perpendicular magnetic field we prove that we can tune the electron number on our small quantum ring between one and seven electrons. The well-known Aharonov-Bohm effect leads to transitions between ground states of a ring structure separated by the flux quantum h/e. However, for our low electron numbers electron-electron interaction dominates the level spectrum. This leads to more frequent level crossings compared to devices with high electron densities. When our device contains only five electrons, we observe a level transition when about a fifth of a flux quantum is added to the ring area. These frequent level transitions lead also to a modulation of the Kondo conductance in our small quantum ring.
1
Introduction
Transport spectroscopy provides an extremely successful means to understand the physics of interacting electrons confined to a quasi-zero dimensional potential well [1]. Until recently the only accessible shape of these so-called quantum dots was a tiny disc or box with a simple topology. It is now possible to create more complex, multiply connected topologies, namely small quantum rings with an outer and an inner boundary using new fabrication techniques. These novel devices combine the classical Coulomb blockade observed in quantum dots with quantum mechanical phenomena like the Aharonov-Bohm effect [2] and Kondo physics [3]. Small quantum rings were initially fabricated by the self-assembled growth of InAs on GaAs [4,5], but these structures were mainly used for optical experiments. The direct manipulation of GaAs/AlGaAs-heterostructures with
Present address: Department of Applied Physics, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 113–125, 2003. © Springer-Verlag Berlin Heidelberg 2003
Ulrich F. Keyser et al.
an atomic force microscope (AFM) [6] offers an alternative approach that allows us to directly write complex geometries into a two-dimensional electron system. The fabrication of quantum dots [7,8], quantum point contacts [9,10], and, very recently, quantum rings [11,12] has been demonstrated. These novel ring devices can be studied by transport spectroscopy over a variety of different transport regimes. Recently, Fuhrer et al. studied a quantum ring containing a few hundred electrons in the Coulomb-blockade regime [11]. Their measurements showed an Aharonov-Bohm effect and allowed them to deduce the energy spectra [13] of their device. Their system was well described within a single-particle picture [11] because of an effective screening of the electron-electron interaction by a metallic top gate. Here we discuss a small quantum ring containing fewer than ten electrons in the Kondo regime. We show that, due to the lack of a screening top gate, the ground state of our ring is dominated by strong electron-electron interaction effects that lead to fractional Aharonov-Bohm oscillations.
2
Sample Preparation
We fabricated our quantum ring from a GaAs/AlGaAs-heterostructure containing a two-dimensional electron gas (2DEG) 34 nm below the surface. The layer structure with its Si δ-doping is shown in Fig. 1 and consists of (from top to bottom): a 5 nm thick GaAs cap layer, an 8 nm thick layer of AlGaAs, the Si-δ-layer, a 20 nm wide AlGaAs barrier, and 100 nm of GaAs. At low temperatures the 2DEG has an electron density of ne ∼ 4 × 10¹⁵ m⁻² and a mobility µe ∼ 50 m² V⁻¹ s⁻¹. The mean free path of the electrons is ∼ 5 µm. Hall bars are fabricated by optical lithography in combination with wet-chemical etching and the 2DEG is contacted by annealed Au/Ge contacts. The devices are bonded with gold wires and mounted into an AFM (Nanoscope IIIa, Digital Instruments) for local oxidation with a conducting tip (highly n-doped Si, Nanosensors). The tip is negatively biased with respect to the sample (Fig. 1). With the naturally adsorbed water film on the GaAs surface, an electrochemical nanocell is formed in which the tip serves as the cathode and the GaAs as the
Fig. 1. Sketch of the setup used for the local oxidation of the heterostructures. The right-hand side of the figure shows the layer structure and the influence of the oxidation
Aharonov-Bohm Oscillations of a Few-Electron Quantum Ring
anode. The oxide forms by reaction of OH⁻ ions from the water layer with Ga and As, respectively [6]. Underneath the oxide the 2DEG is depleted as in a shallow etch process [14]. By adjusting the oxidation currents we are able to tune the properties of the depleted regions in the 2DEG [15]. We apply high oxidation currents (I ≈ 1.0 µA) to obtain insulating lines with a breakdown voltage of ±300 mV. Each line is written at least twice to avoid defects and leakage sites. The completed structure is shown in Fig. 2(a). The oxide lines appear as a rough surface (linewidth < 120 nm). We define self-aligned in-plane gates (IPG) with the oxide lines. These are labelled A and B, and their corresponding applied voltages are VA and VB, respectively. The inner diameter of the quantum ring is Di = 190 nm and the outer diameter is Do = 450 nm. Thus the outer circumference of our ring is less than one third of the mean free path in the 2DEG. The ring is connected to the source and drain contacts by two 150 nm wide point contacts. The conductance of both point contacts is tuned by gate A, whereas gate B couples mainly to the source contact. In the following experiments, gate A is kept at a constant voltage VA, and VB is used to control the electrochemical potential and thus the number of electrons on the ring. All measurements were performed in a dilution refrigerator at a base temperature of Tb = 30 mK unless stated otherwise. The current through the sample was detected using a standard lock-in technique and an excitation voltage of 5 µV at a frequency of 89 Hz. A simple sketch of the measurement setup is shown in Fig. 2(b). At VA = VB = 0 mV, the point contacts are conducting and the resistance of the device is found to be between h/e² and h/2e², indicating that the device is in the ballistic transport regime. We observe clear Aharonov-Bohm oscillations for electrons that are transmitted through the sample in a perpendicular magnetic field B. A typical example is shown in Fig. 2(c).
The observed periodicity of ∆B = 60 mT corresponds to a diameter of 300 nm for the electronic orbit, which lies between the inner and outer diameters of the ring and thus agrees well with the geometric values [12].
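The two length scales quoted in this section can be cross-checked with a few lines of Python. This is only a sketch under simple assumptions that the text does not state: a circular orbit enclosing one flux quantum h/e per period, and the standard spin-degenerate 2DEG relations k_F = (2π n_e)^{1/2} and l = ħ k_F µ_e / e. The function names are illustrative.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
E = 1.602176634e-19       # elementary charge (C)
HBAR = H / (2 * math.pi)  # reduced Planck constant (J s)


def ab_orbit_diameter(delta_b):
    """Diameter (m) of a circular orbit whose area takes up one flux
    quantum h/e for every field step delta_b (T)."""
    area = H / (E * delta_b)
    return 2.0 * math.sqrt(area / math.pi)


def mean_free_path(density, mobility):
    """Elastic mean free path (m) of a spin-degenerate 2DEG,
    l = hbar * k_F * mu / e, with k_F = sqrt(2 * pi * n)."""
    k_f = math.sqrt(2.0 * math.pi * density)
    return HBAR * k_f * mobility / E


d_orbit = ab_orbit_diameter(60e-3)   # Delta B = 60 mT
l_mfp = mean_free_path(4e15, 50.0)   # n_e in m^-2, mu_e in m^2/(V s)
outer_circumference = math.pi * 450e-9

print(f"orbit diameter      : {d_orbit * 1e9:.0f} nm")
print(f"mean free path      : {l_mfp * 1e6:.1f} um")
print(f"outer circumference : {outer_circumference * 1e6:.2f} um")
```

With the values given in the text this yields an orbit diameter of about 296 nm and a mean free path of about 5.2 µm, so the outer circumference of roughly 1.4 µm is indeed below one third of the mean free path.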
Fig. 2. (a) AFM-image of the completed ring structure with labels for source (S) and drain (D) contacts and in-plane gates (A, B). (b) Sketch of the measurement setup. (c) Conductance G as a function of magnetic field B in the ballistic regime shows Aharonov-Bohm oscillations
3
Kondo Physics in Quantum Rings
Theoretical studies predict a Kondo effect for a quantum dot with an unpaired electron in the topmost occupied state. The tunneling barriers of the device should therefore become perfectly transparent with a conductance of 2e²/h [16,17]. This effect is referred to as the 'Kondo' effect because of the similarities between a single quantum dot and the scattering of electrons in metals induced by dilute magnetic impurities. The resistance of a metal is enhanced by an interaction between the electrons and the spins localized at the magnetic impurities at low temperatures. A quantum dot with a net spin of 1/2 acts as an 'impurity' in the 2DEG and the 'scattering' of electrons from the contacts leads to an enhancement of the conductance. In Fig. 3(a) a quantum dot, or in our case ring, in the Coulomb-blockade regime with N electrons is shown. It is coupled to electron reservoirs by two tunneling barriers. The occupied state with the highest energy of the ring contains one single electron. It is necessary that the level be two-fold degenerate, in our case due to the electron spin. In principle, we have to pay the charging energy U to add an electron, but if the coupling of the electron on the ring to the leads is strong enough, it leads to the formation of many spin singlet states. Thus, the spin of the ring is screened by electron spins from the reservoirs. One singlet state is indicated by the dashed ellipse in Fig. 3(a). In this configuration, electrons from the leads and the ring can change places, resulting in an exchange of their spins. This gives rise to a finite conductance despite the charging energy and the constant N. The resulting effective density of states for the ring is shown in Fig. 3(b). The broad peaks denote the positions of the single-particle levels in the electron addition spectrum.
The width of the peaks is determined by the tunnel coupling to the leads given by the barrier transparencies Γ = (ΓS + ΓD ), where Γ measures the inverse lifetime of the electron on the dot.
Fig. 3. (a) Quantum ring or dot with a singly occupied spin degenerate level. The electron on the dot couples to electrons with opposite spin in the source and drain contacts. This leads to the formation of spin singlets, one is indicated by the dashed ellipse. (b) The resulting density of states has a maximum pinned to the electrochemical potentials in the contacts
The remarkable fact is the appearance of a very narrow peak at the electrochemical potential of the contacts [18]. This finite density of states allows the transport of electrons through the structure. A new temperature scale appears in the model denoted Kondo temperature TK. The quantity kB TK is a measure of the binding energy of the spin singlet state. Often the Kondo temperature is defined by the following equation [19]
\[
k_B T_K = \frac{1}{2}\sqrt{U\Gamma}\,\exp\!\left(\pi\,\frac{\varepsilon\,(U+\varepsilon)}{\Gamma U}\right) \tag{1}
\]
for ε < 0 and U + ε > 0, where ε is the energetic distance of the occupied level below the electrochemical potential in the contacts. The exponential dependence of TK on the inverse of the tunnel coupling highlights the necessity to tune the system into a regime with a high Γ for this effect to be observed. As already mentioned we can use the in-plane gates to induce tunneling barriers at the point contacts in our quantum ring. These separate the ring from the source and drain contact as indicated by the dashed lines in Fig. 2(b). In Fig. 4(a) a logarithmic grey-scale plot of the linear conductance G is shown as a function of VA and VB. For this measurement VA was stepped from −250 mV to −70 mV with ∆VA = 5 mV and VB varied between −280 mV and −120 mV. When the electrochemical potential on the ring matches the electrochemical potentials in the leads the number of electrons N on the ring can increase by one. The number of electrons fluctuates between N and N + 1 at these charge degeneracy points. This leads to a finite conductance and thus to a peak in G indicated by the black lines in the plot. Please note that the black lines have a finite slope because of the finite capacitance of both gates to the quantum ring. In this regime the ring exhibits dot-like characteristics as observed for single-electron transistors [1]. Beyond the outermost left Coulomb-blockade peak our ring does not show
Fig. 4. (a) Logarithmic grey-scale plot of the linear conductance G as a function of VA and VB. White corresponds to G = 10⁻⁴ e²/h and black to G = 1.6 e²/h. (b) Coulomb-blockade valleys (in (a) marked with asterisks) at VA = −200, −150, −80 mV, from bottom to top
any finite conductance anymore and we assume that it is totally depleted of electrons. With increasing VB, N increases one by one until N = 5. This valley is marked by the white asterisks in Fig. 4(a). Apart from the electrochemical potential, we control the tunnel coupling and thus Γ with VA, and thereby tune our quantum ring into the Kondo regime. The data in Fig. 4(b) depict the two Coulomb-blockade peaks adjacent to the N = 5 valley at VA = −200, −150, −80 mV from bottom to top. The positions of the Coulomb-blockade valleys shown are marked in Fig. 4(a) by the white asterisks. Besides the rise in the peak conductance by a factor of five to 0.8 e²/h, the most striking result is the increase in the valley conductance between the peaks, whereas it barely changes at the left and right hand side of the peaks. This finite valley conductance is expected for a Kondo effect. To analyze this Kondo effect in more detail, non-linear transport measurements are shown in Fig. 5. Figure 5(a) depicts the differential conductance, dISD/dVSD, measured at a temperature of Tb = 30 mK as a function of VB and VSD at strong coupling, VA = −80 mV. The central Coulomb-blockade diamond corresponds to the valley with N = 5 electrons discussed above. A sharp zero-bias peak appears, whereas in the valleys to the left and right only low conductance is observed. If electrons are added according to Hund's rule [20], a Kondo effect and thus a peak is expected in consecutive Coulomb-blockade valleys [21]. In contrast, the appearance of the Kondo effect alternates with electron number on the ring (odd-even Kondo effect [22]) below a gate voltage of −200 mV. For VB > −200 mV we observe a more complicated pattern presumably caused by the
Fig. 5. (a) Grey-scale map of the non-linear conductance dI/dVSD as a function of VSD and VB at VA = −80 mV. (b) Temperature dependence of the zero-bias peak marked in (a) by the vertical line. (c) Full width at half maximum (FWHM) extracted from (b). (d) Temperature dependence of the peak conductance at five different gate voltages
opening of the source tunneling barrier and an increased asymmetry of the device. Fig. 5(b) depicts temperature-dependent measurements taken at the gate voltage indicated in Fig. 5(a). The zero-bias peak observed at Tb = 30 mK vanishes almost completely when the temperature is increased to T ∼ 500 mK as expected for a Kondo resonance [18]. We estimate a Kondo temperature TK from these measurements by extrapolation of the full peak width at half maximum ∆VSD to T = 0 K (Fig. 5(c)), which results in TK ≈ e∆VSD(T = 0)/2kB ∼ 600 mK [18,23]. Fitting the Kondo conductance at zero bias G(T) with an empirical formula from Ref. [24,25]
\[
G(T) = \frac{G_0}{\left[\,1 + \left(2^{1/s} - 1\right)\left(T/T_K\right)^2\right]^{s}} \tag{2}
\]
yields TK ∼ 600 mK and s ∼ 0.21. This is in agreement with earlier studies on a spin-1/2 system [25]. The scaled peak conductance (full circles) is shown together with the fit in Fig. 5(d). In order to prove the scaling behavior, the results obtained at four different gate voltages are shown as open symbols in Fig. 5(d). The respective Kondo temperatures are TK ∼ 0.9, 1.0, 1.1, 1.4 K. The zero-bias peak of a spin-1/2 induced Kondo effect is expected to split in a magnetic field according to the Zeeman effect. In the analysis of data (see Ref. [26]) taken up to 4 T we determine the Landé factor of 0.44 for GaAs as observed by other groups [27,21]. This is further evidence of a spin-1/2 Kondo effect. We already assumed that we can tune the number of electrons on our device between N = 0 and N ≥ 5. We will now use the electron addition spectrum of our small quantum ring to prove that it contains just these numbers of electrons. The spectrum is shown in Fig. 6 for magnetic fields up to B = 6 T. The linear conductance G (VSD = 0) is plotted in grey scale as a function of VB and B at VA = −80 mV. Each Coulomb-blockade peak appears as a black line more or less parallel to the B-axis. The signatures of non-vanishing conductance between Coulomb-blockade peaks are due to the Kondo effect described above. The valley marked by the arrow is the same as investigated earlier in this work with N = 5. We observe an alternating pattern of high and low conductance as a function of B. The Kondo effect is modulated abruptly between high (grey) and low (white) conductance regions for B < 2 T. Interestingly, our measurements (Fig. 6) show some similarities with results obtained for quantum dots designed as discs [21,28]. This is presumably due to the similar importance of the outer edge for tunneling through quantum dots as well as quantum rings in high magnetic fields.
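Returning briefly to the empirical form of Eq. (2) and to the width-based estimate TK ≈ e∆VSD/2kB, the following Python sketch evaluates both. It is an illustration only: the function names are made up for this example, and the numbers fed in are the fit values quoted in the text (TK ∼ 600 mK, s ∼ 0.21), not a re-analysis of the data.

```python
K_B_OVER_E = 8.617333262e-5  # Boltzmann constant over elementary charge (V/K)


def kondo_conductance(t, t_k, s, g0=1.0):
    """Empirical form of Eq. (2):
    G(T) = G0 / (1 + (2**(1/s) - 1) * (T / TK)**2)**s."""
    return g0 / (1.0 + (2.0 ** (1.0 / s) - 1.0) * (t / t_k) ** 2) ** s


def zero_temperature_width(t_k):
    """Full width at half maximum (V) implied by TK = e * dV / (2 * kB)."""
    return 2.0 * K_B_OVER_E * t_k


t_k, s = 0.6, 0.21  # fit values quoted in the text (K, dimensionless)
for t in (0.03, 0.1, 0.3, 0.6):
    print(f"T = {t:4.2f} K  ->  G/G0 = {kondo_conductance(t, t_k, s):.3f}")
print(f"width at T = 0: {zero_temperature_width(t_k) * 1e6:.0f} uV")
```

By construction the formula gives G(TK) = G0/2 for any s, which is what makes TK a convenient fit parameter; the exponent s only controls how quickly the conductance falls off around TK. A Kondo temperature of 600 mK corresponds to a zero-temperature peak width of about 0.1 mV.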
Additionally, calculations indicate that a quantum ring has the same shell structure as a quantum dot for up to six electrons [29]. The alternating pattern in the valley conductance with increasing magnetic field can be explained by a redistribution of electrons between different Landau levels (LL) [30]. For example, inside the valley marked in Fig. 6 we observe an increased conductance indicating Kondo correlations. At B ∼ 1.5 T
Fig. 6. Grey-scale plot of the linear conductance of the ring as a function of VB and the perpendicular magnetic field B at VA = −80 mV
an electron from the upper LL is transferred to the lower LL which is indicated by the sharp boundary in the spectrum. The Kondo effect is suppressed here because the transport level in the lowest and outermost LL n = 0 contains two electrons with opposite spins (N odd, spin 0 in n = 0). At B ∼ 2.0 T a second electron is transferred to LL n = 0, and an unpaired spin is available again in LL n = 0. The Kondo effect is thus restored. Similar drastic changes are not observed anymore for higher magnetic fields. We conclude that no further electrons are redistributed from LL n = 1 to n = 0 and therefore assume that all electrons are in the lowest Landau level (filling factor ν = 2). A further increase of the magnetic field up to B ∼ 5.5 T reveals some weaker features highlighted in Fig. 6 by dashed lines. These small variations in the amplitude and position of the Coulomb-blockade peaks are identified with spin flips of the electrons in the lowest LL. Their spin is flipped from |↑⟩ to |↓⟩ by increasing the magnetic field [31]. For B > 5.5 T, the electrons in the lowest LL are totally spin-polarized corresponding to a filling factor ν = 1. The number of electrons N on the ring is determined by counting the spin flips between ν = 2 (B = 2 T) and ν = 1 (B ∼ 5.5 T). For the Kondo valley marked in Fig. 6 we observe only two spin flips. The occurrence of the Kondo effect at B = 0 indicates that N is odd, thus we conclude that there are indeed five electrons on the quantum ring, as expected for this valley.
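The counting argument above can be written down in a few lines. The sketch below rests on a simplified picture that is an assumption of this example rather than a statement from the text: at ν = 2 the two spin species in the lowest Landau level are populated as evenly as possible, so reaching the fully polarized ν = 1 state requires flipping every minority spin. Function names are illustrative.

```python
def spin_flips_to_polarize(n_electrons):
    """Spin flips needed to go from the most balanced spin filling
    (nu = 2) to full polarization (nu = 1): one per minority spin."""
    return n_electrons // 2


def electron_number(n_flips, is_odd):
    """Invert the count: n_flips minority spins, the same number of
    majority partners, plus one unpaired electron if N is odd."""
    return 2 * n_flips + (1 if is_odd else 0)


# Two observed spin flips are compatible with N = 4 or N = 5 ...
candidates = [n for n in range(1, 10) if spin_flips_to_polarize(n) == 2]
print(candidates)
# ... and the Kondo effect at B = 0 selects the odd candidate.
print(electron_number(2, is_odd=True))
```

In this picture the spin-flip count alone fixes N only up to parity, which is why the text combines it with the observation of a Kondo effect at B = 0.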
4
Aharonov-Bohm Oscillations with Electron-Electron Interaction
After the characterization of the Kondo effect we focus on lower magnetic fields B where only a few flux quanta pierce our quantum ring. A strictly one-dimensional, single mode quantum ring which encloses a magnetic flux of m flux quanta φ0 = h/e shows the energy spectrum El,m [32]
\[
E_{l,m} = \frac{2\hbar^2}{m^* d_e^2}\,(m + l)^2 . \tag{3}
\]
Here m∗ is the effective mass of the electron, l = 0, ±1, ±2, . . . the angular momentum quantum number, and de the diameter of the ring. An example is depicted in Fig. 7(a). For every value of l, the states lie on parabolas with their minima at a magnetic flux φ = mφ0. An increase in m by one leads to a change of the ground state of the ring from l = 0 to l = −1. Therefore, the energy of the ground state oscillates with a period of h/e in a perpendicular magnetic field even in a single-particle picture. As a consequence, we expect the Coulomb-blockade resonances of our quantum ring to follow the thick line in Fig. 7(a). This was observed in Ref. [11]. We find a completely different behavior in our quantum ring. A grey-scale plot of G normalized with G0 = G(B = 0) is shown at VA = −150 mV in Fig. 7(b). The position of each Coulomb-blockade peak is marked with a white dot. Each kink in the peak position indicates a change of the ground state in our ring. According to a theoretical calculation of Niemelä et al., electron-electron interaction lifts the degeneracy of the singlet and triplet states in a ring [33]. This effect leads to more frequent level crossings of the ground state compared with a ring in the absence of interaction because the triplet states are present at lower energies. These calculations predict a transition (Aharonov-Bohm)
Fig. 7. (a) Energy levels without interaction as a function of B. The grey line marks the movement of a Coulomb-blockade peak for ∆B ∼ 60 mT. (b) G/G0 (B, VB ) as grey scale plot at VA = −150 mV with peaks marked by white dots
period ∆B that shortens with increasing number of electrons on the ring, e.g. for four electrons it should be four times smaller. In fair agreement, we observe a period of ∆B ∼ 13 mT, which is four to five times shorter than in the open regime. For our ring with five electrons ∆B should be shortened by a factor of five. Our results are explained by recent Quantum Monte Carlo calculations that model a system with parameters similar to our sample [34]. Between the two Coulomb-blockade peaks shown in Fig. 7(b) we observe a modulation of the Kondo conductance. This reflects a change in the ground states of the system with increasing B. If the ground state is a triplet state the Kondo temperature is lower and thus the conductance is reduced. After the next transition the ground state is again a spin singlet and the Kondo effect is recovered. By using the Kondo effect we can map out the behavior of the quantum ring states even in the situation where the ring is in the Coulomb-blockade regime. We apparently observe smooth linear shifts of the Aharonov-Bohm maxima in Fig. 7(b), indicated by the slightly tilted vertical lines in the grey-scale plot. These shifts only appear in our two-terminal measurement at finite magnetic field when the ring is threaded by at least half a flux quantum. Linear shifts of normal Aharonov-Bohm oscillations were recently reported by Ji et al. for a quantum dot in the unitary Kondo regime embedded in an Aharonov-Bohm interferometer [35]. Their four-terminal measurement is interpreted in terms of smooth phase shifts by almost 2π across a Kondo resonance. In contrast, we investigate a Kondo resonance far from the unitary limit in a quantum dot which itself serves as the interferometer. The exact mechanism for the observed linear shift of the Aharonov-Bohm maxima has yet to be clarified, but it might be connected to the fact that the level structure of our small ring interferometer is influenced by the gate voltage. In Fig.
8(b) we show Aharonov-Bohm oscillations in the normalized conductance G/G0 as a function of B for the gate voltages marked by the symbols in Fig. 8(a). The vertical dashed lines denote the expected period of ∆B ∼ 60 mT of the Aharonov-Bohm oscillations extracted from the measurements in the open regime. It is immediately evident that we again obtain the much shorter period as discussed above. A comparable oscillation period is also obtained in the center of the Kondo valley at higher tunnel coupling VA = −80 mV and VB = −231 mV (marked by the open circle in (a)) as depicted in the lower part of Fig. 8(b). We also find a phase jump by π in our data at each crossing of a Coulombblockade peak due to the two-terminal nature of our measurement. This is verified in the upper part of Fig. 8(b) for one of the resonances: the conductance at B = 0 changes from a maximum to a minimum (• to ) in crossing the peak.
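The period argument above can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes the simple 1/N scaling of the period with electron number stated in the text, and the function names are hypothetical helpers introduced for illustration.

```python
# Fractional Aharonov-Bohm period: the text expects the period to shrink from
# the open-regime value by a factor equal to the electron number N.
# The 1/N scaling is the assumption being illustrated, not a derivation.

OPEN_PERIOD_MT = 60.0      # period in the open regime quoted in the text (mT)
OBSERVED_PERIOD_MT = 13.0  # period observed in the Kondo regime (mT)


def expected_period_mT(open_period_mT: float, n_electrons: int) -> float:
    """Period expected for n_electrons if it scales as 1/N."""
    if n_electrons < 1:
        raise ValueError("n_electrons must be at least 1")
    return open_period_mT / n_electrons


def shortening_factor(open_period_mT: float, observed_period_mT: float) -> float:
    """How many times shorter the observed period is than the open-regime one."""
    return open_period_mT / observed_period_mT


if __name__ == "__main__":
    for n in (4, 5):
        print(n, "electrons:", expected_period_mT(OPEN_PERIOD_MT, n), "mT")
    print("observed shortening:", shortening_factor(OPEN_PERIOD_MT, OBSERVED_PERIOD_MT))
```

With the quoted numbers, the observed period lies between the values expected for four and five electrons, which is what "four to five times shorter" expresses.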
Fig. 8. (a) N = 5-Kondo valley at VA = −150 mV (solid) and VA = −80 mV (dashed line) at B = 0. The symbols (among them • and ◦) mark the gate voltages for the Aharonov-Bohm measurements in (b). (b) Upper part: Normalized conductance G/G0 as a function of B at VA = −150 mV. The curves are offset for clarity. Lower part: G/G0 at VA = −80 mV after subtraction of an increasing background
5
Conclusion
In conclusion, a small tunable quantum ring with fewer than ten electrons is shown to exhibit Aharonov-Bohm oscillations as well as Coulomb blockade. We tuned the ring into the Kondo regime and showed that the Kondo effect is induced by a single spin on the ring. From the transport measurements in the Kondo regime we conclude that the energy spectrum is strongly influenced by electron-electron interaction. An analysis of the phase evolution of the Aharonov-Bohm effect in the Kondo regime yields phase jumps by π at the Coulomb-blockade resonances and a smooth shift of the Aharonov-Bohm maxima in between.

Acknowledgements

We acknowledge discussions with S. Ulloa, J. König, U. Zeitler, and F. Hohls. This work was supported by BMBF, DIP, and TMR.
References
1. L. P. Kouwenhoven, in Mesoscopic Electron Transport, NATO ASI Series, edited by L. L. Sohn, L. P. Kouwenhoven, and G. Schön (Kluwer, Dordrecht, 1997), Vol. 345.
2. Y. Aharonov and D. Bohm, Phys. Rev. 115, 485 (1959).
3. J. Kondo, Progress of Theoretical Physics 32, 37 (1964).
4. A. Lorke, R. J. Luyken, A. O. Govorov, J. P. Kotthaus, J. M. Garcia, and P. M. Petroff, Phys. Rev. Lett. 84, 2223 (2000).
5. R. J. Warburton, C. Schaflein, D. Haft, F. Bickel, A. Lorke, K. Karrai, J. M. Garcia, W. Schoenfeld, and P. M. Petroff, Nature 405, 926 (2000).
6. M. Ishii and K. Matsumoto, Jpn. J. Appl. Phys. 34, 1329 (1995).
7. H. W. Schumacher, U. F. Keyser, U. Zeitler, R. J. Haug, and K. Eberl, Appl. Phys. Lett. 75, 1107 (1999).
8. R. Held, S. Lüscher, T. Heinzel, K. Ensslin, and W. Wegscheider, Appl. Phys. Lett. 75, 1134 (1999).
9. R. Held, T. Vancura, T. Heinzel, K. Ensslin, M. Holland, and W. Wegscheider, Appl. Phys. Lett. 73, 262 (1998).
10. J. Regul, U. F. Keyser, M. Paesler, F. Hohls, U. Zeitler, R. J. Haug, A. Malavé, E. Oesterschulze, D. Reuter, and A. D. Wieck, Appl. Phys. Lett. 81, 2023 (2002).
11. A. Fuhrer, S. Lüscher, T. Ihn, T. Heinzel, K. Ensslin, W. Wegscheider, and M. Bichler, Nature 413, 385 (2001).
12. U. F. Keyser, S. Borck, R. J. Haug, W. Wegscheider, M. Bichler, and G. Abstreiter, Semicon. Sci. Technol. 17, L22 (2002).
13. W.-C. Tan and J. C. Inkson, Semicon. Sci. Technol. 11, 1635 (1996).
14. H. van Houten, B. J. van Wees, M. G. J. Heijman, and J. P. Andre, Appl. Phys. Lett. 49, 1781 (1986).
15. U. F. Keyser, H. W. Schumacher, U. Zeitler, R. J. Haug, and K. Eberl, Appl. Phys. Lett. 76, 457 (2000).
16. L. I. Glazman and M. E. Raikh, JETP Lett. 47, 452 (1988).
17. T. K. Ng and P. A. Lee, Phys. Rev. Lett. 61, 1768 (1988).
18. Y. Meir, N. S. Wingreen, and P. A. Lee, Phys. Rev. Lett. 70, 2601 (1993).
19. F. D. M. Haldane, Phys. Rev. Lett. 40, 416 (1978).
20. S. Tarucha, D. G. Austing, T. Honda, R. J. van der Hage, and L. P. Kouwenhoven, Phys. Rev. Lett. 77, 3613 (1996).
21. J. Schmid, J. Weis, K. Eberl, and K. v. Klitzing, Phys. Rev. Lett. 84, 5824 (2000).
22. D. Goldhaber-Gordon, H. Shtrikman, D. Mahalu, D. Abusch-Magder, U. Meirav, and M. A. Kastner, Nature 391, 156 (1998).
23. J. Nygard, D. H. Cobden, and P. E. Lindelof, Nature 408, 342 (2000).
24. T. A. Costi, A. C. Hewson, and V. Zlatic, J. Phys.: Condens. Matter 6, 2519 (1994).
25. D. Goldhaber-Gordon, J. Göres, M. A. Kastner, H. Shtrikman, D. Mahalu, and U. Meirav, Phys. Rev. Lett. 81, 5225 (1998).
26. U. F. Keyser, C. Fühner, S. Borck, R. J. Haug, M. Bichler, G. Abstreiter, and W. Wegscheider, cond-mat/0206262.
27. S. M. Cronenwett, T. H. Oosterkamp, and L. P. Kouwenhoven, Science 281, 540 (1998).
28. D. Sprinzak, Y. Ji, M. Heiblum, D. Mahalu, and H. Shtrikman, Phys. Rev. Lett. 88, 176805 (2002).
29. A. Emperador, M. Pi, M. Barranco, and E. Lipparini, Phys. Rev. B 64, 155304 (2001).
30. P. L. McEuen, E. B. Foxman, J. Kinaret, U. Meirav, and M. A. Kastner, Phys. Rev. B 45, 11419 (1992).
31. M. Ciorga, A. S. Sachrajda, P. Hawrylak, C. Gould, P. Zawadzki, S. Jullian, Y. Feng, and Z. Wasilewski, Phys. Rev. B 61, 16315 (2000).
32. M. Büttiker, Y. Imry, and R. Landauer, Phys. Lett. A 96, 365 (1983).
33. K. Niemelä, P. Pietiläinen, P. Hyvönen, and T. Chakraborty, Europhysics Letters 36, 533 (1996).
34. A. Emperador, E. Lipparini, and F. Pederiva, submitted (2003).
35. Y. Ji, M. Heiblum, and H. Shtrikman, Phys. Rev. Lett. 88, 076601 (2002).
Self-Organized InGaAs Quantum Rings – Fabrication and Spectroscopy
Axel Lorke¹, Jorge M. Garcia², Ralf Blossey³, Richard J. Luyken⁴, and Pierre M. Petroff⁵
¹ Experimentalphysik, Universität Duisburg-Essen, Lotharstr. 1 ME245, 47048 Duisburg, Germany
² Instituto de Microelectrónica, Parque Tecnologico de Madrid, 28760 Tres Cantos, Madrid, Spain
³ Interdisciplinary Research Institute, IEMN, Cité Scientifique, Avenue Poincaré, BP 69, 59652 Villeneuve d’Ascq Cedex, France
⁴ Infineon Technologies, Corporate Research ST, Otto Hahn Ring 6, 81739 München, Germany
⁵ Materials Department, University of California, Santa Barbara, Ca. 93106, USA
Abstract. We report on recent experiments on semiconductor quantum rings, fabricated by self-organized Stranski-Krastanov growth. In the first part of this review we focus on the growth mechanisms that lead to formation of annular islands, with emphasis on the influence of diffusion and surface tension. In the second part, we summarize the results of intraband spectroscopy, which show the influence that the ring morphology has on the electronic properties, in particular, when a magnetic flux quantum threads the interior of the islands.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 125–137, 2003. © Springer-Verlag Berlin Heidelberg 2003

When Goldstein et al. [1] reported in 1985 on the observation of island formation in the Stranski-Krastanov growth of InAs on GaAs, their work was more concerned with the optimization of smooth (InGa)As layers for optically active quantum wells than with the fabrication of nanostructures. They did, however, already observe all ingredients of what was to become a great breakthrough for semiconductor nanoscience: bright luminescence out of islands of just the right size for single-electron and quantum effects. Soon the benefits of these (InGa)As islands as self-ordering semiconductor nanostructures were realized and the optical, far-infrared and transport properties were studied in great detail. It was shown that the ease of fabrication, the homogeneity and the crystal quality made self-organized InAs islands ideal systems to study many-particle quantum effects in electronic and excitonic systems, with such precision that the label “artificial atoms” was indeed warranted for these semiconductor nanostructures [2]. The terms “self-organized” or “self-assembled” seem to imply that within the parameter-space dictated by the optimum growth conditions, the morphology of the islands is more or less fixed. This would certainly limit the
use of these islands as model systems for quantum effects on the nanometer scale. It was soon realized, though, that additional ordering mechanisms can be exploited for more complicated shapes. The most prominent among these mechanisms is the vertical ordering that can lead to double (or multiple) vertically aligned dots [1,3,4], often labeled as “artificial molecules”. In the strained Si-Ge system, Stranski-Krastanov growth can even lead to three-dimensionally ordered lattices [5]. A further method to vary the shape of (InGa)As islands makes use of a morphological change that takes place when the dots are partially covered with GaAs. As will be discussed in detail below, islands with a well-developed ring shape can then be fabricated. Rings are distinctly different from single or multiple stacked dots in that they have a not-simply-connected geometry. This geometry makes them particularly interesting for studies in magnetic fields. The quantization of magnetic flux in the interior leads to a periodic change in the quantum mechanical properties of the electron system encircling the ring. This in turn affects all electronic properties of rings. For sufficiently narrow rings, these oscillations are periodic in the applied magnetic field and are commonly labeled “Aharonov-Bohm-oscillations”, even though – strictly speaking – the Aharonov-Bohm effect requires the absence of a magnetic field along the path of the electrons. In 1959 Aharonov and Bohm showed [6] that even in this case, the electron acquires an additional phase along its path that is directly proportional to the vector potential A. For a closed loop, where the total phase must be an integer multiple n of 2π, the discreteness of n even when A is changed continuously leads to oscillations that have no classical analogue.
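The flux periodicity described above can be illustrated with the textbook limit of an infinitely narrow ring, where the level energies are proportional to (ℓ + Φ/Φ0)², with Φ0 = h/e. This is only a sketch of that idealized limit, not the finite-width model used later in this review; the function names and the energy unit ħ²/(2m*R²) are choices made here for illustration.

```python
from typing import Tuple

# Idealized one-dimensional ring (textbook limit, assumed here for illustration):
# E_l = (l + phi)^2 in units of hbar^2 / (2 m R^2), where phi = Phi / Phi_0 is
# the enclosed flux in units of the flux quantum h/e.


def level_energy(l: int, phi: float) -> float:
    """Energy of the state with angular momentum l at reduced flux phi."""
    return (l + phi) ** 2


def ground_state(phi: float, l_max: int = 10) -> Tuple[int, float]:
    """Return (l, energy) of the lowest level, scanning |l| <= l_max."""
    best_l = min(range(-l_max, l_max + 1), key=lambda l: level_energy(l, phi))
    return best_l, level_energy(best_l, phi)


if __name__ == "__main__":
    for phi in (0.0, 0.3, 0.7, 1.3):
        print(phi, ground_state(phi))
```

In this limit the ground state steps through ℓ = 0, −1, −2, ... as the flux grows, and the ground state energy repeats whenever the flux changes by one flux quantum. For rings of finite width the fields at which the changes occur are expected to differ.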
The theoretical predictions were soon experimentally confirmed and Aharonov-Bohm-oscillations have since been observed in a large variety of systems, mostly, though, in homogeneous magnetic fields where B ≠ 0 along the path of the electrons [7,8,9]. Also, most of the experiments have been carried out in the semiclassical regime where the electron wavelength is given by the Fermi-energy of the chosen material. Furthermore, most experiments were carried out in the mesoscopic regime, where (non-phase-breaking) scattering plays an important role in the determination of the electron path. Naturally occurring ring-like quantum structures like the benzene molecule seem to offer an alternative to the above-mentioned semi-classical systems. In order to thread a flux-quantum through the interior of a benzene molecule, however, magnetic fields of around 50 000 T are needed, orders of magnitude more than the fields available in today’s laboratories. Self-organized quantum rings have all properties necessary to investigate flux-related phenomena in fully quantized, scatter-free few-electron systems. Their lateral size is large enough so that they can encompass a flux quantum in magnetic fields of a few tesla. Also, they can be controllably charged with single electrons, so that the influence of the electron-electron-interaction on the Aharonov-Bohm-oscillations can be studied. Furthermore, their smallness, their superb crystal quality and the fact that they can be probed as an ensemble average ensure that random scattering effects do not play a decisive role in the observed features. Finally, self-organized quantum rings are experimentally accessible through a number of techniques. The fact that transport, far-infrared and optical interband spectroscopy can be performed on the same sample allows for an in-depth characterization of the electronic and excitonic structure. This way, a complete picture, containing the influence of the external confining potential, the electron-electron-interaction and the magnetic field on the many-particle states can be derived.

In this review we will first focus on the formation of self-organized (InGa)As quantum rings. After summarizing the growth conditions necessary for ring formation, we will discuss the implications that the transformation from dots to rings has on the understanding of Stranski-Krastanov growth in III-V-semiconductors. In the second part, we will review the electronic properties of ring-shaped quantum structures in magnetic fields and show how these are related to the experimental observations in transport and far-infrared spectroscopy.
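The field scales quoted above follow from requiring one flux quantum h/e through the enclosed area. The sketch below makes that estimate under stated assumptions: a circular area πR², a ring radius of 14 nm (the model radius fitted later in this review), and a benzene radius of 1.4 Å (an assumed value close to the C–C bond length). It is an order-of-magnitude check, not a statement about the precise numbers used by the authors.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant (J s), exact SI value
E_CHARGE = 1.602176634e-19   # elementary charge (C), exact SI value
FLUX_QUANTUM = H_PLANCK / E_CHARGE  # h/e in Wb

RING_RADIUS_M = 14e-9        # model ring radius R0 quoted later in the text
BENZENE_RADIUS_M = 1.4e-10   # assumed radius of the benzene ring (illustrative)


def field_for_one_flux_quantum(radius_m: float) -> float:
    """Field (T) at which a circle of the given radius encloses h/e."""
    area = math.pi * radius_m ** 2
    return FLUX_QUANTUM / area


if __name__ == "__main__":
    print("quantum ring:", field_for_one_flux_quantum(RING_RADIUS_M), "T")
    print("benzene:", field_for_one_flux_quantum(BENZENE_RADIUS_M), "T")
```

Under these assumptions the ring needs a field of a few tesla, while benzene needs a field of order several 10⁴ T, in line with the orders of magnitude stated in the text.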
2
Growth Procedure
In hindsight, the observation of well-defined ring-shaped islands was to some extent a fortunate coincidence. We now know that a number of growth parameters have to be chosen correctly to induce the ring formation. The rings shown in Fig. 1 were prepared in the following way [10] (see also Fig. 2). On silicon-doped GaAs (100) wafers, a GaAs buffer layer was grown at ≈ 600 °C to smooth out the substrate. Then the temperature was lowered to ≈ 530 °C, a value that can be reproduced in different growth chambers
Fig. 1. Atomic-force micrograph of self-organized (InGa)As quantum rings. Note the slight elongation along the (11̄0)-direction
Fig. 2. Growth sequence used for the fabrication of (InGa)As rings, panels (a) to (d). Panel labels: GaAs, 530 °C; InAs, ≈ 1.7 monolayers; dots; GaAs cover layer; ≈ 30 s anneal; rings
by observing the RHEED pattern, since under moderate As flux, at this temperature the surface reconstruction changes from (2 × 4) to c(4 × 4). Then, 1.7 monolayers of InAs were deposited at a low growth rate to induce Stranski-Krastanov formation of large, approximately lens-shaped islands. In the present case, the islands have a base width of approximately 30 nm and a height of about 10 nm. Our studies [11] and those of other authors [12,13,14] strongly suggest that large dots are a necessary starting condition for the formation of well-defined rings. The sample was annealed for 40 s under As flux to narrow the size distribution and further help the formation of large islands [16]. The dots were then capped with 4 nm GaAs and the wafers were, after an annealing time of a few tens of seconds, cooled down to room temperature under As flux and removed from the chamber for AFM characterization under ambient conditions. As seen in Fig. 1, this procedure results in well-defined rings of ≈ 2 nm height, 20 nm inner diameter and between 60 and 140 nm outer diameter. Detailed statistical analysis [17] shows that the centers of the rings are located at the sites of the former dots. Even though this does not seem very surprising, it shows that the annular mounds are formed by an outward transport of material that leads to a dissolution of the original, lens-shaped island. In the following section we will more closely analyze the different mechanisms that are responsible for this outward movement of the In-rich material.
3
Growth Mechanisms
One possible reason for the redistribution of the InAs is diffusion. At the growth temperature of ≈ 530 °C, the In atoms are very mobile on the surface [18], whereas the Ga atoms can diffuse only little during the growth interruption. Both are reflected in the shape of the rings. The outer boundary is elongated along the (11̄0)-crystal orientation, which is the preferred direction for In-diffusion on the reconstructed GaAs (100) surface [18]. The inner boundary, on the other hand, remains circular, which shows that the GaAs that is surrounding the dots just after the capping layer deposition (see Fig. 2(c)) remains in place (Fig. 2(d)). The elongation of the outer edge and the almost circular inner edge can be seen even better in capped InAs islands, grown under somewhat different growth conditions [19,20,21]. A possible scenario for the ring formation is depicted in Fig. 3 [21]. Here, as in Fig. 2, it has been assumed that the GaAs capping layer is surrounding the dot rather than covering it like a blanket (see Fig. 3(a)). This assumption is commonly made in the literature and reflects the fact that the relaxed InAs lattice constant at the tip of the dot [22] makes it an unfavorable site for the attachment of Ga atoms. It is further confirmed by transport measurements [23] which show that only when the capping layer thickness is comparable to the height of the original dot is the In locked in place by the GaAs coverage, so that the annealing step no longer results in ring formation. A further assumption made in Fig. 3 may, on the other hand, not be justified. In steps (b) and (c) the outdiffusing In stays more or less fixed once it has left the site of the dot. Only then can the relatively sharp outer edge of the rings be obtained. Even though this might be explained by (InGa)As alloy formation [23], we believe that diffusion alone cannot account for all experimental evidence. Apart from the sharp ring edge mentioned above, we observe that the ring formation seems to take place quite abruptly [21] rather than gradually, as expected for a diffusive mechanism. The strongest evidence against a picture that is based on different diffusion constants of the group III constituents comes from the work of Raz et al. [14]. They have shown that self-assembled
Fig. 3. Model of diffusion-driven ring formation. See text for details. After [23]
InAs rings can also be grown on InP. In this material combination, the substrate and the capping layer contain the same group III element as the dots. Starting from Fig. 3(a) and only assuming In migration, a depression in the center as shown in Fig. 3(b) or (c) cannot be explained. An alternative explanation for the abrupt morphological change of the InAs islands comes from an analogy to the instability of wetting droplets. In seemingly unrelated material combinations (e.g. polystyrene on Si), structures of striking similarity to the present rings have been observed and explained by a dewetting process [24]. It should also be remembered that one of the key ingredients for the understanding of Stranski-Krastanov growth is the interplay between surface and interface energies of the different constituents. The same should be true for partially capped islands. An explanation for (InGa)As ring formation, based on dewetting phenomena, is given in Fig. 4 [25]. As shown in (a), the wetting angle θ of uncapped islands is given by the balance of lateral forces at the base of the dot,

$$\gamma_{ac} = \gamma_{bc}\cos(\theta) + \gamma_{ab} . \tag{1}$$

Here, $\gamma_{ij}$ is the force at the interface between materials i and j, which in the present case are a = GaAs, b = InAs, and c = As-vapor. When the islands are partially capped, the corresponding relation would read (see Fig. 4(b))

$$\gamma_{ac} = \gamma_{bc}\cos(\theta) - \gamma_{ab}\cos(\theta) . \tag{2}$$
Since (2) is incompatible with (1), the configuration shown in Fig. 4(b) cannot be in equilibrium and a net force ∆F must be present for partially capped islands (Fig. 4(c)). This radial force can explain the outward motion of InAs that leads to the ring formation. A possible new equilibrium shape of the rings is suggested in Fig. 4(d). Here, the (lateral) interfacial forces are balanced again and the shape is in qualitative agreement with the experimental data.
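The incompatibility of (1) and (2) can be made explicit with a short numerical sketch. It fixes γac through (1) and evaluates by how much the two sides of (2) then differ; algebraically this mismatch is γab (1 + cos θ). The numerical values of the interface forces used below are arbitrary illustrative inputs, and reading the mismatch as the magnitude of ∆F is an interpretation made here, not a formula given in the text.

```python
import math


def capped_force_imbalance(gamma_bc: float, gamma_ab: float, theta_rad: float) -> float:
    """Left side minus right side of (2) when gamma_ac is fixed by (1).

    Inputs are interface forces in arbitrary (but common) units; the result
    equals gamma_ab * (1 + cos(theta)).
    """
    gamma_ac = gamma_bc * math.cos(theta_rad) + gamma_ab             # equation (1)
    rhs_capped = (gamma_bc - gamma_ab) * math.cos(theta_rad)         # right side of (2)
    return gamma_ac - rhs_capped


if __name__ == "__main__":
    theta = math.radians(30.0)   # illustrative wetting angle
    print(capped_force_imbalance(gamma_bc=1.0, gamma_ab=0.5, theta_rad=theta))
```

The mismatch vanishes only if γab = 0 or θ = π, so for any ordinary wetting angle a partially capped island described by (1) cannot also satisfy (2).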
Fig. 4. Model for ring formation based on wetting droplet instability. After [25]
In this model, we have for simplicity neglected the presence of the InAs wetting layer that is connecting the dots (see Fig. 2). The wetting layer will certainly affect the details of the ring formation. However, it will not qualitatively change the results [25]. Still, it would be desirable to develop a more sophisticated model that would also include the wetting layer as well as strain. Strain is known to play an important role in determining the equilibrium shape of (InGa)As dots. First steps towards a more complete model are being taken [13,15].
4
Further Control of the Island Shape
As discussed in the previous section, a number of effects play a role in the formation of self-organized quantum rings, among them diffusion, strain, and surface and interface energies. All of these can be controlled by proper choice of the growth parameters. This way, not only rings but other complex shapes can be realized. The influence of diffusion on the shape of capped InAs islands was investigated in Ref. [11]. Diffusion can be influenced not only by changing the growth temperature but also by using cracked As2, which is more reactive than As4 and thus reduces the mobility of the group III elements. This way, structures ranging from elongated islands to rings could be realized. An intermediate structure, i.e. an elongated island with a depression in its center, is of particular interest. Such a “camel back” structure might be useful for realizing double dot systems with molecule-type electronic properties. Strain and interface energies can be influenced by covering the dots not with pure GaAs but with In$_x$Ga$_{1-x}$As alloy of different composition x. As shown by Songmuang et al. [13], the dissolution of the center dot can then be completely suppressed and structures of surprising complexity emerge, such as a center island with a surrounding mound and a thin trench separating the two. Even though this demonstrates that an impressive variety of structures can be realized using self-organized growth, an important question remains. If indeed the overgrowth plays such an important role in the distribution of material, can the shape of the freestanding structures be preserved during the next growth steps that are necessary for the fabrication of electrically and optically active structures? In the next section we will show that this is true, at least for (InGa)As rings, by demonstrating that the electronic properties of the completed heterostructure reflect the ring geometry of the islands.
5
Electronic Properties
To study their electronic properties, the (InGa)As rings are embedded into a capacitor-type MISFET structure (metal-insulator-semiconductor field-effect
Fig. 5. Top: Schematic layer structure and experimental geometry used for the spectroscopic experiments. Bottom: Simplified sketch of the conduction band structure
transistor structure). The basic layer sequence is schematically depicted in Fig. 5. First, a heavily Si-doped back contact layer is grown, which serves as an electron reservoir and defines the local Fermi level. It is followed by a 25 nm GaAs spacer that separates the islands from the doping layer but allows for tunneling of electrons between the back contact and the rings. The rings themselves are grown using the above described Stranski-Krastanov procedure, with a capping layer thickness of 1 nm. They are then covered with 30 nm GaAs, followed by a 116 nm blocking layer that prevents tunneling or leakage currents between the rings and the top gate electrode. The blocking layer consists of an AlAs/GaAs superlattice; the gate is realized by a semi-transparent NiCr layer. The total area of the sample is about 5 mm², covering ≈ 5 × 10⁸ rings. The low density ensures that the rings are well separated laterally and that to a good approximation the charge in the ring layer can be neglected. Then an external voltage (shifted by the built-in voltage caused by the Schottky barrier) will cause a linear electrostatic potential across the structure, as schematically depicted in Fig. 5 (bottom). This way, the applied gate voltage Vg (scaled by the “lever arm” ttot/t1, see Fig. 5) directly translates into an energy shift of the rings with respect to the back contact. Single electrons are loaded into the rings each time the energy of the corresponding state is shifted below the Fermi energy in the back contact. This can be probed by application of an additional AC voltage Vmod. On resonance, electrons are tunneling back and forth between the rings and the back contact, leading to an effectively increased capacitance of the structure. Alternatively, the increase in the capacitance can be understood as the
contribution of the so-called quantum capacitance [26], which is proportional to the electronic density of states in the rings. Thus, the capacitance-voltage (CV) traces directly reflect the many-particle ground state energies of the quantized electronic system. Complementary information can be obtained by far-infrared transmission spectroscopy, which probes the excitations of the confined electron system. Due to the extremely low total number of electrons in the rings, the absorption is of the order of 0.1 %, which constitutes a great experimental challenge in the far-infrared. The tunability of the electron number is of great advantage here, because signal-to-noise ratios down to 10⁻⁴ can be achieved when the sample spectrum is normalized to a spectrum where the gate voltage is chosen so that the rings are void of electrons. Figure 6 shows a CV-spectrum of the above described quantum ring heterostructure. Around Vg = 0 V, two maxima (arrows) indicate the loading of the first and second electron per ring. At higher voltages, the capacitance rapidly increases because of the loading of the wetting layer [27] and no further charging peaks can be discerned in this sample (on other samples, however, a third maximum can be seen [28]). In comparison to the spectrum of dots [29] (see inset), the first charging peak of rings is shifted to higher voltages. This reflects the reduced height of the rings, which increases the electronic ground state energy. The lowest state is doubly spin degenerate and the separation between the first and the second maximum is given by the electron-electron-interaction. Concerning the charging energy, the difference between dots and rings is surprisingly small, considering the large difference in lateral size (about a factor of three). This might already be an indication of the missing center part of the rings, which would decrease the effective area available for the electrons and thus increase their interaction strength.
It should be pointed out, though, that the conversion of voltages into energies is not very reliable in the ring sample, because the lever arm (nominally 7 in the present case) is not well defined close to the charging of the wetting layer.
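The nominal lever arm can be estimated from the layer thicknesses given above. The sketch below assumes that t1 is the 25 nm spacer and that ttot is the sum of the spacer, the 30 nm GaAs cover and the 116 nm blocking layer, neglecting the thin ring layer; this reading of the geometry is an assumption made here, and the helper names are hypothetical.

```python
SPACER_NM = 25.0      # GaAs spacer between back contact and rings, taken as t1
COVER_NM = 30.0       # GaAs grown on top of the rings
BLOCKING_NM = 116.0   # AlAs/GaAs blocking layer


def lever_arm(t1_nm: float, t_tot_nm: float) -> float:
    """Geometric lever arm t_tot / t1 for a linear potential drop."""
    return t_tot_nm / t1_nm


def gate_voltage_to_energy_meV(delta_vg_mV: float, arm: float) -> float:
    """Energy shift (meV) of the ring levels for a gate-voltage change (mV)."""
    return delta_vg_mV / arm


if __name__ == "__main__":
    t_tot = SPACER_NM + COVER_NM + BLOCKING_NM   # assumed total distance, 171 nm
    arm = lever_arm(SPACER_NM, t_tot)
    print("lever arm:", arm)
    print("energy shift for 70 mV:", gate_voltage_to_energy_meV(70.0, arm), "meV")
```

With these thicknesses the ratio comes out close to the nominal value of 7 quoted in the text. As noted there, the conversion becomes unreliable near the charging of the wetting layer.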
Fig. 6. Capacitance-voltage traces of self-organized rings and dots (capacitance in arbitrary units versus gate voltage Vg in V). Arrows indicate the charging of the first and the second electron per ring
As mentioned in the introduction, a direct way to show that the not-simply-connected geometry of the islands translates into ring-like electronic properties is the investigation of the magnetic properties of the sample. Figure 7(a) summarizes the measured far-infrared resonances as a function of an external magnetic field B, applied perpendicular to the plane of the rings [30]. The resonances can be grouped into the following modes: (1) two resonances (◦) which are degenerate at B = 0 and exhibit orbital Zeeman splitting when the magnetic field is applied; (2) a low-lying mode which, because of the limitations of the spectrometer, can only be detected above 10 meV, but extrapolates to ≈ 7 meV at B = 0. Both the low-lying mode and the (◦)-mode die out around B = 7 T, when a new mode appears. The appearance and disappearance of modes is in contrast to the properties of quantum dots [2], where in general, only two resonances are observed, which are separated by the Zeeman energy. The periodic exchange of the allowed resonances in a magnetic field is, on the other hand, a characteristic of (sufficiently narrow) quantum rings. Making use of a model by Chakraborty et al. [31], we calculate the single electron states in a confining potential

$$U(r) = \frac{1}{2} m^{*} \omega_0^{2} (r - R_0)^{2} , \tag{3}$$
where ω0 is the characteristic frequency of a parabolic wire, bent into a ring of radius R0 (see inset in Fig. 7(c)). This model has the advantage that it requires only two adjustable parameters, which can readily be fitted to the (extrapolated) resonances at B = 0, one of which (≈ 20 meV) is given by the radial confinement, whereas the other (≈ 7 meV) reflects an azimuthal excitation with a change in angular momentum. The derived parameters, R0 = 14 nm and ħω0 = 12 meV, can be used to calculate the energy levels for the entire range of magnetic fields investigated. The result is shown in Fig. 7(b), where the energies of the states with radial quantum number N = 0 and angular momentum ℓ = 0, ±1, ±2 (solid lines) as well as with N = 1, ℓ = 0, ±1, −2 (dashed lines) are plotted. The arrows in Fig. 7(b) indicate possible far-infrared excitations from the lowest energy state, under the assumption that only transitions with ∆N = 0, 1 and ∆ℓ = ±1 are allowed. Even though the fit parameters were obtained from the resonance positions at B = 0, the calculated dispersion is in qualitative agreement with the experiment also at high magnetic fields. Here, the most important feature is the predicted change in ground state at a magnetic field of around 7 T. This change from a state with zero angular momentum to a state with angular momentum ℓ = −1 is responsible for the appearance and disappearance of the different resonances shown in Fig. 7(a). The strongest evidence for a change in angular momentum at high magnetic fields comes from a direct mapping of the single-electron ground state energy by capacitance spectroscopy. Figure 7(c) shows the results of a careful fitting procedure to accurately determine the lowest charging peak position.
Fig. 7. (a) Far-infrared resonance energies of quantum rings. Lines represent linear Zeeman terms corresponding to half the cyclotron energy (solid, dotted lines) and the full cyclotron energy (dashed line), respectively, see Ref. [30]. The mode indicated by × cannot be accounted for and might be an artifact. (b) Possible excitations from the ground state, assuming selection rules ∆N = 0, 1, ∆ℓ = ±1. Note the change in ground state that occurs just below 8 T. (c) Lines: Calculated ground state energies of quantum rings as a function of the magnetic field. The inset displays the model potential used for the calculations. Dots: Experimental values of the lowest CV charging peak. Axes: energy (meV) in (a) and (b); calculated energy (meV) and measured peak position (mV) in (c); magnetic field (T). Panel labels: Vg = 0.143 V (2 electrons / ring); ne = 2; N = 0, 1; ℓ = 0, −1, −2, −3; φ/φ0 ≈ 1
Up to approximately 8 T, the scaled data closely follow the parabolic dispersion of the ℓ = 0 state. At magnetic fields above 8 T the slope of the magnetic field dependence changes and now the data is in good agreement with the dispersion of the ℓ = −1 state.
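A level calculation of this kind can be sketched numerically. The code below discretizes the radial Schrödinger equation for the potential (3) in a perpendicular field and finds the lowest level for a given angular momentum ℓ by bisection on a Sturm sequence. It is a rough sketch under stated assumptions, not the calculation of Ref. [31]: the effective mass (0.067 free-electron masses), the grid, the box radius and the sign convention (negative ℓ is lowered by the field) are all choices made here for illustration.

```python
# Lowest radial level of U(r) = (1/2) m* w0^2 (r - R0)^2 in a field B, for
# angular momentum l, using finite differences for u(r) = sqrt(r) * R(r).
# Assumed (not from the text): M_REL, the grid size n and the box radius r_max.

HBAR2_OVER_2ME = 38.0998   # hbar^2 / (2 m_e) in meV nm^2
HBAR_WC_PER_T = 0.115768   # hbar e / m_e in meV per tesla
M_REL = 0.067              # assumed effective mass in units of m_e
R0_NM = 14.0               # ring radius quoted in the text
HW0_MEV = 12.0             # hbar * omega_0 quoted in the text


def hbar_wc(b_tesla):
    """Cyclotron energy hbar * omega_c (meV) for the assumed effective mass."""
    return HBAR_WC_PER_T * b_tesla / M_REL


def ground_energy(l, b_tesla, n=600, r_max=60.0):
    """Lowest eigenvalue (meV) of the discretized radial problem."""
    k = HBAR2_OVER_2ME / M_REL           # hbar^2 / (2 m*) in meV nm^2
    h = r_max / (n + 1)                  # grid spacing; u = 0 at r = 0 and r_max
    wc = hbar_wc(b_tesla)
    diag = []
    for i in range(1, n + 1):
        r = i * h
        v = k * (l * l - 0.25) / r ** 2                  # centrifugal term
        v += HW0_MEV ** 2 * (r - R0_NM) ** 2 / (4 * k)   # ring confinement (3)
        v += wc ** 2 * r ** 2 / (16 * k)                 # diamagnetic term
        v += 0.5 * l * wc                                # orbital Zeeman term
        diag.append(2 * k / h ** 2 + v)
    off2 = (k / h ** 2) ** 2             # squared off-diagonal matrix element

    def count_below(x):
        """Number of eigenvalues below x (Sturm sequence, tridiagonal matrix)."""
        count, d = 0, 1.0
        for i, a in enumerate(diag):
            d = a - x - (off2 / d if i else 0.0)
            if d == 0.0:
                d = 1e-30                # avoid division by zero in the next step
            if d < 0.0:
                count += 1
        return count

    lo = min(diag) - 2 * k / h ** 2      # Gershgorin bounds on the spectrum
    hi = max(diag) + 2 * k / h ** 2
    while hi - lo > 1e-9:                # bisection for the lowest eigenvalue
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for b in (0.0, 4.0, 8.0, 12.0):
        print(b, [round(ground_energy(l, b), 3) for l in (0, -1, -2)])
```

Scanning B with this sketch shows the ℓ = ±1 levels splitting linearly by the cyclotron energy. Where the ℓ = 0 and ℓ = −1 levels cross depends on the assumed effective mass, so agreement with the field of about 8 T quoted above should not be expected without adjusting that parameter.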
Summarizing the results shown in Fig. 7, three independent experimental observations can simultaneously be described using the model parameters R0 = 14 nm, ħω0 = 12 meV: the resonance positions at B = 0, the change in the far-infrared excitation spectrum at ≈ 8 T and the change in slope of the lowest charging peak at about the same field. This gives us confidence in the validity of the model and shows that indeed, the ring morphology of the (InGa)As islands translates into annular electronic states. So far, we have restricted ourselves to a single-particle picture. A lot of interesting physics, however, arises from the influence of the Coulomb interaction on the ground state and the excitations. For example, two electrons confined to a ring can be understood as a rotating Wigner molecule [32]. At present, the large number of rings that are probed simultaneously does not allow us to complement the theoretical predictions with experimental data of sufficient resolution. Great progress in interband optical spectroscopy, on the other hand, has made it possible to investigate single rings and study the influence of electron-electron interaction with high resolution [33]. These experiments were carried out at B = 0 and the challenge remains to apply a high magnetic field and observe clear evidence for the ring morphology. Progress in this direction is rapid [34], so that the observation of (somewhat counter-intuitive) Aharonov-Bohm-type phenomena in neutral excitons [35] seems feasible. Also, the calculated ground state spectra for a many-electron system in a quantum ring display characteristic features, caused by the electron-electron interaction. For two electrons, the ground state should exhibit a periodic change between spin singlet and triplet states [36]. The results of the CV spectra for 2 electrons (not shown here) are in rough agreement with many-particle calculations [37]; a clear confirmation of the theoretical predictions, however, is still lacking.
Refinements in both the theoretical and experimental tools promise many unexpected and interesting results in the future that will deepen the understanding of the fascinating objects of nanoscopic rings in the quantum limit.
Acknowledgement
The work summarized here was accompanied by continuous support and by stimulating discussions with A. O. Govorov, R. J. Warburton, K. Karrai, S. E. Ulloa, and J. P. Kotthaus. We would like to thank them for their fruitful collaboration. Financial support through SFB 348 and BMBF grant 01BM164 is gratefully acknowledged.
Self-Organized InGaAs Quantum Rings
References
1. L. Goldstein, F. Glas, J.-Y. Marzin, M. N. Charasse, and G. Le Roux, Appl. Phys. Lett. 47, 1099 (1985).
2. For reviews, see, e.g., S. M. Reimann and M. Manninen, Rev. Mod. Phys. 74, 1283 (2002); D. Bimberg, M. Grundmann, N. N. Ledentsov, Quantum Dot Heterostructures (John Wiley, Chichester 1999).
3. W. Wu et al., Appl. Phys. Lett. 71, 1083 (1997).
4. R. J. Luyken et al., Nanotechnology 10, 14 (1999).
5. V. Holy et al., Phys. Rev. Lett. 83, 356 (1999).
6. Y. Aharonov and D. Bohm, Phys. Rev. 115, 485 (1959).
7. L. P. Lévy et al., Phys. Rev. Lett. 64, 2074 (1990); V. Chandrasekhar et al., ibid. 67, 3578 (1991); D. Mailly, C. Chapelier, and M. Benoit, ibid. 70, 2020 (1993); A. F. Morpurgo et al., ibid. 80, 1050 (1998); R. Schuster et al., Nature 385, 417 (1997).
8. A. Fuhrer et al., Nature 413, 822 (2001).
9. U. F. Keyser et al., Semicond. Sci. Tech. 17, L22 (2002).
10. J. M. García et al., Appl. Phys. Lett. 71, 2014 (1997).
11. D. Granados and J. M. García, Appl. Phys. Lett., accepted.
12. P. B. Joyce et al., Appl. Phys. Lett. 79, 3615 (2001).
13. R. Songmuang, S. Kiravittaya, and O. G. Schmidt, J. Cryst. Growth 249, 416 (2003).
14. T. Raz, D. Ritter, and B. Bahir, Appl. Phys. Lett. 82, 1707 (2003).
15. L. G. Wang et al., Appl. Phys. A 73, 161 (2001).
16. F. Ferdos et al., Appl. Phys. Lett. 81, 1195 (2002).
17. K. Mecke, private communication (unpublished).
18. V. Bressler-Hill et al., Phys. Rev. B 50, 8479 (1994) and references therein.
19. I. Kamiya, I. Tanaka, and H. Sakaki, J. Cryst. Growth 201, 1146 (1999).
20. H. Heidemeyer et al., Appl. Phys. Lett. 80, 1544 (2002).
21. A. Lorke et al., Mat. Sci. Eng. B 88, 225 (2002).
22. I. Kegel et al., Phys. Rev. Lett. 85, 1694 (2000).
23. A. Lorke et al., Jpn. J. Appl. Phys. 40, 1857 (2001).
24. S. Herminghaus et al., Science 282, 916 (1998); K. Jacobs and S. Herminghaus, Physikalische Blätter 55 (12), 35 (1999).
25. R. Blossey and A. Lorke, Phys. Rev. E 65, 021603 (2002).
26. R. J. Luyken et al., Appl. Phys. Lett. 74, 2486 (1999).
27. G. Medeiros-Ribeiro, D. Leonard, and P. M. Petroff, Appl. Phys. Lett. 66, 1767 (1995).
28. H. Pettersson et al., Physica E 6, 510 (2000).
29. See, e.g., M. Fricke et al., Europhys. Lett. 36, 197 (1996); B. T. Miller et al., Phys. Rev. B 56, 6764 (1997) and references therein.
30. A. Lorke et al., Phys. Rev. Lett. 84, 2223 (2000).
31. T. Chakraborty and P. Pietiläinen, Phys. Rev. B 50, 8460 (1994); V. Halonen, P. Pietiläinen, and T. Chakraborty, Europhys. Lett. 33, 377 (1996).
32. L. Wendler et al., Z. Phys. B 100, 211 (1996).
33. R. J. Warburton et al., Nature 405, 926 (2000).
34. D. Haft et al., Physica E 13, 165 (2002).
35. A. O. Govorov et al., Phys. Rev. B 66, 081309(R) (2002).
36. H. Hu, J-L. Zhu, and J-J. Xiong, Phys. Rev. B 62, 16777 (2000).
37. A. Emperador et al., Phys. Rev. B 62, 4573 (2000).
Quantum Mechanics in Quantum Rings
Thomas Ihn¹, Andreas Fuhrer¹, Martin Sigrist¹, Klaus Ensslin¹, Werner Wegscheider², and Max Bichler³
¹ Solid State Physics Laboratory, ETH Zürich, 8093 Zürich, Switzerland
² Angew. und Exp. Physik, Universität Regensburg, 93040 Regensburg, Germany
³ Walter Schottky Institut, TU München, 85748 Garching, Germany
Abstract. Magnetotransport measurements on a two-terminal semiconductor quantum ring structure are reviewed. The structure has been fabricated by AFM-lithography on a shallow Ga[Al]As heterostructure. In the open regime Aharonov-Bohm oscillations are observed and the phase coherence length can be inferred from their temperature dependence. The ring can also be tuned into the Coulomb blockade regime. It is demonstrated that single-particle level spectra can be reconstructed showing many characteristics of an ideal one-dimensional ring spectrum. Further information about individual quantum states including their position and degree of localisation, their angular momentum and their spin is extracted from measurements in perpendicular and parallel magnetic field and from measurements with asymmetrically applied plunger gate voltages. The ring is an example of a Coulomb blockaded many-electron system in which many aspects of the energy spectrum and the quantum states can be understood quantitatively.
1
Introduction
Within the last 100 years the implications of quantum theory have dominated physics research. Richard Feynman states in his lectures on physics that quantum interference ‘has in it the heart of quantum mechanics’ [1]. Within the last fifteen years semiconductor nanostructures have proven to allow the realisation of nearly ideal quantum systems. Among them are the quantum point contact showing the quantisation of conductance [2,3], quantum dots representing tunable artificial atoms [4,5] and quantum ring structures exhibiting the Aharonov-Bohm (AB) effect [6,7,8,9]. Electron transport through these structures at low temperatures is a key experimental technique which has led to the discovery of a number of effects for which the phase coherence of electrons is crucial. In this review we concentrate on such experiments on a nano-scale quantum ring structure which gives remarkable insight into and control over quantum mechanical properties of electronic states [10,11]. The review is organised as follows: after the introduction of the structure and its fabrication we discuss the AB effect in the open two-terminal ring. When the coupling of the ring to source and drain contacts is strongly reduced the ring can be driven into the Coulomb-blockade regime where the ring’s energy spectrum and the nature of its orbital states can be investigated. In the last section we discuss how the spin states in the ring can be measured.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 139–154, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
The Ring Structure and Its Fabrication
The quantum ring structure is fabricated on a Ga[Al]As heterostructure with the heterointerface 34 nm below the sample surface. The density of the two-dimensional electron gas (2DEG) forming near the interface at a temperature of T = 4.2 K is ns = 5 × 10¹⁵ m⁻², its mobility is µ = 90 m²/Vs. The electron gas can be patterned into a nanostructure by local anodic oxidation [12,13]. This technique allows the direct local oxidation of a GaAs surface with the tip of a scanning force microscope under ambient conditions by applying a voltage between the conductive tip and the buried electron gas. At low temperatures the electron gas is depleted below the oxide lines. Similar to the action of a local shallow etch on the 2DEG, self-aligned but mutually insulated conducting regions are created (Fig. 1a). Figure 1b is an image of the quantum ring structure taken after the oxidation step with the scanning force microscope. The bright lines are the oxide lines. The ring structure in the centre is connected to source and drain via quantum point contacts. These can be tuned with the in-plane gates QPC 1a and b and QPC 2a and b. Two additional plunger gates allow the electronic structure of the ring to be tuned. The average radius of the ring is r0 = 132 nm, its electronic width about ∆r = 65 nm. After the oxidation step the whole structure has been covered with a metallic top gate electrode. Comparing the Fermi wavelength in the unstructured electron gas λF = 35 nm with the width of the ring ∆r we estimate that 2–4 radial modes may be occupied in the ring. The elastic mean free path of electrons in the 2DEG is le = 8 µm, much larger than the size of the whole structure.
Fig. 1. (a) Schematic of the transfer of the oxide line pattern into the two-dimensional electron gas. (b) Scanning force microscopy image of the ring structure taken after writing the oxide lines but before depositing the top gate metallisation
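The length scales quoted for the 2DEG can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming a spin-degenerate 2DEG with a circular Fermi surface, a hard-wall count of half Fermi wavelengths across the ring width, and the simple estimate le = ħkFµ/e; the function names are illustrative and not taken from the original work. With these assumptions the Fermi wavelength comes out at about 35 nm and the mean free path at roughly 10 µm, of the same order as the quoted 8 µm.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant (J s)
E_CHARGE = 1.602176634e-19  # elementary charge (C)


def fermi_wavevector(n_s):
    """Fermi wave vector (1/m) of a spin-degenerate 2DEG with sheet density n_s (1/m^2)."""
    return math.sqrt(2.0 * math.pi * n_s)


def fermi_wavelength(n_s):
    """Fermi wavelength (m) of the 2DEG."""
    return 2.0 * math.pi / fermi_wavevector(n_s)


def radial_modes(width, n_s):
    """Hard-wall estimate: number of half Fermi wavelengths fitting into the width."""
    return int(width / (fermi_wavelength(n_s) / 2.0))


def mean_free_path(n_s, mobility):
    """Elastic mean free path (m) from l_e = hbar * k_F * mu / e."""
    return HBAR * fermi_wavevector(n_s) * mobility / E_CHARGE


# Values quoted in the text
n_s = 5e15         # sheet density (1/m^2)
mobility = 90.0    # mobility (m^2/Vs)
delta_r = 65e-9    # electronic width of the ring (m)

lambda_f = fermi_wavelength(n_s)
modes = radial_modes(delta_r, n_s)
l_e = mean_free_path(n_s, mobility)

print(f"lambda_F = {lambda_f * 1e9:.1f} nm")
print(f"radial modes (hard-wall estimate) = {modes}")
print(f"l_e = {l_e * 1e6:.1f} um")
```

The hard-wall count is only a rough guide, since the real confinement is softer and the density in the ring may differ from that of the unstructured 2DEG, which is consistent with the range of 2–4 modes given in the text.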
3
Aharonov-Bohm Effect
3.1
Overview
Fig. 2. Magnetoresistance of the ring measured at T = 1.7 K. At low magnetic fields (B < 0.9 T) h/e-periodic Aharonov-Bohm oscillations are visible with a period ∆B = 77 mT. At large fields (B > 2 T) the ring shows Shubnikov-de Haas minima for even filling factors ν ≤ 10. The inset shows the magnetoresistance in this field range with the smooth background subtracted. B-periodic oscillations with slightly smaller period are observed
Figure 2 shows the magnetoresistance of the ring measured at a temperature of 1.7 K with an AC bias current of 1.4 nA at 31 Hz. The top gate voltage was Vtg = 300 mV and all in-plane gate voltages were kept at 210 mV. These voltage settings make sure that the ring is strongly coupled to source and drain. At magnetic fields B > 2 T we identify Shubnikov-de Haas minima corresponding to an electron density of ns = 5.5 × 10¹⁵ m⁻², i.e. close to the density in the pristine 2DEG. At low magnetic fields B < 0.9 T (Aharonov-Bohm regime) pronounced B-periodic AB oscillations are observed with a period ∆B = 77 mT corresponding to a circular area with radius 131 nm in excellent agreement with the geometric ring radius r0. The oscillations are strongly reduced in amplitude above B = 0.9 T where the classical cyclotron radius Rc = ħkF/eB becomes smaller than r0. At these fields the chirality of the quantum states in a magnetic field starts to play a role. Shubnikov-de Haas oscillations set in at about 2 T where Rc ≈ ∆r. The inset shows that AB oscillations with a slightly reduced period between 65 and 70 mT persist even in the regime where r0 > Rc > ∆r, but they are hardly discernible at higher fields. The reduced period reflects the increased area enclosed by the
states as they are pushed to the outer ring boundaries by the Lorentz force. A reduction of the AB oscillation amplitude in high magnetic fields has also been observed in earlier experiments on ring structures [9,14,15]. In the remainder of this review we concentrate on measurements in the Aharonov-Bohm regime at B < 0.9 T where the effects of the magnetic field on the orbital wave functions play a minor role but B acts mainly on the wave function phase. In a two-terminal measurement the observable phase of the Aharonov-Bohm oscillations in the magnetoresistance is rigid due to the generalised Onsager relations [16] taking on values of either 0 or π. The magnetoresistance in Fig. 2 is symmetric around zero magnetic field in agreement with this prediction. 3.2
Phase Coherence Length from Temperature Dependence
Fig. 3. Magnetoresistance of the ring measured at different temperatures. The AB oscillations decay with increasing temperature
Figure 3 shows the Aharonov-Bohm oscillations measured at three different temperatures from T = 1.7 K up to 15 K. At the lowest temperature the magnetoresistance shows a significant h/2e-periodic component. This periodicity can either arise from the interference of time reversed paths (Altshuler-Aronov-Spivak oscillations [17]) or from other paths with winding number n = 2 around the ring. In general, h/ne-components of the oscillations can be extracted from the measurement by Fourier analysis. The result for our measurements is shown in Fig. 4. With increasing temperature the oscillations
in the magnetoresistance die out. The h/ne components disappear the faster the larger the winding number n. At T = 15 K weak h/e-periodic oscillations remain. Being an interference effect the AB oscillations require the phase coherence of electron waves. The amplitude of the oscillations is affected by dephasing [18]. The temperature dependence of the h/ne oscillation amplitude An(T) has been suggested to follow the dependence [19] An(T) ∝ exp(−nL/lϕ(T)), where lϕ(T) is the temperature dependent phase coherence length and L is a characteristic length scale of the ring for which we use half the circumference, i.e. L = πr0. It has been argued that thermal energy averaging will give a significant contribution to the temperature dependence of the h/ne-periodic oscillations for odd n as soon as kT becomes larger than the Thouless energy Ec, which in a strictly one-dimensional ring is similar to the level spacing ∆ [18,19]. In contrast, oscillations with even n, in particular with n = 2, are believed to be insensitive to thermal averaging due to the large contribution of time reversed paths [19]. The temperature dependence of these oscillations is therefore expected to be dominated by lϕ(T). From these arguments a temperature dependence lϕ(T) ∝ T⁻¹ was found in recent experiments in a temperature range between 0.3 and 3.5 K [19], similar to the behaviour in small open dots [20].
Fig. 4. Amplitudes of h/ne-periodic oscillations determined from a Fourier analysis of the magnetoconductances at different temperatures
In our experiment the temperature dependence of the h/2e-periodic oscillations does not strictly follow the 1/T-dependence over the full temperature range (Fig. 4). Other dephasing mechanisms in addition to electron-electron scattering may lead to a stronger decay at higher temperatures. If we estimate the phase coherence length from the data points below 6 K assuming the 1/T-dependence to hold, we find lϕ(T) = 7.5 µm/(T/K). In addition, we find that the decay rate of the h/ne-periodic oscillations is proportional to n for n ≤ 3,
i.e. thermal energy averaging does not seem to dominate in our structure for odd n. This result, which is in contrast to previous experiments [19], may be due to the more than a factor of ten larger level spacing ∆ ∝ 1/r0² caused by our small ring radius r0.
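The numbers quoted in this section for the open ring can be reproduced with elementary formulas. The sketch below is illustrative only: it assumes a circular orbit of radius r enclosing one flux quantum per AB period, a spin-degenerate 2DEG for kF, and the fitted form lϕ(T) = 7.5 µm/(T/K), which the text applies only below about 6 K. The function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34           # Planck constant (J s)
HBAR = H_PLANCK / (2.0 * math.pi)   # reduced Planck constant (J s)
E_CHARGE = 1.602176634e-19          # elementary charge (C)


def ab_period(radius, n=1):
    """Field period (T) of the h/(n e) oscillations for a circular area of given radius (m)."""
    return H_PLANCK / (n * E_CHARGE * math.pi * radius**2)


def cyclotron_radius(n_s, b_field):
    """Classical cyclotron radius R_c = hbar k_F / (e B) for sheet density n_s (1/m^2)."""
    k_f = math.sqrt(2.0 * math.pi * n_s)
    return HBAR * k_f / (E_CHARGE * b_field)


def coherence_length(temperature, prefactor=7.5e-6):
    """l_phi(T) = 7.5 um / (T/K), the low-temperature fit quoted in the text."""
    return prefactor / temperature


def amplitude_ratio(n, t_low, t_high, r0):
    """A_n(t_high) / A_n(t_low) for A_n proportional to exp(-n L / l_phi), with L = pi r0."""
    length = math.pi * r0
    exponent = n * length * (1.0 / coherence_length(t_high) - 1.0 / coherence_length(t_low))
    return math.exp(-exponent)


r0 = 132e-9   # geometric ring radius (m)
n_s = 5.5e15  # density from the Shubnikov-de Haas minima (1/m^2)

print(f"AB period for r = 131 nm: {ab_period(131e-9) * 1e3:.1f} mT")
print(f"R_c at 0.9 T: {cyclotron_radius(n_s, 0.9) * 1e9:.0f} nm")
print(f"R_c at 2.0 T: {cyclotron_radius(n_s, 2.0) * 1e9:.0f} nm")
for n in (1, 2, 3):
    print(f"n = {n}: A_n(5 K)/A_n(1.7 K) = {amplitude_ratio(n, 1.7, 5.0, r0):.3f}")
```

Under these assumptions the cyclotron radius is close to r0 near 0.9 T and close to ∆r near 2 T, matching the crossovers described in the overview, and the logarithm of the amplitude ratio scales linearly with the winding number n, as stated for n ≤ 3.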
4
The Ring in the Coulomb-Blockade Regime
For the following experiments the ring was cooled down to 100 mK in a dilution refrigerator. A DC bias voltage of Vbias = 20 µV was applied between source and drain contacts and the DC current was measured with a noise floor at about 200 fA. The quantum ring structure can be tuned into the Coulomb blockade regime [10,11] by lowering the top gate voltage to Vtg ≈ 210 mV and setting the voltages of the point contact gates VQPC1a = VQPC1b = 200 mV and VQPC2a = VQPC2b = 300 mV. Plunger gate voltages were kept between 200 and 300 mV. 4.1
Coulomb Blockade Diamonds
The differential conductance dI/dVbias is shown in Fig. 5 as a function of Vbias and Vtg. Along the dashed line of zero bias, conductance resonances can be seen at discrete Vtg. The diamond shaped regions of small differential conductance lining up along the same axis are Coulomb blockade diamonds characteristic of regions of fixed electron number in the ring [21,22]. From one diamond to the next the number of electrons increases by one with increasing Vtg. Half the width of a certain diamond in Vbias-direction gives the energy (divided by the electronic charge e) for adding the next electron to the dot. In the constant interaction picture [23] it is given by the sum of the charging energy contribution e²/CΣ and the single-particle level spacing ∆i. For the diamond marked by white boundary lines in Fig. 5 this energy is 300 µeV. The separation of conductance peaks at zero bias is given by the same energy, which allows the lever arm of the top gate to be determined as αtg = 0.6. This lever arm is constant over the whole voltage range shown in the figure. It allows the conversion of Vtg into an energy scale which is important for a quantitative spectroscopy of states in the ring.
4.2
Voltage Dependent Lever Arms
Fig. 5. The differential conductance dI/dVbias as a function of top gate and bias voltage shows Coulomb blockade diamonds. The plunger gate voltage was Vpg = 300 mV
In contrast to αtg the lever arms of the in-plane plunger gates αpg are not constant. This is shown in Fig. 6a where the DC conductance is plotted for constant Vbias = 20 µV and variable top gate and plunger gate voltages (the plunger gate voltage is applied to both plunger gates in Fig. 1b). The reason for the voltage dependent αpg is the finite density of states in the two-dimensional in-plane gates, which leads to substantial depletion of these gates
near the ring at negative Vtg − Vpg. The dashed lines in the figure indicate parametric charge rearrangements in the sample. Given that αtg is independent of the gate voltages, the Vtg axis in Fig. 6a can be directly converted to an energy scale. We further observe that on lines of constant Vtg − Vpg (these lines are essentially parallel to the Vtg-axis, i.e. at constant Vpg) different conductance peaks have with good accuracy the same slope dVtg/dVpg. We determine the voltage dependent differential lever arm
$$\alpha_{pg}(V_{pg}) = \alpha_{tg}\,\frac{d(eV_{tg})}{dV_{pg}}$$
by averaging the slopes for different peaks at given Vpg. Integration of the differential lever arm gives the desired energy calibration for the Vpg-axis. Figure 6b shows the success of the procedure by plotting the peaks of Fig. 6a with two renormalised energy axes. All peaks have a slope of −1 by construction. The same calibration procedure can be performed for each plunger gate individually. It turns out that the corresponding differential lever arms αpgi(Vpgi) are identical for both plunger gates (i = 1, 2) demonstrating their symmetric action on the states in the ring. Figs. 6c and d show the convincing results.
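The calibration procedure described above (average the peak slopes at each plunger gate voltage, scale by the top gate lever arm, then integrate) can be sketched numerically. The example below is only an illustration under stated assumptions: the slope values are invented, the magnitude of the slope is used so that the resulting energy axis increases with Vpg, and the integration is a plain trapezoidal rule. None of the function names or numbers other than αtg = 0.6 come from the original work.

```python
def differential_lever_arm(alpha_tg, peak_slopes):
    """Average |dVtg/dVpg| over several peaks and scale by the top gate lever arm."""
    mean_slope = sum(abs(s) for s in peak_slopes) / len(peak_slopes)
    return alpha_tg * mean_slope


def integrate_lever_arm(v_pg, alpha_pg):
    """Trapezoidal integral of alpha_pg dVpg.

    Returns the ring energy relative to the first plunger gate voltage,
    in meV if v_pg is given in mV.
    """
    energy = [0.0]
    for i in range(1, len(v_pg)):
        step = 0.5 * (alpha_pg[i] + alpha_pg[i - 1]) * (v_pg[i] - v_pg[i - 1])
        energy.append(energy[-1] + step)
    return energy


ALPHA_TG = 0.6  # top gate lever arm quoted in the text

# Hypothetical peak slopes dVtg/dVpg at five plunger gate voltages (mV).
# These numbers are made up for illustration and are not measured values.
v_pg = [200.0, 250.0, 300.0, 350.0, 400.0]
slopes = [
    [-0.020, -0.022],
    [-0.030, -0.031],
    [-0.040, -0.042],
    [-0.050, -0.049],
    [-0.055, -0.057],
]

alpha_pg = [differential_lever_arm(ALPHA_TG, s) for s in slopes]
e_ring = integrate_lever_arm(v_pg, alpha_pg)

for v, a, e in zip(v_pg, alpha_pg, e_ring):
    print(f"Vpg = {v:5.0f} mV  alpha_pg = {a:.4f}  E_ring = {e:.2f} meV")
```

Plotting peak positions against the two energy axes obtained in this way would give lines of slope −1 by construction, which is the check shown in Fig. 6b.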
Fig. 6. (a) The conductance G as a function of top gate and plunger gate voltage shows the voltage dependent lever arm of the plunger gate. (b) After converting the Vpg and the Vtg-axis to an energy scale all conductance peaks move along lines with slope of −1. (c) and (d) The same calibration works for the two individual plunger gates
4.3
Energy Spectra as a Function of Magnetic Field
Fig. 7. (a) Schematic of the one-dimensional ring problem. (b) Energy spectrum for V(x) = 0. The flux piercing the ring is measured in dimensionless units m = Φ/Φ0, where Φ0 = h/e is the flux quantum. Numbers along the zig-zag line are the angular momenta ℓ of the corresponding states. (c) Energy spectrum for V(x) = 0 (dashed parabolae) and V(x) ≠ 0 (solid lines) in the reduced zone scheme
The electron in a one-dimensional ring pierced by a magnetic flux (Fig. 7a) has become a popular analytically solvable model in quantum mechanics courses. The Schrödinger equation can be written in the form [24]
$$\left[\frac{1}{2m}\left(\hat{p} + \hbar k\right)^2 + V(x)\right] u_k(x) = E_k\, u_k(x), \qquad (1)$$
where k = Φ/(r0Φ0) = πr0eB/h is the renormalised magnetic field, the coordinate x is measured along the ring circumference and p̂ = −iħ∂/∂x is the momentum operator. An arbitrary potential V(x) is allowed around the ring, which has the property V(x + 2πr0n) = V(x) for an arbitrary integer n. The wave function uk(x) is also required to be periodic around the ring. Using the Ansatz for a Bloch wave ψk(x) = uk(x) exp(ikx) one can show that the ψk(x) obey the equation
$$\left[\frac{\hat{p}^2}{2m} + V(x)\right]\psi_k(x) = E_k\,\psi_k(x),$$
which describes the problem of an electron in a one-dimensional periodic potential V(x). The eigenvalue solutions of this problem are, of course, well known (see e.g. [25,24]) and we show them schematically in Fig. 7b and c. If the potential V(x) which breaks the rotational symmetry of the problem is
zero, we have the free-electron case and the spectrum consists of parabolae shifted with respect to each other by one flux quantum. Characteristic for the resulting energy spectrum in Fig. 7b are the diamond shaped regions enclosed by energy states. Alternatively the one-dimensional ring states can be described in terms of the angular momentum quantum number ℓ [10]. Each parabola in Fig. 7b corresponds to a certain angular momentum ℓ such that ℓ = m at the apex of the parabola, where m = Φ/Φ0 is the number of flux quanta threading the ring, i.e.
$$E_\ell(m) = \frac{\hbar^2}{2m^\star r_0^2}\,(\ell - m)^2, \qquad (2)$$
where m⋆ in the prefactor denotes the electron mass.
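Equation (2) is simple enough to evaluate directly, and doing so makes the zig-zag motion of the topmost filled level explicit. The following is a minimal sketch for the case V(x) = 0 that ignores spin and interactions. Energies are in units of the prefactor ħ²/(2m⋆r0²); the conversion of that prefactor to µeV assumes the bulk GaAs effective mass of 0.067 free electron masses, which is an assumption of this example and is not stated in the text. Function names are illustrative.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant (J s)
M_ELECTRON = 9.1093837015e-31  # free electron mass (kg)
E_CHARGE = 1.602176634e-19   # elementary charge (C)


def energy_scale_ev(r0, m_eff):
    """Prefactor hbar^2 / (2 m_eff r0^2) of Eq. (2), converted to eV."""
    return HBAR**2 / (2.0 * m_eff * r0**2) / E_CHARGE


def level_energy(ell, m):
    """Energy of angular momentum state ell at flux m = Phi/Phi0, in units of the prefactor."""
    return (ell - m) ** 2


def sorted_levels(m, ell_max=20):
    """All levels with |ell| <= ell_max at flux m, sorted by energy."""
    return sorted(level_energy(ell, m) for ell in range(-ell_max, ell_max + 1))


def topmost_level(m, n_orbitals, ell_max=20):
    """Energy of the highest of n_orbitals filled orbital levels at flux m."""
    return sorted_levels(m, ell_max)[n_orbitals - 1]


# Assumed effective mass (bulk GaAs value), used only for the energy scale
scale = energy_scale_ev(132e-9, 0.067 * M_ELECTRON)
print(f"hbar^2/(2 m* r0^2) = {scale * 1e6:.1f} ueV (assuming m* = 0.067 m_e)")

# Topmost of three filled orbital levels as a function of flux: a zig-zag with period 1
for i in range(0, 11):
    m = i / 10.0
    print(f"m = {m:.1f}  E_top = {topmost_level(m, 3):.2f}")
```

The printed column rises and falls linearly in sections with cusps at integer and half-integer flux, which is the zig-zag behaviour of the topmost state referred to in the discussion of Fig. 7b.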
In case of non-zero V(x) (weak periodic potential) the dispersion splits at degeneracy points and forms a ‘band structure’ (Fig. 7c). Filling such a ring spectrum with a constant electron number, we observe that the topmost state changes its energy in a zig-zag fashion as a function of magnetic flux (Fig. 7b and c). Introducing interactions in the spirit of the constant interaction model, neighboring ‘bands’ in Fig. 7c will additionally split by the charging energy e²/CΣ and each individual ‘band’ will split into a spin-pair with the same spacing.
Fig. 8. (a) Measured conductance as a function of plunger gate voltage and magnetic field. (b) Reconstructed energy spectrum. A constant charging energy of 320 µeV was subtracted between neighbouring conductance peaks
Figure 8a shows the measured conductance of the ring as a function of plunger gate voltage and magnetic field taken at Vtg = 213 mV, i.e. at a different top gate voltage than the corresponding measurement in [10]. The dispersions of the conductance peaks in a magnetic field show the Aharonov-Bohm period of ∆B = 75 mT in peak position as well as in amplitude, about the same value found for the oscillating conductance in the open regime (Fig. 2). Under the reasonable assumption of a magnetic field independent
charging energy the motion of the peaks as a function of magnetic field directly reflects the dispersion of single-particle states. Pronounced zig-zag behaviour as expected from the one-dimensional model is, for example, seen for the peaks labeled 13 and 14. Obviously these two peaks are a spin-pair, i.e. they belong to the same orbital level successively populated by spin-up and spin-down electrons with increasing Vpg. No significant rounding of the cusps along these lines is observed indicating that the symmetry breaking potential felt by these states is small. Other conductance peaks do not show such a pronounced zig-zag behaviour but have a weaker magnetic field dispersion. They look rather like the lowest states in Fig. 7c which are strongly influenced by the potential V(x). Considering that two or more radial subbands coexist in the ring structure we have to expect the superposition of two or more ‘band structures’ like the one in Fig. 7c offset by the subband splitting with respect to each other. Such a scenario leads to many accidental level crossings reducing the probability for the occurrence of strongly oscillating states. At the same time this consideration explains the coexistence of flat as well as strongly oscillating states which are close in energy.
Fig. 9. Reconstructed energy spectrum. Conductance peak positions were shifted by appropriate values close to the charging energy, in contrast to Fig. 8b, where a constant charging energy was used for reconstructing the spectrum
It has been demonstrated in [10] that an experimental single-particle energy spectrum can be reconstructed from measurements like Fig. 8a if a constant charging energy is subtracted between conductance peaks. The result of this procedure is shown in Fig. 8b. Already in this figure, the presence of diamonds characteristic for the ideal one-dimensional ring spectrum (Fig. 7b) can be seen. In Fig. 9 measurements taken at three slightly different top gate voltages have been combined in order to reconstruct a spectrum with as many states as possible. In this spectrum, many diamonds like the shaded one are
discernible, reminiscent of the one-dimensional model. Even the parabolic dispersions of states with given angular momentum can be followed over a large energy interval. The additional states with a weaker dispersion are due to low lying states of another subband. An upper limit for the electron number N in the ring can be estimated from the product of the electron density ns in the 2DEG and the ring area A = π(r0 + ∆r/2)² − π(r0 − ∆r/2)² giving N < 270. A lower limit for N can be obtained as follows: from the slope of the strongly oscillating zig-zag states such as 13 and 14 and using eq. (2) we can determine the associated angular momentum at B = 0 to be about 8. This implies that there are 17 spin degenerate angular momentum states occupied in this subband accommodating 34 electrons. Additional electrons will fill states of at least one additional subband. Just counting the corresponding conductance peaks at B = 0 in Fig. 9 gives ten additional electrons. We are confident that the actual electron number is N > 50. From these estimates it becomes evident that the ring is a many-electron quantum dot. In the literature, energy spectra of few-electron quantum dots have been analysed in detail and shell-filling has been found in so-called artificial atoms [4]. Energy spectra of many-electron quantum dots have so far been analysed by statistical approaches such as random matrix theory (see [26,27] for excellent reviews). The quantum ring structure discussed here is an example of a many-electron quantum dot in which many aspects of the energy spectrum and the quantum states can be understood quantitatively and many others at least qualitatively.
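The two bounds on the electron number follow from simple counting, which the sketch below reproduces. It assumes the 2DEG density ns = 5 × 10¹⁵ m⁻² applies unchanged inside the annulus (upper bound) and that all angular momentum states from −8 to +8 are doubly occupied (lower bound). The explicit count gives 34 + 10 = 44 electrons; the figure N > 50 quoted in the text is the authors' own estimate and is not derived here. Function names are illustrative.

```python
import math


def ring_area(r0, delta_r):
    """Area (m^2) of an annulus with mean radius r0 and width delta_r."""
    outer = r0 + delta_r / 2.0
    inner = r0 - delta_r / 2.0
    return math.pi * (outer**2 - inner**2)


def upper_bound(n_s, r0, delta_r):
    """Electron number if the unperturbed 2DEG density filled the whole annulus."""
    return n_s * ring_area(r0, delta_r)


def occupied_orbitals(ell_max):
    """Number of angular momentum states with |ell| <= ell_max."""
    return 2 * ell_max + 1


def counted_electrons(ell_max, extra_peaks):
    """Spin-degenerate filling up to ell_max plus additionally counted peaks."""
    return 2 * occupied_orbitals(ell_max) + extra_peaks


n_upper = upper_bound(5e15, 132e-9, 65e-9)
n_counted = counted_electrons(8, 10)

print(f"ring area = {ring_area(132e-9, 65e-9) * 1e12:.4f} um^2")
print(f"upper bound N < {n_upper:.0f}")
print(f"occupied orbitals in the oscillating subband = {occupied_orbitals(8)}")
print(f"explicitly counted electrons = {n_counted}")
```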
We support this central statement by a few examples some of which have been discussed in other publications [10,11]. As shown above, angular momentum quantum numbers can be determined from the slopes of the strongly oscillating zig-zag states. This implies that the corresponding wave functions are close to plane wave states extended around the ring. As discussed in [11] the contribution of these states to the persistent currents in the ring is significant and can be quantitatively determined from the spectrum. On the other hand, states with a weak magnetic field dispersion will tend to be localised by some symmetry breaking potential V (x). The wave functions are, however, of such a shape that the states still couple sufficiently to source and drain and a conductance resonance is detected. Localisation of states may therefore most likely occur in the vicinity of source and drain, or in one of the two arms of the ring. We will come back to this issue further below. It has also been shown [11] that interaction effects can be explained quantitatively based on a Hartree-calculation. Screening of the interaction by the presence of the top gate lowers the charging energy significantly. For the same reason the typical exchange energy is negligibly small. Under the assumption of negligible fluctuations of the charging energy this explains why spin-pairs occur in this sample with exceptionally high frequency. 4.4
Asymmetric Plunger Gate Voltages
The nature of states in the ring being either extended (strong dispersion in B, well defined ℓ) or localised (flat dispersion in B) can be further investigated by the application of asymmetric plunger gate voltages. The basic idea is that states localised in the arm near plunger gate 1 will strongly shift in energy when this plunger gate is changed, but weakly shift if plunger gate 2 is changed due to the different lever arms of the two gates on such a state. A state which is symmetric with respect to the axis connecting source and drain is in first order not shifted at all when Vpg1 − Vpg2 is changed while (Vpg1 + Vpg2)/2 is kept fixed.
Fig. 10. (a) Dispersion of three neighbouring quantum states as a function of asymmetrically applied plunger gate voltages. Conductance peak positions were shifted by 310 µeV. (b) The dispersion of the same conductance peaks as a function of magnetic field
Figure 10a shows the dispersion of three neighbouring states measured at B = 2 mT as a function of the asymmetry parameter α given by the difference Vpg1 − Vpg2 converted to energy using the voltage dependent lever arms discussed before. The average voltage (Vpg1 + Vpg2)/2 converted to energy serves as the energy parameter δ. A constant charging energy of 310 µeV has been subtracted from the separation of neighbouring peaks. Around zero asymmetry, states 1 and 2 move strongly to higher energy with increasing α while state 3 depends weakly on asymmetry. We conclude that states 1 and 2 are more localised close to plunger gate 2, but state 3 is extended around the ring. Comparison with the magnetic field dispersion of these three states shown in Fig. 10b strongly supports this conclusion. Around zero magnetic field states 1 and 2 are constant in energy, i.e. localised, state 3 has a large slope, i.e. it has a well defined angular momentum and extends evenly around
the ring. This example shows how the combination of conductance measurements at zero asymmetry but in a magnetic field and those at zero magnetic field and finite asymmetry gives further insight into the nature of the observed quantum states of the Coulomb-blockaded many-electron quantum ring system. A careful analysis of a larger number of states confirms the interpretation developed above.
4.5
Zeeman Splitting of Spin-Pairs in Parallel Magnetic Field
In the preceding discussion of experimental spectra neighbouring zig-zag states have been interpreted as spin-pairs. It is possible to access the spin of the tunnelling electron directly by applying the magnetic field parallel to the plane of the ring [28].
Fig. 11. (a) Conductance peaks of a spin-pair measured in perpendicular magnetic field. (b) The same conductance peaks after in situ rotation of the ring into the parallel magnetic field orientation measured as a function of parallel field. The change in peak separation is the Zeeman splitting of spin-up and down electrons
In Fig. 11a we show two conductance peaks corresponding to a spin-pair measured as a function of perpendicular magnetic field. After this measurement the sample was rotated in situ into the parallel magnetic field orientation (with very high accuracy) keeping the temperature below 600 mK. After the rotation was finished the peak separation was unchanged for all the measured conductance peaks (most of them are not shown here). Figure 11b shows the peaks corresponding to the spin-pair as a function of parallel magnetic field. The separation of the peaks increases linearly with B∥ as expected for Zeeman splitting of spin-up and spin-down. This proves directly that the interpretation of the two zig-zag states observed in B⊥ as a spin-pair is correct. The diamagnetic shift of levels in B∥ which plays an important role in [28] is relatively weak in the quantum ring. It leads to a slight shift of both peaks to higher energies as B∥ increases. This shift is the same for the two
peaks because the corresponding orbital wave functions of the two states are identical.
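The linear growth of the spin-pair separation with parallel field follows the Zeeman relation ΔE = |g| μ_B B. The short sketch below only illustrates the order of magnitude. The g-factor is not quoted in the text; the default |g| = 0.44 (the magnitude commonly cited for bulk GaAs) is an assumption made purely for illustration, and the function name is hypothetical.

```python
# Bohr magneton in micro-electronvolts per tesla (rounded CODATA value).
MU_B_UEV_PER_T = 57.88


def zeeman_splitting_uev(b_parallel_tesla, g_factor=0.44):
    """Return the Zeeman splitting |g| * mu_B * B in micro-eV.

    g_factor is an ASSUMED illustrative value (bulk-GaAs magnitude);
    the text itself does not state the g-factor of the ring.
    """
    return abs(g_factor) * MU_B_UEV_PER_T * b_parallel_tesla


if __name__ == "__main__":
    # The splitting scales linearly with the parallel field.
    for b in (1.0, 2.0, 4.0):
        print(f"B = {b:.1f} T -> splitting = {zeeman_splitting_uev(b):.1f} ueV")
```

Under this assumed g-factor the splitting reaches roughly 0.1 meV at 4 T; a different effective g-factor rescales the result proportionally.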
5 Outlook
This review has tried to give an overview of recent magnetotransport experiments on a quantum ring sample fabricated by AFM-lithography. Some results, like the Aharonov-Bohm oscillations and the dephasing in the open ring, the measurements in the Coulomb blockade regime with asymmetric plunger gate voltages, and the Zeeman splitting, have not been published before. Similar reconstructed energy spectra as a function of B⊥ and questions related to the screened interactions and persistent currents have been discussed in [10,11]. All interpretations of the spectra presented here used the constant interaction picture. However, a close inspection of the data reveals effects beyond this simple model. Variations of the charging energy around an average value can be related to the extended or localised character of the states. Under very special circumstances, even the exchange energy shows up and one can speculate about the existence of voltage-tunable singlet-triplet transitions. Such transitions can also be observed at finite B. From the observation of Zeeman shifts of a larger number of peaks one can try to infer information about the ground state spin of the quantum ring, similar to [28]. The Kondo effect has been observed in this ring when the coupling to source and drain was increased. Its strength can be shown to vary with magnetic field in an h/e-periodic fashion. All these results will be discussed in detail in future publications.
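As a rough guide to what an h/e-periodic dependence means in field units, one period corresponds to one flux quantum h/e threading the ring area, so ΔB = (h/e)/(πr²). The sketch below evaluates this for an idealised one-dimensional ring. The radius used in the example is hypothetical, chosen only to show the scale; it is not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)


def ab_period_tesla(ring_radius_m):
    """Field period for one flux quantum h/e through an ideal ring of radius r."""
    flux_quantum = H_PLANCK / E_CHARGE          # h/e in Wb
    ring_area = math.pi * ring_radius_m ** 2    # enclosed area in m^2
    return flux_quantum / ring_area


if __name__ == "__main__":
    # HYPOTHETICAL radius of 100 nm, for illustration only.
    print(f"Period for r = 100 nm: {ab_period_tesla(100e-9) * 1e3:.1f} mT")
```

Because the period scales as 1/r², doubling the radius reduces the period by a factor of four.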
Quantum rings are also interesting from different viewpoints. Being the standard interferometers in mesoscopic semiconductor physics, they have, for example, been used for the measurement of the transmission phase through a quantum dot embedded in one arm of a ring [29,30]. The detailed understanding of these experiments, in particular the observed phase lapse between conductance peaks, is still controversial and further experiments are certainly needed. Recent experiments on rings with a quantum dot embedded in both arms showed signatures of the Fano effect [31]. In connection with experiments aiming at controlled dephasing [32], high-quality quantum rings that can be coupled to other quantum devices are highly desirable. Given the present interest in phase coherence, dephasing, entanglement and interactions in view of quantum information processing, one can certainly state that experiments with quantum rings still promise many exciting results in the future.
References
1. R. Feynman, R. Leighton, and M. Sands, The Feynman Lectures on Physics, Vol. III (Addison Wesley, 1965).
2. B. van Wees, H. van Houten, C. Beenakker, J. Williamson, L. Kouwenhoven, D. van der Marel, and C. Foxon, Phys. Rev. Lett. 60, 848 (1988).
3. D. Wharam, T. Thornton, R. Newbury, M. Pepper, H. Ahmed, J. Frost, D. Hasko, D. Peacock, D. Ritchie, and G. Jones, J. Phys. C 21, L209 (1988).
4. S. Tarucha, D. Austing, T. Honda, R. van der Hage, and L. Kouwenhoven, Phys. Rev. Lett. 77, 3613 (1996).
5. L. Kouwenhoven, D. Austing, and S. Tarucha, Rep. Prog. Phys. 64, 701 (2001).
6. G. Timp, A. Chang, J. Cunningham, T. Chang, P. Mankiewich, R. Behringer, and R. Howard, Phys. Rev. Lett. 58, 2814 (1987).
7. K. Ishibashi, Y. Takagaki, K. Gamo, S. Namba, S. Ishida, K. Murase, Y. Aoyagi, and M. Kawabe, Solid State Communications 64, 573 (1987).
8. C. Ford and H. Ahmed, Microelectronic Engineering 6, 169 (1987).
9. G. Timp, A. Chang, P. DeVegvar, R. Howard, R. Behringer, J. Cunningham, and P. Mankiewich, Surf. Sci. 196, 68 (1988).
10. A. Fuhrer, S. Lüscher, T. Ihn, T. Heinzel, K. Ensslin, W. Wegscheider, and M. Bichler, Nature 413, 822 (2001).
11. T. Ihn, A. Fuhrer, T. Heinzel, K. Ensslin, W. Wegscheider, and M. Bichler, Physica E 16, 83 (2003).
12. R. Held, T. Vancura, T. Heinzel, K. Ensslin, M. Holland, and W. Wegscheider, Appl. Phys. Lett. 73, 262 (1998).
13. T. Heinzel, R. Held, S. Lüscher, K. Ensslin, W. Wegscheider, and M. Bichler, Physica E 9, 84 (2001).
14. G. Timp, P. Mankiewich, P. deVegvar, R. Behringer, J. Cunningham, R. Howard, H. Baranger, and J. Jain, Phys. Rev. B 39, 6227 (1989).
15. J. Liu, W. Gao, K. Ismail, K. Lee, J. Hong, and S. Washburn, Phys. Rev. B 50, 17383 (1994).
16. M. Büttiker, Phys. Rev. Lett. 57, 1761 (1986).
17. B. Altshuler, A. Aronov, and B. Spivak, JETP Lett. 33, 94 (1981).
18. Y. Imry, Introduction to Mesoscopic Physics, vol. 1 of Mesoscopic Physics and Nanotechnology (Oxford University Press, 2002), 2nd ed.
19. A. E. Hansen, A. Kristensen, S. Pedersen, C. B. Sørensen, and P. E. Lindelof, Phys. Rev. B 64, 45327 (2001).
20. A. Huibers, M. Switkes, C. Marcus, K. Campman, and A. Gossard, Phys. Rev. Lett. 81, 200 (1998).
21. E. Foxman, P. McEuen, U. Meirav, N. Wingreen, Y. Meir, P. Belk, N. Belk, M. Kastner, and S. Wind, Phys. Rev. B 47, 10020 (1993).
22. L. Kouwenhoven, C. Marcus, P. McEuen, S. Tarucha, R. Westervelt, and N. Wingreen, in Nato ASI conference proceedings, edited by L. P. Kouwenhoven, G. Schön, and L. Sohn (Kluwer, Dordrecht, 1997), pp. 105–214.
23. C. Beenakker, Phys. Rev. B 44, 1646 (1991).
24. M. Büttiker, Y. Imry, and R. Landauer, Phys. Lett. 96A, 365 (1983).
25. N. W. Ashcroft and N. D. Mermin, Solid State Physics (Saunders College Publishing, 1976).
26. C. Beenakker, Rev. Mod. Phys. 69, 731 (1997).
27. Y. Alhassid, Rev. Mod. Phys. 72, 895 (2000).
28. S. Lindemann, T. Ihn, T. Heinzel, W. Zwerger, K. Ensslin, K. Maranowski, and A. Gossard, Phys. Rev. B 66, 195314 (2002).
29. A. Yacoby, M. Heiblum, D. Mahalu, and H. Shtrikman, Phys. Rev. Lett. 74, 4047 (1995).
30. R. Schuster, E. Buks, M. Heiblum, D. Mahalu, V. Umansky, and H. Shtrikman, Nature 385, 417 (1997).
31. K. Kobayashi, H. Aikawa, S. Katsumoto, and Y. Iye, Phys. Rev. Lett. 88, 256806 (2002).
32. E. Buks, R. Schuster, M. Heiblum, D. Mahalu, and V. Umansky, Nature 391, 871 (1998).
Polymer-Defined Semiconductor Nanostructures
Klaus Thonke
Abteilung Halbleiterphysik, Universität Ulm, D-89069 Ulm, Germany
Abstract. Semiconductor nanostructures with sizes down to a few nanometers can be defined in different, highly parallel processes with the aid of polymers. In a “top-down” approach, either micelles formed by diblock-copolymers or other self-assembled patterns like “breath figures” or imprints of colloids in polymer films are used for the formation of metal masks of different sizes on semiconductors in a first preparation step. These metal masks are then used in an anisotropic dry-etching process for the shaping of pillars in the semiconductor material. After additional evaporation and etching steps, dense patterns of nano-holes in metal films can be produced. In a “bottom-up” approach, micelles act as nano-reactors for the formation of metal clusters of Zn, Cd, Mg or Cu, which can then be oxidized so as to form nanocrystals of the oxidic semiconductors ZnO, CdO, MgO, and CuO. When Au salts are added to the micelles, Au dots remain after removal of the organic compounds. These can be used as catalysts for the growth of ZnO nanopillars with high aspect ratio. Even ring-shaped (Zn,Cd)O semiconductor patterns can be formed starting with nanoporous polymer membranes, which are loaded with appropriate metal salts and then removed in an oxygen plasma, leaving “donut”-like (Zn,Cd)O rings. Production, structure and optical properties of these nanostructures are discussed.
1 Introduction
For many future applications in chemistry, biology, electronics, micromechanics, and “life science”, structures with sizes well below 1 µm will be needed. Even smaller sizes of only a few nm might be of interest both for the production of quantum confining structures and for purposes of sorting, mixing, and attaching of macromolecules or cells. Nano-sized structures which are shaped into dots, pillars, and tubes, mostly attached to substrates, or also holes in membranes, need to be designed and produced. Photolithography is difficult and expensive to use for such sizes below 1 µm and currently is limited to ≈ 100 nm feature size. E-beam or ion-beam writing are alternative lithography techniques offering full flexibility, but due to their serial nature they are hampered by low throughput and high costs. Since many applications require no perfect long-range order or absolute positioning, methods of structure definition which make use of the self-organizing properties of certain polymers [1] offer an attractive alternative. Such concepts still allow a good short-range order and a narrow distribution of sizes, and they are, due to their parallel character, very efficient, fast, and cheap. Mostly, no expensive setups are needed for the generation of patterns. In combination with the creation of a surface relief via conventional photolithography, perfect hexagonal order can be achieved even over larger areas [2].
In this article we show the application of different types of polymer patterns for the definition of various types of nanostructures, mainly in inorganic semiconductors. The self-organized polymer masks can be applied in both approaches: they can either be used, mostly after an intermediate step of metal evaporation, in a “top-down” etching procedure for the shaping of homogeneous or layered substrates like semiconductor quantum wells into pillars, dots, etc., or they can be used in a “bottom-up” process directly as molds for the semiconductor precursor components, or indirectly for the metal catalyst, which in subsequent steps allows for the growth of semiconductor nanorods.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 155–170, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Diblock-Copolymer Micelles Used as “Nanoreactors”
Diblock-copolymers consist of two immiscible parts of different polymers joined end to end. When these two partial chains have different properties (e.g. one polymer soluble in polar solvents, the other one only in non-polar solvents), self-organizing ordering processes occur under suitable conditions. In the experiments presented here, mainly a specific diblock-copolymer was used, consisting of the blocks polystyrene (“PS”, typically 1000–1400 units long) and poly(2-vinylpyridine) (“P2VP”, typically 400 units long) (Fig. 1). When this diblock-copolymer with unequal lengths of the components is brought into the non-polar solvent toluene, the non-polar PS chains stay in contact with the solvent and form the shell of “micelles”¹, whereas the polar P2VP ends of the polymer stick together to form the core of the micelles [1]. By changing the length of the polymer blocks, the diameter of the core can typically be adjusted between 1 and 20 nm, whereas the distances are variable between 10 and 150 nm [3]. In the next step, the core of the micelle serves as a “nanoreactor”: metal salts are added to the solution, which can then attach selectively to the polar core. We typically used a “load factor” L = 0.5, which means that on average one salt unit binds to every second P2VP monomer [4]. The polymer can now be removed quantitatively by an oxygen or hydrogen plasma. At the same time, the metal salt is reduced, leaving behind pure Au dots in the case of Au or, in the case of an oxygen plasma, oxidized particles for most of the other metal salts.
2.1 The Top-Down Approach
¹ From Latin: mica = small crumb.
Fig. 1. Left: Structure of the polystyrene/poly(2-vinylpyridine) diblock-copolymer typically used in our experiments. Right: In a non-polar solvent, (inverse) micelles can form. Metal salts can attach preferentially to the polar part of the polymer. [Schematic labels: block A, block B; core P2VP, corona PS; polar, non-polar; metal salt, e.g. HAuCl4, CdCl2, ZnCl2; loaded micelle; polystyrene (PS) (×1000); poly(2-vinylpyridine) (P2VP) (×400); H+ [AuCl4]−.]
We make use of the micelle-generated metal-salt agglomerations to first produce, by plasma treatment, Au (or equivalently Ti, Pt, etc.) metal dots, which can then be used as masks for etching. For this purpose we dip a substrate, e.g. a silicon or GaAs wafer, into the solution with the already prepared salt-loaded micelles and pull it out slowly. If the concentration, pulling speed, etc. are selected suitably, a monomicellar film spreads over the substrate. Subsequently, this film is treated by an oxygen plasma to remove the polymer completely. In the case of Au salt loading, Au nanocrystals are left on the surface, which shortly after the plasma process are partially oxidized, but which then turn into small, pure Au crystals within a few hours [5]. In this way, a dense (≈ 10¹⁰ cm⁻²) array of Au dots with narrow size distribution (depending on the length distribution of the polymers and the perfection of loading) is produced. The short-range order is hexagonal, as can be seen in Fourier-transformed scanning electron microscope (SEM) pictures, but a long-range order, of course, is not present. To achieve such a long-range order, additional measures like “grapho-epitaxy” [2] would be necessary.
Au dots on GaAs substrates prepared as described over areas of several cm² were taken as masks in a highly anisotropic reactive ion beam etching (RIBE) process. In this case, the polymer was removed in a hydrogen plasma instead of the oxygen plasma, since oxidation of the GaAs surface has to be avoided. Before etching, the Au-dot-decorated GaAs wafers (see Fig. 2, left) were briefly dipped into HF to remove oxide layers, and then dry etched under optimized conditions using chlorine as reactive species. We obtain a dense, quasi-hexagonal array of very thin GaAs pillars with a maximum height of 90 nm [6] (Fig. 3). Their diameters are ≈ 8 nm, as defined by the diameter of the Au dots. Further etching leads to a collapse of the pillars due to underetching. The same concept was then used for the production of quantum dots by etching narrow pillars out of quantum well (QW) heterostructures.
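The quoted dot density of ≈ 10¹⁰ cm⁻² can be cross-checked against the average dot distance of ≈ 60 nm given in the caption of Fig. 2. The sketch below assumes an ideal hexagonal lattice, for which the areal density is n = 2/(√3 a²); the real arrays have only short-range order, so this is an estimate, and the function name is hypothetical.

```python
import math


def hex_density_per_cm2(spacing_nm):
    """Areal density of an ideal hexagonal lattice, n = 2 / (sqrt(3) * a^2)."""
    spacing_cm = spacing_nm * 1e-7  # 1 nm = 1e-7 cm
    return 2.0 / (math.sqrt(3.0) * spacing_cm ** 2)


if __name__ == "__main__":
    # 60 nm is the average distance quoted for Fig. 2; 100 nm is shown for comparison.
    for a in (60.0, 100.0):
        print(f"a = {a:.0f} nm -> n = {hex_density_per_cm2(a):.2e} cm^-2")
```

A spacing of 60 nm gives about 3 × 10¹⁰ cm⁻², consistent in order of magnitude with the density stated in the text.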
We performed these experiments on GaAs QWs embedded in Al0.37Ga0.63As barriers, and on In0.1Ga0.9As QWs embedded in GaAs barriers. In either case, the top barrier was kept relatively thin (≈ 20 nm), and the QW had a thickness between 5 and 25 nm. In these ternary compound semiconductors, too, the anisotropic etching process works with an aspect ratio of up to 1:10, thus allowing quantum dots (QDs) with a diameter of around 8 nm to be defined, which are enclosed by barrier material on top and on the bottom (Fig. 4). Low-temperature photoluminescence (PL) measurements on the as-etched structures have so far failed to detect any signal unambiguously related to the QDs. Presumably, the surface layer of the pillars is too heavily damaged by the etching process, leading to nonradiative recombination of the photo-excited electron-hole pairs. Attempts to improve the crystal quality by annealing failed. Either the pillars have to be covered by some organic or inorganic passivation layer, or they have to be epitaxially overgrown by the barrier material, a task not yet solved.
The substrate with the very narrow GaAs pillars can furthermore be used for an “image reversal” process to produce a gold film with ordered nanoholes, i.e. a nanoporous membrane. For this purpose, Au is evaporated on the structure shown in Fig. 2 (right) to a height of 50 nm, lower than the pillar height. Chemical wet etching now attacks the tip of the GaAs pillars, and the pillars are totally removed, leaving behind nano-holes of ≈ 8 nm diameter in the Au film. Longer etching detaches the Au film completely from the substrate, so that it can be transferred to any other substrate (e.g., glass) or used freely suspended. An SEM micrograph of such a film is shown in Fig. 5. Applications such as sieves for macromolecules suggest themselves. Similarly, “anti-dot” arrays in semiconductor heterostructures could be produced.
Fig. 2. Left: SEM micrograph (top view) of Au nanodots on a silicon surface. The average diameter of the dots is ≈ 8 nm, the average distance ≈ 60 nm. Right: Side view of an array of etched GaAs needles defined by the micelle-generated Au dots; typical diameter ≈ 8 nm, typical distance ≈ 60 nm, height ≈ 50 nm
Fig. 3. Left: AFM micrograph of etched GaAs pillars (top view). The shape is broadened due to the finite AFM tip size. Right: AFM cross section of this sample along the solid line in the left figure. The average etching depth is ≈ 70 nm
Fig. 4. TEM cross section of pillars etched from quantum well layers, thus forming quantum dots (QDs) sandwiched in the middle of the pillars. (The bending is caused by the TEM preparation process.) Remaining on top of the pillars are the Au nanocrystals with a diameter of ≈ 7 nm, which were used as etching masks [6]. Left: GaAs QD embedded in AlGaAs barriers. Right: InGaAs QD embedded in GaAs barriers
Fig. 5. Left: After evaporation of an additional Au layer on the GaAs substrate with needles, a wet chemical lift-off process can remove the needles, leaving behind a nano-porous Au film. Right: SEM micrograph of such a film with 50 nm thickness and short-range ordered holes of ≈ 8 nm diameter. [Schematic labels: gold; lift-off in H2SO4:H2O2:H2O (1:1:12); substrate.]
2.2 The Bottom-Up Approach
The technique of loading micelles with metal salts can also be used for the preparation of semiconductor quantum dots in different constructive ways. The most direct one is the generation of oxidic semiconductor QDs composed of materials such as CdO, ZnO, MgO, and CuO. For this purpose, the micelles are first loaded with the appropriate metal salt (mostly chlorides), then a suitable substrate is dipped into the solution, and the polymer is removed in an oxygen plasma, which at the same time oxidizes the metal ions. In this way, the patterns shown in Fig. 6 were prepared. For ZnO and CuO/Cu2O (not shown in Fig. 6), we obtain relatively dense and regular patterns of QDs. In the case of CdO, the AFM micrographs seem to indicate that the salt loading of the micelles is incomplete, and that some of the micelles might even be lost totally. X-ray photoelectron spectroscopy (XPS) indeed detects Zn, Cd, or Cu as expected, and in the case of Zn proves by a shift of the 2p3/2 line to 1022.3 eV that it is oxidized. For Cd, the 3d5/2 peak is shifted to 405.9 eV, presumably caused by charging effects. For Cu, we find both oxidation states, CuO and Cu2O. Similarly, an initial preparation of In or Ga dots and subsequent treatment in an As ambient under MBE conditions could lead to the growth of InAs or GaAs QDs [9]. Besides their use as etching masks, the above-mentioned Au dots can also serve in a constructive way. Huang et al. [10] have reported that Au can act as a catalyst in a “vapor-liquid-solid” (VLS) process for the growth of ZnO nanopillars on a-plane sapphire substrates. Analogous growth of ZnO rods, but with the significantly smaller micelle-generated Au dots as catalysts, was carried out by Glass et al. [11] and Cao et al. [13]. The basic scheme of this VLS process is depicted in Fig. 7.
Fig. 6. Left: AFM micrographs of micelle-generated ZnO short-range ordered dots of ≈ 8 nm diameter. Right: In the case of CdO dots, a less dense array of dots of an average size of ≈ 3 nm results [7,8]
Fig. 7. Scheme of the VLS process used for growth of the ZnO nanorods. [Schematic labels: Ar, Zn, COx flow; Au cluster; Zn-Au alloy, liquid; ZnO nanowire growth ∥ c-axis, diameter 30 nm, height up to 800 nm; a-plane sapphire.]
A stream of Zn vapor is generated directly from Zn powder or from a ZnO/graphite powder mixture heated to ≈ 1000 °C. This Zn gas stream flows over the substrate covered by the Au catalyst dots and forms an Au/Zn alloy there, thus lowering the melting point of the resulting AuZn dot. When the temperature is lowered, the dots become supersaturated with Zn, and ZnO begins to grow when oxygen (from O2 or CO) is present in the quartz tube. The c-plane lattice constant of ZnO matches that of a-plane sapphire nearly perfectly, hence the ZnO rods start to grow perpendicularly on a-plane sapphire substrates. The Au catalyst dots float on top of the pillars, as can be seen in high-resolution SEM micrographs in backscattered electron detection mode (Fig. 8 B). The diameter of the pillars can be controlled by the size of the original Au catalyst dots. In the present case, dots with ≈ 8 nm diameter result in pillars with ≈ 30 nm diameter. It is remarkable that in macroscopic low-temperature PL investigations the total emission of such pillars is almost as bright as that from state-of-the-art commercial ZnO substrates, despite the fact that only a few percent of the substrate area is covered by ZnO (Fig. 10). The donor-bound exciton lines (D⁰,X) are slightly broader, but the donor-acceptor pair transition (D⁰,A⁰) related to unidentified impurity species is much weaker, indicating purer material. Presumably, some kind of self-purification takes place during the VLS pillar growth process. Pillars with diameters from 60 nm upward have been demonstrated to act as single-wire lasers [10]. Again, other applications such as sensors, fluid mixers, etc. are appealing.
3 “Breath Figures” Forming Polymer Nano-honeycombs
Self-organized patterns with dimensions between 0.2 and 1 µm can be produced by the use of condensation patterns of water vapor, so-called “breath figures”. These are named after the ordered patterns of water droplets created, e.g., when water vapor condenses on a cold glass window [14]. This hexagonally ordered array can be imprinted into a polymer [15]. In our work, a solution of the polymer P(MMA-co-MA(HFPO3)-50/50) in freon was spread on the surface of a silicon wafer, over which a stream of humid air was blown under well-controlled atmospheric conditions (temperature, humidity, solvent concentration). The water vapor condenses and forms water droplets, which initially grow and order in a hexagonal array, while the solvent evaporates and cools the surface. The short copolymer is suited to delay the coagulation of the water droplets. The hydrophilic “backbone” of the polymer dissolves in the water droplets, whereas the hydrophobic fluorinated (HFPO3) side chains stabilize the surface towards the solution. Convection streams in the solvent support the ordering process. If the parameters are suitably selected, the polymer just vitrifies before the droplets start to coagulate. The resulting honeycomb pattern is shown in Fig. 11. The size of the droplet imprints can be controlled by variation of the relative humidity, the velocity of the air stream, and the temperature. A faster process results in smaller drops, which can be brought down in diameter to ≈ 200 nm. Areas of several cm² can be covered with a very good short-range order.
Fig. 8. SEM micrograph of ZnO pillars grown by the VLS process, showing pillars with ≈ 30 nm diameter and 800 nm height. The bending of the pillars is an artifact due to charge pile-up during the scan. (A) Picture recorded with secondary electrons. (B) Picture recorded with backscattered electrons. The brighter spots on top hint at the heavier element Au [12]
Fig. 9. SEM micrograph of ZnO pillars grown similarly as in Fig. 8, showing pillars with ≈ 100 nm diameter and 1.5 µm height [13]
Fig. 10. PL spectrum (at T = 5 K) of the ZnO pillars from Fig. 8 (middle trace), from a commercial ZnO substrate as reference (upper trace), and from the sapphire substrate covered only with Au dots without ZnO (lower trace). (D⁰,X) marks donor-bound excitons, TES the related two-electron transitions, (D⁰,A⁰) a donor-acceptor pair band, and FE a free exciton recombination [12]. [Plot axes: log(PL intensity) versus energy (3.20 to 3.40 eV), with a wavelength scale (365 to 385 nm) on top; labelled features: (D⁰,X), (D⁰,X)-LO, (D⁰,A⁰), (D⁰,X):TES, FE; traces: ZnO substrate, ZnO pillars, Au clusters on sapphire.]
Fig. 11. Left: Scheme of the creation process of vitrified breath figures (after [15]). Right: SEM micrograph of a polymer breath figure on a Si substrate, showing a nano-honeycomb with ≈ 1.3 µm periodicity and a height of ≈ 150 nm of the sidewalls, as found in AFM measurements [7]. [Schematic labels: water condensation, solvent evaporation, polymer solution, substrate.]
Fig. 12. After evaporation of gold on the polymer honeycomb structure, a gold net can be lifted off, and gold disks remain on the substrate surface [7,16]
For the purpose of producing semiconductor nanostructures, we first have to increase the contrast of this pattern under dry etching conditions. Tests with different metals have shown that gold sticks best to the surface and best preserves the shape of the edges [7,16]. Directed evaporation of a 50 nm gold layer onto the polymer structure produces an array of gold disks in the holes of the array, and a separate gold net residing on top of the side walls. After lift-off of the net (which can be used for further applications, as will be discussed below), the polymer is also removed by solvents, and solely the Au disks remain on the substrate (Fig. 12). The substrate with the Au disk mask was then subjected to a highly anisotropic etching process for silicon, the so-called “Bosch process”, in an ICP-RIE reactor. This process makes it possible to generate Si pillars exhibiting diameters with a very narrow distribution, arranged in a rather perfect short-range hexagonal order, and of ≈ 20 µm height (Fig. 13). Since the dielectric constant of Si is high (ε = 11.9), the production of two-dimensional photonic crystals with a complete bandgap should be feasible with this technique.
Fig. 13. Regular pillars of silicon, arranged in hexagonal order, created with the “breath figure”-defined Au disks depicted in Fig. 12 [7,16]
Alternatively, isotropic dry etching in a parallel plate reactor can be performed, using mainly SF6 as the reactive gas. We applied this process to a SiGe quantum well on a Si substrate. Due to strong under-etching, “mushroom”-shaped nanostructures result, as depicted in Fig. 14. The initial SiGe layer is now located in the upper, thin part of the pillar. A further decrease of the lateral dimensions should make it possible to produce SiGe quantum dots.
We now return to possible applications of the Au net, which was lifted off from the honeycomb patterns (Fig. 12). On the one hand, we can again use it as a mask for dry etching. For this application, we transferred it onto a GaAs substrate and treated it with an anisotropic chlorine-based RIBE process. In this way, a dense, regular array of holes can be produced [16]. On the other hand, the Au net can be transferred onto virtually any substrate, e.g., glass. This metal structure with its regular holes can be used as an optical short-pass filter. Approximately, only light with wavelengths less than twice the hole diameter can pass through, hence the structure acts as a “metallic photonic filter” [17]. Indeed, optical transmission experiments on these filters show a cut-on at a wavelength of around twice the hole diameter (see Fig. 15). The transmission characteristics can be modelled as a metal sheet with a hexagonal arrangement of holes, each of which acts as a cylindrical waveguide [18]. The cut-on edge, which is relatively broad since the thickness of the metal sheet is only 50 nm, could be sharpened if the thickness of the film were increased further. The transmission towards higher energies is proportional to the ratio of the hole area to the total area. With a hole diameter decreased to ≈ 100 nm, technically very interesting UV short-pass filters could be realized. The imperfect long-range order found experimentally is unimportant in such an application; only the distribution of the hole sizes has to be narrow enough.
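The two rules of thumb stated above (cut-on near twice the hole diameter, and high-energy transmission set by the open-area fraction) can be illustrated with the parameters listed for the theoretical curve in Fig. 15 (s = 1960 nm, d = 1680 nm). The sketch below is not the model of [18]. It assumes that s is the period of an ideal hexagonal lattice of circular holes, and for comparison with the factor-of-two rule it also evaluates the textbook TE11 cutoff of a perfectly conducting circular waveguide, λc = πd/1.8412. All function names are hypothetical.

```python
import math

TE11_ROOT = 1.8412  # first zero of J1'(x); sets the TE11 cutoff of a circular guide


def cutoff_wavelength_nm(hole_diameter_nm, rule="2d"):
    """Cut-on wavelength: '2d' is the rule of thumb from the text,
    'te11' is the ideal circular-waveguide cutoff (an added comparison)."""
    if rule == "2d":
        return 2.0 * hole_diameter_nm
    if rule == "te11":
        return math.pi * hole_diameter_nm / TE11_ROOT
    raise ValueError("rule must be '2d' or 'te11'")


def to_wavenumber_cm1(wavelength_nm):
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def open_area_fraction(hole_diameter_nm, period_nm):
    """Hole area divided by unit-cell area for a hexagonal lattice of circular holes."""
    hole_area = math.pi * (hole_diameter_nm / 2.0) ** 2
    cell_area = math.sqrt(3.0) / 2.0 * period_nm ** 2
    return hole_area / cell_area


if __name__ == "__main__":
    d, s = 1680.0, 1960.0  # values listed in Fig. 15; s ASSUMED to be the lattice period
    for rule in ("2d", "te11"):
        lam = cutoff_wavelength_nm(d, rule)
        print(f"{rule}: cut-on at {lam:.0f} nm = {to_wavenumber_cm1(lam):.0f} cm^-1")
    print(f"open-area fraction: {open_area_fraction(d, s):.2f}")
```

With these numbers the open-area fraction is about two thirds, and a hole diameter of 100 nm would place the cut-on near 200 nm under the factor-of-two rule, in line with the UV filter mentioned in the text.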
Fig. 14. Result of isotropic etching of a Si/SiGe/Si heterostructure. The Au disk on top is strongly underetched, and embedded SiGe (quantum) disks are produced [7,16]
Fig. 15. Left: Model system used for calculation. Right: Optical transmission of a Au metal grid short-pass filter and model calculation [16]. [Plot axes: transmission (0 to 1) versus wavenumber (4000 to 14000 cm⁻¹); parameters used for the theoretical curve: s = 1960 nm, d = 1680 nm, l = 50 nm.]
4 Nanoporous Membranes
The third basic type of polymer structure which we employed is the nanoporous membrane, generated via colloid imprint in polymers [19]. This type of mask allows the formation of both rings and disks. The scheme for the production of this type of mask is as follows (see Fig. 16): A hydrophobic monomer is poured on water in a Langmuir-Blodgett trough together with specially coated silica colloids. The silica colloids order into a hexagonal close packing, with the monomer wetting the surface of the spheres and filling the space in between. The monomer is then cross-linked by UV irradiation. Subsequently, the floating film is picked up, and the silica colloids are removed by HF vapor, leaving a nanoporous membrane. The thickness of this film is controlled by the initial amount of monomer, and the hole diameters are as monodisperse as the diameters of the silica colloids used. For our experiments, membranes with typical hole diameters of 300 nm were realized (Fig. 17, left).
Fig. 16. Scheme for the production of nanoporous masks and of ZnO rings [19,20]. [Schematic labels: hybrid film of silica colloids and monomer on water; UV irradiation; cross-linked polymer; transfer onto substrate by Langmuir-Blodgett technique; colloids removed, nanoporous membrane; Zn/Cd acetate filled into holes; oxygen plasma.]
Fig. 17. Left: Nanoporous polymer membrane transferred onto mica [19]. Right: ZnO donuts of ≈ 300 nm outer diameter and ≈ 150 nm height created via nanoporous membranes [20]
4.1 The Bottom-Up Use for Making Rings
To form rings of binary or ternary oxidic semiconductors like (Zn,Cd)O, the nanoporous membrane is transferred onto a substrate like silicon or sapphire, and a controlled amount of Cd and/or Zn acetate is filled into the pores, where it preferentially wets the side walls (Fig. 16, bottom part). Subsequent treatment in an oxygen plasma removes the polymer, reduces the acetate, and leaves behind oxidized metallic nano-sized crystallites sticking together to form a ring (Fig. 17, right) [20]. Energy-dispersive X-ray spectroscopy (EDX) measurements on these rings confirm the change from ZnO via (Zn,Cd)O to CdO, as intended by the variation of the Zn acetate to Cd acetate ratio. Low-temperature PL spectra recorded directly after production show only a weak signal dominated by the yellow defect-related band around 2 eV [20]. This band increases further on annealing at Ta = 500 °C and at Ta = 700 °C. Presumably, competing non-radiative recombination channels are annealed out, but many defects are still present. Finally, annealing at Ta = 900 °C reduces the defect band at 2 eV and increases the near-bandgap bound-exciton PL, which becomes the dominant feature (Fig. 18), clearly proving the crystalline nature of the material. Further annealing under atmospheric conditions at Ta = 1100 °C leads to a decrease of the PL signal due to partial sublimation of the material, as directly observed by SEM.
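The PL features above are quoted in energy, while the spectra in Figs. 10 and 18 also carry a wavelength scale. The two are related by λ = hc/E with hc ≈ 1239.84 eV nm. The helper below is a minimal sketch of this conversion using vacuum wavelengths; the function names are hypothetical.

```python
HC_EV_NM = 1239.84  # h*c in eV*nm (rounded), for vacuum wavelengths


def energy_to_wavelength_nm(energy_ev):
    """Photon energy in eV -> vacuum wavelength in nm."""
    return HC_EV_NM / energy_ev


def wavelength_to_energy_ev(wavelength_nm):
    """Vacuum wavelength in nm -> photon energy in eV."""
    return HC_EV_NM / wavelength_nm


if __name__ == "__main__":
    # Defect band near 2 eV and the edges of the near-bandgap window shown in the figures.
    for e in (2.0, 3.20, 3.40):
        print(f"{e:.2f} eV -> {energy_to_wavelength_nm(e):.1f} nm")
```

The defect band around 2 eV thus lies near 620 nm, and the near-bandgap window of 3.20 to 3.40 eV corresponds to roughly 387 to 365 nm.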
[Figure 18 plot: PL intensity versus energy (eV) from 3.20 to 3.45 eV, with a wavelength axis (nm) from 385 to 360 nm; spectra after annealing at 700 °C, 900 °C, and 1100 °C; the (D0,X) bound-exciton line is marked.]
Fig. 18. Photoluminescence of ZnO donuts after various annealing stages [20]
4.2
The Top-Down Use for Disk-Shaped Masks
The nanoporous polymer membranes of Fig. 16 have also been used to generate a pattern of metallic discs by evaporating a thin metal layer onto the structure. The process is similar to that described in section 3 for the “breath figures”, although the structure sizes are now significantly smaller. The ordered metal discs serve as etch masks in an anisotropic etching process that defines an array of pillars in the underlying semiconductor material. The result of the first step of this process is depicted in Fig. 19 (left): the SEM micrograph shows a view of the polymer mask with embedded gold dots, after the top gold layer (forming a net as in Fig. 12) has been removed [21]. For this type of mask, with its small hole size, the diameter of the resulting Au dots depends not only on the initial colloid size but also on the thickness of the polymer film, since the holes are narrower towards the surface and tend to close during metal evaporation in a kind of self-limiting process. Fig. 19 (right) shows an array of silicon pillars obtained by using these gold disks as masks in a highly anisotropic etching process on a silicon substrate. The ordered pillars in this case have a diameter of ≈ 150 nm and a height of ≈ 1 µm.
5
Conclusions
We have investigated and demonstrated the feasibility of patterning inorganic semiconductors on the length scale of 10 nm to 1 µm by making use of self-organizing patterns of diverse polymers. In both a “top-down” and a “bottom-up” approach, dots, rings, pillars, and holes can be created with good short-range order and constancy of diameters.
Fig. 19. Left: SEM micrograph (side view, 45°) of a nanoporous polymer mask, onto which a ≈ 50 nm thick Au layer was evaporated, after removal of the Au net formed on top of the polymer structure. Right: Etched Si pillars using these Au disks as masks [21]
Acknowledgements
The present report focusses on work done in the Abteilung Halbleiterphysik, Universität Ulm (together with M. Haupt, A. Ladenburger, S. Miller, and X. Cao) in close collaboration with the Abteilung Organische Chemie III (M. Möller², J. Spatz³, W. Gödel, A. Mourran, S. Riethmüller, R. Glass†, H. Xu, F. Yan, C. Hartmann and others), with the Zentrale Einrichtung Elektronenmikroskopie (F. Banhart and P. Walther), and with the Abteilung für Oberflächenchemie und Katalyse (H. Rauscher, J. Behm). J. Konle and H. Presting (Daimler Chrysler AG) gave expert support in the specialized etching process (“Bosch Process”) for the high pillars in silicon, and P. Unger and H. Wolff (Abteilung Optoelektronik, Universität Ulm) in the RIBE process used for GaAs pillars. R. Sauer and A. Waag contributed valuable discussions and ideas. I also want to thank R. Sauer for careful reading of the manuscript. The financial support of the Sonderforschungsbereich 569 of the Deutsche Forschungsgemeinschaft and the Graduiertenkolleg “Molecular organisation and dynamics at interfaces” is gratefully acknowledged.
² Now with Inst. für Textilchemie und Makromolekulare Chemie, RWTH Aachen, Germany.
³ Now with Inst. für Phys. Chem. und Biophys. Chemie, Univ. Heidelberg, Germany.
References
1. I. W. Hamley: The Physics of Block Copolymers (Oxford University Press, Oxford, 1999).
2. R. A. Segalman, H. Yokoyama, E. J. Kramer, Adv. Mater. 13, 1152–1155 (2001).
3. J. P. Spatz, P. Eibeck, S. Mößmer, M. Möller, T. Herzog, P. Ziemann, Adv. Mater. 10, 849 (1998).
4. J. P. Spatz, T. Herzog, S. Mößmer, M. Möller, P. Ziemann, ACS Symp. Ser. 706, 12 (1997).
5. G. Kästle, H.-G. Boyen, F. Weigl, G. Lengl, Th. Herzog, P. Ziemann, S. Riethmüller, O. Mayer, C. Hartmann, J. P. Spatz, M. Möller, M. Ozawa, F. Banhart, M. G. Garnier, P. Oelhafen, Advanced Functional Materials, accepted (2003).
6. M. Haupt, S. Miller, A. Ladenburger, R. Sauer, K. Thonke, J. P. Spatz, S. Riethmüller, M. Möller, F. Banhart, J. Appl. Phys. 91, 6057 (2002); M. Haupt, S. Miller, K. Bitzer, K. Thonke, R. Sauer, J. P. Spatz, S. Mößmer, C. Hartmann, M. Möller, phys. stat. sol. (b) 224, 867–870 (2001).
7. M. Haupt, PhD thesis, University of Ulm (2003).
8. M. Haupt, A. Ladenburger, R. Glass, W. Roos, H. Rauscher, S. Riethmüller, M. Möller, R. Sauer, J. P. Spatz, K. Thonke, Proceedings of the 26th ICPS, Edinburgh (2002) (in press).
9. A. Waag, private communication.
10. M. H. Huang, S. Mao, H. Feick, H. Yan, Y. Wu, H. Kind, E. Weber, R. Russo, P. Yang, Science 292, 1897 (2001).
11. R. Glass et al., to be published.
12. M. Haupt, A. Ladenburger, R. Sauer, K. Thonke, R. Glass, W. Roos, J. P. Spatz, H. Rauscher, S. Riethmüller, M. Möller, J. Appl. Phys., scheduled for Vol. 93 (May 2003).
13. X. Cao et al., to be published.
14. T. J. Baker, Philos. Mag. 44, 752 (1922).
15. A. Mourran, S. S. Sheiko, M. Krupers, M. Möller, PMSE Proceedings of the Am. Chem. Soc. 80, 175 (1999); A. Mourran, S. S. Sheiko, M. Möller, PMSE Proceedings of the Am. Chem. Soc. 81, 426 (1999).
16. M. Haupt, S. Miller, A. Mourran et al. (to be published).
17. J. S. McCalmont, M. M. Sigalas, G. Tuttle, K.-M. Ho, C. M. Soukoulis, Appl. Phys. Lett. 68, 2759 (1996).
18. C. C. Chen, Transactions on Microwave Theory and Techniques MTT-21, 1 (1973).
19. H. Xu, W. A. Goedel, Langmuir 19(12) (2003).
20. A. Ladenburger, M. Haupt, R. Sauer, K. Thonke, H. Xu, W. A. Goedel, Physica E 17, 489 (2003).
21. A. Ladenburger, F. Yan et al., to be published.
Interaction of Palladium Nano-Crystals with Hydrogen During PECVD Growth of Carbon Nanotubes
Wilfried Wunderlich and Masaki Tanemura
Nagoya Institute of Technology, Department of Environmental Technology, 466-8555 Nagoya, Japan
Abstract. Carbon nanotubes (CNT) were successfully produced on Pd specimens by plasma-enhanced chemical vapor deposition (PECVD) with acetylene and ammonia. Two different devices are compared, and the conditions for best growth are explained. Detailed analysis of electron diffraction patterns obtained by transmission electron microscopy (TEM) of as-grown specimens showed an expansion of the Pd lattice, which can be explained by the formation of fcc-PdHx for x = 0 . . . 0.7. For x = 0.7 . . . 2, the first investigation of hexagonal PdH2 is reported, whose lattice spacing is independent of the hydrogen content. The amount of fcc-PdH and hcp-PdH2 increases when the specimens are subsequently treated with hydrogen. A growth model is provided.
1
Introduction
Since the discovery of carbon nanotubes (CNT) [1], their processing has made great progress, and the possibility of controlled nanotube growth has stimulated possible applications: they can be used, for example, as micro-size X-ray emitters [2-3], in flat displays, as STM tips, or for hydrogen storage. The plasma-enhanced chemical vapor deposition (PECVD) method [4] is one of the most efficient production methods for nanotubes or their closed variants, graphite nano-fibers (GNF). The hydrocarbon gas is decomposed at the metal surface, in this paper Pd, yielding excess carbon, which forms the carbon walls, and excess hydrogen gas. Hydrogen is usually considered to be stored interstitially in the octahedral vacancies of nano-crystalline fcc-Pd, with the possibility of an anisotropic lattice expansion in the [111] and [100] directions [5]. Hydrogen charging to a concentration of PdH0.706 expands the lattice constant to 0.4049 nm, compared with 0.3906 nm in bulk Pd [6]. During CNT growth by the PECVD process, the inlet gases decompose catalytically and atomic hydrogen is released, which is partly stored in the metallic nano-crystals lying inside the CNT. Because the generated hydrogen is known to be stored in the metallic lattice of the Pd nano-crystals topping the CNT, this composite is also considered as part of a device for hydrogen storage: it is able to store an amount of hydrogen several times larger than its atomic volume. Studies of the influence of hydrogen on the metallic nano-crystals inside the CNT are still lacking. The measurement of the lattice expansion of the palladium nano-crystallites inside the CNT is the main theme of this paper.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 171–177, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Experimental
Graphite nanotubes were grown by the PECVD technique on a 50 µm thin Pd wire, as described previously [2-4]. Two devices were tested, one with a large reaction chamber (about 2 liters, Fig. 1a), the other with a large vacuum tank connected through a 0.2 mm orifice to a small reaction chamber (Fig. 1b). The specimens produced in the smaller reaction chamber showed better results, due to the reduced velocity of the gas molecules. Furthermore, the specimen is held at −400 V (Fig. 1b) rather than at earth potential (Fig. 1a), which also leads to more stable plasma conditions. The wire is heated inductively (2 V, 5 A) to about T = 500 °C, which was found to be the best condition for growth; fibers 5 to 30 µm long were found. The optimum mixing ratio of acetylene (C2H2) and ammonia (NH3) gases was found to be 1:2 (Fig. 2), while at 1:3 the growth of triangular shaped carbon tubes and at 1:1.5 flat
Fig. 1. Plasma-CVD equipment for producing Carbon nano tubes
Fig. 2. The growth of CNT depends on the gas pressure ratio P-C2 H2 :P-NH3
carbon layers were observed. The optimum total pressure is 1.2 Torr; at lower pressures the amount of tubes is too low, and at higher pressures the tubes become too thick. The ex-situ hydrogen-charging experiments on grown GNF were performed at 1 Torr H2 at 400 °C. For subsequent TEM observations the wire was fixed on a 3 mm disc. The specimens were characterized with a JEOL 3010FX TEM. The diffraction patterns were analyzed using image simulations performed with the program EMS, which is well known for TEM analysis.
3
Results and Discussions
The low magnification TEM micrograph in Fig. 3a shows the Pd nano-crystal embedded at the top of a carbon nanotube. From the analysis of the corresponding diffraction pattern it is deduced that the facets at the top of the nano-crystal consist of (220) planes, indicating that these high-density planes have the lowest energy towards the surrounding carbon layers. The inner surfaces of the palladium crystals towards the empty fiber are also mostly facetted, with a long tail. In a previous analysis it was found that hydrogen is released during decomposition of the acetylene gas and penetrates into the nanotubes as well as into the metallic nano-crystals, and it was concluded that the melting point of Pd is reduced owing to both the hydrogen interaction and the
nano-size. In this case the Pd nano-crystals have a round shape, which also indicates melting during CNT formation due to the hydrogen interaction. In order to clarify the influence of hydrogen, the Pd-CNT samples were exposed to H2 gas at 400 °C for 20 min. The two diffraction patterns shown in Fig. 3b both correspond to (022) diffraction spots in the [110] zone axis and are shown at the same magnification, the upper one before and the lower one after the hydrogen treatment. The distance between the corresponding (022) spots and the incident beam is reduced after the hydrogen treatment, while the ring-like diffraction patterns from the carbon nanotubes have the same size in both figures. Careful analysis showed that the reciprocal-space distance of the (002) spot from the incident beam is about 3.1% smaller after the hydrogen treatment (lower pattern) than before (upper pattern), indicating an expansion of the lattice due to the hydrogen treatment. This corresponds to an expanded lattice constant of 0.404 nm, in good agreement with previous measurements [6]. About 10% of the CNT, however, show diffraction patterns that cannot be identified as fcc, but only as hcp crystals, as shown in Fig. 3c. The analysis of the diffraction pattern in Fig. 3c leads to a z = [22.1] zone axis with lattice constants a = 0.250 nm and c = 0.408 nm. This ratio of c/a = 1.63 corresponds well to other hcp metals like Co. Careful analysis of different zone axes showed that the hexagonal phase possesses only these lattice constants, which are independent of the hydrogen content, as confirmed by the analysis of many diffraction patterns. The increase of the lattice constant in fcc Pd due to interstitial hydrogen atoms is shown in Fig. 4. The dots refer to the experimental bulk observations [6]; the straight line for fcc is the linear interpolation assuming the validity of the well-accepted Vegard rule for interstitial alloying of foreign elements. Hydrogen can be stored in the fcc lattice until all octahedral sites are filled, which corresponds to a maximum composition of PdHx with x = 1.
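The linear (Vegard) interpolation for fcc-PdHx can be sketched numerically from the two bulk lattice constants quoted from [6], 0.3906 nm at x = 0 and 0.4049 nm at x = 0.706. The short Python sketch below is an illustration only: the function names are hypothetical, and the extrapolation to x = 1 simply assumes that the same straight line continues to hold. It also evaluates the cubic (111) spacing d = a/√3 and compares the measured hcp ratio c/a with the ideal close-packed value √(8/3).

```python
import math

# Bulk lattice constants quoted in the text from reference [6].
A_PD_NM = 0.3906       # fcc Pd, x = 0
A_PDH_REF_NM = 0.4049  # fcc PdHx at x = 0.706
X_REF = 0.706


def vegard_lattice_constant_nm(x: float) -> float:
    """Vegard-type linear estimate of the fcc-PdHx lattice constant (nm).

    Assumption: one straight line through the two quoted data points,
    used for 0 <= x <= 1.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1 for the fcc phase")
    slope = (A_PDH_REF_NM - A_PD_NM) / X_REF
    return A_PD_NM + slope * x


def cubic_d_spacing_nm(a_nm: float, h: int, k: int, l: int) -> float:
    """Interplanar spacing d_hkl = a / sqrt(h^2 + k^2 + l^2), cubic lattice."""
    return a_nm / math.sqrt(h * h + k * k + l * l)


def ideal_hcp_c_over_a() -> float:
    """Ideal c/a ratio for hexagonal close packing of hard spheres."""
    return math.sqrt(8.0 / 3.0)


if __name__ == "__main__":
    for x in (0.0, X_REF):
        a = vegard_lattice_constant_nm(x)
        print(f"x = {x:.3f}: a = {a:.4f} nm, "
              f"d(111) = {cubic_d_spacing_nm(a, 1, 1, 1):.4f} nm")
    increase_pm = (vegard_lattice_constant_nm(1.0) - A_PD_NM) * 1000.0
    print(f"extrapolated increase of a at x = 1: {increase_pm:.1f} pm")
    print(f"measured hcp c/a = {0.408 / 0.250:.3f}, "
          f"ideal = {ideal_hcp_c_over_a():.3f}")
```

Under these assumptions the (111) spacings come out near 0.2255 nm and 0.2338 nm for x = 0 and x = 0.706, the lattice constant at x = 1 is about 20 pm larger than for pure Pd, and the measured c/a of 1.632 lies within 0.1% of the ideal value.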
Fig. 3. TEM-observation of Palladium nano-crystals inside Carbon nanotubes: (a) TEM-micrograph; diffraction pattern with (b) fcc- and (c) hcp-structure
Fig. 4. Lattice constants for fcc PdHx (x = 0 . . . 0.7) and hcp-PdHx (x = 0.7 . . . 2) extrapolated from experimental data points [6] and deduced from TEM-diffraction pattern
Reaching this concentration, the lattice constant has increased by 20 pm, a value which corresponds well to the hydrogen radius usually fitted for ionic materials. The fcc (111) spacing increases from 0.225 nm to 0.233 nm for fcc PdHx from x = 0 to 0.7 and would be 0.250 nm for x = 1. When the hydrogen content increases beyond x = 0.7, the hexagonal lattice, stable up to x = 2, is observed, whose c-axis is a continuation of the fcc a-axis and whose a-axis is a continuation of the fcc [111] spacing (Fig. 4). Why the Vegard rule does not hold in the case of hexagonal PdH2 requires additional research. In the crystal structure of hexagonal PdH2, two types of interstitial tetrahedral sites are possible, with the tip of the tetrahedron pointing either upward or downward (Fig. 5). If both types of tetrahedral sites are fully occupied, the composition is PdH2. This hcp-PdHx lattice (with x = 1...2) has the space group P63/mmc, in which the two possible tetrahedral sites are occupied with a probability of
Fig. 5. Crystal structure of fcc-Pd (left), fcc-PdH (middle) and hcp-PdH2 (right)
0.5 in the case of x = 1 or fully occupied in the case of x = 2 (PdH2). The symmetry would be reduced to P63mc if only the upper or the lower half of them were occupied. This symmetry-reduced lattice, however, does not fit the experimentally observed diffraction patterns. The hydrogen necessary to form PdH can be easily explained by the growth mechanism for nanotubes (Fig. 6). On the atomic scale the growth can be divided into five steps: first the cracking of the C≡C bonds of acetylene, then the cracking of its carbon-hydrogen bonds, the formation of intermolecular hydrogen bonds and of carbon bonds in the nanotube shape, and finally the growth of the nanotube into its long shape. The catalytic function of palladium is required for all of these steps. The growth only occurs at the tip of the nanotube, since split tubes are never observed. The growth of the nanotubes terminates when the tip of the Pd nano-crystal is covered with a graphite mono-layer. Hence, it is likely that the growth occurs at the carbon-palladium interface. The decomposition of the acetylene gas releases the carbon required for the CNT growth and also the hydrogen, which is partly released into the vacuum, partly stored in the empty part of the CNT, and partly stored in the palladium crystal. The bias voltage of −400 V also enhances this hydrogen diffusion into the tubes.
Fig. 6. Growth model for carbon nano tubes during the plasma-CVD
4
Summary
This study reports a TEM investigation of the lattice expansion of fcc-palladium and the first investigation of hexagonal palladium hydride, which was found inside CNT. Further research is required to find the optimal growth conditions for increasing the amount of hcp-PdH2. Since this is the first report on hcp-PdH2, it has to be clarified whether the H penetration and hcp-PdH2 formation require the nanotube/Pd interface as a necessary carrier acting like a funnel. The result of this study has a great impact on the future storage of hydrogen in palladium. If the hexagonal palladium structure could be stabilized, it would become possible to store a larger amount of hydrogen in the Pd lattice than previously considered.
Acknowledgement
We gratefully acknowledge the fruitful discussions and support by Prof. em. Fumio Okuyama.
References
1. S. Iijima, Nature 354, 56 (1991).
2. M. Tanemura, V. Filip, K. Iwata, Y. Fujimoto, F. Okuyama, D. Nicolaescu, H. Sugie, J. Vac. Sci. Technol. B 20, 122 (2002).
3. H. Sugie, M. Tanemura, V. Filip, K. Iwata, K. Takahashi, F. Okuyama, Appl. Phys. Lett. 78, 2578 (2001).
4. M. Tanemura, K. Iwata, K. Takahashi, Y. Fujimoto, F. Okuyama, H. Sugie, V. Filip, J. Appl. Phys. 90, 1529 (2001).
5. T. Kuji, Y. Matsumura, H. Uchida, T. Aizawa, J. Alloys and Compounds 330-332, 718 (2002).
6. J. E. Worsham, M. K. Wilkinson, C. G. Shull, J. Phys. Chem. Solids 3, 303 (1957).
Electron Coherence in Mesoscopic Kondo Wires
Félicien Schopfer, Christopher Bäuerle, Wilfried Rabaud, and Laurent Saminadayar
Low Temperature Research Laboratory, CRTBT-CNRS, B.P. 166 X, 38042 Grenoble Cedex 09, France
{bauerle,saminadayar}@grenoble.cnrs.fr
Abstract. We present measurements of the magnetoresistance of long and narrow quasi one-dimensional gold wires containing magnetic iron impurities. The electron phase coherence time extracted from the weak antilocalisation shows a pronounced plateau in a temperature region of 300 mK – 800 mK, associated with the phase breaking due to the Kondo effect. Below the Kondo temperature, the phase coherence time increases, as expected in the framework of Kondo physics. At much lower temperatures, the phase coherence time saturates again, in contradiction with standard Fermi liquid theory. In the same temperature regime, the resistivity curve displays a characteristic maximum at zero magnetic field, associated with the formation of a spin glass state. We argue that the interactions between the magnetic moments are responsible for the low temperature saturation of the phase coherence time.
1
Introduction
The understanding of the ground state of an electron gas at zero temperature is one of the major challenges in Solid State Physics. For a long time it has been known that such a ground state is well described by Landau’s theory of Fermi liquids [1]. In this description, the lifetime of quasiparticles is infinite at zero temperature, as the coupling to the environment tends to zero. Alternatively, in mesoscopic physics, one key physical concept is the phase coherence time, i.e. the time an electron can travel in a solid before it loses its phase coherence and thus its quantum, wave-like behaviour. Such decoherence is due to inelastic processes, like electron-phonon, electron-electron or electron-photon collisions. It has been shown by Altshuler and coworkers [2] that the phase coherence time diverges at zero temperature, as electron-phonon, electron-electron and electron-photon interactions all go to zero at zero temperature. However, recent experiments on metallic as well as semiconductor wires suggest that the phase coherence time saturates at very low temperature [3]. Following this work, it has been argued that the observed saturation is
Present address: Department of Physics, University of Maryland, College Park, MD 20742-4111, USA.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 181–192, 2003. © Springer-Verlag Berlin Heidelberg 2003
indeed universal and intrinsic, and due to electron-electron interactions in the ground state of the Fermi liquid [4]. Contrary to this, other interpretations argue that this saturation is extrinsic and due to the coupling to other degrees of freedom, like two level systems [5]. On the other hand, some experimental results suggest that the dephasing depends on the dimensions of the samples [6], whereas another group argues that some of their experimental results agree with standard theory [7], at least down to 50 mK. It should be noted, however, that the problem of relaxation at zero temperature is a subject of debate not only in the mesoscopic community. Recent experiments in spin polarized helium, a textbook example of a Fermi liquid, show a saturation of the transverse spin-diffusion coefficient at zero temperature [8]. This equally raises key questions about the applicability of conventional Fermi liquid theory. Recent experiments invoke the coupling to magnetic impurities as a possible source of the frequently observed low temperature saturation of the phase coherence time [9,10,11]. It is well known that in metals the interaction of conduction electrons with magnetic impurities gives rise to the Kondo effect [12]. Concerning transport properties of metals, the best known feature of this effect is the existence of a minimum and a subsequent logarithmic increase of the resistivity with decreasing temperature below the Kondo temperature TK. The influence of Kondo impurities on the dephasing rate, on the other hand, is far more subtle. Finally, it is well known that above a certain amount of impurities, and below a certain temperature, RKKY interactions between magnetic moments lead to the formation of a spin glass [13]. This regime has basically not been explored so far and may contain a great deal of new physical phenomena.
All this physics related to magnetic impurities leads to new energy scales: the Kondo temperature TK and the spin glass transition temperature Tg . Both energy scales have to be considered when dealing with the “zero” temperature limit, and have also to be introduced in the theoretical description of dephasing in mesoscopic wires.
2
Historic
Already in the early days of weak localisation, many experimentalists observed a systematic saturation of the electron phase coherence at low temperatures, when extracted from low field magnetoresistance [14,15]. This saturation has often been attributed to the presence of some residual magnetic impurities [16]. To our knowledge, the first measurements which clearly demonstrated the strong influence of magnetic impurities on the phase coherence, even in the presence of extremely dilute magnetic impurities (below the ppm level), were carried out by Pannetier and coworkers in the 1980’s [17,18]. These measurements were performed on extremely pure Au samples, and coherence lengths of several micrometers were obtained at low temperatures. Again in these experiments, the phase coherence time was almost temperature independent below 1 K. By annealing the samples, the authors could show that the phase coherence time increases substantially. The annealing process oxidizes magnetic impurities and hence suppresses decoherence due to the Kondo effect. These experiments therefore clearly show that the presence of an extremely small amount of magnetic impurities can lead to substantial electron decoherence at low temperatures. A different way to suppress the effect of magnetic impurities is to apply a magnetic field high enough to fully polarise the magnetic impurity spins. In this case, weak localisation measurements cannot be used to extract the phase coherence time. On the other hand, measurements of Aharonov-Bohm (AB) oscillations and universal conductance fluctuations (UCF) are possible. Pioneering work on both UCF and AB oscillations in quasi 1D quantum conductors containing a small amount of magnetic impurities (down to 40 ppm) was performed by Benoît and coworkers in the late 1980’s [19]. In this work the authors could clearly show that UCF as well as AB oscillations increase considerably at fields larger than 1 Tesla, showing the suppression of the Kondo effect due to the polarization of the magnetic impurity spins. In the context of the present debate on the low temperature saturation of τφ, these measurements have been repeated recently on metallic samples containing more dilute magnetic impurities [9]. In this article we point out another effect which leads to a saturation of τφ at low temperatures when measured by weak localisation, namely the formation of a spin glass state.
We show that, even in the presence of very dilute magnetic impurities, the impurities cannot be regarded as independent (single impurity limit) at low temperatures and interactions between the magnetic impurities have to be taken into account. It is well known that RKKY interactions between magnetic impurities lead to the formation of a frozen spin configuration at a characteristic temperature Tg . In systems containing very dilute magnetic impurities, this temperature lies well below the Kondo temperature, hence sets another energy scale, which can be of the order of the lowest temperatures presently accessible in experiments.
3
Experimental
In this article we report on measurements of the temperature dependence of the low field magnetoresistance and resistivity of quasi one-dimensional (1D) long and narrow Au/Fe Kondo wires down to temperatures below 0.1 TK. Sample fabrication is done using electron beam lithography on a silicon substrate. The metal is deposited with a Joule evaporator and a standard lift-off technique. In order to improve adhesion to the substrate, a 1 nm thin titanium layer is evaporated prior to the gold evaporation. Two sources of 99.99% purity with different iron impurity concentrations are employed for
the gold evaporation. The actual iron impurity concentration is determined via the resistance variation at low temperature due to the Kondo effect. Such a method directly characterises the purity of the samples, which may be quite different from the purity of the sources. The samples (A and B) have the same geometrical parameters: their lengths, widths and thicknesses are L = 450 µm, w = 150 nm and t = 45 nm. The 1 K resistance value for sample A (B) is 4654 Ω (2235 Ω). The samples are quasi 1D with respect to both the phase-breaking length lφ = √(Dτφ) and the thermal length lT = √(ħD/kBT), D being the diffusion constant. From the relation D = (1/3) vF le, we obtain a diffusion coefficient of 0.056 m²/s and 0.115 m²/s for sample A and B, respectively. We have studied very carefully the temperature dependence of the low field magnetoresistance as well as the electrical resistivity over a temperature range extending from 4.2 K down to 15 mK. Figures 1 and 2 display the magnetoresistance at different temperatures for two samples containing different amounts of magnetic iron impurities. From fits to weak localisation theory for quasi 1D conductors (see insets) we extract the phase coherence length lφ. For the fitting procedure, we first determine the spin-orbit scattering length at temperatures above 0.5 K with magnetoresistance curves covering a field span of ± 2000 G. We then fix the spin-orbit scattering length to the obtained value of 50 nm for all weak localisation fits [20]. Using the measured geometrical and electrical parameters of the samples, the only fitting parameter is hence the phase coherence length lφ. From the relation τφ = lφ²/D, we then obtain the phase coherence time τφ as shown in Fig. 3.
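The relations used in this paragraph are simple enough to check numerically. The following Python sketch is an illustration under stated assumptions, not the authors' analysis code: it computes the resistivity from the quoted wire geometry and 1 K resistance, and evaluates lφ = √(Dτφ), τφ = lφ²/D and lT = √(ħD/kBT). The function names are hypothetical, the diffusion coefficient is the value quoted in the text, and the phase coherence time of 1 ns used in the example is an arbitrary illustrative input, not a measured value.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def resistivity_ohm_m(resistance_ohm, length_m, width_m, thickness_m):
    """Resistivity of a uniform wire: rho = R * w * t / L."""
    return resistance_ohm * width_m * thickness_m / length_m


def phase_coherence_length_m(diffusion_m2_s, tau_phi_s):
    """l_phi = sqrt(D * tau_phi)."""
    return math.sqrt(diffusion_m2_s * tau_phi_s)


def phase_coherence_time_s(diffusion_m2_s, l_phi_m):
    """tau_phi = l_phi^2 / D."""
    return l_phi_m ** 2 / diffusion_m2_s


def thermal_length_m(diffusion_m2_s, temperature_k):
    """l_T = sqrt(hbar * D / (k_B * T))."""
    return math.sqrt(HBAR * diffusion_m2_s / (K_B * temperature_k))


if __name__ == "__main__":
    L, W, T_FILM = 450e-6, 150e-9, 45e-9  # geometry quoted in the text
    for name, r_1k in (("A", 4654.0), ("B", 2235.0)):
        rho = resistivity_ohm_m(r_1k, L, W, T_FILM)
        # 1 ohm m = 1e11 nano-ohm cm
        print(f"sample {name}: rho = {rho * 1e11:.1f} nOhm cm")

    d_a = 0.056          # m^2/s, value quoted in the text for sample A
    tau_example = 1e-9   # s, illustrative input only
    print(f"l_phi = {phase_coherence_length_m(d_a, tau_example) * 1e6:.2f} um")
    print(f"l_T(1 K) = {thermal_length_m(d_a, 1.0) * 1e6:.2f} um")
```

The resistivities obtained this way, about 6981 nΩ·cm for sample A and about 3352 nΩ·cm for sample B, lie on the scale of the resistivity axes of Fig. 4. With the quoted D, the thermal length at 1 K comes out larger than the 150 nm wire width, consistent with the quasi 1D statement above.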
[Figure 1 plot: ΔR/R versus B (G) from −2000 G to 2000 G at 900 mK, 590 mK, 160 mK, 75 mK, and 20 mK, with scale markers 2·10⁻⁴ and 2·10⁻⁵; an inset is labelled 690 mK.]
Fig. 1. Low field magnetoresistance of sample A at different temperatures. Inset shows a fit of weak localisation theory to the data
[Figure 2 plot: ΔR/R versus B (G) from −400 G to 400 G at 4.2 K, 545 mK, and 15 mK, with scale markers 10⁻³ and 4·10⁻⁴; an inset is labelled 425 mK.]
Fig. 2. Low field magnetoresistance of sample B at different temperatures. Inset shows a fit of weak localisation theory to the data
Three distinct temperature regimes are clearly distinguishable. At high temperatures (above 1 K), the phase coherence time decreases rapidly with increasing temperature due to electron-phonon coupling. This temperature dependence is well described by a T^(−3) power law, as expected from theory. At temperatures between 0.3 K and 1 K the phase coherence time shows a pronounced plateau. Here the temperature independence of τφ is caused by dephasing due to the Kondo effect, as we will see later when we discuss the temperature variation of the resistivity. Below 0.3 K, the phase coherence time increases again because of the partial screening of the magnetic impurities [21]. At lower temperatures, however, we again observe an apparent saturation of τφ. In order to understand this rather unusual temperature dependence of τφ, it is important to analyse the temperature dependence of the electrical resistivity. Figure 4 displays the temperature variation of the resistivity for both samples in zero magnetic field. The resistivity variation is the sum of three contributions: electron-electron interaction, weak localisation, and the contribution due to magnetic impurities. Below 1 K one observes an increase of the resistivity for both samples. This increase is due to the Kondo contribution and electron-electron interaction. At the lowest temperatures, one observes a clear maximum in the resistivity for both samples. Since the amplitude of the weak localisation curves is basically temperature independent at these temperatures and since the electron-electron contribution increases monotonically with decreasing temperature (∼ 1/√T), the
[Figure 3 plot: τφ (ns) on a logarithmic axis from 0.01 to 10 versus T (mK) from 10 to 1000, for sample A and sample B.]
Fig. 3. Phase coherence time as a function of temperature. The solid line is the theoretical prediction from [2] for sample B
[Figure 4 plot: ρ (nΩ·cm) versus T (mK) from 10 to 1000; the axis for sample A runs from 6982 to 6994 and the axis for sample B from 3352 to 3355.]
Fig. 4. Resistivity variation as a function of temperature, measured at zero magnetic field
maximum in the resistivity has to be due to some other phenomenon related to the presence of magnetic impurities. The maximum in the resistivity is a common feature for Kondo systems [22]. At low temperature, magnetic impurities interact via the RKKY interaction. In systems with high Kondo temperatures, and very low impurity
concentration, a complete Kondo screening of the magnetic impurities can be obtained. This case is often referred to as the unitary limit, where RKKY interactions are suppressed. However, if the concentration is high enough, and the Kondo temperature low enough, the screening length may be very large as it varies like 1/TK. RKKY interactions are then important and the magnetic impurities cannot be treated in the single impurity limit. The unitary limit is hence never reached, and the system transits into a spin glass state at a temperature Tg. The most common features of this transition are a maximum in the resistivity curve as well as an anomaly in the magnetic susceptibility, which appear roughly at Tg [23]. Both phenomena have been extensively studied in the past: the dependence of the temperature of the resistivity maximum [24,25,26] and of the susceptibility anomaly [27] as a function of the impurity concentration in Au/Fe systems as well as many others. However, the effect of such a peculiar spin configuration on the phase coherence time has not been explored so far [28]. To our knowledge, this is the first time that weak localisation measurements are accessible in this spin glass regime. In order to determine the impurity concentration of our samples, we subtract the measured contribution due to weak localisation and fit the temperature variation of the resistivity at zero magnetic field to the following expression:
\[ \rho(T) - \rho_0 = \frac{\alpha}{\sqrt{T}} + \beta \left[ 0.743 + 0.332 \left( 1 - \frac{\ln(T/T_K)}{\sqrt{\ln^2(T/T_K) + \pi^2 S(S+1)}} \right) \right] \qquad (1) \]
where the first term corresponds to the electron-electron contribution and the second term to the Hamann expression [29] for the Kondo contribution, with β being the impurity concentration in ppm, S the impurity spin and ρ0 the residual resistivity at 1 K.
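To make the structure of Eq. (1) concrete, the following Python sketch evaluates its right-hand side. It is a minimal illustration, not the fitting code used for the data: the function names are hypothetical, T and TK are taken in mK, and the default values S = 3/2 and TK = 300 mK are the ones adopted for the fit in the text. The Kondo bracket is assumed to be in nΩ·cm per ppm, so that β in ppm gives a contribution in nΩ·cm.

```python
import math


def kondo_term_per_ppm(t_mk: float, tk_mk: float = 300.0,
                       spin: float = 1.5) -> float:
    """Bracketed Hamann-type term of Eq. (1), per ppm of impurities."""
    if t_mk <= 0 or tk_mk <= 0:
        raise ValueError("temperatures must be positive")
    log_ratio = math.log(t_mk / tk_mk)
    denom = math.sqrt(log_ratio ** 2 + math.pi ** 2 * spin * (spin + 1.0))
    return 0.743 + 0.332 * (1.0 - log_ratio / denom)


def delta_rho(t_mk: float, alpha: float, beta_ppm: float,
              tk_mk: float = 300.0, spin: float = 1.5) -> float:
    """Right-hand side of Eq. (1): alpha / sqrt(T) + beta * Kondo term."""
    return (alpha / math.sqrt(t_mk)
            + beta_ppm * kondo_term_per_ppm(t_mk, tk_mk, spin))


if __name__ == "__main__":
    # Illustrative evaluation with the fit parameters quoted for sample B
    # (alpha = 9.5, beta = 15 ppm); the temperatures are arbitrary.
    for t in (30.0, 100.0, 300.0, 1000.0):
        print(f"T = {t:6.1f} mK: rho - rho0 = {delta_rho(t, 9.5, 15.0):.3f}")
```

At T = TK the logarithm vanishes and the Kondo bracket reduces to 0.743 + 0.332 = 1.075, and the bracket grows monotonically as the temperature is lowered. A function of this form could be handed to any least-squares routine with α and β as free parameters.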
Taking S = 3/2, TK = 300 mK [24] and fitting both data sets over the same temperature range, we obtain an impurity concentration of approximately 60 ppm (15 ppm) and a coefficient α = 1.4 nΩ·cm·mK$^{-1/2}$ (9.5 nΩ·cm·mK$^{-1/2}$) for sample A (B), compared to the theoretically expected value of 36.4 nΩ·cm·mK$^{-1/2}$ (12.1 nΩ·cm·mK$^{-1/2}$). The poor agreement between the experimental and theoretical values of the coefficient α for sample A is due to our choice of fitting both sets of data over exactly the same temperature range. If we fit the data of sample A over a limited temperature range (> 100 mK), we recover the theoretically expected value for α. This again indicates that in sample A, where the impurity concentration is higher than in sample B, RKKY interactions between magnetic impurities are already present at these temperatures. As a consequence, the resistivity deviates strongly from the Kondo model [30]. The saturation of τφ and the subsequent desaturation at lower temperatures can be well understood in terms of the Kondo effect. Spin flip scattering due to the presence of magnetic impurities causes very efficient dephasing at temperatures around TK. At lower temperatures the magnetic impurities become screened by the surrounding conduction electrons and the spin flip scattering process is attenuated. As a consequence, τφ increases with decreasing temperature [21]. At low enough temperatures standard Fermi liquid theory [31] should again describe the temperature dependence of τφ. It should therefore follow a power law $T^{-2/3}$ [2] as shown by the solid line in Fig. 3. This is clearly not the case for our experimental data. What can be the origin of the observed low temperature saturation of τφ? One explanation, at present very controversial, is the possible existence of zero temperature dephasing due to electron-electron interactions [4]. The agreement of the experimental data with this theory is reasonable. For a detailed comparison of our data with this theory, we refer the reader to [10]. Another possibility for the low temperature saturation of τφ is the presence of another type of magnetic impurity with a Kondo temperature below the measuring temperature (e.g. Mn). In this case one would again expect a plateau for τφ at temperatures around TK, in qualitative agreement with the data on the phase coherence. If so, this additional Kondo contribution should also lead to an increase of the resistivity at low temperatures. Neither of these two possibilities explains the maximum in the resistance curve. As already mentioned above, the maximum in the resistance curve is a well known feature which is attributed to freezing of the magnetic impurities into a spin glass state. It is thus clear that RKKY interactions between magnetic impurities are important in our samples and have to be taken into account in the interpretation of our experimental data. For this purpose, we extract the spin scattering rate from the measurement of the phase coherence time. The total dephasing rate is given by [32]

$$\frac{1}{\tau_\phi} = \frac{1}{\tau_{nm}} + \frac{2}{\tau_s} \qquad (2)$$
where 1/τs is the spin scattering rate and 1/τnm is the non-magnetic scattering rate given by the usual formula

$$\frac{1}{\tau_{nm}} = A\,T^{2/3} + B\,T^{3} \qquad (3)$$
Coefficient A = 0.8 (0.6) ns$^{-1}$ K$^{-2/3}$ is calculated using the parameters of sample A (B), and coefficient B = 0.04 (0.04) ns$^{-1}$ K$^{-3}$ is obtained by fitting the data at high temperature. The fit for sample B is displayed in Fig. 3. This non-magnetic part of the dephasing rate is then subtracted from our data, and we obtain the spin scattering rate as a function of temperature. This is displayed in Fig. 5. Both curves exhibit a clear maximum around TK [32,33], where the dephasing mechanism due to the Kondo effect is the most efficient. This is associated with the plateau observed around TK in the τφ(T) curve [34]. Below TK, magnetic impurities get screened, and one expects a decrease of the dephasing rate. This is indeed what is observed. Theoretical predictions lead to a $T^2$ behaviour in Nozières's Fermi liquid theory [31], whereas recent work leads to a $1/\ln^2(T_K/T)$ dependence for partially screened impurities [35]. Neither of these predictions is observed. The key point is that for both samples the spin-scattering rate saturates and is basically constant down to the lowest temperatures. This is not surprising: it is well known that spin-spin correlations have a strong influence on the measured dephasing time. Freezing of magnetic moments into a spin glass state violates time reversal symmetry, hence leading to a very efficient dephasing mechanism. When comparing with the resistivity curves, it is obvious that this new regime appears around Tg. This saturation in the spin scattering rate can thus be associated with the formation of a spin glass due to the RKKY interactions between magnetic impurities.

Fig. 5. Magnetic scattering rate $\tau_s^{-1}$ (in ns$^{-1}$) as a function of temperature T (in mK) for samples A and B, obtained by subtraction of the standard dephasing rate from the data of Fig. 3
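The subtraction procedure based on Eqs. (2) and (3) can be sketched as follows. This is only an illustration: the function names are hypothetical, the factor of two is assumed to multiply the spin term as in the standard weak-localisation expression, and the dephasing rate passed in at the end is a made-up number rather than measured data.

```python
def nonmagnetic_rate(T, A, B):
    """Eq. (3): 1/tau_nm = A*T^(2/3) + B*T^3 (T in K, result in ns^-1)."""
    return A * T ** (2.0 / 3.0) + B * T ** 3

def spin_scattering_rate(inv_tau_phi, T, A, B):
    """Solve Eq. (2), 1/tau_phi = 1/tau_nm + 2/tau_s, for 1/tau_s."""
    return 0.5 * (inv_tau_phi - nonmagnetic_rate(T, A, B))

# Coefficients quoted in the text for sample A.
A_SAMPLE_A = 0.8   # ns^-1 K^-2/3
B_SAMPLE_A = 0.04  # ns^-1 K^-3

# Hypothetical input: a total dephasing rate of 1.0 ns^-1 at T = 1 K.
example_rate = spin_scattering_rate(inv_tau_phi=1.0, T=1.0, A=A_SAMPLE_A, B=B_SAMPLE_A)
```

Applied point by point to a measured 1/τφ(T) curve, the same two functions would give a curve of the kind shown in Fig. 5.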
4
Conclusions
Our results clearly show that RKKY interactions, associated with the spin glass freezing, lead to a constant spin scattering rate, and hence yield a finite phase coherence time at very low temperatures. It is thus important to consider both energy scales, TK as well as Tg when dealing with metallic systems containing even a very small amount of magnetic impurities. The understanding of electron dephasing in the temperature range below TK is certainly a challenge for theory, but is probably the key point to interpret properly the experiments carried out on metals as they often contain magnetic impurities on the ppm level at best, with Kondo temperatures in the mK range. Measurements at lower temperatures on samples with very low
concentrations of magnetic impurities, well in the unitary limit, would also be of great interest. In this case all magnetic impurities are completely screened, and the standard Fermi liquid behaviour should be recovered. This would be the key test to discriminate between intrinsic dephasing and dephasing due to Kondo impurities.
Acknowledgements
We gratefully acknowledge L. Glazman, A. Zaikin, L.P. Lévy, O. Laborde, J. Souletie, P. Mohanty, H. Pothier, A. Benoît and F. Hekking for fruitful discussions. Samples have been made at NanoFab, CRTBT-CNRS. Part of this work has been performed at the "Ultra-Low Temperature Facility, University of Bayreuth" within a TMR-project of the European Community (ERBFMGECT-950072). We are indebted to G. Eska, R. König and I. Usherov-Marshak for their assistance.
References
1. D. Pines and P. Nozières, The Theory of Quantum Liquids, W.A. Benjamin (1966).
2. B.L. Altshuler, A.G. Aronov and D.E. Khmelnitskii, J. Phys. C 15, 7367 (1982); B.L. Altshuler and A.G. Aronov, in Electron-Electron Interactions in Disordered Conductors, eds. A.L. Efros and M. Pollak, North Holland, Amsterdam (1985).
3. P. Mohanty, E.M.Q. Jariwala and R.A. Webb, Phys. Rev. Lett. 78, 3366 (1997).
4. D.S. Golubev and A.D. Zaikin, Phys. Rev. Lett. 81, 1074 (1998); D.S. Golubev, A.D. Zaikin and G. Schön, J. of Low Temp. Phys. 126, 1355 (2002).
5. Y. Imry, H. Fukuyama and P. Schwab, Europhys. Lett. 47, 608 (1999); A. Zawadowski, J. van Delft and D.C. Ralph, Phys. Rev. Lett. 83, 2632 (1999); V.V. Afonin, J. Bergli, Y.M. Galperin, V.L. Gurevich and V.I. Kozub, Phys. Rev. B 66, 165326 (2002).
6. D. Natelson, R.L. Willett, K.W. West and L.N. Pfeiffer, Phys. Rev. Lett. 86, 1821 (2001).
7. F. Pierre, H. Pothier, D. Estève, M.H. Devoret, A.B. Gougam and N.O. Birge, in Kondo Effect and Dephasing in Low-Dimensional Metallic Systems, V. Chandrasekhar, C. van Haesendonck and A. Zawadowski eds., 119, Kluwer, Dordrecht, 2001.
8. H. Akimoto, D. Candela, J.S. Xia, W.J. Mullin, E.D. Adams and N.S. Sullivan, Phys. Rev. Lett. 90, 105301 (2003).
9. F. Pierre and N.O. Birge, Phys. Rev. Lett. 89, 206804 (2002).
10. F. Schopfer, C. Bäuerle, W. Rabaud and L. Saminadayar, Phys. Rev. Lett. 90, 056801 (2003).
11. A. Anthore, F. Pierre, H. Pothier and D. Esteve, Phys. Rev. Lett. 90, 076806 (2003).
12. J. Kondo, Prog. Theor. Phys. 32, 37 (1964).
13. J.A. Mydosh, Spin Glasses: An experimental introduction, Taylor and Francis, London, 1993.
14. M.E. Gershenson, V.N. Gubankov and J.E. Juravlev, Pis'ma Zh. Eksp. Teor. Fiz. 35, 201 (1982).
15. D. Abraham and R. Rosenbaum, Phys. Rev. B 27, 33 (1983).
16. G. Bergmann, Physics Reports 107, 1 (1984) and references therein.
17. B. Pannetier, J. Chaussy, R. Rammal and P. Gandit, Phys. Rev. Lett. 53, 718 (1984); B. Pannetier, J. Chaussy, R. Rammal and P. Gandit, Phys. Rev. B 31, R3209 (1985).
18. B. Pannetier, J. Chaussy and R. Rammal, Physica Scripta T13, 245 (1986).
19. A. Benoît, S. Washburn, C.P. Umbach, R.A. Webb, D. Mailly and L. Dumoulin, in Anderson Localisation, T. Ando and H. Fukuyama eds., Springer Verlag (1988); A. Benoît, D. Mailly, P. Perrier and P. Nedellec, Superlattices and Microstructures 11, 3 (1992).
20. This is somewhat oversimplified, since the spin-orbit scattering length may vary slightly with temperature.
21. P. Mohanty and R.A. Webb, Phys. Rev. Lett. 84, 4481 (2000).
22. U. Larsen, Phys. Rev. B 14, 4356 (1976).
23. The maximum in the resistance curve is a precursor of the spin glass transition. The actual transition is situated at a slightly lower temperature than this maximum.
24. O. Laborde and P. Radhakrishna, Solid State Commun. 9, 701 (1971); O. Laborde, PhD Thesis, Université Joseph Fourier (1977), unpublished.
25. G. Neuttiens, J. Eom, C. Strunk, V. Chandrasekhar, C. Van Haesendonck and Y. Bruynseraede, Europhys. Lett. 34, 617 (1996).
26. J. Eom, J. Aumentado, V. Chandrasekhar, P.M. Baldo and L.E. Rehn, cond-mat/0302198.
27. G. Frossati, J.L. Tholence, D. Thoulouze and R. Tournier, Physica 84B, 33 (1976) and references therein.
28. P.G.N. de Vegvar, L.P. Lévy and T.A. Fulton, Phys. Rev. Lett. 66, 2380 (1991).
29. D.R. Hamann, Phys. Rev. 158, 570 (1967).
30. If we subtract the theoretically expected contribution for e-e interactions from the total resistivity, we obtain a maximum in the resistivity at a temperature around 60 mK.
31. P. Nozières, J. of Low Temp. Phys. 17, 31 (1974).
32. C. van Haesendonck, J. Vranken and Y. Bruynseraede, Phys. Rev. Lett. 58, 1968 (1987).
33. R.P. Peters, G. Bergmann and R.M. Mueller, Phys. Rev. Lett. 58, 1964 (1987).
34. The saturation value of τφ around the Kondo temperature differs very much for the two samples. This is somewhat surprising since the impurity concentration differs only by a factor of 4 when extracted from the resistivity. The determination of the impurity concentration via the phase coherence time at around TK [32] would lead to an impurity concentration of less than 1 ppm for sample B. This is clearly incompatible with the observed resistance variation as well as the position of the maximum in the R(T) curve, which is in relatively good agreement with data on the bulk Au/Fe Kondo system (see [24]). This discrepancy is presently not understood.
35. M.G. Vavilov and L.I. Glazman, cond-mat/0210507.
Scaling of the Quantum Hall Plateau Transition Frank Hohls and Rolf J. Haug Institut für Festkörperphysik, Universität Hannover Appelstr. 2, 30167 Hannover, Germany
Abstract. We examine the transition between quantum Hall plateaus considering a number of different experiments. Measurements of the temperature and frequency dependence in the variable-range hopping regime allow us to determine the localization length ξ, confirming the predicted scaling behavior $\xi \propto |B - B_c|^{-\gamma}$ with a universal critical exponent γ = 2.3 ± 0.2. For devices of reduced size we observe conductance fluctuations at the plateau transition. Their temperature dependence turns out to differ from the temperature dependence of the width of the plateau transition. In a third experiment the dynamical scaling of the transition width is addressed by conductivity measurements at microwave frequencies. We combine these measurements with data from other experiments on dynamical scaling and observe a universal scaling function with an exponent κ = 0.5 ± 0.1. The resulting dynamical exponent z = 0.9 ± 0.3 shows the relevance of electron-electron interactions.
1
Introduction
The understanding of the quantum Hall effect (QHE) is closely related to the nature of the transition between adjacent quantized Hall plateaus. Recently this transition was interpreted as a quantum phase transition where each quantized value of the Hall resistance identifies a phase of the quantum Hall system [1]. While classical phase transitions happen at nonzero temperatures T > 0, quantum phase transitions are, strictly speaking, restricted to T = 0 [2]. But as long as the quantum fluctuations at the transition dominate over thermal fluctuations, we can observe their physics at nonzero temperature. Near the quantum phase transition the system fluctuates between both quantum phases. The spatial extension of these fluctuations is represented by the correlation length ξ. It diverges in the form of a power law $\xi \propto |\delta|^{-\gamma}$ for a small distance δ of some control parameter to the critical point of the transition. The exponent γ is universal for a given class of phase transitions, i.e. it does not depend on details of the system. For the QHE transition the correlation length corresponds to the localization length representing the spatial extension of the electronic states. This length depends on the distance δE = E − Ec to the critical energy Ec at or close to the center of a Landau level [3]:

$$\xi(E) \propto |E - E_c|^{-\gamma} \qquad (1)$$

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 193–206, 2003. © Springer-Verlag Berlin Heidelberg 2003
The localization length was investigated in several numerical calculations for noninteracting electrons [4,5,6], which confirmed Eq. (1) and found a universal critical exponent γ = 2.35 ± 0.03, independent of the disorder potential [3]. Short-range electron-electron interaction is expected not to change this exponent [7]. However, the effect of long-range interaction still remains an open question. Neither the wave function nor the energy E is experimentally accessible. Instead we measure the conductivity σxx as a function of the electron density ne or, more often, as a function of the magnetic field B. For a small distance to the critical point the relation between the Fermi energy EF and the magnetic field B or the filling factor ν = ne h/(eB) can be linearized:

$$E_F - E_c \propto B_c - B \propto \nu - \nu_c . \qquad (2)$$
For an infinite sample at T = 0 the conductivity would be zero everywhere except for that filling factor νc or magnetic field Bc which corresponds to a Fermi energy EF equal to the critical energy Ec of the plateau transition. For a finite sample size L we observe quasimetallic behavior (σxx ∼ e²/h) for L ≤ ξ(B) and insulating behavior (σxx ≪ e²/h) for L ≫ ξ(B). Scaling theory predicts that for finite L the conductivity follows a general scaling function [8]

$$\sigma_{xx}(\delta B) = G\big(L/\xi(\delta B)\big) = G_L\big(L^{1/\gamma}\,\delta B\big), \qquad (3)$$

where we have used Eqs. (1) and (2). Instead of varying the sample size it is experimentally more convenient to introduce nonzero temperature T > 0 or frequency f > 0. Both introduce an additional time scale, $\tau_T \propto \hbar/k_B T$ or $\tau_f = 1/f$, respectively. In the language of quantum phase transitions this time scale has to be compared to the correlation time $\tau_\xi \propto \xi^z$, which is related to the correlation length ξ by the dynamical exponent z. More intuitively, the additional time scale τ can be translated into an effective length $L_{\mathrm{eff}} \propto \tau^{1/z}$. Plugging this into Eq. (3) we end up with temperature and frequency scaling

$$\sigma_{ij}(f, T = 0) = G_f(f^{\kappa}/\delta B) \quad\text{and}\quad \sigma_{ij}(f = 0, T) = G_T(T^{\kappa}/\delta B) \qquad (4)$$
with a universal scaling exponent κ = 1/(zγ). While z = 2 holds for noninteracting electrons, the situation is less clear with interaction: short-range interactions are assumed not to change the dynamical exponent [9], but long-range electron-electron interaction is predicted to change it to z = 1 [10,11].
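To make the relation κ = 1/(zγ) concrete, the following short sketch evaluates it for the two values of the dynamical exponent mentioned above, using the numerically determined γ = 2.35 quoted in the text. The function and variable names are illustrative only.

```python
def scaling_exponent(z, gamma):
    """Scaling exponent kappa = 1 / (z * gamma)."""
    return 1.0 / (z * gamma)

GAMMA_NUMERICAL = 2.35  # critical exponent from the numerical work cited in the text

# z = 2: noninteracting electrons; z = 1: long-range electron-electron interaction
kappa_noninteracting = scaling_exponent(z=2.0, gamma=GAMMA_NUMERICAL)
kappa_long_range = scaling_exponent(z=1.0, gamma=GAMMA_NUMERICAL)
```

The two cases differ by a factor of two in κ (roughly 0.21 versus 0.43), which is why a measurement of κ is sensitive to the role of interactions.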
2
Localization Length in the Hopping Regime
As explained in the introduction, the localization length is one of the crucial factors for the understanding of the plateau transition. Numerically its scaling behavior $\xi \propto |\delta E|^{-\gamma}$ is well established. But experimentally it is difficult
to access. Koch et al. addressed this topic by comparing the conductivity of devices of different sizes [12] and extracted an exponent γ = 2.3 ± 0.2 from their data. A direct measurement of the localization length ξ and its energy dependence within a single device would, however, be preferable. Polyakov and Shklovskii [11,13] proposed a direct relation between ξ and the conductivity due to variable-range hopping (VRH), which is the dominant transport mechanism in the localized regime ξ < Leff at low enough temperature. This provides experimental access to the localization length [14,15,16,17,18].
2.1
Temperature Dependence
The temperature dependence of the VRH conductivity in the QHE regime is given by [11,19,20]

$$\sigma_{xx}(T) \propto \frac{1}{T}\exp\left(-\sqrt{T_0/T}\right), \qquad k_B T_0 = C\,\frac{e^2}{4\pi\epsilon\epsilon_0\,\xi}. \qquad (5)$$
The characteristic energy scale, the hopping temperature T0, is determined by the Coulomb energy at a length scale given by the localization length ξ. The dimensionless constant C is of the order of unity. We now use the relation T0 ∝ 1/ξ to verify the localization length scaling $\xi \propto |\nu - \nu_c|^{-\gamma}$ [Eqs. (1) and (2)] and the predicted universality of the critical exponent γ. The 2DES used for this experiment resides in an AlGaAs/GaAs heterojunction. Extra doping with beryllium (S1 and S3) or silicon (S2) close to the AlGaAs-GaAs interface [21] produces different realizations of disorder, allowing for a real test of universality. The electron densities and mobilities are ne = 2.1 × 10$^{15}$ m$^{-2}$ and µe = 2 m²/Vs for S1, ne = 3.2 × 10$^{15}$ m$^{-2}$ and µe = 4 m²/Vs for S2, and ne = 2.4 × 10$^{15}$ m$^{-2}$ and µe = 12 m²/Vs for S3. We concentrate on both sides of the transition between the plateaus at ν = 2 and ν = 1. Figure 1a shows the σxx peak at this transition for S1. The analysis of variable-range hopping is valid within the tails of this peak, starting at the distance from the critical point that is indicated by the vertical lines. Examples of the fits of Eq. (5) to the data are shown in Fig. 1b for S1 on the high field side of the transition. Fig. 1c displays the result for T0. The characteristic temperature T0, and thus the localization length ξ ∝ 1/T0, follows the expected power law behavior $\xi \propto |\delta\nu|^{-\gamma}$ deep into the plateau, up to a large distance δν = ν − νc ∼ 0.3 from the critical point of the transition. More interestingly, all samples can be described by the same critical exponent γ = 2.3 ± 0.2, independent of their particular realization of disorder. This demonstrates the universality of the transition [17].
Fig. 1. (a) Shubnikov-de Haas peak in σxx at the plateau transition between ν = 2 and ν = 1 for sample S1 at a temperature T = 0.2 K. In the tails of the peak, indicated by the vertical lines and the arrows, the data can be modelled by variable-range hopping (VRH). (b) Temperature dependence of the conductivity in the VRH regime for different magnetic fields at the edge of the ν = 1 plateau for sample S1. The lines result from fits to Eq. (5). (c) Hopping temperature T0 derived from fits of the conductivity to Eq. (5), plotted against the distance δν = ν − νc to the critical point νc. The data were determined for three different samples (S1-3) with different disorder. The different symbols discriminate the low field (ν > νc) and the high field side (ν < νc) of the plateau transition from ν = 2 to ν = 1
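The extraction of T0 from the temperature dependence can be sketched as a straight-line fit: according to Eq. (5), ln(T σxx) plotted against $T^{-1/2}$ is linear with slope $-\sqrt{T_0}$, which is the representation used in Fig. 1b. The code below is a minimal illustration on synthetic data; the function name, the temperature grid and the value T0 = 100 K are assumptions made for the example, not values from the experiment.

```python
import math

def fit_hopping_temperature(temperatures, conductivities):
    """Least-squares fit of ln(T*sigma) = c - sqrt(T0) * T**-0.5; returns T0."""
    xs = [t ** -0.5 for t in temperatures]
    ys = [math.log(t * s) for t, s in zip(temperatures, conductivities)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx          # equals -sqrt(T0) for data obeying Eq. (5)
    return slope ** 2

# Synthetic data generated from Eq. (5) with a hypothetical T0 = 100 K.
T0_TRUE = 100.0
temps = [0.1, 0.2, 0.4, 0.8, 1.6]  # K
sigmas = [math.exp(-math.sqrt(T0_TRUE / t)) / t for t in temps]
T0_fit = fit_hopping_temperature(temps, sigmas)
```

Since T0 ∝ 1/ξ, repeating such a fit at each filling factor gives the dependence of the localization length on δν up to the constant C.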
2.2
Frequency Dependence
The energy necessary for the hopping process can also be supplied by nonzero frequencies f. For the variable-range hopping conductivity, Polyakov and Shklovskii derived [11] the formula

$$\mathrm{Re}\,\sigma_{xx}(f) = \frac{4\pi^2}{3}\,\epsilon\epsilon_0\,\xi f, \qquad (6)$$
being linear in both the frequency f and the localization length ξ. This provides us with an additional possibility for measuring ξ. We have realized the high frequency measurement of σxx in a coaxial reflection setup: we have patterned the 2DES, again residing in an AlGaAs/GaAs heterojunction with electron density ne = 3.3 × 10$^{15}$ m$^{-2}$ and mobility µe = 35 m²/Vs, into a Corbino device. This acts as the load of a coaxial cable with characteristic impedance Z0 = 50 Ω (Fig. 2a). A load impedance Z = 1/G deviating from Z0 leads to reflection of an incident wave at the load with a complex reflection coefficient R = (Z − Z0)/(Z + Z0). Therefore a measurement of the reflection coefficient allows us to compute the conductivity σxx = ln(r1/r2)/(2πZ), with r1 = 820 µm the outer and r2 = 800 µm the inner radius of the Corbino geometry [22]. A network analyzer allows a direct measurement of the reflection coefficient R. Coaxial line and sample were fitted into a dilution refrigerator with a base temperature Ts ≤ 50 mK. Great care was taken to ensure proper thermal anchoring. The electron temperature, estimated from temperature dependent measurements, reaches a value of Te ≈ 100 mK. The power of the incident wave, P ≤ −75 dBm, was kept low enough to avoid extra heating.

Fig. 2. (a) Diagram of the measurement setup. (b) Exemplary frequency dependence of the real part of the conductivity σxx for two different distances to the critical point νc = 2.5 of the ν = 3 → 2 transition. The lines are fits to Eq. (6). (c) Localization length determined from these fits as a function of the distance δν = ν − νc to the critical point νc of the associated QHE plateau transition. The lines demonstrate the expected power law behavior $\xi \propto |\delta\nu|^{-\gamma}$ with γ = 2.3

Examples of the frequency dependence of σxx(f) for f = 0–6 GHz are shown in Fig. 2b. For filling factors ν for which σxx(f = 0) ≪ σxx(f = 6 GHz), variable-range hopping is valid and Eq. (6) can be applied. We indeed observe the predicted linear behavior as a function of frequency. The solid lines in Fig. 2b show the result of such fits. The slope of the lines is a direct measure of ξ. Figure 2c displays the resulting localization length ξ as a function of the distance δν = ν − νc to the associated critical point of the plateau transition. We analyzed the three transitions from ν = 4 to ν = 1. The solid lines correspond to the expected scaling behavior $\xi \propto |\delta\nu|^{-\gamma}$ with γ = 2.3. All data but those for 2 < ν < 2.5 agree with the expected power law and, in particular, with the universal critical exponent γ that was predicted and was measured in the previous section. They thus confirm the picture of a quantum phase transition [15].
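The conversion from a measured reflection coefficient to the conductivity described above can be sketched as follows. This is a hedged illustration rather than the authors' analysis code: the function names are hypothetical, calibration of the coaxial line is ignored, and the only inputs taken from the text are Z0 = 50 Ω and the Corbino radii.

```python
import math

Z0 = 50.0          # characteristic impedance of the coaxial line (ohm)
R_OUTER = 820e-6   # outer radius r1 of the Corbino device (m)
R_INNER = 800e-6   # inner radius r2 of the Corbino device (m)

def load_impedance(reflection):
    """Invert R = (Z - Z0)/(Z + Z0) for the (generally complex) load impedance Z."""
    return Z0 * (1 + reflection) / (1 - reflection)

def corbino_conductivity(reflection):
    """sigma_xx = ln(r1/r2) / (2*pi*Z); accepts real or complex reflection values."""
    return math.log(R_OUTER / R_INNER) / (2 * math.pi * load_impedance(reflection))

# A perfectly matched load (R = 0) corresponds to Z = Z0.
sigma_matched = corbino_conductivity(0.0)
```

Passing a complex reflection coefficient returns a complex conductivity, whose real part is the quantity fitted with Eq. (6). A reflection coefficient of exactly 1 (open circuit) would divide by zero and has to be excluded.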
3
Conductance Fluctuations at the QHE Plateau Transition
Until now we have restricted our analysis to the insulating regime σxx ≪ e²/h, which is observed at some distance δB ∝ δν from the critical point Bc of the quantum Hall plateau transition. Here variable-range hopping is the dominant transport mechanism and we can directly extract the localization length. When entering the quasimetallic regime σxx ∼ e²/h this is no longer valid, and instead we should be able to test the conductivity scaling as described by Eq. (4). We can simplify this procedure by regarding only a single value of σxx, usually chosen as half of the maximum conductivity σxx(Bc) observed at the critical point Bc. Then Eq. (4) can be reduced to the following relation for the full width at half maximum ∆B of the conductivity peak:

$$\Delta B(f) \propto f^{\kappa} \quad\text{and}\quad \Delta B(T) \propto T^{\kappa} \qquad (7)$$
This behavior was first observed by Wei et al. [23] and led to the proposal of a universal scaling behavior of the plateau transition by Pruisken [8]. The effective length $L_{\mathrm{eff}} \propto T^{-1/z}$ was interpreted as the phase coherence length Lφ. Wei et al. found a scaling exponent κ = 1/(zγ) = 0.42 ± 0.04 from temperature dependent measurements (see also [24,25]). With γ = 2.3, as determined numerically, experimentally in the variable-range hopping regime, and by size dependent measurements [12], this yields an effective length Leff(T) ∝ 1/T. For B = 0 the phase coherence length is known to follow $L_\phi \propto 1/\sqrt{T}$ due to inelastic electron-electron scattering. Therefore the question arose [26] whether the temperature dependence could be changed at high fields or whether the scaling length should be identified at all with a phase coherence length as understood for low magnetic fields. Universal conductance fluctuations (UCF), which can be observed for samples of mesoscopic size, are well known to depend on Lφ [27]. Therefore we chose to analyze conductance fluctuations in the quantum Hall regime as a means of access to the phase coherence length Lφ. Previous experiments [28,29,30,31,32,33,34,35,36] in the quantum Hall regime were partly interpreted by mechanisms different from UCF, e.g. tunneling between edge channels of a small Hall bar through [30,31,32] or charging of [33,34] single impurities. To avoid such effects in our measurements we chose a Corbino geometry, which eliminates edge channels. In addition we used rather large devices with a width w = r1 − r2 = 6 µm and radius r2 = 60 µm to suppress the effect of single impurities. Furthermore, the rather small conductance fluctuations for such a device allow for a simultaneous measurement of the width ∆B of the transition, which would be spoiled by strong fluctuations as observed for microscopic samples. The device was fabricated from the same wafer as sample S1 used for the variable-range hopping experiment.
The results observed for this sample were reproduced for devices having the properties of S2 and S3 of the previous section.
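The step from the measured exponent κ to the temperature dependence of the effective length quoted above can be written out explicitly. Using the central values κ = 0.42 and γ = 2.3 from the text (error bars are ignored in this short worked example):

```latex
z = \frac{1}{\kappa\gamma} = \frac{1}{0.42 \times 2.3} \approx 1.04,
\qquad
L_{\mathrm{eff}} \propto T^{-1/z} \approx T^{-0.97} \approx \frac{1}{T}.
```

This is to be contrasted with the zero-field expectation $L_\phi \propto T^{-1/2}$, which would correspond to z = 2.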
Fig. 3. (a) Conductance g = 2πσxx/ln(r1/r2) of the measured Corbino device for different temperatures at the ν = 2 → 1 QHE plateau transition. (b) Full width at half maximum ∆B of the conductance peak as a function of temperature. The solid line results from a fit to $\Delta B \propto T^{\kappa}$ for T > 150 mK, yielding a scaling exponent κ = 0.5. (c) A magnification of the grey rectangle shown in (a) reveals conductance fluctuations superimposed on the conductance peak. (d) Conductance fluctuations δg(B) = g(B) − ⟨g(B)⟩ after subtraction of the smooth background ⟨g(B)⟩ of the conductance peak and further magnification as indicated by the grey rectangle in (c)
We measured the conductance g = 2πσxx/ln(r1/r2) for the plateau transition ν = 2 → 1 (Fig. 3a) for temperatures T = 20–700 mK. At high temperatures T > 100 mK the width shows scaling behavior $\Delta B \propto T^{0.5}$ (Fig. 3b), in reasonable agreement with an effective length Leff(T) ∝ 1/T. The saturation at low temperatures T < 70 mK demonstrates Leff ≥ L. For such low temperatures the effective length exceeds the sample size and thus becomes irrelevant. In an enlargement of the conductance peak shown in Fig. 3c the conductance fluctuations become visible. For further analysis we separate the peak form ⟨g(B)⟩ from the fluctuations δg(B) = g(B) − ⟨g(B)⟩, where ⟨g(B)⟩ is determined by polynomial smoothing. The result is shown in Fig. 3d. The influence of the phase coherence length, and thus of temperature, is twofold: raising the temperature, i.e. lowering Lφ(T), reduces the amplitude δg_rms and increases the correlation field Bc. The latter describes the falloff of the correlation function F(b) = ⟨δg(B) δg(B + b)⟩ to half of its maximum value F(0) and is given by

$$B_c = \Phi_0 / L_\phi^2 \quad\text{with } \Phi_0 = h/e \text{ the flux quantum.} \qquad (8)$$
It was shown [37] that the most robust approach to the correlation field is an analysis of the power spectral density (PSD), which is the Fourier transform $P(f_B) = \frac{1}{2\pi}\int F(b)\,\exp(-i 2\pi f_B b)\,db$ of the correlation function F(b). Figure 4a displays the PSD obtained from the conductance fluctuations in the magnetic field range B = 6.4–7.1 T, covering most of the plateau transition. As shown by the solid lines, it can be fitted by a simple exponential decay:

$$P(f_B, B_c(T)) = P_0\, e^{-2\pi B_c(T) f_B} \qquad (9)$$
The corresponding correlation function $F(b) = F(0)/[1 + (b/B_c)^2]$ fulfills both limiting cases known for UCF [27], F(b) ≈ F(0) for b ≪ Bc and $F(b) \propto (B_c/b)^2$ for b ≫ Bc, as well as the definition of Bc, F(Bc) = F(0)/2. Therefore it is reasonable to use fits to Eq. (9) to determine the correlation field Bc. From fits like the ones shown in Fig. 4a we determine the temperature dependence of Bc(T) as plotted in Fig. 4b. Because the fluctuations are small and their amplitude drops with rising T, the analysis cannot be extended to temperatures above 200 mK. Within this range we observe a linear behavior Bc(T) ∝ T down to T ≥ 30 mK. Thus the temperature dependence $L_\phi = \sqrt{\Phi_0/B_c} \propto 1/\sqrt{T}$ is definitely distinct from the temperature dependence Leff ∝ 1/T which governs the scaling of the plateau transition [38,39].
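The two steps just described, fitting Eq. (9) to the power spectral density and converting the resulting correlation field into a phase coherence length via Eq. (8), can be sketched as below. The spectrum is synthetic and the value Bc = 4 mT is a hypothetical input chosen for illustration; the function names are likewise assumptions of this sketch.

```python
import math

FLUX_QUANTUM = 4.135667696e-15  # Phi_0 = h/e in Wb

def fit_correlation_field(f_B, psd):
    """Fit ln P = ln P0 - 2*pi*B_c*f_B by least squares; returns B_c in tesla."""
    ys = [math.log(p) for p in psd]
    n = len(f_B)
    x_mean = sum(f_B) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(f_B, ys))
             / sum((x - x_mean) ** 2 for x in f_B))
    return -slope / (2 * math.pi)

def phase_coherence_length(B_c):
    """Eq. (8) solved for the length: L_phi = sqrt(Phi_0 / B_c), in metres."""
    return math.sqrt(FLUX_QUANTUM / B_c)

# Synthetic spectrum following Eq. (9) with a hypothetical B_c = 4 mT.
freqs = [20.0, 40.0, 60.0, 80.0, 100.0]  # magnetic "frequency" f_B in 1/T
spectrum = [1e-4 * math.exp(-2 * math.pi * 4e-3 * f) for f in freqs]
B_c_fit = fit_correlation_field(freqs, spectrum)
L_phi = phase_coherence_length(B_c_fit)
```

With these inputs the recovered correlation field of 4 mT corresponds to a phase coherence length of about 1 µm.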
Fig. 4. (a) Power spectral density P(fB) of the conductance fluctuation δg(B) in a magnetic field interval B = 6.4–7.1 T. The lines show fits to Eq. (9). (b) Correlation field Bc and dephasing length $L_\phi = \sqrt{\Phi_0/B_c}$ determined by the fits shown in (a). The straight line shows the expected linear dependence Bc ∝ T, i.e. $L_\phi \propto 1/\sqrt{T}$. (c) The plateau transition width ∆B to the power of 1/κ. The data shown are identical to Fig. 3b but plotted on differently scaled axes. Clearly the width saturates due to the finite size effect at a temperature of T = 70 mK while the correlation field Bc shown in (b) still follows the temperature
The difference between these two length scales governing the conductance fluctuations and the plateau transition is underlined by another observation: while the fluctuations show a temperature dependence down to 30 mK, as found from Bc, the width ∆B saturates already below 70 mK. The direct comparison between Figs. 4b and 4c exposes this difference. These results support the proposal of Polyakov and Samokhin [26] that the conductivity near the critical point of the plateau transition is governed by two exponents: one exponent z1 = 1 with resulting length scale $L_1 \propto T^{-1/z_1} = 1/T$ governing the scaling behavior, and another one z2 = 2 with $L_2 \propto T^{-1/z_2} = 1/T^{0.5}$ describing amplitude corrections and fluctuations.
4
Universal Dynamical Scaling
Besides the scaling $\Delta B(T) \propto T^{\kappa}$ with temperature, first observed by Wei et al. [23] and shown here in Fig. 3b, Eq. (7) also predicts a scaling law ∆B(f) as a function of frequency. This idea was first tested by Engel et al., who found a scaling behavior with an exponent κ = 0.43, in agreement with measurements of temperature scaling on the same sample [40]. In contrast, Balaban et al. found no scaling behavior at all [41]. The setup presented in Fig. 2a allows us to measure the magnetoconductivity σxx(B) at the plateau transition for frequencies up to f = 6 GHz at an electron temperature of T ≈ 100 mK. Thus we can reach the regime f ≥ kB T/h ≈ 2 GHz in which frequency becomes the dominant parameter. The measured conductivity σxx(B) is presented in Fig. 5a for a magnetic field range covering the transitions from filling factor ν = 6 to ν = 2. As expected, the peaks in σxx broaden with rising frequency. As seen from Fig. 5b, the frequency dependence of the transition width ∆B shows different regimes: for f > kB T/h ≈ 2 GHz the dominating energy scale is given by the frequency. Here the width can be fitted by a power law behavior $\Delta B(f) \propto f^{\kappa}$, as shown by the straight line with nonzero slope. For f ≪ kB T/h the temperature of the 2DES sets the dominating energy scale and ∆B does not depend on frequency (horizontal line). For a description of ∆B(f) over the complete frequency range we have to include temperature. The single-parameter scaling functions for the plateau transition [Eq. (4)] are modified using a two-variable scaling analysis [1, p. 327]:

$$\sigma_{xx}(T, f) = G_{T,f}\big(T^{\kappa}/\delta B,\; f^{\kappa}/\delta B\big). \qquad (10)$$
For combining temperature and frequency into a single energy scale we use the Ansatz (hf )2 + (αkB T )2 with α a dimensionless factor of the order of unity [42]. We also tested a simpler formula in adding both energy scales but this did not fit our data [43]. Using the above Ansatz together with Eq. (7) results into a transition width κ 2 2 (hf ) + (αkB T ) (11) ∆B(f, T ) = C
Fig. 5. (a) Conductivity σxx measured for different frequencies in the reflection setup depicted in Fig. 2a. (b) Transition width ∆B(f) for the transition ν = 3 → 2. The width ∆B is defined as full width at half maximum of the conductivity peaks as indicated in (a). The straight lines represent the expected behavior for hf ≪ kBT and for hf ≫ kBT. The curved line results from a fit of the data to (12)
with C a sample specific constant. A fit of this equation to the ν = 3 → 2 transition width is shown in Fig. 5b. The good quality of the fit proves the validity of our Ansatz for combining temperature and frequency. Dividing Eq. (11) by its DC-limit ∆B(f → 0, T ) yields
$$\frac{\Delta B(f, T)}{\Delta B_{DC}(T)} = \left( \sqrt{1 + \left( \frac{hf}{\alpha k_B T} \right)^2} \right)^{\kappa} \qquad (12)$$
only depending on the dimensionless ratio hf/kBT and no longer on any sample details. Using this equation it is possible to directly compare the frequency dependence of different measurements and different samples. We will use this equation to compare our results to all other experiments studying frequency scaling. Table 1 summarizes the wide ranges of parameters such as frequency, mobility, density, temperature, filling factor, and material that are covered by the different experiments [15,22,40,41,44,45,46,47]. The widths measured in these experiments are displayed in Fig. 6a. In Fig. 6b all these data of ∆B(f, T) are normalized to the f → 0 width ∆B and plotted as a function of the dimensionless parameter hf/kBT. All measurements but one, that of Balaban et al.¹, which we will omit in our further analysis, fall onto a universal curve (Fig. 6b). This result convincingly confirms the idea of a universal scaling behavior. The solid line in Fig. 6 results from a simultaneous fit of Eq. (12) to the data, yielding a scaling exponent κ = 0.5 ± 0.1. Using the value γ = 2.3 ± 0.2 for the critical exponent derived from the measurements in the variable-range hopping regime, we can compute the dynamical exponent z = 1/(κγ) = 0.9 ± 0.3 [49]. When compared to z = 2 for noninteracting electrons, this value demonstrates the importance of interaction for a complete understanding of the integer quantum Hall effect, which in the beginning was thought to be well described without interaction. The derived z also agrees well with theoretical predictions of z = 1 [10,11].

¹ The deviation of the data of Balaban et al. is presumably caused by macroscopic density inhomogeneities, which are known to spoil scaling behavior [48]. This suspicion comes from deviations of the critical point Bc for samples lying near to each other on the same chip.

Table 1. 2DES mobility µe, range of filling factor ν, 2DES temperature T, and frequencies f. In the experiment of Shahar et al. the 2DES resides in InGaAs; the other experiments used AlGaAs/GaAs heterostructures

Experiment               µe (m²/Vs)   ν      T (K)       f (GHz)
Hohls et al. [22,15]     35           1-5    0.1         0.1-6
Meisels et al. [44,45]   10           1-2    0.3         35-55
Engel et al. [40]        4            1-2    0.14-0.5    0.2-14
Shahar et al. [46]       3            0-1    0.2-0.43    0.2-14
Balaban et al. [41]      3            1-2    0.15?       0.7-7
Lewis [47]               50           3-5    0.24-0.5    1-10

Fig. 6. (a) Transition width ∆B(f) for all experiments listed in Table 1. (b) Normalized transition width y = ∆B(f, T)/∆B(f ≈ 0, T) vs. dimensionless parameter x = hf/kBT for all data shown in (a). The solid line results from a simultaneous error-weighted fit of all data but those from Balaban et al. to Eq. (12), yielding a scaling exponent κ = 0.5 ± 0.1
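As a rough numerical illustration of Eq. (12) and of the relation z = 1/(κγ), the following minimal Python sketch evaluates the crossover frequency kBT/h, the normalized transition width, and the central value of the dynamical exponent. The function names are hypothetical, and the default α = 1 is only an assumption standing in for "a dimensionless factor of the order of unity"; the fitted value of α is not given in the text.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s
K_B = 1.380649e-23          # Boltzmann constant in J/K


def crossover_frequency(temperature_k):
    """Frequency (Hz) at which hf equals k_B T."""
    return K_B * temperature_k / H_PLANCK


def normalized_width(freq_hz, temperature_k, kappa=0.5, alpha=1.0):
    """Eq. (12): Delta B(f, T) / Delta B_DC(T).

    alpha = 1.0 is an illustrative assumption ("of the order of unity"),
    not a value taken from the fit.
    """
    x = H_PLANCK * freq_hz / (alpha * K_B * temperature_k)
    return math.sqrt(1.0 + x * x) ** kappa


def dynamical_exponent(kappa, gamma):
    """Central value of z = 1 / (kappa * gamma); no error propagation."""
    return 1.0 / (kappa * gamma)


if __name__ == "__main__":
    print("crossover at 100 mK (GHz):", crossover_frequency(0.1) / 1e9)
    print("normalized width at 6 GHz:", normalized_width(6e9, 0.1))
    print("z for kappa=0.5, gamma=2.3:", dynamical_exponent(0.5, 2.3))
```

In the limit f → 0 the function returns 1, and for hf ≫ αkBT it approaches the pure power law (hf/αkBT)^κ, which are the two straight-line regimes described for Fig. 5b.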
5 Summary
We have addressed the plateau transition in the quantum Hall effect with a number of different experiments which complete our picture of this quantum phase transition. Analyzing the conductivity in the variable-range hopping regime, we were able to measure the localization length by experiments with varying temperature and frequency. We observed the theoretically expected critical behavior $\xi \propto |B - B_c|^{-\gamma}$ with a universal critical exponent γ = 2.3 ± 0.2. Reducing the size of our devices, we observed conductance fluctuations at the plateau transitions. We have shown that these fluctuations and the transition itself are governed by different length scales with different exponents, as predicted theoretically. In the last part we have addressed the question of universal dynamical scaling of the quantum Hall plateau transition. We measured the conductivity at microwave frequencies and developed a scheme to combine our results with all other experiments on dynamical scaling of the plateau transition. We have shown that nearly all data fall onto a single universal curve and derived a dynamical exponent z ≈ 1, which shows the relevance of interaction.

Acknowledgement
We thank Ulrich Zeitler for his invaluable contributions to the original publications on the experiments presented here. The analysis of the universal dynamical scaling behavior was developed together with Ronald Meisels and Friedemar Kuchar – thanks for the wonderful collaboration. All this work would not have been fruitful without discussions with Ferdinand Evers, Bodo Huckestein, Bernhard Kramer, Dima Polyakov, and Ludwig Schweitzer.
References
1. S. L. Sondhi, S. M. Girvin, J. P. Carini, and D. Shahar, Rev. Mod. Phys. 69, 315 (1997).
2. S. Sachdev, Quantum Phase Transitions (Cambridge University Press, Cambridge, 1999).
3. B. Huckestein, Rev. Mod. Phys. 67, 357 (1995).
4. H. Aoki and T. Ando, Phys. Rev. Lett. 54, 831 (1985).
5. J. T. Chalker and G. J. Daniell, Phys. Rev. Lett. 61, 593 (1988).
6. B. Huckestein and B. Kramer, Phys. Rev. Lett. 64, 1437 (1990).
7. D.-H. Lee and Z. Wang, Phys. Rev. Lett. 76, 4014 (1996).
8. A. M. M. Pruisken, Phys. Rev. Lett. 61, 1297 (1988).
9. Z. Wang, M. P. Fisher, S. M. Girvin, and J. T. Chalker, Phys. Rev. B 61, 8326 (2000).
10. B. Huckestein and M. Backhaus, Phys. Rev. Lett. 82, 5100 (1999).
11. D. G. Polyakov and B. I. Shklovskii, Phys. Rev. B 48, 11167 (1993).
12. S. Koch, R. J. Haug, K. v. Klitzing, and K. Ploog, Phys. Rev. Lett. 67, 883 (1991).
13. D. G. Polyakov and B. I. Shklovskii, Phys. Rev. Lett. 70, 3796 (1993).
14. S. Koch, R. J. Haug, K. v. Klitzing, and K. Ploog, Semicond. Sci. Technol. 10, 209 (1995).
15. F. Hohls, U. Zeitler, and R. J. Haug, Phys. Rev. Lett. 86, 5124 (2001).
16. M. Furlan, Phys. Rev. B 57, 14818 (1998).
17. F. Hohls, U. Zeitler, and R. J. Haug, Phys. Rev. Lett. 88, 036802 (2002).
18. F. Hohls, U. Zeitler, and R. J. Haug, Physica E 12, 670 (2002).
19. G. Ebert, K. von Klitzing, C. Probst, E. Schuberth, K. Ploog, and G. Weimann, Solid State Commun. 45, 625 (1983).
20. A. Briggs, Y. Guldner, J. P. Vieren, M. Voos, J. P. Hirtz, and M. Razeghi, Phys. Rev. B 27, 6549 (1983).
21. K. Ploog, J. Cryst. Growth 81, 304 (1987).
22. F. Hohls, U. Zeitler, R. J. Haug, and K. Pierz, Physica B 298, 88 (2001).
23. H. P. Wei, D. C. Tsui, M. A. Paalanen, and A. M. M. Pruisken, Phys. Rev. Lett. 61, 1294 (1988).
24. S. Koch, R. J. Haug, K. v. Klitzing, and K. Ploog, Phys. Rev. B 43, 6828 (1991).
25. E. Chow, H. P. Wei, S. M. Girvin, and M. Shayegan, Phys. Rev. Lett. 77, 1143 (1996).
26. D. G. Polyakov and K. V. Samokhin, Phys. Rev. Lett. 80, 1509 (1998).
27. P. A. Lee, A. D. Stone, and H. Fukuyama, Phys. Rev. B 35, 1039 (1987).
28. G. Timp, A. M. Chang, P. Mankiewich, R. Behringer, J. E. Cunningham, T. Y. Chang, and R. E. Howard, Phys. Rev. Lett. 59, 732 (1987).
29. A. K. Geim, P. C. Main, P. H. Beton, L. Eaves, S. P. Beaumont, and C. D. W. Wilkinson, Phys. Rev. Lett. 69, 1248 (1992).
30. J. A. Simmons, S. W. Hwang, D. C. Tsui, H. P. Wei, L. W. Engel, and M. Shayegan, Phys. Rev. B 44, 12933 (1991).
31. P. C. Main, A. K. Geim, H. A. Carmona, C. V. Brown, T. J. Foster, R. Taboryski, and P. E. Lindelof, Phys. Rev. B 50, 4450 (1994).
32. A. A. Bykov, Z. D. Kvon, E. B. Ol'shanetskii, L. V. Litvin, and S. P. Moshchenko, Phys. Rev. B 54, 4464 (1996).
33. D. H. Cobden and E. Kogan, Phys. Rev. B 54, R17316 (1996).
34. D. H. Cobden, C. H. W. Barnes, and C. J. B. Ford, Phys. Rev. Lett. 82, 4695 (1999).
35. F. Hohls, U. Zeitler, and R. J. Haug, Ann. Phys. 8, SI97 (1999).
36. T. Machida, S. Ishizuka, S. Komiyama, K. Muraki, and Y. Hirayama, Phys. Rev. B 63, 045318 (2001).
37. R. Schäfer, P. vom Stein, and C. Wallisser, in Advances in Solid State Physics, edited by B. Kramer (Vieweg, Braunschweig, 1999), Vol. 39, p. 583.
38. F. Hohls, U. Zeitler, and R. J. Haug, Phys. Rev. B 66, 073304 (2002).
39. F. Hohls, U. Zeitler, and R. J. Haug, in Proc. 26th Int. Conf. Phys. Semicond., Edinburgh 2002.
40. L. W. Engel, D. Shahar, C. Kurdak, and D. C. Tsui, Phys. Rev. Lett. 71, 2638 (1993).
41. N. Q. Balaban, U. Meirav, and I. Bar-Joseph, Phys. Rev. Lett. 81, 4967 (1998).
42. F. Hohls, U. Zeitler, R. J. Haug, R. Meisels, K. Dybko, and F. Kuchar, Physica E (2002).
43. F. Hohls, U. Zeitler, R. J. Haug, R. Meisels, K. Dybko, and F. Kuchar, in Proc. 26th Int. Conf. Phys. Semicond., Edinburgh 2002.
44. F. Kuchar, R. Meisels, K. Dybko, and B. Kramer, Europhys. Lett. 49, 480 (2000).
45. K. Dybko, R. Meisels, F. Kuchar, G. Hein, and K. Pierz, in Proc. 25th Int. Conf. Phys. Semicond., Osaka 2000, edited by N. Miura and T. Ando (Springer, Berlin, 2001), p. 915.
46. D. Shahar, L. W. Engel, and D. C. Tsui, in Proc. 11th Int. Conf. High Magn. Semicond. (Semimag Boston 1994), edited by D. Heiman (World Scientific, Singapore, 1995), p. 256.
47. R. L. Lewis, Ph.D. thesis, Indiana University, Bloomington, 2001.
48. I. M. Ruzin, N. R. Cooper, and B. I. Halperin, Phys. Rev. B 53, 1558 (1996).
49. F. Hohls, U. Zeitler, R. J. Haug, R. Meisels, K. Dybko, and F. Kuchar, Phys. Rev. Lett. 89, 276801 (2002).
Graphite as a Highly Correlated Electron Liquid
Yakov Kopelevich¹, Pablo Esquinazi², José Henrique Spahn Torres¹, Robson Ricardo da Silva¹, and Heiko Kempa²
¹ Instituto de Física “Gleb Wataghin”, Universidade Estadual de Campinas, Unicamp, 13083-970, Campinas, São Paulo, Brasil
² Abteilung Supraleitung und Magnetismus, Institut für Experimentelle Physik II, Universität Leipzig, Linnéstrasse 5, D-04103, Leipzig, Germany
Abstract. Although a considerable amount of research work has been done on graphite, its physical properties are still not well understood. In the present paper we review recent reports on the occurrence of magnetic-field-driven metal-insulator and insulator-metal transitions, as well as the quantum Hall effect (QHE) in graphite. The experimental results suggest that the low-field (∼ 1 kOe) metal-insulator transition is associated with the transition between Bose metal and excitonic insulator states. On the other hand, the reentrant insulator-metal transition which takes place at higher fields can consistently be understood assuming the occurrence of superconducting correlations caused by the Landau level quantization. We argue that the QHE, observed only for strongly anisotropic quasi-two-dimensional (2D) graphite samples, and superconducting correlations may represent the same phenomenon, implying that Cooper pairs in the quasi-2D samples form a highly correlated boson liquid.
1 Introduction
The apparent metal-insulator transition (MIT) in two-dimensional (2D) electron (hole) systems, which takes place on either varying the carrier concentration or applying a magnetic field H, has attracted broad research interest [1]. Recently, a similar MIT driven by a magnetic field applied perpendicular to the basal planes has been reported for graphite [2,3,4,5]. The quasi-particles (QP) in graphite behave as massless Dirac fermions (DF) with a linear dispersion relation, similar to, e.g., QP near the gap nodes in superconducting cuprates. Theoretical analysis [6,7] suggests that the MIT in graphite is the condensed-matter realization of the magnetic catalysis (MC) phenomenon [8] known in relativistic theories of (2 + 1)-dimensional DF. According to this theory [6,7], the magnetic field H opens an insulating gap in the spectrum of Dirac fermions of graphene, associated with electron-hole (e-h) pairing, below a transition temperature Tce(H) which is an increasing function of field. However, at higher fields and at temperatures T < Tmax(H) an insulator-metal transition (IMT) takes place [2], indicating that additional physical processes may operate on approaching the field HQL that pulls carriers into the lowest Landau level (LLL). The occurrence of superconducting correlations in the Quantum Limit (QL) [9,10] and below the temperature Tmax(H) has been proposed for graphite in [2]. Other theoretical works predict the occurrence of a field-induced Luttinger liquid [11] and the integral quantum Hall effect (IQHE) [12] in graphite. In this article we present experimental results which provide fresh insight into the magnetotransport properties of graphite. In particular, we show that the field-driven MIT as well as the IMT are generic to graphite, occurring in both strongly anisotropic quasi-2D and less anisotropic quasi-3D samples. On the other hand, the Hall resistance Rh(H, T) measurements reveal QHE features only for quasi-2D graphite. The results are discussed in the context of relevant theoretical models.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 207–222, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Samples and Experimental Details
We have performed measurements of both basal-plane Rb(H, T) and Hall Rh(H, T) resistances on several well-characterized [2,3,4,5,13,14] quasi-2D highly oriented pyrolytic graphite (HOPG) samples and on less anisotropic flakes of single-crystalline Kish graphite. Four HOPG samples with room-temperature, H = 0 out-of-plane/basal-plane resistivity ratios ρc/ρb = 8.6 × 10³ (HOPG-1), ∼ 5 × 10⁴ (HOPG-3 and HOPG-UC1), and ∼ 2 × 10⁴ (HOPG-AC), and Kish graphite single crystals (K-1, K-2) with a ratio of ∼ 100 have been studied. The HOPG samples were obtained from the Research Institute “Graphite”, Moscow (HOPG-1, HOPG-3), the Union Carbide Co. (HOPG-UC), and the Advanced Ceramics Co. (HOPG-AC). The ρb values at T = 300 K (H = 0) were ∼ 3 µΩ cm (HOPG-UC), ∼ 5 µΩ cm (HOPG-3, K-1), ∼ 45 µΩ cm (HOPG-1), and ∼ 40 µΩ cm (HOPG-AC). Low-frequency (f = 1 Hz) and dc standard four-probe magnetoresistance measurements were performed on samples with dimensions 4.9 × 4.3 × 2.5 mm³ (HOPG-1), 4 × 4 × 1.2 mm³ (HOPG-3), 5 × 5 × 1 mm³ (HOPG-UC), 2 × 2 × 0.5 mm³ (HOPG-AC), 2.7 × 2.4 × 0.15 mm³ (K-1), and 1.7 × 1.0 × 0.08 mm³ (K-2) in the temperature interval 100 mK ≤ T ≤ 300 K, using different 9 T magnet He cryostats and a dilution refrigerator. The Hall resistance was measured using the van der Pauw configuration with a cyclic transposition of current and voltage leads [15,16] at fixed applied field polarity, as well as with magnetic field reversal; no difference in Rh(H, T) obtained with these two methods was found. For the measurements, silver paste electrodes were placed on the sample surface, while the resistivity values were obtained in a geometry with a uniform current distribution through the sample cross section. All resistance measurements were performed in the Ohmic regime and at various angles between the applied magnetic field and the sample c-axis.
3 Magnetic-Field-Induced Metal-Insulator Transition
3.1 Basal-Plane Resistance Behavior in a Perpendicular Magnetic Field
The transition from metallic-like (dRb/dT > 0) to insulator-like (dRb/dT < 0) behavior of the basal-plane resistance Rb(T, H), driven by a magnetic field applied perpendicular to the graphene planes, was first reported in [3] for the HOPG-1 sample and then observed for all studied graphite samples. Figure 1 presents Rb(T, H) measured for the HOPG-UC and Kish graphite (K-1) samples. As can be seen from Fig. 1, Rb(T) has a metallic character at zero or low enough fields. As the applied field exceeds H ∼ 0.5–1 kOe, Rb(T) becomes insulating-like, suggesting the occurrence of a MIT driven by the magnetic field. Figures 1 and 2 also illustrate that Rb(T) goes through a minimum at the field-dependent temperature Tmin(H > Hc), where Hc is a threshold field below which the metallic state of graphite is preserved. Figure 2 shows the resistance minimum, and Fig. 3 presents Tmin(H) obtained for the K-1 sample; evidently, Tmin(H) is an increasing function of the field. Note the huge low-temperature magnetoresistance: Rb(T, H) varies by about two orders of magnitude in the field interval 0 ≤ µ0H ≤ 1 T, see Fig. 1.
3.2 Possible Metal – Excitonic Insulator Transition
The characteristic feature of the band structure of a single graphene layer is that there are two isolated points in the first Brillouin zone where the band dispersion is linear, $E(k) = v_F |k|$ ($v_F \sim 10^6$ m/s is the Fermi velocity), so that the electronic states can be described in terms of a Dirac equation in two dimensions. According to theory [6,7], a magnetic field applied perpendicular to the graphene planes opens an insulating gap in the spectrum of Dirac fermions, associated with electron-hole pairing, leading to the excitonic insulator state below a transition temperature Tce(H), which can be associated with Tmin(H). As exemplified in Fig. 3 for the K-1 sample, in the vicinity of Hc, Tmin(H) can be well described by the equations
$$T_{min}(H) = A (H - H_c)^{1/2} \qquad (1)$$
and
$$T_{min}(H) = B \left[ 1 - (H_c/H)^2 \right] H^{1/2}, \qquad (2)$$
where A, B and Hc are fitting parameters. Equation (2) corresponds to expression (64) of [7], $T_{ce} \sim (1 - \nu_b^2) H^{1/2}$, obtained within the MC theory, where $\nu_b = 2\pi\hbar c\, n_{2D} / (N_f |eH|) \equiv H_c/H$ is the filling factor, Nf is the number of fermion species (Nf = 2 for graphite), and n2D is the 2D carrier density. We note a difference of two orders of magnitude between the predicted value µ0Hc ≈ 2.5 T ($n_{2D} = n_{3D} d \sim 10^{11}$ cm⁻², with $n_{3D} \sim 3 \times 10^{18}$ cm⁻³ and d = 3.35 Å the distance between graphene planes) [7] and the fitted value µ0Hc = 0.035 T. The discrepancy can be understood, however, assuming that the Coulomb coupling, given by the dimensionless parameter $g = 2\pi e^2 / (\epsilon_0 v)$ ($\epsilon_0$ is the dielectric constant) [6,7], drives the system very close to the excitonic instability. In this case, the threshold field Hc can be well below the estimated value of 2.5 T. The above analysis, together with the experimental evidence that only the perpendicular component of the applied field drives the MIT [17], supports the theoretical expectation of a field-induced excitonic insulator state in graphite.

Fig. 1. Basal-plane resistance Rb(T, H) measured for the samples HOPG-UC and Kish single crystal (K-1) at various magnetic fields applied perpendicular to the graphene planes

Fig. 2. Reduced basal-plane resistance R(T)/R(Tmin) measured for the Kish graphite single crystal (K-1) with H = 400 Oe (✷), H = 500 Oe (◦), H = 750 Oe, and H = 1 kOe. Arrows indicate Tmin(H), the field-dependent temperature which separates insulating-like (T < Tmin) and metallic-like (T > Tmin) resistance behavior; H || c-axis

Fig. 3. Tmin(H) obtained for the K-1 (✷) sample; the dotted line is obtained from Eq. (1) with the fitting parameters A = 350 K/T^{1/2}, µ0Hc = 0.038 T, and the solid line is obtained from (2) with the fitting parameters B = 320 K/T^{1/2}, µ0Hc = 0.035 T; H || c-axis

3.3 Graphite as a Bose Metal
Our analysis revealed [3,5] that the scaling approach used to characterize both the magnetic-field-induced superconductor-insulator (SC-I) quantum phase transition [18] and the field-driven MIT in 2D electron (hole) systems [1] can be equally well applied to the MIT observed in graphite. According to the scaling theory of the SC-I transition [18], the resistance in the critical regime is given by the equation
$$R(\delta, T) = R_{cr}\, f\left( |\delta| / T^{1/z\nu} \right), \qquad (3)$$
where Rcr is the resistance at the transition, $f(|\delta| / T^{1/z\nu})$ is a scaling function such that f(0) = 1, z and ν are critical exponents, and δ is the deviation of a variable parameter from its critical value. With δ = H − Hcr we have plotted R vs. $|\delta| / T^{1/\alpha}$ for the HOPG-3 sample in Fig. 4(a), where α = 0.65 ± 0.05 was extracted from log-log plots of $(dR/dH)|_{H_{cr}}$ vs. $T^{-1}$, and the critical field Hcr = 1.14 kOe was determined experimentally. As can be seen in Fig. 4(a), the resistance data obtained in the temperature range 50 K–200 K collapse onto two distinct branches, below and above Hcr. The analysis performed on various HOPG as well as Kish graphite samples revealed the universality of the scaling [5]. At T < 20 K, where the resistance Rb(T) saturates, a clear deviation from the scaling takes place, reminiscent of the behavior observed in amorphous Mo-Ge films [19]. Recently, it has been suggested that the SC-I-type transition measured, e.g., in Mo-Ge films is in fact a Bose metal – insulator (BM-I) transition, and a two-parameter scaling formula has been proposed to characterize it [20]:
$$R\, T^{1+2/z} / \delta^{2\beta} = f\left( \delta / T^{1/z\nu} \right), \qquad (4)$$
where β = ν(z + 2)/2. This analysis implies the existence of a non-superfluid liquid of Cooper pairs (Bose metal) in the zero-temperature limit. In Fig. 4(b) we use this approach to analyze the data obtained for the HOPG-3 sample. As Fig. 4(b) illustrates, the scaling formula (4) works very well in the whole studied temperature interval, taking Hcr = 1140 Oe, z = 1, and ν = 2/3. Supporting the existence of a BM phase, magnetization measurements [13,21,22,23] indicate the existence of isolated superconducting islands or “grains” embedded into a non-superconducting matrix in graphite samples.
Fig. 4. (a) Basal-plane resistance measured for HOPG-3 at T = 10 (- -), 20 (−), 50, 100 (◦), and 200 K (✷) plotted vs. the scaling variable $|H - H_{cr}| / T^{1/\alpha}$, where Hcr = 1140 Oe and α = 0.65. (b) Scaling analysis of the same Rb(T, H) data presented in (a), assuming a Bose metal - insulator transition [20]; Hcr = 1140 Oe, z = 1, and ν = 2/3
Such a system, depending on the SC island size, the distance between neighboring islands, and the conducting properties of the matrix, can either undergo a SC phase transition on decreasing the temperature or remain a normal metal [24,25,26]. A similar picture has been proposed to account for the pseudogap phase in cuprates [27,28]. From the results presented in Sects. 3.2 and 3.3 it is reasonable to conclude that a field-driven BM – excitonic insulator (EI) transition takes place, originating from an interplay between excitonic and superconducting instabilities. The low-temperature Rb(T) saturation on the “insulating side” of the transition, see Fig. 1, can be accounted for by the quenched disorder of the sample within the framework of the EI model [29], or by assuming a non-superfluid liquid of vortices as in the BM-I transition model [24]. The similarity between the MIT measured in graphite and in 2D systems provides additional evidence for the quasi-2D nature of graphite. An important piece of evidence regarding the dimensionality of graphite can be obtained by studying the angle dependence of the out-of-plane magnetoresistance, given in the next section.
4 Electrical Transport Normal to the Graphene Planes
The electrical transport perpendicular to the graphene layers, i.e. parallel to the c-axis, is still one of the long-running problems of ideal graphite. As has been pointed out by Kelly [30], it is unclear how large the true anisotropy ratio ρc/ρb (the ratio of the resistivities perpendicular and parallel to the planes) is. There is no doubt, however, that the sample quality affects the absolute values as well as the magnetic field and temperature dependence of both resistivities. From the different views published in the literature, and taking into account our results for Kish graphite and HOPG samples, we estimate an ideal, intrinsic ratio ρc/ρb ≥ 10⁵ at low temperatures. The experimental situation is indeed difficult because lattice defects can affect the two resistivities differently. For example, defects in the planes may act as scattering centers for the basal electrons. On the other hand, some topological basal-plane defects like pentagons and heptagons can trigger localized superconductivity, and therefore an increase of the defect density may decrease the resistivity [31]. However, these dislocations, added to stacking faults, can act as localisation centers and as reflecting barriers [32], thus increasing ρc. For example, ρc(0, 0) ∼ 10 µΩm for Kish graphite but ρc(0, 0) ∼ 10³ µΩm for some of the HOPG samples. The difference between those samples is given by the sample perfection and defect density, which simultaneously determine the quasi-2D behavior. Kish graphite behaves as a quasi-3D sample, whereas the best HOPG samples show quasi-2D behavior. These transport properties, as well as the amplitude of the angular dependent magnetoresistance oscillations (ADMRO), are correlated with the broadening of the X-ray rocking curve (full width at half maximum (FWHM)). If the measured ρc(T, H) does not reflect a purely intrinsic property, then one expects that defects short-circuit the layers, affecting the c-axis transport properties. In this case it could be that the measured c-axis transport simply reflects the in-plane electronic states of the graphene sheet.
The question of coherent or incoherent transport along the c-axis is directly connected to the above discussion. As pointed out in [33] for graphite intercalation compounds, one would speculate that if the mean free path is smaller than the distance between layers, then the c-axis conduction cannot be coherent. On the other hand, Sato et al. [34] argued that a very large resistivity ratio is not necessarily proportional to the ratio of the corresponding mean free paths, which is applicable when the anisotropies of the Fermi velocity and effective mass are not so large. With a cylindrical Fermi surface including a dispersion relation (weakly corrugated Fermi surface) in the normal direction of the form
$$E = \hbar^2 (k_x^2 + k_y^2) / 2 m_{||} - 2t \cos(d k_z), \qquad (5)$$
where m|| and t are the mass for the in-plane motion and the transfer interaction between adjacent planes, these authors argued that the large anisotropy ratio in the resistivity can be understood basically as the ratio between effective masses. The transverse effective mass is given in this model as $m_\perp = \hbar^2 / 2 d^2 t$ and the anisotropy as $\rho_c/\rho_b \simeq (d k_{F||} m_\perp)^2 / m_{||}^2$ [34]. If we use this model for graphite (note that the dispersion relation is not the appropriate one for graphite) we obtain t ∼ 3 (30) meV for an anisotropy ratio of 10⁴ (10²), with the parameters d = 0.335 nm, $k_{F||} \sim 9 \times 10^7$ m⁻¹ and an effective mass for carriers parallel to the planes $m_{||} \simeq 0.05 m_e$, where me is the free electron mass. Such a small t-value may indicate the lack of coherent transport and band conduction. We stress that in the past a value of 0.3 eV was obtained for the transfer interaction between planes for graphite. As we will show below, this value appears to be misleading. We therefore need model-independent experimental evidence to prove whether or not the transport across the graphene layers is coherent. One possible way to test coherent transport across the graphite layers is given by the measurement of a maximum in the angle dependence of ρc(θ) at magnetic fields parallel to the layers [35,36,37] (the angle θ is defined between the field direction and the c-axis). This peak should be absent for incoherent interlayer transport but observed if the inequality ωcτ > 1 holds, where ωc is the cyclotron frequency and τ the relaxation time of the carriers. Coherent transport therefore means that band states extend over many layers and a 3D Fermi surface can be defined. In the other case, incoherent transport is diffusive and neither a 3D Fermi surface nor the Bloch-Boltzmann transport theory is applicable [35]. In order to check this and the role played by disorder, we have performed measurements of the out-of-plane electrical resistivities with high angle resolution. Figure 5 shows the results for three samples at 9 T and 2 K. We observe that a weak coherent peak in ρc occurs around the parallel orientation (90°), and that it is larger the larger the FWHM of the corresponding rocking curve. The asymmetry seen in the angle dependence is in part due to a small experimental misalignment of the surface of the sample and to the lack of crystal perfection. As expected, the coherent peak decreases the smaller the FWHM.
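The order of magnitude of the transfer interaction t quoted above can be checked by inverting the two relations of the weakly corrugated Fermi surface model. The sketch below assumes the anisotropy relation in the form ρc/ρb ≃ (d kF|| m⊥/m||)² with m⊥ = ħ²/(2d²t); that reading of the formula, and the function name, are assumptions made for illustration. The parameter values are the ones given in the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s
M_E = 9.1093837015e-31   # free electron mass in kg
EV = 1.602176634e-19     # one electron volt in J


def transfer_integral_mev(anisotropy, d=0.335e-9, k_f=9.0e7, m_par=0.05 * M_E):
    """Interlayer transfer interaction t (meV) for a given rho_c / rho_b.

    Assumes rho_c / rho_b = (d * k_f * m_perp / m_par)**2 and
    m_perp = hbar**2 / (2 * d**2 * t), solved for t.
    """
    m_perp = math.sqrt(anisotropy) * m_par / (d * k_f)
    t_joule = HBAR ** 2 / (2.0 * d ** 2 * m_perp)
    return t_joule / EV * 1.0e3


if __name__ == "__main__":
    for ratio in (1.0e4, 1.0e2):
        print(ratio, transfer_integral_mev(ratio))
```

Under these assumptions t comes out at a few meV for a ratio of 10⁴ and a few tens of meV for 10², the same order of magnitude as the values quoted in the text, and it scales as the inverse square root of the anisotropy ratio.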
Our results indicate that lattice defects not only affect the transport as scattering centers but they contribute to enhance the coupling between the layers giving rise to a 3D-like electronic spectrum and coherent transport. The absence of coherent peak in ideal samples may be related either to incoherent transport or because ωc τ < 1 holds. Although the validity of semiclassical criteria for incoherent transport is under discussion [36] we use the IoffeRegel-Mott maximum metallic resistivity ρmax criterion to evaluate coherent transport [38]. We obtain that only for the HOPG samples ((b) and (c) in Fig. 5 (left)) ρc > ρmax in the whole T − and H−range, in agreement with the observed vanishingly small coherent peak. The ADMRO effect was first reported for a quasi-2D organic conductor [39] and explained in terms of weakly corrugated cylindrical Fermi surface [40]. Different corrugation symmetries, from a s-type (Eq. (5)) to dxy -type, have been treated theoretically assuming always a quadratic energy dispersion relation for the in-plane term [41]. For a band conduction between planes we would expect that ρc shows a maximum at a field parallel to the planes. This behavior was indeed observed in the original paper of Kajita et al. [39] and in some graphite intercalation
Graphite as a Highly Correlated Electron Liquid
215
7.0 6.9
(a) 90
92
7.70 (b) 7.65 89.8
90.0
90.2
0.8
94
90.4
ρc(θ) / ρc(0)
ρc (mΩm)
88 7.75
89.6
Kish graphite HOPG AC HOPG UC
1.0
6.8
0.6 0.4
B=5T T=2K
c θ
B
90.6
6.72
0.2
6.68
0.0 -10 0 10 20 30 40 50 60 70 80 90 100 110
6.64 6.60
(c) 89.5
90.0
90.5
Angle θ (degree)
Angle θ (degree)
91.0
Fig. 5. Left: Angle dependence around 90◦ of the c-axis resistivity of: (a) a Kish graphite sample K-2 (FWHM = 1.6◦ ), (b) a HOPG sample (HOPG AC) with FWHM = 0.40◦ , and (c) a HOPG sample with FWHM = 0.24◦ (HOPG-UC2) samples at µ0 H = 9 T (B = µ0 H) and at 2 K. θ = 90◦ means that field is applied parallel to the graphene planes (after [17]). Right: Angle dependence of the normalized c-axis resistance at a field of 5 T and a temperature of 2 K for the same samples as in the left figure. The continuous line is obtained from (7) with n = 0.75
compounds [34], in which the oscillations are well explained by the standard ADMRO model. However, clear deviations from the standard model were observed in some intercalated graphite samples in which, for example, ρc(H ∥ c) > ρc(H ⊥ c) [34]. In this case the authors argue that incoherent hopping, instead of coherent band conduction, can be the origin of the deviations [34]. The relevance of different corrugation symmetries has been studied in [41]. However, no theoretical study assuming a linear dispersion relation has been done yet. Nevertheless, a comparison between the results for graphite intercalation compounds and for HOPG and Kish graphite samples is of interest. We should keep in mind that the density of carriers in the intercalated compounds is orders of magnitude larger than in pure graphite, and therefore some differences are expected. Measurements of ρc(θ, H, T) are shown in Fig. 5 (right) for the same three samples presented in Fig. 5 (left). We see that ρc(H ∥ c) > ρc(H ⊥ c) for all samples. Also, the oscillations clearly decrease in amplitude with decreasing FWHM, indicating that extrinsic factors such as the mosaicity of the sample play a role. We argue that the observed angular dependence (Fig. 5, right) is not related to an intrinsic property of ρc but to ρb. As was shown experimentally in [17], ρb(B) depends only on the normal field component; then, if ρc(θ) ∝ ρb(θ), we can write for the measured c-axis resistivity at a given
temperature and field:

$$\rho_c(\theta) \simeq \rho_c(90^\circ) + \alpha\,\rho_b(B\cos\theta) , \qquad (6)$$
where α is a free parameter. In Eq. (6) we assume that no coherent conduction exists between the planes and that all the angular (as well as the field) dependence is given by the in-plane resistivity. For µ0H ≥ 1 T we get for the HOPG-AC sample ρc ∝ H^n with n = 0.75 [5], similar to the dependence obtained for the in-plane resistivity. Therefore we can write the normalized c-axis resistivity as

$$\frac{\rho_c(\theta)}{\rho_c(0)} \simeq \beta + \gamma(\cos\theta)^n , \qquad (7)$$
where β and γ are chosen to match the normalization conditions. Figure 5 (right) shows a fit to the HOPG-AC sample data using Eq. (7). The good fit indicates that the angular dependence observed in ρc is due only to the perpendicular component of the field, which affects the basal-plane resistivity and, through it, ρc. We conclude this section by arguing that there appears to be no evidence for band-like, coherent conduction between basal planes in ideal graphite.
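The normalized form of Eq. (7) can be sketched numerically as follows. This is a minimal illustration, not the fitting procedure used by the authors: it assumes that the normalization conditions mean β + γ = 1 at θ = 0, so that β equals the ratio ρc(90°)/ρc(0); the value of β below is hypothetical, and the absolute value of cos θ is taken only so that angles slightly beyond 90° can be evaluated.

```python
import math


def normalized_rho_c(theta_deg, beta, n=0.75):
    """Eq. (7): rho_c(theta)/rho_c(0) = beta + gamma * |cos(theta)|**n.

    Assumption: gamma is fixed by the normalization at theta = 0,
    beta + gamma = 1, so beta plays the role of rho_c(90 deg)/rho_c(0).
    """
    gamma = 1.0 - beta
    cos_theta = abs(math.cos(math.radians(theta_deg)))
    return beta + gamma * cos_theta ** n


# Hypothetical value of beta, chosen only for illustration.
beta_example = 0.2
for theta in (0, 30, 60, 90):
    print(theta, round(normalized_rho_c(theta, beta_example), 3))
```

With n = 0.75 the curve falls monotonically from 1 at θ = 0 to β at θ = 90°, which is the qualitative shape the fit in Fig. 5 (right) relies on.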
5
Reentrant Metallic State in the Quantum Limit
As specified in the Introduction, reentrant metallic behavior, i.e. dRb/dT > 0, takes place in graphite in the QL. Figure 6 illustrates the reappearance of the metallic state at T < Tmax(H) with increasing field (see also Fig. 1), and the inset in Fig. 6 demonstrates the Shubnikov-de Haas (SdH) oscillations associated with the Landau level quantization. The noticeable difference in the high-field behavior between HOPG and Kish graphite samples is a multiple crossing of the Rb(H, T) isotherms measured in HOPG (Fig. 7) and its absence in the case of Kish graphite, see inset in Fig. 6. The appearance of plateaus in the Hall resistance Rh(H), shown in Fig. 8, is another characteristic feature of quasi-2D HOPG [42]. The results of Fig. 8 suggest the occurrence of the QHE in HOPG. Following the analysis of transitions between adjacent quantum Hall plateaus [43], in the inset of Fig. 8 we plot the maximum slope (d|Rh|/dH)max vs. T^−1 associated with the largest step in Rh(H, T) measured at ∼ 3.5 T. At T ≥ 1.5 K this slope is ∝ T^−κ with an exponent κ = 0.42 (0.45) for the HOPG-UC (HOPG-3) sample. Numerous experiments performed on QHE systems showed that κ varies from sample to sample and can even depend on whether it is determined from Hall or longitudinal resistance measurements. Nevertheless, it is interesting to note that the exponent κ obtained here agrees with that predicted for transitions between both IQHE and fractional QHE (FQHE) plateaus [44,45]. The observed saturation in
[Figure 6. Main panel: Rb (Ω) vs. T (K) for µ0H = 0.5, 1, 2, 3, 4, 5, 6, 7, 8, and 9 T, with Tmax(H) marked. Inset: Rb (Ω) vs. µ0H (T).]
Fig. 6. Basal-plane resistance measured in single-crystalline Kish graphite sample (K-1) in the high-field regime; arrow: Tmax (H) below which reentrant metallic state takes place. Inset: Rb (H) at T = 20, 10, 5, and 2 K (top to bottom). After [42]
(d|Rh|/dH)max vs. T^−1 at T < 1.5 K, see inset in Fig. 8, is also similar to that found in QHE systems, but its origin is still unclear, see e.g. [46]. We stress that Rh(H, T) is the Hall resistance measured for the bulk sample, which translates to, e.g., ρh = 3.5 mΩcm at the main plateau for the HOPG-UC sample. This gives Rh□ = ρh/d ∼ 10 kΩ, i.e. only a factor ∼ 2.5 less than the Hall resistance quantum h/e². We note further that if the QHE-like behavior of HOPG samples is related to their quasi-2D nature, the lack of any signature of the QHE in Kish graphite provides additional evidence for its 3D character. On the other hand, the reentrant metallic state takes place for all filling factors or magnetic fields H > HQL ∼ 4 T for HOPG samples, and µ0H > 0.2 T for the K-1 sample, indicating that the QHE alone cannot account for this effect. In what follows we argue that the reentrant metallic state in both HOPG and Kish graphite samples can be caused by a common mechanism associated with Cooper-pair formation. The appearance or reappearance of superconducting correlations in the regime of Landau level quantization has been predicted by several theoretical groups (for review articles see [9,10]). According to the theory, superconducting correlations in a quantizing field result from the increase of the 1D density of states N1(0) at the Fermi level. In the quantum limit (H > HQL) the superconducting critical temperature TSC(H) for a 3D system is given by the equation [9]

$$T_{SC}(H) = 1.14\,\Omega \exp\left[-2\pi l^2 / N_1(0)V\right] , \qquad (8)$$
[Figure 7. Main panel: Rb (mΩ) vs. µ0H (T) at T = 2, 3, 5, and 10 K. Lower inset: detail of the isotherm crossing. Upper inset: Rb(T)/R(Tx) vs. T (K) for H = 0, µ0H = 2 T, and µ0H = 8 T.]
Fig. 7. Basal-plane resistance measured in the HOPG-3 sample at four temperatures, demonstrating crossings of the Rb(H) isotherms, i.e. the sequence of field-driven metal-insulator-metal transitions. Lower inset: detailed view of the crossing in the Rb(H) isotherms; T = 2 K (•), 5 K (dotted line), 10 K (solid line). Upper inset: normalized resistance r = Rb(T)/R(Tx), where Tx = 18 K (H = 0, µ0H = 8 T) and Tx = 11 K (µ0H = 2 T). After [42]
where 2πl²/N1(0) ∼ 1/H², l = (ħc/eH)^{1/2}, V is the BCS attractive interaction, and Ω is the energy cutoff on V. In the 2D case, TSC increases linearly with field [47],

$$T_{SC}(H) \sim eHVN(0)/m^{*}c . \qquad (9)$$
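Two of the scales used in this section can be checked with a few lines of arithmetic: the Hall resistance quantum h/e², against which the value of about 10 kΩ was compared, and the magnetic length l entering Eq. (8). The sketch below is only illustrative. It uses the SI form l = (ħ/eB)^{1/2}, which is the SI counterpart of the Gaussian expression (ħc/eH)^{1/2}, and the function names are hypothetical.

```python
import math

# Exact SI values of the Planck constant and the elementary charge.
H_PLANCK = 6.62607015e-34   # J s
E_CHARGE = 1.602176634e-19  # C
HBAR = H_PLANCK / (2.0 * math.pi)


def hall_resistance_quantum():
    """Return h/e^2 in ohms."""
    return H_PLANCK / E_CHARGE ** 2


def magnetic_length(b_tesla):
    """Return l = sqrt(hbar / (e B)) in metres (SI form of the magnetic length)."""
    return math.sqrt(HBAR / (E_CHARGE * b_tesla))


r_quantum = hall_resistance_quantum()
print("h/e^2 in kOhm:", r_quantum / 1e3)
print("ratio to 10 kOhm:", r_quantum / 10e3)
print("l at 4 T in nm:", magnetic_length(4.0) * 1e9)
```

Since l² scales as 1/B, doubling the field halves l², which is the field dependence that enters the exponent of Eq. (8).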
On the other hand, the authors of [48] have shown that TSC(H) in 2D can be a complicated function that deviates from the linear form. Figure 9 demonstrates that Tmax(H) is an increasing function of field, in qualitative agreement with Eq. (9) and the 2D predictions [47,48]. Above a certain field H > HQL a reentrant decrease of TSC is also expected [9], which is consistent with the saturation in Tmax(H), see Fig. 9. The occurrence of either spin-singlet or spin-triplet [49] superconductivity in graphite may be possible in the QL. Theory predicts an oscillatory behavior of TSC(H) at H < HQL, i.e. with increasing number of occupied Landau levels. Indeed, a non-monotonic Tmax(H) is observed for all HOPG samples at µ0H < 4 T, see Fig. 9. The absence of pronounced Tmax vs. H oscillations in Kish graphite can naturally be understood by taking into account its lower anisotropy. In the (quasi-)2D case the density of states N(0)
[Figure 8. Main panel: −Rh (mΩ) vs. µ0H (T) for HOPG-3, HOPG-UC, and K-1. Inset: (d|Rh|/µ0dH)max (mΩ T^−1) vs. 1/T (K^−1) for HOPG-3 and HOPG-UC.]
Fig. 8. Hall resistance Rh(H, T) measured for the HOPG-3 sample from 100 mK to 20 K, for HOPG-UC at T = 4.2 K, and for K-1 (Rh/10) at T = 1.5 K. Inset shows (dRh/dH)max vs. 1/T for the HOPG samples; dashed and solid lines are linear fits to the function ∼ T^−κ with κ = 0.42 (HOPG-UC) and 0.45 (HOPG-3). After [42]
[Figure 9. Tmax (K) vs. µ0H (T) for K-1, HOPG-1, HOPG-3, and HOPG-UC.]
Fig. 9. Tmax vs. H for several studied samples. After [42]
is a set of delta functions (broadened, however, by quenched and thermal disorder) corresponding to different Landau levels, and hence Tmax should oscillate more strongly with field in HOPG, as observed. The value Tmax(9 T) = 62 K obtained for Kish graphite is much higher than Tmax(9 T) = 11 K measured for the
strongly anisotropic HOPG-UC sample. This fact can be understood by taking into account quantum and/or thermal fluctuations [9,10], which are stronger in quasi-2D HOPG and hence can effectively reduce Tmax (i.e. TSC). It is expected that below TSC(H), for 3D samples, the resistance along the applied field vanishes and the resistance perpendicular to the field direction shows a drop. However, in graphite both the c-axis and basal-plane resistances remain finite due to the layered crystal structure, implying the occurrence of superconducting correlations without macroscopic phase coherence. In Fig. 10 we compare Rb(T) and the magnetization M(T) measured for the HOPG-UC sample at µ0H = 1 T and 5 T (inset), viz. in the QL, illustrating that the reentrant metallic state(s) is (are) accompanied by an enhanced diamagnetic response, supporting both the superconductivity- and QHE-based scenarios for the field-induced metallic state(s). Actually, the QHE and superconductivity can represent the same phenomenon: in the quasi-2D HOPG, Cooper pairs can form a highly correlated boson liquid, analogous to the QHE fermionic systems [50,51]. If such an interpretation is correct, this would be the first observation of the QHE in a boson system. Before closing, we note that the results presented in this section provide a possible solution of the long-standing problem of the metallic resistance behavior (dRb/dT > 0) in graphite in the QL even below 1 K [52]. A weak logarithmic increase in Rb(T) measured for HOPG samples at T < Tmax(H),
[Figure 10. Main panel: ∆M/M and ∆R/R vs. T (K), with Tmax marked. Inset: M (G) vs. T (K) at µ0H = 5 T.]
Fig. 10. ∆M/M = [M(T) − M(Tmax)]/M(Tmax) and ∆R/R = [Rb(T) − Rb(Tmax)]/Rb(Tmax) measured for the HOPG-UC sample at µ0H = 1 T. Inset: magnetization M(T) measured at µ0H = 5 T. After [42]
see the upper inset in Fig. 7 and [2], remains to be clarified. In particular, the formation of a Wigner crystal or charge density wave of Cooper pairs might be possible in the quasi-2D systems [50,51].
Acknowledgements
The experimental work presented in this chapter has been made possible by the support of the following institutions and grants: FAPESP, CNPq, CAPES, DFG ES 86/6-3, and the DAAD. We gratefully acknowledge the interest of and discussions with D. V. Khveshchenko, I. A. Shovkovy, Z. Tesanovic, F. Guinea, and M. Vozmediano.
References
1. E. Abrahams et al., Rev. Mod. Phys. 73, 251 (2001), and references therein.
2. Y. Kopelevich et al., Phys. Solid State 41, 1959 (1999); Fiz. Tverd. Tela (St. Petersburg) 41, 2135 (1999).
3. H. Kempa et al., Solid State Commun. 115, 539 (2000).
4. M. S. Sercheli et al., Solid State Commun. 121, 579 (2002).
5. H. Kempa et al., Phys. Rev. B 65, 241101(R) (2002).
6. D. V. Khveshchenko, Phys. Rev. Lett. 87, 206401 (2001); ibid. 87, 246802 (2001).
7. E. V. Gorbar et al., Phys. Rev. B 66, 045108 (2002).
8. V. P. Gusynin et al., Phys. Rev. Lett. 73, 3499 (1994).
9. M. Rasolt and Z. Tesanovic, Rev. Mod. Phys. 64, 709 (1992).
10. T. Maniv et al., Rev. Mod. Phys. 73, 867 (2001).
11. C. Biagini et al., Europhys. Lett. 55, 383 (2001).
12. Y. Zheng and T. Ando, Phys. Rev. B 65, 245420 (2002).
13. Y. Kopelevich et al., J. Low Temp. Phys. 119, 691 (2000).
14. P. Esquinazi et al., Phys. Rev. B 66, 024429 (2002).
15. Y. Kopelevich et al., Sov. Phys. Solid State 26, 1607 (1984).
16. M. Levy and M. P. Sarachik, Rev. Sci. Instrum. 60, 1342 (1989).
17. H. Kempa et al., Solid State Commun. 125, 1 (2003).
18. M. P. A. Fisher, Phys. Rev. Lett. 65, 923 (1990).
19. N. Mason and A. Kapitulnik, Phys. Rev. Lett. 82, 5341 (1999).
20. D. Das and S. Doniach, Phys. Rev. B 64, 134511 (2001).
21. R. R. da Silva et al., Phys. Rev. Lett. 87, 147001 (2001).
22. Yang Hai-Peng et al., Chin. Phys. Lett. 18, 1648 (2001).
23. S. Moehlecke et al., Phil. Mag. B 82, 1335 (2002).
24. D. Das and S. Doniach, Phys. Rev. B 60, 1261 (1999).
25. M. V. Feigel'man et al., Phys. Rev. Lett. 86, 1869 (2001).
26. B. Spivak et al., Phys. Rev. B 64, 132502 (2001).
27. Yu. N. Ovchinnikov et al., Phys. Rev. B 63, 064524 (2001).
28. A. A. Abrikosov, Phys. Rev. B 63, 134518 (2001).
29. J. Zittartz, Phys. Rev. 165, 605 (1968).
30. B. T. Kelly: Physics of Graphite (Applied Science Publishers LTD, London and New Jersey 1981), pp. 267 ff, 293 ff.
31. J. Gonzáles et al., Phys. Rev. B 63, 134421 (2001).
32. S. J. Ono, J. Phys. Soc. Japan 40, 49 (1976).
33. K. Sugihara, Phys. Rev. B 29, 5872 (1984).
34. H. Sato et al., J. Phys. Soc. Japan 69, 1136 (2000).
35. P. Moses, R. H. McKenzie, Phys. Rev. B 60, 7998 (1999).
36. J. Singleton et al., Phys. Rev. Lett. 87, 117001 (2001).
37. J. Wosnitza et al., Phys. Rev. B 65, 180506(R) (2002).
38. J. J. McGuire et al., Phys. Rev. B 64, 94503 (2001).
39. K. Kajita et al., Solid State Commun. 70, 1189 (1989).
40. K. Yamaji, J. Phys. Soc. Japan 58, 1520 (1989).
41. R. Yagi and Y. Iye, Solid State Commun. 89, 275 (1994).
42. Y. Kopelevich et al., Phys. Rev. Lett. 90, 156402 (2003).
43. H. P. Wei et al., Phys. Rev. Lett. 61, 1294 (1988).
44. A. M. M. Pruisken, Phys. Rev. Lett. 61, 1297 (1988).
45. S. Kivelson et al., Phys. Rev. B 46, 2223 (1992).
46. L. P. Pryadko and A. Auerbach, Phys. Rev. Lett. 82, 1253 (1999).
47. A. H. MacDonald et al., Aust. J. Phys. 46, 333 (1993).
48. V. P. Gusynin et al., JETP 80, 1111 (1995).
49. A. V. Andreev and E. S. Tesse, Phys. Rev. B 48, 9902 (1993).
50. Z. Tesanovic and M. Rasolt, Phys. Rev. B 48, 9902 (1993).
51. Z. Tesanovic, J. Supercond. 8, 775 (1995).
52. Y. Iye et al., Phys. Rev. B 30, 7009 (1984).
Non-equilibrium Transport and Relaxation in Diffusive Nanowires with Kondo Impurities
Johann Kroha1, Achim Rosch2, Jens Paaske2,3, and Peter Wölfle2
1 Physikalisches Institut, Universität Bonn, Nussallee 12, 53115 Bonn, Germany
2 Institut für Theorie der Kondensierten Materie, Universität Karlsruhe, Postfach 6980, 76128 Karlsruhe, Germany
3 Ørsted Laboratory, Niels Bohr Institute, University of Copenhagen, Universitetsparken 5, DK-2100 Copenhagen, Denmark
Abstract. Combining non-equilibrium transport with spectroscopic measurements provides a unique tool for the investigation of the microscopic processes in mesoscopic conductors. Experiments on resistive quantum wires show that the nonequilibrium quasiparticle distribution function f (E, V ) as a function of the quasiparticle energy E approximately obeys the scaling property, f (E, V ) = f (E/V ), if the transport voltage V exceeds a certain crossover scale V ∗ . This scaling indicates anomalous inelastic relaxation processes to be present. It is demonstrated that the latter can be induced by quantum impurities with a degenerate internal degree of freedom, i.e. by Kondo impurities. We review a perturbative renormalization group method to describe the Kondo effect in an arbitrary stationary non-equilibrium situation as well as in a magnetic field, and show that the experiments are explained in detail by a very low concentration of Kondo impurities, with V ∗ ≈ TK , the Kondo temperature. It is discussed how this provides a possible explanation of the observed low-temperature plateau of the decoherence time in mesoscopic conductors.
1
Introduction
Recent years have witnessed a tremendous development in the fabrication techniques of nanoscopic devices as well as in the preparation and control of quantum states in electronic systems. The functionality of such devices usually relies on the coherence of the quantum states involved. However, it has been recognized that in disordered, metallic quantum wires the phase coherence time τϕ, as extracted from the weak localization correction to the magnetoresistance, seems to saturate for low temperatures T at a plateau value [1] rather than diverge as τϕ ∼ T^−2/3, as expected from the theory of one-dimensional diffusive electron systems with Coulomb interaction [2,3]. It can be of crucial importance for the design of quantum coherent electron devices to uncover the microscopic origin of the underlying dephasing processes. Roughly at the same time when the notion of the anomalous dephasing time plateau was introduced as a physical effect, it was demonstrated by the Saclay group that a detailed investigation of the quasiparticle distribution function f in a quantum device in stationary non-equilibrium is an exceedingly sensitive tool to investigate the interactions at work [4]. This is because the shape of the distribution f, as compared to a non-interacting system, is only influenced by inelastic, i.e. dephasing, processes, while linear response measurements like the conductivity are in general hampered by a large elastic impurity scattering background. In this article we review the theoretical analysis of direct measurements of the non-equilibrium distribution function in resistive metallic nanowires. The experiments on Au and Cu wires [4,5] reveal a scaling property of the distribution function at a position x along the wire in terms of the quasiparticle energy E and the applied transport voltage V, fx(E, V) = fx(E/V), for voltages exceeding a crossover scale V∗. As will be shown, the origin of this behavior can be traced back to an extremely low concentration (a few ppm) of Kondo impurities present in these nanowires, with V∗ ≈ TK, the Kondo temperature. For a general introduction to the physics of Kondo impurities see [6,7]. On the other hand, samples of even greater nominal purity [5] as well as Ag wires [8] do not show the low-temperature anomaly. To calculate the distribution function theoretically, we have previously used the so-called Non-Crossing Approximation (NCA) for Anderson impurities in the Kondo regime out of equilibrium [9,10]. The theory allows for a quantitative comparison with experiment. In fact, after determining TK as the energy scale V∗ where deviations in the distribution function from the scaling form occur, the concentration of Kondo impurities in the wire is the only adjustable parameter of the theory and, hence, can be extracted by fitting the theoretical curves to experiment.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 223–235, 2003. © Springer-Verlag Berlin Heidelberg 2003
Comparing the impurity concentration determined in this way to the low-temperature plateau values of the dephasing time in several similarly fabricated nanowires strongly suggests that the dephasing time plateau is also caused by Kondo scattering [5,9,10]. Very recently there have been experimental studies probing specifically the question whether these impurities are of magnetic or of non-magnetic nature. These experiments investigated (1) the non-equilibrium distribution in an external magnetic field B [11] and (2) the decoherence time in wires with deliberately added magnetic impurities [12]. The results strongly suggest that the Kondo impurities are of magnetic origin. As for the theoretical description, it is well known that the NCA fails in the presence of a magnetic field. Theoretical studies for B ≠ 0 have been done in [13], but do not permit a satisfactory comparison with experiments. Therefore, we will use here the concept of a perturbative coupling constant renormalization group (RG) method, recently developed by us [14] to describe a single Kondo defect or quantum dot in stationary non-equilibrium, i.e. coupled to two electron reservoirs at different chemical potentials, with or without an external magnetic field. We outline its generalization to the present case of a quantum wire with a finite concentration of Kondo impurities and an arbitrary, position dependent non-equilibrium distribution
function. In this case, the quantum Boltzmann equation for fx (E, V ) must be solved along with the RG equations. Alternatively, we use in the case of vanishing magnetic field an auxiliary particle representation of the Anderson impurity model in the Kondo regime [6], using the so-called Non-Crossing Approximation (NCA) in non-equilibrium [9], and show corresponding numerical results. The paper is organized as follows. In Sect. 2 we recollect the experimental setup to measure the non-equilibrium distribution function in a metallic wire, along with the most important experimental results. The perturbative RG method for a Kondo defect away from equilibrium and its generalization to a finite defect concentration and arbitrary distribution function are outlined in Sect. 3. The quantitative fits of the Kondo-induced distribution functions calculated within NCA to the experimental results are shown in Sect. 4. Here we also discuss the connection to related measurements of the decoherence time in quantum wires fabricated in a similar way, before some conclusions are drawn in Sect. 5.
2
Experimental Method and Results
The distribution function f(E, U) was measured [4,8] in a three-terminal setup where a non-equilibrium current was driven through a Cu, Au, or Ag nanowire contacted by two reservoirs at chemical potentials µL,R = ±V/2, respectively (Fig. 1). (Throughout we will use units such that ħ = 1, kB = 1, e = 1, and µ = 1, where e and µ are the elementary charge and the magnetic moment, respectively.) In addition, a superconducting Al tunneling junction was attached at a position x along the wire, the Al slab being in equilibrium with itself. For a voltage U across the junction the tunneling current is given by

$$j_{\mathrm{tunnel}} = \frac{e}{\hbar}\,|t|^2 \sum_\sigma \int dE\, \left[ f(E) - f^o(E+U) \right] N_0 N_{sc}(E+U) , \qquad (1)$$

where t is the (assumed energy independent) tunneling matrix element, f^o(E) is the Fermi distribution function in the superconductor, and N0, Nsc denote the density of states in the wire and in the superconductor, respectively. Since for the voltages used in the experiment N0 is flat and the BCS density of states
[Figure 1. Schematic: wire with bias V, superconducting (SC) tunnel probe at position x with voltage U; sketch of f(E) vs. E.]
Fig. 1. Experimental setup for measuring the non-equilibrium distribution function in a metallic nano-wire. Also sketched is the distribution function as measured at position x along the wire, and compared to the hot-electron case
Nsc is measured independently, the pronounced peak structure of Nsc(E) can be used to deconvolute Eq. (1) to obtain the non-equilibrium distribution fx(E, V) at position x in the wire. Measurements in a magnetic field were also performed, exploiting the Al'tshuler-Aronov dip [3] in the density of states of a strongly disordered (instead of superconducting) slab of the tunneling contact in order to deconvolute Eq. (1). The electronic transport in the wire is diffusive with diffusion coefficient D. Length L and thickness d of the wire are such that the diffusion modes are one-dimensional, but the Fermi surface is three-dimensional, and the Coulomb and phonon scattering times are large compared to the electronic diffusion time τD = L²/D through the wire, so that equilibration due to these processes can be neglected [4]. In this situation, assuming purely elastic scattering, one expects the distribution function at a given position x to be a linear superposition of the Fermi functions in the reservoirs,

$$f_x^{\mathrm{elast}}(E, V) = \left(1 - \frac{x}{L}\right) f^o(E + V) + \frac{x}{L}\, f^o(E) . \qquad (2)$$

Eq. (2) is a solution of the diffusive Boltzmann equation (7) [15] for vanishing collision integral (see Sect. 3.1). This situation is to be distinguished from the hot electron regime, where local equilibration occurs due to inelastic processes [16]. Typical measured distribution functions are shown in Fig. 2a. They exhibit rounding of the Fermi steps as compared to Eq. (2) and obey scale invariance with respect to the transport voltage V, f(E, V) = f(E/V), when V exceeds a certain low energy scale, V ≳ 0.1 meV (Fig. 2b) [4]. Deviations from scaling were observed again for voltages larger than a high energy scale Eo ≈ 0.5 meV. The latter may be explained by reservoir heating effects or by the electrons coupling to additional degrees of freedom at high energies.
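The two-step form of Eq. (2) and its trivial E/V scaling at low temperature can be illustrated with a short sketch. It follows the energy convention of Eq. (2) as written (steps at E = −V and E = 0) and uses units with kB = 1; the function names and the numerical values, with energies in meV and a temperature roughly corresponding to 25 mK, are assumptions made for illustration only.

```python
import math


def fermi(energy, temperature):
    """Fermi function f^o(E) = 1/(exp(E/T) + 1), written via tanh to avoid overflow."""
    return 0.5 * (1.0 - math.tanh(energy / (2.0 * temperature)))


def f_elastic(energy, voltage, x_over_l, temperature):
    """Eq. (2): superposition of the two reservoir Fermi functions at position x/L."""
    return ((1.0 - x_over_l) * fermi(energy + voltage, temperature)
            + x_over_l * fermi(energy, temperature))


# Illustrative parameters: energies in meV, T of about 25 mK expressed in meV.
temperature = 0.00215
for voltage in (0.1, 0.2):
    # Evaluate in the middle of the wire, halfway between the two steps.
    value = f_elastic(-0.5 * voltage, voltage, 0.5, temperature)
    print("V =", voltage, "f(E = -V/2) =", value)
```

Between the two steps the elastic distribution sits on a plateau of height x/L, and for T much smaller than V its shape depends on E and V only through E/V. The measured curves share this scaling but show rounded steps, which is the signature of inelastic processes discussed in the text.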
By a heuristic argument [4], the origin of the observed scaling behavior can be traced back to an anomalous electron-electron interaction ṽ(ε) which scales with the energy transfer ε as ṽ(ε) ∝ 1/ε. We reproduce this argument here: The scaling property implies that the equation of motion for f(E, V), the Boltzmann equation, and, as a consequence, the inelastic single-particle collision rate 1/τ(E) are scale invariant. Assuming for the moment that ṽ(ε) has no essential momentum dependence, 1/τ is given in 2nd order perturbation theory (PT) by

$$\frac{1}{\tau(E)} \equiv \frac{1}{\tau(E/V)} = N_0^3 \int d\varepsilon \int d\varepsilon'\, |\tilde v(\varepsilon)|^2\, \tilde F\!\left( \frac{E}{V}, \frac{\varepsilon}{V}, \frac{\varepsilon'}{V} \right) . \qquad (3)$$

Here F̃ is a combination of distribution functions f guaranteeing that there is only scattering from an occupied into an unoccupied state. Therefore, the experimental results about the scaling property of f imply that F̃ depends only on the dimensionless energies as displayed in Eq. (3). Demanding scale invariance for 1/τ with respect to V, i.e. making the frequency integrals dimensionless, implies a characteristic energy dependence of the interaction and
the single particle scattering rate, ṽ(ε) ∝ 1/ε. Extrapolated to the Fermi energy EF this means that in 2nd order PT the resulting quasiparticle relaxation rate would not vanish at EF [4]. One can, therefore, conjecture that the anomalous scaling form fx(E/V) and the apparent low-temperature saturation of the dephasing time observed [1] in the magnetoresistance ρ(B) of nanowires might have the same microscopic origin. To substantiate this speculation, a quantitative calculation in non-equilibrium is needed, and the perturbative infrared singularity of ṽ(ε) signals that one has to re-sum infinite orders of PT. We note that the momentum dependence of the dynamically screened Coulomb interaction in a disordered system, induced by the one-dimensional diffusion mode of the nanowire [3], would not be consistent with the observed scaling property, and that other collective modes which could induce a singular effective interaction are not expected in the simple Au or Cu wires. We conclude that, within the 2nd order PT argument discussed above, the interaction ṽ(ε) ∝ 1/ε has no essential momentum dependence and should thus be of local origin. Anomalous low-energy behavior of local origin can be induced by the Fermi surface singularities characteristic of Kondo-type systems [6,7]. Based on such considerations, Kondo impurities have been proposed as the origin of the anomalous energy relaxation [9,10,17,18]. Kaminski and Glazman first showed that Kondo scattering can induce an effective local electron-electron interaction consistent with the observed scaling, including the deviations from it [17]. Inelastic scattering by Kondo impurities has also been considered in [19].
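The dimensional argument behind Eq. (3) can be checked numerically with a toy model. The sketch below is not the physical collision kernel: the function F̃ is replaced by a hypothetical smooth form, b² exp(−(a² + b² + c²)), chosen only so that the integral stays finite for ṽ(ε) = 1/ε, and the prefactor N0³ is dropped. It compares a singular interaction with a constant one at fixed E/V.

```python
import numpy as np


def toy_rate(e_over_v, voltage, interaction, n=400, cutoff=8.0):
    """Toy version of Eq. (3) without the prefactor N0^3.

    The integrand is |v(eps)|^2 * F(E/V, eps/V, eps'/V) with the hypothetical
    choice F(a, b, c) = b^2 * exp(-(a^2 + b^2 + c^2)). With n even, the
    symmetric grid does not contain eps = 0.
    """
    eps = np.linspace(-cutoff * voltage, cutoff * voltage, n)
    d_eps = eps[1] - eps[0]
    e1, e2 = np.meshgrid(eps, eps, indexing="ij")
    b, c = e1 / voltage, e2 / voltage
    f_tilde = b ** 2 * np.exp(-(e_over_v ** 2 + b ** 2 + c ** 2))
    return np.sum(interaction(e1) ** 2 * f_tilde) * d_eps ** 2


def singular(eps):
    """Interaction scaling as 1/eps."""
    return 1.0 / eps


def constant(eps):
    """Energy-independent interaction, for comparison."""
    return np.ones_like(eps)


for voltage in (0.1, 1.0):
    print("V =", voltage,
          "rate with 1/eps:", toy_rate(0.3, voltage, singular),
          "rate with constant v:", toy_rate(0.3, voltage, constant))
```

In this toy model the rate computed with ṽ ∝ 1/ε is the same for both voltages at fixed E/V, while the constant interaction gives a rate growing as V², which illustrates why scale invariance singles out the 1/ε form.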
[Figure 2. (a) f(E) vs. E [meV] for V = 0.0, 0.1, and 0.2 mV; inset: raw data vs. U. (b) f(E) vs. E/V.]
Fig. 2. (a) Typical measured distribution functions in the middle of a resistive Cu wire (L = 1.5 µm, x/L = 0.5), bath temperature T = 25 mK. The distribution for the non-interacting case is shown as a dashed line. The inset shows raw data for the differential conductance of the tunnel junction, exhibiting the BCS singularities at the gap edges. (b) Scaling collapse of the data of (a). (Data reproduced from [4])
3
Theory
3.1
The Model and General Relations
We now turn to the calculation of the distribution function fx(E, V) in a wire with uniform concentration of Kondo impurities and with a finite bias V applied between its ends. For a single Kondo impurity embedded in the electron sea we consider the Kondo Hamiltonian

$$H = \sum_{k,\sigma} \varepsilon_k\, c^\dagger_{k\sigma} c_{k\sigma} - B S_z + J \sum_{k,k',\sigma,\sigma'} \mathbf{S} \cdot \left( c^\dagger_{k'\sigma'} \boldsymbol{\tau}_{\sigma'\sigma} c_{k\sigma} \right) , \qquad (4)$$

where the first term describes the conduction electron sea with the flat bare density of states N0 and the band width D, and J > 0 is the antiferromagnetic spin coupling. It is convenient to use the pseudofermion representation of the local spin 1/2 [20,6],

$$\mathbf{S} = \frac{1}{2} \sum_{\sigma,\sigma'} f^\dagger_{\sigma} \boldsymbol{\tau}_{\sigma\sigma'} f_{\sigma'} . \qquad (5)$$
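The pseudofermion representation (5) can be verified on the four-dimensional Fock space of two fermionic modes. The sketch below is a minimal check, not part of the method reviewed here: the matrix construction via a Jordan-Wigner string and all variable names are choices made for illustration. It confirms that the operators obey the spin algebra and that S² = 3/4 holds exactly on the subspace with occupation Q = Σσ f†σ fσ = 1, while S² vanishes for Q = 0 and Q = 2.

```python
import numpy as np

# Single-mode annihilation operator in the basis (|0>, |1>) and Jordan-Wigner sign.
a = np.array([[0, 1], [0, 0]], dtype=complex)
z = np.diag([1.0, -1.0]).astype(complex)
identity2 = np.eye(2, dtype=complex)

# Pseudofermion annihilation operators for spin up and spin down.
f = [np.kron(a, identity2), np.kron(z, a)]

# Pauli matrices tau^x, tau^y, tau^z.
tau = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def spin_component(t):
    """S^a = (1/2) sum_{s,s'} f_s^dagger tau^a_{s s'} f_{s'}, cf. Eq. (5)."""
    return 0.5 * sum(
        t[s, sp] * f[s].conj().T @ f[sp] for s in range(2) for sp in range(2)
    )


spin = [spin_component(t) for t in tau]
occupation = sum(fs.conj().T @ fs for fs in f)  # operator Q
spin_squared = sum(s @ s for s in spin)

# Identity S^2 = (3/4) Q (2 - Q): equal to 3/4 for Q = 1, zero for Q = 0 and Q = 2.
print(np.allclose(spin_squared, 0.75 * occupation @ (2 * np.eye(4) - occupation)))
# Spin algebra [S^x, S^y] = i S^z.
print(np.allclose(spin[0] @ spin[1] - spin[1] @ spin[0], 1j * spin[2]))
```

The check makes explicit why the constraint Q = 1 is needed: only on that subspace do the bilinears in Eq. (5) represent a spin 1/2.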
Here τ is the vector of Pauli matrices, and $f_\sigma$, $f^\dagger_\sigma$ are pseudofermion operators obeying the constraint $Q = \sum_\sigma f^\dagger_\sigma f_\sigma = 1$, which can be implemented exactly (for details of this technique see, e.g., [21,22]). How can scattering from a Kondo defect, which has only a degenerate internal degree of freedom, lead to inelastic processes and hence to a redistribution of the occupation of states in the wire? The physical origin is the finite decay rate of the internal degree of freedom. It is given by the transverse spin relaxation rate Γ (see Eq. (11) below) and is, in the large bias limit ln(V/TK) ≫ 1 (but ln(V/D) ≪ 1), proportional to V,

$$\Gamma_0(V) = 2\pi\, \frac{x}{L}\left(1 - \frac{x}{L}\right) (N_0 J)^2\, V , \qquad (6)$$
analogous to the Korringa relaxation rate, with T replaced by V [9]. Note that towards smaller voltages there is a logarithmic renormalization N0J → 1/[2 ln(V/TK)], with $T_K = D\sqrt{N_0 J}\, e^{-1/(2N_0 J)}$. The resulting life-time broadening of the local states ultimately gives rise to inelastic scattering with energy exchange ∼ 1/τs(V). Hence, it is, roughly speaking, the approximate proportionality Eq. (6) that induces a broadening of the Fermi steps ∝ V (up to log corrections), i.e. the observed scaling property. An accurate description of these non-equilibrium phenomena requires a detailed, quantitative calculation of the distribution function. We consider fx(E, V) in a resistive nanowire of length L, subject to the boundary conditions that the left (x = 0) and the right (x = L) leads are in equilibrium at their respective chemical potentials, i.e. fx=0(E, V) = f^o(E − V/2), fx=L(E, V) = f^o(E + V/2), with $f^o(E) = 1/(e^{E/T} + 1)$ the Fermi distribution. The lesser (<) and the greater (>) conduction electron Keldysh Green's functions read $G^<_x(k, E) = -2\pi i f_x(k)\, \mathrm{Im}\, G^r_x(k, E)$ and $G^>_x(k, E) =$
$2\pi i[1 - f_x(k)]\, \mathrm{Im}\, G^r_x(k, E)$, respectively, where E, k denote quasiparticle energy and momentum [23]. A superscript r indicates a retarded propagator. The distribution function is determined by the stationary quantum Boltzmann equation, which in a disordered electron system with diffusion constant D takes the form [15]

$$-D\nabla_x^2 f_x(E, U) = I\{f_x(E, U)\} . \qquad (7)$$

The collision integral

$$I = \frac{1}{2\pi N_0} \sum_p \left[ \Sigma^<_x(E)\, G^>_x(p, E) - \Sigma^>_x(E)\, G^<_x(p, E) \right] \qquad (8)$$
is expressed in terms of the selfenergies $\Sigma^{\lessgtr}$ for scattering into (<) and out of (>) states with given energy E. For a low concentration cimp of Kondo impurities the selfenergies are proportional to the single-electron T-matrix of a single impurity, $\Sigma^{\lessgtr}_x(E) = c_{\mathrm{imp}}\, t^{\lessgtr}_x(E)$. We emphasize that, apart from the assumption of small cimp, the formulation Eqs. (7), (8) is exact once the T-matrix is known. This leaves us with the necessity to calculate the collision integral (and other physical quantities) in a controlled way. As is well known [6], low order PT is not sufficient for the Kondo problem because of its infrared divergencies. This remains true even in non-equilibrium, where the divergencies are cut off by inelastic terms like the transverse spin relaxation rate Γ, since the terms of PT are logarithmic and the convergence of the perturbation series is slow. Hence, a resummation to infinite order is required. Such resummations have been performed by several authors, using the log resummation scheme in equilibrium for the energy dependence of the collision kernel [17], using a factorization of the collision kernel and the Nagaosa resummation [18], or employing an NCA calculation of the collision integral Eq. (8) in non-equilibrium [9]. However, these resummations are either technically cumbersome or use an uncontrolled selection of terms in the resummation of PT. Notably, the NCA is known to give good quantitative results for vanishing magnetic field B in the energy range above and down to about 0.1 TK, but completely fails for B ≠ 0. For a discussion of the origin of this failure see [24]. Therefore, we here propose a poor man's scaling method for general stationary non-equilibrium situations which is technically simple to perform and at the same time allows for a systematic resummation of the terms of PT.
3.2
Poor Man’s Scaling Method in Non-equilibrium
The coupling constant renormalization group method relies on the notion of universality, i.e. the fact that physical quantities may be expressed solely in terms of the effective low-energy scale of the problem (e.g. TK ), and do not explicitly depend on the microscopic parameters of the underlying Hamiltonian, in particular not on the band cutoff D. As a consequence, it is possible
Johann Kroha et al.
to absorb a change of D into a renormalization of the coupling constants by the requirement that the physically observable quantities, like the T-matrix (or their irreducible parts), be invariant under this cutoff rescaling. In the semianalytic approach considered here it is only possible to implement the invariance under an infinitesimal cutoff rescaling δD up to a given, finite order n of PT, typically n = 2 (1-loop RG). This leads to the perturbative flow equations for the renormalized coupling constants. The fact that the invariance of physical quantities has been implemented to nth order PT dictates for consistency that the irreducible part of any such quantity must now be calculated from the renormalized coupling constants in that same nth order PT. The latter statement is crucial especially for the RG in non-equilibrium, since here the RG equations themselves involve not only the coupling constants (or coupling functions) but also physically observable quantities, namely the transverse spin relaxation rate Γ and the quasiparticle distribution function, which, in general, must be determined selfconsistently by solving the Boltzmann equation.
To 2nd order PT, the coupling function renormalizations are due to the vertex corrections shown in Fig. 3a, whose real parts are logarithmic. The intermediate electron lines in Fig. 3a carry the distribution function $f_x(E,V)$ inside the wire. Although the Fermi steps in $f_x(E,V)$ will be smeared due to inelastic relaxation, we still expect, at least for small $c_{imp}$, the dominant contributions to come from the Fermi energies $\mu_{L,R} = \pm V/2$ in the left (L) or right (R) lead. Therefore, we choose the running cutoffs symmetric with respect to $\mu_{L,R}$ [14]. We define dimensionless coupling functions $g(\omega) = N_0 J(\omega)$, where ω is the energy of the scattering electron. The spin structure of these coupling functions is given by two invariant amplitudes, $\tilde g_\perp$ for spin flip and $\tilde g_{\|\sigma}$ for spin non-flip processes:
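$$g^{\alpha\sigma,\omega;\,\alpha'\bar\sigma,\omega-\gamma B}_{\gamma,-\gamma B/2;\,\bar\gamma,\gamma B/2} = \left(\tau^x_{\bar\gamma\gamma}\,\tau^x_{\bar\sigma\sigma} + \tau^y_{\bar\gamma\gamma}\,\tau^y_{\bar\sigma\sigma}\right)\tilde g_\perp(\omega - \gamma B/2) \qquad (9a)$$

$$g^{\alpha\sigma,\omega;\,\alpha\sigma,\omega}_{\gamma,-\gamma B/2;\,\gamma,-\gamma B/2} = \tau^z_{\alpha\alpha}\,\tau^z_{\gamma\gamma}\,\tilde g_{\|\sigma}(\omega) , \qquad (9b)$$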
where $g^{a;a'}_{b;b'}$ denotes the coupling function for conduction electrons of spin σ and energy ω in lead α ($a = (\alpha,\sigma,\omega)$) interacting with a pseudofermion in state $b = (\gamma,\omega_f)$ and going into states $a'$, $b'$. Using furthermore the fact that the pseudofermion propagators are sharply peaked at the Zeeman energies
Fig. 3. (a) Diagrams generating the 2nd order renormalization of the conduction electron-local spin vertex. Solid and dashed lines denote the electron and the pseudofermion propagators, respectively. (b) 2nd order expression for the conduction electron T-matrix
Non-equilibrium Transport and Relaxation in Diffusive Nanowires
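±B, one obtains the RG equations for the dimensionless, energy dependent coupling functions at position x along the wire,

$$\frac{\partial \tilde g_{x\|\sigma}(\omega)}{\partial \ln D} = -\frac{1}{4}\sum_{\alpha,\beta=-1,1} \tilde g_{x\perp}\!\left(\frac{B+\beta V}{2}\right)^{2} \times \left[2 f_x(\alpha D + \beta V/2, V) - 1\right]\tilde G^f\!\left(\frac{\omega + \sigma\left(B + \beta\frac{V}{2}\right)}{\alpha D}\right) \qquad (10a)$$

$$\frac{\partial \tilde g_{x\perp}(\omega)}{\partial \ln D} = -\frac{1}{2}\sum_{\alpha,\beta,\sigma=-1,1} \tilde g_{x\perp}\!\left(\frac{\sigma B+\beta V}{2}\right)\tilde g_{x\|\sigma}\!\left(\frac{\beta V}{2}\right) \times \left[2 f_x(\alpha D + \beta V/2, V) - 1\right]\tilde G^f\!\left(\frac{\omega + \frac{\sigma B+\beta V}{2}}{\alpha D}\right) , \qquad (10b)$$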
where $\tilde G^f(y) = 1/(1+y)$ is related to the real part of the pseudofermion propagator, and the initial conditions are $\tilde g_{x\|\sigma}(\omega) = \tilde g_{x\perp}(\omega) = N_0 J$. Note that for the case of a single impurity coupled to two leads at different chemical potentials (no x dependence), the statistical factors and the functions $\tilde G^f$ in Eqs. (10a), (10b) can effectively be combined to threshold functions, so that the above equations reduce to the poor man’s scaling equations derived in [14]. In the RG equations (10a), (10b) it is up to now assumed that the relaxation rate Γ of the impurity spin state destroying quantum coherence is negligible compared to V or B. However, due to the finite current, relaxation processes contributing to Γ are present even at T = 0, and lead to an imaginary part of the self-energy of the pseudofermions (e.g. in the PF propagator in the diagrams of Fig. 3a) and to vertex corrections. We identify Γ with the transverse spin relaxation rate [14], which is in 2nd order in the renormalized couplings given by
$$\Gamma_x = \frac{\pi}{4}\sum_{\gamma=-1,1}\int d\omega\,\Big\{\tilde g_{x\|\gamma}(\omega)^2\, f_x(\omega,V)\,[1 - f_x(\omega,V)] + \tilde g_{x\perp}(\omega - \gamma B/2)^2\, f_x(\omega,V)\,[1 - f_x(\omega - \gamma B,V)]\Big\} . \qquad (11)$$

To incorporate the effect of decoherence in the RG equations, the pseudofermion propagators are broadened by the rate Γ, i.e. we re-define $\tilde G^f(y)$ as $\tilde G^f_{\Gamma/\alpha D}(y) = (1+y)/[(1+y)^2 + (\Gamma/\alpha D)^2]$. As a further complication, the RG equations (10a), (10b) contain the as yet unknown distribution functions $f_x$ at the positions x in the wire. They are to be determined by solving the Boltzmann equation (7), where the T-matrix entering the collision integral (8) must be calculated in 2nd order PT in the renormalized coupling constants, see the discussion at the beginning of this section. The corresponding expression for $t^{\lessgtr}_x(E)$ is depicted diagrammatically in Fig. 3b. This means that the Boltzmann equation (7), with Eq. (8), and the set of RG equations (10a), (10b), with the spin relaxation rate Eq. (11), must be solved selfconsistently. The solution of this set of equations is in progress.
It should allow for a well controlled description of the distribution function in non-equilibrium quantum wires containing a finite concentration of Kondo impurities, both with and without magnetic field B. In the following section we show results obtained from the NCA solution of this problem for B = 0 [9].
To conclude this section, we analyze the scaling properties of $f_x(E,V)$ following from the RG approach for zero magnetic field B. From the collision integral Eq. (8) and the T-matrix (Fig. 3b) it follows that $f_x(E,V)$ has the scaling property $f_x(E,V) = f_x(E/V)$ exactly if the coupling functions depend on the energy ω only through ω/V. By defining a dimensionless cutoff D/V it is seen that all energies in the RG equations become dimensionless (in units of V), provided that (1) the distribution function depends only on the dimensionless energy E/V and (2) the spin relaxation rate Γ is proportional to V (Korringa behavior). The latter condition is obeyed in the large bias limit, $\ln(V/T_K) \gg 1$, with logarithmic corrections for smaller bias. Hence, observing that the initial conditions $\tilde g_{x\|\sigma}(\omega) = \tilde g_{x\perp}(\omega) = N_0 J$, which are constant in ω, trivially fulfill the condition that the coupling constants depend only on the dimensionless energy, the selfconsistent solution $f_x(E/V)$ is scale invariant for large bias, however with sizeable logarithmic corrections in a wide voltage range, as the bias V approaches $T_K$ from above. This is in agreement with the experimental findings, if one identifies the crossover scale for deviations from scaling $V^* \approx T_K$. For finite magnetic field the scaling property naturally breaks down, as seen from the RG equations.
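Condition (2) can be illustrated with a small numerical sketch. It is not the selfconsistent calculation described above: it assumes B = 0, energy-independent couplings $\tilde g_{\|} = \tilde g_\perp = g$, and, as a further assumption not taken from this section, the collisionless T = 0 solution of Eq. (7) with vanishing collision integral, i.e. a double-step function that interpolates linearly in the dimensionless position x ∈ [0, 1] between the two leads. The function names and the midpoint integration are illustrative choices. Under these assumptions Eq. (11) reduces to $\Gamma_x = \pi g^2 V x(1-x)$, which is linear in V.

```python
import math


def double_step(energy, bias, x):
    """Assumed collisionless T = 0 distribution at position x in [0, 1]."""
    if energy < -bias / 2:
        return 1.0          # below both Fermi levels
    if energy < bias / 2:
        return 1.0 - x      # between the Fermi levels of the two leads
    return 0.0              # above both Fermi levels


def relaxation_rate(g, bias, x, n_grid=20000):
    """Eq. (11) for B = 0 and constant coupling g, by midpoint integration."""
    lower, upper = -bias, bias   # the integrand vanishes for |omega| > bias/2
    step = (upper - lower) / n_grid
    total = 0.0
    for k in range(n_grid):
        omega = lower + (k + 0.5) * step
        f = double_step(omega, bias, x)
        # sum over gamma = -1, 1 of the parallel and perpendicular terms
        total += 2 * (g * g * f * (1 - f) + g * g * f * (1 - f)) * step
    return math.pi / 4 * total
```

Doubling the bias doubles the rate in this toy setting, which is the Korringa-type behavior required for scaling; the logarithmic corrections discussed in the text only appear once the couplings are renormalized.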
4
Comparison with Experiment: NCA Solution
The numerical NCA solutions show approximate scaling within a factor of 4 to 9 in V, depending on parameters. That scaling is obeyed only within a finite, transient voltage range is due to the logarithmic corrections to the linear-in-V behavior of the spin relaxation rate Γ, Eq. (6), and to vertex corrections entering the NCA equations (or the RG equations, respectively), see the discussion at the end of Sect. 3.2. For V 10V ∗ approximate scaling cannot be established anymore. This provides for T TK a rough estimate, and for T > TK an upper bound on TK. The experimental parameters [4,8] are in the regime T TK V. For the numerical evaluations we assume magnetic impurities and take TK ≈ 0.1 K in Cu and TK ≈ 0.5 K in Au wires (corresponding to N0 J = 0.041 and N0 J = 0.048, respectively). These values are consistent with the experimentally observed scale for deviations from scaling and with independent estimates of TK for these samples [8]. After TK is fixed, cimp is the only adjustable parameter of the theory. The results for fx(E, V), as measured by a tunnel junction attached to the wire, are shown in Figs. 4 and 5. Excellent quantitative agreement with experiments [4,8] is obtained for all samples. In Au wires the fitted values of cimp are consistent with (although somewhat higher than) independent estimates of the magnetic impurity concentration [8], considering the roughness of both estimates. This
Fig. 4. Non-equilibrium distribution functions for three different Cu samples. Black lines: experimental results [4]. Open circles: theory for V ≫ TK. Deviations from scaling at smaller V [1] are also reproduced by the theory (not shown). The fitted cimp values are indicated. The insets show the difference between the experimental and the theoretical curves. For sample 3, fx(E/V) was measured simultaneously at two different positions x; hence, both curves are fitted by one single value of cimp
suggests that the scaling behavior of fx(E, V) in the Au samples is due to magnetic impurities. Furthermore, in all Cu samples the fitted cimp is ∼ 10² times smaller than in Au. This systematic trend is in accordance with cimp estimated from the plateau in the T dependence of the dephasing time τϕ in similarly prepared samples [8,25].
5
Conclusion
We have described a recently introduced poor man’s scaling method (perturbative coupling constant RG on 1-loop level), applicable to Kondo systems in non-equilibrium. In non-equilibrium the renormalized couplings necessarily acquire an ω-dependence, and decoherence processes are incorporated by including the transverse spin relaxation rate Γ in the RG equations. We propose a generalization to the case of an arbitrary, stationary non-equilibrium distribution in the electron sea. Such a situation is realized, e.g. in quantum wires with a finite concentration of Kondo impurities, subject to a large bias voltage. In this case, the distribution function, depending on the quasiparticle energy E, the bias voltage V , and on the position x, enters the RG equations and must be determined selfconsistently during the RG procedure by solving the respective quantum Boltzmann equation.
Fig. 5. The same quantities as in Fig. 4 are shown, however for Au wires. Experimental results reproduced from [8]
We have shown that the quasiparticle distribution in a quantum wire with Kondo impurities, determined by this set of coupled equations, obeys the scaling property fx(E, V) = fx(E/V) in the regime of exponentially large bias, ln(V/TK) ≫ 1, with deviations from scaling for smaller voltage, in agreement with related experiments on Cu and Au quantum wires. Non-equilibrium solutions of this problem within the NCA for vanishing magnetic field give quantitative agreement with related experiments and allow one to extract the concentration of Kondo impurities by fitting the theoretical curves to the experimental data. A detailed comparison of these results with the dephasing time measurements in similarly prepared samples strongly suggests that the anomalous low-temperature plateau of τϕ in Cu and Au wires is caused by Kondo impurities. A solution of the non-equilibrium RG equations for resistive wires in magnetic field is in progress.
Acknowledgements
We thank A. Zawadowski, S. De Franceschi, L.I. Glazman, J. König, O. Parcollet, H. Pothier, H. Schoeller and G. Sellier for helpful discussions. This work was supported in part by the DFG Forschungszentrum for Functional Nanostructures (CFN) and by the Emmy Noether program (A.R.) of the DFG.
References
1. P. Mohanty, E.M.Q. Jariwala, and R. A. Webb, Phys. Rev. Lett. 78, 3366 (1997).
2. B. L. Al’tshuler, A. G. Aronov, and D. E. Khmelnitskii, J. Phys. C: Solid State Physics 15, 7367 (1982).
3. For a review see B. L. Al’tshuler and A. G. Aronov, in Electron-Electron Interactions in Disordered Systems, 1 (North-Holland, Amsterdam, 1985).
4. H. Pothier, S. Guéron, Norman O. Birge, D. Esteve, and M. H. Devoret, Phys. Rev. Lett. 79, 3490 (1997).
5. F. Pierre, H. Pothier, D. Esteve, M.H. Devoret, A.B. Gougam, and N.O. Birge, in Kondo Effect and Dephasing in Low-Dimensional Metallic Systems, V. Chandrasekhar, C. van Haesendonck, and A. Zawadowski, eds., NATO Science Series II 50, 119 (Kluwer, 2001).
6. For a comprehensive introduction to the Kondo and related problems see A. C. Hewson, The Kondo Problem to Heavy Fermions (Cambridge University Press, 1993).
7. D. L. Cox and A. Zawadowski, Adv. Phys. 47, 599 (1998).
8. F. Pierre, H. Pothier, D. Esteve, and M.H. Devoret, J. Low Temp. Phys. 118, 437-445 (2000).
9. J. Kroha and A. Zawadowski, Phys. Rev. Lett. 88, 176803 (2002).
10. J. Kroha, Adv. Solid State Phys. 40, 216 (2000); J. Kroha, in Kondo Effect and Dephasing in Low-Dimensional Metallic Systems, V. Chandrasekhar, C. van Haesendonck, and A. Zawadowski, eds., NATO Science Series II 50, 133 (Kluwer, 2001).
11. A. Anthore, F. Pierre, H. Pothier, and D. Esteve, Phys. Rev. Lett. 90, 076806 (2003).
12. F. Schopf, C. Bäuerle, W. Rabaud, and L. Saminadayar, Phys. Rev. Lett. 90, 056801 (2003).
13. G. Göppert, Y. M. Galperin, B. L. Altshuler, and H. Grabert, Phys. Rev. B 66, 195328 (2002).
14. A. Rosch, J. Paaske, J. Kroha, and P. Wölfle, Phys. Rev. Lett. 90, 076804 (2003).
15. K. E. Nagaev, Phys. Lett. A 169, 103 (1992); Phys. Rev. B 52, 4740 (1995).
16. Kozub and Rudin, Phys. Rev. B 52, 7853 (1995).
17. A. Kaminski and L.I. Glazman, Phys. Rev. Lett. 86, 2400 (2001).
18. G. Göppert and H. Grabert, Phys. Rev. B 64, 033301 (2001).
19. J. Sólyom and A. Zawadowski, Z. Phys. B 226, 116 (1996).
20. A. A. Abrikosov, Physics 2, 5 (1965).
21. A review can be found in N. E. Bickers, Rev. Mod. Phys. 59, 845 (1987).
22. T. A. Costi, J. Kroha, and P. Wölfle, Phys. Rev. B 53, 1850 (1996).
23. J. Rammer and H. Smith, Rev. Mod. Phys. 58, 323 (1986).
24. S. Kirchner and J. Kroha, J. Low Temp. Phys. 126, 1233 (2002).
25. A. B. Gougam, F. Pierre, H. Pothier, D. Esteve, and N.O. Birge, J. Low Temp. Phys. 118, 447 (2000).
Real-Space Renormalization and Energy-Level Statistics at the Quantum Hall Transition
Rudolf A. Römer1 and Philipp Cain2
1 Department of Physics and Centre for Scientific Computing, University of Warwick, Coventry, CV4 7AL, United Kingdom
2 Institut für Physik, Technische Universität Chemnitz, 09107 Chemnitz, Germany
Abstract. We review recent applications of the real-space renormalization group (RG) approach to the integer quantum Hall (QH) transition. The RG approach, applied to the Chalker-Coddington network model, reproduces the critical distribution of the power transmission coefficients, i.e., two-terminal conductances, Pc (G), with very high accuracy. The RG flow of P (G) at energies away from the transition yields a value of the critical exponent, νG = 2.39 ± 0.01, that agrees with most accurate large-size lattice simulations. Analyzing the evolution of the distribution of phases of the transmission coefficients upon a step of the RG transformation, we obtain information about the energy-level statistics (ELS). From the fixed point of the RG transformation we extract a critical ELS. Away from the transition the ELS crosses over towards a Poisson distribution. Studying the scaling behavior of the ELS around the QH transition, we extract the critical exponent νELS = 2.37 ± 0.02.
1
Introduction
The integer quantum Hall (QH) transition is described well in terms of a delocalization-localization transition of the electronic wavefunctions. In contrast to a usual metal-insulator transition (MIT), the QH transition is characterized by a single extended state located exactly at the center $\epsilon = 0$ of each Landau band [1]. When approaching $\epsilon = 0$, the localization length ξ of the electron wavefunction diverges according to a power law $\epsilon^{-\nu}$, where $\epsilon$ defines the distance to the MIT for a suitable control parameter, e.g., the electron energy. On the theoretical side, the value of ν has been extracted from various numerical simulations, e.g., ν = 2.5 ± 0.5 [2], 2.4 ± 0.2 [3], 2.35 ± 0.03 [4], and 2.39 ± 0.01 [5]. In experiments ν ≈ 2.3 has been obtained, e.g., from the frequency [6] or the sample size [7] dependence of the critical behavior of the resistance in the transition region at strong magnetic field. Recently, a semianalytical description of the integer QH transition, based on the extension of the scaling ideas for classical percolation [8] to the Chalker-Coddington (CC) model of quantum percolation [2], has been developed [9,10]. The key idea of this description, a real-space renormalization group (RG) approach, is the following. Each RG step corresponds to a doubling of the system size. The RG transformation relates the conductance distribution of the sample at the next step to the conductance distribution
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 237–252, 2003. © Springer-Verlag Berlin Heidelberg 2003
at the previous step. The fixed point (FP) of this transformation yields the distribution of the conductance, Pc(G), of a macroscopic sample at the QH transition. This universal distribution describes the mesoscopic properties of a fully coherent QH sample. Analogously to classical percolation [8], the correlation length exponent, ν, was extracted from the RG procedure [5] using the fact that a slight shift of the initial distribution with respect to the FP distribution Pc(G) drives the system to the insulator upon renormalization. Then the rate of the shift of the distribution maximum determines the value of ν. Remarkably, both Pc(G) and the critical exponent obtained within the RG approach [5,11,12] agree very well with the “exact” results of the large-scale simulations [3,4,13,14,15].
The goal of the present paper is threefold. First, we briefly review the basic ingredients that constitute the real-space RG method in the QH situation [5]. Second, we extend the RG approach to include the level statistics at the QH transition and apply a method analogous to the finite-size-corrections analysis to extract ν from the energy-level statistics (ELS) obtained within the RG approach. This method yields ν = 2.37 ± 0.02, which is even closer to the most precise large-scale simulation result ν = 2.35 ± 0.03 [4] than the value ν = 2.39 ± 0.01 inferred from the conductance distribution [5]. This agreement is by no means trivial. Indeed, the original RG transformation [5] related the conductances, i.e., the absolute values of the transmission coefficients of the original and the doubled samples, while the phases of the transmission coefficients were assumed random and uncorrelated. In contrast, the level statistics at the transition corresponds to the FP in the distribution of these phases. Therefore, the success of the RG approach for conductances does not guarantee that it will be equally accurate quantitatively for the level statistics.
Third, we show that the RG structure employed in the present approach, which is constructed from 5 saddle points (SP), represents in many aspects the minimal model of the QH transition. A further reduction in the number of SP leads to less reliable results.
2
The RG Approach to the CC Model
Our RG approach to the QH transition [5,9,10] is based on the RG unit shown in Fig. 1. The unit is a fragment of the CC network consisting of five nodes. Each node, i, is characterized by the transmission coefficient $t_i$, which is an amplitude to deflect an incoming electron along the link to the left. Analogously, the reflection coefficient $r_i = (1 - t_i^2)^{1/2}$ is the amplitude to deflect the incoming electron to the right. Doubling of the sample size corresponds to the replacement of the RG unit by a single node. The RG
Fig. 1. Chalker-Coddington network on a square lattice consisting of nodes (circles) and links (arrows). The RG unit used to construct the matrix (4) combines five nodes (full circles) by neglecting some connectivity (dashed circles). Φ1 , . . . , Φ4 are the phases acquired by an electron along the loops as indicated by the arrows. Ψ1 , . . . , Ψ4 represent wave function amplitudes, and the thin dashed lines illustrate the boundary conditions used for the computation of level statistics
transformation expresses the transmission coefficient of this effective node, $t'$, through the transmission coefficients of the five constituting nodes [9]

$$t' = \frac{t_1 t_5\left(r_2 r_3 r_4 e^{i\Phi_2} - 1\right) + t_2 t_4 e^{i(\Phi_3+\Phi_4)}\left(r_1 r_3 r_5 e^{-i\Phi_1} - 1\right) + t_3\left(t_2 t_5 e^{i\Phi_3} + t_1 t_4 e^{i\Phi_4}\right)}{\left(r_3 - r_2 r_4 e^{i\Phi_2}\right)\left(r_3 - r_1 r_5 e^{i\Phi_1}\right) + \left(t_3 - t_4 t_5 e^{i\Phi_4}\right)\left(t_3 - t_1 t_2 e^{i\Phi_3}\right)} . \qquad (1)$$

Here $\Phi_j$ are the phases accumulated along the closed loops (Fig. 1). Within the RG approach to the conductance distribution, information about electron energy is incorporated only into the values of $t_i$ [5]. The energy dependence of the phases, $\Phi_j$, is irrelevant; they are assumed completely random. Due to this randomness, the transmission coefficients, $t_i$, for a given energy, are also randomly distributed with a distribution function P(t). Then the transformation (1) allows one, upon averaging over $\Phi_j$, to generate the next-step distribution $P(t')$. Therefore, within the RG scheme, a delocalized state corresponds to the FP distribution $P_c(t)$ of the RG transformation. Due to the symmetry of the RG unit, it is obvious that the critical distribution, $P_c(t^2)$, of the power transmission coefficient, $t^2 = G$, which has the meaning of the two-terminal conductance, is symmetric with respect to $t^2 = 1/2$ as shown in Fig. 2. In other words, the RG transformation respects the duality between transmission and reflection. The critical distribution $P_c(G)$ found in Refs. [5] and [9] agrees very well with the results of direct large-scale simulations.
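One RG step as described above can be sketched in a few lines of Python. This is only an illustration of Eq. (1), not the authors' code: the function names, the uniform sampling of the four phases, and the use of the modulus of the complex amplitude as the new transmission coefficient are assumptions made here, and the caller must supply a sampler for the initial distribution P(t).

```python
import cmath
import math
import random


def super_transmission(t, phi):
    """Eq. (1): complex amplitude t' from five t_i and four loop phases."""
    t1, t2, t3, t4, t5 = t
    # r_i = (1 - t_i^2)^(1/2); max() guards against tiny negative round-off
    r1, r2, r3, r4, r5 = (math.sqrt(max(0.0, 1.0 - ti * ti)) for ti in t)
    e1, e2, e3, e4 = (cmath.exp(1j * p) for p in phi)
    numerator = (t1 * t5 * (r2 * r3 * r4 * e2 - 1)
                 + t2 * t4 * e3 * e4 * (r1 * r3 * r5 / e1 - 1)
                 + t3 * (t2 * t5 * e3 + t1 * t4 * e4))
    denominator = ((r3 - r2 * r4 * e2) * (r3 - r1 * r5 * e1)
                   + (t3 - t4 * t5 * e4) * (t3 - t1 * t2 * e3))
    return numerator / denominator


def rg_step(sample_t, n_samples, rng):
    """Draw five t_i and four random phases per sample; return |t'| values."""
    result = []
    for _ in range(n_samples):
        t = [sample_t(rng) for _ in range(5)]
        phi = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(4)]
        result.append(abs(super_transmission(t, phi)))
    return result
```

A call such as `rg_step(lambda r: math.sqrt(r.random()), 1000, random.Random(1))` would start from a conductance G = t² that is uniform on [0, 1], a hypothetical initial distribution chosen only for illustration. The two limiting cases are easy to check by hand: for all $t_i = 1$ the formula gives $t' = -1$, and for all $t_i = 0$ it gives $t' = 0$.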
3
RG Approach to the ELS
It was realized long ago that, along with the change in the behavior of the eigenfunctions, a localization-delocalization transition manifests itself in the statistics of the energy levels. In particular, as the energy is swept across the mobility edge, the shape of the ELS crosses over from the Wigner-Dyson distribution, corresponding to the appropriate universality class, to the Poisson distribution. Moreover, finite-size corrections to the critical ELS close to the mobility edge allow one to determine the value of the correlation
Fig. 2. The critical distribution of the conductance Pc (G) at the QH transition obtained using the 5 SP RG unit (dashed line). The dotted line denotes a 4 SP RG unit as discussed in Sect. 5. The latter distribution clearly deviates from the expected symmetry with respect to G = 0.5
length exponent [16], thus avoiding an actual analysis of the spatial extent of the wave functions. For this reason, the ELS constitutes an alternative to the MacKinnon-Kramer [17,18,19,20] and to the transmission-matrix [21,22] approaches to the numerical study of localization.
3.1
Derivation of the Network Operator for the RG
As has been shown by Fertig [23], the energy levels of a 2D CC network can be computed from the energy dependence of the so-called network operator U(E). U is constructed similarly to the system of equations for obtaining the transmission coefficient $t'$ of the RG unit as presented in Eq. (1). Every SP of the network contributes two scattering equations. Each of them describes the amplitude of one outgoing channel using the amplitudes of the two incoming channels weighted by the transmission and reflection coefficients t and r, in which the random phase Φ of the links between SPs can also be incorporated. When comparing to the calculation of the transmission coefficient $t'$ an essential difference has to be taken into account. Energy levels are defined only in a closed system, which requires applying appropriate boundary conditions. The energy dependence of U(E) enters through the energy dependence of the $t_i(E)$ of the SPs, as well as the energy dependence of the phases $\Phi_j(E)$ of the links. Considering the vector Ψ of wave amplitudes on the links of the network, the eigenenergies can now be obtained from the stationary condition

$$U(E)\Psi = \Psi . \qquad (2)$$
Nontrivial solutions exist only for discrete energies Ek , which coincide with the eigenenergies of the system [23]. The evaluation of the Ek ’s according to Eq. (2) is numerically very expensive. For that reason a simplification was
proposed [24]. Instead of solving the real eigenvalue problem, it is suggested to calculate a spectrum of quasienergies ω following from

$$U(E)\Psi_l = e^{i\omega_l(E)}\Psi_l . \qquad (3)$$
For fixed energy E the $\omega_l$ are expected to obey the same statistics as the real eigenenergies [24]. This makes the approach perfectly suited for large-size numerical simulations, e.g. studying 50 × 50 SP networks. In order to combine the above algorithm with the RG iteration, in which a rather small unit of SPs is considered, we first “close” the RG unit at each RG step in order to discretize the energy levels, as shown in Fig. 1 with dashed lines. For a given closed RG unit with a fixed set of $t_i$-values at the nodes, the positions of the energy levels are determined by the energy dependences, $\Phi_j(E)$, of the four phases along the loops. These phases change by ∼ π within a very narrow energy interval, inversely proportional to the sample size. Within this interval the change of the transmission coefficients is negligibly small. A closed RG unit in Fig. 1 contains 10 links, and, thus, it is described by 10 amplitudes. Each link is characterized by an individual phase. On the other hand, it is obvious that the energy levels are determined only by the phases along the loops. One way to derive U is to combine the individual phases into phases $\Phi_j$ connected to the four inner loops of the unit and to exclude from the original system of 10 equations all amplitudes except the “boundary” amplitudes $\Psi_j$ (Fig. 1). The network operator for the remaining four amplitudes is a 4 × 4 matrix $\{U_{nm}\}$ with elements

$$\begin{aligned}
U_{11} &= (r_1 r_2 - t_1 t_2 t_3)\,e^{-i\Phi_1}, & U_{12} &= (t_1 r_2 + t_2 t_3 r_1)\,e^{-i\Phi_1}, \\
U_{13} &= t_2 t_5 r_3\,e^{-i\Phi_1}, & U_{14} &= t_2 r_3 r_5\,e^{-i\Phi_1}, \\
U_{21} &= -t_1 r_3 r_4\,e^{-i\Phi_2}, & U_{22} &= r_1 r_3 r_4\,e^{-i\Phi_2}, \\
U_{23} &= -(t_4 r_5 + t_3 t_5 r_4)\,e^{-i\Phi_2}, & U_{24} &= (t_4 t_5 - t_3 r_4 r_5)\,e^{-i\Phi_2}, \\
U_{31} &= -t_1 t_4 r_3\,e^{-i\Phi_4}, & U_{32} &= t_4 r_1 r_3\,e^{-i\Phi_4}, \\
U_{33} &= (r_4 r_5 - t_3 t_4 t_5)\,e^{-i\Phi_4}, & U_{34} &= -(t_5 r_4 + t_3 t_4 r_5)\,e^{-i\Phi_4}, \\
U_{41} &= -(t_2 r_1 + t_1 t_3 r_2)\,e^{-i\Phi_3}, & U_{42} &= -(t_1 t_2 - t_3 r_1 r_2)\,e^{-i\Phi_3}, \\
U_{43} &= t_5 r_2 r_3\,e^{-i\Phi_3}, & U_{44} &= r_2 r_3 r_5\,e^{-i\Phi_3},
\end{aligned} \qquad (4)$$
which can be substituted in Eq. (3). Then the energy levels, $E_k$, of the closed RG unit including phases $\Phi_j(E) = \Phi_j(E_k)$, are the energies for which one of the four eigenvalues of the matrix U is equal to one. Thus, the calculation of the energy levels reduces to a diagonalization of the 4 × 4 matrix. The crucial step now is the choice of the energy dependence $\Phi_j(E)$. If each loop in Fig. 1 is viewed as a closed equipotential, as is the case for the first step of the RG procedure [2], then $\Phi_j(E)$ is a true magnetic phase and changes linearly with energy with a slope governed by the actual potential profile, which, in turn, determines the drift velocity. Thus

$$\Phi_j(E) = \Phi_{0,j} + 2\pi\,\frac{E}{s_j} , \qquad (5)$$
where a random part, Φ0,j , is uniformly distributed within [0, 2π], and 2π/sj is a random slope. Here the coefficient sj acts as an initial level spacing connected to the loop j of the RG unit by defining a periodicity of the corresponding phase. Strictly speaking, the dependence (5) applies only for the first RG step. At each following step, n > 1, Φj (E) is a complicated function of E which carries information about all energy scales at previous steps. However, in the spirit of the RG approach, one can assume that Φj (E) can still be linearized within a relevant energy interval. The conventional RG approach suggests that different scales in real space can be decoupled. Linearization of Eq. (5) implies a similar decoupling in energy space. With Φj (E) given by Eq. (5), the statistics of energy levels determined by the matrix equation (3) is obtained by averaging over the random initial phases Φ0,j and values ti chosen randomly according to a distribution P (t). For every realization the levels Ek are computed from the solutions ω(Ek ) = 0 of Eq. (3) yielding 3 level spacings as illustrated in Fig. 3. Thus the situation is comparable with estimating the true random matrix ensemble distribution functions from small, say, 2×2 matrices only [25,26]. Within the RG approach, the slopes sj as in Eq. (5) determine the level spacings at the first step. They are randomly distributed with a distribution function P0 (s). Subsequent averaging over many realizations yields the ELS, P1 (s), at the second step. Then the key element of the RG procedure, as applied to the level statistics, is using P1 (s) as a distribution of slopes in Eq. (5). This leads to the next-step ELS and so on. The approach of this work relies on the real eigenenergies of the RG unit. 
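The construction of U(E) from Eqs. (4) and (5) can be transcribed directly. The following Python sketch is only illustrative: the function names are made up here, the matrix is stored as a plain list of lists, and the unitarity check is added as a sanity test on the transcription rather than as part of the published procedure. Finding the levels $E_k$ themselves (the energies where an eigenvalue of U equals one) would additionally require an eigenvalue solver and a root search, which are not shown.

```python
import cmath
import math


def loop_phases(energy, phi0, s):
    """Eq. (5): Phi_j(E) = Phi_{0,j} + 2 pi E / s_j for the four loops."""
    return [p0 + 2.0 * math.pi * energy / sj for p0, sj in zip(phi0, s)]


def network_operator(t, phi):
    """Eq. (4): 4x4 network operator for five t_i and phases Phi_1..Phi_4."""
    t1, t2, t3, t4, t5 = t
    r1, r2, r3, r4, r5 = (math.sqrt(max(0.0, 1.0 - ti * ti)) for ti in t)
    p1, p2, p3, p4 = (cmath.exp(-1j * p) for p in phi)
    return [
        [(r1 * r2 - t1 * t2 * t3) * p1, (t1 * r2 + t2 * t3 * r1) * p1,
         t2 * t5 * r3 * p1, t2 * r3 * r5 * p1],
        [-t1 * r3 * r4 * p2, r1 * r3 * r4 * p2,
         -(t4 * r5 + t3 * t5 * r4) * p2, (t4 * t5 - t3 * r4 * r5) * p2],
        # rows 3 and 4 carry Phi_4 and Phi_3, respectively, as in Eq. (4)
        [-t1 * t4 * r3 * p4, t4 * r1 * r3 * p4,
         (r4 * r5 - t3 * t4 * t5) * p4, -(t5 * r4 + t3 * t4 * r5) * p4],
        [-(t2 * r1 + t1 * t3 * r2) * p3, -(t1 * t2 - t3 * r1 * r2) * p3,
         t5 * r2 * r3 * p3, r2 * r3 * r5 * p3],
    ]


def unitarity_defect(u):
    """Largest |(U U^dagger - 1)_nm|; zero for an exactly unitary matrix."""
    n = len(u)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(u[a][k] * u[b][k].conjugate() for k in range(n))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst
```

Because the rows of Eq. (4) are orthonormal for any $t_i$ with $r_i^2 = 1 - t_i^2$, the defect stays at round-off level for every energy, which is what allows the eigenvalues to be written as $e^{i\omega_l(E)}$ with real quasienergies.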
The simpler computation of the spectrum of quasienergies adopted in large-scale simulations within the CC model [24,27] cannot be applied since the energy dependence of phases $\Phi_j$ in the elements of the matrix is neglected and only the random contributions, $\Phi_{0,j}$, are kept. Nevertheless it is instructive to compare the two procedures as presented in Fig. 3. The figure shows the dependence of the 4 quasienergies $\omega_k$ on the energy E calculated for two single sample RG units, with $t_i$ chosen from the critical distribution $P_c(t)$. The energy dependence of the phases $\Phi_j$ was chosen from the ELS of the unitary random matrix ensemble (GUE) according to Eq. (5). It is seen that the dependences ω(E) range from remarkably linear and almost parallel (Fig. 3a) to strongly nonlinear (Fig. 3b).
3.2
The Shape of the ELS at the QH Transition
First, let us turn our attention to the shape of the ELS at the QH transition. As starting distribution $P_0(s)$ of the RG iteration, we choose the ELS of GUE, since previous simulations [24,28] indicate that the critical ELS is close to GUE. According to $P_0(s)$, $s_j$ is drawn randomly and $\Phi_j$, j = 1, ..., 4, is set as in Eq. (5). For the transmission coefficients of the SPs, the FP distribution $P_c(t)$, obtained in Sect. 2, is used as initial distribution $P_0(t)$. From $P_0(t)$ the five $t_i$, i = 1, ..., 5, are drawn. As in Sect. 2 the RG transformation (1)
Fig. 3. Energy dependence of the quasieigenenergies ω for two sample configurations. Instead of using the quasispectrum obtained from ωl (E = 0) () we calculate the real eigenenergies according to ω(Ek ) = 0 (✷). Different line styles distinguish different ωl (E). We emphasize that the observed behavior varies from sample to sample between remarkably linear (a) and strongly nonlinear (b)
is used to compute $10^7$ super-transmission coefficients $t'$. The accumulated distribution $P_1(t')$ is again discretized in at least 1000 bins, such that the bin width is typically 0.001 for the interval $t' \in [0, 1]$. $P_1(t')$ is then smoothed by a Savitzky-Golay filter [29] in order to decrease statistical fluctuations. By finding solutions $\omega(E_k) = 0$ of Eq. (3) the new ELS $P_1(s')$ is constructed from the “unfolded” energy-level spacings $s'_m = (E_{m+1} - E_m)/\Delta$, where m = 1, 2, 3, $E_{k+1} > E_k$ and the mean spacing $\Delta = (E_4 - E_1)/3$. Due to the “unfolding” [30] with ∆, the average spacing is set to one for each sample, and in each RG-iteration step spacing data of $2 \times 10^6$ super-SPs can be superimposed. The resulting ELS is discretized in bins with largest width 0.01. In the following iteration step the procedure is repeated using the $P_1$’s as initial distributions. Convergence of the iteration process is assumed when the mean-square deviations of both distributions $P_n(t)$ and $P_n(s)$ from their predecessors $P_{n-1}(t)$ and $P_{n-1}(s)$ are less than $10^{-4}$. Once the (unstable) FP has been reached, the $P_n$’s should in principle remain unchanged during all further RG iterations. Our simulations show [5] that unavoidable numerical inaccuracies sum up within several further iterations and lead to a drift away from the FP. In order to stabilize our calculation, we therefore use in every RG step the FP distribution $P_c(t)$ instead of $P_n(t)$. This trick does not alter the results but speeds up the convergence of the RG for $P_c(s)$ considerably. This now enables us to determine the critical ELS $P_c(s)$. The RG iteration converges rather quickly after only 2–3 RG steps. The resulting $P_c(s)$ is shown in Fig. 4 together with the ELS for GUE. $P_c(s)$ exhibits the expected features, namely, level repulsion for small s and a long tail at large s, but the overall shape of $P_c(s)$ differs noticeably from GUE.
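The unfolding and binning step described above is simple enough to sketch. The Python below is an illustration under assumptions: the function names and the dictionary-based histogram are choices made here, and the four levels per sample are taken as given (they would come from the solutions of ω(E_k) = 0, which this sketch does not compute).

```python
def unfolded_spacings(levels):
    """Spacings (E_{m+1} - E_m) / Delta, with Delta the sample's mean spacing."""
    ordered = sorted(levels)
    mean_spacing = (ordered[-1] - ordered[0]) / (len(ordered) - 1)
    return [(upper - lower) / mean_spacing
            for lower, upper in zip(ordered, ordered[1:])]


def accumulate(histogram, spacings, bin_width=0.01):
    """Add spacings to a histogram keyed by integer bin index."""
    for s in spacings:
        index = int(s / bin_width)
        histogram[index] = histogram.get(index, 0) + 1
    return histogram
```

For four levels the three unfolded spacings always sum to 3, so every sample has mean spacing one by construction; this is what permits superimposing the spacings of many RG units in a single histogram.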
In the previous large-size lattice simulations [24,28] the obtained critical ELS was much closer to GUE than Pc (s) in Fig. 4. This fact, however, does not reflect on the
Rudolf A. Römer and Philipp Cain
[Figure 4 plot: P(s) versus s/∆; curves: Pc from Ek, Pc from ωl, GUE]
Fig. 4. FP distributions Pc (s) obtained from the spectrum of ωl (E = 0) and from the RG approach using the real eigenenergies Ek in comparison to the ELS for GUE. As in all other graphs P (s) is shown in units of the mean level spacing ∆
accuracy of the RG approach. Indeed, as it was demonstrated recently, the critical ELS – although being system size independent – nevertheless depends on the geometry of the samples [31] and on the specific choice of boundary conditions [32,33]. Sensitivity to the boundary conditions does not affect the asymptotics of the critical distribution, but rather manifests itself in the shape of the “body” of the ELS. Recall now that the boundary conditions which have been imposed to calculate the energy levels (dashed lines in Fig. 1) are non-periodic. As mentioned above, the critical ELS has also been computed previously by diagonalizing U (0) and studying the distribution of quasienergies. In Fig. 4 the result of this procedure using the present RG approach is shown. It appears that the resulting distribution is almost identical to Pc (s). This observation is highly non-trivial, since, as follows from Fig. 3, there is no simple relation between the energies and quasienergies.

3.3 Small and Large s Behavior
As we have seen, the general shape of the critical ELS is not universal. However, the small-s behavior of Pc (s) must be the same as for GUE, namely $P_c(s) \propto s^2$. This is because delocalization at the QH transition implies level repulsion [16,24,27,28,34,35,36,37,38,39,40,41,42]. In Fig. 5 we show that this is also true for the RG approach. The given error bars of our numerical data are standard deviations computed from a statistical average of 100 FP distributions each obtained for different random sets of ti ’s and Φj ’s within the RG unit. In general, within the RG approach, the $s^2$ asymptotics of P (s) is most natural. This is because the levels are found from diagonalization of the 4 × 4 unitary matrix with absolute values of elements widely distributed between 0 and 1. The right form of the large-s tail of P (s) is Poissonian,
Real-Space Renormalization at the Quantum Hall Transition
[Figure 5 plots: P(s) versus s/∆ on logarithmic scales. Left panel legend: Pc data, Pc fit, GUE. Right panel legend: Pc data, b = 5.442, b = 6.803]
Fig. 5. Left: FP Pc (s) for small s in agreement with the predicted $s^2$ behavior. Due to the log-log plot errors are shown in the upper direction only. Right: The large s tail of Pc (s) compared with fits according to the predictions of [16] (lines). The interval used for fitting is indicated by the bars close to the lower axis. For clarity errors are shown in upper direction and for s/∆ = 1.5, 2.0, 2.5, 3.0 only. For s/∆ < 2.4, only every 5th data point is drawn by a symbol
Pc (s) ∝ exp(−bs) [16] for s ≥ 3∆. The data has a high accuracy only for s/∆ ≤ 2.5. For such s, the distribution Pc (s) does not yet reach its large-s tail and the fit parameters shown in Fig. 5 depend largely on the s-interval chosen.
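The two limiting behaviors discussed here can be checked numerically with simple least-squares fits. The sketch below is an assumption-laden illustration: it uses the standard Wigner surmise for GUE, $P(s) = (32/\pi^2)\, s^2 \exp(-4 s^2/\pi)$, as the reference curve, and the helper names are hypothetical. It is not the fitting code behind Fig. 5.

```python
import numpy as np

def wigner_gue(s):
    """Wigner surmise for the GUE level-spacing distribution (mean spacing 1)."""
    s = np.asarray(s, dtype=float)
    return (32.0 / np.pi**2) * s**2 * np.exp(-4.0 * s**2 / np.pi)

def small_s_exponent(s, p):
    """Exponent beta in P(s) ~ s**beta, from a straight-line fit of log P vs log s."""
    slope, _ = np.polyfit(np.log(s), np.log(p), 1)
    return slope

def tail_decay_rate(s, p):
    """Rate b in P(s) ~ exp(-b s), from a straight-line fit of log P vs s."""
    slope, _ = np.polyfit(s, np.log(p), 1)
    return -slope
```

Applied to data restricted to different s-intervals, `tail_decay_rate` makes the caveat in the text explicit: if the tail is not yet purely exponential, the fitted b changes with the interval chosen.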
4 Scaling Results for the ELS

4.1 Finite-Size Scaling at the QH Transition
The critical exponent ν of the QH transition governs the divergence of the correlation length ξ∞ as a function of the control parameter z0 , i.e.

$$\xi_\infty(z_0) \propto |z_0 - z_c|^{-\nu}, \qquad (6)$$

where zc is the critical value. For the QH transition ν ≈ 2.35 has been calculated by a variety of numerical methods [2,3,4] and is in agreement with the experimental estimates ν ≈ 2.3 [6,43,44]. As presented in Sect. 2 the RG approach for the conductance distribution yields a rather accurate value ν = 2.39 ± 0.01. In order to extract ν from the ELS the one-parameter scaling hypothesis [45] is employed. This approach describes the rescaling of a quantity α(N ; {zi }) – depending on (external) system parameters {zi } and the system size N – onto a single curve by using a scaling function f,

$$\alpha(N; \{z_i\}) = f\!\left(\frac{N}{\xi_\infty(\{z_i\})}\right). \qquad (7)$$

Since Eq. (6), as indicated by “∞”, holds only in the limit of infinite system size, we now use the scaling assumption to extrapolate f to N → ∞ from the
finite-size results of the computations. The knowledge about f and ξ∞ then allows one to derive the value of ν. We use the natural parametrization $t = (e^z + 1)^{-1/2}$ [9], such that z can be identified with a dimensionless electron energy. The universal conductance distribution at the transition, Pc (G), corresponds to a distribution Qc (z) [5] which is symmetric with respect to z = 0 and has a shape close to a Gaussian. The RG procedure for the conductance distribution converges and yields Qc (z) only if the initial distribution is an even function of z. This suggests choosing as the control parameter in Eq. (7) the position z0 of the maximum of the function Q(z). The meaning of z0 is an electron energy measured from the center of the Landau band. The fact that the QH transition is infinitely sharp implies that for any z0 ≠ 0, the RG procedure drives the initial distribution Q(z − z0 ) towards an insulator, either with complete transmission of the network nodes (for z0 > 0) or with complete reflection of the nodes (for z0 < 0).

4.2 Scaling for αP and αI
In principle, one is now free to choose for the finite-size scaling analysis (FSS) any characteristic quantity α(N ; z0 ) constructed from the ELS which has a systematic dependence on system size N for z0 ≠ 0 while being constant at the transition z0 = 0. Because of the large number of possible choices [16,28,46,47,48,49] a restriction to two appropriate quantities is made which are obtained by integration of the ELS and have already been successfully used in Refs. [46,50], namely

$$\alpha_P = \int_0^{s_0} P(s)\,ds \quad \text{and} \quad \alpha_I = \frac{1}{s_0}\int_0^{s_0} I(s)\,ds, \qquad (8)$$

with $I(s) = \int_0^{s} P(s')\,ds'$. The integration limit is chosen as s0 = 1.4 which approximates the common crossing point [46] of all ELS curves as can be seen in Fig. 6. Thus P (s0 ) is independent of the distance |z − zc | to the critical point and the system size magnification N . Since αI,P (N, z0 ) is analytical for finite N , one can expand the scaling function f at the critical point. The first order approximation yields

$$\alpha(N, z_0) \sim \alpha(N, z_c) + a\,|z_0 - z_c|\,N^{1/\nu}, \qquad (9)$$
where a is a coefficient. For our calculation we use higher order expansions [51], expanding f twice: first, in terms of Chebyshev polynomials of order Oν and, second, as a Taylor expansion with terms |z0 − zc | of order Oz . This procedure allows one to describe deviations from linearity in |z0 − zc | at the transition. Contributions from an irrelevant scaling variable can be neglected since the transition point z0 = 0 is known. In Fig. 7 the resulting fits for αP at the transition are shown. The fits are chosen such that the total number of parameters is kept minimal while the fit still agrees well with the numerical data.
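The two integrated quantities of Eq. (8) are straightforward to evaluate from a tabulated ELS. The following sketch assumes the distribution is given on a grid of s values and uses linear interpolation plus the trapezoidal rule; the function name and the number of grid points are hypothetical choices made for illustration only.

```python
import numpy as np

def alpha_p_and_alpha_i(s, p, s0=1.4, n_grid=2001):
    """Return (alpha_P, alpha_I) of Eq. (8) for a tabulated distribution P(s).

    alpha_P = int_0^s0 P(s) ds
    alpha_I = (1/s0) int_0^s0 I(s) ds, with I(s) = int_0^s P(s') ds'
    """
    grid = np.linspace(0.0, s0, n_grid)          # ends exactly at s0
    p_grid = np.interp(grid, s, p)               # linear interpolation of P(s)
    ds = np.diff(grid)
    # Cumulative trapezoidal integral gives I(s) on the grid, with I(0) = 0.
    cumulative = np.concatenate(([0.0], np.cumsum(0.5 * (p_grid[1:] + p_grid[:-1]) * ds)))
    alpha_p = cumulative[-1]
    alpha_i = np.sum(0.5 * (cumulative[1:] + cumulative[:-1]) * ds) / s0
    return alpha_p, alpha_i
```

For a Poisson distribution P(s) = exp(−s) the integrals are known in closed form, which gives a convenient sanity check of the numerics.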
[Figure 6 plot: P(s) versus s/∆; curves: z0 = +0.1, z0 = −0.1, Pc, Poisson]
Fig. 6. RG of the ELS used for the computation of ν. The dotted lines correspond to the first 9 RG iterations with an initial distribution P0 shifted to the metallic regime (z0 = 0.1) while the thin full lines represent results for a shift toward localization (z0 = −0.1). Within the RG procedure the ELS moves away from the FP as indicated by the arrows. At s/∆ ≈ 1.4 the curves cross at the same point – a feature we exploit when deriving a scaling quantity from the ELS
[Figure 7 plots. Left: αP versus z0 for N = 2, 4, 8, 16, 32, 64, 128, 256, 512. Right: Log(αP) versus Log(ξ∞/N) for the same system sizes]
Fig. 7. Left: Behavior of αP at the QH transition as results of the RG of the ELS. Data are shown for RG iterations n = 1, . . . , 9 corresponding to effective system sizes $N = 2^n$ = 2, . . . , 512. Full lines indicate the functional dependence according to FSS using the $\chi^2$ minimization with Oν = 2 and Oz = 3. Right: FSS curves resulting from the $\chi^2$ fit of our data shown in Fig. 7. Different symbols correspond to different effective system sizes $N = 2^n$. The data points collapse onto a single curve indicating the validity of the scaling approach
The corresponding scaling curves for αP are displayed in Fig. 7. In the plots the two branches for complete reflection (z0 < 0) and complete transmission (z0 > 0) can be distinguished clearly. In order to estimate the error of the fitting procedure the results for ν obtained by different orders Oν and Oz of the expansion, system sizes N , and regions around the transition are compared. A part of the over 100 fit results together with the standard deviation of the fit are given in Ref. [52]. The value of ν is calculated as average of all individual fits where the resulting error of ν was smaller than 0.02 resulting in ν = 2.37 ± 0.02. This is in excellent agreement with the previously quoted results [2,3,4,5].
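To make the role of Eq. (9) concrete, the lowest-order version of the scaling analysis can be written as two successive straight-line fits: for each system size the slope of α against |z0 − zc| is determined, and the logarithm of that slope is then fitted against ln N, whose slope is 1/ν. The sketch below is a simplified stand-in for the higher-order Chebyshev/Taylor fits used in the text; the function name and the array layout are assumptions.

```python
import numpy as np

def estimate_nu(sizes, z0, alpha, zc=0.0):
    """First-order estimate of nu from alpha(N, z0) ~ alpha_c + a |z0 - zc| N**(1/nu).

    sizes : system sizes N, shape (n_sizes,)
    z0    : control-parameter values, shape (n_z,)
    alpha : alpha[i, j] measured for sizes[i] and z0[j]
    """
    distance = np.abs(np.asarray(z0, dtype=float) - zc)
    # Slope of alpha versus |z0 - zc| for every system size: a * N**(1/nu).
    slopes = np.array([np.polyfit(distance, row, 1)[0] for row in np.asarray(alpha)])
    # ln(slope) = ln(a) + (1/nu) ln(N)
    inv_nu, _ = np.polyfit(np.log(np.asarray(sizes, dtype=float)), np.log(np.abs(slopes)), 1)
    return 1.0 / inv_nu
```

This first-order fit ignores the deviations from linearity that the higher-order expansion is designed to capture, so on real data it would only be reliable very close to the transition.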
5 Test of Different SP Unit
Apparently, the quality of the RG approach crucially depends on the choice of the RG unit. For the construction of a properly chosen RG unit two conflicting aspects have to be considered. (i) With the size of the RG unit also the accuracy of the RG approach increases since the RG unit can preserve more of the connectivity of the original network. (ii) As a consequence of larger RG units the computational effort for solving the scattering problem rises, especially in the case where an analytic solution, as Eq. (1), is not attained. Because of these reasons building an RG unit is an optimization problem depending mainly on the computational resources available. As mentioned in the previous Section larger RG units were already studied in [11,12,53]. In these works the authors could not benefit from an analytic solution and achieved only similar or less accurate statistics in comparison with the results presented here. In this Section the opposite case is studied using a small RG unit proposed in [54] in the context of the Hall resistivity. The super-SP now consists only of 4 SP’s as shown in Fig. 8. It resembles the 5 SP’s unit (Fig. 1) used previously leaving out the SP in the middle of the structure. Again the scattering problem can be formulated as a system of now 8 equations which is solved analytically,

$$t_{4\mathrm{SP}} = \frac{t_2 t_3 e^{i\Phi_2}\left(r_1 r_4 e^{-i\Phi_1} - 1\right) + t_1 t_4\left(r_2 r_3 e^{i\Phi_3} - 1\right)}{\left(1 - r_2 r_3 e^{i\Phi_3}\right)\left(1 - r_1 r_4 e^{i\Phi_1}\right) + t_1 t_2 t_3 t_4 e^{i\Phi_2}}. \qquad (10)$$

The result can be verified using Eq. (1) after setting t3 = 0 and r3 = 1, joining the phases Φ1 and Φ4 and renumbering the indices. The RG transformation (10) is then applied within the RG approach analogously to the 5 SP unit. First the FP distribution Pc (G) is obtained. A comparison of Pc (G) for both RG units is shown in Fig. 2. In the case of the 5
[Figure 8 diagram: saddle points I, II, III, IV connected by links carrying the phases Φj]
Fig. 8. RG unit constructed from 4 SP’s indicated by full circles. Some connectivity is neglected (dotted circles). The phases Φj are accumulated by the electron motion (arrows) along contours of the energy potential
SP unit the FP distribution Pc (G) exhibits a flat minimum around G = 0.5, and sharp peaks close to G = 0 and G = 1. It is symmetric with respect to G ≈ 0.5. The 4 SP unit yields differing results. While Pc (G) is still rather flat it is clearly asymmetric, which already indicates that the 4 SP unit cannot describe all of the underlying symmetry of the CC network. The Pc (G) for the 4 SP unit is then used in the calculation of the critical exponent ν to construct the shifted initial distributions Q0 (z). The behavior of ν as function of n for the 4 and 5 SP RG units is shown in Fig. 9. Both curves approach convergence monotonically from larger values of ν. During all iteration steps, ν for the 4 SP differs from the 5 SP result by an almost constant positive shift. After 8 iterations, which equals an increase of system size by a factor of 256, one finds ν5SP = 2.39 ± 0.01 and ν4SP = 2.74 ± 0.02. The error describes a confidence interval of 95% as obtained from the fit to a linear behavior. The result for ν4SP deviates clearly from the five SP result and also from the values obtained by other methods [2,3,4]. In addition to these findings also the discussion in Sect. 2 indicates that the 4 SP RG unit fails to describe the critical properties at the QH transition correctly. This fact underlines again the importance of the RG unit for a successful application of the RG approach.
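The 4 SP transformation of Eq. (10) is simple enough to evaluate directly. The sketch below transcribes the equation as printed; it additionally assumes real transmission amplitudes with $r_i = \sqrt{1 - t_i^2}$ at each node, which is the usual unitarity condition but is not stated in this section. The function name and argument layout are hypothetical.

```python
import numpy as np

def t_4sp(t, phi):
    """Super-transmission amplitude of the 4 SP RG unit, Eq. (10).

    t   : (t1, t2, t3, t4), real transmission amplitudes in [0, 1]
    phi : (Phi1, Phi2, Phi3), phases accumulated along the links
    Assumption: reflection amplitudes follow from r_i = sqrt(1 - t_i**2).
    """
    t = np.asarray(t, dtype=float)
    t1, t2, t3, t4 = t
    r1, r2, r3, r4 = np.sqrt(1.0 - t**2)
    e1, e2, e3 = np.exp(1j * np.asarray(phi, dtype=float))
    numerator = t2 * t3 * e2 * (r1 * r4 / e1 - 1.0) + t1 * t4 * (r2 * r3 * e3 - 1.0)
    denominator = (1.0 - r2 * r3 * e3) * (1.0 - r1 * r4 * e1) + t1 * t2 * t3 * t4 * e2
    return numerator / denominator
```

In an RG step one would draw the four t's from the current distribution and the phases uniformly from [0, 2π), and collect G = |t_4SP|² over many realizations. Two limits are easy to verify by hand: fully transmitting nodes 1 and 4 give t_4SP = −1 for any choice of the remaining parameters, and fully reflecting nodes 1 and 2 give t_4SP = 0.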
[Figure 9 plot: ν versus N for the 5 SP and 4 SP units. Inset: zmax versus z0]
Fig. 9. The critical exponent ν as function of the effective system size $N = 2^n$ for 4 SP (dotted line) and 5 SP unit (dashed line). Inset: Maximum zmax of Q(z) vs. initial shift z0 for 8 RG iterations (symbols) using 4 SP. Dashed lines indicate linear fits
6 Conclusions
The version of the network model [55] that has been most widely studied in the context of the QH effect is the CC model [2], describing the electron motion in a disordered system in the strong magnetic field limit. The fact that the RG approach, within which the correlations between different scales are neglected, describes the results of the large-scale simulations of the CC model so accurately, indicates that only a few spatial correlations within each scale are responsible for the critical characteristics of the quantum Hall transition. More precisely, the structure of the eigenstates of a macroscopic sample at the transition can be predicted from the analysis of a single RG unit consisting of only five nodes. Further applications of this approach to the computation of the Hall resistance and the plateau-to-insulator transition shall be published elsewhere.

Acknowledgements

We thank B. Huckestein, M.E. Raikh, M. Schreiber, and U. Zülicke for stimulating discussions. This work was supported by the DFG within SFB393 and the priority research program on quantum Hall systems. Further support was provided by a DAAD-NSF collaborative research grant INT-0003710.
References

1. B. Huckestein, Rev. Mod. Phys. 67, 357 (1995).
2. J. T. Chalker and P. D. Coddington, J. Phys.: Condens. Matter 21, 2665 (1988).
3. D.-H. Lee, Z. Wang, and S. Kivelson, Phys. Rev. Lett. 70, 4130 (1993).
4. B. Huckestein, Europhys. Lett. 20, 451 (1992).
5. P. Cain, R. A. Römer, M. Schreiber, and M. E. Raikh, Phys. Rev. B 64, 235326 (2001), ArXiv: cond-mat/0104045.
6. F. Hohls, U. Zeitler, and R. J. Haug, Phys. Rev. Lett. 86, 5124 (2001), ArXiv: cond-mat/0011009.
7. S. Koch, R. J. Haug, K. v. Klitzing, and K. Ploog, Phys. Rev. Lett. 67, 883 (1991).
8. D. Stauffer and A. Aharony, Introduction to Percolation Theory (Taylor and Francis, London, 1992).
9. A. G. Galstyan and M. E. Raikh, Phys. Rev. B 56, 1422 (1997).
10. D. P. Arovas, M. Janssen, and B. Shapiro, Phys. Rev. B 56, 4751 (1997), ArXiv: cond-mat/9702146.
11. A. Weymer and M. Janssen, Ann. Phys. (Leipzig) 7, 159 (1998), ArXiv: cond-mat/9805063.
12. M. Janssen, R. Merkt, J. Meyer, and A. Weymer, Physica 256–258, 65 (1998).
13. Z. Wang, B. Jovanovic, and D.-H. Lee, Phys. Rev. Lett. 77, 4426 (1996).
14. X. Wang, Q. Li, and C. M. Soukoulis, Phys. Rev. B 58, 3576 (1998).
15. Y. Avishai, Y. Band, and D. Brown, Phys. Rev. B 60, 8992 (1999).
16. B. I. Shklovskii, B. Shapiro, B. R. Sears, P. Lambrianides, and H. B. Shore, Phys. Rev. B 47, 11487 (1993).
17. J.-L. Pichard and G. Sarma, J. Phys. C 14, L127 (1981).
18. J.-L. Pichard and G. Sarma, J. Phys. C 14, L617 (1981).
19. A. MacKinnon and B. Kramer, Phys. Rev. Lett. 47, 1546 (1981).
20. A. MacKinnon and B. Kramer, Z. Phys. B 53, 1 (1983).
21. R. Landauer, Phil. Mag. 21, 863 (1970).
22. D. S. Fisher and P. A. Lee, Phys. Rev. B 23, 6851 (1981).
23. H. A. Fertig, Phys. Rev. B 38, 996 (1988).
24. R. Klesse and M. Metzler, Phys. Rev. Lett. 79, 721 (1997).
25. E. P. Wigner, Proc. Camb. Phil. Soc. 47, 790 (1951).
26. M. L. Mehta, Random Matrices and the Statistical Theory of Energy levels (Academic Press, New York, 1991).
27. M. Metzler, J. Phys. Soc. Japan 67, 4006 (1998).
28. M. Batsch and L. Schweitzer, in High Magnetic Fields in Physics of Semiconductors II: Proceedings of the International Conference, Würzburg 1996, edited by G. Landwehr and W. Ossau (World Scientific Publishers Co., Singapore, 1997), pp. 47–50, ArXiv: cond-mat/9608148.
29. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in FORTRAN, 2nd ed. (Cambridge University Press, Cambridge, 1992).
30. F. Haake, Quantum Signatures of Chaos, 2nd ed. (Springer, Berlin, 1992).
31. H. Potempa and L. Schweitzer, J. Phys.: Condens. Matter 10, L431 (1998), ArXiv: cond-mat/9804312.
32. D. Braun, G. Montambaux, and M. Pascaud, Phys. Rev. Lett. 81, 1062 (1998), ArXiv: cond-mat/9712256.
33. L. Schweitzer and H. Potempa, Physica A 266, 486 (1998), ArXiv: cond-mat/9809248.
34. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 55, 16001 (1997).
35. T. Kawarabayashi, T. Ohtsuki, K. Slevin, and Y. Ono, Phys. Rev. Lett. 77, 3593 (1996), ArXiv: cond-mat/9609226.
36. M. Batsch, L. Schweitzer, I. K. Zharekeshev, and B. Kramer, Phys. Rev. Lett. 77, 1552 (1996), ArXiv: cond-mat/9607070.
37. M. Feingold, Y. Avishai, and R. Berkovits, Phys. Rev. B 52, 8400 (1995), ArXiv: cond-mat/9503058.
38. T. Ohtsuki and Y. Ono, J. Phys. Soc. Japan 64, 4088 (1995), ArXiv: cond-mat/9509146.
39. M. Metzler and I. Varga, J. Phys. Soc. Japan 67, 1856 (1998).
40. M. Batsch, L. Schweitzer, and B. Kramer, Physica B 249, 792 (1998), ArXiv: cond-mat/9710011.
41. M. Metzler, J. Phys. Soc. Japan 68, 144 (1999).
42. Y. Ono, T. Ohtsuki, and B. Kramer, J. Phys. Soc. Japan 65, 1734 (1996), ArXiv: cond-mat/9603099.
43. S. Koch, R. J. Haug, K. v. Klitzing, and K. Ploog, Phys. Rev. B 43, 6828 (1991).
44. F. Hohls, U. Zeitler, and R. J. Haug, Phys. Rev. Lett. 88, 036802 (2002), ArXiv: cond-mat/0107412.
45. E. Abrahams, P. W. Anderson, D. C. Licciardello, and T. V. Ramakrishnan, Phys. Rev. Lett. 42, 673 (1979).
46. E. Hofstetter and M. Schreiber, Phys. Rev. B 49, 14726 (1994), ArXiv: cond-mat/9402093.
47. I. K. Zharekeshev and B. Kramer, Jpn. J. Appl. Phys. 34, 4361 (1995), ArXiv: cond-mat/9506114.
48. I. K. Zharekeshev and B. Kramer, Phys. Rev. B 51, 17239 (1995).
49. I. K. Zharekeshev and B. Kramer, Phys. Rev. Lett. 79, 717 (1997), ArXiv: cond-mat/9706255.
50. E. Hofstetter and M. Schreiber, Phys. Rev. B 48, 16979 (1993).
51. K. Slevin and T. Ohtsuki, Phys. Rev. Lett. 82, 382 (1999), ArXiv: cond-mat/9812065.
52. P. Cain, R. A. Römer, and M. E. Raikh, Phys. Rev. B 67, 075307 (2003), ArXiv: cond-mat/0209356.
53. M. Janssen, R. Merkt, and A. Weymer, Ann. Phys. (Leipzig) 7, 353 (1998).
54. U. Zülicke and E. Shimshoni, Phys. Rev. B 63, 241301 (2001), ArXiv: cond-mat/0101443.
55. B. Shapiro, Phys. Rev. Lett. 48, 823 (1982).
Evidence for a Metal-Insulator Transition in Overdoped Cuprates: New Raman Results

Francesca Venturini

Walther Meissner Institute, Bavarian Academy of Sciences
85748 Garching, Germany

Abstract. Transport properties play a major role in the characterization of correlated metals such as the cuprates. The usual dc and optical conductivity measurements, however, suffer from the missing momentum resolution in the strongly anisotropic CuO2 plane. The problem can be partially solved by using electronic Raman scattering. Here, the response is proportional to the conductivity. Additionally, different parts of the Fermi surface can be projected out by using polarized light. With this technique we study the electron dynamics in the normal state of cuprates over a wide range of doping. The strong anisotropy of the electron relaxation which evolves below a doping level of 0.22 holes/CuO2 is interpreted in terms of an unconventional metal-insulator transition with an anisotropic gap. A phenomenology is developed which allows a quantitative understanding of the Raman results and provides a scenario which links single- and many-particle properties.
1 Introduction
Copper-oxygen compounds are characterized by strong electronic correlations. As a result, a complex phase diagram develops upon doping that contains long-range antiferromagnetic order, magnetic fluctuations, charge and spin ordering, pseudogap behaviour, and superconductivity. [1,2,3,4] The interrelation of the different instabilities is among the important open problems in solid-state physics. In contrast to more conventional materials, the comparison of different experiments often poses new questions. In fact, the normal state of copper-oxygen compounds is characterized by several crossover lines separating regions of the phase diagram with different physical properties. [5] Such a rich phenomenology has been ascribed to the existence of a nearby quantum-critical point (QCP). As a consequence, the fluctuations expected in the system at T > 0 would give rise to the distinctly non-Fermi-liquid behaviour as observed in most transport and thermodynamic quantities. Although several scenarios have been proposed for the origin of a QCP [6,7,8,9,10,11,12] there is no picture which has been able to reproduce the complex behaviour of the electron dynamics over a wide range of doping levels. It would be extremely useful if a description of electron dynamics could connect the results from various experiments to a common origin.

From the point of view of critical phenomena it is not uncommon that single-particle properties may show substantially different behavior from many-particle properties. [13] In the vicinity of a quantum phase transition (QPT) single-particle properties, e.g., the density of states at the Fermi level, may be uncritical while two-particle properties such as the conductivity may deviate from Fermi liquid behavior. Raman scattering of light by electrons can be a useful tool for studying a putative QPT in planar anisotropic systems since it combines the sensitivity to strong correlations of many-particle probes and resolution in k space. By adjusting the polarization states of the incoming and scattered photons it is in fact possible to selectively probe excitations with momenta in different regions of the first Brillouin zone. Thus, the B1g and B2g electronic Raman spectra can probe either “hot” (anti-nodal) or “cold” (nodal) electrons with momenta along the principal axes and the diagonals of the CuO2 plane, respectively. In the lowest order approximation there is an analogy between the Raman response χ'' and the conductivity σ'. In fact, it was shown that, neglecting selection rules and vertex corrections, χ''(ω, T) ∝ ω σ'(ω, T) [14]. In this paper we present detailed measurements of the Raman spectra in Bi2Sr2CaCu2O8+δ (BSCCO), YBa2Cu3O6+x (YBCO) and La2−xSrxCuO4 (LSCO) over a wide range of effective doping 0.09 < p < 0.24. As a result of the study of the temperature and symmetry dependence of the Raman measurements, a strong anisotropy of the electronic relaxation rates is observed which cannot be explained by single-particle properties alone. Therefore, a new phenomenological model is developed, which allows a quantitative understanding of the Raman results in terms of an unconventional metal-insulator transition.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 253–266, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Doping Dependence of the Raman Response in the Normal State

The electronic Raman response χ''(ω, T, p) shows a very different evolution with doping in B1g and B2g symmetries. This already becomes clear by comparing the spectra of optimally (p = 0.16) to strongly (p = 0.23) overdoped BSCCO single crystals [15] at a fixed temperature T = 180 K. The contribution of the lattice vibrations has been subtracted out. In B1g symmetry (Fig. 1) the response is strongly suppressed below 2000 cm⁻¹, indicating the opening of a gap in the electronic excitation spectrum upon decreasing carrier concentrations. In contrast, in B2g symmetry (Fig. 2) there is only a very weak doping dependence, as if a gap existed only for the antinodal or “hot” quasiparticles.

[Figure 1 plot: χ''(ω, p) (cps/mW) versus Raman shift ω (cm⁻¹), 0 to 3000 cm⁻¹, Bi2Sr2CaCu2O8+δ, T = 180 K, B1g. Samples: Tc = 92 K (p = 0.16), 78 K (p = 0.20), 62 K (p = 0.22), 56 K (p = 0.23)]
Fig. 1. Electronic Raman response χ''(ω, p) at T = 180 K in B1g symmetry of differently doped BSCCO crystals

[Figure 2 plot: χ''(ω, p) (cps/mW) versus Raman shift ω (cm⁻¹), 0 to 3000 cm⁻¹, Bi2Sr2CaCu2O8+δ, T = 180 K, B2g. Samples: Tc = 92 K (p = 0.16), 78 K (p = 0.20), 62 K (p = 0.22), 56 K (p = 0.23)]
Fig. 2. Electronic Raman response χ''(ω, p) at T = 180 K in B2g symmetry of differently doped BSCCO crystals

The evolution of the spectra with the carrier concentration is seen most clearly when the temperature dependence is also studied. In Fig. 3 the raw Raman spectra of a strongly overdoped sample (Tc = 62 K, p = 0.22) [15] are shown for B1g (a) and for B2g (b) symmetries in the normal state at 206 K, 114 K and 80 K. All the temperatures indicated here are already corrected to include the heating of the sample due to the absorption of the laser light. The sharper features superimposed on the broad electronic continuum are due to lattice vibrations. The low-frequency response (ω < 200 cm⁻¹) depends on temperature in a rather similar way in the two symmetries: the initial slope of the spectra, which decreases with increasing temperature, is proportional to the lifetime of the carriers, indicating a typical metallic behaviour.¹

Fig. 3. Raman response χ''(ω, T) of strongly overdoped BSCCO (Tc = 62 K, p = 0.22) at different temperatures in B1g (a) and B2g (b) symmetries. The spectra in B1g symmetry have been multiplied by 0.5 to show the data on the same scale as in Figs. 4 and 5

The previous observations can be compared with measurements for another overdoped sample (Tc = 78 K, p = 0.20) [15] with a reduced carrier concentration. The Raman response in B1g and B2g symmetries is shown in Fig. 4 (a, b) at the three temperatures, 242 K, 190 K and 91 K. The spectra in the latter symmetry are similar in shape and temperature evolution to those of the strongly overdoped sample (Fig. 3 (b)). In contrast, in the B1g channel there is hardly any temperature dependence at all.

Fig. 4. Raman response χ''(ω, T) of overdoped BSCCO (Tc = 78 K, p = 0.20) at different temperatures in B1g (a) and B2g (b) symmetries

Finally, the results for a slightly underdoped BSCCO sample (Tc = 92 K, p = 0.15) [16] are shown for comparison in Fig. 5. While in B2g symmetry (Fig. 5(b)) there is still a decrease of the slope at low frequencies with increasing temperature, just the opposite temperature dependence is observed in B1g symmetry (Fig. 5(a)) indicating a non-metallic behaviour.

Fig. 5. Raman response χ''(ω, T) of underdoped BSCCO (Tc = 92 K, p = 0.15) at different temperatures in B1g (a) and B2g (b) symmetries

Summarizing, the continuum in B2g geometry is relatively doping independent and has a temperature evolution typical for a metal. In contrast, a non-trivial dependence on doping and temperature is observed in B1g symmetry: the slope of the low-frequency response decreases with increasing T in a way similar to that in B2g symmetry for the strongly overdoped sample (Fig. 3(b)), becomes temperature independent near optimal doping p ≥ 0.16 (Fig. 4 (a)), and starts to increase with increasing T just below optimal doping (Fig. 5 (a)).

¹ It is emphasized that in this context lifetime should be understood in the spirit of transport lifetime as measured here by Raman scattering and not as single-particle lifetime obtained by photoemission spectroscopy.
3 Raman Relaxation Rates
A quantitative analysis of the dynamics of the carriers can be performed from the experimentally measured Raman response χ (ω, T ) through a memoryfunction approach [16]. Following this method it is possible to derive “Raman relaxation rates” Γ (ω, T ) and “Raman mass-enhancement factors” 1 + λ(ω, T ). The dynamical relaxation rates for two overdoped BSCCO samples (Tc = 62 K and Tc = 78 K), whose raw Raman spectra are shown in Figs. 3 and 4, are displayed as a function of temperature in Fig. 6 [17]. Both in B1g (a,b) and B2g (c,d) symmetries, the frequency dependence of the quasiparticle relaxation rates shows only little dependence on momentum and doping. As compared to results at lower doping levels [16] a tendency to a more quadratic frequency dependence below approximately 400 cm−1 in B1g symmetry is found here, possibly indicating more conventional quasiparticle dynamics. The static relaxation rate is obtained by extrapolation Γ0 (T ) ≡ Γ (ω → 0, T ), and represents the inverse of the quasiparticle lifetime; therefore it has
258
Francesca Venturini
Fig. 6. Relaxation rates in B1g (a,b) and B2g (c,d) symmetries at different temperatures for two overdoped BSCCO samples (Tc = 78 K and Tc = 62 K)
a significance similar to the transport resistivity in a conventional metal. As visible from Fig. 6, the dependence of the dc limit of the relaxation rates on temperature evolves differently with doping in the two symmetries: while B Γ0 2g (T ) decreases with temperature at both doping levels (Fig. 6 (c,d)) B consistent with ordinary and optical transport [18], Γ0 1g (T ) is essentially temperature independent for p = 0.20 and assumes the B2g behaviour at p ≥ 0.22. The evolution of the static relaxation rates in B1g and B2g symmetries over a wide doping range (0.09 ≤ p ≤ 0.23) at a fixed temperature T = 200 K is summarized in Fig. 7 (a). A strong anisotropy between the two B symmetries is clearly visible up to p 0.21. The magnitude of Γ0 1g decreases B2g by approximately 70% for 0.09 ≤ p ≤ 0.22 while Γ0 is almost constant up to p 0.20 and changes by only 30% in the narrow range 0.20 < p < 0.22. In particular, a crossover for 0.20 < p < 0.22 is clearly visible where the Raman relaxation rates rapidly decrease, and the anisotropy vanishes. It is important to emphasize that the changes for 0.20 < p < 0.22 are observed for a set of samples prepared from pieces of a single homogeneous crystal, which have been annealed in different oxygen partial pressures. The results are consistent with those of samples from other sources. The observed features are therefore robust and not related to differences in sample quality. The variations of the relaxation rates with temperature ∂Γ0 (T, p)/∂T as a function of doping are shown in Fig. 7 (b) at a fixed temperature T = 200 K. In B1g symmetry, which probes mainly the nodal quasiparticles, ∂Γ0 (T, p)/∂T deviates only little from 2 in the entire doping range. The B logarithmic derivative ∂[lnΓ0 2g (T )]/∂(lnT ) in Fig. 7 (c) demonstrates that
Evidence for a Metal-Insulator Transition
G0 (K)
2000
259
Bi2Sr2CaCu2O8+d B1g
1000
B2g (a)
¶G0 /¶T (K/K)
0 2
B2g
0 -2
B1g (b)
¶(lnG0 )/¶(lnT)
-4 1.0 0.5
B2g
0.0
B1g -0.5 T = 200K -1.0 0.05 0.10
B
(c)
0.15 0.20 doping p
0.25
Fig. 7. (a) Static relaxation rates in B1g and B2g symmetries as a function of doping p at a fixed temperature T = 200 K. (b) ∂Γµ (T )/∂T as a function of doping. (c) Logarithmic derivatives of the Raman relaxation rates indicating power-law behavior in the temperature dependence. The gray line marks optimal doping. The metallic regime above pc = 0.22 is shaded. The solid lines in (a) and (b) are the theoretical predictions
$\Gamma_0^{B_{2g}}(T)$ varies essentially linearly with temperature. The two observations suggest that the relaxation rate in B2g symmetry varies as $\Gamma_0^{B_{2g}}(T,p) \simeq 2k_BT$. In contrast, $\partial\Gamma_0^{B_{1g}}(T,p)/\partial T$, which reflects the dynamics of the antinodal quasiparticles, is strongly temperature dependent, increasing continuously with p and changing sign close to optimal doping (Fig. 7 (b)). For p ≥ 0.22 any kind of anisotropy disappears also for $\partial\Gamma_0(T,p)/\partial T$. Both the apparent symmetry dependence of the relaxation rates, $\Gamma_0^{B_{2g}} < \Gamma_0^{B_{1g}}$ (Fig. 7 (a)), and the characteristic increase of the relaxation rate for decreasing temperatures, $\partial\Gamma_0^{B_{1g}}(T)/\partial T < 0$ for p ≤ 0.16 (Fig. 7 (b)), indicate that there is not only gap-like behaviour but also a strong anisotropy in momentum space. In particular, since the strongest effects are observed in B1g symmetry, the maxima of such a gap must be located around the Brillouin zone axes (anti-nodal or “hot” region). Thus the “hot” quasiparticles show a crossover from metallic to insulating behaviour near optimal doping while the “cold” quasiparticles are metallic for all dopings at the temperatures examined. In this sense, the observed evolution with doping indicates the existence of an anisotropic or unconventional metal-insulator transition. This is different from a conventional Mott transition [19] since the charge
excitations become gapped only on specific parts of the Fermi surface, and the overall dc transport remains still metallic.
4 Phenomenological Model for the Metal-Insulator Crossover
The momentum dependence of the gap is reminiscent of both the superconducting gap and the pseudogap [2,4], being compatible with $|d_{x^2-y^2}|$ symmetry. In spite of a similar k dependence, however, an incipient superconducting instability (pre-formed pairs) can be safely excluded because of the high temperature (200 K) and doping level (p > 0.19) of our experiment. The same holds for the pseudogap. Its onset temperature $T^*$ actually merges with Tc already for 0.16 < p < 0.19 [2,4]. The scenario considered here is that of an anisotropic charge gap which develops near the “hot” spots to minimize strong interactions between electrons [20]. Therefore, we examine the effect of an anisotropic normal-state gap on the Raman response. An exact treatment of non-resonant electronic Raman scattering in systems displaying a quantum-critical MIT in the limit of infinite dimensions has been formulated for Hamiltonians displaying both Fermi-liquid and non-Fermi-liquid ground states [21]. However, the development of an anisotropic gap and its effect on the Raman response have not been investigated yet. Therefore, a phenomenological treatment for a system near a quantum phase transition is considered here, focusing on the effect of an anisotropic, doping-dependent gap in the charge channel ∆C(p). The observation of non-metallic behaviour of quasiparticles with momenta along the axes of the Brillouin zone indicates that the gap is compatible with $|d_{x^2-y^2}|$ symmetry. The simplest form which is maximal along the axes and vanishes along the Brillouin zone diagonals is considered:
$$\Delta_C(p,\phi) = \Delta_C(p)\,\cos^2(2\phi) \qquad (1)$$
with φ the azimuthal angle on a cylindrical Fermi surface and ∆C(p) the doping dependent magnitude. The doping dependence is postulated to be of the form
$$\Delta_C(p) = \Delta_C(0)\left(1 - \frac{p}{p_c}\right)^{\zeta}\,\theta(p_c - p) \qquad (2)$$
with pc the critical doping and ∆C(0) the maximum value of the gap that, together with the exponent ζ, has to be determined by comparison with the experimental results. For doping levels larger than pc the gap is closed as expressed by the Heaviside function θ(pc − p) and the dynamics of the quasiparticles is metallic. For p < pc the gap increases in magnitude with decreasing p. The effect of such a gap is to reduce the number of states participating in the conduction to those located in either part of the band with a total
width $E_b$ separated symmetrically by $\pm\Delta_C/2$ with respect to the chemical potential. Note that the gap $\Delta_C$ should be considered an activation energy for moving a particle rather than a single particle gap. The Raman response function in the normal state is given by [22]
$$\chi''_{\gamma\gamma}(\omega) = \sum_{\mathbf{k}} \gamma_{\mathbf{k}}^2 \int_{-\infty}^{\infty} \frac{dy}{\pi}\, G''(y,\mathbf{k})\, G''(y-\omega,\mathbf{k}-\mathbf{q})\, \left[n_F(y) - n_F(y-\omega)\right] , \qquad (3)$$
where $n_F$ is the Fermi-Dirac distribution function and $\gamma_{\mathbf{k}}$ is the Raman vertex; $\hbar = 1$ and $k_B = 1$ are used. $G''(y,\mathbf{k})$ is the imaginary part of the renormalized electronic Green function
$$G''(y,\mathbf{k}) = \frac{\Sigma''(y,\mathbf{k})}{\left(y - \xi_{\mathbf{k}} - \Sigma'(y,\mathbf{k})\right)^2 + \left(\Sigma''(y,\mathbf{k})\right)^2} \qquad (4)$$
with $\xi_{\mathbf{k}}$ the band dispersion. The analytical calculation can be simplified by approximating the Fermi surface as cylindrical and by transforming the 2D k-sum into an energy and an angular integration around the Fermi surface. The imaginary part of the self energy $\Sigma''$ is taken to be momentum, energy and doping independent, and linearly dependent on temperature, in qualitative agreement with ARPES results in a wide range of doping [23]. Then, for q → 0 and in the limit ω → 0 the Raman response can be approximated as [22]
$$\chi''_{\gamma\gamma}(\omega\to 0) = N_F \int_0^{2\pi} d\phi\, \gamma^2(\phi) \int_{\rm band} \frac{dy}{\pi}\, (-\omega)\, \frac{\partial n_F(y)}{\partial y}\, \frac{1}{2\Sigma''} . \qquad (5)$$
The static Raman relaxation rate is the inverse of the slope of the Raman spectra for vanishing frequencies
$$\Gamma_0(T,p) = \left[\frac{\partial \chi''(\omega\to 0)}{\partial\omega}\right]^{-1} . \qquad (6)$$
Therefore, for $E_b \gg T$ the Raman relaxation rate is
$$\Gamma_0(T,p) = \frac{\pi\Sigma''}{N_F}\left[\int_0^{2\pi} d\phi\, \gamma^2(\phi)\, n_F\!\left(\frac{\Delta_C(\phi,p)}{2}\right)\right]^{-1} . \qquad (7)$$
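The step from Eq. (5) to Eq. (7) can be spelled out as follows. This is only a sketch, and it assumes that "band" in Eq. (5) denotes the two parts of the band lying outside the gap, each of width $E_b/2$, with the outer limits pushed to infinity because $E_b \gg T$; these integration limits are an interpretation of the text and are not given explicitly there.

```latex
% Assumption: band = { y : \Delta_C/2 \le |y| \le \Delta_C/2 + E_b/2 },  E_b \gg T
\int_{\rm band} dy \left(-\frac{\partial n_F}{\partial y}\right)
  \simeq \int_{\Delta_C/2}^{\infty} dy \left(-\frac{\partial n_F}{\partial y}\right)
       + \int_{-\infty}^{-\Delta_C/2} dy \left(-\frac{\partial n_F}{\partial y}\right)
  = n_F\!\left(\tfrac{\Delta_C}{2}\right)
    + \left[1 - n_F\!\left(-\tfrac{\Delta_C}{2}\right)\right]
  = 2\, n_F\!\left(\tfrac{\Delta_C}{2}\right)

% Inserting this into Eq. (5), with \Sigma'' independent of y and \phi:
\chi''_{\gamma\gamma}(\omega \to 0) \simeq \frac{\omega\, N_F}{\pi\, \Sigma''}
  \int_0^{2\pi} d\phi\, \gamma^2(\phi)\,
  n_F\!\left(\frac{\Delta_C(\phi,p)}{2}\right)
```

Taking the inverse of the slope with respect to ω, as prescribed by Eq. (6), then reproduces the prefactor $\pi\Sigma''/N_F$ and the angular integral of Eq. (7).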
The relaxation rates calculated from Eq. (7) are plotted in Fig. 8 for B1g (panel a) and B2g (panel b) symmetries for different doping levels p/pc. The temperature corresponding to the experiments is also marked. The figure shows that, when the doping level is reduced from p/pc = 1, the slope of the relaxation rates in B1g symmetry at a fixed temperature decreases and eventually changes sign at a doping concentration which depends on the temperature. In B2g symmetry no such phenomenology is observed.
Fig. 8. Temperature dependence of the static relaxation rates in B1g and B2g symmetries for different doping levels p/pc. The temperature corresponding to 200 K is indicated
5 Unconventional Metal-Insulator Transition in the Overdoped Regime
The comparison between the experimental static relaxation rates and the theoretical predictions of the phenomenological model is shown in Fig. 7. The theoretical curves have been calculated from Eq. (7) with ∆C(0) = 1100 K and ζ = 0.25. In B2g symmetry, which reflects dc and optical transport properties [16,24], the influence of the gap is weak, and the temperature dependence comes essentially from $\Sigma''(T)$. This explains why in conventional transport the MIT is observed at much lower temperatures, if at all. The general trend of the B1g rates, in particular the sign change, is well reproduced by the phenomenology in spite of the simplified form considered for $\Sigma''$ and for the gap.
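To make the use of Eq. (7) concrete, the following Python sketch evaluates the static relaxation rate with the gap of Eqs. (1) and (2) and the parameters quoted above (∆C(0) = 1100 K, ζ = 0.25, pc = 0.22). Several ingredients are assumptions introduced only for illustration and are not taken from the text: the Raman vertices are taken as cos(2φ) for B1g and sin(2φ) for B2g, the density of states is set to N_F = 1, and the self energy is set to Σ'' = T, chosen so that the ungapped result is Γ0 = 2T. The function names are hypothetical.

```python
import numpy as np

P_C = 0.22        # critical doping p_c quoted in the text
DELTA_0 = 1100.0  # Delta_C(0) in K, fit value quoted in the text
ZETA = 0.25       # exponent zeta, fit value quoted in the text

# Assumed Raman vertices on a cylindrical Fermi surface (not from the text).
VERTICES = {
    "B1g": lambda phi: np.cos(2.0 * phi),
    "B2g": lambda phi: np.sin(2.0 * phi),
}


def gap_magnitude(p):
    """Doping-dependent gap magnitude Delta_C(p) of Eq. (2), in K."""
    if p >= P_C:
        return 0.0  # Heaviside function: the gap is closed above p_c
    return DELTA_0 * (1.0 - p / P_C) ** ZETA


def fermi(energy, temperature):
    """Fermi-Dirac distribution with k_B = 1 and energies measured in K."""
    return 1.0 / (np.exp(energy / temperature) + 1.0)


def relaxation_rate(temperature, p, symmetry, n_phi=4000):
    """Static Raman relaxation rate Gamma_0(T, p) of Eq. (7), in K."""
    # Midpoint rule on the periodic angular integrand.
    d_phi = 2.0 * np.pi / n_phi
    phi = (np.arange(n_phi) + 0.5) * d_phi
    gap = gap_magnitude(p) * np.cos(2.0 * phi) ** 2          # Eq. (1)
    weight = VERTICES[symmetry](phi) ** 2
    integral = np.sum(weight * fermi(gap / 2.0, temperature)) * d_phi
    sigma_im = temperature  # assumption: Sigma'' = T (linear in temperature)
    n_f = 1.0               # assumption: density of states set to unity
    return np.pi * sigma_im / (n_f * integral)


if __name__ == "__main__":
    for p in (0.10, 0.16, 0.20, 0.23):
        rates = {s: relaxation_rate(200.0, p, s) for s in VERTICES}
        print(f"p = {p:.2f}: B1g = {rates['B1g']:.0f} K, B2g = {rates['B2g']:.0f} K")
```

With these choices both symmetries give Γ0 = 2T for p ≥ pc, and for p < pc the B1g rate lies above the B2g rate because the B1g vertex weights the directions where the gap is largest. Absolute values depend on the assumed normalisation of Σ'' and N_F.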
6 Relaxation Rates in Different Cuprate Compounds
It is interesting to compare the results obtained in BSCCO with the observations in cuprates belonging to different material classes. In YBCO the overall behaviour is similar to what is observed in BSCCO [16]. The temperature dependence of the static relaxation rates and its evolution with doping is shown in Fig. 9 [16]. The anisotropy in the scattering rate is strongly reduced upon doping and disappears in the overdoped regime (Fig. 9 (c)). As in BSCCO, Γ0 is linearly increasing with doping in B2g
[Figure 9: YBa2Cu3O6+x, Γ0 (K) versus temperature (K) in B1g and B2g symmetries. Panels: (a) x = 0.50 (underdoped); (b) x = 0.93 (optimally doped); (c) x = 1.00 (overdoped).]
Fig. 9. Temperature dependence of the static relaxation rates in B1g and B2g symmetries for differently doped YBCO single crystals
symmetry at all doping levels, consistently with transport measurements. $\Gamma_0^{B_{1g}}(T)$, instead, displays an approximately linear temperature dependence only in the overdoped sample, while it becomes almost and completely temperature independent in the optimally (Fig. 9 (b)) and underdoped (Fig. 9 (a)) samples, respectively. Hence, the “cold” electrons probed by B2g symmetry display metallic behaviour in the entire doping range studied, while the “hot” ones, probed by B1g symmetry, are metallic only in the overdoped regime. Since p ≥ 0.19 cannot be accessed in this compound by oxygen doping alone, it is not possible to determine pc.
Fig. 10. Raman response $\chi''(\omega,T)$ of underdoped LSCO (Tc = 28 K, x = 0.10) at different temperatures in B1g (a) and B2g (b) symmetries
The electronic Raman response $\chi''(\omega,T)$ of an underdoped LSCO crystal (Tc = 28 K, x = 0.10) [25] is shown in Fig. 10 for B1g (a) and B2g (b) symmetries. The contribution from vibrational excitations has been subtracted out for clarity. In the range below 200 cm⁻¹ the usual temperature dependence of the slope of the spectra is observed in B2g symmetry (panel (b)): the increase of the spectral weight upon reducing the temperature leads to an increasing slope of the response in the dc limit. In B1g symmetry a similar metallic increase towards low temperature is observed between approximately 250 and 100 K. The spectra are flatter, indicating a larger relaxation rate or a shorter lifetime as well as a weaker temperature dependence than in B2g symmetry. Below 200 K an anomalously strong increase of the low-energy response is observed. Although the spectra are measured down to 15 cm⁻¹, the linear decrease toward zero energy, which is necessary to satisfy causality, cannot be resolved any more. Consequently, the lifetime must become very long and exceed the one observed in B2g symmetry by far. These observations cannot be straightforwardly reconciled with the metal-insulator-crossover scenario. It is possible that the expected transition from an essentially isotropic material to a strongly anisotropic one is still occurring when the hole concentration is reduced. Specifically, at high temperatures the magnitudes of the B1g relaxation rates $\Gamma_0^{B_{1g}}$ are much larger than $\Gamma_0^{B_{2g}}$, as in BSCCO and YBCO. However, the increase of $\Gamma_0^{B_{1g}}$ upon cooling, which is considered to be an indication of a gap, is completely absent. At T < 100 K, $\Gamma_0^{B_{1g}}$ becomes even smaller than $\Gamma_0^{B_{2g}}$ for the underdoped sample with x = 0.10 (Fig. 10 (a)). Additional information which could help to understand these observations is provided by the comparison with results from other experimental methods. There seems to be a clear indication for charge ordering in LSCO with a relatively large correlation length of the one-dimensional domains, which also influences the correlation properties [25].
Therefore, a superposition of two effects is probably observed: at high temperatures the metal-insulator transition dominates the dynamics in B1g symmetry as in the other compounds. For this reason $\Gamma_0^{B_{1g}}$ is still larger than $\Gamma_0^{B_{2g}}$ for almost the entire temperature range. At low temperatures, however, the effect of the gap is overcompensated by the influence of charge ordering which leads to an enhanced conductivity along the principal directions.
7 Conclusions
The Raman observations strongly suggest that a putative quantum critical point would lie at a doping of pc ≃ 0.22. Even if this value is higher than pc ≃ 0.19 derived from transport properties [26], the two phenomena are probably directly linked. The differences in the critical doping in transport and Raman can be understood in terms of different selection rules and are a direct effect of the unconventional anisotropic nature of the gap controlling the transition. In fact, while transport is most sensitive to quasiparticles located in the “cold”
spots, the possibility of differently weighting the regions on the Fermi surface allows Raman to resolve the MIT up to its very onset at pc. Since pc ≃ 0.22 is also inferred from the $T^0$ line [5], Raman scattering probably captures the first onset of non-Fermi liquid behaviour in this compound and can trace it back to a correlation-induced localization of carriers. On the other hand, the pseudogap at $T^*$ does not fit straightforwardly into this scenario. It either marks a pairing or charge-ordering instability or is connected to $T^0$ in a more complicated way through fluctuation effects [10]. The existence of a QPT seems to be a general feature of the cuprates although the critical doping depends on the material class as demonstrated for low Tc compounds [26]. Here we have shown that the QPT can be described phenomenologically in terms of a generalized MIT with a strongly anisotropic gap. In LSCO the influence of such a transition on the Raman spectra seems to be masked by a charge-ordering instability.
Acknowledgements
The author would like to thank the collaborators T.P. Devereaux, J.K. Freericks, M. Opel, I. Tüttő, and R. Hackl for the support. We greatly benefitted from the collaboration with P. Calvani, A. Lucarelli, M. Ortolani, as well as from the enlightening discussion with C. Di Castro. The author is indebted to the Gottlieb Daimler-Karl Benz Foundation for financial support. The work is part of the DFG project under grant number HA2071/2-1.
References
1. E. Dagotto, Rev. Mod. Phys. 66, 763 (1994).
2. T. Timusk and B. Statt, Rep. Prog. Phys. 62, 61 (1999).
3. T. Noda, H. Eisaki, S. Uchida, Science 286, 265 (1999).
4. J.L. Tallon and J.W. Loram, Physica C 349, 53 (2001).
5. M. Gutmann, E.S. Božin, S.J.L. Billinge, cond-mat/0009141 (2000).
6. J. Zaanen and O. Gunnarsson, Phys. Rev. B 40, 7391 (1989).
7. A. Sokol and D. Pines, Phys. Rev. Lett. 71, 2813 (1993).
8. V. Emery and S. Kivelson, Nature 374, 434 (1995).
9. S.C. Zhang, Science 275, 1089 (1997).
10. S. Andergassen, S. Caprara, C. Di Castro, and M. Grilli, Phys. Rev. Lett. 87, 56401 (2001).
11. C.M. Varma, Phys. Rev. B 61, R3804 (2000).
12. S. Chakravarty et al., Phys. Rev. B 63, 094503 (2001).
13. S. Sachdev, Science 288, 475 (2000).
14. B.S. Shastry and B.I. Shraiman 65, 1068 (1990).
15. F. Venturini, M. Opel, T.P. Devereaux, J.K. Freericks, I. Tüttő, B. Revaz, E. Walker, H. Berger, L. Forró, and R. Hackl, Phys. Rev. Lett. 89, 107003 (2002).
16. M. Opel, R. Nemetschek, C. Hoffmann, R. Philipp, P.F. Müller, R. Hackl, I. Tüttő, A. Erb, B. Revaz, E. Walker, H. Berger, and L. Forró, Phys. Rev. B 61, 9752 (2000).
17. F. Venturini, M. Opel, R. Hackl, H. Berger, L. Forró, and B. Revaz, J. Phys. Chem. Solids 63, 2345 (2002).
18. D.B. Tanner and T. Timusk, edited by D.M. Ginzberg, Properties of High-Temperature Superconductors III (World Scientific, Singapore, 1992).
19. N. Mott, Conduction in Non-Crystalline Materials (Clarendon Press, Oxford, 1987).
20. N. Furukawa et al., Phys. Rev. Lett. 81, 3195 (1998); J. González et al., Phys. Rev. Lett. 84, 4930 (2000).
21. J.K. Freericks and T.P. Devereaux, Phys. Rev. B 64, 125110 (2001); J.K. Freericks, T.P. Devereaux, and R. Bulla, Phys. Rev. B 64, 233114 (2001).
22. T.P. Devereaux and A.P. Kampf, Phys. Rev. B 59, 6411 (1999).
23. M.R. Norman, M. Randeria, H. Ding, and J.C. Campuzano, Phys. Rev. B 57, R11093 (1998); T. Valla, A.V. Fedorov, P.D. Johnson, Q. Li, G.D. Gu, and N. Koshizuka, Phys. Rev. Lett. 85, 828 (2000); A.A. Kordyuk, S.V. Borisenko, M.S. Golden, S. Legner, K.A. Nenkov, M. Knupfer, J. Fink, H. Berger, L. Forró, and R. Follath, Phys. Rev. B 66, 014502 (2002).
24. T.P. Devereaux, cond-mat/0302083 (2003).
25. F. Venturini, Q.-M. Zhang, R. Hackl, A. Lucarelli, M. Ortolani, P. Calvani, N. Kikugawa, and T. Fujita, Phys. Rev. B 66, 060502(R) (2002).
26. Y. Ando et al., Phys. Rev. Lett. 75, 4662 (1995); 77, 2065 (1996); 79, 2595 (1997); Phys. Rev. B 56, R8530 (1997); G.S. Boebinger et al., Phys. Rev. Lett. 77, 5417 (1996); P. Fournier et al., ibid. 81, 4720 (1998); S. Ono et al., ibid. 85, 638 (2000).
LDA+DMFT Investigations of Transition Metal Oxides and f-Electron Materials
K. Held¹, V. I. Anisimov², V. Eyert³, G. Keller⁴, A. K. McMahan⁵, I. A. Nekrasov², and D. Vollhardt⁴
1 Max-Planck Institute for Solid State Research, D-70569 Stuttgart, Germany
2 Institute of Metal Physics, Russian Academy of Sciences-Ural Division, Yekaterinburg GSP-170, Russia
3 Institute for Physics, Theoretical Physics II, University of Augsburg, D-86135 Augsburg, Germany
4 Theoretical Physics III, Center for Electronic Correlations and Magnetism, Institute for Physics, University of Augsburg, D-86135 Augsburg, Germany
5 Lawrence Livermore National Laboratory, University of California, Livermore, CA 94550, USA
Abstract. In the last few years LDA+DMFT, the merger of conventional band structure theory in the local density approximation (LDA) with the many-body dynamical mean-field theory (DMFT), has been proven to be a powerful tool for the realistic modeling of strongly correlated electron systems. This paper provides a brief introduction to this novel computational technique and presents the results for two prime examples of strongly correlated electron systems, i.e., the Mott-Hubbard transition in V2O3 and the volume collapse transition in Ce.
1 Introduction
In the last century, solid state theory was divided into two main communities, the density functional [1,2,3] (DFT) band structure community, mainly based on the local density approximation (LDA), and the many-body community. The approaches developed by the respective communities are rather complementary in their strengths and weaknesses, see Table 1. LDA allows for the calculation of physical properties of real materials, starting ab initio from the potential of the ionic lattice, the kinetic energy and the Coulomb interaction of the electrons (without free parameters). Moreover, LDA calculations turned out to be unexpectedly successful, even quantitatively and even for the electronic band structure which, strictly speaking, cannot be calculated within the DFT framework. This is surprising because LDA is a serious approximation to the Coulomb interaction between electrons. In particular, the correlation but also the exchange contribution of the Coulomb interaction is only treated rudimentarily, i.e., by means of a local density and by a functional obtained from the jellium model [4], a weakly correlated problem. However, there are important classes of materials where LDA fails, such
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 267–283, 2003. © Springer-Verlag Berlin Heidelberg 2003
Table 1. Two complementary approaches in solid state theory, pros and cons
DFT/LDA band structure theory
+ material specific (input lattice const.), no free parameters: ab initio
+ often very successful (quantitatively)
− effective one-particle approach, fails for strong electronic correlations (transition metal oxides, f-electrons ...)
many-body theory
+ systematic investigations of electronic correlations
+ often allows qualitative insight
− based on model Hamiltonians (parameters needed as input)
− CPU intensive
as transition metal oxides or heavy fermion systems, i.e., materials where electronic correlations are strong. For instance, LDA predicts La2CuO4 and V2O3 to be metals [5,6] whereas, in reality, they are insulators. The study of the electronic correlations induced by the Coulomb interaction is the principal task of the other community, which investigates the consequential many-body physics by perturbative and non-perturbative methods. Often many-body approaches provide insight into the relevant physical mechanism. But the electronic correlations make the theory complicated and numerical approaches CPU intensive, such that only simplified model Hamiltonians can be investigated. With the need for parameters as an input and simplified models, many-body calculations were not capable of quantitatively predicting material properties. One of the most successful many-body approaches developed in the last years is the dynamical mean-field theory [7,8,9,10,11,12,13,14,15] (DMFT). This theory is controlled in the parameter 1/Z (Z: number of neighboring lattice sites) and reliably treats the local electronic correlations, at least for three dimensional systems. Depending on the strength of the Coulomb interaction, it yields a weakly correlated metal, a strongly correlated metal with heavy quasiparticles, or a Mott insulator. At the same time, DMFT is general and powerful enough to be applied to complicated many-body Hamiltonians. Recently [16,17], physicists of the two communities have joined forces to combine the advantages of LDA and DMFT and developed the computational LDA+DMFT approach which is capable of calculating strongly correlated systems such as transition metal oxides and f-electron materials realistically. In Sect. 2, we introduce this LDA+DMFT approach. Calculations for V2O3 and the f-electron system Ce are presented in Sect. 3 and 4, respectively. A summary closes the presentation in Sect. 5.
A more detailed description of LDA+DMFT can be found in [18]; see also the conference proceedings [19].
2 The LDA+DMFT Approach
2.1 Local Density Approximation
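Within the Born-Oppenheimer approximation [20] and neglecting relativistic effects, electronic properties of solid state systems are described by the electronic Hamiltonian
$$\hat H = \sum_{\sigma} \int d^3r\; \hat\Psi^{+}(\mathbf r,\sigma)\left[-\frac{\hbar^2}{2m_e}\Delta + V_{\rm ion}(\mathbf r)\right]\hat\Psi(\mathbf r,\sigma) + \frac{1}{2}\sum_{\sigma\sigma'} \int d^3r\, d^3r'\; \hat\Psi^{+}(\mathbf r,\sigma)\,\hat\Psi^{+}(\mathbf r',\sigma')\; V_{\rm ee}(\mathbf r-\mathbf r')\; \hat\Psi(\mathbf r',\sigma')\,\hat\Psi(\mathbf r,\sigma). \qquad (1)$$
Here, $\hat\Psi^{+}(\mathbf r,\sigma)$ and $\hat\Psi(\mathbf r,\sigma)$ are field operators that create and annihilate an electron at position r with spin σ, ∆ is the Laplace operator, $m_e$ the electron mass, e the electron charge, and
$$V_{\rm ion}(\mathbf r) = -e^2 \sum_i \frac{Z_i}{|\mathbf r - \mathbf R_i|} \quad\text{and}\quad V_{\rm ee}(\mathbf r-\mathbf r') = \frac{e^2}{2}\sum_{\mathbf r\neq\mathbf r'} \frac{1}{|\mathbf r-\mathbf r'|} \qquad (2)$$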
denote the one-particle ionic potential of all ions i with charge $eZ_i$ at given positions $\mathbf R_i$, and the electron-electron interaction, respectively. While the ab initio Hamiltonian (1) is easy to write down, it is impossible to solve it exactly if more than a few electrons are involved. Thus, one has to make approximations. DFT/LDA turned out to be unexpectedly successful in this respect. In principle, DFT/LDA only allows one to calculate static properties like the ground state energy or its derivatives. However, in practice it turned out that the Kohn-Sham equations [2] also reliably describe the band structure [3], at least for weakly correlated materials with s and p orbitals. This corresponds to replacing the ab initio Hamiltonian (1) by
$$\hat H_{\rm LDA} = \sum_{\sigma} \int d^3r\; \hat\Psi^{+}(\mathbf r,\sigma)\left[-\frac{\hbar^2}{2m_e}\Delta + V_{\rm ion}(\mathbf r) + \int d^3r'\, \rho(\mathbf r')\,V_{\rm ee}(\mathbf r-\mathbf r') + \frac{\partial E_{\rm xc}^{\rm LDA}(\rho(\mathbf r))}{\partial\rho(\mathbf r)}\right]\hat\Psi(\mathbf r,\sigma). \qquad (3)$$
Here, $\rho(\mathbf r')$ is the electron density and $E_{\rm xc}^{\rm LDA}(\rho(\mathbf r))$ the exchange correlation potential within LDA, determined by the weakly correlated jellium problem [4]. Equation (3) describes independent electrons moving in the lattice potential and the density of the other electrons, which has to be determined self-consistently. For practical calculations one needs to expand the field operators w.r.t. a basis $\Phi_{ilm}$, e.g., a linearized muffin-tin orbital (LMTO) [21] basis (i denotes
lattice sites; l and m are orbital indices). In this basis, one has $\hat\Psi^{+}(\mathbf r,\sigma) = \sum_{ilm} \hat c^{\sigma\dagger}_{ilm}\,\Phi_{ilm}(\mathbf r)$, such that the Hamiltonian (3) reads
$$\hat H_{\rm LDA} = \sum_{ilm,\,jl'm',\,\sigma} \left(\delta_{ilm,jl'm'}\;\varepsilon_{ilm}\;\hat n^{\sigma}_{ilm} + t_{ilm,jl'm'}\;\hat c^{\sigma\dagger}_{ilm}\,\hat c^{\sigma}_{jl'm'}\right). \qquad (4)$$
Here, $t_{ilm,jl'm'} = \langle\Phi_{ilm}|\,{-\hbar^2\Delta/2m_e} + V_{\rm ion}(\mathbf r) + \int d^3r'\, V_{\rm ee}(\mathbf r-\mathbf r')\rho(\mathbf r') + \partial E_{\rm xc}^{\rm LDA}(\rho(\mathbf r))/\partial\rho(\mathbf r)\,|\Phi_{jl'm'}\rangle$ for $ilm \neq jl'm'$ and zero otherwise, $\varepsilon_{ilm}$ denotes the corresponding diagonal part, and $\hat n^{\sigma}_{ilm} = \hat c^{\sigma\dagger}_{ilm}\hat c^{\sigma}_{ilm}$. For d or f electrons the most important Coulomb interaction is the local Coulomb interaction on the same lattice site. These contributions are largest due to the extensive overlap (w.r.t. the Coulomb interaction) of these localized orbitals. Moreover, the largest non-local contribution is the nearest-neighbor density-density interaction which, to leading order in the number of nearest-neighbor sites, yields only the Hartree term [8,22], already included in the LDA. The large local Coulomb interactions lead to strong electronic correlations which are only very rudimentarily taken into account in the LDA. To improve on this, we supplement the LDA Hamiltonian (4) with the local Coulomb matrix approximated by the (most important) matrix elements $U^{\sigma\sigma'}_{mm'}$ (Coulomb repulsion and Z-component of Hund’s rule coupling) and $J_{mm'}$ (spin-flip terms of Hund’s rule coupling) between the localized electrons (for which we assume $i = i_d$ and $l = l_d$):
$$\hat H = \hat H_{\rm LDA} + \frac{1}{2}\,{\sum_{m\sigma,\,m'\sigma'}}^{\!\!\prime}\; U^{\sigma\sigma'}_{mm'}\; \hat n_{i l_d m\sigma}\,\hat n_{i l_d m'\sigma'} - \frac{1}{2}\,{\sum_{m\sigma,\,m'}}^{\!\!\prime}\; J_{mm'}\; \hat c^{\dagger}_{i l_d m\sigma}\,\hat c^{\dagger}_{i l_d m'\bar\sigma}\,\hat c_{i l_d m'\sigma}\,\hat c_{i l_d m\bar\sigma} - \sum_{m\sigma} \Delta_d\; \hat n_{i l_d m\sigma}. \qquad (5)$$
Here, the prime on the sum indicates that at least two of the indices of an operator have to be different, and $\bar\sigma = \downarrow(\uparrow)$ for $\sigma = \uparrow(\downarrow)$. In typical applications we have $U^{\uparrow\downarrow}_{mm} \equiv U$, $J_{mm'} \equiv J$, $U^{\sigma\sigma'}_{mm'} = U - 2J - J\delta_{\sigma\sigma'}$ for $m \neq m'$. With M interacting orbitals, the average Coulomb interaction is then $\bar U = [U + (M-1)(U-2J) + (M-1)(U-3J)]/(2M-1)$. The last term of the Hamiltonian (5) reflects a shift of the one-particle potential of the interacting orbitals and is necessary if the Coulomb interaction is taken into account. This shift, the local Coulomb repulsion U, and the Hund’s rule exchange can be determined by constrained LDA calculations [23].
2.2 Dynamical Mean-Field Theory
The many-body extension of LDA, Equation (5), was proposed by Anisimov et al.[24] in the context of their LDA+U approach. Within LDA+U the Coulomb interactions of (5) are treated within the Hartree-Fock approximation. Hence, LDA+U does not contain true many-body physics. While this approach is successful in describing long-range ordered, insulating states
of correlated electron systems it fails to describe strongly correlated paramagnetic states. To go beyond LDA+U and capture the many-body nature of the electron-electron interaction, various approximation schemes have been proposed and applied [16,17,25,26,27,28]. One of the most promising approaches, first implemented by Anisimov et al. [16], is to solve (5) within DMFT [7,8,9,10,11,12,13,14,15] (“LDA+DMFT”). Of all extensions of LDA only the LDA+DMFT approach is presently able to describe the physics of strongly correlated, paramagnetic metals with well-developed upper and lower Hubbard bands and a narrow quasiparticle peak at the Fermi level. This characteristic three-peak structure is a signature of the importance of many-body effects [11,12]. During the last ten years, DMFT has proved to be a successful approach for investigating strongly correlated systems with local Coulomb interactions [15]. It becomes exact in the limit of high lattice coordination number Z; it is controlled in 1/Z [7,8] and preserves the dynamics of local interactions. Hence, it represents a dynamical mean-field approximation. In this non-perturbative approach the lattice problem is mapped onto an effective single-site problem which has to be determined self-consistently together with the k-integrated Dyson equation connecting the self energy Σ and the Green function G at frequency ω
$$G_{qlm,q'l'm'}(\omega) = \frac{1}{V_B}\int d^3k\, \left(\left[\omega\mathbb{1} + \mu\mathbb{1} - H^0_{\rm LDA}(\mathbf k) - \Sigma(\omega)\right]^{-1}\right)_{qlm,q'l'm'} . \qquad (6)$$
Here, $\mathbb{1}$ is the unit matrix, µ the chemical potential, the matrix $H^0_{\rm LDA}(\mathbf k)$ is defined as $H_{\rm LDA} - \sum_{i=i_d,l=l_d}\sum_{m\sigma} \Delta_d\, \hat n_{ilm\sigma}$ with $H_{\rm LDA}$ being the matrix elements of (4), Σ(ω) denotes the self energy matrix which is non-zero only between the interacting orbitals, $[...]^{-1}$ implies the inversion of the matrix with elements n (= qlm), n' (= q'l'm'), and the integration extends over the Brillouin zone with volume $V_B$. The DMFT single-site problem depends on $\mathcal G(\omega)^{-1} = G(\omega)^{-1} + \Sigma(\omega)$ and is equivalent [11,12] to an Anderson impurity model if its hybridization ∆(ω) satisfies $\mathcal G^{-1}(\omega) = \omega - \int d\omega'\, \Delta(\omega')/(\omega - \omega')$. The local one-particle Green function at a Matsubara frequency $i\omega_\nu = i(2\nu+1)\pi/\beta$ (β: inverse temperature), orbital index m ($l = l_d$, $q = q_d$), and spin σ is given by the following functional integral over Grassmann variables ψ and ψ*
$$G^{\sigma}_{\nu m} = -\frac{1}{Z}\int \mathcal D[\psi]\,\mathcal D[\psi^*]\; \psi^{\sigma}_{\nu m}\,\psi^{\sigma *}_{\nu m}\; e^{A[\psi,\psi^*,\mathcal G^{-1}]} . \qquad (7)$$
Here, $Z = \int \mathcal D[\psi]\,\mathcal D[\psi^*]\, \exp(A[\psi,\psi^*,\mathcal G^{-1}])$ is the partition function and the single-site action A has the form (the interaction part of A is in terms of the “imaginary time” τ, i.e., the Fourier transform of $\omega_\nu$)
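$$A[\psi,\psi^*,\mathcal G^{-1}] = \sum_{\nu,\sigma,m} \psi^{\sigma *}_{\nu m}\,(\mathcal G^{\sigma}_{\nu m})^{-1}\,\psi^{\sigma}_{\nu m} - \frac{1}{2}\,{\sum_{m\sigma,\,m'\sigma'}}^{\!\!\prime}\; U^{\sigma\sigma'}_{mm'} \int_0^{\beta} d\tau\; \psi^{\sigma *}_{m}(\tau)\,\psi^{\sigma}_{m}(\tau)\,\psi^{\sigma' *}_{m'}(\tau)\,\psi^{\sigma'}_{m'}(\tau) + \frac{1}{2}\,{\sum_{m\sigma,\,m'}}^{\!\!\prime}\; J_{mm'} \int_0^{\beta} d\tau\; \psi^{\sigma *}_{m}(\tau)\,\psi^{\bar\sigma}_{m}(\tau)\,\psi^{\bar\sigma *}_{m'}(\tau)\,\psi^{\sigma}_{m'}(\tau) . \qquad (8)$$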
This single-site problem (7) has to be solved self-consistently together with the k-integrated Dyson equation (6) to obtain the DMFT solution of a given problem. Due to the equivalence of the DMFT single-site problem and the Anderson impurity problem, a variety of approximate techniques have been employed to solve the DMFT equations, such as the iterated perturbation theory (IPT) [11,15] and the non-crossing approximation (NCA) [29,30,31], as well as numerical techniques like quantum Monte Carlo simulations (QMC) [32], exact diagonalization (ED) [33,15], or numerical renormalization group (NRG) [34]. In principle, QMC and ED are exact methods, but they require an extrapolation, i.e., the discretization of the imaginary time ∆τ → 0 (QMC) or the number of lattice sites of the respective impurity model Ns → ∞ (ED), respectively. In the context of LDA+DMFT we refer to the computational schemes to solve the DMFT equations discussed above as LDA+DMFT(X), where X = IPT [16], NCA [28], and QMC [35] have been investigated in the case of Sr-doped LaTiO3 and quantitatively compared [35]. The same strategy was formulated by Lichtenstein and Katsnelson [17] as one of their LDA++ approaches. They also applied LDA+DMFT(IPT) [36], and were the first to use LDA+DMFT(QMC) [37], to investigate the spectral properties of iron. Recently, among others V2O3 [38,39], Ca(Sr)VO3 [40], LiV2O4 [41], Ca2−xSrxRuO4 [42,43], CrO2 [44], Ni [45], Fe [45], Mn [46], Pu [47], and Ce [48,49,50] have been studied by LDA+DMFT. Realistic investigations of itinerant ferromagnets (e.g., Ni) have also become possible by combining density functional theory with multi-band Gutzwiller wave functions [51].
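The lattice side of the self-consistency cycle, Eq. (6) together with the relation $\mathcal G^{-1} = G^{-1} + \Sigma$, can be illustrated with a short Python sketch. It is a strongly simplified one-band version, not the multi-orbital scheme of the text: the k-integral of Eq. (6) is replaced by an energy integral over a density of states, a semicircular DOS is assumed as a stand-in for an LDA DOS, and the impurity solver (IPT, NCA, QMC, ...) that would update Σ is not included. All function names are hypothetical.

```python
import numpy as np


def matsubara_frequencies(beta, n_freq):
    """Fermionic Matsubara frequencies omega_nu = (2 nu + 1) pi / beta."""
    nu = np.arange(n_freq)
    return (2 * nu + 1) * np.pi / beta


def semicircular_dos(eps, half_width):
    """Model DOS (assumption): semicircle of half-width D, normalised to 1."""
    rho = np.zeros_like(eps)
    inside = np.abs(eps) < half_width
    rho[inside] = (2.0 / (np.pi * half_width**2)
                   * np.sqrt(half_width**2 - eps[inside] ** 2))
    return rho


def local_green_function(omega, sigma, mu, eps, rho):
    """One-band analogue of Eq. (6):
    G(i omega) = integral d eps  rho(eps) / (i omega + mu - eps - Sigma)."""
    z = 1j * omega + mu - sigma
    d_eps = eps[1] - eps[0]
    return np.sum(rho[None, :] / (z[:, None] - eps[None, :]), axis=1) * d_eps


def weiss_field_inverse(g_loc, sigma):
    """Inverse bath Green function: calG^{-1} = G^{-1} + Sigma."""
    return 1.0 / g_loc + sigma


if __name__ == "__main__":
    beta = 10.0                      # T = 0.1 eV, as used for V2O3 in the text
    omega = matsubara_frequencies(beta, 8)
    eps = np.linspace(-2.0, 2.0, 20001)
    rho = semicircular_dos(eps, 2.0)
    sigma = np.zeros(omega.size, dtype=complex)  # starting guess Sigma = 0
    g_loc = local_green_function(omega, sigma, 0.0, eps, rho)
    g0_inv = weiss_field_inverse(g_loc, sigma)
    # An impurity solver would now compute a new Sigma from g0_inv,
    # and the two steps would be iterated until convergence.
    print(g_loc[0], g0_inv[0])
```

In a full calculation these two functions would form one half of each iteration, with the impurity problem (7) providing the other half.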
3 Mott-Hubbard Metal-Insulator Transition in V2O3
One of the most famous examples of a cooperative electronic phenomenon occurring at intermediate coupling strengths is the transition between a paramagnetic metal and a paramagnetic insulator induced by the Coulomb interaction between the electrons – the Mott-Hubbard metal-insulator transition. [52] Correlation-induced metal-insulator transitions (MIT) are found, for example, in transition metal oxides with partially filled bands near the
Fermi level. For such systems band structure theory typically predicts metallic behavior. The most famous example is V2O3 doped with Cr. At low temperatures, V2O3 is an antiferromagnetic insulator with monoclinic crystal symmetry and, at high temperature, it is a paramagnet with a corundum structure. In this paramagnetic phase, an isostructural first-order transition from a metal to an insulator occurs upon Cr-doping, accompanied by a 12% increase in volume. From a model point of view the MIT is triggered by a change of the ratio of the Coulomb interaction U relative to the bandwidth W. Originally, Mott considered the extreme limits W = 0 (when atoms are isolated and insulating) and U = 0 where the system is metallic. While it is simple to describe these limits, the crossover between them, i.e., the metal-insulator transition itself, poses a very complicated electronic correlation problem. Among others, this metal-insulator transition has been addressed by Hubbard in various approximations [53] and by Brinkman and Rice within the Gutzwiller approximation [54]. During the last few years, our understanding of the MIT in the one-band Hubbard model has considerably improved, in particular due to the application of the dynamical mean-field theory [55]. Within LDA, both the paramagnetic metal V2O3 and the paramagnetic insulator (V0.962Cr0.038)2O3 are found to be metallic (see Fig. 1), if one takes into account the slightly different lattice parameters [56]. The LDA DOS shows a splitting of the five vanadium d orbitals into three t2g states near the Fermi energy and two $e_g^{\sigma}$ states at higher energies. This reflects the (approximate) octahedral arrangement of oxygen around the vanadium atoms. Due to the trigonal symmetry of the corundum structure the t2g states are further split into one a1g band and two degenerate $e_g^{\pi}$ bands, see Fig. 1.
The only visible difference between (V0.962Cr0.038)2O3 and V2O3 is a slight narrowing of the t2g and e_g^σ bands by ≈ 0.2 and 0.1 eV, respectively, as well as a weak downshift of the centers of gravity of both groups of bands for V2O3. In particular, the insulating gap of the Cr-doped system is seen to be missing in the LDA DOS. Here we will employ LDA+DMFT(QMC) to show explicitly that the insulating gap is caused by electronic correlations. We restrict ourselves to the three t2g bands at the Fermi energy and make use of a simplification for cubic transition metal oxides which allows for the use of the LDA DOS instead of the full LDA Hamiltonian as an input (see [18]; note that this is an approximation for V2O3 since cubic symmetry is lifted). While the Hund’s rule coupling J is insensitive to screening effects and may, thus, be obtained within LDA to a good accuracy (J = 0.93 eV [57]), the LDA-calculated value of the Coulomb repulsion U has a typical uncertainty of at least 0.5 eV [35]. To overcome this uncertainty, we study the spectra obtained by LDA+DMFT(QMC) for three different values of the Hubbard interaction (U = 4.5, 5.0, 5.5 eV) in Fig. 2. From the results obtained we conclude that the critical value of U for the MIT is at about 5 eV: At U = 4.5 eV one observes pronounced quasiparticle peaks at the Fermi energy,
Fig. 1. Left: Scheme of 3d levels in the corundum crystal structure. Right: Partial LDA DOS of the 3d bands for paramagnetic metallic V2O3 and insulating (V0.962Cr0.038)2O3 [reproduced from [38]]
Fig. 2. LDA+DMFT(QMC) spectra for paramagnetic (V0.962Cr0.038)2O3 (“ins.”) and V2O3 (“met.”) at U = 4.5, 5 and 5.5 eV, and T = 0.1 eV = 1160 K [reproduced from [38]]
i.e., characteristic metallic behavior, even for the crystal structure of the insulator (V0.962Cr0.038)2O3, while at U = 5.5 eV the form of the calculated spectral function is typical for an insulator for both sets of crystal structure parameters. At U = 5.0 eV one is then at, or very close to, the MIT since there is a pronounced dip in the DOS at the Fermi energy for both a1g and e_g^π orbitals for the crystal structure of (V0.962Cr0.038)2O3, while for pure V2O3 one still finds quasiparticle peaks. We note that at T ≈ 0.1 eV one only observes metallic-like and insulator-like behavior, with a rapid but smooth crossover between these two phases, since a sharp MIT occurs only at lower
temperatures [55]. The critical value of the Coulomb interaction U ≈ 5 eV is in reasonable agreement with the values determined spectroscopically by fitting to model calculations, and by constrained LDA, see [38] for details. To compare with the V2O3 photoemission spectra by Schramme et al. [58] and Mo et al. [59], as well as with the X-ray absorption data by Müller et al. [60], the LDA+DMFT(QMC) spectrum at T = 300 K is multiplied with the Fermi function and Gauss-broadened by 0.09 eV to account for the experimental resolution. The theoretical result for U = 5 eV is seen to be in good agreement with experiment (Fig. 3). In contrast to the LDA results, our results describe not only the different bandwidths above and below the Fermi energy (≈ 6 eV and ≈ 2-3 eV, respectively), but also the position of two (hardly distinguishable) peaks below the Fermi energy (at about -1 eV and -0.3 eV) as well as the pronounced two-peak structure above the Fermi energy (at about 1 eV and 3-4 eV). In our calculation the e_g^σ states have not been included so far. Taking into account the Coulomb interaction Ū = U − 2J ≈ 3 eV and also the difference between the e_g^σ band and the t2g band centers of gravity of roughly 2.5 eV, the e_g^σ band can be expected to be located roughly 5.5 eV above the lower Hubbard band (-1.5 eV), i.e., at about 4 eV. From this estimate one would conclude the upper X-ray absorption maximum around 4 eV in Fig. 3 to be of mixed e_g^σ and e_g^π nature. While LDA also gives two peaks below and above the Fermi energy, their position and physical origin are quite different. Within LDA+DMFT(QMC) the peaks at -1 eV and 3-4 eV are the incoherent Hubbard bands induced by the electronic correlations whereas in the LDA the peak at 2-3 eV is caused entirely by the e_g^σ (one-particle) states, and that at -1 eV is the band edge maximum of the a1g and e_g^π states (see Fig. 1). Obviously, the LDA+DMFT
Fig. 3. Comparison of the LDA+DMFT(QMC) spectrum [38] at U = 5 eV and T = 300 K below (left figure) and above (right figure) the Fermi energy (at 0 eV) with the LDA spectrum [38] and the experimental spectrum (left: photoemission spectrum of Schramme et al. [58] at T = 300 K and Mo et al. at T = 175 K [59]; right: X-ray absorption spectrum of Müller et al. at T = 300 K [60]). Note that Mo et al. [59] use a higher photon energy (hν = 500 eV) than Schramme et al. [58] (hν = 60 eV) which considerably reduces the surface contribution to the spectrum
results are a big improvement over LDA which, as one should keep in mind, was the best method available to calculate the V2O3 spectrum before. Particularly interesting are the spin and the orbital degrees of freedom in V2O3. From our calculations [38], we conclude that the spin state of V2O3 is S = 1 throughout the Mott-Hubbard transition region. This agrees with the measurements of Park et al. [61] and also with the data for the high-temperature susceptibility [62]. However, it is at odds with the S = 1/2 model by Castellani et al. [63] and with the results for a one-band Hubbard model [64] which corresponds to S = 1/2 in the insulating phase and, contrary to our results, shows a substantial change of the local magnetic moment at the MIT [55]. For the orbital degrees of freedom we find a predominant occupation of the e_g^π orbitals, but with a significant admixture of a1g orbitals. This admixture decreases at the MIT: in the metallic phase at T = 0.1 eV we determine the occupation of the (a1g, e_g1^π, e_g2^π) orbitals as (0.37, 0.815, 0.815), and in the insulating phase as (0.28, 0.86, 0.86). This should be compared with the experimental results of Park et al. [61] who, from their analysis, extracted the ratio of the configurations e_g^π e_g^π : e_g^π a1g to be 1:1 in the paramagnetic metallic and 3:2 in the paramagnetic insulating phase. This corresponds to a one-electron occupation of (0.5, 0.75, 0.75) and (0.4, 0.8, 0.8), respectively. Although our results show a somewhat smaller value for the admixture of a1g orbitals, the overall behavior, including the tendency of a decrease of the a1g admixture across the transition to the insulating state, is well reproduced. In this context we would also like to note the work by Laad et al. [39] who started from our LDA DOS for V2O3 and found, within DMFT(IPT), that it is possible to trigger a Mott-Hubbard metal-insulator transition by shifting the e_g^π band with respect to the a1g band.
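The conversion from the configuration ratios of Park et al. [61] to the one-electron occupations quoted above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it assumes two t2g electrons per vanadium site, that the e_g^π a1g configuration contributes exactly one a1g electron, and that the e_g^π weight is shared equally by the two degenerate orbitals. The function name is ours and does not appear in the original work.

```python
from fractions import Fraction


def orbital_occupations(weight_epi_epi, weight_epi_a1g):
    """One-electron occupations (a1g, e_g^pi_1, e_g^pi_2) for a two-electron
    state mixing the configurations e_g^pi e_g^pi and e_g^pi a1g with the
    given relative weights (illustrative helper, not from the paper)."""
    total = Fraction(weight_epi_epi + weight_epi_a1g)
    p_pp = Fraction(weight_epi_epi) / total  # probability of e_g^pi e_g^pi
    p_pa = Fraction(weight_epi_a1g) / total  # probability of e_g^pi a1g
    n_a1g = p_pa                   # a1g is occupied only in e_g^pi a1g
    n_epi_total = 2 * p_pp + p_pa  # two e_g^pi electrons, or just one
    n_epi = n_epi_total / 2        # split equally over the degenerate pair
    return (n_a1g, n_epi, n_epi)


metal = orbital_occupations(1, 1)      # ratio 1:1, paramagnetic metal
insulator = orbital_occupations(3, 2)  # ratio 3:2, paramagnetic insulator
print([float(n) for n in metal])      # [0.5, 0.75, 0.75]
print([float(n) for n in insulator])  # [0.4, 0.8, 0.8]
```

Under these assumptions the occupations in each phase add up to two electrons, as do the calculated LDA+DMFT(QMC) values quoted in the text.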
In the study above, the crystal parameters of V2O3 and (V0.962Cr0.038)2O3 have been taken from experiment. This leaves the question unanswered whether a change of the lattice is the driving force behind the Mott transition, or whether it is the electronic Mott transition which causes a change of the lattice. For another system, Ce, we will show in Section 4 that the energetic changes near a Mott transition are indeed sufficient to cause a first-order volume change.
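The post-processing used for the comparison with photoemission in this section (multiplication with the Fermi function, followed by Gaussian broadening) can be sketched as follows. This is a minimal illustration and not the code used for Fig. 3: the input is a hypothetical Lorentzian peak rather than the LDA+DMFT(QMC) spectrum, and the quoted 0.09 eV is interpreted here as the standard deviation of the Gaussian, which is an assumption since the text does not say whether it is a standard deviation or a full width.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi(energy, temperature):
    """Fermi function for energies in eV measured from the chemical potential."""
    x = energy / (K_B * temperature)
    if x > 700.0:  # exp would overflow; the occupation is zero anyway
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


def gaussian_broaden(energies, spectrum, sigma):
    """Convolve a spectrum on a uniform energy grid with a normalized Gaussian."""
    step = energies[1] - energies[0]
    norm = step / (sigma * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(a * math.exp(-0.5 * ((e - e2) / sigma) ** 2)
                   for e2, a in zip(energies, spectrum))
        for e in energies
    ]


def toy_spectrum(energy):
    """Hypothetical Lorentzian peak at -0.3 eV; NOT the calculated spectrum."""
    width = 0.1
    return (width / math.pi) / ((energy + 0.3) ** 2 + width ** 2)


energies = [-3.0 + 0.01 * i for i in range(401)]  # -3 eV ... +1 eV
occupied = [toy_spectrum(e) * fermi(e, 300.0) for e in energies]
broadened = gaussian_broaden(energies, occupied, sigma=0.09)
```

The Gaussian is normalized, so the broadening redistributes spectral weight without changing the total, apart from small losses at the edges of the energy window.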
4
Volume Collapse in Ce
Cerium exhibits a transition from the γ- to the α-phase with increasing pressure or decreasing temperature. This transition is accompanied by an unusually large volume change of 15% [65], much larger than the 1-2% volume change in V2O3. The γ-phase may also be prepared in metastable form at room temperature in which case the γ-α transition occurs under pressure at this temperature [66]. Similar volume collapse transitions are observed under pressure in Pr and Gd (for a recent review, see [67]). It is widely believed that these transitions arise from changes in the degree of 4f electron correlations,
as is reflected in both the Mott transition [68] and the Kondo volume collapse (KVC) [69] models. These two scenarios were considered to be contradictory, but might be more similar [70,71] than previously thought. For a realistic calculation of the cerium α-γ transition, we employ the full Hamiltonian calculation where the one-particle Hamiltonian was calculated by LDA and the 4f Coulomb interaction U along with the associated 4f site energy shift [∆d in Equation (5)] by a constrained LDA calculation (for details, see [67,49,50]). We have not included the spin-orbit interaction which has a rather small impact on LDA results for Ce, nor the intra-atomic exchange interaction which is less relevant for Ce as occupations with more than one 4f-electron on the same site are rare [J = 0 in Equation (5)]. Furthermore, the 6s, 6p, and 5d orbitals are assumed to be non-interacting in the formalism of Equation (5). Note that the 4f orbitals are even better localized than the 3d orbitals and, thus, uncertainties in U and the 4f site energy are relatively small and would only translate into a possible volume shift for the α-γ transition. We would also like to note earlier calculations by Zölfl et al. [48] who studied Ce by LDA+DMFT(NCA) and by Savrasov et al. [47] who used an IPT-inspired DMFT solver for Pu. The LDA+DMFT(QMC) spectral evolution of the Ce 4f-electrons is presented in Fig. 4. It shows similarities to V2O3 (Fig. 2): At a volume per atom V = 20 Å³, Fig. 4 shows that almost the entire spectral weight lies in a large
Fig. 4. Left: 4f spectral function A(ω) at different volumes and T = 632 K (ω = 0 corresponds to the chemical potential; curves are offset as indicated; Δτ = 0.11 eV⁻¹); Right: Total LDA+DMFT spdf-spectrum (solid line) in comparison with the combined photoemission and BIS spectrum [72] (circles) for α-Ce (upper part) and γ-Ce (lower part) at T = 580 K [reproduced from [50]]
quasiparticle peak with a center of gravity slightly above the chemical potential. This is similar to the LDA solution; however, a weak upper Hubbard band is also present even at this small volume. At the volumes 29 Å³ and 34 Å³, which approximately bracket the α-γ transition, the spectrum has a three-peak structure. Finally, by V = 46 Å³, the central peak has disappeared leaving only the lower and upper Hubbard bands. In the right part of Fig. 4 we show the total LDA+DMFT spdf-spectrum (broadened with the experimental resolution 0.4 eV) and compare with experiment [72]. The calculated f-spectrum shows a sharp quasiparticle or Kondo resonance slightly above the Fermi energy, which is the result of the formation of a singlet state between f- and conduction states. We thus suggest that the spectral weight seen in the experiment is a result of this quasiparticle resonance. In the lower part of Fig. 4, a comparison between experiment and our calculation for γ-Ce is shown. The most striking difference between the lower and the upper part of Fig. 4 is the absence of the Kondo resonance in the γ-phase which is in agreement with our calculations. Nonetheless, γ-Ce remains metallic with spectral weight arising from the spd-electrons at the Fermi energy, quite contrary to V2O3. Altogether, one can say that the agreement with the experimental spectrum is very good, and comparable to the LDA accuracy for much simpler systems. Fig. 5a shows our calculated DMFT(QMC) energies E_DMFT [49,50] as a function of atomic volume at three temperatures relative to the paramagnetic Hartree-Fock (HF) energies E_PMHF [of the Hamiltonian (5)], i.e., the energy contribution due to electronic correlations. We also present the polarized HF energies which basically represent a (non-self-consistent) LDA+U calculation and reproduce E_DMFT at large volumes and low temperatures. With decreasing volume, however, the DMFT energies bend away from the polarized HF solutions.
Thus, at T = 0.054 eV ≈ 600 K, a region of negative curvature in E_DMFT − E_PMHF is evident within the observed two-phase region (arrows). Fig. 5b presents the calculated LDA+DMFT total energy E_tot(T) = E_LDA(T) + E_DMFT(T) − E_mLDA(T) where E_mLDA is the energy of an LDA-like solution of the Hamiltonian (5) [73]. Since both E_LDA and E_PMHF − E_mLDA have positive curvature throughout the volume range considered, it is the negative curvature of the correlation energy in Fig. 5a which leads to the dramatic depression of the LDA+DMFT total energies in the range V = 26-28 Å³ for decreasing temperature, which contrasts with the smaller changes near V = 34 Å³ in Fig. 5b. This trend is consistent with a double-well structure emerging at still lower temperatures (prohibitively expensive for QMC simulations), and with it a first-order volume collapse. This is in reasonable agreement with the experimental volume collapse and strongly suggests that the electronic correlations leading to the emergence of a Kondo-like energy scale are eventually responsible for the 15% volume collapse in Ce.
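The curvature argument above can be illustrated numerically: on a uniform volume grid, the sign of the central second difference of E(V) marks where an energy curve is concave. The sketch below is only a schematic illustration. The double-well function, its minima at 27 Å³ and 35 Å³, and its prefactor are hypothetical choices made for this example and are not the QMC energies of Fig. 5.

```python
def negative_curvature_volumes(volumes, energies):
    """Return the interior grid volumes at which the central second
    difference of E(V) is negative (a uniform grid is assumed)."""
    h = volumes[1] - volumes[0]
    flagged = []
    for i in range(1, len(volumes) - 1):
        curvature = (energies[i + 1] - 2.0 * energies[i] + energies[i - 1]) / h ** 2
        if curvature < 0.0:
            flagged.append(volumes[i])
    return flagged


def toy_energy(volume):
    """Hypothetical double well with minima at 27 and 35 A^3 (illustration only)."""
    return 1.0e-3 * ((volume - 27.0) * (volume - 35.0)) ** 2


volumes = [float(v) for v in range(20, 47)]  # 20 ... 46 A^3 in steps of 1 A^3
energies = [toy_energy(v) for v in volumes]
print(negative_curvature_volumes(volumes, energies))
# [29.0, 30.0, 31.0, 32.0, 33.0]
```

In this toy curve the concave region lies between the two minima, which is the situation in which a common-tangent construction would give a first-order volume change.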
Fig. 5. (a) Correlation energy E_DMFT − E_PMHF as a function of atomic volume (symbols) and polarized HF energy E_AFHF − E_PMHF (dotted lines); arrows: observed volume collapse from the α- to the γ-phase. (b) The negative curvature of the correlation energy leads to a growing depression of the total energy near V = 26-28 Å³ as temperature is decreased, consistent with an emerging double well at still lower temperatures and thus the α-γ transition. The curves at T = 0.544 eV were shifted downwards in (b) by −0.5 eV to match the energy range [reproduced from [49]]
5
Summary
In this paper we discussed the set-up of the computational LDA+DMFT scheme which merges two non-perturbative, complementary investigation techniques of solid state theory. LDA+DMFT allows one to perform ab initio calculations of real materials with strongly correlated electrons and is, at present, the only available ab initio computational technique which is able to treat systems close to a Mott-Hubbard MIT, heavy fermions, and f-electron materials. As two particular examples we presented results for the transition metal oxide V2O3 and the f-electron system Ce. Our LDA+DMFT(QMC) calculations show an MIT in V2O3 upon Cr-doping at a reasonable value of the Coulomb interaction U ≈ 5 eV and are in good agreement with the experimentally determined photoemission and X-ray absorption spectra for V2O3, i.e., above and below the Fermi energy. In particular, we find a spin state S = 1 in the paramagnetic phase, and an orbital admixture of e_g^π e_g^π and e_g^π a1g configurations, which both agree with recent experiments. Thus,
LDA+DMFT(QMC) provides a remarkably accurate microscopic theory of the strongly correlated electrons in the paramagnetic metallic phase of V2O3. Paramagnetic Ce undergoes an even more dramatic, isostructural volume collapse than V2O3. Our LDA+DMFT(QMC) spectra show a dramatic reduction in the size of the f-electron quasiparticle peak at the Fermi level when passing from the (experimental) α- to the γ-phase volume. In contrast to V2O3, Ce remains metallic due to the spd electrons. Nonetheless, the total spectrum changes considerably and is in good agreement with experiment. An important aspect of our results is that the rapid reduction in the size of the f-electron quasiparticle peak seems to coincide with the appearance of a negative curvature in the correlation energy and a shallow minimum in the total energy. This suggests that the electronic correlations responsible for the reduction of the quasiparticle peak are associated with energetic changes strong enough to cause a volume collapse in the sense of the Kondo volume collapse model [69], or a Mott transition model [68] including electronic correlations.
Acknowledgements
We are grateful to J. W. Allen, O. K. Andersen, N. Blümer, R. Bulla, S. Horn, W. Metzner, Th. Pruschke, R. T. Scalettar, and M. Schramme for helpful discussions. We thank A. Sandvik for making available his maximum entropy code. The QMC code of [15] App. D was modified for use for some of the results of Section 4. This work was supported in part by the Deutsche Forschungsgemeinschaft through the Emmy-Noether program (KH) and Sonderforschungsbereich 484 (DV, GK, VE), the Russian Foundation for Basic Research by RFFI-01-02-17063 (VA, IN) and RFFI-02-02-06162 (IN), the Ural Branch of the Russian Academy of Sciences for Young Scientists (IN), the U.S. Department of Energy by the University of California LLNL under contract No. W-7405-Eng-48 (AM), the Leibniz-Rechenzentrum, München, and the John v. Neumann-Institut for Computing, Jülich.
References
1. P. Hohenberg and W. Kohn, Phys. Rev. B 136, 864 (1964).
2. W. Kohn and L. J. Sham, Phys. Rev. 140, 4A, A1133 (1965); W. Kohn and L. J. Sham, Phys. Rev. A - Gen. Phys. 140, 1133 (1965); L. J. Sham and W. Kohn, Phys. Rev. 145 N 2, 561 (1966).
3. R. O. Jones and O. Gunnarsson, Rev. Mod. Phys. 61, 689 (1989).
4. L. Hedin and B. Lundqvist, J. Phys. C: Solid State Phys. 4, 2064 (1971); U. von Barth and L. Hedin, J. Phys. C: Solid State Phys. 5, 1629 (1972); D. M. Ceperley and B. J. Alder, Phys. Rev. Lett. 45, 566 (1980).
5. T. C. Leung, X. W. Wang, and B. N. Harmon, Phys. Rev. B 37, 384 (1988); J. Zaanen, O. Jepsen, O. Gunnarsson, A. T. Paxton, and O. K. Andersen, Physica C 153, 1636 (1988); W. E. Pickett, Rev. Mod. Phys. 61, 433 (1989).
6. L. F. Mattheiss, J. Phys.: Cond. Matt. 6, 6477 (1994).
7. W. Metzner and D. Vollhardt, Phys. Rev. Lett. 62, 324 (1989).
8. E. Müller-Hartmann, Z. Phys. B 74, 507 (1989); ibid. B 76, 211 (1989).
9. U. Brandt and C. Mielsch, Z. Phys. B 75, 365 (1989); ibid. B 79, 295 (1989); ibid. B 82, 37 (1991).
10. V. Janiš, Z. Phys. B 83, 227 (1991); V. Janiš and D. Vollhardt, Int. J. Mod. Phys. 6, 731 (1992).
11. A. Georges and G. Kotliar, Phys. Rev. B 45, 6479 (1992).
12. M. Jarrell, Phys. Rev. Lett. 69, 168 (1992).
13. D. Vollhardt, in Correlated Electron Systems, edited by V. J. Emery, World Scientific, Singapore, 1993, p. 57.
14. Th. Pruschke, M. Jarrell, and J. K. Freericks, Adv. in Phys. 44, 187 (1995).
15. A. Georges et al., Rev. Mod. Phys. 68, 13 (1996).
16. V. I. Anisimov et al., J. Phys. Cond. Matter 9, 7359 (1997).
17. A. I. Lichtenstein and M. I. Katsnelson, Phys. Rev. B 57, 6884 (1998).
18. K. Held et al., in Quantum Simulations of Complex Many-Body Systems: From Theory to Algorithms, J. Grotendorst, D. Marks, and A. Muramatsu (ed.), NIC Series Volume 10, p. 175-209 (2002).
19. A. I. Lichtenstein, M. I. Katsnelson, and G. Kotliar, to be published in Electron Correlations and Materials Properties 2, A. Gonis (ed.), Kluwer, New York; G. Kotliar and S. Savrasov, in New Theoretical Approaches to Strongly Correlated Systems, A. M. Tsvelik (ed.), Kluwer, New York, 2001, p. 259.
20. M. Born and R. Oppenheimer, Ann. Phys. (Leipzig) 84, 457 (1927).
21. O. K. Andersen, Phys. Rev. B 12, 3060 (1975); O. Gunnarsson, O. Jepsen, and O. K. Andersen, Phys. Rev. B 27, 7144 (1983); O. K. Andersen and O. Jepsen, Phys. Rev. Lett. 53, 2571 (1984).
22. J. Wahle et al., Phys. Rev. B 58, 12749 (1998).
23. O. Gunnarsson et al., Phys. Rev. B 39, 1708 (1989).
24. V. I. Anisimov, J. Zaanen, and O. K. Andersen, Phys. Rev. B 44, 943 (1991); V. I. Anisimov, F. Aryasetiawan, and A. I. Lichtenstein, J. Phys. Cond. Matter 9, 767 (1997).
25. V. Drchal, V. Janiš, and J. Kudrnovský, in Electron Correlations and Material Properties, edited by A. Gonis, N. Kioussis, and M. Ciftan, Kluwer/Plenum, New York, 1999, p. 273.
26. J. Lægsgaard and A. Svane, Phys. Rev. B 58, 12817 (1998).
27. Th. Wolenski, Ph.D. Thesis, Universität Hamburg 1998.
28. M. B. Zölfl et al., Phys. Rev. B 61, 12810 (2000).
29. H. Keiter and J. C. Kimball, Phys. Rev. Lett. 25, 672 (1970); N. E. Bickers, D. L. Cox, and J. W. Wilkins, Phys. Rev. B 36, 2036 (1987).
30. Th. Pruschke and N. Grewe, Z. Phys. B 74, 439 (1989).
31. Th. Pruschke, D. L. Cox, and M. Jarrell, Phys. Rev. B 47, 3553 (1993).
32. J. E. Hirsch and R. M. Fye, Phys. Rev. Lett. 56, 2521 (1986); M. Jarrell, Phys. Rev. Lett. 69, 168 (1992); M. Rozenberg, X. Y. Zhang, and G. Kotliar, Phys. Rev. Lett. 69, 1236 (1992); A. Georges and W. Krauth, Phys. Rev. Lett. 69, 1240 (1992); M. Jarrell, in Numerical Methods for Lattice Quantum Many-Body Problems, edited by D. Scalapino, Addison Wesley, 1997; for multi-band QMC, see M. J. Rozenberg, Phys. Rev. B 55, R4855 (1997); J. E. Han, M. Jarrell, and D. L. Cox, Phys. Rev. B 58, R4199 (1998); K. Held and D. Vollhardt, Euro. Phys. J. B 5, 473 (1998).
33. M. Caffarel and W. Krauth, Phys. Rev. Lett. 72, 1545 (1994).
34. R. Bulla, Adv. Sol. State Phys. 46, 169 (2000).
35. I. A. Nekrasov et al., Euro. Phys. J. B 18, 55 (2000).
36. M. I. Katsnelson and A. I. Lichtenstein, J. Phys. Cond. Matter 11, 1037 (1999).
37. M. I. Katsnelson and A. I. Lichtenstein, Phys. Rev. B 61, 8906 (2000).
38. K. Held et al., Phys. Rev. Lett. 86, 5345 (2001).
39. M. S. Laad, L. Craco, and E. Müller-Hartmann, cond-mat/0211210.
40. I. A. Nekrasov et al., cond-mat/0211508; A. Liebsch, cond-mat/0301537.
41. I. A. Nekrasov et al., Phys. Rev. B 67, 085111 (2003).
42. A. Liebsch and A. I. Lichtenstein, Phys. Rev. Lett. 84, 1591 (2000).
43. V. I. Anisimov et al., Eur. Phys. J. B 25, 191-201 (2002).
44. L. Craco, M. S. Laad, and E. Müller-Hartmann, cond-mat/0209132.
45. A. I. Lichtenstein, M. I. Katsnelson, and G. Kotliar, Phys. Rev. Lett. 87, 67205 (2001).
46. S. Biermann et al., cond-mat/0112430.
47. S. Y. Savrasov, G. Kotliar, and E. Abrahams, Nature 410, 793 (2001); S. Y. Savrasov and G. Kotliar, cond-mat/0106308.
48. M. B. Zölfl et al., Phys. Rev. Lett. 87, 276403 (2001).
49. K. Held, A. K. McMahan, and R. T. Scalettar, Phys. Rev. Lett. 87, 276404 (2001).
50. A. K. McMahan, K. Held, and R. T. Scalettar, Phys. Rev. B 67, 75108 (2003).
51. W. Weber, J. Bünemann, and F. Gebhard, in Band-Ferromagnetism, edited by K. Baberschke, M. Donath, and W. Nolting, Lecture Notes in Physics, Vol. 580 (Springer, Berlin, 2001), p. 9; J. Bünemann, F. Gebhard, W. Weber, Phys. Rev. B 57, 6896 (1998).
52. N. F. Mott, Rev. Mod. Phys. 40, 677 (1968); Metal-Insulator Transitions (Taylor & Francis, London, 1990); F. Gebhard, The Mott Metal-Insulator Transition (Springer, Berlin, 1997).
53. J. Hubbard, Proc. Roy. Soc. London Ser. A 276, 238 (1963); 277, 237 (1963); 281, 401 (1964).
54. W. F. Brinkman and T. M. Rice, Phys. Rev. B 2, 4302 (1970).
55. G. Moeller et al., Phys. Rev. Lett. 74, 2082 (1995); J. Schlipf et al., Phys. Rev. Lett. 82, 4890 (1999); M. J. Rozenberg, R. Chitra, and G. Kotliar, Phys. Rev. Lett. 83, 3498 (1999); R. Bulla, Phys. Rev. Lett. 83, 136 (1999); R. Bulla, T. A. Costi, and D. Vollhardt, Phys. Rev. B 64, 45103 (2001); J. Joo and V. Oudovenko, Phys. Rev. B 64, 193102 (2001); N. Blümer, Ph.D. thesis, Universität Augsburg 2002.
56. P. D. Dernier, J. Phys. Chem. Solids 31, 2569 (1970).
57. I. Solovyev, N. Hamada, and K. Terakura, Phys. Rev. B 53, 7158 (1996).
58. M. Schramme, Ph.D. thesis, Universität Augsburg 2000 (Shaker Verlag, Aachen, 2000); M. Schramme et al. (unpublished).
59. S.-K. Mo et al., cond-mat/0212110.
60. O. Müller et al., Phys. Rev. B 56, 15056 (1997).
61. J.-H. Park et al., Phys. Rev. B 61, 11506 (2000).
62. D. J. Arnold and R. W. Mires, J. Chem. Phys. 48, 2231 (1968).
63. C. Castellani, C. R. Natoli, and J. Ranninger, Phys. Rev. B 18, 4945 (1978); 18, 4967 (1978); 18, 5001 (1978).
64. M. J. Rozenberg et al., Phys. Rev. Lett. 75, 105 (1995).
65. Handbook on the Physics and Chemistry of Rare Earths, edited by K. A. Gschneidner Jr. and L. R. Eyring (North-Holland, Amsterdam, 1978); in particular, D. G. Koskenmaki and K. A. Gschneidner Jr., ibid, p. 337.
66. J. S. Olsen et al., 133B, 129 (1985).
67. A. K. McMahan et al., J. Comput.-Aided Mater. Design 5, 131 (1998).
68. B. Johansson, Philos. Mag. 30, 469 (1974); B. Johansson et al., Phys. Rev. Lett. 74, 2335 (1995).
69. J. W. Allen and R. M. Martin, Phys. Rev. Lett. 49, 1106 (1982); J. W. Allen and L. Z. Liu, Phys. Rev. B 46, 5047 (1992); M. Lavagna, C. Lacroix, and M. Cyrot, Phys. Lett. 90A, 210 (1982).
70. K. Held et al., Phys. Rev. Lett. 85, 373 (2000); see also C. Huscroft, A. K. McMahan, and R. T. Scalettar, Phys. Rev. Lett. 82, 2342 (1999).
71. K. Held and R. Bulla, Eur. Phys. J. B 17, 7 (2000).
72. L. Z. Liu et al., Phys. Rev. B 45, 8934 (1992).
73. We solve self-consistently for n_f using a 4f self energy Σ = U_f (n_f − 1/2), and then remove this contribution from the eigenvalue sum to get the kinetic energy. The potential energy is taken to be (1/2) U_f n_f (n_f − 1).
The Single Quantum Dot Photodiode – A Two-Level System with Electric Contacts
Evelin Beham¹, Artur Zrenner², Stefan Stufler², Frank Findeis¹, Max Bichler¹, and Gerhard Abstreiter¹
¹ Walter Schottky Institute, Technical University of Munich, Am Coulombwall, D-85748 Garching, Germany
² University of Paderborn, Warburger Str. 100, D-33098 Paderborn, Germany
Abstract. Semiconductor quantum dots can be described as quantum mechanical two-level systems. Under the influence of strong electromagnetic driving fields and in the absence of decoherence such systems exhibit Rabi flopping corresponding to a qubit rotation in the context of quantum computing. Based on a single QD incorporated in a photodiode we have prepared a two-level system with electric contacts. By means of this single QD photodiode we demonstrate the transfer of coherent optical excitations into a deterministic photocurrent. The QD photocurrent directly reflects Rabi flopping of the exciton state. Under π-pulse excitation the exciton occupation is inverted from 0 to 1 leading ideally to the creation of exactly one exciton per laser pulse. For this condition, the photocurrent is determined by the repetition rate of the experiment f and the elementary charge e, resulting quantitatively in I = f · e.
1
Introduction
In recent years the investigation of coherent phenomena in low-dimensional semiconductor structures has increasingly come into the focus of interest. In contrast to conventional semiconductor devices that in general work in an incoherent mode of operation, the application of coherent phenomena for novel quantum logic is one of the big challenges for the future. At present there are various proposals for the realization of quantum logic on the basis of semiconductor heterostructures. Many of them use quantum dots (QDs) which are regarded as promising candidates for the implementation of qubits, the basic elements of quantum computers [1]. A fundamental approach to realize a qubit is to utilize a quantum mechanical two-level system. Due to the discrete energy structure of these quasi zero-dimensional systems, single QDs can be modeled in terms of a two-level system as will be shown in this contribution. In particular, the electrical access to the state of the quantum system makes our technique a forward-looking approach. One essential requirement for a coherent manipulation of a qubit is a sufficiently long decoherence time of the quantum state to allow for many operation cycles. Compared to higher dimensional semiconductor structures, carriers confined in QDs are expected to have longer decoherence times, since they experience weaker coupling to the environment. Recent experiments on self-assembled QDs have evidenced a low temperature decoherence time of the ground state exciton of more than 500 ps, probably limited by the radiative recombination time of the exciton [2]. As compared to semiconductor structures with higher dimensionality, where the dephasing times are typically smaller by a factor of 10 to 100, the reduced rate of interactions of QD states with the environment seems to be a key issue in terms of coherent applications. Various experiments on individual QDs have shown that single quantum states in QDs can be coherently manipulated. Bonadeo et al. demonstrated that a single exciton confined in a natural QD can be coherently controlled by means of optical pulse-pairs [3]. Furthermore, it was shown that under strong excitation conditions even Rabi flopping of the exciton state in a single QD is achievable [4]. In this contribution we also report on Rabi flopping in a single QD, but in contrast to existing all-optical approaches, we are able to quantitatively prove the full inversion of the quantum state for the case of an applied optical π-pulse. This is carried out by application of photocurrent (PC) spectroscopy, a powerful experimental technique to investigate the ground-state absorption in a single QD. Recently PC experiments on QD ensembles have given insight into the mechanisms and the time scales of carrier capture, redistribution and escape processes [6,7]. Furthermore, PC experiments on single self-assembled QDs have been performed, showing the discrete absorption characteristics of single QDs resulting in sharp spectral features [8,9].
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 287–300, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
The Single QD Photodiode
In our present work we report on coherent experiments on a single QD photodiode, which can be regarded as a novel opto-electronic quantum device [5]. While based on a conventional diode structure, here a GaAs n-i-Schottky structure, the only optically active part is a single self-assembled In0.5Ga0.5As QD contained in the intrinsic layer of the diode (see insets of Fig. 1). A semitransparent Schottky contact is provided by a 5 nm thick titanium layer. The optical selection of a single QD is done by near-field shadow masks with apertures from 100 nm to 500 nm, which are prepared by electron beam lithography from an 80 nm thick aluminum layer. A single QD embedded in such a diode structure can be treated as a single two-level system. The quantum state of this two-level system can be coherently manipulated by resonant optical excitation applying coherent pump pulses. As a characteristic feature of our approach, the diode structure allows for an electrical access to the exciton state in the QD via a detection of the tunneling current out of the QD. It is this connection to the electric circuit which offers the new and so far unattained realization of a two-level system with electric contacts. With this kind of device a single quantum system can be addressed in real- or frequency-space, coherently controlled, and read out electrically. Important
Fig. 1. Single PC line of the exciton ground state in a QD. The insets show the realization of a single QD photodiode on the basis of a n-i-Schottky diode and the active part of the band diagram of the single QD photodiode
coherent properties like the dephasing time of the system can be tuned by simply changing an external voltage. Experimentally, we apply a PC technique to investigate the behavior of a single QD as a quantum mechanical two-level system. This technique allows us to perform, on the one hand, the optical generation of a specific quantum state in the QD and, on the other hand, an electrical read-out of the quantum state. Thus it enables us to monitor the excitation level of the QD by a simple current measurement. Moreover, performing an electrical probe avoids the inherent background problem of all-optical techniques. The basic concept of our PC technique is shown by means of the band diagram in the inset of Fig. 1. A QD is embedded in the intrinsic layer of the photodiode, which is operated in reverse bias direction. Resonant optical excitation of the QD ground state is performed by a cw Ti:sapphire laser tuned to the exciton resonance. In the regime of sufficiently high electric field (for our QDs typically > 40 kV/cm), the optically generated carriers tunnel out of the QD and are separated in the applied field. Since our PC measurements are performed at T = 2.3 K, tunneling is expected to be the dominant escape mechanism in our QDs [6]. The resulting field ionization of the carriers gives rise to a tunneling current, which is recorded in an external dc current measurement. When the excitation energy is tuned, the PC signal exhibits sharp spectral resonances revealing the discrete density of states of a single QD. Figure 1 shows a single PC line taken from the exciton ground state in a single QD.
3 Application as a Mini-spectrometer
Based on the above described PC approach, a single QD photodiode can be operated as a photodetector with a unique spatial and spectral resolution. Single QDs can provide substantial current signals (up to nA) from mesoscopic absorption volumes (nm-scale). Additionally, such a detector offers
Evelin Beham et al.
a spectral resolution in the sub-nm range, which is determined by the PC linewidth. We have evidence that this linewidth originates mainly from the lifetime of the resonantly excited ground state exciton in the QD. For typical operating conditions, PC lines with a spectral width Γ ≈ 100 µeV are detected. From these we infer a typical exciton lifetime τ = ħ/Γ of a few ps. Another feature of our QD photodetector is the tunability of its resonance energy. Using the Quantum Confined Stark Effect (QCSE) in the QD, the exciton energy can be shifted by applying an external bias voltage to the photodiode (see inset of Fig. 2). Thus the spectral sensitivity of the photodetector can be adjusted electrically. In other words, the single QD photodiode can be used as an electrically tunable, mesoscopic spectrometer. The Stark shift of the exciton ground state resonance leads to a maximum spectral range of 5–10 meV in which the detector can be operated. In Fig. 2 we show the performance of our single QD photodiode as an electrically tunable spectrum analyzer. Applying laser excitation with fixed energy, we observe a single exciton resonance appearing at a specific voltage. This resonance voltage is defined by the condition that the QCSE shift brings the exciton into resonance with the laser. Tuning the excitation energy therefore shifts the exciton resonance along the VB axis. Each peak voltage corresponds to a particular spectral energy, and hence the voltage value is a direct measurement of the laser excitation energy. This is just the function of a spectrometer. In this manner our single QD spectrometer works over a spectral range of 3.6 meV, which corresponds to several tens of its detection linewidth.
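The two estimates in this paragraph, the lifetime τ = ħ/Γ and the number of linewidths that fit into the tuning range, can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and not part of the original work, and the input values are the rounded numbers quoted in the text.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def lifetime_from_linewidth(gamma_ev: float) -> float:
    """Return tau = hbar / Gamma in seconds for a linewidth Gamma given in eV."""
    return HBAR_EV_S / gamma_ev


def resolvable_linewidths(range_ev: float, linewidth_ev: float) -> float:
    """Return how many detection linewidths fit into the spectral range."""
    return range_ev / linewidth_ev


tau = lifetime_from_linewidth(100e-6)              # Gamma = 100 micro-eV
channels = resolvable_linewidths(3.6e-3, 100e-6)   # 3.6 meV range

print(f"exciton lifetime: {tau * 1e12:.1f} ps")    # 6.6 ps
print(f"linewidths in range: {channels:.0f}")      # 36
```

With Γ ≈ 100 µeV this gives a lifetime of about 6.6 ps, in line with the "few ps" stated above, and roughly 36 linewidths within the 3.6 meV range.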
Fig. 2. Application of a single QD photodiode as mini spectrum analyzer. A number of different laser energies have been recorded sequentially by tuning VB . Energy calibration is provided by the QCSE (see inset)
4 Application as a Two-Level System with Electric Contacts
A single QD can be regarded as a quantum mechanical two-level system. Figure 3 gives a schematic view of the two-level model that underlies all subsequent considerations in this work. Two different states of one QD form the two-level system. The upper level is represented by the QD occupied by one exciton in its ground state. The lower level stands for the QD without any occupation, which is the crystal ground state. Although there are more confined states in a single QD, this system can still be treated as a two-level system. Owing to the large energy separations between the various transitions, the PC technique allows us to consider the exciton ground state only. In principle, the s-shell of a single QD offers space for two excitons. However, the biexciton line is spectrally shifted by ≈ 3 meV [10] from the single exciton line due to few-particle interactions. Thus no second absorption process can take place at the single exciton resonance energy as long as the QD is occupied by one exciton. Therefore the occupation number of the QD under resonant excitation of the ground state cannot exceed one, and the system is well represented by the two-level model described above. As already described, the applied PC technique is based on resonant optical excitation of the exciton state. This resonant pump field is used to drive our two-level system. The excitation is probed by a simple detection of the tunneling current, which lets us monitor the excitation level of the two-level system. In the following, two experimental regimes are discussed. First we use a cw pump source to drive our two-level system. Under this experimental condition we explore the case of exciting our two-level system incoherently. In the second experiment we use a pulsed excitation source, which enables us to observe coherent processes in our two-level system.
Fig. 3. Schematic view of an excitonic two-level system in the ground state of a semiconductor QD. The state |0⟩ corresponds to an empty QD, |X⟩ to a one-exciton occupancy
4.1 Incoherent Characteristics
4.1.1 Saturation Behavior
Figure 4 shows a series of PC spectra of a single QD for increasing excitation power. The dominating PC line results from the ground state absorption of the investigated QD. This was confirmed by single dot PL on the same
Fig. 4. PC spectra of a single QD ground state under resonant excitation conditions. With increasing excitation power a saturation of the QD PC is observed. For each spectrum the zero PC line is indicated on the left
shadow mask aperture. For further discussion we use the peak amplitude of the PC line as a measure of the QD absorption at fixed electric field. Starting at a relatively low excitation level, the PC line first increases significantly but saturates strongly at high excitation. Further increase in absorption appears to be suppressed in the limit of high excitation. Since the excitation is performed through a shadow mask in the optical near-field regime (size ≈ 250 nm), the absolute excitation density of the QD is difficult to determine. Thus we use arbitrary units for the excitation level of the QD, where a value of 80 corresponds to a power density of ≈ 10⁵ W/cm² in the laser focus on the sample surface. Moreover, a background PC emerges at high excitation intensities, together with an additional feature at 1.3124 eV. These probably arise from additional excitation of other QDs by stray light at high laser power, since all QDs are connected electrically in parallel. As explained above, a single QD can be regarded as a two-level system. It is an intrinsic property of a two-level system that, under resonant high-power excitation, the two levels can at most become equally occupied (incoherent case). Therefore the excitation level of the system saturates at 50% if we increase our cw driving power. In the saturated case, stimulated emission and absorption just compensate each other, and a further increase of the excitation power does not affect the excitation level. Experimentally, the excitation level of the two-level system is probed by the tunneling current out of the upper state (ground state exciton). Thus reaching the maximum excitation level of 50% results in a saturation of the PC from the QD. Similar series of power-dependent PC spectra were taken for different bias conditions. They all exhibit a saturation of the PC amplitude, measured from the peak of the dominating PC line (at ≈ 1.3135 eV) with respect to the background PC.
The experimental data are summarized in Fig. 5. The saturation level of the PC amplitude shows a pronounced dependence on the electric field in the QD-plane. At higher electric fields F the tunneling
Fig. 5. Overview of the PC saturation with excitation power for different bias conditions. For stronger electric fields a higher PC saturation value is observed. The experimental data are fitted by a specifically calculated function, based on a rate equation model
time is reduced and the QD returns faster to its initial empty state. Therefore a higher absorption rate is possible, leading to a higher saturated PC.
4.1.2 Modeling
The saturation of the QD PC with increasing excitation power can be described by a fundamental rate equation model. In our model we concentrate on a specific QD ground state with an exciton occupation number N. The status of the QD is modeled in terms of a two-level system. The QD is declared to be in level 1 if it is unoccupied, and in level 2 if it is occupied with one exciton. For these two levels we define two time-averaged occupation numbers N1 and N2, with values between 0 and 1, complementing each other to one: N1 + N2 = 1. According to the theory of two-level systems [11] we consider the following rate equations
$$\frac{dN_1}{dt} = A_{21} N_2 + B_{21} N_2 \rho - B_{12} N_1 \rho + \frac{N_2}{\tau_{esc}} \qquad (1)$$
$$\frac{dN_2}{dt} = -A_{21} N_2 - B_{21} N_2 \rho + B_{12} N_1 \rho - \frac{N_2}{\tau_{esc}}. \qquad (2)$$
Here the well-known Einstein coefficients enter: A21, B21 and B12 for spontaneous emission, stimulated emission and absorption, respectively. ρ is the energy density of the radiation field. In addition we consider the tunneling escape process of the exciton out of the QD with a time constant τesc. To solve the rate equations (1) and (2) we use B12 = B21 [11], introduce the factor M reflecting the probability for the stimulated processes, and use A21 = 1/τr, where τr is the radiative recombination time of the exciton. Furthermore, the strength of the radiation field ρ is expressed in terms of the excitation power P. With the first and the last term in the above equations
we have included the two exciton loss mechanisms, radiative recombination and tunneling. The second and third terms account for the power-dependent resonant creation and annihilation of the exciton. Since the resulting effective change of the occupation $dN_{1,2}/dt$ depends on the occupation number $N_{1,2}$ itself, we recognize that the absorption behavior of the QD is affected by the current state of the dot. This property leads to the nonlinear power dependence of the PC, resulting in PC saturation. To return to our experimental data we take the steady-state occupation number of the upper state, N0, as the occupancy of the QD and calculate the resulting PC from the tunnel escape term as
$$I = e \cdot \frac{N_0}{\tau_{esc}} = \frac{e}{2\tau_{esc}} \cdot \frac{P}{P + \frac{1}{2M}\left(\frac{1}{\tau_{esc}} + \frac{1}{\tau_r}\right)}. \qquad (3)$$
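As a cross-check of Eq. (3), the following Python sketch evaluates the steady-state photocurrent and compares the closed-form occupation with a direct time integration of rate equation (2), using B12 ρ = B21 ρ = M·P and A21 = 1/τr as described above. The numerical values of M, P and τr used in the example are purely illustrative assumptions (the text gives no value for M and quotes the power in arbitrary units); only the structure of the formula comes from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C


def upper_level_occupation(power, m_factor, tau_esc, tau_r):
    """Steady-state N2 of Eqs. (1)-(2) with B12*rho = B21*rho = M*P."""
    loss_rate = 1.0 / tau_esc + 1.0 / tau_r
    return m_factor * power / (2.0 * m_factor * power + loss_rate)


def photocurrent(power, m_factor, tau_esc, tau_r):
    """Photocurrent of Eq. (3): I = e * N2 / tau_esc."""
    return E_CHARGE * upper_level_occupation(power, m_factor, tau_esc, tau_r) / tau_esc


def integrate_rate_equation(power, m_factor, tau_esc, tau_r, steps=20000):
    """Forward-Euler integration of Eq. (2), starting from an empty dot."""
    loss_rate = 1.0 / tau_esc + 1.0 / tau_r
    pump = m_factor * power
    dt = 0.01 / (2.0 * pump + loss_rate)  # small fraction of the relaxation time
    n2 = 0.0
    for _ in range(steps):
        n1 = 1.0 - n2
        n2 += dt * (pump * n1 - pump * n2 - loss_rate * n2)
    return n2


# Illustrative (assumed) parameters: M in 1/(s * arbitrary power unit).
M_ASSUMED, TAU_ESC, TAU_R = 1e9, 29e-12, 1e-9

n2_formula = upper_level_occupation(80.0, M_ASSUMED, TAU_ESC, TAU_R)
n2_numeric = integrate_rate_equation(80.0, M_ASSUMED, TAU_ESC, TAU_R)
i_high_power = photocurrent(1e9, M_ASSUMED, TAU_ESC, TAU_R)
i_saturation = E_CHARGE / (2.0 * TAU_ESC)
print(n2_formula, n2_numeric, i_high_power, i_saturation)
```

The integrated occupation settles at the closed-form value, and for very large P the current approaches e/(2τesc), the saturation limit that follows from Eq. (3).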
This expression is used to fit the power-dependent PC peak amplitudes. Figure 5 illustrates the good agreement of the fit results with our experimental data. For high excitation power P the PC expression in equation (3) leads to the saturation of the PC at Isat = e/(2τesc). Owing to the low temperature in our experiment, thermal escape out of the QD is regarded as negligible with respect to tunneling escape, as shown by Chang et al. for InAs QDs [6]. Therefore we take τesc = τt(F). The tunnel escape time τt is expected to depend strongly on the chosen electric field.
4.1.3 Exciton Tunneling Time
According to our model we are able to obtain the tunneling escape time from the fits to the experimentally observed PC saturation levels. Since there are both electrons and holes in the QD, it is of interest which escape time, that of the electron or that of the hole, is relevant. A more detailed look at the mechanism for saturation gives an answer to this question. The initial state is an empty QD. First, a resonant photon absorption leads to an occupation of the dot with one exciton. However, electron and hole are expected to exhibit different tunneling lifetimes. After tunneling of the faster of the two carriers the system is dephased, but there is still one carrier left in the QD and the absorption is still bleached. This bleaching originates from the renormalization of a singly charged exciton with respect to the neutral exciton, as observed by PL spectroscopy [10]. After the second carrier has also tunneled out, the dot is again in its initial state and ready for the next absorption process. This means that it is the lifetime of the slower-tunneling carrier that controls bleaching and determines the value of the saturated PC. Equation (3) is used to calculate the carrier lifetime from the saturated PC. Saturation values of 2.78 nA at F = 68.9 kV/cm to 0.24 nA at F = 64.4 kV/cm lead to a tunneling lifetime for the slower-tunneling carriers of 29 ps to 330 ps, respectively.
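The quoted lifetimes follow directly from the saturation limit Isat = e/(2τesc) of Eq. (3). A minimal Python sketch of this arithmetic, with an illustrative function name, is:

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C


def tunneling_time_from_saturation(i_sat_amp: float) -> float:
    """Invert I_sat = e / (2 * tau_esc) to obtain tau_esc in seconds."""
    return E_CHARGE / (2.0 * i_sat_amp)


# (electric field in kV/cm, saturated photocurrent in A) as quoted in the text
measurements = [(68.9, 2.78e-9), (64.4, 0.24e-9)]
lifetimes = [tunneling_time_from_saturation(i_sat) for _, i_sat in measurements]

for (field, _), tau in zip(measurements, lifetimes):
    print(f"F = {field} kV/cm: tau = {tau * 1e12:.0f} ps")
# F = 68.9 kV/cm: tau = 29 ps
# F = 64.4 kV/cm: tau = 334 ps
```

The second value, 334 ps, agrees with the rounded 330 ps given above.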
4.2 Coherent Characteristics
4.2.1 Coherent Experiment
For the discussion of the coherent properties of a single QD photodiode we first address the hierarchy of relevant and characteristic timescales in this system. The lifetime of a coherent polarization is given by the decoherence or dephasing time τp. Only within this characteristic time is it possible to observe coherent interactions or to perform coherent manipulations in the system. In the field of semiconductor QDs, recent experiments on self-assembled InGaAs QDs have shown dephasing times in excess of 500 ps at low temperatures [2]. The spontaneous lifetime of the exciton τs is of the order of 1 ns in typical III/V semiconductor QDs with direct band gap. It is naturally the longest timescale relevant for excitonic two-level systems. In biased single QD photodiodes the tunneling time τt is also highly relevant. Depending on the electric field, τt is tunable from infinity down to < 1 ps, as concluded from the linewidth of the exciton resonance. Tunneling on a timescale τt ≤ τp, on the other hand, leads to enhanced dephasing and further limits the time range for coherent interactions. In order to safely reach conditions for coherent excitation we apply 1.7 ps pulses from a mode-locked Ti:sapphire laser. By applying an appropriate bias to our photodiode, we choose a sufficiently long tunneling time that exceeds the pulse length but is smaller than the dephasing time. If a quantum mechanical two-level system is strongly driven at its resonance frequency, a basic coherent phenomenon occurs, known as Rabi oscillations. Under constant illumination intensity at this frequency, the excitation level L of the two-level system, which is the expectation value for the two-level system to be in the upper state, oscillates in time. For time scales shorter than the dephasing time, the population oscillates between lower and upper level.
In general, the optical Bloch theory for ideal two-level systems [12] describes the oscillation of the excitation level L under resonant excitation of the two-level system by $L = \sin^2(\Theta/2)$. Here the parameter $\Theta = (\mu/\hbar)\int E(t)\,dt$ enters, with the transition dipole moment µ and the time integral $\int E(t)\,dt$ representing the envelope area A of the exciting pulse. The observation of Rabi oscillations is typically realized by using pulsed excitation. Let us assume one resonant excitation pulse, which is shorter than the dephasing time in the system. This pulse drives our two-level system only as long as the pulse lasts. After the end of the pulse there remains a constant but coherent inversion in the system (assuming times shorter than dephasing). The crucial parameter that determines the resulting inversion after this pulsed excitation is the envelope area A of the pulse. This parameter is affected by both the pulse length and the electric field amplitude Aexc of the pulse, or equivalently its intensity. In our experiments we use a fixed pulse length and vary the intensity of the exciting pulse, $I \propto A_{exc}^2$. Before we come to the coherent PC spectroscopy on our single QD photodiode, we characterize the exciton resonance by linear, incoherent PC spectroscopy. This is performed by I-V spectroscopy, that is
realized by cw-driving the exciton at a fixed excitation energy, namely at the same energy as used for the pulsed experiment. In a first step we tune the emission energy EL of our cw Ti:sapphire laser to a fixed energy close to the QD exciton resonance EX. In a second step the bias voltage VB of our device is tuned (at fixed EL) and, as a consequence of the QCSE, EX(VB) is shifted through EL. At a specific voltage VB, EX coincides with the excitation energy EL and we observe a resonance in the resulting PC. Figure 6 displays the PC resonance for VB = 0.96 V, where EX = EL ≈ 1.31 eV. For bias voltages VB < 0.96 V the exciton resonance energy EX is higher than the exciting laser energy EL, whereas for VB > 0.96 V EX lies below EL. By repeating this procedure for different EL, we obtain a magnitude of the QCSE of about 2.4 meV/V (see inset of Fig. 2). Based on the Stark shift, the voltage axis can be converted to an energy axis. The excitonic PC resonance exhibits a width in VB of about 14 mV, which corresponds to a spectral linewidth of about 34 µeV.
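The conversion between the voltage axis and the energy axis can be sketched as follows. The sketch assumes a locally linear Stark shift with the slope of about 2.4 meV/V quoted above; treating the shift as linear over the whole bias range is a simplifying assumption, and the function names are illustrative.

```python
STARK_SLOPE_MEV_PER_V = 2.4  # QCSE magnitude quoted in the text


def voltage_to_energy_mev(delta_v: float) -> float:
    """Convert a bias-voltage interval (V) to an energy interval (meV)."""
    return STARK_SLOPE_MEV_PER_V * delta_v


def energy_to_voltage(delta_e_mev: float) -> float:
    """Convert an energy interval (meV) to a bias-voltage interval (V)."""
    return delta_e_mev / STARK_SLOPE_MEV_PER_V


linewidth_uev = voltage_to_energy_mev(0.014) * 1000.0  # 14 mV wide resonance
offset_v = energy_to_voltage(1.5)                      # 1.5 meV energy offset

print(f"linewidth: {linewidth_uev:.1f} micro-eV")  # 33.6 micro-eV
print(f"voltage offset: {offset_v:.3f} V")         # 0.625 V
```

The 14 mV resonance width thus maps onto about 34 µeV, and an energy offset of 1.5 meV corresponds to roughly 0.6 V on the bias axis under the same linear assumption.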
Fig. 6. PC spectrum of a single QD in the region of the excitonic ground state energy EX . In this cw experiment, the laser energy EL is fixed, whereas EX is tuned by VB via the QCSE
4.2.2 Rabi Oscillations
To demonstrate the coherent properties of our single QD photodiode we performed PC spectroscopy under pulsed excitation. A mode-locked Ti:sapphire laser with an output pulse length of ≈ 1.7 ps is spectrally set close to the exciton resonance at 1.31 eV. Analogous to the cw spectroscopy described above, we tune the exciton resonance through the fixed laser energy by taking I-V curves under illumination (see Fig. 7). With a spectral width of ≈ 0.9 meV, our ps laser is nearly bandwidth limited. It is important to note that the spectral broadening of our laser does not affect the resonant character of our excitation. Within the laser linewidth of about 0.9 meV no QD state other than the exciton contributes to the PC. With a biexciton binding energy of 3 meV, the two-photon biexciton resonance would be 1.5 meV below EX. On the voltage axis this resonance is shifted by ≈ 0.65 V from the exciton resonance and
Fig. 7. PC data versus VB for increasing excitation amplitude Aexc obtained from a single QD photodiode under pulsed excitation. The individual spectra are vertically offset for clarity. The resonance condition EX = EL is indicated by the dashed line
would therefore occur at VB ≈ 0.3 V, below the onset of exciton tunneling (see also Fig. 9). By tuning the bias voltage of our photodiode, the sharp exciton resonance (≈ 34 µeV) is shifted through this laser spectrum. For a linear PC signal we therefore expect a convolution of both spectra, the broad laser and the relatively sharp exciton resonance. For instance, the spectrum at Aexc = 0.22 in Fig. 7 displays the PC response in this almost linear regime. Under this experimental condition we observe mainly the spectral structure of the exciting laser. In addition, Fig. 7 shows a series of PC spectra for increasing Aexc, which is directly related to the pulse area A, as introduced above. Experimentally, however, the real laser intensity reaching the QD is hard to specify. Therefore we restrict ourselves to relative values for Aexc. A value of Aexc = 1 for the pulsed excitation corresponds to an average cw excitation intensity of ≈ 1 × 10³ W/cm² on the QD sample. The recorded PC data contain a background contribution (non-resonant and linear in power), which we believe is due to wetting-layer tail-state absorption by stray light. It has been subtracted for clarity. The voltage range in Fig. 7 (from 0 V to 1.8 V) corresponds to electric fields from 21 kV/cm to 66 kV/cm. The main exciton resonance appears at 0.96 V (45 kV/cm), consistent with the cw PC data taken at the same laser energy. The displayed voltage range therefore contains the condition of resonant excitation as well as conditions of detuning between EX and EL. For further analysis we follow the PC signal on the main resonance at ≈ 0.96 V. With increasing Aexc we first observe an increasing PC signal from the QD. At Aexc ≈ 1 the current reaches a maximum; a further increase of Aexc in turn leads to a reduction of the PC. This behavior is in contrast to the well-known characteristics of a conventional, incoherent
photodiode, but is expected for the coherent population of a two-level system with increasing pumping power, which is directly reflected here in the PC. A full on-resonance scenario (EX = EL) over a more extended range in Aexc is shown in Fig. 8. With increasing Aexc we observe more than one period of a damped Rabi oscillation in the PC, which reflects directly and quantitatively the resulting occupancy in the two-level system. The exact reason for the observed power-dependent damping of the Rabi oscillations with increasing Aexc is unknown so far; it is possibly caused by Coulomb scattering with excitations in wetting-layer tail states. The first maximum of the Rabi oscillations corresponds to an excitation with a π-pulse, which appears here at Aexc = 1. Applied to a two-level system with an initial occupancy of 0, a single π-pulse leads to a complete inversion of the system, namely to occupancy 1. The photo-ionization of this excitation by tunneling leads to the separation of (in the ideal case) exactly one electron-hole pair and hence to the net transport of one elementary charge between the contacts of the photodiode. In this sense, a single QD photodiode excited with a π-pulse in the coherent regime is a deterministic current source, which delivers one elementary charge e to an outer circuit per laser pulse. With the pulse repetition frequency f = 82 MHz of our mode-locked Ti:sapphire laser we therefore expect a time-integrated net current I = f · e of 13.1 pA. In our present experiment we obtain a peak PC in the first maximum of about 11.5 pA. So we nearly reach the theoretical maximum for the PC that can be drawn out of the QD under these experimental conditions. It is this quantitative correspondence that makes our experiment novel in the field of Rabi oscillations. In contrast to most other approaches, the PC technique applied here works as a quantitative proof of Rabi flopping.
4.2.3 Field Dependency
In systematic investigations of the Rabi-oscillations as a function of the applied bias voltage we have varied the carrier tunneling time out of the QD.
Fig. 8. Rabi oscillations of the PC at resonance for increasing excitation amplitude Aexc . Coherent π-pulse excitation corresponds to Aexc = 1
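For orientation, the ideal (undamped) behavior behind Fig. 8 can be sketched in Python from the relation L = sin²(Θ/2) of Sect. 4.2.1 and the deterministic current I = f · e. The sketch assumes that the pulse area scales linearly with Aexc and equals π at Aexc = 1, and that every excited exciton is fully converted into photocurrent; it does not model the damping seen in the measurement. Function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in C
REP_RATE_HZ = 82e6          # pulse repetition frequency quoted in the text


def pulse_area(a_exc: float, a_pi: float = 1.0) -> float:
    """Pulse area Theta in rad, assuming Theta = pi at a_exc = a_pi."""
    return math.pi * a_exc / a_pi


def excitation_level(theta: float) -> float:
    """Occupation of the upper level after a resonant pulse of area theta."""
    return math.sin(theta / 2.0) ** 2


def ideal_photocurrent(a_exc: float, rep_rate_hz: float = REP_RATE_HZ) -> float:
    """Time-averaged current if each excited exciton yields one charge e."""
    return rep_rate_hz * E_CHARGE * excitation_level(pulse_area(a_exc))


for a_exc in (0.5, 1.0, 1.5, 2.0):
    print(f"A_exc = {a_exc}: I = {ideal_photocurrent(a_exc) * 1e12:.1f} pA")
# A_exc = 0.5: I = 6.6 pA
# A_exc = 1.0: I = 13.1 pA
# A_exc = 1.5: I = 6.6 pA
# A_exc = 2.0: I = 0.0 pA
```

In this idealized picture the current peaks at f · e ≈ 13.1 pA for a π-pulse and returns to zero for a 2π-pulse.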
Here we observe a reduced photocurrent amplitude under π-pulse excitation conditions for lower bias voltages, as shown in Fig. 9. Decreasing the bias voltage leads to a lower tunneling probability out of the QD. Once excited, the exciton can either tunnel out of the QD or recombine radiatively, and recombination becomes relevant when the tunneling time becomes comparable with the radiative recombination time. At low electric field the exciton is therefore no longer fully observed in the PC, because part of the excitation is lost to radiative recombination. Because of this incomplete tunneling of the exciton we observe a decreased PC maximum, although the exciton is driven to almost complete inversion (π-pulse excitation). In our experiment the PC π-pulse amplitude decreases from 12.3 pA at VB = 1.15 V (F = 49.6 kV/cm) to 5.0 pA at VB = 0.76 V (F = 39.9 kV/cm). For comparison we also include data from the photoluminescence (PL) measurement on the same QD state. The observed decrease in the PL signal with increasing electric field results from the onset of tunneling of the QD exciton. Between the maxima of the PL and the PC amplitude we observe a small gap in the electric field range around 35 kV/cm, which arises from the non-resonant excitation conditions in the PL experiment shifting the PL onset to lower electric fields.
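The competition between tunneling and radiative recombination can be illustrated with a simple branching-ratio estimate. This sketch goes beyond what the text states: it assumes two independent exponential decay channels with times τt and τr, complete inversion by the π-pulse, and an ideal current f · e for full tunneling. Under these assumptions the fraction of excitons detected in the PC is (1/τt)/(1/τt + 1/τr).

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C
REP_RATE_HZ = 82e6          # pulse repetition frequency quoted in the text


def tunneling_fraction(tau_t: float, tau_r: float) -> float:
    """Fraction of excitons leaving by tunneling for two competing channels."""
    rate_t, rate_r = 1.0 / tau_t, 1.0 / tau_r
    return rate_t / (rate_t + rate_r)


def time_ratio_from_fraction(fraction: float) -> float:
    """Invert the branching ratio: return tau_t / tau_r for a given fraction."""
    return 1.0 / fraction - 1.0


ideal_current = REP_RATE_HZ * E_CHARGE  # about 13.1 pA
# Pi-pulse amplitudes quoted in the text at 49.6 kV/cm and 39.9 kV/cm
ratios = {}
for field, current in ((49.6, 12.3e-12), (39.9, 5.0e-12)):
    ratios[field] = time_ratio_from_fraction(current / ideal_current)
    print(f"F = {field} kV/cm: tau_t / tau_r = {ratios[field]:.2f}")
```

Within this rough picture the tunneling time at 39.9 kV/cm would be roughly 1.6 times the radiative time, while at 49.6 kV/cm it would be much shorter than the radiative time.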
Fig. 9. PC amplitude under π-pulse excitation conditions for various bias voltages at our photodiode. For electric fields below 50 kV/cm we observe a reduced PC amplitude due to incomplete tunneling of the exciton. The included PL data show the expected anticorrelation of PL and PC signal
5 Summary
In summary we have introduced the single QD photodiode, a novel kind of device that acts as a two-level system with electric contacts. We have investigated this two-level system in the incoherent and coherent driving regime.
The incoherent case was experimentally realized by cw PC spectroscopy, where we observed a power-dependent saturation of the PC, which reflects a maximum excitation level of 50% of the two-level system in saturation. By applying pulsed excitation we performed coherent PC spectroscopy, where the PC response shows Rabi oscillations. For the particular case of excitation with optical π-pulses we were able to demonstrate Rabi flopping in a quantitative way. Furthermore, under this condition our single QD photodiode works as a deterministic current source, where each laser pulse ideally contributes one elementary charge to the PC. The authors would like to acknowledge financial support by the BMBF via 01BM917 and the DFG (SFB 348).
References
1. F. Troiani, U. Hohenester, and E. Molinari, Phys. Rev. B 62, 2263 (2000); The Physics of Quantum Information, ed. by D. Bouwmeester, A. Ekert, and A. Zeilinger (Springer, Berlin 2000).
2. P. Borri, W. Langbein, S. Schneider, U. Woggon, R. L. Sellin, D. Ouyang, and D. Bimberg, Phys. Rev. Lett. 87, 157401 (2001); M. Bayer and A. Forchel, Phys. Rev. B 65, 41308 (2002).
3. N. H. Bonadeo, J. Erland, D. Gammon, D. Park, D. S. Katzer, and D. G. Steel, Science 282, 1473 (1998).
4. T. H. Stievater, Xiaoqin Li, D. G. Steel, D. Gammon, D. S. Katzer, D. Park, C. Piermarocchi, and L. J. Sham, Phys. Rev. Lett. 87, 133603 (2001); H. Kamada, H. Gotoh, J. Temmyo, T. Takagahara, and H. Ando, Phys. Rev. Lett. 87, 247401 (2001); H. Htoon, T. Takagahara, D. Kulik, O. Baklenov, A. L. Holmes, Jr., and C. K. Shih, Phys. Rev. Lett. 88, 87401 (2002).
5. A. Zrenner, E. Beham, S. Stufler, F. Findeis, M. Bichler, and G. Abstreiter, Nature 418, 612 (2002).
6. W.-H. Chang, T. M. Hsu, C. C. Huang, S. L. Hsu, C. Y. Lai, N. T. Yeh, T. E. Nee, and J.-I. Chyi, Phys. Rev. B 62, 6959 (2000).
7. A. Patanè, A. Levin, A. Polimeni, L. Eaves, P. C. Main, M. Henini, and G. Hill, Phys. Rev. B 62, 11084 (2000); P. W. Fry, I. E. Itskevich, D. J. Mowbray, M. S. Skolnick, J. J. Finley, J. A. Barker, E. P. O'Reilly, L. R. Wilson, I. A. Larkin, P. A. Maksym, M. Hopkinson, M. Al-Khafaji, J. P. R. David, A. G. Cullis, G. Hill, and J. C. Clark, Phys. Rev. Lett. 84, 733 (2000).
8. E. Beham, A. Zrenner, and G. Böhm, Physica E 7, 359 (2000); F. Findeis, M. Baier, E. Beham, A. Zrenner, and G. Abstreiter, Appl. Phys. Lett. 78, 2958 (2001).
9. E. Beham, A. Zrenner, F. Findeis, M. Bichler, and G. Abstreiter, Appl. Phys. Lett. 79, 2808 (2001).
10. F. Findeis, A. Zrenner, G. Böhm, and G. Abstreiter, Solid State Commun. 114, 227 (2000).
11. A. Yariv, Optical Electronics (John Wiley & Sons, New York 1989).
12. L. Allen and J. H. Eberly, Optical Resonance and Two-Level Atoms (Wiley, New York 1975).
Prospects of Quantum Cascade Lasers with GaInAs Waveguides
Nicolaus Ulbrich, Giuseppe Scarpa, Gerhard Abstreiter, and Markus-Christian Amann
Walter Schottky Institut, Technische Universität München, Am Coulombwall 3, D-85748 Garching, Germany
Abstract. High-performance InP-based quantum cascade lasers for pulsed operation in the 5.5 µm wavelength range have been fabricated in the strain-compensated GaInAs-AlInAs material system using GaInAs-based waveguides. Low optical losses of 3.1 cm⁻¹ at 77 K can be achieved with GaInAs as cladding material. Threshold current densities of 4.2 kA/cm² at 300 K, temperature tuning rates of 0.97 nm/K and pulsed operation up to 440 K have been achieved with uncoated devices. The operating temperature could be increased to 470 K using a high-reflection coating. The presented devices are suited for important chemical sensing applications.
1 Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 301–312, 2003. © Springer-Verlag Berlin Heidelberg 2003
Quantum cascade lasers have matured as reliable coherent light sources in the mid-infrared spectral region, enabling important advances in areas such as laser absorption spectroscopy and optical communications [1]. These applications require compact and highly efficient laser sources working preferably at room temperature. Existing options such as lead salt diode lasers [2] or coherent sources based on difference frequency generation [3] operate only at cryogenic temperatures or generate inherently low infrared powers [4]. Quantum cascade lasers allow the realization of compact and high-power laser sources in the mid-infrared wavelength range from 3.5 to 27 µm and recently also at terahertz frequencies [5]. The technologically most advanced mid-infrared laser sources are based on intersubband transitions in the type-I material system GaInAs-AlInAs on InP substrates [6]. The GaInAs-AlInAs system provides a large conduction band discontinuity of 520 meV and allows the realization of laser sources in the wavelength range from 5 to 27 µm [7]. The demand for high-performance devices in the 5 µm wavelength range for laser spectroscopy applications is satisfied by strain-compensated GaInAs-AlInAs compositions, which provide even larger barrier heights of up to 620 meV with a reduced electron tunneling rate from the upper laser level into the continuum [8]. Quantum cascade lasers are also realized in the type-I GaAs-AlGaAs material system with a maximum conduction band discontinuity of 340 meV [9] as well as using Sb-based type-II interband transitions [10]. The highest detection sensitivities in chemical sensing applications can be achieved using quantum cascade distributed feedback lasers in continuous
wave operation with laser linewidths as narrow as 1–3 MHz without additional frequency stabilization feedback loops [11]. The central problem in the development of continuous wave operation is the high power consumption due to the short non-radiative intersubband lifetimes. Significant improvements in the 9 µm wavelength range have recently been achieved using additional epitaxial overgrowth of InP top cladding layers [12], providing continuous wave operation with thermoelectric temperature control. A solution for non-cryogenic laser spectroscopy is the application of very short laser pulses in the range of 5 to 50 ns at low duty cycle with acceptable linewidths in the range of 250 MHz [4]. For practical purposes, room-temperature pulsed operation has a clear advantage over low-temperature continuous wave operation since it eliminates the need for cryogenic cooling. Because of the low average power dissipation at low duty cycle, the ternary alloys GaInAs and AlInAs can also be employed as mid-infrared waveguide cladding layers to achieve high-temperature pulsed operation, despite their considerably larger thermal resistivity compared to the binary InP. In fact, the highest operating temperatures in pulsed mode have been achieved with GaInAs waveguides without additional overgrowth of InP [13]. In this contribution, we demonstrate high-performance pulsed operation of InP-based strain-compensated quantum cascade lasers in the 5 µm wavelength range employing GaInAs cladding layers. The presented waveguides exhibit low optical losses due to the precise control of the doping concentration, which substantially reduces heat generation. Furthermore, the presented devices show a large wavelength tuning range for important spectroscopic applications such as environmental monitoring or medical diagnostics.
2 Free Carrier Plasma Effect
Stable modes in mid-infrared waveguides can be obtained by exploiting the refractive index change introduced by the free carrier plasma effect. The complex dielectric constant is calculated from the classical Drude model by
$$(n - i\kappa)^2 = \epsilon = \epsilon_1 - i\epsilon_2 = \epsilon_\infty \left[ 1 - \frac{\tilde{\omega}_p^2}{\omega^2 \eta} \left( 1 + \frac{i}{\omega\tau} \right) \right] \qquad (1)$$
where $\tilde{\omega}_p^2 = (n_c e^2)/(\epsilon_0 m_e \epsilon_\infty)$ and $\eta = 1 + 1/(\omega\tau)^2$. The free carrier concentration is denoted by $n_c$ and the dielectric constant by $\epsilon_\infty$. The scattering time $\tau = \mu_e m_e/e$ is obtained from the electron mobility $\mu_e$ and the energy-dependent electron effective mass $m_e$ [14]. The absorption loss coefficient $\alpha = 2 k_0 \kappa = k_0 \epsilon_2/n$ is then calculated from the real part n and the imaginary part κ of the complex refractive index. The calculation of guided modes in mid-infrared waveguides requires the exact knowledge of the complex dielectric constant, which is readily achieved by an accurate determination of the temperature dependent electron mobility
Prospects of Quantum Cascade Lasers with GaInAs Waveguides
µe(T) as demonstrated previously [15]. The complex dielectric constants of the materials InP, GaInAs and AlInAs are displayed in Fig. 1 as a function of the free carrier concentration nc at a wavelength of 5 µm and at room temperature. In the range of low free carrier concentrations the real part n of the refractive index of InP and GaInAs is 3.09 and 3.39, respectively, and the highest absorption loss is calculated for AlInAs. At a free carrier concentration of nc = 6 × 10^16 cm−3, for example, the absorption loss α of AlInAs, GaInAs and InP is 3.9, 1.7 and 1.2 cm−1, respectively. A sizeable refractive index change is achieved only at high concentrations in the range of nc = 1 × 10^19 cm−3 and is most pronounced in GaInAs. In GaInAs, a concentration of nc = 1.8 × 10^19 cm−3 is required to obtain a refractive index of 1.5 with a resulting absorption loss of α = 4700 cm−1. The respective absorption loss in InP is α = 9900 cm−1 at a concentration of nc = 2.7 × 10^19 cm−3. The lowest internal loss in mid-infrared waveguides is therefore expected with GaInAs as cladding material. Another advantage of GaInAs is the precise control of the carrier concentration in a molecular beam epitaxy system. We therefore investigate the laser characteristics of quantum cascade lasers in the strain-compensated system Ga0.4In0.6As-Al0.56In0.44As on InP substrates in the 5 µm wavelength range with GaInAs-based waveguides.
[Fig. 1 plot. Axes: refractive index n and absorption loss α (cm−1, 10^0 to 10^3) versus free carrier concentration nc (cm−3, 1E17 to 1E19). Curves: InP, GaInAs, AlInAs. λ = 5 µm, 300 K.]
Fig. 1. Real part n of the complex refractive index and absorption coefficient α as a function of free carrier concentration for InP, GaInAs and AlInAs at a wavelength of 5 µm and a temperature of 300 K
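The Drude expression (1) and the loss formula α = 2k0κ can be evaluated numerically. The following Python sketch does this under stated assumptions: the function name `drude_index` and the example values of ε∞, effective mass and mobility are illustrative placeholders, not the parameters used by the authors, who take an energy-dependent effective mass and a temperature-dependent mobility from [14,15].

```python
import math

# Physical constants (SI units).
E_CHARGE = 1.602176634e-19   # elementary charge (C)
EPS_0 = 8.8541878128e-12     # vacuum permittivity (F/m)
M_0 = 9.1093837015e-31       # free electron mass (kg)
C_LIGHT = 2.99792458e8       # speed of light (m/s)


def drude_index(n_c_cm3, wavelength_um, eps_inf, m_eff_rel, mobility_cm2_vs):
    """Return (n, kappa, alpha in cm^-1) from the Drude model of Eq. (1).

    All material parameters are supplied by the caller; nothing here is
    taken from the paper's data.
    """
    omega = 2.0 * math.pi * C_LIGHT / (wavelength_um * 1e-6)
    m_e = m_eff_rel * M_0
    n_c = n_c_cm3 * 1e6                                # cm^-3 -> m^-3
    tau = (mobility_cm2_vs * 1e-4) * m_e / E_CHARGE    # tau = mu_e * m_e / e
    wp2 = n_c * E_CHARGE ** 2 / (EPS_0 * m_e * eps_inf)
    eta = 1.0 + 1.0 / (omega * tau) ** 2
    x = wp2 / (omega ** 2 * eta)
    eps1 = eps_inf * (1.0 - x)                         # real part
    eps2 = eps_inf * x / (omega * tau)                 # imaginary part
    # (n - i*kappa)^2 = eps1 - i*eps2  =>  n^2 - kappa^2 = eps1, 2*n*kappa = eps2
    mod = math.hypot(eps1, eps2)
    n = math.sqrt(max(0.0, (mod + eps1) / 2.0))
    kappa = math.sqrt(max(0.0, (mod - eps1) / 2.0))
    k0 = 2.0 * math.pi / (wavelength_um * 1e-4)        # vacuum wavenumber (cm^-1)
    alpha = 2.0 * k0 * kappa
    return n, kappa, alpha


# Illustrative (assumed) parameters: eps_inf = 11, m_e = 0.05 m_0, mu_e = 2000 cm^2/Vs.
for n_c in (1e17, 1e18, 1e19):
    n, kappa, alpha = drude_index(n_c, 5.0, 11.0, 0.05, 2000.0)
    print(f"n_c = {n_c:.0e} cm^-3: n = {n:.3f}, alpha = {alpha:.1f} cm^-1")
```

With these placeholder inputs the sketch reproduces the qualitative trend of Fig. 1: the refractive index falls and the absorption loss rises as the free carrier concentration increases.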
3
Laser Structure
The active region is based on the strain-compensated system Ga0.4In0.6As-Al0.56In0.44As with a conduction band discontinuity of 630 meV and a doping concentration of nc = 3.6 × 10^11 cm−2 in the injection superlattice. The detailed layer sequence of one 51.0 nm long period of the active region is given in [8]. The bandstructure comprises a vertical laser transition with a transition energy of 239 meV (λ = 5.2 µm) and a two-LO-phonon resonant electron extraction.
Nicolaus Ulbrich et al.
We have grown three samples by molecular beam epitaxy on an n-type InP substrate (Sn-doped, 2 × 10^17 cm−3) with different numbers of periods in the active region and different GaInAs-based waveguide designs. The refractive index profile and the mode-intensity profile of sample A are displayed in Fig. 2. The growth sequence starts with a 1.35 µm thick low-doped and lattice-matched GaInAs layer (Si-doped, 6 × 10^16 cm−3) followed by the growth of the 1.44 µm thick strain-compensated active region comprising 28 periods and the completion of the waveguide core with a low-doped GaInAs layer (1.7 µm, 6 × 10^16 cm−3). The growth is then terminated with a 1.2 µm thick GaInAs cladding layer (Si-doped, 2 × 10^19 cm−3). The effective refractive index of the waveguide is neff = 3.36 with a confinement factor of Γ = 52.7 % in the active region. At a temperature of 77 K we calculate a waveguide loss of αw = 4.4 cm−1, of which 4.0 cm−1 is generated in the 1.2 µm thick highly-doped cladding layer. At 300 K we calculate an increased waveguide loss of αw = 10 cm−1 due to the decreased electron mobility, mainly in the highly-doped cladding layer.
Figure 3 shows the refractive index profile and the mode-intensity profile of the waveguide of sample B with the same number of periods in the active region as in sample A. The growth sequence starts with a 1.4 µm thick low-doped and lattice-matched GaInAs layer (Si-doped, 9 × 10^16 cm−3) followed by the growth of the 1.44 µm thick active region comprising 28 periods. The penetration of the mode into the 1.2 µm thick GaInAs cladding layer (Si-doped, 2 × 10^19 cm−3) is reduced due to the refractive index change introduced by an additional 0.5 µm thick GaInAs layer with an intermediate doping concentration (Si, 7 × 10^18 cm−3). On the substrate side, the refractive index change arises from the low-doped InP substrate.
The resulting effective refractive index is neff = 3.36 with a confinement factor of Γ = 53.1 % in the active region, which is about the same as in sample A. At a temperature of 77 K we calculate a slightly reduced waveguide loss of αw = 4.0 cm−1 compared to the respective value for sample A. The absorption loss in the highly doped cladding layer has been reduced to 3.6 cm−1 with a negligible
[Fig. 2 plot. Sample A: refractive index n (0 to 3.5) and mode intensity (a.u.) versus distance (0 to 6 µm), with the active region (Γ = 52.7 %) and the plasmon layer marked.]
Fig. 2. Profile of the real part n of the refractive index (black line) and resulting mode-intensity profile (gray line) of the laser waveguide of sample A with 28 periods in the active region at a wavelength of 5.4 µm and a temperature of 300 K
[Fig. 3 plot. Sample B: refractive index n (0 to 3.5) and mode intensity (a.u.) versus distance (0 to 6 µm), with the active region (Γ = 53.1 %) and the plasmon layer marked.]
Fig. 3. Profile of the real part n of the refractive index (black line) and resulting mode-intensity profile (gray line) of the laser waveguide of sample B with 28 periods in the active region at a wavelength of 5.4 µm and a temperature of 300 K
loss in the additional 0.5 µm thick GaInAs layer. At 300 K the calculated waveguide loss amounts to αw = 9.3 cm−1.
Sample C comprises a reduced number of 14 periods in the active region. The growth starts with a sequence of low-doped GaInAs waveguide core layers (0.5 µm, 9 × 10^16 cm−3; 1.0 µm, 7.5 × 10^16 cm−3; 0.5 µm, 6 × 10^16 cm−3) and is followed by the growth of the strain-compensated active region with a total thickness of 0.77 µm (14 periods). A thick low-doped GaInAs waveguide core layer (2.35 µm, 6 × 10^16 cm−3) and a highly doped GaInAs cladding layer (1.2 µm, 1.8 × 10^19 cm−3) terminate the growth sequence. We calculate an effective refractive index of neff = 3.37 and an accordingly reduced confinement factor in the active region of Γ = 25.1 % compared to samples A and B with 28 periods. At 77 K the waveguide loss amounts to αw = 3.3 cm−1, of which 2.1 cm−1 arises from the highly doped cladding layer. The respective value at 300 K is αw = 7.3 cm−1.
The lasers were processed into ridge waveguides of various widths from 22 to 30 µm by optical contact lithography and deep wet chemical etching in a H3PO4:H2O2:H2O solution. Electrical insulation is provided by a 250 nm thick SiO2 passivation layer. Ti/Pt/Au (20/20/300 nm) top contacts were thermally evaporated, followed by Ge/Au/Ni/Au (13/33/10/200 nm) back contacts after thinning the substrate to 120 µm. The lasers were then cleaved into 0.3 to 3 mm long bars. All devices were soldered epilayer up with a Sn-Au eutectic alloy to a copper heatsink and placed into a liquid nitrogen cooled cold finger cryostat providing constant temperatures from 77 to 500 K. The light is collimated and focused by f/1 Au-coated parabolic mirrors and detected with a calibrated liquid nitrogen-cooled HgCdTe detector. For spectral measurements the light is focused by an f/2.6 Au-coated parabolic mirror into the 100 µm wide entrance slit of a grating spectrometer with a focal length of 260 mm and 150 lines/mm.
4
Laser Characteristics and Discussion
The light output-current characteristics at various heat sink temperatures and the current-voltage characteristic at 300 K of a representative 2.8 mm long and 22 µm wide device of sample A are displayed in Fig. 4. The emission is in the 5.5 µm wavelength range. The lasers were operated at low duty cycle with 250 ns long current pulses at a repetition frequency of 250 Hz. The observed threshold current density at 300 K is 4.9 kA/cm2 with an output power of 300 mW per facet and per pulse. The onset voltage at 300 K is 5.3 V and the voltage of 8.3 V at laser threshold corresponds to an electric field of 58 kV/cm. The maximum achieved operating temperature is 440 K. Figure 5 shows a plot of the threshold current density as a function of reciprocal cavity length of 30 µm wide devices of sample A at 77 and 300 K. The waveguide loss αw and the modal gain coefficient gΓ are determined from the threshold condition

$$J_{th} = \frac{\alpha_m + \alpha_w}{g\Gamma} \qquad (2)$$

where the mirror loss αm = −ln(R1R2)/(2L) is calculated from the cavity length L and the reflectivities R1 = R2 = 27 % of the uncoated laser facets.
[Fig. 4 plot. Power (mW) and voltage (V) versus current (A) for L = 2.8 mm, W = 22 µm, λ = 5.5 µm at heat sink temperatures of 300, 350, 400, 420 and 440 K.]
Fig. 4. Light output-current characteristics at various heat sink temperatures (black lines) and current-voltage characteristic at 300 K (gray line) of a 2.8 mm long and 22 µm wide device of sample A
[Fig. 5 plot. Sample A: threshold current density (kA/cm2) versus 1/L (cm−1). 300 K: αw = 13 cm−1, gΓ = 3.6 cm/kA. 77 K: αw = 5.9 cm−1, gΓ = 8.0 cm/kA.]
Fig. 5. Threshold current density as a function of reciprocal cavity length of 30 µm wide devices of sample A with 28 periods in the active region at temperatures of 77 K (black symbols) and 300 K (gray symbols)
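Because the mirror loss is proportional to 1/L, the threshold condition (2) is a straight line in a plot of Jth against 1/L: the intercept is αw/(gΓ) and the slope is −ln(R1R2)/(2gΓ). The Python sketch below illustrates how αw and gΓ could be extracted from such a plot. The function names are hypothetical, and the data points are synthetic values generated from Eq. (2) with the 77 K parameters of sample A; they are not the measured points of Fig. 5.

```python
import math


def mirror_loss(length_cm, r1, r2):
    """Mirror loss alpha_m = -ln(R1*R2) / (2*L) in cm^-1."""
    return -math.log(r1 * r2) / (2.0 * length_cm)


def threshold_current_density(length_cm, alpha_w, g_gamma, r1=0.27, r2=0.27):
    """Eq. (2): J_th = (alpha_m + alpha_w) / (g*Gamma), in kA/cm^2."""
    return (mirror_loss(length_cm, r1, r2) + alpha_w) / g_gamma


def fit_loss_and_gain(inv_lengths, j_th, r1=0.27, r2=0.27):
    """Least-squares line J_th = a + b/L; returns (alpha_w, g_gamma)."""
    n = len(inv_lengths)
    mx = sum(inv_lengths) / n
    my = sum(j_th) / n
    sxx = sum((x - mx) ** 2 for x in inv_lengths)
    sxy = sum((x - mx) * (y - my) for x, y in zip(inv_lengths, j_th))
    slope = sxy / sxx
    intercept = my - slope * mx
    g_gamma = -0.5 * math.log(r1 * r2) / slope   # slope = -ln(R1*R2) / (2*g*Gamma)
    alpha_w = intercept * g_gamma                # intercept = alpha_w / (g*Gamma)
    return alpha_w, g_gamma


# Synthetic data (assumption): cavity lengths in cm, alpha_w = 5.9 cm^-1, g*Gamma = 8.0 cm/kA.
lengths_cm = [0.05, 0.10, 0.20, 0.30]
inv_l = [1.0 / length for length in lengths_cm]
j_values = [threshold_current_density(length, 5.9, 8.0) for length in lengths_cm]

print(mirror_loss(0.25, 0.27, 0.27))      # uncoated 2.5 mm cavity, about 5.24 cm^-1
print(fit_loss_and_gain(inv_l, j_values))
```

With noise-free synthetic input the fit returns the parameters it was generated from; with measured data the scatter of the points determines the uncertainty of αw and gΓ.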
To determine the threshold current density for a particular cavity length, the laser facets were thoroughly investigated with a microscope and typically three to five devices of the same cavity length were measured to obtain reliable results. The determined waveguide loss at 77 K is αw = 5.9 cm−1, which is larger than the calculated value of 4.4 cm−1. This might be due to slight deviations of the free carrier concentration in the GaInAs cladding layer. The modal gain coefficient was determined to be 8.0 cm/kA. At a temperature of 300 K we observed an increased waveguide loss of αw = 13 cm−1 and a reduced modal gain coefficient of 3.6 cm/kA. The observed increase of the waveguide loss is attributed to the reduced electron mobility at higher temperatures, mainly in the highly doped GaInAs cladding layer.
A plot of the threshold current density of 30 µm wide devices of sample C with a reduced number of periods in the active region is shown in Fig. 6 as a function of reciprocal cavity length at 77 and 300 K. Laser bars have been cleaved with cavity lengths in the range between 0.38 and 3.6 mm. The facets have been thoroughly investigated with a light microscope and typically three to five devices with the same cavity length were measured to obtain reliable results. The waveguide loss at 77 K is αw = 3.1 cm−1, which is in good agreement with the calculated value of 3.3 cm−1. At a temperature of 300 K we have observed an increased waveguide loss of αw = 7.1 cm−1, which is due to the reduced electron mobility at higher temperature. The observed waveguide loss is in good agreement with the calculations at 300 K (7.3 cm−1) based on the temperature dependent electron mobility. The modal gain coefficient is 3.8 cm/kA at cryogenic temperatures and decreases to a value of 1.7 cm/kA at room temperature. At a temperature of 77 K we observe gain saturation for cavity lengths in the range below 310 µm due to a high mirror loss of αm = 42 cm−1.
At 300 K gain saturation is already observed for cavity lengths in the range below 800 µm (αm = 16 cm−1 ). The observed ratio gΓA /gΓC = 2.11 of the modal gain coefficients of sample A and sample C is in good agreement with the calculated ratio of the respective confinement factors of 52.7 %/25.1 % = 2.10. This emphasizes
[Fig. 6 plot. Sample C: threshold current density (kA/cm2) versus 1/L (cm−1). 300 K: αw = 7.1 cm−1, gΓ = 1.7 cm/kA. 77 K: αw = 3.1 cm−1, gΓ = 3.8 cm/kA.]
Fig. 6. Threshold current density as a function of reciprocal cavity length of 30 µm wide devices of sample C with 14 periods in the active region at temperatures of 77 K (black symbols) and 300 K (gray symbols)
the accuracy of the calculated and experimentally determined confinement factors and modal gain coefficients. Figure 7 shows a plot of the threshold current densities of samples A, B and C at various heat sink temperatures for representative 2.5 mm long devices with ridge widths of 30 µm. The devices were operated with 250 ns long current pulses at a pulse repetition frequency of 250 Hz. At a temperature of 77 K the threshold current density of sample A is 1.4 kA/cm2. The respective value of sample B is 1.0 kA/cm2, which is in agreement with the calculated reduction of the internal loss of waveguide B compared to the waveguide of sample A. The threshold current density of sample C is 2.4 kA/cm2, which is higher than the respective values of samples A and B in accordance with the reduced number of periods in the active region. At 300 K the threshold current densities of samples A, B and C are 5.3, 4.2 and 7.3 kA/cm2, respectively. The temperature dependence of the threshold current density can be described by the phenomenological relation

$$J_{th}(T) = J_0 \exp\left( \frac{T}{T_0} \right) \qquad (3)$$
with the parameter T0 describing the exponential increase of the laser threshold. In the temperature range between 77 and 300 K the observed values of samples A and B are T0 = 170 K and T0 = 150 K, respectively. In the temperature range between 300 and 440 K the respective values are T0 = 200 K and T0 = 210 K. Figure 8 (a) shows the emission spectra of a representative 2.5 mm long and 30 µm wide device of sample A at heat sink temperatures of 77, 150, 200 and 300 K. The device was operated with 250 ns long current pulses at a repetition frequency of 1 kHz. At a temperature of 77 K the emission spectrum is peaked at a wavelength of 5.37 µm which corresponds to a transition energy of 230.8 meV. The large linewidth of 0.47 meV at 77 K is due to the application of long current pulses and the resulting heating of the device which
[Fig. 7 plot. Threshold current density (kA/cm2) versus temperature (50 to 450 K) for samples A, B and C; L = 2.5 mm, W = 30 µm. Fitted T0 values of 150, 170, 200 and 210 K are marked.]
Fig. 7. Threshold current density as a function of heat sink temperature of 2.5 mm long and 30 µm wide devices of samples A, B and C
is considerable even within the first 100 ns [16]. At higher temperatures the linewidth increases to a value of 0.85 meV at 300 K. Furthermore, the emission wavelength strongly increases to 5.58 µm (222.1 meV) at a temperature of 300 K. Figure 8 (b) shows the peak emission wavelengths of 2.5 mm long and 30 µm wide devices of samples A and B as a function of heat sink temperature. The lasing mode of sample A tunes linearly with temperature from 5.40 µm at 77 K to 5.65 µm at 350 K. The obtained temperature tuning rate is therefore 0.91 nm/K. The lasing mode of sample B tunes linearly with temperature from 5.37 µm at 77 K to 5.64 µm at 350 K with a temperature tuning rate of 0.97 nm/K. A reduction of the mirror loss αm = − ln(R1 R2 )/(2L) can be achieved using high-reflection coating for mid-infrared wavelengths [13]. Figure 9 shows a schematic diagram of a laser cavity with high-reflection coating for a wavelength of 5.6 µm. The left facet is coated with a single λ/4-stack of Al2 O3 and Au with a reflectivity of R1 = 99.6 %. The light is coupled out through the right facet which is coated with a single λ/4-stack of Al2 O3 and a-Si giving a reflectivity of R2 = 80.2 %. The resulting mirror loss that can be achieved with a 2.1 mm long laser is αm = 0.53 cm−1 compared to αm = 5.2 cm−1 of an uncoated 2.5 mm long device. We have observed a corresponding reduction of the threshold current density to 0.47 kA/cm2 at 77 K of a 2.1 mm long and 30 µm wide high-reflection coated device of sample B compared to the respective value of 1.0 kA/cm2 of a 2.5 mm long and 30 µm wide device of the same sample. At a temperature of 300 K the respective reduction is from 4.2 kA/cm2 to 3.3 kA/cm2 .
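The phenomenological relation (3) can be inverted for T0 when the threshold current density is known at two temperatures. The Python sketch below does this for the 77 K and 300 K values quoted in the text for samples A and B. The function names are hypothetical, and a two-point estimate is only a rough check: the T0 values reported in the text presumably come from a fit over the full temperature range, so small differences are expected.

```python
import math


def characteristic_temperature(t1, j1, t2, j2):
    """Solve J_th(T) = J0 * exp(T / T0) for T0 from two (T, J_th) points."""
    return (t2 - t1) / math.log(j2 / j1)


def prefactor(t, j, t0):
    """Return J0 from one (T, J_th) point and a known T0."""
    return j / math.exp(t / t0)


# Threshold current densities (kA/cm^2) at 77 K and 300 K as quoted in the text.
t0_a = characteristic_temperature(77.0, 1.4, 300.0, 5.3)   # sample A
t0_b = characteristic_temperature(77.0, 1.0, 300.0, 4.2)   # sample B

print(f"sample A: T0 = {t0_a:.1f} K, J0 = {prefactor(77.0, 1.4, t0_a):.2f} kA/cm^2")
print(f"sample B: T0 = {t0_b:.1f} K, J0 = {prefactor(77.0, 1.0, t0_b):.2f} kA/cm^2")
```

The two-point estimates, about 168 K for sample A and about 155 K for sample B, are close to the reported values of 170 K and 150 K for the range between 77 and 300 K.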
[Fig. 8 plot. (a) Normalized light output versus wavelength (5.2 to 5.8 µm), with spectra labelled 77 K and 300 K. (b) Peak wavelength (µm) versus temperature (K) for sample A (0.91 nm/K) and sample B (0.97 nm/K).]
Fig. 8. (a) Pulsed emission spectra at various heat sink temperatures of a 2.5 mm long and 30 µm wide device of sample A. (b) Peak emission wavelength as a function of heat sink temperature of 2.5 mm long and 30 µm wide devices of sample A (gray symbols) and sample B (black symbols)
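The transition energies and tuning rates quoted for Fig. 8 follow from two elementary conversions, E = hc/λ and Δλ/ΔT. The short Python sketch below reproduces them from the wavelengths and temperatures given in the text. The function names are hypothetical, and the endpoint-based tuning rate is only an approximation to a linear fit through all measured points.

```python
H_C_EV_UM = 1.239841984  # h*c in eV*um


def photon_energy_mev(wavelength_um):
    """Photon energy in meV for a vacuum wavelength in micrometres."""
    return 1000.0 * H_C_EV_UM / wavelength_um


def tuning_rate_nm_per_k(lam1_um, t1, lam2_um, t2):
    """Average wavelength tuning rate in nm/K between two temperatures."""
    return 1000.0 * (lam2_um - lam1_um) / (t2 - t1)


print(photon_energy_mev(5.37))                          # emission peak at 77 K
print(photon_energy_mev(5.58))                          # emission peak at 300 K
print(tuning_rate_nm_per_k(5.40, 77.0, 5.65, 350.0))    # sample A
print(tuning_rate_nm_per_k(5.37, 77.0, 5.64, 350.0))    # sample B
```

The energies come out near 230.9 and 222.2 meV, and the endpoint tuning rates near 0.92 and 0.99 nm/K, in line with the quoted 230.8 meV, 222.1 meV, 0.91 nm/K and 0.97 nm/K to within rounding and the difference between an endpoint estimate and a fit.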
[Fig. 9 schematic. Laser cavity with an Al2O3/Au coating on the left facet (R1 = 99.6 %) and an Al2O3/a-Si coating on the right facet (R2 = 80.2 %).]
Fig. 9. Schematic diagram of laser cavity with high-reflection coating comprising a single λ/4-stack of Al2 O3 and Au on the left side and of Al2 O3 and a-Si on the right side
Figure 10 shows the light output-current characteristics at various heat sink temperatures and the current-voltage characteristic at 470 K of a 2.1 mm long and 30 µm wide high-reflection coated device of sample B. The operating temperature could be increased to 470 K compared to 440 K of a 2.5 mm long and 30 µm wide uncoated device of the same sample. The threshold current density at 400 K is 5.5 kA/cm2, increasing to 7.5 kA/cm2 at 470 K. The corresponding onset voltage at 470 K is 5.0 V and the voltage at threshold is 11.3 V.
[Fig. 10 plot. Power (mW) and voltage (V) versus current (A) for L = 2.1 mm, W = 30 µm at heat sink temperatures of 400, 440, 460 and 470 K.]
Fig. 10. Light output-current characteristics at various heat sink temperatures (black lines) and current-voltage characteristic at 470 K (gray line) of a 2.1 mm long and 30 µm wide high-reflection coated device of sample B
5
Conclusions
In this paper, we discussed the advantages of GaInAs cladding layers over InP and AlInAs for the fabrication of high-performance InP-based quantum cascade lasers in the mid-infrared wavelength range around 5 µm. The waveguide loss can well be engineered in the 5 µm wavelength range due to the fine control of the doping concentration achievable in GaInAs with molecular beam epitaxy. We showed that very low optical losses can be achieved with GaInAs-based waveguides, and high-temperature operation can be obtained with short current pulses at low duty cycle. The large temperature tuning rate of 0.97 nm/K is advantageous for chemical sensing applications with short
laser pulses. Furthermore, the growth of GaInAs-based waveguides requires no additional over-growth with InP top cladding layers. In the wavelength range around 10 µm we have not yet reached low optical losses with GaInAs-based waveguides [17], possibly due to large deviations of the optical loss as a result of slight deviations of the doping concentration at longer wavelengths.
References
1. F. Capasso, R. Paiella, R. Martini, R. Colombelli, C. Gmachl, T. L. Myers, M. S. Taubmann, R. M. Williams, C. G. Bethea, K. Unterrainer, H. Y. Hwang, D. L. Sivco, A. Y. Cho, A. M. Sergent, H. C. Liu, and E. A. Whittaker, IEEE J. Quantum Electron. 38, 511 (2001).
2. A. Fried, B. Henry, B. Wert, S. Sewell, and J. R. Drumming, Appl. Phys. B 67, 317 (1998).
3. D. Richter, D. G. Lancaster, and F. K. Tittel, Appl. Opt. 39, 4444 (2000).
4. A. A. Kosterev and F. K. Tittel, IEEE J. Quantum Electron. 38, 582 (2001).
5. R. Köhler, A. Tredicucci, F. Beltram, H. E. Beere, E. H. Linfield, A. G. Davies, D. A. Ritchie, R. C. Iotti, F. Rossi, Nature 417, 156 (2002); M. Rochat, L. Ajili, H. Willenberg, J. Faist, H. Beere, A. G. Davies, E. H. Linfield, D. Ritchie, Appl. Phys. Lett. 81, 1381 (2002).
6. F. Capasso, C. Gmachl, R. Paiella, A. Tredicucci, A. L. Hutchinson, D. L. Sivco, J. N. Baillargeon, and A. Y. Cho, IEEE Select. Topics Quantum Electron. 6, 931 (2000).
7. J. Faist, A. Tredicucci, F. Capasso, C. Sirtori, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, and A. Y. Cho, IEEE J. Quantum Electron. 34, 336 (1998).
8. D. Hofstetter, M. Beck, T. Aellen, and J. Faist, Appl. Phys. Lett. 78, 396 (2001).
9. C. Sirtori, H. Page, C. Becker, and V. Ortiz, IEEE J. Quantum Electron. 38, 547 (2001).
10. R. Q. Yang, J. L. Bradshaw, J. D. Bruno, J. T. Pham, and D. E. Wortman, IEEE J. Quantum Electron. 38, 559 (2001).
11. A. A. Kosterev, A. A. Malinovsky, F. K. Tittel, C. Gmachl, F. Capasso, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, and A. Y. Cho, Appl. Opt. 40, 5522 (2001); H. Ganser, B. Frech, A. Jentsch, M. Muertz, C. Gmachl, F. Capasso, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, A. Y. Cho, and W. Urban, Opt. Commun. 197, 127 (2001); R. M. Williams, J. F. Kelly, and J. S. Hartman, Opt. Lett. 24, 1844 (1999).
12. M. Beck, D. Hofstetter, T. Aellen, J. Faist, U. Oesterle, M. Ilegems, E. Gini, H. Melchior, Science 295, 301 (2002).
13. N. Ulbrich, G. Scarpa, A. Sigl, J. Roßkopf, G. Böhm, G. Abstreiter, and M.-C. Amann, Electron. Lett. 37, 1341 (2001); G. Scarpa, N. Ulbrich, J. Rosskopf, A. Sigl, G. Böhm, G. Abstreiter, and M.-C. Amann, IEE Proc. Optoelectron. 149, 201 (2002).
14. B. Jensen: Handbook of Optical Constants of Solids (Academic Press, San Diego 1985).
15. G. Scarpa, N. Ulbrich, A. Sigl, M. Bichler, D. Schuh, M.-C. Amann, G. Abstreiter, Physica E 13, 844 (2002).
16. J. Faist, C. Gmachl, F. Capasso, C. Sirtori, D. Sivco, J. Baillargeon, and A. Y. Cho, Appl. Phys. Lett. 70, 2670 (1997).
17. N. Ulbrich, G. Scarpa, G. Böhm, G. Abstreiter, and M.-C. Amann, Appl. Phys. Lett. 80, 4312 (2002).
Optics and Transport in Conjugated Polymer Crystals: Interchain Interaction Effects
Giovanni Bussi(1,2), Andrea Ferretti(1,2), Alice Ruini(1,2), Marilia J. Caldas(3,1), and Elisa Molinari(1,2)
(1) INFM National Research Center on nanoStructures and bioSystems at Surfaces (S3)
(2) Dipartimento di Fisica, Università di Modena e Reggio Emilia, Modena, Italy
(3) Instituto de Física, Universidade de São Paulo, São Paulo, Brazil
Abstract. We investigate the fundamental properties of conjugated-polymer semiconductors from the novel viewpoint of solid-state ab initio approaches, that are appropriate for extended and crystalline systems. The impact of interchain interactions on optics and transport of these materials is analyzed by developing computational schemes for transfer integrals and exciton states. We focus on a prototype polymer of great interest for optoelectronics, poly-para-phenylenevinylene (PPV), and compare different solid-state packings, where the character of interactions ranges from quasi-one-dimensional to quasi-three-dimensional. Interchain coupling is found to control light emission and charge conduction, and can thus be used as a tunable parameter for the design of devices based on organic materials.
1
Introduction
Since the discovery that conjugated organic polymers can be used as the active component in light-emitting diodes (LED) [1], much effort has been devoted to the characterization of their photophysical properties, both through experimental and theoretical tools. For the design of efficient light emission devices, the conjugated polymer material has to show high mobility and high photoluminescence quantum yield (PL quantum efficiency) in the solid state. Improving the performance of organic-based devices by designing new structures and molecules demands a detailed understanding of transport and optical processes taking place in the device, which can ultimately be achieved only through a fundamental understanding of the electronic structure of the composite system: in particular, of the electronic structure of the polymer film itself. In spite of extensive research, basic quantities central to the performance of devices, such as characterization of the excitonic spectrum and magnitude of the exciton binding energy, and details of charge carrier injection into and transport through films, are poorly understood. We are still very far from the degree of understanding achieved for inorganic solids. The complexity of relaxation and polarization phenomena that take place in these systems complicates the assignment of the energy levels.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 313–326, 2003. © Springer-Verlag Berlin Heidelberg 2003
In covalently bonded inorganic semiconductors the charge carriers are delocalized due to strong interactions between neighboring atoms, the electronic properties can usually be described in terms of Bloch states within the single-electron approximation, and the optical properties can usually be understood within the same framework. Polarization energies are small, carrier screening is efficient and exciton binding energies are of the order of a few meV. The transport gap Eg (defined by the existence of free electron-hole pairs) is therefore very close to the onset of optical absorption, i.e. the optical gap Eopt, and quenching of luminescence (for optically active hosts) is caused by carrier capture or recombination at localized impurities or coordination defects. In the case of organic molecular solids, on the other hand, localization and polarization phenomena dominate the physics of excitation and transport: charges localize naturally on individual molecules, and these solids present low dielectric constants (about 3) and very inhomogeneous polarizabilities. The optical gap usually corresponds to the formation of a Frenkel exciton, with electron and hole on the same molecule, rather than a band-to-band transition. The case of π-conjugated polymer crystals, where chains form highly directional stacks with relatively small intermolecular spacing, departs somewhat from the purely molecular picture [2,3]. Strong coupling between π-orbitals along the chain long axis leads to anisotropic conductive and dielectric properties, which have been recognized for a long time. Certain packing geometries can produce efficient interchain interaction, leading to further anisotropy in the directions orthogonal to the chain axis, both in transport and optical properties [4,5].
Polymer crystals are thus intermediate materials which seem to bridge the gap between narrow-band molecular solids and delocalized wide-band semiconductors, and this unique position has motivated considerable interest in their basic electronic and optical properties. The influence of interchain interaction on optoelectronic characteristics is therefore of basic importance, also in order to rationalize the generally observed decrease in luminescence quantum efficiency when going from solutions to films [6,7]. It has long been assumed [8,9] that a band picture should be applicable to isolated polymer chains (due to the π-backbone) but that in the solid state interchain interactions should be negligible, as in a molecular-crystal picture. The lower emission quantum yield in the solid state has been associated with the presence of impurities [10,11], such as precursor molecules remaining in the final polymer, in analogy with inorganic semiconductors. Recently, as for molecular crystals, it has been ascribed to the presence of low-lying excited states with symmetry-forbidden radiative coupling to the ground state, resulting from the interchain interaction. However, most previous theoretical investigations on the photophysics of conjugated organic polymers have been carried out on isolated chains (which mimic the situation in inert matrices or dilute solutions), or on small clusters of oligomers [12], thereby failing to gauge crystalline symmetry effects.
In this context, full ab initio calculations for complete three-dimensional crystalline arrangements can help in providing structure-property relationships that are also useful for the engineering of materials with improved characteristics. We here focus on poly-para-phenylene vinylene (PPV), the most studied polymer for optoelectronic applications [1], and a prototype for these systems.
2
Systems
The chemical structure of the PPV infinite chain is represented in the upper panel of Fig. 1(a). All carbon atoms in the phenylene ring and in the vinylene group are in sp2 hybridization, giving rise to the typical planar structure [13] of these compounds, with π-electrons delocalized above and below the molecular plane. Due to the presence of the π-electrons these systems exhibit clear semiconducting behaviour.
[Fig. 1 schematic. Panels (a) SC, (b) πS, (c) HB, with axes x̂, ŷ, ẑ, lattice vectors P0, P1, P2 and setting angle φs indicated.]
Fig. 1. Structural details of different PPV packings. Black and white balls stand for carbon atoms in different domains while hydrogen atoms are described by dangling ticks. Note the dashed lines, which indicate the assumed periodicity and unit cells. (a) Single chain (SC): the chemical structure of PPV is shown in the upper panel; the monomer is composed of one phenylene and one vinylene unit; cell dimensions are a = 15.00 Å (x̂), b = 15.00 Å (ŷ), c = 6.65 Å (ẑ). (b) Displaced π-stack (πS): base-centered orthorhombic Bravais lattice, a = 15.00 Å (x̂), b = 7.12 Å (ŷ), c = 6.65 Å (ẑ), with one chain per unit cell, interchain distance in the stack direction 3.56 Å [14]. We show in the upper part the relative shift of the chains by half a unit cell in the ẑ direction. (c) Herringbone (HB): monoclinic unit cell, a = 8.07 Å (x̂), b = 6.05 Å, c = 6.54 Å (ẑ), angle between b and c of 123°, setting angle φs = 52° [15]. Cell dimensions a and b for SC and a for πS have been chosen in order to isolate chains in the respective directions
PPV offers the possibility to investigate the impact of crystalline aggregation on optical and transport properties, because films can be grown in very different packing structures, ranging from fully three-dimensional (3D) to effectively two-dimensional (2D) or one-dimensional (1D) aggregations. In fact, according to single-crystal X-ray data [15,16], unsubstituted PPV packs in a typical 3D herringbone (HB) configuration as shown in Fig. 1(c), but many derivatives showing totally different packings can be obtained by functionalization with long aliphatic side-chains. In particular, we will simulate the case of MEH-PPV [14], where polymer chains are stacked along a direction orthogonal to their planes, giving rise to a π-stack (πS) geometry with 2D character, as depicted in Fig. 1(b). Finally, we will also consider the isolated PPV single chain (SC), which can indeed be viewed as a 1D system with the highest degree of 1D confinement (∼ 2–5 Å). The isolated PPV chain serves to model the case of amorphous phases and dilute solutions [4], and also as a reference system to better understand the role of crystal packing.
3
Methods
One major problem when dealing with complex systems is the choice of a theoretical tool. In the case of organic conjugated systems, it is usual to work with semiempirical Hartree-Fock techniques, specifically designed to address their structural and optical properties [17]. However, this approach presents serious shortcomings when treating polymolecular systems, since the techniques were not originally parametrized for these situations [18]. To gain information about the microscopic properties of the crystalline systems, we adopt an ab initio approach based on density functional theory (DFT) [19,20,21], for ideal 3D periodic structures. Within this framework it is possible to calculate the electronic density and other ground-state properties with atomic-scale accuracy, without using semi-empirical parameters. Moreover, the periodic packing of the chains is already included and there is no need to simulate these crystals with large but finite ad hoc clusters. We work in the local density approximation (LDA), where the potential depends just locally on the charge density. LDA has known shortcomings, however, in obtaining geometries for these non-bonded crystals, so whenever possible we use experimental information for lattice constants. We thus obtain band structures and charge densities that can be used as input for the study of transport parameters and optical properties.
3.1
Transport Properties
The theoretical description of transport properties in these conjugated polymer materials presents several difficulties. In the ring-structured polymer chains composing the crystal, ring-torsion vibrational modes are very soft
[13], which implies a strong coupling between the electronic and ionic degrees of freedom even at relatively low temperatures, and a consequent description in terms of polarons, i.e. quasiparticles dressed with lattice rearrangements. The complexity of the relaxation and polarization phenomena taking place complicates the analysis of transport, but in the simplest picture two main regimes are identified: an incoherent hopping mechanism and a coherent band-like conduction [22], which becomes dominant at low temperatures for highly ordered samples. In both cases, key parameters [22,23,24,25] for the microscopic characterization of transport properties in organic conjugated materials are the interchain transfer integrals (TI), which reflect the ease of charge transfer between two weakly interacting chains. In fact, it has been shown that these quantities enter both the description of the hopping regime through Marcus theory [26,25] and that of band-like transport by means of the Landauer-Büttiker formalism [27,28,29]. We describe below our scheme for the ab initio calculation of such parameters for the case of polymer crystals [5].
3.1.1
Inverse Tight Binding Model
The tight binding (TB) formalism [30] allows one to write the hamiltonian of a complex system in terms of identifiable atomic-like components. We can also focus the formalism directly on relevant identifiable states of any given structure, which in the case of polymer crystals are single infinite polymer chains. In fact, a polymer crystal is composed of infinite chains (along ẑ) that generate a 2D lattice, defined by vectors {P}. The unit cell contains q inequivalent chains denoted by {τ}, and the isolated-chain eigenfunctions corresponding to the i-th inequivalent chain in the s-th lattice site are
\[ \langle \mathbf{r} | \phi^{i}_{mk,\mathbf{P}_s} \rangle = \phi^{i}_{mk}(\mathbf{r} - \boldsymbol{\tau}_i - \mathbf{P}_s) , \tag{1} \]
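where m and k stand for band index and wavevector. The crystal eigenvectors are expanded in terms of the isolated-chain ones as follows [5]
\[ |\psi_{l\mathbf{k}}\rangle = \sum_{s} e^{i\mathbf{k}\cdot\mathbf{P}_s} \sum_{m,i} C^{l}_{mi}\, |\phi^{i}_{mk_z,\mathbf{P}_s}\rangle , \tag{2} \]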
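where the Bloch sum over the $\mathbf{P}_s$ vectors ensures that the final state $|\psi_{l\mathbf{k}}\rangle$ has a well defined k-symmetry. The important feature of Eq. (2) is that we do not need to sum over different k-points of the isolated chain. Substituting Eq. (2) in the usual crystal hamiltonian, we obtain the master equation
\[ \sum_{n,j,s} C^{l}_{nj}\, e^{i\mathbf{k}\cdot\mathbf{P}_s} E^{k_z}_{mi,nj}(\mathbf{P}_s) = \epsilon_l(\mathbf{k}) \sum_{n,j,s} C^{l}_{nj}\, e^{i\mathbf{k}\cdot\mathbf{P}_s} \alpha^{k_z}_{mi,nj}(\mathbf{P}_s) \qquad \forall\, m, i \tag{3} \]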
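where the transfer integrals (TI) $E^{k_z}_{mi,nj}$ and overlap integrals (OI) $\alpha^{k_z}_{mi,nj}$ are
\[ E^{k_z}_{mi,nj}(\mathbf{P}_s) = \langle \phi^{i}_{mk_z,0} | H | \phi^{j}_{nk_z,\mathbf{P}_s} \rangle , \tag{4a} \]
\[ \alpha^{k_z}_{mi,nj}(\mathbf{P}_s) = \langle \phi^{i}_{mk_z,0} | \phi^{j}_{nk_z,\mathbf{P}_s} \rangle , \tag{4b} \]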
and H is the single-particle crystal hamiltonian. A typical approximation that we will also introduce in our calculations is the well known neglect of differential overlap (NDO), which implies
\[ \alpha^{k_z}_{mi,nj}(\mathbf{P}_s) = \delta_{mn}\, \delta_{ij}\, \delta_{\mathbf{P}_s,0} . \tag{5} \]
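With NDO, Eq. (5), the overlap sum on the right-hand side of Eq. (3) collapses onto the coefficient itself, so for one chain and one level per cell the band energy is simply a lattice Fourier series of the TIs. The sketch below, in Python with NumPy, builds such a band on a uniform k mesh and recovers the TIs by a discrete Fourier transform. It assumes a 1D lattice with unit spacing; the function names and the sample TI values (in meV) are hypothetical and serve only to illustrate the round trip.

```python
import numpy as np

def band_from_ti(ti, n_k):
    """Forward direction: eps(k_j) = sum_s exp(i k_j s) E(s)."""
    k = 2.0 * np.pi * np.arange(n_k) / n_k   # uniform mesh, lattice spacing 1
    eps = np.zeros(n_k, dtype=complex)
    for s, e in ti.items():                  # s is an integer lattice offset
        eps += e * np.exp(1j * k * s)
    return k, eps

def ti_from_band(eps):
    """Inverse direction: E(s) = (1/N) sum_j exp(-i k_j s) eps(k_j)."""
    # Entry s holds E(s); entry N - s holds E(-s).
    return np.fft.fft(eps) / len(eps)

# Hypothetical TIs in meV; E(-s) = E(s) keeps the band real.
sample_ti = {0: -50.0, 1: 100.0, -1: 100.0, 2: 10.0, -2: 10.0}
k_mesh, eps_k = band_from_ti(sample_ti, n_k=16)
recovered = ti_from_band(eps_k)
```

On a mesh of N points only offsets up to N/2 can be resolved; longer-range TIs would alias onto shorter ones, so the mesh has to be dense enough for the range of the coupling.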
In a standard TB scheme we seek the solution of the master equation to find the eigenvalues $\epsilon_l(\mathbf{k})$ from the TI's and OI's. Our aim is instead the calculation of the TI's starting from the full electronic band structure obtained from the ab initio DFT calculation; we will denote this scheme as inverse tight binding (ITB). This approach follows in part the framework outlined by Koster and Slater (KS) [30], where TI's are obtained by interpolating on a finite set of k-points, in order to obtain the full band structure in a following step within a standard TB spirit. In what follows, we describe the application of the ITB scheme to the polymer crystals in Fig. 1. There is an important simplification in this particular case, in that the relevant states (lowest conduction band and top of valence band) for the crystalline structures can be derived just from the corresponding bands of the isolated chains (see footnote 1). We first consider the case of one chain per unit cell (q = 1), as for the πS system, and one level per chain (p = 1): the sums over i and m and the coefficients $C^{l}_{nj}$ in Eq. (3) drop out, and the master equation is reduced to
\[ \sum_{s} e^{i\mathbf{k}\cdot\mathbf{P}_s} E^{k_z}_{m1,m1}(\mathbf{P}_s) = \epsilon_m(\mathbf{k}) , \tag{6} \]
where the index m is related to both crystal and isolated-chain bands (i.e. the m-th chain band generates the m-th crystal band). For the HB crystal with two inequivalent chains per cell (q = 2) new complications arise. Still considering one level per chain, Eq. (3) becomes a 2 × 2 eigenvalue problem, where the two eigenvectors for each k describe the band doublet generated by the presence of two chains in the unit cell, and to good approximation
\[ \sum_{s} e^{i\mathbf{k}\cdot\mathbf{P}_s} E^{k_z}_{m1,m1}(\mathbf{P}_s) = \frac{1}{2} \left[ \epsilon^{+}_{m}(\mathbf{k}) + \epsilon^{-}_{m}(\mathbf{k}) \right] \tag{7a} \]
\[ \left| \sum_{s} e^{i\mathbf{k}\cdot\mathbf{P}_s} E^{k_z}_{m1,m2}(\mathbf{P}_s) \right| = \frac{1}{2} \left[ \epsilon^{+}_{m}(\mathbf{k}) - \epsilon^{-}_{m}(\mathbf{k}) \right] . \tag{7b} \]
While Eqs. (6) and (7a) can be easily inverted by Fourier transforms, leading to expressions for the TI between equivalent chains in different sites (as in [5]), this is not the case for Eq. (7b), for which a further approximation has to be introduced. We will suppose that the TI's between inequivalent chains are negligible except for those referring to inequivalent nearest neighbours (INN) [focusing on the central chain (open circles) in the lower panel of Fig. 1(c), we consider the four INN chains (solid circles) around it]. Taking into account the full symmetry properties of the HB structure, the following expression holds for the TI of INN chains
\[ \left| E^{k_z}_{m1,m2} \right| = \frac{\epsilon^{+}_{m}(\mathbf{k}) - \epsilon^{-}_{m}(\mathbf{k})}{2\, \left| 1 + e^{-i\mathbf{k}\cdot\mathbf{P}_1} \right| \left| 1 + e^{-i\mathbf{k}\cdot\mathbf{P}_2} \right|} . \tag{8} \]
Footnote 1. We have performed numerical checks to verify that the last (last two) valence band in MEH-PPV (HB-PPV) indeed originates just from the HOMO band of the isolated chain.
3.2
Optical Properties
Optical excitation energies and excited eigenstates cannot be extracted directly from DFT: one first has to calculate electron (hole) addition energies and the corresponding eigenstates, and then obtain the correct combinations of these states that describe the bound electron-hole pairs (excitons). The first step can be performed through the Green's function formalism [31,32], which gives the effective single-particle hamiltonian acting on a quasiparticle (QP), whose energy is defined as the difference between the energy of a state with N+1 electrons (N−1 for holes) and the energy of the N-electron ground state. It has been shown that QP eigenstates are virtually the same as the original DFT eigenstates, but the energies are shifted by an approximately constant amount, that is, the energy shift does not vary much with k. A very common approximation is to consider a rigid shift of the conduction bands relative to the DFT values, called the scissor operator approximation. Neutral excitations can then be obtained as the eigenstates of a two-particle problem involving a quasi-electron and a quasi-hole. These two-particle states are usually referred to as excitons. The excitonic state $|\Psi_\mu\rangle$ can be obtained from the ground state $|\Psi_0\rangle$ by proper combinations of states where one electron is removed from a valence state and added to a conduction state (see footnote 2)
\[ |\Psi_\mu\rangle = \sum_{cv\mathbf{k}} A^{(\mu)}_{cv\mathbf{k}}\, \hat{\psi}^{\dagger}_{c\mathbf{k}} \hat{\psi}_{v\mathbf{k}} |\Psi_0\rangle . \tag{9} \]
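The effective two-particle hamiltonian can be obtained through the Green's function formalism and the Bethe-Salpeter equation (BSE) [33,34,35], and is made up of three terms
\[ H_{c'v'\mathbf{k}',cv\mathbf{k}} = K^{k}_{c'v'\mathbf{k}',cv\mathbf{k}} - K^{d}_{c'v'\mathbf{k}',cv\mathbf{k}} + 2\delta_S K^{x}_{c'v'\mathbf{k}',cv\mathbf{k}} \tag{10a} \]
\[ K^{k}_{c'v'\mathbf{k}',cv\mathbf{k}} = \delta_{cc'}\, \delta_{vv'}\, \delta_{\mathbf{k}\mathbf{k}'} \left( \epsilon^{QP}_{c\mathbf{k}} - \epsilon^{QP}_{v\mathbf{k}} \right) \tag{10b} \]
\[ K^{d}_{c'v'\mathbf{k}',cv\mathbf{k}} = -\langle \psi_{v\mathbf{k}} \psi_{c'\mathbf{k}'} | \hat{w}_{12} | \psi_{v'\mathbf{k}'} \psi_{c\mathbf{k}} \rangle \tag{10c} \]
\[ 2K^{x}_{c'v'\mathbf{k}',cv\mathbf{k}} = -\langle \psi_{v\mathbf{k}} \psi_{c'\mathbf{k}'} | \hat{v}_{12} | \psi_{c\mathbf{k}} \psi_{v'\mathbf{k}'} \rangle . \tag{10d} \]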
The first term ($K^{k}$) is called the kinetic kernel and brings the contribution of QP energies to the excitonic state, while the second and the third terms are respectively the Coulomb direct kernel ($K^{d}$) and the exchange kernel ($K^{x}$). These two non-diagonal kernels are obtained as matrix elements of the Coulomb potential [$v(12) = e^2/|\mathbf{r}_1 - \mathbf{r}_2|$] or of the screened Coulomb potential [$w(12) = \int d(3)\, \epsilon^{-1}(13)\, v(32)$]. The inverse dielectric response $\epsilon^{-1}$ can be calculated starting from QP states within the random phase approximation (RPA). Note that the exchange term is present only when the total spin S = 0, i.e. for the singlet states. In order to apply this scheme to realistic crystals, we have approximated the full RPA screening by the static dielectric tensor, that is, the anisotropy of the screening is preserved, but the spatial dependence (on r, r′) is neglected [4]. The system inhomogeneity is thus not well described: this is a good approximation for a 3D system, but a more drastic one in the case of 2D or 1D systems. To circumvent this problem, we use an effective volume enclosing the chain (or the stack of chains) and consider the screening to be negligible outside this region. This scheme leads to a renormalization of the dielectric constant, which becomes independent of the supercell dimensions. The eigenvalues of the effective two-particle hamiltonian give the excitation energies, while the eigenvectors provide information about the character of the excited states, i.e. the two-particle wavefunctions depending on both hole and electron positions:
\[ \Psi_\mu(\mathbf{r}_e, \mathbf{r}_h) = \sum_{cv\mathbf{k}} A^{(\mu)}_{cv\mathbf{k}}\, \psi_{c\mathbf{k}}(\mathbf{r}_e)\, \psi^{*}_{v\mathbf{k}}(\mathbf{r}_h) . \tag{11} \]
Footnote 2. To study optical excitations, only "vertical" transitions are included, so that the crystal momentum is conserved.
For a given excited state one can also calculate the oscillator strength, defined through the matrix element of the dipole operator. We can thus arrive at a first-principles simulation of the absorption spectrum for an infinite crystal.
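As a rough illustration of these last steps, the sketch below diagonalizes a small Hermitian matrix standing in for the two-particle hamiltonian, forms oscillator strengths from the eigenvectors, and builds a Lorentzian-broadened spectrum. The matrix entries, the dipole elements and the function name are hypothetical placeholders. In particular, the constant attractive coupling is an assumption made for illustration, not the kernel of Eq. (10), and the strength is taken as the squared modulus of the dipole matrix element, with the broadening read as a full width.

```python
import numpy as np

def absorption_spectrum(h_exc, dipoles, energies, gamma):
    """Lorentzian-broadened absorption from a real symmetric excitation matrix."""
    # Eigenvalues are excitation energies; column mu of `vecs` holds the A^(mu).
    e_exc, vecs = np.linalg.eigh(h_exc)
    # Strength of state mu: |sum_t A_t^(mu) d_t|^2.
    strength = np.abs(vecs.T @ dipoles) ** 2
    # Unit-area Lorentzian of full width gamma centred on each excitation energy.
    half = 0.5 * gamma
    lor = (half / np.pi) / ((energies[:, None] - e_exc[None, :]) ** 2 + half ** 2)
    return e_exc, strength, lor @ strength

# Hypothetical toy model: 40 vertical transitions spread over 3.0-5.0 eV,
# coupled by a constant attractive matrix element (illustrative assumption).
n = 40
h = np.diag(np.linspace(3.0, 5.0, n)) - 0.02 * np.ones((n, n))
d = np.ones(n)                      # equal dipole elements, arbitrary units
grid = np.linspace(1.0, 6.0, 501)   # photon energies in eV
e_exc, strength, spectrum = absorption_spectrum(h, d, grid, gamma=0.1)
```

Because the eigenvectors form a complete basis, the strengths sum to the squared norm of the dipole vector, so the interaction only redistributes absorption among the states; it does not create or destroy it.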
4
Results
In this section we present our results for both transport and optical properties of PPV, starting from the common basis of the DFT single-particle description. In Fig. 2 we report the ab initio band structure along some relevant lines in the Brillouin zones for the isolated chain, the π-stack and the herringbone systems [panels (a), (b) and (c), respectively]. The chain direction is along ΓP; Brillouin zones and k-point labels are displayed in the upper panels. All calculations were performed using the PWSCF code [36] within the LDA for the exchange-correlation functional, with norm-conserving pseudopotentials and a plane-wave basis set with a 45 Ry cutoff energy. We see the band doubling for HB-PPV, Fig. 2(c), due to the presence of two inequivalent chains in the unit cell. Apart from that, there is a noticeable similarity between all these band structures along the chain direction, which shows considerable dispersion. The bands for HB and πS are not completely flat, however, in directions orthogonal to the chains, indicating the existence of interchain interactions.
[Figure 2 graphic: panels (a) SC, (b) πS and (c) HB; vertical axis Energy [eV] from −5.0 to 5.0; k-point labels Γ, A, B, P; Brillouin zones with axes x̂, ŷ, ẑ in the upper panels; band labels $\epsilon^{+}_{m}(\mathbf{k})$ and $\epsilon^{-}_{m}(\mathbf{k})$]
Fig. 2. Band structures of PPV isolated chain (a), π-stack structure (b) and herringbone packing (c). The origin in the energy scale is referred to the highest occupied molecular level (HOMO) of every system. Chains are oriented along the Γ P line. Brillouin zone structures are shown in the upper panels for all packings. Thick lines describe the trace of the symmetry-irreducible zones in the plane of interest
4.1
Transport Properties
Table 1. Transfer integrals for HOMO states in meV. For clarity we use $E_{ij}(\mathbf{P}_s)$ in place of $E^{k_z}_{mi,mj}(\mathbf{P}_s)$, since m always refers to the top valence band, and $k_z$ is set to the ẑ component of the HOMO k-vector

HB                           πS
E12(0)           27.42       E(P0)     120.64
E11(P2)          13.96       E(2P0)     10.49
E11(P1 + P2)      1.49       E(3P0)      2.86
E11(P1)           0.27

In Table 1 we report the numerical values of the TI's obtained for the HB and πS crystals, focusing on the dispersion orthogonal to the chains at the $k_z$ of the highest occupied state (HOMO) of the crystal, because we are interested here in hole conduction in the low-field regime [37]. A large difference emerges between the two structures. In fact the maximum TI for πS is more than four times larger than its counterpart for HB, a signature of large interchain coupling. In the case of HB, the maximum coupling is reached between inequivalent chains, even with the large setting angle, in accordance with previous findings [38,22]. In the case of E11(P2) (see the caption of Table 1 for notation), the chains are laterally displaced relative to each other, while in the case of E11(P1) the interchain distance is very large (more than 5 Å). In the same ITB spirit we studied the behaviour of the highest valence band for SC as generated from the interaction of HOMO states of monomer units. We are justified in doing so by the clear origin of the HOMO band for SC. We thus obtain the TI's of the SC hamiltonian between monomeric states at different sites. The nearest-neighbour (NN) TI was found to be 523 meV, highlighting the anisotropy in the transport properties between directions parallel and orthogonal to the chains. In real polymer structures there are
however many phenomena which can break the conjugation along the chains (libration of rings, kinks, chain ends), thus closing important transport channels. In these cases, and in the case of oligomeric crystals, it is very useful to understand to what extent alternative transport channels orthogonal to the chains could open up. In particular, the fact that the NN TI for πS has a magnitude intermediate between those of HB, orthogonal to the chains, and SC, along the chain, suggests that transport along the stack could play a relevant role for this material.
4.2
Optical Properties
[Figure 3 graphic: panels (a), (b), (c); upper energy diagrams with levels labelled DA, DD, CT and Continuum; lower plots of Absorption (arb. units) versus Energy (eV) from 1 to 4, for z polarization in all panels and for y polarization (×5) in panel (b)]
Fig. 3. Excitons and absorption spectra for SC (a), πS (b) and HB (c). The upper graphs are energy diagrams for the lowest excitons in each system, while the lower graphs show the calculated absorption spectra polarized in the chain direction (ẑ); in the πS case we also plot the absorption for polarization in the stacking direction (ŷ). The spectra are obtained using a 0.1 eV lorentzian broadening. Shaded lines represent spectra calculated without including electron-hole interaction, while black lines show the results for the two-particle calculation. The grey box represents the continuum of dissociated transport states. Excitons and peaks are labelled as direct active (DA), direct dark (DD) and charge transfer (CT)
We have carried out the BSE calculation of excitonic states as outlined in Section 3.2. We take the scissor operator from the SC calculation of Ref. [39], and we assume it to be independent of the interchain packing. The vertical QP excitations are sampled in the Brillouin zone with a discrete mesh of k points (20 in the SC, 10×20 in the πS, and 5×5×14 in the HB configuration). Single-particle wavefunctions are expanded on a plane-wave basis set with a 35 Ry energy cutoff. For the HB system we use an orthorhombic cell with the same interchain distance as the true monoclinic cell. The excited-state energy levels obtained are schematically represented in the upper graphs of Fig. 3, together with the continuum of transport states (corresponding to dissociated electron-hole pairs). The oscillator strength of each exciton is calculated and used to simulate the absorption spectrum in each case (each eigenvalue is lorentzian-broadened and weighted by the corresponding oscillator strength). We compare the absorption spectrum with the joint density of states of the QP approximation: when electron-hole interaction is neglected (QP approximation, shaded region in Fig. 3), a typical 1D Van Hove singularity is observed for the SC configuration (a) at the energy corresponding to the gap between valence and conduction states. In the HB structure, the presence of two inequivalent chains per unit cell, together with the fact that the transition between the uppermost valence band and the lowest conduction band is optically forbidden, leads to a non-negligible blue shift of the “optical absorption” onset
with respect to the transport continuum. Note that all these features are observed for light polarized along the chain direction (ẑ). The introduction of the electron-hole correlation leads to dramatic changes in the absorption spectra [4]: a peak due to a localized exciton arises at an energy considerably lower than the transport gap. This bound state is polarized along the chain direction in all systems. Another effect of the electron-hole interaction is the suppression of the absorption strength for states at energies corresponding to the band gap: this effect was already known for quantum wires and is typical of 1D systems [40]. The energy difference between the exciton and the lowest transport state can be interpreted as the binding energy of the exciton. The binding energy of the lowest-lying optically active state for the SC, πS and HB configurations is 0.7, 0.5 and 0.2 eV, respectively, indicating a general decreasing trend when the dimensionality increases. This is due mainly to the larger screening in the higher-dimensionality systems. However, the effect of the 2D or 3D packing cannot be reduced to an increase of the screening and a decrease of the binding energy: in fact, looking at the upper graphs in Fig. 3 it is possible to observe that a variety of excitons is produced by the interchain coupling. The first clear difference between SC and πS systems is that, in the latter case, states can be classified as direct (i.e. electron and hole on the same chain) and charge transfer (CT) excitons (i.e. electron and hole on adjacent chains). The direct state in the πS structure originates from the SC excitons and is optically active, while the lowest observed CT state (indicated in the spectrum) is also predicted to be observable, but for polarization along the stacking direction (ŷ). The energy of this CT state is just 0.1 eV higher than that of the direct state (labelled here as DA). Note that the oscillator strength of the CT exciton is non-negligible with respect to that of the ẑ-polarized direct state. In Fig. 4 it is possible to appreciate the differences between a direct and a CT exciton in the πS structure. It is important to note that the probability of finding electron and hole on neighbouring chains is small but not zero in the direct-exciton case.
[Figure 4 graphic: panels (a) to (d); axis y (Å) from −10.68 to 10.68, with ticks in units of the interchain distance d = 3.56 Å; axis z (Å) from −26.60 to 33.25, with ticks in units of the cell length c = 6.65 Å]
Fig. 4. Excitonic wavefunction of a direct and a CT state for the πS packing. The color plots [(a) for the direct and (c) for the CT state] represent the projection on the xz plane of the probability of finding the electron in a generic point when the position of the hole is fixed in the point indicated by the white spot. The grid spacing d in the ŷ direction corresponds to the interchain distance, while the grid spacing c in the ẑ direction corresponds to the length of the periodic cell. In (b) and (d) the projection on the ŷ direction is shown for both states. Note that the probability of finding the electron and hole on different chains is very small but not zero in the direct state, while it is very high for the CT state
Passing to the 3D packing of the HB structure, other features appear: in particular, the presence of two inequivalent chains leads to a splitting of the lowest-lying direct state into two excitons, usually referred to as Davydov splitting. This effect is related to the band doubling already observed in the previous section and is also found in other herringbone polymers [41]. For symmetry reasons, one of the Davydov states is optically active (DA), while the other one is dark (DD). Note that the lowest singlet state turns out to be optically inactive, leading to the luminescence quenching observed in experiments. We also find CT states for the HB structure; in this case, however, the calculation indicates negligible oscillator strength for all bound CT states, which are thus not observable in the spectrum.
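The Davydov splitting and the bright/dark assignment can be illustrated with a two-state model, sketched below in Python. Two degenerate single-chain excitons of energy e0 are coupled by an interchain matrix element j, and the strength of each resulting state follows from the coefficient-weighted sum of the chain transition dipoles. The energy, the coupling (including its sign), the parallel dipoles and the function name are assumptions chosen for illustration, not values taken from the BSE calculation.

```python
import numpy as np

def davydov_pair(e0, j, d1, d2):
    """Energies and strengths of two coupled, degenerate chain excitons."""
    h = np.array([[e0, j], [j, e0]], dtype=float)
    energies, vecs = np.linalg.eigh(h)           # ascending: e0 - |j|, e0 + |j|
    dipoles = np.array([d1, d2], dtype=float)    # one row per chain
    # Dipole of each crystal state: coefficient-weighted sum of chain dipoles.
    state_dipoles = vecs.T @ dipoles
    strengths = np.sum(state_dipoles ** 2, axis=1)
    return energies, strengths

# Hypothetical inputs: 2.5 eV exciton, 0.05 eV coupling,
# both transition dipoles along the chain axis z.
z = [0.0, 0.0, 1.0]
energies, strengths = davydov_pair(2.5, 0.05, z, z)
```

With a positive coupling and parallel dipoles the lower state is the antisymmetric combination and carries no strength, while the upper state collects all of it. This mirrors the dark lowest state found for HB, although in the real crystal the assignment follows from the full symmetry analysis rather than from this simple model.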
5
Conclusions
We reported a fully ab initio description of optical and electronic properties of an important conjugated polymer, and their connection with transport properties, focusing on the influence of interchain interactions and packing symmetries. To be able to study extended periodic systems we use a reciprocal-space representation, and we go beyond the single-particle approximation to treat optical excitations. We analyze three different systems: a one-dimensional isolated PPV chain, a two-dimensional PPV π-stack, and a three-dimensional herringbone PPV crystal. We calculate the electronic band structure and show that the band dispersion along the chain direction is larger than the dispersion orthogonal to the chains, which is nevertheless non-negligible. We then calculate the interchain transfer integrals as parameters for hopping conduction and find that the coupling between neighbouring chains is larger in the πS than in the HB structure. The direct consequence is that the presence of side-chains in MEH-PPV leads to an increase of the interchain coupling, contrary to the usually accepted notion that aliphatic substitution prevents interchain interaction. We also presented calculated optical properties and absorption spectra. The inclusion of electron-hole interaction is critical and dramatically changes the shape of the spectra. We find that interchain coupling is also very important, and that the exciton binding energies show a substantial decrease with the increase of dimensionality. The excitonic structure is strongly affected by interchain coupling: excitons with electron and hole on different chains appear in the 2D and 3D systems at energies just above the lowest singlet exciton. Moreover, in the HB packing the presence of two chains per unit cell leads to a splitting of the single-chain exciton into two Davydov components, and the lowest component is optically forbidden, quenching the luminescence. This splitting is not observed in the πS structure; coupled with the better electrical performance, this points to MEH-PPV as a preferred material for photoluminescent devices.
References
1. J. H. Burroughes et al., Nature 347, 539 (1990).
2. S. F. Alvarado, P. F. Seidler, D. G. Lidzey, and D. D. C. Bradley, Phys. Rev. Lett. 81, 1082 (1998).
3. I. H. Campbell, T. W. Hagler, D. L. Smith, and J. P. Ferraris, Phys. Rev. Lett. 76, 1900 (1996).
4. A. Ruini, M. J. Caldas, G. Bussi, and E. Molinari, Phys. Rev. Lett. 88 (2002).
5. A. Ferretti, A. Ruini, E. Molinari, and M. J. Caldas, Phys. Rev. Lett. 90, 086401 (2003).
6. M. Yan, L. J. Rothberg, E. W. Kwock, and T. M. Miller, Phys. Rev. Lett. 75, 1992 (1995).
7. I. D. Samuel, G. Rumbles, and R. H. Friend, in Primary Photoexcitations in Conjugated Polymers, edited by N. S. Sariciftci, World Scientific, Singapore, 1998.
8. J. L. Brédas, R. R. Chance, R. Silbey, G. Nicolas, and P. Durand, J. Chem. Phys. 77, 371 (1982).
9. D. Beljonne, Z. Shuai, J. Cornil, D. A. dos Santos, and J. L. Brédas, J. Chem. Phys. 111, 2829 (1999).
10. U. Lemmer et al., Appl. Phys. Lett. 66, 2827 (1993).
11. H. L. Gomes et al., Appl. Phys. Lett. 74, 1144 (1999).
12. J. Cornil, J. P. Calbert, D. Beljonne, R. Silbey, and J. L. Brédas, Synth. Met. 119, 1 (2001).
13. R. B. Capaz and M. J. Caldas, Phys. Rev. B (2003).
14. C. Y. Yang, F. Hide, M. A. Díaz-García, A. J. Heeger, and Y. Cao, Polymer 39, 2299 (1998).
15. D. Chen, M. J. Winokur, M. A. Masse, and F. E. Karasz, Phys. Rev. B 41, 6759 (1990).
16. D. Chen, M. J. Winokur, M. A. Masse, and F. E. Karasz, Polymer 33, 3116 (1992).
17. M. J. Caldas, E. Pettenati, G. Goldoni, and E. Molinari, Appl. Phys. Lett. 79, 2505 (2001).
18. L. Y. A. Dávila and M. J. Caldas, J. Comput. Chem. 23, 1135 (2002).
19. W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).
20. W. Kohn, Rev. Mod. Phys. 71, 1253 (1999).
21. R. M. Dreizler and E. K. U. Gross, Density Functional Theory: An Approach to the Quantum Many-Body Problem, Springer, 1990.
22. J. L. Brédas et al., Synth. Met. 125, 107 (2002).
23. Z. G. Soos, S. Etemad, D. S. Galvão, and S. Ramasesha, Chem. Phys. Lett. 194, 341 (1992).
24. L. Torsi, D. Dodabalapur, L. J. Rothberg, A. W. P. Fung, and H. E. Katz, Science 272, 1462 (1996).
25. R. A. Marcus, Rev. Mod. Phys. 65, 599 (1993).
26. R. A. Marcus and N. Sutin, Biochim. Biophys. Acta 811, 265 (1985).
27. R. Landauer, Philos. Mag. 21, 863 (1970).
28. M. Büttiker, Y. Imry, R. Landauer, and S. Pinhas, Phys. Rev. B 31, 6207 (1985).
29. M. Büttiker, Phys. Rev. Lett. 57, 1761 (1986).
30. J. C. Slater and G. F. Koster, Phys. Rev. 94, 1498 (1954).
31. L. Hedin, Phys. Rev. 139, A796 (1965).
32. L. Hedin and S. Lundqvist, Solid State Phys. 23, 1 (1969).
33. L. J. Sham and T. M. Rice, Phys. Rev. 144, 708 (1966).
34. W. Hanke and L. J. Sham, Phys. Rev. Lett. 43, 387 (1979).
35. M. Rohlfing and S. G. Louie, Phys. Rev. B 62, 4927 (2000).
36. S. Baroni, A. Dal Corso, S. de Gironcoli, and P. Giannozzi, 2001, http://www.pwscf.org.
37. H. U. Baranger and A. D. Stone, Phys. Rev. B 40, 8169 (1989).
38. A. Calzolari, R. Di Felice, E. Molinari, and A. Garbesi, Appl. Phys. Lett. 80, 3331 (2002).
39. M. Rohlfing and S. G. Louie, Phys. Rev. Lett. 82, 1959 (1999).
40. F. Rossi and E. Molinari, Phys. Rev. Lett. 76, 3642 (1996).
41. G. Bussi et al., Appl. Phys. Lett. 80, 4118 (2002).
Terahertz Quantum Cascade Lasers
Rüdeger Köhler1, Alessandro Tredicucci1, Fabio Beltram1, Harvey E. Beere2, Edmund H. Linfield2, Giles A. Davies2, and David A. Ritchie2
1 NEST-INFM and Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy [email protected]
2 Cavendish Laboratory, University of Cambridge, Madingley Road, Cambridge CB3 0HE, United Kingdom
Abstract. Unipolar semiconductor injection lasers emitting at THz frequencies (4.3 THz, λ ∼ 69 µm and 3.5 THz, λ ∼ 85 µm) are discussed. The devices are based on interminiband transitions in chirped GaAs/AlGaAs superlattices that are arranged in a quantum-cascade scheme. The core, featuring 100 repetitions of this type of superlattice, is embedded into a novel kind of waveguide loosely based on the surface plasmon concept, which makes it possible to achieve low waveguide losses and high confinement factors. Continuous-wave laser emission is obtained with low thresholds of a few hundred A/cm² up to 48 K heat sink temperature and maximum output powers of more than 4 mW. Under pulsed excitation, peak output powers of 4.5 mW at low temperatures and still 1 mW at 65 K are measured. The maximum operating temperature is 68 K. The operation of these devices is studied with the help of a Hakki-Paoli analysis.
1
Introduction
The use of THz radiation (1–10 THz, λ ∼ 30µm–300µm) offers great prospects in bio-medical applications [1,2,3], in atmospheric and astronomical spectroscopy [3], and for the implementation of high-bandwidth intrabuilding wireless communications. In many instances, sensing with THz radiation provides information not accessible with conventional techniques. The peculiar transparency characteristics of various substances in this spectral region make it an ideal choice for the study of biological tissues, for revealing concealed items or materials, and for the analysis of chemical processes in the atmosphere of the Earth. The progress in these fields, however, is hindered by the lack of suitable emitters. At present, THz radiation can be obtained from a variety of sources including gas lasers, free-electron lasers and p-doped Ge lasers [4] as well as black-dody radiation, mixing of two visible laser beams [5] and transient generation of oscillating charges with fs pulses [6]. They all suffer, however, from shortcomings precluding wide-spread use in commercial systems. Although semiconductor devices traditonally account for a large share of the sources of electromagnetic waves, emitting frequencies from kHz up B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 327–340, 2003. c Springer-Verlag Berlin Heidelberg 2003
328
R¨ udeger K¨ ohler et al.
to those of ultaviolet light, the terahertz range has remained substantially uncovered. This region still can be considered a no-man’s land between the reigns of electronics and photonics. While transport-based devices are widely available at GHz frequencies and up, their operation above 1 THz faces major physical obstacles. On the other hand, transition-based photonic systems such as diode lasers are utilized from the ultra-violett down to about 12 THz, but difficulties of both fundamental and technological nature impede their operation at ever longer wavelengths. Quantum Cascade Lasers (QCLs) [7] are unipolar semiconductor lasers that are based on intersubband transitions in a specifically engineered heterostructure. Contrary to conventional laser diodes, where the optical transition takes place across the band gap (interband) and thus involves both electrons and holes, in these novel devices only electrons are involved and the transition occurs between subbands belonging to the conduction band (intersubband). The energy and envelope functions of the subbands can be controlled by the thickness of the individual layers, quantum wells and barriers, and by the applied bias. Therefore, the band gap of the choice materials is in first instance irrelevant to the energy of the emitted photons and hence technologically mature systems like InGaAs/AlInAs or GaAs/AlGaAs can be used in a wide range of emission wavelengths. In the original proposal by Kazarinov and Suris [8] as well as in following studies [9,10,11] it was suggested that such a laser would be realized most easily at photon energies below the reststrahlen band, where the scattering of electrons with longitudinal optical (LO) phonons is considerably reduced. The resulting longer lifetime of the upper laser level then would lead to lower thresholds. It turned out, however, that LO phonon scattering actually can be conveniently used to design population inversion. 
To this end, the active region typically hosts three levels, of which the upper two (levels 2 and 1) form the laser transition, and the third is separated by about the LO phonon energy from the lower laser level. The wavevectors at which electrons scatter with LO phonons are therefore quite different for electrons in levels 2 and 1, which, in conjunction with the $1/q^2$ dependence of the scattering rate $\tau^{-1}$, readily leads to $\tau_2 > \tau_1$. The active regions hosting the optical transition are connected by injector/collector regions. These are superlattice structures whose states resemble a miniband that collects electrons from the previous active region, cools down the carrier distribution and injects them again into the upper laser level of the following active region. Injector/collector regions and active regions form a building block (a so-called period) that can be repeated many times (typically 25–100). Under appropriate bias, all periods line up to form the potential 'cascade', the active core of the laser, down which the electrons travel, emitting ideally one photon at each step of the cascade. Quantum cascade lasers were first operated in the mid-infrared [7], and the range of emission wavelengths has increased widely ever since [12,13], with concomitant tremendous improvements in their performance, eventually leading to continuous-wave operation at room temperature [14]. However, the phonon reststrahlen band (located at 8–9 THz in the commonly employed systems GaInAs/AlInAs and GaAs/AlGaAs) was considered to be an insuperable barrier to the further expansion to lower emission frequencies. In fact, while electroluminescence at THz frequencies was observed by several groups [15,16,17], none of these structures exhibited population inversion. This can also be understood from the difficulty of gaining sufficient control of scattering rates at such low energies, since an extension of the phonon-scattering based approach is not straightforward. Another challenge was the development of a suitable waveguide to confine light of such long wavelength to an epilayer compatible with molecular beam epitaxy (MBE) technology without imposing high absorption losses onto the laser mode. Lasing was reported only recently [18], and very low threshold current densities have been observed in such structures [19]. Very recently continuous-wave operation has been obtained [20,21] and operation at longer wavelength has been reported [22,23,24]. Here, we review the latest developments and performance improvements of THz quantum cascade lasers emitting at 4.3 THz (λ ∼ 69 µm) and 3.5 THz (λ ∼ 85 µm).
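The frequencies, wavelengths and photon energies quoted for the two lasers are related by λ = c/f and E = hf. The following is a minimal sketch of that conversion; the function names are illustrative and not taken from the original work.

```python
# Convert a THz emission frequency to free-space wavelength and photon energy.
# Constants are standard SI values; the helper names are hypothetical.
C_LIGHT = 299792458.0        # speed of light, m/s
H_PLANCK = 6.62607015e-34    # Planck constant, J s
E_CHARGE = 1.602176634e-19   # elementary charge, C


def thz_to_wavelength_um(f_thz: float) -> float:
    """Free-space wavelength in micrometres for a frequency given in THz."""
    return C_LIGHT / (f_thz * 1e12) * 1e6


def thz_to_energy_mev(f_thz: float) -> float:
    """Photon energy in meV for a frequency given in THz."""
    return H_PLANCK * f_thz * 1e12 / E_CHARGE * 1e3


if __name__ == "__main__":
    for f in (4.3, 3.5):
        print(f"{f} THz: {thz_to_wavelength_um(f):.1f} um, "
              f"{thz_to_energy_mev(f):.1f} meV")
```

For 4.3 THz this gives a wavelength close to 70 µm and a photon energy close to 18 meV, in line with the rounded values used in the text.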
2
Quantum Cascade Laser at 4.3 Terahertz
In Fig. 1 (a) we show a portion of the conduction band structure of a THz QC laser. The moduli squared of the relevant wavefunctions are displayed, with the laser transition taking place between the two levels drawn in boldface. The active region follows the concept of chirped superlattices (SL) [25], in which the optical transition takes place between the first and second miniband rather than between individual electronic levels. The term 'chirped' reflects the fact that period and duty-cycle of the superlattice are varied in order to realize a flat-band condition under applied bias. The main characteristics of this structure are the large dipole matrix element of 7.8 nm at an emission energy of 18 meV (4.3 THz) and the wide injector miniband, highlighted by the shaded area in Fig. 1. The lower laser level 1 is strongly coupled to this miniband which, in conjunction with the large miniband dispersion of 17 meV, facilitates the achievement of population inversion. The strong coupling leads to a rapid extraction of carriers from state 1 owing to the large scattering matrix elements between this level and the injector states. Moreover, it reduces the non-radiative scattering rate $\tau_{21}^{-1}$ from the upper laser level 2 directly into 1, as electrons in the first can scatter to a whole dense miniband of states. Thus even a fast total scattering rate $\tau_2^{-1}$ from the upper laser level by itself does not preclude laser action, as only the condition $\tau_1^{-1} > \tau_{21}^{-1}$ must be fulfilled. Here, $\tau_2$ and $\tau_1$ are computed as the net loss rate of carriers from level 2 and 1, respectively, into all states of lower energy. Notice that, unlike in mid-infrared QC lasers, where the scattering rates are dominated by direct emission of LO phonons, in the present structure carrier-carrier scattering
[Figure 1. (a) Conduction band diagram: Energy (eV) versus Distance (nm), with the active region, the injector and levels 2 and 1 marked. (b) Waveguide mode profile: Mode Intensity (norm.) versus Distance (µm), with the metal layer and the bottom contact layer marked; Γ = 0.47, αW = 12 cm−1.]
Fig. 1. (a) Conduction band energy diagram of the 4.3 THz laser under an electric field of 3.5 kV/cm. The layer thicknesses (in nm) are, from left to right, starting from the injection barrier, 4.3/18.8/0.8/15.8/0.6/11.7/2.5/10.3/2.9/10.2/3.0/10.8/3.3/9.9, where Al0.15Ga0.85As barriers (boldface in the original) alternate with GaAs wells, starting with the injection barrier, and the 10.2 nm wide GaAs well is doped $4 \times 10^{16}$ cm$^{-3}$. Also shown are the moduli squared of the wavefunctions; the optical transition occurs between the two states drawn in boldface. Carriers are injected from the ground state of the injector into the upper laser level via resonant tunneling through the injection barrier. Fast extraction of carriers from the lower laser level is facilitated by the strong coupling between the lower laser level and the injector miniband. (b) Calculated waveguide mode profile along the growth direction of the final device structure. The origin of the abscissa is at the top metal-semiconductor interface; the laser active core is indicated by the shaded area. A confinement factor of Γ = 0.47 and optical losses of $\alpha_W = 12$ cm$^{-1}$ are computed
plays a major role. Besides directly affecting electron distributions it also acts as an activation mechanism for carrier-phonon scattering. The latter is not possible for electrons at zero in-plane momentum due to lack of final states at appropriate energy. Carrier-carrier scattering, however, can transfer sufficient in-plane momentum to electrons to open that path, a perception which is also supported by Monte-Carlo simulations [26]. Finally, the large dispersion of the miniband supports a wide range of currents and voltages in the operating characteristics, which is necessary for the achievement of high output powers, and, additionally, suppresses thermal backfilling from the downstream active region. Efficient waveguiding of THz radiation inside a semiconductor represents a challenge owing to the long wavelength, typically resulting in small confinement factors, and to the high losses caused by free-carrier absorption. Conventionally, long-wavelength (up to λ ∼ 24µm) QC lasers [13,27,28] made use of a concept introduced by Sirtori et al. [29], that relies on surface-plasmons. Surface plasmons are the solution to Maxwell’s equations at the interface between two materials possessing dielectric constants of opposite sign [30]. Such an interface can be realized in QC lasers between the moderately doped
stack of active regions and a subsequently deposited metal. In the direction perpendicular to the interface, the optical mode peaks at the interface and decays exponentially to both sides, with the decay constants controlled by the wavelength of the light and by the dielectric constant of the materials. In the InP material system, where the substrate acts like a cladding layer, confinement factors of more than 80 % have been demonstrated for λ ∼ 17 µm and an active core of 4 µm. At THz wavelengths, though, a single-plasmon waveguide would lead to very small confinement factors. A unity confinement of the optical mode can be obtained in a double-surface plasmon waveguide configuration, where the active core is sandwiched between metal layers or highly-doped semiconductors. However, residual penetration of the light into the cladding layers causes large optical losses ($\alpha_W = 50$ to $80$ cm$^{-1}$) [31]. Solutions based on undoped substrates therefore represent an attractive alternative. In our laser the waveguide relies on the presence of a thin, highly-doped layer which was grown directly underneath the low-doped stack of active SLs. Thanks to its negative dielectric constant, comparable in modulus to the one of active stack and substrate, and thanks to its small thickness, the two surface plasmons existing at the two interfaces unite into a single mode, whose penetration into the surrounding semiconductor is at the same time minimized, resulting in a very tight confinement and low optical losses. The calculated mode profile of our waveguide is shown in Fig. 1 (b); we calculate a confinement factor of Γ = 0.47 and optical losses of $\alpha_W = 12$ cm$^{-1}$. The refractive index of the semiconductor layers was calculated using a Drude model, in which the carrier-carrier scattering time was set to 0.1 ps for the highly-doped layers, 0.5 ps for the low-doped layers, and the substrate was assumed loss-free.
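The Drude description of the doped layers mentioned above can be sketched as follows. This is only an illustration: the background dielectric constant (12.9) and the effective mass (0.067 free-electron masses) are typical textbook values for GaAs assumed here, not parameters stated in the text, and the function name is hypothetical. The low-doped example uses the doping of a single well, so the average doping of the active stack is lower still.

```python
import cmath
import math

EPS0 = 8.854e-12       # vacuum permittivity, F/m
E_CHARGE = 1.602e-19   # elementary charge, C
M0 = 9.109e-31         # free-electron mass, kg


def drude_permittivity(n_cm3, tau_s, f_thz,
                       eps_background=12.9,   # assumed GaAs background value
                       m_eff=0.067):          # assumed GaAs effective mass ratio
    """Complex relative permittivity of a doped layer in the Drude model.

    eps(w) = eps_background - wp^2 / (w^2 + i w / tau),
    with wp^2 = n e^2 / (eps0 m*).
    """
    n_m3 = n_cm3 * 1e6                         # cm^-3 -> m^-3
    omega = 2.0 * math.pi * f_thz * 1e12       # angular frequency, rad/s
    wp2 = n_m3 * E_CHARGE**2 / (EPS0 * m_eff * M0)
    return eps_background - wp2 / (omega**2 + 1j * omega / tau_s)


if __name__ == "__main__":
    # Scattering times from the text: 0.1 ps (highly doped), 0.5 ps (low doped).
    for label, n, tau in (("highly doped layer", 2e18, 0.1e-12),
                          ("low-doped well", 4e16, 0.5e-12)):
        eps = drude_permittivity(n, tau, 4.3)
        print(label, "eps =", eps, "refractive index =", cmath.sqrt(eps))
```

Under these assumptions the highly-doped layer has a negative real permittivity at 4.3 THz, while the low-doped material stays positive, which is the sign change the surface-plasmon waveguide relies on.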
It should be mentioned that both the single-surface plasmon waveguide and the double-surface plasmon waveguide are the limiting cases of this configuration. Reducing the thickness and/or doping concentration of the inserted layer, one recovers the single-surface plasmon case, whereas an increasing thickness of the inserted layer leads to the double-surface plasmon configuration. The structure was grown by molecular beam epitaxy on a semi-insulating GaAs substrate, starting with the 800 nm thick n-doped ($2 \times 10^{18}$ cm$^{-3}$) GaAs layer, followed by 104 repetitions of the superlattice (thicknesses are given in the caption of Fig. 1), and terminated by a 200 nm thick n-doped ($5 \times 10^{18}$ cm$^{-3}$) GaAs layer to facilitate electrical contacting. Samples were processed into ridge-geometry mesas (width 150 µm) by optical lithography and wet chemical etching to a depth of 11.1 µm, exposing the bottom contact layer. Evaporation of GeAu/Au (60 nm/60 nm) onto the designated areas on the bottom contact layer and onto two narrow (15 µm wide) stripes on top of the laser ridges, followed by annealing at 420 °C under nitrogen atmosphere, provided ohmic contacts. A second evaporation of Cr/Au (10 nm/170 nm) allowed for wire bonding. A schematic view of a device is shown in the inset
of Fig. 3. The use of two narrow stripes for the top contact instead of the full area of the ridge reduces the waveguide losses, which are believed to be higher in annealed material. Further, compared to the first devices [18] the side contacts were moved closer to the ridges to reduce heat dissipation in the bottom contact layer. Notice, however, that there is an optimum distance because a coupling of the mode to the metallization of the side contact must be avoided. Samples were then thinned down to about 250 µm, and laser bars were cleaved. Facets were either left untreated or a high-reflection coating (Al2O3/Ti/Au/Al2O3, 100 nm/10 nm/100 nm/100 nm) was deposited on the back facet. The laser bars were then soldered onto a copper block using an In-Ag alloy, wire-bonded and mounted onto the cold-finger of a continuous-flow cryostat equipped with polyethylene windows. Light was collected using an f/1 off-axis parabolic mirror, sent through an FTIR spectrometer and detected with a DTGS (deuterated triglycine sulphate) detector or a He-cooled Si bolometer. Light-current (L-I) characteristics in continuous wave (cw) operation were recorded by mounting a Winston cone in front of the laser facet and using a calibrated pyroelectric radiometer. This latter arrangement allowed for a collection efficiency of 0.33, assuming an isotropic laser output. Figure 2 shows the light-current and voltage-current (V-I) characteristics of two devices recorded in cw operation at different heat sink temperatures. Solid lines represent data collected from a laser ridge with the cleaved facets acting as mirrors, while the data represented by the dashed lines were recorded from a laser on whose back facet a high-reflection coating was deposited. At a heat-sink temperature of 8 K, the maximum output power is more than 2 mW for the first device and more than 4 mW for the latter, with threshold current densities of 210 A cm$^{-2}$ and 180 A cm$^{-2}$, respectively.
The devices stop operating in cw mode at a temperature of 48 K. The V-I characteristics are shown as well, from which the voltage at threshold can be extracted to be 4.3 V. Taking into account a residual voltage drop of approximately 0.5 V for contacts and electrical feed-through, this is in perfect agreement with the calculated value of 3.8 V. The drop in differential resistance coincides with laser threshold and is an indication of a reduced upper state lifetime as a consequence of stimulated emission [32]. Figure 3 shows the L-I characteristics in pulsed operation obtained at a duty cycle of 0.5%. The maximum peak output power is only slightly (10 percent) higher than in cw operation but the maximum operating temperature increases to 68 K with still 1 mW of peak power at 65 K heat sink temperature. The small difference in output power between pulsed and cw operation is attributed to the low current densities which lead to small dissipated powers. In Fig. 4 we show single-mode spectra recorded from the two devices of Fig. 2 in cw operation. For both lasers single-mode emission with a side-mode suppression ratio of more than 20 dB is obtained. However, at higher injection currents additional longitudinal modes appear in the spectra. Decreasing the current again recovers single-mode operation.
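The calculated threshold voltage of 3.8 V quoted above is consistent with simply applying the design field of 3.5 kV/cm across the full stack of 104 periods. The sketch below reproduces that arithmetic from the layer sequence in the caption of Fig. 1; it is an illustrative check under that assumption, not necessarily the calculation the authors performed, and the variable names are hypothetical.

```python
# Layer thicknesses of one period in nm, from the caption of Fig. 1.
LAYERS_NM = [4.3, 18.8, 0.8, 15.8, 0.6, 11.7, 2.5,
             10.3, 2.9, 10.2, 3.0, 10.8, 3.3, 9.9]
N_PERIODS = 104            # repetitions of the superlattice
FIELD_KV_PER_CM = 3.5      # design electric field
TOP_CONTACT_UM = 0.2       # 200 nm top contact layer
CONTACT_DROP_V = 0.5       # residual drop for contacts and feed-through

period_nm = sum(LAYERS_NM)                      # length of one period
core_um = N_PERIODS * period_nm * 1e-3          # active core thickness
# kV/cm -> V/cm, micrometres -> cm
bias_v = FIELD_KV_PER_CM * 1e3 * core_um * 1e-4
etch_depth_um = core_um + TOP_CONTACT_UM        # depth to the bottom contact

if __name__ == "__main__":
    print(f"period length     : {period_nm:.1f} nm")
    print(f"active core       : {core_um:.2f} um")
    print(f"bias across core  : {bias_v:.2f} V")
    print(f"with contact drop : {bias_v + CONTACT_DROP_V:.2f} V")
    print(f"etch depth        : {etch_depth_um:.2f} um")
```

The same numbers also recover the period length of 104.9 nm and an etch depth of about 11.1 µm, both of which appear elsewhere in the text.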
Fig. 2. Light-current (L-I) and voltage-current (V-I) characteristics of two devices recorded in cw operation. The power values represent what was collected (collection efficiency ≈ 0.33) from one facet onto a calibrated pyroelectric radiometer after correction for the transmittance of the polyethylene window (0.63). Dashed lines represent data collected from a 2.23 mm long device with a coated back-facet. At low temperatures, output powers of more than 4 mW are obtained and still 1.5 mW at 40 K. The device stops operating at 48 K. The 7 K V-I characteristics and its derivative are shown in the left panel. The sharp drop in differential resistance at threshold is caused by the reduction of the upper state lifetime by stimulated emission. Solid lines refer to a 1.96 mm long and 150 µm wide laser stripe, where the facets were left untreated. The output powers are lower by about a factor of two, but the maximum operating temperature is still 45 K. Due to the slightly different device size, the scale of the current refers only to the solid lines. The inset shows a schematic view of a processed laser stripe (see text)
In order to gain further understanding of the device operation we have performed measurements of the net modal gain G as a function of continuous injection current density J following the technique pioneered by Hakki and Paoli [33]. To this end, sub-threshold spectra from the 1.96 mm long device of Fig. 2 were recorded at 8 K in rapid-scan mode with a resolution of 0.125 cm−1 averaging over 400 scans. The net gain is extracted from the fringe contrast [33], and is plotted versus the injection current in Fig. 5. The inset shows an example sub-threshold spectrum. Close to threshold the dependence appears to be reasonably linear, and a net modal gain constant of gΓ = 0.164 cm/A is extracted. One cannot use this value, however, to directly extract the waveguide losses and threshold gain. The reason is that contrary to mid-infrared QC lasers, where the transparency condition of the laser transition is readily reached close to zero injection current, in our structure the lower laser level is populated up to threshold and above [34]. There exists thus a minimum current density J0 for which the laser transition becomes transparent. Therefore, an extrapolation of the data collected close
Fig. 3. Light-current (L-I) characteristics of the same devices as in Fig. 2, recorded in pulsed operation at a duty cycle of 0.5% with a He-cooled Si bolometer. The bolometer was calibrated against the pyroelectric radiometer in the cw set-up using the average power of the laser in pulsed operation at 6 K. Solid lines refer to the device without coating of the back facet, while dashed lines represent data collected from the laser with a coated back facet. At the lowest temperature, the latter device emits 4.5 mW, and still more than 1 mW at 65 K. The inset shows laser spectra recorded at two different injection currents at a heat sink temperature of 5 K. The onset of multi-mode lasing above a certain threshold appears in the L-I characteristics as instabilities
to threshold to zero current density overestimates the waveguide losses and consequently the gain. In fact, such an extrapolation would yield an unreasonably high value of $\alpha_W = 30$ cm$^{-1}$, if corrected for the mirror losses of 5 cm$^{-1}$. Likewise, the threshold gain, even assuming a linear behavior over the whole current range, cannot be calculated as $g\Gamma J_{th}$. Nevertheless, in the linear part, the gain in the structure can be written as [35]

$$G = g\Gamma (J - J_0) = \frac{4\pi z^2 q_0^2 (N_3 - N_2)}{\varepsilon_0 \lambda n L_p (2\gamma_{32})} \qquad (1)$$

where $N_3$, $N_2$ are the sheet densities in levels 3 and 2, $z = 7.8$ nm is the dipole matrix element between states 3 and 2, $\varepsilon_0 = 8.85 \times 10^{-12}$ C/(V m), $\lambda = 67$ µm, $L_p = 104.9$ nm is the length of one period, $n = 3.6$, and $2\gamma_{32} = 2$ meV. In this linear region we can then extract an effective time $\tau_0$ for the relaxation of population inversion by differentiation of Eq. 1:

$$\tau_0 = q_0 \frac{\partial G}{\partial J} \left[ \frac{\partial G}{\partial (N_3 - N_2)} \right]^{-1} \qquad (2)$$

We obtain a value of 0.96 ps, which is in fair agreement with previous Monte Carlo simulations [34].
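As a numerical check of Eqs. 1 and 2, the sketch below evaluates the effective relaxation time from the parameters listed above, taking the measured slope gΓ = 0.164 cm/A as ∂G/∂J and interpreting q0 as the elementary charge (an assumption, since the text does not define it explicitly). Variable names are illustrative.

```python
import math

Q0 = 1.602e-19         # elementary charge, C (assumed meaning of q0)
EPS0 = 8.85e-12        # vacuum permittivity, C/(V m), as quoted in the text

z = 7.8e-9             # dipole matrix element, m
lam = 67e-6            # wavelength, m
n_refr = 3.6           # refractive index
L_p = 104.9e-9         # length of one period, m
two_gamma = 2e-3 * Q0  # transition broadening 2*gamma_32 = 2 meV, in joules

# Measured gain slope: 0.164 cm/A, i.e. (1/cm) per (A/cm^2) = 1e-2 m/A.
dG_dJ = 0.164e-2       # m/A

# Eq. 1 differentiated with respect to the inversion sheet density (N3 - N2).
dG_dN = 4.0 * math.pi * z**2 * Q0**2 / (EPS0 * lam * n_refr * L_p * two_gamma)

# Eq. 2: effective relaxation time of the population inversion.
tau0 = Q0 * dG_dJ / dG_dN

if __name__ == "__main__":
    print(f"dG/d(N3-N2) = {dG_dN:.3e} m")
    print(f"tau0        = {tau0 * 1e12:.2f} ps")
```

With these inputs the result is about 0.96 ps, matching the value stated in the text.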
Fig. 4. Spectra recorded from the devices of Fig. 2 in continuous-wave operation at a heat sink temperature of 10 K. The intensity is normalized. (a) Spectrum recorded from the device with coated back-facet. The injection current is 650 mA, equivalent to an output power of about 550 µW. Note the side-mode suppression ratio of more than 20 dB. Increasing the injection current, the laser emits on several longitudinal modes. (b) Spectrum recorded from the device with facets left untreated at an injection current of 700 mA. The output power of the laser is approximately 450 µW. The laser emission is shifted by about 20 GHz towards higher frequencies
Fig. 5. Measurement of peak net gain as a function of injection current density at 8 K using the Hakki-Paoli technique; the dashed horizontal line indicates laser threshold. The slope of a linear fit to the data yields the gain constant. Unlike in most mid-IR QC lasers, the y-axis intercept does not correspond to the waveguide losses due to the existence of a finite current density for which transparency in the active region is reached. In the inset an example sub-threshold spectrum is reported
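The Hakki-Paoli analysis referred to in the caption of Fig. 5 extracts the net gain from the contrast of the Fabry-Perot fringes in sub-threshold spectra. A minimal sketch of the standard expression follows; the facet reflectivity is estimated in the plane-wave approximation with n = 3.6, which is an assumption made here for illustration, and the function names are hypothetical. Whether the quantity plotted in Fig. 5 includes the mirror-loss term is not specified in the text.

```python
import math


def facet_reflectivity(n_eff: float) -> float:
    """Plane-wave (Fresnel) power reflectivity of a cleaved facet."""
    return ((n_eff - 1.0) / (n_eff + 1.0)) ** 2


def hakki_paoli_net_gain(contrast: float, length_cm: float,
                         reflectivity: float) -> float:
    """Modal gain minus waveguide loss (cm^-1) from the fringe contrast.

    contrast is the ratio of adjacent fringe maximum to minimum (> 1).
    At threshold the contrast diverges and the result equals the mirror loss.
    """
    s = math.sqrt(contrast)
    return (math.log((s - 1.0) / (s + 1.0))
            + math.log(1.0 / reflectivity)) / length_cm


if __name__ == "__main__":
    R = facet_reflectivity(3.6)   # assumed effective index
    L = 0.196                     # 1.96 mm long device, in cm
    for contrast in (1.5, 3.0, 10.0, 100.0):
        print(contrast, round(hakki_paoli_net_gain(contrast, L, R), 2))
```

The returned value rises monotonically with the fringe contrast and saturates at the mirror loss, which is where the laser threshold line would sit in such a plot.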
3
Quantum Cascade Laser at 3.5 Terahertz
Recently we have been able to extend the emission towards lower frequencies using a similar active core designed for emission at 14 meV (3.35 THz) and embedding it into an analogous waveguide [24]. In Fig. 6 (a) we show a portion of the conduction band structure of the laser. The two states between which the optical transition takes place are drawn in boldface and share a dipole matrix element of 9.0 nm. The dispersion of the miniband is again chosen to be 17 meV. Although this is larger than the photon energy, re-absorption of the emitted light is avoided by careful engineering of the miniband so that no two states are separated by the photon energy and share a substantial dipole matrix element. Figure 6 (b) shows the calculated optical mode profile of the laser. In this structure, the active core is made of 92 repetitions of the superlattice, whose layer structure is given in the caption. The highly-doped layer inserted underneath the active core is 800 nm thick and doped $1.5 \times 10^{18}$ cm$^{-3}$. Again, epitaxial growth is terminated with a 200 nm thick n-doped ($5 \times 10^{18}$ cm$^{-3}$) GaAs layer in order to facilitate electrical contacting. Using the Drude model and parameters introduced in section 2 we compute a confinement factor of 0.37 and optical propagation losses $\alpha_W = 9$ cm$^{-1}$.
[Figure 6. (a) Conduction band diagram: Energy (eV) versus Distance (nm), with the injector, the active region and levels 2 and 1 marked. (b) Waveguide mode profile: Mode intensity (norm.) versus Distance (µm), with the metal layer and the bottom contact layer marked; Γ = 0.37, αW = 9 cm−1.]
Fig. 6. Conduction band diagram and mode profile of the λ ∼ 85 µm QC laser. (a) Profile of the conduction band edge of a portion of the active core under an electric field of 2.5 kV/cm. The layer thicknesses (in nm) are, from left to right, starting from the injection barrier, 4.0/23.3/0.8/18.5/0.6/13.6/2.4/12.2/2.6/12.7/3.3/12.5/2.8/11.6, where Al0.15Ga0.85As barriers (boldface in the original) alternate with GaAs wells, starting with the injection barrier, and the first 10.5 nm of the 12.5 nm wide GaAs well are doped $4 \times 10^{16}$ cm$^{-3}$. Also shown are the moduli squared of the wavefunctions; the optical transition occurs between the two states drawn in boldface. (b) Calculated mode profile along the growth direction of the final device structure. The origin of the abscissa is at the top metal-semiconductor interface; the laser active core is indicated by the shaded area. A confinement factor of Γ = 0.37 and optical losses of $\alpha_W = 9$ cm$^{-1}$ are computed
Samples were processed as described in section 2. Lasing at λ ∼ 85 µm (3.5 THz) was achieved with peak output powers of 1.7 mW at low temperatures. Figure 7 shows the L-I characteristics obtained from a 4.7 mm long and 180 µm wide laser ridge under pulsed excitation. This laser possesses a very low threshold current density of 95 A cm$^{-2}$, but the small operating range of injection currents prevents high output powers. In fact, the roll-off in the L-I characteristics at around 132 A cm$^{-2}$ coincides with a small feature of negative differential resistance in the V-I characteristics, which also marks the end of resonant tunneling injection. Subtracting 0.5 V for residual resistances, this point is reached at a voltage of 3.9 V, which coincides precisely with the bias field at which the injector ground state and the upper laser level are fully anticrossed. Beyond this point, injection into the upper laser level gradually decreases, as is evident from the roll-off in the L-I characteristics as well as from the increasing differential resistance, which is a signature of deteriorating transport efficiency. We attribute the low maximum current density to poor transport through the miniband rather than to too small a tunnel coupling between the injector and the active region. This assessment is based on the fact that the tunnel coupling in this device is only slightly different from the design at 4.3 THz. We believe that the maximum operating temperature of 47 K is also limited by the low maximum current density and not by real
Fig. 7. Light-current (L-I) characteristics of a 4.7 mm long and 180 µm wide laser device, recorded in pulsed operation at a duty cycle of 1% with a He-cooled Si bolometer. The bolometer was calibrated against the pyroelectric radiometer in the set-up using the Winston cone, providing a collection efficiency of 33%. Peak powers of more than 1.5 mW are reached at low temperatures and still 650 µW at 30 K. Lasing ceases at a heat sink temperature of around 42 K. The inset shows a laser spectrum recorded at a drive current of 1.2 A
thermal issues. Comparing the threshold current density of the 4.7 mm long device with that obtained from a 2.1 mm long device (127 A cm$^{-2}$), one can estimate an upper limit for the waveguide losses. To this end, mirror losses are calculated in the plane-wave approximation, which is likely to underestimate the reflectivity of the laser facet as it neglects the tight confinement of the optical mode. Calculating the effective refractive index from the Fabry-Perot spectra ($n = 3.88$), one computes $\alpha_{M,2.1} = 5$ cm$^{-1}$ and $\alpha_{M,4.7} = 2.2$ cm$^{-1}$ for the 2.1 mm long and 4.7 mm long laser, respectively. Assuming equal modal gain in both devices, the waveguide losses $\alpha_W$ can be calculated using the expression for the threshold current density

$$J_{th} = \frac{\alpha_W + \alpha_M}{g\Gamma} \qquad (3)$$

where $g$ denotes the material gain, finding $\alpha_W = 5$ cm$^{-1}$. Figure 8 shows the cw L-I characteristics recorded from this device with a DTGS detector. The maximum output power can be estimated to be around 350 µW using a calibration of the DTGS detector made with the QC laser emitting at 4.3 THz. No single-mode emission is obtained from such long lasers due to the small mode spacing. In shorter devices, by contrast, single-mode emission is observed. Such a spectrum recorded from a shorter device of 2.1 mm length at an injection current of 570 mA is shown in the inset.
In summary, we have presented semiconductor injection lasers emitting at 4.3 THz (λ ∼ 67 µm) and 3.5 THz (λ ∼ 85 µm), respectively. They are based on chirped superlattice active regions, arranged in a quantum cascade scheme, and employ a new kind of waveguide, capable of strongly confining THz radiation inside a semiconductor with very low optical losses. In pulsed operation, the best devices operate up to a heat-sink temperature of 68 K with peak output powers of 4.5 mW. In continuous wave operation, similar output powers of 4 mW are reached and the maximum operating temperature is 48 K.
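The waveguide-loss estimate based on Eq. 3 and two cavity lengths can be sketched as follows. Mirror losses are computed in the plane-wave approximation from n = 3.88, and Eq. 3 is then solved for the waveguide loss using the two threshold current densities quoted in the text. The function names are illustrative, and the simple two-point solution is an assumption about how the comparison might be carried out.

```python
import math


def mirror_loss(n_eff: float, length_cm: float) -> float:
    """Mirror loss (cm^-1) of a Fabry-Perot cavity with two identical
    cleaved facets, using the plane-wave Fresnel reflectivity."""
    reflectivity = ((n_eff - 1.0) / (n_eff + 1.0)) ** 2
    return math.log(1.0 / reflectivity) / length_cm


def waveguide_loss(j_a: float, alpha_m_a: float,
                   j_b: float, alpha_m_b: float) -> float:
    """Solve Eq. 3 for alpha_W, assuming the same g*Gamma in both devices.

    j_a / j_b = (alpha_W + alpha_m_a) / (alpha_W + alpha_m_b)
    """
    return (j_b * alpha_m_a - j_a * alpha_m_b) / (j_a - j_b)


alpha_short = mirror_loss(3.88, 0.21)   # 2.1 mm long device
alpha_long = mirror_loss(3.88, 0.47)    # 4.7 mm long device
alpha_w = waveguide_loss(127.0, alpha_short, 95.0, alpha_long)

if __name__ == "__main__":
    print(f"mirror loss, 2.1 mm: {alpha_short:.1f} cm^-1")
    print(f"mirror loss, 4.7 mm: {alpha_long:.1f} cm^-1")
    print(f"waveguide loss     : {alpha_w:.1f} cm^-1")
```

The mirror losses come out at about 5.0 and 2.2 cm−1, as in the text. The two-point solution returns a waveguide loss of the same order as the quoted 5 cm−1; it should not be expected to match exactly, since the rounding and details of the original estimate are not given.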
Fig. 8. Light-current (L-I) characteristics of the 4.7 mm long device of Fig. 7, recorded in cw operation at a heat sink temperature of 10 K. The values represent what was collected from one facet onto a DTGS detector (calibrated against a pyroelectric radiometer, see text) after correction for the transmittance of the polyethylene window (0.63). We estimate the maximum cw output power to be 350 µW. The inset shows a laser spectrum recorded from a 2.1 mm long device at an injection current of 570 mA
The operation of the laser has been analyzed using the technique of Hakki and Paoli, and fair agreement between this measurement and previous Monte Carlo simulations was found. In view of the considerable progress in the performance of these devices since their first demonstration, we are confident that cw operation at liquid nitrogen temperature will be reached in the near future. In this perspective, the demonstration of THz QC lasers, operating below the LO phonon band, opens the way towards the development of widely usable THz photonics.
Acknowledgements
The authors would like to thank Sukhdeep Dhillon and Carlo Sirtori of THALES research for the deposition of the high-reflection coatings. This work was supported in part by the European Commission through the IST Framework V FET project WANTED. R.K. and A.T. acknowledge support from the C.N.R. and the Fondazione Cassa di Risparmio di Pisa; E.H.L. and A.G.D. acknowledge support from Toshiba Research Europe Ltd and the Royal Society, respectively.
References
1. P. Y. Han, G. C. Cho, X-C. Zhang, Opt. Lett. 25, 242 (2000).
2. D. M. Mittleman, S. Hunsche, L. Boivin, M. C. Nuss, Opt. Lett. 22, 904 (1997).
3. Daniel Mittleman (Ed.), Sensing with Terahertz Radiation, Springer Berlin, 2002.
4. E. Bründermann, D. R. Chamberlin, E. E. Haller, Appl. Phys. Lett. 76, 2991 (2000), and references therein.
5. E. R. Brown, K. A. McIntosh, K. B. Nichols, C. L. Dennis, Appl. Phys. Lett. 66, 285 (1995).
6. R. Kersting, K. Unterrainer, G. Strasser, H. F. Kauffmann, E. Gornik, "Few-Cycle THz Emission from Cold Plasma Oscillations", Phys. Rev. Lett. 79, 3038 (1997).
7. J. Faist, F. Capasso, D. L. Sivco, C. Sirtori, A. L. Hutchinson, A. Y. Cho, Science 264, 553 (1994).
8. R. F. Kazarinov and R. A. Suris, Possibility of the amplification of electromagnetic waves in a semiconductor with a superlattice, Soviet Physics – Semiconductors 5, 707 (1971).
9. F. Capasso, K. Mohammed, A. Y. Cho, IEEE Jrnl. of Quantum Electronics 22, 1853 (1986).
10. K. L. Wand and P.-F. Yuh, Theory and Applications of Band-Aligned Superlattices, IEEE Jrnl. of Quantum Electronics 25, 12 (1988).
11. A. Kastalsky, V. Goldman, J. Abeles, Possibility of infrared laser in a resonant tunneling structure, Appl. Phys. Lett. 59, 2636 (1991).
12. J. Faist, F. Capasso, D. L. Sivco, A. L. Hutchinson, S-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 72, 680 (1998).
13. R. Colombelli, F. Capasso, C. Gmachl, A. L. Hutchinson, D. L. Sivco, A. Tredicucci, M. C. Wanke, A. M. Sergent, A. Y. Cho, Appl. Phys. Lett. 78, 2620 (2001).
14. M. Beck, D. Hofstetter, T. Aellen, J. Faist, U. Oesterle, M. Ilegems, E. Gini, H. Melchior, Science 295, 301 (2002).
15. M. Rochat, J. Faist, M. Beck, U. Oesterle, M. Ilegems, Appl. Phys. Lett. 73, 3724 (1998).
16. B. S. Williams, B. Xu, Q. Hu, M. R. Melloch, Appl. Phys. Lett. 75, 2927 (1999).
17. J. Ulrich, R. Zobl, W. Schrenk, G. Strasser, K. Unterrainer, E. Gornik, Appl. Phys. Lett. 76, 1928 (2000).
18. R. Köhler, A. Tredicucci, F. Beltram, H. E. Beere, E. H. Linfield, A. G. Davies, D. A. Ritchie, R. C. Iotti, F. Rossi, Nature 417, 156 (2002).
19. M. Rochat, L. Ajili, H. Willenberg, J. Faist, H. E. Beere, A. G. Davies, E. H. Linfield, D. A. Ritchie, Appl. Phys. Lett. 81, 1381 (2002).
20. M. Rochat, G. Scalari, D. Hofstetter, M. Beck, J. Faist, H. Beere, G. Davies, E. Linfield, D. Ritchie, Electr. Lett. 38, 1675 (2002).
21. R. Köhler, A. Tredicucci, F. Beltram, H. E. Beere, E. H. Linfield, A. G. Davies, D. A. Ritchie, S. Dhillon, C. Sirtori, Appl. Phys. Lett. 82, 1518 (2003).
22. B. S. Williams, H. Callebaut, S. Kumar, Q. Hu, J. L. Reno, Appl. Phys. Lett. 82, 1015 (2003).
23. J. Faist, private communication.
24. R. Köhler, A. Tredicucci, F. Beltram, H. E. Beere, E. H. Linfield, A. G. Davies, D. A. Ritchie, Opt. Lett., in press (2003).
25. A. Tredicucci, F. Capasso, C. Gmachl, D. L. Sivco, A. L. Hutchinson, A. Y. Cho, Appl. Phys. Lett. 73, 2101 (1998).
26. R. Köhler, R. C. Iotti, A. Tredicucci, F. Rossi, Appl. Phys. Lett. 79, 3920 (2001).
27. A. Tredicucci, C. Gmachl, F. Capasso, A. L. Hutchinson, D. L. Sivco, A. Y. Cho, Appl. Phys. Lett. 76, 2164 (2000).
28. A. Tredicucci, C. Gmachl, M. C. Wanke, F. Capasso, A. L. Hutchinson, D. L. Sivco, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 77, 2286 (2000).
29. C. Sirtori, J. Faist, F. Capasso, D. L. Sivco, A. L. Hutchinson, A. Y. Cho, Appl. Phys. Lett. 66, 3242 (1995).
30. N. W. Ashcroft and N. D. Mermin, Solid State Physics, ITPS Thomson Learning, London, 1976.
31. M. Rochat, M. Beck, J. Faist, U. Oesterle, Appl. Phys. Lett. 78, 1967 (2001).
32. C. Sirtori, F. Capasso, J. Faist, A. L. Hutchinson, D. L. Sivco, A. Y. Cho, IEEE Jrnl. of Quantum Electronics 34, 1722 (1998).
33. B. W. Hakki and T. L. Paoli, J. Appl. Phys. 46, 1299 (1975).
34. R. Köhler, R. C. Iotti, A. Tredicucci, F. Rossi, Appl. Phys. Lett. 79, 3920 (2001).
35. J. Faist, F. Capasso, C. Sirtori, D. L. Sivco, A. L. Hutchinson, A. Y. Cho, Appl. Phys. Lett. 66, 538 (1995).
Ultrafast Buildup of a Many-Body Resonance after Femtosecond Excitation of an Electron-Hole Plasma in GaAs
Rupert Huber, Florian Tauser, Andreas Brodschelm, and Alfred Leitenstorfer
Physik-Department E11, Technische Universität München, D-85748 Garching, Germany
Abstract. On a femtosecond timescale, we observe how Coulomb screening and collective scattering build up in an extreme nonequilibrium electron-hole plasma photoinjected in GaAs. To this end, we generate the plasma via interband excitation with a 10 fs laser pulse. The subsequent polarization response of the system is probed with uncertainty-limited temporal resolution using ultrabroadband terahertz spectroscopy. We show that the intrinsic material becomes conductive instantaneously upon carrier injection, whereas collective effects such as Coulomb screening and plasmon scattering exhibit a delayed onset. Thus, the ultrafast formation of dressed quasiparticles is directly monitored for the first time. The timescale for these phenomena is of the order of the inverse plasma frequency. Our findings support recent quantum kinetic simulations.
1
Introduction
The quasiparticle model is one of the fundamental concepts in many-body physics. It is usually impractical to regard each isolated ("bare") particle and to take into account its interactions with all the other individual components of the system. More physical insight is gained by introducing new units which are composed of the bare particle plus some average surroundings. Usually, these quasiparticles are assumed to form instantly. However, this picture turns out to be valid only on timescales which are long compared to the oscillation cycle of the collective mode of the system. Very recently, the quantum dynamical phenomena which occur during the formation of interparticle correlations in systems far from thermal equilibrium have become accessible experimentally [1,2], using different types of ultrafast spectroscopy with a time resolution of 10 fs. The measurements presented in this article are based on a novel experimental method which allows for femtosecond probing of mid-infrared excitations in solids with sub-cycle resolution. This technique of ultrabroadband electro-optic sampling [3,4] has previously been used in terahertz emission experiments investigating transient high-field transport in semiconductor devices [5,6]. We show that generation and field-resolved detection of multi-THz electric field transients may be exploited for studies of ultrafast carrier-carrier interactions which are relevant for a large variety of microscopic phenomena found in condensed matter. While the interaction of two isolated point charges is described by the bare Coulomb potential, in many-body systems this interaction is modified as a result of the collective response of the screening cloud surrounding each charge carrier. The present contribution reports on a measurement of the ultrafast buildup of Coulomb screening and collective behavior in a dense electron-hole plasma photogenerated within 10 fs [2]. In recent interband pump-probe experiments on polaron formation [1], the dynamics of the carrier distribution has been monitored, which is a result of a coupling process between elementary excitations. In contrast, ultrabroadband two-dimensional THz studies add a new dimension: direct access to the details of the interaction process itself. This goal is achieved by probing the dielectric response of the non-equilibrium system in the frequency regime containing the eigenfrequency of its collective mode, in our case a plasma resonance at 15 THz. It turns out that shortly after generation of the plasma, the amplitude and phase distortion of a single-cycle probe transient at 28 THz [7,8] is almost independent of frequency, i.e. close to instantaneous. If the system is probed approximately 100 fs after its excitation, we find a delayed response due to plasma oscillations. These observations are consistent with an ultrafast transition from an undressed state of bare charges far from equilibrium into a correlated many-body ensemble with screening clouds surrounding each carrier.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 341–351, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Femtosecond Buildup of Coulomb Screening in GaAs Probed via Ultrabroadband THz Spectroscopy
2.1 Physical Background
The earliest stage in the dynamics of interacting many-body excitations in solids is the buildup of inter-particle correlations. A photogenerated electron-hole plasma in a semiconductor constitutes an ideal playground to study such ultrafast transient phenomena: In a plasma each carrier attracts a charge cloud of opposite sign to form a dressed quasiparticle. This screening cloud modifies the interaction potential of the charges with respect to the bare Coulomb potential $V_q$ in reciprocal space:

$$V_q = \frac{4\pi e^2}{q^2} \qquad (1)$$

Here $q$ denotes the momentum exchange in the interaction process and $e$ is the elementary charge. The effective interaction in the carrier plasma [9] may be described by the potential $W_q(\omega, t_D)$, which is defined as

$$W_q(\omega, t_D) = \frac{V_q}{\epsilon_q(\omega, t_D)} \qquad (2)$$
The longitudinal dielectric function $\epsilon_q(\omega, t_D)$ renormalizes the bare Coulomb potential $V_q$ (Eq. 1). While $V_q$ mediates an instantaneous interaction with a white frequency spectrum, $W_q(\omega, t_D)$ may exhibit a pronounced dependence on the frequency $\omega$ corresponding to the energy exchanged in a collision. This retardation in the many-body system is caused by collective effects such as plasma oscillations at a frequency $\omega_{pl}$. In the nonequilibrium stage immediately after femtosecond photogeneration of a plasma, $W_q(\omega, t_D)$ might also depend on the time delay $t_D$ with respect to the event of photoinjection. Shortly after the advent of spectroscopic techniques with sub-picosecond temporal resolution, the first theoretical calculations concerning carrier-carrier interactions among photoexcited charge carriers were carried out [10,11,12,13]. These simulations were based on static and dynamic screening models within the framework of the semiclassical Boltzmann equation. In this context, the screened quasiparticles in the plasma are assumed to form instantly after carrier excitation. This assumption might become questionable on timescales of a few femtoseconds where a highly excited solid might behave in a quantum coherent way. The conditions associated with extremely short times where the phase of the quantum mechanical wave functions comes into play may be described within the framework of quantum kinetic theories [14]. It turns out that phenomena connected to the formation of quasiparticles in states far from thermal equilibrium constitute a new class of such quantum kinetic processes. In particular, the buildup of screening and collective phenomena in an electron-hole plasma after ultrafast optical excitation has been studied intensely via quantum kinetic simulations [15,16,17,18,19,20], and a sophisticated quantum kinetic theory for the dynamics of $W_q(\omega, t_D)$ has been developed [16,17,18,20].
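As a rough illustration of how a collective mode makes the interaction of Eq. (2) frequency dependent, one may insert the simplest long-wavelength model for the dielectric function. The undamped Drude form used below, with a background constant $\epsilon_\infty$ and no lattice contribution, is an assumption made here for illustration only; it is not the quantum kinetic result discussed above and it ignores any dependence on $t_D$.

```latex
\epsilon(\omega) = \epsilon_\infty \left( 1 - \frac{\omega_{pl}^2}{\omega^2} \right)
\quad\Longrightarrow\quad
W_q(\omega) = \frac{V_q}{\epsilon(\omega)}
            = \frac{V_q}{\epsilon_\infty}\,\frac{\omega^2}{\omega^2 - \omega_{pl}^2}
% omega >> omega_pl :  W_q -> V_q / epsilon_infty   (frequency independent, i.e. instantaneous)
% omega -> omega_pl :  W_q diverges                 (plasmon resonance)
% omega <  omega_pl :  W_q changes sign and vanishes as omega -> 0
```

In this sketch the interaction is white only well above $\omega_{pl}$; any energy exchange comparable to $\hbar\omega_{pl}$ probes the resonance, which is why the spectral window around the plasma frequency is the relevant one to monitor.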
Experiments investigating the regime of Coulomb quantum kinetics [21,22,23,24] have been sensitive to the dynamics of interband polarizations and particle distributions. However, they have not provided access to the interaction potential itself. In contrast, we present a direct observation of the buildup of screening and the formation of dressed quasiparticles in a dense electron-hole plasma photogenerated in GaAs within 10 fs. To this end, the low-energy excitations of the plasma and the resulting resonances in $\epsilon(\omega, t_D)$ are mapped out with sub-cycle temporal resolution.
2.2 Experimental Setup
The experimental setup is sketched in Fig. 1. We start with 10 fs laser pulses (center wavelength: 780 nm) at a repetition rate of 64 MHz and an average output power of 1 W from a mode-locked Ti:sapphire laser system. The near infrared light is split up into three beams. The major part of the laser intensity is directly focussed onto a GaAs sample. The specimen consists of a 200-nm-thin epitaxial layer of high-purity GaAs attached to a transparent diamond substrate by van-der-Waals forces. Via resonant interband absorption of a laser pulse, free electron-hole pairs are created in GaAs. The photoinduced particle pair density is $N = 2 \times 10^{18}$ cm$^{-3}$, resulting in a plasma frequency of $\omega_{pl}/2\pi \approx 15$ THz. At a delay time $t_D$ after photoexcitation, the mid infrared polarizability of this nonequilibrium system is tested in the long-wavelength limit, i.e. $q \approx 0$, by a single-cycle electric field transient of a duration of 27 fs (FWHM of the intensity, Fig. 2(a)) and a center frequency of 28 THz (Fig. 2(b)). Phase-matched optical rectification of another portion of the 10 fs laser pulse in a GaSe nonlinear crystal as thin as 30 µm [7,8] is exploited to generate this ultrabroadband THz probe. Any resonance in the system will lead to a distortion of the wave form and a retarded tail to the probe transient transmitted through the sample. The changes $\Delta E_{\mathrm{THz}}$ induced in the probe electric field $E_{\mathrm{THz}}$ are directly measured in the time domain via ultrabroadband electro-optic sampling [3,4]: The transmitted THz transient is focussed into a <110>-oriented ZnTe crystal of a thickness of 10 µm. The electro-optic effect results in a birefringence of ZnTe which is proportional to the THz electric field amplitude present in the crystal. A time-delayed third part of the 10 fs laser pulse reads out the birefringence, thereby sampling the THz wave form as a function of a second delay time $T$. Our setup allows us to detect birefringence-induced modifications of the sampling photon current as low as $\Delta I/I = 5 \times 10^{-9}\,\mathrm{Hz}^{-1/2}$, limited only by the shot noise of the laser light used to sample the THz wave form.
Fig. 1. Experimental setup for ultrafast near infrared pump-multi-THz probe measurements. VD1 and VD2: variable delay lines; λ/2: half-wave plate; GaSe: nonlinear optical crystal (thickness: 30 µm); P1, P2, P3, P4: gold coated parabolic mirrors; P: polarizer; Si: high-resistivity silicon window; ZnTe: electro-optic crystal (<110>-oriented, thickness: 10 µm); λ/4: quarter-wave plate; WP: Wollaston prism; BP: balanced pair of photodiodes
Figure 2(a) depicts the electric field amplitude $E_{\mathrm{THz}}$ of the THz probe transmitted through the unexcited GaAs sample versus delay time $T$. The Fourier transform of this THz transient is shown in Fig. 2(b) and (c). The field amplitude (Fig. 2(b)) exhibits frequency components between 1 THz and 100 THz, covering the entire mid and far infrared wavelength region from λ = 300 µm to 3 µm. The phase spectrum in Fig. 2(c) is flat between 10 THz and 55 THz, giving rise to a time-bandwidth product as small as $t_p \times \Delta f = 0.35$. In this wavelength regime, our single-cycle pulses represent the ultimate probe with a temporal resolution limited only by the uncertainty principle.
Fig. 2. (a) Electric field amplitude $E_{\mathrm{THz}}$ of the single-cycle probe versus time $T$. The pulse duration is $t_p = 27$ fs (FWHM of the intensity envelope). (b) Amplitude and (c) phase spectra of the single-cycle transient versus frequency and photon energy. $f_0$ denotes the center frequency. $\Delta f$ represents the FWHM of the spectral intensity
2.3 Experimental Results
In a two-dimensional time domain spectroscopy, we measure the THz electric field change $\Delta E_{\mathrm{THz}}$ due to carrier injection with the 10 fs pump as a function of pump-probe delay $t_D$ and THz sampling delay $T$ [2]. The electric waveform $E_{\mathrm{THz}}$ of the test transient is shown again in Fig. 3(a). The field change $\Delta E_{\mathrm{THz}}$ is displayed in a grayscale map versus pump-probe delay $t_D$ (vertical) and THz sampling delay $T$ (horizontal) in Fig. 3(b). The diagonal dotted line denotes the position of the maximum of the 10 fs pump pulse. For pump-probe delays between $t_D = -20$ fs and $+20$ fs, the excitation pulse overlaps with the electric field of the THz probe, and a negative (white) and positive (black) half cycle in $\Delta E_{\mathrm{THz}}$ appear along the excitation diagonal. These striking features in the vertical region around $T = 0$ fs correspond to an instantaneous perturbation of the single-cycle probe which appears as soon as the plasma is injected. A retarded oscillatory response of the plasma starts to develop at $t_D = 40$ fs, visualized by the dashed black line in Fig. 3(b) to guide the eye: An additional half cycle in $\Delta E_{\mathrm{THz}}$ builds up, represented by a vertical dark gray column labelled 1 at $T = 60$ fs. In this region no probe field is present without excitation (see Fig. 3(a)). For even longer pump-probe delays $t_D \geq 70$ fs, a second half wave (label 2) appears in the retarded response at $T = 100$ fs. Another maximum (3) shows up at $T = 140$ fs for $t_D \geq 100$ fs and finally a last minimum (4) appears at $T = 170$ fs for $t_D \geq 120$ fs. For pump-probe delays beyond $t_D = 150$ fs the electric field change $\Delta E_{\mathrm{THz}}$ versus $T$ becomes stationary on a picosecond time scale given by the carrier recombination time in the sample.
Fig. 3. (a) Electric field amplitude $E_{\mathrm{THz}}$ of the single-cycle probe versus time $T$. (b) The polarization response of the plasma depends on two delay times, $t_D$ and $T$. The electric field change $\Delta E_{\mathrm{THz}}$ induced by 10 fs photoexcitation of $2 \times 10^{18}$ cm$^{-3}$ electron-hole pairs in GaAs is shown as a grayscale map versus $t_D$ and $T$. The dotted diagonal line denotes the position of the 800 nm pump pulse and the dashed curve serves to emphasize the buildup of the retarded plasmon response
In order to obtain a more quantitative description of the experimental data in Fig. 3 and to extract $\epsilon_{q=0}(\omega, t_D)$, we transfer our results into the frequency domain by an incomplete Fourier transform [16,17]. Strictly speaking, with our THz probe we measure a transverse dielectric property of the system, while the longitudinal dielectric function is relevant for the Coulomb scattering matrix element. However, we want to point out that the transverse and the longitudinal dielectric function are degenerate in the limit of small momentum transfer $q$. This requirement is fulfilled under our experimental conditions since the wavelength of the THz probe is much larger than the relevant microscopic length scales characterizing the plasma, namely the average interparticle distance and the exciton Bohr radius. Since our detection scheme is sensitive to both amplitude and phase of the transmitted probe field, we are able to access the full complex dielectric function in the long-wavelength limit, i.e. $\epsilon_{q=0}(\omega, t_D)$, via the following procedure similar to the one described in [25]: Both sets of real time data $E_{\mathrm{THz}}(T)$ and $\Delta E_{\mathrm{THz}}(t_D, T)$ are Fourier transformed along the electro-optic sampling axis $T$. In addition, only precisely known information enters the calculation: The layer structure of the sample is accounted for by a transfer matrix formalism which includes reflection and absorption losses as well as phase shifts upon propagation through the interfaces and the layer material, respectively [26]. The dielectric function of intrinsic GaAs without optical excitation is parameterized by a dielectric oscillator model where $\epsilon_{q=0}(\omega, t_D < 0)$ is represented by the following equation, with $\omega_{pl}$ set to zero:

$$\epsilon(\omega) = \epsilon_\infty \times \left[ 1 + \frac{\omega_{LO}^2 - \omega_{TO}^2}{\omega_{TO}^2 - \omega^2 - i\gamma\omega} - \frac{\omega_{pl}^2}{\omega^2 + i\omega/\tau} \right] \qquad (3)$$

$\omega_{LO}$ and $\omega_{TO}$ are the longitudinal and transverse optical phonon frequencies in the center of the Brillouin zone ($\omega_{LO}/2\pi = 8.8$ THz and $\omega_{TO}/2\pi = 8.1$ THz in GaAs). The lattice damping is described by $\gamma = 0.2$ ps$^{-1}$ and the nonresonant background polarizability of the bound electrons is accounted for by $\epsilon_\infty = 11.0$.
Figure 4 (a) and (b) show the imaginary and real parts of the inverse dielectric function versus frequency (or equivalently, versus energy $\hbar\omega$ exchanged in a long-range Coulomb collision) for various pump-probe delay times $t_D$. The negative imaginary part of $1/\epsilon_{q=0}$ (Fig. 4(a)) reflects the buildup of dissipation in the plasma: 25 fs after carrier generation a wide range of energies may be exchanged between the particles, indicating an uncorrelated state in an extreme nonequilibrium situation. Within 100 fs a sharp maximum at the plasma frequency of $\omega_{pl}/2\pi = 14.4$ THz appears due to the transition to collective plasmon scattering. In the unexcited sample (lower solid line in Fig. 4(a)) $-\mathrm{Im}(1/\epsilon_{q=0})$ vanishes exactly, except for a small peak at $\omega_{LO}/2\pi = 8.8$ THz which is due to energy exchange with the crystal lattice via polar-optical scattering with LO phonons.
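The oscillator model of Eq. (3) and the loss function $-\mathrm{Im}(1/\epsilon)$ discussed above can be evaluated numerically with a few lines of code. The following is a minimal sketch, not the authors' analysis code: the function names, the frequency grid and the search window are choices made here, and the values $\omega_{pl}/2\pi = 14.4$ THz and $\tau = 85$ fs are the late-delay fit results quoted in the text. Frequencies are handled as $\omega/2\pi$ in THz and converted to rad/ps internally.

```python
import math

# Lattice parameters quoted in the text; frequencies are omega/2pi in THz.
F_LO_THZ = 8.8       # longitudinal optical phonon
F_TO_THZ = 8.1       # transverse optical phonon
GAMMA_PER_PS = 0.2   # lattice damping in 1/ps
EPS_INF = 11.0       # background dielectric constant


def epsilon(f_thz, f_pl_thz=0.0, tau_fs=85.0):
    """Dielectric function of Eq. (3) at the frequency f_thz = omega/2pi."""
    two_pi = 2.0 * math.pi
    w = two_pi * f_thz               # angular frequencies in rad/ps
    w_lo = two_pi * F_LO_THZ
    w_to = two_pi * F_TO_THZ
    w_pl = two_pi * f_pl_thz
    inv_tau = 1000.0 / tau_fs        # 1/tau in 1/ps
    lattice = (w_lo**2 - w_to**2) / (w_to**2 - w**2 - 1j * GAMMA_PER_PS * w)
    drude = w_pl**2 / (w**2 + 1j * w * inv_tau)
    return EPS_INF * (1.0 + lattice - drude)


def loss(f_thz, f_pl_thz=0.0, tau_fs=85.0):
    """Loss function -Im(1/epsilon)."""
    return -(1.0 / epsilon(f_thz, f_pl_thz, tau_fs)).imag


def peak_frequency(f_pl_thz, tau_fs, f_min, f_max, step=0.005):
    """Frequency (THz) at which the loss function is largest on a uniform grid."""
    n_steps = int(round((f_max - f_min) / step))
    grid = [f_min + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda f: loss(f, f_pl_thz, tau_fs))


# Unexcited GaAs (omega_pl = 0): only the LO phonon contributes.
f_unexcited = peak_frequency(0.0, 85.0, 1.0, 40.0)
# Excited sample with the late-delay fit values; the search window above
# 10 THz (an assumption of this sketch) isolates the plasmon-like peak.
f_excited = peak_frequency(14.4, 85.0, 10.0, 40.0)

print(f"unexcited GaAs: loss peak near {f_unexcited:.2f} THz")
print(f"excited plasma: loss peak near {f_excited:.2f} THz")
```

With $\omega_{pl}$ set to zero the only structure in the loss function is the narrow LO-phonon line, while a finite $\omega_{pl}$ adds a much broader peak close to the plasma frequency whose width is governed by $1/\tau$.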
The real part of $1/\epsilon_{q=0}$ (Fig. 4(b)) describes the buildup of Coulomb screening. The quantity $\mathrm{Re}(1/\epsilon_{q=0})$ renormalizes the effective charge that an electron feels in the interaction process with a quasiparticle in the plasma exchanging an energy $\hbar\omega$ (see Eqs. 1 and 2). For late pump-probe delays of $t_D \geq 150$ fs, a resonant dispersive feature of over- and antiscreening [2] is found around the plasma frequency. This phenomenon is consistent with a fully developed dressed interaction. Interestingly, the resonance extrema do not emerge instantaneously with carrier generation: At $t_D = 25$ fs, the spectrum is completely flat above $\hbar\omega_{pl}$, indicating bare Coulomb collisions. However, $\mathrm{Re}(1/\epsilon_{q=0})$ drops close to zero in the low-energy region with a step at $\hbar\omega_{pl}$. This behaviour means a very large polarizability at low frequencies, consistent with an onset of the free-carrier conductivity immediately after carrier generation at $t_D = 0$. In contrast, $\mathrm{Re}(1/\epsilon_{q=0})$ is always finite for ground-state GaAs (lower solid line in Fig. 4(b)), in agreement with the insulating properties of the unexcited semiconductor.
Fig. 4. (a) Imaginary and (b) real part of the long-wavelength limit of the inverse dielectric function of GaAs versus frequency for different pump-probe delays $t_D$. The lower solid lines result from the dielectric oscillator model of Eq. (3) for unexcited GaAs. A Drude-fit of the data at $t_D = 150$ fs is represented by the upper solid line
The classical Drude formula of Eq. 3 describes the long-wavelength and high-frequency limit of the dielectric function of an electron gas assuming an exponential plasmon damping with a time constant $\tau$. We have performed least-squares fits to our data allowing $\omega_{pl}$ and $\tau$ in the Drude response as free parameters while we kept the lattice part fixed. Good agreement is found for late delay times. An example is given by the uppermost curves in Fig. 4(a) and (b) with $\omega_{pl}/2\pi = 14.4$ THz and $\tau = 85$ fs for $t_D = 150$ fs. Interestingly, the dielectric functions measured in the non-Markovian quantum regime at early times $t_D$ show clear deviations from a Drude shape. For a heuristic interpretation, the results for the best fit parameters $\tau$ and $\omega_{pl}$ are displayed in Fig. 5. In this context, $\tau$ serves as a measure for the memory depth of the quantum system. $\tau$ increases from a value below 20 fs at $t_D = 25$ fs up to $\tau = 85$ fs for $t_D > 125$ fs, directly reflecting the time-delayed onset of the plasmon response and the dressing of bare charges (Fig. 5(a)). The initial value of $\tau$ of approximately 45 fs obtained at $t_D = 0$ is influenced by coherent effects due to pump-probe overlap. It may be regarded as a remainder of the sharp lattice resonance in unexcited GaAs. The plasma frequency $\omega_{pl}$ is proportional to the square root of the particle pair density $N$. In contrast to the memory depth $\tau$, $\omega_{pl}$ remains constant already for $t_D > 25$ fs as a result of the carrier generation process which is completed within 10 fs (Fig. 5(b)).
We want to emphasize that our experimental findings strongly support quantum kinetic theories for the nonequilibrium dynamics of the Coulomb interaction. In fact, such models have predicted a strongly broadened plasmon pole and a delayed buildup of Coulomb screening at early times after femtosecond interband excitation of charge carriers [16,17,18,20]. The typical time scale for these phenomena is of the order of the duration of a plasma oscillation period, i.e. $2\pi/\omega_{pl} = 70$ fs in our case.
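The two numbers that set this time scale, the plasma frequency and the corresponding oscillation period, can be checked with a short calculation. The sketch below uses the textbook SI expression $\omega_{pl}^2 = N e^2 / (\epsilon_0 \epsilon_\infty \mu)$ for an electron-hole plasma with reduced mass $\mu$; the effective masses (0.067 and 0.5 free-electron masses) are typical GaAs values assumed here and are not taken from the text, so the result should only be read as an order-of-magnitude check against the quoted value of about 15 THz.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
M0 = 9.1093837015e-31        # free-electron mass in kg


def plasma_frequency_thz(n_cm3, eps_inf=11.0, m_e=0.067, m_h=0.5):
    """omega_pl/2pi in THz for an electron-hole plasma of pair density n_cm3.

    m_e and m_h are effective masses in units of M0; both are assumed,
    typical GaAs values and not parameters given in the text.
    """
    n_m3 = n_cm3 * 1.0e6                           # cm^-3 -> m^-3
    mu = M0 / (1.0 / m_e + 1.0 / m_h)              # reduced electron-hole mass
    omega_pl = math.sqrt(n_m3 * E_CHARGE**2 / (EPS0 * eps_inf * mu))
    return omega_pl / (2.0 * math.pi) / 1.0e12


def period_fs(f_thz):
    """Oscillation period in fs for a frequency given in THz."""
    return 1000.0 / f_thz


f_est = plasma_frequency_thz(2.0e18)
print(f"estimated omega_pl/2pi: {f_est:.1f} THz")
print(f"period at 14.4 THz:     {period_fs(14.4):.1f} fs")
```

The estimate scales with the square root of the pair density, so quadrupling $N$ doubles $\omega_{pl}$ and halves the time scale on which screening is expected to build up.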
Fig. 5. (a) Plasmon damping time $\tau$ and (b) plasma frequency $\omega_{pl}$ versus pump-probe delay $t_D$, as obtained from least-squares fits of the Drude response of Eq. 3 to the data in Fig. 4
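A fit of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for Fig. 5: the data are synthetic and noise-free, the complex quantity $1/\epsilon$ is chosen here as the fit target, and a brute-force grid search replaces whatever least-squares minimizer the authors employed. Only the lattice parameters and the structure of Eq. (3) are taken from the text.

```python
import math

# Lattice part of Eq. (3), kept fixed during the fit (values from the text).
F_LO, F_TO, GAMMA, EPS_INF = 8.8, 8.1, 0.2, 11.0   # THz, THz, 1/ps, dimensionless


def inv_eps(f_thz, f_pl_thz, tau_fs):
    """Inverse dielectric function 1/epsilon from Eq. (3)."""
    w, w_lo, w_to, w_pl = (2.0 * math.pi * x for x in (f_thz, F_LO, F_TO, f_pl_thz))
    lattice = (w_lo**2 - w_to**2) / (w_to**2 - w**2 - 1j * GAMMA * w)
    drude = w_pl**2 / (w**2 + 1j * w * 1000.0 / tau_fs)
    return 1.0 / (EPS_INF * (1.0 + lattice - drude))


def fit_drude(freqs, data, f_pl_grid, tau_grid):
    """Grid search for the (f_pl, tau) pair minimizing the summed squared residual."""
    best = None
    for f_pl in f_pl_grid:
        for tau in tau_grid:
            cost = sum(abs(inv_eps(f, f_pl, tau) - d) ** 2
                       for f, d in zip(freqs, data))
            if best is None or cost < best[0]:
                best = (cost, f_pl, tau)
    return best[1], best[2]


# Synthetic "measurement": the model itself at 14.4 THz and 85 fs (hypothetical data).
freqs = [10.0 + 0.5 * i for i in range(61)]        # 10 ... 40 THz
data = [inv_eps(f, 14.4, 85.0) for f in freqs]

f_pl_grid = [12.0 + 0.1 * i for i in range(41)]    # 12.0 ... 16.0 THz
tau_grid = [20.0 + 5.0 * i for i in range(27)]     # 20 ... 150 fs
f_fit, tau_fit = fit_drude(freqs, data, f_pl_grid, tau_grid)
print(f"best fit: omega_pl/2pi = {f_fit:.1f} THz, tau = {tau_fit:.0f} fs")
```

Repeating such a fit for each pump-probe delay would give curves of the type shown in Fig. 5; at early delays, where the measured spectra deviate from a Drude shape, the fitted $\tau$ is best read as an effective parameter.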
2.4 Conclusion
In conclusion, we have reported the direct observation of the ultrafast buildup of Coulomb correlation effects and the formation of dressed quasiparticles in a many-body system. We demonstrate that plasmon scattering and Coulomb screening are not present instantaneously after 10 fs photogeneration of a dense electron-hole plasma in GaAs. However, they evolve on a time scale which is approximately set by the inverse plasma frequency. The conductivity of the plasma is shown to set in immediately with carrier generation. These findings are in agreement with recent simulations in the framework of Coulomb quantum kinetics. The results are obtained utilizing a novel technique based on ultrabroadband THz spectroscopy that allows us to resolve the polarization response of the system with sub-cycle resolution of the electric field amplitude and phase. We want to point out that the present scheme may be employed to study the dynamics of various elementary excitations of interesting materials and nanostructures in the mid to far infrared spectral region with a time resolution close to the ultimate limit. As an example, new perspectives arise for investigations in systems such as magnons in high-Tc superconductors, electron-lattice interactions in organic semiconductors and vibrational dynamics in large molecules and biological complexes.
Acknowledgements
We gratefully acknowledge stimulating discussions with L. Bányai, H. Haug and L. V. Keldysh. The high-quality GaAs sample has been provided by M. Bichler and G. Abstreiter.
References
1. M. Betz, G. Göger, A. Laubereau, P. Gartner, L. Bányai, H. Haug, K. Ortner, C. R. Becker, and A. Leitenstorfer, Phys. Rev. Lett. 86, 4684 (2001).
2. R. Huber, F. Tauser, A. Brodschelm, M. Bichler, G. Abstreiter, and A. Leitenstorfer, Nature 416, 286 (2001).
3. Q. Wu and X.-C. Zhang, Appl. Phys. Lett. 71, 1285 (1997).
4. A. Leitenstorfer, S. Hunsche, J. Shah, M. C. Nuss, and W. H. Knox, Appl. Phys. Lett. 74, 1516 (1999).
5. A. Leitenstorfer, S. Hunsche, J. Shah, M. C. Nuss, and W. H. Knox, Phys. Rev. Lett. 82, 5140 (1999).
6. A. Leitenstorfer, S. Hunsche, J. Shah, M. C. Nuss, and W. H. Knox, Phys. Rev. B 61, 16642 (2000).
7. R. Huber, A. Brodschelm, F. Tauser, and A. Leitenstorfer, Appl. Phys. Lett. 76, 3191 (2000).
8. A. Brodschelm, F. Tauser, R. Huber, J. Y. Sohn, and A. Leitenstorfer, in Ultrafast Phenomena XII, T. Elsaesser, S. Mukhamel, M. M. Murnane, N. F. Scherer (Eds.) (Springer Series in Chemical Physics 66, Berlin, 2000), pp. 215-217.
9. J. Lindhard, Dan. Mat. Fys. Medd. 28, 2-57 (1954).
10. M. A. Osman and D. K. Ferry, Phys. Rev. B 36, 6018 (1987).
11. S. M. Goodnick and P. Lugli, Phys. Rev. B 37, 2578 (1988).
12. J. F. Young, N. L. Henry, and P. J. Kelly, Sol. Stat. Electron. 32, 1567 (1989).
13. J. H. Collet, Phys. Rev. B 36, 6018 (1987).
14. H. Haug, A.-P. Jauho, Quantum Kinetics in Transport and Optics of Semiconductors, Springer Series in Solid-State Sciences Vol. 123 (Springer, Berlin, 1996).
15. M. Hartmann, H. Stolz, and R. Zimmermann, phys. stat. sol. (b) 159, 35 (1990).
16. K. El Sayed, S. Schuster, H. Haug, F. Herzel, and K. Henneberger, Phys. Rev. B 49, 7337 (1994).
17. K. El Sayed, L. Bányai, and H. Haug, Phys. Rev. B 50, 1541 (1994).
18. L. Bányai, Q. T. Vu, B. Mieck, and H. Haug, Phys. Rev. Lett. 81, 882 (1998).
19. N.-H. Kwong and M. Bonitz, Phys. Rev. Lett. 84, 1768 (2000).
20. Q. T. Vu and H. Haug, Phys. Rev. B 62, 7179 (2000).
21. F. X. Camescasse, A. Alexandrou, D. Hulin, L. Banyai, D. B. Tran Thoai, and H. Haug, Phys. Rev. Lett. 77, 5429 (1996).
22. W. A. Hügel, M. F. Heinrich, M. Wegener, Q. T. Vu, L. Banyai, and H. Haug, Phys. Rev. Lett. 83, 3313 (1999).
23. M. Bonitz, J. F. Lampin, F. X. Camescasse, A. Alexandrou, Phys. Rev. B 62, 15724 (2000).
24. Q. T. Vu, H. Haug, W. A. Hügel, S. Chatterjee, and M. Wegener, Phys. Rev. Lett. 85, 3508 (2000).
25. M. Schall and P. U. Jepsen, Opt. Lett. 25, 13 (2000).
26. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press, Cambridge, 1999).
Quantum Cascade Lasers for the Mid-infrared Spectral Range: Devices and Applications
Ch. Mann1, Q. K. Yang1, F. Fuchs1, W. Bronner1, R. Kiefer1, K. Köhler1, H. Schneider1, R. Kormann2, H. Fischer2, T. Gensty3, and W. Elsäßer3
1 Fraunhofer-Institute for Applied Solid State Physics (IAF), Tullastrasse 72, D-79108 Freiburg, Germany, [email protected]
2 Max-Planck-Institute for Chemistry, J.J. Becher-Weg 27, D-55128 Mainz, Germany
3 Institute of Applied Physics, Darmstadt University of Technology, Schlossgartenstrasse 7, D-64289 Darmstadt, Germany
Abstract. Quantum cascade lasers emitting at λ ∼ 5 µm based on different active region designs are investigated. Using lattice-matched GaInAs/AlInAs on InP substrates the maximum peak optical power as well as the maximum pulsed-mode operating temperature is enhanced by incorporating AlAs blocking barriers together with strain-compensating InAs layers into the active regions. Further improvement is achieved by employing strain-compensated GaInAs/AlInAs quantum wells for which maximum pulsed-mode operating temperatures in excess of 350 K are observed. High-reflectivity coated devices mounted substrate-side down show a maximum continuous-wave operating temperature of 194 K. Also the normalized relative intensity noise is investigated. Finally, a comparison trace-gas sensing experiment employing one of the present quantum cascade lasers and a lead-chalcogenide laser is presented. Detecting the P(25) absorption line of CO, higher stability is obtained using a quantum cascade laser.
1 Introduction
Quantum cascade (QC) lasers are unipolar mid- to far-infrared emitters in which the laser transition occurs between quantized energy levels within, e.g., the conduction band. As the emission wavelength is determined by quantum confinement, a broad wavelength range can be covered by tuning the thicknesses of the individual layers without changing the material compositions. Since their first demonstration in 1994 employing GaInAs/AlInAs grown on InP substrate [1], these novel devices have made tremendous progress. At present the wavelength range covered by this material system reaches from 3.5 µm [2] to 24 µm [3], and two-wavelength [4,5] as well as broadband QC lasers emitting continuously between 6 µm and 8 µm have also been fabricated [6]. A maximum pulsed-mode operating temperature of 470 K at a wavelength of 5.5 µm [7] and, recently, continuous-wave (CW) operation at room temperature for devices emitting at 9.1 µm [8] were reported. Because of the high operating temperature and the high pulsed optical power, QC lasers are becoming suitable light sources for trace-gas sensing as well as for optical free-space communication. However, for devices emitting at around 5 µm wavelength the maximum CW operating temperature is currently still limited to 210 K [9].
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 351–368, 2003. © Springer-Verlag Berlin Heidelberg 2003
To achieve the population inversion necessary for laser action, the lifetime of electrons in the final state of the laser transition must be shorter than the lifetime of electrons in the initial state. For this reason, in QC lasers resonant emission of longitudinal-optical (LO) phonons from the final state to a bound state lying approximately one LO-phonon energy below is widely employed as depopulation mechanism [1]. Under operating conditions electrons are injected into the active region of the QC laser structure by resonant tunneling through the injection barrier. After making radiative transitions they relax by resonant LO-phonon emission and tunnel through the exit barrier into the following injector. The injectors are designed such that a minigap opens up for electrons in the initial state of the laser transition to prevent them from leaking into the injector (“Bragg reflector”), while a miniband is created for electrons in the final states, favoring them to tunnel through and be injected into the active region of the next period [10,11,12].
In the present paper we focus on GaInAs/AlInAs/InP based QC lasers emitting in the 5 µm wavelength range. First, modified designs aiming at enhancing population inversion are investigated. In Sect. 2 we show that the confinement of electrons in the initial state of the laser transition of QC lasers using lattice-matched GaInAs/AlInAs is enhanced by incorporating AlAs blocking barriers together with strain-compensating InAs layers into the active regions. In Sect. 3 we report on QC lasers employing strain-compensated active regions. For these devices the increased conduction band offset further enhances the electrical confinement and additionally offers more design flexibility. Next, in Sect. 4 the normalized relative intensity noise is investigated. In Sect. 5 a comparison trace-gas sensing experiment employing one of the present QC lasers and a lead-chalcogenide laser detecting the P(25) absorption line of CO is presented and finally, in Sect. 6 a brief summary is given.
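The lifetime condition for population inversion stated in this introduction can be made quantitative with a simple three-level rate-equation picture. The sketch below assumes unit injection efficiency into level 3, depopulation of level 2 only towards level 1, and no thermal backfilling, in which case the steady-state populations obey $n_2/n_3 = \tau_{21}/\tau_{32}$; this model and the function names are assumptions made here, not part of the paper. The scattering times and transition energies are the calculated values quoted for samples A and B in Sect. 2.1.

```python
HC_MEV_UM = 1239.84198   # h*c expressed in meV*um (photon energy times wavelength)


def upper_state_lifetime(tau32_ps, tau31_ps):
    """Lifetime of level 3 when it empties into levels 2 and 1 in parallel."""
    return 1.0 / (1.0 / tau32_ps + 1.0 / tau31_ps)


def inversion_factor(tau32_ps, tau21_ps):
    """Steady-state (n3 - n2)/n3 of the assumed three-level rate model.

    A positive value means population inversion, i.e. tau32 > tau21.
    """
    return 1.0 - tau21_ps / tau32_ps


def wavelength_um(energy_mev):
    """Emission wavelength in micrometers for a transition energy in meV."""
    return HC_MEV_UM / energy_mev


# Calculated values quoted in Sect. 2.1 (times in ps, energies in meV).
samples = {
    "A": {"tau32": 2.3, "tau31": 2.8, "tau21": 0.3, "energy": 250.0},
    "B": {"tau32": 3.7, "tau31": 2.5, "tau21": 0.2, "energy": 258.0},
}

for name, s in samples.items():
    tau3 = upper_state_lifetime(s["tau32"], s["tau31"])
    eta = inversion_factor(s["tau32"], s["tau21"])
    lam = wavelength_um(s["energy"])
    print(f"sample {name}: tau3 = {tau3:.2f} ps, (n3-n2)/n3 = {eta:.2f}, "
          f"lambda = {lam:.2f} um")
```

Within this simplified picture both designs are strongly inverted because the LO-phonon assisted depopulation time $\tau_{21}$ is roughly an order of magnitude shorter than $\tau_{32}$.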
2 Advantage of Blocking Barriers in the Active Regions of λ ∼ 5 µm Quantum Cascade Lasers
In this section we report on the improvement of λ ∼ 5 µm QC lasers by incorporating AlAs blocking barriers together with strain-compensating InAs layers into the active regions [13]. With respect to a reference sample without blocking barriers the maximum peak optical power at 77 K is increased by a factor of 3. Additionally, the maximum pulsed-mode operating temperature is increased from 320 K to 350 K.
2.1 Concept of Blocking Barriers
The calculated conduction band profile (Γ-valley) of two active regions connected by an injector of a QC laser structure designed for emission at λ ∼ 5 µm using Ga$_{0.47}$In$_{0.53}$As/Al$_{0.48}$In$_{0.52}$As grown lattice-matched on InP substrate, as published in [12], is shown in Fig. 1a. In this design the injector acts as a “Bragg reflector” for electrons in the initial state of the laser transition (level 3), whereas electrons in the final states (level 2 and level 1) are favored to tunnel through the exit barrier into the injector’s miniband as described in Sect. 1. The detailed layer thicknesses and doping profiles are described in the caption of Fig. 1. To further enhance the confinement of electrons in level 3, the 3.0 nm Al$_{0.48}$In$_{0.52}$As exit barrier was substituted by a triple layer sequence composed of 0.7 nm AlAs (“blocking barrier”) sandwiched between 0.9 nm Al$_{0.48}$In$_{0.52}$As (see Fig. 1b) [13]. The tunneling probability of electrons in a state $|i\rangle$ of a quantum well through a barrier of thickness $L_b$ is proportional to $\exp(-2L_b\sqrt{2m_bV_i}/\hbar)$, with the effective mass of electrons in the barrier $m_b$, the effective barrier height $V_i$ and the reduced Planck constant $\hbar$. For this reason, the direct tunneling/emission of electrons in level 3 into the next active region/continuum is selectively blocked by increasing $V_3$. At the same time the high tunneling probability out of level 2 and level 1 is maintained by reducing the total width of the triple layer exit barrier, such that the products $L_b\sqrt{2m_bV_2}$ and $L_b\sqrt{2m_bV_1}$ are kept essentially unchanged. For the QC laser structure shown in Fig. 1a, level 3 is close to the conduction band edge of the Al$_{0.48}$In$_{0.52}$As exit barrier and consequently the effective barrier height $V_3$ is small. By incorporating AlAs blocking barriers into the active regions the calculated tunneling probability out of level 3 is reduced by a factor of two ($2L_b\sqrt{2m_bV_3}/\hbar = 1.8$ for the design of Fig. 1b and $2L_b\sqrt{2m_bV_3}/\hbar = 1.1$ for the design of Fig. 1a, so that the ratio of the two probabilities is $e^{-0.7} \approx 0.5$). As the total width of the exit barrier is reduced from 3.0 nm to 2.5 nm, the high tunneling probability out of level 2 and level 1 is maintained ($2L_b\sqrt{2m_bV_2}/\hbar = 2.9$ and $2L_b\sqrt{2m_bV_1}/\hbar = 3.0$ for both designs). To compensate for the strain caused by introducing AlAs blocking barriers, thin InAs layers were additionally incorporated into the active quantum well. Furthermore, these layers increase the depth of the effective potential, thus allowing more design flexibility towards shorter emission wavelengths. From now on wafers based on the design of Fig. 1a (Fig. 1b) will be referred to as sample A (sample B). At an applied electric field of 75 kV/cm the calculated transition energy of sample A (sample B) is 250 meV (258 meV), corresponding to an emission wavelength of 5.0 µm (4.8 µm). For both designs the energy difference between level 2 and level 1 is approximately equal to one LO-phonon energy for the purpose of LO-phonon assisted depopulation of the final state of the laser transition (see Sect. 1). The calculated LO-phonon scattering times of sample A (sample B) are $\tau_{32} = 2.3$ ps (3.7 ps) and $\tau_{31} = 2.8$ ps (2.5 ps), yielding a lifetime $\tau_3 = 1.3$ ps (1.5 ps) of the initial state with $\tau_3^{-1} = \tau_{32}^{-1} + \tau_{31}^{-1}$, and $\tau_{21} = 0.3$ ps (0.2 ps). The transition matrix-element is $|z_{32}| = 1.7$ nm (1.4 nm).
Fig. 1. Schematic conduction band profile (Γ-valley) of two active regions connected by an injector under positive bias condition at an electric field of 75 kV/cm. Also shown are the moduli squared of the relevant wave functions and the first miniband of the injector (grey shaded region). The laser transition is indicated by wavy arrows. (a) Structure as published in [12]. The layer sequence of one active region and injector, in nanometers, from left to right starting from the injection barrier is 5.0, 0.9, 1.5, 4.7, 2.2, 4.0, 3.0, 2.3, 2.3, 2.2, 2.2, 2.0, 2.0, 2.0, 2.3, 1.9, 2.8, 1.9. (b) Structure with incorporated AlAs blocking barriers. The layer sequence of one active region and injector, in nanometers, from left to right starting from the injection barrier is 5.0, 1.0, 1.5, 2.0, 0.7 (InAs), 2.0, 2.2, 4.1, 0.9, 0.7 (AlAs), 0.9, 2.5, 2.3, 2.3, 2.2, 2.0, 2.0, 2.0, 2.3, 1.9, 2.8, 1.9. The Al$_{0.48}$In$_{0.52}$As barriers (Ga$_{0.47}$In$_{0.53}$As quantum wells) are typeset in bold (roman), layers in italic are Si doped to $n = 2 \times 10^{17}$ cm$^{-3}$, and the underlined layers serve as the exit barrier
2.2 Device Fabrication
For both designs shown in Fig. 1 twenty-five periods of alternating active regions and injectors embedded between 400 nm thick Si-doped (n = 1 × 10¹⁷ cm⁻³) Ga0.47In0.53As separate confinement layers were grown lattice-matched on S-doped (n = 2 × 10¹⁷ cm⁻³) (001)-oriented InP substrates by molecular beam epitaxy (MBE). Then the wafers were transferred to a metal-organic chemical vapor deposition (MOCVD) system, where Si-doped InP serving as upper waveguide and contact layers was grown (20 nm, n = 5 × 10¹⁷ cm⁻³; 1500 nm, n = 2 × 10¹⁷ cm⁻³; 1300 nm, n = 7 × 10¹⁸ cm⁻³). For both samples we calculate values of neff = 3.23 for the effective mode refractive index, Γtot = 0.55 for the mode confinement factor considering the mode overlap with active regions and injectors, and Γ = 0.26 for the mode confinement factor considering the mode overlap with active regions only. The wafers were processed into 8–16 µm wide ridge-waveguide structures by chemically assisted ion beam etching (CAIBE) to a depth of 4 µm. Then
Quantum Cascade Lasers for the Mid-infrared Spectral Range
a 350 nm Si3N4 passivation layer was deposited by plasma-enhanced chemical vapor deposition (PECVD), windows were opened on top of the ridges, and Ge/Ni/Ge/Ni/Au (5/5/5/5/400 nm) was evaporated as the top contact metallization. After thinning the substrates to approximately 110 µm thickness the same metallization was deposited as the backside contact. Finally, lasers of 1–3 mm length were cleaved from the processed wafers and mounted substrate-side down with uncoated facets on copper heat-sinks. For all measurements presented in Sects. 2 and 3 the samples were placed inside a temperature-controlled continuous-flow cryostat. The optical power emitted into a solid angle of about π/10 was detected with a calibrated liquid-nitrogen-cooled InSb detector using calibrated attenuation filters to avoid saturation. Emission spectra were analyzed with a Bomem DA3 Fourier transform spectrometer.
2.3
Device Performance
Prior to comparing the device performance of QC lasers with and without AlAs blocking barriers, the modal gain of reference sample A without blocking barriers is investigated. Employing the method proposed by Hakki and Paoli, the net modal gain defined as Γ g − αi, with the mode confinement factor Γ, the material gain g, and the internal losses αi, can be determined experimentally from sub-threshold electroluminescence (EL) spectra [14,15]. Figure 2a shows net modal gain spectra of an 8 × 1000 µm² device of sample A at a heat-sink temperature of 30 K for various injection currents between 120 mA and 210 mA. The net modal gain maximum occurs at an emission energy of about 1955 cm⁻¹ (5.1 µm). The peak net modal gain as a function of injection
Fig. 2. (a) Measured net modal gain of an 8 × 1000 µm² device of sample A at a heat-sink temperature of 30 K for various injection currents between 120 mA and 210 mA. (b) Peak net modal gain as a function of current density (squares) and linear least-squares fit to the experimental data (solid line). The threshold condition Γ gpeak(Jth) − αi = αm is indicated (circle)
current density, plotted in Fig. 2b, shows a linear dependence with a slope Γ g0 = (9.1 ± 0.2) cm/kA. As can be seen from extrapolating the experimental data to the threshold condition Γ gpeak(Jth) − αi = αm, with the measured threshold current density Jth = 2.85 kA/cm² and the calculated mirror loss αm = 13.1 cm⁻¹, no gain saturation occurs. Extrapolation to J = 0 yields internal losses of αi = (12.4 ± 0.5) cm⁻¹. Theoretically, gain spectra can be calculated from Fermi’s golden rule [16]. Assuming perfectly parabolic subbands and a Lorentzian lineshape, the peak material gain gpeak(J) = g0 J is proportional to the injection current density J, and the gain coefficient g0 reads for unity injection efficiency into the initial state of the laser transition [11,17]
$$g_0 = \tau_3 \left(1 - \frac{\tau_{21}}{\tau_{32}}\right) \frac{4\pi e\,|z_{32}|^2}{\varepsilon_0\, n_\mathrm{eff}\, \lambda_0\, L_p\, (2\gamma_{32})} . \qquad (1)$$
In (1) e is the electron charge, ε0 is the vacuum permittivity, neff is the effective mode refractive index, λ0 is the free-space wavelength corresponding to the gain maximum, Lp is the length of one active region/injector stage and (2γ32) is the full width at half maximum of the EL. Using (1) together with (2γ32) = 15 meV and the parameters of Sects. 2.1 and 2.2, a value of Γ g0 = 10.5 cm/kA is obtained, in reasonable agreement with experiment. The light output versus injection current (L-I) dependence at various heat-sink temperatures and the voltage versus injection current (V-I) characteristic at 300 K are shown in Fig. 3a (Fig. 3b) for a 16 × 3000 µm² device of sample A (sample B). The lasers are driven by current pulses of 100 ns length at a repetition rate of 5 kHz. For both samples the operating voltage is approximately 7 V. The peak optical power of reference sample A at 77 K increases almost linearly with injection current above threshold Ith = 0.9 A up to I = 2.5 A and P = 250 mW. The maximum optical power of 285 mW is reached at an injection current of 3.6 A; for higher injection current a rollover of the optical power appears.
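The gain analysis above can be cross-checked numerically. The following is a minimal sketch, not the authors' calculation: it evaluates Eq. (1) in SI units for sample A with the parameters quoted in Sects. 2.1 and 2.2, and solves the threshold condition for Jth. Two assumptions are made here: the stage length Lp is taken as the sum of the layer thicknesses listed in the caption of Fig. 1a, and λ0 = 5.1 µm is taken from the measured gain maximum. The function and variable names are illustrative only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge (C)
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)

# Layer sequence of sample A in nm, copied from the caption of Fig. 1a.
# Assumption: the sum of these layers is the stage length L_p.
LAYERS_A_NM = [5.0, 0.9, 1.5, 4.7, 2.2, 4.0, 3.0, 2.3, 2.3,
               2.2, 2.2, 2.0, 2.0, 2.0, 2.3, 1.9, 2.8, 1.9]


def gain_coefficient_cm_per_ka(z32_nm, tau3_ps, tau21_ps, tau32_ps,
                               n_eff, lambda0_um, period_nm, fwhm_mev):
    """Evaluate Eq. (1) in SI units and return g0 in cm/kA."""
    inversion_time = tau3_ps * 1e-12 * (1.0 - tau21_ps / tau32_ps)  # s
    numerator = 4.0 * math.pi * E_CHARGE * (z32_nm * 1e-9) ** 2
    denominator = (EPS0 * n_eff * lambda0_um * 1e-6 * period_nm * 1e-9
                   * fwhm_mev * 1e-3 * E_CHARGE)
    g0_m_per_a = inversion_time * numerator / denominator
    return g0_m_per_a * 1e5  # 1 m/A equals 1e5 cm/kA


def threshold_current_density(alpha_i, alpha_m, modal_gain_coefficient):
    """Solve Gamma*g0*J_th - alpha_i = alpha_m for J_th (kA/cm^2)."""
    return (alpha_i + alpha_m) / modal_gain_coefficient


period_nm = sum(LAYERS_A_NM)
g0 = gain_coefficient_cm_per_ka(z32_nm=1.7, tau3_ps=1.3, tau21_ps=0.3,
                                tau32_ps=2.3, n_eff=3.23, lambda0_um=5.1,
                                period_nm=period_nm, fwhm_mev=15.0)
modal_g0 = 0.26 * g0                      # Gamma = 0.26 (active regions only)
j_th = threshold_current_density(12.4, 13.1, 9.1)

print(f"L_p = {period_nm:.1f} nm")
print(f"Gamma*g0 (calculated) = {modal_g0:.1f} cm/kA")
print(f"J_th from measured slope and losses = {j_th:.2f} kA/cm^2")
```

Under these assumptions the calculated modal gain coefficient comes out between 10 and 11 cm/kA, close to the quoted 10.5 cm/kA, and the threshold condition with the measured slope gives a value near the measured Jth = 2.85 kA/cm². Small differences are expected because the inputs are rounded.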
A similar behavior of the (L-I)-dependence is found at elevated heat-sink temperatures. At 300 K the maximum optical power is 30 mW at I = 3.6 A. The maximum pulsed-mode operating temperature is 320 K. In comparison, the device of sample B with incorporated blocking barriers operates at higher injection current and thus at higher optical power. At 77 K (300 K) the (L-I)-dependence is almost linear from threshold Ith = 0.8 A (Ith = 3.4 A) up to an injection current of 5.0 A (5.5 A) with a maximum optical power of 890 mW (245 mW). For higher injection current the optical power decreases rapidly. The maximum operating temperature is 350 K. There are two mechanisms responsible for the rollover of the optical power with increasing injection current. (1) Since more electrons are injected into level 3, at a certain injection current the confinement of these electrons becomes insufficient and they start leaking out of the active region, resulting in a plateau in the (L-I)-curve. (2) Due to the increasing applied voltage the energy levels in the active regions and injectors become misaligned and the
Fig. 3. Light output versus injection current (L-I) dependence at various heat-sink temperatures (solid) and voltage versus injection current (V-I) characteristic at 300 K (dashed) of (a) sample A and (b) sample B. The 16 × 3000 µm² devices are driven by current pulses of 100 ns length at a repetition rate of 5 kHz
resonant tunneling/injection is interrupted, giving rise to a sharp decrease of the optical power. For reference sample A mechanism (1) gives rise to the observed plateau in the (L-I)-curve. We attribute the ability of sample B to operate at higher injection current, and thus at higher optical power (about a factor of 3 at 77 K), to the fact that mechanism (2) sets in at higher injection current than mechanism (1), due to the significantly enhanced confinement of electrons in the initial state of the laser transition by incorporating blocking barriers into the active regions. For both sample A and sample B the threshold current increases exponentially with a characteristic temperature T0 = 136 K. At elevated temperatures the threshold current of sample B is about 20% higher compared to sample A, which might be caused by a slightly broadened gain spectrum and a reduced transition matrix-element (see Sect. 2.1). The peak emission wavelength of sample A (sample B) at 77 K is 5.05 µm (4.94 µm) [13]. The blueshift of the emission of sample B is attributed to the increased depth of the effective potential due to the incorporated strain-compensating InAs layers (see Sect. 2.1).
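The effect of the blocking barriers on the escape from level 3 can be illustrated with the barrier exponents quoted in Sect. 2.1. This is a small sketch under the assumption that the tunneling probability scales as exp(−x), with x = 2 L_b √(2 m_b V)/ħ, which is the usual estimate for a rectangular barrier; the function name is illustrative.

```python
import math


def relative_tunneling_probability(exponent_new, exponent_ref):
    """Ratio exp(-x_new) / exp(-x_ref) of two barrier transmission factors."""
    return math.exp(-(exponent_new - exponent_ref))


# Exponents quoted in Sect. 2.1: level 3 has x = 1.1 (Fig. 1a) and 1.8 (Fig. 1b);
# levels 2 and 1 have x = 2.9 and 3.0 in both designs.
ratio_level3 = relative_tunneling_probability(1.8, 1.1)
ratio_level2 = relative_tunneling_probability(2.9, 2.9)
ratio_level1 = relative_tunneling_probability(3.0, 3.0)

print(f"level 3: {ratio_level3:.2f}")   # about one half
print(f"level 2: {ratio_level2:.2f}")   # unchanged
print(f"level 1: {ratio_level1:.2f}")   # unchanged
```

With these numbers the escape probability from the upper laser level drops to roughly one half, while extraction from levels 2 and 1 is unaffected, which is the design goal stated in the text.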
3
Quantum Cascade Lasers Based on Strain-Compensated GaInAs/AlInAs
In this section we show that the device performance can be further improved by using strain-compensated GaInAs/AlInAs in the active regions and injectors, giving rise to an increased conduction band offset. Employing a modified design based on “double-phonon relaxation” [18], maximum pulsed-mode operating temperatures in excess of 350 K are observed. High-reflectivity coated samples mounted substrate-side down show a maximum CW operating temperature of 194 K.
3.1
Sample Design
To further enhance the electrical confinement of electrons in the active regions and to achieve more design flexibility towards shorter emission wavelengths, the conduction band offset can be increased by using strained Ga$_{1-x}$In$_x$As quantum wells (x > 0.53) and Al$_{1-y}$In$_y$As barriers (y < 0.52). As the individual layers in the active regions and injectors of QC lasers are sufficiently thin (typically below 5 nm) they remain below the critical layer thickness. The compositions and layer thicknesses are chosen such that the compressive strain introduced by the quantum wells is compensated by the tensile strain introduced by the barriers. Using Al0.6In0.4As/Ga0.38In0.62As the conduction band offset is increased to ∼ 710 meV compared to ∼ 510 meV for lattice-matched Al0.48In0.52As/Ga0.47In0.53As. Figure 4 shows the calculated conduction band profile (Γ-valley) of two active regions connected by an injector of a QC laser structure based on strain-compensated Al0.6In0.4As/Ga0.38In0.62As. The design employs a four quantum well active region with three lower bound states (levels 1–3) separated by one LO-phonon energy each [18]. Due to the “double-phonon relaxation” the extraction efficiency out of the active region into the following injector is enhanced, thus reducing the lifetime of the final state of the laser transition (level 3), and thermal backfilling of electrons from the injector into the final state is reduced, thus improving the high-temperature performance. Additionally, the confinement of electrons in the initial state (level 4) is enhanced by the increased potential depth. Also AlAs blocking barriers together with strain-compensating InAs layers (see Sect. 2) are incorporated into the active regions. The detailed layer thicknesses and doping profiles are described in the caption of Fig. 4. At an applied electric field of 75 kV/cm the calculated transition energy is 272 meV, corresponding to an emission wavelength of 4.6 µm.
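The conversions between transition energy, wavelength and wavenumber used throughout this chapter follow from λ = hc/E. A short sketch (helper names are illustrative) reproduces the rounded values quoted for the three designs and for the measured gain maximum:

```python
HC_EV_UM = 1.239841984  # h*c in eV*um


def wavelength_um(energy_mev):
    """Free-space wavelength in um for a photon energy in meV."""
    return HC_EV_UM / (energy_mev * 1e-3)


def wavenumber_to_wavelength_um(wavenumber_per_cm):
    """Free-space wavelength in um for a wavenumber in cm^-1."""
    return 1e4 / wavenumber_per_cm


for label, energy in [("sample A", 250.0), ("sample B", 258.0),
                      ("strain-compensated design", 272.0)]:
    print(f"{label}: {energy:.0f} meV -> {wavelength_um(energy):.2f} um")

print(f"1955 cm^-1 -> {wavenumber_to_wavelength_um(1955.0):.2f} um")
```

The calculated energies of 250, 258 and 272 meV map onto about 4.96, 4.81 and 4.56 µm, which round to the 5.0, 4.8 and 4.6 µm given in the text.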
The estimated LO-phonon scattering time of level 4 (level 3) is τ4 = 1.1 ps (τ3 = 0.2 ps) using $\tau_4^{-1} = \tau_{43}^{-1} + \tau_{42}^{-1} + \tau_{41}^{-1}$ and $\tau_3^{-1} = \tau_{32}^{-1} + \tau_{31}^{-1}$ with τ43 = 2.8 ps, τ42 = 2.9 ps, τ41 = 5.6 ps, τ32 = 0.3 ps and τ21 = 1.7 ps. For the transition matrix-element a value of |z43| = 1.8 nm is determined.
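The level lifetimes follow from adding the scattering rates of the individual channels. As a brief sketch (the helper name is illustrative), the same rule applied to the scattering times quoted here and in Sect. 2.1 gives:

```python
def total_lifetime(*channel_times_ps):
    """Combine parallel scattering channels: 1/tau = sum(1/tau_i)."""
    return 1.0 / sum(1.0 / t for t in channel_times_ps)


tau4 = total_lifetime(2.8, 2.9, 5.6)      # level 4 of the design of Fig. 4
tau3_sample_a = total_lifetime(2.3, 2.8)  # level 3, sample A (Sect. 2.1)
tau3_sample_b = total_lifetime(3.7, 2.5)  # level 3, sample B (Sect. 2.1)

print(f"tau4 = {tau4:.2f} ps")
print(f"tau3 (sample A) = {tau3_sample_a:.2f} ps")
print(f"tau3 (sample B) = {tau3_sample_b:.2f} ps")
```

These evaluate to roughly 1.14 ps, 1.26 ps and 1.49 ps, consistent with the rounded values of 1.1 ps, 1.3 ps and 1.5 ps in the text.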
Fig. 4. Schematic conduction band profile (Γ-valley) of two active regions connected by an injector under positive bias condition at an electric field of 75 kV/cm. Also shown are the moduli squared of the relevant wave functions and the first miniband of the injector (grey shaded region). The laser transition is indicated by wavy arrows. The layer sequence of one active region and injector, in nanometers, from left to right starting from the injection barrier is 4.4, 0.9, 1.1, 4.8, 1.5, 4.4, 1.6, 1.6, 0.6 (InAs), 1.6, 0.9, 0.6 (AlAs), 0.9, 3.1, 1.2, 2.9, 1.4, 2.8, 1.6, 2.7, 2.0, 2.5, 2.3, 2.3, 2.7, 2.1, 3.0, 1.9. The Al0.6In0.4As barriers (Ga0.38In0.62As quantum wells) are typeset in bold (roman), layers in italic are Si doped to n = 3 × 10¹⁷ cm⁻³, and the underlined layers serve as the exit barrier
Samples based on the design of Fig. 4 were grown and processed as described in Sect. 2.2 with the following exceptions. The silicon doping profiles of the upper Ga0.47In0.53As separate confinement layer (300 nm, n = 1 × 10¹⁷ cm⁻³; 200 nm, n = 2 × 10¹⁷ cm⁻³) as well as of the MOCVD grown InP upper waveguide and contact layers (20 nm, n = 5 × 10¹⁷ cm⁻³; 1500 nm, n = 2 × 10¹⁷ cm⁻³; 1200 nm, n = 2 × 10¹⁸ cm⁻³; 100 nm, n = 7 × 10¹⁸ cm⁻³) were modified. The etch depth was 4.7 µm, and Ge/Ni/Ge/Ni/Au (5/5/5/5/300 nm) followed by Ti/Au (50/400 nm) was evaporated as the top contact.
3.2
Device Performance
The light output versus injection current (L-I) dependence at various heat-sink temperatures and the voltage versus injection current (V-I) characteristic at 300 K are shown in Fig. 5a for a 16 × 3000 µm² device driven by current pulses of 100 ns length at a repetition rate of 5 kHz. The operating voltage is slightly above 9 V. The device shows further improved performance compared to the QC lasers presented in Sect. 2. At 240 K the maximum peak power (slope efficiency) is 846 mW (655 mW/A), decreasing to 473 mW (400 mW/A) at 300 K and to 97 mW (132 mW/A) at 350 K, which is the maximum temperature achievable with our setup. Higher operating temperatures are therefore possible. The threshold current density Jth as a function of heat-sink temperature T increases exponentially from 1.9 kA/cm² at 240 K to 3.1 kA/cm² at 300 K
Fig. 5. (a) Light output versus injection current (L-I) dependence at various heat-sink temperatures (solid) and voltage versus injection current (V-I) characteristic at 300 K (dashed). The 16 × 3000 µm² device is driven by current pulses of 100 ns length at a repetition rate of 5 kHz. (b) Normalized pulsed-mode emission spectra at various heat-sink temperatures
and to 4.5 kA/cm² at 350 K. Fitting the experimental data to the empirical relation Jth(T) = J0 exp(T/T0), a value of J0 = 0.31 kA/cm² and a characteristic temperature T0 = 131 K are determined. Normalized pulsed-mode emission spectra at various heat-sink temperatures are displayed in Fig. 5b. As expected for Fabry-Perot devices in pulsed-mode operation, the emission spectra are multi-mode with a longitudinal mode spacing of ∼ 0.5 cm⁻¹. The peak of the emission shifts from 2049 cm⁻¹ (4.88 µm) at 240 K to 2024 cm⁻¹ (4.94 µm) at 350 K, in reasonable agreement with the calculated transition energy. In order to reduce the threshold current density and the injected electrical power, leading to improved CW performance, the devices can be high-reflectivity (HR) coated. As HR-coated lasers based on the design shown in Fig. 4 are not yet available, we next present our results on HR-coated lasers based on a similar design, but employing a triple quantum well active region with single-phonon relaxation as published in [19]. For the back facet two λ/4 pairs of SiO2/Si resulting in a reflectivity of ∼ 96% were deposited. The front facet was coated with one λ/4 pair of SiN/Si giving rise to a reflectivity of ∼ 65%. After coating the devices were mounted substrate-side down on copper heat-sinks. Figure 6a shows the CW light output versus injection current (L-I) dependence at various heat-sink temperatures and the voltage versus injection current (V-I) characteristic at 82 K of an 8 × 1000 µm² device. The operating voltage is slightly above 8 V. At a heat-sink temperature of 82 K the threshold current is 46 mA and a maximum optical power of almost 80 mW is emitted. At the maximum CW operating temperature of 194 K the threshold current is increased to 279 mA with the maximum optical power still exceeding 1 mW. We attribute the noise in some parts of the L-I-curves to spatial mode hopping of the Fabry-Perot device.
Fig. 6. (a) CW light output versus injection current (L-I) dependence at various heat-sink temperatures (solid) and voltage versus injection current (V-I) characteristic at 82 K (dashed) of an 8 × 1000 µm² device with HR-coated facets mounted substrate-side down. (b) Normalized CW emission spectra (I ≈ 1.1 Ith) at various heat-sink temperatures
The maximum CW operating temperature is close to the highest published value of 210 K for QC lasers at λ ∼ 5 µm obtained for a junction down mounted 12 × 2000 µm² device with uncoated facets [9]. Normalized CW emission spectra at various heat-sink temperatures are displayed in Fig. 6b for injection currents I ≈ 1.1 Ith, i.e., slightly above threshold. Except for the 153 K spectrum the emission is single-mode with a side-mode suppression ratio larger than 30 dB, tuning from 2029.38 cm⁻¹ (4.93 µm) at 80 K to 1966.58 cm⁻¹ (5.08 µm) at 194 K. Assuming that no heating of the device occurs in pulsed-mode (100 ns pulse width, 5 kHz repetition rate), the true temperature of the active region Tact can be estimated by comparing the emission spectra in pulsed and CW operation. The analysis shows that Tact increases rapidly for heat-sink temperatures close to the maximum CW operating temperature due to the limited heat dissipation within the device. At the maximum heat-sink temperature Ths = 194 K we estimate Tact = (298 ± 10) K, which is close to the maximum pulsed-mode operating temperature. For the thermal resistance Rth defined as
$$R_\mathrm{th} = \frac{T_\mathrm{act} - T_\mathrm{hs}}{P_\mathrm{el} - P_\mathrm{opt}} \qquad (2)$$
we obtain with the injected electrical power Pel = 2.7 W and the emitted optical power Popt = 1 mW a value of Rth = (38 ± 4) K/W. The CW operation performance of our devices can be further enhanced by improved thermal management to reduce the thermal resistance (e.g., by lateral overgrowth of InP increasing the lateral heat dissipation [8,20]) and by employing modified active region designs aiming at reducing the threshold
current density (e.g., by using four quantum well active regions with double-phonon relaxation and AlAs blocking barriers as shown in Fig. 4 and [18]).
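The quantitative relations used in Sect. 3.2 can be checked with a few lines of code. The sketch below evaluates the empirical threshold law Jth(T) = J0 exp(T/T0) with the fitted parameters, the Fabry-Perot mode spacing 1/(2 n L) of a 3000 µm long cavity, and the thermal resistance of Eq. (2). One assumption is flagged explicitly: the group index is approximated by neff = 3.23, the value calculated in Sect. 2.2 for the lattice-matched samples. The function names are illustrative.

```python
import math


def threshold_current_density(temp_k, j0=0.31, t0=131.0):
    """Empirical J_th(T) = J0 * exp(T / T0) in kA/cm^2 (fit values of Sect. 3.2)."""
    return j0 * math.exp(temp_k / t0)


def fabry_perot_mode_spacing(group_index, length_cm):
    """Longitudinal mode spacing in cm^-1 for a cavity of the given length."""
    return 1.0 / (2.0 * group_index * length_cm)


def thermal_resistance(t_act_k, t_hs_k, p_el_w, p_opt_w):
    """Eq. (2): temperature rise per dissipated power, in K/W."""
    return (t_act_k - t_hs_k) / (p_el_w - p_opt_w)


for temp in (240.0, 300.0, 350.0):
    print(f"J_th({temp:.0f} K) = {threshold_current_density(temp):.2f} kA/cm^2")

# Assumption: group index approximated by n_eff = 3.23 from Sect. 2.2.
spacing = fabry_perot_mode_spacing(3.23, 0.3)
print(f"mode spacing = {spacing:.2f} cm^-1")

r_th = thermal_resistance(298.0, 194.0, 2.7, 1e-3)
print(f"R_th = {r_th:.1f} K/W")
```

The fit reproduces the quoted threshold current densities of 1.9, 3.1 and 4.5 kA/cm², the mode spacing comes out near 0.5 cm⁻¹, and the thermal resistance evaluates to about 38.5 K/W, within the stated (38 ± 4) K/W.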
4
Relative Intensity Noise
In this section we present first experimental investigations of the intensity noise properties of a 16 × 2000 µm² QC laser based on the design of Fig. 1b using lattice-matched GaInAs/AlInAs with incorporated AlAs blocking barriers. The device was driven by current pulses of 100 ns length at a repetition rate of 1 kHz. The emitted light was collected by an f/1.6 mirror collimator and focused onto a Peltier-cooled HgCdZnTe photovoltaic detector of ∼ 150 MHz bandwidth. After amplification using a low-noise amplifier the detected signal was split by a power divider and analyzed simultaneously by an oscilloscope and an electrical spectrum analyzer (ESA). The intensity fluctuations of the device were characterized by measuring the normalized relative intensity noise RIN∗ given by
$$\mathrm{RIN}^{*} = \frac{(P_S - P_D)\,R}{B\,\tau_\mathrm{dc}\,U^2} , \qquad (3)$$
where PS is the measured spectral noise power of the laser light, PD is the spectral dark noise power, and B is the resolution bandwidth of the ESA. The mean detected electrical power is τdc U²/R with the voltage of the detected optical pulse U, the impedance of the amplifier R, and the duty cycle τdc. The experimentally determined normalized relative intensity noise RIN∗ measured at a frequency of 9.5 MHz and a heat-sink temperature of 273 K is shown in Fig. 7 as a function of the optical power Popt. In the investigated range of Popt, corresponding to injection currents between I ≈ 1.08 Ith and I ≈ 1.62 Ith, RIN∗ decreases from −99.6 dB/Hz to −115.4 dB/Hz. The experimental data can be fitted according to a simple power law $\mathrm{RIN}^{*} \propto P_\mathrm{opt}^{-\gamma}$ with γ = 1.8. For other QC lasers we found γ = 1.6–1.8. Compared to these results, in interband semiconductor lasers γ = 3 was determined in the low frequency
Fig. 7. Normalized relative intensity noise RIN∗ versus optical power Popt of a 16 × 2000 µm² QC laser measured at a frequency of 9.5 MHz and a heat-sink temperature of 273 K (squares) and least-squares fit to the experimental data (solid line). The device is driven by current pulses of 100 ns length at a repetition rate of 1 kHz
limit and for small values of Popt [21]. Also for lead-chalcogenide diode lasers [22] and for vertical-cavity surface-emitting lasers [23] the same value γ = 3 was obtained. Theoretical noise considerations for interband semiconductor lasers according to a small signal rate equation analysis, assuming single-mode operation, result in $\mathrm{RIN}^{*} = C(\beta, n_\mathrm{sp}, \tau_\mathrm{ph})\,P_\mathrm{opt}^{-3}$ [24], where the constant C contains the spontaneous emission coefficient β, the population inversion factor nsp and the photon lifetime τph. Thus, we attribute our experimental finding of $\mathrm{RIN}^{*} \propto P_\mathrm{opt}^{-\gamma}$ with γ ≈ 1.8 for QC lasers to a pump power dependence of β or nsp or to multi-mode emission. This interesting observation deserves further detailed analysis in order to get a complete understanding of the intensity noise properties of QC lasers.
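To make the difference between γ ≈ 1.8 and γ = 3 concrete, the sketch below transcribes Eq. (3) and the power law into code. It is an illustration only: the function names are hypothetical, no measured spectrum-analyzer readings are reproduced, and the only inputs taken from the text are the pulse parameters and the two RIN∗ end points.

```python
import math


def duty_cycle(pulse_length_s, repetition_rate_hz):
    """Fraction of time the laser is on in pulsed operation."""
    return pulse_length_s * repetition_rate_hz


def normalized_rin_db(p_signal_w, p_dark_w, resistance_ohm,
                      bandwidth_hz, tau_dc, voltage_v):
    """Eq. (3) expressed in dB/Hz."""
    rin = ((p_signal_w - p_dark_w) * resistance_ohm
           / (bandwidth_hz * tau_dc * voltage_v ** 2))
    return 10.0 * math.log10(rin)


def rin_change_db(power_ratio, gamma):
    """Change of RIN* in dB when P_opt grows by power_ratio, for RIN* ~ P^-gamma."""
    return -10.0 * gamma * math.log10(power_ratio)


def power_ratio_for_change(delta_db, gamma):
    """Inverse of rin_change_db: power ratio that produces a given dB change."""
    return 10.0 ** (-delta_db / (10.0 * gamma))


tau_dc = duty_cycle(100e-9, 1e3)                 # 100 ns pulses at 1 kHz
measured_change = -115.4 - (-99.6)               # dB, end points quoted in the text
ratio = power_ratio_for_change(measured_change, 1.8)
interband_change = rin_change_db(ratio, 3.0)

print(f"duty cycle = {tau_dc:.0e}")
print(f"implied power ratio for gamma = 1.8: {ratio:.1f}")
print(f"same power ratio with gamma = 3: {interband_change:.1f} dB")
```

With γ = 1.8 the observed 15.8 dB reduction corresponds to an increase of the optical power by a factor of about 7.5, whereas a γ = 3 law would predict roughly 26 dB over the same power range. If the optical power were proportional to I − Ith, which is an assumption not stated in the text, the quoted current range would give a comparable ratio of 0.62/0.08 ≈ 7.8.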
5
Trace-Gas Detection Using Quantum Cascade Lasers in Continuous-Wave Operation
Current scientific topics of atmospheric chemistry pose great demands on trace-gas measurement techniques based on infrared spectroscopy. The instruments should be compact enough to be used on airborne platforms, where space is often restricted, reliable under harsh environmental and electromagnetic conditions, and easy to operate. However, the most important requirements are selectivity (no cross interference to other chemical species), versatility (several components detectable with the same technique), and sensitivity (typically within the range of detectable optical density of 10⁻⁶). For a detailed discussion of the field of infrared spectroscopy in atmospheric chemistry, the reader is referred to [25]. Tunable diode laser absorption spectroscopy (TDLAS) based on lead-chalcogenide diode lasers is able to fulfill these requirements with respect to selectivity, versatility and sensitivity. However, the lead-chalcogenide material technology is not mature enough to enable a more widespread application of this spectroscopic technique. The major obstacle in the further development of TDLAS is that individual lead-chalcogenide lasers are not adequately comparable in their operation characteristics. Consequently, it is currently not possible for an untrained operator to run such a spectrometer. QC lasers as a new type of narrow bandwidth, high power infrared light source offer the opportunity to exploit the full potential of TDLAS. Although room temperature CW operation of QC lasers emitting at 9.1 µm wavelength was reported recently [8], most state-of-the-art QC lasers operated in CW-mode still require cryogenic cooling. For this reason several different techniques were developed to fulfill the above mentioned criteria employing QC lasers operated in pulsed-mode [26,27,28,29].
However, these techniques suffer either from limited sensitivity or from the need of high detector bandwidth to resolve the laser’s fast frequency chirp due to ohmic heating within the individual pulses, which is likely to fail under harsh electromagnetic conditions, e.g., within airplanes. Therefore, we employ QC lasers operating in CW-mode using cryogenic cooling already established for lead-chalcogenide lasers. An application of atmospheric trace-gas detection already indicating the outstanding performance of QC lasers is described in [30]. At the Max-Planck-Institute for Chemistry a comparison trace-gas sensing experiment employing a lead-chalcogenide diode laser (double-heterostructure laser, Laser Components GmbH, Olching, Germany) and one of the present QC lasers (12 × 1000 µm² device with HR-coated facets of the same design used for the CW experiments presented in Sect. 3) operated in a two-laser spectrometer was designed. The optical set-up is described in [31], the applied electronics and the principles of the gas supply setup in [32]. Briefly summarized, a custom-built liquid nitrogen cryostat houses both lasers and two detectors, an InSb photodiode for the measurement and a HgCdTe photodiode for reference signal detection (both Kolmar Technologies Inc., Newburyport, MA, USA). The beams of both lasers are combined using a semi-transparent coated CaF2 window with 50 % reflectivity and are guided into a 36 m multipass cell of Herriott type (44 hPa cell pressure) with astigmatic mirrors [33]. A small fraction of the beam is separated behind the cell to generate a reference signal for absorption line identification and locking. The emission of both lasers is guided over the absorption feature by small current ramps (about 5 mA for both lasers)¹ of 82 ms duration. The time-multiplexed operation (4.8 s integration time for every laser) was maintained by consecutive blocking of the individual beams with small chopping units. For increased sensitivity a 2f wavelength modulation (5 kHz modulation frequency) detection scheme is used [25,32,34].
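The principle of 2f wavelength modulation detection can be illustrated with a short numerical model. The sketch below is not the spectrometer's signal processing chain; it assumes an isolated Lorentzian absorption line in normalized units (detuning measured in half widths at half maximum), a sinusoidal modulation of the laser frequency, and an illustrative modulation amplitude of 2.2 half widths. All names are hypothetical.

```python
import math


def lorentzian_absorbance(detuning, peak=1.0):
    """Lorentzian line; detuning in units of the half width at half maximum."""
    return peak / (1.0 + detuning ** 2)


def second_harmonic(center, mod_amplitude, samples=256):
    """2f Fourier component of the absorbance seen by a laser whose frequency
    is modulated sinusoidally around `center` (lock-in detection at 2f)."""
    total = 0.0
    for k in range(samples):
        phase = 2.0 * math.pi * k / samples
        detuning = center + mod_amplitude * math.cos(phase)
        total += lorentzian_absorbance(detuning) * math.cos(2.0 * phase)
    return 2.0 * total / samples


# Slow ramp of the mean laser frequency across the line (normalized units).
scan = [x / 10.0 for x in range(-50, 51)]
signal = [second_harmonic(x, mod_amplitude=2.2) for x in scan]

center_value = signal[len(signal) // 2]
print(f"2f signal at line center: {center_value:.3f}")
print(f"largest side-lobe value:  {max(signal):.3f}")
```

In this model the 2f signal is symmetric about the line center, has its largest magnitude there, and changes sign in the wings, which is why it provides a background-suppressed measure of the absorption and a convenient feature for line locking.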
Both the QC laser and the lead-chalcogenide laser were tuned to measure the P(25) absorption line of the fundamental CO band at 2037.025 cm⁻¹ (band center at about 2170 cm⁻¹). Its line strength is about a factor of 100 lower compared to the strongest CO absorptions within this band. However, it allows a small signal experiment in combination with the easy gas handling characteristics of CO. Both lasers showed single mode emission at their operating conditions near their respective threshold currents (I ≈ 1.2 Ith for both lasers) as indicated by a monochromator, as well as by their noise characteristics and étalon tuning behavior. They both show multi-mode operation at higher injection current, which typically inhibits sensitive measurements. The operating temperature of the QC laser was approximately 100 K, whereas the lead-chalcogenide laser was operated at about 82 K. At the respective operating condition the optical power emitted by the QC laser is a factor of 40 higher, as measured by a pyroelectric detector. Figure 8 depicts a 20 min time series of calibration gas that contains 3.47 ppmv CO (ppmv = “parts per million of volume”, i.e., 1 ppmv = 1 µmol CO / 1 mol air). The instrument was calibrated immediately before the
¹ Preliminary tests showed a current tuning rate of 0.6 GHz/mA for the QC laser, which corresponds to typical values obtained for lead-chalcogenide lasers.
Fig. 8. 20 min time series of calibration gas containing 3.47 ppmv CO. The gas was measured alternately with a QC laser (closed circles) and a lead-chalcogenide laser (open circles) with an integration time of about 4.8 s. Axes: concentration (ppmv, 3.0 to 3.6) versus time (10:40 to 10:55)
measurements were taken with the same gas. The concentration corresponds to an optical density at the line center of 7.29 × 10⁻³. Both lasers indicate identical concentrations to within the measurement errors during the first few minutes, but small changes of the observation conditions (combined effects of laser as well as optical instabilities) lead to systematic drifts after about 10:44. Usually these instabilities are addressed with a new calibration or background measurement of the instrument by flushing it with calibration or zero gas (gas not containing the species of interest). However, the drift tends to be smaller for the QC laser, which could increase the time interval within which a recalibration is needed. We think that the higher stability has two possible reasons: (1) the considerably lower temperature tuning rate of the QC laser (the measured value of 2.2 GHz/K is a factor of 50 smaller compared to typical lead-chalcogenide lasers), and (2) the higher quality of the beam profile [35]. However, these effects have to be investigated in more detail in the future. The noise (precision) of the lead-chalcogenide laser measurements of about 44 ppbv (1σ; ppbv = “parts per billion of volume”, i.e., 1 ppbv = 1 nmol CO / 1 mol air) is mainly caused by detector noise due to the low laser power. The noise (precision) of the QC laser measurement is about 13 ppbv (1σ), corresponding to a detectable optical density of 2.76 × 10⁻⁵ at an observation bandwidth of 5.7 Hz or a normalized detectable optical density² of 3.21 × 10⁻⁷ Hz$^{-1/2}$ m$^{-1}$ (1σ). This value is a factor of four smaller compared to the value obtained for the lead-chalcogenide laser. Our present
² The relative precision of the QC laser measurements (0.34 %) is already high. Therefore, it is not excluded that a considerable fraction of the estimated noise is related to other sources in the spectrometer, e.g., pressure variations in the measurement cell or temperature uncertainties. The experiments will therefore be repeated at a lower concentration level. It is further intended to make ambient air measurement comparisons in the near future.
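The conversion from concentration noise to detectable optical density can be sketched as follows. The calculation assumes that the optical density scales linearly with concentration (optically thin limit) and normalizes to the 5.7 Hz observation bandwidth and the 36 m absorption path given in the text; the noise values of 13.1 and 43.6 ppbv are those listed in Table 1. The function names are illustrative.

```python
import math


def detectable_od(noise_ppbv, reference_ppmv, od_at_reference):
    """Scale the optical density of the calibration gas to the 1-sigma noise."""
    return od_at_reference * noise_ppbv / (reference_ppmv * 1000.0)


def normalized_od(od, bandwidth_hz, path_m):
    """Normalize to 1 Hz observation bandwidth and 1 m absorption path."""
    return od / (math.sqrt(bandwidth_hz) * path_m)


od_qc = detectable_od(13.1, 3.47, 7.29e-3)
od_pb = detectable_od(43.6, 3.47, 7.29e-3)

print(f"QC laser:          OD = {od_qc:.2e}, "
      f"normalized = {normalized_od(od_qc, 5.7, 36.0):.2e} Hz^-1/2 m^-1")
print(f"lead-chalcogenide: OD = {od_pb:.2e}, "
      f"normalized = {normalized_od(od_pb, 5.7, 36.0):.2e} Hz^-1/2 m^-1")
```

Within rounding this gives detectable optical densities of about 2.75 × 10⁻⁵ and 9.16 × 10⁻⁵ and normalized values of about 3.2 × 10⁻⁷ and 1.07 × 10⁻⁶ Hz$^{-1/2}$ m$^{-1}$, in line with the values quoted in the text and in Table 1.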
results are summarized in Table 1. They confirm the findings of Webster et al., who have also observed considerably improved performance of a QC laser compared to a lead-chalcogenide diode laser in the 8 µm wavelength range [30]. Webster et al. also describe an improved temperature cycling behavior from which we might anticipate, together with the lower tuning rates, an improved spectrometer handling.
Table 1. Characteristics of the time series shown in Fig. 8. A running mean over approximately 3 min is used to separate drifts from noise (precision). The former is numerically differentiated and smoothed afterwards by the same running mean to estimate drift rates. Faster fluctuations are interpreted as noise, which is given as 1σ values. Since drifts can be effectively eliminated by regular background and calibration measurements, the noise is interpreted as detectable optical density (OD), which is also given normalized to an observation bandwidth of 1 Hz and standard absorption path of 1 m
Quantity                     Unit                       QC laser       Lead-chalcogenide laser
Noise (precision)            ppbv                       13.1           43.6
Detectable OD                (dimensionless)            2.76 × 10⁻⁵    9.15 × 10⁻⁵
Normalized detectable OD     Hz$^{-1/2}$ m$^{-1}$       3.21 × 10⁻⁷    10.6 × 10⁻⁷
Drift rate, average          ppbv s⁻¹                   0.102          −0.262
Drift rate, abs. maximum     ppbv s⁻¹                   0.531          1.45
6
Summary
In summary, different designs of QC lasers emitting in the 5 µm wavelength range aiming at enhancing population inversion were presented. In a first approach, AlAs blocking barriers together with strain-compensating InAs layers were incorporated into the active regions of QC lasers using lattice-matched GaInAs/AlInAs. Doing so the confinement of electrons in the initial state of the laser transition was enhanced by selectively blocking their direct tunneling/emission into the next active region/continuum. The device performance was further improved by employing strain-compensated active regions and injectors giving rise to an increased conduction band offset. Designs based on single-phonon relaxation as published in [19] and on double-phonon relaxation [18] in combination with AlAs blocking barriers were employed. Next, the normalized relative intensity noise RIN∗ was investigated. For QC lasers RIN∗ was found to decrease more slowly with increasing optical power compared to data obtained for interband semiconductor lasers. Finally, a comparison trace-gas sensing experiment employing one of the present QC lasers
and a lead-chalcogenide laser was presented. Detecting the P(25) absorption line of CO, higher stability was obtained using a QC laser.
Acknowledgements
The authors would like to thank J. Schaub and N. Rollbühler for material growth, J. Schleife, K. Schwarz and R. Moritz for technical support, and M. Mikulla, J. Wagner and G. Weimann for continuous support. Funding of the present work by the German Federal Ministry of Education and Research (BMBF) within the project “QUANKAS” is gratefully acknowledged.
References
1. J. Faist, F. Capasso, D. L. Sivco, C. Sirtori, A. L. Hutchinson, A. Y. Cho, Science 264, 553 (1994).
2. J. Faist, F. Capasso, D. L. Sivco, A. L. Hutchinson, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 72, 680 (1998).
3. R. Colombelli, F. Capasso, C. Gmachl, A. L. Hutchinson, D. L. Sivco, A. Tredicucci, M. C. Wanke, A. M. Sergent, A. Y. Cho, Appl. Phys. Lett. 78, 2620 (2001).
4. C. Gmachl, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, F. Capasso, A. Y. Cho, Appl. Phys. Lett. 79, 572 (2001).
5. A. Straub, C. Gmachl, D. L. Sivco, A. M. Sergent, F. Capasso, A. Y. Cho, El. Lett. 38, 565 (2002).
6. C. Gmachl, D. L. Sivco, R. Colombelli, F. Capasso, A. Y. Cho, Nature 415, 883 (2002).
7. N. Ulbrich, G. Scarpa, A. Sigl, J. Roßkopf, G. Böhm, G. Abstreiter, M.-C. Amann, El. Lett. 37, 1341 (2001).
8. M. Beck, D. Hofstetter, T. Aellen, J. Faist, U. Oesterle, M. Ilegems, E. Gini, H. Melchior, Science 295, 301 (2002).
9. B. Ishaug, W.-Y. Hwang, J. Um, B. Guo, H. Lee, C.-H. Lin, Appl. Phys. Lett. 79, 1745 (2001).
10. C. Sirtori, F. Capasso, J. Faist, D. L. Sivco, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 61, 898 (1992).
11. J. Faist, F. Capasso, C. Sirtori, D. L. Sivco, A. L. Hutchinson, A. Y. Cho, Appl. Phys. Lett. 66, 538 (1995).
12. J. Faist, F. Capasso, C. Sirtori, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 68, 3680 (1996).
13. Q. K. Yang, Ch. Mann, F. Fuchs, R. Kiefer, K. Köhler, N. Rollbühler, H. Schneider, J. Wagner, Appl. Phys. Lett. 80, 2048 (2002).
14. B. W. Hakki, T. L. Paoli, J. Appl. Phys. 44, 4113 (1973).
15. B. W. Hakki, T. L. Paoli, J. Appl. Phys. 46, 1299 (1975).
16. Q. K. Yang, A. Z. Li, J. Phys.: Condens. Matter 12, 1907 (2000).
17. J. Faist, F. Capasso, D. L. Sivco, A. L. Hutchinson, C. Sirtori, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 65, 2901 (1994).
368
Ch. Mann et al.
18. D. Hofstetter, M. Beck, T. Aellen, J. Faist, Appl. Phys. Lett. 78, 396 (2001). 358, 362, 366 19. R. K¨ ohler, C. Gmachl, A. Tredicucci, F. Capasso, D. L. Sivco, S.-N. G. Chu, A. Y. Cho, Appl. Phys. Lett. 76, 1092 (2000). 360, 366 20. M. Beck, J. Faist, U. Oesterle, M. Ilegems, E. Gini, H. Melchior, IEEE Photon. Technol. Lett. 12, 1450 (2000). 361 21. I. Joindot, J. Phys. III France 2, 1591 (1992). 363 22. H. Fischer, M. Tacke, J. Opt. Soc. Am. B 8, 1824 (1991). 363 23. D. M. Kuchta, J. Gamelin, J. D. Walker, J. Lin, K. Y. Lau, J. S. Smith, M. Hong, J. P. Mannaerts, Appl. Phys. Lett. 62, 1194 (1993). 363 24. H. Haug, Phys. Rev. 184, 338 (1969). 363 25. M. W. Sigrist (ed.): Air Monitoring by Spectroscopic Techniques, Chemical Analysis: A Series of Monographs on Analytical Chemistry and its Applications (John Wiley & Sons, New York 1994). 363, 364 26. K. Namjou, S. Cai, E. A. Whittaker, J. Faist, C. Gmachl, F. Capasso, D. L. Sivco, A. Y. Cho, Opt. Lett. 23, 219 (1998). 363 27. A. A. Kosterev, R. F. Curl, F. K. Tittel, C. Gmachl, F. Capasso, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, A. Y. Cho, Appl. Opt. 39, 4425 (2000). 363 28. D. D. Nelson, J. H. Shorter, J. B. McManus, M. S. Zahniser, Appl. Phys. B 75, 343 (2002). 363 29. E. Normand, M. McCulloch, G. Duxbury, N. Langford, Opt. Lett. 28, 16 (2003). 363 30. C. R. Webster, G. J. Flesch, D. C. Scott, J. E. Swanson, R. D. May, W. S. Woodward, C. Gmachl, F. Capasso, D. L. Sivco, J. N. Baillargeon, A. L. Hutchinson, A. Y. Cho, Appl. Opt. 40, 321 (2001). 364, 366 31. R. Kormann, H. Fischer, F. G. Wienhold: A compact multi-laser TDLAS for trace gas flux measurements based on a micrometerological technique, in Application of Tunable Diode and Other Infrared Sources for Atmospheric Studies and Industrial Processing Monitoring II, Vol. 3758, ed. by A. Fried (The International Society for Optical Engineering, Denver, Colorado 1999), pp. 162–169. 364 32. R. Kormann, H. Fischer, C. Gurk, F. Helleis, Th. Kl¨ upfel, K. 
Kowalski, R. K¨ onigstedt, U. Parchatka, V. Wagner, Spectrochimica Acta A 58, 2489 (2002). 364 33. J. B. McManus, P. L. Kebabian, W. S. Zahniser, Appl. Opt. 34, 3336 (1995). 364 34. J. Reid, D. Labrie, Appl. Phys. B 26, 203 (1981). 364 35. Th. Beyer, private communication (2001). 365
Simulation of Transport and Gain in Quantum Cascade Lasers
A. Wacker¹, S.-C. Lee¹, and M. F. Pereira Jr.¹,²
¹ Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstr. 36, 10623 Berlin, Germany
² NMRC, Lee Maltings, Prospect Row, Cork, Ireland
Abstract. Quantum cascade lasers can be modeled within a hierarchy of different approaches: standard rate equations for the electron densities in the levels, the semiclassical Boltzmann equation for the microscopic distribution functions, and quantum kinetics including the coherent evolution between the states. Here we present a quantum transport approach based on nonequilibrium Green functions. This allows for quantitative simulations of the transport and optical gain of the device. The division of the current density into two terms shows that semiclassical transitions are likely to dominate the transport for the prototype device of Sirtori et al. but not for a recent THz laser with only a few layers per period. The many-particle effects depend strongly on the design of the heterostructure, and for the case considered here, inclusion of electron-electron interaction at the Hartree-Fock level provides a sizable change in absorption but imparts only a minor shift of the gain peak.
1 Introduction
Since the first realization of a quantum cascade laser (QCL) in 1994 [1], these semiconductor heterostructures have become important devices in the infrared regime, operating up to room temperature [2]. Lasing in the THz region was also achieved recently [3], opening a new window for applications. The standard devices contain an injector region guiding the electrons to the upper laser level in an active region, where the optical transitions occur between a few discrete levels. A frequently studied prototype is the sample in [4]. Different designs are interminiband QCLs [5,6], as well as QCLs without injector regions [7] or containing only four barriers per period, like the staircase laser [8] and a recent THz QCL [9].
The modeling of quantum cascade lasers was first performed on the basis of rate equations [10] for the electron dynamics in the active region. It was assumed that the electrons reach the upper laser level with the rate $J/e$, where $J$ is the current density and $e < 0$ is the electron charge. A necessary condition for inversion is that the scattering rate $1/\tau_{u\to l}$ from the upper to the lower laser level is smaller than the out-scattering rate $1/\tau_l$ from the lower laser level. By optimizing these scattering rates through a sophisticated choice of well and barrier widths in the active region, QCLs with high performance could be designed. While scattering with optical phonons is typically considered to be the main mechanism for the scattering rates [11,12], electron-electron scattering has also been treated [13,14]. The influence of a magnetic field has been studied by these rate equations in [15]. While these rate equations for the electron densities $n_i$ [in units of 1/cm$^2$] for the levels $i$ average over the momentum $\mathbf{k}$ in the in-plane direction, the distribution functions $f_i(\mathbf{k})$ can be taken into account by employing Monte Carlo (MC) simulations [16,17,18].
If one includes the injector region in the simulation and imposes periodic boundary conditions (a good approximation, as typical devices have approximately 30 periods, each containing an active region and an injector region), a full simulation of QCL devices can be performed. Such an approach was carried out almost simultaneously on the basis of rate equations [19], MC simulations [20], and a quantum transport model [21], obtaining good results for the current-voltage characteristic of a prototype device [4]. In this article we examine to what extent quantum effects affect the transport and gain behavior, and address the question of whether simple semiclassical models such as rate equations or MC simulations are applicable. In particular we (i) demonstrate that the current can be calculated in a quantum transport model, (ii) show how this relates to semiclassical approaches, and (iii) discuss the implications of many-particle corrections on the gain spectra.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 369–380, 2003. © Springer-Verlag Berlin Heidelberg 2003
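The rate-equation picture recalled in the Introduction can be illustrated with a minimal sketch. It assumes a two-level model of the active region in which the upper level is filled at the rate $J/|e|$ and emptied with a total lifetime `tau_u`, while the lower level is fed with $1/\tau_{u\to l}$ and emptied with $1/\tau_l$. The function names, the use of the magnitude of the charge, and all numbers are illustrative assumptions and are not taken from [10].

```python
E_CHARGE = 1.602176634e-19  # magnitude of the elementary charge in C


def steady_state_densities(J, tau_u, tau_ul, tau_l):
    """Steady state of a hypothetical two-level rate model.

    dn_u/dt = J/|e| - n_u/tau_u          (injection into the upper level)
    dn_l/dt = n_u/tau_ul - n_l/tau_l     (feeding and emptying of the lower level)

    J is in A/cm^2 and times are in s, so densities come out in 1/cm^2.
    """
    n_u = (J / E_CHARGE) * tau_u
    n_l = n_u * tau_l / tau_ul
    return n_u, n_l


def inversion(J, tau_u, tau_ul, tau_l):
    """Population difference n_u - n_l; positive only if tau_l < tau_ul."""
    n_u, n_l = steady_state_densities(J, tau_u, tau_ul, tau_l)
    return n_u - n_l


# Illustrative (assumed) numbers: 5 kA/cm^2 and picosecond lifetimes.
print(inversion(5e3, 1e-12, 2e-12, 0.5e-12) > 0)  # fast emptying of lower level
print(inversion(5e3, 1e-12, 2e-12, 4e-12) > 0)    # slow emptying of lower level
```

In this sketch the sign of the inversion depends only on the ratio $\tau_l/\tau_{u\to l}$, which is the necessary condition quoted in the text.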
2 Current in Quantum Transport
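In order to describe the quantum cascade laser we start by defining a set of single-particle basis states $\Psi_\alpha(z)e^{i\mathbf{k}\cdot\mathbf{r}}/\sqrt{A}$. Here $\mathbf{k}$, $\mathbf{r}$ are two-dimensional vectors in the $x,y$ plane perpendicular to the growth direction $z$, and $A$ is the normalization area. The functions $\Psi_\alpha(z)$ reflect the layer sequence of the QCL structure and may be chosen as energy eigenstates or Wannier states (see the discussion in [22]). Then the Hamilton operator reads in second quantization:
$$\hat H = \underbrace{\sum_{\alpha,\beta,\mathbf{k},s} H^o_{\alpha\beta}(\mathbf{k})\, a^\dagger_{\alpha,\mathbf{k}} a_{\beta,\mathbf{k}}}_{=\hat H_o} + \hat H_{\rm scatt} \tag{1}$$
where all terms connecting different $\mathbf{k}$-indices (i.e. breaking the translational invariance of the structures) have been included in $\hat H_{\rm scatt}$. The spin index $s$ yields an additional factor 2 for the current and the gain, as we assume that all states are spin degenerate and no spin transitions occur. Much information is contained in the (reduced) density matrix
$$\rho_{\alpha\mathbf{k},\beta\mathbf{k}} = \langle \hat a^\dagger_{\beta,\mathbf{k}} \hat a_{\alpha,\mathbf{k}} \rangle ; \tag{2}$$
in particular the occupation probabilities are given by the diagonal elements $f_\alpha(\mathbf{k}) = \rho_{\alpha\mathbf{k},\alpha\mathbf{k}}$. The average current density (in the $z$-direction) is evaluated by the temporal evolution of the position operator $\hat z$,
$$J = \frac{e}{V}\left\langle \frac{d\hat z}{dt} \right\rangle = \underbrace{\frac{e}{V}\left\langle \frac{i}{\hbar}[\hat H_o, \hat z] \right\rangle}_{=J_0} + \underbrace{\frac{e}{V}\left\langle \frac{i}{\hbar}[\hat H_{\rm scatt}, \hat z] \right\rangle}_{=J_{\rm scatt}} , \tag{3}$$
where $V$ denotes the normalization volume. Let us first consider the current $J_0$. For an arbitrary choice of the basis we may write
$$J_0 = \frac{e}{V}\frac{i}{\hbar}\, 2\,(\text{for spin}) \sum_{\alpha\beta\mathbf{k}} W_{\beta,\alpha}(\mathbf{k})\, \rho_{\alpha\mathbf{k},\beta\mathbf{k}} , \tag{4}$$
where
$$W_{\beta,\alpha}(\mathbf{k}) = \sum_\gamma \left[ H^o_{\beta\gamma}(\mathbf{k})\, z_{\gamma\alpha} - z_{\beta\gamma}\, H^o_{\gamma\alpha}(\mathbf{k}) \right] \tag{5}$$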
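is an anti-hermitian matrix. If the wave functions $\Psi_\alpha(z)$ are chosen real, which is typical for bound states, $W_{\beta,\alpha}(\mathbf{k})$ becomes real and $J_0$ is determined by the non-diagonal elements of $\rho_{\alpha\mathbf{k},\beta\mathbf{k}}$. For a scattering part of the form
$$\hat H_{\rm scatt} = \sum_{\alpha\gamma\mathbf{k},\mathbf{k}',s} \hat O_{\alpha\mathbf{k},\gamma\mathbf{k}'}(t)\, \hat a^\dagger_{\alpha\mathbf{k}}(t)\, \hat a_{\gamma\mathbf{k}'}(t) , \tag{6}$$
which contains only pairs of electronic annihilation and creation operators, we obtain
$$J_{\rm scatt} = i\frac{2e}{V\hbar} \sum_{\alpha\mathbf{k}} \sum_{\gamma\beta\mathbf{k}'} \left\langle \hat a^\dagger_{\alpha\mathbf{k}} \left[ \hat O_{\alpha\mathbf{k},\beta\mathbf{k}'}(t)\, z_{\beta\gamma} - z_{\alpha\beta}\, \hat O_{\beta\mathbf{k},\gamma\mathbf{k}'}(t) \right] \hat a_{\gamma\mathbf{k}'} \right\rangle . \tag{7}$$
In the case of phonon scattering, $\hat O_{\alpha\mathbf{k},\gamma\mathbf{k}'}$ contains phonon annihilation and creation operators and thus phonon-assisted density matrices [23] determine $J_{\rm scatt}$ for this scattering process. To evaluate the density matrices we perform the perturbation expansion within the formalism of nonequilibrium Green functions [24,25,26], similar to [27]. The key quantities are the lesser and retarded Green functions
$$G^<_{\alpha_1,\alpha_2}(\mathbf{k}; t_1, t_2) = i\langle a^\dagger_{\alpha_2\mathbf{k}}(t_2)\, a_{\alpha_1\mathbf{k}}(t_1) \rangle \tag{8}$$
$$G^{\rm ret}_{\alpha_1,\alpha_2}(\mathbf{k}; t_1, t_2) = -i\Theta(t_1 - t_2) \langle a_{\alpha_1\mathbf{k}}(t_1)\, a^\dagger_{\alpha_2\mathbf{k}}(t_2) + a^\dagger_{\alpha_2\mathbf{k}}(t_2)\, a_{\alpha_1\mathbf{k}}(t_1) \rangle , \tag{9}$$
where the time dependence is taken in the Heisenberg picture. The lesser Green function refers to the electron density, and it becomes the density matrix $\rho_{\alpha_1\mathbf{k},\alpha_2\mathbf{k}}(t) = G^<_{\alpha_1,\alpha_2}(\mathbf{k}; t, t)/i$ for equal times. In the stationary state considered here the Green functions only depend on the time difference $t = t_1 - t_2$, and we introduce the energy $E$ as the Fourier conjugate of $t$:
$$G(t_2 + t, t_2) = \int \frac{dE}{2\pi}\, G(E)\, e^{-iEt/\hbar} . \tag{10}$$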
This provides us with the Dyson equation
$$\left[ E - \mathbf{H}_o(\mathbf{k}) - \boldsymbol{\Sigma}^{\rm ret}(\mathbf{k}, E) \right] \mathbf{G}^{\rm ret}(\mathbf{k}, E) = \mathbf{1} \tag{11}$$
and the Keldysh relation
$$\mathbf{G}^<(\mathbf{k}, E) = \mathbf{G}^{\rm ret}(\mathbf{k}, E)\, \boldsymbol{\Sigma}^<(\mathbf{k}, E)\, \mathbf{G}^{\rm adv}(\mathbf{k}, E) , \tag{12}$$
where capital bold symbols represent matrices in $\alpha\beta$. Together with the functionals $\boldsymbol{\Sigma}\{\mathbf{G}\}$ for the self-energies this provides a self-consistent set of equations which can be solved numerically. Although the Green functions are diagonal in $\mathbf{k}$, the expression (7) for $J_{\rm scatt}$ can be evaluated by Eq. (21) as derived in the appendix. Here we use self-energies in self-consistent Born approximation for impurity, interface roughness, and phonon scattering, applying the following approximations: (i) The $\mathbf{k}$-dependence of the scattering matrix elements is neglected. (ii) It is assumed that $\boldsymbol{\Sigma}$ is diagonal and depends only on the diagonal elements of $\mathbf{G}$ in the basis of Wannier functions. The scattering matrix elements are evaluated for a typical momentum transfer, assuming an interface roughness with an average height of $0.28/\sqrt{2\pi}$ nm (see footnote 1) and a correlation length of 10 nm. The impurity scattering was estimated by an effective scattering rate $\gamma_{\rm imp}/\hbar$. Electron-electron interaction is included within the mean-field approximation. See [22] for further details.
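The algebra of Eqs. (11) and (12) can be made concrete with a small numerical sketch. It assumes a hypothetical two-state basis at one fixed $\mathbf{k}$ and energy $E$, with energy-independent diagonal self-energies chosen by hand; the self-consistent determination of $\boldsymbol{\Sigma}\{\mathbf{G}\}$ described in the text is not included. All function names and numbers are illustrative assumptions, and $\mathbf{G}^{\rm adv}$ is taken as the Hermitian conjugate of $\mathbf{G}^{\rm ret}$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def green_functions(E, H, sigma_ret, sigma_less):
    """Solve Eq. (11) for G^ret and apply Eq. (12) to obtain G^<."""
    m = [[(E if i == j else 0.0) - H[i][j] - sigma_ret[i][j]
          for j in range(2)] for i in range(2)]
    g_ret = inverse(m)                       # Dyson equation (11)
    g_adv = dagger(g_ret)                    # advanced function (assumed relation)
    g_less = matmul(matmul(g_ret, sigma_less), g_adv)  # Keldysh relation (12)
    return g_ret, g_adv, g_less


# Assumed illustrative values in eV: two coupled levels, constant broadening,
# and a lesser self-energy i*GAMMA*f with hand-picked occupations f.
H = [[0.0, -0.01], [-0.01, 0.05]]
GAMMA = 0.005
SIGMA_RET = [[-0.5j * GAMMA, 0.0], [0.0, -0.5j * GAMMA]]
SIGMA_LESS = [[1j * GAMMA * 0.8, 0.0], [0.0, 1j * GAMMA * 0.1]]

g_ret, g_adv, g_less = green_functions(0.02, H, SIGMA_RET, SIGMA_LESS)
# Energy-resolved analogue of the density matrix, G^<(E)/i.
print([[g_less[i][j] / 1j for j in range(2)] for i in range(2)])
```

With an anti-Hermitian $\boldsymbol{\Sigma}^<$, the resulting $\mathbf{G}^<$ is anti-Hermitian as well, so that $\mathbf{G}^</i$ is Hermitian, as required for a density matrix.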
3 The Current-Voltage Characteristic
We perform our calculation using a basis of Wannier functions. These functions are shown in Fig. 1 for zero bias and an operating field of 220 mV per period for the sample used in [4]. While the spatial structure of the Wannier functions does not change with bias, their energetic position is affected both by the external field and by the mean field, which is evaluated self-consistently. From Fig. 1 we see that the mean field almost vanishes at operating conditions, as the electrons are mainly in the injector region where the doping is also located. The energy levels of the Wannier functions bunch at the operating field, indicating the strong coupling between the functions that enables transport through the structure. In Fig. 2 the current-voltage characteristic is shown for different doping densities $N_D$ per period. The theoretical result exhibits a monotonic increase of the current density with doping, showing that the mean field has no dramatic influence on the transport behavior in these structures. We find good agreement with the experimental data except for $N_D = 3.5 \times 10^{11}$/cm$^2$ and $N_D = 5.3 \times 10^{11}$/cm$^2$, where the experiment exhibits a significantly higher bias drop. The difference is of the same order as the bias of a test structure containing only contact layers and the waveguides, although it is not clear why this additional bias drop is only present in some samples.
Fig. 1. Conduction band offset including mean field potential and Wannier functions for two different electric fields for the sample of [4]. [Two panels, eFd = 0 mV and eFd = 220 mV; axes: V [meV] versus z [nm].]
Footnote 1: In the calculations performed in [22] a factor 2π was lacking in the program, which can be compensated by the reduction in the roughness height.
4 Comparing $J_0$ and $J_{\rm scatt}$
In Fig. 3(a,b) we show the different contributions to the current evaluated for the structures of [4] and [9], respectively. Both current-field relations are in reasonable agreement with the respective experimental results. (The data of [9] only extend to $Fd \approx 70$ mV, therefore there is no verification of the current peak.) While $J_{\rm scatt}$ dominates the behavior in Fig. 3a, both contributions $J_0$ and $J_{\rm scatt}$ are important in Fig. 3b.
Fig. 2. Current-voltage characteristics for different doping densities for the structures of [28] at T = 77 K. (a) Experimental data from M. Giehler (PDI Berlin), where the thin line refers to a structure grown without the QCL structure, thus providing an estimate for contact effects. (b) Theoretical result for $\gamma_{\rm imp} = 5$ meV using the bias $U = NFd$, where the QCL structure consists of N = 30 periods. [Axes: V versus J (kA cm$^{-2}$); curves labeled by $n_e$ ($10^{11}$ cm$^{-2}$) = 3.5, 3.8, 4.6, 5.3, 7, 10, plus a "contact & waveguide" curve.]
Fig. 3. Contributions to the current density for the samples of [4] (a) and [9] (b). $\gamma_{\rm imp} = 0$ was used in the calculations. [Axes: current density J (kA/cm$^2$ in (a), A/cm$^2$ in (b)) versus voltage drop per period [V]; curves: $J = J_0 + J_{\rm scatt}$, $J_0$, $J_{\rm scatt}$; T = 77 K in (a), T = 30 K in (b).]
In the following we discuss the role of the two current contributions with respect to the use of semiclassical approaches. As the expressions for $J_0$ and $J_{\rm scatt}$ are invariant under unitary transformations of the basis states, they can be evaluated in arbitrary basis sets. A special basis set is given by the energy eigenstates $\varphi_\mu(z)$ obtained by diagonalizing $\hat H_o$ (including the mean field), which will be used in the following argumentation. The semiclassical theories used in [19,20] imply that the density matrix is diagonal in the energy eigenstates, i.e., $\rho_{\mu\mathbf{k},\mu'\mathbf{k}} = \delta_{\mu\mu'} f_\mu(\mathbf{k})$. In this basis the diagonal elements $W_{\mu\mu}$ in Eq. (5) vanish and thus $J_0$ becomes zero in the semiclassical approximation. In the semiclassical approximation the Green functions in the basis of energy eigenstates are given by
$$G^{\rm ret/adv}_{\mu,\nu}(\mathbf{k}, E) \approx \mp \pi i\, \delta_{\mu,\nu}\, \delta(E - E_\mu) \tag{13}$$
$$G^<_{\mu,\nu}(\mathbf{k}, E) \approx 2\pi i\, \delta_{\mu,\nu}\, \delta(E - E_\mu)\, f_\mu(\mathbf{k}) . \tag{14}$$
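Then we find from Eq. (21)
$$J_{\rm scatt} \approx \frac{2e}{V\hbar} \sum_{\mu\mathbf{k}} \left\{ \frac{i}{2} \Sigma^{<z}_{\mu\mu}(\mathbf{k}, E_\mu) + i f_\mu(\mathbf{k})\, \Sigma^{{\rm ret}\,z}_{\mu\mu}(\mathbf{k}, E_\mu) - \sum_\nu z_{\mu\nu} \left[ \frac{i}{2} \Sigma^<_{\nu\mu}(\mathbf{k}, E_\mu) + i f_\mu(\mathbf{k})\, \Sigma^{\rm ret}_{\nu\mu}(\mathbf{k}, E_\mu) \right] \right\} . \tag{15}$$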
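In the semiclassical approximation the self-energies are related to the scattering probabilities $R_{\mu'\mathbf{k}'\to\mu\mathbf{k}}$ as follows:
$$\Sigma^<_{\mu\mu}(\mathbf{k}, E_\mu) = i\hbar \sum_{\mu'\mathbf{k}'} f_{\mu'}(\mathbf{k}')\, R_{\mu'\mathbf{k}'\to\mu\mathbf{k}} , \qquad \Sigma^{\rm ret}_{\mu\mu}(\mathbf{k}, E_\mu) = -\frac{i}{2}\hbar \sum_{\mu'\mathbf{k}'} R_{\mu\mathbf{k}\to\mu'\mathbf{k}'} ,$$
and the quantities $\Sigma^{<z}$ and $\Sigma^{{\rm ret}\,z}$ contain an additional factor $z_{\mu'\mu'}$. This provides us with
$$J_{\rm scatt} \approx \frac{2e}{V} \sum_{\mu\mathbf{k}\mu'\mathbf{k}'} f_\mu(\mathbf{k})\, R_{\mu\mathbf{k}\to\mu'\mathbf{k}'} \left( z_{\mu'\mu'} - z_{\mu\mu} \right) , \tag{16}$$
which is the semiclassical expression for the current density (see footnote 2). Each transition is weighted by the occupation $f_\mu(\mathbf{k})$ of its initial state, as follows from inserting the self-energies above into Eq. (15). Therefore the entire current is contained in $J_{\rm scatt}$ in the semiclassical approximation. For the special case of the structure considered in [4], the density matrix is approximately diagonal in the basis of energy eigenstates (see footnote 3), implying that $J_0 \to 0$ and that $J_{\rm scatt}$ is well approximated by the semiclassical expression (16). This expectation is supported by Fig. 3a and the findings of [20]. In contrast, $J_0$ is an important contribution for the THz laser of [9], which contains only 4 barriers per period, see Fig. 3b. Therefore it is questionable whether semiclassical approaches work here.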
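The statement that $J_0$ vanishes for a density matrix diagonal in the energy eigenbasis can be checked with a minimal sketch of Eqs. (4) and (5) for a single $\mathbf{k}$. The two-level numbers below (energies in eV, positions in nm) are hypothetical, and the function `j0_weight` returns only the summand $i\sum_{\alpha\beta} W_{\beta,\alpha}\rho_{\alpha,\beta}$, leaving out the prefactor $2e/(V\hbar)$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator_w(H, z):
    """W = H z - z H, the matrix of Eq. (5) for one in-plane momentum."""
    hz, zh = matmul(H, z), matmul(z, H)
    n = len(H)
    return [[hz[i][j] - zh[i][j] for j in range(n)] for i in range(n)]


def j0_weight(W, rho):
    """i * sum_{alpha,beta} W[beta][alpha] * rho[alpha][beta], cf. Eq. (4)."""
    n = len(W)
    return 1j * sum(W[b][a] * rho[a][b] for a in range(n) for b in range(n))


# Assumed example in the energy eigenbasis: H is diagonal, z is real symmetric.
H = [[0.0, 0.0], [0.0, 0.1]]
Z = [[1.0, 0.3], [0.3, 4.0]]
W = commutator_w(H, Z)

RHO_DIAGONAL = [[0.7, 0.0], [0.0, 0.3]]        # semiclassical density matrix
RHO_COHERENT = [[0.7, 0.05j], [-0.05j, 0.3]]   # Hermitian, with a coherence

print(j0_weight(W, RHO_DIAGONAL))   # vanishes: W has no diagonal elements
print(j0_weight(W, RHO_COHERENT))   # real and nonzero
```

In this example $W$ is real and antisymmetric, so only the imaginary part of the coherence $\rho_{12}$ contributes. A real off-diagonal element would again give zero.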
5 Gain and Absorption Spectra
The general evaluation of gain spectra within the quantum transport model used here was described in detail in [29]. The key idea is to evaluate the complex susceptibility $\chi(\omega)$, which is related to the optical absorption coefficient at a frequency $\omega$ via [30]
$$\alpha(\omega) = \frac{\omega}{c\, n_B}\, \Im\{\chi(\omega)\} , \tag{17}$$
where $n_B$ is the background refractive index and $c$ is the speed of light. Figure 4 shows the gain spectrum for the sample of [4]. At zero current we find strong absorption due to transitions in the active region, which vanishes already for small currents as the carriers are transferred to the injector region. Pronounced gain around $\hbar\omega = 130$ meV sets in for current densities of several kA/cm$^2$. The height and width of the gain spectrum are in good agreement with the findings of [31].
Fig. 4. Gain spectrum for the sample of [4] using $\gamma_{\rm imp} = 0$. (From [29]) [Axes: $G_m$ [1/cm] versus $\hbar\omega$ [eV]; T = 77 K; curves for J = 0, 0.8, 2.6, 6.5, and 11.6 kA/cm$^2$.]
Footnote 2: Terms of the form $z_{\mu\nu}\, \hat O_{\nu\mathbf{k},\mu'\mathbf{k}'}\, G_{\mu'\mu'}(\mathbf{k}')\, \hat O_{\mu'\mathbf{k}',\mu\mathbf{k}}$ have been neglected for $\nu \neq \mu$ here. Their implication is not clear yet.
Footnote 3: Note that $J_0$ also vanishes if the density matrix is diagonal in a basis of real states which are not energy eigenstates. But then $J_{\rm scatt}$ no longer corresponds to the semiclassical result.
Within the semiclassical approximation the susceptibility is given by [32]
$$\Im\{\chi(\omega)\} = \sum_{\mu\nu\mathbf{k}} \frac{|\wp_{\mu\nu}|^2}{\epsilon_0 V}\, \frac{[f_\mu(\mathbf{k}) - f_\nu(\mathbf{k})]\,(\Gamma_\nu + \Gamma_\mu)}{(E_\nu - E_\mu - \hbar\omega)^2 + (\Gamma_\nu + \Gamma_\mu)^2/4} , \tag{18}$$
where $\Gamma_\nu$ is the FWHM of $\Im\{G^{\rm ret}_{\nu\nu}(\mathbf{k}, E)\}$ and $\wp_{\mu\nu} = e z_{\mu\nu}$ is the dipole matrix element. In [22] it was shown that this semiclassical approach gives reasonable results compared with the quantum model for the structure of [4]. In these approaches the influence of electron-electron interaction was totally neglected. Here we study the influence of many-particle corrections within the Hartree-Fock approximation on the gain spectrum. The susceptibility is decomposed by
$$\chi(\omega) = 2\,(\text{for spin}) \sum_{\mu,\nu,\mathbf{k}} \wp_{\mu\nu}\, \chi_{\nu,\mu}(\mathbf{k}, \omega) ,$$
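where the susceptibility functions $\chi_{\nu,\mu}(\mathbf{k}, \omega)$ between the eigenstates $\nu$ and $\mu$ are determined by the equation
$$\wp_{\nu\mu}(\mathbf{k})\, [f_\nu(\mathbf{k}) - f_\mu(\mathbf{k})] = \hbar \left[ \omega - e_\nu(\mathbf{k}) + e_\mu(\mathbf{k}) + i(\Gamma_\mu + \Gamma_\nu)/2 \right] \chi_{\nu\mu}(\mathbf{k}, \omega) + [f_\nu(\mathbf{k}) - f_\mu(\mathbf{k})]\, 2 V_0^{\nu\mu\mu\nu} \sum_{\mathbf{k}'} \chi_{\nu\mu}(\mathbf{k}', \omega) - [f_\nu(\mathbf{k}) - f_\mu(\mathbf{k})] \sum_{\mathbf{k}'} \chi_{\nu\mu}(\mathbf{k}', \omega)\, V^{\nu\nu\mu\mu}_{\mathbf{k}-\mathbf{k}'} . \tag{19}$$
In the equilibrium case, for only two isolated subbands of an idealized quantum well with phenomenological dephasing characterizing the broadening, Eq. (19) reduces to Eq. (5) of [33]. The bare Coulomb interaction and the renormalized energies which appear above are given by
$$V^{\mu\nu\alpha\beta}_{\mathbf{k}-\mathbf{k}'} = \int dz \int dz'\, \phi^*_\mu(z)\, \phi_\nu(z)\, \frac{e^2\, e^{-|\mathbf{k}-\mathbf{k}'||z-z'|}}{2 \epsilon_r \epsilon_0 A |\mathbf{k}-\mathbf{k}'|}\, \phi^*_\alpha(z')\, \phi_\beta(z')$$
with the normalization area $A$ and
$$\hbar e_\nu(\mathbf{k}) = E_\nu(\mathbf{k}) - \sum_{\mathbf{k}'} f_\nu(\mathbf{k}')\, V^{\nu\nu\nu\nu}_{\mathbf{k}-\mathbf{k}'} + \sum_{\mathbf{k}'} f_\nu(\mathbf{k}')\, V^{\nu\mu\mu\nu}_{\mathbf{k}-\mathbf{k}'} .$$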
Fig. 5. Absorption spectrum for the sample from Fig. 2 with $N_d = 3.8 \times 10^{11}$/cm$^2$ at an operating field of $Fd = 0.2$ V. The dashed line gives the result from Eq. (18). The full line includes Coulomb corrections according to Eq. (19). [Axes: absorption [1/cm] versus $E_{\rm opt}$ [meV]; curves: HF, free particle.]
The second term on the right-hand side of (19) gives rise to the depolarization shift [34,35], while the last term (exchange contribution) is analogous to the excitonic coupling term in interband transitions [36]. Figure 5 shows the absorption spectra. The inclusion of many-particle corrections yields a blue shift of about 5 meV for the low frequency absorption peak and a slight red shift for the gain peak around 130 meV.
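The line shape entering the semiclassical susceptibility of Eq. (18) can be examined for a single pair of levels. The sketch below assumes one transition between a lower level $\mu$ and an upper level $\nu$ at fixed $\mathbf{k}$ and lumps the prefactor $|\wp_{\mu\nu}|^2/(\epsilon_0 V)$ into a hypothetical parameter `strength`. The function names and all numbers (energies in eV) are illustrative, and the Coulomb corrections of Eq. (19) are not included.

```python
import math


def lineshape(hw, e_mu, e_nu, gamma_mu, gamma_nu):
    """Lorentzian factor of Eq. (18) with total width gamma_nu + gamma_mu."""
    g = gamma_nu + gamma_mu
    return g / ((e_nu - e_mu - hw) ** 2 + g ** 2 / 4.0)


def im_chi_single(hw, e_mu, e_nu, f_mu, f_nu, gamma_mu, gamma_nu, strength=1.0):
    """One (mu, nu, k) term of Eq. (18); `strength` is an assumed prefactor."""
    return strength * (f_mu - f_nu) * lineshape(hw, e_mu, e_nu, gamma_mu, gamma_nu)


# Assumed values: transition energy 0.13 eV, each level broadened by 5 meV.
E_MU, E_NU, G_MU, G_NU = 0.0, 0.13, 0.005, 0.005

peak = lineshape(0.13, E_MU, E_NU, G_MU, G_NU)
half = lineshape(0.13 + (G_MU + G_NU) / 2.0, E_MU, E_NU, G_MU, G_NU)
print(math.isclose(half, peak / 2.0))  # FWHM equals gamma_nu + gamma_mu

# A more populated lower level absorbs; an inverted population gives gain.
print(im_chi_single(0.13, E_MU, E_NU, 0.8, 0.1, G_MU, G_NU) > 0)
print(im_chi_single(0.13, E_MU, E_NU, 0.1, 0.8, G_MU, G_NU) < 0)
```

Through Eq. (17) the sign of $\Im\{\chi\}$ carries over to $\alpha(\omega)$, so that a negative value in this sketch corresponds to gain.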
6 Discussion
The impact of quantum effects on transport and gain in quantum cascade lasers has been examined. In the evaluation of the current two different terms, $J_0 \propto [\hat H_o, \hat z]$ and $J_{\rm scatt} \propto [\hat H_{\rm scatt}, \hat z]$, appear. In the semiclassical approximation, where the density matrix is assumed to be diagonal in the basis of energy eigenstates, $J_{\rm scatt}$ carries the entire current. Our quantum transport calculations show that $J_{\rm scatt}$ dominates the behavior for the frequently studied prototype sample of [4], thus justifying the semiclassical approaches in [19,20]. On the other hand, the current $J_0$, resulting from nondiagonal elements in the density matrix, shows strong contributions for the THz laser of [9]. Nevertheless, it is not yet clear to what extent the assumption of diagonal self-energies in the Wannier basis affects this behavior. Ongoing work is focused on the inclusion of the full matrix structure in the self-energies. The many-particle effects depend strongly on the structure, since its design determines the actual electronic overlap and subband occupation. For the case considered here, the gain spectra are hardly modified by the electron-electron interactions within the Hartree-Fock approximation, while a significant depolarization shift occurs for the low-frequency absorption.
Acknowledgements
Helpful discussions with M. Giehler, H.T. Grahn, A. Knorr, L. Schrottke, and M. Wörner, as well as financial support by DFG within FOR394, are gratefully acknowledged.
References
1. J. Faist, F. Capasso, D. L. Sivco, C. Sirtori, A. L. Hutchinson, and A. Y. Cho, Science 264, 553 (1994).
2. M. Beck, D. Hofstetter, T. Aellen, J. Faist, U. Oesterle, M. Ilegems, E. Gini, and H. Melchior, Science 295, 301 (2002).
3. R. Köhler, A. Tredicucci, F. Beltram, H. E. Beere, E. H. Linfield, A. G. Davies, D. A. Ritchie, R. C. Iotti, and F. Rossi, Nature 417, 156 (2002).
4. C. Sirtori, P. Kruck, S. Barbieri, P. Collot, J. Nagle, M. Beck, J. Faist, and U. Oesterle, Appl. Phys. Lett. 73, 3486 (1998).
5. G. Scamarcio, F. Capasso, J. Faist, C. Sirtori, D. L. Sivco, A. L. Hutchinson, and A.-Y. Cho, Appl. Phys. Lett. 70, 1796 (1997).
6. G. Strasser, S. Gianordoli, L. Hvozdara, W. Schrenk, K. Unterrainer, and E. Gornik, Appl. Phys. Lett. 75, 1345 (1999).
7. M. C. Wanke, F. Capasso, C. Gmachl, A. Tredicucci, D. L. Sivco, A. L. Hutchinson, S.-N. G. Chu, and A. Y. Cho, Appl. Phys. Lett. 78, 3950 (2001).
8. N. Ulbrich, G. Scarpa, G. Böhm, G. Abstreiter, and M. Amann, Appl. Phys. Lett. 80, 4312 (2002).
9. B. S. Williams, H. Callebaut, S. Kumar, Q. Hu, and J. L. Reno, Appl. Phys. Lett. 82, 1015 (2003).
10. F. Capasso, J. Faist, and C. Sirtori, J. Math. Phys. 37, 4775 (1996).
11. D. Paulavičius, V. Mitin, and M. A. Stroscio, J. Appl. Phys. 84, 3459 (1998).
12. S. Slivken, V. I. Litvinov, M. Razeghi, and J. R. Meyer, J. Appl. Phys. 85, 665 (1999).
13. P. Hyldgaard and J. W. Wilkins, Phys. Rev. B 53, 6889 (1996).
14. P. Harrison, Appl. Phys. Lett. 75, 2800 (1999).
15. V. M. Apalkov and T. Chakraborty, Appl. Phys. Lett. 78, 1973 (2001).
16. S. Tortora, F. Compagnone, A. Di Carlo, P. Lugli, M. T. Pellegrini, M. Troccoli, and G. Scamarcio, Physica B 272, 219 (1999).
17. R. C. Iotti and F. Rossi, Appl. Phys. Lett. 76, 2265 (2000).
18. R. C. Iotti and F. Rossi, Appl. Phys. Lett. 78, 2902 (2001).
19. K. Donovan, P. Harrison, and R. W. Kelsall, J. Appl. Phys. 89, 3084 (2001).
20. R. C. Iotti and F. Rossi, Phys. Rev. Lett. 87, 146603 (2001).
21. A. Wacker, in Advances in Solid State Physics, edited by B. Kramer (Springer, Berlin, 2001), p. 199.
22. S.-C. Lee and A. Wacker, Phys. Rev. B 66, 245314 (2002).
23. T. Kuhn, in Theory of Transport Properties of Semiconductor Nanostructures, edited by E. Schöll (Chapman and Hall, London, 1998).
24. L. P. Kadanoff and G. Baym, Quantum Statistical Mechanics (Benjamin, New York, 1962).
25. L. V. Keldysh, Sov. Phys. JETP 20, 1018 (1965), [Zh. Eksp. Theor. Fiz. 47, 1515 (1964)].
26. H. Haug and A.-P. Jauho, Quantum Kinetics in Transport and Optics of Semiconductors (Springer, Berlin, 1996).
27. A. Wacker, Phys. Rep. 357, 1 (2002).
28. M. Giehler, R. Hey, H. Kostial, S. Cronenberg, T. Ohtsuka, L. Schrottke, and H. T. Grahn, Appl. Phys. Lett. 82, 671 (2003).
29. A. Wacker, Phys. Rev. B 66, 085326 (2002).
30. J. D. Jackson, Classical Electrodynamics, 3 ed. (John Wiley & Sons, New York, 1998).
31. F. Eickemeyer, R. A. Kaindl, M. Woerner, T. Elsaesser, S. Barbieri, P. Kruck, C. Sirtori, and J. Nagle, Appl. Phys. Lett. 76, 3254 (2000).
32. H. Haug and S. W. Koch, Quantum Theory of the Optical and Electronic Properties of Semiconductors, 2 ed. (World Scientific, Singapore, 1993).
33. S. L. Chuang, M. S. C. Luo, S. Schmitt-Rink, and A. Pinczuk, Phys. Rev. B 46, 1897 (1992).
34. T. Ando, A. B. Fowler, and F. Stern, Rev. Mod. Phys. 54, 437 (1982).
35. M. Helm, in Intersubband Transitions in Quantum Wells: Physics and Device Applications, edited by E. R. Weber and R. K. Willardson (Academic Press, 1999), Vol. 62, p. 1.
36. M. F. Pereira, Jr., R. Binder, and S. W. Koch, Appl. Phys. Lett. 64, 279 (1994).
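Appendix
Similarly to [22], the scattering current (7) can be evaluated in the following way. We define the contour-ordered Green function (superscript c)
$$F^c_{\alpha\mathbf{k}}(\tau_1, \tau_2) = -i \Big\langle \hat T_c \Big\{ \sum_{\gamma\beta\mathbf{k}'} \Big[ \hat O_{\alpha\mathbf{k},\beta\mathbf{k}'}(\tau_1)\, z_{\beta\gamma}\, \hat a_{\gamma\mathbf{k}'}(\tau_1)\, \hat a^\dagger_{\alpha\mathbf{k}}(\tau_2) - z_{\alpha\beta}\, \hat O_{\beta\mathbf{k},\gamma\mathbf{k}'}(\tau_1)\, \hat a_{\gamma\mathbf{k}'}(\tau_1)\, \hat a^\dagger_{\alpha\mathbf{k}}(\tau_2) \Big] \Big\} \Big\rangle .$$
They are evaluated in the Dirac representation (with index D)
$$F^c_{\alpha\mathbf{k}}(\tau_1, \tau_2) = -i \Big\langle \hat T_c \Big\{ e^{\int d\tau\, \frac{1}{i\hbar} \hat H^D_{\rm scatt}(\tau)} \sum_{\gamma\beta\mathbf{k}'} \Big[ \hat O^D_{\alpha\mathbf{k},\beta\mathbf{k}'}(\tau_1)\, z_{\beta\gamma}\, \hat a^D_{\gamma\mathbf{k}'}(\tau_1)\, \hat a^{D\dagger}_{\alpha\mathbf{k}}(\tau_2) - z_{\alpha\beta}\, \hat O^D_{\beta\mathbf{k},\gamma\mathbf{k}'}(\tau_1)\, \hat a^D_{\gamma\mathbf{k}'}(\tau_1)\, \hat a^{D\dagger}_{\alpha\mathbf{k}}(\tau_2) \Big] \Big\} \Big\rangle .$$
The lowest-order non-vanishing terms of the expansion give
$$F^c_{\alpha\mathbf{k}}(\tau_1, \tau_2) \approx \frac{1}{\hbar} \int d\tau \sum_{\gamma\beta\mathbf{k}'} \sum_{\delta\epsilon} \Big\langle \hat O_{\alpha\mathbf{k},\beta\mathbf{k}'}(\tau_1)\, z_{\beta\gamma}\, G^{c0}_{\gamma,\delta}(\mathbf{k}'; \tau_1, \tau)\, \hat O_{\delta\mathbf{k}',\epsilon\mathbf{k}}(\tau)\, G^{c0}_{\epsilon,\alpha}(\mathbf{k}; \tau, \tau_2) - z_{\alpha\beta}\, \hat O_{\beta\mathbf{k},\gamma\mathbf{k}'}(\tau_1)\, G^{c0}_{\gamma,\delta}(\mathbf{k}'; \tau_1, \tau)\, \hat O_{\delta\mathbf{k}',\epsilon\mathbf{k}}(\tau)\, G^{c0}_{\epsilon,\alpha}(\mathbf{k}; \tau, \tau_2) \Big\rangle$$
with the bare Green functions $G^{c0}_{\alpha,\gamma}(\mathbf{k}; \tau_1, \tau) = -i \langle \hat T_c \{ \hat a^D_{\alpha\mathbf{k}}(\tau_1)\, \hat a^{D\dagger}_{\gamma\mathbf{k}}(\tau) \} \rangle$. In order to be consistent with the perturbation expansion in the Green functions, further terms are taken into account, which replace the bare Green functions by the full Green functions. Then we find
$$F^c_{\alpha\mathbf{k}}(\tau_1, \tau_2) \approx \frac{1}{\hbar} \int d\tau \Big[ \sum_\epsilon \Sigma^{cz}_{\alpha\epsilon}(\mathbf{k}; \tau_1, \tau)\, G^c_{\epsilon,\alpha}(\mathbf{k}; \tau, \tau_2) - \sum_{\beta\epsilon} z_{\alpha\beta}\, \Sigma^c_{\beta\epsilon}(\mathbf{k}; \tau_1, \tau)\, G^c_{\epsilon,\alpha}(\mathbf{k}; \tau, \tau_2) \Big]$$
with
$$\Sigma^{cz}_{\alpha\epsilon}(\mathbf{k}; \tau_1, \tau) = \sum_{\gamma\beta\delta\mathbf{k}'} \Big\langle \hat O_{\alpha\mathbf{k},\beta\mathbf{k}'}(\tau_1)\, z_{\beta\gamma}\, G^c_{\gamma,\delta}(\mathbf{k}'; \tau_1, \tau)\, \hat O_{\delta\mathbf{k}',\epsilon\mathbf{k}}(\tau) \Big\rangle , \tag{20}$$
where the averaging refers to the phonon bath for phonon scattering. Thus, in Born approximation the self-energies $\boldsymbol{\Sigma}^z$ are given by the usual functionals for the self-energies $\boldsymbol{\Sigma}(\mathbf{G})$, where the Green functions $\mathbf{G}$ are replaced by $\mathbf{Z} \cdot \mathbf{G}$ in matrix notation. Using Langreth rules and changing to the energy representation, $F^<_{\alpha\mathbf{k}}(t, t)$ can be inserted in Eq. (7), yielding the final expression
$$J_{\rm scatt} = \frac{2e}{V\hbar} \sum_{\alpha\mathbf{k}} F^<_{\alpha\mathbf{k}}(t, t) = \frac{2e}{V\hbar} \sum_{\alpha\mathbf{k}} \int \frac{dE}{2\pi} \Big\{ \sum_\epsilon \Big[ \Sigma^{<z}_{\alpha\epsilon}(\mathbf{k}, E)\, G^{\rm adv}_{\epsilon,\alpha}(\mathbf{k}, E) + \Sigma^{{\rm ret}\,z}_{\alpha\epsilon}(\mathbf{k}, E)\, G^<_{\epsilon,\alpha}(\mathbf{k}, E) \Big] - \sum_{\beta\epsilon} z_{\alpha\beta} \Big[ \Sigma^<_{\beta\epsilon}(\mathbf{k}, E)\, G^{\rm adv}_{\epsilon,\alpha}(\mathbf{k}, E) + \Sigma^{\rm ret}_{\beta\epsilon}(\mathbf{k}, E)\, G^<_{\epsilon,\alpha}(\mathbf{k}, E) \Big] \Big\} \tag{21}$$
to evaluate $J_{\rm scatt}$.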
From Digital to Analogue Magnetoelectronics: Theory of Transport in Non-collinear Magnetic Nanostructures
Gerrit E.W. Bauer¹, Yaroslav Tserkovnyak², Daniel Huertas-Hernando¹,³, and Arne Brataas⁴
¹ Department of NanoScience, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands, [email protected]
² Lyman Laboratory of Physics, Harvard University, Cambridge, Massachusetts 02138, USA
³ Department of Physics, Sloane Physics Laboratory, Yale University, New Haven, CT 06520-8120, USA
⁴ Department of Physics, Norwegian University of Science and Technology, N-7491.
Abstract. Magnetoelectronics is mainly digital, i.e. governed by up and down magnetizations. In contrast, analogue magnetoelectronics makes use of phenomena occurring for non-collinear magnetization configurations. Here we review theories which have recently been applied to the transport in non-collinear magnetic nanostructures in two- and multiterminal structures, viz. random matrix and circuit theory. Neither is valid for highly transparent systems in a resistive environment like perpendicular metallic spin valves. The solution to this problem is a renormalization of the conventional and spin-mixing conductance parameters.
1 Introduction
The giant magnetoresistance, as well as most of the current magnetoelectronics, can be understood in terms of the transport of electrons in either the spin-up or the spin-down state, since the magnetizations are collinear (parallel or antiparallel) with the spin-quantization axis. Both charge and spin transport can be described in terms of two “channels” with spin-dependent conductivities, scattering rates etc. [1]. This “digital” magnetoelectronics does not profit from the “analogue” freedom of a magnetization to point in any direction. Early seminal contributions by Slonczewski [2] and Berger [3] revealed fundamentally new physics and technological possibilities of non-collinearity, which triggered a large number of experimental and theoretical studies. An important example is the non-equilibrium spin-current induced torque (briefly, spin torque) which one ferromagnet can exert on the magnetization vector of a second magnet through a normal metal in a biased spin valve structure. This torque can be large enough to dynamically turn magnetizations [4], which is potentially interesting as a low-power switching mechanism for magnetic random access memories [5]. Non-collinear magnetizations are also essential for novel magnetic devices like the spin-flip [6,7] and spin-torque transistors [8], detection of spin precession [9], the Gilbert damping of the magnetization dynamics in thin magnetic films [10], and spin injection induced by ferromagnetic resonance [11].
We are interested in heterostructures containing band ferromagnets that are accurately described by a Stoner spin-density functional model. Elemental metals and their alloys have high electron densities, and their thin-film heterostructures are usually considerably disordered. Size quantization effects on transport can therefore mostly be disregarded [12]. Semiclassical methods are appropriate in slowly varying bulk regions of the structures, but heterointerfaces can be atomically sharp and must be treated fully quantum mechanically. There are basically two methods which are suitable to understand and compute transport properties of these systems from first principles, viz. Green function theory with configurational averaging [13] and the scattering formalism for transport, combined with random matrix theory [14]. These two approaches have recently been extended to non-collinear magnetic structures, viz. magnetoelectronic circuit [6] and random matrix theory [15]. Here, as in [16], we show that both approaches are closely related, but do not hold for transparent interfaces. Following Schep’s [17] strategy for collinear systems, both theories can be generalized, leading to analytical results for perpendicular spin valves with parameters that can be obtained by ab initio band structure calculations, as well as determined from experiments.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 383–396, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Boltzmann and Diffusion Equation
When a local non-equilibrium magnetization does not point in the direction of the spin-quantization axis, the distribution function for band states at the Fermi energy with index n is a matrix in Pauli spin space f↑↑ (r) f↑↓ (r) = fnc (r) ˆ1 + σ ˆ ·f sn (r) . (1) fˆn (r) = f↑↓ (r) f↓↓ (r) n On the right hand side the distribution is expanded into unit matrix and the vector of the Pauli spin matrices. fnc is charge accumulation and the spin accumulation f sn is a vector whose direction is always parallel to the magnetization vector m in the bulk of a ferromagnet, but arbitrary in a normal metal depending on device configuration and applied biases. fˆn can be diagonalized by unitary rotation matrices in spin space, characterized by the polar angles θ and ϕ. Let us assume for simplicity that these angles are piecewise constant in position space, thus disregarding magnetic domain walls [18] and magnetic field-induced spin precession in normal metals [9]. In the local spin quantization frame the distribution function is diagonal with two spin components s = ±1. Introducing spin-conserving and spin-flip scattering sf , for the sake of argument taken to be state-independent, life times τs and τs,−s
and separating the distribution into an isotropic electrochemical potential $\mu_s$ and an anisotropic term $\gamma_{ns}$ that vanishes when averaged over the Fermi surface, the Boltzmann equation for the stationary state reads [19]
$$\left(\frac{1}{\tau_s} + \frac{1}{\tau^{sf}_{s,-s}}\right)\gamma_{ns} = \mathbf v_{ns}\cdot\nabla\left(\gamma_{ns} + \mu_s\right) + \frac{\mu_{-s} - \mu_s}{\tau^{sf}_{s,-s}}, \tag{2}$$
where $\mathbf v_{ns}$ is the group velocity of state ns. Charge and spin currents read
$$\mathbf j_c = \frac{e}{hA}\sum_{ns}\mathbf v_{ns}\,\gamma_{ns}, \tag{3}$$
$$\mathbf j_s = \frac{1}{4\pi A}\sum_{ns}\mathbf v_{ns}\,s\,\gamma_{ns}. \tag{4}$$
The Boltzmann equation is still unnecessarily complicated for most realistic systems. In the presence of sufficient disorder, only the lowest harmonics of $\gamma_{ns}$ in reciprocal space survive. In that limit, the Boltzmann equation reduces to the diffusion equation
$$\nabla^2\left[\mu_s(\mathbf r) - \mu_{-s}(\mathbf r)\right] = \frac{\mu_s(\mathbf r) - \mu_{-s}(\mathbf r)}{\ell_{sd}^2}. \tag{5}$$
$\ell_{sd} = \sqrt{D\tau^{sf}}$ is the spin-flip diffusion length, which does not depend on spin index [20]. The spin-averaged diffusion coefficient D can be written in terms of the density of states at the Fermi energy $N_s(E_F)$ and the spin-dependent diffusion coefficients $D_s$ as
$$\frac{1}{\left(N_\uparrow(E_F) + N_\downarrow(E_F)\right)D} = \frac{1}{N_\uparrow(E_F)\,D_\uparrow} + \frac{1}{N_\downarrow(E_F)\,D_\downarrow}. \tag{6}$$
In a simple two-band model $D_s = v_s^2\tau_s/3$, where $v_s$ are the spin-dependent Fermi velocities. The average spin-flip relaxation time is defined as
$$\frac{1}{\tau^{sf}} = \frac{1}{\tau^{sf}_{\uparrow,\downarrow}} + \frac{1}{\tau^{sf}_{\downarrow,\uparrow}}. \tag{7}$$
The currents
$$\mathbf j_s(\mathbf r) = -\frac{\sigma_s}{e}\nabla\mu_s(\mathbf r) \tag{8}$$
are governed by the spin-dependent conductivities
$$\sigma_s = N_s(E_F)\,e^2 D_s. \tag{9}$$
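As a rough numerical illustration of Eqs. (5) and (7), the following Python sketch evaluates the average spin-flip time and the spin-flip diffusion length, and checks by finite differences that an exponentially decaying spin accumulation solves the diffusion equation. The function names and all numerical values (relaxation times, diffusion coefficient) are assumptions chosen for illustration only; they are not taken from the text.

```python
import math

def spin_flip_time(tau_up_down, tau_down_up):
    """Average spin-flip relaxation time, Eq. (7)."""
    return 1.0 / (1.0 / tau_up_down + 1.0 / tau_down_up)

def spin_diffusion_length(diffusion_coefficient, tau_sf):
    """Spin-flip diffusion length l_sd = sqrt(D * tau_sf)."""
    return math.sqrt(diffusion_coefficient * tau_sf)

def spin_accumulation(x, delta_mu0, l_sd):
    """Decaying solution of Eq. (5) for mu_up - mu_down in the half space x >= 0."""
    return delta_mu0 * math.exp(-x / l_sd)

def relative_residual(x, l_sd, rel_step=1e-3):
    """Finite-difference residual of Eq. (5), normalized by the right hand side."""
    h = rel_step * l_sd
    f = lambda y: spin_accumulation(y, 1.0, l_sd)
    second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2
    rhs = f(x) / l_sd**2
    return (second_derivative - rhs) / rhs

# Hypothetical input values in SI units (illustration only)
tau_sf = spin_flip_time(2e-12, 2e-12)        # s
l_sd = spin_diffusion_length(5e-3, tau_sf)   # m, assuming D = 5e-3 m^2/s
residual = relative_residual(1e-7, l_sd)     # should be close to zero
```

The residual is limited only by the finite-difference step, which is consistent with the exponential decay of the spin accumulation over the length $\ell_{sd}$.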
3
Boundary Conditions
The semiclassical approach is valid when the potential landscape varies slowly on the scale of the Fermi wave length. In heterostructures we often encounter regions in which materials change on an atomic scale, such as at intermetallic interfaces or tunnel junctions, which have to be treated quantum mechanically. The “nodes” are the bulk regions, in which the semiclassical distributions are well defined. The intermediate scattering regions, or “contacts”, can then be treated formally exactly by boundary conditions, which link the distributions of two neighboring nodes. Consider the spin valve structures in Fig. 1, which may be part of a larger circuit. We denote the distribution functions in the ferromagnetic terminals by subscripts L and R. We allow for a drifting distribution by the superscript α = ±1, which indicates whether drift is in (α = 1) or opposite to (α = −1) the transport direction (from left to right). Taking into account the difference between the left- and right-moving distribution functions is our key generalization of the previous theories. In Sect. 5 we discuss how the circuit theory can be recovered by renormalizing the conductance parameters. Here we concentrate on scattering theory. In order to work with simple matrices instead of dyadics, we follow Waintal et al. [15] and introduce the 4 × 1 vector representation $[\mathbf f^\alpha_n(\mathbf r)]^T = (f_{\uparrow\uparrow}(\mathbf r), f_{\uparrow\downarrow}(\mathbf r), f_{\downarrow\uparrow}(\mathbf r), f_{\downarrow\downarrow}(\mathbf r))^\alpha_n$. The boundary conditions for the non-equilibrium distributions to the left and right of the scattering region then read
$$\mathbf f^+_{R,n} = \sum_{m\in L}\left[\check T_{L\to R}\right]_{nm}\mathbf f^+_{L,m} + \sum_{m\in R}\left[\check R_{R\to R}\right]_{nm}\mathbf f^-_{R,m}, \tag{10}$$
$$\mathbf f^-_{L,n} = \sum_{m\in L}\left[\check R_{L\to L}\right]_{nm}\mathbf f^+_{L,m} + \sum_{m\in R}\left[\check T_{R\to L}\right]_{nm}\mathbf f^-_{R,m}. \tag{11}$$
$\check T$, $\check R$ are 4 × 4 transmission and reflection probability matrices and the subscripts indicate the direction of the currents (L → R denotes transmission from left to right, R → R reflection from the right, etc.). All matrix elements follow from the scattering matrix and are normalized, for example
$$\left[\check T_{L\to R}\right]^{nm}_{SS'} = \frac{1}{N^L_S}\left[\mathbf t^{L\to R}_{nm}\right]_S\left[\mathbf t^{L\to R}_{nm}\right]^\dagger_{S'}. \tag{12}$$
Fig. 1. Different realizations of perpendicular spin valves in which θ is the angle between magnetization directions. (a) Highly resistive junctions like point contacts and tunneling barriers limit the conductance. (b) Spin valve in a geometrical constriction amenable to the scattering theory of transport. (c) Magnetic multilayers with transparent interfaces
The transmission amplitudes, such as $t^{L\to R}_{ns,ms'}$ of a wave coming in from the left as mode m and spin s' and going out in mode n and spin s, are here collected in vectors $[\mathbf t^{L\to R}_{nm}]^T = (t^{L\to R}_{\uparrow\uparrow}, t^{L\to R}_{\uparrow\downarrow}, t^{L\to R}_{\downarrow\uparrow}, t^{L\to R}_{\downarrow\downarrow})_{nm}$. We also have $S \in [1,4]$ and $N^L_S = N^F_\uparrow(\delta_{S,1} + \delta_{S,2}) + N^F_\downarrow(\delta_{S,3} + \delta_{S,4})$, where $N^F_s$ is the number of modes for spin s in the ferromagnet. In a nutshell, this is a very general formulation of charge and spin transport, but it is not yet amenable to analytic treatment or analysis of experiments. The isotropy assumption that reduced the Boltzmann to the diffusion equation in the previous section enormously simplifies the results, as demonstrated in the following. We focus here on the electrical charge current as a function of the magnetization configuration in symmetric spin valves, as in Fig. 1(b),(c), in order to keep the analytical manipulations manageable. We will see later that we can derive rules from these results that are valid for general structures. $\check T$ and $\check R$ are functions of the magnetic configuration, which, disregarding magnetic anisotropies, can be parameterized by a single polar angle θ. In first instance, we disregard spin-flip scattering and discuss later how it can be included. Integrating over the lateral coordinates leaves a position dependence only in the transport direction (x). The next step is the assumption that the distribution functions for incident electrons from the left and right are isotropic in space. The distribution functions for the outgoing electrons do not have to be isotropic, as long as they are subsequently scrambled in the nodes. The isotropy assumption may be invoked when the nodes are diffuse or chaotic, such that electrons are distributed equally over all states at the (spin-dependent) Fermi surfaces (which is equivalent to replacing state dependent scattering matrix elements by their average [17]). The Fermi surface integration is then carried out easily, and the distribution functions within left and right ferromagnet nodes (at locations $x_L$ and $x_R$, respectively) are matched via simplified boundary conditions
$$\mathbf f^+(x_R) = \check T_{L\to R}(\theta)\,\mathbf f^+(x_L) + \check R_{R\to R}(\theta)\,\mathbf f^-(x_R), \tag{13a}$$
$$\mathbf f^-(x_L) = \check R_{L\to L}(\theta)\,\mathbf f^+(x_L) + \check T_{R\to L}(\theta)\,\mathbf f^-(x_R). \tag{13b}$$
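The 4 × 4 transmission and reflection probability matrices have elements [15]
$$\left[\check T_{L\to R}\right]_{SS'} = \frac{1}{N^F_S}\sum_{mn}\left[\mathbf t^{R\to L}_{nm}\right]_S\left[\mathbf t^{R\to L}_{nm}\right]^\dagger_{S'}. \tag{14}$$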
In the coordinate systems defined by the magnetization directions, the transverse components of the spin accumulation in the ferromagnets vanish identically [6,21] and the distributions in the magnets depend on the local spin
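current densities $\gamma_s$ and chemical potentials $\mu_s$ only
$$\mathbf f^\pm(x_{L/R}) = \left((\pm\gamma_\uparrow + \mu_\uparrow)(x_{L/R}),\ 0,\ 0,\ (\pm\gamma_\downarrow + \mu_\downarrow)(x_{L/R})\right). \tag{15}$$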
By this choice the explicit angle-dependence of transport is contained only in the matrices. We can now link an arbitrary distribution on the left to the distributions on the right, subject to the constraint of charge current conservation. Here we focus on the simple case in which we apply a bias
$$\Delta\mu = \sum_s\left(\mu_s(x_L) - \mu_s(x_R)\right), \tag{16}$$
but no spin accumulation gradient, $\mu_s(x_L) - \mu_{-s}(x_L) = \mu_s(x_R) - \mu_{-s}(x_R)$, over the system. We then find that $\gamma_s(x_L) = \gamma_s(x_R)$, i.e. the spin current components parallel to the magnetization in the left and right ferromagnets are the same. The charge current
$$I_c = \frac{e^2}{h}\sum_s N^F_s\gamma_s \tag{17}$$
divided by the chemical potential drop is the electrical conductance $G = I_c/\Delta\mu$. Equations (13),(15) then lead to
$$G = \frac{2e^2}{h}\sum_{S=1,4}\ \sum_{S'=1,4} N^F_S\left[\left(\check 1 - \check T_{L\to R} + \check R_{R\to R}\right)^{-1}\check T_{L\to R}\right]_{SS'}. \tag{18}$$
When the transparency is small, all transmission probabilities are close to zero, reflection probabilities are close to unity, and the Landauer-Büttiker conductance, starting point of [15], is recovered:
$$G \to \frac{e^2}{h}\sum_{S=1,4}\ \sum_{S'=1,4} N^F_S\left[\check T(\theta)\right]_{SS'}. \tag{19}$$
Indeed, in this limit the distributions to the left and right are not perturbed by the current, the nodes are genuine reservoirs, and standard scattering theory applies. Also, when θ = 0, π, Eq. (18) is equivalent to results by Schep et al. [17] for the two-channel model. The scattering region is still not specified and may be interacting and/or quantum coherent. We now discuss how analytical results can be obtained in the non-interacting, diffuse limit.
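The relation between Eqs. (18) and (19) can be made concrete in the collinear case θ = 0. The Python sketch below is only an illustration under stated assumptions: the matrices are taken to be diagonal in the two spin channels, each channel is described by a single average transmission probability $T_s$, and probability conservation $R_s = 1 - T_s$ is assumed with equal mode numbers on both sides. The function names and numbers are hypothetical. Under these assumptions Eq. (18) reduces per spin to $N_s T_s/(1 - T_s)$ in units of $e^2/h$, i.e. the Landauer-Büttiker resistance minus the Sharvin resistance $1/N_s$.

```python
def conductance_eq18_collinear(modes, transmissions):
    """Eq. (18) at theta = 0 for diagonal matrices, in units of e^2/h.

    Assumption: R_s = 1 - T_s for every spin channel s.
    """
    total = 0.0
    for n_modes, t in zip(modes, transmissions):
        r = 1.0 - t
        total += 2.0 * n_modes * t / (1.0 - t + r)
    return total

def conductance_eq19(modes, transmissions):
    """Landauer-Buttiker limit, Eq. (19), in units of e^2/h."""
    return sum(n_modes * t for n_modes, t in zip(modes, transmissions))

# Hypothetical mode numbers and transmission probabilities (spin up, spin down)
opaque_18 = conductance_eq18_collinear([10, 8], [0.01, 0.02])
opaque_19 = conductance_eq19([10, 8], [0.01, 0.02])
transparent_18 = conductance_eq18_collinear([10], [0.5])
transparent_19 = conductance_eq19([10], [0.5])
```

For the nearly opaque contact both expressions agree to within a few percent, whereas for T = 0.5 the two differ by a factor of two, which illustrates why the correction matters for transparent interfaces.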
4
Semiclassical Concatenation
The scattering matrix of a composite system can be formulated as concatenations of the scattering matrix from separate elements, e.g. the scattering
matrices of bulk layers and interfaces [22]. By assuming isotropy, i.e. sufficient disorder or chaotic scattering, Waintal et al. [15] proved by averaging over random scattering matrices that size quantization effects like the equilibrium exchange coupling or other phase coherent phenomena are destroyed by disorder and vanish like the inverse of the number of modes. Under these conditions we are free to define nodes in the interior of the device and link them via the boundary conditions (13). This is equivalent to composing the total transport probability matrices in Eq. (18) in terms of those of individual elements by semiclassical concatenation rules [23]. The 4 × 4 transmission probability matrix through a F(0)/N/F(θ) double heterojunction as in Fig. 1, in which bulk scattering is absent, takes the form
$$\check T(\theta) \equiv \check T_{N\to F}(\theta)\left[\check 1 - \check R_{N\to N}(0)\,\check R_{N\to N}(\theta)\right]^{-1}\check T_{F\to N}(0), \tag{20}$$
where the interface transmission and reflection matrices as a function of magnetization angle appear. The transformations needed to obtain $\check T_{N\to F}(\theta)$ and $\check R_{N\to N}(\theta)$ require some attention. In terms of the spin-rotation
$$\hat U = \begin{pmatrix}\cos\theta/2 & -\sin\theta/2 \\ \sin\theta/2 & \cos\theta/2\end{pmatrix} \tag{21}$$
and projection matrices (s = ±1)
$$\hat u_s(\theta) = \frac12\begin{pmatrix}1 + s\cos\theta & s\sin\theta \\ s\sin\theta & 1 - s\cos\theta\end{pmatrix}, \tag{22}$$
the interface scattering coefficients (omitting the mode indices for simplicity) are transformed as follows [6]
$$\hat r_{N\to N} = \hat U\,\hat r^{cN}\,\hat U^\dagger = \sum_s \hat u_s\,r^{cN}_s, \tag{23}$$
$$t^{F\to N}_{ss'} = U_{ss'}\,t^{cF}_{s'}, \tag{24}$$
$$t^{N\to F}_{ss'} = t^{cN}_s\,U^\dagger_{ss'}, \tag{25}$$
$$r^{F\to F}_{ss'} = r^{cF}_s\,\delta_{ss'}. \tag{26}$$
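The equality of the two forms in Eq. (23) can be verified numerically. The following Python sketch builds the rotation matrix of Eq. (21) and the projection matrices of Eq. (22) and compares $\hat U\hat r^{cN}\hat U^\dagger$ with $\sum_s\hat u_s r^{cN}_s$ for a diagonal reflection matrix. The helper names and the complex test values for $r^{cN}_\uparrow$ and $r^{cN}_\downarrow$ are assumptions made for illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rotation(theta):
    """Spin-rotation matrix U of Eq. (21)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[c, -s], [s, c]]

def projector(s, theta):
    """Projection matrix u_s of Eq. (22), with s = +1 or -1."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[0.5 * (1 + s * c), 0.5 * s * sn],
            [0.5 * s * sn, 0.5 * (1 - s * c)]]

def rotated_reflection(r_up, r_down, theta):
    """U r^c U^dagger; U is real, so U^dagger equals the transpose."""
    u = rotation(theta)
    diagonal = [[r_up, 0.0], [0.0, r_down]]
    return matmul(matmul(u, diagonal), transpose(u))

def projected_reflection(r_up, r_down, theta):
    """Sum over s of u_s r^c_s, the right hand side of Eq. (23)."""
    up, down = projector(+1, theta), projector(-1, theta)
    return [[r_up * up[i][j] + r_down * down[i][j] for j in range(2)]
            for i in range(2)]

# Hypothetical complex reflection coefficients and angle
lhs = rotated_reflection(0.3 + 0.1j, -0.5 + 0.2j, 0.7)
rhs = projected_reflection(0.3 + 0.1j, -0.5 + 0.2j, 0.7)
max_difference = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The same helpers also show that the $\hat u_s$ are idempotent and sum to the unit matrix, which is what makes them projectors onto the local spin directions.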
The superscript c indicates that the matrices should be evaluated in the reference frame of the local magnetization, and are thus diagonal in the absence of spin-flip relaxation scattering at the interfaces. Different transformation properties for the different elements of the scattering matrix derive from our choice to use local spin-coordinate systems that may differ for each magnet. Let us, for example, inspect a transmission matrix element from the normal metal into the ferromagnet with magnetization rotated by θ
$$\left[\check T_{N\to F}(\theta)\right]_{11} = \frac{1}{N^F_\uparrow}\sum_{mn} t^{R\to L}_{n\uparrow m\uparrow}\left(t^{R\to L}_{n\uparrow m\uparrow}\right)^\dagger = \frac12\left(1 + \cos\theta\right)\frac{1}{N^F_\uparrow}\sum_{mn}\left|t^{cN}_{n\uparrow m\uparrow}\right|^2 \tag{27}$$
and analogously for the other matrix elements as well as other matrices. Transport through a more complex system can be treated by repeated concatenation of two scattering elements in terms of reflection and transmission matrices analogous to Eq. (20). In the presence of significant bulk scattering, we can represent a disordered metal B with thickness $d_B$ by diagonal matrices like [17,15]
$$\left(\check T_B\right)_{SS'} = \left[1 + N^B_s\,\frac{e^2}{h}\,\frac{\rho^B_s d_B}{A_B}\right]^{-1}\delta_{SS'}, \tag{28}$$
where $\rho^B_s$ and $A_B$ are the single-spin bulk resistivities and the cross section of the bulk metal (normal or magnetic). The interface parameters of the present theory are the spin-dependent Landauer-Büttiker conductances
$$g_s = \sum_{lm}\left|t^{cN}_{lm,s}\right|^2 = N_N - \sum_{lm}\left|r^{cN}_{lm,s}\right|^2 = \sum_{lm}\left|t^{cF}_{lm,s}\right|^2 = N^F_s - \sum_{lm}\left|r^{cF}_{lm,s}\right|^2 \tag{29}$$
and the real and imaginary part of the spin-mixing conductance
$$g_{s,-s} = N_N - \sum_{lm} r^{cN}_{lm,s}\left(r^{cN}_{lm,-s}\right)^*,$$
which can also be represented in terms of the total conductance $g = g_\uparrow + g_\downarrow$, polarization $p = (g_\uparrow - g_\downarrow)/g$, and relative mixing conductance $\eta = 2g_{\downarrow\uparrow}/g$. The actual concatenation of the 4 × 4 matrices defined here is rather complicated even when using symbolic programming routines. This explains why in [15] analytic results were obtained only in special limiting cases. We found that final results are simple even in the most general cases, not only for Eq. (19) considered by [15], but also for Eq. (18). For the spin valves in Fig. 1, we find for the conductance as a function of angle
$$G(\theta) = \frac{\tilde g}{2}\left[1 - \frac{\tilde p^2}{1 + \dfrac{|\tilde\eta|^2}{\mathrm{Re}\,\tilde\eta}\,\dfrac{1 + \cos\theta}{1 - \cos\theta}}\right], \tag{30}$$
where $\tilde\eta = 2\tilde g_{\uparrow\downarrow}/\tilde g$ and
$$\frac{1}{\tilde g_s} = \frac{1}{g_s} + \frac{e^2}{h}\frac{\rho_{F,s}\,d_F}{2A_F} + \frac{e^2}{h}\frac{\rho_N d_N}{2A_F} - \frac12\left(\frac{1}{N^F_s} + \frac{1}{N_N}\right), \tag{31}$$
$$\frac{1}{\tilde g_{\uparrow\downarrow}} = \frac{1}{g_{\uparrow\downarrow}} + \frac{e^2}{h}\frac{\rho_N d_N}{2A_N} - \frac{1}{2N_N}. \tag{32}$$
Equation (30) is identical to the angular magnetoresistance derived by circuit theory [6] after replacement of $\tilde g_s$ and $\tilde g_{\uparrow\downarrow}$ by $g_s$ and $g_{\uparrow\downarrow}$. Physically, in Equations (31),(32) spurious Sharvin resistances are subtracted from the interface resistances obtained by scattering theory, whereas bulk resistances are added.
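To get a feeling for the angular dependence in Eq. (30), the following Python sketch evaluates G(θ) in units of $e^2/h$ for given renormalized parameters. It is a minimal illustration: the function name and the parameter values are assumptions, and the fraction in Eq. (30) is multiplied through by (1 − cos θ) so that the parallel configuration θ = 0 can be evaluated without a division by zero.

```python
import math

def angular_conductance(theta, g, p, eta):
    """Angular conductance of Eq. (30) in units of e^2/h.

    g   : total (renormalized) conductance
    p   : polarization
    eta : relative mixing conductance, real or complex
    """
    eta = complex(eta)
    mixing = abs(eta) ** 2 / eta.real
    c = math.cos(theta)
    # Same expression as Eq. (30), with numerator and denominator
    # of the inner fraction multiplied by (1 - cos(theta)).
    reduction = p ** 2 * (1.0 - c) / ((1.0 - c) + mixing * (1.0 + c))
    return 0.5 * g * (1.0 - reduction)

# Hypothetical parameter values
g_parallel = angular_conductance(0.0, 2.0, 0.6, 1.4)
g_antiparallel = angular_conductance(math.pi, 2.0, 0.6, 1.4)
g_intermediate = angular_conductance(0.5 * math.pi, 2.0, 0.6, 1.4)
```

In this sketch the parallel conductance equals g/2 and the antiparallel one equals (g/2)(1 − p²), independent of the mixing conductance, which only shapes the curve at intermediate angles.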
These corrections are large for transparent interfaces and essential to obtain agreement between transport experiments on CPP (current perpendicular to plane) multilayers [25,26,27] and first-principles calculations, for conventional [17,28,29] as well as mixing conductances [7]. The mixing conductance parameterizes the magnetization torque due to a spin accumulation in the normal metal, governed by the reflection of electrons from the normal metal. It is therefore natural that the mixing conductance is reduced by the bulk resistance of the normal metal, and we can also understand that only the normal metal Sharvin resistance has to be subtracted. The real part of the mixing conductance is often close to the number of modes in the normal metal, $g_{\uparrow\downarrow} \approx N_N$, in which case $\tilde g_{\uparrow\downarrow} \approx N_N/2$ [30]. By letting $N^F_s \to \infty$ we are in the regime of [15]. The circuit theory is recovered when, additionally, $N_N \to \infty$. The bare mixing conductance is bounded not only from below, $\mathrm{Re}\,g_{\uparrow\downarrow} \ge g/2$ [6], but also from above, $|g_{\uparrow\downarrow}|^2/\mathrm{Re}\,g_{\uparrow\downarrow} \le 2N_N$.
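The subtraction of Sharvin resistances can be illustrated with the scalar analogue of the concatenation rule, Eq. (20). The Python sketch below is a simplified single-channel picture under the assumption of probability conservation, R = 1 − T, for each element; the function names and numbers are hypothetical. In this picture the quantity 1/T − 1, which is proportional to the resistance with the Sharvin contribution removed, is additive for elements in series.

```python
def concatenate(t1, t2):
    """Semiclassical transmission probability of two scatterers in series.

    Scalar analogue of Eq. (20): T = T2 * (1 - R1 * R2)^(-1) * T1,
    assuming R = 1 - T for each element.
    """
    r1, r2 = 1.0 - t1, 1.0 - t2
    return t2 * t1 / (1.0 - r1 * r2)

def sharvin_corrected_resistance(n_modes, t):
    """Resistance 1/(N T) minus the Sharvin resistance 1/N, in units of h/e^2."""
    return 1.0 / (n_modes * t) - 1.0 / n_modes

# Hypothetical transmission probabilities of two elements with N = 10 modes
t_total = concatenate(0.3, 0.6)
r_total = sharvin_corrected_resistance(10, t_total)
r_sum = sharvin_corrected_resistance(10, 0.3) + sharvin_corrected_resistance(10, 0.6)
```

The agreement of `r_total` and `r_sum` is the scalar version of the statement that Sharvin-corrected resistances, rather than bare Landauer-Büttiker resistances, add like classical series resistors.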
5
Extended Circuit Theory
It is not obvious how these results should be generalized to more complicated circuits and devices as well as to the presence of spin-flip scattering in the normal metal. The magnetoelectronic circuit theory [6] does not suffer from these drawbacks. Originally, it was assumed in [6] that local spin and charge currents through the contacts only depend on the generalized potential differences, and the local node chemical potentials are obtained by a spin-generalization of the Kirchhoff laws of electrical circuits. This is valid only for highly resistive contacts, such that the in and outgoing currents do not significantly disturb the quasi-equilibrium distribution of the nodes. Fortunately we are able to relax this limitation and take into account a drift term in the nodes as well. In order to demonstrate this, we construct the fictitious circuit depicted in Fig. 2. Consider a junction that in conventional circuit theory is characterized by a matrix conductance gˆ, leading to a matrix current ˆı when the normal and ferromagnetic distributions fˆL and fˆR are not equal. When the distributions of the nodes are isotropic, we know from circuit theory that
$$\hat\imath = \sum_{ss'}\left(\hat g\right)_{ss'}\hat u_s\left(\hat f_L - \hat f_R\right)\hat u_{s'}, \tag{33}$$
Fig. 2. Fictitious device that illustrates the generalization of circuit theory to transparent resistive elements as discussed in the text
where the projection matrices $\hat u_s$ are defined in Eq. (22) and $(\hat g)_{ss} = g_s$, $(\hat g)_{s,-s} = g_{s,-s}$. Introducing lead conductances, which modify the distributions $\hat f_L \to \hat f_1$ and $\hat f_2 \leftarrow \hat f_R$, respectively, we may define a (renormalized) conductance matrix $\hat{\tilde g}$, which causes an identical current $\hat\imath$ for the reduced (matrix) potential drop:
$$\hat\imath = \sum_{ss'}\left(\hat{\tilde g}\right)_{ss'}\hat u_s\left(\hat f_1 - \hat f_2\right)\hat u_{s'}. \tag{34}$$
When the lead conductances are now chosen to be twice the Sharvin conductances, and using (matrix) current conservation
$$\hat\imath = 2N_N\left(\hat f_L - \hat f_1\right) \tag{35}$$
$$\phantom{\hat\imath} = \sum_s 2N^F_s\,\hat u_s\left(\hat f_2 - \hat f_R\right)\hat u_s, \tag{36}$$
straightforward matrix algebra leads to the result that the elements of $\hat{\tilde g}$ are identical to the renormalized interface conductances found above [Equations (31),(32) without the bulk resistivities]. By replacing $\hat g$ by $\hat{\tilde g}$ we not only recover the above results for the spin valve, but we can now use the renormalized parameters also for circuits with arbitrary complexity and transparency of the contacts. Spin-flip scattering in N can also be included [6]. It does not affect the form of Eq. (37) either, but only reduces the parameter $\tilde\chi$. Other effects of the spin-flip scattering are discussed in detail in [24].
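For a single collinear spin channel the construction of Fig. 2 reduces to three conductances in series, which can be sketched in a few lines of Python. This scalar version is an assumption made for illustration (the text treats the full matrix problem); the function names and numbers are hypothetical. The lead conductances are taken as twice the Sharvin conductances, $2N_N$ and $2N^F_s$, as stated above.

```python
def renormalized_conductance(g, n_f, n_n):
    """Scalar analogue of Eq. (31) without the bulk resistivities.

    1/g_tilde = 1/g - (1/2) * (1/n_f + 1/n_n), conductances in units of e^2/h.
    """
    inverse = 1.0 / g - 0.5 * (1.0 / n_f + 1.0 / n_n)
    if inverse <= 0.0:
        raise ValueError("contact is fully transparent: g_tilde diverges")
    return 1.0 / inverse

def series_current(g_tilde, n_f, n_n, potential_drop):
    """Current through lead (2 n_n), contact (g_tilde) and lead (2 n_f) in series."""
    total_resistance = 1.0 / (2.0 * n_n) + 1.0 / g_tilde + 1.0 / (2.0 * n_f)
    return potential_drop / total_resistance

# Hypothetical bare conductance and mode numbers
g_bare = 4.0
g_tilde = renormalized_conductance(g_bare, 10, 12)
current = series_current(g_tilde, 10, 12, 1.0)
```

The current through the fictitious series circuit equals the bare conductance times the total drop, so the renormalized conductance with the two leads reproduces the original scattering-theory result.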
6
Applications
Intermetallic interfaces in a diffuse environment (see Fig. 1c) have been studied thoroughly in perpendicular (CPP) spin valves [25,26,27]. These experiments provided a large body of evidence for the two-channel (i.e. spin-up and spin-down) series resistor model and a wealth of accurate transport parameters such as the spin-dependent interface resistances for various material combinations [17,27,28,29]. In exchange-biased spin valves, it is possible to measure the electric resistance as a function of the angle between magnetizations, which has been analyzed experimentally and theoretically [31,32]. Pratt and coworkers observed that experimental magnetoresistance curves [33] could accurately be fitted by the form [6]
$$\frac{R(\theta) - R(0)}{R(\pi) - R(0)} = \frac{1 - \cos\theta}{\chi\left(1 + \cos\theta\right) + 2}. \tag{37}$$
According to the new insights described above, the free parameter χ is a function of renormalized microscopic parameters
$$\chi = \frac{1}{1 - \tilde p^2}\,\frac{|\tilde\eta|^2}{\mathrm{Re}\,\tilde\eta} - 1 \tag{38}$$
in terms of the relative mixing conductance $\tilde\eta = 2\tilde g_{\uparrow\downarrow}/\tilde g$, the polarization $\tilde p = (\tilde g_\uparrow - \tilde g_\downarrow)/\tilde g$, and the total conductance $\tilde g = \tilde g_\uparrow + \tilde g_\downarrow$. Experimental values for the parameters for Cu/Permalloy (Py) spin valves are $\tilde\chi = 1.2$ and $\tilde p = 0.6$ [33]. Disregarding a very small imaginary component of the mixing conductance [7], and using the known values for the bulk resistivities, the theoretical Sharvin conductance for Cu ($0.55\cdot10^{15}\ \Omega^{-1}\mathrm{m}^{-2}$/spin [17]), and the spin-flip length of Py as the effective thickness of the ferromagnet ($\ell^F_{sd} = 5$ nm [25]), we arrive at the bare Cu/Py interface mixing conductance $G_{\uparrow\downarrow} = 0.39(3)\cdot10^{15}\ \Omega^{-1}\mathrm{m}^{-2}$. This value may be compared with the calculated mixing conductance for a disordered Co/Cu interface ($0.55\cdot10^{15}\ \Omega^{-1}\mathrm{m}^{-2}$ [7]). The agreement is reasonable, but leaves some room for material and device dependence that deserves to be investigated in the future. The mixing conductance can also be determined from the excess broadening of ferromagnetic resonance spectra. A larger mixing conductance in Pt/Py can be explained by the larger density of conduction electrons in Pt compared to Cu [30]. Reasonable agreement between experiment and theory has also been found by Zwierzycki et al. [34] for Fe/Au multilayers. The spin torque on a ferromagnet [2,15] equals the spin current through the interface with vector component normal to the magnetization direction, and its evaluation is closely related to the charge conductance [6,15]. An analytical expression for the spin valve reads:
$$L(\theta) = -\frac{\tilde g\,\tilde p\,\dfrac{|\tilde\eta|^2}{\mathrm{Re}\,\tilde\eta}\,\sin\theta}{1 - \cos\theta + \dfrac{|\tilde\eta|^2}{\mathrm{Re}\,\tilde\eta}\left(1 + \cos\theta\right)}\,\frac{\Delta\mu}{8\pi}. \tag{39}$$
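The fitting form of Eq. (37) and the relation of Eq. (38) are easy to explore numerically. The Python sketch below is an illustration with hypothetical function names; it evaluates the normalized angular magnetoresistance and inverts Eq. (38) to obtain the combination $|\tilde\eta|^2/\mathrm{Re}\,\tilde\eta$ from the quoted experimental values χ = 1.2 and p̃ = 0.6. Converting this dimensionless number into the bare mixing conductance quoted in the text requires the bulk and Sharvin corrections discussed above, which are not reproduced here.

```python
import math

def chi_parameter(p, eta):
    """Fit parameter chi of Eq. (38); eta may be real or complex."""
    eta = complex(eta)
    return abs(eta) ** 2 / eta.real / (1.0 - p ** 2) - 1.0

def normalized_magnetoresistance(theta, chi):
    """(R(theta) - R(0)) / (R(pi) - R(0)) according to Eq. (37)."""
    c = math.cos(theta)
    return (1.0 - c) / (chi * (1.0 + c) + 2.0)

def mixing_ratio_from_chi(chi, p):
    """Invert Eq. (38) for |eta|^2 / Re(eta)."""
    return (chi + 1.0) * (1.0 - p ** 2)

# Values quoted in the text for Cu/Py spin valves
mixing_ratio = mixing_ratio_from_chi(1.2, 0.6)
mr_half_way = normalized_magnetoresistance(0.5 * math.pi, 1.2)
```

For a real mixing conductance the ratio equals η̃ itself, and at θ = π/2 the normalized magnetoresistance is 1/(χ + 2), which shows how χ controls the deviation from a simple (1 − cos θ)/2 dependence.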
Note that here the imaginary part of the mixing conductance is taken into account explicitly, but the torque remains coplanar to the magnetization of the contacts, i.e. an out-of-plane “effective” field vanishes identically. Previous results [2,15] are recovered in the limit that $\tilde\eta \to 2$ and $\tilde p \to 1$. By the generalized circuit theory it is straightforward to compute the torque on the base contact of the spin-flip transistor with antiparallel source-drain magnetizations [7]. Let us assume the three contacts to be identical, and the base contact magnetization lies in the plane of the source and drain magnetizations. If one may also neglect spin-flip scattering in the base contact, the in-plane torque $L_b$ turns out to be always larger than the spin valve torque L in the two-terminal spin valve, Eq. (39), with a symmetric and flatter dependence on the angle of the base magnetization direction θ (Fig. 3)
$$L_b(\theta) = -\frac{\tilde g\,\tilde p\,\mathrm{Re}\,\tilde\eta\,\sin\theta}{\left(1 - \mathrm{Re}\,\tilde\eta\right)\cos^2\theta + \mathrm{Re}\,\tilde\eta + 6/\left(2 + |\eta|^2/(\mathrm{Re}\,\tilde\eta)^2\right)}\,\frac{\Delta\mu}{4\pi}. \tag{40}$$
In the presence of a significant imaginary part of the mixing conductance, we also find an out-of-plane (effective field) torque $L_\perp(\theta)$ with the same angular dependence and
$$\frac{L_\perp}{L_b} = -2\,\frac{\mathrm{Im}\,\tilde\eta\ \mathrm{Re}\,\tilde\eta}{|\eta|^2 + 2\,\mathrm{Re}\,\tilde\eta}. \tag{41}$$
Fig. 3. The spin-accumulation induced magnetization torque for a two-terminal spin valve and a three-terminal spin-flip transistor. ∆µ is the source-drain bias and all contact parameters are taken to be the same, with Re η = 2 and Im η = 0
Stiles and Zangwill [35] directly solved the Boltzmann equation for spin valves to obtain angular magnetoresistance and spin torque, approximating the mixing conductance by the number of modes (note that in a direct solution of the Boltzmann equation this parameter should not be renormalized). The numerical results agree well with the functional form (37) (M.D. Stiles, private communication). This function has also been derived by Slonczewski [36] and by Shpiro et al. [37]. Slonczewski rederived the result with a simple circuit theory similar to that of [6] and also pointed out the relation between the angular magnetoresistance and the spin torque. Shpiro et al. [37] found (37) to be valid in the limit of vanishing exchange splitting, thus in a regime different from the transition metal ferromagnets considered here [21].
7
Conclusions
We reported analytical results for the angular magnetoresistance of arbitrary spin valves, which, by comparison with experiments [33], leads to a value for the mixing conductance and spin torque for the Cu/Py interface of $G_{\uparrow\downarrow} = 0.39(3)\cdot10^{15}\ \Omega^{-1}\mathrm{m}^{-2}$. The associated generalization of magnetoelectronic circuit theory opens the way to engineer materials and device configurations to optimize switching properties of magnetic random access memories. Mixing conductances determined by experiments or first principles theory are transferable to arbitrary devices and may be used for static as well as dynamic transport properties. The spin-dependent interface resistances determined by CPP-GMR transport experiments have played an important role in understanding “digital magnetoelectronics”. We hope that the spin-mixing conductances will play a comparable role in “analogue magnetoelectronics”.
Acknowledgements We profited from discussions with Bart van Wees, Paul Kelly, Alex Kovalev, and Yuli Nazarov, and have been supported by FOM, NSF Grant DMR 02-33773 and the NEDO joint research program “Nano-Scale Magnetoelectronics”.
References
1. S. Maekawa and T. Shinjo (eds.), Spin Dependent Transport in Magnetic Nanostructures (Taylor and Francis, London, 2002).
2. J.C. Slonczewski, J. Magn. Magn. Mater. 159, L1 (1996).
3. L. Berger, Phys. Rev. B 54, 9353 (1996).
4. M. Tsoi, A.G.M. Jansen, J. Bass, W.-C. Chiang, M. Seck, V. Tsoi, and P. Wyder, Phys. Rev. Lett. 80, 4281 (1998); J.-E. Wegrowe, D. Kelly, Y. Jaccard, P. Guittienne, and J.-P. Ansermet, Europhys. Lett. 45, 626 (1999); J.Z. Sun, J. Magn. Magn. Mater. 202, 157 (1999); E.B. Myers, D.C. Ralph, J.A. Katine, R.N. Louie, and R.A. Buhrman, Science 285, 867 (1999); J.A. Katine, F.J. Albert, R.A. Buhrman, E.B. Myers, and D.C. Ralph, Phys. Rev. Lett. 84, 3149 (2000); J. Grollier, V. Cros, A. Hamzic, J.M. George, H. Jaffrès, A. Fert, G. Faini, J. Ben Youssef, and H. Legall, Appl. Phys. Lett. 78, 3663 (2001); B. Oezyilmaz, A.D. Kent, D. Monsma, J.Z. Sun, M.J. Rooks, and R.H. Koch, cond-mat/0301324; S. Urazhdin, N.O. Birge, W.P. Pratt Jr., and J. Bass, cond-mat/0303149.
5. K. Inomata, IEICE Transactions on Electronics E84-C, 740 (2001); J.C. Slonczewski, cond-mat/0205055.
6. A. Brataas, Yu.V. Nazarov, and G.E.W. Bauer, Phys. Rev. Lett. 84, 2481 (2000); Eur. Phys. J. B 22, 99 (2001).
7. K. Xia, P.J. Kelly, G.E.W. Bauer, A. Brataas, and I. Turek, Phys. Rev. B 65, 220401 (2002).
8. G.E.W. Bauer, A. Brataas, Y. Tserkovnyak, and B.L. van Wees, Appl. Phys. Lett., in press.
9. D. Huertas-Hernando, Yu.V. Nazarov, A. Brataas, and G.E.W. Bauer, Phys. Rev. B 62, 5700 (2000).
10. Y. Tserkovnyak, A. Brataas, and G.E.W. Bauer, Phys. Rev. Lett. 88, 117601 (2002).
11. A. Brataas, Y. Tserkovnyak, G.E.W. Bauer, and B. Halperin, Phys. Rev. B 66, 060404 (2002).
12. The quantum size effects observed by S. Yuasa, T. Nagahama, and Y. Suzuki, Science 297, 234 (2002) are an exception, because the high-quality tunnel barriers focus conductance electrons to essentially a single wave vector, so the Fermi surface averaging that smears out residual quantum effects in metallic systems is not effective.
13. J. Rammer and H. Smith, Rev. Mod. Phys. 58, 323 (1986).
14. C.W.J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).
15. X. Waintal, E.B. Myers, P.W. Brouwer, and D.C. Ralph, Phys. Rev. B 62, 12317 (2000).
16. G.E.W. Bauer, Y. Tserkovnyak, D. Huertas-Hernando, and A. Brataas, Phys. Rev. B 67, 094421 (2003), cond-mat/0205453.
17. K.M. Schep, J.B.A.N. van Hoof, P.J. Kelly, G.E.W. Bauer, and J.E. Inglesfield, Phys. Rev. B 56, 10805 (1997); G.E.W. Bauer, K.M. Schep, P.J. Kelly, and K. Xia, J. Phys. D: Appl. Phys. 35, 2410 (2002).
18. G. Tatara and H. Fukuyama, Phys. Rev. Lett. 78, 3773 (1997).
19. T. Valet and A. Fert, Phys. Rev. B 48, 7099 (1993).
20. A. Filip, Ph.D. Thesis, University of Groningen, 2002.
21. M.D. Stiles and A. Zangwill, Phys. Rev. B 66, 014407 (2002).
22. S. Datta, Quantum Phenomena in Semiconductor Nanostructures (Addison Wesley, 1989).
23. B. Shapiro, Phys. Rev. B 35, 8256 (1987); M. Cahay, M. McLennan, and S. Datta, Phys. Rev. B 37, 10125 (1988); A. Brataas and G.E.W. Bauer, Phys. Rev. B 49, 14684 (1994).
24. A.A. Kovalev, A. Brataas, and G.E.W. Bauer, Phys. Rev. B 66, 224424 (2002).
25. W.P. Pratt, Jr., S.-F. Lee, J.M. Slaughter, R. Loloee, P.A. Schroeder, and J. Bass, Phys. Rev. Lett. 66, 3060 (1991); J. Bass and W.P. Pratt, J. Magn. Magn. Mater. 200, 274 (1999).
26. M.A.M. Gijs, S.K.J. Lenczowski, and J.B. Giesbers, Phys. Rev. Lett. 70, 3343 (1993).
27. For recent reviews see: M.A.M. Gijs and G.E.W. Bauer, Advances in Physics 46, 285 (1997); J.-P. Ansermet, J. Phys.-Cond. Mat. 10, 6027 (1998); A. Barthélémy, A. Fert and F. Petroff, in Handbook of Magnetic Materials, Vol. 12, edited by K.H.J. Buschow (1999); E. Tsymbal and D.G. Pettifor, Sol. State Phys. 56, 113 (2001).
28. M.D. Stiles and D.R. Penn, Phys. Rev. B 61, 3200 (2000).
29. K. Xia, P.J. Kelly, G.E.W. Bauer, I. Turek, J. Kudrnovský, and V. Drchal, Phys. Rev. B 63, 064407 (2001).
30. Y. Tserkovnyak, A. Brataas, and G.E.W. Bauer, Phys. Rev. B 66, 224403 (2002).
31. P. Dauguet, P. Gandit, J. Chaussy, S.F. Lee, A. Fert, and P. Holody, Phys. Rev. B 54, 1083 (1996).
32. A. Vedyayev, N. Ryzhanova, B. Dieny, P. Dauguet, P. Gandit, and J. Chaussy, Phys. Rev. B 55, 3728 (1997).
33. L. Giacomoni, B. Dieny, W.P. Pratt, Jr., R. Loloee and M. Tsoi, to be published.
34. M. Zwierzycki et al., unpublished.
35. M.D. Stiles and A. Zangwill, J. Appl. Phys. 91, 6812 (2002).
36. J.C. Slonczewski, J. Magn. Magn. Mater. 247, 324 (2002).
37. A. Shpiro, P.M. Levy, and S. Zhang, cond-mat/0212045.
New Developments with Magnetic Tunnel Junctions Hubert Brückl, Andy Thomas, Jörg Schotter, Jan Bornemeier, and Günter Reiss University of Bielefeld, Nano Device Group 33615 Bielefeld, Germany
Abstract. Besides sensors, magnetic random access memory (MRAM) and hard disk read heads, many more possible applications are still ahead for the new magnetoelectronic effects of magnetic tunnel junctions (MTJs), e.g. in logic or biochips. As the technology has improved in recent years, the reliable fabrication of MTJs with magnetoresistance ratios of up to 50% at room temperature has made it possible to extend the efforts towards new developments. Thus, MTJs with two or more barriers in the film stack enable both the realization of new devices like multivalent logic and the improvement of the bias stability and magnetoresistance ratio at larger voltages. The influence of ballistic effects, resonant states and spin accumulation is experimentally investigated and estimated by numerical solution of the Schrödinger equation for double barrier systems. Besides new applications in logic devices, MTJs are promising sensors for magnetoresistive biochips, which are capable of detecting even single molecules like DNA by functionalized magnetic markers. The principle and design of magnetoresistive biochips are presented, and their performance is compared to the established fluorescent method.
1
Introduction
Besides the applications in MRAM and hard disk read heads, which are anticipated in the years 2003 and 2004 respectively, magnetic tunnel junctions (MTJs) offer much more potential due to their unique properties like nonvolatility, low power consumption, VLSI compatibility and high sensitivity as magnetic sensors. Two further developments are reported in Chap. 2 and 3, i.e. the possible future applications in logic and biochips. First of all, a short introduction is given to some key features of a usual MTJ with only one single barrier. On this basis, the physics and possible applications of double barrier junctions are described in Chap. 2.
2
Single Barrier Magnetic Tunnel Junctions
The spin dependent tunneling between two ferromagnetic electrodes through a thin insulating barrier was discovered by Julliere [1] in the system Fe/Ge/Co at a low temperature of 4.2 K as early as 1975. However, it took until 1995 for Moodera et al. [2] and Miyazaki and Tezuka [3] to show independently a relative resistance increase of more than 10% at room temperature. The 20 year long delay can be explained mainly by the use of unsuitable materials for the oxide barrier like NiO or Gd2O3, which probably have a high affinity to spin flip scattering. This late breakthrough attracted great attention and triggered a large number of worldwide activities on this topic. In the meantime, the fabrication method, film growth and material choice have been refined, which has progressively improved the tunneling magnetoresistance (TMR) amplitude to around 50% at room temperature.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 397–412, 2003. © Springer-Verlag Berlin Heidelberg 2003
2.1
Stack Deposition and Composition
An optimized layer stack is shown as a typical example in detail in Fig. 1 [4]. It is prepared without any vacuum break in a six target sputtering chamber connected to an oxidation module, which is essential to avoid contamination and defects in the stack. The complete layer system is deposited by magnetron sputtering on clean, thermally oxidized Si wafers. The first Cu layer serves as an electrical connector to the lower electrode, which consists of an Ir17Mn83 antiferromagnetic and a Co70Fe30 ferromagnetic layer with a high spin polarization of about 50% [5]. The subsequent oxidation of the Al layer with a remote ECR (electron cyclotron resonance) plasma source is optimized for large TMR amplitudes. Next, the soft-magnetic layer, either Ni80Fe20 (Permalloy) or Co70Fe30, is deposited on top of the barrier. The subsequent Ta layer prevents interdiffusion of the final Cu contact layer. After deposition, the hard layer magnetization is aligned unidirectionally by activating the exchange bias in a short annealing step at 250 °C for 2 minutes, whereby the IrMn layer becomes paramagnetic above its Néel temperature. While the CoFe magnetization is aligned homogeneously along an applied external field, the direct exchange “freezes” the IrMn interface spins along this direction during cooling down. The strength of this coupling (typically several 10 kA/m) can be explained by a model of uncompensated spins, in which only a small fraction of the interface spins contributes to the pinning [6]. Alternatively, if an annealing step is unwanted, the application of a magnetic field during deposition also induces a unidirectional pinning in the initial film growth stage. The pinning is usually somewhat smaller than with field annealing. Incomplete alignment results in reduced TMR amplitudes and complicated magnetic switching [7,8]. Finally, a 30 nm thick Au layer is sputtered on top to guarantee a good contact for bonding or direct 2- and 4-point measurements, before the whole stack is patterned either by optical lithography for larger areas or by electron beam lithography for MTJs of sub-µm dimensions (down to 50 nm) [9,10].
Fig. 1. Schematic drawing of a typical layer stack with Cu conducting layers, Ta diffusion barrier, an exchange biased CoFe hardmagnetic electrode and a NiFe softmagnetic electrode. The perfectly smooth Al oxide barrier layer is imaged by cross-sectional transmission electron microscopy. Layer labels in the drawing: Cu 30 nm, Ta 5 nm, Ni80Fe20 4 nm, AlOx 1.8 nm, Co70Fe30 3 nm, Ir17Mn83 12 nm, NiFe 4 nm, Cu 30 nm. Scale: 2 nm
2.2
Spin Tunneling and Polarization
Quantum mechanical tunneling of particles through a barrier cannot be understood within classical physics. Tunneling is treated comprehensively in the books of Duke [11] and Wolf [12]. These early works address the single-particle case without considering spin. To understand the spin-dependent tunneling of electrons between two ferromagnetic electrodes through an insulator, it is helpful to look more closely at the band structure of the ferromagnetic metals involved. Spin conservation and the Pauli principle govern the behavior of the tunneling electrons: they may occupy only final states of the same spin symmetry. Majority electrons occupy majority states (minority states) in the counter electrode for parallel (antiparallel) orientation of the magnetizations. Thus, a general expression for the tunnel current [13] can be extended by taking the individual spin channels σ into account:

$$ j_\sigma(E) \propto \sum_{\sigma'} N_\sigma^{\mathrm{hard}}(E)\, N_{\sigma'}^{\mathrm{soft}}(E)\, T_{\sigma\sigma'}(E). \qquad (1) $$

The total current is obtained by summing over the two spin channels, $j(E) = \sum_\sigma j_\sigma(E)$, and integrating over the energy. The tunnel current is essentially determined by the transmission probability Tσσ′(E) through the barrier and the spin-resolved density of states (DOS) Nσ(E) of the metallic electrodes, both of which depend on energy. If the magnetizations of identical electrodes are aligned parallel, the electrons of both spin orientations find empty states in the counter electrode. For antiparallel alignment, however, empty states are not available for part of the majority electrons because the spin is conserved. This alone leads to a smaller tunnel current in the antiparallel state. The ratio of majority and minority DOS is decisive for the current difference between the two opposite magnetization states. Additionally, the tunnel probability differs for parallel and antiparallel alignment (T↑↓ < T↑↑). As a measure of the signal amplitude on switching, i.e. the difference in conductivity between the parallel and
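antiparallel states, the optimistic definition¹ of the TMR amplitude is given here by

$$ \mathrm{TMR} = \frac{j_{\uparrow\uparrow} - j_{\uparrow\downarrow}}{j_{\uparrow\downarrow}} = \frac{R_{\uparrow\downarrow} - R_{\uparrow\uparrow}}{R_{\uparrow\uparrow}}. \qquad (2) $$

Limiting to small bias voltages (E ≈ EF) and assuming equal transmission probabilities Tσσ′(E), a simple relationship between the TMR amplitude and the spin polarization P can be deduced from Eq. (1) at the Fermi level:

$$ \mathrm{TMR} = \frac{2 P^{\mathrm{hard}} P^{\mathrm{soft}}}{1 - P^{\mathrm{hard}} P^{\mathrm{soft}}}. \qquad (3) $$

The spin polarization is defined by the relative DOS at the Fermi level EF:

$$ P = \frac{N_\uparrow(E_F) - N_\downarrow(E_F)}{N_\uparrow(E_F) + N_\downarrow(E_F)}. \qquad (4) $$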
Equation (3) was already given by Julliere [1], who deduced it from a simple parallel resistor model of the two spin channels. A maximum spin polarization of 100%, as postulated for ferromagnetic half metals (e.g. Fe3O4), would give an infinite TMR according to Eq. (3), because the tunnel current in the antiparallel state would be zero. The spin polarization P defined in Eq. (4) strongly influences the TMR amplitude, as can already be seen in the free-electron approach of Eq. (3). The nonlinear dependence of the TMR amplitude on P is shown in Fig. 2 for two cases: equal polarization on both sides of the barrier, and a fixed polarization on one side combined with a high-P material on the other. Ferromagnetic half metals like Heusler alloys with a predicted P near 100% would have TMR amplitudes of over 1000% or 100% for the respective cases. The experimental determination of P in a TMR measurement requires careful preparation and clean experiments, because both the barrier properties and the electronic interface states can have a decisive influence on the total TMR signal. The reported values of P have risen steadily as the preparation techniques have improved. Table 1 gives an overview of the current maximum P values for a few interesting and important materials. As stated above, the polarization can be compared among different materials only if the transmission probability is constant. This is ensured for the optimized MTJs of this work, since the layer stack of the bottom electrode and the Al oxide preparation are held constant and give reproducible results (cf. Sect. 2). The crystallographic orientation, and hence the corresponding wave vectors, strongly influence the spin polarization. For example, P of Fe increases when the crystallographic orientation is changed from (100) to (110) and (211) (Table 1).

¹ The pessimistic version divides by the larger resistance and therefore yields a smaller signal ratio. This definition is often found in theoretical papers. Especially for imperfect MTJs, the antiparallel state R↑↓ is sometimes not well defined.

[Figure: two panels, TMR amplitude vs. spin polarization P]
Fig. 2. Left: The maximum TMR amplitude resulting from different spin polarizations in the case of equal (P1 = P2) and different (P2 = 0,5 = const) electrodes. Right: TMR amplitude vs. P without any crystalline texture (polycrystalline) and with texture and a 10% area of unpolarized grain boundaries in comparison to an epitaxial system with P = 50%

Table 1. Overview of the spin polarization for different materials at low temperatures

Material              Barrier    P            Source
Ni (poly)             Al oxide   28% / 33%    a
Co (poly)             Al oxide   43% / 45%    a
Fe (poly)             Al oxide   38% / 44%    a
Fe(100)               Al oxide   4%           b
Fe(110)               Al oxide   37%          b
Fe(211)               Al oxide   41%          b
Ni80Fe20 (poly)       Al oxide   53% / 48%    a
Co50Fe50 (poly)       Al oxide   48%          a
Co70Fe30 (poly)       Al oxide   50%
Co84Fe16 (poly)       Al oxide   49%          a
SrRuO3                SrTiO3     -9,5%        c
La0,67Sr0,33MnO3      SrTiO3     78%          d

a from [14], b from [15], c from [16], d from [17]

A plausible explanation for the different and, over time, increasing P values in polycrystalline samples is, besides the barrier improvement, the introduction of a vertical crystallographic texture. P is then weighted by the fraction of equally oriented crystallites. The influence can be estimated with a parallel resistor model for the conductance channels of differently oriented grains. Thus, an untextured fraction of only 10%, due e.g. to grain boundaries, which is the case at realistic crystallite sizes of
10 nm, already reduces P by several percent. This is illustrated in the right graph of Fig. 2 for a fixed spin polarization of P1 = 50% on one side of the barrier. The polarization P2 on the other side of the barrier is varied from zero to 100%. As can be read from the graph, a polycrystalline film with a smaller P2 can markedly reduce the TMR amplitude if crystal orientations with a small P contribute in equal amounts, as is assumed here.
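The relations of Eqs. (3) and (4) are easy to evaluate numerically. The following minimal sketch, with function names chosen here purely for illustration, computes the optimistic TMR of Eq. (3), the pessimistic variant of footnote 1 (obtained by dividing the same resistance difference by R↑↓), and a short sweep for the two cases of Fig. 2 (left). Polarizations are assumed to be given as fractions, not percent.

```python
import math


def spin_polarization(n_up, n_down):
    """Eq. (4): P from the spin-resolved DOS at the Fermi level."""
    return (n_up - n_down) / (n_up + n_down)


def tmr_optimistic(p_hard, p_soft):
    """Eq. (3): (R_ap - R_p) / R_p, polarizations given as fractions."""
    product = p_hard * p_soft
    if product >= 1.0:
        return math.inf  # two ideal half metals: no antiparallel current
    return 2.0 * product / (1.0 - product)


def tmr_pessimistic(p_hard, p_soft):
    """Footnote 1: same difference, divided by the larger resistance R_ap."""
    product = p_hard * p_soft
    return 2.0 * product / (1.0 + product)


# Illustrative sweep: equal electrodes vs. one electrode fixed at P2 = 0.5.
for p in (0.3, 0.5, 0.7, 0.9):
    equal = tmr_optimistic(p, p)
    mixed = tmr_optimistic(p, 0.5)
    print(f"P = {p:.1f}: equal electrodes {equal:7.1%}, P2 = 0.5 fixed {mixed:7.1%}")
```

For P = 0,5 on both sides the optimistic definition gives 2/3 (about 67%), while the pessimistic one gives 40%, which illustrates how strongly the quoted amplitude depends on the chosen normalization.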
3 Double Barrier Magnetic Tunnel Junctions
The development of double barrier magnetic tunnel junctions (DBMTJs), which consist of two stacked MTJs, is interesting from both the physics and the application point of view [18]. On the one hand, ballistic effects [19] and resonant states [20,21] are predicted, leading to higher magnetoresistance values. On the other hand, it is possible to build multi-valued logic and to improve the bias voltage dependence in applications [23].

3.1 Experiment
DBMTJs are based on single barrier MTJ stacks. In addition, a NiFe middle electrode of variable thickness is inserted, followed by a second tunnel barrier (Al 1,4 nm + oxidation) and a 6 nm thick CoFe top electrode, which is exchange biased by 11 nm thick IrMn. Owing to the initial Volmer-Weber growth, the superparamagnetic limit of the middle NiFe layer permits comparable results only at thicknesses larger than 2 nm.

3.2 Logic with Multi-Valued States
Because DBMTJs need two identical tunnel barriers, reproducibility is important in preparing the TMR layer stacks. It is ensured here, since single MTJs show only a very small standard deviation of the mean TMR amplitude: 46,3 ± 1,7% at room temperature and 73,0 ± 2,7% at 10 K. Under this condition, three symmetric switching states occur in DBMTJs (Fig. 3) with a TMR amplitude of 38% at room temperature (57% at 10 K). By suitably shifting the hysteresis, the three different signal levels can be set independently at zero field. This makes it possible to store more than two states in a single elementary cell. The concept can be extended to four or more states.

[Figure: area resistance (MΩµm²) vs. field (kA/m), from -80 to 80 kA/m]
Fig. 3. Three-valued state of a DBMTJ with a 7 nm thick NiFe middle electrode. The arrows indicate the magnetization states of the three electrodes

3.3 Serial Resistor Model
The Julliere model (Eq. (3)) predicts an unchanged TMR amplitude if (a) the spin polarization of all three electrodes is the same, (b) the tunneling electrons do not lose spin information during tunneling, and (c) there is no ballistic contribution. The smaller TMR compared to single MTJs can be attributed to other effects discussed in the following sections. To explain the discrepancy with the maximum TMR amplitude of 38% in DBMTJs, the data are evaluated within a simple serial resistor model. The evaluation relies on the fact that the process for preparing the first tunnel barrier is very reproducible and gives TMR amplitudes of 46,3% with a standard deviation of only 1,7%. With this value, the parameters of the two barriers can be separated, yielding the resistance and the TMR value of each junction independently. The known TMR amplitude of the lower junction is the only input parameter. The evaluation gives a TMR amplitude of the upper junction of about 20% for 2,5 and 4 nm NiFe thickness and about 33% for 7 nm. It also gives reasonable resistances: a 7 nm NiFe interlayer, for example, yields 17,1 MΩµm² for the lower barrier and 22,8 MΩµm² for the top barrier [18]. Although the TMR amplitude of the DBMTJ at 10 mV bias voltage is lower than that of the MTJs, the value at 500 mV is larger (29% instead of 27%). This improved bias stability is interesting for many applications.

From the I(V) characteristics (Fig. 4), the barrier parameters are deduced by fitting to the model of Brinkman [22]. The mean barrier height (effective barrier thickness) of the upper junction is somewhat larger (smaller) than that of the lower reference junction, which indicates a slight overoxidation of the upper barrier. This is probably caused by a different growth behavior of the upper Al layer on the different underlayer and can be optimized. The overoxidation of the second barrier explains the low TMR amplitude. Fixing the TMR amplitude of the first barrier experimentally enables the separation and leads to roughly the same resistance values for both barriers in every DBMTJ. The voltage drop at each junction should then be half of the bias voltage. Figure 4 shows exactly this behavior, and hence the consistency of the serial resistor model: the normalized TMR reaches about 0,77 at 500 mV for all DBMTJs, compared to 0,77 at 240 mV for the reference MTJ. In contrast to [23], we attribute the small deviations in the I(V) characteristics for different NiFe interlayer thicknesses to the reproducibility error. As shown before, this error is quite low. No significant additional complex behavior has been seen. Therefore, a more sophisticated model, e.g. for ballistic electrons, is not necessary.

[Figure: normalized TMR vs. bias voltage (V)]
Fig. 4. Normalized bias dependence of the TMR for the reference MTJ and DBMTJs with different interlayer thickness

3.4 The Contribution of Ballistic Electrons
The apparent absence of ballistic effects in DBMTJs can be understood by calculating the contribution of the ballistic current to the total current. This calculation was carried out for Co/Al oxide 1,5 nm/Co 2 nm/Al oxide 1,5 nm/Co junctions at 30 mV bias voltage by solving the Schrödinger equation for free electrons in the double barrier potential for both the parallel and the antiparallel state. Whereas the TMR amplitude rises from 44% for usual tunneling to 175% for ballistic transport (without resonant quantum well enhancement), the ballistic current is a factor of 10⁷ smaller than the diffusive part in the serial model. This ratio can be improved by decreasing the barrier thickness, but it is not possible to balance the two currents. Thus, the larger ballistic TMR is always masked by the overwhelming diffusive current.

To test this prediction, DBMTJs were produced with a nonmagnetic Cu middle electrode and otherwise unchanged features. Cu is known to strongly suppress spin-dependent tunneling [24]. Therefore, only ballistic electrons or resonant states can contribute to a TMR effect. With a 3 nm thick Cu middle electrode, a tiny TMR amplitude, smaller than 0,01%, could be measured (Fig. 5), whereas the single barrier MTJ shows about 70%, i.e. a factor of 10⁴ more. Nonetheless, this is a thousand times more than the free-electron theory predicts. Thus, resonant states [25] or, more plausibly in our opinion, spin accumulation effects can play an important role, too. Assuming a spin-flip relaxation time of 600 fs in Cu [26], we calculate a ratio of 10⁻⁴ for the contribution of spin-accumulated electrons to the total current, which agrees with the experiment.

In order to separate the different contributions to the electron transport, the technological challenge is to contact and ground the middle electrode. This can be realized simply by combining a single barrier MTJ with a Schottky barrier (Fig. 6). Only ballistic electrons crossing the bottom ferromagnet are measured at the Schottky contact. Note that this arrangement is merely a simple spin filter for ballistic electrons with two ferromagnetic layers, instead of the three in DBMTJs, which offer more possibilities and larger effects. Figure 6 shows an example of a measured ballistic magnetocurrent (BMC), defined analogously to Eq. (2), for a Co 5 nm/Al oxide 1,8 nm/Co 5 nm/IrMn/Ta/Cu stack on n-GaAs(100). The bias voltage is applied across the Al oxide barrier, whereas the Co electrode is grounded. The ballistic current over the Schottky barrier is measured with an ohmic contact at the semiconductor. It is four orders of magnitude smaller than the tunnel current across the Al oxide barrier. The TMR amplitude drops as usual with increasing bias voltage. The BMC is zero up to a threshold, which is determined by the Schottky barrier height of about 0,6 eV. Only electrons with a larger energy can cross the barrier. At large bias, the BMC increases and reaches a maximum of about 70% at 1,5 eV at T = 50 K. This value increases at lower temperatures and is much larger than the usual TMR amplitudes at these bias voltages.

[Figure: TMR (%), from 0,000 to 0,016, vs. field (kA/m), from -160 to 160 kA/m]
Fig. 5. TMR amplitude measured with a nonmagnetic middle electrode at 750 mV bias voltage and T = 10 K. The layer stack is ../IrMn/CoFe/Al oxide/Cu/Al oxide/CoFe/IrMn/..

[Figure: TMR and BMC vs. bias voltage (mV)]
Fig. 6. Left: Sketch of a double junction stack consisting of a Co/GaAs(100) Schottky barrier and a single MTJ. Right: The corresponding TMR and BMC amplitude vs. bias voltage measured at T = 50 K. The lines are fits to the experimental data
3.5 Conclusion
The bias voltage dependence in DBMTJs can be explained by a series connection of two single barrier junctions, so that the TMR drops with bias voltage only about half as fast as in single junctions. The improved bias voltage behavior and the three switching states make DBMTJs promising candidates for future applications. In addition, ballistic effects would provide a larger magnetoresistance effect, but the ballistic currents are at least a factor of 10⁴ smaller than the diffusive tunnel currents. Nevertheless, ballistic effects, resonant states and spin accumulation are excellent platforms for new developments with MTJs.
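The serial resistor evaluation of Sect. 3.3 can be sketched in a few lines. This is only an illustration under stated assumptions, not the evaluation procedure of [18]: the two quoted area resistances are taken as parallel-state values, both junctions are assumed to switch together between the parallel and the antiparallel state, and the total TMR of 38% is assigned to the 7 nm NiFe sample. All function names are hypothetical.

```python
def total_tmr(tmr_lower, tmr_upper, r_lower, r_upper):
    """TMR of two junctions in series (resistances in the parallel state)."""
    return (tmr_lower * r_lower + tmr_upper * r_upper) / (r_lower + r_upper)


def upper_junction_tmr(tmr_total, tmr_lower, r_lower, r_upper):
    """Invert total_tmr() for the unknown TMR of the upper junction."""
    return (tmr_total * (r_lower + r_upper) - tmr_lower * r_lower) / r_upper


def voltage_drops(v_bias, r_lower, r_upper):
    """Voltage divider: share of the bias voltage dropping at each junction."""
    r_sum = r_lower + r_upper
    return v_bias * r_lower / r_sum, v_bias * r_upper / r_sum


# Values quoted in the text (7 nm NiFe interlayer); the assignment of the
# 38% total TMR to this sample is an assumption of this sketch.
R_LOWER, R_UPPER = 17.1, 22.8      # area resistances in MOhm um^2
TMR_LOWER, TMR_TOTAL = 0.463, 0.38

tmr_upper = upper_junction_tmr(TMR_TOTAL, TMR_LOWER, R_LOWER, R_UPPER)
v_lower, v_upper = voltage_drops(500.0, R_LOWER, R_UPPER)  # bias in mV
print(f"upper junction TMR: {tmr_upper:.1%}")
print(f"voltage drops at 500 mV bias: {v_lower:.0f} mV and {v_upper:.0f} mV")
```

With these inputs the upper junction comes out at roughly 32%, close to the value of about 33% quoted for 7 nm NiFe, and each junction carries roughly half of the bias voltage, which is the basis of the improved bias stability.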
4 Application in a Magnetoresistive Biochip
Magnetic sensing is one of the fundamental applications of the TMR and GMR (giant magnetoresistance) effects. Consequently, these new magnetoresistive effects have recently been investigated for use in biosensors [27,28,29,30]. Compared to the established fluorescent labeling method, the use of magnetic markers in biochip sensors has important advantages for the detection of biomolecules at low concentrations, namely the high sensitivity of the new MR effects even at small magnetic fields and the absence of a magnetic background signal. Additionally, the compact size of the required instrumentation and the direct availability of an electronic signal allow for inexpensive, integrated handheld detection units, which could also be operated by non-expert users.

4.1 Hybridization
In order to analyze the molecular composition of a given sample, magnetic markers are specifically bound to the molecules, and the stray field of the magnetic markers is detected by a magnetoresistive sensor (Fig. 7). The magnetic markers are commercially available superparamagnetic or ferromagnetic microspheres, which are already widely used in the life sciences, for example in biochemical separation and in clinical applications like cancer treatment [31,32,33]. The biosensor can be used to detect any molecular recognition reaction. As an example, we have tested double-stranded DNA (PCR product) with a length of about 1 kb as a positive probe and double-stranded sheared salmon sperm DNA of about the same length as a negative probe. The analyte DNA is biotin-labeled (5’ and internal), single-stranded and complementary to the positive probe.

The main steps of the magnetic biochip detection process are sketched in Fig. 8. First, the various probe DNA samples are immobilized on the functionalized surface of the sensor elements with a pin-spotter. Then, the analyte DNA is bound to the probe DNA at 42 °C for 12 hours in a suitable hybridization solution. Only specifically bound DNA remains after the following washing step. Subsequently, the streptavidin-coated magnetic microspheres are attached to the specifically bound DNA at their biotin-labeled end groups. The right part of Fig. 8 shows the resulting surface coverage with magnetic markers after hybridization with analyte DNA at a concentration of 10 ng/µl. In the case of unspecific probe DNA with a concentration of 100 ng/µl, the marker coverage is identical to the background (less than 0,2%), whereas a specific probe DNA concentration of 10 ng/µl leads to a surface coverage of about 40%. Finally, the concentration of the magnetic markers, and hence of the attached biomolecules (DNA), can be measured electronically by the magnetoresistive sensors.

[Figure labels: biomolecule (e.g. DNA or protein), magnetic marker, stray field, XMR sensor, substrate]
Fig. 7. Schematic drawing of a magnetoresistive biochip sensor. The principle is explained in the text. The sensor can be operated in air and fluids
[Figure, left panel, process steps: Loading of the chip with single-stranded DNA molecules; Hybridization with biotinylated single-stranded DNA or RNA probes; Addition of magnetic markers (coated with streptavidin, binding to biotin); Detection of the markers with XMR sensor]
Fig. 8. Application of a magnetoresistive sensor to DNA detection. Left: Process steps of preparation and detection. Right: Micrograph after hybridization and marker binding of negative (top) and positive (bottom) DNA probes on a spiral-shaped GMR sensor
4.2 Detection by GMR Sensors
The giant magnetoresistance (GMR) effect is sensitive enough for the detection of magnetic markers and is suitable for large-area sensors several tens of microns in size. The GMR sensors typically consist of Ni80Fe20/Cu multilayers in the second antiferromagnetic coupling maximum with an effect amplitude of ∼7%. Following deposition, the multilayers are patterned into 1 µm wide spiral-shaped lines using negative electron beam lithography (Fig. 8 right). The magnetic microspheres, being superparamagnetic, have to be magnetized in order to produce a sensor signal. Since the sensor responds only to fields in the film plane, the magnetizing field is applied out-of-plane, which avoids saturating the sensor with the magnetizing field. Hence, the response of the sensor is due to the in-plane components of the magnetic stray fields induced by the magnetized microspheres (Fig. 9). The induced in-plane field is radially symmetric around the microsphere center. The maximum in-plane field induced by a typical magnetite microsphere with a diameter of 0,35 µm at a vertical distance of ∼200 nm amounts to 0,7 kA/m in saturation.
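The radial symmetry and the existence of a maximum of the in-plane stray field can be illustrated with a point-dipole sketch. This is an assumption made here for illustration only: the bead is replaced by a point dipole at its center with its moment along the film normal, the vertical distance is measured from the bead center to the sensor plane, and the moment value below is hypothetical, so the sketch does not attempt to reproduce the quoted 0,7 kA/m.

```python
import math


def in_plane_field(moment, rho, z):
    """Radial (in-plane) field in A/m of a point dipole oriented along z.

    moment: dipole moment in A m^2 (hypothetical input)
    rho:    lateral distance from the dipole axis in m
    z:      vertical distance between dipole and sensor plane in m
    """
    r_squared = rho * rho + z * z
    return 3.0 * moment * rho * z / (4.0 * math.pi * r_squared ** 2.5)


def radius_of_maximum(z):
    """The in-plane field of a vertical dipole peaks at rho = z / 2."""
    return 0.5 * z


def max_in_plane_field(moment, z):
    """Peak in-plane field, evaluated on the ring rho = z / 2."""
    return in_plane_field(moment, radius_of_maximum(z), z)


Z = 200e-9        # vertical distance in m, as quoted in the text
MOMENT = 1.0e-16  # A m^2, hypothetical bead moment for illustration only

print(f"ring of maximum field at rho = {radius_of_maximum(Z) * 1e9:.0f} nm")
print(f"peak in-plane field: {max_in_plane_field(MOMENT, Z):.3g} A/m")
```

In this picture the field vanishes directly below the bead and peaks on a ring of radius z/2, so the sensor signal depends on where the beads sit relative to the 1 µm wide sensor lines.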
[Figure: left panel, response signal (V) vs. vertical field (Oe); right panel, response signal (V) vs. marker coverage]
Fig. 9. Left: A typical GMR response as a function of a vertical field with and without markers. Right: The concentration dependence is linear. The zero level is defined by the reference signal of the unspecific negative probe
4.3 Comparison to the Fluorescence Method
The usual commercially available biochip is based on the detection of fluorescent markers with an optical scanner unit, which is rather large and expensive. For a direct comparison, similar conditions were sought for detection with the magnetoresistive technique. In a comparative analysis, the concentration of the analyte DNA was held constant, while the concentration of the immobilized probe DNA was varied. The sample treatment before binding of the different kinds of markers was identical. The applied procedures, temperatures and incubation times were the same until the last washing step after hybridization. The magnetic markers were dispersed in solution onto the
sensors and measured after another washing step in the dried state. Thus, direct comparability was guaranteed. The test DNA had a length of about 1000 base pairs and was produced by PCR. Whereas the concentration of the single-stranded analyte DNA was constant at 10 ng/µl, the concentration of the positive probe DNA varied between 16 and 10 000 pg/µl. The negative test consisted of 100 000 pg/µl salmon sperm DNA with a fragment length similar to that of the complementary DNA. Tests with the negative probe showed only a small unspecific binding ability, which guarantees a suitable chip surface for the comparison. The magnetic markers (from Bangs Lab.) were paramagnetic magnetite-polystyrene beads with an average diameter of 350 nm and a broad size distribution. These beads are characterized by an excellent binding behavior of the streptavidin coating to the biotin-marked analyte DNA, while unspecific binding is kept to a minimum. Because hundreds of markers bind strongly to each sensor element, the response signals of differently sized beads are averaged statistically. The fluorescence measurements with the Cy3 dye were done with a commercial scanner (ScanArray 4000 by Perkin Elmer). The concentration and the incubation time of the dye, as well as the scanner sensitivity (laser energy and photomultiplier), were tuned in such a way that the dynamic range covers the concentration range of the probe DNA. The concentration dependence of the normalized signals is compared in Fig. 10. The error bars result mainly from the inhomogeneous probe DNA binding, since the DNA spots were dropped from a pipette. As a result, both methods cover about the same concentration range of three orders of magnitude. Both methods are limited by unspecifically bound markers in the low concentration range, and both are limited in the same way by marker saturation at complete surface coverage in the high concentration range.
Both methods show about the same trend in the concentration dependence (note the logarithmic scale). Nevertheless, it should be noted that the normalized signal amplitudes of the magnetoresistive method are clearly larger, especially at low concentrations. The most important advantage of the magnetoresistive method, however, is its scalability. The sensitivity increases at smaller dimensions, while the dynamic range of three to four orders of magnitude is retained. The sensitivity can be further increased by replacing the GMR sensors with MTJs, which provide a larger magnetoresistance. This is discussed in the next section. The combination of a variable sensitivity range and purely electronic signal processing (optics is expensive) is the most attractive reason for product development.

[Figure: relative signal vs. specific probe DNA concentration (pg/µl); curves: magnetoresistive sensor, fluorescent detection; reference level: signal of 100 000 pg/µl unspecific probe DNA]
Fig. 10. Comparison between the magnetoresistive and fluorescent method [34]. The response signal is normalized to the unspecific background signal in both cases. The magnetoresistive signal is recorded in a vertical field of 40 kA/m

4.4 Detection by MTJs
By implementing MTJs instead of GMR sensors, one can reach the single-molecule detection level owing to the higher sensitivity and better scalability of MTJs. Estimates that include the stray field of ferromagnetic markers with a diameter of 50 nm and state-of-the-art MTJ properties show that micron-sized MTJs are sensitive enough to detect single markers and hence single molecules. First results from biochip sensors with MTJs already demonstrate their high sensitivity (Fig. 11). Experiments with model systems of magnetic dot arrays on top of MTJs are under way to gain deeper insight into the remagnetization processes and, hence, the origin of the response signals.

[Figure: TMR amplitude, from 0.0 to 1.6, vs. vertical field (kA/m), from -8 to 8 kA/m]
Fig. 11. Example of TMR response signals of 100 × 100 µm sensors at different marker coverages (uncovered, 5%, 60%) with 0,86 µm large magnetic markers near the threshold of the hysteresis switching
Acknowledgements

The authors acknowledge the collaboration with the partners from the genetics department in Bielefeld: Paul Kamp, Anke Becker, Alfred Pühler. H.B. thanks Andreas Hütten, Jan Schmalhorst and Willi Schepper for stimulating discussions. The authors gratefully acknowledge the support of this project by the German Ministry for Education and Research, BMBF, under grant 13N7859 and by the DFG in SFB 613.
References

1. M. Julliere, Phys. Lett. 54A, 225 (1975).
2. J.S. Moodera, L.R. Kinder, T.M. Wong, R. Meservey, Phys. Rev. Lett. 74, 3273 (1995).
3. T. Miyazaki, N. Tezuka, J. Magn. Magn. Mater. 139, L231 (1995).
4. A. Thomas, H. Brückl, M.D. Sacher, J. Schmalhorst, G. Reiss, J. Vac. Sci. Techn. B, in press (2003).
5. H. Kikuchi, M. Sato, K. Kobayashi, J. Appl. Phys. 87, 6055 (2000).
6. A.E. Berkowitz, K. Takano, J. Magn. Magn. Mater. 200, 552 (2000).
7. H. Brückl, J. Schmalhorst, H. Boeve, G. Gieres, J. Wecker, J. Appl. Phys. 91, 7029 (2002).
8. H. Boeve, L. Esparbe, G. Gieres, L. Bär, J. Wecker, H. Brückl, J. Appl. Phys. 91, 7962 (2002).
9. D. Meyners, H. Brückl, G. Reiss, J. Appl. Phys. 93, 2676 (2003).
10. H. Kubota, G. Reiss, H. Brückl, W. Schepper, J. Wecker, G. Gieres, Jap. J. Appl. Phys. 41, L280 (2002).
11. C.B. Duke, Tunnelling in Solids (Academic Press, New York 1969).
12. E.L. Wolf, Principles of electron tunnelling spectroscopy (Oxford University Press, New York 1989).
13. R.J. Hamers, Ann. Rev. Phys. Chem. 40, 531 (1989).
14. J.S. Moodera, G. Mathon, J. Magn. Magn. Mater. 200, 248 (1999).
15. S. Yuasa, T. Sato, E. Tamura, Y. Suzuki, H. Yamamori, K. Ando, T. Katayama, Europhys. Lett. 52, 344 (2000).
16. D.C. Worledge, T.H. Geballe, Phys. Rev. Lett. 85, 5182 (2000).
17. D.C. Worledge, T.H. Geballe, Appl. Phys. Lett. 76, 900 (2000).
18. A. Thomas, H. Brückl, J. Schmalhorst, G. Reiss, J. Appl. Phys., in press (2003).
19. J.H. Lee, I.-W. Chang, S.J. Byun, T.K. Hong, K. Rhie, W.Y. Lee, K.-H. Shin, C. Hwang, S.S. Lee, B.C. Lee, J. Magn. Magn. Mater. 240, 137 (2002).
20. X. Zhang, B.-Z. Li, G. Su, F.-C. Pu, Phys. Rev. B 56, 5484 (1997).
21. L. Sheng, Y. Chen, H.Y. Teng, C.S. Ting, Phys. Rev. B 59, 480 (1999).
22. W.F. Brinkman, R.C. Dynes, J.M. Rowell, J. Appl. Phys. 41, 1915 (1970).
23. Y. Saito, M. Amano, K. Nakajima, S. Takahashi, M. Sagoi, J. Magn. Magn. Mater. 223, 293 (2001).
24. P. LeClair, H.J.M. Swagten, J.T. Kohlhepp, R.J.M. van de Verdonck, W.J.M. de Jonge, Phys. Rev. Lett. 84, 2933 (2000).
25. K. Miyamoto, H. Yamamoto, J. Appl. Phys. 84, 311 (1998).
26. F.J. Jedema, A.T. Filip, B.J. van Wees, Nature 410, 345 (2001).
27. M.M. Miller, P.E. Sheehan, R.L. Edelstein, C.R. Tamanaha, L. Zhong, et al., J. Magn. Magn. Mater. 225, 138 (2001).
28. M. Tondra, M. Porter, R.J. Lipert, J. Vac. Sci. Technol. A 18, 1125 (2000).
29. H. Brückl, A. Hütten, G. Reiss, A. Becker, A. Pühler, Statusseminar ”Magnetoelectronics”, VDI-TZ, BMBF, ISBN 3-931384-31-4, Düsseldorf (2000).
30. J. Schotter, P.B. Kamp, A. Becker, A. Pühler, D. Brinkmann, W. Schepper, H. Brückl, G. Reiss, IEEE Trans. Magn. 38, 3365 (2002).
31. K. Kriz, J. Gehrke, D. Kriz, Biosensors & Bioelectronics 13, 817 (1998).
32. D.J. Anderson, Clinical Chemistry, Anal. Chem. 71, 293R (1999).
33. U. Häfeli, W. Schütt, J. Teller, M. Zborowski, Scientific and clinical applications of magnetic carriers (Plenum Press, New York 1997).
34. J. Schotter, P.B. Kamp, A. Becker, A. Pühler, G. Reiss, H. Brückl, Biosensors & Bioelectronics, in preparation (2003).
III-V and II-VI Mn-Based Ferromagnetic Semiconductors

Tomasz Dietl

Institute of Physics, Polish Academy of Sciences, al. Lotników 32/46, PL 02 668 Warszawa, Poland

Abstract. A review is given of advances in the field of carrier-controlled ferromagnetism in Mn-based diluted magnetic semiconductors and their nanostructures. Experimental results for III-V materials, where the Mn atoms introduce both spins and holes, are compared to the case of II-VI compounds, in which Curie temperatures TC above 1 K have been observed for uniformly and modulation-doped p-type structures but not for n-type films. The experiments demonstrating the tunability of TC by light and electric field are presented. The tailoring of domain structures and magnetic anisotropy by strain engineering and confinement is discussed, emphasizing the role of the spin-orbit coupling in the valence band. The question of designing modulated magnetic structures in low-dimensional semiconductor systems is addressed. Recent progress in the search for semiconductors with TC above room temperature is presented.
Over recent years spin electronics (spintronics) has emerged as an interdisciplinary field of nanoscience, whose main goal is to acquire knowledge of spin-dependent phenomena and to exploit them for new functionalities [1]. One of the relevant issues is to develop methods for manipulating the magnitude and direction of the magnetization as well as spin currents, which will ultimately lead to control over individual spins in a solid-state environment. Today’s spintronic research involves virtually all material families, the most mature being studies of magnetic metal multilayers, in which spin-dependent scattering and tunnelling are being successfully applied in the read heads of high-density hard discs and in magnetic random access memories (MRAM). However, ferromagnetic semiconductors, which combine the complementary functionalities of ferromagnetic and semiconductor material systems, appear particularly interesting. For instance, it can be expected that the powerful methods developed to control the carrier concentration and spin polarization in semiconductor quantum structures could serve to tailor the magnitude and orientation of the magnetization produced by the spins localized on the magnetic ions. Furthermore, there is a growing body of evidence that ferromagnetic semiconductors are the materials of choice for the development of functional spin injectors, aligners, filters, and detectors. In addition to being an important ingredient of power-saving spin transistors, spin injection can serve as a tool for fast modulation of light polarization in semiconductor lasers.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 413–427, 2003. © Springer-Verlag Berlin Heidelberg 2003
Early studies of Cr spinels as well as of rock-salt Eu- [2] and Mn-based [3] chalcogenides already led to the observation of a number of outstanding phenomena associated with the interplay between ferromagnetic cooperative phenomena and semiconducting properties. The discovery of carrier-induced ferromagnetism in Mn-based zinc-blende III-V compounds [4,5], followed by the prediction [6] and observation of ferromagnetism in p-type II-VI materials [7,8], allows one to explore the physics of previously unavailable combinations of quantum structures and magnetism in semiconductors [9]. These aspects of ferromagnetic semiconductors will be presented here together with a description of models aiming at explaining the nature of ferromagnetism in these materials. This survey has been prepared on the basis of recent review papers on ferromagnetic semiconductors [10,11], updated with some of the latest findings in this field. We limit ourselves to Mn-based ferromagnetic semiconductors originating from the zinc-blende family of diluted magnetic semiconductors (DMS). Other magnetic semiconductors like manganites, chalcogenides, and spinels, exhibiting different magnetic coupling mechanisms and band structures, are not discussed here.
1
Mn Impurity in II-VI and III-V Semiconductors
1.1
Substitutional Mn
It is well established that substitutional Mn is divalent in II-VI compounds, and assumes the high spin $d^5$ configuration characterized by S = 5/2 and g = 2.0. Here, Mn ions neither introduce nor bind carriers, but give rise to the presence of the localized spins which are coupled to the effective mass electrons by a strong symmetry allowed p-d kinetic exchange and by a weaker s-d potential exchange [12]. In III-V compounds, in turn, the Mn atom, when substituting a trivalent metal, may assume either of two configurations: (i) $d^4$ or (ii) $d^5$ plus a weakly bound hole, $d^5$+h. It is now commonly accepted that the Mn impurities act as effective mass acceptors ($d^5$+h) in the case of antimonides and arsenides, so that they supply both localized spins and holes, a picture supported by MCD [13] and EPR [14] measurements. Just like in other doped semiconductors, if the average distance between the Mn acceptors becomes smaller than $2.5a_B$, where $a_B$ is the acceptor Bohr radius, the Anderson-Mott insulator-to-metal transition occurs. However, a strong p-d antiferromagnetic interaction between the Mn and hole spin enhances strongly the acceptor binding energy and reduces $a_B$. It has been postulated [15] that owing to the large p-d interaction, the effect is particularly strong in nitrides, and may lead to the formation of a middle-gap small $d^5$+h polaron state, reminiscent of the Zhang-Rice singlet in high temperature superconductors.
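The criterion for the insulator-to-metal transition quoted above can be turned into a rough estimate of the critical acceptor density. The following is a minimal sketch, assuming that the average distance between acceptors is taken as n^(-1/3) (a common convention, not stated in the text); the function name and the Bohr radius value are hypothetical and serve only as an illustration.

```python
def critical_density_cm3(bohr_radius_nm, ratio=2.5):
    """Acceptor density (cm^-3) at which the mean spacing equals ratio * a_B.

    Assumption: the mean acceptor spacing is approximated by n**(-1/3).
    """
    spacing_cm = ratio * bohr_radius_nm * 1e-7  # 1 nm = 1e-7 cm
    return spacing_cm ** -3


# Illustrative input only: a_B = 1 nm is a hypothetical acceptor Bohr radius.
n_c = critical_density_cm3(1.0)
print(f"critical density ~ {n_c:.2e} cm^-3")
```

Because the critical density scales as the inverse cube of the Bohr radius, the reduction of the Bohr radius by the p-d interaction mentioned above shifts the transition to substantially higher Mn concentrations.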
1.2
Interstitial Mn
Another important consequence of the electrical activity of Mn in III-V compounds is the effect of self-compensation. In the case of (Ga,Mn)As, it presumably accounts for the upper limits of both the hole concentration and the substitutional Mn concentration [16]. According to RBS and PIXE experiments [16], an increase in the Mn concentration results not only in the formation of MnAs precipitates [17] but also in the occupation by Mn of interstitial positions, MnI. Since the latter is a double donor in GaAs [18], its formation is triggered by the lowering of the system energy due to the removal of holes from the Fermi level. This scenario explains the reentrance of the insulator phase for large Mn concentrations [19] as well as the strong sensitivity of (Ga,Mn)As properties to annealing at temperatures much lower than those affecting other possible compensators, such as As antisites, AsGa. Importantly, a symmetry analysis demonstrates that the hybridization between bands and d-states of MnI is weak, which implies a substantial decrease of the sp-d exchange interaction once Mn assumes an interstitial position [20].
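The carrier counting behind self-compensation can be illustrated with a short sketch. It assumes that each substitutional Mn supplies one hole and each interstitial Mn, as a double donor, removes two, and that other compensators such as As antisites are neglected. The function names are hypothetical; the cation density follows from four cation sites per cubic cell of GaAs with a lattice constant of 5.653 Å.

```python
GAAS_LATTICE_CONSTANT_CM = 5.653e-8  # GaAs lattice constant (5.653 angstrom)


def cation_density_cm3(a_cm=GAAS_LATTICE_CONSTANT_CM):
    """Cation sites per cm^3 in zinc-blende: 4 per cubic unit cell."""
    return 4.0 / a_cm ** 3


def net_hole_density_cm3(x_sub, x_int):
    """Net hole density for substitutional fraction x_sub and interstitial
    fraction x_int (both relative to the cation sites).

    Assumption: one hole per substitutional Mn, two electrons per
    interstitial Mn, no other compensating defects.
    """
    return max(cation_density_cm3() * (x_sub - 2.0 * x_int), 0.0)


# Illustrative fractions only (hypothetical sample compositions).
print(f"{net_hole_density_cm3(0.05, 0.00):.2e} cm^-3 without interstitials")
print(f"{net_hole_density_cm3(0.04, 0.01):.2e} cm^-3 with 1% interstitials")
```

In this simple picture, full compensation is reached when the interstitial fraction equals half of the substitutional one.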
2
Zener Model of Carrier-Mediated Ferromagnetism
For low carrier densities, II-VI DMS are paramagnetic, but neighboring Mn-Mn pairs are antiferromagnetically coupled or even blocked owing to short-range superexchange interactions. However, this antiferromagnetic coupling can be overcompensated by ferromagnetic interactions mediated by band holes [6,7,8]. In the presence of band carriers, the celebrated Ruderman-Kittel-Kasuya-Yosida (RKKY) mechanism of the spin-spin exchange interaction operates. In the context of III-V magnetic semiconductors, this mechanism was first discussed by Gummich and da Cunha Lima [21], and then applied to a variety of III-V and II-VI Mn-based layered structures [22]. It has been shown [6] that on the level of the mean-field and continuous medium approximations, the RKKY approach is equivalent to the Zener model. In terms of the latter, the equilibrium magnetization, and thus $T_C$, is determined by minimizing the Ginzburg-Landau free energy functional $F[M(\mathbf{r})]$ of the system, where $M(\mathbf{r})$ is the local magnetization of the localized spins [25,26]. This is a rather versatile approach, to which carrier correlation, confinement, k · p, and spin-orbit couplings as well as weak disorder and antiferromagnetic interactions can be introduced in a controlled way, and within which the quantitative comparison of experimental and theoretical results is possible [8,26]. As shown in Fig. 1, theoretical calculations [8,26], carried out with no adjustable parameters, explain satisfactorily the magnitude of $T_C$ in both (Zn,Mn)Te [8] and (Ga,Mn)As [19,24]. A similar conclusion has been reached by analyzing $T_C$ in a series of annealed samples [28], in which $T_C$ presently reaches 160 K [29,30,31].
Fig. 1. Ferromagnetic Curie temperature normalized by the Mn concentration in epilayers of p-type (Zn,Mn)Te [8,23] and (Ga,Mn)As [19,24,25,26] as well as in modulation-doped p-type (Cd,Mn)Te [7,27]. Points and lines represent experimental and theoretical results, respectively
In the model, the hole contribution to $F$ is computed by diagonalizing the 6 × 6 Kohn-Luttinger k · p matrix containing the p-d exchange contribution, and by a subsequent computation of the partition function $Z$, $F_c = -k_B T \ln Z$. The model is developed for p-type zinc-blende and wurtzite semiconductors and allows for the presence of both biaxial strain and quantizing magnetic field. The enhancement of the tendency towards ferromagnetism by the carrier-carrier exchange interactions is described in the spirit of the Fermi liquid theory. Importantly, by evaluating $F_c(q)$, the magnetic stiffness can be determined, which, together with the magnetic anisotropy, yields the dispersion of spin waves [32] and the structure of magnetic domains [33]. Owing to the relatively small magnitudes of the s-d exchange coupling and density of states, the carrier-induced ferromagnetism is expected [6] and observed only under rather restricted conditions in n-type DMS [23,34].
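The idea of obtaining the equilibrium magnetization by minimizing a free energy can be illustrated with a deliberately simplified sketch. It does not implement the model described above (no k · p bands, no exchange parameters); instead it assumes a uniform magnetization and a textbook Landau expansion F(M) = a (T − Tc) M² + b M⁴, where the coefficients `a`, `b` and the value of `t_c` are hypothetical.

```python
def landau_free_energy(m, temperature, t_c=100.0, a=1.0, b=50.0):
    """Toy Landau free energy for a uniform reduced magnetization m.

    Assumption: quadratic coefficient changes sign at t_c; a, b, t_c are
    illustrative numbers, not material parameters.
    """
    return a * (temperature - t_c) * m ** 2 + b * m ** 4


def equilibrium_magnetization(temperature, t_c=100.0, a=1.0, b=50.0,
                              steps=20001):
    """Return the m in [0, 1] that minimizes the toy free energy (grid search)."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(grid,
               key=lambda m: landau_free_energy(m, temperature, t_c, a, b))


for temperature in (50.0, 90.0, 120.0):
    m_eq = equilibrium_magnetization(temperature)
    print(f"T = {temperature:5.1f} K  ->  M_eq = {m_eq:.4f}")
```

In this toy expansion the minimum sits at M² = a (Tc − T) / (2b) below Tc and at M = 0 above it, so the temperature at which a nonzero minimum first appears plays the role of the Curie temperature.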
3
Spin Polarization
An important parameter that characterizes any magnetic material is the degree of spin polarization of band carriers. According to theoretical results [26] summarized in Fig. 2, the expectation value of spin polarization reaches 80% for typical values of Mn and hole concentrations in (Ga,Mn)As, a prediction that is being verified by Andreev reflection.
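For orientation, the degree of spin polarization can be written as P = (n↑ − n↓)/(n↑ + n↓). The sketch below evaluates it for a crude model that is an assumption of this example and not the six-band calculation of [26]: two spin-split parabolic subbands at zero temperature with a fixed Fermi energy, so that each subband density scales as the 3/2 power of its occupied energy range. Function and parameter names are hypothetical.

```python
def spin_polarization(n_up, n_down):
    """Degree of spin polarization P = (n_up - n_down) / (n_up + n_down)."""
    total = n_up + n_down
    if total == 0:
        raise ValueError("total carrier density must be nonzero")
    return (n_up - n_down) / total


def parabolic_band_polarization(fermi_energy_mev, splitting_mev):
    """Polarization of two spin-split 3D parabolic subbands at T = 0.

    Assumption: subband densities scale as (occupied energy range)**1.5,
    with the Fermi energy measured from the unsplit band edge.
    """
    n_up = max(fermi_energy_mev + splitting_mev / 2.0, 0.0) ** 1.5
    n_down = max(fermi_energy_mev - splitting_mev / 2.0, 0.0) ** 1.5
    return spin_polarization(n_up, n_down)


# Illustrative energies only (hypothetical values in meV).
for splitting in (0.0, 20.0, 40.0):
    p = parabolic_band_polarization(50.0, splitting)
    print(f"splitting = {splitting:4.1f} meV -> P = {p:.3f}")
```

Even in this simplified picture the polarization grows with the ratio of the spin splitting to the Fermi energy and saturates at unity once the minority subband is emptied.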
4
Strain Effects
It is well known that the orbital momentum of the majority hole subbands depends on strain. Hence, magnetic anisotropy (easy axis direction) can be manipulated by adjusting the lattice parameter of the substrate, as the growth of the DMS films in question is usually pseudomorphic. Theoretical results [26] displayed in Fig. 3 show how magnetic anisotropy varies with the strain direction and the hole concentration [25,26,35]. In particular, the crystallographic orientation of the easy axis depends on whether the epitaxial strain is compressive or tensile, in agreement with the pioneering experimental studies for (In,Mn)As [36] and (Ga,Mn)As [37]. However, magnetic anisotropy at a given strain is predicted to vary with the degree of the occupation of particular hole subbands. This, in turn, is determined by the ratio of the valence band exchange splitting to the Fermi energy, and thus by the magnitude of spontaneous magnetization, which depends on temperature. As shown in Fig. 4, the predicted temperature-induced switching of the easy axis has recently been detected in samples with appropriately low hole densities [38,39].
Fig. 2. Computed degree of spin polarization of the hole liquid as a function of the spin splitting parameter for various hole concentrations in Ga1−xMnxAs (BG = −30 meV corresponds to the saturation value of Mn spin magnetization for x = 0.05). The polarization of the hole spins is oriented in the opposite direction to the polarization of the Mn spins (after Dietl et al. [26])
Fig. 3. Computed anisotropy field for compressive (a) and tensile (b) strains for various values of the hole spin splitting parameter BG. The value of BG = 30 meV corresponds to the saturation value of magnetization for Ga0.95Mn0.05As. The symbol [001] → [100] means that the easy axis is along [001], and the aligning external magnetic field is applied along [100] (after Dietl et al. [26])
[Figure 4 plot: M / MSat (5 K) [a.u.] versus Magnetic Field [Oe]; panels (a) 5 K and (b) 25 K; curves for H || [001] and H || [100]; sample (Ga,Mn)As/GaAs, As4/Ga = 5, xMn = 0.023, Tsub = 270 °C]
Fig. 4. Magnetization loops at 5 K (a) and 25 K (b) for parallel (full symbols) and perpendicular (open symbols) orientation of the (001) (Ga,Mn)As/GaAs epilayer with respect to the external magnetic field. The reversed character of the hysteresis loops indicates the flip of the easy axis direction between these two temperatures (after Sawicki et al. [38])
5
Dimensional Effects
It is straightforward to generalize the mean-field Zener model for the case of carriers confined to the d-dimensional space [6,40]. The tendency towards the formation of spin-density waves in low-dimensional systems [40,41] as well as possible spatial correlation in the distribution of the magnetic ions can also be taken into account. The mean-field value of the critical temperature $T_q$, at which the system undergoes the transition to a spatially modulated state characterized by the wave vector $q$, is given by the solution of the equation
$$\beta^2 A_F(q, T_q)\,\rho_s(q, T_q) \int d\zeta\, \chi_o(q, T_q, \zeta)\,|\phi_o(\zeta)|^4 = 4 g^2 \mu_B^2. \tag{1}$$
Here $q$ spans the d-dimensional space, $\phi_o(\zeta)$ is the envelope function of the carriers confined by a (3−d)-dimensional potential well $V(\zeta)$; $g$ and $\chi_o$ denote the Landé factor and the q-dependent magnetic susceptibility of the magnetic ions in the absence of the carriers, respectively. Within the mean-field approximation (MFA), the ordered phase adopts the magnetization shape and direction for which the corresponding $T_q$ attains the highest value. A ferromagnetic order is expected in the three-dimensional (3D) case, for which a maximum of $\rho_s(q)$ occurs at $q = 0$. According to the above model, $T_C$ is proportional to the density of states for spin excitations, which is energy independent in 2D systems. Hence, in the 2D case, $T_C$ is expected not to vary with the carrier density, and to be enhanced over the 3D value at low carrier densities. Experimental results [7,27] presented in Fig. 1 confirm these expectations, though a careful analysis indicates that disorder-induced band tailing lowers $T_C$ when the Fermi energy approaches the band edge [27,41]. In 1D systems, in turn, the formation of spin density waves with $q = 2k_F$ is expected, a prediction awaiting experimental confirmation.
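The dimensional argument can be made concrete with a short sketch. It assumes, as a simplification of this example, that the relevant density of states follows the free-carrier scaling of a parabolic band, ρ(E) ∝ E^(d/2 − 1), with all prefactors omitted; the function name is hypothetical.

```python
def relative_dos(energy, dimension):
    """Density of states of a parabolic band in arbitrary units.

    Assumption: rho(E) scales as E**(d/2 - 1) for E > 0 and vanishes below
    the band edge; prefactors (mass, degeneracy) are dropped.
    """
    if energy <= 0:
        return 0.0
    return energy ** (dimension / 2 - 1)


for dimension in (1, 2, 3):
    values = [relative_dos(e, dimension) for e in (1.0, 4.0, 16.0)]
    print(f"d = {dimension}: {values}")
```

Under this assumption the 2D value stays constant as the Fermi energy, and hence the carrier density, is varied, whereas the 3D value drops towards the band edge and the 1D value diverges there, consistent with the trends discussed above.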
6
Manipulation of Magnetization by Electric Field and Light
Since magnetic properties are controlled by the band carriers, the powerful methods developed to change carrier concentration by electric field and light in semiconductor structures can be employed to alter the magnetic ordering. Such tuning capabilities of the materials in question were demonstrated in (In,Mn)As/(Al,Ga)Sb [42,43] and (Cd,Mn)Te/(Cd,Zn,Mg)Te [7,27] heterostructures, as shown in Figs. 5 and 6. Importantly, the magnetization switching is isothermal and reversible. Though not investigated in detail, the underlying processes are expected to be rather fast. Since the background hole concentration is small in Mn-based II-VI quantum wells, the relative change of the Curie temperature is typically larger than in III-V compounds.
7
Spin Injection
A number of groups are involved in the development of devices capable of injecting spins into a non-magnetic semiconductor. Obviously, owing to a high degree of spin polarization and resistance matching, ferromagnetic semiconductors constitute a natural material of choice here [44]. Typically, a p-i-n light emitting diode structure is employed, in which the p-type spin injecting electrode is made of a ferromagnetic semiconductor. Experimental results
[Figure 5 plot: RHall (kΩ) versus B (mT) at 22.5 K for gate voltages VG = 0 V, +125 V, −125 V, 0 V, with regions marked VG < 0 and VG > 0; inset: RHall (kΩ) versus B (T) at 1.5 K, 5 K, 10 K, and 20 K]
Fig. 5. Magnetization hysteresis evaluated by measurements of the anomalous Hall effect at various gate voltages that change the hole concentration in a field-effect transistor structure with an (In,Mn)As channel (after Ohno et al. [43])
[Figure 6 plot: PL intensity versus Energy (meV), range 1685 to 1715 meV; panels (a) and (b) for the p-i-p structure, panels (c) and (d) for the p-i-n diode at Vd = 0 V and Vd = −0.7 V; curves labeled by temperature T (K) and hole density p]
Fig. 6. Effect of temperature (a,c,d), illumination (b) and bias voltage Vd (c,d) on photoluminescence line in quantum well of (Cd,Mn)Te placed in a center of p-i-n diode (c,d) and p-i-p structure (a,b). Line splitting and shift witness the appearance of a ferromagnetic ordering that can be altered isothermally, reversibly, and rapidly by light (b) and voltage (c,d), which change the hole concentration p in the quantum well (after Boukari et al. [27])
obtained for the (Ga,Mn)As/GaAs/(In,Ga)As/n-GaAs diode are shown in Fig. 7. In this particular experiment [45], the degree of circular polarization is examined for light emitted in the growth direction. In the corresponding Faraday configuration, simple selection rules are obeyed for radiative recombination between the electron and heavy hole ground state subbands. Since the easy axis is in plane, in agreement with the theoretical results of Fig. 3, a field of a few kOe is necessary to align the magnetization and thus to produce a sizable degree of light polarization.
8
Optical and Transport Properties
In addition to the thermodynamic properties discussed above, the Zener model of ferromagnetism in the materials in question has been applied to describe optical properties of (Ga,Mn)As in the region of interband transitions between the exchange-split valence band and the conduction band [13,26] as well as in the regime where intra-valence band excitations dominate [46]. Of course, both the dispersion of absorption and the magnetic circular dichroism are rather sensitive to disorder [13,46], whose full description is difficult in these heavily doped and strongly disordered magnetic alloys. Furthermore, the so-far disregarded intra-d level transitions, presumably enhanced by the p-d hybridization, are expected to contribute in the present spectral range. However, the main aspects of the available experimental results appear to be correctly understood. Recently, various aspects of d.c. transport in (Ga,Mn)As and p-(Zn,Mn)Te have been examined. In general, a number of mechanisms by which the sp-d exchange interaction between localized and effective mass electrons can affect transport phenomena have been identified [12]. Generally speaking, these mechanisms are associated with spin-disorder scattering, spin splitting, and the formation of bound magnetic polarons [12,47]. In the particular case of (Ga,Mn)As, a resistance maximum at TC and the associated negative weak-field magnetoresistance have been described in terms of critical scattering [12,24] or by considering an interplay between the magnetic and electrostatic disorder [47,48]. Furthermore, anisotropic magnetoresistance (AMR), known already from early studies of p-(Hg,Mn)Te [50], has been theoretically examined as a function of strain and hole density in (Ga,Mn)As [51]. Interestingly, it has been suggested [49] that the well-known high-field negative magnetoresistance of (Ga,Mn)As is actually an orbital weak localization effect, which is not destroyed by spin scattering owing to the large spin splitting of the valence band. Finally, arguments have been presented [52] that, owing to a relatively high resistance, the side-jump mechanism of the anomalous Hall effect dominates, and that its calculation with the appropriate Kohn-Luttinger amplitudes, neglecting the disorder entirely, gives a correct sign and amplitude of the Hall coefficient in both (Ga,Mn)As [52] and p-(Zn,Mn)Te [49].
Fig. 7. Degree of circular polarization of light emitted by an (In,Ga)As quantum well located in a p-i-n diode biased in the forward direction and containing (Ga,Mn)As as the p-type electrode. An external magnetic field is applied along the hard axis, that is, perpendicular to the interface plane. The degree of circular polarization and the (Ga,Mn)As magnetization depend similarly on the magnetic field and temperature, which, together with the lack of photoluminescence polarization for excitation by linearly polarized light, points to the existence of hole spin injection from ferromagnetic (Ga,Mn)As to non-magnetic (In,Ga)As via non-magnetic GaAs (after Young et al. [45])
9
Towards Functional Ferromagnetic Semiconductors
In view of the promising properties of ferromagnetic semiconductors, the development of a functional material with TC comfortably surpassing room temperature becomes an important challenge of today's materials science. A concentrated effort in this direction, stimulated by theoretical results [25,26] recalled in Fig. 8 and confirmed by others [53,54], suggests that there are no fundamental limits precluding the achievement of this goal. However, because of the limited solubility of magnetic impurities in functional semiconductors, the search for prospective compounds must be accompanied by a careful control and detection of possible ferromagnetic or ferrimagnetic precipitates and inclusions, typically with a sensitivity greater than that provided by standard x-ray diffraction. It is then useful to formulate some experimental criteria that should be fulfilled in order to call a given material a ferromagnetic semiconductor. First, magnetic characteristics should scale with the concentration of the magnetic constituent and also with the carrier density (which can be varied not only by doping but also by other means such as an electric field or light). Furthermore, there should be a relation between the temperature and field dependence of semiconductor and magnetic properties. In particular, the anomalous Hall effect, spin-dependent resistance, and magnetic circular dichroism together with the spin injection capability are the well-known signatures of a ferromagnetic semiconductor.
[Figure 8 plot: Curie temperature (K), axis ticks at 10, 100, and 1000, for C, Si, Ge, AlP, AlAs, GaN, GaP, GaAs, GaSb, InP, InAs, ZnO, ZnSe, ZnTe]
Fig. 8. Computed Curie temperature for various materials containing 5% Mn per unit cell and 3.5 × 10^20 holes per cm^3 (after Dietl et al. [25,26])
Fig. 9. Curie temperature in Ge1−xMnx. Experimental results and LSDA theory are shown by points and dashed-dotted line, respectively (after Jonker et al. [55]). Solid line depicts the expectations of the Zener model [25,26] assuming the hole concentration p = 3.5 × 10^20 x/0.025 cm^−3
In the above context, particularly remarkable is the successful synthesis by the NRL group [55] of a new ferromagnetic semiconductor Ge1−xMnx. The epitaxial films of this material are shown to consist of (i) precipitates, whose size, composition, and magnetic properties depend on the growth temperature and (ii) a homogeneous p-type matrix Ge1−xMnx, whose TC increases approximately linearly with x, as shown in Fig. 9. These findings are also compared with the outcome of theory developed within the local spin-density approximation (LSDA) [55] and with the results determined from the Zener model of the hole-mediated ferromagnetism in tetrahedrally coordinated semiconductors, put forward a priori for Ge1−xMnx [25,26]. As seen, this comparison indicates that the LSDA considerably overestimates the magnitude of TC, while the Zener model, despite the proximity of the metal-insulator transition, provides a reasonable evaluation of TC for the assumed degree of compensation. There is a remarkable parallel effort aiming at synthesizing other promising ferromagnetic semiconductors. Limiting ourselves to Mn-based III-V systems, we can mention works demonstrating indications of high-temperature ferromagnetism in (Ga,Mn)P [56] and (Ga,Mn)N [57,58], though whether the ferromagnetism in these materials fulfills the criteria specified above is under vigorous debate [59,60].
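The hole concentration assumed for the Zener-model curve in Fig. 9 scales linearly with the Mn content, p = 3.5 × 10^20 cm^−3 × x/0.025. A minimal sketch of this scaling is given below; the function name is hypothetical, and the default numbers are simply those quoted in the figure caption.

```python
def assumed_hole_density_cm3(x_mn, p_ref_cm3=3.5e20, x_ref=0.025):
    """Hole density assumed for Ge(1-x)Mn(x): linear scaling with x.

    The reference values are taken from the Fig. 9 caption; the linear
    form itself is the assumption made there for the compensation.
    """
    return p_ref_cm3 * x_mn / x_ref


# Illustrative Mn fractions only.
for x_mn in (0.01, 0.025, 0.035):
    print(f"x = {x_mn:.3f} -> p = {assumed_hole_density_cm3(x_mn):.2e} cm^-3")
```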
10
Conclusion
Without doubt, recent years have witnessed remarkable progress in the development of new material systems, which show novel capabilities, such as the manipulation of ferromagnetism by the electric field. At the same time, carrier-controlled ferromagnetic semiconductors combine intricate properties of charge-transfer insulators and strongly correlated disordered metals with the physics of defect and band states in semiconductors. Accordingly, despite important advances in the theory of these materials, a quantitative understanding of these systems is likely to remain a challenge for a long time.
Acknowledgements
The author would like to thank his co-workers, particularly F. Matsukura and H. Ohno in Sendai; J. Cibert in Grenoble; P. Kacman, P. Kossacki, and M. Sawicki in Warsaw; and A.H. MacDonald in Austin for many years of fruitful collaboration in studies of ferromagnetic semiconductors. The author's research in Germany in 2003 was supported by the Alexander von Humboldt Foundation, while the work in Poland was supported by the State Committee for Scientific Research as well as by the FENIKS and AMORE EC projects.
References
1. S. Wolf, D.D. Awschalom, R.A. Buhrman, J.M. Daughton, S. von Molnár, M.L. Roukes, A.Y. Chtchelkanova, D.M. Treger, Science 294, 1488 (2001); T. Dietl, Acta Phys. Polon. A 100 (suppl.), 139 (2001) (available at http://xxx.lanl.gov/abs/cond-mat/0201279); H. Ohno, F. Matsukura, Y. Ohno, JSAP International 5, 4 (2002) (available at http://www.jsapi.jsap.or.jp/).
2. P. Wachter, in: Handbook on the Physics and Chemistry of Rare Earth vol. 1 (North-Holland, Amsterdam 1979) p. 507; E.L. Nagaev, Physics of Magnetic Semiconductors (Mir, Moscow 1983); A. Mauger, C. Godart, Phys. Rep. 141, 51 (1986).
3. T. Story, Acta Phys. Polon. 91, 1735 (1997).
4. H. Ohno, H. Munekata, T. Penney, S. von Molnár, L.L. Chang, Phys. Rev. Lett. 68, 2664 (1992).
5. H. Ohno, A. Shen, F. Matsukura, A. Oiwa, A. Endo, S. Katsumoto, Y. Iye, Appl. Phys. Lett. 69, 363 (1996).
6. T. Dietl, A. Haury, Y. Merle d'Aubigné, Phys. Rev. B 55, R3347 (1997).
7. A. Haury, A. Wasiela, A. Arnoult, J. Cibert, S. Tatarenko, T. Dietl, Y. Merle d'Aubigné, Phys. Rev. Lett. 79, 511 (1997).
8. D. Ferrand, J. Cibert, A. Wasiela, C. Bourgognon, S. Tatarenko, G. Fishman, T. Andrearczyk, J. Jaroszyński, S. Koleśnik, T. Dietl, B. Barbara, D. Dufeu, Phys. Rev. B 63, 085201 (2001).
9. H. Ohno, Science 281, 951 (1998).
10. T. Dietl, Semicond. Sci. Technol. 17, 377 (2002).
11. F. Matsukura, H. Ohno, T. Dietl, in: Handbook of Magnetic Materials, vol. 14, ed. K.H.J. Buschow (Elsevier, Amsterdam 2002) p. 1–87.
12. T. Dietl, in: Handbook on Semiconductors vol. 3B, ed. T.S. Moss (Elsevier, Amsterdam 1994) p. 1251.
13. J. Szczytko, W. Bardyszewski, A. Twardowski, Phys. Rev. B 64, 075306 (2001).
14. O.M. Fedorych, E.M. Hankiewicz, Z. Wilamowski, J. Sadowski, Phys. Rev. B 66, 045201 (2002).
15. T. Dietl, F. Matsukura, H. Ohno, Phys. Rev. B 66, 033203 (2002).
16. K.M. Yu, W. Walukiewicz, T. Wojtowicz, I. Kuryliszyn, X. Liu, Y. Sasaki, J.K. Furdyna, Phys. Rev. B 65, 201303(R) (2002).
17. J. De Boeck, R. Oesterholt, A. Van Esch, H. Bender, C. Bruynseraede, C. Van Hoof, G. Borghs, Appl. Phys. Lett. 68, 2744 (1996).
18. J. Masek, F. Maca, Acta Phys. Polon. A 100, 319 (2001); F. Maca, J. Masek, Phys. Rev. B 65, 235209 (2002).
19. F. Matsukura, H. Ohno, A. Shen, Y. Sugawara, Phys. Rev. B 57, R2037 (1998).
20. J. Blinowski, P. Kacman, Phys. Rev. B 67, 121204 (2003).
21. U. Gummich, I.C. da Cunha Lima, Solid State Commun. 76, 831 (1990).
22. M.A. Boselli, A. Ghazali, I.C. da Cunha Lima, J. Appl. Phys. 85, 5944 (1999); M.A. Boselli, A. Ghazali, I.C. da Cunha Lima, Phys. Rev. B 62, 8895 (2000).
23. T. Andrearczyk, J. Jaroszyński, M. Sawicki, Le Van Khoi, T. Dietl, D. Ferrand, C. Bourgognon, J. Cibert, S. Tatarenko, T. Fukumura, Z. Jin, H. Koinuma, M. Kawasaki, in: Proceedings 25th International Conference on Physics of Semiconductors, Osaka, Japan, 2000, eds. N. Miura, T. Ando (Springer, Berlin 2001) p. 235.
24. T. Omiya, F. Matsukura, T. Dietl, Y. Ohno, T. Sakon, M. Motokawa, H. Ohno, Physica E 7, 976 (2000).
25. T. Dietl, H. Ohno, F. Matsukura, J. Cibert, D. Ferrand, Science 287, 1019 (2000).
26. T. Dietl, H. Ohno, F. Matsukura, Phys. Rev. B 63, 195205 (2001).
27. H. Boukari, P. Kossacki, M. Bertolini, D. Ferrand, J. Cibert, S. Tatarenko, A. Wasiela, J.A. Gaj, T. Dietl, Phys. Rev. Lett. 88, 207204 (2002).
28. K.W. Edmonds, K.Y. Wang, R.P. Campion, A.C. Neumann, C.T. Foxon, B.L. Gallagher, P.C. Main, Appl. Phys. Lett. 81, 3010 (2002).
29. K.W. Edmonds, K.Y. Wang, R.P. Campion, A.C. Neumann, N.R.S. Farley, B.L. Gallagher, C.T. Foxon, Appl. Phys. Lett. 81, 4991 (2002).
30. K.C. Ku, S.J. Potashnik, R.F. Wang, M.J. Seong, E. Johnston-Halperin, R.C. Meyers, S.H. Chun, A. Mascarenhas, A.C. Gossard, D.D. Awschalom, P. Schiffer, N. Samarth, e-print, http://arXiv.org/abs/cond-mat/0210426, 2002.
31. H. Ohno, J. Crystal Growth, 2003, in press.
32. J. König, T. Jungwirth, A.H. MacDonald, Phys. Rev. B 64, 184423 (2001).
33. T. Dietl, J. König, A.H. MacDonald, Phys. Rev. B 64, 241201(R) (2001).
34. J. Jaroszyński, T. Andrearczyk, G. Karczewski, J. Wróbel, T. Wojtowicz, E. Papis, E. Kamińska, A. Piotrowska, D. Popovic, T. Dietl, Phys. Rev. Lett. 89, 266802 (2002).
35. M. Abolfath, T. Jungwirth, J. Brum, A.H. MacDonald, Phys. Rev. B 63, 054418 (2001).
36. H. Munekata, A. Zaslavsky, P. Fumagalli, R.J. Gambino, Appl. Phys. Lett. 63, 2929 (1993).
37. H. Ohno, F. Matsukura, A. Shen, Y. Sugawara, A. Oiwa, A. Endo, S. Katsumoto, Y. Iye, in: Proceedings 23rd International Conference on the Physics of Semiconductors, Berlin 1996, eds. M. Scheffler and R. Zimmermann (World Scientific, Singapore 1996) p. 405; A. Shen, H. Ohno, F. Matsukura, Y. Sugawara, N. Akiba, T. Kuroiwa, A. Oiwa, A. Endo, S. Katsumoto, Y. Iye, J. Cryst. Growth 175/176, 1069 (1997).
38. M. Sawicki, F. Matsukura, T. Dietl, G.M. Schott, C. Ruester, G. Schmidt, L.W. Molenkamp, G. Karczewski, J. Superconductivity/Novel Magnetism, in press, 2002; e-print, http://arXiv.org/abs/cond-mat/0212511.
39. K. Takamura, F. Matsukura, D. Chiba, H. Ohno, Appl. Phys. Lett. 81, 2590 (2002).
40. T. Dietl, J. Cibert, D. Ferrand, Y. Merle d'Aubigné, Materials Sci. Engin. B 63, 103 (1999).
41. P. Kossacki, D. Ferrand, A. Arnoult, J. Cibert, S. Tatarenko, A. Wasiela, Y. Merle d'Aubigné, K. Świątek, M. Sawicki, J. Wróbel, W. Bardyszewski, T. Dietl, Physica E 6, 709 (2000).
42. S. Koshihara, A. Oiwa, M. Hirasawa, S. Katsumoto, Y. Iye, C. Urano, H. Takagi, H. Munekata, Phys. Rev. Lett. 78, 617 (1997).
43. H. Ohno, D. Chiba, F. Matsukura, T. Omiya, E. Abe, T. Dietl, Y. Ohno, K. Ohtani, Nature 408, 944 (2000).
44. Y. Ohno, D.K. Young, B. Beschoten, F. Matsukura, H. Ohno, D.D. Awschalom, Nature 402, 790 (1999).
45. D.K. Young, E. Johnston-Halperin, D.D. Awschalom, Y. Ohno, H. Ohno, Appl. Phys. Lett. 80, 1598 (2002).
46. S.-R. Eric Yang, J. Sinova, T. Jungwirth, Y.P. Shim, A.H. MacDonald, Phys. Rev. B 67, 045205 (2003).
47. E.L. Nagaev, Phys. Rep. 346, 387 (2001).
48. Sh.U. Yuldashev, Hyunsik Im, V.Sh. Yalishev, C.S. Park, T.W. Kang, Y. Sasaki, X. Liu, J.K. Furdyna, Appl. Phys. Lett. 82, 1206 (2003).
49. T. Dietl, F. Matsukura, H. Ohno, J. Cibert, D. Ferrand, in: Proceedings of NATO Workshop, Les Houches, France, February 2002 (Kluwer, Dordrecht 2003), in press.
50. T. Wojtowicz, T. Dietl, M. Sawicki, W. Plesiewicz, J. Jaroszyński, Phys. Rev. Lett. 56, 2419 (1986).
51. T. Jungwirth, M. Abolfath, J. Sinova, J. Kucera, A.H. MacDonald, Appl. Phys. Lett. 81, 4029 (2002).
52. T. Jungwirth, Qian Niu, A.H. MacDonald, Phys. Rev. Lett. 88, 207208 (2002).
53. T. Jungwirth, J. König, J. Sinova, J. Kucera, A.H. MacDonald, Phys. Rev. B 66, 012402 (2002).
54. K. Sato, H. Katayama-Yoshida, Semicond. Sci. Technol. 17, 367 (2002).
55. Y.D. Park, A.T. Hanbicki, S.C. Erwin, C.S. Hellberg, J.M. Sullivan, J.E. Mattson, T.F. Ambrose, A. Wilson, G. Spanos, B.T. Jonker, Science 295, 652 (2002).
56. N. Theodoropoulou, A.F. Hebard, M.E. Overberg, C.R. Abernathy, S.J. Pearton, S.N.G. Chu, R.G. Wilson, Phys. Rev. Lett. 89, 107203 (2002).
57. S. Kuwabara, T. Kondo, T. Chikyow, P. Ahmet, H. Munekata, Jpn. J. Appl. Phys. 40, L724 (2001); N. Theodoropoulou, A.F. Hebard, M.E. Overberg, C.R. Abernathy, S.J. Pearton, S.N.G. Chu, R.G. Wilson, Appl. Phys. Lett. 78, 3475 (2001); M.L. Reed, N.A. El-Masry, H.H. Stadelmaier, M.K. Ritums, M.J. Reed, C.A. Parker, J.C. Roberts, S.M. Bedair, Appl. Phys. Lett. 79, 3473 (2001); M.E. Overberg, C.R. Abernathy, S.J. Pearton, N.A. Theodoropoulou, K.T. MacCarthy, A.F. Hebard, Appl. Phys. Lett. 79, 1312 (2001).
58. S. Sonoda, S. Shimizu, T. Sasaki, Y. Yamamoto, H. Hori, J. Cryst. Growth 237–239, 1358 (2002).
59. M. Zając, J. Gosk, M. Kamińska, A. Twardowski, T. Szyszko, S. Podsiadło, Appl. Phys. Lett. 79, 2432 (2001).
60. K. Ando, Appl. Phys. Lett. 82, 100 (2003).
Spin-Galvanic Effect and Spin Orientation Induced Circular Photogalvanic Effect in Quantum Well Structures Sergey Ganichev Fakultät für Physik, Universität Regensburg D-93040 Regensburg, Germany Abstract. The spin-galvanic effect and the spin polarization induced circular photogalvanic effect generated by homogeneous optical excitation with circularly polarized radiation in quantum wells (QWs) are reviewed. In both effects the current flow is driven by an asymmetric distribution of spin polarized carriers in k-space of systems with lifted spin degeneracy due to k-linear terms in the Hamiltonian. Spin photocurrents provide methods to investigate spin relaxation in the condition of monopolar spin orientation and to conclude on the in-plane symmetry of QWs.
1
Introduction
The spin degree of freedom of charge carriers and its manipulation have become a hot topic in materials science in view of spin-based electronic devices (for a review see [1]). One of the most frequently used and powerful methods of generation and investigation of spin polarization is optical orientation [2]. Optical generation of an unbalanced spin distribution in a semiconductor may lead to electrical currents driven by the optically generated spin polarization. Spin photocurrents may be caused by an inhomogeneous spin distribution obtained due to inhomogeneous optical excitation [3,4] or inhomogeneities of materials like p-n junctions [5], as well as by simultaneous one- and two-photon coherent excitation of proper polarization [6]. Here we review a new property of the electron spin in a homogeneous spin-polarized two-dimensional electron gas: its ability to drive an electric current if the QW belongs to one of the gyrotropic classes. Recently it was demonstrated that an excitation of QWs with circularly polarized radiation leads to a current whose direction depends on the helicity of the incident light [7]. This effect belongs to the class of photogalvanic effects which were intensively studied in bulk semiconductors (for a review see [8,9]) and represents a circular photogalvanic effect (CPGE). It was shown in [10] that in gyrotropic QW structures the CPGE is caused by spin orientation of carriers in systems with band splitting in k-space due to k-linear terms in the Hamiltonian [11,12]. A homogeneous irradiation of QWs with circularly polarized light results in a non-uniform distribution of photoexcited carriers in k-space due to optical selection rules and conservation laws, which leads to a current [13].
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 427–442, 2003. © Springer-Verlag Berlin Heidelberg 2003
Furthermore, a thermalized but spin-polarized electron gas can drive an electrical current [14]. Recently it was demonstrated that in gyrotropic QWs a homogeneous spin polarization obtained by any means yields a current [15]. This phenomenon is referred to as the spin-galvanic effect (SGE). While electrical currents are usually generated by electric fields or gradients, in this case a uniform non-equilibrium population of electron spins gives rise to an electric current. The microscopic origin of the SGE is an inherent asymmetry of spin-flip scattering of electrons in systems with removed k-space spin degeneracy of the band structure. This effect has been demonstrated by optical orientation [15,16] and therefore also represents a spin photocurrent. The CPGE and the SGE have in common that the current flow is driven by an asymmetric distribution of carriers in k-space in systems with lifted spin degeneracy. The crucial difference between the two effects is that the spin-galvanic effect may be caused by any means of spin injection, while the spin orientation induced CPGE needs optical excitation. Even if the spin-galvanic effect is achieved by optical spin orientation, the microscopic mechanisms are different. The spin-galvanic effect is caused by asymmetric spin-flip scattering of spin polarized carriers and is determined by the process of spin relaxation. If spin relaxation is absent, the spin-galvanic current vanishes. In contrast, the CPGE is the result of selective photoexcitation of carriers in k-space and depends on momentum relaxation. Both spin photocurrents have been observed in n- and p-type QWs based on various semiconductor materials under very different types of optical excitation, using several lasers at wavelengths ranging from the visible to the far-infrared.
2
Samples and Experimental Technique
The experiments were carried out on GaAs [13,17], InAs [10,15], asymmetric SiGe [18], and BeZnMnSe [19] QW structures belonging to two different classes of symmetry. The structures of higher symmetry were (001)-oriented QWs which, as our measurements showed, corresponded to the point group C2v [13]. Structures of the lower symmetry class Cs were (113)-oriented QWs. Samples of n- and p-type QWs with widths Lw from 7 nm to 20 nm and free-carrier densities from $10^{11}$ cm$^{-2}$ to $2 \cdot 10^{12}$ cm$^{-2}$ were studied. For optical excitation mid-infrared (MIR), far-infrared (FIR) and visible laser radiation was used. Most of the measurements were carried out in the infrared with photon energies less than the energy gap εg. Depending on the photon energy and the QW band structure, the MIR and FIR radiation induces direct transitions between size-quantized subbands or, at longer wavelengths, Drude absorption. A pulsed TEA-CO2 laser and a molecular FIR laser [20] have been used as radiation sources in the spectral range between 9.2 µm and 496 µm. Some experiments in the MIR have been carried out making use of the tunability of the free-electron laser “FELIX” [21]. For optical inter-band excitation a cw Ti:sapphire laser was used, providing radiation with λ = 0.777 µm and a radiation power P of about 100 mW. The circular polarization has been obtained using a Fresnel rhomb, λ/4 plates, and a photoelastic modulator for MIR, FIR and visible radiation, respectively. The helicity Pcirc of the incident light was varied from −1 (left-handed circular, σ−) to +1 (right-handed circular, σ+) according to Pcirc = sin 2ϕ, where the phase angle ϕ is the angle between the initial plane of polarization and the optical axis of the polarizer. Samples were studied in the temperature range from 4.2 K to 300 K. The photocurrent jx was measured in the unbiased structures via the voltage drop across a 50 Ω load resistor in a closed circuit configuration [7] (see Fig. 1(c)). The current in the case of excitation with visible radiation was recorded by a lock-in amplifier in phase with the photoelastic modulator.
Fig. 1. Oscilloscope traces obtained for pulsed excitation of (113)-grown n-type GaAs QWs at λ = 10.6 µm and normal incidence. (a) and (b) show CPGE signals, (c) the measurement arrangement and (d) a signal pulse of a fast photon drag detector. For (001)-grown QWs oblique incidence was used in order to obtain helicity dependent current
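The two relations used in the measurement, Pcirc = sin 2ϕ for the helicity and Ohm's law for the current through the 50 Ω load, can be sketched numerically. This is a minimal illustration only: the function names and the 1 mV example signal are assumptions made here and are not taken from the experiments.

```python
import math

def helicity(phi_deg):
    """Degree of circular polarization, P_circ = sin(2*phi), phi in degrees."""
    return math.sin(2.0 * math.radians(phi_deg))

def photocurrent_from_voltage(voltage_v, load_ohm=50.0):
    """Current through the load resistor from the measured voltage drop (Ohm's law)."""
    return voltage_v / load_ohm

# Helicity at a few phase angles: +1 corresponds to sigma+, -1 to sigma-
for phi in (0.0, 45.0, 90.0, 135.0, 180.0):
    print(f"phi = {phi:5.1f} deg  ->  P_circ = {helicity(phi):+.3f}")

# Hypothetical 1 mV signal across the 50 Ohm load
print("current for 1 mV:", photocurrent_from_voltage(1.0e-3), "A")
```

With this convention ϕ = 45° gives fully right-handed (σ+) light and ϕ = 135° fully left-handed (σ−) light, while ϕ = 0°, 90° and 180° correspond to linear polarization.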
3
Spin Polarization Induced CPGE
The CPGE appears due to the asymmetry of the momentum distribution of photoexcited carriers in homogeneous samples. The microscopic origin of this current is the conversion of photon angular momentum into directed motion of carriers. The experimental data can be described by simple analytical expressions derived from a phenomenological theory which shows that the effect can only be present in gyrotropic media. This requirement rules out effects depending on the helicity of the radiation in non-optically active materials.
3.1
Experiment: General Features
With illumination of QW structures by polarized radiation a current signal proportional to the helicity Pcirc has been observed in unbiased samples [7]. The signal follows the temporal structure of the applied 100 ns laser pulses and reverses its sign upon switching the polarization from σ+ to σ− (see Fig. 1). The radiation-induced current and its characteristic helicity dependence shown in Fig. 2 reveal that we are dealing with the CPGE. In (001)-oriented samples a helicity dependent signal is only observed under oblique incidence [10]. A variation of the angle of incidence Θ0 in the incidence plane around Θ0 = 0° changes the sign of the current. For light propagating along a ⟨110⟩ direction the photocurrent flows perpendicular to the wavevector of the incident light (see Fig. 2(a)). For illumination along a cubic axis ⟨100⟩ both a transverse and a longitudinal CPGE current are detected [13]. In samples grown on a (113)-GaAs surface or on (001)-miscut substrates representing the lower symmetry class Cs, the CPGE has been observed also under normal incidence of radiation [10] as shown in Fig. 2(b). The current does not change its sign upon variation of Θ0 and assumes its maximum at Θ0 = 0°. This is in contrast to (001)-oriented samples and in accordance with the phenomenological theory of the CPGE for Cs. For normal incidence in this symmetry the current always flows along the $[1\bar{1}0]$ direction perpendicular to the plane of mirror reflection of the point group Cs.
Fig. 2. Photocurrent in QWs normalized by the light power P as a function of the phase angle ϕ defining helicity. Measurements are presented for T = 293 K and λ = 76 µm. (a) Oblique incidence of radiation with an angle of incidence Θ0 = −30° on (001)-grown n-InAs QWs (symmetry class C2v). (b) Normal incidence of radiation (Θ0 = 0°) on (113)A-grown p-GaAs MQWs (symmetry class Cs). Full lines show ordinate scale fits after Eqs. (2) and (3) for the top and lower panel, respectively. Insets: experimental setup. [Plot not reproduced: vertical axis jx/P in 10^-9 A/W, horizontal axis ϕ from 0° to 180°.]
3.2
Phenomenology
Phenomenologically the CPGE current j can be described as [9]
$$ j_\lambda = \sum_\mu \gamma_{\lambda\mu}\, i(\mathbf{E} \times \mathbf{E}^*)_\mu , \qquad (1) $$
where γ is a pseudo-tensor, E is the complex amplitude of the radiation electric field, $i(\mathbf{E} \times \mathbf{E}^*)_\mu = \hat{e}_\mu E_0^2 P_{\mathrm{circ}}$, and $E_0$ and $\hat{\mathbf{e}}$ are the electric field amplitude and the unit vector pointing in the direction of light propagation, respectively. In general, in addition to the CPGE current given in Eq. (1), two other photocurrents can be present simultaneously, namely the linear photogalvanic effect (LPGE) and the photon drag effect [9]. Both effects were observed in low-dimensional structures (for a review see [13]). These photocurrents do not change in sign or amplitude if the polarization is switched from σ+ to σ−, which allows them to be distinguished from the CPGE. They do not require spin orientation and are outside the scope of the present investigation. In the following we analyze Eq. (1) for symmetries relevant to experiment. Hereafter we use for (001)-grown QWs the Cartesian coordinates $x \parallel [1\bar{1}0]$, $y \parallel [110]$, $z \parallel [001]$ and for (113)-grown QWs the coordinates $x' = x \parallel [1\bar{1}0]$, $y' \parallel [33\bar{2}]$, and $z' \parallel [113]$. For C2v symmetry the photocurrent is given by
$$ j_x = \gamma_{xy}\, \hat{e}_y E_0^2 P_{\mathrm{circ}} , \qquad j_y = \gamma_{yx}\, \hat{e}_x E_0^2 P_{\mathrm{circ}} . \qquad (2) $$
If $\hat{\mathbf{e}}$ is along ⟨110⟩ then the current flows normal to the light propagation direction. If the sample is irradiated with $\hat{\mathbf{e}}$ parallel to ⟨100⟩ the current is neither parallel nor perpendicular to the light propagation direction. Another conclusion from Eq. (2) is that in QWs of C2v symmetry the photocurrent can only be induced under oblique incidence of radiation. For normal incidence $\hat{\mathbf{e}}$ is parallel to [001] and hence the current vanishes. In contrast to this result, in QWs of Cs symmetry a photocurrent also occurs for normal incidence of the radiation because the tensor γ has an additional component $\gamma_{xz'}$. The current here is given by
$$ j_x = (\gamma_{xy'}\, \hat{e}_{y'} + \gamma_{xz'}\, \hat{e}_{z'}) E_0^2 P_{\mathrm{circ}} , \qquad j_{y'} = \gamma_{y'x}\, \hat{e}_x E_0^2 P_{\mathrm{circ}} . \qquad (3) $$
At normal incidence, $\hat{e}_x = \hat{e}_{y'} = 0$ and $\hat{e}_{z'} = 1$, the current in the QW flows along x, i.e. perpendicular to the mirror reflection plane. The dependence of the photocurrent on the angle of incidence Θ0 is determined by the value of the projection of $\hat{\mathbf{e}}$ on the x- (y-) axis (see Eq. (2)) or on the z'-axis (see Eq. (3)). The phenomenological picture outlined above perfectly describes the experimental observations [13].
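The symmetry statements drawn from Eqs. (2) and (3) can be checked with a short numerical sketch. The tensor components below are hypothetical values in arbitrary units, the propagation angle is taken inside the sample (refraction is ignored), and the function names are introduced here for illustration only.

```python
import math

def propagation_unit_vector(theta_deg, azimuth_deg):
    """Unit vector e_hat of light propagation in the sample frame.

    theta_deg   : angle between e_hat and the growth direction (assumed to be
                  the angle inside the sample; refraction is ignored)
    azimuth_deg : in-plane angle of the plane of incidence, measured from x
    """
    theta = math.radians(theta_deg)
    phi = math.radians(azimuth_deg)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cpge_c2v(e_hat, gamma_xy, gamma_yx, e0_sq=1.0, p_circ=1.0):
    """Eq. (2): in-plane CPGE current (j_x, j_y) for C2v symmetry."""
    ex, ey, _ = e_hat
    return (gamma_xy * ey * e0_sq * p_circ,
            gamma_yx * ex * e0_sq * p_circ)

def cpge_cs(e_hat, gamma_xy, gamma_xz, gamma_yx, e0_sq=1.0, p_circ=1.0):
    """Eq. (3): in-plane CPGE current (j_x, j_y') for Cs symmetry.

    Here the second and third components of e_hat refer to the y' and z' axes.
    """
    ex, ey, ez = e_hat
    return ((gamma_xy * ey + gamma_xz * ez) * e0_sq * p_circ,
            gamma_yx * ex * e0_sq * p_circ)

# Hypothetical tensor components in arbitrary units (not measured values)
G_XY, G_YX, G_XZ = 1.0, 0.6, 0.3

normal = propagation_unit_vector(0.0, 0.0)
print("C2v, normal incidence:", cpge_c2v(normal, G_XY, G_YX))
print("Cs,  normal incidence:", cpge_cs(normal, G_XY, G_XZ, G_YX))

# Oblique incidence with the plane of incidence containing the y axis
for theta in (-30.0, 30.0):
    e = propagation_unit_vector(theta, 90.0)
    print("C2v, theta =", theta, "->", cpge_c2v(e, G_XY, G_YX))
```

In this sketch the C2v current vanishes at normal incidence and reverses with the sign of the incidence angle, whereas the Cs current at normal incidence is carried entirely by the additional tensor component and points along x.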
3.3
CPGE at Inter-band Transitions
The CPGE at inter-band transitions is most easily understood from the schematic band structure shown in Fig. 3 [10]. Microscopically a conversion of photon helicity into a spin photocurrent arises due to k-linear terms in the effective Hamiltonian $\hat{H} = \sum_{lm} \beta_{lm} \sigma_l k_m$, where k is the electron wavevector, $\sigma_l$ are the Pauli spin matrices and $\beta_{lm}$ are real coefficients. The coefficients $\beta_{lm}$ form a pseudo-tensor subject to the same symmetry restrictions as the transposed pseudo-tensor γ. The sources of k-linear terms are the bulk inversion asymmetry (BIA), also called the Dresselhaus term [12] (including a possible interface inversion asymmetry [22]), and possibly a structural inversion asymmetry (SIA), usually called the Rashba term [11]. For the sake of simplicity we take into account a band structure consisting only of the lowest conduction subband e1 and the highest heavy-hole subband hh1, whose energy dispersions are described by $\varepsilon_{e1,\pm 1/2}(k) = (\hbar^2 k_x^2 / 2 m_{e1}) \pm \beta_{e1} k_x + \varepsilon_g$ and $\varepsilon_{hh1,\pm 3/2}(k) = -[(\hbar^2 k_x^2 / 2 m_{hh1}) \pm \beta_{hh1} k_x]$, respectively. For absorption of circularly polarized radiation of photon energy ħω, energy and momentum conservation allow transitions only for two values of $k_x$. We consider a QW of Cs symmetry where due to selection rules the optical transitions occur from ms = −3/2 to ms = −1/2 for right-handed polarized light and from ms = 3/2 to ms = 1/2 for left-handed polarized light. Here ms are the spin quantum numbers of the electron states. The corresponding transitions for, for instance, σ+ photons occur at
$$ k_x^{\pm} = +\frac{\mu}{\hbar^2}(\beta_{e1} + \beta_{hh1}) \pm \sqrt{\frac{\mu^2}{\hbar^4}(\beta_{e1} + \beta_{hh1})^2 + \frac{2\mu}{\hbar^2}(\hbar\omega - \varepsilon_g)} , \qquad (4) $$
and are shown in Fig. 3(a) by the solid vertical arrows. Here $\mu = (m_{e1} \cdot m_{hh1})/(m_{e1} + m_{hh1})$. The ‘center of mass’ of these transitions is shifted from the point $k_x = 0$ by $(\beta_{e1} + \beta_{hh1})(\mu/\hbar^2)$. Thus the sum of the electron velocities in the excited states in the conduction band is non-zero, resulting in a spin polarized net current. Switching the polarization from σ+ to σ− mirrors the picture and the current direction reverses. The microscopic theory of the CPGE in QWs for inter-band excitation was worked out in [23,24]. The CPGE at inter-band absorption has not been observed experimentally as yet. Strong spurious photocurrents due to other mechanisms like the Dember effect, photovoltaic effects at contacts etc. mask the relatively weak CPGE. However, application of polarization selective measurements, like modulation of polarization, should allow the CPGE current to be extracted. In the infrared range, where the effects mentioned above vanish, the CPGE caused by inter- or intra-subband transitions has been observed experimentally.
Fig. 3. Microscopic picture of the CPGE at direct transitions in Cs point group for (a) inter-band transitions and (b) for transitions between size-quantized subbands in the conduction band. Currents shown for one subband only. [Schematic not reproduced: energy ε versus kx with the spin-split branches e1 (±1/2) and hh1 (±3/2) in panel (a) and e1 (±1/2) and e2 (±1/2) in panel (b); the σ+ transitions are marked at kx− and kx+, and jx indicates the current.]
3.4
CPGE at Inter-subband Transitions
For direct transitions between size-quantized states in the conduction or valence band the model is very similar to that of the inter-band transitions discussed above [25]. In Fig. 3(b) we sketch the situation for QWs of Cs symmetry. Due to selection rules optical transitions for monochromatic, say σ+, radiation occur only at a fixed $k_x^+$ where the energy of the incident light matches the transition energy, as indicated by the arrow in Fig. 3(b). Therefore optical transitions induce an imbalance of the momentum distribution in both subbands, yielding an electric current in the x direction with contributions from e1 and e2. As in n-type QWs the energy separation ε21 between e1 and e2 is typically larger than the energy of longitudinal optical phonons $\hbar\omega_{LO}$, the non-equilibrium distribution of electrons in e2 relaxes rapidly due to emission of phonons. As a result, the momentum relaxation time in the e2 subband, $\tau_p^{(2)}$, is much less than $\tau_p^{(1)}$, the momentum relaxation time in the e1 subband. Thus the current is mainly due to the photogenerated holes in the initial state of the resonant optical transition in the e1 subband. The microscopic theory of the CPGE for direct inter-subband transitions in n-type QWs for both Cs and C2v symmetry was developed in [25], with the result that the current is proportional to the derivative of the absorbance. For $\tau_p^{(2)}$ much less than $\tau_p^{(1)}$ it was shown that for Cs symmetry
$$ j_x \sim \left( \beta_{yx}^{(2)} + \beta_{yx}^{(1)} \right) \frac{d\, \eta_{12}(\hbar\omega)}{d\, \hbar\omega} \left( \tau_p^{(1)} - \tau_p^{(2)} \right) I\, P_{\mathrm{circ}}\, \hat{e}_y , \qquad (5) $$
where $\beta_{yx}^{(1)}$ and $\beta_{yx}^{(2)}$ are components of β in the e1 and e2 subbands, respectively. The change of sign of the photocurrent with photon energy may be understood from Fig. 3(b). It is seen that at large photon energy, ħω > ε21, excitation occurs at positive kx, resulting in a current jx shown by an arrow in Fig. 3(b). Decreasing the photon frequency shifts the transition towards negative kx and reverses the current direction. Similar arguments hold for C2v symmetry under oblique incidence. However, in contrast to the spin-flip processes occurring for Cs symmetry described above, in C2v symmetry due to selection rules the absorption of circularly polarized radiation is spin-conserving [9]. For this symmetry the CPGE is also proportional to the derivative of the absorption and is given by
$$ j_x \sim \left( \beta_{yx}^{(2)} - \beta_{yx}^{(1)} \right) \frac{d\, \eta_{12}(\hbar\omega)}{d\, \hbar\omega} \left( \tau_p^{(1)} - \tau_p^{(2)} \right) I\, P_{\mathrm{circ}}\, \hat{e}_y . \qquad (6) $$
Since the CPGE in QW structures of C2v symmetry is caused by spin-conserving optical transitions, the photocurrent described by Eq. (6), in contrast to Eq. (5), is proportional to the difference of the subband spin splittings. Experimentally the CPGE at resonant transitions was observed in n-type GaAs samples with QW widths from 8.2 to 8.6 nm. Direct optical transitions between the e1 and e2 subbands were excited by applying MIR radiation of the CO2 laser or the free-electron laser “FELIX”. A current signal proportional to the helicity has been observed at normal incidence in (113)-samples and at oblique incidence in (001)-oriented samples, indicating the CPGE [25]. In Fig. 4 the data are presented for a (001)-grown n-GaAs QW of 8.2 nm width. It is seen that in direction x the current for both σ+ and σ− radiation changes sign at the frequency of the absorption peak. The experimental results shown in Fig. 4, in particular the sign inversion of the spectral behaviour of the current, are in good agreement with the microscopic theory developed in [25] (see Eq. (6)). The CPGE at direct inter-subband transitions has also been observed in p-type GaAs QWs, demonstrating spin orientation of holes (see Fig. 2(b)) [10,17]. To achieve hh1–lh1 transitions FIR radiation was applied.
3.5
CPGE at Intra-subband Transitions (Drude Absorption)
Now we consider indirect intra-subband transitions. This situation is usually realized in the FIR range where the photon energy is not high enough to excite direct inter-subband transitions. Due to energy and momentum conservation intra-subband transitions can only occur by absorption of a photon and simultaneous absorption or emission of a phonon. This process is described by virtual transitions involving intermediate states. It can be shown that transitions via intermediate states within one and the same subband do not contribute to the spin photocurrent. However, spin selective indirect optical transitions excited by circularly polarized light can generate a spin current if the virtual processes involve intermediate states in different subbands [10]. Optical absorption caused by indirect transitions in n-type samples has been obtained by applying FIR radiation covering the range from 76 µm to 280 µm. The experiments were carried out on GaAs [10,23], InAs [10] and semimagnetic ZnBeMnSe [19] QWs. The energy separation between the e1 and e2 size-quantized subbands of those samples is much larger than the FIR photon energies used here. Therefore the absorption is caused by indirect intra-subband optical transitions. With illumination of (001)-grown QWs at oblique incidence of FIR radiation a current signal proportional to the helicity Pcirc has been observed (see Fig. 2(a)), showing that Drude absorption of a 2D electron gas results in spin orientation and the CPGE. The CPGE at intra-subband absorption was also observed in p-type samples at long wavelengths [7,18,23], where the photon energies are smaller than the energy separation between the heavy-hole and light-hole subbands.
Fig. 4. Photocurrent in QWs normalized by P as a function of the photon energy ħω. Measurements are presented for n-type (001)-grown GaAs QWs of 8.2 nm width (symmetry class C2v) at T = 293 K and oblique incidence of radiation with an angle of incidence Θ0 = 20°. The absorption of the MIR laser radiation results in direct transitions between e1 and e2 subbands. The current jx is perpendicular to the direction of light propagation. The dotted line shows the absorption measured using a Fourier spectrometer. [Plot not reproduced: jx/P in 10^-9 A/W and absorption in arbitrary units versus ħω from 115 meV to 135 meV, with curves for σ+ and σ−.]
3.6
CPGE in SiGe QWs
In symmetrical (001)-grown SiGe QWs no CPGE current has been detected, as expected from the presence of inversion symmetry in both materials. However, in artificially grown non-symmetric QWs the CPGE has been observed, caused by the Rashba spin-orbit coupling due to a built-in potential gradient in the QWs [18]. The CPGE in SiGe QWs has been detected in p-type structures at direct heavy-hole to light-hole transitions and at indirect intra-subband transitions.
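Returning to the inter-band picture of Sect. 3.3, the two resonant wavevectors of Eq. (4) and the shift of their ‘center of mass’ can be evaluated numerically. The sketch below works in meV and nm and uses purely hypothetical parameters (effective masses in units of the free-electron mass, k-linear coefficients in meV nm, excess photon energy in meV); none of these numbers is taken from the experiments reviewed here.

```python
import math

# hbar^2 / m0 in meV nm^2 (m0 is the free-electron mass)
HBAR2_OVER_M0 = 76.1996

def reduced_mass(m_e1, m_hh1):
    """Reduced mass mu = m_e1 * m_hh1 / (m_e1 + m_hh1), in units of m0."""
    return m_e1 * m_hh1 / (m_e1 + m_hh1)

def center_shift(m_e1, m_hh1, beta_e1, beta_hh1):
    """Shift (beta_e1 + beta_hh1) * mu / hbar^2 of the transitions, in 1/nm."""
    return reduced_mass(m_e1, m_hh1) / HBAR2_OVER_M0 * (beta_e1 + beta_hh1)

def resonant_kx(m_e1, m_hh1, beta_e1, beta_hh1, excess_mev):
    """Eq. (4): the two k_x (in 1/nm) of sigma+ transitions.

    excess_mev is the photon energy minus the gap, in meV.
    """
    a = reduced_mass(m_e1, m_hh1) / HBAR2_OVER_M0  # mu / hbar^2
    shift = a * (beta_e1 + beta_hh1)
    root = math.sqrt(shift ** 2 + 2.0 * a * excess_mev)
    return shift - root, shift + root

# Hypothetical GaAs-like parameters, chosen only for illustration
M_E1, M_HH1 = 0.067, 0.5        # effective masses in units of m0
BETA_E1, BETA_HH1 = 1.0, 2.0    # k-linear coefficients in meV nm
EXCESS = 10.0                   # photon energy above the gap in meV

k_minus, k_plus = resonant_kx(M_E1, M_HH1, BETA_E1, BETA_HH1, EXCESS)
print("k_x^- =", k_minus, "1/nm")
print("k_x^+ =", k_plus, "1/nm")
print("center of mass =", 0.5 * (k_minus + k_plus), "1/nm")
```

Because the two solutions are not symmetric about kx = 0, the velocities of the photoexcited carriers do not cancel, which is the origin of the net current in this picture.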
4
Spin-Galvanic Effect
The picture of the spin photocurrent given so far involved the asymmetry of the momentum distribution of photoexcited carriers, i.e. the CPGE. After momentum relaxation of the photoexcited carriers the CPGE vanishes; however, a spin orientation may still be present if the spin relaxation time is longer than the momentum relaxation time. In such a case an asymmetry of spin-flip scattering of non-equilibrium spin polarized carriers may contribute to the total current. This current is caused by the spin-galvanic effect (SGE) and in general it does not require photoexcitation.
4.1
Phenomenology
The SGE is due to spin relaxation of a uniform non-equilibrium spin polarization in QWs of gyrotropic symmetry [15]. Phenomenologically, an electric current can be linked to the electron’s averaged spin polarization S by
$$ j_\alpha = \sum_\gamma Q_{\alpha\gamma} S_\gamma . \qquad (7) $$
As in the case of the CPGE, Q is a pseudo-tensor subject to the same symmetry restrictions as γ. For the C2v symmetry of (001)-grown QWs only two linearly independent components, $Q_{xy}$ and $Q_{yx}$, may be non-zero, so that
$$ j_x = Q_{xy} S_y , \qquad j_y = Q_{yx} S_x . \qquad (8) $$
Hence, the SGE current needs a spin component lying in the plane of the QWs. In QWs of Cs symmetry an additional tensor component $Q_{xz}$ may be non-zero and the SGE may be caused by spins oriented normally to the plane of the QW.
4.2
Microscopic Model
Microscopically, the spin-galvanic effect is caused by asymmetric spin-flip relaxation of spin polarized electrons in systems with k-linear contributions to the effective Hamiltonian [15]. Fig. 5 sketches the electron energy spectrum along kx with the spin dependent term $\beta_{yx} \sigma_y k_x$. In this case $\sigma_y$ is a good quantum number. Spin orientation in the y-direction causes an unbalanced population in the subbands. The current flow is caused by k-dependent spin-flip relaxation processes. Spins oriented in the y-direction are scattered along kx from the more highly filled subband, e.g. the spin-up subband $|{+1/2}\rangle_y$, to the less filled spin-down subband $|{-1/2}\rangle_y$. Four quantitatively different spin-flip scattering events exist and are sketched in Fig. 5 by bent arrows. The spin-flip scattering rate depends on the values of the wavevectors of the initial and the final states, respectively [26]. Two scattering processes shown by broken arrows are inequivalent and generate an asymmetric carrier distribution around the subband minima in both subbands. This asymmetric population results in a current flow along the x-direction. The uniformity of the spin polarization in space is preserved during the scattering processes. Therefore the spin-galvanic effect differs from other experiments where the spin current is caused by inhomogeneities [3,4,5]. Up to now the spin-galvanic effect has been recorded at optical spin orientation caused by inter-band, inter-subband, as well as intra-subband transitions [15,16]. Note that the reverse process to the spin-galvanic effect, i.e. a spin polarization induced by an electric current flow, has been theoretically considered in [27,28].
Fig. 5. Microscopic origin of the spin-galvanic current in the presence of k-linear terms in the electron Hamiltonian. [Schematic not reproduced: energy ε versus kx for the spin branches $|{-1/2}\rangle_y$ and $|{+1/2}\rangle_y$, with the current jx indicated.]
4.3
Microscopic Theory
The microscopic theory of the spin-galvanic effect is presented following [16]. The occurrence of a current is due to the spin dependence of the electron scattering matrix elements $M_{k'k}$. The 2 × 2 matrix $\hat{M}_{k'k}$ can be written as a linear combination of the unit matrix $\hat{I}$ and the Pauli matrices as follows
$$ \hat{M}_{k'k} = A_{k'k} \hat{I} + \boldsymbol{\sigma} \cdot \mathbf{B}_{k'k} , \qquad (9) $$
where $A^*_{k'k} = A_{kk'}$, $\mathbf{B}^*_{k'k} = \mathbf{B}_{kk'}$ due to hermiticity of the interaction and $A_{-k',-k} = A_{kk'}$, $\mathbf{B}_{-k',-k} = -\mathbf{B}_{kk'}$ due to the symmetry under time inversion. The spin-dependent part of the scattering amplitude is given by [26]
$$ \boldsymbol{\sigma} \cdot \mathbf{B}_{k'k} = v(\mathbf{k} - \mathbf{k}') \left[ \sigma_x (k'_y + k_y) - \sigma_y (k'_x + k_x) \right] . \qquad (10) $$
We note that Eq. (10) determines the spin relaxation time, $\tau_s'$, due to the Elliott-Yafet mechanism. The spin-galvanic current has the form [16]
$$ j_{SGE,x} = Q_{xy} S_y \sim e\, n_e \frac{\beta_{yx}^{(1)}}{\hbar} \frac{\tau_p}{\tau_s'} S_y , \qquad j_{SGE,y} = Q_{yx} S_x \sim e\, n_e \frac{\beta_{xy}^{(1)}}{\hbar} \frac{\tau_p}{\tau_s'} S_x . \qquad (11) $$
Since scattering is the origin of the SGE, the spin-galvanic current is determined by the Elliott-Yafet spin relaxation time. The relaxation time $\tau_s'$ is proportional to the momentum relaxation time $\tau_p$. Therefore the ratio $\tau_p / \tau_s'$ in Eq. (11) does not depend on the momentum relaxation time. The in-plane average spin $S_x$ in Eq. (11) decays with the total spin relaxation time $\tau_s$. Thus the time decay of the spin-galvanic current following the pulsed photoexcitation is determined by $\tau_s$.
4.4
SGE at Optical Orientation in Magnetic Field
Excitation of QWs by circularly polarized light results in a spin polarization which, at proper orientation of the electron spins, causes the spin-galvanic effect. Because of the tensor equivalence the spin-galvanic current induced by circularly polarized light always occurs simultaneously with the CPGE. It has been recently shown [16] that at inter-subband transitions the spin-galvanic effect may be separated from the CPGE by making use of the spectral behaviour at resonance. This will be discussed in more detail in the next section. Another possibility to investigate the spin-galvanic effect without contributions of the CPGE to the current has been introduced in [15]. The spin polarization was obtained by absorption of circularly polarized radiation at normal incidence on (001)-grown QWs as depicted in the inset of Fig. 6. For normal incidence the CPGE as well as the spin-galvanic effect vanish because $\hat{e}_x = \hat{e}_y = 0$ (see Eq. (2)) and $S_x = S_y = 0$ (see Eq. (8)), respectively. Thus, we have a spin orientation along the z coordinate but no spin photocurrent. To obtain an in-plane component of the spins, necessary for the spin-galvanic effect, a magnetic field $B \parallel x$ has been applied. Due to Larmor precession a non-equilibrium spin polarization $S_y$ is induced, given by
$$ S_y = -\frac{\omega_L \tau_{s\perp}}{1 + (\omega_L \tau_s)^2}\, S_{0z} , \qquad (12) $$
where $\tau_s = \sqrt{\tau_{s\parallel} \tau_{s\perp}}$, $\tau_{s\parallel}$ and $\tau_{s\perp}$ are the longitudinal and transverse electron spin relaxation times, and $\omega_L$ is the Larmor frequency. The denominator in Eq. (12), yielding the decay of $S_y$ for $\omega_L \tau_s > 1$, is well known from the Hanle effect [2]. Both for visible and for infrared radiation, a current due to the SGE has been observed for all (001)-grown n-type GaAs and InAs samples after applying an in-plane magnetic field (Figs. 6-7) [15,29]. For low magnetic fields B where $\omega_L \tau_s < 1$ holds, the current increases linearly as expected from Eqs. (8) and (12). The polarity of the current depends on the direction of the excited spins, determined by the radiation helicity, and on the direction of the applied magnetic field. For a magnetic field applied along ⟨110⟩ the current flows along the magnetic field. For $B \parallel \langle 100 \rangle$ both the transverse and the longitudinal effects are observed [13].
Fig. 6. Spin-galvanic current jx normalized by P as a function of magnetic field B at T = 293 K for various samples. Data for GaAs/AlGaAs heterostructures at normal incidence of circularly polarized radiation with λ = 0.777 µm, P = 100 mW and λ = 148 µm, P = 20 kW. Inset: experimental setup. [Plot not reproduced: jx/P in 10^-9 A/W versus Bx from −800 mT to 800 mT; legend entries: n-QW, p-QWs and n-heterojunctions, at λ = 0.777 µm and λ = 148 µm.]
Fig. 7. Spin-galvanic current jx as a function of magnetic field B for normally incident circularly polarized radiation at λ = 148 µm and radiation power 20 kW. Solid and dashed curves are fitted after Eqs. (8) and (12) using the same value of the spin relaxation time τs and scaling of the ordinate. [Plot not reproduced: n-GaAs/AlGaAs heterojunction at T = 4.2 K; jx in mA versus Bx from −4 T to 4 T, with curves for right and left circularly polarized light.]
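The field dependence expected from Eqs. (8) and (12) can be sketched numerically. The g-factor and the relaxation times below are assumptions chosen only to give a peak at a few tesla; they are not the fitted values of Fig. 7, and the function names are introduced here for illustration.

```python
import math

# Bohr magneton divided by hbar, in rad s^-1 T^-1
MU_B_OVER_HBAR = 8.794e10

def larmor_frequency(g_factor, b_tesla):
    """Larmor frequency omega_L = g * mu_B * B / hbar."""
    return g_factor * MU_B_OVER_HBAR * b_tesla

def spin_sy(b_tesla, g_factor, tau_par, tau_perp, s0z=1.0):
    """Eq. (12): in-plane spin S_y induced by Larmor precession."""
    omega_l = larmor_frequency(g_factor, b_tesla)
    tau_s = math.sqrt(tau_par * tau_perp)
    return -omega_l * tau_perp / (1.0 + (omega_l * tau_s) ** 2) * s0z

def peak_field(g_factor, tau_par, tau_perp):
    """Field at which |omega_L| * tau_s = 1, where |S_y| is largest."""
    tau_s = math.sqrt(tau_par * tau_perp)
    return 1.0 / (abs(g_factor) * MU_B_OVER_HBAR * tau_s)

# Assumed parameters: GaAs-like g-factor, hypothetical 10 ps relaxation times
G = -0.44
TAU_PAR = TAU_PERP = 1.0e-11

b_peak = peak_field(G, TAU_PAR, TAU_PERP)
print("peak field:", b_peak, "T")
for b in (0.25 * b_peak, b_peak, 4.0 * b_peak):
    # By Eq. (8) the current j_x = Q_xy * S_y follows the same field dependence
    print("B =", b, "T  ->  S_y / S_0z =", spin_sy(b, G, TAU_PAR, TAU_PERP))
```

Reading the peak position in reverse, τs = ħ/(|g| µB B_peak), is the route to the spin relaxation time that the Hanle measurement provides.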
For higher magnetic fields the current assumes a maximum and decreases upon further increase of B, as shown in Fig. 7. This drop of the current is ascribed to the Hanle effect [2]. The experimental data are well described by Eqs. (8) and (12). The observation of the Hanle effect demonstrates that free carrier intra-subband transitions can polarize the spins of electron systems. The measurements allow the spin relaxation time τs to be obtained from the peak position of the photocurrent, where ωL τs = 1 holds [15].
4.5
SGE at Optical Orientation without Magnetic Field
In the experiments described above an external magnetic field was used for re-orientation of an optically generated spin polarization. The SGE can also occur at optical excitation only, without application of an external magnetic field. The necessary in-plane component of the spin polarization can be obtained by oblique incidence of the exciting circularly polarized radiation, but in this case the CPGE may also occur, interfering with the SGE. However, a pure SGE may be obtained at inter-subband transitions in n-type GaAs QWs [16]. The spin orientation is generated by spin-selective optical excitation followed by spin-non-specific thermalization. The magnitude of the spin polarization and hence the current depends on the initial absorption strength but not on the wavevector k of the transition. Thus the spin-galvanic current is proportional to the absorbance [29]. In contrast, as shown above, the spectrum of the CPGE changes sign and vanishes in the center of the resonance (see Fig. 4 and [25]). Thus, if a measurable helicity dependent current is present in the center of the resonance it must be attributed to the SGE.
Fig. 8. Photocurrent in QWs normalized by the light power P at oblique incidence of right-handed circularly polarized radiation on n-type (001)-grown GaAs/AlGaAs QWs of 8.2 nm width at T = 293 K as a function of the photon energy ħω. Circles: current in [110] direction in response to irradiation parallel to $[1\bar{1}0]$. Rectangles: current in $[1\bar{1}0]$ direction in response to irradiation parallel to [110]. Dotted line shows the absorption measured using a Fourier transform spectrometer. [Plot not reproduced: j/P in 10^-10 A/W and absorption in arbitrary units versus ħω from 80 meV to 180 meV; n-GaAs QWs, T = 293 K, σ+ radiation.]
Experiments have been carried out making use of the spectral tunability of the free-electron laser “FELIX” [21]. The photon energy dependence of the current was measured for incidence in two different planes with components of propagation along the x- and y-directions (see Fig. 8). It can be seen that for a current along x the spectral shape is similar to the derivative of the absorption spectrum, indicating the CPGE as already discussed above. When the sample was rotated by 90° about z, the sign change in the current, now along y, disappears and its spectral shape follows more closely the absorption spectrum. The lack of a sign change for the current along $y \parallel [110]$ in the experiment shows that the spin-galvanic effect dominates for this orientation. The non-equivalence of the [110]- and $[1\bar{1}0]$-axes is caused by the interplay of BIA and SIA terms in the Hamiltonian. It has been shown in [16] that in directions y and x the SGE and the CPGE are proportional to terms with the sum and the difference of BIA and SIA terms, respectively. For our samples it appears that in the case where they subtract, the CPGE dominates over the SGE. Conversely, when the BIA and SIA terms add, the CPGE is suppressed and the SGE dominates, consistent with the lack of sign change for the current along the y-direction (see Fig. 8).
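The spectral argument used to separate the two effects can be illustrated with a toy model in which the CPGE follows the derivative of the absorbance and the SGE follows the absorbance itself. The Lorentzian line shape, the resonance position, the linewidth and the two weights are assumptions made only for this sketch.

```python
def absorbance(energy, e_res, width):
    """Assumed Lorentzian absorbance, normalized to 1 at the resonance."""
    return width ** 2 / ((energy - e_res) ** 2 + width ** 2)

def d_absorbance(energy, e_res, width):
    """Analytic derivative of the Lorentzian with respect to photon energy."""
    delta = energy - e_res
    return -2.0 * delta * width ** 2 / (delta ** 2 + width ** 2) ** 2

def total_current(energy, e_res, width, a_cpge, b_sge):
    """Toy helicity-dependent current: CPGE ~ d(eta)/dE plus SGE ~ eta."""
    return (a_cpge * d_absorbance(energy, e_res, width)
            + b_sge * absorbance(energy, e_res, width))

# Hypothetical resonance at 130 meV with a 5 meV half width
E_RES, WIDTH = 130.0, 5.0

for label, a_cpge, b_sge in (("CPGE dominated", 4.0, 0.1),
                             ("SGE dominated", 0.2, 1.0)):
    values = [total_current(e, E_RES, WIDTH, a_cpge, b_sge)
              for e in (120.0, 125.0, 130.0, 135.0, 140.0)]
    print(label, [round(v, 3) for v in values])
```

In this model the derivative term vanishes exactly at the center of the resonance, so whatever current remains there comes from the term proportional to the absorbance, which mirrors the reasoning applied to the measured spectra.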
5
Summary
A non-equilibrium uniform spin polarization obtained by optical orientation drives an electric current in QWs if they belong to a gyrotropic crystal class. Two different microscopic mechanisms of spin photocurrents can be distinguished, the CPGE and the spin-galvanic effect. In the first effect the coupling of the helicity of light to spin polarized final states with a net linear momentum is caused by selection rules together with band splitting in k-space. The current flow in the second effect is driven by asymmetric spin relaxation of a homogeneous non-equilibrium spin polarization. Spin photocurrents are not limited to 2D structures. Most recently they have been predicted for 1D systems like carbon nanotubes of spiral symmetry [30]. Macroscopic measurements of photocurrents in different geometric configurations of experiments allow conclusions to be drawn about details of the microscopic tensorial spin-orbit interaction [13]. In particular the relation between the Dresselhaus-like terms and the Rashba term may be obtained. Furthermore the macroscopic in-plane symmetry of QWs may easily be determined. Spin photocurrents were also applied to investigate the mechanism of spin relaxation at monopolar spin orientation [29], where only one type of charge carrier is involved in the excitation-relaxation process. This condition is close to that of electric spin injection in semiconductors. Two methods were applied to determine spin relaxation times: the Hanle effect in the spin-galvanic current [15] and spin sensitive bleaching of photogalvanic currents [17,31]. The CPGE has also been applied to detect the state of polarization of terahertz radiation with a high time resolution [32].
Acknowledgements
The authors thank W. Prettl, E.L. Ivchenko, V.V. Bel’kov, and Petra Schneider for many discussions and helpful comments on the present manuscript. Financial support by the DFG is gratefully acknowledged.
References
1. D.D. Awschalom, D. Loss, N. Samarth (Eds.): Semiconductor Spintronics and Quantum Computation, in the series Nanoscience and technology, eds. K. von Klitzing, H. Sakaki, R. Wiesendanger (Springer, Berlin, 2002).
2. F. Meier, B.P. Zakharchenya (Eds.): Optical orientation (Elsevier Science Publ., Amsterdam, 1984).
3. N.S. Averkiev, M.I. D’yakonov, Sov. Phys. Semicond. 17, 393.
4. A.A. Bakun, B.P. Zakharchenya, A.A. Rogachev, M.N. Tkachuk, V.G. Fleisher, Sov. JETP Lett. 40, 1293.
5. I. Žutić, J. Fabian, S. Das Sarma, Appl. Phys. Lett. 79, 1558 (2001).
6. M.J. Stevens, A.L. Smirl, R.D.R. Bhat, J.E. Sipe, H.M. van Driel, J. Appl. Phys. 91, 4382 (2002).
7. S.D. Ganichev, E.L. Ivchenko, H. Ketterl, W. Prettl, L.E. Vorobjev, Appl. Phys. Lett. 77, 3146 (2000).
8. B.I. Sturman, V.M. Fridkin: The Photovoltaic and Photorefractive Effects in Non-Centrosymmetric Materials (Gordon and Breach Science Publishers, New York, 1992).
9. E.L. Ivchenko, G.E. Pikus: Superlattices and Other Heterostructures. Symmetry and Optical Phenomena (Springer, Berlin 1997).
10. S.D. Ganichev, E.L. Ivchenko, S.N. Danilov, J. Eroms, W. Wegscheider, D. Weiss, W. Prettl, Phys. Rev. Lett. 86, 4358 (2001).
11. Y.A. Bychkov, E.I. Rashba, Sov. JETP Lett. 39, 78 (1984).
12. M.I. D’yakonov, V.Yu. Kachorovskii, Sov. Phys. Semicond. 20, 110 (1986).
13. S.D. Ganichev, W. Prettl, J. Phys.: Condens. Matter (topical review), to be published (cond-mat/0304266 and 0304268).
14. E.L. Ivchenko, Yu.B. Lyanda-Geller, G.E. Pikus, Sov. JETP Lett. 50, 175 (1989).
15. S.D. Ganichev, E.L. Ivchenko, V.V. Bel’kov, S.A. Tarasenko, M. Sollinger, D. Weiss, W. Wegscheider, W. Prettl, Nature (London) 417, 153 (2002).
16. S.D. Ganichev, Petra Schneider, V.V. Bel’kov, E.L. Ivchenko, S.A. Tarasenko, W. Wegscheider, D. Weiss, D. Schuh, D.G. Clarke, M. Merrick, B.N. Murdin, P. Murzyn, P.J. Phillips, C.R. Pidgeon, E.V. Beregulin, W. Prettl, submitted to Phys. Rev. B (cond-mat/0303193).
17. Petra Schneider, S.D. Ganichev, J. Kainz, U. Rössler, W. Wegscheider, D. Weiss, W. Prettl, V.V. Bel’kov, L.E. Golub, D. Schuh, phys. stat. sol. a, to be published (cond-mat/0303056).
18. S.D. Ganichev, U. Rössler, W. Prettl, E.L. Ivchenko, V.V. Bel’kov, R. Neumann, K. Brunner, G. Abstreiter, Phys. Rev. B 66, 75328-1 (2002).
19. S.D. Ganichev, M. Sollinger, W. Prettl, D.R. Yakovlev, P. Grabs, G. Schmidt, L. Molenkamp, E.L. Ivchenko, Verhandl. DPG (VI) 36, 1/170 (2001).
20. S.D. Ganichev, I.N. Yassievich, W. Prettl, J. Phys.: Condens. Matter 14 (topical review), R1263 (2002).
21. G.M.H. Knippels, X. Yan, A.M. MacLeod, W.A. Gillespie, M. Yasumoto, D. Oepts, A.F.G. van der Meer, Phys. Rev. Lett. 83, 1578 (1999).
22. O. Krebs, P. Voisin, Phys. Rev. Lett. 77, 1829 (1996).
23. S.D. Ganichev, E.L. Ivchenko, W. Prettl, Physica E 14, 166 (2002).
24. L.E. Golub, Physica E, to be published (cond-mat/0208295).
25. S.D. Ganichev, V.V. Bel’kov, Petra Schneider, E.L. Ivchenko, S.A. Tarasenko, D. Schuh, W. Wegscheider, D. Weiss, W. Prettl, submitted to Phys. Rev. B (cond-mat/0303054).
26. N.S. Averkiev, L.E. Golub, M. Willander, J. Phys.: Condens. Matter 14 (topical review), R271 (2002).
27. A.G. Aronov, Yu.B. Lyanda-Geller, Sov. JETP Lett. 50, 431 (1990).
28. V.M. Edelstein, Solid State Comm. 73, 233 (1990).
29. S.A. Tarasenko, E.L. Ivchenko, V.V. Bel’kov, S.D. Ganichev, D. Schowalter, Petra Schneider, M. Sollinger, W. Prettl, V.M. Ustinov, A.E. Zhukov, L.E. Vorobjev, Journal of Supercond.: Incorporating novel Magn., to be published (cond-mat/0301388).
30. E.L. Ivchenko, B. Spivak, Phys. Rev. B 66, 155404-1 (2002).
31. S.D. Ganichev, S.N. Danilov, V.V. Bel’kov, E.L. Ivchenko, M. Bichler, W. Wegscheider, D. Weiss, W. Prettl, Phys. Rev. Lett. 88, 057401-1 (2002).
32. S.D. Ganichev, H. Ketterl, W. Prettl, MRS Symp. Proc. 692 eds. E.D. Jones, M.O. Manasreh, K.D. Choquette, D. Friedman, H4.3.1/4 (2001).
Spin Injection in Ferromagnet/Semiconductor Hybrid Structures Dirk Grundler, Toru Matsuyama, and Claas Henrik Möller Institut für Angewandte Physik, Universität Hamburg Jungiusstrasse 11, D-20355 Hamburg, Germany
Abstract. We review our recent studies on transport phenomena in hybrid structures which consist of a ferromagnetic metal and a high-mobility two-dimensional electron system (2DES). The main focus will be on our recent investigations on ballistic spin injection into semiconductors. A novel fabrication technique by means of cleaved-edge overgrowth is used to generate lateral spin-valve devices with clean interfaces and deep-sub-µm 2DES channel length.
1
Introduction
There is widespread interest in the injection of spin-polarized currents into semiconductors. Spintronics with semiconductors is nowadays a fascinating research field in solid state physics [1,2,3], but high spin injection efficiencies are needed if it is to become relevant for technological applications [4]. One of the most challenging and controversial issues in experiment [5,6,8,9] and theory [10] is currently spin injection and spin detection in a high-mobility two-dimensional electron system (2DES) using ferromagnetic metals. Since the early reports of all-electrical spin injection and spin detection in a 2DES in the diffusive transport regime [5], a severe controversy [6,10] has arisen about the observed spin-valve effects. The main arguments are that diffusive transport theory predicts a fundamental obstacle for spin injection into semiconductors due to the conductivity mismatch [6,11] and that stray fields might induce a magnetoresistance (MR) effect which mimics the spin-valve effect [8]. We have developed a model to describe ballistic transport through ferromagnet/semiconductor hybrid structures [12,13,14]. In this regime characteristic and pronounced spin-valve effects are predicted. MR effects observed recently in quasi-ballistic 2DES with ferromagnetic contacts [15,16] can be explained by this theory. We review the main findings of the approach and present intriguing results on the spin-valve effect in Co/InAs(2DES)/Co hybrid structures which we fabricated by a novel method using cleaved-edge overgrowth.
2
Spin Injection into Semiconductors
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 443–447, 2003. © Springer-Verlag Berlin Heidelberg 2003
In 1990, Datta and Das reported on the so-called spin field-effect transistor, which was an analogue of the optical modulator [17]. Their theoretical treatment of spin transport along a one-dimensional electron system in an InAs-based heterostructure predicted the intriguing possibility to vary the sign of the spin-valve effect ∆G/G in a ferromagnet/semiconductor hybrid structure by means of an electric field [Fig. 1(a)]. The latter varied the spin precession angle φ via the electric-field controlled spin-orbit interaction, i.e., via the Rashba effect in the III-V semiconductor heterostructure [18]. We have recently found that it is very important to include the ferromagnetic contacts explicitly in the calculation if the transmission of ballistic electrons through a ‘realistic’ spin field-effect transistor is considered [13,14]. This was lacking in the original paper.
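The dependence of the precession angle φ on the Rashba coupling can be made concrete with a short numerical sketch. It uses the standard Datta-Das expression φ = 2m*αL/ħ², which is not written out in this text. The Rashba parameter α = 1 × 10⁻¹¹ eV m and the effective mass m* = 0.04 mₑ are assumed, illustrative values for an InAs 2DES, and L = 0.15 µm is the channel length of the model system used in this chapter.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
M_E = 9.1093837015e-31        # free-electron mass, kg
E_CHARGE = 1.602176634e-19    # elementary charge, C (also 1 eV in joules)


def precession_angle(alpha_ev_m, m_eff_ratio, length_m):
    """Spin precession angle phi = 2 m* alpha L / hbar^2, in radians."""
    alpha_joule_m = alpha_ev_m * E_CHARGE     # Rashba parameter in J m
    m_eff = m_eff_ratio * M_E                 # effective mass in kg
    return 2.0 * m_eff * alpha_joule_m * length_m / HBAR**2


# Illustrative inputs: alpha and m* are assumptions, not values from the text.
phi = precession_angle(alpha_ev_m=1.0e-11, m_eff_ratio=0.04, length_m=0.15e-6)
print(f"phi = {phi:.2f} rad = {phi / math.pi:.2f} pi")
```

With these assumed numbers the angle comes out close to π/2. Because φ is linear in α, a gate-induced change of the Rashba parameter by a factor of two would move the device from a quarter to a half precession over the same channel length.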
Fig. 1. (a) Spin-valve effect in the Datta-and-Das approach recalculated from [17] with (full line) and without (dotted line) spin precession. Schematic sketch of the orientation of the contact magnetization if there is no spin precession (b) and if there is spin precession (c). (d),(e) Calculated spin-valve effects for configuration (b) and (c), respectively. Arrows indicate the orientation of the magnetization. The curves are taken from [14]
2.1 Spin-Valve Effects in Ferromagnet/InAs(2DES)/Ferromagnet Hybrid Structures: Theory
Based on the Landauer-Büttiker formalism for ballistic transport in mesoscopic devices [19] we calculated the spin-valve effects ∆G/G for the two magnetic configurations shown in Figs. 1(b) and (c). Using the transfer-matrix formalism we included the exchange splitting of the electron states in the ferromagnets via the Stoner model, the spin filtering at the interfaces due to the band structure mismatch [13] and the Rashba effect. The latter is known to occur in InAs bulk crystals and in InAs heterostructures and scales with the carrier density ns of the electron system in the semiconductor [3]. For the traces in Figs. 1(d) and (e) we used the model system Fe/InAs(2DES)/Fe and assumed a length L of the semiconductor channel of 0.15 µm and a width W of 1 µm. We have also taken into account oblique angles in this mesoscopic spin field-effect transistor [14]. In Fig. 1(d) two characteristic features are distinguished: the short-period oscillations of ∆G/G as a function of ns and the coarse behavior (broken line). The former are due to Fabry-Perot resonances which occur due to coherent multiple reflections of ballistic electrons between the two interfaces. The latter originates from the spin filtering at the interfaces which depends on the Fermi energy, i.e., on ns, and reflects the ballistic spin-filter transistor discussed earlier in [13]. Due to this spin- and energy-dependent transmission across the interfaces the broken line in Fig. 1(d) deviates remarkably from the constant behavior expected in the Datta-and-Das approach where interfacial effects were neglected [compare broken line in Fig. 1(a)]. In Fig. 1(e) we recover the predicted spin precession (broken line) leading to a negative ∆G/G over a broad range in ns. Superimposed are again Fabry-Perot resonances which are now spin dependent.
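The origin of the short-period oscillations can be illustrated with a deliberately simplified estimate. The sketch below is not the transfer-matrix calculation of [13,14]: it assumes an ideal cavity of length L, normal incidence, a spin-degenerate 2DES with Fermi wave vector kF = (2π ns)^(1/2), and it ignores interface phase shifts, oblique modes and the Rashba splitting. Under these assumptions a resonance occurs whenever kF L = jπ, which gives the carrier densities listed by the hypothetical helper `resonant_densities`.

```python
import math


def fermi_wavevector(ns):
    """Fermi wave vector (1/m) of a spin-degenerate 2DES with sheet density ns (1/m^2)."""
    return math.sqrt(2.0 * math.pi * ns)


def resonant_densities(length, ns_min, ns_max):
    """Return (j, ns_j) pairs with k_F * length = j * pi inside [ns_min, ns_max]."""
    j_lo = math.ceil(fermi_wavevector(ns_min) * length / math.pi)
    j_hi = math.floor(fermi_wavevector(ns_max) * length / math.pi)
    # Invert k_F L = j pi for the density: ns_j = pi j^2 / (2 L^2).
    return [(j, math.pi * j**2 / (2.0 * length**2)) for j in range(j_lo, j_hi + 1)]


L_CHANNEL = 0.15e-6                      # channel length of the model system, m
levels = resonant_densities(L_CHANNEL, 0.3e16, 1.2e16)   # 0.3 to 1.2 x 10^12 cm^-2
for j, ns in levels:
    print(f"j = {j:2d}   ns = {ns * 1e-16:.2f} x 10^12 cm^-2")
```

In this idealization only a handful of resonances fall into the density window considered in the text, and their spacing grows with j. The sketch indicates the scale of the oscillation period only; the amplitudes and the spin dependence require the full calculation.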
For ns = 0.3–1.2 × 10¹² cm⁻² the calculations predict that the sign of ∆G/G should change if the orientation of the magnetization of the contacts can be controlled in the different directions as depicted in the insets of Figs. 1(d) and (e). This predicted change of sign is unknown from spin-polarized transport in giant magnetoresistance or tunneling magnetoresistance devices.
2.2 Cleaved-Edge Overgrowth: A New Approach to Spin Injection
We have recently shown that cleaved-edge overgrowth (CEO) is a very powerful technique to fabricate very clean interfaces between a metal and a 2DES residing in an InAs heterostructure. The high quality was proven by means of the extraordinary magnetoresistance effect which occurred in a magnetic field perpendicular to the 2D plane [20]. Theoretical modelling supported our findings about the interfacial quality [21]. We adapted the CEO technique to fabricate planar Co contacts on cleaved edges of InAs heterostructures. The geometry is shown in Fig. 2. A detailed description of the experiment is given elsewhere [22]. Cooling the samples to 4.2 K and measuring the resistance between the Co contacts in a magnetic field B||x, we observe a hysteretic magnetoresistance which shows a negative ∆G/G [Fig. 2(b)]. For a Co magnetization along the x direction stray fields are in the plane of the 2DES and are unlikely to affect the current flow. Changing the orientation of the Co magnetization, i.e., B||z, results in a positive ∆G/G. We interpret the data as showing that spin-polarized electrons travel ballistically between the contacts and give rise to the observed spin-valve effects due to spin precession. The mean free path λe of the investigated InAs samples ranges from 0.5 µm to 1.6 µm, which is always larger than the total separation of 0.2 µm. Following the findings of Pareek and Bruno [23] the width W > λe reduces the overall spin polarization of the current. Actually, the observed spin-valve effects reflect a maximum spin injection efficiency η of only 2 %. However, the novel technique leaves room for further significant improvements concerning spin injection into semiconductors, e.g., by reducing the width W or by in situ processing of the hybrid structure.
Fig. 2. (a) Schematic sketch of the experiment. The height of the etched step and the length were h = 60 nm and l = 140 nm, respectively. The width is W = 100 µm. (b) Spin-valve effect at T = 4.2 K when the magnetic field B was applied along the x direction. The up and the down sweep are the broken and the full lines, respectively. Taken from [22]
3
Conclusions
A ballistic transport theory for the spin-valve effects in ferromagnet/semiconductor/ferromagnet hybrid structures has been developed. We find pronounced effects with ∆G/G of up to a few %. Theory predicts that clean interfaces are a crucial prerequisite in order to observe spin injection phenomena since they give rise to spin filtering. We have used cleaved-edge overgrowth to realize high-quality interfaces between Co and InAs(2DES) and found characteristic spin-valve effects in the experiments at low temperature. They suggest spin injection with efficiency η = 2 %. Our novel approach makes epitaxial regrowth of ferromagnets on the semiconductor heterostructure possible, which might give rise to higher spin-injection efficiencies in the future [24].
Acknowledgements
We thank D. Heitmann, Ch. Heyn, C.-M. Hu, O. Kronenwerth, G. Meier, U. Merkt, S. Schnüll and A. Wittmann for continuous support and stimulating discussions. The work is supported by the DFG via SFB508 and via Graduiertenkolleg “Nanostrukturierte Festkörper”, by the BMBF via 13N8283 and by the Japanese NEDO program.
References
1. D.D. Awschalom, M.E. Flatte and N. Samarth, Sci. Am. 286, 66 (2002).
2. D. Grundler, Phys. World 15 (no. 4), 39 (2002).
3. G. Meier and D. Grundler, Rashba spin–splitting and ferromagnetic electrodes on InAs, in Festkörperprobleme - Advances in Solid State Physics/Vol. 40, Ed. B. Kramer (Braunschweig: Vieweg 2000) p 295; and references therein.
4. S.A. Wolf et al., Science 294, 1488 (2001) and references therein.
5. P. Hammar, B.R. Bennett, M.J. Yang, and M. Johnson, Phys. Rev. Lett. 83, 203 (1999); Phys. Rev. Lett. 84, 5024 (2000).
6. A.T. Filip, B.H. Hoving, F.J. Jedema, B.J. van Wees, B. Dutta, and S. Borghs, Phys. Rev. B 62, 9996 (2000).
7. G. Meier, T. Matsuyama, and U. Merkt, submitted (2001).
8. F.G. Monzon, H.X. Tang, and M.L. Roukes, Phys. Rev. Lett. 84, 5022 (2000).
9. B.J. van Wees, Phys. Rev. Lett. 84, 5023 (2000).
10. J.-I. Inoue, G.E. Bauer, and L.W. Molenkamp, Phys. Rev. B 67, 033104 (2003); and references therein.
11. G. Schmidt, D. Ferrand, L.W. Molenkamp, A.T. Filip, and B.J. van Wees, Phys. Rev. B 62, R4790 (2000).
12. D. Grundler, Phys. Rev. Lett. 86, 1058 (2001).
13. D. Grundler, Phys. Rev. B 63, R161307 (2001).
14. T. Matsuyama, C.-M. Hu, D. Grundler, G. Meier, and U. Merkt, Phys. Rev. B 65, 155322 (2002).
15. C.-M. Hu, J. Nitta, A. Jensen, J.B. Hansen, and H. Takayanagi, Phys. Rev. B 63, 125333 (2001).
16. G. Meier, T. Matsuyama, and U. Merkt, Phys. Rev. B 65, 125327 (2002).
17. S. Datta and B. Das, Appl. Phys. Lett. 56, 665 (1990).
18. Yu.A. Bychkov and E.I. Rashba, J. Phys. C: Solid State Phys. 17, 6039 (1984).
19. S. Datta: Electronic transport in mesoscopic systems (Cambridge: University Press, 1995).
20. C.H. Möller, O. Kronenwerth, D. Grundler, W. Hansen, Ch. Heyn, and D. Heitmann, Appl. Phys. Lett. 80, 3988 (2002).
21. M. Holz, O. Kronenwerth, and D. Grundler, Phys. Rev. B 67, 1553xx (2003).
22. D. Grundler, A. Wittmann, Ch. Heyn, and C.H. Möller, (submitted).
23. T.P. Pareek and P. Bruno, Phys. Rev. B 65, 241305(R) (2002).
24. O. Wunnicke, Ph. Mavropoulos, R. Zeller, P.H. Dederichs, and D. Grundler, Phys. Rev. B 65, 241306(R) (2002).
Electron Transport in Ferromagnet/InAs Hybrid Devices Guido Meier Institut für Angewandte Physik und Zentrum für Mikrostrukturforschung Universität Hamburg, Jungiusstraße 11, D-20355 Hamburg
Abstract. Electron transport in ferromagnet/InAs hybrid metal-oxide semiconductor field-effect transistors is investigated at liquid helium temperature. We employ p-type InAs crystals which exhibit two-dimensional electron systems at their surfaces. In transistors with non-magnetic contacts Shubnikov–de Haas oscillations as well as weak localization effects in the longitudinal resistance hint at a strong and tunable spin-orbit interaction. Permalloy source and drain contacts with defined micromagnetic behavior have been integrated in ferromagnet/InAs hybrid transistors. These transistors exhibit a tunable magnetoresistance effect which is examined in external magnetic fields as a function of the gate voltage.
1
Introduction
Semiconductor-based spintronics is an active field of research [1,2,3]. The idea of a spin transistor [4] has created a new branch of research in solid state physics, combining semiconductors with ferromagnetic metals or utilizing ferromagnetic semiconductors in all-semiconductor devices [5]. Currently, the injection, transport and detection of spin-polarized electrons in semiconductors are investigated. On the way to a possible spin transistor and other devices using the electron spin in semiconductors a multitude of issues must be addressed. First of all, the injection of spin-polarized charge carriers from a ferromagnet into a semiconductor has to be demonstrated. This problem has been the subject of controversy in the last few years, but now a consensus on the basic principles has been reached, at least for the limiting cases of diffusive and ballistic transport in the semiconductor. In epitaxially grown diluted magnetic II-VI semiconductors the electron spin could be aligned in external magnetic fields and the spin polarization was detected via the circular polarization of light emitted from an integrated light-emitting diode (LED) [6]. Such spin-LEDs have been built with ferromagnetic metals as spin injectors [7,8]. These experiments exhibit spin injection rates of up to 30% at liquid helium temperatures [8]. Even at room temperature a spin-injection rate of 2% has been achieved in such a device [7]. In other optical experiments up to 9% spin-injection efficiency has been demonstrated at a temperature of 80 K with cobalt as ferromagnetic injector [9]. Schottky barriers [7,8] or tunneling barriers [9] play an important role in all these optical experiments.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 449–460, 2003. © Springer-Verlag Berlin Heidelberg 2003
Unlike the experiments with optically detected spin polarization, efficient spin injection could not yet be shown convincingly in magnetoresistance experiments on ferromagnet/semiconductor hybrid structures. This gave rise to theoretical works on the transport processes at the ferromagnet/semiconductor interface. In case of diffusive transport significant spin injection rates are only obtained for spin polarization of 100% in the injecting contacts [10]. Half-metallic magnets could provide such a high spin polarization at the Fermi energy [11,12]. In particular the Heusler alloy Ni2MnIn is almost perfectly lattice matched to InAs [13] and is predicted to exhibit 100% spin polarization at epitaxial interfaces to this semiconductor [14]. Theories for a single ferromagnet/semiconductor interface surpassing the diffusive limit predict spin filtering at the interface [15,16]. This model can be expanded to describe transport in a two-dimensional electron system (2DES) of a semiconductor between two ferromagnetic contacts [17]. For the Fe/GaAs model system magnetoresistance changes of up to 5% are predicted in the approach of free electrons described by plane waves. By modelling a realistic hybrid structure with a 1 µm wide two-dimensional electron channel and spin-orbit interaction within the semiconductor we have shown that the spin-dependent conductivity changes remain [18]. A theoretical work which considers a perfect epitaxial Fe/GaAs interface surpasses the plane-wave approximation [19]. In this approach realistic Bloch functions and the concomitant symmetry of the energy bands are taken into account. This model predicts nearly ideal spin filtering, i.e., a spin polarization in the semiconductor of virtually 100%. The symmetry ∆1 of the Bloch functions at the Γ-point of GaAs and InAs is the same as that of the majority spins at the Fermi energy in Fe.
This is the reason why in a ballistic transition of electrons from Fe into the semiconductor the states at the Γ-point of the semiconductor collect the majority but block the minority spins. The result predicted for the system Fe/InAs is similar [20]. However, disorder is expected to distort perfect epitaxy of the interfaces and hence to reduce the spin-injection rate due to the reduced symmetry [19,20]. Pioneering experiments on quasi-ballistic ferromagnet/semiconductor hybrid devices without gate electrodes exhibited spin-dependent transport with resistance changes in the range of 0.1% [21]. In hybrid transistors with ferromagnetic contacts on InAs we could observe a spin-related magnetoresistance of the order of 1% which could be tuned by the gate voltage [22]. However, possible parasitic magnetoresistance effects like anisotropic magnetoresistance of the ferromagnetic contacts or local Hall effects in the 2DES require a detailed analysis of the temperature, gate-voltage, and magnetic-field dependence to prove spin-dependent transport. As in the optical investigations, tunneling barriers should also improve the spin-injection rate and the spin-related magnetoresistance in transport experiments. In fact calculations on the influence of barriers predict an improvement in the diffusive [23] as well as in the ballistic limit [24]. First
transport experiments on hybrid structures that comprise the ferromagnet MnAs, the ferromagnetic semiconductor Ga1−xMnxAs, and an AlAs tunneling barrier yielded magnetoresistance effects of up to 30% at a temperature of 5 K [25]. Devices with Schottky barriers also show encouraging results for spin-polarized transport in ferromagnet/semiconductor hybrids [26]. Here we focus on our recent experiments on hybrid structures based on InAs. InAs is the material of choice because of its strong and tunable spin-orbit interaction and because it forms Ohmic contacts to virtually all metals. First we discuss transport in metal-oxide semiconductor field-effect transistors (MOSFETs) with non-magnetic contacts. From both Shubnikov–de Haas oscillations and transport measurements in the regime of weak localization the strength of the tunable spin-orbit interaction can be deduced. Then we present results for ferromagnetic contacts on InAs and compare their behavior, observed with magnetic-force microscopy in external magnetic fields, with micromagnetic simulations. Finally we analyze electronic transport in ferromagnet/InAs hybrid transistors.
2
Spin-Orbit Interaction in InAs
The Rashba spin-orbit interaction lifts the spin degeneracy in the 2DES as a consequence of an asymmetric, approximately triangular electric potential perpendicular to the channel even in the absence of a magnetic field [27]. The strength of this interaction in inversion layers of MOSFETs on p-type InAs has been studied by magnetotransport via beating patterns in Shubnikov–de Haas oscillations [28]. This allows control of spin precession via the field effect, i.e., by applying an external gate voltage that alters the built-in potential. Recently we could also observe weak localization and weak antilocalization in InAs [29]. Weak localization provides a method to determine spin splitting in low magnetic fields and without the restrictions imposed by a technically limited magnetic field strength in Shubnikov–de Haas measurements. It originates from interference between time-reversed trajectories of the conduction electrons and leads to a negative magnetoresistance in low fields [30]. In the presence of spin dephasing the constructive interference is transformed into destructive interference and so weak localization is turned into antilocalization, causing a positive magnetoresistance for near-zero fields followed by the negative magnetoresistance of localization with increasing field. For the observation of weak localization and weak antilocalization we have prepared samples based on p-type InAs (100) single crystals doped with Zn at a concentration of 2.0 × 10¹⁷ cm⁻³. A 2DES that does not form a Schottky barrier to metals exists at the surface of this material. The 2DES is confined in an approximately triangular potential well, the inclination of which, and thereby the surface electric field, is controllable by a gate voltage. After mechanical and chemical polishing of the InAs surface the electrodes of a field-effect transistor in Corbino geometry are patterned by optical lithography and thermal evaporation in a lift-off process. The semiconductor channel between the contacts has a length of 100 µm. An Al/Au gate electrode is placed above the channel, insulated from it by a 340 nm thick SiO2 layer deposited by plasma-enhanced chemical vapor deposition. Samples in Corbino geometry exhibit a classical, parabolic Drude magnetoresistance RD = R0(1 + µ²B²), with R0 being the zero-field resistance and µ the mobility. To specify the quantum corrections caused by weak localization and antilocalization, this Drude term must be determined and subtracted from the experimental data. Such a parabola as well as the corresponding raw data are depicted in Fig. 1(a). In principle the quantum corrections to the conductance caused by weak localization and antilocalization can be described by the approach of [31]. An improved description is provided by the model developed by Iordanskii, Lyanda-Geller, and Pikus including the Rashba term for the spin splitting [32]. From this procedure one can obtain the spin-orbit scattering time as a function of the gate voltage. However, a rough evaluation can already be achieved from the raw data without an elaborate fitting procedure. Simulations show that the maximum resistance due to quantum corrections occurs at a field of approximately four times the spin-orbit magnetic field Bso, which is connected with the spin-orbit scattering time [33]. Figure 1(b) shows a plot of the magnetic field Bmax at the maximum resistance due to the quantum corrections as a function of the gate voltage. The relative strength of the spin-orbit interaction in p-type InAs crystals becomes clear if the deduced Bso values are compared to, e.g., GaAs heterostructures. While the value of Bso in the present crystals is in the regime of 10 mT, values of some 0.1 mT are observed in GaAs heterostructures [33]. The increase of Bso as a function of the gate voltage indicates a decrease of the related spin-orbit scattering time.
From a model description of the measured data it is in principle possible to determine the strength of the spin-orbit interaction, e.g., the Rashba parameter. This result can be compared to the analysis of high-field data measured by Shubnikov–de Haas oscillations. However, while the correct magnitude for the Rashba parameter results, the dependence on the electron density is contradictory [29]. In conclusion, we find that p-type InAs crystals with a 2DES as inversion layer at the surface exhibit a strong spin-orbit interaction which can be tuned at will by a gate voltage.
[Figure 1: (a) R (Ω) versus B (T) for Vg = 5 V; (b) Bmax (mT) versus Vgate (V).]
Fig. 1. (a) Raw experimental data (solid line) fitted with the classical Drude magnetoresistance parabola (dashed line). (b) Magnetic field Bmax at the maximum resistance as a function of the gate voltage
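The rough evaluation described in this section, i.e., subtracting the Drude parabola and reading off Bso from the position of the resistance maximum, can be sketched as follows. This is a minimal illustration and not the procedure of [29]: the trace is synthetic, the low-field bump is a toy function chosen only to have a maximum at 50 mT, the restriction of the parabola fit to |B| ≥ 0.5 T is an assumption, and the function names are hypothetical. Only the relations RD = R0(1 + µ²B²) and Bmax ≈ 4 Bso are taken from the text.

```python
import numpy as np


def fit_drude(b, r, b_fit_min):
    """Fit R_D = R0 (1 + mu^2 B^2) using only points with |B| >= b_fit_min."""
    mask = np.abs(b) >= b_fit_min
    slope, r0 = np.polyfit(b[mask] ** 2, r[mask], 1)   # R is linear in B^2
    mu = np.sqrt(slope / r0)
    return r0, mu


def quantum_correction(b, r, b_fit_min):
    """Return the resistance left after subtracting the fitted Drude parabola."""
    r0, mu = fit_drude(b, r, b_fit_min)
    return r - r0 * (1.0 + (mu * b) ** 2)


def estimate_bso(b, delta_r):
    """Rough estimate B_so ~ B_max / 4, with B_max the positive-field maximum."""
    positive = b > 0
    b_max = b[positive][np.argmax(delta_r[positive])]
    return b_max, b_max / 4.0


# Synthetic trace (made-up numbers, not measured data).
b = np.linspace(-1.0, 1.0, 2001)                  # magnetic field, T
r_drude = 240.0 * (1.0 + (0.25 * b) ** 2)         # R0 = 240 Ohm, mu = 0.25 m^2/Vs
x = np.abs(b) / 0.05
bump = 1.0 * x * np.exp(1.0 - x)                  # toy correction peaking at 50 mT
delta_r = quantum_correction(b, r_drude + bump, b_fit_min=0.5)
b_max, b_so = estimate_bso(b, delta_r)
print(f"B_max = {1e3 * b_max:.0f} mT, B_so ~ {1e3 * b_so:.1f} mT")
```

Fitting the parabola only at higher fields keeps the low-field quantum correction from biasing R0 and µ. For real data the fit window would have to be chosen by inspection, and a quantitative value of the spin-orbit scattering time still requires the model fits mentioned above.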
3
Ferromagnetic Contacts
Ferromagnetic contacts play an important role in the context of spin injection independent of the specific type of material, which could be a ferromagnetic semiconductor [5], a conventional ferromagnetic transition metal, or a half-metallic magnet [11,12]. Besides the high quality of the interface, a defined magnetization direction is a decisive requirement. In a simple approach this requirement could be met by a single-domain contact, in which nearly all microscopic magnetic moments are parallel. The two limiting magnetization configurations for ferromagnetic in-plane source and drain contacts are depicted in Fig. 2. We distinguish spin-valve and spin-transistor geometry. In the former case the magnetic moments and spins in the ferromagnet are aligned parallel to the ferromagnet-semiconductor interface. In our ballistic model the spin is then conserved in the 2DES [18]. In the spin-transistor case the magnetic moments and spins point in the direction perpendicular to the interfaces. Then the injected spin is not a constant of motion in the channel. It can be shown that the optimal coupling to the spin-precession state is given for normally injected modes, i.e., for modes injected perpendicularly to the interface [18].
[Figure 2: sketches (a) and (b) with labels gate, source (at 0) and drain (at L), and axes x, y, z.]
Fig. 2. Sketch of a hybrid device with two ferromagnetic contacts and the 2DES in (a) the spin-valve and (b) the spin-transistor geometry. Ferromagnetic contacts (2DES) are shown in black (grey)
[Figure 3: panels (a) to (d); scale bar 1 µm.]
Fig. 3. Simulated magnetization of permalloy contacts at B=0 (a) and next to saturation (b). The thickness of the contacts is 20 nm and the size 2 µm × 1 µm and 1 µm × 1 µm, respectively. A saturation magnetization Ms = 800 kA/m, an exchange constant A = 13 × 10⁻¹² J/m, and an anisotropy constant K1 = 500 J/m³ are used as material parameters. Simulated MFM images are shown in (c) and (d). Note that the grey scales in (c) and (d) comprise 0.04 mT/nm² and 0.16 mT/nm² between black and white, respectively
From the micromagnetic point of view the spin-valve geometry (see Fig. 2(a)) implies no difficulties. Given the micromagnetic energy contributions in ferromagnets, namely crystalline and shape anisotropy, the micromagnetic behavior can be tailored to obtain parallel or antiparallel magnetization configurations of two neighboring end domains in the contacts. With the micromagnetic computer code oommf supplied by Donahue and Porter [34] we
have simulated the hysteresis curve and the corresponding magnetization patterns [35,36]. Figure 3 depicts two simulated magnetization patterns in limiting cases, i.e., at zero external magnetic field and next to saturation. At B = 0 both contacts exhibit Landau magnetization patterns, i.e., flux-closing domain patterns with small stray fields. The end domains next to the gap between the contacts are aligned antiparallel in a spin-valve configuration (see Fig. 2(a)). Next to saturation nearly all local magnetic moments are aligned in the direction of the external field. The latter configuration resembles the spin-transistor configuration (see Fig. 2(b)). To improve the comparison of the simulated magnetization to magnetic-force microscopy (MFM) we have expanded the computer code so as to calculate images as expected from MFM from the derived magnetization patterns [37]. Figure 3(c) and (d) show the corresponding calculated MFM images. Using electron-beam lithography and an optimized sputter process which provides the possibility of in situ cleaning of the interface, we have prepared asymmetric permalloy contacts on p-type InAs crystals with a 2DES inversion layer at their surfaces. MFM is applied in the LiftMode™ to image the contacts at remanence and in external magnetic fields in the x- and the y-direction. Figure 4 shows a sequence of MFM images of contacts. The contacts are designed in the geometry described above to exhibit flux-closing Landau magnetization patterns at remanence. Such a configuration is desirable because of the strongly reduced stray field which we have measured quantitatively by Hall micromagnetometry [38]. At this point it is important to mention that in this type of sample the electronic mean free path at liquid helium temperatures is comparable to the spacing between the two ferromagnetic contacts, which is 150 nm in this example.
For this reason the important part of the magnetization pattern for ballistic electron transport is given by the domains next to the spacing. The white and black regions in
[Fig. 4 image: MFM sequences, panel (a) labelled with Bx (mT) and panel (b) with By (mT), field values between −10 and +10 mT; scale bar 1 µm]
Fig. 4. Magnetic-force micrographs of permalloy structures on a p-type InAs crystal in external magnetic fields applied along the x-direction (a) and the y-direction (b) as defined in Fig. 2. The geometries of the contacts are 2 µm × 1 µm × 20 nm and 1 µm × 1 µm × 20 nm, respectively. The arrows in (b) indicate the direction of flux closing domain patterns
Fig. 4(a) indicate that at an external magnetic field of −10 mT applied in the x-direction the magnetization is approximately aligned in the spin-transistor geometry. This MFM image agrees with the simulated MFM image of Fig. 3(d). If the field is reduced to a value of −0.4 mT, an irreversible magnetization switching occurs in the small contact and a Landau pattern is observed there. In the larger contact a reversible change of the left part is observed, but the magnetization at the right edge next to the spacing remains unchanged. An increase of the external field in the positive direction at first reverses the magnetization of the larger contact, whereas the Landau pattern in the small contact is conserved at a field of +1.6 mT. At +10 mT the magnetization is again close to the spin-transistor geometry, but with a reversed direction compared to the starting state. On the way back a Landau pattern is again observed at +1.6 mT in the small contact, whereas the
large contact switches its magnetization direction at −0.8 mT. At −10 mT the starting state of the sequence is reestablished. Figure 4(b) shows a sequence of MFM images with the external field applied in the y-direction. At −10 mT both contacts are nearly saturated. When the field strength is reduced to −0.4 mT, a flux-closing magnetization pattern is observed in both contacts. From the trend of the grey scale and the considerations in the context of Fig. 3, the directions of the flux-closing domain patterns are determined to be counterclockwise in the large and clockwise in the small contact. Consequently, the end domains in proximity to the anticipated channel are aligned parallel (see arrows in Fig. 4(b) at −0.4 mT), resembling the parallel spin-valve geometry of Fig. 2(a). A further increase of the field leads to reversed saturation at +10 mT. On the way back, at +2.0 mT, we again observe flux-closure domain patterns in both contacts, but in contrast to the situation at −0.4 mT the direction of rotation is counterclockwise in both contacts. This results in an antiparallel alignment of the end domains (see arrows in Fig. 4(b) at +2.0 mT), resembling the antiparallel spin-valve configuration of Fig. 2(a). After reduction to B = 0 the MFM image observed agrees with the simulated MFM image of Fig. 3(c). Via −6.4 mT and −10 mT the hysteresis loop is completed and the starting state of this sequence is reestablished.
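The link between the rotation sense of the two Landau patterns and the alignment of the end domains can be made explicit with a small sketch. It assumes an idealized geometry that is only inferred from the description above: the large contact sits to the left of the gap, the small contact to the right, and each end domain points along ±y according to the circulation of its flux-closure pattern. The function names are hypothetical and serve illustration only.

```python
def edge_magnetization(rotation, edge):
    """Sign of the y-component of the end domain at one edge of a contact.

    rotation: "ccw" or "cw", the sense of the flux-closure pattern.
    edge: "left" or "right", the edge of the contact that is considered.
    For a counterclockwise circulation the magnetization points along +y
    at the right edge and along -y at the left edge.
    """
    sense = {"ccw": 1, "cw": -1}[rotation]
    side = {"right": 1, "left": -1}[edge]
    return sense * side


def spin_valve_alignment(rotation_large, rotation_small):
    """Alignment of the two end domains that face the gap.

    Assumption: the large contact faces the gap with its right edge,
    the small contact with its left edge.
    """
    m_large = edge_magnetization(rotation_large, "right")
    m_small = edge_magnetization(rotation_small, "left")
    return "parallel" if m_large == m_small else "antiparallel"


print(spin_valve_alignment("ccw", "cw"))   # parallel
print(spin_valve_alignment("ccw", "ccw"))  # antiparallel
```

Within this simplified picture, opposite rotation senses give parallel end domains and equal rotation senses give antiparallel ones, which matches the two configurations described for −0.4 mT and +2.0 mT.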
4 Electron Transport in Ferromagnet/InAs Hybrid Devices
For transport measurements in InAs(2DES) MOSFETs, permalloy source and drain contacts such as the ones shown in Fig. 4 were prepared [22]. They are wired by sputter-deposited Nb leads employing electron-beam lithography. The MOSFETs are completed by a gate oxide grown by plasma-enhanced chemical vapor deposition and an Al/Au gate. The main influence of the gate voltage on the source-drain resistance is due to the change of the carrier concentration, as can be seen in Fig. 5(b). This effect masks the subtle resistance changes which can be expected from spin-polarized transport [21]. To circumvent this difficulty we have measured the source-drain resistance versus the strength of the external magnetic field, which was applied in the x-direction as defined in Fig. 2. In the down sweep as well as in the up sweep displayed in Fig. 5(a), irreversible changes of the source-drain resistance are clearly observed. We focus on these resistance jumps and attribute them to the irreversible magnetization changes of the magnetic contacts as discussed in the previous section. The smooth background of the resistance is presumably due to the reversible part of the magnetization. After a complete cycle of an up and a down sweep the measurement is repeated for different gate voltages. A surprising feature is a gate-controlled change of the sign of the irreversible resistance jumps seen, e.g., when the traces for gate voltages 0 and +2 V are compared for the resistance change at +9 mT. As can be seen in Fig. 5(c) there is a significant
Fig. 5. (a) Normalized resistance versus magnetic field applied along the x-direction for various gate voltages Vg at T = 1.5 K. The traces are successively offset for clarity. Dashed and solid lines are down and up sweeps, respectively. (b) Source-drain resistance versus gate voltage. (c) Amplitudes of the normalized irreversible changes of the source-drain resistance. Filled circles denote jumps at 19.0 mT in up sweeps, open triangles and open circles jumps in down sweeps at −12.0 and −24.4 mT, respectively. The errors are comparable to the symbol sizes
gate-voltage dependence of the amplitude of these jumps. In the gate-voltage regime around Vg = 0 the amplitude is clearly larger than at higher and lower gate voltages. Such behavior is consistent with the picture of spins precessing due to the Rashba effect. From the observed irreversible resistance change ∆R/R0 one can estimate the degree of spin polarization η at the Ni80Fe20/InAs(2DES) interface from the relation $\Delta R/R_0 = 2\eta^2 e^{-L/\Lambda_S}$ [39], with the channel length L and the spin scattering length ΛS in the semiconductor. A conservative estimate of η is obtained when spin scattering in the semiconductor is neglected, i.e., ΛS = ∞. Consequently, we calculate a minimum degree of spin polarization at the interface of η = 7% from our data ∆R/R0 = 1%. As expected for non-epitaxial interfaces, this is considerably lower than the spin polarization of 37% in the bulk of Ni80Fe20 [40].
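The estimate of η can be retraced numerically by inverting the relation quoted above. The following is a minimal sketch; the function name is hypothetical, and the finite spin scattering length of 300 nm in the second call is an assumed value chosen only to show the trend, not a measured quantity. The channel length of 150 nm is the contact spacing mentioned earlier.

```python
import math


def min_spin_polarization(delta_r_over_r0, channel_length=None, spin_length=None):
    """Invert dR/R0 = 2 * eta**2 * exp(-L / Lambda_S) for eta.

    If spin_length is None, spin scattering is neglected (Lambda_S -> infinity),
    which gives the conservative lower bound on eta.
    """
    if spin_length is None:
        damping = 1.0
    else:
        damping = math.exp(-channel_length / spin_length)
    return math.sqrt(delta_r_over_r0 / (2.0 * damping))


# Lower bound for dR/R0 = 1 % without spin scattering.
eta_min = min_spin_polarization(0.01)
print(f"{eta_min:.3f}")  # 0.071

# Hypothetical finite spin scattering length (assumption): L = 150 nm, Lambda_S = 300 nm.
eta_finite = min_spin_polarization(0.01, channel_length=150e-9, spin_length=300e-9)
print(eta_finite > eta_min)  # True
```

Any finite ΛS requires a larger η to produce the same ∆R/R0, which is why neglecting spin scattering yields a minimum value.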
5 Outlook
We have manufactured tailored ferromagnetic contacts on InAs and have observed their micromagnetic behavior with magnetic-force microscopy in external magnetic fields and with Hall micromagnetometry. InAs exhibits a strong spin-orbit interaction, which manifests itself in beating patterns of Shubnikov–de Haas oscillations as well as in weak localization and antilocalization of the electrons in low magnetic fields. InAs-based field-effect transistors with ferromagnetic contacts have been fabricated, and their source-drain resistance has been examined as a function of the gate voltage and of the direction and strength of magnetic fields at low temperatures. The amplitude and the sign of irreversible resistance changes can be tuned by the gate voltage. This behavior provides evidence that spin-polarized transport is relevant in these devices. Future work will aim at improving the spin-injection efficiency, which has been proposed to be achievable by tunneling barriers in all-electrical spin-injection devices [23] or by epitaxial ferromagnet/semiconductor interfaces [19]. In addition, it is desirable to use highly spin-polarized materials as injectors. In this regard half-metallic magnets might play an important role [12]. InAs heterostructures will provide an enhanced mobility of the 2DES and a concomitantly increased spin coherence length. In addition, lithographically defined mesas and gate electrodes provide the possibility to further reduce the dimensionality of the electron system. With these ingredients a genuine spin transistor with a 1DES, as envisioned by Datta and Das [4], should lie within reach.
Acknowledgements
I would like to thank Ulrich Merkt for discussions, encouragement, and persistent support. Many thanks to Toru Matsuyama, Dirk Grundler, Miriam Barthelmeß, Christian Pels, and Christopher Schierholz. The expertise and technical assistance of Jerzy Gancarz is appreciated. This work is supported by the Deutsche Forschungsgemeinschaft via the Sonderforschungsbereich 508 ‘Quantenmaterialien’, the Graduiertenkolleg ‘Physik Nanostrukturierte Festkörper’, the NEDO International Joint Research Program, and the BMBF via grant 13N8283.
References
1. D.D. Awschalom, M.E. Flatté, and N. Samarth, Scientific American 286, 53 (June 2002).
2. S.A. Wolf, D.D. Awschalom, R.A. Buhrman, J.M. Daughton, S. von Molnár, M.L. Roukes, A.Y. Chtchelkanova, and D.M. Treger, Science 294, 1488 (2001).
3. D. Grundler, Physics World 15, 39 (April 2002).
4. S. Datta and B. Das, Appl. Phys. Lett. 56, 665 (1990).
5. T. Dietl et al., Science 287, 1019 (2000).
6. R. Fiederling, M. Keim, G. Reuscher, W. Ossau, G. Schmidt, A. Waag, and L.W. Molenkamp, Nature 402, 787 (1999).
7. H.J. Zhu, M. Ramsteiner, H. Kostial, M. Wassermeier, H.-P. Schönherr, and K.H. Ploog, Phys. Rev. Lett. 87, 016601 (2001); M. Ramsteiner, H. Zhu, A. Kawaharazuka, H.-Y. Hao, and K.H. Ploog, Advances in Solid State Physics 42, 95 (2002).
8. A.T. Hanbicki, B.T. Jonker, G. Itskos, G. Kioseoglou, and A. Petrou, Appl. Phys. Lett. 80, 1240 (2002); R. Jansen, Comment, 81, 2130 (2002); A.T. Hanbicki and B.T. Jonker, Response to Comment, 81, 2131 (2002).
9. V.F. Motsnyi, J. De Boeck, J. Das, W. Van Roy, G. Borghs, E. Goovaerts, and V.I. Safarov, Appl. Phys. Lett. 81, 265 (2002).
10. G. Schmidt, D. Ferrand, L.W. Molenkamp, A.T. Filip, and B.J. van Wees, Phys. Rev. B 62, 4790 (2000).
11. R.A. de Groot, F.M. Mueller, P.G. van Engen, and K.H.J. Buschow, Phys. Rev. Lett. 50, 2024 (1983).
12. W.E. Pickett and J.S. Moodera, Phys. Today 54(5), 39 (2001).
13. J.Q. Xie, J.W. Dong, J. Lu, and J. Palmstrøm, Appl. Phys. Lett. 79, 1003 (2001).
14. K.A. Kilian and R.H. Victora, IEEE Transactions on Magnetics 37, 1976 (2001).
15. G. Kirczenow, Phys. Rev. B 63, 054422 (2001).
16. D. Grundler, Phys. Rev. Lett. 86, 1058 (2001).
17. D. Grundler, Phys. Rev. B 63, R161307 (2001).
18. T. Matsuyama, C.-M. Hu, D. Grundler, G. Meier, and U. Merkt, Phys. Rev. B 65, 155322 (2002).
19. O. Wunnicke, Ph. Mavropoulos, R. Zeller, P.H. Dederichs, and D. Grundler, Phys. Rev. B 65, R241306 (2002).
20. M. Zwierzycki, K. Xia, P.J. Kelly, G.E.W. Bauer, and I. Turek, cond-mat/0204422 (2002).
21. C.-M. Hu, J. Nitta, A. Jensen, J.B. Hansen, and H. Takayanagi, Phys. Rev. B 63, 125333 (2001).
22. G. Meier, T. Matsuyama, and U. Merkt, Phys. Rev. B 65, 125327 (2002).
23. E.I. Rashba, Phys. Rev. B 62, 16267 (2000).
24. C.-M. Hu and T. Matsuyama, Phys. Rev. Lett. 87, 066803 (2001).
25. S.H. Chun, S.J. Potashnik, K.C. Ku, P. Schiffer, and N. Samarth, Phys. Rev. B 66, 100408 (2002).
26. S. Kreuzer, J. Moser, W. Wegscheider, D. Weiss, M. Bichler, and D. Schuh, Appl. Phys. Lett. 80, 4582 (2002).
27. E.I. Rashba and Yu.A. Bychkov, J. Phys. C 17, 6039 (1984).
28. T. Matsuyama, R. Kürsten, C. Meißner, and U. Merkt, Phys. Rev. B 61, 15588 (2000).
29. C. Schierholz, R. Kürsten, G. Meier, T. Matsuyama, and U. Merkt, physica status solidi (b) 233, 436 (2002).
30. S. Datta, Electron Transport in Mesoscopic Systems, Cambridge University Press (1995).
31. S. Hikami, A.I. Larkin, and Y. Nagaoka, Prog. Theor. Phys. 63, 707 (1980).
32. S.V. Iordanskii, Yu.B. Lyanda-Geller, and G.E. Pikus, JETP Lett. 60, 206 (1994).
33. P.D. Dresselhaus, C.M.A. Papavassiliou, R.G. Wheeler, and R.N. Sacks, Phys. Rev. Lett. 68, 106 (1992).
34. M. Donahue and D. Porter, Object Oriented Micromagnetic Framework, computer program, http://math.nist.gov/oommf.
35. G. Meier and T. Matsuyama, Appl. Phys. Lett. 76, 1315 (2000).
36. G. Meier, M. Halverscheid, T. Matsuyama, and U. Merkt, J. Appl. Phys. 89, 7469 (2001).
37. M. Barthelmeß, A. Thieme, R. Eiselt, and G. Meier, J. Appl. Phys. 93, 8400 (2003).
38. G. Meier, R. Eiselt, and M. Halverscheid, J. Appl. Phys. 92, 7296 (2002).
39. M. Johnson, Phys. Rev. B 58, 9635 (1998).
40. R.J. Soulen et al., Science 282, 85 (1998).
Self-Organized Structures on Flat Crystals: Nanowire Networks Formed by Metal Evaporation
Rainer Adelung1, Rainer Kunz1, Frank Ernst2, Lutz Kipp3, and Michael Skibowski3
1 Technische Fakultät der CAU Kiel, Kaiserstr. 2, 24143 Kiel, Germany
2 Case Western Reserve University, Cleveland, Ohio 44106-7204, USA
3 Institut für Experimentelle und Angewandte Physik der CAU Kiel, Leibnizstr. 19, 24098 Kiel, Germany
Abstract. Nanotechnology often requires fabricating large, well-ordered assemblies of nanoscopic components on the surface of a substrate. It has been demonstrated that such ordered nanoassemblies can be obtained by pre-structuring the substrate, e.g., with a regular pattern of surface steps [1] or step bunches [2,3,4,5,6,7,8,9]. However, an equally powerful yet much simpler approach is self-organized growth. We demonstrate the strength of this approach using the self-organized growth of metal nanowires on the surface of layered transition-metal dichalcogenide (TMDC) crystals. The surfaces of these crystals are perfectly flat: they do not reconstruct and feature virtually no step edges or other defects. Nevertheless, evaporation of certain metals onto these surfaces leads to the formation of large assemblies of nanowires. These have a diameter down to 8 nm, and they form regular networks with mesh diameters of 200 to 400 nm, extending over macroscopic distances (millimeters) [10,11]. Moreover, we have discovered that a variety of other nanocomponents form along with the wires: cluster arrays forming triangles or parallelograms, as well as nanotunnels forming networks similar to those of the nanowires. We believe that the formation of these highly organized structures on a featureless substrate occurs because, on the one hand, surface diffusion is very rapid, and on the other hand, the weak interaction between the substrate and the adsorbed atoms enables them to detach from the substrate. In contrast to more familiar cases of surface diffusion, therefore, metal atoms on TMDC crystal surfaces are strongly influenced by “second order” phenomena, e.g. strain in the substrate, charge transfer to the substrate, or other phenomena that affect the surface diffusivity. In addition to the mechanism by which the nanostructures form, we discuss some of their physical properties and the influence of different processing parameters on their formation.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 463–476, 2003. © Springer-Verlag Berlin Heidelberg 2003
The main driving force for the nanostructure research we have witnessed over the past ten years has been the continuous discovery of novel nanoscopic morphologies. This is interesting because the physical properties of nanoscopic architectures of material can differ greatly from their macroscopic counterparts [12]. However, in order to measure these interesting properties or even use them in novel applications, it is often necessary to integrate a large number of nanoscopic components into well-ordered arrays with a high packing density. Among the materials from which nanostructures can be grown, semiconductors have received considerably more attention than metals. In fact, some of the most attractive properties of metals are lost when the dimensions are reduced to the nanometer scale. In particular, the confinement of the electrons in metal particles with nanoscopic dimensions causes the electrical and the thermal conductivity to decrease. This may be the reason why metal nanoparticles have mostly been considered only for catalysis, where the increase of the surface-to-volume ratio is important for maximizing the throughput. Nevertheless, we believe that nanostructures of metallic elements will attract increasing interest in basic research as well as for new applications.
A particularly exciting new family of nano- and microstructures involving metallic elements, which will be introduced here, was recently discovered in experiments in which various metals were evaporated onto the surface of layered transition-metal dichalcogenide (TMDC) crystals in ultra-high vacuum (UHV). TMDC crystals have been the subject of many experimental and theoretical investigations [13,14,15,16] since the 1970s, mainly because they possess unique physical properties, which originate from their special, quasi-two-dimensional electronic structure. The latter originates from the particular crystal structure: a TX2 (T = transition metal, X = chalcogen atom) crystal consists of two-dimensional X–T–X sandwiches in either 1T or 2H coordination, separated by a van der Waals gap (Fig. 1). The crystals easily cleave along the van der Waals gap.
In fact, cleaving produces perfectly flat surfaces exhibiting extremely low defect densities, no dangling bonds, and no step edges over regions larger than 100 µm². TMDCs can be produced as flakes with a diameter of up to 1 cm via chemical vapor transport. Remarkably, UHV evaporation of particular metallic elements onto the featureless surfaces of such crystals leads to the formation of regular nanowire networks and other nanostructures, e.g., nanotunnels. This extreme example of self-organization is important because it may lead to new strategies for solving the problem of integrating nanocomponents in large, well-ordered aggregates on the surface of a substrate.
2
Experimental
The single-crystalline TX2 layered-crystal substrates for our experiments were grown by CVT (chemical vapor transport). This method, which has been used for more than 40 years [17], yields foil-like single crystals with a diameter of up to 1 cm. Their surface appears flat and shiny, reflecting the atomistic structure. Layered crystals consist of coherently stacked triple-layers with the sequence XTX–XTX–XTX, where T denotes a monolayer of a transition metal (V, Ti, etc.) and X a monolayer of a chalcogen atom (Se, S, etc.) [18]. In both types of monolayers, X and T, the atoms are arranged in a two-dimensional hexagonal lattice with the same lattice parameter. While strong covalent and ionic forces act within each triple layer X–T–X, only weak (van der Waals) forces act between neighboring triple layers. This enables the crystal to cleave easily between the triple layers, and the surfaces exposed by cleaving are known as substrates of high structural quality for deposition experiments in UHV (ultra-high vacuum). For such experiments, the layered crystals can be cleaved either directly in UHV or under ambient conditions before loading. Layered-crystal substrates prepared in this way often possess a surface that is atomically flat over hundreds of µm² without any structural defects.
For evaporation of Rb and Cu onto such substrates, we used either a commercial Rb dispenser from SAES Getters (Lainate, Milano, Italy) or a thermal Cu evaporator (a tungsten filament with a Cu wire attached to it), heated by an electric current of typically 6 A.
In order to characterize the structures of the resulting nanowire networks, we employed SEM (scanning electron microscopy) and SPM (scanning probe microscopy), including AFM (atomic force microscopy) and STM (scanning tunneling microscopy). SEM can retrieve the lateral dimensions of nanoscopic features with a spatial resolution of about 1.5 nm. Moreover, because SEM images possess an excellent depth of field, a quasi-three-dimensional view of the specimen can be obtained by tilting it by 45° about an axis normal to the viewing direction. Complementary to SEM, AFM and STM yield precise topographical measurements, assessing the profile of nanostructures with sub-Ångström resolution. Combining SPM (AFM and STM) with SEM, therefore, provides highly resolved and reliable data on the morphology of nanoscopic objects in three dimensions and helps to recognize artifacts introduced by either technique.
Fig. 1. Structure of a layered transition-metal dichalcogenide crystal
3 Results and Discussion
Experiments with different substrate crystals and different metals [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18] have shown that exposing a freshly cleaved surface of a TMDC crystal to a metal vapor in UHV can initiate the formation of a variety of nanostructures on the crystal surface. For example, deposition of Rb onto TiTe2, TaS2, VSe2, WSe2, TiS2 and NbSe2 [19,20,21] results in the formation of nanowires (Figs. 6c,d and 7). Similar nanowires form when Cu is deposited onto the surface of VSe2 or TaS2 [10,11] crystals (Figs. 5 and 6a,b). In each case, the nanowires are organized in more or less regular networks. Copper evaporation onto other TMDC crystals, however, does not produce nanowires but different structures. Cu on TaS2 results in nanotunnels (Fig. 4a) and large dendrites (Fig. 3), while on yet another substrate Cu produces cluster arrays with regular geometric shapes (Fig. 8a–c). Since all TMDC substrates have a perfectly flat and unreactive surface, this diversity of highly organized one-dimensional nanostructures is very surprising. Usually, the deposition of atoms on simple, flat surfaces merely causes the formation of clusters [22] with a simple morphology that minimizes interface and surface energy. In the following, we discuss a number of recent experiments that shed light on the conditions under which the formation of nanowires takes place and on what the underlying micromechanisms could be. First, we review the results of diffusion experiments with copper evaporation onto TaS2. Second, we consider the behavior of TMDC substrates after adsorbate deposition. Third, the findings of the experiments in the first and second parts are compared with the nanowire growth. Finally, to get a better understanding of the influence of the growth parameters on the kind of structure obtained, TiTe2 surfaces held at different temperatures during deposition will be shown.
3.1 Growth Parameters
In order to understand why so many different and non-equilibrium structures can form, we have to take a detailed look at the growth parameters. Even though the TMDC crystals have a similar surface structure, major differences are observed in the diffusion behavior of adsorbed atoms. This is evident from experiments in which crystals were covered with copper grids with a fine mesh size, as they are used to prepare specimens for transmission electron microscopy (TEM). The grids were mounted directly above the surface, but without making contact. Exposing this arrangement to copper vapor yields a regular array of squares with a size of 50 µm on the specimen surface. Figure 2b shows the result obtained with a WSe2 substrate. The regions covered by Cu have the expected shape of squares, forming a shadow image of the masking grid. This indicates that the diffusion length of the Cu atoms on the WSe2 surface is short.
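How the sharpness of a shadow edge relates to the diffusion length can be illustrated with the textbook solution of the one-dimensional diffusion equation for a step-like initial coverage. This is only an idealized sketch and not the analysis used in the experiments: it assumes that all atoms diffuse for the same time and never nucleate, and the two diffusion lengths below are hypothetical values chosen to contrast a sharp with a blurred edge.

```python
import math


def coverage(x_um, diffusion_length_um):
    """Relative coverage at distance x from the mask edge.

    x < 0 is the exposed side, x > 0 the shaded side. The profile is the
    solution of the 1D diffusion equation for an initial step,
    c(x) = 0.5 * erfc(x / (2 * L)), with L = sqrt(D * t).
    """
    return 0.5 * math.erfc(x_um / (2.0 * diffusion_length_um))


# Hypothetical diffusion lengths in micrometres (assumptions for illustration).
for label, length in (("short", 0.5), ("long", 20.0)):
    profile = [round(coverage(x, length), 3) for x in (-5.0, 0.0, 5.0)]
    print(label, profile)
```

For the short diffusion length the coverage drops from full to zero within a few micrometres of the edge, so a 50 µm square keeps straight sides. For the long one a substantial fraction of the material is found several micrometres inside the shaded region, which corresponds to a blurred edge.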
Fig. 2. SEM images of copper evaporated through a grid used as a mask, producing a microstructured copper pattern: (a) with blurred sides on TaS2, (b) with straight sides on WSe2, even when deposited on an area that shows many step edges
The WSe2 crystal in Fig. 2b shows one of the rare areas of a layered crystal surface that contains several step edges. Even though step edges usually constitute good nucleation sites for condensation, no Cu is found at the step edges. Accordingly, no diffusion occurs into the “shaded” areas, confirming that the diffusion length of the Cu atoms on the WSe2 surface is short. In contrast to this, Fig. 2a shows the result obtained with TaS2 as the substrate. In order to obtain comparable results, the TaS2 crystal was placed next to the WSe2 crystal in the same experiment and covered with a copper grid in the same manner. Unlike the WSe2 crystal, the TaS2 crystal features square-shaped areas with significantly blurred edges. Since the blurring of the edges reflects substantial copper diffusion into the shaded areas, we conclude that the diffusion length of copper on TaS2 is significantly larger than on WSe2.
It is difficult to determine the absolute diffusion length from such an experiment. To get a better estimate of the diffusion length, a slightly different experiment was carried out. In this experiment, only one part of the TaS2 crystal was covered during evaporation, while the other part was uncovered. The borderline between the covered and uncovered regions shows a fractal shape given by several large dendritic structures. Again an area was selected that contains step edges, see Fig. 3a. Evidently they all act as nucleation sites for the larger dendritic structures which surround the step edges. An enhanced diffusion along the step edges can be excluded by considering the shape of the dendrites, see Fig. 3b and c. The form of the dendrites is a typical outcome of the well-known diffusion-limited aggregation (DLA) growth [23] mechanism. To confirm experimentally that no copper can be found between the dendrites, Fig. 3d shows another area of the border between uncovered and covered areas. Here EDX spectroscopy was carried out. While in area A on the tip of the dendrite a large copper signal can be detected, no copper signal is found in area B, see Fig. 3e and f. Estimating the minimal detectable copper signal in EDX at 5 keV shows that the maximal amount of copper in spectrum B is far below one monolayer. This proves that the nucleation probability is very low. From the shape of the dendrites an average diffusion length for copper of more than 20 µm on TaS2 can be directly observed. Otherwise the dendrites could not have grown to such a size. Compared with typical self-diffusion lengths on metals, this is an extremely high value.
Fig. 3. SEM images of large dendritic copper structures surrounding step edges on the TaS2 surface: (a) overview, (b) magnification showing the dendritic structure, (c) different magnification showing the dendritic structure, (d) different area where EDX spectra were taken, (e) EDX spectrum taken at region B, containing no Cu, (f) EDX spectrum taken at region A, containing Cu
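The DLA mechanism invoked above can be reproduced with a few lines of code. The following is a minimal on-lattice sketch under simplifying assumptions that are not taken from the experiments: a single seed at the origin stands in for a nucleation site such as a step edge, walkers perform an unbiased random walk on a square lattice, and every walker sticks irreversibly on first contact. The function name, launch radius, and discard radius are illustrative choices.

```python
import math
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def grow_dla(n_particles=200, seed=0):
    """Grow an on-lattice DLA cluster around a seed at the origin."""
    rng = random.Random(seed)
    cluster = {(0, 0)}  # the seed plays the role of a nucleation site
    r_max = 0.0         # largest distance of any cluster site from the seed
    while len(cluster) < n_particles + 1:
        # Launch a walker on a circle just outside the current cluster.
        launch = r_max + 5.0
        phi = rng.uniform(0.0, 2.0 * math.pi)
        x = int(round(launch * math.cos(phi)))
        y = int(round(launch * math.sin(phi)))
        while True:
            dx, dy = rng.choice(STEPS)
            x, y = x + dx, y + dy
            r = math.hypot(x, y)
            if r > launch + 10.0:
                break  # walker strayed too far; discard it and launch a new one
            if any((x + sx, y + sy) in cluster for sx, sy in STEPS):
                cluster.add((x, y))  # stick on first contact with the cluster
                r_max = max(r_max, r)
                break
    return cluster, r_max


cluster, r_max = grow_dla()
print(len(cluster), round(r_max, 1))
```

A compact disc of 201 lattice sites would have a radius of about 8 lattice units, so any larger `r_max` signals a ramified, dendrite-like cluster. Branches grow because a diffusing walker is far more likely to meet an outer tip than to reach the interior, which is also why the regions between dendrites stay empty when the sticking is restricted to existing structures.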
3.2 Adsorbate Induced Substrate Reorganization
Extreme growth parameters are also found in other systems (e.g., for metal diffusion on polymers [24]) and cannot be the only explanation for the appearance of the different structures. Another possible mechanism is an adsorbate-induced structural change of the substrate. In most other systems this effect is negligible. If an atom is deposited onto a regular three-dimensional, ionically or covalently bonded crystal, a possible charge transfer from the adsorbate to the substrate has little effect. In particular, the bulk lattice constant will not be affected, as every atom is held fixed within a three-dimensional coordination. Only a surface reconstruction or relaxation might occur. This is different in a two-dimensional layered crystal. If a monolayer of adsorbate atoms is on the surface, the ratio of substrate atoms to adsorbate atoms is about 3:1, because each individual layer can be viewed as an independent crystal due to the weak bonds in the z-direction. Therefore, the influence of the adsorbate on the whole structure is much stronger in the case of a layered crystal. This might explain why a structure of large tunnels covers the TaS2 crystals after deposition of a thicker copper film (approx. 100 nm), see Fig. 4a. Tunnels or buckles relax the compressive stress that is introduced by an expansion of the surface layers. If the Fermi levels of the materials are not at the same height, a charge transfer will take place for compensation. If this charge transfer is from the adsorbate to the substrate,
Fig. 4. Structures of mechanical failure (SEM-image), found in the substrate after copper evaporation, (a) tunnels in TaS2 , (b) cracks in VSe2
a higher filling of the substrate bands will occur. Consequently, a new equilibrium lattice constant will be established. If this leads to an expansion, which is likely as explained in the following, a relaxation in the form of buckles or tunnels will occur. Another example of the influence of the adsorbate on the substrate, on a much smaller scale, occurs on VSe2 (Fig. 4b). Here tensile stress occurs after the evaporation of copper, and as a result surface cracks can be found. This shows that both tensile and compressive stress can occur in the substrate due to copper deposition.
3.3 Nanowire Growth
Figure 5 shows another structure, found after Cu deposition on VSe2. In contrast to the experiment leading to Fig. 4b, a smaller amount of copper was evaporated here. The result is a network of nanowires with diameters down to approx. 8 nm, connected in meshes with a width of typically 100–500 nm. These nanowires were also found on TiSe2 after deposition on crystals at room temperature. A detailed mechanism for the formation of nanowires, which is in part verified by experiments and calculations, draws a complex picture [11]. According to this model, a metal atom deposited on a TX2 surface shares its valence electron with the uppermost triple-layer of the substrate. This can be directly observed in experiments [20] and is also evident in calculations on similar systems. The electron transfer has two consequences. First, the adsorbed atoms behave like dipoles due to the delocalization of their electrons in the substrate. If two atoms are placed on neighboring sites on the surface, they will repel each other [25]. This might be the reason for the low nucleation probability often observed in these systems, e.g. in the
Fig. 5. Different parts of nanowire networks obtained after copper evaporation on VSe2 , SEM-image
Cu:TaS2 system shown above. Second, the charge transfer causes a change in the lattice constant, as found in molecular statics calculations for the system Li on TiS2. The expansion of the equilibrium lattice parameter is found after the deposition of 1/3 monolayer [25], thus causing compressive stresses. In a simplified picture, a positively charged metal atom on the surface attracts the two chalcogen atoms X of the uppermost TX2 triple-layer, which are electron acceptors, while repelling the transition-metal atom T, which is an electron donor, but a weaker one than the metal atom on the surface. This interaction causes an expansion of the equilibrium lattice parameter at the surface, leading to compressive in-plane stresses in the uppermost surface layer under the constraints of an extended TX2 crystal. The compressive stresses predicted by this (admittedly simplistic) model explain the formation of a few nanotunnels we observed in the surface layer(s) of the substrates covered with nanowire networks [11]. These tunnels are much smaller than the ones found after the Cu deposition on TaS2. Scanning tunneling microscopy (STM) and atomic force microscopy (AFM) work has revealed, however, that the nanowires are embedded in surface cracks with a depth corresponding to a single triple-layer of the TX2 substrate. Figures 6a and b show two AFM images in which a nanowire was removed. The underlying crack is found to be one triple-layer deep. Figures 6c and d show images where the tip of a nanowire on TaS2 obtained from Rb deposition is visible. The darker lines at the end are also cracks, again one triple-layer deep. The TaS2 surface exhibits approximately 3-atom-wide charge density waves. Since these remain visible, no material can have been deposited on the surface between the nanowires. If nanowires grow in cracks in the surface, this poses the question of how cracks can form in a surface that initially features compressive stresses.
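The sign of the in-plane stress follows from a simple continuum argument, sketched here under assumptions that go beyond the text: the surface layer is treated as an isotropic elastic sheet with Young's modulus $E$ and Poisson ratio $\nu$ (both hypothetical parameters), held by the underlying crystal at the original lattice parameter $a_0$ while its own equilibrium lattice parameter has changed to $a_{\mathrm{eq}}$.

```latex
\varepsilon = \frac{a_0 - a_{\mathrm{eq}}}{a_{\mathrm{eq}}}, \qquad
\sigma = \frac{E}{1-\nu}\,\varepsilon
\quad\Rightarrow\quad
\begin{cases}
a_{\mathrm{eq}} > a_0: & \sigma < 0 \ \text{(compressive)}\\[2pt]
a_{\mathrm{eq}} < a_0: & \sigma > 0 \ \text{(tensile)}
\end{cases}
```

In this picture an expansion of the equilibrium lattice parameter puts the constrained layer under compression, and a later contraction of the equilibrium lattice parameter puts it under tension.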
Obviously, a change from compressive to tensile stresses must occur to enable the formation of the cracks. This may be explained in the following way: As the coverage of the TX2 surface with metal increases, the mean nearest-neighbor distance between the metal atoms decreases, and the energy associated with their electrostatic repulsion increases. At some critical surface coverage (at the latest when a full monolayer is deposited), the formation of metal islands will then become energetically more favorable. In a classical nucleation process, the first metal clusters will then appear on the surface, energetically afforded by thermal fluctuations. As these “starting points” surpass the critical size for stable growth, the clusters will grow by surface diffusion. The formation of metallic islands on the TX2 surface, however, will withdraw the electrons that the adsorbate atoms initially donated to the substrate. Consequently, the equilibrium lattice parameter of the TX2 surface layer will decrease. Assuming that the compressive stresses induced by the initial donation of electrons have relaxed by the formation of nanotunnels, the decrease of the equilibrium lattice parameter by the formation of metal clusters will now put the TX2 surface under tensile stress. Since the strength of the uppermost TX2 triple-layer is probably much lower in dilatation than in compression, the tensile stresses
Fig. 6. Cracks found under and next to nanowire networks. (a) AFM image of a part of the nanowire network made by copper evaporation on VSe2. (b) One wire is removed by AFM manipulation; a crack is visible underneath. (c) Charge density waves on the surface of TaS2 near a Rb nanowire are separated by a crack network. (d) The cracks are found to start at the end of a wire, as visible in this magnification of panel (c)
cause the nucleation of cracks, and the spacing of the cracks will be smaller than the spacing of the nanotunnels, in agreement with our observations. These cracks in the topmost TX2 triple layer are attractive sites for heterogeneous nucleation of further metal clusters. Consequently, metallic adsorbates will nucleate in the cracks and begin to fill them. This induces a non-linear, self-accelerating process: as more adsorbate atoms are withdrawn from the surface by diffusional growth of the metallic adsorbates filling the cracks, the equilibrium lattice parameter of the surface layer decreases further and, consequently, more cracks open, more nuclei of metallic islands form in the cracks, and even more untrapped adsorbate atoms are withdrawn from the TX2 surface. In this self-organized process of nanowire formation, the metal atoms actually create the cracks in which they later end up as nanowires. The type of nanowire growth varies from substrate to substrate. For example, with Rb on TiTe2 and TaS2, most of the nanowires are aligned in only three orientations, originating from the crystal symmetry. The crystal fractures more easily in certain high-symmetry directions. If the nanowires grow along structures of mechanical failure of the substrate, the wires will preferentially appear in those directions. But this is not always the case: Rb on VSe2 does not always form aligned networks, and more twisted nanowires appear here. Different growth modes can also be found on one substrate. Usually no clusters are found between the nanowires originating from Rb evaporation. This was shown, e.g., in Fig. 6b. While this is also the case in the center of WSe2 crystals, a modified wire growth can be observed close to the crystal borders, where clusters can be found in the meshes. Interestingly, one can also observe here smaller branched wire structures that surround the nanowires. At first view (see Fig. 7), they appear like the dendrites in Figs. 2 and 3 on a much smaller scale. But a closer look reveals that this is not diffusion limited aggregation (DLA) growth. The branched, densely packed wire bundles are also aligned in only three different directions, but they are tilted against the nanowires by 90 degrees, see the caption of Fig. 7. If it were DLA, the structures would have to stand more or less normal to every wire. Some of them do (see Fig. 7), but others grow at a 30° angle away from the wire. In addition, cracks can be found at the tips of these wire bundles. Whatever the details of this modified growth behaviour are, the mechanism appears to be the same: the adsorbate modifies the substrate, thereby inducing cracks. A long diffusion length at room temperature enables the adsorbate to reach the nucleation sites next to the wires.
Fig. 7. Structures found after Rb deposition on WSe2. Nanowire networks are surrounded by Rb nanowires arranged as bundles. The hexagonal orientation of the bundles is tilted by 90° against the hexagonal nanowire orientation, as visible in the right STM image, which is a magnification of the left STM image
3.4
Temperature Dependence
To understand the growth behavior in detail, first experiments with different substrate temperatures are presented here. Usually, a moderately elevated temperature will change feature sizes, like the distance or radius of clusters, but maintain the type of structure. We found on TiTe2 that different types of structures appear at different substrate temperatures. On TiTe2 held at room temperature during copper deposition, no nanowires or tunnels can be found. Instead, the surface is covered with copper clusters, agglomerated to form a film. Obviously, the nucleation probability is much higher here than on TaS2. Surprisingly, a long-range order can also be found on some areas of the sample surface. Figure 8 shows an overview of an area which exhibits geometrical shapes formed by a lower cluster density. The triangles, parallelograms, and other geometric shapes have a typical baseline width in the range of 1 µm to 10 µm. The lower cluster density in the structure obviously leads to a higher density on the outside. From the outside to the inside of the structure the cluster density increases towards the structure, then drops suddenly. Even though a shape completely different from wires or tunnels is formed, the structures show three directions, forming a 120° angle similar to the nanowire structures. In contrast, on samples with elevated substrate temperature, around 100 °C, no geometrically arranged cluster arrays could be found anymore. Instead, some areas show “conventional” nanowires. Figure 8d shows a step edge on a cleaved crystal decorated with copper. Next to the step edge a lower cluster density can be observed, but further away from the edge regular copper clusters are visible. This is exactly what is expected from a “regular” growth mechanism. The atoms arrive on the surface and diffuse around. In the case that the atoms arrive far away from the step edge, they have a high probability to start a new nucleus or agglomerate at an already existing cluster. In the case that the atoms arrive
Fig. 8. Nanostructures found after Cu evaporation on TiTe2 at room temperature and 100 °C. SEM images of copper clusters at different magnifications. (a) Geometric shapes like triangles or (b) parallelograms can be found. The image shows that the cluster density around the shapes is increased. (c) Larger area; the image contains a dust particle. (d) Conventional nanowire growth on a step edge at 100 °C. (e) Nanowire-network-like structures with many clusters in between at 100 °C
close to the step edge, they are able to diffuse along the step edge until they hit other atoms. In this manner, they can form part of a wire located on a step edge as shown in Fig. 8d. “Regular” growth means here that neither a long diffusion length nor an adsorbate-substrate interaction is necessary for the explanation. More surprisingly, on other areas nanowire networks like those on VSe2 and TiSe2 could be found (Fig. 8e). By raising the temperature a little, it seems that only growth parameters like diffusion length and nucleation change. Therefore it appears that only the growth parameters determine which type of structure will grow, and the chosen TX2 substrate is less important.
4
Summary and Conclusion
Different structures that appear after the evaporation of Cu and Rb on TX2 type crystals were presented. It was shown that very different growth parameters can be found on layered crystals. These range from a regular nucleation probability, leading to ordinary cluster growth, to extreme parameters like a very low nucleation probability and a diffusion length on the order of 20 µm. In addition, examples were presented for an adsorbate-induced structural change in the substrate: tunnels and cracks resulting from compressive and tensile stress. Both the unusual growth parameters, like a small diffusion barrier or a low nucleation probability, and the adsorbate-induced changes in the substrate can explain a variety of structures, including the nanowire growth. Experiments with different temperatures show that the types of structures can be switched within the same adsorbate/substrate system. Counterintuitively, we conclude that the flat TMDC crystal surfaces are a good choice to produce many different self-organized structures.
Acknowledgements
One of the authors (R.A.) gratefully acknowledges a grant by the Alexander von Humboldt foundation and thanks Prof. F. Faupel for fruitful discussions. We thank the Case School of Engineering and the Case Alumni Association for start-up funds. In addition, work was supported by the Deutsche Forschungsgemeinschaft, Forschergruppe 353/2-1.
References
1. J. Viernow, J.-L. Lin, D. Y. Petrovykh, F. M. Leibsle, F. K. Men, and F. J. Himpsel, Appl. Phys. Lett. 72, 948 (1998).
2. J. Zhu, K. Brunner, G. Abstreiter, O. Kienzle, F. Ernst, Thin Solid Films 336, 252 (1998).
3. K. Brunner, J. Zhou, G. Abstreiter, O. Kienzle, F. Ernst, in The 24th International Conference on the Physics of Semiconductors, Israel, D. Gershoni (Ed.) (World Scientific 1999) p. 61.
4. J. Zhu, K. Brunner, G. Abstreiter, O. Kienzle, F. Ernst, M. Rühle, Phys. Rev. B 60, 10935 (1999).
5. J. Stangl, V. Holy, J. Grim, G. Bauer, J. Zhu, K. Brunner, G. Abstreiter, O. Kienzle, F. Ernst, Thin Solid Films 357, 71 (1999).
6. K. Brunner, J. Zhou, G. Abstreiter, O. Kienzle, F. Ernst, Thin Solid Films 369, 39 (2000).
7. K. Brunner, J. Zhou, C. Miesner, G. Abstreiter, O. Kienzle, F. Ernst, Physica A, 7, 881 (2000).
8. F. Ernst, O. Kienzle, O. G. Schmidt, K. Eberl, J. Zhu, K. Brunner, G. Abstreiter, in Microscopy of Semiconducting Materials 2001, A. G. Cullis, J. L. Hutchison (Eds.) (IOP Publishing Ltd 2002), p. 167.
9. K. Brunner, J. Zhou, G. Abstreiter, O. Kienzle, F. Ernst, phys. stat. sol. (b) 224, 531 (2001).
10. R. Adelung, F. Ernst, A. Scott, M. Tabib-Azar, L. Kipp, M. Skibowski, S. Hollensteier, E. Spiecker, W. Jäger, V. Zaporojtchenko, F. Faupel, Adv. Mat. 14, 1056 (2002).
11. R. Adelung, W. Hartung, F. Ernst, Act. Mat. (b) 50, 4925 (2002).
12. Y. Xia, P. Yang, Y. Sun, Y. Wu, B. Mayers, B. Gates, Y. Yin, F. Kim, H. Yan, Adv. Mat. 15, 353–389 (2003).
13. V. Grasso: Physics and Chemistry of Materials with Layered Structures (Reidel, Dordrecht, 1986).
14. A. Aruchamy: Physics and Chemistry of Materials with Low-Dimensional Structures (Kluwer Academic, Dordrecht, 1992).
15. G. Prasad, O. Srivastava, J. Phys. D 21, 1028 (1988).
16. M. Whittingham, Science 192, 1126 (1976).
17. H. Schafer: Chemical Transport Reactions (Academic Press, New York 1964).
18. F. Levy: Crystallography and crystal chemistry of materials with layered structures (Reidel, Dordrecht 1968).
19. R. Adelung, L. Kipp, J. Brandt, L. Tarcak, M. Traving, C. Kreis, M. Skibowski, Appl. Phys. Lett. 74, 3053 (1999).
20. R. Adelung, J. Brandt, K. Rossnagel, O. Seifarth, L. Kipp, M. Skibowski, C. Ramirez, T. Strasser, W. Schattke, Phys. Rev. Lett. 86, 1303 (2001).
21. R. Adelung, J. Brandt, L. Kipp, M. Skibowski, Phys. Rev. B 63, 165327-1 (2001).
22. J. A. Venables, Surf. Sci. 299/300, 798 (1994).
23. T. Vicsek: Fractal Growth Phenomena (World Scientific, Singapore 1992).
24. F. Katzenberg, R. Janlewing, J. Petermann, Colloid and Polymer Sci. 278, 280 (2000).
25. C. Ramirez, W. Schattke, Surf. Sci. 482–485, 424 (2001).
Atomic Structure of Alkali Halide Surfaces Roland Bennewitz, Sacha Schär, Enrico Gnecco, Oliver Pfeiffer, Martin Bammerlin, and Ernst Meyer Institute of Physics, University of Basel 4056 Basel, Switzerland
Abstract. The atomic structure of surfaces of alkali halide crystals has been revealed by means of high-resolution dynamic force microscopy. True atomic resolution is demonstrated both on steps surrounding islands or pits, and on a chemically mixed crystal. We have directly observed the enhanced interaction at low-coordinated sites by force microscopy. The growth of NaCl films on metal surfaces and radiation damage in a KBr surface are discussed based on force microscopy results. The damping of the tip oscillation in dynamic force microscopy might provide insight into dissipation processes on the atomic scale. Finally, we present atomically resolved images of wear debris found after scratching a KBr surface.
1
Introduction
Dynamic force microscopy has been developed in recent years into a method capable of atomic resolution microscopy on different types of surfaces, including semiconductors, metals, or oxides [1]. A class of insulating materials where particularly clear results have been obtained are alkali halide crystals. The comparably simple preparation of clean surfaces and the strong interaction contrast of the ionic bonds on short length scale contributed to the experimental progress [2]. In dynamic force microscopy, a tip attached to the end of a microfabricated cantilever oscillates with an amplitude of 2–10 nm at the mechanical resonance frequency of the cantilever, typically between 70 kHz and 350 kHz. Forces between tip and sample influence the oscillation: The increasing attraction between tip and sample decreases the resonance frequency. This frequency shift can be used as a feedback signal to control the tip-sample distance. Scanning the tip at constant frequency shift provides an image closely related to the topography of the surface. More accurately, a surface resembling a constant geometric mean of force and interaction energy is recorded [3]. The difference between such measurements and the ideal atomic topography becomes evident when a local variation of forces is encountered, as we will discuss in the following paragraphs [4].
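As a rough illustration of the quantities involved, frequency shifts measured with different cantilevers and amplitudes are often compared through the normalized frequency shift γ = k A^(3/2) Δf / f0, which applies when the oscillation amplitude is large compared with the range of the tip-sample force. The following minimal sketch evaluates γ for an amplitude and resonance frequency inside the ranges quoted above; the spring constant k and the frequency shift Δf are assumed example values, not data from the experiments.

```python
def normalized_frequency_shift(delta_f_hz, f0_hz, k_n_per_m, amplitude_m):
    """Return gamma = k * A**1.5 * delta_f / f0 in units of N * m**0.5."""
    return k_n_per_m * amplitude_m ** 1.5 * delta_f_hz / f0_hz


# Values inside the ranges given in the text
f0 = 160e3        # resonance frequency in Hz (text: 70 kHz to 350 kHz)
amplitude = 5e-9  # oscillation amplitude in m (text: 2 nm to 10 nm)

# Assumed, hypothetical values (not given in the text)
k = 30.0          # cantilever spring constant in N/m
delta_f = -50.0   # frequency shift in Hz; negative means net attraction

gamma = normalized_frequency_shift(delta_f, f0, k, amplitude)
print(f"normalized frequency shift: {gamma:.3e} N m^0.5")
```

Because γ removes the dependence on k, A and f0, images recorded at the same γ can be compared between setups, which is one reason it is a convenient feedback-setpoint descriptor.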
2
NaCl Films on Metal Surfaces
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 477–485, 2003. © Springer-Verlag Berlin Heidelberg 2003
Thin films of NaCl can be easily grown on flat metal and semiconductor surfaces by molecular evaporation from powder in a crucible. Electron diffraction
and scanning tunneling microscopy experiments revealed that monatomic steps on a Ge(100) substrate are overgrown by a continuous NaCl film in a carpet-like mode, due to the internal cohesion of the film being much stronger than the interaction with the substrate [5,6]. This growth mode has been confirmed for the Cu(111) surface by dynamic force microscopy experiments. NaCl islands extending over several hundred nanometers were found to smoothly cover several substrate steps [7]. In contrast, NaCl evaporated on Al(111) has been reported to grow in small islands nucleated at substrate steps [8]. A dynamic force microscopy image of this surface is given in Fig. 1a. For low coverage, islands of 10–20 nm lateral size start to grow not only at steps but also on terraces. The perfectly rectangular shape of the islands indicates the strong tendency to minimize the number of corner sites and the mobility of the NaCl molecules on the surface at room temperature. From scanning tunneling microscopy the thickness of the NaCl islands is difficult to judge. Constant current images correspond roughly to surfaces of constant density of states at the Fermi edge. The thickness of insulating layers will appear systematically smaller than geometrically expected, since they provide no electronic states themselves but can only extend states of the substrate [6,8]. In dynamic force microscopy there exists a similar problem. The force at a given distance between tip and Al substrate on the one hand and NaCl islands on the other hand can differ significantly. Such difference is largely caused by a variation of the electrostatic force contribution. The substrate and the islands have work functions as different as 1 V [9]. However, for the experiment represented in Fig.1 the sample bias has been adjusted such that the same electrostatic contribution to the total force is encountered on substrate and islands. 
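The bias adjustment described above can be illustrated with the usual capacitive expression for the electrostatic tip-sample force, F = (1/2)(dC/dz)(V_bias − V_cpd)^2. The following minimal sketch assumes, as a simplification not stated in the text, that the capacitance gradient dC/dz is the same above the substrate and above the islands; in that case the two forces are equal when the bias lies midway between the two contact potentials. The absolute contact potentials and the value of dC/dz are hypothetical; only their 1 V difference is taken from the text.

```python
def electrostatic_force(v_bias, v_cpd, dc_dz):
    """Capacitive force in N; dc_dz < 0 gives an attractive (negative) force."""
    return 0.5 * dc_dz * (v_bias - v_cpd) ** 2


def balancing_bias(v_cpd_a, v_cpd_b):
    """Bias at which two regions with equal dC/dz feel the same force."""
    return 0.5 * (v_cpd_a + v_cpd_b)


# Hypothetical absolute values; only the 1 V difference comes from the text
v_substrate = 0.0   # contact potential above the Al substrate, in V
v_island = 1.0      # contact potential above a NaCl island, in V
dc_dz = -1e-10      # assumed capacitance gradient in F/m

bias = balancing_bias(v_substrate, v_island)
force_substrate = electrostatic_force(bias, v_substrate, dc_dz)
force_island = electrostatic_force(bias, v_island, dc_dz)
print(bias, force_substrate, force_island)
```

At the balancing bias the electrostatic force does not vanish on either region; it is merely equal on both, so it no longer distorts the apparent island height.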
By this means the geometric island height is well reproduced, and we find that both monolayer and double-layer islands grow directly on the Al(111) surface. The existence of monolayer islands excludes
Fig. 1. Sub-monolayer NaCl film grown on an Al(111) surface at room temperature. (a) Topography image acquired by dynamic force microscopy (frame size 100 nm). (b) Cross-section along the direction indicated by the white line. Note that both monolayer and double-layer islands have grown
a growth mode exclusively based on NaCl molecules in vertical configuration, as suggested for the growth on Ge(100) where only double-layer islands were found [6]. A high-resolution image of a NaCl film grown on Cu(111) is presented in Fig. 2a. Atomic resolution is obtained on both the NaCl film and a small NaCl island grown on top of it. The corresponding atomic structure of the island is schematically depicted in Fig. 2b. Note that only one type of ion has been imaged as a protrusion, as is always found in dynamic force and scanning tunneling microscopy of alkali halide surfaces [2,6,8,10]. For an interpretation of the atomic contrast in scanning tunneling microscopy, calculations of the local density of states have been performed [8]. They indicate that the highest density of states around the Fermi edge coincides with the sites of chlorine ions, which show up as protrusions. For dynamic force microscopy, the contrast on the atomic scale depends on the atomic structure of the tip apex. Assuming that the charge of an ionic species at the tip apex determines repulsion or attraction at a given surface ion, the actual contrast may even change when the tip touches the surface and picks up ions [11]. The cross section in Fig. 2c quantifies the corrugation of the measured topography. There is a significant increase of the corrugation from the terraces over steps to corner sites. This increased corrugation reflects the specific role of low-coordinated atoms at the surface. The enhanced interaction at low-coordinated sites has many consequences, such as catalytic activity or nucleation of films grown on the surface. Pushing the resolution of dynamic force microscopy to the atomic scale provides real-space insights into such enhanced interactions. A detailed analysis of the experiment by means of atomistic simulations has revealed that not only the stronger electrostatic forces around
Fig. 2. NaCl island on top of a NaCl film grown on Cu(111) [2]. (a) Dynamic force microscopy image showing atomic resolution on island and film. Note the enhanced corrugation at the steps. (b) Corresponding sketch of the atomic structure of the island. (c) Cross-section along the white line
the exposed ions but also their stronger displacement in the force field of the tip contribute to the corrugation [11]. The maximal corrugation found on several alkali halide surfaces is around 0.1 nm. Further approach of the tip results in jumps of atoms and destruction of tip or surface. Such a corrugation of 0.1 nm is far beyond the difference of ionic radii or the expected corrugation of the short-ranged electrostatic force. Therefore, the distortion of surface ions by the force of the tip has to be considered as a generally important contribution to the atomic corrugation [4]. As a consequence, the atomic corrugation found on terraces will always be significantly smaller than on the steps separating them. It is interesting to note that the growth of NaCl films on different metal substrates exhibits significant differences. Fölsch et al. have studied the growth of NaCl on differently reconstructed copper surfaces [12]. They concluded that a registry of the NaCl films with charge modulations present on certain reconstructions strongly enhances the film-substrate interaction. The interaction can be strong enough to induce a faceting of the copper surface, where (111) facets are actually free of NaCl. The weak binding of NaCl to the Cu(111) surface may explain our finding that NaCl covers Cu(111) in micrometer-sized single-domain islands, overgrowing tens of substrate steps [9]. A theoretical study comparing this with the contrasting mechanism of the growth of well-ordered nanometer-sized NaCl islands on Al(111) would be worthwhile.
3
Radiation Damage in KBr
The irradiation of alkali halide crystals with ionizing radiation like low-energy electrons or ultraviolet light starts processes of defect creation and diffusion which strongly affect the crystal surface. Defect aggregation and desorption processes create surface features which often reflect the symmetry of the respective crystal structure, pointing once more to the particular role of low-coordinated sites in all surface processes. Force microscopy is the only tool to image such nanometer-sized surface features and thereby contributes to the understanding of the underlying microscopic processes. In a pioneering experiment, Wilson and Williams studied the evolution of KI surfaces exposed to ultraviolet light by scanning force microscopy in a dry nitrogen atmosphere. They found rectangular pits and hills with typical side lengths of 20–50 nm which were oriented along the main crystallographic directions [13]. Bennewitz et al. found triangular metallic islands on a CaF2(111) surface after electron irradiation [14]. The development of rectangular pits after electron irradiation on the KBr(100) surface was described by Such et al. [15]. High-resolution force microscopy images of such pits are shown in Fig. 3. The electron dose applied in this experiment led to the desorption of a fraction of a monolayer in the form of perfectly rectangular pits of one monolayer depth. Atomic resolution images prove that there are nearly no kinks in the
Fig. 3. Topography of KBr(100) cleavage faces after irradiation with low-energy electrons. (a) frame size 100 nm (b) frame size 12.7 nm. This frame was processed with a low-pass filter to enhance the atomic contrast on the surface. The pits have a size of 34 and 54 missing KBr molecules respectively
steps surrounding the pits. The enhanced appearance of step atoms simplifies the determination of the atomically exact size of the pits. Furthermore, judging from the atomic contrast at steps and in the pits, we can assume that the stoichiometry of the surface is retained [16]. Combining this information with the finding of a correlation between desorption yield and step density on the surface, Such et al. suggested a complete picture of the desorption process that is based on halogen interstitial formation and predominant desorption from kink sites [17].
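To make the notion of an atomically exact pit size concrete, the following sketch lists which kink-free rectangles of m × n ion sites would correspond to a given number of missing KBr molecules, using two ions per molecule. This is only an illustration under stated assumptions: the pits are taken to be perfect rectangles with edges along the ion rows, and the nearest-neighbour ion spacing of 0.33 nm is half the KBr lattice constant of 0.66 nm; which of the listed rectangles matches the pits in Fig. 3 is not stated in the text.

```python
import math

ION_SPACING_NM = 0.33  # nearest-neighbour K-Br distance, half of a = 0.66 nm


def rectangular_pits(n_molecules):
    """Return all (m, n) with m <= n and m * n == 2 * n_molecules ion sites."""
    n_ions = 2 * n_molecules
    return [
        (m, n_ions // m)
        for m in range(1, math.isqrt(n_ions) + 1)
        if n_ions % m == 0
    ]


for size in (34, 54):  # pit sizes quoted in the caption of Fig. 3
    for m, n in rectangular_pits(size):
        print(
            f"{size} molecules: {m} x {n} ions, "
            f"{m * ION_SPACING_NM:.2f} nm x {n * ION_SPACING_NM:.2f} nm"
        )
```

Any rectangle with an even number of ion sites contains equal numbers of K and Br ions, so such pits are compatible with the retained stoichiometry mentioned above.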
4
Damping in Dynamic Force Microscopy
The damping of the tip oscillation due to tip-sample interaction is an important aspect of dynamic force microscopy which has attracted a lot of interest, since it may open a path to study dissipative mechanisms with atomic resolution. Basically, any dissipative interaction must reduce the energy of the tip oscillation. In our experiments the amplitude is controlled to be constant, and a damping of the oscillation is recorded as an increase of the excitation amplitude necessary to maintain the amplitude [18]. The damping is recorded simultaneously with the topography. Indeed, a strong variation of the damping is found on the atomic scale. One example is given in Fig. 4a. First of all, a strong contrast between the Cu substrate and the NaCl film partially covering the substrate is found. The atomic structure of the NaCl film is revealed, whereby step atoms cause significantly stronger damping. Obviously, the lower coordination plays an even more important role in the damping process than in the topographic imaging (compare Sect. 2). The interpretation of these damping measurements is controversial. The contrast between the Cu substrate and the NaCl film can be explained in terms of the work function difference between the two. The varying electrostatic force may cause a variation of Joule dissipation in the semiconducting tip. Joule dissipation as a source of damping
Fig. 4. Maps of the damping of the tip oscillation recorded simultaneously with topography maps (frame size 18 nm). (a) NaCl film grown on Cu(111). Some sulphur impurities can be recognized on the Cu substrate. Note the enhanced damping at low-coordinated sites. (b) Cleavage face of a KCl0.6Br0.4(100) crystal. The strong damping found on single atoms demonstrates the ability of dynamic force microscopy to distinguish different anions
in force microscopy has been demonstrated for heterogeneous semiconductors [19]. For the atomic contrast the situation is more complicated and several possible artifacts have been described [20]. The damping signal seems to depend on the atomic structure of the tip apex even more than the force signal [21]. Therefore, mechanisms of dissipation in a loosely bound tip structure have been proposed. However, the fascinating possibility of a microscopy of local phonon excitation is also under consideration [22]. For the moment, the damping maps are an experimental quantity that comes for free and often provides very helpful contrast between materials and at steps. In an effort to understand the microscopic origins of the damping, we have studied mixed alkali halide crystals. The random distribution of ions with two different masses indeed provides an atomic contrast in the damping signal on a KCl0.6Br0.4(100) surface. A clearly enhanced damping of the tip oscillation can be found above a number of protrusions in Fig. 4b. By comparing this number with the relative density of the anions we assign the enhanced damping to Br sites. A contrast between Br and Cl anions is also found in the atomic corrugation of topography maps [4]. The atomic resolution of different chemical constituents is certainly an important step for the surface science of insulators. However, the current results cannot contribute to the understanding of dissipation phenomena on the atomic scale. The topographic contrast on the atomic scale can possibly create an artifact in the damping signal by convolution with long-range damping mechanisms. However, topography-independent atomic contrast in damping experiments has been demonstrated for the Si(111)7×7 surface reconstruction [21]. For the results presented here, the damping signal can at least serve as a high-contrast signal proving the ability of dynamic force microscopy to distinguish between different anions.
The ultimate experiment proving sensitivity for atomic dissipation processes will probably have to deal with damping variations on different isotopes in a chemically homogeneous crystal.
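A common way to turn the recorded excitation amplitude into an energy, sketched here as an illustration rather than as the procedure of Ref. [18], is to compare it with the excitation needed far from the surface. The intrinsic loss per cycle of the free cantilever is E0 = π k A²/Q, and for an oscillation held on resonance at constant amplitude the additional loss due to the tip-sample interaction is ΔE = E0 (A_exc/A_exc,0 − 1). The spring constant, quality factor, and excitation ratio below are assumed example values, not measured ones.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # J per eV


def intrinsic_loss_per_cycle(k, amplitude, q_factor):
    """Energy in J lost per cycle by the free cantilever: pi * k * A**2 / Q."""
    return math.pi * k * amplitude ** 2 / q_factor


def extra_dissipation_per_cycle(k, amplitude, q_factor, a_exc, a_exc_free):
    """Additional energy in J per cycle inferred from the excitation increase."""
    e0 = intrinsic_loss_per_cycle(k, amplitude, q_factor)
    return e0 * (a_exc / a_exc_free - 1.0)


# Assumed, hypothetical parameters
k = 30.0           # spring constant in N/m
amplitude = 5e-9   # oscillation amplitude in m (within the 2-10 nm range)
q_factor = 30000   # quality factor of the free cantilever in vacuum
ratio = 1.2        # excitation amplitude relative to its free value

delta_e = extra_dissipation_per_cycle(k, amplitude, q_factor, ratio, 1.0)
print(f"extra dissipation: {delta_e / ELEMENTARY_CHARGE:.3f} eV per cycle")
```

Such a conversion only quantifies the apparent dissipation; it does not by itself distinguish genuine atomic-scale dissipation from the artifacts discussed in this section.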
5
Scratches in KBr
Force microscopy can also be used to study microscopic processes leading to wear. In a model experiment, we have scratched surfaces of KBr crystals by bringing the tip into contact with the surface, scanning back and forth, and increasing the normal load on the tip until some atomic layers had been removed by the tip (see Fig. 5a). In order to image the scratch and the debris around it, we reduced the load on the tip to the minimum and scanned the tip in contact, recording both the topography and the lateral force acting on the tip. The lateral force signal reveals the atomic periodicity of the surface via an atomic stick-slip process [23]. Figure 5b shows a typical result of such experiments [24]. The debris forms well-organized terraces around the scratch, their atomic structure being in perfect registry with the crystal surface. Note that the atomic periodicity revealed in contact-mode force microscopy is not necessarily the true atomic structure, due to the finite size of the contact. However, the shift in registry between adjacent terraces can be resolved and, thereby, the height of the steps between the terraces is proven to be monatomic. It is somewhat surprising, and important for the further description of wear processes, that the debris does not form an amorphous, irregular mound as might be inferred from a macroscopic planing process. It can be assumed that the action of the tip releases and transports single molecules or very small clusters of KBr, which reorganize in the form of smooth terraces like the islands seen in Sect. 2. Possibly, this re-crystallization of the worn-off material is also supported by the pressure of the tip in contact.
Fig. 5. Lateral force maps acquired in contact mode around a scratch in a KBr(100) surface. (a) Overview showing one end of the scratch and the surrounding mounds. (b) Detail revealing the atomic structure of the terraces forming the mounds
6
Summary
Force microscopy has been developed into a method providing atomic resolution on surfaces of alkali halides. The results contribute to several aspects of surface science including film growth, enhanced interaction at low-coordinated sites, radiation damage at the surface, and atomic-scale wear. An interesting but not fully understood extension of the method may provide insight into dissipation processes at the atomic scale.
References
1. S. Morita, R. Wiesendanger, and E. Meyer, Noncontact Atomic Force Microscopy, NanoScience And Technology (Springer, Berlin, Germany, 2002).
2. R. Bennewitz, M. Bammerlin, and E. Meyer, in Noncontact Atomic Force Microscopy, NanoScience And Technology, edited by S. Morita, R. Wiesendanger, and E. Meyer (Springer, Berlin, 2002), pp. 93–108.
3. F. Giessibl and H. Bielefeldt, Phys. Rev. B 61, 9968 (2000).
4. R. Bennewitz, O. Pfeiffer, S. Schär, V. Barwich, and E. Meyer, Appl. Surf. Sci. 188, 232 (2002).
5. C. Schwennicke, J. Schimmelpfennig, and H. Pfnür, Surface Science 293, 57 (1993).
6. K. Gloeckler, M. Sokolowski, A. Soukopp, and E. Umbach, Phys. Rev. B 54, 7705 (1996).
7. R. Bennewitz, V. Barwich, M. Bammerlin, M. Guggisberg, C. Loppacher, A. Baratoff, E. Meyer, and H.-J. Güntherodt, Surface Science 438, 289 (1999).
8. W. Hebenstreit, J. Redinger, Z. Horozova, M. Schmid, R. Podloucky, and P. Varga, Surface Science 424, L321 (1999).
9. R. Bennewitz, M. Bammerlin, M. Guggisberg, C. Loppacher, A. Baratoff, E. Meyer, and H.-J. Güntherodt, Surf. Interface Anal. 27, 462 (1999).
10. J. Repp, S. Fölsch, G. Meyer, and K.-H. Rieder, Phys. Rev. Lett. 86, 252 (2001).
11. R. Bennewitz, A. Foster, L. Kantorovich, M. Bammerlin, C. Loppacher, S. Schär, M. Guggisberg, E. Meyer, H.-J. Güntherodt, and A. Shluger, Phys. Rev. B 62, 2074 (2000).
12. S. Fölsch, A. Riemann, J. Repp, G. Meyer, and K.-H. Rieder, Phys. Rev. B 66, 161409 (2002).
13. R. Wilson and R. Williams, Nucl. Instr. Meth. Phys. B 101, 122 (1995).
14. R. Bennewitz, D. Smith, and M. Reichling, Phys. Rev. B 59, 8237 (1999).
15. B. Such, P. Czuba, P. Piatkowski, and M. Szymonski, Surface Science 451, 203 (2000).
16. R. Bennewitz, S. Schär, V. Barwich, O. Pfeiffer, E. Meyer, F. Krok, B. Such, J. Kolodziej, and M. Szymonski, Surface Science 474, L197 (2001).
17. B. Such, J. Kolodziej, P. Czuba, P. Piatkowski, P. Struski, F. Krok, and M. Szymonski, Phys. Rev. Lett. 85, 2621 (2000).
18. C. Loppacher, R. Bennewitz, O. Pfeiffer, M. Guggisberg, M. Bammerlin, S. Schär, V. Barwich, A. Baratoff, and E. Meyer, Phys. Rev. B 62, 13674 (2000).
19. T. Stowe, T. Kenny, D. Thomson, and D. Rugar, Appl. Phys. Lett. 75, 2785 (1999).
20. H. Hug and A. Baratoff, in Noncontact Atomic Force Microscopy, NanoScience And Technology, edited by S. Morita, R. Wiesendanger, and E. Meyer (Springer, Berlin, 2002), pp. 395–432.
21. C. Loppacher, M. Bammerlin, M. Guggisberg, S. Schär, R. Bennewitz, A. Baratoff, E. Meyer, and H.-J. Güntherodt, Phys. Rev. B 62, 16944 (2000).
22. M.-Y. Mo and L. Kantorovich, J. Phys. C: Solid State Phys. 13, 1439 (2001).
23. R. Bennewitz, E. Gnecco, T. Gyalog, and E. Meyer, Tribology Letters 10, 51 (2001).
24. E. Gnecco, R. Bennewitz, and E. Meyer, Phys. Rev. Lett. 88, 215501 (2002).
Room Temperature Spin Polarization of Epitaxial Half-Metallic Fe3O4(111) and CrO2(100) Films
Mikhail Fonin1, Yuriy Dedkov1, Christian König1, Gernot Güntherodt1, Ulrich Rüdiger2, Joachim Mayer3, Denis Vyalikh4, and Serguei Molodtsov4
1 II. Physikalisches Institut, Rheinisch-Westfälische Technische Hochschule Aachen, 52056 Aachen, Germany
2 Fachbereich Physik, Universität Konstanz, 78457 Konstanz, Germany
3 Gemeinschaftslabor für Elektronenmikroskopie, Rheinisch-Westfälische Technische Hochschule Aachen, 52056 Aachen, Germany
4 Institut für Oberflächen- und Mikrostrukturphysik, Technische Universität Dresden, D-01062 Dresden, Germany
Abstract. The electronic structure of thin epitaxial films of magnetite (Fe3O4) and chromium dioxide (CrO2) has been investigated at 293 K by means of spin- and angle-resolved photoemission spectroscopy. Epitaxial Fe3O4(111) films have been grown on W(110) and Al2O3(1120) substrates by oxidizing epitaxial Fe(110) films. High surface quality and chemical homogeneity as well as high crystalline order in the bulk of the Fe3O4 films were confirmed for both substrates by means of TEM, STM and LEED. The Fe3O4(111) epitaxial films show a maximum spin polarization value of −(80 ± 5)% near EF at 293 K. Two types of dispersive states were identified in the Γ-M direction of the Fe3O4(111) surface Brillouin zone at 293 K by means of angle-resolved ultraviolet photoemission spectroscopy using synchrotron radiation. The surface electronic band structure of the Fe3O4(111) film is described as a product of two overlapping contributions following the symmetries of the oxygen sublattice surface Brillouin zone and the iron sublattice surface Brillouin zone, respectively. Epitaxial CrO2(100) films have been deposited on TiO2(100) substrates by a chemical vapor deposition technique. High structural quality of the CrO2(100) films was confirmed by x-ray diffraction. The surface and the interface properties of the CrO2(100) films were studied by STM and TEM. Near the Fermi level an energy gap was observed for spin-down electrons and a spin polarization of about +(95 ± 10)% was found at 293 K.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 487–503, 2003. © Springer-Verlag Berlin Heidelberg 2003
The defining feature of half-metallic ferromagnets (HMFs) is metallic conductivity for one spin component and insulating behavior for the other. The theoretically predicted 100% spin polarization
at the Fermi level EF of HMFs [1,2,3,4,5] makes them promising materials for magnetoelectronic devices [6,7,8]. According to Jullière's model [9] the tunnel magnetoresistance (TMR) of ferromagnet/insulator/ferromagnet tunnel junctions depends on the spin polarization of the ferromagnetic electrodes used: the TMR increases with increasing spin polarization of the electrode materials. This fact has revived intense research interest in half-metallic ferromagnets such as Heusler alloys [10], manganites [11], and transition metal oxides [1,2,3,12,13,14,15,16,17]. Theoretical calculations based on the local spin-density approximation (LSDA) to density-functional theory have predicted only minority spin states at EF for Fe3O4 [4,5] and only majority spin states at EF for CrO2 [1,2,3]. Experimentally, spin-resolved photoelectron yield measurements performed on single-crystalline Fe3O4 samples showed a large spin polarization of −60% near the photothreshold [18,19,20]. The possible half-metallic ferromagnetic nature of epitaxial Fe3O4(111) thin films grown on Fe(110)/W(110) was experimentally confirmed by means of spin- and angle-resolved photoemission spectroscopy. In this experiment a negative spin polarization of −(80 ± 5)% at EF was measured at room temperature [21]. Recently, spin polarization values of over 90% near EF were found for CrO2 at 1.8 K using superconducting point contact spectroscopy [12,14,15,16], although values of 95% had been obtained earlier at 293 K for binding energies of about 2 eV below EF using spin-polarized photoemission spectroscopy [17]. In this article we present a study of the crystallographic surface and bulk structure as well as the electronic properties of thin epitaxial Fe3O4(111) and CrO2(100) films. Epitaxial Fe3O4(111) films have been grown on W(110) and Al2O3(1120) substrates by oxidizing epitaxial Fe(110) films.
High structural order of the obtained Fe3O4(111) surface was confirmed by low energy electron diffraction (LEED) for W(110) substrates and by scanning tunneling microscopy (STM) as well as LEED for Al2O3(1120) substrates. Sharp interfaces without intermixing and a good crystalline quality were found by transmission electron microscopy (TEM) for Fe3O4 films grown on Al2O3(1120). The Fe3O4(111) films prepared on W(110) show a spin-polarization value of about −(80 ± 5)% near EF at 293 K, whereas the films grown on Al2O3(1120) show a lower spin-polarization value of about −(60 ± 5)%. The surface electronic band structure of epitaxial Fe3O4(111) films grown on W(110) substrates has been investigated at 293 K by means of angle-resolved ultraviolet photoemission spectroscopy using synchrotron radiation. Two types of dispersive states were identified in the Γ-M direction of the Fe3O4(111) surface Brillouin zone. The surface electronic band structure of the Fe3O4(111) film is described as a product of two overlapping contributions following the symmetries of the oxygen sublattice surface Brillouin zone and the iron sublattice surface Brillouin zone, respectively.
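The dependence of the TMR on electrode spin polarization in Jullière's model can be illustrated numerically. The following is a minimal Python sketch that evaluates the commonly quoted form TMR = 2·P1·P2/(1 − P1·P2), with the TMR defined here as (R_AP − R_P)/R_P. The function name `julliere_tmr` and the assumption of two identical electrodes are illustrative choices and not part of the experiments reported in this article; only the polarization magnitudes are taken from the text.

```python
def julliere_tmr(p1: float, p2: float) -> float:
    """Return TMR = (R_AP - R_P) / R_P in Julliere's model.

    p1, p2 are the electrode spin polarizations given as fractions (e.g. 0.8).
    """
    product = p1 * p2
    if product >= 1.0:
        # The ratio diverges when both electrodes are fully polarized.
        raise ValueError("TMR diverges for p1 * p2 >= 1")
    return 2.0 * product / (1.0 - product)


if __name__ == "__main__":
    # Assumption for illustration only: both electrodes have the same |P|.
    for label, p in [("Fe3O4 on Al2O3", 0.60), ("Fe3O4 on W", 0.80), ("CrO2", 0.95)]:
        tmr = julliere_tmr(p, p)
        print(f"{label}: |P| = {p:.2f} -> TMR = {100.0 * tmr:.1f} %")
```

The steep rise of the ratio as |P| approaches 1 is the reason why materials with nearly complete spin polarization at EF are of interest as electrode materials.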
Epitaxial CrO2(100) films have been deposited on TiO2(100) substrates by a chemical vapor deposition technique. X-ray diffraction (XRD) measurements on CrO2(100) indicate a high crystalline order of the prepared films. A sharp CrO2(100)/TiO2(100) interface without intermixing and a good crystalline quality without any evidence of Cr2O3 formation were found by TEM. The CrO2(100) surface, studied by STM, shows the growth of large pyramidal epitaxial islands with the edges oriented along the main crystallographic axes of the CrO2(100) surface. Atomically flat terraces separated by steps of 4.4 Å height are observed by STM on the island sides. A spin polarization value of about +(95 ± 10)% near EF was found at 293 K by spin-resolved photoemission spectroscopy for epitaxial CrO2(100) films.
2
Experimental
The scanning tunneling microscopy characterization of the prepared thin films was carried out in an ultra-high vacuum (UHV) system with a base pressure of 8×10⁻¹¹ mbar equipped with a commercial Omicron UHV AFM/STM. All STM measurements were carried out at room temperature using electrochemically etched polycrystalline tungsten tips cleaned in UHV by Ar+ sputtering. The presented STM images were taken in the constant-current mode. The transmission electron microscopy characterization of the layered system was performed on an analytical TEM with a field emission gun (FEI TECNAI F20) equipped with an EDX detector and a post-column imaging filter (GATAN GIF). The TEM specimens were prepared using standard cross-sectioning techniques with a final Ar+ ion thinning at 3 keV ion energy. The photoemission experiments were carried out at 293 K in an ultra-high vacuum system (base pressure 1×10⁻¹⁰ mbar) for angle-resolved photoemission spectroscopy with spin analysis, described in detail in Ref. [22]. The unpolarized He I (hν=21.2 eV) resonance line was used for the photoemission experiments. The spin-resolved photoemission spectra have been recorded in the normal emission mode by a 180° hemispherical energy analyzer connected to a 100 kV Mott detector for spin analysis. The energy resolution was 100 meV full width at half maximum (FWHM) and the angle resolution ±3°. The spin-resolved measurements have been performed in magnetic remanence after having applied a magnetic field pulse of about 500 Oe along the in-plane 112 axis for thin Fe3O4(111) films and along the 001 axis for the CrO2(100) films. The angle-resolved photoemission spectroscopy experiments were carried out at the Russian-German Beamline at the BESSY II storage ring. This dipole beamline (DIP-16-1A) provides a tuneable source of photons over an energy range of 30–1500 eV. A 127° CLAM4 analyzer for ARPES experiments with an angle resolution of 1° was used.
The overall system resolution was set to 100 meV (FWHM). The W(110) substrate was oriented within 1° along
the W(110) surface normal direction and the [001] direction in the plane of the storage ring. Epitaxial 50 Å thick Fe(110) films were prepared on the W(110) substrate in situ by electron-beam evaporation at 293 K, followed by annealing at 600 K to improve the surface quality. The Fe(110) thin films were then successfully oxidized into Fe3O4(111) films by exposure to high-purity oxygen gas and a subsequent annealing procedure at 600 K. Epitaxial 200 Å thick Fe(110) films were also prepared in situ by electron-beam evaporation on a 100 Å thick Mo(110) buffer layer on Al2O3(1120) substrates. The samples were annealed at 800 K to improve the bulk and surface structure. The epitaxial Fe(110) thin films were then oxidized into Fe3O4(111) films by annealing in a 10⁻⁶ mbar oxygen atmosphere at 1000 K. After the preparation of the Fe3O4/Fe(110)/Mo(110)/Al2O3(1120) system the films were transferred into the photoemission spectroscopy (PES) chamber, which required breaking the UHV conditions. After the introduction into the PES chamber the sample surface was cleaned by Ar+ sputtering under grazing angle (E=500 eV; p=1×10⁻⁶ mbar) for 10 min, followed by an annealing step in a 5×10⁻⁶ mbar O2 atmosphere at 1000 K for 30 min. The epitaxial CrO2(100) films were prepared on isostructural TiO2(100) substrates by a chemical vapor deposition (CVD) technique proposed by Ishibashi [23]. The deposition vessel consists of a quartz glass tube placed inside a furnace and kept at 553 K. CrO3 is decomposed at a temperature of 533 K within a two-zone tube furnace. An oxygen flow transports the decomposed precursor material (CrO3) and its intermediate oxide phases (CrO5, Cr8O21) into the deposition zone, where the substrate is placed on a sample holder separately heated to 663 K, enabling the growth of CrO2. Immediately after the growth the CrO2(100) films were introduced into an ultra-high vacuum chamber for the STM or PES analysis.
The surface of the CrO2(100) films was cleaned in UHV by moderate Ar+ sputter cycles of 30 sec at 500 eV at grazing incidence, followed by annealing at 423 K for 12 hours.
3
Results and Discussion
3.1
Magnetite
Figure 1a shows the LEED image of an epitaxial 200 Å thick Fe(110) film grown on the Mo(110)/Al2O3(1120) surface. A very sharp (1×1) LEED pattern of bcc Fe(110) was observed. The epitaxial Fe(110) films were then oxidized into Fe3O4(111) by annealing in 5×10⁻⁶ mbar O2 at 1000 K for 30 min. The high crystalline order of the obtained Fe3O4(111) films was confirmed by LEED (Fig. 1b). A hexagonal (2×2) overstructure typical for a well-ordered unreconstructed Fe3O4(111) surface [21,24,25] is clearly visible. Figure 2 shows a cross-section TEM image of the Fe3O4(111) film grown on the Fe(110)/Mo(110)/Al2O3(1120) system. Lattice planes are resolved
Fig. 1. LEED patterns of (a) a pure Fe(110) film and (b) a Fe3O4(111) film grown on Mo(110)/Al2O3(1120)
in all three layers of the system and in the sapphire substrate. The interfaces between all four materials are clearly visible and abrupt for the Mo(110)/Al2O3(1120) and the Fe3O4(111)/Fe(110) interface, at which only steps of monolayer height can be observed. The interface between Mo and Fe is less clearly visible in Fig. 2; however, the chemical analysis performed with the energy filter revealed that it is chemically sharp within the given resolution limit (about 2 nm). No indication of interdiffusion has been found at any of the three interfaces. TEM measurements show that the thickness of the Fe3O4(111) layer is about 150 Å. The surface morphology of the obtained Fe3O4(111) film was studied by STM and is shown in Fig. 3a. Large epitaxial islands with lateral extensions of more than 100 nm have been formed (Fig. 3a). The longer side edges of the islands are oriented along the in-plane [110], [011], and [101] crystallographic directions of the Fe3O4(111) surface. The islands are atomically flat with step heights of approximately 5 Å, which corresponds to the distance between equivalent Fe3O4(111) surface terminations [24]. The vertical peak-to-peak roughness of the surface on a lateral scale of 1000 nm is about 60 Å. Figure 3b shows an STM image with atomic resolution of the regular Fe3O4(111) surface. A hexagonal lattice with a 6 Å periodicity and a corrugation amplitude of about 0.5 Å can be clearly seen in this image. This value is in good agreement with the Fe3O4(111) in-plane
Fig. 2. TEM cross-section micrograph of the Fe3O4(111) film on the Fe(110)/Mo(110)/Al2O3(1120) system imaged along the [001] zone axis of Fe(110). White lines indicate the individual interfaces between different layers
Fig. 3. STM image of an epitaxial Fe3O4(111) film on Fe(110)/Mo(110)/Al2O3(1120): (a) a 200×200 nm² surface section and (b) an atomically resolved regular Fe3O4(111) surface (6.5×6.5 nm²). Tunneling voltage (UT) was +0.8 V; tunneling current (IT) was 0.12 nA
lattice constant of 5.92 Å. The observed structure of the Fe3O4(111) surface is in good agreement with the LEED measurements as well as with previous studies [24]. After preparation the Fe3O4(111)/Fe(110)/Mo(110)/Al2O3(1120) samples were transferred through air from the thin film deposition system into the PES chamber for the spin-resolved PES experiments. Directly after the ex situ transfer of the Fe3O4(111) samples into the PES chamber the Fe3O4(111) surface was cleaned by Ar+ sputtering for 10 min under a grazing angle with a beam energy of E=500 eV and an Ar pressure of 1×10⁻⁶ mbar, followed by annealing in a 5×10⁻⁶ mbar O2 atmosphere at 800 K for 30 min. The chemical analysis performed by Auger spectroscopy directly after the cleaning procedure did not reveal carbon or any other contamination. The crystalline quality of the surface was preserved, as confirmed by LEED. Figure 4 shows spin-resolved photoemission spectra recorded near EF together with the total photoemission intensity (left-hand panel: a, c, e) and the resulting spin polarization values (right-hand panel: b, d, f) as a function of the binding energy relative to the Fermi level for Fe3O4(111) films as well as for Fe(110) films. In Fig. 4a, c, e the spectra of a pure Fe(110) film [21], a Fe3O4(111) layer on the Fe(110)/W(110) system [21], and a Fe3O4(111) layer on the Fe(110)/Mo(110)/Al2O3(1120) system [26] are presented, respectively. The solid circles in (b) correspond to the spin polarization of the pure Fe(110) film, in (d) to that of the Fe3O4(111)/Fe/W system, and in (f) to that of the Fe3O4(111)/Fe/Mo/Al2O3 system. The spin-resolved spectra of the valence band of Fe(110) (Fig. 4a) show the emission from the $\Sigma_1^{\downarrow}\oplus\Sigma_3^{\downarrow}$ states near 0.25 eV and from the $\Sigma_1^{\uparrow}\oplus\Sigma_4^{\uparrow}$ states near 0.7 eV. The spectra are in agreement with previous measurements [27,28]. The Fe(110) films show a spin-polarization value of about −(85 ± 5)% at EF at 293 K.
Fig. 4. Left-hand panel: Spin-resolved photoemission spectra of (a) the Fe(110)/W(110) system, (c) the Fe3O4(111) on Fe(110)/W(110) system, and (e) the Fe3O4(111) on Fe(110)/Mo(110)/Al2O3(1120) system for hν=21.2 eV in normal emission. (Spin down: down triangles; spin up: up triangles; total photoemission intensity: solid circles). Right-hand panel: spin polarization as a function of binding energy of (b) the Fe(110)/W(110) system, (d) the Fe3O4(111) on Fe(110)/W(110) system, and (f) the Fe3O4(111) on Fe(110)/Mo(110)/Al2O3(1120) system
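The spin polarization shown in the right-hand panels is related to the spin-resolved intensities of the left-hand panels through the standard definition P = (I↑ − I↓)/(I↑ + I↓). The Python sketch below illustrates this relation and its inverse. The function names and the sample numbers are hypothetical and serve only as an illustration; this is not the data-reduction procedure used for the measurements, which also involves the calibration of the Mott detector.

```python
from typing import Tuple


def spin_polarization(i_up: float, i_down: float) -> float:
    """Return P = (I_up - I_down) / (I_up + I_down) as a fraction."""
    total = i_up + i_down
    if total <= 0.0:
        raise ValueError("total intensity must be positive")
    return (i_up - i_down) / total


def spin_resolved_intensities(total: float, polarization: float) -> Tuple[float, float]:
    """Split a total intensity into (I_up, I_down) for a given polarization."""
    i_up = 0.5 * total * (1.0 + polarization)
    i_down = 0.5 * total * (1.0 - polarization)
    return i_up, i_down


if __name__ == "__main__":
    # Hypothetical example: total intensity of 100 counts with P = -80 %.
    i_up, i_down = spin_resolved_intensities(100.0, -0.80)
    print(f"I_up = {i_up:.1f}, I_down = {i_down:.1f}")
    print(f"P = {100.0 * spin_polarization(i_up, i_down):.1f} %")
```

A negative P, as found for Fe3O4(111), therefore means that the spin-down (minority) intensity dominates at the given binding energy, whereas a positive P, as found for CrO2(100), means that the spin-up (majority) intensity dominates.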
The spin-resolved spectra of the valence band near EF clearly show a dominant emission from the spin-down Fe-3d states (≤1.5 eV) (Fig. 4c) and from the O-2p states (≥1.5 eV) (not shown here) for Fe3O4(111) films grown on Fe(110)/W(110) substrates. The spin-resolved spectra exhibit a clear half-metallic feature, i.e. a metallic Fermi cut-off for the minority spin together with a disappearance of spectral weight near EF, reflecting the energy gap, for the majority spin. The features of the Fe-3d bands in the range of 2 eV below EF in the PES spectra of the Fe3O4(111) film on the Al2O3(1120) substrate (e) do not differ from those of the film on W(110) (c), but a significant decrease of the Fe-3d photoemission intensity has been observed for the films grown on the Al2O3(1120) substrate. The Fe3O4(111)/Fe(110)/W(110) system shows a negative spin polarization at EF of about −(80 ± 5)% with a parallel magnetic coupling between Fe3O4(111) and the underlying Fe(110), contrary to the antiparallel coupling in the Fe3O4(111)/Fe(110) system observed by Kim et al. [25]. The Fe3O4(111)
films on the Al2O3(1120) substrates show a maximum negative spin polarization at EF of about −(60 ± 5)% at 293 K (Fig. 4f). However, this polarization value cannot be related to a contribution from the Fe(110) underlayer, as the thickness of the Fe3O4(111) film determined by TEM (Fig. 2) is about 150 Å. Comparing the Fe3O4(111) films on W(110) and Al2O3(1120) substrates, the reduction of the observed spin polarization for the films on the Al2O3(1120) substrate may be caused by the cleaning procedure or by strain in the Fe3O4(111) surface layers caused by the lattice mismatch between the Mo(110) and Fe(110) layers as well as between the Fe(110) and Fe3O4(111) layers. As reported by Jeng et al., the resulting strain can lead to a reduction of the spin polarization value [29]. In this case the presence of uniaxial strain leads to a broadening of the B-site Fe-3d bands, reducing the insulating band gap of the majority spin. As a consequence the half-metallic behavior of cubic magnetite is weakened, and in high-strain regimes Fe3O4(111) can eventually become a normal metal. Another reason for the decrease in photoemission intensity may be the cleaning procedure after the ex situ sample transfer into the PES chamber. As reported before [30], the cleaning procedure can crucially influence the surface structure, leading to a decrease or a total loss of the spin polarization. The surface electronic band structure of epitaxial Fe3O4(111) films grown on W(110) substrates prepared in situ has been investigated at 293 K by means of angle-resolved ultraviolet photoemission spectroscopy using synchrotron radiation [31,32]. The left-hand panel of Fig. 5 shows representative
Fig. 5. Left-hand panel: ARPES spectra of a Fe3O4(111) thin film surface recorded at hν=58 eV for emission polar angles 0–25° along the Γ-M direction of the Fe3O4(111) surface Brillouin zone; curves 1–5 mark the major O 2p electronic state dispersions; right-hand panel: a LEED image (E=121 eV) of the prepared Fe3O4(111) surface and a schematic representation of the oxygen-derived and iron-derived surface Brillouin zones
valence band spectra of an epitaxial Fe3O4(111) thin film as a function of the emission polar angle Θ (0–25°) along the Γ-M direction of the Fe3O4(111) surface Brillouin zone. Each spectrum was normalized to its maximum intensity. The photon energy of hν=58 eV, which corresponds to the Fe 3p→3d resonance [32,33], was used in the photoemission experiments; it yields an increased photoemission intensity from the Fe-3d states near EF. As reported before, no significant difference in the experimentally derived band dispersions was found between the off- and on-resonance measurement regimes [34,35]. The angle-resolved photoemission (ARPES) spectra show an Fe 3d-derived emission extending over 2 eV below EF and an O 2p-derived emission between 2.5 and 9 eV of binding energy. The major dispersions of the O-2p states in the Fe3O4(111) valence band are marked by dashed lines (1–5) in Fig. 5. The borders of the surface Brillouin zones were calculated separately for the oxygen sublattice and the iron sublattice (right-hand panel in Fig. 5). Following the Fe3O4(111) surface structure, the oxygen sublattice surface Brillouin zone was considered with the $M^{\mathrm{O}}_{0\frac{1}{2}}$ point at 1.22 Å⁻¹, giving the angle values of Θ=19.04° and Θ=22.32° for photoelectrons with binding energies of 0 eV (EF) and 14 eV, respectively. The line between these two points represents the surface Brillouin zone border of the oxygen sublattice (line B in Fig. 6). The iron sublattice was considered in a similar way with the $M^{\mathrm{Fe}}_{0\frac{1}{2}}$ point at 0.61 Å⁻¹, giving the corresponding angle values of Θ=9.39° and Θ=10.95°. The line between these two points represents the surface Brillouin zone border of the iron sublattice (line A in Fig. 6). The obtained spectroscopy data are used to plot colored intensity maps of the Fe3O4(111) thin film valence band as shown in Fig. 6: (a) over the range of 14 eV below EF and (b) over the range of 1.5 eV below EF.
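The emission angles of the zone borders follow from the free-electron final-state relation k∥ = (2·m·E_kin)^(1/2)/ħ · sin Θ. The Python sketch below evaluates this relation under two assumptions that are not stated in the text: the kinetic energy is taken as E_kin = hν − Φ − E_B with an effective work function Φ = 4.7 eV (a value chosen here because it is consistent with the quoted angles), and the M points are compared with those of a two-dimensional hexagonal lattice, |ΓM| = 2π/(√3·a), using a = 5.92 Å for the iron sublattice and half that value for the oxygen sublattice. The function names are illustrative.

```python
import math

# hbar^2 / (2 m_e) in eV * Angstrom^2
HBAR2_OVER_2M = 3.80998


def emission_angle_deg(k_par: float, photon_energy: float,
                       binding_energy: float, work_function: float = 4.7) -> float:
    """Polar emission angle (degrees) for a parallel wave vector k_par (1/Angstrom).

    work_function is an ASSUMED effective value in eV, not given in the text.
    """
    e_kin = photon_energy - work_function - binding_energy
    if e_kin <= 0.0:
        raise ValueError("kinetic energy must be positive")
    k_total = math.sqrt(e_kin / HBAR2_OVER_2M)
    if k_par > k_total:
        raise ValueError("k_par exceeds the photoelectron wave vector")
    return math.degrees(math.asin(k_par / k_total))


def hexagonal_m_point(a: float) -> float:
    """Gamma-M distance (1/Angstrom) of a 2D hexagonal lattice with constant a (Angstrom)."""
    return 2.0 * math.pi / (math.sqrt(3.0) * a)


if __name__ == "__main__":
    for name, k_m in [("oxygen sublattice", 1.22), ("iron sublattice", 0.61)]:
        for e_b in (0.0, 14.0):
            theta = emission_angle_deg(k_m, 58.0, e_b)
            print(f"{name}: E_B = {e_b:4.1f} eV -> theta = {theta:.2f} deg")
    print(f"M point for a = 5.92 A: {hexagonal_m_point(5.92):.3f} 1/A")
    print(f"M point for a = 2.96 A: {hexagonal_m_point(2.96):.3f} 1/A")
```

Because the photoelectron wave vector shrinks with increasing binding energy, a fixed k∥ maps onto a larger emission angle at higher binding energy, which is why the zone borders appear as tilted lines rather than vertical ones in an angle-versus-energy map.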
The O-2p electronic states show a clear, almost monotonic dispersion from about 8 eV down to 2.5 eV of binding energy as the polar angle changes
Fig. 6. The contour plot of the ARPES intensity along the Γ-M direction: (a) for the emission between EF and 14 eV of binding energy and (b) extended over 1.5 eV below the Fermi level
from 0 to 20° ($M^{\mathrm{O}}_{0\frac{1}{2}}$) (lines 1–5 in Fig. 6a). At the same time the Fe-3d electronic states show a clearly non-monotonic dispersion with a period of 19°, starting at the $\Gamma_{00}$ point and running up to the $M^{\mathrm{O}}_{0\frac{1}{2}}$ point (also marked as $\Gamma^{\mathrm{Fe}}_{01}$) (Fig. 6b). This shows that the dispersion of the Fe 3d-derived states follows the symmetry of the iron sublattice. However, the dispersion symmetry of the O-2p electronic states cannot be described in terms of the iron sublattice surface Brillouin zone, as it follows the symmetry of the oxygen sublattice. This means that a full description of the surface electronic band structure of the Fe3O4(111) film is only possible when two different representations of the Brillouin zones are considered. The first one is the oxygen-derived Brillouin zone $\Gamma_{00}$-$M^{\mathrm{O}}_{0\frac{1}{2}}$, corresponding to the emission angles 0–20° (line B in Fig. 6a and b), and the second one is the iron-derived Brillouin zone $\Gamma_{00}$-$M^{\mathrm{Fe}}_{0\frac{1}{2}}$, corresponding to the emission angles 0–9° (line A in Fig. 6a and b). Different representations of the Brillouin zones have recently been used for the description of electronic states in a quasi-two-dimensional La-graphite intercalation compound [36]. The weak dispersion of the O-2p and Fe-3d states in the valence-band structures observed along the Γ-L direction [32,33] indicates an almost two-dimensional character of the crystalline arrangement and electronic structure of the Fe3O4(111) surface as well.
3.2
Chromium Dioxide
The crystallographic quality of the CrO2(100) films was characterized by x-ray diffractometry (XRD). A typical x-ray diffraction pattern (Fig. 7a) shows only dominant CrO2(200) and CrO2(400) peaks, indicating a preferred a-axis growth. No impurities in the CrO2 film, including Cr2O3, were found. Peaks labeled with an asterisk (∗) are due to the underlying TiO2(100) substrate and the sample holder made of brass. Magnetic properties of CVD grown CrO2(100) films were characterized by means of the ex situ magneto-optical Kerr effect (MOKE). Figure 7b shows longitudinal MOKE hysteresis loops of CrO2(100) films with the applied magnetic field along the in-plane easy (c axis) and hard (a axis) magnetic axes. The coercive field along the easy magnetic axis, which can be extracted from the plot, is around 160 Oe. Figure 8 shows a cross-section TEM image of a 100 nm thick CrO2(100) film on a TiO2(100) substrate. The interface between CrO2(100) and the TiO2(100) substrate (marked by a white arrow in Fig. 8) is well defined without any visible formation of a Cr2O3 interlayer [37,38]. The electron diffraction pattern of CrO2(100) films with an incoming electron beam parallel to the in-plane [001] direction of CrO2 exhibits a four-fold symmetry (see inset in Fig. 8), indicating the formation of an epitaxial CrO2(100) film. The surface quality of the prepared CrO2(100) films was probed by STM under UHV conditions. The measurements performed directly after the introduction of the samples into UHV show a strong contamination on top
Fig. 7. (a) An x-ray diffraction pattern of an epitaxial CrO2(100) film deposited on a TiO2(100) substrate. (b) In-plane magnetic hysteresis loops of a CrO2(100) film taken at 293 K
Fig. 8. TEM (cross section) image of an epitaxial CrO2(100) film on a TiO2(100) substrate. The white arrow indicates a sharp interface boundary between the CrO2 film and the TiO2 substrate. Inset: electron diffraction pattern of a CrO2(100) film on a TiO2(100) substrate with the incident electron beam parallel to the in-plane [001] direction of CrO2(100)
of the CrO2 surface. After sputtering (500 eV, incidence angle 10°, 5 min) and annealing (450 K, 12 hours) the contamination of the surface could be significantly reduced. Large epitaxial truncated pyramidal islands of CrO2 with average heights of 10 nm and lateral dimensions of about 500×600 nm were observed by STM (not shown here). The island edges are preferably oriented along the [001] and [010] in-plane crystallographic directions of the CrO2 surface. Figure 9 (left-hand panel) shows an STM image obtained on the side of such a pyramid with the edges preferably parallel to the [001] direction of the CrO2 surface. The atomically flat terraces of CrO2 are clearly visible in this image. The height of each individual step was determined to be about 4.4 Å (right-hand panel in Fig. 9). This value is in good agreement with the CrO2 lattice constant as well as with that measured before [16]. Small rounded
Fig. 9. STM image of an epitaxial CrO2(100) film grown on a TiO2(100) substrate (left-hand panel). Profile A corresponds to an STM line scan over several CrO2(100) steps (right-hand panel). Scanning parameters: 110×110 nm²; UT=+1 V; IT=0.12 nA
islands observed on the surface are attributed to contamination due to the ex situ film preparation and transfer. Figure 10 shows a series of angle-resolved PES spectra recorded near EF and the resulting spin polarization as a function of the sputtering time (inset in Fig. 10), calculated from the spin-resolved PES spectra of the CrO2(100) films for a variety of sputtering and annealing periods. The position of the Cr-3d band in the range of 2 eV in the PES spectra changes considerably with increasing sputtering time (each cycle 30 sec; energy of the Ar+ ion beam: 500 eV), from approximately 2.3 eV below EF for the non-sputtered CrO2(100) film to 2 eV below EF after 750 sec of sputtering. At the same time an increase in the intensity of the shifted Cr-3d band has been observed. These effects may be due to an increasing structural disorder of the CrO2(100) surface produced by the sputtering process. Annealing of the sputtered CrO2(100) film at 150 °C for 12 hours in UHV restores the peak at 2.3 eV below EF and its former intensity (spectrum B in Fig. 10). We conclude that the annealing leads to a complete recovery of the crystalline properties of the CrO2(100) surface layer. Directly after the introduction of the sample into UHV a spin polarization value of +(85 ± 10)% at EF has been found (Fig. 10). After seven sputtering cycles (total sputtering time 210 sec) the spin polarization increases up to +(95 ± 10)%. This effect can be explained by an improvement of the surface quality through the removal of contamination, which has also been observed in the STM surface analysis. After additional sputtering the sample's spin polarization decreases continuously and approaches a value of less than 10% after 750 sec. This effect can be attributed to an increasing destruction of the surface order with increasing sputtering time, i.e. the sputtering process eventually produces an amorphous non-magnetic topmost layer of CrO2 [39].
Subsequent annealing of the sputtered CrO2(100) film at 150 °C for 12 hours in UHV leads
Fig. 10. ARPES spectra as a function of binding energy recorded near EF and spin polarization (inset) at EF (solid circles) and at 1 eV (solid squares) of a CrO2(100) film vs sputtering time. Points A and B: after annealing of the sputtered (sputtering time: 750 sec) CrO2(100) film at 100 °C and 150 °C for 12 hours, respectively
to a recovery of the high spin polarization up to +(85 ± 10)% (see point B in Fig. 10). Annealing seems to heal the defects produced in the surface layers by the sputtering without formation of a Cr2O3 overlayer. Figure 11 presents the spin- and angle-resolved photoemission spectra together with the total angle-resolved photoemission intensity and the spin polarization as a function of the binding energy of a CrO2(100) film after sputtering for 750 sec and subsequent annealing for 12 hours at 150 °C. The spin-resolved spectra of the valence band of CrO2(100) near EF clearly show a dominant emission from the spin-up Cr-3d states (<3 eV) and the O-2p states (>3 eV). The spin-resolved spectra show a clear half-metallic feature, i.e. a metallic Fermi cut-off for the majority spin together with the disappearance of spectral weight near EF, reflecting the energy gap, for the minority spin. A maximum positive spin polarization of +(95 ± 10)% was observed for
Fig. 11. Spin polarization as a function of binding energy (right-hand panel) of an epitaxial CrO2(100) film after sputtering for 210 sec at 500 eV and an additional annealing treatment at 150 °C for 12 hours together with the corresponding angle-resolved spin-polarized photoemission spectrum (spin down: triangles down, spin up: triangles up) and the total photoemission intensity (solid circles) near EF (left-hand panel). Inset in left-hand panel presents a total intensity spectrum of an epitaxial CrO2(100) film showing a clear Fermi cut-off
such a CrO2(100) film at 293 K. Thus the measured maximum spin polarization is close to the theoretically predicted value of 100% [1,3]. Consistent with the band-structure calculations, a weak photoemission intensity due to the presence of the majority states at EF has been observed [13,38]. In previous spin-resolved photoemission measurements on polycrystalline CrO2 samples an extremely low photoemission intensity at EF was observed and a high spin polarization of approximately +95% could be measured only for binding energies of about 2 eV below EF [17]. The position of the Cr-3d band was estimated to be about 2.5 eV below EF [17]. In contrast, in the present study a weak but finite intensity was observed at EF, which allows the determination of the spin polarization at EF. The position of the d band was found to be 2.3 eV below EF. The observation of a finite intensity at EF is in agreement with recent photoemission and inverse photoemission experiments, where, however, the position of the Cr-3d band was found to be around 1.3 eV below EF [13]. Compared to our study, the difference in the positions of the Cr-3d band [13] and the lack of photoemission intensity near EF [17] can most probably be attributed to the polycrystalline nature of the earlier CrO2 samples.
4
Conclusions
Epitaxial Fe3O4(111) films have been grown on W(110) and Al2O3(1120) substrates by oxidizing epitaxial Fe(110) films. The high crystalline order of the Fe3O4(111) surface was confirmed by low energy electron diffraction (LEED) for W(110) substrates and by scanning tunneling microscopy (STM) as well as LEED for the Al2O3(1120) substrates. Sharp interfaces without intermixing and a good crystalline quality were found for Fe3O4 films grown on Al2O3(1120) by transmission electron microscopy (TEM). The Fe3O4(111) films prepared on W(110) show a spin-polarization value of about −(80 ± 5)% near EF at 293 K, whereas the films grown on Al2O3(1120) show a lower spin-polarization value of about −(60 ± 5)%. The surface electronic band structure of epitaxial Fe3O4(111) films grown on W(110) substrates has been investigated at 293 K by means of angle-resolved ultraviolet photoemission spectroscopy using synchrotron radiation. Two types of dispersive states were identified in the Γ-M direction of the Fe3O4(111) surface Brillouin zone. Correspondingly, the surface electronic band structure of the Fe3O4(111) film is described as a product of two overlapping contributions following the symmetries of the oxygen sublattice surface Brillouin zone (O 2p-derived states) and an iron sublattice surface Brillouin zone (Fe 3d-derived states). Epitaxial CrO2(100) films have been prepared on TiO2(100) substrates by a chemical vapor deposition technique from CrO3 as a precursor material. The CrO2(100) surface and the bulk structure, studied by STM and x-ray diffraction (XRD), demonstrate the high quality of the prepared films. The formation of large pyramidal epitaxial islands with the edges oriented along the main crystallographic axes of the CrO2(100) surface was observed. A sharp CrO2(100)/TiO2(100) interface without intermixing and a good crystalline quality without any evidence of Cr2O3 formation were found by transmission electron microscopy. A spin polarization value of about +(95 ± 5)% near EF was found at 293 K by spin- and angle-resolved photoemission spectroscopy on epitaxial CrO2(100) films.
Acknowledgements This work was supported by the German Federal Ministry of Education and Research (BMBF) under grant No. FKZ 05KS1PAA/7 and FKZ 13N7988, and by the bilateral Project “Russian-German Laboratory at BESSY”. The authors would like to thank C. Herwartz and T.E. Weirich for help with the TEM experiments on magnetite films, as well as S. Senz and D. Hesse for providing TEM measurements on CrO2 thin films.
References
1. K.-H. Schwarz, J. Phys. F: Met. Phys. 16, L211 (1986).
2. S. P. Lewis, P. B. Allen, and T. Sasaki, Phys. Rev. B 55, 10253 (1997).
3. M. A. Korotin, V. I. Anisimov, D. I. Khomskii, and G. A. Sawatzky, Phys. Rev. Lett. 80, 4305 (1998).
4. A. Yanase and K. Siratori, J. Phys. Soc. Jap. 53, 312 (1984).
5. Z. Zhang and S. Satpathy, Phys. Rev. B 44, 13319 (1991).
6. W. J. Gallagher et al., J. Appl. Phys. 81, 3741 (1997).
7. J. M. Daughton, J. Appl. Phys. 81, 3758 (1997).
8. H. Boeve, R. J. M. van der Veerdonk, B. Dutta, J. de Boeck, J. S. Moodera, and G. Borghs, J. Appl. Phys. 83, 6700 (1998).
9. M. Jullière, Phys. Lett. A 54, 225 (1975).
10. R. de Groot, F. Müller, P. van Engen, and K. H. J. Buschow, Phys. Rev. Lett. 50, 2024 (1983).
11. J.-H. Park, E. Vescovo, H.-J. Kim, C. Kwon, R. Ramesh, T. Venkatesan, Nature 392, 794 (1998).
12. R. J. Soulen Jr., J. M. Byers, M. S. Osofsky, B. Nadgorny, T. Ambrose, S. F. Cheng, P. R. Broussard, C. T. Tanaka, J. Nowak, J. S. Moodera, A. Barry, and J. M. D. Coey, Science 282, 85 (1998).
13. T. Tsujioka, T. Mizokawa, J. Okamoto, A. Fujimori, M. Nohara, H. Takagi, K. Yamaura, and M. Takano, Phys. Rev. B 56, R15509 (1997).
14. W. J. DeSisto, P. R. Broussard, T. F. Ambrose, B. E. Nadgorny, and M. S. Osofsky, Appl. Phys. Lett. 76, 3789 (2000).
15. Y. Ji, G. J. Strijkers, F. Y. Yang, C. L. Chien, J. M. Byers, A. Anguelouch, G. Xiao, and A. Gupta, Phys. Rev. Lett. 86, 5585 (2001).
16. A. Anguelouch, A. Gupta, Gang Xiao, D. W. Abraham, Y. Ji, S. Ingvarsson, and C. L. Chien, Phys. Rev. B 64, R180408 (2001).
17. K. P. Kämper, W. Schmitt, G. Güntherodt, R. J. Gambino, and R. Ruf, Phys. Rev. Lett. 59, 2788 (1987).
18. S. F. Alvarado, W. Eib, F. Meier, D. T. Pierce, K. Sattler, H. C. Siegmann, and J. P. Remeika, Phys. Rev. Lett. 34, 319 (1975).
19. S. F. Alvarado, M. Erbudak, and P. Munz, Phys. Rev. B 14, 2740 (1976).
20. S. F. Alvarado and P. S. Bagus, Phys. Lett. A 67, 397 (1978).
21. Yu. S. Dedkov, U. Rüdiger, and G. Güntherodt, Phys. Rev. B 65, 064417 (2002).
22. R. Raue, H. Hopster, and E. Kisker, Rev. Sci. Instrum. 55, 383 (1984).
23. S. Ishibashi, T. Namikawa, and M. Satou, Mat. Res. Bullet. 14, 51 (1979).
24. M. Ritter and W. Weiss, Surf. Sci. 432, 81 (1999); W. Weiss and M. Ritter, Phys. Rev. B 59, 5201 (1999).
25. H.-J. Kim, J.-H. Park, and E. Vescovo, Phys. Rev. B 61, 15284 (2000); H.-J. Kim, J.-H. Park, and E. Vescovo, Phys. Rev. B 61, 15288 (2000).
26. M. Fonin, Yu. S. Dedkov, U. Rüdiger, G. Güntherodt, J. Mayer, Phys. Rev. B, in press.
27. R. Kurzawa, K.-P. Kämper, W. Schmitt, and G. Güntherodt, Solid State Commun. 60, 777 (1986).
28. H.-J. Kim and E. Vescovo, Phys. Rev. B 58, 14047 (1998).
29. H.-T. Jeng and G. Y. Guo, Phys. Rev. B 65, 094429 (2002).
30. S. A. Morton, G. D. Waddill, S. Kim, Ivan K. Schuller, S. A. Chambers, and J. G. Tobin, Surf. Sci. 513, L451 (2002).
31. Yu. S. Dedkov, M. Fonin, D. V. Vyalikh, S. L. Molodtsov, U. Rüdiger, and G. Güntherodt, to be submitted to Phys. Rev. Lett.
32. M. Fonin, Yu. S. Dedkov, J. O. Hauch, U. Rüdiger, and G. Güntherodt, to be submitted to Phys. Rev. B.
33. Y. Q. Cai, M. Ritter, W. Weiss, and A. M. Bradshaw, Phys. Rev. B 58, 5043 (1998).
34. S. L. Molodtsov, Yu. Kucherenko, J. J. Hinarejos, S. Danzenbächer, V. D. P. Servedio, M. Richter, and C. Laubschat, Phys. Rev. B 60, 16435 (1999).
35. S. L. Molodtsov, S. V. Halilov, M. Richter, A. Zangwill, and C. Laubschat, Phys. Rev. Lett. 87, 017601 (2001).
36. S. L. Molodtsov, F. Schiller, S. Danzenbächer, Manuel Richter, J. Avila, C. Laubschat, and M. C. Asensio, Phys. Rev. B 67, 115105 (2003).
37. M. Rabe, J. Pommer, K. Samm, B. Özyilmas, C. König, M. Fraune, U. Rüdiger, G. Güntherodt, S. Senz, and D. Hesse, J. Phys.: Condens. Matter. 14, 7 (2002).
38. Yu. S. Dedkov, M. Fonine, C. König, U. Rüdiger, G. Güntherodt, S. Senz, and D. Hesse, Appl. Phys. Lett. 80, 4181 (2002).
39. N. Heiman and N. S. Kazama, J. Appl. Phys. 50, 7333 (1979).
Pulsed Laser Deposition (PLD) – A Versatile Thin Film Technique
Hans-Ulrich Krebs1, Martin Weisheit1, Jörg Faupel1, Erik Süske1, Thorsten Scharf1, Christian Fuhse1, Michael Störmer1,2, Kai Sturm1,3, Michael Seibt4, Harald Kijewski5, Dorit Nelke6, Elena Panchenko6, and Michael Buback6
1 Institut für Materialphysik, Universität Göttingen, Hospitalstraße 3–7, 37073 Göttingen, Germany, [email protected], http://www.gwdg.de/~upmp
2 GKSS-Forschungszentrum Geesthacht, Abt. WTB, Max-Planck-Straße 1, 21502 Geesthacht, Germany
3 Nanofilm Technologie GmbH, Anna-Vanderhoeck-Ring 5, 37081 Göttingen, Germany
4 IV. Physikalisches Institut, Universität Göttingen, Bunsenstraße 13, 37073 Göttingen, Germany
5 Institut für Rechtsmedizin, Universität Göttingen, Windausweg 2, 37073 Göttingen, Germany
6 Institut für Physikalische Chemie, Universität Göttingen, Tammannstraße 6, 37077 Göttingen, Germany
Abstract. Pulsed laser deposition (PLD) is for many reasons a versatile technique. Since the energy source is located outside the chamber, both ultrahigh vacuum (UHV) and ambient gas can be used. Combined with a stoichiometry transfer between target and substrate, this allows all kinds of different materials to be deposited with high deposition rates, e.g., high-temperature superconductors, oxides, nitrides, carbides, semiconductors, metals and even polymers or fullerenes. The pulsed nature of the PLD process even allows preparing complex polymer-metal compounds and multilayers. In UHV, implantation and intermixing effects originating in the deposition of energetic particles lead to the formation of metastable phases, for instance nanocrystalline highly supersaturated solid solutions and amorphous alloys. The preparation in inert gas atmosphere makes it even possible to tune the film properties (stress, texture, reflectivity, magnetic properties ...) by varying the kinetic energy of the deposited particles. All this makes PLD an alternative deposition technique for the growth of high-quality thin films.
1
Introduction
With the pulsed laser deposition (PLD) method, thin films are prepared by the ablation of one or more targets illuminated by a focused pulsed-laser beam. This technique was first used by Smith and Turner [1] in 1965 for the preparation of semiconductors and dielectric thin films and was established due to the work of Dijkkamp and coworkers [2] on high-temperature superconductors in 1987. Their work already showed the main characteristics of PLD, namely the stoichiometry transfer between target and deposited film, high deposition rates of about 0.1 nm per pulse and the occurrence of droplets on the substrate surface (see also [3]). Since the work of Dijkkamp et al., this deposition technique has been intensively used for all kinds of oxides, nitrides, or carbides, and also for preparing metallic systems and even polymers or fullerenes. The aim of this paper is to give a brief sketch of the versatility of the pulsed laser deposition method and to give some examples of where it is needed. Differences compared to conventional thin film techniques like thermal evaporation and sputtering will also be discussed.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 505–517, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Typical Experimental Set-ups
A typical set-up for PLD is schematically shown in Fig. 1. In an ultrahigh vacuum (UHV) chamber, elementary or alloy targets are struck at an angle of 45° by a pulsed and focused laser beam. The atoms and ions ablated from the target(s) are deposited on substrates. Mostly, the substrates are attached with the surface parallel to the target surface at a target-to-substrate distance of typically 2–10 cm. In our case, a UHV of about 10⁻⁹ mbar, an excimer laser LPX110i (Lambda Physik) with KrF radiation (wavelength 248 nm, pulse duration 30 ns), Si or Al2O3 substrates, and a target-to-substrate distance of 3–7 cm are used. In order to obtain a steady ablation rate from the target, the laser beam is scanned (in our case by eccentric rotation of the focusing lens and by additionally sweeping and/or slightly turning the target under the laser beam) over a sufficiently large target area (at least 1 cm²). By adjusting the number of laser pulses on each target, multilayers with desired single layer and bilayer thicknesses can be created. Two ways for growing alloy systems were applied, using a bulk alloy target or elementary targets of the constituents. In the latter case, the pulse number on each target is supposed to be low enough to obtain a thickness of less than one monolayer from each element. Under these deposition conditions, in addition to atoms and ions, in most cases some droplets of target material are also deposited on the substrate surface. In most systems, the formation of large droplets or the tearing-off of target exfoliations can be reduced by using dense and smooth targets [4,5]. However, the ablation of smaller droplets originating from the fast heating and cooling processes of the target, which is due to the pulsed laser illumination, cannot be avoided completely. In the literature, the corresponding mechanisms are called “hydrodynamic sputtering” [6] or “subsurface heating” [7]. These droplets can only be prevented from reaching the substrate surface, for instance by using the so-called “off-axis” geometry, first described by Holzapfel et al. [8] during PLD of high-temperature superconductors, or by using special laser ablation facilities, for instance the “dual-beam” ablation technique [9], where the substrate is shadowed from the material ablated simultaneously from two targets.

Fig. 1. Schematic diagram of a typical laser deposition set-up (labelled components: substrate, laser pulse, plasma plume, target, UHV chamber)
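The pulse bookkeeping described above (a fixed number of pulses per target for each layer of a multilayer, and fewer pulses than one monolayer per element when alloying from elementary targets) can be sketched in a few lines of Python. This is only an illustrative sketch, not the control procedure used by the authors: the function names, the example layer thicknesses, the monolayer thickness of 0.2 nm and the assumption of a constant deposition rate per pulse are all hypothetical. The rates of 0.1 and 0.01 nm per pulse are the orders of magnitude quoted in this paper.

```python
import math


def pulses_for_thickness(thickness_nm, rate_nm_per_pulse):
    """Smallest number of pulses that deposits at least thickness_nm."""
    if thickness_nm <= 0 or rate_nm_per_pulse <= 0:
        raise ValueError("thickness and rate must be positive")
    # Round first so that floating-point noise does not add a spurious pulse.
    return math.ceil(round(thickness_nm / rate_nm_per_pulse, 9))


def max_pulses_below_monolayer(monolayer_nm, rate_nm_per_pulse):
    """Largest pulse count per target that stays strictly below one monolayer."""
    if monolayer_nm <= 0 or rate_nm_per_pulse <= 0:
        raise ValueError("monolayer thickness and rate must be positive")
    return max(0, math.ceil(round(monolayer_nm / rate_nm_per_pulse, 9)) - 1)


# Hypothetical bilayer: 1.0 nm of layer A at 0.01 nm/pulse,
# 1.1 nm of layer B at 0.1 nm/pulse (assumed constant rates).
bilayer = {
    "A": pulses_for_thickness(1.0, 0.01),
    "B": pulses_for_thickness(1.1, 0.1),
}
print(bilayer)

# Hypothetical monolayer thickness of 0.2 nm for alloying from elementary targets.
print(max_pulses_below_monolayer(0.2, 0.01))
```

In practice the rate per pulse depends on fluence, target material and geometry, so such numbers would have to be calibrated for each target.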
3
Versatility of the PLD Technique
During PLD, many experimental parameters can be changed, which then have a strong influence on film properties. First, the laser parameters such as laser fluence, wavelength, pulse duration and repetition rate can be altered. Second, the preparation conditions, including target-to-substrate distance, substrate temperature, background gas and pressure, may be varied, which all influence the film growth. In the following sections, we focus on the most interesting of these parameters.
3.1
UHV and Different Gas Atmospheres
The PLD technique allows preparing all kinds of oxides, nitrides, carbides, but also polymers, Buckminster fullerenes or metallic systems. In Table 1, a non-comprehensive list of materials deposited for the first time after 1987 is given. In order to obtain all these different kinds of materials, one has to work in ultrahigh vacuum (UHV) or reactive gas atmosphere during deposition. This is possible with PLD, because the energy source is outside the deposition chamber. During growth of oxides, the use of oxygen is often inevitable for achieving a sufficient amount of oxygen in the growing oxide film. For instance, for the formation of perovskite structures at high substrate temperatures in a one-step process, an oxygen pressure of about 0.3 mbar is necessary [2]. Also, for many other oxide or nitride films, the necessity of working in a reactive environment makes it difficult to prepare such samples via thermal evaporation, using electron guns. In the case of sputtering, where commonly argon is used as the background gas, a larger amount of oxygen or nitrogen can only be added in special oven facilities close to the substrate surface.
Table 1. List of some materials deposited for the first time by PLD after 1987 and references

Material class                      Material                  Literature                      Ref.
High-temperature superconductors    YBa2Cu3O7                 Dijkkamp et al. (1987)          [2]
                                    BiSrCaCuO                 Guarnieri et al. (1988)         [10]
                                    TlBaCaCuO                 Foster et al. (1990)            [11]
                                    MgB2                      Shinde et al. (2001)            [12]
Oxides                              SiO2                      Fogarassy et al. (1990)         [13]
Carbides                            SiC                       Balooch et al. (1990)           [14]
Nitrides                            TiN                       Biunno et al. (1989)            [15]
Ferroelectric materials             Pb(Zr,Ti)O3               Kidoh et al. (1991)             [16]
Diamond-like carbon                 C                         Martin et al. (1990)            [17]
Buckminster fullerene               C60                       Curl and Smalley (1991)         [18]
Polymers                            Polyethylene, PMMA        Hansen and Robitaille (1988)    [19]
Metallic systems                    30 alloys/multilayers     Krebs and Bremert (1993)        [20]
                                    FeNdB                     Geurtsen et al. (1996)          [21]

3.2
Small Target Size
The PLD technique is also flexible, because the spot size of the focused laser beam is small and, therefore, the target area may even be less than 1 cm². This allows the preparation of complex samples with enrichments of isotopes or isotopic markers within the deposited film. Being able to easily prepare samples for research purposes or for application tests is especially interesting if the sample or one component is extremely expensive or impossible to prepare with other techniques. Here, the flexibility of the PLD technique pays off, due to the possibility of easily exchanging and adjusting the targets. In our case, for instance, Fe-Ag thin films and multilayers were prepared with ⁵⁷Fe contributions to make special areas of the samples sensitive for Mössbauer spectroscopy and to investigate intermixing effects between the two components [22].
3.3
Stoichiometry Transfer
In many cases, one takes advantage of the fact that during PLD the stoichiometry of the deposited film is very close to that of the used target and, therefore, it is possible to prepare stoichiometric thin films from a single alloy bulk target. This so-called “stoichiometry transfer” between target and substrate has made the PLD technique interesting for the growth of complex systems, for instance of high-temperature superconductors, piezoelectric and ferroelectric materials with perovskite structure, and also for technical applications (sensors, capacitors, ...). Stoichiometry transfer between target and substrate is difficult to obtain with evaporation or (magnetron) sputtering by using a single target, because in general the partial vapor pressures and sputtering yields of the components are different from each other which gives rise to a different concentration
of the thin film growing on the substrate. In the case of PLD, with most materials a stoichiometry transfer between target and substrate is obtained, which can be explained as follows. The fast and strong heating of the target surface by the intense laser beam (typically up to temperatures of more than 5000 K within a few ns [23], corresponding to a heating rate of about 10¹² K/s) ensures that all target components irrespective of their partial binding energies evaporate at the same time. When the ablation rate is sufficiently high (which normally is the case at laser fluences well above the ablation threshold), a so-called Knudsen layer is formed [6] and further heated (for instance by inverse bremsstrahlung) forming a high-temperature plasma [24], which then adiabatically expands in a direction perpendicular to the target surface. Therefore, during PLD, the material transfer between target and substrate occurs in a material package, where the separation of the species is small. The expansion of the whole package can be well described by a shifted Maxwell-Boltzmann center-of-mass velocity distribution [25]

$$ f(v_z) \propto v_z^3 \, \exp\left[ -\frac{m_A (v_z - v_{cm})^2}{2 k T_{eff}} \right] \qquad (1) $$
with a center-of-mass velocity vcm and an effective temperature Teff. Then, adiabatic collisionless expansion occurs, transferring the concentration of the plasma plume towards the substrate surface. Thus one can understand that complex structures such as oxides or perovskites are built up again at the substrate surface, when the substrate temperature is high enough, because all components are transferred from target to substrate at the right composition.

But also in the case of polymers, the preparation of films from single bulk targets is possible, as was first shown by Hansen and Robitaille in 1988 [19]. In the case of polymers, chemical structure and chain length strongly depend on the applied laser wavelength and fluence (see for instance [26]). In Fig. 2, an infrared spectrum (FTIR) of poly-(methyl methacrylate) (PMMA) laser deposited at a fluence of 300 mJ/cm² is compared with a literature spectrum [27]. As can be seen, apart from small intensity differences all absorption lines are also seen in the PLD film, indicating that the reorganization at the substrate surface leads to a chemical structure very close to the bulk structure. Nevertheless, using a laser wavelength of 248 nm, the chain length of the grown PMMA films is reduced when depositing the film at room temperature. This is known from gel permeation chromatography (GPC) measurements performed after dissolving the PMMA films in tetrahydrofuran (see Fig. 3, note the logarithmic scale on the x-axis). The average molecular mass Mn obtained is about 5800 g/mol. Details of these experiments are described elsewhere [28].

Fig. 2. FTIR spectrum of laser deposited PMMA films in comparison to bulk material [27] (transmission versus wavenumber in cm⁻¹, from 3000 to 1000; curves: Dechant 1972 and PMMA film)
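Returning to the shifted Maxwell-Boltzmann distribution of Eq. (1), the following minimal Python sketch evaluates it numerically and estimates the mean kinetic energy it implies. It is an illustration under stated assumptions and not the evaluation procedure of [25]: m_A is interpreted here as the mass of the ablated particle, and the species (an iron atom), the center-of-mass velocity of 1.5·10⁴ m/s, the effective temperatures and the integration grid are hypothetical values chosen only for demonstration.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant in J/K
AMU = 1.66053906660e-27   # atomic mass unit in kg
EV = 1.602176634e-19      # joule per electron volt


def shifted_mb(v_z, mass_kg, v_cm, t_eff):
    """Unnormalised Eq. (1): v_z^3 * exp(-m (v_z - v_cm)^2 / (2 k T_eff))."""
    return v_z ** 3 * math.exp(-mass_kg * (v_z - v_cm) ** 2 / (2.0 * K_B * t_eff))


def mean_kinetic_energy_ev(mass_kg, v_cm, t_eff, v_max=1.0e5, n=20000):
    """Mean of m v_z^2 / 2 weighted by Eq. (1), midpoint rule on [0, v_max]."""
    dv = v_max / n
    norm = 0.0
    energy = 0.0
    for i in range(n):
        v = (i + 0.5) * dv
        w = shifted_mb(v, mass_kg, v_cm, t_eff)
        norm += w
        energy += w * 0.5 * mass_kg * v * v
    return energy / norm / EV


FE_MASS = 55.845 * AMU  # iron atom, chosen only as an example species

# Hypothetical plume parameters; real values would come from a fit to measured data.
e_hot = mean_kinetic_energy_ev(FE_MASS, v_cm=1.5e4, t_eff=2.0e4)
e_cold = mean_kinetic_energy_ev(FE_MASS, v_cm=1.5e4, t_eff=10.0)
print(f"T_eff = 2e4 K: {e_hot:.1f} eV, T_eff = 10 K: {e_cold:.1f} eV")
```

For a very small effective temperature the distribution collapses onto v_cm, so the mean energy approaches m_A·v_cm²/2; a larger T_eff broadens the distribution and, because of the v_z³ prefactor, shifts the mean energy upwards.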
Fig. 3. GPC measurements on laser deposited PMMA (intensity versus molecular mass in g/mol on a logarithmic axis from 100 to 100000; labelled peaks: dimer 211, trimer 290, tetramer 396, oligomers 5800)
3.4
Pulsed Nature of PLD
The pulsed nature of the PLD process allows for strongly changing the laser conditions for each target. Therefore, it becomes possible to produce complex composite materials like polymer-metal systems, where completely different laser fluences are necessary for the deposition of polymer and metal, respectively. In Fig. 4, a transmission electron microscope (TEM, Philips EM-420) image of a polycarbonate (PC) film with a layer of Ag grains is shown. For production of such a sample, the PC has to be prepared at a low laser fluence of about 60 mJ/cm² to optimize the chemical structure, while the Ag crystals were deposited at a fluence of 5 J/cm², which is about 80 times higher. By this technique, even PC/Ag-multilayers were grown for the first time with low interface roughness, as will be described in [29].
3.5
Energetic Particles
To obtain sufficiently high ablation rates (on the order of 0.01 nm per pulse) for the deposition of metallic systems in UHV, high laser fluences of more than 5 J/cm² are necessary [20,30]. Under these conditions, the film deposition occurs with energetic particles.
Fig. 4. TEM image of a polycarbonate film with nanocrystalline Ag grains
At a laser fluence of 8 J/cm², the velocities of the plasma plume expansion correspond to average kinetic energies of the ablated ions of more than 100 eV [23], in agreement with results of Lunney [31]. The mean energy of the atoms is much lower, on the order of 5–10 eV [32]. In the literature, the higher energies of the ablated ions are attributed to an acceleration of the ions in the strongly increasing space charge field created by the more mobile electrons, which collectively move away from the ions [33]. The deposition with energetic particles allows for the formation of metastable phases, for instance nanocrystalline highly supersaturated solid solutions or amorphous films over a wide composition range. For instance, in the Fe-Ag system, which is almost immiscible in thermodynamic equilibrium, the bcc Fe(Ag) single phase can be supersaturated to a much higher degree than with conventional deposition techniques, namely up to 13 at.% at room temperature and up to 40 at.% at 150 K [34,35]. In Fig. 5, x-ray measurements (Philips X'Pert MRD) of a Fe91Ag9 thin film are shown after annealing at different temperatures. From the absence of an Ag peak at about 44.5° and from the width of the Fe(Ag) peak it can directly be seen that the sample is homogeneously supersaturated and nanocrystalline with a grain size of about 6 nm (as deduced from the Scherrer formula [36]). With annealing temperatures up to 620 °C, the peak shifts to higher scattering angles, but no Ag peak occurs, indicating that Ag diffuses out of the Fe(Ag) grains into the grain boundaries, where wetting and stabilizing occurs up to high temperatures without Ag grain formation. Implantation of the energetic ions, intermixing with the already deposited material, and film growth below the surface (so-called “subsurface growth”) have been held responsible for the mixing effects leading to homogeneous films.
The implantation of particles (with energies above the displacement threshold) below the substrate surface also induces defect formation, at least for high laser fluences above 6 J/cm². Using high resolution transmission electron microscopy (HRTEM, Philips CM200) in cross-section, dislocation densities of more than 10¹² cm⁻² were obtained in Fe(Ag) thin films (Fig. 6).
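The grain size estimate from the peak width mentioned above relies on the Scherrer formula, D = Kλ/(β cos θ), where β is the full width at half maximum of the reflection in radians. The short Python sketch below shows the arithmetic only. The shape factor K = 0.9, the Co-Kα wavelength, and the example peak position and width are assumptions chosen for illustration (they are not values read from Fig. 5), and instrumental and strain broadening are neglected.

```python
import math


def scherrer_grain_size_nm(fwhm_deg, two_theta_deg, wavelength_nm, shape_factor=0.9):
    """Grain size D = K * lambda / (beta * cos(theta)), beta = FWHM in radians."""
    if fwhm_deg <= 0:
        raise ValueError("FWHM must be positive")
    beta = math.radians(fwhm_deg)
    theta = math.radians(two_theta_deg / 2.0)
    return shape_factor * wavelength_nm / (beta * math.cos(theta))


CO_K_ALPHA_NM = 0.17903  # approximate Co-K-alpha wavelength, assumed for this example

# Hypothetical peak: 2-theta = 52.4 degrees with a FWHM of 1.7 degrees.
size = scherrer_grain_size_nm(fwhm_deg=1.7, two_theta_deg=52.4, wavelength_nm=CO_K_ALPHA_NM)
print(f"estimated grain size: {size:.1f} nm")
```

Because D is inversely proportional to β, halving the peak width doubles the estimated grain size, which is why broad reflections indicate nanocrystalline films.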
Fig. 5. High-angle x-ray diffraction of a laser deposited Fe91Ag9 thin film after annealing at different temperatures (intensity in arbitrary units versus 2θ from 40° to 55°; curves for 300 °C (10 min), 400 °C (10 min), 400 °C (82 min), 400 °C (23 h) and 620 °C (30 min); marked positions: Ag (111), Fe (110) and the Fe(Ag) peak)
Fig. 6. HRTEM image of a supersaturated Fe(Ag) film showing a dislocation loop and dislocations (enlarged in the insets)
The implantation of additional material into the already grown film, which is fixed to the substrate, leads to high compressive stress on the order of GPa. This compressive stress can be detected from peak shifts to lower scattering angles in conventional high-angle x-ray diffractometry [30]. Due to intermixing effects at the substrate surface, PLD films in general adhere strongly to all substrates used in our case (Si, Al2O3, W), and no tearing off the substrate was observed up to film thicknesses of 1 µm. In the case of Fe/MgO and Ni80Nb20/MgO, the interface roughness of multilayers laser deposited in UHV is very small (typically about 0.35 nm for layer periodicities of up to 5 nm), as can be deduced from a fit to the x-ray reflectivity measurements (using Co-Kα radiation) depicted in Fig. 7 (for details of the fit procedure see [37]). This is an indication of high surface mobility of the deposited particles and low intermixing effects in this system [37,38].
Fig. 7. X-ray reflectivity measurements and fit of laser deposited Fe/MgO thin films with a layer periodicity of 2.1 nm for different numbers of bilayers
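As a rough orientation for where the multilayer reflections of such a reflectivity scan are expected, the kinematic Bragg condition mλ = 2Λ sin θ can be evaluated for the layer periodicity Λ. The Python sketch below does this for Λ = 2.1 nm. It is not the fit procedure of [37]: refraction and absorption are neglected (so the low-order peaks will be slightly shifted in a real measurement), and the Co-Kα wavelength value and the function name are assumptions made for this example.

```python
import math


def multilayer_bragg_two_theta_deg(period_nm, wavelength_nm, max_order=5):
    """2-theta values (degrees) from m * lambda = 2 * period * sin(theta)."""
    if period_nm <= 0 or wavelength_nm <= 0:
        raise ValueError("period and wavelength must be positive")
    angles = []
    for m in range(1, max_order + 1):
        s = m * wavelength_nm / (2.0 * period_nm)
        if s > 1.0:
            break  # higher orders are not accessible at this wavelength
        angles.append(2.0 * math.degrees(math.asin(s)))
    return angles


CO_K_ALPHA_NM = 0.17903  # approximate Co-K-alpha wavelength (assumed value)

peaks = multilayer_bragg_two_theta_deg(period_nm=2.1, wavelength_nm=CO_K_ALPHA_NM, max_order=3)
print([round(p, 2) for p in peaks])
```

The interface roughness itself cannot be read from the peak positions; it enters through the decay of the peak intensities with increasing order, which is what a full reflectivity fit evaluates.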
3.6
Tunable Particle Energy
The kinetic energy of the deposited particles can be systematically varied from an average energy of about 50 eV to about 150 eV by increasing the laser fluence from 2 to about 10 J/cm² for metallic systems. This only slightly changes the film properties [39]. A much stronger influence on the film properties occurs when the particle energy is lowered by an inert gas pressure. Then, the energy can be reduced to thermal energies below 1 eV. In an Ar atmosphere, well below about 0.1 mbar, the reduction of the average energy of the ablated particles can be described by scattering of a dense cloud of ablated material moving through a dilute gas [39]. On the way towards the substrate, mainly the energetic ions are scattered out of the deposition path, while the slower atoms reach the substrate surface without any hindrance. At higher gas pressures, the plasma expansion leads to a shock front between plasma plume and surrounding gas, which hinders the plasma expansion and induces a further velocity reduction [40]. A decrease of particle energy is accompanied by systematic changes in texture and microstructure. With conventional thin film techniques, usually the substrate temperature has to be varied to change texture. This shows that during PLD, where energetic particles exist, the particle energy is an additional parameter to play with. As an example, in Fig. 8 texture measurements (using a conventional four-circle diffractometer with Co-Kα radiation) are shown for Permalloy (Ni80Fe20, Py) deposition in different Ar gas atmospheres at room temperature. These are 3-dimensional plots, where the angles ψ and ϕ are used as usual and the “height” is the measured intensity. For Py deposited in UHV, the films exhibit a strong (111) fibre texture typical for fcc metals with a full width at half maximum (FWHM) of 5°. For higher pressures during deposition, the sharpness of the peaks is reduced and the FWHM rises, before a complete change in the texture occurs.
For 0.1 mbar, besides the (111) fibre texture, traces of the (200) and (321) fibre textures are also seen. For even higher pressures the (111) direction completely vanishes and only the (200) and (321) directions remain. The reduction of kinetic energy is also accompanied by a lowering of intermixing and resputtering effects [39,41], and by a stress transition from compressive to tensile [42].

Fig. 8. Change of the Permalloy thin film texture, depending on the Ar gas pressure during PLD at room temperature

As a further example, stress values obtained for Ag under different Ne gas pressures are shown in Fig. 9. Details of the in-situ bending beam technique used are described in ref. [42]. Under UHV conditions and at low Ne pressure, the film stress is about −0.6 GPa. With increasing pressure a steep compressive-to-tensile transition occurs at about 0.1 mbar and a tensile stress of +0.15 GPa is reached. One can see that, depending on the desired conditions, films with compressive or tensile stress, or even stress-free films, can be grown by PLD by simply changing the background inert gas pressure during deposition.

Fig. 9. Change of stress of Ag films laser deposited at different Ne gas pressures (stress in GPa, from −0.8 to 0.4, versus Ne pressure in mbar on a logarithmic axis from 1E-8 to 1)

It is also possible to reduce bulk defects and intermixing at interfaces, if desired. The kinetic energy of the deposited particles has to be reduced by the inert gas to such an extent that it is below the threshold for bulk atom displacements (about 25 eV in the case of most metals). Under such conditions, implantation of particles into the growing film is minimized, while the particles at the same time still have enough energy for structural displacements at the film surface and increased surface mobility. This can be achieved with Ar gas using a pressure of about 0.04 mbar and with He at about 0.1 mbar. Under these conditions, multilayers with much sharper interfaces can be prepared, as has been tested in the case of the Fe/Ag system [39]. The change of interface roughness strongly influences the properties of multilayers. As was shown earlier, for instance the giant magnetoresistance of Py/Ag multilayers [43] and the x-ray reflectivities of Ni80Fe20/MgO multilayers [37] are drastically changed, when preparing the samples in UHV and in Ar gas atmosphere, respectively.
4
Conclusions
Since the breakthrough of the PLD technique due to the work of Dijkkamp et al. in 1987 [2], all kinds of materials have been prepared by this method. The stoichiometry transfer between target and substrate and the possibility of working in UHV as well as in different reactive and inert gas atmospheres are particularly attractive features of PLD. The fact that the kinetic energy of the deposited ions lies in the range of about 100 eV is also of interest, because it makes it possible to prepare new systems far away from equilibrium (supersaturated binary systems, nanocrystalline materials, metastable alloys, ...). Furthermore, using an inert gas pressure makes this method versatile, because the energy of the deposited particles is a free parameter to play with. Energy may be reduced and adjusted for special purposes. For instance, an adjustment of texture, stress or interface roughness can be obtained. The possibility of additionally changing laser features, such as wavelength, repetition rate, pulse length, fluence and target-to-substrate distance, and the deposition conditions, such as substrate temperature and substrate orientation with respect to the deposited material, further demonstrates the enormous versatility of PLD.
Acknowledgements This work is supported by the Sonderforschungsbereich 602 and Graduiertenkolleg 782.
References 1. H. M. Smith and A. F. Turner, Appl. Opt. 4, 147 (1965). 505 2. D. Dijkkamp, T. Venkatesan, X. D. Wu, S. A. Shareen, N. Jiswari, Y. H. MinLee, W. L. McLean, and M. Croft, Appl. Phys. Lett. 51, 619 (1987). 506, 507, 508 3. D. B. Chrisey and G. K. Hubler: Pulsed laser deposition of thin films, (Wiley, New York 1994). 506
516
Hans-Ulrich Krebs et al.
4. C. Scarfone, M. G. Norton, C. B. Carter, J. Li, and H. W. Mayer, Mat. Res. Soc. Symp. Proc. 191, 183 (1991).
5. S. Fähler, M. Störmer, and H. U. Krebs, Appl. Surf. Sci. 109/110, 433 (1997).
6. R. Kelly and J. E. Rothenberg, Nucl. Instrum. Methods Phys. Res. B 7/8, 755 (1985).
7. R. K. Singh and J. Narayan, Phys. Rev. B 41, 8843 (1990).
8. B. Holzapfel, B. Roas, L. Schultz, P. Bauer, and G. Saemann-Ischenko, Appl. Phys. Lett. 61, 3178 (1992).
9. A. A. Gorbunov, W. Pompe, A. Sewing, S. V. Gaponov, A. D. Akhsakhalyan, I. G. Zabrodin, I. A. Kaskov, E. B. Klyenkov, A. P. Morozov, N. N. Salaschenko, R. Dietsch, H. Mai, and S. Völlmar, Appl. Surf. Sci. 96–98, 649 (1996).
10. C. R. Guarnieri, R. A. Roy, K. L. Saenger, S. A. Shivashankar, D. S. Lee, and J. J. Cuomo, Appl. Phys. Lett. 53, 532 (1988).
11. C. M. Foster, K. F. Voss, T. W. Hagler, D. Mihailovic, A. J. Heeger, M. M. Eddy, W. L. Olsen, and E. J. Smith, Solid State Comm. 76, 651 (1990).
12. S. R. Shinde, S. B. Ogale, R. L. Greene, T. Venkatesan, P. C. Canfield, S. L. Budko, G. Lapertot, and C. Petrovic, Appl. Phys. Lett. 79, 227 (2001).
13. E. Fogarassy, C. Fuchs, A. Slaoui, and J. P. Stoquert, Appl. Phys. Lett. 57, 664 (1990).
14. M. Balooch, R. J. Tench, W. J. Siekhaus, M. J. Allen, A. L. Connor, and D. R. Olander, Appl. Phys. Lett. 57, 1540 (1990).
15. N. Biunno, J. Narayan, S. K. Hofmeister, A. R. Srinatsa, and R. K. Singh, Appl. Phys. Lett. 54, 1519 (1989).
16. H. Kidoh, T. Ogawa, A. Morimoto, and T. Shimizu, Appl. Phys. Lett. 58, 2910 (1991).
17. J. A. Martin, L. Vazquez, P. Bernard, F. Comin, and S. Ferrer, Appl. Phys. Lett. 57, 1742 (1990).
18. R. F. Curl and R. E. Smalley, Scientific American, October 1991, 32.
19. S. G. Hansen and T. E. Robitaille, Appl. Phys. Lett. 52, 81 (1988).
20. H. U. Krebs and O. Bremert, Appl. Phys. Lett. 62, 2341 (1993).
21. A. J. M. Geurtsen, J. C. S. Kools, L. de Wit, and J. C. Lodder, Appl. Surf. Sci. 96–98, 887 (1996).
22. R. Gupta, M. Weisheit, H. U. Krebs, and P. Schaaf, Phys. Rev. B 67, 75402 (2003).
23. S. Fähler and H. U. Krebs, Appl. Surf. Sci. 96–98, 61 (1996).
24. C. R. Phipps, T. P. Turner, R. F. Harrison, G. W. York, W. S. Osborne, G. K. Anderson, X. F. Corlis, L. C. Haynes, H. S. Steele, K. C. Spicochi, and T. R. King, J. Appl. Phys. 64, 1083 (1988).
25. J. C. S. Kools, T. S. Baller, S. T. De Zwart, and J. Dieleman, J. Appl. Phys. 71, 4547 (1992).
26. S. Nishio, T. Chiba, A. Matsuzaki, and H. Sato, J. Appl. Phys. 79, 7198 (1996).
27. J. Dechant: Ultrarotspektroskopische Untersuchungen an Polymeren (Akademie Verlag, Berlin 1972).
28. E. Süske, T. Scharf, H. Kijewski, P. Schaaf, D. Nelke, E. Panchenko, M. Buback, and H. U. Krebs, in preparation.
29. J. Faupel and H. U. Krebs, in preparation.
30. H. U. Krebs, J. Non-Equilibrium Processing 10, 3 (1997).
31. J. Lunney, Appl. Surf. Sci. 86, 79 (1995).
32. P. E. Dyer, Appl. Phys. Lett. 55, 1630 (1989).
33. W. Demtröder and W. Jantz, Plas. Phys. 12, 691 (1970).
34. M. Störmer and H. U. Krebs, J. Appl. Phys. 78, 7080 (1995).
35. S. Kahl and H. U. Krebs, Phys. Rev. B 63, 172103 (2001).
36. B. D. Cullity: Elements of x-ray diffraction (Addison-Wesley, Reading, Massachusetts 1967).
37. S. Vitta, M. Weisheit, T. Scharf, and H. U. Krebs, Optics Lett. 26, 1448 (2001).
38. C. Fuhse, H. U. Krebs, S. Vitta, and G. A. Johansson, in preparation.
39. K. Sturm, S. Fähler, and H. U. Krebs, Appl. Surf. Sci. 154–155, 462 (2000).
40. T. Scharf and H. U. Krebs, Appl. Phys. A 75, 551 (2002).
41. K. Sturm and H. U. Krebs, J. Appl. Phys. 90, 1061 (2001).
42. T. Scharf, J. Faupel, K. Sturm, and H. U. Krebs, submitted.
43. J. Faupel, H. U. Krebs, A. Käufler, Y. Luo, K. Samwer, and S. Vitta, J. Appl. Phys. 92, 1171 (2002).
The Chemical Route in Plasma Enhanced Low Pressure Synthesis of c-BN
A. Lunk¹, M. Gross¹, and H. Wulff²
¹ Institut für Plasmaforschung, Universität Stuttgart, Pfaffenwaldring 31, 70569 Stuttgart, Germany
² Institut für Chemie und Biochemie, E.-M.-A.-Universität Greifswald, Soldmanstr. 16/17, 17487 Greifswald, Germany
Abstract. In the last three years it has been shown that plasma enhanced chemical vapor deposition (PECVD) is an interesting alternative to the physical route. The first part of this contribution reviews the state of the art in PECVD of c-BN and discusses some open questions. The second part presents sputter/etch rate experiments on c-BN in comparison with h-BN for argon, hydrogen and fluorine species. In H2 no selectivity in etch rate was found between c-BN and h-BN, while in CF4 indications of etch selectivity were found, depending on bias voltage. For a certain deposition parameter range, c-BN films with relatively low stress were deposited. Grazing incidence X-ray diffraction shows clearly that the film consists of polycrystalline c-BN.
1
Introduction
The deposition of thin cubic boron nitride (c-BN) films is at present an important challenge because of the outstanding mechanical and electrical properties of the material. The hardness of c-BN films is second only to that of diamond at room temperature, and the chemical inertness of c-BN is far superior to that of diamond. c-BN has a high thermal conductivity, a large band gap and a low dielectric constant, and it can be doped both n-type and p-type. An overview of data and deposition techniques is given in [1,3]. Thermodynamic considerations at equilibrium conditions give important hints on suitable deposition methods and on the stability of the thin films produced. The discussion on the stable phases of BN at room temperature and atmospheric pressure is still open. The first p-T phase diagrams of boron nitride at thermal equilibrium conditions were given in 1963 [4] and 1975 [5]. An adapted pseudo-Debye model, which allows extrapolation to the low temperature range, was proposed in 1999 [6]. More recently an ab initio study of phase transformations in boron nitride was presented [7]. The authors calculated the p-T phase diagram, and from compression simulations they deduced the phase transformation paths among the four phases of BN (cubic c-, hexagonal h-, wurtzite w-, rhombohedral r-BN). The simulations showed that at normal ambient conditions c-BN is the most stable phase and that the direct transformation path between h-BN and c-BN is far less favorable than an indirect transformation with w-BN or r-BN as an intermediate.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 519–529, 2003. © Springer-Verlag Berlin Heidelberg 2003
Deposition of c-BN films at non-equilibrium conditions can be achieved by plasma enhanced physical vapor deposition (PEPVD) as well as by plasma enhanced chemical vapor deposition (PECVD). In PEPVD the formation of c-BN only occurs if the momentum transferred by the ions to the growing film is high enough. Bombardment by ions causes dislocations and interstitial defects, resulting in the formation of intrinsic stress in the film. The intrinsic stress amounts to about 15 GPa or more, and therefore thick films (film thickness greater than 500 nm) peel off from the substrate. One way to overcome the problems of PEPVD is the application of PECVD at very low ion energies to prevent defects. The literature reports mixtures of relatively simple compounds (B2H6, BCl3, BF3) as well as complex compounds (dimethylamineborane, borazine, boranetriethylamine). An overview of PECVD methods applied to c-BN film deposition is given in [8], which collects data published up to the year 2000. Great technical success in the deposition of thick c-BN films by PECVD has been achieved since 2000. Pioneering work was done by S. Matsumoto and coworkers, who achieved film thicknesses up to 25 µm. They applied a dc-jet [9] as well as a microwave excited plasma [10] working at a pressure of 50 Torr (6.67 kPa) in an Ar/BF3/N2/H2 mixture. Experiments showed that the c-BN content of the films depends strongly on the hydrogen and nitrogen content in the gas phase as well as on the bias voltage [11]. In contrast to experiments in Ar/BF3/N2 mixtures [12], the results in [11] show very clearly the important role of hydrogen in c-BN deposition. It was found that the highest c-BN content was achieved if the flow ratio Q(H2)/Q(BF3) was about 1.
The key role of hydrogen corresponds to the results of thermodynamic calculations of the gas phase composition in an Ar/BF3/N2/H2 mixture [13]. Without hydrogen the calculations predicted gaseous BN at temperatures higher than 3000 K. When H2 is added, solid BN can be found at temperatures of about 1000 K. In this paper we try to throw light on the role of fluorine and hydrogen in c-BN deposition. The experiments start with the chemical components B, H2 and N2 as vapor or gases, while fluorine is provided from CF4. In the first step the decomposition of CF4 in the plasma was investigated by mass spectrometry. In the second step the etching rates of h-BN and c-BN in Ar, H2 and CF4 were measured as a function of the applied bias voltage. From the measured data a parameter range was found for the deposition of c-BN characterized by low internal stress. The properties of the deposited c-BN films were characterized by FTIR (Fourier transform infrared spectroscopy) and GIXD (grazing incidence X-ray diffraction) measurements.
2
Experiments
Figure 1 shows the experimental set-up used. Boron is evaporated in the water cooled anode of the hollow cathode arc evaporation device. Argon as carrier gas is introduced through the cathode, while the other components flow directly into the chamber through a gas shower. The substrate holder can be heated or cooled and is connected to the rf-power supply of 13.56 MHz. Film growth is observed in situ by infrared reflection absorption spectroscopy (IRRAS). The beam of an FTIR spectrometer is coupled out and projected through KBr windows into the deposition chamber onto the substrate surface at an angle of incidence of 75°. An external mercury-cadmium-telluride (MCT) detector was used to measure the intensity of the reflected beam. Measurements in the IRRAS configuration with polarized light were performed with the following set-up: two IR polarizers are placed in the optical path in front of the substrate. The first polarizer is set at an angle of 45° to the plane of incidence, and the second one can be changed from parallel (p-polarization) to perpendicular (s-polarization) to the plane of incidence. Since the beam of the FTIR spectrometer has an elliptical polarization, the first polarizer is necessary to guarantee that the intensity incident on the substrate is equal in s- and in p-polarization. This is an essential condition for quantitative simulations of the spectra. Typical growth rates of 10-15 nm per minute were achieved. More details of the experimental set-up are given in [14] and [15].
[Figure 1: schematic of the deposition chamber. Labelled components: substrate with bias voltage UBias, substrate heating, thermocouple, alignment, cooling, shutter, IR analyzer (s-pol./p-pol.), IR polarizer, transverse magnetic field, copper crucible with boron (anode), hollow cathode, pumping system, gas inlets (N2, Ar), MCT detector, FTIR spectrometer (20 scans/s, 400-4000 cm-1).]
Fig. 1. Experimental set-up of the deposition device
The total pressure in the deposition chamber was held constant at 0.6 Pa. The vacuum chamber is connected to a mass spectrometer (maximum detectable mass mmax = 200 amu, resolution ∆m/m = 1/200, electron energy eUb = 102 eV) via a pressure stage. Therefore the decomposition of CF4 in the plasma could be measured in situ by mass spectrometry. Mass spectrometry measurements were performed during BN deposition and also, for comparison, at conditions nearly equal to the deposition conditions except that no boron was evaporated.
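The function of the first polarizer in the IRRAS set-up described above can be illustrated with a small Jones-calculus sketch. It assumes ideal linear polarizers and an arbitrary, hypothetical elliptically polarized input beam; the function names and amplitudes are illustrative and are not taken from the set-up in [14,15]. After a polarizer at 45° to the plane of incidence, the p- and s-components carry equal intensity regardless of the ellipticity of the input.

```python
import cmath
import math


def linear_polarizer(angle_deg):
    """Jones matrix of an ideal linear polarizer.

    The angle is measured from the p-direction (in the plane of incidence);
    the field vector is ordered as (E_p, E_s).
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return ((c * c, c * s), (c * s, s * s))


def transmit(matrix, field):
    """Apply a 2x2 Jones matrix to a field (E_p, E_s)."""
    (m00, m01), (m10, m11) = matrix
    e_p, e_s = field
    return (m00 * e_p + m01 * e_s, m10 * e_p + m11 * e_s)


def intensities(field):
    """Return (I_p, I_s) for a field (E_p, E_s)."""
    return tuple(abs(component) ** 2 for component in field)


# Hypothetical elliptically polarized beam: unequal amplitudes, phase lag 1.1 rad.
beam = (1.0 + 0.0j, 0.4 * cmath.exp(1.1j))

i_p_in, i_s_in = intensities(beam)            # unequal p and s intensities
after_first = transmit(linear_polarizer(45.0), beam)
i_p_out, i_s_out = intensities(after_first)   # equal after the 45 degree polarizer

print(f"before first polarizer: I_p = {i_p_in:.3f}, I_s = {i_s_in:.3f}")
print(f"after first polarizer:  I_p = {i_p_out:.3f}, I_s = {i_s_out:.3f}")
```

In this picture the second polarizer simply selects one component, for example `linear_polarizer(0.0)` for p-polarization and `linear_polarizer(90.0)` for s-polarization.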
3
Results and Discussion
3.1
Mass Spectrometry
The ion count rates of the CF4 fragment ions were measured as a function of the CF4 flow rate with and without plasma. Figure 2 shows results for the masses 69 amu (CF3+), 50 amu (CF2+), 31 amu (CF+) and 25 amu (CF2++) at “plasma off”. Because the high argon concentration results in a high mass peak at 20 amu (Ar++), fluorine ions (19 amu) could not be detected. From Fig. 2 it can be derived that the fragmentation factors (fractions of ion counts) I(CF2+)/I(CF3+) = 0.16 (0.12) and I(CF+)/I(CF3+) = 0.06 (0.05) are nearly constant between 0.5 and 3.0 sccm CF4 flow. The data given in brackets correspond to the data reported in the spectrometer manual. At plasma conditions the ion counts show a different picture (Fig. 3). The CF3+ ion count rate is decreased by a factor of 0.25 compared to the condition “plasma off”. While the ion ratio I(CF2+)/I(CF3+) at “plasma on” is nearly the same as at “plasma off” (I(CF2+)/I(CF3+) = 0.16), the ion ratio I(CF+)/I(CF3+) is increased by a factor of 2 (I(CF+)/I(CF3+) = 0.12). Normalizing the ion count rates, the following ratio was measured at the condition “plasma off”: I(CF3+) : I(CF2+) : I(CF+) = 1 : 0.16 : 0.06, and at the condition “plasma on”: I(CF3+) : I(CF2+) : I(CF+) = 1 : 0.16 : 0.12. The surprisingly high CF+ concentration is in accordance with mass spectrometer measurements at comparable pressure in a CF4 plasma [16]. The authors measured the ion composition in the plasma directly. Instead of a ratio of I(CF3+) : I(CF2+) : I(CF+) = 1 : 0.07 : 0.01 derived from ionization rates, they measured a ratio I(CF3+) : I(CF2+) : I(CF+) = 1 : 0.2 : 0.2. Direct detection of the ion composition also gives important hints on the order of the fractional ion fluxes. At 1.3 Pa the authors measured ratios I(CF3+)/I(total) = 0.4 and I(F+)/I(total) = 0.025. This means that nearly half of the ions in the CF4 plasma are CF3+ ions, while the concentration of fluorine ions is a factor of 40 smaller than the total ion concentration. Comparing the results of mass spectrometry shown in Figs. 2 and 3 with those of [16], we assume in a zero order approximation that the findings of [16] for the F+ concentration can also be applied to our deposition system.
[Figure 2: ion count rate, cps [s-1] (logarithmic scale), versus CF4 flow [sccm] from 0.0 to 3.0 for CF3+ (69 amu), CF2+ (50 amu), CF+ (31 amu) and CF2++ (25 amu). CF4-H2-N2-Ar; plasma off; Q(H2) = 5 sccm, Q(N2) = 17.5 sccm, Q(Ar) = 60 sccm.]
Fig. 2. Ion counts of different masses in dependence on CF4 flow rate (plasma off)
[Figure 3: ion count rate, cps [s-1] (logarithmic scale), versus CF4 flow [sccm] from 0.0 to 3.0 for CF3+, CF2+, CF+ and CF2++. CF4-H2-N2-Ar; plasma on; Q(H2) = 5 sccm, Q(N2) = 17.5 sccm, Q(Ar) = 60 sccm.]
Fig. 3. Ion counts of different masses in dependence on CF4 flow rate (plasma on)
3.2
Etch Rates of h-BN/c-BN
In the discussion on the chemical route in PECVD the possibility of selective etching of h-BN/c-BN by hydrogen and fluorine plays an important role. Etching experiments were performed in the set-up shown in Fig. 1. In a first step h-BN as well as c-BN films were deposited by PEPVD. For details of this deposition technique we refer to [14,15]. In the second step the films were sputtered/etched in Ar, Ar/H2 and Ar/CF4. The etching rate as well as the composition of the film during the process were monitored by IRRAS. Figure 4 shows the dependence of the etch/sputter rates of h-BN as well as c-BN films on the bias voltage and the gas composition used. The Ar flow rate Q(Ar) was chosen for stable working conditions of the hollow cathode arc. The flow rates Q(CF4) and Q(H2) applied are the same as those used under c-BN deposition conditions. The bias voltage results from the rf-voltage applied. Sputter rates in Ar and Ar/H2 show the typical dependence on bias voltage and hardly differ, while sputter/etch rates in Ar/CF4 differ strongly. At low bias voltages the etch rates in Ar/CF4 are nearly one order of magnitude higher than in Ar and Ar/H2 mixtures. From the data of Fig. 4 and measurements of the rf-power and voltage the etch/sputter yields were calculated as shown in Fig. 5. In these calculations we assumed mass densities of ρ(c-BN) = 3.0 g/cm3 and ρ(h-BN) = 2.6 g/cm3 [17].
[Figure 4: etch/sputter rate [nm/min] (logarithmic scale) versus UBias [V] for h-BN and c-BN in Ar, Ar/H2 and Ar/CF4. Q(Ar) = 60 sccm, Q(CF4) = 3 sccm, Q(H2) = 3 sccm, Idischarge = 35 A.]
Fig. 4. Etch/sputter rates in dependence on bias voltage, gas composition and phase of BN film
[Figure 5: sputter/etch yield (logarithmic scale) versus UBias [V] for c-BN and h-BN in Ar, Ar/H2 and Ar/CF4, together with Ar data from [18]. Q(Ar) = 60 sccm, Q(CF4) = 3 sccm, Q(H2) = 3 sccm, Idischarge = 35 A.]
Fig. 5. Etch/sputter yield in dependence on bias voltage, gas composition and phase of BN film
Regarding the sputter selectivity of Ar for h-BN/c-BN, the results of Figs. 4 and 5 show the reverse behavior compared with TRIM simulations [18]. Because of the higher density of c-BN and similar binding energies in h-BN and c-BN, the sputter rate of c-BN becomes somewhat higher than that of h-BN, as shown in [18]. One possible explanation for the difference from our results is that in the experiments of Fig. 4 the h-BN films are strongly textured (t-BN) and compressed, while TRIM assumes amorphous material. The influence of the texture of h-BN films on the sputter rate will be investigated in the future. The measurements in the Ar/H2 mixture indicate a small influence of chemical etching by hydrogen. At low bias voltages the rates are higher than in pure argon, and at high voltages the rates coincide. These results can be interpreted as a superposition of sputtering by Ar ions with a small amount of chemical etching by hydrogen. The etch/sputter selectivity found results only from the contribution of Ar ion sputtering. After subtracting the argon sputter yield from the sputter yield in the Ar/H2 mixture, hydrogen etching does not show selectivity. This result corresponds to results from the theoretical study on the interaction of hydrogen species with boron nitride [19]. The authors found only a small selectivity in the etching of the two phases h-BN and c-BN by hydrogen. In the Ar/CF4 mixture a pronounced chemical etching could be observed. At a bias voltage of UBias = −100 V the etch rate is about one order of magnitude higher than the sputter rate of Ar. If we assume a superposition of chemical etching with sputtering in Ar and therefore subtract the sputter yield of Ar from the etch/sputter yield of the Ar/CF4 mixture, the following statements result: the etch rate of c-BN is constant, independent of bias voltage, while the etch rate of h-BN increases with increasing bias voltage. Therefore both curves cross each other.
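The conversion from an etch rate to a yield, and the subtraction of the argon sputter contribution under the superposition assumption described above, can be sketched as follows. This is only an illustration: the ion flux, the function names and the example rates are hypothetical (in the text the ion flux follows from the measured rf-power and voltage, which are not reproduced here), and the same ion flux is assumed for both gas mixtures.

```python
AVOGADRO = 6.02214076e23        # formula units per mol
MOLAR_MASS_BN = 10.81 + 14.007  # g/mol for one B plus one N atom


def removal_yield(rate_nm_per_min, density_g_cm3, ion_flux_cm2_s):
    """BN formula units removed per incident ion.

    rate_nm_per_min : measured etch/sputter rate
    density_g_cm3   : assumed film density (3.0 for c-BN, 2.6 for h-BN in the text)
    ion_flux_cm2_s  : ion flux to the substrate; a hypothetical input here
    """
    rate_cm_per_s = rate_nm_per_min * 1e-7 / 60.0
    removed_units = rate_cm_per_s * density_g_cm3 / MOLAR_MASS_BN * AVOGADRO
    return removed_units / ion_flux_cm2_s


def chemical_contribution(yield_mixture, yield_argon):
    """Yield left after subtracting pure Ar sputtering (superposition assumption)."""
    return yield_mixture - yield_argon


# Hypothetical numbers for illustration only (not read from Fig. 4 or Fig. 5).
ION_FLUX = 1.0e15  # ions / (cm^2 s)
yield_ar = removal_yield(1.0, 3.0, ION_FLUX)        # c-BN in Ar
yield_ar_cf4 = removal_yield(10.0, 3.0, ION_FLUX)   # c-BN in Ar/CF4
chemical = chemical_contribution(yield_ar_cf4, yield_ar)

print(f"Ar: {yield_ar:.3f}, Ar/CF4: {yield_ar_cf4:.3f}, chemical part: {chemical:.3f}")
```

Because the yield is linear in the rate at fixed ion flux, a tenfold higher rate in Ar/CF4 gives a tenfold higher yield, and the chemical part is whatever remains after the Ar term is removed.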
The relatively low etching selectivity between the two BN phases found in the Ar/CF4 mixture is completely different from the findings discussed in [13]. In plasma enhanced CVD processes these authors found a selectivity factor of 6 between h-BN and c-BN. No bias voltage was applied, so that a floating potential (-10 to -20 V) can be assumed. In the published literature sputter yield measurements of h-BN/c-BN are only reported for Ar [18]. Figure 5 also shows measured sputter rates in Ar from [18] for comparison. With the exception of the rate measured at 400 V the results are in fair agreement.
3.3
Plasma Enhanced Chemical Vapor Deposition of c-BN Films
Extensive measurement series were carried out to find the parameter range for PECVD of c-BN films. In the experiments the power of the hollow cathode arc was kept constant (P = 3.24 kW), as were the argon and CF4 gas flow rates (Q(Ar) = 60 sccm, Q(CF4) = 1 sccm). Films were deposited on (111) silicon substrates (diameter: φ = 2″). The phase of the BN films and their thickness were measured in situ by IRRAS. Cubic BN films could be deposited in a relatively broad parameter range. The boundaries of the parameter range found are given in Fig. 6. The upper curve denotes the boundary at which c-BN nucleates. In the deposition procedure BN deposition started at a relatively high bias voltage (UBias = −750 V), first depositing textured h-BN (t-BN). For the nucleation of c-BN the bias voltage must be reduced to the values shown in Fig. 6 as a function of hydrogen flow. Without H2 the voltage should be reduced to 450 V, and with increasing hydrogen flow the voltage increases. The lower curve shows the phase transition in the growth from c-BN to h-BN as a function of hydrogen flow. If the applied voltage is reduced to values lower than the boundary voltage, h-BN growth starts. Between the upper and the lower curve c-BN deposition could be achieved. At Q(H2) = 10 sccm the nucleation bias voltage was so high that it could not be realized by the rf-power supply used.
[Figure 6: UBias [V] from 0 to 600 versus Q(H2) [sccm] from 0 to 10, with two curves: UBias nucleation boundary and UBias phase boundary c-BN => h-BN. Q(Ar) = 60 sccm, Q(N2) = 17.5 sccm, Q(CF4) = 1 sccm, Idischarge = 75 A.]
Fig. 6. Bias voltage for nucleation- and phase-boundary in dependence on hydrogen flow
3.4
c-BN Film Properties
c-BN films deposited in the B/N2/H2/CF4 mixture were characterized by FTIR and GIXD. The thickness of the films was 500 nm, consisting of about 20 nm h-BN plus 480 nm c-BN. An example of a GIXD pattern is shown in Fig. 7. The pattern shows only characteristic reflections of the cubic BN phase. The intensity ratios correspond to bulk values with a statistical distribution of crystallites. That means there is no preferred orientation in the polycrystalline c-BN film, and the film can be considered an isotropic material. The mean particle size is about 45 Å.
[Figure 7: X-ray intensity [arb. units] versus 2θ [degrees] from 40 to 120, with reflections c-BN (111), c-BN (200), c-BN (220) and c-BN (311).]
Fig. 7. GIXD pattern of c-BN film, sample Si022, angle of incidence ω = 0.6°
One of the most important problems of c-BN film deposition is the internal stress of the film, which causes flaking and peeling-off of the layers. Therefore investigations of film properties also focus on stress measurements. Notable are the peak shifts in the GIXD spectra of the deposited film to smaller angles in comparison to stress-free reference material [20]. Changes of the local atomic ordering of the c-BN during deposition and crystallization will result in a change of its molar volume. In this context, the term molar volume is meant to indicate the volume occupied by 1 mol of BN plus associated defects. Imperfections of the first type, such as point defects, displacement disorders, or substitution disorders, affect the molar volume of the material. Changes in the concentration of these imperfections should be accompanied by a change in film stress. It is evident that an increase in the molar volume of films bonded to a rigid substrate will result in an overall increase in compressive film stress. From the Bragg-angle shifts the stress was calculated. Assuming an isotropic defect concentration, the compressive stress ∆δ due to an increase of the molar volume ∆V/V can be calculated by:

$$\Delta\delta = -\frac{1}{3}\,\frac{E}{1-\nu}\,\frac{\Delta V}{V}$$

with E the Young modulus and ν the Poisson's ratio of the film [21]. Additionally, film stress was measured from the bending deformation of the substrate. Results of stress measurements at different conditions are given in Table 1. For the sample of Fig. 7 the stress measured by the bending method could be compared with the result of the GIXD measurement. Both results show fair agreement and indicate that with PECVD the stress can be reduced by a factor of nearly 3 in comparison with comparable plasma enhanced physical vapor deposition. These films have been stable for a long time (11 months so far). The other samples of Table 1 peeled off after one week and could not be investigated by GIXD.
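The route from a Bragg-angle shift to a film stress via the relation above can be sketched numerically. The sketch assumes, as the text does, an isotropic change of the molar volume, so that the volume scales with the cube of the lattice spacing. The elastic constants and peak positions below are hypothetical placeholders chosen only to make the example run; they are not the values used by the authors.

```python
import math


def volume_change_from_bragg_shift(two_theta_ref_deg, two_theta_meas_deg):
    """Relative volume change dV/V from a shift of a Bragg peak.

    Bragg's law gives d = lambda / (2 sin(theta)), so the ratio of lattice
    spacings is sin(theta_ref) / sin(theta_meas), independent of wavelength.
    For an isotropic expansion the volume scales with d cubed.
    """
    theta_ref = math.radians(two_theta_ref_deg / 2.0)
    theta_meas = math.radians(two_theta_meas_deg / 2.0)
    d_ratio = math.sin(theta_ref) / math.sin(theta_meas)
    return d_ratio ** 3 - 1.0


def film_stress_gpa(dv_over_v, youngs_modulus_gpa, poisson_ratio):
    """Stress from the relation in the text: -(1/3) * E / (1 - nu) * dV/V."""
    return -(1.0 / 3.0) * youngs_modulus_gpa / (1.0 - poisson_ratio) * dv_over_v


# Hypothetical inputs, for illustration only.
E_GPA = 800.0           # assumed Young modulus of the film
NU = 0.12               # assumed Poisson's ratio of the film
TWO_THETA_REF = 43.30   # assumed stress-free peak position in degrees
TWO_THETA_FILM = 43.00  # assumed measured position, shifted to a smaller angle

dv = volume_change_from_bragg_shift(TWO_THETA_REF, TWO_THETA_FILM)
stress = film_stress_gpa(dv, E_GPA, NU)
print(f"dV/V = {dv:.4f}, stress = {stress:.2f} GPa (negative means compressive)")
```

A shift to smaller angles corresponds to a larger lattice spacing, hence a positive ∆V/V and a negative, that is compressive, stress, in line with the sign convention of Table 1.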
4
Summary
Cubic BN films with relatively low stress were deposited by PECVD in an Ar/B/N2/H2/CF4 mixture. Sputter/etch rate measurements showed the important role of fluorine. The etch rate of hydrogen was relatively low and etch selectivity could not be detected. The critical bias voltages for nucleation of c-BN as well as for the transition from c-BN to h-BN growth depend strongly on the fluorine and hydrogen content in the plasma. Stress measurements by the bending method and by GIXD showed fairly good agreement and indicated that the film stress can be reduced by a factor of three in relation to a comparable PEPVD method.

Table 1. Stress measurements of c-BN films

Sample   UBias [V]   Stress [GPa], GIXD   Stress [GPa], bending method   Q(CF4) [sccm]   Q(H2) [sccm]
Si021    750-130     -                    -21.4                          1               5
Si022    600-250     -7.2                 -7.1                           2               5
Si023    600-100     -                    -17.7                          1               5
References
1. Landolt-Börnstein, Vol. 17 (Heidelberg, New York 1982).
2. P. B. Mirkarimi, K. F. McCarty, D. L. Medlin, Mat. Sci. Engin. 2, 47–100 (1997).
3. W. Kulisch, Deposition of Diamond-Like Superhard Materials (Berlin, Heidelberg, New York 1999).
4. F. P. Bundy, R. H. Wentorf, J. Chem. Phys. 38, 1144 (1963).
5. F. R. Corrigan and F. P. Bundy, J. Chem. Phys. 63, 3812–3820 (1975).
6. V. L. Solozhenko, V. Z. Turkevich, W. B. Holzapfel, J. Phys. Chem. B 103, 2903–2905 (1999).
7. W. J. Yu, W. M. Lau, S. P. Cha, Z. F. Liu, Q. Q. Zheng, Phys. Rev. B 67, 014108-1–014108-9 (2003).
8. A. Lunk, Plasma-enhanced deposition of superhard thin films, in R. Hippler, S. Pfau, M. Schmidt, K. H. Schoenbach (Eds.), Low Temperature Plasma Physics (Berlin, Weinheim, New York 2001).
9. S. Matsumoto, W. Zhang, Jpn. J. Appl. Phys. 39, L442–L444 (2000).
10. S. Matsumoto, W. Zhang, Jpn. J. Appl. Phys. 40, L570–L572 (2001).
11. W. Zhang, S. Matsumoto, J. Mat. Res. 15, 2677–2683 (2000).
12. S. Matsumoto, N. Nishida, K. Akashi, K. Sugai, J. Mater. Sci. 31, 713–720 (1996).
13. W. Kalss, R. Haubner, B. Lux, Diamond & Relat. Mater. 7, 369–375 (1998).
14. K. L. Barth, A. Neuffer, J. Ulmer, A. Lunk, Diamond & Relat. Mater. 5, 1270–1274 (1996).
15. P. Scheible, A. Lunk, Thin Solid Films 364, 40–44 (2000).
16. M. V. V. Rao et al., Plasma Sourc. Sci. & Technol. 11, 69–76 (2002).
17. G. Lehman, P. Hess, S. Weissmantel, G. Reisse, P. Scheible, A. Lunk, Appl. Phys. A 74, 41–45 (2002).
18. M. Chen, G. Rohrbach, A. Neuffer, K. L. Barth, A. Lunk, IEEE Transact. Plasma Science 26, 1713–1717 (1998).
19. R. Q. Zhang, T. S. Chu, C. S. Lee, S. T. Lee, J. Phys. Chem. B 104, 6761–6766 (2000).
20. JCPDS-PDF (1999) #35-1365.
21. V. Weihnacht, W. Brückner, Thin Solid Films 418, 136–144 (2002).
A Sharp Eye on Thin Films – Advances through Synchrotron Radiation
Ralf Röhlsberger
Technische Universität München, Physikdepartment E13, James-Franck-Str. 1, 85747 Garching, Germany
Abstract. The availability of high-brilliance synchrotron radiation has opened new avenues in thin-film research. In recent years new experimental techniques have been developed to reveal structural and dynamical properties with unprecedented accuracy. In this contribution, selected examples are discussed of how to obtain depth profiles of properties like electron density, magnetization or spin orientation in thin films and multilayers: The measurement of reflected and diffracted intensities in reciprocal space allows one to determine density profiles and lateral order in layered structures. The efficiency of these techniques can be enhanced by using interference effects in grazing incidence geometry that result in the formation of standing waves. This allows one to concentrate the radiation on the layer under study, leading to an enhanced signal-to-noise ratio. Another way to achieve an even higher specificity is tuning the photon energy to an electronic or nuclear resonance. In this way, selected layers in the sample can be probed, which is particularly attractive for the study of magnetic layer systems. It is shown how isotopic probe layers can be used to directly image the depth profile of the magnetic spin structure inside single layers.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 529–545, 2003. © Springer-Verlag Berlin Heidelberg 2003
Parallel to the decreasing size of functional microstructures there is a growing need for efficient methods to characterize their properties on an atomic level. In the last decade a number of powerful methods have been developed for direct imaging of nanostructures, especially after the introduction of scanning probe microscopic techniques. While these methods achieve atomic resolution on surfaces, they are not able to resolve the in-depth structure of thin films and nanoscale systems. In this case one has to resort to probes like photons or neutrons that penetrate deeply into the system and are retrieved after the scattering process. The enormous brilliance of modern synchrotron radiation sources has opened new avenues for the structural characterization of thin films and nanoscale systems. For a precise structural characterization it is mandatory to control the depth inside the sample with high accuracy. The methods that are available here can be divided into two classes:
(a) Diffraction methods. In this case the structure of the sample is mapped into reciprocal space, and the structure is refined from the data collected.
The range in reciprocal space that is sampled determines the corresponding spatial resolution of the method.
(b) Imaging methods. Here the structure of the sample is determined in direct space. A number of x-ray microscopic techniques have been developed to image the lateral structure of thin films and surfaces. Depending on the kind of secondary radiation that is detected, i.e., either photons or electrons, and on their escape depth, the methods integrate over depth ranges from a few micrometers to a few nm. However, if a resonant scattering process is used that introduces element or isotope specificity, individual layers in the sample can be probed.
The main subject of this article is the determination of the in-depth structure of thin films via synchrotron radiation. The article is organized as follows: In the first section the reciprocal space of thin films and layered structures is introduced, and particular examples are discussed of how diffraction methods can be adapted for the structural investigation of thin films. The next section then deals with x-ray interference effects that can be used to confine the radiation intensity to particular parts of the sample in order to (a) select certain parts of the sample that are to be probed and (b) enhance the signal-to-noise ratio through the strong enhancement of the radiation intensity in standing waves. The field of resonant x-ray scattering will be treated in the following sections. Tuning the photon energy to electronic or nuclear resonances guarantees that the signal is generated only from specific layers in the sample. Such layers can be used as marker layers to probe selected parts of the sample like boundaries or interfaces. If the scattering process exhibits isotopic specificity, as in the case of nuclear resonant scattering, the probe layer does not even disturb the chemical integrity of the sample, so that the internal structure of individual layers can be studied.
2
Properties of Synchrotron Radiation
Compared to conventional x-ray sources, synchrotron radiation exhibits a number of unique properties, namely the energy spectrum, the polarization state, the time structure and the highly directional emission. A major driving force behind the development of current and future synchrotron radiation sources is the generation of x-ray beams with increasing brilliance. The brilliance is the key quantity for the investigation of low-dimensional structures like thin films. It is a measure of the number of photons per second emitted per source area and per solid angle within a relative spectral width of 10^-3. The evolution of the brilliance from conventional x-ray sources to the x-ray free electron laser (XFEL) is illustrated in Fig. 1, where the spectral brilliance of some selected present and future sources is also shown.
Fig. 1. Left: Evolution of the brilliance of synchrotron radiation sources (SRS) during the past decades. The brilliance is given in photons/(s mm2 mrad2 0.1 % BW). Right: Brilliance of selected present-day and future synchrotron radiation sources in the hard x-ray regime as function of energy
An increase in brilliance corresponds to an increasingly directional emission of the radiation, which allows the illumination of nanostructures with increasing efficiency. Moreover, with decreasing size of the emitting electron beam the radiation source appears more and more like a point source when viewed from the sample position. This leads to an increasing transverse coherence length at the sample position, so that experiments with (partially) coherent x-ray beams become feasible. In addition, the high degree of coherence allows focusing down to sub-micrometer spot sizes.
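The link between source size and transverse coherence length can be illustrated with a common order-of-magnitude estimate, ξ ≈ λR/(2σ), where λ is the wavelength, R the source-to-sample distance and σ the source size. This estimate, its prefactor (conventions differ between textbooks) and all numbers below are assumptions for illustration; they are not given in the text.

```python
def transverse_coherence_length(wavelength_m, distance_m, source_size_m):
    """Order-of-magnitude transverse coherence length, lambda * R / (2 * sigma).

    The factor 1/2 is one common convention; other definitions differ by
    numerical factors of order one.
    """
    return wavelength_m * distance_m / (2.0 * source_size_m)


# Hypothetical beamline parameters, for illustration only.
WAVELENGTH = 1.0e-10   # 1 Angstrom x-rays
DISTANCE = 40.0        # source-to-sample distance in metres

for source_size in (100e-6, 10e-6):
    xi = transverse_coherence_length(WAVELENGTH, DISTANCE, source_size)
    print(f"source size {source_size * 1e6:.0f} um -> coherence length {xi * 1e6:.0f} um")
```

The estimate is inversely proportional to the source size, which is the qualitative statement made in the text: a smaller emitting electron beam gives a larger transverse coherence length at the sample.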
3
Diffraction Methods: Density Profiles of Thin Films
Diffraction methods rely on the analysis of the intensity distribution in reciprocal space. The reciprocal space of a layer system that is accessible in a scattering experiment is shown in Fig. 2a. The truncation of the infinite crystal lattice by the surface leads to additional intensity between the Bragg spots along the qz direction, resulting in the so-called truncation rods [1]. The scan along the rod through the origin corresponds to the measurement of the specular reflectivity. Since no momentum transfer parallel to the sample surface is involved, one probes the in-depth structure along the surface normal. A typical reflectiviy curve is shown in Fig. 2b. Since the index of refraction for x-rays is slightly below 1, n = 1 − δ, one observes total√reflection over an angular range of a few mrad until the critical angle ϕc = 2δ is reached. Beyond ϕc , the intensity decays proportional to qz−4 . If the surface is covered by a thin film or a multilayer, this decay is modulated in a characteristic way
Fig. 2. (a) Scheme of a multilayer thin film with correlated roughness and its reciprocal space. The regions below the dashed lines are not accessible in a scattering experiment (incident beam or detector below sample). The black dots along the specular rod are the Bragg reflections due to the vertical periodicity of the multilayer. Due to the correlated roughness between the individual layers, horizontal Bragg-sheets appear. The various dotted lines indicate different ways to map out the reciprocal space: (1) specular reflectivity (qx = 0), (2) detector scan (ϕi = const.), (3) rocking scan (qz = const.) (b) Specular reflectivity of a multilayer consisting of 30 bilayers of Fe metal (6.0 nm) and its native oxide (0.8 nm) [2]
from which the structure of the layer system can be determined with high precision. In the simplest approximation, the first-order Born approximation, the scattered intensity along the specular direction, i.e., the reflected intensity, is given by

$$I(q) = \frac{1}{q_z^4} \left| \int \frac{d\rho(z)}{dz} \exp(-i q_z z)\, dz \right|^2 \qquad (1)$$

where $q_z$ is the wavevector transfer. Great effort has been made in the past to extract the density profile $\rho(z)$ from reflectivity measurements. Here one encounters the phase problem of x-ray physics: Since the phase information is lost in the measurement of intensities, a unique reconstruction of $\rho(z)$ is not possible in general. However, in many cases a solution is possible nonetheless if additional information about the sample is available. It can be shown that reflectivity data can be inverted if more than one measurement can be done on the same sample, e.g., by contrast variation around an absorption edge. Other schemes propose the introduction of reference layers to retrieve the phase information. In practice, however, such methods are limited when no appropriate absorption edge is accessible or the reference layer leads to a significant perturbation of the system. On the other hand, the natural
presence of interfaces at different depths in a layer system may provide enough phase information so that a reconstruction of the density profile even from a single measurement becomes possible, as shown recently [3]. The algorithm relies on the theory of logarithmic dispersion relations: In the same way as the real and imaginary part of an analytic function are connected by the Kramers-Kronig relations, one can establish a similar relation for the modulus and the phase of an analytic function. Thus, from the measurement of the scattered intensity, which is in the kinematical limit the modulus squared of the structure function, the phase of the structure function can be reconstructed under certain conditions. This method can be advantageously applied to soft-matter thin films like polymers, liquids or layered bio-organic structures like membranes [5]. The spatial accuracy is usually better than 0.1 nm and the sensitivity to the density changes in the profile is in the range of a few percent [6]. Lateral variations of the layer structure lead to off-specular contributions. The combination of grazing incidence (GI) reflection with small-angle x-ray scattering (SAXS) has merged into the GISAXS technique to investigate microstructured thin films like dewetted polymers [7], nanoparticles in thin films [8] etc.
3.1
Layering at Boundaries
The technique of specular x-ray reflection for the analysis of density profiles of thin films was introduced by Parratt [9]. Presently, synchrotron radiation is routinely used for high-precision determination of the internal structure of thin films down to the atomic level. Such studies become possible due to the enormous dynamic range in the scattered intensity that is accessible at present-day sources. An example is the detection of layering on the liquid metallic surface of Ga [10]. Recently, the use of high-energy x-rays opened the way to investigate the layering and lateral ordering at the liquid-solid interface [11,12]. One example is shown in Fig. 3a, which displays the layering of liquid Ga at the diamond surface [11]. Only the measurement of reflectivities down to the $10^{-10}$ level at momentum transfers up to 2 Å$^{-1}$ facilitates such studies. A similar layering phenomenon on a much larger length scale was observed for Pt nanoparticles embedded in an Al$_2$O$_3$ matrix [8], which is illustrated in Fig. 3b. According to Eq. (1), the scattered intensity decreases with decreasing difference in the electron density of adjacent layers. The limits of this technique were investigated recently [6]. It was shown that the method can be applied even in the case of low-contrast boundaries, e.g., between polymer films [6]. This is typically the domain of neutron scattering, where scattering contrast is produced by deuteration of one of the layers. However, the enormous brilliance of modern synchrotron sources can easily compensate for the low contrast, so that problems that may arise from deuteration in such systems can be avoided.
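The dependence of the Born-approximation intensity of Eq. (1) on the density contrast can be illustrated with a short numerical sketch. It assumes an idealized profile built from perfectly sharp steps, so that the derivative of the density is a sum of delta functions and the integral reduces to a sum over interfaces. The function names, the 10 nm film and the relative density steps are hypothetical choices for illustration and are not taken from the measurements discussed here.

```python
import cmath
import math


def born_reflectivity(qz, interfaces):
    """Evaluate Eq. (1) for a density profile made of sharp steps.

    interfaces is a list of (z_j, step_j) pairs, where step_j is the jump
    in density at depth z_j. For such a profile d(rho)/dz is a sum of
    delta functions, so the integral becomes a sum of phase factors.
    """
    amplitude = sum(step * cmath.exp(-1j * qz * z) for z, step in interfaces)
    return abs(amplitude) ** 2 / qz ** 4


def scale_contrast(interfaces, factor):
    """Return the same layer geometry with every density step scaled."""
    return [(z, factor * step) for z, step in interfaces]


# Hypothetical film (illustration only): 10 nm thick, arbitrary density units.
thickness = 10.0  # nm
film = [(0.0, 0.6), (thickness, 0.4)]  # vacuum/film and film/substrate steps

if __name__ == "__main__":
    low_contrast = scale_contrast(film, 0.1)
    for n in range(1, 6):
        qz = n * math.pi / thickness  # nm^-1
        print(qz, born_reflectivity(qz, film), born_reflectivity(qz, low_contrast))
```

In this sketch, reducing all density steps by a factor of ten lowers the intensity by a factor of one hundred at every $q_z$, which is the quadratic dependence on contrast that makes low-contrast boundaries demanding in terms of dynamic range.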
Fig. 3. (a) Atomic layering of liquid Ga at a diamond surface. The upper curve shows the reflectivity curve that was recorded with 17 keV x-rays at beamline ID10 of the ESRF [11]. The experimental setup is shown in the inset. The lower curve displays the density profile derived from the reflectivity. (b) Reflectivity curve of an Al$_2$O$_3$ layer with embedded Pt nanoparticles. Evaluation of the reflectivity curve reveals layering of the particles at the substrate interface, shown schematically in the corresponding density profiles (lower two panels)
3.2
Growth of Oxide Layers
Another interesting example in this field is provided by native oxide layers on metal surfaces. This is illustrated in Fig. 4, which shows the reflectivity of a 28 nm thick Ta film that was produced by sputter deposition and exposed to air afterwards. The native oxide layer consists of Ta$_2$O$_5$, which forms a very sharp boundary with the Ta metal. As a result, the reflectivity curve exhibits an extra beat period from which the oxide layer thickness can be determined with very high accuracy. In particular, the reflectivity in the region of the nodes is very sensitive to small changes in the average thickness of the oxide layer. This is illustrated in Fig. 4b, where the node of the beat pattern is shown during growth of the oxide under ambient conditions. From the analysis of this region, the growth kinetics can be derived with very high accuracy. The temporal evolution of the thickness, $d(t)$, obtained in this study is shown in Fig. 4c [13]. The solid line is a simulation according to the law $d(t) = d_0 + c\sqrt{t}$.
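Because the growth law $d(t) = d_0 + c\sqrt{t}$ is linear in $\sqrt{t}$, its two parameters can be obtained by an ordinary least-squares line fit. The following is a minimal sketch of that idea, not the evaluation procedure used in [13]. The function name and the data points are hypothetical, and the data are synthetic values generated from the law itself rather than measured thicknesses.

```python
import math


def fit_sqrt_law(times, thicknesses):
    """Least-squares fit of d(t) = d0 + c*sqrt(t); returns (d0, c).

    The law is a straight line in the variable x = sqrt(t), so the
    standard closed-form expressions for slope and intercept apply.
    """
    xs = [math.sqrt(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_d = sum(thicknesses) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(xs, thicknesses))
    c = sxd / sxx
    d0 = mean_d - c * mean_x
    return d0, c


# Synthetic data (assumed values, arbitrary units), generated from the law.
times = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
assumed_d0, assumed_c = 2.0, 0.1
thicknesses = [assumed_d0 + assumed_c * math.sqrt(t) for t in times]

if __name__ == "__main__":
    d0, c = fit_sqrt_law(times, thicknesses)
    print("d0 =", d0, "c =", c)
```

With real data, plotting the thickness against $\sqrt{t}$ gives a quick visual check of whether the square-root law is an adequate description before the fitted parameters are interpreted.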
Fig. 4. (a) Reflectivity of a Ta film on Si, recorded at a photon energy of 14.4 keV. Ta develops a native oxide (Ta$_2$O$_5$) during exposure to air. The beat pattern in the reflectivity curve allows for a precise determination of the thickness of the oxide. (b) With increasing oxide layer thickness, the node of the beat pattern shifts to smaller angles. (c) Temporal evolution of the oxide layer thickness, $d(t)$, during growth under ambient conditions [13]. The solid line is a fit according to the law $d(t) = d_0 + c\sqrt{t}$
4
X-ray Interference Effects
X-ray interference effects in grazing incidence geometry can be used to confine the radiation field to certain regions of the sample. This is a result of x-ray standing waves that form above totally reflecting boundaries due to the superposition of incident and reflected waves [14]. A selection of some typical geometries is shown in Fig. 5. The right side shows the intensity of the radiation as a function of depth in the sample. On the left side the reflectivity curve of the system is shown. At the critical angle, an antinode of the standing wave coincides with the surface, where the intensity peaks with up to four times the incident intensity, see Fig. 5a. Variation of the angle of incidence leads to a shift in the vertical position of the antinodes. This is the basis for the method of x-ray standing waves that is used to study the position of adsorbates on single-crystalline surfaces. For the study of thin films, this method becomes particularly effective when the layer under investigation is coated on a highly reflective substrate, i.e., a material of higher electron density. A substantial intensity enhancement inside the layer is achieved if the layer thickness d is an integer multiple of the standing wave period, because waves that are repeatedly reflected at the boundaries then add up constructively, see Fig. 5b. The intensity enhancement inside the film is even stronger if the film under study is sandwiched between two layers of higher electron
[Fig. 5 panels a, b, c: reflectivity vs. angle of incidence/mrad (left) and normalized intensity vs. depth/nm (right); layer labels: Pd; Fe, Pd; Pd, C, Pd]
Fig. 5. Thin-film interference effects to enhance the fluorescence signal from surfaces and thin films. The graphs show the specular reflectivity of the structures (left) and the field intensity as a function of depth (right). The superposition of incident and reflected waves leads to the formation of standing waves (a) above total-reflecting surfaces, where at the critical angle an antinode of the standing wave coincides with the surface, (b) inside a thin film on a total-reflecting substrate, and (c) inside a sandwich structure that acts as an x-ray waveguide. The very strong intensity enhancement in the center of this layer system can be used to probe material that is placed in the center of the spacer layer
density, see Fig. 5c. Since the energy transport takes place parallel to the layer boundaries, such layer systems can be regarded as x-ray waveguides [15]. Depending on the film thickness, a certain number of guided modes can be excited, which show up as minima in the reflectivity curve of the layer system below the critical angle of the high-density layer. Every signal that is derived from the electric field inside the layer can be increased by designing the layer system as an x-ray waveguide and coupling the incident beam into a guided mode by proper adjustment of the angle of incidence.
4.1
Signal Enhancement via X-ray Waveguides
The formation of standing waves in thin films can be used to significantly enhance the signal-to-noise ratio in surface diffraction experiments. This was demonstrated recently in the investigation of the lateral ordering of a multilayer structure consisting of 10 periods of phospholipid membranes [16]. By
sandwiching this membrane stack between the Si substrate and a Ti capping layer, schematically shown in Fig. 6a, an x-ray waveguide was produced. The excitation of the waveguide modes is evidenced by strong dips in the specular reflectivity. At these positions a strong intensity enhancement inside the film develops, resulting in a remarkable intensity enhancement of the diffracted beam. This is illustrated in Fig. 6a. The resulting surface diffraction pattern is shown in Fig. 6b, indicating an increase of the signal-to-noise ratio by two orders of magnitude when the first-order waveguide mode in the film is excited. The three observed peaks can be indexed to a slightly distorted 2D hydrocarbon chain lattice [16], as shown in the inset. It should be emphasized here that nearly any kind of organic or biomolecular thin film with a thickness larger than about 10 nm that is sandwiched between high-density layers (e.g., silicon, glass, transition metals) fulfills the optical criteria of an x-ray waveguide. Using this technique, structural information on the molecular level can be extracted with unprecedented signal-to-noise ratios.
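The intensity enhancement exploited here has its origin in the standing-wave field introduced at the beginning of this section. A minimal sketch for the simplest geometry, a single total-reflecting surface rather than a full waveguide, is given below. It assumes the small-angle Fresnel amplitude for a medium with refractive index $n = 1 - \delta + i\beta$ and, by default, neglects absorption ($\beta = 0$), which is the idealized limit in which the factor of four at the critical angle is reached exactly. The function names and all numerical values are hypothetical.

```python
import cmath
import math


def fresnel_r(theta, delta, beta=0.0):
    """Fresnel amplitude reflectivity at grazing angle theta (rad).

    Small-angle approximation for a medium with n = 1 - delta + i*beta.
    The wavevector components are expressed in units of k = 2*pi/lambda.
    """
    kz_vacuum = theta
    kz_medium = cmath.sqrt(theta ** 2 - 2.0 * delta + 2j * beta)
    return (kz_vacuum - kz_medium) / (kz_vacuum + kz_medium)


def intensity_above_surface(z, theta, wavelength, delta, beta=0.0):
    """Field intensity at height z above the surface, normalized to the
    incident intensity: |1 + r * exp(2i * kz * z)|^2."""
    kz = 2.0 * math.pi * theta / wavelength
    return abs(1.0 + fresnel_r(theta, delta, beta) * cmath.exp(2j * kz * z)) ** 2


# Assumed illustration values: delta = 5e-6, wavelength 0.1 nm.
delta = 5.0e-6
wavelength = 0.1  # nm
theta_c = math.sqrt(2.0 * delta)  # critical angle in rad

if __name__ == "__main__":
    period = wavelength / (2.0 * theta_c)  # standing-wave period at theta_c
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        z = fraction * period
        print(z, intensity_above_surface(z, theta_c, wavelength, delta))
```

In this lossless limit the surface intensity below the critical angle grows as $4\theta^2/\theta_c^2$, and the standing-wave period $\lambda/(2\theta)$ sets the length scale on which the antinodes move when the angle of incidence is varied. With absorption included ($\beta > 0$) the maximum stays below four.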
[Fig. 6 labels: relative electron density; incident, reflected and diffracted beams; TE0 mode; reflected intensity; diffracted intensity; diffraction peaks (2,0), (1,−1), (1,1)]
Fig. 6. Resonant enhancement of surface diffraction via standing-wave formation in an x-ray waveguide, where a stack of phospholipid membranes serves as guiding layer [16]. (a) Two waveguide modes appear as minima in the reflectivity, where the intensity of the diffracted beam is strongly enhanced. (b) Enhancement of the signal-to-noise ratio due to the waveguide effect (lower curve); the upper curve is without coupling into a resonant mode
4.2
Confined Fluids
Many properties of liquid films confined between two flat and parallel solid surfaces with nanometer distances are not understood in detail. The investigation of confined liquids has direct impact on the understanding of phenomena such as lubrication, adhesion, surface chemistry, etc. A new approach to investigate such systems has been developed recently. The central element is an x-ray waveguide structure with a tunable gap [17], as shown schematically in Fig. 7. Fluids that are confined between the two reflecting surfaces influence the mode mixing upon propagation of the x-rays along the channel. Therefore, the analysis of the diffraction pattern at the exit allows one to determine the layer structure of the fluid. Figure 7b shows the specular reflectivity through the waveguide channel that was filled with a macromolecular liquid and subjected to a pressure of 1.5 MPa [18]. The simulation of the measured data (solid line) revealed molecular layering in the fluid, as sketched in the inset. In such a case the liquid acquires solid-like properties with dramatic consequences for its lubrication behaviour, for example.
Fig. 7. (a) Experimental setup for the study of confined fluids [17] and application to a liquid macromolecular film. (b) Reflectivity curve of a confined 2.5 nm thick macromolecular film at a pressure of 1.5 MPa. The density profile derived from the reflectivity indicates molecular layering [18]
5
Element Specificity
Tuning the x-ray energy to an absorption edge allows one to select particular elements in the sample and thus, for example, to address individual layers in a multilayered sample. This is particularly attractive for the investigation of magnetic materials. Due to exchange interaction, resonant x-ray scattering at the L and M edges of the transition elements, the rare earths and the actinides is strongly dependent on magnetization, as sketched in Fig. 8. From the absorption spectra recorded for opposite photon helicities,
[Fig. 8 labels: Fe L-edge; Co L-edge; conventional hysteresis; level scheme with E_F, 3d, 4s, L3, L2, 2p3/2, 2p1/2, E0 = 700 eV, spin-orbit and exchange splitting]
Fig. 8. Left: Level scheme of the L-edges of Fe. Due to the spin polarization of the 3d band in a magnetic environment, one obtains an asymmetry in the scattering of photons with opposite helicity. Right: Element specific hysteresis loops for a trilayer system consisting of Fe(10 nm)/Cu(3 nm)/Co(5 nm) compared to the total hysteresis loop of the same system [20]
orbital and spin moments can be derived via sum rules [19]. Due to strongly enhanced magnetic contributions to the optical constants, it is possible to study magnetic properties in an element- and site-selective manner. This is shown in the right column of Fig. 8 for a trilayer system Fe/Cu/Co. Using the dichroic signal from the Fe and Co L-edges, one could measure the hysteresis loops of these layers separately [20], revealing a magnetic coupling through the Cu spacer layer.
5.1
Magnetic X-ray Scattering
The presence of an absorption edge strongly influences the optical properties of a material. In the x-ray regime the refractive index can be written as $n(E) = 1 + (2\pi/k_0^2) \sum_i \rho_i f_i(E)$, where the sum runs over all atomic species in the sample and the atomic forward scattering amplitude is given by [21]

$$f_i(E) = -(\mathbf{e}_f \cdot \mathbf{e}_i)\,[f_0 + f'(E) + i f''(E)] - (\mathbf{e}_f \times \mathbf{e}_i) \cdot \mathbf{z}\,[m'(E) + i m''(E)] \qquad (2)$$

where $f_0$ is the nonresonant scattering amplitude, $f'(E) + i f''(E)$ describes the contribution near an absorption edge, $m'(E) + i m''(E)$ is the magnetic part of the scattering amplitude, and $\mathbf{z}$ is the unit vector of the magnetization.
Accordingly, in resonant reflection from magnetic samples, strong magneto-optical effects are observed. This opens interesting possibilities to study the magnetism of surfaces, thin films and nanostructures [22]. The strong dependence on the magnetization direction also allows for the determination of magnetic spin structures.
5.2
Induced Magnetization Density Profiles
Some nonmagnetic elements like Pt or Pd can be substantially polarized in the vicinity of magnetic layers. The induced magnetization of nonmagnetic layers may significantly contribute to the magnetic properties of the system. This can be studied by tuning the x-ray energy to an absorption edge of the nonmagnetic element. Such an investigation was performed recently on a Co/Pt bilayer [23]. The real and imaginary parts of the magnetic scattering amplitude in Eq. (2) around the Pt L3 edge are shown in Fig. 9. To determine the magnetization density profile, two reflectivity measurements were performed in the vicinity of the Pt L3 edge, at energies E1 and E2 as shown in Fig. 9. From these measurements the magnetization density profile could be reconstructed with a spatial resolution of 0.1 nm, shown in Fig. 9. Remarkably, one observes a constant magnetization per Pt atom up to 0.3 nm above the inflection point of the chemical profile. Beyond that, the Pt magnetization decreases exponentially with a decay length of 0.3 nm. In general, such measurements allow for a high-resolution study of the interplay between structural and magnetic roughness.
[Fig. 9 labels: m′, m″ and fit of m′; simulation and fit curves; E1 = 11566 eV, E2 = 11562 eV; µ_Pt/atom, µ_Pt/layer; 0.3 nm; Pt, Co]
Fig. 9. Determination of the induced magnetization density profile in a Pt layer on Co. The magnetization profile (right) is reconstructed from two reflectivity measurements at energies E1 and E2 in the vicinity of the Pt L3 edge [23]. The left panel shows the measured reflectivity curve and the asymmetry ratios for left- and right-circular polarization at energies E1 and E2, respectively
6
Isotopic Specificity
Even if single layers can be selected by tuning the energy to an atomic resonance, the measured signal originates from the whole layer. To determine the magnetic depth profile within single layers, special reconstruction techniques have to be applied, as shown in the previous section. These algorithms, however, do not always lead to unique solutions. It is therefore desirable to have methods at hand to directly image magnetic depth profiles. One approach is to use a scattering process that is isotopically sensitive. This applies to nuclear resonant scattering. While isotopic probe layers have been employed for this purpose in conventional Mössbauer spectroscopy [24], this technique has experienced a revitalization by using high-brilliance synchrotron radiation beams with a reduction of data acquisition times by orders of magnitude [25,26,27].
6.1
Nuclear Resonant Scattering of Synchrotron Radiation
Using synchrotron radiation, Mössbauer spectroscopy is transformed from an incoherent absorption technique on the energy scale to a time-resolved coherent scattering method: The simultaneous excitation of the hyperfine-split nuclear energy levels, shown in Fig. 10a, by a pulse of synchrotron radiation leads to temporal beats in the evolution of the subsequent nuclear decay [28]. The analysis of this beat pattern allows a precise determination of the magnitude and the orientation of magnetic fields in nanoscale structures [29]. This is shown for some selected geometries in Fig. 10b.
6.2
Mapping of Magnetic Spin Structures
Bilayers consisting of hard- and soft-magnetic thin films are often used as model systems to investigate the magnetic coupling between different magnetic phases [30]. Such layer systems exhibit the exchange-spring effect: Directly at the hard-magnetic layer, the magnetization of the soft layer is pinned due to exchange interaction. With increasing distance from the interface, the coupling becomes weaker so that the magnetization in the soft-magnetic layer may rotate under the action of an external field. This results in a spiral structure along the normal, shown schematically in Fig. 11, that springs back into alignment with the hard-magnetic layer if the external field is switched off. In the following it is shown how ultrathin probe layers of $^{57}$Fe can be employed to directly image the structure of such a spiral. The experimental scheme is shown in Fig. 11. The bilayer consists of a soft-magnetic layer (11
[Fig. 10 labels: nuclear levels I_e = 3/2 (m_e = +3/2, +1/2, −1/2, −3/2) and I_g = 1/2 (m_g = −1/2, +1/2); transitions ω1 to ω6; geometry symbols k0, m, σ, π]
Fig. 10. Level scheme of the 14.4 keV transition of the Mössbauer isotope $^{57}$Fe. Here the transitions proceed between the Zeeman-split nuclear levels. The strong polarization dependence leads to significant differences in the time spectra of coherent elastic scattering, from which the orientation of magnetic hyperfine fields can be determined
nm Fe) on a hard-magnetic layer (30 nm FePt in the L1$_0$ phase). In the Fe layer a tilted probe layer of $^{57}$Fe was embedded so that different depths in the sample are probed by illuminating the sample at various transverse displacements ∆x. While the FePt was remanently magnetized along the direction of the incident beam, an external field was applied perpendicular to the scattering plane. Time spectra recorded for H = 160 mT at various values of ∆x are shown in Fig. 12, indicating a gradual transition from A to B in Fig. 10. From these spectra the in-plane rotation angles were derived for different external fields, which is shown in the right column of Fig. 12. The solid lines are simulations according to the model outlined in [30]. This experiment has shown that isotopic probe layers can be advantageously employed to directly image magnetic depth profiles and spin structures in thin films. Further systems to be investigated are metal/oxide heterostructures and exchange-bias layer systems, for example.
7
Conclusion and Outlook
Given the multitude of thin-film studies that are presently performed with synchrotron radiation, this review cannot be complete. For this contribution I have focused mainly on the analysis of coherent diffraction
[Fig. 11 labels: scattering plane; H; ∆x; φ; D; M; k0; 11 nm Fe; 30 nm FePt; 0.7 nm 57Fe; 20 mm]
Fig. 11. Left: Scheme of the spin structure in an exchange-spring bilayer consisting of a soft-magnetic layer (Fe) on a hard-magnetic layer (FePt) with uniaxial anisotropy. An external field H is applied perpendicular to the remanent magnetization M of the hard layer. Right: To image the resulting magnetic spiral, a tilted ultrathin probe layer of $^{57}$Fe is deposited within the soft layer. The magnetic properties in depth D of the film are probed by adjusting the transverse coordinate ∆x of the sample relative to the incident beam [27]
[Fig. 12 panel labels: H_ext = 160 mT, 240 mT, 500 mT; M_FePt]
Fig. 12. Internal magnetic structure of an Fe film that is exchange coupled to a hard-magnetic FePt film [27]. The rotation angle φ, as defined in Fig. 11, is derived from the characteristic beat pattern of the time spectra shown in the left panel
from thin films in the hard x-ray regime to reveal structural and magnetic properties. I did not discuss the multitude of incoherent methods like x-ray absorption, x-ray fluorescence and photoemission that have found numerous applications in the study of thin films. Even the review of coherent scattering methods is not complete here. For example, the GISAXS technique is still evolving, and it has already shown its strength in revealing properties of nanostructured thin films consisting of clusters, nanocrystals and semiconductor quantum dots, polymer films and more. If such studies are performed with coherent x-rays one enters the regime of speckle interferometry. This allows not only for structural but also for dynamical studies, which then merge into the new field of x-ray photon correlation spectroscopy (XPCS). With further increasing brilliance of synchrotron radiation sources, x-ray scattering techniques that so far required bulk samples can be extended to thin films and nanostructures. This applies, for example, to inelastic x-ray scattering. While it is already possible to determine the phonon density of states of thin films via incoherent nuclear absorption [31], the investigation of phonon dispersion relations of low-dimensional structures will become possible in the near future.
Acknowledgements
I highly appreciate the support of Torsten Klein in performing the measurements and evaluation of the data shown in Fig. 4. Furthermore, it was a great pleasure to collaborate with Heiko Thomas and Kai Schlage (Universität Rostock) and Rudolf Rüffer and Olaf Leupold (ESRF), which led to the results presented in Section 6. Finally, I want to thank Uwe van Bürck for critical reading of the manuscript and helpful comments.
References
1. I. K. Robinson: Phys. Rev. B 33, 3830 (1986).
2. G. S. D. Beach and A. Berkowitz: private communication (2002).
3. K.-M. Zimmermann, M. Tolan, R. Weber, J. Stettner, A. K. Doerr, and W. Press: Phys. Rev. B 62, 10377 (2000).
4. W. L. Clinton: Phys. Rev. B 48, 1 (1993).
5. M. Tolan: X-ray Scattering from Soft-Matter Thin Films (Springer, Berlin 1999).
6. O. H. Seeck, I. D. Kaendler, M. Tolan, K. Shin, M. H. Rafailovich, J. Sokolov, and R. Kolb: Appl. Phys. Lett. 76, 2713 (2000).
7. P. Müller-Buschbaum, S. V. Roth, M. Burghammer, A. Diethert, P. Panagiotou, and C. Riekel: Europhys. Lett. 61, 639 (2003).
8. A. Gibaud, S. Hazra, C. Sella, P. Laffez, A. Désert, A. Naudon, and G. Van Tendeloo: Phys. Rev. B 63, 193407 (2001).
9. L. G. Parratt: Phys. Rev. 95, 359 (1954).
10. M. J. Regan, E. H. Kawamoto, S. Lee, P. S. Pershan, N. Maskil, M. Deutsch, O. M. Magnussen, B. M. Ocko, and L. E. Berman: Phys. Rev. Lett. 75, 2498 (1995).
11. W. J. Huisman, J. F. Peters, M. J. Zwanenburg, S. A. de Vries, T. E. Derry, D. Abernathy, and J. F. van der Veen: Nature 390, 397 (1997).
12. C.-J. Yu, A. G. Richter, A. Datta, M. K. Durbin, and P. Dutta: Phys. Rev. Lett. 82, 2326 (1999).
13. T. Klein: Diploma Thesis, University of Rostock (2000).
14. M. J. Bedzyk, G. M. Bommarito, and J. S. Schildkraut: Phys. Rev. Lett. 62, 1376 (1989).
15. Y. P. Feng, S. K. Sinha, H. W. Deckman, J. B. Hastings, and D. P. Siddons: Phys. Rev. Lett. 71, 537 (1993).
16. F. Pfeiffer, U. Mennicke, and T. Salditt: J. Appl. Cryst. 35, 163 (2002).
17. M. J. Zwanenburg, J. H. Bongaerts, F. Peters, D. O. Riese, and J. F. van der Veen: Phys. Rev. Lett. 85, 5154 (2000).
18. O. H. Seeck, H. Kim, D. R. Lee, D. Shu, I. D. Kaendler, J. K. Basu, and S. K. Sinha: Europhys. Lett. 60, 376 (2002).
19. P. Carra, B. T. Thole, M. Altarelli, and X. Wang: Phys. Rev. Lett. 70, 694 (1993).
20. C. T. Chen, Y. U. Idzerda, H.-J. Lin, G. Meigs, A. Chaiken, G. A. Prinz, and G. H. Ho: Phys. Rev. B 48, 642 (1993).
21. J. P. Hannon, G. T. Trammell, M. Blume, and D. Gibbs: Phys. Rev. Lett. 61, 1245 (1988).
22. K. Starke, F. Heigl, A. Vollmer, M. Weiss, G. Reichardt, and G. Kaindl: Phys. Rev. Lett. 86, 3415 (2001).
23. J. Geissler, E. Goering, M. Justen, F. Weigand, G. Schütz, J. Langer, D. Schmitz, H. Maletta, and R. Mattheis: Phys. Rev. B 65, 020405 (2001).
24. G. Lugert and G. Bayreuther: Phys. Rev. B 38, 11068 (1988).
25. B. Niesen, M. F. Rosu, A. Mugarza, R. Coehoorn, R. M. Jungblut, F. Rooseboom, A. Q. R. Baron, A. I. Chumakov, and R. Rüffer: Phys. Rev. B 58, 8590 (1998).
26. C. Carbone et al.: ESRF Highlights 1999, p. 60.
27. R. Röhlsberger, H. Thomas, K. Schlage, E. Burkel, O. Leupold, and R. Rüffer: Phys. Rev. Lett. 89, 237201 (2002).
28. E. Gerdau and H. de Waard (eds.): Nuclear Resonant Scattering of Synchrotron Radiation, Hyperfine Interactions vol. 123/124 (1999) and vol. 125 (2000).
29. R. Röhlsberger, J. Bansmann, V. Senz, K. L. Jonas, A. Bettac, K. H. Meiwes-Broer, and O. Leupold: Phys. Rev. B 67, 245412 (2003).
30. E. E. Fullerton, J. S. Jiang, M. Grimsditch, C. H. Sowers, and S. D. Bader: Phys. Rev. B 58, 12193 (1998).
31. R. Röhlsberger, W. Sturhahn, T. S. Toellner, K. W. Quast, P. Hession, M. Hu, J. Sutter, and E. E. Alp: J. Appl. Phys. 86, 584 (1999).
Stress, Strain and Magnetic Anisotropy: All Is Different in Nanometer Thin Films
Dirk Sander$^1$, Holger Meyerheim$^1$, Salvador Ferrer$^2$, and Jürgen Kirschner$^1$
1 Max-Planck-Institut für Mikrostrukturphysik, Weinberg 2, D-06120 Halle, Germany [email protected]
2 European Synchrotron Radiation Facility, B.P. 220, F-38043 Grenoble Cedex, France
Abstract. The application of the crystal curvature technique for stress measurements at surfaces and in films is presented. Important aspects regarding sample clamping and elastic anisotropy are elucidated in view of a quantitative stress analysis. Combined stress measurements and structural characterizations are mandatory to obtain a meaningful interpretation of the stress data. The in-situ combination of surface X-ray diffraction with stress measurements indicates a previously unknown substantial distortion of a W(110) surface layer upon Ni coverage, and the decisive role of this substrate relaxation for the stress state is discussed. The magnetization-induced stress in epitaxial layers is measured to derive the magnetoelastic coupling coefficients, which drive the magnetostriction of bulk samples. It is found that even a subtle film strain in the sub-percent range leads to a drastically modified magneto-elastic coupling in the films as compared to the respective bulk value. The implication of non-bulk like magneto-elasticity for the magnetic anisotropy of strained films is emphasized, and recent progress in the theoretical description of the strain-dependent modification of the magneto-elastic coupling is acknowledged.
1
Introduction
Mechanical stress in thin films is known as one of the most important issues when it comes to the application and reliability of thin film devices [1,2,3]. A dramatic example of thin film failure is the delamination of a thin film from its substrate. G.G. Stoney reported as early as 1909 [4] that this failure of thin films had been observed in 1858, and he writes, “I have had the same experience in protecting silver films in searchlight reflectors by a film of electro-deposited copper, it being found that if the film of copper is more than 0.01 mm thick peeling is apt to take place”. Stoney concluded that the Cu film was deposited under tension, which seems very plausible from today’s point of view, taking the different lattice constants of Cu (3.61 Å) and Ag (4.09 Å) into account. The difference of the lattice constants indeed leads to a tensile lattice misfit of +13.3 %. An example of the delamination of a 900 nm Cr film under tensile stress from a glass substrate is shown in Fig. 1a.
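The quoted misfit can be checked with a one-line calculation. The sketch below assumes that the misfit is defined relative to the lattice constant of the film (Cu), with a positive sign denoting tensile strain in the film; this definition is an assumption, chosen because it reproduces the value of +13.3 % given above.

```latex
\[
  f \;=\; \frac{a_{\mathrm{Ag}} - a_{\mathrm{Cu}}}{a_{\mathrm{Cu}}}
    \;=\; \frac{4.09\,\text{\AA} - 3.61\,\text{\AA}}{3.61\,\text{\AA}}
    \;\approx\; +0.133 \;=\; +13.3\,\%
\]
```

Referring the difference to the substrate lattice constant instead would give about 11.7 %, so the sign and reference convention should be stated whenever misfit values are compared.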
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 547–561, 2003. © Springer-Verlag Berlin Heidelberg 2003
Dirk Sander et al.
Fig. 1. Stress effects in strained films. (a) Delamination of a 900 nm Cr film from a glass substrate [5]. Image width 1.2 mm. (b) Distortion lines (white) in the 5th layer of Fe (grey), deposited on an extended four layer thin Fe film (dark grey) on W(001) [6]. Scanning tunneling microscopy image, image width 200 nm
The original work by Stoney [4] not only describes the delamination phenomenon, but also offers the first quantitative measurement of Ni-film stress from the bending of a thin steel substrate. Stoney deduced a stress of 0.3 GPa for a 5 µm thin film deposited on a steel ruler. He derived a relation, the so-called Stoney equation, between the radius of curvature $R$ of a substrate of thickness $d$ and the force per unit width $\tau t$ exerted by a film of thickness $t$ under a stress $\tau$: $\tau t = E d^2/(6R)$, with $E$ the modulus of elasticity of the substrate. However, this relation is faulty, as it neglects the two-dimensional nature of the stress and bending problem; in the proper analysis, $E$ is replaced by $E/(1-\nu)$, which includes the Poisson ratio $\nu$ of the substrate. Important aspects regarding quantitative stress measurements are discussed in Sect. 3. Even in films only a few atomic layers thick, stress is an important factor which determines the atomic structure. Considerable stress in the GPa range often induces pronounced structural relaxations, which are driven by a reduction of film stress. An example is shown in Fig. 1b, where islands of the 5th layer Fe are shown as grey patches on a four layer thick Fe film, deposited on W(100) [6]. White lines in the grey patches indicate the formation of distortion lines running along the ⟨100⟩ directions, which are due to the incorporation of additional Fe atoms. This process is driven by the large misfit of 10 % between Fe and W. Incorporation of additional Fe atoms along lines in the fifth layer reduces the strain [6]. The atomic processes which govern stress relaxation on the nanometer scale are of interest in their own right, but here we discuss the implications for the magnetic properties of atomic layers [7].
Combined studies of both film stress and magneto-elastic coupling indicate that the magneto-elastic properties of strained films deviate sharply from the respective bulk behavior, and film strain has been identified as a decisive factor which governs the effective magneto-elastic coupling in ferromagnetic films. This contribution offers in Sect. 2 a brief reference to work which discusses the origin of stress at surfaces and in thin films. Sect. 3 focuses on experimental set-ups for stress measurements, and important aspects regarding the quantitative curvature analysis are discussed there. The stress during Ag growth on a Fe whisker is discussed in Sect. 4, and examples of combined curvature and X-ray diffraction studies are presented in Sect. 5. Sect. 6 elucidates the application of curvature measurements to investigate the magneto-elastic coupling in nanometer thin ferromagnetic films.
2
Origin of Stress in Thin Films and at Surfaces
Mechanical stress in a film can be due to lattice misfit [8], crystalline coalescence [9], surface stress changes [8,10] and magnetization processes [7], to name just a few examples. We refer the reader to the references for a deeper discussion of the underlying physical principles. Possible implications of stress at interfaces and in nanometer thin structures for pattern formation, alloying, shape and structural transitions have been reviewed [11]. In the following we present a few recent results to elucidate some selected aspects of stress measurements.
3
Experimental Aspects of Stress Measurements
Forces acting in a film, which is deposited on one surface of a substrate, induce both a bending and a dilation of the substrate. The resulting relation between film forces and curvature of the substrate has been discussed [7,8,12]. The determination of film stress proceeds via a measurement of the stress-induced deflection of the film-substrate composite. Here, the deflection of a substrate that is clamped at one end, a so-called cantilever substrate, is analyzed. Various techniques have been applied successfully to detect even minute deflections of the substrate, and different experimental realizations are sketched in Fig. 2. The relation between the radius of curvature $R$ and the deflection line of the substrate $\xi(x)$, where $x$ is measured along the sample length, is given to a good approximation by $1/R = \xi''(x)$. If the deflection is measured at the end of the substrate at $x = L$, the curvature is given by $1/R = 2\xi(L)/L^2$. Note that this relation implies that the whole substrate of length $L$ is covered by the film homogeneously. Also, the effect of clamping on the resulting deflection is neglected, as will be discussed below. Stress measurements by analysis of the substrate deflection have been performed with high sensitivity. For example, the substrate might form a capacitor with a reference electrode, and a stress-induced change of the capacitor plate separation can be detected with sub-Ångström sensitivity. If one regards the substrate as a macroscopic cantilever, it is obvious that all techniques which have been applied to detect the cantilever deflections in atomic force microscopy (AFM) can be employed to detect the substrate deflection in a stress measurement.
Fig. 2. Cantilever beam experiments for stress measurements. (a) Interferometric set-up [13]. The deflection of the cantilever 1 is detected by the change of the interference pattern produced in the gap between 1 and optical flat 2. (b) Capacitance measurement [14]. The deflection of the cantilever 1 induces a change of the capacitor plate distance at the position of the reference electrode 2. (c) Capacitance measurement with coils for magnetization measurements [15]. (d) Optical two beam measurement [16]. The curvature 1/R of the substrate A is detected by the deflection of two laser beams on the position sensitive detectors E, F. C: beamsplitter, D: mirror, F: laser
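The relation between end deflection and curvature used for the cantilever, and the slope-difference idea behind the two-beam set-up of panel (d), can be sketched in a few lines. This is a minimal illustration that assumes a uniformly curved substrate and small slopes; the function names and the numbers (a 10 mm substrate bent to R = 500 m) are hypothetical and not taken from the experiments.

```python
def curvature_from_end_deflection(xi_end, length):
    """Curvature 1/R (1/m) from the free-end deflection via 1/R = 2*xi(L)/L**2."""
    return 2.0 * xi_end / length**2


def deflection_profile(x, curvature):
    """Deflection xi(x) of a uniformly curved cantilever clamped at x = 0.

    Integrating xi''(x) = 1/R twice with xi(0) = xi'(0) = 0 gives x**2 / (2 R).
    """
    return 0.5 * curvature * x**2


def curvature_from_two_slopes(slope_1, slope_2, separation):
    """Curvature from the slope difference between two points on the substrate."""
    return (slope_2 - slope_1) / separation


# Hypothetical numbers: a 10 mm long substrate bent to R = 500 m.
length = 10e-3            # m
curvature = 1.0 / 500.0   # 1/m
xi_end = deflection_profile(length, curvature)
print(f"end deflection: {xi_end * 1e9:.0f} nm")
print(f"recovered 1/R:  {curvature_from_end_deflection(xi_end, length):.4f} 1/m")
```

The slope-based function needs no knowledge of the clamping position, which is one reason a two-point slope measurement is attractive when clamping effects matter.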
However, an important aspect with regard to quantitative stress measurements is due to the sample clamping. In most experimental situations, the thin substrate has a rectangular shape, and it is clamped along its width at one end to a sample manipulator. This clamping inhibits a curvature along the sample width near the clamping, and the resulting deflection line is considerably changed. This has been analyzed in great detail by finite-element-method calculations [17]. These calculations introduce the dimensionality of bending $D$, and deviations from $D = 2$ are due to the effect of clamping. The following expression describes the proper relation between curvature $1/R$ and the stress: $\tau t = Y d^2 / \{6 (1 - \nu) [1 + (2 - D)\nu] R\}$, where the film thickness is given by $t$, and $d$ is the sample thickness. The anisotropy of the Young modulus $Y$ and the Poisson ratio $\nu$ of the substrate is neglected in this expression, which is a perfect approximation for W for all orientations, and also for the (111) faces of cubic materials. In the latter case, $Y$ and $\nu$ have to be calculated for the (111) plane [7,18]. The deviation from the ideal two-dimensional bending caused by the clamping is indicated in Fig. 3. The effect of sample clamping on the deflection line is most severe for short substrates with an aspect ratio length/width below two. It is strongest when the deflection $\xi(L)$ at the sample end is measured, and it is less pronounced when the curvature $\xi''(L)$ is measured instead. We present in Fig. 2d a two-beam optical deflection technique which allows one to measure the change of slope of the substrate between two points separated along the substrate length. This difference of slope corresponds to the crystal curvature, and such a measurement resembles a curvature measurement to a good approximation. This technique has been applied to the stress measurements on a Fe whisker, which are discussed next. We conclude by quoting typical values for the stress-induced substrate deflection. For any stress measurement of films in the nanometer thickness range, substrates thinner than 0.5 mm are mandatory. A film stress of −0.6 GPa in a 1 nm Ag film on a 0.1 mm thin Fe(001) whisker is expected to induce a radius of curvature of 570 m, so that the bottom end of the 10 mm long substrate is deflected by 0.088 µm [19]. A curvature of this magnitude can be routinely detected by stress measurements. However, stress induced by magnetization processes is typically three orders of magnitude smaller, as discussed in Sect. 6, and there averaging techniques or phase-sensitive signal detection are used to increase the signal-to-noise ratio [20,21].
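[Figure 3: sketch of a cantilever of length l and width w with radius of curvature R, and a plot of the dimensionality D (left axis) and of 2.0 − D (right axis) versus the aspect ratio a = length/width, for A = 1.0 and ν = 0.3. Separate curves are shown for the curvature κ derived from ζ(l), from ζ′(l), and from ζ″.]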
Fig. 3. Sketch of a cantilever experiment with a sample of length l and width w. The plot indicates the deviation from the true 2-dimensional bending, D = 2, for different aspect ratios and measurement schemes. The calculation is done for an isotropic substrate, A = 1, with a Poisson ratio ν = 0.3. Data from [17], with permission
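The clamping-corrected curvature relation of this section, $\tau t = Y d^2 / \{6 (1 - \nu) [1 + (2 - D)\nu] R\}$, can be evaluated for the Ag on Fe whisker numbers quoted in the text. The following is only a sketch: the elastic constants of Fe along [100] (Y ≈ 131 GPa, ν ≈ 0.37) are assumed approximate literature values that do not appear in the text, the function name is hypothetical, and the elastic anisotropy is ignored as in the expression itself.

```python
def radius_of_curvature(film_force, young_modulus, poisson_ratio,
                        substrate_thickness, dimensionality=2.0):
    """Radius R (m) from tau*t = Y d^2 / {6 (1 - nu) [1 + (2 - D) nu] R}.

    film_force is the magnitude of tau*t in N/m; D = 2 is free two-dimensional
    bending, D < 2 accounts for the clamping of the substrate.
    """
    clamping = 1.0 + (2.0 - dimensionality) * poisson_ratio
    return (young_modulus * substrate_thickness**2
            / (6.0 * (1.0 - poisson_ratio) * clamping * film_force))


# Assumed elastic constants for Fe along [100] (not given in the text).
Y_FE = 131e9    # Pa
NU_FE = 0.37

stress = 0.6e9                 # Pa, magnitude of the Ag film stress
film_thickness = 1e-9          # m
substrate_thickness = 0.1e-3   # m
length = 10e-3                 # m

film_force = stress * film_thickness
radius_2d = radius_of_curvature(film_force, Y_FE, NU_FE, substrate_thickness)
radius_1d = radius_of_curvature(film_force, Y_FE, NU_FE, substrate_thickness,
                                dimensionality=1.0)
end_deflection = length**2 / (2.0 * radius_2d)   # from 1/R = 2*xi(L)/L**2

print(f"R (D = 2): {radius_2d:.0f} m")
print(f"R (D = 1): {radius_1d:.0f} m")
print(f"end deflection (D = 2): {end_deflection * 1e9:.0f} nm")
```

With these assumed constants the D = 2 radius comes out close to the 570 m quoted above; lowering D towards 1 reduces R by up to the factor 1 + ν, which shows how much an uncorrected analysis of a short, clamped substrate could misjudge the stress.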
4
Film Stress During Epitaxial Growth: Ag on Fe(100)
Thin single crystalline substrates are mandatory for cantilever stress measurements. For many semiconducting compounds thin samples are readily available from stock; for metals, however, thicker single crystalline samples have to be polished down until the requested thickness is reached. An alternative route towards thin crystals is given by the growth of needle-like Fe whiskers from the gas phase [22]. It is well known that high quality Fe crystals can be produced, with extended (100) terraces in the µm range, which are bounded by single atomic steps. These substrates often have a square-like cross section of the order of 100 µm × 100 µm, and grow in length up to 14 mm. An example of such a Fe crystal is shown in Fig. 4, and the adjacent needle indicates the small lateral dimensions of the Fe whisker. Thus, a rather thin single crystalline substrate with an exceedingly large aspect ratio of about 100 is obtained, which facilitates the quantitative stress analysis due to negligible clamping effects, as discussed above. The growth of Ag on Fe(100) whiskers has been studied by reflection high energy electron diffraction (RHEED) in great detail [23,24]. RHEED oscillations of the specular intensity have been observed during the deposition at 360 K above a thickness of five Ag layers, and layer-by-layer growth has been proposed in this thickness range. The layer-by-layer growth and the small epitaxial misfit make Ag on Fe(100) an ideal prototype for stress measurements, as the resulting film stress should be given by the lattice misfit, and no stress relaxation is expected for depositions up to several dozen layers. Indeed, our measurements indicate that the stress in films above 5 layers thickness is given by the misfit, whereas surface stress effects and a possible surface alloy formation dominate the stress in thinner Ag films [19]. Between fcc Ag ($a_{Ag}$ = 4.08 Å) and bcc Fe ($a_{Fe}$ = 2.866 Å), the epitaxial relation is given by 45° rotated surface unit cells of the two elements.
A moderate compressive in-plane lattice misfit $\eta = (a_{Fe} - a_{Ag}/\sqrt{2})/(a_{Ag}/\sqrt{2}) = -0.8\,\%$ results. This misfit induces a biaxial film stress of $\tau = \eta Y_{Ag}/(1 - \nu_{Ag}) = -0.61$ GPa. Our stress measurements as presented in Fig. 5 confirm this magnitude of stress for layer-by-layer growth conditions for $t_{Ag}$ > 5 ML. Figure 5 shows a plot of the Ag-induced stress of the Fe(100) substrate as a function of Ag thickness in monolayers (1 ML Ag: 2.04 Å). The measurement was performed with a two-beam optical deflection technique, and the stress is deduced from the measured curvature, as described above.
[Figure 4: photograph of the Fe whisker next to a needle; scale label 1 mm.]
Fig. 4. Fe whisker with a mirror-like shiny surface next to a needle
[Figure 5: plot of the film stress τ·t_Ag (N/m) versus Ag thickness t_Ag (ML, lower axis) and deposition time (s, upper axis) at T = 366 K and a rate of 0.2 ML s⁻¹. Regimes I, II and III and the points "Ag on" and "Ag off" are marked; the slope in regime III is labeled −0.6 GPa.]
Fig. 5. Stress during Ag deposition on Fe(100), as measured on a Fe whisker (Fig. 4) with a two-beam optical deflection technique (Fig. 2(d)). 1 ML Ag: 2.043 Å. The constant slope of the stress curve in regime III indicates a compressive stress of −0.6 GPa
A negative signal indicates compressive stress, i.e., a stress that favors an expansion of the film with respect to the substrate. Immediately after the beginning of Ag deposition, the stress drops rapidly and reaches a minimum of −1.7 N/m after deposition of one layer of Ag. Further deposition leads to a small tensile stress, and after 5 ML of deposition, the stress continues to drop, and it reaches a constant slope of −0.6 GPa. After the deposition of 35 ML Ag the Ag evaporator is closed, and the stress remains at −4.9 N/m. No significant stress relaxation is observed, and we performed additional tests to verify that a possible radiative heating of the substrate by the Ag evaporator has no impact on the curvature measurement. Three distinct stress regimes are identified in Fig. 5. Regime I (0–1 ML) is ascribed to the formation of the Ag-Fe interface. The measured stress change is mainly due to the relief of the tensile surface stress of Fe(100) upon coverage with Ag. The subsequent regime II (1–5 ML) is characterized by a much smaller stress change. We ascribe this to a rougher surface morphology in this thickness range, which is possibly caused by some surface alloy formation between Ag and Fe. This assumption is corroborated by the RHEED results, which indicate an absence of intensity oscillations in this thickness regime [24]. Regime III (> 5 ML) is characterized by a constant slope of the stress curve given by −0.6 GPa, which is caused by the epitaxial misfit between Ag and Fe, as discussed above. We performed many stress measurements during the deposition of Ag at temperatures between 300 K and 393 K and for deposition rates between 0.05 Å/s and 0.5 Å/s, and the three stress regimes were always observed. The back-extrapolation of the stress curve of all of our measurements from regime III to zero coverage leads to a stress value of −1.23 N/m.
We ascribe this to the difference of surface stress between Ag(100) and Fe(100), and we conclude that Fe is under a larger tensile surface stress as compared to Ag. The calculated surface stress of Ag(100) is +0.82 N/m [25], and we deduce 2.05 N/m as the surface stress for clean Fe(100). This example demonstrates that the high sensitivity of stress measurements can be exploited to investigate both adsorbate-induced changes of surface stress and film stress [8]. In addition, even subtle structural relaxations, which occur during epitaxial growth, give rise to stress oscillations with a monolayer period, as has been verified for semiconductor [26] and metallic systems [16]. In conclusion, stress measurements are sensitive tools to monitor structural relaxation or composition changes at surfaces and in films. An important ingredient of the stress analysis is a detailed knowledge of the atomic structure of the film-substrate composite. To this end we performed combined surface X-ray diffraction studies and curvature measurements to derive stress-strain relations at the atomic level.
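The arithmetic behind the misfit stress and the deduced Fe(100) surface stress in this section can be retraced with a short sketch. Two inputs are assumptions rather than values stated in the text: the Ag lattice constant is taken as 4.086 Å (twice the 2.043 Å monolayer spacing of the Fig. 5 caption), and the biaxial modulus Y/(1 − ν) of Ag(100) is taken as roughly 76 GPa. The variable names are illustrative only.

```python
import math

A_FE = 2.866                 # Å, bcc Fe lattice constant quoted in the text
A_AG = 2 * 2.043             # Å, assumed: twice the 1 ML Ag spacing (Fig. 5 caption)
BIAXIAL_MODULUS_AG = 76e9    # Pa, assumed value of Y/(1 - nu) for Ag(100)


def misfit(substrate_spacing, film_spacing):
    """Lattice misfit eta = (a_substrate - a_film) / a_film."""
    return (substrate_spacing - film_spacing) / film_spacing


# The 45 degree rotated Ag surface cell has an edge of a_Ag / sqrt(2).
eta = misfit(A_FE, A_AG / math.sqrt(2))
misfit_stress = eta * BIAXIAL_MODULUS_AG   # biaxial film stress in Pa

# Back-extrapolated stress at zero coverage = surface stress (Ag) - surface stress (Fe).
extrapolated = -1.23         # N/m, from the text
surface_stress_ag = 0.82     # N/m, calculated value cited in the text
surface_stress_fe = surface_stress_ag - extrapolated

print(f"misfit eta:        {eta * 100:.2f} %")
print(f"misfit stress:     {misfit_stress / 1e9:.2f} GPa")
print(f"Fe surface stress: {surface_stress_fe:.2f} N/m")
```

Under these assumptions the misfit and the misfit stress land on the values quoted in the section, and the Fe(100) surface stress follows directly from the sign convention that a negative extrapolated value means relief of tensile stress.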
5
Combined Stress and Surface X-ray Diffraction Measurements
Any interpretation of stress measurements relies on a knowledge of the geometric structure of both substrate and film. This statement might seem trivial, but we want to point out that a detailed structural investigation may reveal considerable adsorbate-induced distortions of the substrate, which can be neither anticipated nor excluded in general. An example is presented here, where the substantial distortion of W(110) upon Ni deposition has been detected by surface X-ray diffraction [27]. In-situ combination of stress measurements with structural investigations is extremely useful. We performed such combined stress measurements by the curvature technique with surface X-ray diffraction at the beamline ID03 of the European Synchrotron Radiation Facility (E.S.R.F.) in Grenoble, France [27,28,29,30]. The strength of combined stress and X-ray diffraction measurements is due to the high sensitivity of both techniques for subtle structural changes of the film-substrate composite. A revealing example is the Nishiyama-Wassermann growth of Ni on W(110). Earlier investigations suggested the following sequence of Ni structures in the first layer with increasing Ni coverage: Ni is bonded in pseudomorphic sites at small coverage, then first in a 1 × 8 and then in a c(1 × 7) coincidence structure [31]. These structural transitions are driven by the tendency of the Ni atoms to increase their atomic density on the W surface, and they are accompanied by a reduction of the strain along W[100] from +27 % for the pseudomorphic structure to −1.3 % for the c(1 × 7) coincidence structure. Previous low energy electron diffraction and scanning tunneling microscopy studies indicate a constant strain of +3.7 % along W[110] [31,32].
Our recent surface X-ray diffraction studies identified the formerly proposed 1 × 8 coincidence structure as a structurally modulated 1 × 1 structure, which gives rise to satellite reflections at positions which deviate slightly from those of the former model, and a modulation period of 7.7, and not 8 lattice units, along W[100] is found by surface X-ray diffraction [27]. The proposed c(1 × 7) coincidence structure was confirmed in the recent structural analysis [27].
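The strain values quoted above for the first Ni layer can be made plausible with a simple geometric estimate. The sketch below rests on several assumptions that are not stated in the text: bulk lattice constants of 3.165 Å for W and 3.524 Å for Ni, a Nishiyama-Wassermann orientation in which close-packed Ni rows run along the W row direction, and a c(1 × 7) coincidence of 9 Ni atoms on 7 W spacings (chosen because 9/7 is close to the quoted coverage of 1.3).

```python
import math

A_W = 3.165    # Å, assumed bulk lattice constant of bcc W
A_NI = 3.524   # Å, assumed bulk lattice constant of fcc Ni

ni_neighbor = A_NI / math.sqrt(2)   # Ni nearest-neighbor distance in Å


def strain(film_spacing, bulk_spacing):
    """Strain of the film relative to its bulk spacing."""
    return (film_spacing - bulk_spacing) / bulk_spacing


# Along the W rows: pseudomorphic Ni adopts the W spacing a_W.
strain_pseudomorphic = strain(A_W, ni_neighbor)

# Assumed c(1 x 7) coincidence: 9 Ni atoms share the length of 7 W spacings.
strain_coincidence = strain(7 * A_W / 9, ni_neighbor)
coverage = 9 / 7

# Perpendicular in-plane direction: W row distance a_W*sqrt(2) versus
# Ni row distance sqrt(3) * nearest-neighbor distance.
strain_perpendicular = strain(A_W * math.sqrt(2), ni_neighbor * math.sqrt(3))

print(f"pseudomorphic strain:     {strain_pseudomorphic * 100:+.1f} %")
print(f"c(1 x 7) strain:          {strain_coincidence * 100:+.1f} %")
print(f"perpendicular strain:     {strain_perpendicular * 100:+.1f} %")
print(f"coverage of 9:7 matching: {coverage:.2f}")
```

The estimate depends on the lattice constants chosen, so small differences from the quoted strains in the last digit are expected; the point is only that the large tensile strain of the pseudomorphic layer nearly vanishes once the Ni density rises to the coincidence value.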
The c(1 × 7) coincidence structure is characterized by a coverage of 1.3 with respect to the atomic density of the W(110) surface. Only after this structure has formed does additional deposition of Ni lead to the formation of a fcc(111)-like second layer, as described by the Nishiyama-Wassermann growth mode. The formation of the fcc(111)-like second Ni layer is accompanied by a characteristic change of the film stress from compressive (c(1 × 7) coincidence structure) to tensile for the second Ni layer [27]. This demonstrates that stress measurements give an accurate and highly sensitive indication of the formation of distinct surface structures. An example of our combined stress and surface X-ray diffraction work is shown in Fig. 6.
[Figure 6: panel (a) sketches the laser, the position sensitive detector and the X-ray detector around the Ni/W sample at E.S.R.F. beamline ID-3; panels (b) and (c) plot the X-ray intensity (1000 counts/s, left axis) and the curvature along W[001] (arb. units, right axis) versus time (s) for the modulated 1 × 1 structure and the 1 × 7 satellite, with "Ni on" and "Ni off" marked; panel (d) shows the structural model.]
Fig. 6. Combined curvature and X-ray diffraction experiments at the ESRF. (a) Sketch of the experimental layout. Crystal curvature and diffracted X-ray intensity are measured simultaneously during Ni growth on W(110). (b) X-ray intensity (left axis) and crystal curvature (right axis) during the deposition of the first layer of Ni. The formation and disappearance of a modulated 1 × 1 structure with increasing Ni coverage coincides with the change of curvature from compressive to tensile back to compressive. (c) The 1 × 7 structure evolves only after the disappearance of the modulated structure detected in (b). (d) Structural model of the 1 × 7 structure. Atomic chains A, B between Ni and W are formed. Areas of higher atomic density are separated by 1.1 nm along [001]
Figure 6 shows the curvature measured along W[001] during the deposition of Ni (right axis) and the measured X-ray intensity (left axis) for a satellite reflection characteristic of the modulated 1 × 1 structure in the top panel, and for the c(1 × 7) coincidence structure in the lower panel. The sequential occurrence of the two structures is revealed by the intensities of both reflections. First, the modulated 1 × 1 structure is formed, then its intensity drops down to almost zero with increasing deposition. While the intensity of the modulated 1 × 1 structure vanishes with ongoing deposition, the c(1 × 7) coincidence structure gains in intensity. The completion of the c(1 × 7) coincidence structure is identified by the vanishing slope of the curvature signal. Thus, the curvature data clearly identify the filling of the first layer, which is also characterized by the maximum diffracted intensity of the c(1 × 7) satellite. The deposition of more Ni leads to a positive slope of the curvature signal for a coverage above 1.3. The proper two-dimensional stress analysis reveals that the biaxial film stress changes from compressive to tensile once the second layer Ni grows in the fcc(111)-like structure [27]. The exact Ni coverage which corresponds to the filling of the first Ni layer is identified with high precision by both techniques. The resulting c(1 × 7) coincidence structure was subsequently analyzed by surface X-ray diffraction, and the resulting structural model is presented in Fig. 6d. The main structural characteristic is the formation of atomic chains between Ni and W, as indicated by the dark lines A and B. Note the substantial distortions of W atoms from their respective bulk positions [27]. The measured tensile stress of the second layer Ni amounts to 15 GPa, and this value can be quantitatively ascribed to an average isotropic film strain of +3.7 % [20]. We emphasize that, in contrast to the thicker Ni films, the stress in the first layer cannot be described by the average strain of the different structures [31]. Our X-ray investigations offer a possible cause for this complicated stress behavior of the first Ni layer, as we find that the Ni induces substantial distortions within the W(110) surface layer, with lateral displacements of the W atoms as large as 0.3 Å, see Fig. 6d [27].
This substantial Ni-induced distortion has escaped previous investigations and these structural details will be an important ingredient for future stress calculations. Finally, we discuss briefly the decisive role of even small strain in the sub-percent range for the modified magneto-elastic coupling and magnetic anisotropy of epitaxial thin films as compared to their bulk counterparts.
6
Magneto-elastic Coupling: Non-bulk-like Behavior in Epitaxial Films
The coupling between lattice strain and magnetic anisotropy energy is given by the magneto-elastic coupling. This effect can be understood as the strain dependent part of the magnetic anisotropy energy density. The magneto-elastic coupling accounts for the empirical observation that ferromagnetic samples change their length upon magnetization. For Fe this effect is quite small: the strain induced by the magnetization along [100] amounts to only 2 × 10⁻⁵, measured along [100], starting from a demagnetized state. This effect is called magnetostriction. Ferromagnetic films also have the tendency to strain upon magnetization changes. But they are bonded rigidly to a substrate, and magneto-elastic stresses (also: magnetostrictive stresses) are induced instead. The degree of resulting film-substrate deformation depends on the rigidity of the substrate. Figure 7 shows the difference between magnetostriction of a bulk sample and magnetostrictive stress in a film.
[Figure 7: (a) bulk sample with magnetization M and strain λ = Δl/l = 2 × 10⁻⁵; (b) film on a substrate for M ∥ [100] and M ∥ [010], with a magneto-elastic stress of the order of MPa.]
Fig. 7. (a) Magnetostriction of a bulk sample leads to a strain λ. (b) The same effect, namely the magneto-elastic coupling, leads to a stress in the film, which induces an anticlastic curvature of the substrate
The important aspect of magneto-elastic coupling is that the stress state of a ferromagnetic film can be changed by changing the orientation of the film magnetization. A measurement of the change of curvature upon reorientation of the magnetization reveals the magnitude of the effective magneto-elastic coupling. Therefore, highly sensitive stress measurements are appropriate to measure magneto-elastic coupling coefficients, provided that the sensitivity is sufficient to detect stress changes which are often three orders of magnitude smaller (MPa) than the epitaxial misfit stress (GPa). The orientation of the crystal axes, of the magnetization directions and of the curvature measurement determines which coupling coefficient is measured. The appropriate data evaluation is discussed in [7,18]. For example, switching the magnetization direction between the in-plane [100] and [010] directions of a cubic film and measuring the curvature along the [100] direction gives access to B1. Combined stress measurements during film growth and magneto-elastic stress measurements reveal the impact of film strain on the effective magneto-elastic coupling. According to the measurements, the effective magneto-elastic coupling varies with film thickness. Therefore, at first sight one might be tempted to ascribe the variation of Bi to a thickness effect. However, the stress measurements reveal that the film stress also varies with film thickness, and consequently the film strain is expected to vary with the film thickness [33]. It has been possible to produce films of equal thickness but different strain by slightly modifying the growth parameters during film deposition [34].
These measurements, and our own measurements of the magneto-elastic coupling coefficients, clearly indicate a strain-dependent correction of the effective magneto-elastic coupling coefficients. In Fig. 8a we present a plot of the stress as a function of Fe thickness for different experiments on Fe growth on W(001). The plots Fig. 8b–e show the effective magneto-elastic coupling Beff for Fe [7], Co [35], Ni [36] and Fe on Cu(001) [36]. For Fe on Cu a combination of B1 and B2 is measured [36]. The straight lines through the data points indicate a modification of the effective magneto-elastic coupling constants in proportion to film strain.
[Figure 8: panel (a) plots the film stress τ (GPa) and strain ε (%) versus Fe thickness t_Fe (nm) for Fe/W(001), with the residual values τ_res = 0.6 GPa and ε_res = 0.3 % marked for thick films; panels (b)–(e) plot the effective coupling Beff (MJ/m³) against film strain ε (%) or film stress (GPa) for Fe/W(001), Co/W(001), Ni/Cu(001) and Fe/Cu(001), with bulk values and film thicknesses indicated.]
Fig. 8. (a) Film stress for Fe on W(001). Note the considerable residual stress even in thicker films. (b)–(e) The effective magneto-elastic coupling constants Beff of various systems. The straight lines through the data points indicate a strain-dependent modification of Beff
The modified magneto-elastic coupling has an important impact on the magnetic anisotropy of strained films. Although magnetostriction of bulk Fe is a small effect, the underlying magneto-elastic coupling gives rise to an important contribution to the magnetic anisotropy of strained films. The reason is the larger magnitude of the coupling constants B (B1(Fe) = −3.44 MJ/m³) as compared to the crystalline anisotropy K (K1(Fe) = 0.05 MJ/m³). These values correspond to an energy per atom of the order of 0.3 meV for the former, and 4 µeV for the latter [7,37]. The magneto-elastic coupling coefficient B1 couples the strain ε to the energy density via a term ∼ B1 ε; thus even a strain in the percent range renders the magneto-elastic coupling a decisive contribution to the magnetic anisotropy. Therefore, measurements of the magneto-elastic coupling are mandatory to put the anisotropy discussion of strained films on a physically sound foundation. The most important result of all experimental determinations of the various Bi of different epitaxially strained systems is that magnitude and sign of Bi are not given by the respective bulk value [7]. Instead, the data analysis suggests that the effective Bi is given by a strain dependent correction of the bulk value. This important result was not considered in earlier anisotropy discussions. Recently, a strain dependent correction of the magneto-elastic coupling was also found in state-of-the-art calculations of the strain dependence of the magnetic anisotropy [38,39,40,41,42,43]. We conclude that non-bulk like magneto-elastic coupling is a general phenomenon in epitaxially strained films, and the application of bulk coupling constants to anisotropy problems will lead to erroneous results.
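The comparison of energy scales made in this section can be checked with a short conversion from energy density to energy per atom. The sketch assumes bcc Fe with two atoms per cubic cell of edge 2.866 Å (the lattice constant quoted earlier in the text) and assumes that the magneto-elastic energy density scales simply as B1 times the strain; the helper name is hypothetical.

```python
A_FE = 2.866e-10               # m, bcc Fe lattice constant quoted in the text
ATOMS_PER_CELL = 2             # bcc: two atoms per cubic unit cell
ELEMENTARY_CHARGE = 1.602176634e-19   # J per eV

B1_FE = -3.44e6   # J/m^3, bulk magneto-elastic coupling of Fe (from the text)
K1_FE = 0.05e6    # J/m^3, crystalline anisotropy of Fe (from the text)


def energy_per_atom_ev(energy_density):
    """Convert an energy density (J/m^3) into an energy per Fe atom (eV)."""
    atomic_volume = A_FE**3 / ATOMS_PER_CELL
    return energy_density * atomic_volume / ELEMENTARY_CHARGE


b1_per_atom = energy_per_atom_ev(B1_FE)
k1_per_atom = energy_per_atom_ev(K1_FE)

# Strain at which |B1 * strain| equals K1, assuming a simple B1 * strain term.
crossover_strain = K1_FE / abs(B1_FE)

print(f"|B1| per atom: {abs(b1_per_atom) * 1e3:.2f} meV")
print(f"K1 per atom:   {k1_per_atom * 1e6:.1f} µeV")
print(f"crossover strain: {crossover_strain * 100:.2f} %")
```

The per-atom values agree in order of magnitude with the 0.3 meV and 4 µeV quoted above, and the crossover strain of roughly one and a half percent illustrates why strains in the percent range make the magneto-elastic term comparable to or larger than the crystalline anisotropy.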
7
Conclusion and Outlook
Although stress measurements on thin films have a tradition which dates back to the 1850s, highly sensitive experiments reveal exciting new results, which cannot be ascribed to simple stress-strain relations. Stress in the (sub)monolayer coverage range is determined by the often strong adsorbate-substrate interaction, which can induce substantial distortions or intermixing at the interfaces. The electronic origins of the corresponding stress changes are still under debate and are a topic of current research [8,44]. New insight into the correlation between even subtle structural relaxations and forces acting in the film-substrate composite is accessible by combined stress and surface X-ray diffraction measurements. Here, ab initio based calculations can elucidate the relevant mechanisms on the mesoscale [45,46]. The effects of even relatively small strains in the sub-percent range on the magnetic anisotropy of thin films are profound. The implications for the application of nanometer thin films in sensors are significant. Recent first principles calculations convincingly support the experimental result of a strain-induced modification of the magneto-elastic coupling [42]. Further work on alloys should investigate the consequences for technologically relevant systems.
Acknowledgements
The authors thank the E.S.R.F. for support and assistance during the experiments performed at beamline ID-3. We thank Heike Menge from the MPI Halle for the skillful preparation of high quality thin crystalline substrates and Fe whiskers.
References 1. R.W. Hoffman, Phys. Thin Films 3, 211 (1966). 547 2. M. F. Doerner and W.D. Nix, CRC Crit. Rev. Solid State and Materials Science 14, 225 (1988). 547
560 3. 4. 5. 6. 7. 8. 9. 10.
11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23.
Grain Boundary Mechanics – New Approaches to Microstructure Control
Myrjam Winning
Institut für Metallkunde und Metallphysik, RWTH Aachen, 52056 Aachen, Germany

Abstract. The reaction of grain boundaries to mechanical stresses is reviewed. Results of in-situ experiments on planar, symmetrical tilt grain boundaries with different tilt axes (⟨112⟩, ⟨111⟩ and ⟨100⟩) under the influence of an external mechanical stress field are presented. It was found that the motion of planar grain boundaries can be induced by an imposed external stress irrespective of the angle of misorientation, i.e., irrespective of whether the grain boundary was a low or high angle grain boundary. The observed activation enthalpies of the stress induced grain boundary motion allow conclusions on the migration mechanism. The motion of planar low and high angle grain boundaries under the influence of a mechanical stress field can be attributed to the movement of the grain boundary dislocations which comprise the structure of the boundary. A sharp transition between low and high angle grain boundaries was observed for different tilt axes. The influence of mechanical stress on grain growth was also investigated. The fact that boundaries can also be moved by mechanical forces sheds new light on microstructure evolution during elevated temperature deformation.
1
Introduction
Defects play an essential role in microstructure evolution, in particular 2D defects, i.e., internal surfaces between similar and dissimilar phases. Interfaces between equal phases but different crystallographic orientations are referred to as grain boundaries. Grain boundaries have the unique property that they can react to exerted forces by a change of position. Grain boundary mechanics, i.e., the reaction and motion of grain boundaries under the influence of mechanical stresses, is the main subject of this paper.

To investigate the influence of grain boundary structure on the mobility, grain boundaries with a well known and constant structure are required. Most grain boundary migration experiments have been conducted on curved grain boundaries, utilizing curvature as the driving force. Since the structure changes continuously along a curved boundary, the influence of the structure on mobility cannot be determined exactly from such experiments. Planar grain boundaries provide a constant structure, but another driving force is then needed to activate boundary motion. It has long been known that planar low angle grain boundaries can be driven by a mechanical stress field [1]. More recently it was shown that even planar high angle grain boundaries in aluminium interact with a mechanical stress field and that the motion of the grain boundaries can be induced by a suitable stress [2, 3, 4]. The measured activation enthalpies for low angle grain boundaries were found to be similar to the self diffusion enthalpy, and those for high angle grain boundaries similar to the grain boundary diffusion enthalpy in aluminium. There is a distinct transition between low angle and high angle grain boundaries, which can be identified by a conspicuous step in the measured activation enthalpies. The aim of this paper is to review the reaction of individual boundaries to mechanical stress fields and the resulting effect on grain growth in polycrystalline material.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 563–576, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Experimental Procedure
For the experiments high purity aluminium bicrystals with a total impurity content of 7.7 ppm were used. The investigated grain boundaries were planar, symmetrical tilt boundaries with three different tilt axes (⟨112⟩, ⟨111⟩ and ⟨100⟩) and misorientation angles between 3° and 34°. The bicrystals were grown by a horizontal Bridgman method. The orientation of the crystallographic axes was measured by the Laue back-reflection method with an accuracy of ±1°. The misorientation of the grain boundary could be determined by the Laue technique and by scanning electron microscopy with electron backscatter diffraction (EBSD) with an accuracy of ±0.2°.

To measure the grain boundary motion continuously a special X-ray diffraction tracking device was used [5]. The specimen holder (see Fig. 1(a)) was designed to exert a shear stress parallel to the grain boundary normal vector. The method does not need to interrupt the grain boundary motion in order to locate the boundary and therefore does not interfere with the migration process. Because the two grains of the bicrystal have different orientations, they can be distinguished by X-ray diffraction, so that the position of the grain boundary could be determined with an absolute accuracy of approximately ±15 µm.

Figure 1(a) shows the experimental set-up for the in-situ measurements of the grain boundary motion. The force of the spring causes a displacement of the upper grip while the lower grip is fixed. This leads to a shear stress on the sample. The sample itself is situated on a hot stage, and the temperature is controlled during the experiment by a thermocouple.

In the following we assume that every symmetrical tilt boundary, irrespective of the angle of misorientation, can be described as an arrangement of a single set of edge dislocations, as shown schematically in Fig. 1(b). With the experimental set-up (Fig. 1(a)) a mechanical shear stress can be applied to the boundary. In this case the stress tensor is given by

\[
\sigma = \begin{pmatrix} 0 & \tau_{xy} & 0 \\ \tau_{xy} & \sigma_{yy} & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{1}
\]

[Figure 1: panels (a) and (b); labels: spring, sample, movable grip, fixed grip, grain I, grain II, force F, angle θ, axes x, y, z.]
Fig. 1. (a) Sample holder for in-situ experiments on planar grain boundaries under an external stress. (b) Illustration of the loading on the sample for a symmetrical tilt boundary. The force on the sample F leads to the stress tensor in Eq. (1)
The force on a dislocation can be calculated by the Peach-Koehler equation [6]. For a given stress tensor σ, Burgers vector b and line element s of the dislocation, the force per unit length on the dislocation is

\[
\mathbf{F} = (\boldsymbol{\sigma} \cdot \mathbf{b}) \times \mathbf{s}. \tag{2}
\]
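From Fig. 1(b) the following expressions for the Burgers vectors b1 and b2 and the dislocation line s in the coordinate system of the stress tensor can be found:

\[
\mathbf{b}_1 = \begin{pmatrix} -b \sin(\theta/2) \\ b \cos(\theta/2) \\ 0 \end{pmatrix}, \quad
\mathbf{b}_2 = \begin{pmatrix} b \sin(\theta/2) \\ b \cos(\theta/2) \\ 0 \end{pmatrix}, \quad \text{and} \quad
\mathbf{s} = \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix} \tag{3}
\]

From Eqs. (2) and (3) the forces on the dislocations read

\[
\mathbf{F}_{\mathrm{dis}\,1} = \begin{pmatrix} \tau_{xy} \, b \sin(\theta/2) - \sigma_{yy} \, b \cos(\theta/2) \\ \tau_{xy} \, b \cos(\theta/2) \\ 0 \end{pmatrix} \tag{4a}
\]

and

\[
\mathbf{F}_{\mathrm{dis}\,2} = \begin{pmatrix} -\tau_{xy} \, b \sin(\theta/2) - \sigma_{yy} \, b \cos(\theta/2) \\ \tau_{xy} \, b \cos(\theta/2) \\ 0 \end{pmatrix} \tag{4b}
\]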
The result is a motion of the dislocation in x- and y-direction, i.e., parallel and perpendicular to the boundary. However, since σyy ≈ 0 the force $F^{x}_{\mathrm{dis}}$ is small. The force normal to the boundary, $F^{y}_{\mathrm{dis}}$, causes a dislocation sliding; with τxy ≡ τ it reads

\[
F^{y}_{\mathrm{dis}} = \tau \cdot b \cdot \cos\frac{\theta}{2}. \tag{5}
\]
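As a cross-check of Eqs. (2) to (5), the Peach-Koehler force can be evaluated numerically. The following is a minimal sketch in plain Python, not part of the original analysis; the numerical values of τxy, θ and the Burgers vector magnitude b are illustrative assumptions (b ≈ 0.286 nm is an assumed value for aluminium), and the helper names are hypothetical.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def peach_koehler(sigma, burgers, line):
    """Force per unit length F = (sigma . b) x s, as in Eq. (2)."""
    return cross(mat_vec(sigma, burgers), line)

# Illustrative input values (assumptions, not measured data).
tau_xy = 0.01                # shear stress in MPa
sigma_yy = 0.0               # normal stress in MPa, taken as negligible
b = 0.286e-9                 # Burgers vector magnitude in m (assumed for Al)
theta = math.radians(12.9)   # misorientation angle

# Stress tensor of Eq. (1) and vectors of Eq. (3).
sigma = [[0.0, tau_xy, 0.0],
         [tau_xy, sigma_yy, 0.0],
         [0.0, 0.0, 0.0]]
b1 = [-b * math.sin(theta / 2), b * math.cos(theta / 2), 0.0]
b2 = [b * math.sin(theta / 2), b * math.cos(theta / 2), 0.0]
s = [0.0, 0.0, -1.0]

f1 = peach_koehler(sigma, b1, s)   # compare with Eq. (4a)
f2 = peach_koehler(sigma, b2, s)   # compare with Eq. (4b)
print("F_dis1 =", f1)
print("F_dis2 =", f2)
```

With σyy = 0 the y-components of both forces coincide and equal τ·b·cos(θ/2), while the x-components have opposite signs for the two Burgers vectors.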
The imposed shear stress τ exerts a force on the grain boundary and causes the grain boundary to move with the velocity v. From the measured velocity v and the driving force p we can calculate the grain boundary mobility m from

\[
v = m \cdot p \tag{6}
\]
The force per unit grain boundary area is given by the product of the force on a single dislocation per unit length and the dislocation line length per unit area in the grain boundary, $\rho_{\mathrm{dis}}$ [1]:

\[
p_{\mathrm{gb}} = F^{y}_{\mathrm{dis}} \cdot \rho_{\mathrm{dis}} = \tau \cdot \sin\theta \tag{7}
\]
where $\rho_{\mathrm{dis}}$ is the dislocation density in the boundary, which can be calculated easily for each tilt boundary. To keep the shear stress on the samples in the elastic range, the applied stress ranged from $10^{-3}$ MPa to $10^{-1}$ MPa, corresponding to driving forces on the boundary from $10^{-4}$ MPa to $10^{-2}$ MPa according to Eq. (7). The temperature dependence of the grain boundary mobility is given by

\[
\frac{v}{p} = m = \frac{\tilde{m}_0}{kT} \exp\left(-\frac{\Delta H}{kT}\right) = m_0 \exp\left(-\frac{\Delta H}{kT}\right) \tag{8}
\]

In the investigated temperature range (300 °C to 600 °C) the temperature dependence of the factor 1/T is weak compared with that of the exponential function and can be neglected, so that the factor 1/(kT) is absorbed into the pre-exponential factor $m_0$. The experimental set-up allows in-situ measurements of the grain boundary motion as a function of temperature and driving force. In addition, temperature and driving force can be varied without interrupting the grain boundary motion.
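The step from Eq. (5) to Eq. (7) can be illustrated numerically. The sketch below assumes the standard relation for a symmetrical tilt boundary, d = b/(2 sin(θ/2)) for the dislocation spacing, so that ρdis = 1/d; this relation is not written out in the text and is an assumption here, as are the function names. The stress and angle values are taken from the ranges quoted above.

```python
import math

def dislocation_density(theta, b):
    """Dislocation line length per unit boundary area, rho = 1/d.

    Assumes the spacing d = b / (2 sin(theta/2)) of a symmetrical
    tilt boundary (standard relation, not stated in the text).
    """
    return 2.0 * math.sin(theta / 2.0) / b

def driving_force(tau, theta, b=1.0):
    """Driving force p = F_y * rho with F_y = tau * b * cos(theta/2), Eq. (5)."""
    force_per_length = tau * b * math.cos(theta / 2.0)
    return force_per_length * dislocation_density(theta, b)

# Misorientation angles and stresses spanning the ranges quoted in the text.
for theta_deg in (3.0, 12.9, 34.0):
    theta = math.radians(theta_deg)
    for tau in (1e-3, 1e-1):   # applied shear stress in MPa
        p = driving_force(tau, theta)
        print(f"theta = {theta_deg:5.1f} deg, tau = {tau:.0e} MPa, p = {p:.2e} MPa")
```

The Burgers vector magnitude cancels in the product, which is why the driving force in Eq. (7) depends only on τ and θ.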
3
Results
Figure 2 shows some examples of time-distance diagrams for different tilt grain boundaries. In all cases, i.e., for all tilt axes and irrespective of whether the boundary is a low or high angle grain boundary, the motion of the grain boundary could be activated by the external shear stress, and the stress induced grain boundary motion is a steady-state motion. Therefore it is possible to determine the grain boundary velocity from the slope of the time-distance curve.
[Figure 2: grain boundary position (µm) versus time (s). Panel (a): curves labelled ϑ = 400 °C and ϑ = 450 °C. Panel (b): ⟨100⟩ 7.7° boundary at 850 K.]
Fig. 2. Time-distance diagrams for different tilt grain boundaries. (a) Time-distance diagrams for a 12.9° ⟨112⟩-tilt boundary at 673 K and 723 K. (b) Time-distance diagram for a 7.7° ⟨100⟩-tilt boundary at 850 K
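Determining the velocity from the slope of a time-distance curve amounts to a straight-line fit. The following minimal sketch uses an ordinary least-squares slope; the position readings are hypothetical numbers chosen for illustration, not data from Fig. 2, and the function name is an assumption.

```python
def least_squares_slope(times, positions):
    """Slope of the best-fit straight line through (time, position) pairs."""
    n = len(times)
    t_mean = sum(times) / n
    x_mean = sum(positions) / n
    numerator = sum((t - t_mean) * (x - x_mean)
                    for t, x in zip(times, positions))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator

# Hypothetical readings of the boundary position (illustrative only).
times = [0, 50, 100, 150, 200]              # s
positions = [0.0, 20.5, 39.8, 60.1, 80.0]   # µm

velocity = least_squares_slope(times, positions)   # µm/s
print(f"grain boundary velocity = {velocity:.4f} µm/s")
```

For a steady-state motion the residuals of such a fit stay small over the whole measurement, which is one way to check the steady-state assumption on real data.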
In Fig. 3 two examples of optical micrographs are given, for a ⟨111⟩ tilt grain boundary with a misorientation angle of 16° and for a ⟨100⟩ tilt grain boundary with a misorientation angle of 10.6°. In each micrograph one can see two grooves of the grain boundary. The first groove is the initial position of the boundary before the experiment and the second groove is the final position of the boundary after the experiment. The distance between the two grooves is the distance swept by the boundary during the experiment. Both micrographs show that the entire boundary moves in the experiment without a shape change. EBSD measurements before and after the experiments show that the misorientation also does not change during the stress induced motion of the boundaries.
[Figure 3: optical micrographs, panels (a) and (b); labels: initial position, final position, 600 µm, 300 µm, 150 µm.]
Fig. 3. (a) Optical micrograph of a 16° ⟨111⟩-tilt boundary after the in-situ experiment. The boundary moved nearly 600 µm in this experiment. (b) Optical micrograph of a 10.6° ⟨100⟩-tilt boundary after the experiment. This boundary moved about 300 µm in the in-situ experiment
To test Eq. (7), the dependence of the grain boundary velocity on the external stress and on the misorientation angle was determined (Fig. 4 and Fig. 5). For all investigated grain boundaries the velocity depends linearly on the shear stress (Fig. 4) and on sin θ (Fig. 5).
[Figure 4: grain boundary velocity (µm/s) versus external shear stress (MPa) for ⟨112⟩ 12.9° at 658 K, ⟨100⟩ 10.6° at 735 K and ⟨111⟩ 12.0° at 754 K.]
Fig. 4. Dependence of the grain boundary velocity on the external shear stress for different tilt boundaries
[Figure 5: grain boundary velocity (µm/s) versus sin θ for ⟨111⟩ at 703 K and 0.0057 MPa, ⟨112⟩ at 703 K and 0.0121 MPa, and ⟨100⟩ at 700 K and 0.0143 MPa.]
Fig. 5. Dependence of the grain boundary velocity on misorientation angle for different tilt boundaries
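The linear relations in Figs. 4 and 5 correspond to Eq. (6) with the driving force of Eq. (7), so a mobility can be estimated from a proportional fit v = m·p through the origin. The sketch below illustrates this under stated assumptions: the stress and velocity values are hypothetical, and the least-squares estimator through the origin is one possible choice, not necessarily the procedure used for the measurements.

```python
import math

def mobility_through_origin(driving_forces, velocities):
    """Least-squares estimate of m in v = m * p (line through the origin)."""
    numerator = sum(p * v for p, v in zip(driving_forces, velocities))
    denominator = sum(p * p for p in driving_forces)
    return numerator / denominator

# Hypothetical measurements for one boundary (illustrative only).
theta = math.radians(12.9)
stresses = [0.005, 0.010, 0.020, 0.030]     # applied shear stress, MPa
velocities = [0.011, 0.023, 0.044, 0.068]   # boundary velocity, µm/s

# Driving force from Eq. (7): p = tau * sin(theta).
driving_forces = [tau * math.sin(theta) for tau in stresses]

m = mobility_through_origin(driving_forces, velocities)   # µm/(s MPa)
print(f"mobility m = {m:.2f} µm/(s MPa)")
```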
3.1
Motion of Planar Symmetrical Low Angle Tilt Boundaries
Figure 6 shows the Arrhenius plot of the mobility of planar, symmetrical low angle tilt boundaries with different tilt axes. Apparently the stress driven grain boundary motion is a thermally activated process.
[Figure 6: ln(v/p [µm/(s MPa)]) versus reciprocal temperature (1/K), with a secondary temperature axis (K), for ⟨100⟩ 5.7°, ⟨112⟩ 12.9° and ⟨111⟩ 10.7°. Fit labels: ΔH = 1.29 eV, ln m0 = 23.99; ΔH = 1.28 eV, ln m0 = 23.07; ΔH = 1.19 eV, ln m0 = 19.32.]
Fig. 6. Arrhenius plot of the mobility of planar symmetrical low angle tilt grain boundaries with different tilt axes and 7.7 ppm impurity content
According to Eq. (8),

\[
\ln\frac{v}{p} = \ln m_0 - \frac{\Delta H}{kT}, \tag{9}
\]

so that we can determine the activation enthalpy for grain boundary motion ΔH from the slope of the Arrhenius line and the pre-exponential factor m0 from the extrapolated intersection with the mobility axis. Evidently, it is possible to activate the motion of planar low angle grain boundaries. The activation enthalpies for the motion of the low angle boundaries are similar for the different tilt axes: ΔH = 1.19 eV for the symmetrical ⟨100⟩-tilt boundary (misorientation angle 5.7°), ΔH = 1.28 eV for the ⟨112⟩-tilt boundary (misorientation angle 12.9°) and ΔH = 1.29 eV for the ⟨111⟩-tilt boundary (misorientation angle 10.7°). The pre-exponential factors for the motion of the different low angle grain boundaries are also similar, namely ln m0 = 19.32 for the ⟨100⟩-boundary, ln m0 = 23.07 for the ⟨112⟩-boundary and ln m0 = 23.99 for the ⟨111⟩-tilt boundary.
3.2
Motion of Planar Symmetrical High Angle Tilt Boundaries
The Arrhenius lines of the mobility of different planar high angle grain boundaries are shown in Fig. 7. Even for the planar high angle grain boundaries the migration of the planar boundary can be induced by the external shear stress, and nearly the same activation enthalpies are measured, namely ΔH = 0.74 eV for the ⟨100⟩-tilt boundary (misorientation angle 33.9°), ΔH = 0.81 eV for the ⟨112⟩-tilt boundary (misorientation angle 31.3°) and ΔH = 0.74 eV for the ⟨111⟩-boundary (misorientation angle 32.7°). The pre-exponential factors for the different tilt axes were also similar: ln m0 = 13.63 for the ⟨100⟩-tilt boundary, ln m0 = 16.47 for the ⟨112⟩-boundary and ln m0 = 14.26 for the ⟨111⟩-boundary.
[Figure 7: ln(v/p [µm/(s MPa)]) versus reciprocal temperature (1/K), with a secondary temperature axis (K), for ⟨100⟩ 33.9°, ⟨111⟩ 32.7° and ⟨112⟩ 31.3°. Fit labels: ΔH = 0.81 eV, ln m0 = 16.47; ΔH = 0.74 eV, ln m0 = 14.26; ΔH = 0.74 eV, ln m0 = 13.63.]
Fig. 7. Arrhenius plot of the mobility of planar symmetrical high angle tilt grain boundaries with different tilt axes and 7.7 ppm impurity content
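The Arrhenius analysis of Eq. (9) used for Figs. 6 and 7 is a straight-line fit of ln(v/p) against 1/T. The sketch below illustrates the procedure on synthetic data generated from the values quoted in the text for the ⟨112⟩ 12.9° boundary (ΔH = 1.28 eV, ln m0 = 23.07); the temperature grid and the function names are assumptions, and no measured data points are used.

```python
import math

K_B = 8.617333e-5   # Boltzmann constant in eV/K

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope, y_mean - slope * x_mean

def arrhenius_parameters(temperatures, mobilities):
    """Return (activation enthalpy in eV, ln m0) from Eq. (9)."""
    inverse_t = [1.0 / t for t in temperatures]
    ln_m = [math.log(m) for m in mobilities]
    slope, intercept = fit_line(inverse_t, ln_m)
    return -slope * K_B, intercept

# Synthetic mobilities built from the quoted fit values (not measured points).
delta_h_true, ln_m0_true = 1.28, 23.07
temperatures = [650.0, 700.0, 750.0, 800.0, 850.0]   # K, assumed grid
mobilities = [math.exp(ln_m0_true - delta_h_true / (K_B * t))
              for t in temperatures]

delta_h, ln_m0 = arrhenius_parameters(temperatures, mobilities)
print(f"Delta H = {delta_h:.3f} eV, ln m0 = {ln_m0:.2f}")
```

On real data the scatter of the points around the fitted line gives the uncertainty of ΔH and ln m0.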
3.3
Activation Parameters for the Motion of Planar Boundaries
Figure 8 shows the dependence of the activation enthalpies on the misorientation angle for all measured symmetrical ⟨100⟩-, ⟨112⟩- and ⟨111⟩-tilt boundaries. For all boundaries there is a distinct subdivision into two ranges. For the ⟨112⟩- and ⟨111⟩-tilt boundaries with angles smaller than 13.6° an average activation enthalpy ΔHm = 1.28 eV ± 0.01 eV was found, while for angles larger than 13.6° the average activation enthalpy was ΔHm = 0.84 eV ± 0.01 eV. For the ⟨100⟩-tilt boundaries with angles smaller than 8.6° an average activation enthalpy ΔHm = 1.18 eV ± 0.034 eV was found, while for angles larger than 8.6° an average activation enthalpy ΔHm = 0.73 eV ± 0.029 eV was measured. The transition between these two ranges is marked for all tilt axes by a conspicuous step in the activation enthalpy without any extended transition regime. The observed step in the activation enthalpy for grain boundary motion must be attributed to the transition between low angle and high angle grain boundaries. The corresponding transition occurs at an angle of 13.6° ± 0.55° for ⟨112⟩- and ⟨111⟩-tilt boundaries, while for ⟨100⟩-tilt boundaries the transition angle is 8.6° ± 0.15°. Within experimental accuracy there is no noticeable transition range between the levels for low angle boundaries and high angle boundaries.
3.4
Influence of Shear Stresses on Grain Growth in Polycrystals
To investigate the influence of an external shear stress on grain growth in polycrystals, high purity aluminium polycrystals were annealed under different shear stresses, and the kinetics of grain growth was determined by measuring the grain size at different annealing times and several temperatures. The experiments were carried out on three samples for each combination of temperature, stress state and annealing time. The average grain size was determined for each sample by measuring the grain size of at least 200 grains (for large grain sizes) and 400 grains (for small grain sizes). The average grain size is shown as a function of annealing time for four different stress states in Fig. 9 for 300 °C and in Fig. 10 for 350 °C. The deviation from the given grain size is 4.5% on average. For both temperatures the largest grain size was found for the largest shear stress (4.39 MPa), which was definitely in the range of plastic deformation at these temperatures. For the same annealing time the average grain size decreased with increasing stress as long as the stress was in the elastic range. Small shear stresses thus hindered grain growth compared to grain growth without additional mechanical stress. Only for the largest stress, which caused plastic deformation of the sample, were the kinetics of grain growth accelerated by the additional stress compared to grain growth without mechanical stress.

[Figure 8: activation enthalpy (eV, with a secondary axis in kJ/mol) versus misorientation angle (°) for ⟨112⟩-, ⟨111⟩- and ⟨100⟩-tilt boundaries; the transition angles 13.6° and 8.6° are marked. Legend entries refer to [1], [2] and [4].]
Fig. 8. Dependence of the activation enthalpy on misorientation for all investigated tilt axes

[Figure 9: average grain size (µm) versus annealing time (min) at an annealing temperature of 300 °C; curves: without stress, 0.035 MPa, 0.52 MPa, 4.39 MPa.]
Fig. 9. Average grain size in dependence on the annealing time for different external mechanical stresses and 300 °C
[Figure 10: average grain size (µm) versus annealing time (min) at an annealing temperature of 350 °C; curves: without stress, 0.035 MPa, 0.52 MPa, 4.39 MPa.]
Fig. 10. Average grain size in dependence on the annealing time for different external mechanical stresses and 350 °C
4
Discussion
The optical micrographs in Fig. 3 show that it is possible to activate the motion of planar grain boundaries by applying a mechanical stress field. Moreover, the results show that planar grain boundaries can be driven by a mechanical stress field irrespective of the angle of misorientation, i.e., irrespective of whether the boundary is a high or low angle grain boundary. During this stress induced motion the position of the grain boundary could be located at any time, so that the grain boundary motion could be measured in-situ. From the time-distance diagrams (Fig. 2) it can be seen that the stress induced grain boundary motion is a steady-state motion; therefore, the velocity can be determined from the time-distance diagrams. The dependence of the grain boundary velocity on the external stress (Fig. 4) and on the misorientation angle (Fig. 5) supports the expression for the driving force, Eq. (7).

The measured activation enthalpies for low angle grain boundaries are comparable to the volume self diffusion enthalpy ΔH_SD = 1.25 eV [7] in aluminium, and for the high angle grain boundaries the activation enthalpies are comparable to the grain boundary diffusion enthalpy ΔH_GB = 0.6−0.9 eV [8]. Therefore, the motion of planar grain boundaries under the influence of mechanical stresses should be correlated with the movement of the dislocations in the grain boundary [2,3,4]. In fcc crystals the reaction of edge dislocations to an applied shear stress ought to be purely mechanical and not thermally activated. The observed grain boundary motion, however, is a thermally activated process controlled by diffusion. The reason is that real boundaries are never perfect symmetrical tilt boundaries but always contain structural dislocations with other Burgers vectors. These dislocations have to be displaced by nonconservative motion (climb) to make the entire boundary migrate. The climb process requires diffusion, which according to the observed activation enthalpies is volume diffusion for low angle grain boundaries but grain boundary diffusion for high angle grain boundaries. Since the climb process is much slower than the glide process, it determines the velocity of boundary motion. Therefore, the measured activation enthalpies of grain boundary motion must be associated with the activation enthalpies of the climb processes.

The results show that low angle as well as high angle grain boundaries can be driven by an externally imposed mechanical stress, irrespective of the misorientation angle. The motion of any grain boundary is due to the coupling of an external stress with the strain field of the boundary, expressed in terms of the movement of the dislocations which compose the grain boundary.

Another major result is the existence of a conspicuous step in the activation enthalpy for the grain boundary motion at about 13.6° for the ⟨111⟩- and ⟨112⟩-tilt boundaries and at 8.6° for the ⟨100⟩-tilt boundaries. This step in the activation enthalpy reflects the transition from the low angle to the high angle grain boundary regime (Fig. 8). In this view the transition in the migration behaviour between low and high angle tilt grain boundaries merely changes the diffusion path of the vacancies necessary for the climb of the structural dislocations, namely from volume self diffusion to grain boundary diffusion. The elastic coupling of the dislocations with the applied shear stress remains virtually unchanged. As a consequence the transition angle depends on the tilt axis.

The transition from low to high angle grain boundaries plays an important role especially during primary recrystallization, because recrystallization proceeds by the motion of high angle grain boundaries. The transition angle also enters the Brandon criterion [9], which defines the angular deviation of special boundaries from the exact coincidence orientation relationship by

\[
\Delta\theta = \frac{\theta_{\mathrm{trans}}}{\sqrt{\Sigma}} \tag{10}
\]
where Δθ is this angular deviation, θtrans the transition angle from low to high angle grain boundary (normally one assumes 15° for this transition angle) and Σ is the reciprocal density of coincidence sites in the grain boundary. With Eq. (10) we get for the angular deviation of a Σ13 ⟨100⟩ tilt boundary Δθ(Σ13/⟨100⟩) = 2.38°, while for a Σ13 ⟨111⟩ boundary the angular deviation is Δθ(Σ13/⟨111⟩) = 3.77°. This means a difference of about 37% in the angular deviations of special boundaries from the exact coincidence orientation relationship between ⟨100⟩ and ⟨111⟩ tilt boundaries.

In Fig. 11 a model is shown for the dislocation structure in a ⟨100⟩-tilt boundary. The ⟨100⟩-edge dislocation with the Burgers vector b = a⟨100⟩ (a is the lattice parameter) can be split into a pair of lattice dislocations with the Burgers vectors b1 = a/2⟨110⟩ and b2 = a/2⟨1-10⟩. This model is based on transmission electron microscopy investigations in gold [10] and is supported by the experiments of Viswanathan and Bauer [11]. We now assume that the dislocation density in the grain boundary at the transition angle is the same for all tilt axes. This assumption seems reasonable because the change in the diffusion path from volume to grain boundary diffusion depends on the dislocation density in the grain boundary. At the transition angle the dislocation density in the grain boundary reaches a critical value and causes the change in the diffusion path. The misorientation angle of ⟨100⟩ tilt boundaries is given by the spacing of the ⟨100⟩-edge dislocations. But the real density of dislocations in the grain boundary is higher due to the dissociation of the ⟨100⟩ dislocations into pairs of ⟨110⟩ edge dislocations (Fig. 11). Therefore, the critical dislocation density in ⟨100⟩ tilt boundaries is reached at a lower misorientation angle, and the transition angle is lower than for the ⟨112⟩- and ⟨111⟩-tilt boundaries, as was observed in the experiments.

First experiments on grain growth showed an influence of the external shear stress on the kinetics of grain growth. The acceleration of the grain growth kinetics by a large stress is in good agreement with the literature [e.g. 12], where the same behaviour was observed in polycrystalline thin films under the influence of pressure. Applied shear stresses in the elastic range hinder grain growth. This is a new experimental observation and may be due to the interaction of grain boundaries with the external shear stress. One possible explanation is that the additional driving force on the grain boundaries can act either in the same direction as the capillary force due to the grain boundary curvature or in the opposite direction. In the second case small grains would shrink more slowly than without the additional force. To clarify the observed phenomenon it will be necessary to determine the kinetics of grain growth as a function of external mechanical stress in more detail, coupled with investigations of grain morphology and texture evolution during grain growth.
[Figure 11: schematic with the vectors b, b1 and b2 and the direction labels 100, 010 and 110.]
Fig. 11. Model for the dislocation structure of ⟨100⟩-tilt grain boundaries. The ⟨100⟩-edge dislocation can be described as a closely spaced pair of ⟨110⟩-edge dislocations

It is thus possible to change the grain growth kinetics, and with it the microstructure evolution, by an external mechanical stress field. The experimental results show that grain boundaries can interact directly with mechanical stresses and that mechanical stress fields can completely change the behaviour of grain boundaries. This means that the microstructure evolution and the properties of materials are influenced not only by the dynamic properties of grain boundaries but also by grain boundary mechanics, i.e., the interactions between grain boundaries and mechanical stresses. All processes which are based on the migration of grain boundaries, e.g., recrystallization or grain growth, should be affected by these interactions, and therefore it should be possible to influence the kinetics and also the microstructure evolution during these processes by applying a mechanical stress field. Another effect of these interactions can be expected for high temperature plasticity, e.g., during cyclic deformation at high temperatures. Especially for polycrystals with very small grain sizes (nanocrystalline materials) the effect of mechanical stresses on the grain boundaries, and therefore on the behaviour of the material, should be very large due to the large volume fraction of grain boundaries in the material.
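The arithmetic behind the Brandon criterion, Eq. (10), discussed earlier in this section can be reproduced in a few lines. The sketch below uses the transition angles reported in the text (8.6° and 13.6°) and Σ = 13; the function name is a hypothetical helper, and the relative difference is computed with respect to the larger deviation, which is one possible convention.

```python
import math

def brandon_deviation(theta_trans_deg, sigma):
    """Angular deviation from Eq. (10): theta_trans / sqrt(Sigma), in degrees."""
    return theta_trans_deg / math.sqrt(sigma)

# Transition angles reported in the text for the two groups of tilt axes.
dev_100 = brandon_deviation(8.6, 13)    # Sigma13, <100> tilt axis
dev_111 = brandon_deviation(13.6, 13)   # Sigma13, <111> tilt axis

# Relative difference with respect to the larger deviation (assumed convention).
relative_difference = 1.0 - dev_100 / dev_111

print(f"<100>: {dev_100:.3f} deg, <111>: {dev_111:.3f} deg")
print(f"relative difference: {100 * relative_difference:.1f} %")
```

Because Σ is the same for both boundaries, the relative difference depends only on the ratio of the two transition angles.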
5
Summary
A method was presented which allows the activation of the motion of planar, symmetrical tilt boundaries with different tilt axes in aluminium under the influence of a mechanical stress field. The grain boundary motion is due to the movement of the structural dislocations in the boundary. Two different activation enthalpies were found: for low angle grain boundaries an activation enthalpy close to the self diffusion enthalpy and for high angle grain boundaries an activation enthalpy close to the grain boundary diffusion enthalpy. There is a sharp transition between the low and the high angle grain boundaries which is marked by a conspicuous step in the activation enthalpy. The transition angle is 13.6° ± 0.55° for the ⟨112⟩- and the ⟨111⟩-tilt boundaries, and 8.6° ± 0.15° for tilt boundaries with a ⟨100⟩-tilt axis. Also an influence of mechanical stresses on grain growth could be observed. For small shear stresses the kinetics of the grain growth is hindered compared with the grain growth without additional mechanical force. For large shear stresses the grain growth is accelerated due to plastic deformation.

Acknowledgement
The author would like to express her gratitude to the Deutsche Forschungsgemeinschaft for financial support through contract Wi 1917/1.
References
1. C.H. Li, E.H. Edwards, J. Washburn, and E.R. Parker, Acta Metall. 1, 223 (1953).
2. M. Winning, G. Gottstein, and L.S. Shvindlerman, Acta Mater. 49, 211 (2001).
3. M. Winning, G. Gottstein, and L.S. Shvindlerman, Acta Mater. 50, 353 (2002).
4. M. Winning, Acta Mater., submitted February 2003.
5. U. Czubayko, D.A. Molodov, B.-C. Petersen, G. Gottstein, and L.S. Shvindlerman, Meas. Sci. Technol. 6, 947 (1995).
6. M. Peach and J.S. Koehler, Phys. Rev. 80, 436 (1950).
7. J. Philibert, Diffusion et Transport de Matière dans les Solides (Les Éditions de Physique, Les Ulis Cedex, France 1985).
8. T.E. Volin, K.H. Lie, and R.W. Balluffi, Acta Metall. 19, 263 (1971).
9. D.G. Brandon, Acta Metall. 14, 1479 (1966).
10. T. Schober and R.W. Balluffi, Phys. Stat. Sol. (b) 44, 103 (1971).
11. R. Viswanathan and C.L. Bauer, Acta Metall. 21, 1099 (1973).
12. C.A. Volkert and C. Lingk, Appl. Phys. Lett. 73, 3677 (1998).
Models for Food Webs
Barbara Drossel
Darmstadt University of Technology, Institute of Solid State Physics, Hochschulstr. 6, D-64289 Darmstadt, Germany
Abstract. Food webs are complex systems of many interacting biological species. After summarizing some of their properties, this contribution presents different ways of modeling food webs. It briefly mentions static models and discusses dynamical models and their population dynamics equations. The complexity-stability debate is mentioned, and evolutionary food web models are presented as a natural way of obtaining large stable webs. Computer simulation results for a particular evolutionary model are shown to reproduce several important features of real food webs, including the prevalence of weak links.
Among the most complex systems on earth are ecosystems, which consist of many biological species that interact in many different ways, including predation, symbiosis and competition. These systems are highly nonlinear and very far from thermodynamic equilibrium, and their analysis and modeling present a big challenge to modern-day science. A much discussed topic is the question of how these large complex systems can be stable. Because of the huge complexity, research typically focuses either on the interaction of only a few species, or on only one or two important types of interaction among many species. The latter approach views ecosystems as food webs, where among all the possible interactions only feeding relationships and competition for food are considered. Figure 1 shows such a food web. The arrows point from a resource to its consumer. In the following section, we will summarize some important characteristics of food webs. Section 2 will briefly introduce static models that try to reproduce the overall structure of real food webs. Section 3 then presents and discusses the population dynamics equations used in the literature and mentions the complexity-stability debate. Section 4 finally presents evolutionary models and their capability to generate large complex stable webs. This article is based on previous publications and drafts, which were written in collaboration with Alan McKane, Paul Higgs, Chris Quince and Stefan Scheu [1,2,3,4].
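A food web of this kind can be stored as a directed graph whose links point from a resource to its consumer. The sketch below is a minimal illustration in Python; the species names and links are hypothetical and are not taken from Fig. 1. It sorts the species into those without prey, those without predators, and those with both, and computes the number of links per species.

```python
# Each link points from a resource to its consumer (hypothetical example web).
links = [
    ("algae", "zooplankton"),
    ("algae", "clam"),
    ("zooplankton", "small fish"),
    ("clam", "crab"),
    ("small fish", "large fish"),
    ("crab", "large fish"),
]

def classify_species(links):
    """Split species into those without prey, with both, and without predators."""
    species = {name for link in links for name in link}
    has_prey = {consumer for _, consumer in links}
    has_predator = {resource for resource, _ in links}
    basal = species - has_prey          # species that eat nothing in the web
    top = species - has_predator        # species that nothing in the web eats
    intermediate = species - basal - top
    return basal, intermediate, top

basal, intermediate, top = classify_species(links)
links_per_species = len(links) / len(basal | intermediate | top)

print("no prey:      ", sorted(basal))
print("both:         ", sorted(intermediate))
print("no predators: ", sorted(top))
print("links per species:", links_per_species)
```

A static link list like this ignores link strengths and changes over time, so it is only a starting point for the dynamical models discussed later.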
1
Characteristics of Food Webs
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 579–588, 2003. © Springer-Verlag Berlin Heidelberg 2003
A food web is always incomplete. A set of species living in a particular habitat, such as a lake or a bay or a small island, is chosen, and their feeding relationships are investigated. Not all species can be taken into account (for instance
Barbara Drossel
Fig. 1. Narragansett Bay food web. 1=flagellates, diatoms; 2=particulate detritus; 3=macroalgae, eelgrass; 4=Acartia, other copepods; 5=sponges, clams; 6=benthic macrofauna; 7=ctenophores; 8=meroplankton, fish larvae; 9=pacific menhaden; 10=bivalves; 11=crabs, lobsters; 12=butterfish; 13=striped bass, bluefish, mackerel; 14=demersal species; 15=starfish; 16=flounder; 17=man. (After Yodzis 1989. Introduction to theoretical ecology, Harper and Row, New York)
bacteria are difficult to detect and extremely numerous), and species are often grouped together to form trophic species, which share the same predators and prey. Since the observation time is finite, very weak links are not discovered. Since real webs fluctuate with time, for instance because predators take different prey at different life stages or at different times of the year, the representation of the food web as a graph with static links is a simplification. Food webs contain basal species that have no prey, top species that have no predator, and intermediate species that have both predators and prey. The percentage of basal and top species is typically smaller than 20 percent, and this percentage decreases with increasing web size and more detailed observation. The length of food chains leading from a basal to a top species is typically short, implying that there are only a few trophic levels in a food web, with 4 or 5 levels being common. The number of levels can be expected to depend logarithmically on the web size, since the ecological efficiency with which prey biomass is turned into predator biomass is far below 1 (a typical value used in the literature is 0.1). This means that the biomass decreases by a factor of 10 from one level to the next. Loops and omnivory appear to be rare in smaller webs, but are observed more often in larger webs. An omnivore is a species that feeds on several trophic levels and, for instance, also eats the prey of its prey. The typical number of links per species appears to be around two, but larger webs again deviate from this simple rule. To summarize, while some general statements about the properties of food webs can be made, individual webs, and in particular large webs, may deviate
Models for Food Webs
from these simple rules. There are essentially two challenges for the modeling of food webs: The first is to reproduce the above-mentioned features of food webs in a model, and the second is to take into account population dynamics and study under which conditions large and stable webs can exist.
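The logarithmic dependence of the number of trophic levels on web size mentioned in Section 1 can be illustrated by a short estimate. This is only a sketch: it assumes, beyond what the text states, that a trophic level persists only if its biomass exceeds some minimum $B_{\min}$, and that the web size grows with the basal biomass $B_0$. With $\lambda$ the ecological efficiency, the biomass at level $n$ above the basal level is

\[ B_n = \lambda^n B_0 \ge B_{\min} \quad\Longrightarrow\quad n \le \frac{\ln(B_0/B_{\min})}{\ln(1/\lambda)} . \]

For $\lambda = 0.1$, each additional level requires a tenfold increase of $B_0/B_{\min}$; a ratio of $10^4$, for example, allows at most four levels above the basal one.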
2
Static Models
There is a class of models that focus on the first challenge and try to construct food webs by drawing arrows (“links”) between dots (“species”) according to some simple rule. Clearly, a completely random model that picks the starting and end point of each arrow at random does not share many features with real webs. Models that place species on some ranking scale and then draw links according to a rule that depends on the rank fare much better. A recent model of this kind is the so-called niche model by Williams and Martinez [5], which assigns to each species a “niche value” by randomly drawing a number from the interval [0, 1]. The species are constrained to consume all prey within a range of values whose randomly chosen center is less than the consumer’s niche value. The size of the range is chosen according to a beta distribution with parameters such that the desired mean number of links per species results. Species with similar niche values often share consumers, and up to half of the consumer’s range may include species with niche values higher than the consumer’s value. There are only two empirical parameters: the number of species and the linkage density (or the connectance). Evaluating 12 different structural properties of the webs generated by the niche model and comparing them to real food web data, the authors found that the agreement between the model webs and real webs is in general good. In spite of the apparent success at reproducing properties of real food webs for appropriately chosen parameter values, this and other static models cannot give a real explanation of the observed web structures. The webs constructed by these models do not result from a dynamical process; links are not assigned according to some biologically inspired rule, and the models do not contain any population dynamics. A good agreement with real data is achieved by capturing some structural features of real webs, but not by incorporating underlying biological properties.
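The construction rule of the niche model can be sketched in a few lines. The details that the text leaves open are filled in here as assumptions, following our recollection of the original niche model: the range is the niche value times a Beta(1, β) variate with β = 1/(2C) − 1, where C is the target connectance, and the center is drawn uniformly between half the range and the niche value. The function name `niche_model` and its return layout are hypothetical.

```python
import random
from typing import List, Tuple


def niche_model(S: int, C: float, rng: random.Random
                ) -> Tuple[List[float], List[float], List[List[bool]]]:
    """Build one niche-model web with S species and target connectance C < 0.5.

    Returns (niche values, range centers, eats), where eats[i][j] is True
    if species i consumes species j.
    """
    # Assumed parametrisation: range = n_i * x with x ~ Beta(1, beta),
    # so that the mean range is 2 * C * n_i and the mean connectance is about C.
    beta = 1.0 / (2.0 * C) - 1.0
    niche = sorted(rng.random() for _ in range(S))
    centers: List[float] = []
    eats = [[False] * S for _ in range(S)]
    for i in range(S):
        width = niche[i] * rng.betavariate(1.0, beta)
        # The center lies below the consumer's niche value, so at most half
        # of the range can extend above it.
        center = rng.uniform(width / 2.0, niche[i])
        centers.append(center)
        low, high = center - width / 2.0, center + width / 2.0
        for j in range(S):
            if low <= niche[j] <= high:
                eats[i][j] = True
    return niche, centers, eats


if __name__ == "__main__":
    values, mids, web = niche_model(S=50, C=0.15, rng=random.Random(1))
    links = sum(sum(row) for row in web)
    print("links per species:", links / 50)
```

The realised connectance fluctuates from web to web, so comparisons with data are normally made over many generated webs rather than a single one.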
In particular, the question of web stability cannot be addressed in these simple models. It will be discussed in the next section, where dynamical models are considered. The question of how the structure of webs might follow from evolutionary dynamics combined with biological principles will be explored in Section 4.
3
Population Dynamics and the Complexity-Stability Debate
Writing down a set of coupled equations for many interacting species is not an easy task, and not many examples are found in the literature. Equations for two or a few interacting species are commonly found, but generalizations to an arbitrary number of species are rarely suggested. In order to keep the expressions simple, one uses the same type of equation for each species, albeit with different coefficients and couplings. The equations then have the general form
\[ \frac{dN_i(t)}{dt} = \lambda N_i(t) \sum_j g_{ij} - \sum_j N_j(t)\, g_{ji} + r_i N_i(t) - c N_i^2(t) , \tag{1} \]
with $i$ being the species index, $N_i$ being the population size of species $i$, and $r_i$ being a positive growth rate (basal species) or a negative death rate (other species). For basal species, the logistic term $-cN_i^2(t)$ is necessary in order to prevent an explosion of the population size in the absence of predators. The so-called functional responses $g_{ij}(N_1, \ldots, N_n)$ depend on the population sizes and are the number of prey eaten per unit time by an individual predator. $\lambda$ is the above-mentioned ecological efficiency. The most commonly used form of the functional response is the Lotka-Volterra form
\[ g_{ij} = a_{ij} N_j , \tag{2} \]
which means that the number of prey eaten by a predator is proportional to the prey density, with $a_{ij}$ being a coupling constant. This may be a good form for chemical reactions, but for predator-prey interactions this is not a very realistic assumption, since predators become saturated at high prey densities and since they have to divide their time between the different available prey species. Models for two or a few species often use the Holling form, which takes into account predator saturation, and which was generalized to an arbitrary number of species by Arditi and Michalski [6],
\[ g_{ij} = \frac{a_{ij} N_j}{1 + \sum_k b_{ik} N_k} . \tag{3} \]
The sum over $k$ is taken over the prey of species $i$. Closely related to this is the Beddington form, which additionally takes into account that predators waste time fighting with other predators,
\[ g_{ij} = \frac{a_{ij} N_j}{1 + \sum_k b_{ik} N_k + \sum_l c_{il} N_l} , \tag{4} \]
where the first sum is again taken over all prey k of species i, and the second sum is taken over all those predator species l that share a prey with i.
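The three functional responses (2) to (4) and the right-hand side of Eq. (1) can be written down directly. The following is a minimal sketch, not an implementation taken from the cited works: the function names are hypothetical, a positive entry `a[i][j]` is taken to mean that j is prey of i, and whether the predator itself counts among the interfering species in Eq. (4) is left to the coefficient `c_int[i][i]`.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def response_lotka_volterra(a: Matrix, N: Sequence[float]) -> Matrix:
    """Eq. (2): g_ij = a_ij * N_j."""
    n = len(N)
    return [[a[i][j] * N[j] for j in range(n)] for i in range(n)]


def response_holling(a: Matrix, b: Matrix, N: Sequence[float]) -> Matrix:
    """Eq. (3): saturation through the sum over the prey k of species i."""
    n = len(N)
    g = []
    for i in range(n):
        prey = [k for k in range(n) if a[i][k] > 0]
        denom = 1.0 + sum(b[i][k] * N[k] for k in prey)
        g.append([a[i][j] * N[j] / denom for j in range(n)])
    return g


def response_beddington(a: Matrix, b: Matrix, c_int: Matrix,
                        N: Sequence[float]) -> Matrix:
    """Eq. (4): as Holling, plus interference from predators sharing a prey."""
    n = len(N)
    g = []
    for i in range(n):
        prey = [k for k in range(n) if a[i][k] > 0]
        # Species l interferes with i if it feeds on at least one prey of i.
        rivals = [l for l in range(n) if any(a[l][k] > 0 for k in prey)]
        denom = (1.0 + sum(b[i][k] * N[k] for k in prey)
                 + sum(c_int[i][l] * N[l] for l in rivals))
        g.append([a[i][j] * N[j] / denom for j in range(n)])
    return g


def population_rhs(N: Sequence[float], g: Matrix, r: Sequence[float],
                   lam: float, c: float) -> List[float]:
    """Right-hand side of Eq. (1) for all species."""
    n = len(N)
    return [lam * N[i] * sum(g[i])
            - sum(N[j] * g[j][i] for j in range(n))
            + r[i] * N[i] - c * N[i] ** 2
            for i in range(n)]
```

Passing the output of any of the three response functions to `population_rhs` gives the growth rates that a standard ODE integrator would need; the choice of response changes only the matrix `g`.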
Arditi and Michalski [6] also suggest the following ratio-dependent functional response, which implements the idea that predators share the prey,
\[ g_{ij} = \frac{a_{ij} N_j^{r(i)}}{N_i + \sum_{k \in R(i)} b_{ik} N_k^{r(i)}} , \tag{5} \]
with the self-consistent conditions
\[ N_j^{r(i)} = \frac{\beta_{ji} N_i^{C(j)} N_j}{\sum_{k \in C(j)} \beta_{jk} N_k^{C(j)}} , \qquad N_k^{C(j)} = \frac{h_{jk} N_j^{r(k)} N_k}{\sum_{l \in R(k)} h_{lk} N_l^{r(k)}} . \]
Here $\beta_{ij}$ is the efficiency of predator $i$ at consuming species $j$, $h_{ij}$ is the relative preference of predator $i$ for prey $j$, $R(i)$ are the prey species for predator $i$, $C(i)$ are the species predating on prey $i$, $N_j^{r(i)}$ is the part of species $j$ that is currently being accessed as resource by species $i$, and $N_k^{C(j)}$ is the part of species $k$ that is currently acting as consumer of species $j$. Finally, a recently introduced evolutionary food web model [2] uses the ratio-dependent expression
\[ g_{ij}(t) = \frac{a_{ij} f_{ij}(t) N_j(t)}{b N_j(t) + \sum_k \alpha_{ki} a_{kj} f_{kj}(t) N_k(t)} . \tag{6} \]
$f_{ij}$ is the fraction of its effort (or available searching time) that species $i$ puts into preying on species $j$. These efforts are determined self-consistently from the condition
\[ f_{ij}(t) = \frac{g_{ij}(t)}{\sum_k g_{ik}(t)} . \tag{7} \]
This condition is such that no individual can increase its energy intake by putting more effort into a different prey. Given the form of the population dynamics, the question arises of how the coupling constants and the connections between species must be chosen such that the system does not collapse under its dynamics, but remains stable with all species persisting. The first mathematical treatment of the topic of stability of systems consisting of many species is due to May [7]. May performed a linear stability analysis of the population sizes around a supposed equilibrium point,
\[ \frac{d}{dt}(\delta N_i) = \sum_j \alpha_{ij}\, \delta N_j , \tag{8} \]
where $\delta N_i$ is the deviation of the population size of species $i$ from its equilibrium value and $\alpha_{ij}$ is the so-called community matrix. In this way he avoided specifying the underlying population dynamics equations, but was constrained to stay near equilibrium. The choice of web structure is equivalent to the choice of the $\alpha_{ij}$. May chose the diagonal elements of the matrix
to be −1. The other elements were taken to be zero with probability 1 − h. With probability h, they had a random nonzero value chosen from a distribution of width α, so that α is a measure of the average interaction strength. Using results from random matrix theory, he found that ecosystems that are initially stable will become less stable (i.e., the initially negative eigenvalues of the community matrix move towards zero) when $\alpha (SC)^{1/2}$ is increased (S being the number of species and C the connectance, i.e. the probability h that an off-diagonal element is nonzero, so that SC is the mean number of links per species). Furthermore, (8) will almost certainly be stable if $\alpha (SC)^{1/2} < 1$, and almost certainly be unstable if $\alpha (SC)^{1/2} > 1$. These results suggested that more complex webs should be less stable, in contrast to the observation that large complex food webs exist and often appear more stable than simpler ecosystems such as monocultures, which are more susceptible to attacks by pests. The question of what stabilizes large complex food webs is still a matter of debate, although a variety of stabilizing factors have been identified. Apart from requiring mathematical corrections, which, however, do not affect the conclusions, the considerations by May are unrealistic in several respects. Firstly, stability of an ecosystem does not imply that it is at a stable fixed point of its dynamics. As long as oscillations are not too large, an ecosystem can in principle be on a limit cycle or even a chaotic trajectory. Secondly, the assumption that connections and link strengths are random must be corrected, since the interactions in real food webs are shaped by their history and are far from random. Taking the values of the elements of the community matrix from nature does indeed generate stable webs [8]. Similarly, giving model webs a history by assembling them from a “pool” of species generates stable webs. Starting with either one or a few species, species from the pool are added to the system, and they remain in it if the resulting system is stable.
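May's criterion can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not May's calculation: it takes C to be the probability h that an off-diagonal element is nonzero, draws the nonzero elements from a Gaussian of standard deviation α, and, to stay within the standard library, tests stability by integrating Eq. (8) with an explicit Euler scheme instead of computing eigenvalues. All function names are hypothetical.

```python
import math
import random
from typing import List


def community_matrix(S: int, h: float, alpha: float,
                     rng: random.Random) -> List[List[float]]:
    """Diagonal -1; off-diagonal zero with probability 1 - h, else N(0, alpha)."""
    A = [[0.0] * S for _ in range(S)]
    for i in range(S):
        for j in range(S):
            if i == j:
                A[i][j] = -1.0
            elif rng.random() < h:
                A[i][j] = rng.gauss(0.0, alpha)
    return A


def growth_factor(A: List[List[float]], rng: random.Random,
                  t_end: float = 15.0, dt: float = 0.02) -> float:
    """Integrate d(dN)/dt = A dN and return |dN(t_end)| / |dN(0)|."""
    S = len(A)
    x = [rng.gauss(0.0, 1.0) for _ in range(S)]
    norm0 = math.sqrt(sum(v * v for v in x))
    for _ in range(round(t_end / dt)):
        Ax = [sum(row[j] * x[j] for j in range(S)) for row in A]
        x = [x[i] + dt * Ax[i] for i in range(S)]
    return math.sqrt(sum(v * v for v in x)) / norm0


if __name__ == "__main__":
    rng = random.Random(0)
    S, h = 60, 0.25
    for strength in (0.5, 2.0):          # value of alpha * sqrt(S * h)
        alpha = strength / math.sqrt(S * h)
        factor = growth_factor(community_matrix(S, h, alpha, rng), rng)
        print(strength, "decays" if factor < 1.0 else "grows")
```

A perturbation that shrinks corresponds to a stable equilibrium and one that grows to an unstable one; values of α(Sh)^{1/2} close to 1 would need larger S and longer integration times to give a clear answer.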
The species pool is usually a set of several basal species (“plants”), “herbivores”, “carnivores” and top predators, with interaction coefficients between neighboring layers assigned according to some random rule. Although this species pool is usually interpreted as stemming from a large ecosystem, like the mainland, no stability criteria or other criteria inspired by real large webs are applied to it. After adding a new species with an initially small population size to the system, one of the three following things can happen: (i) The new species increases and coexists with all the other species. (ii) The new species remains in the system, but one or more other species go extinct. (iii) The new species goes extinct. Numerical integration of Lotka-Volterra equations, combined with the criterion of local stability, was used by Post and Pimm [9] and by Drake [10,11] to construct webs of typically 20 species. Other authors point out that using population dynamics equations that are more realistic than Lotka-Volterra equations tends to stabilize the dynamics of the models by dampening fluctuations in the population sizes. This is usually shown by using a small system of 4 or 5 species. Allowing predators to switch prey [12], or using the Holling form (3) combined with a preference
for weak couplings [13], generates systems with small population fluctuations. However, in these and related models the values of the coupling strengths are put into the system by hand, and no large model webs have been built. Recently, the diversity-stability debate has been reviewed by McCann [14].
4
Evolutionary Models
So far, we have discussed food web models where the possible values of the couplings are put in by hand. A different approach, which enables the construction of large webs, is one that allows for the evolution of couplings and thereby of the entire web by modifying existing species. The purpose of such models is not to reproduce the history of life on earth, but rather to find a simple way of taking into account the fact that the values of the couplings are subject to modifications and are shaped by the history of the ecosystem. The first such model, which however does not include population dynamics, is due to Caldarelli, Higgs and McKane [15]. A modification with population dynamics suggested by Drossel, Higgs and McKane [2] shall be presented in this section. This model has two different time scales, the first one being the ecological time scale on which the population sizes change, and the second one being a longer scale, on which the couplings and the composition of the web change. We shall see that this model generates many features of real food webs. In order to make a modification of existing couplings possible, it is useful to characterize species in terms of features, from which the couplings are derived. Modifying a species means modifying a feature, resulting in modified couplings. The similarity and therefore the competition strength between two species can also be naturally defined in terms of the fraction of features shared by two species. (However, another choice of rules, where links are directly modified and the similarity between species is derived from the proportion of links they have in common, is also possible and was used by Lässig et al. [16,17].) The model is therefore the following: Species are characterized as binary strings, with each bit representing a feature that is either present (1) or absent (0) in a species. Each species is assigned 10 features out of a pool of 500 possible ones.
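The feature representation, together with the score rule and the mutation step that the text describes next, can be sketched as follows. Several choices are assumptions of this sketch rather than statements of the model in [2]: the random matrix is taken to be antisymmetric, so that the score of i against j is minus the score of j against i (which is what the sign interpretation of the scores requires), the score is the unnormalised sum over feature pairs, and a mutation always picks a feature the species does not yet have. All names are hypothetical.

```python
import random
from typing import FrozenSet, List

POOL_SIZE = 500    # number of possible features (from the text)
N_FEATURES = 10    # features per species (from the text)

Species = FrozenSet[int]


def make_matrix(rng: random.Random, size: int = POOL_SIZE) -> List[List[float]]:
    """Gaussian random matrix (unit width, zero mean), assumed antisymmetric."""
    m = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(a + 1, size):
            m[a][b] = rng.gauss(0.0, 1.0)
            m[b][a] = -m[a][b]
    return m


def random_species(rng: random.Random) -> Species:
    """A species is the set of indices of its N_FEATURES present features."""
    return frozenset(rng.sample(range(POOL_SIZE), N_FEATURES))


def score(m: List[List[float]], first: Species, second: Species) -> float:
    """Feature vector of `first` times matrix times feature vector of `second`."""
    return sum(m[a][b] for a in first for b in second)


def mutate(rng: random.Random, parent: Species) -> Species:
    """Replace one feature of the parent by a different, currently absent one."""
    old = rng.choice(sorted(parent))
    new = rng.choice([f for f in range(POOL_SIZE) if f not in parent])
    return (parent - {old}) | {new}


if __name__ == "__main__":
    rng = random.Random(3)
    m = make_matrix(rng)
    a, b = random_species(rng), random_species(rng)
    s = score(m, a, b)
    print("a can feed on b" if s > 0 else "a is eaten by b", round(s, 3))
```

Because parent and mutant share nine of ten features, their scores against any third species are strongly correlated, which is what lets couplings change gradually over evolutionary time.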
“Scores” between two species are obtained by multiplying the two feature vectors to the right and left of an asymmetric random matrix (with elements drawn from a Gaussian distribution of width one around zero) that is chosen at the beginning of the simulation. Positive scores indicate that the first species can feed on the second species, and negative scores mean that the first species is eaten by the second. The external resources are represented as an additional species of fixed (and large) population size, which does not feed on any species. The simulations start with one species and the external resources. The population dynamics are iterated until an attractor or a cutoff time is reached, and then a “mutation” is performed in
one individual by replacing one of its features by a randomly chosen different one. This generates a new species of population size one, and applying the population dynamics leads either to the extinction or the survival of this new species, possibly driving other species to extinction. After a sufficiently long time, the properties of the resulting webs are evaluated. Using this evolutionary model in combination with Lotka-Volterra (2), Holling (3) or Beddington (4) population dynamics does not lead to model webs consisting of several trophic layers, but rather to webs where almost every species also feeds on the external resources [3]. It appears that “mutations” establishing additional links to the external resources convey an advantage when these population dynamics are used. We believe the reason for this to be that these three types of dynamics do not force predators to divide their energy between the different available prey and thus to focus their efforts on their best prey. This is different for ratio-dependent functional responses. The responses of the form (5), however, introduce such strong competition between predators that each prey only has one predator, resulting in tree-like web structures that are clearly unrealistic. The form (6) combined with (7), however, is apparently realistic enough to allow for the creation of large food webs. Figure 2 shows a web generated with this model. Evaluating the fraction of top and basal species, the number of links and other properties, we find good agreement with the features of real webs [2], and we even find a preponderance of weak links [3]. Figure 3 shows the number of species as a function of time. One can see that the number of species saturates around a value that depends on the parameters of the model (mainly the amount of external resources available), and fluctuates around it without large extinction avalanches.
This finding is in strong contrast to the often-cited hypothesis that ecosystems might be driven by their internal dynamics to a self-organized critical state, where extinctions of all sizes may occur. Nevertheless, there is a continuous turnover of species in our model. More recent results, concerning for instance the response of the model to the deletion of species, are presented in [18,19]. As the population dynamics
Fig. 2. Example of a food web generated by the evolutionary model [2]. The radius of the circles is proportional to the logarithm of the population size, and the thickness of a link is a measure of the energy flow along that link
[Figure 3: plot of the number of species (vertical axis, 0 to 150) against time (horizontal axis, 0 to 60000).]
Fig. 3. Number of species as a function of time for a given set of parameters. Time is measured in units of evolutionary steps
equations suggest, species are capable of switching prey and investing more effort in a different species when a prey goes extinct or becomes rare. For this reason, species removal does not generally result in a collapse of the web.
5
Conclusions
This article has given an overview of the different approaches that have been used to model food webs, with a focus on a particular model that is capable of naturally reproducing many features of real food webs. The theory of food web structure has made many advances during the last decade. Much of the early work concerned webs with random, rather than evolved, structures, but there has been a tendency in later work towards building up a web either from a pool of species (assembly models) or, more recently, by creating webs through modification of the existing species (evolutionary models). In parallel with these developments, some suggestions for more realistic population dynamics for whole communities, and not just for a single predator-prey pair, have been put forward. This development will form the basis for further progress in the theory of food webs. Evolved food webs are generally more stable than randomly assembled food webs. Large and complex randomly assembled food webs simply collapse under the population dynamics and are very unlikely to be stable. Evolutionary dynamics can choose the predator-prey links and the competition structure such that stable webs are built step by step, starting from a small initial web. We do not yet know in sufficient detail in what respects the link strengths and link structure of evolved networks differ from ad hoc compositions. There is also a need to better investigate the effects of different functional responses on the structure and stability of food web models. Finally, it should be noted that not all food webs share the features listed in this article. Food webs of soil species appear to be very different, since they are not composed of clear trophic layers, but rather of generalist species that ingest complex units of food [20,21]. Implementing the special conditions that prevail in the soil environment and building a model that helps understand
how the structure of soil webs results from these conditions is a further challenge to food web theory that has not yet been tackled.
References
1. B. Drossel and A.J. McKane, Modeling food webs, in: Handbook of graphs and networks, S. Bornholdt, H.G. Schuster (Eds.) (Wiley-VCH, Berlin, 2002).
2. B. Drossel, P.G. Higgs and A.J. McKane, J. Theor. Biol. 208, 91 (2001).
3. B. Drossel, A.J. McKane, C. Quince and P.G. Higgs, Evolving complex food webs (unpublished).
4. S. Scheu and B. Drossel, Komplexität und Stabilität von Nahrungsnetzen (unpublished).
5. R.J. Williams and N.D. Martinez, Nature 404, 180 (2000).
6. R. Arditi and J. Michalski, Nonlinear food web models and their responses to increased basal productivity, in: Food webs: Integration of patterns and dynamics, G.A. Polis, K.O. Winemiller (Eds.) (Chapman and Hall, New York, 1996).
7. R.M. May, Nature 238, 413 (1972).
8. P. Yodzis, Nature 289, 674 (1981).
9. W.M. Post and S.L. Pimm, Math. Biosci. 64, 169 (1983).
10. J.A. Drake, Models of community assembly and the structure of ecological landscapes, in: Mathematical ecology, T. Hallam, L. Gross, S. Levin (Eds.) (World Scientific, Singapore, 1988).
11. J.A. Drake, J. Theor. Biol. 147, 213 (1990).
12. J.D. Pelletier, Math. Biosci. 163, 91 (2000).
13. K. McCann, A. Hastings and G.R. Huxel, Nature 395, 794 (1998).
14. K.S. McCann, Nature 405, 228 (2000).
15. G. Caldarelli, P.G. Higgs and A.J. McKane, J. Theor. Biol. 193, 345 (1998).
16. M. Lässig, U. Bastolla, S.C. Manrubia and A. Valleriani, Shape of ecological networks, Phys. Rev. Lett. 86, 4418 (2001).
17. U. Bastolla, M. Lässig, S.C. Manrubia and A. Valleriani, Dynamics and topology of species networks, in: Biological Evolution and Statistical Physics, M. Lässig, A. Valleriani (Eds.) (Springer-Verlag, Berlin, 2002).
18. C. Quince, P.G. Higgs and A.J. McKane, Food web structure and the evolution of ecological communities, in: Biological Evolution and Statistical Physics, M. Lässig, A. Valleriani (Eds.) (Springer-Verlag, Berlin, 2002).
19. C. Quince, P.G. Higgs and A.J. McKane, The effects of the removal and addition of species on ecosystem stability, in: Complexity emerging: a paradigm for ecological thought, J.A. Drake, C.R. Zimmermann, S. Gavrilets, T. Fukami (Eds.) (Columbia University Press, New York, to be published).
20. S. Scheu and M. Falca, Oecologia 123, 285 (2000).
21. S. Scheu, European Journal of Soil Biology 38, 11 (2002).
Control of Chaos by Time-Delayed Feedback: A Survey of Theoretical and Experimental Aspects
Wolfram Just (1), Hartmut Benner (2), and Eckehard Schöll (3)
(1) Institute of Physics, Chemnitz University of Technology, D-09107 Chemnitz, Germany
(2) Institute of Solid State Physics, Darmstadt University of Technology, Hochschulstraße 6, D-64289 Darmstadt, Germany
(3) Institute of Theoretical Physics, Berlin University of Technology, Hardenbergstraße 36, D-10623 Berlin, Germany
Abstract. Time-delayed feedback control has been introduced as a powerful tool for control of unstable periodic orbits in dynamical systems. From the experimental point of view its strength is based on the fact that the application of this method requires just the measurement of simple signals, and it has been applied in physics, chemistry and biology. We present an overview of the theoretical foundations of time-delayed feedback methods and explain in detail the implications for real experiments.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 589–603, 2003. © Springer-Verlag Berlin Heidelberg 2003
During the last decade control of chaos has developed into one of the most prominent fields in applied nonlinear science [1]. Its beginning was triggered by the observation that on the one hand chaotic motion provides a huge number of unstable states and that on the other hand each of these states can be stabilised by tiny control forces [2]. The control scheme that was originally developed in [2] is based on the local phase space structure in the vicinity of the target orbit. It has been applied in different experimental contexts where such a structure is experimentally accessible (cf. e.g. [3]). In fact, such approaches allow for more sophisticated goals like tracking and targeting of particular trajectories. In the wake of these developments variants of the control method have been rediscovered [4] which are simpler to handle from the experimental point of view and which are often considered in control theory. Most applications in the physics context concern the stabilisation of fixed points, either in the original phase space or with respect to a suitable Poincaré cross section. Above all it is important to stress that the control schemes mentioned above aim at noninvasive control, so that the control force finally vanishes and the target state is not altered by the control loop. Noninvasive schemes open the possibility to use control methods as a kind of nonlinear
spectroscopy, since different proper eigenmotions of the system can be detected and recorded. Control theory is a well developed discipline in applied mathematics and engineering science, and there exists a huge amount of literature ranging from elementary to sophisticated presentations (cf. e.g. [5,6]). Control problems are frequently formulated in terms of boundary value and optimisation problems. Thus global properties like controllability and observability of a target state can also be addressed. However, such concepts are to some extent restricted to systems where at least partial information about the internal state is accessible or to systems which are either linear or conjugate to a linear system in parts of their phase space. Whether a control scheme is invasive does not play a crucial role from the point of view of control theory. It has been a well established fact for decades [7], and to some extent common wisdom in control theory, that time delay reduces the efficiency of a control scheme. It therefore came as quite a surprise when it was pointed out [8] that time-delayed feedback may be suitable to generate control forces for stabilising time-periodic states. The main idea of such feedback schemes is quite simple and can be applied in different experimental contexts. Suppose a signal s(t) is accessible for measurement and we intend to stabilise an unstable periodic state of period τ. From the time-delayed difference s(t) − s(t − τ) one generates a control force, e.g. by linear amplification, and one feeds it back to the system. Obviously such a scheme (cf. Fig. 1) is noninvasive since the force vanishes provided the τ-periodic target state is reached. Whether such a prescription works requires a more detailed investigation. Nevertheless, the scheme has been applied successfully in optical experiments, e.g. for stabilising lasers [9] or gas discharge tubes [10], in hydrodynamic experiments, e.g.
for stabilisation of turbulent Taylor-Couette flows [11], in chemical setups, e.g. for electrochemical reactions [12], in magnetic systems, e.g. for controlling high power ferromagnetic resonance experiments [13], and in biological systems, e.g. for controlling cardiac arrhythmia [14]. Of course demonstration experiments like mechanical pendula [15] or electronic circuits [16] have also been performed. The latter are very useful to study details of the control scheme from an experimental perspective. Despite its experimental success a deeper theoretical understanding of time-delayed feedback control has been gained only recently. Here we present a summary of these developments. We mainly focus on essential theoretical features without giving all the technical details. The relevance of these items is illustrated by results from electronic circuit experiments. Section 2 is devoted to the analysis of the original Pyragas scheme. More specialised topics like multiple delays or the introduction of additional time scales in the control process are addressed in Sect. 3. The lack of a proper theoretical understanding of some aspects is related to the fact that time delay systems act on an infinite-dimensional phase space. Thus, dynamics with time delay is of interest by itself. We point out some recent trends in Sect. 4.
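The basic idea of the scheme, a control force built from the difference between the current signal and its delayed copy, can be tried out on a toy system. The sketch below is a discrete-time analogue and not one of the experiments cited above: the logistic map, the parameter values r = 3.8 and K = 0.7, the activation window and the sign convention of the force are all assumptions chosen for illustration. The target state is the unstable fixed point x* = 1 − 1/r, for which the delay is one iteration.

```python
def logistic(x: float, r: float) -> float:
    """One step of the uncontrolled logistic map."""
    return r * x * (1.0 - x)


def delayed_feedback_run(r: float = 3.8, K: float = 0.7, steps: int = 5000,
                         window: float = 0.02, x0: float = 0.3):
    """Iterate x_{n+1} = f(x_n) + K (x_n - x_{n-1}).

    The force is switched on (and then kept on) the first time two
    successive iterates differ by less than `window`, i.e. when the
    chaotic trajectory happens to pass close to the fixed point.
    Returns the final state, the final control force and the on/off flag.
    """
    x_prev = x0
    x = logistic(x_prev, r)
    active = False
    force = 0.0
    for _ in range(steps):
        if not active and abs(x - x_prev) < window:
            active = True
        # Delayed difference with linear amplification; it vanishes on the
        # fixed point, so the control is noninvasive.
        force = K * (x - x_prev) if active else 0.0
        x_prev, x = x, logistic(x, r) + force
    return x, force, active


if __name__ == "__main__":
    x_final, f_final, on = delayed_feedback_run()
    print("control on:", on)
    print("final state:", x_final, "target:", 1.0 - 1.0 / 3.8)
    print("final force:", f_final)
```

Linearising around x* with multiplier μ = 2 − r gives the characteristic equation z² − (μ + K)z + K = 0, which for μ = −1.8 has both roots inside the unit circle only for 0.4 < K < 1. Control thus works only in a finite interval of the amplitude, bounded by a flip instability below and by an oscillatory instability above.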
Fig. 1. Diagrammatic view of time-delayed feedback control. Internal degrees of freedom are denoted by x(t). The measured signal s(t) = g[x(t)] depends on the internal degrees of freedom. The control force F(t) = K[s(t) − s(t − τ)] is generated from the time-delayed difference by linear amplification. The force is coupled to some external parameter h(t) of the experimental setup
2
Properties of the Original Pyragas Scheme
Theoretical approaches for time-delayed feedback control rely to a large extent on linear stability analysis. Fortunately essential steps can be performed without resorting to a particular model. Thus the results and the control performance share a large degree of universality. Let us just briefly review the main concepts and introduce the essential notations [17]. Let $x(t)$ denote the internal degrees of freedom of the system under consideration. The dynamics without control force is assumed to be governed by a set of equations of motion
\[ \dot{x}(t) = f(x(t)) . \tag{1} \]
We assume that the motion admits an unstable periodic state $\xi(t) = \xi(t + \tau)$ which we finally intend to stabilise. Observing small deviations from the target state, $\delta x(t) = x(t) - \xi(t)$, the stability of the orbit is governed by linear analysis. With the usual exponential ansatz $\delta x(t) = \exp(\lambda t)\, q(t)$, where the eigenmode is periodic in time, $q(t) = q(t + \tau)$, by virtue of the periodicity of the unstable orbit, one finally obtains a Floquet eigenvalue problem
\[ \lambda_\nu q_\nu(t) + \dot{q}_\nu(t) = Df(\xi(t))\, q_\nu(t) . \tag{2} \]
$Df$ denotes the Jacobian matrix and the index $\nu$ numbers the different eigenmodes. While the real part of the Floquet exponents $\lambda_\nu$ governs the stability, the imaginary part determines the torsion of neighbouring trajectories in phase space. The latter quantities are defined modulo $2\pi/\tau$ owing to the periodicity of the eigenmodes. In what follows we suppose for simplicity of notation that just the real part of one Floquet exponent is positive. Generalisations of our considerations to cases with several unstable exponents are straightforward to a large extent. We label the unstable exponent by $\lambda_+$.
The control system sketched in Fig. 1 contains the control loop which is based on the time-delayed difference. In terms of an equation of motion a particular realisation of the control reads ˙ x(t) = f (x(t)) − K [χx(t) − χx(t − τ )] .
(3)
In Eq. (3) the control force has been based on the observation of the full state through the observable χx(t), where the factor χ takes care of physical dimensions and can of course be absorbed in the control amplitude. The feedback used in Eq. (3) is very special and to some extent not very realistic, since such a diagonal control requires the measurement and the stabilisation of each degree of freedom. But it considerably facilitates a complete analytical discussion. Performing a linear stability analysis and using the exponential ansatz δx(t) = exp(Λt)Q(t) we immediately recognise that the delay term δx(t) − δx(t − τ) reduces to a contribution which is local in time, [1 − exp(−Λτ)] exp(Λt)Q(t). Thus stability is governed by the ordinary Floquet problem
$$\Lambda_\alpha Q_\alpha(t) + \dot{Q}_\alpha(t) = Df(\xi(t))\,Q_\alpha(t) - K\chi\,[1 - \exp(-\Lambda_\alpha\tau)]\,Q_\alpha(t) . \tag{4}$$
The structure of Eq. (4) allows the control term, i.e. the second contribution on the right hand side, to be absorbed in the Floquet exponent, i.e. the first contribution on the left hand side. Then, by comparison with Eq. (2), we get the characteristic equations
$$\lambda_\nu = \Lambda_\alpha + K\chi\,[1 - \exp(-\Lambda_\alpha\tau)] , \tag{5}$$
where the index α labels the different Floquet exponents of the controlled system. Thus each eigenvalue λν of the uncontrolled system gives rise to a whole family Λα of exponents which determine stability in the case with control. Such a structure in fact reflects the infinite-dimensional phase space on which the corresponding differential-difference equation (3) operates. Transcendental equations of the kind Eq. (5) are typical for the stability analysis of time delay systems [18]. Several techniques, either analytical or numerical, have been developed for their evaluation [19,18,20,21]. In Eq. (5) it is often sufficient to consider the unstable free branch, i.e. ν = +. The Floquet exponents of the controlled system, Λα, determine the control performance. Successful control requires ReΛα < 0. The dependence of the exponents on the control amplitude K is depicted in Fig. 2 for an orbit with Imλ+ = π/τ, i.e. an orbit which flips its neighbourhood during one turn. Such orbits are generated in period doubling bifurcations and therefore occur frequently in chaotic attractors which are generated by period doubling cascades. One observes a typical butterfly shaped structure for the leading exponent which results in a finite control interval. At the lower control threshold a flip instability occurs, whereas the upper control threshold is related to a nontrivial imaginary part so that a Hopf instability occurs. All the other complex valued solutions of Eq. (5) have lower real part (cf. e.g. [19,22]). If the real part of the free exponent, Reλ+, is increased, then the whole curve shifts essentially upwards. Thus the control interval shrinks and finally vanishes at Reλ+τ = 2 (cf. Fig. 5) [23]. Thus time-delayed feedback control is limited to low-period or weakly unstable orbits.

Fig. 2. Leading Floquet exponent of the system subjected to time-delayed feedback control in dependence on the control amplitude K. Left: real part Re Λτ, right: imaginary part ∆Ωτ/2π with ∆Ω = Im(Λ − λ+). Full line: analytical result according to Eq. (5) with λ+τ = 1.07 + iπ and χτ = 0.036. Symbols: experimental data obtained from a nonlinear diode resonator experiment

So far our analysis was rigorous but restricted to the diagonal control scheme. We compare these results with experimental data obtained in an electronic circuit experiment (cf. [22] for details of the setup). Experimentally the dominant Floquet exponent can be obtained either from observing the transient dynamics or by linear response methods. Although the coupling of control forces is quite different compared to the diagonal scheme Eq. (3), we observe a remarkable coincidence when the parameters λ+ and χ in Eq. (5) are considered as fit parameters. Thus the analysis presented above has considerable predictive power. To understand such a coincidence on a substantial level one may perform a linear stability analysis for a general coupling scheme, i.e. for a measured signal s(t) = g[x(t)] and a coupling of the control force to internal degrees of freedom which is not specified from the beginning [17]. In contrast to Eq. (4) there appears now a nontrivial control matrix. Nevertheless the characteristic equation can be cast in the form
$$\Lambda_\alpha = \Gamma_\nu\big[K\,[1 - \exp(-\Lambda_\alpha\tau)]\big] \tag{6}$$
where the functions Γν obey the constraints Γν[0] = λν. They contain all the details of the control scheme. Equation (5), which was used for comparison with the experimental data, can be considered as an asymptotic expansion of the full characteristic equation (6). Details of the quality of such an approximation have been discussed in [22]. Above all, these considerations make explicit that even the simple expression (5) captures control features beyond the diagonal control.

Equation (6) is hard to evaluate since the function Γν appearing on the right hand side is not known in general. Nevertheless the constraint Γ+[0] = λ+ ≠ 0 tells us that at the control boundary, ReΛα = 0, the corresponding imaginary part of the Floquet exponent does not vanish, since Λα = 0 results in a contradiction. Thus torsion is a necessary ingredient for time-delayed feedback control to work at all [17]. Such a feature can be translated into properties of the free orbit [24]. Only those orbits are accessible to time-delayed feedback schemes which have an even number of positive unstable Floquet multipliers exp(λντ). Such a limitation has been derived from the exact characteristic equation (6). One has to modify the feedback scheme in order to remove this constraint, and there are two alternatives available in the literature. On the one hand one may modulate the control amplitude periodically in time, K = K(t) = K(t + T). For modulation periods larger than the period τ of the unstable orbit one may achieve stabilisation even in cases where the plain Pyragas scheme fails [25,26]. Such an idea has been applied successfully in experiments. By such modulations one considerably changes the spectral structure of the associated stability problem (cf. e.g. [27]). In fact, one essentially adds additional degrees of freedom to the control loop. On the other hand a controller containing directly an additional unstable degree of freedom has been proposed recently [28] to avoid the constraint mentioned in the previous paragraph.

We have already stressed that the simple analytical expression (5) captures essential features of time-delayed feedback control beyond the diagonal coupling scheme. Since Eq. (5) is in general only an approximation, deviations of course occur when other coupling schemes are considered. Equation (5) fails to be a reasonable approximation of the exact characteristic equation (6) when hybridisation with formerly stable branches changes the spectral structure considerably.
One may employ approximations which go beyond the linear order [22,29], but they are hardly feasible for experimental purposes because of the larger number of free parameters. As an essential new feature the control interval may be limited by Floquet exponents which emerge from formerly stable eigenvalue branches. Figure 3 contains an illustration by experimental data obtained from electronic circuit experiments [22]. One readily sees that the control domain is now severely reduced compared to a prediction based entirely on the simple Eq. (5). The detailed structure of the spectrum is determined by the particular form of the control matrix, i.e. by the details of the measured signal s(t) = g[x(t)] and by the coupling of the control force to the internal degrees of freedom. It is in fact one purpose of control theory to discuss the influence of these features on the control performance for non time-delayed feedback schemes. But almost nothing is known for time-delayed feedback control, mainly because of the transcendental characteristic equation that emerges from the corresponding stability problem. Only preliminary results, i.e. numerical simulations of particular model systems, can be found in the literature, e.g. for spatially extended systems [30,31], including global feedback schemes which are particularly easy to implement practically [32]. No consistent picture shows up, so that to date no a priori estimate can be given of what type of coupling scheme enhances the performance of time-delayed feedback control. Such a problem is one of the challenges for future research.

Fig. 3. Floquet branch crossing in an electronic circuit experiment. Floquet exponents in dependence on the control amplitude K, left: real part Re Λτ, right: imaginary part ∆Ωτ/2π with ∆Ω = Im(Λ − λ+). Open symbols: branch which determines the lower control threshold, full symbols: nonleading branch which determines the upper control threshold
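For the diagonal scheme, the control interval implied by Eq. (5) can be worked out in a few lines. The sketch below uses dimensionless variables z = Λτ, a = Reλ+τ and k = Kχτ for an orbit with Imλ+τ = π. The function names and the auxiliary angle φ are introduced here for illustration only. The lower (flip) threshold follows from setting z = iπ; the upper (Hopf) threshold is parametrised as z = i(π − 2φ), which turns the real and imaginary parts of Eq. (5) into φ/tan φ = a/2 and k = a/(2 cos²φ).

```python
import cmath
import math

def residual(z, lam, k):
    """Eq. (5) in dimensionless form: z = Lambda*tau, lam = lambda*tau, k = K*chi*tau."""
    return z + k * (1.0 - cmath.exp(-z)) - lam

def control_interval(a):
    """Return (k_min, k_max, phi) for lambda_+ * tau = a + i*pi, or None if a is not in (0, 2).

    k_min is the flip threshold (z = i*pi); k_max is the Hopf threshold,
    where z = i*(pi - 2*phi) and phi solves phi / tan(phi) = a / 2.
    """
    if not 0.0 < a < 2.0:
        return None
    k_min = a / 2.0
    # phi / tan(phi) decreases monotonically from 1 to 0 on (0, pi/2): bisect.
    lo, hi = 1e-12, math.pi / 2 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / math.tan(mid) > a / 2.0:
            lo = mid
        else:
            hi = mid
    phi = 0.5 * (lo + hi)
    k_max = a / (2.0 * math.cos(phi) ** 2)
    return k_min, k_max, phi
```

With the fit values quoted in the caption of Fig. 2 (a = 1.07, χτ = 0.036), dividing the two thresholds by χτ gives the control interval in terms of K. As a approaches 2 the angle φ tends to zero and the two thresholds merge, in line with the statement that the control interval vanishes at Reλ+τ = 2.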
3 Advanced Problems in Time-Delayed Feedback Control

In the previous section we have addressed several topics concerning the original Pyragas control scheme. However, a lot of problems have not been mentioned so far, and this section is devoted to a discussion of more specialised topics. In particular we will discuss how appropriate delay times τ can be determined, in which way the control performance can be improved by multiple delays, how control loop latency affects the control properties and in which way a broken time translation invariance can contribute to an improvement of the control.

3.1 Adjustment of the Delay Time
To ensure the noninvasive character of time-delayed feedback control the delay time must be adjusted according to the period of the unstable orbit. Only in special cases, e.g. in periodically driven systems, are such periods known a priori. There exist however several empirical schemes which can successfully cope with such a problem. It has already been stressed in [8] that some average of the control signal s(t) − s(t − τ) shows sharp minima when the delay is changed continuously and that these resonance-like structures occur at periods of unstable orbits. These features depend on the value of the control amplitude K and it is quite unlikely that a lot of periods can be observed. But one has at least an indicator for the proper adjustment of the delay time. Improvements of this idea can be found in the literature. Instead of observing the control signal, other signatures for proper periods, e.g. the distance between local maxima of the signal s(t), can be considered [33] and the whole procedure can be embedded in steepest descent methods [34]. Although in real experiments the control signal never vanishes identically (cf. e.g. [12,13]), one gets at least a first order approximation for a noninvasive scheme.

One often observes periodic signals even if the delay time is not adjusted properly. Such induced periodic behaviour follows from the fact that the system under consideration and the control loop perform a combined dynamics. The period Θ of such an induced periodic behaviour can be related to the period T of proper unstable periodic orbits of the system by employing arguments borrowed from structural stability [18]. An estimate for the induced period Θ in dependence on the true period T and the delay τ can be derived by applying first-order perturbation theory to the full differential-difference equation describing the control loop [35],
$$\Theta = T + \frac{K}{K-\kappa}\,(\tau - T) + O((\tau - T)^2) . \tag{7}$$
The result is valid under quite general conditions and the details of the control scheme enter just through the system dependent parameter κ. Whether such an induced periodic behaviour is stable, i.e. whether it shows up in the control signal, depends of course on the value of the control amplitude K. But in certain ranges of control amplitudes the behaviour predicted by Eq. (7) can be recorded in real experiments [35] (cf. Fig. 4). Thus a few data points are sufficient to estimate the true period T from the period of the control signal Θ and its dependence on the control amplitude. Of course Eq. (7) is a first-order result and its applicability is limited to cases where the difference between delay and true period is not too large.
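To illustrate how a few data points suffice, note that Eq. (7), truncated at first order, can be rearranged into K(Θ − τ) = κΘ − κT, which is linear in the two unknowns κ and κT. The sketch below fits these by least squares from measurements of Θ at several control amplitudes and one fixed delay. The function names and the synthetic numbers in the usage note are assumptions made for illustration, not values from the experiment.

```python
import numpy as np

def induced_period(K, tau, T, kappa):
    """First-order prediction of Eq. (7) for the induced period Theta."""
    return T + K / (K - kappa) * (tau - T)

def estimate_true_period(K, theta, tau):
    """Estimate (T, kappa) from induced periods theta measured at amplitudes K.

    Uses the rearranged first-order relation
        K * (theta - tau) = kappa * theta - kappa * T,
    solved in the least-squares sense for kappa and c = kappa * T.
    """
    K = np.asarray(K, dtype=float)
    theta = np.asarray(theta, dtype=float)
    A = np.column_stack([theta, -np.ones_like(theta)])
    b = K * (theta - tau)
    (kappa, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return c / kappa, kappa
```

For example, generating Θ with `induced_period` for K = 0.2, 0.4, 0.6, τ = 1.70, T = 1.656 and a made-up κ = −0.2, and passing the result to `estimate_true_period`, recovers T and κ. The fit degenerates when τ = T, since then Θ = T for every K and κ drops out of the relation.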
Fig. 4. Period Θ [µs] of the control signal in dependence on the control amplitude K and the delay τ [µs] (gray surface) obtained from an electronic circuit experiment. Period of the proper orbit T = 1.656 µs. As a guide for the eye the plane Θ = τ is displayed also. See [35] for details of the experimental setup

3.2 Control Schemes with Multiple Delays
As already pointed out in Sect. 2 the original Pyragas scheme is limited to control of unstable orbits with short period. In order to overcome such a limitation control methods have been proposed which use multiple delay times [36]. A quite simple scheme generates control forces from time-delayed differences with an exponentially decaying weight which mimics low pass frequency filters
$$F(t) = K \sum_{\nu=0}^{\infty} R^\nu\,[s(t-\nu\tau) - s(t-(\nu+1)\tau)] = K\,[s(t) - s(t-\tau)] + R\,F(t-\tau), \qquad |R| < 1 . \tag{8}$$
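The equivalence of the two forms in Eq. (8) is easy to check on sampled data. The following sketch assumes a signal sampled on a uniform grid with the delay τ equal to an integer number `d` of samples, and a signal and force that vanish before the first sample. These conventions and the function names are assumptions made for the illustration.

```python
import numpy as np

def etdas_force_recursive(s, d, K, R):
    """Recursive form of Eq. (8): F[n] = K*(s[n] - s[n-d]) + R*F[n-d]."""
    F = np.zeros(len(s))
    for n in range(len(s)):
        delayed_signal = s[n - d] if n >= d else 0.0
        delayed_force = F[n - d] if n >= d else 0.0
        F[n] = K * (s[n] - delayed_signal) + R * delayed_force
    return F

def etdas_force_sum(s, d, K, R):
    """Direct form of Eq. (8): geometrically weighted sum of delayed differences."""
    F = np.zeros(len(s))
    for n in range(len(s)):
        nu = 0
        while n - nu * d >= 0:
            newer = s[n - nu * d]
            older = s[n - (nu + 1) * d] if n - (nu + 1) * d >= 0 else 0.0
            F[n] += K * R**nu * (newer - older)
            nu += 1
    return F
```

The recursive form needs only the one delayed copy of the force in addition to the delayed signal, which reflects the remark that no additional delay lines are required. For a signal whose period equals the delay, the force computed this way decays geometrically with R after switch-on, so the scheme stays noninvasive once the transient has died out.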
Such a force can be realised without additional delay lines and the scheme is in particular suitable for optical implementation [37]. The extended time-delayed feedback method can be discussed analytically within the diagonal coupling scheme. The corresponding characteristic equation reads [23]
$$\lambda_\nu = \Lambda_\alpha + K\chi\,\frac{1 - \exp(-\Lambda_\alpha\tau)}{1 - R\exp(-\Lambda_\alpha\tau)} . \tag{9}$$
Stability domains for different values of the filter parameters are sketched in Fig. 5. It is quite straightforward to check that stabilisation is possible if Reλ+τ < 2(1 + R)/(1 − R) for orbits with Imλ+τ = π. Thus the introduction of multiple delays increases the control performance. The previous analysis is of course only approximately valid for general coupling schemes, as already discussed in Sect. 2. Nevertheless the main features often endure, as can be checked by experiments [23] (cf. Fig. 6). In general control intervals widen if the filter parameter is increased. One should, however, keep in mind that exceptions may occur since nonleading branches in the spectrum may become relevant [22]. Thus the precise shape of the control domain depends on the details of the system and the coupling of the control force, as is for instance visible in diverse numerical simulations [30,31]. The extended control scheme described so far is in principle not limited to filter parameters |R| < 1. One may apply an unstable controller with |R| > 1 and such a concept, as already stressed in Sect. 2, is useful to overcome the constraint imposed by torsion [28].

Fig. 5. Control domains in the Reλ-K parameter plane (axes: Re λ+τ versus τχK) for extended time-delayed feedback control. Analytical results according to Eq. (9) for an orbit with Imλτ = π and for different values of the filter parameter: R = 0.5 (thick), R = 0 (medium), R = −0.5 (thin)
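The bound on Reλ+τ quoted above can be checked directly from Eq. (9). The short derivation below is a sketch in dimensionless variables, with a = Reλ+τ, k = Kχτ and z = Λτ. The auxiliary angle φ is introduced here purely for convenience and does not appear in the original text.

```latex
% Eq. (9) for an orbit with \lambda_+\tau = a + i\pi, |R| < 1:
\begin{align*}
 a + i\pi &= z + k\,\frac{1 - e^{-z}}{1 - R\,e^{-z}} .
\intertext{Flip threshold, $z = i\pi$ (so $e^{-z} = -1$):}
 a &= \frac{2k}{1+R}
 \quad\Longrightarrow\quad k_{\min} = \frac{a\,(1+R)}{2} .
\intertext{Hopf threshold, $z = i(\pi - 2\varphi)$ with $0 < \varphi < \pi/2$;
separating real and imaginary parts gives}
 a &= \frac{2(1+R)}{1-R}\,\frac{\varphi}{\tan\varphi} ,
 \qquad
 k_{\max} = \frac{a\,\bigl(1 + 2R\cos 2\varphi + R^2\bigr)}{2\,(1+R)\cos^2\varphi} .
\end{align*}
% Since 0 < \varphi/\tan\varphi < 1 on (0, \pi/2), a Hopf threshold exists only for
% a < 2(1+R)/(1-R). For \varphi \to 0 one finds k_{\max} \to a(1+R)/2 = k_{\min},
% i.e. the control interval closes exactly at the quoted bound.
```

For R = 0 these expressions reduce to the thresholds of the original Pyragas scheme, with the interval closing at Reλ+τ = 2.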
Fig. 6. Control domains in the R-K parameter plane for extended time-delayed feedback control. Experimental results from an electronic circuit experiment for orbits with different Floquet exponents λ (triangles, circles, and squares). Lower control threshold (full symbols), upper control threshold (open symbols), analytical result according to Eq. (9) (lines). See [23] for details of the experiment and the chosen parameters
The method according to Eq. (8) uses a particular version of multiple delays, i.e. integer multiples of the basic delay time. This form aims at a simple experimental realisation. Its theoretical motivation comes from a suitably designed transfer function in order to suppress frequency components which belong to the orbit which will be stabilised. Such a particular choice is not the most general for a time-delayed feedback loop. In fact any control force consisting of terms like s(t − δi) − s(t − τ − δi) yields a noninvasive control scheme regardless of the values of the offsets δi. Whether such approaches really increase the control performance is a delicate question. In fact, common wisdom in control theory tells us that additional delays are unlikely to increase the control performance [7]. That is in particular true for control loop latencies which may become important in fast experiments on GHz time scales where no instantaneous feedback can be realised. In terms of the original Pyragas scheme such a loop latency results in
$$F(t) = s(t-\delta) - s(t-\tau-\delta) . \tag{10}$$
The influence of such a single latency δ on time-delayed feedback control has been studied quantitatively in [38]. Following an analysis along the lines of Sect. 2 one obtains for the characteristic equation
$$\lambda_\nu = \Lambda_\alpha + K\chi(\delta)\,\exp(-\Lambda_\alpha\delta)\,[1 - \exp(-\Lambda_\alpha\tau)] . \tag{11}$$
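A small sketch may help to keep the bookkeeping of Eq. (11) straight. It writes the characteristic equation in dimensionless form (z = Λτ, k = Kχτ, latency measured in units of τ) and also evaluates the latency bound that the text goes on to quote as Eq. (12) for orbits with Imλ+τ = π. The function names are hypothetical, and the bound is simply transcribed from Eq. (12), not re-derived here.

```python
import cmath
import math

def residual_latency(z, lam, k, delta):
    """Eq. (11) in dimensionless form; vanishes when z = Lambda*tau is a solution.

    lam = lambda*tau, k = K*chi*tau, delta = latency / tau.
    """
    return z + k * cmath.exp(-z * delta) * (1.0 - cmath.exp(-z)) - lam

def max_latency(a):
    """Upper bound on latency/tau from Eq. (12), with a = Re(lambda_+ * tau)."""
    return (1.0 - a / 2.0) / a

# Without latency, the flip threshold z = i*pi, k = a/2 of the plain scheme solves the equation.
a = 1.07
flip_check = residual_latency(1j * math.pi, complex(a, math.pi), a / 2.0, 0.0)
```

Setting `delta = 0` recovers Eq. (5), which is a quick consistency check. The bound returned by `max_latency` shrinks to zero as Reλ+τ approaches 2, where the control interval of the latency-free scheme already closes.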
Figure 7 displays in addition results of a simple electronic circuit experiment. One observes a monotonic decrease of the control interval. The latter disappears for latencies of the order of 0.1τ. In fact, Eq. (11) yields the analytical estimate
$$\delta < \tau\,\frac{1 - \mathrm{Re}\lambda_+\tau/2}{\mathrm{Re}\lambda_+\tau} \tag{12}$$
for orbits with Imλ+τ = π. Thus acceptable latencies for time-delayed feedback control are related to the Lyapunov exponent of the unstable orbit. Of course such an estimate involves the mean-field-like approximations mentioned previously and gives in general only the correct order of magnitude.

Fig. 7. Control intervals (control amplitude K) for the original Pyragas scheme in dependence on the control loop latency δ [ns]. Symbols: electronic circuit experiment, lower control threshold (filled squares), upper control threshold (open circles). Lines: analytical result according to Eq. (11). See [38] for details of the experiment and the data fit

3.3 Explicitly Time-Dependent Control Loops
The introduction of explicit time-dependencies in the control loop changes the control properties considerably, as was already mentioned in Sect. 2 in the context of rhythmic control. Unfortunately there is no systematic theory which can cope with such features, mainly because of the intricate spectral structure of the associated linear stability problems (cf. e.g. [27]). Nevertheless, we are going to explain a simple mechanism by which the system subjected to control optimises its control performance so that control intervals may considerably increase [39]. The following considerations rely on the property that the free system without control is autonomous, so that a Goldstone mode related to time translation invariance occurs. To explain the essential mechanism we consider the original Pyragas control scheme but suppose that the coupling of the control force to the internal degrees of freedom is mediated by the unstable mode q+(t),
$$\dot{x}(t) = f(x(t)) - K\,q_+(t)\,[s(t) - s(t-\tau)] . \tag{13}$$
Such a kind of coupling where eigenmodes are employed is in fact quite common and can be realised e.g. in optical setups [40]. The measured signal s(t) = g[x(t)] depends on the internal state but the actual dependence does not matter. We assume for simplicity of presentation that the eigenmode is real, but even for the generic case of complex valued modes the arguments follow the same line [41]. Since the system without control force is autonomous, the orbit ξ(t + δ) for an arbitrary value of the phase shift δ yields a periodic target state of the system subjected to control, Eq. (13). Linear stability analysis of such target states yields the eigenvalue problem
$$\Lambda_\alpha Q_\alpha(t) + \dot{Q}_\alpha(t) = Df(\xi(t+\delta))\,Q_\alpha(t) - K\,\langle dg[x(t)]\,|\,Q_\alpha(t)\rangle\,[1 - \exp(-\Lambda_\alpha\tau)]\,q_+(t) . \tag{14}$$
Thus the stability properties of the target states clearly depend on the value of δ. In physical terms the phase δ quantifies the phase difference between the periodic state and the time-dependence introduced by the coupling function q+(t). It is in general quite difficult to solve eigenvalue problems like Eq. (14) by analytical means. But for our very special choice of coupling some analytical results can be obtained. First, for the in-phase orbit, i.e. for δ = 0, one realises that the free eigenmode q+(t) is indeed a solution to Eq. (14) provided the inner product ⟨·|·⟩ which appears on the right hand side is time-independent. Thus the characteristic equation (5) is recovered for the mode ν = + and the eigenmode control by Eq. (13) is as efficient as diagonal control. Secondly, for δ ≠ 0 a perturbation expansion may be applied which shows that the control thresholds depend on δ² to leading order [39]. Above all, numerical simulations indicate [41] that pronounced dependencies of the control interval on the phase shift occur. During an initial transient phase the system may select by self-adaptation an optimal value for the phase shift which finally results in a huge increase of the control performance (cf. Fig. 8). Such a mechanism is not tied to our particular coupling scheme but may occur for general time-dependent control loops in autonomous systems.

Fig. 8. Dependence of the phase shift δ/τ on the control amplitude K for eigenmode control. Inset: amplitude ε of the control signal in dependence on the control amplitude for diagonal control (dotted) and eigenmode control (full line). Data obtained from a simulation of a reaction-diffusion model with global coupling (cf. [39] for details)
4 Outlook
The previous sections have demonstrated that several aspects of time-delayed feedback control are by now quite well understood from a local point of view, i.e. analytical approaches for the linear stability analysis are available. There remains, however, a considerable gap in understanding how the coupling of the control forces to the internal degrees of freedom and the properties of the measured signal influence the control performance. Such knowledge would enable us to optimise time-delayed feedback control further. However, such an aspect is not the primary goal of time-delayed feedback methods, since these methods are applied when fancy control schemes fail. Whenever, e.g., extensive manipulations of the experimental system are possible, fancy data processing can be performed or a mathematical model is available, then conventional control techniques borrowed from standard control theory are often superior to time-delayed feedback control. The time-delayed methods just aim at stabilising unknown time-periodic states in systems where only a simple measurement can be performed.

There are of course other aspects of the linear stability which are not fully understood. They are related to the introduction of additional time scales through periodic modulation of the control loop. The surprisingly rich structure of the associated Floquet eigenvalue problem has so far prevented a deeper understanding of such control methods. Advances can be expected if a better approach to linear differential-difference equations with time-periodic coefficients becomes available.

Almost nothing is known about global properties of time delay systems, apart from general mathematical statements concerning existence and uniqueness of notions like global attractors, dimensions, Lyapunov exponents, or invariant manifolds [18]. Thus the important question concerning the domain of attraction of particular time-delayed feedback schemes is completely unclear.
Such a problem is intimately related to the infinite-dimensional phase space in which the dynamics of differential-difference equations takes place. Thus no proper visualisation tools are available. As a first step towards a global analysis concerning domains of attraction one may apply weakly nonlinear bifurcation analysis [18,42] to time-delayed feedback control. For instance, sub- and supercritical behaviour determine domains of attraction of the target state in a characteristic way.

From a more general perspective time delay plays a prominent role in many dynamical systems. Models which received considerable interest in the past refer to biological systems following the lines of the seminal work [43]. Unfortunately no concise picture has developed which constitutes a general dynamical theory of differential-difference equations. By linking the field of control of chaos with other recent developments of nonlinear dynamics one may consider some nonstandard applications of time-delayed feedback methods. While synchronisation of dynamical systems already has a lot in common with control problems [44,45], interest in the effect of time delay has recently been revived. One quite surprising result concerns the fact that time-delayed feedback can be used for prediction of the dynamical behaviour [46] and such a concept has already been applied in laser experiments [47]. Furthermore the interplay between time delay and noise, albeit of fundamental importance in time-delayed feedback control, is scarcely explored, mainly because of the non-Markovian character of the underlying dynamics. Thus the developments are still at their beginning. Even for the simple but famous Kramers problem nontrivial features are observed through the introduction of time scales by time delay [48]. These few remarks already indicate that delay systems are still an undiscovered country which deserves exploration from the theoretical as well as the experimental point of view.

Acknowledgements

Many colleagues have contributed to the content of the present work. We are in particular indebted to E. Reibold for conducting most of the experiments, and to J. A. Hołyst, K. Kacperski and J. E. S. Socolar for many fruitful discussions. This work was supported by “Deutsche Forschungsgemeinschaft” through grant nos. JU261/3-1, BE864/4-1, and SFB555.
References
1. H. G. Schuster (Ed.), Handbook of Chaos Control (Wiley-VCH, Berlin, 1999).
2. E. Ott, C. Grebogi, and J. A. Yorke, Phys. Rev. Lett. 64, 1196 (1990).
3. T. Shinbrot, Adv. Phys. 44, 73 (1995).
4. E. R. Hunt, Phys. Rev. Lett. 67, 1953 (1991).
5. K. Ogata, Modern Control Engineering (Prentice-Hall, New York, 1997).
6. H. Nijmeijer and A. van der Schaft, Nonlinear Dynamical Control Systems (Springer, New York, 1996).
7. L. Collatz, ZAMM 25/27, 60 (1947).
8. K. Pyragas, Phys. Lett. A 170, 421 (1992).
9. S. Bielawski, D. Derozier, and P. Glorieux, Phys. Rev. E 49, R971 (1994).
10. T. Pierre, G. Bonhomme, and A. Atipo, Phys. Rev. Lett. 76, 2290 (1996).
11. O. Lüthje, S. Wolff, and G. Pfister, Phys. Rev. Lett. 86, 1745 (2001).
12. P. Parmananda, R. Madrigal, M. Rivera, L. Nyikos, I. Z. Kiss, and V. Gáspár, Phys. Rev. E 59, 5266 (1999).
13. H. Benner and W. Just, J. Kor. Phys. Soc. 40, 1046 (2002).
14. K. Hall, D. J. Christini, M. Tremblay, J. J. Collins, L. Glass, and J. Billette, Phys. Rev. Lett. 78, 4518 (1997).
15. T. Hikihara and T. Kawagoshi, Phys. Lett. A 211, 29 (1996).
16. K. Pyragas and A. Tamasevicius, Phys. Lett. A 180, 99 (1993).
17. W. Just, T. Bernard, M. Ostheimer, E. Reibold, and H. Benner, Phys. Rev. Lett. 78, 203 (1997).
18. J. K. Hale and S. M. Verduyn Lunel, Introduction to Functional Differential Equations (Springer, New York, 1993).
19. R. Bellman, Differential-Difference Equations (Acad. Press, New York, 1963).
20. M. E. Bleich and J. E. S. Socolar, Phys. Lett. A 210, 87 (1996).
21. V. L. Kharitonov and S. I. Niculescu, IEEE Trans. Automat. Contr. 48, 127 (2003).
22. W. Just, E. Reibold, K. Kacperski, P. Fronczak, J. Hołyst, and H. Benner, Phys. Rev. E 61, 5045 (2000).
23. W. Just, E. Reibold, H. Benner, K. Kacperski, P. Fronczak, and J. Hołyst, Phys. Lett. A 254, 158 (1999).
24. H. Nakajima, Phys. Lett. A 232, 207 (1997).
25. S. Bielawski, D. Derozier, and P. Glorieux, Phys. Rev. A 47, R2492 (1993).
26. H. G. Schuster and M. B. Stemmler, Phys. Rev. E 56, 6410 (1997).
27. W. Just, Physica D 142, 153 (2000).
28. K. Pyragas, Phys. Rev. Lett. 86, 2265 (2001).
29. K. Pyragas, Phys. Rev. E 66, 026207 (2002).
30. M. E. Bleich and J. E. S. Socolar, Phys. Rev. E 54, R17 (1996).
31. O. Beck, A. Amann, E. Schöll, J. E. S. Socolar, and W. Just, Phys. Rev. E 66, 016213 (2002).
32. G. Franceschini, S. Bose, and E. Schöll, Phys. Rev. E 60, 5426 (1999).
33. A. Kittel, J. Parisi, and K. Pyragas, Phys. Lett. A 198, 433 (1995).
34. H. Nakajima, H. Ito, and Y. Ueda, IEICE Trans. Fund. Electr. E80A, 1554 (1997).
35. W. Just, J. Möckel, D. Reckwerth, E. Reibold, and H. Benner, Phys. Rev. Lett. 81, 562 (1998).
36. J. E. S. Socolar, D. W. Sukow, and D. J. Gauthier, Phys. Rev. E 50, 3245 (1994).
37. M. E. Bleich, D. Hochheiser, J. V. Moloney, and J. E. S. Socolar, Phys. Rev. E 55, 2119 (1997).
38. W. Just, D. Reckwerth, E. Reibold, and H. Benner, Phys. Rev. E 59, 2826 (1999).
39. N. Baba, A. Amann, E. Schöll, and W. Just, Phys. Rev. Lett. 89, 074101 (2002).
40. A. V. Mamaev and M. Saffman, Phys. Rev. Lett. 80, 3499 (1998).
41. W. Just, S. Popovich, A. Amann, N. Baba, and E. Schöll, Phys. Rev. E 67, 026222 (2003).
42. B. F. Redmond, V. G. LeBlanc, and A. Longtin, Physica D 166, 131 (2002).
43. M. C. Mackey and L. Glass, Science 197, 287 (1977).
44. S. Boccaletti, J. Kurths, G. Osipov, D. L. Valladares, and C. S. Zhou, Phys. Rep. 366, 1 (2002).
45. H. Nijmeijer, Physica D 154, 219 (2001).
46. H. U. Voss, Phys. Rev. E 61, 5115 (2000).
47. S. Sivaprakasam, E. M. Shahverdiev, P. S. Spencer, and K. A. Shore, Phys. Rev. Lett. 87, 154101 (2001).
48. L. S. Tsimring and A. Pikovsky, Phys. Rev. Lett. 87, 250602 (2001).
Signals from Clustered Ion Channels

P. Jung, S. Zeng, and J.W. Shuai
Department of Physics and Astronomy and Institute for Quantitative Biology, Ohio University, Athens OH 45701, USA
Abstract. Clustering of ion channels is a common phenomenon, yet it is not well understood. While several explanations for channel clustering have been suggested, we propose a new theory that is based on information theoretic reasoning and is thus generic and may apply very generally.
1 Introduction
Clustering of ion channels is a very common phenomenon in nature. It occurs naturally in myelinated neurons, where the active sodium channels are concentrated at the nodes of Ranvier, acting as a signal booster. But it also occurs in neurons that are not myelinated, e.g. in some neuron types in the retina [1]. In these types of neurons, there is no obvious geometric reason for the clustering. As another example, the release of Ca²⁺ from the endoplasmic (or sarcoplasmic) reticulum is controlled by small clusters of Ca²⁺-release channels that often contain no more than 20–50 channels. While it is not understood why the ion channels are clustered, a number of theories and theoretical models have been put forward. Some of them involve attractive interactions between the channel proteins [2], the organization of the membrane in micro-domains (“rafts”) with different structures where some structures are more likely to host channel proteins than others [3], or the involvement of the cytoskeleton by locally anchoring the channels by a sub-membrane undercoat [1]. A novel idea towards the solution of this puzzle is based on the capability of information transmission of groups of ion channels [4,5,6]. In these papers, it has been shown that small signals may be better detected by smaller clusters. The size of the ion channel clusters determines (for small clusters in a non-trivial way) the magnitude of the fluctuations and thus – via the effect of stochastic resonance – a small signal can be enhanced. Intracellular calcium signaling is based on the release of calcium from intracellular stores by small clusters of release channels that may not include more than 20–50 channels. We propose that the small size of the cluster enhances the calcium release stimulated by small numbers of agonist molecules binding to the receptors. A simple stochastic theory predicts optimal cluster sizes that are compatible with experimental results.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 605–616, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Optimal Cluster Sizes in Neuronal Ion Channel Clusters

Since there is a great variety in the distribution and kinds of ion channels relevant for the transmission of electrical signals in the nervous system, we select as model channels those that have been identified in the squid axon (although they are not clustered there). Most important are the sodium and potassium channels. Each potassium channel has four identical subunits (gates) that are either closed or open. The opening and closing rates $\alpha_K(v)$ and $\beta_K(v)$, respectively, for each subunit are given by [7]

\[ \alpha_K(v) = \frac{0.01\,(10 - v)}{\exp\left((10 - v)/10\right) - 1}, \qquad \beta_K(v) = 0.125\,\exp\left(-\frac{v}{80}\right), \tag{1} \]

where $v$ refers to the cross-membrane potential. The opening and closing processes of the subunits are assumed to be Markovian and described by the two-state master equation for the open probability $p_n$ of a subunit,

\[ \dot{p}_n(t) = -\left(\alpha_K(v) + \beta_K(v)\right) p_n(t) + \alpha_K(v). \tag{2} \]
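As a minimal sketch of Eqs. (1) and (2), the following Python snippet evaluates the potassium rates and the relaxation of the open probability at a fixed voltage. The function names are illustrative and not taken from the paper; the closed-form relaxation simply solves Eq. (2) for constant $v$, and the removable singularity of $\alpha_K$ at $v = 10$ mV is filled in with its limit.

```python
import math

def alpha_k(v):
    """Opening rate of one potassium subunit, Eq. (1); v in mV, rate in 1/ms."""
    x = (10.0 - v) / 10.0
    # x / (exp(x) - 1) has a removable singularity at x = 0 (v = 10 mV); its limit is 1.
    ratio = 1.0 if abs(x) < 1e-9 else x / math.expm1(x)
    return 0.1 * ratio

def beta_k(v):
    """Closing rate of one potassium subunit, Eq. (1); v in mV, rate in 1/ms."""
    return 0.125 * math.exp(-v / 80.0)

def open_probability(v, t, p0):
    """Solution of Eq. (2) at fixed v: exponential relaxation from p0
    towards the stationary value alpha / (alpha + beta)."""
    a, b = alpha_k(v), beta_k(v)
    p_inf = a / (a + b)
    return p_inf + (p0 - p_inf) * math.exp(-(a + b) * t)

# Stationary open probability of one subunit at v = 0 mV.
p_subunit = open_probability(0.0, 1.0e4, 0.0)
# If the four subunits are treated as independent (an assumption of this sketch),
# the stationary probability that the whole channel is open is the fourth power.
p_channel = p_subunit ** 4
```

At fixed voltage the relaxation rate is $\alpha_K + \beta_K$, so the subunit forgets its initial state on a time scale of $1/(\alpha_K + \beta_K)$.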
The entire channel is open when all four subunits are open. The sodium channel is composed of three identical (fast) subunits that, similar to the potassium channel subunits, tend to open when the cross-membrane voltage increases, i.e.

\[ \alpha^f_{Na}(v) = \frac{0.1\,(25 - v)}{\exp\left((25 - v)/10\right) - 1}, \qquad \beta^f_{Na}(v) = 4.0\,\exp\left(-\frac{v}{18}\right), \tag{3} \]

with the corresponding two-state master equations for the state of the gates ($n = 1, 2, 3$),

\[ \dot{q}_n(t) = -\left(\alpha^f_{Na}(v) + \beta^f_{Na}(v)\right) q_n(t) + \alpha^f_{Na}(v). \tag{4} \]

The sodium channel also comprises a (slow) inactivation gate that tends to close with increasing trans-membrane voltage. The opening and closing rates of the inactivation gate are given by

\[ \alpha^s_{Na}(v) = 0.07\,\exp\left(-\frac{v}{20}\right), \qquad \beta^s_{Na}(v) = \frac{1}{\exp\left((30 - v)/10\right) + 1}, \tag{5} \]

with the associated two-state master equation describing the state of the inactivation gate,

\[ \dot{q}_4(t) = -\left(\alpha^s_{Na}(v) + \beta^s_{Na}(v)\right) q_4(t) + \alpha^s_{Na}(v). \tag{6} \]
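To illustrate why the two kinds of sodium gates are called fast and slow, the sketch below evaluates Eqs. (3) and (5) and the corresponding stationary values and relaxation times $1/(\alpha + \beta)$ that follow from Eqs. (4) and (6) at fixed voltage. Function names are hypothetical, chosen only for this example.

```python
import math

def alpha_na_fast(v):
    """Opening rate of a fast sodium gate, Eq. (3); v in mV, rate in 1/ms."""
    x = (25.0 - v) / 10.0
    # 0.1 (25 - v) / (exp((25 - v)/10) - 1) = x / (exp(x) - 1), with limit 1 at x = 0.
    return 1.0 if abs(x) < 1e-9 else x / math.expm1(x)

def beta_na_fast(v):
    """Closing rate of a fast sodium gate, Eq. (3)."""
    return 4.0 * math.exp(-v / 18.0)

def alpha_na_slow(v):
    """Opening rate of the inactivation gate, Eq. (5)."""
    return 0.07 * math.exp(-v / 20.0)

def beta_na_slow(v):
    """Closing rate of the inactivation gate, Eq. (5)."""
    return 1.0 / (math.exp((30.0 - v) / 10.0) + 1.0)

def stationary_value_and_tau(alpha, beta):
    """Fixed point and relaxation time (in ms) of a two-state master equation."""
    return alpha / (alpha + beta), 1.0 / (alpha + beta)

v = 0.0  # mV
q_fast, tau_fast = stationary_value_and_tau(alpha_na_fast(v), beta_na_fast(v))
q_slow, tau_slow = stationary_value_and_tau(alpha_na_slow(v), beta_na_slow(v))
```

At $v = 0$ the inactivation gate relaxes more than an order of magnitude more slowly than the three fast gates, which is the separation of time scales the text refers to.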
A sodium channel is open when all three fast gates are open and the inactivation gate is open. In all equations above, the trans-membrane potential is measured in millivolts (mV), and the time is measured in milliseconds (ms). A cluster of ion channels is made up of a number $N_K$ of potassium channels and $N_{Na}$ of sodium channels that are close enough that they all share the same (not fixed) trans-membrane potential. To obtain a rough estimate of the allowed distances we employ linear cable theory. Assuming that the cluster under consideration is located on an axon and that the axon (a long thin object) can be treated as a one-dimensional cable, the cable equation for the transmembrane voltage $v(x, t)$ along the cable ($x$-direction) reads

\[ \tau\,\frac{\partial v(x,t)}{\partial t} = -I_{ion} R_m + \lambda_m^2\,\frac{\partial^2 v(x,t)}{\partial x^2}, \tag{7} \]
where for the giant squid axon (our model system), $\lambda_m = 0.65$ cm, $R_m = 10^3\ \Omega\,\mathrm{cm}^2$ and $\tau = 1$ ms. The term $I_{ion}$ contains all specific ionic transmembrane currents and a leakage current that lumps all other ionic currents. The linear extent of an ion-channel cluster with a shared trans-membrane potential should therefore be smaller than the typical length scale of the system, $\lambda_m = 0.65$ cm. Since a two-dimensional "cable equation" would factorize into two cable equations, the cluster area should be much less than $\lambda_m^2 \approx 0.4\ \mathrm{cm}^2$ for the approximation of uniform voltage to apply. In this paper we only consider cluster sizes of up to several hundred µm² and thus stay well within the regime where a shared transmembrane potential is a very good approximation. Thus, within one ion channel cluster, the diffusion term on the right-hand side of the cable equation vanishes and we are left with an ordinary differential equation for the transmembrane potential, which we rewrite as

\[ C\,\frac{dv}{dt} = -g_K (v - v_K) - g_{Na} (v - v_{Na}) - g_l (v - v_l) + I_{ext}, \tag{8} \]
with the conductances of the potassium system, sodium system and leakage system given by $g_K$, $g_{Na}$ and $g_l$, respectively. $C$ denotes the membrane capacitance and $v_K$, $v_{Na}$ and $v_l$ the Nernst potentials of the ionic systems. Assuming that the densities of the potassium channels $\rho_K$ and sodium channels $\rho_{Na}$ are homogeneous throughout the cluster of area $A$, we can express $g_K$ and $g_{Na}$ in terms of the conductances of single open channels $\gamma_K$, $\gamma_{Na}$, i.e.

\[ \frac{g_K}{A} = \frac{N_K^{open}\,\gamma_K}{A} = \frac{N_K^{open}}{N_K}\,\rho_K \gamma_K, \qquad \frac{g_{Na}}{A} = \frac{N_{Na}^{open}\,\gamma_{Na}}{A} = \frac{N_{Na}^{open}}{N_{Na}}\,\rho_{Na} \gamma_{Na}. \tag{9} \]
Dividing (8) by the cluster area, one finds

\[ C\,\frac{dv}{dt} = -\frac{N_K^{open}}{N_K}\,\rho_K \gamma_K (v - v_K) - \frac{N_{Na}^{open}}{N_{Na}}\,\rho_{Na} \gamma_{Na} (v - v_{Na}) - g_l (v - v_l) + I_{ext}. \tag{10} \]
The fraction of open channels can be obtained at each time step by using a Monte-Carlo type technique put forward in [8].
As a result of this Monte-Carlo type procedure, the number of open channels (potassium and sodium) is determined and can be inserted into Eq. (10), which is then integrated over one time step of 10 µs using a first-order solver. For the channel densities we use $\rho_{Na} = 60/\mathrm{\mu m}^2$ and $\rho_K = 20/\mathrm{\mu m}^2$. Subsequent updating of the channel states and integration of Eq. (10) leads to the membrane voltage as a function of time. A typical trajectory is shown in Fig. 1.

[Figure 1: membrane potential v versus time [ms]]
Fig. 1. A spike train generated spontaneously by 1000 potassium and 3000 sodium channels in one cluster. The fluctuations of the membrane potential during quiescent intervals are about 10 mV, i.e. more than 10% of the actual resting potential of about −70 mV.
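The update scheme described above (stochastic gate update followed by a first-order step of Eq. (10)) can be sketched as follows. This is only an illustration under stated assumptions: the gate update draws one uniform random number per gate and time step, which is a simple substitute and not necessarily the technique of [8]; the products $\rho\gamma$, the leak conductance, the reversal potentials and the capacitance are not given in the text and are set here to the standard Hodgkin-Huxley values (voltage measured relative to rest). Only the channel densities and the 10 µs time step are taken from the text.

```python
import math
import numpy as np

# Channel densities quoted in the text (per square micrometre).
RHO_NA, RHO_K = 60.0, 20.0
# ASSUMED values (standard Hodgkin-Huxley parameters, not given in the text):
G_NA_MAX, G_K_MAX, G_L = 120.0, 36.0, 0.3   # mS/cm^2 (rho*gamma for Na and K, leak)
V_NA, V_K, V_L = 115.0, -12.0, 10.6         # mV, relative to rest
C_M = 1.0                                   # uF/cm^2

def _ratio(x):
    """x / (exp(x) - 1) with the removable singularity at x = 0 filled in."""
    return 1.0 if abs(x) < 1e-9 else x / math.expm1(x)

def rates(v):
    """(alpha, beta) for the K gate, fast Na gate and slow Na gate, Eqs. (1), (3), (5)."""
    k = (0.1 * _ratio((10.0 - v) / 10.0), 0.125 * math.exp(-v / 80.0))
    na_fast = (_ratio((25.0 - v) / 10.0), 4.0 * math.exp(-v / 18.0))
    na_slow = (0.07 * math.exp(-v / 20.0), 1.0 / (math.exp((30.0 - v) / 10.0) + 1.0))
    return k, na_fast, na_slow

def flip(gates, alpha, beta, dt, rng):
    """One Markov step: closed gates open with probability alpha*dt,
    open gates stay open with probability 1 - beta*dt."""
    u = rng.random(gates.shape)
    return np.where(gates, u >= beta * dt, u < alpha * dt)

def simulate(area_um2=50.0, t_end_ms=20.0, dt=0.01, i_ext=0.0, seed=0):
    """Membrane potential trace (mV) of one cluster; dt = 0.01 ms = 10 us."""
    rng = np.random.default_rng(seed)
    n_k, n_na = int(round(RHO_K * area_um2)), int(round(RHO_NA * area_um2))
    v = 0.0
    (a_n, b_n), (a_m, b_m), (a_h, b_h) = rates(v)
    # Start every gate from its stationary open probability at rest.
    n = rng.random((n_k, 4)) < a_n / (a_n + b_n)    # four gates per K channel
    m = rng.random((n_na, 3)) < a_m / (a_m + b_m)   # three fast gates per Na channel
    h = rng.random(n_na) < a_h / (a_h + b_h)        # one inactivation gate per Na channel
    steps = int(round(t_end_ms / dt))
    trace = np.empty(steps)
    for i in range(steps):
        (a_n, b_n), (a_m, b_m), (a_h, b_h) = rates(v)
        n = flip(n, a_n, b_n, dt, rng)
        m = flip(m, a_m, b_m, dt, rng)
        h = flip(h, a_h, b_h, dt, rng)
        open_k = n.all(axis=1).mean()               # N_K^open / N_K
        open_na = (m.all(axis=1) & h).mean()        # N_Na^open / N_Na
        dvdt = (-open_k * G_K_MAX * (v - V_K) - open_na * G_NA_MAX * (v - V_NA)
                - G_L * (v - V_L) + i_ext) / C_M
        v += dt * dvdt                              # first-order (Euler) step of Eq. (10)
        trace[i] = v
    return trace
```

With the default area of 50 µm² the cluster contains 1000 potassium and 3000 sodium channels, the numbers quoted in the caption of Fig. 1. Tracking every gate individually is the simplest choice; it costs memory proportional to the number of channels, which is unproblematic for the cluster sizes considered here.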
2.1 Spontaneous Firing Rates
In Fig. 2, the average time interval between two successive action potentials is shown as a function of the cluster size. For small cluster sizes, the firing rate increases with increasing cluster size. Only after the cluster size exceeds about 1 µm² does an increasing cluster size result in a decreasing spontaneous firing rate.

2.2 Variance and Firing Rates
In Fig. 3, we show the standard deviation of the firing interval normalized by the average firing interval as a function of the cluster size. This measure is called the Fano factor $\eta$ and is sometimes used to describe the regularity of the spike train. For vanishing synaptic noise this measure exhibits a minimum roughly where the firing rate exhibits a maximum.

[Figure 2: $\langle T \rangle$ [ms] versus area (µm²)]
Fig. 2. The average interval between two consecutive spontaneous spikes is shown as a function of the cluster size N

[Figure 3: $\eta$ versus area (µm²)]
Fig. 3. Form of the Fano factor $\eta = \langle (T - \langle T \rangle)^2 \rangle / \langle T \rangle^2$ as a function of the cluster size N

2.3 Response to Weak Signals
In the limit of large cluster sizes, the deterministic Hodgkin-Huxley equations require the amplitude of injected currents to exceed a threshold (which in general depends on the frequency content of the signal). A signal that does not overcome this threshold will not be encoded in a spike train. Decreasing the cluster size increases the fluctuations of the membrane potential (as is evident from Fig. 2). When the membrane voltage fluctuations add favorably to the (sub-threshold) external signal, encoding takes place as a random sampling of the subthreshold signal. When the fluctuations become too large, i.e. the area of the cluster too small, the fluctuations of the transmembrane potential dominate the signal and the neuronal spike train mostly encodes the noise and not the signal.
[Figure 4: four panels showing ISIH versus time [ms], labelled A = 0.5 µm², A = 50 µm², A = 100 µm² and A = 250 µm²]
Fig. 4. The interspike interval histograms (ISIH) are shown for a subthreshold sinusoidal signal (11) for cluster areas of 0.5 µm², 5 µm², 100 µm² and 250 µm²
This effect is demonstrated in Fig. 4, where we show the interspike-interval histograms (ISIH) of the neuronal spike train at various cluster sizes when an external current of the form

\[ I_{ext} = A \cos\left(\frac{2\pi t}{T}\right) \tag{11} \]

is applied, where $A = 2\ \mathrm{\mu A/cm^2}$ and $T = 60$ ms. A peak of the ISIH at the period of the external signal indicates encoding of the signal. Since the ISIHs in Fig. 4 are normalized, the height of the peak (if any) at $T = 60$ ms can be used as a measure of encoding. For the injected signal under consideration, the best encoding takes place at a cluster area of about 100 µm². Another remarkable finding is that the ISIH at small cluster sizes (see the upper left panel of Fig. 4) exhibits an additional peak at very small intervals, indicating that the concept of a refractory period, which is characteristic of macroscopic deterministic systems, seems to collapse.
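The quantities discussed in this section (interspike intervals, their normalized histogram, and the normalized spread of the intervals) can be obtained from a voltage trace along the following lines. This is a hedged sketch: spike detection by upward threshold crossing, the threshold value, the bin width and the normalization of the histogram to unit sum are assumptions of this example, since the text does not specify them.

```python
import numpy as np

def spike_times(trace, dt, threshold):
    """Times (in units of dt) at which the trace crosses the threshold from below."""
    above = np.asarray(trace) >= threshold
    idx = np.flatnonzero(~above[:-1] & above[1:]) + 1
    return idx * dt

def isih(intervals, bin_width, t_max):
    """Interspike-interval histogram, normalized to unit sum (ASSUMED normalization)."""
    edges = np.arange(0.0, t_max + bin_width, bin_width)
    counts, edges = np.histogram(intervals, bins=edges)
    total = counts.sum()
    hist = counts / total if total > 0 else counts.astype(float)
    return hist, edges

def regularity(intervals):
    """Mean interval and standard deviation of the intervals divided by the mean."""
    mean = np.mean(intervals)
    return mean, np.std(intervals) / mean

# Synthetic, perfectly periodic test trace with a 62 ms period (illustration only).
t = np.arange(0.0, 300.0, 0.1)
trace = 100.0 * (np.sin(2.0 * np.pi * t / 62.0) > 0.99)
times = spike_times(trace, 0.1, 50.0)
intervals = np.diff(times)
hist, edges = isih(intervals, 5.0, 100.0)
mean_interval, spread = regularity(intervals)
```

For a spike train locked to a periodic signal the histogram concentrates in the bin containing the signal period, which is the signature of encoding used in Fig. 4.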
3 Optimal Cluster Sizes for Intracellular Ca²⁺ Signaling

Many important cellular functions are regulated by intra- and intercellular Ca²⁺ signals. They are involved in the insulin production of pancreatic β-cells [9], in the enzyme secretion in liver cells (for a review, see e.g. [10]) and in the early response to injury of brain tissue [11] and corneal epithelia [12]. Recent insights into the biophysical mechanism of intracellular Ca²⁺ release have revealed that the actual release sites are discrete and as small as about 100 nm, indicating that mesoscopic methods are necessary for realistic models of Ca²⁺ signaling. Consequences of the discreteness of the release clusters for Ca²⁺ wave formation have been explored in [13] and [14]. In this paper, we show that the clustering of the release channels can resonantly enhance the sensitivity of the calcium signaling pathway by exploiting internal fluctuations.

Most of the Ca²⁺ that constitutes the signal is released from intracellular stores such as the endoplasmic reticulum (ER) into the intracellular space through the inositol 1,4,5-trisphosphate (IP₃) receptor. The IP₃ receptor (IP₃R) is modeled [15] by three identical subunits that each have three binding sites: one for the messenger molecule IP₃ (m gate), one activating site (n gate) for Ca²⁺ and one inactivating site (h gate) for Ca²⁺. In order for a subunit to conduct Ca²⁺, only the IP₃ and the activating Ca²⁺ binding sites need to be occupied. The entire IP₃R is conducting if all three subunits are conducting. The Ca²⁺ binding site invokes an autocatalytic mechanism of Ca²⁺ release (Ca²⁺-induced Ca²⁺ release), giving rise to a rapidly increasing intracellular Ca²⁺ concentration if the concentration of IP₃ exceeds a certain threshold. When the inactivating Ca²⁺ binding sites become occupied and the IP₃Rs close, the Ca²⁺ pumps remove Ca²⁺ from the intracellular space, which is necessary since elevated concentrations of Ca²⁺ are toxic for the cell. Once the Ca²⁺ concentration is low and IP₃ is present in sufficient concentration, calcium-induced calcium release will again rapidly increase intracellular calcium levels, giving rise to oscillatory calcium signals. The oscillatory nature of the Ca²⁺ signals suggests that the primary information content of the Ca²⁺ signals is their frequency [16].
In previous work it has been reported that global IP₃-mediated Ca²⁺ signals can be resolved into localized Ca²⁺ release events due to clustered distributions of IP₃Rs [17], with only a few tens of IP₃Rs per cluster and a size of about 100 nm, indicating that thermal open-close transitions of single IP₃Rs are essential. Observations of signals of differing magnitudes first suggested a hierarchy of calcium signalling events, with smaller blips representing fundamental events involving the opening of a single IP₃R and the larger sparks or puffs being elementary events resulting from the opening of small groups of IP₃Rs [18,17]. Recordings with improved spatial and temporal resolution, however, have revealed that there is no clear distinction between fundamental and elementary events [17,19]. It is suggested that the localized calcium release varies in a continuous fashion due to stochastic variation in both the number of channels recruited and the durations of channel openings.

Our study is based on the Li-Rinzel model [20], a two-variable simplification of the DeYoung-Keizer model [15] in which the fast variables $m$, $n$ have been replaced by their quasi-equilibrium values $m_\infty$ and $n_\infty$. According to this model, the calcium flux from the ER to the intracellular space is driven by the Ca²⁺ gradient, i.e.

\[ \frac{d[\mathrm{Ca}^{2+}]}{dt} = -I_{Ch} - I_P - I_L, \tag{12} \]

\[ \frac{dh}{dt} = \alpha_h (1 - h) - \beta_h h, \tag{13} \]

with

\[ I_{Ch} = c_1 v_1 m_\infty^3 n_\infty^3 h^3 \left([\mathrm{Ca}^{2+}] - [\mathrm{Ca}^{2+}]_{ER}\right), \tag{14} \]

\[ I_P = \frac{v_3 [\mathrm{Ca}^{2+}]^2}{k_3^2 + [\mathrm{Ca}^{2+}]^2}, \tag{15} \]

\[ I_L = c_1 v_2 \left([\mathrm{Ca}^{2+}] - [\mathrm{Ca}^{2+}]_{ER}\right). \tag{16} \]
Here, $[\mathrm{Ca}^{2+}]$ denotes the intracellular Ca²⁺ concentration, $[\mathrm{Ca}^{2+}]_{ER}$ the Ca²⁺ concentration in the ER, and $h$ a slow inactivation variable. $I_{Ch}$ denotes the Ca²⁺ efflux from intracellular stores through IP₃R channels, $I_P$ the ATP-dependent Ca²⁺ flux from the intracellular space back to the stores, and $I_L$ the leak flux. The slow Ca²⁺ inactivation process depends on both the concentration of IP₃ and Ca²⁺ via the rate constants

\[ \alpha_h = a_2 d_2 \frac{[\mathrm{IP}_3] + d_1}{[\mathrm{IP}_3] + d_3}, \qquad \beta_h = a_2 [\mathrm{Ca}^{2+}]. \tag{17} \]
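A compact way to see how Eqs. (12)–(17) fit together is to write them as a right-hand-side function. The sketch below does this and adds a plain forward-Euler integrator; the parameter values are the ones quoted for this model in the text, while the function names, the Euler scheme, the time step and the initial condition are assumptions made only for this illustration.

```python
# Parameter values as quoted for the model in the text (concentrations in uM, time in s).
C0, C1 = 2.0, 0.185
V1, V2, V3 = 6.0, 0.11, 0.9
K3 = 0.1
D1, D2, D3, D5 = 0.13, 1.049, 0.9434, 0.08234
A2 = 0.2

def fluxes(ca, h, ip3):
    """Channel, pump and leak fluxes, Eqs. (14)-(16)."""
    ca_er = (C0 - ca) / C1            # conservation of total Ca2+
    m_inf = ip3 / (ip3 + D1)
    n_inf = ca / (ca + D5)
    i_ch = C1 * V1 * m_inf**3 * n_inf**3 * h**3 * (ca - ca_er)
    i_p = V3 * ca**2 / (K3**2 + ca**2)
    i_l = C1 * V2 * (ca - ca_er)
    return i_ch, i_p, i_l

def gate_rates(ca, ip3):
    """Opening and closing rates of the inactivation gate, Eq. (17)."""
    return A2 * D2 * (ip3 + D1) / (ip3 + D3), A2 * ca

def rhs(ca, h, ip3):
    """Right-hand sides of Eqs. (12) and (13)."""
    i_ch, i_p, i_l = fluxes(ca, h, ip3)
    alpha_h, beta_h = gate_rates(ca, ip3)
    return -i_ch - i_p - i_l, alpha_h * (1.0 - h) - beta_h * h

def integrate(ip3, t_end=100.0, dt=0.01, ca0=0.1, h0=0.5):
    """Forward-Euler integration (ASSUMED scheme); returns the list of [Ca2+] values."""
    ca, h = ca0, h0
    out = []
    for _ in range(int(round(t_end / dt))):
        dca, dh = rhs(ca, h, ip3)
        ca, h = ca + dt * dca, h + dt * dh
        out.append(ca)
    return out
```

Because the ER concentration exceeds the cytosolic one, $I_{Ch}$ and $I_L$ are negative and therefore raise $[\mathrm{Ca}^{2+}]$ in Eq. (12), while the pump term $I_P$ lowers it.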
The other parameters are $m_\infty = [\mathrm{IP}_3]/([\mathrm{IP}_3] + d_1)$, $n_\infty = [\mathrm{Ca}^{2+}]/([\mathrm{Ca}^{2+}] + d_5)$, $c_1 = 0.185$, $v_1 = 6\ \mathrm{s}^{-1}$, $v_2 = 0.11\ \mathrm{s}^{-1}$, $v_3 = 0.9\ \mathrm{\mu M\,s}^{-1}$, $k_3 = 0.1\ \mathrm{\mu M}$, $d_1 = 0.13\ \mathrm{\mu M}$, $d_2 = 1.049\ \mathrm{\mu M}$, $d_3 = 0.9434\ \mathrm{\mu M}$, $d_5 = 0.08234\ \mathrm{\mu M}$, and $a_2 = 0.2\ \mathrm{\mu M}^{-1}\mathrm{s}^{-1}$. The total amount of Ca²⁺ is conserved via the Ca²⁺ concentration in the ER, $[\mathrm{Ca}^{2+}]_{ER} = (c_0 - [\mathrm{Ca}^{2+}])/c_1$ with $c_0 = 2.0\ \mathrm{\mu M}$. The concentration of IP₃, denoted by $[\mathrm{IP}_3]$, is a control parameter.

The form of Eq. (13) suggests that the inactivation process for each IP₃R can be modeled as a stochastic process where $h = 1$ describes the open IP₃R and $h = 0$ describes the closed IP₃R (i.e. no calcium current through the IP₃R), constituting the stochastic Li-Rinzel model. The power three of $h$ in Eq. (14) reflects the three subunits of the IP₃R and thus three inactivation h gates. Each gate can be in two states, the open (unbound) and the closed (bound) state. Since the h gates are the slowest gates, we assume that switching between the two states can be approximated by a two-state Markov process with the opening rate $\alpha_h$ and the closing rate $\beta_h$. The IP₃R is conducting if all three h sites are unbound. The Ca²⁺ flux through the IP₃Rs in the kinetic model is then given by the modified form of Eq. (14),

\[ I_{Ch} = c_1 v_1 m_\infty^3 n_\infty^3\, \frac{N^{h\text{-}open}}{N} \left([\mathrm{Ca}^{2+}] - [\mathrm{Ca}^{2+}]_{ER}\right), \tag{18} \]
where $N$ and $N^{h\text{-}open}$ indicate the total number of IP₃Rs and the number of h-open receptors in the cluster, respectively. Eqs. (12)–(17) represent the deterministic limit of the stochastic scheme Eqs. (12), (15)–(18) for a large number $N$ of channels. The release of Ca²⁺ in the stochastic Li-Rinzel model is a collective event of a number of globally coupled channels (coupled via the common Ca²⁺ concentrations) with stochastic opening and closing dynamics. Each gate is simulated explicitly as a two-state Markov process with opening and closing rates $\alpha_h$ and $\beta_h$, respectively. In the deterministic limit (i.e. $N \to \infty$), the two-variable Li-Rinzel model has one stable fixed point for $[\mathrm{IP}_3] < 0.354\ \mathrm{\mu M}$ and for $[\mathrm{IP}_3] > 0.642\ \mathrm{\mu M}$. At $[\mathrm{IP}_3] = 0.354\ \mathrm{\mu M}$ and $[\mathrm{IP}_3] = 0.642\ \mathrm{\mu M}$ Hopf bifurcations occur, so that $[\mathrm{Ca}^{2+}]$ oscillates for $0.354\ \mathrm{\mu M} < [\mathrm{IP}_3] < 0.642\ \mathrm{\mu M}$ (Fig. 5a).

[Figure 5: (a) [Ca²⁺] (µM) versus [IP₃] (µM), with regimes I, II and III marked; (b) [Ca²⁺] versus time [s] for [IP₃] = 0.3 µM, 0.5 µM and 0.8 µM]
Fig. 5. The bifurcation diagram of the deterministic Li-Rinzel model (a) and calcium signals generated by a cluster of 20 IP₃Rs (b)

Under normal conditions
$[\mathrm{IP}_3]$ is below the critical value 0.354 µM and the deterministic model with a fixed point does not permit calcium signaling. In Fig. 5b, traces of a Ca²⁺ signal released from a cluster with 20 IP₃Rs are shown for three values of $[\mathrm{IP}_3]$ in the three deterministically distinguished regimes I, II, III (see Fig. 5a). The Ca²⁺ signals consist of stochastic sequences of Ca²⁺ release events (calcium puffs) in all three regimes (I, II, III) with a continuum of amplitudes and durations. The regimes I, II and III are not well distinguishable for these small clusters. Most importantly for the purpose of this paper, the Ca²⁺ puffs for $[\mathrm{IP}_3] < 0.354\ \mathrm{\mu M}$ constitute a Ca²⁺ signal with a frequency content. To determine the degree of periodicity of the Ca²⁺ released from a cluster, we compute the normalized power spectrum

\[ S(\omega) = \frac{1}{T}\, \frac{\left| \int_0^T \left( [\mathrm{Ca}^{2+}](\tau) - \langle [\mathrm{Ca}^{2+}] \rangle \right) \exp(-2\pi i \omega \tau)\, d\tau \right|^2}{\left\langle \left( [\mathrm{Ca}^{2+}] - \langle [\mathrm{Ca}^{2+}] \rangle \right)^2 \right\rangle}, \tag{19} \]

where the length of the observation interval $T$ is 5000 s for all data presented in this paper. In Fig. 6, we show the normalized power spectra $S(\omega)$ at various sizes $N$ of the release cluster. For very small clusters (e.g. $N = 2$ in Fig. 6a) and very large clusters (e.g. $N = 10{,}000$ in Fig. 6c), the power spectrum does not exhibit a peak and thus the release of Ca²⁺ is dominated by stochastic events. In between, however, a peak in the power spectrum (Fig. 6b) indicates periodicity in calcium release. The strength of the peak is characterized by the elevation of the peak $\Delta S$, which is shown in Fig. 7 as a function of the size of the cluster $N$ for $[\mathrm{IP}_3] = 0.25\ \mathrm{\mu M}$. The elevation of the power spectrum goes through a maximum at $N \approx 20$. Typical recorded values of $[\mathrm{IP}_3]$ range between 0.15 µM and 0.25 µM. In this context it is interesting to note that the coherence for $[\mathrm{IP}_3] = 0.25\ \mathrm{\mu M}$ peaks at $N = 20$, which is considered a realistic cluster size (see also [21]). To summarize, the overall coherence of the Ca²⁺ signal exhibits a maximum at a cluster size that depends on the concentration of IP₃. For IP₃ concentrations closer to the Hopf bifurcation ($[\mathrm{IP}_3] = 0.354\ \mathrm{\mu M}$), the maximum coherence is achieved for larger clusters of IP₃Rs and vice versa.

Fig. 6. Power spectra $S(\omega)$ of the Ca²⁺ signal released by clusters of (a) N = 2, (b) N = 150, and (c) N = 10,000 IP₃Rs at [IP₃] = 0.30 µM

[Figure 7: ∆S [s^{-1/2}] versus N, panel labelled [IP₃] = 0.25 µM]
Fig. 7. The elevation of the power spectrum ∆S as a function of N at [IP₃] = 0.25 µM

Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant No. IBN-0078055.
References
1. C. Hildebrand and S. Waxman, Brain Research 258, 23 (1983).
2. C. Guo and H. Levine, Biophysical Journal 77, 2358 (1999).
3. H. Grassme, V. Jendrossek, J. Bock, A. Riehle, E. Gulbins, Journal of Immunology 168, 298 (2002); J. R. Cochran, D. Aivazian, T. O. Cameron, L. J. Stern, Trends in Biochemical Sciences 26, 304 (2001); G. Vereb, J. Matko, G. Vamosi, S. M. Ibrahim, E. Magyar, S. Varga, J. Szollosi, A. Jenei, R. Gaspar, T. A. Waldmann, S. Damjanovich, PNAS 97, 6013 (2000); S. Damjanovich, L. Bene, J. Matko, L. Matyus, Z. Krasznai, G. Szabo, C. Pieri, R. Gaspar, J. Szollosi, Biophysical Chemistry 82, 99 (1999).
4. P. Jung and J. W. Shuai, Europhys. Lett. 56, 29 (2001).
5. G. Schmid, I. Goychuk and P. Hänggi, Europhys. Lett. 56, 22 (2001).
6. J. W. Shuai and P. Jung, Phys. Rev. Lett. 88, 681021 (2002).
7. A. L. Hodgkin and A. F. Huxley, J. Physiol. (London) 117, 500 (1952).
8. E. Schneidman, B. Freedman, I. Segev, Neural Computation 10, 1679 (1998).
9. T. R. Chay and J. Keizer, Biophys. J. 42, 181 (1983).
10. G. Dupont, S. Swillens, C. Clair, T. Tordjmann, and L. Combettes, Biochimica et Biophysica Acta 1498, 134 (2000).
11. A. H. Cornell-Bell, S. M. Finkbeiner, M. S. Cooper, and S. J. Smith, Science 247, 470 (1990).
12. V. E. Klepeis, A. H. Cornell-Bell, and V. Trinkaus-Randall, J. Cell Sci. 114, 4185 (2001).
13. M. Falcke, L. Tsimring and H. Levine, Phys. Rev. E 62, 2636 (2000).
14. J. Keizer and G. D. Smith, Biophysical Chemistry 72, 87 (1998).
15. G. W. De Young and J. Keizer, Proc. Natl. Acad. Sci. USA 89, 9895 (1992).
16. P. De Koninck and H. Schulman, Science 279, 227 (1998); R. E. Dolmetsch, K. Xu, and R. S. Lewis, Nature 392, 933 (1998).
17. M. Bootman, E. Niggli, M. Berridge, and P. Lipp, The Journal of Physiology 499, 307 (1997).
18. P. Lipp and E. Niggli, Journal of Physiology 508, 801 (1996); H. Cheng, W. J. Lederer, and M. B. Cannell, Science 262, 740 (1993).
19. X. Sun, N. Callamaras, J. S. Marchant, and I. Parker, Journal of Physiology 509, 67 (1998); D. Thomas, P. Lipp, M. J. Berridge, and M. D. Bootman, J. Biol. Chem. 273, 27130 (1998); J. S. Marchant and I. Parker, The EMBO Journal 20, 65 (2001); L. L. Haak, L. Song, T. F. Molinski, I. N. Pessah, H. Cheng, and J. T. Russell, Journal of Neuroscience 21, 3860 (2001).
20. Y. Li and J. Rinzel, J. Theor. Biol. 166, 461 (1994).
21. S. Swillens, G. Dupont, L. Combettes, and P. Champeil, PNAS 96, 13750 (1999).
Modeling of Metals and Metal Sponges via Embedded Particle Computer Simulation
Martin Kröger
Polymer Physics, Material Sciences, ETH Zürich, CH-8092 Zürich, Switzerland

Abstract. In this article we review the embedded atom method as adapted to study solid friction and the mechanical behavior of model metals. The method incorporates the effect of electronic glue through effective many-body potentials. The elastic properties of real metals are reproduced by a set of basic model potentials, as revealed by analytic considerations. A slightly modified version of a classical Non-Equilibrium Molecular Dynamics (NEMD) computer simulation is employed to study the dynamics and structural changes of the model metal undergoing a process of solid friction and a uniaxial compression, in order to analyze, e.g., plastic yield, transient friction coefficients, and the underlying structure. Under an appropriate choice of parameters, the model is also applicable to the study of porous metals.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 617–632, 2003. © Springer-Verlag Berlin Heidelberg 2003

1 Introduction

In the early 1980s the theoretical investigation of hydrogen embrittlement gave rise to a systematic modeling and simulation approach towards generic models for metals. A fundamental understanding of the atomistic processes involved in hydrogen embrittlement had been impossible, largely because of difficulties in the theory of such complicated systems. Traditional monoscale approaches such as ab initio techniques had proven to be inadequate, even with the largest supercomputers, because of the range of scales and the prohibitively large number of atoms involved. The embedded atom method overcomes the fundamental limitation of past methods such as pair potentials and yet is practical enough for the calculation of defects, surfaces, and impurities in metals on multiple scales. Ab initio methods are still incapable of handling the large numbers of atoms required to represent fracture. Even the capacity of one-electron methods [1] falls far short of the number of atoms required to simulate fracture. The use of pairwise interactions greatly increases the number of atoms that can be treated [2,3], but requires a volume-dependent term to represent the bulk compressibility of the electron gas [4]. Volume dependence restricts the use of pair potentials to situations where the volume is definable; during fracture it is not. The quasiatom theory [5,6] (or effective medium theory) had been used successfully to calculate the characteristics of hydrogen in metals. In [7] the theory was generalized to treat all atoms in a unified way [8,9]. The method is called the 'embedded particle method' because it views each particle as embedded in a
host lattice consisting of all other particles. Such a view permits calculations employing an electron density, which is always definable, and allows a realistic treatment of impurities in structures that include cracks, surfaces, impurities, and alloying additions [7,9]. The generalized method is not significantly more complicated to use than pair potentials; it has been used in a number of recent works, see e.g. [10,11,12,13,14,15,16].

Solution of the Schrödinger equation yields the electron density established by a given potential, and the energy is a functional of that potential. Hohenberg and Kohn [17] showed the converse: the energy is a functional of the density, and the potential is determined to within an additive constant by its electron density. Scott and Zaremba [5] proved that the energy of an impurity in a host is a functional F of the electron density of the unperturbed (i.e., without impurity) host. That is, the embedding energy of an impurity is determined by the electron density of the host before the impurity is added. The embedded atom method makes use of this by viewing each atom in a system as an impurity in the host consisting of all other atoms. The functional F is universal, independent of the host. Its form, however, is unknown. A simple approximation is to assume that the embedding energy depends only on the environment immediately around the impurity [6], or equivalently that the impurity experiences a locally uniform electron density [5]. This can be viewed either as a local approximation, or as the lowest-order term of an expansion involving successive gradients of the density. The functional is then approximated by a function of the electron density at the impurity site plus an electrostatic interaction Φ, and the total energy is written as a sum over all individual contributions. Because the functions F and Φ are not known in general, experimental data had been used in [7] to determine these functions.

Both functions are required to fit elastic properties such as the Cauchy discrepancy. In the early works [7], F was taken from electron gas computations provided in [18]. We will review a set of particularly simple and generic model potentials [19,20], motivated by semiempirical calculations, which are expected to describe the mechanical properties of metals ranging from the nanometer scale (asperities) up to the macroscopic scale (solid friction, metal foams). The model is characterized by a few model parameters such as reference energy, time and length. Properties of the model are obtained in terms of these reference values. Constitutive relationships between stress and deformation obtained from this model can serve as input to conventional solvers when complex geometries are considered. It will be shown that elastic properties of real metals are reproduced by the set of basic model potentials, as revealed by analytic considerations. The goal is to analyze preliminary results of the novel model and the range of its application in a qualitative fashion. We further demonstrate that, and how, the proposed model for a bulk metal can be used to study metal sponges and porous metal structures.

Metal sponges and foams show some potential for being produced with controlled spatial variations in their density. This suggests employing them as graded materials in space-filling lightweight structures in analogy to cortical bone, a natural cellular material that displays increased density in regions of high loading. Most mechanical and physical properties are affected by the porosity as well as by the size of the pores and the thickness of the structured studs of the metal sponge. The past few years have seen increasing interest in porous metallic materials, especially in foams made of aluminum or aluminum alloys. The stimulus for this lies in recent process developments which promise materials with better quality and lower cost. Moreover, the environment for the application of new materials of this type has greatly changed. Nowadays higher demands for passenger safety in automobiles or for easy materials recycling make metal foams and sponges attractive where, a few years ago, the same materials would have been ruled out for technical or economic reasons [21]. The mechanical performance of metal foams governs their utility in various applications, such as cores for ultralight sandwich panels/shells, as well as crash or blast absorbing systems. Macroscopic stress/strain characteristics establish their performance. Most important are the stiffness, the yield strength and the "plateau" stress at which the material compresses. Once these have been measured, continuum mean-field constitutive relations can be implemented [22] for the study of structural competitiveness, cf. [23] and references cited therein. Such homogenized treatments are applicable to structures that encompass many cells. For structures embracing few cells, recognition must be given to a material length scale that depends on the microscopic mechanisms of deformation at the cell level.

To clarify notation: due to the generic choice of model potentials in the following, the spatial coordinate of a 'particle' may represent either the position of a 'model nucleus' or the position of a spatially localized number of nuclei. Positions of the electrons are not considered explicitly; their motion will be effectively captured through the embedding density and the embedding functional.
2 The Framework
In order to study the mechanical properties of a model metal, including solid friction, uniaxial compression, shear deformation of bulk metals, and also porous metal structures, we adapt the embedded atom method in the spirit of Holian et al. [24]. The method, originated by Daw and Baskes [7], reflects the fact that in metals the conduction electrons are not localized about the nuclei; the energy depends upon the local electron density, resulting in forces between particles that are many-body in character, rather than simply pairwise additive [25]. Accordingly one considers two contributions to the potential energy E of the whole system made up of N particles [19]. There is a conventional binary interaction term through a two-body interaction potential Φ as a function of the distance between interaction sites and a term
stemming from an embedding functional F, which produces the effect of the electronic 'glue' between interaction sites,

\[ E_b = N e_b = \sum_{i=1}^{N} \Big[ \frac{1}{2} \sum_{j \neq i}^{N} e_{ij}\, \Phi(r_{ij}) + F(\rho_i) \Big], \tag{1} \]

where $e_b = E_b/N$ represents a 'binding energy' per particle. The quantity F is a nonlinear function of the (local) embedding density $\rho_i$ of atoms $i = 1, \ldots, N$. It is constructed from the radial coordinates of surrounding particles and requires the choice of a weighting function $w(r)$:

\[ \rho_i = \sum_{j} w_{ij}\, w(r_{ij}) = w(0) + \sum_{j \neq i} w_{ij}\, w(r_{ij}), \tag{2} \]

where $\mathbf{r}_{ij} \equiv \mathbf{r}_i - \mathbf{r}_j$ is the relative position vector between particle coordinates $\mathbf{r}_i$ and $\mathbf{r}_j$, and $r_{ij}$ is its magnitude. For the study of bulk metals the coefficients $e_{ij}$ and $w_{ij}$ are irrelevant and set to unity. For the case of solid friction, where two metals are in contact, they become relevant and model the properties of the interface (contact zone). They allow one to specify the strength of interaction between particles belonging to the same and to different materials, i.e., we have $w_{ij} = w_{aa} = w_{bb}$ = fixed for particles i, j of the same metal, and $w_{ij} = w_{ab}$ = fixed for a pair i, j of 'different' species. The same potentially applies to $e_{ij}$. If not explicitly specified, we have $w_{ij} = e_{ij} = 1$ in the following. Parts of F linear in $\rho_i$ can be combined with the original repulsive two-particle potential to an effective potential Φ, used here, which also has an attractive part. Corrections involving gradients of the density can be included in (1) [5]; these do not modify the form of Eq. (1). It is implicitly assumed that the electron density is given by a linear superposition of the electron densities of the constituent atoms [26]. Ground-state properties of the solid can be calculated from (1) in a straightforward way (e.g., by using the conjugate gradients technique).

2.1 Particular Choice of Model Potentials
A simple choice for the model functions Φ, w and F leads to our generic model metal, denoted as EMB in the following. For the binary potential function Φ we use a radially symmetric short-ranged attractive (SHRAT) potential Φ(r),

\[ \Phi(r) = \phi_0\, r_0^{-4} \left[ 3 (h - r)^4 - 4 (h - r_{min}) (h - r)^3 \right], \tag{3} \]

for $r \le h$ and $\Phi(r) = 0$ otherwise, with an energy scale $\phi_0$, a length scale $r_0$, and an interaction range $h$. The minimum of the potential is located at the distance $r = r_{min} = 2^{1/6} r_0 \approx 1.122\, r_0$, as for a LJ potential, and the well depth of the potential is $-\Phi(r_{min}) = \phi_0\, r_0^{-4} (h - r_{min})^4$. Properties of the (pure) SHRAT model system in its gaseous, (metastable) liquid, and solid states
have been computed recently by molecular dynamics and compared with analytical calculations in [27]. The elastic behavior of the SHRAT model have been characterized by the bulk and shear moduli, and their corresponding Born-Green and fluctuation contributions. Stick-slip behavior, and the detailed elastic response and plastic flow of the model solid had been analyzed, and the transition from the elastic to the plastic behavior has been approximately described by a generalized Maxwell-Kelvin-Voigt model for the stress tensor [27], For reasons discussed in [24,28] we use the normalized Lucy’s weight function in the definition (2) of the embedding density, i.e., r r 3 1− , (4) w(r) = w0 1 + 3 h h for r ≤ h with a prefactor obtained by normalizing the weight function, w0 = w(0) = 105/(16πh3). The particular simple parabolic embedding potential for EMB is
(5) F (ρ) = F0 φ0 r06 (ρ − ρdes )2 − (w0 − ρdes )2 + . . . , where ρdes is the desired embedding number density and F0 is the embedding strength; the dots denote higher order terms in (ρ − ρdes ) which may be considered in order to obtain more than a qualitative agreement between theory and experiments with respect to the quantities listed in Table 1. Other forms for the embedding functional had been used, cf. [9]. Throughout this manuscript we investigate the ‘basic’ model metal for which particle number density n ≡ N/V = r0−3 , interaction range h = 1.6 r0 and the temperature kB T = 0.01φ0 are fixed. For the study of bulk metals the desired embedding density equals the particle number density, i.e., ρdes = n, for the case of a metal sponge we choose ρdes > n. Table 1. Reference values for a set of metals, including the model metal EMB (all parameters n, ρdes , F0 are equal to unity), together with reported values for the elastic modulus E, the shear modulus G and the elastic anisotropy cani ≡ 2c44 /(c11 − c12 ); the coefficients cij denote Voigt moduli [34], taken from [19] metal Ag Cu EMB Fe Ni
2.2
nref 59.4 85.5 72.5 85.2 92.0
nm−3 nm−3 nm−3 nm−3 nm−3
Tref 34.3 40.6 40.0 49.8 51.8
kK kK kK kK kK
Pref 28 GPa 48 GPa 40 GPa 59 GPa 65 GPa
101 156 260 232 239
E GPa GPa GPa GPa GPa
38 59 95 94 101
G GPa GPa GPa GPa GPa
cani 2.99 3.19 2.56 2.32 2.39
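The model functions (3) to (5) are simple enough to check numerically. The following is a minimal sketch in reduced units ($\phi_0 = r_0 = 1$, h = 1.6), not the implementation used for the simulations reported here; the function names and the trapezoidal normalization check are illustrative assumptions.

```python
import math

# Reduced units (phi0 = r0 = 1) and h = 1.6, as quoted in the text.
PHI0, R0, H = 1.0, 1.0, 1.6
R_MIN = 2.0 ** (1.0 / 6.0) * R0           # position of the potential minimum
W0 = 105.0 / (16.0 * math.pi * H**3)      # normalization of the Lucy weight


def shrat(r):
    """SHRAT pair potential, Eq. (3); zero beyond the cutoff h."""
    if r > H:
        return 0.0
    d = H - r
    return PHI0 / R0**4 * (3.0 * d**4 - 4.0 * (H - R_MIN) * d**3)


def lucy_weight(r):
    """Normalized Lucy weight function, Eq. (4); zero beyond h."""
    if r > H:
        return 0.0
    x = r / H
    return W0 * (1.0 + 3.0 * x) * (1.0 - x) ** 3


def embedding(rho, rho_des=1.0, f0=1.0):
    """Parabolic embedding potential, Eq. (5), higher-order terms dropped."""
    return f0 * PHI0 * R0**6 * ((rho - rho_des) ** 2 - (W0 - rho_des) ** 2)


def weight_norm(n=20000):
    """Trapezoidal estimate of the integral of w(r) over 3D space."""
    dr = H / n
    total = 0.0
    for k in range(n + 1):
        r = k * dr
        f = 4.0 * math.pi * r * r * lucy_weight(r)
        total += f * (0.5 if k in (0, n) else 1.0)
    return total * dr


if __name__ == "__main__":
    print("well depth:", -shrat(R_MIN), "expected:", (H - R_MIN) ** 4)
    print("norm of w:", weight_norm())
    print("F at rho = w0:", embedding(W0))
```

The sketch reproduces the well depth $(h - r_{\min})^4$ at $r_{\min}$ and confirms that the prefactor $w_0$ normalizes the weight function to unity. With this form of F, the embedding energy vanishes at $\rho = w_0$, i.e., at the density of a single isolated particle if the self-contribution w(0) is counted.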
2.2 Scaled Quantities and Elastic Properties of EMB
To compare with experimental data one has to estimate reference values for the dimensionless simulation quantities $Q_{\rm dimless}$. For any measurable quantity Q with a dimension specified in the SI units kg, m and s one has $Q = Q_{\rm dimless}\, Q_{\rm ref}$ and $Q_{\rm ref} = m^{\alpha+\gamma/2}\, r_0^{\beta+\gamma}\, \phi_0^{-\gamma/2}$ for $[Q] = {\rm kg}^{\alpha}\, {\rm m}^{\beta}\, {\rm s}^{\gamma}$, where m, $r_0$ and $\phi_0$ provide the scales via the binary interaction potential (3) and the equations of motion. The reference values for length r, number density n, energy $k_B T$ (and binding energy per particle $e_b$), pressure P and the elastic moduli in terms of the simulation parameters are therefore $r_{\rm ref} = r_0$, $n_{\rm ref} = r_0^{-3}$, $e_{b,{\rm ref}} = \phi_0 = k_B T_{\rm ref}$ and $P_{\rm ref} = \phi_0\, r_0^{-3} = n_{\rm ref}\, e_{b,{\rm ref}}$. On the other hand, reference values for real masses $m_{\rm ref}$, densities $n_{\rm ref}$ and binding energies $e_b$ can be inferred from the literature, see Table 1 for sample values. For Ag, e.g., one obtains the model parameters $r_0 = n_{\rm ref}^{-1/3} = 2.56$ Å, $\phi_0 = 47.4 \times 10^{-18}$ J and $m = 1.790 \times 10^{-21}$ kg. The values for the moduli of the model metal EMB at n = 1 and $F_0 = 1$ are obtained from deformations of a perfect fcc crystal. For plots of the binding energy and pressure for perfect fcc and bcc lattices see Fig. 1. By choosing $T_{\rm ref} = \phi_0/k_B \equiv 40$ kK and $P_{\rm ref} \equiv 40$ GPa one arrives at $n_{\rm ref} = 72.5$ nm$^{-3}$ and $r_0 = 2.4$ Å. For $m = 1.790 \times 10^{-21}$ kg the reference time is $t_{\rm ref} = r_0\, (m/e_{b,{\rm ref}})^{1/2} = 1.3 \times 10^{-11}$ s for EMB. The squared reference time $t_{\rm ref}^2$ of the model system therefore scales linearly with the mass of a particle, or of an agglomerate of particles if the agglomerate is considered as the reference unit, and $t_{\rm ref}$ may be of the order of seconds for the adapted model for porous metal structures to be discussed below. The behavior of a particular material can be described in detail by considering an additional model parameter in (5), but this is not done here. The values of the elastic anisotropy and the shear modulus are within the expected ranges for fcc metals; the elastic modulus is slightly higher for EMB. The force acting on particle i is obtained by variation of the energy (1), $\delta E_b = -F_i \cdot \delta r_i$, and yields
$$F_i = -\sum_{j \neq i} \left[ e_{ij}\, \frac{\partial \Phi}{\partial r_{ij}} + w_{ij} \left( \left. \frac{\partial F}{\partial \rho} \right|_i + \left. \frac{\partial F}{\partial \rho} \right|_j \right) \frac{\partial w}{\partial r_{ij}} \right], \qquad (6)$$

where $\partial F/\partial \rho|_i$ is evaluated with the embedding density $\rho_i$. The equations of motion were integrated by a velocity-Verlet algorithm with the force (6). The parameters h, T, $\rho_{\rm des}$ and the average particle density n = N/V are fixed by the model. The influence of the embedding functional F is estimated by varying its strength $F_0$. A flow simulation introduces further independent variables, which describe the geometry and strength of the flow. For details of a NEMD simulation algorithm, the implementation of Lees-Edwards boundary conditions and periodic images, the homogeneous temperature control under shear and elongational flows, and the evaluation of the stress tensor, see, e.g., Refs. [29,30,31,19]. The simple model metal EMB is explicitly determined by the set of model potentials, and is solved without approximations with a computational effort of order N.

Fig. 1. The binding energy $e_b$ per particle (left) and the hydrostatic pressure (right) vs density (both in LJ units) at h = 1.6 for ideal fcc and bcc lattices and two different amplitudes of the embedding functional: $F_0$ = 0, 1 (reprinted from [20])
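The force evaluation and the velocity-Verlet update described above can be sketched for a small free cluster as follows. This is a hedged illustration, not the production NEMD code. It assumes reduced units with unit masses, bulk values $e_{ij} = w_{ij} = 1$, no periodic images, and an embedding density that includes the self-contribution w(0); all function names are hypothetical.

```python
import math

# Reduced units (phi0 = r0 = m = 1); h = 1.6 and rho_des = F0 = 1 as in the text.
H = 1.6
R_MIN = 2.0 ** (1.0 / 6.0)
W0 = 105.0 / (16.0 * math.pi * H**3)
F0, RHO_DES = 1.0, 1.0


def phi(r):
    """SHRAT pair potential, Eq. (3)."""
    d = H - r
    return 3.0 * d**4 - 4.0 * (H - R_MIN) * d**3 if r < H else 0.0


def dphi(r):
    """Radial derivative of the SHRAT potential."""
    d = H - r
    return -12.0 * d**3 + 12.0 * (H - R_MIN) * d**2 if r < H else 0.0


def w(r):
    """Lucy weight function, Eq. (4)."""
    x = r / H
    return W0 * (1.0 + 3.0 * x) * (1.0 - x) ** 3 if r < H else 0.0


def dw(r):
    """Radial derivative of the Lucy weight function."""
    x = r / H
    return -12.0 * W0 * x * (1.0 - x) ** 2 / H if r < H else 0.0


def pairs(pos):
    """Yield (i, j, unit vector from j to i, distance) for pairs within h."""
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            rij = [a - b for a, b in zip(pos[i], pos[j])]
            r = math.sqrt(sum(c * c for c in rij))
            if r < H:
                yield i, j, [c / r for c in rij], r


def densities(pos):
    """Embedding densities; the self term w(0) is included (assumption)."""
    rho = [W0] * len(pos)
    for i, j, _, r in pairs(pos):
        rho[i] += w(r)
        rho[j] += w(r)
    return rho


def energy(pos):
    """Embedding energy plus pair energy, with e_ij = w_ij = 1."""
    embed = sum(F0 * ((x - RHO_DES) ** 2 - (W0 - RHO_DES) ** 2)
                for x in densities(pos))
    return embed + sum(phi(r) for _, _, _, r in pairs(pos))


def forces(pos):
    """Forces from Eq. (6) with e_ij = w_ij = 1."""
    rho = densities(pos)
    d_embed = [2.0 * F0 * (x - RHO_DES) for x in rho]  # dF/drho at each particle
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for i, j, u, r in pairs(pos):
        s = dphi(r) + (d_embed[i] + d_embed[j]) * dw(r)
        for k in range(3):
            f[i][k] -= s * u[k]
            f[j][k] += s * u[k]
    return f


def velocity_verlet(pos, vel, dt):
    """One velocity-Verlet step for unit masses; returns new (pos, vel)."""
    f_old = forces(pos)
    pos = [[x + v * dt + 0.5 * a * dt * dt for x, v, a in zip(p, q, g)]
           for p, q, g in zip(pos, vel, f_old)]
    f_new = forces(pos)
    vel = [[v + 0.5 * (a + b) * dt for v, a, b in zip(q, g, e)]
           for q, g, e in zip(vel, f_old, f_new)]
    return pos, vel
```

The double loop over pairs costs of order N² operations and is meant for small test systems only; the order-N effort quoted above requires cell or neighbor lists, which are not shown. A useful consistency check is to compare `forces` with a central finite difference of `energy`.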
3 Viscoplastic Behavior of EMB
When a metal is subjected to stress, it responds by deforming. If only small stresses are applied, the material returns to its original shape when the stress is relieved. In this regime, metals are elastic. If, however, the stress exceeds a threshold, the metal suffers permanent plastic deformation. Usually, the kinematics of a three-dimensional continuum, the thermodynamics of materials, and the physics of microscopic defects enter the description of plastic phenomena. A principal feature of plastic behavior is irrecoverable deformation. On a microscopic level, this behavior is caused by defects in the atomic structure: the formation of dislocations, dislocation pile-up at grain boundaries, polycrystallinity, and anisotropy. A general thermodynamic treatment of phenomenological models has been given, e.g., by Green and Naghdi [32]. Here we report results obtained by the complementary approach. NEMD computer simulations are performed to study the dynamics and structural changes of the model metal EMB undergoing elongation and shear deformation. The results for the case of shear deformation reveal the influence of the initial crystal orientation on the transient flow behavior, the formation of shear bands and dislocations, and the general rate dependence of metal flow behavior in the viscoplastic (strong) flow regime. A profile-unbiased temperature control mechanism [33] is used here. Figure 2 shows a time series for a subsystem of a cubic cell with N = 44000 particles undergoing shear at two different densities, n = 1 and n = 1.02, and at small and intermediate rates, γ̇ = 0.001 and γ̇ = 0.01, respectively. The snapshots (incl. structure analysis) show a cut (width r = 1) of the full system; the direction of shear is depicted in the top right snapshot. Structure is recognized by the method presented in [36].
The top left and centered graphs correspond to an equal amount of deformation γ̇t; the left column is for a system with a shear rate larger by a factor of 10 than for the remaining columns. The right column is for a density slightly larger than for the other columns. Shear-induced breakup of structure is observed. For example, at
the highest density a polycrystalline conformation with rotating entities is observed (and finally a bcc-dominated structure), whereas for smaller densities a more homogeneous evolution (towards fcc) is visible from the plots.

[Fig. 2 panels, columns from left to right: γ̇ = 0.01, n = 1.00 at t = 50, 2500, 3000; γ̇ = 0.001, n = 1.00 at t = 50, 500, 4000; γ̇ = 0.001, n = 1.02 at t = 50, 500, 4000]
Fig. 2. Snapshots (incl. structure analysis) of an EMB metal subjected to simple shear deformation at shear rate γ̇, particle number density n, and time t (all in reduced units). The snapshots show a cut (width r = 1) of the full system; the direction of shear is depicted in the top right snapshot. Structure recognized by the method presented in [36] is encoded as follows: fcc (open spheres), bcc (diamonds), hcp (hexagon within sphere, or bold sphere), and isotropic (filled sphere). Shear-induced breakup of structure is observed. For example, at the highest density a polycrystalline conformation with rotating entities is observed (and finally a bcc-dominated structure), whereas for smaller densities a more homogeneous evolution (towards fcc) is visible from the plots [20]

A structural analysis and structure factors for a sample snapshot at density
n = 1 and shear rate γ̇ = 0.01 are provided by Fig. 3. The transient shear stress for this sample exhibits an overshoot at tγ̇ ≈ 5 before reaching a stationary value at tγ̇ ≳ 50.

[Fig. 3 panels: cross-sections labeled by flow, gradient and vorticity direction (axis ranges −20 to 20 and −10 to 10), with legend entries bcc, fcc, hcp and defect; bottom: two structure-factor projections (axis range 0 to 36)]
Fig. 3. NEMD snapshots of the model metal EMB (N = 16400 particles, n = $\rho_{\rm des}$ = $F_0$ = 1) indicating the type of local structure at t = 2000 after the start of steady shear flow with shear rate γ̇ = 0.01 (flow, gradient and vorticity directions are specified in the figure). The start configuration is an fcc lattice. Cross-sections of the system, one length unit wide, are presented. The method to analyze fcc, bcc, hcp and icosahedral structure is described (and software is provided) in [36]. In the bottom part of the figure the structure factor (see, e.g., [30]) for the same system, projected onto two specified planes, is plotted [20]

The model metal has been uniaxially compressed at a constant elongation rate ε̇ = 0.01. Two snapshots are given in Fig. 4. During compression the number of layers decreases. We observe spontaneous symmetry breaking, see the inset (top view) of Fig. 4. The often quoted ‘theoretical value’ for the critical
penetration hardness $\sigma_c$ should be $\sigma_c = c\,G$ with the shear modulus G and $c = c_{\rm th} = 1/10$. The experimentally observed factor is smaller, $c_{\rm exp} = 10^{-3}$ to $10^{-2}$. Our preliminary result, $c_{\rm sim} \approx 1/50$, is estimated from the simulated normal stress $\sigma_{yy}$.

Fig. 4. NEMD snapshots of EMB (N = 16800, n = $\rho_{\rm des}$ = $F_0$ = 1) at times t = 10 and t = 90 after inception of uniaxial elongational flow (with rate ε̇ = 0.01 in the y-direction) of an ideal fcc lattice [20]

3.1 Solid Friction
The understanding of friction between two solid surfaces can be traced back to the experiments and descriptions offered by Leonardo da Vinci and Charles Coulomb [34]. Accordingly, sliding two bodies in the presence of solid-solid contacts, cf. Fig. 5, requires a friction force F whose magnitude is proportional to the load N ⊥ F normal to the interface, i.e., F = µN with a dimensionless friction coefficient µ. Since the magnitude of the friction force is independent of the apparent contact area A between the bodies, the established picture is that the shear stress is built up within a small number of contact zones (asperities). Their effectively interacting area increases during a load-induced plastic flow. Typically, the contact zones occupy a small part of the total area, e.g. a fraction $10^{-4}$ of A, and the area per contact zone is of the order of (10 µm)$^2$. For the case of sliding friction we simulate a contact zone at relative motion in the x-direction, with a load and shear gradient in the y-direction, i.e., the load is related to a normal pressure $p_{yy}$, also called penetration hardness $\sigma_c \equiv p_{yy} = N/A$ at the onset of plastic flow. The shear stress $\tau_{xy} = F/A$ or, alternatively, the shear component of the friction pressure tensor $p_{xy} = -\tau_{xy}$ is ‘measured’. According to the ‘friction rule’ F = µN introduced above one
therefore has $\tau = \mu\, \sigma_c$ or $p_{xy} = -\mu\, p_{yy}$ inside contact zones. In order to simulate an interface, we extended the model to specify the strength of interaction between particles i, j belonging to the same and to different materials, i.e., we introduce two factors $e_{ij}$ and $w_{ij}$ in front of $\Phi(r_{ij})$ and $w(r_{ij})$ in the EMB model equations (1,2). The default values inside the bulk are $e_{ij} = w_{ij} = 1$. For two perfect fcc metal blocks with 2056 particles (and periodic boundary conditions in the plane) undergoing solid friction at constant overall shear rate, see Fig. 6. Notice that at times t = 20 and t = 2000, 16 and 14 layers of particles are observed, respectively, due to the reorganization of the initially 100-oriented crystal to its final (preferred 111) orientation.

[Fig. 5 labels: F_norm, F_reib, v; length scales 1 µm and 10 µm]
Fig. 5. Schematic drawing of two metals in contact (load N ≡ $F_{\rm norm}$, friction force F ≡ $F_{\rm reib}$) where the relative velocity v is given. Shear stress is built up within a small number of contact zones (asperities)

[Fig. 6 panels: t = 20, t = 150, t = 2000; axes x, y; label fcc]
Fig. 6. NEMD snapshots of EMB (N = 16800, n = $\rho_{\rm des}$ = $F_0$ = 1) at times t = 20, t = 150 and t = 2000 after inception of shear flow with shear rate γ̇ = 0.01 of two commensurate ideal fcc lattices in contact, modeling the process of solid friction. The interfacial parameter is $w_{ab}$ = 1.5. The figure shows only a part (2D cut with thickness 3 in reduced units) of the whole system [20]

The simulated transient behavior of the friction coefficients for
four model metals is plotted in Fig. 7 (left); it covers experimentally observed behavior [35]. For the presented data the total simulation time was of the order of $10^{-10}$ to $10^{-9}$ s and hence smaller than the minimum ‘life time’ of a contact zone, $10^{-7}$ s, which is determined by the typical size of a zone (10 µm) divided by a high velocity, e.g., 100 m/s. The dependence of the friction coefficient on $w_{ab}$ is analyzed in the right-hand part of Fig. 7.

[Fig. 7 panels. Left: friction coefficient µ (0 to 2.0) vs time (LJ units, 0 to 2000) for F0 = 0, e_ab = 1.5; F0 = 1, e_ab = 1.5; F0 = 1, w_ab = 1.5; F0 = 1, w_ab = 1.5 (prerelaxed). Right: coefficient of friction for EMB (0 to 1.5) vs w_ab (0.5 to 2.0) at F0 = 1, n = 1, T = 0.01, shear rate γ̇ = 0.01]
Fig. 7. Left: The transient friction coefficient µ of the model metal EMB at n = 1 vs time for four different settings: (a) $F_0$ = 0, $e_{ab}$ = 1.5, (b) $F_0$ = 1, $e_{ab}$ = 1.5, (c,d) $F_0$ = 1, $w_{ab}$ = 1.5. The simulation runs (a) to (c) were started from an ideal fcc lattice; run (d) was started from an equilibrium (pre-relaxed) sample. The results have been averaged over 10 independent initial configurations (with respect to the initial distribution of velocities for cases (a) to (c)). The results reveal the influence of the interface and the initial condition on the friction behavior. Right: Influence of the interfacial parameter $w_{ab}$ on the friction coefficient of EMB [19]
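The relation $p_{xy} = -\mu\, p_{yy}$ and the life-time estimate used above amount to two short formulas. The sketch below shows how a friction coefficient could be extracted from sampled pressure-tensor components. The function names and the sample values are hypothetical and are not simulation data; taking the ratio of time averages is one possible convention among several.

```python
def friction_coefficient(p_xy, p_yy):
    """mu from p_xy = -mu * p_yy (shear and normal pressure components)."""
    if p_yy == 0.0:
        raise ValueError("normal pressure must be nonzero")
    return -p_xy / p_yy


def mean_friction_coefficient(p_xy_samples, p_yy_samples):
    """Ratio of time-averaged shear stress to time-averaged normal pressure."""
    mean_xy = sum(p_xy_samples) / len(p_xy_samples)
    mean_yy = sum(p_yy_samples) / len(p_yy_samples)
    return friction_coefficient(mean_xy, mean_yy)


def contact_lifetime(zone_size, sliding_velocity):
    """Minimum life time of a contact zone: zone size / sliding velocity."""
    return zone_size / sliding_velocity


if __name__ == "__main__":
    # Hypothetical pressure-tensor samples in reduced units (not simulation data).
    p_xy = [-0.9, -1.1, -1.0]
    p_yy = [2.0, 2.1, 1.9]
    print("mu =", mean_friction_coefficient(p_xy, p_yy))
    # Values quoted in the text: 10 micrometre zone, sliding at 100 m/s.
    print("life time [s] =", contact_lifetime(10e-6, 100.0))
```

With the zone size and velocity quoted in the text, `contact_lifetime` returns the $10^{-7}$ s used above as the lower bound for the life time of a contact zone.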
4 Porous Metal Structure
A variation of the model potentials introduced above serves to study metal sponges, as will be demonstrated in this section. See [21] for an introduction to this field. The modification concerns the controlled mismatch between the preferred local and the global embedding number densities. In order to model a porous metal we choose a (bulk, cell wall) density $\rho_{\rm des} = 1$ (reduced units) larger than the overall number density n ≡ N/V = 0.3; the cut-off radius h of Φ is set to h = 1.6, and the temperature is fixed to T = 0.01. This setting allows for the study of the microscopic foundation of i) the cell shape and diameter, cell wall thickness, further structural parameters and the formation dynamics of cellular metals, and ii) the mechanical (elastoplastic) behavior of metal foams, in order to correlate them with the foam (sponge) structure, e.g., porosity, inhomogeneity, and cell size for given foam and bulk densities of a chosen model metal. If $\rho_{\rm des} > n$ for a given particle density n, the model metal tends to microphase-separate such that the local embedding density approaches the desired value. As a result, holes surrounded by metal are formed; the metal stays connected owing to the properties of the glue, and the surface tension seems to be high compared to a simple LJ fluid at the same parameters, cf. Fig. 8 for the formation (equilibration) step of an EMB model sponge. A result for a larger system consisting of 1,048,576 particles is shown in Fig. 9. A comparison with systems which are smaller by a factor of 10 to 20 confirms that the sponge structure is quantitatively independent of system size above N ≈ 10000 particles under the current conditions. Three pictures out of an animation for a nonequilibrium sponge (subjected to shear) are shown in Fig. 10. The series illustrates the effect of the implemented embedded particle potentials on the gluey attributes of EMB compared with a simple LJ fluid. Another illustrative example is shown in Fig. 11 for a free-standing EMB sponge subjected to finite strain. Open questions concern the self-similarity of structures upon changing the system size and the mechanical properties of the sponge as a function of density. In applications, foams are usually produced with 0.05-0.20 porosity. For our model metal, the porosity is roughly equal to the ratio between particle density and desired embedding density, $n/\rho_{\rm des}$. Our preliminary studies on large systems such as in Fig. 9 reveal that, for given bulk density $\rho_{\rm des}$, total volume V, and foam density n, the ratio between cell wall thickness s and cell diameter d behaves as $s/d \approx n/(3\rho_{\rm des})$. For the number density p of pores we simply have $p \propto d^{-3} \propto (n/(s\rho_{\rm des}))^3$. It remains to be shown how these relations alter in the course of the dynamics of the formation step.

Fig. 8. Equilibration of an EMB metal sponge at T = 0.01 (N = 2048, n = 0.3, $\rho_0$ = 1) obtained via MD. Initial configuration: fcc lattice (not shown). Snapshots taken at t = 40 (left) and t = 5000 (right). A hole formed inside the sponge is visible in the projection [20]

Fig. 9. Metal sponge at rest consisting of 1,048,576 particles after an equilibration period of 50 reduced time units (initial configuration: perfect fcc lattice). Parameters: temperature T = 0.01, preferred local embedding number density $\rho_0$ = 1, and overall particle number density n = 0.3. Plotted are all particles located within a common layer of width 3% of the full simulation cell [20]

Fig. 10. EMB metal sponge with N = 2048 particles, at temperature T = 0.01, particle number density n = 0.3, and embedding density $\rho_0$ = 1 > n, subjected to shear deformation with rate γ̇ = 0.01. Snapshots taken at times t = 20, 100, 500 (from left to right) [20]

Fig. 11. Zoom into the system shown in Fig. 9 in a nonequilibrium situation. Metal sponge at rest (left, subsystem consisting of 55296 particles) and subjected to finite shear deformation (tγ̇ = 2, right), obtained via MD (same particles)
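The geometric relations quoted above can be turned into a quick estimate. One simple picture that leads to the same ratio, offered here only as a plausibility argument and not as the derivation behind the simulation result, is a packing of cubic cells of edge d whose six faces of thickness s are each shared between two cells: the solid volume fraction is then $3s/d$, and equating it with $n/\rho_{\rm des}$ gives $s/d = n/(3\rho_{\rm des})$. The function names below are hypothetical.

```python
def wall_to_cell_ratio(n, rho_des):
    """s/d from the quoted estimate s/d ~ n / (3 rho_des)."""
    return n / (3.0 * rho_des)


def cell_diameter(wall_thickness, n, rho_des):
    """Cell diameter d for a given wall thickness s."""
    return wall_thickness / wall_to_cell_ratio(n, rho_des)


def pore_density_ratio(n_a, n_b, rho_des=1.0):
    """p(n_a) / p(n_b) at fixed wall thickness, using p ~ d**-3."""
    d_a = cell_diameter(1.0, n_a, rho_des)
    d_b = cell_diameter(1.0, n_b, rho_des)
    return (d_b / d_a) ** 3


if __name__ == "__main__":
    # Parameters of the sponge studied in the text: n = 0.3, rho_des = 1.
    print("s/d =", wall_to_cell_ratio(0.3, 1.0))
    print("d for s = 1:", cell_diameter(1.0, 0.3, 1.0))
    print("p(0.3)/p(0.15) =", pore_density_ratio(0.3, 0.15))
```

For n = 0.3 and $\rho_{\rm des}$ = 1 the estimate gives a wall thickness of about one tenth of the cell diameter; halving n at fixed wall thickness lowers the pore number density by a factor of eight.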
5 Conclusion
The embedded atom method has been adapted to study solid friction and the mechanical behavior of the model metal EMB. The elastic properties of real metals are reproduced by a set of basic model potentials. NEMD computer simulations are performed to study the dynamics and structural changes of the model metal undergoing elastoplastic shear, a process of solid friction, and a uniaxial compression, in order to analyze plastic yield and transient friction coefficients, where the stress during sliding is built up within asperities on the nm scale. Longer simulation runs are needed to determine values for the penetration hardness with high precision, and to analyze the relationship between stress and deformation. It was also demonstrated that a variant of the model metal serves to study large-scale metal foams and porous metal structures.
Acknowledgements This work has been performed under the auspices of the Sfb 448 of the Deutsche Forschungsgemeinschaft. A fruitful cooperation with I. Stankovic and S. Hess (TU Berlin) is gratefully acknowledged.
References
1. C. F. Melius, C. L. Bisson, and W. D. Wilson, Phys. Rev. B 18, 1647 (1978).
2. R. A. Johnson, Phys. Rev. B 6, 2094 (1972).
3. J. N. Goodier, in Fracture: An Advanced Treatise, Vol. 2, H. Liebowitz, Ed. (Academic, New York, 1968).
4. K. Fuchs, Proc. Roy. Soc. London, Ser. A 153, 622 (1936); 157, 444 (1936).
5. M. J. Stott and E. Zaremba, Phys. Rev. B 22, 1564 (1980).
6. J. K. Nørskov, Phys. Rev. B 26, 2875 (1982).
7. M. S. Daw and M. I. Baskes, Phys. Rev. Lett. 50, 1285 (1983); Phys. Rev. B 29, 6443 (1984).
8. K. W. Jacobsen, The effective medium theory, in: Many-atom interactions in solids, R. M. Nieminen, M. J. Puska, and M. J. Manninen (Eds), Proceedings in Physics 48 (Springer, Berlin, 1990) pp. 34-47.
9. R. A. Johnson, Implications of the embedded-atom method format, in: Many-atom interactions in solids, R. M. Nieminen, M. J. Puska, and M. J. Manninen (Eds), Proceedings in Physics 48 (Springer, Berlin, 1990) pp. 85-102.
10. M. C. Desjonquères, D. Spanjaard, C. Barreteau, and F. Raouafi, Phys. Rev. Lett. 88, 056104 (2002).
11. K. Miwa and A. Fukumoto, Phys. Rev. B 65, 155114 (2002).
12. P. Biswas, Phys. Rev. B 65, 125208 (2002).
13. F. J. Cherne, M. I. Baskes, and P. A. Deymier, Phys. Rev. B 65, 024209 (2002).
14. P. Ballo and V. Slugen, Phys. Rev. B 65, 012107 (2002).
15. F. Celestini and J.-M. Debierre, Phys. Rev. E 65, 041605 (2002).
16. A. van de Walle and G. Ceder, Rev. Mod. Phys. 74, 11 (2002).
17. P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964).
18. M. J. Puska, R. M. Nieminen, and M. Manninen, Phys. Rev. B 24, 3037 (1981).
19. M. Kröger and S. Hess, ZAMM 90, Suppl. 1, 48 (2000).
20. M. Kröger, I. Stankovic, and S. Hess, Multiscale Model. Simul. 1, 25 (2003).
21. Metal foams and porous metal structures, Proc. 1st Int. Conf. Metal Foams and Porous Metal Structures (MetFoam’99), Bremen (Germany), 14-16 June 1999.
22. L. J. Gibson, M. F. Ashby, J. Zhang, and T. C. Triantafillou, Int. J. Mech. Sci. 31, 635 (1989).
23. A. F. Bastawros, H. Bart-Smith, and A. G. Evans, J. Mech. Phys. Solids 48, 301 (2000).
24. B. L. Holian, A. F. Voter, N. J. Wagner, R. J. Ravelo, S. P. Chen, W. G. Hoover, C. G. Hoover, J. E. Hammerberg, and T. D. Dontje, Phys. Rev. A 43, 2655 (1991).
25. H. Rafii-Tabar, Phys. Rep. 325, 239 (2000).
26. F. Herman and S. Skillman, Atomic Structure Calculations (Prentice-Hall, New Jersey, 1963).
27. S. Hess and M. Kröger, Techn. Mech. 22, 79 (2002).
28. W. G. Hoover and S. Hess, Physica A 267, 98 (1999).
29. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon, Oxford, 1987).
30. M. Kröger, W. Loose, and S. Hess, J. Rheol. 37, 1057 (1993).
31. M. Kröger, C. Luap, and R. Muller, Macromolecules 30, 526 (1997).
32. A. Green and P. Naghdi, Arch. Rat. Mech. Anal. 18, 251 (1965); J. Eng. Sci. 9, 1219 (1971).
33. W. Loose and S. Hess, Rheol. Acta 28, 91 (1989).
34. F. P. Bowden and D. Tabor, The friction and lubrication of solids, 2nd Ed. (Clarendon Press, Oxford, 1954).
35. Z. Xia, W. A. Curtin, and P. W. M. Peters, Acta Mat. 49, 273 (2001); A. Matuszak, J. Mat. Proc. Tech. 106, 250 (2000); E. Pollak, G. Gershinsky, Y. Georgievskii, and G. Betz, Surf. Sci. 365, 159 (1996).
36. I. Stankovic, M. Kröger, and S. Hess, Comput. Phys. Commun. 145, 371 (2002).
Computer Simulations of the Glass Transition in Polymer Melts Wolfgang Paul Institut für Physik, Johannes Gutenberg-Universität, 55099 Mainz, Germany
Abstract. Computer simulations of model polymers have contributed strongly to our understanding of the glass transition in polymer melts. The ability of the simulation to provide information on quantities that are not directly accessible experimentally, like the entropy of a system or the detailed spatial arrangement of the particles, allows for stringent tests of theoretical concepts about the glass transition and provides additional insight for the interpretation of experimental data.
1 Introduction
Polymers have long played a central role in studies of the structural glass transition because, on the one hand, metastable equilibrium of the supercooled melt is not hard to obtain (for many polymers we do not even know whether a crystalline ground state exists at all), and on the other hand, understanding the properties of glassy polymers is of high technological relevance. One of the important phenomenological observations with glass-forming materials is the seemingly vanishing excess entropy (with respect to the crystalline state) at the Kauzmann temperature $T_K$, the so-called Kauzmann paradox. For polymers this phenomenon could be reproduced within the Gibbs-DiMarzio theory [1], which is in good agreement with results on the thermal glass transition in polymers, for instance the shift of the glass transition temperature $T_g$ with molecular weight. In Sect. 2 we will present a critical test of this theory, which conceives of the structural glass transition as a true thermodynamic transition, through Monte Carlo (MC) simulations of the bond-fluctuation lattice model, and discuss findings on the existence of a typical length scale associated with the glass transition and its dependence on temperature. An opposing view of the glass transition as a purely kinetic phenomenon has been advocated by mode-coupling theory (MCT), which furthermore traces the transition to a crossover in relaxation behavior in the supercooled fluid which occurs about 20% above the calorimetric glass transition (for fragile glass formers which follow a Vogel-Fulcher-Tammann law for the temperature dependence of the viscosity). This crossover happens on microscopic to mesoscopic length and time scales and is therefore well observable on the scales of, for instance, neutron scattering experiments [2] and computer simulations [3]. Molecular Dynamics (MD) simulations of a bead-spring model have been analyzed in great detail within the predictions of MCT and we will present some of the results in Sect. 3. Coarse-grained polymer models like the bead-spring model, however, lack a physical property that determines much of the relaxation behavior in real polymer melts: hindered rotation around chemical bonds, described through dihedral potentials. In Sect. 4 we will discuss MD simulations of 1,4-polybutadiene with and without taking the dihedral potential into account to find out how these potentials influence the local dynamics in polymer melts. Sect. 5 will finally present some conclusions.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 633–645, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Some Results Concerning Thermodynamic Concepts of the Glass Transition
The entropy theory of Gibbs and Di Marzio [1] considers the canonical partition function of K polymer chains of length N on a lattice of volume M = KN + H, where H is the number of holes (the free volume left over by the chains),

$$Z = \sum_E \Omega(E, K, N, M)\, e^{-E/k_B T}, \qquad (1)$$

where E is the internal energy and Ω the number of states for fixed E. In the thermodynamic limit the entropy density of the system, s = ln(Ω)/M, can be considered to be a function of the energy density e = E/M and the monomer density ρ = KN/M. Alternatively one can consider s(T, ρ) by means of a Legendre transform from E to T. Following Flory’s [4] pioneering work, Gibbs and Di Marzio [1] and later Milchev [5] wrote the microcanonical partition function as a product of an intramolecular and an intermolecular part, $\Omega = \Omega_{\rm intra}\, \Omega_{\rm inter}$. The intramolecular part is typically taken as a two-level system for the bond energies, and the intermolecular part is the number of ways K chains of length N can be put on a lattice of M sites. Naturally, this can only be calculated approximately, and this is where the three theoretical approaches differ. The theory of Gibbs and Di Marzio produced an entropy catastrophe which led to speculations about an underlying phase transition for the glass transition, which would only be masked by kinetic effects. For the bond-fluctuation lattice model [6] the entropy density as a function of temperature in a model polymer melt and all terms entering the three theoretical predictions for this entropy could be calculated in a simulation employing a two-state bond Hamiltonian: H(b) = 0 for b = (3, 0, 0) and symmetric vectors, and H(b) = ε for all other bond vectors (Fig. 1). The shape of the simulation data is well reproduced by all theoretical equations; however, the Flory and Gibbs-DiMarzio treatments generate an entropy catastrophe only through an inadequate approximation of the intermolecular
part of the partition function (at 1/T = 0). Speculations about an underlying phase transition therefore seem unwarranted from this result. If this speculated phase transition were of second order, one would furthermore expect to find some growing structural correlation length in the simulations. So far, efforts along this line have only led to length scales which showed an insignificant increase with decreasing temperature [7,8], compatible with a divergence at T = 0. The same is true for length scales defined from the heterogeneity of the dynamics [9,10], which increases upon approaching the glass transition temperature. We will therefore focus in the following on the kinetic aspects of the glass transition in polymer melts. Although these have also been studied by means of Monte Carlo simulations of the bond-fluctuation model [13,14], the caging picture can be brought out more cleanly using Molecular Dynamics simulations of a bead-spring continuum model.

[Fig. 1 plot: entropy (0.00 to 0.30) vs 1/T (0.0 to 6.0); curves: Flory, Milchev, Gibbs-DiMarzio, Simulation]
Fig. 1. Entropy density as a function of inverse temperature for the bond-fluctuation model compared to three theoretical predictions
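The structure of Eq. (1), a sum over energies weighted by the number of states, can be illustrated with the two-level intramolecular part alone. The sketch below treats the bonds of one chain as independent two-level systems and ignores the intermolecular (excluded-volume) part entirely, so it is neither the Gibbs-DiMarzio theory nor the simulation result. The degeneracies g0 = 6 and g1 = 102 rest on the assumption that the model has 108 allowed bond vectors, six of which belong to the (3, 0, 0) class; the function names are hypothetical and $k_B$ = 1.

```python
import math


def density_of_states(n_bonds, k, g0=6, g1=102):
    """Number of bond configurations with exactly k excited bonds."""
    return math.comb(n_bonds, k) * g0 ** (n_bonds - k) * g1 ** k


def partition_function(n_bonds, eps, temperature, g0=6, g1=102):
    """Z = sum_E Omega(E) exp(-E/T) with E = k * eps, as in Eq. (1)."""
    beta = 1.0 / temperature
    return sum(density_of_states(n_bonds, k, g0, g1) * math.exp(-beta * k * eps)
               for k in range(n_bonds + 1))


def entropy_per_bond(n_bonds, eps, temperature, g0=6, g1=102):
    """S/n = (ln Z + <E>/T) / n for the independent-bond model."""
    beta = 1.0 / temperature
    z = partition_function(n_bonds, eps, temperature, g0, g1)
    mean_e = sum(k * eps * density_of_states(n_bonds, k, g0, g1)
                 * math.exp(-beta * k * eps)
                 for k in range(n_bonds + 1)) / z
    return (math.log(z) + beta * mean_e) / n_bonds


if __name__ == "__main__":
    n_bonds = 9  # a chain of 10 monomers has 9 bonds
    for inv_t in (0.5, 1.0, 2.0, 4.0, 6.0):
        s = entropy_per_bond(n_bonds, 1.0, 1.0 / inv_t)
        print(f"1/T = {inv_t:3.1f}  s per bond = {s:.4f}")
```

For independent bonds the sum over energies factorizes into $(g_0 + g_1 e^{-\varepsilon/T})^n$, which serves as a check of the enumeration. The entropy per bond decreases from ln 108 at high temperature to ln 6 at low temperature and stays positive, which is the behavior expected of the intramolecular part by itself.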
3 The Cage Effect in Simulations of a Bead-Spring Model
In this model the polymers are represented by Lennard-Jones spheres bonded by a finitely extendable nonlinear spring [15]. By making the equilibrium distance between bonded monomers, $b_{\min}$ = 0.96 (in the following all quantities will be given in Lennard-Jones units), different from the equilibrium distance of non-bonded monomers, $r_{\min}$ = 1.13, one can kinetically hinder crystallization in the course of the simulation. Chains of length N = 10 were studied and a melt of such chains was cooled down along an isobar, where equilibration was performed in an NpT simulation and the dynamics was then studied in an NVT simulation to avoid spurious effects of the barostat. This model was studied in great detail [9,16,17,18,19,20] and I will here only discuss a few selected results connected with the polymer aspect of the glass transition in this model. From the behavior of the static structure factor of the melt upon cooling one can infer that the structure stays amorphous with a first sharp diffraction peak (amorphous halo) at q = 6.9 [16]. This peak increases in height, sharpens and slightly shifts following the volume expansion. A first phenomenological impression of the slowing down of the relaxation processes can be obtained by looking at the temperature dependence of the center of mass diffusion coefficient of the chains. For temperatures below 1 it can be fitted with a Vogel-Fulcher law, $D(T) = D(\infty) \exp\{-E/(T - T_0)\}$, with a seemingly vanishing diffusion coefficient at the Vogel-Fulcher temperature $T_0$ = 0.35. On the other hand, one can start from a crystalline ground state at T = 0 where all chains are extended parallel to the z-direction and the end monomers of the chains are located on a tetragonal lattice with an additional base atom in the center [21]. Upon heating this structure slowly, one observes melting (see Fig. 2) at a temperature $T_m$ = 0.75 ± 0.05 (the large error bar accounts for the fact that no systematic studies of the dependence on the heating rate were performed). The temperature regime between the Vogel-Fulcher temperature $T_0$ = 0.35 and the melting temperature $T_m$ = 0.75 constitutes the supercooled fluid regime of this model. Within this temperature window the mode-coupling theory (MCT) [22] of the glass transition predicts the cage effect to become dominant for the slowing down of the relaxation.
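The Vogel-Fulcher form quoted above is easy to explore numerically. In the sketch below only $T_0$ = 0.35 is taken from the text; the prefactor and the energy parameter are placeholder values chosen for illustration, and the function names are hypothetical. The second function uses the fact that ln D is linear in $1/(T - T_0)$, so two data points determine both parameters once $T_0$ is fixed.

```python
import math

T0 = 0.35  # Vogel-Fulcher temperature quoted in the text (LJ units)


def vogel_fulcher(temperature, d_inf, e_act, t0=T0):
    """D(T) = D(inf) * exp(-E / (T - T0)); taken as zero at and below T0."""
    if temperature <= t0:
        return 0.0
    return d_inf * math.exp(-e_act / (temperature - t0))


def fit_two_points(t1, d1, t2, d2, t0=T0):
    """Recover (d_inf, e_act) from two (T, D) points for a fixed T0."""
    x1 = 1.0 / (t1 - t0)
    x2 = 1.0 / (t2 - t0)
    e_act = -(math.log(d1) - math.log(d2)) / (x1 - x2)
    d_inf = d1 * math.exp(e_act * x1)
    return d_inf, e_act


if __name__ == "__main__":
    # Placeholder parameters, not fitted to the simulation data.
    d_inf, e_act = 0.1, 0.5
    for temperature in (1.0, 0.75, 0.6, 0.48, 0.4):
        d = vogel_fulcher(temperature, d_inf, e_act)
        print(f"T = {temperature:4.2f}  D = {d:.3e}")
```

With these placeholder values the diffusion coefficient drops by several orders of magnitude between T = 1 and T = 0.4, which shows how steeply the law approaches the apparent divergence of the relaxation time at $T_0$.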
This caging is predicted to lead to a two-step decay of correlation functions with an intermediate plateau regime (β-regime), where the atoms are still confined in the cages formed by their neighbors, and a long-time structural relaxation (α process), which is predicted to obey time-temperature superposition. In Fig. 3 we show the incoherent intermediate scattering function at
[Figure 2 plot: g0(t) versus t, with curves for the x-, y- and z-directions]
Fig. 2. Mean square displacements of the atoms in the crystalline configuration along the chain axis (z) and perpendicular to it (xy). The temperature is T = 0.77
[Figure 3 plot: φ_q^s(t) versus t (logarithmic time axis) at T = 0.48 and q = 6.9, with curves exp(−q² g0(t)/6), KWW (β_K = 0.75) and β-correlator]
Fig. 3. Intermediate incoherent scattering function at the first sharp diffraction peak for a temperature in the supercooled melt regime. The first part of the decay is described by the mean square displacements g0 (t) invoking the dynamic Gaussian approximation. Then the β-correlator describes the plateau and finally the KWW stretched exponential fits the long-time decay
the momentum transfer of the amorphous halo in the supercooled melt. The two-step decay is clearly visible, and the whole curve can be split into three parts: a short-time ballistic and harmonic motion, where the dynamic Gaussian assumption $S(q,t) = \exp\{-q^2 g_0(t)/6\}$, $g_0$ being the monomer mean square displacement, still works; an MCT β-regime; and the long-time structural decay given by a Kohlrausch-Williams-Watts (KWW) stretched exponential function. On approaching the so-called MCT critical temperature Tc, the idealized version of the theory predicts that the temporal extent of the plateau regime diverges and the structural relaxation arrests completely. Close to Tc the α time-scale should show a power law divergence $\tau_\alpha \propto (T - T_c)^{-\gamma}$, the time-scale of the plateau should diverge as $\tau \propto (T - T_c)^{-2a}$ and the amplitudes of the von Schweidler law $S(q,t) = f_q^c - B h_q(T)\, t^b$ should vanish as $(T - T_c)^2$. Figure 4 shows that the bead-spring polymer model [15,16] is nicely consistent with all these predictions with Tc = 0.45. This also holds for single-chain coherent or melt coherent scattering [19], and the measured exponent parameter λ = 0.635, which determines all other exponents, does not depend on the thermodynamic path used to approach the transition [17]. The polymer-typical melt dynamics as described by the Rouse model is enslaved to the caging process and follows the α-scaling [18]. This is understandable through Fig. 5, which shows a mean square displacement master curve obtained by plotting the mean squared displacement of center monomers at different temperatures versus time multiplied by the center of mass diffusion coefficient of the chains. The caging occurs on a length scale which is much smaller than the mean bond length of this model, so that the monomers do not really feel the connectivity
[Figure 4 plot: t_ε^{−2a} and 100 (h_q^s)² |ε| for q = 3, 6.9 and 9.5 versus T]
Fig. 4. Test of the MCT predictions for the β-scaling giving Tc = 0.45
[Figure 5 plot: double-logarithmic master curve of g1(t) versus Dt, with the reference curves 6r_sc² + A(Dt)^0.75, 6Dt and ∼t^0.63; inset labelled binary LJ-mixture]
Fig. 5. Master curve of the mean squared monomer displacement of center monomers of the bead-spring chains compared to that of a binary Lennard-Jones mixture [23]. The inset shows the construction procedure before removing the parts not on the master curve
constraint of being part of a polymer. Therefore, the behavior is also very similar to what was observed in the MD simulation of a binary LJ mixture by Kob et al., even to the point that the mode-coupling critical temperatures are almost identical (Tc = 0.45 vs. Tc = 0.435) between the two very different models. It is only the late part of the decay out of the plateau which is influenced by connectivity and instead of crossing over to a free diffusion behavior like in the LJ mixture, for the polymers the Rouse mode dominated regime sets in with an exponent 0.63 which is smaller than the von Schweidler
exponent b = 0.75 of the plateau decay. However, also this behavior can be incorporated into a mode-coupling theory, when the theory is extended to not only include the melt structure factor but partial structure factors Snm (q) for the correlations between monomers at different positions n and m along one chain [24]. But is it really true that connectivity is just another way to prevent crystallization and all molecular details do not matter for the actual caging process in a polymer melt? To answer this question we have to look into MD simulations of chemically realistic polymer models.
4 The Influence of Dihedral Barriers on Local Dynamics

In the last decade MD simulations of chemically realistic polymer models have progressed to quantitative accuracy in the prediction of materials properties [25]. A necessary input for these kinds of simulations are highly optimized, quantum chemistry based force-fields. For one of the experimentally best studied glass forming polymers, 1,4-polybutadiene, such a force field was developed in reference [26] and later tested at high temperatures against neutron scattering results [27,28], nuclear magnetic resonance results on spin lattice relaxation times [27,29] and dielectric data [30]. The calculation of the dielectric response of a 1,4-polybutadiene melt may serve to illustrate one of the strong points of the computer simulation approach. Dielectric measurements are a q = 0 technique, i.e., they measure the response of the whole sample volume. Assigning the molecular motions underlying the observed relaxation is an involved process relying heavily on models for the actual motion of the polymer. For 1,4-polybutadiene [30], the simulations showed that there is no correlation between the dipole moments of different chains, so that the average squared dipole moment of the box is just the sum of the squared dipole moments of the individual chains. Further comparison with the time scales of different Rouse modes, which measure the typical reorientation time for segments of varying length (depending on mode number) of the chain, showed that the dielectric measurement in this case observes the reorientational motion of a chain segment of about 6 backbone bonds, which is about a Kuhn segment length of the chain. The relaxation map in Fig. 6 shows that the model is in good quantitative agreement with experimental information, which was also found in the comparison with other experimental techniques.
The excellent agreement between simulation and experiment for the local reorientational motion as observed in dielectrics or NMR experiments relies strongly on the quality of the dihedral force field used in the simulation. These relaxation processes are exponentially sensitive to the values of the barriers in the dihedral potentials. It is natural to ask what the influence of these barriers on the mean square displacement behavior of the monomers
[Figure 6 plot: log(ω_max) [s^-1] versus 1/T [K^-1]; legend: Exp. Arbe, Exp. Aouadi, Simulation, scaled viscosity]
Fig. 6. Relaxation map of the structural relaxation in 1,4-polybutadiene as observed in dielectric spectroscopy. The experiments observe the dielectric α-relaxation whereas the simulations are in the temperature window of the combined α − β process
is. To investigate this, we compare the chemically realistic simulation of 1,4-polybutadiene (CRC) with another one, where we turned off all dihedral potentials (FRC, freely rotating chain model). Polybutadiene is a special case, in that this has no influence on the melt structure (as shown in Fig. 7) or on the single chain structure factor [31]. This can be traced to the symmetry (all isomers are iso-energetic) of the dihedral potentials in this polymer. In MCT the structural relaxation is completely determined through the structure factors. All vertices coupling different modes are functionals of these static pair correlation functions. Consequently, these two polymer models should show the same dynamics, if MCT were applicable to them.

[Figure 7 plot: S(q) versus q [Å^-1]; curves: CRC 273 K, FRC 273 K]
Fig. 7. Melt structure factor for the chemically realistic model of 1,4-polybutadiene (CRC) and the freely rotating chain model (FRC)

[Figure 8 plot: ΔR² [Å²] versus t [ps] on double-logarithmic axes; curves: 353 K CRC CH2, 240 K CRC CH2, 273 K CRC CH2, 273 K FRC CH2]
Fig. 8. Mean square displacements of CH2 groups compared between the CRC and the FRC model at high temperatures

In Fig. 8 we compare mean square displacements of the CH2 groups along the chain between the two models for several temperatures well above the mode-coupling Tc of PBD, which is 220 K. Clearly, the CRC model develops a well-defined plateau regime which is reminiscent of the caging effect discussed for the bead-spring model in the previous section. At the same temperature, however, the FRC model shows no indication of this plateau, so that this 'caging' cannot be traced to the intermolecular packing. It is due to the dihedral barriers. The short time vibrational motion is damped out on the time scale of 1 ps for typical carbon-based polymers. At high temperatures thermal activation is fast enough, so that the typical time scale between jumps over dihedral barriers is of the same size. With decreasing temperature this time-scale increases in an Arrhenius fashion [29] and the relaxation is halted until a thermal activation happens. This process is not considered within MCT (which was originally never meant to describe polymers but was developed for simple liquids). The dihedral potentials, which depend on the position of four adjacent atoms, induce four-body correlations which do not factorize into two-body correlations as assumed in the closure of the MCT equations. Consequently, one of the central results of MCT, the factorization theorem for the β-relaxation (i.e. the von Schweidler law discussed above), does not hold for this plateau regime. To show this, one can consider the following function for any correlator Φ in the β-regime

$$R(t) = \frac{\Phi(t) - \Phi(t')}{\Phi(t'') - \Phi(t')} \qquad (2)$$

with times t, t′, t″ in the plateau regime of the relaxation function. If the factorization theorem is applicable, all correlators should fall on a unique R(t) curve. In Fig. 9 we test the theorem for the Rouse modes of PBD (the same test was shown to work for the Rouse modes in the bead-spring model in [19]). Clearly the curves do not scale, showing that the factorization theorem is sensitive to the exact physical origin of the plateau regime. The effect of
[Figure 9 plot: R(t) versus t [ps] for Rouse modes 10-22]
Fig. 9. Test of the factorization theorem for the plateau regime in the relaxation of PBD induced by the presence of dihedral barriers. The theorem is applied to the Rouse modes numbers 10 − 22 [31]
the dihedral barriers at lower temperatures where the intermolecular packing becomes more important remains to be studied. One MD simulation of PBD using a different force-field and relatively short simulation runs [32] seemed to show compatibility with the MCT predictions, but this point needs further clarification. It is not at all obvious why the four-body correlations introduced by the dihedral potentials should become unimportant at lower temperatures.
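The factorization test used in this section can be sketched in a few lines. The following is only an illustration under stated assumptions: the function name `factorization_ratio` and the synthetic correlators are hypothetical and are not the simulation data of this paper. The sketch shows that correlators of the form Φ(t) = f + h G(t), with a common function G(t), collapse onto one R(t) curve, which is what the factorization theorem asserts.

```python
import numpy as np

def factorization_ratio(phi, i1, i2):
    """R(t) = [phi(t) - phi(t')] / [phi(t'') - phi(t')].

    phi : samples of one correlator on a common time grid
    i1  : grid index of t'  (hypothetical choice inside the plateau regime)
    i2  : grid index of t'' (hypothetical choice inside the plateau regime)
    """
    phi = np.asarray(phi, dtype=float)
    return (phi - phi[i1]) / (phi[i2] - phi[i1])

# Synthetic example (an assumption for illustration): two correlators sharing
# the same time dependence G(t) but with different plateau heights f and
# amplitudes h.
t = np.logspace(-1, 2, 50)
G = t**-0.3 - 0.1 * t**0.75
phi_a = 0.7 + 0.20 * G
phi_b = 0.5 + 0.05 * G

R_a = factorization_ratio(phi_a, 10, 30)
R_b = factorization_ratio(phi_b, 10, 30)
```

If the correlators did not share a common G(t), as reported here for the dihedral-barrier plateau, the curves `R_a` and `R_b` would differ away from t′ and t″.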
5
Conclusions
We have discussed in this contribution several results from the computer simulation efforts of the last decade to help understand the glass transition in polymer melts. As most polymers do not crystallize easily, they have long been studied in (meta-)stable equilibrium, and much is known about the phenomenology of the polymer glass transition. A theory which was very successful in explaining much of this phenomenology was the Gibbs-DiMarzio theory, which predicts an entropy catastrophe in the spirit of Kauzmann. In a Monte Carlo simulation of the bond-fluctuation lattice model, all quantities entering the theory as well as the entropy could be measured directly, and it could be shown that although the shape of the temperature dependence of the entropy (specific heat) is given correctly by the theory, the entropy catastrophe only comes about through too strong an approximation within the theory. The success of this theory is therefore not reason enough for speculations about a phase transition underlying the glass transition in polymers. The view of the glass transition as a purely kinetic phenomenon is nowadays mainly connected with the mode-coupling theory of the glass transition. Although this theory was developed for simple liquids, experimentally it was mostly tested on molecular or polymeric liquids, with the exception of sterically stabilized colloids. We know in polymer physics that all the universal
phenomena of this class of materials rely on the presence of excluded volume and chain connectivity. Therefore, a molecular dynamics simulation of a simple bead spring model was performed to study the applicability of MCT to the polymer glass transition. Within the supercooled fluid regime of this model a temperature interval could be identified, where a two step decay of correlation functions developed. The features of this two step decay in the plateau regime (β-regime) and the long-time decay (α-relaxation) were nicely consistent with the predictions of MCT. Since the caging happened on length scales much shorter than the bond length in this coarse-grained model, the connectivity played no important role for the cage effect and the polymer dynamics as described by the Rouse model simply followed the scaling of the α-process. The influence of connectivity on the late stages of the β-regime and the α-process could recently also be included into a mode-coupling approach. In real polymers, however, there exists a strong influence of dihedral barriers on the qualitative and quantitative behavior of relaxation processes. A chemically realistic model of 1,4-polybutadiene with a carefully optimized, quantum chemistry based force field is able to quantitatively reproduce the relaxation behavior of this polymer in the melt. In a computer simulation one can selectively change aspects of the model to look at their importance for a given phenomenon. In this spirit a freely rotating chain model of PBD (where all dihedral potentials were switched off) and a chemically realistic model of PBD were studied. Even far above the glass transition temperature the chemically realistic model of PBD shows a plateau regime in the mean squared monomer displacements, whereas the FRC model at the same temperature shows no such effect. 
Since the structural properties of the two models as quantified by the melt and chain structure factors are identical, this cannot be rationalized within a mode-coupling approach, where the dynamics is uniquely determined through these static two-body correlation functions. Dihedral potentials describe four-body correlations which do not factorize into two-body correlations. The barriers in the potential lead to a slowing down, because further relaxation can only occur through thermal activation. This is similar to the cage process, although of different physical origin. This difference in physical origin is nicely mirrored in the fact that the factorization theorem for the MCT β-regime fails for the two-step process introduced through dihedral barriers. It is still an open question to what extent the MCT picture will be able to describe the behavior at lower temperatures, close to the merging temperature of the dielectric α and β processes, which is approximately equal to the experimentally determined mode-coupling Tc of this polymer.

Acknowledgements

It is a pleasure to thank M. Aichele, J. Baschnagel, D. Bedrov, C. Bennemann, K. Binder, O. Borodin, J. Buchholz, B. Dünweg, S. Krushev, G. D. Smith, F. Varnik and M. Wolfgardt for a fruitful collaboration. Financial support
from the DFG through SFB 262 and grant PA 473/3 and from the BMBF through grants 03N8000C and 03N6015 is gratefully acknowledged.
References
1. J. H. Gibbs, E. A. Di Marzio, J. Chem. Phys. 28, 373 (1958)
2. D. Richter, M. Monkenbusch, A. Arbe, J. Colmenero, J. Non-Cryst. Solids 287, 286 (2001)
3. K. Binder, J. Baschnagel, W. Paul, Prog. Polym. Sci. 28, 115 (2003)
4. P. J. Flory, Statistical thermodynamics of semi-flexible chain molecules, Proc. R. Soc. Lond. A 234, 60 (1956)
5. A. Milchev, C. R. Acad. Bulg. Sci. 36, 1415 (1983)
6. M. Wolfgardt, J. Baschnagel, W. Paul, K. Binder, Phys. Rev. E 54, 1535 (1996)
7. K. Binder, J. Baschnagel, S. Boehmer, W. Paul, Phil. Mag. B 77, 591 (1998)
8. C. Mischler, J. Baschnagel, K. Binder, Adv. Colloid Interf. Sci. 94, 197 (2001)
9. C. Bennemann, C. Donati, J. Baschnagel, S. C. Glotzer, Nature 399, 246 (1999)
10. Y. Gebremichael, T. B. Schroder, F. W. Starr, S. C. Glotzer, Phys. Rev. E 64, 051503 (2001)
11. H.-P. Wittmann, K. Kremer, K. Binder, J. Chem. Phys. 96, 6291 (1992)
12. W. Paul, J. Baschnagel, Monte Carlo Simulations of the Glass Transition of Polymers, in Monte Carlo and Molecular Dynamics simulations in polymer science, K. Binder (Ed.) (Oxford University Press, New York, 1995)
13. J. Baschnagel, M. Fuchs, J. Phys. Condens. Matter 7, 6761 (1995)
14. W. Paul, Monte Carlo Simulations of the Polymer Glass Transition, in Slow Dynamics in Condensed Matter, AIP Conference Proceedings 256 (American Institute of Physics, New York 1992)
15. C. Bennemann, W. Paul, K. Binder, B. Dünweg, Phys. Rev. E 57, 843 (1998)
16. C. Bennemann, J. Baschnagel, W. Paul, Eur. Phys. J. B 10, 323 (1999)
17. C. Bennemann, W. Paul, J. Baschnagel, K. Binder, J. Phys.: Condens. Matter 11, 2179 (1999)
18. C. Bennemann, J. Baschnagel, W. Paul, K. Binder, Comput. Theor. Polym. Sci. 9, 217 (1999)
19. M. Aichele, J. Baschnagel, Eur. Phys. J. E 5, 229 (2001); ibid. 5, 245 (2001)
20. F. Varnik, J. Baschnagel, K. Binder, J. Phys. 10, 239 (2000)
21. J. Buchholz, W. Paul, F. Varnik, K. Binder, J. Chem. Phys. 117, 7364 (2002)
22. W. Götze, L. Sjögren, Rep. Prog. Phys. 55, 241 (1992); W. Götze, J. Phys. Condens. Matter 11, A1 (1999)
23. W. Kob, H. C. Andersen, Phys. Rev. E 51, 4626 (1995)
24. S. H. Chong, M. Fuchs, Phys. Rev. Lett. 88, 185702 (2002)
25. S. C. Glotzer, W. Paul, Annu. Rev. Mater. Res. 32, 401–437 (2002)
26. G. D. Smith, W. Paul, J. Phys. Chem. A 102, 1200 (1998)
27. G. D. Smith, W. Paul, M. Monkenbusch, L. Willner, D. Richter, X. H. Qiu, M. D. Ediger, Macromolecules 32, 8857 (1999)
28. G. D. Smith, W. Paul, M. Monkenbusch, D. Richter, Chem. Phys. 261, 61 (2000)
29. G. D. Smith, O. Borodin, D. Bedrov, W. Paul, X. Qiu, M. D. Ediger, Macromolecules 34, 5192 (2001)
30. G. D. Smith, O. Borodin, W. Paul, J. Chem. Phys. 117, 10350 (2002)
31. S. Krushev, W. Paul, Phys. Rev. E 67, 021806 (2003)
32. A. van Zon, S. W. de Leeuw, Phys. Rev. E 58, R4000 (1998)
Order Out of Noise: Maximizing Coherence of Noisy Oscillators
Arkady Pikovsky
Department of Physics, University of Potsdam, Postfach 601553, D-14415 Potsdam, Germany
Abstract. In many nonlinear systems noise can play a constructive role, leading to the appearance of an ordered state. In this paper we describe the effects of coherence and system size resonance. Coherence resonance appears when an excitable system is driven by noise: at a certain noise level one observes ordered oscillations. System size resonance happens in globally coupled systems and lattices; here the level of order depends on the ensemble size and reaches a maximum at a certain size.
1
Introduction
The constructive role of noise in nonlinear systems has attracted great interest recently. One extremely popular example is Stochastic Resonance [1,2]. As was demonstrated in [3], the response of a noisy nonlinear system to a periodic forcing can exhibit a resonance-like dependence on the noise intensity. In other words, there exists a “resonant” noise intensity at which the response to a periodic force is maximally ordered. Stochastic resonance has been observed in numerous experiments. Notably, the order in a noise-driven system can have a maximum at a certain noise level even in the absence of periodic forcing; this phenomenon is called Coherence Resonance [4,5,6]. Below we demonstrate this effect taking the well-known FitzHugh–Nagumo system as an example. We present numerical evidence of the coherence resonance, determine conditions for it to occur, and give an analytical description. Having first been discussed in the context of a simple bistable model, stochastic resonance has also been studied in complex systems consisting of many elementary bistable cells [7]. Again, one observes a resonance-like dependence on the noise intensity; moreover, the resonance may be enhanced due to coupling [8,9]. In this paper we discuss another type of resonance in such systems, namely the System Size Stochastic Resonance, where the dynamics is maximally ordered at a certain number of interacting subsystems [10]. Contrary to previous reports of array-enhanced stochastic resonance phenomena, here we fix the noise strength, coupling, and other parameters; only the size of the ensemble changes. We also demonstrate, taking an ensemble of coupled noise-driven FitzHugh–Nagumo systems as an example, that for an ensemble of excitable systems the System Size Coherence Resonance can be observed.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 647–657, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Coherence Resonance in a Noise-Driven Excitable System

In this section we describe the effect of noise on the autonomous excitable oscillator – the FitzHugh–Nagumo system. We give numerical evidence that a characteristic correlation time of the noise-excited oscillations has a maximum for a certain noise amplitude, and outline a theory of this effect. The FitzHugh–Nagumo model is a simple but representative example of an excitable system that occurs in different fields of application ranging from kinetics of chemical reactions and solid-state physics to biological processes. Originally it was suggested for the description of nerve pulses [11]; it was also widely used for modeling spiral waves in a two-dimensional excitable medium. The equations of motion are

$$\varepsilon \frac{dx}{dt} = x - \frac{x^3}{3} - y , \qquad (1)$$
$$\frac{dy}{dt} = x + a + D\xi(t) . \qquad (2)$$
Here ε ≪ 1 is a small parameter allowing one to separate all motions into fast ones (only x changes) and slow ones ($y \approx x - x^3/3$). The parameter a governs the character of solutions: for |a| > 1 the only attractor is a stable fixed point, and for |a| < 1 a limit cycle appears. This cycle consists of two pieces of slow motion connected by fast jumps. For |a| slightly larger than 1 the system is excitable, i.e. small but finite deviations from the fixed point produce large pulses. Indeed, if the perturbation brings the system to the border of the slow branch on which the stable fixed point lies, a jump to the other slow branch happens and the system returns to the stable fixed point only after a large excursion. This highly nonlinear response to perturbations makes the dynamics of the forced FitzHugh–Nagumo system nontrivial. Finally, the parameter D governs the amplitude of the noisy external force ξ, which we assume to be Gaussian and delta-correlated with zero mean: $\langle \xi(t)\xi(t')\rangle = \delta(t - t')$. We integrate system (1), (2) numerically using Euler's method for the parameters ε = 0.01, a = 1.05, and different noise amplitudes. The results shown in Fig. 1 demonstrate that for both small and large noise amplitudes the noise-excited oscillations appear to be rather irregular, while for moderate noise relatively coherent oscillations are observed. This phenomenon, called coherence resonance, resembles the well-known stochastic resonance [2,3,12]. Stochastic resonance appears if both periodic and noisy forces drive a nonlinear system, with the periodic response having a maximum at some noise amplitude. In our case there is, however, no periodic force and no discrete component appears in the spectrum, but at some noise amplitude the regularity of the process is nevertheless maximal.
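The numerical integration described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `simulate_fhn`, the time step `dt`, the run length and the random seed are assumptions chosen for the example. The scheme is the Euler method applied to the noisy system (often called Euler–Maruyama), where the white noise contributes an increment D·sqrt(dt)·N(0,1) per step.

```python
import numpy as np

def simulate_fhn(D, a=1.05, eps=0.01, dt=1e-3, t_max=100.0, seed=0):
    """Euler(-Maruyama) integration of the noisy FitzHugh-Nagumo system.

    eps * dx/dt = x - x**3/3 - y
          dy/dt = x + a + D * xi(t)
    The trajectory starts at the fixed point x = -a, y = a**3/3 - a.
    dt, t_max and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n = int(round(t_max / dt))
    x = np.empty(n + 1)
    y = np.empty(n + 1)
    x[0] = -a
    y[0] = a**3 / 3.0 - a
    # White-noise increments: D * sqrt(dt) * standard normal per step
    noise = D * np.sqrt(dt) * rng.standard_normal(n)
    for k in range(n):
        x[k + 1] = x[k] + dt / eps * (x[k] - x[k]**3 / 3.0 - y[k])
        y[k + 1] = y[k] + dt * (x[k] + a) + noise[k]
    t = dt * np.arange(n + 1)
    return t, x, y
```

The time step must resolve the fast motion, so dt should stay well below ε; the value used here gives dt/ε = 0.1. Calling `simulate_fhn(D)` for several values of D yields time series of the kind shown in Fig. 1.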
[Figure 1 plot. Left panel: y versus t for three noise amplitudes, with the pulse duration t_p, the activation time t_a and the excursion time t_e marked for one pulse. Right panel: autocorrelation function versus τ]
Fig. 1. Left panel: The dynamics of the FitzHugh–Nagumo system (Eqs. (1), (2)) for a = 1.05, ε = 0.01 and different noise amplitudes: from bottom to top D = 0.02, D = 0.07, and D = 0.25. The activation and the excursion times for one pulse are depicted. Right panel: The corresponding autocorrelation functions
To characterize this ordering quantitatively, we compute the normalized autocorrelation function

$$C(\tau) = \frac{\langle \tilde y(t)\,\tilde y(t+\tau)\rangle}{\langle \tilde y^2\rangle}, \qquad \tilde y = y - \langle y\rangle . \qquad (3)$$

One can see from Fig. 1 that the correlations are indeed much more pronounced for the moderate noise. To describe this effect with a single quantity, we calculate the characteristic correlation time as follows:

$$\tau_c = \int_0^\infty C^2(t)\,dt . \qquad (4)$$
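The autocorrelation function (3) and the correlation time (4) can be estimated from a sampled time series roughly as follows. This is a hedged sketch: the function names, the finite maximum lag that replaces the upper limit ∞, and the trapezoidal rule are assumptions of this example and not details given in the text.

```python
import numpy as np

def normalized_autocorrelation(y, max_lag):
    """Estimate C(k*dt) for k = 0..max_lag from samples of y, Eq. (3)."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()              # y-tilde = y - <y>
    var = np.mean(dev**2)           # <y-tilde^2>
    n = dev.size
    c = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # Time average of y-tilde(t) * y-tilde(t + k*dt) over the record
        c[k] = np.mean(dev[:n - k] * dev[k:]) / var
    return c

def correlation_time(c, dt):
    """Trapezoidal estimate of tau_c = integral of C(t)**2, Eq. (4),
    truncated at the largest available lag."""
    c2 = c**2
    return dt * (c2.sum() - 0.5 * (c2[0] + c2[-1]))

# Illustrative check with a pure cosine (period of 100 samples)
k = np.arange(10000)
c_cos = normalized_autocorrelation(np.cos(2 * np.pi * k / 100), 200)
```

For a noisy signal the estimate of C at large lags is dominated by statistical error, so the maximum lag should be chosen only as large as needed for C² to decay.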
The dependence of this quantity on the noise amplitude is presented in Fig. 2; it has a clear maximum at the noise amplitude Dres ≈ 0.06. While the correlation time can be readily obtained numerically, for the convenience of the theoretical consideration we introduce another quantity (which can be interpreted, in the context of stochastic resonance terminology, as a noise-to-signal ratio). Because the process in Fig. 1 can be viewed as a sequence of
pulses having durations $t_p$, we look at the normalized fluctuations of pulse durations

$$R_p = \frac{\sqrt{\mathrm{Var}(t_p)}}{\langle t_p\rangle} . \qquad (5)$$

This quantity, shown in Fig. 2, possesses a minimum at Dres. Below we describe an approach to calculating $R_p$. Physically, the appearance of coherence resonance is deeply related to the excitable nature of the FitzHugh–Nagumo system. The system has two characteristic times: the activation time $t_a$ and the excursion time $t_e$. The activation time is the time needed to excite the system from the stable fixed point $x = -a$, $y = a^3/3 - a$, while the excursion time is the time needed to return from the excited state to the fixed point. The pulse duration $t_p$ is the sum of these times, $t_p = t_a + t_e$. The crucial point is that these times and their fluctuations have a different dependence on the noise amplitude. The activation time decreases rapidly with the noise amplitude according to the Kramers formula $\langle t_a\rangle \sim \exp(\mathrm{const}\cdot D^{-2})$ [13,14]. It can also be shown that for small noise $\mathrm{Var}(t_a) \approx \langle t_a\rangle^2$. Thus, for small noise, where $\langle t_a\rangle \gg \langle t_e\rangle$ and the period is dominated by the activation time, $\langle t_p\rangle \approx \langle t_a\rangle$, the fluctuations of the pulse durations are relatively large: $R_p \approx R_a \approx 1$. For large noise the contribution of the activation time $t_a$ to the period is negligible; here the excursion time dominates, $\langle t_p\rangle \approx \langle t_e\rangle$. If the motion in the excited state is nearly uniform, $\langle t_e\rangle$ depends weakly on the noise amplitude, but its variance can be estimated as $\mathrm{Var}(t_e) \sim D^2 \langle t_e\rangle$ [15], so the fluctuations grow with the noise amplitude. In this regime $R_p \approx R_e \sim D \langle t_e\rangle^{-1/2}$. The coherence resonance, i.e. a minimum in the dependence R(D), appears if the threshold of excitation is small and the excursion time is large. In this case the minimum corresponds
[Figure 2 plot. Left panel: τc and R versus D on double-logarithmic axes. Right panel: R versus D on double-logarithmic axes]
Fig. 2. Left panel : Correlation time τc (solid line) and the noise-to-signal ratio R (Eq. (5), dashed line) vs. noise amplitude for the FitzHugh–Nagumo system with a = 1.05, ε = 0.01. Right panel : The relative first passage time fluctuations vs. noise amplitude in the one-dimensional phase dynamics model (described in text) for A = B = 1 and y0 = −5 (solid line), y0 = −20 (dashed line) and y0 = −50 (dot-dashed line)
to a sufficiently large noise amplitude so that $\langle t_a\rangle \ll \langle t_e\rangle$, but not very large so that fluctuations of the excursion time are small, $R_e(D_{res}) < 1$. To make these arguments quantitative, we describe a simple analytical model of the coherence resonance. Note first that, due to the smallness of the parameter ε in the FitzHugh–Nagumo model, the motion is restricted to the “nearly-limit” cycle in the phase space, consisting of two lines of slow motion and two straight lines of fast motion. On each line of slow motion the variable x is a function of y. Thus, along the lines of slow motion the dynamics can be represented with the one-dimensional Langevin equation

$$\frac{dy}{dt} = -\frac{dU}{dy} + D\xi(t) \qquad (6)$$

with the noisy term ξ and a nonlinear potential U(y) having a single minimum (a stable fixed point). The fast motion can be modeled in this approach as a jump (reinjection) of the variable y when the excitation threshold is reached. Thus, we can consider (6) as defined on the half-line −∞ < y < 0, with reinjection of y from the threshold y = 0 to the point y = y0. The sequence of pulses is in this interpretation a sequence of walks from y0 to 0, with reinjections. Because each walk is described by the Langevin equation (6), we can apply the method of the Fokker-Planck equation to find statistical characteristics of pulse durations $t_p$. These durations are nothing else but first passage times for the random process (6) starting at y = y0, with the absorbing boundary y = 0. The equations for the moments of these times are well-known [14,16]. The solutions for the first two moments have the form:

$$\langle t_p(y_0)\rangle = 2D^{-2}\int_{y_0}^{0} dv \int_{-\infty}^{v} du\, \exp\!\left(2\,\frac{U(v)-U(u)}{D^2}\right),$$
$$\langle t_p^2(y_0)\rangle = 4D^{-2}\int_{y_0}^{0} dv \int_{-\infty}^{v} du\, \langle t_p(u)\rangle \exp\!\left(2\,\frac{U(v)-U(u)}{D^2}\right).$$

Except for extremely simplified models, the resulting formulae are very tedious. We were able to get closed analytical results for a simple model of the phase motion with a piece-wise linear potential U(y) = −Ay if y < −1 and U(y) = A + B + By if 0 > y > −1 (the minimum of the potential at y = −1 determines the position of the stable fixed point), although these formulae are still too cumbersome to be presented here. From these analytic expressions we calculate the ratio R which characterizes the coherence of the oscillations, and plot it in Fig. 2. Two asymptotics in accordance with the qualitative arguments above are clearly seen: for small noise R ≈ 1, which corresponds to the Poissonian statistics of the activation times; for large noise R ∼ D. The sharpness of the coherence resonance depends on the model parameters A, B, y0.
In agreement with the qualitative consideration above, the minimum is deeper for larger excursion times (large values of |y0 |). We emphasize that the phase dynamics equation (6) provides a general description of the coherence resonance (with details of a particular system
coming through the potential U and boundary conditions), provided the excited state is regular (non-chaotic); otherwise the one-dimensional description is not sufficient. In conclusion of this section we mention that the coherence resonance has been experimentally observed in [17,18,19].
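As a rough numerical companion to the first-passage-time formulas of this section, the two moments can be evaluated by quadrature instead of in closed form. The sketch below is an illustration under explicit assumptions: the names `passage_time_moments`, `coherence_ratio` and `piecewise_potential`, the finite lower cutoff `y_min` replacing −∞, the grid spacing `h`, and the trapezoidal rule are all choices of this example. The exponentials are evaluated directly in double precision, so it is intended for moderate noise amplitudes only.

```python
import numpy as np

def piecewise_potential(y, A=1.0, B=1.0):
    """Piece-wise linear potential: -A*y for y < -1, A + B + B*y otherwise."""
    y = np.asarray(y, dtype=float)
    return np.where(y < -1.0, -A * y, A + B + B * y)

def _cumtrapz(f, h):
    """Cumulative trapezoidal integral of samples f with spacing h."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * h)))

def passage_time_moments(U, y0, D, y_min=-30.0, h=0.005):
    """First and second moments of the first passage time from y0 to 0.

    U     : callable potential U(y), vectorized over numpy arrays
    y_min : finite cutoff standing in for -infinity (assumption)
    h     : grid spacing (assumption); y0 should lie on the grid
    """
    n = int(round(-y_min / h))
    y = np.linspace(y_min, 0.0, n + 1)
    step = y[1] - y[0]
    phi = 2.0 * U(y) / D**2
    phi = phi - phi.min()            # only differences of phi enter
    w_in, w_out = np.exp(-phi), np.exp(phi)
    # First moment: (2/D^2) * int_u^0 dv exp(phi(v)) int_{y_min}^v exp(-phi)
    g1 = w_out * _cumtrapz(w_in, step)
    c1 = _cumtrapz(g1, step)
    t1 = 2.0 / D**2 * (c1[-1] - c1)  # <t_p(u)> for every grid point u
    # Second moment: same structure with <t_p(u)> in the inner integrand
    g2 = w_out * _cumtrapz(t1 * w_in, step)
    c2 = _cumtrapz(g2, step)
    t2 = 4.0 / D**2 * (c2[-1] - c2)
    i0 = int(np.argmin(np.abs(y - y0)))
    return t1[i0], t2[i0]

def coherence_ratio(U, y0, D, **kwargs):
    """R = sqrt(Var(t_p)) / <t_p> from the two moments."""
    m1, m2 = passage_time_moments(U, y0, D, **kwargs)
    return np.sqrt(m2 - m1**2) / m1
```

A simple consistency check is the pure-drift potential U(y) = −y, for which the formulas give a mean passage time |y0| and a variance D²|y0|, in line with the estimate Var(t_e) ∼ D²⟨t_e⟩ used above. A call such as `coherence_ratio(piecewise_potential, -5.0, 1.0)` evaluates R for the piece-wise linear model with A = B = 1.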
3
System Size Stochastic Resonance
The basic model to be considered below is the ensemble of noise-driven bistable overdamped oscillators, governed by the Langevin equations

$$\dot x_i = x_i - x_i^3 + \frac{g}{N}\sum_{j=1}^{N} (x_j - x_i) + \sqrt{2D}\,\xi_i(t) + f(t) . \qquad (7)$$
Here $\xi_i(t)$ is a Gaussian white noise with zero mean, $\langle \xi_i(t)\xi_j(t')\rangle = \delta_{ij}\delta(t - t')$; g is the coupling constant; N is the number of elements in the ensemble, and f(t) is a periodic force to be specified later. In the absence of periodic force the model (7) has been extensively studied in the thermodynamic limit N → ∞. It demonstrates an Ising-type phase transition at g = gc from the disordered state with vanishing mean field $X = N^{-1}\sum_i x_i$ to the “ferromagnetic” state with a nonzero mean field X = ±X0. A theory of this transition, based on the nonlinear Fokker-Planck equation, was developed in [20], where the expressions for the critical coupling gc are also given. While in the thermodynamic limit the full description of the dynamics is possible, for finite system sizes we have mainly a qualitative picture: in the ordered phase the mean field X switches between the values ±X0 and its average vanishes for all couplings. The rate of switchings depends on the system size and tends to zero as N → ∞. The asymptotic dynamics in this limit has been discussed in [21]. Most important for us is the fact that qualitatively the behavior of the mean field can be represented as the noise-induced dynamics in a potential with one minimum in the disordered phase (at X = 0) and two symmetric minima (at X = ±X0) in the ordered phase. Now applying the ideas of the stochastic resonance, one can expect in the bistable case (i.e. in the ordered phase for small enough noise or for large enough coupling) a resonance-like behavior of the response to a periodic external force when the intensity of the effective noise is changed. Because this intensity is inversely proportional to N, we obtain a resonance-like curve of the response as a function of the system size.
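A direct simulation of the ensemble (7) can be sketched as follows. This is an illustrative Euler–Maruyama scheme, not the procedure used for the figures of this paper: the function name `simulate_mean_field`, the time step, the initial condition and the seed are assumptions. The global coupling is written as g(X − x_i), which is identical to (g/N) Σ_j (x_j − x_i).

```python
import numpy as np

def simulate_mean_field(N, g, D, A=0.0, Omega=0.0, dt=0.01,
                        steps=10000, x0=1.0, seed=0):
    """Integrate N globally coupled noisy bistable oscillators, Eq. (7),
    with forcing f(t) = A*cos(Omega*t), and return the mean field X(t).

    dt, steps, x0 and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.full(N, x0, dtype=float)
    X = np.empty(steps)
    for k in range(steps):
        t = k * dt
        mean = x.mean()
        # (g/N) * sum_j (x_j - x_i) equals g * (X - x_i)
        drift = x - x**3 + g * (mean - x) + A * np.cos(Omega * t)
        # Independent white noise of intensity 2D for every element
        x = x + drift * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(N)
        X[k] = x.mean()
    return X
```

Repeating such runs for different N at fixed g, D, A and Omega, and measuring the spectral component of X(t) at frequency Omega, is one way to look for the dependence of the response on the system size.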
The main idea behind the system size resonance is that in finite ensembles of noise-driven or chaotic systems the dynamics of the mean field can be represented as driven by an effective noise whose variance is inversely proportional to the system size [21,22,23]. This idea has been applied to the description of a transition to collective behavior in [24]. In [25] it was demonstrated that finite-size fluctuations can cause a transition that disappears in the thermodynamic limit. The description of finite-size effects in deterministic chaotic systems using the effective noise concept has been suggested in [22]. We emphasize that noise plays an essential role in this picture: with D = 0, (7) is a deterministic oscillator (double or single well, depending on g), whose response to a periodic force does not depend on N.
Before proceeding to a quantitative analytic description of the phenomenon, we illustrate it with direct numerical simulations of the model (7), with a sinusoidal forcing term f(t) = A cos(Ωt). Figure 3 shows the linear response function, i.e. the ratio of the spectral component of the mean field at frequency Ω to the amplitude of forcing A, in the limit A → 0. For a given frequency Ω the dependence on the system size is a bell-shaped curve with a pronounced maximum. The dynamics of the mean field X(t) is illustrated in Fig. 4 for three different system sizes and a particular frequency. The resonant dynamics (Fig. 4b) demonstrates the synchrony, typical of stochastic resonance, between the driving periodic force and the switchings of the field between the two stable positions. For non-resonant conditions (Fig. 4a,c) the switchings are either too frequent or too rare; as a result, the response is small.
To describe the system size resonance analytically, we use, following [20], the Gaussian approximation. In this approximation one writes $x_i = X + \delta_i$ and assumes that the $\delta_i$ are independent Gaussian random variables with zero mean and variance M. Assuming furthermore that $N^{-1}\sum_i \delta_i^2 = M$ and neglecting the odd moments $N^{-1}\sum_i \delta_i$, $N^{-1}\sum_i \delta_i^3$, as well as the correlations between $\delta_i$ and $\delta_j$, we obtain from (7) the equations for X and M
$$\dot{X} = X - X^3 - 3MX + \sqrt{\frac{2D}{N}}\,\eta(t) + f(t), \tag{8}$$
$$\frac{1}{2}\dot{M} = M - 3X^2M - 3M^2 - gM + D, \tag{9}$$
where η is a Gaussian white noise having the same properties as $\xi_i(t)$. In the thermodynamic limit N → ∞ the noise term in (8) vanishes. If the forcing
Fig. 3. Linear response of the ensemble (7) (D = 0.5, ε = 2) in dependence on the frequency Ω and the system size N . Reprinted from [10]
Fig. 4. The time dependence of the mean field in the ensemble (7) for D = 0.5, g = 2, A = 0.02, Ω = π/300, and different sizes of the ensemble: (a) N = 80, (b) N = 35, and (c) N = 15. We also depict the periodic force (its amplitude is not in scale) to demonstrate the synchrony of the switchings with the forcing in (b)
term is absent (f = 0), the equations coincide with those derived in [20]. This system of coupled nonlinear equations exhibits a pitchfork bifurcation of the equilibrium X = 0, M > 0 at $g_c = 3D$. This bifurcation is supercritical for D > 2/3, in accordance with the exact solution of (7) given in [20]; below we consider only this case. For $g > g_c$ the system is bistable with two symmetric stable fixed points
$$X_0^2 = \frac{2 - g + S}{4}, \qquad M_0 = \frac{2 + g - S}{12}, \tag{10}$$
(here $S = \sqrt{(2+g)^2 - 24D}$) and the unstable point X = 0, $M = \bigl(1 - g + \sqrt{(1-g)^2 + 12D}\bigr)/6$.
Now, with the external noise η and the periodic force f(t), the problem reduces to a standard problem in the theory of stochastic resonance, i.e. to the response of a noise-driven nonlinear bistable system to an external periodic force. (Because the noise affects only the variable X, it does not lead to unphysical negative values of the variance M, since $\dot{M}$ is strictly positive at M = 0.) This response has a maximum at a certain noise intensity, which according to (8) is directly related to the system size.
To obtain an analytical formula, we further simplify the system (8,9). Near the bifurcation point we can use the slaving principle: the dynamics of X is slower than that of M, so we can eliminate the latter by assuming $\dot{M} \approx 0$. Then from (9) we can express M as a function of X and substitute it into (8). Near the bifurcation point we obtain a standard noise-driven bistable system
$$\dot{X} = aX - bX^3 + \sqrt{\frac{2D}{N}}\,\eta(t) + f(t), \tag{11}$$
where $a = 1 + 0.5(g-1) - 0.5\sqrt{(g-1)^2 + 12D}$ and $b = -0.5 + 1.5(g-1)\bigl((g-1)^2 + 12D\bigr)^{-1/2}$. A better approximation, valid also beyond the vicinity of the critical point, can be constructed if we use $\bar{b} = aX_0^{-2}$ instead of b, where the fixed point $X_0$ is given by (10). Having written the ensemble dynamics as a
standard noise-driven double-well system (11) (cf. [1,2,26]), we can use the analytic formula for the linear response R derived in [26]. It reads
$$R = \frac{N X_0^2}{2Da}\left[\frac{D_{-3/2}(-\sqrt{s})}{D_{-1/2}(-\sqrt{s})}\right]^2\left[1 + \frac{\pi^2\Omega^2}{2a^2}\exp(s)\right]^{-1}, \tag{12}$$
where $s = aNX_0^2/(2D)$, and $D_{-3/2}$, $D_{-1/2}$ are parabolic cylinder functions (not to be confused with the noise intensity D). We compare the theoretical linear response function with the numerically obtained one in Fig. 5. The qualitative correspondence is good; moreover, the positions of the maxima of the curves are reproduced rather well by formula (12). This shows that the Gaussian approximation describes the resonant system size quite well quantitatively.
Fig. 5. Comparison of the system-size dependencies of the linear response function for frequencies Ω = 0.05 (circles) and Ω = 0.1 (squares) with theory (12). The parameters are D = 1 and g − gc = 2.5 (where the exact gc and the approximate gc = 3D are used for the ensemble and the Gaussian approximation, respectively). Inset: Dependence of the system size Nmax yielding maximal linear response on the driving frequency Ω (circles: simulations of the ensemble (7), line is obtained by maximizing the expression (12))
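The closed-form expressions (10) and the coefficients of (11) are easy to check numerically. The sketch below, with hypothetical function names, evaluates them and verifies that the fixed point (10) annihilates the noise-free, unforced right-hand sides of (8) and (9); the parameter values D = 1, g = 3D + 2.5 follow the caption of Fig. 5 under the approximate critical coupling.

```python
import math


def gaussian_fixed_point(g, d):
    """Return (X0^2, M0) from Eq. (10); valid for g above gc = 3D."""
    s = math.sqrt((2.0 + g) ** 2 - 24.0 * d)
    return (2.0 - g + s) / 4.0, (2.0 + g - s) / 12.0


def reduced_coefficients(g, d):
    """Return the coefficients (a, b) of the reduced equation (11)."""
    root = math.sqrt((g - 1.0) ** 2 + 12.0 * d)
    a = 1.0 + 0.5 * (g - 1.0) - 0.5 * root
    b = -0.5 + 1.5 * (g - 1.0) / root
    return a, b


def residuals(x0_sq, m0, g, d):
    """Right-hand side of (8) divided by X, and of (9), without noise and forcing."""
    res_x = 1.0 - x0_sq - 3.0 * m0
    res_m = m0 - 3.0 * x0_sq * m0 - 3.0 * m0 ** 2 - g * m0 + d
    return res_x, res_m


if __name__ == "__main__":
    d = 1.0
    g = 3.0 * d + 2.5
    x0_sq, m0 = gaussian_fixed_point(g, d)
    a, b = reduced_coefficients(g, d)
    print("X0^2 =", x0_sq, " M0 =", m0)
    print("a =", a, " b =", b, " b_bar = a / X0^2 =", a / x0_sq)
    print("residuals:", residuals(x0_sq, m0, g, d))
```

At g = 3D the coefficient a vanishes, which is the pitchfork bifurcation quoted in the text, and the sign of b at that point changes at D = 2/3.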
4
System Size Coherence Resonance
Here we consider a one-dimensional lattice of N diffusively coupled FitzHugh–Nagumo systems (cf. Eqs. (1), (2))
$$\varepsilon\,\frac{dx_k}{dt} = x_k - \frac{x_k^3}{3} - y_k + g\,(x_{k-1} + x_{k+1} - 2x_k), \tag{13}$$
$$\frac{dy_k}{dt} = x_k + a + D\,\xi_k(t), \qquad 1 \le k \le N. \tag{14}$$
Fig. 6. Coherence factor R vs. lattice size N for the FitzHugh–Nagumo model (13,14) for ε = 0.01, D = 0.2, g = 30. Circles: a = 1.05, squares: a = 1.03
We assume periodic boundary conditions and identical parameters for all oscillators; only the noise terms $\xi_k$ are assumed to be different, with $\langle\xi_k\rangle = 0$ and $\langle\xi_k(t)\,\xi_m(t')\rangle = \delta_{km}\,\delta(t-t')$. In Section 2 it has been demonstrated numerically and analytically that pulses generated by noise in a single FitzHugh–Nagumo oscillator have maximal coherence at a certain noise level. Here we focus on the dependence of the factor Rp (5) on the system size N, keeping the noise and other parameters constant. The results are presented in Fig. 6. The coherence factor Rp is minimal at N ≈ 10. We emphasize that we have considered the system (13,14) under very strong coupling and for moderate system length N. This ensured that for the considered system sizes (up to 100) all oscillators fired nearly synchronously. Thus the factor Rp does not depend on which oscillator is chosen for analysis. We expect that the dynamics in the lattice will become much more complex if either the coupling is reduced or the system size is increased, or both (cf. numerical experiments with a weakly coupled lattice of FitzHugh–Nagumo oscillators [27]).
References
1. L. Gammaitoni, P. Hänggi, P. Jung, and F. Marchesoni, Rev. Mod. Phys. 70, 223 (1998).
2. P. Jung, Phys. Reports 234, 175 (1993).
3. R. Benzi, A. Sutera, and A. Vulpiani, J. Phys. A: Math. Gen. 14(11), L453 (1981).
4. A. Pikovsky and J. Kurths, Phys. Rev. Lett. 78(5), 775 (1997).
5. A. Neiman, P. I. Saparin, and L. Stone, Phys. Rev. E 56(1), 270 (1997).
6. A. Longtin, Phys. Rev. E 55(1), 868 (1997).
7. P. Jung, U. Behn, E. Pantazelou, and F. Moss, Phys. Rev. A 46(4), R1709 (1992).
8. J. F. Lindner, B. K. Medows, W. L. Ditto, M. E. Inchiosa, and A. R. Bulsara, Phys. Rev. Lett. 75(1), 3 (1995).
9. J. F. Lindner, B. K. Medows, W. L. Ditto, M. E. Inchiosa, and A. R. Bulsara, Phys. Rev. E 53(3), 2081 (1996).
10. A. Pikovsky, A. Zaikin, and M. A. de la Casa, Phys. Rev. Lett. 88(5), 050601 (2002).
11. A. C. Scott, Rev. Mod. Phys. 47(2), 487 (1975).
12. F. Moss, in Contemporary Problems in Statistical Physics, edited by G. H. Weiss (SIAM, Philadelphia, 1994), pp. 205–253.
13. P. Hänggi, P. Talkner, and M. Borkovec, Rev. Mod. Phys. 62, 251 (1990).
14. H. Z. Risken, The Fokker–Planck Equation (Springer, Berlin, 1989).
15. A. S. Pikovsky, Z. Physik B 55(2), 149 (1984).
16. R. L. Stratonovich, Topics in the Theory of Random Noise (Gordon and Breach, New York, 1963).
17. G. Giacomelli, M. Guidici, S. Balle, and J. R. Tredicce, Phys. Rev. Lett. 84(15), 3298 (2000).
18. K. Miyakawa and H. Isikawa, Phys. Rev. E 66(3), 046204 (2002).
19. D. E. Postnov, S. K. Han, T. G. Yim, and O. V. Sosnovtseva, Phys. Rev. E 59(4), R3791 (1999).
20. R. C. Desai and R. Zwanzig, J. Stat. Phys. 19(1), 1 (1978).
21. D. Dawson and J. Gärtner, Stochastics 20, 247 (1987).
22. A. S. Pikovsky and J. Kurths, Physica D 76, 411 (1994).
23. A. Hamm, Physica D 142(1–2), 41 (2000).
24. A. Pikovsky and S. Ruffo, Phys. Rev. E 59(2), 1633 (1999).
25. A. S. Pikovsky, K. Rateitschak, and J. Kurths, Z. Physik B 95(4), 541 (1994).
26. P. Jung and P. Hänggi, Phys. Rev. A 44(12), 8032 (1991).
27. O. V. Sosnovtseva, D. E. Postnov, and A. I. Fomin, Applied Nonlinear Dynamics 10(3), 125 (2002).
Scale Invariance and Dynamic Phase Transitions in Diffusion-Limited Reactions Uwe C. Täuber Department of Physics, Virginia Tech Blacksburg, VA 24061-0435, USA [email protected]
Abstract. Many systems that can be described in terms of diffusion-limited ‘chemical’ reactions display non-equilibrium continuous transitions separating active from inactive, absorbing states, where stochastic fluctuations cease entirely. Their critical properties can be analyzed via a path-integral representation of the corresponding classical master equation, and the dynamical renormalization group. An overview of the ensuing universality classes in single-species processes is given, and generalizations to reactions with multiple particle species are discussed as well. The generic case is represented by the processes A ⇌ A + A, and A → ∅, which map onto Reggeon field theory with the critical exponents of directed percolation (DP). For branching and annihilating random walks (BARW) A → (m + 1)A and A + A → ∅, the mean-field rate equation predicts an active state only. Yet BARW with odd m display a DP transition for d ≤ 2. For even offspring number m, the particle number parity is conserved locally. Below dc ≈ 4/3, this leads to the emergence of an inactive phase that is characterized by the power laws of the pair annihilation process. The critical exponents at the transition are those of the ‘parity-conserving’ (PC) universality class. For local processes without memory, competing pair or triplet annihilation and fission reactions kA → (k − l)A, kA → (k + m)A with k = 2, 3 appear to yield the only other universality classes not described by mean-field theory. In these reactions, site occupation number restrictions play a crucial role.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 659–675, 2003. © Springer-Verlag Berlin Heidelberg 2003
1 Introduction: Active to Absorbing State Transitions
The characterization of non-equilibrium steady states constitutes one of the prevalent goals in present statistical mechanics. Unfortunately, away from thermal equilibrium one cannot in general derive even stationary macroscopic properties from an effective free energy function. One might hope, however, that such a classification in terms of symmetries and interactions becomes feasible near continuous phase transitions separating different non-equilibrium steady states: Drawing on the analogy with equilibrium critical points, one would expect certain features of non-equilibrium phase transitions to be universal as well, i.e., independent of the detailed microscopic dynamical rules and even the initial conditions. The emerging power laws and scaling functions describing the long-wavelength, long-time limit should then hopefully be characterized by not too many distinct dynamic universality classes. Yet
studies of a variety of non-equilibrium processes have taught us that critical phenomena as well as generic scale invariance far from thermal equilibrium, where restrictive detailed-balance constraints do not apply, are often considerably richer than their equilibrium counterparts. Indeed, intuitions inferred from the latter have frequently turned out to be quite deceptive. But as in the investigation of equilibrium critical phenomena, field theory representations supplemented with the dynamical version of the renormalization group (RG) provide powerful methods to extract and systematically classify universal properties at continuous non-equilibrium phase transitions. Additional indispensable quantitative tools are of course Monte Carlo simulations, and other numerical approaches and exact solutions when available (the latter are usually restricted to specific one-dimensional systems). Analytical and numerical methods usually supplement each other: Simulation results often call for deeper understanding of the underlying processes, but also rely on some theoretical background as a basis for data analysis. A prominent class of genuine non-equilibrium phase transitions separates ‘active’ from ‘inactive, absorbing’ stationary states where, owing to the absence of any agents, stochastic fluctuations cease entirely [1,2]. These occur in a variety of systems in nature; e.g., in chemical reactions which involve an inert state ∅ that does not release the reactants A anymore. Another example are models for stochastic population dynamics, combining, say, diffusive migration of a species A with asexual reproduction A → 2A (with rate σ), spontaneous death A → ∅ (at rate µ), and lethal competition 2A → A (with rate λ). In the inactive state, where no population members A are left, clearly all processes terminate. Similar effective dynamics may be used to model certain non-equilibrium physical systems. 
Consider for instance the domain wall kinetics in Ising chains with competing Glauber and Kawasaki dynamics [3]. Here, spin flips ↑↑↓↓ → ↑↑↑↓ and ↑↑↓↑ → ↑↑↑↑ may be viewed as domain wall (A) hopping and pair annihilation 2A → ∅, whereas spin exchange ↑↑↓↓ → ↑↓↑↓ represents a branching process A → 3A. Notice that the para- and ferromagnetic phases respectively map onto the active and inactive ‘particle’ states, the latter rendered absorbing if the spin flip rates are computed at zero temperature, thus forbidding any energy increase. The simplest mathematical description for such processes uses kinetic rate equations, which govern the time evolution of the mean ‘particle’ density n(t). For example, the above population model leads to Fisher’s rate equation
$$\partial_t n(t) = (\sigma - \mu)\,n(t) - \lambda\,n(t)^2. \tag{1}$$
It yields both inactive and active phases: For σ < µ we have n(t → ∞) → 0, whereas for σ > µ the density eventually saturates at $n_s = (\sigma - \mu)/\lambda$. The explicit solution (with initial particle density $n_0$)
$$n(t) = \frac{n_0\,n_s}{n_0 + (n_s - n_0)\,e^{(\mu-\sigma)t}} \tag{2}$$
shows that both stationary states are approached exponentially in time. The two phases are separated by a continuous non-equilibrium phase transition at
σ = µ, where the temporal decay becomes algebraic, n(t) = n0 /(1 + n0 λ t) → 1/(λt) as t → ∞, independent of the initial density. But Eq. (1) represents a mean-field approximation, as we have in fact replaced the joint probability of finding two particles at the same position with the square of the mean density. As in equilibrium, however, critical fluctuations are expected to invalidate simple mean-field theory in sufficiently low dimensions d < dc , which defines the upper critical dimension. A more satisfactory treatment therefore necessitates a systematic incorporation of spatio-temporal fluctuations, including specifically the particle correlations as induced by the dynamics.
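The mean-field statements of this section can be checked with a few lines of code. The following sketch, whose function names and parameter values are illustrative assumptions, integrates the rate equation (1) with a fourth-order Runge-Kutta scheme and compares the result with the explicit solution (2) and with the algebraic decay at σ = µ.

```python
import math


def density_exact(t, n0, sigma, mu, lam):
    """Explicit solution (2) of the rate equation (1); requires sigma != mu."""
    ns = (sigma - mu) / lam
    return n0 * ns / (n0 + (ns - n0) * math.exp((mu - sigma) * t))


def density_critical(t, n0, lam):
    """Algebraic decay n0 / (1 + n0 * lam * t) at the transition sigma = mu."""
    return n0 / (1.0 + n0 * lam * t)


def density_rk4(t_end, n0, sigma, mu, lam, steps=10000):
    """Integrate Eq. (1) numerically with the classical Runge-Kutta scheme."""
    def rhs(n):
        return (sigma - mu) * n - lam * n * n

    h = t_end / steps
    n = n0
    for _ in range(steps):
        k1 = rhs(n)
        k2 = rhs(n + 0.5 * h * k1)
        k3 = rhs(n + 0.5 * h * k2)
        k4 = rhs(n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


if __name__ == "__main__":
    for sigma, mu in ((1.0, 0.5), (0.5, 1.0)):
        print(sigma, mu, density_rk4(10.0, 0.1, sigma, mu, 1.0),
              density_exact(10.0, 0.1, sigma, mu, 1.0))
    print("critical:", density_rk4(10.0, 0.1, 1.0, 1.0, 1.0),
          density_critical(10.0, 0.1, 1.0))
```

In the active case the density approaches $n_s = (\sigma-\mu)/\lambda$, in the inactive case it decays exponentially, and at σ = µ the decay is the power law quoted above.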
2 From the Master Equation to Stochastic Field Theory
The renormalization group study of (near-)equilibrium dynamical critical phenomena relies on phenomenological Langevin equations for the order parameter and ‘slow’ hydrodynamic variables associated with conservation laws [4]. All other degrees of freedom are treated as Gaussian white noise, whose second moment is related to the relaxation coefficients through Einstein’s fluctuation-dissipation theorem. As we shall see, however, such a description is not necessarily possible at all in reaction-diffusion systems. At the very least one would have to invoke fundamental conjectures on the noise correlators; but far from equilibrium these often crucially influence long-wavelength properties. It is therefore desirable to construct a long-wavelength effective theory for stochastic processes directly from their microscopic definition, without recourse to any serious additional assumptions or approximations. Fortunately, there exists an established procedure to derive the Liouville time evolution operator for locally interacting particle systems immediately from the classical master equation, wherefrom a field theory representation is readily obtained [5]. The key point is that all possible configurations can be labeled by specifying the occupation numbers $n_i$ of, say, the sites of a d-dimensional lattice. Let us for now assume that there are no site occupation restrictions, i.e., any $n_i = 0, 1, 2, \ldots$ is allowed (we shall address effects of particle exclusions in Sect. 7). The master equation governs the time evolution of the configurational probability $P(\{n_i\}; t)$. For example, the corresponding contribution from the binary coagulation process 2A → A at site i reads
$$\partial_t P(n_i;t)\big|_\lambda = \lambda\,\bigl[(n_i+1)\,n_i\,P(n_i+1;t) - n_i(n_i-1)\,P(n_i;t)\bigr]. \tag{3}$$
This sole dependence on the integer variables $\{n_i\}$ calls for a representation in terms of bosonic ladder operators with the standard commutation relations $[a_i, a_j^\dagger] = \delta_{ij}$, and the empty state $|0\rangle$ such that $a_i|0\rangle = 0$. We next define the Fock states via $|\{n_i\}\rangle = \prod_i (a_i^\dagger)^{n_i}|0\rangle$ (notice the different normalization from standard quantum mechanics), and thence construct the formal state vector
$$|\Phi(t)\rangle = \sum_{\{n_i\}} P(\{n_i\};t)\,|\{n_i\}\rangle. \tag{4}$$
The linear time evolution imposed by the master equation can be cast into an ‘imaginary-time Schrödinger’ equation $\partial_t|\Phi(t)\rangle = -H|\Phi(t)\rangle$ with a generally non-Hermitian, local ‘Hamiltonian’ $H(\{a_i^\dagger\},\{a_i\})$. For instance, the on-site coagulation reaction is encoded in this formalism via $H_i^\lambda = -\lambda\,(1 - a_i^\dagger)\,a_i^\dagger a_i^2$. Our goal really is to evaluate time-dependent statistical averages for observables F, necessarily also just functions of the occupation numbers, $\langle F(t)\rangle = \sum_{\{n_i\}} F(\{n_i\})\,P(\{n_i\};t)$. Straightforward algebra using the identity $[e^{a}, a^\dagger] = e^{a}$ shows that this average can be written as a ‘matrix element’
$$\langle F(t)\rangle = \langle\mathcal{P}|\,F(\{a_i\})\,|\Phi(t)\rangle = \langle\mathcal{P}|\,F(\{a_i\})\,e^{-Ht}\,|\Phi(0)\rangle \tag{5}$$
with the state vector $|\Phi(t)\rangle$ and the projector state $\langle\mathcal{P}| = \langle 0|\prod_i e^{a_i}$; notice that $\langle\mathcal{P}|0\rangle = 1$. Probability conservation implies $1 = \langle\mathcal{P}|\,e^{-Ht}\,|\Phi(0)\rangle$, i.e., for infinitesimal times $\langle\mathcal{P}|\Phi(0)\rangle = 1$ and $\langle\mathcal{P}|H = 0$, which is satisfied provided $H(\{a_i^\dagger \to 1\},\{a_i\}) = 0$. We remark that commuting the factors $e^{a_i}$ to the right has the effect of shifting all $a_i^\dagger \to 1 + a_i^\dagger$. One may then employ coherent states, as familiar from quantum many-particle theory [6], to represent the matrix element (5) as a functional integral with statistical weight exp(−S). Omitting contributions from the initial state, the action becomes
$$S[\{\hat\psi_i\},\{\psi_i\}] = \int dt\,\Bigl[\sum_i \hat\psi_i\,\partial_t\psi_i + H(\{\hat\psi_i\},\{\psi_i\})\Bigr]. \tag{6}$$
Taking the continuum limit finally yields the desired field theory that faithfully encodes all stochastic fluctuations.
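For a single site, the master equation contribution (3) is simply a linear system dP/dt = L P, which makes the probability-conservation condition concrete. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the state space is cut off at a maximal occupation number (harmless here because coagulation only lowers n), and a plain Euler step is used for the time evolution.

```python
def coagulation_generator(lam, n_max):
    """Matrix L with dP/dt = L P for 2A -> A on one site, states n = 0..n_max."""
    size = n_max + 1
    gen = [[0.0] * size for _ in range(size)]
    for n in range(size):
        rate = lam * n * (n - 1)       # total rate for leaving state n
        gen[n][n] -= rate              # loss term  -n(n-1) P(n)
        if n >= 1:
            gen[n - 1][n] += rate      # gain term feeding state n - 1
    return gen


def evolve(prob, gen, dt, steps):
    """Explicit Euler integration of dP/dt = L P (illustrative, not optimized)."""
    size = len(prob)
    for _ in range(steps):
        deriv = [sum(gen[i][j] * prob[j] for j in range(size)) for i in range(size)]
        prob = [p + dt * dp for p, dp in zip(prob, deriv)]
    return prob


def mean_occupation(prob):
    """Average particle number for a given probability vector."""
    return sum(n * p for n, p in enumerate(prob))


if __name__ == "__main__":
    gen = coagulation_generator(lam=1.0, n_max=5)
    start = [0.0] * 6
    start[5] = 1.0                     # exactly five particles initially
    final = evolve(start, gen, dt=0.001, steps=20000)
    print("total probability:", sum(final))
    print("mean occupation:", mean_occupation(final))
```

Every column of the matrix sums to zero, which is the matrix counterpart of the condition $\langle\mathcal{P}|H = 0$, and the distribution relaxes to a single remaining particle, as expected for 2A → A.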
3
Reggeon Field Theory and Directed Percolation
Let us now return to our population dynamics example with random walkers A (with diffusion constant D in the continuum limit), subject to the Gribov reactions A ⇌ A + A and A → ∅. The corresponding field theory (6) reads
$$S = \int d^dx\,dt\,\Bigl[\hat\psi\,(\partial_t - D\nabla^2)\,\psi + \sigma\,(1-\hat\psi)\,\hat\psi\,\psi - \mu\,(1-\hat\psi)\,\psi - \lambda\,(1-\hat\psi)\,\hat\psi\,\psi^2\Bigr]. \tag{7}$$
The stationarity condition $\delta S/\delta\psi = 0$ is always solved by $\hat\psi = 1$; upon inserting this into $\delta S/\delta\hat\psi = 0$, and identifying $n(t) = \langle\psi(x,t)\rangle$, one recovers Fisher’s mean-field rate equation (1). Higher moments of the field ψ, however, cannot be immediately connected with density correlation functions. In terms of an arbitrary momentum scale κ, we record the naive scaling dimensions $[\hat\psi] = \kappa^0$, $[\psi] = \kappa^d$, $[\sigma] = \kappa^2 = [\mu]$, and $[\lambda] = \kappa^{2-d}$. Hence the decay and branching rates constitute relevant operators in the RG sense, whereas the annihilation process becomes marginal at d = 2. Next we expand the action (7) about the stationary solution $\hat\psi = 1$, i.e., introduce $\tilde\psi(x,t) = \hat\psi(x,t) - 1$, whereupon we arrive at
$$S = \int d^dx\,dt\,\Bigl[\tilde\psi\,(\partial_t - D\nabla^2)\,\psi + (\mu-\sigma)\,\tilde\psi\,\psi - \sigma\,\tilde\psi^2\psi + \lambda\,\tilde\psi\,(1+\tilde\psi)\,\psi^2\Bigr]. \tag{8}$$
Inspection of the one-loop fluctuation corrections shows that the effective coupling is in fact $u = \sqrt{\sigma\lambda}$, with scaling dimension $[u] = \kappa^{2-d/2}$, whence $d_c = 4$. At least in the vicinity of $d_c$, λ becomes irrelevant; certainly the ratio λ/u scales to zero under subsequent RG transformations. Simple rescaling $\tilde\psi = \tilde\phi\,\sqrt{\lambda/\sigma}$, $\psi = \phi\,\sqrt{\sigma/\lambda}$ then leaves us with the effective field theory
$$S_{\rm eff}[\tilde\phi,\phi] = \int d^dx\,dt\,\Bigl\{\tilde\phi\,\bigl[\partial_t + D(r-\nabla^2)\bigr]\,\phi + u\,\bigl(\tilde\phi\,\phi^2 - \tilde\phi^2\phi\bigr)\Bigr\}, \tag{9}$$
where r = (µ − σ)/D. This should capture the critical behavior for the non-equilibrium phase transition at r = 0 in our population dynamics model. The action (9) is known in particle physics as ‘Reggeon’ field theory [7]. It is invariant with respect to ‘rapidity inversion’ $\phi(x,t) \to -\tilde\phi(x,-t)$, $\tilde\phi(x,t) \to -\phi(x,-t)$. Quite remarkably, the very same action is obtained for the threshold pair correlation function [8] in the geometric problem of directed percolation (DP) [9]. Figure 1 depicts a critical directed percolation cluster, contrasted with the structure emerging at the threshold of ordinary isotropic percolation. At $d_c = 4$, the critical exponents as predicted by mean-field theory acquire logarithmic corrections, and are shifted to different values by the infrared-singular fluctuations in d < 4 dimensions. By means of the standard perturbational loop expansion in terms of the diffusion propagator and the vertices ∝ u, and the application of the RG, the critical exponents can be computed systematically and in a controlled manner in a dimensional expansion with respect to ε = 4 − d. The one-loop results, to first order in ε, as well as reliable values from Monte Carlo simulations in one and two dimensions [2] are listed in Table 1. Moreover, as a consequence of rapidity invariance there are only three independent scaling exponents, namely the anomalous field dimension η, the correlation length exponent ν, and the dynamic critical exponent z. All other exponents are then fixed by scaling relations, such as
$$\beta = \frac{\nu\,(z + d - 2 + \eta)}{2} = z\,\nu\,\alpha \tag{10}$$
for the order parameter and critical density decay exponents. It is worthwhile noting that Reggeon field theory can be viewed as a dynamic response functional [11], and therefore is equivalent to an effective
Fig. 1. Critical isotropic (left) and directed (right) percolation clusters (from [10])
Table 1. Critical exponents for the saturation density ns, correlation length ξ, time scale tc, and critical density decay nc(t) for the DP universality class

DP exponents             d = 1         d = 2        d = 4 − ε
ns ∼ |r|^β               β ≈ 0.2765    β ≈ 0.584    β = 1 − ε/6 + O(ε²)
ξ ∼ |r|^(−ν)             ν ≈ 1.100     ν ≈ 0.735    ν = 1/2 + ε/16 + O(ε²)
tc ∼ ξ^z ∼ |r|^(−zν)     z ≈ 1.576     z ≈ 1.73     z = 2 − ε/12 + O(ε²)
nc(t) ∼ t^(−α)           α ≈ 0.160     α ≈ 0.46     α = 1 − ε/4 + O(ε²)
Langevin equation. To this end, we integrate out the field $\tilde\phi$, which yields the statistical weight exp(−G) with
$$G[\phi] = \int d^dx\,dt\,\frac{\bigl[\partial_t\phi + D(r-\nabla^2)\,\phi + u\,\phi^2\bigr]^2}{4u\,\phi}. \tag{11}$$
After rescaling, we may interpret (11) as the Onsager-Machlup functional associated with the Gaussian noise distribution for the stochastic process
$$\partial_t n(x,t) = D\,(\nabla^2 - r)\,n(x,t) - \lambda\,n(x,t)^2 + \zeta(x,t), \tag{12a}$$
$$\langle\zeta(x,t)\rangle = 0, \qquad \langle\zeta(x,t)\,\zeta(x',t')\rangle = 2\sigma\,n(x,t)\,\delta(x-x')\,\delta(t-t'). \tag{12b}$$
Here, the absorbing nature of the inactive state is reflected in the fact that the noise correlator is proportional to n(x, t). Of course, Eq. (12) really means that the local density is to be factored in when the noise average is taken, cf. Eq. (11). Alternatively, one may define $\zeta(x,t) = \sqrt{n(x,t)}\,\eta(x,t)$, whereupon $\langle\eta(x,t)\,\eta(x',t')\rangle = 2\sigma\,\delta(x-x')\,\delta(t-t')$, at the cost of introducing ‘square-root’ multiplicative noise into the Langevin equation (12). Within the Langevin framework, we can readily generalize to arbitrary reaction and noise functionals r[n] and c[n]:
$$\partial_t n(x,t) = D\,\nabla^2 n(x,t) - r[n(x,t)] + \zeta(x,t), \tag{13a}$$
$$\langle\zeta(x,t)\rangle = 0, \qquad \langle\zeta(x,t)\,\zeta(x',t')\rangle = c[n(x,t)]\,\delta(x-x')\,\delta(t-t'). \tag{13b}$$
In the spirit of Landau theory, we may then expand the functionals r[n] and c[n] near the inactive phase (n ≪ 1). In the absence of spontaneous particle production, both must vanish at n = 0, which is the condition for an absorbing state. Keeping only the lowest-order, relevant terms in the expansions with respect to n, we thereby infer that any active to absorbing state phase transition in a single-species system should generically be described by Eqs. (12a) and (12b), i.e., Reggeon field theory (9). Consequently, in the absence of any special symmetries, memory effects, and quenched disorder, we expect to find the critical exponents of the DP universality class [12].
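The active to absorbing transition of the DP class can be seen in a very small lattice simulation. The sketch below is an illustration that is not taken from the text: it implements directed bond percolation in 1+1 dimensions with periodic boundaries, and the function name, lattice size, and number of time steps are arbitrary choices. For this lattice model the threshold is known to lie at a bond probability of roughly 0.645.

```python
import random


def bond_dp_density(p, width=200, steps=200, seed=0):
    """Directed bond percolation in 1+1 dimensions with periodic boundaries.

    Site i at time t+1 becomes active if at least one of its two neighbours
    at time t is active and the connecting bond is open (probability p).
    Returns the fraction of active sites after each time step.
    """
    rng = random.Random(seed)
    active = [True] * width
    history = []
    for _ in range(steps):
        new = [False] * width
        for i in range(width):
            left = active[i] and rng.random() < p
            right = active[(i + 1) % width] and rng.random() < p
            new[i] = left or right
        active = new
        history.append(sum(active) / width)
    return history


if __name__ == "__main__":
    for p in (0.3, 0.6, 0.7, 0.9):
        print(p, bond_dp_density(p)[-1])
```

Once all sites are inactive, no rule can reactivate them, which is exactly the absorbing-state property; well above the threshold the density settles at a finite value instead.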
4
Variants of Directed Percolation Processes
In fact, DP-type processes with even arbitrarily many particle species have been fully classified. Consider the reactions A ⇌ A + A, A → ∅, B ⇌ B + B, B → ∅, etc., supplemented with bilinear couplings A → B + B, A + A → B, . . . Higher-order reactions then turn out to be irrelevant in the RG sense, and remarkably the critical behavior of the multi-species DP system is once again governed by the DP fixed point [13]. However, if we allow for unidirectional linear couplings through particle transmutations A → B, B → C, . . ., with rates $\mu_{AB}$, $\mu_{BC}$, . . ., multi-critical behavior may ensue [14]. This becomes already manifest on the level of the coupled rate equations
$$\partial_t n_A = D\,\bigl(\nabla^2 - r_A\bigr)\,n_A - \lambda_A\,n_A^2, \tag{14a}$$
$$\partial_t n_B = D\,\bigl(\nabla^2 - r_B\bigr)\,n_B - \lambda_B\,n_B^2 + \mu_{AB}\,n_A, \tag{14b}$$
etc. For as long as the A species is in the active phase ($r_A < 0$), the B particle density will be non-zero as well. As depicted in Fig. 2, this effectively ‘folds’ half of the decoupled B transition line ($r_A < 0$, $r_B = 0$) over onto $r_A = 0$, $r_B > 0$. Along this half-line, the B particles become ‘enslaved’ by the A species; and so forth further down the hierarchy of particle species. As $r_A \to 0$ and $r_B \to 0$ simultaneously, one encounters a non-equilibrium multicritical point. While the DP exponents ν and z governing the correlation length and critical slowing down remain unchanged, one finds successively reduced values for the order parameter exponents $\beta^{(j)}$ on the jth hierarchy level [14], as listed in Table 2. The crossover exponent associated with the multi-critical point is φ = 1 to all orders in the ε expansion [13]. Another mechanism to induce a different universality class in a two-species system is to link diffusing agents A with a passive, spatially fixed, and initially homogeneously distributed species X through the reactions X + A → A + A and A → ∅. One may then integrate out the X fluctuations; upon expanding about the mean-field solution, the resulting effective action becomes [15]
Fig. 2. Mean-field phase diagram for the two-stage unidirectionally coupled DP process in the (rA, rB) plane, with regions labeled ‘A active, B active’, ‘A inactive, B active’, and ‘A inactive, B inactive’. The arrows indicate active to absorbing transitions for the A and B species. The dotted parabola marks the boundary of the multicritical regime [14]
Table 2. Simulation and one-loop RG results for the saturation density critical exponents on the first three hierarchy levels in unidirectionally coupled DP processes

          d = 1        d = 2      d = 3      d = 4 − ε
β^(1)     0.280(5)     0.57(2)    0.80(4)    1 − ε/6 + O(ε²)
β^(2)     0.132(15)    0.32(3)    0.40(3)    1/2 − ε/8 + O(ε²)
β^(3)     0.045(10)    0.15(3)    0.17(2)    1/4 − O(ε)
$$S_{\rm eff}[\tilde\phi,\phi] = \int d^dx\,dt\,\Bigl\{\tilde\phi(x,t)\,\bigl[\partial_t + D(r-\nabla^2)\bigr]\,\phi(x,t) + 2Du\,\tilde\phi(x,t)\,\phi(x,t)\int^{t}\phi(x,t')\,dt' - u\,\tilde\phi(x,t)^2\,\phi(x,t)\Bigr\}. \tag{15}$$
The elimination of the passive particles X thus induces memory of all preceding times in the particle annihilation vertex. The DP rapidity inversion invariance is replaced by its non-local counterpart $\phi(x,t) \to -D^{-1}\,\partial_t\tilde\phi(x,-t)$, $\tilde\phi(x,t) \to -D\int^{-t}\phi(x,t')\,dt'$. The upper critical dimension for this dynamic percolation universality class is shifted to $d_c = 6$. Its static critical exponents are precisely the ones that characterize a critical isotropic percolation cluster [15], compare Fig. 1a. To first order in ε = 6 − d, one finds η = −ε/21, ν = 1/2 + 5ε/84, z = 2 − ε/6, and β = ν(d − 2 + η)/2 = 1 − ε/7. Multi-species generalizations proceed in the same way as for DP, with similar results: Whereas non-linear couplings to other particle species preserve the dynamic universality class, a multi-critical point emerges for unidirectional particle transmutations, with crossover exponent φ = 1 [13].
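As a quick consistency check of the first-order results just quoted (a worked expansion added for illustration, using only the stated values of η and ν together with d = 6 − ε), the scaling relation for β can be expanded as follows:

```latex
\begin{align*}
\beta &= \frac{\nu\,(d - 2 + \eta)}{2}
       = \frac{1}{2}\left(\frac{1}{2} + \frac{5\epsilon}{84}\right)
         \left(4 - \epsilon - \frac{\epsilon}{21}\right) + O(\epsilon^2) \\
      &= \left(\frac{1}{2} + \frac{5\epsilon}{84}\right)
         \left(2 - \frac{11\epsilon}{21}\right) + O(\epsilon^2)
       = 1 - \frac{11\epsilon}{42} + \frac{5\epsilon}{42} + O(\epsilon^2)
       = 1 - \frac{\epsilon}{7} + O(\epsilon^2).
\end{align*}
```

The two first-order contributions combine to −6ε/42, which reproduces β = 1 − ε/7.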
5
Diffusion-Limited Annihilation Processes
As a preparation for the following Sect. 6, let us now investigate the kth order annihilation reaction kA → ∅. The associated rate equation reads $\partial_t n(t) = -\lambda\,n(t)^k$. For radioactive decay (k = 1), it is naturally solved by the familiar exponential $n(t) = n_0\,e^{-\lambda t}$, whereas one obtains power laws for k ≥ 2, namely
$$n(t) = \bigl[n_0^{1-k} + (k-1)\,\lambda\,t\bigr]^{-1/(k-1)}. \tag{16}$$
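That (16) indeed solves the rate equation can be confirmed numerically. The following sketch, with hypothetical function names and arbitrarily chosen parameter values, evaluates (16) and estimates the residual of the rate equation by a central finite difference.

```python
def annihilation_density(t, n0, lam, k):
    """Mean-field density, Eq. (16), for the reaction kA -> 0 with k >= 2."""
    return (n0 ** (1 - k) + (k - 1) * lam * t) ** (-1.0 / (k - 1))


def rate_equation_residual(t, n0, lam, k, h=1e-5):
    """Central-difference estimate of dn/dt + lam * n**k, which should vanish."""
    dn_dt = (annihilation_density(t + h, n0, lam, k)
             - annihilation_density(t - h, n0, lam, k)) / (2.0 * h)
    return dn_dt + lam * annihilation_density(t, n0, lam, k) ** k


if __name__ == "__main__":
    for k in (2, 3):
        print(k, annihilation_density(100.0, 1.0, 0.5, k),
              rate_equation_residual(1.0, 1.0, 0.5, k))
```

For large t the density loses its memory of $n_0$ and decays as $t^{-1/(k-1)}$, i.e. as 1/t for pair and as $t^{-1/2}$ for triplet annihilation.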
In order to consistently include fluctuations in the latter case, we again start out from the master equation, wherefrom we derive the action [16]
$$S[\hat\psi,\psi] = \int d^dx\,dt\,\Bigl[\hat\psi\,(\partial_t - D\nabla^2)\,\psi - \lambda\,(1-\hat\psi^k)\,\psi^k\Bigr]. \tag{17}$$
After performing the shift $\hat\psi(x,t) = 1 + \tilde\psi(x,t)$, it becomes evident that this field theory does not have a simple Langevin representation. For in order
to interpret $\tilde\psi$ as the corresponding noise auxiliary field, it should appear quadratically in the action only, and with negative prefactor. Thus even for the pair annihilation process, the Langevin equation derived from the action (17) would entail unphysical ‘imaginary’ noise with $c[n] = -2\lambda\,n^2$. The scaling dimension of the annihilation vertex is $[\lambda_k] = \kappa^{2-(k-1)d}$, whence we infer the upper critical dimension $d_c(k) = 2/(k-1)$. This leaves the possibility of non-trivial scaling behavior in low physical dimensions only for the pair (k = 2) and triplet (k = 3) processes. Analyzing the field theory (17) further, we see that the diffusion propagator does not become renormalized at all. Consequently, η = 0 and z = 2 to all orders in the perturbation expansion. The simple structure of the action permits summing the entire vertex renormalization perturbation series by means of a Bethe-Salpeter equation; in Fourier space it reduces to a geometric series of the one-loop diagram [16]. For pair annihilation, this yields the following asymptotic behavior for the particle density: $n(t) \sim (\lambda t)^{-1}$, i.e., the reaction-limited power law of the rate equation, for d > 2; but diffusion-limited decay $n(t) \sim (Dt)^{-d/2}$ for d < 2. At $d_c = 2$, one finds the logarithmic correction $n(t) \sim (Dt)^{-1}\ln Dt$. The slower decay for d ≤ 2 originates in the fast mutual annihilation of any close-by reactants; after some time has elapsed, only well-separated particles are left. The annihilation dynamics thus produces anti-correlations, mimicking an effective repulsive interaction (which actually provides the interpretation for the negative correlator c[n]). In the ensuing diffusion-limited regime, the typical particle separation scales as $\ell(t) \sim (Dt)^{1/2}$, whereupon indeed $n(t) \propto \ell(t)^{-d} \sim (Dt)^{-d/2}$. The same power laws hold for the pair coagulation process 2A → A, albeit with different amplitudes.
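The diffusion-limited decay in one dimension can be illustrated with a small Monte Carlo simulation. The sketch below is a simplified model chosen for illustration rather than the procedure of any cited work: walkers hop on a ring, two walkers meeting on a site annihilate at once (the limit of an infinite reaction rate), and the function name, lattice size, and sequential update order are assumptions.

```python
import random


def pair_annihilation_1d(length=2000, sweeps=100, seed=0):
    """Random walkers on a ring with instantaneous A + A -> 0 on contact.

    Returns the particle density after each sweep, starting from a full lattice.
    """
    rng = random.Random(seed)
    occupied = set(range(length))
    densities = []
    for _ in range(sweeps):
        # visit the currently occupied sites in random order
        for site in rng.sample(sorted(occupied), len(occupied)):
            if site not in occupied:       # annihilated earlier in this sweep
                continue
            target = (site + rng.choice((-1, 1))) % length
            occupied.discard(site)
            if target in occupied:
                occupied.discard(target)   # the pair annihilates
            else:
                occupied.add(target)       # plain hop to an empty site
        densities.append(len(occupied) / length)
    return densities


if __name__ == "__main__":
    result = pair_annihilation_1d()
    for sweep in (1, 10, 100):
        print(sweep, result[sweep - 1])
```

On a doubly logarithmic plot the density from such a run should approach a slope close to −1/2 at late times, in contrast to the slope −1 of the mean-field rate equation, although finite-size and short-time effects limit the accuracy of so small a simulation.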
Replacing ordinary diffusion with long-range Lévy flights with probability $\propto r^{-d-\rho}$ of hopping a distance $r$ ($\rho < 2$) results in $n(t) \sim (Dt)^{-d/\rho}$ for $d < d_c = \rho$ [17]. For triplet annihilation, one can similarly show that the density decays as $n(t) \sim (\lambda t)^{-1/2}$ for $d > 1$, with mere logarithmic corrections $n(t) \sim [(Dt)^{-1} \ln Dt]^{1/2}$ at $d_c = 1$ [16].
Generalizations of the pair annihilation reaction to multiple particle types introduce interesting new physics. For the two-species case $A + B \to \emptyset$ (with no concurrent reactions of identical particles), the rate equations read $\partial_t n_{A/B} = -\lambda n_A n_B$. With equal initial densities $n_{A0} = n_{B0}$ they are again solved by $n_{A/B}(t) \sim (\lambda t)^{-1}$; however, with $n_{A0} > n_{B0}$, say, one obtains $n_B(t) \sim \exp[-(n_{A0} - n_{B0}) \lambda t]$ for the minority species, while the majority density saturates at $n_{As} > 0$. In order to establish the effects of spatial fluctuations, it is crucial to notice that the density difference $n_A - n_B$ remains strictly conserved under the reactions; for $D_A = D_B$ it simply obeys the diffusion equation [18]. Consequently, regions with A or B particle excess become amplified in time. As a result, when $n_{A0} = n_{B0}$, one finds that for dimensions $d \le 4$ species segregation into A/B rich domains occurs [19]. The annihilation processes are then confined to sharp reaction fronts, leading to a decelerated density decay $n_{A/B}(t) \sim (Dt)^{-d/4}$. For unbalanced initial conditions, stretched exponential relaxation ensues for $d < 2$: $\ln n_B(t) \sim -t^{d/2}$,
whereas $\ln n_B(t) \sim -t / \ln t$ at $d_c = 2$ [20]. In one dimension, special initial configurations may change this picture: Consider, e.g., the alternating arrangement ...ABABAB... of particles that upon encounter react with probability one. Now there is no reason anymore to distinguish between A and B, and the system is in the $2A \to \emptyset$ universality class. An obvious question is then what happens for diffusion-limited pair annihilation of $q > 2$ particle species, with equal initial densities as well as reaction and diffusion rates [21]. In contrast with the two-species case, there exists no local conservation law. Furthermore the renormalization of the reaction vertex proceeds exactly as for $2A \to \emptyset$. Consequently, at least for $d \ge 2$, where the initial state is not that crucial, the long-time limit should in fact generically be governed by the single-species pair annihilation universality class [22]. This is obvious for $q \to \infty$: In this limit, the probability of like particles ever meeting vanishes, which renders the distinction of different species meaningless. However, in one dimension, at least in the limit of large reaction rates (which should describe the asymptotic regime), particles of different types cannot pass each other. This topological constraint allows for species segregation to occur. Indeed, a simplified deterministic version of the $q$-species pair annihilation process yields [22]
$$ n(t) \sim t^{-\alpha(q)} , \quad \text{with} \quad \alpha(q) = (q-1)/2q , \tag{18} $$
which correctly reproduces $\alpha(2) = 1/4$ and $\alpha(\infty) = 1/2$ in $d = 1$. The asymptotic decay (18) along with the subleading correction $\sim t^{-1/2}$ of the pair annihilation process without segregation were recently confirmed in extensive simulations [23]. Yet again, special initial conditions such as ...ABCDABCD... may prevent segregation and instead lead to the $2A \to \emptyset$ decay law.
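As a quick arithmetic check of the exponent in Eq. (18), the following sketch evaluates $\alpha(q) = (q-1)/2q$ exactly with rational numbers. The function name `alpha` and the restriction to integer $q \ge 2$ are choices made here for illustration.

```python
from fractions import Fraction


def alpha(q: int) -> Fraction:
    """Decay exponent alpha(q) = (q - 1) / (2 q) of Eq. (18), as an exact fraction."""
    if q < 2:
        raise ValueError("Eq. (18) is stated for q >= 2 particle species")
    return Fraction(q - 1, 2 * q)


if __name__ == "__main__":
    for q in (2, 3, 4, 10, 1000):
        print(q, alpha(q), float(alpha(q)))
```

The exponent increases monotonically from $1/4$ at $q = 2$ towards the single-species value $1/2$ as $q$ grows, consistent with the two limits quoted in the text.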
6
Branching and Annihilating Random Walks
In order to allow again for a genuine phase transition, we combine the annihilation $kA \to \emptyset$ ($k \ge 2$) with branching processes $A \to (m+1) A$. The associated rate equation for these branching and annihilating random walks (BARW) reads $\partial_t n(t) = -\lambda n(t)^k + \sigma n(t)$, with the solution
$$ n(t) = \frac{n_s}{\left[ 1 + \left( (n_s/n_0)^{k-1} - 1 \right) e^{-(k-1) \sigma t} \right]^{1/(k-1)}} . \tag{19} $$
Mean-field theory thus predicts the density to approach the saturation value $n_s = (\sigma/\lambda)^{1/(k-1)}$ as $t \to \infty$ for any positive branching rate $\sigma$. Above the critical dimension $d_c(k) = 2/(k-1)$, therefore, the system only has an active phase; $\sigma_c = 0$ represents a degenerate 'critical' point, with scaling exponents essentially determined by the pure annihilation model: $\alpha = 1/(k-1) = \beta$, $\nu = 1/2$, and $z = 2$. However, Monte Carlo simulations revealed a much richer picture, in low dimensions clearly distinguishing between the cases of odd and even number of offspring $m$ [3,24]: For $k = 2$, $d \le 2$, and $m$ odd,
a transition to an inactive, absorbing phase is found, characterized by the DP critical exponents. On the other hand, for even offspring number there emerges a phase transition in one dimension, described by a novel universality class with $\alpha \approx 0.27$, $\beta \approx 0.92$, $\nu \approx 1.6$, and $z \approx 1.75$. The above mapping to a stochastic field theory, combined with RG methods, elucidates the physics behind those remarkable findings [25]. The action for the most interesting pair annihilation case becomes
$$ S = \int d^dx \int dt \left[ \hat\psi \left( \partial_t - D \nabla^2 \right) \psi - \lambda \left( 1 - \hat\psi^2 \right) \psi^2 + \sigma \left( 1 - \hat\psi^m \right) \hat\psi \psi \right] , \tag{20} $$
which in general allows no direct Langevin representation. Upon combining the reactions $A \to (m+1)A$ and $2A \to \emptyset$, one notices immediately that the loop diagrams generate the lower-order branching processes $A \to (m-1)A$, $A \to (m-3)A$, ... Moreover, the one-loop RG eigenvalue $y_\sigma = 2 - m(m+1)/2$ (computed at the annihilation fixed point) shows that the reactions with smallest $m$ are the most relevant. For odd $m$, we see that the generic situation is represented by $m = 1$, i.e., $A \to 2A$, supplemented with the spontaneous decay $A \to \emptyset$. After a first coarse-graining step, this latter process (with rate $\mu$) must be included in the effective model, which hence becomes identical with the action (7). Thence we are led to Reggeon field theory (9) describing the DP universality class, provided the induced decay processes are sufficiently strong to render $\sigma_c > 0$. Yet for $d > 2$ the renormalized mass term $\sigma_R - \mu_R$ remains positive, which leaves us with merely the active phase. For $d \le 2$, however, the involved fluctuation integrals are infrared-divergent, thus indeed allowing the induced decays to overcome the branching processes to produce a non-trivial phase transition. As a function of dimension, the critical exponents display an unusual discontinuity at $d_c = 2$, as they jump from their DP to the mean-field values as a result of the vanishing critical branching rate [25].
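The mean-field solution (19) of the BARW rate equation $\partial_t n = -\lambda n^k + \sigma n$ can be checked numerically. The sketch below compares the closed form with a direct fourth-order Runge-Kutta integration; the function names, the parameter values in the usage lines and the step count are illustrative assumptions, not taken from the text.

```python
import math


def barw_mean_field_density(t, n0, lam, sigma, k):
    """Closed-form solution, Eq. (19), of dn/dt = -lam * n**k + sigma * n."""
    n_s = (sigma / lam) ** (1.0 / (k - 1))              # saturation density
    bracket = 1.0 + ((n_s / n0) ** (k - 1) - 1.0) * math.exp(-(k - 1) * sigma * t)
    return n_s / bracket ** (1.0 / (k - 1))


def rate_equation_rhs(n, lam, sigma, k):
    """Right-hand side of the mean-field rate equation."""
    return -lam * n ** k + sigma * n


def integrate_rk4(n0, lam, sigma, k, t_end, steps=10_000):
    """Integrate the rate equation from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        k1 = rate_equation_rhs(n, lam, sigma, k)
        k2 = rate_equation_rhs(n + 0.5 * dt * k1, lam, sigma, k)
        k3 = rate_equation_rhs(n + 0.5 * dt * k2, lam, sigma, k)
        k4 = rate_equation_rhs(n + dt * k3, lam, sigma, k)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


if __name__ == "__main__":
    for k in (2, 3):
        exact = barw_mean_field_density(5.0, n0=0.1, lam=1.0, sigma=0.5, k=k)
        numeric = integrate_rk4(n0=0.1, lam=1.0, sigma=0.5, k=k, t_end=5.0)
        print(k, exact, numeric)
```

Both routes should agree to numerical precision, and for large $t$ the density approaches $n_s = (\sigma/\lambda)^{1/(k-1)}$ for any $\sigma > 0$, which is the mean-field statement that only an active phase exists.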
It is now obvious why the case of even offspring number $m$ is fundamentally different: Here, the most relevant branching process is $A \to 3A$, and spontaneous particle death with associated exponential decay is not generated, which in turn precludes the previous mechanism for producing an inactive phase with exponential decay. This important distinction from the odd-$m$ case can be traced to a microscopic local conservation law, for the reactions $2A \to \emptyset$ and $A \to 3A$, $A \to 5A$, ... always destroy or produce an even number of reactants, preserving the particle number parity. Formally, this is reflected in the invariance of the action (20) under the combined inversions $\psi \to -\psi$, $\hat\psi \to -\hat\psi$. As we saw earlier, the branching rate $\sigma$ certainly constitutes a relevant variable near $d_c = 2$. Therefore the phase transition can only occur at $\sigma_c = 0$, and for any $\sigma > 0$ there exists only an active phase, described by mean-field theory. In two dimensions one readily computes the following logarithmic corrections: $\xi(\sigma) \sim \sigma^{-1/2} \ln(1/\sigma)$, and $n(\sigma) \sim \sigma [\ln(1/\sigma)]^{-2}$ [25]. However, setting $m = 2$ in the one-loop value for the RG eigenvalue $y_\sigma$, we notice that fluctuations drive the branching vertex irrelevant in low
dimensions $d < d_c' \approx 4/3$. More information can be gained through a one-loop analysis at fixed dimension, albeit uncontrolled [25]. The ensuing RG flow equations for the renormalized branching rate $\sigma_R = \sigma / D \kappa^2$ and annihilation rate $\lambda_R = C_d \lambda / D \kappa^{2-d}$, with $C_d = \Gamma(2 - d/2) / 2^{d-1} \pi^{d/2}$, read (for $m = 2$)
$$ \frac{d\sigma_R}{d\ell} = \sigma_R \left[ 2 - \frac{3 \lambda_R}{(1 + \sigma_R)^{2-d/2}} \right] , \qquad \frac{d\lambda_R}{d\ell} = \lambda_R \left[ 2 - d - \frac{\lambda_R}{(1 + \sigma_R)^{2-d/2}} \right] . \tag{21} $$
The effective coupling is then identified as $g = \lambda_R / (1 + \sigma_R)^{2-d/2}$, which approaches the annihilation fixed point $g_a^* = 2 - d$ as $\sigma_R \to 0$, whereas for $\sigma_R \to \infty$ the flow tends towards the active state Gaussian fixed point $g_0^* = 0$. The separatrix between the two phases is given by the unstable RG fixed point $g_c^* = 4/(10 - 3d)$, which enters the physical regime below the borderline dimension $d_c' \approx 4/3$, as shown in Fig. 3. For $d < d_c'$, this describes a dynamic phase transition with $\sigma_c > 0$. The aforementioned fixed-dimension RG analysis yields the rather crude values $\nu \approx 3/(10 - 3d)$, $z \approx 2$, and $\beta \approx 4/(10 - 3d)$ for this parity-conserving (PC) universality class. The absence of any mean-field counterpart for this transition precludes a direct derivation of the 'hyperscaling' relations (10). Amazingly, in this non-equilibrium system fluctuations generate rather than destroy an ordered phase (translating back from the domain wall to the spin picture) in low dimensions. The inactive state is characterized by a vanishing branching rate, and consequently by the algebraic pair annihilation density decay. For particles undergoing Lévy flights, the existence of the power-law inactive phase is controlled by the anomalous diffusion exponent $\rho$, emerging for $\rho > \rho_c \approx 3/2$ in $d = 1$ [26]. Invoking similar arguments for the case of triplet annihilation $3A \to \emptyset$ combined with branching $A \to (m+1)A$ one would expect DP behavior with $\sigma_c > 0$ for $m \bmod 3 = 1, 2$, as then the processes $A \to \emptyset$, $A \to 2A$, and $2A \to A$ are dynamically generated. For $m = 3, 6, \ldots$, on the other hand, there can be different, novel scaling behavior, but because of $d_c = 1$ it will be limited to merely logarithmic corrections in one dimension [25].
[Figure 3: $1/g$ (vertical axis, ticks 0 to 3) plotted against the dimension $d$ (horizontal axis, with 1, $d_c'$ and 2 marked); regions labeled ACTIVE PHASE, INACTIVE PHASE and MEAN FIELD]
Fig. 3. Stationary states and unstable RG fixed point $1/g^*$ for BARW with even offspring number (PC universality class) as function of dimension $d$ [25]
It is also interesting to generalize the even-offspring BARW to $q$ species, such that only equal particles can annihilate, $A_i + A_i \to \emptyset$, but both reactions $A_i \to 3A_i$ (with rate $\sigma$) and $A_i \to A_i + 2A_j$ (for $j \ne i$, and with rate $\sigma'$) are possible. It turns out that the latter process always dominates, and in fact the ratio $\sigma/\sigma' \to 0$ under renormalization. Thus asymptotically one reaches the exactly analyzable $q \to \infty$ limit, with a mere degenerate phase transition at branching rate $\sigma_c = 0$. Below $d_c = 2$, one finds the critical exponents $\alpha = d/2$, $\beta = 1$, $\nu = 1/d$, and $z = 2$ [25]. The situation for $q = 1$ is thus qualitatively different from all multi-component cases.
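The fixed-point structure stated for the one-loop flow equations (21) of the single-species even-offspring case can be verified with a few lines of code. The sketch below evaluates the flow of the effective coupling $g = \lambda_R/(1+\sigma_R)^{2-d/2}$ by the chain rule and locates the dimension at which the unstable fixed point $g_c^* = 4/(10-3d)$ meets the annihilation fixed point $g_a^* = 2-d$. Function names, the bisection bracket $[1, 2]$ and the large value of $\sigma_R$ used to mimic $\sigma_R \to \infty$ are assumptions made for this illustration.

```python
def flow(sigma_r, lambda_r, d):
    """Right-hand sides of the one-loop flow equations (21) for m = 2."""
    denom = (1.0 + sigma_r) ** (2.0 - d / 2.0)
    dsigma = sigma_r * (2.0 - 3.0 * lambda_r / denom)
    dlambda = lambda_r * (2.0 - d - lambda_r / denom)
    return dsigma, dlambda


def effective_coupling_flow(g, sigma_r, d):
    """dg/dl for g = lambda_R / (1 + sigma_R)**(2 - d/2), via the chain rule."""
    denom = (1.0 + sigma_r) ** (2.0 - d / 2.0)
    dsigma, dlambda = flow(sigma_r, g * denom, d)
    return dlambda / denom - (2.0 - d / 2.0) * g * dsigma / (1.0 + sigma_r)


def borderline_dimension(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection for the d at which g_c* = 4/(10 - 3d) equals g_a* = 2 - d."""
    def gap(d):
        return 4.0 / (10.0 - 3.0 * d) - (2.0 - d)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    d = 1.0
    print("dg/dl at g_a*, sigma_R = 0:", effective_coupling_flow(2.0 - d, 0.0, d))
    print("dg/dl at g_c*, sigma_R large:", effective_coupling_flow(4.0 / (10.0 - 3.0 * d), 1e9, d))
    print("borderline dimension:", borderline_dimension())
```

Both printed flow values should vanish up to rounding, and the bisection should return a value close to $4/3$, in line with the borderline dimension $d_c'$ quoted in the text.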
7
Annihilation-Fission Reactions
For single-species reactions without memory and disorder, the only remaining processes with potentially non-mean-field scaling behavior appear to be the combination of purely binary annihilation and fission reactions $2A \to A$ (with rate $\lambda$) and $2A \to (m+2)A$ (rate $\sigma_m$) with $d_c = 2$, and its triplet counterpart ($d_c = 1$). The former reactions subsequently generate $2A \to (m+1)A, mA, \ldots, 2(m+1)A, \ldots$, thus producing infinitely many couplings with identical scaling dimensions [27]. Upon including all these binary particle production reactions, the phase transition is readily seen to occur at $\lambda_c = \sum_m m \, \sigma_m$. The inactive, absorbing phase ($\lambda > \lambda_c$) is obviously characterized by the power laws of the pure coagulation model. Yet for $\lambda < \lambda_c$, the particle density diverges after a finite time, when no constraints on the site occupation numbers $n_i$ are imposed. Thus the asymptotic density is finite only at the phase transition itself. These singular features of the 'bosonic' model with its highly discontinuous phase transition are overcome by restricting the site occupation numbers to $n_i = 0, 1$. Extensive Monte Carlo simulations have in fact revealed that this leads to a continuous transition, with critical exponents that seem to belong to a novel universality class (pair contact process with diffusion, PCPD) with critical dimension $d_c = 2$ [28]. However, owing to the difficulty of obtaining truly asymptotic properties in this system, where reactions become extremely rare at low densities, the precise nature of this critical point in purely binary reactions has remained elusive and rather controversial. This applies even to density matrix RG studies [29]. It is therefore fortunate that recent work has demonstrated how to consistently implement site occupation restrictions into the bosonic field theory [30]. For the above binary processes, the reaction part of the action becomes
$$ S = \int d^dx \int dt \left[ \sigma_m \left( 1 - \hat\psi^m \right) \hat\psi^2 \psi^2 \, e^{-(m+2) v \hat\psi \psi} - \lambda \left( 1 - \hat\psi \right) \hat\psi \psi^2 \, e^{-2 v \hat\psi \psi} \right] . \tag{22} $$
Here the exponential terms capture the occupation number limitations, with $[v] = \kappa^{-d}$, which suggests that $v$ represents a dangerously irrelevant coupling. Indeed, consider more generally the coupled reactions $kA \to (k-l)A$ with $0 < l \le k$ and $nA \to (n+m)A$ with $n, m > 0$, which display a continuous transition for $k \le n$. The mean-field equations obtained from the associated actions show that site occupation restrictions can be neglected at low densities, yielding the critical exponents $\beta = 1/(k-n)$, $\nu = (k-1)/2(k-n)$, $z = 2$,
and $\alpha = 1/(k-1)$, except for the degenerate case $k = n$, where one finds $m \, n_s = \ln(m \sigma_m / l \lambda)$, whence $\beta = 1$, $\nu = k/2$, $z = 2$, and $\alpha = 1/k$ [31,32]. For $k = 1$, expanding the exponentials leads to the action (8), which establishes that the competing processes $A \to \emptyset$, $A \to 2A$, ... with site exclusion yield a DP phase transition. The above field theory should also permit a systematic analysis of the fluctuation corrections for the purely binary and triplet reactions. For the latter, one expects mere logarithmic corrections at $d_c = 1$ to the mean-field scaling laws; yet current simulations are inconclusive [32].
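The mean-field exponent $\beta = 1/(k-n)$ quoted above can be made concrete with a short sketch. It rests on an assumption that is not spelled out in the text: that, with site occupation restrictions neglected, the coupled reactions $kA \to (k-l)A$ (rate $\lambda$) and $nA \to (n+m)A$ (rate $\sigma$) obey the rate equation $\partial_t \rho = -l \lambda \rho^k + m \sigma \rho^n$ for the density $\rho$. The function names and parameter values are likewise hypothetical.

```python
import math


def stationary_density(sigma, lam, k, n_order, l, m):
    """Non-zero stationary density of the assumed rate equation
    d(rho)/dt = -l * lam * rho**k + m * sigma * rho**n_order, valid for k > n_order."""
    if k <= n_order:
        raise ValueError("this low-density formula assumes k > n_order")
    return (m * sigma / (l * lam)) ** (1.0 / (k - n_order))


def estimate_beta(lam, k, n_order, l, m, sigma_small=1e-6, factor=10.0):
    """Slope of log(rho_s) versus log(sigma) near the critical point sigma_c = 0."""
    rho_1 = stationary_density(sigma_small, lam, k, n_order, l, m)
    rho_2 = stationary_density(sigma_small * factor, lam, k, n_order, l, m)
    return math.log(rho_2 / rho_1) / math.log(factor)


if __name__ == "__main__":
    for k, n_order in ((2, 1), (3, 1), (3, 2)):
        print(k, n_order, estimate_beta(lam=1.0, k=k, n_order=n_order, l=1, m=1))
```

Under the stated assumption the estimated slope equals $1/(k-n)$; for $k = 2$, $n = 1$ this is the value $\beta = 1$ of the mean-field description of the competing processes $A \to 2A$ and $2A \to A$.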
8
Concluding Remarks
In this overview, I have outlined how non-linear stochastic processes via their defining master equation can be represented by field theory actions, allowing for a thorough analysis and classification by means of the renormalization group. Systems with a single ‘particle’ species that display a non-equilibrium phase transition from an active to an inactive, absorbing state are generically captured by the directed percolation universality class. The second prominent example, sometimes applicable when additional symmetries (degenerate absorbing states) are present, is the parity-conserving universality class of even-offspring branching and annihilating random walks. The only other scenarios for non-trivial critical scaling behavior appear to be provided by the solely pair or triplet annihilation–fission reactions, where site occupation restrictions become relevant. The full classification of reaction-diffusion models with multiple particle species remains a formidable task. Even the difference in diffusivities may become a relevant control parameter [33]. Specifically in one dimension, exclusion constraints can play a crucial role [34]. Another obviously important open problem concerns the influence of quenched disorder in the reaction rates. For example, a field theory RG investigation for DP with random percolation threshold yields run-away flows [35], reflected in intriguing simulation results with not entirely clear interpretation [36]. A very recent strong disorder RG study, supplemented with numerical density matrix RG calculations, has revealed a novel disorder fixed point [37]. A better understanding of spatially varying reaction rates might also explain the conspicuous rarity of clear-cut experimental realizations even for the supposedly ubiquitous DP universality class [38]. In fact the single verification of DP scaling behavior appears to be its observation in spatio-temporal intermittency in ferrofluidic spikes [39]. 
Thus many intriguing issues are still open; but I expect that in addition to increasingly more extensive Monte Carlo simulations and sophisticated numerical techniques, field theory representations and subsequent analysis by means of the renormalization group will remain an invaluable tool for the further understanding of cooperative phenomena and scale-invariance in interacting non-equilibrium systems.
Acknowledgements
I am grateful for the opportunity to present this overview at the 2003 DPG Spring Meeting. I gladly acknowledge fruitful collaboration and discussions with John Cardy, who introduced me to the topics discussed here, as well as with Olivier Deloubrière, Erwin Frey, Yadin Goldschmidt, Manoj Gopalakrishnan, Geoff Grinstein, Malte Henkel, Henk Hilhorst, Haye Hinrichsen, Martin Howard, Hannes Janssen, Miguel Muñoz, Géza Ódor, Klaus Oerding, Zoltán Rácz, Beth Reid, Beate Schmittmann, Gunter Schütz, Franz Schwabl, Steffen Trimper, Ben Vollmayr-Lee, Frédéric van Wijland, and Royce Zia. This research is presently funded through the National Science Foundation (grant DMR 0075725) and the Jeffress Memorial Trust (J-594). Earlier support came from the EPSRC (GR/J78327), the European Commission (ERB FMBI-CT96-1189), and the DFG (Ta 177/2).
References
1. B. Chopard, M. Droz, Cellular Automaton Modeling of Physical Systems (Cambridge Univ. Press, Cambridge 1998). J. Marro, R. Dickman, Nonequilibrium Phase Transitions in Lattice Models (Cambridge Univ. Press, Cambridge 1999).
2. H. Hinrichsen, Adv. Phys. 49, 815 (2000). G. Ódor, e-print cond-mat/0205644 (2002).
3. P. Grassberger, F. Krause, T. von der Twer, J. Phys. A 17, L105 (1984). M. Droz, Z. Rácz, J. Schmidt, Phys. Rev. A 39, 2141 (1989). N. Menyhárd, J. Phys. A 27, 6139 (1994). N. Menyhárd, G. Ódor, J. Phys. A 28, 4505 (1995); ibid. 29, 7739 (2001).
4. P. C. Hohenberg, B. I. Halperin, Rev. Mod. Phys. 49, 435 (1977).
5. M. Doi, J. Phys. A 9, 1479 (1976). P. Grassberger, P. Scheunert, Fortschr. Phys. 28, 547 (1980). L. Peliti, J. Phys. (Paris) 46, 1469 (1984).
6. N. V. Popov, Functional Integrals and Collective Excitations (Cambridge Univ. Press, New York 1981). J. W. Negele, J. Orland, Quantum Many-Particle Systems (Addison-Wesley, New York 1988).
7. A. A. Migdal, A. M. Polyakov, K. A. Ter-Martirosyan, Phys. Lett. 48 B, 239 (1974). H. D. I. Aberbanel, J. B. Bronzan, Phys. Rev. D 9, 2397 (1974). J. B. Bronzan, J. W. Dash, Phys. Rev. D 10, 4208 (1974).
8. J. L. Cardy, R. L. Sugar, J. Phys. A 13, L423 (1980).
9. W. Kinzel, in: Percolation Structures and Processes, G. Deutscher, R. Zallen, J. Adler (Eds.), Ann. Isr. Phys. Soc. Vol. 5 (Adam Hilger, Bristol 1983), p. 425.
10. E. Frey, U. C. Täuber, F. Schwabl, Phys. Rev. E 49, 5058 (1994).
11. H. K. Janssen, Z. Phys. B 23, 377 (1976). C. de Dominicis, J. Phys. Colloq. (Paris) 37, C1-247 (1976). R. Bausch, H. K. Janssen, H. Wagner, Z. Phys. B 24, 113 (1976).
12. P. Grassberger, K. Sundermeyer, Phys. Lett. 77 B, 220 (1978). P. Grassberger, A. de la Torre, Ann. Phys. (N.Y.) 122, 373 (1979). H. K. Janssen, Z. Phys. B 42, 151 (1981). P. Grassberger, Z. Phys. B 47, 365 (1982).
13. H. K. Janssen, Phys. Rev. Lett. 78, 2890 (1997); J. Stat. Phys. 103, 801 (2001).
14. U. C. Täuber, M. J. Howard, H. Hinrichsen, Phys. Rev. Lett. 80, 2165 (1998). Y. Y. Goldschmidt, H. Hinrichsen, M. J. Howard, U. C. Täuber, Phys. Rev. E 59, 6381 (1999).
15. H. K. Janssen, Z. Phys. B 58, 311 (1985). H. K. Janssen, B. Schmittmann, Z. Phys. B 64, 503 (1986).
16. B. P. Lee, J. Phys. A 27, 2633 (1994). B. P. Lee, J. L. Cardy, J. Stat. Phys. 80, 971 (1995).
17. D. C. Vernon, e-print cond-mat/0304376 (2003).
18. A. A. Ovchinnikov, Y. B. Zeldovich, Chem. Phys. 28, 215 (1978). D. Toussaint, F. Wilczek, J. Chem. Phys. 78, 2642 (1983).
19. B. P. Lee, J. Cardy, Phys. Rev. E 50, R3287 (1994); J. Stat. Phys. 80, 971 (1995). M. Howard, J. Cardy, J. Phys. A 28, 3599 (1995).
20. K. Kang, S. Redner, Phys. Rev. Lett. 52, 955 (1984). M. Bramson, J. L. Lebowitz, Phys. Rev. Lett. 61, 2397 (1988); J. Stat. Phys. 65, 941 (1991).
21. D. ben-Avraham, S. Redner, Phys. Rev. A 34, 501 (1986).
22. O. Deloubrière, H. J. Hilhorst, U. C. Täuber, Phys. Rev. Lett. 89, 250601 (2002).
23. D. Zhong, R. Dawkins, D. ben-Avraham, e-print cond-mat/0301155 (2003).
24. H. Takayasu, A. Yu. Tretyakov, Phys. Rev. Lett. 68, 3060 (1992). I. Jensen, J. Phys. A 26, 3921 (1993); Phys. Rev. E 50, 3623 (1994). D. ben-Avraham, F. Leyvraz, S. Redner, Phys. Rev. E 50, 1843 (1994).
25. J. L. Cardy, U. C. Täuber, Phys. Rev. Lett. 77, 4780 (1996); J. Stat. Phys. 90, 1 (1998).
26. D. Vernon, M. Howard, Phys. Rev. E 63, 041116 (2001).
27. M. J. Howard, U. C. Täuber, J. Phys. A: Math. Gen. 30, 7721 (1997).
28. G. Ódor, Phys. Rev. E 62, R3027 (2000); ibid. 63, 067104 (2001). H. Hinrichsen, Phys. Rev. E 63, 036102 (2001); Physica A 291, 275 (2001). K. Park, H. Hinrichsen, I.-m. Kim, Phys. Rev. E 63, 065103 (2001). G. Ódor, M. C. Marques, M. A. Santos, Phys. Rev. E 65, 056113 (2002). K. Park, I.-m. Kim, Phys. Rev. E 66, 027106 (2002). R. Dickman, M. A. F. de Menezes, Phys. Rev. E 66, 045101 (2002). H. Hinrichsen, Physica A 320, 249 (2003); Eur. Phys. J. B 31, 365 (2003). J. Kockelkoren, H. Chaté, Phys. Rev. Lett. 90, 125701 (2003). G. Ódor, Phys. Rev. E 67, 016111 (2003).
29. E. Carlon, M. Henkel, U. Schollwöck, Phys. Rev. E 63, 036101 (2001). G. T. Barkema, E. Carlon, e-print cond-mat/0302151 (2003).
30. F. van Wijland, Phys. Rev. E 63, 022101 (2002).
31. O. Deloubrière, U. C. Täuber, unpublished (2003).
32. K. Park, H. Hinrichsen, I.-m. Kim, Phys. Rev. E 66, 025101(R) (2002). G. Ódor, e-prints cond-mat/0210615 (2002); cond-mat/0304023 (2003).
33. F. van Wijland, K. Oerding, H. J. Hilhorst, Physica A 251, 179 (1998).
34. G. Ódor and N. Menyhárd, Physica D 168, 305 (2002).
35. H. K. Janssen, Phys. Rev. E 55, 6253 (1997).
36. A. G. Moreira, R. Dickman, Phys. Rev. E 54, R3090 (1996).
37. J. Hooyberghs, F. Iglói, C. Vanderzande, Phys. Rev. Lett. 90, 100601 (2003).
38. H. Hinrichsen, Braz. J. Phys. 30, 69 (2000).
39. P. Rupp, R. Richter, I. Rehberg, Phys. Rev. E 67, 036209 (2003).
Protein Dynamics: Glass Transition and Mechanical Function
Torsten Becker, Stefan Fischer, Frank Noe, Alexander L. Tournier, G. Matthias Ullmann, and Jeremy C. Smith
German Interdisciplinary Research Center for Scientific Computing, Computational Molecular Biophysics Group, 69120 Heidelberg, Germany
Abstract. We review here internal motions in proteins on a range of time- and length scales, with special emphasis on the role computer simulations can play in their elucidation when used in judicious combination with experiment. The glass transition in protein function is described and the use of neutron scattering in determining dynamical properties relevant to the glass transition is summarised. The coupling between protonation reactions and protein conformational dynamics in the photosynthetic reaction centre is described. Finally, we consider the complex computational problem of calculating large-scale conformational transitions in proteins.
1
Introduction
Whereas the second half of the 20th century was notable for having seen the development and application of techniques for solving the three-dimensional structures of biological macromolecules, the 21st century may well be that in which the internal dynamics required for function are finally elucidated. Motions in proteins are inherently difficult to characterize in detail, due to their wide range of forms and timescales and their inherent anharmonicity due to the irregularity of protein energy surfaces. Therefore, computational methods, such as molecular dynamics simulation, must play a predominant role in sorting out which motions occur and which are required for function. Here we broadly review some aspects of this field that are of particular physical interest, ranging from the glass transition in proteins through to large-scale conformational change. In this way the uninitiated reader will gain insight into the complexity of the protein dynamical landscape and the various ways in which dynamics can influence function.
2
Hydration Effects and the Dynamical Transition in Proteins
2.1
The Dynamical Transition in Proteins
Various experimental techniques such as neutron scattering, Mössbauer spectroscopy, and X-ray scattering have shown the presence of a temperature-dependent transition in protein dynamics at around 180 K-220 K [1,2,3,4]. In this temperature range the dynamics of proteins, as represented by the mean-square displacement, $\langle u^2 \rangle$, of the protein atoms change from harmonic behavior below the transition temperature to anharmonic behavior above. This dynamical transition has also been seen using molecular dynamics (MD) simulation techniques [5,6,7]. Figure 1 shows such a simulation, in which a typical transition in the protein $\langle u^2 \rangle$ can be seen at 220 K. Experiments have shown that in several proteins biological function ceases below the dynamical transition [8,9,10]. The protein dynamics transition has features in common with the glass transition [11,12]. Whether a protein can be considered a glass is still much debated. Proteins share the stretched exponential behavior seen in glasses; however, they lack the precisely defined transition temperature, $T_g$, and the characteristic sharp jump in heat capacity at $T_g$ seen in glasses.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 677–693, 2003. © Springer-Verlag Berlin Heidelberg 2003
2.2
The Role of Solvent in the Dynamical Transition
A number of experiments have indicated that when a protein is solvated the dynamical transition is strongly coupled to the surrounding solvent [8,13,14,15,16]. The observed dependence of the dynamical transition behavior on the solvent composition leads to the question of the role of solvent in the dynamical transition [13,6]. However, whether solvent drives the dynamical transition in a hydrated protein is still open to question.
In recent MD simulations solvent effects were probed by using dual heatbath methods, which set the protein and its solvent at different temperatures. This approach enables features inherent to the protein energy landscape to be distinguished from features due to properties of the solvent. To perform the dual heatbath simulations, the Nosé-Hoover-Chain (NHC) algorithm was implemented and added to the CHARMM package [17,18]. The model system consisted of myoglobin surrounded by one shell of solvent (492 water molecules). In the NHC method the different parts of the system are each regulated not by one but by two heatbaths, the first one regulating the system and the second regulating the first heatbath. NHC has the advantage over the Nosé-Hoover algorithm that it reproduces exact canonical behavior and is more stable.
Figure 1 shows the results of simulations in which the protein and solvent are set at the same temperature. These reproduce the experimentally observed dynamical transition in $\langle u^2 \rangle$ in myoglobin at ∼220 K. $\langle u^2 \rangle$ is seen to increase relatively slowly up to ∼220 K, whereas beyond this temperature it increases more sharply with temperature, giving rise to the characteristic dynamical transition feature. The data from this initial set of runs were analysed to investigate which parts of the protein are subject to the dynamical transition. The $\langle u^2 \rangle$ of side-chain atoms was found to be 6 times greater than the backbone $\langle u^2 \rangle$ at 80 K, and twice as large at 300 K (data not shown). The innermost atoms were seen not to show any dynamical transition feature, their $\langle u^2 \rangle$ increasing linearly with temperature. In contrast, the outer shells of atoms exhibit a marked transition at ∼220 K, the outermost solvent-exposed atoms being the most affected (data not shown). Thus, the atoms found to be most influenced by the dynamical transition are the side-chain atoms on the outer layers of the protein, i.e., the protein atoms interacting with the solvent shell.
Fig. 1. Mean-square fluctuations, $\langle u^2 \rangle$, of the protein non-hydrogen atoms for different sets of simulations
2.3
Solvent Caging of Protein Dynamics
Figure 1 also presents the protein fluctuations calculated from the dual heatbath simulations, performed fixing the temperature of one component below the dynamical transition while varying the temperature of the other component. Fixing the solvent temperature at 80 K or 180 K suppresses the dynamical transition, the protein $\langle u^2 \rangle$ increasing linearly with temperature up to 300 K. Therefore, low temperature solvent cages the protein dynamics. Figure 1 also shows that holding the protein temperature constant at 80 K or 180 K and varying the solvent temperature also abolishes the dynamical transition behavior in the protein. In summary then, holding either component at a low temperature suppresses the protein dynamical transition. Cold (80 K and 180 K) solvent is seen to effectively cage protein dynamics over the whole range of protein temperatures examined (from 80 K up to 180 K). This indicates the important role of solvent in influencing protein dynamics.
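The quantity plotted in Fig. 1, the mean-square fluctuation $\langle u^2 \rangle$ of the atoms about their average positions, is straightforward to evaluate from a stored trajectory. The following is a minimal sketch, assuming the coordinates are available as an array of shape (frames, atoms, 3) with overall translation and rotation already removed; the function name, the array layout and the synthetic test data are illustrative and do not reproduce the simulations discussed above.

```python
import numpy as np


def mean_square_fluctuation(trajectory):
    """Mean-square fluctuation about the time-averaged positions.

    trajectory: array of shape (n_frames, n_atoms, 3).
    Returns (<u^2> averaged over all atoms, per-atom <u^2> array).
    """
    trajectory = np.asarray(trajectory, dtype=float)
    mean_positions = trajectory.mean(axis=0)            # (n_atoms, 3)
    displacements = trajectory - mean_positions         # (n_frames, n_atoms, 3)
    per_atom = (displacements ** 2).sum(axis=2).mean(axis=0)
    return per_atom.mean(), per_atom


if __name__ == "__main__":
    # Synthetic harmonic "atoms": independent Gaussian displacements per coordinate.
    rng = np.random.default_rng(0)
    width = 0.1                                         # arbitrary length unit
    fake_trajectory = rng.normal(0.0, width, size=(5000, 50, 3))
    u2, _ = mean_square_fluctuation(fake_trajectory)
    print("<u^2> =", u2, "expected about", 3 * width ** 2)
```

Selecting subsets of atoms (for example backbone versus side chain, or inner versus outer shells) before averaging `per_atom` would give the kind of decomposition described in the text; repeating the calculation at several temperatures yields a curve of $\langle u^2 \rangle$ against temperature.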
3
Neutron Scattering from Proteins
Neutron scattering is widely used to probe picosecond-nanosecond timescale dynamics of condensed-phase molecular systems [19,20]. In hydrogen-rich
molecules like proteins the scattering will be dominated, due to the high incoherent scattering length of hydrogen atoms, by incoherent scattering. The basic quantity measured is the incoherent dynamic structure factor $S(\mathbf{Q}, \omega)$, where $E = \hbar \omega$ is the energy transfer and $\mathbf{Q} = \mathbf{k}_f - \mathbf{k}_i$, with $\mathbf{k}_i$ and $\mathbf{k}_f$ being the incident and final wave vectors of the scattered neutrons, respectively. The scattering function contains information on both the timescales and the spatial characteristics of the dynamical relaxation processes involved. The scattering function can be written as the space and time Fourier transform of the van Hove autocorrelation function $G(\mathbf{r}, t)$:
$$ S(\mathbf{Q}, \omega) = \frac{1}{(2\pi)^2} \int dt \, e^{-i \omega t} \int d\mathbf{r} \, e^{-i \mathbf{Q} \cdot \mathbf{r}} \, G(\mathbf{r}, t) , \tag{1} $$
$$ G(\mathbf{r}, t) = \frac{1}{N} \sum_i \left\langle \int d\mathbf{r}' \, \delta\left( \mathbf{r} - \mathbf{r}' + \mathbf{R}_i(0) \right) \delta\left( \mathbf{r}' - \mathbf{R}_i(t) \right) \right\rangle . \tag{2} $$
In Eqs. (1) and (2) $\mathbf{R}_i(t)$ is the position vector of atom $i$ ($i = 1 \ldots N$) at time $t$ and the brackets $\langle \cdots \rangle$ indicate an ensemble average. $G(\mathbf{r}, t)$ is the probability that a certain particle can be found at position $\mathbf{r}$ at time $t$, given that it was at the origin at $t = 0$ (here and throughout the article only the classical limit of these functions is considered). For the dynamical transition in proteins, the elastic and quasielastic parts of the scattering function are of primary interest. In systems of spatially-confined atoms, such as proteins, the elastic incoherent neutron scattering is of relatively high intensity and is thus used to obtain an estimate of the dynamics present. Quasielastic scattering gives access to typical timescales and geometries of diffusive dynamics involved. Equation (2) shows that the average over all hydrogen atoms determines the scattering function. Given that hydrogens are equally distributed over the protein, we see that neutron scattering gives an average over all motions present in the protein.
The guiding picture in interpreting dynamic neutron scattering is that of a potential energy surface or "energy landscape". The shape of this energy landscape determines the associated microscopic dynamics. For proteins, however, the energy landscape can be complex and rugged with many local minima. This leads to the presence of a wide range of vibrational and diffusive dynamical processes. The simplicity of looking at only one atom type is therefore somewhat overshadowed by the difficulty that a wide range of dynamical processes may have to be considered to adequately describe the scattering function. Several simplified models have been used to describe the dynamics activated at the dynamical transition, including continuous diffusion [21], jumping between minima [1,22,23,24,25], mode-coupling theory [26,27,28], stretched-exponential behavior [29] and 'effective force constants' [30]. Although these models are sometimes qualitatively different, all can reproduce available experimental data well. Recently a method has been presented for extracting useful information from experimental elastic incoherent neutron scattering data without assuming a specific dynamical model [31]. The method proceeds in two stages: fitting to the $Q$-dependence of the elastic scattering, followed by decomposition of the resulting $\langle \Delta r^2 \rangle$.
3.1
Analysing Elastic Scattering at Low Q
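To analyse elastic incoherent scattering from proteins without restricting the interpretation to a specific model, the use of a heuristic function of the following form is proposed:
$$S(Q, 0) = e^{-\frac{1}{6} Q^2 \langle \Delta r^2 \rangle} \left[ 1 + \sum_{m=2}^{\infty} b_m \cdot (-Q^2)^m \right] \qquad (3)$$
$$\approx e^{-\frac{1}{6} Q^2 \langle \Delta r^2 \rangle} \left[ 1 + b \cdot Q^4 \right] \qquad (4)$$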
Here $b_m$ are parameters fitted to reproduce the experimentally observed elastic incoherent scattering. This expansion of the elastic scattering function reflects the idea that for low $Q$ the scattering is a Gaussian function in $Q$, with increasing deviations at higher $Q$-values. Two different aspects contribute to the parameters $b_m$: the dynamics of single atoms can lead to a deviation from a Gaussian scattering function at higher $Q$-values, and so can a distribution of mean-square displacements. Considering these two possible causes of non-Gaussian behavior, it was shown in [31] that both lead to a function of the form of Eq. (3). It was also shown that in systems in which heterogeneity makes the dominant contribution to the heuristic parameters $b_m$, $b_1$ is the variance of the distribution of $\langle \Delta r^2 \rangle$. However, the extent to which each of these two effects contributes will vary from system to system and is not known a priori. Therefore, using Eq. (3) and treating $b_m$ as heuristic parameters is equivalent to making minimal assumptions about the system. In the low $Q$-range, as long as deviations from Gaussian behavior are small, i.e., $b_m \cdot (Q^2)^m \ll 1$, we can neglect higher-order terms and derive two parameters from the elastic scattering, $\langle \Delta r^2 \rangle$ and $b$ (see Fig. 2 and Tab. 1).
Table 1. Mean-square displacement and variance from fitting Eq. (4) to the experimental data obtained in [32]
T [K]    ⟨∆r²⟩ [Å²]      b [Å⁴]
200      0.02 ± 0.02     0.009 ± 0.009
230      0.08 ± 0.05     0.09 ± 0.09
260      0.37 ± 0.06     0.45 ± 0.1
280      0.6 ± 0.2       0.3 ± 0.2
300      1.1 ± 0.1       0.8 ± 0.1
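The low-Q analysis of Eq. (4) can be illustrated with a short numerical sketch. It is not the fitting procedure of [31]; it assumes that b·Q⁴ is small enough for ln(1 + b·Q⁴) ≈ b·Q⁴, so that ln S(Q, 0) becomes linear in the two unknowns and can be solved by ordinary least squares. The function names and the synthetic data (generated from the 300 K entries of Table 1) are illustrative assumptions.

```python
import math


def elastic_intensity(q, msd, b):
    """Eq. (4): S(Q, 0) ~ exp(-Q^2 <dr^2> / 6) * (1 + b Q^4)."""
    return math.exp(-q * q * msd / 6.0) * (1.0 + b * q ** 4)


def fit_low_q(q_values, s_values):
    """Least-squares fit of ln S = c1 * Q^2 + c2 * Q^4 (no intercept).

    Assumes b * Q^4 << 1, so that ln(1 + b Q^4) ~ b Q^4.
    Returns (msd, b) with msd = -6 * c1 and b = c2.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for q, s in zip(q_values, s_values):
        x = q * q            # regressor Q^2
        y = math.log(s)      # ln S(Q, 0)
        s11 += x * x
        s12 += x ** 3
        s22 += x ** 4
        t1 += x * y
        t2 += x * x * y
    det = s11 * s22 - s12 * s12
    c1 = (t1 * s22 - t2 * s12) / det
    c2 = (s11 * t2 - s12 * t1) / det
    return -6.0 * c1, c2


if __name__ == "__main__":
    # Synthetic "data" from Eq. (4); Q in 1/Angstrom, kept small so b Q^4 << 1.
    qs = [0.1 * i for i in range(1, 7)]
    ss = [elastic_intensity(q, msd=1.1, b=0.8) for q in qs]
    msd_fit, b_fit = fit_low_q(qs, ss)
    print(f"<dr^2> ~ {msd_fit:.3f} A^2, b ~ {b_fit:.3f} A^4")
```

Because the logarithm is linearized, the recovered values approach the input parameters only as the Q-range is restricted to small values; a nonlinear fit of Eq. (4) itself would avoid this approximation.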
Fig. 2. Equation (4) (dot-dashed line) fitted to experimental data from [32]. See Tab. 1 for the resulting parameters
3.2
Measured Mean Square Displacement
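Figure 3 shows $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$ as a function of temperature. The data exhibit a dynamical transition at ∼220 K involving a sharp increase in $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$. The next step in neutron data analysis is to interpret $\langle \Delta r^2 \rangle$ as obtained from Eq. (4). In [31] the following form for the measured mean-square displacement was derived:
$$\langle \Delta r^2 \rangle_{\mathrm{Expt}} = \langle \Delta r^2 \rangle_{\mathrm{Conv}} - a\, \frac{2}{\pi} \arctan\!\left( \frac{\Delta\omega}{\lambda} \right) \qquad (5)$$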
Here, $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$ is the time-converged mean-square displacement, consisting of a vibrational part and contributions from the elastic incoherent scattering function of the protein. The second term on the right-hand side represents the finite energy resolution of the instrument: $\Delta\omega$ is the full-width-at-half-maximum resolution of the instrument, $\lambda$ is a typical relaxation frequency of the protein, and $a$ is the maximal amount that the relaxation process can contribute to the time-converged $\langle \Delta r^2 \rangle$.
Equation (5) shows that two different processes can lead to a temperature-dependent transition in $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$: a non-linear change in $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$ with $T$ or, equally well, a temperature dependence of the relaxation frequency $\lambda$. Both possibilities are now discussed.
3.3
Temperature-Dependent $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$
Models involving a nonlinear temperature dependence of $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$ have frequently been invoked to explain dynamical transition behavior. In these models the dynamical transition results from a change with $T$ of the equilibrium, converged, long-time atomic probability distribution, i.e., of $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$. One example is the model in [1], which consists of a two-state potential with a free-energy difference $\Delta U$ between the states, separated by a distance $d$. The increased population of the higher-energy state with increasing temperature leads to a transition in $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$ and thus in $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$. Another model is that in [30] and [33], in which the energy landscape is approximated by two harmonic potentials with different force constants. Here, the probability of atoms occupying the potential with the lower force constant increases with temperature, thus also leading to an increase of $\langle \Delta r^2 \rangle_{\mathrm{Conv}}$. What the above models have in common is that they lead to $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$ being independent of the instrumental resolution, provided that the resolution is sufficiently high for all the relaxation processes in the system to be accessed.
3.4
Temperature-Dependent $\langle \Delta r^2 \rangle_{\mathrm{Res}}$
An alternative mechanism for nonlinear behavior of $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$ involves a nonlinear increase with $T$ of $\langle \Delta r^2 \rangle_{\mathrm{Res}}$, due to motions becoming fast enough to be detected. In principle this effect can lead to apparent dynamical transition behavior in the absence of any change in $\langle \Delta r^2 \rangle_{\mathrm{EISF}}$. Figure 3 shows a fit of Eq. (5) to the experimentally determined $\langle \Delta r^2 \rangle$ from [32]. The inset to Fig. 3 shows the associated relaxation time, $\tau(T) = 1/\lambda(T)$; $\tau$ changes from the nanosecond to the picosecond timescale with increasing temperature, passing into the instrumental time resolution window of ∼100 ps. This figure demonstrates that dynamical transition behavior can appear in a dynamic neutron scattering experiment without any change with $T$ in the long-time, converged dynamics.
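The resolution effect described above can be reproduced with a minimal numerical sketch of Eq. (5). The Arrhenius form of λ(T) and all parameter values below are assumptions made purely for illustration; they are not the fit of Fig. 3. The converged mean-square displacement is deliberately held constant, so any temperature dependence of the output comes from λ(T) alone.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def measured_msd(msd_conv, a, delta_omega, lam):
    """Eq. (5): <dr^2>_Expt = <dr^2>_Conv - a * (2/pi) * arctan(delta_omega / lam)."""
    return msd_conv - a * (2.0 / math.pi) * math.atan(delta_omega / lam)


def arrhenius_rate(temperature, prefactor, barrier):
    """Assumed (not from the text) Arrhenius law for the relaxation frequency."""
    return prefactor * math.exp(-barrier / (K_B * temperature))


if __name__ == "__main__":
    # Hypothetical parameters: rates in 1/ps, barrier in kcal/mol, MSD in A^2.
    delta_omega = 0.01  # roughly corresponds to a 100 ps resolution window
    for temperature in range(180, 321, 20):
        lam = arrhenius_rate(temperature, prefactor=1.0e3, barrier=5.0)
        msd = measured_msd(msd_conv=1.1, a=1.0, delta_omega=delta_omega, lam=lam)
        print(f"T = {temperature} K  tau = {1.0 / lam:9.1f} ps  <dr^2>_Expt = {msd:.3f} A^2")
```

In the two limits the function behaves as Eq. (5) requires: for λ much smaller than Δω the full amount a is subtracted, and for λ much larger than Δω the measured value approaches the converged one.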
Fig. 3. $\langle \Delta r^2 \rangle_{\mathrm{Expt}}$ determined on a protein solution (glutamate dehydrogenase in 70% CD$_3$OD/30% D$_2$O) using the instrument IN6 at the ILL [32], and fitted using Eq. (5). Inset: characteristic relaxation time, $1/\lambda(T)$, as a function of temperature
4
Protonation Reactions in Proteins
Electrostatic interactions are important for understanding biochemical systems. Acid-base reactions create or destroy unit charges in biomolecules and can thus be fundamental for their function. Together with association reactions and chemical modifications such as phosphorylations, acid-base reactions are the main cause of changes in protein properties. Protonation or deprotonation of titratable groups can cause changes in binding affinities, enzymatic activities, and structural properties. Moreover, protonations or deprotonations are very often the key events in enzymatic reactions. The reduction or oxidation of redox-active groups has a similar importance. In particular, the reduction of disulfide bonds can cause unfolding or functionally important conformational transitions. Consequently, the function of most proteins depends crucially on the pH and on the redox potential of the solution. For example, acidic denaturation of proteins in the stomach is a prerequisite for protein degradation during digestion. Besides this rather unspecific effect, the environment can tune the physiological properties of proteins in a specific manner. Different values of pH or redox potential in different organs, tissues, cells, or cell compartments steer protein function. Physiological redox and pH buffers such as glutathione and phosphates strictly control these environmental parameters in living systems.
A few examples emphasize the physiological importance of pH and redox potential. The pH gradient in mitochondria or chloroplasts drives ATP synthesis. In both systems this pH gradient is generated by several proton transfer steps that couple to a sequence of redox reactions. In hemoglobin, pH influences O$_2$ binding and thus regulates O$_2$ release during blood circulation, a behavior known as the Bohr effect. Pepsinogen cleaves itself in an acidic environment to give the highly active peptidase pepsin. Finally, membrane fusion during influenza virus infection involves large pH-induced structural changes of the protein hemagglutinin.
5
Coupling between Conformational and Protonation State Changes in Membrane Proteins
Many membrane proteins transport electrons and protons across a membrane [34]. Protonatable groups play a prominent role in these reactions, because they can either function as proton acceptors or donors in proton transfer reactions or influence the redox potential of adjacent redox-active groups. The titration behavior of protonatable groups in proteins can often deviate considerably from the behavior of isolated compounds in aqueous solution. This deviation is caused by interactions of the protonatable group with other charges in the protein and also by changes in the dielectric environment of the titratable group when the group is transferred from aqueous solution into the protein [35,36,37,38]. The situation can be even more complicated: because the charge of protonatable residues depends on pH, their interaction is also pH-dependent. This can lead to titration curves that cannot be described by the usual sigmoidal functions [39,40].
The photosynthetic reaction center (RC) is the membrane protein complex that performs the initial steps of conversion of light energy into electrochemical energy [41,42] by coupling electron transfer reactions to proton transfer. The bacterial RC of Rb. sphaeroides is composed of three subunits: L, M and H. The L and M subunits have pseudo-two-fold symmetry, and each consists of five transmembrane helices. The H subunit caps the RC on the cytoplasmic side and possesses a single N-terminal transmembrane helix. The RC binds several cofactors: a bacteriochlorophyll dimer, two monomeric bacteriochlorophylls, two bacteriopheophytins, two quinones, a non-heme iron and a carotenoid. The non-heme iron lies between the two quinone molecules. The primary electron donor, a bacteriochlorophyll dimer called the special pair, is located near the periplasmic surface of the complex, and the terminal electron acceptor, a quinone called $Q_B$, is located near the cytoplasmic side. While $Q_A$ is a one-electron acceptor and does not protonate directly, $Q_B$ accepts two electrons and two protons to form the reduced $Q_B H_2$ molecule. The first reductions of $Q_A$ and of $Q_B$ are accompanied by pKa shifts of residues that interact with the semiquinone species [43]. The reductions induce substoichiometric proton uptake by the protein [44,45,46]. The number of protons taken up by the protein upon reduction of the quinones is an observable that depends directly on the energetics of the system and is intimately coupled to the thermodynamics of the electron transfer process between the states $Q_A^- Q_B$ and $Q_A Q_B^-$.
The pH-dependence of the proton uptake associated with the formation of $Q_A^-$ and $Q_B^-$ in wild-type RCs has been determined for Rb. sphaeroides and Rb. capsulatus [47,48,49,50]. Using X-ray structural analysis, it has been shown that a major conformational difference exists between the RC handled in the dark (the ground state) and under illumination (the charge-separated state) [51]. The main difference between the two structures concerns $Q_B$ itself, which was found in two different positions about 4.5 Å apart. In the dark-adapted state, in which $Q_B$ is oxidized, $Q_B$ is found mainly in the distal position and only a small percentage in the proximal position. Under illumination, i.e., when $Q_B$ is reduced, $Q_B$ is seen only in the proximal position. The crystal was grown at pH = 8 [52]. The reaction center structures with proximal or distal $Q_B$ are called RC$^{\mathrm{prox}}$ and RC$^{\mathrm{dist}}$, respectively [53]. The proton uptake upon the first $Q_B$ reduction and the pH-dependent conformational equilibrium between RC$^{\mathrm{prox}}$ and RC$^{\mathrm{dist}}$ are shown in Figs. 4 and 5, respectively.
Fig. 4. pH-dependence of the proton uptake upon $Q_B$ reduction (proton uptake plotted against pH for the reactions RC$^{\mathrm{dist}}_{Q_B}$ ←→ RC$^{\mathrm{dist}}_{Q_B^-}$ and RC$^{\mathrm{prox}}_{Q_B}$ ←→ RC$^{\mathrm{prox}}_{Q_B^-}$). The symbols in the diagram show the experimentally determined proton uptake. The line shows a proton uptake calculated by electrostatic calculations using a model that takes conformational transition between the two different reaction center positions RC$^{\mathrm{dist}}$ and RC$^{\mathrm{prox}}$ into account
Fig. 5. Conformational equilibrium between RC$^{\mathrm{prox}}$ and RC$^{\mathrm{dist}}$ structures (population of RC$^{\mathrm{prox}}$ plotted against pH). The population of RC$^{\mathrm{prox}}$ shown for oxidized (dashed line) and reduced (solid line) $Q_B$ depends on pH. In the neutral pH range, both conformations are populated
Using continuum electrostatic calculations, we investigated the pH-dependence of the proton uptake associated with the reduction of $Q_B$ [54]. The two experimentally observed conformations of the RC were considered: with $Q_B$ bound in the proximal or the distal binding site. Comparing the calculated and experimental pH-dependence of the proton uptake revealed that a pH-dependent conformational transition is required to reproduce the experimental proton uptake curve (Fig. 5). Neither the individual conformations nor a static mixture of the two conformations with a pH-independent population is capable of reproducing the experimental proton uptake profile. The result is a new picture of RC function in which the position of $Q_B$ depends not only on the redox state of $Q_B$, but also on pH [54]. This model will now be tested experimentally by performing X-ray crystallography of the RC system at different pH values.
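The coupling between protonation and conformation can be made concrete with a deliberately small toy model. It is not the continuum electrostatic calculation of [54]: it assumes a single titratable site whose pKa differs between the two conformations, and a conformational equilibrium constant `k_deprot` (proximal over distal) for the deprotonated protein. All names and numerical values are hypothetical. Standard linkage thermodynamics then makes the population of the proximal conformer pH-dependent whenever the two pKa values differ.

```python
def state_weights(ph, k_deprot, pka_prox, pka_dist):
    """Statistical weights of the proximal and distal conformers at a given pH."""
    z_prox = k_deprot * (1.0 + 10.0 ** (pka_prox - ph))
    z_dist = 1.0 + 10.0 ** (pka_dist - ph)
    return z_prox, z_dist


def prox_population(ph, k_deprot, pka_prox, pka_dist):
    """Fraction of molecules in the proximal conformation."""
    z_prox, z_dist = state_weights(ph, k_deprot, pka_prox, pka_dist)
    return z_prox / (z_prox + z_dist)


def mean_protonation(ph, k_deprot, pka_prox, pka_dist):
    """Average number of protons bound to the site, over both conformers."""
    z_prox, z_dist = state_weights(ph, k_deprot, pka_prox, pka_dist)
    bound = k_deprot * 10.0 ** (pka_prox - ph) + 10.0 ** (pka_dist - ph)
    return bound / (z_prox + z_dist)


def proton_uptake(ph, oxidized, reduced):
    """Protons taken up on reduction; each state is (k_deprot, pka_prox, pka_dist)."""
    return mean_protonation(ph, *reduced) - mean_protonation(ph, *oxidized)


if __name__ == "__main__":
    oxidized = (0.2, 7.5, 6.5)  # hypothetical parameters for oxidized Q_B
    reduced = (5.0, 9.0, 7.0)   # hypothetical parameters for reduced Q_B
    for ph in (6.0, 7.0, 8.0, 9.0, 10.0):
        print(ph,
              round(prox_population(ph, *oxidized), 3),
              round(prox_population(ph, *reduced), 3),
              round(proton_uptake(ph, oxidized, reduced), 3))
```

If both conformers are given the same pKa, the population reduces to the constant k_deprot/(1 + k_deprot), which corresponds to the static mixture that the text finds insufficient.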
6
Analysis of Conformational Changes in Proteins
Proteins often have multiple stable macrostates. The conformations associated with one macrostate correspond to a certain biological function. Understanding the transition between these macrostates is important for comprehending the interplay between the protein in question and its environment, and can even help to understand malfunctions that lead to diseases like cancer. While these conformational transitions are usually too fast to be measured experimentally, they also occur too rarely to be observed by running standard molecular dynamics simulations. They thus pose a difficult challenge to theoretical molecular biophysicists. In this final section, we briefly summarize computational methods that have been proposed to analyze conformational changes in macromolecules and to identify possible reaction pathways. In particular we will be interested in the analysis of complex transitions in proteins such as the conformational switch in Ras p21 (Fig. 6).
Computationally, the problem of finding one possible reaction path between two macrostates corresponds to identifying a low-energy path on the potential energy surface of the protein between two representative end-conformations. A continuous path can be represented by a series of points in conformational space, $P = [\mathbf{r}_0, \mathbf{r}_1, \ldots, \mathbf{r}_{M-1}, \mathbf{r}_M]$, and some predefined way of interpolating between adjacent points, e.g. linear or spline interpolation. Here, $\mathbf{r}_0$ and $\mathbf{r}_M$ are the given reactant and product end states, while the intermediate points have to be found. Starting from an initial guessed path, the intermediate points can be optimized locally using different methods to obtain a low-energy reaction pathway.
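As a minimal sketch of this path representation, the snippet below builds an initial guess by linear interpolation between two end states given as flat coordinate lists. The function names are illustrative assumptions, and for real proteins such a guess is only a starting point that the methods discussed in this section would then refine.

```python
import math


def linear_path(r_start, r_end, n_segments):
    """Return the path [r_0, ..., r_M] with M = n_segments by linear interpolation."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    path = []
    for k in range(n_segments + 1):
        t = k / n_segments  # runs from 0 (reactant) to 1 (product)
        path.append([a + t * (b - a) for a, b in zip(r_start, r_end)])
    return path


def path_length(path):
    """Sum of the distances |r_{k+1} - r_k| between adjacent path points."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


if __name__ == "__main__":
    # Two hypothetical end states in a two-dimensional conformational space.
    path = linear_path([0.0, 0.0], [3.0, 4.0], n_segments=5)
    for point in path:
        print(point)
    print("length:", path_length(path))
```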
Fig. 6. The GTP-bound (left) and GDP-bound (right) states for the conformational switch of Ras p21. The transition involves a complex reconfiguration of the backbone fold, providing a considerable challenge for computational pathfinding methods
6.1
Penalty Function Methods
One broad class of methods defines a path cost (or penalty) function, $C$:
$$C = \sum_{k=0}^{M-1} c(\mathbf{r}_k)\, |\mathbf{r}_{k+1} - \mathbf{r}_k| , \qquad (6)$$
where $c(\mathbf{r}_k)\, |\mathbf{r}_{k+1} - \mathbf{r}_k|$ assigns a cost for moving along the path from position $\mathbf{r}_k$ to $\mathbf{r}_{k+1}$. The initial guess path can be improved straightforwardly by minimizing its cost function in the space of all possible paths using standard techniques such as steepest descent, conjugate gradient or simulated annealing. A rather straightforward definition for a cost function [23] is
$$c(\mathbf{r}) = E(\mathbf{r})/L , \qquad (7)$$
where $E(\mathbf{r})$ is the potential energy of conformation $\mathbf{r}$ and $L$ is the total path length, defined as $\sum_{k=0}^{M-1} |\mathbf{r}_{k+1} - \mathbf{r}_k|$. The best path is thereby defined as the path of minimum mean potential energy. With some additional constraints, as described below, this functional is called a self-penalty walk. It is, in its spirit and results, very similar to the nudged elastic band method [55]. While having the merit of simplicity, the above definition has the following difficulty: through averaging, many small barriers can produce a cost comparable to that of a single very high barrier. This is not very realistic, as the probability of a state decays exponentially with its energy. In this respect, the MaxFlux algorithm [56] proposes a physically better-motivated cost function
$$c(\mathbf{r}) = e^{E(\mathbf{r})/kT} , \qquad (8)$$
where $k$ is the Boltzmann constant and $T$ is the temperature. This temperature dependence allows shorter paths with higher energy barriers to become preferred at higher temperatures, an effect that also reflects physical reality.
The penalty function methods require a number of constraint terms to be added to the cost function:
(1) Constraints to remove the relative rigid-body translations and rotations of the structures along the path (to yield meaningful values for $|\mathbf{r}_{k+1} - \mathbf{r}_k|$).
(2) Equidistance constraints to avoid an increase of point density along the path in its low-energy segments and a correspondingly low resolution in segments crossing the saddle region.
(3) Self-avoidance terms to prevent the path folding back upon itself.
6.2
Heuristic Methods
Rather than defining an objective cost function, heuristic methods improve the initial path by following a specific set of rules. Arguably, this approach is mathematically not as elegant as the penalty function approach. A clear disadvantage is that, because of the absence of an objective function, standard optimization procedures cannot simply be combined with these methods. On the other hand, these methods can be made more efficient than penalty function methods by tailoring the optimization of the path intermediates according to the position of these intermediates (e.g. by spending more optimization effort on the points in the saddle region than on other points).
As an example, we summarize one heuristic method that has proven to be robust and efficient in very large systems: Conjugate Peak Refinement (CPR) [57]. It is summarized by the pseudocode below.
CPR(P = [r_0, r_1, ..., r_M])
  (1) Let r_max be a local maximum with the highest energy value along the path P not yet flagged as a saddle point. If there is no such point, exit. Let s_0 be the tangent vector to the path at r_max.
  (2) If r_max lies between two existing path-points r_i and r_{i+1}, move it closer to the streambed by calling r_new := approachMEP(r_max, s_0). Add new point: P := [r_0, r_1, ..., r_i, r_new, r_{i+1}, ..., r_M]. Go to (1).
  (3) If r_max lies on an existing path-point r_k, check whether the energy has a nearby maximum along s_0. If no, then r_k is an unwanted deviation from the path. Remove it and go to (1). If yes, then replace r_k by a point closer to the streambed by calling r_k := approachMEP(r_k, s_0). Go to (1).
approachMEP(x_0, s_0)
  (1) For j from 0 to D − 1 repeat
    (1.1) Build a new conjugate vector s_{j+1} with respect to the Hessian at x_0.
    (1.2) If s_{j+1} is no longer conjugate to s_0, RETURN(x_j).
    (1.3) Obtain x_{j+1} by performing a line minimization along s_{j+1}, starting from x_j.
  (2) If the energy gradient at x_D is vanishing, flag x_D as saddle point. RETURN(x_D).
The basic approach of the algorithm is that path-points are added and/or optimized by performing a one-dimensional maximization along the path and minimizations in the space conjugate to the path direction. This yields path-points which follow the streambed of the energy surface and along which the energy increases monotonically from the minima to the saddle point(s). The path-points corresponding to the energy maxima along the path have converged to saddle points of first order, because they are at a local maximum along the path direction and at a local minimum along all other directions of the set of conjugate vectors. CPR therefore returns an approximation to a steepest descent path whose only energy maxima correspond to first-order saddle points, i.e. a minimum-energy path (MEP). The algorithm does not evaluate the Hessian explicitly. Furthermore, most effort is concentrated on the location of saddle points. For these reasons, the algorithm is fast and can be used to compute reaction paths in proteins with thousands of atoms and for complicated transitions involving hundreds of saddle points [58,59,60].
6.3
The Initial Pathway Problem
Complex conformational transitions that include a reconfiguration of the backbone fold (as is the case in the conformational switch of Ras p21, Fig. 6) pose a major challenge to reaction-path finding methods. The first difficulty here is to find any MEP at all that has energy barriers low enough to be consistent with the experimental reaction rate. The results of both penalty function methods and heuristic methods rely on an initially guessed path prior to the optimization. Unfortunately, for complex backbone rearrangements it is very difficult to find an initial guess that will optimize to a path with low barriers. When the initial guess is generated by standard interpolation techniques (e.g. linear interpolation in Cartesian or internal coordinates), the optimized path often has unphysical saddle points, such as the crossing of bonds (interpenetration of two bonded atom pairs) or the passage of water molecules through aromatic side-chains. We have developed interpolation techniques that are specifically tailored for polymers such as proteins or nucleic acids to construct appropriate intermediate structures for the initial path. With these, we have been able to apply CPR in a fully automated manner to very complicated conformational transitions while systematically avoiding unphysical saddle points (to be published elsewhere).
6.4
Multiple Pathways
When a single low-energy path has been found, it is in general not known whether it is just one example of a number of different paths. It has been shown that the MaxFlux method above can be used to identify slightly different pathways in small peptides [61] and even short pathways in moderately sized polypeptides [62] by varying the initial guess. However, as yet no general method exists for systematically generating many different minimum-energy pathways in large molecules. Such a method is highly desirable, since it would not only help to identify different possibilities for a conformational transition, but would also benefit the search for globally optimal pathways. In summary, then, a significant advance in methodology is still required to properly study the many complex reaction pathways of biological interest.
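When several candidate pathways are available, the cost functions of Eqs. (6) to (8) offer a simple way to rank them. The sketch below is an illustration only: the two-dimensional energy surface, the two hand-built paths and all function names are hypothetical, and the constraint terms listed in Sect. 6.1 are omitted. It compares a short path that crosses a barrier with a longer detour around it, using the MaxFlux cost at two values of kT.

```python
import math


def segment(a, b, n):
    """Return n + 1 equally spaced points from a to b (both included)."""
    return [tuple(x + (y - x) * k / n for x, y in zip(a, b)) for k in range(n + 1)]


def path_length(path):
    """Total length L = sum_k |r_{k+1} - r_k|."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


def path_cost(path, point_cost):
    """Eq. (6): C = sum_k c(r_k) |r_{k+1} - r_k|."""
    return sum(point_cost(p) * math.dist(p, q) for p, q in zip(path, path[1:]))


def mean_energy_cost(energy, path):
    """Eq. (7): c(r) = E(r) / L, with L the total length of this path."""
    length = path_length(path)
    return lambda r: energy(r) / length


def maxflux_cost(energy, kT):
    """Eq. (8): c(r) = exp(E(r) / kT)."""
    return lambda r: math.exp(energy(r) / kT)


def toy_energy(r):
    """Hypothetical 2-D surface: a Gaussian hill of height 5 at the origin."""
    return 5.0 * math.exp(-(r[0] ** 2 + r[1] ** 2) / 0.25)


# Two candidate paths between the same end points (-1, 0) and (1, 0).
direct = segment((-1.0, 0.0), (1.0, 0.0), 20)  # short, crosses the hilltop
detour = segment((-1.0, 0.0), (0.0, 1.0), 20) + segment((0.0, 1.0), (1.0, 0.0), 20)[1:]

if __name__ == "__main__":
    for kT in (0.5, 50.0):
        cost = maxflux_cost(toy_energy, kT)
        print(f"kT = {kT}: direct = {path_cost(direct, cost):.3f}, "
              f"detour = {path_cost(detour, cost):.3f}")
```

With these hypothetical parameters the detour has the lower MaxFlux cost at kT = 0.5 and the direct path at kT = 50, in line with the temperature dependence described for Eq. (8).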
7
Conclusions
It is hoped that the above cornucopia of dynamical phenomena and associated methodology gives the reader some appreciation of the remaining challenges in biomolecular dynamics research. Deepening our understanding requires methodological advances ranging from improvements in instrumentation (e.g. in neutron spectroscopy) and in methods of analyzing experimental data through to theoretical and computational advances for tackling these complex systems. It is to be expected that the burgeoning field of biophysics will devote considerable skill and resources in the next decades to understanding this aspect of the working of the many wonderful tiny molecular machines in the living cell.
References
1. W. Doster, S. Cusack, and W. Petry, Nature 337(6209), 754 (1989).
2. R. V. Dunn, V. Reat, J. Finney, M. Ferrand, J. C. Smith, and R. M. Daniel, Biochemical Journal 346 Pt 2, 355 (2000).
3. F. Parak, E. N. Frolov, R. L. Mossbauer, and V. I. Goldanskii, Journal of Molecular Biology 145(4), 825 (1981).
4. R. F. Tilton, J. C. Dewan, and G. A. Petsko, Biochemistry 31(9), 2469 (1992).
5. J. Smith, K. Kuczera, and M. Karplus, Proceedings of the National Academy of Sciences of the United States of America 87(4), 1601 (1990).
6. J. A. Hayward and J. C. Smith, Biophysical Journal 82(3), 1216 (2002).
7. A. R. Bizzarri, A. Paciaroni, and S. Cannistraro, Physical Review E: Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics 62(3 Pt B), 3991 (2000).
8. M. Ferrand, A. J. Dianoux, W. Petry, and G. Zaccai, Proceedings of the National Academy of Sciences of the United States of America 90(20), 9668 (1993).
9. B. F. Rasmussen, A. M. Stock, D. Ringe, and G. A. Petsko, Nature 357(6377), 423 (1992).
10. F. Parak, E. N. Frolov, A. A. Kononenko, R. L. Mossbauer, V. I. Goldanskii, and A. B. Rubin, FEBS Letters 117(1), 368 (1980).
11. J. L. Green, J. Fan, and C. A. Angell, Journal of Physical Chemistry 98(51), 13780 (1994).
12. C. A. Angell, Science 267, 1924 (1995).
13. V. Reat, R. Dunn, M. Ferrand, J. L. Finney, R. M. Daniel, and J. C. Smith, Proceedings of the National Academy of Sciences of the United States of America 97(18), 9961 (2000).
14. A. Paciaroni, S. Cinelli, and G. Onori, Biophysical Journal 83(2), 1157 (2002).
15. M. M. Teeter, A. Yamano, B. Stec, and U. Mohanty, Proceedings of the National Academy of Sciences of the United States of America 98(20), 11242 (2001).
16. L. Cordone, M. Ferrand, E. Vitrano, and G. Zaccai, Biophysical Journal 76(2), 1043 (1999).
17. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, Journal of Computational Chemistry 4(2), 187 (1983), original CHARMM paper.
18. M. E. Tuckerman and G. J. Martyna, Journal of Physical Chemistry 194, 159 (2000).
19. M. Bee, Quasielastic Neutron Scattering: Principles and Applications in Solid State Chemistry, Biology and Materials Science (Hilger, Bristol, 1988).
20. S. Lovesey, Theory of Neutron Scattering from Condensed Matter, Vol. 1 (Oxford University Press, 1987).
21. G. R. Kneller and J. C. Smith, J. Mol. Biol. 242, 181 (1994).
22. H. Frauenfelder, G. A. Petsko, and D. Tsernoglou, Nature 280, 558 (1979).
23. R. Elber and M. Karplus, Science 235, 318 (1987).
24. A. Lamy, M. Souaille, and J. C. Smith, Biopolymers 39, 471 (1996).
25. H. Frauenfelder, S. Sligar, and P. Wolynes, Science 254, 1598 (1991).
26. W. Doster and M. Settles, in M.-C. Bellissent-Funel and J. Teixeira, eds., Hydration Processes in Biology (IOS Press, 1999), vol. 305 of NATO Science Series: Life Sciences.
27. A. Perico and R. Pratolongo, Macromolecules 30, 5958 (1997).
28. G. La Penna, R. Pratolongo, and A. Perico, Macromolecules 32, 506 (1999).
29. S. Dellerue, A.-J. Petrescu, J. C. Smith, and M.-C. Bellissent-Funel, Biophys. J. 81(3), 1666 (2001).
30. G. Zaccai, Science 288, 1604 (2000).
31. T. Becker and J. C. Smith, Physical Review E, in press (2003).
32. R. M. Daniel, J. C. Smith, M. Ferrand, S. Hery, R. Dunn, and J. L. Finney, Biophysical Journal 75(5), 2504 (1998).
33. D. J. Bicout and G. Zaccai, Biophys. J. 80, 1115 (2001).
34. G. M. Ullmann, Charge Transfer Properties of Photosynthetic and Respiratory Proteins, in H. S. Nalwa (Ed.): Supramolecular Photosensitive and Electroactive Materials (Academic Press, New York, 2001), pp. 525–584.
35. G. M. Ullmann and E. W. Knapp, Eur. Biophys. J. 28, 533 (1999).
36. P. Beroza and D. A. Case, Methods Enzymol. 295, 170 (1998).
37. J. M. Briggs and J. Antosiewicz, Rev. Comp. Chem. 13, 249 (1999).
38. Y. Y. Sham, Z. T. Chu, and A. Warshel, J. Phys. Chem. B 101, 4458 (1997).
39. A. Onufriev, D. A. Case, and G. M. Ullmann, Biochemistry 40, 3413 (2001).
40. G. M. Ullmann, J. Phys. Chem. B 107, 1263 (2003).
41. M. Okamura, M. Paddock, M. Graige, and G. Feher, Biochim. Biophys. Acta 1458, 148 (2000).
42. P. Sebban, P. Maróti, and D. K. Hanson, Biochimie 77, 677 (2001).
43. C. A. Wraight, Biochim. Biophys. Acta 548, 309 (1979).
44. B. Rabenstein, G. M. Ullmann, and E. W. Knapp, Biochemistry 37, 2488 (1998).
45. B. Rabenstein, G. M. Ullmann, and E. W. Knapp, Eur. Biophys. J. 27, 628 (1998).
46. B. Rabenstein, G. M. Ullmann, and E. W. Knapp, Biochemistry 39, 10487 (2000).
47. P. Maróti and C. A. Wraight, Biochim. Biophys. Acta 934, 329 (1988).
48. P. H. McPherson, M. Y. Okamura, and G. Feher, Biochim. Biophys. Acta 934, 348 (1988).
49. J. Tandori, J. M. M. Valerio-Lepiniec, M. Schiffer, P. Maróti, D. Hanson, and P. Sebban, Photochem. Photobiol. 75, 126 (2002).
50. P. Sebban, P. Maróti, M. Schiffer, and D. Hanson, Biochemistry 34, 8390 (1995).
51. M. H. B. Stowell, T. M. McPhillips, D. C. Rees, S. M. Soltis, E. Abresch, and G. Feher, Science 276, 812 (1997).
52. J. Allen, Proteins 20, 283 (1994).
53. C. R. D. Lancaster and H. Michel, Structure 5, 1339 (1997).
54. A. Taly, P. Sebban, J. C. Smith, and G. M. Ullmann, Biophys. J. 84, in press (2003).
55. G. Mills and H. Jonsson, Physical Review Letters (1994).
56. S. Huo and J. E. Straub, Journal of Chemical Physics (1997).
57. S. Fischer and M. Karplus, Chemical Physics Letters 194(3), 252 (1992).
58. K. Olsen, S. Fischer, and M. Karplus, Biophysical Journal (2000).
59. J. Sopkova, S. Fischer, C. Guilbert, A. Lewit-Bentley, and J. Smith, Biochemistry (2000).
60. R. Dutzler, T. Schirmer, M. Karplus, and S. Fischer, Structure (2002).
61. S. Huo and J. E. Straub, Proteins (1999).
62. J. E. Straub, J. Guevara, S. Huo, and J. P. Lee, Accounts of Chemical Research (2002).
Hidden Pseudogap and Superconductivity in Electron Doped High-Temperature Superconductors
Lambert Alff¹, Yoshiharu Krockenberger¹, Bettina Welter¹, Rudolf Gross¹, Dirk Manske², and Michio Naito³
¹ Walther-Meißner-Institut, Bayerische Akademie der Wissenschaften, 85748 Garching, Germany
² Freie Universität Berlin, Institut für Theoretische Physik, 14195 Berlin, Germany
³ NTT Basic Research Laboratories, 3-1 Morinosato, Atsugi, Kanagawa 243, Japan
Abstract. The ground state of superconductors is characterized by the long range order of condensed Cooper pairs. While this is the only order in conventional superconductors, in high-temperature (high-$T_c$) superconductors the presence of other competing groundstates might be responsible for their complicated phase diagram. A suppression of the accessible electronic states at the Fermi level in the normal state of high-$T_c$ superconductors, called the pseudogap [1,2], is interpreted either as precursor superconductivity [3,4] or as an indication of another nearby groundstate, which can be separated from the superconducting state by a quantum critical point governing the interplay between the two states [5,6]. Here we report the existence of a second hidden order parameter [7] within the superconducting phase of the underdoped electron doped high-$T_c$ superconductor Pr$_{2-x}$Ce$_x$CuO$_{4-y}$ and the newly synthesized electron doped material La$_{2-x}$Ce$_x$CuO$_{4-y}$ [8]. Our observation is consistent with the presence of a (quantum) phase transition at $T = 0$, which is discussed as a key concept for understanding high-temperature superconductivity. The existence of a pseudogap when superconductivity is suppressed clearly excludes precursor superconductivity as its origin [9]. This supports the picture that the physics of high-$T_c$ superconductors is determined by the interplay between competing and coexisting groundstates.
It is well established that above the critical temperature Tc of hole doped high-Tc superconductors a pseudogap comparable in size to the superconducting gap exists in the electronic excitation spectra. The doping dependent characteristic temperature where this pseudogap opens is denoted as T ∗ which is in the underdoped region significantly larger than Tc as shown in Fig. 1. For recent overviews see references [1,2]. The physical origin of this pseudogap and its relation to the superconducting state is not yet clear. One step to investigate this problem is the comparison of hole and electron doped high-Tc superconductors. Only the understanding of the full phase diagram as shown in Fig. 1 will give a complete picture of high-Tc superconductivity. However, tunnelling experiments in electron doped high-Tc superconductors B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 695–702, 2003. c Springer-Verlag Berlin Heidelberg 2003
Fig. 1. Schematic phase diagram for hole and electron doped high-Tc superconductors. The exact behavior of the characteristic pseudogap temperature T ∗ (x) is under discussion for the hole doped case and almost unknown for the electron doped case. The different decay of antiferromagnetism (AFM) with increasing x is due to spin-dilution and -frustration for electron and hole doping, respectively. QCP: possible quantum critical point
have failed to observe such a pseudogap above Tc [10]. Electron doped high-Tc superconductors have one considerable advantage: owing to their relatively small upper critical fields, the normal state of these cuprates can be probed at low temperatures by suppressing superconductivity with an applied field. We have performed tunnelling spectroscopy using grain boundary junctions as superconductor-insulator-superconductor contacts in epitaxial thin films of the electron doped superconductors Pr2−xCexCuO4−y and La2−xCexCuO4−y. The tunnel junction in this method is a grain boundary that forms during epitaxial growth of the material on a bicrystalline substrate. The tunnelling direction is within the ab-plane of the cuprate superconductors. Since the grain boundary junction is a zig-zag interface [11], this method averages over different grain boundary angles and integrates over an area of typically 10^−10 m^2. Therefore, this method does not reveal information on a possible phase separation on a smaller length scale. One advantage of grain boundary tunnelling is that the interfaces involved are intrinsic, i.e. they are formed during the epitaxial growth. A concern is the possibility that a doping level different from the nominal one is probed at the grain boundary interface, e.g. due to oxygen diffusion. However, we have found that the critical temperatures of the junction (as measured by tunnelling) and of the thin film electrodes (as measured by resistivity) are identical. More experimental details are given in Table 1. Recently, a suppression of the density of states was found in electron doped high-Tc superconductors at an energy scale about ten times larger than the superconducting gap using optical spectroscopy [15,16,17]. Here, we use the term pseudogap for a different, low-energy excitation gap in the normal state which is comparable in size to the superconducting gap. This
Table 1. Sample overview: doping x and critical temperatures Tc of the thin films used in this study. Note that the phase diagram of La2−xCexCuO4−y is shifted to lower x values. We have used [001] tilt bicrystal SrTiO3 substrates with a misorientation angle of 36.8°. We have not used Nd2−xCexCuO4−y as electron doped cuprate superconductor in order to exclude possible effects from the strong paramagnetism of the Nd3+ ion in this compound [12]. Grain boundary tunnelling spectroscopy has been used successfully to study ab-plane tunnelling in hole doped high-Tc superconductors, revealing for example zero-energy bound states due to the intrinsic sign change of the d-wave superconducting pair potential [12,13,14]. Note that the values for Tc have been consistently obtained both by resistive measurement of the thin film electrodes and by grain boundary tunnelling. This shows that grain boundary tunnelling reflects doping dependent properties of the electrode materials

Pr2−xCexCuO4−y          La2−xCexCuO4−y
x        Tc (K)         x        Tc (K)
0.134    20             0.098    28
0.148    23             0.116    27.5
0.159    19             0.147    19
property and the smooth evolution from the superconducting gap below Tc into the pseudogap above Tc in the quasiparticle tunnelling spectrum, as observed in hole doped high-Tc superconductors, were interpreted in terms of a precursor superconductivity scenario [18]. Similar experiments were carried out for the electron doped cuprates Pr2−xCexCuO4−y and Nd2−xCexCuO4−y with the result that, at least for optimum doping and temperatures above Tc, no pseudogap could be detected [10]. However, when the cuprate superconductor is driven into the normal state by applying a high magnetic field at low temperatures, a clear pseudogap feature at an energy scale similar to that of the superconducting gap is observed in the quasiparticle tunnelling spectra [10,19]. The obvious question is how this pseudogap evolves with doping. In Fig. 2 we show the doping dependence of the pseudogap for two different electron doped cuprates measured at 4.2 K in magnetic fields of 10 T and 20 T, respectively, i.e. well above the upper critical field Hc2. As can be seen clearly, the pseudogap feature persists up to the highest applied fields. It is important to note that the pseudogap feature does not depend on the magnetic field: it remains almost unchanged up to 20 T. Having established the doping dependent existence of the pseudogap, the next step is to determine the doping dependence of the characteristic temperature T*, as shown in Fig. 3 for all dopings. For the investigated doping range we always find T* < Tc, in contrast to the hole doped high-Tc cuprates, which explains why up to now no pseudogap was found in the electron doped compounds above Tc. The dependence on doping x is qualitatively the same for hole and electron doping: T* increases approximately linearly with decreasing doping.
Fig. 2. Pseudogap in the magnetic field driven normal state and its doping dependence: (a) Doping dependence of tunnelling spectra in La2−x Cex CuO4−y taken at 10 T and 4.2 K. In the underdoped region there is a clear and well developed pseudogap structure which becomes smaller with increased doping x. The pseudogap feature vanishes in the overdoped sample. (b) The same for Pr2−x Cex CuO4−y at 20 T and 4.2 K. A small pseudogap survives up to a doping level of about x = 0.16. (c) Magnetic field series for tunnelling spectra in La2−x Cex CuO4−y (x = 0.116) at 4.2 K. The coherence peaks at the gap edge of the superconducting gap disappear at Hc2 which is between 5 and 6 T. At this field the background conductance drops notably because the electrodes are driven normal. Above 5 T the sample is driven in the normal state, but the pseudogap is clearly present up to the highest applied fields (data from [9])
Fig. 3. Temperature dependence of the pseudogap in high magnetic fields. Normalized spectra at different temperatures are shown for all investigated dopings of Pr2−x Cex CuO4−y in a field of 14 T. The suppression of the density of states at the Fermi-energy becomes smoothly weaker with increasing temperature and vanishes rather suddenly at a temperature we define as T ∗ (x) (data from [9])
We briefly address the possibility that the pseudogap is due to residual superconductivity in a vortex liquid state below a much higher ‘real’ Hc2. For hole doped La2−xSrxCuO4 a Nernst signal was detected in the pseudogap phase [4], indicating the presence of vortices below T* but above Tc. For electron doped superconductors the transport entropy of the magnetic flux line also indicates a larger Hc2 than determined from resistivity measurements [20]. However, in the case of electron doping the increase of Hc2 is only of the order of a few Tesla at maximum, and therefore cannot account for the gap observed here, which persists even well above 20 T. For the discussion of our results, we summarize the doping dependence of T* and Tc for Pr2−xCexCuO4−y in a schematic phase diagram in Fig. 4. Tc retraces the well known dome-shaped superconducting phase already shown in Fig. 1. The pseudogap regime as characterized by T* has a doping dependence very similar to that of the hole doped cuprates, with the important difference that T* < Tc in the investigated doping range. From extrapolation of the data, we predict that below x ∼ 0.13 a pseudogap will be observed also above Tc, as is the case for hole doped high-Tc superconductors at higher doping. On the other hand, going into the overdoped region the T*(x) curve seems to vanish around x ∼ 0.16–0.18, which could mark a critical doping level xc. We note that for hole doped Bi2Sr2CaCu2O8+x an observation similar to ours was reported from c-axis intrinsic tunnelling spectroscopy: the coexistence of a temperature and magnetic field dependent superconducting gap and a field independent pseudogap feature [21]. Therefore, the observed behavior may be universal in the cuprate phase diagram. In a different experiment [22], the temperature dependence of the resistivity, ρ(T),
Fig. 4. Phase diagram of the electron doped high-Tc superconductor Pr2−x Cex CuO4−y . The data reveals the coexistence of superconductivity and a pseudogap regime in a wide doping range. The data point thickness corresponds to the error. The curves through the data points show a possible continuation of the phase diagram. Extrapolating T ∗ (x) suggests a quantum critical point in the overdoped regime at 0.16 < x < 0.18 (data from [9])
as a function of doping for the electron doped Pr2−xCexCuO4−y revealed a metal-insulator cross-over around a critical doping of about x ∼ 0.15, consistent with our results. In hole doped La2−xSrxCuO4, ρ(T) behaves similarly. This shows the generality of the phenomenon in high-Tc cuprates [23]. We now address the important question whether the pseudogap regime coexists with superconductivity or is established only when superconductivity is destroyed by the magnetic field. From our data it seems evident that the pseudogap is ‘hidden’ below the coexisting global superconducting phase. First, the tunnelling conductance at the Fermi-level evolves smoothly as a function of applied field, with no sign of a pseudogap opening in the superconducting spectra, as shown in Fig. 2 (c). This indicates that the pseudogap underlies the spectra already in zero field. Second, the coherence peaks in the superconducting state are more suppressed in the underdoped region than in the optimum and overdoped cases due to the presence of the pseudogap, suggesting a competition between two order parameters. Third, the density of states conservation rule is recovered only after removing the pseudogap spectral weight [24]. This again shows that the pseudogap is present already in zero applied field. It is a general feature of the cuprate high-Tc superconductors that, due to strong electronic correlations, there are many more than just two phases with a transition from a normal Fermi-liquid into the superconducting state. The key result of our experiments is to establish the existence of a T*-line within the superconducting region of the cuprate phase diagram. Therefore, a relation between the observed pseudogap and precursor superconductivity can be excluded immediately. It is tempting to identify the T*-line with an additional phase boundary due to a second ‘hidden’ order parameter in the cuprate phase diagram.
Recent ARPES measurements indicate a possible broken time-reversal symmetry in the pseudogap regime, implying that it is a real thermodynamic phase [25]. This important point has to be confirmed by several experiments. From our measurements, the rather sudden opening of the pseudogap within a range of one or two Tesla may indeed indicate a phase transition. The ARPES experiment has been interpreted as support for a phase with circulating currents [5]. Indeed, there are a number of theoretical (and experimental) works for hole doped high-Tc superconductors highlighting the possibility of a quantum critical point in the phase diagram involving the existence of a second order parameter [1,5,6,7,26,27,28,29]. It is already well known that antiferromagnetic Néel order and charge/spin (stripe) density waves exist in the cuprate phase diagram. In addition, the possibility of d-density wave [7] (staggered flux) order and of d_{x^2−y^2} + i d_{xy} or d_{x^2−y^2} + i s superconductivity is intensively discussed. Recent neutron scattering experiments on La2−xSrxCuO4 [30] and YBa2Cu3O6.6 [31] have revealed enhanced antiferromagnetic correlations in the underdoped superconducting region, which of course has to be confirmed [32]. Another example of hidden order was found in the heavy fermion metal URu2Si2 [33]. In the phase
diagram suggested here for electron doped high-Tc superconductors, at least two competing order parameters (co)exist. A small applied magnetic field couples only to the superconducting order parameter, suppressing it completely at Hc2 and leaving the pseudogap related order parameter at a finite value, as predicted recently for a d-density-wave ordered state [34]. This excludes the possibility that the pseudogap regime is related to a second superconducting order parameter with different symmetry [35]. Due to the competition of the two order parameters, the magnetic field needed to destroy superconductivity can indeed be smaller than Hc2 because of the additional suppression of superconductivity by the competing order. Strong interactions between the two order parameters may result in even more complicated phase diagrams with more phases than presented in Fig. 4. The identification of the possible order in the pseudogap phase is one of the most prominent questions in the field of high-Tc superconductivity. Any suggested order should be consistent with the observations that the pseudogap coexists with superconductivity and is essentially unchanged by a large applied external field. In particular, these constraints exclude the physical picture of precursor superconductivity. Good candidates for the pseudogap phase are spin or charge density wave states (including staggered flux and stripe phases, and their fluctuations), but also antiferromagnetic order, which extends to higher doping in electron doped than in hole doped high-Tc superconductors.
References
1. J. L. Tallon and J. W. Loram, Physica C 349, 53 (2001).
2. T. Timusk and B. Statt, Rep. Prog. Phys. 62, 61 (1999).
3. V. J. Emery and S. A. Kivelson, Nature 374, 434 (1995).
4. Z. A. Xu, N. P. Ong, Y. Wang, T. Kakeshita, and S. Uchida, Nature 406, 486 (2000).
5. C. M. Varma, Phys. Rev. Lett. 83, 3538 (1999).
6. S. Sachdev, Science 288, 475 (2000).
7. S. Chakravarty, R. B. Laughlin, D. K. Morr, and C. Nayak, Phys. Rev. B 63, 094503 (2001).
8. M. Naito and M. Hepp, Jpn. J. Appl. Phys. 39, L485 (2000).
9. L. Alff et al., Nature 422, 698 (2003).
10. S. Kleefisch et al., Phys. Rev. B 63, R100507 (2001).
11. J. Mannhart et al., Phys. Rev. Lett. 77, 2782 (1996).
12. L. Alff et al., Phys. Rev. Lett. 83, 2644 (1999).
13. L. Alff, Phys. Rev. B 58, 11197 (1998).
14. L. Alff and R. Gross, Advances in Solid State Physics 38, 453 (1999).
15. Y. Onose, Y. Taguchi, K. Ishizaka, and Y. Tokura, Phys. Rev. Lett. 87, 217001 (2001).
16. E. J. Singley, D. N. Basov, K. Kurahashi, T. Uefuji, and K. Yamada, Phys. Rev. B 64, 224503 (2001).
17. N. P. Armitage et al., Phys. Rev. Lett. 87, 147003 (2001).
18. Ch. Renner, B. Revaz, J.-Y. Genoud, K. Kadowaki, and Ø. Fischer, Phys. Rev. Lett. 80, 149 (1998).
19. A. Biswas et al., Phys. Rev. B 64, 104519 (2001).
20. F. Gollnik and M. Naito, Phys. Rev. B 58, 11734 (1998).
21. V. M. Krasnov, A. Yurgens, D. Winkler, P. Delsing, and T. Claeson, Phys. Rev. Lett. 84, 5860 (2000).
22. P. Fournier et al., Phys. Rev. Lett. 81, 4720 (1998).
23. Y. Ando, G. S. Boebinger, A. Passner, T. Kimura, and K. Kishio, Phys. Rev. Lett. 75, 4662 (1995).
24. B. Welter et al., Physica C (in print).
25. A. Kaminski et al., Nature 416, 610 (2002).
26. C. M. Varma, Phys. Rev. B 61, R3804 (2000).
27. S. Andergassen, S. Caprara, C. Di Castro, and M. Grilli, Phys. Rev. Lett. 87, 056401 (2001).
28. D. Manske, T. Dahm, and K.-H. Bennemann, Phys. Rev. B 64, 144520 (2001).
29. C. Panagopoulos, J. L. Tallon, B. D. Rainford, T. Xiang, J. R. Cooper, and C. A. Scott, Phys. Rev. B 66, 064501 (2002).
30. B. Lake et al., Nature 415, 299 (2002).
31. H. A. Mook, Pengcheng Dai, and F. Dogan, Phys. Rev. B 64, 012502 (2001).
32. C. Stock, W. J. L. Buyers, Z. Tun, R. Liang, D. Peets, D. Bonn, W. N. Hardy, and L. Taillefer, Phys. Rev. B 66, 024505 (2002).
33. P. Chandra, P. Coleman, J. A. Mydosh, and V. Tripathi, Nature 417, 831 (2002).
34. Hoang K. Nguyen and Sudip Chakravarty, Phys. Rev. B 65, R180519 (2002).
35. J. A. Skinta, M.-S. Kim, T. R. Lemberger, T. Greibe, and M. Naito, Phys. Rev. Lett. 88, 207005 (2002).
High Critical Fields and Currents in Mechanically Alloyed MgB2
Jürgen Eckert, Olaf Perner, Günter Fuchs, Konstantin Nenkov, Karl-Hartmut Müller, Wolfgang Häßler, Claus Fischer, Bernhard Holzapfel, and Ludwig Schultz
IFW Dresden, Institute of Metallic Materials, D-01171 Dresden, Germany
Abstract. The combination of high critical fields Hc2 and irreversibility fields Hirr and a critical current density jc, for example in thin films, renders MgB2 promising for applications. However, in commercial powders and in conventionally sintered polycrystalline MgB2 samples and wires these properties are reduced due to weak pinning. In contrast, mechanical alloying of elemental Mg and B powders combined with hot pressing yields high density nanocrystalline bulk samples with grain sizes on the order of 40–100 nm and distinctly improved pinning. Tc, jc and the Hirr–to–Hc2 ratio depend strongly on the milling parameters. Optimized preparation conditions yield bulk specimens with critical current densities of about 7 × 10^5 A/cm^2 at 10 K and irreversibility fields of up to about 17 T. The improved pinning of this material is attributed to the large density of grain boundaries. The nanocrystalline MgB2 can be used as starting material for the preparation of tapes with Cu or Fe as sheath material. So far, monofilamentary tapes annealed at 773 K yield jc values of up to 2.2 × 10^4 A/cm^2 in external magnetic fields of 7.5 T at 4.2 K.
1
Introduction
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 703–719, 2003. © Springer-Verlag Berlin Heidelberg 2003
MgB2 exhibits the highest known critical temperature for binary compounds [1] but is affected by weak grain connectivity [2] and low intrinsic irreversibility fields [3]. The potential of the superconducting compound MgB2 for technical applications in the temperature range between 20 and 30 K was pointed out [4] just after its discovery [1]. Since then several groups have tried to prepare samples optimized with respect to the critical temperature Tc, the critical current density jc, the upper critical field Hc2 and the irreversibility field Hirr [5,6]. Two properties were soon recognized to be crucial for potential applications [7,8]. First, despite the lack of weak-link electromagnetic behaviour at the grain boundaries [7], it is decisive to obtain sufficient connection between the grains to facilitate the flow of the superconducting current through the sample [2]. Because of this, optimized densification is a very important factor in the preparation of bulk samples from powders. The second is the capability of the microstructure to allow effective flux pinning. Pinning centers have been created by irradiation [9], composition variation [10], and doping with different compounds [2,11]. Although some improvements were obtained so
far, all these methods show disadvantages, especially due to the necessity of a high temperature sintering process, so that there is still a strong need for further optimization of all preparation parameters in order to obtain MgB2 superconductors with improved properties. The Mg–B phase diagram [12] shows that MgB2 is a line compound. Any deviation from its exact stoichiometric composition which cannot be compensated by formation of point defects or interstitials is expected to lead to decomposition. In the case of initial Mg excess this results in extra Mg in the material, and in the case of Mg deficit MgB4 and MgB7 compounds form. Thus, to obtain a maximum MgB2 superconducting phase fraction it is crucial to control the exact stoichiometry. Many preparation methods established for MgB2 include annealing at temperatures up to 1500 K for several hours [13]. Due to the high Mg vapour pressure it is very difficult to maintain a constant Mg–to–B ratio, even in sealed crucibles and with Mg surplus. Because of all these difficulties, mechanical alloying as a powder metallurgy processing technique seems to be an appropriate tool to prepare MgB2 powders [14,15,16]. It allows the stoichiometry of a nanocrystalline microstructure to be adjusted very precisely, and the stoichiometry remains unchanged because the processing is done at ambient temperature in a purified argon atmosphere. Moreover, the phase formation of MgB2 can be controlled very accurately by varying mechanical alloying parameters such as milling time and intensity [14,16,17]. The easy creation of a nanocrystalline microstructure with this technique can provide a high density of grain boundaries that pin magnetic flux, thus increasing the critical current density jc according to the equation [18]
$$F_p = 1.18\, S_v\, \frac{\Delta\kappa}{\kappa^{3}}\, H \left(1 - \frac{H}{H_{c2}}\right), \qquad (1)$$
where Fp is the pinning force per unit volume, Sv is the area of interface per unit volume, κ is the Ginzburg-Landau parameter, H is the magnetic field and Hc2 the upper critical field.
Fp can be improved by increasing Sv, i.e. by reducing the grain size [19,20,21]. Furthermore, the fine-grained powder is well suited for wire and tape fabrication [22] because the material can easily be densified to a high degree, which avoids weak connections between the MgB2 grains. A short annealing step at relatively low temperature can additionally be applied afterwards. In the following, examples of mechanically alloyed MgB2 powders prepared under different conditions and our attempts so far to obtain optimized MgB2 powder, bulk samples, wires and tapes are presented.
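The scaling argument behind Eq. (1) can be illustrated numerically. The following is only a minimal sketch: it assumes that Eq. (1) reads Fp = 1.18 Sv (∆κ/κ^3) H (1 − H/Hc2), it assumes cube-shaped grains of edge length d so that Sv = 3/d (each face shared by two grains), and the values chosen for κ, ∆κ and the two grain sizes are hypothetical and not taken from this work.

```python
def pinning_force(h_field, s_v, kappa, delta_kappa, h_c2):
    """Eq. (1) as read here: Fp = 1.18 * Sv * (dk / k^3) * H * (1 - H / Hc2)."""
    if not 0.0 <= h_field <= h_c2:
        raise ValueError("H must lie between 0 and Hc2")
    return 1.18 * s_v * delta_kappa / kappa**3 * h_field * (1.0 - h_field / h_c2)


def boundary_area_per_volume(grain_size):
    """Assumed geometry: cubic grains of edge d, each face shared, so Sv = 3 / d."""
    return 3.0 / grain_size


# Hypothetical inputs for illustration only; fields are given in units of Hc2.
kappa, delta_kappa, h_c2 = 26.0, 2.6, 1.0

fp_coarse = pinning_force(0.5, boundary_area_per_volume(1000e-9), kappa, delta_kappa, h_c2)
fp_fine = pinning_force(0.5, boundary_area_per_volume(50e-9), kappa, delta_kappa, h_c2)
gain = fp_fine / fp_coarse  # Fp is linear in Sv, hence inversely proportional to d

print(f"Fp(50 nm grains) / Fp(1 um grains) at H = Hc2/2: {gain:.1f}")
```

Within this reading, the field dependence H(1 − H/Hc2) is largest at H = Hc2/2, while the grain size enters only through Sv; a different grain geometry would change the prefactor of Sv but not the 1/d scaling.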
2
Powder Preparation by Mechanical Alloying
The process of mechanical alloying consists of cold-welding, fracturing and rewelding of the starting powder particles [15]. This leads to a grain size
reduction of the Mg and B particles and simultaneous formation of MgB2 [23]. Mechanical alloying was carried out in a high-energy ball mill for times varying from 20 to 100 h. Due to the high reactivity of the nanocrystalline powder, the milling was done in a purified Ar atmosphere to avoid oxidation. Tungsten carbide (WC) milling tools, i.e. milling vials and balls, and a powder–to–ball mass ratio of 1 : 36 were used. The rotation speed of the planetary ball mill was 250 rpm. The starting materials were amorphous B (99.9 % purity, 1 µm grain size) and fine-grained Mg (99.8 % purity, 250 µm maximum grain size) powders in the stoichiometric ratio of MgB2. The Mg grain size is reduced from 250 µm at the beginning to only a few nanometers after 50 h of milling. This causes a high reactivity of the material because clean and uncontaminated surfaces are created and the surface area–to–volume ratio increases. As a result of the milling process, MgB2 forms from the starting elements Mg and B at a higher rate than in untreated powder. The brittleness of MgB2 promotes the desired grain size reduction because the newly formed MgB2 separates from the grain surfaces. Accordingly, two parallel processes occur: the growth of the MgB2 phase fraction in the powder and a continuous grain size reduction. Additionally, we find an increase of the defect density inside the MgB2 lattice. For longer milling times, impurities stemming from the milling tools can accumulate and dope the material and, therefore, deteriorate the advantageous superconducting properties of MgB2. The mechanical alloying process leaves a highly defective and only partially reacted mixture of MgB2, Mg and B powder. Hence, an annealing step is necessary to improve the properties of the microstructure and to react the residual nanocrystalline Mg and B.
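As a worked illustration of the stoichiometric charge and the powder–to–ball mass ratio of 1 : 36 described above, the following sketch splits a powder batch into Mg and B masses and sizes the ball load. The batch size of 10 g and the function name are hypothetical; the atomic weights are standard values, and the stated powder purities (99.9 % and 99.8 %) are neglected here.

```python
M_MG = 24.305  # g/mol, standard atomic weight of Mg
M_B = 10.81    # g/mol, standard atomic weight of B


def mgb2_charge(powder_mass_g, ball_to_powder=36.0):
    """Split a powder charge into Mg and B for the MgB2 ratio and size the ball load."""
    molar_mass = M_MG + 2.0 * M_B              # one Mg and two B per formula unit
    mg_mass = powder_mass_g * M_MG / molar_mass
    b_mass = powder_mass_g * 2.0 * M_B / molar_mass
    return {
        "Mg_g": mg_mass,
        "B_g": b_mass,
        "balls_g": powder_mass_g * ball_to_powder,
    }


charge = mgb2_charge(10.0)  # 10 g is a hypothetical batch size
for name, mass in charge.items():
    print(f"{name}: {mass:.2f}")
```

The stoichiometric mixture is therefore roughly 53 wt% Mg and 47 wt% B, which is why even a small Mg loss by evaporation shifts the composition away from the line compound.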
Investigations with a differential scanning calorimetry (DSC) device reveal that short-time heat treatment leads to fully reacted MgB2 powder with no residual Mg and B. Additionally, strain reduction accompanies the chemical reaction. This improvement of the microstructure depends strongly on the chosen temperature, annealing time and, for uniaxially hot pressed bulk samples, on the applied pressure. In this investigation the following parameter set for all samples except the hot pressing series was chosen: temperature of 973 K, pressing time of 10 min and pressure of 640 MPa.
3
Characterization of Phase Formation and Microstructure
Chemical analysis of the mechanically alloyed MgB2 powder as well as of hot pressed bulk samples was performed by different methods. Inductively coupled plasma–optical emission spectroscopy (ICP-OES) reveals a stoichiometric Mg:B ratio for the as-milled powder regardless of the milling time. As impurities we find O as well as W, C and Co as debris from the milling tools.
The impurity content increases with increasing milling time from 20 to 100 h from 0.3 wt% to 1.7 wt% for W, 0.2 wt% to 1.1 wt% for C and 0.06 wt% to 0.12 wt% for Co. After short-time hot pressing there is no change of the stoichiometric Mg:B ratio. Obviously, no Mg loss during the heat treatment takes place, which is due to the low temperature of 973 K and the short annealing time of 10 min. The oxygen content is in the range of 2–4 wt% for all the powders irrespective of the milling time showing no clear trend. X-ray diffraction of the as-milled powders (Fig. 1) reveals a partially reacted powder consisting of MgB2 and residual unreacted Mg besides a small amount of WC stemming from the milling tools. B is not clearly detectable by this characterization method due to its amorphous structure. The hot pressed bulk samples are nearly single-phase MgB2 with small peaks stemming from WC debris. Rietveld refinement of the X-ray patterns (Fig. 2) reveals that after mechanical alloying the phase fraction of MgB2 reaches 48 wt% for the 50 h milled powder, whereas after hot pressing this value increases to 80 wt% besides a relatively high impurity content of MgO and a small amount of WC wear debris. This clearly shows that any contact of the highly reactive mechanically alloyed powder with air and introduction of oxide impurities has to be avoided as far as possible because MgO already forms during the heat treatment at a rather low temperature. The phase fraction of MgB2 of the whole powder sample depends strongly on the milling time (Fig. 1). Whereas only small MgB2 peaks appear for the 20 h milled powder, MgB2 is the main phase for the 100 h milled sample. The hot pressed bulk samples are nearly single-phase MgB2 with only small peaks stemming from WC debris. After mechanical alloying (Fig. 2) the phase fraction of MgB2 is only 31 wt% for the 20 h milled powder but reaches 55 wt% for the 100 h mechanically alloyed material. Hot pressing again increases
Fig. 1. X-ray diffraction patterns of powders mechanically alloyed for times between 20 and 100 h
Fig. 2. Weight fractions of the components of the as-milled powders and hot pressed bulk samples as a function of mechanical alloying time
these values to 76 wt% and 82 wt%, respectively (Fig. 2). Besides, impurities like MgO, WC and residual Mg are present in the material. The thermal stability of the as-milled powders was investigated by constant-rate heating at 20 K/min under argon atmosphere in a differential scanning calorimeter (DSC), starting from ambient temperature and holding for 10 min at 973 K. The partially reacted powders show an exothermic MgB2 formation reaction upon heating (Fig. 3), which starts well below 800 K and is completed at 900 K. This allows the powder samples to be compacted at 973 K, a relatively low temperature compared to conventional sintering of MgB2 [13], which is advantageous for maintaining a constant Mg:B ratio and for obtaining fully reacted MgB2 without residual Mg and B. The series of powder samples with different milling times (Fig. 3) shows a strong reaction peak for the 20 h milled powder and a rather small peak for the 100 h powder sample. The enthalpy of the peaks clearly shows the strong relationship between milling time and the extent of MgB2 formation. The peak height decreases and the maximum of the peak moves to lower temperatures with increasing milling time. This corresponds to the findings of the XRD measurements: as-milled powder with a lower MgB2 fraction contains a higher fraction of residual Mg and B, which reacts during the heat treatment in the DSC and thus gives a larger MgB2 formation peak, and vice versa. The shift of the peak maximum can be explained by the smaller grain sizes of Mg and B with increasing milling time and, therefore, a higher reactivity leading to a lower starting temperature for MgB2 formation. Rietveld refinement of the X-ray diffraction patterns also gives detailed information on the lattice parameters, grain size and internal strain. Whereas the lattice parameter c reflects the layer distances in the MgB2 unit cell, the lattice parameter a describes the atom distances in the hexagonal Mg and B layers.
For the as-milled powders, c decreases slowly with increasing milling time (Fig. 4). In contrast, a increases slowly to a maximum value and decreases again with further milling. After hot pressing a approaches the literature value of 0.3085 nm and c is slightly higher than the predicted value of
Fig. 3. Differential scanning calorimetry (DSC) plots of the as-milled powders (heating rate 20 K/min)
Fig. 4. Lattice parameters a and c as a function of milling time and after hot pressing determined by Rietveld refinement of the X-ray diffraction patterns
0.3523 nm (Fig. 4) possibly due to occupation of interstitial sites. This will have an influence on the superconducting properties of MgB2 [5]. The coherent scattering length, which can be regarded as a minimal bound for the grain size, and the internal strain of the lattice can be determined using the models of Scherrer [24] as well as of Stokes and Wilson [25], respectively. With increasing milling time the Mg grain size decreases (not shown here) whereas the size of the MgB2 grains increases from 4 nm for 20 h mechanically alloyed powder to about 10 nm after 100 h of milling (Fig. 5). Even after hot pressing for 10 min at 973 K with 640 MPa the microstructure remains nanocrystalline with a grain size of about 25 nm. The internal strain (Fig. 5) drops to about half the value of the as-milled MgB2 powder but is still rather large. Hence, a short heat treatment does not destroy the desired nanocrystalline microstructure and annihilates some of the lattice defects introduced during the mechanical alloying process. This can be used to improve the superconducting properties since it is supposed that structural defects like grain boundaries act as pinning centers in MgB2 [8,26,27]. The density of the 50 h milled and hot pressed bulk sample reaches about 84 % of the theoretical density of MgB2 . Applying longer pressing times at 973 K and a pressure of 640 MPa further increases the density to about 95 % for 60 min and longer pressing times. However, the annealing / hot
Fig. 5. Coherent scattering length and internal strain of as-milled powders and hot pressed samples as a function of milling time, determined by Rietveld refinement of the X-ray diffraction patterns
pressing has to be controlled very carefully in order to avoid a change of the stoichiometry of MgB2 due to Mg loss by evaporation and excessive grain growth during long heat treatment. The investigation of the lattice parameters a and c for this hot pressing series shows a constant behaviour of a close to the theoretical value of 0.3085 nm, but a slowly decreasing layer distance c with increasing pressing time, which remains still a little bit higher than the theoretical value of 0.3523 nm after pressing for 90 min. These features are linked with the superconducting properties. The coherent scattering length increases with increasing hot pressing time from 7.5 to 26 nm, whereas the internal strain drops very fast and remains constant at about 50 % of the starting value. Microstructure investigations of the as-milled powder by scanning electron microscopy (SEM) (Fig. 6) reveal homogeneous particles with a size in the range of 1 to 10 µm, which consist of nanocrystallites with a size of less than 100 nm. The hot pressed bulk samples seem to be macroscopically homogeneous as well but they show a distinct volume fraction of µm-sized pores, which are responsible for the density reduction compared to the theoretical value. WC wear debris is also visible as bright circular shaped grains with sub-micron sizes. The WC particles are homogeneously distributed in the MgB2 matrix and seem to be chemically inert. Transmission electron microscopy (TEM) investigations of the same hot pressed sample (Fig. 7) reveal a nanocrystalline microstructure with grain sizes in the range of 5 to 20 nm for the 50 h milled and subsequently hot pressed powder. This finding confirms the results of the Rietveld refinement. Again WC debris was detected with grain sizes in the range of 0.1 to 1 µm as the only larger-sized impurity.
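The coherent scattering lengths and internal strains discussed in this section were obtained by Rietveld refinement. As a rough cross-check on a single reflection, the textbook relations associated with Scherrer and with Stokes and Wilson can be sketched as follows. This is not the refinement procedure used here: the Cu Kα wavelength, the shape factor of 0.9, and the example peak (width and position) are assumptions for illustration, and instrumental broadening is neglected.

```python
import math


def scherrer_size_nm(fwhm_deg, two_theta_deg, wavelength_nm=0.15406, shape_factor=0.9):
    """Scherrer estimate D = K * lambda / (beta * cos(theta)), beta in radians."""
    beta = math.radians(fwhm_deg)
    theta = math.radians(two_theta_deg / 2.0)
    return shape_factor * wavelength_nm / (beta * math.cos(theta))


def stokes_wilson_strain(fwhm_deg, two_theta_deg):
    """Stokes-Wilson estimate eps = beta / (4 * tan(theta)), beta in radians."""
    beta = math.radians(fwhm_deg)
    theta = math.radians(two_theta_deg / 2.0)
    return beta / (4.0 * math.tan(theta))


# Hypothetical reflection: 1.0 deg wide (FWHM) at 2-theta = 42.4 deg.
size_nm = scherrer_size_nm(1.0, 42.4)
strain = stokes_wilson_strain(1.0, 42.4)
print(f"size if all broadening is due to size:   {size_nm:.1f} nm")
print(f"strain if all broadening is due to strain: {strain:.4f}")
```

Each function attributes the whole peak width to a single cause. When part of the broadening is due to strain, the Scherrer value underestimates the crystallite size, which is consistent with treating the coherent scattering length as a lower bound for the grain size.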
Fig. 6. Scanning electron micrograph of the 100 h mechanically alloyed powder. The high resolution shows a µm-sized particle containing nanocrystallites with a size of less than 100 nm
Fig. 7. Transmission electron micrograph of the 50 h mechanically alloyed and subsequently hot pressed bulk sample (973 K, 10 min, 640 MPa). The homogeneous microstructure consists of crystallites with a size of 5–20 nm
Jürgen Eckert et al.
4
Transport and Magnetization in MgB2
4.1
Effect of Milling Parameters on Tc
Transition curves for the samples milled for different times as derived from transport measurements are shown in Fig. 8. The critical transition temperature Tc first increases with increasing milling time and reaches a maximum at 34.5 K for the 50 h milled and subsequently hot pressed sample (973 K, 640 MPa, 10 min). For longer milling times, Tc drops remarkably. The maximum Tc for this series of samples lies well below the usually measured value of 39 K for conventionally prepared material [1] revealing a strong relationship between mechanical alloying and superconducting properties in terms of the phase fraction of MgB2 , the grain size, the internal strain as well as impurities. The width of the transition is the smallest for the powder sample milled for 50 h exhibiting ∆Tc = 0.9 K. This behaviour is the result of two opposite processes occurring during mechanical alloying: on one hand the formation and grain size reduction of the MgB2 phase and on the other hand the increase of impurities stemming from the milling tools as revealed by chemical analysis. The dependence of Tc on pressing time between 10 and 90 min at 973 K was investigated for a 50 h milled powder. As expected from Rietveld refinement results we get a higher critical transition temperature for longer pressing time because of the relaxation of the lattice and the reduction of strain. The highest Tc of 34.6 K reached after 90 minutes hot pressing still lies 4 K below the literature value of 39 K [1]. This is due to the characteristics of the microstructure. Even if there is a reduction of the lattice deformation regarding the lattice parameters a and c upon annealing, there is a residual difference between the theoretical value for c = 0.3523 nm and the measured data after 90 min heat treatment. Although the density improved remarkably to 95 % of the theoretical value, there is still need for further densification
Fig. 8. Critical transition temperature of differently milled and subsequently hot pressed samples
improvement. Additionally, the impurity content (WC debris and oxygen) found in all mechanically alloyed samples reduces Tc due to doping of the MgB2 lattice. The transition width slowly decreases for longer pressing times to about 1 K for the sample annealed for the longest time.
4.2
Effect of Milling Parameters on jc
The critical current density jc was determined from dc magnetization loops at temperatures between 7.5 K and 35 K with a vibrating sample magnetometer (VSM) in self-field as well as in external magnetic fields up to 8 T. The standard Bean model [31] was applied to evaluate the field dependence of jc at different temperatures from the hysteresis of the magnetization loops and the sample size using the equation
$$ j_c = \frac{20\,\Delta m}{V\,d\,(1 - d/3b)}, \qquad (2) $$
where ∆m is the width of the magnetization hysteresis loop, d is the diameter, b the thickness and V the volume of the cylindrical sample. For the investigation of different milling times as the only changed preparation parameter (Fig. 9) we find the same dependence of jc on milling time as was measured for the critical temperature Tc. This suggests that the same mechanisms govern both Tc and jc. In self-field the behaviour of jc in the lower temperature range between 5 and 15 K is nearly equivalent for all bulk samples. The highest jc reaches 10⁶ A/cm² at about 10 K for the 50 h milled powder sample, which was subsequently hot pressed for 10 min at 973 K. For the temperature range between 25 and 35 K, there are considerable differences that imply varying pinning behaviour of the different samples. Again, the 50 h milled sample shows the best performance while the 100 h milled sample shows the worst. Flux jumps at lower temperatures prevent the determination of jc at 4.2 K with this method.
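As a rough numerical illustration of Eq. (2), the following minimal Python sketch evaluates the Bean-model expression for a given loop width and sample geometry. The function name `bean_jc` and the unit convention (∆m in emu, V in cm³, d and b in cm, giving jc in A/cm²) are assumptions made for this example, since the text does not state the units; the input numbers are invented and are not measured values.

```python
def bean_jc(delta_m, volume, d, b):
    """Evaluate Eq. (2): jc = 20 * delta_m / (V * d * (1 - d / (3 * b))).

    Assumed units (not stated in the text): delta_m in emu, volume in cm^3,
    d and b in cm, so that jc comes out in A/cm^2.
    """
    if volume <= 0 or d <= 0 or b <= 0:
        raise ValueError("volume, d and b must be positive")
    geometry = 1.0 - d / (3.0 * b)
    if geometry <= 0:
        # The geometric factor of Eq. (2) must stay positive.
        raise ValueError("Eq. (2) requires d < 3*b")
    return 20.0 * delta_m / (volume * d * geometry)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the formula.
    jc = bean_jc(delta_m=2.0, volume=0.05, d=0.5, b=0.2)
    print(f"jc = {jc:.3g} A/cm^2")
```

Evaluating the function at each field point of a measured loop would give a jc(H) curve of the kind discussed in this section.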
Fig. 9. Critical current density in self-field as a function of temperature for differently milled and subsequently hot pressed samples (973 K, 640 MPa, 10 min) calculated from magnetization curves using the Bean model
Fig. 10. Critical current density as a function of external magnetic field at temperatures of 7.5 K and 20 K, respectively, for differently milled and subsequently hot pressed samples (973 K, 640 MPa, 10 min) calculated from magnetization curves using the Bean model
The magnetic field dependence of jc reveals the same pinning behaviour as found for the temperature dependence (Fig. 10). At low temperatures, there is a wide field range before jc drops below 1 · 10⁵ A/cm². For the 50 h milled sample, this is the case for an applied field of 6.4 T. At 20 K, the pinning is strongly reduced. The reasons for this behaviour are related to grain size effects, lattice deformation as well as the impurity content of the bulk samples. Compared to other MgB2 bulk samples prepared by different techniques [8,29,32], our results are similar to optimized sintered samples and thin films, revealing an enhanced flux pinning at higher temperatures and in higher external magnetic fields, respectively.
5
Magnetic Properties of MgB2 Bulk Material
The upper critical field as well as the irreversibility field were determined to investigate once more the influence of the preparation parameters on the superconducting properties. For the investigations, a PPMS device with standard four-point method was used to determine the transition curves at different external magnetic fields up to 9 T. Hc2 and Hirr were determined at 90 % of the normal-state resistance and zero resistance, respectively. Additionally, Hirr was determined from the jc curves using the value of 10 A/cm² as irreversibility criterion. Figure 11 displays the temperature dependence of Hc2, revealing an almost linear behaviour for the differently milled samples. Near Tc a positive curvature is detectable [23]. Compared to sintered [3] and thin film [8] MgB2 samples, the mechanically alloyed samples show a behaviour of Hc2 near Tc that lies between the dirty and the clean limit [3]. The MgB2 bulk sample
Fig. 11. Temperature dependence of the upper critical field Hc2 and of the irreversibility field H irr of differently milled and subsequently hot pressed samples (973 K, 640 MPa, 10 min)
prepared from 50 h milled powder possesses the highest upper critical field for a given temperature. Shorter as well as longer milling times reduce Hc2 considerably. The behaviour of Hirr for the same bulk samples is also plotted in Fig. 11, showing the same temperature dependence as Hc2. Again, the 50 h milled bulk sample shows the best field dependence, which is comparable to thin films [8]. The comparison of inductive and resistive measurements of Hirr (Fig. 12) reveals a good agreement of the differently measured values. The ratio of the resistively determined Hirr to the inductively determined Hirr is estimated to be about 0.8. The reason for this is the highly dense but still pore-containing microstructure of the bulk samples as seen in the SEM micrographs. From these results it can be concluded that there is a good grain connectivity, which nevertheless can still be improved by applying higher pressure or longer pressing times during hot pressing. The comparison of Hc2 and Hirr in dependence of preparation parameters and temperature gives some information about the flux pinning capability of the mechanically alloyed nanocrystalline material. The Hirr-to-Hc2 ratio of about 0.7 that is found for the 50 h milled sample for higher external magnetic fields is comparable with the data reported for thin films [8]. This high ratio, compared to a ratio of ∼ 0.5 for untextured bulk samples [3], can be explained by the large number of grain boundaries due to the nanocrystalline microstructure, which causes improved flux pinning.
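The resistive criteria used in this section (Hc2 at 90 % of the normal-state resistance, Hirr at vanishing resistance) amount to finding where a measured transition curve crosses a chosen fraction of the normal-state value. A minimal Python sketch of such an evaluation by linear interpolation is given below. The function name, the synthetic data and the use of a small finite fraction in place of strictly zero resistance are assumptions for illustration only, not the procedure actually used for Fig. 11.

```python
def temperature_at_fraction(temps, resistances, r_normal, fraction):
    """Return the temperature at which R first rises through fraction * r_normal.

    temps must be sorted in ascending order. Linear interpolation is used
    between the two neighbouring points that bracket the criterion.
    """
    target = fraction * r_normal
    points = list(zip(temps, resistances))
    for (t0, r0), (t1, r1) in zip(points, points[1:]):
        if r0 < target <= r1:
            return t0 + (target - r0) * (t1 - t0) / (r1 - r0)
    raise ValueError("criterion is not crossed within the data")


if __name__ == "__main__":
    # Synthetic transition curve at one fixed applied field (invented numbers).
    temps = [28.0, 29.0, 30.0, 31.0, 32.0]
    resistances = [0.0, 0.0, 0.2, 0.8, 1.0]
    t_hc2 = temperature_at_fraction(temps, resistances, 1.0, 0.90)
    t_irr = temperature_at_fraction(temps, resistances, 1.0, 0.01)
    print(f"T(Hc2 criterion) = {t_hc2:.2f} K, T(Hirr criterion) = {t_irr:.2f} K")
```

Repeating this for each applied field would yield one point per field on the Hc2(T) and Hirr(T) lines.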
6
Monofilamentary MgB2 -Tapes
The potential of the mechanically alloyed powder for applications was investigated by fabrication of wires and tapes using the powder-in-tube (PIT) method [33,34]. For our investigation, we chose Cu and Fe as sheath materials. Both metals show a good thermal and electrical conductivity as well as sufficient mechanical strength to serve in applications for MgB2. As powder material we took the 50 h milled partially reacted MgB2 powder. The mixture of ductile Mg, brittle MgB2 and B is favourable for the deformation process because lower mechanical forces have to be applied. The composite was deformed by swaging and drawing into a wire. After wire drawing, the sample was flat rolled, giving a superconducting filling factor of about 31–37 %. To obtain a fully reacted MgB2 conductor as well as a low density of lattice defects without any stoichiometry change, low temperatures together with annealing times varying between 3 and 10 h were chosen. The heat treatment was carried out in sealed crucibles as well as with the samples wrapped in a Ta foil. The samples were exposed to different peak temperatures and dwell times under pure Ar atmosphere.
Fig. 12. Comparison of the temperature dependence of the irreversibility field Hirr of differently milled and subsequently hot pressed samples (973 K, 640 MPa, 10 min) measured resistively and determined inductively from magnetization curves using the Bean model
6.1
Phase Composition, Texture and Microstructure
After mechanical deformation and subsequent heat treatment we obtained a fully reacted MgB2 bulk material inside the sheath as determined by X-ray diffraction. Some WC impurities stemming from mechanical alloying and MgO distributed in the material were detected. Rietveld refinement gives no hint of texture in the tapes, and reveals a slightly increased lattice cell volume with a = 0.3091 nm and c = 0.3532 nm compared to the theoretical values of 0.3085 nm and 0.3523 nm. The coherent scattering length equivalent to the grain size was calculated in the same way as for the powder samples and indicates a nanocrystalline microstructure of the tapes with 26 nm average MgB2 grain size. The internal strain is of the same order of magnitude as for
Fig. 13. SEM picture taken from the cross-section of a Fe-cladded tape annealed at 773 K for 3 h
the hot pressed bulk samples, indicating a distinct but incomplete defect annihilation during the heat treatment. SEM micrographs reveal a porous and inhomogeneous microstructure inside the filaments (Fig. 13). At the interface between the MgB2 core and the Cu sheath a MgCu2 reaction layer is clearly visible. No cracks were observed, but the densification of the material still has to be improved. Further details are given in Ref. [34].
6.2
Superconducting Properties
The critical transition temperature of the tapes is reduced in comparison to the hot pressed bulk samples prepared from mechanically alloyed powder. For the Fe-sheathed conductors, Tc is higher, reaching a maximum value of 33.5 K for the Fe-sheathed tape annealed for 3 h at 973 K, compared with the Cu-sheathed tapes, which exhibit a maximum value of 32.0 K for 3 h annealing at 873 K. Heat treatments at higher temperatures seem to improve Tc whereas longer annealing times do not affect it or slightly reduce it. ∆Tc is in the range of 0.5 to 2 K and also depends on the sheath material and on the temperature and duration of the heat treatment. For the critical current density we find the same dependence on sheath material and heat treatment as already discussed for the critical temperature. Fe-sheathed tapes reach a substantially higher jc for all heat treatments than the Cu-sheathed filaments. Concerning the optimization of jc, heat treatments at low annealing temperatures and short annealing times are favourable. For applied external magnetic fields we achieve a jc of 2.2 · 10⁴ A/cm² at 7.5 T and 4.2 K (Fig. 14). This value lies above those of transport jc measurements obtained on Fe tapes prepared from commercial single-phase MgB2 milled for different times [35].
Fig. 14. Critical current densities as a function of external field for different tapes with Cu and Fe sheath, respectively. The field was applied parallel to the main plane of the tapes
Fig. 15. Temperature dependence of the upper critical field (Hc2 ) and the irreversibility field (H irr ) of the Fe-cladded tape annealed at 773 K for 6 h and for a bulk precursor pellet hot pressed at 973 K for 10 min
The field dependence of the superconductivity in an Fe-sheathed tape shows the same behaviour as obtained for the hot pressed bulk samples (Fig. 15). There is only a moderate displacement of the curve towards lower Hc2 values for the same temperature. Hence, there is still some potential in the tape fabrication to improve the flux pinning capability, e.g. by higher densification. The irreversibility field behaves in the same way as Hc2, i.e. Hirr of the tape is smaller for a given temperature than that of the bulk sample. The Hc2-to-Hirr ratio, however, is nearly the same at about 0.8.
7
Conclusions
Mechanical alloying is a unique preparation technique to synthesize nanocrystalline MgB2 powders for potential application in wires and tapes. In this paper the influence of different preparation parameters of the mechanical alloying process and the effect of different hot pressing times were demonstrated. Mechanical alloying yields partially reacted nanocrystalline powder with about 10 nm grain size containing MgB2 as well as residual Mg and B. There is an optimum milling time of 50 h for achieving a favourable microstructure as well as optimized superconducting properties for stoichiometric MgB2 bulk samples. Hot pressing at the rather low temperature of 973 K for 10 min at a pressure of 640 MPa leads to highly dense, nearly single-phase MgB2 bulk samples including a small content of MgO and WC wear debris from the milling tools. The microstructure remains nanocrystalline with a non-vanishing residual internal strain. The lattice is distorted, with a layer distance c slightly larger than the theoretical value, which indicates that impurities like oxygen are built into the structure. Longer heat treatment leads to an improvement of the microstructure but cannot annihilate all the defects. The superconducting properties exhibit a strong correlation with the mechanical alloying and heat treatment parameters. The 50 h mechanically alloyed samples exhibit the highest Tc of 34.5 K, which lies well below the Tc of 39 K of conventional MgB2 samples, as well as the highest jc with 1 · 10⁵ A/cm² at 7.5 K in a magnetic field of 6.4 T. Improved flux pinning in the samples is confirmed by high Hc2 and Hirr values and an improved Hirr-to-Hc2 ratio of 0.7 compared to conventionally sintered samples. Monofilamentary wires and tapes fabricated from partially reacted powder show very good Hc2 and Hirr values, which are comparable to thin films and irradiated samples.
Hence, we can conclude that mechanical alloying is an easy and unique tool to improve the superconducting properties of MgB2 , which is promising for potential applications. Nevertheless, further optimization of the preparation parameters is necessary to achieve a higher densification of the material and a reduced impurity content.
References
1. J. Nagamatsu, N. Nakagawa, T. Muranaka, Y. Zenitani, J. Akimitsu, Nature 410, 63 (2001).
2. A. Serquis, X.Z. Liao, Y.T. Zhu, J.Y. Coulter, J.Y. Huang, J.O. Willis, D.E. Peterson, F.M. Mueller, N.O. Moreno, J.D. Thompson, V.F. Nesterenko, S.S. Indrakanti, J. Appl. Phys. 92, No. 1, 351 (2002).
3. G. Fuchs, K.-H. Müller, A. Handstein, K. Nenkov, V.N. Narozhnyi, D. Eckert, M. Wolf, L. Schultz, Solid State Commun. 118, 497 (2001).
4. D.C. Larbalestier, A. Gurevich, D.D. Feldmann, A. Polyanskii, Nature 414, 358 (2001).
5. C. Buzea, T. Yamashita, Supercond. Sci. Technol. 14, R115 (2001).
6. R. Flükiger, H.L. Suo, N. Mugolino, C. Benedice, P. Toulemonde, P. Lezza, Physica C 385, 286 (2003).
7. D.C. Larbalestier, L.D. Cooley, M.O. Rikel, A.A. Polyanskii, J. Jiang, S. Patnaik, X.Y. Cai, D.M. Feldmann, A. Gurevich, A.A. Squitieri, M.T. Naus, C.B. Eom, E.E. Hellstrom, R.J. Cava, K.A. Regan, N. Rogado, M.A. Hayward, T. He, J.S. Slusky, P. Khalifah, K. Inumaru, M. Haas, Nature 410, 186 (2001).
8. X. Zeng, A.V. Pogrebnyakov, A. Kotcharov, J.E. Jones, X.X. Xi, E.M. Lysczek, J.M. Redwing, S. Xu, Q. Li, J. Lettieri, D.G. Schlom, W. Tian, X. Pan, Z.K. Liu, Nature Mater. 1, 1 (2002).
9. Y. Bugoslavsky, L.F. Cohen, G.K. Perkins, M. Polichetti, T.J. Tate, R. Gwilliam, A.D. Caplin, Nature 411, 561 (2001).
10. S. Brutti, A. Ciccioloi, G. Balducci, G. Gigli, P. Manfrinetti, A. Palenzona, Appl. Phys. Lett. 80, No. 16, 2892 (2002).
11. S.X. Dou, A.V. Pan, S. Zhou, M. Ionescu, H.K. Liu, P.R. Munroe, Supercond. Sci. Technol. 15, 1587 (2002).
12. T. Massalski (Ed.), Binary Alloy Phase Diagrams, second ed., ASM International, Materials Park, OH, 1990.
13. A. Handstein, D. Hinz, G. Fuchs, K.-H. Müller, K. Nenkov, O. Gutfleisch, V.N. Narozhnyi, L. Schultz, J. Alloys and Comp. 329, 285 (2001).
14. J. Eckert, J.C. Holzer, C.E. Krill III, W.L. Johnson, J. Appl. Phys. 73, No. 6, 2794 (1993).
15. L. Schultz, J. Eckert, Topics in Applied Physics 72, Springer-Verlag Berlin, Heidelberg (1994).
16. J. Eckert, L. Schultz, J. Mat. Sci. 26, 441 (1991).
17. J. Eckert, L. Schultz, E. Hellstern, K. Urban, J. Appl. Phys. 64, No. 6, 3224 (1988).
18. D. Dew-Hughes, Phil. Mag. B 55, No. 4, 459 (1987).
19. A. Gümbel, O. Perner, J. Eckert, G. Fuchs, K. Nenkov, K.-H. Müller, L. Schultz, IEEE Trans. Appl. Supercond. (in press).
20. V.N. Narozhnyi, G. Fuchs, A. Handstein, A. Gümbel, J. Eckert, K. Nenkov, D. Hinz, O. Gutfleisch, A. Wälte, L.N. Bogacheva, I.E. Kostyleva, K.-H. Müller, L. Schultz, J. Supercond. 15, No. 6, 599 (2002).
21. Y.D. Gao, J. Ding, G.V.S. Rao, B.V.R. Chowdari, W.X. Sun, Z.X. Shen, Phys. Stat. Sol. (a) 191, No. 2, 548 (2002).
22. W. Goldacker, S.I. Schlachter, S. Zimmer, H. Reiner, Supercond. Sci. Technol. 14, 787 (2001).
23. A. Gümbel, J. Eckert, G. Fuchs, K. Nenkov, K.-H. Müller, L. Schultz, Appl. Phys. Lett. 80, No. 15, 2725 (2002).
24. P. Scherrer, Gött. Nachr. 2, 98 (1918).
25. A.R. Stokes, A.J.C. Wilson, Proc. Phys. Soc. (London) 56, 174 (1944).
26. R. Flükiger, P. Lezza, C. Benedice, N. Mugolino, H.L. Suo, Supercond. Sci. Technol. 16, 264 (2003).
27. Y. Bugoslavsky, L. Cowey, T.J. Tate, G.K. Perkins, J. Moore, Z. Lockman, A. Berenov, J.L. MacManus-Driscoll, A.D. Caplin, L.F. Cohen, H.Y. Zhai, H.M. Christen, M.P. Paranthaman, D.H. Lowndes, M.H. Jo, M.G. Blamire, Supercond. Sci. Technol. 15, 1392 (2002).
28. R.L. Testardi, R.L. Meek, J.M. Poate, W.A. Royer, A.R. Storm, J.H. Wernick, Phys. Rev. B 11, 4303 (1975).
29. D.K. Finnemore, J.E. Ostenson, S.L. Bud’ko, G. Lapertot, P.C. Canfield, Phys. Rev. Lett. 86, 2420 (2001).
30. P.C. Canfield, D.K. Finnemore, S.L. Bud’ko, J.E. Ostenson, G. Lapertot, C.E. Cunningham, C. Petrovic, Phys. Rev. Lett. 86, 2423 (2001).
31. C.P. Bean, Rev. of Modern Physics, 31 (1964).
32. V. Braccini, L.D. Coopley, S. Patnaik, D.C. Larbalestier, P. Manfrinetti, A. Palenzona, A.S. Siri, Appl. Phys. Lett. 81, No. 24, 4577 (2002).
33. W. Häßler, C. Rodig, C. Fischer, B. Holzapfel, O. Perner, J. Eckert, K. Nenkov, G. Fuchs, Supercond. Sci. Technol. 16, 281 (2003).
34. C. Fischer, C. Rodig, W. Häßler, O. Perner, J. Eckert, K. Nenkov, G. Fuchs, M. Schubert, H. Wendrock, L. Schultz (to be published).
35. R. Flükiger, P. Lezza, C. Benedice, N. Mugolino, H.L. Suo, Supercond. Sci. Technol. 16, 264 (2003).
Field-Induced Superconductivity in Films with Magnetic Dots
M. Lange, M. J. Van Bael, M. Morelle, S. Raedts, and V. V. Moshchalkov
Laboratorium voor Vaste-Stoffysica en Magnetisme, Katholieke Universiteit Leuven, Celestijnenlaan 200 D, B-3001 Leuven, Belgium
Abstract. Magnetic-field-induced superconductivity (FIS) is a rare phenomenon that has been observed up to now in only very few materials. We show that this effect can be generated in type-II superconducting Pb films by covering them with nanoengineered arrays of Co/Pd dots with perpendicular magnetic anisotropy. In zero applied field and close to the critical temperature, the stray field of the dots is larger than the critical field of the superconductor. A magnetic field applied perpendicular to the sample surface compensates the returning stray field between the dots, resulting in the appearance of superconductivity. The use of nanoengineered dots makes it possible to control FIS over a broad field range by varying the distance between the dots and the amount of magnetic material in each dot.
1
Introduction
As a macroscopic quantum phenomenon, superconductivity is not only interesting from a fundamental point of view, but also for technological applications due to the remarkable capability of superconductors to carry current with zero resistance. Unfortunately the superconducting state is very fragile and is suppressed when the magnetic field H, the current density j or the temperature T exceed their corresponding critical values Hc, jc or Tc. Numerous efforts have been made to optimize these parameters and to enlarge the boundaries of the superconducting state in the (H, j, T)-space. The strong internal fields in ferromagnets have a destructive action on the superconducting Cooper pairs through spin pair breaking. Therefore ferromagnetism and superconductivity have been considered as two antagonistic phenomena for a long time. Creating artificial hybrid nanostructures composed of both superconductors and ferromagnets gives rise to many new physical phenomena, such as the appearance of the π-phase in ferromagnet/superconductor multilayers [1], or strong vortex pinning in superconducting films by arrays of magnetic dots [2]. Here we report on the manipulation of the upper critical field of a superconducting Pb film by covering the film with an array of nanostructured Co/Pd dots with perpendicular magnetic anisotropy. We demonstrate that in these hybrid systems, the Pb film can be in the normal state at zero field, but superconductivity emerges by applying a magnetic field. This rare phenomenon, called magnetic-field-induced superconductivity (FIS), is counterintuitive and has up to now been observed in only three homogeneous bulk materials.
This paper is organized as follows: First we give, as an introduction, a summary of earlier observations of FIS in homogeneous bulk materials. After that the preparation of the investigated sample is described, and the structural and magnetic properties of the sample are characterized. In a third part we present the Tc(H) phase boundary of the superconductor and demonstrate the appearance of FIS in these hybrid systems. Finally we discuss how this effect can be optimized to appear at even larger applied fields.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 721–730, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Field-Induced Superconductivity in Homogeneous Materials
The first material where FIS was observed is (EuSn)Mo6S8 [3,4]. At zero field, this compound is superconducting, and in small fields, the normal state is entered. Very surprisingly, when applying high fields between 4 and 22 T, the material becomes superconducting again. The appearance of FIS was interpreted in terms of the Jaccarino-Peter effect [5]: the exchange fields from the paramagnetic Eu ions compensate an applied magnetic field, so that the destructive action of the field is neutralized. A second system that shows FIS, but at significantly lower fields around 0.1 T, is HoMo6S8 [6]. FIS is observed at a certain sweep rate of the applied field. Measuring simultaneously magnetization and resistance of this compound as a function of the applied field shows that jumps in the resistance go hand in hand with jumps in the magnetization. This indicates that a purely electromagnetic field compensation effect between the applied field and internal magnetic fields is responsible for the occurrence of FIS in this sample. Only very recently FIS was discovered in organic λ-(BETS)2FeCl4 materials [7,8] in ultra high fields between 18 and 41 T. In this organic compound the exchange field of the Fe3+ ions is responsible for the compensation of the applied field, and, thus, for the occurrence of the superconducting state. Interestingly, the formation of the nonuniform Fulde-Ferrell-Larkin-Ovchinnikov (FFLO) state is predicted within a certain temperature and field region of the phase diagram of λ-(BETS)2FeCl4 [9]. FIS in these three compounds is caused by field compensation effects between the applied field on the one hand, and either the exchange field or internal magnetic fields on the other hand.
In the next sections we will show that the stray field of artificially prepared magnetic dots on top of the superconductor can give rise to very similar field compensation effects, resulting in the appearance of FIS.
3
Sample Preparation and Characterization
The investigated sample is an array of magnetic Co/Pd dots covering a Pb transport bridge for resistivity measurements. The structure of the sample is schematically shown in Fig. 1(a). The sample consists of an 85 nm superconducting Pb film on a 1 nm Ge base layer on an amorphous Si/SiO2 substrate. All layers are deposited by electron beam evaporation in a molecular beam epitaxy apparatus. The Pb film is covered by a 10 nm Ge layer for protection against oxidation, and to prevent the influence of the proximity effect. This continuous Ge/Pb/Ge trilayer is then patterned into a transport bridge (width 200 µm) using optical lithography and chemical wet etching. The ferromagnetic dots are made by defining a resist mask on the transport bridge by electron beam lithography using a scanning electron microscope (SEM) and subsequent evaporation of a Pd(3.5 nm) base layer and a [Co(0.4 nm)/Pd(1.4 nm)]10 multilayer into the resist mask. After deposition, the resist is removed in a lift-off procedure. Figure 1(b) shows an atomic force microscope (AFM) image of a reference sample that was prepared on a continuous Ge/Pb/Ge trilayer. This sample has a thinner Pb layer (36 nm), but the fabrication process of the Co/Pd dots was exactly the same as for the dots on the transport bridge, using the same parameters during electron beam lithography. Moreover, the Co/Pd multilayer is simultaneously evaporated into the resist masks of both samples. The dots are arranged in a regular square array with period L = 1.5 µm. They have a square shape (side length about 0.8 µm) with slightly irregular edges. The small spots on top of the dots are probably remainders of the resist, which do not influence the magnetic properties of the dots. Note that very similar samples have been used to investigate vortex pinning in type-II
[Figure 1(a) layer stack, top to bottom: 23 nm Co/Pd, 10 nm Ge, 85 nm Pb, 1 nm Ge, substrate]
Fig. 1. (a) Schematic presentation of the prepared sample for resistivity measurements. (b) Atomic force microscope image (5 × 5 µm2 ) of a reference Co/Pd dot array covering a Ge/Pb/Ge trilayer
superconducting films by arrays of magnetic nanostructures [10,11] or by the domain structure in a continuous Co/Pt multilayer [12]. Co/Pd multilayers containing ultrathin Co layers have an easy axis of magnetization perpendicular to the sample surface [13]. Figure 2 shows the hysteresis loop of the reference sample measured by magneto-optical Kerr effect with H applied perpendicular to the surface. The loop has a large remanent magnetization of Mrem = 0.8Msat , with Msat the saturation magnetization, and a coercive field µ0 Hcoe ≈ 0.13 T. Thus, after magnetizing or demagnetizing the sample, the dots are in quite stable remanent magnetic domain states, so that the application of relatively small magnetic fields does not result in significant changes of the domain structure. The domain states after different magnetization procedures are investigated by magnetic force microscopy (MFM), see Fig. 3. After demagnetizing the dots by oscillating H (perpendicular to the sample surface) around zero with decreasing amplitude, the signal from each of the dots consists of dark and bright spots, as shown in Fig. 3(a). This means that most probably there are several magnetic domains present in the dots. The net magnetic moment m of each dot in this state is approximately zero (m = 0). Saturating the dots in a large positive perpendicular field aligns all magnetic moments along the positive z-direction (mz > 0), where the z-axis is perpendicular to the sample surface. This causes the dots to appear brighter compared with the signal between the dots as is shown in the MFM image in Fig. 3(b). After saturation of the dots in a large negative field, resulting in mz < 0, a darker contrast is observed in the MFM image, see Fig. 3(c). Simultaneous recording of magnetic and topographic images shows that the spots visible on dots in Fig. 3(b) and (c) coincide with the spots that were attributed to the remainders of the resist. Therefore, they are not due to a magnetic signal.
[Figure 2 plot: M/Msat (−1.0 to 1.0) versus µ0H (−0.2 to 0.2 T)]
Fig. 2. Hysteresis loop of a reference Co/Pd dot array measured by magneto-optical Kerr effect at room temperature. H is applied perpendicular to the sample surface
[Figure 3 panels: (a) m = 0, (b) mz > 0, (c) mz < 0]
Fig. 3. Magnetic force microscope images (5 × 5 µm2 ) measured in zero field and at room temperature. The sample was (a) demagnetized, (b) magnetized in µ0 H = +1 T, (c) magnetized in µ0 H = −1 T before the measurement
4
Experimental Results
The Tc(H) phase boundary of the Pb transport bridge has been determined from resistivity measurements with the magnetic dots in the three magnetic states shown in the MFM images (Fig. 3). Fig. 4 presents ρ(T) curves through the superconducting transition for some selected fields H applied perpendicular to the sample surface. For a reason that is given below, we have indicated the values of H as integer multiples of the first matching field H1, which is defined as the field where one vortex of the Abrikosov vortex lattice occupies one unit cell of the dot array. H1 is given by
$$ \mu_0 H_1 = \frac{\phi_0}{L^2} = 0.919\ \mathrm{mT}, \qquad (1) $$
with φ0 = 2.068 · 10⁻¹⁵ T m² the superconducting flux quantum and L = 1.5 µm the lattice constant of the dot array. Sample resistance is measured with an ac-current of 10 µA at a frequency of 19 Hz, and then the resistivity ρ is normalized to the normal state resistivity ρn = 1.4 Ω cm at 7.3 K. When the Co/Pd dot array is in the m = 0 state, the curves in Fig. 4(a) show the expected behavior of conventional superconductors: the highest critical temperature Tc is observed at zero applied field, and Tc shifts to lower temperatures as H is increased. Note that the Pb film has a very sharp transition with a width of about 0.03 K in zero field. Figure 4(b) shows measurements that are carried out in the same fields as those presented in Fig. 4(a), but with the dots positively magnetized (mz > 0)
[Fig. 4 panels: ρ/ρn versus T (K) with the dots in the (a) m = 0, (b) mz > 0 and (c) mz < 0 states]
Fig. 4. Temperature sweeps of the resistivity ρ through the superconducting transition of the Pb film with the dots (a) in the m = 0 state, (b) in the mz > 0 state, and (c) in the mz < 0 state at different applied fields
before the measurement. Very surprisingly, the ρ(T) curves are significantly influenced by the magnetic state of the dots: the highest value of Tc is obtained at H/H1 = +2, and when the field is increased or decreased from this value, the ρ(T) curves shift to lower temperatures. After magnetizing the dots in a large negative field (mz < 0), the behavior is reversed with respect to the polarity of H: the highest critical temperature of the Pb film is obtained at H/H1 = −2, as shown in Fig. 4(c). From these measurements we have constructed the Tc(H) phase boundary, where we defined the critical temperature as Tc = T(ρ = 10% ρn). A conventional phase boundary, symmetric with respect to H, is obtained when m = 0, see Fig. 5(a). The only features indicating the presence of the magnetic dot array on top of the superconductor in this state are the two kinks in the curve at H = ±H1. The phase boundaries are clearly altered by changing the magnetic state of the dot array to mz > 0 or mz < 0. Both are strongly asymmetric with respect to H. Moreover, the maximum Tc is shifted to +2H1 when mz > 0 and to −2H1 when mz < 0. Kinks in the Tc(H) curves can be seen at H/H1 = 0, +1, +2 and +3 for mz > 0 and at H/H1 = 0, −1, −2 and −3 for mz < 0. FIS is observed as a consequence of the field-shift of the maximum value of Tc, as is demonstrated in Fig. 5(b). In these graphs the field-dependence of
[Fig. 5 panels: (a) H/H1 (from −4 to 4) versus T (K) (from 7.16 to 7.24) for mz > 0, m = 0 and mz < 0; (b) ρ/ρn (from 0.0 to 1.0) versus H/H1 (from −6 to 6) for mz > 0 and mz < 0]
Fig. 5. (a) Tc (H) phase boundaries of the Pb film () in the mz > 0 state, (•) in the m = 0 state and (∇) in the mz < 0 state. (b) Field-dependence of the resistivity at T = 7.20 K (◦) in the mz > 0 state and (•) in the mz < 0 state
the resistivity is shown when measuring at constant temperature T = 7.20 K. For instance, for mz > 0 and T = 7.20 K, the sample is in the normal state in zero field, but when a positive field around +2 mT is applied, the Pb film becomes superconducting. Similarly, when the magnetic state of the dots is switched to mz < 0, superconductivity is induced by applying a negative field of about −2 mT, see Fig. 5(b).
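The field scale of these observations follows directly from Eq. (1). The short sketch below simply evaluates µ0Hn = nφ0/L² for the values quoted in the text. The function and constant names are illustrative choices, not part of the original work.

```python
# Matching fields of a square dot array, mu0*H_n = n * phi0 / L**2 (Eq. 1).
PHI0 = 2.068e-15   # superconducting flux quantum in T m^2 (value quoted in the text)
L_DOTS = 1.5e-6    # lattice constant of the dot array in m (value quoted in the text)


def matching_field_tesla(lattice_constant_m: float, order: int = 1) -> float:
    """Return mu0*H_n in tesla for a square array with the given period."""
    return order * PHI0 / lattice_constant_m ** 2


if __name__ == "__main__":
    for n in (1, 2, 3):
        field_mt = matching_field_tesla(L_DOTS, n) * 1e3
        print(f"mu0*H{n} = {field_mt:.3f} mT")
```

With these inputs the second matching field comes out slightly below 2 mT, consistent with the "about ±2 mT" at which superconductivity is induced in Fig. 5(b).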
5
Discussion
In the present system, the appearance of FIS can be explained by taking into account the local magnetic stray field of the dots Bstray . Let us assume that the dots are in the mz > 0 state, as schematically depicted in Fig. 6.
[Fig. 6 schematic labels: dot moment m, stray field Bstray, applied field H, z-axis]
Fig. 6. Schematic drawing to explain the appearance of magnetic-field-induced superconductivity (FIS). A magnetic field H of about +2 mT applied in the z-direction compensates Bstray between the dots, so that at T = 7.20 K, the Pb film is superconducting between the dots, but in the normal state in zero applied field
The magnetic stray field Bstray of each dot has a positive z-component Bstray,z under the dots and a negative one in the area between the dots. When H = 0, these dipole fields exceed the upper critical field of the Pb film for T > 7.185 K, and, as a result, the Pb film is in the normal state. An applied positive field can compensate Bstray,z in the interdot area, where the Pb film then becomes superconducting. This provides percolation through predominantly superconducting areas and makes possible the continuous flow of Cooper pairs and zero film resistance. The periodic kinks appearing in the Tc(H) phase boundaries shown in Fig. 5(a) are due to fluxoid quantization. Similar effects are observed in the phase boundaries of superconducting wire networks [14] and in thin superconducting films that contain regular arrays of holes ("antidots") [15,16]. In low fields, nucleation of superconductivity in the latter system is characterized by collective oscillations in the Tc(H) phase boundary with a period given by H1. Fluxoid quantization gives rise to vortex patterns that form lattices commensurate with the underlying antidot lattice. The two kinks observed in the phase boundary of the Pb film with the dots in the m = 0 state are due to very similar fluxoid quantization effects. Nucleation of superconductivity first takes place in the interstitial regions of the dot array, while the stray field of the multidomain dots creates normal regions under the dots. The interstitial regions are multiply connected, comparable to superconducting films containing antidot lattices. The most striking feature of the system investigated here is that the maximum Tc is exactly at H = +2H1 when mz > 0 and at H = −2H1 when mz < 0. This can be understood by taking the stray field of the dots into account. From magnetostatic calculations, we estimate the magnetic flux in the interstitial regions of the dot array to be about 2.1φ0 per unit cell. Therefore, the highest value of Tc is achieved when a field of +2H1 is applied for mz > 0, and a field of −2H1 for mz < 0. Note that, following these considerations, no shift of the maximum value of Tc will be observed in the phase boundary when the stray field of the dots is too weak to induce vortex-antivortex pairs in the superconducting film. Thus, a prerequisite for the appearance of FIS in these systems is that the stray field of the dots must be of the order of one flux quantum. A more sophisticated theoretical treatment is needed to understand all details of the phase boundaries, for instance by solving the linearized Ginzburg-Landau equations taking into account the inhomogeneous stray fields of the dots. There are two possibilities to tune the field region in which FIS could be observed in these hybrid systems: (i) by varying the distance between the dots, and (ii) by changing the magnitude of the stray field. The distance between the dots determines the value of the first matching field H1, and therefore has a direct influence on the field region of the FIS. For instance, arrays of dots with perpendicular anisotropy have been fabricated with a period of 70 nm [17], corresponding to µ0H1 ≈ 0.4 T. Using the most advanced
electron beam lithography machines, it has become possible to fabricate even smaller patterns, which can be as small as ∼10 nm at present [18]. This implies that fields of the order of 1 T are in principle reachable. However, one should bear in mind that fluxoid quantization requires a large stray field for the appearance of FIS, which could be achieved by increasing the magnetic moment of the dots, for example by using nanopillars [19] made from materials with a large magnetic moment such as Gd. Finally, we would like to point out that the dipole array field compensator could possibly be used in applications. A lot of attention is currently paid to materials showing giant magnetoresistance effects and to the implementation of these materials as field sensors in magnetic recording. Besides improving and shifting the critical fields, the nanoengineered FIS could also be used in this kind of application, since tunable huge magnetoresistance effects such as those presented in Fig. 5(b) are ideal for logical switches or field sensors.
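The relation between array period and accessible field range can be inverted to ask which period is needed for a given matching field. The sketch below assumes a square array, so that µ0H1 = φ0/L² as in Eq. (1). The helper names are hypothetical.

```python
import math

PHI0 = 2.068e-15  # superconducting flux quantum in T m^2 (value quoted in the text)


def first_matching_field(period_m: float) -> float:
    """mu0*H1 in tesla for a square dot array with the given period."""
    return PHI0 / period_m ** 2


def period_for_field(field_tesla: float) -> float:
    """Array period in metres whose first matching field equals field_tesla."""
    return math.sqrt(PHI0 / field_tesla)


if __name__ == "__main__":
    print(f"70 nm period: mu0*H1 = {first_matching_field(70e-9):.2f} T")
    print(f"mu0*H1 = 1 T needs a period of {period_for_field(1.0) * 1e9:.1f} nm")
```

Under this assumption, the 70 nm period of [17] reproduces the quoted µ0H1 ≈ 0.4 T, and a matching field of 1 T corresponds to a period of a few tens of nanometres.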
6
Conclusions
The Tc(H) phase boundary between the normal and the superconducting state of a Pb film covered by a nanoengineered array of magnetic dots with perpendicular anisotropy has been investigated. The Tc(H) curve depends significantly on the magnetic state of the dot array. Multidomain dots have only a minor influence on the phase boundary, while positively or negatively magnetized dots cause a shift of the highest Tc to a finite positive or negative field, respectively. As a consequence of this shift, magnetic-field-induced superconductivity is observed in a certain temperature range close to the phase boundary. The field range where this rare effect appears can be controlled by tuning the distance between the dots or the amount of magnetic material in each dot.
Acknowledgements
The authors are thankful to E. Claessens for help with the measurements, and to Y. Bruynseraede for fruitful discussions. This work was supported by the Belgian IUAP and the Flemish GOA programs, by the ESF "VORTEX" program, and by the Fund for Scientific Research (F.W.O.) - Flanders. M.J.V.B. is a Postdoctoral Research Fellow of the F.W.O.-Flanders.
References
1. Ż. Radovic, M. Ledvij, L. Dobrosavljević-Grujić, A.I. Buzdin, and J.R. Clem, Phys. Rev. B 44, 759 (1991).
2. J.I. Martín, M. Vélez, J. Nogués, and I.K. Schuller, Phys. Rev. Lett. 79, 1929 (1997).
3. S.A. Wolf, W.W. Fuller, C.Y. Huang, D.W. Harrison, H.L. Luo, and S. Maekawa, Phys. Rev. B 25, 1990 (1982).
4. H.W. Meul, C. Rossel, M. Decroux, Ø. Fischer, G. Remenyi, and A. Briggs, Phys. Rev. Lett. 53, 497 (1984).
5. V. Jaccarino and M. Peter, Phys. Rev. Lett. 9, 290 (1962).
6. M. Giroud, O. Pena, R. Horyn, and M. Sergent, J. Low Temp. Phys. 69, 419 (1987).
7. S. Uji, H. Shinagawa, T. Terashima, T. Yakabe, Y. Terai, M. Tokumoto, A. Kobayashi, H. Tanaka, and H. Kobayashi, Nature 410, 908 (2001).
8. L. Balicas, J.S. Brooks, K. Storr, S. Uji, M. Tokumoto, H. Tanaka, H. Kobayashi, A. Kobayashi, V. Barzykin, and L.P. Gor'kov, Phys. Rev. Lett. 87, 067002 (2001).
9. M. Houzet, A. Buzdin, L. Bulaevskii, and M. Maley, Phys. Rev. Lett. 88, 227001 (2002).
10. M.J. Van Bael, K. Temst, V.V. Moshchalkov, and Y. Bruynseraede, Phys. Rev. B 59, 14674 (1999).
11. M. Lange, M.J. Van Bael, L. Van Look, K. Temst, J. Swerts, G. Güntherodt, V.V. Moshchalkov, and Y. Bruynseraede, Europhys. Lett. 53, 646–652 (2001); 56, 149 (2002).
12. M. Lange, M.J. Van Bael, V.V. Moshchalkov, and Y. Bruynseraede, Appl. Phys. Lett. 81, 322 (2002).
13. P.F. Carcia, A.D. Meinhaldt, and A. Suna, Appl. Phys. Lett. 47, 178 (1985).
14. B. Pannetier: in Quantum Coherence in Mesoscopic Systems, B. Kramer (Ed.) (Plenum Press, New York, 1991), pp. 457.
15. A. Bezryadin and B. Pannetier, J. Low. Temp. Phys. 98, 251 (1995).
16. V.V. Moshchalkov, V. Bruyndoncx, L. Van Look, M.J. Van Bael, Y. Bruynseraede, and A. Tonomura: Quantization and confinement phenomena in nanostructured superconductors, in Handbook of Nanostructured Materials and Nanotechnology III/9, H.S. Nalwa (Ed.) (Academic Press, San Diego, 2000), pp. 451.
17. K. Koike, H. Matsuyama, Y. Hirayama, K. Tanahashi, T. Kanemura, O. Kitakami, and Y. Shimada, Appl. Phys. Lett. 78, 784 (2001).
18. M.A. McCord and M.J. Rooks in Handbook of Microlithography, Micromachining, and Microfabrication I, P. Rai-Choudhury (Ed.) (SPIE-The International Society for Optical Engineering, Washington, 1997), p. 139.
19. S.Y. Chou, M.S. Wei, P.R. Krauss, and P.B. Fischer, J. Appl. Phys. 76, 6673 (1994).
Superconducting Quantum Interference Filters
Jörg Oppenländer
Lehrstuhl für Theoretische Festkörperphysik, Universität Tübingen, Auf der Morgenstelle 14, 72076 Tübingen, Germany
Abstract. Superconducting quantum interference filters (SQIFs) are multi-loop arrays of Josephson junctions possessing unconventional grating structures. For specially selected array loop size distributions the magnetic flux-to-voltage transfer function of SQIFs has a unique delta-peak-like characteristic at zero applied flux. In contrast to conventional superconducting quantum interference devices (SQUIDs), which possess periodic flux-to-voltage transfer functions, the unique voltage response of SQIFs allows such devices to be directly employed as detectors of the absolute strength of magnetic fields. The magnetic field resolution of SQIFs is comparable to or may be even better than that of conventional SQUIDs. In practice, SQIFs may have a number of significant advantages since their performance is not degraded by spreads in the Josephson junction parameters or deviations in the array loop sizes. This fault tolerance allows SQIFs to be realized in a relatively simple way and has been used for the successful development of high performance low- and high-Tc SQIFs.
1
Introduction
Devices based on superconducting quantum interference are known as the most sensitive sensors for measurements of magnetic fields. There is, however, still a search for further improvement of their sensitivity. In particular, for conventional high-Tc superconductor devices operated at 77 K, an enhancement of the magnetic field sensitivity is regularly limited by technological problems. Typically, a conventional superconducting magnetic field sensor consists of a superconducting quantum interference device (SQUID), i.e., a closed superconducting loop that contains two Josephson junctions. This SQUID is coupled to a pick-up coil or washer which focuses the external magnetic field into the SQUID loop. The magnetic field sensitivity of such a system is limited by the noise contribution of the two Josephson junctions. In principle, the sensitivity can be enhanced by coupling several SQUID loops together. Then the signal strength may increase by the number of loops N, but the noise may increase only by √N, provided the noise sources are uncorrelated. Therefore, the magnetic field sensitivity may increase by √N. In practice, however, it is not easy to implement this effect in an operating system. Up to now, it works only for low-Tc devices, but at 4.2 K the noise contribution of the junctions is not really relevant. For high-Tc devices the parameter spreads
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 731–745, 2003. © Springer-Verlag Berlin Heidelberg 2003
are in general much too large and the flux-to-voltage transfer function of coupled systems is seriously corrupted. The main reason for the degradation of the performance of coupled systems in the presence of parameter spreads is an intrinsic one. A conventional SQUID possesses a Φ0-periodic flux-to-voltage transfer function, where Φ0 = h/2e is the elementary flux quantum. If there are small deviations in the SQUID loop sizes or asymmetries in the junction parameters, the response of the system becomes quasiperiodic. The individual SQUIDs modulate with slightly different frequencies with respect to an external magnetic field. In addition, if the junction parameters are not identical, there are offsets in the individual responses and the modulations are out of phase, or, even worse, the modulation intervals of the individual SQUIDs do not overlap. All these problems can be solved by using the novel concept of superconducting quantum interference filters (SQIFs) that we developed in 2000 at the University of Tübingen. A SQIF is a multi-loop array of Josephson junctions possessing an unconventional grating structure, i.e., a multi-loop configuration that is characterized by an intrinsic nonperiodicity of the geometry of the structure. For such multiple-loop configurations the interference effects are generated by the phase-sensitive [1] superposition of a mesoscopic number of macroscopic array junction currents in the presence of an external magnetic field. For suitably selected array loop size distributions the flux-to-voltage transfer function of a SQIF has a unique delta-peak-like characteristic. The periodicity of the flux-to-voltage transfer function of conventional SQUIDs is removed. Therefore, the SQIF acts as a detector of the absolute strength of the magnetic field. As a result, the SQIF is extremely robust against parameter spreads or deviations in the loop sizes. This fault tolerance allows SQIFs to be realized in a relatively simple way and has been used for the successful development of high performance low- and high-Tc SQIFs. In this paper I will review the current status of the development of low- and high-Tc SQIFs. In Sect. 2 I briefly summarize the basic properties of standard single-loop two-junction SQUIDs and periodic multiple-loop parallel 1D arrays, i.e., geometrical configurations with conventional grating structures. Sect. 3 is focused on the basic properties of unconventional grating structures, and in Sect. 4 an outline of the theoretical description of superconducting quantum interference filters is given. Then, in Sect. 5, I will present a selection of experimental results on different high performance low- and high-Tc SQIFs. Finally, Sect. 6 is devoted to the discussion of these results and future perspectives.
2
Conventional Grating Structures: SQUIDs and Periodic Arrays
Consider, as indicated schematically in Fig. 1a, a standard two-junction superconducting quantum interference device (SQUID), for simplicity with symmetric junction parameters, under the dc current bias Ib > 2Ic [2]. Here Ic and R are the critical current and normal resistance of a single junction, respectively, and |aL| is the size of the SQUID loop. The external magnetic field B generates the flux Φ = ⟨B, aL⟩. The DC voltage response function Vxy of the SQUID, i.e., the time average of the rapidly oscillating voltage signal Vxy(t) across the nodes x and y of the circuit, is a Φ0-periodic function of the strength of the external magnetic field (Fig. 1a). Therefore, a two-junction SQUID cannot be directly employed as a detector of the absolute strength of an external magnetic field. A coupled system of standard two-junction SQUIDs is sketched in Fig. 1(b). This is a 1D array of N adjacent Josephson junctions connected in parallel [3,4]. The area elements of the N − 1 SQUID loops are all equal, i.e., an = aL for all n. The voltage response signal Vxy vs. strength |B| of the external magnetic field of such a periodic array has the same period Φ0 as a standard two-junction SQUID with loop area |aL| (Fig. 1b). For both devices, SQUID and periodic array, the flux-to-voltage transfer function is intrinsically Φ0-periodic. If such a circuit is used as a sensor inside a magnetometer, in principle only relative measurements of magnetic fields are possible. By using flux-locked-loop techniques [5] the periodic response may be linearized. However, if a changing magnetic field induces more than one flux quantum, special electronic devices are needed that count the number of flux quanta. Such additional devices possess only a limited maximum
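[Fig. 1 panels: (a) SQUID circuit (loop area aL, field B, angle α, bias current Ib, nodes x and y) with its voltage response versus Φ/Φ0 from −3 to 3; (b) periodic array circuit (equal loop areas aL, bias current Ib, nodes x and y) with its voltage response versus Φ/Φ0 from −3 to 3]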
Fig. 1. (a) DC voltage response Vxy of a symmetrical SQUID (N = 2 Josephson junctions) in units of Ic R vs. external flux Φ through the area element aL for a bias current of Ib = 1.1 N Ic. (b) Voltage response Vxy of a periodic one-dimensional array vs. external flux Φ through the largest area element aL for bias current Ib = 1.1 N Ic. The array contains N = 11 Josephson junctions. The junctions are indicated by thin black layers (insulators) connecting the upper superconducting layer with the lower superconducting layer
counting rate and the dynamic range of the magnetometer is therefore limited. The devices cannot be used for reliable high precision absolute magnetic field measurements.
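The Φ0-periodicity described in this section can be checked numerically. The sketch below assumes the zero-inductance expressions given in Sect. 4.1 (structure factor, Eq. (1), and voltage response, Eq. (2)), specialized to identical junctions and N − 1 equal loops. The function name and the default bias J_N = 1.1 are illustrative choices.

```python
import cmath
import math


def periodic_array_response(flux_quanta: float, n_junctions: int,
                            j_bias: float = 1.1) -> float:
    """V/(Ic*R) for N identical junctions and N-1 equal loops.

    flux_quanta is the flux through one loop in units of Phi0.
    Zero-inductance limit: V/(Ic*R) = sqrt(J_N**2 - |S_N|**2).
    """
    # Junction n sees the flux enclosed by the n-1 loops before it.
    s = sum(cmath.exp(2j * math.pi * (n - 1) * flux_quanta)
            for n in range(1, n_junctions + 1)) / n_junctions
    # max(..., 0.0) only guards against tiny negative rounding errors.
    return math.sqrt(max(j_bias ** 2 - abs(s) ** 2, 0.0))


if __name__ == "__main__":
    for phi in (0.0, 0.3, 1.0, 1.3):
        v2 = periodic_array_response(phi, 2)    # single SQUID
        v11 = periodic_array_response(phi, 11)  # periodic array of Fig. 1b
        print(f"Phi/Phi0 = {phi:.1f}: SQUID {v2:.4f}, array {v11:.4f}")
```

Shifting the flux by one flux quantum leaves both responses unchanged, which is why such devices only allow relative field measurements.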
3
Unconventional Grating Structures and Superconducting Quantum Interference Filters
A more general quantum interference device is obtained when the area elements an of the N − 1 loops in the array differ in size and, possibly, in orientation, as depicted schematically in Fig. 2. If the sizes |an| of the oriented area elements an of the individual superconducting loops are chosen in such a way that for a finite external magnetic field B a coherent superposition of the array junction currents (cf. Sect. 4.1) is prevented, the voltage response function Vxy vs. |B| becomes nonperiodic. From the analogy to optical interference patterns we call such configurations unconventional grating structures. An example of the effects of an unconventional grating on the voltage response function is shown in Fig. 2. The areas of the different array loops are chosen randomly between 0.1 |aL| and 1.0 |aL|, while the total area of the array is the same as for the periodic array of Fig. 1b. The maximum loop size coincides with the corresponding optimal loop size of a standard two-junction SQUID, i.e., max |an| = |aL|. As a result, the response function of the unconventional array shown in Fig. 2 is comparable to the response functions shown in Figs. 1a and 1b. The distribution of the array loop sizes has two properties that prevent the coherent superposition of the array junction currents for a finite external magnetic field B. Firstly, the loop sizes are incommensurable, i.e., there
[Fig. 2 plot: voltage response (from 0.0 to 1.0) versus Φ/Φ0 (from −3 to 3), with inset circuit showing nodes x and y and the largest loop area aL]
Fig. 2. Voltage response Vxy of a one-dimensional array with unconventional (non-periodic) grating structure (N = 18) in units of Ic R vs. external flux Φ through largest area element aL for bias current Ib = 1.1 N Ic . The loop areas are randomly distributed between 0.1 and 1.0 |aL |. The unconventional array has the same total area as the periodic array in Fig. 1b
exists no greatest common divisor (GCD). Secondly, the size of the smallest loop |amin| differs strongly from the size of the largest loop |amax|, and the sizes of all other loops are distributed between |amin| and |amax| in such a way that no distinct loop size is preferred. The first property of the distribution prevents any (strong) periodicity of the response function. The second property ensures that no significant partially coherent superposition of the array junction currents takes place, i.e., that there exist no finite values of B for which additional significant antipeaks in the voltage response function occur. If these two necessary conditions are fulfilled by the loop size distribution, the voltage response signal Vxy vs. strength of magnetic field of the unconventional junction array becomes, under a suitable dc current bias Ib, a unique function of |B| around its narrow global minimum at |B| = 0. This feature of the voltage response function of unconventional arrays depends only on the distribution of the array loop sizes. It does not depend on fluctuations in the parameters of the individual junctions (cf. Sect. 4.1). In contrast to devices with conventional grating structures, the uniqueness of the voltage response function of unconventional arrays allows such devices to be directly employed as detectors of the absolute strength of an external magnetic field. Because of their filter property, i.e., the unique antipeak at |B| = 0, arrays with unconventional grating structures are called superconducting quantum interference filters (SQIFs). However, one-dimensional parallel arrays are not the only type of SQIFs. There are also one-dimensional serial SQIFs and even two-dimensional SQIFs. In the next section I will give a brief overview of the theoretical description of the basic types of SQIFs.
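The uniqueness of the antipeak at zero field can be illustrated with a small numerical sketch. It assumes identical junctions and the zero-inductance expressions of Sect. 4.1 (Eqs. (1) and (2)), and draws loop areas at random as described for Fig. 2. The random seed, the rescaling step and all names are assumptions made for illustration only.

```python
import cmath
import math
import random


def structure_factor(flux_in_largest_loop: float, rel_areas: list) -> float:
    """|S_N| for identical junctions; rel_areas are loop areas in units of |aL|."""
    total = 1 + 0j      # first junction: no enclosed area (a_0 = 0)
    enclosed = 0.0
    for area in rel_areas:
        enclosed += area  # cumulative area up to the next junction
        total += cmath.exp(2j * math.pi * flux_in_largest_loop * enclosed)
    return abs(total) / (len(rel_areas) + 1)


def voltage_response(flux_in_largest_loop: float, rel_areas: list,
                     j_bias: float = 1.1) -> float:
    """V/(Ic*R) = sqrt(J_N**2 - |S_N|**2), zero-inductance limit."""
    s = structure_factor(flux_in_largest_loop, rel_areas)
    return math.sqrt(max(j_bias ** 2 - s ** 2, 0.0))


# N = 18 junctions, i.e. 17 loops, drawn between 0.1 and 1.0 (hypothetical seed),
# then rescaled so that the largest loop equals |aL|.
rng = random.Random(0)
raw = [rng.uniform(0.1, 1.0) for _ in range(17)]
areas = [a / max(raw) for a in raw]

if __name__ == "__main__":
    for phi in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(f"Phi/Phi0 = {phi:.1f}: V/(Ic R) = {voltage_response(phi, areas):.4f}")
```

On a flux grid between −3Φ0 and +3Φ0 the minimum of this model response is found only at zero flux, in contrast to the periodic case.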
4
Theoretical Description of Superconducting Quantum Interference Filters
There exist two main types of superconducting quantum interference filters: one-dimensional parallel arrays and one-dimensional serial arrays. Their theoretical description is somewhat different since in the superconducting state the coupling between the array loops depends on the circuitry. Generic two-dimensional SQIFs are either a series connection of one-dimensional parallel SQIFs or a parallel connection of one-dimensional serial SQIFs. If the SQIFs are made from high-Tc grain boundary junctions, a geometrically one-dimensional SQIF can be designed to act as a topologically two-dimensional one. I will present examples of 2D SQIFs in the experimental section.
4.1
One-Dimensional Parallel SQIFs
The DC voltage response V of 1D parallel SQIFs (Fig. 3a) is, because of flux quantization, a coherent response of all array Josephson junctions [6]. In the limit of vanishing array inductances the problem of the N coupled
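[Fig. 3 schematic labels: (a) parallel circuit with bias current Ib and loop areas a1, a2, a3, ..., aN−1; (b) serial circuit with bias current Ib and loop areas a1, a2, a3, ..., aM]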
Fig. 3. Schematic diagram of (a) a parallel SQIF circuit and (b) a serial SQIF circuit. The Josephson junctions are indicated by crosses. The different array loops have different areas |am | with (a) m = 1 . . . N − 1 and (b) m = 1 . . . M
junctions can be mapped onto a virtual single junction model. This simplifies the analytical treatment significantly [6]. The shape of the DC voltage response function of parallel SQIFs and its periodicity properties are governed by the characteristic structure factor
$$S_N(\mathbf{B}) = \frac{1}{N}\sum_{n=1}^{N}\frac{I_{c,n}}{I_c}\exp\left(\frac{2\pi i}{\Phi_0}\sum_{m=0}^{n-1}\langle\mathbf{B},\mathbf{a}_m\rangle\right), \tag{1}$$
which we have introduced in [6]. Here $I_c = (1/N)\sum_{n=1}^{N} I_{c,n}$ and $1/R = (1/N)\sum_{n=1}^{N} 1/R_n$, where Rn is the normal resistance and Ic,n the critical current of the nth array junction, and |a0| = 0. Assuming a magnetic field B that is time independent or varies slowly with respect to the Josephson frequency, the DC part of the voltage response of the parallel SQIF is then given by
$$V(\mathbf{B}) = I_c R\,\sqrt{J_N^2 - |S_N(\mathbf{B})|^2}, \tag{2}$$
where JN = Ib/(N Ic) and Ib denotes the total applied DC bias current. Equations (1) and (2) are strictly valid only in the limit of vanishing array inductances. However, as we have shown in [6], the periodicity properties and the qualitative shape of the voltage response are still governed by the structure factor if all array inductances are taken into account. The swing of the DC voltage response and the transfer factor VB = max(∂V/∂B) decrease with increasing array inductances. This degradation, however, is in general smaller for SQIFs than for conventional SQUIDs since some array inductances may eventually cancel out, depending on the actual array geometry. If the array inductances are not too large, it follows from Eq. (2) that the transfer factor VB scales with the number N − 1 of array loops and their total area $A_{\mathrm{tot,SQIF}} = \sum_{m=1}^{N-1}|\mathbf{a}_m|$ [6]. Assuming uncorrelated Nyquist noise sources located at the individual array junctions, the white voltage noise
spectral density SV(f) across the parallel SQIF scales with the number N of array junctions such that
$$S_V(f) = \left[1 + \frac{1}{2}\left(\frac{N I_c}{I_b}\right)^2\right] N\,\frac{4 k_B T R_D^2}{R}, \tag{3}$$
and the SQIF output noise is given by $\sqrt{S_V(f)}$ [3]. Here RD = (1/Ic) ∂V/∂JN is the dynamical resistance at the bias point JN, and the frequency f must be much smaller than the Josephson frequency. For typical array parameters and bias points JN ≈ 1.1, RD ≈ 1.2 R. With N = 2, i.e., a conventional SQUID configuration, Eq. (3) gives the well known expression SV(f) ≈ 16 kB T R [7]. Since the transfer factor VB scales with the total area Atot and the white voltage noise spectral density SV(f) scales with the number of junctions N, the magnetic field sensitivity should increase with increasing size of the SQIF. This increase is not directly proportional to √N as in the case of periodic arrays. However, for appropriately chosen loop size distributions a significantly increasing magnetic field sensitivity should be achievable. As will be shown in the experimental section, there is already some evidence that this effect really takes place. Of particular interest in this context are SQIFs made from high-Tc Josephson junctions, since the reduction of noise at 77 K is a crucial point for applications.
4.2
One-Dimensional Serial SQIFs
The serial SQIF is basically a series array of two-junction loops, but the loops are not identical [8]. The circuitry is schematically depicted in Fig. 3b. Here, too, the unconventional grating structure is realized by M array loops with oriented area elements am (m = 1 . . . M) of different area size |am|. The number of array junctions is now N = 2M. The DC voltage response V(B) of a serial SQIF is a superposition of the DC voltage responses Vm(B) of the individual two-junction loops [8],
$$V_m(\mathbf{B}) = I_{c,m} R_m\,\sqrt{J_{2,m}^2 - |S_{2,m}(\mathbf{B})|^2}, \tag{4}$$
where S2,m(B) is given by Eq. (1), and m refers to the mth array loop, and
$$V(\mathbf{B}) = \sum_{m=1}^{M} V_m(\mathbf{B}). \tag{5}$$
Here Ic,m and Rm are the mean values of critical current and resistance in the mth loop, respectively, and J2,m = Ib/(2 Ic,m). The different two-junction loops work with different periodicity with respect to an externally applied magnetic field. For appropriately chosen loop area distributions, therefore, the DC voltage response V(B) becomes, as in the case of parallel SQIFs, a unique function of B around B = 0.
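Equations (4) and (5) can be sketched for the simplest case of identical, symmetric junctions, for which the two-junction structure factor of Eq. (1) reduces to |S2,m| = |cos(πΦm/Φ0)|. The loop areas below are hypothetical example values in units of the largest loop, and the bias J2,m = 1.1 is an assumption.

```python
import math


def loop_response(flux_quanta: float, j_bias: float = 1.1) -> float:
    """V_m/(Ic*R) of one symmetric two-junction loop, Eq. (4)."""
    s2 = abs(math.cos(math.pi * flux_quanta))  # |S_2| for identical junctions
    return math.sqrt(max(j_bias ** 2 - s2 ** 2, 0.0))


def serial_response(flux_in_largest_loop: float, rel_areas: list,
                    j_bias: float = 1.1) -> float:
    """Sum of the loop responses, Eq. (5), in units of Ic*R."""
    return sum(loop_response(flux_in_largest_loop * a, j_bias)
               for a in rel_areas)


# Hypothetical loop areas relative to the largest loop.
REL_AREAS = [1.0, 0.73, 0.41, 0.29]

if __name__ == "__main__":
    for phi in (0.0, 0.37, 1.0, 2.0):
        print(f"Phi/Phi0 = {phi:.2f}: V/(Ic R) = {serial_response(phi, REL_AREAS):.4f}")
```

Each loop alone is periodic with its own period, but their sum reaches its minimum value only where all loops are simultaneously at a minimum, which for incommensurate areas happens at zero field.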
It follows from Eq. (5) that the transfer factor VB of serial SQIFs also scales with the number M of array loops and their total area Atot,SQIF. Assuming that the junction parameters of all array junctions are identical, the serial SQIF output noise is also given by Eq. (3).
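As a consistency check on the noise expression, the sketch below evaluates Eq. (3) under the assumption that it reads S_V(f) = [1 + (1/2)(N Ic/Ib)²] N · 4 kB T RD²/R, with N Ic/Ib = 1/JN. This reading is a reconstruction and should be treated as an assumption. The function name is hypothetical, and the result is expressed in units of kB T R.

```python
def white_noise_coefficient(n_junctions: int, j_bias: float,
                            rd_over_r: float) -> float:
    """S_V(f) / (k_B * T * R) for the assumed form of Eq. (3).

    j_bias is J_N = Ib / (N * Ic); rd_over_r is the ratio R_D / R.
    """
    bracket = 1.0 + 0.5 * (1.0 / j_bias) ** 2   # N*Ic/Ib = 1/J_N
    return bracket * n_junctions * 4.0 * rd_over_r ** 2


if __name__ == "__main__":
    # Typical values quoted in the text: J_N ~ 1.1 and R_D ~ 1.2 R.
    squid = white_noise_coefficient(2, 1.1, 1.2)
    print(f"N = 2: S_V = {squid:.1f} k_B T R")
```

With the quoted typical values, the N = 2 case comes out close to the 16 kB T R stated in the text for a conventional SQUID, which supports this reading of the equation.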
5
Experiments
Since the first theoretical prediction of the superconducting quantum interference filter effect in 2000, a rapidly growing number of samples with many different circuit designs has been fabricated and investigated. Our group at the University of Tübingen started in late 2000 with the design of two prototype SQIFs, one parallel and one serial. Each of them had about 30 loops and a rather low total effective area. They were designed in order to prove the principle. Nowadays we are designing SQIFs which contain many more than 1000 loops and which reach magnetic field sensitivities that are comparable with the sensitivities of conventional SQUID sensors. In all cases, standard low-Tc niobium technology has been used to fabricate the low-Tc SQIFs. In 2001 the Quantum Electronics Department at the Institute for Physical High Technology in Jena started to design and fabricate the first high-Tc SQIFs based on grain boundary Josephson junctions. The development was motivated by the theoretical prediction that, owing to the in some sense statistical nature of the quantum interference filter effect, the performance of SQIFs should be degraded neither by deviations in the loop sizes nor by spreads in the junction parameters. In particular, large spreads of the critical currents of the Josephson junctions seem to be an inherent problem in bicrystal grain boundary junction technology. As will be seen from the experimental results presented in Sect. 5.2, the SQIF overcomes all these problems. In addition, there is evidence that the theoretically predicted reduction in the magnetic field noise for the high-Tc samples really takes place. If this can be confirmed in further experiments, a new dimension of magnetic field measurements with HTS devices would open up. Let me start the presentation of experimental results with some selected low-Tc SQIFs.
5.1
Low-Tc SQIFs
An optical micrograph of a segment of the first experimental parallel low-Tc SQIF is shown in Fig. 4. The different sizes of the area elements am and the externally shunted array junctions are clearly visible. Two control current feeding lines are located near the SQIF. The whole chip of size 5 mm × 5 mm contains one parallel SQIF with N = 30 junctions (N − 1 = 29 loops) and one reference single-loop SQUID. The SQUID is used to calibrate the SQIF. The reference SQUID has the loop size of the largest SQIF loop, ASQUID = |amax|. The detailed experimental parameters can be found in [9]. Fig. 5a shows a typical experimentally measured DC voltage response function of the first parallel SQIF (upper curve) together with a part of the
[Fig. 4 image labels: weak link, ext. shunt, area element, feeding, ext. shunted junction; scale bar 20 µm]
Fig. 4. Optical micrograph of a part of the experimental parallel SQIF. The SQIF was fabricated using standard low-Tc niobium technology. It contains 30 externally shunted Josephson junctions. The different areas of the loops are clearly visible
[Fig. 5 panels: (a) V [µV] (from 10 to 90) versus B [µT] (from −150 to 150); (b) V [µV] (from 0 to 1750) versus B [µT] (from −60 to 60)]
Fig. 5. (a) Typical experimentally measured DC voltage response of the first fabricated parallel low-Tc SQIF vs. magnetic field B (upper curve). The lower curve shows a part of the DC voltage response of a conventional two-junction SQUID which was used as reference. SQIF and reference SQUID are on the same chip. (b) Typical experimentally measured DC voltage response of the first serial low-Tc SQIF vs. magnetic control field B. The maximum peak to peak voltage swing is ∆Vmax = 1350 µV. All measurements were done at 4.2 K
DC voltage response of the reference SQUID (lower curve). The SQIF shows a sharp antipeak around vanishing magnetic field B = 0. The maximum peak to peak voltage swing reaches ∆Vmax = 67µV with a transfer factor of VB,SQIF = 200 V/T. The reference SQUID possesses a maximum peak to peak voltage swing of ∆Vmax = 21µV and a transfer factor of VB,SQUID = 13V/T. The maximum voltage swing of the SQIF is more than three times larger than that of the reference SQUID. This increase of the maximum voltage swing is implied by the fact that for an appropriate chosen geometrical arrangement the effects of inductances are significantly reduced. There is a partially destructive superposition of the loop screening currents inside the SQIF array [6]. Figure 5b shows a typical experimentally measured DC voltage response of the first serial SQIF [10]. The maximum peak to peak voltage is ∆Vmax = 1.35 mV. The transfer factor VB reaches a value of VB,SQIF = 720 V/T. For both SQIF circuits the experimentally determined performance parameters were in good agreement with the theoretical predictions. However, since our goal was to prove the quantum interference filter effect the first
V [mV]
740
J¨ org Oppenl¨ ander
4.0 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 −400 −300 −200 −100
0 100 B[µT]
200
300
400
Fig. 6. Experimentally measured DC voltage response of a two-dimensional low-Tc SQIF vs. magnetic field B at 4.2 K. The SQIF contains 1450 loops and has a maximum peak to peak voltage swing ∆Vmax = 1.2 mV
time, at this stage of the development none of the design parameters was chosen for optimum sensitivity. A much higher magnetic field sensitivity can be achieved by increasing the number of SQIF loops and therefore the total area of the sensor. Fig. 6 shows the experimental result for a 2D SQIF with 25 × 58 = 1450 loops and a total effective loop area of ≈ 0.14 mm2 (chip size 5 mm × 5mm). The magnetic field was sweeped up to relativly high levels (±400 µT). For such large magnetic fields the field penetrates the junction areas themselves and the typical Fraunhofer-like pattern of a single Josephson junction occurs [3]. Each array row behaves like a “long” Josephson junction and the voltage responses of all these “long” junctions are summed up in the voltage response of the array. The true SQIF antipeak is located precisely at the minimum of the Fraunhofer-dip. This SQIF “needle” points accurately to the point of vanishing magnetic field, B = 0. Although the size of the 2D array is relatively small in comparison with conventional SQUIDs with√pickup loop, with such a 2D SQIF a magnetic field sensitivity of ≈ 25 fT/ Hz should be achievable. Since the performance parameters scale as theoretically expected (cf. Sect. 4.1) with larger arrays the sensitivity can be increased [11]. Another example of a highly sensitive low-Tc SQIF is shown in Fig. 7. Here a series SQIF that contains 180 loops is geometrically arranged in a socalled meander line. The voltage swing of ∆Vmax = 9.6 mV for such a series configuration is rather high. Also the dynamical range is rather large and the circuit can be directly connected to a room temperature preamplifier. The scaling behaviour of the meander series SQIF is in good agreement with the theoretical predictions. Therefore, as in the case of the 2D SQIF, increasing the number of array loops will increase the sensitivity of the device. 5.2
High-Tc SQIFs
Fig. 7. Experimentally measured DC voltage response V [mV] of a meander serial low-Tc SQIF vs. magnetic field B [µT] at 4.2 K. The SQIF contains 180 loops and has a maximum peak-to-peak voltage swing of ∆Vmax = 9.6 mV

The development of high-Tc SQIFs is motivated by the theoretical prediction of high fault tolerance of this new concept, by advantages for operation in a noisy environment, and, last but not least, by a possible increase in magnetic field sensitivity. The large voltage swing that can be achieved with SQIFs, i.e., the large dynamical range, impedes unlocking in the flux-locked loop regime. Even if unlocking takes place the measurement information will not get lost, as is the case for conventional SQUID systems. At present, the development of highly sensitive high-Tc SQIF magnetometers is at the stage of optimizing the SQIF sensor itself for this application. An operating SQIF magnetometer system does not yet exist. However, as will be seen from the following experimental results, the first operating system can be expected in the near future.

In order to develop a suitable SQIF sensor for magnetometry, Schultze et al. investigated several different circuit topologies [12]. The goal was to find a SQIF that can be directly integrated into the existing read-out electronics and into the existing pick-up loop design that was developed earlier for SQUID systems. Therefore, topologically 2D SQIFs are preferred because they can be designed to match the impedance of the read-out electronics. The challenge here is to find a circuit that realizes a high performance 2D SQIF along a geometrically one-dimensional grain boundary. A typical design of a serial high-Tc SQIF made from grain boundary Josephson junctions is depicted in Fig. 8. The SQIF loops are electrically connected in pairs above and below the grain boundary (indicated by a dashed line). Grain boundary Josephson junctions are formed where the YBCO lead crosses the grain boundary. The different loop sizes are clearly visible. Typically, a control line is located near the SQIF in order to generate a magnetic control field. As a first step in the development of a high-Tc SQIF sensor, one serial, one parallel and one parallel-serial SQIF have been designed and fabricated [12]. The parallel-serial SQIF consists of several series SQIFs that are connected in parallel by the circuitry.
The experimental results show at 77 K for all SQIF types a significant improvement of the transfer factor and the noise-limited magnetic field resolution compared to a single SQUID [12]. The parallel-serial SQIF has the best noise figure with a magnetic field resolution of 1.3 pT/√Hz. If this SQIF is provided with a suitable pick-up coil in a flip-chip configuration, one can expect a magnetic field resolution of ≈ 7 fT/√Hz. This would be a very good value for an HTS device operating at 77 K. Typically, such a field resolution can only be achieved with low-Tc devices operating at 4.2 K. The full potential of the SQIF concerning a high magnetic field-to-voltage transfer factor, however, has not yet been reached. This will need further improvements of the geometrical layouts.

Fig. 8. Optical micrograph of a segment of a serial high-Tc SQIF realized in grain boundary Josephson junction technology. The grain boundary is indicated by a dashed line. Josephson junctions are formed where the YBCO lead crosses the grain boundary. A control line is located near the SQIF. (Labels in the micrograph: Josephson junctions, grain boundary, control line, SQIF loops; scale bar 10 µm.)

The first experimental results for a promising new high-Tc SQIF layout are shown in Fig. 9. This series SQIF with 112 loops has a voltage swing of ∆V = 3.5 mV and a maximum transfer factor of 6000 V/T. It should reach a magnetic field sensitivity of less than 0.5 pT/√Hz.

Fig. 9. Experimentally measured DC voltage response V [mV] of a serial high-Tc SQIF vs. magnetic field B [µT]. The SQIF contains 112 loops and has a maximum peak-to-peak voltage swing of ∆Vmax = 3.5 mV at 65 K

A typical voltage output noise spectrum of a series SQIF is depicted in Fig. 10. This spectrum is remarkable since it has been measured directly with a room temperature preamplifier. No flux-locked-loop electronics has been used and no bias reversal has been applied. Even so, the onset of 1/f noise is located at a rather low frequency of ≈ 10 Hz. Typically, for high-Tc devices such a low value can only be achieved by using a bias reversal technique [12]. The white voltage noise level in Fig. 10 of SV(f) ≈ 2 nV/√Hz is in very good agreement with the theoretical prediction (cf. Sect. 4.1). With a bare SQIF which possesses a transfer factor of typically 2000–8000 nV/nT, a magnetic field sensitivity of 0.25–1 pT/√Hz can be achieved. If the SQIF is supplied with a suitable pick-up coil, a magnetic field sensitivity of some fT/√Hz should be reachable. It should be stressed, however, that all these values have been measured at 75 K.

Fig. 10. Typical experimentally measured voltage output noise spectrum SV(f) [V/Hz^1/2] vs. frequency f [Hz] of a serial high-Tc SQIF. The SQIF is operated at 75 K inside a microcooler. The singular peaks at ≈ 55 Hz and at higher harmonics are caused by the compressor

The series SQIF is operated inside a microcooler. A typical experimental setup is shown in Fig. 11. The SQIF is mounted on top of the cold finger inside a vacuum dewar. The cooler can be operated down to a temperature of ≈ 50 K. The typical dimensions of the system, including the microcooler, are 15 cm × 8 cm × 15 cm. The size of the system can be substantially reduced because the cooling power that is needed to cool the circuit is much less than the cooling power of the commercial microcooler that is used here.

Fig. 11. Experimental setup for SQIFs operated in a microcooler. The SQIF is mounted on top of the cold finger inside a vacuum dewar. The typical dimensions of the system are 15 cm × 8 cm × 15 cm. The SQIF chip has a size of 1 cm × 1 cm. (Labels in the photograph: SQIF chip mounted on cold finger, vacuum dewar, compressor.)

The noise spectrum of Fig. 10 contains a number of singular peaks around 50 Hz and some higher harmonics. These peaks are caused by the compressor, which has a working frequency of 55 Hz. In addition there is the usual 50 Hz peak. The performance of the SQIF is not affected by these disturbances. The dynamical range of the SQIF is more than sufficient to handle typical parasitic oscillations. Even if the SQIF is operated without any shielding the dip itself is not degraded. Only the structures on the upper part of the voltage response change.
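The field sensitivities quoted in this section follow from dividing the white voltage noise by the field-to-voltage transfer factor. The short Python sketch below reproduces that arithmetic for the numbers given in the text. The function name `field_resolution` is introduced here for illustration only, and applying the 2 nV/√Hz noise level to the 6000 V/T device of Fig. 9 is an assumption of this sketch, not a measured result.

```python
def field_resolution(voltage_noise, transfer_factor):
    """Magnetic field noise in T/sqrt(Hz).

    voltage_noise   : white voltage noise in V/sqrt(Hz)
    transfer_factor : field-to-voltage transfer factor in V/T
    """
    return voltage_noise / transfer_factor


S_V = 2e-9  # V/sqrt(Hz), white noise level quoted for Fig. 10

# 1 nV/nT equals 1 V/T, so 2000-8000 nV/nT is 2000-8000 V/T.
for V_B in (2000.0, 6000.0, 8000.0):
    S_B = field_resolution(S_V, V_B)
    print(f"V_B = {V_B:6.0f} V/T  ->  S_B = {S_B * 1e12:.2f} pT/sqrt(Hz)")
```

For 2000 V/T and 8000 V/T this gives the two ends of the 0.25–1 pT/√Hz range quoted above; under the stated assumption the 6000 V/T layout lands below 0.5 pT/√Hz.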
6
Conclusions
The development of high performance superconducting quantum interference filters is just at the beginning. Even so, the experimental results that have been achieved up to now are very promising. We were able to verify experimentally almost all theoretical predictions, including fault tolerance and noise reduction. The first operating highly sensitive high-Tc SQIF magnetometer can be expected in the near future. Magnetometry, however, is not the only application of SQIFs. There are many others, including high performance low noise amplifiers. Here, too, the advantages of SQIFs are obvious. In contrast to SQUID amplifiers, a SQIF amplifier is nonhysteretic owing to the uniqueness of the response function. In addition, SQIF amplifiers can be realized with high-Tc superconductors in standard grain boundary technology and can be operated at 77 K.

Acknowledgements

It is a pleasure to thank N. Schopohl, Ch. Häußler, T. Träuble, A. Friesch, J. Tomes, P. Caputo, V. Schultze, R. IJsselsteijn and H.-G. Meyer for various contributions to this work. I would also like to thank R. Kleiner, R. P. Huebener and D. Koelle for their support. Part of this work has been supported by the "Forschungsschwerpunktprogramm des Landes Baden-Württemberg".
References
1. M. Tinkham, Introduction to Superconductivity, 2nd ed. (McGraw-Hill, 1996).
2. J. Clarke, in: Superconducting Devices, edited by S.T. Ruggiero and D.A. Rudman (Academic Press, 1990).
3. K.K. Likharev, Dynamics of Josephson Junctions and Circuits, 2nd printing (Gordon and Breach Science Publishers, 1991).
4. A.Th.A.M. de Waele, W.H. Kraan, and R. de Bruyn Ouboter, Physica 40, 302 (1968).
5. D. Drung, Physica C 368, 134 (2002).
6. J. Oppenländer, Ch. Häussler, and N. Schopohl, Phys. Rev. B 63, 024511 (2001).
7. D. Koelle, R. Kleiner, F. Ludwig, E. Dantsker, and J. Clarke, Rev. Mod. Phys. 71, 631 (1999).
8. Ch. Häussler, J. Oppenländer, and N. Schopohl, J. Appl. Phys. 89, 1875 (2001).
9. J. Oppenländer, T. Träuble, Ch. Häussler, and N. Schopohl, IEEE Trans. Appl. Supercond. 11, 1271 (2001).
10. J. Oppenländer, Ch. Häussler, T. Träuble, and N. Schopohl, Physica C 368, 119 (2002).
11. J. Oppenländer, Ch. Häussler, T. Träuble, P. Caputo, J. Tomes, A. Friesch, and N. Schopohl, IEEE Trans. Appl. Supercond. (in press).
12. V. Schultze, R. IJsselsteijn, H.-G. Meyer, J. Oppenländer, Ch. Häussler, and N. Schopohl, IEEE Trans. Appl. Supercond. (in press).
Decoherence Due to Discrete Noise in Josephson Qubits
E. Paladino¹, L. Faoro², and G. Falci¹
¹ NEST-INFM & Dipartimento di Metodologie Fisiche e Chimiche (DMFCI), Università di Catania, viale A. Doria 6, 95125 Catania, Italy & INFM UdR Catania
² Institute for Scientific Interchange (ISI) & INFM, Viale Settimo Severo 65, 10133 Torino, Italy
Abstract. We study decoherence produced by a discrete environment on a charge Josephson qubit by introducing a model of an environment of bistable fluctuators. In particular we address the effect of 1/f noise, where memory effects play an important role. We perform a detailed investigation of various computation procedures (single shot measurements, repeated measurements) and discuss the problem of the information needed to characterize the effect of the environment. Although in general information beyond the power spectrum is needed, in many situations this amounts to the knowledge of only one more microscopic parameter of the environment. This allows one to determine which degrees of freedom of the environment are effective sources of decoherence in each of the physical situations considered.
1
Introduction
A high degree of quantum coherence is crucial for operating quantum logic devices [1]. Solid state nanodevices seem particularly promising because of integrability and flexibility in the design, and several possible implementations have been proposed [2,3,4,5,6]. A few recent experiments succeeded in detecting coherent dynamics in superconducting devices [7,8,9,10], but revealed limitations in the performance due to decoherence. In a quantum logic device the interesting degrees of freedom are related to a given set of observables which we can prepare (write) and measure (read), and which define the system. Their eigenstates $|\{q_i\}\rangle$ form the computational basis. The dynamics of a state $|\psi,t\rangle = \sum_{q_1 \ldots q_N} c_{q_1 \ldots q_N}(t)\,|q_1,\ldots,q_N\rangle$ may be controlled if the Hamiltonian of the system is tunable. Loss of coherence is due to the fact that the Hilbert space of the device is much larger than the computational space [11,12]. The additional degrees of freedom define the environment, which cannot be controlled and about which little information is available. Decoherence is ultimately due to the entanglement between system and environment [11], but in order to pursue this point of view one should be able to study the full dynamics, in particular system-environment correlations. A less fundamental point of view, which we adopt here, is to associate decoherence with the loss of fidelity of a quantum gate. The environment blurs the output signal because it produces uncertainties in the phase relation between the amplitudes $c_{\{q_i\}}(t)$.

The environment degrees of freedom may be "internal" to the device, which is a many-body object, or "external", belonging to the circuitry and to auxiliary devices. Internal decoherence is a serious problem in solid state nanodevices, due to the presence of many low energy excitations. Investigation of the reduced qubit dynamics requires information about the environment. In several cases (weakly coupled environment [13], harmonic oscillator environment [14,15]) the information is contained in the power spectrum of the environment operators coupled to the qubit¹. In this work we study a more general situation, an environment of quantum bistable fluctuators which may have memory on the time scale of the qubit dynamics, and may display the effect of non-gaussian correlations. A physical example is provided by charged impurities in substrates and oxides. These background charges (BCs) produce, for instance, the 1/f noise [17] observed in metallic tunnel junctions [18,19]. We present here analytic results which elucidate several aspects of the BC environment. In particular we show that decoherence depends on details of the dynamics of the environment beyond the power spectrum [20]. As a consequence it may differ for different gates, but in many cases the additional information required reduces to one microscopic parameter [21].

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 747–762, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Model for the System and the Environment
Superconducting qubits [3,4,5,6] are the only solid state implementations where coherence in a single qubit has been observed in the time domain [7,8,9] and where two-qubit systems are being investigated [10]. Thus problems of decoherence can be posed in a realistic perspective. In all the above implementations it has been recognized that 1/f noise is a major source of decoherence. We will focus on charge-Josephson qubits [5], where 1/f noise is produced by BCs trapped close to the device (Fig. 1). We describe these BCs by introducing an impurity model. Many of our results and methods can be applied to different noise sources (e.g. flux noise in flux-Josephson qubits [22,23]) and to other solid-state implementations (e.g. gates based on the Coulomb interaction in semiconductor-spin qubits), but we will not discuss these topics here.
2.1
The Superconducting Box
The charge-Josephson qubit [5] is a superconducting island connected to a circuit via a Josephson junction and a capacitance $C_2$ (Fig. 1). The computational states are associated with the charge $Q$ in the island. They are mixed by the Josephson tunneling. Level splittings can be tuned by the external voltage $V_x$, and under suitable conditions ($E_C \gg E_J$ and low temperatures $k_B T \ll E_J$) only two charge states are important, which define the z component of a pseudospin. The system implements a qubit with Hamiltonian

$$ H_Q = \frac{\varepsilon}{2}\,\sigma_z - \frac{E_J}{2}\,\sigma_x \;; \qquad \varepsilon(V_x) = 4E_C\,(1 - C_2 V_x/e) . \tag{1} $$

Fig. 1. A charge Josephson qubit in the presence of BCs located in the substrate or in the oxide close to the junction. Relevant scales are the charging energy $E_C = e^2/2(C_1 + C_2)$ and the Josephson energy $E_J$

¹ In this spirit the Caldeira-Leggett model has been proposed to describe quantum phenomena in superconducting nanocircuits [16].
2.2
System Plus Environment Hamiltonian
We describe each BC as a localized impurity level connected to a fermionic band [20,24,25]. For a single impurity the total Hamiltonian is

$$ H = H_Q - \frac{v}{2}\, b^\dagger b\,\sigma_z + H_I \;; \qquad H_I = \varepsilon_c\, b^\dagger b + \sum_k \left[ T_k\, c_k^\dagger b + \mathrm{h.c.} \right] + \sum_k \varepsilon_k\, c_k^\dagger c_k . \tag{2} $$

Here $H_I$ describes the BC alone: $b$ ($b^\dagger$) destroys (creates) an electron in the localized level $\varepsilon_c$, and the electron may tunnel with amplitude $T_k$ to a band, described by the operators $c_k$, $c_k^\dagger$ and the energies $\varepsilon_k$. An important scale is the switching rate $\gamma = 2\pi N(\varepsilon_c)|T|^2$ ($N$ is the density of states of the band, and $|T_k|^2 \approx |T|^2$), which characterizes the relaxation regime of the BC. The BC determines a bistable extra bias $v$ for the qubit, via the coupling term. For a set of BCs we generalize Eq. (2) as follows:

$$ H = H_Q - \frac{1}{2}\,\hat{E}\,\sigma_z + \sum_i H_I^i , \tag{3} $$

where the extra bias operator is $\hat{E}(t) = \sum_i v_i\, b_i^\dagger b_i$. For simplicity we have assumed that each localized level is connected to a distinct band.
2.3
Model for 1/f Noise
The environment in Eq. (3) above is specified by the distribution of the switching rates $\gamma_i$, which we choose in order to reproduce the 1/f noise. The standard way [17] is to assume a distribution $P(\gamma) \propto 1/\gamma$ for $\gamma \in [\gamma_m, \gamma_M]$. Indeed, if we look at the relaxation regime of the BC, the total extra polarization is a classical stochastic process $E(t)$ with power spectrum

$$ S(\omega) = \sum_i S_i(\omega) \;; \qquad S_i(\omega) = \frac{1}{2}\, v_i^2\, (1 - \delta p^2)\, \frac{\gamma_i}{\gamma_i^2 + \omega^2} , \tag{4} $$

$\delta p$ being the thermal average of the difference in the populations of the two states of the BC. The distribution $P(\gamma)$ then leads to 1/f noise, $S(\omega) = \{\pi (1 - \delta p^2)\, n_d\, v^2 / (4 \ln 10)\}\, \omega^{-1}$, for frequencies $\omega \in [\gamma_m, \gamma_M]$ ($n_d$ is the number of fluctuators per noise decade). With the above choice of parameters, the Hamiltonian Eq. (3) has been used to study decoherence due to 1/f noise [20].
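As a numerical illustration of how Lorentzian spectra of the form of Eq. (4), with switching rates distributed as $P(\gamma) \propto 1/\gamma$, add up to a 1/f spectrum, the following Python sketch sums the single-fluctuator spectra on a log-uniform grid of rates and compares the result with the amplitude $\pi(1-\delta p^2)\, n_d v^2/(4 \ln 10\, \omega)$ stated above. All parameter values, the identical coupling v for every fluctuator, and the function names are assumptions made for this illustration; rates and frequencies are in the same arbitrary units.

```python
import numpy as np


def lorentzian_sum(omega, gammas, v, dp):
    """Sum of single-fluctuator spectra S_i(omega) as in Eq. (4),
    with the same coupling v for every fluctuator (an assumption)."""
    omega = np.atleast_1d(omega)[:, None]          # shape (n_omega, 1)
    s_i = 0.5 * v**2 * (1.0 - dp**2) * gammas / (gammas**2 + omega**2)
    return s_i.sum(axis=1)


def one_over_f(omega, n_d, v, dp):
    """Analytic 1/f form quoted in the text for n_d fluctuators per decade."""
    return np.pi * (1.0 - dp**2) * n_d * v**2 / (4.0 * np.log(10.0) * omega)


n_d = 200        # fluctuators per decade (illustrative value)
decades = 8      # rates span [1, 1e8]
# A grid that is uniform in log(gamma) realizes P(gamma) ~ 1/gamma.
gammas = np.logspace(0, decades, n_d * decades, endpoint=False)

v, dp = 1.0, 0.0
omega = np.array([1e3, 1e4, 1e5])   # well inside [gamma_m, gamma_M]
ratio = lorentzian_sum(omega, gammas, v, dp) / one_over_f(omega, n_d, v, dp)
print(ratio)   # close to 1 away from the band edges
```

The agreement degrades as ω approaches the edges of the rate interval, where the finite range of γ cuts off the 1/f behaviour.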
3
Reduced Dynamics
Our aim is to investigate the effect of the BC environment on the dynamics of the qubit. The standard road-map is to calculate the reduced density matrix [14] of the qubit, $\rho^Q(t) = \mathrm{Tr}_E\{W(t)\}$, $W(t)$ being the full density matrix. In the standard weak coupling approach a master equation for $\rho^Q(t)$ is written [13], the environment entering its dynamics only via the power spectrum, Eq. (4). A master equation for $\rho^Q(t)$ can also be obtained by modeling the environment by a set of harmonic oscillators. By using diagrammatic techniques [14,26,27], results of the standard weak coupling approach are obtained at lowest order in the couplings $v$, but it has been pointed out that higher orders are important for a 1/f oscillator environment [20,27,28]. The failure of the standard weak coupling approach is due to the fact that the 1/f environment includes fluctuators which are very slow on the time scale of the reduced dynamics, the approach being reliable only for BCs with $v_i \ll \gamma_i$ [13,20]. Rather than study higher orders in the perturbation series [27], we prefer to use a different strategy. We study the models Eqs. (2,3) by enlarging the system and considering only the bands as the environment. The modified road-map consists in calculating, in some approximation, the reduced density matrix (RDM) $\rho(t) = \mathrm{Tr}_{\mathrm{bands}}\{W(t)\}$ and then extracting $\rho^Q(t)$ exactly. This allows us to obtain results to all orders in the coupling $v$ and to investigate details of various quantum gates. The disadvantage is that we have a larger system to deal with. We have investigated this problem with different techniques [20]; here we present a master equation approach, which covers many of the results we obtained before.
3.1
Master Equation for a Single BC
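We split Eq. (2) into the system, $H_0 = H_Q - \frac{v}{2}\, b^\dagger b\,\sigma_z + \varepsilon_c\, b^\dagger b$, and the environment, $H_E = \sum_k \varepsilon_k\, c_k^\dagger c_k$, coupled by $V = \sum_k [T_k\, c_k^\dagger b + \mathrm{h.c.}]$. The eigenstates (and eigenvalues) of $H_0$ are product states of the form $|\mathrm{qubit}\rangle |\mathrm{BC}\rangle$, namely $|a\rangle = |\theta_+\rangle |0\rangle$ $(-\Omega/2)$, $|b\rangle = |\theta_-\rangle |0\rangle$ $(\Omega/2)$, $|c\rangle = |\theta'_+\rangle |1\rangle$ $(-\Omega'/2 + \varepsilon_c)$, $|d\rangle = |\theta'_-\rangle |1\rangle$ $(\Omega'/2 + \varepsilon_c)$. Here $|\theta_\pm\rangle$ are the two eigenstates of $\sigma_{\hat{n}}$, the direction being specified by the polar angles $\theta$ and $\phi = 0$. Each set of qubit states corresponds to a value of $b^\dagger b = 0, 1$: the two level splittings are $\Omega = \sqrt{\varepsilon^2 + E_J^2}$ and $\Omega' = \sqrt{(\varepsilon + v)^2 + E_J^2}$, and finally $\cos\theta = \varepsilon/\Omega$, $\sin\theta = E_J/\Omega$, $\cos\theta' = (\varepsilon + v)/\Omega'$, $\sin\theta' = E_J/\Omega'$.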
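The level splittings and mixing angles defined above are easy to evaluate numerically. The Python sketch below builds the qubit Hamiltonian of Eq. (1) as a 2 × 2 matrix, checks that its eigenvalues are $\pm\Omega/2$, and computes $\Omega'$, $\theta'$ with the shifted bias $\varepsilon + v$ following the convention of the text, together with the overlap factor $c = \cos[(\theta - \theta')/2]$ that enters the adiabatic results. The numerical values of ε, E_J and v and all function names are illustrative assumptions, not parameters from the experiments.

```python
import numpy as np

SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])


def qubit_hamiltonian(eps, e_j):
    """H_Q = (eps/2) sigma_z - (E_J/2) sigma_x, as in Eq. (1)."""
    return 0.5 * eps * SIGMA_Z - 0.5 * e_j * SIGMA_X


def splitting_and_angle(eps, e_j):
    """Level splitting Omega and mixing angle theta with
    cos(theta) = eps/Omega and sin(theta) = E_J/Omega."""
    omega = np.hypot(eps, e_j)
    theta = np.arctan2(e_j, eps)
    return omega, theta


eps, e_j, v = 1.0, 0.5, 0.1      # illustrative values, same energy units

omega, theta = splitting_and_angle(eps, e_j)          # BC empty
omega_p, theta_p = splitting_and_angle(eps + v, e_j)  # BC occupied (text's convention)

levels = np.linalg.eigvalsh(qubit_hamiltonian(eps, e_j))  # ascending order
c = np.cos(0.5 * (theta - theta_p))

print("eigenvalues of H_Q:", levels, " expected:", (-omega / 2, omega / 2))
print("Omega' - Omega =", omega_p - omega, "  c =", c)
```

For $v \ll \Omega$ the two angles nearly coincide and c stays close to 1, which is the regime of interest in what follows.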
In the basis of the eigenstates of $H_0$ the master equation for the RDM in the Schrödinger representation reads

$$ \frac{d\rho_{ij}(t)}{dt} = -i\,\omega_{ij}\,\rho_{ij}(t) + \sum_{mn} R_{ij\,mn}\,\rho_{mn}(t) , \tag{5} $$

where $\omega_{ij}$ is the difference of the eigenenergies and $R_{ij\,mn}$ are the elements of the Redfield tensor [13]

$$ R_{ij\,mn} = \int_0^\infty d\tau \left\{ C^{>}_{njmi}(\tau)\, e^{i\omega_{mi}\tau} + C^{<}_{njmi}(\tau)\, e^{i\omega_{jn}\tau} - \delta_{nj} \sum_k C^{>}_{ikmk}(\tau)\, e^{i\omega_{mk}\tau} - \delta_{im} \sum_k C^{<}_{nkjk}(\tau)\, e^{i\omega_{kn}\tau} \right\} . \tag{6} $$

The correlation functions are given by

$$ C^{\gtrless}_{ijkl}(t) = \left[ \langle i|b|j\rangle \langle l|b^\dagger|k\rangle + \langle i|b^\dagger|j\rangle \langle l|b^\dagger|k\rangle \right] iG^{\gtrless}(t) , $$

where $iG^{>}(\omega) = \gamma/(1 + e^{-\beta\omega})$, $G^{<}(\omega) = G^{>}(-\omega)$. Many of the coefficients $R_{ij\,mn}$ vanish. In particular the system of equations (5) splits into two blocks. The first contains the populations and the coherences $\rho_{ab}$ and $\rho_{cd}$, together with their conjugates, i.e. the elements diagonal in the BC. The second block contains all the other coherences, which vanish identically if the initial RDM $\rho(0)$ is diagonal in the BC, the physically relevant case for our purposes. In the standard secular approximation [13] only terms with coefficients $R_{ij\,ij}$ are retained on the r.h.s. of Eq. (5), but this is not enough in our problem. To discuss possible approximations we first point out that we are interested in the case $v \ll \Omega, \Omega'$. We have two important scales, namely $\Omega' - \Omega$ and $\Omega \sim \Omega'$, the latter being much larger than the former. The secular approximation is valid if $\gamma \ll \Omega' - \Omega$, i.e. for an almost static BC. If $\gamma \sim \Omega' - \Omega \ll \Omega \sim \Omega'$ we can no longer neglect the coefficients mixing $\rho_{ab}$ and $\rho_{cd}$ (and those mixing the conjugates). We call this regime adiabatic. Finally, in the fast BC regime, $\gamma \gtrsim \Omega, \Omega'$, all the $R_{ij\,mn}$ must be present. Despite this, the reduced dynamics of the qubit is just the result of the lowest order approach [14,26,27], so we will not discuss this regime any further.
3.2
Results for Adiabatic BCs
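In the adiabatic regime coherences are calculated using the following elements of the Redfield tensor:

$$ R_{ab\,ab} = -\frac{\gamma}{2} \left[ 1 - c^2\delta - s^2\delta' + i\,(c^2 w + s^2 w') \right] ; \qquad R_{ab\,cd} = \frac{c^2\gamma}{2} \left[ 1 + \delta - i w \right] , $$

$$ R_{cd\,cd} = -\frac{\gamma}{2} \left[ 1 + c^2\delta + s^2\delta' + i\,(c^2 w - s^2 w') \right] ; \qquad R_{cd\,ab} = \frac{c^2\gamma}{2} \left[ 1 - \delta - i w \right] , $$

where $c = \cos[(\theta - \theta')/2]$, $s = \sin[(\theta - \theta')/2]$, $\delta = t_{ca} + t_{db}$, $\delta' = t_{da} + t_{cb}$, $w = w_{ca} - w_{db}$, $w' = w_{da} - w_{cb}$ and

$$ t_{ij} = \frac{1}{2} \tanh\frac{\beta\omega_{ij}}{2} \;; \qquad w_{ij} = -\frac{1}{\pi}\, \mathrm{Re}\, \psi\!\left( \frac{\pi + i\beta\omega_{ij}}{2\pi} \right) . $$

Here $\psi(z)$ is the digamma function. The coherences $\rho_{ab}(t)$ and $\rho_{cd}(t)$ can be found in closed form by simply diagonalizing a 2 × 2 matrix. They allow us to study the qubit coherence via $\langle\sigma_y(t)\rangle = -2\, \mathrm{Im}[\rho_{ab}(t) + \rho_{cd}(t)]$. The qubit coherence decays as $\exp\{-\Gamma(t)\}$, where

$$ \Gamma(t) = -\ln \left| \frac{\rho_{ab}(t) + \rho_{cd}(t)}{\rho_{ab}(0) + \rho_{cd}(0)} \right| . \tag{7} $$

Here we present the analytic solution in a regime where the dynamics of the charge is not modified by the presence of the qubit, a case which is of interest for 1/f noise. We find

$$ \rho_{ab}(t) + \rho_{cd}(t) = e^{i(\Omega + \gamma g/2)t}\, \frac{1}{2\alpha} \left[ A(\alpha)\, e^{-\frac{\gamma}{2}(1-\alpha)t} - A(-\alpha)\, e^{-\frac{\gamma}{2}(1+\alpha)t} \right] , \tag{8} $$

where $\alpha = \sqrt{1 - g^2 - 2ig\,\delta p - (1 - c^4)(1 - \delta p^2)}$ and $g = (\Omega' - \Omega)/\gamma$ enter the decay rates, whereas the prefactors are $A(\alpha) = (\alpha + c^2 - ig')\, \rho_{ab}(0) + (\alpha + c^2 + ig')\, \rho_{cd}(0)$, with $g' = g + i\,\delta p\,(1 - c^2)$.
3.3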
Two Regimes for the BC
A rough analysis of Eq. (8) allows us to draw a physical picture. If $g \gg 1$, $\alpha$ is substantially imaginary, reflecting the fact that a very slow BC mainly provides a static energy shift. Instead, for $g \ll 1$ the decay rates acquire a real part, thus a fast BC may determine an exponential reduction of the output signal. From a more quantitative analysis [20] it emerges that BCs with $g \ll 1$, which we call weakly coupled, behave as a suitable set of harmonic oscillators with power spectrum given by Eq. (4). They mainly produce the homogeneous broadening of the signal. Instead, BCs with $g \gg 1$, which we call strongly coupled, give rise to memory effects, and deviations of their statistical properties from those of an oscillator environment (cumulants higher than the second) are relevant [17]. They mainly produce the inhomogeneous broadening of the signal. Finally we compare this result with other approaches. Notice first that Eq. (8) is valid if the power spectrum $S_i(\omega)$ (see Eq. (4)) is negligible at frequencies $\omega \sim \Omega$. In this regime both the decay $\Gamma(t)$, Eq. (7), and the energy shift reproduce the results of the standard weak coupling approach [13] if $\gamma t \gg 1$. On the other hand, as in the work of Refs. [14,26,27], we may simulate the effect of the BC by a suitable set of quantum harmonic oscillators. The resulting decoherence rate for this oscillator environment is given by

$$ \Gamma_{\mathrm{osc}}(t) = \frac{1}{2}\, g^2 \left. \frac{\partial^2 \Gamma(t)}{\partial g^2} \right|_{g=0} , \tag{9} $$

where $\Gamma(t)$ is given by Eq. (8) with thermal initial conditions for the BC.
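The two regimes can be made concrete by evaluating the quantity $\alpha = \sqrt{1 - g^2 - 2ig\,\delta p - (1-c^4)(1-\delta p^2)}$ that sets the exponents $\frac{\gamma}{2}(1 \mp \alpha)$ in Eq. (8). The Python sketch below is only an illustration under the assumption that the real parts of these exponents are read as decay rates; the function names and parameter values are hypothetical. For the weakly coupled case it also compares the slow rate with $\gamma g^2 (1-\delta p^2)/4$, which for $c = 1$ and $g = v/\gamma$ equals half the zero-frequency value of the single-fluctuator spectrum of Eq. (4).

```python
import numpy as np


def alpha(g, dp, c=1.0):
    """alpha entering Eq. (8); g = (Omega' - Omega)/gamma, dp = thermal
    population difference, c = cos[(theta - theta')/2]."""
    return np.sqrt(complex(1.0 - g**2 - (1.0 - c**4) * (1.0 - dp**2),
                           -2.0 * g * dp))


def decay_rates(g, dp, c=1.0, gamma=1.0):
    """Real parts of the two exponents (gamma/2)(1 -/+ alpha) of Eq. (8)."""
    a = alpha(g, dp, c)
    return 0.5 * gamma * (1.0 - a).real, 0.5 * gamma * (1.0 + a).real


dp = 0.3

# Weakly coupled fluctuator, g << 1: alpha is almost real and the slow rate
# is close to gamma * g^2 * (1 - dp^2) / 4.
g_weak = 0.05
slow_weak, fast_weak = decay_rates(g_weak, dp)
golden_rule = g_weak**2 * (1.0 - dp**2) / 4.0

# Strongly coupled fluctuator, g >> 1: alpha is almost imaginary, so both
# rates stay of order gamma/2 and no longer grow with the coupling.
g_strong = 20.0
a_strong = alpha(g_strong, dp)
slow_strong, fast_strong = decay_rates(g_strong, dp)

print("weak:   slow rate", slow_weak, " vs ", golden_rule)
print("strong: alpha =", a_strong, " rates:", slow_strong, fast_strong)
```

The imaginary part of α in the strong-coupling case shows up as an oscillation frequency of order the coupling itself, in line with the static energy shift described above.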
4
Pure Dephasing
For $E_J = 0$ the environment only produces random fluctuations of the level splitting. This case, usually referred to as "pure dephasing", is special in that the Hamiltonian (3) commutes with $\sigma_z$. The charge in the island is then conserved and no relaxation occurs, but if we prepare the qubit in a superposition the system will dephase. If the initial density matrix is factorized, $W(0) = w_E(0) \otimes \rho^Q(0)$, it is possible to write an exact expression for the coherences only in terms of the environment [12,20,21],

$$ \rho^Q_{01}(t) = \rho^Q_{01}(0)\, e^{-\Gamma(t) - i\delta E(t)} \;; \qquad \Gamma(t) = -\ln \left| \mathrm{Tr}_E \left\{ w_E(0)\, e^{iH_{-1}t}\, e^{-iH_1 t} \right\} \right| , \tag{10} $$

where $H_\eta = \sum_i H_I^i - (\eta/2)\, \hat{E}$. This general expression may be further simplified because individual charges contribute independently to $\Gamma(t)$, $\Gamma(t) = \sum_i \Gamma_i(t)$, where $\Gamma_i(t)$ is the dephasing due to a single charge resulting from the Hamiltonian Eq. (2). We notice that in this case $\theta = \theta'$, which implies that $\rho^Q_{10}(t) = \rho_{ab}(t) + \rho_{cd}(t)$, and moreover the adiabatic approximation is exact, since the Redfield coefficients relating the above two coherences with all the other entries of the RDM vanish identically. Then $\Gamma_i(t)$ is given by Eq. (7), and in the limit described by Eq. (8) we find the analytic form

$$ \Gamma_i(t) = -\ln \left| A_i\, e^{-\frac{\gamma_i}{2}(1-\alpha_i)t} + (1 - A_i)\, e^{-\frac{\gamma_i}{2}(1+\alpha_i)t} \right| , \tag{11} $$

where $A_i = \frac{1}{2}\left[1 + (1 - ig_i)/\alpha_i\right] p_{0i} + \frac{1}{2}\left[1 + (1 + ig_i)/\alpha_i\right] p_{1i}$, $g_i = v_i/\gamma_i$, $\alpha_i = \sqrt{1 - g_i^2 - 2ig_i\,\delta p_i}$, and $p_{0i}$ ($p_{1i}$) is the probability that the i-th BC is initially empty (singly occupied). This result has been obtained using several different techniques in [20,21]. Finally, by combining the equations of this section with Eq. (9), we recover the exact result for a set of harmonic oscillators [12],

$$ \Gamma_{\mathrm{osc}}(t) = \int_0^\infty \frac{d\omega}{\pi}\, S(\omega)\, \frac{1 - \cos\omega t}{\omega^2} . \tag{12} $$
4.1
Single BC
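Before turning to the discussion, Eqs. (11) and (12) can be evaluated numerically for one fluctuator. The Python sketch below is a minimal illustration under stated assumptions. The prefactor is taken as $A = \frac{1}{2}[1 + (1 - ig\,\delta p_0)/\alpha]$ with $\delta p_0 = p_0 - p_1$, the form that follows from Eq. (8) for $\theta = \theta'$ (c = 1). For the oscillator result, Eq. (12) is integrated in closed form for a single Lorentzian of the type of Eq. (4), giving $\Gamma_{\mathrm{osc}} = \frac{g^2}{4}(1-\delta p^2)(\gamma t - 1 + e^{-\gamma t})$; this closed form is worked out for the sketch and is not quoted from the text. Times are measured in units of $1/\gamma$, and all function names and parameter values are hypothetical.

```python
import numpy as np


def gamma_bc(tau, g, dp, dp0):
    """Dephasing Gamma(t) of Eq. (11) for one fluctuator.

    tau = gamma * t, g = v / gamma, dp = thermal population difference,
    dp0 = p0 - p1 = initial population difference of the fluctuator.
    """
    alpha = np.sqrt(complex(1.0 - g**2, -2.0 * g * dp))
    a = 0.5 * (1.0 + (1.0 - 1j * g * dp0) / alpha)
    f = (a * np.exp(-0.5 * (1.0 - alpha) * tau)
         + (1.0 - a) * np.exp(-0.5 * (1.0 + alpha) * tau))
    return -np.log(np.abs(f))


def gamma_osc(tau, g, dp):
    """Eq. (12) evaluated for a single Lorentzian spectrum (closed form
    derived for this sketch)."""
    return 0.25 * g**2 * (1.0 - dp**2) * (tau - 1.0 + np.exp(-tau))


# Weakly coupled fluctuator: the two results nearly coincide.
weak = gamma_bc(10.0, 0.1, 0.0, 0.0), gamma_osc(10.0, 0.1, 0.0)

# Strongly coupled fluctuator: Gamma(t) stays far below the oscillator value.
strong = gamma_bc(10.0, 5.0, 0.0, 0.0), gamma_osc(10.0, 5.0, 0.0)

# Short times: Gamma ~ g^2 tau^2 (1 - dp0^2) / 8 = v^2 t^2 (1 - dp0^2) / 8.
short = gamma_bc(0.01, 2.0, 0.2, 0.5)
short_expected = 2.0**2 * 0.01**2 * (1.0 - 0.5**2) / 8.0

print("weak:", weak)
print("strong:", strong)
print("short time:", short, short_expected)
```

Note that for $g > 1$ the argument of the logarithm oscillates and can pass close to zero, so Γ(t) evaluated on a fine time grid shows sharp features at the corresponding times.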
We study deviations of Eq. (11) from the result for an oscillator environment, Eq. (12). We consider different initial conditions for the BCs. Substantial deviations are clearly observed, except in the case of a weakly coupled BC ($g = 0.1$). In particular, BCs with $g > 1$ induce slower dephasing compared to an oscillator environment with the same $S(\omega)$, a sort of saturation effect. Recurrences at times comparable with $1/v$ are visible in $\Gamma(t)$. In addition, strongly coupled charges show memory effects. We now study the effect of the initial conditions of the BC, expressed via $\delta p_0 = p_0 - p_1$ (Fig. 2). In a single shot process $\delta p_0 = \pm 1$ and $\Gamma(t)$ describes dephasing during time evolution (homogeneous broadening). Memory effects are apparent, and in particular for $\gamma t \ll 1$ the short time behavior is $\Gamma(t) \approx v^2 t^2 (1 - \delta p_0^2)/8 - \gamma v^2 t^3 (1 + 2\delta p_0\,\delta p - 3\delta p_0^2)/24$, which is $\propto t^3$ for $\delta p_0 = \pm 1$. A two level system is stiffer than a set of oscillators, and indeed $\Gamma_{\mathrm{osc}}(t) \approx v^2 (1 - \delta p^2)\, t^2/8$. On the other hand, if we choose $\delta p_0 = \delta p$ in Eq. (11), for very short times $\Gamma(t) \approx \Gamma_{\mathrm{osc}}(t)$. This case corresponds physically to repeated measurements where the preparation of the BC is not controlled and determines a slightly different characteristic frequency for the qubit in each run. This sort of inhomogeneous broadening adds to decoherence during the individual time evolutions and determines a faster decay of the signal. To summarize, a weakly coupled BC, $g \ll 1$, behaves as a source of gaussian noise, decoherence depending only on the power spectrum of the fluctuator, whereas decoherence due to a strongly coupled BC, $g \gg 1$, displays saturation effects and dependence on the initial conditions of the BC.

Fig. 2. Reduced $\Gamma(t)$ due to a BC prepared in a stable state ($\delta p_0 = 1$ solid lines and $\delta p_0 = -1$ dotted lines) and in a thermal mixture ($\delta p_0 = \delta p$ dashed lines) for the indicated values of $g = v/\gamma$ (g = 0.1, 1.0, 2.0, 5.0), plotted vs. $\gamma t$. Inset: longer time behavior for stable state preparation. The curves are normalized in such a way that the oscillator approximation for all of them coincides (thick dashed line)
4.2
1/f Noise in Single Shot Measurements
The set of BCs producing 1/f noise contains both weakly and strongly coupled fluctuators, so no typical time scale is present. A very large number of slow fluctuators is present, and it is not a priori clear how saturation manifests itself. In Fig. 3 we show the results for a realistic sample. We choose initial conditions δp0j = ±1 randomly distributed on the set of N BCs with (1/N) Σ_j δp0j ≈ δp. We checked that different microscopic realizations of these conditions give roughly the same total initial extra bias E(0) and the same dephasing. We calculate dephasing during the time evolution, i.e. the average signal of several single-shot experiments, where E(0) is re-calibrated
[Fig. 3 plot: −Γ(t) versus t (sec), from 0 to 4e-09; solid curves labeled by the number of decades included.]
Fig. 3. Γ(t) due to 1/f noise in the range ω ∈ [γm, 10¹² Hz], for decreasing γm (solid lines, the label is the number of decades included). Slower BCs saturate, whereas this does not happen for an oscillator environment (dashed lines). Parameters (v = 9.2 × 10⁷ Hz, nd = 1000) give the experimental noise levels and reproduce the observed decay of the echo signal [19]. Couplings vi are distributed with ∆v/|v| = 0.2
before each experiment. This ideal protocol minimizes the effects of the environment. We now perform a spectral analysis of the effects of the environment by adding slower fluctuators decade after decade, in such a way that 1/f noise with the same amplitude A is present for ω ∈ [γm, 10¹² Hz]. We see in this example that dephasing is given by BCs with γj > 10⁷ Hz ≈ |v|/10. The overall effect of the strongly coupled BCs (γj < |v|/10) is minimal, despite their large number, thus low-frequency noise saturates. Instead Γosc(t), Eq. (12), does not saturate at low frequencies. We notice finally that for this protocol Γ(t) is roughly given by Γosc(t) provided low frequencies are cut off at ω ∼ |v|. Thus dephasing depends essentially on a single additional parameter besides the power spectrum S(ω), namely the average coupling |v| or equivalently the number of charges producing a decade of noise, nd. Our results are not very sensitive to the value of nd we choose. Indeed for constant amplitude A we must keep constant Σ_i vi² ≈ nd v², so the effective low-frequency cutoff ∼ |v| varies as nd^(−1/2). For nd → ∞ the low-frequency cutoff goes to zero, as in the result Eq. (12) for the oscillator environment.
4.3
Repeated Measurements and Inhomogeneous Broadening
Single-shot measurements are a goal for experimental research, but presently available protocols involve repeated measurements. For instance in [7,19] the time evolution procedure is repeated ∼ 10⁵ times and the total current due to the possible presence of the extra Cooper pair in all the repetitions is measured. The signal is then the sum over different possible time evolutions of the BCs with initial conditions which are also randomly fluctuating. This
additional blurring of the output signal results in a faster decay for Γ(t). As explained in [21], this is accounted for by letting, in Γ(t) (i.e. in Eq. (11)), δp0i = δpi(tm), the average of the values of δpi sampled at regular times tα for the overall measurement time tm. As a rough estimate we let δpi(tm) = ±1 if γi tm < 1 and δpi(tm) = δpi for γi tm > 1, and consider only the case of long overall measurement time |v|tm ≫ 1. In this case BCs with γ < 1/tm are saturated and are not effective, whereas for the other BCs, being averaged, we may take Γ^(i)(t) ≈ Γ^(i)_osc(t) for small enough times. The result would be Γ(t) ≈ ∫_{1/tm}^{∞} dω S(ω)(1 − cos ωt)/(πω²) and would prove the recipe proposed by Cottet et al. [28]. In Fig. 4 (dotted lines) we show that indeed dephasing, calculated as outlined above, is roughly given at short times by the oscillator environment approximation with a lower cutoff taken at ω ≈ min{|v|, 1/tm}. We also show results with a different averaging procedure, δp0j(tm) = (1/tm) ∫_0^{tm} dt δpj(t), which takes into account the strongly correlated dynamics of 1/f noise (solid lines in Fig. 4). These correlations do not affect the results except possibly for 1/tm ≈ |v|.
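The oscillator-environment estimate with a lower cutoff can be evaluated numerically. The following is a minimal sketch, assuming a spectrum of the form S(ω) = A/ω (the normalization of A is a placeholder, not the paper's) and a finite upper limit `w_max` standing in for the high-frequency cutoff; the function names are illustrative and do not come from the original work.

```python
import numpy as np

def lower_cutoff(v_bar, t_m):
    """Lower cutoff min{|v|, 1/t_m} quoted in the text (same units as v_bar)."""
    return min(abs(v_bar), 1.0 / t_m)

def one_over_f(amplitude):
    """Assumed model spectrum S(w) = amplitude / w (normalization is hypothetical)."""
    return lambda w: amplitude / w

def gamma_osc(t, spectrum, w_min, w_max, n=200_000):
    """Integral of S(w) (1 - cos(w t)) / (pi w^2) from w_min to w_max.

    Uses a trapezoid rule on a logarithmic grid, which suits a 1/f-like
    integrand spanning many decades. Keep w_max * t modest so that the
    oscillations are resolved by the grid.
    """
    w = np.logspace(np.log10(w_min), np.log10(w_max), n)
    # 1 - cos(x) is written as 2 sin^2(x/2) to avoid cancellation at small x.
    f = spectrum(w) * 2.0 * np.sin(0.5 * w * t) ** 2 / (np.pi * w ** 2)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w)))
```

For times short compared with 1/w_max the integrand reduces to S(ω)t²/(2π), so the sketch should reproduce the t² growth, and raising `w_min` (for instance to `lower_cutoff(v_bar, t_m)`) can only lower the result because the integrand is non-negative.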
[Fig. 4 plot: −Γ(t) versus t (sec), from 0 to 1.5e-09; three sets of curves for different values of 1/tmeas.]
Fig. 4. Different averages over δp0j for 1/f spectrum reproduce the effect of repeated measurements. They are obtained by neglecting (dotted lines) or accounting for (solid lines) the strongly correlated dynamics of 1/f noise. The noise level of [19] is used, by setting |v| = 9.2 × 10⁶ Hz, nd = 10⁵, γm = 1 Hz, γM = 10⁹ Hz. Dashed lines are the oscillator approximation with a lower cutoff at ω = min{|v|, 1/tm}
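As a quick arithmetic check of the statement that nd v² is held fixed at constant noise amplitude, the parameter sets quoted for Figs. 3 and 4 can be compared. The sketch below takes those values as v = 9.2 × 10⁷ Hz with nd = 1000 and |v| = 9.2 × 10⁶ Hz with nd = 10⁵; the helper name is illustrative only.

```python
def noise_weight(n_d, v_bar):
    """n_d * v_bar**2, the combination kept constant at fixed 1/f amplitude A."""
    return n_d * v_bar ** 2

# Parameter sets as quoted in the captions of Fig. 3 and Fig. 4.
fig3 = noise_weight(1000, 9.2e7)
fig4 = noise_weight(1e5, 9.2e6)

# Ratio of the two weights; equal noise amplitude corresponds to a ratio of one.
weight_ratio = fig3 / fig4

# The effective cutoff ~ |v| should scale as n_d**(-1/2) between the two sets.
cutoff_ratio = 9.2e6 / 9.2e7
expected_cutoff_ratio = (1000 / 1e5) ** 0.5
```

Both parameter sets give the same nd v², and the ratio of the couplings matches the nd^(−1/2) scaling of the low-frequency cutoff.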
4.4
Charge Echo
Echo-type techniques have been recently suggested [7,28] and experimentally tested [19] as a tool to reduce inhomogeneous broadening due to the low-frequency fluctuators of the 1/f spectrum. In the experiment of [19] the echo protocol consists of a π/2 preparation pulse, a π swap pulse and a π/2 measurement pulse. Each pulse is separated by the delay time t. We calculated
the decay of the echo signal using a semi-classical approach [29,21]. This result allowed us to estimate the parameters we have used in this work by comparing with [19]. The decay depends very weakly on initial conditions of the BCs and is well reproduced by the oscillator environment approximation. This means that in the experiment [19] the echo procedure actually cancels the effect of strongly coupled charges, this conclusion being valid as long as the delay time is short, t|v| ≪ 1. We remark that in the regime of parameters we consider, for given noise amplitude the echo signal is strongly dependent on the high-frequency cutoff, and a detailed analysis may give information on the actual existence of BCs switching at rates comparable with Ω ∼ 10 GHz.
5
Decoherence for a Generic Working Point
The most effective strategy for defeating 1/f noise has been implemented in the experiment by Vion et al. [8]. It is useful to explain it in a non-pictorial way by considering decoherence as given by the weak coupling approach [13,14,26]
$$\Gamma_0(t) = \left[ \tfrac{1}{2}\, S(0) \cos^2\theta + \tfrac{1}{4}\, S(\Omega) \sin^2\theta \right] t . \qquad (13)$$
Even if this formula does not hold for 1/f noise, it indicates that the dangerous adiabatic term containing S(0) may be eliminated (in lowest order) if one operates at θ = π/2. Small deviations from this point may determine a dramatic increase of decoherence [8]. Understanding decoherence at a generic operating point may help in designing more flexible gates, or in implementing different strategies such as computation with geometric phases [6].
5.1
Single BC
First we consider Eq. (8) and we notice that by changing the working point θ, a strongly coupled BC may become weakly coupled and vice-versa. For simplicity we let KT εc , then α ≈ c⁴ − g². The BC is weakly coupled if γ γc = (Ω − Ω)/c². Thus, the threshold value decreases if we go from θ = 0 to θ = π/2 (see Fig. 5). To get some insight into the problem of decoherence away from θ = π/2 we consider Γ(t) in the adiabatic regime, Eqs. (7,8). Results plotted in Fig. 6 show Γ(t) parametrized by θ for four values of the parameter v/γ. For reference we also plot (dashed lines) Γ0(t), Eq. (13). The top left panel shows a weakly coupled BC for which Γ(t) roughly follows Γ0(t). The other panels show BCs with v ≥ γ which turn from strongly coupled to weakly coupled as θ increases in [0, π/2]. For these latter BCs saturation is less effective in suppressing decoherence when the operation point is close to θ = π/2, which may be an indication of the fact that their effect is more sensitive to deviations from the optimal point.
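The reference curve Γ0(t) of Eq. (13) is linear in t, so its θ-dependence is carried entirely by the rate Γ0(t)/t. A minimal sketch of that rate follows; it assumes finite values for S(0) and S(Ω), which is an idealization since the text notes that the formula does not hold for 1/f noise, and the numbers in the example are purely hypothetical.

```python
import math

def gamma0_rate(theta, s_zero, s_omega):
    """Weak-coupling rate Gamma_0(t)/t of Eq. (13).

    theta   -- working point angle
    s_zero  -- S(0), low-frequency (adiabatic) noise, assumed finite here
    s_omega -- S(Omega), noise at the qubit frequency
    """
    return (0.5 * s_zero * math.cos(theta) ** 2
            + 0.25 * s_omega * math.sin(theta) ** 2)

# Hypothetical noise levels with S(0) much larger than S(Omega).
optimal = gamma0_rate(math.pi / 2, 1000.0, 1.0)        # adiabatic term drops out
detuned = gamma0_rate(0.45 * math.pi, 1000.0, 1.0)     # small deviation from pi/2
```

With these placeholder values the rate at θ = 0.45π exceeds the rate at θ = π/2 by well over an order of magnitude, which illustrates why small deviations from the optimal point can matter when S(0) dominates.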
[Fig. 5 plot: γc/v versus θ (0 to 1.4) for v/Ω = 0.0, 0.1, 0.2, 0.3.]
Fig. 5. The threshold value for a BC behaving as weakly coupled depends on the operating point
[Fig. 6 plots: four panels of reduced Γ(t) versus γt (0 to 10) for v/γ = 0.05, 1.0, 2.0, 4.0, each with Ω/γ = 20; curves labeled θ = π/2, 0.45π, π/3, π/4, 0.]
Fig. 6. Reduced Γ(t) (solid lines) plotted for various v/γ = 0.05, 1.0, 2.0, 4.0. In each panel curves correspond, from top to bottom, to θ = π/2, 0.45π, π/3, π/4, 0. For comparison Γ0(t) is also plotted (dashed lines). In the units used, Γ0(t) is the same in each panel
5.2
1/f Noise at the Optimal Point
For a set of BCs the road-map for the reduced dynamics outlined in Sect. 3 is not easily implemented by generalizing the master equation Eq. (5). In [20] we used the Heisenberg equations of motion, where we factorize correlations between the qubit and the bands, and between different BCs. We are left with a set of 3(N + 1) coupled differential equations, where N is the total number of BCs. These approximations are expected to work for small vi but in effect they give accurate results for general values of gi even if vi /EJ is
not very small, as we checked by comparing with numerical evaluation of the master equation for the qubit with one and two BCs. We study decoherence at the optimal point θ = π/2 (Ω = EJ) via the Fourier transform of σz(t). We consider a set of weakly coupled BCs in the range [10⁻², 10] EJ which determines 1/f noise around the operating frequency with the amplitude of the typical measured spectra [18,19] extrapolated to GHz frequencies. Dephasing due to this set of BCs agrees with the weak coupling result Eq. (13). Then we add a slower (and strongly coupled, g0 = v0/γ0 = 8.3) BC, which should produce no effect according to Eq. (13), Fig. 7a. We find that the strongly coupled BC alone determines a dephasing rate comparable with that of the weakly coupled BCs, and the overall dephasing rate is more than doubled. This result shows that slower charges, γ ≪ Ω, play a role in dephasing. Moreover information beyond S(ω) is needed, as we checked by showing that sets of charges with different N and vi but the same S(ω) yield substantially different values of the decoherence rate. Decoherence is larger if BCs with g ≳ 1 are present in the set. In summary these results show that Eq. (13) underestimates the effect of strongly coupled BCs and also that decoherence at the optimal point may be substantial even if the 1/f spectrum does not extend up to frequencies ∼ EJ. If we further slow down the added BC we find that Γφ increases toward values ∼ γ0, the switching rate of the BC. This indicates that the effect of strongly coupled BCs on decoherence tends to saturate, in analogy with the
[Fig. 7 plots: (a) σz(ω) EJ versus ω/EJ near ω/EJ = 1, with inset S(ω)/EJ versus ω/EJ (legend: slow charge, simulated 1/f, 1/f + slow charge, 1/f spectrum); (b) σz(ω) EJ versus ω/EJ.]
Fig. 7. (a) The Fourier transform σz(ω) for a set of weakly coupled BCs plus a single strongly coupled BC (solid line). The separate effect of the coupled slow BC alone (g0 = 8.3, dashed line) and of the set of weakly coupled BCs (dotted line) is shown for comparison. In the inset the corresponding power spectra: notice that at ω = EJ the power spectrum of the extra charge alone (dashed line) is very small. In all cases the noise level at EJ is fixed to the value S(EJ)/EJ ≈ 3.18 × 10⁻⁴. (b) The Fourier transform σz(ω) for a set of weakly coupled BCs plus a strongly coupled BC (v0/γ0 = 61.25) prepared in the ground (dotted line) or in the excited state (thick line)
results for pure dephasing. In this regime, we observe also memory effects related to the initial preparation of the strongly coupled BC (see Fig. 7b). Again we expect a dependence of dephasing on the protocol of the quantum gate.
6
Conclusions
In conclusion we have studied dephasing due to charge fluctuations in solid state qubits. For a fluctuator environment with 1/f spectrum, memory effects and higher order moments are important, so additional information on the environment is needed to estimate dephasing. In this case the additional information on the environment needed depends on the protocol but often reduces to a single parameter. A new energy scale emerges, the average coupling |v| of the qubit with the BCs, which is the additional information needed to discuss single shot experiments (alternatively one should know the order of magnitude of nd). For repeated experiments the relevant scale is instead min{|v|, 1/tm} where tm is the overall measurement time. We point out that these scales emerge directly from the study of the dynamics of the model, and not from further assumptions. Finally echo measurements are sensitive to the high-frequency cutoff γM of the 1/f spectrum. It is interesting to notice that some results for an environment of quantum harmonic oscillators with arbitrary qubit-environment coupling may be obtained as a limit of the discrete environment in the semi-classical regime, and depend on the classical statistical properties of an equivalent random process with no reference to the quantum nature of the environment. Our results are directly applicable to other implementations of solid state qubits. Josephson flux qubits [30,22] suffer from similar 1/f noise, originating from trapped vortices. Our model applies if σz represents the flux. Also the parametric effect of 1/f noise on the coupling energy of a Josephson junction [23] can be analyzed within our model, as long as individual fluctuators do not determine large variations of EJ. In this case it may be possible that the same sources generate both charge noise and fluctuations of EJ. This can be accounted for by choosing the “noise axis” as the ẑ axis.
An important issue is to understand dephasing near the optimal operating points of the qubit [8]. Low-frequency noise can also be minimized by echo techniques, but the flexibility in the implementation of gates is greatly reduced. A possibility is to implement quantum computation using Berry phases [6], where the design of gates includes an echo procedure. Frequency shifts can be calculated within our model but a reliable analysis of the effect of 1/f noise is still missing. Finally we mention that the sensitivity of coherent devices may be used to investigate high frequency noise [31]. In particular an accurate matching between measured inhomogeneous broadening, echo signal and relaxation may
give reliable information on the actual existence of BCs at GHz frequencies and on the high-frequency cutoff of the 1/f spectrum.
Acknowledgements
We thank M. Palma, R. Fazio, A. Shnirman, G. Schön and C. Urbina for discussions which greatly sharpened the point of view presented in this work. Very useful discussions with J. Clarke, A. D’Arrigo, D. Esteve, F. Hekking, P. Lafarge, G. Giaquinta, M. Grifoni, D. van Harlingen, A. Mastellone, J.E. Mooij, Y. Nakamura, Y. Nazarov, F. Plastina, U. Weiss, D. Vion and A. Zorin are acknowledged.
References
1. A. Ekert and A. Jozsa, Rev. Mod. Phys. 68, 733 (1996); Quantum Computation and Quantum Information Theory, edited by C. Macchiavello, G.M. Palma, A. Zeilinger, World Scientific (2000); M. Nielsen and I. Chuang, Quantum Computation and Quantum Communication, Cambridge University Press (2000); Experimental implementation of quantum computation, edited by R.G. Clark, Rinton Press, Princeton (2001).
2. D. Loss and P. Di Vincenzo, Phys. Rev. A 57, 120 (1998).
3. Y. Makhlin, G. Schön and A. Shnirman, Rev. Mod. Phys. 73, 357 (2001).
4. D.A. Averin, Sol. State Comm. 105, 659 (1998); L.B. Ioffe et al., Nature 398, 679 (1999); J.E. Mooij et al., Science 285, 1036 (1999).
5. Y. Makhlin, G. Schön and A. Shnirman, Nature 398, 305 (1999); A. Shnirman, G. Schön and Z. Hermon, Phys. Rev. Lett. 79, 2371 (1997).
6. G. Falci et al., Nature 407, 355 (2000).
7. Y. Nakamura, Yu.A. Pashkin, J.S. Tsai, Nature 398, 786 (1999).
8. D. Vion et al., Science 296, 886 (2002).
9. Y. Yu et al., Science 296, 889 (2002); J. Martinis et al., Phys. Rev. Lett. 89, 117901 (2002); J. Friedman et al., Nature 406, 43 (2000); I. Chiorescu et al., private communication.
10. Yu.A. Pashkin et al., cond-mat/0212314.
11. W. Zurek, Physics Today 44, 36 (1991).
12. G.M. Palma, K.-A. Suominen and A.K. Ekert, Proc. Roy. Soc. London A 452, 567 (1996).
13. C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Atom-Photon Interactions, Wiley-Interscience (1993).
14. U. Weiss, Quantum Dissipative Systems, 2nd Ed. (World Scientific, Singapore 1999).
15. A. Leggett et al., Rev. Mod. Phys. 59, 1 (1987).
16. A.O. Caldeira and A.J. Leggett, Ann. Phys. 149, 374 (1983).
17. M.B. Weissman, Rev. Mod. Phys. 60, 537 (1988).
18. A.B. Zorin et al., Phys. Rev. B 53, 13682 (1996); M. Covington et al., Phys. Rev. Lett. 84, 5192 (2000).
19. Y. Nakamura et al., Phys. Rev. Lett. 88, 047901 (2002).
20. E. Paladino et al., Phys. Rev. Lett. 88, 228304 (2002).
21. G. Falci et al., Proceedings of the International School Enrico Fermi on “Quantum Phenomena of Mesoscopic Systems”, B. Altshuler and V. Tognetti Eds., IOS Bologna (2003).
22. J.E. Mooij et al., Science 285, 1036 (1999).
23. D.J. Van Harlingen et al., to be published in Quantum Computing and Quantum Bits in Mesoscopic Systems, edited by Kluwer Academic Plenum Press (2002).
24. R. Bauernschmitt and Y.V. Nazarov, Phys. Rev. B 47, 9997 (1993).
25. G.D. Mahan, Many-Particle Physics, Kluwer Academic, New York (2000).
26. M. Grifoni, E. Paladino, U. Weiss, E. Phys. J. B 10, 719 (1999).
27. A. Shnirman, Y. Makhlin, G. Schön, cond-mat/0202518.
28. A. Cottet et al., in Macroscopic Quantum Coherence and Quantum Computing, edited by D.V. Averin, B. Ruggiero and P. Silvestrini (Kluwer Pub., 2001), pg. 111; E. Paladino et al., ibidem, pg. 359.
29. E. Paladino et al., to be published in Quantum Computing and Quantum Bits in Mesoscopic Systems, edited by Kluwer Academic Plenum Press (2002).
30. L. Tian et al., in Proceedings of the NATO-ASI on Quantum Mesoscopic Phenomena and Mesoscopic Devices in Microelectronics, edited by I.O. Kulik and R. Elliatioglu (Kluwer Pub. 2000), pg. 429.
31. R. Aguado and L.P. Kouwenhoven, Phys. Rev. Lett. 84, 1986 (2000).
Decoherence of Flux Qubits Coupled to Electronic Circuits
F. K. Wilhelm¹, M. J. Storcz¹, C. H. van der Wal², C. J. P. M. Harmans³, and J. E. Mooij³
¹ Sektion Physik and CeNS, Ludwig-Maximilians-Universität, 80333 München, Germany
² Dpt. of Physics, Harvard University, Cambridge, MA 02138, USA
³ Dpt. of Nanoscience, Delft University of Technology, 2600 GA Delft, Netherlands
Abstract. On the way to solid-state quantum computing, overcoming decoherence is the central issue. In this contribution, we discuss the modeling of decoherence of a superconducting flux qubit coupled to dissipative electronic circuitry. We discuss its impact on single qubit decoherence rates and on the performance of two-qubit gates. These results can be used for designing decoherence-optimal setups.
1
Introduction
Quantum computation is one of the central interdisciplinary research themes in present-day physics [1]. It promises a detailed understanding of the often counterintuitive predictions of basic quantum mechanics as well as a qualitative speedup of certain hard computational problems. A generic, although not necessarily exclusive, set of criteria for building quantum computers has been put forward by DiVincenzo [2]. The experimental realization of quantum bits has been pioneered in atomic physics, optics and NMR. There, the approach is taken to use microscopic degrees of freedom which are well isolated and can be kept quantum coherent over long times. Efficient controls are attached to these degrees of freedom. Even though these approaches are immensely successful in demonstrating elementary operations, it is not evident how they can be scaled up to macroscopic computers. Solid-state systems on the other hand have proven to be scalable in present-day classical computers. Several proposals for solid-state based quantum computers have been put forward, many of them in the context of superconductors [3]. As solid-state systems contain a macroscopic number of degrees of freedom, they are very sensitive to decoherence. Mastering and optimizing this decoherence is a formidable task and requires deep understanding of the physical system under investigation. Recent experimental success [4,5] suggests that this task can in principle be performed. In this contribution, we are going to study decoherence of superconducting qubits coupled to an electromagnetic environment which produces Johnson-Nyquist noise. We show how the decoherence properties can be engineered by carefully designing the environmental impedance. We will discuss how the decoherence affects the performance of a CNOT operation.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 763–778, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Superconducting Flux Qubits
Superconducting qubits [3,4,5,6] are very well suited for the task of solid-state quantum computation, because two of the most obvious decoherence sources in solid-state systems are suppressed: Quasiparticle excitations experience an energy gap and phonons are frozen out at low temperatures [7]. The computational Hilbert space is engineered using Josephson tunnel junctions that are characterized by two competing energy scales: The Josephson coupling of a junction with critical current Ic, EJ = Ic Φ0/2π, and the charging energy Ech = 2e²/CJ of a single Cooper pair on the geometric capacitance CJ of the junction. Here Φ0 = h/2e is the superconducting flux quantum. There is a variety of qubit proposals classified by the ratio of these energies. Whereas another contribution in this volume [8] focuses on the case of charge qubits, Ech > EJ, this contribution is motivated by flux qubit physics, EJ > Ech. However, most of the discussion has its counterpart in other superconducting setups as well. Specifically, we discuss a three junction qubit [6,9], a micrometer-sized low-inductance superconducting loop containing three Josephson tunnel junctions (Fig. 1). By applying an external flux Φq a persistent supercurrent can be induced in the loop. For values where Φq is close to a half-integer number of flux quanta, two states with persistent currents of opposite sign are nearly degenerate but separated by an energy barrier. We will assume here that the system is operated near Φq = ½Φ0. The
[Fig. 1 schematic; labels: qubit, bias current, microwave current, control current, Zsh(ω); circuits marked a, b, c.]
Fig. 1. Experimental setup for measurements on a flux qubit. The qubit (center) is a superconducting loop that contains three Josephson junctions. It is inductively coupled to a DC-SQUID (a), and superconducting control lines for applying magnetic fields at microwave frequencies (b) and static magnetic fields (c). The DC-SQUID is realized with an on-chip shunt circuit with impedance Z(ω). The circuits (a)-(c) are connected to filtering and electronics (not drawn)
persistent currents in the classically stable states have here a magnitude Ip. Tunneling through the barrier causes a coupling between the two states, and at low energies the loop can be described by a Hamiltonian of a two-state system [6,9],
$$\hat{H}_q = \frac{\varepsilon}{2}\,\hat{\sigma}_z + \frac{\Delta}{2}\,\hat{\sigma}_x , \qquad (1)$$
where σ̂z and σ̂x are Pauli matrices. The two eigenvectors of σ̂z correspond to states that have a left or a right circulating current and will be denoted as |L⟩ and |R⟩. The energy bias ε = 2Ip(Φq − ½Φ0) is controlled by the externally applied field Φq. We follow [10] and define ∆ as the tunnel splitting at Φq = ½Φ0, such that ∆ = 2W with W the tunnel coupling between the persistent-current states. This system has two energy eigenvalues ±½√(∆² + ε²), such that the level separation ν is ν = √(∆² + ε²). In general ∆ is a function of ε. However, it varies on the scale of the single junction plasma frequency, which is much above the typical energy range at which the qubit is operated, such that we can assume ∆ to be constant for the purpose of this paper. In the experiments Φq can be controlled by applying a magnetic field with a superconducting coil at a distance from the qubit, and for local control one can apply currents to superconducting lines, fabricated on-chip in the vicinity of the qubit. The qubit’s quantum dynamics will be controlled with resonant microwave pulses (i.e. by Rabi oscillations). In recent experiments the qubits were operated at ε ≈ 5∆ or ε ≈ 0 [4,9]. The numerical values given in this paper will concentrate on the former case. At this point, there is a good trade-off between a system with significant tunneling, and a system with σ̂z-like eigenstates that can be used for qubit-qubit couplings and measuring qubit states [6]. The qubit has a magnetic dipole moment as a result of the clockwise or counter-clockwise persistent current. The corresponding flux in the loop is much smaller than the applied flux Φq, but large enough to be detected with a SQUID.
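The eigenvalues quoted for Eq. (1) and the statement that the eigenstates are σ̂z-like at ε ≈ 5∆ can be checked by direct diagonalization. This is a small sketch with energies measured in units of ∆ (an assumption made for convenience); the function names are illustrative.

```python
import numpy as np

SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])

def qubit_hamiltonian(eps, delta):
    """Two-state Hamiltonian of Eq. (1): (eps/2) sigma_z + (delta/2) sigma_x."""
    return 0.5 * eps * SIGMA_Z + 0.5 * delta * SIGMA_X

def level_separation(eps, delta):
    """Difference of the two eigenvalues; should equal sqrt(delta^2 + eps^2)."""
    energies = np.linalg.eigvalsh(qubit_hamiltonian(eps, delta))
    return float(energies[1] - energies[0])

def ground_state_sigma_z(eps, delta):
    """Expectation value of sigma_z in the ground state."""
    _, vectors = np.linalg.eigh(qubit_hamiltonian(eps, delta))
    ground = vectors[:, 0]  # eigh returns eigenvalues in ascending order
    return float(ground @ SIGMA_Z @ ground)
```

At ε = 5∆ the ground state has |⟨σ̂z⟩| = 5/√26 ≈ 0.98, so it is close to a persistent-current state, whereas at ε = 0 the expectation value vanishes and the level separation reduces to ∆.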
This will be used for measuring the qubit states. For our two-level system Eq. (1), this means that both manipulation and readout couple to σ̂z. Consequently, the Nyquist noise produced by the necessary external circuitry will couple in as flux noise and hence couple to σ̂z, giving ε a small, stochastically time-dependent part δε(t). Operation at ε ≈ 0 has the advantage that the flux noise leads to less variation of ν. In the first experiments [4] this has turned out to be crucial for observing time-resolved quantum dynamics. Here, the qubit states can be measured by incorporating the qubit inside the DC-SQUID loop. While not working that out in detail, the methods that we present can also be applied for the analysis of this approach. This also applies to the analysis of the impact of electric dipole moments, represented by σ̂x. With Ech ≪ EJ, these couple much less to the circuitry and will hence not be discussed here. As the internal baths are well suppressed, the coupling to the electromagnetic environment (circuitry, radiation noise) becomes a dominant source of decoherence. This is a subtle issue: It is not possible to couple the circuitry
arbitrarily weakly or seal the experimental setup, because it has to remain possible to control the system. One rather has to engineer the electromagnetic environment to combine good control with low unwanted back-action. Any linear electromagnetic environment can be described by an effective impedance Zeff. If the circuit contains Josephson junctions below their critical current, they can be included through their kinetic inductance Lkin = Φ0/(2πIc cos φ̄), where φ̄ is the average phase drop across the junction. The circuitry disturbs the qubit through its Johnson-Nyquist noise, which has Gaussian statistics and can thus be described by an effective Spin-Boson model [11]. In this model, the properties of the oscillator bath which forms the environment are characterized through a spectral function J(ω), which can be derived from the external impedance. Note that other nonlinear elements such as tunnel junctions, which can produce non-Gaussian shot noise, are generically not covered by oscillator bath models. As explained above, the flux noise from an external circuit leads to ε = ε0 + δε(t) in Eq. (1). We parametrize the noise δε(t) by its power spectrum
$$\langle \{ \delta\varepsilon(t), \delta\varepsilon(0) \} \rangle_\omega = \hbar^2 J(\omega) \coth(\hbar\omega / 2 k_B T). \qquad (2)$$
Thus, from the noise properties calculated by other means one can find J(ω), as was explained in detail in [12]. In this contribution, we would like to outline an alternative approach pioneered by Leggett [13], where J(ω) is derived from the classical friction induced by the environment. In reality, the combined system of SQUID and qubit will experience fluctuations arising from additional circuit elements at different temperatures, which can be treated in a rather straightforward manner.
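As a worked illustration of Eq. (2), the high-temperature limit can be written out explicitly. The step below only uses the expansion coth(x) ≈ 1/x for small x; the Ohmic low-frequency form J(ω) ≈ 2παω is an assumption, chosen to match the definition α = lim J(ω)/(2πω) used in Sect. 3.2.

```latex
% High-temperature (classical) limit of Eq. (2), valid for \hbar\omega \ll k_B T:
\coth\!\left(\frac{\hbar\omega}{2k_BT}\right) \approx \frac{2k_BT}{\hbar\omega}
\quad\Longrightarrow\quad
\langle\{\delta\varepsilon(t),\delta\varepsilon(0)\}\rangle_\omega
\approx 2\hbar k_B T\,\frac{J(\omega)}{\omega}.
% With the assumed Ohmic form J(\omega) \approx 2\pi\alpha\omega at low frequency:
\langle\{\delta\varepsilon(t),\delta\varepsilon(0)\}\rangle_\omega
\approx 4\pi\alpha\,\hbar k_B T ,
% i.e. frequency-independent (white) Johnson-Nyquist noise proportional to T.
```

In this limit the noise power is set by the product αT alone, which is the combination that appears in the temperature-dependent dephasing term discussed in Sect. 3.2.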
3
Decoherence from the Electromagnetic Environment
3.1
Characterizing the Environment from Classical Friction
We study a DC-SQUID in an electrical circuit as shown in Fig. 1. It contains two Josephson junctions with phase drops denoted by γ1/2. We start by looking at the average phase γex = (γ1 + γ2)/2 across the read-out SQUID. Analyzing the circuit with Kirchhoff rules, we find the equation of motion
$$2C_J \frac{\Phi_0}{2\pi}\, \ddot{\gamma}_{ex} = -2 I_{c,0} \cos(\gamma_{in}) \sin\gamma_{ex} + I_{bias} - \frac{\Phi_0}{2\pi} \int dt'\, \dot{\gamma}_{ex}(t')\, Y(t - t'). \qquad (3)$$
Here, γin = (γ1 − γ2)/2 is the dynamical variable describing the circulating current in the loop which is controlled by the flux, Ibias is the bias current imposed by the source, Y(ω) = Z⁻¹(ω) is the admittance in parallel to the whole SQUID and Y(τ) its Fourier transform. The SQUID is described by the junction critical currents Ic,0, which are assumed to be equal, and their capacitances CJ. We now proceed by finding a static solution which sets the operation point γin/ex,0 and small fluctuations around them, δγin/ex. The
static solution reads Ibias = Ic,eff sin γex,0 where Ic,eff = 2Ic,0 cos γin,0 is the effective critical current of the SQUID. Linearizing Eq. (3) around this solution and Fourier-transforming, we find that
$$\delta\gamma_{ex}(\omega) = \frac{2\pi I_{bias} \tan\gamma_{in,0}\, Z_{eff}(\omega)}{i\omega\Phi_0}\, \delta\gamma_{in}(\omega) \qquad (4)$$
where Zeff(ω) = [Z(ω)⁻¹ + 2iωCJ + (iωLkin)⁻¹]⁻¹ is the effective impedance of the parallel circuit consisting of Z(ω), the kinetic inductance of the SQUID and the capacitance of its junctions. Neglecting self-inductance of the SQUID and the (high-frequency) internal plasma mode, we can straightforwardly substitute γin = πΦ/Φ0 and split it into γin,0 = πΦx,S/Φ0, set by the externally applied flux Φx,S through the SQUID loop, and δγin = πMSQ IQ/Φ0, where MSQ is the mutual inductance between qubit and the SQUID and IQ(ϕ) is the circulating current in the qubit as a function of the junction phases, which assumes values ±Ip in the classically stable states. In order to analyze the backaction of the SQUID onto the qubit in the two-state approximation, Eq. (1), we have to get back to its full, continuous description, starting from the classical dynamics. These are equivalent to a particle, whose coordinates are the two independent junction phases in the three-junction loop, in a two-dimensional potential
$$C \left( \frac{\Phi_0}{2\pi} \right)^2 \ddot{\varphi} = -\nabla U(\varphi, \Phi_{x,q} + I_S M_{SQ}). \qquad (5)$$
The details of this equation are explained in [6]. C is the capacitance matrix describing the charging of the Josephson junctions in the loop, U(ϕ) contains the Josephson energies of the junctions as a function of the junction phases, and IS is the circulating current in the SQUID loop. The applied flux through the qubit Φq is split into the flux from the external coil Φx,q and the contribution from the SQUID. Using the above relations we find
$$I_S M_{SQ} = \delta\Phi_{cl} - 2\pi^2 M_{SQ}^2 I_{bias}^2 \tan^2\gamma_{in,0}\, \frac{Z_{eff}}{i\omega\Phi_0^2}\, I_Q \qquad (6)$$
where δΦcl MSQ Ic,0 cos γex,0 sin γin,0 is the non-fluctuating back-action from the SQUID. From the two-dimensional problem, we can now restrict ourselves to the one-dimensional subspace defined by the preferred tunneling direction [6], which is described by an effective phase ϕ. The potential restricted on this direction, U1D (ϕ) has the form of a double well [11,14] with stable minima situated at ±ϕ0 . In this way, we can expand U1D (ϕ, Φq ) U (ϕ, Φq , x) + IQ (ϕ)IQ MSQ . Approximating the phase-dependence of the circulating current as IQ (ϕ) ≈ Ip ϕ/ϕ0 where Ip the circulating current in one of the stable minima of ϕ, we end up with the classical equation of motion of the qubit including the backaction and the friction induced from the SQUID
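$$-\left[ C_{\mathrm{eff}} \left(\frac{\Phi_0}{2\pi}\right)^2 \omega^2 + 2\pi^2 M_{SQ}^2 I_{\mathrm{bias}}^2 \tan^2\gamma_{\mathrm{in},0}\, \frac{Z_{\mathrm{eff}} I_p^2}{i\varphi_0\omega\Phi_0^2} \right] \varphi = -\partial_\varphi U_{1D}(\varphi, \Phi_{x,q} + \delta\Phi_{cl}). \tag{7}$$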
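From this form, encoded as $D(\omega)\varphi(\omega) = -\partial U/\partial\varphi$, we can use the prescription given in [13] and identify the spectral function for the continuous, classical model as $J_{\mathrm{cont}} = \mathrm{Im}\, D(\omega)$. From there, we can do the two-state approximation for the particle in a double well [14] and find $J(\omega)$ in analogy to [12]:

$$J(\omega) = \frac{(2\pi)^2}{\hbar\omega} \left(\frac{M_{SQ} I_p}{\Phi_0}\right)^2 I_{\mathrm{bias}}^2 \tan^2\left(\frac{\pi\Phi}{\Phi_0}\right) \mathrm{Re}\{Z_{\mathrm{eff}}(\omega)\}. \tag{8}$$

3.2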
Qubit Dynamics under the Influence of Decoherence
From $J(\omega)$, we can analyze the dynamics of the system by studying the reduced density matrix, i.e. the density matrix of the full system where the details of the environment have been integrated out, by a number of different methods. The low damping limit, $J(\omega)/\omega \ll 1$ for all frequencies, is most desirable for quantum computation. Thus, the energy eigenstates of the qubit Hamiltonian, Eq. (1), are the appropriate starting point of our discussion. In this case, the relaxation rate $\Gamma_r$ (and relaxation time $\tau_r$) is determined by the environmental spectral function $J(\omega)$ at the frequency of the level separation $\nu$ of the qubit,

$$\Gamma_r = \tau_r^{-1} = \frac{1}{2}\left(\frac{\Delta}{\nu}\right)^2 J\left(\frac{\nu}{\hbar}\right)\coth\left(\frac{\nu}{2k_BT}\right), \tag{9}$$

where $T$ is the temperature of the bath. The dephasing rate $\Gamma_\phi$ (and dephasing time $\tau_\phi$) is

$$\Gamma_\phi = \tau_\phi^{-1} = \frac{\Gamma_r}{2} + 2\pi\alpha\left(\frac{\varepsilon}{\nu}\right)^2\frac{k_BT}{\hbar} \tag{10}$$

with $\alpha = \lim_{\omega\to 0} J(\omega)/(2\pi\omega)$. These expressions have been derived in the context of NMR [15] and have recently been confirmed by a full path-integral analysis [10]. In this paper, all rates are calculated for this regime.

For performing efficient measurement, one can afford to go to the strong damping regime. A well-known approach to this problem, the noninteracting blip approximation (NIBA), has been derived in [13]. This approximation gives good predictions at degeneracy, $\varepsilon = 0$. At low $|\varepsilon| > 0$ it contains an artifact predicting incoherent dynamics even at weak damping. At high bias, $\varepsilon \gg \Delta$, and at strong damping, it becomes asymptotically correct again. We will not detail this approach further here, as it has been extensively covered in [11,14].
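As an illustration of Eqs. (9) and (10), the following minimal Python sketch evaluates both rates for a user-supplied spectral function. It assumes the standard two-state relation $\nu = \sqrt{\varepsilon^2 + \Delta^2}$ for the level separation; the Ohmic spectrum, the value of `ALPHA` and the 1 GHz tunnel splitting are hypothetical example inputs, not parameters taken from the text.

```python
import math

HBAR = 1.054571817e-34  # J s
H_PLANCK = 6.62607015e-34  # J s
KB = 1.380649e-23       # J / K


def level_splitting(eps, delta):
    """Level separation nu = sqrt(eps^2 + delta^2), all energies in joules (assumed form)."""
    return math.hypot(eps, delta)


def relaxation_rate(eps, delta, temperature, spectral_function):
    """Eq. (9): Gamma_r = (1/2) (Delta/nu)^2 J(nu/hbar) coth(nu / (2 k_B T))."""
    nu = level_splitting(eps, delta)
    coth = 1.0 / math.tanh(nu / (2.0 * KB * temperature))
    return 0.5 * (delta / nu) ** 2 * spectral_function(nu / HBAR) * coth


def dephasing_rate(eps, delta, temperature, spectral_function, alpha):
    """Eq. (10): Gamma_phi = Gamma_r / 2 + 2 pi alpha (eps/nu)^2 k_B T / hbar."""
    nu = level_splitting(eps, delta)
    flipless = 2.0 * math.pi * alpha * (eps / nu) ** 2 * KB * temperature / HBAR
    return 0.5 * relaxation_rate(eps, delta, temperature, spectral_function) + flipless


# Hypothetical example: Ohmic bath normalized so that alpha = lim J(w) / (2 pi w).
ALPHA = 1e-3


def ohmic(omega):
    return 2.0 * math.pi * ALPHA * omega


delta_example = H_PLANCK * 1e9   # illustrative 1 GHz tunnel splitting
temperature_example = 0.03       # 30 mK

gamma_r_degenerate = relaxation_rate(0.0, delta_example, temperature_example, ohmic)
gamma_phi_degenerate = dephasing_rate(0.0, delta_example, temperature_example, ohmic, ALPHA)
gamma_r_biased = relaxation_rate(5 * delta_example, delta_example, temperature_example, ohmic)
gamma_phi_biased = dephasing_rate(5 * delta_example, delta_example, temperature_example, ohmic, ALPHA)
```

At degeneracy ($\varepsilon = 0$) the flip-less term drops out and the sketch returns $\Gamma_\phi = \Gamma_r/2$; away from degeneracy the second term of Eq. (10) adds a contribution proportional to the temperature.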
If $J(\omega)$ is not smooth but contains strong peaks, the situation becomes more involved: at some frequencies $J(\omega)$ may fall in the weak damping limit and at others in the strong damping limit. In some cases, when $J(\omega) \ll \omega$ holds at least for $\omega \le \Omega$ with some $\Omega \gg \nu/\hbar$, this can be treated approximately: one can first renormalize $\Delta_{\mathrm{eff}}$ through the high-frequency contributions [11] and then perform a weak-damping approximation from the fixed-point Hamiltonian. This is detailed in [16]. In the general case, more involved methods such as flow equation renormalization [17] have to be used.
4
Engineering the Measurement Apparatus
From Eq. (8) we see that engineering the decoherence induced by the measurement apparatus essentially means engineering $Z_{\mathrm{eff}}$. This also includes the contributions due to the measurement apparatus. In this section, we outline and compare several options suggested in the literature. We assume a perfect current source that ramps the bias current $I_{\mathrm{bias}}$ through the SQUID. The fact that the current source is non-ideal, and that the wiring to the SQUID chip has an impedance, is all modeled by the impedance $Z(\omega)$. The wiring can be engineered such that for a very wide frequency range the impedance $Z(\omega)$ is on the order of the vacuum impedance and can be modeled by its real part $R_l$. It typically has a value of 100 Ω.

4.1
R-Shunt
It has been suggested [18] to overdamp the SQUID by making the shunt circuit a simple resistor $Z(\omega) = R_S$ with $R_S \ll \sqrt{L_{\mathrm{kin}}/2C_J}$. This is inspired by an analogous setup for charge qubits [3]. Following the parameters given in [12], a SQUID with $I_{c,0} = 200$ nA at $\Phi/\Phi_0 \approx 0.75$ biased at $I_{\mathrm{bias}} = 120$ nA, we find $L_{\mathrm{kin}} \approx 2\cdot 10^{-9}$ H. Together with $C_J \approx 1$ fF, this means that the SQUID is overdamped if $R \ll R_{\mathrm{max}} = 1.4$ kΩ. Using Eq. (8), we find that this provides an Ohmic environment with Drude cutoff, $J(\omega) = \alpha\omega/(1 + \omega^2/\omega_{LR}^2)$, where $\omega_{LR} = R/L_{\mathrm{kin}}$ and $\alpha = \frac{(2\pi)^2}{\hbar} (M_{SQ} I_p/\Phi_0)^2 I_{\mathrm{bias}}^2 \tan^2(\pi\Phi/\Phi_0)\, L_{\mathrm{kin}}^2/R_S$. Using the parameters from [12], $M_{SQ} I_p/\Phi_0 = 0.002$, we find $\alpha R = 0.08\,\Omega$ and $\omega_{LR}/R = 8.3$ GHz/Ω. Thus, for our range of parameters (which essentially correspond to weak coupling between SQUID and qubit), one still has low damping of the qubit from the (internally overdamped) environment at reasonable shunt resistances down to tens of Ohms.

For such a setup, one can apply the continuous weak measurement theory as it is outlined e.g. in [18]. This way, one can readily describe the readout through measurement of $Z_{\mathrm{eff}}$, which leaves the system on the superconducting branch. If one desires to read out the state by monitoring the voltage at bias currents above $I_{c,\mathrm{eff}}$, our analysis only describes the pre-measurement phase and at least shows that the system is hardly disturbed when the current is ramped.
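The quoted value of $\alpha R$ can be checked directly from the expression for $\alpha$ above. The sketch below is only a numerical cross-check using the parameters stated in the text ($M_{SQ} I_p/\Phi_0 = 0.002$, $I_{\mathrm{bias}} = 120$ nA, $\Phi/\Phi_0 = 0.75$, $L_{\mathrm{kin}} = 2$ nH); the function names and the 10 Ω example shunt are illustrative choices.

```python
import math

HBAR = 1.054571817e-34  # J s


def alpha_times_shunt_resistance(coupling, i_bias, flux_ratio, l_kin):
    """alpha * R_S = (2 pi)^2 / hbar * (M I_p / Phi_0)^2 * I_bias^2
    * tan^2(pi Phi / Phi_0) * L_kin^2, in Ohm."""
    return ((2.0 * math.pi) ** 2 / HBAR
            * coupling ** 2
            * i_bias ** 2
            * math.tan(math.pi * flux_ratio) ** 2
            * l_kin ** 2)


# Parameters quoted in the text (following [12]).
alpha_r = alpha_times_shunt_resistance(
    coupling=0.002,     # M_SQ I_p / Phi_0
    i_bias=120e-9,      # A
    flux_ratio=0.75,    # Phi / Phi_0
    l_kin=2e-9,         # H
)


def alpha_for_shunt(r_shunt):
    """Dimensionless damping strength alpha for a given shunt resistance in Ohm."""
    return alpha_r / r_shunt


alpha_at_10_ohm = alpha_for_shunt(10.0)  # illustrative shunt value
```

With these inputs `alpha_r` evaluates to about 0.086 Ω, consistent with the rounded value given above, so a shunt of some tens of Ohms still gives $\alpha$ well below $10^{-2}$.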
4.2
Capacitive Shunt
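Next, we consider a large superconducting capacitive shunt (Fig. 2a, as implemented in [4,9]). The C shunt only makes the effective mass of the SQUID's external phase $\gamma_{\mathrm{ex}}$ very heavy. The total impedance $Z_{\mathrm{eff}}(\omega)$ and $J(\omega)$ are modeled as before, see Fig. 3. As limiting values, we find

$$\mathrm{Re}\{Z_{\mathrm{eff}}(\omega)\} \approx \begin{cases} \omega^2 L_J^2 / R_l, & \text{for } \omega \ll \omega_{LC} \\ R_l, & \text{for } \omega = \omega_{LC} \\ 1/(\omega^2 C_{sh}^2 R_l), & \text{for } \omega \gg \omega_{LC} \end{cases} \tag{11}$$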
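The three limits in Eq. (11) can be verified numerically from the parallel-circuit model $Z_{\mathrm{eff}}^{-1} = 1/Z_l + 1/(R_{sh} + 1/i\omega C_{sh}) + 1/i\omega L_J$ with $R_{sh} = 0$. The following sketch does this in plain Python. $R_l = 100$ Ω and $C_{sh} = 30$ pF are the values used in this chapter, whereas $L_J = 2$ nH is an assumption borrowed from the kinetic inductance quoted in the R-shunt discussion.

```python
import math


def z_eff(omega, r_lead, l_j, c_sh, r_sh=0.0):
    """Parallel combination of the leads, the shunt branch (R_sh in series
    with C_sh) and the SQUID inductance L_J."""
    admittance = (1.0 / r_lead
                  + 1.0 / (r_sh + 1.0 / (1j * omega * c_sh))
                  + 1.0 / (1j * omega * l_j))
    return 1.0 / admittance


R_L = 100.0    # Ohm, lead resistance
C_SH = 30e-12  # F, shunt capacitance
L_J = 2e-9     # H, ASSUMED value for illustration

omega_lc = 1.0 / math.sqrt(L_J * C_SH)
omega_low = 0.01 * omega_lc
omega_high = 100.0 * omega_lc

re_low = z_eff(omega_low, R_L, L_J, C_SH).real
re_res = z_eff(omega_lc, R_L, L_J, C_SH).real
re_high = z_eff(omega_high, R_L, L_J, C_SH).real

# Limiting expressions of Eq. (11).
approx_low = omega_low ** 2 * L_J ** 2 / R_L
approx_high = 1.0 / (omega_high ** 2 * C_SH ** 2 * R_L)
```

Two decades away from the resonance the exact real part agrees with the limiting forms to better than one percent, and at $\omega = \omega_{LC}$ the reactive admittances cancel so that only $R_l$ remains.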
We can observe that this circuit is a weakly damped LC-oscillator, and it is clear from (9) and (8) that one should keep its resonance frequency $\omega_{LC} = 1/\sqrt{L_J C_{sh}}$, where $\mathrm{Re}\{Z_{\mathrm{eff}}(\omega)\}$ has a maximum, away from the qubit's resonance $\omega_{\mathrm{res}} = \nu/\hbar$. This is usually done by choosing $\omega_{LC} \ll \omega_{\mathrm{res}}$. For a C-shunted circuit with $\omega_{LC} \ll \omega_{\mathrm{res}}$, this yields for $\omega \gg \omega_{LC}$

$$J(\omega) \approx \frac{(2\pi)^2}{\hbar\omega^3}\left(\frac{M I_p}{\Phi_0}\right)^2 I_{\mathrm{bias}}^2 \tan^2\left(\frac{\pi\Phi}{\Phi_0}\right) \frac{1}{C_{sh}^2 R_l} \tag{12}$$

The factor $1/\omega^3$ indicates a natural cut-off for $J(\omega)$, which prevents the ultraviolet divergence [11,10] and which in much of the theoretical literature is introduced by hand. Using Eq. (9), we can directly analyze mixing times $\tau_r$ vs $\omega_{\mathrm{res}}$ for typical sample parameters (here calculated with the
[Figure 2: circuit diagrams, panels (a) and (b). Labels: I_bias, Z_l(ω) ≈ R_l, L_J, δV, C_sh in (a); R_sh and C_sh in (b).]
Fig. 2. Circuit models for the C-shunted DC-SQUID (a) and the RC-shunted DC-SQUID (b). The SQUID is modeled as an inductance $L_J$. A shunt circuit, the superconducting capacitor $C_{sh}$ or the $R_{sh}$-$C_{sh}$ series, is fabricated on chip very close to the SQUID. The noise that couples to the qubit results from Johnson-Nyquist voltage noise $\delta V$ from the circuit's total impedance $Z_{\mathrm{eff}}$. $Z_{\mathrm{eff}}$ is formed by a parallel combination of the impedances of the leads $Z_l$, the shunt and the SQUID, such that $Z_{\mathrm{eff}}^{-1} = 1/Z_l + 1/(R_{sh} + 1/i\omega C_{sh}) + 1/i\omega L_J$, with $R_{sh} = 0$ for circuit (a)
[Figure 3: four log-log panels. (a), (b): Re{Z_t(ω)} (Ω) versus ω/2π (Hz) for the C shunt and the RC shunt; (c), (d): J(ω) (rad s^-1) versus ω/2π (Hz).]
Fig. 3. A typical $\mathrm{Re}\{Z_t(\omega)\}$ for the C-shunted SQUID (a) and the RC-shunted SQUID (b), and corresponding $J(\omega)$ in (c) and (d) respectively. For comparison, the dashed line in (c) shows a simple Ohmic spectrum, $J(\omega) = \alpha\omega$ with exponential cut-off $\omega_c/2\pi = 0.5$ GHz and $\alpha = 0.00062$. The parameters used here are $I_p = 500$ nA and $T = 30$ mK. The SQUID with $2I_{co} = 200$ nA is operated at $f = 0.75\,\pi$ and current biased at 120 nA, a typical value for switching of the C-shunted circuit (the RC-shunted circuit switches at higher current values). The mutual inductance is $M = 8$ pH (i.e. $M I_p/\Phi_0 = 0.002$). The shunt is $C_{sh} = 30$ pF and, for the RC shunt, $R_{sh} = 10$ Ω. The leads are modeled by $R_l = 100$ Ω
non-approximated version of $\mathrm{Re}\{Z_t(\omega)\}$), see [12] for details. The mixing rate is then $\Gamma_r \approx (2\pi\Delta/\hbar)^2\, \omega_{\mathrm{res}}^{-5}\, (M I_p/\Phi_0)^2\, I_{\mathrm{bias}}^2 \tan^2(\pi\Phi/\Phi_0)\, (2\hbar C_{sh}^2 R_l)^{-1} \coth(\hbar\omega_{\mathrm{res}}/2k_BT)$. With the C-shunted circuit it seems possible to get $\tau_r$ values that are very long. They are compatible with the ramp times of the SQUID, but too slow for fast repetition rates. For the parameters used here they are in the range of 15 µs. While this value is close to the desired order of magnitude, one has to be aware of the fact that at these high switching current values the linearization of the junction as a kinetic inductor may underestimate the actual noise. In that regime, phase diffusion between different minima of the washboard potential also becomes relevant and changes the noise properties [19,20].

4.3
RC-Shunt
As an alternative we will consider a shunt that is a series combination of a capacitor and a resistor (Fig. 2b) (RC-shunted SQUID). The RC shunt also adds damping at the plasma frequency of the SQUID, which is needed for realizing a high resolution of the SQUID readout (i.e. for narrow switching-current histograms) [19]. The total impedance $Z_t(\omega)$ of the two measurement circuits is modeled as in Fig. 2. For the circuit with the RC shunt

$$\mathrm{Re}\{Z_t(\omega)\} \approx \begin{cases} \omega^2 L_J^2 / R_l, & \text{for } \omega \ll \omega_{LC} \\ \le R_l, & \text{for } \omega = \omega_{LC} \ll \frac{1}{R_{sh} C_{sh}} \\ R_l // R_{sh}, & \text{for } \omega = \omega_{LC} \gg \frac{1}{R_{sh} C_{sh}} \\ R_l // R_{sh}, & \text{for } \omega \gg \omega_{LC} \end{cases} \tag{13}$$

The difference mainly concerns frequencies $\omega > \omega_{LC}$, where the C-shunted circuit has a stronger cutoff in $\mathrm{Re}\{Z_{\mathrm{eff}}(\omega)\}$, and thereby a relaxation rate that is several orders of magnitude lower than for the RC-shunted circuit. Given the values of $J(\omega)$ from Fig. 3, one can directly see that an RC-shunted circuit with otherwise similar parameters yields at $\omega_{\mathrm{res}}/2\pi = 10$ GHz relaxation times that are about four orders of magnitude shorter.
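A rough feeling for the difference between the two shunts at 10 GHz can be obtained from the same parallel-circuit model used in Fig. 2. The sketch below compares $\mathrm{Re}\{Z_t\}$ for both circuits with $R_l = 100$ Ω, $C_{sh} = 30$ pF and $R_{sh} = 10$ Ω as quoted for Fig. 3; the inductance $L_J = 2$ nH is again an assumed illustrative value, so the resulting ratio is only indicative and need not reproduce the curves of Fig. 3.

```python
import math


def z_total(omega, r_lead, l_j, c_sh, r_sh=0.0):
    """Z_t of Fig. 2: leads, shunt branch (R_sh + 1/(i w C_sh)) and L_J in parallel."""
    admittance = (1.0 / r_lead
                  + 1.0 / (r_sh + 1.0 / (1j * omega * c_sh))
                  + 1.0 / (1j * omega * l_j))
    return 1.0 / admittance


R_L = 100.0    # Ohm
C_SH = 30e-12  # F
R_SH = 10.0    # Ohm, RC shunt only
L_J = 2e-9     # H, ASSUMED value for illustration

omega_res = 2.0 * math.pi * 10e9  # qubit resonance at 10 GHz

re_c_shunt = z_total(omega_res, R_L, L_J, C_SH).real
re_rc_shunt = z_total(omega_res, R_L, L_J, C_SH, R_SH).real

# High-frequency limit of Eq. (13): R_l in parallel with R_sh.
r_parallel = R_L * R_SH / (R_L + R_SH)

# J(w), and hence the relaxation rate of Eq. (9), is proportional to Re{Z_t}.
rate_ratio = re_rc_shunt / re_c_shunt
```

For these assumed values the RC-shunted circuit sits close to $R_l // R_{sh}$ while the C-shunted one is suppressed by the $1/\omega^2$ cutoff, giving a rate ratio of a few times $10^3$.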
5
Coupled Qubits
So far, we have applied our modeling only to single qubits. In order to study entanglement in a controlled way and to eventually perform quantum algorithms, this has to be extended to coupled qubits.

5.1
Hamiltonian
There are a number of ways to couple two solid-state qubits in a way that permits universal quantum computation. If the qubit states are given through real spins, one typically obtains a Heisenberg-type exchange coupling. For other qubits, the three components of the pseudo-spin typically correspond to physically completely distinct variables. In our case, $\hat\sigma_z$ corresponds to the flux through the loop whereas $\hat\sigma_{x/y}$ are charges. Consequently, one usually finds Ising-type couplings. The case of $\hat\sigma_y^{(1)} \otimes \hat\sigma_y^{(2)}$ coupling, i.e. coupling by a component which is orthogonal to all possible single-qubit Hamiltonians, has been extensively studied [21,22], because this type is straightforwardly realized as a tunable coupling of charge qubits [3]. We study the generic case of coupling the “natural” variables of the pseudospin to each other, which can be realized in flux qubits using a switchable superconducting transformer [6,23], but has also been experimentally utilized for coupling charge qubits by fixed capacitive interaction [24].

We model the Hamiltonian of a system of two qubits, coupled via Ising-type coupling. Each of the two qubits is described by the Hamiltonian Eq. (1). The coupling between the qubits is described by $\hat H_{qq} = -(K/2)\,\hat\sigma_z^{(1)} \otimes \hat\sigma_z^{(2)}$, which represents e.g. inductive interaction. Thus, the complete two-qubit Hamiltonian in the absence of a dissipative environment reads

$$\hat H_{2qb} = -\frac{1}{2}\sum_{i=1,2}\left(\varepsilon_i \hat\sigma_z^{(i)} + \Delta_i \hat\sigma_x^{(i)}\right) - \frac{1}{2} K \hat\sigma_z^{(1)} \hat\sigma_z^{(2)}. \tag{14}$$
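Eq. (14) is easy to write down explicitly as a 4×4 matrix. The following plain-Python sketch builds it in the product basis $|s_1 s_2\rangle$ (index $2 s_1 + s_2$, with 0 denoting ↑) and checks, for the special case of two identical qubits, that the singlet $(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)/\sqrt{2}$ is an eigenstate with energy $K/2$, as follows from the exchange symmetry of the Hamiltonian. The helper names and the unit parameter values are illustrative assumptions.

```python
import math

SX = [[0.0, 1.0], [1.0, 0.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]
ID = [[1.0, 0.0], [0.0, 1.0]]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def weighted_sum(terms):
    """Sum of (coefficient, matrix) pairs."""
    size = len(terms[0][1])
    return [[sum(c * mat[i][j] for c, mat in terms) for j in range(size)]
            for i in range(size)]


def matvec(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def two_qubit_hamiltonian(eps1, delta1, eps2, delta2, coupling_k):
    """Eq. (14) in the product basis (first factor = qubit 1)."""
    return weighted_sum([
        (-0.5 * eps1, kron(SZ, ID)),
        (-0.5 * delta1, kron(SX, ID)),
        (-0.5 * eps2, kron(ID, SZ)),
        (-0.5 * delta2, kron(ID, SX)),
        (-0.5 * coupling_k, kron(SZ, SZ)),
    ])


# Identical qubits with eps = Delta = K = 1 (arbitrary energy units).
K = 1.0
hamiltonian = two_qubit_hamiltonian(1.0, 1.0, 1.0, 1.0, K)

inv_sqrt2 = 1.0 / math.sqrt(2.0)
singlet = [0.0, inv_sqrt2, -inv_sqrt2, 0.0]
h_on_singlet = matvec(hamiltonian, singlet)
```

For these parameters `h_on_singlet` equals `0.5 * K` times `singlet`: the single-qubit terms cancel on the antisymmetric state, and only the Ising term contributes.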
For two qubits, there are several ways to couple to the environment: Both qubits may couple to a common bath, such as picked up by coupling elements [6]. Local readout and control electronics coupling to individual qubits [6] can be described as coupling to two uncorrelated baths. In analogy to the procedure described above, one can determine the spectral functions of these baths by investigating the corresponding impedances. In the case of two uncorrelated baths, the full Hamiltonian reads

$$\hat H_{2qb}^{2b} = \hat H_{2qb} + \sum_{i=1,2} \frac{1}{2}\hat\sigma_z^{(i)} \hat X^{(i)} + \hat H_{B1} + \hat H_{B2}, \tag{15}$$

where $\hat X^{(i)} = \zeta\sum_\nu \lambda_\nu x_\nu$ are collective coordinates of the bath. In the case of two qubits coupling to one common bath, we model our two-qubit system in a similar way with the Hamiltonian

$$\hat H_{2qb}^{1b} = \hat H_{2qb} + \frac{1}{2}\left(\hat\sigma_z^{(1)} + \hat\sigma_z^{(2)}\right)\hat X + \hat H_B, \tag{16}$$

where $\hat X$ is a collective bath coordinate similar to above.

5.2
Rates
We can derive formulae for relaxation and dephasing rates similar to Eqs. (9) and (10). Our Hilbert space is now four-dimensional. We label the eigenstates as $|E_1\rangle \ldots |E_4\rangle$. We choose $|E_1\rangle$ to be the singlet state $(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)/\sqrt{2}$, which is always an eigenstate [25], whereas $|E_2\rangle \ldots |E_4\rangle$ are the energy eigenstates in the triplet subspace, which are typically not the eigenstates of $\hat\sigma_z^{(1)} + \hat\sigma_z^{(2)}$. As we have 4 levels, we have 6 independent possible quantum coherent oscillations, each of which has its own dephasing rate, as well as 4 relaxation channels, one of which has a vanishing rate indicating the existence of a stable thermal equilibrium point. The expressions for the rates, although of similar form as in Eqs. (9) and (10), are rather involved and are shown in [25].

Figure 4 displays the dependence of typical dephasing rates and the sum of all relaxation rates $\Gamma_R$ on temperature for the case $\Delta = \varepsilon = K = h\nu_S$ with $\nu_S = 1$ GHz. The rates are of the same magnitude for the case of one common bath and two distinct baths. If the temperature is increased above the roll-off point set by the intrinsic energy scales, $T_s = (h/k_B)\nu_s = 4.8\cdot 10^{-2}$ K, where $E_s/h = 1$ GHz, the increase of the dephasing and relaxation rates follows a linear dependence, indicating that the environmental fluctuations are predominantly thermal. As a notable exception, in the case of one common bath the dephasing rates $\Gamma_{\varphi 21} = \Gamma_{\varphi 12}$ go to zero when the temperature is decreased, while all other rates saturate for $T \to 0$. This can be understood as follows: the singlet state $|E_1\rangle$ is left invariant by the Hamiltonian of coupled qubits in a common bath, Eq. (16), i.e. it is an energy eigenstate left unaffected by the environment. Superpositions of the singlet with another eigenstate are usually still unstable, because the other eigenstate generally suffers from
[Figure 4: two log-log panels of rates (1/s) versus T/T_s; upper panel "1 bath", lower panel "2 baths"; curves Γ_R, Γ_φ21, Γ_φ31, Γ_φ32, Γ_φ41, Γ_φ42, Γ_φ43.]
Fig. 4. Log-log plot of the temperature dependence of the sum of the four relaxation rates and selected dephasing rates. Qubit parameters $K$, $\varepsilon$ and $\eta$ are all set to $E_s$ and the bath is assumed to be Ohmic with $\alpha = 10^{-3}$. The upper panel shows the case of one common bath, the lower panel the case of two distinct baths. At the characteristic temperature of approximately $0.1 \cdot T_s$ the rates increase very steeply
decoherence. However, the lowest-energy state of the triplet subspace, $|E_2\rangle$, cannot decay by spontaneous emission, and flip-less dephasing vanishes at $T = 0$; hence the dephasing rate between the eigenstates $|E_1\rangle$ and $|E_2\rangle$ vanishes at low temperatures, see Fig. 4. As shown in [25], there can be more “protected” transitions of this kind if the qubit parameters are adjusted such that the symmetry between the unperturbed qubit and the coupling to the bath is even higher, e.g. at the working point for a CPHASE operation.

5.3
Gate Performance
The rates derived in the previous section are numerous and depend strongly on the tunable parameters of the qubit. Thus, they do not yet allow a full assessment of the performance as a quantum logic element. A quantitative measure of how well a two-qubit setup performs a quantum logic gate operation is given by the gate quality factors introduced in [26]: the fidelity, purity, quantum degree and entanglement capability. These factors characterize the density matrices obtained after attempting to perform the gate operation in a hostile environment, starting from all possible initial conditions $\rho(0) = |\Psi_{in}^j\rangle\langle\Psi_{in}^j|$. To form all possible initial density matrices needed to calculate the gate quality factors, we use the 16 unentangled product states $|\Psi_{in}^j\rangle$, $j = 1, \ldots, 16$, defined [22] according to $|\Psi_a\rangle_1 |\Psi_b\rangle_2$ $(a, b = 1, \ldots, 4)$, with $|\Psi_1\rangle = |0\rangle$, $|\Psi_2\rangle = |1\rangle$, $|\Psi_3\rangle = (1/\sqrt{2})(|0\rangle + |1\rangle)$, and $|\Psi_4\rangle = (1/\sqrt{2})(|0\rangle + i|1\rangle)$.
They form one possible basis set for the superoperator $\nu_G$, which describes the open system dynamics such that $\rho(t_G) = \nu_G\rho(0)$ [22,26]. The CNOT gate is implemented using rectangular DC pulses, with dissipation described through the Bloch-Redfield equation as in [3,25].

The fidelity is defined as $F = \frac{1}{16}\sum_{j=1}^{16} \langle\Psi_{in}^j| U_G^{+} \rho_G^j U_G |\Psi_{in}^j\rangle$. The fidelity is a measure of how well a quantum logic operation was performed. Clearly, the fidelity for the ideal quantum gate operation is equal to 1. The second quantifier is the purity $P = \frac{1}{16}\sum_{j=1}^{16} \mathrm{tr}\left((\rho_G^j)^2\right)$, which should be 1 in a pure and 1/4 in a fully mixed state. The purity characterizes the effects of decoherence. The quantum degree measures nonlocality. It is defined as the maximum overlap of the resulting density matrix after the quantum gate operation with the maximally entangled Bell states, $Q = \max_{j,k} \langle\Psi_{me}^k|\rho_G^j|\Psi_{me}^k\rangle$. For an ideal entangling operation, e.g. the CNOT gate, the quantum degree should be 1. It has been shown [27] that all density operators that have an overlap with a maximally entangled state that is larger than the value 0.78 [22] violate the Clauser-Horne-Shimony-Holt (CHSH) inequality and are thus non-local. The entanglement capability $C$ is the smallest eigenvalue of the partially transposed density matrix for all possible unentangled input states $|\Psi_{in}^j\rangle$ (see below). It has been shown [28] to be negative for an entangled state. This quantifier should be $-0.5$, e.g. for the ideal CNOT, thus characterizing a maximally entangled final state.

In Fig. 5, the deviations due to decoherence of the gate quality factors from their ideal values are shown. Similar to most of the rates, all gate quality factors saturate at temperatures below a threshold set by the qubit energy scales. The deviations grow linearly at higher temperatures until they reach their theoretical maximum.
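The ideal values quoted above for the purity and the entanglement capability can be reproduced with a few lines of plain Python. The sketch below is not the gate simulation of [25]; it only evaluates $\mathrm{tr}(\rho^2)$ for a pure Bell state and for the fully mixed two-qubit state, and checks that the partial transpose of the Bell state $(|00\rangle + |11\rangle)/\sqrt{2}$ has the eigenvalue $-0.5$. Since the partially transposed matrix squares to one quarter of the identity, its eigenvalues can only be $\pm 0.5$, so $-0.5$ is indeed the smallest one. All helper names are illustrative.

```python
import math


def outer(vector):
    """Density matrix |v><v| for a real state vector."""
    return [[a * b for b in vector] for a in vector]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matvec(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def purity(rho):
    """tr(rho^2): 1 for a pure state, 1/4 for the fully mixed two-qubit state."""
    squared = matmul(rho, rho)
    return sum(squared[i][i] for i in range(len(rho)))


def partial_transpose_second(rho):
    """Transpose with respect to the second qubit; basis index = 2*a + b."""
    result = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    result[2 * a + b][2 * c + d] = rho[2 * a + d][2 * c + b]
    return result


inv_sqrt2 = 1.0 / math.sqrt(2.0)
bell = [inv_sqrt2, 0.0, 0.0, inv_sqrt2]        # (|00> + |11>)/sqrt(2)
rho_bell = outer(bell)
rho_mixed = [[0.25 if i == j else 0.0 for j in range(4)] for i in range(4)]

purity_pure = purity(rho_bell)
purity_mixed = purity(rho_mixed)

pt_bell = partial_transpose_second(rho_bell)
pt_squared = matmul(pt_bell, pt_bell)          # equals identity / 4

# Eigenvector of the partial transpose belonging to the negative eigenvalue.
singlet = [0.0, inv_sqrt2, -inv_sqrt2, 0.0]
pt_on_singlet = matvec(pt_bell, singlet)       # equals -0.5 * singlet
```

The same two helper functions, applied to the sixteen output density matrices $\rho_G^j$ of a simulated gate, would give the averaged purity and the entanglement capability defined above.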
Comparing the different coupling scenarios, we see that at low temperatures the purity and fidelity are higher for the case of one common bath, but if the temperature is increased above this threshold, fidelity and purity are approximately equal for both the case of one common and two distinct baths. This is related to the fact that in the case of one common bath all relaxation and dephasing rates vanish during the two-qubit step of the CNOT (see [25] for details), due to the special symmetries of the Hamiltonian, when the temperature goes to zero, as discussed above. Still, the quantum degree and the entanglement capability tend towards the same value for both the case of one common and two distinct baths. This is due to the fact that both quantum degree and entanglement capability are, unlike fidelity and purity, not defined as mean values but rather characterize the “best” possible case of all given input states.

In the recent work by Thorwart and Hänggi [22], the CNOT gate was investigated for a $\hat\sigma_y^{(i)} \otimes \hat\sigma_y^{(j)}$ coupling scheme and one common bath. They find a pronounced degradation of the gate performance with gate quality factors only weakly depending on temperature. If we set the dissipation and the intrinsic energy scale to the same values as in their work, we also observe only a weak decrease of the gate quality factors for both the case of one common bath and two distinct baths in the same temperature range discussed by Thorwart and Hänggi. However, see Fig. 5, overall we find substantially better values. This is due to the fact that for $\hat\sigma_y \otimes \hat\sigma_y$ coupling, the Hamiltonian does not commute with the coupling to the bath during the two-qubit steps of the pulse sequence, i.e. the symmetries of the coupling to the bath and the inter-qubit coupling are not compatible. The dotted line in Fig. 5 shows that already at comparatively high temperature, about 20 qubit energies, a quantum degree larger than $Q \approx 0.78$ can be achieved. Only then is the Clauser-Horne-Shimony-Holt inequality violated and do non-local correlations between the qubits occur, as described in [22]. Thus, even under rather modest requirements on the experimental setup, which seem to be feasible with present-day technology, it appears to be possible to demonstrate nonlocality and entanglement between superconducting flux qubits.

[Figure 5: four log-log panels showing 1-F, 1-P, 1-Q and |-0.5-C| versus T/T_s; curves for "2 baths" and "1 bath".]
Fig. 5. Log-log plot of the temperature dependence of the deviations of the four gate quantifiers from their ideal values. Here the temperature is varied from $\approx 0$ to $2 \cdot E_s$. In all cases $\alpha = \alpha_1 = \alpha_2 = 10^{-3}$. The dotted line indicates the upper bound set by the Clauser-Horne-Shimony-Holt inequality
6
Summary
We have outlined how one can model the decoherence induced by an electromagnetic environment inductively coupled to a superconducting flux qubit. We have exemplified a procedure based on analyzing the classical friction induced by the environment for the specific case of the read-out SQUID. It is shown that the SQUID can be effectively decoupled from the qubit if no bias current is applied. The effect of the decoherence on relaxation and dephasing rates of single qubits has been discussed, as well as the gate performance of coupled qubits. We have shown that by carefully engineering the impedance and the symmetry of the coupling, one can reach excellent gate quality which complies with the demands of quantum computation.

Acknowledgements
We would like to thank M. Governale, T. Robinson, and M. Thorwart for discussions. FKW and MJS acknowledge support from ARO under contract No. P-43385-PH-QC.
References
1. see e.g. D. Bouwmeester, A.K. Ekert, and A. Zeilinger, The Physics of Quantum Information (Springer, Berlin, Heidelberg, 2000).
2. D. DiVincenzo, Science 270, 255 (1995).
3. Yu. Makhlin, G. Schön, and A. Shnirman, Rev. Mod. Phys. 73, 357 (2001).
4. I. Chiorescu, Y. Nakamura, C.J.P.M. Harmans, and J.E. Mooij, Science 299, 1869 (2003).
5. Yu.A. Pashkin, T. Yamamoto, O. Astafiev, Y. Nakamura, D.V. Averin, and J.S. Tsai, Nature 421, 823 (2003).
6. J.E. Mooij, T.P. Orlando, L. Levitov, L. Tian, C.H. van der Wal, and S. Lloyd, Science 285, 1036 (1999); T.P. Orlando, J.E. Mooij, L. Tian, C.H. van der Wal, L.S. Levitov, S. Lloyd, and J.J. Mazo, Phys. Rev. B 60, 15398 (1999).
7. L. Tian, L.S. Levitov, C.H. van der Wal, J.E. Mooij, T.P. Orlando, S. Lloyd, C.J.P.M. Harmans, and J.J. Mazo in I. Kulik and R. Elliatiogly, Quantum Mesoscopic Phenomena and Mesoscopic Devices in Microelectronics (Kluwer, Dordrecht, 2000), 429.
8. Yu. Makhlin et al., this volume.
9. C.H. van der Wal, A.C.J. ter Haar, F.K. Wilhelm, R.N. Schouten, C.J.P.M. Harmans, T.P. Orlando, S. Lloyd, and J.E. Mooij, Science 290, 773 (2000).
10. M. Grifoni, E. Paladino, and U. Weiss, Eur. Phys. J. B 10, 719 (1999).
11. A.J. Leggett, S. Chakravarty, A.T. Dorsey, M.P.A. Fisher, A. Garg, and W. Zwerger, Rev. Mod. Phys. 59, 1 (1987).
12. C.H. van der Wal, F.K. Wilhelm, C.J.P.M. Harmans, and J.E. Mooij, Eur. Phys. J. B 31, 111 (2003).
13. A.J. Leggett, Phys. Rev. B 30, 1208 (1984).
14. U. Weiss, Quantum Dissipative Systems (World Scientific, Singapore, ed. 2, 1999).
15. A. Abragam, Principles of Nuclear Magnetism (Oxford University Press, Oxford, 1961).
16. F.K. Wilhelm, submitted to Phys. Rev. B.
17. S. Kleff, S. Kehrein, and J. von Delft, to appear in Physica E.
18. Yu. Makhlin, G. Schön, and A. Shnirman, Physica C 368, 276 (2002).
19. P. Joyez, D. Vion, M. Götz, M.H. Devoret, and D. Esteve, J. Supercond. 12, 757 (1999).
20. W.T. Coffey, Y.P. Kalmykov, and J.T. Waldron, The Langevin Equation; with Applications in Chemistry and Electrical Engineering (World Scientific, Singapore, 1996).
21. M. Governale, M. Grifoni, and G. Schön, Chem. Phys. 268, 273 (2001).
22. M. Thorwart and P. Hänggi, Phys. Rev. A 65, 012309 (2002).
23. J.B. Majer, priv. comm. (2002).
24. Yu.A. Pashkin, T. Yamamoto, O. Astafiev, Y. Nakamura, D.V. Averin, and J.S. Tsai, Nature 421, 823 (2003).
25. M.J. Storcz and F.K. Wilhelm, Phys. Rev. A 67, 042319 (2003).
26. J.F. Poyatos, J.I. Cirac, and P. Zoller, Phys. Rev. Lett. 78, 390 (1997).
27. C.H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J.A. Smolin, and W.K. Wootters, Phys. Rev. Lett. 76, 722 (1996).
28. A. Peres, Phys. Rev. Lett. 77, 1413 (1996).
The Electron Theory of Magnetism in Monatomic Nanowires
Matej Komelj¹, Claude Ederer², and Manfred Fähnle²
¹ Jožef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia
² Max-Planck-Institut für Metallforschung, Heisenbergstr. 3, D-70569 Stuttgart, Germany
Abstract. The prospects for a reliable determination of the measurable magnetic properties of 3d-transition-metal nanowires are discussed. The emphasis is on the spin and orbital magnetic moments, obtained from the X-ray-magnetic-circular-dichroism spectra via the sum rules, and on the magnetic anisotropy. A critical test of the sum rules, applied to the monatomic nanowires, is presented. In particular, the importance of the magnetic dipole term is demonstrated. It is shown that the orbital-correlation effects influence the magnetism in one-dimensional systems.
1
Introduction
Nanotechnology makes it possible to investigate systems that consist of a relatively small number of atoms and therefore have a reduced dimensionality. Of particular interest are the phenomena that used to be solely academic but have emerged as realistic problems since it became possible to manipulate individual atoms in some instances. One of the fundamental questions is the presence of magnetic ordering in quasi-one-dimensional systems. Twenty years ago, Weinert and Freeman [1] predicted the existence of ferromagnetic ground states in linear chains of Fe or Ni atoms by means of an ab-initio calculation. Most of the relevant experiments were carried out either on metal atoms embedded in an insulating material or on metallic monolayer nano-stripes on vicinal surfaces of W [2,3] or Cu [4] with a stripe width down to 1–10 nm. Only recently, Gambardella et al. [5] succeeded in preparing real monatomic nanowires by growing Co on a high-purity Pt(997) vicinal surface. The results of the X-ray magnetic circular dichroism (XMCD) experiments [6] have indeed provided evidence that one-dimensional 3d-metal chains can sustain long-range ferromagnetic order on the time-scale of the experiment. As expected, due to the reduced symmetry compared to bulk or two-dimensional films, the measured orbital magnetic moments in nanowires are the largest ever found in 3d-transition-metal systems. Similar evidence has also been found for the magnetic anisotropy energy, obtained from the shape of the magnetization curve by applying a simple model of exchange-coupled superparamagnetic clusters. However, the applied experimental methods are, to some extent, uncertain, and an interpretation of the results based on the electron theory is highly desirable.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 781–788, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
XMCD Sum Rules
The essence of XMCD spectroscopy [7] is a distinction between the absorption of X-rays with different polarizations. From the absorption coefficients $\mu_+(\epsilon)$, $\mu_-(\epsilon)$, $\mu_0(\epsilon)$ for circularly-right, circularly-left and z-axis polarized light, it is possible to determine the orbital magnetic moment $m_l = -\mu_B\langle l_z\rangle$ and the spin magnetic moment $m_s = -\mu_B\langle\sigma_z\rangle$ by applying the sum rules [8,9]. When discussing the $L_2$ and $L_3$ absorption edges in the 3d-transition metals, which are mainly determined by the electronic transitions from the 2p core states to the valence 3d states, the sum rules read as:

$$\langle l_z\rangle = \frac{2 I_m N_h}{I_t}, \tag{1}$$
$$\langle\sigma_z\rangle = \frac{3 I_s N_h}{I_t} - 7\langle T_z\rangle, \tag{2}$$
$$I_m = \int \left[(\mu_c)_{L_3} + (\mu_c)_{L_2}\right] d\epsilon, \tag{3}$$
$$I_s = \int \left[(\mu_c)_{L_3} - 2(\mu_c)_{L_2}\right] d\epsilon, \tag{4}$$
$$I_t = \int \left[(\mu_t)_{L_3} + (\mu_t)_{L_2}\right] d\epsilon, \tag{5}$$

with the XMCD signal $\mu_c = \mu_+ - \mu_-$ and the total absorption coefficient $\mu_t = \mu_+ + \mu_- + \mu_0$. There are some problems connected with the application of the sum rules. The exact number of holes $N_h$ in a d band is not directly measurable. The expectation value of the magnetic dipolar operator,

$$\hat T_z = \frac{1}{2}\left[\sigma - 3\hat r(\hat r\cdot\sigma)\right]_z, \tag{6}$$

where $\sigma$ denotes the vector of the Pauli matrices, can be determined experimentally [10] only when the effects of spin-orbit coupling (SOC) are small, which is presumably not the case for one-dimensional nanowires (see below and [11]). In principle, the integrations in Eqs. (3)–(5) have to be performed over the energy interval corresponding to the electron 2p → 3d transitions. This cannot be easily realized in an experiment, and it is usually supposed that the other contributions are negligible. Above all, the sum rules are derived for the non-hybridized case, which is only approximately true for homogeneous bulk materials. There are also other approximations applied in the derivation of Eqs. (1) and (2), and the validity of these additional approximations is also not guaranteed, although it is known that for bulk materials the XMCD analysis holds quite well. The sum rules seem to be less well fulfilled at surfaces [12] and for atoms close to an interface in multilayers [13]. The question is whether the sum rules are still valid when going to systems with decreased dimensionality and hence reduced symmetry. In spite of the described difficulties, the sum rule (1) was used for the Co nanowires [6], yielding an orbital moment of $0.68 \pm 0.05\,\mu_B$, which is the largest value measured in a 3d-transition metal.
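To make the role of the individual quantities in Eqs. (1) and (2) explicit, the following short Python sketch applies the two sum rules to already integrated spectra. The function name and all numerical inputs are hypothetical and serve only to show how an error in $N_h$ or a neglected $\langle T_z\rangle$ propagates into the extracted moments; they are not experimental or calculated values.

```python
def xmcd_sum_rules(i_m, i_s, i_t, n_holes, t_z=0.0):
    """Apply Eqs. (1) and (2).

    i_m, i_s, i_t : energy integrals of Eqs. (3)-(5)
    n_holes       : number of holes N_h in the d band
    t_z           : expectation value <T_z> of the magnetic dipole operator
    Returns (<l_z>, <sigma_z>); the moments follow as m = -mu_B * <...>.
    """
    l_z = 2.0 * i_m * n_holes / i_t
    sigma_z = 3.0 * i_s * n_holes / i_t - 7.0 * t_z
    return l_z, sigma_z


# Hypothetical integrated intensities in arbitrary (but common) units.
I_M, I_S, I_T = -0.1, -0.5, 3.0
N_H = 2.5

l_z_value, sigma_z_without_tz = xmcd_sum_rules(I_M, I_S, I_T, N_H)
_, sigma_z_with_tz = xmcd_sum_rules(I_M, I_S, I_T, N_H, t_z=0.02)

# Neglecting <T_z> shifts <sigma_z> by exactly 7 <T_z>.
tz_shift = sigma_z_without_tz - sigma_z_with_tz
```

Because $\langle T_z\rangle$ enters with a prefactor of 7, even a small dipole term changes the spin moment obtained from Eq. (2) noticeably, while both results scale linearly with the assumed $N_h$.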
3
Magnetic Anisotropy Energy
Besides the magnetic moments, the magnetic anisotropy energy ($e_{MAE}$) is one of the important characteristic quantities of magnetic materials. It is defined as the difference in the total energy for two different magnetization directions, and arises mainly due to the spin-orbit coupling. The energy is a minimum when the magnetization is along the easy axis, and a maximum when the magnetization is along the hard axis. The distinguished directions are not necessarily limited to particular axes but can extend to whole planes. As proposed by Hołyst et al. [14], a special double anisotropy, consisting of an easy plane and a uniaxial anisotropy with the easy axis within that plane, would lead to very peculiar properties of kink solitons in a one-dimensional ferromagnet with Gilbert damping. Magnetic solitons were measured by neutron scattering in one-dimensional insulating systems [15], but only future experiments on metals, where the damping is certainly larger, may prove the theoretically predicted complex motion of the kinks. The magnetic anisotropy can possibly be controlled by the chemical composition and the structure of the substrate. A promising candidate for the soliton experiments is the Co nanowire on the Pt(997) vicinal surface. The $e_{MAE}$ was determined [6] from the shape of the magnetization curve above the blocking temperature, 15 ± 5 K, by applying a simple model of exchange-coupled superparamagnetic clusters. The obtained $e_{MAE} = 2 \pm 0.2$ meV per atom, with the easy axis perpendicular to the wire, is far larger than any other measured anisotropy energy in transition metals. Although such an enhancement is expected due to the reduced dimensionality and coordination number, the model used might be an oversimplification (see, for example, [4]). It is based on the assumption that the wire consists of clusters, described by a one-dimensional nearest-neighbor-exchange model. The individual clusters are considered as decoupled from each other. This would mean that the exchange interaction is negligible when compared to the anisotropy energy, which is certainly not the case, even for monatomic wires.
4
Ab-initio Calculation
An alternative approach for investigating the magnetism in nanowires is a calculation within the framework of density functional theory. This has often proved to be a reliable tool for the theoretical determination of various physical quantities. We used the WIEN97 code [16], which adopts the full-potential linearized-augmented-plane-wave (FLAPW) method [17] with the implemented SO coupling and the tools for the calculation of the magneto-optical effects and XMCD spectra [18,19]. For the exchange-correlation potential, the local-spin-density approximation (LSDA) [20] was applied. The calculations were performed for free-standing and Pt-supported Co [21] and Fe [22] nanowires with the nearest-neighbor distance fixed to the fcc Pt nearest-neighbor distance (2.77 Å). The vicinal Pt(997) surface with the wire at the steps was modeled using the supercell shown in Fig. 1, with two additional vacuum layers on the top. The supercell of the free-standing wire was constructed by removing the Pt atoms from this supercell. Test calculations showed that the results change only very slightly when going to larger supercells. The direction of the magnetization was chosen to be perpendicular to the Pt surface for the calculation of the XMCD-related quantities. First, we calculated the magnetic moments directly from the wave functions. The results are given in Table 1. There are no experimental data available for the spin magnetic moments, but the calculated moment ms = 2.06 µB agrees well with the previously reported theoretical value ms = 2.08 µB [23] for the Pt-supported Co nanowire. However, the calculated orbital magnetic moment ml = 0.14 µB is drastically smaller than the measured ml = 0.68 ± 0.05 µB. As discussed in [24], the reason is that the LSDA+SO coupling does not appropriately account for the orbital correlations. Therefore, we included the orbital-polarization term [25,26], which describes the Hund's rule intra-atomic couplings.
This term induces a substantial enhancement of the calculated orbital magnetic moment to ml = 0.92 µB, which clearly demonstrates the importance of correlation effects in monatomic nanowires. The orbital polarization affects mainly the orbital moments, while some minor corrections of the spin moments are observed only for the Fe wires. The advantage of the ab-initio calculations over the experiments is the possibility of comparing the directly calculated magnetic moments with those calculated by applying the sum rules (1), (2). The absorption coefficients µ+, µ− and µ0, appearing in Eqs. (3), (4), were calculated by using Fermi's golden rule in the non-relativistic dipole approximation [12], which requires the evaluation of the matrix elements for the operator p̂ · e with e = e−, e+ or e0 denoting the polarization vector of the light. The calculated
Fig. 1. The supercell used to model the monatomic wire on a vicinal Pt(997) surface
Table 1. The calculated XMCD-spectroscopy-related results using LSDA (upper part) and LSDA + orbital-polarization term (lower part)

                            Co on Pt   free Co   Fe on Pt   free Fe
LSDA
lz direct                     0.14      0.40       0.31      0.27
lz from eq. (2)               0.16      0.59       0.26      0.18
σz direct                     2.06      2.16       3.24      3.13
σz from eq. (2)               2.13      2.36       3.19      3.20
σz from eq. (2), Tz = 0       1.94      2.26       2.74      2.78
Tz                           -0.027    -0.014     -0.065    -0.060
LSDA + orbital-polarization term
lz direct                     0.92      2.31       0.72      1.76
lz from eq. (2)               0.92      2.17       0.62      1.33
σz direct                     2.06      2.16       3.18      3.12
σz from eq. (2)               2.13      2.08       3.17      3.26
σz from eq. (2), Tz = 0       1.94      2.26       2.60      2.17
Tz                           -0.027     0.026     -0.083    -0.155

Fig. 2. The calculated XMCD signal for Pt-supported Co (upper graph) and Fe (lower graph) nanowires with (solid lines) and without orbital-polarization term. Horizontal axis: E − EF + E(2p3/2) [eV]; vertical axis: arbitrary units
XMCD spectra µ+ − µ−, shown in Fig. 2, demonstrate a considerable influence of the orbital polarization. The integrations in Eqs. (3), (4) were performed over the energy interval corresponding to the unoccupied part of the 3d-valence band. This interval is defined by the Fermi energy and the cut-off energy, obtained from the density of states. The integral of the density of states over this interval yields the number of holes Nh, also needed for the calculation of the magnetic moments via the sum rules (1) and (2). The agreement between the orbital magnetic moments calculated directly and by using (1) is nearly perfect for the Pt-supported Co and within about 15% for the Fe wire. Although the deviation is larger in the case of the rather unrealistic free-standing wires, the XMCD spectroscopy could indeed be a reliable
tool for measurements of the orbital magnetic moments in one-dimensional systems. The validity of the spin sum rule (2) is certainly worse if the Tz term is neglected. This may cause difficulties when the sum rule is applied to experimental results. Stöhr and König [10] developed a method for determining the dipole term by angular-dependent XMCD spectroscopy for systems where the effect of SOC is small. In this case Tx ≈ Ty and Tz ≈ −2Tx for a one-dimensional wire along the z-axis [11]. Therefore the magnetic dipole term cancels when averaging the absorption spectra obtained with the magnetization parallel to the three Cartesian axes, and the angle-averaged sum rule (2) then allows a determination of the pure spin moment. Following Wu et al. [27], in Fig. 3 we present the energy distributions of Tz(E) and −2Tx(E) for the free-standing Co wire. It is obvious that the influence of the spin-orbit coupling is drastically enhanced when taking into account the orbital-polarization term, and the elimination of Tz by the angular-dependent XMCD spectroscopy is therefore not reliable in this case. Theoretical investigations of the magnetic anisotropy in 3d-transition-metal nanowires by means of ab-initio calculations have been intensive. However, the problem is not yet solved because there remains some disagreement in the literature. Dorantes-Dávila and Pastor [28] performed a calculation for free-standing and Pd-supported Co and Fe wires of different lengths. They found that the easy axis was along the wire for the free-standing and perpendicular to the wire for the Pd-supported infinite Co wires. The along-the-wire easy axis was also found for the free-standing infinite Fe wires. It was clearly demonstrated that the magnetic anisotropy could be modified by varying the band filling, suggesting that by using an appropriate chemical environment, the desired type of anisotropy could be realized.
More recently, Hong and Wu [29] investigated the anisotropy in Co wires. For a wire on a Pt (001) surface they found the easy axis in the plane of the substrate, in the direction
Fig. 3. The energy distributions Tz(E) (dashed lines) and −2Tx(E) (full lines) for a free-standing monatomic Co wire. The calculations were performed using the LSDA without SOC (upper part), with SOC (middle part) and with SOC and orbital-polarization term (lower part). Horizontal axis: E − EF [eV]; vertical axis: arbitrary units
perpendicular to the wire. For comparison, the experimentally determined [6] easy axis for a Co wire on the Pt (997) vicinal surface was also perpendicular to the wire, but pointing out of the substrate plane at an angle of about 43° to the (111) normal. It has to be mentioned that the experiment is on a wire at a step (which changes the symmetry) whereas the theory is for a wire on a plane surface. In the case of the Cu (001) substrate Hong and Wu obtained the easy axis along the wire. However, for the free-standing wires they found the easy axis perpendicular to the wire, in contrast to the result from [28]. A possible reason for the disagreement is the difference between the computational methods. Dorantes-Dávila and Pastor [28] used a tight-binding Hubbard Hamiltonian in the unrestricted Hartree-Fock approximation, while Hong and Wu [29] applied the FLAPW method with the generalized-gradient approximation (GGA) for the exchange-correlation potential. By using the embedded-cluster technique within multiple-scattering theory Lazarovits et al. [30] calculated the magnetic anisotropy of a Co wire on a plane Pt (111) substrate and found the easy axis perpendicular to the substrate plane. Our preliminary FLAPW calculations based on the local-spin-density approximation for free-standing Fe, Co and Ni nanowires yield the easy axis along the wires, in agreement with the pioneering work by Dorantes-Dávila and Pastor [28]. It is clear that there are still many open questions regarding the magnetic anisotropy in monatomic nanowires, and it may turn out that the correlation effects need to be included in the calculations in order to reproduce the experimental results.
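As a numerical check on the sum-rule comparison discussed earlier in this section, the following minimal Python sketch computes the relative deviation between the directly calculated orbital moments and those obtained from the sum rule, using the values in the lower part of Table 1 (LSDA + orbital-polarization term). The names `orbital_moments` and `relative_deviation` are illustrative only and are not part of the original calculations.

```python
# Orbital moments (in units of mu_B) from the lower part of Table 1
# (LSDA + orbital-polarization term): (direct value, sum-rule value).
orbital_moments = {
    "Co on Pt": (0.92, 0.92),
    "free Co": (2.31, 2.17),
    "Fe on Pt": (0.72, 0.62),
    "free Fe": (1.76, 1.33),
}


def relative_deviation(direct, sum_rule):
    """Relative deviation of the sum-rule value from the direct value."""
    return abs(direct - sum_rule) / direct


# Deviation for each system, keyed by the column label of Table 1.
deviations = {
    name: relative_deviation(direct, sum_rule)
    for name, (direct, sum_rule) in orbital_moments.items()
}

for name, dev in deviations.items():
    print(f"{name:9s} deviation = {100 * dev:.1f} %")
```

With these numbers the deviation vanishes for Pt-supported Co and is about 14 % for Pt-supported Fe, consistent with the statement in the text, while the free-standing wires deviate more strongly.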
5
Conclusion
In the near future more experimental and theoretical efforts will probably be devoted to magnetism in monatomic nanowires. It is expected that the experimentalists will succeed in synthesizing wires of transition metals other than Co on various substrates and that different techniques will be applied for the magnetic measurements. On the other hand, the present status of the electron theory allows the modeling of almost any artificial nanowire. But experience with the orbital moments of Co nanowires and with the problem of magnetic anisotropy implies that correlation effects may be important for one-dimensional systems and that methods beyond the usual LSDA need to be applied.
References
1. M. Weinert, A.J. Freeman: J. Magn. Magn. Mater. 38, 23 (1983).
2. H.J. Elmers, J. Hauschild, H. Höche, U. Gradmann, H. Bethge, D. Heuer, U. Köhler: Phys. Rev. Lett. 73, 898 (1994).
3. J. Hauschild, H.J. Elmers, U. Gradmann: Phys. Rev. B 57, R677 (1998).
4. J. Shen, R. Skomski, M. Klaua, H. Jenniches, S. Sundar Manoharan, J. Kirschner: Phys. Rev. B 56, 2340 (1997).
5. P. Gambardella, M. Blanc, L. Bürgi, K. Kuhnke, K. Kern: Surf. Sci. 449, 93 (2000).
6. P. Gambardella, A. Dallmeyer, K. Maiti, M.C. Malagoli, W. Eberhardt, K. Kern, C. Carbone: Nature (London) 416, 301 (2002).
7. G. Schütz, W. Wagner, W. Wilhelm, P. Kienle, R. Zeller, R. Frahm, G. Materlik: Phys. Rev. Lett. 58, 737 (1987).
8. B.T. Thole, P. Carra, F. Sette, G. van der Laan: Phys. Rev. Lett. 68, 1943 (1992).
9. P. Carra, B.T. Thole, M. Altarelli, X. Wang: Phys. Rev. Lett. 70, 694 (1993).
10. J. Stöhr, H. König: Phys. Rev. Lett. 75, 3748 (1995).
11. C. Ederer, M. Komelj, M. Fähnle: accepted for publication in J. Elec. Spec.
12. R. Wu, A.J. Freeman: J. Magn. Magn. Mater. 132, 103 (1994).
13. C. Ederer, M. Komelj, M. Fähnle, G. Schütz: Phys. Rev. B 66, 94413 (2002).
14. J.A. Hołyst, A. Sukiennicki, J.J. Żebrowski: Phys. Rev. B 33, 3492 (1986).
15. H.-J. Mikeska, M. Steiner: Adv. Phys. 40, 191 (1991).
16. P. Blaha, K. Schwarz, P. Sorantin, S.B. Trickey: Comput. Phys. Commun. 59, 399 (1990).
17. E. Wimmer, H. Krakauer, M. Weinert, A.J. Freeman: Phys. Rev. B 24, 864 (1981).
18. J. Kuneš, P.M. Oppeneer, H.-C. Mertins, F. Schäfers, A. Gaupp, W. Gudat, P. Novák: Phys. Rev. B 64, 1744417 (2001).
19. J. Kuneš, P. Novák, M. Diviš, P.M. Oppeneer: Phys. Rev. B 63, 205111 (2001).
20. J.P. Perdew, Y. Wang: Phys. Rev. B 45, 13244 (1992).
21. M. Komelj, C. Ederer, J.W. Davenport, M. Fähnle: Phys. Rev. B 66, 140407 (2002).
22. C. Ederer, M. Komelj, M. Fähnle: to be published.
23. G. Bihlmayer, X. Nie, S. Blügel: unpublished, referred to in [6].
24. L. Zhou, D. Wang, Y. Kawazoe: Phys. Rev. B 60, 9545 (1999).
25. O. Eriksson, M.S.S. Brooks, B. Johansson: Phys. Rev. B 41, 7311 (1990).
26. C.O. Rodriguez, M.V. Ganduglia-Pirovano, E.L. Peltzer y Blancá, M. Petersen, P. Novák: Phys. Rev. B 63, 184413 (2001).
27. R. Wu, D. Wang, A.J. Freeman: J. Appl. Phys. 75, 5802 (1994).
28. J. Dorantes-Dávila, G.M. Pastor: Phys. Rev. Lett. 81, 211 (1998).
29. J. Hong, R.Q. Wu: Phys. Rev. B 67, 20406 (2003).
30. B. Lazarovits, L. Szunyogh, P. Weinberger: Phys. Rev. B 67, 24415 (2003).
Magnetic Liquid Patterns in Space and Time
Reinhard Richter1, Bert Reimann1, Adrian Lange2, Peter Rupp1, and Alexander Rothert1
1 Experimentalphysik V, Universität Bayreuth, D-95440 Bayreuth, Germany
2 Institut für Theoretische Physik, Universität Magdeburg, D-39016 Magdeburg, Germany
Abstract. Macroscopic surface patterns of magnetic fluids are experimentally investigated for four different configurations of the liquid layer and of the orientation, homogeneity and temporal evolution of the magnetic field. Firstly, the formation of surface undulations after a pulse-like application of the magnetic induction is examined. The wavenumbers measured for different magnetic inductions are compared with the wavenumber of maximal growth predicted by linear stability analysis for the Rosensweig instability. Secondly, the formation of twin-peak patterns at the magnetic Faraday instability in an annular trough is reported. Thirdly, a ring of magnetic liquid spikes in a gradient magnetic field is periodically excited by an alternating magnetic field. The transition to spatio-temporal intermittency found in this way is characterized by power laws and their critical exponents. Finally, we record the pinch-off of a magnetic liquid bridge with a high-speed camera. The temporal evolution of the neck radius is compared with results obtained theoretically via universal scaling functions.
1
Introduction
Molecules of liquids have in general an electric and a magnetic dipole moment. Whereas these moments are of paramount importance for the chemical properties of the fluid, they are generically too tiny to have a visible influence on its fluid dynamical behaviour. This is different for liquids with artificially amplified magnetic moments, which have been synthesized for technical applications since the early 1960s [1]. These magnetic fluids (MFs) are colloidal suspensions of magnetic particles (e.g. of magnetite) in a carrier fluid like water or oil. The particles are protected from agglomeration due to van der Waals and magnetic forces by a layer of surfactants, or by ionic charges. The diameter of the suspended magnetic particles is in the range of 10 nm. Due to Brownian motion the suspensions are then kept in a thermodynamically stable state. Their magnetic susceptibility ranges up to χ = 10, which is much larger than the largest susceptibility of a natural fluid, liquid oxygen, which has a χ of only 10⁻⁵. Because of their huge susceptibility MFs are said to show super-paramagnetic behaviour. MFs offer the unique advantage that their behaviour can be influenced or even controlled via an external magnetic field. For example, they are drawn into an inhomogeneous magnetic field and thus can withstand gravity or forces due to a pressure drop. In this way they have found widespread technical applications, ranging from rotary feedthroughs in hard disc drives to loudspeakers (see [2]). From a scientific point of view MFs are attracting more and more interest, essentially for three reasons. Firstly, they serve as test liquids for electric and magnetic interactions, which are present also in common fluids. However, because these interactions are difficult to measure, they have not been taken into account in standard theory. They are presently being incorporated in a theory of electromagnetic fields in liquids, see e.g. [3,4,5]. Secondly, because the magnetic particles are rather large in comparison with atoms or even molecules, their suspension displays some granular character. Here the reversible formation of chains is one of several interesting subjects [6]. Thirdly, the surface of MFs reacts to impressed magnetic fields by the spontaneous formation of macroscopic patterns. For static magnetic fields these structures are stable without any dissipation of energy. This is an important difference from the celebrated paradigms of pattern formation, like Rayleigh-Bénard convection, where the dissipation of energy is essential [7]. However, by modulation of the external field, dissipation and interesting spatio-temporal dynamics can be introduced in a controlled way. In this article we will present some examples of surface structures in magnetic liquids, following a course from conservative to dissipative pattern forming systems. We start with the (static) Rosensweig instability, investigate its response to pulse-like and periodic modulation of parameters, observe spatio-temporal intermittency, and finally the disintegration of magnetic liquid bridges.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 789–799, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
The Rosensweig Instability
In 1967 Cowley and Rosensweig first investigated the influence of a homogeneous magnetic field on a horizontally extended layer of magnetic fluid. When surpassing a critical value Bc of the magnetic induction, they observed a sudden transition between the flat surface and a hexagonal pattern of liquid crests [8,1]. Figure 1 gives a surface profile of this pattern obtained by means of radioscopy [9]. A basic understanding of this instability can best be achieved via the dispersion relation for a semi-infinitely extended layer of inviscid MF [8,10]
\[ \omega^2 = g k - \frac{\mu (\mu_r - 1)^2}{\mu_r + 1}\,\frac{1}{\rho}\, H^2 k^2 + \frac{\sigma}{\rho}\, k^3 . \tag{1} \]
Here ω denotes the frequency, k the wave number, H the strength of the external magnetic field, µr the relative magnetic permeability, µ = µr µ0 the magnetic permeability, and µ0 the magnetic field constant. Moreover, ρ stands for the density and σ for the surface tension of the MF. Whereas the first and third term at the r.h.s. are due to gravity and capillary waves, respectively, which are common to all fluids, the second term is specific for
Fig. 1. Surface profile of the Rosensweig pattern at a magnetic induction of B = 22.95 mT (a) in a hexagonal shaped Teflon container of depth 4 mm for the MF EMG 909 from Ferrotec. Figure (b) displays a zoom of the center of the structure
MFs. By increasing H the dispersion relation can be tuned. At H ≈ 0.93 Hc Eq. (1) becomes non-monotonic, and at $H_c = [2 (\mu_r + 1)/(\mu (\mu_r - 1)^2)\, \sqrt{\rho g \sigma}]^{1/2}$ the curve touches the line ω = 0 at the critical wave number $k_c = \sqrt{\rho g/\sigma}$. For H > Hc Eq. (1) yields negative values of ω² for a band of wavenumbers around kc. Here Im(ω) > 0 and small disturbances from the basic state, which are proportional to exp[−i(ωt − kr)], begin to grow. Importantly, with growing amplitude one leaves the range of validity of linear stability analysis. The emerging stable hexagonal pattern and an unstable square pattern have been characterized in the vicinity of the bifurcation point by means of an energy minimization principle [11]. Figure 2 gives a scheme of the subcritical bifurcation diagram obtained in this way. In a recent and improved theoretical treatment a further unstable branch of liquid ridges has been predicted in the neighbourhood of the bifurcation point [12]. Most experimentalists followed the pioneers [8] by varying the magnetic induction in a quasi-static manner. By moving along the hysteretic course sketched in Fig. 2 a, they focused on the nature of the stable pattern in the nonlinear regime [13,14]. The wavenumber observed in this way was
Fig. 2. Scheme of the bifurcation diagram in the vicinity of the bifurcation point Bc, as predicted by [11] and [12]. Figure (a) illustrates an adiabatic change of the control parameter B, (b) the consequences of a jump-like increase
found to be independent of the magnetic induction [8,13]. Unfortunately this result has been compared with the predictions of linear theory [15]. However, the final stable pattern, resulting from nonlinear interactions, does not generally correspond to the most unstable linear pattern. For a meaningful comparison with the predictions of linear theory [17], crests of small amplitude that are not yet fully developed are suitable. Due to the subcriticality of the prevailing bifurcation these are only accessible via a jump-like increase of the magnetic induction, as sketched in Fig. 2 b. The emerging transient magnetic liquid ridges of appropriately small amplitude are displayed in the insets (a) and (c) of Fig. 3. After these circular transient structures a stable hexagonal pattern of Rosensweig crests evolves, as shown by the insets (b) and (d). Next we focus on the quantitative experimental results displayed in Fig. 3, where the wavenumber k is plotted versus the magnetic induction B. Each open square denotes the wavenumber extracted from a picture taken 180 ms after the jump-like increase of the induction to B > Bc. The solid line gives the prediction of linear stability analysis, taking into account the viscosity and the finite thickness of the fluid layer. Here the magnetic permeability µr has been used as the only fit parameter. The fitted value for µr differs by 2.8 % from the value given by the manufacturer. The almost linear increase both in experiment and theory is the main result. The slope is predicted to depend on the viscosity of the MF, approaching zero only in the limit of
Fig. 3. Plot of the wavenumber k (m⁻¹) versus the magnetic induction B (mT) measured in a circular container of 120 mm diameter and 2 mm depth. The open squares give the experimental values extracted from the circular deformations, examples of which are given in the insets (a) and (c). The solid line displays the theoretical results for the material parameters of the MF EMG 909 using µr ≈ 1.85 as a fit parameter. The open circles denote the wavenumber of the final hexagonal patterns, examples of which are shown in (b) and (d), calculated via a fit of the central hexagonal structure. Figure taken from [16]
infinite viscosity [17]. In contrast to this linear wavenumber dependence we find a constant behaviour for the wavenumber of the final hexagonal pattern, which is marked by the open circles. This measured constant value confirms the qualitative observations for the final pattern in [8,13]. The experimental data in Fig. 3 show convincingly the difference between the linear and nonlinear stages of the pattern forming process. So far the viscous dissipation has been found to have measurable influence on the transient dynamics, but not on the final stable pattern. Increasing the role of dissipation can be done at will by combining MFs and periodic excitation. Depending on the range of the magnetic induction the product may be regarded as a periodically driven Rosensweig instability or a magnetic Faraday instability.
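The thresholds quoted for the inviscid dispersion relation (1) in this section can be reproduced numerically. The following minimal Python sketch evaluates ω²(k), the critical field Hc and the critical wavenumber kc, and checks that the curve becomes non-monotonic at (3/4)^(1/4) Hc ≈ 0.93 Hc, which follows from requiring a real root of dω²/dk = 0. The permeability µr = 1.85 is the fit value quoted in the caption of Fig. 3; the density and surface tension are hypothetical placeholder values chosen only for illustration, as are all function names.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic field constant (T m / A)
G = 9.81              # gravitational acceleration (m / s^2)

# mu_r is the fit value from Fig. 3; RHO and SIGMA are hypothetical
# placeholder values, not measured material parameters.
MU_R = 1.85
RHO = 1020.0     # density (kg / m^3), assumed
SIGMA = 0.0265   # surface tension (N / m), assumed


def magnetic_prefactor(mu_r):
    """mu (mu_r - 1)^2 / (mu_r + 1) with mu = mu_r * mu0, as in Eq. (1)."""
    return mu_r * MU0 * (mu_r - 1) ** 2 / (mu_r + 1)


def omega_squared(k, H, mu_r, rho, sigma):
    """Right-hand side of Eq. (1): gravity, magnetic and capillary terms."""
    magnetic = magnetic_prefactor(mu_r) * H ** 2 * k ** 2 / rho
    return G * k - magnetic + sigma * k ** 3 / rho


def critical_field(mu_r, rho, sigma):
    """H_c at which the dispersion curve touches omega = 0."""
    return math.sqrt(2 * math.sqrt(rho * G * sigma) / magnetic_prefactor(mu_r))


def critical_wavenumber(rho, sigma):
    """k_c = sqrt(rho g / sigma)."""
    return math.sqrt(rho * G / sigma)


def is_non_monotonic(H, mu_r, rho, sigma):
    """True if d(omega^2)/dk = 0 has real roots, i.e. omega^2(k) has extrema."""
    a = magnetic_prefactor(mu_r)
    return (a * H ** 2) ** 2 > 3 * G * sigma * rho


H_c = critical_field(MU_R, RHO, SIGMA)
k_c = critical_wavenumber(RHO, SIGMA)
onset_ratio = (3 / 4) ** 0.25  # H / H_c at which Eq. (1) turns non-monotonic

print(f"H_c = {H_c:.0f} A/m, k_c = {k_c:.0f} 1/m, onset at {onset_ratio:.3f} H_c")
```

The ratio of the onset field to Hc is independent of the material parameters, so the placeholder values for density and surface tension affect only the absolute numbers printed for Hc and kc.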
3
The Magnetic Faraday Instability
The Faraday instability is already among the most popular experimental configurations for the investigation of parametrically excited instabilities, structure formation and spatio-temporal chaos [7]. Operating the experiment with magnetic fluid instead of the commonly used water or silicone oil adds further interesting aspects. Firstly, instead of shaking the container, the instability can also be driven by periodic modulation of the applied magnetic field [10,18,19,20]. Secondly, different orientations of the magnetic field with respect to the surface layer permit the realization of various symmetries [21,22]. Finally, the dispersion relation of MFs (1) can be tuned by the external magnetic field. Experimentally, the non-monotonic dispersion relation was investigated by means of locally excited travelling waves in an annular channel [23], and in a circular container [24]. Due to the non-monotonicity up to three different wavenumbers can be excited with one single driving frequency. Which of the wavenumbers can actually be realized depends on the viscous dissipation in the bulk and in the bottom layer of the fluid [25]. For surface waves excited in a spatially homogeneous manner, the competition of the different wavenumbers was predicted to result in the spontaneous formation of domain structures [26]. This symmetry-breaking process could be experimentally demonstrated in an annular channel excited by vertical vibration [27]. In the annulus a domain of standing subharmonic waves with the wavenumber k1 = 34 m⁻¹ and another domain with k2 = 46 m⁻¹ evolved. In addition to the predicted domain formation in space, for different parameters a domain formation in time could also be detected [27]. A standing wave pattern of wavenumber k1 collapses spontaneously in the whole annulus, and gives way to a pattern with wavenumber k2. The latter, however, is not stable and forms a slowly shrinking domain, which finally vanishes in favour of k1.
This cycle is repeated in an irregular manner.
Fig. 4. Twin peak pattern in the non-monotonic regime of the dispersion relation. The time elapsed between picture (a) and (b) is one driving period
Recently, the competition between two different wavenumbers was found to be resolved in a third way [28]. In the non-monotonic regime, for a magnetic field of H = 0.98 Hc and a driving frequency ωD = 9.615 Hz, a novel pattern of twin peaks has been detected. Figure 4 displays two snapshots taken one driving period apart. They clearly reveal a subharmonic standing wave. Apparently, instead of two separated domains, a bi-periodic structure in space has been established. Both dominant wavenumbers of the twin-peak pattern are found to be situated on the non-monotonic dispersion curve [28].
4
Spatio-Temporal Intermittency
Low-dimensional chaotic dynamics of a single spike of MF has been investigated in experiment [29] and theory [30,31]. The next step is to tackle the problem of spatio-temporal complexity. In order to do so, a quasi-one-dimensional array of spike oscillators is a natural choice. It can be realized by the magnetic Faraday instability in an annular trough filled with MF [27]. To facilitate a meaningful statistical evaluation of spatio-temporal complexity a high number of oscillators is most desirable. For a given circumference of the annulus the number of spike oscillators is limited by the wavenumber of the nonlinear pattern. In the case of a homogeneous magnetic field the wavenumber of the final nonlinear pattern was found to be independent of the magnetic induction, in agreement with Fig. 3. The number of oscillators could be increased by an order of magnitude by utilizing an inhomogeneous magnetic field [32] and an MF with permeability µr = 4. For this purpose the flat ground state and the homogeneous field were sacrificed. Figure 5 presents a picture of the new ground state. A ring of up to 130 spikes is trapped by the field at the edge of a soft iron pole shoe of an electromagnet supplied with a direct current of 1 A. The spikes can be excited periodically by applying in addition an alternating current Iex. A CCD camera takes pictures of the spatio-temporal dynamics phase-locked to the driving frequency. The 2D images are reduced in real time to 1D azimuthal scans along the ring of spikes. In Fig. 6 a, 500 such scans of a laminar state are shown in space and time, where dark regions correspond to high amplitudes. Due to the
Fig. 5. The MF EMG 901 from Ferrotec trapped by the gradient magnetic field at the sharp edge of a pole shoe of an electromagnet. The diameter of the pole shoe is 40 mm
Fig. 6. Transition to spatio-temporal intermittency in a ring of 108 magnetic liquid spikes situated at the edge of a pole shoe. The figure gives a series of stroboscopic azimuthal scans (a) in the laminar regime at Iex = 2.8 A, (b) in the spatio-temporal intermittent regime at Iex = 3.0 A, and (c) in the chaotic regime at Iex = 3.8 A. The driving period τ = 1 s/12.5 is used to scale the time. Horizontal axis: position x; vertical axis: time t (τ). Figure taken from [32]
stroboscopic recording the oscillations of the spikes cannot be seen. For higher excitation amplitudes the laminar state becomes intermittent in space and time (Fig. 6 b) and eventually chaotic (Fig. 6 c). As an order parameter for the transition to spatio-temporal intermittency (STI) we take the mean chaotic fraction γ, i.e. the ratio of chaotic regions to
the length of the system. Its variation with the control parameter Iex is shown in Fig. 7. Close to the onset of STI the mean chaotic fraction is expected to grow with a power law
\[ \gamma \sim (I_{ex} - I_c)^{\beta} . \tag{2} \]
The solid line in Fig. 7 is a fit to our data, using Ic, β, and an offset representing background noise as adjustable parameters. The threshold value determined in this way is Ic = 3.0 ± 0.05 A. The exponent β = 0.3 ± 0.05 is in agreement with the exponent predicted for directed percolation, β = 0.276486(8) [33], and thus corroborates the conjecture by Pomeau that the transition to STI might be analogous to directed percolation [34]. To test this conjecture more thoroughly we have investigated four further power laws, namely for the correlation length, the correlation time, and the critical distributions of the laminar lengths and times. Three out of four exponents turned out to be in agreement with the interpretation that chaotic domains spread within the laminar state according to the rules of directed percolation.
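The extraction of an exponent from the power law (2) can be illustrated with a minimal Python sketch. It is a simplification of the fit described above: the threshold Ic is assumed to be known and the noise offset is omitted, so that β follows from a straight-line fit of log γ versus log(Iex − Ic). The data are synthetic and noise-free, generated with the directed-percolation value of β purely for illustration; they are not the measured values of Fig. 7, and the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(i_ex, gamma, i_c):
    """Least-squares fit of log(gamma) = log(a) + beta * log(i_ex - i_c).

    Only points above threshold with a positive chaotic fraction are used.
    Returns the exponent beta and the amplitude a.
    """
    xs, ys = [], []
    for current, fraction in zip(i_ex, gamma):
        if current > i_c and fraction > 0:
            xs.append(math.log(current - i_c))
            ys.append(math.log(fraction))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    amplitude = math.exp(mean_y - beta * mean_x)
    return beta, amplitude


# Synthetic data (assumed values, not measurements): gamma = a (I_ex - I_c)^beta
I_C_TRUE, BETA_TRUE, A_TRUE = 3.0, 0.276, 0.9
currents = [3.05 + 0.05 * i for i in range(12)]
fractions = [A_TRUE * (c - I_C_TRUE) ** BETA_TRUE for c in currents]

beta_fit, a_fit = fit_power_law(currents, fractions, I_C_TRUE)
print(f"beta = {beta_fit:.3f}, amplitude = {a_fit:.3f}")
```

For real data the threshold and the offset would have to be varied as well, for example by repeating the straight-line fit for a range of trial values of Ic and keeping the one with the smallest residual.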
Fig. 7. The mean chaotic fraction γ vs. excitation amplitude Iex (A). γ was extracted from 2000 excitation periods τ. The solid line is a power law fit. The error bars represent the statistical errors. Figure taken from [32]
5
Liquid Bridge Pinch-Off
Liquid bridges between a rotating shaft and a housing are one of the major industrial applications of MFs. Apart from these applications, their stability and disintegration are also of fundamental interest for the investigation of drop formation. The early stage of the developing instability is described by classical linear stability analysis, first conducted by Rayleigh [35]. In the last stage of the surface-tension-driven instability drop formation occurs. The surface and flow structures immediately before drop pinch-off are described by universal scaling functions [36]. We have investigated whether these scaling laws for standard Newtonian liquids survive for the case of magnetic liquids subject to an axial magnetic field [37]. A magnetic liquid bridge is suspended
Fig. 8. Decay of a liquid bridge of magnetic fluid (APG J12 from Ferrofluidics) recorded by means of a high speed CCD-camera. The frames are taken at t = 0 ms (a), 2 ms (b), and 3 ms (c). From [37]
between the pole shoes of two electromagnets. Upon increase of the static magnetic field the bridge disintegrates. Figure 8 displays a sequence of frames during the rupture of the bridge. During the last 3 ms before the rupture the measured neck radius is found to follow the equation
\[ h_{\min} = u_{(a)s}\, \sigma\, (\nu \rho)^{-1}\, (t_0 - t) . \tag{3} \]
Here u(a)s denotes the predicted dimensionless shrink velocities for the case of a symmetric or asymmetric solution, respectively, i.e. for Stokes- or NavierStokes flow. Moreover, t0 denotes the time of pinch-off, σ the surface tension, ν the cinematic viscosity, and ρ the density. The measured value for u amounts to 0.068, which is close to the value us = 0.071 predicted by Papageorgiou for the case of a viscosity dominated flow [38]. This is in agreement with the relatively large viscosity of the investigated magnetic fluid.
[Plot for Fig. 9: neck radius hmin versus time t − t0 (ms); legend: u1 = 0.073, us = 0.071, u2 = 0.029, uas = 0.03]
Fig. 9. The neck radius for the drop pinch-off of a glycerin-water mixture versus time. The dotted and the dashed lines represent the theoretical prediction and a linear fit for the viscosity-dominated flow regime. The theoretical prediction and the linear fit for the Navier-Stokes flow are marked by the dash-dotted line and the solid line. From [39]
With decreasing neck radius, the flow is accelerated in the liquid thread. Eventually the Stokes approximation breaks down, and inertial terms become important. Such a transition is shown in Fig. 9 and could be observed for a glycerin-water mixture. It remains to be investigated whether this transition can be observed as well for the case of MF. Tuning the magnetic field should shift the transition point, in analogy to recent findings for different viscosities of the glycerin-water mixture [40].
6 Conclusion
We have presented recent experimental efforts towards a quantitative understanding of macroscopic surface patterns of magnetic fluids. Our three main outcomes are: For the two-dimensional structures in a normal magnetic field, we find an almost linear dependence of the wavenumber of maximal growth on the magnetic induction. For a quasi-one-dimensional array of magnetic liquid spikes driven periodically in a gradient magnetic field, we uncover a transition to spatio-temporal intermittency. This transition yields some critical exponents known from directed percolation. For the decay of a magnetic liquid bridge, the minimal neck radius was found to shrink with the velocity predicted for Stokes flow. In this way pattern formation in 2D, 1D, and in the vicinity of a point of pinch-off has been investigated.

Acknowledgements

The projects have been supported in part by the ‘Deutsche Forschungsgemeinschaft’ under Grants En278/2, Re888/12 and Ri1054/1.
References
1. R. E. Rosensweig, Ferrohydrodynamics (Cambridge University Press, Cambridge 1985).
2. B. Berkovski and V. Bashtovoy, Magnetic fluids and applications handbook, 1. ed. (Begell house, inc., New York 1996).
3. M. I. Shliomis, Sov. Phys. JETP 34, 1291 (1972).
4. M. Liu, Phys. Rev. Lett. 70, 3580 (1993).
5. H. Müller and A. Engel, Phys. Rev. E 60, 7001 (1999).
6. S. Odenbach, Magnetoviscous Effects in Ferrofluids, Vol. m71 (Springer, Berlin, Heidelberg, New York 2002).
7. M. C. Cross and P. C. Hohenberg, Rev. Mod. Phys. 65, 870 (1993).
8. M. D. Cowley and R. E. Rosensweig, J. Fluid Mech. 30, 671 (1967).
9. R. Richter and J. Bläsing, Rev. Sci. Instrum. 72, 1729 (2001).
10. R. E. Zelazo and J. R. Melcher, J. Fluid Mech. 39, 1 (1969).
11. A. Gailitis, J. Fluid Mech. 82, 401 (1977).
12. R. Friedrichs and A. Engel, Phys. Rev. E 64, 021406–1–10 (2001).
13. J.-C. Bacri and D. Salin, J. Phys. (France) 45, L559 (1984).
14. A. Boudouvis, J. Puchalla, L. Scriven, and R. Rosensweig, J. Magn. Magn. Mater. 65, 307 (1987).
15. D. Salin, Europhys. Lett. 21, 667–670 (1993).
16. A. Lange, B. Reimann, and R. Richter, Magnetohydrodynamics 37, 261 (2001).
17. A. Lange, B. Reimann, and R. Richter, Phys. Rev. E 61, 5528 (2000).
18. J.-C. Bacri, U. d’Ortona, and D. Salin, Phys. Rev. Lett. 67, 50 (1991).
19. T. Mahr and I. Rehberg, Europhys. Lett. 43, 23 (1998).
20. J. L. Hyun-Jae Pi, So-ycon Park and K. J. Lee, Phys. Rev. Lett. 84, 5316 (2000).
21. J. C. Bacri, A. Cebers, S. Lacis, and R. Perzynski, Phys. Rev. Lett. 72, 2705 (1994).
22. V. V. Mekhonoshin and A. Lange, Phys. Rev. E 65, 061509 (2002).
23. T. Mahr, A. Groisman, and I. Rehberg, J. Magn. Magn. Mater. 159, L45 (1996).
24. J. Browaeys, J.-C. Bacri, C. Flament, S. Neveu, and R. Perzynski, Eur. Phys. J. B 9, 335 (1999).
25. H. W. Müller, Phys. Rev. E 55, 6199 (1998).
26. D. Raitt and H. Riecke, Phys. Rev. E 55, 5448 (1997).
27. T. Mahr and I. Rehberg, Phys. Rev. Lett. 81, 89 (1998).
28. B. Reimann, T. Mahr, R. Richter, and I. Rehberg, J. Magn. Magn. Mater. 201, 303 (1999).
29. T. Mahr and I. Rehberg, Physica D 111, 335 (1997).
30. R. Friedrichs and A. Engel, Eur. Phys. J. B 18, 329 (2000).
31. A. Lange, H. Langer, and A. Engel, Physica D 140, 294 (2000).
32. P. Rupp, R. Richter, and I. Rehberg, Phys. Rev. E 67, 036209 (2003).
33. H. Hinrichsen, Adv. Phys. 49, 815 (2000).
34. Y. Pomeau, Physica D 23, 3 (1986).
35. J. W. S. Rayleigh, Phil. Mag. 34, 145 (1892).
36. J. Eggers, Rev. Mod. Phys. 69, 865 (1997).
37. A. Rothert and R. Richter, J. Magn. Magn. Mater. 201, 324 (1999).
38. D. T. Papageorgiou, Phys. Fluids 7, 1529 (1995).
39. A. Rothert, R. Richter, and I. Rehberg, Phys. Rev. Lett. 87, 084501 (2001).
40. A. Rothert, R. Richter, and I. Rehberg, accepted by New Journal of Physics (2003).
Magnetisation Reversal of Stripe Arrays – Comparison of MOKE with Polarized Neutron Reflectivity

Katharina Theis-Bröhl

Institut für Experimentalphysik/Festkörperphysik, Ruhr-Universität Bochum, 44780 Bochum, Germany

Abstract. Laterally structured magnetic films will be important components of future magnetic storage elements. From the aspect of application, remagnetization within a single-domain state is preferred. However, in order to minimize stray field energies, lateral structures often form domains. Therefore, the study of the remagnetization process is of particular interest and was performed by polarized neutron reflectivity (PNR) on CoFe stripe arrays. The results are compared with magneto-optical Kerr effect measurements and Kerr microscopy. PNR is a method which can distinguish between magnetization reversal via coherent rotation and via domain nucleation and wall movement. By applying PNR to lateral structures it is possible to perform remagnetization measurements at the specular reflection and at Bragg peaks from the artificial periodicity of the lateral structure. Whereas at the specular reflection a signal averaged over the complete sample is measured, the Bragg peaks filter out correlation effects between individual magnetic units. The parameters of the CoFe stripes were chosen in such a way that a strong two-fold shape anisotropy is present. When increasing the width of the stripes, the anisotropy changes and more complex domain configurations are observed during the magnetization process.
1 Introduction
Laterally structured magnetic films have received much interest in recent years because of their potential applications in optical and magnetic storage devices. A number of methods have been applied to characterize the magnetic properties of these patterns, including the magneto-optical Kerr effect (MOKE) [1,2,3], Kerr-microscopy [4], scanning electron microscopy with polarization analysis (SEMPA) [5], Lorentz-microscopy [6], X-ray magnetic circular dichroism (XMCD) microscopy [7], and magnetic force microscopy (MFM) [8]. In comparison, scattering methods have rarely been applied to patterned structures, although the interpretation of X-ray magnetic scattering (XRMS) and polarized neutron reflectivity (PNR) data is more straightforward than for most other methods. Both techniques, XRMS and PNR, will play a crucial role in the future, when it comes to the analysis of magnetic correlation lengths and phase transitions in nanostructured magnetic gratings. The XRMS technique was applied recently to explore the magnetic order in a Co/Pt line array with perpendicular anisotropy [9]. There are some examples reported so far of successful neutron scattering experiments on laterally structured magnetic arrays. PNR maps from magnetic dot arrays have been measured in remanence [10] and the magnetization reversal of magnetic dots and bars with pronounced dipole character was recently investigated via PNR [11,12]. Clearly, the disadvantage of neutron scattering is the low intensity from a tiny scattering volume, resulting in long counting times. Nevertheless, applying polarized neutron scattering methods to micro- and submicrometer scale lateral structures is a new and challenging task, which can yield new insight into the analysis of the remagnetization processes. However, a critical comparison between MOKE and PNR results from stripe arrays is missing so far. We present a comparative MOKE and PNR study to determine the magnetic hysteresis of periodic magnetic stripe arrays. For homogeneous magnetic films, both vector-MOKE and polarized specular neutron reflectivity yield the in-plane magnetization vector, and in general good agreement is found between both methods [13,14,15]. Deviations may arise from differences in the sampling volume and the averaging procedure over different magnetic domains. The advantage of PNR over MOKE is that the former method has a higher penetration depth and simultaneously yields structural information, allowing a layer-resolved vector magnetometry of several magnetic layers stacked on top of each other.

B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 801–814, 2003. © Springer-Verlag Berlin Heidelberg 2003
2 Experimental Details
2.1 Sample Preparation
The samples studied are large-area (200–300 mm²) Co0.7Fe0.3 thin polycrystalline films grown by DC magnetron sputtering. The anisotropy of the stripes is dominated by shape anisotropy. Due to their polycrystalline structure the intrinsic magneto-crystalline anisotropy is averaged out and the stripes have no further intentionally induced anisotropy. We used Al2O3 ($1\bar{1}02$) as a substrate with a 5 nm thick Ta buffer layer. The samples were spin-coated with Novolak photoresist, which was exposed with 442 nm light in a scanning laser lithography setup and developed afterwards. A layer stack consisting of 3 nm V, 90 nm Co0.7Fe0.3 and a 3 nm Al protection layer was deposited onto the patterned photoresist. Finally, the photoresist was removed via lift-off. The described procedure resulted in samples of Co0.7Fe0.3 stripes of 1.2 µm and 2.4 µm width and a grating parameter of d = 3 µm, as can be seen from the atomic force microscopy (AFM) pictures in Fig. 1.
2.2 Experimental Setups
Hysteresis loops were measured by using a high resolution magneto-optical Kerr effect setup in the longitudinal configuration with s-polarized light, which is well suited for measuring the Kerr angle as a function of the applied
[Fig. 1 image: panels a and b; axes x [µm], y [µm], z [nm]]
Fig. 1. Surface topography of arrays of (a) 1.2 µm wide and (b) 2.4 µm wide Co0.7Fe0.3 stripes obtained with an atomic force microscope, shown in a 3-dimensional surface view. The displayed area is 20×20 µm² in each case
magnetic field. The experimental setup allows for a rotation of the sample around its surface normal (angle χ) in order to apply a magnetic field in various in-plane directions. The angle χ denotes the angle between the easy axis and the applied field. This definition is valid for longitudinal as well as for transverse configurations. The experimental setup for these measurements is shown schematically in Fig. 2. For more experimental details we refer to [16]. A detailed description of the experimental setup can be found in [17]. The imaging of domain configurations was performed by magneto-optical Kerr microscopy in the longitudinal mode [4]. The weak magneto-optical contrast was digitally enhanced by means of a background subtraction technique [18]. The experimental setup has the option to apply in-plane magnetic fields in any direction independently of the magneto-optical sensitivity direction. To visualize the magnetic domains within the narrow stripes, the highest possible optical resolution, which is on the order of 0.3 µm for the given visible light illumination, was chosen. Neutron scattering experiments were performed at the ADAM reflectometer of the Institut Laue-Langevin in Grenoble, France. Measurements use
Fig. 2. Sketch of the longitudinal MOKE setup with the sample rotation angle χ between the easy axis and the applied field, and the angle φ of the magnetization vector M. In the transverse configuration, the field and the sample are rotated by 90°, such that the angle χ is kept constant, but the magnetization component mT is in the plane of incidence
Fig. 3. Sketch of the neutron scattering geometry. χ is the angle of the sample rotation with respect to the applied field (same definition as in Fig. 2). The magnetic field H is applied perpendicular to the scattering plane. αi and αf refer to the incident and exit angles of the neutrons to the sample surface
a neutron wavelength of 0.441 nm. The scattering geometry is depicted in Fig. 3. For field-dependent measurements, an electromagnet was used with a field direction perpendicular to the scattering plane and parallel to the incident neutron polarization axis. For the neutron scattering experiments the angle χ is defined in the same way as for the MOKE experiments, namely as the sample rotation angle between the easy axis and the applied field. From the four different cross sections measured at the specular reflection one can determine the spin asymmetry as a function of the external field, Sa(H):

$$S_a = \frac{I_+ - I_-}{I_+ + I_-} = \frac{(I_{+,+} + I_{+,-}) - (I_{-,-} + I_{-,+})}{(I_{+,+} + I_{-,-}) + (I_{+,-} + I_{-,+})}, \qquad (1)$$

using the intensities of the non-spin-flip ((+, +), (−, −)) and the spin-flip ((+, −), (−, +)) cross sections. Due to the linear dependence of the magnetic part Vmag of the optical potential

$$V = V_{\mathrm{nuc}} \pm V_{\mathrm{mag}} \qquad (2)$$
on the magnetic induction B, with + for “up” neutrons and − for “down” neutrons, the spin asymmetry shows a linear dependence on the magnetization in the case of specular reflection. For the Bragg reflections scattering theory has to be applied. Using kinematic scattering theory within the first Born approximation, Sa is not a simple function of the magnetization. In order to find a linear dependence one can define an expression similar to the spin asymmetry:

$$S_a = \frac{(I_{+,+} + I_{+,-}) - (I_{-,-} + I_{-,+})}{(I_{+,+} + I_{+,-}) + (I_{-,-} + I_{-,+})} = \frac{I_+ - I_-}{I_+ + I_-}, \qquad (3)$$

with Sa being proportional to the magnetization component parallel to the y axis. It should, however, be kept in mind that this is only a rough approximation to the distorted-wave Born approximation, which actually has to be applied when analyzing Bragg peaks close to the Yoneda wings. From the I(H) intensity values of the four cross sections (+, +), (−, −), (+, −) and (−, +) a magnetization curve can be calculated, which is proportional to the field-dependent magnetization of the stripes. The obtained curves may then be compared to the respective longitudinal MOKE hysteresis loops, after both have been normalized to their respective saturation magnetization.
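The evaluation described above, from four measured cross sections to a normalized magnetization curve, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `spin_asymmetry` and `normalized_curve` and the example counts are hypothetical, the expression used is the one of Eq. (1), and scaling by the largest absolute value assumes that the field scan reaches saturation.

```python
def spin_asymmetry(i_pp, i_pm, i_mp, i_mm):
    """Spin asymmetry S_a from the four cross-section intensities.

    i_pp, i_mm: non-spin-flip intensities (+,+) and (-,-)
    i_pm, i_mp: spin-flip intensities (+,-) and (-,+)
    """
    i_plus = i_pp + i_pm    # total intensity for incident "up" neutrons
    i_minus = i_mm + i_mp   # total intensity for incident "down" neutrons
    return (i_plus - i_minus) / (i_plus + i_minus)


def normalized_curve(cross_sections):
    """Map a field scan of (i_pp, i_pm, i_mp, i_mm) tuples to S_a scaled to +/-1.

    Scaling by the largest |S_a| assumes that the scan reaches saturation.
    """
    s_a = [spin_asymmetry(*row) for row in cross_sections]
    s_max = max(abs(s) for s in s_a)
    return [s / s_max for s in s_a]


# Hypothetical counts for three field values
# (negative saturation, near the coercive field, positive saturation).
scan = [
    (100.0, 5.0, 5.0, 900.0),
    (450.0, 50.0, 50.0, 450.0),
    (900.0, 5.0, 5.0, 100.0),
]
print(normalized_curve(scan))
```

A MOKE loop normalized to its own saturation value in the same way can then be overlaid on the resulting curve.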
3 Experimental Results
We have carried out magnetization reversal measurements for different sample rotation angles χ. First, we present results from the array with the narrow, 1.2 µm wide, stripes. The left column of Fig. 4 shows hysteresis measurements performed at the first-order Bragg peak via polarized neutron scattering. From these measurements, the neutron-derived magnetic hysteresis can be evaluated and compared to hysteresis loops determined by MOKE measurements (right column of graphs in Fig. 4). The top panel in Fig. 4 for χ = 0° exhibits a large splitting of the (+, +) and (−, −) intensities, which reverses suddenly at the coercive field Hc. There is almost no spin-flip scattering visible over the entire field range, indicating that the magnetization reversal for the easy-axis configuration takes place in the form of nucleation and fast domain wall movement at Hc, in complete agreement with the conclusions from MOKE measurements. At χ = 63° the (+, +) - (−, −) splitting starts to become reduced for field values far above and below Hc. The (−, −) intensity continuously decreases
Fig. 4. Magnetization reversal measurements performed at the first-order Bragg peak (left column) and calculated curves from the polarized neutron measurements (right column) at the array with the 1.2 µm wide stripes. The calculated curves are compared to longitudinal MOKE hysteresis loops, reproduced as solid lines. The top row depicts measurements at a sample rotation of χ = 0°, and the bottom row measurements at a sample rotation of χ = 63°
with increasing field from negative to positive field values, while the (+, +) intensity is more or less constant for most of the field values and changes suddenly at the coercive field. At the same time the spin-flip scattering increases towards the coercive field. All four cross sections are consistent with a coherent rotation of the magnetization vector away from the easy axis, in very good quantitative agreement with the conclusions from our MOKE measurements [16]. The situation changes at Hc, where 180° head-to-head domains are created in the individual stripes and domain wall movement sets in. A single-domain state is reached again for fields slightly larger than Hc. The field range in which domain nucleation and domain wall movement occur is very small. The general behavior is dominated by a single-domain state with a constant or a rotating magnetization. More quantitative statements can be made by evaluating the spin-asymmetry-related expression Sa (see above), using the I(H) intensities of the different cross sections. The neutron hysteresis curves (open squares) are compared to the respective longitudinal MOKE hysteresis loops (solid lines) for the same sample rotation, as shown in the right column of Fig. 4. The overall agreement is rather good for the two sample rotation angles. However, small deviations of the neutron-based magnetization curves from the MOKE hysteresis curves at Hc are clearly due to domain formation within this range. Kerr microscopy studies support the conclusions from the MOKE and the PNR measurements. Images at different magnetic fields show a single-domain state for fields below and above Hc (not shown). Around Hc, domain nucleation and domain wall movement set in. The proposed mechanism of head-to-head domain walls moving through the stripes can be confirmed. From our observations we assume independent domain nucleation and domain wall movement for each single stripe. The possible existence of dipole-dipole interactions between domains of adjacent stripes cannot be inferred from the Kerr microscopy observations in Fig. 5, which shows Kerr microscopy pictures taken at Hc for two different angles χ of the sample rotation. In Fig. 5 (left) we show the easy-axis case with no magnetization rotation for H = Hc, and in Fig. 5 (right) we show a picture taken at an intermediate angle χ = 63°, close to the hard-axis case. This angle was chosen for comparison with our neutron scattering measurements. For different angles χ < 88° similar MOKE microscopy pictures were recorded, in agreement with our vector MOKE results [16].

Next, we present results from the array with the 2.4 µm wide stripes. In the left column of Fig. 6 field scans are shown for χ = 0° performed at the specular (top) and at the first-order Bragg peak (bottom) via polarized neutron scattering. From the measurements, neutron-derived magnetic hysteresis curves were evaluated and compared to a hysteresis loop determined by longitudinal MOKE measurements in the right column of Fig. 6.
[Fig. 5 image: panels χ = 0° and χ = 63°; direction of Hext indicated; scale bar 10 µm]
Fig. 5. Kerr microscopy pictures taken at the array with the 1.2 µm wide stripes around Hc for different alignments of the magnetic field relative to the sample. The plane of incidence results in a top-down magneto-optical sensitivity axis parallel to the stripes. The magnetization directions are indicated by arrows in the χ = 0° map
Fig. 6. Magnetization reversal measurements (left panels) performed at a sample rotation of χ = 0° at the specular (top) and at the first-order Bragg reflection (bottom) and calculated curves from the polarized neutron measurements (right panels). The measurements were performed at the array with the 2.4 µm wide stripes. The calculated curves are compared to longitudinal MOKE hysteresis loops, reproduced as solid lines
At the specular as well as at the Bragg reflection we observe a clear peak in the spin-flip data between zero and coercive field with a maximum at 44 Oe (left column). This peak can be explained by a transverse magnetization component due to a tilt of the orientation of the magnetization. Comparing the neutron based magnetization curve from the specular reflection with the MOKE hysteresis loop, we find a clear deviation within this range. This is
Fig. 7. Kerr microscopy images taken from the array with the 2.4 µm wide stripes below Hc (top row), close to Hc (middle row), and above Hc (bottom row) for different alignments of the magnetic field relative to the sample, χ = 0° (left, easy-axis alignment) and χ = 75° (right). The plane of incidence results in a top-down magneto-optical sensitivity axis perpendicular to the stripes
a clear hint for domain nucleation within this range because the coherent rotation model for determining the y component of the magnetic induction vector is not valid anymore [20]. For a more detailed explanation of the peak between zero and coercive field we use Kerr microscopy studies. The Kerr microscopy images taken at the sample rotation angle χ = 0° (left column in Fig. 7) display the nucleation of ripple domains below Hc before switching during magnetization reversal. The modulated magnetic structure starts to become visible already at low opposite field values. Within neighboring ripples, the magnetization orientation tilts away slightly to the left and to the right relative to the net magnetization direction. Due to the tilt a transverse magnetization component emerges, which causes the peak in the spin-flip scattering intensity. The ripple domains do not change in size when the magnetic field is changed. While the ripple state cannot easily be inferred from the MOKE hysteresis loop, the spin-flip cross sections (+, −) and (−, +) in the left column of Fig. 6 are strongly influenced by the ripple state. The non-spin-flip cross sections exhibit a large splitting of the (+, +) and (−, −) intensities, which reverses suddenly at the coercive field Hc. At Hc, the spin-flip scattering decreases to a small value, indicating that the magnetization reversal for the easy-axis configuration takes place in the form of nucleation and fast domain wall movement, in complete agreement with conclusions from MOKE measurements [19]. Between zero field and Hc, the (−, −) cross section shows a drop of intensity, influenced by the higher spin-flip scattering from ripples with a transverse component of the magnetization.

Fig. 8. Results of a polarized neutron scattering measurement at different magnetic fields. The intensity was measured via PSD as a function of qx and qz. The sample rotation was fixed to χ = 0°. The patterns in the left column were taken with spin-up neutrons, and the patterns in the right column with spin-down neutrons. The color-scale intensity in the figure is proportional to the logarithm of the neutron counts

In addition to an enhanced spin-flip intensity, the ripple domains also cause diffuse scattering. In Fig. 8 we show maps measured with polarized neutrons for a sample rotation of χ = 0°. The maps in the top row were taken at the spin-flip scattering peak maximum of 44 Oe, and the maps in the bottom row were taken in saturation. The maps show the intensity on a color scale as a function of qx and qz. The specular reflectivity appears as a line at qx = 0 and shows various maxima due to thin film thickness oscillations. In the diffuse range, Bragg reflections from the period of the stripe array are found, two on each side. According to the periodicity of the stripe array of D = 3 µm they appear as lines at $q_x \cong \pm 2.1 \cdot 10^{-4}$ Å$^{-1}$ and at $q_x \cong \pm 4.2 \cdot 10^{-4}$ Å$^{-1}$.
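The quoted positions of the lateral Bragg reflections follow from the grating condition q_x = 2πn/D for a period D and order n. The short sketch below checks the two values for D = 3 µm; the function name `bragg_positions` is hypothetical, and the relation is the standard one for a one-dimensional grating rather than a formula given in the text.

```python
import math


def bragg_positions(period_angstrom, max_order):
    """Lateral momentum transfers q_x = 2*pi*n/D (in 1/Angstrom) for n = 1..max_order."""
    return [2.0 * math.pi * n / period_angstrom for n in range(1, max_order + 1)]


# Stripe period D = 3 micrometers = 3e4 Angstrom, as quoted in the text.
PERIOD_ANGSTROM = 3.0e4
for order, q_x in enumerate(bragg_positions(PERIOD_ANGSTROM, 2), start=1):
    print(f"order {order}: q_x = +/-{q_x:.2e} 1/Angstrom")
```

The first- and second-order values come out close to 2.1 · 10⁻⁴ Å⁻¹ and 4.2 · 10⁻⁴ Å⁻¹, consistent with the two lines observed on each side of the specular ridge.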
Fig. 9. Magnetization reversal measurements performed at a sample rotation of χ = 75° at the specular reflection for four different cross sections (top) and calculated curves from the polarized neutron measurements (bottom) for the array with 2.4 µm wide stripes. The calculated curves are compared to longitudinal MOKE hysteresis loops, reproduced as solid lines
In all maps we observe diffuse scattering within the scattering range of about αi ≤ 0.6 and αf ≤ 0.6, probably caused by the high structural roughness of the ‘artificial’ lateral structure. Additional diffuse intensity can be observed in the maps taken at 44 Oe. Bearing in mind that this is the field for which ripple domains have been observed, it is reasonable to assume that the enhanced diffuse scattering is due to these domains. For the sample rotation angle of χ = 75° we show measurements performed at the specular reflection (Fig. 9). The intensities of both spin-flip channels (+, −) and (−, +) agree almost perfectly with each other, as expected. The spin-flip intensity continuously increases towards zero field. The maximum is reached already below Hc, prior to an almost continuous decrease of the intensity towards high field. All four cross sections are consistent with a coherent rotation of the magnetization vector away from the easy axis, in very good agreement with conclusions drawn from MOKE measurements. However, the local minimum at Hc in the spin-flip cross sections points to an additional occurrence of domains. The Kerr microscopy images (Fig. 7, right column) again display the nucleation of ripple domains below Hc before switching during magnetization reversal. In the case of χ = 75° the field is higher compared to the case of χ = 0°; however, the external field component parallel to the stripe is still low in magnitude. At Hc, the nucleation of domains with the opposite average magnetization orientation sets in. These domains move through the stripes via head-on domain wall motion. In the images for χ = 75° many more domain walls are found compared to the χ = 0° images, for which not a single domain wall was observed. This effect is a result of the field alignment close to the effective hard-axis direction, favoring the generation of multidomain states inside the stripes. The domain wall velocity for the easy-axis configuration is much higher compared to the “harder” χ = 75° configuration.
In the remagnetized stripes, above Hc, ripple domains are again observed. However, the modulation of the magnetization is lower in amplitude. More quantitative statements can be made by evaluating the spin asymmetry Sa, using the I(H) intensities of the four cross sections (+, +), (−, −), (+, −), and (−, +). The neutron hysteresis curve (open squares) is compared to the respective longitudinal MOKE hysteresis loop (solid line), as shown in the bottom graph of Fig. 9. We found that the overall agreement between the neutron data and the MOKE curve is rather good. A detailed analysis, however, shows small deviations for the field range between zero field and slightly above Hc. These deviations can be attributed to the nucleation of domains: ripples close to zero field and domains of the other magnetization orientation at Hc, in complete agreement with Kerr microscopy.
4 Discussion
If the remagnetization process does not involve a complicated domain structure, good agreement can in general be expected between polarized neutron reflectivity and results from other methods. This has been demonstrated for homogeneous magnetic films by a number of investigators [13,14,15]. In our case, the agreement has to be reestablished. First, the samples are not homogeneous but laterally structured, and, second, we compare specular and off-specular neutron Bragg reflections with specular MOKE and with MOKE microscopy. Nevertheless, the main conclusions from the MOKE measurements could be confirmed by PNR. For the 1.2 µm wide stripes those are: First, only for χ = 0° the orientation of the magnetization is constant for all field values and parallel to the magnetic stripes. At H = Hc nucleation and domain wall motion occur. Second, independent of the sample rotation angle and for magnetic fields below the coercive field Hc, a single-domain state is present. Third, for rotation angles χ > 0° below Hc a coherent spin rotation away from the easy axis occurs with increasing field. The good agreement between both methods for this sample is a result of the single-domain state, which is present for all rotation angles and for all field values different from the coercive field. Furthermore, interactions and correlation effects appear to be negligible. This simple domain structure, which follows from the sample design, furnishes a good testing ground for comparative studies. As the domain state becomes more complex and correlation effects are present, we expect deviations between MOKE and PNR results. For the wider stripes of 2.4 µm width with the same periodicity we find a more complicated remagnetization behavior. It is characterized by the following main observations: First, domains with a modulated magnetization orientation within neighboring ripples are observed. The ripple domains do not change in size when the magnetic field is changed.
Second, the overall magnetization process is strongly affected by rotation processes. Third, at Hc, nucleation of oppositely oriented domains with head-on domain walls occurs. The ripple phenomenon was theoretically described as the reaction of the film to a statistical perturbation by the crystal anisotropy of individual grains of a polycrystalline film (see Hubert and Schäfer [4] and references therein). A longitudinal modulation of the magnetization within the ripple structure causes a small stray field energy which suppresses lateral variations of the magnetization due to the polycrystalline nature of the material. Compared to the narrow, 1.2 µm wide stripes with a strong uniaxial anisotropy [16], the local crystal anisotropy of individual grains of the 2.4 µm wide stripes with a smaller uniaxial anisotropy obviously has a stronger influence on the magnetization behavior of the system. As a result, ripple domains are created. Due to the tilt of the magnetization within the ripples, which gives rise to a transverse magnetization component, the presence of ripple domains can be extracted easily from the spin-flip data of the field-dependent PNR measurements for the easy-axis configuration. Because otherwise no transverse magnetization is present, we observe a clear peak due to the rippling in the top graph of Fig. 6. For all other sample configurations with rotational processes of the magnetization, the tilt of the magnetization within the ripples only causes small deviations of the already existing transverse component of the magnetization, and therefore rippling cannot be extracted from the PNR data as easily as for the easy-axis configuration. Although we observe ripples and multi-domain states during the reversal of the wider stripe array, good agreement between neutron- and MOKE-based hysteresis curves was again observed for those field ranges with rotational processes.
For the narrow stripe array we explained the good agreement between both methods as a result of the predominantly single-domain state with coherent magnetization rotation. For the easy-axis behavior of the wider stripes we find a clear deviation of both curves in the field range where ripples occur. For the orientation close to the hard axis, the reversal is strongly influenced by rotation of the magnetization. Due to the presence of ripples, the reversal cannot be coherent. Coherence of the rotation can only be expected within each individual ripple domain. Furthermore, due to the tilt of the magnetization within neighboring ripples, slight deviations between the neutron- and the MOKE-based magnetization curves occur.
5
Summary
We have studied the magnetization reversal behavior of Co0.7 Fe0.3 stripe arrays with a grating period of 3 µm, which were generated by scanning laser lithography. Using polarized neutron scattering, MOKE and Kerr microscopy, we have analyzed and compared the magnetic hysteresis as a function of angle χ between the stripes and the applied field.
We have investigated the magnetic hysteresis with polarized neutron scattering at small angles using the specular and the first order Bragg reflections from the stripe arrays. Unlike the MOKE experiment, for the polarized neutron scattering experiment the magnetic field is always applied perpendicular to the scattering plane and parallel to the polarization axis of the neutrons. Both components of the in-plane magnetization then follow from an analysis of the non spin-flip ((+, +), (−, −)) and spin-flip ((+, −), (−, +)) cross sections. For a single domain state and for magnetization reversal via coherent rotation the results from vector MOKE and from PNS agree very well. The very good agreement between both methods for the narrow stripes confirms that for most of the field values the stripes are in a single domain state. When the field is applied parallel to the stripe axis (easy axis), domain nucleation and domain wall movement occur within a narrow field range at the coercive field. For all other sample orientations we observe a coherent magnetization rotation with increasing field, with some domain nucleation occurring just around Hc. The good agreement between MOKE and PNR is in part due to the sample design providing a large uniaxial anisotropy, and in part due to the lack of interaction among the stripes. The wider stripes show a more complicated domain state at coercivity with additional ripple domains. For the field range in which domains occur, the otherwise good agreement between MOKE and PNR is violated because no simple connection between PNR derived magnetization curves and magnetization can be drawn. It can, however, be clearly decided whether the remagnetization occurs via rotation or via domain processes.
Acknowledgements I would like to express my thanks to Vincent Leiner, Maximilian Wolff, Till Schmitte, Andreas Westphalen, Florin Radu, Hartmut Zabel, Karsten Rott, Hubert Brückl, and Jeffery McCord for collaboration and fruitful discussions. I acknowledge funding by DFG, SFB 491 and BMBF 032AE8BO.
References
1. T. Schmitte, T. Schemberg, K. Westerholt, H. Zabel, K. Schädler, U. Kunze, J. Appl. Phys. 87, 5630 (2000).
2. T. Schmitte, O. Schwöbken, S. Goek, K. Westerholt, H. Zabel, J. Magn. Magn. Mat. 240, 24 (2002).
3. T. Schmitte, K. Westerholt, H. Zabel, J. Appl. Phys. 92, 4524 (2002).
4. A. Hubert and R. Schäfer, Magnetic domains (Springer, Heidelberg, 1998).
5. H.P. Oepen and J. Kirschner, Scanning Microsc. 5, 1 (1991).
6. J.N. Chapman, A. . Johnston, L.J. Heydermann, S. McVitie, W.A.P. Nicholson, and B. Bormans, IEEE Trans. Mag. 30, 4479 (1994).
7. P. Fischer, T. Eimuller, G. Schütz, G. Schmahl, P. Guttmann, and G. Bayreuther, J. Mag. and Magn. Mat. 198, 624 (1999).
8. J. Schmidt, G. Skidmore, S. Foss, E. Dan Dahlberg, and C. Merton, J. Mag. and Magn. Mat. 190, 81 (1998).
9. K. Chesnel, M. Belakhovsky, S. Landis, J. C. Toussaint, S. P. Collins, G. van der Laan, E. Dudzik, and S. S. Dhesi, Phys. Rev. B 66, 024435 (2002).
10. B.P. Toperverg, G.P. Felcher, V.V. Metlushko, V. Leiner, R. Siebrecht, O. Nikonov, Physica B 283, 9395 (2000).
11. K. Temst, M. J. Van-Bael, H. Fritzsche, Appl. Phys. Lett. 79, 991 (2001).
12. K. Temst, M. J. Van-Bael, V.V. Moshchalkov, Y. Bruynseraede, H. Fritzsche, R. Jonckheere, Appl. Phys. A 74, S1538 (2002).
13. M. R. Fitzsimmons, P. Yashar, C. Leighton, Ivan K. Schuller, J. Nogues, C. F. Majkrzak and J. A. Dura, Phys. Rev. Lett. 84, 3986 (2000).
14. F. Radu, M. Etzkorn, R. Siebrecht, T. Schmitte, K. Westerholt, H. Zabel, Phys. Rev. B 67, 134409 (2003).
15. F. Radu, M. Etzkorn, T. Schmitte, R. Siebrecht, A. Schreyer, K. Westerholt, H. Zabel, J. Magn. Magn. Mater. 240, 251 (2002).
16. K. Theis-Bröhl, T. Schmitte, V. Leiner, H. Zabel, K. Rott, H. Brückl, K. McCord, Phys. Rev. B 67, 184415 (2003).
17. Th. Zeidler, F. Schreiber, H. Zabel, W. Donner, and N. Metoki, Phys. Rev. B 53, 3256 (1996).
18. F. Schmidt, W. Rave, A. Hubert, IEEE Trans. Magn. 21, 1596 (1985).
19. K. Theis-Bröhl, et al., to be published (2003).
20. F. Radu, M. Etzkorn, R. Siebrecht, K. Westerholt, H. Zabel, Phys. Rev. B 67, (2003).
Magnetic Orientation in Birds and Other Animals Wolfgang Wiltschko Fachbereich Biologie und Informatik der J.W.Goethe-Universität, Zoologie Siesmayerstrasse 70, D-60054 Frankfurt a.M., Germany. [email protected]
Abstract. The use of the geomagnetic field for compass orientation is widespread among animals, with two types of magnetic compass mechanisms described: an inclination compass in birds, turtles and salamanders and a polarity compass in arthropods, fishes and mammals. Additionally, some vertebrates appear to derive positional information from the total intensity and/or inclination of the geomagnetic field. For magnetoreception by animals, two models are currently discussed, the Radical Pair model assuming light-dependent processes by specialized photopigments, and the Magnetite hypothesis proposing magnetoreception by crystals of magnetite, Fe3O4. Behavioral experiments with migratory birds, testing them under monochromatic lights and subjecting them to a brief, strong pulse that could reverse the magnetization of magnetite particles, produced evidence for both mechanisms. However, monochromatic lights affect old, experienced and young birds alike, whereas the pulse affects only experienced birds, leaving young, inexperienced birds unaffected. These observations suggest that a radical pair mechanism provides birds with directional information for their innate magnetic compass and a magnetite-based mechanism possibly mediates information about total intensity for indicating position.
1
Evidence for Magnetic Orientation in Animals
The magnetic field of the Earth provides a wealth of navigational information for anyone who is able to read this information. The course of the field lines marks magnetic North and South. Total intensity shows a large-scale gradient, being highest near the poles, where it reaches more than 60 000 nT, and lowest near the magnetic equator, where it is below 35 000 nT. With inclination (defined as the angle between the direction of the field lines and the horizontal), the situation is similar: at the magnetic poles, it is 90° as the field lines go straight up or straight down; it then continuously decreases until it reaches 0°, running parallel to the earth’s surface at the magnetic equator (for details on the geomagnetic field, see [1]). Animals could thus, at least in theory, derive different kinds of navigational information from the geomagnetic field: the course of the field lines could be used as a compass to indicate directions, while the gradients of total intensity and inclination could be used as a component of a system indicating position, as they would point out something like ‘magnetic latitude’. The use of a compass needle by human seafarers and travelers has a long tradition [2]; man, not being able to perceive the geomagnetic field consciously (but see [3]), had to construct appropriate technical devices. Many animals, in contrast, appear to be able to sense magnetic fields, and their use of magnetic information appears not to be restricted to directional information, but, at least in some groups, includes the use of positional information as well [4].
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 815–829, 2003. © Springer-Verlag Berlin Heidelberg 2003
1.1
Magnetic Compass Orientation
For demonstrating magnetic compass orientation, two basic requirements must be met: the animals to be tested must reproducibly show oriented behavior in a specific direction, and they must do so within a limited space so that the magnetic field can be experimentally altered. Their response to a change in magnetic North will then tell the observer whether or not they use the magnetic field as a compass.
1.1.1
The Magnetic Compass of Birds
Migratory birds were the first animals in which a magnetic compass was described. During migration in autumn and spring, these birds are controlled by an innate urge to move in their migratory direction [5]. Even captive migrants prefer the sector of their cage that points in the migratory direction and stay there more often than in the other sectors. This behavior provides a reliable basis for tests. Tested in spring in the local geomagnetic field, migratory European Robins, Erithacus rubecula, preferred their normal migratory direction slightly east of north (Fig. 1, left). When magnetic North was deflected by 120° (leaving total intensity and inclination unchanged), the birds altered their directional preference accordingly and now headed towards SE (Fig. 1, center). Magnetic compass orientation is widespread among migratory birds. It was found in European, North American and Australian species with different migratory distances and habits: birds migrating short, medium and long distances, and birds migrating at night, during twilight or during daytime have been shown to use the magnetic field for compass orientation. A magnetic compass is also indicated in homing pigeons, Columba livia f. domestica, so that it appears to be a basic element of the avian navigational system (see [4] for details). A functional analysis of the avian magnetic compass revealed several important differences to our technical compass. The most important ones are its functional mode as an inclination compass, and its narrow functional window.
Fig. 1. Orientation of European Robins in spring in different magnetic fields. The triangles at the periphery of the circle indicate the mean headings of individual birds, the arrows represent the mean vectors drawn proportional to the radius of the circle (they are 1 if all mean headings coincide and 0 if they are equally distributed around the circle). The two inner circles are the 5% (dashed) and the 1% significance border of the Rayleigh test indicating directional preferences. mN, magnetic North (data from [43,44])
The Inclination Compass. When the horizontal component of the ambient magnetic field was deflected, birds altered their headings accordingly and in spring preferred the new experimental magnetic North direction, as was to be expected if the magnetic field was used as a compass. However, when the vertical component was inverted so that it pointed upward instead of downward, the Robins reversed their headings, preferring no longer magnetic North, but magnetic South (Fig. 1, right). This behavior clearly shows that the birds do not respond to the polarity of the magnetic field. Instead, they seem to derive their directional information from the axial course of the field lines and their inclination in space [6].
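The logic of the three test conditions in Fig. 1 can be made explicit with a small sketch. It is not taken from the original study: the function names, the reduction of the field to a horizontal azimuth plus the sign of the vertical component, and the simplification that the birds head exactly "north" or "poleward" (rather than slightly east of north) are all assumptions made for illustration.

```python
# Toy comparison of a polarity compass and an inclination compass.
# A field is described by the azimuth of its horizontal component
# (degrees clockwise from geographic North) and by the sign of its
# vertical component (positive = pointing downward). Hypothetical model.

def polarity_north(h_azimuth_deg, vertical_down):
    """Direction a polarity compass reads as 'north': the azimuth of the
    horizontal field component, whatever the vertical component does."""
    return h_azimuth_deg % 360.0


def inclination_poleward(h_azimuth_deg, vertical_down):
    """Direction an inclination compass reads as 'poleward': the side on
    which the field-line axis dips downward (acute angle with gravity)."""
    if vertical_down == 0:
        raise ValueError("horizontal field: 'poleward' is ambiguous")
    flip = 0.0 if vertical_down > 0 else 180.0
    return (h_azimuth_deg + flip) % 360.0


# The three conditions described in the text (values are idealized).
CONDITIONS = {
    "local geomagnetic field": (0.0, 1.0),
    "magnetic North deflected by 120 deg": (120.0, 1.0),
    "vertical component inverted": (0.0, -1.0),
}


def predictions():
    """Return {condition: (polarity prediction, inclination prediction)}."""
    return {
        name: (polarity_north(*field), inclination_poleward(*field))
        for name, field in CONDITIONS.items()
    }


if __name__ == "__main__":
    for name, (pol, inc) in predictions().items():
        print(f"{name}: polarity {pol:.0f} deg, inclination {inc:.0f} deg")
```

Only the third condition separates the two compass types: a polarity compass still indicates 0°, whereas the inclination compass indicates 180°, which is the reversal the Robins showed. In this toy model a complete polarity reversal (horizontal component reversed and vertical component inverted) leaves the inclination-compass reading unchanged.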
Such an inclination compass has been demonstrated in all avian species tested so far with the vertical component inverted, including homing pigeons (see [4]); possibly, all birds possess this type of magnetic compass. Its functional mode means, however, that birds do not distinguish between magnetic North and South, but between poleward, the direction where the axis of the field lines forms the acute angle with gravity, and equatorward, where this angle is larger than 90°. This has the interesting consequence that the avian magnetic compass was not affected by the frequent reversals of the geomagnetic field in the past, as these reversals changed polarity without altering the axial course of the field lines.
A Narrow Functional Window. Another difference to the technical compass is that the avian magnetic compass is narrowly tuned to the intensity of the ambient magnetic field. European Robins living in the local geomagnetic field of Frankfurt a.M. at an intensity of 46 000 nT are no longer oriented when tested in fields of 34 000 nT and below, or, more surprisingly, of 60 000 nT and above (Fig. 2). This means that a decrease or increase in total intensity by merely about 25 to 30% led to a breakdown in orientation [7]. The functional window is not fixed, however. Birds that had been exposed to fields of lower or higher intensity adjusted themselves to those intensities and regained their ability to orient. Obviously, the potential functional range is much wider than the actual functional window of the avian magnetic compass; it even includes intensities beyond those found on earth today. Interestingly, the processes setting the magnetic compass to the new intensities represent neither a shift nor a simple enlargement of the functional window. Birds adapted to lower or higher intensities did not lose the ability to orient in the local geomagnetic field, but they were disoriented at intermediate intensities (Fig. 2; [7]).
Fig. 2. Functional range of the magnetic compass in birds living at different total intensities: + indicates normal orientation in migratory direction; encircled − indicates disorientation. The dashed line marks the intensity of the local geomagnetic field (data from [43])
1.1.2
The Magnetic Compass of Other Animals
Magnetic compass orientation has also been demonstrated in a large number of other animals. It has been found in all major groups of vertebrates and in many invertebrates, among them insects, crustaceans and snails (see Table 1). The functional mode of their magnetic compass has been analyzed in very few species only, however. Among them are mole rats, Cryptomys sp., subterranean rodents living in Africa south of the Sahara. They dig extended straight tunnels underground, which they seem to align with the help of the magnetic field. In the laboratory, they were found to build their nest always in the southeastern sector of a round arena. When magnetic North was turned to geographic South, they responded by building now in the NW, demonstrating magnetic compass use (Fig. 3, left and center). But in contrast to birds (see Fig. 1, right), they did not respond to an inversion of the vertical component (Fig. 3, right); obviously, their magnetic compass is a polarity compass based on the polarity of the field lines [8]. This means that there are two different types of magnetic compasses among animals. Table 1, last column, lists the type found in the various animal groups. While the few arthropods tested so far all have a polarity compass, the type of compass varies among vertebrates (see [4]). Amphibians appear to have both types of mechanisms, an inclination compass for shoreward orientation and a polarity compass for homing [9].
Fig. 3. Orientation of mole rats in different magnetic fields. Each triangle at the periphery of the circle indicates a nest being built in a round arena; the arrows represent the mean vector drawn proportional to the radius of the circle. mN, magnetic North (data from [8])
Table 1. Magnetic Compass Orientation in the Animal Kingdom
Phylum       Class          Species      Type of Compass
Mollusca     Snails         1 species    ???
Arthropods   Crustaceans    6 species    Polarity Compass
             Insects        8 species    Polarity Compass
Vertebrates  Elasmobranchs  1 species    ???
             Bony Fish      4 species    Polarity Compass
             Amphibians     2 species    Polarity Compass / Inclination Compass
             Reptiles       2 species    Inclination Compass
             Birds          20 species   Inclination Compass
             Mammals        2 species    Polarity Compass
??? indicates that the type of magnetic compass has not yet been analyzed
1.2
Non-compass Use of the Geomagnetic Field
The data base supporting a use of magnetic parameters for determining position is much smaller and in many ways less clear. Here, mainly two functions of magnetic parameters are important, namely their role as components of the navigational ‘map’ telling animals the direction to their goal at distant sites, and their role as triggers, setting off specific responses in certain regions.
1.2.1
Magnetic Navigational Factors
Although magnetic total intensity and inclination have been discussed as components of a world-wide ‘grid map’ for the navigation of homing pigeons already in the nineteenth century by Viguier [10], a detailed analysis of their possible role is difficult. Studying homing normally means that displaced animals are released, which makes manipulations of magnetic factors problematic. Also, animals do not rely on magnetic cues alone. They seem to use a multitude of factors in an opportunistic way, and these other factors may mask the role of magnetic cues in a given situation. The evidence indicating magnetic ‘map’-factors in the various animals is mostly indirect. When released at strong magnetic anomalies, displaced homing pigeons showed increased scatter up to disorientation. The vector lengths resulting from the vanishing bearings of groups of pigeons were found to be negatively correlated with the change in total intensity within 1 km (Fig. 4) suggesting that strong local magnetic gradients interfered with the navigational processes [11,12].
Fig. 4. Orientation of homing pigeons at various sites near and within a strong magnetic anomaly. The three upper circular diagrams give the vanishing bearings of pigeons (as dots at the periphery of the circle) and the resulting mean vectors (as arrows) at three selected sites. Lower diagram shows the significant correlation between vector lengths (ordinate, logarithmic scale) and the maximum difference in intensity within 1 km in the direction towards home (abscissa, logarithmic scale); the coefficient of correlation is 0.96 (data from [11])
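The mean vectors and vector lengths used in the circular diagrams follow from standard circular statistics. The sketch below shows how a mean direction and a vector length between 0 and 1 can be computed from a set of bearings, together with a first-order large-sample approximation to the Rayleigh test. The function names and the example bearings are hypothetical and are not data from the studies cited here.

```python
import math


def mean_vector(headings_deg):
    """Return (mean direction in degrees, vector length r) for a list of
    bearings. r is 1 if all bearings coincide and close to 0 if they are
    spread evenly around the circle."""
    n = len(headings_deg)
    if n == 0:
        raise ValueError("need at least one heading")
    c = sum(math.cos(math.radians(h)) for h in headings_deg) / n
    s = sum(math.sin(math.radians(h)) for h in headings_deg) / n
    r = math.hypot(c, s)
    direction = math.degrees(math.atan2(s, c)) % 360.0
    return direction, r


def rayleigh_p_approx(r, n):
    """First-order approximation p ~ exp(-n r^2) of the Rayleigh test
    against uniformity. Published analyses may use exact tables or a
    higher-order correction; this is only a rough guide."""
    return math.exp(-n * r * r)


if __name__ == "__main__":
    # Made-up vanishing bearings (degrees), for illustration only.
    bearings = [350.0, 5.0, 20.0, 340.0, 15.0, 355.0, 30.0, 10.0]
    direction, r = mean_vector(bearings)
    p = rayleigh_p_approx(r, len(bearings))
    print(f"mean direction {direction:.1f} deg, r = {r:.2f}, p ~ {p:.4f}")
```

Averaging unit vectors rather than the angles themselves avoids the wrap-around problem at 0°/360°. A short vector, as reported for releases within the anomaly, means widely scattered bearings.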
Also, at some sites, pigeons showed a certain shift in headings between morning and noon that could be suppressed by magnets, indicating that it might be associated with the decrease in total intensity normally occurring in the course of the forenoon [13]. At the same time, manipulations of the magnetic field around the pigeons’ head by attaching bar magnets or battery-operated coils led to small, but consistent deviations (e.g. [14,15]). All these findings point to an involvement of magnetic factors in avian navigation, yet without clarifying their specific role. Recently, there have been indications that other animals may also use magnetic parameters for navigation. Salamanders, Notophthalmus viridescens, appear to use the inclination of the geomagnetic field for detecting a north-south displacement. Tested in the laboratory, they responded to changes in inclination in the range of only 0.5° [16]. Spiny lobsters, Panulirus argus, displaced about 15 km headed back to the place of capture. When they were tested in the laboratory in a magnetic field that simulated the magnetic conditions of more distant sites, the lobsters responded as if they had been displaced to those sites, showing headings in the direction that would have led them home [17]. The use of magnetic parameters for navigation has also been discussed for animals that return to their site of origin after extended migrations, like salmon, marine turtles and migrating birds, but in these cases, experimental evidence is not yet available.
1.2.2
Magnetic Parameters Acting as ‘Triggers’
Magnetic conditions of specific regions may act as triggers and set off innate responses that have to occur at the respective sites. The first such triggering effect was observed in Pied Flycatchers, Ficedula hypoleuca, a migrating songbird species. During autumn migration, birds from Central Europe avoid crossing the Alps and the central parts of the Sahara desert by first heading southwest to Iberia; here, they change course to southeasterly headings, cross the Mediterranean and reach their African winter quarters south of the Sahara. Hand-raised birds tested in the laboratory initiated the second leg of this migration only if the magnetic field of their migration route was simulated and they experienced the conditions of northern Africa at the time when the shift in heading was due [18]. Thrush Nightingales, Luscinia luscinia, another avian migrant, were found to gain weight much faster (i.e., store more fuel for the migration flight) when the magnetic conditions of their migration route were simulated, imitating the situation just prior to crossing the Sahara [19]. Trigger functions of magnetic parameters have also been reported for marine turtles. Young loggerheads, Caretta caretta, from the Florida population spend their first years at sea within the warmer waters of the Atlantic gyre. They spontaneously responded to magnetic conditions as they are found at the edge of the gyre by preferring headings that would keep them within this gyre (Fig. 5; [20]). Both inclination and total intensity were found to set off these spontaneous responses [21,22].
Fig. 5. Orientation response of young loggerhead turtles in magnetic fields of three locations (marked by solid dots on the map) at the edge of the Atlantic gyre (represented by small arrows). In the circular diagrams, each dot marks the mean heading of a turtle hatchling, with the arrows representing the mean vectors; dashed lines represent the 95% confidence interval for the direction of the mean vector (redrawn from [20])
1.3
Two Types of Magnetic Information Used
The findings mentioned above clearly show that the use of information from the geomagnetic field is widespread among animals. The animals use it for different tasks, where at least two types of magnetic information are required: the vector of the magnetic field provides a compass for locating directions, and total intensity and/or inclination are involved in indicating position, for navigation and as ‘triggers’ to set off specific responses.
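How total intensity and inclination could encode something like a ‘magnetic latitude’ can be illustrated with the textbook centred-dipole idealization of the geomagnetic field. This approximation is not used in the text above and ignores local anomalies and the offset between magnetic and geographic poles; the function names and the equatorial intensity of 30 000 nT are assumptions chosen only so that the polar value comes out near the 60 000 nT quoted earlier.

```python
import math


def dipole_inclination(mag_lat_deg):
    """Inclination (degrees) at a given magnetic latitude for an ideal
    centred dipole, from tan(I) = 2 tan(latitude)."""
    lat = math.radians(mag_lat_deg)
    return math.degrees(math.atan2(2.0 * math.sin(lat), math.cos(lat)))


def dipole_intensity(mag_lat_deg, equatorial_nT=30000.0):
    """Total intensity (nT) of an ideal dipole field:
    F = F_equator * sqrt(1 + 3 sin^2(latitude)).
    The equatorial value is an assumed round number."""
    lat = math.radians(mag_lat_deg)
    return equatorial_nT * math.sqrt(1.0 + 3.0 * math.sin(lat) ** 2)


def latitude_from_inclination(incl_deg):
    """Invert tan(I) = 2 tan(latitude): the 'magnetic latitude' that an
    animal could in principle read off from inclination alone."""
    incl = math.radians(incl_deg)
    return math.degrees(math.atan2(math.sin(incl), 2.0 * math.cos(incl)))


if __name__ == "__main__":
    for lat in (0, 15, 30, 45, 60, 75, 90):
        print(f"lat {lat:2d} deg: inclination {dipole_inclination(lat):5.1f} deg, "
              f"intensity {dipole_intensity(lat):7.0f} nT")
```

In this idealization both quantities increase monotonically from the magnetic equator to the pole, so each of them alone would fix the north-south position. The real field deviates from a dipole, which is one reason why intensity and inclination are discussed as separate candidate cues.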
2
Models for Magnetoreception
One of the basic questions in magnetic orientation research concerns the processes by which animals obtain the relevant information from the geomagnetic field: what is the nature of the primary processes, where do the receptors lie and what physiological mechanisms are involved? Several ways in which animals could perceive the magnetic field have been suggested (see [4] for review). Two models are currently in discussion: the Radical Pair model, suggesting that magnetoreception is mediated by specialized macromolecules, and the Magnetite hypothesis, assuming that magnetoreception is based on ferromagnetic particles of magnetite, Fe3O4.
2.1
The Radical Pair Model
This model, first forwarded by Schulten [23] and recently detailed by Ritz et al. [24] assumes that birds perceive the direction of the magnetic field with the help of specialized photopigments. As an initial step, macromolecules are elevated to singlet excited states by the absorption of a photon. They may then dissociate into radical pairs and, by hyperfine interactions, may be interconverted into triplet pairs, which are chemically different from singlet pairs (Fig. 6). The triplet yield depends on the alignment of the macromolecules with the external magnetic field and may thus be utilized to obtain information about the direction of the field. To mediate magnetic compass information by this mechanism, the triplet yields in different spatial directions must be compared, which requires an orderly array of macromolecules oriented in the various spatial directions. This condition seemed to be satisfied by the arrangement of photopigments in the avian eyes. The light-dependent mechanism described above would generate complex patterns of response on the retina. These patterns would be axial and symmetrical with respect to magnetic North and South and thus allow birds to detect the direction of the ambient magnetic field (see [24] for details). The radical pair model was especially designed in view of the magnetic compass of birds and is thus in agreement with its functional characteristics. The fact that the radical pair mechanism leads to axial rather than polar responses is in accordance with the avian compass being an inclination compass. The fine tuning of the avian compass to the ambient intensity and the ability of birds to adjust their magnetic compass to intensities outside the normal functional range can also be explained by the model: any change in intensity would alter the pattern on the retina, yet without affecting its general symmetry. 
Birds experiencing an intensity markedly different from the usual one might first be confused by the novel pattern, but after a while, they might recognize its central symmetry and learn to interpret it. This ability to explain important features of the birds’ responses under various magnetic conditions made the radical pair model rather attractive for considerations on magnetoreception in birds.
Fig. 6. Schematic drawing of the Radical Pair model; the triplet yield depends on the alignment of the molecules with the local magnetic field (after [24], modified)
However, a polarity compass, as it is observed in arthropods, fish and mammals (cf. Table 1), cannot be explained by this model.
2.2
The Magnetite Hypothesis
Magnetite, a ferromagnetic material of biogenic origin, was first described as a hardening agent to reduce wear in the teeth of chitons, snail-like animals that feed on algae living in the surface layers of coral rock [25]. When Blakemore [26] described magnetotactic bacteria that align themselves along the magnetic field lines with the help of small crystals of magnetite, magnetite was discussed as a possible means to detect magnetic fields. A highly successful search for magnetite began, and magnetite crystals were found in a vast number of animals, among them bees, birds and salmon, that had been demonstrated to respond to magnetic fields (see [27] for summary). The magnetic properties of magnetite depend on the size of the particles (Fig. 7). Very small particles are superparamagnetic: their magnetic moment fluctuates, but it can be aligned by an external magnetic field. Slightly larger particles are single domains with a stable magnetic moment, and even larger particles form multi-domains, where the magnetic moments of the various domains tend to cancel each other. The particles found in bacteria and many other animals appeared to be single domains [27]; recently, however, nervous structures containing superparamagnetic crystals have been described in the beak of pigeons [28]. The processes by which crystals of magnetite might mediate magnetic information are still unclear. Several mechanisms have been proposed, some of which are based on single domains, others on superparamagnetic crystals (e.g. [29,30,31,32,33]). Interestingly, while magnetite particles could provide compass information, a sufficient number arranged in a suitable way could also mediate information on small differences in total intensity [30,31].
Fig. 7. Magnetic properties of magnetite particles of various sizes and shapes. The two lower lines for 100 s and 4 × 10^9 years indicate the stability of the magnetization at room temperature. Size and shape of magnetite crystals found in the various organisms are indicated (after [31], modified to include the findings of [28])
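The two stability times quoted for Fig. 7 (100 s and 4 × 10^9 years) can be related to particle size through the Néel-Arrhenius relaxation law, τ = τ0 exp(ΔE / kT), a standard result of fine-particle magnetism that is not derived in this text. The sketch below only computes the ratio of energy barrier to thermal energy that each stability time would require. The attempt time τ0 = 10^-9 s is an assumed order of magnitude, and the function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def relaxation_time(barrier_over_kT, tau0=1e-9):
    """Neel-Arrhenius relaxation time tau = tau0 * exp(dE / kT), in seconds.
    tau0 is an assumed attempt time, not a value from the text."""
    return tau0 * math.exp(barrier_over_kT)


def barrier_for_time(tau_s, tau0=1e-9):
    """Ratio dE / kT needed for the magnetization to stay stable for tau_s."""
    if tau_s <= 0 or tau0 <= 0:
        raise ValueError("times must be positive")
    return math.log(tau_s / tau0)


if __name__ == "__main__":
    for label, tau in (("100 s", 100.0),
                       ("4e9 years", 4e9 * SECONDS_PER_YEAR)):
        print(f"{label}: dE/kT = {barrier_for_time(tau):.1f}")
```

Because the barrier of a single-domain grain grows with its volume, the exponential makes the transition from superparamagnetic to stably magnetized particles very sharp in particle size. This is consistent with the narrow size ranges separating the regimes described above.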
3
Testing the Two Hypotheses in Birds
The two competing hypotheses on magnetoreception called for critical tests distinguishing between them. We decided to test them with migratory birds, using the birds’ ability to orient with the geomagnetic field as the only cue as a criterion for whether or not they could derive meaningful information from the magnetic field in a given situation. The crucial tests were performed with Silvereyes, Zosterops lateralis, a small Australian songbird that migrates from the island of Tasmania to the mainland.
3.1
Testing the Radical Pair Model
The radical pair model proposes photon absorption as the first step in the processes leading to the detection of the magnetic field. This would make magnetoreception light-dependent. Hence the studies testing the hypothesis addressed the question whether or not light was required for magnetic orientation. Tests with young homing pigeons indicated that some light was indeed necessary [34]. Further tests aimed at determining a potential role of photopigments by analyzing a possible wavelength dependency of magnetoreception. Birds were exposed to nearly monochromatic light of various wavelengths produced by LEDs. The light intensity was set to produce an equal quantal flux, with about 6 × 10^15 quanta s^−1 m^−2 as standard in the test cages, which corresponds to a light level found in nature during twilight well after sunset or before sunrise. The results of tests with Silvereyes are shown in Fig. 8: under ‘white’ control and 565 nm Green light, the birds showed well oriented headings in their appropriate northerly migratory direction; under 635 nm Red light, they were disoriented. This applied to old, migration-experienced and young, inexperienced birds alike [35,36]. Repeating the respective tests with other species, such as European Robins, Garden Warblers, Sylvia borin, and homing pigeons, revealed a similar wavelength dependency of magnetic orientation: the birds showed normal orientation from 424 nm Blue to 565 nm Green, i.e. in the entire blue and green part of the visual spectrum, while they were disoriented under 590 nm Yellow, 635 nm Red light and beyond (see [37] for review). These data are in agreement with the model that magnetoreception is mediated by light-dependent processes in the eye. Recent findings indicate that magnetic orientation may be restricted to the right eye: birds with their left eye covered showed normal orientation, whereas birds with their right eye covered were disoriented [38].
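Because the lights were matched in quantal flux rather than in energy, the same photon flux corresponds to a slightly different irradiance at each wavelength. The following sketch converts the quoted flux of about 6 × 10^15 quanta per second and square metre into W/m² using the photon energy E = hc/λ. It treats the LED light as strictly monochromatic at the peak wavelength, which is a simplification, and the function names are hypothetical.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant (exact SI value)
LIGHT_SPEED_M_S = 299792458.0    # speed of light in vacuum (exact SI value)


def photon_energy_joule(wavelength_nm):
    """Energy of a single photon, E = h c / lambda."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def irradiance_w_m2(quantal_flux, wavelength_nm):
    """Irradiance (W/m^2) of monochromatic light with the given photon
    flux (quanta per second and square metre)."""
    return quantal_flux * photon_energy_joule(wavelength_nm)


if __name__ == "__main__":
    flux = 6e15  # quanta s^-1 m^-2, the standard quoted in the text
    for name, wl in (("Blue", 424), ("Green", 565),
                     ("Yellow", 590), ("Red", 635)):
        print(f"{wl} nm {name}: {irradiance_w_m2(flux, wl) * 1e3:.2f} mW/m^2")
```

At equal quantal flux the red light carries somewhat less energy per unit area than the green light, which is why matching by photon number is the natural choice when a photon-absorbing pigment is the assumed receptor.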
Fig. 8. Orientation behavior of Silvereyes in the geomagnetic field under ‘white’ and monochromatic green and red light, with the peak wavelengths indicated. Upper diagrams: old, experienced birds; lower diagrams: young, inexperienced birds. Symbols as in Fig. 1 (data from [35,36])
3.2
Testing the Magnetite Hypothesis
The easiest way to test the magnetite hypothesis was to subject birds to a short, strong magnetic pulse. With 0.5 T, the pulse was strong enough to alter the magnetization of the magnetite particles indicated in the head of migratory birds [39], and with a duration of 3–5 ms it was short enough not to allow the particles to move out of position. This pulse was applied ‘south anterior’, that is, the bird’s head was inserted into the solenoid in such a way that, had the beak been made of iron, a north-seeking pole would have been induced. Immediately after this treatment, the birds were tested for their migratory orientation. The test birds were again Silvereyes, old adult birds that had ample migratory experience and young, inexperienced birds that had been caught almost immediately after fledging in Tasmania and transported to the test site by airplane. The results of the critical test are given in Fig. 9. Before treatment with the pulse, both groups of birds preferred their normal migratory direction slightly east of North. The old, experienced birds responded to the pulse by showing a significant clockwise deflection, now heading slightly south of East [40]. The young, inexperienced birds, in contrast, were largely unaffected and continued in their migratory direction [41]. These findings suggest that the magnetic pulse affected a mechanism that was not innate, but based on experience that the young test birds had not been able to obtain.
Fig. 9. Response of Silvereyes to a brief, strong magnetic pulse; tests under ‘white’ light in the local geomagnetic field. Left: old, experienced birds; right: young, inexperienced birds. The symbols at the periphery of the circle indicate the mean headings of individual birds: open, orientation before pulse treatment; solid, orientation of the same birds after treatment with the magnetic pulse. The arrows represent the respective mean vectors. Other symbols as in Fig. 1 (data from [40,41])
3.3
Two Mechanisms for Different Tasks
The critical tests revealed that migratory birds respond to different wavelengths of light as well as to a magnetic pulse, thus producing positive evidence for both of the competing models. At first glance, this may seem puzzling. However, the observation that the pulse affects only experienced migrants gives us a clue as to how to interpret these findings. Large-scale displacement experiments with migrants by the Dutch bird banding station have revealed a fundamental difference in the control of migratory orientation between old, experienced migrants and young, inexperienced migrants: while young first-time migrants follow an innate migration course, experienced migrants that are familiar with their goal appear to determine their course by navigational processes. After displacement, the experienced birds changed course and compensated for the displacement, whereas the young birds continued in their innate migratory direction, flying parallel to their normal migration route [42]. In view of this, we may interpret the data concerning the two hypotheses in the following way: light-dependent processes, affecting experienced and inexperienced birds alike, appear to involve an innate mechanism – this points to the magnetic compass. The pulse, on the other hand, seems to interfere with a mechanism that is based on experience – this indicates that the ability to navigate, in particular magnetic components of the navigational ‘map’ of birds, may be involved. In summary, the available data suggest that birds possess two types of magnetoreceptors providing them with two different types of magnetic information: a light-dependent mechanism, as outlined by the radical pair model, mediating directional information for compass orientation, and a magnetite-based mechanism that appears to provide navigational information, indicating position.
References
1. E. J. Chernosky, P. F. Fougere, R. O. Hutchison, in Handbook of Geophysics and Space Environments, S. L. Valley (Ed.) (McGraw-Hill, New York 1966) p. 11.
2. R. Nennig, Beitr. Gesch. Technik Ind. 21, 25 (1931).
3. R. R. Baker, Human Navigation and Magnetoreception (Manchester University Press, Manchester 1989).
4. R. Wiltschko, W. Wiltschko, Magnetic Orientation in Animals (Springer Verlag, Berlin, Heidelberg 1995).
5. P. Berthold, Zoology 101, 235 (1998).
6. W. Wiltschko, R. Wiltschko, Science 176, 62 (1972).
7. W. Wiltschko, in Animal Migration, Navigation, and Homing, K. Schmidt-Koenig, W. T. Keeton (Eds.) (Springer Verlag, Berlin, Heidelberg 1978) p. 302.
8. S. Marhold, W. Wiltschko, H. Burda, Naturwissenschaften 84, 421 (1997).
9. J. B. Phillips, Science 233, 765 (1986).
10. C. Viguier, Revue Philosophique de la France et de l’Etranger 14, 1 (1982).
11. C. Walcott, in Animal Migration, Navigation, and Homing, K. Schmidt-Koenig, W. T. Keeton (Eds.) (Springer Verlag, Berlin, Heidelberg 1978) p. 142.
12. J. Kiepenheuer, in Avian Navigation, F. Papi, H. G. Wallraff (Eds.) (Springer Verlag, Berlin, Heidelberg 1982) p. 120.
13. W. Wiltschko, D. Nohr, E. Füller, R. Wiltschko, in Biophysical Effects of Steady Magnetic Fields, G. Maret, N. Boccara, J. Kiepenheuer (Eds.) (Springer Verlag, Berlin, Heidelberg 1986) p. 154.
14. C. Walcott, J. Exp. Biol. 70, 105 (1977).
15. E. Visalberghi, E. Alleva, Biol. Bull. 125, 246 (1979).
16. J. B. Phillips, M. J. Freake, J. H. Fischer, S. C. Borland, J. Comp. Physiol. A 188, 157 (2002).
17. L. C. Boles, K. J. Lohmann, Nature 421, 60 (2003).
18. W. Beck, W. Wiltschko, in Acta XIX Congr. Int. Ornithol., H. Ouellet (Ed.) (University of Ottawa Press, Ottawa 1988) p. 1955.
19. T. Fransson, S. Jakobsson, P. Johansson, C. Kullberg, J. Lind, A. Vallin, Nature 414, 35 (2001).
20. K. J. Lohmann, S. D. Cain, S. A. Dodge, C. M. F. Lohmann, Science 294, 364 (2001).
21. K. J. Lohmann, C. M. F. Lohmann, J. Exp. Biol. 194, 23 (1994).
22. K. J. Lohmann, C. M. F. Lohmann, Nature 380, 59 (1996).
23. K. Schulten, Festkörperprobleme 22, 60 (1982).
24. T. Ritz, S. Adem, K. Schulten, Biophysic J. 78, 707 (2000).
25. H. A. Lowenstam, Geol. Soc. Am. Bull. 73, 435 (1962).
26. R. P. Blakemore, Science 19, 377 (1975).
27. J. L. Kirschvink, D. S. Jones, B. J. McFaddon (Eds.), Magnetite Biomineralization and Magnetoreception in Organisms (Plenum Press, New York 1985).
28. G. Fleissner, E. Holtkamp-Rötzler, M. Hanzlik, M. Winklhofer, G. Fleissner, N. Petersen, W. Wiltschko, J. Comp. Neurol. 458, 350 (2003).
29. E. D. Yorke, J. Theor. Biol. 77, 101 (1979).
30. J. L. Kirschvink, J. L. Gould, BioSystems 13, 181 (1981).
31. J. L. Kirschvink, Bioelectromagnetics 10, 239 (1989).
32. D. T. Edmonds, Proc. R. Soc. Lond. B 263, 295 (1996).
33. V. P. Shcherbakov, M. Winklhofer, Europ. Biophys. J. 28, 380 (1999).
34. W. Wiltschko, R. Wiltschko, Nature 291, 433 (1981).
35. W. Wiltschko, U. Munro, H. Ford, R. Wiltschko, Nature 364, 525 (1993).
36. U. Munro, J. A. Munro, J. B. Phillips, W. Wiltschko, Austr. J. Zool. 45, 189 (1997).
37. W. Wiltschko, R. Wiltschko, Naturwissenschaften 89, 445 (2002).
38. W. Wiltschko, J. Traudt, O. Güntürkün, H. Prior, R. Wiltschko, Nature 419, 467 (2002).
39. R. C. Beason, W. J. Brennan, J. Exp. Biol. 125, 49 (1986).
40. W. Wiltschko, U. Munro, R. C. Beason, H. Ford, R. Wiltschko, Experientia 50, 697 (1994).
41. U. Munro, J. A. Munro, J. B. Phillips, R. Wiltschko, W. Wiltschko, Naturwissenschaften 84, 26 (1997).
42. A. C. Perdeck, Ardea 46, 1 (1958).
43. W. Wiltschko, R. Wiltschko, Erithacus rubecula, J. Comp. Physiol. A 184, 295 (1999).
44. W. Wiltschko, M. Gesson, R. Wiltschko, Naturwissenschaften 88, 387 (2001).
Non-contact Measurement of Thin-Film Conductivity by IR Spectroscopy
Gerhard Fahsold and Annemarie Pucci
Kirchhoff-Institut für Physik, Universität Heidelberg, Im Neuenheimer Feld 227, 69120 Heidelberg, Germany
Abstract. Charge transport in metal thin films depends on the electronic structure of the metal, the mesoscopic morphology of the films, and on the microscopic roughness and the chemical nature of the interfaces. These different contributions are accessible with the measurement of the dynamic conductivity. With the knowledge of its frequency dependence in the infrared (IR), two fundamental parameters of metallic conduction are available: the plasma frequency and the relaxation of charge carriers. This makes it possible to distinguish between effects from bandstructure properties and effects from charge carrier scattering, and also to monitor the development of morphology during growth of a metal thin film. Using IR-spectroscopy, the dynamic conductivity can be measured in a non-contact mode. IR-spectroscopy can be performed during thin film growth, adsorbate exposure or other steps of film preparation, which is demonstrated with results from growth of Fe on MgO and of Cu on Si and KBr, from exposure of these films to adsorbates (CO, O2), and from Cu homoepitaxy.
1
Introduction
The elastic mean free path of electrons in bulk metals, e.g. Cu at room temperature, is several tens of nanometers [1]. As a consequence, if current is carried by metal nanostructures, the resistance it experiences is strongly influenced by the structural and the chemical nature of the nanostructure interfaces with the environment. The so-called surface resistivity [2] causes size effects in metallic nanowires [3,4]. It is of fundamental importance for an understanding of nanostructure electronics and for the development of nanostructure-based applications. Surface resistivity and adsorbate induced resistivity have been demonstrated in DC-resistivity measurements [5,6,7,8], IR-broadband reflectivity experiments [6,7], and by optical spectroscopy of plasmon resonances [9]. Apart from surface scattering, grain-boundary scattering has also been shown to cause size effects in nanowire resistivity [3,4]. The charge transport due to the surface electronic structure [10,11] should increasingly contribute to nanostructure conductivity with decreasing size. Another aspect of metal nanostructure conductivity arises from the fact that the de Broglie wavelength of an electron at Fermi energy is often in the range of a few nanometers. If the film morphology diminishes the elastic mean free path of the charge carriers down to the range of this wavelength, effects of charge localization and a transition to non-metallic conductivity are expected [12]. On the other hand, if the elastic mean free path is larger than the film thickness, but this thickness is only several times the de Broglie wavelength, quantum size effects and dimensionality effects can influence the conductivity [13,14,15,16,17]. Therefore, the investigation of low dimensional metal conductivity is of fundamental importance for future nanostructure technologies.
With increasing implementation of metal nanostructures in electronic applications, high throughput methods for monitoring the conductivity of metal nanostructures are needed for a controlled production of such applications. At the same time, advanced quantitative analytical methods for conductivity investigation are needed to push forward the frontiers for nanostructure applications. As we will show in this paper, IR-spectroscopy is a powerful candidate for these two kinds of requirement.
We report on IR-properties of metal thin films on non-metals and on the interpretation of these properties concerning conductivity. Important problems that we will address are: i) How is thin film growth monitored by IR-transmission for film thicknesses below 5 nm? ii) How can IR-spectra of metal thin films be calculated taking account of both the mesoscopic morphology of the film and the intrinsic dynamic conductivity of the metal? iii) What is the influence of the microscopic roughness and the chemical composition of the interfaces on the charge transport in a metal thin film? We will give examples which answer these questions or which demonstrate how IR-spectroscopy can contribute to the answers.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 833–847, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
Dynamic Conductivity of Cu and Fe
The Drude model works for alkali metals and, surprisingly, up to the mid IR also for transition metals. But what is the plasma frequency in the case of a transition metal? In this section we analyze the intrinsic IR-properties of metals, including transition metals, and propose an approximate description of these properties. This description will form the basis for an analysis of the influence of size, morphology, and adsorbates on the conductivity of metal thin films.
2.1
The Drude-Sommerfeld Model
The Drude-type description of the dynamic conductivity of a metal reads in the relaxation-time approximation [1] as

$$\sigma(\omega) = \frac{\sigma_0}{1 - i\omega/\omega_\tau} = \frac{\varepsilon_0\,\omega_P^2}{\omega_\tau - i\omega}, \tag{1}$$

with the DC-conductivity σ0, the circular frequency ω, the relaxation rate ωτ, and the plasma frequency ωP [1]. Classically, ωP is related to the density n of the charge carriers and their mass m by

$$\omega_P^2 = \frac{n e^2}{\varepsilon_0\, m}, \tag{2}$$

where ε0 is the dielectric constant of vacuum and e is the charge of an electron. It was Sommerfeld who realized that the electron gas in a metal is ruled by Fermi statistics, which demanded a reformulation of charge transport in a solid. It turned out that, at least for metals with an isotropic Fermi velocity vF, the dynamic conductivity has the same frequency dependence as given by (1); only the plasma frequency has to be redefined by

$$\omega_P^2 = \frac{e^2}{3\varepsilon_0}\, v_F^2\, g(E_F). \tag{3}$$
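As a numerical cross-check of (2) and (3), the following minimal sketch evaluates both expressions for a free-electron gas. The carrier density `N_CU` (one conduction electron per Cu atom) is a textbook value assumed here, not taken from this paper, and the function names are illustrative. For free electrons the density of states per volume is g(EF) = 3n/(2EF), so both formulas should give the same plasma frequency.

```python
import math

# Physical constants in SI units (CODATA values).
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
HBAR = 1.054571817e-34       # reduced Planck constant, J s
C_CM = 2.99792458e10         # speed of light, cm/s


def plasma_frequency_classical(n, m=M_E):
    """Eq. (2): angular plasma frequency (rad/s) for carrier density n (m^-3)."""
    return math.sqrt(n * E_CHARGE**2 / (EPS0 * m))


def free_electron_parameters(n, m=M_E):
    """Fermi velocity (m/s) and density of states at E_F (1/(J m^3), both spins)
    of a free-electron gas with density n (m^-3)."""
    k_f = (3.0 * math.pi**2 * n) ** (1.0 / 3.0)
    v_f = HBAR * k_f / m
    e_f = 0.5 * m * v_f**2
    g_ef = 3.0 * n / (2.0 * e_f)
    return v_f, g_ef


def plasma_frequency_sommerfeld(v_f, g_ef):
    """Eq. (3): angular plasma frequency (rad/s) from v_F and g(E_F) per volume."""
    return math.sqrt(E_CHARGE**2 * v_f**2 * g_ef / (3.0 * EPS0))


def to_wavenumber(omega):
    """Convert an angular frequency (rad/s) to a wavenumber (cm^-1)."""
    return omega / (2.0 * math.pi * C_CM)


# Assumed input: one conduction electron per Cu atom (textbook density).
N_CU = 8.47e28  # m^-3

v_f_cu, g_cu = free_electron_parameters(N_CU)
wp_classical = plasma_frequency_classical(N_CU)
wp_sommerfeld = plasma_frequency_sommerfeld(v_f_cu, g_cu)

print(f"v_F         = {v_f_cu:.3e} m/s")
print(f"omega_P (2) = {to_wavenumber(wp_classical):.0f} cm^-1")
print(f"omega_P (3) = {to_wavenumber(wp_sommerfeld):.0f} cm^-1")
```

The difference between (2) and (3) only shows up once vF and g(EF) are taken from a real band structure instead of the free-electron expressions used in this sketch.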
In this formulation the classical parameters are replaced by the electronic structure parameters vF and the density of states g(EF) at Fermi energy. Values for these parameters are frequently calculated by applying the free-electron (FE) model, which assumes an integer number of valence electrons per atom [1]. The resulting values for the Drude parameters ωP and ωτ are in discrepancy with experimental (EX) values [18]. In Table 1 we compare these parameters for two standard metals, the weakly-bound electron system Cu and the transition metal Fe. For Cu and for Fe, the FE value of ωP² is about a factor of 2 and of 14, respectively, above the EX value. Similarly, the FE values for g(EF) are well above the EX values as calculated from, e.g., the electronic contribution to the specific heat (see Table 1). A significantly better description of EX data for both ωP² and g(EF) is achieved with values from bandstructure (BS) calculations [19]. The remaining misfit is probably mainly due to the fact that (3) ignores the anisotropy of the crystalline metal. Regarding the BS of Fe, e.g., various Fermi surfaces with both s,p-like and d-like character of charge carriers are found [19].
2.2
Dynamic Drude Parameters
While in the far infrared the dielectric properties of most metals follow a Drude-type frequency dependence, already in the mid IR many metals show strong deviation from that behavior [18]. To gain insight into this deviation and to model thin film spectra, we took tabulated data for the dielectric function ε(ω) [18] to calculate the Drude parameters ωP(ω) and ωτ(ω) in the whole IR-range. For this calculation, (1),

$$\varepsilon(\omega) = \varepsilon_\infty + \frac{i\,\sigma(\omega)}{\varepsilon_0\,\omega}, \tag{4}$$

and ε∞ = 1 were used [20]. The result is that even for Cu the parameters exhibit a significant dependence on frequency (Fig. 1).
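One way to carry out this inversion, given here as a sketch and not necessarily the authors' procedure, is to insert (1) into (4) and separate the real and imaginary parts of ε = ε1 + iε2. The two measured quantities ε1(ω) and ε2(ω) then fix the two Drude parameters at each frequency:

```latex
\begin{align*}
\varepsilon(\omega) &= \varepsilon_\infty - \frac{\omega_P^2}{\omega^2 + i\,\omega\,\omega_\tau}
\quad\Longrightarrow\quad
\varepsilon_1 = \varepsilon_\infty - \frac{\omega_P^2}{\omega^2 + \omega_\tau^2},
\qquad
\varepsilon_2 = \frac{\omega_P^2\,\omega_\tau}{\omega\,(\omega^2 + \omega_\tau^2)},\\[4pt]
\omega_\tau(\omega) &= \frac{\omega\,\varepsilon_2(\omega)}{\varepsilon_\infty - \varepsilon_1(\omega)},
\qquad
\omega_P^2(\omega) = \bigl(\varepsilon_\infty - \varepsilon_1(\omega)\bigr)\,\bigl(\omega^2 + \omega_\tau^2(\omega)\bigr).
\end{align*}
```

For a true single-Drude metal these expressions return frequency-independent constants, so any frequency dependence of the extracted parameters measures the deviation from simple Drude behavior.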
Table 1. Parameters for calculating IR-optical properties for Cu and Fe: density of states g(EF) at Fermi energy, Fermi velocity vF, plasma frequency ωP, and relaxation rate ωτ. The values for g(EF) and vF are from the free-electron (FE) model [1] or they are average values from bandstructure (BS) calculations [19]. The corresponding values for ωP are calculated using (3) and for ωτ using (1) and experimental DC-conductivity values [1]. Values in brackets are for majority and minority electrons in the case of ferromagnetic Fe. For comparison, experimental (EX) values are given for g(EF) from specific heat measurements [1] and for ωP and ωτ from far-IR reflectivity data [18]. Items in the lowest line are used for a two-band (TB) description of the frequency dependence of ωP(ω) and ωτ(ω) (Fig. 1)

| metal | source | g(EF) [1/(eV atom)] | vF [10^6 m/s] | ωP [10^3 cm−1] | ωτ [cm−1] |
|-------|--------|---------------------|---------------|----------------|-----------|
| Cu | FE | 0.214 | 1.57 | 87.3 | 221 |
| Cu | BS | 0.29 | 1.13 | 73.5 | 157 |
| Cu | EX | 0.26 | - | 59.6 | 73 |
| Cu | TB | - | - | 57/39 | 60/1400 |
| Fe | FE | 0.269 | 1.98 | 123.4 | 2251 |
| Fe | BS (↓/↑) | 1.15 (0.24/0.87) | 0.39 (0.45/0.37) | 49.3 (37.3/59.0) | 357 |
| Fe | EX | 2.0 | - | 33.0 | 147 |
| Fe | TB | - | - | 30/110 | 120/40000 |

[Figure: ωP (10^3 cm−1) and ωτ (cm−1) versus wavenumber (cm−1) for Cu and Fe]
Fig. 1. Wave number dependent Drude parameters ωP(ω) and ωτ(ω) as calculated from tabulated dielectric function data for Cu and for Fe [18]. The solid lines are calculated by superimposing two Drude-type systems with parameters as given in Table 1
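The superposition of two Drude-type systems mentioned in the caption of Fig. 1 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' code: the two-band (TB) values for Cu are taken from Table 1, ε∞ = 1 is used, and the function names are hypothetical. The dielectric function of two independent Drude terms is built from (1) and (4) and then re-interpreted as a single Drude term at each wavenumber.

```python
def drude_epsilon(w, bands, eps_inf=1.0):
    """Dielectric function from (1) and (4) for a sum of independent Drude terms.

    w and all band parameters share one unit (here cm^-1);
    bands is a list of (omega_P, omega_tau) pairs.
    """
    return eps_inf - sum(wp**2 / (w**2 + 1j * w * wt) for wp, wt in bands)


def effective_drude_parameters(w, eps, eps_inf=1.0):
    """Single-Drude parameters (omega_P, omega_tau) that reproduce eps at w."""
    wt = w * eps.imag / (eps_inf - eps.real)
    wp = ((eps_inf - eps.real) * (w**2 + wt**2)) ** 0.5
    return wp, wt


# Two-band (TB) entries for Cu from Table 1, in cm^-1.
CU_TWO_BAND = [(57e3, 60.0), (39e3, 1400.0)]

for w in (10.0, 100.0, 1000.0, 5000.0):
    wp_eff, wt_eff = effective_drude_parameters(w, drude_epsilon(w, CU_TWO_BAND))
    print(f"{w:7.0f} cm^-1: omega_P = {wp_eff:8.0f} cm^-1, "
          f"omega_tau = {wt_eff:7.1f} cm^-1")
```

Although each band has constant parameters, the effective ωP and ωτ returned by this sketch rise with wavenumber, which is the qualitative behavior the two-band description is meant to capture.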
What is the reason for the frequency dependence of ωP(ω) and ωτ(ω)? In the mid IR (ω ≤ 5000 cm−1), for Cu, but also for Fe, interband transitions are regarded as playing a minor role [19,21]. Also electron-electron scattering, as calculated for a FE gas, is not expected to have this strong influence [1]. The influence of electron-phonon scattering on the mass of charge carriers and on the IR-relaxation has been calculated for aluminum by Young [22] and for iron by Allen [23]. The effective mass enhancement factors found there might explain the reduction of the plasma frequency going from above the Debye frequency to below the Debye frequency. Nevertheless, the effects are insufficient for a description of the strong frequency dependencies of the relaxation rates. We propose another explanation. A superposition of at least two different Drude-type systems of charge carriers, without any frequency dependence in the Drude parameters and even without any interaction, will show an effective dynamic conductivity which could be described with (1) only if a frequency dependence of ωP and of ωτ was introduced. Already a two-band (TB) model, using the parameters given in Table 1, can account for most of the frequency dependence of ωP and of ωτ (see solid lines in Fig. 1). For iron, the spin-split bandstructure offers a designation of the two bands as majority electrons and minority electrons. The large relaxation value of 40 000 cm−1 for the second band (see Table 1) could be due to scattering between d-like and sp-like electrons. However, due to the onset of interband transitions in the near IR this value should be looked at with care. For Cu, the two bands might be due to the ‘body’ and the ‘necks’ of the Fermi surface [1] which contribute differently to conductivity. As the ‘necks’ penetrate the Brillouin-zone boundary, umklapp-scattering [1] should be strongly enhanced for the ‘neck’ electrons compared to the electrons in the inner part of the Brillouin zone. This could explain the large relaxation value of 1400 cm−1 in the copper case (see Table 1).
2.3
Typical Length Scales
Regarding IR-optical properties of nanostructured solids, several length scales are of importance. We define the distance lp(ω), which an electron with vF (see Table 1) propagates during a cycle of the IR-field, as

$$l_p(\omega) = v_F/\omega. \tag{5}$$

For local optics it is necessary that lp(ω) is below the skin depth δ(ω),

$$\delta(\omega) = c/\omega_P(\omega). \tag{6}$$

This necessity is met for the whole IR for Fe, and in the case of Cu except for the far IR (Fig. 2) [24]. An estimation of the range over which structural distortions of the solid are experienced by the charge carriers is given by the mean free path λb. For an electron with velocity vF in a bulk metal, this length is [25]

$$\lambda_b(\omega) = v_F/\omega_\tau(\omega). \tag{7}$$
In the mid IR, the values for λb reach several nanometers for Fe and several tens of nanometers for Cu (Fig. 2). As a consequence and as stated in the introduction, a large fraction of charge carriers in metal nanostructures is influenced by the structural nature of the interfaces. For modelling charge transport in a homogeneous thin film of thickness d, the validity of d < λb justifies the assumption of a homogeneous conductivity within the metal. To demonstrate the consequences of surface scattering in thin films (Fig. 2) we increased the relaxation rate by 2000 cm−1. This is a typical value for additional scattering in various films of a few nanometers thickness [43]. The small values for λf (and also for λb) that occur above 5000 cm−1 for Fe correspond to the breakdown of the Drude model in the optical region. Note that, due to surface scattering, λf ≪ δ in the whole frequency region, which is the reverse of what is usually assumed for an ideal metal.

[Figure: length (nm) versus wavenumber (cm−1); panels for Cu and Fe with curves δ, lp, λb, and λf]
Fig. 2. Frequency dependent lengths for Cu and Fe as defined in Sect. 2.3: skin depth δ, propagation length lp, and mean free path λb in bulk and λf in a thin film with an additional scattering of 2000 cm−1
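The length scales (5) to (7) are easy to evaluate once wavenumbers are converted to angular frequencies. The sketch below is only an order-of-magnitude illustration: it uses the BS values of vF and the EX values of ωP and ωτ from Table 1 as frequency-independent stand-ins (an assumption, since the text treats both Drude parameters as frequency dependent), and the helper names are hypothetical.

```python
import math

C_M = 2.99792458e8  # speed of light, m/s


def angular_frequency(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to an angular frequency in rad/s."""
    return 2.0 * math.pi * C_M * 100.0 * wavenumber_cm


def propagation_length(v_f, wavenumber_cm):
    """Eq. (5): l_p = v_F / omega, in m."""
    return v_f / angular_frequency(wavenumber_cm)


def skin_depth(wp_cm):
    """Eq. (6): delta = c / omega_P, in m."""
    return C_M / angular_frequency(wp_cm)


def mean_free_path(v_f, wt_cm):
    """Eq. (7): lambda = v_F / omega_tau, in m."""
    return v_f / angular_frequency(wt_cm)


# Stand-in parameters from Table 1 (v_F from BS; omega_P, omega_tau from EX).
METALS = {
    "Cu": {"v_f": 1.13e6, "wp": 59.6e3, "wt": 73.0},
    "Fe": {"v_f": 0.39e6, "wp": 33.0e3, "wt": 147.0},
}
SURFACE_SCATTERING = 2000.0  # additional relaxation rate in cm^-1 (Sect. 2.3)

for name, p in METALS.items():
    l_p = propagation_length(p["v_f"], 1000.0)  # at 1000 cm^-1
    delta = skin_depth(p["wp"])
    lam_b = mean_free_path(p["v_f"], p["wt"])
    lam_f = mean_free_path(p["v_f"], p["wt"] + SURFACE_SCATTERING)
    print(f"{name}: l_p = {l_p * 1e9:.1f} nm, delta = {delta * 1e9:.1f} nm, "
          f"lambda_b = {lam_b * 1e9:.1f} nm, lambda_f = {lam_f * 1e9:.2f} nm")
```

With these stand-in values the film mean free path λf comes out well below the skin depth δ for both metals, in line with the remark closing this section.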
3
In-Situ IR-Spectroscopy: Experiment and Calculation
In the mid IR, metal ultrathin films may be investigated by spectroscopy of transmitted IR-light. Even for a homogeneous film of a well conducting metal (e.g. Cu) with a thickness of 20 nm the IR-transmission is still several percent, provided that the substrate is IR-transparent. Concerning technological applications, the transmission measurement offers the advantage of demanding much less stability of the experimental setup than in the case of reflection measurement. In combination with a model for the charge transport, e.g. the Drude model, the transmission spectrum measured at a single angle of incidence is sufficient for a complete determination of the complex dynamic conductivity in the investigated range of frequencies. If the measurement is done at normal incidence, in-plane/out-of-plane anisotropies of the metal thin film need not be considered.
For in-situ IR-transmission spectroscopy of metal thin films, an FTIR-spectrometer (Bruker, IFS 66v/s) is connected to an ultrahigh-vacuum (UHV) chamber which is equipped for thin film preparation and surface structural characterization. For a high stability of the measurement the whole optical path is in vacuum. Detection of light is done with a mercury-cadmium-telluride (MCT) detector. Film thicknesses are calculated from deposition rates which are measured before and after thin film preparation. A set of spectra which was measured during deposition of Cu on a homogeneous Cu film is shown in Fig. 3. It demonstrates the submonolayer sensitivity of the IR-transmission T(ω) (and it will be analyzed in Sect. 4.4). In this case, roughly ∂T/∂ω < 0 is found. This is different for growth of metals on insulators, where an optical crossover is typically observed, i.e. a transition from ∂T/∂ω < 0 to ∂T/∂ω > 0 occurs for the whole IR-range at a thickness ∼ d0 (Fig. 4). In Sect. 4.1 we will come back to the interpretation of this characteristic behavior.
[Figure: relative transmittance versus wavenumber (cm−1); Cu+Cu/Si(111), 103 K; labelled total film thicknesses 5.0, 5.2, 5.4, 5.6, 5.8, 5.9 nm]
Fig. 3. IR-transmission spectra as measured during deposition of Cu on an annealed 5-nm-Cu film on Si(111) in UHV and at 103 K. The spectra are normalized to the transmission of the annealed film. For the spectra shown as thick lines the total film thicknesses are labelled. One spectrum was sampled in 8 s. This series of spectra demonstrates the submonolayer sensitivity of IR-transmission spectroscopy
[Figure: transmittance versus film thickness (nm) at 1000 cm−1 and 3000 cm−1; substrates KBr, Si[I], and Si[II]]
Fig. 4. IR-transmission (normalized to the transmission of the substrate) vs. film thickness, as taken from spectra which were measured during deposition of Cu in UHV and at ∼100 K. The substrates are KBr(001) and two differently prepared Si(111) surfaces [26]. The curves indicate the different growth mode on KBr and on Si, and even a difference in the growth on the two Si surfaces is detected
For the analysis of thin film spectra, adsorbate induced changes, and also morphology effects, we use the Drude-type dielectric function (see (1) and (4)) and commercial software to treat multiple reflections at the thin film interfaces [27]. We simulate thin film properties by setting the Drude parameters as

$$\omega_P(\omega, d) = \beta(d)\,\omega_{P\,\mathrm{bulk}}(\omega) \tag{8}$$

and as

$$\omega_\tau(\omega, d) = \omega_{\tau s}(d) + \omega_{\tau\,\mathrm{bulk}}(\omega), \tag{9}$$

with the Drude parameters ωP bulk and ωτ bulk for the bulk metal (Sect. 2.2). The only two fit parameters are the scaling factor β and the surface scattering rate ωτs. A motivation for β and for ωτs is given in [28] and in [26]. In short, β very roughly accounts for quantum size effects but also for depolarization of films formed from islands; ωτs is the additional surface scattering rate, and it may be added to ωτ bulk if surface scattering does not interfere with bulk scattering. Following concepts for the classical size effect (CSE) [29], we used a surface roughness parameter

$$\alpha(d) = \omega_{\tau s}(d) \cdot 2d/v_F, \tag{10}$$

which corresponds to the non-specularity (1 − p) used for the CSE. The agreement achieved between our calculations and the measured spectra is demonstrated in Fig. 5. In the following we want to give interpretations of the parameters β and ωτs which were gained from fit procedures.
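The parametrization (8) to (10) can be written down in a few lines. The sketch below is illustrative only: the bulk values are the BS entries for Fe from Table 1 treated as frequency independent, while β = 1, ωτs = 100 cm−1 and d = 4 nm are hypothetical inputs chosen for the example, not fit results from this work. It also evaluates the ratio of the static film conductivity to the bulk value that follows from the ω → 0 limit of (1).

```python
import math

C_M = 2.99792458e8  # speed of light, m/s


def film_drude_parameters(wp_bulk, wt_bulk, beta, wt_s):
    """Eqs. (8) and (9): film plasma frequency and relaxation rate (cm^-1)."""
    return beta * wp_bulk, wt_s + wt_bulk


def roughness_parameter(wt_s_cm, d, v_f):
    """Eq. (10): alpha = omega_tau_s * 2d / v_F (dimensionless).

    wt_s_cm is a wavenumber in cm^-1 and is converted to rad/s;
    d is in m and v_f in m/s.
    """
    wt_s = 2.0 * math.pi * C_M * 100.0 * wt_s_cm
    return wt_s * 2.0 * d / v_f


def static_conductivity_ratio(beta, wt_bulk, wt_s):
    """sigma_film / sigma_bulk in the static limit of (1) with (8) and (9)."""
    return beta**2 * wt_bulk / (wt_bulk + wt_s)


# Bulk Fe values from Table 1 (BS); beta, wt_s and d are hypothetical inputs.
V_F_FE = 0.39e6       # m/s
WP_BULK_FE = 49.3e3   # cm^-1
WT_BULK_FE = 357.0    # cm^-1
BETA = 1.0
WT_S = 100.0          # cm^-1
D_FILM = 4e-9         # m

wp_film, wt_film = film_drude_parameters(WP_BULK_FE, WT_BULK_FE, BETA, WT_S)
alpha = roughness_parameter(WT_S, D_FILM, V_F_FE)
ratio = static_conductivity_ratio(BETA, WT_BULK_FE, WT_S)

print(f"omega_P(film)   = {wp_film:.0f} cm^-1")
print(f"omega_tau(film) = {wt_film:.0f} cm^-1")
print(f"alpha           = {alpha:.3f}")
print(f"sigma_film/sigma_bulk = {ratio:.3f}")
```

Because α scales with both ωτs and d, the same surface scattering rate corresponds to a smaller roughness parameter in a thinner film.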
[Figure: transmittance versus wavenumber (cm−1); Cu/Si(111), 102 K; film thicknesses 0 to 5 nm]
Fig. 5. IR-transmission spectra as measured during deposition of Cu on Si(111) in UHV and at 102 K (lines) and as calculated with a Drude-type dynamic conductivity (circles) (see [26] for details of sample preparation). For a thickness below 2 nm, the spectra are not satisfactorily described by the Drude model
4
Charge Transport in Metal Thin Films
In the literature, work on charge transport and IR-properties of really continuous metal ultrathin films is known only for a few systems [5,6,14,16,30,31,32,33,34]. In recent years, the strongest activities have come from the groups of Henzler [14,34], Hasegawa [10,11], and Tobin [6,35]. On oxide substrates, attempts to investigate homogeneous thin films at low film thickness suffered from the strong 3D-island growth (Vollmer-Weber growth mode) [30,36,37,38]. On silicon, the interpretation of conductivity data is often complicated by interdiffusion and silicide formation [39,40]. A detailed study of the growth of Fe on MgO showed that even at room temperature complete substrate coverage can be achieved at a rather small film thickness of ∼1 nm [41]. This is a valuable prerequisite for developing an IR-spectroscopic analysis of ultrathin-film conductivity [26,28,42,43,46]. Furthermore, MgO is a chemically stable large-gap insulator where interdiffusion at interfaces can be neglected for temperatures below 400 K.
4.1
Growth and Percolation
What have we learned from IR-spectra concerning growth of thin films and concerning the percolation transition? For example, we measured IR-spectra during growth of Fe on MgO(001) at 315 K and analyzed them as described in Sect. 3. The finding is that above a thickness dc = 0.8 nm it is possible to calculate the spectra with the Drude-type model using the parameters shown in Fig. 6a. For d > 4 nm, the roughness parameter α is far below unity and the parameter β approaches unity [49]. Obviously, the Vollmer-Weber-type film growth is significantly beyond the percolation transition and a mesoscopically smooth morphology has developed (Fig. 6b). If an optical thickness dopt is defined as the maximum of the local thickness h(x, y, d), and the average thickness is d = $\langle h(x, y, d)\rangle_{x,y}$, the volume filling F = d/dopt should be close to unity. With α ≪ 1, the relaxation corresponds to the CSE for a homogeneous film (i.e. 1 − p < 1) [29].

[Figure: (a) parameters α and β versus thickness d (nm); (b) sketch of the local film thickness h(x, y, d) of a film on a substrate]
Fig. 6. (a) Thin film scattering parameter α and depolarization parameter β [49] used for calculating IR-spectra which were measured during growth of Fe on MgO(001) at 300 K in UHV. For d < 0.8 nm, the IR-spectra cannot be described by a Drude-type dielectric function. (b) Sketch of a thin film growing in the Vollmer-Weber mode. The film is shown at average thicknesses d < d0 (dotted line), d ≈ d0 (dashed line), and d > d0 (solid line), with d0 as the percolation threshold. The local film thickness is denoted as h(x, y, d)

For 0.8 nm < d < 4 nm, we find β < 1. Assuming the absence of quantum-size effects [15,17], this indicates depolarization [26,44] and, therefore, points at a mesoscopic roughness. Such roughness corresponds to the granular morphology of a film in the vicinity of the percolation threshold. In this thickness range, α reaches values far above unity, which cannot be explained by the CSE for a homogeneous film and a specularity p ≥ 0. Due to the small mean free path in iron (a few nanometers, see Sect. 2.3) and the larger typical grain diameter of Fe/MgO (several tens of nanometers [45]), ωτs(d) is an average over local relaxation rates ωτs(h(x, y, d)). Assuming a constant microscopic roughness α0 (and h(x, y, d) > 0), we find as an approximation to the average roughness

$$\alpha(d) = \alpha_0\, d \cdot \left\langle \frac{1}{h(x, y, d)} \right\rangle_{x,y}. \tag{11}$$
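Equation (11) can be illustrated with a toy height profile. The sketch below assumes a one-dimensional cosine modulation h = d(1 + a cos θ), which is a hypothetical stand-in for a granular film and not a morphology taken from this work. For this profile the average of 1/h is known in closed form, ⟨1/h⟩ = 1/(d √(1 − a²)), so the numerical result can be checked.

```python
import math


def average_roughness(alpha0, heights):
    """Eq. (11): alpha = alpha0 * d * <1/h>, with d the mean of the heights."""
    d = sum(heights) / len(heights)
    mean_inverse_height = sum(1.0 / h for h in heights) / len(heights)
    return alpha0 * d * mean_inverse_height


def cosine_profile(d, modulation, n=4000):
    """Hypothetical height profile h = d * (1 + a*cos(theta)) over one period."""
    return [d * (1.0 + modulation * math.cos(2.0 * math.pi * k / n))
            for k in range(n)]


ALPHA0 = 0.5    # assumed microscopic roughness (< 1)
D_MEAN = 1e-9   # assumed average thickness, m

flat = average_roughness(ALPHA0, cosine_profile(D_MEAN, 0.0))
rough = average_roughness(ALPHA0, cosine_profile(D_MEAN, 0.9))

print(f"flat film:  alpha = {flat:.3f}")
print(f"rough film: alpha = {rough:.3f}")
```

Regions with h < d dominate the average of 1/h, so the strongly modulated profile gives an effective α above unity even though α0 is only 0.5.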
If the average thickness d is only a little above the percolation threshold (Fig. 6b), α(d) can easily exceed unity even if α0 < 1 is fulfilled, as regions with h < d are weighted more strongly than regions with h > d. We note that Matthiessen’s rule may not hold for films in the vicinity of the percolation threshold. For d < 0.8 nm, the calculation based on a Drude-type model fails in describing the measured IR-spectra. The films do not behave like a Drude-type metal. Effective-medium models are necessary for calculating the IR-properties at and around the percolation threshold [47,48]. In the following section, we want to compare our results for the conductivity of this system with directly measured DC-resistivity data.
4.2
AC- and DC-Conductivity
We define the spectroscopic value for the DC-conductivity as the static limit of σ(ω, d) (see (1), (8) and (9)),

$$\sigma_{\mathrm{film}}(d) = \lim_{\omega \to 0} \frac{\varepsilon_0\,\beta(d)^2\,\omega_{P\,\mathrm{bulk}}(\omega)^2}{\omega_{\tau\,\mathrm{bulk}}(\omega) + \omega_{\tau s}(d)}. \tag{12}$$

Using the Drude parameters for Fe from bandstructure calculation (Table 1) and an Fe bulk resistivity 1/σbulk = 11.4 µΩcm at 300 K [1], we find a thin film conductivity σfilm(d) which for d > 4 nm is well explained by the CSE [29] using the smallest possible specularity (i.e. p = 0) (Fig. 7). For 0.8 nm < d < 4 nm, the conductivity drops well below the CSE limit, which can be explained with the mesoscopic roughness of the granular film. As a check whether the extrapolation to the static limit really leads to reliable conductivity values, we compare σfilm(d) with values σdc(d) from direct DC-conductivity measurement [50]. For 1 nm < d < 1.3 nm, excellent agreement is achieved (Fig. 7). For d < 1 nm, the DC-conductivity measurement is above the spectroscopic value and even for d < 0.8 nm it is above 10⁻³ σbulk. In this thickness range, a strongly negative temperature coefficient of the resistivity was observed [50], which points at non-metallic conductivity (i.e. the conductivity vanishes at T = 0). This kind of conductivity is not accessible by our Drude-type analysis of IR-spectra. Concerning the growth of Fe on MgO, it is of importance that the substrates for the DC-measurements were polished wafers, while for the spectroscopic experiments cleaved crystals were used. A smaller percolation threshold is expected for the polished crystals due to the higher number of surface defects [42]. Correspondingly, the DC-conductivities should be above the spectroscopic values for the same thickness d.

[Figure: σfilm/σbulk (logarithmic scale) versus thickness d (nm); Fe/MgO(001), T = 315 K]
Fig. 7. Zero frequency extrapolation value σfilm(d) of the dynamic conductivity vs. film thickness d from IR-transmission spectra measured during deposition of Fe on MgO(001) at 315 K (filled squares), DC-conductivity values from [50] (open circles), DC-conductivity for the classical size effect with zero specularity (dashed line), and the critical thickness dc for complete substrate coverage [41]
4.3
Adsorbate Effects
The DC-sheet conductance L(d) = σdc (d) · d of a thin film (d ≤ λf , see Sect. 2.3) may be influenced by adsorbates in three different ways: ∆d ∆ωP2 ∆ωτ + 2 − ∆L = L · (13) d ωP ωτ i.e. by a change in film thickness d (e.g. due to oxidation), a change in the charge carrier oscillator strength ωP2 and a change in the charge carrier scattering ωτ . We want to demonstrate, that IR-spectroscopy is able to distinguish between the change in scattering and the two other mechanisms. We exposed a Cu film (d=5.1 nm) grown on KBr(100) to either of CO and O2 and measured (roughly at monolayer coverage) the induced change in the IR-broadband transmittance (Fig. 8). Clearly, two mechanisms (i.e. parameters) are necessary to describe the measured curves. From calculating these
844
Gerhard Fahsold and Annemarie Pucci
transmittance
1.15 1.10 1.05
O2
5.1 nm Cu / KBr T=100 K
CO
1.00 0.95
1000
2000
3000 -1
wavenumber ( cm )
Fig. 8. Adsorbate induced change of broadband transmittance measured for CO and for O2 on 5.1 nm Cu on KBr (solid lines). Exposure and spectroscopy is at ∼100 K. The CO-vibrational structure at about 2100 cm−1 is not included in the calculation (dashed lines). For further details see [26]
curves we obtain a change ∆ωτ/ωτ of 13.2% for O2 and 13.6% for CO. For ∆ωP²/ωP² we find −4.7% and +1.6% for O2 and CO, respectively. Note that, since the analysis of transmission spectra cannot distinguish between thickness and plasma frequency, we detect a relative change in the total amount of charge carriers (or in the thickness d of the film) as a change in ωP². Here, the two gases have roughly the same effect on the relaxation, but the induced charge transfer is of opposite sign [43]. Concerning gas-sensor applications, for example, a change ∆L does not allow one to distinguish between CO and O2 since ∆d/d is covered by ∆ωτ/ωτ. With IR broadband spectroscopy, the discrimination is possible even if specific vibrational structure is absent. In recent work we found that in continuous but mesoscopically rough films, the IR spectroscopic observation of charge transfer is enhanced [26]. Furthermore, in an even thinner film with more atomic disorder, the relative change due to the scattering effect will decrease while the charge transfer will gain in importance. In the extreme limit of cold-condensed films, the resistivity can even be lowered by adsorption of ethane, for example [53].
4.4
Surface Atom Conductivity
Does a Cu adatom on a smooth Cu surface contribute to conductivity? We investigated this question by depositing Cu on an annealed surface of a Cu film on Si(111) (the deposition temperature is about 100 K, the annealing temperature is 400 K for about one minute). The measured spectra have been shown in Fig. 5. The calculation of these spectra yields changes of the roughness parameter α and the charge carrier parameter β (see Sect. 3). As expected for growth of Cu at 100 K [51], α rapidly increases at the beginning of deposition and saturates at about 0.2 nm, i.e. after roughly a monolayer has grown. A change of β² (i.e. of the density n of charge carriers, see 2) is not expected if the deposited atoms contribute to conductivity in the same way as the atoms of the film do. This is indicated by the result for ∆β². On closer inspection, even slightly positive values (≤ 0.01) are observed. Their origin might be due, in part, to the electronic structure of the Cu(111) surface [52].
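The separation of adsorbate effects discussed in Sect. 4.3 can be checked numerically. The following minimal sketch evaluates Eq. (13) to first order with the fitted relative changes quoted there for O2 and CO; the thickness term is folded into the plasma-frequency term, as in the text. The function and variable names are illustrative and not part of the original analysis.

```python
# First-order evaluation of Eq. (13): dL/L = dd/d + d(wP^2)/wP^2 - d(w_tau)/w_tau.
# Input numbers are the fit results quoted in Sect. 4.3; all names are illustrative.

def relative_sheet_conductance_change(d_thickness, d_plasma_sq, d_scattering):
    """Return dL/L from the three relative changes entering Eq. (13)."""
    return d_thickness + d_plasma_sq - d_scattering

# The thickness change cannot be separated from the plasma-frequency change
# in the transmission analysis, so it is set to zero here.
fits = {
    "O2": {"d_plasma_sq": -0.047, "d_scattering": 0.132},
    "CO": {"d_plasma_sq": +0.016, "d_scattering": 0.136},
}

changes = {
    gas: relative_sheet_conductance_change(0.0, p["d_plasma_sq"], p["d_scattering"])
    for gas, p in fits.items()
}

for gas, change in changes.items():
    print(f"{gas}: dL/L = {change:+.1%}")
```

Under these assumptions both gases lower the sheet conductance by a similar amount, which illustrates why a DC measurement of ∆L alone does not identify the gas, whereas the sign of the ωP² term does.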
Fig. 9. Change of roughness α and plasma frequency parameter β, plotted against the deposited thickness (nm), as calculated from the spectra in Fig. 3. The deposition period (start, stop) is indicated. Before and after, the thickness is constant
5
Summary
The infrared properties of bulk Cu and Fe can be described by a Drude-type model using frequency-dependent values for the plasma frequency and for the relaxation rate. In the far IR, these values are almost constant and they correspond to bandstructure data. The frequency dependence of the Drude parameters in the mid IR is probably due to electron-phonon scattering and to a superposition of at least two different types of charge carriers, which is reasonable regarding the complex bandstructures at the Fermi energy. The relevant length scales for the radiation field and for the charge carrier propagation indicate that in the mid IR the metals may be treated in local optics and that, to a good approximation, the charge carriers in thin films may be treated as plasmons. In-situ IR spectroscopy during growth of a thin film allows one to monitor the onset of its metallic conduction and, therefore, the development of its average mesoscopic morphology. The influence of adsorbates on the resistivity of a thin film can be separated into charge transfer or bandstructure effects and scattering effects. Finally, the contribution of metal atoms at the surface of a metal thin film can be analyzed. The given examples demonstrate that broadband IR spectroscopy is a powerful tool for fast non-contact investigation of thin film conductivity phenomena.
Acknowledgements
We gratefully acknowledge stimulating conversations with A. Otto, B.N.J. Persson, and R.G. Tobin. This work was supported by the Deutsche Forschungsgemeinschaft (DFG).
References
1. N.W. Ashcroft and N.D. Mermin, Solid State Physics (Holt-Saunders International Editions, New York, 1976)
2. B.N.J. Persson, Surf. Sci. 269-270, 103 (1992)
3. C. Durkan and M.E. Welland, Phys. Rev. B 61, 14215 (2000)
4. W. Steinhögl, G. Schindler, G. Steinlesberger, and M. Engelhardt, Phys. Rev. B 66, 75414 (2002)
5. P. Wißmann, in Surface Physics, Springer Tracts in Modern Physics, vol. 77, ed. by G. Höhler (Springer, New York 1975)
6. R.G. Tobin, Surf. Sci. 502-503, 374 (2002)
7. M. Hein, P. Dumas, A. Otto, and G.P. Williams, Surf. Sci. 419, 308 (1999)
8. G.A. Fried, Y. Zhang, and P.W. Bohn, Thin Solid Films 401, 171 (2001)
9. T. Brandt, W. Hoheisel, A. Iline, F. Stiez, and F. Träger, Appl. Phys. B 65, 793 (1997)
10. S. Hasegawa, F. Grey, Surf. Sci. 500, 84 (2000)
11. S. Hasegawa, J. Condens. Matter 12, R463 (2000)
12. M. Henzler, T. Lüer, and A. Burdach, Phys. Rev. B 58, 10046 (1998)
13. I. Vilfan, M. Henzler, O. Pfennigsdorf, H. Pfnür, Phys. Rev. B 66, 241306 (2002)
14. O. Pfennigstorf, K. Lang, H.-L. Günter, and M. Henzler, Appl. Surf. Sci. 162, 537 (2000)
15. N. Trivedi and N.W. Ashcroft, Phys. Rev. B 38, 12298 (1988)
16. M. Jałochowski, E. Bauer, Phys. Rev. 37, 8622 (1988) and M. Jałochowski, E. Bauer, Phys. Rev. B 38, 5272 (1988)
17. Z. Tesanovic, M. V. Jaric, S. Maekawa, Phys. Rev. Lett. 57, 2760 (1986)
18. M.A. Ordal, R.J. Bell, R.W. Alexander, Jr., L.L. Long, and M.R. Querry, Appl. Opt. 24, 4493 (1985)
19. D.A. Papaconstantopoulos, Handbook of the band structure of elemental solids (Plenum Press, New York and London 1986); see also Table 1 in [23].
20. ε∞ = 1 is useful only up to the IR [18]
21. J.H. Weaver, E. Colavita, D.W. Lynch, R. Rosei, Phys. Rev. B 19, 3850 (1997)
22. C. Young, J. Phys. Chem. Solids 30, 2765 (1969)
23. B. P. Allen, Phys. Rev. B 36, 2920 (1987)
24. B.N.J. Persson and A.I. Volokitin, Surf. Sci. 310, 314 (1994)
25. This is a rough approximation, as small-angle scattering and umklapp scattering have strongly different effects on conductivity [1]
26. G. Fahsold, M. Sinther, A. Priebe, S. Diez, A. Pucci, in preparation.
27. W. Theis, software SCOUT 2 (M.Theiss Hard- and Software, Aachen, Germany, 2000)
28. G. Fahsold, A. Bartel, O. Kraut, N. Magg, A. Pucci, Phys. Rev. B 61, 14108 (2000)
29. E.H. Sondheimer, Adv. Phys. 1, 1 (1952)
30. M.-L. Thèye, Phys. Rev. B 2, 3060 (1970)
31. L.A. Kuzik, V.A. Yakovlev, F.A. Pudonin, G. Mattei, Surf. Sci. 361/362, 882 (1969)
32. L.A. Kuzik, Y.E. Petrov, F.A. Pudonin, V.A. Yakovlev, JETP 78, 114 (1994)
33. P.F. Henning, C.C. Homes, S. Maslov, G.L. Carr, D.N. Basov, B. Nikolić, and M. Strongin, Phys. Rev. Lett. 83, 4880 (1999)
34. O. Pfennigstorf, A. Petkova, H.-L. Günther, and M. Henzler, Phys. Rev. B 65, 45412 (2002)
35. E.T. Krastev, L.D. Voice, and R.G. Tobin, J. Appl. Phys. 79, 6865 (1996)
36. H. E. Bennett, J. M. Bennett, E. J. Ashley, R. J. Motyka, Phys. Rev. 165, 755 (1968)
37. C. A. Davis, D. R. McKenzie, R. C. McPhedran, Opt. Commun. 85, 70 (1981)
38. Y. Yagil, P. Gadenne, C. Julien, G. Deutscher, Phys. Rev. B 40, 2503 (1992)
39. R. Schad, F. Jentzsch, and M. Henzler, J. Vac. Technol. B 10, 1177 (1992)
40. Z.H. Zhang, S. Hasegawa, and S. Ino, Surf. Sci. 415, 363 (1998)
41. G. Fahsold, A. Pucci, and K.-H. Rieder, Phys. Rev. B 61, 8475 (2000)
42. G. Fahsold, A. Priebe, N. Magg, A. Pucci, Thin Solid Films 364, 177 (2000)
43. G. Fahsold, M. Sinther, A. Priebe, S. Diez, and A. Pucci, Phys. Rev. B 65, 235408 (2002)
44. Berthier and K. Driss-Khodja, Opt. Commun. 70, 29 (1989)
45. G. Fahsold, A. Priebe, A. Pucci, Appl. Phys. A 73, 39 (2001)
46. G. Fahsold, A. Bartel, O. Krauth, and A. Lehmann, Surf. Sci. 433-435, 162 (1999)
47. X. Zhang and D. Stroud, Phys. Rev. B 52, 2131 (1995)
48. A. Priebe, M. Sinther, G. Fahsold, and A. Pucci, submitted to Journal of Chemical Physics (2003)
49. For homogeneous iron films with d > 4 nm, we always find that ωP(ω) is a factor of ∼1.27 bigger than calculated from the dielectric data from Ref. [18]. Hence, we regard this increased value as the value for bulk iron.
50. C. Liu, Y. Park, S. D. Bader, J. Magn. Magn. Mat. 111, L225 (1992)
51. W. Wulfhekel, N.N. Lipkin, J. Kliewer, G. Rosenfeld, L.C. Jorritsma, B. Poelsema, and G. Comsa, Surf. Sci. 348, 227 (1996)
52. S.D. Kevan, Phys. Rev. Lett. 50, 526 (1983)
53. H. Grabhorn, A. Otto, D. Schumacher, and B.N.J. Persson, Surf. Sci. 264, 327 (1992)
ReWritable Data Storage on DVD by Using Phase Change Technology
H. Kleine, F. Martin, M. Kapeller, B. Cord, and H. Ebinger
Singulus Technologies AG, 63796 Kahl, Germany
[email protected]
Abstract. It is expected that within the next few years the VHS cassette will be replaced by rewritable Digital Versatile Discs (DVD) for home video recording. At this moment three different standards exist (DVD+RW, DVD-RW and DVD-RAM), of which DVD+RW is expected to dominate the market in Europe and the United States. The disc holds 4.7 GB of computer data, which is equivalent to several hours of high-quality video content. At the heart of the disc is a thin film layer stack with a special phase change recording layer. By proper laser irradiation the disc can be overwritten up to 1000 times without noticeable quality loss. A shelf lifetime of 20–50 years is anticipated. With these characteristics the disc is well suited for consumer applications. The present article shows how a process engineer can control the disc recording sensitivity, the recording speed and the number of overwriting cycles by the design of the thin film layer stack.
1
Introduction
The DVD optical disc and drive have to be considered as a storage device family. Both have been jointly developed over many years at the laboratories of leading consumer electronics companies such as Matsushita, Hitachi, Pioneer, Ricoh and Philips. Certain limitations of the drive, such as the diffraction limit for the chosen laser wavelength, have led to special conditions for the optical disc, such as the spacing between neighboring grooves. In the course of the development process, the technology leaders have established coordination groups for setting disc standards and patent licensing: the DVD-Forum for DVD-RW/DVD-RAM and the DVD-Alliance for DVD+RW. Standardization is a key success factor for the optical disc industry, as any compliant disc can be played back on any compliant player in the world. The specification books are available for purchase and allow any company to develop DVD-compliant discs or drives. These book specifications define the limits for, e.g., the disc size and the electrical signals upon readout, but do not disclose how such a performance can be achieved. Among the various types of DVD discs, the rewritable formats DVD-RW, DVD-RAM and DVD+RW based on phase change technology have the highest complexity. While major electronics companies have a substantial research staff to develop the disc manufacturing process technology and to maintain high-quality production, these formats pose a significant technological entry barrier for small and mid-size companies. SINGULUS Technologies is a supplier of optical disc manufacturing equipment. Particularly for manufacturing rewritable optical discs, we had to learn that we have to supply our customers not only with production machines but also with the required process know-how. Therefore we have started research activity on process technology for producing DVD+RW discs, as we expect this format to dominate the market, at least in the western hemisphere, in the future. In this paper we illustrate a few practical process aspects which are of importance for DVD+RW manufacturing. All experiments have been conducted under production conditions on the multilayer sputtering system MODULUS and the manufacturing line SUNLINE developed by our company. Measurements have been conducted on the final discs using a professional PULSTEC DDU-1000 optical disc test system. All parameter variations shown in this paper are rather slight modifications of the production process settings. We kindly ask the reader to excuse that not all details of this process are reported, as is usually done in scientific publications.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 849–859, 2003. © Springer-Verlag Berlin Heidelberg 2003
2
The Structure of a DVD+RW Disc
A rewritable DVD disc consists of two polycarbonate half-sides of 0.6 mm thickness which are bonded together. Figure 1 shows a cross-sectional view of the disc structure. One of the two half-sides (labeled “grooved half-side” in Fig. 1) contains a groove spiral and is sputter coated with a functional thin film layer stack. At the core of this layer stack is a thin phase change layer consisting of a special GeSbTe alloy. This alloy has a crystalline and an amorphous phase, which are both stable at typical ambient temperature conditions and have different reflectivity [1]. By the use of a series of other layers, a thin film system is created which has a strong reflectivity contrast between the amorphous and crystalline state (10 vs. 20 % reflectivity). Furthermore, the thin film system is designed in such a way that a focused pulsed laser beam can switch the phase change layer by using a proper pulse sequence and laser power. Recording properties can be adjusted by
• the choice of materials in the thin film system
• the thicknesses of the various layers
• the pulsing of the laser beam.
A few aspects will be described in more detail below.
3
The Phase Change Recording Process
The laser beam for recording enters the thin film system from the side of the dielectric layer 1 (see Fig. 1). As the dielectric and interface layers do not absorb light, and the reflective layer has a high reflectivity, the laser light is primarily absorbed in the phase change layer. Due to the layer stack design, moderate laser powers of the order of 15 mW are sufficient to heat this layer up to 600–800 °C and melt it within a few tens of nanoseconds. When the laser is switched off, the heat will quickly dissipate through the thin dielectric layer 2 into the thick silver alloy reflective layer. This layer acts as a heat sink, as the heat can quickly dissipate within this film. Accordingly, the phase change layer cools down rapidly and solidifies in an amorphous state, as there is not enough time for crystallization. This means that properly “designed” laser pulses will produce amorphous marks in the phase change layer. Erasing these marks is rather easy, as continuous-wave (cw) laser irradiation at low power around 5–10 mW will heat up the phase change layer above the crystallization temperature of approx. 200 °C. At this temperature the atoms in the phase change layer will locally rearrange to reach a lower-energy crystalline state. This also happens on a timescale of tens of nanoseconds. For theoretical modelling of these processes see [2] and references therein. Writing and erasing marks can be carried out in a single pass of the laser beam (direct overwrite) by proper control of the laser power, which is most convenient for data storage applications.
Fig. 1. The schematic structure of a DVD+RW disc and its thin film layer stack. Labels in the disc cross-section: blank half-side (0.6 mm), bonding material (∼50 µm), layer stack (∼250 nm), grooved half-side (0.6 mm). Labels in the close-up of the layer stack: reflective layer, interface layer 2, dielectric layer 2, phase change layer, interface layer 1, dielectric layer 1; the laser and the direction of heat flow are indicated
4
From the Analog Reflectivity Signal to a Digital Bit Stream
Figure 2 illustrates the relation between the mark structure on the DVD+RW disc and the encoded binary information (an introduction to optical disc systems can be found in [3]). In part (a) a groove is shown, along which amorphous marks of different size are located. These marks have been recorded previously by a focused pulsed laser beam. If the disc spins below the readout laser of a DVD player, the photodiode will detect a reflectivity variation as shown in part (b). While the laser is over a crystalline part of the disc, a high reflectivity is observed. If the laser passes an amorphous mark area, the reflectivity level decreases. Due to the large size of the laser beam compared to the groove or mark width, the reflectivity level depends on the size of a mark structure: the larger a mark, the lower the observed reflectivity level. Furthermore, the transition between low and high reflectivity levels is rather smooth.
Fig. 2. Relation between the schematic disc structure (a) and the corresponding high-frequency (HF) signal at the detector photodiode, plotted as HF signal (V) versus time (ns) (b). By choosing an appropriate slicer level the data content can be extracted from the HF signal (c). The data-to-clock deviation dt is a measure for the signal quality (d)
This signal is now fed into a special electronic circuit for processing into digital data. In this process the different signal reflectivity levels and the smooth transitions need to be converted into precise, equal-level binary signals. For this purpose the average signal level is first calculated and used as a slicer level. Each time the reflectivity signal crosses this slicer level, no matter whether coming from high or low reflectivity, a binary “1” is counted. At the same time, a clock is running in the background at a fixed frequency. Each clock time interval T (T = 1/f) is equivalent to a binary digit. If the crossing of the slicer level falls within a certain T period, then there will be a “1” in the bitstream at this moment, otherwise a “0”. These bits are called “channelbits”, as they do not yet represent the digital user data. They first require a translation process and error correction; however, this is beyond the scope of this article.
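The slicing procedure described above can be sketched in a few lines. This is a simplified illustration, not the circuit used in a drive: it assumes an already sampled signal with an integer number of samples per clock period, and the function name and the idealized two-level test signal are hypothetical.

```python
def channel_bits(samples, samples_per_clock):
    """Convert a sampled reflectivity signal into channel bits.

    The slicer level is the average signal level. A clock period that
    contains a crossing of the slicer level (in either direction) yields
    a 1, every other clock period yields a 0.
    """
    slicer = sum(samples) / len(samples)
    n_bits = len(samples) // samples_per_clock
    bits = [0] * n_bits
    for i in range(1, len(samples)):
        was_below = samples[i - 1] < slicer
        is_below = samples[i] < slicer
        if was_below != is_below:  # crossing between sample i-1 and sample i
            k = min(i // samples_per_clock, n_bits - 1)
            bits[k] = 1
    return bits


# Idealized signal with 4 samples per clock period T:
# 3T crystalline gap (high), 4T amorphous mark (low), 3T crystalline gap (high).
signal = [1.0] * 12 + [0.2] * 16 + [1.0] * 12
bits = channel_bits(signal, samples_per_clock=4)
print(bits)
```

In this toy signal the two ones are four clock periods apart, which corresponds to the 4T length of the mark between the two crossings.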
Performance testing of a DVD+RW disc now means measuring the reflectivity signal and the channelbit stream from the disc. Within this paper we will use the following parameters, which are derived from the reflectivity signal and the channelbit stream.
4.1
I14/I14H
A parameter is needed which characterizes the magnitude of the reflectivity signal from the disc. With larger signals we expect a better disc performance and better readout reliability. For this characterization the reflectivity signal from the largest structures (14T) is used, as the highest and lowest reflection signals are obtained when the laser is passing such a 14T mark or gap structure. If we define the highest reflectivity signal level as I14H and the minimum reflectivity signal as I14L, then the parameter I14/I14H is defined as (I14H − I14L)/I14H. This normalized parameter is independent of the particular measurement equipment.
4.2
Asymmetry
A key factor certainly is the position of the average signal or slicer level, as it critically influences the conversion from the analog reflectivity signal to the digital channelbit stream. For various reasons, the areas of different mark sizes do not scale perfectly. This means that a 3T mark is not exactly half the size of a 6T mark but somewhat smaller. As a consequence, the average or zero signal level of, e.g., the 3T and 14T marks differs somewhat. The asymmetry parameter measures this difference and is defined as the difference between both average signal levels divided by I14H. As we will see below, this parameter is an indirect measure of crystallization speed.
4.3
Jitter
Finally, it is necessary to have a characterization for signal noise. For this purpose the time delay dt between the crossing of the slicer level and the clock is measured (see Fig. 2 d). If the crossing is much earlier or later than the clock, then the binary “1” will be set at the wrong position in the channelbit stream. This will cause a mistake, which has to be eliminated by the error correction and which should be avoided in the first place. Sampling of many delay times dt will show a certain statistical distribution. If early and late crossings are equally likely and some other conditions are fulfilled, a Gaussian distribution for dt might be expected. Indeed, the jitter is defined as the width of a Gaussian curve fitted to the sample data divided by the clock period T. Figure 3 shows an example with real sample data from a DVD+RW disc. The jitter is a very powerful characterization of an optical disc, as almost any variation in the manufacturing process influences this parameter. Often discs are finally ranked by their jitter data.
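As a rough illustration of this definition, the sketch below estimates the jitter from a list of measured deviations dt. Note the simplification: instead of fitting a Gaussian curve, it uses the standard deviation of the samples, which gives the same width only if the data are indeed Gaussian. The function name and the toy samples are hypothetical; the numbers 2.6 ns and 38.2 ns are those quoted in Fig. 3.

```python
import statistics


def jitter_percent(dt_samples_ns, clock_period_ns):
    """Jitter in percent: width of the dt distribution divided by T.

    The population standard deviation is used as a stand-in for the
    width of a fitted Gaussian (an assumption of this sketch).
    """
    sigma = statistics.pstdev(dt_samples_ns)
    return 100.0 * sigma / clock_period_ns


# Toy data: deviations of -2 ns and +2 ns around the clock, T = 40 ns.
toy_jitter = jitter_percent([-2.0, 2.0], 40.0)

# Values quoted in Fig. 3: sigma = 2.6 ns, T = 38.2 ns.
fig3_jitter = 100.0 * 2.6 / 38.2
print(f"toy: {toy_jitter:.1f} %, Fig. 3: {fig3_jitter:.1f} %")
```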
Fig. 3. Distribution of the data-to-clock time deviation dt (events versus dt in ns). The solid line shows a measured data-to-clock time deviation distribution. The broken line is a Gaussian fit to the data. Its σ corresponds to a data-to-clock jitter of σ/T = 2.6 ns / 38.2 ns = 6.8%
Specified limits exist for all three parameters introduced above in the standardization book for the DVD+RW disc. It is the task of a process engineer to achieve the required disc characteristics for these and numerous other parameters by the proper choice of thin film materials, deposition conditions and layer thicknesses.
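The two amplitude-related parameters of Sects. 4.1 and 4.2 can be written down directly from their definitions. In the sketch below, the sign convention of the asymmetry (14T average minus 3T average) and the use of the mid-level between highest and lowest signal as the "average level" are assumptions, since the text only speaks of the difference between both average signal levels. All numerical signal levels are hypothetical.

```python
def modulation(i14h, i14l):
    """I14/I14H = (I14H - I14L) / I14H, as defined in Sect. 4.1."""
    return (i14h - i14l) / i14h


def asymmetry(i14h, i14l, i3h, i3l):
    """Difference of the 14T and 3T average levels divided by I14H.

    Assumptions of this sketch: the average level of a structure is taken
    as the mid-level between its highest and lowest signal, and the sign
    convention is (14T average) - (3T average).
    """
    mean_14t = (i14h + i14l) / 2.0
    mean_3t = (i3h + i3l) / 2.0
    return (mean_14t - mean_3t) / i14h


# Hypothetical signal levels in arbitrary units.
m = modulation(i14h=1.0, i14l=0.4)
a = asymmetry(i14h=1.0, i14l=0.4, i3h=0.8, i3l=0.62)
print(f"I14/I14H = {m:.2f}, asymmetry = {a:+.3f}")
```

Because both quantities are normalized to I14H, they do not depend on the gain of the particular measurement equipment.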
5
Disc Sensitivity
When a DVD+RW disc is inserted into a drive for data recording, the drive will first carry out an optimization procedure to determine the best recording conditions and the optimum recording power in particular. For this purpose a specially dedicated disc area in the disc center is used, where at first a test signal is recorded with different laser power settings. Next this test signal is analyzed for signal amplitude I14/I14H and the optimum laser power is chosen. Only then does the recording of the user data start in the main disc information area. The optimum recording power is a key parameter in the disc specification, as the limits of commercially available laser diodes and optics have to be obeyed. Therefore a process engineer needs to design the layer stack in such a way that the DVD+RW disc can be recorded with good results at common laser power settings. Considering the recording process described above, the laser irradiation must melt the phase change layer for proper function. Two competing processes determine whether the required melting temperature can be achieved: laser heating versus cooling by heat dissipation. A key element for the cooling process is the dielectric layer 2. The thickness of this layer can be used as a control parameter to influence cooling of the phase change layer. For a thick dielectric layer a low laser power is sufficient to reach melting conditions, as there is little cooling effect. For a thin dielectric layer, however, a significantly higher laser power must be used. This is shown in Fig. 4, where the optimum recording power determined during the automatic drive optimization procedure is plotted as a function of the thickness of the second dielectric layer (DL2).
Fig. 4. The optimum recording power (mW) as determined by the OPC procedure as a function of the thickness of DL2 (nm). The annotations mark a power window of ±0.25 mW and the corresponding thickness window of ±0.7 nm
The thickness of the dielectric layer is the key parameter for controlling the recording sensitivity of the disc and is of major importance in the disc optimization. Figure 4 also shows the precision with which the thickness of certain layers needs to be controlled in the disc manufacturing process. In order to keep the disc sensitivity variation within a window of 0.5 mW width, the thickness of dielectric layer 2 needs to be controlled to approximately ±0.7 nm. This requirement holds for the uniformity of the layer thickness over the disc surface area as well as for the variation from disc to disc, and translates to approximately ±3–5 % of the average thickness. This is a quite general figure for the DVD+RW that holds for all layers.
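The tolerance estimate above amounts to dividing the allowed power window by the local slope of the power-versus-thickness curve. The following sketch reproduces the numbers under two assumptions that are not stated in the text: the curve is treated as linear over the window, with a slope read off the annotations in Fig. 4, and a nominal DL2 thickness of 14 nm (mid-range of the plotted axis) is used for the relative tolerance.

```python
def thickness_tolerance_nm(power_window_mw, slope_mw_per_nm):
    """Half-width of the allowed thickness range for a full power window.

    Assumes a locally linear dependence of optimum power on thickness.
    """
    return 0.5 * power_window_mw / abs(slope_mw_per_nm)


# Slope from the Fig. 4 annotations: 0.25 mW change per 0.7 nm.
# The negative sign expresses that a thicker DL2 needs less power.
slope = -0.25 / 0.7

tolerance = thickness_tolerance_nm(power_window_mw=0.5, slope_mw_per_nm=slope)

# Hypothetical nominal thickness, chosen from the 10-16 nm range of Fig. 4.
nominal_thickness_nm = 14.0
relative_tolerance = tolerance / nominal_thickness_nm

print(f"+/- {tolerance:.2f} nm, i.e. +/- {relative_tolerance:.1%} of {nominal_thickness_nm} nm")
```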
6
Recording Speed
Optical discs are primarily developed for consumer electronic applications like video and audio recording, even though they became an integral part of modern personal computers. As consumer electronic products have substantially longer life cycles, of the order of 10 years, than computer products, which are replaced after only 3–4 years of usage, backward compatibility is of major importance. Accordingly, once standardized, the storage capacity of an optical disc generation is fixed. Today’s CDs hold the same 74 minutes of audio content or 650 MB¹ as 20 years ago, when the first version of the CD standard was written. Likewise, future DVD+RW discs will have the same 4.7 GB capacity as today. Otherwise the important backward compatibility cannot be achieved.
¹ By going to the specification limits, 80 minutes or 700 MB capacity can be achieved; however, problems with first-generation players might be observed.
The only disc performance criterion which is open for continuous improvement is the recording speed. New recorder models and new rewritable disc versions can be standardized as an extension of the existing specification to allow recording of the disc in a fraction of the nominal playing time. However, it should always be possible to play back the disc at standard speed in older drives. For the DVD disc generation the standard speed is 3.5 m/s, which means that a spot under the laser beam passes with this velocity during disc spinning. This speed is referred to as “1X”. Already in the first specification of the DVD+RW disc in the year 2001, recording speeds from 1X to 2.4X were standardized. For the layer stack optimization this means that the disc must be tested and optimized not only at a single recording speed but at multiple speeds. The key parameter for varying the recording speed is the alloy composition of the phase change material. Fine-tuning can be done by varying the process conditions for sputter deposition. By proper adjustment it can be achieved that the crystallization speed of the material, which is the speed with which a crystallization front moves in an amorphous material matrix, is somewhat larger than the spinning speed of the disc. If the crystallization speed of the material is too low, then it will not be possible to erase the disc at high spinning speeds, as the crystallization front cannot keep up with the disc rotation. If the crystallization speed is much higher than the disc rotation, erasing is no problem; however, the mark will be deformed during the recording process. This effect, called re-crystallization, is shown in Fig. 5. At the trailing edge of the amorphous mark (right-hand side in the picture, assuming writing from left to right) the laser beam is switched off when the desired mark length has been achieved. For a short moment the phase change layer temperature is above the crystallization level and a part of the mark switches back to the crystalline state. The size of this area increases with the crystallization speed of the phase change material.
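The relation between recording speed, clock period and mark length can be made concrete with a short calculation. The sketch assumes that the clock period T = 38.2 ns quoted in Fig. 3 refers to 1X readout and that the physical mark lengths are the same at all speeds (which follows from the fixed capacity); the function names are illustrative.

```python
BASE_SPEED_M_PER_S = 3.5     # "1X" linear velocity given in the text
CLOCK_PERIOD_1X_NS = 38.2    # clock period T from Fig. 3 (assumed to be the 1X value)


def linear_velocity(speed_factor):
    """Linear velocity in m/s at a given speed factor (1X, 2.4X, ...)."""
    return BASE_SPEED_M_PER_S * speed_factor


def mark_length_nm(n_periods):
    """Physical length in nm of an nT structure (m/s times ns gives nm)."""
    return n_periods * BASE_SPEED_M_PER_S * CLOCK_PERIOD_1X_NS


def clock_period_ns(speed_factor):
    """Clock period needed to keep the mark lengths fixed at higher speed."""
    return CLOCK_PERIOD_1X_NS / speed_factor


for n in (1, 3, 14):
    print(f"{n:2d}T structure: {mark_length_nm(n):7.1f} nm")
print(f"2.4X: {linear_velocity(2.4):.1f} m/s, T = {clock_period_ns(2.4):.1f} ns")
```

According to the text, the crystallization front has to be somewhat faster than the linear velocity computed here for erasure to remain possible at that speed.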
Fig. 5. The re-crystallization effect at the trailing edges of amorphous marks and its consequences on the signal levels. The sketch shows the real and ideal lengths of 14T and 3T amorphous marks on the crystalline background together with the HF signal; ideally I14 = I3, while the levels with re-crystallization are marked I*14 and I*3
The crystallization speed itself is difficult to measure [4,5,6]. However, the re-crystallization effect, which is intimately related to the crystallization speed, can be observed in the asymmetry signal. This is shown in the upper part of Fig. 5, where the reflectivity signal for the mark structure with (dashed line) and without (solid line) re-crystallization effect is sketched. Re-crystallization will take away an area of equal size from both the 14T and the 3T amorphous mark; however, the relative shrinkage of the 3T mark is much larger due to its smaller size. Therefore the average signal level of the 14T mark is only slightly influenced, while the 3T average signal level changes considerably. In particular, the average 3T signal level will increase, as the average reflectivity increases with smaller mark size. In Fig. 6 the asymmetry parameter is plotted versus the flow rate of the Ar sputter gas for the deposition of the phase change layer. In each case the change in deposition rate was compensated by changing the deposition time accordingly. It can be seen that low sputter gas pressures lead to low asymmetry values and high sputter gas pressures to high asymmetry values. In the light of the re-crystallization effect discussed above, this means that the denser film sputtered at lower gas pressures shows a higher crystallization speed than the film deposited at higher gas flows. We use this effect to optimize the DVD+RW disc conditions at the customer site, to tune the disc to certain consumer drives and to compensate for variations in the target composition.
Fig. 6. The asymmetry parameter as a function of the Ar sputter gas flow (sccm) during deposition of the phase change layer
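The argument that re-crystallization affects short marks more strongly than long ones can be illustrated with a small calculation. The sketch treats the marks as one-dimensional and measures lengths in units of the channel bit length; the re-crystallized length of 0.3 channel bits is a purely hypothetical value chosen for illustration.

```python
def relative_shrinkage(recrystallized_length, mark_length):
    """Fraction of a mark lost to re-crystallization at its trailing edge."""
    return recrystallized_length / mark_length


# Hypothetical loss at the trailing edge, in units of the channel bit length.
loss = 0.3

shrink_3t = relative_shrinkage(loss, 3)
shrink_14t = relative_shrinkage(loss, 14)
ratio = shrink_3t / shrink_14t

print(f"3T: {shrink_3t:.1%}, 14T: {shrink_14t:.1%}, ratio: {ratio:.2f}")
```

Whatever the actual loss is, the same absolute loss shrinks the 3T mark 14/3 ≈ 4.7 times more strongly than the 14T mark in relative terms, which is why the 3T average level shifts while the 14T level stays nearly constant.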
7
Overwrite Cycles
Rewritable optical discs are specified to fulfil the standard for 1000 direct overwrite (DOW) cycles. This is certainly not a very impressive number if compared with the millions of recording cycles magnetic storage media are able to undergo. However, in typical consumer applications like home video recording only a few recordings during the media lifetime are required. The phase change recording process is accompanied by high peak temperatures in the thin film system of up to 1000 °C as well as rapid temperature changes on a 10 ns timescale. Therefore mechanical stress due to thermal expansion and atomic diffusion at the peak temperatures are considered to be the main reasons for deterioration. The deterioration of the disc can be observed in various parameters, e.g. the jitter. This is shown in Fig. 7, where the jitter development over 1000 recording cycles is displayed. The upper curve with rectangular symbols shows the behavior for a simplified layer stack where the thin interface layers 1 and 2 are missing. It is noticeable that for the first few overwrite cycles the jitter even improves with overwriting. However, after recording approximately 5 times the jitter continuously increases. For the simplified layer stack the specification limit of 9 % is already exceeded at 500–700 DOW cycles. The initial signal improvement during the first few DOW cycles is caused by a saturation effect in the phase change layer. During the first heating/recording cycles the local atomic arrangement changes slightly from the disordered condition after sputter deposition. After a few cycles a saturation limit is reached. The jitter increase with continued overwriting is usually assigned to diffusion of sulphur atoms from the neighboring dielectric layers into the phase change layer. The presence of the sulphur atoms should noticeably alter phase change layer properties such as melting point and reflectivity. The lower curve (triangular symbols) has been measured on a disc which contains both interface layers 1 and 2 according to our production recipe. The interface layers consist of a nitride film a few nm thick. Obviously their presence strongly improves the jitter value and moves the overwriting performance well into the specified regime.
Fig. 7. The jitter (%) as a function of the number of direct overwrite (DOW) cycles. The performance is shown for the full layer stack with interface layers (IL) (triangular symbols) and for a reduced layer stack system where the two interface layers are missing (rectangular symbols)
8
Outlook
In the future, higher recording speeds for DVD+RW and other rewritable disc formats will be specified and launched to the market step by step. Due to the limited mechanical stability of the polycarbonate substrate material, a maximum speed of 16X is expected. To achieve this performance, it will be necessary to develop new phase change alloys with suitable crystallization speeds. This should be an interesting field of activity for both industry and the scientific community.
Interactions in Dipole Systems: Model Simulations Using the Method of Local Fields Herbert Kliem Saarland University, Institute of Electrical Engineering Physics 66041 Saarbruecken, Germany
Abstract. Dielectric and ferroelectric properties are modeled taking into account dipolar interactions. Three kinds of dipole systems are considered: permanent dipoles that fluctuate by thermal activation in double wells, induced dipoles, and systems composed of permanent and induced dipoles. Within the calculations the local field at each dipole is computed. For induced dipoles the local field alone, and for permanent dipoles the local field together with a Monte Carlo step, determines the moment of the dipole. In an iterative procedure the polarization of the system is obtained. Among other results, the calculations yield hysteresis loops for the polarization with temperature-dependent spontaneous and remanent polarizations, temperature- and sample-thickness-dependent coercive fields, and a Curie-Weiss law for the susceptibility. The role of defects in the system is also considered, as well as the role of the electrodes.
1
Introduction
It is the aim of this investigation to show that several dielectric and ferroelectric properties can be described using very simple assumptions: the long-range forces of the Coulomb interaction between dipoles seem to play an important role in the dielectric behaviour of matter [1]. To describe the dipole-dipole interaction, analytical models have been developed in the past [2,3,4]. One of the most frequently used approaches is the Lorentz calculation describing the interaction of induced point dipoles located on cubic lattice sites. Today, numerical simulations are also used to describe the dielectric and ferroelectric response with consideration of the dipolar interaction [5,6,7]. In the present model the local electric fields at the dipoles, evoked by the other dipoles in the system and by the applied field, are calculated with an iterative algorithm. Each dipole reacts to the local field with a moment that results from (I) the local field, (II) the temperature, if the dipole is permanent, and (III) the properties of the dipole itself. For induced dipoles, which are assumed to be temperature-independent, the properties are summarized by the polarizability. Permanent dipoles in the present model fluctuate by thermal activation in double-well potentials. The geometry of the double wells, i.e. the distance between the wells, the barrier height, and a possible intrinsic asymmetry, defines, besides the charge of the dipole, its properties. For permanent dipoles the temperature is taken into account in an additional Monte Carlo step.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 861–874, 2003. © Springer-Verlag Berlin Heidelberg 2003
Besides the local attributes, the structure of the arrangement determines the behaviour of the dipole ensemble. For example, permanent dipoles on cubic lattice sites exhibit a different response to an applied field than statistically distributed dipoles with respect to the temperature dependence and the hysteresis of the polarization. If induced dipoles come into play, they can support the emergence of a spontaneous polarization in the dipole ensemble without application of any external field. With the method of local fields it is also possible to describe the role of the electrodes: the systems under consideration are embedded in plane-parallel capacitors, and the method of images is therefore used to calculate the local fields. The image dipoles tend to pin the original dipoles in their positions. This affects the coercive field strength in very thin layers, i.e. the coercive field increases with decreasing film thickness. In earlier publications we investigated systems of permanent dipoles [8,9], systems of induced dipoles [10,11], and systems composed of permanent and induced dipoles [12,13,14]. Besides the static behaviour of the dipole ensemble, i.e. the thermodynamic equilibrium of the system, the dynamic properties of dielectrics are investigated. It turns out that for systems in which a space charge polarization is involved, the charge-attracting electrodes can be responsible for a Kohlrausch behaviour of the polarization [15].
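The pinning tendency of the image dipoles can be made plausible with a small numerical sketch. It is not the simulation code of this work. It assumes a single point dipole oriented along z between two ideal electrodes at z = 0 and z = d, uses the on-axis field of a point dipole, p/(2πε0 r³), and truncates the image series symmetrically. The names `self_image_field` and `n_max` are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def self_image_field(p, z0, d, n_max=500):
    """Field along z at height z0 produced by the images of a z-directed dipole.

    Assumptions of this sketch: ideal electrodes at z = 0 and z = d, a point
    dipole of moment p (in C m) at height z0, and a truncation of the image
    series at |m| <= n_max. The images sit at 2*m*d + z0 (m != 0) and at
    2*m*d - z0; for a dipole perpendicular to the electrodes they all carry
    the same moment as the original dipole.
    """
    total = 0.0
    for m in range(-n_max, n_max + 1):
        if m != 0:
            total += 1.0 / abs(2 * m * d) ** 3       # image at 2*m*d + z0
        total += 1.0 / abs(2 * m * d - 2 * z0) ** 3  # image at 2*m*d - z0
    # On-axis field of a point dipole: p / (2*pi*eps0*r^3), parallel to p.
    return p * total / (2.0 * math.pi * EPS0)


if __name__ == "__main__":
    p = 1.602e-19 * 1e-10  # one elementary charge times 1 Angstrom
    for d in (2e-9, 5e-9, 10e-9):
        print(d, self_image_field(p, d / 2, d))
```

In this sketch the field of the images is parallel to the dipole moment and, for a dipole in the middle of the gap, scales as 1/d³, which is in line with the stronger pinning stated for thinner layers.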
2
Numerical Method
In this section the numerical method is explained briefly; a detailed description can be found in [12]. Between the two electrodes of a plane-parallel capacitor, N dipoles are set up. The dipoles interact mutually through their electric fields. To calculate the local field $E_{loc}$, the method of images is used. In Fig. 1 the method of images is illustrated for a system of dipoles located on cubic lattice sites. Each dipole is repeatedly reflected at the electrodes. The series is truncated when the changes of the local fields fall below a preset limit. At least 500 images are used for each original dipole. At the location $r_k$ of a dipole k, $E_{loc}(r_k)$ acts on the dipole moment of dipole k. The local field $E_{loc}(r_k)$ consists of three parts: the applied field, the field contributions of the original dipoles and their images except for dipole k, and the field contributions of the images of dipole k:

$$E_{loc}(r_k) = E_a + \sum_{j,\, j \neq k} E_{jk} + E_{kk} \qquad (1)$$

($E_a$: applied field, $E_{jk}$: field contribution of dipole j and its image dipoles, $E_{kk}$: field contributions of the image dipoles of dipole k).
Fig. 1. Illustration of the method of images to calculate the local field at dipole k. This dipole is thought to be removed
We consider induced point dipoles and permanent dipoles. For the case of a point dipole the dipole moment is simply

$$\mu(r) = \alpha E_{loc}(r) \qquad (2)$$

For a permanent dipole the potential difference $\Delta V$ between positive and negative charge is calculated. The permanent dipoles fluctuate by thermal activation in double-well potentials, which are symmetrical when no local field is applied (Fig. 2). The local field shifts the wells against each other, and a redistribution of the dipole orientations results. The probability to occupy one of the two possible directions is given by the Boltzmann factor:

$$w_{1,2} = \frac{\exp(\pm q \cdot \Delta V / kT)}{\exp(-q \cdot \Delta V / kT) + \exp(q \cdot \Delta V / kT)} \qquad (3)$$
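As a minimal sketch of how Eq. (3) can be turned into a stochastic choice of the dipole direction, the following Python fragment computes the two probabilities and compares a uniform random number with w1. The function names, the use of eV units and the +1/−1 encoding of the two directions are assumptions made for illustration; they are not taken from the original program.

```python
import math
import random

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def well_probabilities(q_delta_v_ev, temperature_k):
    """Occupation probabilities (w1, w2) of the two wells according to Eq. (3).

    q_delta_v_ev is the product q*DeltaV in eV. The identity
    exp(x) / (exp(-x) + exp(x)) = (1 + tanh(x)) / 2 avoids overflow.
    """
    x = q_delta_v_ev / (K_B_EV * temperature_k)
    w1 = 0.5 * (1.0 + math.tanh(x))
    return w1, 1.0 - w1


def sample_direction(q_delta_v_ev, temperature_k, rng=random):
    """Return +1 or -1 by comparing a uniform random number rd with w1."""
    w1, _ = well_probabilities(q_delta_v_ev, temperature_k)
    rd = rng.random()
    return 1 if rd < w1 else -1


if __name__ == "__main__":
    print(well_probabilities(0.05, 300.0))
    print([sample_direction(0.05, 300.0) for _ in range(10)])
```

Without a local field (ΔV = 0) both directions come out equally probable, which corresponds to a disordered state.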
To determine the direction, the interval [0,1] is divided into two parts of lengths w1 and w2 (Fig. 3). Then a random number rd is generated. Whether rd falls into the part w1 or the part w2 yields the direction and the moment P of the dipole. This corresponds to the Monte Carlo scheme “importance sampling” [16]. The iterations can start, for example, from a disordered state in which both directions of each permanent dipole are equally probable and all induced dipole moments vanish, µk = 0 for all k. We start with the local field calculation for the first randomly chosen permanent dipole and determine its moment. Once this dipole has obtained a new moment, the local field at the next dipole under consideration is recalculated. For the systems composed of permanent and induced dipoles all induced moments are recalculated when a permanent dipole has obtained a new direction. Then the next permanent
dipole is considered. In this way all dipoles are processed successively in one step of iteration. The present polarization of the system in the z-direction, perpendicular to the electrodes, is calculated after each iteration step i:

$$P_i = \frac{1}{V} \left( \sum_k p_{zk} + \sum_k \mu_{zk} \right) \qquad (4)$$

The last N/2 iterated values of the polarization are averaged to get the final result. The number N of iterations depends on the dipole system. For permanent dipoles N = 2 000 is chosen, and for composed systems N can be as high as N = 1 500 000.
With a similar program, a space charge polarization of a solid electrolyte, for example polyethylene oxide (PEO), is modeled by an ionic hopping process in a three-dimensional multiwell energy structure. Positive ions located on lattice cells can fluctuate over energy barriers with thermally activated rates. A negative background charge, which is constant in space and time, provides charge neutrality. Blocking layers at the electrodes are simulated by infinitely high energy walls. Typically 1000 ions are distributed on a cubic lattice with up to 100 × 100 × 100 lattice cells. The cells are separated by energy barriers $W_0$. Figure 4 is a one-dimensional representation of the sample structure. The potentials at the cells are determined by the electrostatic interaction between the ions, the applied field, and the background charge. These local potentials are calculated using the method of images in a similar way as described above. Thus the total barrier height between two sites is given by

$$W_z = W_0 + \frac{\delta W}{2}, \qquad (5)$$

where $\delta W$ is the part which comes about by the charges in the system and in the electrodes.
Fig. 2. Double-well potential
Fig. 3. Determination of the new orientation for a permanent dipole
Fig. 4. One-dimensional representation of the sample structure; ions hop in a multiwell energy structure between two ideal blocking electrodes
The transition rates between cells z and z + 1 are

$$w_{z,z+1} = \nu_0 \exp\left[-\left(\frac{W_0}{kT} + \frac{\delta W/2}{kT}\right)\right], \qquad w_{z+1,z} = \nu_0 \exp\left[-\left(\frac{W_0}{kT} - \frac{\delta W/2}{kT}\right)\right] \qquad (6)$$

For all charges and possible hopping directions the random time interval between two transitions is calculated using the Monte Carlo stochastic dynamics method [16,17]:

$$t = -\frac{1}{w} \ln x \qquad (7)$$

Here x is a random number, x ∈ ]0, 1]. The transition with the shortest time interval $t_{min}$ is executed. Then the system time is increased:

$$t_{sys} := t_{sys} + t_{min} \qquad (8)$$
After a transition all potentials are recalculated leading to new transition rates in eqn. (6). The sum of all consecutive hops results in the macroscopic polarization and polarization current of the sample.
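One step of the procedure of Eqs. (6) to (8) can be sketched as follows. This is only an illustration under simplifying assumptions: the rates are passed in as a plain list, the recalculation of the potentials after a hop is left out, and all function and parameter names are hypothetical.

```python
import math
import random


def hop_rate(nu0, w0, delta_w, kt, forward=True):
    """Transition rate of Eq. (6); forward hops see W0 + dW/2, backward W0 - dW/2."""
    sign = 1.0 if forward else -1.0
    return nu0 * math.exp(-(w0 + sign * 0.5 * delta_w) / kt)


def waiting_time(rate, rng=random):
    """Random time interval of Eq. (7), t = -(1/w) ln x with x in ]0, 1]."""
    x = 1.0 - rng.random()  # rng.random() lies in [0, 1), so x lies in ]0, 1]
    return -math.log(x) / rate


def kmc_step(rates, t_sys, rng=random):
    """Draw one time per possible transition and execute the shortest one.

    Returns the index of the selected transition and the updated system
    time of Eq. (8). After this step all rates would have to be recomputed.
    """
    times = [waiting_time(w, rng) for w in rates]
    i_min = min(range(len(times)), key=times.__getitem__)
    return i_min, t_sys + times[i_min]


if __name__ == "__main__":
    kt = 0.025  # thermal energy in eV, roughly room temperature
    rates = [hop_rate(1e12, 0.5, 0.1, kt, True),
             hop_rate(1e12, 0.5, 0.1, kt, False)]
    print(kmc_step(rates, 0.0))
```

Drawing x from ]0, 1] rather than [0, 1) keeps the logarithm finite, which matches the interval stated with Eq. (7).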
3
Results
The results presented here are a selection from earlier papers [7,8,9,10,11,12,13,14,15].
3.1
Systems of Permanent Dipoles
Up to 2000 permanent dipoles are distributed statistically in space with respect to their orientations and centers.
Fig. 5. Normalized polarization of statistically distributed dipoles (n: density, q: charge)
The dipole length is 1 Å. In Fig. 5 the normalized polarization P/Pmax of the dipole ensemble is plotted versus the applied electric field Ea in four cases. The curve without interaction serves as a reference. It is obvious that the dipole-dipole interaction decreases the polarization of the system. This decrease is stronger the higher the dipole density and the higher the dipole charge. The interacting dipoles tend to align antiparallel in their dipole fields. Besides the disorder introduced by the temperature, the external field has to overcome this mutual trapping of the dipoles in their own fields. Hysteresis loops of the polarization are found if the electric field is increased until the polarization saturates and is then decreased again. Figure 6a shows the loops in statistically disordered systems for two different thicknesses of the system. Figure 6b shows the loops for
Fig. 6. Hysteresis loops of the polarization. (a) Statistically distributed dipoles, variation of the electrode distance d. (b) Dipoles on cubic lattice sites, variation of the temperature T
Fig. 7. The coercive field strengths as a function of the electrode distance. (a) Statistically distributed dipoles. (b) Lattice dipoles
Fig. 8. Temperature dependence of the remanent polarization. (a) Statistically distributed dipoles. (b) Lattice dipoles
permanent dipoles on cubic lattice sites at different temperatures. It can be seen that the coercive field Ec in the thinner system is higher than in the thicker one. This is due to the image dipoles, which tend to pin the original dipoles in their positions. The thinner the sample, the stronger this effect. Figure 7 shows a systematic computation of Ec(d) (d: electrode distance) for statistically distributed dipoles (Fig. 7a) and for lattice dipoles (Fig. 7b). From calculations of the temperature-dependent hysteresis loops the curves of the remanent polarization Pr versus temperature are deduced. We find a relaxor-like dependence of Pr(T) for systems with statistically distributed dipoles, and a sharp drop of Pr(T) in lattice dipole systems (Fig. 8).
3.2
Systems of Induced Dipoles
For systems of identical induced point dipoles on an infinite cubic lattice, Lorentz [2] derived his famous formula for the polarization in an applied field:

$$P = \frac{n \alpha E_a}{1 - n\alpha/(3\varepsilon_0)} \qquad (9)$$
Fig. 9. The polarization in systems of induced lattice dipoles
For nα/(3ε0) = 1 the polarization diverges. This provides an opportunity to check the model simulations. Fig. 9 shows the computed polarization versus the iteration number i for induced lattice dipoles. For nα/(3ε0) = 0.52 the polarization predicted by Lorentz is reached after about 7 iterations (curve A). For nα/(3ε0) = 1, however, the polarization increases and diverges, as expected from Eq. (9). The method of local fields allows, for example, the calculation of the field at a defect in an otherwise perfect lattice and the computation of local fields in very thin dielectrics consisting of atoms with different polarizabilities [10,11]. Figure 10 shows the local fields in thin films of two to six atomic layers having two kinds of atoms with polarizabilities α and β. In these films the local fields can differ from those predicted by Lorentz. Therefore the effective dielectric permittivity in these biatomic thin films is also not identical with the volume permittivity. This might be of practical importance in MOSFETs with ultrathin gate oxides.
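The convergence check against Eq. (9) can be illustrated with a mean-field sketch. It does not reproduce the lattice simulation: it simply assumes that every dipole feels the Lorentz local field Eloc = Ea + P/(3ε0), so that P = nα Eloc can be iterated. In the reduced units p = P/(3ε0 Ea) and x = nα/(3ε0) the iteration reads p → x(1 + p). The function names are hypothetical.

```python
def lorentz_iteration(x, n_iter):
    """Iterate p -> x * (1 + p), starting from p = 0.

    p = P / (3*eps0*Ea) is the reduced polarization and
    x = n*alpha / (3*eps0) the reduced polarizability density.
    Returns the list of p values after each iteration.
    """
    p = 0.0
    history = []
    for _ in range(n_iter):
        p = x * (1.0 + p)
        history.append(p)
    return history


def lorentz_fixed_point(x):
    """Closed form of Eq. (9) in the same reduced units, valid for x < 1."""
    return x / (1.0 - x)


if __name__ == "__main__":
    target = lorentz_fixed_point(0.52)
    for i, p in enumerate(lorentz_iteration(0.52, 10), start=1):
        print(i, p, p / target)
    print(lorentz_iteration(1.0, 10))  # grows without bound
```

In this simplified iteration the deviation from Eq. (9) after i steps is x^i, about 1 % after 7 steps for x = 0.52, while for x = 1 the reduced polarization grows linearly with the iteration number.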
Fig. 10. The local fields in very thin dielectrics consisting of atoms with two different polarizabilities α and β
3.3
Systems Composed of Induced and Permanent Dipoles
For systems composed of induced and permanent dipoles, calculations of the interactive polarization could so far be carried out only for dipole chains and dipole planes. These systems can be regarded as substructures of barium titanate. In the [001] direction we find titanium and oxygen ions. The titanium ions fluctuate in this direction in double wells and are considered as permanent dipoles, while the oxygen ions are treated as induced dipoles. The (200) plane in barium titanate consists of oxygen ions on different crystallographic sites and of the titanium ions. Owing to their different sites, the oxygen ions in the model have different polarizabilities α and β (Fig. 11). Using the simple example of the dipole chain, it is possible to show that the interaction between permanent and induced dipoles stimulates the emergence of a spontaneous polarization (Fig. 12). If the chain consists only of induced or only of permanent dipoles, we find the expected functions P(Ea) (curves A and B). But for a chain composed of induced and permanent dipoles a spontaneous polarization arises (curve C).
Fig. 11. The dipole chain referring to the [001] direction and the dipole plane referring to the (200) plane in barium titanate
Fig. 12. The polarization of a dipole chain. If permanent and induced dipoles interact, a spontaneous polarization emerges (curve C). Even without an external field all dipoles point in the same direction
A variation of the temperature T results in a loss of the spontaneous polarization with increasing T (Fig. 13). From the slope of the curves at Ea = 0 the susceptibility can be calculated:

$$\chi(T) = \frac{1}{\varepsilon_0} \left. \frac{dP(T)}{dE_a} \right|_{E_a = 0} \qquad (10)$$
This is carried out in detail for the dipole plane (see below). For the dipole plane we also find a spontaneous polarization (Fig. 14). If no defect is introduced we get a single domain after about 15 000 iterations. If the plane has a defect, i.e. an oxygen vacancy, two domains are found after 15 000 iterations. The dipoles are arranged in a special pattern, which can be explained by the dipole-dipole interaction. The dipoles in the side chains, which consist of induced dipoles only, are oriented antiparallel to the dipoles of the main chains with induced and permanent dipoles. Since the dipole field strength decreases strongly with increasing distance, nearest-neighbour dipole chains interact most strongly. Thus, the induced dipoles of the side chains mediate the orientation of the dipoles in the main chains. This idea is found in a book by Feynman [18]. The spontaneous polarization is temperature dependent (Fig. 15). Hysteresis loops for different temperatures are shown in Fig. 16. The susceptibility has a peak close to the temperature where the spontaneous polarization
Fig. 13. Temperature dependence of the polarization in a dipole chain
Fig. 14. (a) The evolution of a spontaneous polarization. (b) A single domain appears in the plane without defect after 15 000 iterations. (c) A domain wall through the defect exists after 15 000 iterations
Fig. 15. Temperature dependence of the spontaneous polarization
disappears (Fig. 17). On the high-temperature side approximately a Curie-Weiss law is found. From computations of temperature-dependent hysteresis loops the temperature dependencies of the coercive fields and the remanent polarizations are deduced. Both indicate a type II transition. Both transitions depend, as can be expected, on the parameters of the systems (Figs. 18, 19).
Fig. 16. Temperature dependence of the dipole plane hysteresis loops
Fig. 17. The temperature dependent susceptibility of the dipole plane. The curve is a Curie-Weiss function
Fig. 18. The temperature dependence of the coercive field; [α], [β]: eÅ²/V
Fig. 19. The temperature dependence of the remanent polarization in the dipole plane
3.4
Time Dependence of the Polarization
Until now equilibrium polarizations have been calculated. To find the time dependence of the polarization after a disturbance from equilibrium, for example after a step of the electric field, the stochastic dynamic Monte Carlo method described above is used. Within this method the local fields again enter into the calculations. As an example the time dependence of a space charge polarization is considered. Experiments are carried out using the solid electrolyte polyethylene oxide (PEO), which is embedded between blocking electrodes. Fig. 20a is a plot of the measured polarization current after a step of the electric field. For short times the current is proportional to $t^{-0.25}$, and for long times a current $j \sim t^{-1.05}$ is found (Kohlrausch empirical law). The short-term current results from a bulk response, while the long-term current is due to the attraction of ions in the material by their image charges in the electrodes. Modeling the dynamic response using the procedure described above yields for the long-term response a Kohlrausch behaviour with $j \sim t^{-1.1}$ (Fig. 20b). This is calculated without any assumed distribution function for the barrier heights in the bulk multiwell structure. Assuming additionally a Gaussian distribution for these bulk barrier heights results in a short-term current $j \sim t^{-0.25}$, as was measured (Fig. 20c).
Fig. 20. (a) Measured polarization current for PEO after a step of the electric field. (b) Calculation of the current using the dynamic stochastic Monte Carlo method. The power law $j \sim t^{-1.1}$ results from the attraction of bulk ions by their images in the electrodes. (c) Additionally a Gaussian distribution of barrier heights in the bulk multiwell structure is assumed
4
Conclusion
Several ferroelectric and dielectric properties can be modeled by the electrostatic interaction of dipoles and charges in the material. The basis of this simulation is the exact calculation of the local fields at the dipoles and charges. A significant result is that the evolution of a spontaneous polarization is supported by the interaction between induced and permanent dipoles. Using the dynamic stochastic Monte Carlo method it is possible to simulate the transient response of a charge or dipole ensemble after a perturbation from equilibrium. It is shown that a power-law behaviour of polarization currents can be evoked by the mutual attraction between charges in the bulk and their image charges in the electrodes.
References
1. L. Bellaiche and D. Vanderbilt, Phys. Rev. Lett. 81, 1318 (1998).
2. H. A. Lorentz, The Theory of Electrons, pp. 138, 306 (Dover Publications, New York, 1952; reprint of a book by H. A. Lorentz of 1909).
3. L. Onsager, J. Am. Chem. Soc. 58, 1486 (1936).
4. H. Froehlich, Theory of Dielectrics (Oxford University Press, Oxford, 1990).
5. S. C. Hwang and G. Arlt, J. Appl. Phys. 87, 869 (2000).
6. B. G. Potter, Jr., V. Tikane, and B. A. Tuttle, J. Appl. Phys. 87, 4415 (2000).
7. A. C. Maggs and V. Rossetto, Phys. Rev. Lett. 88, 1964021 (2002).
8. N. Farag and H. Kliem, Ferroelectrics 228, 197 (1999).
9. N. Farag and H. Kliem, IEEE Electr. Ins. Mag. 15, 25 (1999).
10. M. De Benedictis, N. Farag, and H. Kliem, phys. stat. sol. (a) 227, 491 (2001).
11. N. Farag and H. Kliem, phys. stat. sol. (b) 233, 180 (2002).
12. N. Farag and H. Kliem, J. Appl. Phys. 90, 5713 (2001).
13. H. Kliem and N. Farag, Ferroelectrics 268, 71 (2002).
14. N. Farag and H. Kliem, submitted to phys. stat. sol. (b).
15. A. Wagner and H. Kliem, J. Appl. Phys. 91, 6638 (2002).
16. A. B. Bortz, M. H. Kalos, and J. L. Lebowitz, J. Comput. Phys. 17, 10 (1975).
17. A. Sinitzki and V. H. Schmidt, Phys. Rev. B 54, 842 (1996).
18. R. Feynman, R. Leighton, and M. Sands, The Feynman Lectures on Physics, Vol. II (Addison-Wesley, 1964).
New Challenges in Optical Coating Design Olaf Stenzel Fraunhofer Institute for Applied Optics and Precision Engineering Winzerlaer Str. 10, 07745 Jena, Germany
Abstract. Modern mathematical algorithms make it possible to generate, in theory, thin film designs that fit nearly any reasonable specification. Nevertheless, as practice has shown, the gap between calculated and technologically achievable characteristics may be significant, so that the search for qualitatively new design and production tools is still in progress and represents one of the most complex challenges in thin optical coating theory and technology today. Such new design challenges include the incorporation of gradient index layers into classical designs, the design of rugate filters, and novel filter concepts based on resonant grating waveguide structures. Moreover, the development of novel composite coating materials is expected to facilitate the optimisation of future designs.
1
Introduction
Today’s optical instrumentation is becoming more and more complex. In order to guarantee durability and high optical performance of any optical component, its surfaces have to be overcoated with specially designed thin film stacks to achieve tailored optical properties as well as surface protection [1]. Clearly, any improved or new optical technology may require modified or new optical coating designs, so that optical thin film design is of utmost importance for the whole field of applied optics. The conventional optical thin film designs may be classified as follows [2,3]:
• Antireflection coatings
• Neutral mirrors and beam splitters
• High reflectors
• Edge filters
• Wide and narrow bandpass filters
• Single or multiple bandstop filters
• Polarizing beamsplitters
In the first step, optical coating design is a purely computational task. One has to find thin film systems that show a theoretical optical performance arbitrarily close to the required specification (reverse search). If this task has multiple solutions, the designer will choose the design that is most convenient to produce in practice. At this step, practical considerations such as stability against process parameter fluctuations, thickness monitoring requirements, and costs become important. In order to find the optimum solution, more and more sophisticated mathematical algorithms have been developed in the past and implemented in commercially available thin film design software [3,4,5,6]. Usually the theoretical apparatus behind the calculations is based on the typical assumptions of homogeneity and isotropy of the film materials.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 875–889, 2003. © Springer-Verlag Berlin Heidelberg 2003
Although there exist general mathematical theorems which guarantee the existence and properties of a theoretical solution for broad classes of thin film design problems (theorem on solvability and maximum principle, compare [7]), practical limitations occur that may prevent a good design from being found even at the computer. Among these limitations are:
• Finite maximum thickness of the whole layer stack
• Finite number of available optical materials, and therefore a finite range of accessible refractive indices
• Finite absorption and dispersion of the materials
These problems cannot be solved by a further refinement of the mathematical design methods alone. Current challenging research directions therefore include qualitatively new design tools such as:
• Vertically or laterally inhomogeneous coatings
• Anisotropic coatings
• Synthesis of new optical materials (composites, materials showing size effects)
In this paper, the practical status of some of these approaches is discussed.
2
Theory
For nonmagnetic, optically homogeneous and isotropic materials, the linear optical constants (the real and imaginary parts n and k of the complex refractive index) in the IR/VIS/UV spectral ranges usually depend only on the wavenumber ν of the light (ν is the reciprocal of the wavelength in vacuum). In this case, the optical spectra of a thin film stack can be calculated straightforwardly in terms of the matrix formalism when these optical constants and the film thicknesses are known [8]. Here, the j-th single film in a stack defines a characteristic matrix $M_j$ by

$$M_j \equiv \begin{pmatrix} \cos(2\pi\nu \hat{n}_j d_j) & -\dfrac{i}{\hat{n}_j} \sin(2\pi\nu \hat{n}_j d_j) \\ -i \hat{n}_j \sin(2\pi\nu \hat{n}_j d_j) & \cos(2\pi\nu \hat{n}_j d_j) \end{pmatrix}; \qquad \hat{n}_j \equiv n_j + i k_j \qquad (1)$$

$d_j$ is the physical thickness of the j-th film, while $n_j$ and $k_j$ are the real and imaginary parts of its complex refractive index. For simplicity, normal incidence is assumed. The properties of the whole stack are given by the characteristic matrix of the stack, which is obtained from the single film matrices,

$$M = \prod_j M_j \equiv \begin{pmatrix} M_{11} & i M_{12} \\ i M_{21} & M_{22} \end{pmatrix} \qquad (2)$$

where the individual films are counted starting from the incident medium. The transmittance T and the reflectance R of a stack embedded between two semi-infinite media may be calculated by

$$T = \frac{n_{sub}}{n_0} \left| \frac{2 n_0}{n_0 M_{11} + i n_0 n_{sub} M_{12} + i M_{21} + n_{sub} M_{22}} \right|^2, \qquad R = \left| \frac{n_0 M_{11} + i n_0 n_{sub} M_{12} - i M_{21} - n_{sub} M_{22}}{n_0 M_{11} + i n_0 n_{sub} M_{12} + i M_{21} + n_{sub} M_{22}} \right|^2 \qquad (3)$$

where $n_0$ is the refractive index of the incidence medium, and $n_{sub}$ that of the substrate material. In this way, the calculation of spectra from the construction parameters of the system (forward search) is straightforward. Unfortunately, due to the complicated mathematical structure of Eqs. (3), it is impossible to obtain explicit expressions for the reverse task, in which the system parameters are to be deduced from the system's spectral performance. Therefore, such reverse search procedures are usually performed by a numerical minimization of an appropriately defined merit function [9].
The mathematical treatment of both forward and reverse search procedures becomes much more complicated when the simplifying assumptions of homogeneity and isotropy of the film materials are no longer valid. In fact, relaxing these requirements results in completely new optical effects not accessible to conventional interference coating design. Table 1 presents some examples of current research fields. The conventional dielectric coatings, as discussed above, are represented in the upper left corner of the table. Anisotropy or inhomogeneity of the coating materials (moving downwards in the table) leads to such important classes of novel coatings as GBO (Giant Birefringent Optics) devices [10,11] and rugate filters [4]. Finally, the so-called resonant GWS (Grating Waveguide Structures) [12,13] combine lateral inhomogeneity with anisotropy and therefore exhibit unique properties. In Sect. 3, all these systems will be addressed.
In contrast, starting from conventional coatings and moving to the right in Tab. 1 leads us to nanoscopically heterogeneous coating materials, which are nevertheless optically homogeneous because of the small characteristic size of their structural units. In this way it is possible to manipulate optical material properties, offering more flexibility in the choice of optical constants for design tasks. Here, amorphous hydrogenated carbon and metal island films are prominent examples [14]. This “movement to the right” will be exemplified in Sect. 4. The most complicated case, namely the combined presence of anisotropy, absorption, and heterogeneity on different length scales, finally leads to the fields of photonics and plasmonics [15,16], but will not be taken into consideration in this paper.
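The forward calculation with the characteristic matrices of this section can be sketched in a few lines of Python. The sketch assumes normal incidence and real refractive indices of the incidence medium and the substrate. The function names and the example layer data are illustrative only and do not come from any particular design software.

```python
import cmath
import math


def layer_matrix(nu, n_hat, d):
    """Characteristic matrix of one film with complex index n_hat and thickness d."""
    phi = 2.0 * math.pi * nu * n_hat * d
    c, s = cmath.cos(phi), cmath.sin(phi)
    return ((c, -1j * s / n_hat), (-1j * n_hat * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def stack_rt(nu, layers, n0, n_sub):
    """Return (R, T) of a stack at wavenumber nu (reciprocal vacuum wavelength).

    layers is a list of (n_hat, d) pairs, ordered from the incidence medium
    towards the substrate. The off-diagonal elements of the product matrix
    already contain the factor i that is written explicitly in the text.
    """
    m = ((1.0, 0.0), (0.0, 1.0))
    for n_hat, d in layers:
        m = matmul2(m, layer_matrix(nu, n_hat, d))
    a = n0 * (m[0][0] + n_sub * m[0][1])
    b = m[1][0] + n_sub * m[1][1]
    r = (a - b) / (a + b)
    t = 2.0 * n0 / (a + b)
    return abs(r) ** 2, (n_sub / n0) * abs(t) ** 2


if __name__ == "__main__":
    lam = 550e-9  # design wavelength in m (illustrative value)
    n1 = math.sqrt(1.5)
    print(stack_rt(1.0 / lam, [], 1.0, 1.5))                        # bare substrate
    print(stack_rt(1.0 / lam, [(n1, lam / (4.0 * n1))], 1.0, 1.5))  # quarter-wave layer
```

For a bare substrate the sketch reduces to the Fresnel result, and a quarter-wave layer whose index is the square root of the substrate index suppresses the reflectance at the design wavelength, which gives two simple checks of the implementation.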
Table 1. Schematic overview of the discussed research fields in applied optics

Optically homogeneous / optically isotropic | Pure materials or nanoscopically homogeneous mixtures | Composites or porous layers
yes / yes | nonabsorbing: conventional dielectric coatings; absorbing: conventional (selective) absorbers, metal films | nonabsorbing: composite dielectric coatings; absorbing: cermets, metal island films
yes / no | nonabsorbing: giant birefringent optics; absorbing: polarizer foils | metal island films
no / yes | rugates, gradient index layers | rugates, gradient index layers
no / no | nonabsorbing: grating waveguide structures (reflectors); absorbing: grating waveguide structures (absorbers) | photonic crystals and plasmonics

3
Utilizing Coatings Anisotropy and Inhomogeneity
3.1
Giant Birefringent Optics (GBO)-Effects
Giant Birefringent Optics denotes a class of polarization-sensitive optical effects that occur at the interface between different anisotropic materials or, alternatively, at the interface between an optically isotropic and an anisotropic material. As a necessary condition, one of the principal refractive indices of material 1 must match a principal refractive index of material 2, while the other principal indices should differ strongly from each other [10]. Since in multilayer interference coatings a multiplicity of interface responses sums up to the final stack performance, GBO clearly represents a candidate tool for high-performance polarizing or oblique-incidence interference coating design. The most prominent effects, which may be derived theoretically from the anisotropic version of Fresnel’s reflection coefficients [17,18], are summarized in Tab. 2. The first two effects (upper two rows) are essential for the design of efficient polarizers for any angle of incidence. The second group of effects facilitates the design of broadband and omnidirectional mirrors [19]. The critical point is that the orientation of the optical axis as well as the values of the principal indices must be accurately matched to the requirements fixed in the first two columns of Tab. 2, which is impossible in conventional coating manufacturing. In practice, these effects are therefore utilized in polymer interference coatings, where anisotropy may be induced through mechanical stretching. The first polymer multilayer reflectors based on GBO effects have been reported in the literature [10,11].

Table 2. Examples of GBO effects: “e” denotes the extraordinary, and “o” the ordinary principal refractive index. φ is the angle of incidence, and the subscripts “s” and “p” denote s- or p-polarization. z is the direction of the film axis (perpendicular to the film surface)

Optical axis in material 1 | Optical axis in material 2 | Matching condition | GBO effect
isotropic | ∥ to surface and ∥ to incidence plane | n1 = n2o | Rs = 0 ∀φ
isotropic | ⊥ to incidence plane | n1 = n2o | Rp = 0 ∀φ
∥ z | ∥ z | n1e = n2o and n1o = n2e | Rs = Rp ∀φ
∥ z | ∥ z | n1e = n2e | Rp ≠ Rp(φ)

3.2
Rugates
On the contrary, another class of coatings is obtained when the materials are regarded to be optically isotropic, but their optical constants are allowed to change smoothly along the film axis z. Hence, we have optically inhomogeneous coatings (gradient index layers) with a refractive index profile according to
$$\hat{n} = \hat{n}(z,\nu) = n(z,\nu) + i\,k(z,\nu). \qquad (4)$$
For a sinusoidal refractive index profile without dispersion and absorption (so-called rugate structures)
$$n = n(z) = \bar{n} + \mu \sin\left(\frac{4\pi z}{\Lambda_z}\right) f_a(z) \qquad (5)$$
the normal incidence reflectivity has a maximum at the rejection band wavelength
$$\lambda_{\mathrm{reject}} = \bar{n}\,\Lambda_z. \qquad (6)$$
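As a rough numerical check of Eqs. (5) and (6), the following Python sketch slices a rugate profile into thin homogeneous layers and evaluates the normal-incidence reflectance with the standard characteristic-matrix method for non-absorbing layers. The numerical values (average index 1.8, µ = 0.1, Λz = 300 nm, 40 index periods, air as incident medium, a substrate of index 1.52, no apodization) and all function names are illustrative assumptions and are not taken from the text.

```python
import math

def rugate_index(z, n_avg, mu, big_lambda, f_a=lambda z: 1.0):
    """Eq. (5): n(z) = n_avg + mu * sin(4*pi*z/Lambda_z) * f_a(z)."""
    return n_avg + mu * math.sin(4.0 * math.pi * z / big_lambda) * f_a(z)

def reflectance(wavelength, indices, slice_thickness, n_in=1.0, n_sub=1.52):
    """Normal-incidence reflectance of a stack of thin homogeneous slices.

    The stack matrix is the ordered product of the 2x2 characteristic
    matrices of the slices, starting at the incident side.
    """
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n in indices:
        delta = 2.0 * math.pi * n * slice_thickness / wavelength  # phase thickness
        cos_d, sin_d = math.cos(delta), math.sin(delta)
        a11, a12, a21, a22 = cos_d, 1j * sin_d / n, 1j * n * sin_d, cos_d
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_sub
    c = m21 + m22 * n_sub
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2

# Assumed example values; all lengths in nm.
N_AVG, MU, BIG_LAMBDA = 1.8, 0.1, 300.0
N_PERIODS, SLICES_PER_PERIOD = 40, 40
dz = (BIG_LAMBDA / 2.0) / SLICES_PER_PERIOD      # the index period is Lambda_z / 2
profile = [rugate_index((i + 0.5) * dz, N_AVG, MU, BIG_LAMBDA)
           for i in range(N_PERIODS * SLICES_PER_PERIOD)]

lambda_reject = N_AVG * BIG_LAMBDA               # Eq. (6)
r_on = reflectance(lambda_reject, profile, dz)   # at the predicted band centre
r_off = reflectance(700.0, profile, dz)          # far outside the band
print(lambda_reject, r_on, r_off)
```

With these assumed values the reflectance at the wavelength predicted by Eq. (6), 540 nm, is close to unity, whereas it remains much smaller far outside the rejection band (here at 700 nm).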
Note that in Eqs. (5) and (6) the value Λz is not the period, but twice the period of the refractive index profile. The quantity $\bar{n}$ is the spatially averaged refractive index, and µ determines the modulation depth. Eq. (6) is simply a generalization of the well-known formula for the first-order rejection wavelength in a conventional quarter-wavelength high-low stack
$$\lambda_{\mathrm{reject}} = 2\,(n_H d_H + n_L d_L), \qquad (7)$$
where the subscripts H and L stand for the high- and low-index materials. In Eq. (5), the apodization function fa(z) guarantees sidelobe suppression (for more details see [20]). In contrast to conventional high-low refractive index stacks [3], no higher interference order reflection peaks are present in an ideal rugate structure, so that, in principle, rugates may be prepared that exhibit only one reflectance (or rejection) band [2]. The rejection band width of a rugate structure is given by [2,20]
$$\Gamma = \mu\,\Lambda_z \qquad (8)$$
and therefore depends on the refractive index difference, again similar to the quarter-wave stack. The design of rugate filters or any complicated gradient index structure is usually performed in terms of an inverse Fourier transformation of the so-called Q-function that is constructed from the required spectral performance [4,21]. A refractive index profile obtained by multiplication of two different profiles according to Eq. (5) will consequently result in a rugate filter exhibiting exactly two rejection bands centred at the corresponding λreject,1 and λreject,2.
At present, the design and preparation of any gradient index structure is a serious theoretical and technological challenge. First of all, the choice of the Q-function is ambiguous. Moreover, the Fourier transform design method mentioned above does not automatically consider dispersion and absorption of the materials, so that the designer has to perform subsequent corrections to account for realistic material properties [22]. Technological problems arise from the need to produce well-defined refractive index profiles. There are several approaches to the problem:
• Mixtures of two materials with varying filling factors [22]
• Replacement of the profile by a sequence of ultrathin high- and low-index layers according to Southwell [5] (normal incidence only)
• Varying deposition conditions, resulting in a z-dependent degree of film porosity
Among these approaches, no favourite deposition method has been established so far.
3.3 Resonant Grating Waveguide Structures
After all, it might seem promising to combine specific advantages of anisotropic and inhomogeneous layers to achieve novel spectral characteristics. Among the possible combinations, the so-called resonant grating waveguide structures (GWS) (or single-layer waveguide grating) hold a central position due to their prospective filtering properties [23]. In its simplest version, a GWS is built up by a single high-refractive index layer with a one-dimensional diffraction grating on top [12] (see Fig. 1).
Fig. 1. Principal structure of a GWS. In the high-refractive index film, both zero- and first-order diffracted waves may propagate. The first-order diffracted wave suffers total internal reflection at the film boundaries
GWS are candidate systems for extremely narrow-line reflection filters [23], where a guided-mode resonance mechanism in the high-index film may theoretically lead to 100% reflection efficiency, while the system is only weakly reflective off-resonance. Consequently, the reflection spectrum is expected to show narrow peaks of nearly ideal reflection, which suggests applications as narrowband filters. In the context of the previous discussion, these systems combine optical inhomogeneity with anisotropy. Indeed, regarding the diffraction grating as a thin laterally structured film, the latter appears to be laterally inhomogeneous with a periodic modulation of the refractive index. On the other hand, it is clearly anisotropic, therefore exhibiting polarizing properties even at normal light incidence. The theoretical treatment of GWS is complicated because it combines the theory of optical interference coatings with the theory of diffraction gratings. Hence, typical thin film design programs fail here and must be replaced by grating solver software, which performs these calculations within the Rigorous Coupled Wave Approximation (RCWA) [24]. Figure 2 shows the normal incidence reflection spectra calculated in this way for a model system. For each polarization, the spectrum shows a narrow reflection peak. Although RCWA calculations allow the forward search to be performed in a direct way, they are not
Fig. 2. Calculated normal incidence reflectance of a GWS. The film refractive index is 2.3, the substrate index 1.37. The assumed grating period is 475 nm and the film thickness 285.2 nm
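always convenient for reverse search tasks, which are important for any design optimisation. Fortunately, it is easy to derive approximation formulae that allow an analytical estimate of the film thickness and grating period necessary for a required reflection band. For a grating of negligible physical thickness t (t << λ) and normal light incidence the following approximation is obtained
$$d_m = \frac{\lambda_0 \Lambda}{2\pi\sqrt{n^2\Lambda^2 - \lambda_0^2}}\,\left(m\pi + \psi_{21} + \psi_{23}\right); \quad m = 0, 1, 2, \ldots$$
$$\tan\psi_{21,s} = \frac{1}{n^2}\tan\psi_{21,p} = \sqrt{\frac{\lambda_0^2 - \Lambda^2}{n^2\Lambda^2 - \lambda_0^2}}, \qquad (9)$$
$$\tan\psi_{23,s} = \frac{n_{\mathrm{sub}}^2}{n^2}\tan\psi_{23,p} = \sqrt{\frac{\lambda_0^2 - n_{\mathrm{sub}}^2\Lambda^2}{n^2\Lambda^2 - \lambda_0^2}}; \quad \Lambda \in \left(\frac{\lambda_0}{n}, \frac{\lambda_0}{n_{\mathrm{sub}}}\right).$$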
Here, λ0 is the required central wavelength of the reflection band, the subscripts s and p denote s- and p-polarization, and m is the interference order. The values ψ21 and ψ23 are the phase shifts that the first-order diffracted wave suffers upon total reflection at the film-air interface (ψ21) and at the film-substrate interface (ψ23). The derived formulae allow a straightforward and simple estimation of the film thickness for a resonant GWS. Nevertheless, they only provide a first approximation, because the finite thickness t of the grating is neglected. Again, the effect of the grating thickness t may be studied by RCWA calculations. Fig. 3 presents the effect of t on the reflection band wavelength λ0 and width Γ for a GWS designed for λ0 = 670 nm by Eqs. (9), assuming m = 1; all other parameters are the same as in Fig. 2. Obviously, for vanishing t, the reflection wavelength approaches the design wavelength of 670 nm, while the reflection bandwidth decreases to zero. For a fixed grating period, a more detailed analysis confirms the following:
Fig. 3. Dependence of rejection wavelength λ0 (left) and rejection bandwidth Γ (right) on the grating thickness t, as obtained from RCWA-calculations
• λ0 increases with increasing t
• Γ increases with increasing t, at least for t << λ0
• Γ decreases with increasing m (or increasing d)
• The number of reflection peaks increases with increasing m.
Within the limitation given by Eqs. (9), it is therefore possible to control both the width and the position of the reflection band by two purely geometrical parameters (film thickness and grating thickness) without changing the period of the grating. This is in contrast to classical quarter-wave-stack reflectors as well as to rugates, where the width of the reflection band is determined by the refractive index difference, and it therefore offers qualitatively new design tools. It is interesting to note that a proper choice of Λ, n, nsub and m allows reflectors to be designed with a predefined number of reflectance peaks, similar to what may be achieved with rugates. On the other hand, the complex polarization behaviour may open completely new application fields, which are not yet fully understood. Technologically, the deposition of the single thin film is a rather trivial matter, while the writing of the grating is a much more troublesome and expensive step, usually performed by lithographic techniques. By means of electron beam lithography, grating periods down to 20 nm are accessible today [25]. In practice, GWS reflectivities of about 80 % have been achieved experimentally so far in the visible spectral range, while the rejection band width was around 0.1 nm [12]. There is also a report on the preparation of a two-dimensional GWS structure, reaching more than 60 % reflectivity at 781 nm wavelength [26].
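As a numerical illustration of Eqs. (9), the following Python sketch evaluates the film thickness d_m for a given design wavelength, grating period and refractive indices. The function name and its interface are assumptions made for this example; the parameter values are those quoted for Figs. 2 and 3 (n = 2.3, nsub = 1.37, Λ = 475 nm, λ0 = 670 nm, m = 1). Which polarization underlies the quoted film thickness is not stated in the text, so both are evaluated.

```python
import math

def gws_film_thickness(lambda0, period, n_film, n_sub, m=1, pol="p"):
    """Film thickness d_m from Eqs. (9) for a thin grating at normal incidence.

    lambda0 and period share one length unit, which is also the unit of
    the returned thickness. pol is "s" or "p".
    """
    if pol not in ("s", "p"):
        raise ValueError("pol must be 's' or 'p'")
    if not lambda0 / n_film < period < lambda0 / n_sub:
        raise ValueError("grating period outside (lambda0/n, lambda0/n_sub)")
    root = math.sqrt(n_film ** 2 * period ** 2 - lambda0 ** 2)
    # s-polarized phase-shift tangents at the film-air and film-substrate interfaces
    tan21 = math.sqrt(lambda0 ** 2 - period ** 2) / root
    tan23 = math.sqrt(lambda0 ** 2 - n_sub ** 2 * period ** 2) / root
    if pol == "p":
        tan21 *= n_film ** 2
        tan23 *= n_film ** 2 / n_sub ** 2
    psi21, psi23 = math.atan(tan21), math.atan(tan23)
    return lambda0 * period * (m * math.pi + psi21 + psi23) / (2.0 * math.pi * root)

# Parameters quoted for Figs. 2 and 3 (lengths in nm)
d_p = gws_film_thickness(670.0, 475.0, 2.3, 1.37, m=1, pol="p")
d_s = gws_film_thickness(670.0, 475.0, 2.3, 1.37, m=1, pol="s")
print(round(d_p, 1), round(d_s, 1))
```

For p-polarization the result is close to 285 nm, in line with the film thickness of 285.2 nm quoted for Fig. 2, while s-polarization gives a noticeably thinner film for the same m.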
4 Towards Heterogeneous Optical Film Materials
4.1 Motivation
A certain set of accessible optical constants is provided by the conventional optical thin film materials and consequently used in thin film design. Clearly, the maximum principle [7] guarantees the existence of a solution to design problems in terms of a (possibly very thick) high-low stack, but it is valid only for normal incidence and non-absorbing materials. There remain enough problems where it would be of use to have further optical constants available. For example, highly porous SiO2 may be prepared to have a refractive index of 1.22 [27], and may therefore be used as a comparatively thin single-layer antireflection coating for glasses. In this section, we will instead focus on absorbing heterogeneous films, namely on metal island films embedded into dielectric thin film materials. It is common knowledge that metal island films show strong absorption lines that are caused by the optical excitation of surface plasmon resonances of the free electrons [28]. The resonance frequency values depend on the size and shape of the islands (often called clusters), they may be influenced by
aggregation effects [28,29,30,31], and they depend on the dielectric properties of both the cluster material and the environment [28,29,30,31,32]. Metal island films therefore represent a special case of an absorber film, while their optical behaviour may be controlled by a variety of preparation parameters. One may consequently manipulate the optical behaviour of metal island assemblies in order to prepare materials with tailored optical absorption properties. Incorporating for example silver particles into thin films of various fluoride or oxide optical thin film materials may result in qualitatively new absorbing materials essential for the design of selective absorber coatings.
4.2 Optical Properties of Metal Island Films
In the following, we will report on samples that have been prepared by e-beam evaporation of a dielectric layer, followed by thermal evaporation of the silver fraction, which forms the island film, while the sandwich is completed by a further dielectric film. In each sample, intentionally the same amount of silver (corresponding to an average thickness of 4 nm, as recorded by quartz monitoring) has been embedded in a 6 nm thick dielectric film, formed from either SiO2 or Al2O3. The optical transmittance T and reflectance R of all films have been measured by a Perkin Elmer Lambda 19 spectrophotometer. To correlate the optical properties with the sample morphology, transmission electron microscopy (TEM) has been applied. Figure 4 (left) shows the measured optical loss L = 1 − T − R for silver islands embedded in SiO2. The samples differ in their deposition temperature, which has a tremendous effect on their optical properties. The loss peak of the silver particles shifts to the blue with increasing deposition temperature. In order to quantify the experimental results, the following spectral moments have been calculated for each sample
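$$L_0 = \int_{\nu_{\min}}^{\nu_{\max}} L(\nu)\,d\nu; \quad \bar{\nu} = \frac{1}{L_0}\int_{\nu_{\min}}^{\nu_{\max}} L(\nu)\,\nu\,d\nu; \quad \delta\nu = \sqrt{\frac{1}{L_0}\int_{\nu_{\min}}^{\nu_{\max}} L(\nu)\,(\nu - \bar{\nu})^2\,d\nu}. \qquad (10)$$
The zeroth moment $L_0$ is related to the integral loss, the first moment $\bar{\nu}$ defines the centre of the loss band, and δν its width, all data given in wavenumber units. To return to the more familiar description in wavelength units, we further regard the values of
$$\delta\lambda \equiv 2\bar{\lambda}^2\,\delta\nu; \quad \bar{\lambda} \equiv \bar{\nu}^{-1}$$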
Fig. 4. Left: The measured optical loss L of samples, where silver clusters are embedded in SiO2. An increase in deposition temperature leads to a blue-shift of the surface plasmon absorption maximum. Right: TEM image of the 20 °C (top) and 300 °C (bottom) samples. Area 175 nm × 175 nm. Dark spots: silver islands
Fig. 5. Experimental values for $\bar{\lambda}$ and δλ at different deposition temperatures
which describe the absorption line width and central wavelength, respectively. The values calculated in this way are shown in Fig. 5. Both $\bar{\lambda}$ and δλ tend to decrease with increasing deposition temperature. This behaviour of the samples is caused by the particular morphology of the embedded silver islands. Figure 4 (right) shows examples of transmission electron microscopy (TEM) results. At low deposition temperatures, the islands are rather irregular in shape (top), while they become more spherical with increasing deposition temperature (bottom). Due to the small lateral dimensions of the islands and their distances, we assume that the observed optical loss is mainly due to absorption, and not due to scattering. The theoretical treatment of metal cluster assemblies is difficult (see for example [29,30,31]). It is nevertheless possible to obtain a qualitative understanding of the main optical effects. Clearly, in non-spherical silver islands (as mainly occurring at low deposition temperatures), the plasmon excitation
may be accomplished along different axes of the cluster. Excitations parallel to the longer axis of a prolate cluster lead to light absorption at lower frequencies, which causes both the red-shift (shift of $\bar{\lambda}$) and the broadening (increase in δλ) of the resulting absorption line of statistical assemblies of non-spherical clusters when compared to spheres. A rigorous method to calculate the absorption of periodic planar cluster assemblies is again offered by RCWA calculations. In order to apply this method to the calculation of the optical response of planar silver cluster assemblies such as shown in Fig. 4, one has to assume that the cluster arrangement is periodically continued in two dimensions. In practice, we assumed a 300 nm × 300 nm wide “elementary cell” which contains fifty-five silver islands of 7.8 nm height. Figure 6 shows the result of the calculation of the optical loss of such a silver cluster arrangement embedded in Al2O3, compared to the corresponding experimental loss of samples which have been deposited at 200 °C and 300 °C. As seen from the figure, the shape of the experimental absorption line is reproduced very well by the RCWA calculations. In summary, we state that the silver cluster absorption line may be controlled by typical thin film deposition parameters. Here, it is mainly the deposition temperature that is crucial for the optical film properties. This is particularly important for the incorporation of such island films into conventional interference coating designs. In the future, these ultrathin metal-dielectric composite films are expected to be incorporated into multilayer stacks to design thin film systems with tailored absorption and reflection behaviour.
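The spectral moments of Eq. (10) are straightforward to evaluate numerically for a sampled loss spectrum. The Python sketch below is illustrative only: the Gaussian test band, the trapezoidal integration and the function names are assumptions and do not represent the measured data. The width δν is taken here as the square root of the normalized second central moment, so that it carries wavenumber units.

```python
import math

def _trapezoid(y, x):
    """Trapezoidal rule for samples y on a (possibly non-uniform) grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def spectral_moments(nu, loss):
    """Return (L0, nu_mean, delta_nu, lambda_mean_nm, delta_lambda_nm).

    nu   : wavenumbers in cm^-1 (ascending)
    loss : optical loss L = 1 - T - R sampled at nu
    """
    l0 = _trapezoid(loss, nu)
    nu_mean = _trapezoid([l * v for l, v in zip(loss, nu)], nu) / l0
    second = _trapezoid([l * (v - nu_mean) ** 2 for l, v in zip(loss, nu)], nu) / l0
    delta_nu = math.sqrt(second)
    lambda_mean_nm = 1.0e7 / nu_mean                              # 1 cm = 1e7 nm
    delta_lambda_nm = 2.0 * lambda_mean_nm ** 2 * delta_nu * 1.0e-7
    return l0, nu_mean, delta_nu, lambda_mean_nm, delta_lambda_nm

# Synthetic Gaussian loss band (assumed): centre 20000 cm^-1, sigma 1000 cm^-1
nu_grid = [10000.0 + 10.0 * i for i in range(2001)]
loss_band = [0.3 * math.exp(-(v - 20000.0) ** 2 / (2.0 * 1000.0 ** 2)) for v in nu_grid]
l0, nu_mean, delta_nu, lam_nm, dlam_nm = spectral_moments(nu_grid, loss_band)
print(nu_mean, delta_nu, lam_nm, dlam_nm)
```

For the synthetic band the moments recover the centre and the standard deviation of the Gaussian, which correspond to a central wavelength of 500 nm and a width δλ of 50 nm.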
Fig. 6. Left: The optical loss L of the samples, where the silver clusters are embedded in Al2O3. The relevant deposition temperature is indicated in the figure. Solid line: measured; dashed line: RCWA calculation. Right: TEM image of the 300 °C sample (top) and assumed island geometry in the RCWA calculation (bottom). Area 300 nm × 300 nm. Dark spots: silver islands
5 Some Comments and Summary
In this paper, we made an excursion through modern research fields concerned with thin film optics. Starting from conventional optical coatings, we moved to adjacent areas by successively reducing the symmetry of the single films. In Sect. 3, we mainly dealt with systems designed for high reflectivity. In order to avoid confusion, it is important to make a final comment concerning the rejection band width of the systems considered: In quarter-wave stacks as well as in rugates, the rejection band has a rather rectangular shape. Therefore, in broad rejection bands, the FWHM (Full Width at Half Maximum) is approximately identical with the wavelength range of high reflection. The situation is completely different in GWS structures. Their rejection bands are rather Lorentzian in shape (see Fig. 2), so that Γ has to be interpreted as FWHM, while high reflection occurs only at the central wavelength. Keeping this in mind, we may state that in GWS structures, the linewidth may be controlled by geometrical parameters only, while in conventional stacks as well as in rugates, the refractive index contrast is crucial for the linewidth. For the design of narrow-line filters, GWS structures may therefore be of advantage. In Sect. 4, we mainly reported on the optical properties of silver island films embedded in oxide materials. Their optical behaviour may be influenced by a conventional technological deposition parameter, namely the temperature. We see possible applications in selective absorber designs. As a third point, we would like to emphasize the use of RCWA calculations in application to any laterally inhomogeneous thin film system, which has been demonstrated throughout the paper.
Acknowledgements
The author is grateful to H. Heisse (Fraunhofer IOF) for sample preparation and to Dr. U. Kaiser and J. Biskupek (both Jena University) for TEM measurements. Parts of the calculations have been performed by Petra Heger and Agnes Sambale (both Fraunhofer IOF). Fruitful discussions with Dr. Kaiser (Fraunhofer IOF) are acknowledged as well as financial support by the BMBF and the TMWFK.
References
1. National Research Council, Harnessing light (National Academy Press, Washington 1998), p. 246
2. K. Lewis, in: A. H. Guenther (ed.), International Trends in Applied Optics V (SPIE-Press, Bellingham 2002), p. 187
3. A. Thelen, Design of Optical Interference Coatings (McGraw-Hill Book Company 1989)
4. H.A. Macleod, Thin Film Optical Filters (Institute of Physics Publishing, Bristol, Philadelphia 2001)
5. W.H. Southwell, Appl. Opt. 24, 457 (1985)
6. S.A. Furman, A.V. Tikhonravov, Basics of Optics of Multilayer Systems (Edition Frontieres, Paris 1992)
7. A.V. Tikhonravov, Appl. Opt. 32, 5417 (1993)
8. M. Born, E. Wolf, Principles of Optics (Pergamon Press 1968)
9. J.A. Dobrowolski, F.C. Ho, A. Waldorf, Appl. Opt. 22, 3191 (1983)
10. M.F. Weber, C.A. Stover, L.R. Gilbert, T.J. Nevitt, A.J. Ouderkirk, Science 287, 2451 (2000)
11. R. Strharsky, J. Wheatley, Optics & Photonics News 13, 34 (2002)
12. A. Sharon, D. Rosenblatt, A.A. Friesem, J. Opt. Soc. Am. A 14, 2985 (1997); A. Sharon, S. Glasberg, D. Rosenblatt, A.A. Friesem, J. Opt. Soc. Am. A 14, 588 (1997)
13. S.S. Wang, R. Magnusson, Applied Optics 32, 2606 (1993); S.S. Wang, R. Magnusson, Opt. Lett. 19, 919 (1994)
14. O. Stenzel, in: B. Kramer (Ed.), Advances in Solid State Physics (Vieweg & Sohn Verlagsgesellschaft mbH, Braunschweig/Wiesbaden 1999), p. 151
15. K. Sakoda, Optical Properties of Photonic Crystals (Springer Series in Optical Sciences 80, Springer-Verlag, Berlin Heidelberg 2001)
16. D.W. Pohl, in: S. Kawata, Near-Field Optics and Surface Plasmon Polaritons (Topics in Applied Physics 81, Springer-Verlag, Berlin Heidelberg 2001), p. 9
17. A. Roeseler, Infrared Spectroscopic Ellipsometry (Akademie-Verlag, Berlin 1990), p. 24
18. R.M.A. Azzam, N.M. Bashara, Ellipsometry and Polarized Light (Elsevier, Amsterdam 1989), p. 354
19. J.P. Dowling, Science 282, 1841 (1998)
20. W.H. Southwell, R.L. Hall, Appl. Opt. 28, 2949 (1989); W.H. Southwell, Appl. Opt. 28, 5091 (1989)
21. P.G. Verly, J.A. Dobrowolski, Appl. Opt. 29, 3672 (1990)
22. D. Poitras, S. Larouche, L. Martinu, Appl. Opt. 41, 5249 (2002)
23. L. Escoubas, F. Flory, in: N. Kaiser, H.K. Pulker, Optical Interference Coatings (Springer Series in Optical Sciences 88, Springer-Verlag, Berlin Heidelberg 2003)
24. K. Hehl, J. Bischoff, UNIGIT grating solver software (2001)
25. Y. Nie, Z. Wang, K. Fu, C. Zhou, Q. Zhang, Opt. Eng. 41, 674 (2002)
26. S. Peng, G.M. Morris, Opt. Lett. 21, 549 (1996)
27. M.D. Morariu, S. Walheim, U. Steiner, Thin nanoporous silica films as high quality antireflection coatings, oral presentation at the International Workshop on Nanostructures for Electronics and Optics (NEOP), Oct. 6-9, 2002, Dresden, Germany (no proceedings)
28. U. Kreibig, M. Vollmer, Optical Properties of Metal Clusters (Springer Series in Material Science 25, Springer-Verlag 1995)
29. J.M. Gerardy, M. Ausloos, Phys. Rev. B 25, 4204 (1982)
30. A. Lebedev, O. Stenzel, M. Quinten, A. Stendal, M. Röder, M. Schreiber, D.R.T. Zahn, J. Opt. A: Pure Appl. Opt. 1, 573 (1999)
31. A. Lebedev, O. Stenzel, Eur. Phys. J. D 7, 83 (1999)
32. O. Stenzel, A. Stendal, M. Röder, S. Wilbrandt, D. Drews, T. Werninghaus, C. von Borczyskowski, D.R.T. Zahn, Nanotechnology 9, 6 (1998); O. Stenzel, Journ. Clust. Sci. 10, 169 (1999)
AC-Calorimetry at High Pressure and Low Temperature
Heribert Wilhelm
Max-Planck-Institut für Chemische Physik fester Stoffe, Nöthnitzer Str. 40, 01187 Dresden, Germany
Abstract. Recent developments of the ac-calorimetric technique adapted for the needs of high pressure experiments are discussed. A semi-quantitative measurement of the specific heat with a Bridgman-type of pressure cell as well as a diamond anvil cell is possible in the temperature range 0.1 K < T < 10 K. The pressure transmitting medium used to ensure good pressure conditions determines to a great extent, via its thermal conductivity, the operating frequency and thus the accessible temperature range. Investigations with different pressure transmitting media for T > 1.5 K reveal for solid He a cut-off frequency which is considerably higher than for steatite. Experiments below 1 K and at pressures above 10 GPa clearly show that the pressure dependence of the linear temperature coefficient of the specific heat can be measured. It is in qualitative agreement with a related quantity obtained quasi-simultaneously by electrical resistivity measurements on the same sample.
B. Kramer (Ed.): Adv. in Solid State Phys. 43, pp. 889–901, 2003. © Springer-Verlag Berlin Heidelberg 2003
The specific heat (C) is an important thermodynamic quantity. Its temperature dependence can deliver hints about microscopic energy scales and provides a powerful tool to identify phase transitions. In this respect temperature (T) dependent measurements are an indispensable means not only for experimentalists. This has triggered the development of different and very sophisticated technical realizations to obtain C(T) from the millikelvin range up to very high temperature. The available methods can be divided into two categories. Adiabatic techniques are considered as the most accurate way to estimate the absolute value of C(T). They require sample masses of several grams and the subtraction of the addenda, i.e., the specific heat of sample holder and thermometer. Among the non-adiabatic (or dynamic) methods, ac-calorimetry is a suitable technique for samples with masses well below one milligram. The specific heat can be measured with very high sensitivity, despite the small masses. However, the absolute accuracy which can be achieved is less than for the adiabatic methods. Adiabatic techniques are used to detect pressure-induced phase transitions or to investigate the evolution of electronic properties as the unit cell volume is reduced. The sample masses needed demand large volume pressure cells, such as a piston-cylinder cell. With this technique the accessible pressure range is, however, limited to about 3.5 GPa. Very often it would be desirable for the pressure range to be extended. In this case an anvil-type of pressure cell is the only alternative. Such a high pressure tool demands a much smaller sample volume, which makes an adiabatic measurement a hopeless venture. Thus, ac-calorimetry is an ideal method to be used for pressures beyond the limit of piston-cylinder cells.
1 AC-Calorimetry Adapted for High Pressure
The general set-up of the ac-calorimetric technique for measuring the specific heat is sketched as a simplified model in Fig. 1a. The sample is thermally excited by an oscillating heating power P = P0[1 + cos(ωt)], e.g. generated by a current of frequency ω/2 through a resistance heater. The temperature oscillations at frequency ω are detected with a thermometer attached to the sample. Sullivan and Seidel [1] obtained a relation between the amplitude Tac of the temperature oscillations and the specific heat C of the sample:
$$T_{ac} = \frac{P_0}{\omega C}\left[1 + \frac{1}{\omega^2\tau_1^2} + \omega^2\tau_2^2\right]^{-1/2}. \qquad (1)$$
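The frequency dependence contained in Eq. (1) can be made explicit with a few lines of Python. The sketch below is an illustration under assumed parameter values (P0, C, the thermal link κ and τ2 are chosen arbitrarily to give ω1 = κ/C = 100 s⁻¹ and ω2 = 1/τ2 = 10⁵ s⁻¹); the function name is hypothetical, and κ is treated as a thermal conductance in W/K so that τ1 = C/κ is a time.

```python
import math

def t_ac(omega, p0, heat_capacity, kappa, tau2):
    """Amplitude of the temperature oscillation according to Eq. (1)."""
    tau1 = heat_capacity / kappa                       # sample-to-bath relaxation time
    bracket = 1.0 + 1.0 / (omega * tau1) ** 2 + (omega * tau2) ** 2
    return p0 / (omega * heat_capacity) / math.sqrt(bracket)

# Assumed example values: W, J/K, W/K, s
P0, C, KAPPA, TAU2 = 1.0e-6, 1.0e-6, 1.0e-4, 1.0e-5
omega1, omega2 = KAPPA / C, 1.0 / TAU2

omega_mid = math.sqrt(omega1 * omega2)                 # well inside omega1 << omega << omega2
plateau = t_ac(omega_mid, P0, C, KAPPA, TAU2) * omega_mid * C / P0
low = t_ac(0.01 * omega1, P0, C, KAPPA, TAU2)          # omega << omega1
high = t_ac(100.0 * omega2, P0, C, KAPPA, TAU2)        # omega >> omega2
print(plateau, low, P0 / KAPPA, high)
```

In the intermediate range the product ω C Tac / P0 is close to one, so that Tac directly yields C. At low frequency Tac approaches P0/κ and no longer depends on C, and at high frequency it is strongly reduced.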
This equation contains the time constants τ1 = C/κ and τ2, with κ the thermal conductivity of the thermal link between sample and temperature bath (see Fig. 1a). It was derived for the ideal case, in which the heat capacities of thermometer, heater, and heat link between sample and temperature bath are negligible, and assuming a perfect coupling between heater, sample, and thermometer. The measured value of Tac depends on the measuring frequency ω (Fig. 1b): At low frequency (ω ≪ ω1 = κ/C) the mean sample temperature is above the bath temperature T0 by Tdc ∝ P0/κ. The recorded temperature oscillation Tac yields the specific heat of the sample if the frequencies are in the range ω1 ≪ ω ≪ ω2 = 1/τ2. The possibility of tuning both the amplitude and the frequency of the excitation is the main advantage of this method; as long as κ can be made small enough, the sensitivity of the measurement does not depend on the mass of the sample. This technique was employed by several groups [2,3,4,5] to investigate the pressure dependence of the specific heat. The conditions for the ac-technique in a pressure cell are far from ideal. In particular, the thermal properties of the pressure transmitting medium have to be taken into account. This was done by Baloga and Garland [3] for the case of high gas densities and low sample thermal conductivities. In their accessible temperature range (245 K < T < 300 K) the general relation between Tac and C of the ac-calorimetric expression (1) can be recovered if the product of specific heat and thermal conductivity of the pressure transmitting medium is negligible with respect to that of the sample. Then the heat wave does not propagate too far into the pressure transmitting medium and its specific heat does not contribute too much to Tac. Typical frequencies are of the order of 1 Hz. Eichler and Gey [4] were the first to use the ac-technique for metallic samples in a piston-cylinder cell (pmax ≈ 3.5 GPa) at low temperature (1.3 K < T <
Fig. 1. (a) Sketch of a general ac-calorimetric assembly. The sample, thermal bath, thermometer, and heater are in contact by a thermal link with thermal conductivity κ. τ1 is a measure of the thermal relaxation between sample and bath; τ2 comprises the relaxation of thermometer, heater, and sample. (b) Sample temperature T(t) for different frequency domains: For ω ≪ ω1, Tac ≡ Tdc ∝ P0/κ is not frequency dependent and is a measure of the thermal conductivity κ. In the range ω1 ≪ ω ≪ ω2, the amplitude of the ac-part Tac ∝ (ωC)^−1 depends on the measuring frequency and yields the specific heat of the sample. At ω ≫ ω2, Tac is strongly reduced. Independent of the frequency, the mean sample temperature is T = T0 + Tdc
7 K). Here, the sample was embedded in diamond powder, which acted as pressure transmitting medium and provided the thermal resistance between the sample and the pressure cell. The measuring frequency was 120 Hz. Pressures well above 3.5 GPa can only be achieved with opposed anvils, i.e., with a clamped Bridgman anvil technique or a diamond anvil cell (DAC). In Bridgman cells, the anvils are often made of tungsten carbide (WC) or synthetic diamond and the pressure chamber consists of pyrophyllite (a sheet silicate, Al2Si4O10(OH)2). The sample is placed between two disks of, e.g., the soft mineral steatite (3MgO·4SiO2·H2O), which acts as pressure transmitting medium. In a DAC a metallic gasket contains the sample and the pressure transmitting medium. Compared to a Bridgman cell a DAC offers several advantages. First of all, the pressure range can be extended easily to 50 GPa. Furthermore, the transparent anvils give optical access to the sample and the pressure can be determined with the ruby fluorescence method. Finally, the most important point is the possibility of using He as pressure transmitting medium. With respect to hydrostatic pressure conditions, solidified He is an ideal medium since it is highly plastic and inert. However, these desirable features might lead one to underestimate the effort involved in the elaborate assembly of the ac-calorimetric circuit in a DAC. The feasibility of the ac-technique at pressures well above the limit of the piston-cylinder cells has to be tested, regardless of the type of high pressure cell. From the general principle of ac-calorimetry (Fig. 1) it is evident that the main challenge is the unknown thermal properties of the pressure
κ (W/m/K)
transmitting medium. To shed some light on this, the thermal conductivity of the two preferred media, steatite and He, is discussed qualitatively in the following. A priori it is not evident whether the pressure media satisfy the requirements assumed in the derivation of (1), because little is known about their thermal conductivity under pressure. To get an overview of κ(T) of the materials used in a Bridgman device, the thermal conductivities of WC, pyrophyllite, and steatite were measured at ambient pressure (Fig. 2). Literature data for diamond [6] and solid He at different pressures [7,8] are also depicted in Fig. 2. At low temperatures κ(T) of steatite can be a factor of $10^4$ smaller than that of solid He at about 0.1 GPa. Moreover, the purity of He significantly affects the shape and size of κ(T). For very pure He [10] the maximum thermal conductivity can be one order of magnitude higher than for unpurified He at almost the same pressure [8]. Fortunately, it is very likely that the solidified He in the pressure chamber of a DAC is polycrystalline and contains impurities brought in during the filling procedure. Therefore, κ(T) might not reach the high values expected for pure He at the same pressure.
Fig. 2. Thermal conductivities κ(T) of different materials used in high pressure devices with opposed anvils. WC and diamond are often used as anvils. Data for diamond (type Ib) are taken from [6]. Pyrophyllite and steatite serve as gasket and pressure transmitting medium, respectively. Due to its plasticity, solid He permits homogeneous pressure conditions. The κ(T) data for He are taken from [7] (60 bar) and [8] (380 bar and 800 bar). Unpurified He (1063 bar [8]) has a significantly different κ(T). The indicated pressures are those at which the crystals were grown. The solid line represents a calculated κ(T) of He at 2 GPa (see text). κ(T) of CeRu2Si2 [9] stands in for the thermal conductivity of heavy Fermion compounds at low temperature.
As was pointed out in [8], the maximum value of κ(T) occurs roughly at ΘD/50 (ΘD: Debye temperature). Since He is highly compressible, ΘD, and thus the temperature of the maximum of κ(T), increases rapidly with pressure. As a result the value of κ(T) at, e.g., 1 K could decrease considerably. In order to get a rough estimate of κ(T) of He at several GPa, the following assumptions are made: (i) The low temperature slope remains unchanged at high pressure. (ii) The maximum value of κ(T) at Tmax does not increase with pressure. (iii) Tmax ≈ ΘD/50 can be estimated using $\Theta_D(V) = \Theta_D(V_0)\,(V/V_0)^{-\gamma}$, with γ = 2.4 [11]. (iv) The maximum value of κ(T) for He in a DAC might be comparable to that of impure He at about 0.1 GPa (Fig. 2). Based on the equation of state for solid He [12], the molar volume at low temperature and 2 GPa is inferred to be about V ≈ 6 cm³/mole. This volume together with ΘD(V0) ≈ 90 K at V0 = 11.77 cm³/mole [11] yields Tmax ≈ 9 K. Then κ(T) at 2 GPa (line in Fig. 2) can be estimated with the assumptions specified above.
The κ(T) curves of steatite and He shown in Fig. 2 clarify the differences in an ac-calorimetric experiment with these pressure media. The cut-off frequency in the case of steatite changes continuously since κ(T) is a monotonic function below 10 K. Moreover, it is very likely that its shape will not be affected strongly by pressure. Thus, ω1 at a given temperature should vary only slightly with pressure. For steatite, $\kappa(T) = aT^{2.3}$ with $a = 6.7 \times 10^{-3}$ W/m/K$^{3.3}$ is a good approximation of the data below 8 K. Together with the heat capacity of a typical heavy Fermion compound like CeRu2Ge2 [13] or CePd2Ge2 [14], a cut-off frequency ω1/(2π) ≈ 100 Hz at ambient pressure is calculated. κ(T) of pressurized He, however, varies drastically with temperature and pressure.
Comparing κ(T) of He at 2 GPa and 4.2 K with that of steatite at ambient pressure shows that ω1 will be roughly a factor of 10 larger, of the order of several kHz. At these frequencies severe constraints are put on the homogeneity of the temperature in the sample. A homogeneous temperature distribution in the sample is given if the thermal wavelength λth ∝ κ/(Cω) is of the order of the sample thickness (typically about 30 µm). This condition is already fulfilled at about ω/(2π) ≈ 1 kHz for the compounds mentioned above. Nevertheless, these frequencies are well below ω2 since the metallic samples and thermometer ensure high thermal conductivity. These considerations show that an ac-calorimetric measurement of metallic samples enclosed in solid He is more difficult for 1 K < T < 10 K than for the same sample embedded in steatite. Outside this temperature interval κ(T) of He can be as small as that of steatite at ambient pressure, and it might be expected that Tac is dominated by the specific heat of the sample. If the assumptions made above hold for even higher pressure, the cut-off frequency could be reduced significantly and ac-calorimetry in a DAC would become feasible in a larger temperature range.
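The estimate of Tmax for solid He at 2 GPa can be retraced numerically. The following is a minimal sketch, assuming that the Debye temperature rises on compression as ΘD(V) = ΘD(V0)(V/V0)^(−γ) and that Tmax = ΘD/50 holds exactly; the function names are illustrative and not taken from the original work. The numerical inputs are the values quoted in the text, and the steatite power law is included only for comparison.

```python
# Rough estimate of the temperature at which kappa(T) of solid He peaks,
# based on the volume scaling of the Debye temperature quoted in the text.
# All function names are hypothetical helpers for this illustration.

GAMMA = 2.4        # Grueneisen parameter [11]
THETA_D0 = 90.0    # Debye temperature in K at the reference volume [11]
V0 = 11.77         # reference molar volume in cm^3/mole [11]
V_2GPA = 6.0       # molar volume at 2 GPa in cm^3/mole, inferred from [12]


def debye_temperature(volume, theta0=THETA_D0, v0=V0, gamma=GAMMA):
    """Debye temperature in K at a given molar volume (rises on compression)."""
    return theta0 * (volume / v0) ** (-gamma)


def t_max(volume):
    """Temperature of the kappa(T) maximum, assuming T_max = Theta_D / 50."""
    return debye_temperature(volume) / 50.0


def kappa_steatite(temperature, a=6.7e-3, exponent=2.3):
    """Power-law approximation for steatite below 8 K, in W/m/K."""
    return a * temperature ** exponent


if __name__ == "__main__":
    print(f"Theta_D at 2 GPa: {debye_temperature(V_2GPA):.0f} K")
    print(f"T_max at 2 GPa:   {t_max(V_2GPA):.1f} K")
    print(f"kappa of steatite at 4.2 K: {kappa_steatite(4.2):.2f} W/m/K")
```

With these inputs the sketch reproduces Tmax ≈ 9 K. The result is sensitive to the molar volume, so it should be read as an order-of-magnitude check only.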
2
AC-Calorimetry in Different Pressure Environments
The previous section outlined the general aspects of ac-calorimetry and considered the frequency domain in which experiments could be conducted. Two independent experiments [15,16,17], using the same compound but different pressure devices and pressure transmitting media, provide experimental information about the cut-off frequency. Both investigations explored pressures up to 8 GPa and temperatures in the range 1.5 K < T < 10 K. Figure 3 shows the pressure chamber of the Bridgman cell before closing the device. The typical thicknesses of the sample, thermocouple, and heating wires are 20, 12, and 3 µm, respectively. Two different ways of supplying the heat to the samples were tested. For sample A a thin electrical insulation (4–5 µm of an epoxy/Al2O3 mixture) prevents electrical contact with the heater but still establishes a good thermal contact. Sample B is set apart on a metallic (Pb) foil, electrically (and thus thermally) linked to the heater through a gold wire. No heating current passes through this sample.
Fig. 3. Top view of the inner part of a Bridgman-type pressure cell before closing. Two samples of CeRu2Ge2 are arranged for an ac-calorimetric experiment. Sample A is placed on top of the heater wires but is insulated from them. Sample B is in contact with a metallic foil and thermally linked to the heater through a Au-wire. The Chromel-AuFe thermocouples measure the sample temperature. The Pb-wire serves as pressure gauge. The entire assembly is mounted on a disk of steatite. Labels in the figure: heater current I(ω/2), lock-in amplifier signal V(ω), Pb manometer voltage VPb, steatite, pyrophyllite; scale bar 1 mm.
In the course of the experiment it turned out that configuration A provided a homogeneous temperature distribution, whereas configuration B ensured hydrostatic pressure conditions. The heating power was chosen in such a way that the temperature oscillations were in the range 2 mK < Tac < 20 mK. They were measured with a AuFe/Au thermocouple (Au + 0.07 at% Fe). The thermovoltage Vac arises from the temperature difference between the sample (at T0 + ∆T) and the edge of the sample chamber (at T0) [18]. The thermovoltage was amplified at room temperature in two stages and read by lock-in detection referenced to the frequency of the heating current. However, two potential drawbacks should not be concealed: (i) The temperature of the samples is measured with a thermocouple, under the assumption that the ambient pressure calibration holds at high pressure. (ii) The total amount of heat supplied to the samples is not known, despite the resistive heating. So far this prevents the acquisition of absolute values for the specific heat.
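The detection scheme can be illustrated with a short numerical sketch. It assumes, following the labels I(ω/2) and V(ω) in Fig. 3, that the heater is driven at half the detection frequency, so that the Joule power, which is proportional to the square of the current, oscillates at the detection frequency. The functions below are hypothetical illustrations and do not describe the actual instrument; the value of 120 Hz is only an example.

```python
import math


def heating_power(t, f_signal, i0=1.0, resistance=1.0):
    """Joule power R*I(t)^2 for a heater current I0*sin(2*pi*(f_signal/2)*t)."""
    current = i0 * math.sin(math.pi * f_signal * t)
    return resistance * current ** 2


def lock_in_amplitude(samples, times, f_ref):
    """Amplitude of the component of `samples` at the reference frequency.

    The signal is multiplied by cosine and sine references and averaged,
    which is the basic operation of a digital lock-in amplifier.
    """
    n = len(samples)
    x = 2.0 / n * sum(s * math.cos(2 * math.pi * f_ref * t)
                      for s, t in zip(samples, times))
    y = 2.0 / n * sum(s * math.sin(2 * math.pi * f_ref * t)
                      for s, t in zip(samples, times))
    return math.hypot(x, y)


if __name__ == "__main__":
    f_detect = 120.0                     # detection frequency in Hz (example)
    rate = 12000                         # samples per second
    times = [k / rate for k in range(rate)]   # one second of data
    power = [heating_power(t, f_detect) for t in times]
    print("component at f:  ", lock_in_amplitude(power, times, f_detect))
    print("component at f/2:", lock_in_amplitude(power, times, f_detect / 2))
```

The power has a constant part and a part oscillating at the detection frequency with amplitude R·I0²/2. Only the oscillating part drives Tac, and nothing appears at the drive frequency itself.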
Fig. 4. (a) Comparison of the inverse of the lock-in signal, 1/Vac ∝ C, of CeRu2Ge2 enclosed in steatite at 0.7 GPa and the specific heat C/T measured with a relaxation method at ambient pressure [13]. The data sets are normalized at 10 K. A frequency test at 4.2 K is depicted in the inset (ω1 ≈ 450 Hz). (b) Temperature dependence of 1/Vac ∝ C of CeRu2Ge2 above 5 GPa. The pronounced feature related to the antiferromagnetic transition is suppressed by pressure. The inset shows the specific heat at 5.0 GPa with an anomaly at low temperature. The feature can still be seen near 3.5 K at 5.5 GPa (main figure). Axes: 1/Vac (arb. units) versus T (K); inset of (a): normalized Tac versus frequency (Hz).
CeRu2Ge2 exhibits two magnetic phase transitions at ambient pressure, leading to large features in the specific heat. Together with the well known influence of pressure on these transitions [13], this makes the compound a good candidate for testing ac-calorimetry at high pressure. Figure 4a shows the result of the ac-measurements at 0.7 GPa in comparison to the specific heat obtained by a relaxation method at ambient pressure. Pressure slightly shifted the transition temperatures, as expected from the (T, p) phase diagram [13]. The height of the specific heat jump at the second order transition (TN ≈ 9 K) represents 47% of the total signal compared to 51% for the ambient pressure curve. This indicates that Tac is dominated by the heat capacity of the sample. Additional support for this statement is given by a frequency test. According to (1) the relation Tac ∝ 1/ω should hold for ω ≫ ω1, which is indeed observed (inset of Fig. 4a). A fit of a low pass filter to the data yields ω1/(2π) = 450 Hz. Frequency tests at various temperatures and pressures are necessary to determine ω1 and to ensure the validity of the relation between Tac and the specific heat of the sample. The height of the first order transition (TC ≈ 7 K) is very sensitive to any distribution of TC and should not be compared to the peak in Cp(T) at ambient pressure. Moreover, ac-calorimetry is not the proper tool to measure a latent heat [19] since it only detects the reversible part at frequency ω on a temperature scale Tac. Nevertheless, the position of a first order transition can be detected by an ac-calorimetric measurement. The ac-calorimetry data of CeRu2Ge2 above 5 GPa shown in Fig. 4b demonstrate the potential of this method. The influence of pressure on the antiferromagnetic transition is visible, and the deduced TN(p) data agree with the (T, p) phase diagram extracted from transport measurements [13]. A critical pressure pc ≈ 7 GPa is necessary to suppress the long-range magnetic order.
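A frequency test of this kind can be evaluated with a simple fit. The sketch below assumes a first-order low pass response, |Tac| = A0/sqrt(1 + (f/f1)²), which has the Tac ∝ 1/ω limit quoted above; the exact form of (1) is not restated here, so this is an assumption. Function names and the synthetic data are illustrative only. Since 1/|Tac|² is linear in f², an ordinary least-squares line yields both A0 and the cut-off frequency f1 = ω1/(2π).

```python
import math


def lowpass_amplitude(f, amplitude0, f_cut):
    """Assumed first-order low pass response: A0 / sqrt(1 + (f/f_cut)^2)."""
    return amplitude0 / math.sqrt(1.0 + (f / f_cut) ** 2)


def fit_cutoff(freqs, amplitudes):
    """Fit 1/A^2 = a + b*f^2 by linear least squares.

    Returns (A0, f_cut) with A0 = 1/sqrt(a) and f_cut = sqrt(a/b).
    """
    xs = [f ** 2 for f in freqs]
    ys = [1.0 / a ** 2 for a in amplitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / math.sqrt(intercept), math.sqrt(intercept / slope)


if __name__ == "__main__":
    # Synthetic, noise-free data generated with a 450 Hz cut-off.
    freqs = [10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0, 10000.0]
    amps = [lowpass_amplitude(f, 1.0, 450.0) for f in freqs]
    a0, f_cut = fit_cutoff(freqs, amps)
    print(f"A0 = {a0:.3f}, cut-off = {f_cut:.1f} Hz")
```

For real, noisy data the unweighted fit of 1/A² is dominated by the high-frequency points, so a weighted or nonlinear fit would be the more careful choice.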
The broadening of the antiferromagnetic transition is very likely related to intrinsic effects, although a small pressure inhomogeneity could be partly responsible for it. In addition to this transition, an anomaly at lower temperature was resolved (inset of Fig. 4b). These measurements were the first to show that this anomaly, seen so far only in transport measurements [13], has a thermodynamic origin and is a bulk property.
Working with a DAC allowed Demuer and coworkers [17] to use a different way to supply the oscillating heat power to the sample. They attached an optical fiber to the DAC and heated the sample with the light of an Ar-ion laser, chopped mechanically at frequencies up to 3 kHz. The temperature oscillations were measured with a Au-Chromel thermocouple, bonded directly to the sample by spark welding, and a lock-in amplifier. In this experiment CeRu2Ge2 was enclosed in solidified He. The cut-off frequency was estimated to be 4 kHz at 0.5 GPa and 7 K. This value supports the order of magnitude estimated for ω1 in the case of pressurized He in Sect. 1. The high thermal conductivity of He limits the application of the ac-method at low pressures. Nevertheless, the magnetic phase transitions could be observed, although only a part of the signal at the fixed measuring frequency of 1.5 kHz was due to the specific heat of the sample. In this investigation an increased width of the transition was also found. Intrinsic effects seem to be responsible for this, given the good hydrostatic pressure conditions in the experiment. In addition, a similar broadening in specific heat experiments at ambient and low pressure has been reported [20,21] when TN is pushed to zero temperature either by doping or pressure.
The analysis of the thermal conductivity data in Sect. 1 suggests that He could be used as a pressure transmitting medium at low temperature even at pressures of a few GPa. This presumption is corroborated by the results of Holmes and coworkers [22]. With a combined measurement of electrical resistivity and ac-calorimetry, the heavy Fermion superconductor CeCu2Si2 was investigated down to 0.1 K for pressures up to 7 GPa. The jump in the ac-signal caused by the entrance into the superconducting state provided a semi-quantitative measure of the sample specific heat. The onset of the specific heat anomaly occurred when the resistive transition was completed, which affirms the bulk nature of the superconducting state.
3
AC-Calorimetry Below 1 K and beyond 10 GPa
A demonstration of the feasibility of the ac-technique below 1 K and at pressures well above 10 GPa is the experiment on CePd2.02Ge1.98 in a Bridgman-type pressure cell [14]. The set-up of the experiment was chosen in such a way that electrical resistivity and ac-calorimetry could be measured on the same crystal. An independent electrical resistivity measurement then makes it possible to check whether an anomaly in Tac is related to the sample or not. Figure 5 shows the arrangement in the pressure chamber. It contains two different samples of the solid solution CePd2+xGe2−x, but only one of them (x = 0.02) was connected for the ac-experiment. The sample was heated with a current supplied through Au-wires attached to the sample. This reduces the number of components in the pressure chamber and avoids a pressure gradient due to heater wires. With this arrangement it is also possible to calibrate the AuFe/Au thermocouple up to very high pressure and over a wide temperature range [14]. It was observed that the absolute thermopower S(T) of AuFe at 4.2 K and 1.0 K at 12 GPa is about 20% smaller than the values at ambient pressure. These rather small changes show that the results are not affected qualitatively if the ambient pressure values of S(T) are used. Thus, the drawback of a missing temperature calibration for the thermocouple mentioned in Sect. 2 could in principle be eliminated. The reliability of the pressure cell can be seen in Fig. 5. After the pressure was released from 22 GPa, the overall shape of the pressure cell as well as its initial diameter were almost unchanged, and the distance between the voltage leads had increased by less than 5%.
Fig. 5. Left: Pressure cell before closing with two samples of CePd2+xGe2−x (x = 0 and 0.02) and the pressure gauge (Pb-foil). The ac-calorimetric circuit is mounted on one sample (x = 0.02). The temperature oscillations are read with an additional AuFe wire when an ac-heating current is applied. Right: The pressure cell after pressure release from 22 GPa. The almost unchanged configuration shows the reliability of the pressure device.
Fig. 6. Temperature dependence of the inverse lock-in voltage Vac of CePd2.02Ge1.98 (curves for p = 6.0, 8.9, 10.0, and 11.7 GPa). The entrance into the antiferromagnetically ordered state is clearly visible. Inset: Frequency test at p = 8.9 GPa for different temperatures (Vac in V versus ν in Hz). The solid lines represent a fit of a low pass filter to the data with cut-off frequencies ω1/(2π) = 350 Hz and 1060 Hz for 4.2 K and 9 K, respectively.
CePd2.02Ge1.98 was chosen because in its stoichiometric form it is the Ge-doped counterpart of the antiferromagnetically ordered heavy Fermion compound CePd2Si2 (TN = 10 K). The latter system enters a superconducting ground state when the magnetic order is suppressed (pc = 2.7 GPa) [23]. Applying pressure to CePd2.02Ge1.98 (TN = 5.16 K [14]) should first increase TN to a maximum and then drive it towards zero temperature. The aim of the ac-calorimetric measurement was to extract the electronic contribution to the specific heat. Figure 6 shows the inverse of the registered lock-in signal
Vac below 10 K at various pressures. The pronounced anomaly in 1/Vac(T) for pressures between 6.0 GPa and 10 GPa is caused by the entrance into the antiferromagnetically ordered phase. The height of the anomaly decreases and it becomes a very broad feature as the system approaches pc = 11.0 GPa [14]. A similar broadening upon approaching the critical pressure was reported for CePd2Si2, despite the lower pressure and the use of He as pressure transmitting medium [24]. Recalling the increased transition width reported in Sect. 2, it is very likely that this is an intrinsic phenomenon. Two frequency tests at p = 8.9 GPa are shown in the inset of Fig. 6. A fit of a low pass filter to the data yields cut-off frequencies of ω1/(2π) = 350 Hz and 1060 Hz for T = 4.2 K and 9 K, respectively. Assuming the validity of ω1 = κ/C, these values and the 1/Vac data at the corresponding temperatures result in κ(4.2 K)/κ(9 K) ≈ 0.2, almost the same ratio as at ambient pressure (see Fig. 2). Hence, pressures up to 9 GPa seem to have a weak effect on κ(T) of steatite below 10 K.
The most important observation in this experiment is the pressure dependence of the value of 1/Vac taken at about 0.3 K. The inverse of the lock-in voltage, 1/Vac, strongly increases, reaches a maximum in the vicinity of pc, and levels off at high pressure (Fig. 7). The critical pressure was inferred from the Ã(p)-anomaly in the temperature dependence of the electrical resistivity $\rho(T) = \rho_0 + \tilde{A}T^n$, with ρ0 the residual resistivity and the fitting parameters Ã and n [14]. Below 1 K, 1/Vac is proportional to C/T, since S(T) ∝ T is a fairly good assumption for the temperature dependence of the absolute thermopower. Above this temperature the S(T) dependence is certainly different and 1/Vac has to be interpreted with caution. Thus, 1/Vac(T) at low temperature can be regarded as a direct measure of the electronic correlations. The pronounced pressure dependence of 1/Vac shows that the electronic correlations are considerably enhanced as pressure approaches pc and that the signal originates mainly from the sample. However, above 15 GPa, the pressure dependence is not strong enough to follow the Ã(p)-dependence according to the empirical Kadowaki-Woods relation [25]. A possible reason for this deviation might be that at these pressures Vac does not entirely represent the heat capacity of the sample. A step towards a quantitative measure of the specific heat under these conditions would be to gain control of the supplied heating power and of the thermal contact between sample and pressure transmitting medium. Nevertheless, the strong pressure dependence of 1/Vac at low temperature is reminiscent of Ã(p) and is a motivation for further studies.
Fig. 7. Left: Pressure dependence of the inverse lock-in voltage Vac of CePd2.02Ge1.98 obtained at the lowest temperatures reached in each run. The maximum is attained at a pressure very close to the critical pressure where the magnetic ordering temperature is pushed to zero. Right: The temperature coefficient Ã of the electrical resistivity. It shows an anomaly at the magnetic/non-magnetic phase transition.
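The comparison of the thermal conductivity ratio at 8.9 GPa with the ambient pressure behaviour of steatite can be checked with a few lines. The sketch assumes ω1 = κ/C and the ambient pressure power law κ(T) ∝ T^2.3 for steatite given in Sect. 1. The 1/Vac values entering the estimate are not tabulated in the text, so the heat capacity ratio is only back-calculated here from the quoted result κ(4.2 K)/κ(9 K) ≈ 0.2. All function names are illustrative.

```python
def kappa_ratio_from_cutoffs(f1_low, f1_high, c_ratio):
    """kappa(T_low)/kappa(T_high), assuming omega_1 = kappa/C.

    c_ratio is C(T_low)/C(T_high), e.g. taken from the 1/Vac data.
    """
    return (f1_low / f1_high) * c_ratio


def steatite_power_law_ratio(t_low, t_high, exponent=2.3):
    """kappa(T_low)/kappa(T_high) for the ambient pressure law kappa = a*T^2.3."""
    return (t_low / t_high) ** exponent


def implied_heat_capacity_ratio(kappa_ratio, f1_low, f1_high):
    """C(T_low)/C(T_high) implied by a given kappa ratio and cut-off frequencies."""
    return kappa_ratio * f1_high / f1_low


if __name__ == "__main__":
    ambient = steatite_power_law_ratio(4.2, 9.0)
    c_ratio = implied_heat_capacity_ratio(0.2, 350.0, 1060.0)
    print(f"ambient pressure ratio from the power law: {ambient:.2f}")
    print(f"heat capacity ratio implied at 8.9 GPa:    {c_ratio:.2f}")
```

The power law gives a ratio close to the value of about 0.2 deduced at 8.9 GPa, in line with the statement that pressure has only a weak effect on κ(T) of steatite below 10 K.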
4
Conclusions
The ac-calorimetric technique adapted for high pressure experiments at low temperature (T < 10 K) was discussed. The oscillating sample temperature provides the specific heat of the sample if the measuring frequency is above the cut-off frequency ω1 = κ/C, which is determined by the thermal conductivity of the pressure transmitting medium and the specific heat of the sample. A qualitative estimate of κ(T) was made for steatite and solid He, the two preferred pressure media. The cut-off frequency for steatite is less than 1 kHz, whereas several kHz was inferred for solid He (at ≈ 2 GPa and 4.2 K). The order of magnitude of these values was confirmed experimentally for pressures up to 7 GPa and temperatures in the range 1.5 K < T < 10 K. The large values in the case of He set a temperature limit for the use of a DAC, whereas a Bridgman-type high pressure cell can be used below 10 K and at pressures well above 10 GPa. Due to the strong pressure dependence of κ(T) of He, the maximum in κ(T) shifts towards higher temperature with increasing pressure. This might open the low temperature region for the ac-calorimetric method in a DAC as well. These promising results raise the hope of a quantitative understanding of ac-calorimetry, and interesting specific heat data under extreme conditions can be expected.
Acknowledgements
The work presented here is the result of a collaboration with F. Bouquet, A. Demuer, A. Holmes, D. Jaccard, A. Junod, and Y. Wang. I am grateful to them for many fruitful and stimulating discussions. The assistance of
A. Bentjen in the thermal conductivity measurements at the MPI CPfS is acknowledged.
References
1. P. F. Sullivan and G. Seidel, Phys. Rev. 173, 679 (1968).
2. A. Bonilla and C. W. Garland, J. Phys. Chem. Solids 35, 871 (1974).
3. J. D. Baloga and C. W. Garland, Rev. Sci. Instrum. 48, 105 (1977).
4. A. Eichler and W. Gey, Rev. Sci. Instrum. 50, 1445 (1979).
5. A. Eichler, H. Bohn, and W. Gey, Z. Phys. B 38, 21 (1980).
6. Landolt-Börnstein Vol. III/17a: Physics of Group IV Elements and III-V Compounds, ed.: O. Madelung, Springer, Berlin, p. 357 (1982).
7. F. J. Webb, K. R. Wilkinson, and J. Wilks, Proc. Roy. Soc. A 214, 546 (1952).
8. W. D. Seward, D. Lazarus, and S. C. Fain, Jr., Phys. Rev. 178, 345 (1969).
9. A. Amato, Ph.D. Thesis, University of Geneva, 1988.
10. According to [8] purified (unpurified) He refers to a chemical impurity level less than 15 ppm with an (no) additional purification by an adsorption trap cooled to 63 K.
11. J. S. Dugdale and J. P. Franck, Phil. Trans. Roy. Soc. London A 257, 1 (1964).
12. R. L. Mills, D. H. Liebenberg, and J. C. Bronson, Phys. Rev. B 21, 5137 (1980).
13. H. Wilhelm, K. Alami-Yadri, B. Revaz, and D. Jaccard, Phys. Rev. B 59, 3651 (1999).
14. H. Wilhelm and D. Jaccard, Phys. Rev. B 66, 064428 (2002).
15. F. Bouquet, Y. Wang, H. Wilhelm, D. Jaccard, and A. Junod, Solid State Commun. 113, 367 (2000).
16. B. Salce, J. Thomasson, A. Demuer, J.J. Blanchard, J.M. Martinod, L. Devoille, and A. Guillaume, Rev. Sci. Instrum. 71, 2461 (2000).
17. A. Demuer, C. Marcenat, J. Thomasson, R. Calemczuk, B. Salce, P. Lejay, D. Braithwaite, and J. Flouquet, J. Low Temp. Phys. 120, 245 (2000).
18. D. Jaccard, E. Vargoz, K. Alami-Yadri, and H. Wilhelm, Rev. High Pressure Sci. Technol. 7, 412 (1998).
19. X. Wen, C. W. Garland, R. Shashidhar, P. Barois, Phys. Rev. B 45, 5131 (1992).
20. R. A. Fisher, C. Marcenat, N. E. Phillips, P. Haen, F. Lapierre, P. Lejay, J. Flouquet, and J. Voiron, J. Low Temp. Phys. 84, 49 (1991).
21. B. Bogenberger and H. v. Löhneysen, Phys. Rev. Lett. 74, 1016 (1995).
22. A. T. Holmes, A. Demuer, and D. Jaccard, Acta Phys. Pol. B 34, 567 (2003).
23. F. M. Grosche, S. R. Julian, N. D. Mathur, and G. G. Lonzarich, Physica B 223&224, 50 (1996).
24. A. Demuer, A. T. Holmes, and D. Jaccard, J. Phys.: Condens. Matter 14, L529 (2002).
25. K. Kadowaki and S. B. Woods, Solid State Commun. 58, 507 (1986).
Index
absolute field magnetometer, 731 absorbing state, 659, 660 absorption spectrum, 86, 92 Abstreiter, A., 301 Abstreiter, G., 287 AC-Calorimetry, 889 active states, 660 Adelung, Rainer, 463 adsorbate, 833 AFM – lithography, 113, 139 Aharonov-Bohm, 125 – oscillation, 113 Aharonov-Bohm effect, 139 Aharonov-Bohm Oscillation – fractional, 113 Al-bicrystal, 563 Alff, L., 695 alkali halide, 482 – surface, 477, 479 Amann, M.-C., 301 Ando, Tsuneya, 3 Anisimov, V. I., 267 annihilation and fission reactions, 671 annihilation reaction, 666 anomalous scaling, 21 aspect ratio, 158, 164 atom – manipulation, 51 atomic structure, 477, 547 Au – wire, 181 Bäuerle, C., 181 Bammerlin, M., 477 bandpass filter, 875 Bauer, G. E. W., 383 beam splitter, 875
Becker, T., 677 Beere, H. E., 327 Beham, E., 287 Beltram, F., 327 Benckiser, E., 95 Benner, H., 589 Bennewitz, R., 477 Bethe-Salpeter equation, 313, 319 Bichler, M., 113, 139, 287 bifurcation, 589 bimagnon-plus-phonon absorption, 95 biochip, 397, 406, 408 – sensor, 406 biological – system, 589 biosensor, 406 blocking barrier, 351 Blossey, R., 125 Boltzmann – equation, 223 Borck, S., 113 Bornemeier, J., 397 Bose metal, 207 Brückl, H., 397 branching, 659 branching and annihilating random walks, 668 Brandes, T., 63 Brataas, A., 383 breath figure, 161 Brodschelm, A., 341 Bronner, W., 351 BSCCO, 253 Buback, M., 505 Bussi, G., 313 c-BN, 519 – polycrystalline, 519
CaCu2 O3 , 95 Cain, P., 237 calcium puff, 605 Caldas, M. J., 313 Calorimetry, 889 cantilever technique, 547 Carbon nanotube 171 – Aharonov-Bohm effect, 3 – band structure, 3, 5 – conductivity, 13 – effective-mass approximation, 9 – metallic, 3, 7 – optical absorption, 3, 4 – semiconducting, 3, 7 – tight-binding model, 3 – transport properties, 3 Ce, 267 CePd2.02 Ge1.98 , 889 CeRu2 Ge2 , 889 Chakraborty, T., 79 Chalker-Coddington model, 237 chaos, 589 – control, 589 charge transport, 833 chemical reaction, 659, 660 circuit theory, 383, 386, 390, 391, 393 cleaved-edge overgrowth technique, 443 Co, 781 CO molecule, 51 coating – antireflection, 875 – composite, 875 – optical, 875 coherence, 647 – resonance, 647 coherent interaction, 295 coherent states, 662 collective effect, 207 complexity, 579 computer simulation, 633 conductance – distribution, 237 – fluctuation, 193 conduction band, 35 conductivity, 193, 253, 833 – optical, 833 confinement, 82 conformational transition, 677
conservation law, 669 continuous unitary transformations, 95 continuous-wave operation, 351 Cord, B., 849 correlated – electron system, 267 correlated electron liquid, 207 Coulomb Blockade, 113, 139 Coulomb interaction, 90, 341 coupled oscillators, 647 critical – current density, 703, 707, 711 – exponent, 193 critical exponent, 237, 663 critical field, 721 critical fluctuation, 659, 661 CrO2, 487, 490 – epitaxial film, 487 Cu, 51, 833, 835 CuO2, 253 cuprate, 253 cuprates, 95 CVD, 171 da Silva, R. R., 207 Davies, G. A., 327 decoherence, 139, 223, 747, 763 – time, 223 Dedkov, Y., 487 deformation, 617 delay system, 589 density functional theory, 313, 781 density-matrix renormalization group, 95 device simulation, 369 dewetting, 125 DFT, 781 Dicke model, 72 dielectric, 861 Dietl, T., 413 diffusion, 463, 466, 471 – barrier, 475 – equation, 384, 387 – length, 466, 473 – limited aggregation, 467, 473 diffusion-limited reaction, 659 diffusion-limited regime, 667 digital bit stream, 851 diode
– light emitting, 35 dipole – approximation, 86 directed percolation, 659, 663 disk sensitivity, 854 dispersion relation, 789 dissipation, 19, 22, 26, 28 – bound, 29 – rate, 32 DMFT, 267 domain wall, 660 dots – ZnO, CdO, CuO, 160 double-phonon relaxation, 351 Drossel, B., 579 DVD, 849 dynamic force microscopy, 477, 481 dynamic percolation, 666 dynamical exponent, 193 Ebinger, H., 849 Eckert, J., 703 Ederer, C., 781 effective field theory, 659, 663 Eigler, D. M., 51 elasticity, 617 electrode – effect, 861 electron – coherence, 181 – interaction, 113 – transport, 449 electron doped, 695 electron theory, 781 electron-phonon interaction, 63 Elsäßer, W., 351 embedded atom, 617 energy spectrum, 79, 89, 92 energy-level statistic, 237 Ensslin, K., 139 epitaxial film, 487 Ernst, Frank, 463 Esquinazi, P., 207 etch yield, 519 evolution, 579, 583 exchange coupling, 529 excitable system, 648 exciton, 319, 322, 324 – tunneling, 297
Eyert, V., 267 f-electron material, 267 Fähnle, M., 781 Fahsold, G., 833 Falci, L., 747 Fano resonance, 70 Faoro, L., 747 Faraday instability, 789 Faupel, J., 505 Fe, 547, 781, 835 – epitaxial films, 487 Fe3O4, 487 – half-metallic, 487 feedback – control, 589 – method, 589 femtosecond timescale, 341 Ferrer, S., 547 Ferretti, A., 313 ferroelectric, 861 ferrofluid, 789 ferromagnet, 443 ferromagnet/InAs hybrid, 449 ferromagnetic – resonance, 589 few-electron quantum ring, 113 field compensation, 721 field theory, 661 Findeis, F., 287 finite-size effect, 647 finite-size scaling, 237 Fischer, C., 703 Fischer, H., 351 Fischer, S., 677 FitzHugh–Nagumo system, 647, 648 fixed point behavior, 237 Floquet – exponent, 589 fluorine, 519 Fock states, 661 Fonin, M., 487 food webs, 579, 584 force microscopy, 477, 479 Fuchs, F., 351 Fuchs, G., 703 Fühner, C., 113 Fuhrer, A., 139 Fuhse, C., 505
functional response, 582, 586 GaAlAs, 35, 139 GaAlInP, 35 GaAs, 35, 125, 341, 427 GaAs/AlGaAs, 113, 327 – superlattice, 327 GaInAs, 301 GaInN, 35 Ganichev, S., 427 Garcia, J. M., 125 Gensty, T., 351 geomagnetic field, 815 GeSbTe, 850 glass transition, 633, 677 global coupling, 652 Gnecco, E., 477 gold – disks, 164, 168 – dots, 157 – nanoporous film, 159 Grüninger, M., 95 grain boundary, 563 – high angle, 563 – low angle, 563 – motion, 563 graphite, 3, 207 Green function – nonequilibrium, 369 Gross, M., 519 Gross, R., 695 Grossmann, Siegfried, 19 growth, 125, 833 Grundler, D., 443 Güntherodt, G., 487 Gupta, J. A., 51 h-BN, 519 Häßler, W., 703 Haitz, R., 35 Hall – conductivity, 193 Hall, E. A., 193 Harmans, C.J.P.M., 763 Haug, R. J., 113 Haug, R. J., 193 Heinrich, A. J., 51 Heisenberg, 21 Held, K., 267
heterostructure, 369 high pressure, 889 high temperature superconductor, 695 high-reflectivity facet coating, 351 Hohls, F., 193 Holzapfel, B., 703 hopping – variable range, 193 Huber, R., 341 Huertas-Hernando, D., 383 Hund’s rule, 267 hybrid – structure, 443 hybrid nanostructure, 449 Hydrogen, 171, 519 Ihn, T., 139 impurity potential, 80, 89 InAs, 125, 427, 443, 449 infrared, 427, 833 InGaAs, 125 InP, 301 interaction, 81, 85, 88, 91 – dipolar, 861 interface – metallic, 392 – resistance, 390, 394 intermittency, 19, 23 intersubband, 301, 327 ion channel, 605 – clustering, 605 ionizing radiation, 480 irreversibility field, 703, 712, 715 isotope effect, 51 isotropic percolation, 666 jitter, 853 Josephson junction, 731 – array, 731 Josephson system, 747 Jung, P., 605 Just, W., 589 Kapeller, M., 849 KBr, 477, 480, 483 Keller, G., 267 Kempa, H., 207 Kerr effect – magneto-optical, 801
Kerr-microscopy, 801 Keyser, U. F., 113 Kiefer, R., 351 Kijewski, H., 505 Kipp, Lutz, 463 Kirschner, J., 547 Kleine, H., 849 Kliem, H., 861 Köhler, K., 351 Köhler, R., 327 Kolmogorov, 21 Komelj, M., 781 Kondo, 223 – effect, 113, 181, 223 – impurity, 223 – temperature, 181 König, C., 487 Kopelevich, Y., 207 Kopp, T., 95 Kormann, R., 351 Kröger, M., 617 Krebs, H.-U., 505 Krockenberger, Y., 695 Kroha, J., 223 Kunz, Rainer, 463 Lange, A., 789 Lange, L., 721 Langevin equation, 661 laser, 301, 327 – injection, 327 – quantum cascade, 327 layered crystal, 467, 469, 475 LDA, 267 LED, 35 Lee, S.-C., 369 LEED, 490 Leitenstorfer, A., 341 Lévy flights, 667, 670 light emitting diode, 35 Linfield, E. H., 327 Liouville operator, 661 liquid bridge, 789 local field, 861 localization, 193 logarithmic corrections, 663 Lorentz field, 861 Lorke, A., 125 Lunk, A., 519
Lutz, C. P., 51 Luyken, R. J., 125 magnetic – anisotropy, 547, 781 – compass of birds, 815 – dipole, 781 – dot, 721 – liquid, 789 – marker, 409 – moment, 80, 84, 181, 781 – multilayer, 386 – orientation, 815 – X-ray circular dichroism, 781 – X-ray scattering, 529 magnetic tunnel junction, 397 magnetisation reversal, 801 magnetism – one-dimensional, 781 magnetite, 487 magnetization, 815 – manipulation by electric field , 419 – reversal, 383 magneto reception, 815 magneto-elastic coupling, 547 magnetoelectronics, 383, 397 magnetometer, 731 magnetoresistance, 181, 397, 398, 402, 405, 443 – giant, 383, 408 magnetostriction, 547 magnetotransport, 139 Mann, Ch., 351 Manske, D., 695 Martin, F., 849 master equation, 659, 661 Matsuyama, T., 443 Mayer, J., 487 McMahan, A. K., 267 mean-field approximation, 661 mechanical alloying, 703, 708, 714 mechanical stress field, 563 Meier, G., 449 membrane – nanoporous, 158 metal, 617 – sponge, 617 metal-insulator transition, 207, 253 metallic
– system, 181 – thin film, 833 Meyer, E., 477 Meyerheim, H., 547 MgB2, 703, 708, 712, 717 micelles, 156 microcooler, 731 mid-infrared, 301, 351 mirror – neutral, 875 Mn, 415 MOKE, 801 molecule cascade, 51 Molinari, E., 313 Möller, C. H., 443 Molodtsov, S., 487 Monte-Carlo computation, 861 Mooij, J. E., 763 Morelle, M., 721 Moshchalkov, V. V., 721 Müller, K.-H., 703 Mott-Hubbard transition, 267 multi-critical point, 665 multifractality, 19, 24, 32 multiplicative noise, 664 NaCl, 477, 479 Naito, M., 695 nano-engineering, 721 nano-holes, 159 nanocrystalline material, 703, 713 nanoparticle, 171 nanoporous membranes, 166 nanostructure, 801, 833 nanowire, 463, 781 – growth, 466, 470, 472, 475 – network, 464, 470, 471 Navier-Stokes equations, 22, 23 Nekrasov, I. A., 267 Nelke, D., 505 Nenkov, K., 703 net modal gain, 351 network – model, 237 – operator, 237 neuron, 605 neutron – reflectivity, polarized, 801 – scattering, 801
Noe, F., 677 noise, 351, 605, 647, 731 non-equilibrium – distribution function, 223 – transport, 223 non-equilibrium phase transitions, 659 non-equilibrium steady states, 659 nuclear resonance, 529 Nunner, T. S., 95 Oboukhov, 21 octaeder vacancy in fcc, 171 Onsager, 21 Onsager-Machlup functional, 664 Oppenländer, J., 731 optical gain, 369 optical properties, 287, 420 optical spectroscopy, 95 optical transmission, 165 orbital-correlation effect, 781 organic crystal, 313 oscillator – noisy, 647 p-n junction, 35 Paaske, J., 223 pair contact process with diffusion, 671 Paladino, E., 747 Palladium, 171 Panchenko, E., 505 Paul, W., 633 Pb, 721 Pd, 171 PdH, 171 percolation, 789 Pereira Jr., M. F., 369 period – doubling, 589 Perner, O., 703 persistent current, 79, 81, 89 Petroff, P. M., 125 Pfeiffer, O., 477 phase change recording, 849, 858 phase coherence, 139 phase coherence time, 181 phase transition, 652 – dynamic, 659 – non-equilibrium, 659 phonon cavity, 63, 68
photocurrent, 287, 299 photodiode, 287 photoemission spectroscopy – angle-resolved ultraviolet, 487 – spin-resolved, 487 – synchrotron radiation, 487 photogalvanic effect – circular, 427 photonic crystal, 164 photonic filter, 165 photosynthetic reaction, 677 Pikovsky, A., 647 pillars – AlGaAs/GaAs/AlGaAs, 157 – GaAs, 157, 158 – GaAs/InGaAs/GaAs, 157 – Si, 164, 168 – ZnO, 161 pinch-off, 789 pinning, 703, 708, 711, 713 PIT method, 703 plasma enhanced chemical vapor deposition, 519 plasma-CVD, 171 plasma-enhanced chemical vapor deposition, 171 plasmon, 341 plateau transition, 193 polarization, 861 polyme, 633 polymer, 313, 317, 322, 324 – melt, 633 poor man’s scaling, 223 population dynamics, 579, 581, 584, 586, 660 porous metal, 617 PPV, 313 protein dynamic, 677 pseudogap, 695 Pt, 781 Pucci, A., 833 pulsed laser deposition, 505 Pyragas Scheme, 589 quantum – chaos, 63, 72, 73 – coherence, 181 – cascade laser, 301, 351, 369 – dot, 63, 67, 139, 158, 287
– – PL, 158 – – SiGe, 165 – – ZnO, CdO, CuO, 160 – fluctuation, 193 – Hall effect, 193, 207, 237 – kinetic, 341 – phase transition, 63, 72 – pump, 66 – ring, 83, 91, 113, 125, 139 – – parabolic, 79, 81, 88 – spin systems, 95 – transport, 369 – tunneling, 51 – wire, 223 – well, 427 qubit, 763 Rabaud, W., 181 Rabi oscillation, 295, 296, 298 radiation damage, 477 Raedts, S., 721 Raman – scattering, 253 – spectrum, 253 random matrix theory, 384 random walk, 659 rate equations, 660 Rayleigh number, 25 Rayleigh-Bénard, 25, 28 reaction fronts, 667 real-space renormalization, 237 receptor clustering, 605 reciprocal space, 529 recombination, 35 Reggeon field theory, 663 Reimann, B., 789 Reiss, G., 397 relative intensity noise, 351 renormalization group, 223, 659, 660 response functional, 663 resistivity, 833 – non-contact measurement, 833 resonance, 647 – coherence, 605 – stochastic, 605 Reynolds number, 19, 25, 30 Richardson, 21 Richter, R., 789 rings
– ZnO, CdO, 167 Ritchie, D. A., 327 Röhlsberger, R., 529 Römer, R. A., 237 Rosch, A., 223 Rosensweig instability, 789 Rothert, A., 789 Rüdiger, U., 487 Ruini, A., 313 Rupp, P., 789 Saminadayar, L., 181 Sander, D., 547 scale invariance, 659 scaling, 193 scanning tunneling microscopy, 51 Scarpa, N., 301 Schär, S., 477 Scharf, T., 505 Schmidt, K. P., 95 Schneider, H., 351 Schöll, E., 589 Schopfer, F., 181 Schotter, J., 397 Schultz, L., 703 screening, 341 Seibt, M., 505 self-organization, 125 semiconductor, 327, 443 – ferromagnetic, 413, 422 – III-V, 414 shear, 617 short-pass filter, 165 Shuai, J. W., 605 SiC, 35 Sigrist, M., 139 simulation, 861 Skibowski, Michael, 463 Smith, J. C., 677 sodium channel, 605 solid friction, 617 Spahn Torres, J. H., 207 spatio-temporal intermittency, 789 species segregation, 667 specific heat, 889 spectroscopy, 125 – IR-, 833 spin, 80, 81, 83, 88, 89, 91, 93 – accumulation, 397
– chain, 95 – dependent tunneling, 397 – field-effect transistor, 443 – flip scattering, 427 – galvanic effect, 427 – glass, 181 – injection, 419, 443 – ladder, 95 – orientation, 427 – polarization, 400, 402, 416, 487 – quantum system, 95 – relaxation, 223, 427 – splitting, 139, 427 – structure, 529 – torque, 384, 393 – transport, 443 – valve, 383, 386, 390, 393 – valve effect, 443 spin-boson model, 64, 74 spin-orbit interaction, 449 spintronics, 383 sputter yield, 519 SQUID, 731, 763 Störmer, M., 505 stability, 579, 584, 587 – analysis, 589 steatite, 889 Stenzel, O., 875 STM, 487 stochastic dynamic calculation, 861 stochastic resonance, 647 Stoner model, 443 Storcz, M. J., 763 Stranski-Krastanov, 125 stress, 519, 547 – measurements, 547 – surface, 547 stripe array, 801 structure, 617 – function, 19, 24 Stufler, S., 287 Sturm, K., 505 superconducting quantum interference, 731 – filter, 731 superconductivity, 703 – field-induced, 721 surface growth, 463
surface plasmon, 327 surface X-ray diffraction, 547 Süske, E., 505 synchrotron radiation, 529 system size resonance, 647 Tanemura, M., 171 tape, 703 Täuber, U. C., 659 Tauser, F., 341 Taylor-Couette flow, 589 TEM, 487 terahertz, 327 Theis-Bröhl, K., 801 thermal conductivity, 889 thermal convection, 19, 28 thermally-activated hopping, 51 thin film, 477, 529, 833 – analysis, 505 – preparation, 505 Thomas, A., 397 Thonke, K., 155 THz spectroscopy, 341 Tournier, A. L., 677 trace-gas sensing, 351 transition angle, 563 transition metal oxide, 267 transport, 181, 207, 253, 313, 316, 320, 323, 369, 397, 420, 449 – incoherent, 207 – spectroscopy, 113 – properties, 3 – spin-polarized, 449 Tredicucci, A., 327 triplon, 95 Tserkovnyak, Y., 383 tunnel junction, 397 – magnetic, 397 tunneling, 288, 291, 293, 297, 398, 404 – probability, 299 – time, 294, 295, 298 turbulence, 19, 22, 28 – onset of, 31 turbulent diffusivity, 21 two-dimensional electron system, 443 two-dimensional system, 207 two-level system, 287, 289, 291, 292, 295 two-triplet bound state, 95
Uhrig, G. S., 95 Ulbrich, N., 301 Ullmann, G. M., 677 ultrafast phenomena, 341 uniaxial compression, 617 unitary transformation, 95 universality class, 659 upper critical dimension, 661 upper critical field, 703, 712 valence band, 35 van der Wal, C. H., 763 Van Bael, M. J., 721 Vegards Rule, 171 Venturini, F., 253 viscoplasticity, 617 V2O3, 267 Vollhardt, D., 267 von Weizsaecker, 21 Vyalikh, D., 487 W, 547 Wölfle, P., 223 Wacker, A., 369 weak localization, 449 Wegscheider, W., 113, 139 Weisheit, M., 505 Welker, 35 Welter, B., 695 Wigner function, 69 Wilhelm, F. K., 763 Wilhelm, H., 889 Wiltschko, W., 815 Windt, M., 95 Winning, M., 563 Wulff, H., 519 Wunderlich, W., 171 x-ray diffraction, 487 X-ray waveguides, 529 Yang, Q. K., 351 YBCO, 253 Zeeman effect, 139 Zener model, 415 Zeng, S., 605 ZnO
– dots, 160 – pillars, 161
– rings, 167 Zrenner, A., 287