ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 110
PETER W. HAWKES CEMES/Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique Toulouse, France
ASSOCIATE EDITORS
BENJAMIN KAZAN Xerox Corporation Palo Alto Research Center Palo Alto, California
TOM MULVEY Department of Electronic Engineering and Applied Physics Aston University Birmingham, United Kingdom
Advances in Imaging and Electron Physics
EDITED BY PETER W. HAWKES CEMES/Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique Toulouse, France
VOLUME 110
ACADEMIC PRESS A Harcourt Science and Technology Company
San Diego San Francisco New York Boston London Sydney Tokyo
This book is printed on acid-free paper.
Copyright © 1999 by Academic Press
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per-copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1998 chapters are as shown on the title pages; if no fee code appears on the title page, the copy fee is the same as for current chapters. 1076-5670/99 $30.00
ACADEMIC PRESS A Harcourt Science and Technology Company 525 B St., Suite 1900, San Diego, California 92101-4495, USA http://www.apnet.com
Academic Press 24-28 Oval Road, London NW1 7DX, UK http://www.hbuk.co.uk/ap/
International Standard Serial Number: 1076-5670
International Standard Book Number: 0-12-014752-1
Typeset by Laser Words, Madras, India
Printed in the United States of America
99 00 01 02 03 BB 9 8 7 6 5 4 3 2 1
CONTENTS
CONTRIBUTORS
PREFACE
FORTHCOMING CONTRIBUTIONS
Interference Scanning Optical Probe Microscopy: Principles and Applications
W. S. BACSA
I. Introduction: Wave Optical Properties near Surfaces
II. Outline
III. The Microscopic Perspective of Light-Matter Interaction: Wave Scattering
IV. Formation of Standing Waves: Propagative and Nonpropagative Waves
V. Imaging of Standing Waves: Shadow Formation and Detection through Probe Edge
VI. Quantum Limit in Near-Field Imaging: Short versus Intermediate Distance Observation
VII. Near-Field Holography: Amplitude and Phase Information
VIII. Experimental Results
IX. Concluding Remarks
High-Speed Electron Microscopy
O. BOSTANJOGLO
I. Introduction
II. High-Speed Techniques
III. Time-Resolving Microscopes
IV. Concluding Remarks
Acknowledgments
References
Soft Mathematical Morphology: Extensions, Algorithms, and Implementations
A. GASTERATOS AND I. ANDREADIS
I. Introduction
II. Standard Mathematical Morphology
III. Soft Mathematical Morphology
IV. Soft Morphological Structuring Element Decomposition
V. Fuzzy Soft Mathematical Morphology
VI. Implementations
VII. Concluding Remarks
References
Difference in the Aharonov-Bohm Effect on Scattering States and Bound States
SEIJI SAKODA AND MINORU OMOTE
I. Introduction
II. AB Effect on Scattering States
III. AB Effect on Bound States
IV. AB Effect on a System of Both Bound States and Scattering States
V. Gauge Invariance and Scattering Theory
VI. Concluding Remarks
Acknowledgments
INDEX
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the author’s contribution begins.
I. ANDREADIS (63), Laboratory of Electronics, Section of Electronics and Information Systems Technology, Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece, GR-67100.
W. S. BACSA (1), Laboratoire de Physique des Solides, ESA 5477, Université Paul Sabatier, 31062 Toulouse Cedex 4, France.
O. BOSTANJOGLO (21), Optisches Institut der Technischen Universität Berlin, Sekr. P 1-1, Strasse des 17. Juni 135, Berlin, Germany, D-10623.
A. GASTERATOS (63), Laboratory for Integrated Advanced Robotics, Department of Communication, Computer and System Sciences, University of Genoa, Via Opera Pia 13, Genoa, Italy, I-16145.
MINORU OMOTE (101), Department of Physics, Hiyoshi Campus, Keio University, Hiyoshi, Yokohama, Japan, 223-8521.
SEIJI SAKODA (101), Department of Mathematics and Physics, National Defense Academy, Yokosuka, Japan, 239-8686.
PREFACE
In this new volume of these Advances, we have two contributions for the microscopists, one on mathematical morphology and a concluding chapter on the Aharonov-Bohm effect in solid-state physics.
We begin with a very new type of optical microscopy, interference scanning optical probe microscopy, to which the author, W.S. Bacsa, has largely contributed. In this concise account of the technique, he explains why such a mode of image formation is of interest and then discusses the physics of the light-matter interaction and of the generation of the image with great care. We can expect interesting developments in the next few years and I hope that a further account will appear one day in these pages.
Regular readers of these volumes will recall that O. Bostanjoglo, author of the second chapter, has already discussed the microscopy of fast phenomena in these Advances (vol. ***). Technical progress has been remarkable in the intervening years and I am very pleased to be able to include this account of the newest developments. The chapter opens with a description of the three families of high-speed imaging techniques: short-time-exposure imaging, streak imaging and image intensity tracking. Three forms of fast electron microscopy are then presented: transmission electron microscopy (TEM), photo-electron microscopy and reflection electron microscopy (REM). Each requires very specialized instrumentation, which is described fully, and each is illustrated with micrographs. The progress in these difficult imaging methods recorded here is truly impressive.
In the third chapter, A. Gasteratos and I. Andreadis introduce us to a recent extension of mathematical morphology called "soft" morphology. This is not quite the same as the fuzzy morphology introduced by E.R. Dougherty and D. Sinha. In soft morphology, the hard-edged operations of max and min are replaced by weighted order statistics. The authors explain these ideas with examples and then consider practical implementations. They also consider a hybrid approach in which ideas from both fuzzy morphology and soft morphology are present.
Finally, we have a discussion by S. Sakoda and M. Omote on a particular aspect of the Aharonov-Bohm effect. There had been some disagreement between the findings of these authors and the traditional results and, in order to resolve this, the authors considered the possibility that there is a difference between the A-B effects for bound states and for scattering states. A system in which both kinds of states coexist is examined in order to shed light on this. It will be remembered that in the early days of work on the A-B effect, there were violent arguments in the scientific journals, some authors convinced that the effect did not exist at all. These were silenced by the experiments of A. Tonomura, which demonstrated beyond any further doubt that the effect was real, and it has now become an everyday effect in the nanoworld. Nevertheless, there are still unresolved problems and I am very pleased to include discussion of these in these pages. A further contribution on the subject will appear in volume 112.
I thank all the contributors to this volume most warmly for the time and trouble they have taken over their chapters and list material that is promised for future volumes.
Peter W. Hawkes
FORTHCOMING CONTRIBUTIONS
L. Alvarez Leon and J.-M. Morel (Vol. 111)
    Mathematical models for natural images
D. Antzoulatos
    Use of the hypermatrix
N.D. Black, R. Millar, M. Kunt, F. Ziliani and M. Reid
    Second generation image coding
N. Bonnet
    Artificial intelligence and pattern recognition in microscope image processing
G. Borgefors
    Distance transforms
A. van den Bos and A. Dekker
    Resolution
S. Boussakta and A.G.J. Holt (Vol. 111)
    Number-theoretical transforms and image processing
P.G. Casazza
    Frames
J.A. Dayton
    Microwave tubes in space
E.R. Dougherty and D. Sinha
    Fuzzy morphology
J.M.H. Du Buf
    Gabor filters and texture analysis
R.G. Forbes
    Liquid metal ion sources
E. Forster and F.N. Chukhovsky
    X-ray optics
A. Fox
    The critical-voltage effect
M.J. Fransen (Vol. 111)
    The ZrO/W Schottky emitter
M. Gabbouj
    Stack filtering
W.C. Henneberger (Vol. 112)
    The Aharonov-Bohm effect
M.I. Herrera and L. Bru
    The development of electron microscopy in Spain
K. Ishizuka
    Contrast transfer and crystal images
C. Jeffries
    Conservation laws in electromagnetics
M. Jourlin and J.-C. Pinoli
    Logarithmic image-processing
E. Kasper
    Numerical methods in particle optics
A. Khursheed
    Scanning electron microscope design
G. Kogel
    Positron microscopy
K. Koike
    Spin-polarized SEM
W. Krakow
    Sideband imaging
A. van de Laak-Tijssen, E. Coets and T. Mulvey
    Memoir of J.B. Le Poole
L.J. Latecki
    Well-composed sets
C. Mattiussi
    The finite volume, finite element and finite difference methods
S. Mikoshiba and F.L. Curzon
    Plasma displays
R.L. Morris
    Electronic tools in parapsychology
J.G. Nagy
    Restoration of images with space-variant blur
P.D. Nellist and S.J. Pennycook
    Z-contrast in the STEM and its applications
G. Nemes
    Phase-space treatment of photon beams
M.A. O'Keefe
    Electron image simulation
B. Olstad
    Representation of image operators
C. Passow
    Geometric methods of treating energy transport phenomena
E. Petajan
    HDTV
F.A. Ponce
    Nitride semiconductors for high-brightness blue and green light emission
J.W. Rabalais
    Scattering and recoil imaging and spectrometry
H. Rauch
    The wave-particle dualism
D. Saldin
    Electron holography
G.E. Sarty (Vol. 111)
    Reconstruction from non-Cartesian grids
G. Schmahl
    X-ray microscopy
J.P.F. Sellschop
    Accelerator mass spectroscopy
S. Shiraj
    CRT gun design methods
T. Soma
    Focus-deflection systems and their applications
I. Talmon
    Study of complex fluids by transmission electron microscopy
S. Tari (Vol. 111)
    Shape skeletons and greyscale images
J. Toulouse
    New developments in ferroelectrics
T. Tsutsui and Z. Dechun
    Organic electroluminescence, materials and devices
Y. Uchikawa
    Electron gun optics
D. van Dyck
    Very high resolution electron microscopy
J.S. Villarrubia
    Mathematical morphology and scanned probe microscopy
L. Vincent
    Morphology on graphs
N. White
    Multi-photon microscopy
J.B. Wilburn (Vol. 112)
    Generalized ranked-order filters
C.D. Wright and E.W. Hill
    Magnetic force microscopy
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 110
Interference Scanning Optical Probe Microscopy: Principles and Applications
W. S. BACSA
Laboratoire de Physique des Solides, ESA 5477, Université Paul Sabatier, 31062 Toulouse Cedex 4, France
I. Introduction: Wave Optical Properties Near Surfaces
II. Outline
III. The Microscopic Perspective of Light-Matter Interaction: Wave Scattering
IV. Formation of Standing Waves: Propagative and Nonpropagative Waves
V. Imaging of Standing Waves: Shadow Formation and Detection through Probe Edge
VI. Quantum Limit in Near-Field Imaging: Short versus Intermediate Distance Observation
VII. Near-Field Holography: Amplitude and Phase Information
VIII. Experimental Results
A. Reflection Geometry and Collection Mode
B. Bilayer Substrate
IX. Concluding Remarks
I. INTRODUCTION: WAVE OPTICAL PROPERTIES NEAR SURFACES
Optical imaging techniques for the submicrometer range are important for the surface analysis of thin films, macromolecular systems, and optical information storage. Optical techniques have the advantage that spectroscopic or chemical information can be obtained in an ambient environment. As new experimental probes advance to smaller dimensions, it becomes important to adapt the physical model to the changing underlying physical context. Geometrical optics is a useful approximation for wavelengths that are short compared with the dimension of the probed sample volume. At scales where the wavelength involved is comparable to that dimension, the use of optical rays, rectilinear propagation, and refraction is limited. The lateral resolution of lens-based systems is diffraction limited as a result of this approximation. At scales that are small compared with the wavelength involved, the wave and quantum optical description is more appropriate.
Lensless or near-field optical imaging (Synge, 1928; Ash, 1972; Lewis, 1983; Pohl, 1984) avoids the diffraction limit by the pointwise recording or scanning of a subwavelength-sized optical source or detector in the proximity of the surface. In analogy with scanning electron tunneling microscopy, evanescent waves using internal total reflection have been used in combination with an optical probe (Reddick, 1989; Courjon, 1989, 1990). Optical probes in the form of a pointed optical fiber in collection or illumination mode, and in transmission or reflection geometry, have been investigated in the past (Pohl, 1993). Because artifacts have been observed in collection-reflection mode, illumination mode has been preferred (Cline, 1993). Combining shear force and optical detection allowed the simultaneous acquisition of shear-force and near-field optical images (Betzig, 1992). To circumvent aperture limitations, a sharp tip has been used where the scattered light from the illuminated tip serves as a localized light source (Zenhausern, 1994; Inouye, 1994). Although optical interference in near-field optics has been used to detect optical fields in the far field (Zenhausern, 1994; Pilevar, 1995), we show here that optical interference and diffraction play a major role in the near-field region. The scattering approach to near-field optics provides the opportunity to take advantage of the wave optical properties near surfaces to enhance the sensitivity and lateral resolution. This approach is realized in interference scanning optical probe microscopy by using reflection-collection geometry and a special substrate in the form of a bilayer. The interferometric nature of the recorded images gives a new perspective to optical holography in near-field optics.
II. OUTLINE
In section III we describe the microscopic perspective and its differences with respect to a macroscopic point of view; in section IV we discuss the wave optical phenomena near a surface and the formation of standing waves, and show in section V how they can be observed in reflection-collection geometry and used to improve the lateral resolution in reflection geometry. In section VI we discuss the differences encountered when observing at short or intermediate distances and show the quantum physical aspect in near-field optics. In section VII we introduce the concept of near-field holography. Experimental results using reflection geometry and collection mode and using a bilayer substrate are given in section VIII. We discuss the implications of the microscopic perspective to different experimental conditions used in near-field optics and present our conclusions in section IX.
III. THE MICROSCOPIC PERSPECTIVE OF LIGHT-MATTER INTERACTION: WAVE SCATTERING
Light-matter interaction is described at macroscopic scales by the material's dielectric constant and by applying Maxwell's field equations with appropriate boundary conditions. This leads to wave solutions such as homogeneous propagative waves and evanescent waves that are nonpropagative perpendicular to the interface. Because the dielectric constant is a quantity averaged over a scale comparable to or larger than the wavelength involved, it ignores the discrete character of the submicroscopic scale, so its application range is limited in near-field optics. In a microscopic perspective the propagation of an optical wave through a medium can be described as a cumulative consequence of many individual scattering processes (Feynman, 1985) by the valence and conduction electrons of the medium. This microscopic perspective describes the physical process at submicrometer and subwavelength scale, in contrast to a macroscopic perspective, which gives an effective description. There are several subtle differences between the macroscopic and microscopic perspectives. In the macroscopic description, the interface causes a phase shift for the reflected wave, and the wave velocity depends on the medium. In a microscopic perspective the reflected wave is due to the coherent superposition of all the scattered waves in the medium, which results in a phase shift of the reflected wave, and the interface plays only a minor role. The wave velocity in the microscopic perspective is an apparent wave velocity (Feynman, 1964). The reduced phase velocity of light can be understood quantitatively by using path integrals (de Grooth, 1997). Summing over all alternative paths for photons, including single- and higher order scattering processes, leads to a phase delay and an apparent wave velocity that depends on the medium.
The light can scatter elastically or inelastically depending on the wavelength and the nature of the medium. In the case where the light scatters elastically, the wavelength of the scattered wave is unchanged and the wave is phase-shifted by a certain fixed value depending on the atoms and molecules involved. Consequently, the elastically scattered light is coherent with the incident wave, which leads to interference and the formation of standing waves. At scales larger than half the wavelength, interference fringes are formed due to destructive or constructive interference. We recall that interference is due to the superposition of coherently scattered waves and is not limited to the macroscopic scale, and conclude that wave scattering and interference are inherent to light-matter interaction. In the case where the incident light is inelastically scattered, the light couples strongly (resonance) with the valence and conduction electrons of the medium, resulting in a finite lifetime; the wavelength changes and the phase of the scattered wave is significantly delayed, so the scattered light does not interfere in the same way with the incident beam.
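The statement that many weak, coherent scattering events add up to a phase delay of the transmitted wave can be illustrated with a short numerical sketch. The model below is an assumption made for illustration and is not taken from the chapter: the medium is split into `n_layers` thin layers, each of which adds a small scattered contribution in quadrature with the wave passing through it. The function name, the sign convention for the delay, and the numerical values are hypothetical.

```python
import cmath


def transmitted_phasor(total_delay, n_layers):
    """Phasor of a unit wave after n_layers thin scattering layers.

    Assumption (illustrative only): each layer adds a weak scattered wave
    in quadrature with the wave passing through it, of relative size
    total_delay / n_layers.
    """
    per_layer = 1 - 1j * total_delay / n_layers
    return per_layer ** n_layers


# Hypothetical example: a total delay of 0.5 rad built up over 1000 layers.
phasor = transmitted_phasor(0.5, 1000)
net_phase = cmath.phase(phasor)  # accumulated phase in radians (negative = delay)
amplitude = abs(phasor)          # stays close to 1 for weak scattering
print(net_phase, amplitude)
```

In this sketch the many small quadrature contributions behave like a single phase factor, which is the sense in which the reduced wave velocity can be called an apparent velocity.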
IV. FORMATION OF STANDING WAVES: PROPAGATIVE AND NONPROPAGATIVE WAVES
When a monochromatic light beam falls on a surface, a region is formed near the surface where the incident beam overlaps with the reflected beam. Figure 1a shows how an incident wave front is reflected on the surface. The incident and reflected wave fronts overlap at a fixed distance from the surface at variable time intervals (Fig. 1b): a standing wave is formed oriented parallel to the surface (Wiener, 1890). As the successive wave fronts are reflected, the zone of the overlapping wave fronts moves from left to right. The standing wave has maxima and minima parallel to the surface and is nonpropagative perpendicular to the surface but propagative parallel to the surface. Figure 2 shows that the resulting momentum vector of the incident and reflected waves cancels out perpendicular to the surface and adds parallel to the surface. The amplitude of the reflected waves depends on the amount of light reflected, which in turn depends on the scattering efficiency or atomic polarizability of the material. So far we have assumed a perfect interface. Surface roughness modifies the interference of incident and reflected waves and gives rise to diffuse reflection. The interfering scattered waves at variable heights lead to varying path differences and have the effect of reducing the formation of a standing wave when the roughness is comparable to half the wavelength involved. With increasing distance from the surface, wave fronts with larger path differences interfere, and the amplitude of the standing wave becomes increasingly sensitive to the coherence length of the illuminating beam or to substrate roughness (Fig. 1c). The reflected wave is phase-shifted due to interaction with the substrate, and destructive interference has the effect of reducing the local field amplitude at the surface. This phase shift is particularly large for opaque substrates, where light interacts strongly with the valence and conduction electrons of the material. As a result an adsorbate placed on an opaque surface is in an unfavorable condition for optical observation. Using a transparent surface layer with a low refractive index and a thickness of $\lambda/4$ places the surface at an interference maximum of the standing wave. The local field intensity at the surface can be enhanced this way to values close to 4 times the vacuum intensity and typically 10-30 times larger than without the transparent surface layer (Bacsa, 1992). A silicon wafer with a thermally grown SiO$_2$ layer is an excellent substrate for taking advantage of standing waves near surfaces.
FIGURE 1. Formation of a standing wave near a substrate surface: (a) a single incident wave front (2) reflects (3) on the substrate (1); (b) two wave fronts overlap at a fixed distance from the surface (5) at variable time intervals and propagate parallel to the surface; (c) two wave fronts with a larger phase difference (6) overlap at a correspondingly larger distance (7) from the surface.
To ensure that there is a maximum at the surface of the transparent layer, the thickness of the layer is tuned to the wavelength of the incident light. This makes the transparent surface layer a microcavity with two interfaces of highly different reflectivities. The adsorbate changes the local reflectivity, which can have a strong influence on the standing wave, leading to a local phase shift and amplitude change (Bacsa, 1997). The standing waves near a bilayer substrate are thus highly sensitive to adsorbates, which enhances the contrast. The oscillating amplitude of the standing wave with increasing distance from the surface implies that the image contrast depends on topography and is inverted at regular intervals from the surface. Image contrast changes at variable heights have also been observed using illumination mode (Hecht, 1997).
FIGURE 2. Schematic of reflection geometry: k parallel add, and k perpendicular cancel on a perfectly reflecting surface.
Transmission geometry in near-field optics has been used to increase the local field on the sample surface. The transparent sample holder reflects little light (about 4%), so the destructive interference at the surface is small compared with that at an opaque surface. The use of the bilayer substrate as described previously has the advantage of enhancing the field intensity at the surface up to 4 times the vacuum intensity. To summarize, the formation of standing waves near a surface is a consequence of the superposition of the incident beam with the coherently scattered or reflected wave. The interference condition near a surface can be optimized by the use of a special substrate consisting of one highly reflecting and one transparent layer. Field enhancement on such a surface leads to greater optical interaction, resulting in enhanced contrast and sensitivity. The image contrast depends on topography and the height of the probe over the surface.
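The dependence of the local intensity on the height above the reflecting surface can be made concrete with a minimal scalar sketch. It assumes an incident plane wave of unit amplitude and a reflected plane wave of relative amplitude `r_amp` with reflection phase `phase`; both parameters, the function name, and the example wavelength are hypothetical, and the refractive index of the transparent layer is ignored, so `wavelength` stands for the wavelength in the medium above the reflector.

```python
import math


def standing_wave_intensity(z, wavelength, angle_deg=0.0, r_amp=1.0, phase=math.pi):
    """Relative intensity |E_inc + E_ref|^2 at height z above the reflector.

    Assumptions (illustrative only): scalar plane waves, unit incident
    amplitude, reflected amplitude r_amp, reflection phase `phase`.
    Only the wavevector component perpendicular to the surface enters.
    """
    k_perp = 2 * math.pi / wavelength * math.cos(math.radians(angle_deg))
    return 1 + r_amp ** 2 + 2 * r_amp * math.cos(2 * k_perp * z + phase)


wavelength = 500e-9  # hypothetical wavelength in metres
at_reflector = standing_wave_intensity(0.0, wavelength)
at_quarter_wave = standing_wave_intensity(wavelength / 4, wavelength)
print(at_reflector, at_quarter_wave)
```

Under these assumptions the intensity vanishes directly on a perfect reflector with a phase shift of $\pi$ and reaches four times the incident intensity a quarter wavelength above it, and the pattern repeats every half wavelength, which corresponds to the periodic contrast inversion described above.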
V. IMAGING OF STANDING WAVES: SHADOW FORMATION AND DETECTION THROUGH PROBE EDGE
When an optical fiber probe is placed in the zone of the overlapping incident and reflected beams, a shadow is formed, that is, a region where the incident wave is absent (Fig. 3). The boundary of the shadow region for the standing wave is modified through diffraction at the probe edge, similar to the diffraction of light at an edge. This has the effect of reducing the local field near the boundary and extending it into part of the geometric shadow region.
FIGURE 3. Schematic of ISOM geometry. When the optical probe is placed in the overlap region of incident and reflected beams near a substrate, a shadow region is formed. An interference maximum falls on the surface of the bilayer substrate.
The incident light penetrates into the coating on the illuminated side of the optical probe and can reach the inside of the optical probe at the probe edge, where the thickness of the coating is reduced. The amount of light that passes through the edge region depends on the local field intensity. But the incident light can also reach the inside of the optical probe through reflection on the surface (Fig. 4). As a result, the detected light that passes through the optical probe has two contributions: a contribution that depends on the local reflectivity and the size of the aperture, and a second contribution that depends on the local field intensity at the edge. Typically, the amount of light reflected directly into the probe is less localized and stays nearly constant, and the two contributions can be separated when scanning the probe over the surface. At a fixed height the amount of light reflected into the probe is nearly constant, whereas the light from the edge region is proportional to the local light intensity. The lateral resolution is limited by the size of the edge, not by the size of the aperture, since the contribution reflected into the aperture is nearly constant. As the probe is placed near the surface the reflected light cannot illuminate the entire aperture, and the amount of light that is reflected by the substrate
into the aperture decreases. The decrease of the reflected light transmitted through the aperture depends on the angle of incidence and falls off linearly with decreasing distance to the substrate. This linear decrease of the signal can be used as a distance indicator. The shape of the edge plays an important role. If a metal coating is evaporated onto the fiber probe, it is found that the thickness of the metal coating at the edge is narrowed due to surface tension effects. This helps the light penetrate at the edge into the core of the optical fiber, from where it can propagate to the detector. The illuminated edge of a circular-shaped probe is typically elongated perpendicular to the plane of incidence and narrower in the plane of incidence. A change of aperture geometry could certainly improve this aspect. In addition to the elastically scattered light that can enter the aperture, the inelastically scattered light is also useful for local spectroscopic analysis of the substrate. The spatial resolution of the inelastically scattered light depends on the aperture and the distance of the optical probe from the substrate. The one-sided illumination reduces the size of the region from which the inelastically scattered light can reach the aperture. We conclude that the external illumination of the probe reduces the interaction zone and avoids aperture limitations, and that a decrease of the background signal with distance can be used as an optical indicator of the proximity of the surface.
FIGURE 4. Schematic of tip region. Reflection geometry and collection mode with bilayer substrate: optical probe (1), transparent layer (2), reflecting substrate (3), incident beam (4), standing wave (5). Light reaches the inside of the probe through reflection on the surface (6) or through penetration into the probe coating (8) at the edge (7).
INTERFERENCE SCANNING OPTICAL PROBE MICROSCOPY
9
VI. QUANTUM LIMIT IN NEAR-FIELD IMAGING: SHORT VERSUS INTERMEDIATE DISTANCE OBSERVATION

In a microscopic perspective, an optical wave propagates through a medium because the incident wave is scattered at the induced atomic dipoles of the medium. The scattered wave is a dipolar wave because the wavelength is considerably larger than the size of the atoms or molecules. An oscillating dipole has a strongly divergent and localized component (near field, $1/r^3$), similar to an electrostatic dipole with a field concentrated along the dipole axis, and a delocalized component that extends to the far field ($1/r$) and is directed perpendicular to the dipole axis (Jackson, 1992). The strongly divergent component becomes relatively important only at distances of the order of or less than $\lambda/\pi$. In addition, both components contain a term ($1/r^2$) that becomes important at intermediate distances. If a near-field optical probe with shear-force feedback control and a typical aperture of 100 nm is used, the probe is very close to the surface, within 10 nm (the estimated range of dipole-induced van der Waals forces). The resolution in such a situation is limited by the probe and aperture size. When the probe is used to illuminate the surface, the aperture size is, in addition, limited by the transmission efficiency and heating of the probe (Kavaldjiev, 1995). Because the field falls off in a similar way in the perpendicular and lateral directions at intermediate distances, owing to the $1/r^2$ component, a lateral resolution of 100 nm can also be obtained at larger distances (10-100 nm), beyond the dipole-induced force interaction range and without the use of shear-force feedback.

Intermediate distance observation also has the following advantage if the quantum aspect is considered. The scale of the quantum regime can be estimated by considering the quantum of action: $p \cdot x = h/2$, and using $p = \hbar k$, $x = \pi/k = \lambda/2$; in comparison, for an electron: $p = \hbar k_F$, and using for the Fermi wavevector $k_F = 10^{10}\,\mathrm{m}^{-1}$, $x = 0.08$ nm. This shows the clear difference between the electronic and optical tunneling regimes. In near-field optics quantum effects are considerably larger due to the small photon momentum. Consequently, the influence of the observation process can no longer be ignored for distances smaller than $\lambda/2$. The path of the scattered photons is influenced by the proximity of the optical probe, and light scattered at the probe edge, in our case, influences the light scattered by the surface. The influence of the probe on the scattering process is reduced by keeping the probe farther away from the surface, but when the distance to the surface increases, the lateral resolution decreases due to the superposition or overlap with contributions from neighboring surface regions. The image contrast is sensitive to the phase differences of the scattered waves, which in turn depend on the path difference. With increasing distance from the surface the path difference of two neighboring points becomes smaller, which explains why the lateral resolution
W. S. BACSA
decreases. The optimum distance is therefore an intermediate one, close enough to reduce the overlap with contributions from neighboring regions and far enough to prevent the probe from influencing the scattering process. Girard (1990) showed, using linear response theory, that the dipole moment of a dielectric tip is influenced in a nonlinear way by the presence of the medium, and Zenhausern (1994) and Inouye (1994) used the scattered light from a sharp tip as a point light source to circumvent aperture limitations. It is interesting to compare reflection-collection mode with near-field optical techniques that use a sharp tip as a scattering source (apertureless). In reflection-collection mode the illuminated edge of the probe scatters light in a similar way, but the scattered light is detected through the optical probe, whereas with scattering on a sharp tip the scattered light is detected in the far field.

The dipole field is a linear superposition of a nonpropagative localized and a propagative delocalized component. Because the elastically scattered dipole waves are coherent, their superposition leads to stationary waves. We have seen that these waves are stationary perpendicular to the substrate surface but propagative parallel to it. The resulting momentum vector $k$ in the lateral direction is larger than the momentum vector of the incident beam, $k_0$, which helps reduce the uncertainty in the lateral direction. In the case of multireflections in a thin layer or internal total reflection the resulting wave can be canceled due to destructive interference. A simple classification of propagative and nonpropagative waves cannot be made outside the near-field range. We conclude that in the microscopic perspective the waves are dipolar waves with a nonpropagative localized and a propagative delocalized component. The two components are two aspects of the same scattering process and cannot be separated.
The observation of the nonpropagative near field is limited because it falls in the quantum regime, where the observation influences the scattering process. Observation at intermediate distances has the advantage of reducing quantum effects while maintaining a high lateral resolution. The lateral resolution depends on the distance of the probe from the substrate. Whereas scanning tunneling microscopy and atomic force microscopy rely on a short-range interaction field (<10 nm), near-field optical microscopy has the advantage of being able to use a short- or intermediate-range as well as a long-range interaction field. In addition, the macroscopic extent of the standing waves provides the opportunity to approach the surface and scan the probe in a controlled way. Alternatively, use of a second standing wave, produced by a second beam with a different angle of incidence or wavelength, makes it possible to take the phase shift between the two standing waves as an absolute substrate distance indicator. The control of optical probes by the use of standing waves has been demonstrated by Umeda (1992) and Kramer (1995).
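The use of two standing waves as a distance indicator can be illustrated with a minimal sketch, which is not the procedure used in the cited work. It assumes ideal plane waves, specular reflection from a flat substrate, an angle of incidence measured from the surface normal, and no phase shift on reflection; under these assumptions the fringes parallel to the surface repeat with a period of $\lambda/(2\cos\theta)$. The function names and the probe height of 300 nm are illustrative choices; the two wavelengths and the angle of 50° are the values quoted in the experimental section.

```python
import math

def fringe_period_nm(wavelength_nm, incidence_deg):
    """Period of the standing-wave fringes along the surface normal.

    Assumes plane-wave illumination and specular reflection; the angle is
    measured from the surface normal (an assumption of this sketch).
    """
    return wavelength_nm / (2.0 * math.cos(math.radians(incidence_deg)))

def phase_difference_rad(height_nm, period_a_nm, period_b_nm):
    """Phase difference between two standing waves at a given height."""
    return 2.0 * math.pi * height_nm * (1.0 / period_b_nm - 1.0 / period_a_nm)

def unambiguous_range_nm(period_a_nm, period_b_nm):
    """Height at which the phase difference reaches 2*pi and repeats."""
    return period_a_nm * period_b_nm / abs(period_a_nm - period_b_nm)

red = fringe_period_nm(632.0, 50.0)    # first He-Ne line
green = fringe_period_nm(543.0, 50.0)  # second He-Ne line
height_nm = 300.0                      # hypothetical probe height

print(f"fringe period at 632 nm: {red:.1f} nm")
print(f"fringe period at 543 nm: {green:.1f} nm")
print(f"phase difference at {height_nm:.0f} nm: "
      f"{phase_difference_rad(height_nm, red, green):.3f} rad")
print(f"unambiguous height range: {unambiguous_range_nm(red, green):.0f} nm")
```

Because the phase difference grows linearly with height in this idealized picture, a measured phase difference maps to a single height within the unambiguous range; a real substrate would add reflection phase shifts that this sketch ignores.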
VII. NEAR-FIELD HOLOGRAPHY: AMPLITUDE AND PHASE INFORMATION
The phase of a monochromatic wave can be measured by its superposition with a second coherent reference wave. The intensity of the two interfering waves, or standing wave, depends on the relative phase and amplitudes of the two waves. In optical holography, the phase of the scattered light from the object is recorded in an interferogram formed by superposition of the light with a reference wave (Collier, 1971). Figure 5 shows the geometry used in optical holography, and it can be seen that the overlap region near the surface where standing waves form is analogous to the overlap region between the reflected beam and the reference beam. This means that a near-field optical image recorded in collection mode is an interferogram that contains amplitude and phase information. The subwavelength dimension does not change the fact that light scattered from different parts of the sample is coherently superimposed and interferes. The parallel orientation of the image plane and standing waves is similar to reflection or Lippmann-Bragg holograms. The smaller distance between image plane and object is, however, distinctly different from classical or far-field holography, which makes using the recorded interferogram or hologram to reconstruct the surface geometry more complicated. The distances between an image point and all the points on the object are not comparable to the distance to the surface and vary considerably. The scattered wave has a dipolar character that has to be taken into account, unlike in far-field holography. Thus the methods used in far-field holography cannot be used. Far-field holograms have a thickness and are three-dimensional, which contributes to the quality of the reconstructed image. Similarly, recording in near-field optics at different heights gives additional information and improves the image. The description of the reconstruction of reflection-collection near-field images is beyond the scope of this article.
We conclude that reflection-collection near-field images are interferograms that can contain amplitude and phase information or are near-field holograms that can be used to separate topographic and optical contributions.
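The statement that the recorded intensity depends on the relative phase and amplitudes of the two waves can be written out for two coherent scalar waves. The sketch below uses the standard two-beam relation $I = A_o^2 + A_r^2 + 2 A_o A_r \cos\Delta\varphi$; it treats the fields as scalar plane waves and ignores the dipolar character discussed above, so it is an illustration of the principle rather than a reconstruction method. The amplitudes and the phase value are hypothetical.

```python
import math

def interference_intensity(a_obj, a_ref, phase_rad):
    """Intensity of two superposed coherent waves with a relative phase."""
    return a_obj ** 2 + a_ref ** 2 + 2.0 * a_obj * a_ref * math.cos(phase_rad)

def relative_phase(intensity, a_obj, a_ref):
    """Invert the two-beam relation.

    Returns the magnitude of the phase in [0, pi]; the sign is not
    recoverable from a single intensity value because cos is even.
    """
    c = (intensity - a_obj ** 2 - a_ref ** 2) / (2.0 * a_obj * a_ref)
    return math.acos(max(-1.0, min(1.0, c)))

# Hypothetical amplitudes (arbitrary units) and object phase.
a_obj, a_ref, phase = 0.3, 1.0, 1.2
recorded = interference_intensity(a_obj, a_ref, phase)
print(f"recorded intensity: {recorded:.4f}")
print(f"recovered |phase|:  {relative_phase(recorded, a_obj, a_ref):.4f} rad")
```

The inversion needs both amplitudes to be known, which is why a single interferogram constrains amplitude and phase jointly rather than giving either one directly.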
VIII. EXPERIMENTAL RESULTS

The following results show reflection-collection near-field images of optical gratings, carbon nanotubes, and silver islands on a bilayer substrate. The images were recorded using a conventional scanning probe instrument (CP, Park Scientific Instruments). The cantilever probe was replaced with an optical fiber probe (Nanonics Inc.) with a nominal aperture size of 100 nm. An unpolarized He-Ne laser (632 nm or 543 nm, 5 mW), modulated at 0.1-1 kHz and incident at an angle of 50°, was used for illumination. The collected light was detected by a photomultiplier (Hamamatsu Inc.) and a lock-in amplifier (Stanford Instruments). All the images were recorded in reflection geometry and collection mode at constant height, and no feedback signal was used. In addition, all images (256 x 256 image points) were recorded several times in succession in order to discriminate nonreproducible effects. The displayed images are the raw images with no image processing applied.

FIGURE 5. Schematic of wave amplitude and phase detection. (a) Wave amplitude and phase detection: coherent wave source (1), object (2), reference beam (3), beam modified by the object (4), and resulting wave (5); the superposition of the object wave (4) with a reference wave (3) leads to a wave whose amplitude relates the relative phases of object and reference waves. (b) Holography and reflection geometry: incident beam (1) and reference beam (2), substrate (3), reflected beam (4), overlap region (5), and hologram (6); note the similarity of the interference conditions in holography and in the reflection geometry used in reflection-collection near-field optics.

A. Reflection Geometry and Collection Mode
Figure 6 shows an image of an optical grating at different scan sizes (120 µm (top), 50 µm (middle), 5 µm (bottom)). Two types of modulations can be observed: a large modulation in the diagonal direction due to the tilt of the scan plane with respect to the sample surface, which is attributed to the standing waves oriented parallel to the surface, and a smaller modulation in a rectangular area due to fringes of the optical grating (top image). The direction of the large (diagonal) modulation (top image) changes from one side of the image to the other, which we believe is due to the nonlinearity of the scanner of the scanning probe instrument. The middle and bottom images show a smaller scan. The edges of the gratings are broadened due to scattering, and finer additional features are seen, which are possibly due to scattering from dust particles on the grating. The middle and bottom images show the finer additional features more clearly. The additional periodicity does not appear to be related to the scanning process and is possibly related to the surface morphology of the grating.

FIGURE 6. Image of optical grating recorded in reflection geometry and collection mode at constant height with the following scan sizes: 120 µm (top), 50 µm (middle), 5 µm (bottom).

Figure 7 shows the collection mode image of a larger dust particle. Several interference fringes can be observed from the waves scattered by the particle. As in Fig. 6, the diagonal fringes are from the standing waves due to the tilt of the scan plane with respect to the substrate surface. The sudden contrast change in the lower half of the image, perpendicular to the scan direction, indicates an uncontrolled and irreproducible change at the probe edge, possibly due to an accumulation of dust particles at the edge that changes the characteristics of the recorded light. The lower part of the figure shows interference fringes of agglomerated carbon nanotubes. Apart from the concentric diffraction fringes, parallel fringes with similar fringe spacing are seen, which are attributed to the edge of the nanotube film outside the scan region. The edge reflects the standing wave that propagates parallel to the surface, which results in a stationary wave with fringes perpendicular to the image plane. Similar diffraction fringes have also been observed in transmission geometry and illumination mode (Lewis, 1991).

FIGURE 7. Image of dust particle (top) and carbon nanotubes (bottom) in reflection geometry and collection mode at constant height.

Figure 8 shows an edged grating in the shape of a 5. In addition to the diagonal fringes due to the standing waves, several interference fringes are seen. The diagonal fringes change their orientation from one side of the image to the other, which is again attributed to nonlinearity of the scanner of the scanning probe instrument. Finer details are seen in the middle and bottom images. The middle image shows vertical fringes from the grating edge in addition to the horizontal fringes from the grating fringes. The top image shows, in addition to the fringes, a diffuse and displaced image of the edged 5. We believe that the diffuse image of increased/decreased intensity in the shape of the edged 5 is due to the component of the light that is directly reflected into the aperture. The reflectivity change of the edged 5 causes a variation of this background signal. The angle of incidence (50°) and the large distance to the substrate (2 µm) cause a displacement of the diffuse image with respect to the interference fringes.

FIGURE 8. Image of optical grating recorded in reflection geometry and collection mode at constant height with the following scan sizes: 120 µm (top), 50 µm (middle), 5 µm (bottom).

B. Bilayer Substrate
In order to show the contrast enhancement using a bilayer substrate, 3.5 nm of silver was evaporated on the bilayer, which led to the formation of a narrowly dispersed silver island film with an island size of 50 nm. The island size of the same island film has been confirmed by scanning force microscopy (Bacsa, 1997). Figure 9 shows a silver island film deposited on a bilayer substrate. The image shows excellent contrast and demonstrates that the lateral resolution is better than 50 nm even when recorded with an optical probe with a nominal aperture size of 100 nm, using a bilayer substrate, reflection geometry, and collection mode. The image was recorded in constant-height mode without feedback control at a distance estimated at 50-100 nm. The rather high degree of order of the silver islands is somewhat surprising and can be explained by the narrow size distribution leading to a regular arrangement or packing. The overlap of the scattered waves, which increases with distance from the surface, additionally reduces deviations from a regular arrangement in each island.
FIGURE 9. Image of silver island film recorded in reflection geometry and collection mode at constant height with a scan size of 600 nm.
IX. CONCLUDING REMARKS

By adopting the microscopic perspective we have shown that wave scattering and interference are an inherent part of light-matter interaction. Reflection geometry has the advantage that interference conditions are particularly well defined compared with transmission geometry or probe illumination mode. The formation of standing waves near a surface is a consequence of the superposition of the incident beam and the coherently scattered waves, which leads to a larger resulting lateral momentum. The interference condition near a surface can be optimized by the use of a substrate consisting of a highly reflecting and a transparent layer, which leads to local fields two to four times larger than on a transparent substrate. This local field enhancement and cavity effects result in increased contrast, resolution, and sensitivity. The external illumination of the probe has the advantage of reducing the interaction zone and avoiding aperture limitations. The observation of the near field is limited due to the small momentum of the photon. At distances from the surface smaller than the wavelength involved, one enters the quantum regime, which has the consequence that scattering from the probe influences the scattering process, resulting in nonlinear signal-distance characteristics. The scattered waves are dipolar waves with a nonpropagative localized and a propagative delocalized component. The two components are two aspects of the same scattering process and cannot be separated. Because the optical near field contains a localized and a delocalized component, observation at intermediate distance has the advantage of reducing proximity effects while maintaining a high lateral resolution and providing phase information. The macroscopic extent of the standing waves provides the opportunity to approach the surface and scan the probe in a controlled way.
Because the recorded image in reflection-collection near-field optics is the image of standing waves, or an interferogram that contains amplitude and phase information, a new area of optical holography is opened: near-field holography, with the capacity to separate topographic contributions in reflection near-field optical images.
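The interplay of the localized and delocalized components of the dipolar wave summarized above can be made tangible with a rough numerical sketch. It compares only the radial scaling of the three terms of an oscillating-dipole field, $(kr)^{-3}$, $(kr)^{-2}$ and $(kr)^{-1}$ with $k = 2\pi/\lambda$, and deliberately leaves out the angular factors and the common prefactor. The function name, the dictionary labels and the sample distances are illustrative assumptions; the wavelength is one of the laser lines used in the experiments.

```python
import math

def dipole_term_weights(r_nm, wavelength_nm):
    """Relative radial weights of the three terms of a dipole field.

    Only the (k r)^-n scaling is kept; angular dependence and the common
    prefactor are omitted, so the numbers are comparable to each other only.
    """
    kr = 2.0 * math.pi * r_nm / wavelength_nm
    return {
        "near (1/r^3)": kr ** -3,
        "intermediate (1/r^2)": kr ** -2,
        "far (1/r)": kr ** -1,
    }

wavelength_nm = 632.0
for r_nm in (10.0, 50.0, 100.0, 1000.0):  # hypothetical probe distances
    weights = dipole_term_weights(r_nm, wavelength_nm)
    total = sum(weights.values())
    shares = ", ".join(f"{name}: {value / total:.2f}" for name, value in weights.items())
    print(f"r = {r_nm:6.0f} nm -> {shares}")
```

In this simplified picture the three terms are equal at $kr = 1$, that is at $r = \lambda/2\pi$, so around 100 nm for visible light all three contribute with comparable weight, while at 10 nm the localized term dominates.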
ACKNOWLEDGMENTS
I am particularly thankful to A. Kulik for his experimental support and encouragement. I would also like to thank Park Scientific, Geneva, for their interest and for making an SPM instrument available for three months.
REFERENCES

Ash, E. A., and Nichols, G. (1972). Super-resolution aperture scanning microscope. Nature 237, 510.
Bacsa, W. S., and Lannin, J. (1992). Bilayer interference enhanced Raman spectroscopy. Appl. Phys. Lett. 61, 19-21.
Bacsa, W. S., and Kulik, A. (1997). Interference scanning optical probe microscopy. Appl. Phys. Lett. 70, 3507-3509.
Betzig, E., Finn, P. L., and Weiner, J. S. (1992). Combined shear force and near-field scanning optical microscopy. Appl. Phys. Lett. 60, 2484-2486.
Buckland, E. L., Moyer, P. J., and Paesler, M. A. (1992). Resolution in collection-mode scanning optical microscopy. J. Appl. Phys. 73, 1018-1028.
Cline, J. A., and Isaacson, M. (1993). Comparison of different modes of reflection in near-field optical imaging. Ultramicroscopy 57, 147-152.
Cline, J. A., and Isaacson, M. (1995). Probe-sample interaction in reflection near-field scanning optical microscopy. Appl. Optics 34, 4869.
Collier, R. J., Burckhardt, Ch. B., and Lin, L. H. (1971). Optical Holography. Academic Press, San Diego, CA.
Courjon, D., Sarayeddine, K., and Spajer, M. (1989). Scanning tunneling optical microscopy. Opt. Comm. 71, 23.
Courjon, D., Vigoureux, J. M., Spajer, M., Sarayeddine, K., and Leblanc, S. (1990). External and internal reflection near-field microscopy: Experiments and results. Appl. Opt. 29, 3734.
Feynman, R. P. (1964). The Feynman lectures on physics. I, 27, 31.
Feynman, R. P. (1985). QED, The strange theory of matter. Princeton University Press, Princeton, N.J.
Fischer, U. Ch. (1985). Optical characterization of 0.1 µm circular apertures in a metal film as a light source for scanning ultramicroscopy. J. Vac. Sci. Technol. B 3, 386-390.
Girard, Ch., and Spajer, M. (1990). Model for reflection near-field optical microscopy. Appl. Opt. 29, 3726.
de Grooth, B. G. (1997). Why is the propagation velocity of a photon in a transparent medium reduced? Am. J. Phys. 65, 1156-1164.
Hecht, C. (1997). Facts and artifacts in near-field optical microscopy. J. Appl. Phys. 81, 2492-2498.
Inouye, Y., and Kawata, S. (1994). Reflection-mode near-field optical microscope with metallic probe tip. Optics Letters 19, 159.
Jackson, J. D. (1962). Classical Electrodynamics. Wiley, New York.
Kavaldjiev, D. I., Toledo, R., and Vaez-Iravani, M. (1995). On the heating of the fiber tip in a near-field scanning optical microscope. Appl. Phys. Lett. 67, 2771.
Kramer, A., Hartmann, T., Stadler, S. M., and Guckenberger, R. (1995). An optical tip-sample distance control for a scanning near-field optical microscope. Ultramicroscopy 61, 191-195.
Lewis, A., Isaacson, M., Harootunian, A., and Murray, A. (1983). Ultramicroscopy 13, 227.
Lewis, A., and Liebermann, K. (1991). Near-field optical imaging with a nonevanescent excited high brightness light source of subwavelength dimensions. Nature 354, 214.
Liebermann, K., Lewis, A., Fish, G., Shalom, S., Jovin, Th. M., Schaper, A., and Cohen, S. R. (1994). Multifunctional, micropipette-based force cantilevers for scanned probe microscopy. Appl. Phys. Lett. 65, 648.
Pilevar, S., Atia, W. A., and Davis, Ch. C. (1995). Reflection near-field scanning optical microscopy: an interferometric approach. Ultramicroscopy 61, 233.
Pohl, D. W., Denk, W., and Lanz, M. (1984). Optical stethoscopy: image recording with resolution λ/20. Appl. Phys. Lett. 44, 651.
Pohl, D. W., and Courjon, D. (1993). Near-Field Optics. Kluwer, Dordrecht, The Netherlands.
Reddick, R. C., Warmack, R. J., and Ferrell, T. L. (1989). New form of scanning optical microscopy. Phys. Rev. B 39, 767.
Synge, E. H. (1928). A suggested method for extending microscopic resolution into the ultramicroscopic region. Phil. Mag. 6, 356.
Tsai, C. P., Jackson, H. E., Reddick, R. C., Sharp, S. H., and Warmack, R. J. (1990). Photon scanning tunneling microscopy study of optical waveguides. Appl. Phys. Lett. 56, 1515.
Umeda, N., Hayashi, Y., Nagai, K., and Takayanagi, A. (1992). Scanning Wiener-Fringe microscope with an optical fiber tip. Appl. Opt. 31, 4515.
Wiener, O. (1890). Ann. Phys. 40, 203.
Zenhausern, F., O'Boyle, M. P., and Wickramasinghe, H. W. (1994). Apertureless near-field optical microscope. Appl. Phys. Lett. 65, 1623.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 110

High-speed Electron Microscopy

O. BOSTANJOGLO

Optisches Institut der Technischen Universität Berlin, Sekr. P 1-1, Strasse des 17. Juni 135, D-10623 Berlin, Germany
I. Introduction
II. High-speed Techniques
   A. Short-Time-Exposure Imaging
   B. Streak Imaging
   C. Image Intensity Tracking
III. Time-Resolving Microscopes
   A. Time-Resolving Transmission Electron Microscopy
   B. Flash Photoelectron Microscopy
   C. Pulsed High-Energy Reflection Electron Microscopy
IV. Concluding Remarks
Acknowledgments
References
I. INTRODUCTION

Electron microscopy is used to investigate a variety of material properties with a high spatial resolution. The most familiar applications are imaging of the atomic structure of solids, of crystal defects, of magnetic and electric fields in solids, and of the chemical composition of thin films and surfaces (e.g., Reimer, 1985, 1993; Murr, 1991). Conventionally, a stationary electron beam either illuminates the whole specimen in a single exposure or scans the specimen. An image of the static distribution of a specific material property is produced in both cases. If time-varying effects are to be captured, the microscope must be pulsed. Periodic variations of a material property are captured by pulsing the electron beam synchronously with the period of the time-varying material property and summing the signals within a selected acquisition time to produce the image. This sampling procedure reduces the superimposed noise to a low level because
of its statistical nature. “Images” with a joint submicrometer/picosecond resolution were produced, for example, by Brunner et al. (1987). A fast nonrepetitive process is less easily uncovered, as all information about the transient state must be captured by a single short probing pulse, yet these nonrepetitive processes have attracted considerable interest in fundamental and applied research in connection with material processing by laser pulses. Typical applications in which pulse lasers have progressively replaced established tools are localized cutting, drilling, ablating, patterning, alloying, and connecting of a wide variety of materials. The key condition for a precise local treatment, that is, for a minimum thermal and mechanical loading of the neighboring material, is that the required photon energy be deposited locally and in a short time. Thermal melting, melt flow, crystalline and noncrystalline solidification, and thermal evaporation are the main processes that determine the product of material treatment with a laser pulse in excess of some 10 ps. Commonly, the pump-probe technique, exploiting, for example, light-optical microscopy, is used as a diagnostic tool to track laser-induced effects. The light-optical methods are very fast and reach a time resolution of several femtoseconds (e.g., Schonlein et al., 1987). Their drawback is a limited spatial resolution (>1 µm) and the fact that they primarily sense the electronic system, so that properties related to the atomic structure must be deduced with a suitable model. Material structure is better approached by electron microscopy, some modes of which directly probe the atomic packing. Furthermore, effects that are not accompanied by a large change of electronic states that strongly interact with visible light, for example, phase transitions in metals, are not easily detected by light optics.
However, these effects show up with good contrast when they are imaged by electron microscopy based on Coulomb scattering of the probing electrons at the atomic structure. This chapter describes the various time-resolving electron-optical techniques that were developed to study fast transient effects in free-standing films and on surfaces of bulk materials down to the nanosecond time scale. Hydrodynamic instabilities in confined laser pulse-produced melts and their solidification and evaporation were investigated, as they are of major concern to micromachining with laser pulses. The mechanisms uncovered by high-speed electron microscopy are presented.
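The noise reduction obtained by summing many synchronized pulses, mentioned above for periodic processes, follows a simple statistical rule if the noise of successive acquisitions is uncorrelated: the summed signal grows in proportion to the number of pulses $N$, the noise only as $\sqrt{N}$. The sketch below states this rule; the single-pulse signal-to-noise ratio and the target value are hypothetical, and the assumption of uncorrelated noise is the sketch's own.

```python
import math

def snr_after_summing(single_pulse_snr, n_pulses):
    """Signal-to-noise ratio after summing n_pulses acquisitions.

    Assumes the noise of different pulses is statistically independent.
    """
    return single_pulse_snr * math.sqrt(n_pulses)

def pulses_needed(single_pulse_snr, target_snr):
    """Smallest number of summed pulses that reaches the target ratio."""
    return math.ceil((target_snr / single_pulse_snr) ** 2)

single_pulse_snr = 0.5  # hypothetical: signal buried in noise for one pulse
print(snr_after_summing(single_pulse_snr, 100))
print(pulses_needed(single_pulse_snr, 20.0))
```

A nonrepetitive process offers no such averaging, which is why the single probing pulse must itself carry enough electrons.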
II. HIGH-SPEED TECHNIQUES

There are three different time-resolving techniques, which are distinguished by the number of spatial coordinates in the image: short-time-exposure imaging, streak imaging, and image intensity tracking.
A. Short-Time-Exposure Imaging

Short-time-exposure imaging captures a transient stage of a fast process by producing a two-dimensional image of the specimen with a short exposure time. This image may be produced either by using a stationary illumination of the specimen and enabling the image detector for only a short time (Bostanjoglo et al., 1987a,b; 1989) or by illuminating/exciting the specimen with a short electron/photon pulse and recording the electron image with a stationary detector. The first method needs sophisticated pulse electronics and shielding precautions. Preferably, the image detector is a charge-coupled-device (CCD) camera backed by an image intensifier. A sealed intensifier may be gated by pulsing the moderate voltage between the photocathode and the first gain stage, which is a microchannel plate (MCP). An open MCP intensifier is enabled by pulsing the voltage across the channel plate. This voltage must be as small as possible so that electromagnetic interference due to switching is minimized. In addition, the applied voltage may appreciably exceed the maximum safe dc voltage for a short period, giving a gain in the pulsed mode that surpasses the dc value by two orders of magnitude. The second technique is superior, as it may provide a much brighter illumination than a stationary beam if the electrons are emitted by a pulsed source. Short electron pulses may be produced by a fast deflection of a constant-current beam (Gesley, 1993), by pulsing the voltage of the Wehnelt electrode (Szentesi, 1972) or of a filter lens (Plies, 1982), or by exciting the electron emitter with a laser pulse. Only the last method allows the high current densities required for nonsampling short-time-exposure imaging to be extracted. The laser-driven gun used in the author’s group is distinguished by the fact that it can be operated both as a conventional dc thermal and as a high-current pulsed gun.
It is a three-electrode type gun, consisting of a hairpin emitter, a Wehnelt electrode, and an anode, which houses an aluminum mirror for directing the laser beam onto the tip of the hairpin. This gun may be pulsed in the thermionic or in the photoelectron emission mode. Because high-current-density guns are the key component for short-exposure imaging, they will be considered in some detail.

1. Laser-Driven Thermionic Gun

If the emitter is heated by a nanosecond (or shorter) laser pulse, the emitter can attain a temperature well above the melting point without being destroyed, and thermal electron pulses with current densities above those produced by dc heating are attained (Bostanjoglo and Heinricht, 1987; Bostanjoglo et al., 1990; Schafer and Bostanjoglo, 1992). In addition, emitter atoms are evaporated. They are ionized by the accelerated thermal electrons and reduce their
negative space charge, so that electron current densities exceeding the Child limit of genuine electron emitters by one order of magnitude can be generated. However, the laser-driven thermionic gun has several serious drawbacks. As the surface is eroded by each laser pulse, its absorption coefficient and therefore the deposited laser fluence vary from pulse to pulse, producing unpredictable electron pulse currents. In addition, the length of the electron pulse may exceed that of the laser pulse by more than 100% due to delayed emission of captured electrons as the plasma is diluted by expanding into the vacuum. This poor pulse-to-pulse stability makes the laser-driven thermionic gun unsuitable for multiframe imaging. Finally, this mode of operation is hazardous, since the gun is driven to the threshold of laser-induced electric breakdown. A small upward deviation of the deposited laser fluence triggers a high-voltage breakdown, which in turn launches a high-amplitude traveling wave that may destroy electronic circuits of the microscope or of the attached high-speed diagnostic devices.
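For orientation, the Child limit referred to above can be evaluated for the textbook case of an ideal planar vacuum diode, where the space-charge-limited current density is $J = \frac{4\varepsilon_0}{9}\sqrt{\frac{2e}{m}}\,V^{3/2}/d^2$. The sketch below is only that idealization: the real gun is a three-electrode hairpin geometry, the formula is nonrelativistic, and the voltage and gap used here are hypothetical values chosen for illustration, not parameters from the text.

```python
import math

EPS0 = 8.8541878128e-12         # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19      # elementary charge, C
M_ELECTRON = 9.1093837015e-31   # electron rest mass, kg

def child_langmuir_a_per_m2(voltage_v, gap_m):
    """Space-charge-limited current density of an ideal planar diode."""
    prefactor = (4.0 * EPS0 / 9.0) * math.sqrt(2.0 * E_CHARGE / M_ELECTRON)
    return prefactor * voltage_v ** 1.5 / gap_m ** 2

# Hypothetical extraction voltage and cathode-anode gap.
voltage_v = 10e3
gap_m = 5e-3
limit = child_langmuir_a_per_m2(voltage_v, gap_m)
print(f"planar Child limit:        {limit / 1e4:.1f} A/cm^2")
print(f"ten times the Child limit: {10.0 * limit / 1e4:.1f} A/cm^2")
```

The second printed value indicates the scale implied by "one order of magnitude" above the limit once positive ions partially neutralize the electron space charge.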
2. Laser-Driven Photoelectron Guns

Photocathodes with work functions ranging from the lowest values of ≈2 eV up to 4 eV have been used in laser-excited guns. Data on a number of electron emitters are given, for example, by Anderson et al. (1992), Travier (1994), and Chevallay et al. (1994). Materials with low work functions (<3 eV) are alkali, alkaline earth, and rare earth metals, oxides of alkaline earth metals (Lablond and Rajaonera, 1994), rare earth hexaborides (Watari and Yada, 1986; Yada, 1986; May et al., 1990), borides and carbides of refractory metals (Yada, 1986), and semiconductors with negative electron affinity (Baum et al., 1995). Unfortunately, materials with work functions low enough to emit electrons by one-photon excitation with visible light are mechanically and thermally weak. In addition, they have a low threshold for damage by ion bombardment and foreign gases, so that they must be operated at pressures below $10^{-8}$ mbar. Mechanical strength can be increased by implanting, for example, alkali atoms into a refractory metal (Girardeau-Montaut et al., 1995), but these cathodes are still sensitive to poisoning and require an ultrahigh vacuum. The electron yield of some noble metals is appreciably increased if they are deposited as granular films on an inert substrate (Sabary and Bergeret, 1994). Such thin-film cathodes have been successfully used in electron beam testing devices (Batinic et al., 1995). Materials that are mechanically and thermally stable and chemically inert enough to be operated in a high vacuum ($\leq 10^{-5}$ mbar) have work functions above 3 eV. Photoelectron emission can be excited from these materials either by one-photon absorption of ultraviolet radiation having a quantum energy larger than the work function or by multiphoton absorption of visible light. The intensity of the light pulse must then be very high ($> 10^{13}$ W/cm$^2$) so that
multiphoton processes can occur with a high probability. Simultaneously, the pulse must be shorter than the thermalization time of the lattice (1-10 ps) so that the photon-absorbing electrons are ejected before an appreciable amount of the deposited energy is transferred to the lattice (Fujimoto et al., 1984; Wang et al., 1994; Girardeau-Montaut et al., 1994). Presently, these ultrashort laser pulses cannot be used in high-speed electron microscopy, as the released electron pulses do not contain enough electrons to produce images of normal specimens with submicron spatial resolution. Since these usually require exposure times of a few nanoseconds, ultraviolet nanosecond laser pulses must be used to excite the photocathode. Convenient sources of this radiation are excimer lasers or Q-switched and frequency-multiplied solid-state lasers. If multiple pulses with a variable spacing in the nano- to microsecond range are to be produced, Q-switched lasers are well suited. Successive pulses are conveniently generated by a stepwise decrease of the losses of the laser resonator (Koechner, 1996). Because it is desirable to operate the same microscope both in the high-speed and in the conventional nonpulsed mode for routine investigations and adjustment, the photocathode should tolerate standard Joule heating for thermionic emission. For convenience the cathode also should work properly in the high vacuum of an ordinary electron microscope ($10^{-6}$ mbar). These specifications can be met by coating the hairpin of a standard thermal cathode with a suitable photoelectron emitter. CeB6, LaB6, ZrC, Ce, Tb, Ti, and Zr were tested as photoelectron emitters, with Nb, Ta, W, Ir-W, and Re as refractory metals for the supporting hairpin in various combinations (Nink, 1999). The compounds, being powders, were deposited by cataphoresis on the hairpin and baked at ≈1200°C and [...] mbar.
The metal photocathodes were fabricated by suspending a tiny chip of the metal from the tip of the hairpin and melting it by Joule heating in high vacuum. The photocathode that produced the highest electron current densities with the ultraviolet radiation used (266 nm) in the microscope vacuum, and that simultaneously had the highest threshold for laser-induced flashover and withstood conventional operation by dc heating, was Zr-coated Re. Stable pulse-to-pulse operation and a high electron yield were achieved by keeping the cathode at ≈1200 °C during laser pulsing. This cathode produced a current density of ≈700 A/cm² with an axial brightness of ≈$4 \times 10^{6}$ A/(cm² sr) at an acceleration voltage of 100 kV.
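The distinction between one-photon and multiphoton excitation made above can be checked with a short calculation. The following is a minimal sketch, not part of the original text: it converts a laser wavelength into a photon energy and counts how many photons are needed to reach a given work function. The function names are illustrative, and the 4-eV work function is simply the upper end of the range quoted above.

```python
import math

# Physical constants (CODATA values)
PLANCK_EV_S = 4.135667696e-15   # Planck constant, eV*s
LIGHT_SPEED_M_S = 2.99792458e8  # speed of light, m/s


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a vacuum wavelength given in nm."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def photons_required(work_function_ev, wavelength_nm):
    """Smallest number of photons whose summed energy reaches the work function."""
    return math.ceil(work_function_ev / photon_energy_ev(wavelength_nm))


if __name__ == "__main__":
    # 532 nm and 266 nm are the Nd:YAG harmonics mentioned in the text;
    # 4 eV is the upper end of the quoted work-function range (an example value).
    for wavelength in (532.0, 266.0):
        energy = photon_energy_ev(wavelength)
        count = photons_required(4.0, wavelength)
        print(f"{wavelength:.0f} nm: {energy:.2f} eV per photon, {count} photon(s) needed")
```

Under these assumptions a 266-nm photon (about 4.66 eV) exceeds a 4-eV work function on its own, whereas 532-nm light would need a two-photon process.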
B. Streak Imaging

Streak imaging is accomplished by confining the visible part of the object with a slit aperture in the plane of an (enlarged) intermediate image and sweeping the final one-dimensional image across the image detector in the direction perpendicular to the slit (Bostanjoglo and Kornitzky, 1990; Bostanjoglo and
Nink, 1997). The sweeping velocity should be constant to have a “homogeneous” exposure of the detector. Since the slit width cannot be zero, the image is blurred along the time axis. The optimum width is determined by a compromise between shot noise in the image and the time resolution. During the streak operation the specimen is illuminated by an electron beam with the highest current density possible in order to minimize shot noise. Because prolonged illumination would inevitably lead to radiation damage of the specimen, the electron beam is directed onto the specimen only for the short period of the streak. This technique continuously visualizes transitions in the specimen that proceed along the slit coordinate.
C. Image Intensity Tracking

Here, the bright-field image intensity of a selected region of the specimen is detected with a fast scintillator/photomultiplier and registered with a fast storage oscilloscope (Bostanjoglo and Liedtke, 1980; Bostanjoglo et al., 1982). The specimen is illuminated with an electron beam of maximum current density to achieve a high signal-to-noise ratio. In order to avoid radiation damage of the specimen, the beam is passed to the specimen by a blanking capacitor for a limited time of a few microseconds only. If the image intensity is to be tracked for a period exceeding the safe time, the electron beam is passed in a number of equidistant short pulses (Bostanjoglo and Thomsen-Schmidt, 1989). This time-resolving mode supplies a continuous record of those fast processes in the specimen that modify the image intensity.
III. TIME-RESOLVING MICROSCOPES
There are several types of electron microscopes that probe different zones of the specimen. Transmission microscopes uncover the volume processes of free-standing films that often mimic bulk material. Properties of the top layers of a surface are successfully studied by the photoelectron microscope. Reflection electron microscopy gives access to the space above the surface of the specimen. These three types of electron microscopes were adapted to investigations of fast processes in their specific domain.

A. Time-Resolving Transmission Electron Microscopy
The transmission microscope probes films with thicknesses of 1 nm to 1 µm. Usually, some of the electrons scattered by the atoms are intercepted by an
aperture in the back focal plane of the objective lens, and a bright- or dark-field image is produced. The image intensity replicates, among other features, lattice structure, orientation, defects, and grain boundaries in crystalline samples by Bragg scattering, and the distribution of thickness, atomic number, and density in amorphous films by atomic scattering (e.g., Reimer, 1993).

1. Instrumentation

a. Short-Time-Exposure Imaging. Figure 1 shows a commercial transmission microscope modified for double-frame short-exposure imaging of laser-induced processes in free-standing films (Nink et al., 1999). The electron gun is of the standard three-electrode type but with a Zr-coated Re hairpin emitter, which may be operated in the conventional stationary Joule-heated thermal mode or as a laser-pulsed photocathode. Two successive high-current electron pulses with a width of 10 ns and a selectable spacing of
FIGURE 1. High-speed double-frame transmission electron microscope with integrated pulse laser for treating the specimen. (1) laser pulse-driven photoelectron gun, (2) beam blanker, (3) pulse laser for treatment of the specimen, (4) specimen, (5) objective lens with aperture, (6) field aperture, (7) frame shifter, (8) fiber plate transmission phosphor screen, (9) MCP image intensifier, (10) CCD sensor.
20 ns to 2 µs are delivered. The laser driving the photocathode is a Q-switched, twice-frequency-doubled Nd:YAG laser (wavelength 266 nm). Two pulses are extracted by decreasing the Q-spoiling voltage at the Pockels cell in a two-step process with transition times of ≈2 ns and with a variable spacing of the steps. The driving laser beam is directed onto the cathode tip by an aluminum mirror, installed in the anode, and focused by an external lens to a spot with a $1/e^2$ diameter of ≈20 µm. Photoelectron pulses with peak currents of several milliamperes into a half-angle of $7 \times 10^{-3}$ rad were produced. Because the microscope is provided with an automatic Wehnelt bias that, however, does not respond to nanosecond pulses of the beam current, the proper Wehnelt bias is set with a dc electron beam current by conventional Joule heating of the photocathode, also in the laser-driven mode. The disturbing stationary beam is deflected by a voltage at a blanking capacitor, which is switched off only during the emission of the photocathode. A bright-field image of the specimen is produced with an objective, intermediate (not shown in Fig. 1), and projective lens on a transmission screen. This image is intensified with a fiber-coupled MCP intensifier, picked up by a fiber-coupled CCD camera, digitized with a frame grabber, displayed on a monitor, and stored in a computer. Two successive frames are recorded by displacing the image on the detector with a frame-shifting capacitor between the two illuminating electron pulses. In order to separate the frames, the image is confined by a rectangular aperture in the image plane of the intermediate lens. The fast processes to be investigated are induced in the specimen by a second frequency-doubled Q-switched Nd:YAG laser that emits 5- or 15-ns pulses (full width at half-maximum, fwhm) at a wavelength of 532 nm.
This radiation is directed onto the thin-film specimen by an adjustable dielectric mirror on a heavily doped and polished silicon substrate with a bore to pass the electron beam. The treating laser beam is Gaussian in space and time and can be focused to a spot with a $1/e^2$ diameter of 12 µm on the specimen. Two different modes are provided by home-built logic circuitry: periodic triggering of both lasers and the beam deflectors for adjusting electron and laser pulses, and single-shot operation for grabbing a double-frame short-exposure image of a single laser pulse treatment of the specimen.

b. Streak Imaging. The arrangement is shown in Fig. 2 (Bostanjoglo and Nink, 1997). It coincides with that in Fig. 1 except for two differences. The illuminating electron beam is produced by conventional thermionic operation of the electron gun and is directed onto the specimen only for the period of the streak. A streak image is produced by applying a voltage ramp to the frame-shifting capacitor. The length of the streak can be selected from a few nanoseconds to several microseconds. One-dimensional confinement of the
FIGURE 2. Transmission electron microscope for streak imaging. (1) conventional electron gun, (2)-(5) and (8)-(10) as in Fig. 1, (6) slit aperture, (7) linear image shifter.
images is achieved with an adjustable narrow-slit aperture in the image plane of the intermediate lens. Streak imaging is particularly appropriate for determining the velocity of fast-moving phase boundaries (solid/solid, solid/liquid, or liquid/vapor) that show up in the bright-field image because of changes in electron scattering or in film thickness.

c. Image Intensity Tracking. Figure 3 shows the microscope for this mode of operation. The bright-field electron image intensity of a selected area of the specimen (> 100 nm) is converted into a voltage signal with a fast plastic scintillator (Pilot U, 1.9-ns rise time) plus a photomultiplier (rise time 2 ns), which is recorded with a storage oscilloscope (rise time 0.35 ns). The resulting time resolution of the recording unit is $(1.9^2 + 2^2 + 0.35^2)^{1/2}$ ns ≈ 3 ns. The illuminating electron pulse is generated as in the case of streak imaging. Intensity tracking continuously records changes of the electron scattering, which may be due to phase transitions or removal/accumulation of material from/at the probed region. This technique is therefore well suited for detecting transient states and measuring their lifetimes and the duration of phase transformations.
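The time resolution of the recording chain quoted above is a root-sum-of-squares combination of the individual rise times. A minimal sketch of that arithmetic follows; the function name is illustrative and not from the original text.

```python
import math


def combined_rise_time_ns(rise_times_ns):
    """Combine independent rise times (in ns) in quadrature."""
    return math.sqrt(sum(t * t for t in rise_times_ns))


if __name__ == "__main__":
    # Scintillator, photomultiplier, and oscilloscope rise times from the text.
    stages = [1.9, 2.0, 0.35]
    print(f"combined rise time: {combined_rise_time_ns(stages):.2f} ns")
```

The result is about 2.8 ns, which rounds to the 3 ns given in the text. The oscilloscope contributes almost nothing, so the scintillator and photomultiplier set the limit.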
FIGURE 3. Transmission electron microscope for tracking laser pulse-induced transitions in thin films. (1)-(5) as in Fig. 2, (6) circular field aperture, (7) plastic scintillator, (8) photomultiplier tube.
2. Applications

Two typical applications of time-resolved transmission microscopy are reported: hydrodynamic instabilities of metal melts subjected to high lateral thermal gradients (≈$10^{9}$ K/m) and ablation of metal films by laser pulses. These processes, and metals as the material, were selected because they have a bearing on micromachining with laser pulses. A laser pulse, bell shaped both in time (5-15 ns fwhm) and space (12 µm $1/e^2$ diameter), is applied to a free-standing metal film with a typical thickness of 100 nm. The film contains impurities due to a preceding exposure to air. This typically is the case in laser microprocessing. As the fluence of the treating laser pulse is increased, two regimes are encountered. In the lower fluence regime a local melt is produced that resolidifies. In the upper regime parts of the treated region are ablated. The details of the observed behavior of the treated metal deviate considerably from what is naively expected.
a. Thermal-Gradient-Driven Instabilities of Metal Melts. The thickness D of the treated film is assumed to be smaller than the thermal diffusion length during the laser pulse (D < 200 nm for all metals and 10-ns pulses). In this case an in-plane bell-shaped distribution of the temperature T is produced in the film that depends only on the radial coordinate r. The fluence of the laser pulse is high enough to melt the film within a certain radius but too low to heat the film appreciably above the melting temperature. Then, radiation pressure as well as evaporation of metal atoms and their recoil pressure can safely be neglected for nanosecond pulses. The only force the originally flat melt is subject to after the laser pulse stems from a possible gradient $\partial\gamma/\partial r$ of the surface tension $\gamma$, which is identical to a shear stress acting on both surfaces. Now there exists a negative thermal gradient $\partial T/\partial r < 0$ in the melt. Since the surface tension depends on temperature, and tabulated thermal coefficients are negative (about $-3 \times 10^{-4}$ N/(m·K) for many metals, e.g., Iida and Guthrie, 1988), the melt is expected to experience a positive shear stress at both surfaces:

$$\frac{\partial\gamma}{\partial r} = \frac{\partial\gamma}{\partial T}\,\frac{\partial T}{\partial r} > 0. \tag{1}$$
This shear force monotonically drags the liquid to the cooler solid periphery, piling it up there and finally opening a hole at the center of the melt. The actual flow, however, is quite different (Bostanjoglo and Otte, 1993; Bostanjoglo and Nink, 1997; Nink et al., 1999). Figures 4-6 show the hydrodynamics of laser pulse-produced melts in different metal films, visualized by the three time-resolving techniques described in Section II. None of the liquids that were subjected to an in-plane thermal gradient was perforated, as would be expected for a flow driven by negative thermocapillarity ($\partial\gamma/\partial T < 0$). Instead, the flow conspicuously depends on the starting temperature. At lower temperatures the liquid simply contracts within 100 ns and solidifies with a bump at the center. At higher temperatures flow starts with a fast contraction and continues with reversals of the flow direction. In addition to flow, crystallization of the melt of an “ordinary” metal with a high thermal diffusivity starts at its solid periphery and proceeds with an almost constant velocity of several meters per second toward the center of the melt (Bostanjoglo and Nink, 1996; Fig. 7). In the case that the melt accumulates at the center, a solid film with concentric modulations of the thickness is produced (Fig. 8). Melts that are produced with a pulse of a higher fluence are subdivided by an emerging concentric ring-shaped trench (Niedrig and Bostanjoglo, 1997; Fig. 9). The inner zone contracts and finally separates, forming a free disk that continues to contract due to surface tension and disappears in the end. The observed complicated flow can under no circumstances be explained with tabulated material parameters and the assumed shear stress in Eq. (1).
FIGURE 4. Short-exposure images of flow in a laser pulse-produced melt in an amorphous Ni0.8P0.2 film (60 nm). Exposure time was 10 ns. The moment of exposure is counted from the peak of the treating laser pulse (−∞: before the pulse; ∞: 10 s after the pulse) and is given at the upper right corners. The flow stopped about 1 µs after the laser pulse, whereas the melt crystallized within 4-10 µs after the pulse. (a) Centripetal flow after a low-energy laser pulse (1.2 µJ). There is no reversal of the flow direction. (b) Centripetal flow followed by centrifugal flow after a high-energy pulse (1.6 µJ). Flow direction is reversed 300 ns after the laser pulse.
“Rigorous” numerical simulation based on the Navier-Stokes and heat equations and simple physical arguments lead inevitably to a monotonic perforation of the melt within 100 ns. Figure 10 gives a hint of the decisive mechanism behind the actual flow. A melt in a gold film contracts after the first laser pulse. If a second pulse of similar fluence is applied after solidification but before a monolayer of gas is adsorbed from the high vacuum of the microscope, the melt then flows to the periphery. This reversal does not occur if the treated area is allowed to adsorb about a monolayer of air molecules (Bostanjoglo and Nink, 1996). Obviously the flow of “real” liquid metals is determined by surface-active impurity atoms. These atoms accumulate at the surface by replacing metal atoms, thereby decreasing the surface tension according to Gibbs’s isotherm:

$$d\gamma = -kT\,\Gamma\,d(\ln X), \tag{2}$$
FIGURE 5. Nonmonotonic flow in a laser pulse-produced melt pool in a polycrystalline cobalt film (60 nm). (a) Streak image of the melt flow. The melting 5-ns laser pulse was applied at the top edge. The slit aperture (width 1 µm) passed the central region of the melt (lower edge in (b)). (b) Texture after crystallization of the melt.
where k is the Boltzmann constant, $\Gamma$ is the excess surface density of the surface-active atoms adsorbed at the surface layer, and X is the atomic fraction of the surface-active atoms in the bulk liquid. Thus, the surface tension decreases with increasing concentration of surface-active impurities ($\partial\gamma/\partial X < 0$). The thermal coefficient $\partial\gamma/\partial T$ is also changed (Vitol and Orlova, 1984; Fig. 11; Ricci and Passerone, 1993). If the concentration of the impurities is high enough, $\partial\gamma/\partial T$ even becomes positive below some temperature $T_0$. Above $T_0$ the coefficient is again negative and approaches the value of the pure metal. Taking into account that the surface tension is a function of temperature and atomic fraction of the surface-active impurities, $\gamma = \gamma(T, X)$, the shear stress driving the melt flow must then be

$$\frac{\partial\gamma}{\partial r} = \left(\frac{\partial\gamma}{\partial T}\right)_X \frac{\partial T}{\partial r} + \left(\frac{\partial\gamma}{\partial X}\right)_T \frac{\partial X}{\partial r}. \tag{3}$$
It is determined by the thermal and compositional gradients, which cause a thermo- and a chemocapillary flow, respectively. Oxygen atoms are known to be surface active in various metals (Vitol and Orlova, 1984; Ricci and Passerone, 1993), and they are abundant in the investigated films that were exposed to the ambient atmosphere.
FIGURE 6. Oscillating flow in a laser pulse-produced melt pool in a polycrystalline iron film (60 nm). (a) Texture after crystallization of the melt. (b), (c) Oscilloscope traces showing the bright-field image intensity within the circle in (a) at two time scales after the melting laser pulse (arrow). m, cr denote melting and crystallization, respectively. The final level of the intensity in (c) remains constant.
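Thus the following scenario is expected. The melt, originally having a homogeneous distribution of impurities along r ($\partial X/\partial r = 0$) but being subjected to a thermal gradient ($\partial T/\partial r < 0$), starts to flow due to the thermocapillary shear stress

$$\left(\frac{\partial\gamma}{\partial T}\right)_X \frac{\partial T}{\partial r}. \tag{4}$$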
The bulk liquid lags behind the near-surface layers, since it is dragged by them via viscosity and since it is driven by the Laplace pressure, which appears only as the surface deforms. Accordingly, the surface-active atoms are redistributed by fast surface flow in such a way that their concentration is reduced in regions having a positive gradient of the surface velocity, and vice versa. A compositional gradient $\partial X/\partial r$ emerges, which has the same sign as the thermocapillary force and which produces a chemocapillary shear stress

$$\left(\frac{\partial\gamma}{\partial X}\right)_T \frac{\partial X}{\partial r}. \tag{5}$$
Since $(\partial\gamma/\partial X)_T < 0$, the chemocapillary force produced by a thermocapillary-driven flow always opposes the latter. Therefore, the original flow is either stopped or even reversed, in agreement with the observed flow dynamics.
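The competition between the two capillary contributions can be illustrated with a small numerical sketch. It is an illustration under stated assumptions, not a reproduction of the simulations cited in the text. The thermal coefficient and the order of magnitude of the thermal gradient are taken from the text (the gradient unit is read as K/m); the compositional coefficient and gradient are hypothetical placeholders chosen only to show the sign logic.

```python
def thermocapillary_stress(dgamma_dT, dT_dr):
    """Shear stress (N/m^2) caused by the thermal gradient."""
    return dgamma_dT * dT_dr


def chemocapillary_stress(dgamma_dX, dX_dr):
    """Shear stress (N/m^2) caused by the compositional gradient."""
    return dgamma_dX * dX_dr


def total_shear_stress(dgamma_dT, dT_dr, dgamma_dX, dX_dr):
    """Sum of the thermocapillary and chemocapillary contributions."""
    return (thermocapillary_stress(dgamma_dT, dT_dr)
            + chemocapillary_stress(dgamma_dX, dX_dr))


if __name__ == "__main__":
    dgamma_dT = -3e-4   # N/(m K), tabulated order of magnitude quoted in the text
    dT_dr = -1e9        # K/m, negative radial gradient of the quoted magnitude
    dgamma_dX = -1.0    # N/m per unit atomic fraction (HYPOTHETICAL value)
    dX_dr = 2e5         # 1/m, same sign as the thermocapillary stress (HYPOTHETICAL)

    thermo = thermocapillary_stress(dgamma_dT, dT_dr)
    chemo = chemocapillary_stress(dgamma_dX, dX_dr)
    print(f"thermocapillary: {thermo:+.2e} N/m^2")
    print(f"chemocapillary:  {chemo:+.2e} N/m^2")
    print(f"total:           {thermo + chemo:+.2e} N/m^2")
```

With a negative thermal coefficient and a negative radial temperature gradient, the thermocapillary term is positive (outward). A compositional gradient of the same sign, multiplied by a negative compositional coefficient, gives a negative term that reduces or reverses the total, which is the opposition described above.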
FIGURE 7. Typical crystallization at nearly constant velocity of a melt pool produced by a focused laser pulse in a crystalline metal film (aluminum, 60 nm). (a) Streak image. The melting 5-ns laser pulse was applied at the upper edge. The dark triangle is liquid metal; the vertical dark stripes within the bright area are Bragg-scattering crystals in the crystalline material. Propagation velocity of the crystal/liquid boundary is 5 m/s. (b) Texture after solidification of the melt. The rectangle marks the location of the streak aperture.
FIGURE 8. Concentric thickness modulations of a solidified laser pulse-produced melt in a gold film (90 nm), imaged by backscattered electrons in the scanning microscope.
The different directions of the early stages of flow, that is, centripetal after a low-energy and centrifugal after a high-energy laser pulse, follow from the convex shape of the $\gamma$-$T$ curve at high concentrations of surface-active impurities (Fig. 11). Because the compositional gradient $\partial X/\partial r$ is zero at the beginning, the direction of the early melt flow is determined by the sign of the
FIGURE 9. Chemocapillary flow in a laser pulse-produced melt in an aluminum film (90 nm), imaged by short-exposure transmission microscopy. Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse and is indicated at the upper left corners. The applied laser pulse (15 ns, 3.5 µJ) produced a hole.
FIGURE 10. Solidified melt pools in the same gold film (65 nm), showing that melt flow after one laser pulse is opposite that after two successive pulses. (a) Transmission microscope image of the solidified melt after one laser pulse of 1.6 µJ. The melt piled up at its center. (b) Structure after two successive pulses 4 µs apart and having the same energy as in (a). The melt solidified after the first pulse and piled up at its periphery after the second melting pulse. The melts solidified about 1 µs after a laser pulse.
thermal coefficient $\partial\gamma/\partial T$ alone. If now the maximum temperature produced by the laser pulse is below $T_0$ in Fig. 11, the thermocapillarity coefficient $\partial\gamma/\partial T$ is positive everywhere, and the liquid contracts in the negative thermal gradient according to Eq. (4). If, however, the maximum temperature of the liquid (which is at the center of the melt pool) exceeds $T_0$, then $\partial\gamma/\partial T$ is negative up to some radius $r_0$, where the local temperature coincides with $T_0$, and positive beyond $r_0$ up to the solid rim. Now the liquid experiences an outward thermocapillary drag at the center up to a radius $r_0$, and an inward shear stress beyond $r_0$. The melt starts to deplete at the center and at the periphery and produces a ring-shaped bulge somewhere in between (Fig. 6).
FIGURE 11. Typical dependence of the surface tension $\gamma$ of a metal on temperature and atomic fraction X of surface-active impurities in the bulk liquid. $T_m$ and $T_c$ are the melting and critical temperatures, respectively. With growing X a maximum of $\gamma$ appears at $T_0$.
The appearance of a circular trench in aluminum films at higher temperatures (Fig. 9) cannot be explained as above with a positive thermocapillary coefficient $\partial\gamma/\partial T > 0$, as the temperature after the laser pulses used is too high ($T > T_0$ at the center). Instead, chemocapillarity presumably is operating. The surface-active oxygen atoms, stemming from the disintegrated native oxide, are evaporated from the center (which is hottest) to a large extent during the laser pulse (see also Fig. 16). Thus a positive gradient $\partial X/\partial r > 0$ of the oxygen concentration is produced. Since the temperature is maximum at the center of the melt, the thermal gradient is small near the center ($\partial T/\partial r \approx 0$), so that the sign of the total shear stress in Eq. (3) may become negative there and force the central zone of the melt to contract. This physical picture was substantiated by numerical simulations (Balandin et al., 1995b). The concentric ripples occurring in solidified melts produced by lower energy laser pulses (Fig. 8) cannot be explained by simple physical arguments. They certainly are not frozen capillary waves, as one might think at first. The large number of ripples would mean that they are due to a high-frequency mode, whose excitation, however, is very improbable. The formation of the observed ripples was reproduced by a numerical simulation based on the Navier-Stokes equation, comprising thermo- and chemocapillary shear stress, which assumed that the surface-active impurity atoms segregate at the moving crystallization front and accumulate in the adjacent melt (Balandin et al., 1997, 1998). These simulations give the following physical picture of the solidification process in a metal melt with surface-active impurities. As the crystallization velocity exceeds a threshold of about 6 m/s (in gold), a front wave with a width of about 1 µm is produced ahead of the moving phase boundary. It pulsates and periodically emits steps of the impurity concentration, which in turn cause
steps of the surface tension, and these in turn produce steps of the flow velocity. All these abrupt changes propagate into the melt. As the crystallization front sweeps across the agitated liquid, ripples of the observed period are in fact frozen. A front wave moving along with the phase boundary in a crystallizing germanium melt is shown in Fig. 12 (Bostanjoglo et al., 1992).

b. Ablation of Metal Films. If the deposited laser pulse energy exceeds the enthalpies of melting and evaporation, a certain amount of the hottest part of the melt will evaporate during the laser pulse. Figure 13 shows how evaporation and thermocapillarity compete in ablating an aluminum film after a pulse of medium fluence (Niedrig and Bostanjoglo, 1997). A circular trench emerges, as at lower fluences, but in addition, the liquid is removed by evaporation at the center. A hovering liquid ring remains, which collapses due to the surface tension. Simultaneously, the hole is expanded by surface tension with a velocity $v$ that can be estimated by equating the approximate change $d(2\pi r^2\gamma)$ of the surface energy and the change $v\,d(\rho D\pi r^2 v)$ of the kinetic energy:
$$v \approx \left(\frac{2\gamma}{\rho D}\right)^{1/2}. \tag{6}$$
Here $r$ is the radius of the hole and $\rho$ is the density of the liquid. Calculated and measured velocities are on the order of 100 m/s for films with a thickness of $D \approx 100$ nm. As the fluence exceeds a threshold (e.g., ≈5 J/cm² for a 90-nm Al film), ablation of the aluminum film proceeds exclusively by evaporation (Fig. 14). Liquid flow is reduced to a short radial expansion of the hole, curling up its rim and disrupting it by Rayleigh instabilities into spheres.
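The order of magnitude quoted for the hole expansion velocity can be reproduced from Eq. (6). The sketch below assumes approximate literature values for liquid aluminum (surface tension near 0.9 N/m and density near 2400 kg/m³); these numbers are not given in the text and serve only as an illustration.

```python
import math


def hole_expansion_velocity(surface_tension, density, thickness):
    """Estimate v = (2*gamma / (rho*D))**0.5 in m/s (SI units throughout)."""
    return math.sqrt(2.0 * surface_tension / (density * thickness))


if __name__ == "__main__":
    gamma_al = 0.9      # N/m, ASSUMED approximate value for liquid aluminum
    rho_al = 2400.0     # kg/m^3, ASSUMED approximate value for liquid aluminum
    thickness = 100e-9  # m, typical film thickness from the text
    v = hole_expansion_velocity(gamma_al, rho_al, thickness)
    print(f"estimated hole expansion velocity: {v:.0f} m/s")
```

With these inputs the estimate comes out slightly below 100 m/s, consistent with the order of magnitude stated above. Because the velocity scales as $D^{-1/2}$, thinner films open faster.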
FIGURE 12. Short-exposure transmission electron microscopy images showing the crystallization of a laser pulse-produced melt in a germanium film (50 nm). Exposure time was 40 ns. The time of exposure after the laser pulse is indicated at the upper right corner (∞: ≈10 s after the melting 30-ns laser pulse). Note the pileup of liquid at the moving crystallization front.
FIGURE 13. Double-frame short-exposure imaging of the ablation of an aluminum film (90 nm) by volume evaporation and thermocapillary flow, caused by a 15-ns laser pulse of 4 µJ. Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse and is given at the upper left corners of the frames. The double-frame series a-c were produced at three different regions of the same film. The final state was always a hole as in a.
At first sight, the ablation processes in Figs. 13 and 14 seem to be self-explanatory, but numerical simulations uncover some surprises (Balandin et al., 1995a; Niedrig and Bostanjoglo, 1997). The observed time scales of the ablation, by the combined action of thermocapillary flow and evaporation and by evaporation alone, require that the following two conditions hold:

1. The surface tension decreases with a constant slope of $-3 \times 10^{-4}$ N/(m·K) from the melting temperature up to ≈3000 K. This coefficient is equal to the tabulated value of pure aluminum near the melting point (933 K). Above ≈3000 K the surface tension heads with a very small coefficient of $-0.2 \times 10^{-4}$ N/(m·K) toward zero at the critical temperature of ≈8500 K.
2. Surface evaporation is marginal when aluminum is heated by nanosecond laser pulses. Instead, evaporation proceeds by volume evaporation, that is, by boiling, which is calculated to set in at ≈6000 K, assuming that nucleation of critical bubbles is homogeneous in the free-standing films.

Models that are based on equilibrium surface evaporation (the evaporation rate and pressure are given by the Hertz-Knudsen-Langmuir and Clausius-Clapeyron equations, respectively) have been advanced, for example,
FIGURE 14. Volume evaporation of an aluminum film (90 nm) by a 15-ns laser pulse of 6.5 µJ. Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse and is indicated at the upper left corners of the frames. The four double frames a-d were produced at different regions of the same film. The final state was always a hole coinciding in size with that in the 45-ns frame of d.
by Ho et al. (1995), Pronko et al. (1995), and Metev and Veiko (1998) to explain ablation of metals by short laser pulses. Although these models reproduce the ablated volume surprisingly well (Singh et al., 1990; Preuss et al., 1995), according to the above findings they cannot deal with the dynamics of evaporation of aluminum, and probably of other metals, by nanosecond laser pulses and are therefore misleading.

3. Space-Time Resolution

a. Short-Time-Exposure Bright-Field Imaging. Because each electron image point requires a minimum electron dose to be registered in a single exposure, space and time resolution are not independent. The joint resolution is limited by shot noise in the electron beam and by the detector noise. A specimen area of diameter $\Delta x$ is illuminated by $n$ electrons during an exposure time $\Delta t$. A fraction $n_i$ of the scattered electrons are passed by the objective lens aperture and produce the bright-field image. An image detector with gain $G$ delivers $n_d = G n_i$ signal electrons. Two adjacent areas of equal diameter $\Delta x$, which produce different numbers $n_{i1}$ and $n_{i2}$ of image electrons, are distinguished by the detector if the mean difference $|\overline{G n_{i1}} - \overline{G n_{i2}}|$ of the signal electrons
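exceeds the root-mean-square noise amplitude $\left(\overline{(\Delta n_{d1})^2} + \overline{(\Delta n_{d2})^2}\right)^{1/2}$ by a minimum signal/noise ratio of about 3, that is,

$$\frac{\left|\overline{G n_{i1}} - \overline{G n_{i2}}\right|}{\left(\overline{(\Delta n_{d1})^2} + \overline{(\Delta n_{d2})^2}\right)^{1/2}} \ge 3. \tag{7}$$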
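The overbar denotes the average value. The fluctuations $\Delta n_d$ of the number of detected electrons comprise shot noise $\Delta n_i$ in the beam and detector noise expressed by $\Delta G$. The mean square of $\Delta n_d$ then is

$$\overline{(\Delta n_d)^2} = \overline{G^2}\,\overline{n_i^2} - \bar{G}^2\,\bar{n}_i^2. \tag{8}$$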
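Since the shot noise obeys the Poisson distribution we have $\overline{(\Delta n_i)^2} = \bar{n}_i$ and $\overline{n_i^2} = \bar{n}_i^2 + \bar{n}_i$, which gives with $\overline{G^2} = \bar{G}^2 + \overline{(\Delta G)^2}$

$$\overline{(\Delta n_d)^2} = \bar{G}^2\,\bar{n}_i\left[1 + \frac{\overline{(\Delta G)^2}}{\bar{G}^2}\,(1 + \bar{n}_i)\right]. \tag{9}$$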
The term within the bracket is of the order of 1, as the detector gain is usually very high and the number of electrons imaging the small area $\pi(\Delta x)^2/4$ within the short time $\Delta t$ is small. Combining Eqs. (7) and (9), using the image contrast $K = |\bar{n}_{i1} - \bar{n}_{i2}|/(\bar{n}_{i1} + \bar{n}_{i2})$ of the two adjacent regions, expressing the average number of image electrons by the current density $j$ of the illuminating electrons, their charge $e$, and the average transmission factor $\varepsilon$ of the objective lens aperture, that is, $(\bar{n}_{i1} + \bar{n}_{i2})/2 = \varepsilon\pi(\Delta x)^2 j\,\Delta t/4e$, we obtain the relation between spatial resolution $\Delta x$ and exposure time $\Delta t$:

$$(\Delta x)^2\,\Delta t \ge \frac{18e}{\pi\varepsilon K^2 j}. \tag{10}$$
For a laser-driven photoelectron gun that delivers 2-mA electron pulses into an area of the object of about 30 µm in diameter, taking $\varepsilon \approx 0.1$ and $K \approx 1$, Eq. (10) gives an ultimate joint resolution of $(\Delta x)^2\,\Delta t \ge 5 \times 10^{3}$ nm² · ns. An image with an exposure time of, for example, 10 ns will have a spatial resolution of 20 nm at best. An additional limitation is imposed by electron beam heating of the specimen. Whereas the beam current density should be as high as possible to keep shot noise low, the probing electron beam should not induce any transitions in the specimen. Since heating by the illuminating electron pulse is adiabatic
at the short exposure times, the total energy $\bar{n}E$ deposited by the $\bar{n}$ electrons of the pulse must obey

$$\bar{n}E \le \pi\left(\frac{\Delta x}{2}\right)^2 D\rho c\,\Delta T, \tag{11}$$

where $E$ is the average energy loss of a beam electron, $\rho$ is the density, $c$ is the specific heat, $D$ is the thickness, and $\Delta T$ is the maximum allowed electron-induced rise of temperature of the film. Inserting $E = A\rho D$ with $A \approx 5 \times 10^{-13}$ J · cm²/g according to the Bethe stopping power formula (e.g., Reimer, 1993) and using Eq. (10), we obtain the resolution limit due to electron beam heating,

$$\Delta x \ge \left(\frac{18A}{\pi\varepsilon K^2 c\,\Delta T}\right)^{1/2}. \tag{12}$$

Taking, for example, iron and replacing the actual specific heat with its high-temperature value $c = 3k/m$ (where $k$ is the Boltzmann constant and $m$ is the atomic mass) and $\Delta T$ with the melting temperature, one gets, with $\varepsilon \approx 0.1$ and $K \approx 1$, $\Delta x \approx 3$ nm as the absolute spatial resolution in this case.
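The two limits derived above, the shot-noise limit on $(\Delta x)^2\,\Delta t$ and the beam-heating limit on $\Delta x$, can be evaluated numerically. The sketch below is a rough check under stated assumptions, not the author's calculation: the beam is taken as uniformly filling a 30-µm-diameter disk, the stopping-power constant is taken as $5 \times 10^{-13}$ J · cm²/g, and iron is described by an atomic mass of 55.845 u and a melting temperature of 1811 K. These last values are assumptions not stated in the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def shot_noise_limit_nm2_ns(pulse_current_a, spot_diameter_m, eps, contrast):
    """Lower bound on (dx)^2 * dt from shot noise, returned in nm^2 * ns."""
    # Current density, ASSUMING the beam uniformly fills a disk of the given diameter.
    j = pulse_current_a / (math.pi * (spot_diameter_m / 2.0) ** 2)  # A/m^2
    limit_m2_s = 18.0 * E_CHARGE / (math.pi * eps * contrast ** 2 * j)
    return limit_m2_s * 1e18 * 1e9  # m^2 -> nm^2, s -> ns


def heating_limit_nm(stopping_j_m2_per_kg, eps, contrast, specific_heat, delta_t_k):
    """Lower bound on dx (in nm) set by adiabatic electron beam heating."""
    dx2 = 18.0 * stopping_j_m2_per_kg / (
        math.pi * eps * contrast ** 2 * specific_heat * delta_t_k)
    return math.sqrt(dx2) * 1e9


if __name__ == "__main__":
    limit = shot_noise_limit_nm2_ns(2e-3, 30e-6, eps=0.1, contrast=1.0)
    print(f"(dx)^2 * dt >= {limit:.0f} nm^2 ns")
    print(f"dx at 10 ns exposure >= {math.sqrt(limit / 10.0):.0f} nm")

    # Iron: Dulong-Petit specific heat c = 3k/m; ASSUMED atomic mass and melting point.
    c_iron = 3.0 * K_BOLTZMANN / (55.845 * ATOMIC_MASS_UNIT)  # J/(kg K)
    a_si = 5e-13 * 1e-4 / 1e-3  # ASSUMED 5e-13 J cm^2/g converted to J m^2/kg
    dx = heating_limit_nm(a_si, 0.1, 1.0, c_iron, 1811.0)
    print(f"heating-limited dx >= {dx:.1f} nm")
```

Under these assumptions both results land within a factor of two of the figures quoted in the text; the exact values depend on how "about 30 µm" and the material constants are chosen.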
b. Streak Imaging. The time resolution $\Delta t$ of a streak image may be defined as

$$\Delta t = \frac{t_s w}{L}, \tag{13}$$

where $t_s$ is the streak period, $w$ is the width of the streak aperture, and $L$ is the streak distance, both measured in the object plane. The spatial resolution $\Delta x$ along the streak aperture is determined as for short-exposure imaging. Two adjacent rectangular areas of the specimen with width $w$ and length $\Delta x$ are distinguished by the detector within the time $\Delta t$ if their signal-to-noise ratio exceeds a minimum value of about 3. Expressing the average number of the image electrons $\bar{n}_{i1}$, $\bar{n}_{i2}$ from the two areas again by the illuminating current density $j$, that is, $(\bar{n}_{i1} + \bar{n}_{i2})/2 = \varepsilon w\,\Delta x\,j\,\Delta t/e$, we get an inequality similar to the one in the preceding section:

$$\Delta x\,\Delta t \ge \frac{9e}{2\varepsilon K^2 w j}. \tag{14}$$
For typical values $t_s = 100$ ns, $w/L \approx 0.1$, $\varepsilon \approx 0.1$, $K \approx 0.2$, $w \approx 1$ µm, and $j \approx 3$ A/cm², a one-dimensional space resolution of $\Delta x \approx 0.6$ µm is calculated for a time resolution of $\Delta t = 10$ ns. This approximately agrees with the actual resolution.
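The streak-imaging estimate can be reproduced directly from the definition of the time resolution and the space-time inequality, using the typical values just listed. The following is a minimal sketch with illustrative function names; all inputs are in SI units.

```python
E_CHARGE = 1.602176634e-19  # elementary charge, C


def streak_time_resolution(streak_period_s, slit_to_streak_ratio):
    """Time resolution dt = t_s * w / L, with w/L given as a single ratio."""
    return streak_period_s * slit_to_streak_ratio


def streak_space_resolution(dt_s, eps, contrast, slit_width_m, current_density):
    """Smallest resolvable length dx (m) along the slit for a given dt."""
    return 9.0 * E_CHARGE / (
        2.0 * eps * contrast ** 2 * slit_width_m * current_density * dt_s)


if __name__ == "__main__":
    dt = streak_time_resolution(100e-9, 0.1)       # 100-ns streak, w/L = 0.1
    dx = streak_space_resolution(dt, eps=0.1, contrast=0.2,
                                 slit_width_m=1e-6,
                                 current_density=3e4)  # 3 A/cm^2 = 3e4 A/m^2
    print(f"dt = {dt * 1e9:.0f} ns, dx >= {dx * 1e6:.2f} um")
```

The output reproduces the 10-ns time resolution and the spatial resolution of about 0.6 µm quoted above. Note the strong $K^{-2}$ dependence: a low-contrast boundary degrades the resolution far more than a modest change in current density.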
c. Image Intensity Tracking. The joint space-time resolution is derived in a similar way as before. The specimen is illuminated by an electron current of density $j$. A fraction $\varepsilon$ of the scattered electrons pass the objective lens aperture and produce a bright-field image. If $\Delta x$ is the diameter of the specimen area viewed by the scintillator/photomultiplier detector, the current picked up by the detector is $J = \varepsilon j \pi (\Delta x)^{2}/4$. The output signal current of the detector, having a gain $G \gg 1$, is $J_s = GJ$. A noise current $J_n$ with an average amplitude $(\overline{J_n^{2}})^{1/2}$ is superimposed on this signal. The noise is composed of fluctuations $\Delta G$ of the gain plus the amplified shot noise $(2eJ\,\Delta f)^{1/2}$ of the image current $J$,
$$\overline{J_n^{2}} = \overline{(\Delta G)^{2}}\,J^{2} + G^{2}\,2eJ\,\Delta f \approx G^{2}\,2eJ\,\Delta f, \tag{15}$$
where $\Delta f$ is the bandwidth of the detector and the processing electronic circuits. Because the detector is based on multiplication processes with very high gain, one has $\overline{(\Delta G)^{2}} \approx G \gg 1$, and since the image current $J$ and its average shot noise amplitude are of the same magnitude near the resolution limit, Eq. (15) simplifies as indicated. A transition producing a change $\Delta J_s$ of the signal is resolved if it exceeds the noise amplitude $(\overline{J_n^{2}})^{1/2}$ by a factor of at least 3:
$$\Delta J_s = G\,\Delta J \ge 3\left(\overline{J_n^{2}}\right)^{1/2} \approx 3G\,(2eJ\,\Delta f)^{1/2}. \tag{16}$$
Inserting $J$ and replacing the bandwidth $\Delta f$ with the minimum detectable rise/fall time $\Delta t = 0.35/\Delta f$, we get for the joint space-time resolution
$$(\Delta x)^{2}\Delta t \ge \frac{25e}{\pi\varepsilon j\,(\Delta J/J)^{2}}. \tag{17}$$
For typical values $j \approx 10$ A/cm² (from a conventional thermal tungsten hairpin gun) and $\varepsilon = 0.1$, inequality (17) states that a phase transition of, for example, 3-ns duration that produces a change $\Delta J/J \approx 1$ of the image current can be detected in specimen areas with diameters down to $\Delta x \approx 0.2$ µm.
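The numbers quoted for image intensity tracking follow directly from inequality (17). A minimal sketch, assuming SI units throughout and a hypothetical function name:

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C


def tracking_spatial_resolution(dt, eps, j, rel_change):
    """Smallest probed diameter (m) from (dx)^2 dt >= 25 e / (pi eps j (dJ/J)^2)."""
    return math.sqrt(25.0 * E_CHARGE / (math.pi * eps * j * rel_change**2 * dt))


dx = tracking_spatial_resolution(dt=3e-9,        # 3-ns transition
                                 eps=0.1,
                                 j=1e5,           # 10 A/cm^2 in A/m^2
                                 rel_change=1.0)  # dJ/J of about 1
print(f"dx >= {dx * 1e6:.2f} um")
```

The bound scales as $1/(\Delta J/J)$, so a transition that changes the image current by only 10% needs a tenfold larger probed diameter at the same time resolution.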
B. Flash Photoelectron Microscopy
Any electrons released from a surface, for example, by ion, electron, or photon bombardment, by heating, or by high electric fields, can be used to image the surface. Photoelectrons ejected by laser pulses are particularly suited for short-exposure imaging, as
• high electron current densities can be produced without damaging the specimen
• the moment of exposure can be freely chosen.
For decades photoelectron microscopy has been used as a powerful surface-imaging technique. Very different material properties have been characterized:
• crystal texture and defects (Möllenstedt and Lenz, 1963; Engel, 1966; Griffith and Rempfer, 1987; Griffith et al., 1991)
• chemical reactions and pattern formation (Rotermund et al., 1991; Engel et al., 1991; Ehsasi et al., 1993)
• p-n junctions, metal leads, and surface states on semiconductor devices (Ninomiya and Hasegawa, 1995; Giesen et al., 1997)
• surface diffusion (von Oertzen et al., 1992)
• biological tissue (Griffith, 1986; De Stasio et al., 1998).
The spatial resolution of photoelectron microscopy and related techniques, such as low-energy and mirror electron microscopy, was discussed, for example, by Möllenstedt and Lenz (1963) and by Rempfer and Griffith (1992).
Photoelectrons are emitted after single or multiphoton absorption. The former requires that the photon energy $hf$ (where $h$ is the Planck constant and $f$ is the frequency of the light) exceed the bond energy of the electron, that is, $hf > W_A$ for a metal with a work function $W_A$. At nonzero temperatures thermally excited electrons can be emitted by lower energy photons. Two-photon absorption, the simplest multiphoton process, produces a photoelectron by the simultaneous absorption of two photons. If they have equal frequencies, their quantum energy must exceed only $W_A/2$. However, the intensity of the light must be so high that on average two photons interact with an electron within a time $h/W_A$ according to the uncertainty relation. Because the absorption cross section is about $\sigma \approx 10^{-16}$ cm², intensities of at least $W_A^{2}/(h\sigma) \approx 10^{13}$ W/cm² are required for metals with $W_A \approx 4$ eV. Such high light intensities can be produced by laser pulses, but they inevitably damage most metals unless femtosecond pulses are used. Unfortunately, these ultrashort pulses produce far too few electrons per pulse for a short-exposure image with an acceptable signal-to-noise ratio (at fluences below the damage threshold), so single-photon absorption has been exclusively exploited for photoelectron microscopy.
The contrast is determined mainly by the local yield of photoelectrons, which depends on the true local work function, on the local thickness of any dielectric (oxide) coating films that may be present, and on local variations of the electric field caused by surface geometry and by adsorbed molecules with high electric polarizability or with a permanent electric dipole.
Such adsorbed molecules, for example, water molecules, may enhance the photoelectron emission by more than one order of magnitude (Buzulutskov et al., 1997). All these effects merge to produce an effective work function $W_A$ with a local variation $\Delta W_A$. Since the density of the photoelectron current (induced by one-photon absorption) is
$$j = \mathrm{const}\,(hf - W_A)^{n}, \tag{18}$$
with $n$ a positive constant, the contrast becomes
$$\frac{|\Delta j|}{j} = \frac{n\,\Delta W_A}{hf - W_A}. \tag{19}$$
The contrast increases sharply as the photon energy approaches the work function, whereas the quantum efficiency decreases to zero and the photoelectron image is disguised by shot noise. For this reason the illuminating photons should have a large quantum energy. Most metals of technical interest have work functions around 4 eV, so photons with $hf \approx 5$ eV are a good compromise between contrast and shot noise. Short-exposure photoelectron imaging is most easily realized by illuminating the specimen with an ultraviolet laser pulse. Suitable lasers are frequency-multiplied solid-state and excimer lasers. The latter are preferred because of their smaller coherence length, which helps prevent disturbing interference patterns in the image. A good choice is the KrF laser (wavelength 248 nm, $hf = 5.0$ eV).
1. Instrument for Short-Exposure Imaging
All previous photoelectron microscopes had a time resolution limited to several milliseconds. Releasing the photoelectrons with a pulse from an excimer laser having a short coherence length, and carefully avoiding parasitic reflections that cause interference patterns, improved the time resolution to a few nanoseconds. Figure 15 schematically shows the assembled flash photoelectron microscope that can image nonrepetitive changes of a surface on the nanosecond time scale (Bostanjoglo and Weingärtner, 1997; Weingärtner and Bostanjoglo, 1998). The specimen is at a high negative potential ($-25$ to $-30$ kV). Imaging photoelectrons are released by a 4-ns (FWHM) pulse from a KrF excimer laser. The fluence of the ultraviolet pulse is kept so low that the surface is not damaged. The photoelectrons are accelerated with a field of 5-8 kV/mm toward a grounded stainless steel anode. They are focused by an electrostatic einzel lens to an intermediate image that is projected by a magnetic lens onto a fiber plate transmission screen. The converted electron image is picked up with a fiber-coupled MCP image intensifier plus a CCD camera, digitized by a frame grabber, and stored in computer memory. A home-built trigger circuit allows us to make an “exposure” at any time relative to the processing visible laser pulse (wavelength 532 nm or 620 nm). The aperture in the back focal plane of the electrostatic lens decreases the angular and energy spread of the imaging electrons and therefore makes geometrical modulations of the surface visible and increases the spatial resolution (Boersch, 1943; Möllenstedt and Lenz, 1963).
FIGURE 15. Flash photoelectron microscope with attached lasers for treating the specimen.
Two adjustable aluminum mirrors, which are fixed at the anode, direct the illuminating ultraviolet and the processing visible laser pulses onto the specimen. A beam blanker passes electrons to the detector for 5 ns only during the ultraviolet laser pulse. The beam blanker consists of a low-impedance parallel plate capacitor that normally deflects the electrons beyond the intercepting aperture in the back focal plane of the electrostatic einzel lens and is switched by an avalanche transistor-based cable pulser. In this way disturbing long-lasting thermal and delayed ion-induced secondary electrons are kept away from the image. Their contribution to the image during the acquisition time (i.e., “exposure”) is negligible if the fluence of the processing laser pulse is not excessive. The specimen can be heated by electron bombardment from the backside for cleaning purposes. The investigated fast processes were launched in the specimen by a focused pulse either from a Q-switched frequency-doubled Nd:YAG laser (pulse width 10 ns, wavelength 532 nm) or from a colliding pulse mode-locked dye laser (pulse width 100 fs, wavelength 620 nm). The laser beams were focused on the specimen to a spot with a $1/e^{2}$ diameter of 15 µm and 50 µm for the nano- and femtosecond pulses, respectively.
For a controlled positioning of the processing laser beam, the specimen is illuminated with white light and imaged with reflected and scattered radiation. The accelerating voltage can be cut off and the specimen grounded within 20 ns with a fast switch consisting of cascaded transistors. In this way a laser-induced electric breakdown is prevented by interrupting the avalanche buildup. Of course, this technique is successful only if the breakdown is delayed by more than the fall time of the switch (20 ns) plus the acquisition time for the image (5 ns).
2. Applications
Because photoelectron emission reflects the bonding of surface electrons, pulsed photoelectron microscopy is an excellent method for imaging local chemical reactions. Figure 16 shows as an example the reaction induced by a nanosecond laser pulse in aluminum covered with its native oxide (thickness $D \approx 3$-4 nm). The fluence was high enough to melt the surface of the metal but too low for appreciable evaporation of metal atoms (as no flashover was initiated). The photoelectron emission at first decreases during the 5-10 ns after the laser pulse and then increases considerably, saturating after about 100 ns. If the surface is exposed to air, the photoelectron emission returns to the low value of the untreated material. It is well known that liquid aluminum decomposes aluminum oxide Al₂O₃, producing a volatile suboxide, Al₂O (Champion et al., 1969). Now, the dielectric native oxide coating reduces the number of ejected photoelectrons, as only a fraction $\exp(-D/L)$ is transmitted, the mean free path of the photoelectrons in the oxide being $L \approx 1$-3 nm (Buzulutskov et al., 1997). As the oxide coating disintegrates after the laser pulse, the photoelectron yield of course increases, and its rise time reflects the time it takes to decompose the oxide and evaporate the products from the melt. There remains the puzzling early decrease of the photoelectron emission. Such a decrease was observed with all metal surfaces that were not cleaned by electron beam heating prior to the laser treatment. The decrease is therefore probably due to the removal of adsorbed polar molecules (e.g., water molecules), which add their dipole field to the cathode field, decreasing the work function by $e p n/\varepsilon_0$ (where $p$ and $n$ are the dipole moment and surface density of the adsorbed molecules, respectively, and $\varepsilon_0$ is the vacuum permittivity).
FIGURE 16. Photoelectron images of an aluminum film (100 nm) on (100) silicon, showing the removal of the native oxide covering layer by a laser pulse (10 ns, 6 µJ, 20 µm ∅). Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse and is given at the upper right corners of the images (∞ corresponds to 10 s after the pulse). The images were produced at previously untreated neighboring regions with equal laser pulses.
A particular benefit of photoelectron microscopy is that only the topmost layers of a specimen are probed, so it is particularly suited for uncovering incubation effects and early stages of radiation-induced material modifications. For example, in laser microprocessing, flash photoelectron microscopy may be applied to visualize effects produced by nano- and femtosecond laser pulses with fluences near the ablation threshold. These two pulse lengths are, respectively, much longer and much shorter than the electron/lattice relaxation time, which is some picoseconds for typical metals (e.g., Elsayed-Ali et al., 1987). The laser pulse energy is primarily absorbed by the electrons. In the case of a nanosecond pulse the electrons are practically in equilibrium with the atomic lattice, and the laser power is fed directly to it, gradually destabilizing it by ordinary heating. In contrast, in the case of a femtosecond pulse, the laser pulse energy is almost totally absorbed first by electrons, exciting them to high levels, which destabilizes the atomic lattice.
A metal is destabilized by the high pressure of the hot conduction electron gas, whereas bonds in semiconductors are weakened as the valence electrons are excited into the conduction band (Stampfli and Bennemann, 1992). If the electron excitation is high enough, the lattice will collapse. At lower fluences a destabilized lattice is produced that starts to sink the energy of the electrons either by mechanical work or by exchange of heat (Stampfli and Bennemann, 1992). Since the atomic lattice occupies a very different state when it sinks the energy of a nano- or a femtosecond laser pulse, respectively, its response on the thermodynamic time scale (some picoseconds and longer) is expected to be quite different. Both metals and semiconductors were in fact observed to respond in different ways to nano- and femtosecond pulses (Weingärtner et al., 1999).
FIGURE 17. Photoelectron images of (100) silicon with a native oxide covering layer 3 nm thick, showing the completely different responses to a 10-ns (a) and to a 100-fs (b) laser pulse. Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse and is given at the upper right corner of the images (∞ corresponds to 10 s after the pulse). The images were produced at previously untreated neighboring regions. The energy was ≈6 µJ for the nanosecond and ≈0.9 µJ for the femtosecond pulses.
FIGURE 18. Typical smooth, flat crater produced by a 10-ns laser pulse (≈5 µJ) on (100) silicon with native oxide and imaged by scanning electron microscopy with secondary electrons.
Figure 17 shows the completely different effects produced by a 10-ns and a 100-fs laser pulse on (100) silicon with a native oxide layer (thickness ≈3 nm). The nanosecond pulse causes the silicon surface to melt, as is substantiated by the final smooth craterlike structure (Fig. 18). Photoelectron emission rises while the silicon surface is molten and remains high until the melt solidifies 100-200 ns after the laser pulse. Freezing is accompanied by a slight decrease of photoemission. Exposure of the surface to air returns the photoelectron yield to the low value of the untreated silicon. Obviously, the oxide coating is decomposed as the laser pulse melts the silicon surface, and the photoelectrons can escape from the liquid without crossing a solid coating. As the liquid silicon solidifies, oxygen atoms that were dissolved in the melt are segregated at the surface, and a covering oxide layer is regrown. However, this layer is thinner than the original one, as part of the oxygen atoms were evaporated, and the photoemission is higher after the laser pulse. Thus,
a nanosecond laser pulse effects a partial removal of the oxide from silicon by decomposition, transient storage of some oxygen dissolved in the melt, and regrowth of a thinner coating within 200 ns. This partial cleaning by a melting nanosecond laser pulse, but not the time scale of the process, was previously documented by Auger spectroscopy (Larciprete et al., 1996). The pileup of the melt, freezing at the periphery after a nanosecond pulse (Fig. 18), is not caused by recoil pressure from evaporating atoms. Evaporation is marginal, as no flashover occurs. Because very similar final structures are produced on (100) silicon without an oxide coating, the craterlike distribution of the melt is not affected by chemocapillarity but must be caused by thermocapillary forces.
The 100-fs laser pulse has a very different effect on (100) silicon covered by a native oxide (Figs. 17b, 19). The photoelectron yield is heavily reduced during ≈100 ns after the laser pulse within the laser spot. A small, irregular zone with increased photoelectron emission develops from the dark area. The final structure consists of a weakly corrugated surface that is barely visible in the scanning electron microscope (Fig. 19). It is invisible to light-optical microscopy, even to such surface-sensitive techniques as dark-field and interference microscopy. The transient phase produced by a femtosecond laser pulse on an oxide-coated silicon surface has a very low photoelectron yield and effectively suppresses evaporation of silicon atoms. The phase is probably a foam consisting of oxygen from disintegrated oxide mixed with liquid silicon. This foam settles to a blistered surface with partially removed oxide after ≈100 ns. If a femtosecond laser pulse of equal fluence is applied to a silicon surface having no oxide layer, heavy ablation occurs, leading to an electrical breakdown if the high voltage is not switched off within 50 ns.
FIGURE 19. Typical rough patch produced by a 100-fs laser pulse (≈0.9 µJ) on (100) silicon with native oxide and imaged by scanning electron microscopy with secondary electrons at grazing incidence (80° against the normal of the surface).
The response of a metal covered by a transparent oxide to a nanosecond laser pulse depends on the thermal stability of the oxide. The oxide is either thermally destroyed or decomposed by the liquid metal, or the oxide is stable at the melting temperature of the metal, as in the case of cobalt oxide CoO on cobalt. The coating oxide may then increase in thickness after a nanosecond laser pulse, which melts the metal, by gathering oxygen atoms originally dissolved in the crystal. These atoms are abundant in the liquid after the crystal is molten, and segregate at the floating oxide as the melt freezes again. This scenario explains the decrease of photoelectron emission of nanosecond laser-treated cobalt during cooldown (Fig. 20a). The reduction of the photoelectron yield was not caused by desorption of adsorbed polar molecules (e.g., water), as adsorbed layers were removed by electron beam heating.
A femtosecond pulse with a fluence similar to that of the chemically active nanosecond pulse typically produces dark lines within a crystal (Fig. 20b), which probably are slip lines, bundles of stacking faults, or grain boundaries. There is a transient increase of the electron emission during 20 ns after the laser pulse, where the linear crystal defect appears. This emission also occurs without photostimulation. Melting does not occur within the laser spot, as the crystals remain visible, so the actual temperature is too low to account for the electron emission as thermal emission. A nanosecond laser pulse with a fluence high enough to melt the treated metal starts chemical reactions between the metal and a coating oxide. In contrast, when the same metal is treated by a femtosecond pulse of equal fluence (additionally being below the threshold for ablation), it experiences plastic deformations that proceed on the nanosecond time scale and that are accompanied by emission of exoelectrons.
FIGURE 20. Photoelectron images showing the completely different responses of cobalt to a 10-ns (a) and to a 100-fs (b) laser pulse (fluence ≈ … J/cm²). Exposure time was 5 ns. The moment of exposure is counted from the peak of the laser pulse. The arrow in a2 shows the fast-shrinking zone with unimpeded photoelectron emission in the solidifying melt. The arrow in b3 shows the crystal defect produced by the 100-fs laser pulse (already visible in b2).
3. Limits
Flash photoelectron microscopy is subject to the usual limitations on resolution that originate from lens aberrations and shot noise and that likewise limit other imaging techniques. Additional constraints arise because the specimen is located in a high electric field.
a. Limits of the Resolution. The space-time resolution is restricted by the aberration of the uncorrected accelerating field at the specimen, by the space charge of the imaging electrons, and by their shot noise. If we assume all lenses except the cathode lens to be ideal (which is a good approximation), the spatial resolution $\Delta x_L$ is then that of the two-electrode cathode lens used, which is given by Möllenstedt and Lenz (1963) as
$$\Delta x_L = 1.2\,\frac{\Delta E}{eF}, \tag{20}$$
where $\Delta E$ is the energy spread of the photoelectrons, and $F$ is the electric field at the specimen. The space charge produced by the photoelectrons reduces the applied accelerating field and blurs the image. There exists no simple relation between the resolution and the electron current density. The actual blurring is considerably larger than predicted by model calculations (Massey et al., 1981; Massey, 1983). Nevertheless, space charge effects can be neglected if the current density $j_p$ of the photoelectrons is less than the space charge-limited Child current density $j_{Ch}$ by one order of magnitude:
$$j_p < \frac{j_{Ch}}{10} = \frac{C F^{3/2}}{10\,a^{1/2}}, \tag{21}$$
where $C = 2.34 \times 10^{-6}$ A V$^{-3/2}$ and $a$ is the spacing of the two accelerating electrodes of the cathode lens. The joint space-time resolution, limited by shot noise, is given by Eq. (10) with $j$ replaced by the current density $j_p$ of the emitted photoelectrons,
$$(\Delta x_N)^{2}\Delta t \ge \frac{18e}{\pi\varepsilon K^{2} j_p}. \tag{22}$$
Combining the inequalities (21) and (22), we find that the space-time resolution, limited by the combined action of shot noise and space charge, obeys
$$(\Delta x_N)^{2}\Delta t \ge \frac{180\,e\,a^{1/2}}{\pi\varepsilon K^{2} C F^{3/2}}. \tag{23}$$
The spatial resolution is improved by reducing the distance $a$ between the electrodes and by increasing the accelerating field $F$. The former is limited to $a > 3$ mm to provide convenient access for the laser beams, whereas the electric field should not exceed a safe value of ≈10 kV/mm. Using these limits and $\varepsilon \approx 0.1$, $K \approx 0.2$, $\Delta t = 5$ ns, and $\Delta E \approx 1.5$ eV, we calculate a spatial resolution of $\Delta x_L + \Delta x_N \approx 0.8$ µm.
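The combined estimate can be retraced step by step. The sketch below assumes the limiting values named in the text ($a = 3$ mm, $F = 10$ kV/mm) and takes the photoelectron current density at its largest allowed value, one tenth of the Child current density; function names are hypothetical and all quantities are in SI units.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
CHILD_C = 2.34e-6           # Child-law constant, A V^-3/2


def max_photocurrent_density(F, a):
    """One tenth of the Child current density, C F^(3/2) / (10 a^(1/2)), in A/m^2."""
    return CHILD_C * F**1.5 / (10.0 * math.sqrt(a))


def shot_noise_resolution(j_p, dt, eps, K):
    """dx_N from (dx_N)^2 dt >= 18 e / (pi eps K^2 j_p)."""
    return math.sqrt(18.0 * E_CHARGE / (math.pi * eps * K**2 * j_p * dt))


def cathode_lens_resolution(delta_E_eV, F):
    """dx_L = 1.2 dE / (e F); with dE given in eV the charge e cancels."""
    return 1.2 * delta_E_eV / F


F = 1e7      # 10 kV/mm in V/m
a = 3e-3     # electrode spacing, m
j_p = max_photocurrent_density(F, a)
dx_N = shot_noise_resolution(j_p, dt=5e-9, eps=0.1, K=0.2)
dx_L = cathode_lens_resolution(1.5, F)
print(f"j_p = {j_p / 1e4:.1f} A/cm^2")
print(f"dx_L + dx_N = {(dx_L + dx_N) * 1e6:.2f} um")
```

With these inputs the shot-noise term dominates the lens term, which is why a higher field $F$ helps twice: it raises the allowed current density and reduces the cathode lens aberration.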
b. Limitation of the Laser Treatment. In-situ material processing by laser is constrained by the requirement that thermal electron emission and evaporation not interfere with photoelectron imaging. Heating by the treating laser pulse must be such that the current density $j_{th}$ of the thermal electrons stays below that of the photoelectrons $j_p$, that is,
$$j_{th} = A T^{2}\exp\left(-\frac{W_A}{kT}\right) < j_p < \frac{C F^{3/2}}{10\,a^{1/2}}. \tag{24}$$
Inequality (21) is applied, and the Richardson-Dushman expression is used for the current density of the thermal electron emission, where $A \le 120$ A/(cm² K²), $k$ is the Boltzmann constant, and $T$ is the absolute temperature. Inserting the values for electrode spacing ($a = 5$ mm) and electric field ($F = 5 \times 10^{6}$ V/m), we calculate maximum allowed temperatures of 2400-2800 K for metals with work functions in the range 3.6-4.5 eV. Pulsed photoelectron microscopy can be applied up to and even above the melting temperature of most materials without interference from thermionic emission, as was actually observed.
The treating laser pulse also causes ablation of the specimen, and its fluence must be kept low enough that formation of a laser-induced plasma is avoided. However, even if the laser pulse produces only neutral atoms, these are ionized by the thermal and photoinduced electrons, which gain abundant energy in the accelerating field. These ions may cause troubling secondary electrons. Photoionization can be neglected, as at least two photons of the quantum energies used must be absorbed for ionization of free atoms, and two-photon processes are very improbable at the restricted fluences. In fact, photoionization was not observed. The number of ions $n_i$ produced by electron collisions during the imaging time $\Delta t$ is estimated to be
$$n_i = n\,\sigma_i\,\frac{(j_p + j_{th})\,\Delta t}{e}, \tag{25}$$
where $n$ is the number of evaporated atoms (during the imaging time), and $\sigma_i$ is the ionization cross section averaged over the electron energies. The positive ions are accelerated toward the specimen, which is at a negative potential, and release $\eta n_i$ electrons (where $\eta$ is the secondary electron yield).
The number of these secondary electrons must stay below the number of the imaging photoelectrons. This requirement and inequality (24), $j_{th} < j_p$, limit the allowed number $n_l$ of evaporated atomic layers (during the imaging time) according to
$$n_l < \frac{d^{2}}{2\eta\sigma_i}, \tag{26}$$
where $d^{2}$ is the area per atom within the processed surface. If relevant values are inserted ($\sigma_i \approx 10^{-20}$ m², $d^{2} \approx 6 \times 10^{-20}$ m², $\eta \approx 10$), inequality (26) requires that less than one-third of a monolayer be evaporated during imaging, so that ion-induced secondary electrons can be neglected. Up to several 100 K above the melting temperature, the vapor pressure of most metals is too low for one atomic layer to be evaporated during the imaging time of 5 ns. Accordingly, most metals can be pulse-melted without disturbing photoelectron imaging, but adsorbates and oxides that decompose can be a problem in short-exposure imaging.
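The temperature bound implied by inequality (24) has no closed-form solution, but it is easily found numerically. The following sketch solves $A T^{2}\exp(-W_A/kT) = C F^{3/2}/(10 a^{1/2})$ by bisection, assuming the upper-bound Richardson constant $A = 120$ A/(cm² K²); the function names and the bracketing interval are assumptions made for illustration. With this largest $A$ the bound comes out somewhat above 2000 K, and smaller values of $A$ raise it.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant, eV/K
CHILD_C = 2.34e-6         # Child-law constant, A V^-3/2
RICHARDSON_A = 1.2e6      # 120 A/(cm^2 K^2) expressed in A/(m^2 K^2)


def max_photocurrent_density(F, a):
    """Right-hand side of inequality (24): C F^(3/2) / (10 a^(1/2)), in A/m^2."""
    return CHILD_C * F**1.5 / (10.0 * math.sqrt(a))


def thermionic_current_density(T, W_eV):
    """Richardson-Dushman current density A T^2 exp(-W / kT), in A/m^2."""
    return RICHARDSON_A * T**2 * math.exp(-W_eV / (K_B_EV * T))


def max_allowed_temperature(W_eV, j_max, T_lo=500.0, T_hi=5000.0):
    """Bisection for the temperature at which thermionic emission reaches j_max.

    The emission grows monotonically with T, so one sign change is bracketed.
    """
    for _ in range(60):
        T_mid = 0.5 * (T_lo + T_hi)
        if thermionic_current_density(T_mid, W_eV) < j_max:
            T_lo = T_mid
        else:
            T_hi = T_mid
    return 0.5 * (T_lo + T_hi)


j_max = max_photocurrent_density(F=5e6, a=5e-3)
T_low_W = max_allowed_temperature(3.6, j_max)
T_high_W = max_allowed_temperature(4.5, j_max)
print(f"j_max = {j_max / 1e4:.1f} A/cm^2")
print(f"T_max = {T_low_W:.0f} K (W = 3.6 eV), {T_high_W:.0f} K (W = 4.5 eV)")
```

Because of the exponential, the result depends only logarithmically on $A$ and on the allowed current density, so the bound is robust against uncertainties in both.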
C. Pulsed High-Energy Reflection Electron Microscopy
In high-energy reflection electron microscopy the surface of a bulk specimen is illuminated by a collimated electron beam at grazing incidence, and specularly scattered electrons are used to image the surface. Reflection electron microscopy was invented by Ruska (1933), who exploited electrons scattered by 90°, however. Von Borries (1940) decisively improved the technique with respect to chromatic aberration and image intensity by using glancing-incidence illumination and electrons scattered into low angles to image the surface. Reflection microscopy was abandoned with the advent of the scanning electron microscope. It was revived, however, in the early 1980s. With improved electron optics and on-axis dark-field imaging with Bragg-reflected “lossless” electrons, resolution was driven to the atomic scale. Prominent applications of the technique have included the imaging of reconstructing single crystal surfaces (Tanishiro et al., 1983), of atomic steps (Cowley and Peng, 1985), of structures of submonolayer deposits on silicon surfaces (Osakabe et al., 1980), and of surface migration of atoms (Yamanoka and Yagi, 1989). Yagi (1993) reviewed techniques and studies of surface structures and slow dynamic processes.
Despite its enormous potential as a surface probe, reflection microscopy based on Bragg diffraction is not very suitable for short-exposure imaging of the surface. Usually, only a small fraction of the electrons are passed by the objective lens aperture, and the image is buried beneath shot noise. Bright-field imaging with grazing incident and exit angles is a more promising technique, but a considerable disadvantage is the almost one-dimensional image of the surface. However, this technique is the only one that visualizes the space above the surface, and since the specimen is at ground potential (in contrast with emission microscopy), evaporation and plasma formation are accessible to investigations.
Figure 21 shows a reflection electron microscope for short-exposure imaging of laser-induced processes (Bostanjoglo and Heinricht, 1990). The setup is similar to that of the transmission microscope in Fig. 1, except for the electron illumination system, which can be tilted against the specimen, and some minor deviations. A laser-driven thermionic electron gun is used, which delivers one intense electron pulse, allowing the acquisition of one short-time (20 ns) exposure image.
FIGURE 21. Pulsed reflection electron microscope. (1) Laser pulse-driven thermal electron gun, (2)-(5) as in Fig. 1, (6) fiberplate transmission phosphor screen, (7) MCP image intensifier, (8) CCD sensor.
The bulk specimen can be rotated about an axis that is orthogonal to the electron and the treating laser beam. Incident and exit angles of the electrons are about 5°, as measured against the surface. Because of these grazing angles the image of a geometrical structure on the surface is extremely shortened in the direction of the incident electrons. A laser-produced circular crater appears as a very slender ellipse. Any particle ejected from the laser-processed region has two images that appear symmetrically to the slender image of the eroded crater (Fig. 22). The two images are due to the absorption of incident and reflected (at the surface) electrons, respectively.
FIGURE 22. Generation of the double image of a shadow-casting particle above a plane specimen in the reflection electron microscope. e⁻: illuminating electron beam.
The reflection microscope was used to visualize the evaporation of semiconductors and the ablation of metal films on semiconductors (Bostanjoglo and Heinricht, 1990; Heinricht and Bostanjoglo, 1992). Figure 23 shows, for example, the detachment of a gold film from a silicon wafer by a low-energy laser pulse that melted only the metal. The film was produced by evaporation on a silicon surface covered by native oxide and adsorbed molecules from the ambient atmosphere. As the gold film was melted by the laser pulse, the adsorbed layers evaporated and lifted the liquid film within 340 ns after the laser pulse. About 300 ns later the liquid had separated from the wafer and contracted to a drop, which was driven back to the substrate by electrostatic forces (Fig. 24). Such processes occur whenever a light-absorbing coating produces a nonwetting liquid film on the substrate. Separation of the liquid may be due to true nonwetting or to an isolating gas produced by desorbed molecules or to volatile products from a disintegrated oxide. Laser-based cleaning and restoration methods rely on these and similar ablation effects.
The joint spatial ($\Delta x$) and time ($\Delta t$) resolution of pulsed reflection microscopy is determined mainly by shot noise of the imaging electrons and is derived as for transmission microscopy. The resolution is given by a relation identical in form to inequality (10):
$$(\Delta x)^{2}\Delta t \ge \frac{18e}{\pi\varepsilon K^{2} j}. \tag{27}$$
FIGURE 23. Short-exposure reflection electron images showing the liftoff process of a laser-pulsed 100-nm gold film on a silicon wafer (panels: 50 ns, 190 ns, 340 ns, 490 ns, 640 ns, final structure). Exposure time was 20 ns. The moment of exposure is counted from the peak of the laser pulse and is given below the images. These were produced at neighboring previously untreated regions with equal laser pulses with an energy and fluence of 1.3 µJ and 0.6 MW/cm², respectively.
FIGURE 24. Scanning electron image of the rest of the gold film after the liftoff process shown in Fig. 23.
As before, $j$ is the current density of the electrons illuminating the specimen for a time $\Delta t$, $K$ is the contrast between two distinct adjacent areas with diameter $\Delta x$, and $\varepsilon$ is the fraction of electrons passing the aperture of the objective lens. The difference between this relation and that for bright-field transmission microscopy is in the physical meaning of the passed fraction of electrons, which here are all scattered electrons. In contrast, the fraction $\varepsilon$ in inequality (10) contains mostly unscattered electrons for not too thick films and is therefore much larger.
Assuming values of the parameters typical for the assembled pulsed reflection microscope ($j \approx 80$ A/cm², $\varepsilon \approx 10^{-3}$, $\Delta t \approx 20$ ns) and choosing as a specimen an opaque shadow-casting particle on an ideally flat surface, that is, $K = 1$, one obtains a spatial resolution perpendicular to the electron beam of $\Delta x \approx 0.3$ µm. This is on the order of what actually was achieved. The resolution along the direction of the electron beam is $\Delta x/\sin\alpha$, with $\alpha \approx 5°$, the angle that the illuminating electron beam makes with the imaged surface.
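The reflection-mode estimate, including the foreshortening along the beam direction, can be reproduced as follows. This is a sketch with hypothetical function names, using the typical parameters quoted above converted to SI units.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C


def shot_noise_resolution(j, dt, eps, K):
    """dx from (dx)^2 dt >= 18 e / (pi eps K^2 j), all quantities in SI units."""
    return math.sqrt(18.0 * E_CHARGE / (math.pi * eps * K**2 * j * dt))


def resolution_along_beam(dx, alpha_deg):
    """Resolution along the beam direction, dx / sin(alpha), for grazing angle alpha."""
    return dx / math.sin(math.radians(alpha_deg))


dx_perp = shot_noise_resolution(j=8e5,      # 80 A/cm^2 in A/m^2
                                dt=20e-9,   # 20-ns exposure
                                eps=1e-3,   # fraction passing the aperture
                                K=1.0)      # opaque particle on a flat surface
dx_par = resolution_along_beam(dx_perp, alpha_deg=5.0)
print(f"perpendicular: {dx_perp * 1e6:.2f} um, along beam: {dx_par * 1e6:.1f} um")
```

At a grazing angle of 5° the resolution along the beam is worse by a factor of about 11, which is the quantitative content of the "almost one-dimensional" image mentioned earlier.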
IV. CONCLUDING REMARKS

Electron microscopy is an indispensable method for characterizing and analyzing materials down to the atomic scale. A very useful application is in-situ investigation, allowing imaging of the dynamics of miscellaneous processes. The time scale of three types of electron microscopes was pushed down to a few nanoseconds for nonrepetitive processes by implementing a high-current laser pulse-driven thermal- and photoelectron gun, fast electron beam shifting, and electronic image registration. The extended electron microscopes were of the transmission, photoemission, and reflection type, giving access to the volume of the specimen, its surface, and the space above it, respectively. Three complementary high-speed techniques were realized: double-frame imaging, streak imaging, and image intensity tracking. The potential of the new time-resolving probes was demonstrated by tracing such fast laser-triggered effects as phase transitions, melt instabilities, chemical reactions, and mechanical deformations. Melt flow driven by large thermal and compositional gradients, evaporation of superheated liquid metals, and decomposition and precipitation of oxide surface layers were investigated. High-speed electron microscopy has uncovered effects to which conventional light-optical methods have no easy access. Femtosecond laser pulses, depositing their energy in the electronic system, which then destabilizes the atomic lattice, were found to produce extraordinary effects on a “thermodynamic” (nanosecond) time scale. These effects were completely different from those initiated on the same time scale by the exclusively “thermal” nanosecond laser pulses. Modeling the dynamics visualized by transmission microscopy with computer-based numerical simulations allows one to extract material parameters relevant at temperatures up to the critical point, at thermal gradients up to several $10^3$ K/μm, and at stresses up to the theoretical yield point.
Photoelectron microscopy was found to be well suited for uncovering early stages of material modifications where rival high-speed light-optical methods fail because of lacking contrast. The resolution of the described high-speed microscopes is limited at present by shot noise in the electron image to several hundred nanometers and a few nanoseconds for nonrepetitive processes. A higher space-time resolution can be reached only if the buildup of negative space charge at the electron emitters is reduced. Brighter electron guns would improve the resolution of transmission microscopy. They might be realized by locally increasing the electric field with suitably corrugated emitters or perhaps by exploiting the very high electric fields of ultrashort laser pulses in completely new designs. Adverse space-charge effects at the surface of specimens in the photoelectron microscope could be overcome by pulsing the accelerating voltage. Voltage levels significantly above the presently used safe dc value could be applied during the short imaging time without causing an electric breakdown. In order to prevent blurring due to the inevitable oscillations in the voltage pulse at the cathode, the emission microscope must be completely electrostatic, and the voltage of all lenses must be derived from the cathode voltage by fast capacitive/resistive dividers.
ACKNOWLEDGMENTS

Sincere thanks are due to F. Rohn-Schwarz, T. Nink, and M. Weingärtner for helping to produce the paper. The high-speed research was generously supported by the Deutsche Forschungsgemeinschaft and by the Alexander von Humboldt Stiftung.
REFERENCES

Anderson, T., Tomov, N., and Rentzepis, P. M. (1992). J. Appl. Phys. 71, 5161-5167.
Balandin, V. Yu., Gernert, U., Nink, T., and Bostanjoglo, O. (1997). J. Appl. Phys. 81, 2835-2838.
Balandin, V. Yu., Niedrig, R., and Bostanjoglo, O. (1995). J. Appl. Phys. 77, 135-142.
Balandin, V. Yu., Nink, T., and Bostanjoglo, O. (1998). J. Appl. Phys. 84, 6355-6358.
Balandin, V. Yu., Otte, D., and Bostanjoglo, O. (1995). J. Appl. Phys. 78, 2037-2044.
Batinic, M., Begert, D., and Kubalek, E. (1995). Nucl. Instrum. Meth. Phys. Res., Sect. A 363, 43.
Baum, A. W., Spicer, W. E., Pease, R. F., Castello, K. A., and Aebi, V. W. (1995). SPIE 2522, 208-212.
Boersch, H. (1943). Z. Techn. Phys. 23, 129-130.
von Borries, B. (1940). Z. Phys. 116, 370-378.
Bostanjoglo, O., and Heinricht, F. (1987). J. Phys. E: Sci. Instrum. 20, 1491-1493.
Bostanjoglo, O., and Heinricht, F. (1990). Rev. Sci. Instrum. 61, 1223-1229.
Bostanjoglo, O., Heinricht, F., and Wünsch, F. (1990). Proc. XIIth Int. Congr. on Electron Microscopy (L. D. Peachy and D. B. Williams, Eds.), Vol. 1. San Francisco Press, San Francisco, pp. 124-125.
Bostanjoglo, O., and Kornitzky, J. (1990). Proc. XII. Int. Congr. on Electron Microscopy (L. D. Peachy and D. B. Williams, Eds.), Vol. 1. San Francisco Press, San Francisco, pp. 180-181.
Bostanjoglo, O., Kornitzky, J., and Tornow, R. P. (1989). J. Phys. E: Sci. Instrum. 22, 1008-1011.
Bostanjoglo, O., and Liedtke, R. (1980). Phys. Status Solidi A 60, 451-455.
Bostanjoglo, O., Marine, W., and Thomsen-Schmidt, P. (1992). Appl. Surf. Sci. 54, 302-307.
Bostanjoglo, O., and Nink, T. (1996). J. Appl. Phys. 79, 8725-8729.
Bostanjoglo, O., and Nink, T. (1997). Appl. Surf. Sci. 109/110, 101-105.
Bostanjoglo, O., and Otte, D. (1993). Mater. Sci. Eng. A 173, 407-411.
Bostanjoglo, O., Schlotzhauer, G., and Schade, S. (1982). Optik 61, 91-97.
Bostanjoglo, O., and Thomsen-Schmidt, P. (1989). Appl. Surf. Sci. 43, 136-141.
Bostanjoglo, O., Tornow, R. P., and Tornow, W. (1987a). Ultramicroscopy 21, 367-372.
Bostanjoglo, O., Tornow, R. P., and Tornow, W. (1987b). Scanning Microsc. Suppl. 1, 197-203.
Bostanjoglo, O., and Weingärtner, M. (1997). Rev. Sci. Instrum. 68, 2456-2460.
Brunner, M., Winkler, D., Schmitt, R., and Lischke, B. (1987). Scanning 9, 201-204.
Buzulutzkov, A., Breskin, A., and Chechik, R. (1997). J. Appl. Phys. 81, 466-479.
Champion, J. A., Keene, B. J., and Sillwood, J. M. (1969). J. Mater. Sci. 4, 39-49.
Chevallay, E., Durand, I., Hutchins, C., Suberlucq, G., and Wurgel, M. (1994). Nucl. Instrum. Methods Phys. Res. Sect. A 340, 146-156.
Cowley, J. M., and Peng, L. M. (1985). Ultramicroscopy 16, 59-67.
De Stasio, G., Capazi, M., Lorusso, G. F., Baudat, P. A., Droubay, T. C., Perfetti, P., Margaritondo, G., and Tonner, B. P. (1998). Rev. Sci. Instrum. 69, 2062-2066.
Ehsasi, M., Karpowicz, A., Berdau, M., Engel, W., Christmann, K., and Block, J. H. (1993). Ultramicroscopy 49, 318-329.
Elsayed-Ali, M. E., Norris, T. B., Pessot, M. A., and Mourou, G. A. (1987). Phys. Rev. Lett. 58, 1212-1215.
Engel, W. (1966). Proc. 6th Int. Congr. on Electron Microscopy (R. Uyeda, Ed.), Vol. 1. Maruzen, Tokyo, pp. 217-218.
Engel, W., Kordesch, M. E., Rotermund, H. H., Kubala, S., and v. Oertzen, A. (1991). Ultramicroscopy 36, 148-153.
Fujimoto, J. G., Liu, J. M., Ippen, E. P., and Bloembergen, N. (1984). Phys. Rev. Lett. 53, 1837-1840.
Gesley, M. (1993). Rev. Sci. Instrum. 64, 3169-3190.
Giesen, M., Phaneuf, R. J., Williams, E. D., Einstein, T. L., and Ibach, H. (1997). Appl. Phys. A 64, 423-430.
Girardeau-Montaut, J. P., Girardeau-Montaut, C., Afif, M., Perez, A., and Monstaizis, S. D. (1995). Appl. Phys. Lett. 66, 1886-1888.
Girardeau-Montaut, J. P., Girardeau-Montaut, C., and Monstaizis, S. D. (1994). J. Phys. D: Appl. Phys. 27, 848-851.
Griffith, O. H. (1986). Appl. Surf. Sci. 26, 265-279.
Griffith, O. H., Habliston, P. A., and Birrell, G. B. (1991). Ultramicroscopy 36, 262-274.
Griffith, O. H., and Rempfer, G. F. (1987). Adv. Opt. Electron Microsc. 10, 269-337.
Heinricht, F., and Bostanjoglo, O. (1992). Appl. Surf. Sci. 54, 244-254.
Ho, J. P., Grigoropoulos, C. P., and Humphrey, J. A. (1995). J. Appl. Phys. 78, 4696-4709.
Iida, T., and Guthrie, R. I. L. (1988). The Physical Properties of Liquid Metals. Clarendon, Oxford, p. 134.
Koechner, W. (1996). Solid-State Laser Engineering. Springer-Verlag, Berlin, p. 458.
Lablond, B., and Rajaonera, G. (1994). Nucl. Instrum. Meth. Phys. Res. A 340, 195-198.
Larciprete, R., Borsella, E., and Cinti, P. (1996). Appl. Phys. A 62, 103-114.
Massey, G. A. (1983). IEEE J. Quantum Electron. QE-19, 873-877.
Massey, G. A., Jones, M. D., and Plummer, B. P. (1981). J. Appl. Phys. 52, 3780-3790.
May, P. G., Petkie, R. R., Hasper, J. M. E., and Yee, D. S. (1990). Appl. Phys. Lett. 57, 1584-1585.
Metev, S. M., and Veiko, V. P. (1998). Laser-Assisted Microtechnology. Springer-Verlag, Berlin, pp. 46-52.
Möllenstedt, G., and Lenz, F. (1963). Adv. Electronics and Electron Phys. 18, 251-329.
Murr, L. E. (1991). Electron and Ion Microscopy and Microanalysis: Principles and Applications. M. Dekker, New York.
Niedrig, R., and Bostanjoglo, O. (1997). J. Appl. Phys. 81, 480-485.
Nink, T. (1999). Ph.D. Diss., “High-speed transmission electron microscopy of instabilities in laser pulse-produced melts in metal films,” Technische Universität Berlin.
Nink, T., Galbert, F., Mao, Z., and Bostanjoglo, O. (1999). Appl. Surf. Sci. 138-139, 439-443.
Ninomiya, K., and Hasegawa, M. (1995). J. Vac. Sci. Technol., A 13, 1224-1228.
v. Oertzen, A., Rotermund, H. H., and Nettesheim, S. (1992). Chem. Phys. Lett. 199, 131-137.
Osakabe, N., Tanishiro, Y., Yagi, K., and Honjo, G. (1980). Surf. Sci. 97, 393-408.
Plies, E. (1982). Proc. 10th Int. Congr. on Electron Microscopy (J. B. Le Poole, E. Zeitler, G. Thomas, G. Schimmel, C. Weichan, Y. V. Bassewitz, Eds.), Vol. 1. Deutsche Gesellschaft Elektronenmikroskopie, Frankfurt/Main, pp. 319-320.
Preuss, S., Demchuk, A., and Stuke, M. (1995). Appl. Phys. A 61, 33-37.
Pronko, P. P., Dutta, S. K., Du, D., and Singh, R. K. (1995). J. Appl. Phys. 78, 6233-6240.
Reimer, L. (1985). Scanning Electron Microscopy. Springer-Verlag, Berlin.
Reimer, L. (1993). Transmission Electron Microscopy. Springer-Verlag, Berlin.
Rempfer, G. F., and Griffith, O. H. (1992). Ultramicroscopy 47, 35-54.
Ricci, E., and Passerone, A. (1993). Mater. Sci. Eng., A 161, 31-40.
Rotermund, H. H., Engel, W., Jackubith, S., v. Oertzen, A., and Ertl, G. (1991). Ultramicroscopy 36, 164-172.
Ruska, E. (1933). Z. Phys. 83, 492-497.
Sabary, F., and Bergeret, H. (1994). Nucl. Instrum. Meth. Phys. Res., Sect. A 340, 199-203.
Schäfer, B., and Bostanjoglo, O. (1992). Optik 92, 9-13.
Schonlein, R. W., Lin, W. Z., Fujimoto, J. G., and Eesley, G. L. (1987). Phys. Rev. Lett. 58, 1680-1683.
Singh, R. K., Holland, O. W., and Narayan, J. (1990). J. Appl. Phys. 68, 233-247.
Stampfli, P., and Bennemann, K. H. (1992). Phys. Rev. B 46, 10686-10692.
Szentesi, O. I. (1972). J. Phys. E: Sci. Instrum. 5, 563-567.
Tanishiro, Y., Takayanagi, K., and Yagi, K. (1983). Ultramicroscopy 11, 95-102.
Travier, C. (1994). Nucl. Instrum. Methods Phys. Res. Sect. A 340, 26-39.
Vitol, E. N., and Orlova, K. B. (1984). Russ. Metall. 4, 34-40.
Wang, X. Y., Riffe, D. M., Lee, Y. S., and Downer, M. C. (1994). Phys. Rev. B 50, 8016-8019.
Watari, F., and Yada, K. (1986). Proc. 11th Int. Congr. on Electron Microscopy (T. Imura, S. Maruse, T. Suzuki, Eds.), Vol. 1. Jap. Soc. Electron Microscopy, Tokyo, pp. 261-262.
Weingärtner, M., and Bostanjoglo, O. (1998). Surface Coat. Technol. 100/101, 85-89.
Weingärtner, M., Elschner, R., and Bostanjoglo, O. (1999). Appl. Surf. Sci. 138-139, 499-502.
Yada, K. (1986). Proc. 11th Int. Congr. on Electron Microscopy (T. Imura, S. Maruse, T. Suzuki, Eds.), Vol. 1. Jap. Soc. Electron Microscopy, Tokyo, pp. 227-228.
Yagi, K. (1993). Surf. Sci. Rep. (Netherlands) 17, 305-362.
Yamanoka, A., and Yagi, K. (1989). Ultramicroscopy 29, 161-167.
ADVANCES IN IMAGING AND ELECTRON PHYSICS. VOL. 110
Soft Mathematical Morphology: Extensions, Algorithms, and Implementations

A. GASTERATOS
Laboratory for Integrated Advanced Robotics, Department of Communication, Computer and System Sciences, University of Genoa, Via Opera Pia 13, I-16145 Genoa, Italy

I. ANDREADIS
Laboratory of Electronics, Section of Electronics and Information Systems Technology, Department of Electrical and Computer Engineering, Democritus University of Thrace, GR-671 00 Xanthi, Greece
I. Introduction
II. Standard Mathematical Morphology
    A. Binary Morphology
    B. Basic Algebraic Properties
    C. Gray-Scale Morphology with Flat Structuring Elements
    D. Gray-Scale Morphology with Gray-Scale Structuring Elements
    E. Fuzzy Morphology
III. Soft Mathematical Morphology
    A. Binary Soft Morphology
    B. Gray-Scale Soft Morphology with Flat Structuring Elements
    C. Gray-Scale Soft Morphology with Gray-Scale Structuring Elements
IV. Soft Morphological Structuring Element Decomposition
V. Fuzzy Soft Mathematical Morphology
    A. Definitions
    B. Compatibility with Soft Mathematical Morphology
    C. Algebraic Properties of Fuzzy Soft Mathematical Morphology
VI. Implementations
    A. Threshold Decomposition
    B. Majority Gate
    C. Histogram Technique
VII. Concluding Remarks
References
I. INTRODUCTION

Mathematical morphology is an active and growing area of image processing and analysis that is based on set theory and topology (Matheron, 1975; Serra,
1982; Haralick et al., 1986; Giardina and Dougherty, 1988). Mathematical morphology studies the geometric structure inherent within the image. For this reason it uses a predetermined geometric shape known as the structuring element. Erosion, which is the basic morphological operation, quantifies the way in which the structuring element fits into the image. Mathematical morphology has provided solutions to many tasks where image processing can be applied, such as in remote sensing, optical character recognition, radar image sequence recognition, and medical imaging. Soft mathematical morphology was introduced by Koskinen et al. (1991). In this approach the definitions of the standard morphological operations are slightly relaxed in such a way that a degree of robustness is achieved while most of the desirable properties of the operations are maintained. Soft morphological filters are less sensitive to additive noise and to small variations in object shape than standard morphological filters. They have found applications mainly in noise removal, in areas such as medical imaging and digital television (Harvey, 1998). Another, relatively new, approach to mathematical morphology is fuzzy mathematical morphology. A fuzzy morphological framework was introduced by Sinha and Dougherty (1992). In this framework the images are not treated as crisp binary sets but as fuzzy sets. Set union and intersection are replaced by fuzzy bold union and bold intersection, respectively, in order to formulate fuzzy erosion and dilation, respectively. This attempt to adapt mathematical morphology into fuzzy set theory is not unique. Several other attempts have been developed independently by researchers, and they are all described and discussed by Bloch and Maitre (1995). Several fuzzy mathematical morphologies are grouped and compared, and their properties are studied. A general framework unifying all these approaches is also demonstrated. 
In this paper recent trends in soft mathematical morphology are presented. The standard morphological operations and their algebraic properties and fuzzy morphology are discussed in section II, and soft mathematical morphology is described in section III. A soft morphological structuring element decomposition technique is introduced in section IV, and the definitions of fuzzy soft morphological operations and their algebraic properties are provided in section V. Several implementations of soft morphological filters are analyzed in section VI.
II. STANDARD MATHEMATICAL MORPHOLOGY

The considerations for the structuring element used by Haralick et al. (1987) have been adopted for the basic morphological operations. Also, the notations of the extensions of the basic morphological operations (soft morphology, fuzzy morphology, and fuzzy soft morphology) are based on the same consideration. Moreover, throughout the paper the discrete case is considered, that is, all sets belong to the Cartesian grid $Z^2$.

A. Binary Morphology
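Let the set $A$ denote the image under process and the set $B$ the structuring element. Binary erosion and dilation are defined as follows:

\[ A \ominus B = \bigcap_{x \in B} (A)_{-x} \tag{1} \]

and

\[ A \oplus B = \bigcup_{x \in B} (A)_{x}, \tag{2} \]

respectively, where $A$ and $B$ are sets of $Z^2$, and $(A)_x$ is the translation of $A$ by $x$, which is defined as follows:

\[ (A)_x = \{ c \in Z^2 \mid c = a + x \ \text{for some}\ a \in A \}. \tag{3} \]

The definitions of binary opening and closing are

\[ A \circ B = (A \ominus B) \oplus B \tag{4} \]

and

\[ A \bullet B = (A \oplus B) \ominus B, \tag{5} \]

respectively.

B. Basic Algebraic Properties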
The basic algebraic properties of the morphological operations are provided in this section.

1. Duality Theorem

Erosion and dilation are dual operations:

\[ (A \ominus B)^c = A^c \oplus B^s, \tag{6} \]

where $A^c$ is the complement of $A$ and is defined as

\[ A^c = \{ x \in Z^2 \mid x \notin A \}, \tag{7} \]

and $B^s$ is the reflection of $B$ and is defined as

\[ B^s = \{ x \mid \text{for some}\ b \in B,\ x = -b \}. \tag{8} \]

Opening and closing are also dual operations:

\[ (A \circ B)^c = A^c \bullet B^s. \tag{9} \]
2. Translation Invariance

Both erosion and dilation are translation-invariant operations:

\[ (A)_x \oplus B = (A \oplus B)_x \tag{10} \]

and

\[ (A)_x \ominus B = (A \ominus B)_x, \tag{11} \]

respectively.
3. Increasing

Both erosion and dilation are increasing operations:

\[ A \subseteq B \Rightarrow A \ominus C \subseteq B \ominus C, \tag{12} \]

\[ A \subseteq B \Rightarrow A \oplus D \subseteq B \oplus D. \tag{13} \]
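4. Distributivity

Erosion distributes over set intersection, and dilation distributes over set union:

\[ (A \cap B) \ominus C = (A \ominus C) \cap (B \ominus C) \tag{14} \]

and

\[ (A \cup B) \oplus C = (A \oplus C) \cup (B \oplus C), \tag{15} \]

respectively.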
5. Anti-Extensivity and Extensivity

Erosion is an anti-extensive operation, provided that the origin belongs to the structuring element:

\[ 0 \in B \Rightarrow A \ominus B \subseteq A. \tag{16} \]

Similarly, dilation is extensive, if the origin belongs to the structuring element:

\[ 0 \in B \Rightarrow A \subseteq A \oplus B. \tag{17} \]

6. Idempotency

Opening and closing are idempotent, that is, successive applications with the same structuring element do not further change the previously transformed result:

\[ A \circ B = (A \circ B) \circ B \tag{18} \]

and

\[ A \bullet B = (A \bullet B) \bullet B. \tag{19} \]
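The binary definitions and several of the properties listed in this section can be tried out on small point sets. The sketch below is only an illustration: it represents a binary image as a Python set of (row, column) tuples, implements erosion and dilation as the intersection and union of translates, and checks translation invariance, the increasing property, distributivity, (anti-)extensivity, and idempotency on hypothetical test data. The sets `A`, `A_small`, `C`, and `B` are invented for the example.

```python
def translate(A, x):
    """(A)_x: translate every point of A by x."""
    return {(a[0] + x[0], a[1] + x[1]) for a in A}


def dilate(A, B):
    """A (+) B: union of the translates (A)_x over x in B."""
    result = set()
    for x in B:
        result |= translate(A, x)
    return result


def erode(A, B):
    """A (-) B: intersection of the translates (A)_{-x} over x in B."""
    translates = [translate(A, (-x[0], -x[1])) for x in B]
    return set.intersection(*translates) if translates else set()


def opening(A, B):
    return dilate(erode(A, B), B)


def closing(A, B):
    return erode(dilate(A, B), B)


# Hypothetical test data: a 3 x 4 rectangle plus one isolated pixel, and a
# two-pixel horizontal structuring element that contains the origin.
A = {(r, c) for r in range(3) for c in range(4)} | {(1, 5)}
A_small = {(r, c) for r in range(2) for c in range(3)}      # a subset of A
C = {(r, c) for r in range(1, 4) for c in range(2, 6)}      # overlaps A
B = {(0, 0), (0, 1)}
t = (2, -3)

eroded, dilated = erode(A, B), dilate(A, B)

checks = {
    "translation invariance": erode(translate(A, t), B) == translate(eroded, t)
    and dilate(translate(A, t), B) == translate(dilated, t),
    "increasing": erode(A_small, B) <= eroded and dilate(A_small, B) <= dilated,
    "distributivity": erode(A & C, B) == eroded & erode(C, B)
    and dilate(A | C, B) == dilated | dilate(C, B),
    "anti-extensivity / extensivity": eroded <= A <= dilated,
    "idempotency": opening(opening(A, B), B) == opening(A, B)
    and closing(closing(A, B), B) == closing(A, B),
}
for name, holds in checks.items():
    print(f"{name}: {holds}")
```

In this example the opening removes the isolated pixel, which the structuring element cannot fit into, while the closing fills the one-pixel gap between the rectangle and that pixel. Duality is not checked because the complement of a finite set in $Z^2$ is infinite.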
C. Gray-Scale Morphology with Flat Structuring Elements

The definitions of morphological erosion and dilation of a function $f: F \to Z$ by a flat structuring element (set) $B$ are

\[ (f \ominus B)(x) = \min\{ f(y) \mid y \in (B)_x \} \tag{20} \]

and

\[ (f \oplus B)(x) = \max\{ f(y) \mid y \in (B^s)_x \}, \tag{21} \]

respectively, where $x, y \in Z^2$ are the spatial coordinates, and $F \subseteq Z^2$ is the domain of the gray-scale image (function).

D. Gray-Scale Morphology with Gray-Scale Structuring Elements

The definitions of erosion and dilation of a function $f: F \to Z$ by a gray-scale structuring element $g: G \to Z$ are

\[ (f \ominus g)(x) = \min_{y \in G} \{ f(x + y) - g(y) \} \tag{22} \]

and

\[ (f \oplus g)(x) = \max_{y \in G,\ (x - y) \in F} \{ f(x - y) + g(y) \}, \tag{23} \]

respectively, where $x, y \in Z^2$ are the spatial coordinates, and $F, G \subseteq Z^2$ are the domains of the gray-scale image (function) and gray-scale structuring element, respectively.

E. Fuzzy Morphology
In this paper the definitions introduced by Sinha and Dougherty (1992) are used. These are a special case of the framework presented by Bloch and Maitre (1995). In this approach, fuzzy mathematical morphology is studied in terms of fuzzy fitting. The fuzziness is introduced by the degree to which the structuring element fits into the image. The operations of erosion and dilation of a fuzzy image by a fuzzy structuring element having a bounded support are defined in terms of their membership functions as follows:

\[ \mu_{A \ominus B}(x) = \min_{y \in B} \bigl[ \min[1,\ 1 + \mu_A(x + y) - \mu_B(y)] \bigr] = \min\Bigl[1,\ \min_{y \in B} [1 + \mu_A(x + y) - \mu_B(y)] \Bigr] \tag{24} \]

and

\[ \mu_{A \oplus B}(x) = \max_{y \in B} \bigl[ \max[0,\ \mu_A(x - y) + \mu_B(y) - 1] \bigr] = \max\Bigl[0,\ \max_{y \in B} [\mu_A(x - y) + \mu_B(y) - 1] \Bigr], \tag{25} \]

where $x, y \in Z^2$ are the spatial coordinates, and $\mu_A$, $\mu_B$ are the membership functions of the image and the structuring element, respectively. It is obvious from Eqs. (24) and (25) that the results of both fuzzy erosion and dilation have membership functions whose values are within the interval [0, 1].
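A minimal one-dimensional sketch of fuzzy erosion and dilation in the sense of Eqs. (24) and (25) is given below. The signal `mu_a`, the structuring element `mu_b`, and the decision to ignore points that fall outside the signal are assumptions made for the illustration; they are not taken from the text.

```python
def fuzzy_erode(mu_a, mu_b):
    """Eq. (24): min over y in B of min[1, 1 + mu_A(x + y) - mu_B(y)].

    mu_a is a list of memberships; mu_b maps offsets y to memberships.
    Offsets that fall outside the signal are skipped (an assumption).
    The origin (offset 0) is expected to be in the support of mu_b, so
    the list of terms is never empty.
    """
    out = []
    for x in range(len(mu_a)):
        terms = [min(1.0, 1.0 + mu_a[x + y] - m)
                 for y, m in mu_b.items() if 0 <= x + y < len(mu_a)]
        out.append(min(terms))
    return out


def fuzzy_dilate(mu_a, mu_b):
    """Eq. (25): max over y in B of max[0, mu_A(x - y) + mu_B(y) - 1]."""
    out = []
    for x in range(len(mu_a)):
        terms = [max(0.0, mu_a[x - y] + m - 1.0)
                 for y, m in mu_b.items() if 0 <= x - y < len(mu_a)]
        out.append(max(terms))
    return out


# Hypothetical fuzzy signal and two-point fuzzy structuring element.
mu_a = [0.2, 0.9, 1.0, 0.6]
mu_b = {0: 1.0, 1: 0.7}

eroded = fuzzy_erode(mu_a, mu_b)
dilated = fuzzy_dilate(mu_a, mu_b)
print([round(v, 3) for v in eroded])
print([round(v, 3) for v in dilated])
```

The inner clipping by `min(1.0, ...)` and `max(0.0, ...)` is what keeps every output membership inside [0, 1], as noted above.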
III. SOFT MATHEMATICAL MORPHOLOGY

In soft morphological operations the maximum and the minimum operations used in standard gray-scale morphology are replaced by weighted order statistics. A weighted order statistic is a certain element of a list whose members have been ordered. Some of the members of the original unsorted list participate with a weight greater than 1, that is, they are repeated more than once before sorting (David, 1981; Pitas and Venetsanopoulos, 1990). Furthermore, in soft mathematical morphology the structuring element $B$ is divided into two subsets, the core $B_1$ and the soft boundary $B_2$.

A. Binary Soft Morphology

The basic definitions of binary soft erosion and dilation are (Pu and Shih, 1995)

\[ (A \ominus [B_1, B_2, k])(x) = \{ x \in A \mid (k \times \mathrm{Card}[A \cap (B_1)_x] + \mathrm{Card}[A \cap (B_2)_x]) \ge k\,\mathrm{Card}[B_1] + \mathrm{Card}[B_2] - k + 1 \} \tag{26} \]

and

\[ (A \oplus [B_1, B_2, k])(x) = \{ x \in Z^2 \mid (k \times \mathrm{Card}[A \cap (B_1^s)_x] + \mathrm{Card}[A \cap (B_2^s)_x]) \ge k \}, \tag{27} \]

respectively, where $k$ is called the order index, which determines the number of times that the elements of the core participate in the result, and $\mathrm{Card}[X]$ denotes the cardinality of set $X$, that is, the number of the elements of $X$. In the extreme case when the order index $k = 1$ or, alternatively, $B = B_1$ ($B_2 = \emptyset$), soft morphological operations are reduced to standard morphological operations.
Example 1: The figure demonstrates a case of soft binary dilation and erosion. The adopted coordinate system is (row, column). The arrows denote the origin of the coordinate system and its direction.
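The counting rule behind binary soft erosion and dilation can be sketched as follows. Erosion follows the threshold $k\,\mathrm{Card}[B_1] + \mathrm{Card}[B_2] - k + 1$; for dilation the sketch assumes the corresponding threshold $k$ on the reflected structuring element, which is what a $k$th-largest order statistic on a binary multiset implies. The image and the cross-shaped structuring element are hypothetical and are not those of the figure in Example 1.

```python
def soft_erode(A, B1, B2, k):
    """Keep x in A if k*Card[A n (B1)_x] + Card[A n (B2)_x]
    >= k*Card[B1] + Card[B2] - k + 1."""
    threshold = k * len(B1) + len(B2) - k + 1
    result = set()
    for x in A:
        core_hits = sum((x[0] + b[0], x[1] + b[1]) in A for b in B1)
        soft_hits = sum((x[0] + b[0], x[1] + b[1]) in A for b in B2)
        if k * core_hits + soft_hits >= threshold:
            result.add(x)
    return result


def soft_dilate(A, B1, B2, k):
    """Assumed form: keep x if k*Card[A n (B1^s)_x] + Card[A n (B2^s)_x] >= k."""
    # Only points reachable as a + b can overlap A at all.
    candidates = {(a[0] + b[0], a[1] + b[1]) for a in A for b in B1 | B2}
    result = set()
    for x in candidates:
        core_hits = sum((x[0] - b[0], x[1] - b[1]) in A for b in B1)
        soft_hits = sum((x[0] - b[0], x[1] - b[1]) in A for b in B2)
        if k * core_hits + soft_hits >= k:
            result.add(x)
    return result


# Hypothetical data: a 3 x 3 square and a cross-shaped structuring element
# whose core is the origin and whose soft boundary is the four neighbours.
A = {(r, c) for r in range(3) for c in range(3)}
B1 = {(0, 0)}
B2 = {(-1, 0), (1, 0), (0, -1), (0, 1)}

for k in (1, 2):
    print(k, sorted(soft_erode(A, B1, B2, k)), len(soft_dilate(A, B1, B2, k)))
```

For $k = 1$ the sketch reproduces standard erosion and dilation by the cross; for $k = 2$ the erosion keeps the edge midpoints as well as the center, and the dilation no longer grows the square, which illustrates the relaxed behavior of the soft operations.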
If $k > \mathrm{Card}[B_2]$, soft morphological operations are affected only by the core $B_1$, that is, they behave as if $B_1$ alone were used as the structuring element. Therefore, in this case the nature of soft morphological operations is not preserved (Kuosmanen and Astola, 1995; Pu and Shih, 1995). For this reason the constraint $k \le \min\{\mathrm{Card}(B)/2, \mathrm{Card}(B_2)\}$ is used. In Example 1 $\min\{\mathrm{Card}(B)/2, \mathrm{Card}(B_2)\} = 2.5$, so only the cases $k = 1$ and $k = 2$ are considered. For $k = 1$ the results of both dilation and erosion are the same as those that would have been obtained by applying Eqs. (2) and (1), respectively.

B. Gray-Scale Soft Morphology with Flat Structuring Elements

The definitions of soft morphology were first introduced by Koskinen et al. (1992) as transforms of a function by a set. In the definition of soft dilation the reflection of the structuring element is used, so that in the case of $k = 1$ the definitions comply with Haralick et al. (1987):

\[ (f \ominus [B_1, B_2, k])(x) = \min^{(k)} \bigl( \{ k \diamond f(y) \mid y \in (B_1)_x \} \cup \{ f(z) \mid z \in (B_2)_x \} \bigr) \tag{28} \]

and

\[ (f \oplus [B_1, B_2, k])(x) = \max^{(k)} \bigl( \{ k \diamond f(y) \mid y \in (B_1^s)_x \} \cup \{ f(z) \mid z \in (B_2^s)_x \} \bigr), \tag{29} \]

respectively, where $\min^{(k)}$ and $\max^{(k)}$ are the $k$th smallest and the $k$th largest element of the multiset, respectively. A multiset is a collection of objects in which the repetition of objects is allowed, and the symbol $\diamond$ denotes the repetition, that is, $\{k \diamond f(x)\} = \{f(x), f(x), \ldots, f(x)\}$ ($k$ times).

C. Gray-Scale Soft Morphology with Gray-Scale Structuring Elements
Soft morphological erosion of a gray-scale image $f: F \to Z$ by a soft gray-scale structuring element $[\alpha, \beta, k]: B \to Z$ is (Pu and Shih, 1995)

\[ f \ominus [\alpha, \beta, k](x) = \min^{(k)}_{\substack{(x+y),\,(x+z) \in F \\ y \in B_1,\ z \in B_2}} \bigl( \{ k \diamond (f(x + y) - \alpha(y)) \} \cup \{ f(x + z) - \beta(z) \} \bigr). \tag{30} \]

Soft morphological dilation of $f$ by $[\alpha, \beta, k]$ is

\[ f \oplus [\alpha, \beta, k](x) = \max^{(k)}_{\substack{(x-y),\,(x-z) \in F \\ y \in B_1,\ z \in B_2}} \bigl( \{ k \diamond (f(x - y) + \alpha(y)) \} \cup \{ f(x - z) + \beta(z) \} \bigr), \tag{31} \]

where $x, y, z \in Z^2$ are the spatial coordinates; $\alpha: B_1 \to Z$ is the core of the gray-scale structuring element; $\beta: B_2 \to Z$ is the soft boundary of the gray-scale structuring element, and $F, B_1, B_2 \subseteq Z^2$ are the domains of the gray-scale image, the core of the gray-scale structuring element, and the soft boundary of the gray-scale structuring element, respectively. Figure 1 demonstrates one-dimensional soft morphological operations and the effect of the order index $k$. The same structuring element is used for both operations. It is a one-dimensional structuring element with five discrete values. The central value corresponds to its core and is equal to 30. Additionally, it denotes the origin. The four remaining values belong to its soft boundary and are equal to 20. From both Figures 1(a) and (b) it is obvious that the greater the value of the order index, the better the fitting.
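The weighted order statistics behind gray-scale soft morphology can be sketched in one dimension as follows. The dilation follows Eq. (31); the erosion uses the mirrored form (terms $f(x+y) - \alpha(y)$ and $f(x+z) - \beta(z)$, $k$th smallest), which is an assumption made by analogy. A flat structuring element is treated as the special case in which all values of $\alpha$ and $\beta$ are zero. The two test signals are invented; only the structuring element with core value 30 and soft boundary values 20 mirrors the one described for Figure 1.

```python
def kth_smallest(values, k):
    return sorted(values)[k - 1]


def kth_largest(values, k):
    return sorted(values, reverse=True)[k - 1]


def soft_dilate(f, alpha, beta, k):
    """k-th largest of {k <> (f(x-y) + alpha(y))} U {f(x-z) + beta(z)}.

    f is a list; alpha and beta map offsets to structuring-element values.
    Offsets that leave the signal are skipped. The core is assumed to
    contain the origin, so the multiset always has at least k elements.
    """
    out = []
    for x in range(len(f)):
        multiset = []
        for y, a in alpha.items():
            if 0 <= x - y < len(f):
                multiset += [f[x - y] + a] * k       # core terms repeated k times
        for z, b in beta.items():
            if 0 <= x - z < len(f):
                multiset.append(f[x - z] + b)        # soft-boundary terms once
        out.append(kth_largest(multiset, k))
    return out


def soft_erode(f, alpha, beta, k):
    """Assumed mirror image: k-th smallest of
    {k <> (f(x+y) - alpha(y))} U {f(x+z) - beta(z)}."""
    out = []
    for x in range(len(f)):
        multiset = []
        for y, a in alpha.items():
            if 0 <= x + y < len(f):
                multiset += [f[x + y] - a] * k
        for z, b in beta.items():
            if 0 <= x + z < len(f):
                multiset.append(f[x + z] - b)
        out.append(kth_smallest(multiset, k))
    return out


# Flat structuring element (all values zero) on a signal with one impulse.
flat_core, flat_soft = {0: 0}, {-2: 0, -1: 0, 1: 0, 2: 0}
noisy = [10, 12, 50, 11, 13, 12, 10]
print(soft_dilate(noisy, flat_core, flat_soft, 1))
print(soft_dilate(noisy, flat_core, flat_soft, 2))
print(soft_erode(noisy, flat_core, flat_soft, 2))

# Gray-scale structuring element: core value 30, soft boundary values 20.
core, soft = {0: 30}, {-2: 20, -1: 20, 1: 20, 2: 20}
step = [100, 100, 160, 100, 100]
print(soft_dilate(step, core, soft, 1))
print(soft_dilate(step, core, soft, 2))
```

With $k = 1$ the single impulse is spread over the whole support of the structuring element, as in standard dilation; with $k = 2$ it stays confined to its own position, which is the robustness to isolated outliers that motivates the soft operations.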
FIGURE 1. Illustration of one-dimensional soft morphological operations and the effect of the order index $k$; (a) soft erosion (curves: image, erosion for $k = 1$, erosion for $k = 2$); (b) soft dilation (curves: image, dilation for $k = 1$, dilation for $k = 2$).

IV. SOFT MORPHOLOGICAL STRUCTURING ELEMENT DECOMPOSITION

A soft morphological structuring element decomposition technique is described next (Gasteratos et al., 1998d). According to this technique, the domain $B$ of the structuring element is divided into smaller nonoverlapping subdomains $B_1, B_2, \ldots, B_n$. Also, $B_1 \cup B_2 \cup \cdots \cup B_n = B$. The soft morphological structuring elements obtain values from these domains and are denoted by $[\lambda_1, \mu_1, k], [\lambda_2, \mu_2, k], \ldots, [\lambda_n, \mu_n, k]$, respectively. These have a common origin, which is the origin of the original structuring element. Additionally, the points of $B$ that belong to its core are also points of the cores of $B_1, B_2, \ldots, B_n$, and the points of $B$ that belong to the soft boundary are also points of the soft boundaries of $B_1, B_2, \ldots, B_n$. This process is graphically illustrated in Figure 2. In this figure the core of the structuring element is denoted by the shaded area. Soft dilation and erosion are computed as follows:

\[ f \oplus [\alpha, \beta, k](x) = \max^{(k)} \Bigl[ \bigcup_{i=1}^{n} \bigcup_{j=1}^{k} \max^{(j)}_{\substack{(x-y) \in B_1 \\ (x-z) \in B_2}} \bigl( \{ k \diamond (f(x - y) + \lambda_i(y)) \} \cup \{ f(x - z) + \mu_i(z) \} \bigr) \Bigr] \tag{32} \]

and

\[ f \ominus [\alpha, \beta, k](x) = \min^{(k)} \Bigl[ \bigcup_{i=1}^{n} \bigcup_{j=1}^{k} \min^{(j)}_{\substack{(x+y) \in B_1 \\ (x+z) \in B_2}} \bigl( \{ k \diamond (f(x + y) - \lambda_i(y)) \} \cup \{ f(x + z) - \mu_i(z) \} \bigr) \Bigr], \tag{33} \]

respectively, where $B_1$ and $B_2$ are the domain of the core and the soft boundary of the large structuring element $[\alpha, \beta, k]: B \to Z$.

FIGURE 2. Example of a 4 × 4 soft morphological structuring element decomposition.
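Proof:

\[ \forall y \in B_1:\ \alpha(y) = \bigcup_{i=1}^{n} \lambda_i(y) \ \Rightarrow\ f(x - y) + \alpha(y) = \bigcup_{i=1}^{n} [f(x - y) + \lambda_i(y)], \quad (x - y) \in B_1. \]

Likewise, for the soft boundary,

\[ \forall z \in B_2:\ \beta(z) = \bigcup_{i=1}^{n} \mu_i(z) \ \Rightarrow\ f(x - z) + \beta(z) = \bigcup_{i=1}^{n} [f(x - z) + \mu_i(z)], \quad (x - z) \in B_2. \]

Therefore,

\[ f \oplus [\alpha, \beta, k](x) = \max^{(k)} \Bigl[ \bigcup_{i=1}^{n} \bigcup_{j=1}^{N} \max^{(j)}_{\substack{(x-y) \in B_1 \\ (x-z) \in B_2}} \bigl( \{ k \diamond (f(x - y) + \lambda_i(y)) \} \cup \{ f(x - z) + \mu_i(z) \} \bigr) \Bigr], \]

where $N$ is the number of elements of the multiset. However, if an element is not greater than the local $(N-k)$th order statistic, then it cannot be greater than the global $(N-k)$th order statistic. Therefore, the terms $\max^{(N)}, \ldots, \max^{(k+1)}$ can be omitted:

\[ f \oplus [\alpha, \beta, k](x) = \max^{(k)} \Bigl[ \bigcup_{i=1}^{n} \bigl\{ \max^{(k)}(\cdot),\ \max^{(k-1)}(\cdot),\ \ldots,\ \max^{(1)}(\cdot) \bigr\} \Bigr] = \max^{(k)} \Bigl[ \bigcup_{i=1}^{n} \bigcup_{j=1}^{k} \max^{(j)}_{\substack{(x-y) \in B_1 \\ (x-z) \in B_2}} \bigl( \{ k \diamond (f(x - y) + \lambda_i(y)) \} \cup \{ f(x - z) + \mu_i(z) \} \bigr) \Bigr], \]

where each $\max^{(j)}(\cdot)$ in the middle expression is taken over the same multiset $\{ k \diamond (f(x - y) + \lambda_i(y)) \} \cup \{ f(x - z) + \mu_i(z) \}$.

Equation (33) can be proven similarly. □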
Example 2: Let us consider the following image $f$ and soft structuring element $[\alpha, \beta]$:

Soft dilation at point (0, 0) for $k = 2$, according to Eq. (31), is

\[ f \oplus [\alpha, \beta, 2](0, 0) = \max^{(2)}(\{2 \diamond (14, 13)\} \cup \{16, 12, 12, 17\}) = \max^{(2)}(14, 14, 13, 13, 16, 12, 12, 17) = 16. \]

According to the proposed technique the structuring element is divided into three structuring elements:

The following multisets are obtained from the preceding structuring elements: $\{2 \diamond (14), 16\}$, $\{2 \diamond (13), 12\}$, and $\{12, 17\}$, for the first, the second, and the third structuring elements, respectively. From these multisets the $\max$ and $\max^{(2)}$ elements are retained: $\{16, 14\}$, $\{13, 13\}$, and $\{17, 12\}$. The $\max^{(2)}$ of the union of these multisets, that is, 16, is the result of soft dilation at point (0, 0). It should be noticed that although 16 is the $\max$ of the first multiset, it is also the $\max^{(2)}$ of the global multiset.
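The arithmetic of Example 2 can be replayed in a few lines. The sketch below uses only the numbers quoted in the example; the helper `kth_largest` and the variable names are illustrative. It computes the $\max^{(2)}$ of the global multiset directly and then again from the $k$ largest elements retained from each partial multiset.

```python
def kth_largest(values, k):
    """k-th largest element of a multiset given as a list."""
    return sorted(values, reverse=True)[k - 1]


k = 2

# Direct computation: core terms are repeated k times, boundary terms once.
core_terms = [14, 13]
boundary_terms = [16, 12, 12, 17]
global_multiset = core_terms * k + boundary_terms
direct = kth_largest(global_multiset, k)

# Decomposed computation: one multiset per partial structuring element.
partial_multisets = [[14] * k + [16], [13] * k + [12], [12, 17]]
retained = [sorted(m, reverse=True)[:k] for m in partial_multisets]
merged = [v for part in retained for v in part]
decomposed = kth_largest(merged, k)

print(direct, decomposed, retained)
```

Only $k$ values per partial structuring element have to be carried forward, because no element below a local $k$th largest value can be the global $k$th largest value.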
V. FUZZY SOFT MATHEMATICAL MORPHOLOGY

A. Definitions

Fuzzy soft mathematical morphology operations are defined taking into consideration that in soft mathematical morphology the structuring element is divided into two subsets, the core and the soft boundary, of which the core “weighs” more than the soft boundary in the formation of the final result. Depending on $k$, the $k$th order statistic provides the result of the operation. Also, fuzzy soft morphological operations should preserve the notion of fuzzy fitting (Sinha and Dougherty, 1992). Thus, the definitions for fuzzy soft erosion and fuzzy soft dilation are (Gasteratos et al., 1998a)

\[ \mu_{A \ominus [B_1, B_2, k]}(x) = \min\Bigl[1,\ \min^{(k)}_{y \in B_1,\ z \in B_2} \bigl( \{ k \diamond (\mu_A(x + y) - \mu_{B_1}(y) + 1) \} \cup \{ \mu_A(x + z) - \mu_{B_2}(z) + 1 \} \bigr) \Bigr] \tag{36} \]

and

\[ \mu_{A \oplus [B_1, B_2, k]}(x) = \max\Bigl[0,\ \max^{(k)}_{y \in B_1,\ z \in B_2} \bigl( \{ k \diamond (\mu_A(x - y) + \mu_{B_1}(y) - 1) \} \cup \{ \mu_A(x - z) + \mu_{B_2}(z) - 1 \} \bigr) \Bigr], \tag{37} \]

respectively, where $x, y, z \in Z^2$ are the spatial coordinates, and $\mu_A$, $\mu_{B_1}$, $\mu_{B_2}$ are the membership functions of the image, the core of the structuring element, and the soft boundary of the structuring element. Additionally, for the fuzzy structuring element $B \subseteq Z^2$: $B = B_1 \cup B_2$ and $B_1 \cap B_2 = \emptyset$. It is obvious that for $k = 1$ Eqs. (36) and (37) revert to Eqs. (24) and (25), respectively, that is, standard fuzzy morphology.
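A one-dimensional sketch of fuzzy soft erosion and dilation is given below. It assumes the forms $\min[1, \min^{(k)}(\cdot)]$ and $\max[0, \max^{(k)}(\cdot)]$ over multisets in which the core terms are repeated $k$ times. The numerical values are those that can be read off the worked calculations of Example 3, which follows: first-row memberships 0.3, 0.9, 0.9, 0.7 and a four-pixel horizontal structuring element with memberships 0.8, 1, 1, 0.8 whose two central pixels form the core. Treating them as a one-dimensional signal with the origin at the first core pixel is an assumption of this sketch.

```python
def kth_smallest(values, k):
    return sorted(values)[k - 1]


def kth_largest(values, k):
    return sorted(values, reverse=True)[k - 1]


def fuzzy_soft_erode_at(mu_a, core, soft, k, x):
    """min[1, k-th smallest of {k <> (mu_A(x+y) - mu_B1(y) + 1)}
    U {mu_A(x+z) - mu_B2(z) + 1}]; offsets leaving the signal are skipped."""
    terms = []
    for y, m in core.items():
        if 0 <= x + y < len(mu_a):
            terms += [mu_a[x + y] - m + 1.0] * k
    for z, m in soft.items():
        if 0 <= x + z < len(mu_a):
            terms.append(mu_a[x + z] - m + 1.0)
    return min(1.0, kth_smallest(terms, k))


def fuzzy_soft_dilate_at(mu_a, core, soft, k, x):
    """max[0, k-th largest of {k <> (mu_A(x-y) + mu_B1(y) - 1)}
    U {mu_A(x-z) + mu_B2(z) - 1}]; offsets leaving the signal are skipped."""
    terms = []
    for y, m in core.items():
        if 0 <= x - y < len(mu_a):
            terms += [mu_a[x - y] + m - 1.0] * k
    for z, m in soft.items():
        if 0 <= x - z < len(mu_a):
            terms.append(mu_a[x - z] + m - 1.0)
    return max(0.0, kth_largest(terms, k))


mu_a = [0.3, 0.9, 0.9, 0.7]       # first-row memberships used in Example 3
core = {0: 1.0, 1: 1.0}           # assumed offsets of the core pixels
soft = {-1: 0.8, 2: 0.8}          # assumed offsets of the soft boundary

for k in (1, 2):
    eroded = [fuzzy_soft_erode_at(mu_a, core, soft, k, x) for x in (0, 1)]
    dilated = [fuzzy_soft_dilate_at(mu_a, core, soft, k, x) for x in (0, 1)]
    print(k, [round(v, 3) for v in eroded], [round(v, 3) for v in dilated])
```

Under these assumptions the sketch gives the erosion values 0.3 and 0.5 for $k = 1$ and 0.3 and 0.9 for $k = 2$ at the first two points, and the dilation values 0.7 and 0.9 for $k = 1$ and 0.3 and 0.9 for $k = 2$, in agreement with the worked calculations of the example.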
Example 3: Let us consider the image A and the structuring element B. Fuzzy soft erosion and fuzzy soft dilation are computed for cases k = 1 and k = 2.
[Image $A$ and structuring element $B$: the matrix layout is not recoverable from the source; surviving entries: 0.1, 0.9, 0.9, 0.7, 0.8, 0.3, 0.8, 1.0, 0.2, 0.2, 0.2.]
In order to preserve the nature of soft morphological operations, the constraint $k \le \min\{\mathrm{Card}(B)/2, \mathrm{Card}(B_2)\}$ is adopted in fuzzy soft mathematical morphology as well as in soft mathematical morphology. In this example only the cases $k = 1$ and $k = 2$ are considered in order to comply with this constraint.

Case 1 ($k = 1$): The fuzzy soft erosion of the image is calculated as follows:

\[ \mu_E(0, 0) = \mu_{A \ominus [B_1, B_2, 1]}(0, 0) = \min[1, \min[0.3 - 1 + 1,\ 0.9 - 1 + 1,\ 0.9 - 0.8 + 1]] = 0.3 \]

\[ \mu_E(0, 1) = \min[1, \min[0.3 - 0.8 + 1,\ 0.9 - 1 + 1,\ 0.9 - 1 + 1,\ 0.7 - 0.8 + 1]] = 0.5 \]

\[ \vdots \]

\[ \mu_E(5, 2) = \min[1, \min[0.2 - 0.8 + 1,\ 0.2 - 1 + 1]] = 0.2. \]

Therefore, the eroded image is $\mu_{A \ominus [B_1, B_2, 1]}$ [matrix not recoverable from the source].
The values of the eroded image at points (0, 2) and (1, 2) are higher than the remaining values of the image, in accordance with the notion of fuzzy fitting, since only at these points does the structuring element fit better than at the other points of the image. Fuzzy erosion quantifies the degree of structuring element fitting. The larger the number of pixels of the structuring element, the more difficult the fitting. Furthermore, fuzzy soft erosion shrinks the image. If fuzzy image $A$ is considered as a noisy version of a binary image (Sinha and Dougherty, 1992), then the object of interest consists of points (0, 1), (0, 2), (0, 3), (0, 4), (1, 1), (1, 2), (1, 3), (1, 4), (2, 1), and (2, 2), and the rest is the background.
By eroding the image with a 4-pixel horizontal structuring element it would be expected that the eroded image would comprise points (0, 2) and (1, 2). This is exactly what has been obtained. Similarly, the dilation of the image is calculated as follows:

\[ \mu_D(0, 0) = \mu_{A \oplus [B_1, B_2, 1]}(0, 0) = \max[0, \max[0.3 + 1 - 1,\ 0.9 + 0.8 - 1]] = 0.7 \]

\[ \mu_D(0, 1) = \max[0, \max[0.3 + 1 - 1,\ 0.9 + 1 - 1,\ 0.9 + 0.8 - 1]] = 0.9 \]

\[ \vdots \]

\[ \mu_D(5, 2) = \max[0, \max[0.2 + 0.8 - 1,\ 0.2 + 1 - 1,\ 0.2 + 1 - 1]] = 0.2. \]

Therefore, the dilated image is [matrix not recoverable from the source; surviving entries: 0.6, 0.8, 1.0, 1.0, 0.8, 0.2].
As can be seen, fuzzy soft dilation expands the image. In other words, the dilated image includes the points of the original image and also points (0, 0), (0, 51, (1, O), (1, 51, ( 2 , O), (2, 3), and (2, 4). Case 2 (k = 2): The erosion of the image is calculated as follows: p ~ ( 0 , O= ) p ~ e p , , ~ ~ , 2 1 (= 0 ,min[l, 0) min(2’[0.3,0.3,0.9,0.9, 1.111 = 0.3 p ~ ( 01) , = min[l, min‘2’[0.5,0.9,0.9,0.9,0.9,0.9]]= 0.9
p ~ ( 52,) = min [l, min ‘2’[0.4,0.2, 0.211 = 0.2.
The eroded image for k = 2 is
79
SOFT MATHEMATICAL MORPHOLOGY
In this case the values of the eroded image at points (0, 1), (0, 2), (0, 3), (1, 1), (1, 2), and (1, 3) are higher than the remaining values of the image, in agreement with the notion of fuzzy soft fitting. At these points the number of "high-value" pixels, where the pixels combined with the core of the structuring element are counted k times and the pixels combined with its soft boundary are counted once, is greater than or equal to kCard[B1] + Card[B2] - k + 1. The dilation of the image is calculated similarly:
\[ \mu_D(0,0) = \mu_{A \oplus [B_1,B_2,2]}(0,0) = \max[0, \max^{(2)}[0.3, 0.3, 0.7]] = 0.3 \]
\[ \mu_D(0,1) = \max[0, \max^{(2)}[0.3, 0.3, 0.9, 0.9, 0.7]] = 0.9 \]
\[ \vdots \]
\[ \mu_D(5,2) = \max[0, \max^{(2)}[0.4, 0.2, 0.2, 0.2, 0.2]] = 0.2. \]
Therefore, the dilated image for k = 2 is
Here again fuzzy soft dilation expands the image, but more "softly" than when k = 1. This means that certain points ((0, 0), (1, 0), (2, 0), and (2, 4)) that were considered image points (when k = 1) now (k = 2) belong to the background. The greater the k, the less the effect of dilation. Finally, fuzzy soft opening and closing are defined as:
respectively. Basic fuzzy soft morphological operations are illustrated through one-dimensional and two-dimensional signals. Figure 3 depicts fuzzy soft morphological erosion and dilation in one-dimensional space. More specifically, Figure 3a shows the initial one-dimensional signal and fuzzy soft erosion for k = 1 and for k = 2. Figure 3b shows the initial one-dimensional signal and fuzzy soft
FIGURE 3. Illustration of one-dimensional fuzzy soft morphological operations and the effect of the order index k: (a) fuzzy soft erosion (curves: fuzzy image, erosion for k = 1, erosion for k = 2); (b) fuzzy soft dilation (curves: fuzzy image, dilation for k = 1, dilation for k = 2); (c) the structuring element.
dilation for k = 1 and for k = 2. Figure 3c shows the structuring element. The core of the structuring element is the shaded area, and the remaining area of the structuring element is its soft boundary. From Figures 3a and 3b it becomes clear that the action of the structuring element becomes more effective when k = 1, that is, the results of both fuzzy soft erosion and dilation are more visible
in the case k = 1 than in the case k = 2. Moreover, both erosion and dilation preserve the details of the original image better in the case k = 2 than in the case k = 1. Figure 4 presents the result of fuzzy soft morphological erosion and dilation on a two-dimensional image. More specifically, Figures 4a and 4b present the initial image and the structuring element, respectively. The image in Figure 4b has been considered as an array of fuzzy singletons (Goetcharian, 1980). The results of fuzzy soft erosion (k = 1) after the first and second iterations are presented in Figures 4c and 4d, respectively. The white area is reduced after each iteration. The white area of the eroded image (Figure 4c) is the area of the initial image where the structuring element fits better. Similarly, Figures 4e and 4f present the results of fuzzy soft erosion (k = 3) after the first and the second iteration, respectively. A comparison of Figures 4c and 4e makes it clear that the greater the k, the less visible the results of fuzzy soft erosion. Figures 4g and 4h depict the results of fuzzy soft dilation (k = 1) after the first and the second iteration, respectively. In the case of fuzzy soft dilation the white area increases. Similarly, Figures 4i and 4j show the results of fuzzy soft dilation (k = 3) after the first and the second iteration, respectively. Again, the greater the k, the less visible the results of fuzzy soft dilation.
B. Compatibility with Soft Mathematical Morphology
Let us consider Example 3. Thresholding image A and structuring element B (using a threshold equal to 0.5) produces the following binary image and binary structuring element:
FIGURE 4. (a) Image, (b) structuring element, (c) fuzzy soft erosion (k = 1) after the first iteration, (d) fuzzy soft erosion (k = 1) after the second iteration, (e) fuzzy soft erosion (k = 3) after the first iteration, (f) fuzzy soft erosion (k = 3) after the second iteration, (g) fuzzy soft dilation (k = 1) after the first iteration, (h) fuzzy soft dilation (k = 1) after the second iteration, (i) fuzzy soft dilation (k = 3) after the first iteration, and (j) fuzzy soft dilation (k = 3) after the second iteration.
Applying soft binary erosion and soft binary dilation to image A with structuring element B gives the following images for k = 1 and k = 2:
k = 1:
Soft erosion:
0.0 0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
Soft dilation:
1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 0.0
k = 2:
Soft erosion:
0.0 1.0 1.0 1.0 0.0 0.0
0.0 1.0 1.0 1.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
Soft dilation:
0.0 1.0 1.0 1.0 1.0 1.0
0.0 1.0 1.0 1.0 1.0 1.0
0.0 1.0 1.0 1.0 0.0 0.0
It is obvious that these results are identical with those of Example 3 when the same threshold value is used. This was expected, since binary soft morphology quantifies the soft fitting in a crisp way, whereas fuzzy soft erosion quantifies the soft fitting in a fuzzy way. The same results are obtained using a threshold of 0.55. However, when fuzzy soft morphology and thresholding with a threshold equal to or greater than 0.6 on the one hand and thresholding
with the same threshold and soft morphology on the other hand are applied, different results are obtained. This means that, in general, the operations do not commute.
C. Algebraic Properties of Fuzzy Soft Mathematical Morphology
1. Duality Theorem
Fuzzy soft erosion and dilation are dual operations:
\[ \mu_{A^c \oplus [-B_1,-B_2,k]}(x) = \mu_{(A \ominus [B_1,B_2,k])^c}(x). \]
Opening and closing are also dual operations:
\[ \mu_{(A \bullet [B_1,B_2,k])^c}(x) = \mu_{A^c \circ [-B_1,-B_2,k]}(x). \]
2. Translation Invariance
Fuzzy soft erosion and dilation are translation invariant:
where $u \in Z^2$.
3. Increasing
Both fuzzy soft erosion and dilation are increasing operations:
where A and A' are two images with membership functions $\mu_A$ and $\mu_{A'}$, respectively, and $\mu_A(x) \le \mu_{A'}(x)$, $\forall x \in Z^2$.
4. Distributivity
Fuzzy soft erosion is not distributive over intersection, as it is in standard morphology:
\[ \exists x \in Z^2 \ \text{and} \ \exists A_1, A_2, B \in Z^2 \mid \mu_{(A_1 \cap A_2) \ominus [B_1,B_2,k]}(x) \ne \mu_{(A_1 \ominus [B_1,B_2,k]) \cap (A_2 \ominus [B_1,B_2,k])}(x) \quad (44) \]
Example 4: Consider the following image A and structuring element B, where image A is the intersection of images A1 and A2.
[Membership arrays of Example 4, with A = A1 ∩ A2; surviving entries in reading order: 1.0, 0.8, 0.5, 0.8 = 0.4, 1.0, 1.0, 0.7, 0.8, 1.0 ∩ 1.0, 0.8, 0.5, 0.9, 0.4.]
The fuzzy soft erosion for k = 2 of A, A1, A2 and the intersection of the eroded A1 and the eroded A2 are
[Arrays of Eq. (45); surviving entries in reading order: 0.7; 0.6, 0.6, 0.5; 0.5; 0.9; 0.6; 0.5; 0.5; 0.6.] (45)
5. Anti-extensivity-Extensivity
Fuzzy soft opening is not anti-extensive. If it were anti-extensive, then $\mu_{A \circ [B_1,B_2,k]}(x) \le \mu_A(x)$, $\forall x \in Z^2$. In the following example it is shown that $\exists x \in Z^2 \mid \mu_{A \circ [B_1,B_2,k]}(x) > \mu_A(x)$.
Example 5: Consider the image A and the structuring element B, for k = 2. In this example
0.1
I I I I 0.9
0.9 0.9 0.1
which means that fuzzy soft opening is not anti-extensive.
Similarly, it is shown that, in general, fuzzy soft closing also is not extensive: $\exists x \in Z^2$ and $\exists A, B \in Z^2 \mid \mu_{A \bullet [B_1,B_2,k]}(x) < \mu_A(x)$.
6. Idempotency
In general, fuzzy soft opening is not idempotent:
\[ \exists x \in Z^2 \ \text{and} \ \exists A, B \in Z^2 \mid \mu_{A \circ [B_1,B_2,k]}(x) \ne \mu_{(A \circ [B_1,B_2,k]) \circ [B_1,B_2,k]}(x) \quad (46) \]
as illustrated by the following example:
Example 6: Consider the image A and the structuring element B, for k = 1. 1.0 1.0 0.0 0.0 0.0 1.0 1.0
PA
PB
c 4 0.0 0.5 0.0 0.4 0.0 0.0 0.0
PAOB
From this example it is obvious that fuzzy soft opening is not idempotent. By the duality theorem (Eq. (41)) it can be proved that, in general, fuzzy soft closing also is not idempotent:
\[ \exists x \in Z^2 \ \text{and} \ \exists A, B \in Z^2 \mid \mu_{A \bullet [B_1,B_2,k]}(x) \ne \mu_{(A \bullet [B_1,B_2,k]) \bullet [B_1,B_2,k]}(x) \quad (47) \]
VI. IMPLEMENTATIONS
Soft morphological operations are based on weighted order statistics, so sorting algorithms such as mergesort and quicksort, which have been used for the computation of weighted order statistics, can also be used for the computation of soft morphological filters (Kuosmanen and Astola, 1995). The average complexity of the quicksort algorithm is O(N log N), where N is the number of elements to be sorted (Pitas and Venetsanopoulos, 1990). Therefore, the average complexity of a soft morphological operation utilizing a soft structuring element [α, β, k]: B → Z is O((kCard[B1] + Card[B2]) log(kCard[B1] + Card[B2])).
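The sorting-based computation described above can be sketched in a few lines of Python. This is an illustrative one-dimensional model under stated assumptions, not the authors' code: the function names are hypothetical, the structuring element is passed as two dictionaries (offset to value) for the core and the soft boundary, and samples falling outside the signal are simply skipped.

```python
def soft_multiset(f, x, core, boundary, k, sign):
    """Multiset at position x: core terms repeated k times, boundary terms once.

    sign = +1 builds the dilation terms f(x - y) + s(y);
    sign = -1 builds the erosion terms f(x + y) - s(y).
    """
    values = []
    for element, copies in ((core, k), (boundary, 1)):
        for offset, s in element.items():
            idx = x - offset if sign > 0 else x + offset
            if 0 <= idx < len(f):  # assumption: ignore samples outside the signal
                values.extend([f[idx] + sign * s] * copies)
    return values


def soft_dilate(f, core, boundary, k):
    """k-th largest value of the multiset at every position."""
    out = []
    for x in range(len(f)):
        ms = sorted(soft_multiset(f, x, core, boundary, k, +1), reverse=True)
        out.append(ms[min(k, len(ms)) - 1])
    return out


def soft_erode(f, core, boundary, k):
    """k-th smallest value of the multiset at every position."""
    out = []
    for x in range(len(f)):
        ms = sorted(soft_multiset(f, x, core, boundary, k, -1))
        out.append(ms[min(k, len(ms)) - 1])
    return out


signal = [0, 0, 9, 0, 5, 0, 5, 0]
core = {0: 0}               # flat one-pixel core
boundary = {-1: 0, 1: 0}    # flat two-pixel soft boundary
dil_k1 = soft_dilate(signal, core, boundary, 1)
dil_k2 = soft_dilate(signal, core, boundary, 2)
ero_k1 = soft_erode(signal, core, boundary, 1)
print(dil_k1, dil_k2, ero_k1)
```

Each output sample costs one sort of kCard[B1] + Card[B2] values, which is where the complexity estimate above comes from. For k = 1 the sketch reduces to standard dilation and erosion.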
Hardware implementations of soft morphological operations include the threshold decomposition and the majority gate techniques. These structures, along with an algorithm based on a local histogram, are described in some detail in this section.
A. Threshold Decomposition
The threshold decomposition (Wendt et al., 1986) is a well-known technique for hardware implementation of nonlinear filters. The implementation of soft morphological filters in hardware using the threshold decomposition technique has been described by Shih and Pu (1995) and Pu and Shih (1995). According to this approach both the gray-scale image and the gray-scale structuring element are decomposed into $2^b$ binary images $f_i$ and $2^b$ structuring elements $\beta_i$, respectively. Binary soft morphological operations are performed on the binary images by the binary soft structuring elements, and then a maximum or a minimum is selected at each position, depending on whether the operation is soft dilation or soft erosion, respectively. Finally, the corresponding binary pixels are added. Figure 5 demonstrates this technique for soft dilation. The logic-gate implementations of binary soft morphological dilation and erosion are shown in Figures 6a and 6b, respectively. The parallel counter counts the number of "ones" of the input signal, and the comparator compares this number with the order index k and outputs 1 when it is greater than or equal to k. Because it is realized using simple binary structures, this technique can achieve high-speed computation, but it is hardware demanding. Its hardware complexity grows exponentially with both the structuring element size and the resolution of the pixels, that is, its hardware complexity is $O(2^N 2^b)$.
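The principle of threshold decomposition can be modeled in software. The Python sketch below is a simplified illustration, not the hardware described above: it assumes a flat structuring element (all values zero), so only the image is decomposed, and all function names are hypothetical. The binary operation mirrors the counter and comparator: it outputs 1 when the count of ones, with core pixels counted k times, reaches k.

```python
def binary_soft_dilate(bits, core, boundary, k):
    """Binary soft dilation: parallel counter followed by a comparison with k."""
    n = len(bits)
    out = []
    for x in range(n):
        count = 0
        for offsets, weight in ((core, k), (boundary, 1)):
            for y in offsets:
                if 0 <= x - y < n:
                    count += weight * bits[x - y]
        out.append(1 if count >= k else 0)
    return out


def soft_dilate_threshold_decomposition(f, core, boundary, k, levels):
    """Threshold at every level, filter each binary slice, add the results."""
    slices = [[1 if v >= t else 0 for v in f] for t in range(1, levels)]
    results = [binary_soft_dilate(s, core, boundary, k) for s in slices]
    return [sum(column) for column in zip(*results)]


def soft_dilate_direct(f, core, boundary, k):
    """Reference: k-th largest of the multiset, computed by sorting."""
    n = len(f)
    out = []
    for x in range(n):
        ms = []
        for offsets, copies in ((core, k), (boundary, 1)):
            for y in offsets:
                if 0 <= x - y < n:
                    ms.extend([f[x - y]] * copies)
        out.append(sorted(ms, reverse=True)[k - 1])
    return out


image = [0, 3, 7, 2, 2, 6, 1, 0]   # 3-bit samples, so 8 gray levels
core = (0,)
boundary = (-2, -1, 1, 2)
for k in (1, 2):
    decomposed = soft_dilate_threshold_decomposition(image, core, boundary, k, 8)
    assert decomposed == soft_dilate_direct(image, core, boundary, k)
```

The loop over `range(1, levels)` makes the cost of the decomposition explicit: one binary filter per gray level, hence the exponential dependence on the number of bits b.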
B. Majority Gate
1. Algorithm Description
The majority gate algorithm is an efficient bit-serial algorithm suitable for the computation of the median filter (Lee and Jen, 1992). According to this algorithm the most significant bits (MSBs) of the numbers within the data window are processed first. The other bits are then processed sequentially until the least significant bits (LSBs) are reached. Initially, a set of signals (named the rejecting flag signals) are set to "1." These signals indicate which numbers are candidates to be the median value. If the majority of the MSBs are found to be "1," then the MSB of the output is "1," otherwise it is "0." The majority is computed through a CMOS programmable device, shown in
FIGURE 5. Illustration of the threshold decomposition technique for soft dilation.
FIGURE 6. Implementation of binary (a) soft morphological dilation and (b) soft morphological erosion.
Figure 7. In the following stage the bits of the numbers whose MSBs have been rejected by means of the rejecting flag signals are not taken into account. The majority selection procedure continues until the median value is found. Gasteratos et al. (1997a) have proposed an improvement to this algorithm for the implementation of any rank filter, using a single hardware structure.
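The bit-serial majority procedure can be modeled as follows. This Python sketch is one common software formulation, not the CMOS circuit: the names are hypothetical, and it assumes that a rejected number keeps presenting the bit at which it was rejected, so that it continues to count as "larger" or "smaller" than the median in later majority votes.

```python
def bit_serial_median(values, bits):
    """Median of an odd number of unsigned integers, computed MSB first."""
    n = len(values)
    candidate = [True] * n   # rejecting flag signals, all initially "1"
    frozen = [0] * n         # bit presented by a number after it is rejected
    result = 0
    for j in range(bits - 1, -1, -1):
        column = [
            (v >> j) & 1 if is_candidate else frozen_bit
            for v, is_candidate, frozen_bit in zip(values, candidate, frozen)
        ]
        majority = 1 if sum(column) > n // 2 else 0
        result = (result << 1) | majority
        for i in range(n):
            if candidate[i] and column[i] != majority:
                candidate[i] = False    # this number cannot be the median
                frozen[i] = column[i]   # assumption: keep its last bit
    return result


window = [5, 3, 7, 1, 6]
median_value = bit_serial_median(window, bits=3)
print(median_value)
```

The number of iterations equals the number of bits and does not depend on the window size, which matches the bit-serial character of the algorithm.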
FIGURE 7. Programmable CMOS majority gate. (i_1, i_2, ..., i_N: inputs; o: output; Vdd, Vss: power supply.)
This improvement is based on the concept that by having a method to compute the median value of 4N + 1 numbers and by being able to control 2N of these numbers, any order statistic of the remaining 2N + 1 numbers can be determined. Suppose that there are W = 2N + 1 numbers x_i, the rth order statistic of which is required. The 2N + 1 inputs are the numbers x_i, whereas the rest are dummy inputs d_l (0 < l ≤ 2N). The binary values of the dummy inputs can be either "00...0" or "11...1", which implies that when the W' = 4N + 1 numbers are ordered in ascending sequence, the d_l are placed at the extremes of this sequence.
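The dummy-input idea can be illustrated with a short Python sketch. It is a software model under stated assumptions: the function name is hypothetical, r counts from the smallest value (r = 1 is the minimum), and the median of the padded list is taken by sorting rather than by a majority-gate network.

```python
def rank_via_median(xs, r, bits):
    """r-th smallest of the W = 2N + 1 numbers xs, using only a median of 4N + 1 inputs."""
    w = len(xs)                 # W = 2N + 1, assumed odd
    n_dummies = w - 1           # 2N dummy inputs
    low = w - r                 # dummies set to 00...0
    high = n_dummies - low      # dummies set to 11...1 (equals r - 1)
    padded = [0] * low + list(xs) + [(1 << bits) - 1] * high
    return sorted(padded)[len(padded) // 2]   # median of the W' = 4N + 1 inputs


numbers = [12, 200, 7, 45, 99]
ranks = [rank_via_median(numbers, r, bits=8) for r in range(1, 6)]
print(ranks)
```

Placing W - r dummies at the low extreme shifts the median of the padded sequence onto the r-th smallest of the original numbers, so one median structure serves every rank.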
2. Systolic Array Implementation for Soft Morphological Filtering
a. A Systolic Array for a 3 x 3 Structuring Element. A pipelined systolic array capable of computing soft gray-scale dilation/erosion on a 3 x 3-pixel image window using a 3 x 3-pixel structuring element, both of 8-bit resolution, is presented in Figure 8 (Gasteratos et al., 1998b). The central pixel of the structuring element is its core, whereas the remaining eight pixels constitute its soft boundary. The inputs to this array are the nine pixels of the image window, the nine pixels of the soft morphological structuring element, and a control signal MODE. Latches (L1) store the image window, latches (L*1) store the structuring element, and latch (L**1) stores the number k. Signal MODE is used to select the operation. When this is 1, soft dilation is performed,
FIGURE 8. Systolic array hardware structure implementing the majority-gate technique for soft morphological filtering. (Blocks: arithmetic unit, order statistic unit. p_i: image data; s_i: structuring element data; d_i: dummy numbers.)
whereas when it is 0, soft erosion is performed. Image data are collected through multiplexers MUX1, which are controlled by the signal MODE. The pixels of the structuring element remain either unchanged for the operation of dilation or they are complemented (by means of XNOR gates) for the operation of erosion. In the next stage of the pipeline, data are fed into nine adders. In the case of soft erosion the 2's complements of the pixel values of the structuring element are added to the image pixel values, which is equivalent to the subtraction operation. According to the constraint k ≤ min{Card(B)/2, Card(B2)}, in this case k is in the range 1 ≤ k ≤ 4. Table 1 shows the number of elements of the image data window contained in the list as well as the number of dummy elements. For soft dilation all the dummy inputs are pushed to the top, whereas for soft erosion they are pushed to the bottom. Thus, the appropriate result is obtained from the order statistic unit. A control unit that is a decoder controls an array of multiplexers MUX2 (its input is the number k). Its truth table is shown in Table 2. The control unit provides the input to the order statistic unit, either a dummy number or a copy of the addition/subtraction result of the core. The order statistic unit consists of identical processing elements (PEs) separated by latches (L**4 to L**11). The resolution of the latches, which hold the addition/subtraction results or the dummy numbers (L3 to L11), decreases

TABLE 1
USE OF DUMMY NUMBERS IN THE COMPUTATION OF WEIGHTED ORDER STATISTICS
k    Sequence of numbers    Dummy numbers
1    9                      8
2    10                     7
3    11                     6
4    12                     5

TABLE 2
TRUTH TABLE OF THE CONTROL UNIT
[Input: k (0001, 0010, 0011, 0100); outputs: i1 to i8. The table entries are not recoverable from this copy.]
FIGURE 9. The basic processing element (PE). (r: the rejecting flag signals; c: the setting flag signals; it: intermediate signals; b: the binary representation of the inputs; o_j: output.)
by one bit at each successive stage, since there is no need to carry the bits that have been already processed. On the other hand, the resolution of the latches that hold the result (L*4 to L*11) increases by one bit at each successive stage. The circuit diagram of this PE is shown in Figure 9. In this figure W' = 4N + 1. The 2N + 1 inputs are the numbers x_i, whereas the rest are the dummy inputs. Due to its simplicity this PE can attain very short processing times, independent of the data window size. Also, it is clear that the hardware complexity of the PE is linearly related to the number of its inputs.
b. Order Statistic Module Hardware Requirements for Other Structuring Elements. In this section a case study of the hardware requirements for the order statistic unit of a more complex structuring element is described.
FIGURE 10. (a) Structuring element; (b) arrangement of the dummy numbers in soft morphological dilation using the structuring element of (a); (c) arrangement of the dummy numbers in soft morphological erosion using the same structuring element.
The arithmetic unit consists of a number of adders/subtractors equal to the number of pixels of the structuring element. Figure 10a illustrates the structuring element. In this case, Card(B) = 16, Card(B1) = 12, Card(B2) = 4, and k ≤ min{8, 4}, that is, 1 ≤ k ≤ 4. When k = 4, the maximum number of elements of the multiset is Card(B2) + kCard(B1) = 52. The 49th (4th) order statistic of the multiset is sought. Thus, the total number of the inputs to the order statistic unit is 97. Forty-five dummy numbers are pushed to the top (bottom) in the operation of soft dilation (erosion). When k = 3, there are 40 elements of the multiset, and the 38th (3rd) order statistic is sought. Now, 46 dummy numbers are pushed to the top (bottom), and 11 to the bottom (top). In the same way, when k = 2, there are 28 elements of the multiset, and the 27th (2nd) order statistic is sought. Forty-seven dummy numbers are pushed to the top (bottom), and 22 to the bottom (top). Finally, when k = 1, there are 16 elements of the multiset, and the 16th (1st) order statistic is sought. In this case 48 dummy numbers are pushed to the top (bottom) and 33 to the bottom (top). An order statistic unit can be synthesized for any structuring element following the above procedure. In this case hardware complexity is linearly related both to the structuring element size and the resolution of the pixels, that is, the hardware complexity is O(Nb).
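The counts quoted above follow a simple pattern, which the Python sketch below reproduces. The function and key names are hypothetical, and the closed-form expressions are inferred from the worked numbers in the text rather than taken from a stated formula; they describe soft dilation, and "top" and "bottom" swap for soft erosion.

```python
def dummy_plan(card_b1, card_b2, k, k_max):
    """Input budget of the order statistic unit for soft dilation.

    card_b1 is the part of the structuring element whose values are repeated
    k times (12 in the text), card_b2 the part taken once (4 in the text).
    """
    multiset = k * card_b1 + card_b2
    half = k_max * card_b1 + card_b2 - k_max   # inputs on each side of the median
    total_inputs = 2 * half + 1
    top = half + 1 - k                         # dummies set to 11...1
    bottom = total_inputs - multiset - top     # dummies set to 00...0
    rank = multiset - k + 1                    # position of the k-th largest, counted from the smallest
    return {"multiset": multiset, "total": total_inputs,
            "top": top, "bottom": bottom, "rank": rank}


plans = {k: dummy_plan(12, 4, k, k_max=4) for k in (1, 2, 3, 4)}
for k, plan in plans.items():
    print(k, plan)
```

With the values of the 3 x 3 array of Table 1 (one repeated pixel, eight others, k_max = 4) the same function gives 17 inputs and 8, 7, 6, and 5 dummy numbers for k = 1 to 4.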
3. Architecture for Decomposition of Soft Morphological Structuring Elements
An architecture suitable for the decomposition of soft morphological structuring elements is depicted in Figure 11. The structuring element is loaded into the structuring element management module. This divides the structuring element into n smaller structuring elements and provides the appropriate one to the next stage. The pixels of the image are imported into the image window management module, which provides an image window that interacts with the appropriate structuring element provided by the structuring element management module. Both the previous modules consist of registers and multiplexers (MUXs) controlled by a mod n counter (Figure 12). The second stage, that is, the arithmetic unit, consists of adders/subtractors (dilation/erosion) and an array of MUXs that are controlled by the order index k, like the one shown in Figure 9. The MUXs provide the multiple copies of the addition/subtraction results to the next stage, that is, an array of order statistic modules (OSMs). The max^(l)/min^(l) results (l = 1, ..., k) of every multiset are collected through an array of registers. These registers provide the n x k max^(l)/min^(l) results of the n multisets concurrently to the last-stage OSM, which computes the final result according to Eqs. (32) and (33).
FIGURE 11. Architecture for the implementation of the soft morphological structuring element decomposition technique. (Blocks: image window management, structuring element management, arithmetic unit, decoder with input k, OSMs, OSM for k = k; output: image pixel.)
[FIGURE 12. Labels: clock, mod n counter, selection, array of registers, MUXs, input, output.]
C. Histogram Technique
One way to compute an order statistic is to sum the values in the local histogram until the desired order statistic is reached (Dougherty and Astola, 1994). However, instead of adding the local histogram values serially, a successive approximation technique can be adopted (Gasteratos and Andreadis, 1999), which ensures that the result is traced in a fixed number of steps. The number of steps is equal to the number b of bits per pixel. In a successive approximation technique the result is computed recursively; in each step of the process the N pixel values are compared with a temporal result. Pixel values that are greater than, less than, or equal to that temporal result are marked with labels GT, LT, and EQ, respectively. GT, LT, and EQ are Boolean variables. Pixel labels are then multiplied by the corresponding pixel weight (w_j). The sum of LTs and EQs determines whether the kth order statistic is greater than, less than, or equal to the temporal value. The pseudocode of the algorithm follows:
Notation: N: number of pixels; b: pixel value resolution (bits); im_1, im_2, ..., im_N: image pixels; w_1, w_2, ..., w_N: corresponding weights; k: the sought order statistic; temp: temporal result; o: output pixel.
    initial
        o = 0
        temp = 2^(b-1)
    begin
        for i = 1 to b do
        begin
            compare(im_1, im_2, ..., im_N : temp)
                (if im_j = temp then EQ_j = 1 else EQ_j = 0
                 if im_j < temp then LT_j = 1 else LT_j = 0)
            if sum_{j=1..N} w_j (LT_j + EQ_j) >= k and sum_{j=1..N} w_j LT_j < k
                then o <- temp
            elsif sum_{j=1..N} w_j LT_j >= k
                then temp <- temp - 2^(b-1-i)
                else temp <- temp + 2^(b-1-i)
        end
    end
A module utilizing standard comparators, adders/subtractors, multipliers, and multiplexers (for the "if" operations) can be used to implement this technique in hardware. Also, there are two ways to realize the algorithm. The first is through a loop that feeds the temp signal back to the input b times. Such a module is demonstrated in Figure 13. Its inputs are the addition or subtraction results of the image pixel value data with the structuring element pixel values, depending on whether the operation is soft dilation or soft erosion, respectively. Alternatively, b successive modules can be used to process the data in a pipeline fashion. The latter implementation is more hardware demanding but results in a faster hardware structure. The preceding algorithm requires a fixed number of steps equal to b. Furthermore, the number of steps grows linearly according to the pixel value resolution (O(b)). Its main advantage is that it can directly compute weighted
FIGURE 13. Block diagram of a hardware module for the computation of weighted order statistics, based on the local histogram-successive approximations technique.
rank order operations. This means that there is no need to reconstruct the local histogram according to the weights of the image pixels. Comparative experimental results using typical images showed that for 5 x 5 and larger image data windows the combined local histogram and successive approximation technique outperforms the existing quicksort algorithm for weighted order statistics filtering (Gasteratos and Andreadis, 1999).
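The successive approximation search described in this section can be modeled in software. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, pixel values are unsigned integers of the given bit width, and k counts from the smallest value of the weighted multiset.

```python
def weighted_order_statistic(pixels, weights, k, bits):
    """k-th smallest value of the multiset in which pixels[j] occurs weights[j] times."""
    out = 0
    temp = 1 << (bits - 1)
    for i in range(1, bits + 1):
        lt = sum(w for p, w in zip(pixels, weights) if p < temp)   # weighted LT sum
        eq = sum(w for p, w in zip(pixels, weights) if p == temp)  # weighted EQ sum
        step = (1 << (bits - 1)) >> i   # 2^(b-1-i); zero in the last iteration
        if lt < k <= lt + eq:
            out = temp                  # temp is the sought order statistic
        elif lt >= k:
            temp -= step                # order statistic lies below temp
        else:
            temp += step                # order statistic lies above temp
    return out


pixels = [12, 200, 7, 45, 99]
weights = [3, 1, 1, 2, 1]
expanded = sorted(p for p, w in zip(pixels, weights) for _ in range(w))
results = [weighted_order_statistic(pixels, weights, k, bits=8)
           for k in range(1, len(expanded) + 1)]
print(results)
```

The loop always runs b times and never expands the multiset, which reflects the two properties stressed above: a fixed number of steps and direct handling of the weights. The k-th largest value needed for soft dilation is obtained by asking for the (total weight - k + 1)-th smallest.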
VII. CONCLUDING REMARKS
Soft morphological filters are a relatively new subclass of nonlinear filters. They were introduced to improve the behavior of standard morphological filters in noisy environments. This paper has presented recent descriptions of soft morphological image processing. Fuzzy soft mathematical morphology applies the concepts of soft morphology to fuzzy sets. The definitions and the algebraic properties have been illustrated through examples and experimental results. Techniques for soft morphological structuring element decomposition and its hardware implementation have also been described. Soft morphological operations are based on weighted order statistics. Algorithms for implementing soft morphological operations include the well-known mergesort and quicksort algorithms for weighted order statistics computation. An approach based on a local histogram and a successive approximations technique also has been described. This algorithm demonstrates a great improvement in speed for a 5 x 5 image data window or larger. Soft morphological filters can be implemented in hardware using the threshold decomposition and the majority-gate techniques. The threshold decomposition technique is fast, but its hardware complexity is exponentially related to both the structuring element size and the resolution of the pixels. In the majority-gate algorithm the hardware complexity is linearly related to both the structuring element size and the resolution of the pixels.
REFERENCES
Bloch, I., and Maitre, H. (1995). Pattern Recognition 28, 1341-1387.
David, H. A. (1981). Order Statistics. Wiley, New York.
Dougherty, E. R., and Astola, J. (1994). Introduction to Nonlinear Image Processing. SPIE, Bellingham, Washington.
Gasteratos, A., and Andreadis, I. (1999). IEEE Signal Proc. Lett. 6, 84-86.
Gasteratos, A., Andreadis, I., and Tsalides, Ph. (1997a). Pattern Recognition 30, 1571-1576.
Gasteratos, A., Andreadis, I., and Tsalides, Ph. (1998a). IEE Proc. Vision, Image, Signal Process. 145, 40-49.
Gasteratos, A., Andreadis, I., and Tsalides, Ph. (1998b). IEE Proc. Circ. Dev. Sys. 145, 201-206.
Gasteratos, A., Andreadis, I., and Tsalides, Ph. (1998d). In: (H. J. A. M. Heijmans and J. B. T. M. Roerdink, Eds.) Mathematical Morphology and Its Applications to Image and Signal Processing. Kluwer, Dordrecht, The Netherlands, pp. 407-414.
Giardina, C. R., and Dougherty, E. R. (1988). Morphological Methods in Image and Signal Processing. Prentice-Hall, Englewood Cliffs, New Jersey.
Goetcharian, V. (1980). Pattern Recognition 12, 7-15.
Haralick, R. M., Sternberg, R., and Zhuang, X. (1986). IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 532-550.
Harvey, N. R. (1998). http://www.spd.eee.strath.ac.uk/~harve/bbc-epsrc.html.
Koskinen, L., Astola, J., and Neuvo, Y. (1991). Proc. SPIE Symp. Image Alg. Morph. Image Proc. 1568, 262-270.
Kuosmanen, P., and Astola, J. (1995). J. Math. Imag. Vision 5, 231-262.
Lee, C. L., and Jen, C. W. (1992). IEE Proc. G 139, 63-71.
Matheron, G. (1975). Random Sets and Integral Geometry. Wiley, New York.
Pitas, I., and Venetsanopoulos, A. N. (1990). IEEE Proc. 80, 1893-1921.
Pu, C. C., and Shih, F. Y. (1995). Graphical Models and Image Processing 57, 522-526.
Serra, J. (1982). Image Analysis and Mathematical Morphology. Vol. 1. Academic Press, London.
Shih, F. Y., and Pu, C. C. (1995). IEEE Trans. Signal Proc. 43, 539-544.
Sinha, D., and Dougherty, E. R. (1992). J. Visual Commun. Imag. Repres. 3, 286-302.
Wendt, P. D., Coyle, E. J., and Gallagher, N. C. Jr. (1986). IEEE Trans. Acoustics, Speech, Signal Proc. ASSP-34, 898-911.
ADVANCES IN IMAGING AND ELECTRON PHYSICS VOL. 110
Difference in the Aharonov-Bohm Effect on Scattering States and Bound States
SEIJI SAKODA (a) AND MINORU OMOTE (b)
(a) Department of Mathematics and Physics, National Defence Academy, Yokosuka 239-8686, Japan
(b) Department of Physics, Hiyoshi Campus, Keio University, Hiyoshi, Yokohama 223-8521, Japan
I. Introduction
II. AB Effect on Scattering States
    A. Influence of an Infinitely Thin Solenoid
    B. Influence of a Solenoid of Finite Radius
III. AB Effect on Bound States
    A. AB Effect on Energy Levels and Eigenfunctions
    B. AB Effect on Landau Levels
IV. AB Effect on a System of Both Bound States and Scattering States
    A. Time-Dependent Scattering Theory
    B. Bound States of ABC System in Two-Dim.
V. Gauge Invariance and Scattering Theory
VI. Concluding Remarks
Acknowledgments
Appendix I. Path Integral for a System in the AB Potential
Appendix II. Semiclassical Derivation of Shifted Landau Levels
Appendix III. Unitarity and Optical Theorem in Two-Dimensional Scattering
Appendix IV. S-matrix of the AB Scattering
Appendix V. Result of AB and the Unitarity of the S-Matrix
    A. Concise Form of AB's Wave Function
    B. Takabayashi's Derivation of AB's Wave Function
Appendix VI. Gordon's Method for Scattering by the Coulomb Potential
References
ADVANCES IN IMAGING AND ELECTRON PHYSICS, Volume 110. ISBN 0-12-014752-1. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved. ISSN 1076-5670/99 $30.00
I. INTRODUCTION
The concept of electromagnetic fields, conceived by Faraday to describe electromagnetic phenomena, was established by Maxwell and Hertz in the second half of the nineteenth century. The field strengths E and B were the foundation of the classical theory of electromagnetism. For example, a charged particle with electric charge e under the influence of E and B is subject to the Lorentz force
\[ F = eE(x,t) + \frac{e}{c}\, v \times B(x,t), \]
where x and v (= \dot{x}) denote the position and the velocity of the particle at time t, respectively. In the Schrödinger picture of nonrelativistic quantum mechanics, the equation of motion is a Schrödinger equation. For a charged particle, the time derivative $i\hbar\,\partial/\partial t$ and the momentum operator $\hat{p}$ in a Schrödinger equation for a neutral particle are replaced with $i\hbar\,\partial/\partial t - e\phi(\hat{x},t)$ and $\hat{p} - eA(\hat{x},t)/c$, respectively. The electromagnetic interaction is described by introducing a scalar potential $\phi(x,t)$ and a vector potential $A(x,t)$ to satisfy
\[ E(x,t) = -\frac{1}{c}\frac{\partial}{\partial t}A(x,t) - \nabla\phi(x,t), \qquad B(x,t) = \nabla \times A(x,t). \]
At first it may appear that E and B are being replaced as fundamental descriptors of electromagnetism by φ and A. However, it was understood that the field strengths E and B played a central role in the theory of electromagnetism in quantum theory as well as in classical theory, and that the potentials φ and A were mathematical auxiliaries, until Aharonov and Bohm (1959) noticed a quantum effect that could not be explained without using a vector potential, and a possibility that the classically less significant quantities A and φ, instead of E and B, might play the leading roles in quantum theory. The main reason why the classical view had been accepted for a long time was that all physically observable effects of the electromagnetic interaction seemed to be expressed in terms of the field strengths E and B. In addition, since E and B are invariant under a gauge transformation
A and φ are redundant in describing the electromagnetic interaction. This was another reason why E and B, rather than A and φ, were regarded to be physically significant. In 1959 Aharonov and Bohm proposed that in the quantum theory an electromagnetic effect caused by a vector potential could affect the phase factor of a wave function of a charged particle. They suggested an experiment to demonstrate an observable effect of a vector potential through a shift of electron interference fringe patterns. Based on a semiclassical calculation in their explanation of the idea, they found that an interference fringe can be affected by a vector potential even when an electron propagates outside the region B ≠ 0. Aharonov and Bohm also predicted that a shift of interference fringe is completely determined by a phase factor
where the contour of integration goes around the region B # 0 once counterclockwise. Following their argument on this effect of the vector potential we deal with only static fields and time-independent gauge transformations in the following. By setting a = -e@/(2xAc), where the whole magnetic flux in the region B # 0 is designated by @, we rewrite the phase factor Eq. ( 3 ) as e-2sia which indicates the absence of the effect if Q is an integer. This kind of effect caused by a vector potential is called the Aharonov-Bohm (AB) effect. Despite affirmative experimental tests (Chambers, 1960; Fowler et al., 1961; Boersch et al., 1961; Mollenstedt and Bayh, 1962) of the effect on the shift of interference fringe, which appeared soon after its prediction, some authors (Bocchieri and Loinger, 1978; Bocchieri et al., 1979; Roy, 1980) opposed the
SEIJI SAKODA AND MINORU OMOTE
existence of the AB effect by attributing the shift of fringes to leakage fields. Grounds for such an objection were removed by the elegant experiment of Tonomura et al. (1986), in which the flux quantization was also demonstrated. We refer the reader to the review of earlier experiments by Olariu and Popescu (1985, Sec. III). In the advancement of theoretical physics, a basic concept in developing theories of gauge fields has been that any observable effect of gauge fields $(A_\mu) = (\phi/c, \mathbf{A})$ is explained by the phase factor
In particular, for the lattice definition of gauge field (Wilson, 1974) and also the so-called nonintegrable phase factor (Yang, 1974; Wu and Yang, 1975)
was the key to the integral formalism for gauge fields as well as to the derivation of Dirac's quantization of the magnetic monopole. As a proof for the shift of fringes, Aharonov and Bohm discussed a scattering problem, which can be solved exactly, of a charged particle by an extremely thin solenoid. Many other authors have also considered the same problem from various viewpoints (Kretzschmar, 1965a; Berry, 1980; Henneberger, 1981; Ruijsenaars, 1983; Aharonov et al., 1984; Takabayashi, 1985; Nagel, 1985; Hagen, 1990; Jackiw, 1990; Stelitano, 1995; Giacconi et al., 1996; Sakoda and Omote, 1997; Arai and Minakata, 1998). Because the scattering cross section obtained by Aharonov and Bohm possessed the same periodic behavior with respect to $\alpha$ as was predicted for the shift of interference fringes, these two problems, the effect on an interference pattern and that on the scattering cross section, have been considered to be equivalent. Hence, the exact solution of the latter has been regarded as a proof of the former. Ruijsenaars (1983) and later Sakoda and Omote (1997) pointed out that the scattering amplitude is associated with a delta function of forward direction, $(\cos\pi\alpha - 1)\delta(\theta)$, in addition to the one obtained by Aharonov and Bohm, if a plane wave is taken as the incident wave of the Aharonov-Bohm scattering. If $\alpha$ is an odd integer, the S-matrix of this scattering problem is therefore equal to $-1$, and the scattering amplitude does not vanish. Sakoda and Omote (1997) also showed that the existence of this delta function is essential for the unitarity of the S-matrix. The difference between the results of Aharonov and Bohm and those of Ruijsenaars (1983) or of Sakoda and Omote (1997) stems from differences in the interpretation of the incident wave of the scattering problem. The AB effect on the scattering amplitude is therefore not
equivalent to that on the shift of the interference pattern if we adopt the interpretation that regards a plane wave as the incident wave. In addition to the effect in a scattering problem, there is also an AB effect on a system of bound states (Byers and Yang, 1961; Kretzschmar, 1965b; Peshkin, 1981; Lewis, 1983). Byers and Yang (1961) showed that the AB effect on the energy spectrum of such a system is a periodic and even function of $\alpha$ (a theorem of Byers and Yang). The idealized AB experiment, that is, an interference experiment of two well-separated coherent electron beams, is approximated by the situation considered in Byers and Yang (1961). Thus it shares the same property with the AB effect on bound states. In this situation, the absence of the AB effect for integral $\alpha$'s is often explained as a consequence of the gauge invariance of the theory by regarding a multiplication of a wave function by a local phase factor $e^{-i\alpha\varphi}$ as a gauge transformation. Since the theorem of Byers and Yang is concerned only with a discrete series of eigenvalues of a Hamiltonian, that is, bound states, the analysis of a scattering problem is out of the range to which the theorem applies. Another issue in which the phase factor Eq. (3) plays a role is the representation theory of the canonical commutation relations. Taking an infinitely thin solenoid as an example, let us consider this issue briefly. A vector potential for describing a solenoid is given by
$$ \mathbf{A}(\mathbf{x}) = \frac{\Phi}{2\pi r}(-\sin\varphi, \cos\varphi), \qquad (x, y) = (r\cos\varphi, r\sin\varphi), \tag{6} $$
excepting the origin. When $\alpha$ is an integer, a representation of the commutation relations among position operators $\hat{\mathbf{x}}$ and velocity operators $\hat{\mathbf{v}} = (\hat{\mathbf{p}} - e\mathbf{A}(\hat{\mathbf{x}})/c)/\mu$ on a Hilbert space of functions spanned by $e^{i\mathbf{k}\cdot\mathbf{x} - i\alpha\varphi}$ ($\mathbf{k} \in \mathbb{R}^2$) can be regarded as an equivalent representation of the canonical commutation relations of $\hat{\mathbf{x}}$ and $\hat{\mathbf{p}}$. To see this, let us introduce an operator that is unitary at least formally, $e^{-i\mu\hat{\mathbf{v}}\cdot\mathbf{a}/\hbar}$, where $\mathbf{a} = (a_1, a_2) \in \mathbb{R}^2$ is a set of parameters. A matrix element of this operator between two eigenvectors of position operators is found by a simple calculation to be
provided that we do not encounter the singularity of the vector potential on the line segment from $\mathbf{x}'$ to $\mathbf{x}' + \mathbf{a}$. Taking two positive real numbers $a$ and $b$ and letting $\mathbf{a} = (a, 0)$ and $\mathbf{b} = (0, b)$ define a rectangular region with corners $\mathbf{x}'$, $\mathbf{x}' + \mathbf{a}$, $\mathbf{x}' + \mathbf{a} + \mathbf{b}$, and $\mathbf{x}' + \mathbf{b}$, we find
where the contour of integration is taken along the edges of the rectangular region. If the solenoid lies in the interior of this rectangular region, the phase factor on the right-hand side of Eq. (8) becomes $e^{-2\pi i\alpha}$; otherwise it is unity. Thus we see that the set of position operators and velocity operators fulfills the Weyl form of the canonical commutation relations if and only if $\alpha$ is an integer. The concept of equivalent or inequivalent representations of the canonical commutation relations in terms of velocity operators with a singular vector potential has been developed and generalized to relativistic quantum mechanics and even to the case of non-Abelian gauge fields by Reeh (1989) and Arai (1992, 1993, 1995). In view of the preceding result, we may regard a Hamiltonian involving a vector potential
where the vector potential $\mathbf{A}$ is given by Eq. (6), to be equivalent to the one without a vector potential if $\alpha$ is an integer. Hence, no AB effect is observed. There are problems with the preceding argument when it is applied to the scattering problem: (i) If we cannot distinguish any Hamiltonian in the form of Eq. (9) with an integral $\alpha$ from the one with $\alpha = 0$, what would be a suitable choice for a free Hamiltonian in constructing a scattering theory of interacting ($\alpha \neq 0$) Hamiltonians? (ii) The equivalence holds only for the case of an infinitely thin solenoid and is of no use when we consider a solenoid of finite radius. The first problem is closely related to the disagreement between the two interpretations of the incident wave discussed previously, while the second one is often disregarded in the gauge invariance argument in the literature. We may conclude that the AB effect is absent for integral $\alpha$'s even in the scattering problem only if we are fully convinced that we always obtain the same result (S-matrix) regardless of the choice above. The unitary equivalence of two Hamiltonians described above does not mean the coincidence of two S-matrices defined in two ways for a given $\alpha$ by taking a pair of different integer values of $\alpha$ as free Hamiltonians. Nevertheless, it has often been concluded that an observation of the unitary equivalence leads to the vanishing of the scattering amplitude given by Aharonov and Bohm for an integral $\alpha$. Given the results on the AB effect on bound states as well as on the equivalence of representations of the canonical commutation relations, much remains to be done on the AB effect in a scattering problem, since neither the known AB effect on bound states nor the unitary equivalence helps to elucidate the effect in scattering states. Thus, in this chapter, we consider the scattering problem by choosing and fixing the Hamiltonian of $\alpha = 0$ as the free Hamiltonian.
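The dependence of the phase factor on whether the rectangle encloses the solenoid, used in the representation argument above, can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = c = e = 1$ (so that $\Phi = -2\pi\alpha$) and hypothetical helper names `a_field`, `loop_integral`, and `weyl_phase` that are introduced here only for illustration; it integrates the vector potential of Eq. (6) along the edges of a rectangle by the midpoint rule.

```python
import cmath
import math


def a_field(x, y, flux):
    """Vector potential of Eq. (6): (flux / 2 pi r) * (-sin(phi), cos(phi)), away from the origin."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))


def loop_integral(corner, a, b, flux, steps=2000):
    """Midpoint-rule line integral of A counterclockwise around the rectangle
    corner -> corner + (a, 0) -> corner + (a, b) -> corner + (0, b) -> corner."""
    x0, y0 = corner
    vertices = [(x0, y0), (x0 + a, y0), (x0 + a, y0 + b), (x0, y0 + b), (x0, y0)]
    total = 0.0
    for (xs, ys), (xe, ye) in zip(vertices, vertices[1:]):
        dx, dy = (xe - xs) / steps, (ye - ys) / steps
        for i in range(steps):
            t = (i + 0.5) / steps
            ax, ay = a_field(xs + t * (xe - xs), ys + t * (ye - ys), flux)
            total += ax * dx + ay * dy
    return total


def weyl_phase(corner, a, b, alpha):
    """Phase factor exp(i * loop integral of A) in units hbar = c = e = 1 (assumption)."""
    flux = -2 * math.pi * alpha
    return cmath.exp(1j * loop_integral(corner, a, b, flux))
```

With these assumptions, a rectangle that contains the origin gives a phase close to $e^{-2\pi i\alpha}$, which equals unity only for integral $\alpha$, while a rectangle that does not contain the origin gives a phase close to unity for any $\alpha$.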
As is well known, there are two approaches to dealing with a scattering problem in quantum theory. The first is to find a stationary state describing the scattering process by
solving a time-independent Schrödinger equation. The second approach is to study the time development of a wave packet with respect to a time-dependent Schrödinger equation. Aharonov and Bohm as well as most authors have analyzed the scattering by means of the first approach, whereas others have discussed the same problem using the second approach (Kretzschmar, 1965a; Ruijsenaars, 1983; Stelitano, 1995). In order to compare our results with the result of Aharonov and Bohm and others, we first consider the problem for an infinitely thin solenoid using the Lippmann-Schwinger equation in section II.A. We solve the same problem for a solenoid of finite radius in section II.B, borrowing the idea of Gordon (1928) for solving the Rutherford scattering. By solving the problem for a solenoid of finite size, we can reproduce the result for an infinitely thin solenoid, in agreement with that of section II.A, which convinces us of the validity of these two methods. Section III is devoted to an investigation of the AB effect on bound states. Differences and similarities between the AB effect found in a system of bound states and that in scattering states are discussed there. Then we analyze a system that possesses both bound states and scattering states simultaneously by calculating the S-matrix of such a system in section IV. We discuss gauge invariance and its meaning in connection with the scattering problem in section V. In section VI, we give an overview of the results and briefly discuss the relation between Aharonov and Bohm's scattering amplitude and the S-matrix of the scattering problem. Some related but rather technical topics are discussed in appendices.
Although there are many other related topics worth examining, such as geometric phases in quantum mechanics (Berry, 1984; Simon, 1983; Shapere and Wilczek, 1989), self-adjointness of the Hamiltonian operator (Doebner et al., 1989; de Sousa Gerbert, 1989; Hagen, 1991; Bordag and Voropaev, 1993; Audretsch et al., 1995; Magni and Valz-Gris, 1995; Park, 1995; Odaka and Satoh, 1997; Dąbrowski and Šťovíček, 1998), in particular for Hamiltonians with spin effect, and the AB effect with other coexisting interactions (Guha and Mukherjee, 1987; Kibler and Negadi, 1987; Sökmen, 1988; Chetouani et al., 1989; Doebner and Papp, 1990; Drăgănascu et al., 1992; Hagen, 1993; Villalba, 1995; Lin, 1998; Park and Yoo, 1998; Roy and Singh, 1983), some of them are beyond the scope of this paper.

II. AB EFFECT ON SCATTERING STATES
In this section we consider the scattering problem (AB scattering) first studied by Aharonov and Bohm, in two ways and in two situations. We first consider the problem of an infinitely thin solenoid in terms of the Lippmann-Schwinger (LS) equation, then reconsider the same problem using a rather elementary but powerful method proposed by Gordon for solving Rutherford scattering. It is easy to apply the latter even to the problem of a solenoid with finite radius.

A. Influence of an Infinitely Thin Solenoid

1. Incident Waves in AB Scattering
In order to study the scattering of particles we solve a time-independent Schrödinger equation
$$ \hat{H}\psi(\mathbf{x};\mathbf{k}) = E\psi(\mathbf{x};\mathbf{k}), \qquad E = \frac{\hbar^2k^2}{2\mu}, \qquad \mathbf{k} = (k\cos\varphi_k, k\sin\varphi_k), \tag{10} $$
where $\hbar\mathbf{k}$ denotes the momentum of an incident particle. The Hamiltonian for the AB scattering is given by Eq. (9) with the AB potential Eq. (6). In terms of two-dimensional polar coordinates, Eq. (10) is explicitly written as
Since the Hamiltonian commutes with the angular momentum, it can easily be shown that one of the most general solutions of Eq. (10) is given by
where $J_\nu(x)$ denotes the Bessel function of $\nu$th order. Here, the singular solution is omitted because of the physical requirement that the solenoid be impenetrable. In Eq. (12), the $c_n$'s are arbitrary constants to be determined by physical requirements. In finding an eigenfunction that describes the scattering process of particles, it is essential to impose on a solution of Eq. (10) a boundary condition that represents the physical situation. Our main interest here and in the following discussion is to determine the correct choice of the $c_n$'s to describe the scattering process. It has been asserted by Aharonov and Bohm and many other authors that the incident wave for this scattering problem, when the incident beam comes in from the positive $x$-axis, that is, $\mathbf{k} = (-k, 0)$, should be a modulated plane wave $e^{-ikr\cos\varphi - i\alpha\varphi}$ to make the probability current of the incident wave constant. To fulfill this requirement the coefficients have been taken to be $c_n = (-i)^{|n+\alpha|}$. In the case of nonintegral $\alpha$, however, the incident wave becomes a multivalued function. From the physical point of view it seems quite unsatisfactory to take such a multivalued wave function as an incident one.
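The multivaluedness of the modulated plane wave can be made concrete with a short numerical illustration. The sketch below, with hypothetical function names introduced only for this purpose, evaluates $e^{-ikr\cos\varphi - i\alpha\varphi}$ at $\varphi = 0$ and at $\varphi = 2\pi$, which label the same point of the plane; the two values differ by $|e^{-2\pi i\alpha} - 1| = 2|\sin\pi\alpha|$.

```python
import cmath
import math


def modulated_plane_wave(k, r, phi, alpha):
    """Modulated plane wave exp(-i k r cos(phi) - i alpha phi)."""
    return cmath.exp(-1j * k * r * math.cos(phi) - 1j * alpha * phi)


def branch_mismatch(k, r, alpha):
    """Magnitude of the jump between phi = 0 and phi = 2 pi at the same point."""
    return abs(modulated_plane_wave(k, r, 2 * math.pi, alpha)
               - modulated_plane_wave(k, r, 0.0, alpha))
```

The mismatch vanishes for integral $\alpha$ and is largest for half-integral $\alpha$.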
Also, for sufficiently large $r$, the term due to the vector potential does not contribute to the dominant part of the current, even if we take a plane wave as the incident wave function. Thus, one cannot argue conclusively that the incident modulated wave gives the condition for determining those constants as $c_n = (-i)^{|n+\alpha|}$.

2. The Lippmann-Schwinger Equation

In this section we obtain the wave function for the scattering state of charged particles scattered by an infinitely thin solenoid. It is known that for the present problem the Born approximation fails to give a reliable answer because we cannot avoid a divergent integral $\int dr'\,J_0^2(kr')/r'$ even in its first order (Aharonov et al., 1984; Nagel, 1985). The iterative method for solving the LS equation is also unsatisfactory for the same reason. In other words, replacement of $n^2/r^2$ with $(n+\alpha)^2/r^2$ in the radial Schrödinger equation causes a nonperturbative effect, as becomes evident if we consider a relation between $J_n(x)$ and $J_{|n+\alpha|}(x)$ in terms of a formal series in powers of $\alpha$. Therefore, we need to solve the LS equation in an exact way. We will do this with the aid of the Feynman kernel given in Eq. (18). To solve an LS equation exactly, we need a Green's function
$$ \langle \mathbf{x} | (E - \hat{H} + i\varepsilon)^{-1} | \mathbf{x}' \rangle $$
of the total Hamiltonian $\hat{H}$ instead of that of the free Hamiltonian used in a perturbative calculation. Fortunately, we can obtain the one for the case of the AB potential, since we can find the complete set of the eigenfunctions
$$ \frac{1}{\sqrt{2\pi}}\, e^{in\varphi} J_{|n+\alpha|}(kr), \qquad n = 0, \pm 1, \pm 2, \ldots, $$
for the Hamiltonian with effects of the solenoid. In terms of the eigenfunctions, we can immediately find an expression of the Feynman kernel
$$ K(\mathbf{x}_F, \mathbf{x}_I; t) = \langle \mathbf{x}_F | e^{-i\hat{H}t/\hbar} | \mathbf{x}_I \rangle \tag{13} $$
as its spectral representation
To verify this expression, it is sufficient to note that (i) it obeys the time-dependent Schrödinger equation, (ii) it is evidently single-valued with respect to both $\mathbf{x}_F$ and $\mathbf{x}_I$, and (iii) it has the correct limit
which follows from
$$ \frac{1}{a}\,\delta(a - b) = \int_0^\infty k\,dk\,J_\nu(ak)J_\nu(bk) \qquad [\mathrm{Re}(\nu) > -1,\ a, b > 0], \tag{16} $$
and
If we carry out the integration with respect to k in Eq. (14), we obtain
For a path integral derivation of the Feynman kernel in Eq. (18), see Appendix I, in which we review the treatment of Hamiltonians with the AB effect in terms of the Euclidean path integral in order to show its ability to exhibit the AB effect on systems with bound states. Readers interested in other derivations of the AB effect by the path integral method should consult Schulman (1971, 1981), Bernido and Inomata (1981), Morandi and Menossi (1984), Gerry and Singh (1979), Shiekh (1986), and Ohnuki (1986). Because the information about the time evolution of the system is completely described by Eq. (14) or (18), we can immediately proceed to solve the scattering problem using the LS equation or by means of a time-dependent description of the process. We postpone the latter until section IV, in which we do a time-dependent analysis of a system of both scattering states and bound states under the influence of the AB potential. The LS equation for the system is given by
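$$ \psi(\mathbf{x};\mathbf{k}) = u(\mathbf{x};\mathbf{k}) + \psi_s(\mathbf{x};\mathbf{k}), \qquad \psi_s(\mathbf{x};\mathbf{k}) = \int d^2x'\, \langle \mathbf{x} | (E - \hat{H} + i\varepsilon)^{-1} | \mathbf{x}' \rangle\, \frac{\hbar^2}{2\mu r'^2}\left( -2i\alpha\frac{\partial}{\partial\varphi'} + \alpha^2 \right) u(\mathbf{x}';\mathbf{k}), \tag{19} $$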
where we have taken an incident plane wave $u(\mathbf{x};\mathbf{k}) = e^{ikr\cos\theta}$ ($\theta = \varphi - \varphi_k$) as an eigenstate of the free Hamiltonian, and $\varphi_k$ indicates the direction of the incident beam. From Eq. (18), we can easily obtain the Green's function in the above by a Laplace transform
By substituting $E = \hbar^2k^2/2\mu$, we obtain
where we used the formula
and Q(x) is the step function. If we substitute +i& in the above instead of - k , we obtain
which will be useful in calculating a Green's function for the boundary condition of incoming waves. By solving LS equations for both outgoing and incoming boundary conditions, we will obtain two wave functions, $\psi^{(+)}$ and $\psi^{(-)}$, in Appendix IV, so that we can find an S-matrix element as an inner product of these wave functions. Here we deal only with the LS equation of the outgoing boundary condition to parallel the argument on the same problem given by Aharonov and Bohm. For convenience, we provide a quick review of Takabayashi's derivation of the scattering amplitude of AB and its application for the incoming boundary condition in Appendix V.B. Substituting Eq. (21) and the partial wave expansion of the plane wave $u(\mathbf{x};\mathbf{k})$ into the integrand of $\psi_s(\mathbf{x};\mathbf{k})$, we obtain
where
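To evaluate these integrals we make use of an indefinite integral formula for cylindrical functions (represented by $Z_\mu$ and $\tilde{Z}_\nu$ for the sake of convenience)
$$ \int \frac{dx}{x}\, Z_\mu(ax)\tilde{Z}_\nu(ax) = -\frac{ax}{\mu^2 - \nu^2}\left\{ Z_{\mu+1}(ax)\tilde{Z}_\nu(ax) - Z_\mu(ax)\tilde{Z}_{\nu+1}(ax) \right\} + \frac{Z_\mu(ax)\tilde{Z}_\nu(ax)}{\mu + \nu} $$
to obtain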
and
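Then, it is straightforward to find
$$ \left\{ A_n(r)H^{(1)}_{|n+\alpha|}(kr) + B_n(r)J_{|n+\alpha|}(kr) \right\} e^{in\theta + i|n|\pi/2} = -J_{|n|}(kr)\,e^{in\theta + i|n|\pi/2} + J_{|n+\alpha|}(kr)\,e^{in\theta + in\pi - i|n+\alpha|\pi/2}, \tag{29} $$
with the aid of Lommel's formula
$$ J_{\nu+1}(x)H^{(1)}_\nu(x) - J_\nu(x)H^{(1)}_{\nu+1}(x) = \frac{2i}{\pi x}. $$
Using Eq. (29) we can rewrite the sum in Eq. (24) as
$$ \psi_s(\mathbf{x};\mathbf{k}) = \sum_{n=-\infty}^{+\infty} \left\{ J_{|n+\alpha|}(kr)\,e^{in\pi - i|n+\alpha|\pi/2} - J_{|n|}(kr)\,e^{i|n|\pi/2} \right\} e^{in\theta}. \tag{30} $$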
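Lommel's formula quoted above can be verified numerically for nonintegral order. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, the Bessel function is evaluated from its power series (adequate only for moderate $x$), and the Hankel function is built from the standard connection formula $Y_\nu = (J_\nu\cos\nu\pi - J_{-\nu})/\sin\nu\pi$, which requires nonintegral $\nu$.

```python
import math


def bessel_j(nu, x, terms=40):
    """Power series of J_nu(x); assumes moderate x > 0 and nu not a negative integer."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.gamma(m + nu + 1))
                  * (x / 2) ** (2 * m + nu))
    return total


def bessel_y(nu, x):
    """Neumann function for nonintegral nu from the connection formula."""
    return ((bessel_j(nu, x) * math.cos(nu * math.pi) - bessel_j(-nu, x))
            / math.sin(nu * math.pi))


def hankel1(nu, x):
    """Hankel function of the first kind, H^(1)_nu = J_nu + i Y_nu."""
    return complex(bessel_j(nu, x), bessel_y(nu, x))


def lommel_lhs(nu, x):
    """Left-hand side J_{nu+1} H^(1)_nu - J_nu H^(1)_{nu+1} of Lommel's formula."""
    return bessel_j(nu + 1, x) * hankel1(nu, x) - bessel_j(nu, x) * hankel1(nu + 1, x)
```

For any admissible $\nu$ and $x$, `lommel_lhs(nu, x)` should agree with $2i/(\pi x)$ up to rounding errors.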
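We thus find that the total wave function for the scattering state is given by
$$ \psi(\mathbf{x};\mathbf{k}) = \sum_{n=-\infty}^{+\infty} J_{|n+\alpha|}(kr)\,e^{in\theta + in\pi - i|n+\alpha|\pi/2}. \tag{31} $$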
It is very interesting to note that putting $\varphi_k = \pi$ makes the solution of the LS equation coincide with the wave function obtained by Aharonov and Bohm (1959). However, we must remember that in solving the LS equation we take a plane wave as the incident wave and the resulting scattered wave is given by Eq. (30). Consequently, despite the coincidence of the whole wave function of the scattering state, both the incident wave and the scattered wave in this section are different from those of Aharonov and Bohm.
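For integral $\alpha$ the partial wave sum of Aharonov and Bohm, $\sum_n (-i)^{|n+\alpha|}J_{|n+\alpha|}(kr)e^{in\varphi}$, can be summed in closed form: shifting the summation index gives $e^{-i\alpha\varphi}e^{-ikr\cos\varphi}$ by the Jacobi-Anger expansion. The sketch below checks this numerically; it is only an illustration, with hypothetical function names, integer-order Bessel functions computed from Bessel's integral by the midpoint rule, and a truncation of the sum at order `nmax` that is assumed to be large compared with $kr$.

```python
import cmath
import math


def bessel_j_int(n, x, samples=400):
    """J_n(x) for integer n >= 0 from (1/pi) * integral_0^pi cos(n t - x sin t) dt."""
    h = math.pi / samples
    return sum(math.cos(n * (i + 0.5) * h - x * math.sin((i + 0.5) * h))
               for i in range(samples)) * h / math.pi


def ab_wave_integer_alpha(kr, phi, m, nmax=40):
    """Truncated sum over n of (-i)^|n+m| J_|n+m|(kr) exp(i n phi) for integer alpha = m."""
    total = 0j
    for order in range(-nmax, nmax + 1):  # order stands for n + m
        n = order - m
        total += ((-1j) ** abs(order) * bessel_j_int(abs(order), kr)
                  * cmath.exp(1j * n * phi))
    return total


def modulated_plane_wave(kr, phi, m):
    """Closed form exp(-i m phi) * exp(-i kr cos(phi))."""
    return cmath.exp(-1j * m * phi - 1j * kr * math.cos(phi))
```

The agreement for integral $\alpha$ illustrates why the total wave function alone does not decide which part of it is to be regarded as the incident wave.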
3. The Scattering Amplitude

Next we proceed to find the differential cross section by using the scattered wave, Eq. (30). From Eqs. (27) and (28) we notice that $A_n(r) = O((kr)^0)$, while $B_n(r) = O((kr)^{-1})$ for large $kr$. We then easily obtain the asymptotic form of the scattered wave $\psi_s(r, \varphi)$ from Eq. (24) as
where $A_n(\infty)$ is found from Eq. (27) to be
$$ A_n(\infty) = -i\sin\{(|n+\alpha| - |n|)\pi/2\}. \tag{33} $$
Using the asymptotic form of the Hankel functions and Eq. (33) for $A_n(\infty)$ in Eq. (32), we obtain
where the phase shift in the $n$th partial wave is given by
$$ \delta_n(\alpha) = \begin{cases} -\pi\alpha/2 & (n + [\alpha] \ge 0) \\ +\pi\alpha/2 & (n + [\alpha] < 0). \end{cases} \tag{35} $$
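The piecewise form of the phase shift can be compared with the difference of Bessel orders appearing in Eq. (33). The sketch below assumes, as an illustration that is not taken verbatim from the text, that the phase shift is read off as $(|n| - |n+\alpha|)\pi/2$, and checks for a positive $\alpha$ that the partial wave factor $e^{2i\delta_n}$ agrees with the one given by Eq. (35); the two phase shifts themselves may differ by a multiple of $\pi$. The function names are hypothetical.

```python
import cmath
import math


def phase_shift(n, alpha):
    """Phase shift read off from the Bessel orders: (|n| - |n + alpha|) * pi / 2."""
    return (abs(n) - abs(n + alpha)) * math.pi / 2


def phase_shift_eq35(n, alpha):
    """Piecewise phase shift of Eq. (35); [alpha] is taken as floor(alpha), assuming alpha > 0."""
    return -math.pi * alpha / 2 if n + math.floor(alpha) >= 0 else math.pi * alpha / 2


def partial_s(delta):
    """Partial-wave S-matrix factor exp(2 i delta)."""
    return cmath.exp(2j * delta)
```

Note that the factor does not approach unity as $|n|$ grows, which is why the sum over partial waves requires a regularization.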
Here and in the following discussion we denote the integral part of $\alpha$ by $[\alpha]$ and its nonintegral part by $\{\alpha\}$ to write $\alpha = [\alpha] + \{\alpha\}$. Because the phase shift does not decrease at all when $|n|$ becomes large, we introduce a regularization parameter $\varepsilon$ for the sum in Eq. (34) so that it is defined as an Abel sum, and define $f(\theta)$ as
Then, calculating the sum of the geometric series and making use of the symbolic relation
$$ \lim_{\varepsilon\to+0} \frac{1}{x - a \pm i\varepsilon} = \mathrm{P}\frac{1}{x - a} \mp i\pi\delta(x - a), $$
where P denotes the principal value, we finally obtain, in terms of the scattering amplitude $f(\theta)$,
Although the total wave function happens to have exactly the same form as the result of Aharonov and Bohm mentioned above, the scattering amplitude Eq. (37) differs from theirs by the $\delta$-function term. Nevertheless, this difference can be neglected when we consider the differential cross section only for the nonforward direction ($\theta \neq 0$), since we cannot well separate the scattered and unscattered particles in the forward direction experimentally. The differential cross section for the nonforward direction is thus given by
$$ d\sigma(\alpha) = \frac{1}{2\pi k}\,\frac{\sin^2\pi\alpha}{\sin^2(\theta/2)}\,d\theta. \tag{38} $$
In this sense the scattering amplitude of Aharonov and Bohm describes the physics appropriately. However, if we take into account the unitarity of the S-matrix, the $\delta$ function for the forward direction cannot be neglected, as was pointed out by Ruijsenaars (1983). A more detailed description of the property of the S-matrix for this system is given in Appendix IV.
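The nonforward differential cross section of Eq. (38) is simple enough to tabulate directly. The following sketch, with a hypothetical function name, evaluates $d\sigma/d\theta = \sin^2\pi\alpha / \{2\pi k\sin^2(\theta/2)\}$ and can be used to confirm that it vanishes for integral $\alpha$ and has period 1 in $\alpha$.

```python
import math


def ab_differential_cross_section(theta, alpha, k):
    """Nonforward AB cross section d(sigma)/d(theta) of Eq. (38); requires theta != 0."""
    return math.sin(math.pi * alpha) ** 2 / (2 * math.pi * k * math.sin(theta / 2) ** 2)
```

For example, `ab_differential_cross_section(math.pi, 0.5, 1.0)` gives the backward cross section at half-integral $\alpha$ for $k = 1$, namely $1/(2\pi)$.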
B. Influence of a Solenoid of Finite Radius

1. Gordon's Method

From the viewpoint of impenetrability of the solenoid, it is worth considering a solenoid of finite radius because the solution of the Schrödinger equation for an infinitely thin solenoid neither vanishes nor is defined at the origin if $\alpha$ is an integer. We introduce here Gordon's idea (1928), which was proposed in the analysis of Rutherford scattering, as another approach to the AB scattering, aiming to ensure the impenetrability of the solenoid even for integral $\alpha$. The method given here may be useful for any other scattering problem with a long-range potential. Furthermore, for our purposes, the simplicity of the method enables us to solve the problem for an arbitrary solenoid radius. Hence, we shall solve the scattering problem for a solenoid with a finite radius. By setting the radius to 0, we can find the same result for an infinitely thin solenoid as we obtained by the LS formalism in the previous section. The essence of the method is to prepare an asymptotic region, described by the free Hamiltonian, far distant from the solenoid in order to overcome some difficulties caused by the long-range effects of the solenoid field. To this aim we introduce a modified vector potential
where $r_0$ is the radius of the shielded solenoid. It should be noticed that in the region $R < r$ the vector potential does not affect charged particles at all. To return to the original AB problem, we just take $R \to \infty$ after solving the Schrödinger equation for this system. As can readily be seen, the system has several bound states while $R$ remains finite. Since the number of bound states gradually decreases to 0 as $R \to \infty$, their existence can be ignored.

2. Wave Functions of Scattering States
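In the asymptotic region ($r > R$), where the vector potential is absent, the solution of the Schrödinger equation is given by eigenstates of the free Hamiltonian, and the wave function that describes the scattering state, $\psi_{II}(\mathbf{x};\mathbf{k})$, is given by
$$ \psi_{II}(\mathbf{x};\mathbf{k}) = \sum_{n=-\infty}^{\infty} \left\{ e^{in\pi/2}J_n(kr) + a_nH^{(1)}_n(kr) \right\} e^{in\theta}, \tag{40} $$
where the $a_n$'s are constant coefficients to be determined.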
The Schrödinger equation in the scattering region ($r_0 < r \le R$) for the wave function $\psi_I(\mathbf{x};\mathbf{k})$ is given by
Assuming a partial wave expansion for $\psi_I(\mathbf{x};\mathbf{k})$,
$$ \psi_I(\mathbf{x};\mathbf{k}) = \sum_{n=-\infty}^{\infty} \psi_{I,n}(r)\,e^{in\theta}, $$
we obtain a radial Schrödinger equation
A change of variable $r \mapsto z = \alpha(r/R)^2$ as well as $\psi_{I,n} = w_n/\sqrt{z}$ transforms Eq. (44) into
$$ w_n'' + \left\{ -\frac{1}{4} + \frac{\lambda + \nu/2}{z} + \frac{1 - \nu^2}{4z^2} \right\} w_n = 0, \tag{45} $$
where $\nu = n + \alpha$, $\lambda = (kR)^2/(4\alpha)$, and a prime denotes differentiation with respect to $z$. The general solution of Eq. (45) is given by a linear combination of the Whittaker functions
To determine the arbitrary constants, $a_n$ in Eq. (40) and $b_n$, $c_n$ in Eq. (46), to fit the physical situation, we require that the change from $\psi_I$ to $\psi_{II}$ be continuous at the surface $r = R$ and that $\psi_I$ vanish at $r = r_0$ to enforce the impenetrability of the solenoid. These conditions may be imposed on each partial wave independently.
To fulfill the condition of impenetrability of the solenoid, we assume $\psi_{I,n}(r)$ to be written as
$$ \psi_{I,n}(r) = \frac{b_n}{\sqrt{z}} \left\{ M_{\lambda+\nu/2,\,-|\nu|/2}(z_0)\,M_{\lambda+\nu/2,\,|\nu|/2}(z) - M_{\lambda+\nu/2,\,|\nu|/2}(z_0)\,M_{\lambda+\nu/2,\,-|\nu|/2}(z) \right\}, $$
where we have introduced $z_0 = \alpha r_0^2/R^2$ and, for later convenience, absorbed a constant factor into $b_n$, writing the result as $b_n$ again. Writing $\psi_{II,n}(r) = e^{in\pi/2}J_n(kr) + a_nH^{(1)}_n(kr)$, we obtain, from the continuity of the radial wave function and its derivative, $\partial_r\psi_{I,n}/\psi_{I,n} = \partial_r\psi_{II,n}/\psi_{II,n}$ at $r = R$,
$$ a_n(R) = \frac{1}{2}\,e^{in\pi/2}\left\{ e^{2i\delta_n(R)} - 1 \right\}, \tag{49} $$
where $W[f(x), g(x)] = f(x)g'(x) - f'(x)g(x)$ is the Wronskian of $f(x)$ and $g(x)$. Then substituting Eq. (49) into $\psi_{I,n}(R) = \psi_{II,n}(R)$, we find
where $-(z_0 \leftrightarrow z)$ in the denominator means a subtraction from the first term, $M_{\lambda+\nu/2,\,-|\nu|/2}(z_0)M_{\lambda+\nu/2,\,|\nu|/2}(z)$, of the amount $M_{\lambda+\nu/2,\,|\nu|/2}(z_0)M_{\lambda+\nu/2,\,-|\nu|/2}(z)$. If the Hamiltonian of a scattering problem has a strictly finite-ranged interaction with cylindrical symmetry, the phase shift for angular momentum $n$ will always be given by an expression similar to the one on the right-hand side of Eq. (50). It is sufficient for us to find $a_n(\infty)$ and $b_n(\infty)$ here to solve the scattering problem of Aharonov and Bohm, although an investigation of a system with finite $R$ itself may be interesting. To this aim, we need to determine the asymptotic behavior of the wave functions for large $R$. As an example, let us consider the asymptotic form of $\psi_{I,n}(r)$. From the definition of a Whittaker function $M_{\kappa,\mu}(z)$ in Eq. (47), we obtain
in which we used Kummer’s formula
Recalling that $\lambda = k^2R^2/(4\alpha)$ and $z = \alpha r^2/R^2$, we rewrite each term within the braces as
By using a relation between hypergeometric functions
we find that the products of hypergeometric functions above satisfy
Then utilizing the hypergeometric representation of a Bessel function
we obtain
Since the ratio of Wronskians in Eq. (51) is independent of any proportionality constant of $\psi_{I,n}$, we may use $\chi_n = J_{-|\nu|}(kr_0)J_{|\nu|}(kr) - J_{|\nu|}(kr_0)J_{-|\nu|}(kr)$ instead of $\psi_{I,n}$ for finding $\delta_n(\infty)$ and hence $a_n(\infty)$:
Using a well-known asymptotic form of cylindrical functions, we find
where $a = kr_0$ and $X = kR$. Thus we obtain
where we have utilized the asymptotic forms of cylindrical functions again. Thus we obtain a wave function for the interacting region ($r_0 < r < \infty$) after taking the limit $R \to \infty$
where $x = kr$. By recognizing the simple formula
$$ J_{-\nu}(t)J_\nu(u) - J_\nu(t)J_{-\nu}(u) = i\sin\nu\pi\left\{ H^{(1)}_\nu(t)J_\nu(u) - J_\nu(t)H^{(1)}_\nu(u) \right\}, \tag{63} $$
we can simplify Eq. (62) as follows:
Substituting Eq. (60) into Eq. (40), we obtain a wave function for the region $r > R$ in the limit $R \to \infty$:
This expression is useful only in the asymptotic region ($r \to \infty$) by definition. It should therefore be considered in its asymptotic form for large $r$. Evidently, the scattered wave in this scheme is given by the asymptotic form of the second term on the right-hand side. Hence, taking $r$ in Eq. (65) to be sufficiently large, we can find a definition of the S-matrix of this scattering problem.

3. The S-Matrix

By using the asymptotic form of Hankel functions, we find the desired expression of the scattered wave from the second term of Eq. (65):
The first term in the scattering amplitude exactly cancels the corresponding term from the incident plane wave. Therefore the S-matrix for the system is just multiplication by a complex number of unit modulus:
on each eigenspace of the angular momentum. Hence the unitarity of the S-matrix is evident. It is also possible to use Eq. (64) to find Eq. (67). To this aim let us multiply $\psi_I(\mathbf{x};\mathbf{k})$ in Eq. (64) by $1/(2\pi)$ and write the product as $\psi^{(+)}(\mathbf{x};\mathbf{k})$, respecting its outgoing boundary condition. If we consider another boundary condition for the same Hamiltonian by replacing $H^{(1)}_n(kr)$ with $H^{(2)}_n(kr)$ in Eq. (40) and carrying out the necessary calculations similar to those for $\psi^{(+)}(\mathbf{x};\mathbf{k})$, we obtain another solution of the Schrödinger equation corresponding to the incoming condition, namely,
An inner product of $\psi^{(+)}(\mathbf{x};\mathbf{q})$ with $\psi^{(-)}(\mathbf{x};\mathbf{p})$ defines an S-matrix element from an incident momentum $\hbar\mathbf{q}$ to the final one $\hbar\mathbf{p}$. To calculate this inner product it is useful to write
Then we find
where $\theta_{pq} = \varphi_p - \varphi_q$ and $\mathbf{p} = (p\cos\varphi_p, p\sin\varphi_p)$. By using a formula for the indefinite integral of a product of two cylindrical functions, $Z_\nu(ax)$ and $\tilde{Z}_\nu(bx)$,
and a definition of the delta function,
$$ \lim_{x\to\infty} \frac{\sin(a - b)x}{\pi(a - b)} = \delta(a - b), $$
we obtain
Substituting this result into Eq. (69), we find that the angular part of an S matrix element of this scattering problem is given by
for a fixed energy $\hbar^2k^2/(2\mu)$. That the two approaches described coincide is a proof of the equivalence of $\psi_I$ and $\psi_{II}$ in the region far from the solenoid, in which we define a scattering amplitude. Therefore we can conclude that the scattering problem of Aharonov and Bohm accepts an incident plane wave as a constituent of its stationary wave function in the asymptotic region.

4. Reduction to the Infinitely Thin Solenoid

By taking $r_0 \to 0$ in Eqs. (64), (65), and (73), we obtain the corresponding wave functions and the angular part of the S-matrix:
and
for an infinitely thin solenoid. These are the same results obtained in the previous section by means of the LS equation.

5. Behavior of the Total Cross Section

Let us now consider the total cross section of a system of a solenoid with a finite radius. Using $\alpha = [\alpha] + \{\alpha\}$ again, we rewrite the scattering amplitude $f(\theta)$ in Eq. (66) as
where A:(a) is given in terms of the Bessel and the Neumann functions by
The total cross section is then found to be
This result explains an interesting feature of this system: $\sigma(\alpha)$ is evidently periodic in $\alpha$ with period 2 (not 1). Unlike the case where $r_0 = 0$ (an extremely thin solenoid), the wave function is strictly subjected to the boundary condition that it vanishes at $r = r_0$ even when $\alpha$ is an integer. Thus the solenoid is completely impenetrable to the charged particles. Here let us consider the special case where $\alpha$ is an integer. The explicit form of $\sigma$ is found to be
for the case where $\alpha$ is an even integer and
for the case where $\alpha$ is an odd integer. There is a large difference in the behaviors of these two formulas. It is most convenient to examine the limit $a \to 0$ to see the difference between the $\sigma$(even) and $\sigma$(odd) cases. Equation (81) is merely the total cross section of two-dimensional hard core scattering, so it tends to 0 with $a \to 0$ as $\pi^2/\{k(\log(a/2))^2\}$. This result simply means that in the case where $\alpha$ is an even integer, charged particles are not affected by the solenoid at all. Thus the AB effect is absent. Conversely, Eq. (83) goes to infinity in the same limit, since the Neumann function instead of the Bessel function appears in the numerator of each term. Therefore the total cross section of AB scattering for the case where $\alpha$ is an odd integer diverges when the radius of the solenoid tends to 0. This singular behavior is, however, not specific to the system of an infinitely thin solenoid. Rather, it is a common feature of the total cross section, excepting the case where $\alpha$ is an even integer, regardless of the size of the solenoid. If we notice that the partial cross section for large $n$ immediately approaches $4\sin^2(\pi\alpha/2)/k$ even for finite $a$, we conclude that the singularity is not the consequence of making the radius of the solenoid infinitely small. This result implies the important fact that the vector potential can affect the charged particles even in the case where $\alpha$ is an odd integer, which is a different conclusion from that reached by Aharonov and Bohm. Thus we have found two interesting properties of the total cross section of the AB scattering by a solenoid of finite radius: (i) as a function of $\alpha$ it varies periodically with period 2; (ii) excepting the special case where $\alpha$ is an even integer, it diverges independently of the size of the solenoid. As for the first property, if we consider a system of an infinitely thin solenoid and suppose that the total cross section is obtained by integrating the differential cross section Eq. (38) with respect to $\theta$, this behavior of $\sigma(\alpha)$ for finite $a$ seems to disagree with the periodicity of $d\sigma(\alpha)$ in Eq. (38). This problem is resolved by taking into account the delta function in the scattering amplitude Eq. (37). To see this we integrate $f^*(\theta_p)f(\theta_q)$ with respect to $\varphi$, where $\theta_p = \varphi - \varphi_p$ and $\theta_q = \varphi - \varphi_q$, for the scattering amplitude $f(\theta)$ in Eq. (37). Then we set $\theta_{pq} = \varphi_p - \varphi_q$ to 0 to obtain the information on the total cross section. The integral is given by
$$ I = \int_{-\pi}^{\pi} d\varphi\, f^*(\theta_p)f(\theta_q), $$
in which
It is useful to notice that
to carry out the integration. By making use of this formula, we find
$$I = \frac{2\pi}{k}\left\{(\cos\pi\alpha - 1)^2 + \sin^2\pi\alpha\right\}\delta(\theta_{pq}) = \frac{4\pi}{k}(1 - \cos\pi\alpha)\,\delta(\theta_{pq}). \qquad (86)$$
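The two forms of the coefficient in Eq. (86), and its period in $\alpha$, can be checked numerically. The following is only a minimal sketch: the function names and the choice $k = 1$ are illustrative assumptions, not part of the original derivation.

```python
import math

def coefficient(alpha, k=1.0):
    """Coefficient of delta(theta_pq) in the first form of Eq. (86)."""
    c = math.cos(math.pi * alpha)
    s = math.sin(math.pi * alpha)
    return (2.0 * math.pi / k) * ((c - 1.0) ** 2 + s ** 2)

def coefficient_simplified(alpha, k=1.0):
    """Coefficient of delta(theta_pq) in the second form of Eq. (86)."""
    return (4.0 * math.pi / k) * (1.0 - math.cos(math.pi * alpha))

for alpha in (0.0, 0.3, 1.0, 1.7, 2.0, 2.5):
    # Both forms agree, and a shift of alpha by 2 changes nothing.
    assert math.isclose(coefficient(alpha), coefficient_simplified(alpha), abs_tol=1e-12)
    assert math.isclose(coefficient(alpha), coefficient(alpha + 2.0), abs_tol=1e-12)
```

The coefficient vanishes only at even integers and is largest at odd integers, so a shift of $\alpha$ by 1 is not a symmetry of this quantity.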
Apparently, the total cross section varies with period 2 with respect to $\alpha$, although it becomes divergent, except where $\alpha$ is an even integer, on putting
$\theta_{pq} \to 0$. This is the same feature described previously for a solenoid of finite radius. The coefficient of $\delta(0)$ matches exactly that obtained by
Therefore, the calculation given above is regarded as another proof of the optical theorem for AB scattering (Eq. (227) in Appendix III). These results clearly show the significance of the delta function in the forward direction in the scattering amplitude for an infinitely thin solenoid; otherwise there would be no convincing explanation for the present question. Let us now consider the second property of the total cross section, which has been partially explained in the preceding discussion. To resolve the problem it is useful to rewrite $H^{(2)}_{|\nu|}(a)$ in Eq. (73) as $2J_{|\nu|}(a) - H^{(1)}_{|\nu|}(a)$. Then we find
In this form it is evident that $S(\theta)$ for a solenoid of any finite radius contains that of an infinitely thin solenoid as a result of the net magnetic flux through the solenoid. This is the expected result because the total flux $\Phi$ is independent of $r_0$. The origin of the divergent total cross section even where $\alpha$ is an odd integer is, hence, $\cos\pi\alpha\,\delta(\theta)$ in the S-matrix, similar to the case where $r_0 = 0$. Apart from the divergent nature of the total cross section of this scattering problem, the unitarity of the S-matrix is expected from Eq. (67) at least formally. Such problems are discussed separately in Appendices IV through V.

III. AB EFFECT ON BOUND STATES

To examine the property of the AB effect on bound states, we consider a system that has a potential $V(x)$ in addition to the effect of the vector potential. Such a system is described by a Hamiltonian,
Taking a quadratic potential as $V(x)$ above, we study the property of the AB effect on energy levels as well as the one seen in eigenfunctions (see, for example, Byers and Yang (1961), Kretschmar (1965b), Peshkin (1981), or Lewis (1983) for a discussion of the features of the AB effect in bounded systems). We modify the Hamiltonian with the quadratic potential to fit the explanation of shifts in Landau levels in the later part of this section.

A. AB Effect on Energy Levels and Eigenfunctions
1. Energy Levels
A Hamiltonian with a quadratic potential in addition to the AB potential is given by
$$\hat H = \frac{1}{2\mu}\left(\hat{\boldsymbol p} - \frac{e}{c}\boldsymbol A(\hat{\boldsymbol x})\right)^2 + \frac{1}{2}\mu\omega^2\hat{\boldsymbol x}^2, \qquad (90)$$
where the vector potential $\boldsymbol A(x)$ remains the same as the one in the AB Hamiltonian. This slightly generalized Hamiltonian is sufficient to see the AB effect in bounded systems. The self-adjoint extension of this Hamiltonian and Eq. (109) can be seen in Doebner et al. (1989). From the path integral derivation of a heat kernel (Feynman kernel of an imaginary time) shown in Appendix I, we see that a heat kernel, $\langle x_F|e^{-\beta\hat H}|x_I\rangle = K_E(r_F, \varphi_F; r_I, \varphi_I; \beta)$, for the Hamiltonian above is given by Eq. (206):
We shall find eigenvalues and eigenfunctions of the Hamiltonian in Eq. (90) on the basis of this kernel. To find eigenvalues of the Hamiltonian Eq. (90), we first evaluate the partition function $Z(\beta) = \mathrm{Tr}\,e^{-\beta\hat H}$ by substituting $x_F = x_I = x$ in Eq. (91), then performing an integration with respect to $x$:
$$\exp\left\{-\frac{\mu\omega\cosh\hbar\omega\beta}{\hbar\sinh\hbar\omega\beta}\,r^2\right\}.$$
Making the change of variable
$$r \mapsto \frac{\mu\omega r^2}{\hbar\sinh\hbar\omega\beta} \qquad (93)$$
and substituting $t = \hbar\omega\beta$, we can carry out the integration with respect to $r$ to find
where $Q^{\mu}_{\nu}(z)$ designates an associated Legendre function, and we have used the formula
for $\mu = 1$. For $t > 0$, we adopt the formula
for associated Legendre functions in Eq. (96) to obtain
Thus we find that eigenvalues of the Hamiltonian in Eq. (90) are given by
$$E_{m,n} = (2m + |n + \alpha| + 1)\hbar\omega, \qquad m = 0, 1, 2, \ldots, \quad n = -\infty, \ldots, +\infty. \qquad (98)$$
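The $\alpha$-dependence of Eq. (98) can be explored with a short numerical sketch. The function names, the truncation limits, and the units $\hbar\omega = 1$ are assumptions made for illustration. The closed form in `partition_closed` is our own evaluation of the geometric sums over $m$ and $n$, not a formula quoted from the text.

```python
import math

def energy(m, n, alpha, hbar_omega=1.0):
    """E_{m,n} of Eq. (98) in units of hbar*omega."""
    return (2 * m + abs(n + alpha) + 1) * hbar_omega

def lowest_levels(alpha, count=12, n_max=40, m_max=40):
    """Lowest `count` energies with multiplicity, rounded to hide float noise."""
    levels = sorted(
        energy(m, n, alpha)
        for m in range(m_max)
        for n in range(-n_max, n_max + 1)
    )
    return [round(e, 9) for e in levels[:count]]

def partition_sum(alpha, t, n_max=200, m_max=200):
    """Truncated sum of exp(-E_{m,n} t) over m and n, with t = hbar*omega*beta."""
    return sum(
        math.exp(-energy(m, n, alpha) * t)
        for m in range(m_max)
        for n in range(-n_max, n_max + 1)
    )

def partition_closed(alpha, t):
    """Closed form of the same sum (our own evaluation of the geometric series)."""
    a = alpha - math.floor(alpha)  # fractional part of alpha
    angular = (math.exp(-a * t) + math.exp(-(1.0 - a) * t)) / (1.0 - math.exp(-t))
    return angular / (2.0 * math.sinh(t))
```

Comparing `lowest_levels(0.3)` with `lowest_levels(1.3)` and `lowest_levels(-0.3)` shows the invariance under integer shifts of $\alpha$ and under $\alpha \to -\alpha$ for this spectrum.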
2. Eigenfunctions

Our next task is to find eigenfunctions of the Hamiltonian in Eq. (90). To this aim we consider the decomposition of the heat kernel in the form
where an eigenfunction $\psi_{m,n}(x)$ is assumed to be proportional to a phase factor $e^{in\varphi}$.
We may therefore consider an expansion of
into a power series of $e^{-\beta\hbar\omega}$. Substituting
for simplicity, we rewrite expression (100) as
Making use of the series expansion of $I_\nu(z)$ in terms of powers of $z$, we obtain
By observing that the second line of the expression is a generating function of generalized Laguerre polynomials $L_k^{(p)}(z)$, we get an expansion
$$\frac{1}{(1 - e^{-2\tau})^{2l+|n+\alpha|+1}}\exp\left\{-\frac{e^{-2\tau}(\xi + \eta)}{1 - e^{-2\tau}}\right\} = \sum_{k=0}^{\infty} L_k^{(2l+|n+\alpha|)}(\xi + \eta)\,e^{-2k\tau}. \qquad (104)$$
To decompose the sum of $L_k^{(2l+|n+\alpha|)}(\xi + \eta)$ with respect to $l$ in expression (103) into a product of $L_m^{(|n+\alpha|)}(\xi)$ and $L_m^{(|n+\alpha|)}(\eta)$, we utilize, by substituting $k + l = m$, the formula
Then we find a desired expansion of expression (100) given by
Substitution of this expression into the kernel $K_E(r, \varphi; r', \varphi'; \beta)$ helps us complete the separation of variables $x$ and $x'$ in Eq. (99). Thus we find an eigenfunction of the Hamiltonian in Eq. (90),
where $\xi = \mu\omega r^2/\hbar$. The $\alpha$-dependence of eigenfunctions in Eq. (107) and eigenvalues in Eq. (98) exhibits the AB effect on bound states for this rather simple model.
B. AB Effect on Landau Levels

If we consider a system in a constant and uniform magnetic field in addition to the AB flux, a Hamiltonian for such a system is given in terms of a vector potential
We substitute $\omega = |e|B/(2\mu c)$ to find that the necessary modification for the new Hamiltonian is an additional term $\omega(\hat L_z + \hbar\alpha)$, where $\hat L_z = \hat x\hat p_y - \hat y\hat p_x$, compared with the one considered in the previous example. Therefore, the Hamiltonian, which corresponds to the vector potential shown above, is given by
$$\hat H = \hat H_{AB} + \frac{1}{2}\mu\omega^2\hat{\boldsymbol x}^2 + \omega(\hat L_z + \hbar\alpha), \qquad (109)$$
where $\hat H_{AB}$ designates the Hamiltonian of Aharonov-Bohm. Since the additional term in the Hamiltonian commutes with the rest, it is convenient to write a matrix element of the Hamiltonian in terms of the same expression of the completeness relation utilized in considering the previous example. Thus we make use of an expression of the Hamiltonian given by
to find an infinitesimal kernel
Following the same procedure as in the previous example we find a heat kernel in Appendix I,
that differs from the previous one by the factor $e^{-\hbar\omega\beta(n+\alpha)}$ in each term of the sum with respect to $n$. Therefore, the eigenfunctions for the present Hamiltonian are chosen to be the same as the ones obtained previously. To find the eigenvalues, we evaluate the partition function whose formal expression is given by
$$Z(\beta) = \sum_{m=0}^{\infty}\sum_{n=-\infty}^{+\infty} e^{-(2m+|n+\alpha|+n+\alpha+1)\hbar\omega\beta}, \qquad (113)$$
although the sum with respect to $n$ (angular momentum) diverges because the regulating factor $e^{-(|n+\alpha|+n+\alpha)\hbar\omega\beta}$ in the sum becomes unity for $n$'s that satisfy $n + \alpha < 0$. From Eq. (113), we obtain eigenvalues of the Hamiltonian Eq. (109) given by
$$E_{m,n} = (2m + |n + \alpha| + n + \alpha + 1)\hbar\omega. \qquad (114)$$
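The structure of the spectrum read off from the exponent of Eq. (113) can be illustrated with a small sketch. The function names, the cutoffs, and the units $\hbar\omega = 1$ are illustrative assumptions. The energy formula used is the one implied by Eq. (113), namely $(2m + |n+\alpha| + n + \alpha + 1)\hbar\omega$.

```python
def landau_energy(m, n, alpha, hbar_omega=1.0):
    """Energy read off from the exponent of Eq. (113), in units of hbar*omega."""
    nu = n + alpha
    return (2 * m + abs(nu) + nu + 1) * hbar_omega

def distinct_levels(alpha, e_max=6.0, n_max=30, m_max=10):
    """Sorted distinct energies up to e_max (rounded to hide float noise)."""
    levels = {
        round(landau_energy(m, n, alpha), 9)
        for m in range(m_max)
        for n in range(-n_max, n_max + 1)
    }
    return sorted(e for e in levels if e <= e_max)

# For n + alpha < 0 the energy does not depend on n at all: every such n
# contributes the same Boltzmann factor, which is why the sum over n diverges.
assert landau_energy(0, -1, 0.3) == landau_energy(0, -7, 0.3) == 1.0
```

For $\alpha = 0$ the distinct levels are the odd multiples of $\hbar\omega$, that is, the ordinary Landau levels with cyclotron frequency $2\omega$. For non-integral $\alpha$ an additional ladder shifted by $2\{\alpha\}\hbar\omega$ appears, and a shift of $\alpha$ by an integer leaves the set of levels unchanged.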
The result shown here generalizes the one given by Landau for $\alpha = 0$ and describes shifts in the Landau level (Landau and Lifshitz, 1965). The same result for $\alpha \neq 0$ was found by Lewis (1983) by means of the operator formalism. The energy levels in Eqs. (98) and (114) are entirely invariant under a shift of $\alpha$ by any integer amount and also under $\alpha \to -\alpha$. This fact, which is valid for generic Hamiltonians of bound states, is known as the theorem of Byers and Yang (1961). Before closing this section, we note that Eq. (114) can also be obtained by making use of the Bohr-Sommerfeld quantization rule (see Appendix II for an explicit calculation). From the viewpoint of wave mechanics, a problem of
quantization of a bound state energy is one of enhancing the interference of a wave with itself. Therefore the result obtained above shares a common origin with the AB effect in an interference pattern. However, if we consider a scattering theory corresponding to the above by substituting $\omega \to 0$ in the Lagrangian Eq. (210), the situation becomes quite different. Since we cannot find any classical orbits that describe a scattered particle, there is no reasonable explanation for the AB effect in scattering events in terms of a semiclassical approximation. A sharp contrast with this fact will be seen when we consider a semiclassical description of the Coulomb scattering. In this sense the AB effects found in a scattering problem and those in a bound state energy (or interference pattern) can be quite different.

IV. AB EFFECT ON A SYSTEM OF BOTH BOUND STATES AND SCATTERING STATES
Throughout the preceding sections we have found a difference in the AB effect emerging on a system of bound states versus that on scattering states. The models we have been dealing with are systems that have only bound states or scattering states. The natural question that then arises is: what type of AB effect should be expected on a system that provides both states simultaneously? We may ask, for example, whether the singularities and the phase shifts of the S-matrix of such a system can behave similarly to those found for a system of only bound states or scattering states. It will therefore be very significant to test whether the results we obtained separately for the two situations can be found in a single Hamiltonian. Fortunately, there is such an interesting model, a Hamiltonian with an attractive Coulomb potential in addition to the Aharonov-Bohm one, that exhibits the AB effect in both scattering and bound states. This problem was investigated by several authors in three-dimensional space and was often called the ABC problem (Guha and Mukherjee, 1987; Kibler and Negadi, 1987; Sokmen, 1988; Chetouani et al., 1989; Doebner and Papp, 1990; Drăgănascu et al., 1992). The problem was first solved by Guha and Mukherjee (1987) by use of rotational parabolic coordinates and later by Kibler and Negadi (1987) by means of the Kustaanheimo-Stiefel transformation. These authors solved bound states of the system to find eigenfunctions as well as eigenvalues $E = -(1/2m^2)\,\mu e^2e'^2/\hbar^2$, where $m$ is given by $m = |\nu| + n_1 + n_2 + 1$, $n_i = 0, 1, 2, \ldots$ ($i = 1, 2$), for the Coulomb potential $-ee'/r$ in addition to the AB potential for a particle of charge $e'$ and reduced mass $\mu$. (We use $\nu = n + \alpha$ to avoid any possible confusion.) The energy spectrum shown above is even and periodic (with period 1) with respect to $\alpha$; hence, it is regarded as an example of the theorem of Byers and Yang. Similar problems in two-dimensional space as well as in relativistic settings are considered in
Doebner and Papp (1990), Hagen (1993), Villalba (1995), Lin (1998), and Park and Yoo (1998). Our aim in this section is to look at the difference in AB effect on scattering states versus bound states, as was mentioned at the end of section III.A. We examine here the $\alpha$-dependence of the energy spectrum as well as the scattering amplitude by solving the two-dimensional ABC problem. We analyze the scattering problem in terms of time-dependent quantum scattering theory (Reed and Simon, 1979; Pearson, 1988), then search for singularities of the S-matrix to obtain bound state energies. The method also involves the same treatment of AB scattering just by switching off the Coulomb interaction. Thus we can confirm our previous result found by solving the LS equation.

A. Time-Dependent Scattering Theory

1. Wave Functions of Scattering States
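A Hamiltonian that describes a two-dimensional ABC system is given by
$$\hat H = \frac{1}{2\mu}\left(\hat{\boldsymbol p} - \frac{e}{c}\boldsymbol A(\hat{\boldsymbol x})\right)^2 - \frac{\kappa}{r}, \qquad (115)$$
where $\kappa$ is assumed to be positive to enable bound states, and $\boldsymbol A(x)$ is again the Aharonov-Bohm vector potential. Writing a wave function in the separated form $\psi_n(x) = R_n(r)e^{in\varphi}$ gives the radial Schrödinger equation
$$-\frac{\hbar^2}{2\mu}\left\{\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} - \frac{\nu^2}{r^2}\right\}R_n(r) - \frac{\kappa}{r}R_n(r) = ER_n(r), \qquad (\nu = n + \alpha). \qquad (116)$$
By substituting $R_n(r) = r^{|\nu|}\chi_n(r)$ and introducing $g$ and $k\,(> 0)$ with $\kappa = \hbar^2g/(2\mu)$ and $E = \hbar^2k^2/(2\mu)$, we rewrite Eq. (116) as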
to find a regular solution at $r = 0$ given by
$$\chi_n(r) = e^{ikr}\,{}_1F_1(1/2 + |\nu| - i\gamma,\, 2|\nu| + 1;\, -2ikr), \qquad (\gamma = g/(2k)). \qquad (118)$$
Thus we obtain the wave function
for angular momentum $n$.
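That the function in Eq. (118) solves the radial problem can be checked numerically. The sketch below rests on several assumptions. The reduced equation $\chi'' + \frac{2|\nu|+1}{r}\chi' + (k^2 + g/r)\chi = 0$ is our own restatement of Eq. (116) after the substitution $R_n = r^{|\nu|}\chi_n$. The confluent hypergeometric function is summed from its power series, which is adequate only for moderate $|2kr|$. All function names and parameter values are illustrative.

```python
import cmath

def hyp1f1(a, b, z, tol=1e-17, max_terms=500):
    """Power series of the confluent hypergeometric function 1F1(a; b; z)."""
    term = 1.0 + 0.0j
    total = term
    for j in range(max_terms):
        term *= (a + j) / (b + j) * z / (j + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

def chi(r, nu_abs, k, g):
    """Regular solution of Eq. (118) for |nu| = nu_abs and gamma = g/(2k)."""
    gamma = g / (2.0 * k)
    a = 0.5 + nu_abs - 1j * gamma
    b = 2.0 * nu_abs + 1.0
    return cmath.exp(1j * k * r) * hyp1f1(a, b, -2j * k * r)

def radial_residual(r, nu_abs, k, g, h=1e-4):
    """Left-hand side of the reduced radial equation, by central differences."""
    f_minus = chi(r - h, nu_abs, k, g)
    f_zero = chi(r, nu_abs, k, g)
    f_plus = chi(r + h, nu_abs, k, g)
    d1 = (f_plus - f_minus) / (2.0 * h)
    d2 = (f_plus - 2.0 * f_zero + f_minus) / h ** 2
    return d2 + (2.0 * nu_abs + 1.0) / r * d1 + (k * k + g / r) * f_zero
```

For example, `radial_residual(2.0, 1.3, 1.2, 0.8)` (that is, $n = 1$, $\alpha = 0.3$) is zero up to the finite-difference error, and `chi(0.0, 1.3, 1.2, 0.8)` equals 1, which reflects regularity at the origin.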
In order to ortho-normalize these wave functions, we define a real-valued one:
$$\times\, {}_1F_1(1/2 + |\nu| - i\gamma,\, 2|\nu| + 1;\, -2ikr). \qquad (120)$$
The ortho-normalization relation is then given by
while the integral
provides a projection onto the subspace of scattering states from the whole Hilbert space $L^2(\mathbb{R}_+)$ associated with $e^{in\varphi}$. Although we need to add the contribution from bound states to obtain the completeness relation, we ignore it here to concentrate on solving the scattering problem.
2. Wave Operators

By use of the projection Eq. (122), we find the positive energy sector of the time evolution operator $U(t) = e^{-i\hat Ht/\hbar}$ of the Hamiltonian of Eq. (115)
$$U_+(x, x'; t) = \langle x|U(t)P_+|x'\rangle$$
where $P_+$ is defined by
$$\langle x|P_+|x'\rangle = \sum_{n=-\infty}^{+\infty} e^{in(\varphi-\varphi')}P_+^{(n)}(r, r'). \qquad (124)$$
If we are dealing with a short-range potential, we can immediately define the wave operators $W_\pm$ by
$$W_\pm = \lim_{t\to\pm\infty} U(-t)U_0(t) = \lim_{t\to\pm\infty} U(-t)P_+U_0(t), \qquad (125)$$
where $U_0(t) = e^{-i\hat H_0t/\hbar}$ is the time evolution operator of the free Hamiltonian. In Eq. (125), the limits of the operators on the right-hand side are defined by
evaluating their action on a wave packet. In this sense $P_+$ is automatically inserted when we take $|t| \to \infty$ because the wave functions of any bound states cannot survive this limit. For a long-range potential, such as a Coulomb interaction, we cannot define wave operators simply through Eq. (125) because the limit $|t| \to \infty$ is not well defined even on wave packets. If we take a wave packet whose mean value of momentum is $\hbar p$, the wave packet will be centered at $r = \hbar p|t|/\mu$. Therefore the behavior of the radial wave function at large $r$ determines the existence of the limit $|t| \to \infty$. When $r$ can be taken to be sufficiently large, we may solve the radial Schrödinger equation by the semiclassical method. If we assume a central potential in addition to the AB potential, the existence of the limit in Eq. (125) is therefore guaranteed by the convergence of the integral
where $V(r)$, the potential under consideration, is assumed to decrease more slowly than $1/r^2$ for large $r$, and $C$ is some positive constant. In terms of partial wave expansion, the AB potential in the radial Schrödinger equation belongs to a class of potentials evaluated in this scheme and is short range in this sense. Its wild behavior is therefore due to some other reason, that is, the ill-defined sum with respect to $n$ in Eq. (34). The convergence of the sum is broken by the nondecreasing feature of the phase shift. Taking these factors into account, we may define the wave operators by modifying the asymptotic free motion to fit the asymptotic form of $f_n(r; k)$ in the region far from the origin,
where c.c. stands for complex conjugation. To avoid the difficulty caused by $\gamma\log(2kr)$ in the phase of Eq. (127) when we take $|t| \to \infty$ in Eq. (125), we introduce a unitary operator $J(t)$,
Noticing that
we now define the wave operators by
$$W_\pm = \lim_{t\to\pm\infty} U(-t)J(t)U_0(t)$$
instead of Eq. (125). Evaluating $W_\pm$ on a wave packet, we then define an S operator by $S = W_+^\dagger W_-$. Here, it should be noted that we need to introduce $J(t)$ just to avoid the trouble caused by the Coulomb interaction, not by the AB potential. As we stated previously, the AB potential causes no difficulty in finding wave operators and an S operator as far as the radial part is concerned. To carry out the above procedure for obtaining the wave operators and the scattering operator, we introduce a wave packet
where
For later convenience, let us call the limit
a plane wave limit. For estimation of wave operators, it is useful to write a free wave packet $\langle x|U_0(t)|\phi_\sigma(p)\rangle$ as
where $\tau = \hbar t/\mu$ and $\sigma^2(t) = \sigma^2 + i\tau/2$. An integral that involves this wave packet in the integrand is dominated by the Gaussian factor in the first line centered at $x = \pm p|\tau|$ according to the sign of $t$. It is for this reason that we can discard the contribution from the bound state in a time evolution operator when we consider a scattering problem. The wave packet Eq. (133) can also be written as
in terms of polar coordinates. Using this expression and Eq. (123) together with Eq. (127), we obtain
x exp ( i z log
x
T)
(135)
Lrn
r’dr‘ f n ( r ’ ;k ) J n ( a 2 p r ’ / a 2 ( texp )) 5
If we put r’ = q t , the integration with respect to r’ in Eq. (135) is rewritten as
sou
a2(t> 0
-
r’dr’ f n ( r ’ ;k)Jn(a2pr’/a2(t))
t2 Srn4 dq a2(t)
fn
( q t ;k > J n ( a 2 P q t / a 2 ( t ) )
0
where $\gamma_q = g/(2q)$. For dealing with large $t$, we may take the asymptotic form
where $\eta_n = \arg\Gamma(1/2 + |\nu| - i\gamma) - \pi|\nu|/2$. Then noting that
we can rewrite the integration with respect to q as
Each term within the braces yields - 2 ( - t ) q 2 f ikqt = --B
it 2 (4)
k)'-
= -a2(-t)(q fk)2 - 02(t)k2
2 k2 (139) 402(-t)
+ O(l/t)
+
in the exponent. Since $q + k > 0$, the first term cannot contribute to the integration, whereas the second term has a Gaussian factor $e^{-\sigma^2(q-k)^2}$ in the integrand. Therefore we may estimate the above integral by
x 2Jn( 2 - ~ ~ k p / i ) e (-1" ~ O( l/a)).
(141)
Utilizing a well-known asymptotic form of a Bessel function, we finally set -B --f 00 after multiplying by d m to obtain the plane wave limit,
.
+cc
Writing f n ( r , p ) explicitly, we find
In the same way, we can find
The following simple relation between $\langle x|W_-|\Phi(p)\rangle$ and $\langle x|W_+|\Phi(q)\rangle$ is seen from these expressions:
$$\langle x|W_-|\Phi(p)\rangle = \langle x|W_+|\Phi(-p)\rangle^*. \qquad (145)$$
The operation on the wave function on the right-hand side is precisely the time reversal transformation for the motion under a magnetic field. This is expected, since we obtained these incoming and outgoing wave functions by the two opposite limits in the time direction.

3. The S-Matrix of the ABC Problem in Two Dimensions

To find an S-matrix element, we calculate an inner product of wave functions $\langle x|W_-|\Phi(p)\rangle$ and $\langle x|W_+|\Phi(q)\rangle$:
where $\delta_n(p)$ is given by
Let us investigate the angular distribution of the scattering operator by defining 1
+-
If we substitute $\alpha = [\alpha] + \{\alpha\}$ by writing the integral part of $\alpha$ as $[\alpha]$, the sum is naturally divided into two parts: one from $-\infty$ to $-[\alpha] - 1$ and the other from $-[\alpha]$ to $+\infty$. It is easy to see that each partial sum can be arranged to yield a hypergeometric function. Thus we obtain
(149)
for $\theta \neq 0$. When $\theta$ tends to 0, hypergeometric functions become divergent. To avoid this we need a regularization, which we accomplish by replacing $e^{i\theta}$ in the first term with $e^{i\theta-\epsilon}$ and $e^{-i\theta}$ in the second term with $e^{-i\theta-\epsilon}$, where $\epsilon$ is a positive infinitesimal. If we substitute $\alpha = \gamma_p = 0$ after introducing this regularization, Eq. (149) becomes
$= \delta(\theta)$, which means that $S = 1$ for free motion, confirming that the regularization is the desired one. Therefore we may regard hypergeometric functions in Eq. (149) to be regularized in this way in the following. As was mentioned before, we can derive the S-matrix for the case of the AB potential alone just by substituting $\gamma_p = 0$ in Eq. (149). Writing $\epsilon$ explicitly, we obtain
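$$= e^{-i[\alpha](\theta-\pi)}\left\{\cos\pi\{\alpha\}\,\delta(\theta) - \frac{i\sin\pi\{\alpha\}}{\pi}\,\mathrm{P}\frac{1}{1 - e^{i\theta}}\right\}, \qquad (151)$$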
which reproduces the scattering amplitude of AB scattering obtained before by solving the LS equation.
For the moment let us look closely at the time evolution of a wave packet from $t = -\infty$ to $t = +\infty$ for this special case to consider the meaning of $\cos\pi\alpha$ in front of $\delta(\theta)$. By definition, the wave operator $W_-$ brings a free wave packet at $t = -\infty$ to that consisting of eigenfunctions of the total Hamiltonian at $t = 0$:
$$\langle x|\psi(p)\rangle = \int d^2k\, G(k, p)\,\langle x|\Psi(k)\rangle, \qquad (152)$$
where an eigenfunction $\langle x|\Psi(p)\rangle$ is given by
$$\langle x|\Psi(p)\rangle = \frac{1}{2\pi}\sum_{n=-\infty}^{+\infty} e^{in(\varphi-\varphi_p+\pi) - i|\nu|\pi/2}\,J_{|\nu|}(pr) \qquad (153)$$
and $G(k, p)$ is the same one that appears in Eq. (131). This wave packet at $t = 0$ evolves into $\langle x|U(t)|\psi(p)\rangle$. For $|t| \to \infty$, we may consider $\langle x|U(t)|\psi(p)\rangle$ in the region far from the origin. Thus we may evaluate the integral in Eq. (152) by taking the asymptotic form of $J_{|\nu|}(kr)$ for large $r$. If we set $t \to -\infty$, only the term proportional to $e^{-ikr}$ in the asymptotic form of Bessel functions can contribute to the integration to yield
which is simply $\langle x|U_0(t)|\phi_\sigma(p)\rangle$. This is the inverse of the map from the free wave packet at $t = -\infty$ to that of the total Hamiltonian at $t = 0$ generated by $W_-$. Thus we see that the asymptotic condition imposed on the wave packet Eq. (152), that it approach a free motion at $t = -\infty$, is satisfied. If we take $t \to +\infty$ in $\langle x|U(t)|\psi(p)\rangle$, only the term proportional to $e^{ikr}$ contributes to the integral, so that
$\langle x|U(t \to +\infty)|\psi(p)\rangle$
Since $e^{i(n-|\nu|)\pi} = \cos\pi\alpha \mp i\sin\pi\alpha$ ($-$ for $n \geq -[\alpha]$ and $+$ for $n \leq -[\alpha] - 1$, respectively), we divide the sum into four parts according to the range of the
sum and whether it is proportional to $\cos\pi\alpha$ or to $\sin\pi\alpha$. Terms proportional to $\cos\pi\alpha$ are summed to yield $\cos\pi\alpha\,\langle x|U_0(t)|\phi_\sigma(p)\rangle$, whereas, with the help of the integral representation of a Bessel function, those containing $\sin\pi\alpha$ can be arranged to be rewritten as
where $x_{\varphi+\theta} = (r\cos(\varphi + \theta),\, r\sin(\varphi + \theta))$. Thus we obtain
As can easily be seen from the procedure shown above, this expression must be equal to
for $t \to +\infty$ and $t' \to -\infty$. In terms of path integral, all paths from $x'$ to $x$ can contribute to $\langle x|U(t - t')|x'\rangle$. They are finally averaged by the integration with respect to $x'$ to yield $\cos\pi\alpha$ preceding the wave packet of free motion in the first term. If there were classically preferred paths to go by the solenoid clockwise and counterclockwise from $pt'$ to $pt$, we could regard both $e^{i\pi\alpha}$ and $e^{-i\pi\alpha}$ in $\cos\pi\alpha$ as a Dirac phase factor
taking a path of integration along such preferred paths, respectively. Since there is no such classical path, we cannot expect the path integral for $\langle x|U(t - t')|x'\rangle$ to be dominated by such contributions. Furthermore, the factor $\cos\pi\alpha$ cannot be interpreted as a sum of contributions from such classical paths even if we could find them, because it should be $2\cos\pi\alpha$ rather than $\cos\pi\alpha$ (the average of $e^{i\pi\alpha}$ and $e^{-i\pi\alpha}$) if that were the case. Hence, we should accept this factor as it is. Another point to be noted is that $\cos\pi\alpha$ seems to contradict the widely accepted feature of the AB effect that it disappears for integral $\alpha$. In terms of a Feynman kernel, relying on the fact that
$$\langle x|U(t)|x'\rangle\big|_{\alpha\to\alpha+m} = \langle x|U(t)|x'\rangle\big|_{\alpha}\, e^{-im(\varphi-\varphi')},$$
where $m$ is some integer, it is often said that we can remove the effect of the AB potential when $\alpha$ is an integer by a phase redefinition of a wave function $\psi(x) \mapsto \psi(x)e^{i\alpha\varphi}$. Such a transformation naturally requires a modified description of a free motion. We then need to formulate a scattering theory for a Hamiltonian in comparison with this modified free motion. From this new viewpoint, we should find unity, instead of $\cos\pi\alpha$, as a factor preceding a free wave packet in Eq. (157) for any integral values of $\alpha$. This immediately leads us to conclude that a transformation $\psi(x) \mapsto \psi(x)e^{i\alpha\varphi}$ cannot leave a free motion unaffected because the S-matrix must be unchanged if that were the case. Our position on AB scattering in this chapter differs from the above because we always take a plane wave to describe a free motion. We will discuss these points in some detail in section V, in particular the question of regarding $e^{-i\alpha\varphi}$ as a gauge transformation.

B. Bound States of ABC System in Two Dimensions
Let us now return to the general situation of Eq. (149), removing any restrictions on the values of $\alpha$ and $\gamma_p$. As can be seen in Eq. (147), the S-matrix has poles
$$p = \frac{ig}{2(m + |\nu| + 1/2)} \qquad (m = 0, 1, 2, \ldots)$$
and zeros
$$p = \frac{-ig}{2(m + |\nu| + 1/2)} \qquad (m = 0, 1, 2, \ldots)$$
for each angular momentum $n$ along the positive and negative imaginary axes, respectively, in the complex $p$-plane. From a pole on the positive imaginary axis, we can find a bound-state energy
$$E_{m,n} = -\frac{\hbar^2g^2}{8\mu\,(m + |\nu| + 1/2)^2}.$$
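The bound-state energies that follow from the pole positions can be tabulated with a short sketch. The function names, the truncation limits, and the units $\hbar = \mu = g = 1$ are illustrative assumptions. The energy is obtained by inserting $k = ig/\{2(m + |\nu| + 1/2)\}$ into $E = \hbar^2k^2/(2\mu)$.

```python
def bound_energy(m, n, alpha, g=1.0, hbar=1.0, mu=1.0):
    """Energy of the S-matrix pole p = i*g / (2*(m + |n + alpha| + 1/2))."""
    nu_abs = abs(n + alpha)
    kappa = g / (2.0 * (m + nu_abs + 0.5))  # the pole sits at p = i*kappa
    return -(hbar * kappa) ** 2 / (2.0 * mu)

def lowest_bound_levels(alpha, count=10, n_max=30, m_max=30):
    """Lowest `count` bound-state energies with multiplicity, rounded."""
    levels = sorted(
        bound_energy(m, n, alpha)
        for m in range(m_max)
        for n in range(-n_max, n_max + 1)
    )
    return [round(e, 9) for e in levels[:count]]
```

In these units the ground state for $\alpha = 0$ is `bound_energy(0, 0, 0.0) == -0.5`, and `lowest_bound_levels(0.3)` coincides with `lowest_bound_levels(1.3)` and `lowest_bound_levels(-0.3)`, in line with the period-1 and even behavior expected from the theorem of Byers and Yang.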
We can also find an eigenfunction $\psi_{m,n}(x)$, up to the constant of normalization, by substituting $k = ig/\{2(m + |\nu| + 1/2)\}$ in Eq. (118):
where $L_m^{(a)}(z)$ designates a generalized Laguerre polynomial. With the aid of the formula
$$\int_0^\infty x^{a+1}e^{-x}\left\{L_m^{(a)}(x)\right\}^2 dx = (2m + a + 1)\frac{\Gamma(m + a + 1)}{m!}, \qquad (163)$$
we find eigenfunctions that are normalized to satisfy
by choosing $A_{m,n}$ to be
It is evident that the eigenfunctions and eigenvalues of Hamiltonians for $\alpha$ and $\alpha + l$ are connected by
if $l$ is an integer. This result clearly shows that the AB effect on bound states is periodic with respect to $\alpha$ with period 1 and can be completely removed by multiplying a wave function by $e^{i\alpha\varphi}$ if $\alpha$ is an integer. This is a common feature of the AB effect for systems that have only bound states and would be most easily understood if we considered the path integral formulation of partition functions for such systems, as was done for the quadratic potential in section III. However, once we turn to the AB effect on scattering states, periodicity no longer holds because an S-matrix for an integral $\alpha$ is related to that of $\alpha = 0$ by
$$S_{\alpha,\gamma_p}(\theta) = e^{-i\alpha\theta}\cos\pi\alpha\; S_{0,\gamma_p}(\theta). \qquad (166)$$
In terms of the S-matrix, what we have shown is that the analytical properties of $S_{\alpha+l,\gamma_p}(\theta)$ as a function of $p$ are identical with those of $S_{\alpha,\gamma_p}(\theta)$. This is simply an example of the fact that features of bound states are determined, solely and independently of the free motion, by the Hamiltonian of the system. Hence the theorem of Byers and Yang holds for bound states. It does not apply, however, to scattering states because $S(\theta)$ at $p^2 > 0$ fully depends on both Hamiltonians. In concluding this section, we may conjecture that the S-matrix of a system with both bound states and scattering states under the influence of an AB potential will show a periodic behavior with respect to $\alpha$ with period 2 in
scattering states through the factor $e^{i[\alpha]\pi}$, although the analytical structure of the complex plane of the energy or the magnitude of momentum changes with period 1 once we choose a free motion to be described by a plane wave. Choosing another free motion by taking a Hamiltonian with an AB potential for $\alpha = l$ changes the statement by replacing $[\alpha]$ with $[\alpha] - l$, provided that $l$ is some integer. This will be made clear in the discussion on gauge invariance in connection with the scattering problem in the next section.

V. GAUGE INVARIANCE AND SCATTERING THEORY
We have seen some differences between our results and commonly accepted features of the AB effect, particularly for integral $\alpha$, through several examples in the preceding sections. It is worth examining the subject more closely. As has already been seen in the previous section, the differences lie in the interpretation of a phase redefinition of a wave function. Therefore, our aim in this section is to clarify the meaning of a gauge transformation in a scattering problem, including the AB scattering as a special case. For the sake of simplicity, we restrict ourselves to dealing only with time-independent gauge transformations. As is well known, a Schrödinger equation
accepts a gauge transformation
as a unitary transformation: two theories described by
are unitary equivalent if they are connected by
provided that the gauge function $\Lambda(x)$ is integrable: $(\partial_i\partial_j - \partial_j\partial_i)\Lambda(x) = 0$. If the potential function $V(x)$ and the effect of vector potentials are short
ranged, the Schrodinger equations given above are equivalent to the corresponding Lippmann-Schwinger equations
respectively, where $u(x)$ is a plane wave. In these equations, interaction parts, $\hat H_{\rm int}$ and $\hat H'_{\rm int}$, of Hamiltonian operators are defined to be the difference of the total Hamiltonian and that of the free motion in each gauge:
An S-matrix element, which is a physical quantity, in the first gauge is defined as an inner product of an outgoing wave ($\psi^{(+)}$ in Eq. (172)) with an incoming wave ($\psi^{(-)}$ in Eq. (172)). These waves are transformed by the same unitary operator through a change of the gauge into corresponding ones in another gauge. Apparently, an S-matrix element is invariant under the unitary transformation Eq. (171). This is the gauge invariance of the theory realized as a unitary equivalence. Due to our assumption, the LS formalism is equivalent to the time-dependent description of scattering problems. In the latter formulation, wave operators $W_\pm$ play the parts of $\psi^{(\pm)}$ in the above. The same is therefore true in the time-dependent description. Now, let us suppose two Hamiltonians $\hat H$ and $\hat H'$ in Eqs. (172) and (173) are given simultaneously, and they still satisfy the relation
without changing the definition of phase of wave functions. For this situation, the unitary equivalence considered above is of no use. Therefore, we may solve Eq. (172) as well as Eq. (173) by taking a plane wave as an incident wave. If there happen to exist a set of Hamiltonians among which there holds such a relation (Eq. (176)) and results in the same solution of LS equations, all Hamiltonians in this set are equivalent to one another in the sense of the asymptotic motion of a scattering problem. This is another type of equivalence that can be proved only after solving the dynamics of the systems. (Gauge invariance discussed in Ruijsenaars (1983) is one of this type.) We should clearly distinguish this kind of equivalence from that of unitary equivalence.
Let us now turn to the Aharonov-Bohm problem for integral $\alpha$,
(177) Apart from the broken global integrability of the gauge function, we may regard Eq. (177) as a unitary transformed version of the Schrödinger equation of a free particle,
in Eq. (171). In this case free motion obtained by taking $\Lambda(x) = \Phi\varphi/(2\pi)$ is described by a plane wave $u_0(x)$ multiplied by $e^{-i\alpha\varphi}$. We then observe no scattering in this system. However, we should remember that we must fix the arbitrariness of a local phase factor of wave functions when we set up a representation of the canonical commutation relation (Dirac, 1958). This is usually done by taking the phase factor as a constant (Schrödinger representation). Hence, there is no reason to exclude an interpretation of Eq. (177) regarding it as an interacting system in the Schrödinger representation. If we find the scattering operator to be identity even in this situation, we may conclude that the AB effect disappears for integral $\alpha$ in the scattering problem as well as in energy levels of bound states. In this regard, we recall that we found this feature in the Schrödinger representation for the case of bound states in sections III and IV. It would be useful to extend the range for finding a necessary condition required for gauge functions to satisfy the above situation. To this aim we return to the generic case, then consider a relation of S-matrices for two Hamiltonians $H_1$ and $H_2$, defined relative to a free Hamiltonian $H_0$, assuming a connection $H_2 = U_\Lambda H_1U_\Lambda^\dagger$ by a unitary operator $U_\Lambda$, $\langle x|U_\Lambda|\psi\rangle = e^{i\Lambda(x)}\langle x|\psi\rangle$, between $H_1$ and $H_2$. Writing time evolution operators $U_i(t) = e^{-iH_it/\hbar}$ for $i = 0, 1, 2$, we define wave operators of $H_i$ ($i = 1, 2$) by
in which the necessary modification for the case of long-range interaction has simply been deleted. Introducing a wave packet
where
and $D$ is the spatial dimension in which we are working, we evaluate the limit in Eq. (179) to obtain an explicit form of wave operators. Since $H_2 = U_\Lambda H_1U_\Lambda^\dagger$, each of $W_\pm^{(2)}$ is connected to $W_\pm^{(1)}$, respectively, by
$$\langle x|W_\pm^{(2)}|\phi_\sigma(p)\rangle = e^{i\Lambda(x)}\langle x|W_\pm^{(1)}|\phi_\sigma(p)\rangle \lim_{t\to\pm\infty} e^{-i\Lambda(p\tau)}, \qquad (\tau = \hbar t/\mu) \qquad (181)$$
where use has been made of the fact that an integration containing $\langle x|U_0(t)|\phi_\sigma(p)\rangle$ is dominated by a Gaussian factor centered at $x = p\tau$. A relation between S-matrices $S^{(i)} = W_+^{(i)\dagger}W_-^{(i)}$ of $H_1$ and $H_2$ then follows:
We thus find a necessary condition
for two Hamiltonians to yield the same S-matrix. Any gauge function that tends to a constant (which may be set to 0) as $|\mathbf{x}| \to \infty$ satisfies this condition. In other words, we can regard a free motion modified by such a gauge transformation as essentially the same as the original one. In such a situation, either of two views of the scattering problem of a Hamiltonian $H = U_\Lambda H_0 U_\Lambda^\dagger$ (choosing $H_0$ as the free motion, or regarding $H$ itself as a free Hamiltonian after the unitary transformation $U_\Lambda$) produces the same S-matrix. This is a desirable feature for a gauge function if we regard a quantum mechanical system as the one-particle sector of a field theory. Without this restriction on gauge functions, we might need to deal with a separate system of asymptotic fields for each gauge function. Let us, for the moment, leave the coincidence of the two S-matrices and turn to their analytic properties as functions of the magnitude of the momentum by considering a relation between the Green's functions
for i = 1,2. Carrying out the integration with respect to t and using the relation
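$H_2 = U_\Lambda H_1 U_\Lambda^\dagger$, we obtain
$$G_2(\mathbf{x}, \mathbf{x}'; E) = \langle\mathbf{x}|\frac{1}{E + i\epsilon - H_2}|\mathbf{x}'\rangle = \langle\mathbf{x}|\frac{1}{E + i\epsilon - H_1}|\mathbf{x}'\rangle\,e^{i\Lambda(\mathbf{x}) - i\Lambda(\mathbf{x}')}. \qquad (185)$$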
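The content of Eq. (185) can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the authors' calculation: a one-dimensional lattice Hamiltonian stands in for $H_1$, the gauge function `lam` is an arbitrary real function chosen for illustration, and all names are hypothetical. It checks that conjugating by $e^{i\Lambda}$ leaves the spectrum unchanged and only multiplies the resolvent by the phase $e^{i\Lambda(x) - i\Lambda(x')}$.

```python
import numpy as np

def lattice_hamiltonian(n_sites, potential):
    """Discretized -(1/2) d^2/dx^2 + V on n_sites points with unit spacing."""
    hopping = -0.5 * (np.eye(n_sites, k=1) + np.eye(n_sites, k=-1))
    return np.diag(1.0 + potential) + hopping

n_sites = 60
x = np.arange(n_sites)
h1 = lattice_hamiltonian(n_sites, 0.3 * np.cos(0.2 * x))

# Arbitrary real gauge function (an assumption made only for this check).
lam = 0.7 * np.sin(0.1 * x) + 0.05 * x
u = np.diag(np.exp(1j * lam))          # unitary operator exp(i * Lambda(x))
h2 = u @ h1 @ u.conj().T               # H2 = U H1 U^dagger

# The two spectra coincide.
spectrum_gap = np.max(np.abs(np.linalg.eigvalsh(h1) - np.linalg.eigvalsh(h2)))

# The resolvents differ only by the phase exp(i Lambda(x) - i Lambda(x')).
z = 0.37 + 0.05j                       # E + i*epsilon
g1 = np.linalg.inv(z * np.eye(n_sites) - h1)
g2 = np.linalg.inv(z * np.eye(n_sites) - h2)
phase = np.exp(1j * (lam[:, None] - lam[None, :]))
green_gap = np.max(np.abs(g2 - g1 * phase))

print(spectrum_gap, green_gap)
```

Both printed numbers should be at the level of rounding error. The check says nothing about the asymptotic condition on $\Lambda$, which concerns the S-matrix rather than the spectrum.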
SEIJI SAKODA AND MINORU OMOTE
Thus we find that a unitary transformation $U_\Lambda$ does not cause any change in the spectrum of a Hamiltonian. The same is then true for the analytic structure on the complex plane of the magnitude of the momentum. In particular, poles and zeros of an S-matrix on the complex p-plane are kept intact under such unitary transformations. Returning to Hamiltonians with the AB potential, we find that a pair of values of $\alpha$, say $\alpha_1$ and $\alpha_2$, gives an example of the situation considered above if $\alpha_2 = \alpha_1 + l$ with an integer $l$. For this case we may choose a gauge function $\Lambda(\mathbf{x}) = -\hbar c\,l\varphi/e$. Evidently, this gauge function cannot pass the test of Eq. (183): it results in $e^{il(\varphi_p - \varphi_q - \pi)}$ rather than unity. Thus we obtain
$$\langle\Phi(\mathbf{p})|S^{(\alpha + l)}|\Phi(\mathbf{q})\rangle = e^{il(\varphi_p - \varphi_q - \pi)}\,\langle\Phi(\mathbf{p})|S^{(\alpha)}|\Phi(\mathbf{q})\rangle. \qquad (186)$$
We may regard this result as a generalization of Eq. (166) and the following discussion. Combining Eq. (186) with the considerations of the analytic structure of an S-matrix given above, we conclude that once we fix a free motion by choosing from among the various Hamiltonians of integral $\alpha$, those of other integer values of $\alpha$ are in general not equivalent to the one chosen to be a free Hamiltonian. Thus we have confirmed the conjecture we made at the end of the previous section.
VI. CONCLUDING REMARKS
In this chapter we have examined the Aharonov-Bohm effect on bound states as well as on scattering states from several viewpoints. To study the AB effect on bound states, we have solved three models: a two-dimensional harmonic oscillator in addition to the AB potential, motion under a monotonic magnetic field with an infinitely thin solenoid (shifted Landau levels), and a two-dimensional ABC problem with attractive Coulomb interaction. In each model we were able to confirm the validity of the theorem of Byers and Yang; namely, we observed that the energy spectrum of bound states varies periodically with respect to $\alpha$ with period 1. In contrast, we found a different behavior with respect to $\alpha$ for the AB effect on scattering states. The S-matrices found in sections II.A and II.B are periodic with period 2 as functions of $\alpha$. Furthermore, by analyzing a system of a solenoid with a finite radius, we confirmed that the divergence of the total cross section of this scattering problem, except when $\alpha$ is an even integer, is not specific to a system of an infinitely thin solenoid. Rather, we showed it to be a generic feature of such systems regardless of the size of the solenoid. We also found the difference in the periodicity between the AB effect on scattering states and that on bound states in a system (two-dimensional ABC)
that provides scattering states and bound states simultaneously. The singularities (poles) of the S-matrix move on the complex p-plane with period 1 as a whole with respect to $\alpha$, whereas the phase shift as a function of $\alpha$, with p positive real, changes with period 2. Observation of the discrepancy in the AB effect on bound states and on scattering states naturally led us to consider the meaning of gauge invariance in a scattering problem, since it is closely related to the removal of the AB effect for an integral $\alpha$ by a phase redefinition (gauge transformation) of wave functions. We showed that the gauge transformation $e^{-i\alpha\varphi}$, which is the basis for eliminating the AB effect when $\alpha$ is an integer, necessarily causes a change of free motion in the sense of the asymptotic condition of a scattering problem. The appearance of the AB effect when $\alpha$ is an odd integer, though it is usually said to be absent, can be understood through this consideration. Our conclusion on this point is that a gauge transformation cannot affect the energy spectrum of a system, and that it leaves the S-matrix intact if and only if the gauge function $\Lambda(\mathbf{x})$ approaches a constant as $|\mathbf{x}| \to \infty$. The gauge transformation $e^{-i\alpha\varphi}$ does not satisfy this condition and therefore yields a nontrivial S-matrix even when $\alpha$ is an odd integer. By solving the LS equation for an infinitely thin solenoid, we obtained a wave function of a scattering state that coincides with the one given by Aharonov and Bohm in spite of the difference in incident waves. The difference in the incident waves used by Aharonov and Bohm and by us creates a difference in the scattering amplitude by the amount of a delta function peaked in the forward direction. The delta function in the forward direction has been shown to be essential for the unitarity of the S-matrix.
We derived the S-matrix of this scattering problem from a solution of an LS equation with the outgoing boundary condition by regarding a plane wave as an incident wave. We obtained the same result as an inner product of an outgoing wave and an incoming wave (see Appendix IV), which validates viewing a plane wave as an incident wave. Aharonov-Bohm's approach seems to lack a consideration of the incoming wave and hence a definition of the S-matrix. We assert that there is no convincing way to define an S-matrix relying on Aharonov-Bohm's modulated plane wave. If we take Aharonov-Bohm's scattering amplitude $f_{AB}^{(\alpha)}(\theta)$ as a piece of an S-matrix, it is interesting to ask what the rest would be. If we simply set
$$S(\theta) = \delta(\theta) + \sqrt{\frac{p}{2\pi}}\,f_{AB}^{(\alpha)}(\theta) \qquad (187)$$
as a relation between the S-matrix and Aharonov-Bohm's scattering amplitude and suppose that (188)
holds as a consequence of the unitarity, then, noticing that we obtain the curious relation
-fZ(O),
ffi*(-O)
=
Thus we find the definition by Eq. (187) to be invalid. To obtain the correct definition, we may set
to solve for $Q(\alpha)$ subject to the natural requirement $Q(0) = 1$. This can be done by using the relation
With the aid of this relation, we can write the unitarity of the S-matrix in terms of $Q(\alpha)$ as
$$|Q(\alpha)|^2 + \sin^2\pi\alpha = 1 \qquad (192)$$
to yield $Q(\alpha) = \cos\pi\alpha$. Thus we obtain the correct definition,
instead of Eq. (187). This is exactly the same one we obtained by solving the LS equation. Let us consider why we cannot resort to the simple relation Eq. (187). In the typical short-range scattering problem, as can be seen in Appendix III, we have asymptotic forms of outgoing and incoming waves,
whereas for the AB scattering we find (see Appendix V)
for an outgoing wave and
for an incoming wave, respectively. From Eq. (194), we observe that we always find a delta function $\delta(\mathbf{p} - \mathbf{q})$ in the inner product of $\psi^{(-)}(\mathbf{x}; \mathbf{p})$ and $\psi^{(+)}(\mathbf{x}; \mathbf{q})$ from an integration of the product of the first terms in each asymptotic form for typical short-range scattering problems. This function is the one usually expected in an S-matrix to describe an unscattered wave. If we could find the same delta function from the corresponding integration for the AB scattering, we would obtain an S-matrix given by Eq. (187). This is not the case, however. If we calculate the corresponding integration by substituting $\varphi_q = \pi$, we find $e^{-i\pi\alpha}\delta(\mathbf{p} - \mathbf{q})$. Conversely, by substituting $\varphi_q = -\pi$, we obtain $e^{i\pi\alpha}\delta(\mathbf{p} - \mathbf{q})$, though these two choices must be physically equivalent. This clearly shows the inadequacy of the interpretation of $f_{AB}^{(\alpha)}(\theta)$ as a scattering amplitude based on the observation of the asymptotic form Eq. (195). To find an S-matrix we return to the full expressions of the incoming and outgoing waves and calculate their inner product to find Eq. (193).
ACKNOWLEDGMENTS
We gratefully acknowledge fruitful discussions with Y. Ohnuki and K. Odaka on related topics.
APPENDIX I. PATH INTEGRAL FOR A SYSTEM IN THE AB POTENTIAL
Because the calculation of path integration becomes well defined when we formulate it in the framework of imaginary time with some attractive potential like a harmonic oscillator rather than in the original situation of the AB scattering, we consider a Euclidean path integral for the Hamiltonian H in Eq. (90) that contains a quadratic potential in addition to the vector potential of an AB solenoid. Once we obtain a heat kernel for this system, we can immediately transcribe the result into a corresponding Feynman kernel. Furthermore, by removing the effect of the additional potential, we will also be able to find a Feynman kernel for the system of the AB potential alone. To formulate a path integral for the heat kernel $\langle\mathbf{x}|e^{-\beta H}|\mathbf{x}'\rangle$ ($\beta > 0$) of the Hamiltonian in Eq. (90), we first evaluate the infinitesimal kernel
$$\langle\mathbf{x}|(1 - \epsilon H)|\mathbf{x}'\rangle \qquad (\epsilon = \beta/N \ll 1). \qquad (197)$$
To obtain an integral representation of this infinitesimal kernel, we make use of an expression of a two-dimensional delta function
that is the consequence of Eq. (16) in conjunction with Eq. (17). In Eq. (198), the indices of the Bessel functions, $\nu = \nu(n)$, can be any function of n as long as $\mathrm{Re}(\nu(n)) > -1$ holds. If we substitute $\nu = |n + \delta|$, with $\delta$ being some real number, a matrix element of the Hamiltonian Eq. (90) is given by
fi2p2
(n
+ S)2 r2
+
(n
+ a)2 r2
4
where the quadratic potential term has been symmetrized with respect to $\mathbf{x}$ and $\mathbf{x}'$. By simply substituting $\delta = \alpha$ in Eq. (199) and performing the necessary exponentiation, we obtain a desired representation of the infinitesimal kernel Eq. (197):
To simplify this expression, we may make use of the formula
which holds for ( arg(a)l < 17/2, Re(p) > -1, p . q > 0. Adapting Eq. (201) to the integration with respect to p in Eq. (200), we can rewrite its right-hand side as
By substituting 1
--+E
Aw
sinhfiwe’
l+-
(AWE)* + coshhm, 2
(203)
which do not affect the term O(E) in the exponent of the infinitesimal kernel, we can further simplify the above expression of the infinitesimal kernel as follows:
KE(r, (p; r’, (p’; E) E (x 1 (1 -
&) Id)
xexp{- pw cosh AWE (r2 2A sinh Awe
+ r!’)} .
Writing the infinitesimal kernel in this form and utilizing the addition theorem of hyperbolic functions, we can easily verify the multiplication rule
This means that we have found a representation for the heat kernel of finite (imaginary) time $\beta$, $\langle\mathbf{x}_F|e^{-\beta H}|\mathbf{x}_I\rangle = K_E(r_F, \varphi_F; r_I, \varphi_I; \beta)$, given by
which holds true for any $\beta > 0$. The right-hand side of Eq. (206) defines a bounded Hermitian operator $e^{-\beta H}$ as an integration kernel on the Hilbert space. If we set $\alpha = 0$, it becomes
which is just the well-known heat kernel for a two-dimensional isotropic harmonic oscillator. Another limit in which we are interested from the kernel
Eq. (206) is obtained by setting $\omega \to 0$. In this limit we obtain
By substituting $\hbar\beta = it$, we obtain the Feynman kernel Eq. (18)
which plays a fundamental role in our discussion of the LS formalism of AB scattering in section II. A path integral description of the AB effect even for a solenoid of finite size and arbitrary shape can be found in Ohnuki (1986). See also Schulman (1971) and Ohnuki and Kitakado (1993) for the relationship between the AB parameter $\alpha$ and an arbitrary parameter that emerges from consideration of the quantization of a particle moving on a circle.
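APPENDIX II. SEMICLASSICAL DERIVATION OF SHIFTED LANDAU LEVELS
In terms of two-dimensional polar coordinates $(r, \varphi)$, a classical Lagrangian for the system solved in section III.B is given by
$$L = \frac{\mu}{2}(\dot r^2 + r^2\dot\varphi^2) - (\hbar\alpha + \mu\omega r^2)\dot\varphi. \qquad (210)$$
Integrating the equations of motion,
$$\ddot r - r(\dot\varphi - \omega)^2 + r\omega^2 = 0, \qquad \frac{d}{dt}\{\mu r^2(\dot\varphi - \omega) - \hbar\alpha\} = 0, \qquad (211)$$
we find that $p_\varphi = \mu r^2(\dot\varphi - \omega) - \hbar\alpha$ and
$$C = \frac{\mu}{2}\dot r^2 + \frac{1}{2\mu r^2}(p_\varphi + \hbar\alpha)^2 + \frac{\mu}{2}\omega^2 r^2 \qquad (212)$$
are constant. Since $C = E - \omega(p_\varphi + \hbar\alpha)$, where E is the energy function given by
$$E = \frac{\mu}{2}(\dot r^2 + r^2\dot\varphi^2), \qquad (213)$$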
solving Eq. (212) with respect to $\dot r$ results in
Then the Bohr-Sommerfeld quantization condition reads
$$\oint p_r\,dr = 2\pi(m + 1/2)\hbar, \qquad \oint p_\varphi\,d\varphi = 2\pi n\hbar, \qquad (215)$$
where $p_r = \mu\dot r$ and m in the first equation has been replaced by $m + 1/2$ according to the well-known improvement by the WKB approximation of the Schrödinger equation (Landau and Lifshitz, 1965, ch. 7, sec. 48). The second equation of (215) explains the quantization of the angular momentum, $p_\varphi = n\hbar$, while the first one yields the energy spectrum of the system (modified Landau level). To see this, we rewrite the left-hand side of the first equation of Eq. (215):
$$\oint p_r\,dr = 2\mu\int_{\sqrt a}^{\sqrt b}\frac{dr}{r}\sqrt{\omega^2(b - r^2)(r^2 - a)},$$
where
$$a + b = \frac{2}{\mu\omega^2}\{E - \hbar\omega(n + \alpha)\}, \qquad ab = \frac{\hbar^2}{\mu^2\omega^2}(n + \alpha)^2 \qquad (0 \le a \le b),$$
then make a change of variable, $r \mapsto u = r^2$, to obtain
$$\oint p_r\,dr = \mu\omega\int_a^b\frac{du}{u}\sqrt{(b - u)(u - a)} = \pi\mu\omega\left\{\frac{a + b}{2} - \sqrt{ab}\right\}.$$
Thus the first equation of (215) is equivalent to
$$\frac{1}{\omega}\{E - \hbar\omega(n + \alpha)\} - \hbar|n + \alpha| = (2m + 1)\hbar,$$
from which the modified Landau level $E = (2m + 1 + n + \alpha + |n + \alpha|)\hbar\omega$ can be read immediately.
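The semiclassical argument above can be checked numerically. The sketch below is illustrative only and works in units $\hbar = \mu = \omega = 1$ (an assumption made for brevity); the function names are hypothetical. It evaluates the radial action $\oint p_r\,dr = \int_a^b du\,\sqrt{(b - u)(u - a)}/u$ by a midpoint rule after the substitution $u = (a + b)/2 + ((b - a)/2)\cos t$, and compares it with $2\pi(m + 1/2)$ at the energies of the shifted Landau levels.

```python
import numpy as np

def shifted_landau_level(m, n, alpha):
    """E = 2m + 1 + n + alpha + |n + alpha| in units hbar = mu = omega = 1."""
    nu = n + alpha
    return 2 * m + 1 + nu + abs(nu)

def radial_action(energy, nu, n_nodes=4000):
    """Closed-orbit integral of p_r dr for angular parameter nu = n + alpha."""
    total = 2.0 * (energy - nu)            # a + b
    product = nu ** 2                      # a * b
    root = np.sqrt(total ** 2 - 4.0 * product)
    a, b = 0.5 * (total - root), 0.5 * (total + root)
    # Midpoint nodes avoid the end points, where the integrand is 0/0 if a = 0.
    t = (np.arange(n_nodes) + 0.5) * np.pi / n_nodes
    u = 0.5 * (a + b) + 0.5 * (b - a) * np.cos(t)
    integrand = (0.5 * (b - a) * np.sin(t)) ** 2 / u
    return integrand.sum() * np.pi / n_nodes

alpha = 0.3
for m in range(3):
    for n in range(-2, 3):
        action = radial_action(shifted_landau_level(m, n, alpha), n + alpha)
        print(m, n, action / (2 * np.pi))   # expected to be close to m + 1/2
```

The printed ratio should agree with $m + 1/2$ for every n, which is the first condition of Eq. (215) read in reverse.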
APPENDIX III. UNITARITY AND THE OPTICAL THEOREM IN TWO-DIMENSIONAL SCATTERING
We briefly discuss two-dimensional scattering theory here for completeness. The Hamiltonian of the system under consideration is assumed to have
cylindrical symmetry, so that a scattered wave is well described as a function of the scattering angle $\theta$ by a scattering amplitude $f(\theta)$. Suppose we have two solutions, $\psi_k(r, \varphi; \varphi_0)$ and $\psi_k(r, \varphi; \varphi_0')$, for a scattering problem corresponding to different incident beams of the same energy. The solutions are assumed to have the asymptotic behavior
$$\psi_k(r, \varphi; \varphi_0) \sim e^{ikr\cos\theta} + \frac{1}{\sqrt r}e^{ikr - i\pi/4}f(\theta) \qquad (\theta = \varphi - \varphi_0), \qquad (220)$$
$$\psi_k(r, \varphi; \varphi_0') \sim e^{ikr\cos\theta'} + \frac{1}{\sqrt r}e^{ikr - i\pi/4}f(\theta') \qquad (\theta' = \varphi - \varphi_0'), \qquad (221)$$
where the phase factor $e^{-i\pi/4}$ has been introduced for later convenience. If we assume the Hamiltonian to be Hermitian, it is a straightforward procedure to obtain
as a consequence of the Schrodinger equation. Taking r sufficiently large and using the asymptotic form of the wave functions, we immediately find
(223) This is the generalized optical theorem and is merely the c-number version of the unitarity of the S-matrix. To see this, let us define the S-matrix for the wave function given in Eq. (220). From the asymptotic form of the wave function,
we can find a definition of operators
$$\hat S = 1 + \hat T, \qquad (\hat S F)(\varphi) = F(\varphi) + (\hat T F)(\varphi)$$
and
$$\hat T:\quad (\hat T F)(\varphi) = \sqrt{\frac{k}{2\pi}}\int_{-\pi}^{\pi}d\varphi_0\,f(\varphi - \varphi_0)F(\varphi_0).$$
Then unitarity of the operator $\hat S$ reads
$$\hat T^\dagger\hat T = -(\hat T + \hat T^\dagger),$$
which is equivalent to Eq. (223).
As a special case of Eq. (223) or (226), we can easily obtain the optical theorem just by taking $\varphi_0 = \varphi_0'$:
$$\sigma = -2\sqrt{\frac{2\pi}{k}}\,\mathrm{Re}(f(0)). \qquad (227)$$
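As an illustration of how the optical theorem follows from unitarity in this convention, the sketch below builds an amplitude from a finite set of phase shifts and compares the integrated cross section with $-2\sqrt{2\pi/k}\,\mathrm{Re}f(0)$. The partial-wave form $f(\theta) = (2\pi k)^{-1/2}\sum_n (e^{2i\delta_n} - 1)e^{in\theta}$ is an assumption: it is the expansion that matches the asymptotic form $e^{ikr\cos\theta} + e^{ikr - i\pi/4}f(\theta)/\sqrt r$ of Eq. (220), and the phase shifts used here are arbitrary test values, not those of any particular potential.

```python
import numpy as np

def amplitude(theta, deltas, k):
    """f(theta) for phase shifts deltas[n + N], n = -N..N (assumed convention)."""
    n_max = (len(deltas) - 1) // 2
    n = np.arange(-n_max, n_max + 1)
    s_minus_one = np.exp(2j * deltas) - 1.0
    return (np.exp(1j * np.outer(theta, n)) @ s_minus_one) / np.sqrt(2 * np.pi * k)

k = 1.7
rng = np.random.default_rng(0)
deltas = rng.uniform(-1.0, 1.0, size=11)      # arbitrary real phase shifts

theta = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
f = amplitude(theta, deltas, k)

# Total cross section as the angular integral of |f|^2.
sigma_integrated = np.sum(np.abs(f) ** 2) * (2 * np.pi / len(theta))

# Optical theorem in the phase convention of Eq. (220).
f_forward = amplitude(np.array([0.0]), deltas, k)[0]
sigma_optical = -2.0 * np.sqrt(2 * np.pi / k) * f_forward.real

print(sigma_integrated, sigma_optical)
```

Both numbers should also agree with $(4/k)\sum_n\sin^2\delta_n$, which is the familiar two-dimensional partial-wave cross section. Real phase shifts are exactly the statement that each partial-wave S-matrix element has unit modulus.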
We can also formulate a definition of an S-matrix in terms of outgoing and incoming waves. An outgoing wave $\psi^{(+)}(\mathbf{x}; \mathbf{k})$ is introduced to satisfy
with an asymptotic form
Then, an incoming wave corresponding to an outgoing wave is given by
$$\psi^{(-)}(\mathbf{x}; \mathbf{k}) = \psi^{(+)*}(\mathbf{x}; -\mathbf{k}). \qquad (230)$$
These wave functions are identified with those obtained as solutions of LS equations with outgoing and incoming boundary conditions, respectively. An S-matrix element from $\hbar\mathbf{q}$ to $\hbar\mathbf{p}$ is given by an inner product of these wave functions,
$$\langle\mathbf{p}|S|\mathbf{q}\rangle = \int\psi^{(-)*}(\mathbf{x}; \mathbf{p})\,\psi^{(+)}(\mathbf{x}; \mathbf{q})\,d^2x. \qquad (231)$$
Substituting the asymptotic form Eq. (230) and making use of the symmetry $f(-\theta) = f(\theta)$, which follows from the assumption made at the beginning of this appendix, we obtain
which is equivalent to Eq. (225).
APPENDIX IV. S-MATRIX OF THE AB SCATTERING
Taking a plane wave
as an eigenstate of the Hamiltonian (20) of a free particle, we obtain
as solutions of LS equations
Here we again replace $n + \alpha$ by $\nu$. A matrix element of the S-operator is given in terms of $|\Psi^{(+)}(\mathbf{k})\rangle$ and $|\Psi^{(-)}(\mathbf{k})\rangle$ by
Making use of the explicit form of $|\Psi^{(\pm)}(\mathbf{k})\rangle$, we can easily obtain
where $\delta_n = (n - |\nu|)\pi/2$. It is then easily seen that
That $\hat S\hat S^\dagger = 1$ can be verified in the same way. Therefore the S-matrix of the AB scattering is unitary. By performing the sum in Eq. (238), we can further
rewrite
Recalling Eq. (37) for the scattering amplitude $f(\theta)$, we find a fundamental operator relation,
$$\hat S = 1 + \hat T. \qquad (241)$$
Furthermore, if we introduce common eigenstates of the magnitude of the momentum, $k = |\mathbf{k}|$, and of the angular momentum,
$$|k; n\rangle = \frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi}d\varphi\,e^{in\varphi}|\Phi(\mathbf{k})\rangle \qquad (n = 0, \pm1, \pm2, \ldots), \qquad (242)$$
the operator $\hat S$ is diagonalized as
to confirm that the solution of the LS equation assures the unitarity of the S-matrix as well as its commutativity with $k$.
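The phase shifts quoted in this appendix, $\delta_n = (n - |n + \alpha|)\pi/2$, already show the period-2 behavior of the S-matrix in $\alpha$. The sketch below is a minimal illustration under the assumption that the partial-wave eigenvalue of the S-operator is $e^{2i\delta_n}$; the function name is hypothetical.

```python
import cmath
import math

def ab_partial_wave_s(n, alpha):
    """exp(2 i delta_n) with delta_n = (n - |n + alpha|) * pi / 2."""
    delta = (n - abs(n + alpha)) * math.pi / 2
    return cmath.exp(2j * delta)

# 0 < alpha < 1: exp(-i pi alpha) for n >= 0 and exp(+i pi alpha) for n < 0.
alpha = 0.3
for n in range(-3, 4):
    print(n, ab_partial_wave_s(n, alpha))

# Odd integer alpha gives -1 in every partial wave, even integer alpha gives +1.
print({ab_partial_wave_s(n, 1.0) for n in range(-3, 4)})
print({ab_partial_wave_s(n, 2.0) for n in range(-3, 4)})
```

For $0 < \alpha < 1$ the two values $e^{\mp i\pi\alpha}$ average to $\cos\pi\alpha$, which is consistent with a forward delta-function term carrying the coefficient $\cos\pi\alpha$. For $\alpha = 1$ every eigenvalue is $-1$ up to rounding, so the S-matrix is not the identity even though the bound-state spectrum is unchanged.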
APPENDIX V. RESULT OF AHARONOV AND BOHM AND THE UNITARITY OF THE S-MATRIX
A. Concise Form of Aharonov and Bohm's Wave Function
The wave function of the scattering state is given by
$$\psi_\alpha(\mathbf{x}; \mathbf{k}) = \sum_{n=-\infty}^{+\infty}J_{|\nu|}(x)\,e^{in(\theta + \pi) - i|\nu|\pi/2}, \qquad x = kr,\quad \theta = \varphi - \varphi_k,\quad \nu = n + \alpha. \qquad (244)$$
we can immediately convert Eq. (244) into its integral representation (Jackiw, 1990; Stelitano, 1995; Sakoda and Omote, 1997),
$rff =
1
e-cyt-incu/2
Ldt8siht
1 - e-t+iB+in/2 +
e-( 1--n)t-iB+i(
l+a)n/2
1 - e-t-i8+in/2
for $0 \le \alpha < 1$. (See Sakoda and Omote, 1997, fig. 1 for the contours of integrations in this appendix.) When $\alpha$ has an integral part ($\alpha = [\alpha] + \{\alpha\}$), the wave function is obtained by $\psi_\alpha = e^{-i[\alpha](\theta + \pi)}\psi_{\{\alpha\}}$. We may, therefore, consider only the case of $0 \le \alpha < 1$. Making a change of variable, we can further rewrite Eq. (246) as
where the contour $C_+$ goes along $-i\pi/2 + \infty \to -i\pi/2 \to 3i\pi/2 \to 3i\pi/2 + \infty$, while $C_-$ is taken to go along $i\pi/2 - \infty \to i\pi/2 \to -3i\pi/2 \to -3i\pi/2 - \infty$.
On a change of variable $t \mapsto u = e^t$ there arises a multivalued power of u in the integrand. Therefore we need to deal with it with due care. If we recall that our solution for the scattering state, $|\Psi^{(+)}(\mathbf{k})\rangle$ in Appendix IV, was obtained from the LS equation, we immediately notice that we have only one way to deform the contour to adapt the residue theorem to the integral on the u-plane. Another option for the deformation obviously corresponds to another solution, $|\Psi^{(-)}(\mathbf{k})\rangle$. As is seen in Sakoda and Omote, 1997, fig. 3, we can make use of the residue theorem for the contour integration around the unit circle only when $\theta \ne 0$. We then obtain
where
If we interpret the modulated plane wave $\psi_{\mathrm{inc}} = e^{ix\cos\theta - i\alpha(\theta - \mathrm{sgn}(\theta)\pi)}$ as an incident wave, the second term of Eq. (248) can be regarded as a scattered wave. Then we obtain the Aharonov-Bohm scattering amplitude $f_{AB}^{(\alpha)}(\theta)$ with the aid of the stationary-phase approximation from
where the explicit form of the scattering amplitude $f_{AB}^{(\alpha)}(\theta)$ without any restriction on the range of $\alpha$ is given by
So far, the amplitude $f_{AB}^{(\alpha)}(\theta)$ has not been treated in any connection with the S-matrix of the theory. Here it is important to note that we cannot define an S-matrix from Eq. (248) because $\psi_\alpha$ in Eq. (248) is not defined for $\theta = 0$. Therefore it is inappropriate to decompose the total wave function in the form given above for considering the relation between the S-matrix and $f_{AB}^{(\alpha)}(\theta)$. To find a definition of the S-matrix for this scattering problem, we need the asymptotic form of the total wave function:
which should then be compared with Eq. (224) and with the discussion in Appendix III. For the present case, the S-matrix is defined by
$$\hat S = \cos\pi\alpha\,1 + \hat T_{AB}^{(\alpha)}, \qquad (252)$$
which is the same as the result given in Eq. (240). By equating the expressions in Eqs. (252) and (241), we find
$$\hat T = (\cos\pi\alpha - 1)\,1 + \hat T_{AB}^{(\alpha)} \qquad (253)$$
as the relation of the two scattering amplitudes. Therefore $\int_{-\pi}^{\pi}d\theta\,|f_{AB}^{(\alpha)}(\theta)|^2$ cannot be interpreted as the total cross section. As a consequence, $f_{AB}^{(\alpha)}(\theta)$ does not obey the unitarity condition Eq. (226). Rather, it satisfies an operator relation
$$\hat T_{AB}^{(\alpha)\dagger}\hat T_{AB}^{(\alpha)} = \sin^2\pi\alpha\,1 - \cos\pi\alpha\,(\hat T_{AB}^{(\alpha)} + \hat T_{AB}^{(\alpha)\dagger})$$
because the S-matrix itself has been shown to be unitary. In terms of the amplitude itself, it is expressed as
$$\int_{-\pi}^{\pi}d\varphi\,f_{AB}^{(\alpha)*}(\varphi - \varphi_f)\,f_{AB}^{(\alpha)}(\varphi - \varphi_i) = \frac{2\pi}{k}\sin^2\pi\alpha\,\delta(\varphi_f - \varphi_i)$$
because $f_{AB}^{(\alpha)}(\theta)$ satisfies
B. Takabayashi's Derivation of Aharonov-Bohm's Wave Function
As was seen in the previous section, there is another option for deformation of the contour of integration, shown in Sakoda and Omote, 1997, fig. 4, to obtain $|\Psi^{(-)}(\mathbf{k})\rangle$ in the same calculation as for $|\Psi^{(+)}(\mathbf{k})\rangle$ given above. By using essentially the same technique, Takabayashi (1985) showed a simplified way to calculate the sum with respect to angular momentum in Aharonov-Bohm's wave function. Here, we briefly review his method through its application to finding $|\Psi^{(-)}(\mathbf{k})\rangle$. We again begin with Eq. (245) and consider suitable deformations of the contour of the integral. By substituting $x = \rho \pm i\epsilon$ ($\epsilon > 0$) and $t = u + iv$ in Eq. (245), we find
$$\left|e^{x\sinh t - \nu t}\right| = e^{\rho\sinh u\cos v \mp \epsilon\cosh u\sin v - \nu u}.$$
Writing $\rho\sinh u\cos v \mp \epsilon\cosh u\sin v = \sqrt{\rho^2\sinh^2 u + \epsilon^2\cosh^2 u}\,\cos(v \pm \delta)$, where $\delta = \epsilon\cosh u/(\rho\sinh u)$ to first order in $\epsilon$, we find that the upper limit of the contour C can be changed within $\pi/2 \mp \delta_\infty \le v \le 3\pi/2 \mp \delta_\infty$ ($\delta_\infty = \epsilon/\rho$) for $x = \rho \pm i\epsilon$, respectively. We can choose the lower limit in the same way from $-3\pi/2 \mp \delta_\infty \le v \le -\pi/2 \mp \delta_\infty$ for $x = \rho \pm i\epsilon$. Hence, we may take $\infty + i\pi/2$ and $\infty - i3\pi/2$ as the upper and lower limits, respectively, for $x = \rho + i\epsilon$ to obtain
$$J_\nu(\rho + i\epsilon) = \frac{1}{2\pi i}\int_{\infty - i3\pi/2}^{\infty + i\pi/2}e^{(\rho + i\epsilon)\sinh t - \nu t}\,dt. \qquad (254)$$
Splitting the whole contour into three segments, $\infty - i3\pi/2 \to -i3\pi/2$, $-i3\pi/2 \to i\pi/2$, and $i\pi/2 \to \infty + i\pi/2$, then arranging integrands, we obtain
for 0 < arg(z) < n / 2 . In the same way, we also find e-ivn/2
J”(Z)
=
JT
7 (I
dt}
(256)
if $-\pi/2 < \arg(z) < 0$. Equation (255) is the formula that was utilized in Takabayashi, 1985. As can be readily recognized, the condition $0 < \arg(z) < \pi/2$ for this formula clearly indicates its applicability for calculating the outgoing wave $\langle\mathbf{x}|\Psi^{(+)}(\mathbf{k})\rangle$, while $-\pi/2 < \arg(z) < 0$ for Eq. (256) suggests its use for the incoming wave $\langle\mathbf{x}|\Psi^{(-)}(\mathbf{k})\rangle$. We calculate the latter in the remainder of this appendix.
By substituting Eq. (256) for $\nu = |n + \alpha|$ and $z = kr - i\epsilon$, we can write the right-hand side of Eq. (235) as
Exchanging the sum and integration, we obtain from the first term +m
1
rn
in which we used the formula
for $\lambda = \theta \pm \theta'$ and a suitable choice of m to make $\lambda - 2m\pi$ fall into the range of integration with respect to $\theta'$. If $\theta = \pm\pi$, $e^{-i\alpha\theta}$ on the right-hand side of Eq. (258) should be replaced by $\cos\pi\alpha$, since for that case the delta functions in the above become those at an end point of the integration. Substituting $\alpha = [\alpha] + \{\alpha\}$ and $n + \{\alpha\} = \nu'$, we can calculate the sum in the second term of Eq. (259) as follows:
n=l
- , - i b I @ + ~ ) sin
e(l-lal)r
Noticing the symmetry under t
--t
e-(l-la))l
in the last expression, we obtain
Note that the sum in Eq. (259) has been evaluated under the assumption $t > 0$, whereas the resultant integral with respect to t is dominated by the contribution around $t = 0$. This requires a regularization of the sum in Eq. (259) to avoid a possible singularity at $t = 0$, $\theta = \pm\pi$. Hence $1/(1 + e^{i\theta})$ in Eq. (262), thus regularized, should be understood as a principal value. The same is also true for $f_{AB}^{(\alpha)}(\theta)$ in Eq. (250). Combining Eq. (258) with (260), we obtain a compact form of $\langle\mathbf{x}|\Psi^{(-)}(\mathbf{k})\rangle$ given by
(x 1 q ( - ) ( k ) )= I { ei(kr-i&)cosO-iaO
2n sin na ,-ibl(O+n) -~
e-i(kr-ia) cosh f
dt}
.
(261)
n When kr is sufficiently large, the integration in the second term is approximated by
We thus obtain an asymptotic form of the incoming wave
-
e-ikr+irr/4
e-i([al+l/2)(O+~)
rn
sin na
+
sin(@ n)/2}
for large kr. In view of Eq. (250), the second term in the above can be written in terms of f$(@ to yield
Since the first term is also related to its counterpart in the outgoing wave,
by eikr cos 0-ia0
eikrcos8-iu(O-sgn(0)n)
(266)
under the time reversal transformation, that is, $\mathbf{k} \to -\mathbf{k}$, $\alpha \to -\alpha$, and complex conjugation, we realize that the relation between outgoing and incoming waves still holds in their asymptotic forms.
APPENDIX VI. GORDON'S METHOD FOR SCATTERING BY THE COULOMB POTENTIAL
The Rutherford formula played a quite significant role in the development of quantum theory in its early days. It revealed the pointlike structure of the nucleus as well as the interaction between electrons and nuclei. Although Rutherford himself did not envision quantum mechanics, he had calculated a correct scattering cross section based on classical mechanics. (This fact itself is astounding.) As is well known, the system with a Coulomb potential is solvable in several situations and at several levels of interest: (i) the Kepler system in classical mechanics, (ii) Bohr-Sommerfeld quantization, which gives exact (bound state) energy levels for an attractive Coulomb system, and (iii) the Schrödinger equation, which can be solved exactly for both attractive and repulsive Coulomb potentials. Explanations for the solvability have also been studied intensively by the path integral method, by introducing the Kustaanheimo-Stiefel transformation (the Levi-Civita transformation in two dimensions), which converts the path integral for the fixed-energy amplitude of a Coulomb system into a quadratic form that enables us to carry out the path integral just as we do for a harmonic oscillator (see Kleinert, 1995, Ch. 13). Thus we also recognize that the bound state energy of a Coulomb system can be obtained through a semiclassical (WKB) approximation. In order to compare the features of Coulomb scattering and Aharonov-Bohm's model, we revisit the former in this section, although the result for the problem can be found in any textbook of quantum mechanics. A Hamiltonian for a Coulomb system is given by
where we assume a repulsive potential ($\kappa > 0$) for the sake of simplicity. A solution of the time-independent Schrödinger equation
$$\left(\nabla^2 + k^2 - \frac{2\gamma k}{r}\right)\psi(\mathbf{x}) = 0, \qquad \gamma = \frac{\mu\kappa}{\hbar^2 k} \qquad (268)$$
corresponding to an incident electron of energy $E = (\hbar k)^2/(2\mu)$ coming in along the negative z-axis is given by
where 1F1 (a, b;z ) designates a confluent hypergeometric function. It has an asymptotic form except just on the positive z-axis ( r = z) in the region far from the scattering center: +(x)
-
exp [i {kz
+ ylog{k(r
-
z)))]
x exp [i Ikr - y log{k(r - z))
Writing the second term as
+ JT + 2 arg(r( 1 + iy))}] .
( z = rcos8)
&kr
r 2k sin2(O/2)
(270)
exp [i { -2ylog{2kr sin2(O/2))
+ n + 2 arg(r(1 + iy))}]
,
(27 1) we immediately find the Rutherford formula
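For orientation, the step from the scattered wave to the cross section can be sketched as follows. This is a hedged reconstruction using the standard textbook form rather than a quotation from the text: it assumes that the scattered wave carries the modulus $\gamma/(2k\sin^2(\theta/2))$ times $e^{ikr}/r$, with $\gamma = \mu\kappa/(\hbar^2 k)$ and $E = (\hbar k)^2/(2\mu)$, so that the logarithmic phase and the constant phase $\pi + 2\arg\Gamma(1 + i\gamma)$ drop out of the squared modulus.

```latex
\frac{d\sigma}{d\Omega}
  = \left|\frac{\gamma}{2k\sin^{2}(\theta/2)}\right|^{2}
  = \frac{\gamma^{2}}{4k^{2}\sin^{4}(\theta/2)}
  = \left(\frac{\mu\kappa}{2\hbar^{2}k^{2}}\right)^{2}\frac{1}{\sin^{4}(\theta/2)}
  = \left(\frac{\kappa}{4E}\right)^{2}\frac{1}{\sin^{4}(\theta/2)} .
```

The last form contains no $\hbar$, which is why the classical calculation gives the same cross section.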
In obtaining the Rutherford formula, we interpret the first term of Eq. (270) as an incident wave that describes the incident beam, and the second term as a scattered one. Unlike the usual scattering problem of short-range force, both the incident and scattered waves of Coulomb scattering have extraordinary factors that make the incident wave differ from a plane wave and the scattered one from a simple spherical outgoing one, although their effect on corresponding currents can safely be neglected in the region far from the scattering center. These factors are understood as the appearance of the long-range nature of the Coulomb potential. A classical paper by Gordon (1928) (Mott and Massay, 1961) may be the first one that dealt with the scattering problem of a Coulomb system by means of quantum mechanics. He developed several ideas for solving the Schrodinger equation to fit the scattering situation, as well as an intuitive consideration from the viewpoint of a de Broglie wave. We review his idea briefly for comparison with the Aharonov-Bohm scattering. The long-range nature of the Coulomb potential is also made evident in a simple way by solving the problem using this rather elementary method. For any (short-range and/or long-range) potential, as an intermediate stage in solving a scattering problem, we can assume a region in which scattered particles are treated to be free from the potential by introducing an additional potential that should, of course, be removed at the end of the calculation or just by having its tail cut off, then formulating a scattering theory for the usual finite-range potential. After solving such a modified scattering problem,
we may be able to obtain the answer for the original problem by removing the additional potential or by taking the cutoff parameter to $\infty$. This is true for any short-range potential, and the result we get is independent of the scheme we use, whether we solve the Schrödinger equation for the original problem or follow the above procedure. However, solving the scattering problem and removing the additional potential (or taking the cutoff parameter to $\infty$) do not commute in general if the potential is a long-range one. For the Coulomb potential, we may assume a charge distribution on a sphere in order to completely hide the original charge located at the origin. We then obtain a Hamiltonian of a finite-range potential given by
Taking this Hamiltonian as a starting point, we solve a (well-defined) scattering problem, then we take $R \to \infty$ to consider the original Coulomb scattering. In region I (the scattering region), the Hamiltonian coincides with Eq. (267) to yield the same Schrödinger equation. Thus we obtain Eq. (268) again, and the wave function in this region is assumed to be given by
$$\psi_I(\mathbf{x}) = \sum_{l=0}^{\infty}i^l(2l + 1)P_l(\cos\theta)\,\chi_l(r), \qquad (274)$$
where the radial wave function $\chi_l(r)$ must obey
Conversely, the Hamiltonian describes a free motion in region II (the asymptotic region). Therefore we may assume the following wave function in this region:
$$\psi_{II}(\mathbf{x}) = e^{ikz} + \psi_S(\mathbf{x}), \qquad (276)$$
where the scattered wave $\psi_S$ in Eq. (276) should be expressed as
$$\psi_S(\mathbf{x}) = \sum_{l=0}^{\infty}i^l(2l + 1)P_l(\cos\theta)\,h_l^{(1)}(kr)\,C_l, \qquad (277)$$
where the $C_l$'s are constants to be determined through the continuity of the wave function and its derivative at the surface $r = R$, and $h_l^{(1)}(kr)$ designates
a spherical Hankel function of the first kind (diverging wave). A solution of Eq. (275) that is regular at $r = 0$ is given by
where $B_l$ is a constant. Finally, it is sufficient to observe the behavior of wave functions in the limit $R \to \infty$. Therefore we may consider the continuity of the wave function at $r = R$ by assuming that R is sufficiently large. Then we obtain
for sufficiently large R . Thus we obtain a wave function in region I: 00
p!q(x) = E i ' ( 2 1
+ l)P~(~os8)e'~'(kr)~~F~(I + 1 + iy, 21 + 2;-2ikr)
1=a
and one in region 11:
$rII(x)= eikz
+
00
i'(21
+ l)Pl(cosB)hl(')(kr)
I=O
After some tedious calculation (Messiah, 1970), we can evaluate the sum with respect to 1 in Eq. (280) to obtain
which differs from the Coulomb wave function Eq. (269) just by the factor $e^{-i\gamma\log(2kR)}$. No matter how small the difference, we cannot find the limit $R \to \infty$ for both $\psi_I$ and $\psi_{II}$. It may seem that we need only redefine the phase of the wave functions by multiplying by $e^{i\gamma\log(2kR)}$ to find the Coulomb wave function and revert to the usual Schrödinger wave mechanics. However, after doing so, we cannot assume a plane wave for an incident wave. This is the long-range nature of the Coulomb potential. The difficulty seen
above in taking $R \to \infty$ can be regarded as a common feature of long-range interactions. In this sense, AB scattering does not belong to the class of scattering by long-range potentials, because we met no such difficulty in solving the AB scattering problem by Gordon's method. It is worth noting here that for long-range potentials we have no convincing way of deriving, from the time-dependent Schrödinger equation, the Lippmann-Schwinger (LS) equation, which assumes an incident plane wave, without modifying the asymptotic free motion. Again, this is not the case for AB scattering, because we are able not only to derive but also to solve the LS equation for AB scattering.

As a final comment on Coulomb scattering, it is interesting to note that Gordon (1928) solved the problem by means of a semiclassical method. For any given point $x$ we can find two classical trajectories that pass through that point: one corresponding to a particle before being scattered, and the other corresponding to the scattered particle. The trajectories are distinguished according to whether the point $x$ is reached before or after passing through the turning point on that trajectory. By calculating approximate wave functions corresponding to these trajectories, an incident wave and a scattered wave are found within a semiclassical approximation. In this way, Gordon obtained the Rutherford formula. The method may be applicable to any potential for which a classical scattering event can be found, if a semiclassical approximation is desired. Again, this is not the case for AB scattering, because there is no classical counterpart of the AB effect.
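The classical limit behind the Rutherford formula can be checked numerically. The sketch below is not Gordon's semiclassical wave-function construction; it only uses the classical Coulomb relation between impact parameter and scattering angle, $b(\theta) = (\gamma/k)\cot(\theta/2)$, which assumes a repulsive potential $\alpha/r$ with $\gamma = m\alpha/\hbar^{2}k$. The function names and the numerical values of `gamma` and `k` are illustrative assumptions.

```python
import math


def impact_parameter(theta: float, gamma: float, k: float) -> float:
    """Classical Coulomb relation b(theta) = (gamma / k) * cot(theta / 2) (assumed form)."""
    return (gamma / k) / math.tan(theta / 2.0)


def classical_cross_section(theta: float, gamma: float, k: float, h: float = 1e-6) -> float:
    """Classical d(sigma)/d(Omega) = b / sin(theta) * |db/dtheta|.

    The derivative db/dtheta is approximated by a central difference of step h.
    """
    db_dtheta = (impact_parameter(theta + h, gamma, k)
                 - impact_parameter(theta - h, gamma, k)) / (2.0 * h)
    return impact_parameter(theta, gamma, k) / math.sin(theta) * abs(db_dtheta)


def rutherford(theta: float, gamma: float, k: float) -> float:
    """Rutherford formula in the same parameters: (gamma / 2k)^2 / sin^4(theta / 2)."""
    return (gamma / (2.0 * k)) ** 2 / math.sin(theta / 2.0) ** 4


if __name__ == "__main__":
    gamma, k = 0.7, 1.3  # arbitrary illustrative values
    for theta in (0.3, 1.0, 2.5):
        print(theta, classical_cross_section(theta, gamma, k), rutherford(theta, gamma, k))
```

The two columns printed for each angle should agree up to the finite-difference error, since $b\,|db/d\theta|/\sin\theta$ reduces analytically to $(\gamma/2k)^{2}/\sin^{4}(\theta/2)$.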
REFERENCES
Aharonov, Y., Au, C. K., Lerner, E. C., and Liang, J. Q. (1984). Phys. Rev. D 29, 2396.
Aharonov, Y., and Bohm, D. (1959). Phys. Rev. 115, 485.
Arai, A. (1992). J. Math. Phys. 33, 3374.
Arai, A. (1993). J. Math. Phys. 34, 915.
Arai, A. (1995). J. Math. Phys. 36, 2569.
Arai, M., and Minakata, H. (1998). Int. J. Mod. Phys. A 13, 831.
Audretsch, J., Jasper, U., and Skarzhinsky, V. D. (1995). J. Phys. A 28, 2359.
Bernido, C. C., and Inomata, A. (1981). J. Math. Phys. 22, 715.
Berry, M. V. (1980). Eur. J. Phys. 1, 240.
Berry, M. V. (1984). Proc. Roy. Soc. A 392, 45.
Bocchieri, P., and Loinger, A. (1978). Nuovo Cimento 47A, 475.
Bocchieri, P., Loinger, A., and Siragusa, A. (1979). Nuovo Cimento 51A, 1.
Boersch, H., Hamisch, H., Grohmann, K., and Wohlleben, D. (1961). Z. Phys. 165, 79.
Bordag, M., and Voropaev, S. (1993). J. Phys. A 26, 7637.
Byers, N., and Yang, C. N. (1961). Phys. Rev. Lett. 7, 46.
Chambers, R. G. (1960). Phys. Rev. Lett. 5, 3.
Chetouani, L., Guechi, L., and Hammann, T. F. (1989). J. Math. Phys. 30, 655.
Dąbrowski, L., and Šťovíček, P. (1998). J. Math. Phys. 39, 47.
Dirac, P. A. M. (1958). The Principles of Quantum Mechanics, 4th ed. Oxford, New York, pp. 89-94.
Doebner, H. D., Elmers, H. J., and Heidenreich, W. F. (1989). J. Math. Phys. 30, 1053.
Doebner, H. D., and Papp, E. (1990). Phys. Lett. A 144, 423.
Drăgănescu, Gh. E., Campigotto, C., and Kibler, M. (1992). Phys. Lett. A 170, 339.
Fowler, H. A., Marton, L., Simpson, J. A., and Suddeth, J. A. (1961). J. Appl. Phys. 32, 1153.
Gerry, C. C., and Singh, V. A. (1979). Phys. Rev. D 20, 2550.
Giacconi, P., Maltoni, F., and Soldati, R. (1996). Phys. Rev. D 53, 952.
Gordon, W. (1928). Z. Phys. 48, 180.
Guha, A., and Mukherjee, S. (1987). J. Math. Phys. 28, 840.
Hagen, C. R. (1990). Phys. Rev. D 41, 2015.
Hagen, C. R. (1991). Int. J. Mod. Phys. A 6, 3119.
Hagen, C. R. (1993). Phys. Rev. D 48, 5935.
Henneberger, W. C. (1981). J. Math. Phys. 22, 116.
Jackiw, R. (1990). Ann. Phys. 201, 83.
Kibler, M., and Negadi, T. (1987). Phys. Lett. A 124, 42.
Kleinert, H. (1995). Path Integrals in Quantum Mechanics, Statistics, and Polymer Physics, 2nd ed. World Scientific, Singapore.
Kretzschmar, M. (1965). Z. Phys. 185, 84.
Kretzschmar, M. (1965). Z. Phys. 185, 97.
Landau, L. D., and Lifshitz, E. M. (1965). Quantum Mechanics. Pergamon, New York.
Lewis, R. R. (1983). Phys. Rev. A 28, 1228.
Lin, D. H. (1998). J. Phys. A 31, 4785.
Magni, C., and Valz-Gris, F. (1995). J. Math. Phys. 36, 177.
Messiah, A. (1970). Quantum Mechanics. North-Holland, Amsterdam.
Möllenstedt, G., and Bayh, W. (1962). Phys. Bl. 18, 299.
Morandi, G., and Menossi, E. (1984). Eur. J. Phys. 5, 49.
Mott, N. F., and Massey, H. S. W. (1949). The Theory of Atomic Collisions. Oxford, New York. Japanese translation of the 2nd ed. by Takayanagi, K. (1961). Yoshioka Shoten, Tokyo.
Nagel, B. (1985). Phys. Rev. D 32, 3328.
Odaka, K., and Satoh, K. (1997). Mod. Phys. Lett. A 12, 337.
Ohnuki, Y. (1986). Proc. 2nd Int. Symp. on Foundations of Quantum Mechanics, Tokyo, pp. 117-126.
Ohnuki, Y., and Kitakado, S. (1993). J. Math. Phys. 34, 2827.
Olariu, S., and Popescu, I. I. (1985). Rev. Mod. Phys. 57, 339.
Park, D. K. (1995). J. Math. Phys. 36, 5453.
Park, D. K., and Yoo, S. K. (1998). Ann. Phys. 263, 295.
Pearson, D. B. (1988). Quantum Scattering and Spectral Theory. Academic Press, New York.
Peshkin, M. (1981). Phys. Rep. 80, 375-386.
Reed, M., and Simon, B. (1979). Methods of Modern Mathematical Physics. Vol. III: Scattering Theory. Academic Press, New York.
Reeh, H. J. (1989). J. Math. Phys. 29, 1535.
Roy, S. M. (1980). Phys. Rev. Lett. 44, 111.
Roy, S. M., and Singh, V. (1983). Phys. Rev. Lett. 51, 2069.
Ruijsenaars, S. N. M. (1983). Ann. Phys. 146, 1.
Sakoda, S., and Omote, M. (1997). J. Math. Phys. 38, 716.
Schulman, L. S. (1971). J. Math. Phys. 12, 304.
Schulman, L. S. (1981). Techniques and Applications of Path Integration. Wiley, New York, Sec. 23.1.
Shapere, A., and Wilczek, F. (1989). Geometric Phases in Physics. World Scientific, Singapore.
Shiekh, A. Y. (1986). Ann. Phys. 166, 299.
Simon, B. (1983). Phys. Rev. Lett. 51, 2167.
Sökmen, I. (1988). Phys. Lett. A 132, 65.
de Sousa Gerbert, Ph. (1989). Phys. Rev. D 40, 1346.
Stelitano, D. (1995). Phys. Rev. D 51, 5876.
Alvarez, M. (1996). Phys. Rev. A 54, 1128.
Takabayashi, T. (1985). Hadronic Journal Supplement 1, 219.
Tonomura, A., Osakabe, N., Matsuda, T., Kawasaki, T., Endo, J., Yano, S., and Yamada, H. (1986). "Evidence for Aharonov-Bohm Effect with Magnetic Field Completely Shielded from Electron Wave," Phys. Rev. Lett. 56, 792.
Tonomura, A., Umezaki, H., Matsuda, T., Osakabe, N., Endo, J., and Sugita, Y. (1983). "Electron Holography, Aharonov-Bohm Effect and Flux Quantization," Proc. Int. Symp. Foundations of Quantum Mechanics, Tokyo, pp. 20-28.
Tonomura, A., Matsuda, T., Hasegawa, S., Igarashi, M., Kobayashi, T., Naito, M., Kajiyama, H., Endo, J., Osakabe, N., and Aoki, R. (1989). "Electron-Interferometric Observation of Magnetic Flux Quanta Using the Aharonov-Bohm Effect," Proc. 3rd Int. Symp. Foundations of Quantum Mechanics, Tokyo, pp. 15-24.
Villalba, V. M. (1995). J. Math. Phys. 36, 3332.
Wilson, K. G. (1974). Phys. Rev. D 10, 2445.
Wu, T. T., and Yang, C. N. (1975). Phys. Rev. D 12, 3845.
Yang, C. N. (1974). Phys. Rev. Lett. 33, 445.
ISBN 0-12-014752-1