EDITOR-IN-CHIEF
PETER W. HAWKES
CEMES-CNRS, Toulouse, France
Academic Press is an imprint of Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
32 Jamestown Road, London NW1 7BY, UK
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
525 B Street, Suite 1900, San Diego, CA 92101-4495, USA

First edition 2010

Copyright © 2010, Elsevier Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333; email: [email protected]. Alternatively you can submit your request online by visiting the Elsevier web site at http://www.elsevier.com/locate/permissions, and selecting Obtaining permission to use Elsevier material.

Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-381314-5
ISSN: 1076-5670

For information on all Academic Press publications visit our Web site at elsevierdirect.com

Printed in the United States of America
10 11 12    10 9 8 7 6 5 4 3 2 1
Contributors
W. S. Bacsa
CEMES-CNRS and Université de Toulouse, 29, Rue Jeanne Marvig, BP 94347, 31055 Toulouse Cedex 4, France

Ruy H. A. Farias and Erasmo Recami
LNLS - Laboratório Nacional de Luz Síncrotron, Campinas, S.P., Brazil; and Facoltà di Ingegneria, Università statale di Bergamo, Italy, and INFN-Sezione di Milano, Milan, Italy

Andrew Neice
Stanford University Medical Center, Stanford, California, USA

A. Sever Škapin and P. Ropret
Slovenian National Building and Civil Engineering Institute, Dimičeva 12, 1000 Ljubljana, Slovenia; and Institute for the Protection of Cultural Heritage of Slovenia, Conservation Centre, Research Institute, Poljanska 40, 1000 Ljubljana, Slovenia; and Museum Conservation Institute, Smithsonian Institution, 4210 Silver Hill Road, Suitland, Maryland 20746, USA

Markus E. Testorf and Michael A. Fiddy
Dartmouth College, Hanover, NH, USA; and University of North Carolina-Charlotte, USA
Preface
The present volume is almost a thematic volume on subwavelength microscopy, for three of the chapters deal with different aspects of this subject, which is currently the object of much research effort. The two remaining chapters deal with the "chronon" and with the role of microscopy in the fine arts, other aspects of which I hope to cover in future volumes. The volume begins with an account by W. Bacsa of optical interference close to surfaces and ways of using this to achieve subwavelength resolution. The different families of standing waves are examined and the potential of the methods is described. This is followed by a highly unusual contribution by R.H.A. Farias and E. Recami, in which the discretization of time is studied. This leads the authors to recapitulate the familiar theories of the electron and also takes them well beyond "electron physics"; many original ideas are put forward. This chapter, which forms a short monograph on the subject, will surely stimulate further discussion. The third chapter brings us back to subwavelength imaging; here, A. Neice discusses the limitations of the various methods. A whole section is devoted to Pendry's superlens and the concluding chapter examines the limit of resolution. Many optical and electron optical techniques are used to study paintings, frescoes and archaeological material. In the next chapter, A. Sever Škapin and P. Ropret show how historical pigments in wall layers can be analysed by optical and scanning electron microscopy and by energy-dispersive techniques. They apply these methods to samples from a number of churches and other buildings in Slovenia. The volume ends with a long account by M.E. Testorf and M.A. Fiddy on superresolution. This is a wide-ranging chapter that sets out from the Rayleigh limit and Abbe's theory, after which the notion of degrees of freedom is examined.
This is followed by Lukosz superresolution, filters and the Gerchberg-Papoulis algorithm, with a last section on generalized sampling. This nicely complements the earlier chapters on subwavelength imaging, and some of that material is seen here from a different standpoint. As always, my thanks to all the authors for their efforts to make their subjects accessible to a wide readership.

Peter W. Hawkes
Future Contributions
A. Abramo and L. Geretti
  Deterministic and statistical neurons
S. Ando
  Gradient operators and edge and corner detection
N. Baddour
  2D Fourier transforms in polar coordinates
A. Bardea and R. Naaman (Vol. 164)
  Magnetolithography: from the bottom-up route to high throughput
D. Batchelor
  Soft x-ray microscopy
E. Bayro Corrochano
  Quaternion wavelet transforms
C. Beeli
  Structure and microscopy of quasicrystals
C. Bobisch and R. Möller
  Ballistic electron microscopy
F. Bociort
  Saddle-point methods in lens design
A. Buchau
  Boundary element or integral equation methods for static and time-dependent problems
N. V. Budko
  Negative velocity and the electromagnetic field
E. Buhr
  Transmission scanning electron microscopy
R. Castañeda (Vol. 164)
  The optics of spatial coherence wavelets
A. Cornejo Rodriguez and F. Granados Agustin
  Ronchigram quantification
T. Cremer
  Neutron microscopy
E. de Chambost
  The history of CAMECA
J. Debayle and J. C. Pinoli
  Theory and applications of general adaptive neighbourhood image processing
A. X. Falcão
  The image foresting transform
R. G. Forbes
  Liquid metal ion sources
C. Fredembach
  Eigenregions for image classification
R. Früke
  EUV scanning transmission microscopy
A. Gölzhäuser
  Recent advances in electron holography with point sources
P. Han and H. E. Hwang
  Phase retrieval in the Fresnel domain
M. Haschke
  Micro-XRF excitation in the scanning electron microscope
L. Hermi, M. A. Khabou, and M. B. H. Rhouma
  Shape recognition based on eigenvalues of the Laplacian
M. I. Herrera
  The development of electron microscopy in Spain
R. Hill
  The helium ion microscope
A. Imiya and T. Sakai
  Gradient structure of images in scale space
M. S. Isaacson
  Early STEM development
K. Ishizuka
  Contrast transfer and crystal images
A. Jacobo
  Intracavity type II second-harmonic generation for image processing
L. Kipp
  Photon sieves
T. Kohashi
  Spin-polarized scanning electron microscopy
O. L. Krivanek
  Aberration-corrected STEM
R. K. Leary and R. M. D. Brydson
  Chromatic aberration correction, the next step in electron microscopy
S. Lefevre and J. Weber
  Mathematical morphology, video and segmentation
R. Leitgeb
  Fourier domain and time domain optical coherence tomography
B. Lencová
  Modern developments in electron optical calculations
J.-c. Li (Vol. 164)
  Fast Fourier transform calculation of diffraction integrals
H. Lichte
  New developments in electron holography
M. Marrocco
  Discrete diffraction
M. Matsuya
  Calculation of aberration coefficients using Lie algebra
P. Midgley
  Precession microscopy
L. Muray
  Miniature electron optics and applications
S. Nepijko and G. Schönhense
  Analysis of optical systems, contrast depth and measurement of electric and magnetic field distribution on the object surface in mirror electron microscopy
S. Nepijko and G. Schönhense
  The use of electron holography to measure electric and magnetic fields and other practical applications
M. A. O’Keefe
  Electron image simulation
H. Ott
  Scanning electron microscopy of gaseous specimens
D. Paganin and T. Gureyev
  Intensity-linear methods in inverse imaging
N. Papamarkos and A. Kesidis
  The inverse Hough transform
K. S. Pedersen, A. Lee, and M. Nielsen
  The scale-space properties of natural images
H. Sawada
  Recent developments in aberration correction for electron lenses
T. Schulz
  Thermoluminescence in scanning electron microscopy
R. Shimizu, T. Ikuta, and Y. Takai
  Defocus image modulation processing in real time
T. Soma
  Focus-deflection systems and their applications
P. Sussner and M. E. Valle
  Fuzzy morphological associative memories
V. Syrovoy
  Theory of dense charged particle beams
I. Talmon
  Study of complex fluids by transmission electron microscopy
M. Teschke
  Phase-contrast imaging
Y. Uchikawa
  Electron gun optics
Z. Umul
  The boundary diffraction wave
E. Wolf
  History and a recent development in the theory of reconstruction of crystalline solids from X-ray diffraction experiments
L. Yaroslavsky
  Sampling and image recovery from sparse data
D. Yi (Vol. 164)
  Fourth-order partial differential equations for image enhancement
Chapter 1

Optical Interference near Surfaces and its Application in Subwavelength Microscopy

W. S. Bacsa
Contents
1. Overview of Optical Interference Near Surfaces
2. Optical Microscopy and Optical Standing Waves
3. Optical Standing Waves Near Surfaces, Holography, and Interference Substrates
4. Intermediate-Field and Surface Standing Waves
5. Lateral Standing Waves
6. Reconstruction of Intermediate-Field Images
7. Talbot Effect and Phase Singularities
8. Conclusion and Perspectives
Acknowledgement
References
1. OVERVIEW OF OPTICAL INTERFERENCE NEAR SURFACES

Tremendous progress in projection lithography and imprint lithography over the past two decades has made it possible to control the surface structure of semiconductors at a fraction of the optical wavelength (Rothschild, 2005). The considerable control of the structure of thin films affords new opportunities to explore optics in much greater detail at subwavelength scales. This means that new functional materials can be
CEMES-CNRS and Université de Toulouse, 29, Rue Jeanne Marvig, BP 94347, 31055 Toulouse Cedex 4, France
Advances in Imaging and Electron Physics, Volume 163, ISSN 1076-5670, DOI: 10.1016/S1076-5670(10)63001-7. Copyright © 2010 Elsevier Inc. All rights reserved.
designed or optical techniques developed based on entirely different principles than conventional optical microscopes or conventional grating-based optical spectroscopes. Optical interference is commonly used in optical filters and in the fabrication of optical gratings, and it plays a crucial role in photonic crystals (Joannopoulos et al., 1997). Optical interference also plays an important role in improving masks in advanced projection lithography (Liu and Zakhor, 1992). When considering a plane surface and a monochromatic optical beam, the reflected beam necessarily overlaps and interferes with the incoming beam within a distance that scales with the beam width. Depending on the coherence length of the incident optical beam, standing optical waves are formed in the zone of the overlapping incident and reflected beams. Here we review experimental observations of optical standing waves near surfaces, show in some detail the formation of lateral and surface standing waves near surfaces, and discuss how interference substrates can be used to enhance optical signals of molecular monolayers or graphene. We show that the intermediate distance range is a relatively unexplored region of the optical field that yields new perspectives for the development of optical holography without a reference beam, extended to the subwavelength range (Bacsa, 1999). When an optical probe is scanned in collection mode, the optical field at intermediate distance from structured surfaces can be explored in unprecedented detail. We compare experimental images recorded with an optical probe in collection mode with an analytic dipole model to explain the surface and lateral standing waves (LSWs) near surfaces. The availability of scanning probe instruments makes it possible to record optical standing waves near surfaces. This offers the possibility of exploring the physics at length scales comparable to and below the size of the wavelength of light.
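The size of this overlap zone follows from simple geometry (a sketch based on the plane-wave picture, not on data from this chapter): for a collimated beam of width w incident at angle θ from the surface normal, a point at height z above the reflection point lies inside both the incident and the reflected beam only if z < w/(2 sin θ).

```python
import math

def overlap_height(beam_width_mm, angle_incidence_deg):
    """Maximum height above the surface at which a beam of width w still
    overlaps its own reflection: z_max = w / (2 sin(theta))."""
    theta = math.radians(angle_incidence_deg)
    return beam_width_mm / (2.0 * math.sin(theta))

# A 1-mm-wide beam at 45 degrees overlaps its reflection up to ~0.71 mm
# above the surface; toward normal incidence the overlap zone grows without bound.
print(overlap_height(1.0, 45.0))
```

Consistent with the text, the overlap distance scales linearly with the beam width, and it is within this zone that the standing waves discussed below are formed.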
Optical interference also plays an important role in the field of plasmonics, which is directed primarily toward defining metallic nanostructures in resonance with the incident light (Ozbay, 2006). Here we focus on (1) exploring the optical field near structured surfaces without limiting ourselves to metals, and (2) learning how knowledge of the field distribution at intermediate distances from the surface can be used to characterize the surface topography and its composition. We consider applications in optical subwavelength surface imaging and optical spectroscopic sensors. Standing waves have also been observed and used for electrons and X-rays: standing waves of electrons on metallic surfaces have been observed using scanning tunneling microscopy (Crommie et al., 1993), and standing waves of X-rays near surfaces have been used to observe scattering from surface layers (Zegenhagen, 1993).
2. OPTICAL MICROSCOPY AND OPTICAL STANDING WAVES

Lens-based optical microscopes rely on ray optics, which does not take into account the wave aspect of photons. The lateral resolution of lens-based optical microscopes is limited by optical diffraction to about half the optical wavelength. Scanning optical probe techniques with apertures smaller than the wavelength circumvent the diffraction limit of lens-based systems. Near-field optics aims to use the large local field in the proximity of the surface using an optical scanning probe in either reflection or transmission geometry (Betzig and Trautman, 1992). However, the transmission of optical waves through the aperture of optical probes is strongly reduced for apertures smaller than the optical wavelength. Apertures in the size range of a fraction of the wavelength are typically used. To reach the near field and to record images with high lateral resolution, the distance to the surface is often one order of magnitude smaller than the aperture size. The image resolution, however, is still limited by the aperture size. The small probe-substrate distance compared with the aperture size makes it difficult to image surfaces with height variations larger than the near field and implies a strong probe-surface coupling. In this chapter, we are interested in observing the local field at larger distances from the surface, comparable to the aperture size of the optical probe. We find that a larger distance from the surface does not necessarily imply a lower resolution. At larger distances from the surface, it is important to consider the interference of the scattered light, or the superposition of diffracted waves from different parts of the surface, to understand the optical field at intermediate distance from the surface. Optical standing waves were first observed by Wiener using photographic emulsions (Sommerfeld, 1954).
Wiener pointed out that the interference of the incident and reflected monochromatic beams forms standing waves near surfaces. In the case of planar surfaces, the standing optical wave is oriented parallel to the surface. In studies of thin films, interference effects near surfaces have been observed in the photoluminescence intensity on reflecting surfaces: the photoluminescence intensity showed oscillations when the film thickness was changed (Holm et al., 1982), which has been explained by multiple reflection and interference in thin films. Similarly, when observing the Raman signal of nitrogen and oxygen layers on a silver substrate, oscillations of the Raman signal have been observed as a function of film thickness (Ager et al., 1990). The oscillations in the optical signal show that the intensity is enhanced when the thickness of the film is a multiple of half the excitation wavelength, taking into account the index of refraction of the medium and the angle of incidence. The fact that the Raman signal is enhanced for particular thicknesses of the film has been used to
define an interference substrate that enhances the Raman signal of ultrathin layers. Interference substrates consist of a reflecting layer and a transparent layer, with the thickness of the transparent layer tuned to the excitation wavelength and angle of incidence; such substrates show enhanced Raman signals (Bacsa and Lannin, 1992). Optical standing waves near surfaces, using optical scanning probes in collection mode, were first observed by Umeda et al. (1992), who showed that the fringe spacing of the surface standing wave (SSW) depends on the angle of incidence: the fringe spacing increases with increasing angle of incidence and is inversely proportional to the cosine of the angle of incidence. By combining interference substrates with optical scanning probes in collection mode, it has been demonstrated that subwavelength resolution can be achieved when imaging at intermediate distance from the surface. Using interference substrates, a lateral resolution of about one-fifteenth of the wavelength has been observed for metal island films (Bacsa and Kulik, 1997). Optical standing waves have also been used to control the position of an optical probe near the liquid-gas interface to observe surface-adsorbed organic molecules (Kramer et al., 1998). In summary, optical interference near surfaces has been observed for many years. Research on ultrathin films and surface-adsorbed molecular monolayers has revived interest in using optical standing waves to increase spectroscopic sensitivity. Optical scanning probe microscopy allows exploration of optical standing waves in unprecedented detail.
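The fringe-spacing relation observed by Umeda et al., λs = λ/(2 cos θin), is easy to evaluate; a minimal sketch (the 633-nm wavelength and the angles are illustrative values, not taken from the experiments above):

```python
import math

def ssw_fringe_spacing(wavelength_nm, angle_incidence_deg):
    """Surface standing-wave fringe spacing: lambda / (2 cos(theta_in))."""
    theta = math.radians(angle_incidence_deg)
    return wavelength_nm / (2.0 * math.cos(theta))

# The spacing is lambda/2 at normal incidence and grows with the angle
# of incidence, reaching one full wavelength at 60 degrees.
for angle in (0.0, 30.0, 45.0, 60.0):
    print(angle, ssw_fringe_spacing(633.0, angle))
```

At grazing incidence the spacing diverges, consistent with the fringes washing out for a beam traveling nearly parallel to the surface.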
3. OPTICAL STANDING WAVES NEAR SURFACES, HOLOGRAPHY, AND INTERFERENCE SUBSTRATES

The recording of optical standing waves with an optical probe near surfaces can be compared with the recording of optical standing waves in optical holography. In optical holography, the standing wave formed with the reference wave is recorded in a photosensitive emulsion, or hologram (Caulfield, 1970). In the vicinity of the surface no reference beam is needed: the object wave, or reflected beam, interferes directly with the incident beam for distances smaller than the beam diameter. As a result, the recording of the optical standing wave near surfaces can be considered a new form of optical holography. The difference from conventional holography is that the distance between the recorded standing wave and the object is considerably smaller for standing waves in the overlap zone. This has the potential to increase the achievable lateral resolution. While holography and microscopy are well-differentiated concepts in conventional optics and are often considered complementary,
we see here that the two concepts come together when considering recording standing waves near surfaces with a scanning probe. Figure 1 shows the incident and reflected beams and the formation of SSWs parallel to the surface in the overlap zone. This scenario is compared with the interference of the reference beam with the reflected beam, as used in conventional holography, forming standing waves that are recorded in a photosensitive emulsion. The orientation of the standing wave in the overlap zone depends on the relative orientation of the two beams. Holograms are recorded in this overlap zone. The reference beam forms the reflected beam through the hologram, which reproduces the standing waves in the absence of the substrate. Optical standing waves near the surface, however, can be recorded directly with a scanning optical probe, without the use of photosensitive emulsions. The standing wave information is recorded by scanning the optical probe at variable distance from the surface, in a plane parallel or perpendicular to the surface. In general, the optical field penetrates into the substrate surface depending on its index of refraction. Imaging in the proximity of the surface is useful for imaging surfaces at subwavelength resolution or for detecting refractive index changes below the surface that are not visible to scanning force microscopy. When the distance between the image plane and the surface is increased, the interference of the scattered waves from different parts of the surface gives rise to a complicated standing wave or diffraction pattern. A thorough understanding of the
FIGURE 1 Schematic of overlap zone near a reflecting surface and in the overlap zone of a reference beam and the reflected beam. The triangular zone shows the overlap zone of incident and reflected beam. The standing waves are oriented parallel to the surface. The parallelogram shows the overlap zone of the reference beam with the reflected beam. The orientation of the standing waves depends on the propagation direction of the two beams. The arrow in the parallelogram shows how the reflected beam is formed in the absence of the surface by reflecting off the grating generated by the standing waves in the hologram.
formation of the scattered wave and its interference will finally allow standing waves to be described with a model that can be used to test numerical reconstruction. In the following sections we discuss experimental findings, a simplified dipole model for the formation of standing waves, and the numerical reconstruction of the images generated by the dipole model. In more recent years, interference substrates have been used to increase the fluorescence signal of cells and to increase the optical contrast in optical microscopy (Lambacher and Fromherz, 2002). The most striking example of the application of interference substrates has been in making single atomic layers of graphene visible using conventional optical microscopy (Blake et al., 2007; Geim and Novoselov, 2007). For interference substrates the standing wave maximum falls on the surface; the substrate and the transparent spacer layer form a half cavity. As a result, the deposition of a single atomic layer of graphene influences the field at the surface considerably, which changes the reflectivity by a sizable amount. Figure 2 shows the time-averaged electric field intensity of the SSW perpendicular to the surface for an interference substrate (Si with a 300-nm SiO2 spacer). The two curves show the influence of depositing a single atomic aluminum layer on the standing wave: the amplitude at the SiO2/air interface is significantly reduced, and the

FIGURE 2 Time-averaged electric field intensity of the SSW perpendicular to the surface for an interference substrate. A single atomic aluminum layer shifts the standing wave and changes its intensity at the SiO2/air interface. The two curves show the influence of the deposition of a single metallic atomic layer on the interference substrate on the standing wave.

standing wave is shifted by more than an order of magnitude compared with the thickness of the metallic monolayer (Bacsa, 1997). The use of interference substrates was critical in identifying monolayers of graphene (Geim and Novoselov, 2007). This finding was crucial in the recent surge of interest in the research on graphene. The Raman spectrum of monolayer graphene has an unusual second-order spectrum that has been used to confirm the presence of monolayer graphene. An interference substrate enhances the Raman signal owing to the interference maximum at its surface and is now routinely used to confirm the presence of individual graphene layers. The next section reviews the fundamental characteristics of standing waves near surfaces, how they are observed experimentally using optical scanning probe microscopy, and their description using a simplified dipole model.
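The half-cavity behavior can be illustrated with the simplest possible model, normal incidence on a perfect reflector (a sketch only; the real Si/SiO2 stack of Figure 2 requires the Fresnel coefficients of the layer stack): with a π phase shift on reflection, the time-averaged intensity varies as sin²(2πz/λ), with a node at the mirror and the first antinode a quarter wavelength above it.

```python
import math

def standing_wave_intensity(z_nm, wavelength_nm):
    """Time-averaged intensity above a perfect mirror at normal incidence,
    assuming a pi phase shift on reflection: I(z) = 4 sin^2(2*pi*z/lambda)."""
    k = 2.0 * math.pi / wavelength_nm
    return 4.0 * math.sin(k * z_nm) ** 2

lam = 532.0                                     # illustrative wavelength (nm)
print(standing_wave_intensity(0.0, lam))        # node at the mirror surface
print(standing_wave_intensity(lam / 4.0, lam))  # first antinode, lambda/4 above
```

An interference substrate amounts to inserting a transparent spacer whose optical thickness places such an antinode at the sample surface, which is why a monolayer deposited there perturbs the field so strongly.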
4. INTERMEDIATE-FIELD AND SURFACE STANDING WAVES

Structured surfaces with dimensions comparable to or smaller than the wavelength of light are increasingly used as parts of functional units in integrated devices. Light scatters in multiple directions on these structured surfaces. The overlap of the scattered and the incident/reflected waves leads to complex standing waves due to their phase coherence. While much effort has been put into understanding the near field in the proximity of surfaces, less effort has been expended on understanding light scattering at intermediate distance from surfaces. Standing waves in general are formed when two counterpropagating monochromatic and coherent waves overlap. Standing waves near surfaces can be classified into two main types: LSWs and SSWs. The standing wave front is always directed perpendicular to the sum of the two wave vectors describing the propagation directions of the two beams. We assume for simplicity only plane waves. The complexity of the standing wave field increases with distance due to the larger number of overlapping scattered waves contributing to the field. To observe the optical standing wave near surfaces, we use a pointed optical fiber probe in collection mode and scan the probe parallel to the surface at variable distance from the surface. To reduce the complexity of the standing wave field, we scan the probe in the optical field outside the near-field range. To show the advantage of scanning the probe outside the near-field range, we describe the optical field as a function of distance from the surface using a dipole model. In a simplified picture (Figure 3) we can use oscillating dipoles to approximate light scattered from objects orders of magnitude smaller than the wavelength. The field of an oscillating dipole consists of three terms, which are proportional to 1/r, 1/r², and 1/r³. This distance
FIGURE 3 Scan configuration of the optical scanning probe with respect to the plane of incidence and the surface: substrate (S), plane of incidence (P), image plane (I), distance between substrate and image plane (h), and angle of incidence (a).
dependence of the field has the consequence that the relative importance of the three terms is inverted at a characteristic distance lc = λ/2π. Below this critical distance lc the 1/r³ term is dominant, whereas above this critical distance lc the 1/r term prevails. The field associated with the 1/r term is transverse, or perpendicular to the radial direction, and propagating, whereas the field associated with the 1/r³ term is longitudinal, or parallel to the radial direction, and nonpropagating. In this context, we can classify the near field as the range for which the distance is smaller than the characteristic distance lc, and we name the field for distances larger than lc but still near the object (< 100λ) the intermediate field. In the intermediate field, the propagating transverse field is dominant, as in the far field. However, in the intermediate field the dispersion of the path length between a given image point and points on the surface is large. This has important consequences for the relative phase at each image point. Figure 4 clearly shows that at distances larger than lc the 1/r term is dominant, and the modeling of the interference of scattered waves is simplified by taking only this propagating term into account. We conclude that at distances larger than lc, the propagating part of the scattered field is dominant. Interestingly, this critical distance is comparable to the resolution observed in near-field optics. We therefore expect a similar lateral resolution at intermediate distance from the surface, provided the lateral resolution scales with the distance from the surface. In fact, we see later that due to the presence of the substrate one must take into account its index of refraction, which has the added effect of improving the resolution limit for the intermediate field.
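The crossover can be made concrete by comparing the magnitudes of the three terms (a sketch with all prefactors set to unity, so only the dependence on kr is shown):

```python
import math

def dipole_terms(r, wavelength):
    """Relative magnitudes of the 1/r, 1/r^2, and 1/r^3 dipole-field terms
    as functions of kr, with all prefactors set to 1 for illustration."""
    kr = 2.0 * math.pi * r / wavelength
    return 1.0 / kr, 1.0 / kr**2, 1.0 / kr**3

lc = 1.0 / (2.0 * math.pi)           # critical distance l_c = lambda/(2*pi), in units of lambda
print(dipole_terms(lc, 1.0))         # at r = l_c (kr = 1) all three terms coincide
print(dipole_terms(10.0 * lc, 1.0))  # ten times farther out, 1/r exceeds 1/r^3 a hundredfold
```

This is why, at intermediate distances, keeping only the propagating 1/r term is a good approximation when modeling the interference of scattered waves.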
For a perfectly flat surface the resulting wave vector for the incident and reflected wave is oriented parallel to the surface. The component of the wave vector of the incident wave and reflected wave perpendicular to
the surface are counterpropagating and cancel each other; as a consequence, standing waves are formed parallel to the surface. The resulting wave moves parallel to the surface, but its amplitude oscillates in the direction perpendicular to the surface, with its maxima and minima at given distances from the surface. This is explained by the constant path length difference between two points in one wave front of the incident and the reflected wave. The time-averaged field is nonuniform and oscillates in a direction perpendicular to the surface (see Figure 2), forming the SSW. By inclining the incident beam, the wave vector component perpendicular to the surface is smaller and the resulting wave vector larger, leading to a SSW with a larger wavelength. Figure 5 shows the calculated time-averaged electric field intensity of a Gaussian beam reflected off a flat surface. The fringes of the SSW are oriented parallel to the surface. The fringe spacing depends on the angle of incidence (θin) and is given by

λs = λ / [2 cos(θin)].

When an optical probe is scanned in the vicinity of a surface (Figure 3), SSWs can be used to orient the image plane parallel to the surface by changing the tilt angles. For a perfectly flat image plane all the fringes of the SSW disappear when it is parallel to the surface. However, when a piezo tube scanner is used, the substrate is displaced in a plane that is deformed
FIGURE 4 The three terms of the electric field proportional to 1/r, 1/r², and 1/r³ for an oscillating dipole as a function of radial distance. The relative importance of the three terms is inverted at the characteristic distance lc = λ/2π.
FIGURE 5 The calculated electric field intensity (log scale) of a Gaussian beam reflected off a flat surface, polarized perpendicular to the plane of incidence or parallel to the surface (transverse-electric).
due to bending of the piezo tube, giving rise to a circular contrast. The substrate is not displaced in a perfectly parallel manner and as a result, the image plane is spherically deformed. The piezo tube makes a symmetric bending movement, which causes the intensity of the image to change from the center to the edge of the image with either a maximal intensity at the center or at the edge depending on whether the center of the image is placed in a standing wave maximum or minimum. After the image plane is oriented parallel to the surface by adjusting the tilt correction (via application of correction voltage to sectors of the piezo tube), one can then approach the surface without the optical probe colliding with the surface. As a result, images can be recorded in the very proximity of the surface with no feedback signal at constant distance from the surface. The imaging of optical standing waves on flat surfaces can also be used to verify the symmetry of the scanner movement. Surface standing waves depend on the wavelength of the incident beam. If the incident beam contains several wavelengths, each component of the beam will form its own SSW. The spectroscopic composition of the beam can be extracted by mapping the field distribution perpendicular to the surface and carrying out a Fourier analysis. This means that mapping of SSWs can be used in spectroscopic sensors. Surface standing waves have been mapped with transparent detectors and their spectroscopic performance tested (Stiebig et al., 2005).
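The spectroscopic-sensor idea can be sketched numerically: synthesize the standing-wave intensity along z for a beam containing two wavelengths, Fourier-transform the scan, and recover the components from the fringe periods (normal incidence and a perfect reflector are assumed; the wavelengths are illustrative):

```python
import numpy as np

# Standing-wave intensity above a mirror for a beam containing two
# wavelengths (nm), normal incidence; each component has fringe period lambda/2.
wavelengths = (500.0, 650.0)
dz = 5.0                                   # probe step (nm)
z = np.arange(2600) * dz                   # 13-um scan perpendicular to the surface
intensity = sum(np.sin(2 * np.pi * z / w) ** 2 for w in wavelengths)

# sin^2 oscillates at spatial frequency 2/lambda, so an FFT of I(z)
# shows one peak per spectral component of the beam.
spectrum = np.abs(np.fft.rfft(intensity - intensity.mean()))
freqs = np.fft.rfftfreq(z.size, d=dz)      # cycles per nm
top2 = freqs[np.argsort(spectrum)[-2:]]    # two strongest peaks
recovered = sorted(2.0 / top2)             # convert back to wavelengths
print(recovered)
```

Each sin² component oscillates with period λ/2, so its peak appears at spatial frequency 2/λ; the scan length sets the spectral resolution, exactly as in Fourier-transform spectroscopy.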
5. LATERAL STANDING WAVES
A scattered wave is created when a single scatterer (such as a metal particle) is placed on a perfectly flat surface. The scattered wave interferes with the incident wave and with the wave reflected from the surface, creating concentric interference or diffraction fringes around the particle. The fringes are not
Optical Interference near Surfaces and its Application in Subwavelength Microscopy
FIGURE 6 Recorded LSW in a plane parallel to the surface of agglomerated nano-sized gold particles on a polished silicon wafer. The long axis of the fringes is oriented parallel to the plane of incidence.
necessarily circular but often elliptical due to the inclination of the incident and reflected beams. The eccentricity of the LSW depends on the angle of incidence/reflection and, as we will see, on the index of refraction of the substrate. The long axis is always parallel to the plane of incidence (Figure 6). The number of visible fringes and the fringe spacing are reduced when decreasing the distance between the image plane and the surface. Furthermore, the entire fringe pattern shifts in a direction parallel to the plane of incidence and toward the incident beam when reducing the distance to the surface. Figure 7 shows the changes of the LSWs when the distance to the substrate is increased or decreased. The optical standing wave field shown in the figure has been recorded for two different distances from the surface. The optical probe and surface have been illuminated by a laser beam at a given angle (wavelength, 669 nm; 10 mW; s-polarized; angle of incidence, 45°). The local optical field is detected through a metal-coated optical fiber probe (aperture size 100 nm; Nanonics Imaging Ltd., Jerusalem, Israel). The gold particles, 2 to 3 nm in size, have been deposited on a polished silicon wafer from a suspension droplet. Concentric diffraction fringes around the agglomerated gold particles are observed in addition to the diagonal fringes. The diagonal fringes are formed from the SSWs due to the tilt of the image plane with respect to the surface. This tilt can be corrected by tilting the image plane using the piezoelectric scanner (AutoProbe CP-R, Veeco Instruments Inc., Woodbury, NY) or by mechanical adjustment. The exact tilt angle and tilt orientation can be determined from the fringe spacing and fringe orientation. The fact that the diagonal fringes are not perfectly straight lines indicates that the image plane is slightly deformed due to the nonplanar movement of the piezo tube scanner. We call the concentric fringes or
FIGURE 7 Standing wave patterns of agglomerated gold islands at two different distances (top, 100 nm; bottom, 10 nm) from the surface.
standing waves around the particle lateral standing waves. In general, the LSWs depend on the size of the particles, the distance from the surface, and the angle of incidence. The bottom image in Figure 7 shows the same region at a smaller distance from the surface. The first concentric fringe is smaller and displaced toward the incident beam, and the same diagonal fringes are seen as in the top image, but with considerably smaller amplitude compared with the LSWs around the particles. Since the maximum amplitude of the standing wave oriented parallel to the surface is constant, the lower contrast of the parallel fringes in the lower image shows that the concentric fringes around the island increase in intensity when the distance is reduced. The SSWs can serve as a reference to calibrate the intensity of the LSW. Clearly the scattered field amplitude increases when approaching the scattering particle. We conclude that at a smaller distance to the surface, the LSWs are spaced closer and are considerably more intense compared with the SSWs.
When approaching a structured substrate with an optical probe in collection mode, in the overlapping zone of the incident and reflected beams, we clearly observe SSWs oriented parallel to the surface and LSWs around a lateral structure in the substrate. SSWs can be used to orient the image plane parallel to the surface. LSWs are displaced, their fringe spacing is reduced, and they become more intense when reducing the distance to the surface. We can use an analytic dipole model to understand the displacement of the LSWs when changing the distance to the surface. We calculate the interference of the incident, reflected, and scattered waves and the time-averaged intensity of the resulting field in a plane perpendicular to the surface. For simplicity, we have chosen a polarization perpendicular to the plane of incidence. For the scattered dipole field we consider only the term that is proportional to 1/r, since the other terms are negligible at the considered intermediate distance. Figure 8 shows the calculated field intensity (Caumont and Bacsa, 2006). The angle of incidence is 45° and the beam illuminates the surface from the left side. The horizontal fringes show that the SSWs are modulated by the LSWs. The fringes from the LSWs on the left side of the incident beam are closely spaced, explained by the larger difference between the wave vector of the incident wave and the wave vector of the scattered wave on the left side. The figure also shows how the fringe spacing of the lateral waves increases when the distance from the surface is increased. The fringes of the LSW spread out with increasing z. The two fringes around the origin (z = 0, x = 0) at the center shift to the right with increasing z. Figure 8 also shows that the line joining the centers of the lateral fringes on the right side makes an angle equal to the angle of the reflected beam. Stated differently, the diffraction fringes are centered around the reflected beam.
The LSWs form parabolas in the plane of incidence where the common axis of all the parabolas coincides with the direction of the reflected beam.
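A minimal scalar sketch of this model — incident and reflected plane waves plus the 1/r term of a point scatterer at the origin, with an arbitrary illustrative scattered amplitude — reproduces the horizontal SSW fringes modulated by the LSW:

```python
import numpy as np

wavelength = 1.0
k = 2 * np.pi / wavelength
theta = np.radians(45.0)                       # angle of incidence

x, z = np.meshgrid(np.linspace(-3.0, 3.0, 400),
                   np.linspace(0.02, 3.0, 200))

kx, kz = k * np.sin(theta), k * np.cos(theta)
incident = np.exp(1j * (kx * x - kz * z))
reflected = np.exp(1j * (kx * x + kz * z))     # ideal mirror, unit reflectivity

r = np.hypot(x, z)
scattered = 0.2 * np.exp(1j * k * r) / r       # keep only the 1/r term

# Time-averaged intensity: horizontal SSW fringes modulated by the LSW
intensity = np.abs(incident + reflected + scattered) ** 2
```

Plotting `intensity` (log scale) against x and z reproduces the qualitative features of Figure 8: parallel SSW fringes with period λ/(2 cos θ) along z, modulated by fringes that spread out and shift toward the reflected beam with increasing z.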
FIGURE 8 Calculated field intensity in the plane of incidence of a single point scatterer. Parallel fringes are due to surface standing waves. The modulation of the parallel fringes is due to the LSW from a single point scatterer located at (0,0).
FIGURE 9 (a) Recorded image of silicon particles on a polished silicon wafer (image size 15.6 µm). (b) Model calculation using the following parameters: angle of incidence, 53°; direction of incidence, 225°; tilt angle, 22.8°; tilt direction, 245°; polarization of the incident beam, 245°; distance from surface, 3.8 wavelengths; and polarizability corresponding to a silicon particle size of 114 nm.
The following text compares the images obtained using the dipole model with the experimentally observed fringe pattern. Figure 9 shows the recorded optical image of two submicron-sized silicon particles on a silicon wafer and the calculated image, taking into account the angle and direction of incidence, the interference with the incident and reflected waves, and the tilt angle of the image plane (Levine et al., 2002). The diagonal fringes are due to the SSWs. Concentric, elliptically shaped fringes from the LSW are observed around the particle location. The calculated image reproduces the observed SSW well: the same pattern is observed, but the proportions are slightly different. The elliptical fringes are slightly longer, and the fringe spacing and contrast differ in the experimental image. To understand the eccentricity of the elliptical diffraction fringes we have studied the influence of the k vector of the scattered wave. The k vector depends on the local index of refraction or polarizability and as a result influences the eccentricity (Caumont and Bacsa, 2006). Figure 10 compares the calculated fringe pattern as a function of the size of the scattering wave vector (SiO2: n = 1.43; Si: n = 4.38). Comparing the fringes in the three images of Figure 10, we notice that the elliptical fringes are less elongated along the direction of incidence when the index of refraction of the substrate is larger, which is seen by comparing the first fringes in Figures 10a, 10b, and 10c. Clearly the increased wave vector of the scattered wave is related to the shape of the fringes. Aside from the more circular fringes, the fringe spacing is reduced with increasing index of refraction. The sensitivity of the shape of the diffraction or LSW fringes to the refractive index of the substrate offers the possibility of locally determining the relative refractive index by simply observing the fringe shape and spacing of a particular point scatterer.
(Notice that the fringe center in the three images in Figure 10 is shifted to the left and is not located at the same
FIGURE 10 Diffraction fringes of a single point scatterer; image size 5 × 10 wavelengths; distance between image and surface, one wavelength. (a) No substrate taken into account; (b) on SiO2; and (c) on silicon.
place in Figures 10a, 10b, and 10c.) The actual position of the point scatterer is in the center of the image in Figure 10, where the fringes have maximum intensity. The ellipsoidal shape of the fringes becomes more circular with increasing index of refraction. However, Figure 9 shows the opposite trend: the fringes in the experiment are more elongated, indicating a lower index of refraction. We note, however, that the penetration of light has not been included in the dipole model. The penetration of the beam into the substrate causes a phase shift of the reflected beam, resulting in a shift of the fringe pattern associated with the reflected beam and changing the overall fringe shape. We conclude that the dipole model reproduces the observed diffraction fringes well but is limited in making a quantitative comparison. Figure 11 shows the change of the calculated fringe pattern for three different distances to the surface. The LSW shifts consistently to the side of the reflected beam with increasing distance from the substrate (Caumont and Bacsa, 2006). At the same time the fringe spacing increases due to the larger distance to the point scatterer. The dispersion of the optical path length across the image is reduced when increasing the distance between image and surface. It can be shown that this shift is linear with increasing distance. We can estimate the location of the center of the fringes for any distance to the surface by recording two images at different heights and
FIGURE 11 (a) Extrapolation of the location of the point scatterer; lateral standing wave of a single point scatterer at (b) 1 λ, (c) 2 λ, and (d) 3 λ from the surface (image size 5 × 10 λ, TE polarization). The angle of incidence is 45°.
knowing the distance between the two image planes (Figure 11a). By connecting the centers of the three diffraction images (Figure 11b, c, and d) recorded at regularly spaced distances, we can extrapolate the location of the point scatterer at the center of the rectangle (Figure 11a). This shows that the lateral shift of the center of the diffraction pattern when changing the distance to the surface by a known amount can be used to determine the distance between the image plane and the surface. Up to this point, we have discussed a single point scatterer and the LSW formed around the scatterer. In the following text, we discuss standing wave formation from a micrograting and show how a standing wave of higher complexity can be described by the linear superposition of dipole waves (Bacsa et al., 2006). The local field distribution has been recorded in a plane parallel to the surface at a constant height above GaAs microgratings (10 µm) created by electron-beam lithography (grating groove depth, 1 µm; groove width, 0.7–1 µm). The optical probe and surface were illuminated by a laser
beam with an angle of incidence of 50° (wavelength, 669 nm; 10 mW; s-polarization). We have used bent metal-coated optical probes (aperture size, 100 nm; Nanonics Imaging Ltd.) and a modified scanning probe instrument (atomic force microscope model CP-R; Veeco Instruments, Inc.). We observed that the detected signals were larger by a factor of at least two for transverse electric (TE) polarization compared with transverse magnetic (TM) polarization. We attribute this to the orientation of the induced dipole at the probe edge, which is favorable for TE polarization to scatter light at small angles to the axis of the optical fiber. The images were recorded without shear-force detection. The incident beam was inclined toward the bottom of the images. Figure 12 shows the recorded image of the micrograting at a large distance from the sample
FIGURE 12 (a) Recorded optical image of the micrograting (periodicity, 1.43 µm) oriented perpendicular to the plane of incidence, at a large distance (30 µm); image size 60 µm. The incident beam direction is from the lower side at an angle of incidence of 50°, and the fringe spacing of the standing wave of the incident and reflected beams is 6010 nm. The circle indicates the location of the micrograting. (b) The same scan range and experimental conditions as in (a) after changing the tilt of the image plane.
(>30 µm). The grating location is indicated by the dashed circle. All experimental images are reproducible and do not depend on a specific optical probe. In general, the image plane is not parallel to the surface and cuts through the SSW created by the incident and reflected planar waves. Figure 12a shows horizontal fringes from the standing waves. The fringe spacing can be used to deduce the tilt angle and to correct the substrate orientation. Figure 12b shows the same region with the micrograting after tilt correction, which removes the fringes created by the SSW. Apart from the micrograting in the center, other diffraction fringes are seen on the side. Fine parallel fringes are also seen on the lower side of the location of the micrograting. Figure 13a shows a recorded optical image (size 20 µm) at a smaller distance (5 µm) to the micrograting. A darker region 1 and a brighter region 2 can be distinguished. The grating fringes are seen in both regions. Distances from the surface larger than those typically used in near-field optics lead to a larger phase difference between the incident and scattered fields. The interference of the two fields leads to the formation of a diffraction image, which is displaced in the direction of the illuminating beam. This is similar to what is observed for a single point scatterer. Because the micrograting is larger than the illumination wavelength, it modifies the reflected wave locally. The two regions seen in Figure 13a are then explained by the superposition of the image formed by the reflected wave, modified by the presence of the grating, and the displaced diffraction image of the grating. Figure 13b is an enlargement of the lower-right corner of region 2 in Figure 13a. The grating structure with horizontal fringes (image size, 6.25 µm) is clearly seen.
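The tilt deduction mentioned above follows from simple geometry: the SSW planes lie parallel to the surface with spacing λ/(2 cos θ), and an image plane tilted by a small angle α cuts them in fringes spaced λ/(2 cos θ sin α). A sketch with an illustrative measured fringe spacing:

```python
import numpy as np

wavelength = 0.669                   # micrometers (669 nm illumination)
theta = np.radians(50.0)             # angle of incidence

period = wavelength / (2 * np.cos(theta))   # SSW period along the surface normal

# Hypothetical fringe spacing measured in the tilted image plane (micrometers)
d_measured = 6.0
alpha = np.degrees(np.arcsin(period / d_measured))   # tilt of the image plane
print(f"SSW period {period:.3f} um, tilt angle {alpha:.2f} deg")
```

A micrometer-scale fringe spacing thus corresponds to a tilt of only a few degrees, which is why the SSW fringes are such a sensitive indicator of image-plane orientation.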
The sharpness of the edges of the grating is different in the horizontal and vertical directions, and the contrast of the vertical edge groove in the horizontal direction is higher. High lateral resolution has been observed earlier on metal island films (Bacsa and Lannin, 1992). Figure 13c shows the same micrograting rotated (with the grating fringes in vertical orientation), recorded at a different image height and keeping the incident beam fixed. The image confirms the high edge resolution perpendicular to the incident beam direction. Displaced diffraction fringes are again superimposed with an image that reproduces the grating structure. Interestingly, we see the horizontal fringes of the grating grooves prolonged into the vertical edge groove. We believe that the finite penetration of the light into the substrate for GaAs at 669 nm, which is 500 nm, causes the contrast to spread by 200 nm in the direction of the reflected beam, and this explains why the contrast is larger in the direction perpendicular to the direction of the incident beam. To better understand the recorded fringe contrast we have used the simplified dipole model to calculate the interference pattern of the scattered field from the micrograting with the incident field. The model takes into account the time-averaged interference of the incident and scattered
FIGURE 13 Recorded optical image of a micrograting at a smaller distance (5 µm) than in Figure 12. (a) Image size 20 µm; region 1 is due to diffraction from the micrograting, and region 2 shows the grating fringes. (b) Enlargement of the lower-right corner (image size 6250 nm) of image (a). The vertical edges are narrower than the horizontal edges. The inset shows a cross section in the horizontal direction. (c) A rotated micrograting under the same experimental conditions; image size is 10 µm. The vertical grating edges in the circle are as narrow as in (a) and (b). The arrow indicates the first diffraction fringe from a single dust particle. See text for details.
propagating electric transverse field components of a single dipole and a plane wave. The longitudinal field component has not been included since the image distance is sufficiently large (>λc). Furthermore, the scattered field amplitude is several orders of magnitude smaller than the incident field. We therefore neglect the coupling between different discrete dipoles. Higher diffraction orders can be excluded at the image height considered here. The image of the grating is then modeled by the linear superposition of 1180 discrete dipoles. (To simplify, we have not included the effect of polarization.) Figure 14 shows the calculated image contrast at two different image heights. First, we observe that the diffraction image is displaced in the direction of the reflected beam, as observed in the experimental image. The two different heights show that the shift of the diffracted image depends on the distance from the surface. Second, the model calculation reproduces the diffraction fringes around and below
FIGURE 14 Calculated image contrast of the micrograting using 1180 point dipoles. (a) Image height is 3 µm; the square indicates the location of the grating, and regions 1 and 2 are the same as in Figure 13. (b) The distance between the image plane and surface is 9 µm; the diffraction fringes are displaced in the direction of the illuminating beam. The displacement depends on the image height. The locations of the dipoles are marked by points.
the grating, but we note that the contrast is not entirely reproduced. Region 2 is less clearly visible in the simulated image. We attribute the differences between the experimental observations and the numerical simulations to the fact that we use only a two-dimensional distribution of dipoles and neglect the three-dimensional structure of the grating. The finite penetration of the illuminating beam into the substrate is expected to displace the diffraction image in the direction of the reflected beam. The calculated image makes it possible to estimate the upper limit for the distance between the surface and the image plane. We deduce an upper limit for the image height of 2 to 3 µm from Figure 14a. The high lateral resolution (80 nm) suggests that the distance between the surface and image plane is smaller, which is consistent with the fact that the diffraction image is shifted in the lateral direction owing to the finite penetration of the light into the substrate. What is remarkable here is that we observe subwavelength lateral resolution with an optical probe at a distance of several wavelengths from the surface. But we find that the size of the image that reproduces the grating (region 2, Figure 13b) is limited by the overlap with the diffracted image, which depends on the lateral shift caused by the finite penetration of the light into the substrate. The separation of the grating image and its diffraction image affords the opportunity to image at high lateral resolution at a larger distance from the surface, with no feedback signal to control the probe in the proximity of the surface. Although the overlap of the two images is smaller with increasing distance from the surface, the reduced lateral resolution at larger distances limits the size of objects that can be observed with high lateral resolution on a reflecting surface.
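The superposition model described above can be sketched as follows. The dipole grid, amplitudes, and image size are illustrative and much smaller than the 1180-dipole layout of the original calculation; fields are scalar and the reflected wave is omitted:

```python
import numpy as np

wavelength = 1.0
k = 2 * np.pi / wavelength
theta = np.radians(50.0)                 # angle of incidence
h = 3.0                                  # image-plane height in wavelengths

# Dipoles laid out along grating grooves in the surface plane (illustrative)
xs = np.arange(-5.0, 5.0, 1.43)          # groove period in wavelengths
ys = np.linspace(-2.0, 2.0, 20)

X, Y = np.meshgrid(np.linspace(-8.0, 8.0, 160),
                   np.linspace(-4.0, 4.0, 80))

field = np.exp(1j * k * np.sin(theta) * X)          # incident wave in the plane
for x0 in xs:
    for y0 in ys:
        r = np.sqrt((X - x0) ** 2 + (Y - y0) ** 2 + h ** 2)
        # each dipole is driven with the incident phase at its own position
        field += 0.01 * np.exp(1j * (k * r + k * np.sin(theta) * x0)) / r

intensity = np.abs(field) ** 2
```

Plotting `intensity` shows the grating image together with diffraction fringes displaced in the direction of the illuminating beam; increasing `h` increases the displacement, as in Figure 14.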
To summarize, we have recorded constant-height images of semiconductor microgratings created by electron-beam lithography using reflection-collection optical scanning-probe microscopy. Interference fringes due to the tilt of the image plane were corrected by changing the sample orientation. An image of the grating and the superimposed diffracted image are separated in the image plane. The highest observed edge resolution is comparable to the probe aperture size. The finite penetration depth of the light leads to a reduced edge resolution in the direction of the illuminating beam. Using a simplified dipole model, the diffraction image can be calculated, and the model explains the displacement of the diffraction image. The larger displacement of the diffraction image observed in the experiment is attributed to the finite penetration of the illuminating beam into the substrate, which is not included in our two-dimensional model. The analytic dipole model is able to account for the image displacement and fringe pattern around the micrograting. We can conclude that LSWs are well described by the dipole model. The lateral shift with increasing distance is consistent with experimental
observations. The model provides the opportunity to explore the influence of the substrate on the fringe spacing. The dipole model can be used for single subwavelength-scale scatterers or extended surface structures such as microgratings. Differences between experimental and calculated images are attributed to the finite penetration of the light into the substrate, which is neglected in the dipole model.
6. RECONSTRUCTION OF INTERMEDIATE-FIELD IMAGES
The successful description of the interference fringes near structured surfaces using the dipole model allows testing of the numerical reconstruction (Bacsa and Neumayer, 2007). This consists of a first step to generate the interference image using the dipole model and a second step to reconstruct the image numerically. The interference image is reconstructed through deconvolution using the image of the LSW of a single point scatterer. The scattering function takes into account the angle and direction of the incident beam. To demonstrate the reconstruction we consider three point scatterers located 0.75 wavelengths from each other. The image plane is one wavelength from the plane that contains the three point scatterers. Figure 15 shows the calculated standing wave field (left side) and the corresponding reconstructed or deconvoluted image (right side). The angle of incidence is 45° and the beam directions in the first and second lines are rotated by 90°. The locations of the three point scatterers are visible in the numerically reconstructed (right) image. The influence of the incident beam on the reconstructed image can be seen by changing the beam direction. When two scatterers are lined up with the direction of incidence, the two scatterers are resolved, whereas they are not resolved when the beam is incident in the perpendicular direction. This shows that the resolution of the reconstructed image is higher along the direction of the incident beam. The reconstruction removes the interference fringes, and we find that the quality of the reconstruction depends on the size of the image of the LSW of a single point scatterer used for the numerical reconstruction. The images for the incident beams at right angles can be superimposed. This has the effect that the lateral resolution is uniform in the image plane.
The last line of images in Figure 15 shows the superposition of the two images shown in the first two lines. The three point scatterers are well resolved in the superimposed image at the bottom-right side of Figure 15. At this stage we have not included the polarization of the incident beam and the index of refraction of the substrate. The polarization of the incident beam changes the relative intensity in a given fringe but does not change the fringe size itself. The numerical reconstruction images show
FIGURE 15 Calculated standing wave field (left side: a, c, e) and reconstructed images (right side: b, d, f) of three point scatterers. The distance between the point scatterers is 0.75 wavelengths, the image size is 5 wavelengths, and the angle of incidence is 45°. See text for details.
that subwavelength resolution can be obtained by recording standing wave fields at a distance of one wavelength off the surface, outside the near-field region at an intermediate distance. The presence of the substrate helps to increase the lateral resolution because the fringe spacing is reduced with increasing index of refraction. Observation at larger distances from the surface has the advantage that the near field, with both transverse and longitudinal field components, does not contribute to the image contrast, which simplifies the reconstruction process. At intermediate distance it is sufficient to take into account only the transverse component. Lateral resolution can be improved by increasing the angle of incidence: the larger component of the k vector parallel to the surface results in a higher lateral resolution. However, the penetration of the light into the substrate leads to a reduction of the contrast in the direction of the incident beam.
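The deconvolution step can be sketched as follows. For this synthetic demonstration the three-scatterer image is built as a circular superposition of shifted copies of the single-scatterer interferogram (a real recorded image differs at the edges), and a Wiener-regularized division in Fourier space undoes the convolution; all parameters are illustrative:

```python
import numpy as np

wavelength = 1.0
k = 2 * np.pi / wavelength
theta = np.radians(45.0)
h = 1.0                                  # image plane one wavelength up

n = 256
x = np.linspace(-2.5, 2.5, n)
X, Y = np.meshgrid(x, x)

# Single-scatterer interferogram: plane wave + 1/r wave, background removed
r = np.sqrt(X**2 + Y**2 + h**2)
plane = np.exp(1j * k * np.sin(theta) * X)
psf = np.abs(plane + 0.1 * np.exp(1j * k * r) / r) ** 2 - 1.0

# Synthetic three-scatterer image: circularly shifted copies of the psf
dx = x[1] - x[0]
shifts = [int(round(s / dx)) for s in (-0.75, 0.0, 0.75)]
img = sum(np.roll(psf, s, axis=1) for s in shifts)

# Wiener-regularized deconvolution
PSF, IMG = np.fft.fft2(psf), np.fft.fft2(img)
eps = 1e-3 * np.abs(PSF).max() ** 2
recon = np.fft.fftshift(
    np.real(np.fft.ifft2(IMG * np.conj(PSF) / (np.abs(PSF) ** 2 + eps))))

# Bright spots near x = -0.75, 0, 0.75 on the central row should mark the
# recovered scatterer positions (three just-resolved peaks for this geometry)
```

The regularization constant `eps` trades sharpness for noise suppression; in the noiseless synthetic case a small value suffices, while experimental data would require stronger regularization.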
The distance between the image plane and the surface must be known, however, to reconstruct the interference image; this distance is in general not well known in an experiment. In the following text, we investigate the effect of overestimating or underestimating the distance used in the reconstruction. In this way, we can at the same time estimate the depth resolution capabilities. Figure 16 shows the reconstructed image when combining the reconstructed images with the incident beam rotated by 90° and overestimating the distance between the image plane and the surface by 10% to 50%. As the distance is increased, the position of the bright spots shifts. The bright spots no longer line up with the lines indicating the positions of the point scatterers. A darker ghost image appears in the neighborhood of the bright spots. We can also observe that the background changes from dark to bright as the distance is increased from its correct value, decreasing the image contrast.
FIGURE 16 Reconstructed image of three point scatterers spaced at 0.75 wavelengths when increasing the distance used in the numerical reconstruction by (a) 0%, (b) 10%, (c) 12%, (d) 15%, (e) 20%, and (f) 50%.
FIGURE 17 Reconstructed image of three point scatterers spaced at 0.75 wavelengths when decreasing the distance used in the numerical reconstruction by (a) 0%, (b) 10%, (c) 12%, (d) 15%, (e) 20%, and (f) 50%.
Figure 17 shows the reconstructed images when underestimating the distance between the image plane and the point scatterers by 10% to 50%. Again, little change is observed when decreasing the distance by 10%, but shifts in the spot positions are observed for larger deviations from the correct distance. Apart from the appearance of dark spots or a ghost image, the background gets brighter with decreasing distance. An error in the distance used in the reconstruction shifts the position of the image point in opposite lateral directions depending on whether the distance to the surface is increased or decreased: in Figure 16 the bright spots move down, and in Figure 17 the bright spots move up. By changing the distance used in the numerical reconstruction we can estimate the depth resolution, or the resolution along the surface normal, to 1/10 of the wavelength (see Figures 16 and 17). The depth
resolution can be improved by increasing the angle of incidence or by increasing the index of refraction of the substrate of the three point scatterers. It is also clear that the lateral and depth resolution depend on the scattering efficiency of the particle, which in turn is given by the electronic polarizability and the size of the particle. Instead of taking only three point scatterers, a larger number of point scatterers can be used to test the deconvolution of the interference image. We define several letters in a matrix (5 × 5). Figure 18 shows the result when testing the depth resolution on multiple point scatterers forming the four letters ISOM (interference scanning optical-probe microscopy). Each letter, 1.5 × 1.5 wavelengths in size, is defined by point scatterers 0.3 wavelengths apart. Figure 18 shows three superimposed interferograms for three different directions of incidence (0, π/2, π). Each interferogram is color coded (red, green, blue). The image reconstructed from the three interferograms separately and then superimposed is shown in Figure 18b. The second row of images shows the result of the reconstruction when the distance is increased or decreased by 10%. The letters are still readable but the background is brighter (or darker)
FIGURE 18 Reconstruction of four letters when changing the distance between the image plane and the plane of point scatterers (one wavelength). (a) Three superimposed interferograms (red, green, and blue). (b) Reconstructed image with the correct distance. (c) Reconstructed image when increasing or (d) decreasing the distance by 10%. (e) Reconstructed image when increasing or (f) decreasing the distance by 20%. The image size is 6 × 12 wavelengths, the separation of point scatterers is 0.3 wavelengths, and the letter size is 1.5 × 1.5 wavelengths.
depending on whether the distance is increased (or decreased). The last row shows the result when changing the distance by 20%. Interestingly, the letters cannot be recognized when the distance is increased by 20% but can still be recognized when the distance is reduced by 20%. This shows that the depth resolution is asymmetric and depends on whether the distance between the plane of the standing wave image and the plane of the point scatterers is overestimated or underestimated. So far we have included in the numerical calculation only the interference of the incident beam with the scattered wave from multiple point sources. The interference of the reflected beam with the scattered wave has been neglected here due to the similarity of the interference fringes created. This has the effect that when the calculated standing wave field is compared with experimental images there are several subtle differences that need to be resolved. (We point out that depth resolution is considered here in the context of determining the distance between the image plane and the plane of the point scatterers, not of resolving two point scatterers at different heights.) The reconstruction demonstrated here of a set of characters from a numerically generated interferogram can be used in cryptography. Information can be defined by a point pattern. From the point pattern, an interference image can be numerically generated (coding). This coded image can be transmitted to the receiver. The receiver then needs to reconstruct the coded image (decoding) through deconvolution, using the set of parameters used when generating the coded image, such as the distance between the image plane and the plane of the point scatterers, the angle and direction of incidence, or the image size and wavelength. The simplified dipole model allows an explanation of some of the basic characteristics of the interference patterns formed in the intermediate field.
We have investigated the depth resolution capabilities of the numerical reconstruction of the optical standing wave field near surfaces and find that the point scatterers can be located to within 1/10 of a wavelength in the direction parallel to the surface normal. Deviations in the numerically reconstructed image appear as lateral shifts, an inverted image, and an increased background signal. By using multiple point scatterers to define several letters, we find that the depth resolution is asymmetric: the resolution is lower when the distance between the image plane and the plane of the point scatterers is increased. The numerical reconstruction of the standing wave field cannot yet be applied to experimental images because of the finite penetration of the optical wave into the substrate and the neglect of the reflected beam's contribution to the interference pattern. We have shown here how the numerical generation of complex interference patterns and their numerical reconstruction can be applied in cryptography.
7. TALBOT EFFECT AND PHASE SINGULARITIES Scanning an optical probe in collection mode allows exploration of the complex interference pattern at variable distances from the surface. When observing periodic patterns on a surface in the form of microgratings, we came across additional effects. Figure 19 shows a 60-µm scan at a large distance (100 µm) from the surface, in the intermediate-field region, of a micrograting in the shape of a zero (Levine et al., 2004). The microgratings were fabricated by electron-beam lithography on a GaAs substrate. The micrograting shown has an elliptical shape (size 80 × 50 µm) with a grating etched into the substrate. Concentric diffraction fringes around the grating pattern, as well as displaced fringes with the shape of the micrograting, are observed in the direction of the incident beam (angle of incidence 45°), where the periodic structure of the grating is reproduced. This self-imaging of optical gratings at macroscopic distances was first reported by Talbot (1836) in the nineteenth century. In general, a self-focused image can be observed at periodic distances from the grating: the waves scattered from the grating interfere at periodic distances from the surface. The Talbot effect was later explained by Rayleigh (1881). Scattered waves from periodic objects have a fixed phase relationship, with the consequence that the image is reproduced at characteristic distances from the surface. Any double-periodic system (wave, grating) gives rise to a beat frequency; here the beat occurs in space. This characteristic distance (dTalbot) depends on the periodicity of the object and the wavelength of the incident beam. We observe the Talbot image of an optical grating in the vicinity of the surface and in the intermediate-field region. Its appearance indicates that the surface is at a distance that is a multiple of the Talbot distance, dTalbot. At a known distance or wavelength, the appearance of the Talbot image
10 µm
FIGURE 19
Talbot effect of a micrograting recorded in the intermediate-field region.
provides information about the wavelength or the distance to the substrate. While the diffracted wave around the object is displaced in the direction of the reflected wave for a non-normally incident beam, the Talbot image is displaced in the direction opposite to that of the incident beam. In Figures 12 and 14, parallel lines can be seen on the lower side of the image, in a direction opposite to the direction of the incident beam; this is the Talbot image, or self-image, of the grating. Talbot self-imaging has been used at macroscopic distances in wavefront sensing and in transform spectrometer designs (Kung et al., 2001). We have come across another interesting effect when illuminating the islands of gold particles not with s-polarized light but with light polarized at 45°: we observe spiral-shaped standing waves, irrespective of the shape of the island, at sufficiently large distances from the surface (Figure 20). Phase singularities such as those observed in the spiral-shaped standing waves appear in a number of wave phenomena (Freund et al., 1993; Nye and Berry, 1974). It has been suggested that phase singularities could be used to trap particles (Gahagan and Swartzlander, 1996). Phase singularities have also been observed in the near field on top of optical waveguide structures (Balistreri et al., 2000). Phase singularities in their simplest form can be created through the superposition of three plane waves. The 45° polarization of the incident beam gives an additional field component in the direction parallel to the plane of incidence, which can explain the observed spiral formation. So far we have assumed that the optical probe is not sensitive to polarization. But the transmission of light through the aperture of the optical fiber probe is expected to be sensitive to the polarization of the local optical field. The induced dipole at the probe edge, oriented perpendicular to the plane of incidence, makes the largest contribution. The induced dipole is oriented parallel to the
10 µm
FIGURE 20 Spiral-shaped standing waves of an island of gold particles (distance 100 µm); the incident beam is linearly polarized at 45°.
probe edge for s-polarized light and has a larger emission amplitude in the plane of incidence, which also contains the axis of the optical fiber probe. This is consistent with what we observe experimentally for the optical fiber used in the experiment: the signal for p-polarized light is smaller than for s-polarized light. The recorded signal is hence dominated by contributions of the local field with s-polarization.
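The statement above that phase singularities arise, in their simplest form, from the superposition of three plane waves can be checked numerically. The following sketch (illustrative unit wavevectors at 120°, not the experimental beam geometry) locates an amplitude zero of the three-wave superposition and verifies that the phase winds by 2π around it:

```python
import numpy as np

# Three interfering plane waves with unit wavevectors 120 degrees apart
k = 2.0 * np.pi
dirs = [np.array([np.cos(a), np.sin(a)])
        for a in (0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0)]
psi = lambda x, y: sum(np.exp(1j * k * (d[0] * x + d[1] * y)) for d in dirs)

# Locate an amplitude zero (a vortex core) on a grid
xs = np.linspace(0.05, 0.95, 400)
X, Y = np.meshgrid(xs, xs)
A = np.abs(psi(X, Y))
iy, ix = np.unravel_index(np.argmin(A), A.shape)
x0, y0 = X[iy, ix], Y[iy, ix]

# Topological charge: accumulated phase on a small loop around the zero
ang = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
ph = np.angle(psi(x0 + 0.02 * np.cos(ang), y0 + 0.02 * np.sin(ang)))
dph = np.angle(np.exp(1j * np.diff(ph, append=ph[0])))  # wrapped differences
winding = np.round(np.sum(dph) / (2.0 * np.pi))
print(int(abs(winding)))
```

A winding number (topological charge) of ±1 is the defining signature of a phase singularity such as those underlying the spiral-shaped standing waves of Figure 20.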
8. CONCLUSION AND PERSPECTIVES Optical standing waves have been observed in the past in light-sensitive emulsions in holography and in ultrathin films. They have been used to enhance photoluminescence and Raman signals or to increase optical contrast in optical microscopy. Scanning optical probe microscopy allows observation of optical standing waves near structured surfaces in much greater detail. Comparison with a simplified dipole model shows the advantage of the intermediate-field range and provides insight into the formation of lateral standing waves (LSWs) and surface standing waves (SSWs) and their dependence on angle and direction of incidence, index of refraction, and polarization. Surface standing waves can be used to adjust the image plane of the scanning probe parallel to the surface and to record images without any distance-regulating feedback signal. The fringe spacing of LSWs is sensitive to the index of refraction of the substrate, which allows enhanced lateral resolution. The amplitude of the SSW can be used as a relative measure of the amplitude of the LSW. This opens the perspective that imaging of standing waves at intermediate distance from structured surfaces can resolve surface features at a fraction of the optical wavelength, with lateral resolution comparable to that found in near-field optics. The dipole model used so far to explain the observed standing wave field is, however, not yet accurate enough to make holographic imaging at intermediate distance from the surface possible. It is believed that further work in this field, including adjusting the phase shift of the reflected beam, has considerable potential to make this a reality.
ACKNOWLEDGEMENT The author would like to thank Michel Caumont and Frédéric Neumayer for technical support.
REFERENCES
Ager, J. W., III, Veirs, D. K., & Rosenblatt, G. M. (1990). Raman intensities and interference effects for thin films adsorbed on metals. Journal of Chemical Physics, 92, 2067.
Bacsa, W. S. (1997). Device for optical scanning of objects on a scanning surface and process for operating it. U.S. Patent No. 5841129.
Bacsa, W. S. (1999). Interference scanning optical probe microscopy: principles and applications. Advances in Imaging and Electron Physics, 10, 1–19.
Bacsa, W. S., & Kulik, A. (1997). Interference scanning optical probe microscopy. Applied Physics Letters, 70, 3507–3510.
Bacsa, W. S., & Lannin, J. S. (1992). Bilayer interference enhanced Raman spectroscopy. Applied Physics Letters, 61, 19–21.
Bacsa, W. S., & Neumayer, F. (2007). Depth resolution capabilities using optical standing waves near surfaces. Conference Technical Proceedings NSTI-Nanotech, 1, 149–151.
Bacsa, W. S., Levine, B., & Caumont, M. (2006). Local optical field variation in the neighborhood of a semiconductor micrograting. Journal of the Optical Society of America B, 23, 893–896.
Balistreri, M. L. M., Korterik, J. P., Kuipers, L., & Van Hulst, N. F. (2000). Local observations of phase singularities in optical fields in waveguide structures. Physical Review Letters, 85, 294–296.
Betzig, E., & Trautman, J. K. (1992). Near-field optics: microscopy, spectroscopy, and surface modification beyond the diffraction limit. Science, 257, 189–195.
Blake, P., Hill, E. W., Castro Neto, A. H., Novoselov, K. S., Jiang, D., Yang, R., et al. (2007). Making graphene visible. Applied Physics Letters, 91, 063124-1–063124-3.
Caulfield, H. J. (1970). Handbook of Optical Holography. New York: Academic Press.
Caumont, M., & Bacsa, W. S. (2006). Local diffuse light scattering and surface inspection. Conference Technical Proceedings NSTI-Nanotech, 3, 281–283.
Crommie, M. F., Lutz, C. P., & Eigler, D. M. (1993). Imaging standing waves in a two-dimensional electron gas. Nature, 363, 524–527.
Freund, I., Shvartsman, N., & Freilikher, V. (1993). Optical dislocation networks in highly random media. Optics Communications, 101, 247–264.
Gahagan, K. T., & Swartzlander, G. A. (1996). Optical vortex trapping of particles. Optics Letters, 21, 827–829.
Geim, A. K., & Novoselov, K. S. (2007). The rise of graphene. Nature Materials, 6, 183–191.
Holm, R. T., McKnight, S. W., Palik, E. D., & Lukosz, W. (1982). Interference effects in luminescence studies of thin films. Applied Optics, 21, 2512–2519.
Joannopoulos, J. D., Villeneuve, P. R., & Fan, S. (1997). Photonic crystals: putting a new twist on light. Nature, 386, 143–149.
Kramer, A., Hartmann, T., Eschrich, R., & Guckenberger, R. (1998). Scanning near-field fluorescence microscopy of thin organic films at the water/air interface. Ultramicroscopy, 71, 123–132.
Kung, H. L., Bhatnagar, A., & Miller, D. A. B. (2001). Transform spectrometer based on measuring the periodicity of Talbot self-images. Optics Letters, 26, 1645–1647.
Lambacher, A., & Fromherz, P. (2002). Luminescence of dye molecules on oxidized silicon and fluorescence interference contrast microscopy of bio-membranes. Journal of the Optical Society of America B, 19, 1435–1453.
Levine, B., Caumont, M., Amien, C., Chaudret, B., Dwir, B., & Bacsa, W. S. (2004). Local optical field in the neighborhood of structured surfaces: phase singularities and Talbot effect. Conference Technical Proceedings NSTI-Nanotech 2004, 3, 5–8.
Levine, B., Kulik, A., & Bacsa, W. S. (2002). Optical space and time coherence near surfaces. Physical Review B, 66, 233404-1–233404-4.
Liu, Y., & Zakhor, A. (1992). Binary and phase shifting mask design for optical lithography. IEEE Transactions on Semiconductor Manufacturing, 5, 138–152.
Nye, J. F., & Berry, M. V. (1974). Dislocations in wave trains. Proceedings of the Royal Society London A, 336, 165–190.
Ozbay, E. (2006). Plasmonics: merging photonics and electronics at nanoscale dimensions. Science, 311, 189–193.
Rayleigh, L. (1881). On copying diffraction-gratings and some phenomena connected therewith. Philosophical Magazine, 11, 196–205.
Rothschild, M. (2005). Projection optical lithography. Materials Today, 8, 18.
Sommerfeld, A. (1954). Reflection and refraction of light. In Optics (pp. 56–58). New York: Academic Press.
Stiebig, H., Knipp, D., Bhalotra, S. R., Kung, H. L., & Miller, D. A. B. (2005). Interferometric sensor for spectral imaging. Sensors and Actuators A-Physical, 120, 110–114.
Talbot, W. H. F. (1836). Facts relating to optical sciences No. IV. Philosophical Magazine, 9, 401–407.
Umeda, N., Hayashi, Y., Nagai, K., & Takayanagi, A. (1992). Scanning Wiener-fringe microscope with optical fiber tip. Applied Optics, 31, 4515–4518.
Zegenhagen, J. (1993). Surface structure determination with X-ray standing waves. Surface Science Reports, 18, 199–271.
Chapter
2 Introduction of a Quantum of Time (‘‘chronon’’), and its Consequences for the Electron in Quantum and Classical Physics‡ Ruy H. A. Farias* and Erasmo Recami†
Contents
1. Introduction 34
2. The Introduction of the Chronon in the Classical Theory of the Electron 38
2.1. The Abraham–Lorentz Theory of the Electron 39
2.2. Dirac's Theory of the Classical Electron 40
2.3. Caldirola's Theory for the Classical Electron 42
2.4. The Three Alternative Formulations of Caldirola's Theory 47
2.5. Hyperbolic Motions 48
3. The Hypothesis of the Chronon in Quantum Mechanics 50
3.1. The Mass of the Muon 53
3.2. The Mass Spectrum of Leptons 55
3.3. Feynman Path Integrals 57
3.4. The Schrödinger and Heisenberg Pictures 61
3.5. Time-Dependent Hamiltonians 62
4. Some Applications of the Discretized Quantum Equations 66
* LNLS - Laboratório Nacional de Luz Síncrotron, Campinas, S.P., Brazil
† Facoltà di Ingegneria, Università statale di Bergamo, Italy, and INFN–Sezione di Milano, Milan, Italy
‡ Work partially supported by CAPES, CNPq, FAPESP and by INFN, MIUR, CNR.
E-mail addresses:
[email protected];
[email protected]
Advances in Imaging and Electron Physics, Volume 163, ISSN 1076-5670, DOI: 10.1016/S1076-5670(10)63002-9. Copyright © 2010 Elsevier Inc. All rights reserved.
4.1. The Simple Harmonic Oscillator 66
4.2. Free Particle 69
4.3. The Discretized Klein–Gordon Equation (for massless particles) 73
4.4. Time Evolution of the Position and Momentum Operators: The Harmonic Oscillator 76
4.5. Hydrogen Atom 81
5. Density Operators and the Coarse-Graining Hypothesis 86
5.1. The ‘‘Coarse-Graining’’ Hypothesis 86
5.2. Discretized Liouville Equation and the Time–Energy Uncertainty Relation 88
5.3. Measurement Problem in Quantum Mechanics 90
6. Conclusions 95
Appendices 98
Acknowledgements 106
References 106
1. INTRODUCTION In this paper we discuss the consequences of introducing a quantum of time τ0 into the formalism of non-relativistic quantum mechanics, referring in particular to the theory of the chronon as proposed by P. Caldirola. This interesting ‘‘finite-difference’’ theory provides, at the classical level, a self-consistent solution for the motion in an external electromagnetic field of a charged particle like the electron, when its charge cannot be regarded as negligible; it overcomes all the known difficulties met by the Abraham–Lorentz and Dirac approaches (and even allows a clear answer to the question of whether a freely falling electron does or does not emit radiation), while at the quantum level it yields a remarkable mass spectrum for leptons. After briefly reviewing Caldirola's approach, our first aim is to work out, discuss, and compare with one another the new formulations of quantum mechanics (QM) resulting from it, in the Schrödinger, Heisenberg, and density-operator (Liouville–von Neumann) pictures, respectively. Moreover, for each picture, we show that three formulations (retarded, symmetric, and advanced) are possible, which refer to the times t and t − τ0, to the times t − τ0/2 and t + τ0/2, or to the times t and t + τ0, respectively. We shall see that, when the chronon tends to zero, ordinary QM is obtained as the limiting case of the ‘‘symmetric’’ formulation only, while the ‘‘retarded’’ one naturally describes QM with friction, that is, dissipative quantum systems (like a particle moving in an absorbing medium). In this sense, discretized QM is much richer than the ordinary one.
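The dissipative character of the retarded formulation can be made concrete with a toy calculation. Taking the retarded finite-difference Schrödinger equation in the form iħ[ψ(t) − ψ(t − τ)]/τ = Hψ(t) and applying it to a single energy eigenstate gives a one-line recursion; the values of τ and E below are illustrative, not the physical chronon:

```python
# Retarded finite-difference evolution of one energy eigenstate:
#   i*hbar*(psi(t) - psi(t - tau))/tau = E*psi(t)
# Solving for psi(t):
#   psi(t) = psi(t - tau) / (1 + 1j*tau*E/hbar),
# so each step multiplies the amplitude by 1/|1 + i*tau*E/hbar| < 1.
# Units with hbar = 1; tau and E are illustrative values only.
hbar, E, tau, steps = 1.0, 1.0, 0.1, 50

psi = 1.0 + 0.0j
for _ in range(steps):
    psi /= 1.0 + 1j * tau * E / hbar

print(round(abs(psi), 2))  # 0.78 < 1: the norm decays monotonically
```

The norm loss at every step is the hallmark of a damped (dissipative) system; the symmetric formulation, by contrast, recovers ordinary unitary evolution as τ → 0.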
Consequences for the Electron of a Quantum of Time
35
We also obtain the (retarded) finite-difference Schrödinger equation within the Feynman path-integral approach and study some of its relevant solutions. We then derive the time-evolution operators of this discrete theory and use them to get the finite-difference Heisenberg equations. When discussing the mutual compatibility of the various pictures listed above, we find that they can be written in a form such that they turn out to be equivalent (as happens in the ‘‘continuous’’ case of ordinary QM), even if our Heisenberg picture cannot be derived by directly ‘‘discretizing’’ the ordinary Heisenberg representation. Afterward, some typical applications and examples are studied, such as the free particle (electron), the harmonic oscillator, and the hydrogen atom, and various cases are pointed out in which the predictions of discrete QM differ from those expected from ‘‘continuous’’ QM. Finally, the density-matrix formalism is applied toward a possible solution of the measurement problem in QM, with interesting results, such as a natural explanation of ‘‘decoherence,’’ which reveal the power of discretized (in particular, retarded) QM. The idea of a discrete temporal evolution is not new and, as with almost all physical ideas, has from time to time been recovered from oblivion.1 For instance, in classical Greece this idea came to light as part of atomistic thought. In the Middle Ages, belief in the discontinuous character of time was at the basis of the ‘‘theistic atomism’’ held by the Arabic thinkers of the Kalām (Jammer, 1954). In Europe, discussions about the discreteness of space and time can be found in the writings of Isidore of Seville, Nicolaus Boneti, and Henry of Harclay, who investigated the nature of the continuum. In more recent times, the idea of the existence of a fundamental interval of time was rejected by Leibniz, because it was incompatible with his rationalistic philosophy.
Within modern physics, however, Planck's famous work on black-body radiation inspired a new view of the subject. In fact, the introduction of the quanta opened a wide range of new scientific possibilities regarding how the physical world can be conceived, including considerations, like those in this chapter, on the discretization of time within the framework of quantum mechanics. In the early years of the twentieth century, Mach regarded the concept of the continuum as a consequence of our physiological limitations: ‘‘. . . le temps et l'espace ne représentent, au point de vue physiologique, qu'un continu apparent, qu'ils se composent très vraisemblablement
1 Historical aspects related to the introduction of a fundamental interval of time in physics can be found in Casagrande (1977).
d'éléments discontinus, mais qu'on ne peut distinguer nettement les uns des autres’’ (Arzeliès, 1966, p. 387); that is, from the physiological point of view, time and space are only an apparent continuum, most probably composed of discontinuous elements that cannot be sharply distinguished from one another. Poincaré (1913) also took into consideration the possible existence of what he called an ‘‘atom of time’’: the minimum amount of time that allows distinguishing between two states of a system. Finally, in the 1920s, J. J. Thomson (1925–26) suggested that the electric force acts in a discontinuous way, producing finite increments of momentum separated by finite intervals of time. This seminal work has since inspired a series of papers on the existence of a fundamental interval of time, the chronon, although its overall repercussion at the time was small. A further seminal article was written by Ambarzumian and Ivanenko (1930), who assumed a discrete nature for space-time and also stimulated many subsequent papers. It is important to stress that, in principle, time discretization can be introduced in two distinct (and completely different) ways: 1. By attributing to time a discrete structure, that is, by regarding time not as a continuum, but as a one-dimensional ‘‘lattice’’. 2. By considering time as a continuum, in which events can take place (discontinuously) only at discrete instants of time. Almost all attempts to introduce a discretization of time followed the first approach, generally as part of a more extended procedure in which space-time as a whole is considered intrinsically discrete (a four-dimensional lattice). Recently, Lee (1983) introduced a time discretization on the basis of the finite number of experimental measurements performable in any finite interval of time.2 For an early approach in this direction, see Tati (1964) and references therein, such as Yukawa (1966) and Darling (1950). Similarly, formalizations of an intrinsically discrete physics have also been proposed (McGoveran and Noyes, 1989).
The second approach was first adopted in the 1920s (e.g., by Levi, 1926, and by Pokrowski, 1928) after Thomson's work, and it resulted in the first real example of a theory based on the existence of a fundamental interval of time: the one set forth by Caldirola (1953, 1956) in the 1950s.3 Namely, Caldirola formulated a theory for the classical electron with the aim of providing a consistent (classical) theory for its motion in an electromagnetic field. In the late 1970s, Caldirola (1976a) extended his procedure to nonrelativistic QM. It is known that the classical theory of the electron in an electromagnetic field (despite the efforts by Abraham, 1902; Lorentz, 1892, 1904; Poincaré,
2 See also Lee (1987), Friedberg and Lee (1983), and Bracci et al. (1983).
3 Further developments of this theory can be found in Caldirola (1979a) and references therein. See also Caldirola (1979c, 1979d, 1984b) and Caldirola and Recami (1978), as well as Petzold and Sorg (1977), Sorg (1976), and Mo and Papas (1971).
1906; and Dirac, 1938a,b; as well as Einstein, 1915; Frenkel, 1926, 1926–28; Lattes et al., 1947; and Ashauer, 1949, among others) actually presents many serious problems, except when the field of the particle is neglected.4 By replacing Dirac's differential equation with two finite-difference equations, Caldirola developed a theory in which the main difficulties of Dirac's theory were overcome. As seen later, in Caldirola's relativistically invariant formalism the chronon characterizes the changes experienced by the dynamical state of the electron when it is submitted to external forces. The electron is regarded as an (extended-like) object that is pointlike only at discrete positions xn along its trajectory, such that the electron takes a quantum of proper time to travel from one position to the following one (or, rather, two chronons; see the following). It is tempting to examine extensively the generalization of such a theory to the quantum domain, and this is performed herein. Let us recall that one of the most interesting aspects of the discretized Schrödinger equations is that the masses of the muon and of the tau lepton follow as corresponding to the two levels of the first (degenerate) excited state of the electron. In conventional QM there is a perfect equivalence among its various pictures: Schrödinger's, Heisenberg's, and the density-matrix formalism. When discretizing the evolution equations of these different formalisms, we succeed in writing them in a form such that they are still equivalent. However, to be compatible with the Schrödinger representation, our Heisenberg equations cannot, in general, be obtained by a direct discretization of the continuous Heisenberg equation. This work is organized as follows. In Section 2 we present a brief review of the main classical theories of the electron, including Caldirola's.
In Section 3 we introduce the three discretized forms (retarded, advanced, and symmetric) of the Schrödinger equation, analyze the main characteristics of these formulations, and derive the retarded one from Feynman's path-integral approach. In Section 4, our discrete theory is applied to some simple quantum systems, such as the harmonic oscillator, the free particle, and the hydrogen atom. The possible experimental deviations from the predictions of ordinary QM are investigated. In Section 5, a new derivation of the discretized Liouville–von Neumann equation, starting from the coarse-graining hypothesis, is presented. This representation is then adopted to tackle the measurement problem in QM, with rather interesting results. Finally, a discussion of the possible interpretation of our discretized equations is found in Section 6.
4 It is interesting to note that all those problems have been, necessarily, tackled by Yaghjian (1992) in his book when he faced the question of the relativistic motion of a charged, macroscopic sphere in an external electromagnetic field (see also Yaghjian, 1989, p. 322).
2. THE INTRODUCTION OF THE CHRONON IN THE CLASSICAL THEORY OF THE ELECTRON Almost a century after its discovery, the electron continues to be an object still awaiting a convincing description, both in classical and in quantum electrodynamics.5 As Schrödinger put it, the electron is still a stranger in electrodynamics. Maxwell's electromagnetism is a field-theoretical approach in which no reference is made to the existence of material corpuscles. Thus, one may say that one of the most controversial questions of twentieth-century physics, the wave-particle paradox, is not characteristic of QM only. In the classical theory of the electron, matching the description of the electromagnetic fields (obeying the Maxwell equations) with the existence of charge carriers like the electron is still a challenging task. The hypothesis that electric currents could be associated with charge carriers was already present in the early ‘‘particle electrodynamics’’ formulated in 1846 by Fechner and Weber (Rohrlich, 1965, p. 9). But this idea was not taken into consideration again until a few decades later, in 1881, by Helmholtz. Until that time, electrodynamics had developed on the hypothesis of an electromagnetic continuum and of an ether.6 In that same year, Thomson (1881) wrote his seminal paper in which the electron mass was regarded as purely electromagnetic in nature. Namely, the energy and momentum associated with the (electromagnetic) fields produced by an electron were held entirely responsible for the energy and momentum of the electron itself (Belloni, 1981). Lorentz's electrodynamics, which described the particle-particle interaction via electromagnetic fields by the famous force law
\[ \mathbf{f} = \rho\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\wedge\mathbf{B}\right), \tag{1} \]
where ρ is the charge density of the particle on which the fields act, dates back to the beginning of the 1890s. The electron was finally discovered by Thomson in 1897, and in the following years various theories appeared.
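Equation (1) can be evaluated directly; the following one-liner (illustrative field values, Gaussian units, and a helper name of our own) simply checks the vector algebra of the force law:

```python
import numpy as np

def lorentz_force_density(rho, E, v, B, c=1.0):
    """Force density f = rho * (E + (1/c) v x B), as in Eq. (1), Gaussian units."""
    return rho * (E + np.cross(v, B) / c)

# v along y, B along z, so (1/c) v x B points along x and adds to E
f = lorentz_force_density(rho=2.0,
                          E=np.array([1.0, 0.0, 0.0]),
                          v=np.array([0.0, 1.0, 0.0]),
                          B=np.array([0.0, 0.0, 1.0]))
print(f)  # [4. 0. 0.]
```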
The famous (prerelativistic) theories by Abraham, Lorentz, and Poincaré regarded it as an extended-type object, endowed again with a purely electromagnetic mass. As is well known, in 1902 Abraham proposed the simple (and questionable) model of a rigid sphere with a uniform electric charge density on its surface. Lorentz's (1904) model was quite similar and tried to improve the situation with the mere introduction of the effects resulting from the Lorentz–FitzGerald contraction.
5 Compare, for example, the works by Recami and Salesi (1994, 1996, 1997a, 1997b, 1998a, 1998b) and references therein. See also Pavsic et al. (1993, 1995) and Rodrigues, Vaz, and Recami (1993). 6 For a modern discussion of a similar topic, see Likharev and Claeson (1992).
2.1. The Abraham–Lorentz Theory of the Electron A major difficulty in accurately describing the electron motion was the inclusion of the radiation reaction (i.e., of the effect produced on such motion by the fields radiated by the particle itself). In the model proposed by Abraham and Lorentz, the assumption of a purely electromagnetic structure for the electron implied that
\[ \mathbf{F}_{p} + \mathbf{F}_{\rm ext} = 0, \tag{2} \]
where F_p is the self-force due to the self-fields of the particle and F_ext is the external force. According to Lorentz's law, the self-force was given by
\[ \mathbf{F}_{p} = \int \rho\left(\mathbf{E}_{p} + \frac{1}{c}\,\mathbf{v}\wedge\mathbf{B}_{p}\right) d^{3}r, \]
where E_p and B_p are the fields produced by the charge density ρ itself, according to the Maxwell–Lorentz equations. For the radiation reaction force, Lorentz obtained the following expression (a being the acceleration of the electron):
\[ \mathbf{F}_{p} = -\frac{4}{3}\,\frac{W_{\rm el}}{c^{2}}\,\mathbf{a} + \frac{2}{3}\,\frac{ke^{2}}{c^{3}}\,\dot{\mathbf{a}} + \frac{2e^{2}}{3c^{3}} \sum_{n=2}^{\infty} \frac{(-1)^{n}}{n!}\,\frac{1}{c^{n}}\,\frac{d^{n}\mathbf{a}}{dt^{n}}\, O\!\left(R^{\,n-1}\right), \tag{3} \]
where k ≡ (4πε₀)⁻¹ (in the following, whenever convenient, we shall assume units such that numerically k = 1), and where
\[ W_{\rm el} \equiv \frac{1}{2}\iint \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\; d^{3}r\, d^{3}r' \]
is the electrostatic self-energy of the considered charge distribution, and R is the radius of the electron. All the terms in the sum are structure dependent: they depend on R and on the charge distribution. By identifying the electromagnetic mass of the particle with its electrostatic self-energy,
\[ m_{\rm el} = \frac{W_{\rm el}}{c^{2}}, \]
it was possible to write Eq. (2) as
\[ \frac{4}{3}\,m_{\rm el}\,\dot{\mathbf{v}} - \mathbf{G} = \mathbf{F}_{\rm ext}, \tag{4} \]
so that
\[ \mathbf{G} = \frac{2}{3}\,\frac{e^{2}}{c^{3}}\,\dot{\mathbf{a}}\,\bigl(1 + O(R)\bigr), \tag{5} \]
which was the equation of motion in the Abraham–Lorentz model. Quantity G is the radiation reaction force acting on the
electron. One problem with Eq. (4) was the factor 4/3. In fact, if the mass is supposed to be of electromagnetic origin only, then the total momentum of the electron would be given by
\[ \mathbf{p} = \frac{4}{3}\,\frac{W_{\rm el}}{c^{2}}\,\mathbf{v}, \tag{6} \]
which is not invariant under Lorentz transformations. That model, therefore, was nonrelativistic. Finally, we can observe from Eq. (3) that the structure-dependent terms are functions of higher derivatives of the acceleration. Moreover, the resulting differential equation is of the third order, so that the initial position and initial velocity are not enough to single out a solution. To suppress the structure terms, the electron should be reducible to a point (R → 0), but in this case the self-energy W_el and the mass m_el would diverge! After the emergence of the special theory of relativity, or rather, after the publication by Lorentz in 1904 of his famous transformations, some attempts were made to adapt the model to the new requirements.7 Abraham himself (1905) succeeded in deriving the following generalization of the radiation reaction term, Eq. (5):
\[ G^{\mu} = \frac{2}{3}\,\frac{e^{2}}{c}\left(\frac{d^{2}u^{\mu}}{ds^{2}} + \frac{u^{\mu}u_{\nu}}{c^{2}}\,\frac{d^{2}u^{\nu}}{ds^{2}}\right). \tag{7} \]
A solution to the problem of the noncovariance of the electron momentum was proposed by Poincaré in 1905 through the addition of cohesive forces of nonelectromagnetic character. This, however, meant that the nature of the electron was no longer purely electromagnetic. On the other hand, electrons could not be considered pointlike, because of the obvious divergence of their energy when R → 0; thus, a description of the electron motion could not dismiss the structure terms. Only Fermi (1922) succeeded in showing that the correct relation for the momentum of a purely electromagnetic electron could be obtained without Poincaré's cohesive forces.
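For Abraham's model of a rigid sphere with a uniform surface charge, the self-energy integral W_el defined above can be evaluated in closed form (a standard electrostatics result, quoted here for illustration rather than derived in the chapter), which makes explicit why the electromagnetic mass diverges as R → 0:

```latex
W_{\rm el} \;=\; \frac{1}{2}\iint \frac{\rho(\mathbf r)\,\rho(\mathbf r')}{|\mathbf r-\mathbf r'|}\, d^{3}r\, d^{3}r'
\;=\; \frac{k e^{2}}{2R},
\qquad
m_{\rm el} \;=\; \frac{W_{\rm el}}{c^{2}} \;=\; \frac{k e^{2}}{2Rc^{2}}
\;\xrightarrow{\;R \to 0\;}\; \infty .
```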
2.2. Dirac's Theory of the Classical Electron Notwithstanding its inconsistencies, the Abraham–Lorentz theory was the most accepted theory of the electron until the publication of Dirac's theory in 1938. During the long period between these two theories, as well as afterward, various further attempts to solve the problem were set forth,
7 See, for example, von Laue (1909), Schott (1912), Page (1918, 1921), and Page and Adams (1940).
either by means of extended-type models (Mie, Page, Schott, and so on8), or by trying again to treat the electron as a pointlike particle (Fokker, Wentzel, and so on).9 Dirac's approach (1938a) is the best-known attempt to describe the classical electron. It bypassed the critical problem of the previous theories of Abraham and Lorentz by devising a solution for the pointlike electron that avoided divergences. By using the conservation laws of energy and momentum and the Maxwell equations, Dirac calculated the flux of the energy-momentum four-vector through a tube of radius ε ≪ R (quantity R being the radius of the electron at rest) surrounding the world line of the particle, and obtained

$$m\,\frac{du_\mu}{ds} = F_\mu + \Gamma_\mu, \qquad (8)$$
where Γ_μ is the Abraham four-vector [Eq. (7)], that is, the reaction force acting on the electron itself, and F_μ is the four-vector that represents the external field acting on the particle:

$$F_\mu = \frac{e}{c}\,F_{\mu\nu}\,u^\nu. \qquad (9)$$
According to such a model, the rest mass m0 of the electron is the limiting, finite value obtained as the difference of two quantities tending to infinity when R → 0:

$$m_0 = \lim_{\varepsilon\to 0}\left[\frac{1}{2}\,\frac{e^2}{c^2\varepsilon} - k(\varepsilon)\right];$$

the procedure followed by Dirac was an early example of the elimination of divergences by means of a subtractive method. At the nonrelativistic limit, Dirac's equation tends to the one previously obtained by Abraham and Lorentz:

$$m_0\,\frac{d\mathbf v}{dt} - \frac{2}{3}\,\frac{e^2}{c^3}\,\frac{d^2\mathbf v}{dt^2} = e\left(\mathbf E + \frac{1}{c}\,\mathbf v\wedge\mathbf B\right), \qquad (10)$$
8 There were several attempts to develop an extended-type model for the electron. See, for example, Compton (1919) and references therein; also Mie (1912), Page (1918), Schott (1912), Frenkel (1926), Schrödinger (1930), Mathisson (1937), Hönl and Papapetrou (1939, 1940), Bhabha and Corben (1941), Weyssenhof and Raabe (1947), Pryce (1948), Huang (1952), Hönl (1952), Proca (1954), Bunge (1955), Gursey (1957), Corben (1961, 1968, 1977, 1984, 1993), Fleming (1965), Liebowitz (1969), Gallardo et al. (1967), Kálnay (1970, 1971), Kálnay and Torres (1971), Jehle (1971), Riewe (1971), Mo and Papas (1971), Bonnor (1974), Marx (1975), Perkins (1976), Cvijanovich and Vigier (1977), Gutkowski et al. (1977), Barut (1978a), Lock (1979), Hsu and Mac (1979), Coleman (1960), McGregor (1992), and Rodrigues et al. (1993).
9 A historical overview of these different theories of the electron can be found in Rohrlich (1965) and references therein, and also in Rohrlich (1960).
Ruy H. A. Farias and Erasmo Recami
except that in the Abraham–Lorentz approach m0 diverged. Equation (10) shows that the reaction force equals $\frac{2}{3}\frac{e^2}{c^3}\frac{d^2\mathbf v}{dt^2}$. Dirac's dynamical equation [Eq. (8)] was later reobtained from different, improved models.10 Wheeler and Feynman (1945), for example, rederived Eq. (8) by basing electromagnetism on an action principle applied to particles only, via their own absorber hypothesis. However, Eq. (8) also presents many problems, related to the many infinite nonphysical solutions that it possesses. Actually, as previously mentioned, it is a third-order differential equation, requiring three initial conditions to single out one of its solutions. In the description of a free electron, for example, it even yields "self-accelerating" solutions (runaway solutions), for which velocity and acceleration increase spontaneously and indefinitely (see Eliezer, 1943; Zin, 1949; and Rohrlich, 1960, 1965). Selection rules have been established to distinguish between physical and nonphysical solutions (for example, Schenberg, 1945, and Bhabha, 1946). Moreover, for an electron submitted to an electromagnetic pulse, further nonphysical solutions appear, related this time to pre-accelerations (Ashauer, 1949). If the electron comes from infinity with a uniform velocity v0 and at a certain instant of time t0 is submitted to an electromagnetic pulse, then it starts accelerating before t0. Drawbacks such as these motivated further attempts to determine a coherent model for the classical electron.
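The runaway behavior mentioned above is easy to exhibit. The sketch below (an illustration, not taken from the chapter) writes the free-electron Abraham–Lorentz equation, Eq. (10) with E = B = 0, as d²v/dt² = (1/τ_r) dv/dt with τ_r = 2e²/(3m0c³), whose exact solution has an exponentially growing acceleration; the constants are standard CGS values.

```python
# Runaway solution of the nonrelativistic Abraham-Lorentz equation for a
# FREE electron: m0 * dv/dt = (2 e^2 / 3 c^3) * d^2 v/dt^2 (Gaussian units).
# With tau_r = 2 e^2 / (3 m0 c^3), any nonzero initial acceleration a0
# evolves as a(t) = a0 * exp(t / tau_r): the "self-accelerating" solutions
# discussed in the text. Constants are standard CGS values, not from the chapter.
import math

e  = 4.80320e-10   # electron charge (statC)
m0 = 9.10938e-28   # electron rest mass (g)
c  = 2.99792e10    # speed of light (cm/s)

tau_r = 2 * e**2 / (3 * m0 * c**3)   # ~6.27e-24 s (numerically one chronon theta_0)

def runaway_acceleration(a0, t):
    """Exact runaway solution a(t) = a0 exp(t / tau_r) of the free equation."""
    return a0 * math.exp(t / tau_r)

# Even a tiny initial acceleration blows up after a few hundred chronons:
a0 = 1.0  # cm/s^2
print(tau_r)                                   # the fundamental time scale
print(runaway_acceleration(a0, 100 * tau_r))   # ~ e^100 ~ 2.7e43 cm/s^2
```

Note that the characteristic runaway time τ_r coincides numerically with the chronon θ0 of Eq. (13) introduced below, which is how the same combination of constants reappears in Caldirola's theory.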
2.3. Caldirola's Theory for the Classical Electron

Among the various attempts to formulate a more satisfactory theory, we want to focus attention on the one proposed by Caldirola. Like Dirac's, Caldirola's theory is also Lorentz invariant. Continuity, in fact, is not an assumption required by Lorentz invariance (Snyder, 1947). The theory postulates the existence of a universal interval τ0 of proper time, even if time flows continuously as in the ordinary theory. When an external force acts on the electron, however, the reaction of the particle to the applied force is not continuous: the value of the electron velocity u_μ should jump from u_μ(τ − τ0) to u_μ(τ) only at certain positions s_n along its world line; these discrete positions are such that the electron takes a time τ0 to travel from one position s_{n−1} to the next, s_n. In this theory11 the electron, in principle, is still considered pointlike, but the Dirac relativistic equations for the classical radiating electron are replaced: (1) by a corresponding finite-difference (retarded) equation in the velocity u_μ(τ),
10 See Schenberg (1945), Havas (1948), and Loinger (1955).
11 Caldirola presented his theory of the electron in a series of papers in the 1950s, such as his 1953 and 1956 works. Further developments of his theory can be found in Caldirola (1979a) and references therein. See also Caldirola (1979c,d, 1984b) and Caldirola and Recami (1978).
$$\frac{m_0}{\tau_0}\left\{u_\mu(\tau) - u_\mu(\tau-\tau_0) + \frac{u_\mu(\tau)\,u^\nu(\tau)}{c^2}\left[u_\nu(\tau) - u_\nu(\tau-\tau_0)\right]\right\} = \frac{e}{c}\,F_{\mu\nu}(\tau)\,u^\nu(\tau), \qquad (11)$$
which reduces to the Dirac equation [Eq. (8)] when τ0 → 0, but cannot be derived from it (in the sense that it cannot be obtained by a simple discretization of the time derivatives appearing in Dirac's original equation); and (2) by a second equation, this time connecting the "discrete positions" x_μ(τ) along the world line of the particle; in fact, the dynamical law in Eq. (11) is by itself unable to specify univocally the variables u_μ(τ) and x_μ(τ), which describe the motion of the particle. Caldirola named it the transmission law:

$$x_\mu(n\tau_0) - x_\mu[(n-1)\tau_0] = \frac{\tau_0}{2}\left\{u_\mu(n\tau_0) + u_\mu[(n-1)\tau_0]\right\}, \qquad (12)$$
which is valid inside each discrete interval τ0 and describes the internal or microscopic motion of the electron. In these equations, u_μ(τ) is the ordinary four-vector velocity, satisfying the condition u_μ(τ)u^μ(τ) = −c² for τ = nτ0, where n = 0, 1, 2, . . . and μ, ν = 0, 1, 2, 3; F^{μν} is the external (retarded) electromagnetic field tensor; and the quantity

$$\theta_0 \equiv \frac{\tau_0}{2} = \frac{2}{3}\,\frac{ke^2}{m_0 c^3} \simeq 6.266\times 10^{-24}\ \mathrm{s} \qquad (13)$$

is defined as the chronon associated with the electron (as justified below). The chronon θ0 = τ0/2 depends on the particle (internal) properties, namely, on its charge e and rest mass m0. As a result, the electron eventually appears as an extended-like particle12, with an internal structure, rather than as a pointlike object (as initially assumed). For instance, one may imagine that the particle does not react instantaneously to the action of an external force because of its finite extension (the numerical value of the chronon is of the same order as the time spent by light to travel along a classical electron diameter). As noted, Eq. (11) describes the motion of an object that happens to be pointlike only at the discrete positions s_n along its trajectory,

12 See, for example, Salesi and Recami (1995, 1996, 1997a,b, 1998). See also the part on the field theory of leptons in Recami and Salesi (1995) and on the field theory of the extended-like electron in Salesi and Recami (1994, 1996).
(Caldirola, 1956, 1979a), even if both position and velocity are still continuous and well-behaved functions of the parameter τ, since they are differentiable functions of τ. It is essential to notice that a discrete character is assigned to the electron merely by the introduction of the fundamental quantum of time, with no need of a "model" for the electron. As is well known, many difficulties are encountered with both the strictly pointlike models and the extended-type particle models (spheres, tops, gyroscopes, and so on). In Barut's words (1991), "If a spinning particle is not quite a point particle, nor a solid three dimensional top, what can it be?" We deem the answer lies in a third type of model, the "extended-like" one, as in the present theory; or as in the (related) theoretical approach in which the center of the pointlike charge is spatially distinct from the particle center of mass (see Salesi and Recami, 1994, and ensuing papers on this topic, like Recami and Salesi, 1997a,b, 1998a, and Salesi and Recami, 1997b). In any case, it is worth recalling that the worst troubles in quantum field theory (e.g., in quantum electrodynamics), like the presence of divergences, are due to the pointlike character still attributed to (spinning) particles, since the problem of a suitable model for elementary particles was transported, without a suitable solution, from classical to quantum physics. In our view that particular problem may still be the most important in modern particle physics. Equations (11) and (12) provide a full description of the motion of the electron. Notice that the global "macroscopic" motion can be the same for different solutions of the transmission law. The behavior of the electron under the action of external electromagnetic fields is completely described by its macroscopic motion. As in Dirac's case, the equations are invariant under Lorentz transformations.
However, as we shall see, they are free of pre-accelerations, self-accelerating solutions, and the problems with the hyperbolic motion that had raised great debates in the first half of the twentieth century. In the nonrelativistic limit, the previous (retarded) equations reduce to the form

$$\frac{m_0}{\tau_0}\left[\mathbf v(t) - \mathbf v(t-\tau_0)\right] = e\left[\mathbf E(t) + \frac{1}{c}\,\mathbf v(t)\wedge\mathbf B(t)\right], \qquad (14)$$

$$\mathbf r(t) - \mathbf r(t-\tau_0) = \frac{\tau_0}{2}\left[\mathbf v(t) + \mathbf v(t-\tau_0)\right], \qquad (15)$$
which can be obtained, this time, from Eq. (10) by directly replacing the time derivatives by the corresponding finite-difference expressions. The macroscopic Eq. (14) had already been obtained by other authors for the dynamics of extended-type electrons13.
13 Compare, for example, Schott (1912), Page (1918), Page and Adams (1940), Bohm and Weinstein (1948), and Eliezer (1950).
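To make the finite-difference dynamics concrete, the sketch below iterates the retarded Eq. (14) for an electron in a uniform magnetic field (E = 0, B = Bẑ). Since v(t) appears on both sides, each chronon step is an implicit 2×2 linear solve. The field strength is an invented, purely illustrative value; note that the speed shrinks at every step, in line with the intrinsically radiating (dissipative) character of the retarded formulation discussed in the text.

```python
# Discrete retarded equation (14) for an electron in a uniform magnetic
# field B = B z_hat (E = 0), stepped one chronon tau_0 at a time:
#   (m0/tau_0) [v(t) - v(t - tau_0)] = (e/c) v(t) x B.
# The equation is implicit in v(t); for B along z it reduces to a 2x2
# linear solve per step. Gaussian units; B is invented for illustration.
import math

e, m0, c = 4.80320e-10, 9.10938e-28, 2.99792e10
tau_0 = (4.0 / 3.0) * e**2 / (m0 * c**3)       # tau_0 = 2 * theta_0, Eq. (13)

def step(vx, vy, B):
    """One chronon step: solve v_new - a (v_new x z_hat) = v_old."""
    a = tau_0 * e * B / (m0 * c)               # dimensionless rotation per step
    det = 1.0 + a * a
    return (vx + a * vy) / det, (vy - a * vx) / det

# Electron with v = 0.01 c in an (absurdly strong, illustrative) field:
vx, vy, B = 0.01 * c, 0.0, 1.0e14
speed0 = math.hypot(vx, vy)
for _ in range(1000):
    vx, vy = step(vx, vy, B)
print(math.hypot(vx, vy) / speed0)   # < 1: the speed decays every step
```

Each step multiplies |v|² by 1/(1 + a²), so the kinetic energy decreases monotonically, a discrete analogue of radiation damping for the circulating charge (the problem treated by Prosperetti, 1980).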
The important point is that Eqs. (11) and (12), or (14) and (15), allow the difficulties met with the Dirac classical Eq. (8) to be overcome. In fact, the electron macroscopic motion is completely determined once initial velocity and position are given. Solutions of the relativistic Eqs. (11) and (12) for the radiating electron—or of the corresponding nonrelativistic Eqs. (14) and (15)—were obtained for several problems. The resulting motions never presented unphysical behavior, so the following questions can be regarded as solved (Caldirola, 1956, 1979a):

Exact relativistic solutions:
– Free electron motion
– Electron under the action of an electromagnetic pulse (Cirelli, 1955)
– Hyperbolic motion (Lanz, 1962)

Nonrelativistic approximate solutions:
– Electron under the action of time-dependent forces
– Electron in a constant, uniform magnetic field (Prosperetti, 1980)
– Electron moving along a straight line under the action of an elastic restoring force (Caldirola et al., 1978)

Before we proceed, it is interesting to briefly analyze the electron radiation properties as deduced from the finite-difference relativistic Eqs. (11) and (12), to show the advantages of the present formalism with respect to the Abraham–Lorentz–Dirac one. Such equations can be written (Lanz, 1962; Caldirola, 1979a) as

$$\frac{\Delta Q_\mu(\tau)}{\tau_0} + R_\mu(\tau) + S_\mu(\tau) = \frac{e}{c}\,F_{\mu\nu}(\tau)\,u^\nu(\tau), \qquad (16)$$
where

$$\Delta Q_\mu \equiv m_0\left[u_\mu(\tau) - u_\mu(\tau-\tau_0)\right], \qquad (17)$$

$$R_\mu(\tau) \equiv \frac{m_0}{2\tau_0}\,\frac{u_\mu(\tau)\,u_\nu(\tau)}{c^2}\left[u^\nu(\tau+\tau_0) + u^\nu(\tau-\tau_0) - 2u^\nu(\tau)\right], \qquad (18)$$

$$S_\mu(\tau) \equiv \frac{m_0}{2\tau_0}\,\frac{u_\mu(\tau)\,u_\nu(\tau)}{c^2}\left[u^\nu(\tau+\tau_0) - u^\nu(\tau-\tau_0)\right]. \qquad (19)$$
In Eq. (16), the first term, ΔQ_μ(τ)/τ0, represents the variation per unit of proper time (in the interval τ − τ0 to τ) of the particle energy-momentum vector. The second one, R_μ(τ), is a dissipative term, because it contains only even derivatives of the velocity, as can be proved by expanding u^ν(τ + τ0) and u^ν(τ − τ0) in terms of τ0; furthermore, it is never negative (Caldirola, 1979a; Lanz, 1962) and can therefore represent the energy-momentum radiated by the electron in the unit of proper time. The third term, S_μ(τ), is conservative and represents the rate of change in proper time of the electron reaction energy-momentum.
The time component (μ = 0) of Eq. (16) is written as

$$\frac{T(\tau) - T(\tau-\tau_0)}{\tau_0} + R_0(\tau) + S_0(\tau) = P_{\mathrm{ext}}(\tau), \qquad (20)$$

where the quantity T(τ) is the kinetic energy

$$T(\tau) = m_0 c^2\left(\frac{1}{\sqrt{1-\beta^2}} - 1\right), \qquad (21)$$
so that in Eq. (20) the first term replaces the proper-time derivative of the kinetic energy, the second one is the energy radiated by the electron in the unit of proper time, S_0(τ) is the variation rate in proper time of the electron reaction energy (radiative correction), and P_ext(τ) is the work done by the external forces in the unit of proper time. We are now ready to show that Eq. (20) yields a clear explanation for the origin of the so-called acceleration energy (Schott energy) appearing in the energy-conservation relation for the Dirac equation. In fact, expanding the left-hand sides of Eqs. (16)–(19) for μ = 0 in power series with respect to τ0, and keeping only the first-order terms, yields

$$\frac{T(\tau) - T(\tau-\tau_0)}{\tau_0} \simeq \frac{dT}{d\tau} - \frac{2}{3}\,\frac{e^2}{c^2}\,\frac{da^0}{d\tau}, \qquad (22)$$

$$R_0(\tau) \simeq \frac{1}{\sqrt{1-\beta^2}}\;\frac{2}{3}\,\frac{e^2}{c^3}\,a_\mu a^\mu, \qquad (23)$$

$$S_0(\tau) \simeq 0, \qquad (24)$$
where a^μ is the four-acceleration

$$a^\mu \equiv \frac{du^\mu}{d\tau} = \gamma\,\frac{du^\mu}{dt},$$
quantity γ being the Lorentz factor. Therefore, Eq. (20) to the first order in τ0 becomes

$$\frac{dT}{d\tau} - \frac{2}{3}\,\frac{e^2}{c^2}\,\frac{da^0}{d\tau} + \frac{2}{3}\,\frac{e^2}{c^3}\,\frac{a_\mu a^\mu}{\sqrt{1-\beta^2}} \simeq P_{\mathrm{ext}}(\tau), \qquad (25)$$

or, passing from the proper time τ to the observer's time t,

$$\frac{dT}{dt} - \frac{2}{3}\,\frac{e^2}{c^2}\,\frac{da^0}{dt} + \frac{2}{3}\,\frac{e^2}{c^3}\,a_\mu a^\mu\,\frac{d\tau}{dt} \simeq P_{\mathrm{ext}}(t). \qquad (26)$$
The last relation is identical to the energy-conservation law found by Fulton and Rohrlich (1960) for the Dirac equation. In Eq. (26) the derivative of (2e²/3c²)a⁰ appears, which is simply the acceleration energy. Our approach clearly shows that it arises only by expanding in a power series of τ0 the kinetic energy increment suffered by the electron during the fundamental proper-time interval τ0, while such a Schott energy (as well as the higher-order energy terms) does not need to show up explicitly when adopting the full formalism of finite-difference equations. We return to this important point in subsection 2.4. Let us finally observe (Caldirola, 1979a, and references therein) that, when setting

$$F^{\mathrm{self}}_{\mu\nu} \equiv \frac{m_0}{ec\,\tau_0}\left[u_\mu(\tau)\,u_\nu(\tau-\tau_0) - u_\mu(\tau-\tau_0)\,u_\nu(\tau)\right], \qquad (27)$$

the relativistic equation of motion [Eq. (11)] becomes

$$\frac{e}{c}\left(F^{\mathrm{self}}_{\mu\nu} + F^{\mathrm{ext}}_{\mu\nu}\right)u^\nu = 0, \qquad (28)$$

confirming that F^self_μν represents the (retarded) self-field associated with the moving electron.
2.4. The Three Alternative Formulations of Caldirola's Theory

Two more (alternative) formulations are possible with Caldirola's equations, based on different discretization procedures. In fact, Eqs. (11) and (12) describe an intrinsically radiating particle, and by expanding Eq. (11) in terms of τ0 a radiation reaction term appears. Caldirola called those equations the retarded form of the electron equations of motion. By rewriting the finite-difference equations, on the contrary, in the form

$$\frac{m_0}{\tau_0}\left\{u_\mu(\tau+\tau_0) - u_\mu(\tau) + \frac{u_\mu(\tau)\,u^\nu(\tau)}{c^2}\left[u_\nu(\tau+\tau_0) - u_\nu(\tau)\right]\right\} = \frac{e}{c}\,F_{\mu\nu}(\tau)\,u^\nu(\tau), \qquad (29)$$

$$x_\mu[(n+1)\tau_0] - x_\mu(n\tau_0) = \tau_0\,u_\mu(n\tau_0), \qquad (30)$$
one gets the advanced formulation of the electron theory, since the motion—according to Eqs. (29) and (30)—is now determined by advanced actions. In contrast with the retarded formulation, the advanced one describes an electron that absorbs energy from the external world.
Finally, by adding the retarded and advanced actions, Caldirola derived the symmetric formulation of the electron theory:

$$\frac{m_0}{2\tau_0}\left\{u_\mu(\tau+\tau_0) - u_\mu(\tau-\tau_0) + \frac{u_\mu(\tau)\,u^\nu(\tau)}{c^2}\left[u_\nu(\tau+\tau_0) - u_\nu(\tau-\tau_0)\right]\right\} = \frac{e}{c}\,F_{\mu\nu}(\tau)\,u^\nu(\tau), \qquad (31)$$

$$x_\mu[(n+1)\tau_0] - x_\mu[(n-1)\tau_0] = 2\tau_0\,u_\mu(n\tau_0), \qquad (32)$$
which does not include any radiation reaction terms and describes a nonradiating electron. Before closing this brief introduction to Caldirola's theory, it is worthwhile to present two more relevant results derived from it; the second one is described in the next subsection. If we consider a free particle and look for the "internal solutions" of Eq. (15), we then get—for a periodical solution of the type

$$\dot x = \beta_0 c\,\sin\!\left(\frac{2\pi\tau}{\tau_0}\right), \qquad \dot y = \beta_0 c\,\cos\!\left(\frac{2\pi\tau}{\tau_0}\right), \qquad \dot z = 0,$$

which describes a uniform circular motion, and by imposing the kinetic energy of the internal rotational motion to equal the intrinsic energy m0c² of the particle—that the amplitude of the oscillations is given by β0² = 3/4. Thus, the magnetic moment corresponding to this motion is exactly the anomalous magnetic moment of the electron, obtained here in a purely classical context (Caldirola, 1954):

$$\mu_a = \frac{1}{4\pi}\,\frac{e^3}{m_0 c^2}.$$

This shows that the anomalous magnetic moment is an intrinsically classical, and not quantum, result; the absence of ℏ in the last expression is a confirmation of this fact.
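A quick cross-check of the expression above: substituting α = e²/ℏc into the leading QED (Schwinger) anomalous moment (α/2π)μ_B, with μ_B = eℏ/2m0c, reproduces e³/(4πm0c²) identically, so the two expressions must agree to machine precision. The sketch below verifies this with standard CGS constants (the constants are not taken from the chapter).

```python
# Check that Caldirola's classical expression mu_a = e^3 / (4 pi m0 c^2)
# coincides with the leading QED anomalous moment (alpha/2pi) * mu_B,
# mu_B = e hbar / (2 m0 c). Substituting alpha = e^2/(hbar c) makes the
# two expressions algebraically identical, so hbar drops out entirely.
import math

e    = 4.80320e-10   # statC
m0   = 9.10938e-28   # g
c    = 2.99792e10    # cm/s
hbar = 1.05457e-27   # erg s

mu_caldirola = e**3 / (4 * math.pi * m0 * c**2)

alpha  = e**2 / (hbar * c)                       # fine-structure constant
mu_qed = (alpha / (2 * math.pi)) * (e * hbar / (2 * m0 * c))

print(mu_caldirola)
print(mu_qed)        # identical up to rounding: hbar cancels
```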
2.5. Hyperbolic Motions

In a review paper on the theories of the electron including radiation-reaction effects, Erber (1961) criticized Caldirola's theory for its results in the case of hyperbolic motion.
Let us recall that the opinion of Pauli and von Laue (among others) was that a charge performing uniformly accelerated motions—for example, an electron in free fall—could not emit radiation (Fulton and Rohrlich, 1960). That opinion was strengthened by the invariance of the Maxwell equations under the group of conformal transformations (Cunningham, 1909; Bateman, 1910; Hill, 1945), which in particular includes transformations from rest to uniformly accelerated motions. However, since the first decades of the twentieth century this had been an open question, as the works by Born and Schott had, on the contrary, suggested a radiation emission in such a case (Fulton and Rohrlich, 1960). In 1960, Fulton and Rohrlich, using Dirac's equation for the classical electron, demonstrated that the electron actually emits radiation when performing a hyperbolic motion (see also Leiter, 1970). A solution of this paradox is possible within Caldirola's theory, and it was derived by Lanz (1962). By analyzing the energy-conservation law for an electron submitted to an external force, and following a procedure similar to that of Fulton and Rohrlich (1960), Lanz obtained Eq. (20). By expanding it in terms of τ and keeping only the first-order terms, he arrived at Eq. (25), identical to the one obtained by Fulton and Rohrlich, in which (we repeat) the Schott energy appears—a term that Fulton and Rohrlich (having obtained it from Dirac's expression for the radiation reaction) interpreted as a part of the internal energy of the charged particle. For the particular case of hyperbolic motion, it is

$$a_\mu a^\mu = c\sqrt{1-\beta^2}\;\frac{da^0}{d\tau},$$

so that there is no net radiation reaction [compare with Eq. (25) or (26)]. However, neither the acceleration energy nor the energy radiated by the charge per unit of proper time, (2/3)(e²/c³)a_μa^μ, is zero. The difference is that in the discrete case this acceleration energy does not exist as such; it comes from the discretized expression for the charged particle's kinetic energy variation. As seen in Eq. (22), the Schott term appears when the variation of the kinetic energy during the fundamental interval of proper time is expanded in powers of τ0:

$$\frac{T(\tau) - T(\tau-\tau_0)}{\tau_0} \simeq \frac{dT}{d\tau} - \frac{d}{d\tau}\left(\frac{2}{3}\,\frac{e^2}{c^2}\,a^0\right).$$

This is an interesting result, since it was not easy to understand the physical meaning of the Schott acceleration energy. With the introduction of the fundamental interval of time, as we know, the changes in the kinetic energy are no longer continuous, and the Schott term merely expresses, to first order, the variation of the kinetic energy when passing from one discrete instant of time to the subsequent one.
In Eqs. (22) and (25), the derivative dT/dτ is a point function, giving the kinetic energy slope at the instant τ. And the dissipative term (2/3)(e²/c³)a_μa^μ is simply a relativistic generalization of the Larmor radiation law: if there is acceleration, then there is also radiation emission. For the hyperbolic motion, however, the energy dissipated (because of the acceleration) has exactly the same magnitude as the energy gain due to the kinetic energy increase. We are not forced to resort to pre-accelerations to justify the origin of such energies (Plass, 1960, 1961). Thus, the present theory provides a clear picture of the physical processes involved in the uniformly accelerated motion of a charged particle.
3. THE HYPOTHESIS OF THE CHRONON IN QUANTUM MECHANICS

Let us now address the main topic of this chapter: the chronon in quantum mechanics. The speculations about the discreteness of time (on the basis of possible physical evidence) in QM go back to the first decades of the twentieth century, and various theories have proposed developing QM on a space-time lattice.14 This is not the case with the hypothesis of the chronon, where we do not actually have a discretization of the time coordinate. In the 1920s, for example, Pokrowski (1928) suggested the introduction of a fundamental interval of time, starting from an analysis of the shortest wavelengths detected (at that time) in cosmic radiation. More recently, for instance, Ehrlich (1976) proposed a quantization of the elementary particle lifetimes, suggesting the value 4.4 × 10⁻²⁴ s for the quantum of time.15 However, a time discretization is suggested by the very foundations of QM. There are physical limits that prevent the distinction of arbitrarily close successive states in the time evolution of a quantum system. Basically, such limitations result from the Heisenberg relations; if a discretization is introduced in the description of a quantum system, it cannot possess a universal value, since those limitations depend on the characteristics of the particular system under consideration. In other words, the value of the fundamental interval of time must change a priori from system to system. All these points make the extension of Caldirola's procedure to QM justifiable. In the 1970s, Caldirola (1976a,b, 1977a,b,c, 1978a) extended the introduction of the chronon to QM, following the same guidelines that had led him to his theory of the electron. So, time is still a continuous variable, but the evolution of the system along its world line is discontinuous. As for

14 See, for example, Cole (1970) and Welch (1976); also compare with Jackson (1977), Meessen (1970), Vasholz (1975), and Kitazoe et al. (1978).
15 See also Golberger and Watson (1962), Froissart et al. (1963), DerSarkissian and Nelson (1969), Cheon (1979), and Ford (1968).
the electron theory in the nonrelativistic limit, one must substitute the corresponding finite-difference expression for the time derivatives; for example,

$$\frac{df(t)}{dt} \;\to\; \frac{f(t) - f(t-\Delta t)}{\Delta t}, \qquad (33)$$
where the proper time is now replaced by the local time t. Such a procedure was then applied to obtain the finite-difference form of the Schrödinger equation. As for the electron case, there are three different ways to perform the discretization, and three "Schrödinger equations" can be obtained (Caldirola and Montaldi, 1979):

$$i\,\frac{\hbar}{\tau}\left[\Psi(x,t) - \Psi(x,t-\tau)\right] = \hat H\,\Psi(x,t), \qquad (34)$$

$$i\,\frac{\hbar}{2\tau}\left[\Psi(x,t+\tau) - \Psi(x,t-\tau)\right] = \hat H\,\Psi(x,t), \qquad (35)$$

$$i\,\frac{\hbar}{\tau}\left[\Psi(x,t+\tau) - \Psi(x,t)\right] = \hat H\,\Psi(x,t), \qquad (36)$$
which are, respectively, the retarded, symmetric, and advanced Schrödinger equations, all of them transforming into the (same) continuous equation when the fundamental interval of time (which can now be called just τ) goes to zero. It can be immediately observed that the symmetric equation is of the second order, while the other two are first-order equations. As in the continuous case, for a finite-difference equation of order n a single and complete solution requires n initial conditions to be specified. The equations are different, and the solutions they provide are also fundamentally different. There are two basic procedures to study the properties of such equations. For some special cases, they can be solved by one of the various existing methods for solving finite-difference equations or by means of an attempt solution, an ansatz. The other method is to find a new Hamiltonian H̃ such that the new continuous Schrödinger equation,

$$i\hbar\,\frac{\partial\Psi(x,t)}{\partial t} = \tilde H\,\Psi(x,t), \qquad (37)$$

reproduces, at the points t = nτ, the same results obtained from the discretized equations. As shown by Casagrande and Montaldi (1977, 1978), it is always possible to find a continuous generating function that makes it possible to obtain a differential equation equivalent to the original finite-difference one, such that at every point of interest their solutions are identical. This procedure is useful because it is generally difficult to work with the finite-difference equations on a qualitative basis. Except for some special cases, they can be solved only numerically. This equivalent
Hamiltonian H̃ is, however, non-Hermitean and is frequently very difficult to obtain. When the Hamiltonian is time-independent, the equivalent Hamiltonian is quite easy to calculate. For the symmetric equation, for example, it is given by

$$\tilde H = \frac{\hbar}{\tau}\,\sin^{-1}\!\left(\frac{\tau}{\hbar}\,\hat H\right). \qquad (38)$$

As expected, H̃ → Ĥ when τ → 0. One can use the symmetric equation to describe the nonradiating electron (bound electron), since for Hamiltonians explicitly independent of time its solutions are always of oscillating character:

$$\Psi(x,t) = \exp\!\left[-i\,\frac{t}{\tau}\,\sin^{-1}\!\left(\frac{\tau}{\hbar}\,\hat H\right)\right] f(x).$$

In the classical theory of electrons, the symmetric equation also represents a nonradiating motion. It provides only an approximate description of the motion, without considering the effects due to the self-fields of the electron. However, in the quantum theory it plays a fundamental role. In the discrete formalism, it is the only way to describe a bound nonradiating particle. The solutions of the advanced and retarded equations show completely different behavior. For a Hamiltonian explicitly independent of time, the solutions of the retarded equation have a general form given by

$$\Psi(x,t) = \left[1 + i\,\frac{\tau}{\hbar}\,\hat H\right]^{-t/\tau} f(x),$$

and, expanding f(x) in terms of the eigenfunctions of Ĥ,

$$\hat H\,u_n(x) = W_n\,u_n(x), \qquad f(x) = \sum_n c_n\,u_n(x),$$
with

$$\sum_n |c_n|^2 = 1,$$

it can be obtained that

$$\Psi(x,t) = \sum_n c_n \left[1 + i\,\frac{\tau}{\hbar}\,W_n\right]^{-t/\tau} u_n(x).$$

In particular, the norm of this solution is given by

$$|\Psi(x,t)|^2 = \sum_n |c_n|^2\,\exp(-\gamma_n t),$$
with

$$\gamma_n = \frac{1}{\tau}\,\ln\!\left(1 + \frac{\tau^2}{\hbar^2}\,W_n^2\right) = \frac{W_n^2}{\hbar^2}\,\tau + O(\tau^3).$$

The presence of a damping factor, depending critically on the value τ of the chronon, must be noted. This dissipative behavior originates from the retarded character of the equation. The analogy with the electron theory also holds: the retarded equation possesses intrinsically dissipative solutions representing a radiating system. The Hamiltonian has the same status as in the continuous case. It is an observable, since it is a Hermitean operator and its eigenvectors form a basis of the state space. However, due to the damping term, the norm of the state vector is no longer constant. An opposite behavior is observed for the solutions of the advanced equation, in the sense that they increase exponentially. Before proceeding, let us mention that the discretized QM (as well as Caldirola and coworkers' approach to "QM with friction," as, for example, in Caldirola and Montaldi (1979)) can find room within the theories based on the so-called Lie-admissible algebras (Santilli, 1979a,b, 1981a,b,c, 1983).16 For a different approach to decaying states, see Agodi et al. (1973) and Recami and Farias (2009).
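The damping of the retarded solutions is easy to reproduce numerically. The dimensionless sketch below (ℏ = 1, with invented values of τ and of a single eigenenergy W) evolves one eigenstate by repeatedly applying the one-step multiplier [1 + iτW/ℏ]⁻¹ and compares the resulting norm with the closed form exp(−γt) quoted above.

```python
# Norm decay for a single eigenstate of energy W under the retarded
# discretization, Eq. (34): psi(t) = [1 + i (tau/hbar) W]^(-t/tau), so
# |psi|^2 = exp(-gamma t) with gamma = ln(1 + tau^2 W^2/hbar^2) / tau.
# Dimensionless sketch: hbar = 1; tau and W are invented illustrative values.
import cmath, math

hbar, tau, W = 1.0, 0.01, 3.0
amp = 1.0 / (1.0 + 1j * tau * W / hbar)      # one-chronon multiplier

n_steps = 500                                # evolve to t = n_steps * tau
psi = amp ** n_steps
norm = abs(psi) ** 2

gamma = math.log(1.0 + (tau * W / hbar) ** 2) / tau
t = n_steps * tau
print(norm)                                  # decayed norm after 500 steps
print(math.exp(-gamma * t))                  # closed form: the same value
```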
3.1. The Mass of the Muon

The most impressive achievement related to the introduction of the chronon hypothesis in the realm of QM comes from the description of a bound electron using the new formalism. Bound states are described by the symmetric Schrödinger equation and a Hamiltonian that does not depend explicitly on time. A general solution can be obtained by using a convenient ansatz:

$$\Psi(x,t) = \sum_n u_n(x)\,\exp(-i\alpha_n t),$$

where Ĥ u_n(x) = E_n u_n(x) gives the spectrum of eigenvalues of the Hamiltonian. If the fundamental interval of time τ corresponds to the chronon θ0 associated with the classical electron, it can be straightforwardly obtained that

$$\alpha_n = \frac{1}{\theta_0}\,\sin^{-1}\!\left(\frac{E_n\theta_0}{\hbar}\right).$$
16 Extensive related work (not covered in the present paper) can also be found in Jannussis (1985a,b, 1990, 1984a), Jannussis et al. (1990, 1983a,b), and Mignani (1983); see also Jannussis et al. (1982a,b, 1981a,b, 1980a,b), Jannussis (1984b,c), and Montaldi and Zanon (1980).
This solution gives rise to an upper limit for the eigenvalues of the Hamiltonian due to the condition

$$\left|\frac{E_n\theta_0}{\hbar}\right| \le 1.$$

Since θ0 is finite, there is a maximum value for the energy of the electron given by

$$E_{\max} = \frac{\hbar}{\theta_0} = \frac{3}{2}\,\frac{\hbar\,m_0 c^3}{e^2} = 105.04\ \mathrm{MeV}.$$

Now, including the rest energy of the electron, we finally get

$$E = E_{\max} + E_0^{\mathrm{electron}} \simeq 105.55\ \mathrm{MeV},$$

which is very close (an error of 0.1%) to the measured value of the rest mass of the muon. The equivalent Hamiltonian method allows extending the basis of eigenstates beyond the critical limit. However, for the eigenvalues above the critical limit, the corresponding eigenstates are unstable and decay in time:

$$\Psi(x,t) = \sum_n c_n\,u_n(x)\,\exp(-i\gamma_n t)\,\exp(-k_n t).$$
As for the retarded equation, the norm of the state vector is not constant and decays exponentially with time for those eigenstates outside the stability range. Caldirola (1976a,b, 1977c) interpreted this norm as indicating the probability of the existence of the particle in its original Hilbert space, and associated a mean lifetime with these states. The considerations regarding the muon as an excited state of the electron can be traced back to the days of its discovery. In particular, it had already been observed that the ratio between the masses of the two particles is almost exactly 3/(2α), where α is the fine-structure constant (Nambu, 1952). It had also been noted that 2α/3 is the coefficient of the radiative reaction term in Dirac's equation for the classical electron (Rosen, 1964, 1978). Bohm and Weinstein (1948) put forward the hypothesis that various kinds of "mesons" could be excited states of the electron. Dirac (1962) even proposed a specific model for an extended electron to interpret the muon as an excited state of the electron.17 Caldirola (1978a, 1977a,b; see also Fryberger, 1981) observed that by means of the Heisenberg uncertainty relations it is possible to associate
17 On this point, also compare the following references: Barut (1978a,b), Motz (1970), Ouchi and Ohmae (1977), Nishijima and Sato (1978), Sachs (1972a,b), Pavsic (1976), Matore (1981), Sudarshan (1961) and Kitazoe (1972).
the existence of the muon as an excited state of the electron with the introduction of the chronon in the theory of the electron. The relation Δt ΔE ≥ ℏ/2 imposes limitations on the determination, at a certain instant, of the energy E associated with the internal motion of the electron. If excited states of the particle corresponding to larger values of mass exist, then it is possible to speak of an "electron with rest mass m0" only when ΔE ≤ (m0* − m0)c², where m0* is the rest mass of the internal excited state. Such internal states could be excited in the presence of sufficiently strong interactions. From the uncertainty relation, we have that

$$\Delta t \ge \frac{\hbar}{2\,(m_0^* - m_0)\,c^2},$$

and, supposing the muon to be an excited state, we get

$$(m_0^* - m_0)\,c^2 \simeq \frac{3}{2}\,\frac{\hbar c}{e^2}\,m_0 c^2.$$

Thus, it can finally be obtained that

$$\Delta t \ge \frac{1}{3}\,\frac{e^2}{m_0 c^3} = \frac{\theta_0}{2}.$$

That is, the value of the rest mass of an interacting electron can be taken only inside an interval of the proper time larger than half a chronon. So, when we take into account two successive states, each one endowed with the same uncertainty Δt, they must then be separated by a time interval of at least 2Δt, which corresponds exactly to the chronon θ0.
3.2. The Mass Spectrum of Leptons

To obtain the mass of the next particle, a possibility to be considered is to take the symmetric equation as describing the muon. According to this naïve argumentation, the equation also foresees a maximum limit for the energy of the eigenstates of the muon. By assuming the equation as successively describing the particles corresponding to these maxima, an expression can be set up for the various limit values, given by

$$E_0^{(n)} = m_0 c^2\left(\frac{3}{2}\,\frac{\hbar c}{e^2} + 1\right)^n = m_0 c^2\left(\frac{3}{2\alpha} + 1\right)^n, \qquad (39)$$

such that, for
Ruy H. A. Farias and Erasmo Recami
n = 0 → E^(0) = 0.511 MeV      (electron)
n = 1 → E^(1) = 105.55 MeV     (muon)
n = 2 → E^(2) = 21801.54 MeV   (heavy lepton?),
the masses of the first excited states can be obtained, including a possible heavy lepton which, according to the experimental results so far, does not seem to exist. Following a suggestion by Barut (1979; see also Tennakone and Pakvasa, 1971, 1972), according to which it should be possible to obtain the excited states of the electron from the coupling of its intrinsic magnetic moment with its self-field, Caldirola (1978b, 1979b, 1980, 1984a) and Benza and Caldirola (1981), considering a model of the extended electron as a micro-universe (Recami, 2002), also succeeded in evaluating the mass of the lepton τ. Caldirola took into account, for the electron, a model of a point-object moving around in a four-dimensional de Sitter micro-universe characterized by
$$c^2 t^2 - x^2 - y^2 - z^2 = c^2\tau_0^2,$$
where τ₀ is the chronon associated with the electron and the radius of the micro-universe is given by a = cτ₀. Considering the spectrum of excited states obtained from the naïve argument above, we find that each excited state determines a characteristic radius for the micro-universe. Thus, for each particle, the trajectory of the point-object is confined to a spherical shell defined by its characteristic radius and by the characteristic radius of its excited state. For the electron, for example, the point-object moves inside the spherical shell defined by its corresponding radius and by the one associated with its excited state: the muon. Such radii are given by
$$a^{(n)} = \tau_0 c\left(\frac{3}{2\alpha} + 1\right)^n. \qquad (40)$$
According to the model, supposing that the intrinsic energy of the lepton $e^{(n)}$ is given by $m^{(n)}c^2$, the lepton moves in its associated micro-universe along a circular trajectory with velocity $\beta = \sqrt{3}/2$, to which corresponds an intrinsic magnetic moment
$$\mu_a^{(n)} = \frac{1}{4\pi}\,\frac{e^2}{m^{(n)}c^2}. \qquad (41)$$
Starting from Barut's suggestion (1979), Caldirola obtained for the lepton $e^{(n)}$ an extra self-energy given by
$$E^{(n,p)} = (2p)^4\, m^{(n)} c^2.$$
The condition imposed on the trajectory of the point-object, so that it remains confined to its corresponding spherical shell, is
$$E^{(n,p)} \lesssim \left(\frac{3}{2\alpha} + 1\right) m_0 c^2,$$
and the values attainable by p are p = 0 for n = 0, and p = 0, 1 for n ≠ 0. The mass spectrum is then finally given by
$$m^{(n,p)} = \left[1 + (2p)^4\right] m^{(n)} = m_0\left[1 + (2p)^4\right]\left(\frac{3}{2\alpha} + 1\right)^n. \qquad (42)$$
Thus, for different values of n and p we have the following:

n   p   m^(n,p)
0   0   0.511 MeV      (electron)
1   0   105.55 MeV     (muon)
1   1   1794.33 MeV    (tau)
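The ladder of Eq. (39) and the full spectrum of Eq. (42) are easy to reproduce numerically; the sketch below (our constants, not the authors' code) recovers the three rows of the table:

```python
# Limit energies of Eq. (39) and mass spectrum of Eq. (42).
alpha = 1 / 137.036     # fine-structure constant (assumed value)
m0c2 = 0.511            # electron rest energy, MeV

def E_limit(n):
    # E0^(n) = m0 c^2 * (3/(2 alpha) + 1)^n
    return m0c2 * (3 / (2 * alpha) + 1) ** n

def mass(n, p):
    # m^(n,p) = [1 + (2p)^4] * m^(n)
    return (1 + (2 * p) ** 4) * E_limit(n)

print(f"E(1)   = {E_limit(1):.2f} MeV")    # muon,  ~105.55
print(f"E(2)   = {E_limit(2):.1f} MeV")    # heavy lepton?, ~21801.5
print(f"m(1,1) = {mass(1, 1):.2f} MeV")    # tau,   ~1794.3
```

Note how the tau value comes out as 17 × 105.55 MeV: the factor 1 + (2p)⁴ = 17 for p = 1.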
It must be noted that the tau appears as an internal excited state of the muon, and its mass is in fair agreement with the experimental value (Hikasa et al., 1992) m_τ ≅ 1784 MeV: the difference between the two values is less than 1%, which is remarkable given the simplicity of the model. The model foresees the existence of other excited states that do not seem to exist. This is to some extent justifiable once the muon is obtained as an excited electron and the description of the electron does not imply the existence of any other state. To obtain the lepton tau it was necessary to introduce into the formalism the coupling of the intrinsic magnetic moment with the self-field of the electron.
3.3. Feynman Path Integrals

The discretized Schrödinger equations can easily be obtained using Feynman's path-integral approach. This is particularly interesting since it gives a clearer idea of the meaning of these equations. According to the hypothesis of the chronon, time is still a continuous variable, and the introduction of the fundamental interval of time is connected only with the reaction of the system to the action of a force. It is convenient to restrict the derivation to the one-dimensional (1D) case, considering a particle under the action of a potential V(x, t). Although the time coordinate is continuous, we assume a discretization of the system (particle) position corresponding to instants separated by time intervals τ (Figure 1). The transition amplitude for a particle going from an initial point (x₁, t₁) of space-time to a final point (xₙ, tₙ) is given by the propagator
FIGURE 1 Discrete steps in the time evolution of the considered system (particle); successive positions x(t_n) are marked at instants separated by t_n − t_{n−1} = τ.
$$K(x_n, t_n; x_1, t_1) = \langle x_n, t_n | x_1, t_1\rangle. \qquad (43)$$
In Feynman's approach this transition amplitude is associated with a path integral, where the classical action plays a fundamental role. It is convenient to introduce the notation
$$S(n, n-1) \equiv \int_{t_{n-1}}^{t_n} dt\, L(x, \dot{x}), \qquad (44)$$
such that $L(x, \dot{x})$ is the classical Lagrangian and S(n, n−1) is the classical action. Thus, for two consecutive instants of time, the propagator is given by
$$K(x_n, t_n; x_{n-1}, t_{n-1}) = \frac{1}{A}\exp\left[\frac{i}{\hbar}\, S(x_n, t_n; x_{n-1}, t_{n-1})\right]. \qquad (45)$$
The path integral is defined as a sum over all the paths that can possibly be traversed by the particle and can be written as
$$\langle x_n, t_n | x_1, t_1\rangle = \lim_{N\to\infty} A^{-N}\int dx_{N-1}\int dx_{N-2}\cdots\int dx_2\, \exp\left[\frac{i}{\hbar}\sum_{n=2}^{N} S(n, n-1)\right], \qquad (46)$$
where A is a normalization factor. To obtain the discretized Schrödinger equations, we must consider the evolution of a quantum state between two consecutive configurations (x_{n−1}, t_{n−1}) and (x_n, t_n). The state of the system at t_n is then given by
$$\Psi(x_n, t_n) = \int_{-\infty}^{+\infty} K(x_n, t_n; x_{n-1}, t_{n-1})\,\Psi(x_{n-1}, t_{n-1})\, dx_{n-1}. \qquad (47)$$
On the other hand, it follows from the definition of the classical action [Eq. (44)] that
$$S(x_n, t_n; x_{n-1}, t_{n-1}) = \frac{m}{2\tau}(x_n - x_{n-1})^2 - \tau V\!\left(\frac{x_n + x_{n-1}}{2}, t_{n-1}\right). \qquad (48)$$
Thus, the state at t_n is given by
$$\Psi(x_n, t_n) = \frac{1}{A}\int_{-\infty}^{+\infty}\exp\left\{\frac{im}{2\hbar\tau}(x_n - x_{n-1})^2 - \frac{i\tau}{\hbar}\, V\!\left(\frac{x_n + x_{n-1}}{2}, t_{n-1}\right)\right\}\Psi(x_{n-1}, t_{n-1})\, dx_{n-1}. \qquad (49)$$
When τ ≈ 0, for x_n appreciably different from x_{n−1}, the quadratic term oscillates rapidly and the integral is rather small. The contributions are considerable only for x_n ≈ x_{n−1}. Thus, we can make the substitution
$$x_{n-1} = x_n + \eta \quad\Longrightarrow\quad dx_{n-1} \to d\eta,$$
such that
$$\Psi(x_{n-1}, t_{n-1}) \simeq \Psi(x_n, t_{n-1}) + \eta\,\frac{\partial\Psi(x_n, t_{n-1})}{\partial x} + \frac{\eta^2}{2}\,\frac{\partial^2\Psi}{\partial x^2} + \cdots.$$
By inserting this expression into Eq. (49), supposing that18
$$V\!\left(x + \frac{\eta}{2}\right) \simeq V(x),$$
and taking into account only the terms to first order in τ, we obtain
$$\Psi(x_n, t_n) = \frac{1}{A}\left(\frac{2i\hbar\pi\tau}{m}\right)^{1/2}\exp\left[-\frac{i}{\hbar}\,\tau V(x_n, t_{n-1})\right]\left[\Psi(x_n, t_{n-1}) + \frac{i\hbar\tau}{2m}\,\frac{\partial^2\Psi}{\partial x^2}\right].$$
Notwithstanding the fact that exp(−iτV(x_n, t_n)/ℏ) is a function defined only for certain well-determined values, it can be expanded in powers of τ around an arbitrary position (x_n, t_n). Choosing A = (2iℏπτ/m)^{1/2}, so that the continuous limit is recovered when τ → 0, we derive
$$\Psi(x_n, t_{n-1} + \tau) - \Psi(x_n, t_{n-1}) = -\frac{i}{\hbar}\,\tau V(x_n, t_{n-1})\Psi(x_n, t_{n-1}) + \frac{i\hbar\tau}{2m}\,\frac{\partial^2\Psi}{\partial x^2} + O(\tau^2). \qquad (50)$$

18 The potential is supposed to vary slowly with x.
By a simple reordering of terms, we finally obtain
$$i\hbar\,\frac{\Psi(x_n, t_{n-1} + \tau) - \Psi(x_n, t_{n-1})}{\tau} = \left[-\frac{\hbar^2}{2m}\,\frac{\partial^2}{\partial x^2} + V(x_n, t_{n-1})\right]\Psi(x_n, t_{n-1}).$$
Following this procedure, we obtain the advanced finite-difference Schrödinger equation, which describes a particle performing a 1D motion under the effect of the potential V(x, t). The solutions of the advanced equation show an amplification factor that may suggest that the particle absorbs energy from the field described by the Hamiltonian in order to evolve in time. In the continuous classical domain the advanced equation can simply be interpreted as describing a positron. However, in the realm of (discrete) nonrelativistic QM, it is more naturally interpreted as representing a system that absorbs energy from the environment.

To obtain the discrete Schrödinger equation, only the terms to first order in τ have been taken into account. Since the limit τ → 0 has not been taken, the equation thus obtained is only an approximation. This fact may be related to another one faced later in this chapter, when considering the measurement problem in QM.

It is interesting to emphasize that, in order to obtain the retarded equation, one may formally regard the propagator as acting backward in time. The conventional procedure in the continuous case always provides the advanced equation: the potential describes a mechanism for transferring energy from a field to the system. The retarded equation can be formally obtained by assuming an inversion of the time order, considering the expression
$$\Psi(x_{n-1}, t_{n-1}) = \frac{1}{A}\int_{-\infty}^{+\infty}\exp\left[-\frac{i}{\hbar}\int_{t_n}^{t_{n-1}} L\, dt\right]\Psi(x_n, t_n)\, dx_n, \qquad (51)$$
which can be rigorously obtained by merely using the closure relation for the eigenstates of the position operator and then redefining the propagator in the inverse time order. With this expression, it is possible to obtain the retarded Schrödinger equation. The symmetric equation can easily be obtained by a similar procedure. An interesting characteristic of these apparently opposed equations is the impossibility of obtaining one from the other by a simple time inversion. The time order in the propagators must be related to the inclusion, in these propagators, of something like the advanced and retarded potentials. Thus, to obtain the retarded equation we can formally consider effects that act backward in time. Considerations such as these, which led to the derivation of the three discretized equations, can supply useful guidelines for comprehending their meaning.
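The qualitative difference between the retarded and symmetric equations can be seen with a toy numerical sketch (ours, not from the text): stepping a single energy eigenstate forward, the retarded rule damps the norm at every step, while the symmetric rule only rotates the phase (below the stability limit τE/ℏ < 1):

```python
# Retarded rule: i hbar [psi(t) - psi(t - tau)] / tau = E psi(t)
#   => psi(t) = psi(t - tau) / (1 + i tau E / hbar)
# Symmetric rule (eigenstate): phase factor exp(-i asin(tau E / hbar)) per step.
import cmath
import math

hbar, tau, E = 1.0, 0.1, 2.0          # toy units; tau * E / hbar = 0.2 < 1
x = tau * E / hbar

psi_ret = 1.0 + 0.0j
for _ in range(100):
    psi_ret /= (1 + 1j * x)           # one retarded time step

psi_sym = cmath.exp(-1j * 100 * math.asin(x))   # symmetric evolution, 100 steps

print(abs(psi_ret))   # < 1: exponential norm damping
print(abs(psi_sym))   # = 1: norm conserved
```

After 100 steps the retarded norm is (1 + τ²E²/ℏ²)^(−50) ≈ 0.14, exactly the decay law that reappears in Eq. (78) below for the damped oscillator.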
3.4. The Schrödinger and Heisenberg Pictures

In discrete QM, as well as in the "continuous" one, the use of the discretized Heisenberg equations is expected to be preferable for certain types of problems. As in the continuous case, the discretized versions of the Schrödinger and Heisenberg pictures are also equivalent. However, we show below that the Heisenberg equations cannot, in general, be obtained by a direct discretization of the continuous equations. First, it is convenient to introduce the discrete time evolution operator for the symmetric equation,
$$\hat{U}(t, t_0) = \exp\left[-\frac{i(t - t_0)}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{H}}{\hbar}\right)\right], \qquad (52)$$
and for the retarded equation,
$$\hat{U}(t, t_0) = \left[1 + \frac{i\tau\hat{H}}{\hbar}\right]^{-(t - t_0)/\tau}. \qquad (53)$$
To simplify the equations, the following notation is used throughout this section:
$$Df(t) \;\leftrightarrow\; \frac{f(t + \tau) - f(t - \tau)}{2\tau}, \qquad (54)$$
$$D_R f(t) \;\leftrightarrow\; \frac{f(t) - f(t - \tau)}{\tau}. \qquad (55)$$
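A quick way to see how the finite-difference operators (54)–(55) behave is to apply them to a plane wave f(t) = exp(iωt); the sketch below (our toy check) confirms that both reduce to the ordinary derivative iωf as τ → 0, the symmetric one with an O(τ²) error:

```python
# Finite-difference operators of Eqs. (54)-(55) acting on exp(i w t).
import cmath

def D(f, t, tau):
    # Symmetric difference, Eq. (54); gives i sin(w tau)/tau * f for a plane wave.
    return (f(t + tau) - f(t - tau)) / (2 * tau)

def D_R(f, t, tau):
    # Retarded difference, Eq. (55).
    return (f(t) - f(t - tau)) / tau

w, t = 1.5, 0.7
f = lambda s: cmath.exp(1j * w * s)

for tau in (1e-1, 1e-3, 1e-5):
    print(abs(D(f, t, tau) - 1j * w * f(t)))   # shrinks as tau -> 0
```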
For both operators above it can easily be demonstrated that, if the Hamiltonian Ĥ is a Hermitian operator, the following equations hold:
$$D\hat{U}(t, t_0) = \frac{1}{i\hbar}\,\hat{U}(t, t_0)\,\hat{H}, \qquad (56)$$
$$D\hat{U}^{\dagger}(t, t_0) = -\frac{1}{i\hbar}\,\hat{U}^{\dagger}(t, t_0)\,\hat{H}. \qquad (57)$$
In the Heisenberg picture, the time evolution is transferred from the state vector to the operator representing the observable, according to the definition
$$\hat{A}^H = \hat{U}^{\dagger}(t, t_0 = 0)\,\hat{A}^S\,\hat{U}(t, t_0 = 0). \qquad (58)$$
In the symmetric case, for a given operator Â^S, the time evolution of the operator Â^H(t) is given by
$$D\hat{A}^H(t) = D\left[\hat{U}^{\dagger}(t, t_0 = 0)\,\hat{A}^S\,\hat{U}(t, t_0 = 0)\right] = \frac{1}{i\hbar}\left[\hat{A}^H, \hat{H}\right], \qquad (59)$$
which has exactly the same form as the equivalent equation for the continuous case. The key feature of the time evolution operator used to derive the expression above is its unitarity. This holds in the symmetric case. For the retarded case, however, this property is no longer satisfied. Another difference from the symmetric and continuous cases is that the state of the system is also time-dependent in the retarded Heisenberg picture:
$$|\Psi^H(t)\rangle = \left[1 + \frac{\tau^2\hat{H}^2}{\hbar^2}\right]^{-(t - t_0)/\tau}|\Psi^S(t_0)\rangle. \qquad (60)$$
By using the property $[\hat{A}, f(\hat{A})] = 0$, it is possible to show that the evolution law for the operators in the retarded case is given by
$$D\hat{A}^H(t) = \left\{\frac{1}{i\hbar}\left[\hat{A}^S(t), \hat{H}^S(t)\right] + D\hat{A}^S(t)\right\}^H. \qquad (61)$$
In short, we can conclude that the discrete symmetric and the continuous cases are formally quite similar, and the Heisenberg equation can be obtained by a direct discretization of the continuous equation. For the retarded and advanced cases, however, this does not hold. The compatibility between the Heisenberg and Schrödinger pictures is analyzed in the appendices. Here we mention that much parallel work has been done by Jannussis et al.: for example, they have studied the retarded, dissipative case in the Heisenberg representation, treating in that formalism the (normal or damped) harmonic oscillator. On this subject, see Jannussis et al. (1980a,b, 1981a,b, 1982a,b) and Jannussis (1984b,c).
3.5. Time-Dependent Hamiltonians

For simplicity, we have restricted the analysis of the discretized equations to time-independent Hamiltonians. When the Hamiltonian is explicitly time-dependent, the situation is similar to the continuous case. It is always difficult to work with such Hamiltonians but, as in the continuous case, the theory of small perturbations can also be applied. For the symmetric equation, when the Hamiltonian is of the form
$$\hat{H} = \hat{H}_0 + \hat{V}(t), \qquad (62)$$
such that V̂ is a small perturbation with respect to Ĥ₀, the resolution method is similar to the usual one. The solutions are equivalent to the continuous solutions accompanied by an exponentially varying term. It is always possible to solve this type of problem using an appropriate ansatz.
However, another factor must be considered, related to the existence of a limit beyond which Ĥ does not have stable eigenstates. For the symmetric equation, the equivalent Hamiltonian is given by
$$\tilde{H} = \frac{\hbar}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{H}}{\hbar}\right). \qquad (63)$$
Thus, as previously stressed, beyond the critical value the eigenvalues are not real and the operator H̃ is no longer Hermitian. Below that limit, H̃ is a densely defined and self-adjoint operator in the subspace ℒ ⊂ L² spanned by the eigenfunctions of H̃. When the limit value is exceeded, the system changes to an excited state and the previous state loses physical meaning. It is therefore convenient to restrict the observables to self-adjoint operators that keep the subspace ℒ invariant. The perturbation V̂ is assumed to satisfy this requirement.

In usual QM it is convenient to work in the interaction representation (Dirac's picture) in order to deal with time-dependent perturbations. In this representation, the evolution of the state is determined by the time-dependent potential V̂(t), while the evolution of the observable is determined by the stationary part of the Hamiltonian Ĥ₀. In the discrete formalism, the time evolution operator defined for Ĥ₀, in the symmetric case, is given by
$$\hat{U}_0(t, t_0) = \exp\left[-\frac{i(t - t_0)}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{H}_0}{\hbar}\right)\right]. \qquad (64)$$
In the interaction picture the state vector is defined, from the state in the Schrödinger picture, as
$$|\Psi^I(t)\rangle = \hat{U}_0^{\dagger}(t)\,|\Psi^S(t)\rangle, \qquad (65)$$
where $\hat{U}_0^{\dagger}(t) \equiv \hat{U}_0^{\dagger}(t, t_0 = 0)$. On the other hand, the operators are defined as
$$\hat{A}^I = \hat{U}_0^{\dagger}(t)\,\hat{A}^S\,\hat{U}_0(t). \qquad (66)$$
Therefore, it is possible to show that, in the interaction picture, the evolution of the state vector is determined by the equation
$$i\hbar\, D\Psi^I(x, t) = \frac{i\hbar}{2\tau}\left[\Psi^I(x, t + \tau) - \Psi^I(x, t - \tau)\right] = \hat{V}^I\,\Psi^I(x, t), \qquad (67)$$
which is equivalent to a direct discretization of the continuous equation. For the operators, we find that
$$D\hat{A}^I(t) = \frac{\hat{A}^I(t + \tau) - \hat{A}^I(t - \tau)}{2\tau} = \frac{1}{i\hbar}\left[\hat{A}^I, \hat{H}_0\right], \qquad (68)$$
which is also equivalent to the continuous equation.
Thus, for the symmetric case, the discrete interaction picture retains the same characteristics as the continuous case for the evolution of the operators and state vectors, provided, obviously, that the eigenstates of Ĥ remain below the stability limit. We can adopt, for the discrete case, a procedure similar to that commonly used in QM to deal with small time-dependent perturbations.

We consider, in the interaction picture, the basis of eigenstates |n⟩ associated with the stationary Hamiltonian Ĥ₀. Then
$$|\Psi(t)\rangle^I = \sum_n \langle n\,|\,\Psi(t)\rangle^I\,|n\rangle = \sum_n c_n(t)\,|n\rangle$$
is the expansion, over this basis, of the state of the system at a certain instant t. It must be noted that the evolution of the state of the system is determined once the coefficients c_n(t) are known. Using the evolution equation [Eq. (67)], we obtain
$$i\hbar\, D\langle n\,|\,\Psi(t)\rangle^I = \sum_m \langle n\,|\,\hat{V}^I\,|\,m\rangle\langle m\,|\,\Psi(t)\rangle^I.$$
Using the evolution operator to rewrite the perturbation V̂ in the Schrödinger picture, we obtain
$$i\hbar\, D c_n(t) = \sum_m c_m(t)\, V_{nm}(t)\,\exp(i\omega_{nm} t), \qquad (69)$$
such that
$$\omega_{nm} = \frac{1}{\tau}\left[\sin^{-1}\!\left(\frac{\tau E_n}{\hbar}\right) - \sin^{-1}\!\left(\frac{\tau E_m}{\hbar}\right)\right],$$
and we obtain the evolution equation for the coefficients c_n(t), the solution of which gives the time evolution of the system. As in usual QM, it is also possible to work with the interaction-picture evolution operator Û^I(t, t₀), defined by
$$|\Psi(t)\rangle^I = \hat{U}^I(t, t_0)\,|\Psi(t_0)\rangle^I,$$
such that Eq. (67) can be written as
$$i\hbar\, D\hat{U}^I(t, t_0) = \hat{V}^I(t)\,\hat{U}^I(t, t_0). \qquad (70)$$
The operator Û^I(t, t₀) must satisfy the initial condition Û^I(t₀, t₀) = 1. Given this condition, the finite-difference equation above has the solution
$$\hat{U}^I(t, t_0) = \exp\left[-\frac{i(t - t_0)}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{V}^I(t)}{\hbar}\right)\right].$$
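For a constant scalar perturbation the closed-form solution above can be checked directly against the finite-difference equation (70); the following sketch (our toy model, scalar V in place of the operator V̂^I) verifies the identity numerically:

```python
# Check that U(t) = exp[-i (t/tau) asin(tau V / hbar)] satisfies
# i hbar D U(t) = V U(t), with D the symmetric difference of Eq. (54),
# for a constant scalar V (an assumption; the text allows V^I(t)).
import cmath
import math

hbar, tau, V = 1.0, 0.2, 2.0        # toy units; tau * V / hbar = 0.4 < 1

def U(t):
    return cmath.exp(-1j * (t / tau) * math.asin(tau * V / hbar))

t = 1.3
lhs = 1j * hbar * (U(t + tau) - U(t - tau)) / (2 * tau)
rhs = V * U(t)
print(abs(lhs - rhs))   # ~0: the identity holds exactly, not just to O(tau)
```

The identity is exact because the symmetric difference of exp(∓iθ per step) produces sin θ/τ, and θ = asin(τV/ℏ) cancels the arcsine.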
A difference from the continuous case, where the approximate evolution operator is given by an infinite Dyson series, is that this approach provides a closed expression. The solution to the problem is obtained by relating the elements of the matrix associated with this operator to the evolution coefficients c_n(t).

In general, the finite-difference equations are harder to solve analytically than the equivalent differential equations, and this difficulty is even more pronounced for the system of equations obtained from the formalism above. An alternative approach is to use the equivalent Hamiltonians (Caldirola, 1977a,b, 1978a; Fryberger, 1981). Once the equivalent Hamiltonian is found, the procedure is the same as in the continuous theory. If the perturbation term V̂ is small, the equivalent Hamiltonian can be written as
$$\tilde{H} = \frac{\hbar}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{H}_0}{\hbar}\right) + \hat{V}(t) = \tilde{H}_0 + \hat{V}(t).$$
In the interaction picture, the state of the system is now defined as
$$|\Psi^I(t)\rangle = \exp\left(i\,\frac{\tilde{H}_0 t}{\hbar}\right)|\Psi^S(t)\rangle, \qquad (71)$$
and the operators are given by
$$\hat{A}^I = \exp\left(i\,\frac{\tilde{H}_0 t}{\hbar}\right)\hat{A}^S\exp\left(-i\,\frac{\tilde{H}_0 t}{\hbar}\right). \qquad (72)$$
The state in Eq. (71) evolves according to the equation
$$i\hbar\,\frac{\partial}{\partial t}|\Psi^I(t)\rangle = \hat{V}^I\,|\Psi^I(t)\rangle, \qquad (73)$$
where V̂^I is obtained according to definition (72). Now, small time-dependent perturbations can be handled by taking into account the time evolution operator defined by
$$|\Psi^I(t)\rangle = \hat{U}^I(t, t_0)\,|\Psi^I(t_0)\rangle. \qquad (74)$$
According to the evolution law [Eq. (73)], we have
$$i\hbar\,\frac{d}{dt}\hat{U}^I(t, t_0) = \hat{V}^I(t)\,\hat{U}^I(t, t_0). \qquad (75)$$
Thus, given that Û^I(t₀, t₀) = 1, the time evolution operator is given either by
$$\hat{U}^I(t, t_0) = 1 - \frac{i}{\hbar}\int_{t_0}^{t}\hat{V}^I(t')\,\hat{U}^I(t', t_0)\, dt'$$
or by
$$\hat{U}^I(t, t_0) = 1 + \sum_{n=1}^{\infty}\left(\frac{-i}{\hbar}\right)^n\int_{t_0}^{t} dt_1\int_{t_0}^{t_1} dt_2\cdots\int_{t_0}^{t_{n-1}} dt_n\,\hat{V}^I(t_1)\,\hat{V}^I(t_2)\cdots\hat{V}^I(t_n),$$
where the evolution operator is obtained in terms of a Dyson series. Drawing a parallel between the elements of the matrix of the evolution operator and the evolution coefficients c_n(t) obtained from the continuous equation equivalent to Eq. (69) requires the use of the basis of eigenstates of the stationary Hamiltonian Ĥ₀. If the initial state of the system is an eigenstate |m⟩ of that operator, then, at a subsequent time, we have
$$c_n(t) = \langle n\,|\,\hat{U}^I(t, t_0)\,|\,m\rangle.$$
The method of the equivalent Hamiltonian is simpler because it takes full advantage of the continuous formalism.
4. SOME APPLICATIONS OF THE DISCRETIZED QUANTUM EQUATIONS

Returning to more general questions, it is interesting to analyze the physical consequences resulting from the introduction of the fundamental interval of time in QM. In this section we apply the discretized equations to some typical problems.
4.1. The Simple Harmonic Oscillator

The Hamiltonian that describes a simple harmonic oscillator does not depend explicitly on time. The introduction of the discretization of the time coordinate does not affect the results obtained from the continuous equation for the spatial branch of the solution. This is always the case when the potential does not have an explicit time dependence. For such potentials, the solutions of the discrete equations are always formally identical to the continuous ones, with changes in the numerical values that depend on the eigenvalues of the Hamiltonian considered and on the value of the chronon associated with the system described. We have the same spectrum of eigenvalues and the same basis of eigenstates, but with the time evolution given by a different expression. For the simple harmonic oscillator, the Hamiltonian is given by
$$\hat{H} = \frac{1}{2m}\hat{P}^2 + \frac{m\omega^2}{2}\hat{X}^2, \qquad (76)$$
to which corresponds the eigenvalue equation
$$\hat{H}\,|u_n\rangle = E_n\,|u_n\rangle, \qquad (77)$$
so that E_n gives the energy eigenvalue spectrum of the oscillator. As mentioned previously, since this Hamiltonian does not depend explicitly on time, there is always an upper limit for the possible values of its energy eigenvalues. In the basis of eigenfunctions of Ĥ, a general state of the oscillator can be written as
$$|\Psi(t)\rangle = \sum_n c_n(0)\,|u_n\rangle\,\exp\left[-i\,\frac{t}{\tau}\sin^{-1}\!\left(\frac{E_n\tau}{\hbar}\right)\right],$$
with c_n(0) = ⟨u_n|Ψ(t = 0)⟩. Naturally, when τ → 0, the solution above recovers the continuous expression, with its time dependence given by exp(−iE_n t/ℏ). Therefore, there is only a small phase difference between the two expressions. For the mean value of an arbitrary observable,
$$\langle\Psi(t)|\hat{A}|\Psi(t)\rangle = \sum_m\sum_n c_m^*(0)\,c_n(0)\,A_{mn}\exp\left[\frac{i}{\hbar}(E_m - E_n)t\right]\exp\left[i\left(E_m^3 - E_n^3\right)\frac{\tau^2 t}{3!\,\hbar^3}\right] + O(\tau^4),$$
with A_{mn} = ⟨u_m|Â|u_n⟩, we obtain an additional phase term that implies a small deviation of the resulting frequencies compared with the Bohr frequencies of the harmonic oscillator. To first approximation, this deviation is given by the term depending on τ² in the expression above.

Of note, the restrictions imposed on the spectrum of eigenvalues of Ĥ mutilate the basis of eigenvectors: the number of eigenvectors becomes finite and no longer constitutes a complete set; therefore, it no longer forms a basis. Eigenstates beyond the upper limit are unstable and decay exponentially with time.

For a time-independent Hamiltonian, the retarded equation always furnishes damped solutions, characteristic of radiating systems. In this case, there are neither stationary solutions nor an upper limit for the energy eigenvalues. The larger the eigenvalue, the larger the damping factor and the more quickly its contribution to the state of the system tends to zero. If we write the state of the oscillator as
$$|\Psi(t)\rangle = \sum_n c_n(0)\,|u_n\rangle\left(1 + \frac{i\tau E_n}{\hbar}\right)^{-t/\tau},$$
which has a norm decaying according to
$$\langle\Psi(t)|\Psi(t)\rangle = \sum_n |c_n(0)|^2\left(1 + \frac{\tau^2 E_n^2}{\hbar^2}\right)^{-t/\tau}, \qquad (78)$$
we have, for an arbitrary observable [with ⟨A(t)⟩ ≡ ⟨Â⟩(t)],
$$\langle A(t)\rangle = \sum_m\sum_n c_m^*(0)\,c_n(0)\,A_{mn}\exp\left\{-\frac{t}{\tau}\ln\left[1 + \frac{\tau^2}{\hbar^2}E_m E_n - \frac{i\tau}{\hbar}(E_m - E_n)\right]\right\}$$
or, to first order in τ,
$$\langle A(t)\rangle = \sum_m\sum_n c_m^*(0)\,c_n(0)\,A_{mn}\exp\left[\frac{i}{\hbar}(E_m - E_n)t\right]\exp\left[-\frac{\tau}{2\hbar^2}\left(E_m^2 + E_n^2\right)t\right],$$
so that, in addition to the Bohr frequencies defining the emission and absorption frequencies of the oscillator, we obtain a damping term that causes the mean value of the observable (which is explicitly independent of time) to tend to zero with time. A cursory analysis shows that even for very small eigenvalues, below 1.0 eV, the damping factor is large, so the decay of the mean values is very fast. The damping factor of the norm in Eq. (78) can be evaluated, and its behavior can be seen in Figure 2.
FIGURE 2 Typical behavior of the damping factor associated with different energy eigenvalues (retarded case); curves shown for E = 6.6 eV, 0.066 eV, and 6.6 × 10⁻¹⁰ eV, over times from 10⁻¹⁰ to 10¹² s.
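The time scales in Figure 2 can be estimated from the first-order form of the damping factor, (1 + τ²E²/ℏ²)^(−t/τ) ≈ exp(−E²τt/ℏ²). The sketch below uses our own constants (electron chronon τ ≈ 6.27 × 10⁻²⁴ s) for the energies plotted in the figure:

```python
# 1/e decay time of the norm, Eq. (78), to first order in tau:
#   <psi|psi> ~ exp(-E^2 tau t / hbar^2)  =>  t_1/e = hbar^2 / (E^2 tau).
hbar = 6.582e-16          # eV * s
tau = 6.266e-24           # chronon of the electron, s (assumed value)

def decay_time(E):
    """Time (s) for the norm to fall by 1/e, for eigenvalue E in eV."""
    return hbar ** 2 / (E ** 2 * tau)

for E in (6.6, 0.066, 6.6e-10):
    print(f"E = {E:g} eV -> 1/e time ~ {decay_time(E):.1e} s")
```

The resulting scales (~10⁻⁹ s for 6.6 eV up to ~10¹¹ s for 6.6 × 10⁻¹⁰ eV) match the spread of the curves in the figure.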
4.2. Free Particle

For a free particle (an electron, for example), the general solution of the symmetric Eq. (35) can be obtained, in the coordinate representation, using as an ansatz the solution for the continuous case. A spectrum of eigenfunctions (plane waves) is obtained, given by
$$\Psi_{\mathbf{p}}(\mathbf{x}, t) = (2\pi\hbar)^{-3/2}\exp\left[-i\alpha(|\mathbf{p}|)t + i\,\frac{\mathbf{p}\cdot\mathbf{x}}{\hbar}\right].$$
Inserting this expression into the symmetric equation, we obtain for the frequency α(|p|)
$$\alpha(|\mathbf{p}|) = \frac{1}{\tau}\sin^{-1}\!\left(\frac{\tau p^2}{2m_0\hbar}\right). \qquad (79)$$
When τ → 0, α(|p|)ℏ coincides with the energy of the particle. As observed for the bound particle, here we also have an upper limit for the spectrum of eigenvalues. The upper limit for the possible values of momentum is given by
$$p \leq p_{\mathrm{Max}} = \sqrt{\frac{2m_0\hbar}{\tau}} \simeq 10\ \mathrm{MeV}/c \qquad (80)$$
for the electron. In other words, there is a limit beyond which the frequencies cease to be real. As in the continuous case, the state of the particle is described by a superposition of the eigenstates and can be written as
$$\Psi(\mathbf{x}, t) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\; c(\mathbf{p})\exp\left[-i\alpha(|\mathbf{p}|)t + i\,\frac{\mathbf{p}\cdot\mathbf{x}}{\hbar}\right].$$
The coefficients c(p) are determined from the initial condition Ψ(x, 0) = Ψ₀(x). From the expression for α, it can be observed that beyond a certain value of p the expression loses meaning: when p ≥ (2m₀ℏ/τ)^{1/2}, the complete solution is defined only if c(p) = 0. From the stationary-phase condition, we have
$$\mathbf{x} = \frac{\mathbf{p}}{m_0}\,\frac{t}{\sqrt{1 - \dfrac{\tau^2 p^4}{4m_0^2\hbar^2}}},$$
and, supposing that c(p) corresponds to a distribution of probabilities with a peak at p = p₀, the wave packet will move in the direction of p₀ with uniform velocity
$$v = \frac{p_0}{m_0}\left(1 - \frac{\tau^2 p_0^4}{4m_0^2\hbar^2}\right)^{-1/2},$$
which coincides with the group velocity of the packet. It can promptly be observed that when p reaches its maximum permitted value, the velocity diverges: v → ∞. Thus, the introduction of a fundamental interval of time does not impose any restriction on the velocity of the particle, although it results in a limit for the canonical momentum of the eigenfunctions. Starting from the stationary-phase condition it is possible to redefine the momentum associated with the particle, so that this new momentum does not suffer any restriction. Thus, one can conclude that free electrons can exist with any energy, differently from what happens with bound electrons. For p > p_max, the frequency α(|p|) fails to be real; its dependence on p is shown in Figure 3. An analysis of Eq. (79) shows that if α(|p|) is complex, then for p ≤ p_max the imaginary component is null and the real part is given by Eq. (79). When p ≥ p_max,
$$\mathrm{Re}(\alpha(p)) = \frac{\pi}{2\tau},$$
$$\mathrm{Im}(\alpha(p)) = -\frac{1}{\tau}\ln\left[\frac{\tau p^2}{2m_0\hbar} + \sqrt{\left(\frac{\tau p^2}{2m_0\hbar}\right)^2 - 1}\,\right],$$
with the real part being a constant and the imaginary one tending logarithmically to −∞. From the expressions above, for p > p_max the eigenstates become unstable, acquiring a time-dependent decay term. When we look for an equivalent Hamiltonian H̃ that, for the continuous Schrödinger equation, supplies equivalent outputs, this is possible only if H̃ is a non-Hermitian operator. It is straightforward to see that this is the case

FIGURE 3 Real and imaginary components of α(|p|) obtained for the symmetric equation, compared with the continuous case.
for H̃ = H₁ + iH₂, with H₁ and H₂ Hermitian and such that H₁|p⟩ = ℏ Re(α(p))|p⟩ and H₂|p⟩ = ℏ Im(α(p))|p⟩.

For the retarded equation, using the same ansatz as in the symmetric case, the damping factor appears for every value of p. There is no limitation on the values of p but, when p → ∞, the real part of α(|p|) tends to the same limit value observed for the symmetric case. Figure 4 illustrates the behavior of the components of α(|p|). The general expression for an eigenfunction is found to be
$$\Psi_{\mathbf{p}}(\mathbf{x}, t) \propto \exp\left[\frac{i\mathbf{p}\cdot\mathbf{x}}{\hbar} - \frac{it}{\tau}\tan^{-1}\!\left(\frac{\tau p^2}{2m\hbar}\right)\right]\exp\left[-\frac{t}{2\tau}\ln\left(1 + \left(\frac{\tau p^2}{2m\hbar}\right)^2\right)\right].$$
Performing a Taylor expansion and keeping only the terms to first order in τ, we obtain the continuous solution multiplied by a damping factor:
$$\Psi_{\mathbf{p}}(\mathbf{x}, t) \propto \exp\left(\frac{i\mathbf{p}\cdot\mathbf{x}}{\hbar} - i\omega t\right)\exp\left(-\frac{1}{2}\omega^2\tau t\right), \qquad (81)$$
where ω = p²/2mℏ is the frequency obtained for the continuous case. The damping term depends only on the Hamiltonian, through the frequency ω, and on the chronon associated with the particle. As the latter is constant for a given particle, that term shows that for very high frequencies the solutions decay quite fast and that, as the system evolves, a decay for smaller frequencies also takes place. The inflection point in Figure 5, delimiting the region of the spectrum where the decay is faster, moves toward smaller frequencies as time passes. The consequence of this decay is a narrowing of the frequency bandwidth relevant for the wave packet describing the particle; this is an echo of the continuous decrease of the energy. As in
FIGURE 4 Real and imaginary components of α(|p|) obtained for the retarded equation, compared with the continuous case.
FIGURE 5 Displacement with time of the inflection point of the damping factor e^{−ω²τt/2}, plotted against ω: for t′ > t the inflection point, located near ω ~ 1/√(τt), moves toward smaller frequencies.
the symmetric case, obtaining an equivalent Hamiltonian is possible only if non-Hermitian operators are considered.

It is worthwhile to reconsider the question of the physical meaning of the three discretized Schrödinger equations. Apparently, the choice of the equation for a particular situation is determined by the restrictions imposed on the system by the boundary conditions. The symmetric equation is used for special situations in which the system neither emits nor absorbs radiation, or does so in a perfectly "balanced" way. This is the case for the electrons in their atomic orbits. The particle is then stable up to a certain energy limit, beyond which the behavior of the states is similar to that of the retarded solutions. For energies far below that limit, the particle behaves almost identically to the continuous case, except that the new frequencies associated with each wave function differ from the continuous frequencies by a factor of order τ². The probability that a particle is found with energy larger than the limit value decreases exponentially with time. For the bound electron, the limit is equivalent to the rest mass of the muon. If a parallel with the classical approach is valid, the symmetric equation describes (1) an isolated system, which does not exchange energy with the surrounding environment, or (2) a situation of perfect thermodynamic equilibrium, in which a perfect balance between absorbed and dissipated energies is verified. In the classical theory of the electron the symmetric equation is only an approximation that ignores the radiation reaction effects. In QM, however, the existence of nonradiating states is related to the very essence of the theory. The symmetric equation shows that, below the critical limit, the states are physically identical to those obtained from the continuous theory: they are nonradiating states.
The retarded equation represents a system that somehow loses energy into the environment. The mechanism of such energy dissipation is related not only to the Hamiltonian of the system but also to some properties of the environment (even the vacuum), as can be inferred from the description of the free particle. From the solutions obtained, it is now observed that time has a well-defined direction of flux and that the frequency composition of the wave packet associated with the particle depends on the instant of time considered. It is, of course, always possible to normalize the state at a certain instant and consider it as an initial state; this is permitted by the formalism. In a strictly rigorous description, however, the frequency spectrum corresponds to a specific instant of time subsequent to the emission. This aspect may be interesting from the point of view of possible experimental verification.
4.3. The Discretized Klein-Gordon Equation (for massless particles)

Another interesting application is the description of a free scalar particle—a scalar or zero-spin "photon"—using a finite-difference form of the Klein-Gordon equation for massless particles. In the symmetric form, the equation \Box A^{\mu} = 0 is written as

\frac{\Psi(x,\,t+2\tau) - 2\Psi(x,\,t) + \Psi(x,\,t-2\tau)}{4c^2\tau^2} - \nabla^2\Psi(x,\,t) = 0.   (82)

Using a convenient ansatz we obtain, for this equation, in the coordinate representation,

\Psi_k(x,\,t) = A\,\exp\!\left[-i\,\frac{t}{2\tau}\cos^{-1}\!\left(1 - 2c^2\tau^2 k^2\right)\right]\exp(ikx),

which can be written as

\Psi_p(x,\,t) = A\,\exp\!\left[-i\,\frac{t}{2\tau}\cos^{-1}\!\left(1 - 2\tau^2 E^2/\hbar^2\right)\right]\exp(ipx/\hbar),

since E^2 = p^2 c^2 and p = \hbar k. Expanding the time exponential in powers of τ, we find, to second order in τ, a solution that is very similar to the continuous expression:

\Psi_p(x,\,t) = A\,\exp\!\left[-\frac{i}{\hbar}\left(E' t - p x\right)\right],

with

E' \simeq E\left(1 + \frac{E^2\tau^2}{6\hbar^2}\right).
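The upper energy limit E ≲ ħ/τ implied by the time exponential can be checked numerically. A minimal sketch (the value of ħ is standard; the chronon values 10⁻¹⁹ s and 10⁻¹³ s are the ones consistent with the 6.6 keV and 0.007 eV estimates discussed below, and the function name is illustrative):

```python
HBAR_EV_S = 6.582119569e-16   # reduced Planck constant, in eV*s

def critical_energy_ev(tau_s):
    """Upper limit E ~ hbar/tau on the energies contributing to the
    wave packet, for a chronon tau given in seconds."""
    return HBAR_EV_S / tau_s

e_photon   = critical_energy_ev(1e-19)   # electromagnetic scale: ~6.6 keV
e_neutrino = critical_energy_ev(1e-13)   # weak-decay scale: ~0.007 eV
```

The two outputs reproduce the orders of magnitude quoted in the text for the scalar "photon" and the scalar neutrino.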
Ruy H. A. Farias and Erasmo Recami
A difference of order τ² is observed between the energy values of the "photons" in the continuous and discrete approaches. The general solution is given by a linear combination of the eigenfunctions found. A priori, the value of the chronon for the particle is not known. The time-dependent exponential term in the expressions above leads to an upper limit for the allowed energy, given by E ≲ ħ/τ. We could suppose that the value of the chronon for this photon is about the fundamental time interval of the electromagnetic interactions, ~10⁻¹⁹ s, resulting in a critical value of approximately 6.6 keV, which is a very low limit. A smaller chronon would increase this limit but, if there is any generality in the classical expression obtained for the electron, we should expect a larger value for this massless particle. If instead of a photon we consider a scalar neutrino, taking for the value of the chronon τ ≈ 10⁻¹³ s—a typical time for the weak decays—the limit for the energy associated with the eigenfunctions is now approximately 0.007 eV. This means that in the composition of the wave packet describing this particle the only contributions come from eigenfunctions whose energies lie below that limit. The eigenfunctions obtained for the Hamiltonian considered are "plane wave" solutions. The dependence of these solutions on energy and time is shown in Figures 6 and 7. For smaller values of τ the decay of the modes with energy above the maximum is faster. Apparently, it seems possible to determine a limiting value for the chronon starting from the uncertainty relations. When describing particles, this could be obtained using the expression

\tau \lesssim \frac{\hbar}{2 m_0 c^2},

which provides for the electron a maximum limit of 6.4 × 10⁻²² s. However, this value is two orders of magnitude larger than the classical value of the chronon for the electron, which is a considerable difference. It is possible to use this relation for a complex system, as shown later. We also need to consider the conditions under which a photon must be supplied in order to be described by the symmetric equation. For the electron, it seems clear that not radiating in a bound state—which is imposed by QM—implies the adoption of the symmetric equation. For the photon (as for a free particle), when the retarded form of the Klein-Gordon equation is used, a solution is also obtained wherein the highest frequencies decay faster than the lowest ones. There is always a tendency for the lowest frequencies to prevail. If we are allowed to assign a physical meaning to such a discretized Klein-Gordon equation, we are also allowed to think that, the farther the light source, the more the spectrum of the emitted light will be shifted toward longer wavelengths, even if the source is at rest with respect to the observer. Thus, we could
FIGURE 6 Solution of the discretized Klein-Gordon equation when the energy is smaller than the critical limit, depicted for different values of energy and time. (a) E < E_M: E = 0.0001 eV; t = 0. Discrete and continuous solutions are identical. (b) E < E_M: E = 0.0001 eV; t = 1 × 10⁻¹⁰ s. Discrete and continuous solutions differ in phase.
obtain a redshift effect as a consequence of the introduction of the chronon, which could be used in the construction of a tired-light theory. . . Finally, we need to point out that the discretization considered for the Klein-Gordon equation does not follow exactly the same procedure that led to the discretized Schrödinger equation, since it is a relativistically invariant equation. We did not change the proper time, but the time coordinate itself, in the discretized form. We considered a discretized version of the Hamiltonian operator by applying the transformations

\mathbf{p} \to \frac{\hbar}{i}\nabla, \qquad H \to i\hbar\Delta,

with Δ as defined in Subsection 3.4, to the Hamiltonian of a relativistic free particle,

H = \sqrt{p^2 c^2 + m^2 c^4},

as usual in the continuous case.
FIGURE 7 Solution of the discretized Klein-Gordon equation when the energy is larger than the critical limit, depicted for different values of energy and time. In this case, the amplitude decay is very fast. (a) For the two upper insets, E > E_M: E = E_M(1 + 1 × 10⁻⁷); t = 1 × 10⁻¹⁰ s; discrete and continuous solutions differ in phase and amplitude. (b) For the two lower insets, E > E_M: E = E_M(1 + 1 × 10⁻⁷). In the left inset, t = 1 × 10⁻⁸ s. The right inset shows the damping of the amplitude with time.
4.4. Time Evolution of the Position and Momentum Operators: The Harmonic Oscillator

It is possible to apply the discretized equations to determine the time evolution of the position and momentum operators, which is rather interesting for the description of the simple harmonic oscillator. To do so, we use the discretized form of the Heisenberg equations which, in the symmetric case, can be obtained by a direct discretization of the continuous equation. Starting from this equation, we determine the coupled Heisenberg equations for the two operators:

\frac{\hat{p}(t+\tau) - \hat{p}(t-\tau)}{2\tau} = -m\omega^2\,\hat{x}(t),   (83)

\frac{\hat{x}(t+\tau) - \hat{x}(t-\tau)}{2\tau} = \frac{1}{m}\,\hat{p}(t).   (84)
Such coupled equations yield two finite-difference equations of second order, whose general solutions are easily obtained. The most immediate way to determine the evolution of these operators is to use the creation and annihilation operators. Keeping the Heisenberg equation, and remembering that for the harmonic oscillator \hat{H} = \hbar\omega\left(\hat{A}^{\dagger}\hat{A} + \tfrac{1}{2}\right), we obtain for the symmetric case:

\frac{\hat{A}(t+\tau) - \hat{A}(t-\tau)}{2\tau} = -i\omega\,\hat{A}(t),   (85)

\frac{\hat{A}^{\dagger}(t+\tau) - \hat{A}^{\dagger}(t-\tau)}{2\tau} = i\omega\,\hat{A}^{\dagger}(t),   (86)

such that

\hat{A}(t) = \hat{A}(0)\,\exp\!\left[-i\,\frac{t}{\tau}\sin^{-1}(\omega\tau)\right],   (87)

\hat{A}^{\dagger}(t) = \hat{A}^{\dagger}(0)\,\exp\!\left[i\,\frac{t}{\tau}\sin^{-1}(\omega\tau)\right],   (88)

where we used the fact that, for t = 0, the Heisenberg and Schrödinger pictures are equivalent: \hat{A}(t{=}0) = \hat{A} = \hat{A}(0) and \hat{A}^{\dagger}(t{=}0) = \hat{A}^{\dagger} = \hat{A}^{\dagger}(0), with \hat{A} and \hat{A}^{\dagger} independent of time. To obtain these equations we considered that, in the nonrelativistic case, there is neither creation nor annihilation of particles, such that we can impose restrictions on the frequencies in the phase term of the operators. For the creation operators, for example, the terms with negative frequencies—associated with antiparticles—are discarded. We can observe that the Number and the Hamiltonian operators are not altered:

\hat{N} = \hat{A}^{\dagger}(t)\hat{A}(t) = \hat{A}^{\dagger}(0)\hat{A}(0),

\hat{H} = \hbar\omega\left(\hat{N} + \frac{1}{2}\right) = \hbar\omega\left(\hat{A}^{\dagger}(0)\hat{A}(0) + \frac{1}{2}\right).

Thus, starting from these operators, we obtain for the symmetric case:

\hat{x}(t) = \hat{x}(0)\cos\!\left[\frac{t}{\tau}\sin^{-1}(\omega\tau)\right] + \frac{\hat{p}(0)}{m\omega}\sin\!\left[\frac{t}{\tau}\sin^{-1}(\omega\tau)\right],

\hat{p}(t) = \hat{p}(0)\cos\!\left[\frac{t}{\tau}\sin^{-1}(\omega\tau)\right] - m\omega\,\hat{x}(0)\sin\!\left[\frac{t}{\tau}\sin^{-1}(\omega\tau)\right],
which differ from the continuous case in that the frequency ω is here replaced by a new frequency (1/τ) sin⁻¹(ωτ) which, for τ → 0, tends to the continuous one. Also, there is now an upper limit for the possible oscillation frequencies, given by ω ≤ 1/τ. Above this frequency the motion becomes unstable, as observed in Figure 8. The existence of a maximum limit for the frequency is equivalent to an upper limit for the energy eigenvalues, given by E_n = (n + ½)ħω ≤ ħ/τ, which is equal to the upper limit obtained using Schrödinger's picture. Since

\frac{1}{\tau}\sin^{-1}(\omega\tau) \simeq \omega + \frac{1}{3!}\,\omega^3\tau^2 + O(\tau^4),

the difference expected in the behavior of the oscillator with respect to the continuous solution is quite small. For example, if we take the vibration frequency of the hydrogen molecule (H₂), we have ω ≈ 10¹⁴ Hz, while the term of second order in τ is smaller than 10⁻³ Hz (if the analogy
FIGURE 8 Phase space of the harmonic oscillator when ω > τ⁻¹, described by the discrete equations, with time intervals that are multiples of τ. (a) If time is regarded as intrinsically discrete, only the points where the lines touch one another are meaningful. (b) If time is regarded as intrinsically continuous, this inset shows the behavior of the oscillator described by the discrete equations. (c) In the actually continuous case, no modification is expected with respect to the ordinary case, under the present hypothesis.
with the classical theory is valid, the chronon is expected to be smaller for more massive systems). In terms of average values we have, for the position operator,

\langle\hat{x}(t)\rangle = \langle\hat{x}(t)\rangle_{\text{cont}} + \frac{\omega^2\tau^2}{3!\,m}\,t\,\langle\hat{p}(t)\rangle,

so that the term of order τ² is expected to be considerably smaller than the mean value for the continuous case. At this point, the mean values are determined taking for the system a state composed of a superposition of stationary states. For the stationary states |u_n⟩ themselves, the mean values of \hat{x} and \hat{p} are zero. For the retarded case the solutions can be obtained using the time evolution operators for the Heisenberg equation (Appendix A). As expected, decaying terms appear in the resulting expressions. The creation and annihilation operators obtained for this case are then given by

\hat{A}(t) = \hat{A}(0)\left[1 + i\omega\tau + \xi\,\omega^2\tau^2\right]^{-t/\tau} \simeq \hat{A}(0)\,e^{-i\omega t}\,\exp\!\left[-\left(\xi + \frac{1}{2}\right)\omega^2\tau t\right],

\hat{A}^{\dagger}(t) = \hat{A}^{\dagger}(0)\left[1 - i\omega\tau + \xi\,\omega^2\tau^2\right]^{-t/\tau} \simeq \hat{A}^{\dagger}(0)\,e^{\,i\omega t}\,\exp\!\left[-\left(\xi + \frac{1}{2}\right)\omega^2\tau t\right],

with ξ a real positive factor. The relation (\hat{A}^{\dagger})^{\dagger} = \hat{A} continues to be valid, but the Number operator and, consequently, the Hamiltonian, are no longer constant:

\hat{N}(t) = \hat{A}^{\dagger}(t)\hat{A}(t) = \hat{A}^{\dagger}(0)\hat{A}(0)\left[1 + \omega^2\tau^2\left(1 + 2\xi + \xi^2\omega^2\tau^2\right)\right]^{-t/\tau} \simeq \hat{N}(0)\,\exp\!\left[-2\left(\xi + \frac{1}{2}\right)\omega^2\tau t\right],

\hat{H}(t) = \hbar\omega\left[\hat{N}(t) + \frac{1}{2}\right].
Taking into account the terms to second order in τ, we find that the oscillation frequencies also decay with time. These results are consistent with the fact that the system is emitting radiation, with the consequent reduction of its total energy. However, it is remarkable that the energy of the quanta associated with the creation and annihilation operators is not constant, even if its variation rate is tiny. In the same way, when we calculate the position and momentum operators, a damping factor is obtained.
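The stability threshold ω ≤ 1/τ noted above for the symmetric equations (83)-(84) can be seen by directly iterating the finite-difference system. The following is a c-number sketch in arbitrary units (an illustration of the recursion, not the operator evolution itself): the iteration stays bounded for ωτ < 1 and blows up for ωτ > 1.

```python
import math

def evolve(omega_tau, steps, m=1.0, omega=1.0):
    """Iterate the symmetric finite-difference equations
        x(t+tau) - x(t-tau) = 2*tau*p(t)/m
        p(t+tau) - p(t-tau) = -2*tau*m*omega**2*x(t)
    and return the largest |x| reached."""
    tau = omega_tau / omega
    # Seed the two required time levels with the continuous solution.
    x_prev, p_prev = 1.0, 0.0                      # t = 0
    x_cur = math.cos(omega * tau)                  # t = tau
    p_cur = -m * omega * math.sin(omega * tau)
    max_abs_x = 1.0
    for _ in range(steps):
        x_next = x_prev + 2.0 * tau * p_cur / m
        p_next = p_prev - 2.0 * tau * m * omega**2 * x_cur
        x_prev, p_prev = x_cur, p_cur
        x_cur, p_cur = x_next, p_next
        max_abs_x = max(max_abs_x, abs(x_cur))
    return max_abs_x

stable   = evolve(omega_tau=0.1, steps=2000)   # omega < 1/tau: bounded oscillation
unstable = evolve(omega_tau=1.5, steps=60)     # omega > 1/tau: exponential growth
```

The dispersion relation of the recursion is sin(Ωτ) = ±ωτ, so Ω is real (bounded motion) only for ωτ ≤ 1, in agreement with Figure 8.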
FIGURE 9 (a) Damping factors associated with the Number operator calculated for a few frequencies. (b) Damping of the oscillations for the harmonic oscillator described by the retarded equation.
Figure 9a shows the strange damping factor associated with the Number operator. This damping occurs within a period of time that is characteristic for each frequency, being slower and postponed for lower frequencies. Figure 9b shows the damping of the oscillations as described by the retarded equation. Once the expressions for the position and momentum operators are determined, we obtain, to first order in τ,

\langle\hat{x}(t)\rangle = \left\{\hat{x}(0)\cos(\omega t) + \frac{\hat{p}(0)}{m\omega}\sin(\omega t)\right\}\exp\!\left[-\left(\xi + \frac{1}{2}\right)\omega^2\tau t\right],

that is,

\langle\hat{x}(t)\rangle = \langle\hat{x}(t)\rangle_{\text{cont}}\,\exp\!\left[-\left(\xi + \frac{1}{2}\right)\omega^2\tau t\right].
Taking into account the higher-order terms, we can observe a small variation in the oscillation frequency, just as observed in the symmetric case. The introduction of time-independent perturbations does not cause any additional variations beyond those found even in the continuous case. We note that the results obtained with this procedure are in agreement with those obtained following Schrödinger's picture.
4.5. Hydrogen Atom

The hydrogen atom is basically a system made up of two particles attracting each other through the Coulomb force, which is therefore inversely proportional to the square of the distance between them. The basic Hamiltonian is

\hat{H}_0 = \frac{\hat{P}^2}{2\mu} - \frac{e^2}{\hat{R}},   (89)

and is composed of the kinetic energy of the atom in the center-of-mass frame and of the Coulomb electrostatic potential (μ is the reduced mass of the electron-proton system). A more complete description is obtained by adding correction terms (fine structure) to the Hamiltonian, including relativistic effects such as the variation of the electron mass with velocity and the coupling of the intrinsic magnetic moment of the electron with the magnetic field due to its orbit (spin-orbit coupling). There are also the hyperfine corrections, which appear as a result of the interaction of the electron with the intrinsic magnetic moment of the proton, and, finally, the Lamb shift, due to the interaction of the electron with the fluctuations of the quantized electromagnetic field. The Hamiltonian can finally be written as (Cohen-Tannoudji et al., 1977)

\hat{H}_I = m_e c^2 + \hat{H}_0 - \frac{\hat{P}^4}{8 m_e^3 c^2} + \frac{e^2}{2 m_e^2 c^2\,\hat{R}^3}\,\hat{L}\cdot\hat{S} + \hat{H}_{\text{hf}} + \hat{H}_{\text{Lamb}}.   (90)
The introduction of the magnetic moment of the nucleus through the hyperfine correction causes the total angular momentum to be F = J + I. The Hamiltonian does not depend explicitly on time, such that, for the symmetric Schrödinger equation

i\,\frac{\hbar}{2\tau}\left[\Psi(x,\,t+\tau) - \Psi(x,\,t-\tau)\right] = \hat{H}_I\,\Psi(x,\,t),   (91)

we obtain, using separation of variables, the following uncoupled equations:
\hat{H}_I\,\Phi(x) = E\,\Phi(x),

i\,\frac{\hbar}{2\tau}\left[T(t+\tau) - T(t-\tau)\right] = E\,T(t),

with the general solution

\Psi(x,\,t) = \Phi(x)\,\exp\!\left[-i\,\frac{t}{\tau}\sin^{-1}\!\left(\frac{\tau E}{\hbar}\right)\right].   (92)
The difference with respect to the continuous case appears only in those aspects involving the time evolution of the states. Since the Hamiltonian is time independent, its eigenvalues are exactly the same as those obtained in the continuous case (Cohen-Tannoudji et al., 1977):

E_{(n,j)} \simeq m_e c^2 - \frac{1}{2n^2}\,\mu c^2\alpha^2 - \frac{m_e c^2\alpha^4}{2n^4}\left(\frac{n}{j + \frac{1}{2}} - \frac{3}{4}\right) + E_{\text{hf}} + E_{\text{Lamb}}.

A situation in which a difference between the two cases can appear is in taking into account the probabilities of transition between the eigenstates for an atom subjected to a time-dependent potential. In the discrete approach, it is possible to use the method of the equivalent Hamiltonian to obtain the transition probabilities. As mentioned previously (Subsection 3.5), the problem is treated using the conventional approximate methods for time-dependent perturbations. If we consider, for example, the nonrelativistic interaction of an atom with an electromagnetic field described by the vector potential A(x, t), we have for the low-intensity limit, in the Coulomb gauge, the Hamiltonian

\hat{H}(t) = \hat{H}_I + \hat{V}(t) = \hat{H}_I - \frac{e}{m_e c}\,\hat{A}(\hat{R},\,t)\cdot\hat{P},   (93)

where the potential term is taken as the perturbation. If we consider that the potential describes a monochromatic field of a plane wave, then

\hat{A}(x,\,t) = A_0\,\hat{e}\left[\exp\!\left(i\omega\,\frac{\hat{n}\cdot x}{c} - i\omega t\right) + \exp\!\left(-i\omega\,\frac{\hat{n}\cdot x}{c} + i\omega t\right)\right],   (94)

where \hat{e} is the linear polarization of the field and \hat{n} is the propagation direction. The term depending on (−iωt) corresponds to the absorption of a quantum of radiation ħω, and the (+iωt) term to stimulated emission. Let us assume that the system is initially in an eigenstate |\Phi_i\rangle of the time-independent Hamiltonian. Keeping only the perturbations to first order in \hat{V}(t), we obtain

c_n^{(1)}(t) = -\frac{i}{\hbar}\int_0^t \exp(i\omega_{ni} t')\,V_{ni}(t')\,dt',
where ω_{ni} in the discrete case is given by

\omega_{ni} = \frac{1}{\tau}\left[\sin^{-1}\!\left(\frac{\tau E_n}{\hbar}\right) - \sin^{-1}\!\left(\frac{\tau E_i}{\hbar}\right)\right].

Working with the absorption term, we get

c_n^{(1)}(t) = \frac{i e A_0}{m_e c\,\hbar}\,\langle\Phi_n|\,e^{\,i\omega\hat{n}\cdot x/c}\,(\hat{e}\cdot\hat{p})\,|\Phi_i\rangle \int_0^t \exp\!\left[i(\omega_{ni} - \omega)t'\right] dt'.

Thus, the probability of transition from the initial state |\Phi_i\rangle to the final state |\Phi_f\rangle is given by

P_{fi}(t) = |c_f^{(1)}(t)|^2 = \frac{e^2|A_0|^2}{m_e^2 c^2\hbar^2}\,\left|\langle\Phi_f|\,e^{\,i\omega\hat{n}\cdot x/c}\,(\hat{e}\cdot\hat{p})\,|\Phi_i\rangle\right|^2\,\left|\int_0^t \exp\!\left[i(\omega_{fi} - \omega)t'\right] dt'\right|^2,

or

P_{fi}(t) = \frac{4 e^2|A_0|^2}{m_e^2 c^2\hbar^2}\,\left|\langle\Phi_f|\,e^{\,i\omega\hat{n}\cdot x/c}\,(\hat{e}\cdot\hat{p})\,|\Phi_i\rangle\right|^2\,\frac{\sin^2\!\left[(\omega_{fi} - \omega)t/2\right]}{(\omega_{fi} - \omega)^2},

so that the determination of the matrix elements of the spatial term, using the electric dipole approximation, provides the selection rules for the transitions. What is remarkable in this expression is the presence of a resonance, showing a larger probability for the transition when

\omega = \omega_{fi} = \frac{1}{\tau}\left[\sin^{-1}\!\left(\frac{\tau E_f}{\hbar}\right) - \sin^{-1}\!\left(\frac{\tau E_i}{\hbar}\right)\right].   (95)

This expression is formally different from the one obtained in the continuous approach. When we expand it in powers of τ, we obtain

\omega \simeq \frac{E_f - E_i}{\hbar} + \frac{1}{6}\,\frac{E_f^3 - E_i^3}{\hbar^3}\,\tau^2.   (96)

The first term supplies the Bohr frequencies, as in the continuous case; the second, the deviation in the frequencies caused by the introduction of the time discretization:

\Delta\omega_{fi} = \frac{1}{6}\,\frac{E_f^3 - E_i^3}{\hbar^3}\,\tau^2.

If we consider the chronon of the classical electron, τ ≈ 6.26 × 10⁻²⁴ s, it is possible to estimate the deviation in the frequency due to the time discretization. Then, for the hydrogen atom,

\Delta\omega_{fi} \simeq 2.289\times 10^{-2}\,\left(E_f^3 - E_i^3\right)\ \text{s}^{-1},

with the energies expressed in eV.
If we take into account, for example, the transitions corresponding to the first lines of the Lyman and Balmer series, that is, from the unperturbed states n = n_i to n = n_f, we have

  n_i   n_f   ΔE (eV)   ν (Hz)          Δν_D (Hz)
  1     2     10.2      2.465 × 10¹⁵    ≈10
  1     3     12.1      2.922 × 10¹⁵    ≈10
  1     4     12.75     3.082 × 10¹⁵    ≈10
  2     3     1.89      4.566 × 10¹⁴    <1
where ΔE is the energy difference between the states, ν is the frequency of the photon emitted in the transition, and Δν_D is the frequency deviation due to the discretization. Such deviations are always considerably small. We must remember that the hyperfine corrections and those due to the Lamb shift are of the order of 1 GHz. For the transition n = 1 → n = 2, for example, the correction due to the Lamb shift is approximately 1.06 GHz. Larger deviations caused by the discretization occur for monoelectronic atoms with larger atomic numbers. For the first transition the deviation is approximately 90 Hz for ₂He, 1.1 kHz for ₃Li, and 420 kHz for ₆C. However, these deviations are still much smaller than the one due to the Lamb shift; that is also the case for the muonic atoms. For a muonic atom with a proton as the nucleus, using for the chronon a value derived from the classical expression for the electron (τ_μ = 3.03 × 10⁻²⁶ s), the deviation is 1.4 kHz for the transition n = 1 → n = 2. For that transition, the frequency of the emitted radiation is 4.58 × 10¹⁷ Hz.
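The Δν_D values in the table can be reproduced from Eq. (96). A minimal sketch (the Bohr level formula E_n = −13.6/n² eV and the constants are standard; the function name is illustrative):

```python
import math

HBAR = 6.582119569e-16   # reduced Planck constant, eV*s
TAU  = 6.26e-24          # classical chronon of the electron, s

def hydrogen_level(n):
    """Unperturbed hydrogen energy level, in eV (Bohr formula)."""
    return -13.6 / n**2

def discretization_shift_hz(ni, nf):
    """Frequency deviation (Hz) of the line ni -> nf predicted by
    Eq. (96): delta_omega = (E_f^3 - E_i^3) * tau^2 / (6 hbar^3)."""
    ei, ef = hydrogen_level(ni), hydrogen_level(nf)
    d_omega = (ef**3 - ei**3) * TAU**2 / (6.0 * HBAR**3)
    return d_omega / (2.0 * math.pi)

lyman_alpha  = discretization_shift_hz(1, 2)   # ~9 Hz, i.e. "about 10 Hz"
balmer_alpha = discretization_shift_hz(2, 3)   # well below 1 Hz
```

The outputs match the "≈10 Hz" and "<1 Hz" entries of the table.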
For the retarded equation, a difference with respect to the symmetric case appears in the time evolution of the states. The procedure is identical to the one used above, and the general solution is now given by

\Psi(x,\,t) = \Phi(x)\left[1 + i\,\frac{\tau E}{\hbar}\right]^{-t/\tau},

so that the transitions now occur with frequencies given by

\omega = \omega_{fi} = -\frac{i}{\tau}\left[\ln\!\left(1 + \frac{i\tau E_f}{\hbar}\right) - \ln\!\left(1 + \frac{i\tau E_i}{\hbar}\right)\right].   (97)
As results from the characteristics of the retarded equation, this is a complex frequency. The real component of this frequency can be approximated by

\text{Re}(\omega_{fi}) \simeq \frac{E_f - E_i}{\hbar} - \frac{1}{3}\,\frac{E_f^3 - E_i^3}{\hbar^3}\,\tau^2,

where the first term is the expression for the continuous case. For the particular transition n = 1 → n = 2, the deviation due to the discretization is about 18 Hz. The imaginary component, on the other hand, can be approximated by

\text{Im}(\omega_{fi}) \simeq -\frac{E_f^2 - E_i^2}{2\hbar^2}\,\tau.
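The 18 Hz figure for the retarded case follows from the real-part expansion of Eq. (97), whose second-order term carries a coefficient 1/3 instead of the 1/6 of the symmetric case. A quick numerical check (constants and hydrogen level values are standard; variable names are illustrative):

```python
import math

HBAR = 6.582119569e-16   # reduced Planck constant, eV*s
TAU  = 6.26e-24          # classical chronon of the electron, s

E_i = -13.6          # hydrogen n = 1 level, eV
E_f = -13.6 / 4.0    # hydrogen n = 2 level, eV

# Magnitude of the retarded-case deviation of Re(omega_fi) from the
# Bohr value: |(1/3)(E_f^3 - E_i^3) tau^2 / hbar^3|, converted to Hz.
dev_rad_s = abs(E_f**3 - E_i**3) * TAU**2 / (3.0 * HBAR**3)
dev_hz = dev_rad_s / (2.0 * math.pi)   # ~18 Hz
```

This is exactly twice the symmetric-case deviation for the same line, as expected from the 1/3 versus 1/6 coefficients.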
In the expression for the transition probability, we have the magnitude of an integral involving the time dependence of the general solution. In this case, the characteristic damping causes the probability to tend to a fixed, nonzero value. An example of such behavior is shown in Figure 10, which shows the variation of the time-dependent term between an initial instant t₀ = 0 and some hundred chronons later. To observe the decay of the amplitude factor we have used a larger value for the chronon, 10⁻¹⁸ s. The decay is slower when the chronon is of the order of the one we have been considering for the electron. As can be observed, the effect of the time discretization on the emission spectrum of hydrogen is extremely small. Using the expressions obtained above, we can estimate that, in order for the effect of the time discretization to be of the same order as the Lamb shift, the chronon associated with the electron should be ~10⁻¹⁸ s, far larger than the classical value (but close to the typical interval of the electromagnetic interactions!). In any case, it should be remembered that the Lamb shift measurements do not seem to be in full agreement with quantum electrodynamics (see, for example, Lundeen and Pipkin, 1981). As we conclude this subsection, it is noteworthy that, for a time-independent Hamiltonian, the outputs obtained in the discrete formalism using the symmetric equation are very similar to those from the
0.00567852 6.26e − 018
3.756e − 015
FIGURE 10 Behavior of the time-dependent component of the transition probability as a function of time (in seconds).
continuous case. For such Hamiltonians, as we know, the effect of the discretization appears basically in the frequencies associated with the time-dependent term of the wave function. As already observed, the difference in the time dependence is of the type

\exp\!\left[-i E_n (t - t_0)/\hbar\right] \;\to\; \exp\!\left[-i \sin^{-1}\!\left(\frac{\tau E_n}{\hbar}\right)(t - t_0)/\tau\right].

The discretization causes a change in the phase of the eigenstate, which can be quite large. The eigenfunctions individually describe stationary states, so that the time evolution appears when we take a linear combination of such functions to describe the state of the system. This state evolves according to

|\Psi(t)\rangle = \sum_n c_n(0)\,\exp\!\left[-i \sin^{-1}\!\left(\frac{\tau E_n}{\hbar}\right)(t - t_0)/\tau\right]\,|\phi_n\rangle,

considering that H|\phi_n\rangle = E_n|\phi_n\rangle is the eigenvalue equation associated with the Hamiltonian. When the stationary states of a particle under, for example, 1D square potentials are studied, the same reflection and transmission coefficients and the same tunnel effect are obtained, since they are calculated starting from the stationary states. When we consider a linear superposition of these stationary states, building a wave packet, the time-dependent terms must be taken into account, resulting in some differences with respect to the continuous case. Some attempts have been made to determine significant measurable differences between the two formalisms (compare Wolf, 1987a,b and Wolf, 1989a,b), but no encouraging case has been found yet.
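The size of that phase change is easy to estimate from the difference between the discrete phase frequency sin⁻¹(τE/ħ)/τ and the continuous one E/ħ. A minimal sketch (constants are standard; the choice E = m_ec² for the electron is just an illustration):

```python
import math

HBAR = 6.582119569e-16   # reduced Planck constant, eV*s
TAU  = 6.26e-24          # classical chronon of the electron, s

E = 0.511e6              # electron rest energy, eV
y = TAU * E / HBAR       # dimensionless argument tau*E/hbar

# Angular-frequency difference between the discrete and continuous
# phases; to leading order this is E^3 * tau^2 / (6 hbar^3).
shift = (math.asin(y) - y) / TAU   # rad/s
```

The result, of order 10¹⁵ rad/s, confirms that the phase drift can indeed be quite large even though the relative frequency correction is only of order τ².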
5. DENSITY OPERATORS AND THE COARSE-GRAINING HYPOTHESIS

5.1. The "Coarse-Graining" Hypothesis

First, it is convenient to present a brief review of some topics related to the introduction of the coarse-grained description of a physical system. This hypothesis will then be used to obtain a discretized form of the Liouville equation, which represents the evolution law of the density operators in the usual QM. An important remark is that the introduction of a fundamental interval of time is perfectly compatible with a coarse-grained description. The basic premise of such a description, in statistical physics, is the impossibility of a precise determination of the position and momentum of each
particle forming the system at a certain instant of time. Let us consider, for the sake of simplicity, a system composed of N similar pointlike particles, each with three degrees of freedom described by the coordinates (q₁, q₂, q₃). We can associate with this ensemble of particles an individual phase space (the so-called μ-space) defined by the six coordinates (q₁, q₂, q₃; p₁, p₂, p₃), so that the system as a whole is represented by a crowd of points in this space. Since macroscopic observation is unable to precisely determine the six coordinates for each particle, let us assume that it is possible to know only whether a given particle has its coordinates inside the intervals (qᵢ, qᵢ + δqᵢ) and (pᵢ, pᵢ + δpᵢ), with i = 1, 2, 3. In order to describe the state of the system in the μ-space, we divide it into cells corresponding to the macroscopic uncertainties δqᵢ and δpᵢ, each one occupying in the μ-space a volume defined as

w_i = \delta q_1\,\delta q_2\,\delta q_3\,\delta p_1\,\delta p_2\,\delta p_3.   (98)
These cells must be sufficiently small relative to the macroscopically measurable dimensions, but also sufficiently large to contain a great number of particles. When considering the system as a whole, its macroscopic state is given by a collection of numbers nᵢ corresponding to the number of particles inside each cell. Now, if we take into account the 6N-dimensional phase space Γ, in which each of the states assumed by the system is represented by a point, to each configuration {nᵢ} there corresponds in Γ a cell with volume given by

(\delta V)_{\Gamma} = \prod_i (w_i)^{n_i}.
Considering that the permutation of the particles inside the cells of the Γ-space does not change the macroscopic state of the system, to each collection of numbers nᵢ there corresponds a volume Ω_n in the Γ-space¹⁹ given by

W(\Omega_n) = \frac{N!}{\prod_i n_i!}\,\prod_i (w_i)^{n_i}, \qquad \sum_i n_i = N.

The state of the system is determined by the star occupied by the representative point of the system in the Γ-space. This way, macroscopically, it is only possible to distinguish in which "star" the system is located, such that any point in this star corresponds to the same macroscopic state. When we consider a system that is not in equilibrium, a change in its macroscopic state can be observed only when the point describing the
¹⁹ Jancel (1969) calls it a star.
system changes star. The crossing time is small but finite. During this period of time the macroscopic state of the system does not change, notwithstanding the fact that its microscopic state is continuously changing. Thus, from the point of view of statistical physics, the introduction of a fundamental interval of time appears very naturally. That is still more significant when we remember that the predictions of QM are always obtained as mean values of observables. The uncertainty relations, according to the usual interpretation of QM—the Copenhagen interpretation—are independent of the arguments above. If we accept that they play a fundamental role in the microscopic world—and this is postulated by Copenhagen—then the concept of chronon, as a fundamental interval of time, must be related to them.
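The combinatorial weight W(Ω_n) above is just a multinomial coefficient times the cell volumes. A small sketch with toy numbers (the occupation list and the equal-cell-volume assumption w_i = w are illustrative, not from the text):

```python
import math

def star_weight(occupations, cell_volume=1.0):
    """Weight of the 'star' with the given occupation numbers:
    N!/(n_1! n_2! ...) * w^N, assuming all cells share volume w."""
    n_total = sum(occupations)
    perms = math.factorial(n_total)
    for n in occupations:
        perms //= math.factorial(n)   # exact: partial quotients are integers
    return perms * cell_volume**n_total

# N = 4 particles distributed as (2, 1, 1) over three cells:
# 4!/(2! 1! 1!) = 12 distinct microscopic assignments.
weight = star_weight([2, 1, 1])
```

Permuting the four particles among themselves produces 12 distinct microstates, all belonging to the same macroscopic state, which is exactly why a star, and not a single point, represents the macrostate.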
5.2. Discretized Liouville Equation and the Time-Energy Uncertainty Relation

An attempt to establish a relationship between the chronon and the time-energy uncertainty relation was put forward by Bonifacio (1983), extending the coarse-graining hypothesis to the time coordinate. In conventional QM the density operator evolves according to the Liouville-von Neumann equation:

\frac{\partial\hat{\rho}}{\partial t} = -i\hat{L}\,\hat{\rho}(t) = -\frac{i}{\hbar}\left[\hat{H},\,\hat{\rho}\right],   (99)

where \hat{L} is the Liouville operator. One can immediately observe that, if H is time independent, the solution is given by

\hat{\rho}(T) = \exp\!\left(-i\,\frac{H}{\hbar}\,T\right)\hat{\rho}(0)\,\exp\!\left(i\,\frac{H}{\hbar}\,T\right),   (100)

which gives the time evolution of the density operator starting from an initial time t₀, such that T = t − t₀ is the evolution time. When we build a coarse-grained description of the time evolution, by introducing a graining of value τ such that the evolution time is now given by T = kτ (k = 1, 2, . . ., ∞), the resulting density operator ρ does not satisfy the continuous Eq. (99) but a discretized form of it given by

\frac{\hat{\rho}(t) - \hat{\rho}(t-\tau)}{\tau} = -i\hat{L}\,\hat{\rho}(t),   (101)

with t = kτ, which reduces to the Liouville-von Neumann equation when τ → 0. In the energy representation |n⟩, once certain conditions ensuring that ρ(k) is a density operator are satisfied, Eq. (101) rules for ρ an evolution that preserves trace, obeys the semigroup law, and is an irreversible evolution toward a stationary diagonal form. In other words, we observe
a reduction of state in the same sense as in the measurement problem of QM. This reduction is not instantaneous and depends on the characteristic value τ:

\hat{\rho}(t)\ \xrightarrow{\,t\to\infty\,}\ \sum_n \rho_{nn}(0)\,|n\rangle\langle n|.
It is important to observe that the nondiagonal terms tend exponentially to zero according to a factor which, to first order, is given by

\exp\!\left(-\frac{\omega_{nm}^2\,\tau\,t}{2}\right).

Thus, the reduction to the diagonal form occurs provided we have a finite value for τ, no matter how small, and provided we do not have ω_{nm}τ ≪ 1 for every n and m, where ω_{nm} = (E_n − E_m)/ħ are the transition frequencies between the different energy eigenstates. This latter condition is always satisfied for unbounded systems. These results, together with an analysis of the discrete Heisenberg equation defined in terms of the average values of observables,

\bar{A}(t) = \text{Tr}\left[\rho(t)\,\hat{A}\right],

in the coarse-grained description, suggest an interpretation of τ in terms of the uncertainty relation ΔEΔt ≥ ħ/2, such that τ is a characteristic interval of time satisfying the inequality

\tau \gtrsim t_E \equiv \frac{\hbar}{2\Delta E}, \qquad \Delta E = \sqrt{\langle H^2\rangle - \langle H\rangle^2},   (102)

so that the mathematical meaning of the time-energy uncertainty relation is that of fixing a lower limit for the time interval within which the time evolution can be described. Thus, ". . . the coarse-grained irreversibility would become a necessary consequence of an intrinsic impossibility to give an instantaneous description of time evolution due to the time-energy uncertainty relation" (Bonifacio, 1983). Since the density operator, in the energy representation, tends to a diagonal form, it is tempting to apply it to the measurement problem. We can also observe that, even without assuming any coarse-graining of time, namely, without using (Recami and Farias, 2009) any statistical approach such as Bonifacio's, the reduction to a diagonal form results straightforwardly from the discrete Liouville equation and some asymptotic conditions regarding the behavior of the solution, once the inequality ω_{nm}τ ≫ 1 is satisfied (Bonifacio and Caldirola, 1982, 1983).
(See also Ghirardi and Weber, 1984; Ghirardi, Rimini, and Weber, 1985.) The crucial point, from which both the decay of the nondiagonal terms of the density operator and the very discrete Liouville equation are
derived, is that the time evolution operator obtained from the coarse-grained description is not a unitary operator. This way, the operator

\hat{V}(t = k\tau,\ t_0 = 0) = \frac{1}{\left(1 + i\tau\hat{L}\right)^{k}},   (103)
like all nonunitary operators, does not preserve the probabilities associated with each of the energy eigenstates that make up the expansion of the initial state in that basis of eigenstates. We must recall that the appearance of nonunitary time evolution operators is not associated with the coarse-grained approach only, since such operators also result from the discrete Schrödinger equations.
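In the energy representation the map (103) acts diagonally on each matrix element of ρ, since L̂ρ_nm = ω_nm ρ_nm. A minimal two-level sketch (τ, the level splitting, and the initial state are arbitrary illustrative values):

```python
def coarse_grained_step(rho, omega, tau, k):
    """Apply rho_nm(k) = rho_nm(0) / (1 + 1j*tau*omega_nm)**k to a 2x2
    density matrix stored as a dict {(n, m): value}, taking
    omega_nm = omega*(n - m) for levels E_n = n*hbar*omega."""
    out = {}
    for (n, m), val in rho.items():
        w_nm = omega * (n - m)
        out[(n, m)] = val / (1 + 1j * tau * w_nm) ** k
    return out

# Pure superposition (|0> + |1>)/sqrt(2):
rho0 = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
rho = coarse_grained_step(rho0, omega=1.0, tau=0.5, k=200)

trace = complex(rho[(0, 0)]).real + complex(rho[(1, 1)]).real  # stays 1.0
coherence = abs(rho[(0, 1)])                                   # decays to ~0
```

The populations (diagonal terms) are untouched while the coherence is exponentially suppressed: the trace-preserving but nonunitary reduction toward a diagonal ρ described in the text.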
5.3. Measurement Problem in Quantum Mechanics Let us apply the discrete formalism introduced in the previous subsection to the measurement problem. Using a quite general formalization, we can describe the measurement process taking advantage of the properties observed for the evolution of the density operator as determined by the discrete Liouville-von Neumann equation.20 When speaking of measurement, we must keep in mind that, in the process, an object O, of which we want to measure a dynamic variable R, and an apparatus A, which is used to perform such measurement, are ^ is the operator associated with the involved. Let us suppose that R ^ ¼ rjri and defines observable R, with an eigenvalue equation given by Rjri a complete basis of eigenstates. Thus, considered by itself, any possible state of the object can be expanded in this basis: X jCi0 ¼ cr jri0 : (104) r
With respect to the apparatus A, we are interested only in its observable A, whose eigenvalues a represent the possible values indicated by a pointer. In addition, let its various internal quantum numbers be labeled by an index n. These internal quantum numbers are useful to specify a complete basis of eigenvectors associated with the apparatus: ^ ni ¼ aja; ni : Aja; A A
(105)
Now, let us suppose that the apparatus is prepared in an initial state given by j0, niA, that is, in the initial state the value displayed is zero. The interaction between the two systems is introduced by means of the 20
We follow closely the description exhibited in Recami and Farias (2009) and in Ballentine (1986).
Consequences for the Electron of a Quantum of Time
91
time evolution operator and is such that there is a correlation between the value of r and the measure ar. We consider a quite general situation to deal with the measurement process itself. First, let us consider the following pure state of the system object þ apparatus (O þ A): jCin i ¼ jCi0 j0; niA :
(106)
The evolution of this state, in the continuous description, using the evolution operator, is given by X ^ ðt; t0 ÞjCi j0; ni ¼ cr jar ; r; ni ¼ jCfn i (107) U 0 A r
which is a coherent superposition of macroscopically distinct eigenstates, each one corresponding to a different measure ar. The major problem for the Copenhagen interpretation results from the fact that it considers the state jCin i as associated with a single system. A pure state provides a complete and exhaustive description of an individual system. Thus, the coherent superposition above describes a single system so that, at the end of the interaction that settles the measurement, the display should not show a well-defined output since Eq. (107) describes a system that is a superposition of all its possible states. However, we know from experience that the apparatus always displays a single value as the output of the measurement. It is this disagreement between observation and the description provided by the formalism, when interpreted according to Copenhagen, that results in the necessity of introducing the postulate of the reduction of the vector state jCfn i ! jar0 ; r0 ; ni; where r0 is the value displayed by the apparatus. This fact has been considered by many as a problem for the usual interpretation of QM (Wigner, 1963; Ballentine, 1986, 1970). The attempts to find a solution, in the context of different interpretations, have been numerous, from the many-worlds interpretation, proposed by Everett and Wheeler (Everett, 1957; Wheeler, 1957) to the measurement theory by Daneri et al. (1962),21 in which the reduction of the quantum state is described as a process triggered by the appearance of aleatory phases in the state of the apparatus, simply because of its interaction with the elementary object. The approach introduced here is—by contrast— somewhat simpler.
21
See also Caldirola (1974), Lanz et al. (1971), Ludwig (1953), and George et al. (1972).
92
Ruy H. A. Farias and Erasmo Recami
As an initial state in the measurement process, let us consider a mixed state for the composite system O þ A, X ri ¼ Cn jCin ihCin j; (108) n
where Cn is the probability associated with each of the states jCin i. Such probability is, as in the classical physics, an ignorance probability, that is, it is not intrinsic to the system. In the continuous case, when we apply the time evolution operator to that density operator, we get a final state given by X ^ iU ^{ rf ¼ Ur Cn jCin ihCin j (109) rf ¼
X r1 ;r2
c r1 cr2
X
n
Cn fjar1 ; r1 ; nihar1 ; r1 ; njg;
(110)
n
so that the presence of nondiagonal terms corresponds to a coherent superposition of states. In this case, the postulate of the reduction of the quantum state is connected with the nondiagonal terms of the density operator. It is usually postulated that when a measurement is carried out on the system, the nondiagonal terms tend instantaneously to zero. Since in the continuous case the time evolution of the state results from the application of a unitary operator, which preserves the pure state condi^2 ¼ r ^, it is impossible to obtain the collapse of the pure state from the tion r action of such an operator. In the diagonal form the density operator ˆ , and the indeterdescribes an incoherent mixture of the eigenstates of A mination regarding the output of the measurement is a sole consequence of our ignorance about the initial state of the system. In the discrete case (Recami and Farias, 2009), which has the time evolution operator given by Eq. (103), with the interaction between apparatus and object embedded in the Hamiltonian H, the situation is quite different. The main cause of such difference is the fact that the time evolution operator is not unitary. Let us consider the energy representation, describing the eigenvalue equation of the Hamiltonian as Hjni ¼ Enjji so that the eigenstates jni are the states with defined energy. From the ^ is formalism of the density matrices we know that when the operator R diagonal in the energy representation, then when calculating the expected value of the observable, we do not obtain the interference terms describing the quantum beats typical of a coherent superposition of the states jni. Because the time evolution operator is a function of the Hamiltonian and, therefore, commutes with it, the basis of the energy eigenstates is also a basis for this operator. We can now use a procedure identical to the one applied by Bonifacio (1983), and consider the evolution of the system in
Consequences for the Electron of a Quantum of Time
93
^ (t ¼ kt, t ¼ 0) takes the initial this representation. Thus, the operator V i density operator r to a final state for which the nondiagonal terms decay exponentially with time: rfrs ¼ hrjV ðt ¼ kt; t ¼ 0Þjsi ¼
rirs ð1 þ iors tÞt=t
;
(111)
with ors ¼
1 1 ðEr Es Þ ¼ DErs : h h
(112)
Equation (111) can be written as rrs ðtÞ ¼ rrs ð0Þegrs t einrs t
(113)
1 ln 1 þ o2rs t2 ; 2t
(114)
1 tan1 ðors tÞ: t
(115)
such that grs ¼
nrs ¼
We can observe directly that the nondiagonal terms tend to zero with time and the decay is faster the larger the value of t, which here is an interval of time related to the entire system O þ A. If we keep in mind that in the coarse-grained description the value of the time interval t originates from the impossibility of distinguishing between two different states of the system, we must remember that the system O þ A is not an absolutely quantum system. That means that t could be significantly larger, implying a much faster damping of the nondiagonal terms of the density operator (Figure 11). We then arrive at a process like the one of the reduction of the quantum state, even if in a rudimentary formalization. This result seems encouraging for future researche on such important and controversial subject. Some points must be noted out from this brief approach of the measurement problem. First, this result does not occur when we use the time evolution operators obtained directly from the retarded Schro¨dinger equation. The dissipative character of that equation causes the norm of the state vector to decay with time, also leading to a nonunitary evolution operator. However, this operator is such that, in the definition of the density operator we obtain damping terms that are effective even for the diagonal terms. This point, as well as the question of the compatibility between Schro¨dinger’s picture and the formalism of the density matrix, are analyzed in Appendix A. As the composite system O þ A is a complex system, it is suitably described by the coarse-grained description, so that
94
Ruy H. A. Farias and Erasmo Recami
(a)
(b) 1.0
ρ (t)
ρ (t)
1.0
0.5
0.0
0.0
5.0 ⫻ 10–8 1.0 ⫻ 10–7 1.5 ⫻ 10–7 2.0 ⫻ 10–7 2.5 ⫻ 10–7 t (s)
0.5
0.0
0
1 x 10–12
2 x 10–12
3 x 10–12
4 x 10–12
t (s)
FIGURE 11 Vanishing in time of the nondiagonal terms of the density operator for two different values of t. For both cases we have used DE ¼ 4 eV. (a) Slower damping for t ¼ 6.26 1024 s; (b) faster damping for t ¼ 1019 s.
the understanding of the relationship between the two pictures can be useful to gain a deeper insight on the processes involved. Notwithstanding the simplicity of the approach, we could also observe the intrinsic relation between measurement process and irreversi^ meets the properties of a semigroup, bility. The time evolution operator V so that it does not necessarily possess an inverse; and noninvertible operators are related to irreversible processes. In a measurement process, in which the object is lost just after the detection, we have an irreversible ^ process that could very well be described by an operator such as V. Finally, it is noteworthy that the measurement problem is controversial even regarding its mathematical approach. In the simplified formalization introduced previously, we did not include any consideration beyond those common to the quantum formalism, allowing an as clear as possible individualization of the effects of the introduction of a fundamental interval of time in the approach to the problem. The introduction of a fundamental interval of time in the description of the measurement problem makes possible a simple but effective formalization of the state-reduction process. Such behavior is observed only for the retarded case. When we take into account a symmetric version of the Liouville-von Neumann equation, the solution is given by h i it 1 t rnm ðtÞ ¼ rnm ð0Þ exp sin ð E n Em Þ ; t h where the diagonal elements do not change with time and the nondiagonal elements have an oscillatory behavior. This means that the symmetric equation is not suitable to describe a measurement process, and this is an important distinction between the two formulations: actually, only the retarded one describes dissipative systems.
Consequences for the Electron of a Quantum of Time
95
It is important to stress that the retarded case of direct discretization of the Liouville-von Neumann equation results in the same equation obtained via the coarse-grained description. This led us to consider our equation as the basic equation to describe complex systems, which is always the case when a measurement process is involved. Our results, moreover, are independent of any statistical approach.
6. CONCLUSIONS In this paper we attempted to gain a better insight into the applicability of the various distinct formalisms obtained when performing a discretization of the continuous equations. For example, what kind of physical description is provided by the retarded, advanced, and symmetric versions of the Schro¨dinger equation? This can be achieved by observing the typical behavior of the solutions obtained for each case and, particularly, attempting to derive these equations from Feynman’s approach. We then have an advanced equation that describes a system that absorbs energy from the environment. We can imagine that, in order to evolve from one instant to another, the system must absorb energy, and this could justify the fact that, by using Feynman’s approach with the usual direction of time, we can obtain only the advanced equation. The propagator depends only on the Hamiltonian because it is independent of the wave function that describes the initial state. Thus, it describes a transfer of energy to the system. The retarded equation is obtained by a time reversion, by an inversion of the direction of the propagator, that is, by inverting the flux of energy. The damping factor characteristic of the retarded solutions refers to a system that is continuously releasing energy into the environment. Thus, both the retarded and the advanced equations describe open systems. Finally, the symmetric equation describes a system in an energy equilibrium with the environment. Thus, the only way to obtain stationary states is by using the symmetric equation. Regarding the nature of such an energy, it can be related to the very evolution of the system. It can be argued that a macroscopic time evolution is possible only if there is some energy flux between the system and the environment. The states described by the symmetric equation are basically equilibrium states, without net dissipation or absorption of energy by the system as a whole. 
We can also conceive of the symmetric equation as describing a closed system, which does not exchange energy with the external world. On the other hand, when a comparison is made with the classical approach, we can speculate that the symmetric equation ceases to be valid when the interaction with the environment changes rapidly within a chronon of time. Thus, phenomena such as the collision of highly
96
Ruy H. A. Farias and Erasmo Recami
energetic particles require the application of the advanced or retarded equations. The decay of the norm associated with the vector states described by the retarded equation would indicate the very decay of the system, a system abandoning its initial ‘‘equilibrium state.’’ The behavior of the advanced equation would indicate the transition of the system to its final state. This speculation suggests another interpretation, closer to the quantum spirit. We could consider the possible behavior of the system as described by all three equations. However, the ordinary QM works with averages over ensembles, which is a description of an ideal, purely mathematical reality. The point is that if we accept the ergodic hypothesis, such averages over ensembles are equivalent to averages over time. The fact is that the quantum formalism always works with average values when dealing with the real world. When the potentials involved vary slowly with respect to the value of the chronon of the system, which means a long interaction time, the contributions due to the transient factors from the retarded and advanced equations compensate for each other and cancel out. Then, on the average, the system behaves according to the symmetric equation. On the contrary, when the potentials vary strongly within intervals of time of the order of the chronon, we do not have stationary solutions. The discrete formalism describes such a situation by making recourse, during the interaction, to the transient solutions, which will yield the state of the system after the interaction. Afterward, the system will be described again by a symmetric solution. The most conservative quantum interpretation would be that of believing that only the symmetric equation describes a quantum system. During the interaction process the theory does not provide any description of the system, pointing only to the possible states of the system after the transient period. 
The description of the interaction would demand one more ingredient: the knowledge of the interaction process (which would imply an additional theoretical development, for example, the determination of an interaction model). In addition to the question of the physical meaning of the discretized equations, that is, to the type of physical description underlying it, there is the question of the time evolution of the quantum states. The Schro¨dinger equations describe the evolution of a wave function, with which an amplitude of probability is associated. An analogy with the electron theory leads us to the supposition that this wave function does not react instantaneously to the external action, but reacts after an interval of time that is characteristic of the described system. In discrete QM, the justification of the non-instantaneous reaction comes from the fact that the uncertainty principle prevents a reaction arbitrarily close to the action application instant (Wolf, 1990a,b,c,d, 1992a,b, 1994). Such uncertainty could be related to the perturbation caused by the Hamiltonian on the state of the system, resulting in an uncertainty relation like the
Consequences for the Electron of a Quantum of Time
97
Mandelstam-Tamm time-energy correlation (Fock, 1962): a time evolution in which the macroscopic state of the system leaps discontinuously from one instant to the other. Therefore, the quantum jumps appear not only in the measurement process but are an intrinsic aspect of the time evolution of the quantum system.22 The difference, in our case, is that the jump does not take the system suddenly out of the quantum state with which it was endowed, but only determines the evolution of that state. Another aspect characteristic of the discrete approach is the existence of an upper limit for the eigenvalues of the Hamiltonian of a bounded system. The description of a free particle showed the existence of an upper limit for the energy of the eigenfunctions composing the wave packet that describes the particle, but this limit does not imply an upper value for the energy of the particle. The existence of this limiting value determines the Hamiltonian eigenvalue spectrum within which a normalization condition can hold. Once that value is exceeded, a transition to the internal excited states of the system occurs. As an example, this allowed us to obtain the muon as an excited internal state of the electron. It must be remarked the nonlinear character of the relation between energy and oscillation frequency of a state, and the fact that the theory is intrinsically nonlocal, as can be inferred from the discretized equations. It must also be stressed that the theory described in this chapter is nonrelativistic. Finally, it must be emphasized that the symmetric form of the discrete formalism reproduces grosso modo the results of the continuous theory. The effects of the introduction of a fundamental interval of time are evident in the evolution of the quantum systems, but they are—in general—extremely tiny. 
There have been some attempts to find physical situations in which measurable differences between the two formalisms can be observed, but until now with little success.23 A possibility is that this could be afforded by exploiting the consequences of the phase shifts caused by the discretization (see subsections 4.2 and 4.3). Regarding the justifications for introducing a fundamental interval of time, let us recall Bohr’s (1935) reply to the famous 1935 paper by Einstein, Podolski, and Rosen (Einstein et al., 1935): ‘‘The extent to which an unambiguous meaning can be attributed to such an expression as physical reality cannot of course be deduced from a priori philosophical conceptions, but . . . must be founded on a direct appeal to experiments and measurements.’’ Considering time as continuous may be regarded as a criticizable philosophical position since, at the level of experiments and measurements, nature seems to be discrete. More important is to recall that the new formalism allows not only the description of the stationary states, but also a space-time description of
22 23
For the controversy envolving quantum jumps see, for example, Schro¨dinger (1952) and Bell (1987). Several systems have been analyzed in Wolf (1987a,b, 1989a,b, 1990a,b,c,d, 1992a).
98
Ruy H. A. Farias and Erasmo Recami
transient states. The retarded formulation yields a natural quantum theory for dissipative systems. Significantly, it leads to a simple solution of the measurement problem in QM. Such interesting problems await future attention.
APPENDICES A. Evolution Operators in the Schro¨dinger and Liouville–von Neumann Discrete Pictures In applying the formalism introduced in the previous sections to the measurement problem, the requirement of the existence of a well-defined evolution operator is evident. By well-defined we mean that, as in the continuous case, a unitary operator that satisfies the properties of a group. In the continuous case, when the Hamiltonian is independent of time, the time evolution operator has the form ^ ðt; t0 Þ ¼ exp iðt t0 ÞH=h ^ U ˆ be Hermitean. and is a unitary operator that satisfies the condition that H In the continuous case, by definition, every observable is represented by a Hermitean operator. An operator is unitary when its Hermitean conjugate is equal to its inverse, such that ^ {A ^ ¼A ^A ^ { ¼ 1: A Another important aspect regarding a unitary operator is related to the probability conservation. In other words, if the initial state is normalized to 1, it will keep its norm for all subsequent times. The evolution operator does not change the norm of the states on which it operates. Thus, we know beforehand that the evolution operators associated with the retarded and advanced discretized Schro¨dinger equations are not unitary operators.
A.1. Evolution Operators in the Schro¨dinger Picture For the discretized Schro¨dinger equation the discrete analog of the time evolution operator can be obtained easily. Let us initially consider the symmetric equation, which is the closest to the continuous description. After some algebraic manipulation the evolution operator can be written as " !# ^ i ð t t Þ t H 0 1 ^ ðt; t0 Þ ¼ exp sin U ; t h
99
Consequences for the Electron of a Quantum of Time
so that "
^ ^ ðt; t0 ÞjCðx; t0 Þi ¼ exp iðt t0 Þ sin1 tH jCðx; tÞi ¼ U t h
!# jCðx; t0 Þi:
Thus, if the eigenvalue equation of the Hamiltonian is given by ^ ðx; t0 Þi ¼ EjCðx; t0 Þi; HjC we have that iðt t0 Þ tE sin1 jCðx; tÞi ¼ exp jCðx; t0 Þi: t h ˆ is a Hermitean operator, the evolution operator for the symSince H metric equation is also Hermitean. However, the existence of a limit for ˆ implies that, beyond the possible values of the eigenvalues of H such threshold, the evolution operator is no longer Hermitean. In fact, if we ˆ has the form consider that beyond the threshold the operator H ^ ¼ ^n þ i^ H k; ^ are Hermitean operators, we obtain in the continuous where ^n and k approach the same results obtained in the discrete case. One of the characteristics of a non-Hermitean operator is the fact that it does not conserve the norm of the state on which it acts. For the retarded equation, the evolution operator is given by i ^ ðtt0 Þ=t ^ Uðt; t0 Þ ¼ 1 þ tH ; h
(116)
such that, in the limit t ! 0,
i ^ lim 1 þ tH t!0 h
ðtt0 Þ=t
^ ¼ ehðtt0 Þ H; i
which is an expression known as the Trotter equality. Taking the conjugate ˆ { we can verify that this operator is not unitary. In Hermitean operator U ˆ , we can verify that the basis of eigenstates of H ðtt0 Þ=t t2 E2n { ^ ^ ; hnjU Ujni ¼ 1 þ 2 h is not equal to 1. This explains why the probabilities are not conserved for the solutions of the retarded equation. In addition, as the evolution operator for the advanced equation is given by
100
Ruy H. A. Farias and Erasmo Recami
i ^ ðtt0 Þ=t ^ ; Uðt; t0 Þ ¼ 1 tH h it can be verified that the formal equivalence between the two equations is obtained by the inversion of the time direction and of the sign of the energy. In the relativistic case, this is understandable if we remember that, if a transformation changes the sign of the time component of a coordinate four-vector, then it also changes the sign of the energy, which is the corresponding element of the energy-momentum four-vector. Then the retarded equation describes a particle endowed with positive energy traveling forward in time, and the advanced equation describes an object with negative energy traveling backward in time, that is, an antiparticle (Recami, 1978; Recami and Rodrigues, 1982; Recami et al., 1983; Pavsic and Recami, 1982).
A.2. Evolution Operator in the Density Matrix Picture
For the sake of simplicity, let jc(t)i be a pure state. The density of states operator is defined as ^ðtÞ ¼ jcðtÞihcðtÞj: r It can be shown that such operator evolves according to the following dynamic laws. For the retarded case, ^ðtÞ ¼ DR r
i t 1 h^ ^ ðtÞ^ ^ ðtÞ; ^ðtÞ 2 H rðtÞH HðtÞ; r ih h
for the advanced case, ^ðtÞ ¼ DA r
i t 1 h^ ^ ðtÞ^ ^ ðtÞ; ^ ðt Þ þ 2 H rðtÞH HðtÞ; r ih h
and, finally, for the symmetric case, D^ rðtÞ ¼
i 1 h^ ^ðtÞ : HðtÞ; r ih
We can thus observe that the retarded and the advanced equations cannot be obtained by a direct discretization of the continuous Liouvillevon Neumann equation. Such formal equivalence occurs only for the symmetric case. Taking into account the retarded case, we can obtain the equivalent time evolution operator as ^ ðt; t0 Þ ¼ h V
1 ^þ H ^ ...H ^ 1 þ ith L h t2 2
iðtt0 Þt :
(117)
Consequences for the Electron of a Quantum of Time
101
Of note, this operator is different from the one obtained from the coarse-grained approach, ^ CG ðt; t0 Þ ¼ h V
1 1þ
it ^ hL
iðtt0 Þt :
(118)
^ CG is defined as having the and it is not unitary as well. Quantity V properties of a semigroup: It does not necessarily have an inverse but possesses the other group properties such as commutativity and existence of an identity (in addition to the translational invariance of the initial condition). We can conclude from the difference between the two operators that, apparently, the descriptions clash. In the coarse-grained approach the starting point was the continuous Liouville-von Neumann equation and, by introducing the graining of the time coordinate, an evolution operator was obtained satisfying the retarded equation ^ðtÞ ¼ DR r
i 1 h^ ^ ðt Þ : HðtÞ; r ih
The second path started from the definition of the density operator to determine the dynamical equation it satisfies and then obtained the evolution operator. For the symmetric case, the evolution operator is given by " !# ^ iðt t0 Þ 1 tL ^ sin ; (119) V ðt; t0 Þ ¼ exp t h which is similar to the operator obtained for the continuous case.
A.3. Compatibility Between the Previous Pictures We thus have two distinct evolution operators for the retarded Schro¨dinger and Liouville equations so that, once a connection is established between them, we arrive at the question of the compatibility of the two descriptions. We try to set up a relation between those operators by observing their action on the density operator. So, we expect that both operators satisfy the expression ^ ðt; t tÞ^ ^ ðt; t tÞ^ ^ { ðt; t tÞ; V r0 ¼ U r0 U where the different action of the operators is basically due to the bilinearˆ , given ^ given by Eq. (117), while U ity (Recami et al., 2010) of the operator V by Eq. (116), is linear. This relation is valid in the continuous case, where the evolution operators act on the density operator according to
102
Ruy H. A. Farias and Erasmo Recami
^ðtÞ ¼ exp½iLðt t0 Þ=h^ r0 ¼ exp½iHðt t0 Þ=h^ r0 exp½iHðt t0 Þ=h: r Considering the basis of Hamiltonian eigenstates jni, we have ^ rð0Þjmi ¼ ðEn Em Þr ð0Þ; hnjL^ nm so that ^ r ^ð0Þ ¼ exp½itðEn Em Þrnm ð0Þ; exp iLt
(120)
^ r ^ ¼ exp½itðEn Em Þr ð0Þ: ^ð0Þ exp iHt exp iHt nm
(121)
The question is knowing whether the same is valid for the discrete case. For the retarded approach, we must check whether the relation h
1 ^þ H ^ ...H ^ 1 þ ith L h t2 2
^0 ¼ h iðtt0 Þ=t r
1 ^ 1 þ hi tH
^0 h iðtt0 Þ=t r
1 ^ 1 hi tH
iðtt0 Þ=t
is valid. We see that, if we consider that equations such as (120) and (121) continue to be valid in the discrete case, then the above relation is valid. For a generic element of the operator, we then obtain h
1
1 1 rnm ð0Þ
it=t rnm ð0Þ ¼
t=t : t=t 2 1 þ hi tEn 1 hi tEm 1 þ ith ðEn Em Þ þ ht 2 En Em
Such equivalence also can be observed for the other cases. However, when we consider the evolution operator obtained from the coarsegrained approach, we find an incompatibility with the operator deriving from the Schro¨dinger one. For the operator [Eq. (116)] we have 1 1 ^ð0Þjmi ¼
hnj h it=t r t=t rnm ð0Þ: i 1 þ h t ð En E m Þ ^ 1 þ hi tL The question now is to determine the fundamental difference between the two descriptions: Are both valid, and under what conditions? Some points must be emphasized. First, remember that the coarse-grained description is a semi-classical approach that assumes a system with a certain degree of complexity, whereas the vector state description is a fundamentally quantum approach without any imposition, in principle, on the number of degrees of freedom of the system described. The two approaches differ importantly even in the way they conceive the chronon. In the coarsegrained approach, it is understood as a magnitude inwardly connected to
Consequences for the Electron of a Quantum of Time
103
the experimental limitations or, for an ideal measurement device, to the limitations imposed by the uncertainty relations. For the Schro¨dinger equation, the value of the chronon is taken as a fundamental interval of time associated with interaction processes among the components of the system, and of the system as a whole with some external potential; that is, it is associated with the internal processes of the system (as it has been conceived for the classical electron). In this way, the absence of the mixed term in the evolution operator obtained with the semi-classical procedure is comprehensible, as is its incompatibility with the purely quantum description provided by the Schro¨dinger equations. As a semiclassical approach, the range of applicability of the coarse-grained formalism extends to the cases where the system to be studied is not purely microscopic, particularly in the measurement processes. We stress that, in this formalization, only the retarded equation was obtained. Thus, the system as described dissipates energy: It is an open system. This is the characteristic that make it possible for us to have access to the output of a measurement. In connection with the operator obtained directly in the Schro¨dinger picture for the retarded case, all the elements of the density matrix, even the diagonal ones, are damped with time. There is also the controversy linked to the non-existence, in QM, of an applicability limit of the theory due to the number of degrees of freedom involved. The formalism does not distinguish between a microscopic and a macroscopic system, so that it should reproduce what is obtained with the coarse-grained formalism. This means that the measurement problem appears in the discrete formalism also through the non-equivalence of the evolution operators in Eqs. (117) and (118).
B. Non-Hermitean Operators in the Discrete Formalism One feature we have stressed throughout this work is the non-Hermitean character of the discrete formalism. In the Schro¨dinger representation, for example, the continuous equation can reproduce the outputs obtained with the discretized equations once we replace the conventional Hamiltonian by a suitable non-Hermitean Hamiltonian we have called the equivalent Hamiltonian. One characteristic of a non-Hermitean operator is that its eigenvalues are defined over the field of complex numbers. A linear nonHermitean operator can always be considered as consisting of a Hermitean part, which supplies the real component of the eigenvalues, and an antiHermitean part, which gives the complex component (Recami et al., 2010). In the continuous case, let us take the Hamiltonian as being a nonHermitean operator given by e ¼ ^n þ i^ H k;
where \hat{n} and \hat{k} are Hermitean. Then we have, in the Schrödinger picture, that the time evolution operator is given by

\[ U_{\mathrm{cont}}(t,t_0) = \exp\!\left[\frac{(\hat{k} - i\hat{n})(t-t_0)}{\hbar}\right]. \tag{122} \]

For the discrete case, it follows from Appendix A that the evolution operator for the retarded states, Eq. (116), is given by

\[ \hat{U}(t,t_0) = \left(1 + \frac{i\tau\hat{H}}{\hbar}\right)^{-(t-t_0)/\tau}, \tag{123} \]

where \hat{H} is the Hermitean operator associated with the conventional Hamiltonian. This evolution operator can be written as

\[ \hat{U}_{\mathrm{ret}}(t,t_0) = \exp\!\left[-\frac{(t-t_0)}{2\tau}\ln\!\left(1 + \frac{\tau^2\hat{H}^2}{\hbar^2}\right)\right] \exp\!\left[-\frac{i(t-t_0)}{\tau}\tan^{-1}\!\left(\frac{\tau\hat{H}}{\hbar}\right)\right]. \tag{124} \]

Comparing Eqs. (122) and (124), we obtain the equivalence of the Hamiltonians once \hat{n} and \hat{k} are given by

\[ \hat{n} = \frac{\hbar}{\tau}\tan^{-1}\!\left(\frac{\tau\hat{H}}{\hbar}\right), \qquad \hat{k} = -\frac{\hbar}{2\tau}\ln\!\left(1 + \frac{\tau^2\hat{H}^2}{\hbar^2}\right). \]

For the advanced case we obtain the same expressions, except for a minus sign in \hat{k}. For the symmetric case, below the critical limit, we have

\[ \hat{n} = \frac{\hbar}{\tau}\sin^{-1}\!\left(\frac{\tau\hat{H}}{\hbar}\right), \qquad \hat{k} = 0. \]

Above that limit \hat{n} ceases to be Hermitean and, in this case, the evolution operator can be written as

\[ \hat{U}_{\mathrm{sym}}(t,t_0) = \exp\!\left[-\frac{i\pi(t-t_0)}{2\tau}\right] \exp\!\left[-\frac{(t-t_0)}{\tau}\ln\!\left(\frac{\tau\hat{H}}{\hbar} + \sqrt{\left(\frac{\tau\hat{H}}{\hbar}\right)^2 - 1}\right)\right], \]

so that

\[ \hat{n} = \frac{\hbar\pi}{2\tau}, \qquad \hat{k} = -\frac{\hbar}{\tau}\ln\!\left(\frac{\tau\hat{H}}{\hbar} + \sqrt{\left(\frac{\tau\hat{H}}{\hbar}\right)^2 - 1}\right), \]
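As a numerical sanity check (ours, not part of the original text), the equivalence between the discrete retarded operator of Eq. (123) and the continuous operator of Eq. (122) built from the above \hat{n} and \hat{k} can be verified directly. Units with \hbar = 1, an arbitrary 2x2 Hermitean Hamiltonian, and an arbitrary chronon value are assumed:

```python
import numpy as np

HBAR = 1.0  # natural units for this check

def apply_to_hermitian(H, f):
    """Apply a scalar function f to a Hermitean matrix via its spectral decomposition."""
    E, V = np.linalg.eigh(H)
    return (V * f(E)) @ V.conj().T

H = np.array([[2.0, 0.5], [0.5, 1.0]])  # sample Hermitean Hamiltonian
tau, t = 0.1, 1.0                       # chronon and elapsed time (illustrative values)

# Discrete retarded evolution operator, Eq. (123)
U_disc = apply_to_hermitian(H, lambda E: (1 + 1j * tau * E / HBAR) ** (-t / tau))

# Equivalent-Hamiltonian pieces: n = (hbar/tau) arctan(tau E/hbar),
# k = -(hbar/2 tau) ln(1 + tau^2 E^2/hbar^2)
n = lambda E: (HBAR / tau) * np.arctan(tau * E / HBAR)
k = lambda E: -(HBAR / (2 * tau)) * np.log(1 + (tau * E / HBAR) ** 2)

# Continuous evolution operator with the equivalent Hamiltonian, Eq. (122)
U_cont = apply_to_hermitian(H, lambda E: np.exp((k(E) - 1j * n(E)) * t / HBAR))

assert np.allclose(U_disc, U_cont)       # the two evolutions coincide
assert abs(np.linalg.det(U_disc)) < 1.0  # the retarded evolution is damped
```

The comparison works eigenvalue by eigenvalue because, for a Hermitean \hat{H}, both operators are functions of \hat{H} alone.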
^ ceases to be zero. with ^n being now independent of the Hamiltonian and k ^ The expressions obtained above show the characteristics that ^n and k must fulfill, so that the continuous equation reproduces the outputs of the discretized equations. By observing the continuous evolution operator we e shows a nonstationary behavior, have that the anti-Hermitean part of H resulting in a damping or amplifying term associated with the evolution of the quantum state. Thus, the stationary solutions appear only for the symmetric case below the critical limit. In all the other cases, the transient term always appears. In QM, the non-Hermitean operators have been used mainly as mathematical shortcuts, as in the case of the Lippmann-Schwinger equation in the scattering theory. It has already been observed that the introduction of such operators could make possible the description of unstable states, by phenomenologically linking the transient factor to the lifetime of the considered states (Agodi et al., 1973, and Cohen-Tannoudji et al., 1977). If in a certain instant t0 ¼ 0 the system is in one of the eigenstates jni of the ˆ , then if such state is unstable, the probability of the system Hamiltonian H to be found in the same state at a later instant t is ^ { Ujnij ^ Pn ðtÞ ¼ jhnjU ¼ expðt=tL Þ; and that allows us to specify a lifetime tL, for the retarded case, as tL ¼
\[ \tau_L = \frac{\tau}{\ln\!\left(1 + \dfrac{\tau^2 E_n^2}{\hbar^2}\right)}, \tag{125} \]

and, for the symmetric case above the critical energy, as

\[ \tau_L = \frac{\tau}{2\ln\!\left(\dfrac{\tau E_n}{\hbar} + \sqrt{\dfrac{\tau^2 E_n^2}{\hbar^2} - 1}\right)}. \]
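For an energy eigenstate, the decay law can also be checked directly from the amplitude (1 + i\tau E_n/\hbar)^{-t/\tau} supplied by Eq. (123). The small numerical illustration below (ours, not from the original; \hbar = 1 and arbitrary \tau and E_n) confirms the retarded lifetime of Eq. (125):

```python
import numpy as np

HBAR = 1.0  # natural units
tau = 0.1   # chronon (illustrative value)
E_n = 2.0   # energy eigenvalue of the unstable state |n>

def survival_probability(t):
    """<n|U†U|n> for an eigenstate of H under the retarded operator, Eq. (123)."""
    amp = (1 + 1j * tau * E_n / HBAR) ** (-t / tau)
    return abs(amp) ** 2

# Lifetime from Eq. (125)
tau_L = tau / np.log(1 + (tau * E_n / HBAR) ** 2)

# The survival probability follows exp(-t/tau_L) at all times,
# dropping to 1/e at t = tau_L
for t in (0.1, 1.0, 5.0):
    assert np.isclose(survival_probability(t), np.exp(-t / tau_L))
assert np.isclose(survival_probability(tau_L), np.exp(-1.0))
```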
Such lifetimes are connected with states that, in the discretized formalism, are intrinsically unstable. Only the retarded equation seems to be associated with quantum states that decay with time. If that is truly valid, we have an expression that could be used for phenomenologically
determining the value of the chronon. Finally, we can conclude that the time discretization brings forth a formalism which, even if only Hermitean Hamiltonians are involved, is equivalent to the introduction of non-Hermitean operators in the continuous QM.
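As a sketch of how Eq. (125) could be used in this way, the relation \tau_L = \tau / \ln(1 + \tau^2 E_n^2/\hbar^2) can be inverted numerically for \tau, given a hypothetical measured lifetime \tau_L and energy E_n. All numbers below are illustrative, with \hbar = 1:

```python
import numpy as np

HBAR = 1.0     # natural units
E_n = 1.0      # illustrative energy of the decaying state
tau_L = 1.0e6  # hypothetical measured lifetime, tau_L >> chronon

# For tau*E_n/hbar << 1, Eq. (125) reduces to tau_L ~ hbar^2/(tau E_n^2),
# giving a first estimate of the chronon:
tau_est = HBAR**2 / (tau_L * E_n**2)

# Refine by bisection on f(tau) = tau/ln(1 + tau^2 E_n^2/hbar^2) - tau_L
def f(tau):
    return tau / np.log1p((tau * E_n / HBAR) ** 2) - tau_L

lo, hi = 0.5 * tau_est, 2.0 * tau_est  # f(lo) > 0 > f(hi) on this branch
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
tau_chronon = 0.5 * (lo + hi)

# The refined chronon value reproduces the assumed lifetime
assert np.isclose(tau_chronon / np.log1p((tau_chronon * E_n / HBAR) ** 2), tau_L)
```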
ACKNOWLEDGEMENTS

The authors are grateful to P. Hawkes for his generous interest and very kind collaboration. They also thank V. Bagnato, R. Bonifacio, C. Dobrigkeit-Chinellato, S. Esposito, F. Fontana, G. C. Ghirardi, G. Giuffrida, P. Leal-Ferreira, A. Natale, E. C. Oliveira, F. Pisano, I. Radatti, S. Randjbar-Daemi, A. Ranfagni, A. Salanti, G. Salesi, J. W. Swart, I. Torres Lima Jr., C. Ussami, M. Zamboni-Rached and, in particular, D. G. Chakalov, H. E. Hernández-Figueroa, M. Tenório de Vasconselos and D. Wisniweski, for stimulating discussions or kind collaboration. Finally, one of us (RHAF) acknowledges a former PhD fellowship from FAPESP. A preliminary version of this paper appeared in e-print form as arXiv:quant-ph/9706059.
REFERENCES

Abraham, M. (1902). Prinzipien der Dynamik des Elektrons. Annalen der Physik, 315(1), 105–179.
Abraham, M. (1905). Theorie der Elektrizität, Vol. II. Leipzig: Springer.
Agodi, A., Baldo, M., & Recami, E. (1973). Approach to the Theory of Unstable States. Annals of Physics, 77, 157–173.
Ambarzumian, V., & Ivanenko, D. D. (1930). Zur Frage nach Vermeidung der unendlichen Selbstrückwirkung des Elektrons. Zeitschrift für Physik, 64, 563–567.
Arzeliès, H. (1966). Rayonnement et Dynamique du Corpuscule Chargé Fortement Accéléré. Paris: Gauthier-Villars.
Ashauer, S. (1949). On the classical equations of motion of radiating electrons. Proceedings of the Cambridge Philosophical Society, 45, 463–475.
Ballentine, L. E. (1970). The statistical interpretation of quantum mechanics. Reviews of Modern Physics, 42(4), 358–381.
Ballentine, L. E. (1986). What is the point of the quantum theory of measurement? In L. Roth & A. Inomata (Eds.), Fundamental Questions in Quantum Mechanics. Gordon & Breach.
Barut, A. O. (1978a). The creation of a photon: A heuristic derivation of Planck's constant h or fine structure constant. Zeitschrift für Naturforschung, A33, 993–994.
Barut, A. O. (1978b). The mass of the muon. Physics Letters, B73, 310–312.
Barut, A. O. (1979). Lepton mass formula. Physical Review Letters, 42, 1251.
Barut, A. O. (1991). Brief History and Recent Developments in Electron Theory and Quantumelectrodynamics. In D. Hestenes & A. Weingartshofer (Eds.), The Electron: New Theory and Experiment. Dordrecht: Kluwer.
Bateman, H. (1910). The transformation of the electrodynamical equations. Proc. London Math. Soc., B8, 223.
Bell, J. S. (1987). Are There Quantum Jumps? In C. W. Kilmister (Ed.), Schrödinger: Centenary Celebration of a Polymath (pp. 41–52). Cambridge: Cambridge U. P.
Belloni, L. (1981). Historical remarks on the 'classical' electron radius. Lett. Nuovo Cim., 31, 131–134.
Benza, V., & Caldirola, P. (1981).
de Sitter microuniverse associated with the electron. Nuovo Cimento, A62, 175–185.
Bhabha, H. J. (1946). On the expansibility of solutions in powers of the interaction constant. Physical Review, 70, 759–760.
Bhabha, H. J., & Corben, H. C. (1941). General Classical Theory of Spinning Particles in a Maxwell Field. Proc. Roy. Soc. (London), A178, 273–314.
Bohm, D., & Weinstein, M. (1948). The self-oscillations of a charged particle. Physical Review, 74, 1789–1798.
Bohr, N. (1935). Can quantum-mechanical description of physical reality be considered complete? Physical Review, 48, 696–702.
Bonifacio, R. (1983). A coarse grained description of time evolution: Irreversible state reduction and time-energy relation. Lett. Nuovo Cim., 37, 481–489.
Bonifacio, R., & Caldirola, P. (1983). Finite-difference equations and quasi-diagonal form in quantum statistical mechanics. Lett. Nuovo Cim., 38, 615–619.
Bonifacio, R., & Caldirola, P. (1982). Unstable states of a finite-difference Schrödinger equation. Lett. Nuovo Cim., 33, 197–202.
Bonnor, W. B. (1974). A New Equation of Motion for a Radiating Charged Particle. Proc. Roy. Soc. (London), A337, 591–598.
Bracci, L., Fiorentini, G., Mezzorani, G., & Quarati, P. (1983). Bounds on a hypothetical fundamental length. Physics Letters, B133, 231–233.
Bunge, M. (1955). A picture of the electron. Nuovo Cimento, 1, 977–985.
Caldirola, P. (1953). Sull'equazione del moto dell'elettrone nell'elettrodinamica classica. Nuovo Cimento, 10, 1747–1752.
Caldirola, P. (1954). Spiegazione classica del momento magnetico anomalo dell'elettrone. Nuovo Cimento, 11, 108–110.
Caldirola, P. (1956). A new model of classical electron. Suppl. Nuovo Cim., 3, 297–343.
Caldirola, P. (1974). Dalla Microfisica alla Macrofisica. Milan: Mondadori.
Caldirola, P. (1976a). On the introduction of a fundamental interval of time in quantum mechanics. Lett. Nuovo Cim., 16, 151–155.
Caldirola, P. (1976b). On the Finite Difference Schrödinger Equation. Lett. Nuovo Cim., 17, 461–464.
Caldirola, P. (1977a). On the existence of a heavy lepton. Lett. Nuovo Cim., 20, 519–521.
Caldirola, P. (1977b).
The chronon and uncertainty relations in the theory of the excited-mass states of the electron. Lett. Nuovo Cim., 20, 632–634.
Caldirola, P. (1977c). Chronon in Quantum Theory. Lett. Nuovo Cim., 18, 465–468.
Caldirola, P. (1978a). The chronon in the quantum theory of the electron and the existence of the heavy leptons. Nuovo Cimento, A45, 549–579.
Caldirola, P. (1978b). The mass of the neutrinos and the Pontecorvo oscillations. Lett. Nuovo Cim., 21, 250–252.
Caldirola, P. (1979a). A relativistic theory of the classical electron. Rivista Nuovo Cim., 2(13), 1–49.
Caldirola, P. (1979b). On a relativistic model of the electron. Nuovo Cimento, A49, 497–511.
Caldirola, P. (1979c). Introduzione del cronone nella teoria relativistica dell'elettrone. In M. Pantaleo & F. de Finis (Eds.), Centenario di Einstein – Astrofisica e Cosmologia, Gravitazione, Quanti e Relatività (p. 829). Florence: Giunti–Barbera.
Caldirola, P. (1979d). Progressi nella teoria dell'elettrone. In Annuario '79, Enciclopedia EST-Mondadori (pp. 65–72). Milan: Mondadori.
Caldirola, P. (1980). The introduction of the Chronon in the electron theory and a charged-lepton mass formula. Lett. Nuovo Cim., 27, 225–228.
Caldirola, P. (1984a). Introduction of the chronon in the theory of electron and the wave-particle duality. In S. Diner et al. (Eds.), The Wave–Particle Dualism (pp. 183–213). Dordrecht: Reidel.
Caldirola, P. (1984b). A geometrical model of point electron. In Revista Brasil. de Física, special vol. Mário Schönberg on his 70th Birthday (July) (pp. 228–260).
Caldirola, P., & Montaldi, E. (1979). A new equation for quantum dissipative systems. Nuovo Cimento, B53, 291–300.
Caldirola, P., & Recami, E. (1978). The concept of Time in physics. Epistemologia, 1, 263–304.
Caldirola, P., Casati, G., & Prosperetti, A. (1978). On the Classical Theory of the Electron. Nuovo Cimento, A43, 127–142.
Casagrande, F. (1977). The introduction of the Chronon into the classical and quantum theory of the electron. Scientia, 112, 417–427.
Casagrande, F., & Montaldi, E. (1977). Some remarks on the finite-difference equations of physical interest. Nuovo Cimento, A40, 369–382.
Casagrande, F., & Montaldi, E. (1978). On the discrete-time Schrödinger equation for the linear harmonic oscillator. Nuovo Cimento, A44, 453–464.
Cheon, I. T. (1979). Special relativity on the discrete space-time and a fundamental length. Lett. Nuovo Cim., 26, 604–608.
Cirelli, R. (1955). Sul moto di un elettrone investito da un impulso elettromagnetico istantaneo. Nuovo Cimento, 1, 260–262.
Cohen-Tannoudji, C., Diu, B., & Laloë, F. (1977). Quantum Mechanics. New York: Wiley.
Cole, E. A. B. (1970). Transition from a continuous to a discrete space-time scheme. Nuovo Cimento, A66, 645–656.
Coleman, S. (1960). Classical Electron Theory from a Modern Standpoint. Reprinted in D. Teplitz (Ed.), Electromagnetism: Paths to Research (1982). New York: Plenum.
Compton, A. H. (1919). The size and shape of the electron. Physical Review, 14, 20–43 and 247–259.
Corben, H. C. (1961). Spin in Classical and Quantum Theory. Physical Review, 121, 1833–1839.
Corben, H. C. (1968). Classical and quantum theories of spinning particles. San Francisco: Holden-Day.
Corben, H. C. (1977). Particle spectrum. American Journal of Physics, 45, 658–662.
Corben, H. C. (1984). Quantized relativistic rotator. Physical Review, D30, 2683–2689.
Corben, H. C. (1993). Factors of 2 in magnetic moments, spin-orbit coupling, and Thomas precession. American Journal of Physics, 61, 551–553.
Cunningham, E. (1909). The principle of relativity in electrodynamics and an extension thereof. Proc. London Math.
Soc., 8, 77–97.
Cvijanovich, G. B., & Vigier, J. P. (1977). New extended model of the Dirac electron. Foundations of Physics, 7, 77–96.
Daneri, A., Loinger, A., & Prosperi, G. M. (1962). Quantum theory of measurement and ergodicity conditions. Nucl. Phys., 33, 297–319.
Darling, B. T. (1950). The Irreducible Volume Character of Events. I. A Theory of the Elementary Particles and of Fundamental Length. Physical Review, 80, 460–466.
DerSarkissian, M., & Nelson, T. J. (1969). Development of an S-matrix theory based on the calculus of finite differences. Nuovo Cimento, A64, 337–344.
Dirac, P. A. M. (1938a). The classical theory of electron. Proc. Royal Soc., A167, 148–169.
Dirac, P. A. M. (1938b). La théorie de l'électron et du champ électromagnétique. Ann. Inst. Poincaré, 9, 13–49.
Dirac, P. A. M. (1962). An extensible model of the electron. Proc. Roy. Soc., A268, 57–67.
Einstein, A. (1915). Der Energiesatz in der allgemeinen Relativitätstheorie. Berl. Ber., 349, 448–459.
Einstein, A., Podolsky, B., & Rosen, N. (1935). Can quantum-mechanical description of physical reality be considered complete? Physical Review, 47, 777–780.
Eliezer, C. J. (1943). The hydrogen atom and the classical theory of radiation. Proceedings of the Cambridge Philosophical Society, 33, 173–180.
Eliezer, C. J. (1950). A note on Electron Theory. Proceedings of the Cambridge Philosophical Society, 46, 199–201.
Erber, T. (1961). The classical theories of radiation reaction. Fortschritte der Physik, 9, 343–392.
Ehrlich, R. (1976). Possible evidence for the quantization of particle lifetimes. Physical Review, D13, 50–55.
Everett, H. (1957). Relative State Formulation of Quantum Mechanics. Reviews of Modern Physics, 29, 454–462.
Fermi, E. (1922). Über einen Widerspruch zwischen der elektrodynamischen und der relativistischen Theorie der elektromagnetischen Masse. Physik. Z., 23, 340–344.
Fleming, G. N. (1965). Nonlocal Properties of Stable Particles. Physical Review, B139, 963–968.
Fock, V. A. (1962). Criticism of an attempt to disprove the uncertainty relation between time and energy. Sov. Phys. JETP, 15(4), 784–786.
Ford, K. W. (1968). Electrodynamics with a quantum of length. Physical Review, 175, 2048–2053.
Frenkel, J. (1926). Die Elektrodynamik des Rotierenden Elektrons. Zeitschr. f. Phys., 37, 243–262.
Frenkel, J. I. (1926–28). Lehrbuch der Elektrodynamik, Vol. 2 (p. 248). Berlin: Springer.
Friedberg, R., & Lee, T. D. (1983). Discrete Quantum Mechanics. Nucl. Phys., B225, 1–52.
Froissart, M., Goldberger, M. L., & Watson, K. M. (1963). Spatial Separation of Events in S-Matrix Theory. Physical Review, 131, 2820–2826.
Fryberger, D. (1981). Model for the generation of leptonic mass. II. Physical Review, D24, 979–998.
Fulton, T., & Rohrlich, F. (1960). Classical radiation from a uniformly accelerated charge. Ann. of Phys., 9, 499–517.
Gallardo, J. C., Kálnay, A. J., & Risemberg, S. H. (1967). Lorentz-invariant localization for elementary systems. Physical Review, 158, 1484–1490.
George, G., Prigogine, I., & Rosenfeld, L. (1972). The macroscopic level of quantum mechanics. Dansk. Mat. Fys. Medd., 38, 1–44.
Ghirardi, G. C., Rimini, A., & Weber, T. (1985). Unified Dynamics for Microscopic and Macroscopic Systems. Physical Review, D34, 470–491.
Ghirardi, G. C., & Weber, T. (1984). Finite-difference evolution equations and quantum-dynamical semi-groups. Lett. Nuovo Cim., 39, 157–164.
Goldberger, M. L., & Watson, K. M. (1962). Concerning the Notion of 'Time Interval' in S-Matrix Theory. Physical Review, 127, 2284–2286.
Gürsey, F. (1957).
Relativistic Kinematics of a Classical Point Particle in Spinor Form. Nuovo Cimento, 5, 784–809.
Gutkowski, D., Moles, M., & Vigier, J. P. (1977). Hidden Parameter Theory of the Extended Dirac Electron. 1. Classical Theory. Nuovo Cim., B39, 193–225.
Havas, P. (1948). On the Classical Equations of Motion of Point Charges. Physical Review, 74, 456–463.
Hikasa, K., et al. (Particle Data Group) (1992). Review of particle properties. Physical Review, D45, S1–S574; D46, 5210 (erratum).
Hill, E. L. (1945). On Accelerated Coordinate Systems in Classical and Relativistic Mechanics. Physical Review, 67, 358–363.
Hönl, H. (1952). Feldmechanik des Elektrons und der Elementarteilchen. Ergeb. Exakten Naturwiss., 26, 291–382.
Hönl, H., & Papapetrou, A. (1939). Über die innere Bewegung des Elektrons – I. Z. Phys., 112, 512–540.
Hönl, H., & Papapetrou, A. (1940). Über die innere Bewegung des Elektrons – III. Z. Phys., 116, 153–183.
Huang, K. (1952). On the Zitterbewegung of the Dirac Electron. American Journal of Physics, 20, 479–484.
Hsu, J. P., & Mac, E. (1979). Fundamental length, bubble electrons and nonlocal quantum electrodynamics. Nuovo Cimento, B49, 55–67.
Jackson, J. C. (1977). A quantisation of time. J. Phys. A, 10, 2115–2122.
Jammer, M. (1954). Concepts of Space. Cambridge: Harvard Univ. Press.
Jancel, R. (1969). Foundations of Classical and Quantum Statistical Mechanics. New York: Pergamon Press.
Jannussis, A. (1984a). Connection between master equation and Lie-admissible formulation. Lett. Nuovo Cim., 39, 75–80.
Jannussis, A. (1984b). Quantum equations of motion in the finite-difference approach. Lett. Nuovo Cim., 40, 250–256.
Jannussis, A. (1984c). Caldirola–Montaldi theory for a finite-difference Dirac equation. Nuovo Cimento, B84, 27–34.
Jannussis, A. (1985a). Difference equations in the Lie-admissible formulation. Lett. Nuovo Cim., 42, 129–133.
Jannussis, A. (1985b). Some remarks on the small-distance derivative model. Nuovo Cimento, B90, 58–64.
Jannussis, A. (1990). A Lie-admissible complex time model and relativity theorem. Hadronic Journal, 13, 425–434.
Jannussis, A., Leodaris, A., Fillippakis, P., & Fillippakis, T. (1980a). Some remarks on the discrete-time Schrödinger equation. Lett. Nuovo Cim., 29, 259–262.
Jannussis, A., Leodaris, A., Fillippakis, P., & Fillippakis, T. (1980b). Quantum dissipative system and the Caldirola–Montaldi equation. Lett. Nuovo Cim., 29, 437–442.
Jannussis, A., Brodimas, G. N., Papatheou, V., Leodaris, A. D., & Zisis, V. (1981a). Perturbation theory for the solution of the Caldirola–Montaldi equation. Lett. Nuovo Cim., 31, 533–538.
Jannussis, A., Brodimas, G. N., Leodaris, A. D., Papatheou, V., & Zisis, V. (1981b). Quantum friction and the Caldirola–Montaldi procedure. Lett. Nuovo Cim., 30, 289–292.
Jannussis, A., Leodaris, A., Fillippakis, P., Fillippakis, T., & Zisis, V. (1982a). Heisenberg equations of motion in the Caldirola–Montaldi procedure. Nuovo Cimento, B67, 161–172.
Jannussis, A., Streclas, A., Leodaris, A., Patargias, N., Papatheou, V., Fillippakis, P., et al. (1982b). Some remarks on the Caldirola–Montaldi equation. Lett. Nuovo Cim., 34, 571–574.
Jannussis, A., Brodimas, G. N., Papatheou, V., Karayiannis, G., Panagopoulos, P., & Ioannidou, H. (1983a).
Lie-admissible unification of dissipative Schrödinger's equations. Lett. Nuovo Cim., 38, 181–188.
Jannussis, A., Brodimas, G. N., Papatheou, V., Karayiannis, G., Panagopoulos, P., & Ioannidou, H. (1983b). Lie-admissible unification of dissipative Schrödinger's equations. Hadronic Journal, 6, 1434–1461.
Jannussis, A., Skaltsas, D., & Brodimas, G. N. (1990). Equivalence between Caldirola–Montaldi model and Lie-admissible complex time model. Hadronic Journal, 13(89), 415–423.
Jehle, H. (1971). Relationship of Flux Quantization to Charge Quantization and the Electromagnetic Coupling Constant. Physical Review, D3, 306–345.
Kálnay, A. J. (1970). Lorentz-invariant localization for elementary systems – II. Physical Review, D1, 1092–1104.
Kálnay, A. J. (1971). Lorentz-Invariant Localization for Elementary Systems – III. Zero-Mass Systems. Physical Review, D3, 2357–2363.
Kálnay, A. J., & Torres, P. L. (1971). Lorentz-Invariant Localization for Elementary Systems – IV. The Nonrelativistic Limit. Physical Review, D3, 2977–2981.
Kitazoe, T. (1972). Structure of leptons. Lett. Nuovo Cim., 4, 196–198.
Kitazoe, T., Ishihara, M., & Nakatani, H. (1978). Dirac equation on a lattice. Lett. Nuovo Cim., 21, 59–64.
Lanz, L. (1962). On the energy conservation for the motion of a radiating electron with particular reference to a hyperbolic motion. Nuovo Cimento, 23, 195–201.
Lanz, L., Prosperi, G. M., & Sabbadini, A. (1971). Time scales and the problem of measurement in quantum mechanics. Nuovo Cimento, B2, 184–192.
Lattes, C. M., Schönberg, M., & Schützer, W. (1947). Classical theory of charged point particles with dipole moments. Anais Ac. Brasil. Cien., 19(3), 46–98.
Lee, T. D. (1983). Can Time be a discrete dynamical variable? Physics Letters, B122, 217–220.
Lee, T. D. (1987). Difference Equations and Conservation Laws. Journal of Statistical Physics, 46, 843–860.
Leiter, D. (1970). The paradox of radiation from a uniformly accelerated point electron and a consistent physical framework for its resolution. Int. J. Theor. Phys., 3, 387–393.
Levi, R. (1926). Théorie de l'action universelle et discontinue. Comptes Rendus de l'Académie, 183, 865–867.
Liebowitz, B. (1969). A model of the electron. Nuovo Cimento, A63, 1235–1246.
Likharev, K. K., & Claeson, T. (1992). Single electronics. Scient. Am., June, 50–55.
Lock, J. A. (1979). The Zitterbewegung of a free localized Dirac particle. American Journal of Physics, 47, 797–802.
Loinger, A. (1955). Rendic. Acc. Lincei, 18, 395.
Lorentz, H. A. (1892). La Théorie électromagnétique de Maxwell et son application aux corps mouvants. Arch. Néer. Sciences Exactes et Naturelles, 25, 363–552.
Lorentz, H. A. (1904). Electromagnetische verschijnselen in een stelsel, dat zich met willekeurige snelheid, kleiner dan die van het licht, beweegt. Versl. Kon. Akad. Wetensch. Amsterdam, 12, 986–1009.
Ludwig, G. (1953). Der Meßprozeß. Z. Phys., 135, 483–511.
Lundeen, S. R., & Pipkin, F. M. (1981). Measurement of the Lamb shift in Hydrogen, n = 2. Physical Review Letters, 46, 232–235.
Marx, E. (1975). Electromagnetic energy and momentum from a charged particle. Int. J. Theor. Phys., 14, 55–65.
Mathisson, M. (1937). Neue Mechanik materieller Systeme. Acta Physica Polonica, 6, 163–200.
Matore, I. M. (1981). Model of the electron and muon structure. Dubna: JINR Report P4-81-81.
McGoveran, D., & Noyes, P. (1989). Foundations of a Discrete Physics. SLAC-PUB-4526.
McGregor, M. H. (1992). The enigmatic electron. Dordrecht: Kluwer.
Meessen, A. (1970). Quantification de l'espace-temps et généralisation de l'équation de Dirac. Ann. Soc. Sci. Bruxelles, 84, 267–275.
Mie, G. (1912).
Grundlagen einer Theorie der Materie. Annalen der Physik, 342(3), 511–534.
Mignani, R. (1983). On the Lie-admissible structure of the Caldirola equations for dissipative processes. Lett. Nuovo Cim., 38, 169–173.
Mo, T. C., & Papas, C. H. (1971). New equation of motion for classical charged particles. Phys. Rev., D4, 3566–3571.
Montaldi, E., & Zanon, D. (1980). On the Green functions for the discrete-time Schrödinger equation. Lett. Nuovo Cim., 27, 215–221.
Motz, L. (1970). A gravitational theory of the mu-meson and leptons in general. Nuovo Cimento, A65, 326–332.
Nambu, Y. (1952). An empirical mass spectrum of elementary particles. Prog. Theor. Phys., 7, 595–596.
Nishijima, K., & Sato, H. (1978). Higgs–Kibble Mechanism and the Electron–Muon Mass Ratio. Prog. Theor. Phys., 59, 571–578.
Ouchi, T., & Ohmae, A. (1977). Models for the Electron and the Muon. Prog. Theor. Phys., 58, 1664–1666.
Page, L. (1918). Is a moving mass retarded by the reaction of its own radiation? Physical Review, 11, 376–400.
Page, L. (1921). Theory of the motion of electrons between co-axial cylinders taking into account the variation of mass with velocity. Physical Review, 18, 58–61.
Page, L., & Adams, N. I., Jr. (1940). Electrodynamics. New York: Van Nostrand.
Pavsic, M. (1976). Discrete Rest Masses Resulting from Relativistically Covariant Massless Field Equations in Five Dimensions. Lett. Nuovo Cim., 17, 44–48.
Pavsic, M., & Recami, E. (1982). Charge conjugation and internal space-time symmetries. Lett. Nuovo Cim., 34, 357–362.
Pavsic, M., Recami, E., Rodrigues, W. A., Maccarrone, G. D., Raciti, F., & Salesi, G. (1993). Spin and electron structure. Physics Letters, B318, 481–488.
Pavsic, M., Recami, E., & Rodrigues, W. A. (1995). Electron structure, zitterbewegung and a new non-linear Dirac-like equation. Hadronic J., 18, 97–118.
Perkins, G. A. (1976). The direction of the Zitterbewegung: a hidden variable. Foundations of Physics, 6, 237–248.
Petzold, J., & Sorg, M. (1977). Generalized Caldirola equations for the classical radiating electron. Z. Physik, A283, 207–215.
Plass, G. N. (1960). Classical Electrodynamic Equations of Motion with Radiative Reaction. Physical Review Letters, 4, 248–249.
Plass, G. N. (1961). Classical electrodynamic equations of motion with radiative reaction. Reviews of Modern Physics, 33, 37–62.
Poincaré, H. (1906). Sur la dynamique de l'électron. Rendic. Circolo Matem. di Palermo, 21, 129–175.
Poincaré, H. (1913). Dernières Pensées. Paris: Flammarion.
Pokrowski, G. I. (1928). Zur Frage nach einer Struktur der Zeit. Zeits. für Physik, 51, 737–739.
Proca, A. (1954). Mécanique du point. J. Phys. Radium, 15, 65–72.
Prosperetti, A. (1980). The Motion of a Charged Particle in a Uniform Magnetic Field. Nuovo Cimento, B57, 253–268.
Pryce, M. H. L. (1948). The Mass-Centre in the Restricted Theory of Relativity and Its Connexion with the Quantum Theory of Elementary Particles. Proc. Royal Soc. (London), A195, 62–81.
Recami, E. (1978). How to recover Causality in special relativity. Foundations of Physics, 8, 329–340.
Recami, E. (2002). Multi-verses, micro-universes and elementary particles. Report NSF-ITP-02-94 (KITP, UCSB, 2002); also appeared in e-print form as arXiv:physics/0505149.
Recami, E., & Farias, H. A. (2009). A simple quantum (finite-difference) equation for dissipation and decoherence. Nuovo Cimento, B124, 765–776.
Recami, E., Olkhovsky, V. S., & Maydanyuk, S. P. (2010).
On non-selfadjoint operators for observables in quantum mechanics and quantum electrodynamics. Int. J. Mod. Phys., A25, 1785–1818.
Recami, E., & Rodrigues, W. A. (1982). Antiparticles from special relativity with orthochronous and anti-chronous Lorentz transformations. Foundations of Physics, 12, 709–718; Errata: 13 (1983), E553.
Recami, E., Rodrigues, W. A., & Smrz, P. (1983). Some applications of non-hermitian operators in quantum mechanics and quantum field theory. Hadronic Journal, 6, 1773–1789.
Recami, E., & Salesi, G. (1994). Field theory of the spinning electron and internal motions. Physics Letters, A190, 137–143; Erratum: A195.
Recami, E., & Salesi, G. (1995). Field theory of the spinning electron: I and II. In J. Dowling (Ed.), Electron Theory and Quantum Electrodynamics: 100 Years Later (pp. 241–260). New York: Plenum. Proceedings, NATO Advanced Study Institute, Edirne, Turkey, September 5–16, 1994.
Recami, E., & Salesi, G. (1996). Field Theory of the Spinning Electron: About the New Non-Linear Field Equations. Adv. Appl. Cliff. Alg., 6(1), 27–36.
Recami, E., & Salesi, G. (1997a). Kinematics and hydrodynamics of spinning particles. In J. Keller & Z. Oziewicz (Eds.), The Theory of the Electron (pp. 253–268). Mexico City: UNAM Press.
Recami, E., & Salesi, G. (1997b). The velocity operator for spinning quantum particles. In J. Keller & Z. Oziewicz (Eds.), The Theory of the Electron (pp. 269–278). Mexico City: UNAM Press.
Recami, E., & Salesi, G. (1998a). Kinematics and hydrodynamics of spinning particles. Physical Review, A57, 98–105.
Recami, E., & Salesi, G. (1998b). A velocity field and operator for spinning particles in (nonrelativistic) Quantum Mechanics. Foundations of Physics, 28, 763–776.
Riewe, F. (1971). Generalized Mechanics of a Spinning Particle. Lett. Nuovo Cim., 1, 807–808.
Rodrigues, W. A., Vaz, J., & Recami, E. (1993). A generalization of Dirac nonlinear electrodynamics, and spinning charged particles. Foundations of Physics, 23, 469–485.
Rodrigues, W. A., Vaz, J., Recami, E., & Salesi, G. (1993). About zitterbewegung and electron structure. Physics Letters, B318, 623–628.
Rohrlich, F. (1960). Self-energy and stability of the classical electron. American Journal of Physics, 28, 639–643.
Rohrlich, F. (1965). Classical Charged Particles. Reading: Addison-Wesley.
Rosen, G. (1964). Radiative reaction and a possible theory for the muon. Nuovo Cimento, 32, 1037–1045.
Rosen, G. (1978). Approach to a theory for the masses of the electron, muon, and heavier charged leptons. Int. J. Theor. Phys., 17, 1–4.
Sachs, M. (1972a). The Electron–Muon Mass Doublet from General Relativity. Nuovo Cimento, B7, 247–264.
Sachs, M. (1972b). On the lifetime of the muon state of the electron-muon mass doublet. Nuovo Cimento, B10, 339–347.
Salesi, G., & Recami, E. (1994). On the field theory of the extended-like electron. Report IC/94/185. Trieste: ICTP.
Salesi, G., & Recami, E. (1995). About the kinematics of spinning particles. Report INFN/AE-95/16. Frascati: INFN.
Salesi, G., & Recami, E. (1996). Field theory of the extended-like electron. In P. I. Pronin & G. A. Sardanashvily (Eds.), Particles, Gravity and Space-Time (pp. 345–368). Singapore: World Scientific.
Salesi, G., & Recami, E. (1997a). The velocity operator for spinning particles in quantum mechanics. Adv. Appl. Cliff. Alg., 7, 269–278.
Salesi, G., & Recami, E. (1997b). Hydrodynamical reformulation and quantum limit of the Barut–Zanghi theory. Foundations of Physics Letters, 10, 533–546.
Salesi, G., & Recami, E. (1998).
A Velocity Field and Operator for Spinning Particles in (Nonrelativistic) Quantum Mechanics. Invited paper for the issue in memory of A. O. Barut. Foundations of Physics, 28, 763–776.
Santilli, R. M. (1979a). Status of the mathematical and physical studies on the Lie-admissible formulation as of July 1979, with particular reference to strong interactions. Hadronic Journal, 2, 1460–2019.
Santilli, R. M. (1979b). Initiation of the representation theory of a Lie-admissible algebra of operators on a bimodular Hilbert space. Hadronic Journal, 3, 440–506.
Santilli, R. M. (1981a). An intriguing legacy of Einstein, Fermi, Jordan, and others: The possible invalidation of quark conjectures. Foundations of Physics, 11, 383–472.
Santilli, R. M. (1981b). Generalization of Heisenberg uncertainty principle for strong interactions. Hadronic Journal, 4, 642–657.
Santilli, R. M. (1981c). Experimental, theoretical, and mathematical elements for a possible Lie-admissible generalization of the notion of particle under strong interactions. Hadronic Journal, 4, 1166–1257.
Santilli, R. M. (1983). Foundations of Theoretical Mechanics – Vol. II: Birkhoffian Generalization of Hamiltonian Mechanics. New York and Heidelberg: Springer.
Schenberg, M. (1945). Classical Theory of the Point Electron. Physical Review, 69, 211–224.
Schott, G. A. (1912). Electromagnetic Radiation. Cambridge: Cambridge Univ. Press.
Schrödinger, E. (1930). Über die kräftefreie Bewegung in der relativistischen Quantenmechanik. Sitzungsber. Preuss. Akad. Wiss. Phys. Math. Kl., 24, 418–428.
Schrödinger, E. (1952). Are There Quantum Jumps? Part I. The British Journal for the Philosophy of Science, 3, 109–123.
Snyder, H. S. (1947). Quantized space-time. Physical Review, 71, 38–41.
Sorg, M. (1976). The problem of radiation reaction in classical electrodynamics. Zeitschrift für Naturforschung, 31A, 1133–1142.
Sudarshan, E. C. G. (1961). Theory of Leptons I. Nuovo Cimento, 21, 7–28.
Tati, T. (1964). Concepts of Space-Time in Physical Theories. Progress of Theoretical Physics Supplement, 29, 1–96.
Tennakone, K., & Pakvasa, S. (1971). Discrete Scale Transformations and a Possible Lepton Mass Spectrum. Physical Review Letters, 27, 757–760.
Tennakone, K., & Pakvasa, S. (1972). A Model of Leptons. Physical Review, D6, 2494.
Thomson, J. J. (1881). On the electric and magnetic effects produced by the motion of electrified bodies. Philosophical Magazine, 11, 229–249.
Thomson, J. J. (1925–26). The intermittence of electric force. Proceedings of the Royal Society of Edinburgh, 46, 90–115.
Vasholz, D. P. (1975). Discrete space quantum mechanics. Journal of Mathematical Physics, 16, 1739–1745.
von Laue, M. (1909). Die Wellenstrahlung einer bewegten Punktladung nach dem Relativitätsprinzip. Annalen der Physik, 28, 436–442.
Welch, L. O. (1976). Quantum mechanics in a discrete space-time. Nuovo Cimento, B31, 279–288.
Weyssenhof, J., & Raabe, A. (1947). Relativistic dynamics of spin-fluids and spin-particles. Acta Physica Polonica, 9, 7–25.
Wheeler, J. A. (1957). Assessment of Everett's 'Relative state' formulation of quantum theory. Reviews of Modern Physics, 29, 463–465.
Wheeler, J. A., & Feynman, R. (1945). Interaction with the absorber as the mechanism of radiation. Reviews of Modern Physics, 17, 157–181.
Wigner, E. P. (1963). The problem of measurement. American Journal of Physics, 31, 6–15.
Wolf, C. (1987a). Possible implications of the discrete nature of time in electron spin polarization measurements. Nuovo Cimento, B100, 431–434.
Wolf, C. (1987b). Electron spin resonance as a probe to measure the possible discrete nature of Time as a dynamical variable.
Physics Letters, A123, 208–210. Wolf, C. (1989a). Discrete time effects and the spectrum of hydrogen. European Journal of Physics, 10, 197–199. Wolf, C. (1989b). Modification Of The Mass Momentum Energy Relation Due To Non-local Discrete Time Quantum Effects. Nuovo Cimento, B103, 649–654. Wolf, C. (1990a). Testing discrete quantum mechanics using neutron interferometry and the superposition principle: A gedanken experiment. Foundations of Physics, 20, 133–137. Wolf, C. (1990b). Wave packet anomalies due to a discrete space, discrete time modified Schrdinger equation. Annales De La Fondation Louis De Broglie, 15, 189–194. Wolf, C. (1990c). Probing the possible composite structure of leptons through discrete time effects. Hadronic J, 13, 22–29. Wolf, C. (1990d). Spin flip spectra of a particle with composite dyon structure is discrete time quantum theory. Annales De La Fondation Louis De Broglie, 15, 487–495. Wolf, C. (1992a). Space, mutated spin correlation induced by a non-local discrete time quantum theory. Hadronic Journal, 15, 149–161. Wolf, C. (1992b). Countability versus Continuum in the quantum mechanics of a free particle. In H. C. Myung (Ed.), Proceedings of the Fifth International Conference on Hadronic Mechanics and nonpotentials interactions (pp. 417–428). Commack: Nova Science. Wolf, C. (1994). Upper limit for the mass of an elementary particle due to a discrete-time quantum mechanics. Nuovo Cimento, B109, 213–220. Yaghjian, A. D. (1989). A classical electro-gravitational model of a point charge with a finite mass. Proc. URSI symposium on electromagnetic theory, 322–324.
Consequences for the Electron of a Quantum of Time
115
Yaghjian, A. D. (1992). Relativistic Dynamics of a Charged Sphere (Updating the Abraham–Lorentz model). Berlin: Springer. Yukawa, H. (1966). Atomistics and the divisibility of space and time. Suppl. Prog. Theor. Phys., 37–38, 512–523. Zin, G. (1949). Su alcune questioni di elettrodinamica classica relative al moto dell’elettrone. Nuovo Cimento, 6, 1–23.
Chapter 3

Methods and Limitations of Subwavelength Imaging

Andrew Neice
Contents

1. Introduction
2. Overview of Subwavelength Imaging
   2.1. Types of Subwavelength Imaging
   2.2. Evanescent Waves
   2.3. Current Subwavelength Imaging Technologies
3. Pendry's Superlens and Metamaterials
   3.1. Introduction to Pendry's Superlens
   3.2. Limitations on the Resolution of the Pendry Superlens
4. Generalized Limits on Subwavelength Imaging
   4.1. A Model System
   4.2. Resolution Limits in the Model System
   4.3. Alternative Explanation for the Resolution Limit
5. Summary and Conclusion
References
1. INTRODUCTION

The well-known diffraction limit on imaging, the Rayleigh criterion,

\Delta l = 1.22 f \lambda / D, \qquad (1)
states that the maximum achievable resolution is proportional to the wavelength of the light used to form the image.

[Author affiliation: Stanford University Medical Center, Stanford, California, USA. Advances in Imaging and Electron Physics, Volume 163, ISSN 1076-5670, DOI: 10.1016/S1076-5670(10)63003-0. Copyright © 2010, Elsevier Inc. All rights reserved.]

Depending on the exact
experimental setup, and what exactly is meant by "resolution", using Sparrow's limit or Abbe's limit may be more appropriate. In all of these cases, however, the maximum resolution is proportional to the wavelength of the light used for imaging.

These limits have proved to be of immense practical importance. When the Rayleigh criterion was developed, optical technology consisted predominantly of traditional lenses, mirrors, pinholes, and so forth, and operated mostly in the visible regime. Since then, however, imaging technology has been extended outside the visible regime into other areas of the electromagnetic spectrum, from radio waves to X-rays. Not surprisingly, lenses and mirrors have been supplemented by new ways of obtaining image information from electromagnetic waves, such as relatively compact X-ray diffraction machines and enormous radio telescopes. With the development of electron microscopy, images have been obtained using entities that were not even known to be waves in Rayleigh's time. Despite all these technological advancements, however, it remained true that the resolution of the image was fundamentally proportional to the wavelength of the radiation used to obtain it.

Recently, however, interest in obtaining subwavelength resolution has increased due to technological pressure in a variety of fields. In particular, the microelectronics industry, health care, and biological research are strongly driven to develop subwavelength imaging technology. The semiconductor industry, for example, uses photomask techniques to etch an image of a circuit onto a silicon wafer. The industry's drive to continue miniaturization of components necessitates the use of either higher and higher frequencies of electromagnetic (EM) radiation (which can introduce technological problems of its own) or a method of projecting subwavelength images onto the wafer. The biological sciences also have a vested interest in subwavelength imaging.
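To give a sense of scale for these limits, Eq. (1) can be evaluated numerically. The following sketch uses illustrative values (the wavelength, focal ratio, and numerical aperture are assumptions, not taken from the text) and also evaluates the related Abbe limit:

```python
# Illustrative check of the diffraction limits discussed above; the specific
# optical parameters are assumed values, not from the text.

def rayleigh_limit(f, D, lam):
    """Smallest resolvable separation per Eq. (1): 1.22 * f * lambda / D."""
    return 1.22 * f * lam / D

def abbe_limit(lam, NA):
    """Abbe limit for a microscope objective of numerical aperture NA."""
    return lam / (2.0 * NA)

lam = 550e-9                             # green light (assumed)
print(abbe_limit(lam, 1.4))              # ~2.0e-7 m for a high-NA objective
print(rayleigh_limit(0.01, 0.01, lam))   # f/1 system: ~6.7e-7 m
```

For a high-numerical-aperture objective the Abbe limit comes out near 200 nm, consistent with the light-microscopy limit quoted for biological imaging.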
The limit of light microscopy is 200 nm, which is enough to resolve bacteria and some cellular detail in eukaryotic cells, but not viruses or the detailed structure of cellular organelles. Transmission electron microscopy can resolve features down to 50 pm (Erni et al., 2009), enough to resolve organelles and viruses. The utility of transmission electron microscopy in resolving the function, as opposed to the structure, of cellular organelles is another matter entirely. Tissue samples need to be finely sectioned and often coated with a conductive material. In addition, the exposure to high vacuum and electron bombardment makes imaging of living tissue problematic.

Given these limitations, multiple alternatives have been developed. A common technique for studying biological systems is to develop an antibody to a protein of interest and then attach a fluorescent molecule to this antibody. The investigator can then trace which proteins are expressed in which organelles at what times. Because these molecules fluoresce in the visible spectrum (and are damaged, along
with other biological tissues, by X-ray or electron bombardment) they must be imaged in the visible regime, with the associated resolution limits.

Medical imaging techniques would similarly benefit from subwavelength techniques. Making useful diagnoses based on gross anatomic features requires images with a resolution of 0.1-1 mm. Unfortunately, the human body is opaque between the microwave and the ultraviolet regimes, so X-rays or gamma rays are used to obtain a useful image. All common medical imaging techniques (conventional X-rays, computed tomography [CT] scans, and nuclear medicine scans) use X-rays or gamma rays. The only exceptions are ultrasound imaging and magnetic resonance imaging (MRI), which are discussed in the next section. Obviously, exposing patients to ionizing radiation is concerning. Although the amount of radiation exposure in a typical plain radiograph is trivial, CT scans carry a significant radiation dose. The radiation dose of a CT scan can exceed 500 millirem, compared with 10 millirem for a single chest X-ray. (For reference, the average annual cumulative radiation exposure of an individual from natural sources is 300 millirem.) It has been suggested that although this radiation exposure causes only a very small increase in cancer rates, given the ubiquity of CT scans many new malignancies are potentially created (Brenner and Hall, 2007). Subwavelength imaging techniques are attractive because of their potential to use sub-microwave-frequency EM radiation while still providing sufficient resolution to allow medical diagnosis.

These examples are perhaps but a small subset of the potential applications of subwavelength imaging. As with conventional, wavelength-limited imaging, practical and theoretical advances are occurring simultaneously in the field of subwavelength imaging. Numerous applications have preceded the development of any comprehensive theory.
This review attempts to clarify the commonalities among the different types of subwavelength imaging and to identify any underlying theory. In particular, limits on subwavelength imaging, analogous to Eq. (1), are presented.
2. OVERVIEW OF SUBWAVELENGTH IMAGING

2.1. Types of Subwavelength Imaging

Before discussing the physics of subwavelength imaging, the field must be defined more precisely. For example, magnetic resonance imaging, although not generally considered "subwavelength", nevertheless reconstructs an image at submillimeter resolution based on EM signals in the microwave frequency range, and thus in a broad sense could be considered
subwavelength imaging. Although space does not allow a detailed discussion of MRI, it involves applying an external magnetic field to a sample and measuring the intensity of its magnetic resonance, which, at the magnetic field intensities usually used for medical imaging, results in signals in the microwave regime. The intensity of magnetic resonance at different spatial points can be calculated by introducing inhomogeneities into the underlying magnetic field; thus, the key to obtaining spatial resolution is not the microwave radiation measured at the detector, but the introduction of quasi-static inhomogeneities in the magnetic field.

MRI has not traditionally been thought of as a subwavelength imaging technique, so the previous example may seem superfluous. However, numerous other techniques are described as subwavelength. One example is photoactivated localization microscopy. In this technique (Patterson et al., 2007), a fluorescent protein is localized to one part of a cell, generally by attaching the protein to an antibody that in turn binds to some cell structure of interest. The protein is excited and the emitted fluorescence photon is recorded. This process is repeated multiple times. It is assumed that the photons are emitted in a Gaussian distribution about the true location of the protein. After sufficient data points have been recorded, the true mean position and the uncertainty in the position are calculated using statistical techniques. This technique, of course, is not viable if two fluorescent proteins are relatively close to each other (that is, within a wavelength of one another), because it is uncertain to which of the proteins the emission should be assigned when an emitted photon is detected. Multiple experimental techniques are available to overcome this difficulty. The easiest is simply to dilute the concentration of the fluorescent protein enough that any given protein is unlikely to have close neighbors.
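The statistical localization step just described can be sketched as follows. The emitter position, Gaussian spread, and photon counts are hypothetical, but they illustrate how averaging N detections shrinks the uncertainty by a factor of the square root of N:

```python
import random
import statistics

# Sketch of the localization statistics in photoactivated localization
# microscopy (all numbers are assumed): each detected photon scatters about
# the true emitter position with a diffraction-limited Gaussian spread, and
# the mean of N detections localizes the emitter far more precisely.

random.seed(0)
true_x = 0.0      # true emitter position (nm), assumed
sigma = 200.0     # diffraction-limited spread (nm), assumed

for n in (10, 100, 10000):
    photons = [random.gauss(true_x, sigma) for _ in range(n)]
    estimate = statistics.mean(photons)    # localized position
    uncertainty = sigma / n ** 0.5         # standard error of the mean
    print(n, round(estimate, 1), round(uncertainty, 1))
```

With 10,000 photons the position uncertainty falls to about 2 nm, two orders of magnitude below the 200-nm spread, which is the sense in which the technique achieves subwavelength localization.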
The other, more advanced technique is to use multiple varieties of fluorescent proteins that are activated under differing conditions. In this case, although a fluorescent protein may be close to another fluorescent protein, it is unlikely to be close to one that fluoresces under similar circumstances, and imaging can proceed separately for each type of fluorescent protein. At the end of the process, a composite image of all the different varieties of fluorescent protein is recombined to show the image at subwavelength resolution.

Both modalities use certain externalities to accomplish imaging. In this review, we propose to divide subwavelength imaging into two classes: true subwavelength imaging and functional subwavelength imaging. The former is the primary concern of this review; the latter includes imaging techniques such as MRI and photoactivated localization microscopy. The latter group (although of obvious practical and economic importance) includes such an inhomogeneous group of techniques that developing broad theoretical constraints on these imaging techniques
may be impossible and is beyond the scope of this review. For the most part, this document is restricted to images based on EM phenomena. This eliminates other potential forms of subwavelength imaging, for example, sonic subwavelength imaging. The derivation of imaging constraints in the following sections could, however, be applied to any source of information that decreases in intensity as the inverse of the radius squared. For example, pressure disturbances could easily substitute for charges in the following discussion.

In any case, a proposed definition of true subwavelength imaging should be reminiscent of the criteria for a conventional, wavelength-limited optical system: the subwavelength system should be able to (1) image an unrestricted variety of images and (2) use transmitted, reflected, or emitted radiation. If arbitrary restrictions must be added to the subwavelength system to accurately reconstruct images, then it would be more accurately described as a functional system. In this vein, the following rules for a true subwavelength system are proposed:

1. The image is created based on waves or fields, EM or otherwise, that are produced, reflected, refracted, absorbed, or otherwise transmitted by the plane being imaged, and the detection and imaging system is agnostic as to the specific interaction with the radiation. In particular, it is not reliant on any particular nonlinear optical effects.
2. The distribution of matter within the imaged plane can be any arbitrary distribution and still allow imaging.
3. The fields or waves described in rule (1) are the only means of interaction between the imaging device and the sample.

Imaging that is otherwise subwavelength but violates one of these rules is considered functional subwavelength imaging. MRI violates rule (3) because the magnetic field gradients that alter the precession rate are necessary to produce the image.
Photoactivated localization microscopy violates rule (2) because the fluorescent proteins must be dispersed widely enough that the photons associated with their fluorescence can be associated with only one emitting protein. These imaging modalities, therefore, are described as functional, not true, subwavelength imaging.

A survey of the literature indicates that the majority of subwavelength imaging techniques are, in fact, functional. In particular, the need for subwavelength images in the biological sciences has spawned a plethora of clever techniques, which are too numerous to describe in detail here but are named briefly for interested readers. Other common modalities classified as functional subwavelength imaging include stimulated emission depletion microscopy, structured illumination microscopy, and reversible saturable optical fluorescence transitions (RESOLFT) microscopy. Like photoactivated localization microscopy, most of these techniques use fluorescent proteins and are useful in the visible light regime. Many use
nonlinearities in the behavior of the fluorescent proteins, such as photobleaching and saturation, and hence violate rule (1).

True subwavelength imaging techniques appear to be rarer. Here two experimentally realized techniques are discussed in some detail: the Pendry superlens and near-field scanning optical microscopy. Before discussing the details of the techniques, however, the common underlying principles of physics are reviewed.
2.2. Evanescent Waves

The common feature of current true subwavelength imaging technologies is their use of the information contained in evanescent waves; as such, the definition and features of evanescent waves are discussed briefly. A concrete example of evanescent waves is the case of total internal reflection due to Snell's law, a situation described in many excellent undergraduate physics texts (Griffiths, 1999) and outlined here for clarity.

In the case of total internal reflection, with light propagating from a medium with a high refractive index to a lower one, the propagating light completely reflects and none is propagated into the lower refractive index material. However, boundary conditions require that there must be some EM field in the lower refractive index material, because otherwise there would be a discontinuity in the fields. Fortunately, the same results used to solve for the fields in the case of refraction may be used, except that now \cos\theta_T, where \theta_T is the angle of refraction, is imaginary. This changes the solution from a refracted, propagating wave to an exponentially decaying wave.

Specifically, if an EM wave incident at an angle \theta_I, polarized perpendicular to the plane of incidence, with wave vector \hat{k}_I = k_I(\sin\theta_I \hat{x} + \cos\theta_I \hat{z}), passes between two materials of index of refraction n_1 and n_2, it will be totally internally reflected if \sin\theta_T = (n_1/n_2)\sin\theta_I > 1. The EM fields in the lower refractive index material can be solved for, and will be

E = E_0 e^{-\kappa z} \cos(kx - \omega t)\, \hat{y}

B = (E_0/\omega)\, e^{-\kappa z} [\kappa \sin(kx - \omega t)\, \hat{x} + k \cos(kx - \omega t)\, \hat{z}],

where

\kappa = (\omega/c) \sqrt{(n_1 \sin\theta_I)^2 - n_2^2}
and

k = \omega n_1 \sin\theta_I / c.

This is a wave propagating parallel to the interface and decaying exponentially in the z-direction: an example of an evanescent wave. A similar set of solutions arises for EM radiation polarized parallel to the interface. Several interesting features are noteworthy. If the Poynting vector is constructed, this wave carries no energy in the z-direction. Because of their exponential decay, evanescent waves can be detected only if the detector is extremely close to the interface. One method of accomplishing this is to place a second interface with a discontinuity in the index of refraction close to the first interface at which total internal reflection occurred. If the boundary conditions are again solved for, propagating waves are re-formed at the second interface in a process analogous to quantum mechanical tunneling.

EM disturbances can be thought of as creating propagating EM waves, which do not decay in amplitude with distance, and evanescent waves, which decay exponentially with distance from the source. Reconstructing an image requires use of all the available information contained in the EM fields. Lenses and other conventional optical technology can refocus the propagating EM waves, but evanescent waves, which decay exponentially from their source, are unrecoverable with conventional optical technology. The process of discarding the evanescent waves leads to the diffraction limit.

To assert this, the procedure outlined by Pendry is followed (Pendry, 2000). Consider a one-dimensional pattern of radiators along the x-axis that emit S-polarized plane waves in the z-direction. If the pattern of radiators is written as a Fourier series over the wave vectors of the image k_x, then the electric field component of the radiation can be expressed as

E(x, y, z) = \sum_{k_x} E(k_x)\, e^{i(k_x x + k_z z - \omega t)}. \qquad (2)

The radiation must follow the homogeneous wave equation, leading to the dispersion relation

k_z = \sqrt{\omega^2/c^2 - k_x^2}. \qquad (3)

For k_x < \omega/c, k_z is real and Eq. (2) yields propagating waves. For k_x > \omega/c, k_z is imaginary and Eq. (2) yields evanescent waves. The recovery of the evanescent waves, which allows the imaging system to reconstruct the high-k_x, high-spatial-frequency components of the image, can be accomplished in several ways. The most obvious method is simply to place a detector quite close to the image to obtain signal information before much decay can
occur. Pendry's superlens approaches the problem differently, using a negative index of refraction to create lenses that focus evanescent waves. Regardless of the technique used to recover the evanescent waves, it is apparent from the above equations that the information contained within them must be recovered to reconstruct the image.

The electric field in Eq. (2) is quite nonspecific: there are no spatial or other restrictions, and the original cause of the EM oscillations need not be specified. In fact, the above derivation could be performed for other types of waves, such as sonic waves. Rules 1 through 3 would not be violated by an oscillating electric field of this nature. Therefore, the above derivation is applicable to true subwavelength imaging, and the essence of true, as opposed to functional, subwavelength imaging is recovery of evanescent waves.
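A minimal numeric sketch of the classification implied by Eq. (3); the wavelength and feature sizes below are assumed example values, not from the text:

```python
import cmath
import math

# Sketch of Eq. (3): spatial frequencies k_x below omega/c give a real k_z
# (propagating waves); those above give an imaginary k_z (evanescent waves)
# with a finite 1/e decay length. Values are illustrative assumptions.

c = 3.0e8
lam = 500e-9                   # assumed free-space wavelength
omega = 2 * math.pi * c / lam
k0 = omega / c                 # = 2*pi/lam

def kz(kx):
    """Dispersion relation of Eq. (3)."""
    return cmath.sqrt(k0 ** 2 - kx ** 2)

# A feature of size a contributes spatial frequencies up to roughly pi/a.
for a in (400e-9, 100e-9):     # super- and subwavelength feature sizes
    k = kz(math.pi / a)
    if k.imag == 0:
        print(a, "propagating")
    else:
        print(a, "evanescent, 1/e decay length:", 1 / k.imag, "m")
```

The subwavelength (100-nm) feature decays with a length of only a few tens of nanometers, which is why a detector or superlens must sit well within a wavelength of the object to recover it.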
2.3. Current Subwavelength Imaging Technologies

Having established a more precise definition of subwavelength imaging, the current experimental techniques used to generate true subwavelength images can now be described. The earliest practical technique used to overcome the diffraction limitation was near-field scanning optical microscopy, proposed by E. H. Synge in 1928 and first practically realized by Ash and Nicholls in 1972. In this technique, a small sharpened transparent probe, coated in metal and with a small aperture at the tip, is advanced very close to a specimen. The size of the aperture and the distance to the specimen are both much smaller than the wavelength of light used for imaging. In a process similar to atomic force microscopy, the tip is then rastered over the surface of the sample. Photons incident on the probe are transmitted by an optical fiber to a photodetector. This process is shown schematically in Figure 1.
FIGURE 1 A schematic drawing of near-field scanning optical microscopy. The probe (A) is rastered over the sample (B). Illumination may come from behind the sample (D), through the probe, or above the surface (C). The aperture size of the probe and the distance to the sample are smaller than the wavelength of the light used for illumination.
There are multiple ways to illuminate the sample: transillumination, illumination with light coming through the tip of the probe, or reflection of light from outside the probe; likewise, the setup can be used for imaging based on absorbance, reflectance, and so on. There are no restrictions on the matter being imaged and no interactions with the system other than the interaction with the radiation; thus, this system is considered a true subwavelength imaging system. This system can obtain subwavelength information because the probe tip is so close to the sample and the aperture is so small that the system obtains information about both propagating waves leaving the surface and evanescent waves. As noted in the last section, all true subwavelength imaging systems use the information contained in evanescent waves: the propagating waves encode spatial information at superwavelength resolution, and the evanescent waves encode it at subwavelength resolution. This system is not subject to the diffraction limit because the information in the evanescent waves is recovered. Of note, this system has no focusing mechanism. If there were samples at multiple depths, there would be no way to distinguish the data from one plane versus the other. As will be demonstrated, this is a general limitation of all subwavelength imaging systems.

A more recent subwavelength imaging system uses the Pendry superlens, which is introduced here but discussed in more detail later. The superlens is a slab of material with simultaneously negative values of permittivity and permeability, leading to a negative index of refraction. This creates a number of unusual optical properties, the most interesting of which from a subwavelength imaging perspective is the reversal of the exponential decay of evanescent waves. The properties of such a material were first described by Veselago (1968), although at the time it was a theoretical exercise because no such materials had been identified.
Veselago initially believed that naturally occurring materials might be found with these properties; to date, this has not occurred. Although no naturally occurring materials have been shown to have negative indices of refraction, there is no theoretical basis to prohibit this; so far, however, only artificial materials have been constructed with negative indices of refraction. Metamaterials, such as arrays of resonators often constructed on printed circuit boards, have been shown to have a negative index of refraction at some particular frequency, usually in the microwave regime. The first such metamaterial was proposed by Smith and colleagues more than 30 years after Veselago's work (Smith et al., 2000). Numerous examples of metamaterials with negative indices of refraction in the microwave regime have subsequently been realized, and superresolution has been repeatedly demonstrated experimentally. At higher frequencies, thin films of silver have successfully been used to create superlenses.
Fang et al. (2005) used this technique to successfully demonstrate superresolution in the visible spectrum, attaining images with 60-nm resolution.
3. PENDRY'S SUPERLENS AND METAMATERIALS

3.1. Introduction to Pendry's Superlens

Pendry's superlens remains one of the most studied subwavelength imaging techniques, and the first laws proposed as limits on the resolution of subwavelength images were developed as part of the description of Pendry's superlens. A brief description of the superlens is provided here so that the limitations on its resolution may be better understood. The field has developed to the point that a number of excellent books (Caloz and Itoh, 2006; Eleftheriades and Balmain, 2005; Marques et al., 2008; Ramakrishna and Grzegorczyk, 2009) are available for a more detailed description of superlenses.

The superlens was proposed based on Veselago's studies of negative index of refraction materials. Materials can be classified into four categories based on their effective values of permittivity and permeability. Most common isotropic materials have positive values of permittivity and permeability; solving Maxwell's equations in these media results in the usual forward-propagating waves. Metals at optical frequencies or plasmas can have negative values of permittivity combined with positive values of permeability. Ferrimagnetic materials can have negative permeabilities with positive permittivities. Both of these situations result in imaginary values for the index of refraction, n = \sqrt{\mu_r \epsilon_r}. Negative index of refraction materials are those with simultaneously negative values of permittivity and permeability (here written as the relative permittivity and permeability).

Given the definition of the index of refraction above, it would appear that materials with negative values of permittivity and permeability could have a positive index of refraction (and, for that matter, materials with positive permittivity and permeability could have a negative index of refraction).
Solving Maxwell's equations with negative permittivity and permeability results in a wave vector opposite in sign to that of forward-propagating waves, however. If n = \sqrt{\mu_r \epsilon_r} = ck/\omega, this implies that the positive value of the square root should be used for the case with positive permittivity and permeability, and the negative value for the case with negative permittivity and permeability (Caloz and Itoh, 2006).

The reversal of the wave vector for materials with negative permittivity and permeability results in a rearrangement of the spatial relationship between the electric and magnetic field vectors and the wave vector. This leads to the term left-handed materials to describe these materials, as
opposed to conventional right-handed materials. The term left-handed materials is used here, although a number of synonyms have been proposed, including double-negative materials, Veselago media, negative refractive index materials, backward wave materials, and negative phase velocity media. Regardless of the term, the reversal of the wave vector also dictates that the phase and group velocities will be opposite in sign to maintain causality. The relationship between the Poynting vector and the direction of propagation is also reversed. A number of other properties were predicted by Veselago (Caloz and Itoh, 2006), as follows:

1. Reversal of a number of common optical and wave phenomena, including the Doppler effect, Snell's law, and Vavilov-Cerenkov radiation.
2. Consistent with the reversal of Snell's law, negative refraction at the interface between a right-handed and a left-handed medium.
3. Interchange of convergence and divergence effects in convex and concave lenses when the lens is left-handed.
4. The permittivity and permeability must have a frequency dependence; it is impossible to create a nondispersive negative index of refraction material, as this would result in the same phase and group velocity.
5. Reversal of the boundary conditions that relate the normal components of EM fields at the boundary between a right-handed and a left-handed medium.

The property of negative refraction also leads to the prediction that a slab of material can act as a lens. Consider a negative index of refraction slab, a Veselago lens. The reversal of Snell's law indicates that rays of light from a source will be refracted at a negative angle relative to their angle of incidence. If the slab is chosen such that n_1 = -n_2, the rays will be refracted at exactly the necessary angle to focus within the left-handed material. Incident waves striking the lens at angle \theta will be refracted to angle -\theta and hence will be focused within the slab at a depth l, where l is the distance of the object from the slab. The same phenomenon occurs when they leave the left-handed material, and there is another focus outside the lens. The distance to the internal and external focus can be calculated by basic trigonometry. Figure 2 illustrates this phenomenon.

Aside from its intrinsic novelty and freedom from spherical aberration, it would not appear that this type of lens has any superlative properties. Initially it was thought that this lens could focus only propagating waves (although perhaps it may be more accurate to say that no one had considered the possibility that it could focus evanescent waves). It was first pointed out by Pendry (2000) that a special subset of the Veselago lens, with \epsilon_r = -1, \mu_r = -1, heretofore referred to as the Pendry superlens, was able to focus both evanescent waves and propagating waves.

Evanescent waves in the Pendry superlens grow in magnitude as they pass through the lens; then as they exit the lens, they are again
FIGURE 2 Double-focusing effect in a flat lens with the distance to the focal points as shown. A lens without perfect negative matching between the indices of refraction is shown for comparison on the right.
attenuated. If the distance between the matter to be imaged and the image sensor is twice the thickness of the Pendry superlens, the evanescent fields will be exactly reproduced at the focus. Similarly, since the Pendry lens is itself a Veselago lens, all of the propagating components are likewise focused. Because all EM fields present in the initial image are exactly reproduced on the opposite side of the Pendry lens, the image can be exactly reproduced with no limits whatsoever on resolution. The focusing is illustrated in Figure 3.
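The flat-lens geometry behind this claim can be checked with a short ray trace; this is a sketch with illustrative function names and values, not code from the text. With Snell's law reversed at both faces, every ray from an on-axis source a distance l in front of a slab of thickness d crosses the axis at z = 2l (inside the slab) and again at z = 2d, measured from the source:

```python
import math

# Hypothetical ray trace through a flat Veselago lens (n2 = -n1). The angle
# flips sign at the front face and flips back at the rear face, producing one
# focus inside the slab and one outside, independent of the ray angle.

def axis_crossings(l, d, theta):
    t = math.tan(theta)
    h_front = l * t                   # ray height at the front face (z = l)
    z_inside = l + h_front / t        # slope -t inside -> crossing at 2*l
    h_rear = h_front - d * t          # height at the rear face (z = l + d)
    z_outside = (l + d) - h_rear / t  # slope +t outside -> crossing at 2*d
    return z_inside, z_outside

# The crossings do not depend on theta: no spherical aberration.
for theta in (0.1, 0.3, 0.6):
    print(axis_crossings(1.0, 3.0, theta))   # each close to (2.0, 6.0)
```

The angle independence is the freedom from spherical aberration noted earlier, and the external crossing at z = 2d restates that the source-to-image distance is twice the slab thickness.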
3.2. Limitations on the Resolution of the Pendry Superlens

The Pendry superlens is initially concerning in several respects. The exponential growth in the fields is troubling, because it could imply increasing transfer of energy with distance from the interface. However, the Poynting vector for evanescent waves has no z-component, so there is no increased transfer of energy with the growth in amplitude of the waves.

In conventional optics, propagating waves are focused into a three-dimensional (3D) point, but with a radius not smaller than a half wavelength. If the Pendry superlens were taken to be otherwise equivalent to conventional optical systems, one might assume it had the ability to focus an arbitrarily large amount of energy on an arbitrarily small spot. This would be troubling because it appears to contradict the uncertainty principle (Williams, 2001). However, the Pendry superlens cannot focus energy in three dimensions. From the uniqueness theorem, the fields in the region beyond 2d must be equivalent to those that would have been generated in the x-interval between 0 and ∞ if the lens and imaging system were not present (Marques et al., 2008). However, that does not imply that the region x < 2d must match the region x < 0. Intuitively it could not, as the magnitude of the evanescent fields is increasing as one travels from the
Methods and Limitations of Subwavelength Imaging
FIGURE 3 Focusing of propagating (upper) and evanescent (lower) waves in the Pendry superlens. In the evanescent waves, there is initially an exponential decay, followed by an exponential growth, followed by another decay. The magnitude of the evanescent waves is exactly the same in the image to the right as at the source, but it is not focused in a three-dimensional sense.
imaging plane to the far end of the lens, while the magnitude would decrease moving away from the matter to be imaged. Therefore, 3D focusing is impossible—implying that the Pendry superlens can reproduce images only when the matter to be imaged is limited to a single plane. Another method of reaching this conclusion is based on the effect of evanescent waves from multiple planes. The propagating waves can be focused at one 3D point; however, the evanescent waves are continually decaying and are of the same magnitude as in the original matter being imaged only at a particular point. If there are multiple planes emitting radiation, the imaging plane will have evanescent waves from all the planes combined. The inability to focus in three dimensions is shared with near-field scanning optical microscopy, and, as shown below, appears to be a general characteristic of subwavelength imaging. Another concern arises due to the increasing electric fields as the opposite end of the slab is approached. Very high-frequency evanescent
Andrew Neice
waves, which are very rapidly attenuated outside the slab, grow rapidly within it. Specifically, they grow by a factor e^{2KD}, where D is the difference between the slab thickness and the distance of the object to the slab, and

K = sqrt(k_x^2 + k_y^2 − ω^2/c^2),

analogous to Eq. (3). For extremely large wave vectors, the electric field diverges, as does the energy stored in these fields (Garcia and Nieto-Vesperinas, 2002). This difficulty is resolved if real materials, which inevitably have losses, are considered, as was done by several investigators (Marques and Baena, 2004; Merlin, 2004; Smith et al., 2003). The transmission coefficient for a slab of thickness d can be calculated using the same techniques used to calculate the transmission of propagating waves through a right-handed material, that is, by solving Maxwell's equations in each material and applying appropriate boundary conditions. The transmission coefficient is given by

t = 4Z / [(1 + Z)^2 e^{−i k_{z,2} d} − (1 − Z)^2 e^{i k_{z,2} d}],    (4)

where Z is the ratio of impedances between the slab and the surrounding material, Z = Z_2/Z_1. The impedance is defined by Z_i = ω μ_i / k_{z,i}, where k_{z,2} is the z-component of the wave vector inside the slab, real for propagating waves and imaginary for evanescent waves. Specifically,

k_{z,2} = sqrt(ω^2 μ_2 ε_2 − k_x^2 − k_y^2) = sqrt(ω^2 μ_2 ε_2 − k_r^2),

where the permittivity and permeability are the values inside the slab. In the lossless case, Z = 1, and as expected we obtain an exponential amplification with distance for evanescent waves, and a complex number of constant magnitude for the transmission of propagating waves. (Note that in the left-handed material the sign of both the wave vector and the permeability change, leaving Z = 1.) However, to take losses into account the permeability is taken as

μ_2 = μ_0 (1 + i δ_m).    (5)
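As a numerical sanity check on Eqs. (4) and (5), the transmission of an evanescent wave through a lossy left-handed slab can be evaluated directly in the quasi-static limit. The sketch below is not from the original text: the slab thickness and loss tangent are illustrative values, and the sign conventions (k_z = i k_r outside the slab, −i k_r inside, with the permeability negated) follow the left-handed branch noted above.

```python
import numpy as np

def t_slab(k_r, d=1.0, delta_m=1e-3):
    """Evaluate Eq. (4) for an evanescent wave of transverse wave vector k_r.

    Quasi-static limit: k_z = i*k_r outside the slab and -i*k_r inside
    (the left-handed branch), with mu_2 = -mu_0(1 + i*delta_m), so the
    slab is a lossy Veselago medium. All numerical values are illustrative.
    """
    kz1 = 1j * k_r                   # z-component of the wave vector outside
    kz2 = -1j * k_r                  # inside the slab (sign flipped)
    mu2 = -(1 + 1j * delta_m)        # relative permeability of the lossy slab
    Z = mu2 * kz1 / kz2              # Z = Z2/Z1, with Z_i = omega*mu_i/k_{z,i}
    return 4 * Z / ((1 + Z) ** 2 * np.exp(-1j * kz2 * d)
                    - (1 - Z) ** 2 * np.exp(1j * kz2 * d))
```

For small losses the slab amplifies evanescent components, |t| ≈ e^{k_r d}; far above the cutoff of Eq. (8) the transmission collapses instead.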
In the limit in which k_r is taken to be very large (this corresponds to the high-frequency information in the image), k_{z,2} is approximately equal to −i k_r (taking the left-handed branch noted above). If δ_m is relatively small, then in the denominator the approximations 1 + Z ≈ 2 and 1 − Z ≈ −i δ_m can be used, and in the numerator Z ≈ 1 can be used. Substituting these into Eq. (4) yields

t = 4 / (4 e^{−k_r d} + δ_m^2 e^{k_r d}).    (6)
The exponential growth in the transmission now depends on the second term in the denominator being much smaller than the first. This can be expressed as

4 e^{−k_r d} > δ_m^2 e^{k_r d},    (7)

which rearranges to

k_{r,max} ≈ (1/d) ln(2/δ_m).    (8)

This upper limit on the wave vector that can be transmitted through the superlens resolves the problem of the creation of very large electric fields. More significantly, it imposes an upper limit on the resolution of Pendry's superlens, as the large wave vectors corresponding to high resolution will not be transmitted. A similar analysis can be undertaken when the permittivity has an imaginary component. In general, Eq. (8) for the maximum resolution can be generalized to

k_max ≈ (1/d) ln(2/δ),    (9)

where δ = max(δ_m, δ_e). When Eq. (9) was derived, it was not posited to be a general limitation on subwavelength imaging, but rather one peculiar to the Pendry superlens. The situation was similar regarding the inability to focus in three dimensions and the accompanying inability to image multiple planes.
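The scaling of Eq. (9) is worth making concrete (a sketch with illustrative loss values, not measured ones): resolution degrades linearly with slab thickness, but improves only logarithmically with material quality.

```python
import numpy as np

def k_max(d, delta_m, delta_e=0.0):
    """Largest transmitted image wave vector, Eq. (9): (1/d) ln(2/delta),
    with delta = max(delta_m, delta_e)."""
    return np.log(2.0 / max(delta_m, delta_e)) / d

# Doubling the slab thickness halves the resolution, while reducing the
# loss tangent a thousandfold (1e-3 -> 1e-6) improves it by less than 2x.
```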
4. GENERALIZED LIMITS ON SUBWAVELENGTH IMAGING

4.1. A Model System

Although the previously mentioned limitations on resolution were developed with Pendry's superlens in mind, Pendry's superlens is not the only type of subwavelength imaging theoretically possible or even practically realized. It is therefore reasonable to ask whether the above limit on resolution applies only to the Pendry superlens, or whether it is in fact a more general limitation. Model systems for subwavelength imaging have been proposed to answer this question (Neice, 2009); these systems are discussed and expanded here. While this model is not practically realizable, by removing any modality-specific details it aims to expose any fundamental limitations on subwavelength imaging. Instead of concerning itself with specific types of lenses or other optical equipment, which may limit the generalizability of the conclusions, the device instead relies solely on information about the EM fields
in the imaging plane, coupled with mathematical techniques to reconstruct the image. Any conceivable lens or optical device can be described as a device that takes information about the EM fields in its vicinity and applies some transformation to it, so this assumption preserves the generality of the conclusions. It is assumed that the information contained in the EM fields in the imaging plane, although there may be noise in the data, is complete: it represents the entire EM field of the object being imaged and therefore consists of both the propagating and the evanescent EM waves. The matter to be imaged is modeled as an oscillating charge distribution given by q(x, y, z) cos(ωt), which results in EM perturbations at frequency ω. The source of the EM oscillations is immaterial; it could be due to the material's interaction with a source of illumination, intrinsic fluorescence, or some other excitation. Notably, all of the oscillations in charge are in phase; this charge distribution could be made even more general by allowing different parts of the distribution to oscillate in different phases, described by the equation q(x, y, z) cos[ωt + φ(x, y, z)]. However, in the following discussion we explore the limit as ω approaches zero, so the oscillating term simply becomes a constant dependent only on position, and the difference between the two expressions is immaterial. Adjacent to the charge distribution is a planar sensor at distance d from the origin, which is tuned to the same frequency ω; for mathematical simplicity it is assumed to be parallel to the plane that will be imaged. It is also assumed that there is no charge in the region z > d. The sensors are taken as infinitely small and so produce a continuous function s(x, y, t). Initially s(x, y, t) is taken to be the magnitude of the EM field perpendicular to the plane.
Later, we repeat the analysis with other possible values for s(x, y, t) and show that although the mathematics is complicated considerably, the resolution is not improved. Initially, it is also assumed that the sensor plane extends infinitely, although it will shortly be shown that this has no real effect on resolution. Figure 4 shows a diagram of this theoretical imaging system. Hereafter, the axes are taken to be arranged so that the sensor and the plane to be imaged are perpendicular to the z-axis. Note the similarity of this system to the first practically realized subwavelength imaging system, near-field scanning optical microscopy. This system follows the three rules enumerated in Section 2. As the system collects all information about the EM fields generated by the image, it can be considered to use all information available in both the evanescent and the propagating parts of the EM field. Consequently, this system could be used to derive limitations on conventional imaging as well as subwavelength imaging. To limit the discussion to subwavelength imaging, ω is assumed to approach zero. This allows us to model the system as a quasi-static charge distribution and ignore interference
FIGURE 4 A diagram showing the imaging system, with the matter to be imaged (A) and the sensor plane (B) at a given distance from the origin.
effects. In addition to discarding the higher frequencies that are immaterial to subwavelength imaging, this simplifies the analysis considerably. It is immediately apparent that, when the quasi-static approximation is used, this system cannot reconstruct charge distributions that are three-dimensional. For example, a point charge and a charged sphere would both create an identical electric field in the sensor plane and could not be distinguished. Therefore, to create unique images the charge distribution must be limited to two dimensions. Although this does not necessarily need to be a flat plane, for this discussion that plane is assumed to be the z = 0 plane. Note that this limitation is shared with near-field scanning microscopy and Pendry's superlens (in both theory and practice). This limitation is unique to subwavelength imaging: a pinhole camera, for example, can focus objects in multiple planes simultaneously. In contrast, a subwavelength imaging system cannot focus images from multiple planes simultaneously; in fact, it cannot produce a unique image if there is radiation from multiple planes incident upon it. If the area of interest is limited to a defined plane parallel to the sensor, then it is possible to find a unique solution for the charge distribution by performing a deconvolution. Consider a charge distribution in the plane z = 0, given by q(x, y) cos(ωt). As ω approaches zero, the electric field in the z-direction at the sensor is

s(x′, y′, t) = cos(ωt) ∫ d q(x, y) / [d^2 + (x − x′)^2 + (y − y′)^2]^{3/2} dx dy.    (10)

Thus, the image in the sensor plane is simply a convolution of the image with the function
d / (d^2 + x^2 + y^2)^{3/2}    (11)

or, in radial coordinates,

d / (d^2 + r^2)^{3/2}.    (12)

The Fourier transform of the convolution of two functions, of course, is equivalent to the product of the Fourier transforms of the functions. The two-dimensional (2D) Fourier transform of Eq. (12) can be found in tables of common Fourier transforms and is given by

2π e^{−2πdq},    (13)

where q is the radius in Fourier space, equal to sqrt(k_x^2 + k_y^2). In the following discussion, references to wave vectors and frequency should be taken to refer to the information contained in the image, not to the wave vector or frequency of the EM radiation. Equation (13) can be conceptualized as the attenuation of the original image in Fourier space. Explicitly,

FT[s(x′, y′)] = 2π e^{−2πd sqrt(k_x^2 + k_y^2)} FT[q(x, y)].    (14)

Inspecting Eq. (14), it is apparent that the higher frequencies of the image are diminished exponentially. To reconstruct the original image q(x, y) from the sensor image s(x, y), the higher frequencies must be amplified, and the magnitude of this amplification increases exponentially with frequency.
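The Fourier pair of Eqs. (12) and (13) can be checked numerically (a sketch; the sensor distance, grid size, and extent are illustrative assumptions): sampling the kernel on a grid and taking a discrete 2D FFT reproduces 2π e^{−2πdq}, up to truncation error from the finite grid.

```python
import numpy as np

d = 0.5                         # sensor distance (illustrative)
N, L = 512, 40.0                # grid samples and physical extent
dx = L / N
x = (np.arange(N) - N // 2) * dx
X, Y = np.meshgrid(x, x)
kernel = d / (d**2 + X**2 + Y**2) ** 1.5        # Eq. (11)

# Shift the kernel's center to index 0 so the FFT has (nearly) real output,
# and multiply by the cell area to approximate the continuous transform.
F = np.fft.fft2(np.fft.ifftshift(kernel)) * dx**2
q = np.fft.fftfreq(N, dx)       # spatial frequency in cycles per unit length
expected = 2 * np.pi * np.exp(-2 * np.pi * d * np.abs(q))   # Eq. (13), one axis

# Along the q_x axis the numerical transform tracks Eq. (13) closely.
row = F[0, :].real
```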
4.2. Resolution Limits in the Model System

If the sensors used are perfect and the sensor plane is infinite, there is no theoretical limit on resolution: simply Fourier transform the sensor image and multiply by the inverse of Eq. (13). A limit on resolution is introduced if the sensor plane is finite, since wavelengths larger than the dimensions of the sensor cannot be distinguished. However, discarding the low-frequency, long-wavelength components of the image does little to degrade image quality, although it may change the overall brightness. Of more concern is the amplification of the high-frequency components by an exponentially increasing factor. Any noise present in the data is amplified to the point that spurious results are obtained. Therefore, noise in the data forces the high-frequency components of the signal to be discarded, which places an upper limit on resolution.
The exact cutoff wave vector, above which sensor data would be discarded, is determined by the system's tolerance for artifact in the reconstructed image. However, the order of magnitude can be estimated easily. It is assumed that there is a constant level of noise over the entire range of wave vectors, and that the signal-to-noise level at wave vector q = 0 is given by 1/δ_n. If the minimum tolerable signal-to-noise ratio (SNR) is 1, then the exponential decay of the signal with increasing wave vector results in a cutoff when 1/δ_n ≈ e^{2πqd}/2π, which rearranges to

q ≈ ln(2π/δ_n) / (2πd).    (15)
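The effect of the cutoff can be demonstrated with a short simulation (a sketch; the distance, noise level, and flat object spectrum are illustrative assumptions). An object spectrum is attenuated per Eq. (13), sensor noise is added, and a naive inversion is applied; components beyond the cutoff of Eq. (15) are swamped by the exponentially amplified noise.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1.0                              # sensor distance (illustrative)
delta_n = 1e-3                       # noise-to-signal level at q = 0

q = np.linspace(0.0, 1.5, 400)       # image spatial frequencies
obj = np.ones_like(q)                # flat object spectrum, for clarity
attenuation = 2 * np.pi * np.exp(-2 * np.pi * d * q)     # Eq. (13)
sensor = obj * attenuation + delta_n * 2 * np.pi * rng.standard_normal(q.size)

recon = sensor / attenuation         # naive inverse filtering

q_max = np.log(2 * np.pi / delta_n) / (2 * np.pi * d)    # Eq. (15) cutoff
low = q < 0.5 * q_max                # comfortably below the cutoff
high = q > q_max                     # beyond it, amplified noise dominates
```

Below half the cutoff the reconstruction is faithful to within a few percent; above the cutoff the errors exceed the object itself.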
All wave vectors above this cutoff would be discarded, and the limit on resolution would be given by Eq. (15). This is extremely similar in form to Eq. (9), which was derived in the isolated context of a Pendry superlens, but given this analysis it can plausibly be put forth as a general limit on subwavelength imaging. As the distance to the sensor increases, the SNR must improve exponentially to maintain a constant resolution.

Now the assumptions that led to Eq. (15) are revisited to assess its generality. One assumption to be questioned is the choice of the perpendicular electric field as the quantity measured in the sensing plane. Could the sensing plane instead measure the local voltage, or the parallel electric field, and obtain improved resolution? (Due to the quasi-static assumption, the magnetic field is ignored for now, although a similar analysis could be applied.) First, measuring voltage in the sensor plane is analyzed. The voltage in the plane is given by the convolution

s(x′, y′, t) = cos(ωt) ∫ q(x, y) / [d^2 + (x − x′)^2 + (y − y′)^2]^{1/2} dx dy.    (16)

The 2D Fourier transform of

1 / [d^2 + (x − x′)^2 + (y − y′)^2]^{1/2},    (17)

after transforming to radial coordinates, is

e^{−2πdq} / q.    (18)
This is a more dramatic attenuation of the high-frequency components, by a factor of 1/q, than in Eq. (13). Therefore, noisy data would have an even larger effect and the maximum attainable resolution would be worse than that given in Eq. (15).
Next the case of the parallel electric field is considered. This is a more difficult case than the perpendicular field because the radial symmetry is spoiled. For example, the field parallel to the x-axis would be given by

s(x′, y′, t) = cos(ωt) ∫ (x − x′) q(x, y) / [d^2 + (x − x′)^2 + (y − y′)^2]^{3/2} dx dy,    (19)

which is a convolution of the image with the function

x / (d^2 + x^2 + y^2)^{3/2}.    (20)
This system lacks radial symmetry, making calculation of the 2D Fourier transform more difficult. However, making this calculation is unnecessary. On closer inspection it becomes apparent that, unlike the voltage or the E_z field, the E_x and E_y fields are not unique to a particular charge distribution. For example, consider the E_x field of a charge distribution where q(x, y) = q(y). E_x is uniformly zero everywhere for any arbitrary q(y), and hence q(y) is not reconstructible. Therefore, the best generalizable method for reconstructing the image remains the measurement of E_z, with its associated limit on resolution, Eq. (15). Although the discussion here has been limited to charge, other physical phenomena that result in disturbances that diminish as the inverse radius squared (e.g., pressure and sound) could be treated similarly; the above method is not restricted solely to EM systems.
4.3. Alternative Explanation for the Resolution Limit

It is not surprising that subwavelength imaging results in a resolution limit of the form of Eq. (15) or Eq. (9), if we consider true subwavelength imaging to be the reconstruction of evanescent waves. Because evanescent waves decay exponentially from their source, and do so at a rate that depends on their wavelength, it seems intuitively reasonable that an exponential increase in the SNR would be required as the observer moves away from the source. Specifically, consider a source such as Eq. (2), with wave vector given by Eq. (3). Although this source is for S-polarized EM waves, it could be rewritten to represent any number of EM or non-EM forms of radiation. In the limit of low frequency and large image wave vector, that is, ω/c << k_x, k_z can be approximated as i k_x. The amplitude of the signal as the observer moves away from the source is then

S = S_0 e^{−k_x z}.    (21)
Suppose the noise in the signal is equally distributed over all wave vectors, and that the minimum acceptable SNR is 1. The cutoff wave vector at which the SNR drops below 1 can be calculated; all wave vectors above it must be discarded. Suppose the SNR at a wave vector of zero is S_0/N = 1/δ_0. Substituting this into Eq. (21),

k_x = (1/z) ln(S_0/S) = (1/d) ln(1/δ_0),    (22)

which differs from our prior expressions only by constants. If the frequency is not small enough to be ignored, this can be written as

sqrt(k_x^2 − ω^2/c^2) = (1/d) ln(1/δ_0).    (23)

The above analysis focused on SNRs, while the analysis that led to Eq. (9) never explicitly considered SNRs but instead considered signal losses due to imperfect metamaterials. Nevertheless, it led to a result of the same form. It is therefore useful to reconsider exactly what is meant by the quantity 1/δ_0. SNR is one possibility, but depending on the nature of the imaging system, 1/δ_0 could be interpreted differently. Some systems may not explicitly consider noise but have other ways to describe imperfections in the imaging system. As a simple example, the above analysis could be repeated with the same results if S_0/S were taken as the ratio of the average signal at the source to the minimum signal detectable by the device. Another possibility, one that relates more directly to the Pendry superlens, arises if the quality factor Q of the detector is considered. In the above analysis, noise was taken to be a corruption of the magnitude of the signal for each of the image's spatial wave vectors, k_x. For an array of detectors, this could be caused by noise at each individual detector, or alternatively by some corruption of the data that varies with the coordinates of the detector. If the first possibility, noise at the individual detector, is considered, then the quality factor Q of the detector becomes important. If the detector is tuned to the frequency of the radiation from the image, ω, a detector with quality factor Q will detect signals within a bandwidth

Δω = ω/Q.    (24)
If ambient noise is assumed to be equally distributed over all frequencies, while the signal lies at exactly the frequency ω, then the noise will be inversely proportional to Q (and the signal-to-noise ratio proportional to Q). Specifically,

N ∝ n Δω ∝ 1/Q,    (25)

where n is the noise spectral density.
The quality factor Q, generally speaking, is well known as the ratio of energy stored to energy dissipated per cycle. In the case of EM waves, when μ = μ(1 + iδ) or ε = ε(1 + iδ), the ratio of energy stored to
dissipated would be 1/δ; therefore, N ∝ δ and S/N ∝ 1/δ, or S/N = C/δ for some constant C. This expression for the signal-to-noise ratio can be substituted into Eq. (21) to obtain an expression for the resolution of the same form as Eq. (22), with a constant added to the expression to account for the factor C. However, the meaning of δ has now changed; it is more consistent with Eq. (9). This discussion suggests that the factor 1/δ may have different meanings depending on the exact construction of the subwavelength imaging system, but generally it should be taken as a measure of the system's ability to separate signal from noise. With this definition in mind, Eq. (22) is put forth as a general expression for subwavelength resolution.
5. SUMMARY AND CONCLUSION

The field of subwavelength imaging is in the enviable position of having already proven itself in a number of practical applications; in the future, it seems reasonable to suppose that even more widespread adoption of subwavelength techniques will occur. A sound theoretical underpinning, however, is helpful if such progress is to be rapid. Surveying advances across the varieties of subwavelength imaging, the following general rules appear to apply:

Subwavelength techniques are either true or functional; the former are agnostic to the details of the matter to be imaged, but the latter require (often ingenious) experimental constraints to obtain a subwavelength image.

Even true subwavelength techniques cannot produce images if the matter to be imaged is spread over three dimensions; a restriction to two dimensions is necessary.

Similar to the diffraction limit for wavelength imaging, for true subwavelength imaging the limit on resolution is given by Eq. (15) or some similar relation. In general, it will be inversely proportional to the distance to the sensor and proportional to the logarithm of a value that quantifies the quality of the sensor. This is the signal-to-noise ratio in the case of computational image reconstruction; it is the lossy part of the permittivity or permeability in a Pendry superlens.

In particular, the proposed resolution limit on subwavelength imaging is buttressed by the fact that three unique approaches to predicting resolution limits in subwavelength systems yielded identical results: Equations (22), (9), and (15) were derived in entirely different contexts but are virtually identical in form.

It remains to be seen whether the scientifically and economically important forms of subwavelength imaging will be predominantly true or functional. If they are functional, the limitations described in this paper
are of little practical use. However, if they are true, hopefully these rules will help guide future designs. Specifically, they suggest that the keys to high-resolution subwavelength imaging are high-Q, low-noise detection systems, minimizing the distance to the sensor, and isolating one particular imaging plane. One limitation of the above analysis is that the resolution limit was always taken in the limit of very low frequency and large image wave vector, to ensure that there were no confounding effects from wavelength imaging; interference effects were ignored. It is unclear whether the limits on resolution expressed above apply in an intermediate regime between wavelength and subwavelength imaging. One possibility in this intermediate regime would be to repeat the analysis that led to Eqs. (21) and (22), but to jettison the assumption that the frequency of the radiation is very small and the image wave vector very large [that is, to use Eq. (23)]. Nevertheless, this does not change the fundamental logarithmic relationship among noise, distance, and resolution. Additionally, since no practical devices have been developed for intermediate-wavelength imaging, the low-frequency limit is likely the most useful expression.
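In this intermediate regime, Eq. (23) can be solved explicitly for the cutoff wave vector (a sketch with illustrative numbers): k_x = sqrt[(ln(1/δ_0)/d)^2 + ω^2/c^2], which reduces to Eq. (22) as ω/c → 0.

```python
import numpy as np

def k_cutoff(inv_delta0, d, omega_over_c=0.0):
    """Cutoff image wave vector from Eq. (23); omega_over_c = 0 recovers Eq. (22)."""
    return np.sqrt((np.log(inv_delta0) / d) ** 2 + omega_over_c ** 2)

# The quasi-static result of Eq. (22) is recovered as the frequency vanishes;
# a finite frequency only raises the cutoff, preserving the logarithmic
# dependence on the quantity 1/delta_0.
```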
REFERENCES

Ash, E. A., & Nicholls, G. (1972). Super-resolution aperture scanning microscope. Nature, 237, 510–512.
Brenner, D. J., & Hall, E. J. (2007). Computed tomography—an increasing source of radiation exposure. The New England Journal of Medicine, 357(22), 2277–2284.
Caloz, C., & Itoh, T. (2006). Electromagnetic Metamaterials: Transmission Line Theory and Microwave Applications (pp. 4–59). Hoboken, NJ: Wiley.
Eleftheriades, G., & Balmain, K. (2005). Negative-Refraction Metamaterials: Fundamental Principles and Applications (pp. 1–46, 213–247). Hoboken, NJ: Wiley.
Erni, R., Rossell, M. D., Kisielowski, C., & Dahmen, U. (2009). Atomic-resolution imaging with a sub-50-pm electron probe. Physical Review Letters, 102, 096101.
Fang, N., Lee, H., Sun, C., & Zhang, X. (2005). Sub-diffraction-limited optical imaging with a silver superlens. Science, 308, 534–537.
Garcia, N., & Nieto-Vesperinas, M. (2002). Left-handed materials do not make a perfect lens. Physical Review Letters, 88, 207403.
Griffiths, D. J. (1999). Introduction to Electrodynamics (3rd ed., pp. 413–414). Upper Saddle River, NJ: Prentice-Hall.
Marques, R., & Baena, J. (2004). Effect of losses and dispersion on the focusing properties of left-handed media. Microwave and Optical Technology Letters, 41, 290–294.
Marques, R., Martin, F., & Sorolla, M. (2008). Metamaterials With Negative Parameters: Theory, Design, and Microwave Applications (pp. 1–40). Hoboken, NJ: Wiley.
Merlin, R. (2004). Analytical solution of the almost-perfect lens problem. Applied Physics Letters, 84, 1290–1292.
Neice, A. (2009). Comparison of Pendry's superlens with computational image reconstruction. Microwave and Optical Technology Letters, 41(12), 2913–2914.
Patterson, G. H., Betzig, E., Lippincott-Schwartz, J., & Hess, H. F. (2007). Developing photoactivated localization microscopy (PALM). In 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (pp. 940–943). April 12–15, Arlington, VA.
Pendry, J. B. (2000). Negative refraction makes a perfect lens. Physical Review Letters, 85, 3966–3969.
Ramakrishna, S., & Grzegorczyk, T. (2009). Physics and Applications of Negative Refractive Index Materials (pp. 176–213). Boca Raton, FL: CRC Press.
Smith, D. R., Padilla, W. J., Vier, D. C., Nemat-Nasser, S. C., & Schultz, S. (2000). Composite medium with simultaneously negative permeability and permittivity. Physical Review Letters, 84(18), 4184–4187.
Smith, D. R., Schurig, D., Rosenbluth, M., & Schultz, S. (2003). Limitations on subdiffraction imaging with a negative refractive index slab. Applied Physics Letters, 82, 1506–1508.
Synge, E. H. (1928). A suggested method for extending the microscopic resolution into the ultramicroscopic region. Philosophical Magazine, 6, 356–362.
Veselago, V. (1968). The electrodynamics of substances with simultaneously negative values of ε and μ. Soviet Physics Uspekhi, 10(4), 509–514.
Williams, J. M. (2001). Some problems with negative refraction. Physical Review Letters, 87, 249703.
Chapter 4

Identification of Historical Pigments in Wall Layers by Combination of Optical and Scanning Electron Microscopy Coupled with Energy-Dispersive Spectroscopy

A. Sever Škapin* and P. Ropret†,‡
Contents

1. Introduction 142
2. Determination of Historical Pigments by Combined Optical Microscopy, Scanning Electron Microscopy, and Energy-Dispersive Spectroscopy 144
2.1. Principle 144
2.2. Sampling and Preparing the Sample Cross Section 144
2.3. Low-Vacuum Scanning Electron Microscopy and Energy-Dispersive Spectroscopic Analysis: Advantages and Limitations 144
2.4. Strategy 145
3. Red Pigments and Pigments Based on Carbon 146
4. Green, Blue, and Brown Pigments: Green Earth, Chromium Oxide Greens, Ultramarine, and Umber 151
5. Ochre Pigments and Barium Sulfate or Lithopone 157
6. Study of the Transformation of Calcium Carbonate Into Calcium Sulfate 160
References 161

* Slovenian National Building and Civil Engineering Institute, Dimičeva 12, 1000 Ljubljana, Slovenia
† Institute for the Protection of Cultural Heritage of Slovenia, Conservation Centre, Research Institute, Poljanska 40, 1000 Ljubljana, Slovenia
‡ Museum Conservation Institute, Smithsonian Institution, 4210 Silver Hill Road, Suitland, Maryland 20746, USA

Advances in Imaging and Electron Physics, Volume 163, ISSN 1076-5670, DOI: 10.1016/S1076-5670(10)63004-2. Copyright © 2010 Elsevier Inc. All rights reserved.
1. INTRODUCTION

In the restoration and conservation of historical objects it is most important to preserve, whenever possible, the stylistic authenticity of the original layers and to ensure the durability of materials. To accomplish these tasks successfully, it is essential to identify the chemical nature and the microstructural features of the original historical materials. In the case of painted walls and painted colored coats, it is primarily necessary to determine the chemical composition of the original color layers. Because of their decisive influence on appearance, pigments represent one of the most important components of color layers. Several complementary analytical techniques are usually required to provide an understanding of the composition of the materials used and the technique followed by the artist. The study of pigments used in artwork has comprised a wide range of different techniques, such as X-ray diffraction, Raman and infrared spectroscopy, photoluminescence spectroscopy, and particle-induced X-ray emission, as well as various microscopical techniques, which usually provide sufficient information for precise characterization only in certain combinations (Aze et al., 2006; Berrie, 2007; Brostoff et al., 2009; Feller, 1986; Lau et al., 2008; Ricci et al., 2004; Rosi et al., 2004, 2009; Roy, 1993; West FitzHugh, 1997). X-ray diffraction is used to characterize the crystallographic structure of the pigment contained in polycrystalline solid samples and to identify unknown substances by comparing diffraction data against a database maintained by the International Centre for Diffraction Data (Bueno et al., 2004; Daniilia et al., 2000; Mazzocchin et al., 2004; Rampazzi et al., 2002). Particle-induced X-ray emission (PIXE) (Neelmeijer and Mäder, 2002; Roascio et al., 2002; Zoppi et al., 2002) provides elemental information and is suitable for examining paintings because of the low level of background produced by organic components such as binders.
Thus, traces of pigments can be identified by this method. Raman or Fourier-transform Raman spectroscopy or micro-Raman spectroscopy is a nondestructive technique that provides information at a molecular level and is suitable for selective studies on inhomogeneous materials or surface investigations (Centeno et al., 2006; Coma et al., 2000; Coupry et al., 1994; Daniilia et al., 2000; Perardi et al., 2000; Pérez-Alonso et al., 2004; Roascio et al., 2002; Ropret et al., 2008; Ruiz-Moreno et al., 2003; Sandalinas et al., 2006; Vandenabeele et al., 2000;
Zoppi et al., 2002). The combination of PIXE and micro-Raman spectroscopy (Mendes et al., 2008) provides complementary information on the composition of the materials that allows identification of the inorganic pigments used in the paintings. Photoluminescence spectroscopy has also been used as a nondestructive technique for identification of some ancient pigments (Pozza et al., 2000). Fourier-transform infrared spectroscopy is suitable for the identification and differentiation of selected pigments (Daniilia et al., 2000; Genestar and Pons, 2005; Mazzocchin et al., 2004; Miliani et al., 2007; Pérez-Alonso et al., 2004). However, only in combination with other techniques, such as scanning electron microscopy–energy-dispersive spectroscopy (Genestar and Pons, 2005) and/or others, can precise identification be established. There are also reports of precise analyses of pigments in heterogeneous samples, such as wall paintings, by a multitechnique approach combining micro-Raman, Fourier-transform infrared, and scanning electron microscopy (SEM) with energy-dispersive spectroscopy (EDS) (Ospitali et al., 2008). Because of a pigment's visual and optical properties, optical microscopy is one of the most important techniques for its characterization (Bueno and Medina Florez, 2004; Casarino and Pittaluga, 2001; Eastaugh et al., 2005; Mazzocchin et al., 2004; McCrone, 1979). Electron microscopy coupled with energy-dispersive X-ray spectrometry is now one of the most significant and powerful analytical instruments and is often also used for characterization of historical pigments (Bueno and Medina Florez, 2004; Daniilia et al., 2000; Manzano et al., 2000; Mazzocchin et al., 2004; Roascio et al., 2002; Viti et al., 2003). In the present work, the main focus is on the study of some historical pigments in the paint layers of internal and external walls.
However, successful restoration can only be achieved through examination and understanding of the composition, as well as the degradation processes, of the materials that compose the wall layers. Precise identification of a pigment can rarely be performed with a single method; in most cases, several complementary techniques are required. In this work, we demonstrate that a quite reliable identification of various inorganic pigments in multilayered paint coats and similar decorating textures is possible using solely optical microscopy and SEM combined with EDS (SEM-EDS) analysis. The course of such an analysis is shown on practical examples of materials used for the restoration of selected historical objects. Because SEM-EDS analysis yields only the elemental composition of the material components, the identification of pigments by this technique is limited to the inorganic components. For analysis of organic components (organic pigments, binders, different additives, and so on), other techniques must be used (e.g., Raman spectroscopy, Fourier-transform infrared spectroscopy, pyrolysis–gas chromatography–mass spectrometry).
A. Sever Škapin and P. Ropret
2. DETERMINATION OF HISTORICAL PIGMENTS BY COMBINED OPTICAL MICROSCOPY, SCANNING ELECTRON MICROSCOPY, AND ENERGY-DISPERSIVE SPECTROSCOPY

2.1. Principle

The combination of microscopic techniques (optical and SEM) with EDS allows the identification of painting materials even if they are finely grained, mixed with other materials, or dispersed in a binder. Optical microscopy can be used for accurate identification of the borders between layers, determination of the various nuances within a given color layer, and localization of separate pigment grains. SEM reveals the microstructural features of the pigments, while EDS mapping analysis, which determines the elemental distribution, establishes the distribution of pigments within the different layers. EDS quantitative elemental analysis on selected spots provides the final information needed for pigment identification.
2.2. Sampling and Preparing the Sample Cross Section

The sampling of a selected specimen is performed by scraping off the wall paintings and the wall layers with a scalpel. The samples are mounted onto adhesive tape and embedded in a two-component epoxy resin. The surfaces of the samples are then carefully ground with 800-mesh and 4000-mesh SiC abrasive grinding paper. After grinding, the surfaces are polished on a cloth using 3-µm and ¼-µm diamond pastes. This procedure yields polished cross sections of each investigated sample. The samples presented herein were obtained from several important Slovenian historical monuments: the Cistercian Abbey of Stična, the Manor of Novo Celje, and St. Clement (renamed Our Lady of Health) Church.
2.3. Low-Vacuum Scanning Electron Microscopy and Energy-Dispersive Spectroscopic Analysis: Advantages and Limitations

In the present work, all SEM examinations and EDS analyses were performed in the low-vacuum mode, in which the samples need not be coated with an additional, highly conductive film of gold or graphite. This avoids possible damage to the samples, and such uncoated samples remain suitable for further or repeated investigation. Aside from these advantages, the low-vacuum mode also has some limitations; for example, the spatial resolution of the micrographs is decreased. Luckily,
this was not an issue in our examinations because the magnifications used were far from the upper limit. Some general limitations of compositional studies by EDS analysis are as follows:

- The analysis is normally not quantitative but semi-quantitative.
- Detection limits are typically 0.1% or higher, so trace elements cannot be detected.
- EDS can identify almost all of the elements that are present in sufficient amount, but it cannot provide information about compounds.
- The accuracy of EDS analysis can be affected by many variables, including the nature of the sample (inhomogeneous and rough samples give reduced accuracy).
- EDS detectors cannot detect elements with atomic number less than 5, meaning that EDS cannot detect hydrogen, helium, lithium, or beryllium.
- Some elements have overlapping peaks (e.g., Mn Kβ and Fe Kα) and therefore might be difficult to discern in some cases.

It is important to note that, due to the low-vacuum mode of operation, side effects such as scattering of the incident beam and the inherent inaccuracy in the determination of light elements (oxygen, carbon, nitrogen) must be taken into account when interpreting the measured elemental distribution. The electron beam profile is drastically modified in the low-vacuum "atmosphere" of the SEM. It has been found that, due to the gas environment, the original beam is split into two fractions, an unscattered and a scattered one. The unscattered fraction retains the original electron distribution and the same diameter as the original electron probe. The scattered fraction, however, forms the so-called electron skirt around the unscattered focused electron probe (Belkorissat et al., 2004). The scattered electrons increase the noise-to-signal ratio and complicate the correct interpretation of X-ray micrographs (Rattenberger et al., 2009).
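The Mn Kβ / Fe Kα overlap mentioned above can be made concrete with tabulated emission energies. The sketch below assumes a typical silicon drift detector resolution of about 130 eV FWHM in this energy range; the line energies are standard tabulated values, not data from this study.

```python
# Sketch: test whether two X-ray emission lines can be resolved by an EDS
# detector, assuming peaks separated by less than roughly one detector
# FWHM are hard to discern. Energies in keV are standard tabulated values.
LINES_KEV = {
    "Mn Kb": 6.490,
    "Fe Ka": 6.404,
    "Cu Ka": 8.048,
}

def resolvable(line1, line2, fwhm_kev=0.130):
    """Return True if the two peaks are separated by more than the
    assumed detector FWHM (~130 eV for a typical silicon drift detector)."""
    return abs(LINES_KEV[line1] - LINES_KEV[line2]) > fwhm_kev

print(resolvable("Mn Kb", "Fe Ka"))  # separation ~86 eV: peaks overlap
print(resolvable("Mn Kb", "Cu Ka"))  # well separated
```

With these numbers the Mn Kβ and Fe Kα peaks sit only about 86 eV apart, well inside the detector resolution, which is why the two elements may be difficult to discern.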
2.4. Strategy

The strategy shown in Figure 1 was adopted to identify the pigments and the elemental distribution within the finishing layers of the walls. First, selected cross sections of a sample were observed by optical microscopy to detect different pigments based on their color characteristics and morphology. Then the same cross-sectional areas were examined by SEM to determine their microstructural characteristics. Furthermore, using EDS mapping analysis of the same areas, the elemental distribution across each area was studied, in this particular case with respect to the following elements: calcium, silicon, magnesium, aluminum, iron, and sulfur. The brightness is proportional to the content
FIGURE 1 Procedure of determination of historical pigments by combination of optical microscopy, SEM, and EDS: (1) photograph of the polished cross section of the sample; (2) optical images of the selected areas a and b; (3) SEM images of the selected areas a and b; (4) EDS mapping analysis, showing the distribution of Si, Ca, Mg, Fe, Al, and S across selected area b; (5) EDS spot analysis, a qualitative and semi-quantitative elemental analysis of defined pigment grains (in normalized molar percentage).
of a certain element; the brighter the image, the higher the content of the element at a given spot. Finally, using the same technique, qualitative and semi-quantitative analyses of selected pigment grains were performed.
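The decision logic of this strategy can be sketched as a small rule set that maps an EDS spot composition to a candidate pigment. The rules and thresholds below are illustrative simplifications of the reasoning applied in this chapter, not a validated classifier.

```python
def identify_pigment(comp):
    """Map a normalized molar composition (element -> percent) to a
    candidate pigment, following the chapter's reasoning in a crude,
    illustrative way. Thresholds are assumptions, not measured limits."""
    hg, s = comp.get("Hg", 0), comp.get("S", 0)
    fe, al, si = comp.get("Fe", 0), comp.get("Al", 0), comp.get("Si", 0)
    c, cr = comp.get("C", 0), comp.get("Cr", 0)
    if hg > 1 and s > 1:
        return "vermilion (HgS)"
    if cr > 1:
        return "chromium oxide green or viridian"
    if fe > 1 and al > 1 and si > 1:
        return "red earth (Fe2O3 + Al2(SiO3)3)"
    if fe > 1:
        return "red iron oxide (Fe2O3)"
    if c > 50:
        return "carbon black or graphite"
    return "unidentified"

# Major elements of spot S6 from Table 1: iron-rich, no Al/Si, no Hg
print(identify_pigment({"C": 25.5, "O": 62.3, "Fe": 7.5}))
# -> red iron oxide (Fe2O3)
```

In practice the assignment is made by the analyst, taking the surroundings into account; the snippet only shows how the elemental signatures discussed below discriminate between the candidate pigments.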
3. RED PIGMENTS AND PIGMENTS BASED ON CARBON

There are quite a few known historical red pigments, but in this work the possibility of identifying red iron oxide (Fe2O3), red earth (Fe2O3 + Al2(SiO3)3), and vermilion (cinnabar, HgS) using the combination of
optical microscopy, SEM, and EDS is presented. All three pigments have been used throughout history except vermilion, which was used until the discovery of cadmium red at the beginning of the twentieth century (Berrie, 2007; Douma, 2008; Roy, 1993). Pigments based on carbon (carbon black or graphite) are discussed together with the red pigments merely because specific cases of samples consisting of both red and black paint layers are presented. Figure 2a presents an optical image and Figure 2b an SEM image of the same area, at the same magnification, of the polished cross section of the first representative sample. Four layers can be distinguished by color: on the top, a bright ochre coat can be seen, followed by a white, a red, and a black layer; the bottom bright layer pertains to the plaster. In this example, the focus is on the red and black layers. A careful examination of the optical image shows different nuances in the red pigment grains. Mapping analysis (Figure 3) also reveals that the red layer contains iron, aluminum, silicon, mercury, and sulfur, in addition to carbon, oxygen, calcium, and magnesium. The calcium and magnesium most likely stem from calcite and/or dolomite in the surrounding inorganic binder. The distribution of iron, aluminum, mercury, and sulfur across all layers shows that these elements are found mainly in the red coat, while individual brighter spots of iron and aluminum are also found in the other layers. Particles of a silicon compound can be found across all layers. From the distribution of carbon in the EDS mapping analysis it can be seen that the layer below the red one is richer in carbon than its surroundings. Comparing the features obtained from optical microscopy with the results of the mapping analysis, it is possible to tentatively assign certain pigments. In the spots designated as S1 and S4, aluminum, silicon, iron, and considerably more oxygen were found than in other spots (red earth).
In areas S2 and S3, sulfur and mercury (vermilion) were identified,
FIGURE 2 (a) Optical and (b) SEM image (backscattered electron image) of the same area of a polished cross section of the first representative painted sample.
FIGURE 3 EDS mapping analysis of the same area as presented in Figure 2. Brightness at a certain spot is proportional to the content of the given element at that spot. Al, aluminum; C, carbon; Ca, calcium; Fe, iron; Hg, mercury; Mg, magnesium; O, oxygen; S, sulfur; Si, silicon.
whereas in area S6, the prevailing elements are iron and oxygen (red iron oxide). To corroborate this assumption, individual red pigment grains were additionally examined by a qualitative and a quantitative elemental analysis, which confirmed that areas S1 and S4 correspond to red earth, areas S2 and S3 to vermilion (cinnabar), and area S6 to red iron oxide. Table 1 shows the molar percentages of the elements in the individual pigment grains. The rows that were found to belong to the same pigment are shaded in the same fashion. The presence of titanium and potassium in the red earth pigment is ascribed to impurities. Spot S5, which represents the black paint layer, contains much more carbon than the other spots, which indicates that the pigment is carbon black or graphite. The absence of phosphorus excludes the possible presence of ivory black, another inorganic pigment rich in carbon and Ca3(PO4)2. The difference between the
TABLE 1 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Mg    Al    Si    S     K     Ca    Fe    Ti    Hg
surroundings    38.2   53.2   5.1   0.1   0.2   0     0     3.2   0     0     0
S1              25.6   58.5   2.4   5.9   5.0   0     0.4   0.7   1.3   0.2   0
S2              35.7   43.5   3.2   0.4   0.3   7.5   0     1.4   0     0     8.0
S3              44.0   43.3   3.0   0     0     4.2   0     1.0   0     0     4.5
S4              21.1   62.6   1.8   6.9   5.8   0     0.2   0.7   0.8   0.1   0
S5              66.7   29.8   2.0   0.1   0.1   0     0.1   1.2   0     0     0
S6              25.5   62.3   3.2   0     0.1   0     0     1.4   7.5   0     0

Al, aluminum; C, carbon; Ca, calcium; Fe, iron; Hg, mercury; K, potassium; Mg, magnesium; O, oxygen; S, sulfur; Si, silicon; Ti, titanium.
* The position of the spots is marked in Figures 2a and 2b; see text for details.
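The grain-versus-surroundings comparison used below can be made explicit by computing enrichment factors from the Table 1 values; the 1.5x cutoff is an arbitrary illustrative choice, not a threshold used in the analysis itself.

```python
# Sketch: enrichment of each element in a grain relative to the
# surrounding binder, using the Table 1 values for spot S5 (black grain).
surroundings = {"C": 38.2, "O": 53.2, "Mg": 5.1, "Ca": 3.2, "Si": 0.2}
spot_s5 = {"C": 66.7, "O": 29.8, "Mg": 2.0, "Ca": 1.2, "Si": 0.1}

def enrichment(grain, background):
    """Per-element ratio grain/background (elements absent from the
    background are skipped to avoid division by zero)."""
    return {el: grain[el] / background[el]
            for el in grain if background.get(el, 0) > 0}

ratios = enrichment(spot_s5, surroundings)
# Only carbon is clearly enriched, consistent with carbon black/graphite.
enriched = [el for el, r in ratios.items() if r > 1.5]
print(enriched)  # -> ['C']
```

All other elements in S5 fall below their surrounding levels, which is why they can be attributed to the binder rather than to the pigment.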
content of the carbon content of the grain denoted S5 in Figure 2 and that found in the surroundings is so high that the identification by mapping analysis is unambiguous (Figure 3f). Unfortunately, carbon black and graphite cannot be distinguished by this method. When interpreting the results of quantitative EDS analysis acquired in the low-vacuum mode, it is necessary to take into account the specific characteristics of this mode. Due to scattering of the incident beam, the detected X-rays partly originate from the surroundings of the examined pigment grain. The quantity of captured surroundings naturally depends on the grain size: the larger the grains, the lower the influence of the surroundings. In addition, in low vacuum the quantitative analysis of light elements such as carbon, nitrogen, and oxygen is inaccurate, so that only a comparison of light elements among different pigment grains can be recommended. Therefore, when interpreting the compositions found in spots S2 and S3, identified as vermilion, one may conclude that the calcium, magnesium, carbon, and oxygen, and the traces of aluminum and silicon in the case of S2, originate from the surroundings. This assumption has also been confirmed by EDS analysis of the surroundings, which yielded mainly calcium, magnesium, carbon, and oxygen (see the first row in Table 1). In the case of areas S1, S4, and S6, in addition to the elements characteristic of these pigments (iron in the case of S6; aluminum, silicon, and iron in the case of spots S1 and S4), quite large amounts of oxygen were found in all of these spots, indicating that the pigments in S1, S4, and S6 are oxides. Similarly, quantitative EDS analysis (see Table 1) shows a significant difference in the carbon content between the pigment (66.7%) and the surroundings (38.2%). The other elements in this spot
(oxygen, magnesium, calcium, and traces of aluminum, silicon, and potassium) originate from the surroundings. The next case (Figure 4 and Table 2) refers to the cross section of a different sample from the same monument, in which quite a similar situation with regard to the pigments is found. As before, the optical image (Figure 4a) shows different nuances in the red pigment grains. EDS spot analysis shows that in the particles designated as S2, S3, and S6, in addition to the surrounding elements such as carbon, oxygen, magnesium, and calcium (first row in Table 2), the prevailing elements are mercury and sulfur, which correspond to vermilion. The pigment grains designated as S1, S4, S5, and S7 consist of aluminum, silicon, iron, and oxygen, but in contrast to the preceding case the ratios between them differ. The pigment grain S5 deviates the most; it contains much more iron (6%) than pigment grains S1, S4, and S7 (0.2%–0.8%). This might indicate that the pigment is red iron oxide, but silicon and aluminum are not present merely in trace amounts, as would then be expected. Evidently, all four grains belong to red earth, and an explanation is needed for why the ratio between Fe2O3 and Al2(SiO3)3 is much higher in S5 than in the other grains. One possible suggestion is that the red earth pigments originate from different sources, but much more systematic investigation would be required to confirm this assumption. The black layer area with no visible black grains (area A8) was also checked; its quantitative EDS results were compared with those of the black grain designated S9, which belongs to the carbon black or graphite pigment. It is apparent that carbon is the prevailing element, while all the other elements (oxygen, magnesium, aluminum, silicon, potassium, and calcium) originate from the surroundings.
The composition of the S10 grain from the upper white layer suggests that it belongs to one of the magnesium oxides or carbonates, which can be ascribed to the inorganic binder system.
FIGURE 4 (a) Optical and (b) SEM image (backscattered electron image) of the same area of a polished cross section of the second representative painted sample.
TABLE 2 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Mg    Al    Si    S     K     Ca    Fe    Hg    Cl
surroundings    24.7   55.2   8.8   0.4   0.9   0     0.1   9.8   0     0     0.1
S1              20.2   63.0   2.5   6.3   5.9   0     0.2   1.1   0.8   0     0
S2              37.6   41.4   3.6   0     0.3   7.0   0     2.2   0     7.7   0.2
S3              50.5   37.2   3.3   0.2   0.3   3.5   0     1.2   0     3.8   0
S4              24.0   61.0   3.0   5.4   4.8   0     0.2   1.4   0.2   0     0
S5              25.5   59.0   2.9   2.6   2.3   0     0.1   1.5   6.0   0     0.1
S6              51.7   34.6   3.0   0.1   0.3   4.1   0     1.7   0     4.4   0.1
S7              27.9   58.3   2.6   5.3   3.5   0     0.2   2.0   0.2   0     0
A8              36.7   53.2   5.1   0.1   0.2   0     0.1   4.5   0     0     0.1
S9              66.8   29.6   1.8   0.1   0.1   0     0.2   1.4   0     0     0
S10             27.5   61.3   8.8   0.1   0.2   0     0     2.1   0     0     0

Al, aluminum; C, carbon; Ca, calcium; Cl, chlorine; Fe, iron; Hg, mercury; K, potassium; Mg, magnesium; O, oxygen; S, sulfur; Si, silicon.
* The position of the spots is marked in Figures 4a and 4b; see text for details.
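The anomaly of grain S5 discussed above can be quantified with the molar ratio of iron to aluminum plus silicon from the Table 2 values; treating this ratio as a proxy for the Fe2O3 : Al2(SiO3)3 proportion is a simplification for illustration only.

```python
# Sketch: Fe/(Al+Si) molar ratio for each red earth grain in Table 2.
grains = {                    # (Fe, Al, Si) in normalized molar percent
    "S1": (0.8, 6.3, 5.9),
    "S4": (0.2, 5.4, 4.8),
    "S5": (6.0, 2.6, 2.3),
    "S7": (0.2, 5.3, 3.5),
}

ratios = {name: fe / (al + si) for name, (fe, al, si) in grains.items()}
outlier = max(ratios, key=ratios.get)
print(outlier, round(ratios[outlier], 2))  # S5 stands out by more than 10x
```

Grain S5 comes out more than an order of magnitude above the other three grains, which is the deviation the text attributes to red earth from a different source.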
4. GREEN, BLUE, AND BROWN PIGMENTS: GREEN EARTH, CHROMIUM OXIDE GREENS, ULTRAMARINE, AND UMBER

Among the historical green pigments, green earth and two chromium oxide greens, chromium oxide and viridian, were chosen. Green earth has been used since ancient times; it is a complex aluminosilicate-based mineral with the nominal formula K[(Al,Fe(III)),(Fe(II),Mg)](AlSi3,Si4)O10(OH)2, while chromium oxide green (Cr2O3) and viridian (Cr2O3·2H2O) were both introduced during the first half of the nineteenth century (Douma, 2008; Feller, 1986; West FitzHugh, 1997). The blue and brown pigments included in our study are ultramarine and umber. Ultramarine is a complex sulfur-containing sodium aluminum silicate. Its approximate compositional formula is (Na,Ca)8(AlSiO4)6(SO4,S,Cl)2; the proportions of aluminum, silicon, and oxygen are fixed, while the other elements are variable (Douma, 2008; Roy, 1993). Natural ultramarine was first used in the sixth to seventh century AD, while the synthetic pigment has been used since 1828 (Douma, 2008). Umber is an iron(III) oxide containing manganese(IV) oxide, with chemical formula Fe2O3·MnO2. It was discovered in prehistoric times and is still in use. The natural umber mineral is found throughout the world in many shades, with hues ranging from yellow to brown and faint blue (Douma, 2008).
Figure 5 shows images of the same area, at the same magnification, of the polished cross section of the first representative green-painted sample, taken from the Cistercian Abbey of Stična. Above the green layer lies a white layer, very likely applied to cover the green color of the wall. Mapping analysis of the same area (Figure 6) revealed that the green layer contained iron, silicon, aluminum, potassium, magnesium, and oxygen, which indicated that this pigment was most likely green earth. To confirm this assumption, individual green pigment grains were additionally examined by a qualitative and a quantitative elemental analysis. Table 3 presents the molar percentages of the elements identified in the analyzed grains, denoted S1 to S5. It is apparent that grains S1, S2, S3, and S4 are richer in iron (1.0%–1.6%), silicon (4.9%–7.4%), aluminum (0.4%–0.6%), magnesium (2.8%–5.2%), potassium (0.7%–1.3%), and oxygen (51.5%–54.2%) than the surroundings (first row in Table 3) and thus evidently correspond to green earth. The isolated blue grain in the white coat (designated S5 in Figure 5) was also analyzed. EDS quantitative analysis indicated that the blue grain consists of sodium, silicon, aluminum, and sulfur (the carbon, oxygen, calcium, and magnesium originate from the surroundings), which most likely corresponds to ultramarine. This pigment is presumably an impurity within the white coat, because only one grain of it was found in the entire sample. A similar analysis was performed for the sample from the facade of an important historical monument, the Manor of Novo Celje, which carries two colored finishing coats, an ochre and a green one. Figure 7 presents an optical image of both finishing coats, but the focus here is only on the lower, green layer. Figure 8 shows the magnified area of the lower-right part of the area of Figure 7.
When switching from the optical to the SEM investigation, it is sometimes difficult to find exactly the same region on the
FIGURE 5 (a) Optical and (b) SEM image (backscattered electron image) of the same area of a polished cross section of the first representative green-painted sample.
FIGURE 6 EDS mapping analysis of the same area as presented in Figure 5.
TABLE 3 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Na    Mg    Al    Si    S     K     Ca    Fe
surroundings    48.2   45.3   0     2.4   0.1   0.9   0.1   0     3.0   0
S1              36.6   51.5   0     4.0   0.5   4.9   0.1   0.8   0.6   1.0
S2              31.8   54.2   0     2.8   0.6   7.4   0     1.3   0.4   1.5
S3              32.9   54.0   0     5.2   0.4   5.0   0.1   0.8   0.6   1.0
S4              33.3   54.0   0     2.8   0.6   6.5   0.1   0.7   0.4   1.6
S5              55.1   36.2   2.1   0.6   1.6   2.2   1.0   0     1.1   0

Al, aluminum; C, carbon; Ca, calcium; Fe, iron; K, potassium; Mg, magnesium; Na, sodium; O, oxygen; S, sulfur; Si, silicon.
* The position of the spots is marked in Figures 5a and 5b; see text for details.
FIGURE 7 Optical image of a polished cross section of the sample taken from the facade of the Manor of Novo Celje.
FIGURE 8 (a) Optical and (b) SEM image (backscattered electron image) of the magnified area of the lower-left part of the area shown in Figure 7.
sample's surface, because the sample appears very different under the two types of microscope. For example, in the present sample the green grain seen under the optical microscope as a homogeneous particle appears as a conglomerate of numerous small particles when investigated with SEM. These particles are easily lost among the other particles constituting the coat. Thus, the actual location of the "green optical particle" under SEM can be found only by careful study of all structural features over a wider area. In this specific case, the mapping analysis proved particularly helpful. As shown in Figure 9a, the distribution of chromium shows a high concentration exactly at the spot where the green pigment is expected. In the right corner, however, an increased concentration of iron, aluminum, and silicon (see Figure 9b,c,d) is seen at sites that correspond to the red nuances in the optical image (Figure 8a). Thus, these sites are identified with high probability as red earth, in this case probably an impurity in the otherwise green coat.
FIGURE 9 EDS mapping analysis of the same area as presented in Figure 8.
Further, individual green pigment grains were examined by qualitative and quantitative elemental analyses. Figure 10 presents a typical EDS spectrum of the green grains; Table 4 shows the quantitative analysis of three green grains designated S1, S2, and S3. Besides the surrounding elements (carbon, oxygen, magnesium, silicon, sulfur, and calcium), all grains contain chromium (5.4%, 7.1%, and 12.1%, respectively), and the quantity of oxygen (55.3%, 58.3%, and 62.4%) is much higher than in the surroundings (38.1%). This indicates that the green pigment is one of the chromium oxide greens, either pure Cr2O3 or viridian, the chromium(III) oxide dihydrate. Unfortunately, viridian and the anhydrous chromium(III) oxide cannot be distinguished by this combination of techniques. Further investigations would be needed to discriminate between the two forms; for example, micro-Raman spectroscopy could add molecular information to the elemental composition obtained by EDS.
FIGURE 10 EDS spectrum of the green pigment grain designated as S3 in Figures 7 and 8.
TABLE 4 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Mg    Al    Si    S     Cl    Ca    Cr
surroundings    53.8   38.1   1.2   0.2   0.7   0.2   0.1   5.7   0
S1              33.5   55.3   0.4   0     0.2   0.6   0     4.6   5.4
S2              28.2   58.3   0.3   0     0.3   0.7   0     5.1   7.1
S3              21.9   62.4   0.4   0.1   0.2   0.2   0     2.7   12.1

Al, aluminum; C, carbon; Ca, calcium; Cl, chlorine; Cr, chromium; Mg, magnesium; O, oxygen; S, sulfur; Si, silicon.
* The position of the spots is marked in Figures 8a and 8b; see text for details.
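Why EDS alone cannot separate the two chromium greens can be illustrated with their ideal O:Cr stoichiometries: the oxygen measured at spot S3 includes a large contribution from the surrounding carbonate binder, so the measured ratio falls near neither ideal value. A rough sketch using handbook stoichiometry, for illustration only:

```python
# Sketch: ideal O:Cr molar ratios of the two chromium greens versus the
# ratio measured at spot S3 (Table 4). The measured oxygen includes a
# large contribution from the surrounding binder, so EDS alone cannot
# decide between the two forms.
ideal = {
    "Cr2O3": 3 / 2,                    # anhydrous chromium oxide green
    "Cr2O3·2H2O (viridian)": 5 / 2,    # counting water oxygen
}
measured_s3 = 62.4 / 12.1  # O% / Cr% from Table 4

for form, ratio in ideal.items():
    print(form, ratio)
print(round(measured_s3, 2))  # far above both ideal ratios
```

The measured ratio of roughly 5.2 exceeds even the viridian value, confirming that much of the detected oxygen belongs to the surroundings rather than the pigment.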
The next case (Figure 11) refers to the cross section of the sample taken from the St. Clement Church (Our Lady of Health Church). Four layers can be distinguished on top of the plaster, containing ochre, white, red, and blue pigments. In this example, the focus is on the blue and brown pigments. Spot qualitative and semi-quantitative elemental analyses of the grains designated S1, S2, S3, and S4 were performed; the molar percentages of the elements obtained in the individual grains are presented in Table 5. The blue grains designated S1 and S2 consist of sodium, calcium, aluminum, silicon, sulfur, chlorine, and oxygen (Table 5), which leads to the conclusion that the pigment is ultramarine. Manganese, iron, and oxygen were found in the dark brown pigment S4, which most likely corresponds to umber.
FIGURE 11 (a) Optical and (b) SEM image (backscattered electron image) of the same area of a polished cross section of the sample taken from the St. Clement Church.
TABLE 5 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Na    Al    Si    S     Cl    K     Ca    Mn    Fe    Zn
surroundings    40.1   48.7   0     0.1   0.4   0.2   0.1   0.1   5.9   0     0.1   4.3
S1              17.5   61.4   3.7   4.7   5.4   1.6   0.1   0.4   3.8   0     0     1.4
S2              20.6   60.7   2.9   3.9   4.4   1.3   0.1   0.3   4.4   0     0     1.4
S3              36.7   43.5   0     0.2   0.4   0     0.1   0     1.4   0     0.2   17.5
S4              28.7   53.2   0     0.5   0.3   0.1   0.1   1.0   1.0   7.5   2.9   4.7

Al, aluminum; C, carbon; Ca, calcium; Cl, chlorine; Fe, iron; K, potassium; Mn, manganese; Na, sodium; O, oxygen; S, sulfur; Si, silicon; Zn, zinc.
* The position of the spots is marked in Figure 11; see text for details.
Spot analysis of the white surroundings in the same "colored" layer was also performed (S3). This area seems to belong to the pigment zinc white (ZnO), one of the white pigments with good hiding power. Although zinc oxide has been known since ancient times, zinc white has been used in artists' paints only since the end of the eighteenth century (Douma, 2008; Feller, 1986).
5. OCHRE PIGMENTS AND BARIUM SULFATE OR LITHOPONE

Yellow ochre is a natural mineral consisting of silica and clay; it owes its color to an iron oxyhydroxide mineral, goethite (Douma, 2008). It is found throughout the world in many hues from yellow to brown and has been used since prehistoric times.
Among the white pigments, barium sulfate and zinc white were chosen. The natural pigment barytes was first suggested as a pigment at the end of the eighteenth century. Its general application, in both the synthetic and the natural form, started at the beginning of the nineteenth century, and it is still used today as a white pigment (Feller, 1986). In a mixture with zinc sulfide (ZnS), it is known as lithopone (ZnS + BaSO4) (Feller, 1986). The sample studied here was taken from the St. Clement Church (Our Lady of Health Church). At least six layers can be distinguished on top of the plaster of the investigated sample (Figure 12). The color changes from light yellow through light gray and brown to light ochre, produced by the ochre, black, and white pigments (Figure 12a). EDS mapping analysis (Figure 13) shows that zinc is present in the lower and in the uppermost layers, while it is absent from the middle layer (Figure 13a), where calcium is dominant (Figure 13b). Spots and areas of barium, silicon, sulfur, and iron are seen in the finishing layers (Figures 13c–f, respectively). Spot qualitative and quantitative elemental analyses of the grains were performed; the molar percentages of the elements in the individual grains are presented in Table 6. The grains of ochre pigments (S1, S2, S3) contain more iron and oxygen than the surroundings, which suggests that the pigment is one of the ochre pigments rich in iron. Many different ochre-colored pigments (e.g., warm ochre, limonite, yellow ochre, raw sienna, and so on) have similar elemental compositions and can be distinguished by vibrational spectroscopy, such as micro-Raman and micro-Fourier-transform infrared (micro-FTIR) spectroscopy (Bikiaris et al., 1999). The black grain (S4) consists of carbon, which means that the pigment can be identified as carbon black or graphite.
The spectra acquired from grains S5, S6, and S7 belong to the white pigment BaSO4, while the pigment at S8 seems to be ZnS, which indicates that lithopone (BaSO4 + ZnS) was probably used. Again, the
FIGURE 12 (a) Optical and (b) SEM image (backscattered electron image) of the same area of a polished cross section of the sample taken from the St. Clement Church.
FIGURE 13 EDS mapping analysis of the same area as presented in Figure 12.
TABLE 6 Quantitative EDS analysis of individual pigment grains (in normalized molar percentage; all-element analyses).* The same shading marks the same pigment; bold numbers indicate elements relevant for the identification of the pigment.

Spectrum        C      O      Na    Si    S     Ca    Fe    Zn    Sr    Ba
surroundings    56.7   34.1   0.8   0.9   0.3   2.3   0.1   4.7   0     0.1
S1              38.3   49.6   0.5   2.4   0.2   0.6   6.5   2.9   0     0
S2              42.2   47.1   0.4   1.9   0.1   0.6   5.5   2.1   0     0
S3              33.2   50.5   0     0.2   0.3   1.1   13.3  1.4   0     0
S4              86.6   12.6   0     0.1   0.1   0.4   0     0.4   0     0
S5              29.9   53.5   0     0.3   6.4   1.0   0     1.5   0.6   6.8
S6              30.2   54.9   0.6   0.2   6.2   0.4   0     0.6   0.6   6.3
S7              39.6   44.8   0.4   0.2   5.9   0.5   0     1.2   0     7.4
S8              60.8   28.1   0     0.2   3.5   0.5   0.4   5.1   0     1.5

Ba, barium; C, carbon; Ca, calcium; Fe, iron; Na, sodium; O, oxygen; S, sulfur; Si, silicon; Sr, strontium; Zn, zinc.
* The position of the spots is marked in Figure 12; see text for details.
presence of lithopone cannot be unambiguously identified by EDS; in addition to ZnS, ZnO could be present. Another technique that adds molecular information (e.g., micro-Raman spectroscopy) needs to be used for its positive identification.
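The BaSO4 assignment for grains S5 to S7 can be cross-checked against the 1:1 Ba:S stoichiometry of barium sulfate using the Table 6 values; given the semi-quantitative nature of low-vacuum EDS, the ratios can only be expected to be approximate.

```python
# Sketch: Ba:S molar ratios of the white grains from Table 6, compared
# with the ideal 1:1 stoichiometry of BaSO4. Low-vacuum EDS is only
# semi-quantitative, so the ratios are rough.
grains = {"S5": (6.8, 6.4), "S6": (6.3, 6.2), "S7": (7.4, 5.9)}  # (Ba, S)

for name, (ba, s) in grains.items():
    print(name, round(ba / s, 2))  # all reasonably close to the ideal 1.0
```

The ratios cluster near unity, supporting the barium sulfate identification, whereas grain S8 pairs its sulfur mainly with zinc instead.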
6. STUDY OF THE TRANSFORMATION OF CALCIUM CARBONATE INTO CALCIUM SULFATE

In addition to the identification and study of the distribution of pigments, other examinations can be performed using a combination of optical microscopy, SEM, and EDS. For example, the following study presents the distribution of different elements in plasters and finishing layers deposited at different points in time throughout the centuries. During examination of samples collected from the facade of the Cistercian Abbey of Stična, sulfur was found in the finishing layers. Figure 14 shows optical and SEM images and EDS mappings for sulfur, carbon, and calcium of cross sections of such a sample. The elemental distribution shows a high intensity of sulfur on the top of the outer orange-ochre layer and on the top of a lower plaster, accompanied by a minimal intensity of carbon. The borders that arose from the different phases of the painting process are clearly visible. Because of the presence of sulfur at the top of the layers and its inhomogeneous distribution across the paint layer, it is assumed that in these places calcium carbonate transformed into calcium sulfate. A similar phenomenon was observed in a study of the degradation of Quaglio's mural paintings in the Cathedral of Saint Nicholas
[Figure 14 panels (a)–(e): optical image, SEM image (×75, 20 kV, 10 Pa), and EDS maps labeled S, C, and Ca; scale bars 200–800 µm]
FIGURE 14 Transformation of calcium carbonate into calcium sulfate: (a) optical image, (b) SEM image (backscattered electron image), and (c, d, e) EDS mapping analysis of a cross section of the outer façade from the Cistercian Abbey of Stična.
Identification of Historical Pigments in Wall Layers
161
(Ropret and Bukovec, 2005). Apparently the samples in which a high content of sulfur was detected had been exposed to humidity and a sulfur-containing environment, probably polluted air. At the top of the internal bottom orange-ochre layer a significant presence of sulfur was not observed. This indicates that the time interval between the application of the two ochre layers was too short for the process of sulfatization to become relevant (Sever Škapin et al., 2007).
REFERENCES

Aze, S., Vallet, J. M., Baronnet, A., & Grauby, O. (2006). The fading of red lead pigment in wall paintings: tracking the physico-chemical transformations by means of complementary micro-analysis techniques. European Journal of Mineralogy, 18, 835–843.
Belkorissat, R., Kadoun, A., Dupeyrat, M., Khelifa, B., & Mathieu, C. (2004). Direct measurement of electron beam scattering in the low vacuum SEM. Microchimica Acta, 147, 135–139.
Berrie, B. H. (Ed.). (2007). Artists' Pigments: A Handbook of Their History and Characteristics, Vol. 4. Washington, DC: National Gallery of Art.
Bikiaris, D., Daniilia, S., Sotiropoulou, S., Katsimbiri, O., Pavlidou, E., Moutsatsou, A. P., et al. (1999). Ochre-differentiation through micro-Raman and micro-FTIR spectroscopies: application on wall paintings at Meteora and Mount Athos, Greece. Spectrochimica Acta Part A: Molecular Spectroscopy, 56, 3–18.
Brostoff, L., Centeno, S. A., Ropret, P., Bythrow, P., & Pottier, F. (2009). Combined X-ray diffraction and Raman identification of synthetic organic pigments in works of art: from powder samples to artists' paints. Analytical Chemistry, 81, 6096–6106.
Bueno, A. G., & Medina Flórez, V. J. (2004). The Nasrid plasterwork at "qubba Dar al-Manjara l-kubra" in Granada: characterisation of materials and techniques. Journal of Cultural Heritage, 5, 75–89.
Casarino, A., & Pittaluga, D. (2001). An analysis of building methods: chemical-physical and archaeological analyses of micro-layer coatings on medieval facades in the centre of Genoa. Journal of Cultural Heritage, 4, 259–275.
Centeno, S. A., Lladó Buisan, V., & Ropret, P. (2006). Raman study of synthetic organic pigments and dyes in early lithographic inks (1890–1920). Journal of Raman Spectroscopy, 37, 1111–1118.
Coma, L., Breitman, M., & Ruiz-Moreno, S. (2000). Soft and hard modeling methods for deconvolution of mixtures of Raman spectra for pigment analysis. A qualitative and quantitative approach. Journal of Cultural Heritage, 1, S273–S276.
Coupry, C., Lautie, A., Revault, M., & Dufilho, J. (1994). Contribution of Raman spectroscopy to art and history. Journal of Raman Spectroscopy, 25, 89–94.
Daniilia, S., Sotiropoulou, S., Bikiaris, D., Salpistis, C., Karagiannis, G., Chryssoulakis, Y., et al. (2000). Panselinos' Byzantine wall paintings in the Protaton Church, Mount Athos, Greece: a technical examination. Journal of Cultural Heritage, 1, 91–110.
Douma, M. (curator). (2008). Pigments through the ages. http://www.webexhibits.org/pigments.
Eastaugh, N., Walsh, V., Chaplin, T., & Siddall, R. (2005). Pigment Compendium: Optical Microscopy of Historical Pigments. Burlington, MA: Butterworth-Heinemann.
Feller, R. L. (Ed.). (1986). Artists' Pigments: A Handbook of Their History and Characteristics, Vol. 1. Cambridge, UK: Cambridge University Press.
Genestar, C., & Pons, C. (2005). Earth pigments in painting: characterisation and differentiation by means of FTIR spectroscopy and SEM-EDS microanalysis. Analytical and Bioanalytical Chemistry, 382, 269–274.
Lau, D., Villis, C., Furman, S., & Livett, M. (2008). Multispectral and hyperspectral image analysis of elemental and micro-Raman maps of cross-sections from a 16th century painting. Analytica Chimica Acta, 610, 15–24.
Manzano, E., Bueno, A. G., Gonzalez-Casado, A., & del Olmo, M. (2000). Mortars, pigments and binding media of wall paintings in the "Carrera del Darro" in Granada, Spain. Journal of Cultural Heritage, 1, 19–28.
Mazzocchin, G. A., Agnoli, F., & Salvadori, M. (2004). Analysis of Roman age wall paintings found in Pordenone, Trieste and Montegrotto. Talanta, 64, 732–741.
McCrone, W. C. (1979). Microscopy in the study of art and archaeology. Microscope, 27, 167.
Mendes, N., Lofrumento, C., Migliori, A., & Castellucci, E. M. (2008). Micro-Raman and particle-induced X-ray emission spectroscopy for the study of pigments and degradation products present in 17th century colored maps. Journal of Raman Spectroscopy, 39(2), 289–294.
Miliani, C., Rosi, F., Borgia, I., Benedetti, P., Brunetti, B. G., & Sgamellotti, A. (2007). Fiber-optic mid-infrared reflectance spectroscopy: a suitable technique for in-situ studies of mural paintings. Applied Spectroscopy, 61, 293–299.
Neelmeijer, C., & Mäder, M. (2002). The merits of particle induced X-ray emission in revealing painting techniques. Nuclear Instruments and Methods in Physics Research Section B, 189(1–4), 293–302.
Ospitali, F., Bersani, D., Di Lonardo, G., & Lottici, P. P. (2008). "Green earths": vibrational and elemental characterization of glauconites, celadonites and historical pigments. Journal of Raman Spectroscopy, 39(8), 1066–1073.
Perardi, A., Zoppi, A., & Castellucci, E. (2000). Micro-Raman spectroscopy for standard and in situ characterization of paintings materials. Journal of Cultural Heritage, 1, S269–S272.
Pérez-Alonso, M., Castro, K., Álvarez, M., & Madariaga, J. M. (2004). Scientific analysis versus restorer's expertise for diagnosis prior to a restoration process: the case of Santa Maria Church (Hermo, Asturias, North of Spain). Analytica Chimica Acta, 524, 379–389.
Pozza, G., Ajo, D., Chiari, G., De Zuane, F., & Favaro, M. (2000). Photoluminescence of the inorganic pigments Egyptian blue, Han blue and Han purple. Journal of Cultural Heritage, 1, 393–398.
Rampazzi, L., Cariati, F., Tanda, G., & Colombini, M. P. J. (2002). Characterization of wall paintings in the Sos Furrighesos necropolis (Anela, Italy). Journal of Cultural Heritage, 3, 237–240.
Rattenberger, J., Wagner, J., Schrottner, H., Mitsche, S., & Zankel, A. (2009). A method to measure the total scattering cross section and effective beam gas path length in a low-vacuum SEM. Scanning, 31(3), 107–113.
Ricci, C., Borgia, I., Brunetti, B. G., Miliani, C., Sgamellotti, A., Seccaroni, C., et al. (2004). The Perugino's palette: integration of an extended in situ XRF study by Raman spectroscopy. Journal of Raman Spectroscopy, 35, 616–621.
Roascio, S., Zucchiatti, A., Prati, P., & Cagnana, A. (2002). Study of the pigments in medieval polychrome architectural elements of "Veneto-Byzantine" style. Journal of Cultural Heritage, 3, 289–297.
Ropret, P., Centeno, S. A., & Bukovec, P. (2008). Raman identification of yellow synthetic organic pigments in modern and contemporary paintings: reference spectra and case studies. Spectrochimica Acta Part A: Molecular Spectroscopy, 69, 486–497.
Ropret, P., & Bukovec, P. (2005). Chemical cleaning of Quaglios' mural painting in the Cathedral of Saint Nicholas in Ljubljana. In Proceedings of the XXI International Congress on Wall Paintings (pp. 283–291). Venezia: Arcadia Ricerche.
Rosi, F., Manuali, V., Miliani, C., Brunetti, B. G., Sgamellotti, A., Grygar, T., et al. (2009). Raman scattering features of lead pyroantimonate compounds. Part I: XRD and Raman characterization of Pb2Sb2O7 doped with tin and zinc. Journal of Raman Spectroscopy, 40, 107–111.
Rosi, F., Miliani, C., Borgia, I., Brunetti, B., & Sgamellotti, A. (2004). Identification of nineteenth century blue and green pigments by in situ X-ray fluorescence and micro-Raman spectroscopy. Journal of Raman Spectroscopy, 35, 610–615.
Roy, A. (Ed.). (1993). Artists' Pigments: A Handbook of Their History and Characteristics, Vol. 2. Oxford, UK: Oxford University Press.
Ruiz-Moreno, S., Pérez-Pueyo, R., Gabaldón, A., Soneira, M. J., & Sandalinas, C. (2003). Raman laser fibre optic strategy for non-destructive pigment analysis. Identification of a new yellow pigment (Pb, Sn, Sb) from the Italian XVII century painting. Journal of Cultural Heritage, 4, 309–313.
Sandalinas, C., Ruiz-Moreno, S., Lopez-Gil, A., & Miralles, J. (2006). Experimental confirmation by Raman spectroscopy of a Pb–Sn–Sb triple oxide yellow pigment in sixteenth-century Italian pottery. Journal of Raman Spectroscopy, 37, 1146–1153.
Sever Škapin, A., Ropret, P., & Bukovec, P. (2007). Determination of pigments in color layers on walls of some selected historical buildings using optical and scanning electron microscopy. Materials Characterization, 58, 1138–1147.
Vandenabeele, P., Moens, L., Edwards, H. G. M., & Dams, R. (2000). Raman spectroscopic database of azo pigments and application to modern art studies. Journal of Raman Spectroscopy, 31, 509–517.
Viti, C., Borgia, I., Brunetti, B., Sgamellotti, A., & Mellini, M. (2003). Microtexture and microchemistry of glaze and pigments in Italian Renaissance pottery from Gubbio and Deruta. Journal of Cultural Heritage, 4, 199–210.
West FitzHugh, E. (Ed.). (1997). Artists' Pigments: A Handbook of Their History and Characteristics, Vol. 3. Oxford, UK: Oxford University Press.
Zoppi, A., Signorini, G. F., Lucarelli, F., & Bachechi, L. (2002). Characterisation of painting materials from Eritrea rock art sites with non-destructive spectroscopic techniques. Journal of Cultural Heritage, 3, 299–308.
Chapter 5

Superresolution Imaging—Revisited

Markus E. Testorf* and Michael A. Fiddy†
Contents

1. Introduction 166
2. Classical Estimates of the Image Resolution 168
   2.1. The Rayleigh Limit 168
   2.2. Abbe's Theory of the Microscope 170
   2.3. Relation to the Sampling Theorem 172
   2.4. Other Classical Metrics for Resolution 173
3. Reassigning Degrees of Freedom 174
   3.1. The Space-Bandwidth Product and the Degrees of Freedom of an Image 174
   3.2. Superresolution via Space-Bandwidth Adaptation 176
   3.3. Interpretation of Lukosz Superresolution 180
4. Applications of Lukosz Superresolution 181
   4.1. Synthetic Aperture Radar Imaging 181
   4.2. Structured Illuminations 182
   4.3. Superresolution Imaging Through Turbulence 183
   4.4. Multi-Aperture Systems and Digital Superresolution 184
   4.5. Evanescent Waves and Near-Field Microscopy 186
5. Optical Superresolution 187
   5.1. Superresolution Filters 187
   5.2. Superoscillations 188
6. Numerical Superresolution Algorithms 190
   6.1. The Gerchberg–Papoulis Algorithm 192
   6.2. The Prior Discrete Fourier Transform Algorithm 193
* Dartmouth College, Hanover, NH, USA
† University of North Carolina-Charlotte, USA
Advances in Imaging and Electron Physics, Volume 163, ISSN 1076-5670, DOI: 10.1016/S1076-5670(10)63005-4. Copyright © 2010 Elsevier Inc. All rights reserved.
   6.3. Further Interpretation of the PDFT Algorithm 197
   6.4. Multiple Scattering 203
7. Generalized Sampling Expansions 204
   7.1. Squeezing and Dissecting Phase Space 204
   7.2. Papoulis Generalized Sampling 207
   7.3. Compressive Sampling 211
8. Concluding Remarks 212
Acknowledgment 213
References 213
1. INTRODUCTION

Throughout history the optical sciences have maintained their importance as an enabling technology—namely, by providing images of very distant and very small objects. Among a variety of important parameters, the system's spatial resolution, that is, the smallest object feature that can be identified from the system output, determines its performance. In practice, this resolution limit is strongly linked to the wavelength of the probing signal, and it proves invariably difficult to image features significantly smaller than the wavelength, regardless of the optical instrument used. Closely related to the wavelength limit and often used synonymously is the Rayleigh limit (Born and Wolf, 1980; Rayleigh, 1899), which is arguably the most widely used of the classical resolution criteria. The Rayleigh limit expresses the resolution of an imaging system in terms of the point spread function (PSF) of the optical system. Quite intriguingly, the Rayleigh limit turns out to be linked with other estimates of signal resolution, including the Shannon sampling rate, which is fundamental for representing band-limited signals. Perhaps this is why the Rayleigh limit is often revered as if it were a fundamental constant of physics. However, for decades an ever-increasing number of concepts and ideas have aimed at pushing the image resolution beyond the Rayleigh condition, but without reducing the wavelength or increasing the form factor of the instrument. Many of these methods have been labeled superresolution methods, frequently without adherence to any of the conditions that would define a meaningful resolution limit in classical terms. The word superresolution is now often used as a phenomenological attribute, unrelated to any specific theoretical framework or strategy.
This, in turn, has created a culture of reporting on new superresolution schemes in terms of the achievable resolution rather than in terms of the relationships to preexisting methods or to fundamental
underlying assumptions. As a consequence, it is often difficult to compare any two methods in terms of performance limitations and applicability unless they belong to the same family of superresolution methods. The inflationary use of the term superresolution, without proper motivation and justification, is unfortunate. On the one hand, it ignores the inevitable and fundamental trade-offs associated with improving the image resolution. On the other hand, a proper and systematic classification of existing methods would aid with the design of new methods and instruments, tailoring their performance to the needs of the specific imaging problem. This review, which is not intended to be comprehensive, reflects our attempt at such a classification of superresolution methods and their associated trade-offs. We show that, in fact, most strategies to improve the image resolution can be derived from a small set of principles. Pivotal to this analysis is Lukosz’s interpretation of imaging systems as communication channels and superresolution as the result of a customized encoding scheme that exploits the available degrees of freedom for transmitting information in an optimal fashion (Lukosz, 1966, 1967). Even a casual survey of this topic reveals the existence of two seemingly unrelated concepts that relate to superresolution. The first is sometimes referred to as space-bandwidth adaptation. High frequencies that are cut off by the imaging system are reassigned to degrees of freedom, which are transmitted by the imaging system, but not occupied by the signal. This includes trade-offs between the support region of the signal and its spatial bandwidth, as well as trade-offs between spatial and temporal bandwidth. 
This concept covers a large variety of optical systems designed to improve signal resolution, including synthetic aperture methods, scanning techniques, and digital superresolution algorithms that overcome the resolution limit imposed by the size of the electronic detector. The second concept underlying superresolution imaging is based on the trade-off between the dynamic range of the signal (that is, the number of transmitted bits per degree of freedom) and the resulting image resolution. This trade-off can be appreciated in the spirit of Lukosz's original work by devising a suitable encoding scheme to convert high signal frequencies into grey levels of the transmitted optical signal (Zalevsky et al., 2006). More widespread and quite thoroughly investigated are methods in which the support region and other signal properties are implicitly used as high-frequency encoders and the measured data are a superposition of different signal frequencies. In this case, the decoding step requires a computationally costly reconstruction algorithm and precise knowledge of the exploited signal properties. However, the gain of signal resolution can again be recognized as a trade-off in the Lukosz sense, typically depending on the signal-to-noise ratio (SNR) of the signal, that is, the number of bits encoded in each data sample.
168
Markus E. Testorf and Michael A. Fiddy
Despite the fact that one can usually assume that the measurable fields are analytic, noise renders these problems ill-posed and all superresolution algorithms ill-conditioned. Regularization techniques and the incorporation of prior knowledge can alleviate the ill conditioning, but typically trade-offs must be made in attempting to improve an image. This applies to both superresolution concepts.
2. CLASSICAL ESTIMATES OF THE IMAGE RESOLUTION

In order to fully appreciate the principles underlying any superresolution strategy, we need to review in some depth the classical concepts used for estimating image resolution. For instance, the diffraction limit as defined by the Rayleigh resolution criterion depends on a number of conditions, most importantly a clear system aperture. This can be safely assumed for most conventional imaging systems. A number of metrics have been developed to describe the image resolution for different classes of imaging systems, but it is not our goal to compare these different resolution limits exhaustively. Instead, we focus on two resolution criteria: the Rayleigh limit and the Abbe resolution limit. It is important to understand the fundamental difference between these two criteria, as well as how superresolution methods relate to each of them.
2.1. The Rayleigh Limit

The term optical superresolution is associated with any imaging technique that provides resolution beyond the Rayleigh limit. Textbooks commonly define the Rayleigh criterion in terms of the PSF of a diffraction-limited system. Assuming a clear circular aperture, the system PSF is the Airy disk, and the Rayleigh limit is associated with the image of two point objects, where the maximum of the first Airy disk coincides with the first zero of the second Airy function. For later reference we briefly derive Rayleigh's criterion for a model 4f system (Figure 1), closely following Rayleigh's original derivation (Rayleigh, 1899). To keep mathematical expressions simple we restrict the discussion to a two-dimensional (2D) geometry with only one transverse coordinate. This means a clear aperture in the Fraunhofer plane of the system may be described by a rectangle of size A. The system PSF for incoherent image formation reads

P(x) = sinc²(xA/λf), (1)
which corresponds to a triangular optical transfer function (OTF) of width 2A. The Rayleigh criterion applies to the image of two point objects,
FIGURE 1 Generic 4f system: object plane, Fraunhofer plane containing the aperture of size A, and image plane, separated by focal distances f.
I_tp(x) = P(x − d/2) + P(x + d/2), (2)
where d = λf/A defines the smallest distance for which a separation of the two object points is possible. In this case, the two-point image shows a saddle at x = 0, which is about 81.06% of the maximum intensity. The Rayleigh limit assumes that this is the minimum contrast necessary to distinguish the two objects as separate points. If the object points are moved closer together, the saddle region is assumed unrecognizable and separation is no longer possible. It is immediately obvious that the Rayleigh criterion is a rather arbitrary limit tied to a set of very specific conditions, namely, a two-point object and a clear aperture across the entire pass band of the angular spectrum. The ability to recognize the saddle is obviously a function of the SNR of the detected signal, which allows for some resolution uncertainty, but not much. Rayleigh's (1899) own discussion leaves no doubt that he himself conceived the resolution limit as a heuristic estimate. In fact, he starts his discussion by pointing out that a safe estimate of the diffraction limit would be twice the radius of the main lobe of the PSF. With reference to Friedrich Wilhelm Herschel he remarks on the practice of obscuring the central part of a telescope's aperture to obtain an annular aperture that results in a narrower center lobe, albeit at the expense of increased side lobes. This, in effect, foreshadows contemporary work on the design of multi-aperture telescopes (Watson et al., 1989), as well as Toraldo di Francia's (1952) work on supergain antennae, which opened up the field of superresolution filters. A number of more recent studies have stressed the fact that the Rayleigh limit can be regarded as an upper bound on the resolution limit, especially in the context of numerical reconstruction techniques (Dharanipragad, 1996).
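The 81.06% saddle figure is easy to verify numerically. The following Python sketch (our illustration, not part of the original text) works in units of the Rayleigh distance d, so that the incoherent PSF becomes P(x) = sinc²(x), and superposes two such PSFs at the Rayleigh separation:

```python
import numpy as np

# Incoherent PSF of Eq. (1) in units of the Rayleigh distance d = lambda*f/A,
# so that P(x) = sinc^2(x) and the Rayleigh separation equals 1.
def P(x):
    return np.sinc(x) ** 2  # np.sinc(x) = sin(pi x)/(pi x)

x = np.linspace(-2.0, 2.0, 4001)
I_tp = P(x - 0.5) + P(x + 0.5)       # two-point image, Eq. (2), at separation d

saddle = I_tp[np.argmin(np.abs(x))]  # intensity of the saddle at x = 0
peak = I_tp.max()                    # the maxima sit at x = +/- 0.5
print(saddle / peak)                 # ~0.8106, the 81.06% saddle
```

The ratio equals 8/π² ≈ 0.8106 independent of A, f, and λ, which is why the saddle depth can be quoted as a universal number.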
While this is undoubtedly correct in a rigorous sense, it ignores the empirical observation that the Rayleigh limit defines—very roughly—a demarcation line between the image resolution
obtainable with even the most generic of optical instruments, and those imaging systems that must rely on specially designed optical hardware and post–image-processing techniques. It has been shown that the resolution limit can be reduced significantly using numerical techniques if prior information about the imaging problem is available. This includes, for example, precise knowledge of the PSF and some object space information, such as local support constraints. It should be cautioned, however, that the utility of such numerical techniques is not guaranteed. In practice, none of the object properties or the instrument functions are likely to be known with sufficient precision to ignore the Rayleigh limit. Typically, the exploitation of prior knowledge increases the sensitivity of the signal recovery to deviations from these assumed constraints; we return to this problem in Section 6. This highlights the fact that despite the progress in imaging technology, the Rayleigh limit remains a useful heuristic estimate of the resolution limit.
2.2. Abbe's Theory of the Microscope

It is worthwhile to contrast the Rayleigh limit with other resolution criteria, and we do so with Abbe's analysis of the resolving power of a microscope (Abbe, 1873; Lohmann, 2006a). Let us assume a periodic input signal to the 4f system (Figure 2),

u_p(x) = Σ_{n=−∞}^{+∞} rect[(x − n d_g)/a]. (3)

The grating period is d_g and each grating period consists of a rectangular slit of width a. The plane wave spectrum in the Fraunhofer plane has the form

ũ_p(ν) = Σ_{m=−∞}^{+∞} u_m δ(ν − m ν_g) (4)

with ν_g = 1/d_g. The system aperture will block all diffraction orders m with |m ν_g| > A/(2λf), and the image loses all information about the grating structure if ν_g > A/(2λf). This means that the minimum grating period d_g = 2d is twice the Rayleigh limit. Figure 2 illustrates the case where the zeroth, as well as the plus and minus first diffraction orders, pass through the system aperture. Thus, the smallest feature we can observe is the oscillation of the base frequency of the input grating:

u_image(x) = u_0 + 2u_1 cos(2π ν_g x). (5)
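Abbe's argument can be replayed numerically with a discrete Fourier transform. The sketch below (our illustration; normalized units and a 50% duty-cycle grating) low-passes the grating spectrum at two aperture sizes and measures the fringe contrast of the resulting image:

```python
import numpy as np

# Coherent 4f imaging of the rectangular grating of Eq. (3), modeled as an
# ideal low-pass filter acting on the plane wave spectrum (normalized units).
N = 4096
x = np.arange(N)
d_g = 64                                    # grating period in samples
u_p = ((x % d_g) < d_g // 2).astype(float)  # slit width a = d_g/2

U = np.fft.fft(u_p)
freqs = np.fft.fftfreq(N)                   # frequency nu in cycles/sample
nu_g = 1.0 / d_g

def image_contrast(cutoff):
    """Pass only |nu| <= cutoff and return the intensity fringe contrast."""
    U_filtered = np.where(np.abs(freqs) <= cutoff, U, 0)
    I = np.abs(np.fft.ifft(U_filtered)) ** 2
    return (I.max() - I.min()) / (I.max() + I.min())

print(image_contrast(1.5 * nu_g))  # m = 0, +/-1 pass: fringes at nu_g survive
print(image_contrast(0.5 * nu_g))  # only m = 0 passes: the grating vanishes
```

With the cutoff between ν_g and 2ν_g the image reduces to the cosine of Eq. (5); once the cutoff drops below ν_g the contrast collapses to zero, which illustrates the rigorous character of the Abbe limit.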
FIGURE 2 Abbe’s theory of resolution. (a) Periodic input signal; (b) signal spectrum modulated by the system aperture; (c) output signal composed of the zeroth and the plus and minus first diffraction orders of the input signal.
We note that this concept can also be used to establish the operating wavelength as the fundamental diffraction limit. If the signal is detected at some distance from the object, then only homogeneous (i.e., propagating) plane wave components will contribute to the detected signal. The spectrum of homogeneous plane waves covers a frequency bandwidth of 2/λ, and consequently we obtain a fundamental resolution limit of λ/2. However, the validity of this limit is subject to the same conditions we need to establish for the Abbe criterion in general. To associate in a strict sense the cutoff frequency with the size of a localized feature in the object domain, we have to assume an infinitely extended periodic object with the resolved local feature, in fact, being periodically continued. Neither the Rayleigh limit nor the Abbe limit specifies a perfect image; rather, both establish criteria that allow retrieval of spatial information. In the same spirit, we can continue the investigation of the Abbe criterion, change the direction of the plane wave incident on the grating, and thus offset the plane wave spectrum. We then observe the interference pattern of the zeroth diffraction order and the first diffraction order. This is not the same as the original grating pattern but provides spatial information about the input grating. In this way, we can extend the resolution limit of Abbe's theory by a factor of two and the corresponding minimum grating period becomes identical to the Rayleigh limit.
The freedom of choosing the incidence direction of a coherent plane wave is equivalent to illuminating with incoherent light. The latter can be interpreted as many mutually incoherent plane waves illuminating the object from all directions. The effective OTF has twice the bandwidth compared with coherent image formation, which results in a minimum period compatible with Rayleigh resolution. It is remarkable that the Abbe perspective, involving an infinitely extended grating, leads to a rigorous resolution limit rather than a heuristic estimate. Once only a single diffraction order (albeit from an infinitely extended periodic function) passes the aperture, we lose the object information regardless of the SNR. Even the condition of a clear aperture is not crucial in this case. In contrast, the Rayleigh limit assumes an object or structure of finite extent and does not result in an equally rigorous definition of resolution.
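The factor-of-two gain from oblique illumination can be demonstrated with the same discrete model. In the sketch below (our illustration, normalized units), the aperture cutoff is deliberately set below ν_g, so the grating is invisible under normal incidence; tilting the illumination by ν_g/2 shifts the spectrum so that two adjacent diffraction orders pass and fringes at ν_g reappear:

```python
import numpy as np

# Oblique illumination in a coherent 4f system (normalized units): a plane
# wave tilt multiplies the object by exp(2*pi*i*tilt*x), shifting its spectrum.
N = 4096
x = np.arange(N)
d_g = 64
nu_g = 1.0 / d_g
grating = ((x % d_g) < d_g // 2).astype(float)
freqs = np.fft.fftfreq(N)
cutoff = 0.75 * nu_g        # aperture too small: nu_g itself is blocked

def fringe_contrast(tilt):
    u_in = grating * np.exp(2j * np.pi * tilt * x)
    U_filtered = np.where(np.abs(freqs) <= cutoff, np.fft.fft(u_in), 0)
    I = np.abs(np.fft.ifft(U_filtered)) ** 2
    return (I.max() - I.min()) / (I.max() + I.min())

print(fringe_contrast(0.0))       # normal incidence: only m = 0 passes
print(fringe_contrast(nu_g / 2))  # tilt: orders m = 0 and m = -1 both pass
```

The interference of the two transmitted orders is not a faithful copy of the grating, but it carries its spatial frequency, exactly as argued above.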
2.3. Relation to the Sampling Theorem

It is interesting and important for our discussion that the classical resolution limits are also compatible with well-known signal analysis and the Shannon–Whittaker sampling theorem. We recall that a signal of finite bandwidth Δν_A can be represented by an infinite set of equidistant sampling points separated by x_s = 1/Δν_A. Additional sampling points do not carry independent information. If we apply this sampling procedure to the output of a 4f system and analyze the coherent image formation process, we find a spatial bandwidth of the output signal Δν_A = A/(λf) and the sampling distance x_s = d. Thus, the sampling distance is half the minimum feature size that can be resolved according to Abbe's theory and is identical to the Rayleigh resolution limit. This result is not surprising, since the sampling theorem requires at least two samples per period of the highest harmonic contained in the signal. Also, the Rayleigh limit centers the PSF of the first point at the first zero of the second point, thus enforcing the same distance as required for a sinc-function interpolation of the signal from sampled data. Like the Abbe criterion, the sampling theorem assumes an infinitely extended signal or object. If the signal is also strictly periodic—again mirroring the assumption we made for deriving Abbe's resolution limit—the entire signal information can be rigorously represented by a finite number of data or samples. Adopting this model opens up the possibility for numerical computation and numerical superresolution methods. We can interpret a signal of finite support as one period of an infinitely periodic signal, modulated by a window function with that finite support. We can interpret the spectrum within a finite frequency band as a
superposition of the baseband frequencies and down-converted higher signal frequencies. Samples of this finite spectrum can then be used as the basis for bandwidth extrapolation and superresolution methods.
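The sinc-function interpolation mentioned above can be made concrete. The following sketch (our illustration; the bandwidth and test frequency are arbitrary choices, not from the text) reconstructs a band-limited signal from samples spaced x_s = 1/Δν apart:

```python
import numpy as np

# Shannon-Whittaker interpolation: samples taken at x_s = 1/delta_nu recover a
# signal of total bandwidth delta_nu. Test signal: a cosine inside the band.
delta_nu = 2.0                 # total (two-sided) bandwidth, arbitrary units
xs = 1.0 / delta_nu            # sampling distance
nu0 = 0.7                      # test frequency, below delta_nu/2

n = np.arange(-200, 201)       # finite sample set (the theorem assumes all)
samples = np.cos(2 * np.pi * nu0 * n * xs)

def reconstruct(x):
    """Sinc interpolation of the sampled signal at position x."""
    return float(np.sum(samples * np.sinc((x - n * xs) / xs)))

x0 = 0.123                     # a point between sampling positions
print(abs(reconstruct(x0) - np.cos(2 * np.pi * nu0 * x0)))  # small truncation error
```

The residual error is due only to truncating the infinite sampling series; it shrinks as more samples are included, mirroring the theorem's assumption of an infinitely extended signal.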
2.4. Other Classical Metrics for Resolution

We conclude our review of classical resolution criteria by acknowledging the existence of alternative estimates of the resolution limit. In particular, if we recall the list of conditions necessary to establish the Rayleigh limit, it is obvious that more suitable estimates can be established if we narrow the class of permissible input signals or seek a criterion for accommodating a specific class of imaging systems. As merely one example, we mention system apertures other than the clear rectangular or circular aperture of a generic 4f system. For instance, astronomical telescopes are often designed as multi-aperture systems for which Rayleigh's resolution limit no longer suffices and alternative two-point resolution criteria have had to be established (Watson et al., 1988). Another point is that in practice image resolution is often significantly better than predicted by the Rayleigh limit. This has led to empirical resolution estimates (Treanor, 1946). An attempt to account for the dependence of the observed resolution on the vision of the human observer (or the SNR of the camera) is Sparrow's criterion (Sparrow, 1916). Here, the resolution limit is associated with a two-point object and a clear rectangular system aperture. Sparrow's criterion can then be interpreted as the distance between the two object points where the saddle between the two maxima of the superposed PSFs can no longer be observed and a single maximum starts to develop, a separation smaller than the Rayleigh distance. Rayleigh, in his original work, starts his discussion with the suggestion to define the resolution limit as the distance where the first zero of the first Airy function coincides with the first zero of the second Airy function.
This criterion, equivalent to twice the Rayleigh distance, would eliminate the dependence of the resolution limit on the SNR, yet would provide an upper bound rather than the lower bound as defined by Sparrow’s criterion. Even this brief and incomplete survey of classical resolution criteria confirms our understanding of Rayleigh’s criterion as a heuristic estimate that may be used as a baseline for comparison rather than as a fundamental physical property or limitation of an imaging system. Abbe’s criterion allows us to establish a rigorous resolution limit, at least in a theoretical sense, that turns out to be compatible with Rayleigh’s more heuristic criterion. Thus, Abbe’s condition also provides some additional justification for using Rayleigh’s condition despite its shortcomings. At the same time it is evident that defining the resolution limit rigorously for signals of practical importance is an elusive goal. Any rigorous
definition of the resolution limit requires us to associate the minimum feature size, a local property of the signal, with a well-defined frequency cutoff (i.e., a local property of the spectrum). In the case of an infinitely extended periodic signal, the local feature is replicated periodically and thus associated with a nonlocal feature. We conclude that the inability to establish a rigorous diffraction limit is a consequence of the Fourier reciprocity of signal and spectrum. The uncertainty relationship between signal and spectrum hinders us from finding a more rigorously defined classical limit. However, for objects of finite support this opens up the possibility of superresolution imaging. In the remainder of our survey we explore superresolution methods, and we distinguish between methods that exploit this uncertainty relationship and those that deliberately reshuffle the spectrum to fit additional signal frequencies through the pass band of the imaging system. While this distinction is in some sense artificial, it allows us to highlight important properties of certain numerical superresolution methods. We first consider the latter of these two categories.
3. REASSIGNING DEGREES OF FREEDOM

Image resolution is widely regarded as a critical parameter for characterizing imaging systems. Lukosz proposed a different interpretation, treating the imaging system as a communication channel. In his two seminal papers, Lukosz (1966, 1967) recognizes that it is not the image resolution (or system bandwidth) that determines the ultimate limit, but the total amount of object information transferred to the image plane—that is, the information capacity of the imaging system. As part of his discussion, Lukosz proposes several coding schemes that use part of the total channel capacity to transmit frequency bands of the signal that would otherwise reside outside the pass band of the system. This approach makes the available trade-offs explicit: the resolution can be improved beyond the Rayleigh limit by selecting a suitable coding scheme, but other parameters that determine the total information capacity, such as the object size, must be adjusted accordingly to keep the total amount of transmitted signal information unchanged.
3.1. The Space-Bandwidth Product and the Degrees of Freedom of an Image

Lukosz's concept of superresolution by reassigning channels to high signal frequencies draws on the notion of the space-bandwidth product and the degrees of freedom as the decisive property of optical signals and
Superresolution Imaging—Revisited
systems. For our discussion, it is worthwhile to recall briefly the theoretical justification of this concept. We need to tackle the question of how many independent variables are needed to describe an optical signal, namely, the optical wavefront. This problem was first considered by von Laue (1914) and later by Gabor (1961), who interprets the degrees of freedom as channels for information transfer. He also recognizes the close relationship between the degrees of freedom of optical signals and important invariants of geometrical optics, namely, the Smith–Lagrange invariant. Toraldo di Francia (1955, 1969) repeatedly considered the number of degrees of freedom of optical signals in the image plane. In his 1955 paper, he derives an estimate of the degrees of freedom with the help of the sampling theorem. We can once again consider the 4f system in Figure 1. Regarding the system aperture as a perfect low-pass filter with a bandwidth of Δν_A = A/(λf), we need to sample the image at intervals x_s = 1/Δν_A. Then, assuming that the essential parts of the signal are contained in a window of size D, we can estimate the number of independent variables for describing the signal as S = D Δν_A.
(6)
For the case of rectangular images and apertures in a fully three-dimensional (3D) imaging system, we can extend this concept straightforwardly to a second signal dimension and arrive at what is commonly referred to as the space-bandwidth product (SBP) (Lohmann, 2006b). The term degrees of freedom refers to the minimum number of independent variables necessary to describe an optical signal. This is typically associated with a slightly more rigorous analysis, in which we acknowledge that the amplitude and phase of the complex amplitude have to be treated as independent variables. Furthermore, we can extend the analysis to the case of time-dependent signals with a finite duration and an essential time bandwidth. Thus, the total number of degrees of freedom may be estimated as (Gabor, 1961; Lohmann, 2006b) F = 2Δx Δν_x · 2Δy Δν_y · 2TB,
(7)
where Δx, Δy, Δν_x, and Δν_y denote the extension of the image and the spatial bandwidth with respect to the two independent signal coordinates x and y, and T and B refer to the time duration and bandwidth of the signal. We may also assign an additional factor of two if we consider the two polarization states as independent degrees of freedom. For small F we may further add one degree of freedom per independent signal dimension, which reflects the fact that even subwavelength features are capable of transmitting at least one optical mode.
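As a quick sanity check, Eqs. (6) and (7) can be evaluated for representative numbers. All parameter values below (aperture, wavelength, focal length, field size, time-bandwidth product) are assumptions chosen purely for illustration, not values from the text.

```python
# Numeric sketch of the space-bandwidth estimates in Eqs. (6) and (7).
# All parameter values are illustrative assumptions.

wavelength = 0.5e-6   # λ = 500 nm
f = 0.1               # focal length of the Fourier lenses (m)
A = 0.01              # aperture width (m)
D = 1e-3              # field of view (m)

dnu_A = A / (wavelength * f)   # system bandwidth Δν_A = A/(λf)
x_s = 1.0 / dnu_A              # required sampling interval
S = D * dnu_A                  # space-bandwidth product, Eq. (6)

print(f"bandwidth  Δν_A = {dnu_A:.3g} cycles/m")
print(f"sampling   x_s  = {x_s:.3g} m")
print(f"SBP        S    = {S:.0f}")

# Extending to Eq. (7): amplitude and phase (factor 2 per dimension),
# two equal transverse dimensions, and a time-bandwidth product TB.
TB = 100
F = (2 * S) * (2 * S) * (2 * TB)
print(f"degrees of freedom F = {F:.3g}")
```

For these assumed values the one-dimensional SBP is a few hundred, while the full count of Eq. (7) reaches tens of millions, illustrating why the time-bandwidth factor dominates for dynamic signals.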
For incoherent image formation we may be inclined to double the number of degrees of freedom for each spatial coordinate, since in this case the OTF of the system is twice as large in each direction compared with the coherent case. As Toraldo di Francia (1955) points out, however, the degrees of freedom are no longer truly independent, since their values must be chosen to produce a real positive output signal.

The notion is often extended to characterize physical objects, for which the degrees of freedom of the third spatial dimension must be taken into account. We note, however, that the direction parallel to the optical axis does not carry additional degrees of freedom for the electromagnetic field in free space, which is completely specified by the complex amplitude and the signal dynamics determined by the wave equation. This distinction between the degrees of freedom of a wavefront and a physical structure is crucial for designing photonic elements for manipulating light signals.

We note that the notion of the SBP is based on the heuristics of a truncated sampling expansion. Similar to the resolution limit, the rigor of the SBP is limited by the uncertainty of the signal bandwidth due to the truncation of the sampling expansion. We will turn to a more rigorous justification of the SBP in Section 6. However, the heuristic definition of the SBP as stated in Eq. (6) is sufficient to understand Lukosz's concept of superresolution.

Finally, we emphasize that the SBP, as well as the concept of degrees of freedom as defined above, also depends on the assumption of a space-invariant imaging system. If the point response is a function of position, for instance caused by aberrations and vignetting effects, we have to generalize the concept of the SBP to allow for the analysis of local frequencies and spectral properties. A suitable framework for this type of generalization is phase-space optics.
The application of phase-space optics to the analysis of the propagation of the degrees of freedom through optical systems was pioneered by Lohmann (2006b). We illustrate the phase-space concept by analyzing one particular superresolution imaging system.
3.2. Superresolution via Space-Bandwidth Adaptation

Figure 3 schematically shows an imaging system that Lukosz (1967) proposed in his original work and which, in principle, allows us to improve the resolution by an arbitrary factor N_r. The coherent object wavefront is propagated to a grating placed in front of the first lens. The number of grating harmonics determines the gain factor N_r. Here, we assume the grating's transmission function to be cosinusoidal, so that N_r = 2. Grating diffraction produces two copies of the object wave, each copy propagating in a different direction. The 4f system images the wavefront immediately behind the grating G1 to the conjugate plane, where a second
Superresolution Imaging—Revisited
FIGURE 3 Lukosz-type superresolution system: The signal in the object plane (OP) is propagated to the first grating (G1). The encoded signal is then imaged to the conjugate plane located at the second grating (G2) by the 4f imaging system consisting of two Fourier lenses, L1 and L2. The system aperture of size A resides in the Fraunhofer plane (FP). The decoded signal is observed in the image plane (IP) of the system.
identical grating G2 is located. Conceptually, the desired image can be recovered by free-space (back-)propagation of the wavefront behind the second grating to the image plane IP.

The trade-off between image resolution and permissible object size can be understood most conveniently with the help of phase-space diagrams (PSDs). This means that the optical signal is depicted as a finite area in an abstract phase space that simultaneously represents spatial information and spatial frequency information. The rigorous foundation of this concept can be established with the help of the Wigner distribution function, which associates each complex amplitude distribution with a phase-space distribution (Torre, 2005). Figure 4a shows the schematic representation of an optical signal. The concept of the SBP allows us to consider the case of a signal that is essentially compact in space and simultaneously essentially band-limited, with S = Δx Δν.

For paraxial optical signals, propagation through any lossless optical system can be described in phase space with the ABCD formalism of geometrical optics. The phase space of coherent signals obeys the same propagation rules as the phase space of geometric optical rays. For instance, free-space Fresnel diffraction corresponds to a horizontal shear of the signal's phase-space area. This becomes immediately obvious if we consider the propagation of optical rays: the transverse ray position is shifted proportional to the sine of its propagation angle. Quantitatively, we can deduce the shear parameter by considering paraxial ray propagation over a distance z along the optical axis. This means that a ray corresponding to a spatial frequency ν_r will experience a transverse shift by Δx_r = ν_r λz.
(8)
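The shear described by Eq. (8) can be checked numerically by applying the free-space ray-transfer matrix to a few points in the (x, ν) phase plane. The wavelength, propagation distance, and sample rays below are illustrative assumptions.

```python
import numpy as np

# Free-space propagation as a horizontal shear of phase space, Eq. (8):
# a ray at spatial frequency ν shifts transversely by Δx = ν λ z while
# its frequency is unchanged. Parameter values are assumptions.

wavelength = 0.5e-6
z = 0.05

# Phase-space coordinates (x, ν) for a few sample rays.
rays = np.array([[0.0, 1e4],
                 [1e-4, -2e4],
                 [-5e-5, 0.0]])

# Shear matrix acting on (x, ν): x' = x + λ z ν, ν' = ν.
M = np.array([[1.0, wavelength * z],
              [0.0, 1.0]])
sheared = rays @ M.T

for (x, nu), (xp, nup) in zip(rays, sheared):
    print(f"ν = {nu:+.1e}:  Δx = {xp - x:+.3e}  (ν λ z = {nu * wavelength * z:+.3e})")
```

The ray at ν = 0 does not move, while rays of opposite frequency shift in opposite directions, which is exactly the horizontal shear of the phase-space area described in the text.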
FIGURE 4 Phase-space diagram of the superresolution system in Figure 3. (a) Signal passing the 4f system without encoding; (b) signal with a bandwidth exceeding the pass band of the 4f system by a factor of two; (c) before the first grating (G1); (d) after G1; (e) encoded signal after passing the 4f system and before the second grating (G2); (f) after G2; (g) signal back-propagated to the image plane IP; and (h) after removing artifacts outside the signal area.
Similarly, other paraxial optical systems can be associated with affine transformations of phase space. As a second example, the Fourier transformation of Fraunhofer diffraction exchanges the space and frequency axes, corresponding to a clockwise rotation of phase space by 90°.

Equipped with these tools derived from phase-space optics we can now conveniently analyze Lukosz's system. Figure 4a shows the PSD of a signal transmitted by the 4f system without the encoder/decoder gratings. To improve the image resolution by a factor of two, we must transmit signals through the 4f system that have twice the nominal bandwidth supported by the system. In this case, this goal is achieved by preserving the number of transmitted degrees of freedom (i.e., the area or volume in phase space) while cutting the field of view in half (Figure 4b). Propagation to the first grating corresponds to a horizontal shear (Figure 4c). The subsequent modulation with the grating function splits the signal into two identical copies, each shifted along the frequency axis (Figure 4d). Imaging the signal immediately behind the first grating to the conjugate plane of the second grating corresponds to erasing all frequencies outside the band that is transmitted by the optical system. Figure 4e schematically shows the resulting PSD. (Note that this treats the effect of the aperture in a geometric optical sense; a rigorous model needs to include the diffraction effects, resulting in a spatial blur in the image plane.) The second grating again generates two identical copies of the incident signal, each shifted in opposite directions along the frequency axis (Figure 4f). As a result we obtain three signal components, one of which is the restored (but Fresnel-propagated) input signal. We recover the input signal after back-propagating it to the output image plane (Figure 4g) and applying a spatial filter to block the undesired signal components (Figure 4h).
The grating encoder system was investigated with the help of a PSD in a similar form by Zalevsky and Mendlovic (2004). In fact, PSDs have been frequently used to illustrate Lukosz-type superresolution (Mendlovic and Lohmann, 1997; Mendlovic et al., 1997; Zalevsky et al., 2000a). It is also noteworthy that the PSDs not only serve as useful illustrations of signals with space-variant spectral properties, but can also be used effectively to deduce important quantitative relationships. For instance, Figure 4c allows us to establish the relationship between the signal support, the bandwidth of the imaging system, and the position of the encoder grating relative to the input plane. Recognizing that in the grating plane the signal shear must be Δx for the system bandwidth Δν, we can use Eq. (8) to establish the distance of the encoder grating from the object plane as z_G = Δx/(2λΔν).
(9)
Since 0 < z_G < f must hold, this also formulates a constraint on the ratio of object size to system bandwidth.
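Eq. (9) and its placement constraint are easy to evaluate numerically. The values of Δx, Δν, λ, and f below are assumptions chosen for illustration.

```python
# Placement of the encoder grating, Eq. (9): z_G = Δx / (2 λ Δν).
# All parameter values are illustrative assumptions.

wavelength = 0.5e-6   # λ = 500 nm
delta_x = 1e-3        # signal support Δx (m)
delta_nu = 2e5        # system bandwidth Δν (cycles/m)
f = 0.1               # focal length (m); constraint 0 < z_G < f

z_G = delta_x / (2 * wavelength * delta_nu)
print(f"z_G = {z_G * 1e3:.1f} mm")

# The constraint on the ratio of object size to system bandwidth:
assert 0 < z_G < f, "object size and bandwidth violate the placement constraint"
```

For these numbers the grating sits 5 mm from the object plane, comfortably inside the focal distance; a larger object or a smaller bandwidth pushes z_G toward f and eventually violates the constraint.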
3.3. Interpretation of Lukosz Superresolution

The analysis of the grating encoder system allows us to interpret Lukosz-type superresolution in more general terms. While the system in Figure 3 was successfully demonstrated experimentally (Bachl and Lukosz, 1967), we note that superresolution comes at a price. Even under optimum conditions, the output signal carries only one quarter of the input power: both the system aperture A and the spatial filter in the output plane cut the remaining signal energy in half. In general, the loss in signal power is proportional to the gain in resolution. This is significant if we remember that the Rayleigh diffraction limit critically depends on the SNR of the detected signal. Similar trade-offs apply to essentially all systems that exploit the reassignment of information channels for resolution improvement.

A second observation concerns the terminology. It is obvious that the entire system (including the diffraction gratings) treated as a black box would merely act as an imaging system with a nominal resolution of 1/(2Δν). Both gratings must be sufficiently large to accept an angular spectrum perfectly compatible with this nominal resolution. We can speak of superresolution only if we carefully define the part of the system that is considered the communication line and the parts that act as encoders and decoders of information. This distinction may be obvious for some instances of Lukosz superresolution, but it is nonetheless important, and its omission sometimes renders claims about the degree of superresolution accomplished rather suspect.

Phase-space analysis allows us to identify another important aspect of this superresolution approach. Note that the encoding step results in a PSD in which the two copies of the original signal do not overlap. This can be generalized by considering a phase space or configuration space in which the different copies of the original signal are passed through the system.
The copies need to be orthogonal in phase space, which is readily achieved by avoiding their overlap. This concept can be extended to cases in which the channel coding involves sequential measurements of different spatial frequency bands (time-domain encoding) or the transmission of signal parts modulated with different time-frequency carriers (time-frequency encoding). Phase-space analysis may then involve an extended phase space consisting of space and spatial frequency coordinates, as well as time and time-frequency coordinates. The phase-space perspective straightforwardly includes mixed encoding schemes wherein components of the spatial input signal are carried by generalized time-frequency signals. However, superresolution is achieved only if the transmitted signal parts do not overlap in this extended phase space. We can further extend the Lukosz concept to include encoding schemes in which the different frequency bands of the input signal are
modulated in phase space with orthogonal masking functions. Despite partial overlap in phase space, the knowledge of the encoding masks is sufficient to decode and combine the spatial frequency components. This may be compared with aperture coding schemes, or masking in the object and image space, where the masks form a set of orthogonal codes that allow signal reconstruction from multiple measurements. This can again be cast in terms of an extended configuration space, where one coordinate corresponds to the parameter that distinguishes the orthogonal masking functions. Thus, we can construct a configuration space in which the transmitted signal components do not overlap and can be exploited to provide improved resolution. A systematic treatment of the phase-space interpretation of Lukosz superresolution was given by Zalevsky, Mendlovic, and Lohmann (Zalevsky et al., 2000b) and by Zalevsky (2009).

Lukosz remarks that the system in Figure 3 works for both coherent and incoherent signals. In the latter case, each object point acts as a coherent point source. The PSD then shows a vertical line for this object point, and the construction of the output signal proceeds exactly as before. Repeating the procedure for each object point independently results in an image that again shows improved resolution for a reduced field of view. Superposition of the multiplexed signal components is achieved with optical hardware, thus ensuring coherent superposition. For applications of the Lukosz scheme where the multiplexing involves the detection and numerical superposition of frequency bands, the distinction between coherent and incoherent processing is more critical.
4. APPLICATIONS OF LUKOSZ SUPERRESOLUTION

The specific examples discussed by Lukosz in his seminal papers on superresolution may be of limited interest for practical reasons; however, the underlying concept of "squeezing" phase space—that is, formulating a trade-off between different dimensions that describe the imaging system—ultimately is useful for classifying most superresolution methods. Encoding the input signal to meet the bandwidth restriction of the communication channel exemplifies the core of current trends to design imaging systems for specific tasks. It is useful to consider established and emerging imaging techniques and understand how they conform to Lukosz's concept.
4.1. Synthetic Aperture Radar Imaging

Aperture synthesis is not typically considered a superresolution technique. Multiple measurements are performed that correspond to patches of the Fourier transform of the object. The image is computed numerically from the entire portion of the Fourier transform that is accessible this way.
The standard interpretation is to associate the resolution limit with the final size of the synthetic aperture, which then conforms to the Rayleigh diffraction limit. Lukosz (1967) suggested an alternative interpretation. If the actual instrument for data acquisition—for instance, the transmitting antenna of a synthetic aperture radar system—is regarded as the communication channel in the Lukosz sense, we observe that only low spatial bandwidth measurements are possible. The spatial degrees of freedom may in fact be reduced to one. The encoding is performed by scanning the instrument across the target area, and the trade-off is made between total spatial bandwidth (or resolution) and the time necessary to transmit the entire dataset. In other words, we assume a small time bandwidth of the input scene, which is true, for instance, for stationary objects.

Treating aperture synthesis as a superresolution method highlights the universal character of the Lukosz superresolution approach. At the same time, however, it also emphasizes that the ability of an imaging system to achieve superresolution is often linked to a rather arbitrary breakdown of the entire system into encoder and communication channel. This once again suggests caution in using the term superresolution.

Aperture synthesis extends the superresolution concept straightforwardly to hybrid imaging systems, where part of the imaging task is performed with optical hardware while other parts are realized as numerical algorithms operating on a finite discrete set of measurements. For optical frequencies it is necessary to distinguish between coherent and incoherent image formation. If interferometric or holographic methods are used to record the complex amplitude, we can patch the synthetic aperture from sequential measurements in a manner very similar to the case of radiofrequencies. The situation changes if we assume mutually incoherent object points.
In this case, the intensities recorded from laterally shifted low-resolution (LR) imaging systems cannot be fused without difficulty. Recent work shows the implementation of synthetic aperture Fresnel holography for quasi-monochromatic light signals (Katz and Rosen, 2010). In this case, each lateral scanning position of the LR aperture is encoded with a mask in front of a lensless detector. The mask creates a self-referenced holographic pattern for each object point, which can be fused and reconstructed numerically.
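The coherent patching of the Fourier domain described above can be sketched in one dimension: each "measurement" captures one spectral patch, and stitching the patches reproduces the image of a single aperture covering the full synthesized band. The object and the patch layout are illustrative assumptions.

```python
import numpy as np

# Sketch of coherent aperture synthesis: sequential measurements each
# capture one patch of the object spectrum; their union reproduces the
# image of one large synthetic aperture. Object and bands are assumed.

N = 256
x = np.arange(N)
obj = (np.abs(x - 128) < 40).astype(float)          # simple 1D object
spec = np.fft.fft(obj)

freq = np.fft.fftfreq(N)
patches = [(-0.3, -0.1), (-0.1, 0.1), (0.1, 0.3)]   # sub-aperture bands

synth = np.zeros(N, dtype=complex)
for lo, hi in patches:
    mask = (freq >= lo) & (freq < hi)
    synth[mask] = spec[mask]                         # one "measurement"

# Reference: a single aperture covering the full synthesized band.
full = spec * ((freq >= -0.3) & (freq < 0.3))
image_synth = np.fft.ifft(synth)
image_full = np.fft.ifft(full)
print("max difference:", np.max(np.abs(image_synth - image_full)))
```

Because the complex amplitude of each patch is available, the patches add coherently and the stitched result is identical to the single-aperture image; with intensity-only (incoherent) measurements this fusion would fail, which is the difficulty the text goes on to discuss.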
4.2. Structured Illuminations

Many recent advances in microscopic imaging systems have extended the range of applications of optical microscopy. One important innovation is the use of nonhomogeneous illumination to extract additional object information.
Structured illumination with cosinusoidal grating patterns was introduced to improve the axial resolution of a standard microscope (Neil et al., 1997), but it was quickly recognized that it also improves the lateral spatial resolution (Gustafsson, 2000). In this mode, it can be readily recognized as a variant of the grating encoder system discussed in Section 3.2 (Fedosseev et al., 2005). Instead of down-converting high frequencies to the pass band of the imaging system by placing a grating between the object and the system aperture, the spatial frequencies are redirected by illuminating the object with off-axis plane waves. The image resolution is improved by synthesizing the OTF from multiple measurements obtained with different orientations of the illuminating grating.

Compared with a physical screen that acts as both the encoder and decoder, structured illumination is band-limited in much the same way as the imaging system itself. Projecting the pattern onto the object requires a second imaging system, which is characterized by a finite pass band of angular frequencies. Thus, the gain in resolution by the use of projected patterns is limited to essentially a factor of two in each dimension. This limitation can be overcome, however, by exploiting the nonlinear response (saturation) of fluorescence (Gustafsson, 2005). The projected cosine grating induces an intensity pattern that contains higher harmonics. The frequency bands encoded by different harmonics can be recovered from multiple measurements, each with a different phase lag of the projected grating. Changing the orientation of the projected grating allows synthesis of the extended OTF. Resolution below 50 nm has been confirmed experimentally.
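The down-conversion at the heart of structured illumination follows directly from the product theorem: multiplying the object by cos(2πν₀x) shifts its spectrum by ±ν₀. The following one-dimensional sketch uses assumed frequencies to show a component beyond the cutoff being mixed down into the pass band.

```python
import numpy as np

# Structured illumination down-converts high object frequencies:
# obj(x)·cos(2π ν0 x) shifts the object spectrum by ±ν0. All the
# frequencies below are illustrative assumptions.

N = 1024
x = np.arange(N) / N
nu_obj = 180            # object frequency, beyond the cutoff
nu_0 = 150              # illumination grating frequency
cutoff = 100            # pass band of the imaging system

obj = np.cos(2 * np.pi * nu_obj * x)
illuminated = obj * np.cos(2 * np.pi * nu_0 * x)

spec = np.abs(np.fft.rfft(illuminated)) / N
peaks = np.flatnonzero(spec > 0.1)
print("spectral peaks at:", peaks)

# Only the down-converted component |ν_obj − ν_0| = 30 lies inside the
# pass band; it carries the otherwise invisible object frequency.
assert (nu_obj - nu_0) in peaks and (nu_obj - nu_0) < cutoff
```

The product contains peaks at |ν_obj − ν₀| and ν_obj + ν₀; the low-frequency mixing product passes the system, and multiple phase-shifted measurements then let the decoder reassign it to its true frequency.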
4.3. Superresolution Imaging Through Turbulence

Section 3.3 briefly highlighted the fact that the hardware required for encoding the information may well void any advantage the overall system might offer. In particular, the numerical aperture associated with the distance between object and grating encoder and the size of the grating must match the angular bandwidth of the input signal, and thus may require an aperture size significantly larger than the aperture of the low-bandwidth imaging system. In principle, this problem is avoided completely if the encoding step does not require additional hardware for the signal coding but instead uses part of the natural environment.

This approach has been widely used for imaging through atmospheric turbulence. The turbulent atmosphere usually presents a problem for imaging applications because it must be regarded as a time-dependent perturbation resulting in aberrated images with reduced image resolution. However, it is possible not only to use numerical postprocessing to recover
images with the nominal resolution of the optical system, but also to achieve a resolution significantly exceeding the Rayleigh limit (Charnotskii et al., 1990; Lambert et al., 2002). The process is based on the turbulence acting as the encoder and down-converting high object frequencies that otherwise would not pass the imaging system. Instead of a deterministic code, the screen is a random pattern characterized by its power spectrum.

While conceptually rather attractive, there is no straightforward solution to implementing the decoder, since, in general, we have no knowledge of the particular instance of the screen at the time the image is recorded, and thus a simple deconvolution step at the output of the imaging system does not usually suffice to decode the high-bandwidth signal. There are basically two approaches to exploit the recorded image information and to obtain high-resolution (HR) images.

The first approach attempts a direct (and independent) measurement of the turbulence. This can be accomplished, for instance, with wavefront sensors that are widely used for adaptive optics (Tyson, 2000). The wavefront sensor coupled with the adaptive optical system can be interpreted as the decoder of the superresolution imaging system. Alternatively, it may be possible to obtain an actual image of the turbulent medium. If we can assume isoplanatic imaging, and the domain of turbulence is constrained to an optically thin layer, this additional information can be used more directly to deconvolve the image information (Zalevsky et al., 2007). An interesting example of this kind of superresolution system was described by Zalevsky et al. (2008), where the droplets of rainfall were imaged and simultaneously used as the encoding screen.
The second approach is the numerical estimation of object information from a time series of frames with blind deconvolution algorithms (Huang and Tsai, 1984; see Sheppard et al., 1998b for a brief yet rather comprehensive comparison of existing algorithms). Fast and robust multiframe algorithms (Farsiu et al., 2004) permit superresolution of stationary and moving objects. A comprehensive discussion of these algorithms is beyond the scope of this chapter. However, it is noted that the computational cost of these decoding algorithms can be significant. We generalize this as a rule of thumb observed in practice: the less costly the encoding procedure, the more costly the decoding step.
4.4. Multi-Aperture Systems and Digital Superresolution

Recent innovations in imaging technology are driven by the desire to build cameras with a small form factor, and of particular interest has been the development of "flat" cameras. The original inspiration for the
design of small form factor (viz., "flat") cameras was the TOMBO imaging system (Tanida et al., 2001), which relies on the fusion of a number of LR images to exploit the optical resolution as provided by any of the subsystems. Currently, the resolution of computational imaging systems is limited by the size of the detector elements (Choi and Schulz, 2008), and the primary goal is to overcome these limits associated with the detector. This problem is inherent in small-focal-length cameras and becomes less important as pixel sizes more closely match the diffraction-limited resolution (in the Rayleigh sense) of the imaging lens.

Digital superresolution (Prasad, 2007), sometimes also referred to as geometrical superresolution (Solomon et al., 2005), applies the Lukosz concept to the detector rather than to the optical system. The pixel is regarded as the communication channel. The image at the pixel must be encoded to compensate for the low-pass characteristic of the finite pixel size, as well as for aliasing due to the spacing of the detector elements. If we assume u_im(x) represents the image in the detector plane, and the detector array is characterized by the size p of each rectangular pixel and the spacing x_d of the pixels, the detected discrete signal can be modeled as

u_d(x) = [u_im(x) ⊛ rect(x/p)] Σ_{n=−∞}^{∞} δ(x − n x_d) rect(x/D),   (10)

with D being the width of the detector array and ⊛ denoting a convolution. If we ignore the effect of the finite width D of the detector array, the spectrum of the detected signal will be the spectrum of the optical image, low-pass filtered—that is, modulated by a sinc function of width 1/p. The sampling corresponds to a replication of the band-limited spectrum of the image. The rectangular pixel shape inevitably results in some aliasing, and this factor becomes significant for fill factors p/x_d < 1.

To overcome the limitations that the detector array imposes on the detected signal we need to introduce diversity.
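The detector model of Eq. (10) can be simulated on a fine grid: the image is blurred by the pixel aperture and then subsampled at the pixel pitch. The grid resolution, pixel geometry, and test image below are illustrative assumptions; the finite array width D is ignored.

```python
import numpy as np

# Discrete sketch of the detector model in Eq. (10): blur by the
# rectangular pixel response (width p), then sample on the pixel grid
# (pitch x_d). Grid and test image are illustrative assumptions.

N = 2000
dx = 1e-6                                # fine simulation grid step (m)
p = 4e-6                                 # pixel width
x_d = 5e-6                               # pixel pitch (fill factor p/x_d = 0.8)

x = (np.arange(N) - N // 2) * dx
u_im = np.exp(-(x / 50e-6) ** 2)         # smooth test image

# Convolution with the pixel aperture rect(x/p), normalized to unit gain.
npix = int(round(p / dx)) + 1
pix = np.ones(npix) / npix
blurred = np.convolve(u_im, pix, mode="same")

# Sampling at the pixel pitch: keep every (x_d/dx)-th grid point.
step = int(round(x_d / dx))
u_d = blurred[::step]
print(f"{u_d.size} detector samples, peak {u_d.max():.4f} "
      f"(pixel blur lowers it from {u_im.max():.4f})")
```

The pixel blur corresponds to the sinc modulation of the image spectrum, and the subsampling to the spectral replication described in the text; with a fill factor below one, the replicas overlap and alias.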
This means we must obtain more than one measurement per pixel, using a different code for the recorded intensity in each case. In principle, the different measurements can be obtained sequentially with a single system, where a trade-off between image resolution and time bandwidth is exploited. Alternatively, we can obtain the necessary data in parallel by recording multiple frames with multi-aperture systems. A variety of coding schemes has been investigated. Aliasing can be ameliorated by exploiting a subpixel shift between different frames (Ashok and Neifeld, 2007). Ur and Gross (1992) explicitly recognize subpixel shifts of image frames as a form of Lukosz superresolution. They also suggest Papoulis' generalized sampling expansion as the framework to recover superresolved images, thereby establishing a link between
Lukosz-type superresolution and the sampling scheme we discuss in Section 7.2. The low-pass characteristic associated with the finite pixel size can be addressed in various ways (Brady et al., 2008), including focal plane coding (Portnoy et al., 2006). By modulating the pixel array in each imaging system with a mask function of higher resolution than the detector array, it is possible to transmit high-frequency components of the optical image via the LR pixel array. A convenient and robust scheme is the use of a Hadamard code. Hadamard codes are orthogonal in the sense that the computed image is ideally retrieved from a straightforward superposition of data. In addition, the masks can be realized with passive optics (Sloane and Harwit, 1976). In practice, the finite system precision and measurement noise make it necessary to resort to more sophisticated image-recovery methods (Choi and Schulz, 2008; Elad and Feuer, 1999). While these algorithms are aimed at simultaneously exploiting more than one superresolution concept, we recognize them in this context as acting as the decoder in a Lukosz-type superresolution scheme.
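The role of the Hadamard code as an orthogonal focal-plane encoder can be sketched for a single low-resolution pixel that integrates several subpixels. The ideal ±1 mask and the subpixel values below are illustrative assumptions; physical masks would be nonnegative.

```python
import numpy as np

# Sketch of focal-plane coding with a Hadamard code: one LR pixel
# integrates s subpixels; s frames, each modulated with one row of a
# Hadamard matrix, let the subpixel values be recovered through the
# orthogonality of the code. The ±1 mask is an idealized assumption.

s = 4
H = np.array([[1,  1,  1,  1],
              [1, -1,  1, -1],
              [1,  1, -1, -1],
              [1, -1, -1,  1]])               # Hadamard matrix, H @ H.T = s I

subpixels = np.array([0.2, 0.9, 0.4, 0.7])    # HR content of one LR pixel

# Frame k: the pixel reading with mask H[k] applied to the subpixels.
measurements = H @ subpixels

# Decoding: orthogonality gives subpixels = H.T @ measurements / s.
recovered = H.T @ measurements / s
print(recovered)
```

The decoding is the "straightforward superposition of data" mentioned in the text: each subpixel value is a signed sum of the pixel readings, divided by the code length.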
4.5. Evanescent Waves and Near-Field Microscopy

Lukosz's original work (Lukosz, 1966) treats the diffraction limit defined by the wavelength and the appearance of evanescent waves as not explicitly related to his theory of resolution (Lukosz, 1967). A variety of imaging techniques that exploit evanescent waves for improved imagery prompt us to consider their relationship to Lukosz superresolution. All near-field methods are based on the conversion of evanescent waves into propagating optical modes. For instance, in a scanning near-field microscope (Bohn et al., 2001; Novotny et al., 1995; Pohl et al., 1984) the tip of an optical fiber probes the near-field distribution, effectively converting the local evanescent field into a propagating fiber mode. The relationship of the scanning principle to the Lukosz concept is rather obvious: the superior spatial resolution is traded against the time needed to scan the entire object. Similar to synthetic aperture radar, the optical system transmits only one single mode, and the object function has to occupy only a small time bandwidth to allow the scanning system to operate.

As a second example, we discuss total internal reflection standing wave microscopes, where the subwavelength features of the object convert evanescent waves into propagating plane waves that can be detected in the far field (see, e.g., Cragg and So, 2000; Sentenac et al., 2009). This may be facilitated by fluorescent dyes. While it is the object geometry that down-converts frequency bands of the illuminating wave, the similarity to microscopy with structured illumination is striking, which justifies the interpretation of this class of systems in the context of Lukosz superresolution.
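Why the conversion must happen in the near field follows from the decay of evanescent waves: for spatial frequencies ν above 1/λ the longitudinal wavenumber becomes imaginary and the field decays as exp(−2π√(ν² − 1/λ²) z). The wavelength and object frequency below are assumptions chosen for illustration.

```python
import numpy as np

# Decay length of an evanescent wave carrying a spatial frequency ν
# above the free-space cutoff 1/λ. Parameter values are assumptions.

wavelength = 500e-9
nu = 2.0 / wavelength                    # object frequency at twice the cutoff

kappa = 2 * np.pi * np.sqrt(nu**2 - 1.0 / wavelength**2)   # decay constant (1/m)
decay_length = 1.0 / kappa               # distance for a 1/e amplitude drop

print(f"1/e decay length: {decay_length * 1e9:.1f} nm")
# A near-field probe must therefore sit within tens of nanometers of the
# object surface to pick up this frequency component before it vanishes.
```

For these values the amplitude drops by 1/e within roughly 46 nm, which makes clear why the fiber tip or the object's own subwavelength structure must reside in the immediate vicinity of the evanescent field.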
Superresolution Imaging—Revisited
5. OPTICAL SUPERRESOLUTION So far we have considered superresolution schemes in which the heuristic Rayleigh limit was accepted as the lower bound for resolving object information. The classical diffraction limit of a given imaging system was overcome by suitable encoding schemes for transmitting the entire signal bandwidth through the system. At the output the encoded information is decoded, either optically or numerically, and the smallest resolvable feature size of the decoded signal is, in fact, in perfect agreement with the classical diffraction limit. The term superresolution refers to a specific interpretation of the imaging system, carefully differentiating between the baseline system and the encoder/decoder. Only this context allows us to validate any claim of superresolution. A second group of superresolution schemes does not accept the PSF of a generic imaging system with a clear aperture as the fundamental limit, but is aimed at designing apertures with a reduced size of the PSF (or minimum feature size) without otherwise increasing the bandwidth of the synthesized signal. We refer to these schemes as optical superresolution, since they rely exclusively on the properties of the band-limited electromagnetic signal rather than on encoding schemes or numerical postprocessing methods.
5.1. Superresolution Filters When introducing the Rayleigh limit we mentioned Rayleigh’s comment on partially obscuring the aperture of a telescope to reduce the width of the PSF (at the expense of more dominant sidelobes). Obviously, the design of system apertures with a reduced spot size of the PSF is of great interest, not only to astronomy but equally to microscopy and optical data storage, where the storage density critically depends on the spot size of the write/read beam. In 1952, Toraldo di Francia formulated and investigated this problem more systematically. He showed that the central spot of the PSF can be reduced by constructing a system aperture composed of discrete annular slits. By carefully balancing the radii of the concentric annuli it is possible to reduce the spot size of the central disk. Moving the zeros of the band-limited PSF closer together, without changing its asymptotic properties, provides higher resolution. However, this comes at the expense of larger sidelobes, thereby effectively reducing the field of view. In practice, the reduced intensity of the central spot translates into a reduction of the SNR of the output image. In fact, for a high gain in resolution, the intensity of the central spot may be significantly lower than the intensity of the sidelobes. Thus the SNR of the signal effectively limits the achievable
Markus E. Testorf and Michael A. Fiddy
resolution, since the ratio of sidelobe intensity to the intensity of the center spot must be smaller than the SNR in order to be of any practical use. Toraldo di Francia’s study (1952), as well as many subsequent investigations (see, e.g., Martínez-Corral et al., 1995; Sales and Morris, 1997a; Sheppard et al., 1998a; Zhang, 2007), have confirmed that the size of the central spot can be made arbitrarily small, if one is willing to accept the reduced Strehl ratio and a finite field of view over which the amplitude of the sidelobes is significantly smaller than the central lobe. Most investigations following Toraldo di Francia’s work are concerned with the development of suitable algorithms to design spatial filters with a superresolving PSF, accepting the qualitative trade-off between gain of resolution, optical power concentration, and limited field of view. However, Sales and Morris (1997b) showed that an upper bound can be established for the resolution gain G as a function of the Strehl ratio S—that is, the ratio of the intensity of the PSF versus the intensity of the PSF of the clear aperture, both measured at the origin. For optical systems with circular symmetry they found a power law dependence

$S = k\,G^{a},$ (11)

where the parameters k and a must be determined for specific intervals of G. Sales and Morris (1997b) reported (k, a) = (3.41, 4.00) for 0 ≤ G ≤ 0.2, (k, a) = (5.85, 4.33) for 0.2 < G ≤ 0.46, and (k, a) = (1.5, 1.00) for 0.46 < G ≤ 1.00. From a theoretical perspective, it is intuitively satisfying to observe a trade-off between the achievable resolution and other characteristic properties of the PSF. To establish the link with the general theme of our discussion—that any trade-off aimed at improving the resolution limit is based on the conservation of the overall signal information—we need to consider the phenomenon of superresolving filters in a broader context.
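The piecewise fit of Eq. (11) is easy to evaluate. The following sketch simply transcribes the (k, a) pairs quoted above (it makes no claim beyond the tabulated intervals) and illustrates how steeply the Strehl ratio falls as the central spot is narrowed.

```python
def strehl_bound(G):
    # Upper bound S = k * G**a with the (k, a) pairs quoted from
    # Sales and Morris (1997b) for circularly symmetric systems
    if 0.0 <= G <= 0.2:
        k, a = 3.41, 4.00
    elif G <= 0.46:
        k, a = 5.85, 4.33
    elif G <= 1.0:
        k, a = 1.5, 1.00
    else:
        raise ValueError("G outside the tabulated range")
    return k * G ** a

for G in (0.9, 0.4, 0.1):
    print(f"G = {G}: S <= {strehl_bound(G):.2e}")
```

Within each interval the bound follows a power law, so the attainable Strehl ratio collapses by orders of magnitude as the spot-size ratio G shrinks.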
5.2. Superoscillations In Section 2.2 we argued that the Abbe resolution limit presents a more rigorous estimate of the diffraction limit, if we assume a strictly periodic function with discrete diffraction orders. The diffraction limit is reached if we cut off all harmonics but the first one. Then the period of the first harmonic defines the smallest feature that can be observed. Although this correctly states the limit for observing any feature of the object function at all, it should not lead to the conclusion that, if several harmonics are passed through the system, the smallest resolvable feature size is identical to the period of the highest transmitted harmonic. In fact, it is possible to construct band-limited signals that oscillate arbitrarily fast within a window of arbitrary width. These so-called
superoscillations were first described by Aharonov et al. (1990) and Berry (1994a,b). In addition to quantum mechanics, superoscillations have been studied in the context of optics and signal processing. The concept of superoscillations has found application in a variety of different areas, including tunneling and superluminality (Aharonov et al., 2002), self-imaging (Berry and Popescu, 2006), and sampling theory (Kempf, 2000; Ferreira and Kempf, 2006). It was shown that for 2D signals the occurrence of superoscillations is linked to phase singularities (Berry and Dennis, 2009) and can be observed in speckle patterns (Dennis et al., 2008). For the scope of our discussion it is of interest that superoscillations are also considered as a phenomenon to construct superresolving foci for optical tweezers (Thomson et al., 2008) and to achieve superresolution imaging (Zheludev, 2008), as an alternative to methods exploiting evanescent wave phenomena. In this context, it becomes obvious that the optical superresolution filters described by Toraldo di Francia almost six decades ago merely define a subset of the more general class of superoscillation phenomena. In turn, the accumulated knowledge of designing superresolving apertures should easily be transferable to other applications of superoscillations. This, however, remains a task yet to be undertaken. Two general properties of superoscillations are of particular interest for our discussion. First, the occurrence of superoscillations can be interpreted as the local frequency of the signal,

$\nu(x) = \frac{1}{2\pi}\,\frac{d}{dx}\log u(x),$ (12)

exceeding the global band limit of the signal u(x). We note that the relationship of superoscillations with the local frequency provides an important vehicle to interpret the phenomenon, and it would be of considerable interest to extend this interpretation to superresolution filters. Second, for a fixed amplitude of the superoscillating window, it was shown (Ferreira and Kempf, 2006) that in general the signal energy grows exponentially with the number of superoscillations and polynomially with the inverse of the signal bandwidth. With the number of superoscillations increasing, the bulk of the signal energy then resides outside the window where superoscillations occur. This mirrors the properties of superresolving filters and formulates the fundamental trade-off of superoscillation phenomena. Ferreira and Kempf (2006) also point out that this exponential growth of the dynamic range of the signal is demanded by Shannon’s information theory, which can be rephrased as follows: With a space-bandwidth product S of the optical signal, we can transmit S log2(1 + SNR) bits of information. If we want to increase the number of signal features
beyond the space-bandwidth product, we need to sacrifice the SNR—that is, the amount of information carried by each feature. This, in turn, identifies superoscillations as a Lukosz-type scheme of superresolution. By recognizing that each degree of freedom carries more than one bit of information, depending on the SNR, we can trade spatial resolution for SNR. It is of some importance that suitable grey-level encoding schemes have been studied to exploit this trade-off deliberately (Zalevsky et al., 2006). However, we are unaware of any study that has explicitly interpreted superoscillations in terms of Lukosz encoding and linked them to grey-level codes.
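The first property can be made concrete numerically. The construction below is our own illustration (the prototype signal (cos(x/N) + i a sin(x/N))^N is a standard superoscillation example rather than one taken from the references above); it is a trigonometric polynomial, hence strictly band-limited, yet its local frequency in the sense of Eq. (12) exceeds the band limit near the origin by the factor a.

```python
import numpy as np

N, a = 20, 4.0
x = np.linspace(-0.05, 0.05, 2001)
u = (np.cos(x / N) + 1j * a * np.sin(x / N)) ** N   # band-limited prototype

# local frequency: (1/2pi) * d(arg u)/dx, the imaginary part of Eq. (12)
phase = np.unwrap(np.angle(u))
nu_local = np.gradient(phase, x) / (2.0 * np.pi)

# the highest Fourier component of u has angular frequency N * (1/N) = 1,
# so the global band limit is 1/(2pi)
band_limit = 1.0 / (2.0 * np.pi)
print(nu_local[1000] / band_limit)   # close to a = 4: a superoscillation
```

Near x = 0 the signal locally behaves like exp(iax) while its amplitude stays of order one; the price is that the amplitude grows rapidly outside the superoscillating window, in line with the energy argument above.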
6. NUMERICAL SUPERRESOLUTION ALGORITHMS The trade-off between signal resolution and SNR that defines superoscillations can also be exploited with numerical algorithms for imaging applications. In fact, this type of numerical superresolution defines a rather separate research community that has developed a large variety of different superresolution methods. Instead of comparing different algorithms, we review some of the fundamental principles on which these algorithms function. The generic problem we want to address may be formulated as follows: Assuming a signal u(x) of compact support—nonzero only within a window |x| ≤ x_L/2—we aim to recover the signal from only partial knowledge of its spectrum. This means the Fourier transform ũ(ν) of the signal is measured within a window |ν| ≤ ν_M/2. In practice, we sample the spectrum at a finite number N of discrete locations. This assumption of Fourier data is perfectly compatible with any form of coherent image formation, because we can always propagate the spatial frequency spectrum, truncated by the system aperture, to the actual plane of measurements and vice versa. Our assumption of a signal of compact support may be compared with our discussion of the Abbe criterion. We interpret the truncated signal u(x) in Figure 5a as an infinitely extended periodic signal u_p(x), which is multiplied with a window equal in size to a multiple of the base period—that is,

$u(x) = u_p(x)\,\mathrm{rect}\!\left(\frac{x}{x_L}\right).$ (13)

We can now expand the periodic signal into a Fourier series,

$u_p(x) = \sum_{n=-\infty}^{\infty} u_n \exp(i 2\pi n x / d_g),$ (14)
FIGURE 5 Abbe’s theory of resolution for truncated grating structures. (a) Truncated periodic signal; (b) spectrum of the signal in (a).
and in accordance with Abbe’s analysis the discrete diffraction orders are either detected (|n/d_g| < ν_M/2) or completely lost for signal recovery. The number of diffraction orders retained is thus N = d_g ν_M + 1. The finite window replaces every discrete diffraction order with a sinc function (Figure 5b), and the signal spectrum can be written as

$\tilde{u}(\nu) = x_L \sum_{n=-\infty}^{\infty} u_n\,\mathrm{sinc}\!\left[\left(\nu - n/d_g\right) x_L\right].$ (15)
We still identify N sinc functions that peak within the measurement window and contribute most of the energy for signal recovery. However, sinc functions with their center maximum located outside the measurement window nevertheless contribute to the measured signal. Here we expect to find a trade-off between the increase in significant Fourier components and the SNR, since we need to recover the central lobe of a sinc function from its sidelobes located inside the window of measurement. Pask (1976) used this analysis to demonstrate the feasibility of superresolution imaging. This analysis can be formulated more rigorously by recognizing the Fourier transformation of a compact signal evaluated over a finite window in the frequency domain as an eigenvalue problem. The eigenfunctions of the truncated Fourier integral operator are prolate spheroidal wave functions (Slepian and Pollak, 1961; Landau and Pollak, 1961; Frieden, 1971). Toraldo di Francia (1969) showed that the eigenvalues of the truncated Fourier operator are of similar magnitude up to order N, where N is the space-bandwidth product. Beyond this threshold the eigenvalues rapidly decrease, and the corresponding signal components fall below the background noise regardless of the SNR, thereby providing a rigorous mathematical justification for the concept of the space-bandwidth product. Analogous to Pask’s analysis in terms of sinc functions centered inside or outside the data window, the spheroidal functions concentrate signal
power inside the measurement window up to order N, while higher-order spheroidal functions contain most power outside the measurement window. This inevitably creates instability if we want to attempt bandwidth extrapolation even for high-quality data. This instability is inherent in all numerical algorithms for bandwidth extrapolation from limited Fourier data and is addressed by inserting prior information into the reconstruction algorithm.
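The eigenvalue plunge is easy to reproduce numerically. The sketch below is a discrete analogue of our own (grid size N and bandwidth W are arbitrary choices): it diagonalizes the sinc kernel obtained by time-limiting a sequence to N samples and band-limiting it to |f| < W, and counts the eigenvalues of order unity.

```python
import numpy as np

# Time-limit to N samples, band-limit to |f| < W: the resulting operator is
# the N x N sinc kernel A[j, k] = sin(2*pi*W*(j - k)) / (pi*(j - k)).
N, W = 64, 0.25
d = np.subtract.outer(np.arange(N), np.arange(N)).astype(float)
A = np.where(d == 0.0, 2.0 * W,
             np.sin(2.0 * np.pi * W * d) / (np.pi * (d + (d == 0.0))))

lam = np.sort(np.linalg.eigvalsh(A))[::-1]
print(lam[0], lam[-1])           # close to 1 and close to 0, respectively
print(int(np.sum(lam > 0.5)))    # clusters around 2*N*W = 32
```

The number of eigenvalues near unity tracks the space-bandwidth product 2NW; beyond it the eigenvalues fall off extremely fast, which is the source of the instability discussed above.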
6.1. The Gerchberg–Papoulis Algorithm If we assume our data are collected within a finite window in the Fourier domain of the object u(x) with support x_L—that is,

$\tilde{u}_M(\nu) = \int_{-x_L/2}^{x_L/2} u(x) \exp(-i 2\pi \nu x)\, dx,$ (16)

with −ν_M/2 < ν < ν_M/2—we may be tempted to recover the object by means of an inverse Fourier transform, estimating the object as

$\hat{u}_0(x) = \int_{-\nu_M/2}^{\nu_M/2} \tilde{u}_M(\nu) \exp(i 2\pi x \nu)\, d\nu.$ (17)
This, of course, results in a low-pass filtered version of the object function, with no extrapolated signal outside the Fourier aperture. The estimate is inconsistent with our model, namely, the finite extent of the object. We can try to incorporate this knowledge into an updated estimate of the frequency spectrum if we impose the support information on the signal and compute the spectrum once again as

$\tilde{u}_1(\nu) = \int_{-x_L/2}^{x_L/2} \hat{u}_0(x) \exp(-i 2\pi \nu x)\, dx.$ (18)
This can be interpreted as the measured spectrum ũ_M(ν) convolved with the sinc function that corresponds to the signal support. The spectrum ũ_1(ν) contains an extrapolated bandwidth beyond the window of measured data. However, this new spectrum will be inconsistent with the data within the data window. We can impose the knowledge we have by constructing a new data-consistent estimate,

$\tilde{u}_1'(\nu) = \left[1 - \mathrm{rect}\!\left(\frac{\nu}{\nu_M}\right)\right] \tilde{u}_1(\nu) + \mathrm{rect}\!\left(\frac{\nu}{\nu_M}\right) \tilde{u}_M(\nu);$ (19)

that is, the original data are imposed wherever available. This data-consistent estimate of the signal spectrum is used to compute a better estimate of the signal, and the process is repeated until convergence to a
data-consistent solution that conforms to the model we impose on the signal. This algorithm was independently described by Gerchberg (1974) and Papoulis (1975) for superresolution imaging and bandwidth extrapolation and is known as the Gerchberg–Papoulis (GP) algorithm. For the noiseless case the algorithm generally converges to the correct spectrum, which can be verified from the perspective of projections onto convex sets. The algorithm is unstable, however, and any broadband noise in the measured data results in oscillatory image artifacts. The GP algorithm is remarkable in its simplicity and illustrates rather intuitively how true bandwidth extrapolation can be accomplished. To gain further insight into superresolution imaging, we now turn to an analytic solution of the same estimation problem.
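The iteration is summarized in the sketch below, a noiseless 1D toy example of our own (grid size, support window, and passband are arbitrary choices): the estimate alternates between the data-consistency step of Eq. (19) and the support constraint.

```python
import numpy as np

# Gerchberg-Papoulis iteration: known low-pass Fourier data plus a known
# support constraint, on a noiseless 1D toy problem.
n = 256
x_true = np.zeros(n)
x_true[118:138] = 1.0                         # object inside the support
support = np.zeros(n, dtype=bool)
support[110:146] = True
passband = np.abs(np.fft.fftfreq(n)) < 0.08   # measured Fourier window
data = np.fft.fft(x_true) * passband

est = np.fft.ifft(data).real                  # Eq. (17): low-pass image
err0 = np.linalg.norm(est - x_true)
for _ in range(500):
    spec = np.fft.fft(est)                    # Eq. (18): spectrum of estimate
    spec[passband] = data[passband]           # Eq. (19): restore measured data
    est = np.fft.ifft(spec).real * support    # impose the support constraint

print(np.linalg.norm(est - x_true) < err0)    # True: the estimate improves
```

Both constraint sets are convex and contain the true object, so the distance to the true object cannot increase from one iteration to the next; with noisy data, however, the data-consistency set no longer contains the object and the familiar instability appears.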
6.2. The Prior Discrete Fourier Transform Algorithm The GP algorithm converges to a solution where the spectrum outside the data window is obtained as the convolution of the Fourier transform of the signal window with the data window. However, the data are replaced by a set of coefficients that yields the correct data values as a result of the convolution step. It is not necessary to use an iterative algorithm to obtain these new extrapolation coefficients. Instead, it is possible to solve the estimation problem analytically, which is the core of the so-called PDFT algorithm. The acronym PDFT stands for “prior discrete Fourier transform”—expressing the fact that Fourier data are processed by a modified inverse transformation that incorporates the prior available knowledge. The context of image reconstruction from limited Fourier data also defines the theoretical background originally chosen to explore the PDFT algorithm in detail (Byrne, 2005; Byrne and Fiddy, 1988; Byrne and Fitzgerald, 1982; Byrne et al., 1983). Here, we derive the PDFT in a more general form, closely following the procedure outlined in a more recent tutorial on the subject (Shieh et al., 2006a). We assume that each of the N measurements f_n taken at the output of the optical system can be interpreted as a projection of the input signal:

$f_n = \int u_{\mathrm{obj}}(x)\, h_n^{*}(x)\, dx.$ (20)

The projector h_n(x) is the point response of the system sampled at the position x_n in the output plane. For computer tomography and magnetic resonance imaging the measurements are samples of the Fourier transform of the signal; that is, h_n(x) = exp(i2πxν_n).
To apply the PDFT algorithm, it is critical to assume that we have some prior knowledge about the class of objects we wish to image. This prior knowledge can be expressed by a weighting function q(x), and we consider the substitutions g(x) = u_obj(x)/q*(x) and b_n(x) = q(x)h_n(x), yielding

$f_n = \int g(x)\, b_n^{*}(x)\, dx.$ (21)

A suitable reconstruction model is required to estimate the object function. It is reasonable to assume that the image is well represented by a linear combination,

$\hat{g}(x) = \sum_{n=1}^{N} a_n\, b_n(x),$ (22)

or, reversing the substitutions,

$\hat{u}_{\mathrm{obj}}(x) = u_{\mathrm{imag}}(x) = |q(x)|^2 \sum_{n=1}^{N} a_n\, h_n(x).$ (23)

We note that Eq. (23) would provide the exact input signal for a_n = f_n, if the N basis functions formed a complete orthonormal set. Otherwise, we need to treat the a_n as free parameters of a model, which we need to determine from additional constraints imposed on the imaging problem. As a reasonable condition we want the solution of the estimation problem to be data consistent in the absence of noise—that is, if we substitute the image estimate u_imag(x) into Eq. (20), we want to recover the sampled values f_n of the system output. From all possible solutions we seek the set of coefficients a_n for which the estimate of the input signal is as similar as possible to the original input signal in the sense that

$w = \lVert g(x) - \hat{g}(x) \rVert = \min.$ (24)

The coefficients a_n are then determined from

$\frac{\partial w}{\partial a_n} = \frac{\partial}{\partial a_n} \int \left| g(x) - \hat{g}(x) \right|^2 dx = 0.$ (25)

Substituting Eq. (22) we obtain

$\int g(x)\, b_n^{*}(x)\, dx - \sum_{m=1}^{N} a_m \int b_m(x)\, b_n^{*}(x)\, dx = 0.$ (26)

This represents a system of linear equations,

$f_n = \sum_{m=1}^{N} a_m\, P_{m,n},$ (27)
where P_{m,n} is the so-called P-matrix of the PDFT algorithm,

$P_{m,n} = \int |q(x)|^2\, h_m(x)\, h_n^{*}(x)\, dx.$ (28)
Solving the system of equations in Eq. (27) yields the coefficients a_m and thus the desired estimate of the input signal; together with Eqs. (23) and (28) this represents a closed-form solution to the estimation problem. For practical applications, it is not always possible to establish a P-matrix that can be inverted easily, especially if the prior is determined by measurement or is not accessible analytically. In addition, the dataset may be noisy or otherwise conditioned to prevent us from finding a stable solution of the system of equations. While both noise and a reduced rank of the P-matrix effectively reduce the information we have for reconstructing the signal, we can still apply the PDFT after regularizing the P-matrix. This is best accomplished by multiplying the diagonal with a constant (1 + ε), with ε ≪ 1 (Miller–Tikhonov regularization). The regularization parameter is established empirically for a certain class of imaging problems and balances the image performance—namely, the resolution enhancement provided by the PDFT—against image artifacts that arise from overconstraining the reconstruction problem. The PDFT algorithm was successfully applied to limited Fourier transform data. In this context, the PDFT algorithm is applicable to diffraction tomography (Testorf and Fiddy, 2001b) and synthetic aperture radar applications (Testorf, 2004; Testorf and Fiddy, 2001a,c). In particular, for small datasets the above formulation of the PDFT results in a computationally attractive algorithm. For larger datasets the computation of the P-matrix is rather time-consuming and requires more computing resources than the solution of the linear system of equations. The closed form of the PDFT may still be attractive if the P-matrix can be precomputed and stored for repeated use. However, for massive datasets solving the system of equations becomes impractical, and an iterative discrete version of the PDFT algorithm is available (Shieh et al., 2006b).
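Equations (20)–(28) condense into a few lines of code. The sketch below is a toy 1D example of our own design (object, sampling grid, and regularization value are arbitrary; the prior q(x) is simply the object support window) and compares the PDFT estimate with the plain inverse Fourier series.

```python
import numpy as np

L = 1.0                                    # prior: object support |x| <= L/2
nu = np.arange(-16, 17) / (2 * L)          # Fourier samples, twice the Nyquist rate
x = np.linspace(-2.0, 2.0, 2001)
dx = x[1] - x[0]

# toy object inside the support window
u = np.exp(-((x - 0.15) / 0.05) ** 2) + 0.5 * np.exp(-((x + 0.2) / 0.05) ** 2)
u[np.abs(x) > L / 2] = 0.0

# measurements f_n = integral of u(x) h_n*(x), with h_n(x) = exp(i 2 pi nu_n x)
f = (np.exp(-2j * np.pi * np.outer(nu, x)) * u).sum(axis=1) * dx

# P-matrix, Eq. (28), analytic for the rect prior; regularize the diagonal
P = L * np.sinc(L * (nu[:, None] - nu[None, :]))
P[np.diag_indices_from(P)] *= 1.0 + 1e-3
a = np.linalg.solve(P, f)                  # Eq. (27)

# PDFT image, Eq. (23), versus the plain inverse Fourier series
q2 = (np.abs(x) <= L / 2).astype(float)
E = np.exp(2j * np.pi * np.outer(x, nu))
u_pdft = q2 * np.real(E @ a)
u_dft = np.real(E @ f) / (2 * L)           # Riemann sum of the inverse transform

print(np.linalg.norm(u_pdft - u) < np.linalg.norm(u_dft - u))  # True
```

The plain Fourier series periodizes the object (period 1/Δν = 2L) and rings heavily, whereas the PDFT estimate is confined to the prior support and fits the data in the least-squares sense, mirroring the comparison of Figures 6c and 6d.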
The PDFT algorithm is demonstrated with synthetic Fourier data in Figure 6. The object consists of two cylinders of different magnitude (Figure 6a). As a data model we choose diffraction tomography, assuming the first Born approximation to be valid. This means we compute samples on a nonuniform grid equivalent to the data grid obtained in a bistatic data acquisition scheme. The sample values are computed from the analytic Fourier transformation of the object function. For this numerical experiment we limit the range of incident angles to 80°. Comparing the image estimate computed as an inverse Fourier series (Figure 6c) with the PDFT estimate (Figure 6d) illustrates the bandwidth extrapolation. The Fourier estimate is severely degraded by the
FIGURE 6 The PDFT algorithm applied to synthetic Fourier data. (a) Object function; (b) distribution of Fourier samples; image estimate calculated as (c) an inverse Fourier series, and (d) with PDFT algorithm.
limited coverage of the Fourier domain by the sampled data. The PDFT estimate recovers the shape of the object, even with a good approximation of the contrast. Clearly, the high resolution of the PDFT estimate is primarily due to the high-quality data, which do not contain any signal noise except for the limited precision of the floating point representation of the data (regularization parameter ε = 10⁻⁷). In addition, the prior (white line in Figure 6d) was chosen only slightly larger than the actual object. For practical applications typically neither condition is fulfilled. Figure 7 shows an example obtained with measured data. The data consist of microwave measurements (single time-frequency of 10 GHz) in a bistatic configuration. The data, recorded at the U.S. Air Force Microwave Laboratory in Ipswich, Massachusetts, were published as part of a contest aimed at comparing inverse-scattering algorithms by evaluating their performance when applied to experimental data (McGahan and Kleinman, 1996, 1997). The object used in Figure 7 is the IPS007 target, which consists of two empty cardboard cylinders. It can be regarded as a weakly scattering target (i.e., the model of Fourier data we use for reconstruction is applicable). The comparison of the Fourier estimate (Figure 7b), the filtered backpropagation algorithm (Figure 7c), which accounts for the nonhomogeneous distribution of Fourier samples, and the PDFT estimate (Figure 7d) indicates that the
FIGURE 7 The PDFT algorithm applied to measured data. (a) Object geometry; (b) inverse Fourier series; (c) filtered back-propagation algorithm; (d) PDFT algorithm (ε = 0.001).
PDFT results in fewer image artifacts. This observation can be generalized: the PDFT typically removes the ringing artifacts of the system PSF associated with the synthetic aperture of the system. A significant improvement in resolution is not observed. The object features are equally well resolved in all three estimates. While we can qualitatively attribute this to the noisy data, this does not allow us to predict the superresolution capabilities for any given imaging problem. It is worthwhile to address this point in more detail, since the properties of the PDFT algorithm can be generalized to a wider class of superresolution algorithms.
6.3. Further Interpretation of the PDFT Algorithm The PDFT algorithm is particularly well suited to study how numerical algorithms accomplish superresolution by bandwidth extrapolation. To this end, we investigate more closely the properties of the PDFT algorithm. In this context, as mentioned previously, the PDFT algorithm can be interpreted as a closed-form equivalent of the GP algorithm (Byrne et al., 1983). The PDFT is of central importance because its basic concept can be used to relate and extend other numerical superresolution
methods. This includes maximum entropy methods (Burg, 1967, 1975) and maximum likelihood methods (Capon, 1969) that are closely related to a nonlinear variant of the PDFT, the so-called inverse PDFT (IPDFT) (Byrne et al., 1983). At the same time the PDFT algorithm allows us to study more fundamental aspects of numerical imaging techniques, unobscured by the complexity of some more sophisticated methods. Here we concentrate on connecting the PDFT algorithm with other superresolution schemes.
6.3.1. The PDFT and Optical Superresolution We gain some additional understanding by interpreting the data-acquisition process, for the case of Fourier data, as a truncated sampling process. For a finite object of width x_L the Whittaker–Shannon sampling theorem demands a sampling rate of at least ν_s = 1/x_L for the frequency spectrum. The signal interpolation based on a truncated set of samples provides only the low spatial frequency estimate of the signal. For a spectrum sampled at the Nyquist rate determined by the object support, we can easily verify [with the help of Eq. (28)] that the P-matrix is diagonal. This means that at this sampling rate the PDFT will not improve the signal estimate beyond the classical limit, and the model of the signal is already represented optimally in a least-squares sense by the available truncated sampling expansion. It is worth contemplating why Fourier data sampled at the Nyquist rate preclude any hope of bandwidth extrapolation. The sampling expansion is particularly constructed to obtain orthogonal interpolation functions. In other words, the zeros of the interpolating sinc function perfectly coincide with the locations of the sampling points. Thus, the data are assumed to be independent and do not contain any information related to other sampling points. Points on the sampling grid of the spectrum outside the window of measured data do not contribute to the points inside this window. In turn, the latter cannot be used to estimate points on the sampling grid outside the data window. It is well known that spectral data sampled at the Nyquist rate do not contain sufficient information for bandwidth extrapolation, but that a higher sampling rate is required (Papoulis, 1985). This is precisely the context to which the PDFT algorithm is applicable.
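The diagonal structure can be checked directly from Eq. (28). For a rect support prior of width x_L the matrix entries reduce to sinc functions of the sample spacing (a small sketch of our own; the number of samples is arbitrary):

```python
import numpy as np

# P-matrix of Eq. (28) for q(x) = rect(x / xL) and Fourier samples spaced dnu:
# P_mn = integral of exp(i 2 pi (nu_m - nu_n) x) over |x| < xL/2
#      = xL * sinc(xL * (nu_m - nu_n))
xL = 1.0

def p_matrix(dnu, n=17):
    nu = (np.arange(n) - n // 2) * dnu
    return xL * np.sinc(xL * (nu[:, None] - nu[None, :]))

P_nyq = p_matrix(1.0 / xL)      # Nyquist-rate sampling: spacing 1/xL
P_over = p_matrix(0.5 / xL)     # twice the Nyquist rate

print(np.allclose(P_nyq, xL * np.eye(17)))    # True: the samples decouple
print(np.allclose(P_over, xL * np.eye(17)))   # False: samples interdependent
```

At the Nyquist spacing 1/x_L every off-diagonal entry hits a zero of the sinc function, so the samples are orthogonal and the PDFT collapses to the ordinary truncated sampling expansion; any oversampling couples the samples and opens the door to extrapolation.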
For oversampled data, the samples are no longer independent and the coefficients of the PDFT reconstruction must be selected to ensure data consistency as a result of convolution with the interpolating function. This interdependency has two consequences. First, it provides the freedom to balance the PDFT coefficients to obtain improved signal resolution (and bandwidth extrapolation). Second, the image reconstruction from interdependent samples results in high susceptibility to noise in the measured data. In particular, the improved image resolution is the result of a
delicate interference between different interpolating functions, all of which carry the main portion of their energy outside the data window. Thus, even small errors inside the window are amplified in the extrapolated region. Landau (1986) showed that this confines the accurately extrapolated frequency band to a rather small domain outside the known data window. While we may argue that further increasing the rate of sampling might balance the lack of accuracy, this is not the case; the only way to extend the extrapolated region is to improve the data quality. It was suggested (Berry, 1994b) and later verified (Ferreira et al., 2007) that this instability can be interpreted as being linked to the formation of superoscillations. In other words, numerical superresolution algorithms construct superoscillating signals, which optimally resemble the object function. We note that this relationship between superresolution algorithms and optical superresolution phenomena was exploited from a practical point of view by constructing superresolving filters with the PDFT algorithm (Testorf and Fiddy, 2007). The strong dependence of the achievable bandwidth extrapolation on the SNR of the data also points to a trade-off in the Lukosz sense. The space-bandwidth product of the measured signal can be calculated straightforwardly as S = x_L ν_M, and we attempt to extract information about more than S independent signal features by trading for SNR. This is the same relationship we found for superresolving filters and superoscillations. We can then reverse the argument in Section 5.2 and estimate the maximum gain in bandwidth (and resolution) we expect for a given SNR, by turning once again to Shannon’s result (i.e., each data sample can be reassigned to log2(1 + SNR) image features with SNR = 1). The exponential growth of the required SNR as a function of bandwidth can be identified as the main constraint for superresolution imaging.
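The bookkeeping behind this estimate is elementary but worth making explicit; the sketch below simply evaluates Shannon's expression for a few hypothetical resolution gains.

```python
import math

# Each measured sample carries log2(1 + SNR) bits of information; spreading
# them over G features of SNR = 1 (one bit each) requires SNR = 2**G - 1.
for gain in (2, 4, 8):
    required_snr = 2 ** gain - 1
    print(f"resolution gain {gain}: required SNR per sample = {required_snr}")
```

A twofold gain costs only an SNR of 3, but an eightfold gain already demands an SNR of 255 per sample, which makes the exponential character of the constraint plain.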
6.3.2. The Role of Prior Knowledge The interpretation of the PDFT algorithm as a ‘‘Lukosz trade-off’’ leaves a pessimistic outlook on the prospect of calculating superresolved images. However, Matson and Tyler (2004, 2006) observed that many numerical superresolution algorithms report significant improvements over the classical diffraction limit. They account for this by distinguishing between what they call primary and secondary superresolution. We interpret primary superresolution as the gain in resolution due to exploiting the Lukosz trade-off between SNR and image resolution. This gain in resolution is essentially independent of the object signal and the number of samples and adheres to the properties of the PDFT algorithm discussed so far. Secondary superresolution is defined as any additional gain in image resolution not related to primary superresolution. This gain may be related to other Lukosz trade-offs, particularly in the context of image
reconstruction from multiple encoded image frames. For instance, imaging through atmospheric turbulence and image reconstruction from multiple aliased images may be interpreted as extracting the desired information simply from multiple separate measurements. However, in each frame the different frequency bands overlap, and the information gain is transmitted similar to the degrees of freedom accessible through primary superresolution. It is not surprising, therefore, that similar algorithms are used to recover the image, and that in these cases the algorithm displays superior image resolution simply based on the better SNR of the high-frequency components. However, a second source of secondary superresolution is due to the use of prior information, and the PDFT algorithm can be used to understand this type of secondary superresolution more intuitively as an interrelationship between the estimator and the object function. In particular, we can characterize the estimator in terms of the prior knowledge we inject into the reconstruction process. Since we strive to estimate a continuous signal from a finite set of data, we try to solve an ill-posed problem and always must make assumptions to select the true object function from the infinite set of possible signal reconstructions. We interpret secondary superresolution as the result of incorporating specific object characteristics into the choice of prior information. It is immediately obvious that the PDFT algorithm introduces additional information through the choice of prior and the P-matrix. These are composed of the weighting function q(x) and the regularization parameter ε, the latter in effect improving the estimation process by accounting for signal noise. The trade-off between the amount of data and the use of prior information can be further emphasized if we consider the trivial case |q(x)|² = u_obj(x) (i.e., the prior is chosen to be the true object function).
The reconstruction is then accomplished perfectly from a single sample in the Fourier plane. It is clear that the choice of prior may incorporate object properties to any degree, and a weighting function q(x) that already contains small features facilitates, to some degree, the extrapolation of the spectrum in the Fourier domain. Conversely, we may accomplish improved regularization by using a smoothly modulated weighting function. Such estimators have indeed proved superior for resolving localized object structures (Shieh et al., 2001) and may be interpreted in terms of an apodization effect. How does the support constraint of the PDFT algorithm favor certain object features? We may approach this problem by considering a flat-top object signal and a data window in the Fourier domain that corresponds to a single sideband. This situation is depicted in Figure 8, with a pass band of bandwidth Δν_w centered at ν_w. This data model is rather common in inverse synthetic aperture radar applications if the radar beam is obliquely incident on a planar surface.
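The role of the prior can be made concrete with a small numerical sketch of the PDFT estimator. The grid, object, and frequency samples below are our own illustrative choices, and we assume the standard PDFT formulation: the coefficients a solve (P + εI)a = d, where P_jk is the Fourier transform of the weighting function evaluated at ν_j − ν_k, and the estimate is the weight times the resulting finite Fourier series. The trivial case, with the prior equal to the object itself, then recovers the object essentially perfectly from a single Fourier sample:

```python
import numpy as np

def pdft(d, nu, weight, x, eps):
    """Regularized PDFT estimate from Fourier samples d at frequencies nu.

    Assumed standard formulation: solve (P + eps*I) a = d, where P_jk is the
    Fourier transform of the weight evaluated at nu_j - nu_k; the estimate is
    weight(x) * sum_k a_k exp(i 2 pi nu_k x).
    """
    dx = x[1] - x[0]
    P = np.array([[np.sum(weight * np.exp(-2j * np.pi * (vj - vk) * x)) * dx
                   for vk in nu] for vj in nu])
    a = np.linalg.solve(P + eps * np.eye(len(nu)), d)
    series = sum(ak * np.exp(2j * np.pi * vk * x) for ak, vk in zip(a, nu))
    return weight * np.real(series)

# Hypothetical object: two narrow Gaussian peaks
x = np.linspace(-1.0, 1.0, 2001)
dx = x[1] - x[0]
f = np.exp(-((x - 0.05) / 0.01) ** 2) + np.exp(-((x + 0.05) / 0.01) ** 2)

# Trivial prior |q(x)|^2 = u_obj(x): ONE Fourier sample (nu = 0) suffices
nu = np.array([0.0])
d = np.array([np.sum(f * np.exp(-2j * np.pi * v * x)) * dx for v in nu])
f_hat = pdft(d, nu, f, x, eps=1e-12)
print(np.max(np.abs(f_hat - f)))   # essentially zero: perfect recovery
```

With a less informative prior (e.g., a support indicator), the same routine returns the minimum-weighted-norm, data-consistent estimate discussed in the text.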
Superresolution Imaging—Revisited
FIGURE 8 Bandwidth extrapolation from single sideband.
The typical radar image shows the edges of the surface but not the actual surface unless the specular reflection is caught by the receiving antenna. Nevertheless, we may expect a recovery of the surface profile by applying the PDFT algorithm. This would correspond to the extrapolation of the main lobe of the sinc function in Figure 8 from the samples measured inside the data window. It turns out that this cannot be accomplished even for synthetic noiseless Fourier data as soon as the main lobe is located outside the data window. At the same time, we can observe superresolution at the points associated with the edges of the flat-top signal. This can be explained by evaluating the maximum relative difference between the sinc function and a properly normalized cosine within the data window. We obtain an estimate of the relative difference of the Fourier spectra ũ within the data window by considering the 1/ν dependence of the sinc envelope and comparing it with the constant envelope of a two-point object represented by a cosine characteristic. For large center frequencies ν_w and small fractional bandwidth Δν_w/ν_w we find

Δũ/ũ ≈ (Δν_w/ν_w) / [1 − (Δν_w/4ν_w)²] → Δν_w/ν_w.    (29)
In other words, if the central lobe of the sinc function is located outside the data window, the oscillating tail inside the data window is described equally well by a cosine as by a sinc function. The recovery of a function that grows outside the data window is discouraged by the regularization. This means that, regardless of the noise level, the PDFT algorithm shows a preference for small objects or edges over object features with a spectrum that carries most of its energy outside the window of measurement.
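The envelope argument behind Eq. (29) is easy to check numerically. Inside a window of width Δν_w centered at ν_w (the values below are our own illustrative choices), replacing the 1/ν envelope of the sinc tail by the constant envelope 1/ν_w of a two-point (cosine-type) spectrum changes the values by a relative amount of order Δν_w/ν_w:

```python
import numpy as np

nu_w, dnu_w = 10.0, 1.0                    # hypothetical window center and width
nu = np.linspace(nu_w - dnu_w / 2, nu_w + dnu_w / 2, 1001)

sinc_tail = np.sin(np.pi * nu) / (np.pi * nu)     # flat-top (sinc) spectrum in the window
two_point = np.sin(np.pi * nu) / (np.pi * nu_w)   # same oscillation, constant envelope

# Relative mismatch of the envelopes 1/nu versus 1/nu_w inside the window
rel = np.max(np.abs(nu_w / nu - 1.0))
print(rel)   # about 0.05, i.e., of the order dnu_w/nu_w = 0.1
```

Since only the slowly varying envelope distinguishes the two spectra inside the window, noisy data cannot tell them apart, which is why the main lobe cannot be extrapolated.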
The superior performance of the PDFT algorithm with respect to small objects can be demonstrated with experimental data (Testorf and Fiddy, 2001c). Figure 9 shows image reconstructions obtained as part of a subsurface imaging experiment designed to use microwaves to detect buried plastic anti-personnel land mines. The data were collected from scaled experiments. The entire target geometry and the wavelength were reduced in scale by a factor of 1/10, whereas materials with the same permittivity as in the full-scale problem were used to build the models. The measurements were performed with a compact range radar system (Kekis et al., 2000). The experimental setup allows measurement of monostatic backscatter data. The angle α between the antenna and the normal of the ground surface was varied between −60° and 60° in increments of 5°. Only angles between 15° and 50° were used in Figure 9 to suppress the specular reflection. For each angle, 201 radar frequencies were measured between 8.2 GHz and 12.4 GHz. Metal calibration spheres with diameters of 9.5 mm and 15.9 mm, respectively, were used as targets.
FIGURE 9 The PDFT algorithm applied to subsurface imaging (cross-range (m) versus range (m)). (a) Fourier estimate; (b) PDFT algorithm, ε = 0.05.
Figure 9 shows the comparison between the Fourier transform estimate and the PDFT algorithm. While for the IPS007 target the PDFT resulted in a reduction of image artifacts, the algorithm provides true bandwidth extrapolation in the case of objects that can be regarded as point scatterers.
6.4. Multiple Scattering

Inverse scattering techniques aimed at determining the permittivity of a scattering object from a finite set of measurements of the diffracted electromagnetic field are often defeated by strongly scattering objects. The high relative permittivity forces the electromagnetic field to interact several times with the target before being detected. As a consequence, approximate scattering models, such as the first-order Born approximation, are insufficient to estimate the target structure. One accepted strategy to obtain useful images of the object is the use of nonlinear numerical methods, which implement a forward model of electromagnetic scattering and then minimize the difference between measured and calculated data by modifying the permittivity distribution. One example of this class of algorithms is the distorted Born iterative method (Chew and Wang, 1990). The use of high-quality data is often accompanied by superresolution effects, with an observed resolution better than λ/10 (Chen and Chew, 1998a,b). It was suggested that the superior resolution is the effect of evanescent waves being coupled with propagating modes (Simonetti, 2006). The reconstruction algorithm then uses this information to obtain images with superior resolution (Aydiner and Chew, 2003). We acknowledge the possibility of evanescent waves providing the necessary information for superresolution imaging. A better understanding may be achieved by interpreting the inverse scattering problem within the framework of the PDFT algorithm and other numerical superresolution algorithms. The inverse scattering problem is generally ill-posed and ill-conditioned. This is obvious considering that the dataset contains significantly fewer degrees of freedom than a permittivity distribution with subwavelength features. As a consequence, regularization methods are typically required to construct a stable imaging algorithm.
The need for data with high SNR suggests that the information about subwavelength features is merely a small signal component of the scattered field. The contribution of multiple scattering to the scattered field amplitude is similar to high-frequency bands that are down-converted to a low-frequency pass band of the imaging system via prior knowledge of the support constraint. In a recent study based on a Cramér–Rao analysis of multiple scattering, it was shown that multiple scattering does in fact not
translate automatically into resolution enhancement (Sentenac et al., 2007). In particular, resolution enhancement was only observed for small momentum transfer at each of the scattering events. As a consequence, it remains doubtful to what extent superresolution beyond the wavelength limit can be associated with multiple scattering. The fact that the observed superresolution effects are typically associated with point objects and sharp edges points instead to the prior information inserted by the regularization scheme as the source of resolution enhancement. The superresolution of inverse scattering algorithms may thus be better classified as secondary superresolution. A rigorous quantitative analysis of the contributions of these competing effects to the image resolution remains a subject of future studies.
7. GENERALIZED SAMPLING EXPANSIONS

Closely related to numerical superresolution imaging is the development of compressive sampling. While the former can be thought of as improving the resolution without expanding the dataset, the latter may be characterized as maintaining the image quality (and resolution) with a reduced set of data. In both cases, we strive to use the measured data more efficiently. Perhaps not surprisingly, in compressive sampling we encounter the counterparts of the fundamental concepts discussed in the context of superresolution imaging. We emphasize this correspondence as we survey the most important methods currently discussed in the context of computational optical imaging and sensing.
7.1. Squeezing and Dissecting Phase Space

We introduced Lukosz's approach to superresolution as a rearrangement of the phase-space volume of the signal. Phase-space representations share properties with incompressible fluids. If we want to squeeze the phase space in one of its dimensions by applying some optical operation, we must give way and allow for expansion along one of the other phase-space axes. We can apply the same concept to the Whittaker–Shannon sampling theorem. Assuming a band-limited signal of bandwidth Δν, it is possible to represent the continuous function u(x) by the sampling expansion

u(x) = u_d(x) ∗ sinc(Δν x),    (30)

with

u_d(x) = u(x) Σ_{n=−∞}^{+∞} δ(x − n x_s) = Σ_{n=−∞}^{+∞} u(n x_s) δ(x − n x_s),    (31)
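Equations (30) and (31) translate directly into a short numerical experiment; the test signal and parameters below are our own illustrative choices. Truncating the sum to a finite number of samples leaves only a small interpolation error away from the edges of the sampled interval:

```python
import numpy as np

def sinc_interp(samples, xs, x):
    """Truncated Whittaker-Shannon expansion, Eqs. (30)-(31):
    u(x) = sum_n u(n*xs) sinc((x - n*xs)/xs), with sinc(t) = sin(pi t)/(pi t)."""
    n = np.arange(len(samples))
    return samples @ np.sinc((x[None, :] - n[:, None] * xs) / xs)

dnu = 8.0                  # assumed total bandwidth, so xs = 1/dnu
xs = 1.0 / dnu
n = np.arange(256)
u_n = np.sin(2 * np.pi * 1.3 * n * xs) + 0.5 * np.cos(2 * np.pi * 3.1 * n * xs)

x = np.linspace(4.0, 28.0, 500)          # stay away from the truncation edges
u_true = np.sin(2 * np.pi * 1.3 * x) + 0.5 * np.cos(2 * np.pi * 3.1 * x)
u_rec = sinc_interp(u_n, xs, x)
print(np.max(np.abs(u_rec - u_true)))    # small truncation error
```

Both tones lie below Δν/2, so the expansion reproduces the signal between the sample points.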
where the sampling distance is x_s = 1/Δν. The sampled signal u_d(x) is discrete in x, and its spectrum is the signal spectrum periodically replicated in ν. It was shown that this classical sampling procedure can conveniently be interpreted in the phase space of the Wigner distribution function (Stern and Javidi, 2004a,b). The sampling procedure replicates the spectrum along the ν-axis and discretizes the Wigner distribution function along the x-axis. The sampling interpolation is equally straightforward and involves the low-pass filtering of the spectrum. In phase space this corresponds to a multiplication with the Wigner function of the low-pass filter in frequency and a simultaneous convolution in the spatial domain. Interpreting the classical sampling expansion in phase space offers little more than a simultaneous display of the well-understood interpretation of the sampling theorem in the spatial domain and the spectral domain. However, it raises the possibility of generalizing Shannon's sampling theorem by combining it with Lukosz's idea of space-bandwidth adaptation. To understand this idea we first consider the related method for bandwidth compression (Papoulis, 1994). This is analyzed conveniently in phase space with the help of the PSDs (Figure 10). The original (band-limited) signal (Figure 10a) is passed through two consecutive paraxial optical systems represented by affine transformations of phase space
FIGURE 10 Bandwidth compression via linear optical transformations. (a) Phase-space diagram of a band-limited function; (b) PSD after chirping; (c) PSD after fractional Fourier transformation to recover a band-limited function.
(Torre, 2005). For instance, we may combine chirping with a fractional Fourier transformer. Chirping (i.e., the multiplication of the signal with a quadratic phase function) corresponds to a shear of the phase-space volume parallel to the frequency axis (Figure 10b). The signal then occupies a diagonal band of phase space, and the transformed signal is neither space nor frequency limited. We note that the Shannon sampling theorem is not applicable in its standard form. The subsequent fractional Fourier transform corresponds to a rotation of the phase-space volume, and we recover a band-limited function at a rotation angle θ, which realigns the phase-space volume with the x-axis (Figure 10c). As a consequence of applying two different transformations the bandwidth has changed, and we find

Δν′_s = Δν_s cos θ.    (32)
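The first step of this scheme, the shear produced by chirping, can be verified directly (with our own, hypothetical test signal and chirp rate): multiplying by a quadratic phase leaves |u(x)|, and hence the spatial extent, untouched, while the global spectral width grows:

```python
import numpy as np

N = 4096
x = np.linspace(-8.0, 8.0, N, endpoint=False)
dx = x[1] - x[0]
u = np.exp(-x**2)                        # hypothetical space- and band-limited signal

def spectral_width(sig):
    """RMS width of the power spectrum, used as a global bandwidth measure."""
    S = np.abs(np.fft.fft(sig))**2
    nu = np.fft.fftfreq(len(sig), d=dx)
    p = S / S.sum()
    mean = (nu * p).sum()
    return np.sqrt((((nu - mean)**2) * p).sum())

c = 2.0                                  # assumed chirp rate
u_chirped = u * np.exp(1j * np.pi * c * x**2)   # shear parallel to the nu-axis

print(spectral_width(u), spectral_width(u_chirped))
# |u(x)| is unchanged, while the global bandwidth has grown
```

The subsequent fractional Fourier transform would rotate this sheared distribution back onto the x-axis, yielding the reduced bandwidth of Eq. (32).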
Obviously, a reduced bandwidth can be represented by fewer samples per unit length. This, however, does not reduce the total amount of data needed to represent the signal. We can appreciate this if we consider a signal with a finite space-bandwidth product; that is, the signal is essentially limited in its size Δx, as indicated in Figure 10a. After the compression, the size has increased to Δx′, and two sections of length x_a show a linearly changing bandwidth. If, in a heuristic sense, we adapt the local sampling distance to the local bandwidth, we find that the truncated sampling expansion needed to represent the compressed signal is identical to that of the original signal. This is essentially equivalent to the space-bandwidth product adaptation of Lukosz superresolution. Equipped with this result we can now apply the same concept to signals with a finite local bandwidth yet with an unbounded global bandwidth. As an example that can be analyzed rigorously, we consider again a signal that occupies a diagonal strip of phase space similar to Figure 10b. We always find this type of phase-space coverage if a signal of compact support (Figure 11a) is propagating in free space (Fresnel diffraction). In optics the need to sample this type of signal naturally occurs in the context of digital Fresnel holography. In a strict sense, the phase-space equivalent of any object of finite size will be transformed into a non-band-limited, non-space-limited signal, such as the one in Figure 11b, as a result of free-space propagation. The PSD in Figure 11b makes it evident that the appropriate sampling rate is the inverse of the local bandwidth Δν_s. Sampling in the spatial domain results in a replication of the non-band-limited signal in the frequency domain. However, the local spectra do not overlap, and we can find a suitable interpolation formula to recover the continuous signal.
Stern (2006) suggested a construction of the interpolation step, first transforming the signal into the domain of compact support by
FIGURE 11 Generalized sampling of Fresnel holograms: (a) phase-space diagram of a signal compact in space; (b) signal in the domain of sampling; (c) signal in (b) after dechirping.
applying the inverse of the transformation that first gave rise to the non-band-limited signal. Then, the signal is filtered to obtain the continuous function of compact support, followed by yet another transformation to recover the continuous signal in Figure 11a. Sharma and Joshi (2006) expanded on this theme by considering a wider class of signals that are compact in some diffraction plane. In this context, it was also claimed that the transformation of signals that are compact along some direction in phase space can be used to reduce the sampling rate without losing signal information. It is obvious from the phase-space interpretation that this can be accomplished with the bandwidth compression scheme discussed above. However, if we start our discussion with the signal in Figure 11b, it is also obvious that the minimum sampling distance is determined by the local bandwidth, not by the transformation we choose to obtain a band-limited signal. In fact, if our only goal is the discrete representation of signal information, we are always free to interpret the non-band-limited signal in Figure 11b as the result of a chirp modulation of the band-limited signal in Figure 11c (Hennelly, 2009). The interpolation then simply has to account for the chirp.
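This dechirping interpretation can be sketched numerically (our own toy signal and parameters): a chirp-modulated band-limited signal is sampled at the rate set by its local bandwidth alone, far below the global Nyquist rate, the samples are dechirped, the band-limited factor is recovered by ordinary sinc interpolation, and the chirp is restored afterward:

```python
import numpy as np

# Hypothetical chirp-modulated signal u(x) = exp(i pi c x^2) v(x), with v(x)
# band-limited to dnu; globally u is not band-limited.
c, dnu = 4.0, 2.0
xs = 1.0 / dnu                                   # sampling distance from the LOCAL bandwidth
v = lambda t: np.sin(2 * np.pi * 0.7 * t) + 0.3 * np.cos(2 * np.pi * 0.2 * t)
chirp = lambda t: np.exp(1j * np.pi * c * t**2)

n = np.arange(-200, 201)
u_n = chirp(n * xs) * v(n * xs)                  # samples of the chirped signal

x = np.linspace(-20.0, 20.0, 1001)               # away from the truncation edges
v_n = u_n * np.conj(chirp(n * xs))               # dechirp the samples
v_rec = v_n @ np.sinc((x[None, :] - (n * xs)[:, None]) / xs)   # sinc-interpolate
u_rec = chirp(x) * v_rec                         # restore the chirp
print(np.max(np.abs(u_rec - chirp(x) * v(x))))   # small truncation error
```

The interpolation "accounts for the chirp" exactly as in the text: the chirp is known, so it can be removed from the samples and reapplied after band-limited interpolation.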
7.2. Papoulis Generalized Sampling

An alternative generalized sampling expansion was introduced by Papoulis (1977) and forms the basis for flexible data acquisition schemes. It has been cited frequently in the context of multi-aperture systems,
where the task can be defined as sampling a large-bandwidth signal with multiple low-bandwidth systems. Papoulis's generalized sampling (PGS) then provides a framework to acquire samples with these low-bandwidth systems while maintaining the total number of samples required to represent the high-bandwidth signal. Papoulis's original derivation, while mathematically elegant, presents some difficulty if we want to interpret its physical significance. We therefore follow an alternative path to derive PGS, closely following the approach suggested by Brown (1981). We assume a signal u(x), which need not be band-limited. To represent the signal with a set of discrete samples, we send the signal through M space-invariant linear systems with response functions h_m(x), m = 0, . . ., M − 1. The associated transfer functions

H_m(ν) = ∫_{−∞}^{+∞} h_m(x) exp(−i2πνx) dx    (33)
are all defined on the interval |ν| < Δν/2, which defines the total bandwidth of the cluster of subsystems, and we assume H_m(ν) = 0 outside this pass band. The signal is filtered by each of these linear systems,

g_m(x) = ∫_{−∞}^{+∞} u(x′) h_m(x − x′) dx′,    (34)
and sampled at intervals Mx_0 = M/Δν; the sampled signal reads

g_m^(s)(x) = g_m(x) Σ_{k=−∞}^{+∞} δ(x − kMx_0) = Σ_{k=−∞}^{+∞} g_m(kMx_0) δ(x − kMx_0).    (35)
We expect to recover an estimate of the continuous signal,

û(x) = Σ_{m′=0}^{M−1} Σ_{k′=−∞}^{+∞} g_{m′}(k′Mx_0) y_{m′}(x − k′Mx_0)
     = Σ_{m′=0}^{M−1} ∫_{−∞}^{+∞} g_{m′}^(s)(x′) y_{m′}(x − x′) dx′,    (36)
where we need to construct the interpolation functions y_m(x). Since all functions g_m(x) are Δν band-limited, it follows that û(x) is band-limited to the same interval. Since we do not restrict our discussion to band-limited signals, we need to establish a criterion to select the desired solution from the infinite set of signals that can be reconstructed from a finite set of samples. Brown (1981) suggested a minimum (L2) norm criterion. Here we follow Unser
and Zerubia (1998) and demand data consistency; that is, we want to recover

g_m(x) = ∫_{−∞}^{+∞} û(x′) h_m(x − x′) dx′
       = ∫_{−∞}^{+∞} [ Σ_{m′=0}^{M−1} ∫_{−∞}^{+∞} g_{m′}^(s)(x″) y_{m′}(x′ − x″) dx″ ] h_m(x − x′) dx′.    (37)
By Fourier transforming the data consistency condition, we obtain

G_m(ν) = Σ_{m′=0}^{M−1} Σ_{k′=−∞}^{+∞} G_{m′}(ν − k′Δν_M) Y_{m′}(ν) H_m(ν),    (38)
with Δν_M = Δν/M = 1/(x_0 M). The Fourier transforms of the measured signals G_m(ν) and the transfer functions of the subsystems H_m(ν) are Δν band-limited by definition, and thus the Fourier transform of the interpolating function Y_m(ν) needs to be defined on this frequency interval as well. Equation (38) can be further modified,

G_m(ν) = Σ_{m′=0}^{M−1} Σ_{k′=−∞}^{+∞} Û(ν − k′Δν_M) H_{m′}(ν − k′Δν_M) Y_{m′}(ν) H_m(ν)
       = Σ_{m′=0}^{M−1} Σ_{k′=−∞}^{+∞} Û(ν − k′Δν_M) H_m(ν) Y_{m′}(ν) H_{m′}(ν − k′Δν_M),    (39)
and Eq. (39) can now be expressed in terms of the periodic continuations Ū(ν) and H̄_{m′}(ν) of Û(ν) and H_{m′}(ν),

Σ_{k′=−∞}^{+∞} Û(ν − k′Δν_M) H_{m′}(ν − k′Δν_M)
 = Σ_{n=−∞}^{+∞} Σ_{k′=0}^{M−1} Û(ν − nΔν − k′Δν_M) H_{m′}(ν − nΔν − k′Δν_M)
 = Σ_{k′=0}^{M−1} Ū(ν − k′Δν_M) H̄_{m′}(ν − k′Δν_M).    (40)
Observing that Ū(ν)H̄_m(ν) = Û(ν)H_m(ν) = G_m(ν), we can solve Eq. (39) by setting

Σ_{m′=0}^{M−1} Y_{m′}(ν) H̄_{m′}(ν) = 1    (41)
for k′ = 0 and

Σ_{m′=0}^{M−1} Y_{m′}(ν) H̄_{m′}(ν − k′Δν_M) = 0    (42)
for k′ = 1, . . ., M − 1. It is now possible to follow Brown's derivation and obtain an explicit solution for Eqs. (41) and (42) on the interval [−Δν/2, −Δν/2 + Δν_M], prompted by the observation that essentially the same system of linear equations must be solved for each subinterval of length Δν_M. However, the advantage of better computational economy is accompanied by a significant bookkeeping effort, which obscures the physical interpretation rather than helping it. Equations (41) and (42) form a system of linear equations

H Y = I,    (43)
where

H = ( H_{0,0}      H_{0,1}      . . .  H_{0,M−1}
      H_{1,0}      H_{1,1}      . . .  H_{1,M−1}
      . . .
      H_{M−1,0}    H_{M−1,1}    . . .  H_{M−1,M−1} ),    (44)

with H_{k,m} = H̄_m(ν − kΔν_M),

Y = ( Y_0(ν), Y_1(ν), . . ., Y_{M−1}(ν) )^T,    (45)

and

I = ( 1, 0, . . ., 0 )^T.    (46)
If det H ≠ 0, the interpolation functions are obtained as the first column of H⁻¹. The existence of the inverse of H can be interpreted physically. In particular, it demands coverage of the entire bandwidth Δν: every frequency within this band must be admitted by at least one of the subsystems. However, coverage of the entire bandwidth is a necessary, not a sufficient, condition for PGS to work. In addition, wherever transfer functions overlap, they must have a distinct shape (in the sense of orthogonality)
to allow unique recovery of the spectrum of each subsystem. A trivial example can be constructed by assuming two subsystems, each with a pass band of Δν and each sampling the output signal at half the Nyquist rate. We cannot recover the signal if both system functions are identical; we need to introduce diversity between the two measurements (e.g., a subpixel shift) to provide the information for signal recovery.

Brown derives his solution as a minimum (L2) norm solution, while here we used data consistency as the criterion to select the solution. This still corresponds to a minimum norm solution of the reconstruction problem (Unser and Zerubia, 1998). The reconstruction of band-limited functions corresponds to the case of zero minimum norm.

The resemblance between the above solution and the theory of the PDFT algorithm is striking, which is why it is necessary to point out some of the fundamental differences. Formally, the difference can be described as follows: The PGS expansion uses the data as interpolation coefficients and determines the interpolation functions for optimum recovery. In contrast, the PDFT algorithm uses the transfer functions of the subsystems as interpolation functions and computes a new set of interpolation coefficients to construct the optimum solution. While this may appear to be a matter of choice, the latter approach enables true bandwidth extrapolation by taking into account a priori information about the signal. The PGS expansion is thus perfectly adapted to accomplish Lukosz-type superresolution by trading degrees of freedom and is well suited to accomplish digital superresolution and de-aliasing in multiframe reconstruction. The PDFT algorithm is preferable whenever prior information about the class of input signals is available and can be incorporated.
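A discrete toy model (our own construction: M = 2 subsystems, an identity filter and a half-sample delay for diversity) sketches the recovery machinery of Eqs. (38)-(46). Each channel is filtered and sampled at 1/M of the full rate; per frequency bin, a small M × M system linking the aliased replicas is solved, mirroring H Y = I:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 64, 2                                 # full-rate samples, number of subsystems

u = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # full-band test signal
U = np.fft.fft(u)

# Two transfer functions covering the same band with diversity between them:
# identity and a half-sample (subpixel) delay -- our choice; any pair giving
# an invertible system per frequency bin works.
k = np.arange(N)
H = np.stack([np.ones(N, dtype=complex),
              np.exp(-2j * np.pi * 0.5 * k / N)])

# Filter each channel, then sample at 1/M of the full rate
g = [np.fft.ifft(U * H[m])[::M] for m in range(M)]
G = [np.fft.fft(gm) for gm in g]             # aliased spectra of length N/M

# Per frequency bin, solve the M x M system linking the aliased replicas
U_rec = np.zeros(N, dtype=complex)
for kk in range(N // M):
    idx = kk + (N // M) * np.arange(M)       # the M replicas folded onto bin kk
    A = H[:, idx] / M                        # rows: subsystems, columns: replicas
    b = np.array([G[m][kk] for m in range(M)])
    U_rec[idx] = np.linalg.solve(A, b)

u_rec = np.fft.ifft(U_rec)
print(np.max(np.abs(u_rec - u)))             # exact recovery up to roundoff
```

With two identical filters the per-bin matrix becomes singular, which is the discrete counterpart of the uniqueness failure discussed above.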
7.3. Compressive Sampling

We again consider data that can be interpreted as projections of the signal domain, for instance, samples of the Fourier spectrum. If we can assume that the signal consists of a finite set of discrete locations where the signal is non-zero, we can represent the signal accurately with less data than suggested by Shannon's sampling theorem. This problem was the focus of the work championed by Tao, Candes, Romberg, and others (Candes et al., 2006; Romberg, 2008), which has received widespread attention, in particular in the context of computational optical sensing and imaging (Takhar et al., 2006; Brady et al., 2009). Key to this compressive sampling approach is the qualitative observation that the number of degrees of freedom necessary to describe the signal is significantly less than the SBP if the signal consists of small isolated features. By giving up the convenience of representing the signal as a linear superposition of shift-invariant sinc interpolating functions, the signal can be reconstructed with high resolution from a dataset of much
smaller size. This is accomplished with a nonlinear iterative algorithm that minimizes the L1-norm of the signal estimate subject to consistency with the measured data. In this context, we note that, similar to numerical superresolution algorithms, the success of this compression scheme rests on the mapping of the signal information to the degrees of freedom defined by the sampling process. More importantly, however, the L1-norm minimization uses the prior information about the signal shape to converge to an almost perfect reconstruction. In this context, we may well interpret compressive sampling as a nonlinear superresolution algorithm. The strong prior knowledge of isolated object points may then be interpreted as the source of secondary superresolution and explains the high signal quality of the reconstruction from incomplete data.
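A minimal sketch of such a reconstruction (our own toy problem, solved with ISTA, one standard iterative algorithm for the L1-regularized least-squares problem) recovers a K-sparse signal from far fewer random measurements than its length:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 64, 24, 3                      # signal length, measurements, sparsity

# Sparse object: K isolated point-like features (the assumed prior)
x_true = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x_true[support] = [2.0, -1.5, 1.0]

A = rng.standard_normal((M, N)) / np.sqrt(M)   # random measurement matrix
y = A @ x_true                                 # M << N linear measurements

# ISTA for min_x 0.5*||A x - y||^2 + lam*||x||_1
L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
lam = 0.01
x = np.zeros(N)
for _ in range(3000):
    z = x + A.T @ (y - A @ x) / L        # gradient step on the quadratic term
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold

err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(err)   # small: the spikes are located from 24 of 64 "samples"
```

The soft threshold is exactly where the sparsity prior enters: coefficients that are not supported by the data are driven to zero, which is the nonlinear analog of the prior-driven secondary superresolution discussed above.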
8. CONCLUDING REMARKS

Our survey touched on a wide variety of superresolution methods. Our discussion undoubtedly falls short of appreciating the creativity necessary to unearth superresolution phenomena and to construct algorithms for solving practical imaging problems. Instead, our interest was focused on connecting and reinterpreting superresolution methods in terms of the trade-offs first formulated explicitly by Lukosz and extended by others.

The basic concept behind Lukosz superresolution, the conservation of total signal information and the improvement of resolution at the expense of some other system property, seems obvious, if not trivial. However, its extension to superresolution algorithms is certainly not sufficiently appreciated. In particular, whenever numerical superresolution is based on more than one type of information, the Lukosz concept helps us dissect the problem and identify the different sources of information. For instance, image information that is primarily supplied by the available prior information, and in fact is better interpreted as a sophisticated type of lookup table, competes against resolution determined by the data. The former demands more sophisticated knowledge about possible objects in the target scene, while the latter requires a system analysis to optimize the trade-offs described by Lukosz.

In many cases, it may be obvious which Lukosz trade-off must be investigated. In some cases, it may be impossible to separate the different effects in a functioning superresolution system. However, we believe that significant progress in developing working superresolution schemes could be made if it became common practice to state explicitly the physical source of resolution gain. This should allow us to compare superresolution systems more easily and to adapt existing schemes to new imaging applications.
ACKNOWLEDGMENT

One of the authors (Markus Testorf) gratefully acknowledges A. Barnett, Department of Mathematics at Dartmouth College, and S. Sinzinger, University of Ilmenau, Germany, for inviting the author to present preliminary accounts of this work. Measured data were provided by the Electromagnetics Technology Division, Air Force Research Laboratory/SNH, Hanscom Air Force Base, MA 01731-3010 (IPS007 target) and the Submillimeter-Wave Technology Laboratory (STL), University of Massachusetts Lowell (subsurface targets).
REFERENCES

Abbe, E. (1873). Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung. Archiv für mikroskopische Anatomie, 9(1), 413–418.
Aharonov, Y., Anandan, J., Popescu, S., & Vaidman, L. (1990). Superpositions of time evolutions of a quantum system and a quantum time-translation machine. Physical Review Letters, 64(25), 2965–2968.
Aharonov, Y., Erez, N., & Reznik, B. (2002). Superoscillations and tunneling times. Physical Review A, 65(5), 052124.
Ashok, A., & Neifeld, M. A. (2007). Pseudorandom phase masks for superresolution imaging from subpixel shifting. Applied Optics, 46(12), 2256–2268.
Aydiner, A. A., & Chew, W. C. (2003). On the nature of super-resolution in inverse scattering. In Antennas and Propagation Society International Symposium, 2003, 1 (pp. 507–510).
Bachl, A., & Lukosz, W. (1967). Experiments on superresolution imaging of a reduced object field. Journal of the Optical Society of America, 57, 163–169.
Berry, M. V. (1994a). Evanescent and real waves in quantum billiards, and Gaussian beams. Journal of Physics A, 27, L391–L398.
Berry, M. V. (1994b). Faster than Fourier. In J. S. Anandan & J. L. Safko (Eds.), Quantum Coherence and Reality; in celebration of the 60th birthday of Yakir Aharonov (pp. 55–65). Singapore: World Scientific.
Berry, M. V., & Dennis, M. R. (2009). Natural superoscillations in monochromatic waves in D dimensions. Journal of Physics A: Mathematical and Theoretical, 42(2), 022003.
Berry, M. V., & Popescu, S. (2006). Evolution of quantum superoscillations and optical superresolution without evanescent waves. Journal of Physics A: Mathematical and General, 39, 6965–6977.
Bohn, J. L., Nesbitt, D. J., & Gallagher, A. (2001). Field enhancement in apertureless near-field scanning optical microscopy. Journal of the Optical Society of America A, 18(12), 2998–3006.
Born, M., & Wolf, E. (1980). Principles of Optics. Oxford, UK: Pergamon.
Brady, D. J., Dogariu, A., Fiddy, M. A., & Mahalanobis, A. (2008).
Computational optical sensing and imaging: introduction to the feature issue. Applied Optics, 47(10), COSI1–COSI2.
Brady, D. J., Choi, K., Marks, D. L. M., Horisaki, R., & Lim, S. (2009). Compressive holography. Optics Express, 17, 13040–13049.
Brown, J. L., Jr. (1981). Multi-channel sampling of low-pass signals. IEEE Transactions on Circuits and Systems, CAS-28, 101–106.
Burg, J. P. (1967). Maximum entropy spectral analysis. In Proceedings of the 37th Meeting of the Society of Exploration Geophysicists. New York: IEEE Press.
Burg, J. P. (1975). Maximum entropy spectral analysis. Ph.D. thesis, Palo Alto, CA: Stanford University.
Byrne, C. L. (2005). Signal Processing: A Mathematical Approach. Wellesley, MA: A K Peters, Ltd.
Byrne, C. L., & Fiddy, M. A. (1988). Images as power spectra; reconstruction as a Wiener filter approximation. Inverse Problems, 4, 399–409.
Byrne, C. L., & Fitzgerald, R. M. (1982). Reconstruction from partial information with applications to tomography. SIAM Journal on Applied Mathematics, 42, 933–940.
Byrne, C. L., Fitzgerald, R. M., Fiddy, M. A., Hall, T. J., & Darling, A. M. (1983). Image restoration and resolution enhancement. Journal of the Optical Society of America A, 73, 1481–1487.
Candes, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2), 489–509.
Capon, J. (1969). High-resolution frequency-wavenumber spectrum analysis. Proceedings of the IEEE, 57(8), 1408–1418.
Charnotskii, M. I., Myakinin, V. A., & Zavorotnyy, V. U. (1990). Observation of superresolution in nonisoplanatic imaging through turbulence. Journal of the Optical Society of America A, 7, 1345–1350.
Chen, F.-C., & Chew, W. C. (1998a). Experimental verification of super resolution in nonlinear inverse scattering. Applied Physics Letters, 72, 3080–3082.
Chen, F.-C., & Chew, W. C. (1998b). Ultra-wideband radar imaging experiment for verifying super-resolution in nonlinear inverse scattering. In Antennas and Propagation Society International Symposium, 1998, 2 (pp. 1284–1287).
Chew, W. C., & Wang, Y. M. (1990). Reconstruction of two-dimensional permittivity distribution using the distorted Born iterative method. IEEE Transactions on Medical Imaging, 9(2), 218–225.
Choi, K., & Schulz, T. J. (2008). Signal-processing approaches for image-resolution restoration for TOMBO imagery. Applied Optics, 47(10), B104–B116.
Cragg, G. E., & So, P. T. C. (2000).
Lateral resolution enhancement with standing evanescent waves. Optics Letters, 25(1), 46–48.
Dennis, M. R., Hamilton, A. C., & Courtial, J. (2008). Superoscillation in speckle patterns. Optics Letters, 33(24), 2976–2978.
Dharanipragad, S. (1996). Resolution limits in signal recovery. IEEE Transactions on Signal Processing, 44, 546–561.
Elad, M., & Feuer, A. (1999). Super-resolution reconstruction of image sequences. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(9), 817–834.
Farsiu, S., Robinson, M. D., Elad, M., & Milanfar, P. (2004). Fast and robust multiframe super resolution. IEEE Transactions on Image Processing, 13, 1327–1344.
Fedosseev, R., Belyaev, Y., Frohn, J., & Stemmer, A. (2005). Structured light illumination for extended resolution in fluorescence microscopy. Optics and Lasers in Engineering, 43(3–5), 403–414.
Ferreira, P. J. S. G., Kempf, A., & Reis, M. J. C. S. (2007). Construction of Aharonov–Berry's superoscillations. Journal of Physics A: Mathematical and General, 40, 5141–5147.
Ferreira, P. J. S. G., & Kempf, A. (2006). Superoscillations: faster than the Nyquist rate. IEEE Transactions on Signal Processing, 54, 3732–3740.
Frieden, B. R. (1971). Evaluation, design, and extrapolating methods for optical signals, based on use of the prolate functions. In E. Wolf (Ed.), Progress in Optics, Vol. IX (pp. 311–407). Amsterdam: North-Holland.
Gabor, D. (1961). Light and information. In E. Wolf (Ed.), Progress in Optics, Vol. I (pp. 109–153). Amsterdam: North-Holland.
Gerchberg, R. (1974). Super-resolution through error energy reduction. Optica Acta, 21, 709–720.
Superresolution Imaging—Revisited
Gustafsson, M. G. L. (2000). Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy. Journal of Microscopy, 198(2), 82–87. Gustafsson, M. G. L. (2005). Nonlinear structured-illumination microscopy: wide-field fluorescence imaging with theoretically unlimited resolution. Proceedings of the National Academy of Sciences of the United States of America, 102, 13081–13086. Hennelly, B. (2009). Sampling in phase space. In M. E. Testorf, B. M. Hennelly & J. Ojeda-Castañeda (Eds.), Phase-Space Optics: Fundamentals and Applications (pp. 309–336). New York: McGraw-Hill. Huang, T. S., & Tsai, R. Y. (1984). Multi-frame image restoration and registration. Advances in Computer Vision and Image Processing, 1, 317–339. Katz, B., & Rosen, J. (2010). Super-resolution in incoherent optical imaging using synthetic aperture with Fresnel elements. Optics Express, 18, 962–972. Kekis, J. D., Testorf, M., Fiddy, M. A., & Giles, R. H. (2000). Detecting 1/10th scaled structures in dielectric media using monostatic X-band radar scattering measurements. Proceedings of SPIE, 4123, 13–24. Kempf, A. (2000). Black holes, bandwidths and Beethoven. Journal of Mathematical Physics, 41, 2360–2374. Lambert, A., Fraser, D., Jahromi, M. R. S., & Hunt, B. R. (2002). Super-resolution in image restoration of wide-area images viewed through atmospheric turbulence. Proceedings of SPIE, 4792, 35–43. Landau, H. (1986). Extrapolating a band-limited function from its samples taken in a finite interval. IEEE Transactions on Information Theory, 32(4), 464–470. Landau, H. J., & Pollak, H. O. (1961). Prolate spheroidal wave functions, Fourier analysis and uncertainty II. Bell System Technical Journal, 40, 65–84. Lohmann, A. W. (2006a). Optical Information Processing (pp. 433–456, chapter 35). Universitätsverlag Ilmenau. Lohmann, A. W. (2006b). The space-bandwidth product applied to spatial filtering and to holography. In M. Testorf, J. Ojeda-Castañeda & A. W.
Lohmann (Eds.), Selected Papers on Phase-Space Optics, MS 181 (pp. 11–32). Bellingham, WA: SPIE Press, SPIE Milestone Series. Lukosz, W. (1966). Optical systems with resolving power exceeding the classical limit. Journal of the Optical Society of America, 56, 1463–1472. Lukosz, W. (1967). Optical systems with resolving powers exceeding the classical limit. II. Journal of the Optical Society of America, 57, 932–941. Martínez-Corral, M., Andrés, P., Ojeda-Castañeda, J., & Saavedra, G. (1995). Tunable axial superresolution by annular binary filters. Application to confocal microscopy. Optics Communications, 119, 491–498. Matson, C. L., & Tyler, D. W. (2004). Primary and secondary superresolution: degrees of freedom versus Fourier extrapolation. Proceedings of SPIE, 5562, 179–187. Matson, C. L., & Tyler, D. W. (2006). Primary and secondary superresolution by data inversion. Optics Express, 14(2), 456–473. McGahan, R. V., & Kleinman, R. E. (1996). Special session on image reconstruction using real data. IEEE Transactions on Antennas and Propagation, 38, 39–40. McGahan, R. V., & Kleinman, R. E. (1997). Second annual special session on image reconstruction using real data. IEEE Transactions on Antennas and Propagation, 39(2), 7–9. Mendlovic, D., & Lohmann, A. W. (1997). Space-bandwidth product adaptation and its application to superresolution: fundamentals. Journal of the Optical Society of America A, 14, 558–562. Mendlovic, D., Lohmann, A. W., & Zalevsky, Z. (1997). Space-bandwidth product adaptation and its application to superresolution: examples. Journal of the Optical Society of America A, 14, 563–567.
Markus E. Testorf and Michael A. Fiddy
Neil, M. A. A., Juskaitis, R., & Wilson, T. (1997). Method of obtaining optical sectioning by using structured light in a conventional microscope. Optics Letters, 22, 1905–1907. Novotny, L., Pohl, D. W., & Hecht, B. (1995). Scanning near-field optical probe with ultrasmall spot size. Optics Letters, 20(9), 970–972. Papoulis, A. (1975). A new algorithm in spectral analysis and bandlimited extrapolation. IEEE Transactions on Circuits and Systems, CAS-22, 735–742. Papoulis, A. (1977). Generalized sampling expansion. IEEE Transactions on Circuits and Systems, CAS-24, 652–654. Papoulis, A. (1985). A note on the predictability of band-limited processes. Proceedings of the IEEE, 73, 1332–1333. Papoulis, A. (1994). Pulse compression, fiber communications, and diffraction: a unified approach. Journal of the Optical Society of America A, 11(1), 3–13. Pask, C. (1976). Simple theory of super-resolution. Journal of the Optical Society of America, 66, 68–70. Pohl, D. W., Denk, W., & Lanz, M. (1984). Optical stethoscopy: image recording with resolution lambda/20. Applied Physics Letters, 44(7), 651–653. Portnoy, A., Pitsianis, N. P., Brady, D. J., Guo, J., Fiddy, M. A., Feldman, M. R., et al. (2006). Thin digital imaging system using focal plane coding. Proceedings of SPIE, 6065, 60650F. Prasad, S. (2007). Digital superresolution and the generalized sampling theorem. Journal of the Optical Society of America A, 24(2), 311–325. Rayleigh, L. (1899). Investigations in optics, with special reference to the spectroscope. In Scientific Papers, Vol. 1 (pp. 415–459). Cambridge, UK: Cambridge University Press. Romberg, J. (2008). Imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2), 14–20. Sales, T. R. M., & Morris, G. M. (1997a). Diffractive superresolution elements. Journal of the Optical Society of America A, 14, 1637–1646. Sales, T. R. M., & Morris, G. M. (1997b). Fundamental limits of optical superresolution. Optics Letters, 22, 582–584.
Sentenac, A., Belkebir, K., Giovannini, H., & Chaumet, P. C. (2009). High-resolution total-internal-reflection fluorescence microscopy using periodically nanostructured glass slides. Journal of the Optical Society of America A, 26(12), 2550–2557. Sentenac, A., Guérin, C.-A., Chaumet, P. C., Drsek, F., Giovannini, H., & Bertaux, N. (2007). Influence of multiple scattering on the resolution of an imaging system: a Cramér-Rao analysis. Optics Express, 15, 1340–1346. Sharma, K. K., & Joshi, S. D. (2006). Signal reconstruction from the undersampled signal samples. Optics Communications, 268(2), 245–252. Sheppard, C. J. R., Calvert, G., & Wheatland, M. (1998a). Focal distribution for superresolving Toraldo filters. Journal of the Optical Society of America A, 15(4), 849–856. Sheppard, D. G., Hunt, B. R., & Marcellin, M. W. (1998b). Iterative multiframe superresolution algorithms for atmospheric-turbulence-degraded imagery. Journal of the Optical Society of America A, 15, 978–992. Shieh, H. M., Byrne, C. L., & Fiddy, M. A. (2006a). Image reconstruction: a unifying model for resolution enhancement and data extrapolation. Tutorial. Journal of the Optical Society of America A, 23, 258–266. Shieh, H. M., Byrne, C. L., Testorf, M., & Fiddy, M. A. (2001). Incorporation of prior information in surface imaging applications. Proceedings of SPIE, 4491, 336–345. Shieh, H. M., Byrne, C. L., Testorf, M. E., & Fiddy, M. A. (2006b). Iterative image reconstruction using prior knowledge. Journal of the Optical Society of America A, 23, 1292–1300. Simonetti, F. (2006). Multiple scattering: the key to unravel the subwavelength world from the far-field pattern of a scattered wave. Physical Review E, 73, 036619. Slepian, D., & Pollak, H. O. (1961). Prolate spheroidal wave functions, Fourier analysis and uncertainty I. Bell System Technical Journal, 40, 43–63.
Sloane, N. J. A., & Harwit, M. (1976). Masks for Hadamard transformation optics. Applied Optics, 15, 107–114. Solomon, J., Zalevsky, Z., & Mendlovic, D. (2005). Geometric superresolution by code division multiplexing. Applied Optics, 44(1), 32–40. Sparrow, C. M. (1916). On spectroscopic resolving power. The Astrophysical Journal, 44, 76–86. Stern, A. (2006). Sampling of linear canonical transformed signals. Signal Processing, 86, 1421–1425. Stern, A., & Javidi, B. (2004a). Sampling in the light of Wigner distribution. Journal of the Optical Society of America A, 21(3), 360–366. Stern, A., & Javidi, B. (2004b). Sampling in the light of Wigner distribution: errata. Journal of the Optical Society of America A, 21(10), 2038. Takhar, D., Laska, J. N., Wakin, M. B., Duarte, M. F., Baron, D., Sarvotham, S., et al. (2006). A new compressive imaging camera architecture using optical-domain compression. Proceedings of SPIE, 6065, 606509. Tanida, J., Kumagai, T., Yamada, K., Miyatake, S., Ishida, K., Morimoto, T., et al. (2001). Thin observation module by bound optics (TOMBO): concept and experimental verification. Applied Optics, 40(11), 1806–1813. Testorf, M., & Fiddy, M. (2001a). Algorithms for data evaluation applied to the detection of buried objects. Waves in Random Media, 11, 535–547. Testorf, M., & Fiddy, M. (2001b). Imaging from real scattered field data using a linear spectral estimation technique. Inverse Problems, 17, 1645–1658. Testorf, M., & Fiddy, M. A. (2001c). Suppression of surface reflection in subsurface imaging applications. Proceedings of SPIE, 4491, 121–130. Testorf, M., & Fiddy, M. (2007). Linear spectral estimation and the design of superresolution filters. In Frontiers in Optics, OSA Technical Digest (CD). Optical Society of America, paper FThV4. Thomson, L. C., Boissel, Y., Whyte, G., Yao, E., & Courtial, J. (2008). Simulation of superresolution holography for optical tweezers. New Journal of Physics, 10(2), 023015. Toraldo di Francia, G. (1952).
Super-gain antennas and optical resolving power. Nuovo Cimento Suppl, A 9, 426–438. Toraldo di Francia, G. (1955). Resolving power and information. Journal of the Optical Society of America, 45, 497–501. Toraldo di Francia, G. (1969). Degrees of freedom of an image. Journal of the Optical Society of America, 59(7), 799–803. Torre, A. (2005). Linear Ray and Wave Optics in Phase Space. Amsterdam: Elsevier. Treanor, P. J. (1946). On the telescopic resolution of unequal binaries. Observatory, 66, 255–258. Tyson, R. K. (2000). Introduction to Adaptive Optics. Bellingham, WA: SPIE Press. Unser, M., & Zerubia, J. (1998). A generalized sampling theory without bandlimiting constraints. IEEE Transactions on Circuits and Systems II, 45(8), 959–969. Ur, H., & Gross, D. (1992). Resolution from subpixel shifted pictures. CVGIP: Graphical Models and Image Processing, 54(2), 181–186. von Laue, M. (1914). Die Freiheitsgrade von Strahlenbündeln. Annalen der Physik, 349(16), 1197–1212. Watson, S. M., Mills, J. P., & Rogers, S. K. (1988). Two-point resolution criterion for multiaperture optical telescopes. Journal of the Optical Society of America A, 5, 893–903. Watson, S. M., Mills, J. P., & Rogers, S. K. (1989). Sidelobe reduction via multiaperture optical systems. Applied Optics, 28, 687–693. Zalevsky, Z., Saat, E., Orbach, S., Micó, V., & García, J. (2008). Exceeding the resolving imaging power using environmental conditions. Applied Optics, 47(4), A1–A6. Zalevsky, Z. (2009). Super resolved imaging in Wigner-based phase space. In M. E. Testorf, B. M. Hennelly & J. Ojeda-Castañeda (Eds.), Phase-Space Optics: Fundamentals and Applications (pp. 193–216). New York: McGraw-Hill.
Zalevsky, Z., & Mendlovic, D. (2004). Optical Superresolution (Springer Series in Optical Sciences, Vol. 91). New York: Springer. Zalevsky, Z., García-Martínez, P., & García, J. (2006). Superresolution using gray level coding. Optics Express, 14, 5178–5182. Zalevsky, Z., Mendlovic, D., & Lohmann, A. W. (2000a). Optical systems with improved resolving power. In E. Wolf (Ed.), Progress in Optics, Vol. 40 (pp. 271–341). Amsterdam: Elsevier. Zalevsky, Z., Mendlovic, D., & Lohmann, A. W. (2000b). Understanding superresolution in Wigner space. Journal of the Optical Society of America A, 17, 2422–2430. Zalevsky, Z., Rozental, S., & Meller, M. (2007). Usage of turbulence for superresolved imaging. Optics Letters, 32(9), 1072–1074. Zhang, Y. (2007). Design of three-dimensional superresolving binary amplitude filters by using the analytic method. Optics Communications, 274, 37–42. Zheludev, N. I. (2008). What diffraction limit? Nature Materials, 7, 420–422.
Contents of Volumes 151–162
VOLUME 151¹
C. Bontus and T. Köhler, Reconstruction algorithms for computed tomography
L. Busin, N. Vandenbroucke, and L. Macaire, Color spaces and image segmentation
G. R. Easley and F. Colonna, Generalized discrete Radon transforms and applications to image processing
T. Radlička, Lie algebraic methods in charged particle optics
V. Randle, Recent developments in electron backscatter diffraction
VOLUME 152
N. S. T. Hirata, Stack filters: from definition to design algorithms
S. A. Khan, The Foldy–Wouthuysen transformation technique in optics
S. Morfu, P. Marquié, B. Nofiélé, and D. Ginhac, Nonlinear systems for image processing
T. Nitta, Complex-valued neural network and complex-valued backpropagation learning algorithm
J. Bobin, J.-L. Starck, Y. Moudden, and M. J. Fadili, Blind source separation: the sparsity revolution
R. L. Withers, “Disorder”: structured diffuse scattering and local crystal chemistry
VOLUME 153
Aberration-corrected Electron Microscopy
H. Rose, History of direct aberration correction
M. Haider, H. Müller, and S. Uhlemann, Present and future hexapole aberration correctors for high-resolution electron microscopy
¹ Lists of the contents of volumes 100–149 are to be found in volume 150; the entire series can be searched on ScienceDirect.com.
O. L. Krivanek, N. Dellby, R. J. Kyse, M. F. Murfitt, C. S. Own, and Z. S. Szilagyi, Advances in aberration-corrected scanning transmission electron microscopy and electron energy-loss spectroscopy
P. E. Batson, First results using the Nion third-order scanning transmission electron microscope corrector
A. L. Bleloch, Scanning transmission electron microscopy and electron energy loss spectroscopy: mapping materials atom by atom
F. Houdellier, M. Hÿtch, F. Hüe, and E. Snoeck, Aberration correction with the SACTEM-Toulouse: from imaging to diffraction
B. Kabius and H. Rose, Novel aberration correction concepts
A. I. Kirkland, P. D. Nellist, L.-Y. Chang, and S. J. Haigh, Aberration-corrected imaging in conventional transmission electron microscopy and scanning transmission electron microscopy
S. J. Pennycook, M. F. Chisholm, A. R. Lupini, M. Varela, K. van Benthem, A. Y. Borisevich, M. P. Oxley, W. Luo, and S. T. Pantelides, Materials applications of aberration-corrected scanning transmission electron microscopy
N. Tanaka, Spherical aberration-corrected transmission electron microscopy for nanomaterials
K. Urban, L. Houben, C.-L. Jia, M. Lentzen, S.-B. Mi, A. Thust, and K. Tillmann, Atomic-resolution aberration-corrected transmission electron microscopy
Y. Zhu and J. Wall, Aberration-corrected electron microscopes at Brookhaven National Laboratory
VOLUME 154
H. F. Harmuth and B. Meffert, Dirac’s difference equation and the physics of finite differences
VOLUME 155
D. Greenfield and M. Monastyrskiy, Selected problems of computational charged particle optics
VOLUME 156
V. Argyriou and M. Petrou, Photometric stereo: an overview
F. Brackx, N. de Schepper, and F. Sommen, The Fourier transform in Clifford analysis
N. de Jonge, Carbon nanotube electron sources for electron microscopes
E. Recami and M. Zamboni-Rached, Localized waves: a review
VOLUME 157
M. I. Yavor, Optics of charged particle analyzers
VOLUME 158
P. Dombi, Surface plasmon-enhanced photoemission and electron acceleration with ultrashort laser pulses
B. J. Ford, Did physics matter to the pioneers of microscopy?
J. Gilles, Image decomposition: theory, numerical schemes, and performance evaluation
S. Svensson, The reverse fuzzy distance transform and its use when studying the shape of macromolecules from cryo-electron tomographic data
M. van Droogenbroeck, Anchors of morphological operators and algebraic openings
D. Yang, S. Kumar, and H. Wang, Temporal filtering technique using time lenses for optical transmission systems
VOLUME 159
Cold Field Emission and the Scanning Transmission Electron Microscope
A. V. Crewe, The work of Albert Victor Crewe on the scanning transmission electron microscope and related topics
L. W. Swanson and G. A. Schwind, A review of the cold-field electron cathode
Joseph S. Wall, Martha N. Simon, and James F. Hainfeld, History of the STEM at Brookhaven National Laboratory
Hiromi Inada, Hiroshi Kakibayashi, Shigeto Isakozawa, Takahito Hashimoto, Toshie Yaguchi, and Kuniyasu Nakamura, Hitachi’s development of cold-field emission scanning transmission electron microscopes
P. W. Hawkes, Two commercial STEMs: the Siemens ST100F and the AEI STEM-1
Ian R. M. Wardell and Peter E. Bovey, A history of Vacuum Generators’ 100-kV scanning transmission electron microscope
H. S. von Harrach, Development of the 300-kV Vacuum Generator STEM (1985–1996)
Bernard Jouffrey, On the high-voltage STEM project in Toulouse (MEBATH)
Andreas Engel, Scanning transmission electron microscopy: biological applications
K. C. A. Smith, STEM at Cambridge University: reminiscences and reflections from the 1950s and 1960s
VOLUME 160
Zofia Barańczuk, Joachim Giesen, Klaus Simon, and Peter Zolliker, Gamut mapping
Adrian N. Evans, Color area morphology scale-spaces
Ye Pu, Chia-Lung Hsieh, Rachel Grange, and Demetri Psaltis, Harmonic holography
Gerhard X. Ritter and Gonzalo Urcid, Lattice algebra approach to endmember determination in hyperspectral imagery
Reinhold Rüdenberg, Origin and background of the invention of the electron microscope
H. Gunther Rudenberg and Paul G. Rudenberg, Origin and background of the invention of the electron microscope: commentary and expanded notes on Memoir of Reinhold Rüdenberg
VOLUME 161
Marian Mankos, Vassil Spasov, and Eric Munro, Principles of dual-beam low-energy electron microscopy
Jorge D. Mendiola-Santibáñez, Iván R. Terol-Villalobos, and Israel M. Santillán-Méndez, Determination of adequate parameters for connected morphological contrast mappings through morphological contrast measures
Ignacio Moreno and Carlos Ferreira, Fractional Fourier transforms and geometrical optics
Vladan Velisavljević, Martin Vetterli, Baltasar Beferull-Lozano, and Pier Luigi Dragotti, Sparse image representation by directionlets
Michael H. F. Wilkinson and Georgios K. Ouzounis, Advances in connectivity and connected attribute filters
VOLUME 162
Kiyotaka Asakura, Hironobu Niimi, and Makoto Kato, Energy filtered X-ray photoemission electron microscopy
E. C. Cosgriff, P. D. Nellist, A. J. D’Alfonso, S. D. Findlay, G. Behan, P. Wang, L. J. Allen, and A. I. Kirkland, Image contrast in aberration-corrected scanning confocal electron microscopy
C. J. Edgcombe, New dimensions for field emission: effects of structure in the emitting surface
Archontis Giannakidis and Maria Petrou, Conductivity imaging and generalized Radon transform: a review
O. Losson, L. Macaire, and Y. Yang, Comparison of color demosaicing methods
Index
A Abbe’s theory Fraunhofer plane wave spectrum, 170 interference pattern, 171 periodic input signal, 170–171 Abraham–Lorentz theory electrostatic self-energy, 39 momentum, electron, 40 radiation reaction force, 39–40 self-force, 39 B Blue and brown pigments. See Green, blue and brown pigments C Calcium carbonate–calcium sulfate transformation, 160–161 Caldirola theory acceleration energy, 46–47 electron radiation properties, 45 finite-difference equation, 42, 47 kinetic energy, 46 relativistic equation of motion, 47 transmission law, 43 Chromium oxide green pigments, 151, 155 Chronon atom of time, 36 density operators and coarse-graining hypothesis discretized Liouville equation and time-energy uncertainty relation, 88–90 measurement problem, quantum mechanics, 90–95 Dirac differential equation, 37 discretized quantum equations, applications discretized Klein–Gordon equation, 73–76 free particle, 69–73 hydrogen atom, 81–86
position and momentum operators, time evolution, 76–81 simple harmonic oscillator, 66–68 electron classical theory Abraham–Lorentz theory, 39–40 Caldirola theory, 42–48 Dirac theory, 40–42 evolution operators, 98–103 quantum mechanics Feynman path integrals, 57–60 leptons, mass spectrum, 55–57 muon mass, 53–55 Schrödinger and Heisenberg pictures, 61–62 time-dependent Hamiltonians, 62–66 time discretization, 36 Coarse-graining hypothesis Liouville equation, 86 phase space, 87 Cramér-Rao analysis, 203 D Density operators and coarse-graining hypothesis discretized Liouville equation and time-energy uncertainty relation density operator, 88 Heisenberg equation, 89 time evolution operator, 90 Dirac theory energy-momentum four-vector flux, 41 reaction force, 41–42 Discrete formalism Discrete Fourier transform algorithm, 193–197 Discretized quantum equations, applications discretized Klein–Gordon equation eigenfunctions, 74 Hamiltonian operator, 75 plane wave solutions, 74–76 red shift, 75 zero-spin photon, 73
Discretized quantum equations, applications (cont.) free particle eigenfunctions, 69 frequency, 69–70 inflection point, 71–72 retarded equation, 71, 73 symmetric equation, 69–70, 72 hydrogen atom eigenvalues, 82 Hamiltonian, 81–82 hyperfine corrections, 81, 84 Lamb shift, 81, 84–85 time discretization effect, 83 transition probability, 85 position and momentum operators, time evolution creation and annihilation operators, 77, 79 damping factor, 79–80 finite-difference equations, 77 Heisenberg equations, 76 phase space, 78 simple harmonic oscillator damping factor, 68 eigenvalue equation, 67 Hamiltonian, 66 E Electron classical theory, chronon Abraham–Lorentz theory electrostatic self-energy, 39 momentum, electron, 40 radiation reaction force, 39–40 self-force, 39 Caldirola theory acceleration energy, 46–47 electron motion, 43–44 electron radiation properties, 45 finite-difference equation, 42, 47 kinetic energy, 46 magnetic moment, 48 relativistic equation of motion, 47 symmetric formulation, 48 transmission law, 43 Dirac theory energy-momentum four-vector flux, 41 reaction force, 41–42 force law, 38 hyperbolic motions energy-conservation law, 49 radiation-reaction effects, 48 Schott energy, 49
Ergodic hypothesis, 96 Evanescent waves dispersion relation, 123 Poynting vector, 123 total internal reflection, 122 Evolution operators density matrix picture density of states operator, 100 time evolution operator, 100–101 Liouville-von Neumann discrete picture, 98 Schrödinger picture Hermitean operator, 99 time evolution operator, 98 Trotter equality, 99 F Feynman path integrals positron, 60 quantum state, 58–60 time evolution, 57–58 transition amplitude, 57–58 Feynman’s approach, 58, 95 Fraunhofer plane wave spectrum, 170 Functional subwavelength imaging, 120–121 G Geometrical superresolution. See Multiaperture systems and digital superresolution Gerchberg–Papoulis algorithm, 192–193 Green, blue and brown pigments EDS analysis, 152–153, 155–157 green earth and chromium oxide greens, 151, 155 optical image, 152, 154, 156–157 SEM image, 152, 154, 156–157 ultramarine, 151 umber, 151 H Hadamard codes, 186 Harmonic oscillator. See Simple harmonic oscillator Historical pigments crystallographic structure, 142 green, blue and brown pigments EDS analysis, 152–153, 155–157 green earth and chromium oxide greens, 151, 155 optical image, 152, 154, 156–157 SEM image, 152, 154, 156–157
ultramarine, 151 umber, 151 micro-Raman spectroscopy, 142–143 microscopic techniques, combination of principle, 144 sampling and preparing sample cross section, 144 SEM and EDS, 144–145 strategy, 145–146 ochre pigments and barium sulphate EDS analysis, 158–159 optical image, 158 SEM image, 158 particle-induced x-ray emission (PIXE), 142–143 red pigments and carbon based pigments EDS analysis, 147–151 grain size, 149 optical image, 147 SEM image, 147 Holography graphene, 6–7 interference substrates, 6 optical standing wave, 4–5 overlapping zone, 5 Hydrogen atom eigenvalues, 82 Hamiltonian, 81–82 hyperfine corrections, 81, 84 Lamb shift, 81, 84–85 time discretization effect, 83 transition probability, 85 I Image resolution, classical estimates Abbe’s theory Fraunhofer plane wave spectrum, 170 interference pattern, 171 periodic input signal, 170–171 classical metrics for resolution, 173–174 Rayleigh limit definition, 168 numerical techniques, 170 optical transfer function (OTF), 168 saddle, definition, 169 two-point image, 168–169 Shannon–Whittaker sampling theorem, 172 Interference scanning optical probe microscopy (ISOM), 26–27 ISOM. See Interference scanning optical probe microscopy
K Klein–Gordon equation eigenfunctions, 74 Hamiltonian operator, 75 plane wave solutions, 74–76 red shift, 75 zero-spin photon, 73 L Lateral standing waves fringe pattern calculation, 14 grating structure, 18 interference fringes, 21 k vector, 14 local field distribution, 16 local optical field, 11 micrograting, 1180 point dipoles, 20 optical image, 18–19 scattered dipole field, 13 scattered field amplitude, 12, 20 Liouville equation, 86 Lithopone, 158 Lorentz invariant. See Caldirola theory Lukosz superresolution and application aperture radar imaging, 181–182 concepts, 182 evanescent waves and near-field microscopy, 186 interpretation, 180–181 multi-aperture systems and digital superresolution, 184–186 phase-space analysis, 180 phase-space diagram (PSD), 181 structured illuminations, 182–183 turbulence, 183–184 M Mandelstam-Tamm time-energy correlation Measurement problem, quantum mechanics density operator, 92–94 eigenvalue equation, 90 irreversibility, 94 Liouville-von Neumann equation, 94–95 time evolution operator, 91 Metamaterials, 125 Multi-aperture systems and digital superresolution detector array, 185 Hadamard codes, 186 limitations, 185 TOMBO imaging system, 185
N Near-field scanning optical microscopy, 124 Non-Hermitean operator equivalent Hamiltonian, 103 probability, 105 time evolution operator, 104 Numerical superresolution algorithms Gerchberg–Papoulis algorithm, 192–193 multiple scattering Cramér-Rao analysis, 203 inverse scattering problem, 203–204 nonlinear numerical methods, 203 prior discrete Fourier transform algorithm (PDFT) bandwidth extrapolation, 197 coefficients an, 194 linear combination, 194 monostatic backscatter data, 202 P-matrix, 195 projector, 193 secondary superresolution, definition, 199 synthetic Fourier data, 195–196
O
Ochre pigments and barium sulphate EDS analysis, 158–159 optical image, 158 SEM image, 158 Optical and SEM combined with EDS (SEM-EDS) analysis principle, 144 sampling and preparing sample cross section, 144 SEM and EDS, 144–145 strategy, 145–146 Optical interference near surfaces holography graphene, 6–7 interference substrates, 6 optical standing wave, 4–5 overlapping zone, 5 intermediate-field images, reconstruction ISOM, 26–27 numerical reconstruction, 24–25 standing wave field, 22–23 steps to reconstruct, 22 lateral standing waves fringe pattern calculation, 14 grating structure, 18 interference fringes, 21
k vector, 14 local field distribution, 16 local optical field, 11 micrograting, 1180 point dipoles, 20 optical image, 18–19 scattered dipole field, 13 scattered field amplitude, 12, 20 optical microscopy, 3–4 optical standing waves, 3–5 plasmonics, 2 surface standing waves classification, 7 electric field intensity calculation, 9–10 oscillating dipole, 7, 9 piezo tube scanner, 9 Talbot effect and phase singularities, 28–30 Optical microscopy, 3–4 Optical standing waves, 3–5 Optical superresolution superoscillations, 188–190 superresolution filters, 187–188 Optical transfer function (OTF), 168 OTF. See Optical transfer function
P
Papoulis generalized sampling associated transfer functions, 208 Brown’s derivation, 210 data consistency condition, 209 PDFT. See Prior discrete Fourier transform algorithm Pendry superlens double-focusing effect, 127–128 Fourier transform, 134 negative index of refraction, 126–127 permeability and permittivity, 126–127 resolution impedance, 130 propagating wave focusing, 129 transmission coefficient, 130 uniqueness theorem, 128 wave vectors, 130–131 Phase-space analysis, 180 Pigments. See Historical pigments Plasmonics, 2 Position and momentum operators, time evolution creation and annihilation operators, 77, 79 damping factor, 79–80 finite-difference equations, 77 Heisenberg equations, 76
phase space, 78 Prior discrete Fourier transform algorithm (PDFT) bandwidth extrapolation, 197 coefficients an, 194 monostatic backscatter data, 202 P-matrix, 195 projector, 193 secondary superresolution, definition, 199 synthetic Fourier data, 195–196 Q Quantum jumps, 97 Quantum mechanics, chronon damping factor, 53, 68, 80 Feynman path integrals positron, 60 quantum state, 58–60 time evolution, 57–58 transition amplitude, 57–58 leptons, mass spectrum intrinsic magnetic moment, 56 self-energy, 56–57 muon mass bound states, 53 Heisenberg uncertainty relations, 54–55 Schrödinger and Heisenberg pictures evolution law, 62 time evolution operator, 61–62 time-dependent Hamiltonians eigenstates, 64 evolution coefficients, 66 time evolution operator, 62–63, 65 vector state, 63 time discretization, 50 Quantum of time. See Chronon R Rayleigh limit definition, 168 numerical techniques, 170 optical transfer function (OTF), 168 saddle, definition, 169 two-point image, 168 Red pigments and carbon based pigments EDS analysis, 147–151 grain size, 149 optical image, 147 SEM image, 147
S Sampling expansions compressive sampling, 211–212 Papoulis generalized sampling associated transfer functions, 208 Brown’s derivation, 210 data consistency condition, 209 squeezing and dissecting phase space chirping, definition, 205–206 free-space propagation, 206 phase-space representations, 204 Wigner distribution function, 205 Secondary superresolution, definition, 199 Shannon–Whittaker sampling theorem, 172 Simple harmonic oscillator damping factor, 68 eigenvalue equation, 67 Hamiltonian, 66 Space-bandwidth adaptation grating encoder system, 179 image resolution, improvement, 179 optical signal representation, 175, 177 phase-space diagrams, 177–178 principle, 176 shear parameter, 177 Space-bandwidth product incoherent image formation, 176 optical signals, 175 phase-space optics, 176 Sparrow’s criterion, 173 Subwavelength imaging. See also Optical interference near surfaces bandwidth, 137 charge distribution, 132–133 evanescent waves dispersion relation, 123 Poynting vector, 123 total internal reflection, 122 fluorescent proteins, 120–122 magnetic resonance intensity, 120 metamaterials, 125 near-field scanning optical microscopy, 124 Pendry superlens double-focusing effect, 127–128 negative index of refraction, 126–127 permeability and permittivity, 126–127 resolution, 128–131 photoactivated localization microscopy, 120 photomask techniques, 118
Subwavelength imaging. See also Optical interference near surfaces (cont.) planar sensor, 132–133 Rayleigh criterion, 117 signal amplitude, 136 signal-to-noise ratio (SNR), 135–137 submicrowave electromagnetic radiation, 119 transmission electron microscopy, 118 types functional, 120–121 true, 120–122 Superoscillations, 187–190 Superresolution filters, 187–188 Superresolution imaging classical estimates Abbe’s theory, 170–172 classical metrics for resolution, 173–174 Rayleigh limit, 168–170 sampling theorem, 172–173 numerical superresolution algorithms Gerchberg–Papoulis algorithm, 192–193 multiple scattering, 203–204 prior discrete Fourier transform algorithm (PDFT), 193–197 optical superresolution superoscillations, 188–190 superresolution filters, 187–188 reassigning degrees of freedom Lukosz superresolution, 180–181 space-bandwidth adaptation, 176–179 space-bandwidth product, 174–176
sampling expansions compressive sampling, 211–212 Papoulis generalized sampling, 207–211 squeezing and dissecting phase space, 204–207 Surface standing waves classification, 7 electric field intensity calculation, 9–10 oscillating dipole, 7, 9 piezo tube scanner, 28–30 Synthetic aperture radar imaging, 181–182 T Talbot effect and phase singularities, 28–30 Time-dependent Hamiltonians eigenstates, 64 evolution coefficients, 66 time evolution operator, 62–63, 65 vector state, 63 Time-energy uncertainty relation, 89 Transillumination, 125 True subwavelength imaging, 120–122 Turbulence, 183–184 U Ultramarine, 151 Umber, 151 V Veselago lens. See Pendry superlens