ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 130
EDITOR-IN-CHIEF
PETER W. HAWKES CEMES-CNRS Toulouse, France
ASSOCIATE EDITORS
BENJAMIN KAZAN Palo Alto, California
TOM MULVEY Aston University Birmingham, United Kingdom
Elsevier Academic Press 525 B Street, Suite 1900, San Diego, California 92101-4495, USA 84 Theobalds Road, London WC1X 8RR, UK
This book is printed on acid-free paper. Copyright © 2004, Elsevier Inc. All Rights Reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher’s consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (www.copyright.com), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2004 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 1076-5670/2004 $35.00
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: [email protected]. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting “Customer Support” and then “Obtaining Permissions.”
For all information on all Academic Press Publications visit our Web site at www.academicpress.com
ISBN: 0-12-014772-6 PRINTED IN THE UNITED STATES OF AMERICA 04 05 06 07 08 9 8 7 6 5 4 3 2 1
CONTENTS
Contributors
Preface
Future Contributions

Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy
S. van Aert, A. J. den Dekker, A. van den Bos, and D. van Dyck
    I. Introduction
    II. Basic Principles of Statistical Experimental Design
    III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models
    IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy
    V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy
    VI. Discussion and Conclusions
    References

Transform-Based Image Enhancement Algorithms with Performance Measure
Artyom M. Grigoryan and Sos S. Agaian
    I. Introduction
    II. Transforms with Frequency Ordered Systems
    III. Tensor Method of Image Enhancement
    References

Image Registration: An Overview
Maria Petrou
    I. Introduction
    II. Similarity Measures
    III. Deriving the Transformation Between the Two Images
    IV. Feature Extraction
    V. Literature Survey
    VI. Conclusions
    References

Index
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors’ contribution begins.
Sos S. Agaian (165), Department of Electrical Engineering, The University of Texas at San Antonio, San Antonio, Texas 78249, USA
A. J. den Dekker (1), Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CJ Delft, The Netherlands
Artyom M. Grigoryan (165), Department of Electrical Engineering, The University of Texas at San Antonio, San Antonio, Texas 78249, USA
Maria Petrou (243), Informatics and Telematics Institute, CERTH, POB 361, Thermi 57001, Thessaloniki, Greece
S. van Aert (1), Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
A. van den Bos (1), Faculty of Applied Sciences, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands
D. van Dyck (1), Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
PREFACE
The three chapters in this volume are all related in one way or another to imaging. The first very long and original contribution by S. van Aert, A. J. den Dekker, A. van den Bos, and D. van Dyck is concerned with the difficult problem of obtaining structural information at atomic resolution (around 0.1 nm) from electron microscope images. This requires far more than a good microscope. Every stage of the image-forming process must be analyzed thoroughly and techniques must then be devised to extract structural information about the specimen from the images recorded, in which such information, degraded and distorted, is coded. This is the first full study of this subject and I am sure that it will be heavily used by microscopists seeking high-resolution data.
In the second chapter, A. M. Grigoryan and S. S. Agaian reconsider the venerable question of image enhancement. Although this has a large literature, the subject is by no means to be regarded as exhausted, as this chapter shows. The authors are particularly concerned with improvement of the visual appearance of images and production of visually pleasing images. They discuss several approaches to these goals and define image quality carefully. A lengthy section is devoted to the “tensor” methods of image enhancement introduced by one of the authors.
Finally, we include an account of image registration by M. Petrou, no stranger to these pages. Although registration, “the process that allows one to know which parts of two different images were produced by the same physical object or the same part of a physical object that was imaged,” is enormously important, especially in medical imaging and remote sensing, it is not often regarded as a subject in its own right. This contribution shows convincingly that a study of registration was needed, and I have no doubt that it will be widely welcomed.
It only remains for me to thank all the authors for the care they have devoted to the preparation of their contributions and for their efforts to make them as readable as such difficult subject material allows.
Peter W. Hawkes
FUTURE CONTRIBUTIONS
G. Abbate
    New developments in liquid-crystal-based photonic devices
S. Ando
    Gradient operators and edge and corner detection
H. F. Arnoldus (vol. 132)
    Travelling and evanescent waves and the use of dyadic Green’s functions
C. Beeli
    Structure and microscopy of quasicrystals
G. Borgefors
    Distance transforms
B. C. Breton, D. McMullan and K. C. A. Smith (Eds) (vol. 133)
    Sir Charles Oatley and the scanning electron microscope
A. Bretto (vol. 131)
    Hypergraphs and their use in image modelling
B. Buchberger
    Gröbner bases
H. Delingette
    Surface reconstruction based on simplex meshes
D. van Dyck
    Very high resolution electron microscopy
R. G. Forbes
    Liquid metal ion sources
E. Förster and F. N. Chukhovsky
    X-ray optics
A. Fox
    The critical-voltage effect
G. Gilboa
    PDE-based image enhancement
L. Godo and V. Torra
    Aggregation operators
A. Gölzhäuser
    Recent advances in electron holography with point sources
K. Hayashi
    X-ray holography
M. I. Herrera
    The development of electron microscopy in Spain
D. Hitz
    Recent progress on HF ECR ion sources
H. Hölscher
    Dynamic force microscopy
J. Hormigo and G. Cristobal (vol. 131)
    Texture and the Wigner distribution
D. P. Huijsmans and N. Sebe
    Ranking metrics and evaluation measures
K. Ishizuka
    Contrast transfer and crystal images
K. Jensen
    Field-emission source mechanisms
G. Kögel
    Positron microscopy
T. Kohashi
    Spin-polarized scanning electron microscopy
W. Krakow
    Sideband imaging
N. Krüger (vol. 131)
    The application of statistical and deterministic regularities in biological and artificial vision systems
B. Lahme (vol. 132)
    Karhunen–Loève decomposition
B. Lencová
    Modern developments in electron optical calculations
R. Lenz
    Aspects of colour image processing
W. Lodwick
    Interval analysis and fuzzy possibility theory
S. Mane
    Dynamics of spin-polarized particles in circular accelerators
M. Matsuya
    Calculation of aberration coefficients using Lie algebra
L. Mugnier, A. Blanc, and J. Idier
    Phase diversity
K. Nagayama
    Electron phase microscopy
A. Napolitano
    Linear filtering of generalized almost cyclostationary signals
M. A. O’Keefe
    Electron image simulation
N. Papamarkos and A. Kesidis
    The inverse Hough transform
R.-H. Park
    Circulant matrix representation of feature masks
K. S. Pedersen, A. Lee, and M. Nielsen
    The scale-space properties of natural images
R. Piroddi and M. Petrou (vol. 132)
    Dealing with irregularly sampled data
M. Rainforth (vol. 132)
    Recent developments in the microscopy of ceramics, ferroelectric materials and glass
E. Rau
    Energy analysers for electron microscopes
H. Rauch
    The wave-particle dualism
E. Recami
    Superluminal solutions to wave equations
J. Rehacek, Z. Hradil, and J. Peřina
    Neutron imaging and sensing of physical fields
H. Rose (vol. 132)
    Five-dimensional Hamilton–Jacobi approach to relativistic quantum mechanics of the electron
J. J. W. M. Rosink and N. van der Vaart (vol. 131)
    HEC sources for the CRT
G. Schmahl
    X-ray microscopy
G. Schönhense, C. Schneider, and S. Nepijko
    Time-resolved photoemission electron microscopy
R. Shimizu, T. Ikuta, and Y. Takai
    Defocus image modulation processing in real time
S. Shirai
    CRT gun design methods
K. Siddiqi and S. Bouix
    The Hamiltonian approach to computer vision
N. Silvis-Cividjian and C. W. Hagen
    Electron-beam-induced deposition
T. Soma
    Focus-deflection systems and their applications
J.-L. Starck (vol. 132)
    The curvelet transform
W. Szmaja
    Recent developments in the imaging of magnetic domains
I. Talmon
    Study of complex fluids by transmission electron microscopy
M. E. Testorf and M. Fiddy
    Imaging from scattered electromagnetic fields, investigations into an unsolved problem
R. Thalhammer
    Virtual optical experiments
M. Tonouchi
    Terahertz radiation imaging
N. M. Towghi
    Ip norm optimal filters
Y. Uchikawa
    Electron gun optics
K. Vaeth and G. Rajeswaran
    Organic light-emitting arrays
J. Valdés
    Units and measures, the future of the SI
D. Vitulano
    Fractal encoding
D. Windridge
    The tomographic fusion technique
C. D. Wright and E. W. Hill
    Magnetic force microscopy
M. Yeadon
    Instrumentation for surface studies
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 130
Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy

S. VAN AERT,1 A. J. DEN DEKKER,2 A. VAN DEN BOS,3 AND D. VAN DYCK1

1 Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
2 Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CJ Delft, The Netherlands
3 Faculty of Applied Sciences, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands
I. Introduction
    A. Qualitative Atomic Resolution Transmission Electron Microscopy
    B. Quantitative Atomic Resolution Transmission Electron Microscopy
    C. Statistical Experimental Design
II. Basic Principles of Statistical Experimental Design
    A. Introduction
    B. Parametric Statistical Models of Observations
    C. Attainable Precision
        1. The Cramér-Rao Lower Bound
        2. Precision Based Optimality Criteria
    D. Maximum Likelihood Estimation
    E. Conclusions
III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models
    A. Introduction
    B. Parametric Statistical Models of Observations
        1. One-Dimensional Observations
        2. Two-Dimensional Observations
        3. Three-Dimensional Observations
    C. Approximations of the Cramér-Rao Lower Bound
        1. One-Dimensional Observations
        2. Two-Dimensional Observations
        3. Three-Dimensional Observations
    D. Discussions and Examples
        1. Two-Dimensional Observations
        2. Three-Dimensional Observations
    E. Conclusions
IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy
    A. Introduction
    B. Parametric Statistical Model of Observations
        1. The Exit Wave
        2. The Image Wave
        3. The Image Intensity Distribution
        4. The Image Recording
        5. The Incorporation of a Monochromator
    C. Statistical Experimental Design
        1. Microscope Settings
        2. Numerical Results
        3. Interpretation of the Results
    D. Conclusions
V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy
    A. Introduction
    B. Parametric Statistical Model of Observations
        1. The Exit Wave
        2. The Image Intensity Distribution
        3. The Image Recording
    C. Statistical Experimental Design
        1. Microscope Parameters
        2. Numerical Results
        3. Interpretation of the Results
    D. Conclusions
VI. Discussion and Conclusions
Appendix A
Appendix B
Appendix C
References

Copyright 2004, Elsevier Inc. All rights reserved. ISSN 1076-5670/04
I. Introduction

In materials science, the last decades are characterized by an evolution from macro- to micro- and, more recently, to nanotechnology. In nanotechnology, nanomaterials play an important role. Examples of nanomaterials are nanoparticles, nanotubes, and layered magnetic and superconducting materials (Nalwa, 2002; van Tendeloo et al., 2000). The interesting properties of these materials are related to their structure. Therefore, one of the central issues in materials science is to understand the relations between the properties of a given material on the one hand and its structure on the other hand. A complete understanding of this relation, combined with recent progress in building nanomaterials atom by atom, will enable materials science to evolve into materials design, that is, from describing and understanding towards predicting materials with interesting properties (Browning et al., 2001; Olson, 1997, 2000; Reed and Tour, 2000; Wada,
1996). In order to understand the properties-structure relation, experimental and theoretical studies are needed (Muller and Mills, 1999; Spence, 1999; Springborg, 2000). Essentially, theoretical studies allow one to calculate the properties of materials with known structure, whereas experimental studies allow one to characterize materials in terms of structure. In practice, however, the combination of both approaches is not yet feasible. One of the reasons is that present experimental characterization methods generally cannot determine atom positions locally with sub-ångstrom precision (Olson, 2000). A precision of the order of 0.01 to 0.1 Å is needed (Muller, 1998, 1999; Kisielowski, Principe, Freitag, and Hubert, 2001). Various experimental characterization methods exist. However, scanning probe techniques, such as scanning tunnelling and atomic force microscopy, restrict investigations to surface or near-surface regions (Wiesendanger, 1994). Hence, they cannot provide subsurface information. Classical X-ray and neutron diffraction techniques, on the other hand, only provide averaged, instead of local, structure information (Zanchet and Ugarte, 2000). Therefore, they may only be applied successfully to periodic materials, such as crystals, whereas nanomaterials are usually aperiodic. Only atomic resolution transmission electron microscopy (TEM) techniques seem to be appropriate to provide local information down to the atomic scale, since electrons interact sufficiently strongly with materials (Fujita and Sumida, 1994; Spence, 1999). Another advantage of electrons is that they are charged and can therefore give information about the ionization state of atoms. They can also be deflected by lenses, yielding information both in real and Fourier space. Furthermore, as compared to X-rays or neutrons, electrons would even provide more structure information for a given amount of radiation damage (Henderson, 1995).
Figure 1 presents a compact scheme of the collection of electron microscopical observations by means of atomic resolution TEM. The observations are two-dimensional projected images of three-dimensional objects. Obviously, only the position of projected atoms or atom columns may be obtained from a single image. Quantitative atomic resolution TEM allows materials scientists to measure structure parameters, including the positions of projected atoms or atom columns, from the obtained observations. The observations fluctuate about their expectations. The physical model describing these expectations, the expectation model, contains the structure parameters to be measured. Quantitative atomic resolution TEM makes use of such a model combined with statistical parameter estimation techniques in order to measure, or more specifically, to estimate, the positions of projected atoms or atom columns. Subsequently, the positions of the atoms in three-dimensional space may be derived from combining the measurements of a set of projected images. Therefore,
quantitative atomic resolution TEM is probably the most appropriate technique for very precise measurement of atom positions.

Figure 1. Scheme of an atomic resolution TEM experiment. The observations are two-dimensional projected images of three-dimensional objects. The structure parameters of these objects are unknown. Quantitative atomic resolution TEM allows one to estimate these parameters from the observations. The precision of the estimates depends on the microscope settings. The optimal microscope settings result in the highest attainable precision.

The precision of the projected atom position or atom column position estimates is limited by the presence of the fluctuations in the observations. It depends on the microscope settings, such as the defocus and the aperture. In the literature, a particular choice of such settings is referred to as experimental design (Fedorov, 1972). The purpose of this article is to optimize the experimental design in terms of the attainable precision, under relevant physical constraints. These constraints are either the radiation sensitivity of the object or the specimen drift. Therefore, either the incident electron dose per square ångstrom (that is, the number of electrons per square ångstrom that interact with the object during the experiment) or the recording time has to be kept subcritical. Of crucial importance in the optimization procedure is that the attainable precision can be adequately quantified (van den Bos, 1982; van den Bos and den Dekker, 2001). It is used as optimality criterion for quantitative evaluation of the effect of microscope settings on the precision. This evaluation procedure, which is called statistical experimental design, allows electron microscopists to derive the optimal statistical experimental design, that is, the experimental design resulting in the highest attainable precision. Strictly speaking, optimal statistical experimental design refers to the optimization of tunable settings,
such as the defocus, and not to fixed settings, such as the spherical aberration constant, which, in the absence of a so-called spherical aberration corrector, is a fixed property of the microscope. According to van den Bos (2002), the optimization of fixed settings by means of new instrumental developments could be called optimal instrumental design. However, since the optimization procedure is the same, this distinction in terminology will not be made in the remainder of this article. The application of statistical experimental design to quantitative atomic resolution TEM is, in the authors’ opinion, novel. In these considerations, subjective qualities of the electron microscope as an imaging instrument are no longer important. In a sense, it doesn’t matter whether the produced images are good-looking or not. The electron microscope is considered to be a measuring instrument (van Aert, den Dekker, van den Bos, and van Dyck, 2002; van den Bos, 2002a). This means that the structure parameters, the projected atom positions or atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. For many years, it has been standard practice to interpret images visually or to compare images visually with computer simulations in order to determine the structure of an object. This will be called qualitative atomic resolution TEM. The optimality criteria used to evaluate the accompanying microscope designs are based on classical resolution criteria, such as Rayleigh’s. However, these criteria are not suitable for quantitative atomic resolution TEM. Instead, the attainable statistical precision is the criterion of importance.

A. Qualitative Atomic Resolution Transmission Electron Microscopy

Up to recently, qualitative atomic resolution TEM was hampered by insufficient resolution of the electron microscope. A resolution of 1 Å would be required to visualize the individual projected atom columns of materials with columnar structures, such as perfect crystals or crystals containing defects in the structure, viewed along a main zone axis. Over the years, different methods have been developed to obtain 1 Å resolution. Examples of such methods are:

• High-voltage electron microscopy
• Correction of the spherical aberration in the electron microscope
• High-angle annular dark-field scanning transmission electron microscopy
• Focal-series reconstruction
• Off-axis holography
• Correction of the chromatic aberration in the electron microscope
These methods improve the interpretability of the experimental images in terms of the structure. The former three do not require image processing techniques, whereas the latter three do. In high-voltage electron microscopy, the accelerating voltage of the electron microscope is increased up to 1 MV and beyond (Phillipp et al., 1994). It is used in conventional transmission electron microscopy (CTEM). In this mode, the object is illuminated with a parallel incident electron beam. If the object is thin, the directly interpretable resolution for CTEM is given by the so-called point resolution (O’Keefe, 1992; Spence, 1988). In high-voltage electron microscopy, it is about equal to 1 Å. For comparison, in intermediate voltage electron microscopy, operating at an accelerating voltage of about 300 kV, it is 2 Å. However, the disadvantage of high-voltage electron microscopy is the increase of the displacement damage to the object, that is, the displacement of the atoms in the object from their initial positions (Spence, 1999; Williams and Carter, 1996). Spherical aberration in the electron microscope is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. By using multipole lenses, Rose (1990) has developed a corrector which cancels out spherical aberration. Correction of the spherical aberration is applied to both CTEM (Haider et al., 1998) and scanning transmission electron microscopy (STEM) (Batson, Dellby, and Krivanek, 2002). In the STEM mode, an electron probe is formed, which scans in a raster over the object. At present, one of the main difficulties of the spherical aberration corrector is the complicated procedure for the alignment of the large number of electrostatic and magnetic optical elements (Spence, 1999). In high-angle annular dark-field scanning transmission electron microscopy (HAADF STEM), one of the STEM variants, mainly inelastically scattered electrons are detected.
The elastically scattered electrons are eliminated from detection. Here, the directly interpretable resolution is enhanced (Nellist and Pennycook, 2000), although at the expense of a significant loss of imaging electrons. The latter three possibilities, focal-series reconstruction, off-axis holography, and correction of the chromatic aberration in the electron microscope, are used in CTEM mode. In CTEM, one has, apart from the point resolution, another resolution measure, the so-called information limit. The information limit represents the smallest detail that can be resolved by using image processing techniques. It is inversely proportional to the highest spatial frequency that is still transferred with appreciable intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O’Keefe, 1992). In intermediate voltage electron microscopy, the information limit is usually smaller than the point resolution.
Focal-series reconstruction and off-axis holography push the directly interpretable resolution down to the information limit. This is done by retrieving the exit wave, that is, the complex electron wave function at the exit plane of the object. Ideally, the exit wave is free from any imaging artifacts, which means that the visual interpretability of the reconstruction is enhanced considerably for thin objects when compared to the original experimental images. Today, the information limit of CTEM is slightly below 1 Å for electron microscopes equipped with a field emission gun as electron source (Spence, 1999; Kisielowski, Hetherington, Wang, Kilaas, O’Keefe, and Thust, 2001; O’Keefe et al., 2001). The focal-series reconstruction method reconstructs the exit wave from a series of images collected at different defocus values (Coene et al., 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; Thust, Overwijk, Coene, and Lentzen, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck, and Coene, 1993). Off-axis holography (Lichte, 1991) is based on the original idea of Gabor (1948), where the exit wave is retrieved from the interference between the object wave and a reference wave. The dominant factor governing the information limit is generally the chromatic aberration. It results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron. By use of a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999), the chromatic aberration is reduced and hence the information limit is improved. The chromatic aberration corrector is at the conceptual stage, while the monochromator is already used in practice. However, by use of a monochromator the enhancement of the information limit is reached at the expense of a loss of the incident electron dose.
The qualitative methods presented in this section nowadays result in a resolution of 1 Å. Other methods to obtain this resolution exist as well, but they will not be treated in this article.

B. Quantitative Atomic Resolution Transmission Electron Microscopy

One ångstrom resolution is convenient for atomic resolution, but insufficient for materials science of the future, which will require precision rather than resolution (Cahn, 2001). One is inclined to think that a precision of 0.01 Å requires a resolution of 0.01 Å, which is far beyond the present possibilities. However, resolution and precision are quite different things. On the one hand, resolution expresses the ability to visualize separately adjacent atom columns in an image. On the other hand, precision corresponds to the
variance, or the square root of the variance, the standard deviation, with which structure parameters can be estimated. In this study, the most important parameters are the projected atom column positions since nanomaterials are usually crystals containing defects in their columnar structure. In order to attain 0.01 to 0.1 Å precision, quantitative atomic resolution TEM is needed. Its goal is to estimate structure parameters of an object as precisely as possible from the observations. Estimation of the structure parameters requires an expectation model of the observations. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts detected, for example, with a charge-coupled device (CCD) camera. It describes, for instance, the expected number of electrons per pixel in the two-dimensional projected image of Figure 1. The expectation model is given by a function, which describes the electron-object interaction, the transfer in the microscope, and the image detection. Nowadays, these processes are sufficiently well understood to make the derivation of an expectation model possible and several commercial software packages for atomic resolution TEM image simulations are available (Kilaas and Gronsky, 1983; Stadelmann, 1987). The parameters of the expectation model are structure parameters as well as microscope settings, characterizing the object under study and the microscope, respectively. In the derivation of this model, the object is described by the assembly of electrostatic potentials of the constituting atoms. Since the electrostatic potential is known for each atom type, the structure parameters reduce to atom numbers, atom positions, object thickness, orientation of the object with respect to the incident electron beam, and the Debye-Waller factor, which accounts for vibrations of the atoms at a given temperature (Wang, 2001). Then, the exit wave, resulting from the electron-object interaction, can be derived.
An all-embracing solution for this exit wave has not yet been found. Different routes to achieve this goal are currently being investigated. Proposed solutions are given by, for example, the weak phase object (Buseck, Cowley and Eyring, 1988), the multislice (Cowley and Moodie, 1957), and the Bloch wave theory (Hirsch et al., 1965; Howie, 1970; Kambe, Lehmpfuhl, and Fujimoto, 1974). A remarkable solution is given by the channelling theory (Geuens and van Dyck, 2002; Howie, 1966; Op de Beeck and van Dyck, 1996; Pennycook and Jesson, 1991; Sinkler and Marks, 1999; van Dyck and Chen, 1999a; van Dyck et al., 1989). It requires advanced knowledge of quantum mechanics. The channelling theory proposes a solution for the exit wave that is simple, albeit approximate, and that is in closed analytical form, so that the projected structure of the object may be obtained relatively easily from it. The theory is applicable if the object is oriented along a main zone axis. In this orientation, the atoms
are superimposed along a column, hence the name atom column. It can then be shown that the electrons are trapped in the positive potential of these columns. Each atom column, in a sense, acts as a channel for the electrons. If the distance between adjacent columns is not too small, a one-to-one correspondence between the exit wave and the object structure is established. From the channelling theory an analytical expression for the exit wave can be derived, which is parametric in the projected atom column positions, the atom numbers of the atoms along a column, the distance between successive atoms along a column, and the Debye-Waller factor (van Dyck and Chen, 1999a). As already mentioned, one may expect to obtain projected information only. Ambiguity about the types and distance of atoms along a column may only be removed by combining information from different zone axis orientations (van Dyck and Chen, 1999b). Furthermore, the transfer in the microscope and the image detection, which are also described by the expectation model, are characterized by a collection of microscope settings, such as the defocus value, the spherical aberration constant of the objective lens, the accelerating voltage, and the pixel size of the camera. The structure parameters or microscope settings of the expectation model are either known beforehand with sufficient accuracy and precision or not, in which case they have to be estimated from the experiment by means of statistical parameter estimation techniques. This is done by adapting the expectation model to the experimentally obtained observations with respect to the unknown parameters using a criterion of goodness of fit, such as the least squares sum or the likelihood function (Saxton, 1997). The set of parameters for which this criterion is optimum corresponds to the estimates.
In a sense, in quantitative atomic resolution TEM, one is looking for the optimal value of a criterion in a parameter space whose dimension is equal to the number of parameters to be estimated. This search for the global optimum of the criterion of goodness of fit is carried out by means of an iterative numerical optimization method (Möbus et al., 1997). An overview of such methods can be found in Murray (1972) and van den Bos (1982). Generally, the dimension of the parameter space is high. Consequently, it is quite possible that the optimization procedure ends up at a local optimum instead of at the global optimum of the criterion of goodness of fit, so that the wrong structure is suggested. To solve this dimensionality problem, that is, to find a pathway to the global optimum in the parameter space, a good starting structure is required (van Dyck et al., 2003). Finding such a starting structure is not trivial, since, due to two scrambling processes, details in the images do not necessarily correspond to features in the atomic structure. The first scrambling process is the dynamic scattering of the electrons on their way through the object. The second scrambling process is the transfer
in the electron microscope. Imaging lenses are not perfect, but have aberrations, such as spherical and chromatic aberration. As a consequence, the structure information of the object may be strongly delocalized. Additionally, the images are always disturbed by noise, that is, fluctuations in the observations, which further complicates direct interpretation. However, it has been shown that good starting structures can be found by using the qualitative methods described before. For example, focal-series reconstruction methods in a sense invert, or equivalently, undo, the effect of lens aberrations. Consequently, the exit wave thus obtained is much more closely related to the object structure, providing a directly interpretable resolution close to the information limit, which just surpasses the limit beyond which individual atom columns can be discriminated (Kisielowski, Hetherington, Wang, Kilaas, O’Keefe and Thust, 2001; Thust and Jia, 2000; Zandbergen and van Dyck, 2000). Focal-series reconstruction methods thus yield an approximate structure that may be used as a starting point in a final numerical optimization procedure by adapting the expectation model to the original observations. The starting structure obtained may still be insufficiently close to the global optimum of the criterion of goodness of fit to guarantee convergence. In order to find a better starting structure, one also has to undo the first scrambling process mentioned, that is, the dynamic scattering of the electrons on their way through the object. Undoing the dynamic scattering is possible by means of the channelling theory. Adapting the analytical expression for the exit wave to the reconstructed exit wave with respect to the structure parameters provides the experimenter with an approximate structure that can then be used as an improved starting point for a final numerical optimization procedure by adapting the expectation model to the original images.
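The role of the starting structure may be illustrated with a toy problem. The following Python sketch is not taken from the works cited above. It assumes hypothetical noise-free one-dimensional observations consisting of a weak and a strong Gaussian peak, a single-peak model with one unknown position `beta`, the least squares sum as criterion of goodness of fit, and a deliberately simple downhill search standing in for a full numerical optimization method. All names and numbers are illustrative.

```python
import math

def peak(x, position, height, width=0.5):
    """Gaussian peak used for both the observations and the model."""
    return height * math.exp(-0.5 * ((x - position) / width) ** 2)

# Measurement points and hypothetical noise-free observations:
# a weak peak at -2.0 and a strong peak at +2.0.
xs = [-5.0 + 0.05 * i for i in range(201)]
observations = [peak(x, -2.0, 30.0) + peak(x, 2.0, 100.0) for x in xs]

def least_squares(beta):
    """Least squares sum between the observations and a one-peak model at beta."""
    return sum((w - peak(x, beta, 100.0)) ** 2 for x, w in zip(xs, observations))

def local_search(beta_start, step=0.05):
    """Move downhill in fixed steps until neither neighbour improves the criterion."""
    beta = beta_start
    current = least_squares(beta)
    while True:
        best_value, best_beta = min(
            (least_squares(b), b) for b in (beta - step, beta + step)
        )
        if best_value >= current:
            return beta
        beta, current = best_beta, best_value

for start in (-2.5, 1.5):
    found = local_search(start)
    print(f"start {start:+.2f} -> position {found:+.2f}, "
          f"criterion {least_squares(found):.1f}")
```

In this sketch, a search started at -2.5 stops at the weak peak near -2.0, which is only a local optimum, whereas a search started at 1.5 reaches the global optimum near 2.0.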
C. Statistical Experimental Design
As mentioned before, with the resolution becoming sufficient to discriminate individual atom columns, a structure is characterized completely by the atom column positions, the atom numbers, the distance between successive atoms along a column, the object thickness, and orientation. Then, quantitative structure determination by means of quantitative atomic resolution TEM is a statistical parameter estimation problem, the image pixel values being the observations from which the parameters of interest have to be estimated. The precision with which these parameters can be estimated is only limited by the presence of noise. In this article, it will be shown that estimation of the unknown parameters may result in higher precisions if it is accompanied by statistical experimental design.
The procedure to derive the optimal statistical experimental design is as follows. Due to the inevitable presence of noise, the observations will always fluctuate randomly and are therefore modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function defines the expectations, that is, the mean value of each observation, as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, which is parametric in the quantities to be estimated. Given the joint probability density function, use of the concept of Fisher information allows one to determine the attainable precision, that is, the lowest possible variance, with which a parameter can be estimated unbiasedly from a set of observations assumed to obey a certain distribution (Frieden, 1998; van den Bos, 1982; van den Bos and den Dekker, 2001). Thus, it is possible to derive an expression for the lower bound on the variance with which the atom column positions can be estimated from a quantitative atomic resolution TEM experiment. This lower bound, which is called the Cramér-Rao Lower Bound (CRLB), is independent of the estimation method used, and therefore represents the intrinsic limit on precision. Moreover, it is a function of the microscope settings. This means that the CRLB varies with the microscope settings, of which at least some are adjustable. The optimal statistical experimental design of an atomic resolution TEM experiment is then given by the microscope settings that correspond to the lowest CRLB (van Aert, den Dekker, van den Bos and van Dyck, 2002b; van Dyck et al., 2002).
It is found by minimizing the CRLB with respect to the microscope settings, under the existing physical constraints, which are the radiation sensitivity of the object or the specimen drift. Note that the optimal statistical experimental design may be different for different objects under investigation. In this article, the use of statistical experimental design for quantitative atomic resolution TEM is demonstrated. To begin with, it is applied to CTEM, STEM, and electron tomography experiments, all described in a simplified way (van Aert, den Dekker, van Dyck and van den Bos, 2002a). The attainable precision with which position and distance parameters of one or two components can be estimated has been investigated. For CTEM and STEM, the components are two-dimensional and the observations are counting events in a two-dimensional pixel array, whereas for electron tomography, the components are three-dimensional and the observations are counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. The expectation models of the observations are assumed to be Gaussian peaks, although they are of a higher complexity in practice. Under this assumption,
the CRLB on the variance with which position and distance parameters of one or two components can be estimated, which is usually calculated numerically, may be given in closed analytical form. Although a simplified model has been used for the derivation of these expressions, they are very useful as rules of thumb to give insight into statistical experimental design for quantitative atomic resolution TEM. The rules of thumb clearly show the dependence of the attainable precision on the width of the point spread function, the width of the components, and the number of detected counts. For electron tomography, the attainable precision also depends on the orientation of the components with respect to the rotation axis. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. This result is meaningful in practice. For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, if, as in STEM, a narrower probe is accompanied by a decrease in the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. The optimal statistical experimental designs of CTEM and STEM experiments, assuming more complicated, physics-based expectation models instead of Gaussian peaks, are derived as well. These results are derived from the numerical minimization of the CRLB with respect to the microscope settings. The results thus obtained are intuitively interpreted using the rules of thumb for the CRLB, which are derived from Gaussian-peaked expectation models.
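The saturation effect described above may be made concrete with a small numerical sketch. It rests on assumptions that are not stated in the text: the image of a component is taken to be a Gaussian peak whose squared width is the sum of the squared widths of the point spread function and of the component, and the lower bound on the variance of the position estimate is taken to be this squared width divided by the number of detected counts. The function name and all numbers are hypothetical.

```python
import math

def position_std(psf_width, component_width, counts):
    """Rule-of-thumb lower bound on the standard deviation of a position estimate.

    Assumption (not from the text): the imaged peak is Gaussian with squared
    width psf_width**2 + component_width**2, and the CRLB on the variance of
    its position is that squared width divided by the number of counts.
    """
    image_width_sq = psf_width ** 2 + component_width ** 2
    return math.sqrt(image_width_sq / counts)

component_width = 0.5   # hypothetical intrinsic width of the component (angstrom)
counts = 10_000         # hypothetical number of detected counts

for psf_width in (1.0, 0.5, 0.2, 0.1, 0.05):
    sigma = position_std(psf_width, component_width, counts)
    print(f"PSF width {psf_width:4.2f} A -> sigma >= {sigma:.5f} A")
```

Under these assumptions, narrowing the point spread function from 1.0 to 0.5 Å helps noticeably, whereas narrowing it from 0.1 to 0.05 Å changes the bound by less than two percent; quadrupling the number of counts halves the bound at any width.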
First, for CTEM operating at an intermediate accelerating voltage of about 300 kV, it is shown that a spherical or a chromatic aberration corrector may improve the attainable precision. However, the gain, which depends on the object under study, usually turns out to be disappointing. Furthermore, a monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift (den Dekker et al., 2001), whereas it may slightly improve the precision if the experiment is limited by the radiation sensitivity of the object. For CTEM operating at a low accelerating voltage of about 50 kV, the attainable precision improves substantially by using both a spherical aberration corrector and either a chromatic aberration corrector or a monochromator. Next, for STEM, it is shown that the optimal probe is not the narrowest possible and that its optimal width strongly depends on the object under study. Moreover, an
annular detector usually results in a higher attainable precision than an axial one. Furthermore, as for CTEM, the precision that is gained using a spherical aberration corrector depends on the object under study, but this gain is generally only marginal (den Dekker, van Aert, van Dyck, and van den Bos, 2000; van Aert and van Dyck, 2001; van Aert, den Dekker, van Dyck, and van den Bos, 2000, 2002b). Also, it is shown that for both CTEM and STEM, the reduced brightness of the electron source is preferably as high as possible and the specimen holder as stable as possible, especially if the experiment is limited by specimen drift. The outline of the article is as follows. Section II introduces the basic principles of statistical experimental design. The attainable precision is proposed as a quantitative performance measure. It allows one to evaluate, to optimize, and to compare different experimental settings. In Section III, this process is illustrated for the estimation of position and distance parameters from CTEM, STEM, and electron tomography experiments, which are all described by simplified expectation models. In Section IV, the optimal statistical experimental design of CTEM experiments is derived from more complicated, physics-based expectation models. Special attention is paid to the spherical aberration corrector, the chromatic aberration corrector, and the monochromator. In Section V, the optimal statistical experimental design of STEM experiments is discussed. In particular, the optimal probe and detector configuration are determined. In Section VI, conclusions are drawn.
II. Basic Principles of Statistical Experimental Design
A. Introduction
In this section, the basic principles of statistical experimental design will be introduced. These principles may be applied to set up experiments in many branches of science, from elementary particle physics to astronomy. In these experiments, the measurement of any unknown parameter, such as the position of a star, the concentration of chemical elements, or the decay constant in a radioactive decay process, always takes place in the presence of fluctuations in the observations. As a result of these fluctuations, the precision with which the parameters can be measured is limited. The purpose of statistical experimental design is to derive the experimental design, that is, a particular choice of experimental settings, resulting in the highest precision. This so-called optimal statistical experimental design can be derived by applying the apparatus of mathematical statistics
straightforwardly. Hence, statistical experimental design is a powerful method, which can replace conventional methods that were, or still are, used to optimize the experimental design. These conventional methods are based on the intuition of the experimenter. However, intuition might be very misleading, especially in combination with the increasing complexity of today’s experiments. Instead, statistical experimental design is needed. In the remainder of this article, it will be used to optimize the experimental design of quantitative atomic resolution TEM experiments in terms of the precision with which the atom positions can be estimated. To begin with, a simple definition of an experiment must be given. In principle, it can be defined as the way of collecting and analyzing a set of observations for a given purpose. From a statistician’s point of view, this purpose is to measure unknown parameters as precisely as possible. This allows the experimenter to draw reliable conclusions from his or her experiment. The vital importance of precise measurement as a path to understanding was already recognized in 1883 by William Thomson, Lord Kelvin, the famous Scottish physicist. One of his much-quoted utterances in a lecture to civil engineers in London is the following one (Cahn, 2001):
I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your own thoughts, advanced to the state of science.
William Thomson, Lord Kelvin
In order to be able to measure unknown parameters as precisely as possible, the analysis of the observations is based on the use of parameter estimation techniques. In other words, an estimator, which estimates the parameters from the observations, is chosen. Since an estimator is a function of the observations, the precision of the chosen estimator depends on the way the observations are collected. Often, these observations may be collected under a large variety of experimental designs. Given the purpose of an experiment, the optimal experimental design is given by the experimental settings resulting in the highest precision of the unknown parameters. The definition of an experiment may be illustrated by means of an example. A quantitative atomic resolution TEM experiment may be regarded as a set of observations, that is, electron counting results made, for example, with a CCD camera, from which the structure of the object under study, the atom positions in particular, has to be estimated as precisely as possible. In TEM, these observations may be collected by choosing, for example, defocus and aperture, and by choosing between different imaging modes, such as conventional transmission electron microscopy (CTEM) and scanning transmission electron microscopy (STEM). This offers electron
microscopists the possibility to choose the electron microscope settings in accordance with the optimal experimental design so as to estimate unknown parameters as precisely as possible. The optimization of the experimental design consists of different steps. First, a parametric statistical model of the observations has to be chosen. Since the observations fluctuate randomly about their expectations, due to the inevitable presence of noise, they are modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function of the observations defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, that is, a physical model containing the parameters to be estimated. For example, in a radioactive decay process, the expectation model is a multi-exponential function, where the parameters are the decay constants. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts. It is given by a function that describes the electron-object interaction, the transfer in the microscope, and the image detection. The parameters of the expectation model are, for example, the projected atom or atom column positions, the object thickness, and the atom numbers. Parameters of this kind usually have a clear physical meaning. Hence, the specification of the parametric statistical model of the observations needs a solid physical base. Second, the optimality criterion that will be used to optimize the experimental design has to be specified. The choice of this criterion depends on the purpose of the experiment, which is to estimate the unknown parameters of the expectation model as precisely as possible.
Hence, the optimality criterion to be preferred is the precision of the parameter estimates. Therefore, the precision has to be adequately quantified. This can be done using statistical parameter estimation theory. From the parametric statistical model of the observations, the attainable statistical precision can be determined, that is, the lower bound on the variance with which the parameters can be estimated without bias from the observations (van den Bos, 1982; van den Bos and den Dekker, 2001). The meaning of this so-called CRLB is as follows. Generally, one may use different estimators in order to estimate parameters. An estimator is a function of the observations that is used to compute the parameters. Thus, an estimator is, like the observations, a stochastic variable. It is said to be unbiased if its expectation is equal to the true value of the parameter. Stated differently, an unbiased estimator has no systematic error. Moreover, different estimators will have different precisions. The precision of an estimator is represented by its variance or by its standard deviation, which is the square root of the
variance. It can be shown that the variance of unbiased estimators will never be lower than the CRLB. There exists a class of estimators, including the maximum likelihood estimator, that achieves the CRLB asymptotically, that is, for an increasing number of observations. The existence of the maximum likelihood estimator justifies the choice of the CRLB as optimality criterion. The CRLB is a function of the experimental settings. Thus, the lower bound on the variance of each individual, unknown parameter of the expectation model could be computed and minimized as a function of the experimental settings. However, simultaneous minimization of the set of lower bounds corresponding to the entire set of unknown parameters is usually impossible. Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the set of lower bounds. These are scalar measures, and the experimenter has to choose one of them or has to produce a criterion himself or herself, reflecting his or her specific purpose. For an electron microscopist, a specific purpose might be to measure the atom column positions as precisely as possible, irrespective of the precision of the object thickness or of the atom numbers. Thus, a possible optimality criterion is the sum of the lower bounds on the variance of the position coordinates. Generally, the choice of the optimality criterion requires detailed knowledge from experts in the scientific field. Finally, the optimality criterion chosen has to be optimized with respect to the experimental settings. This produces the optimal statistical experimental design. Usually, this is a nonlinear optimization problem for which the optimal value of the criterion has to be found numerically. This optimization is subject to the relevant physical constraints. For atomic resolution TEM, these constraints are the radiation sensitivity of the object under study or the specimen drift.
Therefore, the incident electron dose per square ångström or the recording time has to be kept within the constraints. This concludes the introduction to the basic principles of statistical experimental design. For an extended introduction to statistical experimental design and the different steps encountered in the optimization, the reader is referred to Fedorov (1972) and Pázman (1986). The remainder of this section is organized as follows. In Section II.B, parametric statistical models of observations will be discussed. In Section II.C, it will be shown how an adequate expression for the attainable statistical precision of the parameter estimates, that is, the CRLB, can be derived from such a parametric statistical model. The presented optimality criteria are functions of the attainable precisions. In Section II.D, the maximum likelihood estimator of the parameters will be derived from the parametric statistical model of observations. This estimator attains the CRLB asymptotically and, hence, justifies the choice of the optimality criteria. Section II.E consists of conclusions.
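The three steps described in this introduction (statistical model, optimality criterion, constrained optimization) may be sketched for a deliberately artificial case. The following Python fragment is an illustration only. It assumes a hypothetical Gaussian image peak whose squared width is the sum of the squared probe and component widths, a position CRLB equal to that squared width divided by the number of counts, and a number of detected counts that grows linearly with the probe width. The latter relation is purely hypothetical and merely stands in for a physical constraint. The criterion is the sum of the lower bounds on the variances of the two position coordinates.

```python
def summed_crlb(probe_width, component_width, counts_per_unit_width):
    """Sum of the position CRLBs on the x and y coordinates of one column.

    Hypothetical model: Gaussian image peak with squared width
    probe_width**2 + component_width**2, and a number of detected counts
    proportional to the probe width (a stand-in for a dose constraint).
    """
    counts = counts_per_unit_width * probe_width
    crlb_one_coordinate = (probe_width ** 2 + component_width ** 2) / counts
    return 2.0 * crlb_one_coordinate

component_width = 0.5            # hypothetical, in angstrom
counts_per_unit_width = 20_000   # hypothetical, counts per angstrom of probe width

# Candidate experimental settings: probe widths 0.05, 0.10, ..., 2.00 angstrom.
candidates = [0.05 * i for i in range(1, 41)]

# Optimal design: the setting that minimizes the chosen optimality criterion.
optimal_width = min(
    candidates,
    key=lambda w: summed_crlb(w, component_width, counts_per_unit_width),
)
print(optimal_width, summed_crlb(optimal_width, component_width, counts_per_unit_width))
```

In this toy model the criterion is minimized when the probe width equals the component width, so the narrowest probe is not the optimal one. A realistic design study would replace the closed-form criterion by a CRLB computed from a physics-based expectation model.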
B. Parametric Statistical Models of Observations
In this section, parametric statistical models of observations will be introduced. Specifically, they will be used to model electron microscopical observations. Any experimenter will readily admit that his or her observations ‘contain errors’. With a view to statistical experimental design, these errors must be specified. Generally, due to the inevitable presence of noise, sets of observations made under the same conditions nevertheless differ from experiment to experiment. The usual way to describe this behaviour is to model the observations as stochastic variables. The reason is that there is no viable alternative and that it has been found to work (van den Bos, 1999; van den Bos and den Dekker, 2001). By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. Consider a set of stochastic observations $w_m$, $m = 1, \ldots, M$, made at the measurement points $x_1, \ldots, x_M$. These measurement points are assumed to be exactly known. In CTEM, the observations are, for example, electron counting results made at the pixels of a CCD camera, where $M$ represents the total number of pixels. Then, the $M \times 1$ vector $w$ defined as
$$w = (w_1 \ldots w_M)^T \tag{1}$$
is the column vector of these observations. It represents a point in the $M$-dimensional Euclidean space having $w_1, \ldots, w_M$ as coordinates. This space will be called the space of observations (van den Bos and den Dekker, 2001). The expectations of the observations, that is, the mean values of the observations, are defined by their probability density function. The vector of expectations
$$E[w] = (E[w_1] \ldots E[w_M])^T \tag{2}$$
is also a point in the space of observations and the observations are distributed about this point. The symbol $E[\cdot]$ denotes the expectation operator. In this article, the expectations of the observations are described by the expectation model, that is, a physical model, which contains the unknown parameters to be estimated, such as the position coordinates of the projected atoms or atom columns. The unknown parameters are represented by the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$. Thus, it is supposed that the expectation of the $m$th observation is described by
$$E[w_m] = f_m(\theta) = f(x_m; \theta), \tag{3}$$
where $f_m(\theta)$ represents the expectation model, which is evaluated at the measurement point $x_m$ and which depends on the parameter vector $\theta$. Apart
from the unknown parameters $\theta$, the expectation model contains known parameters and experimental settings as well. For example, in quantitative atomic resolution TEM, the expectation model is sometimes described as
$$f_{kl}(\theta) = \frac{N}{I_{\mathrm{norm}}} \left| \psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s) \right|^2, \tag{4}$$
where $N$ represents the total number of detected electrons in an image, the function $\psi(r_{kl}; \theta)$ describes an object consisting of $n_c$ atom columns with $r_{kl} = (x_k \; y_l)^T$ the position of the pixel $(k, l)$ and with the parameter vector $\theta = (\beta_{x_1} \ldots \beta_{x_{n_c}} \; \beta_{y_1} \ldots \beta_{y_{n_c}})^T$ containing the positions of the atom columns, $t(r; \varepsilon, C_s)$ represents the point spread function of the electron microscope depending on microscope settings such as the spherical aberration constant $C_s$ and the defocus $\varepsilon$, and $I_{\mathrm{norm}}$ represents a normalization factor so that the integral of the function $\left| \psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s) \right|^2 / I_{\mathrm{norm}}$ is equal to one. Models like Eq. (4) will be derived and explained in detail in the remainder of this article. Electron microscopical observations are electron counting results detected, for example, with a CCD camera. Under the assumption that the quantum efficiency of this detector is large enough to detect single electrons, these observations are binomially distributed. This means that the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)
$$\binom{N}{\omega_m} p_m^{\omega_m} (1 - p_m)^{N - \omega_m} \tag{5}$$
with $N$ the total number of detected electrons, $p_m$ the probability that a single electron hits the pixel at the position $x_m$, and
$$\binom{N}{\omega_m} = \frac{N!}{\omega_m! \, (N - \omega_m)!}. \tag{6}$$
For large $N$ and $p_m \ll 1$, which is a useful approximation for electron microscopical observations, the binomial distribution tends to a Poisson distribution (Bevington, 1969). Therefore, the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)
$$\frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m), \tag{7}$$
where the parameter $\lambda_m = N p_m$ is equal to the expectation of the observation $w_m$, which, in its turn, is described by the expectation model, given by Eq. (3):
$$E[w_m] = \lambda_m = f_m(\theta). \tag{8}$$
The assumption that the observations are Poisson distributed is usually made in electron microscopy (see, for example, Herrmann, 1997). A property of the Poisson distribution is that the variance of the observation $w_m$ is equal to $\lambda_m$:
$$\mathrm{var}(w_m) = \lambda_m. \tag{9}$$
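The step from the binomial distribution of Eq. (5) to the Poisson distribution of Eq. (7), and the property of Eq. (9), may be checked numerically. The following Python sketch uses hypothetical values for the number of electrons `N` and for the probability `p` that a single electron hits a given pixel; they are chosen for illustration only and do not describe a particular experiment.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k counts in a pixel according to Eq. (5)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of k counts according to Eq. (7), with lam = N * p."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Hypothetical values: N detected electrons, probability p per electron
# of hitting the pixel under consideration.
N = 10_000
p = 5e-4
lam = N * p   # expected number of counts in this pixel

# Largest difference between the two distributions over the relevant counts.
max_abs_diff = max(
    abs(binomial_pmf(k, N, p) - poisson_pmf(k, lam)) for k in range(31)
)

# Mean and variance of the Poisson distribution, summed over enough terms.
ks = range(60)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(max_abs_diff, mean, var)
```

For these values the two distributions differ only marginally, and the numerically summed mean and variance of the Poisson distribution both agree with `lam`, in line with Eqs. (8) and (9).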
Moreover, electron microscopical observations may be assumed to be statistically independent. Therefore, the probability $P(\omega; \theta)$ that a set of observations $w = (w_1 \ldots w_M)^T$ is equal to $\omega = (\omega_1 \ldots \omega_M)^T$ is equal to the product of all probabilities described by Eq. (7):
$$P(\omega; \theta) = \prod_{m=1}^{M} \frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m). \tag{10}$$
This function is called the joint probability density function of the observations. It represents the parametric statistical model of the observations. The parameters $\theta$ to be estimated enter $P(\omega; \theta)$ via $\lambda_m$. In Section II.C, the parameterized joint probability density function will be used to derive the CRLB, that is, an expression for the attainable precision with which the unknown parameters can be estimated unbiasedly from the observations. The presented optimality criteria, which may be used for the optimization of the experimental design, are functions of the attainable precisions. In Section II.D, the maximum likelihood estimator of the parameters is derived from the joint probability density function. This estimator actually achieves the CRLB asymptotically, that is, for the number of observations going to infinity.
C. Attainable Precision
In this section, it will first be shown how the joint probability density function can be used to determine the attainable precision, that is, the CRLB, which is a lower bound on the variance of any unbiased estimator. The CRLB is independent of any particular method of estimation. Next, optimality criteria, which are functions of the CRLB, are given. The CRLB depends on the experimental settings, that is, on the design. Hence, functions of the CRLB, such as the optimality criteria, also depend on the experimental settings. This means that they vary with the experimental settings, of which at least some are adjustable. The experimenter has to choose one of these criteria, depending on his or her purpose, and optimize it to find the corresponding optimal design.
1. The Cramér-Rao Lower Bound
In this section, the parameterized probability density function of the observations, which is derived in Section II.B, will be used to define the Fisher information matrix and to compute the CRLB on the variance of unbiased estimators of the parameters of the expectation model. The CRLB will also be extended to include unbiased estimators of vectors of functions of these parameters. For details of the CRLB, the reader is referred to Frieden (1998), van den Bos (1982), and van den Bos and den Dekker (2001). First, the Fisher information matrix $F$ with respect to the elements of the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$ is introduced. It is defined as the $T \times T$ matrix
$$F = -E\left[ \frac{\partial^2 \ln P(\omega; \theta)}{\partial\theta \, \partial\theta^T} \right], \tag{11}$$
where $P(\omega; \theta)$ is the joint probability density function of the observations $w = (w_1 \ldots w_M)^T$. The expression between square brackets represents the Hessian matrix of $\ln P$, for which the $(r, s)$th element is defined by $\partial^2 \ln P(\omega; \theta) / \partial\theta_r \partial\theta_s$. For electron microscopical observations, where $P(\omega; \theta)$ is given by Eq. (10), it follows from Eqs. (8), (10), and (11) that the $(r, s)$th element of $F$ is equal to:
$$F_{rs} = \sum_{m=1}^{M} \frac{1}{\lambda_m} \frac{\partial\lambda_m}{\partial\theta_r} \frac{\partial\lambda_m}{\partial\theta_s}. \tag{12}$$
Next, it can be shown that the covariance matrix $\mathrm{cov}(\hat{\theta})$ of any unbiased estimator $\hat{\theta}$ of $\theta$ satisfies:
$$\mathrm{cov}(\hat{\theta}) \geq F^{-1}. \tag{13}$$
This inequality expresses that the difference of the matrices $\mathrm{cov}(\hat{\theta})$ and $F^{-1}$ is positive semidefinite. Since the diagonal elements of $\mathrm{cov}(\hat{\theta})$ represent the variances of $\hat{\theta}_1, \ldots, \hat{\theta}_T$ and since the diagonal elements of a positive semidefinite matrix are nonnegative, these variances are larger than or equal to the corresponding diagonal elements of $F^{-1}$:
$$\mathrm{var}(\hat{\theta}_r) \geq [F^{-1}]_{rr}, \tag{14}$$
where $r = 1, \ldots, T$ and $[F^{-1}]_{rr}$ is the $(r, r)$th element of the inverse of the Fisher information matrix. In this sense, $F^{-1}$ represents a lower bound to the variances of all unbiased $\hat{\theta}$. The matrix $F^{-1}$ is called the CRLB on the variance of $\hat{\theta}$.
Finally, the CRLB can be extended to include unbiased estimators of vectors of functions of the parameters instead of the parameters proper. Let $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$ be such a vector and let $\hat g$ be an unbiased estimator of $g(\theta)$. Then, it can be shown that

$$\operatorname{cov}(\hat g) \geq \frac{\partial g}{\partial\theta^T}\,F^{-1}\,\frac{\partial g^T}{\partial\theta}, \tag{15}$$

where $\partial g/\partial\theta^T$ is the $C \times T$ Jacobian matrix defined by its $(r, s)$th element $\partial g_r/\partial\theta_s$ (van den Bos, 1982). The right-hand member of this inequality is the CRLB on the variance of $\hat g$.

It should be noticed that the CRLB may only be computed if the probability density function of the observations is known. At first sight, this seems to be a problem since the true parameters of the probability density function are unknown. Nevertheless, even if the CRLB is a function of the unknown parameters, it remains an extremely useful tool. For nominal values of the unknown parameters it enables one to quantify variances that might be achieved, to detect possibly strong covariances between parameter estimates and, as will be shown in this article, to optimize the experimental design (van den Bos, 1982). Moreover, the estimates obtained using an estimator that achieves the CRLB may be substituted for the true parameters in the expression for the CRLB so as to get a level of confidence to be attached to these estimates (den Dekker and van Aert, 2002).

In this section, it has been shown how from the joint probability density function, which is described in Section II.B, the elements of the Fisher information matrix may be calculated explicitly. From the latter, the CRLB on the variance of the parameters of the expectation model and on the variance of functions of these parameters may be computed from the right-hand member of Eqs. (13) and (15), respectively. The diagonal elements of the CRLB give a lower bound on the variance of any unbiased estimator of the parameters. Since the joint probability density function is a function of the experimental settings, the CRLB is a function of these settings as well. Therefore, the CRLB may be used to evaluate and to optimize the experimental design in terms of the precision. However, simultaneous minimization of the diagonal elements of the CRLB, that is, the right-hand members of Eq. (14), is usually impossible.

Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the elements of the CRLB. These are scalar measures. The experimenter may choose one of these provided criteria or may produce a criterion him or herself, reflecting his or her purpose. A selection of criteria, which are provided in the literature, is given in the following section.
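As an illustration of Eqs. (12), (13), and (15), the following Python sketch computes the Fisher information matrix for independent Poisson-distributed counts and the corresponding CRLB. It is only a minimal sketch under stated assumptions: the function names (`fisher_poisson`, `crlb`, `crlb_of_functions`), the use of central finite differences for the derivatives $\partial\lambda_m/\partial\theta_r$, and the toy expectation model (a Gaussian peak of unknown height and position on a constant background) are hypothetical choices made for this example and do not come from the text.

```python
import numpy as np

def fisher_poisson(expectation, theta, step=1e-6):
    """Fisher information matrix of Eq. (12) for independent Poisson counts.

    `expectation(theta)` must return the expectations lambda_m as a 1-D array.
    The derivatives d(lambda_m)/d(theta_r) are approximated by central
    differences (an assumption of this sketch, not part of the theory).
    """
    theta = np.asarray(theta, dtype=float)
    lam = expectation(theta)
    jac = np.empty((lam.size, theta.size))
    for r in range(theta.size):
        h = np.zeros_like(theta)
        h[r] = step * max(1.0, abs(theta[r]))
        jac[:, r] = (expectation(theta + h) - expectation(theta - h)) / (2 * h[r])
    # F_rs = sum_m (1 / lambda_m) * dlambda_m/dtheta_r * dlambda_m/dtheta_s
    return jac.T @ (jac / lam[:, None])

def crlb(fisher):
    """CRLB on the covariance of unbiased estimators of theta, Eq. (13)."""
    return np.linalg.inv(fisher)

def crlb_of_functions(fisher, jac_g):
    """CRLB for functions g(theta), Eq. (15); jac_g is the C x T Jacobian."""
    return jac_g @ np.linalg.inv(fisher) @ jac_g.T

# Hypothetical toy model: peak height and position unknown, background known.
x = np.linspace(-5.0, 5.0, 201)

def expectation(theta):
    height, position = theta
    return height * np.exp(-(x - position) ** 2 / 2.0) + 1.0

F = fisher_poisson(expectation, [100.0, 0.3])
bound = crlb(F)                       # lower bound on cov(theta_hat)
g_jac = np.array([[0.0, 1.0]])        # g(theta) = position only
bound_g = crlb_of_functions(F, g_jac)
```

With the Jacobian chosen here, `bound_g` reduces to the $(2, 2)$th element of `bound`, which is the bound of Eq. (14) for the position parameter.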
2. Precision Based Optimality Criteria

In this section, optimality criteria that may be used for the evaluation and optimization of the experimental design are discussed. These criteria are functions of the CRLB and depend, like the CRLB, on the experimental settings. Several criteria are found in the literature (Fedorov, 1972; Pázman, 1986). A selection of them is discussed here. A distinction between global and partial, or, equivalently, truncated, optimality criteria is made. Global criteria are used when all parameters, represented by the elements of the parameter vector $\theta$, are important. Partial or truncated criteria are used when only some parameters or some functions of the parameters are important. For atomic resolution TEM, partial criteria are needed if the electron microscopist is only interested in, for example, the positions of atom columns, the positions of light or heavy atom columns, the distance between particular atom columns, or the positions of the atoms of a certain atom type, whereas he or she is not so interested in the object thickness or the atom numbers. Examples of both types of criteria are given below.

a. Global Optimality Criteria

• A-optimality criterion. The A-optimality criterion is defined by the sum of the diagonal elements of the CRLB, that is, the trace of the CRLB:

$$\operatorname{tr} F^{-1}. \tag{16}$$

This criterion may be interpreted under the assumption that there exists an estimator with covariance matrix equal to the CRLB. Then, minimizing the A-optimality criterion corresponds to minimizing the sum of the variances of the estimates $\hat\theta_1, \ldots, \hat\theta_T$ of the parameters $\theta_1, \ldots, \theta_T$, without taking the correlation between these estimates into account. A geometric interpretation of this criterion may be given by considering the ellipsoid of concentration, which is a measure of the concentration of the distribution of the estimates about the true parameters. It is defined by the ellipsoid enclosing the true parameters $\theta$ such that a uniform distribution over the area bounded by the ellipsoid will have the same expectation and covariance matrix as the distribution of the estimates (Cramér, 1999; Mood, Graybill, and Boes, 1974). In Figure 2, the square root of the A-optimality criterion, $(\operatorname{tr} F^{-1})^{1/2}$, is shown on the ellipsoid of concentration for the special case of two unknown parameters. This figure is based on Fedorov (1972).

• D-optimality criterion. The D-optimality criterion is defined by the determinant of the CRLB:

$$\det F^{-1}. \tag{17}$$

A statistical interpretation of the D-optimality criterion may be given for the hypothetical estimator discussed before. Then, minimizing the D-optimality criterion corresponds to minimizing the volume of the ellipsoid of concentration, which is shown in Figure 2 for the special case of two parameters. The drawback of minimizing the D-optimality criterion is that in some cases the volume of the ellipsoid of concentration is small because it is ‘narrow but long’. This means that there is a linear combination of the parameters which is estimated with a very large variance under the corresponding optimal design.

• Minimax criterion in space of parameters. The minimax criterion in the space of parameters is defined by the maximum value of the diagonal elements of the CRLB:

$$\max_r\, [F^{-1}]_{rr}. \tag{18}$$

Minimizing this criterion corresponds to minimizing the largest variance of the estimate of the corresponding parameter. For example, in Figure 2, the square root of the criterion, given by Eq. (18), corresponds to $[F^{-1}]_{11}^{1/2}$.
Figure 2. Ellipsoid of concentration for two parameters. The geometric interpretation of the square root of the A-optimality criterion, minimax criterion in space of parameters, and E-optimality criterion is represented by $(\operatorname{tr} F^{-1})^{1/2}$, $[F^{-1}]_{11}^{1/2}$, and $1/\lambda_{\min}^{1/2}$, respectively. Minimizing the D-optimality criterion corresponds to minimizing the volume (the area in this example) of the ellipsoid of concentration. This figure is based on Fedorov (1972).
• E-optimality criterion. The E-optimality criterion is defined by the inverse of the minimum eigenvalue $\lambda_{\min}$ of the Fisher information matrix:

$$\frac{1}{\lambda_{\min}}. \tag{19}$$

In Figure 2, the square root of the E-optimality criterion is shown on the ellipsoid of concentration for the special case of two parameters.

• Linear optimality criteria. Linear optimality criteria are defined by criteria functions of the form

$$\operatorname{tr} W F^{-1}, \tag{20}$$

where $W$ is a positive definite $T \times T$ matrix. The A-optimality criterion corresponds to the particular case where $W$ is equal to the identity matrix. If $W$ is a diagonal matrix, Eq. (20) is equal to:

$$\sum_{r=1}^{T} W_{rr}\, [F^{-1}]_{rr}, \tag{21}$$

that is, a weighted sum of the variances.
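The global criteria of Eqs. (16) to (21) are all simple scalar functions of the Fisher information matrix, as the following Python sketch illustrates. The function name `optimality_criteria`, the dictionary keys, and the $2 \times 2$ example matrix are assumptions made for this illustration only.

```python
import numpy as np

def optimality_criteria(fisher, weights=None):
    """Global optimality criteria, Eqs. (16)-(21), for a given Fisher matrix.

    `weights` is the positive definite matrix W of Eq. (20); the identity is
    used when it is omitted, in which case "linear" equals the A-criterion.
    """
    crlb = np.linalg.inv(fisher)
    w = np.eye(fisher.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    return {
        "A": np.trace(crlb),                             # Eq. (16)
        "D": np.linalg.det(crlb),                        # Eq. (17)
        "minimax": np.max(np.diag(crlb)),                # Eq. (18)
        "E": 1.0 / np.min(np.linalg.eigvalsh(fisher)),   # Eq. (19)
        "linear": np.trace(w @ crlb),                    # Eq. (20)
    }

# Hypothetical Fisher information matrix for two parameters.
F = np.array([[4.0, 1.0],
              [1.0, 2.0]])
criteria = optimality_criteria(F)
```

In a design study, such a function would be evaluated for each candidate set of experimental settings, each giving its own Fisher information matrix, and the setting with the smallest value of the chosen criterion would be retained.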
b. Partial or Truncated Optimality Criteria

In principle, partial or truncated optimality criteria are analogous to global optimality criteria, but instead of the full CRLB, that is, $F^{-1}$, only a submatrix $F_S^{-1}$ of $F^{-1}$ is used. If only $\theta_1, \ldots, \theta_S$ of the entire collection of $T$ unknown parameters $\theta_1, \ldots, \theta_T$ are important, the submatrix to be used is defined as:

$$F_S^{-1} = \begin{pmatrix} [F^{-1}]_{11} & [F^{-1}]_{12} & \cdots & [F^{-1}]_{1S} \\ [F^{-1}]_{21} & [F^{-1}]_{22} & \cdots & [F^{-1}]_{2S} \\ \vdots & \vdots & \ddots & \vdots \\ [F^{-1}]_{S1} & [F^{-1}]_{S2} & \cdots & [F^{-1}]_{SS} \end{pmatrix}. \tag{22}$$

Then, for example, the partial D-optimality criterion is defined by the determinant of $F_S^{-1}$. Moreover, if only some functions of the parameters are important, the inverse of the right-hand member of inequality (15) has to be used.

The optimality criteria, which are presented in this section, are functions of the elements of the CRLB. Minimization of these criteria as a function of the experimental settings, under the relevant physical constraints, produces the optimal statistical experimental design. However, different optimality criteria will generally produce different optimal designs. The experimenter
has to choose one of them or has to produce a criterion him or herself depending on his or her purpose.

D. Maximum Likelihood Estimation

In this section, it is discussed how the maximum likelihood estimator of the parameters may be derived from the parameterized probability density function, which is discussed in Section II.B. This estimator is very important since it achieves the CRLB asymptotically, that is, for the number of observations going to infinity. Thus, it is asymptotically most precise and is therefore often used in practice. The maximum likelihood estimator is clearly discussed in van den Bos and den Dekker (2001). A summary is given here. The maximum likelihood method for estimation of the parameters consists of three steps:

1. The available observations $w = (w_1 \ldots w_M)^T$ are substituted for the corresponding independent variables $\omega = (\omega_1 \ldots \omega_M)^T$ in the probability density function, for example, in Eq. (10). Since the observations are numbers, the resulting expression depends only on the elements of the parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$.
2. The elements of $\theta = (\theta_1 \ldots \theta_T)^T$, which are the hypothetical true parameters, are considered to be variables. To express this, they are replaced by $t = (t_1 \ldots t_T)^T$. The logarithm of the resulting function, $\ln P(w; t)$, is called the log-likelihood function of the parameters $t$ for the observations $w$, which is denoted as $q(w; t)$.
3. The maximum likelihood estimates $\hat\theta_{ML}$ of the parameters $\theta$ are defined by the values of the elements of $t$ that maximize $q(w; t)$, or

$$\hat\theta_{ML} = \arg\max_t\, q(w; t). \tag{23}$$

The most important properties of the maximum likelihood estimator are the following ones:

• Consistency. Generally, an estimator is said to be consistent if the probability that an estimate deviates more than a specified amount from the true value of the parameter can be made arbitrarily small by increasing the number of observations used.
• Asymptotic normality. If the number of observations increases, the probability density function of a maximum likelihood estimator tends to a normal distribution.
• Asymptotic efficiency. The asymptotic covariance matrix of a maximum likelihood estimator is equal to the CRLB. In this sense, the maximum likelihood estimator is most precise.
• Invariance property. The maximum likelihood estimates $\hat g_{ML}$ of a vector of functions of the parameters $\theta$, that is, $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$, are equal to $g(\hat\theta_{ML}) = (g_1(\hat\theta_{ML}) \ldots g_C(\hat\theta_{ML}))^T$ (Mood, Graybill, and Boes, 1974).
In the remainder of this article, it will be checked if the maximum likelihood estimator attains the CRLB for atomic resolution TEM experiments. If so, the use of the optimality criteria given in Section II.C.2, which are functions of the elements of the CRLB, is justified.
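The three steps of the maximum likelihood method may be sketched in Python for independent Poisson-distributed counts. This is a hedged illustration rather than the procedure used in this article: the expectation model (one Gaussian peak of known width and known number of electrons, with unknown position), the simulated observations, the function names, and the plain grid search used as maximizer are all assumptions of the example. For Poisson counts, the log-likelihood equals $\sum_m (w_m \ln\lambda_m - \lambda_m)$ up to a term that does not depend on the parameters.

```python
import numpy as np

def expectation(t, x, rho=1.0, n_p=1000.0):
    """Hypothetical expectation model: one Gaussian peak at position t."""
    dx = x[1] - x[0]
    return n_p * dx * np.exp(-(x - t) ** 2 / (2 * rho**2)) / (np.sqrt(2 * np.pi) * rho)

def log_likelihood(w, lam):
    """Poisson log-likelihood q(w; t), omitting the t-independent term -sum(ln w_m!)."""
    return np.sum(w * np.log(lam) - lam)

def ml_estimate(w, x, grid):
    """Eq. (23) by grid search: the candidate t that maximizes q(w; t)."""
    values = [log_likelihood(w, expectation(t, x)) for t in grid]
    return grid[int(np.argmax(values))]

# Step 1: obtain observations (simulated here, with a fixed seed).
rng = np.random.default_rng(0)
x = np.linspace(-6.0, 6.0, 241)
true_position = 0.25
w = rng.poisson(expectation(true_position, x))

# Steps 2 and 3: treat the position as a variable t and maximize q(w; t).
grid = np.linspace(-1.0, 1.0, 2001)
estimate = ml_estimate(w, x, grid)
```

A grid search is chosen only because it is transparent; in practice a gradient-based or simplex optimizer would usually be preferred, especially for more than one parameter.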
E. Conclusions

In this section, it has been shown how to evaluate and to optimize the experimental design in terms of the precision with which unknown parameters can be estimated. The optimization consists of different steps, which may be summarized as follows:

1. The parametric statistical model of the observations is derived. This model defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The specification of this model requires a solid physical base.
2. The CRLB, which is a theoretical lower bound on the variance of the parameter estimates, is computed from the parametric statistical model of the observations. This lower bound represents the highest attainable precision. Since the parametric statistical model of the observations is a function of the experimental settings, the CRLB is a function of these settings as well.
3. An optimality criterion is chosen, reflecting the purpose of the experimenter. This criterion is a function of the elements of the CRLB, which, like the CRLB, depends on the experimental settings. Generally, different optimality criteria will produce different optimal experimental designs.
4. The criterion chosen is optimized with respect to the experimental settings. The settings corresponding to the optimum are suggested as the optimal statistical experimental design. This optimization procedure is subject to the physical constraints.

In the remainder of this article, this procedure will be applied to set up quantitative atomic resolution TEM experiments.
III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models
A. Introduction

In this section, the attainable precision with which position and distance parameters of one or two components can be estimated is computed for atomic resolution TEM experiments described by simplified models. In other words, an expression for the CRLB on the variance of position and distance estimates, which has been introduced in Section II, is derived for one-, two-, and three-dimensional components. Such expressions may be used to evaluate and optimize the experimental designs. For one- and two-dimensional components, the observations consist of counting events in a one- and two-dimensional pixel array, respectively. For three-dimensional components, they consist of counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. In principle, these examples may be considered as simulations of a wide variety of experiments. However, in the remainder of this article, the two-dimensional example will be regarded as a simplified simulation of a high-resolution CTEM or STEM experiment, whereas the three-dimensional example will be regarded as a simplified simulation of an electron tomography experiment.

Usually, the performance of such experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately components of a two-point image. One of the earliest and most famous criteria for two-point resolution is that of Rayleigh (1902). Criteria such as Rayleigh’s are suitable to set up qualitative atomic resolution TEM experiments. However, as already mentioned in Section I, a different optimality criterion is needed in the framework of quantitative atomic resolution TEM, where one has prior knowledge about the observations in the form of a parametric statistical model, describing the expectations of the observations as well as the fluctuations of the observations about these expectations.
Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In this section, the model describing the expectations of the observations, the expectation model, is assumed to consist of Gaussian peaks with unknown position. Under this assumption, it will be shown that the CRLB, which is usually calculated numerically, may be approximated by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of a higher complexity than a Gaussian peak, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. This will be
shown in the remainder of this article, where more complicated, physics based expectation models will be considered and where, consequently, the CRLB has to be calculated numerically. In the absence of rules of thumb for the attainable precision, it would be difficult, if not impossible, to understand these numerical results. In the author’s opinion, whenever possible, every numerical analysis should be preceded by a simplified analysis. This will provide a check of the numerical results.

In Section III.B, parametric statistical models of the observations are described. In Section III.C, the approximations of the CRLB, that is, the rules of thumb for the CRLB, are derived from these models. Section III.D consists of discussions and examples. In Section III.E, conclusions are drawn. Part of the results of this section has earlier been published in (van Aert, den Dekker, van Dyck, and van den Bos, 2002a).

B. Parametric Statistical Models of Observations

In this section, the pertinent parametric statistical models of the observations are described. In the remainder of this section, these models will be used for the derivation of expressions for the CRLB with which the position of one component or the distance between two components can be measured. The purpose is to find rules of thumb for the CRLB, that is, expressions that are easy to calculate and to interpret. In order to accomplish this, it will be assumed that the expectation models underlying the observations consist of Gaussian peaks with unknown position and known amplitude and width. In Sections III.B.1, 2, and 3, the expectation model is described for one-, two-, and three-dimensional observations, respectively.

1. One-Dimensional Observations

For one-dimensional observations, the normalized image intensity distribution is assumed to be given by:

$$f(x; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} F(x - \beta_{x_n}), \tag{24}$$

where $n_c$ is the total number of components, $\beta_{x_n}$ is the position of the $n$th component, and

$$F(x) = \frac{1}{\sqrt{2\pi}\,\rho} \exp\left(-\frac{x^2}{2\rho^2}\right) \tag{25}$$

with $\rho$ the width of the Gaussian peak, to which both the width of the component and the two-point resolution of the imaging instrument
contribute. The $n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{n_c})^T = (\beta_{x_1} \ldots \beta_{x_{n_c}})^T$. Suppose that the observations $w_k$, $k = 1, \ldots, K$ are made at equidistant pixels of size $\Delta x$ at the measurement points $x_k$. If $\Delta x$ is small compared to the width $\rho$ of the Gaussian peak, the probability $p_k(\beta)$ that an electron hits the pixel at the position $x_k$ is approximately given by:

$$p_k(\beta) = p(x_k; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} f(x; \beta)\,dx \approx f(x_k; \beta)\,\Delta x. \tag{26}$$

This means that the number of electrons expected to be found at this pixel is given by:

$$\lambda_k = n_c N_p\, p_k(\beta), \tag{27}$$

where $N_p$ is the total number of electrons in each Gaussian peak. Therefore, Eq. (27) describes the expectation model, which contains the parameters $\beta$.

2. Two-Dimensional Observations

For two-dimensional observations, two distinct expectation models are assumed, corresponding to the so-called dark-field and bright-field imaging mode in TEM. In dark-field imaging, the noninteracting electrons are eliminated from detection, whereas in bright-field imaging, these electrons contribute to the background intensity in the image. The expectation models for dark-field and bright-field imaging are approximated by a model consisting of Gaussian peaks without and with background, respectively, although they are of a higher complexity in practice.

a. Dark-Field Imaging

For dark-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$g_{DF}(x, y; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} G(x - \beta_{x_n},\, y - \beta_{y_n}), \tag{28}$$

where $\beta_{x_n}$ and $\beta_{y_n}$ are the x- and y-coordinate of the position of the $n$th component, respectively, and

$$G(x, y) = \frac{1}{2\pi\rho^2} \exp\left(-\frac{x^2 + y^2}{2\rho^2}\right), \tag{29}$$

with $\rho$ the width of the Gaussian peak. The $2n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{2n_c})^T = (\beta_{x_1} \ldots \beta_{x_{n_c}}\ \beta_{y_1} \ldots \beta_{y_{n_c}})^T$. For a two-dimensional object, the components are, for example, atoms or atom
columns in projection. In fact, Eq. (28) results from a two-dimensional convolution between an object function and the point spread function of the electron microscope. The intensity distribution of the identical components of the object as well as the point spread function $t(x, y)$ are assumed to be Gaussian with corresponding widths $\rho_C$ and $\rho_{EM}$, respectively. In this case

$$\rho^2 = \rho_C^2 + \rho_{EM}^2. \tag{30}$$
The observations $w_{kl}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$. The field of view (FOV), that is, the total area of detection, is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:

$$p_{kl}(\beta) = p(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} g_{DF}(x, y; \beta)\,dx\,dy \approx g_{DF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{31}$$

For a given total number of electrons $N_p$ in each Gaussian peak, the number of electrons expected to be found at this pixel is given by:

$$\lambda_{kl} = n_c N_p\, p_{kl}(\beta). \tag{32}$$
This equation describes the expectation model containing the parameters $\beta$.

b. Bright-Field Imaging

For bright-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$g_{BF}(x, y; \beta) = \frac{1 - n_c \Omega\, g_{DF}(x, y; \beta)}{\mathrm{FOV} - n_c \Omega}, \tag{33}$$

where $\Omega$ is a constant, representing the strength of the interaction of the electrons with one component, $g_{DF}(x, y; \beta)$ is described by Eq. (28), and FOV is the field of view. The term ‘1’ represents a constant background, corresponding to the noninteracting electrons, and the denominator $\mathrm{FOV} - n_c \Omega$ is a normalization constant. In what follows, the term ‘$n_c \Omega\, g_{DF}(x, y; \beta)$’ is assumed to be much smaller than the term ‘1’, which means that the number of interacting electrons is small compared to the number of noninteracting electrons. In analogy with dark-field imaging, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:
$$p_{kl}(\beta) \approx g_{BF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{34}$$
For a given total number of electrons $N$, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ is given by:

$$\lambda_{kl} = N p_{kl}(\beta). \tag{35}$$

This result defines the expectation model for bright-field imaging containing the parameters $\beta$.

3. Three-Dimensional Observations

The three-dimensional observations made at the three-dimensional object consist of a single-axis tilt series of two-dimensional projections recorded by an electron tomography experiment. These projections are obtained by recording two-dimensional images while tilting the object about a fixed axis. Other data collection geometries in electron tomography exist as well, such as conical and random-conical tilting (Frank, 1992). However, only single-axis tilting is considered here. It is assumed that the three-dimensional density distribution of the object is given by:

$$d(x, y, z; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} D(x - \beta_{x_n},\, y - \beta_{y_n},\, z - \beta_{z_n}), \tag{36}$$

where $\beta_{x_n}$, $\beta_{y_n}$, and $\beta_{z_n}$ are the x-, y-, and z-coordinate, respectively, of the position of the $n$th component with respect to a reference coordinate system and

$$D(x, y, z) = \frac{1}{(2\pi)^{3/2} \rho_C^3} \exp\left(-\frac{x^2 + y^2 + z^2}{2\rho_C^2}\right), \tag{37}$$

with $\rho_C$ the width of the identical components. The $3n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{3n_c})^T = (\beta_{x_1} \ldots \beta_{x_{n_c}}\ \beta_{y_1} \ldots \beta_{y_{n_c}}\ \beta_{z_1} \ldots \beta_{z_{n_c}})^T$. The components are, for example, atoms. Figure 3 shows the surface of the three-dimensional density distribution and the positions of two components. It will be assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam.

In the derivation of a rule of thumb for the attainable precision, it will be assumed that the tilt angles $\theta^j$, $j = 1, \ldots, J$ are equidistantly located on the interval $(-\pi/2, \pi/2)$. Although such a full angular range is rather unrealistic, it will be shown in Section III.D that the derived rules of thumb still provide insight for a limited angular range. At each tilt angle $\theta^j$, the position coordinates of the components $\beta^j = (\beta^j_{x_1} \ldots \beta^j_{x_{n_c}}\ \beta^j_{y_1} \ldots \beta^j_{y_{n_c}}\ \beta^j_{z_1} \ldots \beta^j_{z_{n_c}})^T$ with respect to the reference coordinate system are given by:
$$\begin{aligned} \beta^j_{x_n} &= \beta_{x_n} \cos\theta^j + \beta_{z_n} \sin\theta^j, \\ \beta^j_{y_n} &= \beta_{y_n}, \\ \beta^j_{z_n} &= \beta_{z_n} \cos\theta^j - \beta_{x_n} \sin\theta^j, \end{aligned} \tag{38}$$

for $n = 1, \ldots, n_c$.

Figure 3. Surface of the three-dimensional density distribution of an object consisting of two components. The position coordinates of the two components are represented by the elements of the parameter vector $\beta$. It has been assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam. Furthermore, $d$ is the distance between the two components, $d'$ is the distance between the components projected onto the (x, z)-plane, and $\phi$ is the angle between the rotation axis and the axis connecting both components. It should be mentioned that this is not the tilt angle.

The normalized image intensity distribution of a two-dimensional projection is equal to:

$$h^j(x, y; \beta) = \int d(x, y, z; \beta^j)\,dz * t(x, y) = g_{DF}(x, y; \varepsilon^j), \tag{39}$$

that is, the convolution of the projected density distribution and the point spread function of the electron microscope. The parameters $\varepsilon^j = (\beta^j_{x_1} \ldots \beta^j_{x_{n_c}}\ \beta^j_{y_1} \ldots \beta^j_{y_{n_c}})^T$ are the position coordinates of the components in this projection and the function $g_{DF}$ is given by Eq. (28). It follows from Eq. (39) that each projected image is assumed to be a two-dimensional
dark-field imaging experiment. However, for future research, it would be interesting to consider a bright-field imaging experiment as well, since this imaging mode is often used in practice.

The observations $w^j_{kl}$, $k = 1, \ldots, K$; $l = 1, \ldots, L$; $j = 1, \ldots, J$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$ at the tilt angles $\theta^j$. The FOV of each projection is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak, which is defined by Eq. (29), the probability $p^j_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is approximately given by:

$$p^j_{kl}(\beta) = p^j(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} h^j(x, y; \beta)\,dx\,dy \approx h^j(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{40}$$

It will be assumed that the total number of electrons $n_c N_p$ is equally distributed over the two-dimensional projections. In this case, the number of electrons at each projection is equal to $n_c N_p/J$, where $N_p/J$ represents the number of electrons in each projected Gaussian peak. Then, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is given by:

$$\lambda^j_{kl} = \frac{n_c N_p}{J}\, p^j_{kl}(\beta). \tag{41}$$

This result describes the expectation model containing the parameters $\beta$.

In Sections III.B.1, 2, and 3, expectation models have been given for one-, two-, and three-dimensional observations, respectively. These models describe the expected numbers of detected electron counts, that is, the expectations. Notice that for each expectation model, the components have been assumed to be identical. In future research, this may be extended to nonidentical components, representing, for example, objects consisting of different types of atoms. Moreover, it will be supposed that the observations, which fluctuate about the expectations, are statistically independent and have a Poisson distribution. Therefore, the joint probability density function of the observations $P(\omega; \beta)$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations $M$ is equal to $K$, $K \times L$, and $K \times L \times J$ for one-, two-, and three-dimensional observations, respectively. In Section III.C, the CRLB on the variance with which position and distance parameters can be estimated will be derived from the obtained parametric statistical models of the observations. Notice that for three-dimensional objects, the estimation of the position and distance parameters has to be interpreted as follows. The parameter estimates are obtained by adapting
the assembly of projected models, given by Eq. (41), to the experimental projected images with respect to the unknown parameters. This procedure is considered rather than adapting the three-dimensional model, such as that given by Eq. (36), to a three-dimensional reconstruction, which may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992). The reason why this alternative procedure is not considered is that the joint probability density function of the three-dimensional reconstruction is unknown. If the joint probability density function is unknown, the CRLB cannot be computed.

C. Approximations of the Cramér-Rao Lower Bound

In this section, rules of thumb will be derived for the highest attainable precision with which the position coordinates of an isolated component and the distance between two components can be measured. In other words, the exact expressions for the CRLB, following from Section II.C.1, will be approximated. This will be done for one-, two-, and three-dimensional objects, for which the parametric statistical models of the observations are described in Section III.B. Throughout this section, the words ‘isolated component’ and ‘two components’ should not be interpreted in their strict sense. Expressed in a simplified way, it means that neighboring components may be present as long as these components do not overlap with the one or two components considered. Expressed in a correct way, it means that the elements of the Fisher information matrix associated with a position coordinate of the one or two components considered and a position coordinate of a neighboring component are equal to zero. Hence, the Fisher information matrix and its inverse, the CRLB, are block diagonal. In the derivation of the approximations of the CRLB, only their submatrices need to be considered.

An interpretation of a block diagonal CRLB may easily be given for a (hypothetical) estimator with covariance matrix equal to the CRLB. Then, the zero-elements of the CRLB associated with two different position coordinates indicate that the estimates of these position coordinates are uncorrelated.

1. One-Dimensional Observations

For a one-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ of the position $\beta_x$ of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components may be directly obtained from the results presented in (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999). In this paper, the same expectation model as the one
described in Section III.B.1 was used, but the observations were assumed to be multinomially distributed instead of Poisson distributed. However, it can be shown that the expressions for the elements of the Fisher information matrix are equal under both assumptions and given by Eq. (12). Therefore, also the approximations of the CRLB are equal. For an isolated component, the CRLB on the variance s2bx of the position bx is approximated by: s2bx
r2 Np
ð42Þ
where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (25), and $N_p$ is the total number of electrons in this peak. The conditions for the validity of this approximation are that the pixel size $\Delta x$ is small compared to the width of the Gaussian peak and that the component is located for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected. For two components, under the same conditions, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{43}$$

and

$$\sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{44}$$

If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (42) and (44), it follows that, if the distance between two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, a (hypothetical) estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components, so that the variance of their difference is the sum of the two position variances. From Eqs. (42)–(44), it follows that the precision with which the position or the distance can be measured is a function of the total number of electrons $N_p$ in each component and the width $\rho$ of the Gaussian peaks. The precision may be improved, that is, $\sigma^2_{\beta_x}$ or $\sigma^2_d$ may be decreased, by increasing the number of electrons. The precision will also improve if the peaks are narrower. Moreover, if the distance becomes smaller than $\sqrt{2}\,\rho$, the lower bound on the standard deviation $\sigma_d$ of the distance increases inversely proportionally to the distance. In Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999), it has been shown that the approximations given in this section are useful rules of thumb.
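The rules of thumb of Eqs. (42)–(44) are simple enough to be evaluated directly. The following minimal sketch does so in Python; the function names and the numerical values are illustrative assumptions and are not part of the original derivation.

```python
import math


def position_variance_bound(rho: float, n_p: float) -> float:
    """Approximate CRLB on the position variance, Eq. (42): rho^2 / N_p."""
    return rho ** 2 / n_p


def distance_variance_bound(rho: float, n_p: float, d: float) -> float:
    """Approximate CRLB on the distance variance, Eqs. (43) and (44)."""
    if d <= math.sqrt(2.0) * rho:
        # Overlapping peaks: the bound on the variance grows as 1 / d^2.
        return 4.0 * rho ** 4 / (n_p * d ** 2)
    # Well-separated peaks: twice the bound for a single position.
    return 2.0 * rho ** 2 / n_p


# Illustrative values only (assumed for this sketch).
rho, n_p = 20.0, 15000.0
for d in (5.0, 10.0, math.sqrt(2.0) * rho, 60.0):
    sigma_d = math.sqrt(distance_variance_bound(rho, n_p, d))
    print(f"d = {d:6.2f}   lower bound on sigma_d = {sigma_d:.4f}")
```

Because both branches coincide at $d = \sqrt{2}\,\rho$, the choice of branch at the crossover is immaterial.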
2. Two-Dimensional Observations

For a two-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components will be derived for both dark-field and bright-field imaging, for which the expectation models are described in Section III.B.2. The derivations of these lower bounds are similar to those for the one-dimensional object. First, an isolated component is considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y)^T$. It will be assumed that this component is located for the most part within the FOV, which means that detection of only one half of an object is not considered. Moreover, the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is defined by Eq. (29). Under these assumptions, the (1, 1)th element of the Fisher information matrix $F$ associated with the position coordinates $\beta$ is approximately equal to its (2, 2)th element, that is, $F_{11} \approx F_{22}$. The reason for this is that the component has rotational symmetry. Furthermore, it follows from Eq. (12) that the Fisher information matrix $F$ is symmetric. Therefore, $F$ simplifies into:

$$F \approx \begin{pmatrix} F_{11} & F_{12} \\ F_{12} & F_{11} \end{pmatrix}. \tag{45}$$

From Eq. (14), it follows that the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is given by the corresponding diagonal element of $F^{-1}$:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11}. \tag{46}$$
The right-hand member of this equation will be calculated explicitly for dark-field as well as for bright-field imaging, resulting in Eqs. (65) and (68), respectively. Second, two components are considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$. It will be assumed that the components are located for the most part within the FOV and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak. Under these assumptions, it can be shown for the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta$ that $F_{11} \approx F_{22}$, $F_{33} \approx F_{44}$, $F_{24} \approx F_{13}$, and $F_{23} \approx F_{14}$. The reason for this is that the components are assumed to be identical and hence interchangeable. Furthermore, $F$ is symmetric. Therefore, $F$ simplifies into:

$$F \approx \begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} \\ F_{12} & F_{11} & F_{14} & F_{13} \\ F_{13} & F_{14} & F_{33} & F_{34} \\ F_{14} & F_{13} & F_{34} & F_{33} \end{pmatrix}. \tag{47}$$
The purpose is to find an expression for the CRLB on the variance $\sigma^2_d$ of the distance between two components. For a two-dimensional object, the distance is defined as:

$$d = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{y1} - \beta_{y2})^2}. \tag{48}$$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma^2_d$ follows directly from the right-hand member of inequality (15):

$$\sigma^2_d = \frac{\partial d}{\partial \beta^T}\, F^{-1}\, \frac{\partial d^T}{\partial \beta}, \tag{49}$$

where the Jacobian matrix $\partial d / \partial \beta^T$ is equal to

$$\frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{x2} - \beta_{x1} & \beta_{y1} - \beta_{y2} & \beta_{y2} - \beta_{y1} \end{pmatrix}. \tag{50}$$

Equation (49) may be written as:

$$\sigma^2_d \approx \frac{2}{d^2} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{y1} - \beta_{y2} \end{pmatrix} \begin{pmatrix} F_{11} - F_{12} & F_{13} - F_{14} \\ F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}^{-1} \begin{pmatrix} \beta_{x1} - \beta_{x2} \\ \beta_{y1} - \beta_{y2} \end{pmatrix}. \tag{51}$$
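The equivalence of Eqs. (49) and (51) for a Fisher information matrix with the structure of Eq. (47) can be checked numerically. The sketch below does this with the Python standard library only. The entries of the Fisher information matrix and the positions are arbitrary, hypothetical numbers chosen for illustration, and the helper functions `inverse` and `quadratic_form` are not taken from the original text.

```python
import math


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def quadratic_form(v, a):
    """Return v^T a v for a vector v and a square matrix a."""
    n = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))


# Hypothetical Fisher information entries with the structure of Eq. (47).
f11, f12, f13, f14, f33, f34 = 5.0, 1.0, 0.5, 0.2, 4.0, 0.8
fisher = [
    [f11, f12, f13, f14],
    [f12, f11, f14, f13],
    [f13, f14, f33, f34],
    [f14, f13, f34, f33],
]

# Hypothetical positions, ordered as (bx1, bx2, by1, by2).
bx1, bx2, by1, by2 = 0.0, 3.0, 0.0, 4.0
dx, dy = bx1 - bx2, by1 - by2
d = math.hypot(dx, dy)                                    # Eq. (48)

jacobian = [dx / d, -dx / d, dy / d, -dy / d]              # Eq. (50)
var_eq49 = quadratic_form(jacobian, inverse(fisher))       # Eq. (49)

reduced = [[f11 - f12, f13 - f14], [f13 - f14, f33 - f34]]
var_eq51 = 2.0 / d ** 2 * quadratic_form([dx, dy], inverse(reduced))  # Eq. (51)

print(var_eq49, var_eq51)
```

For a matrix that has the structure of Eq. (47) exactly, both expressions agree to rounding error; for a real Fisher information matrix the agreement is only approximate, as indicated by the approximation sign in Eq. (51).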
The derivation of Eq. (51) is given below. This derivation may be skipped on a first reading without losing the thread of this article.

Derivation of Equation (51). The derivation of Eq. (51) is based on the fact that the $T \times T$ Fisher information matrix $F$ may easily be transformed into a block diagonal matrix $F_D$ if $F$ is invariant under a transformation of the parameters $\beta$ to $M\beta$, where the $T \times T$ matrix $M$ represents a symmetry operation. This supposition will first be proven. The condition that $F$ is invariant under a symmetry operation $M$ is mathematically written as:

$$F = M^T F M, \tag{52}$$

where the matrix $M$ has the property

$$M^n = I \tag{53}$$

with $n$ an integer and $I$ the identity matrix. Next, suppose that the eigenvectors and eigenvalues of $M$ are represented by the columns $Y_i$, $i = 1, \ldots, T$, of the $T \times T$ matrix $V$ and the elements $\lambda_i$, $i = 1, \ldots, T$, of the $T \times T$ diagonal matrix $\Lambda$, respectively, or equivalently, in symbols:

$$MV = V\Lambda. \tag{54}$$

Then, it follows from Eq. (54) that

$$M^n V = V \Lambda^n. \tag{55}$$

Furthermore, it follows from Eq. (53) that

$$M^n V = V. \tag{56}$$

Combining Eqs. (55) and (56) results in:

$$\Lambda^n = I. \tag{57}$$
This means that the eigenvalues of $M$ are equal to $\exp(i 2\pi r / n)$, with $r = 0, 1, \ldots, n - 1$. Since the dimension $T$ of the Fisher information matrix $F$ is usually larger than $n$, these eigenvalues are degenerate. From Eqs. (52) and (54), it follows that:

$$V^T F V = \Lambda^T V^T F V \Lambda. \tag{58}$$

The notation $F_D$ will be used to indicate the matrix $V^T F V$. It will now be shown that $F_D$ is block diagonal. The $(i, j)$th element of $F_D$, represented by $Y_i^T F Y_j$, is calculated by successive use of Eqs. (52) and (54) as follows:

$$(F_D)_{ij} = Y_i^T F Y_j = Y_i^T M^T F M Y_j = \lambda_i^* \lambda_j\, Y_i^T F Y_j, \tag{59}$$

where the symbol $*$ denotes the complex conjugate. Thus, $Y_i^T F Y_j$ is equal to $\lambda_i^* \lambda_j\, Y_i^T F Y_j$. This relation is trivial if $Y_i$ and $Y_j$ have the same eigenvalue, since then $\lambda_i^* \lambda_j = 1$. If, on the other hand, $Y_i$ and $Y_j$ have different eigenvalues, $Y_i^T F Y_j$ has to be equal to 0, since $\lambda_i^* \lambda_j \neq 1$. Therefore, the matrix $F_D = V^T F V$ is block diagonal, which proves the supposition. This supposition will now be used to derive Eq. (51). Since the two components are assumed to be identical, the Fisher information matrix $F$, given by Eq. (47), is invariant under interchanging the components. Thus, the matrix $M$, which represents this symmetry operation, is given by:

$$M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{60}$$
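The matrix of eigenvectors $V$ of $M$ and the matrix of eigenvalues $\Lambda$ of $M$ are given by:

$$V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{pmatrix} \tag{61}$$

and

$$\Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{62}$$

The matrix $F_D = V^T F V$ is block diagonal, as predicted by the preceding supposition, and is equal to

$$F_D = \begin{pmatrix} F_{11} + F_{12} & F_{13} + F_{14} & 0 & 0 \\ F_{13} + F_{14} & F_{33} + F_{34} & 0 & 0 \\ 0 & 0 & F_{11} - F_{12} & F_{13} - F_{14} \\ 0 & 0 & F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}. \tag{63}$$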
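The matrices of Eqs. (60)–(63) can be verified with a few lines of code. The following sketch, which uses only the Python standard library, checks that $M^2 = I$, that $MV = V\Lambda$, that a matrix with the structure of Eq. (47) satisfies Eq. (52), and that $V^T F V$ has the block diagonal form of Eq. (63). The numerical entries of $F$ are hypothetical, and the helper functions are introduced here for illustration only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def allclose(a, b, tol=1e-12):
    """Element-wise comparison of two matrices of equal shape."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


s = 1.0 / math.sqrt(2.0)
IDENTITY = [[float(i == j) for j in range(4)] for i in range(4)]
M = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]        # Eq. (60)
V = [[s, 0, s, 0], [s, 0, -s, 0], [0, s, 0, s], [0, s, 0, -s]]      # Eq. (61)
LAM = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]    # Eq. (62)

# Hypothetical Fisher information entries with the structure of Eq. (47).
f11, f12, f13, f14, f33, f34 = 5.0, 1.0, 0.5, 0.2, 4.0, 0.8
F = [
    [f11, f12, f13, f14],
    [f12, f11, f14, f13],
    [f13, f14, f33, f34],
    [f14, f13, f34, f33],
]

m_squared_is_identity = allclose(matmul(M, M), IDENTITY)                  # Eq. (53), n = 2
eigen_relation_holds = allclose(matmul(M, V), matmul(V, LAM))             # Eq. (54)
f_is_invariant = allclose(matmul(transpose(M), matmul(F, M)), F)          # Eq. (52)

F_D = matmul(transpose(V), matmul(F, V))
F_D_expected = [                                                          # Eq. (63)
    [f11 + f12, f13 + f14, 0.0, 0.0],
    [f13 + f14, f33 + f34, 0.0, 0.0],
    [0.0, 0.0, f11 - f12, f13 - f14],
    [0.0, 0.0, f13 - f14, f33 - f34],
]
print(m_squared_is_identity, eigen_relation_holds, f_is_invariant,
      allclose(F_D, F_D_expected))
```

Here the symmetry operation is an interchange of two components, so that $n = 2$ and the eigenvalues are $+1$ and $-1$, each doubly degenerate.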
Since $F_D$ is defined as $V^T F V$ and the matrix $V$ of Eq. (61) is orthogonal, it follows that:

$$F^{-1} = V F_D^{-1} V^T, \tag{64}$$

where $F_D$ is given by Eq. (63). Equation (64) allows one to easily invert the $4 \times 4$ Fisher information matrix $F$ associated with the position coordinates $\beta$, since $F_D$ is block diagonal. The inverse of $F_D$ is block diagonal as well, with submatrices equal to the inverses of the $2 \times 2$ submatrices of $F_D$. Substituting the result of Eq. (64) into Eq. (49) then results in Eq. (51).

Next, the right-hand member of Eq. (51) will be calculated explicitly for distances that are either small or large compared to the width $\rho$ of the Gaussian peak, for dark-field as well as for bright-field imaging.

Dark-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for dark-field imaging. The results are obtained from the explicit calculations of the expressions given by the right-hand members of Eqs. (46) and (51), which are given in Appendix A. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{\rho^2}{N_p} \tag{65}$$
where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of electrons in this peak. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{66}$$

and

$$\sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{67}$$
If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. Notice that Eqs. (65), (66), and (67) are equal to their one-dimensional analogues, which are given by Eqs. (42), (43), and (44). Moreover, from the comparison of Eqs. (65) and (67), it follows that, if the distance between the two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, and $\sigma^2_d$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximations are not valid if, for instance, only one half of an object is detected.

Bright-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for bright-field imaging. The results are obtained from the explicit calculations of Eqs. (46) and (51), which are given in Appendix B. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{8\pi\rho^4\,\mathrm{FOV}}{N\Omega^2} \tag{68}$$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), FOV is the field of view, $N$ is the total number of detected electrons, and $\Omega$ represents the strength of the interaction of the incident electrons with one component. Notice that $N/\mathrm{FOV}$ denotes the total number of detected electrons per unit area. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$\sigma^2_d \approx \frac{64\pi\rho^6\,\mathrm{FOV}}{3N\Omega^2 d^2} \quad \text{if } d \le \sqrt{4/3}\,\rho \tag{69}$$

and

$$\sigma^2_d \approx \frac{16\pi\rho^4\,\mathrm{FOV}}{N\Omega^2} \quad \text{if } d \ge \sqrt{4/3}\,\rho. \tag{70}$$
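The bright-field rules of thumb of Eqs. (68)–(70) may be evaluated in the same way as their dark-field counterparts. In the minimal sketch below, the field of view enters as an area, in line with the remark that $N/\mathrm{FOV}$ is a number of electrons per unit area; the function names and the numerical values are assumptions made for illustration.

```python
import math


def bf_position_variance_bound(rho, fov_area, n_total, omega):
    """Approximate bright-field CRLB on the position variance, Eq. (68)."""
    return 8.0 * math.pi * rho ** 4 * fov_area / (n_total * omega ** 2)


def bf_distance_variance_bound(rho, fov_area, n_total, omega, d):
    """Approximate bright-field CRLB on the distance variance, Eqs. (69) and (70)."""
    if d <= math.sqrt(4.0 / 3.0) * rho:
        # Overlapping peaks: the bound on the variance grows as 1 / d^2.
        return (64.0 * math.pi * rho ** 6 * fov_area
                / (3.0 * n_total * omega ** 2 * d ** 2))
    # Well-separated peaks: twice the bound for a single position coordinate.
    return 16.0 * math.pi * rho ** 4 * fov_area / (n_total * omega ** 2)


# Illustrative values only (assumed for this sketch).
rho, omega, n_total = 20.0, 100.0, 18.0e6
fov_area = 200.0 * 200.0
crossover = math.sqrt(4.0 / 3.0) * rho
for d in (5.0, 10.0, crossover, 60.0):
    sigma_d = math.sqrt(bf_distance_variance_bound(rho, fov_area, n_total, omega, d))
    print(f"d = {d:6.2f}   lower bound on sigma_d = {sigma_d:.4f}")
```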
If $d$ is equal to $\sqrt{4/3}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (68) and (70), it follows that, if the distance between the two components is larger than $\sqrt{4/3}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$. This expresses the fact that for distances larger than $\sqrt{4/3}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, and $\sigma^2_d$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximations are not valid if, for instance, only one half of an object is detected.

The rules of thumb for dark-field and bright-field imaging, which are described by Eqs. (65)–(70), are scalar measures that may be used to obtain insight into statistical experimental design. The precision with which the position coordinates and the distance can be measured is a function of the number of electrons and the width of the peaks. The attainable precision may be quantified, and it may be improved by increasing the number of detected electrons per unit area or by narrowing the peaks. In practice, it follows from Eq. (30) that the peaks may be narrowed by narrowing the point spread function, that is, by improving the two-point resolution of the electron microscope. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components, for instance, by the width of the electrostatic potential of the atoms (den Dekker, Sijbers, and van Dyck, 1999). Further narrowing of the point spread function is then useless. This result is meaningful in practice. For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision, since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, if, as in STEM, a narrower point spread function is accompanied by a decrease in the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. It also follows from the rules of thumb that the precision may be orders of magnitude better than the two-point resolution of the imaging instrument if the number of detected electrons per unit area is large. Furthermore, if the distance becomes smaller than $\sqrt{2}\,\rho$ or $\sqrt{4/3}\,\rho$ for dark-field and bright-field imaging, respectively, the lower bound on the standard deviation $\sigma_d$ of the distance $d$ increases inversely proportionally to the distance. In Section III.D.1, which consists of discussions and examples, it will be shown that the lower bounds on the standard deviation $\sigma_{\beta_x}$ or $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ are well approximated by the square roots of the right-hand members of Eqs. (65)–(70).

3. Three-Dimensional Observations

For a three-dimensional object, the derivation of rules of thumb for the highest attainable precision, that is, the CRLB, with which the position coordinates of an isolated component or the distance between two components can be estimated is similar to its two-dimensional analogue. First, an isolated component is considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y\ \beta_z)^T$. The symmetric Fisher information matrix $F$ associated with the position coordinates $\beta$ is given by:

$$F = \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{12} & F_{22} & F_{23} \\ F_{13} & F_{23} & F_{33} \end{pmatrix}. \tag{71}$$
From Eq. (14), it follows that the CRLB on the variance $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, or $\sigma^2_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, or $\beta_z$, respectively, is given by the corresponding diagonal element of $F^{-1}$:

$$\sigma^2_{\beta_x} = \left[F^{-1}\right]_{11}, \qquad \sigma^2_{\beta_y} = \left[F^{-1}\right]_{22}, \qquad \sigma^2_{\beta_z} = \left[F^{-1}\right]_{33}. \tag{72}$$

The right-hand members of these equations are calculated explicitly in Appendix C, resulting in:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_z} \approx \frac{2\rho^2}{N_p} \tag{73}$$

and

$$\sigma^2_{\beta_y} \approx \frac{\rho^2}{N_p} \tag{74}$$

where $\rho$ is the width of the projected Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of detected electrons in the component. The conditions for the validity of the approximations are that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to $\rho$, that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range, and that the component is located for the most part within the region of observation. Furthermore, the tilt angles $\theta^j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. From the comparison of Eqs. (73) and (74) with Eqs. (42) and (65), it follows that the lower bound on the variance with which the y-coordinate of the position can be estimated is equal to its one- and two-dimensional analogues, whereas the lower bound for the x- and z-coordinates is twice as large. Recall that the y-coordinate is the coordinate along the rotation axis and that the x- and z-coordinates are the coordinates perpendicular to the rotation axis.

Second, two components are considered, whose position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2}\ \beta_{z1}\ \beta_{z2})^T$. It will be assumed that the three-dimensional components are located for the most part within the region of observation and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak. Furthermore, the Fisher information matrix $F$ associated with the position coordinates $\beta$ is a symmetric matrix. Under the assumptions given above and the symmetry property of the Fisher information matrix, it may be shown that

$$F \approx \begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} & F_{15} & F_{16} \\ F_{12} & F_{11} & F_{14} & F_{13} & F_{16} & F_{15} \\ F_{13} & F_{14} & F_{33} & F_{34} & F_{35} & F_{36} \\ F_{14} & F_{13} & F_{34} & F_{33} & F_{36} & F_{35} \\ F_{15} & F_{16} & F_{35} & F_{36} & F_{55} & F_{56} \\ F_{16} & F_{15} & F_{36} & F_{35} & F_{56} & F_{55} \end{pmatrix}. \tag{75}$$
The reason for this is that the components are assumed to be identical and hence interchangeable. The purpose is to find an expression for the CRLB on the variance $\sigma^2_d$ of the distance between two components. For a three-dimensional object, the distance is defined as:

$$d = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{y1} - \beta_{y2})^2 + (\beta_{z1} - \beta_{z2})^2}. \tag{76}$$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma^2_d$ follows directly from the right-hand member of inequality (15):

$$\sigma^2_d = \frac{\partial d}{\partial \beta^T}\, F^{-1}\, \frac{\partial d^T}{\partial \beta}, \tag{77}$$

where the Jacobian matrix $\partial d / \partial \beta^T$ is equal to

$$\frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{x2} - \beta_{x1} & \beta_{y1} - \beta_{y2} & \beta_{y2} - \beta_{y1} & \beta_{z1} - \beta_{z2} & \beta_{z2} - \beta_{z1} \end{pmatrix}. \tag{78}$$
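Following the same line of thought as in the derivation of Eq. (51), it can be shown that $\sigma^2_d$ may be approximated by:

$$\sigma^2_d \approx \frac{2}{d^2} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{y1} - \beta_{y2} & \beta_{z1} - \beta_{z2} \end{pmatrix} \begin{pmatrix} F_{11} - F_{12} & F_{13} - F_{14} & F_{15} - F_{16} \\ F_{13} - F_{14} & F_{33} - F_{34} & F_{35} - F_{36} \\ F_{15} - F_{16} & F_{35} - F_{36} & F_{55} - F_{56} \end{pmatrix}^{-1} \begin{pmatrix} \beta_{x1} - \beta_{x2} \\ \beta_{y1} - \beta_{y2} \\ \beta_{z1} - \beta_{z2} \end{pmatrix}. \tag{79}$$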
The expression given by the right-hand member of Eq. (79) has been calculated explicitly in Appendix C for the special cases where the distance between the two components is small or large compared to the width $\rho$ of the projected Gaussian peak. This results in the following rules of thumb:

$$\sigma^2_d \approx \frac{4\rho^4\, V(\phi)}{N_p d^2} \quad \text{if } d \le \sqrt{2V(\phi)/W(\phi)}\,\rho \tag{80}$$

and

$$\sigma^2_d \approx \frac{2\rho^2\, W(\phi)}{N_p} \quad \text{if } d \ge \sqrt{2V(\phi)/W(\phi)}\,\rho \tag{81}$$

where

$$V(\phi) = 4\,\frac{3\cos^4\phi - 3\cos^2\phi - 2}{\cos^4\phi - 6\cos^2\phi - 3}, \tag{82}$$

$$W(\phi) = 1 + \sin^2\phi, \tag{83}$$

and $\phi$ is the angle between the rotation axis and the axis that connects the two components. This angle is shown in Figure 3. It should be noted that $\phi$ is not the tilt angle: for the different tilt angles $\theta^j$ in a tilt series, $\phi$ is constant. The conditions for the validity of the approximations are that the components are located for the most part within the region of observation, that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to $\rho$, and that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range. Furthermore, the tilt angles $\theta^j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. If $d$ is equal to $\sqrt{2V(\phi)/W(\phi)}\,\rho$, for a given angle $\phi$, both approximations are equal to one another. From Eqs. (80) and (81), it follows that the precision with which the distance can be estimated is a function of the total number of electrons, the width of the peaks, the distance between the components, and the angle $\phi$. If $\phi$ is equal to $\pi/2$, the approximated $\sigma^2_d$ is about 2 times as large as if $\phi$ is equal to 0. In terms of the standard deviation, this corresponds to a factor of about $\sqrt{2}$. Moreover, if $\phi$ is equal to 0, the approximations given by Eqs. (80)–(81) are equal to their one- and two-dimensional analogues given by Eqs. (43)–(44) and (66)–(67), respectively. This is intuitively clear, since the components are then on the rotation axis and the distance between the components in a two-dimensional projection is therefore equal to the real distance at each tilt angle. In Section III.D it will be shown that the lower bounds on the standard deviation $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, or $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, or $\beta_z$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components are well approximated by the square roots of the right-hand members of Eqs. (73)–(74) and (80)–(81), respectively.

D. Discussions and Examples

In this section, the exactly calculated lower bounds on the standard deviation of the position coordinates of an isolated component and on the standard deviation of the distance will be compared with their approximations. This will be done for two- and three-dimensional objects. For one-dimensional objects, a discussion may be found in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999).

1. Two-Dimensional Observations

The approximations of the lower bound on the standard deviation, which are derived in Section III.C.2 for two-dimensional objects, will be investigated by means of examples, for dark-field as well as for bright-field imaging.

a. Dark-Field Imaging.
The approximations of the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of Eqs. (65)–(67), are discussed for dark-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of dark-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of electrons in a Gaussian peak, the width of this peak, the pixel sizes, and the field of view are given by the numbers of Table 1. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 4 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the width of the Gaussian peak. Furthermore, Figure 5 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (65)–(67) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$. One of the assumptions made in the derivation of Eqs. (65)–(67) is that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width of the
TABLE 1
Total Number of Electrons in a Gaussian Peak ($N_p$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

$N_p$     $\rho$   $\Delta x$   $\Delta y$   FOV
15,000    20       1.2          1.2          200 × 200
Figure 4. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (65), as a function of the width of the Gaussian peak.
Gaussian peak. Therefore, Figure 6 shows the exactly calculated lower bound on the standard deviation $\sigma_d$ of the distance as a function of the pixel size $\Delta x$, which has been assumed to be equal to $\Delta y$. The distance between the two components is equal to 10. From this figure, it is seen that below a certain pixel size, $\sigma_d$ decreases only slightly with decreasing pixel size, with
Figure 5. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (66) and (67), as a function of the distance between two components.
Figure 6. The exactly calculated lower bound on the standard deviation of the distance as a function of the pixel size $\Delta x$, with $\Delta y = \Delta x$. The distance is equal to 10.
all other quantities kept constant. Hence, the precision that is gained by decreasing the pixel size is only marginal. This was also observed for one-dimensional observations by Bettens et al. (1999). The reason is that the pixel signal-to-noise ratio (SNR) decreases with decreasing pixel size. Finally, it is examined whether there exists an estimator attaining the CRLB on the variance of the position coordinates and on the variance of the distance, and whether this estimator may be considered unbiased. If so, this would justify the choice of the CRLB as precision-based optimality criterion. Generally, one may use different estimators to estimate the position coordinates or the distance, such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with a covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one, which means that it applies to an infinite number of observations. However, the number of observations used in the examples given above is finite and even relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 600 different dark-field experiments made on an isolated component are simulated; the observations are modelled using the parametric statistical model described in Section III.B. Next, the position coordinates $\beta_x$ and $\beta_y$ of the component are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively.
The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 2. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. The maximum likelihood estimates of $\beta_x$ are presented in the histogram of Figure 7. The solid curve represents a normal distribution with mean and variance given in Table 2. This curve makes it plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
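The logic of such a simulation study may be illustrated with a strongly simplified, one-dimensional analogue. In the sketch below, it is assumed that each detected electron position is drawn directly from a Gaussian peak of width $\rho$ (no pixels, no finite field of view), in which case the maximum likelihood estimate of the position is the mean of the detected positions. This is not the pixelated Poisson model of Section III.B, and the number of electrons per experiment is reduced to keep the run short. The standard deviations of the mean and of the variance are computed with the usual formulas for normally distributed estimates, which is also an assumption of this sketch.

```python
import math
import random
import statistics


def simulate_position_estimates(rho, n_p, n_experiments, seed=0):
    """Return ML position estimates from simulated, unbinned experiments.

    Assumption: each of the n_p electrons is detected at a position drawn
    from a Gaussian peak of width rho centred on the true position 0, so
    the ML estimate of the position is the mean of the detected positions.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_experiments):
        total = 0.0
        for _ in range(n_p):
            total += rng.gauss(0.0, rho)
        estimates.append(total / n_p)
    return estimates


rho, n_p, n_experiments = 20.0, 1000, 600    # n_p kept small for speed
estimates = simulate_position_estimates(rho, n_p, n_experiments)

bound = rho ** 2 / n_p                        # rule of thumb, Eqs. (42) and (65)
mean_est = statistics.mean(estimates)
var_est = statistics.variance(estimates)
std_of_mean = math.sqrt(var_est / n_experiments)
std_of_var = var_est * math.sqrt(2.0 / (n_experiments - 1))

print(f"true position 0, estimated mean {mean_est:.4f} +/- {std_of_mean:.4f}")
print(f"lower bound {bound:.4f}, estimated variance {var_est:.4f} +/- {std_of_var:.4f}")
```

If the estimated mean and variance lie within a few standard deviations of the true position and of the lower bound, respectively, there is no reason to conclude that the estimator is biased or that it does not attain the bound, which is the type of conclusion drawn from Table 2.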
TABLE 2
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

                          True position coordinate   Estimated mean           Standard deviation of mean
$\beta_x$                 0                          $0.9 \times 10^{-3}$     $6.6 \times 10^{-3}$
$\beta_y$                 0                          $12.5 \times 10^{-3}$    $6.4 \times 10^{-3}$

                          Lower bound on variance    Estimated variance       Standard deviation of variance
$\sigma^2_{\beta_x}$      $26.7 \times 10^{-3}$      $25.8 \times 10^{-3}$    $1.5 \times 10^{-3}$
$\sigma^2_{\beta_y}$      $26.7 \times 10^{-3}$      $24.4 \times 10^{-3}$    $1.4 \times 10^{-3}$

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.
Figure 7. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of a component. The normal distribution superimposed on this histogram makes plausible that the estimates are normally distributed.
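The exactly calculated lower bounds used in this dark-field discussion require a numerical evaluation of the Fisher information matrix. The following sketch illustrates how this may be done for an isolated component. The expectation model is an assumption of this sketch: the expected count in a pixel is taken to be a sampled two-dimensional Gaussian of width $\rho$ containing $N_p$ electrons, and the Fisher information is computed with the standard expression for independent Poisson-distributed counts, which is presumed here to correspond to Eq. (12). The function names are hypothetical.

```python
import math


def fisher_matrix_dark_field(rho, n_p, pixel, fov, beta=(0.0, 0.0)):
    """Fisher information on (beta_x, beta_y) for Poisson-distributed pixel counts.

    Assumed expectation model: the expected count in a pixel centred at (x, y) is
        lam = n_p * pixel^2 / (2 pi rho^2) * exp(-((x - bx)^2 + (y - by)^2) / (2 rho^2)).
    For Poisson counts, F_rs = sum over pixels of (dlam/db_r) (dlam/db_s) / lam.
    """
    n_pix = int(round(fov / pixel))
    centres = [(k + 0.5) * pixel - 0.5 * n_pix * pixel for k in range(n_pix)]
    bx, by = beta
    norm = n_p * pixel ** 2 / (2.0 * math.pi * rho ** 2)
    f11 = f12 = f22 = 0.0
    for x in centres:
        for y in centres:
            lam = norm * math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2.0 * rho ** 2))
            # dlam/dbx = lam * (x - bx) / rho^2, and similarly for by.
            gx = (x - bx) / rho ** 2
            gy = (y - by) / rho ** 2
            f11 += lam * gx * gx
            f12 += lam * gx * gy
            f22 += lam * gy * gy
    return [[f11, f12], [f12, f22]]


def crlb_2x2(f):
    """Invert a symmetric 2 x 2 Fisher information matrix."""
    det = f[0][0] * f[1][1] - f[0][1] ** 2
    return [[f[1][1] / det, -f[0][1] / det], [-f[0][1] / det, f[0][0] / det]]


# Values as listed in Table 1; the FOV is taken as 200 along each axis.
rho, n_p, pixel, fov = 20.0, 15000.0, 1.2, 200.0
crlb = crlb_2x2(fisher_matrix_dark_field(rho, n_p, pixel, fov))
approximation = rho ** 2 / n_p                                   # Eq. (65)
print(f"numerical bound on sigma: {math.sqrt(crlb[0][0]):.5f}")
print(f"rule of thumb on sigma:   {math.sqrt(approximation):.5f}")
```

Under these assumptions the pixel size is small compared to $\rho$ and the peak lies almost entirely within the field of view, so the numerically computed bound and the rule of thumb of Eq. (65) should nearly coincide.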
b. Bright-Field Imaging. The approximations of the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of Eqs. (68)–(70), are discussed for bright-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of bright-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of electrons, the width of the Gaussian peak, the constant representing the strength of the interaction, the pixel sizes, and the field of view are given by the numbers of Table 3. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 8 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the constant $\Omega$, which represents the strength of the interaction. Furthermore, Figure 9 shows the exactly calculated lower bound on the
TABLE 3
The Total Number of Electrons ($N$), the Width ($\rho$) of the Gaussian Peak, the Constant ($\Omega$) Representing the Strength of the Interaction, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

$N$           $\rho$   $\Omega$   $\Delta x$   $\Delta y$   FOV
18,000,000    20       100        1.2          1.2          200 × 200
Figure 8. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (68), as a function of the constant $\Omega$ representing the interaction strength.
Figure 9. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (69) and (70), as a function of the distance between two components.
standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (68)–(70) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$. As for dark-field imaging, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the distance between two components and whether it is unbiased. The observations made on these components are modelled using the parametric statistical model for bright-field imaging described in Section III.B. From 600 different simulation experiments, the distance is estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the distance and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 4. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB.

2. Three-Dimensional Observations

The approximations of the lower bounds on the standard deviation $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, and $\beta_z$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of
TABLE 4
Comparison of True Distance and Lower Bound on the Variance with Estimated Mean and Variance of 600 Maximum Likelihood Estimates of the Distance between Two Components, Respectively

                  True distance             Estimated mean       Standard deviation of mean
$d$               60                        60.0                 0.5

                  Lower bound on variance   Estimated variance   Standard deviation of variance
$\sigma^2_d$      1.27                      1.29                 0.07

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

TABLE 5
The Total Number of Projected Images ($J$), the Number of Electrons in each Projected Gaussian Peak ($N_p/J$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), the Field of View (FOV) of each Projected Image, and the Angle ($\phi$) between the Rotation Axis and the Axis Connecting Two Components

$J$    $N_p$     $\rho$   $\Delta x$   $\Delta y$   FOV          $\phi$
20     15,000    20       1.2          1.2          200 × 200    $\pi/2$
Eqs. (73), (74), (80), and (81) in Section III.C.3, are investigated by means of examples. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of the three-dimensional observations, which is given in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of projected images, the number of electrons in each projected Gaussian peak, the width of this peak, the pixel sizes, the field of view of each projected image, and, in case of two components, the angle between the rotation axis and the axis connecting these components are given by the numbers of Table 5. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 10 shows the exactly calculated lower bound on the standard deviation of the position coordinates and its approximations as a function of the width of the projected Gaussian peak, which is described by Eq. (29). Furthermore, Figure 11 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. The axis combining both
Figure 10. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximations, given by the square roots of the right-hand members of Eqs. (73) and (74), as a function of the width of the projected Gaussian peak.
Figure 11. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between two components.
components is assumed to be perpendicular to the rotation axis. Moreover, in Figures 12 and 13, the exactly calculated lower bound on the standard deviation of the distance and its approximations are shown as a function of the angle $\phi$, for the distance between the components being small and large compared to the width of the projected Gaussian peak, respectively. From Figures 10 to 13, it is observed that the square roots of the right-hand
Figure 12. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (80), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The distance is equal to 2.
Figure 13. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (81), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The width of the projected Gaussian peaks is equal to 10 and the distance is equal to 50.
members of Eqs. (73), (74), (80), and (81) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, $\sigma_{\beta_z}$, and $\sigma_d$. Next, some remarks are due. It should be mentioned that in the derivation of the approximations of the CRLB, the difference $\Delta\theta$ between successive tilt
angles has been assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$, or in other words, the total number of projections has been assumed to be large, which is rather unrealistic. However, in the comparisons presented in Figures 10 to 13, the exactly calculated lower bounds on the standard deviation follow from the assumption that there are only 20 available projections. This shows that the approximations are useful, even for a limited number of projections. Additionally, Figure 14 shows the exactly calculated lower bound on the standard deviation of the distance $\sigma_d$ as a function of the total number of projections, with all other parameters kept constant. It is seen that there is a fast convergence of $\sigma_d$ to a constant with increasing number of projections. This means that the precision does not improve beyond a certain number of projections. The reason for this is that the number of electrons per projection decreases with increasing number of projections since the total number of electrons has been kept constant. Therefore, the pixel SNR decreases with increasing number of projections. Furthermore, in the derivation of the approximations, a full angular tilt range, that is, the interval $(-\pi/2, \pi/2)$, has been assumed, which is also unrealistic. Therefore, Figure 15 shows the exactly calculated $\sigma_d$, following from a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and the approximations as a function of the distance between the two components. Although the approximations start to deviate from the exactly calculated $\sigma_d$, they are still useful as a rule of thumb since they describe the behaviour of $\sigma_d$ well.
Figure 14. The exactly calculated lower bound on the standard deviation of the distance as a function of the number of projections $J$, with the number of electrons in each projected Gaussian peak equal to $N_p/J$. The width of this peak is equal to 10 and the distance between the two components is equal to 40.
Figure 15. The exactly calculated lower bound on the standard deviation of the distance, assuming a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between the two components.
Finally, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the position coordinates of an isolated component and whether it is unbiased. The significance of this has been made clear earlier in Section III.D.1. The three-dimensional observations made at the component are modelled using the parametric statistical model described in Section III.B. The width of the projected Gaussian peaks is equal to 10. From 600 different simulation experiments, the position coordinates are estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 6. From the comparison of these results, it follows that it cannot be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Additionally, a remark on maximum likelihood estimation has to be made. Maximum likelihood estimates are given by the values that maximize the log-likelihood function, as shown in Section II.D. However, in order to avoid ending up at a local maximum instead of at the global maximum of the log-likelihood function, it is important to have good starting values for the position coordinates of the components, as already mentioned in Section I. For that purpose, a three-dimensional reconstruction could be
TABLE 6
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

Coordinate    True position coordinate    Estimated mean           Standard deviation of mean
$\beta_x$     0                           $0.6 \times 10^{-3}$     $4.7 \times 10^{-3}$
$\beta_y$     0                           $3.7 \times 10^{-3}$     $3.3 \times 10^{-3}$
$\beta_z$     0                           $5.3 \times 10^{-3}$     $4.8 \times 10^{-3}$

Coordinate    Lower bound on variance     Estimated variance       Standard deviation of variance
$\beta_x$     $13.3 \times 10^{-3}$       $13.0 \times 10^{-3}$    $0.8 \times 10^{-3}$
$\beta_y$     $6.7 \times 10^{-3}$        $6.6 \times 10^{-3}$     $0.4 \times 10^{-3}$
$\beta_z$     $13.3 \times 10^{-3}$       $13.8 \times 10^{-3}$    $0.8 \times 10^{-3}$

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.
useful. It may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992).

E. Conclusions

The attainable precision with which position and distance parameters of one or two components can be estimated is computed for simulations of high-resolution CTEM, STEM, and electron tomography experiments, all described by simplified models. Usually, the performance of such atomic resolution TEM experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately the components of a two-point image. Although such resolution-based criteria are suitable to set up qualitative atomic resolution TEM experiments, a precision-based optimality criterion is needed in the framework of quantitative atomic resolution TEM. Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In the simulation experiments, the observations were assumed to be electron counting results made at Gaussian peaks with unknown position. Under this assumption, the CRLB, which is usually calculated numerically, is given by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of a higher complexity, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. The
rules of thumb show how the attainable precision depends on the width of the point spread function, the width of the components, the number of detected electrons, and on the distance between the components. Particularly for electron tomography experiments, it is a function of the orientation of the components with respect to the rotation axis as well. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. Moreover, if a narrower point spread function results in a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. In the following sections, the optimal statistical experimental designs of CTEM and STEM experiments, assuming more realistic expectation models than Gaussian peaks, will be derived by computing the CRLB numerically. It will be shown that these numerical results may be interpreted by means of the obtained rules of thumb of this section.
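The simulation check used in this section (600 simulated experiments, maximum likelihood estimation, comparison of the sample mean and variance with the true value and the CRLB) can be illustrated on a deliberately reduced toy problem. The following Python sketch is not the model of Section III.B: it assumes a single one-dimensional Gaussian peak of known width and known expected number of electrons, independent Poisson counts, and a field of view much wider than the peak, so that the maximum likelihood estimate of the position is well approximated by the count-weighted mean pixel position. All function names and numerical values are hypothetical.

```python
import math
import random

def expectations(beta, rho, n_electrons, xs, dx):
    """Expected counts per pixel for a 1D Gaussian peak centred at beta."""
    scale = n_electrons * dx / (math.sqrt(2.0 * math.pi) * rho)
    return [scale * math.exp(-(x - beta) ** 2 / (2.0 * rho ** 2)) for x in xs]

def crlb_position(beta, rho, n_electrons, xs, dx):
    """CRLB on var(beta) for Poisson counts: 1 / sum((d lam / d beta)^2 / lam)."""
    lam = expectations(beta, rho, n_electrons, xs, dx)
    # For this model (d lam / d beta)^2 / lam = lam * ((x - beta) / rho^2)^2.
    fisher = sum(l * ((x - beta) / rho ** 2) ** 2 for l, x in zip(lam, xs))
    return 1.0 / fisher

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def ml_estimate(counts, xs):
    """Count-weighted mean position (approximate ML estimate, see assumptions)."""
    return sum(n * x for n, x in zip(counts, xs)) / sum(counts)

# Hypothetical settings: peak width, expected electrons, pixel size, true position.
RHO, N_EL, DX, BETA = 1.0, 200, 0.25, 0.1
XS = [-6.0 + k * DX for k in range(49)]          # pixel centres from -6 to 6

rng = random.Random(0)                            # fixed seed for repeatability
lam = expectations(BETA, RHO, N_EL, XS, DX)
estimates = [ml_estimate([poisson(l, rng) for l in lam], XS) for _ in range(600)]

mean = sum(estimates) / len(estimates)
variance = sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
bound = crlb_position(BETA, RHO, N_EL, XS, DX)

print(f"true position {BETA}, estimated mean {mean:.4f}")
print(f"CRLB {bound:.5f}, estimated variance {variance:.5f}")
```

For this toy model the bound evaluates to approximately `RHO**2 / N_EL`, and the ratio of the sample variance to the bound should be close to one within the statistical scatter expected from 600 trials.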
IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy
A. Introduction

Optimal statistical experimental designs of CTEM experiments will be described. As mentioned in Section I, the future of such experiments is quantitative structure determination. Unknown structure parameters, atom column positions in particular, are quantitatively estimated from the observations. Quantitative structure determination should be done as precisely as possible. A precision of the atom column positions of the order of 0.01 to 0.1 Å is needed (Kisielowski, Principe, Freitag and Hubert, 2001; Muller, 1998, 1999). Precise measurements will allow materials scientists to draw reliable conclusions from the experiment. Such measurements may be used for comparison with or as an input for theoretical first-principles calculations in order to get a deeper understanding of the properties-structure relation. Hence, the experimental design of CTEM experiments should be evaluated and optimized in terms of precision. As shown in Section II, the obvious optimality criterion is the attainable precision, that is, the CRLB, with which the atom column positions can be estimated. The attainable precision should replace widely used performance criteria of an electron microscope, which express the
possibility to perceive separately two atom columns in an image. Although these criteria are suitable to set up qualitative CTEM experiments, the attainable precision is needed as a criterion in the framework of quantitative CTEM experiments. In Section III, the attainable precision has been derived in closed analytical form for atomic resolution transmission electron microscopy experiments using simplified models. In this section, the attainable precision will be derived for more complicated, physics based CTEM models and the obtained expression will be used to evaluate and optimize the experimental design. To begin with, it will be described how CTEM observations are collected. A scheme is shown in Figure 16. The object under study is illuminated by a parallel incident electron beam. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. A one-to-one correspondence between the exit wave and the projected object structure is established if the object is oriented along a main zone axis and if the distance between adjacent atom columns is not too small. Next, a magnified image of the exit wave is formed by a set of lenses of which the objective lens is the most important one. The formation of this image may be described in two steps. First, the so-called image wave, which is a complex electron wave function at the image plane, is formed. Since the objective lens is not perfect, the image wave is influenced by lens aberrations such as spherical
Figure 16. Scheme of a CTEM experiment.
aberration, defocus, and chromatic aberration. Second, the image intensity distribution, given by the modulus square of the image wave, is recorded. As a recording device, a CCD camera may be chosen. Therefore, CTEM observations may be considered to be electron counting results collected at the pixels of a CCD camera.

Widely used performance criteria of CTEM experiments are the point resolution and the information limit of the electron microscope. The point resolution $\rho_s$ represents the smallest detail that may be interpreted directly from the image provided that the object is thin and that the defocus is adjusted to the so-called Scherzer defocus (Scherzer, 1949). The point resolution depends only on the spherical aberration constant $C_s$ and the electron wavelength $\lambda$, according to the formula $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$ (Spence, 1988). The information limit $\rho_i$ represents the smallest detail that is present in the image and that may be resolved by image processing techniques such as off-axis holography (Lichte, 1991) and the focal-series reconstruction method (Coene, Thust, Op de Beeck, and van Dyck, 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck and Coene, 1993). Both techniques retrieve the exit wave, which ideally is free from any lens aberration. The information limit is inversely proportional to the highest spatial frequency that is still transferred with enough intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O'Keefe, 1992). Usually, the information limit is smaller than the point resolution in intermediate voltage electron microscopy. The information limit is mainly determined by spatial incoherence and temporal incoherence. Spatial incoherence is due to beam convergence, which is caused by the fact that the illuminating beam is not parallel but may be considered as a cone of incoherent plane waves.
Temporal incoherence is due to chromatic aberration, which results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron, where the thermal energy fluctuation is often the dominating term. Chromatic aberration will mostly be the dominant factor governing the information limit (de Jong and van Dyck, 1993). The information limit due to chromatic aberration is defined as $\rho_i = (\pi \lambda \Delta / 2)^{1/2}$, with $\Delta$ the defocus spread, expressed in terms of the standard deviation (Spence, 1988). Over the years, different methods have been developed to improve the point resolution or the information limit. Existing methods to improve the point resolution are, for example, high-voltage electron microscopy and correction of the spherical aberration. High-voltage electron microscopy is based on the principle that an increase of the accelerating voltage is accompanied by a decrease of the electron wavelength and a corresponding improvement of the point resolution (Phillipp, Höschen, Osaki, Möbus,
and Rühle, 1994). Spherical aberration is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. By using a combination of magnetic quadrupole and octopole lenses, spherical aberration may be cancelled out (Rose, 1990; Scherzer, 1949). This improves the point resolution. One of the advantages of the spherical aberration corrector is that structure-imaging artifacts due to contrast delocalization may to a great extent be avoided (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). Existing methods to improve the information limit are based on correction of chromatic aberration by use of either a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999). The chromatic aberration corrector is still at the conceptual stage. The monochromator is already used in practice and eliminates all electrons having energies outside a prespecified energy range. The methods presented nowadays result in a resolution of about 1 Å, which is sufficient to visualize the individual atom columns of materials with columnar structures, viewed along a main zone axis. In fact, the methods developed to improve the point resolution or the information limit are advantageous for qualitative high-resolution CTEM. However, the future of CTEM experiments is quantitative, instead of qualitative, structure determination. The structure parameters, the atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. Hence, the obvious optimality criterion to be used to evaluate the experimental design of CTEM experiments is the attainable precision, that is, the CRLB with which these structure parameters can be estimated, and not so much the point resolution or the information limit.
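For orientation, the two conventional performance criteria quoted in this introduction, the point resolution $\rho_s = 0.66\,(C_s\lambda^3)^{1/4}$ and the information limit $\rho_i = (\pi\lambda\Delta/2)^{1/2}$, are easy to evaluate numerically. The Python sketch below does so for illustrative microscope settings; the numerical values and the function names are assumptions chosen for the example and are not taken from the text.

```python
import math

def point_resolution(cs, wavelength):
    """Point resolution rho_s = 0.66 * (Cs * lambda^3)^(1/4), same length unit as inputs."""
    return 0.66 * (cs * wavelength ** 3) ** 0.25

def information_limit(defocus_spread, wavelength):
    """Information limit due to chromatic aberration, rho_i = (pi * lambda * Delta / 2)^(1/2)."""
    return math.sqrt(math.pi * wavelength * defocus_spread / 2.0)

# Illustrative values only (assumptions), all lengths in angstrom.
WAVELENGTH = 0.0197      # roughly the electron wavelength at 300 kV
CS = 0.5e7               # spherical aberration constant of 0.5 mm
DEFOCUS_SPREAD = 30.0    # defocus spread Delta (standard deviation)

rho_s = point_resolution(CS, WAVELENGTH)
rho_i = information_limit(DEFOCUS_SPREAD, WAVELENGTH)

print(f"point resolution  rho_s = {rho_s:.2f} angstrom")
print(f"information limit rho_i = {rho_i:.2f} angstrom")
```

With these assumed settings the information limit comes out smaller than the point resolution, which is the situation described above for intermediate voltage electron microscopy.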
In this section, optimal statistical experimental designs of CTEM experiments will be computed in terms of the experimental settings producing the highest attainable precision. They will be obtained using the principles of statistical experimental design as explained in Section II. The section is organized as follows. In Section IV.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section IV.C, it will be shown how the CRLB on the variance of the atom column position estimates may be deduced from this model. Afterward, an adequate optimality criterion, which is a function of the elements of the CRLB, will be given. This criterion is then used to evaluate and optimize the experimental design. Special attention is paid to the dependence of the optimality criterion on the use of a spherical aberration corrector, a chromatic aberration corrector, and a monochromator. In Section IV.D, conclusions are drawn.
Part of the results of this section concerning the use of a monochromator has earlier been published in den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2000) and den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2001).

B. Parametric Statistical Model of Observations

In order to derive the optimal statistical experimental design, a parametric statistical model of the CTEM observations is needed. This model, which contains microscope settings such as defocus, spherical aberration constant, chromatic aberration constant, and defocus spread, as well as structure parameters such as the atom column positions and the object thickness, will be derived in this section. In this derivation, two basic approximations will be made. The first approximation is the use of the simplified channelling theory to describe the dynamical scattering of the electrons on their way through the object (Geuens and van Dyck, 2002; van Dyck and Op de Beeck, 1996). Secondly, partial spatial and temporal coherence will be incorporated by representing the microscope's transfer function as a product of the corresponding coherent transfer function and two envelope functions (Fejes, 1977; Frank, 1973). The image calculation is then treated as a simple Fourier optics scheme. This approach is nowadays called the quasi-coherent approximation (Coene and van Dyck, 1988). Admittedly, the approximations made are of a limited validity. However, they are very useful for a compact analytical model-based derivation of the optimal statistical experimental design of quantitative CTEM experiments as well as for explaining the basic principles governing the obtained results. The principal results obtained are independent of the approximations made. Moreover, it should be noticed that the image magnification will be ignored, without loss of generality.

1. The Exit Wave

The first important step in the derivation of the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. This is a complex wave function in the plane at the exit face of the object, resulting from the interaction of the electron beam with the object. Use will be made of the simplified channelling theory. At this stage, structure parameters will enter the model. High-resolution CTEM images often show a one-to-one correspondence with the projected object structure if the incident electron beam propagates along a main zone axis. This happens for instance in ordered alloys with columnar structures provided that the point resolution of the microscope is sufficient and the distance between adjacent columns is not too small
Figure 17. Schematic representation of electron channelling.
(van Tendeloo and Amelinckx, 1978; van Tendeloo and Amelinckx, 1982). From this, it has been suggested that for materials oriented along a main zone axis and with sufficient separation between the columns, the exit wave mainly depends on the projected structure, that is, on the type of atom columns. The physical reason behind this is that the atoms are superimposed along an atom column in this orientation. Then, it can be shown that the electrons are trapped in the positive electrostatic potential of the atoms. Because of this, each atom column acts as a guide or a channel within which the electron scatters dynamically without leaving the column (van Dyck, 2002). This channelling effect is schematically represented in Figure 17. In the simplified channelling theory, applicable if the incident electron beam propagates along a main zone axis, an expression for the exit wave is given by (van Dyck and Op de Beeck, 1996):
$$\psi(\mathbf{r}, z) = 1 + \sum_{n=1}^{n_c} c_n\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{84}$$
where $\mathbf{r} = (x\ y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the incident beam direction, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength. The incident electron energy and the electron wavelength are related (Kirkland, 1998):
$$\lambda = \frac{hc}{\sqrt{E_0 (2 m_0 c^2 + E_0)}} \tag{85}$$
with $h$ Planck's constant, $m_0$ the electron rest mass and $c$ the velocity of light so that $hc = 12.398$ keV Å and $m_0 c^2 = 511$ keV. It should be mentioned that
the accelerating voltage is equal to $E_0/e$, where $e = 1.6 \times 10^{-19}$ C is the electron charge. The summation in Eq. (84) is over $n_c$ atom columns. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the lowest energy bound state of the $n$th atom column located at position $\boldsymbol{\beta}_n = (\beta_{xn}\ \beta_{yn})^T$ and $E_{1s,n}$ is its energy. The lowest energy bound state is a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. Following van Dyck and Op de Beeck (1996), it has been assumed that the dynamical motion of the electron in a column may be expressed primarily in terms of this tightly bound 1s-state. The other states are not neglected, but for thin objects they will not build up and are incorporated in the term '1' in Eq. (84), which describes the unscattered incident electron wave. The author is well aware of the fact that for heavy atom columns, where higher order states start to play a more prominent role (Kambe, Lehmpfuhl, and Fujimoto, 1974), Eq. (84) becomes a less accurate description of the exit wave (van Dyck and Op de Beeck, 1996). The excitation coefficients $c_n$ may be found from (van Dyck and Op de Beeck, 1996):
$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, \psi(\mathbf{r}, 0)\, d\mathbf{r}, \tag{86}$$
where the symbol * denotes the complex conjugate. For plane wave incidence, i.e., $\psi(\mathbf{r}, 0) = 1$, one thus has:
$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, d\mathbf{r}. \tag{87}$$
Following Geuens, Chen, den Dekker, and van Dyck (1999) and Geuens and van Dyck (2002), the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function
$$\phi_{1s,n}(\mathbf{r}) = \frac{1}{a_n \sqrt{2\pi}} \exp\left( -\frac{r^2}{4 a_n^2} \right), \tag{88}$$
where $r$ is the Euclidean norm of the two-dimensional vector $\mathbf{r}$, that is, $r = |\mathbf{r}|$, and $a_n$ represents the column dependent width. This width is directly related to the energy of the 1s-state. Then, it follows from Eqs. (87) and (88) that
$$c_n = 2\sqrt{2\pi}\, a_n. \tag{89}$$
The two-dimensional Fourier transform $F_{1s,n}(\mathbf{g})$ of Eq. (88), which will be needed in the remainder of this section, is given by:
$$F_{1s,n}(\mathbf{g}) = 2\sqrt{2\pi}\, a_n \exp\left( -4\pi^2 a_n^2 g^2 \right) \tag{90}$$
with $g$ being the Euclidean norm of the two-dimensional spatial frequency vector $\mathbf{g}$ in reciprocal space, that is, $g = |\mathbf{g}|$. Throughout this article, the two-dimensional Fourier transform $H(\mathbf{g})$ of an arbitrary function $h(\mathbf{r})$ is defined as
$$H(\mathbf{g}) = \mathcal{F}_{\mathbf{r} \to \mathbf{g}}\, h(\mathbf{r}) = \int h(\mathbf{r}) \exp(-i 2\pi\, \mathbf{g} \cdot \mathbf{r})\, d\mathbf{r}, \tag{91}$$
where the symbol '$\cdot$' denotes the scalar product. Consequently, the inverse Fourier transform is defined as:
$$h(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}}\, H(\mathbf{g}) = \int H(\mathbf{g}) \exp(i 2\pi\, \mathbf{g} \cdot \mathbf{r})\, d\mathbf{g}. \tag{92}$$
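Two of the relations in this subsection lend themselves to a quick numerical check: the relativistic wavelength of Eq. (85) and the properties of the Gaussian 1s-state of Eq. (88), namely its quadratic normalization and the excitation coefficient of Eq. (89). The Python sketch below is only an illustration; the column width `A_N` is a hypothetical value, and the radial integrals are evaluated with a simple midpoint rule rather than any method used by the authors.

```python
import math

HC = 12.398     # h*c in keV * angstrom
M0C2 = 511.0    # electron rest energy in keV

def electron_wavelength(e0_kev):
    """Relativistic electron wavelength in angstrom, Eq. (85)."""
    return HC / math.sqrt(e0_kev * (2.0 * M0C2 + e0_kev))

def phi_1s(r, a):
    """Gaussian 1s-state of Eq. (88) at radius r for column width a."""
    return math.exp(-r * r / (4.0 * a * a)) / (a * math.sqrt(2.0 * math.pi))

def radial_integral(f, r_max, steps=20000):
    """Midpoint-rule approximation of the 2D integral of a radially symmetric f(r)."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += f(r) * 2.0 * math.pi * r
    return total * dr

A_N = 0.2   # hypothetical column width in angstrom

wavelength_300 = electron_wavelength(300.0)
norm = radial_integral(lambda r: phi_1s(r, A_N) ** 2, 20.0 * A_N)
c_n = radial_integral(lambda r: phi_1s(r, A_N), 20.0 * A_N)

print(f"wavelength at 300 keV: {wavelength_300:.5f} angstrom")
print(f"integral of phi^2:     {norm:.6f}")
print(f"c_n numerical:         {c_n:.6f}")
print(f"c_n from Eq. (89):     {2.0 * math.sqrt(2.0 * math.pi) * A_N:.6f}")
```

The integral of the squared 1s-state should come out as one, and the numerically integrated excitation coefficient should agree with the closed form of Eq. (89).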
2. The Image Wave

In the second step of the derivation of the parametric statistical model of the observations, an expression for the image wave $\psi_i(\mathbf{r}, z)$ is obtained. This is a complex electron wave function at the image plane. At this stage, most microscope settings will enter the model. The image wave is written as the convolution product of the exit wave with the point spread function $t(\mathbf{r})$ of the electron microscope (van Dyck, 2002):
$$\psi_i(\mathbf{r}, z) = \psi(\mathbf{r}, z) * t(\mathbf{r}). \tag{93}$$
The two-dimensional Fourier transform of $t(\mathbf{r})$ represents the microscope's transfer function $T(\mathbf{g})$. Following van Dyck (2002), $T(\mathbf{g})$ is radially symmetric and described as:
$$T(\mathbf{g}) = T(g) = A(g)\, D_s(g)\, D_t(g) \exp(-i\chi(g)), \tag{94}$$
where $A(g)$ is a circular aperture function, given by:
$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{95}$$
with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_o$ is equal to $g_{ap}\lambda$. In what follows, it will be assumed that there is no objective aperture so that $A(g)$ is constant and equal to 1. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and given by:
$$\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{96}$$
with $\varepsilon$ being the defocus. Notice that higher order aberration effects such as 2-fold astigmatism, 3-fold astigmatism, and axial coma have been neglected. They could be included in the phase shift as well (Thust, Overwijk, Coene,
and Lentzen, 1996). In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by the damping envelope functions $D_s(g)$ and $D_t(g)$, respectively. For a Gaussian incoherent effective electron source, the function $D_s(g)$ is described as (Frank, 1973; Spence, 1988):
$$D_s(g) = \exp\left( -\frac{\alpha_c^2}{\ln 2} \left( \pi C_s \lambda^2 g^3 + \pi \varepsilon g \right)^2 \right), \tag{97}$$
where $\alpha_c$ is the semi-angle of beam convergence. For a Gaussian spread of defocus, the function $D_t(g)$ is described as (Fejes, 1977):
$$D_t(g) = \exp\left( -\frac{\pi^2 \lambda^2 \Delta^2 g^4}{2} \right), \tag{98}$$
where $\Delta$ is the defocus spread due to chromatic aberration, which is given by (O'Keefe, 1992; Spence, 1988):
$$\Delta = C_c \sqrt{ 4 \left( \frac{\Delta I}{I_0} \right)^2 + \left( \frac{\Delta V}{V_0} \right)^2 + \left( \frac{\Delta E}{E_0} \right)^2 }. \tag{99}$$
Notice that the defocus spread $\Delta$, which is here defined as the standard deviation, corresponds to a half width at $1/e$ height equal to $\sqrt{2}\Delta$. In Eq. (99), $C_c$ is the chromatic aberration coefficient, $\Delta V$ and $\Delta I$ are the standard deviations of the statistically independent fluctuations of the accelerating voltage $V_0$ and objective lens current $I_0$, respectively, while $\Delta E$ is the intrinsic energy spread, that is, the standard deviation of the statistically independent fluctuations of the incident electron energy $E_0$ of the electrons in the electron source, defined as:
$$\Delta E = \left[ \int_{-\infty}^{\infty} (E - E_0)^2\, p(E)\, dE \right]^{1/2}, \tag{100}$$
where $p(E)$ is the energy probability density function. It is usually assumed that $p(E)$ is well approximated by a Gaussian function:
$$p(E) = \frac{1}{\Delta E \sqrt{2\pi}} \exp\left( -\frac{(E - E_0)^2}{2 (\Delta E)^2} \right) \tag{101}$$
with expectation value $E_0$ and standard deviation $\Delta E$. Straightforward calculations show that the relationship between the standard deviation $\Delta E$ and the full width at half maximum height of the energy distribution described by Eq. (101) is given by:
$$\mathrm{FWHM} = \sqrt{8 \ln 2}\, \Delta E \approx 2.35\, \Delta E. \tag{102}$$
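The ingredients of the transfer function introduced so far, Eqs. (94) and (96)–(99) together with the FWHM relation of Eq. (102), can be collected in a few lines of Python. This is a minimal sketch under stated assumptions: no objective aperture ($A(g) = 1$), the sign convention $\exp(-i\chi)$ in Eq. (94), and purely illustrative microscope settings that are not taken from the text. Function and variable names are hypothetical.

```python
import cmath
import math

def fwhm_to_std(fwhm):
    """Standard deviation of a Gaussian energy distribution from its FWHM, Eq. (102)."""
    return fwhm / math.sqrt(8.0 * math.log(2.0))

def defocus_spread(cc, e0, delta_e, dv_over_v0=0.0, di_over_i0=0.0):
    """Defocus spread Delta of Eq. (99); result has the length unit of cc."""
    return cc * math.sqrt(4.0 * di_over_i0 ** 2 + dv_over_v0 ** 2 + (delta_e / e0) ** 2)

def chi(g, wavelength, defocus, cs):
    """Aberration phase shift of Eq. (96)."""
    return (math.pi * defocus * wavelength * g ** 2
            + 0.5 * math.pi * cs * wavelength ** 3 * g ** 4)

def d_spatial(g, wavelength, defocus, cs, alpha_c):
    """Spatial coherence envelope of Eq. (97)."""
    arg = math.pi * cs * wavelength ** 2 * g ** 3 + math.pi * defocus * g
    return math.exp(-(alpha_c ** 2 / math.log(2.0)) * arg ** 2)

def d_temporal(g, wavelength, delta):
    """Temporal coherence envelope of Eq. (98)."""
    return math.exp(-0.5 * (math.pi * wavelength * delta) ** 2 * g ** 4)

def transfer_function(g, wavelength, defocus, cs, alpha_c, delta):
    """T(g) of Eq. (94) with A(g) = 1; sign convention exp(-i*chi) is assumed."""
    envelope = d_spatial(g, wavelength, defocus, cs, alpha_c) * d_temporal(g, wavelength, delta)
    return envelope * cmath.exp(-1j * chi(g, wavelength, defocus, cs))

# Illustrative settings (assumptions): lengths in angstrom, energies in eV.
WAVELENGTH = 0.0197    # about 300 kV
CS = 0.5e7             # 0.5 mm
CC = 1.3e7             # 1.3 mm
E0 = 300.0e3           # 300 keV
ALPHA_C = 0.1e-3       # beam convergence semi-angle in rad
DEFOCUS = -400.0       # arbitrary defocus value

delta_e = fwhm_to_std(0.7)                 # 0.7 eV FWHM energy spread
delta = defocus_spread(CC, E0, delta_e)    # voltage and current terms set to zero
t_half = transfer_function(0.5, WAVELENGTH, DEFOCUS, CS, ALPHA_C, delta)

print(f"energy spread (std): {delta_e:.3f} eV")
print(f"defocus spread:      {delta:.2f} angstrom")
print(f"|T(0.5 / angstrom)|: {abs(t_half):.4f}")
```

Keeping the voltage and lens current terms as optional arguments makes it easy to compare the full Eq. (99) with the case in which only the intrinsic energy spread contributes.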
In the following, it is assumed that DV/V0 and DI/I0 are small in comparison to DE/E0, so that they may be neglected and Eq. (99) reduces to:
\[ \Delta = C_c \frac{\Delta E}{E_0}. \tag{103} \]

Notice that the quasi-coherent approximation used is of limited validity and is certainly not the state of the art for treating partial coherence. According to the work of Frank (1973), this approximation is only valid for a small effective source and a central 'unscattered' beam much stronger than any other (Spence, 1988). A more correct analytical treatment may be achieved via autocorrelations in Fourier space, incorporating the microscope properties in the form of a transmission-cross-coefficient (Born and Wolf, 1999; Frank, 1973; Ishizuka, 1980). However, such a treatment would severely and unnecessarily complicate the derivation of the optimal statistical experimental design and the explanation of the basic principles governing the obtained results. Moreover, it should be mentioned that the analysis via transmission-cross-coefficients is also not perfect, since it does not take into account the influence of beam convergence and defocus spread on the scattering of the electrons by the object (van Dyck, 2002).

3. The Image Intensity Distribution

Next, an expression for the image intensity distribution I(r) will be derived. This is given by the modulus square of the image wave. Hence, it follows from Eqs. (84) and (93) that

\[ I(\mathbf{r}) = |\psi_i(\mathbf{r}, z)|^2 = \left| 1 + \sum_{n=1}^{n_c} c_n\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) * t(\mathbf{r}) \left\{ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right\} \right|^2 \tag{104} \]
where it is taken into account that 1 * t(r) is equal to 1. Furthermore,

\[ \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) * t(\mathbf{r}) \tag{105} \]

represents the 1s-state function convoluted with the microscope's point spread function, which is equal to

\[ 2\pi \int_0^{\infty} \Phi_{1s,n}(g)\, T(g)\, J_0(2\pi g |\mathbf{r} - \boldsymbol{\beta}_n|)\, g\, dg \tag{106} \]

since φ1s,n(r) and t(r) are both radially symmetric functions. In Eq. (106), J0(.) is the zeroth-order Bessel function of the first kind.
Furthermore, notice that it can be seen from Eq. (104) that for identical atom columns, the contrast varies periodically with thickness, where the periodicity is given by (van Dyck and Chen, 1999a):
\[ D_{1s} = \frac{2 E_0 \lambda}{E_{1s,n}} \tag{107} \]

which is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns.

4. The Image Recording

Next, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. As a recording device, a CCD camera is chosen, consisting of K × L equidistant pixels of area Δx × Δy, where Δx and Δy are the sampling distances in the x- and y-direction, respectively. Pixel (k, l) corresponds to position (xk yl)^T = (x1 + (k − 1)Δx  y1 + (l − 1)Δy)^T of the recorded image, with k = 1, ..., K and l = 1, ..., L, and (x1 y1)^T represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is centered about (0 0)^T. It is chosen sufficiently large so as to guarantee that the tails of the microscope's point spread function t(r) are collected. Furthermore, it is assumed that the quantum efficiency of the CCD camera is sufficiently high to detect single electrons. The probability pkl that an electron hits a pixel (k, l) is then approximately given by

\[ p_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{\mathrm{norm}}}\, \Delta x\, \Delta y \tag{108} \]
with I(r) given by Eq. (104), rkl = (xk yl)^T, and Inorm a normalization factor given by:

\[ I_{\mathrm{norm}} = \int I(\mathbf{r})\, d\mathbf{r}, \tag{109} \]

where the integral extends over the whole FOV. This means that for a given total number of detected electrons N, the number of electrons expected to be found at pixel (k, l) is equal to:

\[ \lambda_{kl} = N p_{kl}. \tag{110} \]
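The construction of Eqs. (108) to (110) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_intensity` is a hypothetical stand-in for the image intensity of Eq. (104), the normalization integral of Eq. (109) is approximated by a Riemann sum over the pixel grid, and all function names are invented for this example.

```python
import math


def pixel_centres(first, step, count):
    """Pixel positions first + (k - 1) * step for k = 1, ..., count."""
    return [first + k * step for k in range(count)]


def expectation_model(intensity, n_total, dx, dy, n_x, n_y):
    """Return lambda_kl = N * p_kl on a K x L grid centred about the origin.

    intensity is any callable I(x, y); here it stands in for Eq. (104).
    """
    xs = pixel_centres(-0.5 * (n_x - 1) * dx, dx, n_x)
    ys = pixel_centres(-0.5 * (n_y - 1) * dy, dy, n_y)
    samples = [[intensity(x, y) for y in ys] for x in xs]
    # Riemann-sum approximation of the normalization integral, Eq. (109).
    i_norm = sum(sum(row) for row in samples) * dx * dy
    # Pixel probabilities of Eq. (108), scaled by N as in Eq. (110).
    return [[n_total * value * dx * dy / i_norm for value in row] for row in samples]


def toy_intensity(x, y, width=0.5):
    """Hypothetical intensity: unit background plus a Gaussian peak (not Eq. (104))."""
    return 1.0 + 0.5 * math.exp(-(x * x + y * y) / (2.0 * width ** 2))


if __name__ == "__main__":
    counts = expectation_model(toy_intensity, 1.0e6, 0.2, 0.2, 21, 21)
    print("expected counts at the centre pixel:", counts[10][10])
```

Because the same grid is used for the samples and for the normalization, the expected counts sum to N, which mirrors the statement that every detected electron lands somewhere in the FOV.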
Equation (110) defines the expectations of the observations wkl recorded by the detector and is hence called the expectation model. The total number of detected electrons N is equal to the total number of incident electrons, that
is, the number of electrons that interact with the object, since it has been assumed that there is no objective aperture. In the presence of an objective aperture, part of the electrons would be lost. The total number of incident electrons depends on the reduced brightness (Br) of the electron source, the incident electron energy (E0), the recording time (t), the field of view (FOV), the semi-angle of beam convergence (αc), and the electron charge (e = 1.6 × 10⁻¹⁹ C), according to the formula (Spence, 1988):

\[ N = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2}. \tag{111} \]
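To get a feeling for the magnitude of N, Eq. (111) can be evaluated with the original settings of Table 7. The sketch below is illustrative only: the function name is invented, and it assumes that E0 is supplied in electron volts, so that E0/e is numerically the accelerating voltage in volts and one factor of e cancels.

```python
import math

ELECTRON_CHARGE = 1.6e-19  # C, the value quoted in the text


def incident_electrons(reduced_brightness, energy_ev, time_s, fov_m2, alpha_rad):
    """Total number of incident electrons according to Eq. (111).

    reduced_brightness: B_r in A m^-2 sr^-1 V^-1
    energy_ev: incident electron energy E0 in eV (assumption: eV, not joule)
    time_s: recording time in s
    fov_m2: area of the field of view in m^2
    alpha_rad: semi-angle of beam convergence in rad
    """
    # B_r * (E0 / e) is the brightness; times area and solid angle gives a current.
    current = reduced_brightness * energy_ev * fov_m2 * math.pi * alpha_rad ** 2
    # Current times recording time is a charge; dividing by e counts electrons.
    return current * time_s / ELECTRON_CHARGE


if __name__ == "__main__":
    # Table 7: 100 x 100 pixels of 0.2 angstrom, i.e. a FOV of (20 angstrom)^2.
    fov = (100 * 0.2e-10) ** 2
    n = incident_electrons(2.0e7, 300.0e3, 1.0, fov, 1.0e-4)
    print("incident electrons at 300 keV:", n)
```

With these settings the result is a few million electrons per image. Since N is proportional to E0, lowering the incident energy from 300 keV to 50 keV at otherwise identical settings reduces N by a factor of six.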
The reduced brightness of the electron source is defined as the brightness of the electron source per accelerating voltage, whereas the brightness of the electron source describes the current density per unit solid angle of this source (Williams and Carter, 1996). In the absence of electron-electron interactions, the reduced brightness is a conserved quantity. This means that it is the same at every point on the optical axis (van Veen, Hagen, Barth, and Kruit, 2001). In what follows, the importance of this quantity for the performance of CTEM experiments will be studied.

5. The Incorporation of a Monochromator

In this section, special attention is paid to the incorporation of a monochromator into the expectation model (den Dekker, van Aert, van Dyck, van den Bos, and Geuens, 2001). Suppose that a monochromator is incorporated in the imaging system below the electron source, removing all electrons, except those whose energy lies within a prespecified energy range [E0 − δE/2, E0 + δE/2]. The monochromator reduces the standard deviation of the energy spread from ΔE, which is defined by Eq. (100), to ΔEm, which is described by:

\[ \Delta E_m = \left( \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} (E - E_0)^2\, p'(E)\, dE \right)^{1/2} \tag{112} \]

with p'(E) being the energy distribution of the electrons transmitted by the monochromator, which is given by:

\[ p'(E) = \begin{cases} \dfrac{p(E)}{\int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E)\, dE} & \text{if } E_0 - \dfrac{\delta E}{2} \le E \le E_0 + \dfrac{\delta E}{2} \\[2ex] 0 & \text{otherwise} \end{cases} \tag{113} \]
with p(E) defined as in Eq. (101). Straightforward calculations, using Eqs. (101), (112), and (113), then show that the standard deviation defining the
energy spread of the electrons transmitted by the monochromator may be described as:

\[ \Delta E_m = \Delta E \sqrt{ 1 - \frac{ \delta E\; p\left( E_0 - \frac{\delta E}{2} \right) }{ \mathrm{Erf}\left( \frac{\delta E}{2\sqrt{2}\, \Delta E} \right) } } \tag{114} \]

with Erf(.) being the error function. As an unfavorable side effect of the incorporation of a monochromator, the total number of incident electrons that interact with the object is reduced if the recording time t is kept constant. Only a fraction of the total number of electrons given by Eq. (111) will be recorded. It may be shown that the total number of detected electrons by use of a monochromator is given by:

\[ N = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2} \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E)\, dE = \frac{B_r E_0\, t\, \mathrm{FOV}\, \pi \alpha_c^2}{e^2}\, \mathrm{Erf}\left( \frac{\delta E}{2\sqrt{2}\, \Delta E} \right). \tag{115} \]
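The trade-off expressed by Eqs. (114) and (115), a smaller energy spread at the price of fewer electrons, can be explored numerically. The following Python sketch is illustrative: the function names are invented, energies are measured relative to E0, and the midpoint-rule routine is added here only as an independent check of the closed form against Eqs. (112) and (113).

```python
import math


def gaussian_pdf(energy, mean, sigma):
    """Gaussian energy density p(E) of Eq. (101)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-((energy - mean) ** 2) / (2.0 * sigma ** 2))


def transmitted_fraction(slit_width, sigma):
    """Fraction of electrons passing the slit, the Erf factor of Eq. (115)."""
    return math.erf(slit_width / (2.0 * math.sqrt(2.0) * sigma))


def monochromated_sigma(slit_width, sigma):
    """Energy spread after the monochromator, Eq. (114); needs slit_width > 0."""
    edge_density = gaussian_pdf(0.5 * slit_width, 0.0, sigma)
    ratio = slit_width * edge_density / transmitted_fraction(slit_width, sigma)
    return sigma * math.sqrt(1.0 - ratio)


def monochromated_sigma_numeric(slit_width, sigma, steps=20000):
    """Midpoint-rule evaluation of Eqs. (112) and (113), used as a cross-check."""
    h = slit_width / steps
    second_moment = 0.0
    weight = 0.0
    for i in range(steps):
        energy = -0.5 * slit_width + (i + 0.5) * h
        density = gaussian_pdf(energy, 0.0, sigma)
        second_moment += energy * energy * density
        weight += density
    return math.sqrt(second_moment / weight)


if __name__ == "__main__":
    sigma = 0.75  # eV, intrinsic energy spread from Table 7
    for slit in (0.25, 0.5, 1.0, 2.0):  # eV, example slit widths
        print(slit, monochromated_sigma(slit, sigma), transmitted_fraction(slit, sigma))
```

For a very wide slit the energy spread tends to ΔE and the transmitted fraction tends to 1, so the expressions reduce to the case without monochromator, as they should.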
Hence, the expectation model incorporating a monochromator is still given by Eq. (110), but with a reduced total number of electrons N as in Eq. (115) instead of Eq. (111), and a reduced energy spread of the electrons as in Eq. (114) instead of Eq. (100). For CTEM, the observations are electron counting results, which are supposed to be independent and Poisson distributed. Therefore, the joint probability density function of the observations P(ω; β), representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to K × L and the expectation model is given by Eq. (110). The parameter vector β = (βx1 ... βxnc βy1 ... βync)^T consists of the x- and y-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector β can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, the optimal statistical experimental design of high-resolution CTEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the atom columns can be estimated. Therefore, the CRLB with respect to the position coordinates will be computed from the
parametric statistical model of the observations discussed in the previous section. This CRLB was discussed in Section II. Then, a scalar measure of this CRLB, that is, a function of the elements of the CRLB, will be chosen as optimality criterion, which will then be evaluated and optimized as a function of the microscope settings. An overview of the microscope settings will be given in Section IV.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section IV.C.2, the results of the numerical evaluation of the dependence of the chosen optimality criterion on the microscope settings will be discussed. This will be done for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and, moreover, if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section IV.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may relatively easily be extended to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Settings

An overview of the microscope settings, which enter the parametric statistical model of the CTEM observations, is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design. The settings describing the illuminating electron beam are the electron wavelength λ, the semi-angle of beam convergence αc, the standard deviation ΔE of the intrinsic energy spread of the electrons in the electron source, the reduced brightness Br of the electron source, and the width δE of the energy selection slit (in the presence of a monochromator).
The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. The effect of these settings on the precision with which atom column positions can be estimated will be studied. The semi-angle of beam convergence may be varied experimentally, but it will be held fixed and sufficiently small in the present analysis in order to guarantee that the quasi-coherent approximation made in the derivation of the expectation model is reasonable. Moreover, typical values will be chosen for the standard deviation of the intrinsic energy spread of the electrons, in agreement with electron sources used today. The width of the energy selection slit will be variable, thus resulting in a variable energy spread ΔEm of the electrons.
The microscope settings specifying the objective lens are the defocus ε, the spherical aberration constant Cs, and the chromatic aberration constant Cc. The defocus will be variable. For most electron microscopes, the spherical and chromatic aberration constants are fixed properties of the microscope; however, by incorporating a spherical or chromatic aberration corrector, these settings are (or will become) tunable. Therefore, it is interesting to study the effect of these settings on the precision. The microscope settings describing the image recording are the pixel sizes Δx and Δy, the number of pixels K and L in the x- and y-direction, respectively, and the recording time t. The pixel sizes Δx and Δy will be kept constant. In agreement with the results presented in Section III, it may be shown that the precision will generally improve with smaller pixel sizes, with all other settings kept constant. However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel signal-to-noise ratio (SNR) decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described elsewhere (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999; den Dekker, Sijbers, and van Dyck, 1999; van Aert, den Dekker, van Dyck, and van den Bos, 2002a). The number of pixels K and L, defining the FOV for given pixel sizes Δx and Δy, will be chosen fixed, but large enough so as to guarantee that the tails of the microscope's transfer function are collected in the FOV.

2. Numerical Results

In this section, the results of the numerical evaluation of the dependence of the attainable precision, that is, the CRLB, on the microscope settings will be studied. This section is divided into four parts.
First, general comments, which should be kept in mind during the reading of this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, optimal experimental designs for isolated atom columns will be computed. The corresponding highest attainable precisions will be compared to the attainable precisions at the original microscope settings. Third, the influence of neighboring atom columns on these optimal designs will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion.

a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. They are related to the
comparison of the original and optimal microscope settings and to the structure parameters of the objects under study.

i. Original and Optimal Microscope Settings. Unless otherwise mentioned, the values for the original, non-optimized microscope settings are given in Table 7. These values are typical for today's electron microscopes. In what follows, they will be compared to the optimal values, which result in the highest attainable precision. In principle, the optimal values should be found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings. In this space, whose dimension is equal to the number of microscope settings, every point represents a set of values for the microscope settings. However, it has been found that, apart from the optimal defocus, the optimal value of each of these microscope settings is independent of the other settings. Consequently, the optimization of most microscope settings may be performed one at a time, instead of simultaneously. This kind of optimization is also justified from a practical point of view. Suppose, for example, that an experimenter has an electron microscope with spherical aberration corrector but without chromatic aberration corrector. This microscope will allow him or her to tune the spherical aberration constant, whereas the chromatic aberration constant is fixed. In this case, one is only interested in knowing the optimal spherical aberration constant for a given chromatic aberration constant, instead of knowing the combined optimal spherical and chromatic aberration constant.
TABLE 7
Original Microscope Settings

Microscope setting            Value
αc (rad)                      10⁻⁴
ΔE (eV)                       0.75
Br (A m⁻² sr⁻¹ V⁻¹)           2 × 10⁷
Cs (mm)                       0.5
Cc (mm)                       1.3
Δx (Å)                        0.2
Δy (Å)                        0.2
K                             100
L                             100
t (s)                         1
In the following, the attainable precision will be computed as a function of the following microscope settings:

• Defocus
• Spherical aberration constant
• Chromatic aberration constant
• Energy spread of a monochromator
• Reduced brightness of the electron source
The evaluation of the precision as a function of the defocus will be done for a range of spherical aberration constants, for a given incident electron energy and corresponding electron wavelength. In this way, it will be possible to express the optimal defocus in terms of the spherical aberration constant and electron wavelength. The evaluation as a function of the other microscope settings will be performed separately. Moreover, microscopes operating at an incident electron energy of both 300 keV and 50 keV will be considered. Unless otherwise stated, the values of the microscope settings different from those to be optimized are given in Table 7 and the defocus is adjusted to its optimal value, which will be shown to be given, to a good approximation, by Eqs. (118) and (119). The results of the evaluation of the attainable precision as a function of the individual microscope settings will be presented in figures. In these figures, the point corresponding to the original microscope settings will be marked with a symbol. Use of the same symbol in different figures indicates that the corresponding microscope settings are identical. This makes comparison between different figures easier. The following three symbols with corresponding microscope settings will be used:

• E0 = 300 keV, optimal defocus, other settings are given in Table 7.
• E0 = 50 keV, Cc = 0 mm, optimal defocus, other settings are given in Table 7.
• E0 = 50 keV, Cs = 0 mm, optimal defocus, other settings are given in Table 7.
ii. Structure Parameters. The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for both silicon [100] and gold [100] atom columns for which the width of the 1s-state and its energy are given in Tables 8 and 9 for a microscope operating at 300 keV and 50 keV, respectively. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.
TABLE 8
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å² and E0 = 300 keV) of a Silicon [100] and a Gold [100] Atom Column

Structure parameter     Si [100]     Au [100]
an (Å)                  0.34         0.13
E1s,n (eV)              20.2         210.8

TABLE 9
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å² and E0 = 50 keV) of a Silicon [100] and a Gold [100] Atom Column

Structure parameter     Si [100]     Au [100]
an (Å)                  0.45         0.16
E1s,n (eV)              12.4         148.3

TABLE 10
Structure Parameters of an Isolated Atom Column

Structure parameter     Value
βx (Å)                  0
βy (Å)                  0
z (Å)                   E0 λ / E1s,n
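The thickness entry of Table 10 is half the extinction distance of Eq. (107). As a worked illustration, the Python sketch below evaluates it for the columns of Tables 8 and 9. It rests on two assumptions that are not part of the tables themselves: the wavelengths 0.02 Å (300 keV) and 0.05 Å (50 keV) are the rounded values quoted in the text, and the magnitude of the tabulated 1s-state energy is used so that the thickness comes out positive. The function names are illustrative.

```python
def extinction_distance(energy_ev, wavelength_angstrom, e1s_ev):
    """Extinction distance of Eq. (107) in angstrom.

    The magnitude of the 1s-state energy is used (assumption about the sign
    convention), with E0 and E1s,n both in eV.
    """
    return 2.0 * energy_ev * wavelength_angstrom / abs(e1s_ev)


def half_extinction_thickness(energy_ev, wavelength_angstrom, e1s_ev):
    """Object thickness z of Table 10: half the extinction distance."""
    return 0.5 * extinction_distance(energy_ev, wavelength_angstrom, e1s_ev)


if __name__ == "__main__":
    # (label, E0 in eV, wavelength in angstrom, E1s,n in eV) from Tables 8 and 9.
    columns = [
        ("Si [100], 300 keV", 300.0e3, 0.02, 20.2),
        ("Au [100], 300 keV", 300.0e3, 0.02, 210.8),
        ("Si [100], 50 keV", 50.0e3, 0.05, 12.4),
        ("Au [100], 50 keV", 50.0e3, 0.05, 148.3),
    ]
    for label, energy, wavelength, e1s in columns:
        print(label, half_extinction_thickness(energy, wavelength, e1s))
```

At 300 keV this gives a thickness of roughly 297 Å for the silicon column and roughly 28 Å for the gold column, which illustrates how strongly the extinction distance depends on the column type.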
b. Isolated Atom Columns
i. Structure Parameters. For isolated atom columns, the structure parameters other than the width of the 1s-state and its energy, that is, the atom column positions and the object thickness, are given in Table 10, unless otherwise stated. The object thickness is equal to half the extinction distance, which is given by Eq. (107). At this thickness and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions (Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002).

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates β = (βx βy)^T can be measured. This attainable precision (in terms of the variance) is represented
by the diagonal elements σ²βx and σ²βy of the CRLB. An expression for these elements will be derived in the following paragraphs. For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix F associated with the position coordinates. The (r, s)th element of F is defined by Eq. (12):

\[ F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \tag{116} \]

with λkl the expected number of electrons at the pixel (k, l). An expression for the elements Frs is found by substitution of the expectation model given by Eq. (110) as derived in Section IV.B and its derivatives with respect to the position coordinates into Eq. (116). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for Frs. For the radially symmetrical expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

\[ \sigma_{\beta_x}^2 = \sigma_{\beta_y}^2 = \left[ F^{-1} \right]_{11} \tag{117} \]
with [F⁻¹]11 the (1, 1)th element of the CRLB, that is, of F⁻¹. In what follows, the precision will be represented by the lower bound on the standard deviation σβx and σβy, that is, the square root of the right-hand member of Eq. (117). It will be used as optimality criterion for the evaluation and optimization of the experimental design. Therefore, this chosen optimality criterion will be calculated for various types of atom columns as a function of the defocus, the spherical aberration constant, the chromatic aberration constant, and the energy spread of a monochromator. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The constraint is either the radiation sensitivity of the object under study or the specimen drift. Therefore, either the incident electron dose per square Å or the recording time has to be kept within the constraints.
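The evaluation of Eqs. (116) and (117) can be illustrated with a small, self-contained Python sketch. It is not the computation used for the figures of this section: the expectation model is replaced by a hypothetical two-dimensional Gaussian peak, the derivatives are approximated by central differences, and all names and numerical values are assumptions chosen for the example.

```python
import math


def gaussian_counts(bx, by, n_total, width, step, n_pix):
    """Expected counts lambda_kl for a 2D Gaussian peak centred at (bx, by).

    Hypothetical stand-in for the expectation model of Eq. (110).
    """
    first = -0.5 * (n_pix - 1) * step
    grid = []
    for k in range(n_pix):
        row = []
        for l in range(n_pix):
            x = first + k * step
            y = first + l * step
            r2 = (x - bx) ** 2 + (y - by) ** 2
            density = math.exp(-r2 / (2.0 * width ** 2)) / (2.0 * math.pi * width ** 2)
            row.append(n_total * density * step * step)
        grid.append(row)
    return grid


def fisher_matrix(model, bx, by, h=1e-4):
    """2 x 2 Fisher information matrix of Eq. (116), central-difference derivatives."""
    base = model(bx, by)
    x_plus, x_minus = model(bx + h, by), model(bx - h, by)
    y_plus, y_minus = model(bx, by + h), model(bx, by - h)
    fisher = [[0.0, 0.0], [0.0, 0.0]]
    for k, row in enumerate(base):
        for l, lam in enumerate(row):
            grad = (
                (x_plus[k][l] - x_minus[k][l]) / (2.0 * h),
                (y_plus[k][l] - y_minus[k][l]) / (2.0 * h),
            )
            for r in range(2):
                for s in range(2):
                    fisher[r][s] += grad[r] * grad[s] / lam
    return fisher


def crlb_standard_deviations(fisher):
    """Square roots of the diagonal elements of the inverse of F, cf. Eq. (117)."""
    det = fisher[0][0] * fisher[1][1] - fisher[0][1] * fisher[1][0]
    return math.sqrt(fisher[1][1] / det), math.sqrt(fisher[0][0] / det)


def example_model(bx, by):
    """Illustrative settings: 10^4 electrons, peak width 0.5, 61 x 61 pixels of 0.1."""
    return gaussian_counts(bx, by, n_total=1.0e4, width=0.5, step=0.1, n_pix=61)


if __name__ == "__main__":
    sigma_x, sigma_y = crlb_standard_deviations(fisher_matrix(example_model, 0.0, 0.0))
    print("lower bounds on the standard deviations:", sigma_x, sigma_y)
```

For this toy model the two bounds coincide, as expected for a radially symmetric peak, and they are close to the peak width divided by the square root of the number of electrons. This also makes the inverse proportionality of the bound to the square root of N explicit.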
iii. Optimal Defocus Value. First, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant and the electron wavelength. The precision is represented by the square root of the right-hand member of Eq. (117). In Figure 18, it is plotted for a silicon [100] atom column as a function of the defocus ε and the spherical aberration constant Cs for a given electron wavelength λ. Notice that the evaluation is done for positive as well as for negative Cs-values. Negative Cs-values may be obtained by use of a spherical
Figure 18. The lower bound on the standard deviation of the position coordinates of an isolated silicon atom column as a function of the spherical aberration constant and the defocus. The solid white curve is described by Eqs. (118) and (119) and the dotted white curve describes the numerically found optimal defocus values as a function of the considered spherical aberration constants.
aberration corrector (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). The solid white curve shown in Figure 18 is described by the relation

\[ \varepsilon = \sqrt{ -\frac{4}{3} C_s \lambda } \quad \text{if } C_s < 0, \tag{118} \]

\[ \varepsilon = -\sqrt{ \frac{4}{3} C_s \lambda } \quad \text{if } C_s \ge 0, \tag{119} \]
where Eq. (119) is the well-known Scherzer defocus (Scherzer, 1949), which is generally believed to be optimal in terms of point resolution and contrast (Spence, 1988). The dotted white curve shown in Figure 18 describes the numerically found optimal defocus values as a function of the considered spherical aberration constants. From the comparison of the solid and dotted
white curve in Figure 18, it follows that the Scherzer defocus (for positive Cs) and Eq. (118) (for negative Cs) are close to the optimal defocus values in terms of precision, except for values of Cs that are significantly higher than the original setting of 0.5 mm. Moreover, for a given spherical aberration constant, operating at the corresponding optimal defocus instead of at the defocus described by Eqs. (118) or (119) is hardly beneficial. Therefore, the optimal defocus value, in terms of spherical aberration constant and electron wavelength, is approximately given by Eqs. (118) and (119). This result is in agreement with the results presented in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM. Furthermore, this finding does not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. In Figure 18, the recording time as well as the number of incident electrons per square Å are fixed. The optimal defocus value does not change if, for example, longer recording times or more incident electrons per square Å would be allowed. The reason for this is that the precision is inversely proportional to the square root of the total number of detected electrons N, which, in its turn, is directly proportional to the recording time. This follows from Eqs. (110), (111), (115), (116), and (117). Therefore, for other values of the recording time or the number of incident electrons per square Å, only the actual values for the standard deviation ascribed to Figure 18 would be different, whereas the optimal defocus value would be the same. From now on, the defocus will be adjusted to the value given by Eq. (118) for negative Cs-values and to the Scherzer defocus, given by Eq.
(119), for positive Cs-values since these are useful approximations of the optimal defocus value.

iv. Optimal Spherical Aberration Constant. Subsequently, the dependence of the precision on the spherical aberration constant is studied. Usually, the spherical aberration constant is a fixed property of the electron microscope. However, by incorporating a spherical aberration corrector, it is tunable and may range from the value of the original uncorrected microscope over zero and even to negative values (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). Thus far, the advantages of a spherical aberration corrector were usually discussed in the literature in terms of qualitative structure determination, that is, in terms of the possibility to perceive two atom columns separately in an image. The optimality criterion used was the point resolution ρs of the electron microscope, which is equal to $0.66\,(C_s \lambda^3)^{1/4}$. By use of a spherical aberration corrector, the point resolution improves and, consequently, structure-imaging artifacts due to
contrast delocalization are reduced (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). In the present analysis, however, the possible benefit of a spherical aberration corrector is discussed in terms of the attainable statistical precision with which position coordinates of an atom column can be determined. This is the criterion of importance in the framework of quantitative structure determination, which will gain importance in the future. This criterion takes the object and the total number of detected electrons into account. First, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV, corresponding to an accelerating voltage of 300 kV and an electron wavelength of 0.02 Å. In Figure 19, it is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the spherical aberration constant Cs. The optimal spherical aberration constant in terms of precision is the one that corresponds to the minimum of the curve shown in Figure 19. From Figure 19, it follows that the optimal spherical aberration constant is equal to 0 mm in this example. For light atom columns such as silicon [100], the precision in terms of the standard deviation that is gained by reducing the spherical aberration constant from the original setting of 0.5 mm to the optimal setting of 0 mm is a factor of 1.3. For heavy atom columns such as gold [100], the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.9. Therefore, correction of spherical aberration is more useful in terms of precision for heavy than for light atom columns. Notice, however,
Figure 19. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV.
that for silicon, it follows from Figure 18 that a gain in precision comparable to the mentioned factor of 1.3 may be obtained without spherical aberration corrector, by using a slightly different defocus value than Scherzer's. The same conclusion may be obtained for gold. Next, the previous evaluation has been repeated, but this time for a thinner object. The object thickness is assumed to be equal to 50 Å. The results are shown in Figure 20. From this figure, it is concluded that for thin objects, the optimal spherical aberration constant is different from 0 mm. The reason for this is that for the thin object considered, a spherical aberration constant equal to 0 mm and a defocus adjusted to Scherzer's lead to images with very low contrast, which result in extremely high standard deviations of the position coordinates. This is also found in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM. In that paper, intuitive interpretations of the results may be found. For a gold [100] atom column, the optimal spherical aberration constant is close to but different from 0 mm, whereas for a silicon [100] atom column, it is negative and equal to −0.35 mm. Therefore, from the comparison of Figures 19 and 20, it is concluded that the optimal spherical aberration constant clearly depends on the object under study. This finding is in contrast to what is found in Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban (2002), where expressions are derived for the optimal spherical aberration constant in terms of phase
Figure 20. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV. The object thickness is equal to 50 Å.
contrast and delocalization. The obtained expressions do not depend on structure parameters of the object under study. Subsequently, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, corresponding to an accelerating voltage of 50 kV and an electron wavelength of 0.05 Å. Usually, decreasing the incident electron energy, or equivalently, increasing the electron wavelength, is not beneficial in terms of precision if the relevant physical constraint of the experiment is determined by the specimen drift. Some of the reasons for the deterioration of the precision with decreasing incident electron energy are the accompanying decrease of the number of detected electrons, which follows directly from Eq. (111), and the deterioration of the point resolution $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$. However, for some materials one should use incident electron energies lower than 300 keV in order to avoid displacement damage, that is, displacement of atoms from their initial positions. The amount of displacement damage decreases with decreasing incident electron energy (Williams and Carter, 1996). Examples of materials which are sensitive to displacement damage are metals and amorphous materials. Although silicon and gold are possibly insensitive to displacement damage, the evaluation of the attainable precision is again performed for these columns so as to make comparison with the 300 keV results possible. The results for 50 keV are shown in Figure 21. In this evaluation, the chromatic aberration constant is equal to 0 mm. From
Figure 21. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 50 keV. The chromatic aberration constant Cc is equal to 0 mm.
Figure 21, it follows that the optimal spherical aberration constant is equal to 0 mm, just as for a microscope operating at an incident electron energy of 300 keV and an object thickness equal to half the extinction distance. Moreover, it is concluded that, both for light and for heavy atom columns, correction of the spherical aberration is useful in terms of precision, although the gain is higher for heavy than for light atom columns. For example, for a light atom column, such as silicon [100], the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 2.5, whereas for a heavy atom column such as gold [100], this is a factor of 13.9. The latter is a substantial reduction of the standard deviation. From the comparison of the numerical values of the lower bound on the standard deviation of the position coordinates corresponding to 50 keV and 300 keV, it follows that, as predicted above, the precision is higher for 300 keV than for 50 keV if the recording time is fixed. Therefore, reducing the incident electron energy is only beneficial in terms of precision if the object under study is sensitive to displacement damage. In the discussion of the optimal spherical aberration constant, some remarks are due. It should be mentioned that the results of the optimal spherical aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 19 to 21, the recording time as well as the number of incident electrons per square Å are fixed. Furthermore, it should be mentioned that the possible benefit of a spherical aberration corrector, which allows one to reduce the spherical aberration constant, is underestimated in the present analysis due to the following reason.
The semi-angle of beam convergence has been kept constant and sufficiently small in order to guarantee that the quasi-coherent approximation, made in the derivation of the expectation model, is reasonable (Spence, 1988). The chosen angle does therefore not correspond to the optimal value in terms of attainable precision. In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by coherent damping envelope functions. For large semi-angles of beam convergence, the quasi-coherent approximation is no longer valid. A better approximation would be to include partial spatial and temporal coherence in the expectation model in the form of transmission cross-coefficients (Frank, 1973; Born and Wolf, 1999; Ishizuka, 1980). This model would allow one to evaluate and optimize the attainable precision as a function of the semi-angle of beam convergence. Although such an analysis is not made in this work, it is intuitively clear that the optimal semi-angle of beam convergence would increase with decreasing spherical aberration constant and that the relative gain in precision would increase accordingly. This intuitive reasoning is based on the facts mentioned in Kabius, Haider,
QUANTITATIVE ATOMIC RESOLUTION TEM
Uhlemann, Schwan, Urban, and Rose (2002) and on the expectation model given by Eq. (110), although it is of a limited validity. From Eqs. (111) and (115), it follows that the total number of detected electrons increases with increasing semi-angle of beam convergence, which has a favorable effect on the attainable precision with which position coordinates can be estimated. As a side effect, however, it follows from Eq. (97) that with increasing semi-angle of beam convergence, high spatial frequencies are more severely attenuated due to partial spatial coherence, which has an unfavorable effect on the attainable precision. The optimal semi-angle of beam convergence is the one for which both effects are balanced so as to produce the highest attainable precision. The relative importance of the attenuation of high spatial frequencies becomes less for lower values of the spherical aberration constant, as follows from Eq. (97). Therefore, the optimal semi-angle of beam convergence will shift to higher values with decreasing spherical aberration constant. Due to the accompanying increase of the total number of detected electrons, the relative gain in precision will increase accordingly. Nevertheless, a decisive answer to the questions which semi-angle of beam convergence is optimal and what precision may be gained can only be provided by means of further research.
v. Optimal Chromatic Aberration Constant. Next, the dependence of the precision on the chromatic aberration constant is studied. Usually, the chromatic aberration constant is a fixed property of the electron microscope. However, by incorporating a chromatic aberration corrector, which is at a conceptual stage (Weißbäcker and Rose, 2001, 2002), it will become tunable and may even become negative. The advantages of a chromatic aberration corrector for use in CTEM experiments are usually discussed in the literature in terms of the information limit of the electron microscope.
The information limit $\rho_i$ is equal to $(\pi\lambda\Delta/2)^{1/2}$, with $\Delta$ the defocus spread, which is proportional to the chromatic aberration constant (Spence, 1988). By use of a chromatic aberration corrector, the information limit improves. In combination with image processing techniques such as off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced, which is a benefit for qualitative structure determination. In the present analysis, the performance of a chromatic aberration corrector is studied for quantitative structure determination aiming at the highest precision with which position coordinates of an atom column can be estimated. First, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 300 keV. In Figure 22, it is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the chromatic
Figure 22. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 300 keV.
aberration constant. From this figure, it follows that the optimal chromatic aberration constant is equal to 0 mm. The precision in terms of the standard deviation that is gained by reducing the chromatic aberration constant from the original setting of 1.3 mm to 0 mm is a factor of 1.1 and 1.4 for silicon [100] and gold [100], respectively. Hence, both for light and for heavy atom columns, correction of the chromatic aberration is not so useful in terms of precision under the given conditions. Second, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV. Figure 23 shows the results of the evaluation for a silicon [100] as well as for a gold [100] atom column. The spherical aberration constant is equal to 0 mm. From this figure, it follows that the optimal chromatic aberration constant is again equal to 0 mm. Compared to the results obtained for a microscope operating at an incident electron energy of 300 keV, correction of the chromatic aberration is more useful in terms of precision both for light and for heavy atom columns. The precision that is gained by reducing the chromatic aberration constant from 1.3 mm to 0 mm is a factor of 3.5 and 22.0 for a light atom column such as silicon [100] and for a heavy atom column such as gold [100], respectively. These are substantial reductions of the standard deviation. However, as mentioned earlier, decreasing the incident electron energy is only recommended for materials which are sensitive to displacement damage.
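The information-limit argument quoted earlier in this subsection can be made concrete with a few lines of arithmetic. The sketch below evaluates $\rho_i = (\pi\lambda\Delta/2)^{1/2}$ at 300 keV and 50 keV. The relativistic wavelength formula is standard. The defocus spread, however, is taken as $\Delta = C_c\,\Delta E/E_0$, which is a deliberately simplified assumption (it ignores high-voltage and lens-current instabilities as well as relativistic corrections) and is not the expression used in this chapter. The values Cc = 1.3 mm and an energy spread of 0.75 eV are borrowed from the settings quoted in the text; all function names are illustrative.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
M0 = 9.1093837015e-31     # electron rest mass (kg)
E = 1.602176634e-19       # elementary charge (C)
C = 299792458.0           # speed of light (m/s)

def wavelength_m(energy_ev):
    """Relativistic electron wavelength in metres."""
    ev_joule = E * energy_ev
    momentum = math.sqrt(2.0 * M0 * ev_joule * (1.0 + ev_joule / (2.0 * M0 * C ** 2)))
    return H / momentum

def defocus_spread_m(cc_m, energy_spread_ev, energy_ev):
    """ASSUMPTION: simplified defocus spread, Delta = Cc * dE / E0."""
    return cc_m * energy_spread_ev / energy_ev

def information_limit_m(cc_m, energy_spread_ev, energy_ev):
    """Information limit rho_i = sqrt(pi * lambda * Delta / 2)."""
    lam = wavelength_m(energy_ev)
    delta = defocus_spread_m(cc_m, energy_spread_ev, energy_ev)
    return math.sqrt(math.pi * lam * delta / 2.0)

if __name__ == "__main__":
    for e0 in (300e3, 50e3):
        rho = information_limit_m(1.3e-3, 0.75, e0)
        print(f"{e0 / 1e3:.0f} keV: lambda = {wavelength_m(e0) * 1e10:.4f} A, "
              f"rho_i = {rho * 1e10:.2f} A")
```

Under this assumption the information limit is considerably coarser at 50 keV than at 300 keV for the same Cc, which is in line with the observation that chromatic aberration correction matters more at the lower energy. The information limit itself says nothing about the number of detected electrons, which is why the chapter uses the attainable precision as the criterion instead.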
Figure 23. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm.
The results of the optimal chromatic aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 22 and 23, the recording time as well as the number of incident electrons per square Å are fixed.
vi. Optimal Energy Spread of a Monochromator. Furthermore, the precision of the position coordinate estimates is evaluated and optimized as a function of the energy spread of a monochromator. This evaluation and optimization will be done for a fixed recording time as well as for a fixed number of incident electrons per square Å. In the former case, the physical constraint is determined by the specimen drift whereas in the latter one, it is determined by the radiation sensitivity of the object. The reason for considering both constraints is that the number of incident electrons per second decreases by use of a monochromator. The use of a monochromator in CTEM experiments is assumed to be advantageous for qualitative structure determination. The reason for this supposition is that the information limit $\rho_i$, which is equal to $(\pi\lambda\Delta/2)^{1/2}$, improves by use of a monochromator because of the decrease of the defocus spread $\Delta$. This means that, in combination with off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced. In these discussions, neither the object under study nor the total number of detected electrons is taken into account. However, a reduction of
the incident number of electrons per second leads to a decrease in SNR if the recording time is kept within the constraints. This effect has to be taken into account when the performance of a monochromator for quantitative structure determination is evaluated. This might be done by using a modified definition of the information limit that includes the SNR (de Jong and van Dyck, 1993; van Dyck and de Jong, 1992). In the present analysis, however, this is done by choosing the attainable precision, instead of the information limit, as optimality criterion. This criterion takes both the object under study and the total number of detected electrons into account. First, it is assumed that the specimen drift determines the relevant physical constraint. Hence, the recording time is kept constant in the evaluation of the precision as a function of the standard deviation $\Delta E_m$ of the energy spread of the monochromator, given by Eq. (114). Consequently, the total number of detected electrons decreases with decreasing energy spread. This follows directly from Eq. (115). Figures 24 and 25 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. At 50 keV, the spherical aberration constant is set to 0 mm. The optimal value of the energy spread in terms of precision is the one that corresponds to the minimum of the curve. From Figure 24, where the incident electron energy is equal to 300 keV, it follows that, for light atom columns such as silicon [100], no precision is gained by decreasing the energy spread by means of a monochromator. On the other hand, for heavy
Figure 24. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the recording time is kept constant.
Figure 25. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm. In this evaluation, the recording time is kept constant.
atom columns such as gold [100], a monochromator may slightly improve the precision. The precision that is gained by reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.45 eV by use of a monochromator is a factor of 1.1 in this example. From Figure 25, it follows that, for a microscope operating at an incident electron energy of 50 keV, both for light and heavy atom columns, the precision improves by use of a monochromator. For a silicon [100] column and a gold [100] column, the precision that is gained by reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.13 eV and 0.02 eV using a monochromator is a factor of 1.7 and 5.5, respectively. Second, it is assumed that the radiation sensitivity of the object determines the relevant physical constraint. Hence, the number of incident electrons per square Å is kept constant in the evaluation of the precision as a function of the standard deviation of the energy spread. In practice, it follows from Eq. (115) that this may be realized by compensating the loss of incident electrons due to the use of the monochromator with an increasing recording time. Figures 26 and 27 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The recording time corresponding to an intrinsic energy spread of 0.75 eV is equal to 1 s. At 50 keV, the spherical aberration constant is equal to 0 mm. From these figures, it follows that under the given conditions, the precision
Figure 26. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the number of incident electrons per square Å is kept constant.
Figure 27. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm. In this evaluation, the number of incident electrons per square Å is kept constant.
improves by use of a monochromator. The precision that is gained is larger for heavy atom columns such as gold [100] and for smaller incident electron energies. This may be illustrated by the following numerical values. The precision that is gained by reducing the intrinsic energy spread of 0.75 eV to
0.03 eV using a monochromator for a microscope operating at an incident electron energy of 300 keV is a factor of 1.1 and 1.4 for a silicon [100] and gold [100] column, respectively. For a microscope operating at an incident electron energy of 50 keV, these factors are substantial and equal to 3.5 and 21.1, respectively.
vii. Optimal Reduced Brightness of the Electron Source. Next, the effect of the reduced brightness Br of the electron source on the precision with which the position coordinates of an atom column can be measured is studied. Using Eqs. (110), (116), and (117), it follows that the precision, represented by the lower bound on the standard deviation of the position coordinates, is inversely proportional to the square root of the total number of detected electrons N. Furthermore, it follows from Eqs. (111) and (115) that in the absence as well as in the presence of a monochromator, N is directly proportional to the reduced brightness of the electron source Br. Therefore, new developments in producing electron sources with higher reduced brightness (de Jonge, Lamy, Schoots, and Oosterkamp, 2002; van Veen, Hagen, Barth, and Kruit, 2001) are advantageous in terms of precision. For example, if the reduced brightness is increased by a factor of 10, the lower bound on the standard deviation of the position coordinates decreases by a factor of $\sqrt{10}$. Hence, on the one hand, if the experiment is limited by specimen drift, the optimal reduced brightness is preferably as high as possible, that is, as high as physical limitations to the production of electron sources with higher reduced brightness allow. The dominant limitation is determined by the statistical Coulomb interactions (Kruit and Jansen, 1997; van Veen, Hagen, Barth, and Kruit, 2001).
On the other hand, if the experiment is limited by the radiation sensitivity of the object, the reduced brightness has to be kept subcritical or an increase of the reduced brightness Br has to be compensated by a decrease of the recording time t, so as to keep the number of incident electrons per square Å within the constraints. Finally, a remark about the recording time needs to be made. If the experiment is limited by specimen drift, the recording time is kept within the constraints in this study. The amount of specimen drift is determined by mechanical instabilities of the specimen holder. Hence, new developments providing more stable specimen holders would allow microscopists to increase the recording time. This has a favorable effect on the precision since, as mentioned above, the lower bound on the standard deviation of the position coordinates is inversely proportional to the square root of the total number of detected electrons N, which in its turn is directly proportional to the recording time.
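The scaling argument of this subsection reduces to a single line of arithmetic: since the lower bound on the standard deviation is inversely proportional to the square root of N, and N is directly proportional to both Br and t, changing either quantity by some factor rescales the bound by the inverse square root of that factor. The following is a minimal sketch of this relation. The helper name `rescale_sigma` is hypothetical, and the starting value of 0.0054 Å is the gold [100] entry for the original settings at 300 keV in Table 11.

```python
import math

def rescale_sigma(sigma, brightness_factor=1.0, time_factor=1.0):
    """Rescale a lower bound on the standard deviation when Br and/or t change.

    Uses sigma ~ 1/sqrt(N) with N ~ Br * t, as stated in the text.
    """
    if brightness_factor <= 0 or time_factor <= 0:
        raise ValueError("factors must be positive")
    return sigma / math.sqrt(brightness_factor * time_factor)

sigma_au = 0.0054  # angstrom, gold [100], original settings, 300 keV

# Drift-limited case: recording time fixed, brightness ten times higher.
print(rescale_sigma(sigma_au, brightness_factor=10.0))

# Dose-limited case: the brightness gain is compensated by a shorter
# recording time, so the number of incident electrons is unchanged.
print(rescale_sigma(sigma_au, brightness_factor=10.0, time_factor=0.1))
```

In the dose-limited case the bound does not change, which is the sense in which a brighter source only helps when specimen drift, rather than radiation sensitivity, is the limiting constraint.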
viii. Summary. Tables 11 and 12 give a summary of the attainable precisions with which the position coordinates of an isolated atom column can be estimated for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The attainable precision is represented for the values of the original microscope settings as described in Table 7 and for the optimal values of one or two of these settings with all other values kept fixed. The defocus is adjusted to the value given by Eq. (118) for negative Cs-values and to the Scherzer defocus, given by Eq. (119), for positive Cs-values. These values are close to optimal. This is done for both a silicon [100] and gold [100] atom column for which the structure parameters are given in Tables 8, 9, and 10. The recording time is held constant. From these tables, the following conclusions are drawn:

TABLE 11
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 300 keV

Microscope settings                                                                        Si [100]    Au [100]
original settings                                                                          0.0014 Å    0.0054 Å
optimal spherical aberration constant (Cs = 0 mm)                                          0.0011 Å    0.0028 Å
optimal chromatic aberration constant (Cc = 0 mm)                                          0.0013 Å    0.0040 Å
optimal energy spread of the monochromator (see text)                                      0.0014 Å    0.0050 Å
10× higher reduced brightness                                                              0.0004 Å    0.0017 Å
optimal spherical and chromatic aberration constant                                        0.0009 Å    0.0011 Å
optimal spherical aberration constant and optimal energy spread of the monochromator       0.0011 Å    0.0022 Å
TABLE 12
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 50 keV

Microscope settings                                                                        Si [100]    Au [100]
original settings                                                                          0.0142 Å    0.1274 Å
optimal spherical aberration constant (Cs = 0 mm)                                          0.0079 Å    0.0555 Å
optimal chromatic aberration constant (Cc = 0 mm)                                          0.0057 Å    0.0350 Å
optimal energy spread of the monochromator (see text)                                      0.0098 Å    0.1133 Å
10× higher reduced brightness                                                              0.0045 Å    0.0403 Å
optimal spherical and chromatic aberration constant                                        0.0023 Å    0.0025 Å
optimal spherical aberration constant and optimal energy spread of the monochromator       0.0046 Å    0.0100 Å
• The attainable precision is better at 300 keV than at 50 keV. Hence, reducing the incident electron energy is only recommended if the experiment is limited by displacement damage instead of specimen drift.
• Mathematically speaking, at 300 keV, the attainable precision improves with a spherical or chromatic aberration corrector. However, since the accompanying gain in precision is only marginal, one may wonder if such correctors are needed in order to obtain a prespecified precision of the atom column positions.
• At 50 keV, the attainable precision improves with a spherical or chromatic aberration corrector. A chromatic aberration corrector is preferable.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The attainable precision improves substantially if the reduced brightness is 10 times higher.
• The attainable precision improves substantially with both a spherical and chromatic aberration corrector, especially for heavy atom columns and low incident electron energies.
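The conclusions above can be checked directly against the tabulated values by expressing each entry as a gain relative to the original settings. The sketch below does this for Tables 11 and 12. The numbers are copied from the tables; the setting labels and the helper `gains` are illustrative names only, and the ratios inherit the rounding of the tabulated values (the smallest entries carry only one significant digit).

```python
SETTINGS = [
    "original",
    "Cs = 0",
    "Cc = 0",
    "optimal energy spread",
    "10x brightness",
    "Cs = 0 and Cc = 0",
    "Cs = 0 and optimal energy spread",
]

# Lower bounds on the standard deviation (angstrom), copied from Tables 11 and 12.
# Keys are (incident electron energy in keV, column type).
TABLES = {
    (300, "Si"): [0.0014, 0.0011, 0.0013, 0.0014, 0.0004, 0.0009, 0.0011],
    (300, "Au"): [0.0054, 0.0028, 0.0040, 0.0050, 0.0017, 0.0011, 0.0022],
    (50, "Si"): [0.0142, 0.0079, 0.0057, 0.0098, 0.0045, 0.0023, 0.0046],
    (50, "Au"): [0.1274, 0.0555, 0.0350, 0.1133, 0.0403, 0.0025, 0.0100],
}

def gains(energy_kev, column):
    """Ratio sigma(original) / sigma(setting) for each tabulated setting."""
    sigmas = TABLES[(energy_kev, column)]
    return {name: sigmas[0] / s for name, s in zip(SETTINGS, sigmas)}

for key in TABLES:
    rounded = {name: round(value, 1) for name, value in gains(*key).items()}
    print(key, rounded)
```

Read this way, the tables show gains close to 1 for most single corrections at 300 keV, and gains of more than an order of magnitude for the combined spherical and chromatic aberration correction of a gold [100] column at 50 keV.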
Furthermore, as mentioned earlier, the attainable precision may be improved if the mechanical stability of the specimen holder is improved, since it would provide longer recording times and hence more detected electrons.
c. Neighboring Atom Columns. The optimal microscope settings described in the previous part of Section IV.C.2 are derived for single isolated atom columns. One should keep in mind that the attainable precision with which the position of a single isolated column can be estimated is a valid criterion for the optimization of the experimental design as long as neighboring columns are clearly separated in the image. Under this condition, the attainable precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. This condition was not always met in the previous part. For example, images of silicon [100] atom columns of a crystal, taken with a microscope which operates at an incident electron energy of 50 keV and which is not corrected for spherical and chromatic aberration, show strong overlap. Then, the attainable precision with which the position of an atom column can be estimated is affected unfavorably by the presence of neighboring columns. To find out if the optimal microscope settings change in the presence of neighboring atom columns, the attainable precision with which atom column position coordinates of silicon [100] and gold [100] crystals can be estimated will be computed.
i. Structure Parameters. The two-dimensional projected structure of the objects under study, namely silicon [100] and gold [100] crystals, is modelled as a lattice consisting of 7 × 7 projected atom columns at the positions

$$\beta_n = (\beta_{x_n} \;\; \beta_{y_n})^T = (n_x d \;\; n_y d)^T, \qquad (120)$$

with indices $n = (n_x, n_y)$, $n_x = -3, \ldots, 3$, $n_y = -3, \ldots, 3$, and $d$ the distance between an atom column and its nearest neighbor. The values of the distance d for both a silicon [100] and a gold [100] crystal (International Centre for Diffraction Data, 2001) and for the object thickness are given in Table 13. It should be mentioned that the chosen object thickness is equal to 50 Å instead of half the extinction distance, as was used for isolated atom columns in the previous section.
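As a small illustration of Eq. (120), the lattice positions can be generated as follows. This is only a sketch: the function name `lattice` and the dictionary layout are illustrative choices, and the values of d are those listed for silicon [100] and gold [100] in Table 13.

```python
def lattice(d, half_width=3):
    """Column positions (n_x * d, n_y * d) for n_x, n_y = -half_width, ..., half_width."""
    indices = range(-half_width, half_width + 1)
    return {(nx, ny): (nx * d, ny * d) for nx in indices for ny in indices}

si = lattice(1.92)  # d in angstrom, silicon [100]
au = lattice(2.04)  # d in angstrom, gold [100]

n_columns = len(si)           # number of projected atom columns
n_parameters = 2 * n_columns  # one x and one y coordinate per column
print(n_columns, n_parameters, si[(0, 0)], au[(1, 0)])
```

The 7 × 7 lattice contains 49 columns and therefore 98 position coordinates, with the central column at index (0, 0).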
ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinate $\beta_{x_n}$ of the central atom column of the lattice consisting of 7 × 7 atom columns can be estimated. This column corresponds to the index $n = (0, 0)$. The attainable precision (in terms of the variance) is represented by the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB. An expression for this element may be derived as follows. First, the Fisher information matrix associated with the total set of 98 position coordinates $\beta_{x_n}$ and $\beta_{y_n}$ is computed. This is a 98 × 98 matrix. The expression for the elements $F_{rs}$ of the Fisher information matrix is given by Eq. (116). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. Next, the CRLB is computed. It is given by the inverse of the Fisher information matrix. Finally, the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB, corresponding to the position coordinate $\beta_{x_n}$ of the central atom column of the lattice, represents the attainable precision. In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_{x_n}}$, that is, the square root of $\sigma^2_{\beta_{x_n}}$.
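The three steps described above (build the Fisher information matrix, invert it, read off the diagonal element of the central column) can be sketched in a few lines. The example below is a toy illustration, not the CTEM expectation model of Eq. (110): it assumes independent Poisson-distributed pixel counts, for which the Fisher information takes the standard form $F_{rs} = \sum_k \lambda_k^{-1}\,(\partial\lambda_k/\partial\theta_r)(\partial\lambda_k/\partial\theta_s)$, and an expectation model consisting of a constant background from which one Gaussian peak per column is subtracted. The background level, peak depth, peak width, pixel size, and the reduced 3 × 3 lattice (chosen to keep a pure-Python run short) are all hypothetical values.

```python
import math

def fisher_matrix(cols, pixels, n_bg, depth, rho):
    """Fisher matrix for all column coordinates, assuming Poisson pixel counts."""
    m = 2 * len(cols)
    F = [[0.0] * m for _ in range(m)]
    for x, y in pixels:
        g = [math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2.0 * rho ** 2))
             for bx, by in cols]
        lam = n_bg * (1.0 - depth * sum(g))      # expected counts in this pixel
        grad = []                                # d(lam)/d(bx_n), d(lam)/d(by_n)
        for (bx, by), gn in zip(cols, g):
            c = -n_bg * depth * gn / rho ** 2
            grad += [c * (x - bx), c * (y - by)]
        for r in range(m):
            w = grad[r] / lam
            for s in range(m):
                F[r][s] += w * grad[s]
    return F

def invert(A):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def central_column_bound(d=1.92, half_width=1, pixel=0.2,
                         n_bg=1000.0, depth=0.3, rho=0.5):
    """Lower bounds on the standard deviations of the central column coordinates."""
    idx = range(-half_width, half_width + 1)
    cols = [(nx * d, ny * d) for nx in idx for ny in idx]
    n_pix = int(round((half_width + 1) * d / pixel))
    axis = [(i + 0.5) * pixel for i in range(-n_pix, n_pix)]  # symmetric pixel grid
    pixels = [(x, y) for x in axis for y in axis]
    crlb = invert(fisher_matrix(cols, pixels, n_bg, depth, rho))
    c = cols.index((0.0, 0.0))
    return math.sqrt(crlb[2 * c][2 * c]), math.sqrt(crlb[2 * c + 1][2 * c + 1])

if __name__ == "__main__":
    print(central_column_bound())
```

Setting `half_width=3` reproduces the 7 × 7 lattice and the 98 × 98 matrix of the text, at the cost of a longer run. Because the lattice and pixel grid are unchanged under a rotation over 90 degrees, the two returned bounds coincide up to rounding.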
TABLE 13
Structure Parameters of Neighboring Atom Columns

Structure parameter    Si [100]    Au [100]
d (Å)                  1.92        2.04
z (Å)                  50          50
It will be used as optimality criterion for the evaluation and optimization of the experimental design. Alternatively, one could choose the lower bound on the standard deviation $\sigma_{\beta_{y_n}}$ of the position coordinate $\beta_{y_n}$ of the central atom column since $\sigma_{\beta_{x_n}}$ and $\sigma_{\beta_{y_n}}$ are equal to one another. The reason for this is that, for the chosen structure of the objects under study, rotation of the expectation model over an angle of 90 degrees carries the expectation model into itself. Moreover, the central atom column is preferred rather than one of the other 48 atom columns since this column is most affected by the presence of neighboring columns. As mentioned in Section II.C.2., the chosen criterion may be regarded as a partial or truncated optimality criterion.
iii. Optimal Microscope Settings. First, in Figures 28 and 29, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV for a silicon [100] and gold [100] crystal, respectively. The solid curve corresponds to a microscope without correction for chromatic aberration, that is, a microscope without chromatic aberration corrector and monochromator. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is chosen equal to 0.086 eV corresponding to a typical full width at half maximum height of 200 meV as follows from Eq. (102) (Batson, 1999). In this
Figure 28. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.
Figure 29. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.
evaluation, it is assumed that the specimen drift is the relevant physical constraint. Hence, the recording time is kept constant. It should be noticed that the precision is not represented for a spherical aberration constant equal to 0 mm in Figures 28 and 29. The reason for this is that for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the contrast in the image is very low, which results in extremely high standard deviations of the position coordinates. From Figures 28 and 29, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 300 keV:
• The optimal spherical aberration constant is close to, but different from, 0 mm. This finding is due to the small object thickness.
• The attainable precision improves by use of a chromatic aberration corrector. Particularly for light atom columns such as silicon [100], the gain in precision is only marginal.
• The attainable precision deteriorates by use of a monochromator with an energy spread of 0.086 eV for both types of crystals.
• Strictly speaking, the highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). However, the precision that is gained is only marginal. Hence, the question may be raised if this gain is required to obtain a desired precision.
It should be noticed that the possible benefit of a spherical aberration corrector is underestimated in the present analysis for the same reason as has been mentioned in the discussion of the evaluation of the spherical aberration constant for isolated atom columns. Second, in Figures 30 and 31, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, for a silicon [100] and gold [100] crystal, respectively. Again, the solid curve corresponds to a microscope without correction for chromatic aberration. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is equal to 0.086 eV. The recording time is kept constant in the evaluation. Also here, the precision is not represented for a spherical aberration constant equal to 0 mm since for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the corresponding standard deviations of the position coordinates are very high. From Figures 30 and 31, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 50 keV:
• The optimal spherical aberration constant is different from 0 mm. This finding is due to the small object thickness.
Figure 30. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.
Figure 31. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.
• The attainable precision improves by use of either a chromatic aberration corrector or a monochromator, although a chromatic aberration corrector is preferred.
• The attainable precision improves more with a chromatic than with a spherical aberration corrector. The gain is more significant for heavy atom columns such as gold [100] than for light atom columns such as silicon [100].
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). The gain in precision is substantial.
From the comparison of the conclusions obtained from Figures 28 to 31 for neighboring atom columns with those obtained for isolated atom columns as summarized in the previous section, it follows that the main conclusions regarding the optimal microscope settings remain. Moreover, as for isolated atom columns, increasing the reduced brightness of the electron source and improving the mechanical stability of the specimen holder are advantageous in terms of precision if the experiment is limited by specimen drift. This is evident since these conclusions, which are given earlier, are only based on the total number of detected electrons and not on the structure of the object under study.
d. Attainability of the Cramér-Rao Lower Bound. Finally, the discussion of the optimization of a CTEM experiment should be complemented with an investigation of whether there exists an estimator attaining the CRLB on the variance of the position coordinates and whether this estimator is unbiased. If so, this would justify the choice of the CRLB as optimality criterion used in this section. Generally, one may use different estimators in order to measure the position coordinates of the atom columns from CTEM experiments such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. This means that it applies to an infinite number of observations. However, the number of observations used in CTEM experiments is finite and may even be relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 200 different CTEM experiments made on an isolated silicon [100] atom column are simulated; the observations are modelled using the parametric statistical model described in Section IV.B. The spherical aberration constant is set equal to 1 mm. Next, the position coordinates $\beta_x$ and $\beta_y$ of the atom column are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively.
The lower bound on the variance is computed by substituting the true values of the parameters into the expression given by the right-hand member of Eq. (117). The results are presented in Table 14. From the comparison of these results, it follows that it cannot be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Furthermore, the maximum likelihood estimates of bx are presented in the histogram of Figure 32. The solid curve represents a normal distribution with mean and variance given in Table 14. This curve makes it plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results of the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
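The kind of simulation check described above can be illustrated on a much simpler problem. The sketch below is not the CTEM model of Section IV.B. As a stand-in, it assumes independent Poisson counts with a single unknown rate, for which the maximum likelihood estimate is the sample mean and the CRLB on its variance is the rate divided by the number of observations. The rate, the number of observations, the seed, and all function names are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's multiplication method (fine for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_estimates(true_rate: float, n_obs: int, n_experiments: int, seed: int = 0) -> list:
    """Return the maximum likelihood estimate of the rate from each simulated experiment."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_experiments):
        counts = [sample_poisson(true_rate, rng) for _ in range(n_obs)]
        # For i.i.d. Poisson counts the ML estimate of the rate is the sample mean.
        estimates.append(sum(counts) / n_obs)
    return estimates


TRUE_RATE = 20.0      # hypothetical expected count per observation
N_OBS = 50            # hypothetical number of observations per experiment
N_EXPERIMENTS = 200   # same number of repetitions as in the text

estimates = simulate_estimates(TRUE_RATE, N_OBS, N_EXPERIMENTS)
crlb = TRUE_RATE / N_OBS                       # CRLB on the variance of the rate estimate
est_mean = statistics.fmean(estimates)
est_var = statistics.variance(estimates)       # unbiased sample variance
std_of_mean = math.sqrt(est_var / N_EXPERIMENTS)
std_of_var = est_var * math.sqrt(2.0 / (N_EXPERIMENTS - 1))  # normal approximation

print(f"true rate {TRUE_RATE}, estimated mean {est_mean:.3f} +/- {std_of_mean:.3f}")
print(f"CRLB {crlb:.3f}, estimated variance {est_var:.3f} +/- {std_of_var:.3f}")
```

If the estimated mean lies within a few standard deviations of the true value and the estimated variance within a few standard deviations of the bound, the simulation gives no reason to reject unbiasedness or efficiency, which is the same logic as the comparison made in Table 14.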
VAN AERT ET AL.

TABLE 14
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 200 Maximum Likelihood Estimates of the Position Coordinates, Respectively

Coordinate   True position coordinate (Å)   Estimated mean (Å)   Standard deviation of mean (Å)
bx           0                              6.3 × 10^-5          10.9 × 10^-5
by           0                              3.6 × 10^-5          11.1 × 10^-5

Variance     Lower bound on variance (Å^2)  Estimated variance (Å^2)  Standard deviation of variance (Å^2)
σ²bx         2.6 × 10^-6                    2.4 × 10^-6               0.2 × 10^-6
σ²by         2.6 × 10^-6                    2.5 × 10^-6               0.2 × 10^-6

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.
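The last column of Table 14 can be cross-checked from the other entries. The sketch below assumes that the 200 estimates are independent and approximately normal, so that the standard deviation of the sample mean is sqrt(variance/N) and that of the sample variance is variance × sqrt(2/(N − 1)). These two formulas are standard results and are not taken from the text; the function names are hypothetical. The exponents of the tabulated values are read as 10^-5 and 10^-6.

```python
import math

N_EXPERIMENTS = 200      # number of simulated experiments (from the text)
LOWER_BOUND = 2.6e-6     # lower bound on the variance in Å^2 (Table 14)
ESTIMATED = {            # coordinate: (estimated mean in Å, estimated variance in Å^2)
    "bx": (6.3e-5, 2.4e-6),
    "by": (3.6e-5, 2.5e-6),
}


def std_of_mean(variance: float, n: int) -> float:
    """Standard deviation of the sample mean of n independent estimates."""
    return math.sqrt(variance / n)


def std_of_variance(variance: float, n: int) -> float:
    """Standard deviation of the sample variance of n normally distributed estimates."""
    return variance * math.sqrt(2.0 / (n - 1))


for name, (mean, variance) in ESTIMATED.items():
    s_mean = std_of_mean(variance, N_EXPERIMENTS)
    s_var = std_of_variance(variance, N_EXPERIMENTS)
    # Deviations expressed in units of the corresponding standard deviation.
    bias_in_sigmas = abs(mean - 0.0) / s_mean
    bound_in_sigmas = abs(variance - LOWER_BOUND) / s_var
    print(name, f"{s_mean:.2e}", f"{s_var:.2e}",
          f"{bias_in_sigmas:.2f}", f"{bound_in_sigmas:.2f}")
```

Under these assumptions the computed standard deviations agree with the tabulated ones to within the rounding of the table, and both the estimated means and the estimated variances deviate from the true values and the lower bound by less than one standard deviation.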
Figure 32. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of an atom column. The normal distribution superimposed on this histogram, with mean and variance given in Table 14, makes it plausible that the estimates are normally distributed.
3. Interpretation of the Results

To provide more insight, an intuitive interpretation will be given to some of the numerical results obtained in Section IV.C.2. This will be done by means of a result obtained in Section III, where a rule of thumb was derived for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM. The rule of thumb, which is given by Eq. (68), was derived for an expectation model of the observations consisting of a constant background from which a Gaussian peak was subtracted. From it, one observes that the attainable precision is a function of the width of the Gaussian peak and the total number of detected electrons. Generally, the precision will improve by narrowing the Gaussian peak and by increasing the total number of detected electrons. Empirically, it has been found that this rule of thumb can be generalized to CTEM expectation models that are more complicated than Gaussian peaks. Two different approaches may be followed. One approach is to consider the highest spatial frequency that is transferred from the exit plane to the image plane instead of the inverse of the width of the Gaussian peak. Another approach is to consider the width of the peak that remains if the background is subtracted from the CTEM expectation model instead of the width of the Gaussian peak. The generalized rule of thumb is then that the precision will improve by decreasing the width of this peak, or by increasing the highest spatial frequency that is transferred from the exit plane to the image plane, and by increasing the total number of detected electrons.
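One conventional measure of the peak width that enters this rule of thumb is the point resolution 0.66 (Cs λ³)^(1/4), which the example that follows relates to the narrowing of the peak. The sketch below evaluates it for the two spherical aberration constants used in that example, at 300 keV. The relativistic wavelength formula and the physical constants are standard; the function names are hypothetical, and the sketch says nothing about the number of detected electrons, which the rule of thumb also requires.

```python
import math

PLANCK = 6.62607015e-34              # Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31     # electron rest mass, kg
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge, C
LIGHT_SPEED = 299792458.0            # speed of light, m/s


def electron_wavelength(energy_ev: float) -> float:
    """Relativistic electron wavelength in metres for a kinetic energy in eV."""
    energy_j = energy_ev * ELEMENTARY_CHARGE
    rest_j = ELECTRON_MASS * LIGHT_SPEED ** 2
    # Relativistic momentum: (p c)^2 = E_kin * (E_kin + 2 m c^2).
    momentum = math.sqrt(energy_j * (energy_j + 2.0 * rest_j)) / LIGHT_SPEED
    return PLANCK / momentum


def point_resolution(cs_m: float, wavelength_m: float) -> float:
    """Point resolution 0.66 * (Cs * lambda^3)^(1/4) in metres, for positive Cs."""
    return 0.66 * (cs_m * wavelength_m ** 3) ** 0.25


wavelength = electron_wavelength(300e3)
for cs_mm in (1.0, 0.2):
    rho = point_resolution(cs_mm * 1e-3, wavelength)
    print(f"Cs = {cs_mm} mm: point resolution = {rho * 1e10:.2f} Å")
```

Because the point resolution scales with the fourth root of Cs, reducing Cs from 1.0 mm to 0.2 mm narrows it by only about a third (roughly from 2.0 Å to 1.3 Å at 300 keV).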
From the example illustrated in Figure 28, it may be concluded that the lower bound on the standard deviation of the position coordinates of a silicon [100] atom column of a crystal is a factor of 2.4 lower when a chromatic aberration constant of 0 mm and a spherical aberration constant of 0.2 mm are used instead of an energy spread of the monochromator of 0.086 eV and a spherical aberration constant of 1.0 mm. This result may intuitively be interpreted by comparing the corresponding expectation models. These models describe the expected number of electrons detected at the pixels of a CCD camera. They have been derived in Section IV.B. Figure 33 shows intersections of the two-dimensional, radially symmetric column model and a plane through its radial axis. It is clearly observed that the peak, which remains if the background is removed, is narrower if the spherical aberration constant is equal to 0.2 mm instead of 1.0 mm. This narrowing is directly related to the improvement of the point resolution $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$ with decreasing spherical aberration constant. Moreover, the number of detected electrons is much larger in the absence of a monochromator since it is assumed in this example that the recording time is
Figure 33. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated silicon [100] atom column and a plane through its radial axis. The solid curve corresponds to a microscope with chromatic and spherical aberration constants equal to 0 mm and 0.2 mm, respectively. The dashed curve corresponds to a microscope with standard deviation of the energy spread of the monochromator and spherical aberration constant equal to 0.086 eV and 1.0 mm, respectively.
fixed. These considerations give an intuitive interpretation to the conclusion drawn from Figure 28. Moreover, it follows from Figures 24 to 31 that the precision that may be gained by use of a monochromator is higher for heavy atom columns such as gold [100] than for light atom columns such as silicon [100], and higher for microscopes operating at lower incident electron energies, for example, 50 keV instead of 300 keV. These results may be explained on a more or less intuitive basis as follows. Figures 34 and 35 show the damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), associated with an electron source having an intrinsic energy spread equal to 0.75 eV, together with the Fourier transformed 1s-state functions F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy equal to 300 keV and 50 keV, respectively. It follows from Eqs. (104)–(106) that F1s,n(g) may be regarded as the object transfer function associated with the atom column. It acts as a low pass filter and severely attenuates the amplitude of the microscope’s transfer function T(g) at high spatial frequencies. The microscope’s transfer function is described by Eq. (94) and it includes the damping envelope function Dt(g). The bandwidth of the low pass filter F1s,n(g) associated with the atom column depends on the
Figure 34. The damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), together with the object transfer function F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 300 keV.
Figure 35. The damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), together with the object transfer function F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 50 keV.
weight of this column. Heavy atom columns have more sharply peaked 1s-state functions, and therefore wider object transfer functions, than light atom columns (see also Tables 8 and 9). Consequently, a silicon [100] atom column will have a narrower bandwidth than a gold [100] atom column, as can be seen in Figures 34 and 35. A reduction of the energy spread, which may be obtained by incorporating a monochromator, decreases the
information limit since it increases the bandwidth of the damping envelope function Dt(g). However, pushing the inverse of the information limit of the microscope beyond the bandwidth of the object transfer function is useless. From Figures 34 and 35, it follows that there is more object spatial frequency information to be gained at an incident electron energy of 50 keV than at 300 keV, and for a gold [100] atom column than for a silicon [100] atom column. This effect, in combination with the loss of electrons caused by a monochromator if the experiment is limited by specimen drift, or the absence of such a loss if the experiment is limited by radiation damage, makes the results obtained from Figures 24 to 31 understandable. The same reasoning may be applied to understand that the optimal chromatic aberration constant is equal to 0 mm and that a chromatic aberration corrector improves the precision more for heavy than for light atom columns and for lower incident electron energies, as follows from Figures 22, 23, and 28 to 31. The chromatic aberration corrector increases the bandwidth of the damping envelope function Dt(g) as the monochromator does, but this is not accompanied by a reduction of the total number of detected electrons. The examples given above illustrate that the rule of thumb derived in Section III, for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM, may be used to give an intuitive interpretation to the numerical results obtained in Section IV.C.2. This provides a check of these numerical results. However, this rule of thumb cannot replace the exact expressions for the attainable precision which have been used in Section IV.C.2.

D. Conclusions

It has been shown that when it comes to the evaluation and optimization of quantitative CTEM experiments aiming at the highest precision, criteria such as point resolution and information limit may give rise to deceptive guidelines, since they do not take the object and the total number of detected electrons into account. As an alternative to these criteria, the obvious optimality criterion is the attainable statistical precision, that is, the CRLB, with which position coordinates of atom columns can be estimated. This criterion depends on the microscope settings, the object, and the total number of detected electrons, rather than on the microscope settings alone. An expression for the attainable statistical precision has been derived from a parametric statistical model of the observations. The expectations of the observations have been described by means of the channelling theory and the quasi-coherent approximation, whereas the fluctuations of the
observations have been described by means of Poisson statistics. The obtained expression has been used to evaluate and optimize the design of quantitative CTEM experiments. This analysis has been made for microscopes operating at an intermediate incident electron energy of 300 keV and for those operating at a low incident electron energy of 50 keV. The relevant physical constraints have been taken into consideration. These constraints are the radiation sensitivity of the object or the specimen drift. Therefore, the incident electron dose per square Å or the recording time has been kept within the constraints. From the analysis, the following general guidelines have been derived:
• The optimal defocus value is approximately given by Eq. (118) for negative Cs-values and by the Scherzer defocus, given by Eq. (119), for positive Cs-values.
• A spherical and chromatic aberration corrector may improve the attainable precision. The precision that is gained depends on the object under study. Correction makes more sense for low than for intermediate incident electron energies and for objects consisting of heavy instead of light atom columns. It should be mentioned that the optimal spherical aberration constant is different from 0 mm for thin objects.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with both a spherical and chromatic aberration corrector.
• Increasing the reduced brightness of the electron source may improve the attainable precision substantially if the experiment is limited by specimen drift.
• Improving the mechanical stability of specimen holders, which would provide longer recording times, improves the attainable precision, especially if the experiment is limited by specimen drift.
Additionally, the following guidelines have been derived for microscopes operating at intermediate incident electron energies:
• A monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift, except for heavy atom columns, whereas it slightly improves the precision if the experiment is limited by the radiation sensitivity of the object.
• The precision that may be gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator, or any combination of these might be disappointing, in the sense that this gain is only marginal and might not be needed to obtain a required precision.
Furthermore, the following guidelines have been derived for microscopes operating at low incident electron energies:
• A monochromator improves the attainable precision.
• The attainable precision improves more with either a chromatic aberration corrector or a monochromator than with a spherical aberration corrector.
• The precision that is gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator, or any combination of these might be substantial.
V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy
A. Introduction

In this section, optimal statistical experimental designs of STEM experiments will be described. They will be computed in a similar way as those of CTEM experiments in Section IV. Hence, the STEM designs will be evaluated and optimized in terms of the attainable precision, that is, the CRLB, with which atom column positions of the object under study can be measured. The choice of this optimality criterion reflects the purpose of future atomic resolution TEM experiments. As mentioned in Section I, this purpose is quantitative structure determination, which means that the structure parameters of the object under study, the atom column positions in particular, are quantitatively estimated from the observations. Ultimately, this should be done as precisely as possible.

First, it will be described how STEM observations are collected. A scheme is shown in Figure 36. An electron probe is formed by demagnifying a small electron source with a set of condenser and objective lenses. The resulting probe scans in a raster over the object. At each probe position, a part of the object under study is illuminated. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. This wave propagates to a detector, which is placed in the back focal plane beyond the object. In this plane, a so-called convergent-beam electron diffraction pattern is formed. The part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In STEM, one distinguishes different imaging modes that are related to the shape or size of the detector, such as axial bright-field coherent STEM and annular dark-field incoherent STEM.
Figure 36. Scheme of a STEM experiment. Usually, an annular (black colored) or axial (grey colored) detector is chosen. The angle $\alpha_D$ represents the inner collection semi-angle of an annular detector or the outer collection semi-angle of an axial detector.
In the former mode, an axial detector with a small outer collection semi-angle $\alpha_D$ is used, whereas in the latter mode, an annular detector with a large inner collection semi-angle $\alpha_D$ is used. The angle $\alpha_D$ is shown in Figure 36. It corresponds to a detector radius $g_{det}$ equal to $\alpha_D/\lambda$, where $\lambda$ is the electron wavelength. For more details on STEM, the reader is referred to Batson, Dellby, and Krivanek (2002), Cowley (1997), Crewe (1997), Nellist and Pennycook (2000), and Pennycook and Yan (2001).

For many years, it has been standard practice to evaluate the performance of STEM experiments qualitatively, that is, in terms of direct visual interpretability. The performance criteria used are two-point resolution and contrast. For example, when axial bright-field coherent STEM is compared to annular dark-field incoherent STEM, the latter imaging mode is preferred. The basic ideas underlying this preference are the improvement of two-point resolution for incoherent imaging compared to coherent imaging (Pennycook, 1997) and the higher contrast in dark-field images than in bright-field images (Cowley, 1997). In annular dark-field incoherent STEM, visual interpretation of the images is optimal if the Scherzer conditions (Scherzer, 1949) for incoherent imaging are adopted (Pennycook and Jesson, 1991). As demonstrated by Nellist and Pennycook (1998), the resolution may be further improved if the main lobe of the probe is narrowed. However, visual interpretability is then reduced as a result of a considerable rise of the sidelobes of the probe.
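The relation between the collection semi-angle and the detector radius stated above, radius = semi-angle / wavelength, can be made concrete with a short calculation. The sketch below assumes the relativistic electron wavelengths at 300 keV and 50 keV (about 0.0197 Å and 0.0536 Å); the semi-angles of 10 and 50 mrad are hypothetical values chosen only to contrast a small axial detector with the inner edge of an annular one, and the function names are illustrative.

```python
# Relativistic electron wavelengths in Å, keyed by incident energy in keV.
WAVELENGTH_ANGSTROM = {300: 0.019687, 50: 0.053553}


def detector_radius(alpha_d_mrad: float, wavelength_angstrom: float) -> float:
    """Detector radius in reciprocal space (1/Å) for a semi-angle in mrad."""
    return alpha_d_mrad * 1e-3 / wavelength_angstrom


def collection_semi_angle(g_det: float, wavelength_angstrom: float) -> float:
    """Inverse relation: semi-angle in mrad for a detector radius in 1/Å."""
    return g_det * wavelength_angstrom * 1e3


for energy_kev, wavelength in WAVELENGTH_ANGSTROM.items():
    for alpha_d in (10.0, 50.0):  # hypothetical semi-angles in mrad
        g_det = detector_radius(alpha_d, wavelength)
        print(f"{energy_kev} keV, alpha_D = {alpha_d} mrad: g_det = {g_det:.3f} 1/Å")
```

The same semi-angle corresponds to a smaller reciprocal-space radius at 50 keV than at 300 keV, because the wavelength is longer at the lower energy.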
Two important aspects are absent in these widely used performance criteria. First, the electron-object interaction is not taken into account. Second, the dose efficiency, which is defined as the ratio of the number of detected electrons to the number of incident electrons, is left out of consideration. Improvement of resolution and contrast is often obtained at the expense of dose efficiency, which leads to a decrease in the SNR. For example, the incoherence in annular dark-field incoherent STEM is attained by using an annular dark-field detector with a geometry much larger than the objective aperture, that is, an annular detector with an inner collection semi-angle much larger than the objective aperture semi-angle (Nellist and Pennycook, 2000). The corresponding improvement of two-point resolution, obtained by adopting the Scherzer conditions for incoherent imaging, thus comes at the expense of dose efficiency. Another example is the following. It is well known that in bright-field images, decreasing the outer collection semi-angle of an axial detector leads to higher contrast, but also to a deterioration of the SNR, which deteriorates the quality of an image. To compensate for such a decrease in SNR, longer recording times are necessary, which in turn increase the disturbing influence of specimen drift. The observation that the quality of an image is determined by both the resolution and the SNR has led to several modified criteria (Sato, 1997; Sato and Orloff, 1992).

The ultimate goal of STEM is not qualitative structure determination, but quantitative structure determination instead. Ultimately, structure parameters of the object under study, such as the atom column positions, have to be measured as precisely as possible. However, this precision will always be limited by the presence of noise.
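The trade-off between dose efficiency, SNR, and recording time described above can be sketched numerically. The example assumes Poisson-distributed counts, for which the ratio of the mean to the standard deviation of a count with expectation N equals sqrt(N); this definition of SNR, the incident number of electrons, and the two efficiencies are assumptions made for illustration and are not values from the text.

```python
import math


def dose_efficiency(n_detected: float, n_incident: float) -> float:
    """Ratio of the number of detected electrons to the number of incident electrons."""
    return n_detected / n_incident


def poisson_snr(n_detected: float) -> float:
    """Mean divided by standard deviation of a Poisson count with expectation n_detected."""
    return math.sqrt(n_detected)


def recording_time_factor(efficiency_reference: float, efficiency_new: float) -> float:
    """Factor by which the recording time must grow to detect the same number of electrons."""
    return efficiency_reference / efficiency_new


N_INCIDENT = 1.0e6  # hypothetical number of incident electrons per probe position
for label, efficiency in (("large detector", 0.5), ("small detector", 0.05)):
    n_detected = efficiency * N_INCIDENT
    print(f"{label}: efficiency {dose_efficiency(n_detected, N_INCIDENT):.2f}, "
          f"SNR {poisson_snr(n_detected):.0f}")

print("recording time factor:", recording_time_factor(0.5, 0.05))
```

In this toy setting a tenfold loss of dose efficiency lowers the SNR by a factor of sqrt(10) and requires a tenfold longer recording time to recover it, which is where the sensitivity to specimen drift enters.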
Given the parametric statistical model of the observations, an expression may be obtained for the highest attainable precision with which the atom column positions can be measured. This expression, which is called the CRLB, is a function of structure parameters, microscope settings, and dose efficiency. Therefore, it may be used as an alternative performance measure in the evaluation and optimization of the design of a STEM experiment for a given object. The optimal statistical experimental design corresponds to the microscope settings resulting in the highest attainable precision. It will be obtained by using the principles of statistical experimental design explained in Section II.

The section is organized as follows. In Section V.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section V.C, an expression for the CRLB on the variance of the atom column position estimates is obtained from this model. Also, an adequate optimality criterion, which is a function of the elements of the CRLB, is given. This criterion is then used to evaluate and optimize the experimental design. Special attention will be paid
to the optimal reduced brightness of the electron source, the optimal defocus value, the optimal spherical aberration constant, the optimal detector radius, and the optimal source width. Furthermore, it will be investigated whether an annular detector is preferable to an axial one. In Section V.D, conclusions are drawn. Part of the results of this section have been published earlier in den Dekker, van Aert, van Dyck, and van den Bos (2000), van Aert and van Dyck (2001), van Aert, den Dekker, van Dyck, and van den Bos (2000a, 2002b), and van Aert, van Dyck, den Dekker, and van den Bos (2000).
B. Parametric Statistical Model of Observations

A parametric statistical model of the observations is needed in order to obtain an expression for the CRLB, which will be used for the optimization of the experimental design. In this section, such a model will be derived. It describes the expectations of the observations as well as the fluctuations of the observations about these expectations. This model contains microscope settings such as defocus, spherical aberration constant, and detector angle, as well as structure parameters such as atom column positions and the object thickness.

In the derivation of this model, three basic approximations will be made. First, use will be made of the simplified channelling theory to describe the dynamical, elastic scattering of the electrons on their way through the object (Broeckx, Op de Beeck and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). Second, temporal incoherence due to chromatic aberration, which results from a spread in defocus values, will not be taken into account. This approximation is motivated by the expectation that STEM imaging is robust to chromatic aberration (Batson, Dellby and Krivanek, 2002; Krivanek, Dellby and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Third, thermal diffuse, inelastic scattering will not be taken into account. Thermal diffuse scattered electrons are predominantly collected in the detector at high angles (Treacy, 1982). Therefore, increasing the inner collection semi-angle $\alpha_D$ (see Figure 36) of an annular detector has the effect of increasing thermal diffuse, inelastic scattering relative to elastic scattering (Wang, 2001). The main advantage of this is the strong dependence of the detected signal on the atomic number Z, hence the name Z-contrast imaging.
The disadvantage, however, is the accompanying decrease of dose efficiency, which leads to a decrease in SNR. In Section V.C, it will be shown that, as a result of this decrease in SNR, the optimal inner collection angle in terms of precision is small compared to the angles where thermal diffuse scattering is
important. This justifies neglecting thermal diffuse scattering. Although the approximations made are of limited validity, they are useful for a compact analytical model-based optimization of the design of quantitative STEM experiments as well as for explaining the basic principles governing the obtained results. The principal results are independent of the approximations made.

1. The Exit Wave

The first step toward the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. It is a complex electron wave function in the plane at the exit face of the object, resulting from the interaction of the electron probe with the object. As for CTEM, use will be made of the simplified channelling theory. At this stage, both structure parameters and microscope settings, describing the object and probe, respectively, will enter the model. According to the simplified channelling theory, applicable if the probe propagates along a major zone axis, an expression may be derived for the exit wave of an object consisting of $n_c$ atom columns (Broeckx, Op de Beeck, and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck, and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). This derivation is equivalent to that of the exit wave for CTEM, given by Eq. (84), except that the parallel incident electron beam used in CTEM is now replaced by the electron probe. The expression for the exit wave for STEM is given by:

$$
\psi(\mathbf{r}, z) = p(\mathbf{r} - \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \mathbf{b}_n)\, \phi_{1s,n}(\mathbf{r} - \mathbf{b}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{121}
$$

where $\mathbf{r} = (x\ y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the propagation direction of the electron probe, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength. The incident electron energy and the electron wavelength are related according to Eq. (85). Furthermore, the function $p(\mathbf{r} - \mathbf{r}_{kl})$ describes the probe located at the position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. The function $\phi_{1s,n}(\mathbf{r} - \mathbf{b}_n)$ is the lowest energy bound state of the nth atom column located at position $\mathbf{b}_n = (b_{xn}\ b_{yn})^T$ and $E_{1s,n}$ is its energy. The energy of this state is a parameter related to the projected ‘weight’ of the atom column, which is a function of the atomic numbers of the atoms along a column, the distance between successive atoms, and the Debye-Waller factor (van Dyck and Chen, 1999a). The lowest energy bound state $\phi_{1s,n}(\mathbf{r} - \mathbf{b}_n)$ is
a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. In Eq. (121), it is assumed that the dynamical motion of an electron in a column may be primarily expressed in terms of this tightly bound 1s-state. As in Section IV.B.1, where an expression for the exit wave is described for CTEM, it will be assumed that the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function given by Eq. (88) (Geuens and van Dyck, 2002). The excitation coefficients $c_n(\mathbf{r}_{kl} - \mathbf{b}_n)$ of Eq. (121) are found from:

$$
c_n(\mathbf{r}_{kl} - \mathbf{b}_n) = \int \phi_{1s,n}^{*}(\mathbf{r} - \mathbf{b}_n)\, p(\mathbf{r} - \mathbf{r}_{kl})\, d\mathbf{r}, \tag{122}
$$

where the superscript $^{*}$ denotes the complex conjugate. Since the 1s-state is a real-valued function and since the probe function is assumed to have radial symmetry so that $p(\mathbf{r}) = p(-\mathbf{r})$, Eq. (122) may be written as a convolution product, denoted by $\ast$:

$$
c_n(\mathbf{r}_{kl} - \mathbf{b}_n) = p(\mathbf{r}_{kl} - \mathbf{b}_n) \ast \phi_{1s,n}(\mathbf{r}_{kl} - \mathbf{b}_n). \tag{123}
$$

The convolution theorem (Papoulis, 1968) allows one to rewrite this equation as:

$$
c_n(\mathbf{r}_{kl} - \mathbf{b}_n) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}_{kl} - \mathbf{b}_n} \left[ P(\mathbf{g})\, F_{1s,n}(\mathbf{g}) \right] = \int P(\mathbf{g})\, F_{1s,n}(\mathbf{g}) \exp\left( i 2\pi\, \mathbf{g} \cdot (\mathbf{r}_{kl} - \mathbf{b}_n) \right) d\mathbf{g}, \tag{124}
$$

where $P(\mathbf{g})$ is the two-dimensional Fourier transform of the probe function $p(\mathbf{r})$, $F_{1s,n}(\mathbf{g})$ is the Fourier transform of the 1s-state $\phi_{1s,n}(\mathbf{r})$ given by Eq. (90), $\mathbf{g}$ is a two-dimensional spatial frequency vector, and the symbol ‘$\cdot$’ denotes the scalar product. The Fourier transform and the inverse Fourier transform are defined by Eqs. (91) and (92), respectively. For radially symmetric 1s-state and probe functions, Eq. (124) may be written as:

$$
c_n(\mathbf{r}_{kl} - \mathbf{b}_n) = c_n(|\mathbf{r}_{kl} - \mathbf{b}_n|) = 2\pi \int_0^{\infty} P(g)\, F_{1s,n}(g)\, J_0(2\pi g\, |\mathbf{r}_{kl} - \mathbf{b}_n|)\, g\, dg. \tag{125}
$$

This is an elementary result of the theory of Bessel functions, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind and

$$
|\mathbf{r}_{kl} - \mathbf{b}_n| = \sqrt{(x_k - b_{xn})^2 + (y_l - b_{yn})^2} \tag{126}
$$

represents the distance from the probe to the nth atom column.
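The radial integral of Eq. (125) is straightforward to evaluate numerically. The sketch below is only an illustration under stated assumptions: the probe transfer function is taken as a circular aperture times exp(−iχ(g)) with χ(g) = π ε λ g² + ½ π Cs λ³ g⁴, which is the form the text goes on to define in Eqs. (128) to (130); the Fourier transformed 1s-state of Eq. (90) is not reproduced here and is replaced by a hypothetical Gaussian low-pass filter; and all numerical values (in Å and 1/Å) and function names are illustrative.

```python
import cmath
import math


def bessel_j0(x: float, n_theta: int = 200) -> float:
    """J0(x) from the integral representation (1/pi) * int_0^pi cos(x sin t) dt (midpoint rule)."""
    step = math.pi / n_theta
    return sum(math.cos(x * math.sin((k + 0.5) * step)) for k in range(n_theta)) / n_theta


def phase_shift(g: float, defocus: float, cs: float, wavelength: float) -> float:
    """chi(g) = pi*eps*lambda*g^2 + 0.5*pi*Cs*lambda^3*g^4."""
    return math.pi * defocus * wavelength * g ** 2 + 0.5 * math.pi * cs * wavelength ** 3 * g ** 4


def probe_transfer(g: float, g_ap: float, defocus: float, cs: float, wavelength: float) -> complex:
    """P(g) = A(g) exp(-i chi(g)) with a circular aperture of radius g_ap."""
    if g > g_ap:
        return 0j
    return cmath.exp(-1j * phase_shift(g, defocus, cs, wavelength))


def gaussian_low_pass(g: float, width: float) -> float:
    """Hypothetical stand-in for F_1s,n(g): a Gaussian low-pass filter of the given width."""
    return math.exp(-0.5 * (g / width) ** 2)


def excitation_coefficient(distance: float, g_ap: float, defocus: float, cs: float,
                           wavelength: float, width: float, n_g: int = 400) -> complex:
    """Midpoint-rule value of 2*pi * int_0^g_ap P(g) F(g) J0(2*pi*g*distance) g dg."""
    dg = g_ap / n_g  # the aperture makes the integrand vanish beyond g_ap
    total = 0j
    for k in range(n_g):
        g = (k + 0.5) * dg
        total += (probe_transfer(g, g_ap, defocus, cs, wavelength)
                  * gaussian_low_pass(g, width)
                  * bessel_j0(2.0 * math.pi * g * distance) * g)
    return 2.0 * math.pi * total * dg


# Hypothetical settings: 300 keV, aperture radius 0.5 1/Å, defocus -400 Å, Cs = 0.5 mm.
for distance in (0.0, 0.5, 1.0):
    c_n = excitation_coefficient(distance, 0.5, -400.0, 5.0e6, 0.019687, 0.4)
    print(f"distance {distance:.1f} Å: |c_n| = {abs(c_n):.4f}")
```

For an aberration-free probe placed exactly on the column, the integral has the closed form 2π w² (1 − exp(−g_ap²/(2w²))) for a Gaussian of width w, which gives a simple check of the quadrature.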
The illuminating STEM probe $p(\mathbf{r})$ is the inverse Fourier transform of the coherent transfer function of the objective lens $P(\mathbf{g})$:

$$
p(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}}\, P(\mathbf{g}). \tag{127}
$$

The transfer function $P(\mathbf{g})$ is radially symmetric and given by:

$$
P(\mathbf{g}) = P(g) = A(g) \exp(-i\chi(g)), \tag{128}
$$

where $g = |\mathbf{g}|$ is the Euclidean norm of the two-dimensional spatial frequency vector. The circular aperture function $A(g)$ is defined in the same way as in Eq. (95):

$$
A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{129}
$$

with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_0$ is equal to $g_{ap}\lambda$. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and is defined in the same way as in Eq. (96) (van Dyck, 2002):

$$
\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{130}
$$

with $\varepsilon$ the defocus, $\lambda$ the electron wavelength, and $C_s$ the spherical aberration constant. Other aberration effects, such as 2-fold astigmatism, 3-fold astigmatism, and axial coma, could also be included in this phase shift (Thust, Overwijk, Coene, and Lentzen, 1996). From the comparison of Eq. (128) with Eq. (94), where the microscope’s transfer function for CTEM is described, it follows that, apart from the damping envelope functions describing partial spatial and temporal coherence in CTEM, both equations are equal to one another. In the present work, temporal incoherence will not be taken into account since STEM imaging is suspected to be robust to chromatic aberration (Batson, Dellby and Krivanek, 2002; Krivanek, Dellby and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Furthermore, spatial incoherence, resulting from a finite source image, will be incorporated in the model in the next section.

2. The Image Intensity Distribution

From the expression for the exit wave, which has been obtained in the previous section, the image intensity distribution may be computed. The exit wave, as given by Eq. (121), describes the interaction of the electron probe, which is located at a given position, and the object. The steps needed in proceeding from the exit wave to the image intensity distribution are the
following ones. First, the propagation from this exit wave to the detector, which is placed in the back focal plane beyond the object, is described as the Fourier transform of the exit wave. Next, the intensity pattern in the detector plane is given by the modulus square of the wave thus obtained. This is the so-called convergent-beam electron diffraction pattern. Then, the part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In this way, an expression for the image intensity distribution may be obtained. At this stage, the microscope parameters describing the detector will enter the model. From the procedure described above, it follows that the total detected intensity in the Fourier detector plane of a STEM is given by (Cowley, 1976):

$$
I_{ps}(\mathbf{r}_{kl}) = \int |\Psi(\mathbf{g}, z)|^2\, D(\mathbf{g})\, d\mathbf{g}, \tag{131}
$$

where $\Psi(\mathbf{g}, z)$ is the two-dimensional Fourier transform of the exit wave $\psi(\mathbf{r}, z)$ and $|\Psi(\mathbf{g}, z)|^2$ describes the convergent-beam electron diffraction pattern. Furthermore, $D(\mathbf{g})$ is the detector function, which is equal to one in the detected field and equal to zero elsewhere. An expression for the two-dimensional Fourier transform of the exit wave may be obtained from combining Eqs. (91) and (121):

$$
\Psi(\mathbf{g}, z) = P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \mathbf{b}_n)\, F_{1s,n}(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{b}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right]. \tag{132}
$$

Notice that it can be seen from Eqs. (131) and (132) that for identical atom columns, the contrast varies periodically with thickness. This periodicity is the same as for CTEM, given by Eq. (107):

$$
D_{1s} = \frac{2 E_0 \lambda}{E_{1s,n}}. \tag{133}
$$

It is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns. Thus far, it has been tacitly assumed that the source image may be modelled as a point. Therefore, the subscript ‘ps’ in Eq. (131) refers to point source. Elaborating on the ideas of Mory, Tence, and Colliex (1985), it follows that the finite size of the source image may be taken into account
VAN AERT ET AL.
by a two-dimensional convolution of the intensity distribution $I_{ps}(\mathbf{r}_{kl})$ with the intensity distribution of the source image $S(\mathbf{r})$:

$$ I(\mathbf{r}_{kl}) = I_{ps}(\mathbf{r}_{kl}) * S(\mathbf{r}_{kl}). \tag{134} $$
The effect of the source image is thus an additional blurring. A realistic form for the intensity distribution of the source image is Gaussian (Mory, Tence, and Colliex, 1985). The function $S(\mathbf{r})$ is thus a two-dimensional normalized Gaussian distribution given by:

$$ S(\mathbf{r}) = S(r) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{r^2}{2\sigma^2} \right), \tag{135} $$

with $\sigma$ the standard deviation, representing the width corresponding to the radius containing 39% of the total intensity of $S(r)$. Up to now, no assumptions have been made about the shape or size of the detector. From now on, however, the detector is assumed to be radially symmetric. Mathematically, this means that $D(\mathbf{g}) = D(g)$. Insight into the expression given by the right-hand member of Eq. (134) is obtained if it is split up into three terms:

$$ I(\mathbf{r}_{kl}) = I_0 + I_1(\mathbf{r}_{kl}) + I_2(\mathbf{r}_{kl}). \tag{136} $$
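The zeroth-order term $I_0$ corresponds to a non-interacting probe, the first-order term $I_1(\mathbf{r}_{kl})$ to the interference between the probe and the 1s-state, and the second-order term $I_2(\mathbf{r}_{kl})$ to the interference of different 1s-states. The zeroth-order term $I_0$ is given by:

$$ I_0 = \int |P(\mathbf{g}) \exp(-2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl})|^2 D(\mathbf{g})\, d\mathbf{g} * S(\mathbf{r}_{kl}). \tag{137} $$

It describes a constant background intensity, resulting from the non-interacting electrons collected by the detector. Since the modulus of the phase factor equals one, the integrand does not depend on the probe position, and the convolution with the normalized source image $S$ leaves this constant unchanged. The equation may therefore be rewritten by substitution of Eq. (128) and using the fact that $D(\mathbf{g})$ is radially symmetric. This results in:

$$ I_0 = 2\pi \int A^2(g) D(g)\, g\, dg. \tag{138} $$

Due to the definition of the aperture function, given by Eq. (129), the following equality may be used:

$$ A^2(g) = A(g). \tag{139} $$

Therefore, Eq. (138) becomes:

$$ I_0 = 2\pi \int A(g) D(g)\, g\, dg. \tag{140} $$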
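As a rough numerical illustration of Eq. (140), the sketch below evaluates the background term for an axial and an annular detector. It assumes that the aperture function of Eq. (129), which is not reproduced here, is a top-hat equal to one for $g \le g_{ap}$ and zero elsewhere; the function names, the integration grid, and the chosen radii are hypothetical and serve only as an example.

```python
import math

def aperture(g, g_ap):
    """Assumed top-hat aperture function A(g): 1 inside the aperture, 0 outside."""
    return 1.0 if g <= g_ap else 0.0

def axial_detector(g, g_det):
    """Axial detector D(g): collects spatial frequencies up to the outer radius g_det."""
    return 1.0 if g <= g_det else 0.0

def annular_detector(g, g_det):
    """Annular detector D(g): collects spatial frequencies from the inner radius g_det outward."""
    return 1.0 if g >= g_det else 0.0

def background_intensity(detector, g_det, g_ap, g_max=3.0, steps=30000):
    """Midpoint-rule estimate of I0 = 2*pi * integral of A(g) D(g) g dg, Eq. (140)."""
    dg = g_max / steps
    total = 0.0
    for i in range(steps):
        g = (i + 0.5) * dg
        total += aperture(g, g_ap) * detector(g, g_det) * g * dg
    return 2.0 * math.pi * total

G_AP = 0.56  # illustrative objective aperture radius in 1/angstrom

# Axial detector smaller than the aperture: I0 is the detector area pi * g_det^2.
i0_axial = background_intensity(axial_detector, 0.28, G_AP)
# Annular detector whose inner radius equals the aperture radius: A(g) D(g) = 0.
i0_annular = background_intensity(annular_detector, 0.56, G_AP)

print(f"axial detector:   I0 = {i0_axial:.4f}")
print(f"annular detector: I0 = {i0_annular:.4f}")
```

Under this top-hat assumption the background vanishes as soon as the inner radius of the annular detector reaches the aperture radius, whereas an axial detector always collects part of the non-interacting probe.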
QUANTITATIVE ATOMIC RESOLUTION TEM
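The first-order term $I_1(\mathbf{r}_{kl})$ corresponds to the interference of the incident probe $p(\mathbf{r} - \mathbf{r}_{kl})$ and the 1s-state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$:

$$ I_1(\mathbf{r}_{kl}) = \sum_{n=1}^{n_c} 2\,\mathrm{Re}\left\{ c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int P^{*}(g)\, \Phi_{1s,n}(g)\, J_0(2\pi g |\mathbf{r}_{kl} - \boldsymbol{\beta}_n|)\, D(g)\, g\, dg \right\} * S(\mathbf{r}_{kl}). \tag{141} $$

This is a linear term in the sense that contributions of different atom columns are added. The second-order term $I_2(\mathbf{r}_{kl})$ describes the interference of different 1s-states $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ and $\phi_{1s,m}(\mathbf{r} - \boldsymbol{\beta}_m)$:

$$ I_2(\mathbf{r}_{kl}) = \sum_{n=1}^{n_c} \sum_{m=1}^{n_c} c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|)\, c_m(|\mathbf{r}_{kl} - \boldsymbol{\beta}_m|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] \left[ \exp\left( +i\pi \frac{E_{1s,m}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int \Phi_{1s,n}(g)\, \Phi_{1s,m}(g)\, J_0(2\pi g\, d_{n,m})\, D(g)\, g\, dg * S(\mathbf{r}_{kl}), \tag{142} $$

where

$$ d_{n,m} = |\boldsymbol{\beta}_n - \boldsymbol{\beta}_m| = \sqrt{(\beta_{xn} - \beta_{xm})^2 + (\beta_{yn} - \beta_{ym})^2} \tag{143} $$

is the distance between the atom columns at positions $\boldsymbol{\beta}_n$ and $\boldsymbol{\beta}_m$. It is only the last term $I_2(\mathbf{r}_{kl})$ of Eq. (136) that remains for annular dark-field STEM using an annular detector with an inner collection radius $g_{det} = \alpha_D/\lambda$ greater than or equal to the objective aperture radius $g_{ap} = \alpha_o/\lambda$. The terms $I_0$ and $I_1(\mathbf{r}_{kl})$ of Eq. (136), given by Eqs. (140) and (141), respectively, are equal to zero since:

$$ P(g) D(g) = 0, \tag{144} $$

or, equivalently,

$$ A(g) D(g) = 0. \tag{145} $$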
Therefore, Eq. (142) describes the image intensity distribution for annular dark-field STEM. It can be shown that this result agrees with the result derived in (Pennycook, Rafferty, and Nellist, 2000).

3. The Image Recording

Finally, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. In a STEM, the illuminating electron probe scans in a raster over the object. The image is thus recorded as a function of the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. Without loss of generality, the image magnification is ignored. Therefore, the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$ directly corresponds to an image pixel at the same location. The recording device is characterized as consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the probe sampling distances in the x and y directions, respectively. Pixel $(k, l)$ corresponds to position $(x_k\ y_l)^T = (x_1 + (k - 1)\Delta x\ \ y_1 + (l - 1)\Delta y)^T$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1\ y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is chosen centered about $(0\ 0)^T$. Assuming a recording time $\tau$ for one pixel and a probe current $I$, the number of electrons per probe position is given by:

$$ \frac{I\tau}{e} \tag{146} $$

with $e = 1.6 \times 10^{-19}$ C the electron charge. The recording time $\tau$ for one pixel is the ratio of the recording time $t$ for the whole FOV to the total number of pixels $KL$:

$$ \tau = \frac{t}{KL}. \tag{147} $$

The total number of incident electrons $N_i$ is equal to:

$$ N_i = KL\, \frac{I\tau}{e}. \tag{148} $$
The probe current $I$ is given by (Barth and Kruit, 1996):

$$ I = \frac{B_r E_0 \pi^2 d_{I50}^2 \alpha_o^2}{4e} \tag{149} $$

with $B_r$ the reduced brightness of the electron source, $E_0$ the incident electron energy, $d_{I50}$ the diameter of the source image containing 50% of the current, and $\alpha_o$ the objective aperture semi-angle. From Eq. (135), it follows that

$$ d_{I50} = 2\sqrt{-2\ln 0.5}\ \sigma. \tag{150} $$

As a consequence of the detector shape and size in STEM, only the electrons within a selected part of the convergent-beam electron diffraction pattern are used to produce the image. Mathematically, this is expressed in Eq. (131). The selected part is determined by the detector function $D(\mathbf{g})$. Suppose that $f_{kl}$ represents the fraction of electrons collected by the detector. Then, the expected number of electrons $\lambda_{kl}$ at the pixel $(k, l)$ equals (Reimer, 1993):

$$ \lambda_{kl} = f_{kl}\, \frac{I\tau}{e}. \tag{151} $$
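The relations between the Gaussian source image, its 50% diameter, and the expected number of counts can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the per-pixel recording time is passed as `dwell_time`, and the numerical inputs (a probe current of 2 × 10⁻¹¹ A, a per-pixel recording time of 8 × 10⁻⁸ s, and a collected fraction of 0.2) are assumed example values rather than results from the text.

```python
import math

E_CHARGE = 1.6e-19  # electron charge in C, rounded as in the text

def fraction_within(radius, sigma):
    """Fraction of a normalized 2D Gaussian S(r), Eq. (135), inside a circle of given radius."""
    return 1.0 - math.exp(-radius ** 2 / (2.0 * sigma ** 2))

def d_i50(sigma):
    """Diameter of the source image containing 50% of the current, Eq. (150)."""
    return 2.0 * math.sqrt(-2.0 * math.log(0.5)) * sigma

def expected_counts(fraction, current, dwell_time):
    """Expected number of electrons at one pixel, Eq. (151): f_kl * I * tau / e."""
    return fraction * current * dwell_time / E_CHARGE

sigma = 0.5  # hypothetical source width in angstrom

# The radius sigma contains about 39% of the intensity of S(r).
print(f"fraction inside sigma:     {fraction_within(sigma, sigma):.3f}")
# The radius d_I50 / 2 contains 50% by construction.
print(f"fraction inside d_I50 / 2: {fraction_within(d_i50(sigma) / 2.0, sigma):.3f}")
print(f"d_I50 / sigma:             {d_i50(sigma) / sigma:.4f}")

# Assumed example: I = 2e-11 A, tau = 8e-8 s, collected fraction f_kl = 0.2.
print(f"expected counts per pixel: {expected_counts(0.2, 2e-11, 8e-8):.2f}")
```

With these assumed inputs, roughly ten electrons arrive per probe position, of which the detector records the fraction $f_{kl}$; this illustrates why the observations are treated as counting statistics.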
The fraction $f_{kl}$, which is smaller than 1, may be expressed as:

$$ f_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{D=1}} \tag{152} $$

with $I(\mathbf{r}_{kl})$ given by Eq. (134) and $I_{D=1}$ the constant intensity obtained if the detector function $D(\mathbf{g})$ is uniform. From straightforward calculations, using Eqs. (136)–(142), it follows that:

$$ I_{D=1} = 2\pi \int A(g)\, g\, dg. \tag{153} $$

The total number of detected electrons $N$ to form the image is now equal to:

$$ N = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}\, \frac{I\tau}{e}. \tag{154} $$
Then, the dose efficiency DE, which is defined as the ratio of the number of detected electrons to the number of incident electrons, becomes:

$$ DE = \frac{N}{N_i} = \frac{\sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}}{KL}. \tag{155} $$

This follows directly from Eqs. (148) and (154). For STEM, the observations are electron counting results, which are supposed to be Poisson distributed and statistically independent. Therefore, the joint probability density function of the observations $P(\omega; \boldsymbol{\beta})$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to $K \times L$ and the expectation model is given by Eq. (151). The parameter vector $\boldsymbol{\beta} = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$ consists of the x- and y-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector $\boldsymbol{\beta}$ can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, optimal statistical experimental designs of STEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the atom columns can be estimated. The STEM observations are described by the parametric statistical model derived in Section V.B. This model will be used to obtain an expression for the attainable precision, which is represented by the CRLB associated with the position coordinates. In Section II, it has been explained how an expression for the CRLB may be derived. Next, a scalar measure of this CRLB, that is, a function of the matrix elements of the CRLB, will be chosen as optimality criterion. This criterion will then be evaluated and optimized as a function of the microscope settings. An overview of these microscope settings will be given in Section V.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section V.C.2, the results of the numerical evaluation and optimization of the microscope settings will be presented for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section V.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may be extended relatively easily to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Parameters

An overview of the microscope settings that enter the parametric statistical model of the STEM observations is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design.
The settings describing the electron probe are the defocus $\varepsilon$, the spherical aberration constant $C_s$, the objective aperture radius $g_{ap}$, the electron wavelength $\lambda$, the width of the source image $\sigma$, and the reduced brightness $B_r$ of the electron source. The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. In the evaluation of the experimental design, the electron wavelength will be kept constant. Furthermore, the effect of the reduced brightness on the precision with which atom column positions can be estimated will be studied. For most electron microscopes, the spherical aberration constant is a fixed property of the microscope as well; however, by incorporating a spherical aberration corrector, it becomes tunable. Therefore, it is interesting to study the effect of the spherical aberration constant on the precision. The microscope settings specifying the detector configuration are related to the detector function $D(\mathbf{g})$. In principle, the detector may have any shape or size. However, in this article, the shape of the detector is confined to the more common ones, which are annular and axial detectors. The inner or outer collection radius $g_{det}$ or semi-angle $\alpha_D$, which are related as $g_{det} = \alpha_D/\lambda$, is tunable. The microscope settings describing the image recording are the probe sampling distances or, equivalently, the pixel sizes $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $\tau$. The pixel sizes $\Delta x$ and $\Delta y$ are kept constant. In agreement with the results presented in Section III, it can be shown that the precision will generally improve with smaller pixel sizes for a constant total number of incident electrons $N_i$, as defined by Eq. (148). However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel SNR decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999; den Dekker, Sijbers and van Dyck, 1999; van Aert, den Dekker, van Dyck, and van den Bos, 2002a). The number of pixels $K$ and $L$, defining the FOV for given pixel sizes $\Delta x$ and $\Delta y$, is chosen fixed, but large enough to guarantee that the tails of the electron probe are collected in the FOV.

2. Numerical Results

In this section, the experimental designs will be numerically evaluated and optimized in terms of the attainable precision with which atom column positions can be estimated. This section is divided into four parts. First, general comments, which should be kept in mind while reading this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, the optimal experimental designs for isolated atom columns will be computed.
Third, the influence of neighboring atom columns on the optimal experimental design will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion.

a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. An overview of the original microscope settings and the structure parameters of the objects under study will be given.

i. Original and Optimal Microscope Settings. In what follows, the values for the original, non-optimized microscope settings are given in Table 15, unless otherwise mentioned. These are typical values used in today's STEM experiments.

TABLE 15
Original Microscope Settings

Microscope setting                          Value
$E_0$ (keV)                                 300
$\lambda$ (Å)                               0.02
$B_r$ (A m$^{-2}$ sr$^{-1}$ V$^{-1}$)       2 × 10$^{7}$
$C_s$ (mm)                                  0.5
$\Delta x$ (Å)                              0.2
$\Delta y$ (Å)                              0.2
$K$                                         100
$L$                                         100
$\tau$ (s)                                  8 × 10$^{-8}$

Furthermore, in the conventional approach, which is based on direct visual interpretability, the Scherzer conditions for incoherent imaging are usually applied (Pennycook and Jesson, 1991; Scherzer, 1949). Under these conditions, the objective aperture radius and defocus are given by:

$$ g_{ap} = \frac{1}{\lambda} \left( \frac{4\lambda}{C_s} \right)^{1/4}, \qquad \varepsilon = -(C_s \lambda)^{1/2}, \tag{156} $$
respectively. For the microscope settings as given in Table 15, this corresponds to $g_{ap}$ = 0.56 Å$^{-1}$ and $\varepsilon$ = −320 Å. Moreover, the outer collection radius of an axial detector or the inner collection radius of an annular detector is usually taken much smaller or larger than the objective aperture radius, respectively (Nellist and Pennycook, 2000). To the authors' knowledge, explicit expressions for these radii do not exist. One of the guidelines that has been found in the literature is, for example, that the inner collection radius $g_{det}$ of an annular detector should be at least three times the objective aperture radius (Hartel, Rose, and Dinges, 1996). Therefore, if $g_{ap}$ is equal to 0.56 Å$^{-1}$, this corresponds to a value of $g_{det}$ larger than 1.68 Å$^{-1}$. Other researchers propose a value of two times the objective aperture radius, which is representative of a typical Crewe detector (Pennycook, Jesson, Chisholm, Browning, McGibbon, and McGibbon, 1995). For $g_{ap}$ equal to 0.56 Å$^{-1}$, this corresponds to $g_{det}$ equal to 1.12 Å$^{-1}$. It should be noticed that for such large values of the detector radius, thermal diffuse, inelastic scattering may be more important than elastic scattering. Consequently, the expectation model proposed in Section V.B, which only takes elastic scattering into account, is no longer valid. For example, the oscillation of the detected intensity as a function of thickness with a periodicity as given by Eq. (133) is no longer observed using annular detectors with a large inner detector radius. In Pennycook and Yan (2001), this oscillation as a function of thickness has been studied for a rhodium atom column, where the distance between successive atoms is equal to 2.7 Å. This has been done for a small, medium, and large detector radius corresponding to a value of 0.75 Å$^{-1}$, 1.50 Å$^{-1}$, and 2.25 Å$^{-1}$, respectively. From this study, it followed that the periodic oscillation as described by the model given in Section V.B applies for the small detector radius, whereas this oscillation is almost completely suppressed for the large detector radius. Therefore, in the present study, the evaluation of the inner detector radius of annular detectors will be restricted to small values. It will be shown that this constraint does not cause problems for the computation of the optimal detector radius in terms of attainable precision. The optimal value of the detector radius will be shown to be much smaller than the values usually taken.

In the remainder of this section, the values for the microscope settings which are usually preferred in STEM experiments, as described above, will be compared to their optimal values in terms of attainable precision. These optimal values are found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings, the dimension of which is equal to the number of microscope settings. It has been found that some of these microscope settings are strongly correlated. This implies that the optimization cannot be performed one setting at a time. For example, it will be shown that the optimal detector radius strongly depends on the aperture radius. Furthermore, the optimal defocus value strongly depends on the spherical aberration constant. In what follows, the results following from this simultaneous optimization procedure will be described setting by setting.
The relation of each setting to other microscope settings will be mentioned if necessary. In what follows, the attainable precision will be computed as a function of the following microscope settings:

• Objective aperture radius
• Radius of annular and axial detectors
• Defocus
• Spherical aberration constant
• Reduced brightness of the electron source
• Width of the source image

For isolated atom columns, the width of the source image, which is determined by Eq. (135), will be kept constant in the following sense. The diameter $d_{I50}$ of the source image containing 50% of the current will be assumed to be determined by the objective aperture angle $\alpha_o$, following the relation (Barth and Kruit, 1996)

$$ d_{I50} = \frac{0.54\lambda}{\alpha_o}. \tag{157} $$
The right-hand member of this equation is equal to the diameter of the diffraction-error disc containing 50% of the total intensity. Consequently, the contribution of the source image to the total probe size is rather small. Then, meeting Eq. (157), it follows from Eq. (149) that the probe current is constant and equal to $I_B = 10^{-18} B_r$ (Barth and Kruit, 1996). This implies that the total number of incident electrons per square Å is constant as a function of the microscope settings for a fixed recording time. The reason why the diameter of the source image will be assumed to be determined by the diffraction-error disc, instead of assuming it to be tunable, is the following. For isolated atom columns, the optimal diameter $d_{I50}$ would be infinite, corresponding to an infinite probe current, as follows from Eq. (149). However, an infinite source image is not realistic since the images of neighboring atom columns will then strongly overlap. Therefore, the dependence of the precision on the 'tunable' source diameter will be studied for neighboring atom columns only.

ii. Structure Parameters. The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for different types of atom columns. The atom columns which will be considered are given in Table 16, as well as the corresponding width of the 1s-state $a_n$, its energy $E_{1s,n}$, the interatomic distance $d$, that is, the distance between successive atoms along a column, and the atomic number $Z$ of these atoms. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.

TABLE 16
Width of the 1s-State, Its Energy (Debye-Waller Factor = 0.6 Å$^2$ and $E_0$ = 300 keV), Interatomic Distance, and Atomic Number for Different Atom Columns

Structure parameter   Si [100]   Si [110]   Sr [100]   Sn [100]   Cu [100]   Au [100]
$a_n$ (Å)             0.34       0.27       0.22       0.20       0.18       0.13
$E_{1s,n}$ (eV)       20.2       37.4       57.3       69.8       78.3       210.8
$d$ (Å)               5.43       3.84       6.08       6.49       3.62       4.08
$Z$                   14         14         38         50         29         79

b. Isolated Atom Columns

i. Structure Parameters. For isolated atom columns, the atom column positions and the object thickness are given in Table 17. The object thickness is equal to half the extinction distance, which is given by Eq. (133). From the proposed model in Section V.B, it follows that at this thickness and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions.

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates $\boldsymbol{\beta} = (\beta_x\ \beta_y)^T$ of the atom column can be estimated. This attainable precision (in terms of the variance) is represented by the diagonal elements $\sigma^2_{\beta_x}$ and $\sigma^2_{\beta_y}$ of the CRLB. These elements are theoretical lower bounds on the variance with which the position coordinates can be estimated without bias. An expression for them will be derived in the following paragraph. This derivation is completely analogous to the one presented in Section IV.C.2 for CTEM and may therefore be skipped by the reader who is already familiar with it. For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix $F$ associated with the position coordinates. The $(r, s)$th element of $F$ is defined by Eq. (12):

$$ F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \tag{158} $$
with $\lambda_{kl}$ the expected number of electrons at the pixel $(k, l)$. An expression for the elements $F_{rs}$ is found by substitution of the expectation model given by Eq. (151), as derived in Section V.B, and its derivatives with respect to the position coordinates into Eq. (158). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$.

TABLE 17
Structure Parameters of an Isolated Atom Column

Structure parameter   Value
$\beta_x$ (Å)         0
$\beta_y$ (Å)         0
$z$ (Å)               $E_0 \lambda / E_{1s,n}$
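To give a feeling for the thicknesses implied by Table 17, the following sketch evaluates the extinction distance of Eq. (133) and the corresponding half extinction distance for the column types of Table 16, using the energy and wavelength of Table 15. The dictionary and function names are hypothetical; the numerical inputs are the tabulated values.

```python
E0_EV = 300e3        # incident electron energy in eV (Table 15)
WAVELENGTH_A = 0.02  # electron wavelength in angstrom (Table 15)

# Energies of the 1s-state in eV, copied from Table 16.
E1S_EV = {
    "Si [100]": 20.2,
    "Si [110]": 37.4,
    "Sr [100]": 57.3,
    "Sn [100]": 69.8,
    "Cu [100]": 78.3,
    "Au [100]": 210.8,
}

def extinction_distance(e1s_ev, e0_ev=E0_EV, wavelength=WAVELENGTH_A):
    """Extinction distance D_1s = 2 E0 lambda / E_1s,n in angstrom, Eq. (133)."""
    return 2.0 * e0_ev * wavelength / e1s_ev

def half_extinction_thickness(e1s_ev, e0_ev=E0_EV, wavelength=WAVELENGTH_A):
    """Object thickness z = E0 lambda / E_1s,n used in Table 17."""
    return 0.5 * extinction_distance(e1s_ev, e0_ev, wavelength)

for column, energy in E1S_EV.items():
    print(f"{column}: D_1s = {extinction_distance(energy):6.1f} A, "
          f"z = {half_extinction_thickness(energy):6.1f} A")
```

The resulting thickness ranges from roughly 300 Å for silicon [100] down to roughly 28 Å for gold [100], so the thickness at which the electrons are most strongly localized differs by an order of magnitude between light and heavy columns.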
For the radially symmetrical expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

$$ \sigma^2_{\beta_x} = \sigma^2_{\beta_y} = [F^{-1}]_{11} \tag{159} $$

with $[F^{-1}]_{11}$ the (1, 1)th element of the CRLB, that is, of $F^{-1}$. In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$, that is, the square root of the right-hand member of Eq. (159). It will be used as optimality criterion for the evaluation and optimization of the experimental design. Therefore, this chosen optimality criterion will be calculated for various types of atom columns as a function of the objective aperture radius, the radius of annular and axial detectors, the defocus, the spherical aberration constant, and the reduced brightness of the electron source. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The relevant constraint is either the radiation sensitivity of the object under study or the specimen drift. Therefore, either the incident electron dose per square Å or the recording time has to be kept within the constraints.
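The mechanics of Eqs. (158) and (159) can be illustrated with a deliberately simplified expectation model. In the sketch below, the STEM expectation model of Eq. (151) is replaced by a hypothetical two-dimensional Gaussian peak of width `rho` containing `n_total` expected counts; this substitution, all function names, and all parameter values are assumptions made for illustration only. For such a peak, sampled finely and without background, the lower bound on the standard deviation should approach `rho / sqrt(n_total)`, which provides a check on the computation.

```python
import math

def expectation(x, y, bx, by, n_total, rho, dx, dy):
    """Hypothetical Gaussian expectation model (a stand-in, NOT the STEM model of Eq. 151)."""
    r2 = (x - bx) ** 2 + (y - by) ** 2
    return n_total * dx * dy * math.exp(-r2 / (2.0 * rho ** 2)) / (2.0 * math.pi * rho ** 2)

def fisher_matrix(bx=0.0, by=0.0, n_total=1000.0, rho=0.5, K=100, L=100, dx=0.2, dy=0.2):
    """2 x 2 Fisher information matrix of Eq. (158) for Poisson distributed pixel counts."""
    x1 = -(K - 1) * dx / 2.0  # bottom left pixel, FOV centered about (0, 0)
    y1 = -(L - 1) * dy / 2.0
    f = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(K):
        x = x1 + k * dx
        for l in range(L):
            y = y1 + l * dy
            lam = expectation(x, y, bx, by, n_total, rho, dx, dy)
            if lam <= 0.0:
                continue  # pixels without expected counts carry no information
            # Analytical derivatives of the Gaussian model with respect to (bx, by).
            grad = (lam * (x - bx) / rho ** 2, lam * (y - by) / rho ** 2)
            for r in range(2):
                for s in range(2):
                    f[r][s] += grad[r] * grad[s] / lam
    return f

def crlb_std(f):
    """Square roots of the diagonal elements of the inverse of a 2 x 2 matrix, cf. Eq. (159)."""
    det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    return math.sqrt(f[1][1] / det), math.sqrt(f[0][0] / det)

sigma_bx, sigma_by = crlb_std(fisher_matrix())
print(f"lower bound on std of bx: {sigma_bx:.5f} A")
print(f"lower bound on std of by: {sigma_by:.5f} A")
print(f"rho / sqrt(n_total):      {0.5 / math.sqrt(1000.0):.5f} A")
```

For the actual STEM study, `expectation` would be replaced by Eq. (151) and the derivatives would follow from Eqs. (136)–(142); the remainder of the computation is unchanged.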
iii. Optimal Objective Aperture Radius. First, the dependence of the precision on the objective aperture radius $g_{ap}$ is studied. Recall that the objective aperture radius is directly related to the objective aperture semi-angle $\alpha_o$ according to the formula $\alpha_o = g_{ap}\lambda$. The precision, which is represented by the square root of the right-hand member of Eq. (159), has been evaluated as a function of the objective aperture radius for annular as well as for axial detectors and for different atom column types. From this evaluation, it is found that the optimal aperture radius is mainly determined by the atom column type under study and that it is the same for annular and axial detectors. The optimal aperture radius turns out to be proportional to the width of the function $\Phi_{1s,n}(g)$, that is, the Fourier transform of the 1s-state $\phi_{1s,n}(r)$ as given by Eq. (90). Figure 37 compares the optimal aperture radius with the width of $\Phi_{1s,n}(g)$. The width of $\Phi_{1s,n}(g)$ is equal to $1/(4\pi a_n)$, where $a_n$ is the width of the 1s-state $\phi_{1s,n}(r)$ as given by Eq. (88). The optimal aperture radii are plotted as a function of $(d^2/Z + 0.276B)^{-1/2}$, since this term is more or less proportional to the width of the function $\Phi_{1s,n}(g)$ as shown in (van Dyck and Chen, 1999a). For a given atom column, $d$ represents the interatomic distance, $Z$ the atomic number, and $B$ the Debye-Waller factor.

Figure 37. Comparison of the optimal aperture radius for $C_s$ equal to 0.5 mm with the width of the Fourier transformed 1s-state $\Phi_{1s,n}(g)$ for different atom column types. The width of $\Phi_{1s,n}(g)$ is proportional to $(d^2/Z + 0.276B)^{-1/2}$.

From Figure 37, it is clear that the influence of the object on the optimal objective aperture radius is substantial. In contrast to what one might expect, the resulting probe in the optimal design is not as narrow as possible. Its main lobe is even broader than the 1s-state $\phi_{1s,n}(r)$. This is shown in Figure 38, where both the 1s-state and the amplitude of the optimal probe are shown for a silicon and a gold [100] atom column. Furthermore, for heavy atom columns such as gold [100], an increase of the spherical aberration constant results in a decrease of the optimal aperture radius and vice versa. For this column, the optimal aperture radius is equal to 0.75 Å$^{-1}$ for $C_s$ equal to 0 mm, whereas it is equal to 0.50 Å$^{-1}$ for $C_s$ equal to 0.5 mm. For lighter atom columns such as silicon [100], the optimal aperture radius is independent of the spherical aberration constant. For a silicon [100] atom column, the optimal aperture radius is equal to 0.28 Å$^{-1}$ for $C_s$ equal to both 0 mm and 0.5 mm. It should be noticed that the foregoing analysis was done for object thicknesses equal to half the extinction distance, as follows from Eq. (133) and Table 17. However, the conclusions remain the same for thicknesses different from half the extinction distance. Also, it should be mentioned that these conclusions are not subject to the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. From the discussion given above, it follows that there is a fundamental difference between the optimal aperture radius in terms of the attainable precision and the aperture radius as given by Eq. (156), which is assumed to
be optimal in terms of direct visual interpretability. The former depends more on the width of the 1s-state of the column under study than on the spherical aberration constant. The latter depends on the spherical aberration constant, but is independent of any structure parameter.

Figure 38. The dashed curves of the left- and right-hand figures represent the 1s-state $\phi_{1s,n}(r)$ for a silicon [100] and a gold [100] atom column, respectively. The solid curves represent the amplitude of their associated optimal electron probes, that is, $|p(r)|$, for $C_s$ equal to 0.5 mm.

iv. Optimal Detector Configuration. Next, the optimal detector configuration in terms of precision is described. In Figure 39, the precision with which the position coordinates of an isolated silicon [100] atom column can be estimated is evaluated as a function of the detector-to-aperture radius, that is, $g_{det}/g_{ap}$. For annular detectors, $g_{det}$ represents the inner collection detector radius, whereas for axial detectors, it represents the outer collection detector radius (see Figure 36). The objective aperture radius and the defocus are set to their optimal values of 0.28 Å$^{-1}$ and −80 Å, respectively. From this figure, the following conclusions may be derived:

• For an annular detector, the optimal detector radius equals the optimal aperture radius.
• For an axial detector, the optimal detector radius is slightly smaller than the optimal aperture radius.
• An annular detector results in higher precisions than an axial detector when operating at the optimal conditions.
Figure 39. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the detector-to-aperture radius for an annular and an axial detector. The objective aperture radius and the defocus are set to their optimal values of 0.28 Å$^{-1}$ and −80 Å, respectively.

It should be mentioned that in Figure 39, the size of the optimal detector radius of the annular detector is of the same order as the size of the aperture radius. For such detector radii, thermal diffuse scattering is unimportant. As mentioned earlier in this section, thermal diffuse scattering is not included in the expectation model given in Section V.B. The reader may wonder if the precision would be higher by using a large detector radius so that thermal diffuse scattering is dominant. This is not to be expected, since the precision in terms of the lower bound on the standard deviation is inversely proportional to the square root of the total number of detected electrons, that is, the signal-to-noise ratio, which in its turn is inversely proportional to the detector radius. It is unlikely that the decrease of the total number of detected electrons by using a large detector radius may be compensated by the fact that only thermal diffuse scattered electrons are detected. Furthermore, it should be mentioned that the conclusions obtained from Figure 39 are not subject to the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. In Figure 39, the recording time as well as the number of incident electrons per square Å are fixed. The optimal detector settings do not change if, for example, longer recording times or more incident electrons per square Å would be allowed. For different values of the recording time or the number of incident electrons per square Å, only the actual values for the standard deviation shown in Figure 39 would be different, whereas the optimal detector settings would be the same.
As mentioned earlier, the detector radius is usually taken much smaller or larger than the objective aperture radius for an axial or annular detector, respectively, thus aiming at optimal direct visual interpretability. However, this is typically not found if the attainable precision is used as optimality criterion. Then, the optimal detector radius is almost equal to the aperture radius. This has to do with the fact that the signal-to-noise ratio decreases with decreasing or increasing radius of axial or annular detectors, respectively. The finding that the optimal detector radius of an annular detector equals the optimal aperture radius is in agreement with the result found in (Rose, 1975). In that paper, the annular detector was optimized in terms of signal-to-noise ratio. Thus far, however, this guideline is usually not followed in practice since direct visual interpretability seems to be preferred over precision, even if this visual interpretability is accompanied by a low signal-to-noise ratio.

v. Optimal Defocus Value. Subsequently, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius. In Figure 40, the precision is evaluated for a silicon [100] atom column as a function of the defocus $\varepsilon$ and the spherical aberration constant $C_s$ for a given electron wavelength $\lambda$ and for the objective aperture radius $g_{ap}$ and detector radius $g_{det}$ adjusted to their optimal values, both corresponding to 0.28 Å$^{-1}$. This evaluation is done for an annular and an axial detector. The solid white curves shown in Figure 40 are described by the relation

$$ \varepsilon = -\frac{1}{2} C_s \lambda^2 g_{ap}^2. \tag{160} $$
The dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants. From the comparison of the solid and dotted white curves, it follows that the defocus value as described by Eq. (160) is close to the optimal defocus value in terms of precision. Moreover, for a given spherical aberration constant, the precision that is gained by operating at the corresponding optimal defocus instead of at the defocus given by Eq. (160) is hardly significant. Therefore, the optimal defocus value, as a function of the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius, is approximately given by the empirical relation as described by Eq. (160). At this defocus value, the transfer function is flattened in the sense that it is nearly equal to one over the whole angular range of the objective aperture. The optimal transfer function for a silicon [100] atom column and for a spherical aberration constant equal to 0.5 mm
Figure 40. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the spherical aberration constant and the defocus. The left- and right-hand figures represent the results for an annular and axial detector, respectively. The objective aperture radius and detector radius are adjusted to their optimal values, both corresponding to 0.28 Å⁻¹. The solid white curves are described by Eq. (160) and the dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants.
is presented in Figure 41, where the arrow represents the optimal objective aperture radius. Equation (160) is derived from Eq. (130) by setting the phase shift $\chi(g)$ exactly to zero for $g = g_{ap}$, with $g_{ap}$ the optimal objective aperture radius. These findings do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. From now on, the defocus will be adjusted to the value given by Eq. (160). From the comparison of the optimal defocus in terms of direct visual interpretability as given by Eq. (156) with the optimal defocus in terms of precision as given by Eq. (160), it follows that their relation to the objective aperture radius is equal for both optimality criteria. Nevertheless, the explicit numbers for the defocus are different since the optimal aperture radius is different for both optimality criteria.
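As a quick numerical check, Eq. (160) can be evaluated for the settings quoted for Figure 41. The sketch below is illustrative only: the function names are hypothetical, and the expression used for the phase shift is an assumed, commonly used form of the aberration function, $\chi(g) = \pi \varepsilon \lambda g^2 - \tfrac{1}{2} \pi C_s \lambda^3 g^4$, since Eq. (130) is not reproduced in this section and its sign convention may differ.

```python
import math

def optimal_defocus(cs, wavelength, g_ap):
    """Defocus of Eq. (160): 0.5 * Cs * lambda^2 * g_ap^2 (lengths in angstrom)."""
    return 0.5 * cs * wavelength ** 2 * g_ap ** 2

def phase_shift(g, defocus, cs, wavelength):
    """ASSUMED form of the aberration phase shift chi(g); conventions vary between texts."""
    return (math.pi * defocus * wavelength * g ** 2
            - 0.5 * math.pi * cs * wavelength ** 3 * g ** 4)

CS = 0.5e7          # spherical aberration constant: 0.5 mm expressed in angstrom
WAVELENGTH = 0.02   # electron wavelength in angstrom
G_AP = 0.28         # objective aperture radius in 1/angstrom

eps = optimal_defocus(CS, WAVELENGTH, G_AP)

# Largest phase shift on a grid of spatial frequencies inside the aperture.
grid = [G_AP * i / 200 for i in range(201)]
chi_max = max(abs(phase_shift(g, eps, CS, WAVELENGTH)) for g in grid)

print(f"defocus from Eq. (160): {eps:.1f} angstrom")
print(f"largest |chi| inside the aperture: {chi_max:.3f} rad")
```

With these values the defocus evaluates to about 78 Å, consistent with the rounded value of 80 Å quoted for Figure 41, and the assumed phase shift stays below roughly 0.1 rad over the whole aperture, which is what is meant by a flattened transfer function.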
Figure 41. Transfer function for a spherical aberration constant of 0.5 mm, an electron wavelength of 0.02 Å, and a defocus value of 80 Å. The arrow represents an objective aperture radius of 0.28 Å⁻¹.
vi. Optimal Spherical Aberration Constant. Today, the use of the spherical aberration corrector is regarded as the most promising way to improve the visual interpretability of STEM images. The aim of this corrector is to obtain sub-ångström resolution (Batson, Dellby, and Krivanek, 2002). In the present section, the potential merit of spherical aberration correctors is studied for quantitative instead of qualitative STEM applications. Therefore, the precision is evaluated as a function of the spherical aberration constant. In Figures 42 and 43, the ratio of the precision for a given spherical aberration constant to the precision for a spherical aberration constant of 0 mm is shown as a function of the spherical aberration constant for an isolated silicon and gold [100] atom column, respectively. This evaluation is done for an annular as well as an axial detector. For each considered spherical aberration constant, the objective aperture radius is set to its optimal value. Furthermore, the detector radius is taken equal to this optimal objective aperture radius. For silicon, which is a light atom column, it follows from Figure 42 that the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.0009 and 1.0011 for an annular and axial detector, respectively. These gains are negligible. For gold, which is a heavy atom column, it follows from Figure 43 that the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.21 and 1.39 for an annular and axial detector, respectively. From the numerical values given above, it follows that correction of the spherical aberration is more useful in terms of precision for heavy than for
Figure 42. Ratio of lower bounds on the standard deviation of the position coordinates, defined as $\sigma_{\beta_x} / \sigma_{\beta_x}(C_s = 0\ \mathrm{mm})$, of an isolated silicon [100] atom column as a function of the spherical aberration constant for an annular as well as for an axial detector.
Figure 43. Ratio of lower bounds on the standard deviation of the position coordinates, defined as $\sigma_{\beta_x} / \sigma_{\beta_x}(C_s = 0\ \mathrm{mm})$, of an isolated gold [100] atom column as a function of the spherical aberration constant for an annular as well as for an axial detector.
light atom columns. These results may be explained by the fact that the optimal aperture setting is strongly dependent on the atom column type as shown earlier. The optimal aperture radius for a gold [100] atom column is much larger than for a silicon [100] atom column. Because spherical
aberration is observable only for non-paraxial rays, correction is only necessary for objective lenses working with larger apertures. Notice, however, that from Figures 42 and 43, it follows that the accompanying gain in precision is only marginal, or even negligible for light atom columns. The finding that, mathematically, the optimal spherical aberration constant in terms of precision is equal to 0 mm agrees with the optimal value in terms of direct visual interpretability.

vii. Optimal Reduced Brightness of the Electron Source. Next, the effect of the reduced brightness $B_r$ of the electron source on the precision with which the position coordinates of an atom column can be measured is studied. Using Eqs. (151), (158), and (159), it follows that the precision, represented by the lower bound on the standard deviation of the position coordinates, is inversely proportional to the square root of the product of the probe current $I$ and the recording time $t$ for one pixel. Furthermore, it follows from Eq. (149) that $I$ is directly proportional to the reduced brightness of the electron source $B_r$. Therefore, just as for CTEM (see Section IV.C.2), new developments in producing electron sources with higher reduced brightness (de Jonge, Lamy, Schoots, and Oosterkamp, 2002; van Veen, Hagen, Barth, and Kruit, 2001) are advantageous in terms of precision. For example, if the reduced brightness is increased by a factor of 10, the lower bound on the standard deviation of the position coordinates decreases by a factor of $\sqrt{10}$. Hence, on the one hand, if the experiment is limited by specimen drift, the optimal reduced brightness is preferably as high as possible, that is, as high as physical limitations to the production of electron sources with higher reduced brightness allow. The dominant limitation is determined by the statistical Coulomb interactions (Kruit and Jansen, 1997; van Veen, Hagen, Barth, and Kruit, 2001).
On the other hand, if the experiment is limited by the radiation sensitivity of the object, the reduced brightness has to be kept subcritical or an increase of the reduced brightness $B_r$ has to be compensated by a decrease of the recording time $t$, so as to keep the number of incident electrons per square Å within the constraints. Finally, a remark about the recording time needs to be made. If the experiment is limited by specimen drift, the recording time is kept within the constraints in this study. The amount of specimen drift is determined by mechanical instabilities of the specimen holder. Hence, just as for CTEM (see Section IV.C.2), new developments providing more stable specimen holders would allow microscopists to increase the recording time. This has a favorable effect on the precision since, as mentioned above, the lower bound on the standard deviation of the position coordinates is inversely proportional to the square root of the recording time $t$ for one pixel, which in turn is directly proportional to the recording time for the whole FOV.
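The scaling argument above can be made concrete with a short sketch. It assumes only what the text states, namely that the lower bound on the standard deviation is inversely proportional to $\sqrt{I t}$ and that $I$ is proportional to $B_r$, with all other settings fixed; the function name and its arguments are hypothetical.

```python
import math

def relative_std(brightness_gain=1.0, time_gain=1.0):
    """Factor by which the lower bound on the standard deviation changes.

    Assumes sigma ~ 1 / sqrt(I * t) with the probe current I proportional
    to the reduced brightness; every other setting is kept fixed.
    """
    return 1.0 / math.sqrt(brightness_gain * time_gain)

# Drift-limited case: ten times brighter source, same recording time.
drift_limited = relative_std(brightness_gain=10.0)

# Dose-limited case: the brightness gain is compensated by a shorter
# recording time, so the number of incident electrons is unchanged.
dose_limited = relative_std(brightness_gain=10.0, time_gain=0.1)

print(drift_limited)  # about 0.316, i.e. smaller by a factor sqrt(10)
print(dose_limited)   # approximately 1, i.e. no gain in precision
```

The second case illustrates why a brighter source brings no gain in precision when the radiation sensitivity of the object is the limiting constraint.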
It could be mentioned that for STEM, specimen drift appears in a different manner than for CTEM. For CTEM, drift blurs the image, whereas for STEM, it distorts the image (Pennycook and Yan, 2001).

viii. Comparison with Conventional Approach. Tables 18 and 19 compare the optimal microscope settings in terms of precision with the conventional settings that are optimal in terms of direct visual interpretability. This is done for an isolated strontium [100] atom column using an annular and axial detector, respectively. The objective aperture radius and defocus corresponding to optimal visual interpretability are given by Eq. (156). Furthermore, in order to meet the conditions for direct visual interpretability more or less, the detector radius is taken two times larger or smaller than the objective aperture radius for an annular or axial detector, respectively. The spherical aberration constant is set to 0.5 mm. The other microscope settings and structure parameters are given in Tables 15 to 17.

TABLE 18
Comparison between the Optimal Microscope Settings in Terms of Precision and in Terms of Direct Visual Interpretability for a Strontium [100] Atom Column Using an Annular Detector and $C_s$ = 0.5 mm.

                              Optimality criterion
Microscope setting            Precision     Visual interpretability
$\varepsilon$ (Å)             80            316
$g_{ap}$ (Å⁻¹)                0.40          0.56
$g_{det}$ (Å⁻¹)               0.40          1.12
$\sigma_{\beta_x}$ (Å)        0.04          0.34

TABLE 19
Comparison between the Optimal Microscope Settings in Terms of Precision and in Terms of Direct Visual Interpretability for a Strontium [100] Atom Column Using an Axial Detector and $C_s$ = 0.5 mm.

                              Optimality criterion
Microscope setting            Precision     Visual interpretability
$\varepsilon$ (Å)             80            316
$g_{ap}$ (Å⁻¹)                0.40          0.56
$g_{det}$ (Å⁻¹)               0.35          0.28
$\sigma_{\beta_x}$ (Å)        0.10          0.18

From the bottom rows of Tables 18 and 19, it follows that the attainable precision $\sigma_{\beta_x}$ is improved by a factor of 8.5 and 1.8 at the optimal
microscope settings in terms of precision instead of at those in terms of visual interpretability for an annular and axial detector, respectively. Furthermore, the gain in precision by using an annular detector instead of an axial detector, under optimal conditions in terms of precision, is a factor of 2.5.

ix. Summary. The optimal STEM microscope settings in terms of the attainable precision for isolated atom columns are summarized here:

• The optimal aperture radius is mainly determined by the atom column type. It is proportional to the width of the Fourier transform of the 1s-state of the column under study. This means that the optimal aperture radius is larger for heavy than for light columns.
• The optimal inner radius of an annular detector equals the optimal aperture radius.
• The optimal outer radius of an axial detector is slightly smaller than the optimal aperture radius.
• An annular detector results in a higher precision than an axial detector and is therefore preferred.
• The optimal defocus value is approximately given by Eq. (160). It is determined by the optimal aperture radius, the spherical aberration constant, and the electron wavelength.
• Strictly speaking, the optimal spherical aberration constant is equal to 0 mm. The precision that is gained by reducing spherical aberration depends on the column type. This gain is usually only marginal.
• The reduced brightness of the electron source is preferably as high as possible if the experiment is limited by the specimen drift.
• Improvements of the mechanical stability of the specimen holder, providing longer recording times, are beneficial in terms of precision, especially if the experiment is limited by the specimen drift.
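The gain factors quoted in connection with Tables 18 and 19 follow directly from the bottom rows of those tables. As a small worked check (the dictionary layout and variable names are illustrative only):

```python
# Lower bounds on the standard deviation (angstrom), bottom rows of Tables 18 and 19.
sigma = {
    ("annular", "precision"): 0.04,
    ("annular", "visual"): 0.34,
    ("axial", "precision"): 0.10,
    ("axial", "visual"): 0.18,
}

# Gain from using the precision-optimal instead of the conventional settings.
gain_annular = sigma[("annular", "visual")] / sigma[("annular", "precision")]
gain_axial = sigma[("axial", "visual")] / sigma[("axial", "precision")]

# Gain from an annular instead of an axial detector, both at precision-optimal settings.
gain_detector = sigma[("axial", "precision")] / sigma[("annular", "precision")]

print(round(gain_annular, 1), round(gain_axial, 1), round(gain_detector, 1))
```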
c. Neighboring Atom Columns. In the previous section, the optimal experimental STEM design was described for isolated atom columns. The optimality criterion was the attainable precision with which the position of an isolated atom column can be estimated. This choice is justified as long as neighboring atom columns are clearly separated in the image. Then, the attainable precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. However, in the previous section, images of silicon [100] atom columns of a crystal taken at the optimal settings for isolated atom columns may show strong overlap. The attainable precision is then affected unfavorably by the presence of neighboring columns. In this section, it will be investigated if the optimal microscope settings change in the presence of neighboring atom columns. This will be done for silicon [100] and gold [100] crystals.
i. Structure Parameters. The two-dimensional projected structure of the objects under study, namely silicon [100] and gold [100] crystals, is modelled as a lattice consisting of 5 × 5 projected atom columns at the positions

\[ \beta_n = (\beta_{x_n} \;\; \beta_{y_n})^T = (n_x d \;\; n_y d)^T , \tag{161} \]

with indices $n = (n_x, n_y)$, $n_x = -2, \ldots, 2$, $n_y = -2, \ldots, 2$, and $d$ the distance between an atom column and its nearest neighbor. The values of the distance $d$ for both a silicon [100] and a gold [100] crystal (International Centre for Diffraction Data, 2001) and for the object thickness are given in Table 20. The object thickness is equal to half the extinction distance, which is given by Eq. (133).
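The lattice of Eq. (161) is easy to generate explicitly. The following sketch builds the 25 projected column positions for the two nearest-neighbor distances listed in Table 20; the function name and the dictionary representation are illustrative choices, not part of the original analysis.

```python
def lattice_positions(d, half_width=2):
    """Column positions (n_x * d, n_y * d) for n_x, n_y = -half_width..half_width, cf. Eq. (161)."""
    indices = range(-half_width, half_width + 1)
    return {(nx, ny): (nx * d, ny * d) for nx in indices for ny in indices}

si = lattice_positions(1.92)  # silicon [100], d in angstrom
au = lattice_positions(2.04)  # gold [100], d in angstrom

print(len(si))      # 25 columns, hence 50 position coordinates
print(si[(0, 0)])   # the central column sits at the origin
print(si[(2, -1)])  # a column two spacings to the right and one down
```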
ii. Optimality Criterion. The optimal statistical experimental design is given by the microscope settings resulting in the highest attainable precision with which the position coordinate $\beta_{x_n}$ of the central atom column of the lattice consisting of 5 × 5 atom columns can be estimated. This column corresponds to the index $n = (0, 0)$. The attainable precision (in terms of the variance) is represented by the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB. An expression for this element may be derived as follows. First, the Fisher information matrix associated with the total set of 50 position coordinates $\beta_{x_n}$ and $\beta_{y_n}$ is computed. This is a 50 × 50 matrix. The expression for the elements $F_{rs}$ of the Fisher information matrix is given by Eq. (158). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. Next, the CRLB is computed by inverting the Fisher information matrix. Finally, the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB, corresponding to the position coordinate $\beta_{x_n}$ of the central atom column of the lattice, represents the attainable precision. In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_{x_n}}$, that is, the square root of $\sigma^2_{\beta_{x_n}}$. It will be used as optimality criterion for the evaluation and optimization of the experimental design.

TABLE 20
Structure Parameters of Neighboring Atom Columns

                          Column type
Structure parameter       Si [100]                      Au [100]
$d$ (Å)                   1.92                          2.04
$z$ (Å)                   $E_0 \lambda / E_{1s,n}$      $E_0 \lambda / E_{1s,n}$
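The procedure described above (compute the Fisher information matrix, invert it, and take the diagonal element belonging to the central column) can be sketched as follows. This is only an illustration under stated assumptions: the expectation model is a constant background plus one Gaussian peak per column, which is a stand-in for the channelling-based STEM model of Section V.B; the Fisher matrix uses the standard expression for independent Poisson counts rather than Eq. (158) itself, which is not reproduced here; and the values of `amp`, `rho`, `bg`, `half_fov`, and `pixel` are arbitrary. All function names are hypothetical.

```python
import math

def expectation_and_gradient(x, y, columns, amp, rho, bg):
    """Expected count in pixel (x, y) and its derivatives w.r.t. all column positions.

    ASSUMED model: constant background plus one Gaussian peak per column.
    """
    lam, grad = bg, []
    for bx, by in columns:
        peak = amp * math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2.0 * rho ** 2))
        lam += peak
        grad.append(peak * (x - bx) / rho ** 2)  # derivative w.r.t. beta_x
        grad.append(peak * (y - by) / rho ** 2)  # derivative w.r.t. beta_y
    return lam, grad

def fisher_matrix(columns, amp, rho, bg, half_fov, pixel):
    """Poisson Fisher information: F_rs = sum_k (1/lam_k) dlam_k/dr dlam_k/ds."""
    n_par = 2 * len(columns)
    fisher = [[0.0] * n_par for _ in range(n_par)]
    n_pix = int(round(2 * half_fov / pixel)) + 1
    coords = [-half_fov + i * pixel for i in range(n_pix)]
    for x in coords:
        for y in coords:
            lam, grad = expectation_and_gradient(x, y, columns, amp, rho, bg)
            for r in range(n_par):
                weight = grad[r] / lam
                row = fisher[r]
                for s in range(r, n_par):  # upper triangle only
                    row[s] += weight * grad[s]
    for r in range(n_par):                 # fill the lower triangle by symmetry
        for s in range(r):
            fisher[r][s] = fisher[s][r]
    return fisher

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python, small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sigma_beta_x(columns, index, **model):
    """Lower bound on the standard deviation of beta_x of the column with the given index."""
    crlb = invert(fisher_matrix(columns, **model))
    return math.sqrt(crlb[2 * index][2 * index])

D = 1.92  # nearest-neighbor distance for Si [100] in angstrom (Table 20)
lattice = [(nx * D, ny * D) for nx in range(-2, 3) for ny in range(-2, 3)]
model = dict(amp=20.0, rho=0.5, bg=1.0, half_fov=5.6, pixel=0.4)  # arbitrary values

sigma_lattice = sigma_beta_x(lattice, lattice.index((0.0, 0.0)), **model)
sigma_isolated = sigma_beta_x([(0.0, 0.0)], 0, **model)
print(f"central column in lattice: {sigma_lattice:.4f} angstrom")
print(f"isolated column:           {sigma_isolated:.4f} angstrom")
```

In this toy model the bound for the central column of the lattice cannot be smaller than the bound for the same column in isolation: neighboring peaks add counts, and hence noise, without adding information about the central position, and the neighbor positions act as additional unknown parameters.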
Alternatively, one could choose the lower bound on the standard deviation $\sigma_{\beta_{y_n}}$ of the position coordinate $\beta_{y_n}$ of the central atom column since $\sigma_{\beta_{x_n}}$ and $\sigma_{\beta_{y_n}}$ are equal to one another. The reason for this is that, for the structure of the objects under study, rotation of the expectation model over an angle of 90 degrees carries the expectation model into itself. Moreover, the central atom column is preferred rather than one of the other 24 atom columns since this column is most affected by the presence of neighboring columns. As mentioned in Section II.C.2, the chosen criterion may be regarded as a partial or truncated optimality criterion.

iii. Optimal Microscope Settings. First, in accordance with the optimization of the design for isolated atom columns, it will be assumed that the diameter of the source image is given by Eq. (157). Meeting this assumption, the precision has been evaluated as a function of the microscope settings for both a gold [100] and a silicon [100] crystal. From this evaluation, it is investigated if the optimal settings for isolated atom columns as summarized earlier are still optimal in the presence of neighboring columns. For a gold [100] crystal, the optimal settings are reasonably well described by those for isolated atom columns. The reason for this is that neighboring atom columns do not show strong overlap in the images taken at these settings. One of the minor changes in the optimal design that has been observed is the following one:

• The optimal objective aperture radius increases by about 10%. For example, for an annular detector, it increases from 0.50 Å⁻¹ to 0.55 Å⁻¹ at a spherical aberration constant of 0.5 mm and from 0.75 Å⁻¹ to 0.85 Å⁻¹ at a spherical aberration constant of 0 mm.
For a silicon [100] crystal, changes in the optimal settings compared to those for isolated atom columns are more pronounced than for gold. The reason for this is that neighboring columns show strong overlap in the images taken at the settings which are optimal for isolated columns. The most important changes are the following ones:

• The optimal objective aperture radius increases substantially. For example, for an annular detector it increases from 0.28 Å⁻¹ to 0.50 Å⁻¹. For an axial detector, it increases from 0.28 Å⁻¹ to 0.65 Å⁻¹ at a spherical aberration constant of 0.5 mm and even from 0.28 Å⁻¹ to 0.95 Å⁻¹ at a spherical aberration constant of 0 mm.
• The optimal outer detector radius of an axial detector is considerably smaller than the optimal aperture radius. It is found to be equal to 0.25 Å⁻¹ for both a spherical aberration constant of 0 mm and 0.5 mm.
• For a low value of the spherical aberration constant, an axial detector may result in a higher attainable precision than an annular detector.
The latter two findings are illustrated in Figure 44, where the precision is evaluated as a function of the detector-to-aperture radius for both an annular and an axial detector and a spherical aberration constant of 0 mm. The objective aperture radius is set to its optimal value, corresponding to 0.50 Å⁻¹ and 0.95 Å⁻¹ for an annular and axial detector, respectively. Furthermore, it is worth mentioning that at the optimal settings for a silicon crystal, neighboring columns are clearly separated in the image. Next, it is found that the precision that may be gained by correcting spherical aberration is larger for neighboring atom columns than for isolated atom columns.

Second, the diameter of the source image has been taken variable. In practice, this is possible by adjusting the settings of the condenser lenses, allowing the demagnification of the source to be continuously varied. It is well known that an increase of this diameter is accompanied by two side effects: a broadening of the source image and an increase of the probe current (Nellist and Pennycook, 2000). The former has an unfavorable effect on the precision while the latter has a favorable effect. Moreover, a decrease of the diameter of the source image is accompanied by the opposite side
Figure 44. The lower bound on the standard deviation of the position coordinates of the central atom column of the silicon [100] crystal under study as a function of the detector-to-aperture radius for an annular and an axial detector. The spherical aberration constant is set to 0 mm, the objective aperture radius is set to its optimal value corresponding to 0.50 Å⁻¹ and 0.95 Å⁻¹ for an annular and axial detector, respectively.
effects. The potential merit of a variable diameter of the source image is studied for experiments that are limited by specimen drift only and not for experiments that are limited by the radiation sensitivity of the object. The reason for this is that the latter constraint, which implies that the total number of incident electrons per square Å has to be kept constant, would lead to unrealistic microscope settings and recording times. This follows intuitively from Eqs. (148) and (149). The total number of incident electrons may be kept constant by keeping the product $d_{I50}^2 t$ constant. Hence, a decrease of the diameter $d_{I50}$ of the source image may be compensated by an increase of the recording time $t$ for one pixel. Narrowing the source image has a favorable effect on the precision. Although this leads to a decrease of the probe current as well, it has no effect on the total number of incident electrons if the recording time may be increased without limits. Hence, the ‘optimal’ diameter of the source image would be infinitely small and the accompanying recording time would be infinitely large. Such settings are unrealistic in practice. For the silicon and gold crystals under study, it has been found that the optimal diameter of the source image is of the same order of magnitude as the diameter of the diffraction-error disc as given by Eq. (157). This is illustrated for gold in Figure 45, where the precision is evaluated as a function of the ‘source image’-to-‘diffraction-error disc’ diameter for an
Figure 45. The lower bound on the standard deviation of the position coordinates of the central atom column of the gold [100] crystal under study as a function of the ‘source image’-to-‘diffraction-error disc’ diameter for an annular and an axial detector. The objective aperture and detector radius are set to 0.55 Å⁻¹, being optimal for a spherical aberration constant of 0.5 mm.
annular as well as for an axial detector. The diameter of the source image and the diffraction-error disc are determined by Eqs. (150) and (157), respectively. The objective aperture and detector radius are set to 0.55 Å⁻¹, being optimal for a spherical aberration constant of 0.5 mm. As follows from Figure 45, for this example, the optimal diameter of the source image is slightly smaller than the diameter of the diffraction-error disc. For other examples, it has been observed that the optimal diameter may equally well be larger, instead of smaller, than the diameter of the diffraction-error disc. For all examples that have been studied, the order of magnitude of this optimal diameter is approximately equal to the diameter of the diffraction-error disc. Furthermore, it has to be mentioned that a variable diameter of the source image has hardly any effect on the optimal values of the settings other than this diameter.

d. Attainability of the Cramér-Rao Lower Bound. Finally, it is investigated if there exists an estimator attaining the CRLB on the variance of the position coordinates and if this estimator may be considered unbiased. If so, this would justify the choice of the CRLB as optimality criterion used in this section. The procedure that is used to investigate the existence of an estimator attaining the CRLB is the same as the one used in Sections III.D and IV.C.2. Recall that one of the asymptotic properties of the maximum likelihood estimator is its normal distribution about the true parameters with a covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. Whether this asymptotic property still applies to STEM experiments, where the number of observations is finite, will be assessed by carrying out simulation experiments.
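A simulation experiment of this kind can be sketched in a few lines. The example below is not the STEM model of Section V.B: it assumes a one-dimensional scan across a single Gaussian peak on a constant background with Poisson-distributed counts, and all numerical values (`RHO`, `AMP`, `BG`, the pixel grid, the seed) are arbitrary. It draws 200 simulated data sets, computes the maximum likelihood estimate of the peak position for each, and returns the sample mean and variance together with the CRLB. All function names are hypothetical.

```python
import math
import random

RHO, AMP, BG = 0.5, 30.0, 0.5                 # assumed peak width (angstrom), height, background
PIXELS = [-3.0 + 0.2 * k for k in range(31)]  # one-dimensional scan positions (angstrom)

def expectation(x, beta):
    """ASSUMED expectation model: Gaussian peak at position beta on a constant background."""
    return BG + AMP * math.exp(-(x - beta) ** 2 / (2.0 * RHO ** 2))

def crlb(beta):
    """Cramer-Rao lower bound on the variance of beta for independent Poisson counts."""
    fisher = 0.0
    for x in PIXELS:
        lam = expectation(x, beta)
        dlam = (lam - BG) * (x - beta) / RHO ** 2
        fisher += dlam ** 2 / lam
    return 1.0 / fisher

def poisson(rng, lam):
    """Poisson variate by Knuth's multiplication method (adequate for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def log_likelihood(counts, beta):
    """Poisson log-likelihood up to a term that does not depend on beta."""
    total = 0.0
    for x, w in zip(PIXELS, counts):
        lam = expectation(x, beta)
        total += w * math.log(lam) - lam
    return total

def ml_estimate(counts, lo=-1.0, hi=1.0, steps=40):
    """Maximise the log-likelihood: coarse grid search, then golden-section refinement."""
    step = (hi - lo) / steps
    grid = [lo + step * i for i in range(steps + 1)]
    best = max(grid, key=lambda b: log_likelihood(counts, b))
    a, c = best - step, best + step
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(30):
        x1, x2 = c - ratio * (c - a), a + ratio * (c - a)
        if log_likelihood(counts, x1) < log_likelihood(counts, x2):
            a = x1   # the maximum lies to the right of x1
        else:
            c = x2   # the maximum lies to the left of x2
    return 0.5 * (a + c)

def run_simulation(n_trials=200, beta_true=0.0, seed=1):
    """Return sample mean, sample variance of the ML estimates, and the CRLB."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        counts = [poisson(rng, expectation(x, beta_true)) for x in PIXELS]
        estimates.append(ml_estimate(counts))
    mean = sum(estimates) / n_trials
    var = sum((e - mean) ** 2 for e in estimates) / (n_trials - 1)
    return mean, var, crlb(beta_true)

mean, var, bound = run_simulation()
print(f"mean = {mean:.4f}, variance = {var:.5f}, CRLB = {bound:.5f}")
```

If the estimator behaves as the asymptotic theory predicts, the returned mean should lie within a few standard errors of the true position and the ratio of the sample variance to the CRLB should be close to one.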
Therefore, 200 different STEM experiments made on an isolated strontium [100] atom column are simulated; the observations are modelled using the parametric statistical model described in Section V.B. The objective aperture radius and defocus are set to the optimal values of 0.4 Å⁻¹ and 160 Å, respectively. An annular detector is used with detector radius equal to the optimal aperture radius. Next, the position coordinates $\beta_x$ and $\beta_y$ of the atom column are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression given by the right-hand member of Eq. (159). The results are presented in Table 21. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does
TABLE 21
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 200 Maximum Likelihood Estimates of the Position Coordinates, Respectively.

                          True position coordinate (Å)    Estimated mean (Å)         Standard deviation of mean (Å)
$\beta_x$                 0                               0.002                      0.003
$\beta_y$                 0                               0.001                      0.003

                          Lower bound on variance (Å²)    Estimated variance (Å²)    Standard deviation of variance (Å²)
$\sigma^2_{\beta_x}$      0.0019                          0.0019                     0.002
$\sigma^2_{\beta_y}$      0.0019                          0.0022                     0.0002

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.
not attain the CRLB. Furthermore, the hypothesis that the estimates are normally distributed has been tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject this hypothesis. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.

3. Interpretation of the Results

In Section V.C.2, optimal STEM designs were obtained by numerically computing and evaluating the attainable precision as a function of the microscope settings. Numerical analysis is the only correct way to obtain the optimal design. However, in the present section, it will be shown that an intuitive interpretation may sometimes be given by means of the results of Section III, where rules of thumb were obtained for the attainable precision with which the position of one component or the distance between two components can be estimated from dark-field and bright-field imaging experiments. These rules of thumb are given by Eqs. (65)-(70). For dark-field imaging, they were derived for an expectation model of the observations consisting of Gaussian peaks located at each component. For bright-field imaging, they were derived for a model consisting of a constant background from which Gaussian peaks located at each component were subtracted. The rules of thumb show that the precision is a function of the width of the Gaussian peak and the total number of detected electrons. Generally, the precision improves by narrowing the Gaussian peak and by increasing the
total number of detected electrons. Furthermore, for neighboring components, the precision of the distance deteriorates if the distance is smaller than a typical value, which is proportional to the width of the peak. In other words, the precision deteriorates if neighboring components strongly overlap in the image. It may be shown that this does not only apply to the precision of the distance, but to the precision of the position as well. Empirically, it has been found that the obtained rules of thumb are generalizable to more complicated STEM expectation models than Gaussian peaks. Instead of the width of the Gaussian peak, one may consider the width of the corresponding non-Gaussian peak of the STEM expectation model. Then, the generalized rules of thumb express that the precision will improve by decreasing this width or by increasing the total number of detected electrons and that it will deteriorate if neighboring components strongly overlap in the image. The applicability of these rules of thumb to give an intuitive interpretation to the numerical results found in previous sections will now be demonstrated by means of two examples. The first example explains why the optimal probe is not as narrow as possible for dark-field STEM, using an annular detector with inner radius larger than or equal to the objective aperture radius. It is generally known that the size of a diffraction-limited probe decreases if the objective aperture radius increases. Consequently, the width of the non-Gaussian peak of the expectation model decreases. This follows from the expression for the dark-field image intensity distribution, given by Eq. (142). Therefore, an increase of the objective aperture radius has a favorable effect on the attainable precision. At this point, it should be realized, however, that below a certain probe size, the width of the peak is limited by the intrinsic width of the column-dependent 1s-state.
Today, the width of the probe is almost equal to the width of an atom, as follows from Batson, Dellby, and Krivanek (2002) and Krivanek, Dellby, and Nellist (2002). On the other hand, it has been found that the optimal design of an annular detector corresponds to an inner radius of the detector being equal to the optimal objective aperture radius. Thus, an increase of the objective aperture radius means an increase of the detector radius. The accompanying loss of the number of detected electrons has an unfavorable effect on the attainable precision. As a consequence, the optimal design balances the width of the probe and the number of detected electrons. This is illustrated in Figure 46, where intersections of the two-dimensional, radially symmetric expectation model of an isolated gold [100] column and a plane through its radial axis are shown. The solid curve corresponds to an objective aperture radius of 0.85 Å⁻¹, being optimal for a gold crystal. The dashed curve corresponds to a larger objective aperture radius of 1.5 Å⁻¹, resulting in a narrower probe. Furthermore, for both curves, the spherical aberration constant is set at
Figure 46. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated gold [100] atom column and a plane through its radial axis. An annular detector is chosen with inner radius equal to the objective aperture radius. The solid curve corresponds to the optimal settings; the dashed curve corresponds to non-optimal settings.
0 mm and an annular detector is used with inner radius equal to the objective aperture radius. Figure 46 clearly illustrates that the width of the peak of the expectation model is smaller if the probe is narrower, but also that the number of detected electrons is lower. This makes plausible that the optimal probe is not as narrow as possible. The second example explains why, for a silicon [100] crystal, the optimal objective aperture radius increases substantially and why the optimal outer radius of an axial detector is much smaller than the objective aperture radius as compared to the optimal settings for an isolated silicon column, as mentioned earlier. Figure 47 shows intersections of the two-dimensional, radially symmetric, bright-field expectation model of an isolated silicon [100] column and a plane through its radial axis. The solid curve corresponds to an objective aperture radius of 0.65 Å⁻¹ and an outer detector radius of 0.25 Å⁻¹, being optimal for a silicon [100] crystal. The dashed curve corresponds to an objective aperture radius of 0.28 Å⁻¹ and an outer detector radius of 0.22 Å⁻¹, being optimal for an isolated silicon [100] column. The dotted curve corresponds to an objective aperture radius of 0.65 Å⁻¹ and an outer detector radius of 0.55 Å⁻¹. Furthermore, for all
Figure 47. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated silicon [100] atom column and a plane through its radial axis. An axial detector is chosen. The solid curve corresponds to the optimal aperture and detector settings for a silicon crystal; the dashed curve corresponds to the optimal settings for a single, isolated silicon column; the dotted curve corresponds to non-optimal settings.
curves, the spherical aberration constant is set at 0.5 mm. From the comparison of the dashed curve with the solid and dotted curve, it follows that the width of the peak decreases by increasing the objective aperture radius, that is, by narrowing the probe. In the presence of neighboring silicon [100] columns, for which the distance between a column and its nearest neighbor is equal to 1.92 Å, this increase of the objective aperture radius prevents columns from strongly overlapping in the image. This has a favorable effect on the precision. Furthermore, from the comparison of the solid and dotted curve, it follows that by decreasing the detector radius, the contrast improves although at the expense of the number of detected electrons. This makes plausible that, for a silicon crystal, the optimal objective aperture radius increases and that the optimal outer radius of an axial detector is much smaller than the objective aperture radius as compared to the optimal settings for an isolated silicon column.

D. Conclusions

Conventionally, the design of a STEM experiment is optimized in terms of direct visual interpretability. However, quantitative STEM experiments aim at the highest precision with which the positions of the atom columns may be estimated. Since this is a different purpose, the design has been
reconsidered. The obvious optimality criterion is the attainable precision with which atom column positions can be estimated. An expression for the attainable statistical precision has been derived from a parametric statistical model of the observations. The expectations of the observations have been described by means of the channelling theory, whereas the fluctuations of the observations have been described by means of Poisson statistics. The obtained expression has been used to evaluate and optimize the design of quantitative STEM experiments. From this analysis, the following conclusions have been obtained:
• The optimal objective aperture radius is mainly determined by the object under study. For isolated atom columns, it has been found that it is proportional to the width of the Fourier transform of the 1s-state of the column under study. This means that the optimal aperture radius is larger for heavy than for light columns. Consequently, the probe is narrower for heavy than for light atom columns. However, in the presence of neighboring columns, it has been found that the optimal aperture radius may increase if the optimal aperture radius for isolated columns leads to strong overlap of neighboring columns in the image.
• The optimal inner radius of an annular detector equals the optimal aperture radius.
• For isolated atom columns, the optimal outer radius of an axial detector is slightly smaller than the optimal aperture radius. However, if this detector leads to very low contrast of the image in the presence of neighboring atom columns, the optimal detector radius decreases.
• Usually, an annular detector results in higher precisions than an axial detector and is therefore preferred, although there are exceptions.
• The optimal defocus value is approximately given by Eq. (160). It is determined by the optimal aperture radius, the spherical aberration constant, and the electron wavelength.
• Strictly speaking, the optimal spherical aberration constant is found to be equal to 0 mm. The precision that is gained by reducing spherical aberration depends on the object under study. Usually, this gain is only marginal.
• The reduced brightness of the electron source is preferably as high as possible if the experiment is limited by specimen drift.
• Improvements of the mechanical stability of the specimen holder, providing longer recording times, are beneficial in terms of precision, especially if the experiment is limited by specimen drift.
• The optimal width of the source image is of the same order of magnitude as the size of the diffraction-error disc if the experiment is limited by specimen drift.
VI. Discussion and Conclusions
A method has been proposed for optimizing the design of quantitative atomic resolution transmission electron microscopy experiments. The obvious optimality criterion has been shown to be the attainable precision with which structure parameters, the atom or atom column positions in particular, can be estimated. This precision can be adequately quantified in the form of the so-called CRLB, which is a lower bound on the variance of the parameter estimates. Minimization of the CRLB as a function of the microscope settings, under the existing physical constraints, results in the optimal statistical experimental design. The constraints are either the radiation sensitivity of the object or the specimen drift. Therefore, either the incident electron dose per square ångström (that is, the number of electrons per square ångström that interact with the object during the experiment) or the recording time is constrained in the optimization. The attainable precision with which position and distance parameters of one or two components can be estimated has been investigated. This has been done for two- and three-dimensional components. For two-dimensional components, the observations consist of counting events in a two-dimensional pixel array, whereas for three-dimensional components, they consist of counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. These examples may be regarded as simulations of high-resolution conventional or scanning transmission electron microscopy and electron tomography experiments, respectively. The model describing the expectations of the observations made on these components, the expectation model, has been assumed to consist of Gaussian peaks with unknown position. Under this assumption, the CRLB, which is usually calculated numerically, is given by a simple rule of thumb in closed analytical form.
Although the expectation models of images obtained in practice are usually of higher complexity than Gaussian peaks, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution transmission electron microscopy. For two- and three-dimensional, one and two component expectation models, the expressions show how the attainable precision depends on the width of the point spread function, the width of the components, and the number of detected counts. Furthermore, for two- and three-dimensional, two component expectation models, the attainable precision also depends on the distance between the components. Particularly for three-dimensional, two component expectation models, it is a function of the orientation of the components with respect to the rotation axis as well. Generally, the precision may be improved by increasing the number of
detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. Moreover, if a narrower point spread function is accompanied by a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. The following results have been derived from the minimization of the numerically calculated CRLB with respect to the microscope settings, assuming expectation models with a solid physical basis instead of Gaussian peaks. Using this procedure, the optimal statistical experimental designs of conventional and scanning transmission electron microscopy experiments have been derived. The obtained results may intuitively be interpreted using the rules of thumb for the CRLB, which have been obtained from Gaussian peaked expectation models. For conventional transmission electron microscopy, it has been shown that a spherical and chromatic aberration corrector may improve the attainable precision. Correction makes most sense for low accelerating voltages and for objects consisting of heavy atom columns. However, it should be mentioned that the optimal spherical aberration constant is different from 0 mm for thin objects. Furthermore, increasing the reduced brightness of the electron source or improving the mechanical stability of specimen holders may improve the attainable precision considerably. Moreover, particularly for electron microscopes operating at intermediate accelerating voltages of about 300 kV, it has been found that a monochromator usually deteriorates the attainable precision if the experiment is limited by specimen drift, whereas it slightly improves the precision if the experiment is limited by the radiation sensitivity of the object.
For electron microscopes operating at low accelerating voltages of about 50 kV, it has been shown that correction of the chromatic aberration by either a chromatic aberration corrector or a monochromator may improve the attainable precision significantly, although a chromatic aberration corrector would be preferred. For scanning transmission electron microscopy, it has been shown that the optimal probe is not the narrowest possible. The optimal objective aperture radius, which determines the size of a diffraction-limited probe, has been found to be mainly determined by the object under study. More specifically, for isolated atom columns, it is proportional to the potential depth of the atom column, that is, the difference between the maximum and minimum potential energy. However, in the presence of neighboring columns, it has been found that the optimal aperture radius may increase so as to avoid strong overlap of neighboring columns in the image. In the evaluation of the experimental design, annular and axial detector types have
been compared. Usually, an annular detector results in higher precisions than an axial one, but there are exceptions. The optimal inner radius of an annular detector has been found to be equal to the optimal objective aperture radius. The optimal outer radius of an axial detector is usually slightly smaller than the optimal aperture radius. However, if this detector leads to very low contrast of the image, the optimal detector radius decreases. Moreover, a spherical aberration corrector improves the precision. However, the accompanying gain, which depends on the object under study, may be disappointing. As for conventional transmission electron microscopy, the reduced brightness of the electron source is preferably as high as possible and the specimen holder as stable as possible, especially if the experiment is limited by specimen drift. In this article, statistical experimental design has been used to discover the theoretical limits to quantitative atomic resolution transmission electron microscopy. These limits have been shown to be determined by the highest attainable precision with which atom or atom column positions can be estimated. Statistical experimental design allows one to find the microscope settings resulting in the highest attainable precision. Most important is that it provides the electron microscopist with insight into which precision may be obtained at which microscope settings. Thus, it shows the possible benefit of the optimal settings compared to the usual settings. Then, the electron microscopist may decide if it is advantageous to modify these usual settings.
Acknowledgments
The research of Dr. A. J. den Dekker has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences. S. van Aert is a Postdoctoral Fellow of the F.W.O. (Fund for Scientific Research, Flanders, Belgium).
Appendix A
In this appendix, the approximations described by Eqs. (65), (66), and (67) of Section III for the highest attainable precision with which the position coordinates of one isolated component or the distance between two components of a two-dimensional object can be estimated from a dark-field imaging experiment are derived. First, Eqs. (65) and (67), which describe the CRLB on the variance of the position coordinate of one isolated component and on the variance of the
distance between two non-overlapping components, respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which is approximated by the right-hand members of Eqs. (45) and (47) for one and two components, respectively, are approximately equal to zero. The reason for this is that the components do not overlap and that the image intensity distribution of each component has rotational symmetry. Moreover, the diagonal elements are nonzero and equal to one another. For example, the first diagonal element of $F$ associated with the position coordinate $\beta_{x1}$ is calculated explicitly as follows. From Eq. (12), it follows that:
$$F_{11} = \sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\,\frac{\partial \lambda_{kl}}{\partial \beta_{x1}}\,\frac{\partial \lambda_{kl}}{\partial \beta_{x1}}, \tag{162}$$
where $\lambda_{kl}$ is given by Eq. (32). In this expression, Eqs. (28), (29), and (31) are substituted. Moreover, since the components do not overlap, the denominator $\lambda_{kl}$ of Eq. (162) is approximated by $N_p\,G(x_k - \beta_{x1},\, y_l - \beta_{y1})\,\Delta x\,\Delta y$. In the thus obtained expression, the sums are approximated by integrals since $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak. This results in:
$$F_{11} \approx \frac{N_p}{\rho^2}. \tag{163}$$
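The approximation of Eq. (163) may be checked numerically by evaluating the double sum of Eq. (162) on a pixel grid. The following minimal sketch assumes that $G$ is a normalized two-dimensional Gaussian of width $\rho$ (an assumed form of Eq. (29)) and that the expectation for one isolated component is $\lambda_{kl} = N_p G\,\Delta x\,\Delta y$. The function names, the grid extent, and the numerical values are illustrative choices only.

```python
import math


def gaussian(x, y, rho):
    """Normalized 2D Gaussian of width rho (assumed form of the peak G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def fisher_f11_dark_field(n_p, rho, pixel, half_width):
    """Evaluate the double sum of Eq. (162) for one isolated component at the origin."""
    n = int(round(2.0 * half_width / pixel))
    f11 = 0.0
    for k in range(n):
        xk = -half_width + (k + 0.5) * pixel
        for l in range(n):
            yl = -half_width + (l + 0.5) * pixel
            lam = n_p * gaussian(xk, yl, rho) * pixel * pixel  # expected counts in pixel (k, l)
            if lam == 0.0:
                continue
            dlam = lam * xk / (rho * rho)  # derivative of lam with respect to beta_x1
            f11 += dlam * dlam / lam
    return f11


n_p, rho = 1.0e4, 0.5                      # hypothetical counts and peak width
f11_numeric = fisher_f11_dark_field(n_p, rho, pixel=0.1 * rho, half_width=8.0 * rho)
f11_rule = n_p / rho**2                    # right-hand side of Eq. (163)
crlb_std = math.sqrt(1.0 / f11_numeric)    # lower bound on the standard deviation of beta_x1
print(f11_numeric, f11_rule, crlb_std)
```

Under these assumptions the numerical sum and the closed form agree closely as long as the pixel size is small compared to $\rho$, and the reciprocal of $F_{11}$ gives the corresponding bound on the variance of the position coordinate when the non-diagonal elements vanish.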
An analogous reasoning results in the same approximation for the other diagonal element $F_{33}$ associated with the y-coordinates of the position of two components. Substitution of these approximations into Eqs. (46) and (51) results in Eqs. (65) and (67), respectively. Second, Eq. (66), which describes the CRLB on the variance of the distance between two overlapping components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$, are calculated explicitly. From Eq. (12), it follows that
$$F_{11} - F_{12} \approx \frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\left(\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}}\right)^{2}, \tag{164}$$
where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. In the following calculations, the differences between the coordinates of the two components and the sums of these coordinates are needed. In order to
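simplify the notation, the parameters $\alpha$ are introduced. The elements of the parameter vector $\alpha = (\alpha_1\ \alpha_2\ \alpha_3\ \alpha_4)^T$ are defined as:
$$\alpha_1 = \beta_{x1} - \beta_{x2}, \qquad \alpha_2 = \beta_{y1} - \beta_{y2}, \qquad \alpha_3 = \beta_{x1} + \beta_{x2}, \qquad \alpha_4 = \beta_{y1} + \beta_{y2}. \tag{165}$$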
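Using Eqs. (28), (31), (32), and (165), the derivatives $\partial\lambda_{kl}/\partial\beta_{x1}$ and $\partial\lambda_{kl}/\partial\beta_{x2}$ are written as:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx -\frac{\partial G\big(x_k - (\alpha_1 + \alpha_3)/2,\; y_l - (\alpha_2 + \alpha_4)/2\big)}{\partial x_k}\, N_p\,\Delta x\,\Delta y, \tag{166}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx -\frac{\partial G\big(x_k - (\alpha_3 - \alpha_1)/2,\; y_l - (\alpha_4 - \alpha_2)/2\big)}{\partial x_k}\, N_p\,\Delta x\,\Delta y. \tag{167}$$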
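Since the distance $d$ is assumed to be small compared to the width $\rho$ of the Gaussian peak, Eqs. (166) and (167) are Taylor expanded about $\alpha_1 = 0$ and $\alpha_2 = 0$ as follows:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx -N_p\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k} - \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} - \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right]\Delta x\,\Delta y, \tag{168}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx -N_p\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k} + \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} + \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right]\Delta x\,\Delta y. \tag{169}$$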
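Next, Eqs. (164), (168), and (169) are combined and, since the two components are assumed to overlap nearly completely, the denominator $\lambda_{kl}$ of Eq. (164) is approximated by $2N_p\,G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)\,\Delta x\,\Delta y$. This results in:
$$F_{11} - F_{12} \approx \frac{N_p\,\Delta x\,\Delta y}{4}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{\left[\alpha_1\,\dfrac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} + \alpha_2\,\dfrac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right]^{2}}{G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}. \tag{170}$$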
The next step is to substitute Eq. (29) into Eq. (170). Then, the sums may be approximated by integrals if it is assumed that $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak. Straightforward calculations result in:
$$F_{11} - F_{12} \approx \frac{N_p}{4\rho^4}\left(2\alpha_1^2 + \alpha_2^2\right). \tag{171}$$
An analogous reasoning yields:
$$F_{33} - F_{34} \approx \frac{N_p}{4\rho^4}\left(2\alpha_2^2 + \alpha_1^2\right). \tag{172}$$
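Next, it follows from Eq. (12) that
$$F_{13} - F_{14} \approx \frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\left(\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}}\right)\left(\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{y2}}\right), \tag{173}$$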
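where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. A similar derivation as that resulting in Eqs. (168) and (169) gives:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} \approx -N_p\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l} - \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k\,\partial y_l} - \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l^2}\right]\Delta x\,\Delta y, \tag{174}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \approx -N_p\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l} + \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k\,\partial y_l} + \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l^2}\right]\Delta x\,\Delta y. \tag{175}$$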
Next, Eqs. (29), (168), (169), and (173)–(175) are combined and the denominator $\lambda_{kl}$ of Eq. (173) is approximated by $2N_p\,G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)\,\Delta x\,\Delta y$ since the two components are assumed to overlap nearly completely. Moreover, if $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. This results in:
$$F_{13} - F_{14} \approx \frac{N_p\,\alpha_1\alpha_2}{4\rho^4}. \tag{176}$$
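The closed forms of Eqs. (171), (172), and (176) can be compared with a direct evaluation of the Fisher information matrix for two strongly overlapping peaks. The sketch below is illustrative only: it assumes a normalized Gaussian $G$ of width $\rho$, an expectation $\lambda_{kl} = N_p\,[G_1 + G_2]\,\Delta x\,\Delta y$ with equal counts $N_p$ per component, and Poisson-distributed pixel values. The helper names and the chosen separations are hypothetical.

```python
import math


def gaussian(x, y, rho):
    """Normalized 2D Gaussian of width rho (assumed form of the peak G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def fisher_two_peaks(n_p, rho, b1, b2, pixel, half_width):
    """Fisher matrix for (bx1, bx2, by1, by2) with Poisson-distributed pixel counts."""
    n = int(round(2.0 * half_width / pixel))
    area = pixel * pixel
    c = n_p * area / rho**2
    fisher = [[0.0] * 4 for _ in range(4)]
    for k in range(n):
        xk = -half_width + (k + 0.5) * pixel
        for l in range(n):
            yl = -half_width + (l + 0.5) * pixel
            g1 = gaussian(xk - b1[0], yl - b1[1], rho)
            g2 = gaussian(xk - b2[0], yl - b2[1], rho)
            lam = n_p * (g1 + g2) * area           # expected counts in pixel (k, l)
            if lam == 0.0:
                continue
            grad = (c * g1 * (xk - b1[0]),         # d lam / d bx1
                    c * g2 * (xk - b2[0]),         # d lam / d bx2
                    c * g1 * (yl - b1[1]),         # d lam / d by1
                    c * g2 * (yl - b2[1]))         # d lam / d by2
            for r in range(4):
                for s in range(4):
                    fisher[r][s] += grad[r] * grad[s] / lam
    return fisher


n_p, rho = 1.0e4, 1.0
a1, a2 = 0.06, 0.03                                # separations, small compared to rho
F = fisher_two_peaks(n_p, rho, (a1 / 2, a2 / 2), (-a1 / 2, -a2 / 2),
                     pixel=0.1 * rho, half_width=8.0 * rho)

numeric = {"F11-F12": F[0][0] - F[0][1],
           "F33-F34": F[2][2] - F[2][3],
           "F13-F14": F[0][2] - F[0][3]}
closed_form = {"F11-F12": n_p * (2 * a1**2 + a2**2) / (4 * rho**4),   # Eq. (171)
               "F33-F34": n_p * (2 * a2**2 + a1**2) / (4 * rho**4),   # Eq. (172)
               "F13-F14": n_p * a1 * a2 / (4 * rho**4)}               # Eq. (176)
for name in numeric:
    print(name, numeric[name], closed_form[name])
```

Because the closed forms follow from a first-order Taylor expansion, the agreement is expected to degrade as the separation approaches the peak width.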
Finally, substitution of Eqs. (171), (172), and (176) into Eq. (51) results in Eq. (66).
Appendix B
In this appendix, the approximations described by Eqs. (68), (69), and (70) of Section III for the highest attainable precision with which the position coordinates of one isolated component or the distance between two components of a two-dimensional object can be estimated from a bright-field imaging experiment are derived. First, Eqs. (68) and (70), which describe the CRLB on the variance of the position coordinate of one isolated component and on the variance of the distance between two non-overlapping components, respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which is approximated by the right-hand members of Eqs. (45) and (47) for one and two components, respectively, are approximately equal to zero. The reason for this is that the components do not overlap and that the image intensity distribution of each component has rotational symmetry. Moreover, the diagonal elements are nonzero and equal to one another. For example, the first diagonal element of $F$ is calculated explicitly as follows. From Eq. (12), it follows that:
$$F_{11} = \sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\,\frac{\partial \lambda_{kl}}{\partial \beta_{x1}}\,\frac{\partial \lambda_{kl}}{\partial \beta_{x1}}, \tag{177}$$
where $\lambda_{kl}$ is given by Eq. (35). In this expression, Eqs. (33) and (34) are substituted. Moreover, it is assumed that the number of interacting electrons is much smaller than the number of noninteracting electrons. Hence, the term $n_c\,\Omega\,g_{DF}(x, y; \beta)$ of Eq. (33) is assumed to be much smaller than the term 1. Therefore, the denominator $\lambda_{kl}$ of Eq. (177) may be approximated by $N\,\Delta x\,\Delta y/(\mathrm{FOV} - n_c\Omega)$. This results in:
$$F_{11} \approx \frac{N (n_c\Omega)^2}{\mathrm{FOV} - n_c\Omega}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{\partial g_{DF}(x_k, y_l; \beta)}{\partial \beta_{x1}}\,\frac{\partial g_{DF}(x_k, y_l; \beta)}{\partial \beta_{x1}}\,\Delta x\,\Delta y. \tag{178}$$
In the thus obtained expression, Eqs. (28) and (29) are substituted. Moreover, the factor $N(n_c\Omega)^2/(\mathrm{FOV} - n_c\Omega)$ is approximated by $N(n_c\Omega)^2/\mathrm{FOV}$ since the number of interacting electrons is much smaller than the number of
noninteracting electrons. If $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals, resulting in:
$$F_{11} \approx \frac{N\Omega^2}{8\pi\rho^4\,\mathrm{FOV}}. \tag{179}$$
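Eq. (179) can be checked in the same numerical manner as its dark-field counterpart. The sketch below assumes, as suggested by the approximations described above, a bright-field expectation of the form $\lambda_{kl} = N\,\Delta x\,\Delta y\,[1 - \Omega\,G(x_k - \beta_{x1}, y_l - \beta_{y1})]/(\mathrm{FOV} - \Omega)$ for a single component ($n_c = 1$) with a normalized Gaussian $G$. This form is an assumption about Eqs. (33)–(35), and all names and values are illustrative.

```python
import math


def bright_field_f11(n_total, omega, rho, fov_side, pixel):
    """Evaluate the sum of Eq. (177) for one component centred in a square field of view."""
    fov = fov_side * fov_side
    n = int(round(fov_side / pixel))
    scale = n_total * pixel * pixel / (fov - omega)    # prefactor N dx dy / (FOV - n_c*Omega)
    f11 = 0.0
    for k in range(n):
        xk = -0.5 * fov_side + (k + 0.5) * pixel
        for l in range(n):
            yl = -0.5 * fov_side + (l + 0.5) * pixel
            g = math.exp(-(xk * xk + yl * yl) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)
            lam = scale * (1.0 - omega * g)            # assumed bright-field expectation
            dlam = -scale * omega * g * xk / rho**2    # d lam / d beta_x1 at beta = 0
            f11 += dlam * dlam / lam
    return f11


# Hypothetical values chosen such that Omega*G << 1 and Omega << FOV.
n_total, omega, rho, fov_side = 1.0e6, 0.05, 1.0, 16.0
f11_numeric = bright_field_f11(n_total, omega, rho, fov_side, pixel=0.1)
f11_rule = n_total * omega**2 / (8.0 * math.pi * rho**4 * fov_side**2)   # Eq. (179)
print(f11_numeric, f11_rule)
```

The two numbers differ slightly because the exact sum retains the factors $1 - \Omega G$ and $\mathrm{FOV} - \Omega$ that the closed form neglects.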
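An analogous reasoning results in the same approximation for the other diagonal element $F_{33}$ associated with the y-coordinates of the position of two components. Substitution of these approximations into Eqs. (46) and (51) results in Eqs. (68) and (70), respectively. Second, Eq. (69), which describes the CRLB on the variance of the distance between two overlapping components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$, are calculated explicitly. From Eq. (12), it follows that
$$F_{11} - F_{12} \approx \frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\left(\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}}\right)^{2}, \tag{180}$$
where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. Using Eqs. (28), (33)–(35), and (165), the derivatives $\partial\lambda_{kl}/\partial\beta_{x1}$ and $\partial\lambda_{kl}/\partial\beta_{x2}$ are written as:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\,\frac{\partial G\big(x_k - (\alpha_1 + \alpha_3)/2,\; y_l - (\alpha_2 + \alpha_4)/2\big)}{\partial x_k}, \tag{181}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\,\frac{\partial G\big(x_k - (\alpha_3 - \alpha_1)/2,\; y_l - (\alpha_4 - \alpha_2)/2\big)}{\partial x_k}. \tag{182}$$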
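Since the distance $d$ is assumed to be small compared to the width $\rho$ of the Gaussian peak, Eqs. (181) and (182) are Taylor expanded about $\alpha_1 = 0$ and $\alpha_2 = 0$ as follows:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k} - \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} - \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right], \tag{183}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k} + \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} + \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right]. \tag{184}$$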
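Eqs. (180), (183), and (184) are combined and the denominator $\lambda_{kl}$ of Eq. (180) is approximated by $N\,\Delta x\,\Delta y/(\mathrm{FOV} - n_c\Omega)$ since $n_c\,\Omega\,g_{DF}(x, y; \beta)$ of Eq. (33) is assumed to be much smaller than the term 1. This results in:
$$F_{11} - F_{12} \approx \frac{N\Omega^2\,\Delta x\,\Delta y}{2(\mathrm{FOV} - n_c\Omega)}\sum_{k=1}^{K}\sum_{l=1}^{L}\left[\alpha_1\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k^2} + \alpha_2\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l\,\partial x_k}\right]^{2}. \tag{185}$$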
The next step is to substitute Eq. (29) into Eq. (185). Moreover, $N\Omega^2/(\mathrm{FOV} - n_c\Omega)$ is approximated by $N\Omega^2/\mathrm{FOV}$ since the number of interacting electrons is much smaller than the number of noninteracting electrons. Then, if it is assumed that $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. Straightforward calculations result in:
$$F_{11} - F_{12} \approx \frac{N\Omega^2}{32\pi\rho^6\,\mathrm{FOV}}\left(3\alpha_1^2 + \alpha_2^2\right). \tag{186}$$
An analogous reasoning yields:
$$F_{33} - F_{34} \approx \frac{N\Omega^2}{32\pi\rho^6\,\mathrm{FOV}}\left(3\alpha_2^2 + \alpha_1^2\right). \tag{187}$$
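Next, it follows from Eq. (12) that
$$F_{13} - F_{14} \approx \frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda_{kl}}\left(\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}}\right)\left(\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{y2}}\right), \tag{188}$$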
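where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. A similar derivation as that resulting in Eqs. (183) and (184) gives:
$$\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l} - \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k\,\partial y_l} - \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l^2}\right], \tag{189}$$
$$\frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \approx \frac{N\Omega\,\Delta x\,\Delta y}{\mathrm{FOV} - n_c\Omega}\left[\frac{\partial G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l} + \frac{\alpha_1}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial x_k\,\partial y_l} + \frac{\alpha_2}{2}\,\frac{\partial^2 G(x_k - \alpha_3/2,\, y_l - \alpha_4/2)}{\partial y_l^2}\right]. \tag{190}$$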
Next, Eqs. (29), (183), (184), and (188)–(190) are combined and the denominator $\lambda_{kl}$ of Eq. (188) is approximated by $N\,\Delta x\,\Delta y/(\mathrm{FOV} - n_c\Omega)$ since $n_c\,\Omega\,g_{DF}(x, y; \beta)$ of Eq. (33) is assumed to be much smaller than the term 1, which means that the number of interacting electrons is much smaller than the number of noninteracting electrons. Also, for the same reason, $N\Omega^2/(\mathrm{FOV} - n_c\Omega)$ is approximated by $N\Omega^2/\mathrm{FOV}$. Moreover, if $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. This results in:
$$F_{13} - F_{14} \approx \frac{N\Omega^2\,\alpha_1\alpha_2}{16\pi\rho^6\,\mathrm{FOV}}. \tag{191}$$
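The integrals behind the bright-field differences can be evaluated numerically. In the limit of small pixels, the sum of Eq. (185) becomes $N\Omega^2/(2\,\mathrm{FOV})$ times the integral of $[\alpha_1 G_{xx} + \alpha_2 G_{xy}]^2$, and the sum of Eq. (188) becomes the same factor times the integral of $[\alpha_1 G_{xx} + \alpha_2 G_{xy}][\alpha_1 G_{xy} + \alpha_2 G_{yy}]$, where the subscripts denote partial derivatives. The sketch below assumes a normalized Gaussian $G$ of width $\rho$. The function name and the grid parameters are illustrative.

```python
import math


def overlap_integrals(a1, a2, rho, pixel_per_rho=0.1, extent_in_rho=8.0):
    """Integrate products of second derivatives of a normalized 2D Gaussian G."""
    pixel = pixel_per_rho * rho
    half = extent_in_rho * rho
    n = int(round(2.0 * half / pixel))
    area = pixel * pixel
    i_xx = i_yy = i_xy = 0.0
    for k in range(n):
        x = -half + (k + 0.5) * pixel
        for l in range(n):
            y = -half + (l + 0.5) * pixel
            g = math.exp(-(x * x + y * y) / (2.0 * rho**2)) / (2.0 * math.pi * rho**2)
            g_xx = g * (x * x / rho**4 - 1.0 / rho**2)
            g_yy = g * (y * y / rho**4 - 1.0 / rho**2)
            g_xy = g * x * y / rho**4
            u = a1 * g_xx + a2 * g_xy      # derivative difference in the x-coordinates
            v = a1 * g_xy + a2 * g_yy      # derivative difference in the y-coordinates
            i_xx += u * u * area
            i_yy += v * v * area
            i_xy += u * v * area
    return i_xx, i_yy, i_xy


a1, a2, rho = 0.06, 0.03, 1.0
i_xx, i_yy, i_xy = overlap_integrals(a1, a2, rho)
expected_xx = (3 * a1**2 + a2**2) / (16 * math.pi * rho**6)
expected_yy = (3 * a2**2 + a1**2) / (16 * math.pi * rho**6)
expected_xy = a1 * a2 / (8 * math.pi * rho**6)
print(i_xx, expected_xx)
print(i_yy, expected_yy)
print(i_xy, expected_xy)

# Doubling rho should reduce every integral by a factor 2**6 = 64.
scaled = overlap_integrals(a1, a2, 2.0 * rho)
print(i_xy / scaled[2])
```

Multiplying the three integrals by $N\Omega^2/(2\,\mathrm{FOV})$ gives the approximations of $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$, all of which scale as $\rho^{-6}$.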
Finally, substitution of Eqs. (186), (187), and (191) into Eq. (51) results in Eq. (69).
Appendix C
In this appendix, the approximations described by Eqs. (73), (74), (80), and (81) of Section III for the highest attainable precision with which the position coordinates of one isolated component or the distance between two components of a three-dimensional object can be estimated from a dark-field imaging tomography experiment are derived. First, Eqs. (73)–(74) and (81), which describe the CRLB on the variance of the position coordinates of one isolated component and on the variance of the distance between two non-overlapping three-dimensional components,
respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which are given by Eqs. (71) and (75) for one and two components, respectively, are approximately equal to zero. The reason for this is that for most projections the components do not overlap and that the image intensity distribution of each projected component has rotational symmetry. Moreover, the diagonal elements are nonzero. For example, the first diagonal element of $F$ associated with the position coordinate $\beta_{x1}$ is calculated explicitly as follows. From Eq. (12), it follows that:
$$F_{11} = \sum_{j=1}^{J}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\,\frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x1}}\,\frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x1}}, \tag{192}$$
where $\lambda^{j}_{kl}$ is given by Eq. (41). According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (192) may be rewritten as:
$$F_{11} = \sum_{j=1}^{J}\cos^2\theta^{j}\left[\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\,\frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x1}}\,\frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x1}}\right]. \tag{193}$$
In fact, the sum between square brackets has already been calculated in Appendix A, where a comparable sum and its approximation are given by Eqs. (162) and (163), respectively. This result is incorporated in Eq. (193) as follows. On the one hand, it follows from Eqs. (39)–(41) that $\lambda^{j}_{kl}$ is equal to:
$$\lambda^{j}_{kl} = \frac{n_c N_p}{J}\,g_{DF}\big(x_k, y_l; \varepsilon^{j}\big)\,\Delta x\,\Delta y, \tag{194}$$
where $\varepsilon^{j} = (\beta^{j}_{x1} \ldots \beta^{j}_{xn_c}\ \beta^{j}_{y1} \ldots \beta^{j}_{yn_c})^T$ is the $2n_c$-dimensional parameter vector of projected position coordinates. On the other hand, it follows from Eqs. (31) and (32) that $\lambda_{kl}$, which was used for the calculation of Eq. (162), was equal to:
$$\lambda_{kl} = n_c N_p\,g_{DF}(x_k, y_l; \beta)\,\Delta x\,\Delta y, \tag{195}$$
where $\beta = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$ was the $2n_c$-dimensional parameter vector of the two-dimensional components. From the comparison of Eq. (195) with Eq. (194), it follows that $\lambda_{kl}$ is equal to $J$ times $\lambda^{j}_{kl}$ if $\beta$ of Eq. (195) is replaced by $\varepsilon^{j}$. Therefore, Eq. (162) is equal to $J$ times the sum between square brackets of Eq. (193) if $\beta$ of Eq. (162) is replaced by $\varepsilon^{j}$. The approximation of Eq. (162), which was given by Eq. (163), was only valid if the two-dimensional components did not overlap. Hence, for the three-dimensional object this result is only valid for those projections $j$ where the
projected components do not overlap. This condition is fulfilled for most projections, but for some projections, the projected components may overlap, even if the three-dimensional components do not overlap. However, since it will be assumed that the total number of projections $J$ is large, the contribution of projections in Eq. (193) for which the condition is not fulfilled will be neglected. Therefore, the sum between square brackets of Eq. (193) is replaced by $1/J$ times the approximation given by Eq. (163). Thus, Eq. (193) is approximately equal to:
$$F_{11} \approx \sum_{j=1}^{J}\frac{1}{J}\cos^2\theta^{j}\,\frac{N_p}{\rho^2}. \tag{196}$$
Next, it is assumed that the tilt angles $\theta^{j}$, $j = 1, \ldots, J$, are equidistantly located on the interval $(-\pi/2, \pi/2)$. Therefore, the difference $\Delta\theta$ between successive tilt angles is equal to:
$$\Delta\theta = \frac{\pi}{J}. \tag{197}$$
Thus, $1/J$ of Eq. (196) can be replaced by $\Delta\theta/\pi$. Furthermore, the difference $\Delta\theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$, or in other words, the total number of projections $J$ is assumed to be large. Therefore, the sum $\sum_j \Delta\theta$ may be approximated by the integral $\int d\theta$. Straightforward calculations yield:
$$F_{11} \approx \frac{N_p}{2\rho^2}. \tag{198}$$
By an analogous reasoning, it can be shown that the diagonal element of $F$ associated with the position coordinate $\beta_{y1}$, that is, $F_{22}$ of Eq. (71) or $F_{33}$ of Eq. (75), is approximately equal to:
$$\frac{N_p}{\rho^2}. \tag{199}$$
Moreover, the diagonal element of $F$ associated with the position coordinate $\beta_{z1}$, that is, $F_{33}$ of Eq. (71) or $F_{55}$ of Eq. (75), is approximated by:
$$\frac{N_p}{2\rho^2}. \tag{200}$$
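The factors 1/2, 1, and 1/2 in Eqs. (198)–(200) are tilt averages of $\cos^2\theta^{j}$, 1, and $\sin^2\theta^{j}$. The short sketch below reproduces them for a finite number of projections. It assumes that the projected x-coordinate follows $\beta^{j}_{x1} = \beta_{x1}\cos\theta^{j} + \beta_{z1}\sin\theta^{j}$ with the y-axis as rotation axis (the form of Eq. (38) inferred from the derivation) and that the tilt angles sit at the midpoints of $J$ equal subintervals of $(-\pi/2, \pi/2)$. Both the helper name and the numerical values are illustrative.

```python
import math


def tilt_average(weight, n_proj):
    """Average weight(theta_j) over n_proj equidistant tilt angles on (-pi/2, pi/2)."""
    d_theta = math.pi / n_proj                      # Eq. (197)
    thetas = [-0.5 * math.pi + (j + 0.5) * d_theta for j in range(n_proj)]
    return sum(weight(theta) for theta in thetas) / n_proj


n_p, rho, n_proj = 1.0e4, 0.5, 180                  # hypothetical values
per_projection = n_p / rho**2                       # Eq. (163), before the 1/J weighting

f_x = per_projection * tilt_average(lambda t: math.cos(t) ** 2, n_proj)   # Eq. (198)
f_y = per_projection * tilt_average(lambda t: 1.0, n_proj)                # Eq. (199)
f_z = per_projection * tilt_average(lambda t: math.sin(t) ** 2, n_proj)   # Eq. (200)
print(f_x, f_y, f_z)
```

In this picture the coordinate along the rotation axis is seen at full sensitivity in every projection, whereas the two coordinates perpendicular to it share the information, which is why their Fisher information is halved.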
Next, Eqs. (198)–(200) are substituted into Eqs. (72) and (79). Then, Eq. (72) produces Eqs. (73) and (74). Finally, the following notion is taken into account. The distance $d'$ between the components projected onto the (x, z)-plane is equal to:
$$d' = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{z1} - \beta_{z2})^2} = d\sin\varphi, \tag{201}$$
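where $\varphi$ is the angle between the rotation axis and the axis that connects the two components. The distance $d'$ and the angle $\varphi$ are shown in Figure 3. Taking account of Eq. (201), Eq. (79) results in Eq. (81). Second, Eq. (80), which describes the CRLB on the variance of the distance between two overlapping three-dimensional components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2}\ \beta_{z1}\ \beta_{z2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, $F_{55} - F_{56}$, $F_{13} - F_{14}$, $F_{15} - F_{16}$, and $F_{35} - F_{36}$, are calculated explicitly. From Eq. (12), it follows that
$$F_{11} - F_{12} \approx \frac{1}{2}\sum_{j=1}^{J}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x2}}\right)^{2}, \tag{202}$$
where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (202) may be rewritten as:
$$F_{11} - F_{12} \approx \sum_{j=1}^{J}\cos^2\theta^{j}\left[\frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x2}}\right)^{2}\right]. \tag{203}$$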
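Following the same reasoning as in the derivation of Eq. (196) from Eq. (193), it can be shown that the sum between square brackets of Eq. (203) may be directly derived from the result of Eq. (164) of Appendix A, which is given by Eq. (171). This results in:
$$F_{11} - F_{12} \approx \sum_{j=1}^{J}\frac{1}{J}\cos^2\theta^{j}\,\frac{N_p}{4\rho^4}\left[2\left(\beta^{j}_{x1} - \beta^{j}_{x2}\right)^2 + \left(\beta^{j}_{y1} - \beta^{j}_{y2}\right)^2\right]. \tag{204}$$
In order to simplify the notation, the parameters $\alpha$ are introduced. The elements of the parameter vector $\alpha = (\alpha_1\ \alpha_2\ \alpha_3\ \alpha_4\ \alpha_5\ \alpha_6)^T$ are defined as:
$$\begin{aligned}\alpha_1 &= \beta_{x1} - \beta_{x2}, & \alpha_2 &= \beta_{y1} - \beta_{y2}, & \alpha_3 &= \beta_{z1} - \beta_{z2},\\ \alpha_4 &= \beta_{x1} + \beta_{x2}, & \alpha_5 &= \beta_{y1} + \beta_{y2}, & \alpha_6 &= \beta_{z1} + \beta_{z2}.\end{aligned} \tag{205}$$
Next, Eqs. (38) and (205) are substituted into Eq. (204), resulting in:
$$F_{11} - F_{12} \approx \sum_{j=1}^{J}\frac{1}{J}\cos^2\theta^{j}\,\frac{N_p}{4\rho^4}\left[2\left(\alpha_1\cos\theta^{j} + \alpha_3\sin\theta^{j}\right)^2 + \alpha_2^2\right]. \tag{206}$$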
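In this expression, the sum is approximated by an integral since the difference $\Delta\theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$. This results in:
$$F_{11} - F_{12} \approx \frac{N_p}{8\rho^4}\left(\frac{3}{2}\alpha_1^2 + \alpha_2^2 + \frac{1}{2}\alpha_3^2\right). \tag{207}$$
Similar reasonings result in:
$$F_{33} - F_{34} \approx \frac{N_p}{4\rho^4}\left(\frac{1}{2}\alpha_1^2 + 2\alpha_2^2 + \frac{1}{2}\alpha_3^2\right) \tag{208}$$
and
$$F_{55} - F_{56} \approx \frac{N_p}{8\rho^4}\left(\frac{1}{2}\alpha_1^2 + \alpha_2^2 + \frac{3}{2}\alpha_3^2\right). \tag{209}$$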
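Next, it follows from Eq. (12) that
$$F_{13} - F_{14} \approx \frac{1}{2}\sum_{j=1}^{J}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta_{x2}}\right)\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta_{y2}}\right), \tag{210}$$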
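where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (210) may be rewritten as:
$$F_{13} - F_{14} \approx \sum_{j=1}^{J}\cos\theta^{j}\left[\frac{1}{2}\sum_{k=1}^{K}\sum_{l=1}^{L}\frac{1}{\lambda^{j}_{kl}}\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{x2}}\right)\left(\frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{y1}} - \frac{\partial \lambda^{j}_{kl}}{\partial \beta^{j}_{y2}}\right)\right]. \tag{211}$$
Following the same reasoning as in the derivation of Eq. (196) from Eq. (193), it can be shown that the sum between square brackets of Eq. (211) may be directly derived from the result of Eq. (173) of Appendix A, which is given by Eq. (176). This results in:
$$F_{13} - F_{14} \approx \sum_{j=1}^{J}\frac{1}{J}\cos\theta^{j}\left[\frac{N_p\left(\beta^{j}_{x1} - \beta^{j}_{x2}\right)\left(\beta^{j}_{y1} - \beta^{j}_{y2}\right)}{4\rho^4}\right]. \tag{212}$$
Next, Eqs. (38) and (205) are substituted into Eq. (212), resulting in:
$$F_{13} - F_{14} \approx \sum_{j=1}^{J}\frac{1}{J}\cos\theta^{j}\left[\frac{N_p\left(\alpha_1\cos\theta^{j} + \alpha_3\sin\theta^{j}\right)\alpha_2}{4\rho^4}\right]. \tag{213}$$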
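In this expression, the sum is approximated by an integral since the difference $\Delta\theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$. This results in:
$$F_{13} - F_{14} \approx \frac{N_p\,\alpha_1\alpha_2}{8\rho^4}. \tag{214}$$
Similar reasonings yield:
$$F_{15} - F_{16} \approx \frac{N_p\,\alpha_1\alpha_3}{8\rho^4} \tag{215}$$
and
$$F_{35} - F_{36} \approx \frac{N_p\,\alpha_2\alpha_3}{8\rho^4}. \tag{216}$$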
Next, Eqs. (207)–(209) and (214)–(216) are substituted in Eq. (79). Finally, it is noticed that the distance $\alpha_2$ between the components projected onto the (y, z)-plane is equal to:
$$\alpha_2 = d\cos\varphi. \tag{217}$$
Taking account of Eqs. (201) and (217) results in Eq. (80).
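The tilt averages leading to Eqs. (207)–(209) and (214)–(216) can be reproduced with a finite number of projections. The sketch below starts from the per-projection results of Appendix A, Eqs. (171), (172), and (176), and applies the chain-rule factors $\cos\theta^{j}$ and $\sin\theta^{j}$. It assumes the projection $\beta^{j}_{x} = \beta_{x}\cos\theta^{j} + \beta_{z}\sin\theta^{j}$ with the y-axis as rotation axis (an inferred form of Eq. (38)). The function names and numerical values are illustrative.

```python
import math


def tilt_mean(weight, n_proj):
    """Average weight(theta_j) over n_proj equidistant tilt angles on (-pi/2, pi/2)."""
    d_theta = math.pi / n_proj
    return sum(weight(-0.5 * math.pi + (j + 0.5) * d_theta) for j in range(n_proj)) / n_proj


def fisher_differences_3d(n_p, rho, a1, a2, a3, n_proj=180):
    """Tilt-averaged differences of Fisher matrix elements for two overlapping components."""
    pref = n_p / (4.0 * rho**4)

    def proj(t):                      # projected separation along the detector x-axis
        return a1 * math.cos(t) + a3 * math.sin(t)

    def bracket(t):                   # bracket of Eq. (171), evaluated per projection
        return 2.0 * proj(t) ** 2 + a2**2

    return {
        "F11-F12": pref * tilt_mean(lambda t: math.cos(t) ** 2 * bracket(t), n_proj),
        "F33-F34": pref * tilt_mean(lambda t: 2.0 * a2**2 + proj(t) ** 2, n_proj),
        "F55-F56": pref * tilt_mean(lambda t: math.sin(t) ** 2 * bracket(t), n_proj),
        "F13-F14": pref * tilt_mean(lambda t: math.cos(t) * proj(t) * a2, n_proj),
        "F15-F16": pref * tilt_mean(lambda t: math.cos(t) * math.sin(t) * bracket(t), n_proj),
        "F35-F36": pref * tilt_mean(lambda t: math.sin(t) * proj(t) * a2, n_proj),
    }


n_p, rho = 1.0e4, 1.0
a1, a2, a3 = 0.06, 0.03, 0.02         # hypothetical separations, small compared to rho
numeric = fisher_differences_3d(n_p, rho, a1, a2, a3)
closed_form = {
    "F11-F12": n_p / (8 * rho**4) * (1.5 * a1**2 + a2**2 + 0.5 * a3**2),        # Eq. (207)
    "F33-F34": n_p / (4 * rho**4) * (0.5 * a1**2 + 2.0 * a2**2 + 0.5 * a3**2),  # Eq. (208)
    "F55-F56": n_p / (8 * rho**4) * (0.5 * a1**2 + a2**2 + 1.5 * a3**2),        # Eq. (209)
    "F13-F14": n_p * a1 * a2 / (8 * rho**4),                                    # Eq. (214)
    "F15-F16": n_p * a1 * a3 / (8 * rho**4),                                    # Eq. (215)
    "F35-F36": n_p * a2 * a3 / (8 * rho**4),                                    # Eq. (216)
}
for name in numeric:
    print(name, numeric[name], closed_form[name])
```

For equidistant tilt angles covering the full range, the finite sums of these trigonometric polynomials coincide with the integrals up to rounding, so the comparison isolates the tilt-averaging step from the small-separation approximation made in Appendix A.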
Amsterdam: North-Holland Publishing Company, pp. 295–339. International Centre for Diffraction Data. (2001). Release 2001 for the Powder Diffraction File. Pennsylvania: International Centre for Diffraction Data. (This is a software package.). Ishizuka, K. (1980). Contrast transfer of crystal images in TEM. Ultramicroscopy 5, 55–65. Kabius, B., Haider, M., Uhlemann, S., Schwan, E., Urban, K., and Rose, H. (2002). First application of a spherical-aberration corrected transmission electron microscope in materials science. J. Elect. Micro. Supplement 51, 51–58. Kambe, K., Lehmpfuhl, G., and Fujimoto, F. (1974). Interpretation of electron channeling by the dynamical theory of electron diffraction. Zeitschrift fu¨r Naturforschung 29a, 1034–1044. Kilaas, R., and Gronsky, R. (1983). Real space image simulation in high resolution electron microscopy. Ultramicroscopy 11, 289–298. Kirkland, E. J. (1984). Improved high resolution image processing of bright field electron micrographs I. Theory. Ultramicroscopy 15, 151–172. Kirkland, E. J. (1998). Advanced Computing in Electron Microscopy. New York: Plenum Press. Kisielowski, C., Hetherington, C. J. D., Wang, Y. C., Kilaas, R., O’Keefe, M. A., and Thust, A. (2001). Imaging columns of the light elements carbon, nitrogen and oxygen with sub ˚ ngstrom resolution. Ultramicroscopy 89, 243–263. A Kisielowski, C., Principe, E., Freitag, B., and Hubert, D. (2001). Benefits of microscopy with super resolution. Physica B 308–310, 1090–1096. Krivanek, O. L., Dellby, N., and Nellist, P. D. (2002). Aberration correction in the STEM, in Proceedings of the 15th International Congress on Electron Microscopy, Interdisciplinary and
160
VAN AERT ET AL.
Technical Forum Abstracts 2002 in Durban, South Africa, edited by R. Cross. Vol. 3, Onderstepoort: Microscopy Society of Southern Africa, pp. 29–30. Kruit, P., and Jansen, G. H. (1997). Space charge and statistical Coulomb effects, in Handbook of Charged Particle Optics, edited by J. Orloff. Boca Raton: CRC Press, pp. 275–318. Lentzen, M., Jahnen, B., Jia, C. L., Thust, A., Tillmann, K., and Urban, K. (2002). Highresolution imaging with an aberration-corrected transmission electron microscope. Ultramicroscopy 92, 233–242. Lichte, H. (1991). Electron image plane off-axis holography of atomic structures, in Advances in Optical and Electron Microscopy, edited by T. Mulvey and C. J. R. Sheppard. Vol. 12, London: Academic Press, pp. 25–91. Mo¨bus, G. R., Schweinfest, T., Gemming, T., Wagner, T., and Ru¨hle, M. (1997). Iterative structure retrieval techniques in HREM: a comparative study and a modular program package. J. Microscopy 190, 109–130. Mood, A. M., Graybill, F. A., and Boes, D. C. (1974). Introduction to the Theory of Statistics, 3rd ed. Singapore: McGraw-Hill. Mook, H. W., and Kruit, P. (1999). Optics and design of the fringe field monochromator for a Schottky field emission gun. Nuclear Instruments and Methods in Physics Research A 427, 109–120. Mory, C., Tence, M., and Colliex, C. (1985). Theoretical study of the characteristics of the probe for a STEM with a field emission gun. J. Micro. Spect. Electro. 10, 381–387. Muller, D. A. (1998). Core level shifts and grain boundary cohesion, in Microscopy and Microanalysis, Proceedings Microscopy and Microanalysis 1998 in Atlanta, Georgia, edited by G. W. Bailey, K. B. Alexander, W. G. Jerome, M. G. Bond, and J. J. McCarthy. Vol. 4, Supplement 2, New York: Springer, pp. 766–767. Muller, D. A. (1999). Why changes in bond lengths and cohesion lead to core-level shifts in metals, and consequences for the spatial difference method. Ultramicroscopy 78, 163–174. Muller, D. A., and Mills, M. J. (1999). 
Electron microscopy: probing the atomic structure and chemistry of grain boundaries, interfaces and defects. Mat. Sci. Engin. A 260, 12–28. Murray, W. (1972). Numerical Methods for Unconstrained Optimization. London: Academic Press. Nalwa, H. S. (2002). Nanostructured Materials and Nanotechnology: Concise Edition. San Diego: Academic Press. Nellist, P. D., and Pennycook, S. J. (1998). Subangstrom resolution by underfocused incoherent transmission electron microscopy. Phys. Rev. Lett. 81, 4156–4159. Nellist, P. D., and Pennycook, S. J. (2000). The principles and interpretation of annular darkfield Z-contrast imaging, in Advances in Imaging and Electron Physics, edited by P. W. Hawkes. Vol. 113, San Diego: Academic Press, pp. 147–199. O’Keefe, M. A. (1992). ‘Resolution’ in high-resolution electron microscopy. Ultramicroscopy 47, 282–297. O’Keefe, M. A., Hetherington, C. J. D., Wang, Y. C., Nelson, E. C., Turner, J. H., Kisielowski, C., Malm, J.-O., Mueller, R., Ringnalda, J., Pan, M., and Thust, A. (2001). Sub˚ ngstrom high-resolution transmission electron microscopy at 300 keV. Ultramicroscopy A 89, 215–241. Olson, G. B. (1997). Computational design of hierarchically structured materials. Science 277, 1237–1242. Olson, G. B. (2000). Designing a new material world. Science 288, 993–998. Op de Beeck, M., and van Dyck, D. (1996). Direct structure reconstruction in HRTEM. Ultramicroscopy 64, 153–165. Papoulis, A. (1965). Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill.
QUANTITATIVE ATOMIC RESOLUTION TEM
161
Papoulis, A. (1968). Systems and Transforms with Applications in Optics. New York: McGrawHill. Pa´zman, A. (1986). Foundations of Optimum Experimental Design. Dordrecht: D. Reidel Publishing Company. Pennycook, S. J. (1997). Scanning transmission electron microscopy: Z contrast, in Handbook of Microscopy—Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, edited by S. Amelinckx, D. van Dyck, J. van Landuyt, and G. van Tendeloo. Weinheim: VCH, pp. 595–620. Pennycook, S. J., Rafferty, B., and Nellist, P. D. (2000). Z-contrast imaging in an aberrationcorrected scanning transmission electron microscope. Microscopy and Microanalysis 6, 343–352. Pennycook, S. J., and Jesson, D. E. (1991). High-resolution Z-contrast imaging of crystals. Ultramicroscopy 37, 14–38. Pennycook, S. J., and Jesson, D. E. (1992). Atomic resolution Z-contrast imaging of interfaces. Acta Metallurgica et Materialia, Supplement 40, 149–159. Pennycook, S. J., Jesson, D. E., Chisholm, M. F., Browning, N. D., McGibbon, A. J., and McGibbon, M. M. (1995). Z-contrast imaging in the scanning transmission electron microscope. J. Micro. Soc. Am. 1, 231–251. Pennycook, S. J., and Yan, Y. (2001). Z-contrast imaging in the scanning transmission electron microscope, in Progress in Transmission Electron Microscopy 1—Concepts and Techniques, edited by X.-F. Zhang and Z. Zhang. Berlin: Springer-Verlag, pp. 81–111. Phillipp, F., Ho¨schen, R., Osaki, M., Mo¨bus, G., and Ru¨hle, M. (1994). New high-voltage ˚ point resolution installed in Stuttgart. atomic resolution microscope approaching 1 A Ultramicroscopy 56, 1–10. Rayleigh, Lord (1902). Wave theory of light, in Scientific Papers by John William Strutt, Baron Rayleigh. Vol. 3, Cambridge: Cambridge University Press, pp. 47–189. Reed, M. A., and Tour, J. M. (2000). Computing with molecules. Sci. Am. 282, 68–75. Reimer, L. (1984). 
Particle optics of electrons, in Transmission Electron Microscopy—Physics of Image Formation and Microanalysis. Berlin: Springer-Verlag, pp. 19–49. Reimer, L. (1993). Elements of a transmission electron microscope, in Transmission Electron Microscopy—Physics of Image Formation and Microanalysis. Berlin: Springer-Verlag, pp. 86–135. Rose, H. (1975). Zur Theorie der Bildenstehung im Elektronen-Mikroskop I. Optik 42, 217–244. Rose, H. (1990). Outline of a spherically corrected semiaplanatic medium-voltage transmission electron microscope. Optik 85, 19–24. Sato, M. (1997). Resolution, in Handbook of Charged Particle Optics, edited by J. Orloff. Boca Raton: CRC Press, pp. 319–361. Sato, M., and Orloff, J. (1992). A new concept of theoretical resolution of an optical system, comparison with experiment and optimum condition for a point source. Ultramicroscopy 41, 181–192. Saxton, W. O. (1978). Computer Techniques for Image Processing in Electron Microscopy. New York: Academic Press, chapter 9, pp. 236–248. Saxton, W. O. (1997). Quantitative comparison of images and transforms. J. Microscopy 190, 52–60. Scherzer, O. (1949). The theoretical resolution limit of the electron microscope. Journal of Applied Physics 20, 20–28. Schiske, P. (1973). Image processing using additional statistical information about the object, in Image Processing and Computer-aided Design in Electron Optics, edited by P. W. Hawkes. London: Academic Press, pp. 82–90.
162
VAN AERT ET AL.
Sinkler, W., and Marks, L. D. (1999). A simple channelling model for HREM contrast transfer under dynamical conditions. J. Microscopy 194, 112–123. Spence, J. C. H. (1988). Experimental High-Resolution Electron Microscopy, 2nd ed. New York: Oxford University Press. Spence, J. C. H. (1999). The future of atomic resolution electron microscopy for materials science. Mat. Sci. Engi. R 26, 1–49. Springborg, M. (2000). Methods of Electronic-Structure Calculations: From Molecules to Solids. Chichester: Wiley. Stadelmann, P. A. (1987). EMS—A software package for electron diffraction analysis and HREM image simulation in materials science. Ultramicroscopy 21, 131–146. Thust, A., and Jia, C. L. (2000). Advances in atomic structure determination using the focal-series reconstruction technique, in Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology 2000 in Brno, Czech Republic, edited by P. Toma´nek and R. Kolar˘´ık. Vol. 3, Brno: The Czechoslovak Society for Electron Microscopy, pp. 107–110. Thust, A., Overwijk, M. H. F., Coene, W. M. J., and Lentzen, M. (1996). Numerical correction of lens aberrations in phase-retrieval HRTEM. Ultramicroscopy 64, 249–264. Thust, A., Coene, W. M. J., Op de Beeck, M., and van Dyck, D. (1996). Focal-series reconstruction in HRTEM: simulation studies on non-periodic objects. Ultramicroscopy 64, 211–230. Treacy, M. M. J. (1982). Optimising atomic number contrast in annular dark field images of thin films in the scanning transmission electron microscope. J. Micro. Spect. Electron. 7, 511–523. van Aert, S., den Dekker, A. J., van den Bos, A., and van Dyck, D. (2002a). High-resolution electron microscopy: from imaging toward measuring. IEEE Trans. Instrument. Measure. 51, 611–615. van Aert, S., den Dekker, A. J., van den Bos, A., and van Dyck, D. (2002b). 
The benefits of statistical experimental design for quantitative electron microscopy, in Proceedings of the 15th International Congress on Electron Microscopy, Interdisciplinary and Technical Forum Abstracts 2002 in Durban, South Africa, edited by R. Cross. Vol. 3, Onderstepoort: Microscopy Society of Southern Africa, pp. 189–190. van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2000). Design aspects for an optimum DF STEM probe, in Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology 2000 in Brno, Czech Republic, edited by P. Toma´nek and R. Kolar˘´ık. Vol. 3, Brno: The Czechoslovak Society for Electron Microscopy, pp. 129–130. van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2002a). High-resolution electron microscopy and electron tomography: Resolution versus precision. J. Struct. Biol. 138, 21–33. van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2002b). Optimal experimental design of STEM measurement of atom column positions. Ultramicroscopy 90, 273–289. van Aert, S., and van Dyck, D. (2001). Do smaller probes in a scanning tranmission electron microscope result in more precise measurement of the distances between atom columns? Philosophical Magazine B 81, 1833–1846. van Aert, S., van Dyck, D., den Dekker, A. J., and van den Bos, A. (2000). Quantitative ADF STEM: guidelines towards an improved experimental design, in Jaarboek Nederlandse Vereniging voor Microscopie 2000, including the proceedings of the Joint Meeting of the BVM and the NVuM 2000 in Papendal, Arnhem, The Netherlands, edited by H. K. Koerten. Rijnsburg: Press Point, pp. 126–127.
QUANTITATIVE ATOMIC RESOLUTION TEM
163
van den Bos, A. (1982). Parameter estimation, in Handbook of Measurement Science, edited by P. H. Sydenham. Vol. 1, Chicester: Wiley, pp. 331–377. van den Bos, A. (1999). Measurement errors, in Encyclopedia of Electrical and Electronics Engineering, edited by J. G. Webster. Vol. 12, New York: Wiley, pp. 448–459. van den Bos, A. (2002). Afscheidsrede—Naar Waarde Schatten; Valedictory Address. Technische Universiteit Delft. van den Bos, A., and den Dekker, A. J. (2001). Resolution reconsidered—Conventional approaches and an alternative, in Advances in Imaging and Electron Physics, edited by P. W. Hawkes. Vol. 117, San Diego: Academic Press, pp. 241–360. van Dyck, D. (2002). High-resolution electron microscopy, in , edited by P. W. Hawkes. Advances in Imaging and Electron Physics Vol. 123, San Diego: Academic Press, pp. 105–171. van Dyck, D., and de Jong, A. F. (1992). Ultimate resolution and information in electron microscopy: general principles. Ultramicroscopy 47, 266–281. van Dyck, D., Danckaert, J., Coene, W., Selderslaghs, E., Broddin, D., van Landuyt, J., and Amelinckx, S. (1989). The atom column approximation in dynamical electron diffraction calculations, in Computer Simulation of Electron Microscope Diffraction and Images, edited by W. Krakow and M. O’Keefe. Warrendale: The Minerals, Metals & Materials Society, pp. 107–134. van Dyck, D., and Chen, J. H. (1999a). A simple theory for dynamical electron diffraction in crystals. Solid State Communications 109, 501–505. van Dyck, D., and Chen, J. H. (1999b). Towards an exit wave in closed analytical form. Acta Crystallographica A 55, 212–215. van Dyck, D., and Op de Beeck, M. (1996). A simple intuitive theory for electron diffraction. Ultramicroscopy 64, 99–107. van Dyck, D., Op de Beeck, M., and Coene, W. (1993). ‘‘A new approach to object wavefunction reconstruction in electron microscopy.’’ Optik 93, 103–107. van Dyck, D., van Aert, S., den Dekker, A. J., and van den Bos, A. (2002). 
How to select the items for the shopping list of future high resolution electron microscopists? in Microscopy and Microanalysis, Proceedings Microscopy and Microanalysis 2002 in Qubec City, Canada, edited by E. Voelkl, D. Piston, R. Gauvin, A. J. Lockley, G. W. Bailey, and S. McKernan. Vol. 8, Supplement 2, Cambridge: Cambridge University Press, pp. 94–95. van Dyck, D., van Aert, S., den Dekker, A. J., and van den Bos, A. (2003). Is atomic resolution transmission electron microscopy able to resolve and refine amorphous structures? Ultramicroscopy 98, 27–42. van Dyck, D., and Coene, W. (1987). A new procedure for wave function restoration in high resolution electron microscopy. Optik 77, 125–128. van Tendeloo, G., Pauwels, B., Geuens, P., and Lebedev, O. (2000). TEM of nanostructured materials, in Proceedings of the 12th European Congress on Electron Microscopy, Physical Sciences 2000 in Brno, Czech Republic, edited by J. Gemperlova´ and I. Va´vra. Vol. 2, Brno: The Czechoslovak Society for Electron Microscopy, pp. 1–6. van Tendeloo, G., and Amelinckx, S. (1978). A high resolution study of ordering in Au4Mn. Physica Status Solidi A 49, 337–346. van Tendeloo, G., and Amelinckx, S. (1982). High resolution electron microscopic and electron diffraction study of the Au—Mg System II. The Y-phase and some observations on the Au77Mg23 phase. Physica Status Solidi A 69, 103–120. van Veen, A. H. V., Hagen, C. W., Barth, J. E., and Kruit, P. (2001). ‘‘Reduced brightness of the ZrO/W Schottky electron emitter.’’ J. Vac. Sci. Technol. B 19, 2038–2044. Wada, Y. (1996). Atom electronics: a proposal of nano-scale devices based on atom/molecule switching. Microelect. Engin. 30, 375–382.
164
VAN AERT ET AL.
Wang, Z. L. (2001). Inelastic scattering in electron microscopy—Effects, spectrometry and imaging, in Progress in Transmission Electron Microscopy 1—Concepts and Techniques, edited by X.-F. Zhang and Z. Zhang. Berlin: Springer-Verlag, pp. 113–159. Weißba¨cker, C., and Rose, H. (2001). Electrostatic correction of the chromatic and of the spherical aberration of charged-particle lenses (Part I). J. Elect. Microscopy 50, 383–390. Weißba¨cker, C., and Rose, H. (2002). Electrostatic correction of the chromatic and of the spherical aberration of charged-particle lenses (Part II). J. Elect. Microscopy 51, 45–51. Wiesendanger, R. (1994). Scanning Probe Microscopy and Spectroscopy—Methods and Applications. Cambridge: Cambridge University Press. Williams, D. B., and Carter, C. B. (1996). Transmission Electron Microscopy—A Textbook for Materials Science. New York: Plenum Press. Zanchet, D., Hall, B. D., and Ugarte, D. (2000). X-ray characterization of nanoparticles, in Characterization of Nanophase Materials, edited by Z. L. Wang. Weinheim: Wiley-VCH, pp. 13–36. Zandbergen, H. W., and van Dyck, D. (2000). Exit wave reconstructions using through focus series of HREM images. Microscopy Research and Technique 49, 301–323.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 130
Transform-Based Image Enhancement Algorithms with Performance Measure

ARTYOM M. GRIGORYAN AND SOS S. AGAIAN
Department of Electrical Engineering, The University of Texas at San Antonio, San Antonio, Texas 78249, USA
I. Introduction
   A. Spatial Domain Methods
   B. Frequency Domain Methods
   C. Image Quality
II. Transforms with Frequency Ordered Systems
   A. General Transform-Based Image Enhancement Algorithm
   B. Performance Measure of Enhancement
   C. Experimental Results
      1. α-Rooting Enhancement
      2. Analysis of α-Rooting
      3. Modified α-Rooting
      4. Transform-Based Enhancement by Operator C3(p, s)
   D. Zonal Transform-Based Enhancement Methods
      1. Enhancement Algorithm with Two Zones
      2. Enhancement Algorithm with Three Zones
   E. Negative α-Rooting Method
III. Tensor Method of Image Enhancement
   A. Splitting Image-Signals
   B. Tensor Representation of the Image
   C. Construction of the Covering σ
   D. Properties of Image-Signals
   E. Image Enhancement
   F. Method of 1-D α-Rooting
   G. Conclusion
References
Copyright 2004, Elsevier Inc. All rights reserved. ISSN 1076-5670/04

I. Introduction

In many image-processing applications the quality of images should be improved to support human perception, and different image enhancement methods are widely used for this purpose. For instance, in the medical imaging area, an effective contrast enhancement for diagnostic purposes can
be achieved by including certain basic human visual properties (Ji et al., 1994). Image enhancement is a problem-oriented procedure. The goal of image enhancement is to improve the visual appearance of the image, or to produce the most visually pleasing image. Existing methods for image enhancement focus mainly on properties of the processed image while excluding any consideration of the observer characteristics. Because of this problem-specific nature, different enhancement methods are required for different types of images and applications.

Methods of image enhancement can be classified into two categories: spatial domain methods (which operate directly on pixels), including region-based and rational-morphology-based methods, and frequency domain methods (which operate on transform coefficients). There is also a third option, which consists of combining methods of image enhancement in the spatial and frequency domains.

A. Spatial Domain Methods

In the first category, we can refer to methods that are based on the following operations.

Unsharp masking is a well-known technique used in photography to enhance the visual quality of an image by processing its edges. The enhancement reduces to separating the edges, amplifying them, and summing them back into the image (Polesel et al., 2000).

Histogram equalization is a simple and effective technique in image enhancement, in which the image is processed in such a way that its histogram becomes flat (i.e., all brightness values appear in the image with the same frequency). The disadvantages of histogram equalization are the following.

a. Small-scale details that are often associated with small bins of the histogram are eliminated, and noise that shares bins with large image features may be enhanced. The resulting images often appear harsh. In general, it is desirable first to enhance a low-pass version of an image and then to add back the small-scale, high-frequency features of the image (Reeves and Jernigan, 1997).

b. The selection of a suitable function for the gray-level modification is not always a trivial step. The number of possible mathematical functions that may be used for this purpose is practically infinite. We need ways and tools for defining (or detecting) a suitable function, on the one hand, and a measure of uncertainty for quantitative justification of the results, on the other (Ji et al., 1994; Tizhoosh et al., 1998).
c. The luminance of an image may change significantly after equalization, which is why it has never been utilized in a video system in the past (Wang et al., 2000).

Note that Stark (2000) proposes a scheme for adaptive image-contrast enhancement based on a generalization of histogram equalization. Histogram equalization is a useful technique for improving image contrast, but its effect is too severe for many purposes. However, dramatically different results can be obtained with relatively minor modifications. Stark sets out a concise description of adaptive histogram equalization and uses this framework to discuss past suggestions for variations on histogram equalization. A key feature of this formalism is a “cumulation function,” which is used to generate a gray-level mapping from the local histogram. By choosing alternative forms of the cumulation function, one can achieve a wide variety of effects (Rosenfeld and Kak, 1982; Morrow et al., 1992).

Contrast stretching: When an image does not occupy the full available dynamic range, stretching its histogram over that range attempts to correct the situation. If the image is intended to go from brightness 0 to brightness L, then one generally maps the minimum value to 0 and the maximum value to L (Agaian, 1999; Ji et al., 1994; Kim et al., 1997).

Modified cosine function: A modified cosine function was developed for image enhancement, in which some enhanced images may appear a little darker if the average pixel value in the image generally falls below the intermediate value (128 for the 256-gray-level images used in most experiments) (Williams et al., 2001).

Other methods include a “global” entropy transformation and image enhancement using the first derivative and local statistics (Khellaf et al., 1991; Kim et al., 1997).

B. Frequency Domain Methods

The basic idea behind these methods consists of transforming the image, manipulating the transform coefficients, and then performing the inverse transformation. These methods operate on transforms of the image, such as the Fourier, wavelet, and cosine transforms. The basic advantages of transform-based image enhancement techniques are the following.

a. Low complexity of computations.

b. The critical role of unitary transforms in digital image processing, where they are used in different stages of processing such as filtering, coding, recognition, and restoration analysis.
c. Unitary transforms give the spectral information about an image by decomposition of the image into spectral coefficients that can be modified (linearly or non-linearly) for the purposes of enhancement and visualization.

d. It is easy to view and manipulate the frequency composition of the image, without direct reliance on spatial information.
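As a minimal sketch of the transform, modify, and inverse-transform pipeline described above, the following Python example applies α-rooting, one of the methods named in this chapter's outline, using the 2-D discrete Fourier transform. The function name alpha_rooting, the use of NumPy, and the default α = 0.9 are assumptions made for illustration. The rule used here multiplies each coefficient by its magnitude raised to the power α − 1, which is the commonly cited form of α-rooting and may differ in detail from the operators defined later in the chapter.

```python
import numpy as np


def alpha_rooting(image, alpha=0.9):
    """Sketch of alpha-rooting on the 2-D DFT of a grayscale image.

    Hypothetical helper for illustration only: each Fourier coefficient X
    is replaced by X * |X|**(alpha - 1), so phases are kept and only
    magnitudes are modified.
    """
    image = np.asarray(image, dtype=float)

    # Step 1: forward transform of the image.
    spectrum = np.fft.fft2(image)

    # Step 2: manipulate the transform coefficients.
    magnitude = np.abs(spectrum)
    gain = np.ones_like(magnitude)
    nonzero = magnitude > 0            # leave exactly-zero coefficients alone
    gain[nonzero] = magnitude[nonzero] ** (alpha - 1.0)

    # Step 3: inverse transform. The gain is real and symmetric for a real
    # image, so the result is real up to rounding error.
    enhanced = np.fft.ifft2(spectrum * gain)
    return enhanced.real
```

With α = 1 the image is returned unchanged, while values 0 < α < 1 compress large-magnitude (typically low-frequency) coefficients relative to small ones. The output is not rescaled to the display range, so a practical pipeline would normally follow this step with a normalization.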
C. Image Quality

The improvement in images after enhancement is often very difficult to measure. A processed image can be said to have been enhanced over the original image if it allows the observer to better perceive the desirable information in the image. In images, the improved perception is difficult to quantify. There is no universal measure that can specify both the objective and subjective validity of the enhancement method (Kim et al., 1997). In many image-processing applications, the image quality should be improved to support human perception. Image quality evaluation by human observers is, however, heavily subjective in nature. Individual observers judge the image quality differently (Tizhoosh, 2001). In many cases, the quality of the relevant part of the image information, which is perceived by the observer, should reach a maximum. The human observer, however, does not perceive this result as good because his judgment is subjective. The difficulty in developing an image enhancement technique lies in quantifying a criterion for enhancement (Williams et al., 2001).

Image enhancement is a complex problem because contrast is difficult to quantify, and contrast “optimality” is a perceptual concept. In an investigation of contrast enhancement based on the human visual system, the authors point out that there is generally no criterion for identifying how much enhancement is adequate at each location of the image. Without a generally applicable measure of contrast, contrast enhancement cannot be formulated as an optimization problem (Ji et al., 1994; Reeves and Jernigan, 1997).

An analysis of the existing transform-based image enhancement techniques (Aghagolzadeh and Ersoy, 1992; Castleman, 1996; Kim et al., 1997) shows that, for selecting optimal processing parameters and measuring the quality of images, a quantitative measure of image enhancement that relates to Weber’s law of the human visual system can serve as a building criterion for image enhancement (Agaian et al., 2001; Grigoryan et al., 2001). Transform-based image enhancement methods include techniques such as α-rooting, weighted α-rooting, modified unsharp masking, and filtering, which are all motivated by the human visual response (Agaian, 1991; Aghagolzadeh and Ersoy, 1992; Jain, 1989).
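To make the idea of a quantitative contrast measure concrete, the following Python sketch computes a simple block-based score: the image is divided into tiles, and the logarithm of the ratio of maximum to minimum intensity is averaged over the tiles. This is only an illustration in the spirit of a Weber-law-related measure; the function name block_contrast_measure, the block size, and the small constant eps are assumptions, and the measure actually used in this chapter is defined in a later section and may differ.

```python
import numpy as np


def block_contrast_measure(image, block=8, eps=1e-6):
    """Average over non-overlapping tiles of 20*log10(max/min).

    Hypothetical illustration, assuming nonnegative intensities. The
    constant eps avoids division by zero in tiles whose minimum is 0.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    scores = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            tile = image[r:r + block, c:c + block]
            ratio = (tile.max() + eps) / (tile.min() + eps)
            scores.append(20.0 * np.log10(ratio))
    if not scores:
        raise ValueError("image is smaller than one block")
    return float(np.mean(scores))
```

A score of this kind could serve as the objective when searching over an enhancement parameter: each candidate parameter value is applied, the score of the result is computed, and the value giving the preferred score is kept. A flat image scores zero, and the score grows as local contrast increases.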
There are common problems for transform image enhancement methods that need to be solved, because: a. such methods introduce certain artifacts (so-called ‘‘objectionable blocking eVects’’ (Aghagolzadeh and Ersoy, 1992)); b. the methods cannot simultaneously enhance all parts of the image very well; c. it is diYcult to select optimal processing parameters, and there is no eYcient measure that can be served as a building criterion for image enhancement; and d. in general, there is no automatic procedure for image enhancement. Finding a solution to these problems is very important especially when the image enhancement procedure is used as a preprocessing step for other image processing techniques such as detection, recognition, and visualization. There are excellent surveys on spatial domain based image enhancement techniques (Wang, 1981; Zamperoni, 1995) as well as textbooks on digital image processing (Castleman, 1996; Gonzalez and Wintz, 1987; Jain, 1989; Ritter and Wilson, 1996; Rosenfeld and Kak, 1982). It is also important to consider here novel transform-based image enhancement methods. This chapter focuses on a several goals: 1. to present a mathematical description of parametric transform-domain-based image enhancement techniques; 2. to present certain basic human visual properties based on new image enhancement measures, which helps us to formulate the image enhancement problem as an optimization problem; 3. to show that the proposed framework can be used to generate several image enhancement eVects (a special case of which is the a-rooting), and to develop an adaptive, automated image enhancement system; 4. to show that the proposed techniques are flexible and can be implemented eYciently; and 5. to determine for a given image the best transform among class of unitary transforms. In the first part of the chapter, we consider a novel frequency-domainbased image enhancement method for object detection and visualization. 
The method is based on unitary transforms such as the discrete Fourier, Hartley, cosine, and Hadamard transforms and on new enhancement operators, and it uses new quantitative transform-based measures of image enhancement. The technique presented here has been successfully applied to NASA's Earth Observing System satellite data products for the purpose of anomaly detection and visualization. These satellites collect a
GRIGORYAN AND AGAIAN
terabyte of data per day, and fast and efficient methods are crucial for analyzing these data (Kogan et al., 1998). In the second part, we consider a new transform-based method of image enhancement that is based on the tensor (or vector) representation of the two-dimensional image with respect to the Fourier transform, which was developed by Grigoryan (1984, 1986, 1991, 2001, 2002). In this representation, an image is defined as a set of one-dimensional (1-D) image-signals that split the Fourier transform into a set of 1-D transforms. As a result, the problem of image enhancement is reduced to the processing of 1-D splitting image-signals. The splitting yields a simple model for image enhancement, in which only a few image-signals suffice to achieve an enhancement comparable to that of the known class of frequency-domain-based parametric image enhancement algorithms. Based on a quantitative measure, the best parameters for image enhancement can be found separately for each image-signal to be processed. To show this, we give examples of image-signals and their contributions to the enhancement of images of sizes 256 × 256 and 512 × 512.

II. Transforms with Frequency Ordered Systems

When analyzing signals and systems, it is useful to map data from the time domain into another domain (in our case, the frequency domain). The basic characteristics of a complex wave are the amplitude and phase spectra. Specifying amplitude and phase spectra is an important concept for complex waves. For example, an amplitude spectrum contains information about the energy content of a signal and the distribution of the energy among different frequencies, which is used in many applications. To achieve this, the real variable, t, is generalized to the complex variable, (u + jv), which is then mapped back via inverse mapping. For example, the one-dimensional discrete Fourier transform (1-D DFT)
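\[
F_p = \sum_{n=0}^{N-1}\left[x_n\cos\frac{2\pi np}{N} - j\,x_n\sin\frac{2\pi np}{N}\right], \qquad
x_n = \frac{1}{N}\sum_{p=0}^{N-1}\left[F_p\cos\frac{2\pi pn}{N} + j\,F_p\sin\frac{2\pi pn}{N}\right] \tag{1}
\]
maps the real line (time domain) into the complex plane, or a real wave into a complex one.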
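As a minimal numerical sketch of Eq. (1), the cosine and sine sums can be evaluated directly and compared with a library FFT. The function names `dft_cos_sin` and `idft_cos_sin` and the test signal are illustrative assumptions, not part of the chapter; NumPy is assumed to be available.

```python
import numpy as np

def dft_cos_sin(x):
    """1-D DFT of a real signal written as separate cosine and sine sums, Eq. (1)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    p = n.reshape(-1, 1)                        # one kernel row per frequency p
    angle = 2.0 * np.pi * n * p / N
    real = (x * np.cos(angle)).sum(axis=1)      # "real" components
    imag = -(x * np.sin(angle)).sum(axis=1)     # "imaginary" components
    return real + 1j * imag

def idft_cos_sin(F):
    """Inverse transform of Eq. (1), again as cosine and sine sums."""
    F = np.asarray(F, dtype=complex)
    N = F.size
    p = np.arange(N)
    n = p.reshape(-1, 1)                        # one kernel row per sample n
    angle = 2.0 * np.pi * p * n / N
    total = (F * np.cos(angle)).sum(axis=1) + 1j * (F * np.sin(angle)).sum(axis=1)
    return total / N

# Hypothetical test signal.
x = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0])
F = dft_cos_sin(x)
x_back = idft_cos_sin(F)
```

The direct sums cost on the order of N² operations; in practice a fast Fourier transform such as `np.fft.fft` would be used instead, and the sketch only serves to make the two sums in Eq. (1) explicit.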
The 1-D Fourier transform maps the time domain signal into the frequency domain. The sum of the cosine products can be defined as the "real" components of the spectrum, and the sum of the sine products can be defined as the "imaginary" components of the spectrum. To compute these components, one can use the known algorithms of the fast Fourier transform. In general, we can consider the mapping systems, or transforms, that are based on a special sequency-ordered system. As an analog to frequency, the sequency is defined as the rate at which a basis function crosses the zero-axis. Such a transform can be represented in the form of
\[
T_p = \sum_{n=0}^{N-1} x_n\left[a_{p,n}\,\psi_p(n) + b_{p,n}\,\varphi_p(n)\right]
    = \sum_{n=0}^{N-1} x_n\left[I_R(p,n) + I_I(p,n)\right], \qquad p = 0:(N-1) \tag{2}
\]
where $\psi_p(n)$ and $\varphi_p(n)$ are selected functions. $I_R$ and $I_I$ can be considered, respectively, as the "real" and "imaginary" components of the sums $[a_{p,n}\psi_p(n) + b_{p,n}\varphi_p(n)]$. It is easy to see that the known unitary transforms such as the Hartley, cosine, sine, and Hadamard transforms are particular cases of the sequency-ordered systems. Example 1. If $a_{p,n} = b_{p,n} = 1$ and $\psi_p(n) = \cos(2\pi np/N)$, $\varphi_p(n) = \sin(2\pi np/N)$, then the sequency-ordered system becomes the discrete Hartley transform of a one-dimensional, discrete real function, $x_n$, and is defined as
\[
H_p = \sum_{n=0}^{N-1} x_n\,\mathrm{cas}\frac{2\pi np}{N}
    = \sum_{n=0}^{N-1} x_n\left[\cos\frac{2\pi np}{N} + \sin\frac{2\pi np}{N}\right] \tag{3}
\]
where $\mathrm{cas}(t) = \cos(t) + \sin(t)$. The Hartley transform is similar to the Fourier transform, but generates only real coefficients rather than complex ones. Example 2. If $a_{p,n} = \cos(\pi p/2N)$, $b_{p,n} = -\sin(\pi p/2N)$, and $\psi_p(n) = \cos(\pi np/N)$, $\varphi_p(n) = \sin(\pi np/N)$, then the sequency-ordered system becomes, up to normalization, the cosine transform. Indeed, the discrete cosine transform is determined by the basis functions
\[
f_p(n) =
\begin{cases}
\dfrac{1}{\sqrt{2N}}, & \text{if } p = 0 \\[2ex]
\sqrt{\dfrac{2}{N}}\,\cos\dfrac{\pi (n + 1/2)\,p}{N}, & \text{if } p \neq 0
\end{cases} \tag{4}
\]
as $X_p^c = \sum_{n=0}^{N-1} x_n f_p(n)$, which for $p \neq 0$ takes the form
\[
X_p^c = \sqrt{\frac{2}{N}}\sum_{n=0}^{N-1} x_n \cos\frac{\pi (n + 1/2)\,p}{N}
      = \sqrt{\frac{2}{N}}\sum_{n=0}^{N-1} x_n\left[\cos\frac{\pi p}{2N}\cos\frac{\pi pn}{N} - \sin\frac{\pi p}{2N}\sin\frac{\pi pn}{N}\right],
\qquad p = 1:(N-1). \tag{5}
\]
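Returning to Example 1, the discrete Hartley transform of Eq. (3) can be sketched directly from the cas kernel and cross-checked against the Fourier transform: for real input, the Hartley coefficients equal the real part minus the imaginary part of the DFT of Eq. (1). The helper names and the test signal below are hypothetical, and NumPy is assumed.

```python
import numpy as np

def dht(x):
    """Unnormalized discrete Hartley transform using the cas kernel of Eq. (3)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    angle = 2.0 * np.pi * np.outer(n, n) / N
    cas = np.cos(angle) + np.sin(angle)      # cas(t) = cos(t) + sin(t)
    return cas @ x

def dht_via_fft(x):
    """Same coefficients obtained from the DFT: H_p = Re(F_p) - Im(F_p)."""
    F = np.fft.fft(np.asarray(x, dtype=float))
    return F.real - F.imag

# Hypothetical test signal.
x = np.array([3.0, 1.0, -2.0, 0.5, 4.0, -1.0, 2.0, 0.0])
H = dht(x)
x_back = dht(H) / x.size     # the cas kernel is its own inverse up to the factor 1/N
```

The last line illustrates why the Hartley transform is convenient for real signals: the same real kernel serves for both the forward and the inverse transform.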
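Example 3. We consider the following system $\{\mathrm{wal}_p(n)\}$ that is formed from the Walsh ordered functions $\{\mathrm{wal}_p(t)\}$ defined on the interval [0, 1]:
\[
\mathrm{wal}_0(t) \equiv 1, \qquad t \in [0, 1] \tag{6}
\]
\[
\mathrm{wal}_{2s+q}(t) = \mathrm{wal}_s\!\left(2\left[t + 1/4\right]\right) + (-1)^{s+q}\,\mathrm{wal}_s\!\left(2\left[t - 1/4\right]\right), \qquad q = 0, 1 \tag{7}
\]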
where $s = 0, 1, \ldots$ (Poularikas and Seely, 1991). If $a_{p,n} = b_{p,n} = 1$ and $\psi_p(n) = \mathrm{wal}_p(n)$ if p is even, and $\varphi_p(n) = \mathrm{wal}_p(n)$ if p is odd, then the sequency-ordered system becomes the Cal-Sal Walsh-Hadamard transform (C-SWHT). The C-SWHT is defined as
\[
X_p^{cs} = \sum_{n=0}^{N-1} x_n\left[\mathrm{cal}_p(n) + \mathrm{sal}_p(n)\right], \qquad (N = 2^r) \tag{8}
\]
where $\mathrm{cal}_p(n) = \mathrm{wal}_{2s}(n)$ if $p = 2s$, $\mathrm{sal}_p(n) = \mathrm{wal}_{2s+1}(n)$ if $p = 2s + 1$, and s denotes the sequency. A transform definition via a parametric class of trigonometric systems is
\[
T_p = \sum_{n=0}^{N-1} x_n\left[a_{n,p}\cos(\psi_p(n)) + b_{n,p}\sin(\psi_p(n))\right] \tag{9}
\]
where
\[
\psi_p(n) = \frac{1}{N}\,a_0\,(p + a_1)(n + a_2) \tag{10}
\]
for some constants $a_0$, $a_1$, and $a_2$. Similar to the Fourier transform, one can define the "magnitude" and "phase" of the real transform $T_p$. The "phase" associated with p is defined as
\[
\theta(p) = \arctan\frac{I_I(p)}{I_R(p)} \tag{11}
\]
where $I_R(p)$ and $I_I(p)$ are, respectively, the sums of the "real" and "imaginary" components in Eq. (9). The "magnitude" is defined as
\[
|T_p| = \sqrt{I_R^2(p) + I_I^2(p)}. \tag{12}
\]
The "power"
\[
|T_p|^2 = \sum_{n=0}^{N-1} x_n^2 = \sum_{n=0}^{N-1}\left[(a_{n,p})^2 + (b_{n,p})^2\right] \tag{13}
\]
and phase spectra can be recombined to reconstruct $T_p$ completely. Given an image $x = x_{n,m}$ of size N × N, we consider a two-dimensional unitary transform Tx
\[
X(p,s) = (Tx)(p,s) = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} x_{n,m}\,f_{p,s}(n,m)
       = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} x_{n,m}\left[a_{n,m}\,\psi_{p,s}(n,m) + b_{n,m}\,\varphi_{p,s}(n,m)\right] \tag{14}
\]
where $\{f_{p,s};\ p, s = 0:(N-1)\}$ is the set of basis functions of the transform T, and $\{\psi_{p,s}, \varphi_{p,s};\ p, s = 0:(N-1)\}$ is a complete set of orthogonal functions. $a_{n,m}$ and $b_{n,m}$ are coefficients of the transform. It is clear that the "magnitude" of such systems is similar to the magnitude of the Fourier transform. This fact points to the possibility of constructing a unified transform-based enhancement algorithm for all sequency-ordered systems.

A. General Transform-Based Image Enhancement Algorithm

Analyzing the existing transform-based enhancement algorithms [α-rooting and magnitude reduction methods (Aghagolzadeh and Ersoy, 1992; Kogan et al., 1998)], we find a common algorithm that encompasses all of these techniques. The actual procedure of image enhancement via an invertible transform consists of the following three steps.
Step 1: Perform the unitary transform Tx of the image $x_{n,m}$.
Step 2: Multiply the transform coefficients, $a_{n,m}$ and $b_{n,m}$, by some factor, O(n,m).
Step 3: Perform the inverse unitary transform.
The frequency ordered system-based method can be represented as
\[
x \xrightarrow{\;T\;} X \longrightarrow O \cdot X \xrightarrow{\;T^{-1}\;} \hat{x} \tag{15}
\]
where O is an operator that can be applied to a combination of $a_{n,m}$ and $b_{n,m}$ (in particular, to the moduli of the transform coefficients) or directly to these coefficients. For instance, the results could be $a_{n,m}^{\alpha}$, $b_{n,m}^{\alpha}$, or $\log^{\alpha} a_{n,m}$, $\log^{\alpha} b_{n,m}$. Basically, we are interested in the cases when O(n,m) is an operator of magnitude (see cases 1–4 below) and when O(n,m) is performed separately on the coefficients. To avoid many subscripts in formulae, we will also use the notation $X(p,s) = X_{p,s}$ for discrete 2-D transform coefficients. Let the enhancement operator O be of the form $X(p,s) \cdot C(p,s)$, where the latter is a real function of the magnitude of the coefficients, i.e., $C(p,s) = f(|X|)(p,s)$. $C(p,s)$ must be real because we only wish to alter the magnitude information, not the phase information. In the framework of this constraint, we have several possibilities for $C(p,s)$, which offer far greater flexibility:
1. $C_0(p,s)^{\gamma} = \text{constant}$, where $\gamma \ge 0$ (when $\gamma = 0$ the enhancement preserves all constant information);
2. $C_1(p,s) = C_0(p,s)^{\gamma}\,|X(p,s)|^{\alpha-1}$, where $C_0(p,s)$ is a constant and $0 \le \alpha < 1$ [which is the so-called modified α-rooting (Aghagolzadeh and Ersoy, 1992)];
3. $C_1(p,s) = C(p,s)^{\gamma}\,|X(p,s)|^{\alpha-1}$, $0 \le \alpha < 1$ (which is the so-called weighted α-rooting);
4. $C_2(p,s) = \log^{\beta}\left[|X(p,s)|^{\lambda} + 1\right]$, $0 \le \beta$, $0 < \lambda$, $\gamma \ge 0$ (Kogan et al., 1998); and
5. $C_3(p,s) = C_1(p,s) \cdot C_2(p,s)$.
Denoting by $\theta(p,s) \ge 0$ the phase of the transform coefficient $X(p,s)$, we can write
\[
X(p,s) = |X(p,s)|\exp[\,j\theta(p,s)] \tag{16}
\]
where $|X(p,s)|$ is the magnitude of the coefficients. Rather than apply the enhancement operator O directly to the transform coefficients $X(p,s)$, we will investigate the operator that is applied to the moduli of the transform coefficients,
\[
O(X)(p,s) = O(|X|)(p,s)\exp[\,j\theta(p,s)] \tag{17}
\]
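The three steps and Eq. (17) can be sketched for the Fourier case with the coefficient C1 as follows. This is an illustration under stated assumptions rather than the authors' implementation: the name `alpha_rooting`, the scalar `c0` (standing in for the constant factor of C1), the handling of zero-magnitude coefficients, and the random test image are choices made here and are not taken from the chapter.

```python
import numpy as np

def alpha_rooting(image, alpha, c0=1.0):
    """Scale each 2-D DFT coefficient by c0*|X|^(alpha-1); the phase is untouched."""
    x = np.asarray(image, dtype=float)
    X = np.fft.fft2(x)                        # Step 1: forward unitary-type transform
    mag = np.abs(X)
    C1 = np.ones_like(mag)                    # coefficients equal to zero stay zero
    nz = mag > 0
    C1[nz] = c0 * mag[nz] ** (alpha - 1.0)    # Step 2: real, non-negative factor
    Y = C1 * X                                # |Y| = c0*|X|^alpha, same phase as X
    return np.real(np.fft.ifft2(Y))           # Step 3: inverse transform

# Hypothetical test image with strictly positive intensities.
rng = np.random.default_rng(0)
img = rng.uniform(1.0, 255.0, size=(16, 16))
out = alpha_rooting(img, 0.9)
```

With α = 1 and `c0 = 1` the image is returned unchanged; for 0 ≤ α < 1 the magnitude of every coefficient becomes |X|^α while its phase is preserved, because the multiplying factor is real and non-negative. The output generally needs rescaling to the display range before viewing.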
We assume that the enhancement operator $O(|X|)$ takes one of the forms $C_i(p,s)\,|X(p,s)|$, i = 1, 2, 3, at every point (p, s). Remark 1. The above approach can be used: (a) on the whole image, or via blockwise processing with block sizes 8, 16, 32, and 64; (b) on some low-pass or high-pass filtered image. As an example, one can see in Figure 1 that an original image X can be divided first into a low-pass image $X_L$ and
Figure 1. Diagram of the image enhancement with C3(p, s).
high-pass image $X_H$. The high-pass image is enhanced by multiplication by $C_3(p,s)$ and then recombined with the low-pass image [see also (Aghagolzadeh and Ersoy, 1992; Wang, 1981), when using the coefficient $C_1(p,s)$]. In practice, the coefficient 0 ≤ α < 0.99 is used in $C_1(p,s)$ for image enhancement. The optimal value of α is image dependent and should be adjusted interactively by the user. For simplicity of our reasoning, we will assume that in the definition of the coefficients $C_1(p,s)$ we have $C(p,s) = C_0(p,s)$. One may ask: What are the optimal values of α, β, and λ? Can one choose α, β, and λ automatically? What is the best enhancement frequency ordered system? What is the optimal size of the transform N? In order to answer these questions, we now present a quantitative measure of image enhancement.

B. Performance Measure of Enhancement

In practice, many definitions of the contrast measure are used (Beghdadi and Negrate, 1989; Kim et al., 1997; Morrow et al., 1992). For example, the local contrast proposed by Gordon and Rangayyan is defined by the mean gray values in two rectangular windows centered on a current pixel. Beghdadi and Negrate proposed another definition of the local contrast, based on the local edge information of the image, in order to improve the first-mentioned definition. The local contrast method proposed by Beghdadi and Negrate has been adopted in order to define a performance measure of enhancement. The use of statistical measures of the gray-level distribution (for example, mean, variance, or entropy) as measures of local contrast enhancement has not been particularly meaningful for mammogram images. A number of images that clearly showed improved contrast showed no consistency, as a class, under these statistical measurements. A measure proposed in (Morrow et al., 1992), which has greater consistency than statistical measures, is based on the contrast histogram.
Intuitively, it seems reasonable to expect that an image enhancement measure with values at given pixels should depend strongly on the values at pixels that are close by and weakly on those that are farther away, and also that this measure should be related to the human visual system. In our definition, we use a modification of Weber's and Fechner's laws (Fechner, 1960; Gordon, 1989). Weber established a visual law which states that human visual detection depends on the ratio, rather than the difference, between the light intensity values $f(x,y)$ and $f(x,y) + df(x,y)$. The Weber definition of contrast was used to measure the local contrast of a single object. (One usually assumes a large background with a small test object, in which case the average luminance will be close to the background luminance. If there are many objects, this assumption does not hold.) Fechner's law proposes the following relationship between the light intensity $f(x,y)$ and brightness:
\[
B(x,y) = k' \ln\frac{f(x,y)}{F_{\max}} + k' \ln\frac{F_{\max}}{F_{\min}} \tag{18}
\]
where $k'$ is a constant, and $F_{\min}$ and $F_{\max}$ are the "absolute threshold" and "upper threshold" of the human eye (Krueger, 1989). Below, a new quantitative measure of image enhancement is presented. Let an image $f_{n,m}$ be split into $k_1 k_2$ blocks $w_{k,l}(i,j)$ of size $l_1 \times l_2$, and let α, β, and λ be fixed enhancement parameters (or vector parameters). For a given class {F} of unitary transforms, we define a value $EME = EME_{\alpha,\beta,\lambda,k_1,k_2}$ as follows:
\[
EME = \max_{F \in \{F\}} \chi\!\left(EME_{\alpha,\beta,\lambda,k_1,k_2}(F)\right) \tag{19}
\]
\[
EME(F) = EME_{\alpha,\beta,\lambda,k_1,k_2}(F) = \frac{1}{k_1 k_2}\sum_{l=1}^{k_2}\sum_{k=1}^{k_1} 20\log\frac{I^{w}_{\max;k,l}(F)}{I^{w}_{\min;k,l}(F)} \tag{20}
\]
where $I^{w}_{\min;k,l}(F)$ and $I^{w}_{\max;k,l}(F)$ are, respectively, the minimum and maximum of the image $x_{n,m}$ inside the block $w_{k,l}$ after processing the image by the F-transform-based enhancement algorithm with parameters α, β, and λ. The function χ is the sign function, $\chi(x) = x$ or $\chi(x) = -x$, depending on the method of enhancement under consideration. The decision to add this function was made after studying various examples of enhancement by transform methods using the different coefficients $C_i(p,s)$, i = 1, 2, 3. This will be demonstrated in the following sections. Definition 1. EME is called a measure of enhancement, or measure of improvement, of the image f.
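For an image that has already been processed (or for the original image, which corresponds to the identity transform), the block sum in Eq. (20) can be sketched as below. Several details are assumptions made for illustration and are not specified in the chapter: the logarithm is taken to base 10, the blocks are non-overlapping with incomplete border blocks discarded, intensities are non-negative, and a small constant `eps` guards against division by zero. The function names are hypothetical.

```python
import numpy as np

def block_extrema(image, block):
    """Minimum and maximum inside each non-overlapping l1 x l2 block."""
    x = np.asarray(image, dtype=float)
    l1, l2 = block
    k1, k2 = x.shape[0] // l1, x.shape[1] // l2     # number of blocks per axis
    if k1 == 0 or k2 == 0:
        raise ValueError("image is smaller than one block")
    blocks = x[:k1 * l1, :k2 * l2].reshape(k1, l1, k2, l2)
    return blocks.min(axis=(1, 3)), blocks.max(axis=(1, 3))

def eme(image, block=(8, 8), eps=1e-4):
    """Average over all blocks of 20*log10(max/min), in the spirit of Eq. (20)."""
    lo, hi = block_extrema(image, block)
    return float(np.mean(20.0 * np.log10((hi + eps) / (lo + eps))))

# Hypothetical 16 x 16 examples: a flat image, and one bright pixel per 8 x 8 block.
flat = np.full((16, 16), 7.0)
spots = np.ones((16, 16))
spots[::8, ::8] = 10.0
eme_flat = eme(flat)      # no contrast inside any block
eme_spots = eme(spots)    # max/min is close to 10 in every block
```

A flat image scores zero because the maximum equals the minimum in every block, while a block ratio near 10 contributes close to 20 under the base-10 assumption.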
Definition 2. EME(F) is called a measure of enhancement of the image f with respect to the transform F. The value
\[
EME[f] = EME_{k_1,k_2}[f] = \frac{1}{k_1 k_2}\sum_{l=1}^{k_2}\sum_{k=1}^{k_1} 20\log\frac{\max w_{k,l}}{\min w_{k,l}}
       = EME(I) = \frac{1}{k_1 k_2}\sum_{l=1}^{k_2}\sum_{k=1}^{k_1} 20\log\frac{I^{w}_{\max;k,l}(I)}{I^{w}_{\min;k,l}(I)} \tag{21}
\]
we call the enhancement measure of the image $f_{n,m}$. This value can be considered as a particular case of Eq. (20), when the transform F is the identical transform I and no transformation of the image is performed. In other words, we can use the notation $EME[f] = EME_{k_1,k_2}(I)$. Definition 3. A transform $F_0$ such that $EME(F_0) = EME$ is called the best (or optimal) transform relative to the measure of enhancement EME. The image enhancement algorithm based on this transform is called an optimal image improvement transform-based enhancement algorithm. Suppose the transform-based enhancement algorithm depends on the parameters α, β, and γ, or the vector $\mathbf{a} = (\alpha, \beta, \gamma)$, i.e., $F = F_{\mathbf{a}}$. Definition 4. Let F be the best (optimal) transform. A parameter $\mathbf{a}_0 = (\alpha_0, \beta_0, \gamma_0)$ such that $EME(F_{\mathbf{a}_0}) = EME$ is called the best (optimal) F-transform-based enhancement image vector parameter. It should be noted that the window size can also be included in the vector $\mathbf{a}$ as a parameter of optimal enhancement. We also define another image enhancement measure by applying the well-known concept of entropy. For a given image enhancement transform F, we define a measure of the image $f_{n,m}$ as
\[
EMEE(F) = \frac{1}{k_1 k_2}\sum_{l=1}^{k_2}\sum_{k=1}^{k_1} \frac{I^{w}_{\max;k,l}(F)}{I^{w}_{\min;k,l}(F)}\,\log\frac{I^{w}_{\max;k,l}(F)}{I^{w}_{\min;k,l}(F)}. \tag{22}
\]
Relative to this measure, we consider the value
\[
EMEE[f] = EMEE_{k_1,k_2}[f] = \frac{1}{k_1 k_2}\sum_{l=1}^{k_2}\sum_{k=1}^{k_1} \frac{\max w_{k,l}}{\min w_{k,l}}\,\log\frac{\max w_{k,l}}{\min w_{k,l}} \tag{23}
\]
to be the enhancement measure by entropy of the image $f_{n,m}$.
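The entropy-based measure of Eq. (23) can be sketched in the same blockwise manner; it differs from Eq. (21) only in the summand. The natural logarithm, the non-overlapping blocks, the guard constant `eps`, and the name `emee` are assumptions made for illustration.

```python
import numpy as np

def emee(image, block=(8, 8), eps=1e-4):
    """Average over all blocks of (max/min)*ln(max/min), in the spirit of Eq. (23)."""
    x = np.asarray(image, dtype=float)
    l1, l2 = block
    k1, k2 = x.shape[0] // l1, x.shape[1] // l2
    if k1 == 0 or k2 == 0:
        raise ValueError("image is smaller than one block")
    blocks = x[:k1 * l1, :k2 * l2].reshape(k1, l1, k2, l2)
    ratio = (blocks.max(axis=(1, 3)) + eps) / (blocks.min(axis=(1, 3)) + eps)
    return float(np.mean(ratio * np.log(ratio)))

# Hypothetical examples: a flat image, and one pixel of value e per 8 x 8 block.
flat = np.full((16, 16), 3.0)
spots = np.ones((16, 16))
spots[::8, ::8] = np.e
emee_flat = emee(flat)
emee_spots = emee(spots)
```

Because the ratio multiplies its own logarithm, blocks with a large dynamic range dominate this measure far more strongly than they dominate EME, which is consistent with the much larger numerical values reported for EMEE in the experiments.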
Definition 5. The quantity
\[
EMEE = \max_{F \in \{F\}} \chi\!\left(EMEE(F)\right) \tag{24}
\]
is called a measure of enhancement by entropy. In other words, the measure of enhancement by entropy is defined as the maximum of the image enhancement measured by different transforms F. Relative to the measure of enhancement by entropy, we define the best (optimal) transform $F_0$, such that $EMEE(F_0) = EMEE$, and an optimal image improvement transform-based enhancement by entropy algorithm. These two definitions of the enhancement measure may lead to different optimal transforms. Both definitions of the best transform $F_0$ for the best transform-based enhancement of the image will be considered and analyzed. To illustrate the measures of image enhancement introduced above, we consider the image $f_{n,m}$ given in Figure 2. The curve of the enhancement
Figure 2. The gray-scale image of size 512 × 512.
measure $EME_F(\alpha)$ for this image is shown in Figure 3a. The parameter α varies in the interval [0, 1] with step 0.001. The measure $EME_F(\alpha)$ is estimated by applying the 2-D discrete Fourier transform and using splitting blocks of size 7 × 7 in Eqs. (20) and (22). The measure of enhancement of the original image is EME[f] = 7.6313. The enhancement function $EME_F(\alpha)$ takes its maximum at the point α = 0.83. The result of image enhancement by the α-rooting method with this optimal parameter α is given in Figure 3b. The enhancement is estimated as $EME_F(0.83) = 20.6935$. The graph of the other enhancement measure, $EMEE_F(\alpha)$, is shown in Figure 3c. The maximum of the enhancement measure by entropy is achieved at the point α = 0.84, which is considered to be optimal for image
Figure 3. (a) The curve of the enhancement measure EME of the image. (b) The image enhanced by 0.83-rooting method. (c) The curve of the enhancement measure EMEE of the image. (d) The image enhanced by 0.84-rooting method. Measurements are with respect to the 2-D discrete Fourier transform.
enhancement relative to this measure. The enhancement of the image by 0.84-rooting is shown in part d. The enhanced and original images have, respectively, the measures $EMEE_F(0.84) = 1071.28$ and $EMEE[f] = 0.6764$. One can note that for the considered image, the enhancement measure EMEE yields better enhancement than the measure EME, although their optimal parameters α are close to each other. In general, this is not always the case, and it will be shown that these two kinds of image enhancement measures may provide very different optimal values of α. We now consider the problems of designing the best transform-based image enhancement algorithm and the best F-transform-based enhancement image vector parameter $(\alpha_0, \beta_0, \gamma_0)$.

C. Experimental Results

For more clarity and visibility, we first demonstrate the experimental results of the enhancement algorithm for 2-D signals such as the clock-on-moon image, which is a linear combination of the clock and moon images. In our test cases, we use three classes of algorithms, namely, the transform-based enhancement algorithms via the operators $C_1(p,s)$, $C_2(p,s)$, and $C_3(p,s)$, respectively. For each of these cases, we present two classes of experiments. The first class shows how to choose the best operator parameter (or the best enhancement algorithm) for the given transform. The second class shows how to choose the best image enhancement transform for the given image. In order to enhance images before passing them through a visualization algorithm, we reduce the magnitude information of the image while leaving the phase information intact. Since the phase information is much more significant than the magnitude information in the determination of edges, reducing the magnitude produces better edge detection capabilities.
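The first class of experiments, choosing the best operator parameter for a given transform, amounts to a one-dimensional search over α. The sketch below does this for Fourier-based α-rooting scored by a blockwise EME, taking χ(x) = x. It rests on assumptions not stated in the chapter: the enhanced image is rescaled to [0, 255] before scoring, the logarithm is base 10, `eps` avoids division by zero, and the synthetic test image and all function names are hypothetical.

```python
import numpy as np

def alpha_rooting_fft(x, alpha):
    """Fourier alpha-rooting: |X| -> |X|^alpha with the phase kept."""
    X = np.fft.fft2(x)
    mag = np.abs(X)
    gain = np.ones_like(mag)
    nz = mag > 0
    gain[nz] = mag[nz] ** (alpha - 1.0)
    return np.real(np.fft.ifft2(gain * X))

def rescale(x, top=255.0):
    """Map the image linearly onto [0, top] (assumed display normalization)."""
    lo, hi = x.min(), x.max()
    return np.zeros_like(x) if hi == lo else top * (x - lo) / (hi - lo)

def eme(x, L=8, eps=1e-4):
    """Blockwise 20*log10(max/min) averaged over non-overlapping L x L blocks."""
    k1, k2 = x.shape[0] // L, x.shape[1] // L
    b = x[:k1 * L, :k2 * L].reshape(k1, L, k2, L)
    ratio = (b.max(axis=(1, 3)) + eps) / (b.min(axis=(1, 3)) + eps)
    return float(np.mean(20.0 * np.log10(ratio)))

def best_alpha(x, alphas, L=8):
    """Return the alpha with the largest EME, its score, and all scores."""
    scores = [eme(rescale(alpha_rooting_fft(x, a)), L) for a in alphas]
    i = int(np.argmax(scores))
    return float(alphas[i]), scores[i], scores

# Hypothetical smooth test pattern with a little noise.
rng = np.random.default_rng(1)
yy, xx = np.mgrid[0:64, 0:64]
image = 100.0 + 50.0 * np.sin(xx / 6.0) * np.cos(yy / 9.0) + 5.0 * rng.standard_normal((64, 64))
alphas = np.arange(0.5, 1.0001, 0.01)
a_opt, score_opt, scores = best_alpha(image, alphas)
print(f"alpha with the largest EME on the grid: {a_opt:.2f}")
```

An exhaustive grid is used because the measure can have more than one peak as a function of α, so a local search started from a single point could settle on the wrong one.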
The method of enhancement also tends to reduce the low-frequency components rather than the high-frequency components (both the low-frequency components, which are associated with sharp edges, and the high-frequency components, which are associated with the edge elements). The "clock" image was taken as the original, $x_{n,m}$, and then the "moon" image was superimposed on it. This operation results in an illegible image (shown in Fig. 4) that will be processed to obtain an enhanced image, $\hat{x}_{n,m}$.

1. α-Rooting Enhancement

We consider the transform-based enhancement algorithm via the operator $C_1(p,s)$. Test 1.1 Choosing the best operator parameter. This case is known as modified α-rooting or root filtering (Aghagolzadeh and Ersoy, 1992;
Figure 4. Linear combination of the (a) clock and (b) moon images, which results in (c) an illegible image.
Andrews et al., 1972; McClellan, 1980). When α equals zero, only the phase is retained. When α < 1, the "amplitudes" of the large transform coefficients are reduced relative to the "amplitudes" of the small coefficients, and the result is enhanced edges and details in the image. Since most of the edge information is contained in the high-frequency region of the spectrum, the edges are enhanced by this method. By varying the α-level of the reduced image, we are able to enhance the quality of the images for visualization. Figure 5, (b) through (d), illustrates the enhancement of the image when the parameter α = 0.92 and the Fourier, Hadamard, and cosine transforms are used. As we see in these examples, the magnitude reduction using $C_1(p,s)$ served to sharpen the image as well as to even out the brightness throughout the image. The results of the visualization algorithms will be more accurate because they will be operating on these enhanced images. They will also be less dependent on magnitude variations due to magnification and blurring, which makes it much easier to set a thresholding constant that need not change from image to image. Test 1.2 Choosing the best image enhancement transform for the given image. Let $\hat{x}_{n,m}$ be identical to $x_{n,m}$ after normalization by a constant. The enhancement measure of the original image shown in Figure 4 is 4.5, or $EME_I(X) = 4.5$, where I is the identical transform. Figure 6 shows four curves that describe the measure of the enhancement when applying the Fourier, Hadamard, cosine, and Hartley transforms. We see that on the whole interval where α varies, the maximal measure of enhancement is provided mostly by the cosine and Hadamard transforms. The curves have two maxima, at the points α₁ = 0.92 and α₂ = 0.6, where the maximum measure is provided by the Fourier transform (the best transform among the above transforms).
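Test 1.2 can be sketched as a loop over a class of unitary transforms, each applied separably as X = T x Tᵀ. The code is an illustration under several assumptions that go beyond the chapter: the factor |X|^(α−1) is applied coefficient-wise to the real transforms as well (rather than through the "real" and "imaginary" splitting of Eq. (2)), the cosine matrix uses the standard orthonormal DCT-II scaling, the Hadamard matrix is in natural rather than sequency order, the Hartley transform is the separable form, the enhanced image is rescaled to [0, 255] before a base-10 EME is computed, and all names and the random test image are hypothetical.

```python
import numpy as np

def dft_matrix(N):
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

def hartley_matrix(N):
    a = 2.0 * np.pi * np.outer(np.arange(N), np.arange(N)) / N
    return (np.cos(a) + np.sin(a)) / np.sqrt(N)

def dct_matrix(N):
    n = np.arange(N)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(n, n + 0.5) / N)
    T[0, :] = np.sqrt(1.0 / N)            # orthonormal scaling of the constant row
    return T

def hadamard_matrix(N):
    H = np.array([[1.0]])
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]])   # Sylvester construction, natural order
    if H.shape[0] != N:
        raise ValueError("N must be a power of two")
    return H / np.sqrt(N)

def root_enhance(x, T, alpha):
    """Forward transform, coefficient-wise |X|^(alpha-1) gain, inverse transform."""
    X = T @ x @ T.T
    mag = np.abs(X)
    gain = np.ones_like(mag)
    nz = mag > 1e-12
    gain[nz] = mag[nz] ** (alpha - 1.0)
    Ti = T.conj().T                        # inverse of a unitary matrix
    return np.real(Ti @ (gain * X) @ Ti.T)

def eme(x, L=8, eps=1e-4):
    lo, hi = x.min(), x.max()
    x = 255.0 * (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
    k1, k2 = x.shape[0] // L, x.shape[1] // L
    b = x[:k1 * L, :k2 * L].reshape(k1, L, k2, L)
    ratio = (b.max(axis=(1, 3)) + eps) / (b.min(axis=(1, 3)) + eps)
    return float(np.mean(20.0 * np.log10(ratio)))

N, alpha = 32, 0.92
rng = np.random.default_rng(2)
image = rng.uniform(0.0, 255.0, size=(N, N))
transforms = {"DFT": dft_matrix(N), "DCT": dct_matrix(N),
              "DHdT": hadamard_matrix(N), "DHT": hartley_matrix(N)}
scores = {name: eme(root_enhance(image, T, alpha)) for name, T in transforms.items()}
best = max(scores, key=scores.get)         # transform with the largest EME
print(best, scores)
```

Writing every transform as a unitary matrix keeps the comparison uniform at the cost of N³ operations per image; fast algorithms would replace the matrix products in practice.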
The experimental results show that the parameter α₁ corresponds to the best visual estimation of enhancement. Between these two extreme points, the enhancement measures provided by the transforms are very close to each other.
Figure 5. Enhancement of (a) the original image via α-rooting based on (b) the Fourier, (c) Hadamard, and (d) cosine transforms when α = 0.92.
We now estimate the enhancement relative to the second measure, EMEE. Figure 7 shows four curves of the enhancement measure $EMEE_F(\alpha)$ of the clock-on-moon image. The measures are with respect to the Fourier transform, and the size of the blocks equals L × L, where L = 5, 7, 8, and 9. The parameter α varies from 0.3 to 1; the values of $EMEE_F(\alpha)$ are very small when α < 0.3. One can see a few peaks on each curve. We consider two regions of α separately. The first region is (0.3, 0.7) and the second is [0.7, 1]. The maximal value of all four curves in the second region is achieved at the point α = 0.88, and it will be considered as an optimal parameter for image enhancement by α-rooting. The amplitude of the maximum A in this region increases with the size of the splitting blocks. In the first region, each curve has a few peaks at points that are considered to be the points of most interest in this region. Two such points, C and B, for the curve $EMEE_F(\alpha)$ calculated for blocks 9 × 9 are shown in the bottom part of the figure. To compare the measurement of the enhancement relative to other transforms, we consider Figure 8, where enhancement curves present the
Figure 6. α-rooting by using (from top to bottom) the 2-D discrete Fourier, cosine, Hartley, and Hadamard transforms.
enhancement measure relative to the Fourier, cosine, Hadamard, and Hartley transforms. The curves have peaks at different points in the first region as well as in the second region. The optimal values of α of the enhancement measure in the two regions, when applying these transforms, are given in Table 1. When applying the Hartley transform, the peak in the second region has the smallest amplitude, but it is located at the point 0.92, which is considered to be the optimal point when using the enhancement measure EME. The cosine and Fourier transforms provide values that are close to this optimal α, but the Hadamard transform does not. The results of enhancement of the clock-on-moon image by α-rooting, when applying the extremal values of α from the second region, are shown in Figure 9. Figure 10 shows the results of α-rooting when using the values of α that correspond to the peaks B in the first region.

2. Analysis of α-Rooting

All data of the enhancement measure given in Figure 6 have been calculated by Eq. (20) with splitting block size 5 × 5. Figure 11 shows the original image in part a, along with the image enhanced by 0.92-rooting in part b,
Figure 7. Curves of the enhancement measure EMEE of the clock-on-moon image, when applying the Fourier transform and splitting blocks of size L × L, for L = 5, 7, 8, and 9.
and the curve of the enhancement measure $EME_F(\alpha)$ in part c. At the point α = 0.92 the enhancement of the image is estimated as 17.3121. One can see two peaks on the curve of the enhancement measure. In Figure 6, the highest peak is at the point α = 0.6, not 0.92. The size of the splitting blocks was taken to be L × L, where L = 5. In the L = 11 case, the height of the peak at α = 0.92 is maximal. It is interesting to note that the first peak of the enhancement curve is shifted to the right when the size of the blocks increases, but the second peak at the point α = 0.92 is not. For illustration, Figure 12 shows five curves of the image enhancement that have been calculated for block size L × L, when L = 3, 5, 7, 9, and 11. When L increases, the first peak goes up and is shifted to the right. The second peak is also raised but stays at the fixed point α = 0.92. Because of the stability of the location of the second peak with respect to the change
Figure 8. Curves of the enhancement measure EMEE of the clock-on-moon image, when applying (from top to bottom) the Fourier, cosine, Hadamard, and Hartley transforms and splitting blocks of size 7 × 7.

TABLE 1
Values of Optimal α in the Second and First Regions, for the Enhancement Measure EMEE Estimated for the Clock-on-Moon Image, When Applying the DFT, DCT, DHdT, and DHT

F             DFT     DCT     DHdT    DHT
α_opt(II)     0.89    0.90    0.82    0.92
α_opt(I)      0.58    0.54    0.63    0.60
of the block size, we consider the value α = 0.92 to be optimal for the image enhancement by the method of α-rooting. We now compare the enhancement measure of the clock-on-moon image with the measures of its components, namely the clock and moon images. Figure 13 shows the enhancement measure $EME_F(\alpha)$ calculated for the clock image in part a, along with the enhancement measures of the moon
Figure 9. The enhancement of the clock-on-moon image by the α-rooting method, when applying (a) α = 0.89, (b) α = 0.90, (c) α = 0.82, and (d) α = 0.92.
and clock-on-moon images in parts b and c, respectively. The measures are calculated by applying the Fourier transform and blocks of sizes 5 × 5 and 7 × 7, and their curves are shown in the figure by dashed and solid lines, respectively. The enhancement measure of the clock image equals 5.4370 when calculated with blocks of size 5 × 5, and 6.9615 when applying 7 × 7 blocks. For the optimal enhancement of the clock image, the optimal value of the parameter α is considered to be 0.96. This value is close to 1 and corresponds to the location of the second peaks of the curves of the enhancement measure. The enhancement of the clock image by the method of α-rooting with α = 0.96 equals $EME_F(0.96) = 11.5860$ with blocks 7 × 7 and is shown in Fig. 13d. For the L = 5 and 7 cases, both curves of the enhancement measure of the moon image contain only one peak, at the points α = 0.80 and 0.84, respectively.
Figure 10. The results of α-rooting over the clock-on-moon image, when applying (a) α = 0.58, (b) α = 0.54, (c) α = 0.63, and (d) α = 0.60.
The result of enhancement of the moon image by using the parameter α = 0.84 in the α-rooting method is shown in Figure 13e. The curves of the enhancement measure $EME_F(\alpha)$, α ∈ [0, 1], calculated for the clock-on-moon image have two peaks, as for the clock image, for the values L = 5 and 7. At the points α = 0.80 and α = 0.84 the enhancement is estimated as 19.2974 and 20.6963, respectively. The enhancement measure of the moon image is $EME_F = 6.7827$ when applying 5 × 5 blocks, and 8.8159 when applying 7 × 7 blocks. The enhancement measure of the clock-on-moon image is $EME_F = 9.7984$ when using 5 × 5 blocks, and 11.9441 when using 7 × 7 blocks. The optimal parameter α is considered to be 0.92, and the corresponding enhancement of the image is shown in Figure 13f. The curves of enhancement for this image preserve the form of the curves of enhancement of the clock image.
Figure 11. (a) Original clock-on-moon image. (b) The image enhanced by the α-rooting method with α = 0.92. (c) The measure of the enhancement with respect to the Fourier transform.
The optimal value of α has been shifted from 0.96 to 0.92. If we assume the moon image to be noise added to the original clock image, then one can consider the enhancement measure as an image characteristic that does not depend much on the noise. In this sense, the enhancement measure $EME_F(\alpha)$ is robust. We next consider the boy image shown in Figure 14a. The measure of enhancement of this image equals 9.2618 when blocks of size 5 × 5 are used. The curves of the enhancement measure of this image, when α-rooting is applied with respect to four unitary transforms, are given in Figure 15. These curves contain two peaks, and the second peaks are located at the same point α = 0.96. The values of the measure of enhancement for the optimal α = 0.96, when applying these transforms, are given in Table 2. The maximum of the second peak at the optimal point is achieved by the Hartley transform. The cosine transform provides the lowest maximum. The optimal
Figure 12. Curves of the enhancement measure $EME_F(\alpha)$ calculated by the splitting blocks of size L × L, for L = 3, 5, 7, 9, and 11. The measurements are with respect to the Fourier transform.
enhancement of the image is achieved at the point α = 0.96, and all four transforms indicate this point. The result of the image enhancement by α-rooting at the optimal point, when applying the Fourier transform, is shown in Figure 14b. The enhancement is 12.4942, i.e., about 4/3 times higher than for the original image. Figure 16 shows five curves that describe the image enhancement measure calculated by applying the cosine transform and different sizes L × L of splitting blocks, when L = 3, 5, 7, 9, and 11. As the first peaks of the curves are shifted to the right and lose their amplitudes, the second peaks are concentrated at the point α = 0.96 and their heights increase in proportion to L. Therefore, the value 0.96 is considered to be optimal for enhancement of the image by α-rooting. Our observations show that the whole region of the parameter α, i.e., the interval (0, 1), can be divided into four or two subregions (or parts) that describe different types of image enhancement. To define such a partition, we
Figure 13. (a–c) Curves of the enhancement measure $EME_F(\alpha)$ calculated by the splitting blocks of sizes 5 × 5 (dashed line) and 7 × 7 (solid line), calculated respectively for the clock, moon, and clock-on-moon images. (d) The optimal enhancement of the clock image by α = 0.96. (e) The optimal enhancement of the moon image by α = 0.84. (f) The optimal enhancement of the clock-on-moon image by α = 0.92.
consider, for example, the curve of the enhancement measure for the boy image of Figure 14a. The enhancement is estimated with respect to the Fourier transform, with splitting blocks of size 7 × 7. The curve of the image enhancement EMEF(α) is shown in Figure 17, where the horizontal line corresponds to the enhancement measure 9.2618 of the original boy image. The line intersects the enhancement curve at the points α = 0.57, 0.73, and 0.87, and the point 1 can also be considered as such a point. The results of α-rooting for these four values of α are shown in Figure 18. All four images have almost the same enhancement measure estimated by the Fourier transform, but they differ in quality and contrast. The image in part b shows all objects of the picture with very good quality. In parts b and c, one can see images with enhancement as well as enhanced contours. More details and contours in the picture can be observed. The α-rooting for both α = 0.73 and 0.57 works not only as an enhancement operator but also as a gradient operator. Many contours appear
Figure 14. (a) Boy image with EMEF = 9.2618. (b) The image enhanced by α-rooting with α = 0.96, which results in the enhancement EMEF(0.96) = 12.4942 when estimated with blocks of size 5 × 5.
Figure 15. Four curves of the image enhancement measure calculated for α-rooting with the Fourier, cosine, Hartley, and Hadamard transforms.
TABLE 2
The Maximal Values of the Enhancement Measure for the Boy Image

F       α_opt    EMEF(α_opt)    EMEF(0.96)
DFT     0.96     12.4942        12.4942
DCT     0.58     13.5647        12.2934
DHdT    0.63     12.7904        12.3226
DHT     0.96     12.7693        12.7693
Figure 16. Image enhancement measure estimated by the cosine transform for different sizes L × L of splitting blocks, where L = 3, 5, 7, 9, and 11.
in the picture when using α = 0.57, in comparison with the case α = 0.73, and the quality of both images is good. Two further values of α are also of particular interest. The first is the point where the maximum of the first peak of the enhancement curve is located, and the second corresponds to the minimum point between the two peaks. In Figure 17, these two extremal points are denoted by B and C, and they correspond to the points α = 0.65 and 0.80. The results of image
Figure 17. Image enhancement measure estimated by the Fourier transform for splitting blocks of size 7 × 7.
enhancement by the α-rooting with these values are shown in Figure 19 in parts a and b. The enhancement of the boy image is estimated respectively as 7.7304 and 11.3483, when the Fourier transform is applied and the splitting blocks are of size 7 × 7. Based on the results considered above, we may divide the range of α into four parts, I, II, III, and IV, by the points 0, 0.57, 0.73, 0.87, and 1, as shown in Figure 17. Region I, where α varies from 0.87 to 1, can be considered as a region with favorable enhancement parameters α. Region II, where α varies from 0.73 to 0.87, can be considered as a region with low-favorable enhancement parameters α. Region III, where α varies from 0.57 to 0.73, can be considered as a region with favorable enhancement and gradient parameters α. Region IV, where α varies from 0 to 0.57, can be considered as a region with low-favorable gradient parameters α. Figure 20 shows the histogram of the original boy image in part a, along with the histogram of the image processed by 0.96-rooting in part b, and the cumulative density functions of the boy image and the enhanced image in parts c and d, respectively. One can see that after the enhancement, the range of intensities of the image is shifted to the left, from high intensities to low intensities. There are fewer pixels with high intensities but more pixels with low intensities, which will be more visible. The curves of the cumulative density functions also show that the range of the image is shifted
Figure 18. (a) Original image. (b) Image enhanced by 0.87-rooting. (c) Image enhanced by 0.73-rooting. (d) Image enhanced by 0.57-rooting. All images have the same enhancement measure. The measurements are with respect to the Fourier transform.
Figure 19. (a) Image enhanced by 0.80-rooting. (b) Image enhanced by 0.65-rooting.
Figure 20. (a) The histogram of the boy image. (b) The histogram of the image enhanced by 0.96-rooting method. (c) The cumulative density function of the boy image. (d) The cumulative density function of the enhanced image.
and reduced after 0.96-rooting. The cumulative density function in part c becomes almost flat (i.e., close to 1) after the point n = 210. The cumulative density function of the enhanced image becomes flat starting from the point 190. Figure 21 shows six histograms of the image processed by α-rooting, when the values of α correspond to six special points α defined in Figure 17. These points are 0.57, 0.65, 0.73, 0.78, 0.88, and 0.96. Figure 22 shows the six cumulative density functions corresponding to these histograms. One can observe the process of histogram formation from curves with one mode (peak), when α ∈ [0, 0.75], to two modes (peaks), when α ∈ [0.75, 1], in the range of high intensities (say, above 100). The histogram H(α) depends on the parameter α. As α varies from 0 to 1, the histograms of the images enhanced by α-rooting approach the histogram H(1) of the original boy image, which is shown by the dotted line in part f of Figure 21. In other words, H(α) → H(1) as α → 1.
Figure 21. (a–f) The histograms of the boy image enhanced by the α-rooting method for α = 0.57, 0.65, 0.73, 0.78, 0.88, and 0.96, respectively.
To show the pixels that correspond to the second mode, we can consider the thresholding of the image at level T = 166 by

\[ f_T(n,m) = \begin{cases} f_{n,m}, & \text{if } f_{n,m} < T, \\ 255, & \text{if } f_{n,m} \ge T, \end{cases} \]

as shown in Figure 23, where the histogram of the enhanced image is given in part a along with the thresholded enhanced image. The enhancement corresponds to the optimal α-rooting, with α = 0.96. In the case when the curve of the enhancement measure has only one peak, as in the example with the moon image (see Figure 13b) or the vehicle image (see Figure 3), the region [0, 1] of the parameter α may be divided into two parts, [0, α0] and [α0, 1]. We call them, respectively, the range of low-favorable enhancement parameters and the range of favorable enhancement parameters α. In other words, we do not consider the presence of gradient-type parameters α. The value of α0 is defined as the point of intersection of the enhancement curve EMEF(α) with the horizontal line y = EMEF that
Figure 22. (a–f) The cumulative density functions that correspond to the histograms of the boy image enhanced by the α-rooting method for α = 0.57, 0.65, 0.73, 0.78, 0.88, and 0.96, respectively.
corresponds to the enhancement measure of the original image. This point may vary, but not significantly, when other unitary transforms are applied instead of the Fourier transform. As an example, we consider the enhancement of the chemical plant image shown in Figure 24a. The enhancement measure of the image equals EMEF = 12.4671, when applying the Fourier transform and 5 × 5 splitting blocks. Figure 25 shows three curves of the image enhancement measure EME(α), calculated by applying the Fourier transform and splitting blocks of sizes 5 × 5, 7 × 7, and 9 × 9. The parameter α = 0.92 is considered to be an optimal parameter for enhancement of the image by the α-rooting method. The image enhanced with the optimal parameter is shown in Figure 24b. Each curve of the image enhancement measure has only one peak, at a point that approaches α = 0.92 as the size of the blocks becomes larger. If we consider the curve calculated for 9 × 9 blocks, then the horizontal line
Figure 23. (a) The histogram of the boy image enhanced by the 0.96-rooting method. (b) The thresholding of the enhanced image by T = 166; all samples with intensities greater than or equal to T have been replaced by the value 255 and correspond to the brightest intensity in the image.
y = 12.4761 intersects the curve at the point α0 = 0.82. We consider the region of α to be divided into the two intervals [0, 0.82] and [0.82, 1]. The first interval is referred to as the region of low-favorable enhancement parameters, and the second interval [0.82, 1] is referred to as the region of favorable enhancement parameters α for the plant image. We do not observe the action of α-rooting as a gradient operator for α < α0. As an example, Figure 26 shows the original plant image in part a, along with the image enhanced by α-rooting with α = 0.92 in part b, the image processed by α-rooting with α = 0.82 in part c, and the image processed by α-rooting with α = 0.70 in part d. The enhancement measures of these images equal, respectively, 12.4761, 28.4461, 10.6091, and 4.4578. One can note the good quality of all processed images in parts (b–d).
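The partition of the α range described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes that the enhancement measure is the block-wise average of 20·log10(max/min), that the α-rooted image is rescaled to [0, 255] before measurement, and that α0 is the last point at which the sampled curve rises through the level of the original image. The function names `eme`, `alpha_rooting`, and `crossing_point`, the random test image, and the α grid are all hypothetical.

```python
import numpy as np

def eme(img, block=5, eps=1e-6):
    """Assumed measure: average over non-overlapping blocks of 20*log10(max/min)."""
    h, w = img.shape
    ratios = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            b = img[i:i + block, j:j + block]
            ratios.append(20.0 * np.log10((b.max() + eps) / (b.min() + eps)))
    return float(np.mean(ratios))

def alpha_rooting(img, alpha):
    """Scale each 2-D DFT coefficient by |X|^(alpha - 1); the phase is unchanged."""
    X = np.fft.fft2(img)
    mag = np.abs(X)
    coeff = np.where(mag > 0, mag, 1.0) ** (alpha - 1.0)
    out = np.real(np.fft.ifft2(coeff * X))
    out = out - out.min()                 # assumed rescaling to [0, 255]
    return 255.0 * out / max(out.max(), 1e-12)

def crossing_point(alphas, curve, level):
    """Last alpha where the sampled curve rises through `level` (linear interpolation)."""
    a0 = None
    for k in range(1, len(alphas)):
        lo, hi = curve[k - 1], curve[k]
        if lo < level <= hi:
            a0 = alphas[k - 1] + (level - lo) / (hi - lo) * (alphas[k] - alphas[k - 1])
    return a0

# Placeholder data: a random image stands in for a real test image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(40, 40)).astype(float)
alphas = np.round(np.arange(0.50, 1.001, 0.02), 2)
curve = [eme(alpha_rooting(image, a)) for a in alphas]
alpha_0 = crossing_point(alphas, curve, eme(image))
print("alpha_0 =", alpha_0)   # None if the curve never rises through the level
```

With a curve that has a single peak, the interval below the returned value plays the role of the low-favorable region and the interval above it the role of the favorable region.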
Figure 24. (a) Image with enhancement measure EMEI = 12.467. (b) Image enhanced by 0.92-rooting.
Figure 25. The measure of image enhancement estimated by the Fourier transform for splitting blocks of size L × L, when L = 5, 7, and 9.
Figure 26. α-rooting enhancements with respect to the Fourier transform. (a) Original image, (b) image enhanced by 0.92-rooting, (c) image enhanced by 0.82-rooting, and (d) image enhanced by 0.70-rooting.
3. Modified α-Rooting

We now consider the transform-based enhancement algorithm via the operator C2(p, s).

Test 2.1 Choosing the best operator parameter. Figure 27 illustrates the images enhanced by varying the parameter α, when using (a–c) the Fourier transform and (d–f) the Hadamard transform. The log-magnitude reduction using C2(p, s) serves to enhance the edges around regions in the image.

Test 2.2 Choosing the best image enhancement transform for the given image. Figure 28 illustrates the measure of the image enhancement obtained with different transforms when the parameters λ and β vary in the intervals [0, 2] and [0, 1], respectively. The surface of the measure for the Fourier method is shown in part a, along with the differences between the measures when the Fourier
Figure 27. Enhanced images via the α-rooting based on the Fourier transform (a–c) and the Hadamard transform (d–f).
and Hadamard in b, the Fourier and cosine transforms in c, and the cosine and Hadamard transforms in d are used for enhancement. The results of the Fourier transform-based image enhancement are shown in Figure 29 for the boundary values of the parameters. Large values of β lead to the elimination of the higher frequencies in the image spectrum, and the operator O works as a low-pass filter. In contrast, small values of β increase the image enhancement.

4. Transform-Based Enhancement by Operator C3(p, s)

We now consider the transform-based enhancement algorithm by the operator C3(p, s).

Test 3.1 Choosing the best operator parameter. Combining the magnitude reduction and log-magnitude reduction methods in C3(p, s) accomplishes both sharpening and edge enhancement for a given image. In our experiments, we found C3(p, s) with α = 0.8 and λ = 1.5 to be the optimal magnitude reduction operator on the image. Figure 30 illustrates the surface of the enhancement measure EME for β = 0.8 and λ = 0.9, when the Fourier transform-based image enhancement algorithm is used. Figure 31 illustrates
Figure 29. (a) The original image and (b–d) the 2-D Fourier transform enhancements when operating with C3(p, s) coefficients that are calculated, respectively, for parameters (λ, β) equal to (0.05, 0.05), (1.9, 0.05), (0.05, 0.9), and (1.9, 0.9).
the surface of the enhancement measure for α = 0.8, when the similar Hadamard transform algorithm is applied.

Test 3.2 Choosing the best image enhancement transform. We face the problem of selecting the optimal unitary transform for our application. Since our goal is to achieve maximum accuracy in the detection of regions of interest as well as to maximize computational speed, we must balance these two factors and make a selection that is appropriate for our application. Therefore, we analyzed the quality of the results and the execution time for each of these unitary transform algorithms.

Figure 28. Measure of log-enhancement by the Fourier, Hadamard, and cosine transformations. (a) Fourier enhancement measure, (b) difference between the Fourier and Hadamard measures, (c) difference between the Fourier and cosine measures, (d) difference between the cosine and Hadamard measures.
Test 3.3 Comparison. An enlarged example is shown in Figure 32: the image (a), its enhancement by the proposed optimal magnitude reduction with the Fourier transform (b), and, for comparison, its enhancement by α-rooting (c). The histograms (d) through (f) show how the range of intensities differs when different coefficients are used for the enhancement. The measure of enhancement is 9.84 and 7.80 when using, respectively, the coefficients C3(p, s) and C1(p, s) for enhancement of the image. Figure 33 illustrates, for comparison, the outputs of the 2-D Fourier transform enhancement for all three methods under consideration. One can see that the maximum measure of enhancement and the best visual estimate occur when using the coefficient C3(p, s). All further references to the magnitude reduction algorithm will be to this specific combination of magnitude reductions. It should be noted that, from the standpoint of information theory, the probability distribution that conveys the most information is perfectly uniform (Rosenfeld and Kak, 1982). Therefore, if we could obtain as uniform a histogram as possible, the image information could be maximized.

D. Zonal Transform-Based Enhancement Methods

The classical transform-based image enhancement techniques are performed uniformly over the entire frequency spectrum. There are situations in which (a) it is desirable to enhance details in an image without significantly changing its general characteristics (Stark, 2000), or (b) an image may have enough global contrast with considerable low-contrast local details, or the contrast is poor in some parts of the image but adequate in the other parts of the image (Ji et al., 1994). There is another problem: when the contrast between large, low-frequency regions is enhanced by histogram equalization in the spatial domain, the small details are often lost (Laine et al., 1995). To solve the above problems, we propose transform-based image enhancement within radially concentric zones.
The motivation for using zones comes from the fact that a very "short" transform coefficient length corresponds to homogeneous image blocks, a "medium" transform coefficient length corresponds to texture, and a "long" transform coefficient length corresponds to highly active image blocks. By using this method, we may achieve much more flexibility and control over the magnitude reductions in different regions within the frequency domain.
Figure 30. The Fourier enhancement via log-reduction, when the coefficients C3(p, s) are calculated for one fixed parameter α, β, or λ. (a) Surface of the enhancement measure (α = 0.8). (b) Image of the enhancement measure. (c) Surface of the enhancement measure (β = 0.8). (d) Surface of the enhancement measure (λ = 0.9).
Figure 31. (a) Surface of the enhancement measure (α = 0.8) for the Hadamard transform-based enhancement via log-reduction with the coefficients C3(p, s). (b) Image of the enhancement measure.
Figure 32. The 2-D Fourier transform-based enhancement of (a) the image via the operator O with coefficients (b) C3(p, s) and (c) C1(p, s), when α = 0.8, λ = 1.5, and β = 0.8, and the histograms (d–f) of the images (a–c), respectively.
By using increasingly greater reductions in higher frequencies, we manage to attenuate the high-frequency noise component of the image. At the same time, we also maintain the edge-enhancing effects of the magnitude reduction algorithm. We now demonstrate zonal transform-based image enhancement on examples. We first find the maximum and minimum values within the frequency domain data. Then, using these maximum and minimum points as end-markers, we divide the frequency domain into regions based on each point's magnitude distance from the maximum and minimum. We set distance dividers between the maximum and minimum points, which divide the frequency domain into regions. Each region has a specified magnitude reduction value, α, and log-magnitude reduction value, β. We next determine the four pairs of α and β values, as well as three values of the distance, to specify our magnitude reductions. These values can be determined by means of the enhancement measure.
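The zoning procedure above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' algorithm: only the magnitude reduction exponent α is varied per zone (the log-magnitude parameter β is omitted), the "distance" is taken as the magnitude normalized between the minimum and maximum, and the three dividers and four α values are hypothetical placeholders. The function name `zonal_alpha_rooting` is introduced here for illustration.

```python
import numpy as np

def zonal_alpha_rooting(img, dividers=(0.25, 0.5, 0.75),
                        alphas=(0.85, 0.90, 0.95, 1.0)):
    """Apply a different magnitude-reduction exponent in each magnitude zone.

    dividers: three hypothetical distance dividers in (0, 1)
    alphas:   four hypothetical exponents, one per zone (zone 0 = smallest magnitudes)
    """
    X = np.fft.fft2(img)
    mag = np.abs(X)
    lo, hi = mag.min(), mag.max()
    # Normalized magnitude distance from the minimum, in [0, 1].
    d = (mag - lo) / (hi - lo) if hi > lo else np.zeros_like(mag)
    zone = np.digitize(d, dividers)            # zone index 0..3 for each coefficient
    a = np.asarray(alphas, dtype=float)[zone]  # per-coefficient exponent
    safe = np.where(mag > 0, mag, 1.0)         # avoid 0 ** negative
    enhanced = np.real(np.fft.ifft2(safe ** (a - 1.0) * X))
    return enhanced, zone

# Placeholder data: a random image stands in for a real test image.
rng = np.random.default_rng(1)
image = rng.uniform(1.0, 255.0, size=(32, 32))
enhanced, zone = zonal_alpha_rooting(image)
print("coefficients per zone:", np.bincount(zone.ravel(), minlength=4))
```

In practice the dividers and exponents would be chosen by scanning candidate values and keeping those that maximize the enhancement measure, as the text indicates.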
Figure 33. The 2-D Fourier transform-based enhancement of (a) the original image and results (b–d) of the enhancement, respectively, for the coefficients C1(p, s), C2(p, s), and C3(p, s) with α = 0.9, λ = 1.5, and β = 0.8.
1. Enhancement Algorithm with Two Zones

We consider the following transform-based image enhancement algorithms with two zones, a1 and a2.

(a) The enhancement operator O(X(p, s)) can be defined by

\[ O(X(p,s)) = \begin{cases} C_3(p,s)\,X(p,s), & \text{if } (p,s) \in a_1, \\ C_4(p,s)\,X(p,s), & \text{if } (p,s) \in a_2. \end{cases} \tag{25} \]

The zone a2 must be very small, and the coefficient C4(p, s) must be zero or close to zero.

(b) The enhancement operator O(X(p, s)) can be defined by

O(X(p, s)) = C3(p, s) [X(p, s) k a1, if (p, s) ∈ a1;   k X(p, s), if (p, s) ∈ a2   (26)
where k is a constant (which can be the mean of all |X(p, s)|, (p, s) ∈ a2), a1 = c6 lnb (lng N), N is the size of the input signal, and c6, b, and g are constants (and g 0.5). As an example, Figure 34 illustrates the curves of the enhancement measure of the image obtained with two zones and the Fourier transform-based enhancement method for the operators C1(p, s) (in part a), C2(p, s) (in part b), and C3(p, s) (in part c), for the parameters α = 0.8, β = 0.8, and λ = 1.5. The varying parameter for the curves is the radius r of the first zone, by which the area of the Fourier spectrum is divided. The second zone is the rest of the area. Figure 34 shows that as the radius of the zone increases, the measure of image enhancement grows faster with the coefficient C2(p, s) than with C3(p, s). Figure 35 illustrates an example of image enhancement by two zones with a varying radius parameter r. As can be observed from the experimental results, the proposed algorithm effectively enhances the overall contrast and
Figure 34. The curves of the Fourier transform-based image enhancement (a), (b), and (c), by two zones, when using the coefficients C1, C2, and C3, respectively.
Figure 35. (a) The curve of the Fourier transform-based image enhancement by two zones and (b), (c), and (d) the results of image enhancement, when radius of the first zone is 32, 64, and 127, respectively.
sharpness of the test images. Many details that could not be seen in the test image have been clearly revealed.

(c) The enhancement operator O(X(p, s)) can be defined by

\[ O(X(p,s)) = \begin{cases} C_3(p,s)\,X(p,s), & \text{if } S(p,s) \ge a, \\ C_0(p,s)\,X(p,s), & \text{if } S(p,s) < a, \end{cases} \tag{27} \]

where X(p, s) is the magnitude of the transform of the image, a is a threshold, C0(p, s) is a very small constant (or zero), and S(p, s) is defined as S(p, s) = f1(X(p, s))/f2(DC), where f1 and f2 are real functions and DC is the zero-frequency (DC) coefficient. For instance, S(p, s) = 2X(p, s)/DC when f1(x) = 2x and f2(x) = x.
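A minimal sketch of the thresholded two-zone operator (27) is given below, under stated assumptions. Since the coefficients C3(p, s) are defined earlier in the chapter, the sketch substitutes the simpler magnitude reduction |X|^(α − 1) as a stand-in for them; the choice S = 2|X|/DC follows the instance given in the text. The function name `two_zone_by_threshold` and the numerical values are hypothetical. Note that for a nonnegative image this particular S never exceeds 2, so the threshold has to be chosen on that scale.

```python
import numpy as np

def two_zone_by_threshold(img, a, alpha=0.8, c0=0.0):
    """Keep and modify coefficients with S >= a; scale the others by c0."""
    X = np.fft.fft2(img)
    mag = np.abs(X)
    dc = mag[0, 0]                       # zero-frequency magnitude
    if dc == 0:
        raise ValueError("DC component is zero; S(p, s) is undefined")
    S = 2.0 * mag / dc                   # S = f1(|X|) / f2(DC), f1(x) = 2x, f2(x) = x
    safe = np.where(mag > 0, mag, 1.0)
    c_keep = safe ** (alpha - 1.0)       # stand-in for C3(p, s) (assumption)
    keep = S >= a
    coeff = np.where(keep, c_keep, c0)
    return np.real(np.fft.ifft2(coeff * X)), keep

# Placeholder data and a hypothetical threshold.
rng = np.random.default_rng(3)
image = rng.uniform(1.0, 255.0, size=(32, 32))
enhanced, keep = two_zone_by_threshold(image, a=0.01)
print("fraction of coefficients kept:", keep.mean())
```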
Figures 36 and 37 illustrate examples of image enhancement by using the Fourier and Hadamard transforms and two zones. The threshold is a = 16, and the parameter β takes the values 0.5, 0.6, 0.7, 0.8, and 1.

2. Enhancement Algorithm with Three Zones

For enhancement of an image by using three zones, the enhancement operator O(X(p, s)) can be defined by

\[ O(X(p,s)) = \begin{cases} \alpha X(p,s), & \text{if } 0 \le X(p,s) < a_2, \\ \beta\,[X(p,s) - a_2] + g_a, & \text{if } a_2 \le X(p,s) < a_1, \\ \gamma\,[X(p,s) - a_1] + g_b, & \text{if } a_1 \le X(p,s). \end{cases} \tag{28} \]

This is a typical contrast stretching transform, which can be applied in the frequency domain.
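The three-zone operator (28) can be sketched as a piecewise-linear map applied to the spectral magnitudes, with the phase left unchanged. The sketch below makes one assumption that the text does not state: the offsets are chosen so that the map is continuous at the two thresholds, as is usual for contrast stretching. The function names, the slope names `s_low`, `s_mid`, `s_high`, and the numerical values are hypothetical.

```python
import numpy as np

def stretch(x, a2, a1, s_low, s_mid, s_high):
    """Piecewise-linear stretching with thresholds a2 < a1 and three slopes."""
    x = np.asarray(x, dtype=float)
    g_a = s_low * a2                     # value at x = a2 (continuity assumption)
    g_b = s_mid * (a1 - a2) + g_a        # value at x = a1 (continuity assumption)
    return np.where(x < a2, s_low * x,
                    np.where(x < a1, s_mid * (x - a2) + g_a,
                             s_high * (x - a1) + g_b))

def stretch_spectrum(img, a2, a1, s_low, s_mid, s_high):
    """Apply the stretching to the DFT magnitudes and keep the phase."""
    X = np.fft.fft2(img)
    new_mag = stretch(np.abs(X), a2, a1, s_low, s_mid, s_high)
    phase = np.exp(1j * np.angle(X))
    return np.real(np.fft.ifft2(new_mag * phase))

# Small worked example of the map itself.
print(stretch([0, 10, 20, 40], a2=10, a1=20, s_low=0.5, s_mid=2.0, s_high=1.0))
```

With all three slopes equal to 1 the map reduces to the identity, which gives a convenient sanity check before trying other values.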
Figure 36. The 2-D Fourier transform-based image enhancement by two zones (a = 16).
Figure 37. The 2-D Hadamard transform-based image enhancement by two zones (a = 16).
E. Negative α-Rooting Method

In many cases, a desired quality or enhancement of an image can be achieved by processing the negative image. The whole technique of transform-based enhancement can be applied to enhance the negative image, after which the negative of the enhanced negative image is composed. For instance, we consider the following enhancement, which we call the negative optimal α-rooting method. We consider a discrete image x(n, m) of size N × N, where N > 1.

Step 1: Calculate the negative image

\[ y(n,m) = 255 - x(n,m), \qquad n, m = 0 : (N-1), \tag{29} \]

where it is assumed that the intensity of the image varies in the interval [0, 255].

Step 2: For each value of α varying in the interval (0, 1), estimate the enhancement measure EME(α) = EME[ŷ_α], where ŷ_α is the result of processing the image by α-rooting:

\[ y(n,m) \xrightarrow{\;\mathcal{F}\;} Y(p,s) \to C_1(p,s)\,Y(p,s) = |Y(p,s)|^{\alpha-1}\,Y(p,s) \xrightarrow{\;\mathcal{F}^{-1}\;} \hat{y}_\alpha(n,m). \tag{30} \]

The 2-D discrete Fourier transformation F is used in this algorithm, but any other unitary transform can also be used.

Step 3: Find the optimal value α0 that provides the maximum of the enhancement measure

\[ EME(\alpha_0) = \max_{\alpha \in (0,1)} EME(\alpha). \tag{31} \]

Step 4: Process the negative image y(n, m) by the method of α0-rooting.

Step 5: Calculate the negative of the obtained image ŷ_{α0}(n, m):

\[ \hat{x}_{\alpha_0}(n,m) = A\,[L - \hat{y}_{\alpha_0}(n,m)], \tag{32} \]
where L is the maximum gray level of the image ŷ_{α0}(n, m) and A is the scale coefficient A = 255/max(L − ŷ_{α0}(n, m)). As an example, we consider the enhancement of a low-resolution image x(n, m), n, m = 1 : 512, shown in Figure 38a. The enhancement measure of the image equals EME[x] = 7.958, when applying the discrete Fourier transform and splitting blocks of size 5 × 5. The curve of the image enhancement measure EME(α) by the Fourier transform is shown in part c. At the point α = 0.61, the α-rooting results in the image shown in part b, which has the same measure of enhancement EME. The result has a low gray-level resolution and can be considered as a binary (or thresholded) image with enhanced contours. All small details in the image have been enhanced. Figure 39 shows the negative image y(n, m) = 255 − x(n, m) in part a, along with the enhanced negative image obtained by α-rooting with α = 0.80 in part b, and the negative of the processed negative image in part c. The curve of the negative image enhancement measure by the Fourier transform is shown in part d. The parameter α = 0.80 is considered to be an optimal parameter for enhancement of the negative image by the α-rooting method. The enhancement measure EME of the negative image equals 4.6112. The measure increases by about 3 times and equals 13.4305 for the enhanced negative image. The negative of the enhanced negative image has a measure equal to 6.5244 and is shown in Figure 39c. Thus, the image has been enhanced by processing its negative image. For comparison, we consider the direct method of image enhancement. The image enhancement measure EME(α) has only one peak, at the point α = 0.85, where the measure takes the value 12.6548. The parameter α1 = 0.85 is considered to be an optimal parameter for enhancement of the image by the α-rooting method, with respect to the Fourier transform. Figure 40 shows the original image in part a, along with the image x̂(n, m) enhanced by
Figure 38. (a) Image of size 512 × 512 with enhancement measure EMEI = 7.958. (b) Image enhanced by 0.61-rooting. (c) Curve of the enhancement measure EME(α) with respect to the Fourier transform and splitting blocks of size 5 × 5. The maximum of the measure is located at the point 0.85.
0.85-rooting in b. The enhancement is estimated as EME(0.85) = 12.6548. It should be noted that the intensity of the enhanced image x̄(n, m) does not exceed the intensity of the original image at any pixel. The difference between these images, x(n, m) − x̄(n, m), is shown in part c. We can note that the image x̂_{α0}(n, m), α0 = 0.80, processed by negative α0-rooting looks better than the image processed by the direct α1-rooting method. Moreover, we may further improve the enhanced image by changing the scale of the image with different gray-scale transformations. As an example, part d shows the enhancement by the following log-power transformation of the enhanced image x̂_{α0}(n, m):

\[ \hat{x}_{\alpha_0}(n,m) \to \log_2\!\left[\hat{x}_{\alpha_0}(n,m)^{\,b} + 1\right], \qquad n, m = 1 : 512, \]
Figure 39. (a) Negative image with enhancement measure EMEI = 4.6112. (b) Negative image enhanced by 0.80-rooting with enhancement measure equal to 13.4305. (c) Image obtained from the enhanced negative image. (d) Curve of the negative image enhancement measure EME(α) with respect to the Fourier transform and splitting blocks of size 5 × 5.
where the value of b is chosen to be 6.50. The enhancement measure of this image after the log-power transformation becomes 15.5145 (see Figure 40d), which exceeds the enhancement obtained by the direct method of α1-rooting, where α1 = 0.85. We also consider the enhancement of the image shown in Figure 41a. The enhancement measure of the image equals EME[x] = 16.1165, when applying the Fourier transform and splitting blocks of size 5 × 5. The curve of the image enhancement by the Fourier transform is shown in part c. The image enhancement measure has a peak at the point α = 0.98, where the measure of enhancement becomes 19.6667. The parameter α = 0.98 is considered to be an optimal parameter for enhancement of the image by the α-rooting method with respect to the Fourier transform. The image enhanced
Figure 40. (a) Original image. (b) Image enhanced by 0.85-rooting. (c) Difference between the original and optimal enhanced images. (d) Enhanced image when processing by log-power transformation after the negative α0-rooting. The measure of enhancement for images (a–d) equals, respectively, 7.9580, 12.6548, 7.1104, and 15.5146.
by the optimal parameter is shown in part b. We can observe a small enhancement of the image. The enhancement measure of the original image already has a high value, and applying the optimal α-rooting raises the enhancement of the image by only about 3.5 units of EME. Figure 42a shows the image of Figure 41 after processing by the negative optimal α-rooting method. The negative image has a measure EME equal to 10.89, and the enhanced negative has a measure of 19.02. The curve of the negative image enhancement by the Fourier transform is shown in part b. The parameter α = 0.93 is considered to be an optimal parameter for enhancement of the negative image by the α-rooting method. The processed image has a measure equal to 14.65, which is less than that of the original image and of the image obtained by the direct method of α-rooting shown in Figure 41b. However, many details of the background can be seen better in the image processed by the negative α-rooting.
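Steps 1 through 5 of the negative optimal α-rooting method in this section can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the enhancement measure is assumed to be the block-wise average of 20·log10(max/min) with negative values clipped to zero before measurement, the optimal α is searched on a finite grid, and the names `eme`, `alpha_root`, and `negative_alpha_rooting` as well as the test data are hypothetical.

```python
import numpy as np

def eme(img, block=5, eps=1e-6):
    """Assumed measure: block-wise average of 20*log10(max/min), values clipped at 0."""
    img = np.clip(img, 0.0, None)
    h, w = img.shape
    ratios = [20.0 * np.log10((img[i:i + block, j:j + block].max() + eps) /
                              (img[i:i + block, j:j + block].min() + eps))
              for i in range(0, h - block + 1, block)
              for j in range(0, w - block + 1, block)]
    return float(np.mean(ratios))

def alpha_root(img, alpha):
    """Equation (30): multiply the 2-D DFT by |Y|^(alpha - 1) and invert."""
    Y = np.fft.fft2(img)
    mag = np.abs(Y)
    coeff = np.where(mag > 0, mag, 1.0) ** (alpha - 1.0)
    return np.real(np.fft.ifft2(coeff * Y))

def negative_alpha_rooting(x, alphas, measure=eme):
    y = 255.0 - x                                          # Step 1, eq. (29)
    scores = [measure(alpha_root(y, a)) for a in alphas]   # Step 2
    a0 = float(alphas[int(np.argmax(scores))])             # Step 3, eq. (31) on a grid
    y_hat = alpha_root(y, a0)                              # Step 4
    L = y_hat.max()                                        # Step 5, eq. (32)
    neg = L - y_hat
    A = 255.0 / max(neg.max(), 1e-12)
    return A * neg, a0

# Placeholder data: a random image stands in for a real test image.
rng = np.random.default_rng(5)
image = rng.integers(0, 256, size=(40, 40)).astype(float)
result, a0 = negative_alpha_rooting(image, np.round(np.arange(0.5, 1.0, 0.05), 2))
print("alpha_0 =", a0, "output range:", result.min(), result.max())
```

By construction, the output of Step 5 spans [0, 255], so it can be compared directly with an image produced by the direct method.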
Figure 41. (a) Image with enhancement measure EMEI = 16.1165. (b) Image enhanced by 0.98-rooting. (c) Curve of the enhancement measure EME(α), α ∈ (0, 1), with respect to the Fourier transform and splitting block size 5 × 5. The maximum of the enhancement measure is located at the point α0 = 0.98.
In the next part, using the example of the modified α-rooting method of image enhancement, we show how to apply the concept of a vector representation of an image to transform-based enhancement. The image is represented uniquely by a set of one-dimensional signals that are called image-signals, and then one or a few image-signals are processed. Even by processing only one image-signal, we can achieve good results for image enhancement. This approach gives us new tools to achieve the desired results. It will be shown that rather than using zones in the form of circles or other figures, one can consider special cyclic groups or their unions as "zones" at whose samples the spectrum can be processed for image enhancement.
Figure 42. (a) Image obtained from the negative image enhanced by 0.93-rooting. (b) Curve of the negative image enhancement measure EME(α), α ∈ (0, 1), with respect to the Fourier transform and splitting blocks of size 5 × 5.
III. Tensor Method of Image Enhancement

In this part, a new method of image enhancement is introduced. The method is based on the tensor (or vectorial) representation of a two-dimensional image with respect to the Fourier transform (Grigoryan, 1984, 1986, 1991, 2002; Grigoryan and Agaian, 2001). The image is defined as a set of one-dimensional (1-D) signals that split the Fourier transform into a set of 1-D transforms. As a result, the problem of image enhancement is reduced to the processing of 1-D splitting signals. The splitting yields a simple model for image enhancement in which, by using only a few image-signals, it is possible to achieve an enhancement of images that is comparable to the
known class of the frequency-domain-based parametric image enhancement algorithms that are used for object detection and visualization. Based on the quantitative measures EME and EMEE, the best parameters for image enhancement can be found separately for each image-signal to be processed. To illustrate the idea of the tensor representation, we consider the two-dimensional case N × N. Suppose σ = (T) is a covering of the lattice X_{N,N} on which the considered image f is defined. We are interested in constructing coverings σ for which the following property holds: any 2-D image f defined on X can be represented as a set of 1-D sequences (or signals) f_T such that the 1-D DFT of each f_T coincides with the 2-D DFT of f at the samples of the corresponding set T. As an example, Figure 43 shows a transformation w of a 3 × 3-point sequence f into four 1-D sequences f_{T_k}, k = 1 : 4, whose DFTs define the 2-D DFT of f at the samples of the sets T_k that completely fill X_{3,3}. It is assumed that such a transformation w exists. For comparison, Figure 44 illustrates the traditional method of calculating the 3 × 3-point 2-D DFT. In this method, three 3-point DFTs are applied over the rows of the image, then the obtained 2-D data are transposed, and three 3-point DFTs are applied again over the rows. Thus, six 3-point DFTs are required in the traditional algorithm, and four in the diagram of Figure 43.
Figure 43. Diagram of the transformation of the 3 × 3-point sequence f into four signals f_{T_1}, f_{T_2}, f_{T_3}, and f_{T_4}, whose 1-D DFTs completely define the 2-D DFT of f.
Figure 44. Diagram of the calculation of the 2-D DFT of the 3 × 3-point sequence f by six 1-D DFTs.
A. Splitting Image-Signals

The concept of the tensor, or vectorial, representation of an image yields new forms of image description as 3-D data, each element of which is a weight of the sinusoidal wave with a corresponding frequency. In the spectral domain, such a representation yields the decomposition of the two-dimensional Fourier transform of the image into a set of 1-D transforms of image-signals. To simplify the discussion, we describe the case of an N × N-point DFT. However, the concepts discussed here are not limited to the N × N-point DFT. They apply to the Fourier transform of arbitrary order N × M. The N × N-point DFT of an image f, accurate to the normalizing factor 1/N, is determined as

\[ F_{p_1,p_2} = (\mathcal{F}_{N,N} f)_{p_1,p_2} = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} f_{n_1,n_2}\, W^{n_1 p_1 + n_2 p_2} \tag{33} \]
where W ¼ expð2pj=NÞ. The points (p1, p2) are from the the fundamental period X of the transformation FN,N, i.e. X ¼ XN;N ¼ fðp1 ; p2 Þ; p1 ; p2 ¼ 0 : ðN 1Þg: The designation p ¼ 0 : (N 1) denotes p as an integer that runs from 0 to (N 1). It was shown (Grigoryan, 1984, 1991, 2001) that the fundamental period X can be divided by a family of sets s ¼ ðTk Þk¼1:n , n > 1, in a way that the 2-D Fourier transform of the two-dimensional image f at each set Tk becomes an image of the one-dimensional Mk-point Fourier transform, FMk, of an one-dimensional signal, f (k). The lengths Mk of the transforms equal to the cardinality of sets Tk. This supposition means that: (a) 1-D unitary transforms FMk determine the 2-D transform F and compose uniquely the splitting of the two-dimensional transform F ½ f $ F M1 ½ f ð1Þ ; F M2 ½ f ð2Þ ; . . . ; F Mn ½ f ðnÞ : ð34Þ
(b) The family of signals $f^{(k)}$, $k = 1:n$, which we define to be image-signals, describes the image $f$ completely:

$$\{f^{(1)}, f^{(2)}, \ldots, f^{(n)}\} \leftrightarrow f.$$

The family $\sigma$ of sets is a covering of $X$. In the general case, the concept of splitting a 2-D unitary transform by the covering $\sigma$ is defined in the following way. We use the notation $|_T$ for the restriction of data to $T$ and $\operatorname{card} \sigma$ for the cardinality of $\sigma$.

Definition 6. A two-dimensional discrete unitary transformation $\mathcal{F}_{N,N}$ is said to be revealed by the covering $\sigma$ if, for each $T \in \sigma$, there exist a 1-D unitary transform $A = A(T)$ and a sequence $f_T$ such that

$$(\mathcal{F}_{N,N} f)|_T = A f_T. \tag{35}$$
The set of the 1-D transformations $\{A(T);\ T \in \sigma\}$ is called a $\sigma$-splitting of the transform $\mathcal{F}_{N,N}$ by the covering $\sigma$. We define the set of 1-D sequences $\{f_T;\ T \in \sigma\}$ to be the $\sigma$-representation of $f$ with respect to $\mathcal{F}_{N,N}$. The splitting of the 2-D Fourier transformation $\mathcal{F}_{N,N}$ consists of 1-D Fourier transformations, i.e., $A(T) = \mathcal{F}_M$, where $M$ is the cardinality of $T$. The computation of the transform $\mathcal{F}_{N,N} f$ can thus be organized as follows.

Algorithm for calculating $\mathcal{F}_{N,N} f$
Step 1: Construct the image-signals $f_T$.
Step 2: Calculate the 1-D transforms $\mathcal{F}_M f_T$.
Step 3: Fill the 2-D DFT by using the calculated 1-D DFTs at the samples of the sets $T$.

The 2-D DFT may be split in different ways into a set of short 1-D Fourier transforms. Therefore, the image may be represented by different sets of 1-D image-signals. For purposes of image processing, we are interested in a set of image-signals that allows the image processing task at hand to be performed effectively through procedures over the image-signals. Here, we focus only on the procedure of image enhancement.
B. Tensor Representation of the Image

The tensor representation of the image relates to the following covering of the fundamental period $X$. Let $\sigma = (T)$ be a covering of $X$ that is defined by the following cyclic groups with generators $(p_1, p_2) \in X$:

$$T_{p_1,p_2} = \{(0,0), (p_1,p_2), (\overline{2p_1}, \overline{2p_2}), \ldots, (\overline{kp_1}, \overline{kp_2})\}, \quad k = \operatorname{card} T - 1, \qquad T_{0,0} = \{(0,0)\} \tag{36}$$
where we denote by $\overline{p}$ the number $p \bmod N$. As an example, Figure 45 shows the arrangement of the elements of the groups $T_{1,1}$, $T_{0,1}$, and $T_{1,0}$. When the generator is (1,1), all elements of the group $T_{1,1}$ are situated along the diagonal of the lattice. All elements of the group $T_{0,1}$ are situated on the first row of the lattice, and the elements of the group $T_{1,0}$ are situated on the first column of the lattice. In the general case, the elements of the subset $T_{p_1,p_2}$ lie on $l$ parallel lines at an angle of $\theta = \tan^{-1}(p_1/p_2)$ to the horizontal axis. The number $l$ is determined as follows. If $p_1 = 0$ or $p_2 = 0$, then $l = 1$. For other cases, let $k_1$ and $k_2$ be the smallest positive integers satisfying, respectively, the relations $\overline{k_1 p_1} = 0$ and $\overline{k_2 p_2} = 0$. Then $l = k_1/k_2$ when $k_1 \ge k_2$, and $l = k_2/k_1$ when $k_1 < k_2$. It follows directly from the cyclicity of the groups in Eq. (36) that the irreducible covering $\sigma$ of the domain $X$ composed of the groups $T_{p_1,p_2}$ is unique. For instance, Figure 46 shows the four groups that compose the covering $\sigma = (T_{1,1}, T_{0,1}, T_{2,1}, T_{1,0})$ of the $3 \times 3$ lattice. The group $T_{p_1,p_2}$ with any other generator $\ne (0,0)$, different from the generators (0,1), (1,1), (2,1), and (1,0), coincides with one of the groups of the covering $\sigma$. For example, $T_{2,2} = T_{1,1}$ and $T_{1,2} = T_{2,1}$.
Figure 45. Arrangement of the elements of the groups $T_{1,1}$, $T_{0,1}$, and $T_{1,0}$ on the square lattice $X_{8,8}$.
Figure 46. Period ($3 \times 3$) and four sets of samples.
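The $3 \times 3$ example of Figure 46 can be checked by enumerating the cyclic groups directly. The following is a small sketch in plain Python; the function name `cyclic_group` and the dictionary `covering` are illustrative names of ours.

```python
def cyclic_group(p1, p2, N):
    """Elements (k*p1 mod N, k*p2 mod N), k = 0..N-1, generated by (p1, p2)."""
    return frozenset(((k * p1) % N, (k * p2) % N) for k in range(N))

N = 3
covering = {g: cyclic_group(*g, N) for g in [(1, 1), (0, 1), (2, 1), (1, 0)]}

# The four groups together fill the 3 x 3 lattice.
lattice = {(a, b) for a in range(N) for b in range(N)}
print(set().union(*covering.values()) == lattice)

# Other nonzero generators reproduce one of the four groups.
print(cyclic_group(2, 2, N) == covering[(1, 1)])
print(cyclic_group(1, 2, N) == covering[(2, 1)])
```

Each of the three checks prints `True`, in agreement with the statements $T_{2,2} = T_{1,1}$ and $T_{1,2} = T_{2,1}$.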
The following property holds for the Fourier transform (Grigoryan, 1984, 2001):

$$F_{p_1,p_2} = \sum_{t=0}^{N-1} f_{p_1,p_2,t} W^{t} \tag{37}$$

and, in the general case,

$$F_{\overline{kp_1},\overline{kp_2}} = \sum_{t=0}^{N-1} f_{p_1,p_2,t} W^{kt}, \quad k = 0:(N-1), \tag{38}$$

where

$$f_{p_1,p_2,t} = \sum_{(n_1,n_2) \in V_{p_1,p_2,t}} f_{n_1,n_2}. \tag{39}$$

The sets $V$ are defined by

$$V_{p_1,p_2,t} = \{(n_1, n_2);\ n_1 p_1 + n_2 p_2 = t \bmod N\}, \quad t = 0:(N-1). \tag{40}$$
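To prove Eq. (38), we note that the sets $V_{p_1,p_2,t_1}$ and $V_{p_1,p_2,t_2}$ are pairwise disjoint, that is, $V_{p_1,p_2,t_1} \cap V_{p_1,p_2,t_2} = \emptyset$ for $t_1 \ne t_2$, and that their union over $t = 0:(N-1)$ is the whole lattice $X$. Therefore, owing to the periodicity property of the Fourier transform, we obtain

$$\sum_{t=0}^{N-1} f_{p_1,p_2,t} W^{kt} = \sum_{t=0}^{N-1} \Bigl( \sum_{V_{p_1,p_2,t}} f_{n_1,n_2} \Bigr) W^{tk} = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} f_{n_1,n_2} W^{(n_1 p_1 + n_2 p_2)k} = F_{kp_1,kp_2} = F_{\overline{kp_1},\overline{kp_2}}.$$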
The set $V_{p_1,p_2,t}$, if it is not empty, is the set of points $(n_1, n_2)$ along a maximum of $p_1 + p_2$ parallel straight lines at an angle of $\psi = \tan^{-1}(p_2/p_1)$ to the horizontal axis. The equations for the lines are

$$\left. \begin{aligned} x p_1 + y p_2 &= t \\ x p_1 + y p_2 &= t + N \\ &\;\;\vdots \\ x p_1 + y p_2 &= t + (p_1 + p_2 - 1)N. \end{aligned} \right\} \tag{41}$$

In the square domain $Y = [0, N] \times [0, N]$, the lines lie at the angle $\psi = \tan^{-1}(p_2/p_1)$ to the horizontal axis.
Figure 47. The elements of the set $V_{1,2,2}$ lie on the three straight lines $x \cdot 1 + y \cdot 2 = 2$, $x \cdot 1 + y \cdot 2 = 10$, and $x \cdot 1 + y \cdot 2 = 18$. Therefore, $f_{1,2,2} = (f_{0,1} + f_{2,0}) + (f_{0,5} + f_{2,4} + f_{4,3} + f_{6,2}) + (f_{4,7} + f_{6,6})$.
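The points of $V_{1,2,2}$ on the $8 \times 8$ lattice can be enumerated directly from the definition in Eq. (40). The sketch below is plain Python; the helper `V` and the dictionary `lines` are illustrative names, and the points are grouped by the right-hand side of the line $x + 2y = t + mN$ on which they fall.

```python
def V(p1, p2, t, N):
    """Lattice points (n1, n2) with n1*p1 + n2*p2 = t (mod N)."""
    return [(n1, n2) for n1 in range(N) for n2 in range(N)
            if (n1 * p1 + n2 * p2) % N == t]

points = V(1, 2, 2, 8)

# Group the points by the value of x*1 + y*2, i.e. by the line they lie on.
lines = {}
for n1, n2 in points:
    lines.setdefault(n1 + 2 * n2, []).append((n1, n2))

for rhs in sorted(lines):
    print(rhs, lines[rhs])
```

The three groups printed are the pairs of indices that appear in the three bracketed sums of $f_{1,2,2}$ in the caption of Figure 47.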
As an example, Figure 47 shows the elements of the set $V_{1,2,2}$ on the discrete lattice $X_{8,8}$, which lie on three lines. Two points of the set $V_{1,2,2}$ are on the line $x + 2y = 2$, four points are on the line $x + 2y = 10$, and two points are on the line $x + 2y = 18$. Thus, we can say that all samples of the set $V_{p,s,t}$ lie on the family, $\mathcal{L}_t = \mathcal{L}_{p,s,t}$, of parallel rays passing along samples of the discrete lattice $X_{N,N}$ traced on the initial image. The covering $\sigma = (T_{p_1,p_2})$ reveals the 2-D DFT, and the image-signals are defined by

$$f_T = f_{T_{p_1,p_2}} = \{f_{p_1,p_2,0}, f_{p_1,p_2,1}, \ldots, f_{p_1,p_2,N-1}\}. \tag{42}$$

This means that, for each set $T \in \sigma$,

$$(\mathcal{F}_{N,N} f)|_T = \mathcal{F}_N f_T. \tag{43}$$
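Equation (43) can be checked numerically for a single group. The sketch below assumes NumPy and uses its unnormalized FFT, which matches Eq. (33) with $W = \exp(-2\pi j/N)$; the choice $N = 8$, generator $(1, 2)$, and a random test image are ours.

```python
import numpy as np

N, p1, p2 = 8, 1, 2
rng = np.random.default_rng(1)
f = rng.random((N, N))

# Image-signal components f_{p1,p2,t}: sums of f over the sets V_{p1,p2,t}.
n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
t = (n1 * p1 + n2 * p2) % N
fT = np.bincount(t.ravel(), weights=f.ravel(), minlength=N)

# 2-D DFT restricted to the group T_{p1,p2} versus the 1-D DFT of f_T.
k = np.arange(N)
F2 = np.fft.fft2(f)
restricted = F2[(k * p1) % N, (k * p2) % N]
print(np.allclose(restricted, np.fft.fft(fT)))
```

Here `np.bincount` with `weights` accumulates the image values that share the same $t$, which is one compact way to form the sums of Eq. (39).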
The 2-D DFT is split by this covering into a set of 1-D transformations $\{\mathcal{F}_N, \mathcal{F}_N, \ldots, \mathcal{F}_N\}$. The totality $\{f_T;\ T \in \sigma\}$ is called a tensor, or vectorial, representation of $f$ with respect to the Fourier transformation, and the transformation $\chi: f \to \{f_T;\ T \in \sigma\}$ is called a tensor transformation. Each image-signal $f_{T_{p_1,p_2}}$ determines the 2-D DFT at all samples of the group $T_{p_1,p_2}$, and the following one-to-one correspondence holds:

$$f_{T_{p_1,p_2}} \leftrightarrow \{F_{0,0}, F_{p_1,p_2}, F_{\overline{2p_1},\overline{2p_2}}, \ldots, F_{\overline{(N-1)p_1},\overline{(N-1)p_2}}\}. \tag{44}$$
Figure 48 illustrates the clock image of size $256 \times 256$ in part (a), along with the image-signal $f_{T_{1,3}}$ of length 256 in part (b), the 1-D DFT over this
Figure 48. (a) Clock-on-moon image. (b) Image-signal corresponding to the group $T_{1,3}$. (c) Absolute value of the 1-D DFT of the image-signal (zero component is shifted to the center). (d) Arrangement of values of the 1-D DFT in the 2-D DFT of the image at points of the group $T_{1,3}$.
image-signal in part (c), and the samples of the subset $T_{1,3} \subset X_{256,256}$ at which the 2-D DFT of the image will be filled by the 1-D DFT in part (d). In general, the image $f$ is real, and each component $f_{p_1,p_2,t}$ is real and can be considered to be the amplitude, or weight, of the $\cos(\omega k)$- and $\sin(\omega k)$-waves with the frequency $\omega = 2\pi s/N$, where $k = 0:(N-1)$ and $s = \mathrm{g.c.d.}(p_1, p_2)$. Because the image-signals are constructed as discrete integrals (or image projections) along the parallel lines of Eq. (41), any processing of the image-signal $f_T$ changes the Fourier spectrum at the points of the corresponding group $T$. After performing the inverse 2-D discrete Fourier transform, the corresponding change will be observed in the spatial domain at points along the parallel lines of the sets $V_{p_1,p_2,t}$, $t = 0:(N-1)$. As an example, Figure 49 shows the clock-on-moon image after
Figure 49. Image processing by image-signals (a) $f_{T_{1,3}}$, (b) $f_{T_{1,1}}$, (c) $f_{T_{2,1}}$, and (d) $f_{T_{0,1}}$. Arrows show the direction of the parallel projections that compose the corresponding sets $V_{p_1,p_2,t}$, $t = 0:255$, for $(p_1, p_2) = (1, 3)$, $(1, 1)$, $(2, 1)$, and $(0, 1)$.
processing only one image-signal, namely, (a) the image-signal $f_{T_{1,3}}$, (b) the image-signal $f_{T_{1,1}}$, (c) the image-signal $f_{T_{2,1}}$, and (d) the image-signal $f_{T_{0,1}}$. To better illustrate the directions of the parallel lines of the corresponding sets $V_{p_1,p_2,t}$ on the image, the magnitudes of the image-signals have been amplified by a factor of four.

C. Construction of the Covering $\sigma$

Equation (38) shows that the splitting in Eq. (43) can be performed by the cyclic groups $T_{p_1,p_2}$. In other words, if $\sigma = (T_{p_1,p_2})$ is an irreducible covering of $X$, then $\sigma$ reveals the 2-D DFT. In this section, the construction of the irreducible covering of the lattice $X_{N,N}$ of size $N \times N$ is given.
We first define the set $B_N = \{n \in \{0, 1, \ldots, N-1\};\ \mathrm{g.c.d.}(n, N) > 1\}$ and the function $\beta(p)$ that is equal to the number of elements $s \in B_N$ that are coprime with $p$ and such that $ps < N$. Further, we denote by $\phi(N)$ the Euler function, that is, the number of positive integers that are smaller than $N$ and coprime with $N$.

Theorem 1. Given an arbitrary natural number $N > 1$, the following totality is the irreducible covering of the square lattice $X_{N,N}$:

$$\sigma_{N,N} = \bigl(T_{p_1,p_2}\bigr)_{(p_1,p_2) \in J} \tag{45}$$

where

$$J = \bigcup_{p_2=0}^{N-1} (1, p_2) \;\cup\; \bigcup_{p_1 \in B_N} (p_1, 1) \;\cup\; \Bigl( \bigcup_{p_1,p_2 \in B_N;\ \mathrm{g.c.d.}(p_1,p_2)=1;\ p_1 p_2 < N} (p_1, p_2) \Bigr). \tag{46}$$

The cardinality of the covering equals

$$\operatorname{card} \sigma_{N,N} = 2N - \phi(N) + \sum_{p \in B_N} \beta(p). \tag{47}$$

In particular, we obtain the following irreducible covering of the lattice $X_{N,N}$:

$$\sigma = \sigma_{N,N} = \bigl( (T_{1,p_2})_{p_2=0:N-1},\ T_{0,1} \bigr) \tag{48}$$

if $N$ is a prime, and

$$\sigma_{N,N} = \bigl( (T_{1,p_2})_{p_2=0:N-1},\ (T_{2p_1,1})_{p_1=0:N/2-1} \bigr) \tag{49}$$

if $N = 2^r$ $(r > 1)$. In the general case, when $N$ is a power of a prime $L \ge 2$, the irreducible covering can be taken as

$$\sigma_{N,N} = \bigl( (T_{1,p_2})_{p_2=0:(N-1)},\ (T_{Lp_1,1})_{p_1=0:N/L-1} \bigr). \tag{50}$$

Figure 50 illustrates the set of all image-signals that compose the tensor representation of the clock-on-moon image of Fig. 48a. The complete set of 384 image-signals is displayed in the order shown in the construction of the covering $\sigma_{N,N}$ for $N = 256$:

$$\sigma_{256,256} = \bigl( (T_{p_1,1})_{p_1=0:255},\ (T_{1,2p_2})_{p_2=0:127} \bigr). \tag{51}$$
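The coverings of Eqs. (48) and (49) can be combined with the three-step algorithm of Section A to compute the complete 2-D DFT from 1-D DFTs. The following is a minimal sketch, assuming NumPy and restricted to the two special cases ($N$ prime or $N = 2^r$); the function names are ours, and the general construction of Eq. (46) is deliberately not attempted.

```python
import numpy as np

def generators(N):
    """Generators of the covering for N = 2**r (Eq. 49) or N prime (Eq. 48).
    The caller is assumed to pass only such N."""
    gens = [(1, p2) for p2 in range(N)]
    if (N & (N - 1)) == 0:                      # power of two
        gens += [(2 * p1, 1) for p1 in range(N // 2)]
    else:                                       # assumed prime
        gens += [(0, 1)]
    return gens

def image_signal(f, p1, p2):
    """Step 1: components f_{p1,p2,t}, sums of f over n1*p1 + n2*p2 = t (mod N)."""
    N = f.shape[0]
    n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    t = (n1 * p1 + n2 * p2) % N
    return np.bincount(t.ravel(), weights=f.ravel(), minlength=N)

def dft2_from_image_signals(f):
    """Steps 2 and 3: 1-D DFTs of the image-signals, written into the 2-D DFT."""
    N = f.shape[0]
    k = np.arange(N)
    F = np.full((N, N), np.nan, dtype=complex)  # NaN marks samples not yet filled
    for p1, p2 in generators(N):
        F[(k * p1) % N, (k * p2) % N] = np.fft.fft(image_signal(f, p1, p2))
    return F

print(len(generators(7)), len(generators(256)))

rng = np.random.default_rng(0)
for N in (7, 8):
    f = rng.random((N, N))
    print(N, np.allclose(dft2_from_image_signals(f), np.fft.fft2(f)))
```

The counts printed first are $N + 1$ for $N = 7$ and $3N/2 = 384$ for $N = 256$. Because the output array starts out filled with NaN, the comparison with `np.fft.fft2` would fail if the groups left any sample of the lattice uncovered.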
Figure 50. 384 image-signals compose the tensor representation of the image with respect to the 2-D DFT.
Thus, image-signal number 1 is $f_{T_{0,1}}$, the image-signal $f_{T_{1,1}}$ has number 2, ..., the image-signal $f_{T_{255,1}}$ has number 256, the image-signal $f_{T_{1,0}}$ has number 257, the image-signal $f_{T_{1,2}}$ has number 258, the image-signal $f_{T_{1,4}}$ has number 259, ..., and the last image-signal, $f_{T_{1,254}}$, has number 384.

D. Properties of Image-Signals

We now consider properties of the image-signals. Let $G_f$ be the totality of all image-signals

$$G_f = \{f_T = f_{T_{p_1,p_2}};\ T_{p_1,p_2} \in \sigma_{N,N}\}. \tag{52}$$
In the case when $N > 1$ is a prime or a power of two, $G_f$ consists, respectively, of $N + 1$ or $3N/2$ image-signals. The following relations hold between components of the image-signals. If $N$ is a prime, then for any $p_1$, $p_2$, and $t$

$$\left. \begin{aligned} f_{\overline{kp_1},\overline{kp_2},\overline{kt}} &= f_{p_1,p_2,t}, \quad k \ne 0 \\ f_{N-p_1,N-p_2,t} &= f_{p_1,p_2,N-t} \\ f_{0,0,t} &= 0, \quad t \ne 0. \end{aligned} \right\} \tag{53}$$
The first relation in Eq. (53) shows a way to construct the image-signal $f_{T_{\overline{kp_1},\overline{kp_2}}}$ from $f_{T_{p_1,p_2}}$. Indeed, let $X_N = \{0, 1, \ldots, N-1\}$ be the set of integers modulo $N$. Then

$$\overline{k X_N} = \{\overline{kn};\ n \in X_N\} = X_N \quad (k \ne 0) \tag{54}$$
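and the image-signal $f_{T_{\overline{kp_1},\overline{kp_2}}}$ is the permutation of the signal $f_{T_{p_1,p_2}}$ that is inverse to the permutation

$$\begin{pmatrix} 0 & 1 & 2 & 3 & \cdots & N-1 \\ 0 & \overline{k} & \overline{2k} & \overline{3k} & \cdots & \overline{(N-1)k} \end{pmatrix}. \tag{55}$$

If $N$ is a power of two, then for any $p_1$, $p_2$, and $t$ the following holds:

$$\left. \begin{aligned} f_{\overline{2p_1},\overline{2p_2},\overline{2t}} &= f_{p_1,p_2,t} + f_{p_1,p_2,t+N/2} \\ f_{\overline{2p_1},\overline{2p_2},t} &= 0, \quad \mathrm{g.c.d.}(2p_1, 2p_2, t) = 1 \\ f_{N-p_1,N-p_2,t} &= f_{p_1,p_2,N-t} \\ f_{0,0,t} &= 0, \quad t \ne 0. \end{aligned} \right\} \tag{56}$$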
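The first relation of Eq. (53) and the first two relations of Eq. (56) can be verified numerically. The sketch below assumes NumPy; the helper `image_signal` and the particular values of $N$, $(p_1, p_2)$, and $k$ are illustrative choices of ours.

```python
import numpy as np

def image_signal(f, p1, p2):
    """Components f_{p1,p2,t}: sums of f over n1*p1 + n2*p2 = t (mod N)."""
    N = f.shape[0]
    n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    t = (n1 * p1 + n2 * p2) % N
    return np.bincount(t.ravel(), weights=f.ravel(), minlength=N)

rng = np.random.default_rng(2)

# Prime N: f_{kp1,kp2,kt} = f_{p1,p2,t} for k != 0 (indices taken mod N).
N, p1, p2, k = 7, 1, 3, 4
f = rng.random((N, N))
a = image_signal(f, p1, p2)
b = image_signal(f, (k * p1) % N, (k * p2) % N)
t = np.arange(N)
print(np.allclose(b[(k * t) % N], a))

# N = 2**r: f_{2p1,2p2,2t} = f_{p1,p2,t} + f_{p1,p2,t+N/2},
# and the components with odd t vanish.
N, p1, p2 = 8, 1, 3
f = rng.random((N, N))
a = image_signal(f, p1, p2)
c = image_signal(f, (2 * p1) % N, (2 * p2) % N)
half = np.arange(N // 2)
print(np.allclose(c[2 * half], a[half] + a[half + N // 2]))
print(np.allclose(c[1::2], 0.0))
```

In the prime case the indexing `b[(k * t) % N]` is exactly the reordering described by the permutation in Eq. (55).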
For a given $(p_1, p_2)$, the sets $V_{p_1,p_2,t}$, $t = 0:(N-1)$, compose a partition of $X$, and therefore

$$\sum_{t=0}^{N-1} f_{p_1,p_2,t} = P = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} f_{n,m} \tag{57}$$
i.e., the sum of the components of each image-signal is equal to the power $P$ of the image. Owing to the Parseval theorem, the energy associated with the image-signal $f_{T_{p_1,p_2}}$ equals

$$\varepsilon(f_{T_{p_1,p_2}}) = \sum_{t=0}^{N-1} f_{p_1,p_2,t}^2 = \frac{1}{N} \sum_{k=0}^{N-1} \bigl| F_{\overline{kp_1},\overline{kp_2}} \bigr|^2. \tag{58}$$
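Equations (57) and (58) are easy to confirm on a random test image. This is a sketch assuming NumPy and the unnormalized DFT of Eq. (33); the values $N = 8$ and $(p_1, p_2) = (3, 1)$ are arbitrary choices of ours.

```python
import numpy as np

N, p1, p2 = 8, 3, 1
rng = np.random.default_rng(3)
f = rng.random((N, N))

n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
t = (n1 * p1 + n2 * p2) % N
fT = np.bincount(t.ravel(), weights=f.ravel(), minlength=N)

# Eq. (57): the components of an image-signal sum to P, the sum of the image.
print(np.isclose(fT.sum(), f.sum()))

# Eq. (58): energy of the image-signal from the 2-D DFT on the group T_{p1,p2}.
k = np.arange(N)
F_on_group = np.fft.fft2(f)[(k * p1) % N, (k * p2) % N]
print(np.isclose(np.sum(fT ** 2), np.sum(np.abs(F_on_group) ** 2) / N))
```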
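In the case when $N$ is a prime, the totality $G_f$ contains $N + 1$ signals $f_{T_{1,0}}$, $f_{T_{1,1}}$, ..., $f_{T_{1,N-1}}$, and $f_{T_{0,1}}$. The sets $T \in \sigma$ have only one common point, (0,0). Therefore

$$\sum_{s=0}^{N-1} \varepsilon(f_{T_{1,s}}) + \varepsilon(f_{T_{0,1}}) = \sum_{p_1=0}^{N-1} \sum_{p_2=0}^{N-1} |F_{p_1,p_2}|^2 + N F_{0,0}^2 = \varepsilon(f) + N \Bigl( \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} f_{n,m} \Bigr)^2 = \varepsilon(f) + N P^2. \tag{59}$$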
The total energy associated with all signals of $G_f$ coincides with the energy $\varepsilon(f)$ of the image $f$, accurate to the addend $NP^2$ (more exactly $P^2$, if the normalizing factor $1/N$ is taken into account in Eq. (38)). If the image is centered, i.e., each of its values $f_{n,m}$ is replaced by $\bar{f}_{n,m} = f_{n,m} - P/N^2$, then the energy of the totality $G_f$ of image-signals coincides with the energy of the initial image $f$: $\varepsilon(G_f) = \varepsilon(f)$. As an example, Figure 51 shows the cowgirl image in part (a), along with the graph of the function $g(n) = \varepsilon(G_{p,s})$ in part (b), where $n$ are the numbers of the generators $(p, s)$ taken in the order given by the construction in Eq. (50) of the covering $\sigma_{N,N}$. The image has been centered. The image-signal with the maximum energy, 299.73, is $G_{0,1}$. The next two image-signals, $G_{1,0}$ and $G_{128,1}$, have high energies equal to 166.6847 and 133.9112, respectively. The figure also shows another seven image-signals with high energy. They are $G_{1,1}$, $G_{1,253}$, $G_{1,2}$, $G_{1,128}$, $G_{1,85}$, $G_{1,171}$, and $G_{254,1}$.

E. Image Enhancement

Owing to the tensor representation of the 2-D DFT, an image is presented in the form of the totality of 1-D image-signals that contain the information about the 2-D DFT of the image at the samples of the corresponding groups.
Figure 51. (a) The cowgirl image and (b) the energy curve of 384 image-signals of the image.
Rather than enhance the image in the frequency domain by processing all frequencies by transformation methods, for instance by the method of magnitude reduction, one can perform the enhancement of the image by processing only one or a few of the image-signals. Optimal parameters for processing the selected image-signal can be found by using the concept of the quantitative measure of enhancement, EME or EMEE. Image enhancement in the frequency domain is straightforward. One need simply perform the unitary transform of the image to be enhanced, manipulate the transform coefficients, and then perform the inverse transform:

$$\mathcal{F}_{N,N}: f \to F \to O \cdot F \to \mathcal{F}^{-1}_{N,N}[O(|F|)] = \hat{f} \tag{60}$$
where $O$ is an operator that changes only the moduli of the coefficients $F_{p_1,p_2}$. We consider the enhancement operator $O$ to be of the form $C(p_1, p_2) F_{p_1,p_2}$, where $C(p_1, p_2)$ is a real function of the magnitude of the coefficients. The modified $\alpha$-rooting method of image enhancement is described by the following coefficients:

$$C(p_1, p_2) = A\,|F_{p_1,p_2}|^{\alpha-1}, \quad 0 \le \alpha < 1 \tag{61}$$
where $A$ is a constant. We assume that the enhancement operator $O(|F|)$ takes the form $C(p_1, p_2)|F_{p_1,p_2}|$ at each point $(p_1, p_2)$. The best (optimal) Fourier transform-based image enhancement parameter $\alpha_0$ is such that $\mathrm{EME}(F_{\alpha_0}) = \mathrm{EME}$. The optimal value of $\alpha_0$ is image dependent, and it can be found automatically (Grigoryan et al., 2001) by computing $\mathrm{EME}(F_\alpha)$ as a function of the variable $\alpha$. As an example, Figure 52(a) shows the curve describing the measure of enhancement of the clock-on-moon image when applying the 2-D discrete
Figure 52. The modified $\alpha$-rooting by the 2-D Fourier transformation. (a) EME characteristics of the image. (b) The image enhanced by the $\alpha$-rooting method when $\alpha = 0.61$. (c) The image enhanced by $\alpha$-rooting when $\alpha = 0.92$. Relative to the measure EME, the image is enhanced by more than a factor of two.
Fourier transformation and blocks of size $7 \times 7$ for splitting. The curve has two maxima, at the points $\alpha_1 = 0.61$ and $\alpha_0 = 0.92$, where the maximum measure is provided by the Fourier transformation. The parameter $\alpha_0$ corresponds to the best visual estimation of enhancement. Parts (b) and (c) of Figure 52 illustrate the enhancement of the original image via $\alpha$-rooting based on the DFT when $\alpha = 0.61$ and $\alpha = 0.92$, which yield enhancement measures of, respectively, $\mathrm{EME}(F_{0.61}) = 16.27$ and $\mathrm{EME}(F_{0.92}) = 13.04$.

F. Method of 1-D $\alpha$-Rooting

In the tensor representation of an image $f_{n_1,n_2}$, the Fourier transform method of image enhancement can be performed by processing the image-signals $f_{T_{p_1,p_2}}$, $T_{p_1,p_2} \in \sigma$, in the following way.
Figure 53. (a) Enhancement measure function EME($n$, $\alpha_o$) for $\alpha_o = 0.975$, when $n = 0:383$. (b) Image-signal $f_{T_{135,1}}$. (c) Coefficients $C_1(k)$, $k = 0:255$, of the one-dimensional $\alpha$-rooting enhancement. (d) Image enhanced by the image-signal.
Algorithm of Image Enhancement
Step 1: Perform the 1-D DFTs of the image-signals:

$$f_{T_{p_1,p_2}} \to F_k = \sum_{t=0}^{N-1} f_{p_1,p_2,t} W^{kt}, \quad k = 0:(N-1).$$

Step 2: Multiply the transforms of the image-signals by the coefficients $C_k = A|F_k|^{\alpha-1}$, $k = 0:(N-1)$.
Step 3: Compose the 2-D DFT from the new 1-D DFTs.
Step 4: Perform the inverse 2-D DFT.

Rather than process all image-signals by the 1-D $\alpha$-rooting method with a fixed parameter $\alpha$, we can process the image-signals separately with different $\alpha$ parameters to achieve the optimal enhancement. Optimality is with respect to the measure EME = EME($F$). Thus, we may change Step 2 of the algorithm by using a different (or optimal) $\alpha \in [0, 1]$ for each image-signal. As an example, for the clock-on-moon image of size $256 \times 256$, part (a) of Figure 53 shows the values of the enhancement measure EME($n$; $\alpha_o$), calculated after processing only one, the $n$th, image-signal for $\alpha_o = 0.975$, where $n = 0:383$.
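The four steps can be sketched for the case in which a single image-signal is processed and all other samples of the 2-D DFT are left unchanged. This is an illustration under our own assumptions, not the authors' implementation: it uses NumPy, a hypothetical function name `alpha_root_one_signal`, a random test image, and it sets the coefficient to zero wherever $|F_k| = 0$ to avoid raising zero to a negative power. The EME measure used in the chapter to select $\alpha$ is not reproduced here.

```python
import numpy as np

def alpha_root_one_signal(f, p1, p2, alpha, A=1.0):
    """Apply Steps 1-4 to the single image-signal generated by (p1, p2)."""
    N = f.shape[0]
    n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    t = (n1 * p1 + n2 * p2) % N
    fT = np.bincount(t.ravel(), weights=f.ravel(), minlength=N)

    Fk = np.fft.fft(fT)                                # Step 1
    mag = np.abs(Fk)
    C = np.zeros(N)
    nonzero = mag > 0                                  # guard against 0 ** negative
    C[nonzero] = A * mag[nonzero] ** (alpha - 1.0)     # Step 2

    k = np.arange(N)
    F2 = np.fft.fft2(f)
    F2[(k * p1) % N, (k * p2) % N] = C * Fk            # Step 3
    return np.fft.ifft2(F2).real                       # Step 4

rng = np.random.default_rng(4)
f = rng.random((16, 16))
g = alpha_root_one_signal(f, 3, 1, alpha=0.975)
same = alpha_root_one_signal(f, 3, 1, alpha=1.0)
print(g.shape, np.allclose(same, f))
```

With $\alpha = 1$ and $A = 1$ the coefficients equal one and the image is returned unchanged, which gives a simple sanity check. Processing several image-signals with different $\alpha$ values amounts to repeating Steps 1 to 3 for each chosen generator before the inverse transform.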
Figure 54. Enhancement measure function EME(135, $\alpha$).
The maximum of EME($n$; $\alpha_o$) equals 9.722 and is achieved for $n = 135$, which corresponds to the image-signal $f_{T_{135,1}}$. This image-signal is shown in Figure 53(b), along with the coefficients $C_1(k)$, $k = 0:255$, in part (c) and the enhancement of the image in part (d). It is important to note that, by processing only one image-signal, we may achieve a significant enhancement of the image. Figure 54 shows the curve of the function EME(135; $\alpha$) when $\alpha$ varies in the interval [0.8, 1]. One can see that the value 0.975 is optimal for this image-signal. We recall that $\alpha$-rooting by the 2-D DFT yields the optimal value 0.92 (see Fig. 52), and at the point $\alpha = 0.92$ the image enhancement is estimated as 5.29, which is significantly less than 9.722. Figure 55 shows the 1-D DFT of the image-signal $f_{T_{135,1}}$ in part (a), along with the arrangement of the values of this DFT at the points of the group $T_{135,1}$ for the 2-D DFT of the image in part (b). Figure 56 shows the 1-D DFT of the processed image-signal $f_{T_{135,1}}$ in part (a), along with, in part (b), the arrangement of the values of the coefficients $C_1(kp_1, kp_2) = C_1(k)$ of the 1-D $\alpha$-rooting at the points of the group in the 2-D lattice $X_{256,256}$ of the image.
Figure 55. (a) The 1-D DFT of the image-signal. (b) Arrangement of the 1-D DFT at points of the 2-D DFT of the image.
Optimal values of $\alpha$ for other image-signals may differ from 0.975. As an example, we consider the 256th image-signal, which corresponds to the group $T_{1,0}$ and at which the minimum of EME($n$; $\alpha_o$) is achieved and equals 8.635 [see Fig. 53(a)]. The analysis of the function EME(256; $\alpha$) shows that the optimal value of $\alpha$ for this image-signal is 0.965, and the enhancement of the image equals EME(256; 0.965) = 9.81, which is greater than the enhancement obtained with the image-signal $f_{T_{135,1}}$. The image-signal number 256 is shown in Figure 57 in part (a), along with the coefficients $C_1(k)$, $k = 0:255$, in part (b) and the enhancement of the image by processing this image-signal in part (c). The curve of the enhancement measure function EME(256; $\alpha$) when $\alpha$ varies in the interval [0.8, 1] is shown in part (d). Two other extremal values, namely the second maximum and minimum, of the enhancement measure EME($n$; $\alpha_o$), for $\alpha_o = 0.975$, are also shown in Figure 53(a). The image-signals are, respectively, signals number 171 and 1. The second maximum of EME($n$; $\alpha_o$) equals 9.68 and is achieved for $n = 341$
Figure 56. (a) The 1-D DFT of image-signal number 135 processed by the one-dimensional $\alpha$-rooting. (b) Arrangement of the coefficients $C_1(kp_1, kp_2) = C_1(k)$, $k = 0:255$, at the points of the 2-D $256 \times 256$ lattice.
which corresponds to the image-signal $f_{T_{1,170}}$. This image-signal is shown in Figure 58 in part (a), along with the coefficients $C_1(k)$ in part (b) and the enhancement of the image by processing this image-signal in part (c). The curve of the enhancement measure function EME(342; $\alpha$) when $\alpha$ varies in the interval [0.8, 1] is shown in part (d). The optimal value of $\alpha$ for this image-signal is 0.975. We can also enhance the image by processing two or more image-signals. As an example, Figure 59 in part (d) shows the result of image enhancement when processing the two image-signals $f_{T_{135,1}}$ and $f_{T_{1,0}}$. The enhancement equals 9.78. We note that by processing the image-signal $f_{T_{135,1}}$, which corresponds to the maximum of the enhancement curve of Figure 53(a), we achieve an image enhancement of 9.72. By processing the image-signal $f_{T_{1,0}}$, which corresponds to the minimum of the enhancement curve, we achieve an image enhancement of 9.81.
Figure 57. (a) Image-signal $f_{T_{1,0}}$. (b) Coefficients $C_1(k)$, $k = 0:255$. (c) Image enhanced by the image-signal. (d) Enhancement measure function EME(256; $\alpha$) for $\alpha \in [0.8, 1]$. The maximum of the measure is located at the point $\alpha = 0.965$.
Figure 58. (a) Image-signal $f_{T_{1,170}}$. (b) Coefficients $C_1(k)$, $k = 0:255$. (c) Image enhanced by the image-signal $f_{T_{1,170}}$. (d) Enhancement measure function EME(342; $\alpha$) for $\alpha \in [0.8, 1]$.
As one can see, the tensor representation of the image with respect to the Fourier transform yields the splitting of the 2-D DFT into a set of 1-D DFTs. The image is represented as a set of 1-D image-signals, and the problem of image enhancement in the frequency domain can be reduced to processing the image-signals. The quantitative measure of enhancement EME allows us to select optimal parameters for each image-signal and to achieve significant enhancement of the image by using even one image-signal, which can be selected automatically.

G. Conclusion

Owing to the tensor representation of the 2-D DFT, the image is presented in the form of a totality of 1-D image-signals that contain the information about the 2-D DFT of the image at the samples of the corresponding groups.
Figure 59. (a) Image-signal $f_{T_{135,1}}$. (b) Image-signal $f_{T_{1,0}}$. (c) Coefficients $C_1(k)$, $k = 0:255$, for these two image-signals. (d) Image enhanced by the two image-signals $f_{T_{135,1}}$ and $f_{T_{1,0}}$.
Rather than enhance the image in the frequency domain by processing all frequencies by the method of magnitude reduction, one can perform the enhancement by processing only one or a few image-signals. Optimal parameters for processing the selected image-signal are calculated by using the quantitative measure of enhancement EME. We have restricted our attention to a special form of 2-D image representation, called the tensor (or vectorial) representation of an image with respect to the Fourier transformation. The modified, so-called paired representation (Grigoryan, 1991, 2001) can also be used for image enhancement by splitting image-signals. In the paired representation, each image-signal carries the spectral information of the image at points of special sets that do not intersect for different image-signals. From the standpoint of the tensor and paired representations, the problem of image enhancement is reduced to processing a small number of selected image-signals or the complete set of image-signals with different (the best) parameters. The best parameters are calculated by using the concept of the quantitative measure.
Acknowledgment The authors thank Dr. Richard M. Leahy for allowing us to use images of the library of the Signal and Image Processing Institute, University of Southern California.
References Agaian S. S. (1990/1991). Advances and problems of fast orthogonal transform for signal/image processing applications, (Part 1) Nauka, Moscow, Issue 4, 99–145/(Part 2) Issue 5, 146–215 Agaian, S. S. (1999). Visual morphology. Proceedings of SPIE: Nonlinear Image Processing X 3646, 139–150. Agaian, S. S., Panetta, K., and Grigoryan, A. M. (2001). Transform-based image enhancement algorithms. IEEE Trans. on Image Processing 10, 367–382. Aghagolzadeh, S., and Ersoy, O. K. (1992). Transform image enhancement. Optical Engineering 31, 614–626. Antrews, H., Tescher, A., and Kruger, R. (1972). Image processing by digital computer. IEEE Spectrum 9, 20–32. Ballard, D., and Brown, C. (1982). Computer Vision. Englewood Cliffs, NJ: Prentice-Hall. Beauchamp, K. G. (1975). Walsh Functions and Their Applications. London: Academic Press. Beghcladi, A., and Negrate, A. L. (1989). Contrast enhancement technique based on local detection of edges. Computer Vision, Graphics, Image Processing 46, 162–274. Castleman, K. R. (1996). Digital Image Processing. New Jersey: Prentice-Hall. Chang, D.-C., and Wu, W.-R. (1998). Image contrast enhancement based on a histogram transformation of local standard deviation. IEEE Trans. Medical Imaging 17, 518–531. Donoho, D. L. (1995). De-noising by soft-thresholding. IEEE Trans. on Inform. Theory 41, 613–627. Elliot, D. F., and Rao, K. R. (1983). Fast Transforms: Algorithms, Analyzes, Applications. Orlando, New York: Academic Press. Fechner, G. T. (1960). Elements of Psychophysics, Vol. 1. New York: Rinehart & Winston. Gonzalez, R., and Wintz (1987). Digital Image Processing, 2nd ed., Reading, MA, AddisonWaley. Gordon, I. E. (1989). Theory of Visual Perception. New York: Wiley. Grigoryan, A. M. (1984). Two-dimensional Fourier transform algorithm. Izvestiya VUZ SSSR, Radioelectronica 27, 52–57. Grigoryan, A. M. (1991). Algorithm of computation of the discrete Fourier transform with arbitrary orders. Vichislit. Matem. i Mat. 
Fiziki 30, 1576–1581. Grigoryan, A. M. (2001). 2-D and 1-D multi-paired transforms: Frequency-time type wavelets. IEEE Trans. Signal Processing 49, 344–353. Grigoryan, A. M. (2002). Efficient algorithms for computing the 2-D hexagonal Fourier transforms. IEEE Trans. Signal Processing 50, 1438–1448.
Grigoryan, A. M., and Agaian, S. S. (2001). Shifted Fourier transform based tensor algorithms for 2-D DCT. IEEE Trans. Signal Processing 49, 2113–2126. Grigoryan, A. M., Agaian, S. S., and Panetta, K. (2001). A new measure of Image Enhancement. Processing of the conference ACIVS 2001, IEEE Signal Processing, BadenBaden. Grigoryan, A. M., and Agaian, S. S. (2003) Tensor form of image representation: Enhancement by image-signals. [E1 5014–26], in Proceedings of SPIE. Electronic Imaging 2003: Santa Clara, CA: Science & Technology. 5014, 221–231. Grigoryan, A. M., and Grigoryan, M. M. (1986). Tensor representation of the two-dimensional discrete Fourier transform and new orthogonal functions. Autometria 1, 21–27. Harmuth, H. F. (1969). Applications of Walsh function in communications. IEEE Spectrum 6, 82–91. Harris, J. (1977). Constant variance enhancement: a digital processing technique. Applied Optics 16, 1268–1271. How, H. S., and Tretter, D. R. (1990). Frequency characterization of the discrete cosine transform. Proc. SPIE 1349, 31–42. Jain, A. (1989). Fundamentals of digital image processing. Englewood Cliffs, NJ: Prentice Hall. Ji, T. L., Sundareshan, M. K., and Roehrig, H. (1994). Adaptive image contrast enhancement based on human visual properties. IEEE Trans. Medical Imaging 13, 573–586. Khellaf, A., Beghdadi, A., and Dupoiset, H. (1991). Entropic contrast enhancement. IEEE Trans. Medical Imaging 10, 589–592. Kim, J. K., Park, J. M., Song, K. S., and Park, H. W. (1977). Adaptive mammographic image enhancement using first derivative and local statistics. IEEE Trans. Medical Imaging 16, 495–502. Kogan, R., Agaian, S. S., and Lentz, K. P. (1998). Visualization using rational morphology and zonal magnitude-reduction. Proceedings of SPIE 3304, 153–163. Krueger, L. E. (1989). Reconciling Fechner and Stevens: Toward an unified psychophysical law. Behav. Brain Sci 12, 251–320. Laine, A. F., Fan, J., and Yang, W. (1995). 
Wavelets for contrast enhancement of digital mamography. IEEE Engineering in Medicine and Biology 14, 536–550. McClellan, J. H. (1980). Artifacts in alpha-rooting of images. Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing 5, 449–452. Millane, R. P. (1994). Analytic properties of the Hartley transform and their applications. Proc. IEEE 82, 413–428. Morrow, W. M., Paranjape, R. B., Rangayyan, R. M., and Desautels, J. E. L. (1992). Regionbased contrast enhancement of mammograms. IEEE Trans. Medical Imaging 11, 392–406. Natendra, P., and Rithch (1981). Real-time adaptive contrast enhancement. IEEE Trans. Pattern Analysis Machine Intelligence 3, 655–661. Netravali, N., and Presada (1977). Adaptive quantization of picture signals using spatial masking. Proc. IEEE 65, 536–548. Polesel, A., Ramponi, G., and Mathews, V. J. (2000). Image enhancement via adaptive unsharp masking image processing. IEEE Trans. Medical Imaging 9, 505–510. Poularikas, A. D., and Seely, S. (1991). Signals and Systems. Boston: PWS-KENT Publishing Company. Rajcsy, R., and Liberman (1976). Texture grandiuts as a depth one. Computer Graphics and Image Processing 5, 52–57. Reeves, T. H., and Jernigan, M. E. (1997). Multiscale-based image enhancement. IEEE 1997 Canadian Conference 2, 500–503. Ritter, G. X., and Wilson, J. N. (1996). Handbook of Computer Vision Algorithms in Image Algebro. Boca Raton, Fla: CRC Press.
Rosenfeld, A., and Kak, A. C. (1982). Digital Picture Processing, Vol. 1. New York: Academic Press. Saghri, J. A., Cheatham, P. S., and Habibi, H. (1989). Image quality measure based on a human visual system model. Opt. Eng 28. Stark, J. A. (2000). Adaptive image contrast enhancement using generalizations of histogram equalization. Image Processing, IEEE Trans. Medical Imaging 9, 889–896. Tizhoosh, H. R., Krell, G., and Michaelis, B. (1998). Lambda-enhancement: Contrast adaptation based on optimization of image fuzziness, in Fuzzy Systems Proceedings, IEEE World Congress on Computational Intelligence; FUZZ-IEEE’98 2, pp. 1548–1553. Tizhoosh, H. R. (2001). Observer-dependent image enhancement. Fuzzy Systems, The 10th IEEE International Conference 1, 23–26. Wang, D., Vagnucci, A. H., and Li, C. C. (1981). Digital image enhancement. Computer Vision, Graphics, Image Processing 24, 363–381. Wang, Yu., Chen, Q., and Zhang, B. (2000). Image enhancement based on equal area dualistic sub-image histogram equalization method. Consumer Electronics, IEEE Trans. Medical Imaging 45, 68–75. Williams, B., Hung, C.-C., Yen, K. K., and Coleman, T. (2001). Image enhancement using the modified cosine function and semi-histogram equalization for gray-scale and color images, in Systems, Man, and Cybernetics, 2001 IEEE International Conference, 1, pp. 518–523. Zamperoni, P. (1995). Image enhancement. Adv. Image Electron Physics 92, 1–77. Zimmerman, J. B., Pizer, S. M., Staab, E. V., Perry, J. R., McCartney, W., and Brenton, B. C. (1988). An evaluation of the effectiveness of adaptive histogram equalization for contrast enhancement. IEEE Trans. Medical Imaging 7, 304–312.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 130
Image Registration: An Overview MARIA PETROU Informatics and Telematics Institute, CERTH, POB 361, Thermi 57001, Thessaloniki, Greece
I. Introduction
II. Similarity Measures
    A. Similarity Measures for Pixel Based Methods
    B. Similarity Measures for Feature Based Methods
        1. Point Matching Methods
        2. Curve Matching Methods
        3. Region Matching Methods
III. Deriving the Transformation Between the Two Images
    A. Feature-Based Methods
    B. Pixel-Based Methods
IV. Feature Extraction
V. Literature Survey
VI. Conclusions
References
I. Introduction

Image registration is the process that allows one to know which pixels of two different images were produced by the same physical object or the same part of a physical object that was imaged. This definition, however, already runs into problems: a pixel is not a physical entity. It is the result of our imaging technology. So, there may not really exist an exact correspondence between pixels, as different imaging set-ups will create different pixels of the same scene. In other words, the object patch depicted by the pixel of one image may not coincide exactly with the object patch depicted by any pixel of the other image. Perhaps, therefore, a more correct definition of image registration is: Image registration is the process that allows one to know which parts of two different images were produced by the same physical object or the same part of a physical object that was imaged.

Image registration is of paramount importance in many areas of research. For example, in computer vision, image registration is the necessary first step in order to perform stereo vision for 3D depth recovery from a pair of images. In robotics it is necessary to register the frames in the video sequence captured by the cameras of the robot for object tracking and path finding. In medicine, one needs image registration at all levels of research and clinical practice: from the registration of data with pre-labelled and segmented anatomical atlases to aligning patients in successive radiotherapy sessions. In remote sensing, most applications require the detection of change between images captured at different times: the advancement of the desert, the growth of plants, the change of land use, the monitoring of schools of fish in the ocean or swarms of locusts in the Sahel, all require the accurate registration of images that have to be compared in order to detect change. And, of course, just knowing what is where is perhaps the most crucial first step of any remote sensing application: the registration of an image with a map is a prerequisite of any further investigation.

It is not surprising, therefore, that the bulk of the research in image registration has been done either in the field of remote sensing or in the field of medical image analysis. Distinguishing the research in these two disciplines from both stereo vision and video processing is the fact that they often have to register images captured by very different sensors. So, special techniques had to be developed to solve the problem of multimodal image registration.

Before one embarks on trying to solve the problem, it is worth trying to understand what causes the problem. The images one wishes to register have been produced by spatially and/or temporally different image capturing settings. In effect, the image plane has a different relative position with respect to the depicted scene in the two images. This may be expressed as the movement of the camera with respect to the scene. This movement may consist of a known and an unknown component. Often, the known component is deterministic and it is routinely removed from the image. This is, for example, the case of the satellite orbital characteristics being taken into consideration when producing a satellite remote sensing image.
The unknown component is usually the problem, and this is often random and unpredictable: it could be the vibration of the satellite or aeroplane in a remote sensing image, or the displacement of organs in the human body, or even the natural variation of organs' shapes and sizes when one wishes to register medical images from different subjects. In view of the above, the discussion that follows will draw its examples from all the above mentioned applications, trying to put them under the same unified framework.

Almost all image registration algorithms consist of the following steps:

• Assume a correspondence between the two images.
• On the basis of this correspondence infer the transformation from the sensed to the reference image.
• Transform the sensed image by applying this transformation.
• Compute a similarity measure between the reference image and the transformed sensed image.
• If the value of this similarity measure is within acceptable limits, exit. If not, go to the first step and repeat the process.
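The loop formed by these steps can be illustrated with a deliberately small sketch. It is not taken from any of the algorithms surveyed here: the "images" are 1-D grey-value profiles, the only transformation considered is an integer shift, and the mean absolute difference stands in for the similarity measure. The function names `shift`, `mean_abs_difference` and `register` are hypothetical.

```python
def shift(signal, offset, fill=0.0):
    """Translate a 1-D signal by an integer offset (the assumed transformation)."""
    n = len(signal)
    out = [fill] * n
    for i, value in enumerate(signal):
        j = i + offset
        if 0 <= j < n:
            out[j] = value
    return out


def mean_abs_difference(a, b):
    """Stand-in dissimilarity measure: 0 when the two signals are identical."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def register(reference, sensed, candidate_offsets, tolerance):
    """Hypothesise a transformation, apply it, score it, stop when acceptable."""
    best_offset, best_score = None, float("inf")
    for offset in candidate_offsets:          # assume a correspondence
        transformed = shift(sensed, offset)   # transform the sensed image
        score = mean_abs_difference(reference, transformed)
        if score < best_score:
            best_offset, best_score = offset, score
        if score <= tolerance:                # within acceptable limits: exit
            break
    return best_offset, best_score


reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
sensed = [1, 5, 9, 5, 1, 0, 0, 0, 0, 0]       # same profile, displaced by two samples
offset, score = register(reference, sensed, range(-3, 4), tolerance=1e-9)
print(offset, score)
```

In a realistic setting the exhaustive list of candidate shifts would be replaced by a richer family of transformations and a search strategy, but the structure of hypothesising, transforming, scoring and testing for termination stays the same.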
The correspondence assumed in the first step is a correspondence between landmark points, often picked manually in many routine applications in remote sensing and in medicine, even today. This is clearly a very tedious process, but it is still used as it is really the only 100% safe method in applications where there is no room for mistakes. Automating this process necessarily involves feature detection. A feature is a distinguishable point in the image, e.g., a corner, an edge, etc. In the extreme case a pixel may be thought of as a feature. A pixel is not really a good feature. For a start, as mentioned earlier, it is not a physical entity, so it does not really represent some physical structure of the scene. In addition, a pixel may not be particularly distinguishable from its surrounding pixels (i.e., it may have low information value). Pixels, however, are distributed as densely as it gets in an image, so they offer robustness through redundancy, and dense registration information. That is why there is a whole selection of popular image registration methods which are pixel based. Feature based methods, on the other hand, suffer from the sparsity of features, something which necessitates the interpolation of the registration parameters obtained by the matching process.

Once a correspondence between features has been established, one may try to infer a transformation between the two images. Here we have two options: the transformation may be assumed to be global or local. A global transformation is assumed to apply to the sensed image as a whole, and again we have two options: it may be a rigid-body transformation, or it may be a nonrigid-body transformation. It is surprising how popular rigid-body transformations are, even when they are blatantly not applicable, e.g., in medicine when human organs are registered.
The reason for their popularity is their simplicity: a translation and a rotation can easily be estimated, and they can be estimated fast. Occasionally scaling is allowed as well. A nonrigid-body transformation is a polynomial transformation (i.e., the coordinates of a pixel in one image are assumed to be polynomial functions of the coordinates of the corresponding pixel in the other image). If the polynomials are of the first order, the transformation is affine: it is expressed by a matrix that transforms the coordinates of a pixel and which, in addition to the rotation and scaling of the image, also allows shear deformations, plus a translation. A special case of a polynomial transformation is the bilinear transformation applied locally: once the correspondence between landmark points has been established, one may assume that the coordinates of a pixel in one image are first order polynomial functions with respect to each of the coordinates of the same pixel in the other image. Then one considers quadruples of matched landmark points from which one infers the coefficients of this transformation, which is only valid within the quadrilateral defined by the four matched points (Petrou and Bosdogianni, 2000). Of course, local transformations are much more powerful than global ones, as they have much more flexibility in capturing local changes in structure.

From the basic structure of all registration algorithms, we may infer that these algorithms consist of three basic ingredients: a feature extractor, an image transformation, and a measure of similarity between images. In what follows, we shall discuss in detail each one of these ingredients in reverse order, and present some example algorithms which will demonstrate various aspects of image registration.
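As an illustration of the locally applied bilinear transformation described earlier in this section, the sketch below infers its coefficients from one quadruple of matched landmark points. It assumes the form x' = c0 + c1 x + c2 y + c3 xy (and likewise for y'), which is first order in each coordinate separately. The helper names `solve_linear`, `fit_bilinear` and `apply_bilinear`, and the landmark coordinates, are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate quadruple of landmark points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_bilinear(src_points, dst_points):
    """Coefficients of x' = c0 + c1*x + c2*y + c3*x*y, and the same for y'."""
    design = [[1.0, x, y, x * y] for x, y in src_points]
    cx = solve_linear(design, [p[0] for p in dst_points])
    cy = solve_linear(design, [p[1] for p in dst_points])
    return cx, cy


def apply_bilinear(coeffs, point):
    """Map a point lying inside the quadrilateral of the four landmarks."""
    cx, cy = coeffs
    x, y = point
    basis = (1.0, x, y, x * y)
    return (sum(c * b for c, b in zip(cx, basis)),
            sum(c * b for c, b in zip(cy, basis)))


src = [(0, 0), (1, 0), (1, 1), (0, 1)]              # landmarks in one image
dst = [(10, 20), (12, 21), (13, 24), (10, 23)]      # matched landmarks in the other
coeffs = fit_bilinear(src, dst)
centre = apply_bilinear(coeffs, (0.5, 0.5))
print(centre)
```

Four matched points give exactly four equations for the four coefficients of each coordinate, which is why the transformation is determined by a quadruple and is trusted only inside the quadrilateral it spans.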
II. Similarity Measures

One should distinguish the similarity measures used to assess the similarity of two images from those used to assess the similarity between two features. In both cases, the similarity measures are used to drive the matching process.
A. Similarity Measures for Pixel Based Methods

Once the transformation from one image to the other has been established, the sensed image may be transformed to the grid of the reference image. Then in order to assess the quality of the transformation we need a measure of similarity between the two images. Here we have to distinguish between two cases: registering images of the same modality, and registering images of different modalities.

When the two images are of the same modality, that is, from identical or similar sensors, we expect a certain physical patch in the scene to create in the two images pixels of the same grey value, up to a multiplicative factor, which is the same for all pixels and which may represent different lighting levels or different calibrations of the two images. Therefore, when the images are fully registered, the grey values in the reference image and in the transformed sensed image should be maximally correlated. So, the similarity between the two images is measured by the correlation coefficient, which is defined as:
\[
r(I_1, I_2) = \frac{1}{\sigma_{I_1}\sigma_{I_2}} \sum_i \sum_j \left[ I_1(i,j) - \mu_{I_1} \right] \left[ I_2(\mathrm{Transform}(i,j)) - \mu_{I_2} \right] \tag{1}
\]

where

$I_1$ : Reference image
$I_2$ : Sensed image
$\mathrm{Transform}(i,j)$ : The transformation we wish to identify
$\sigma_{I_1}$ : The standard deviation of the overlapping part of the reference image
$\sigma_{I_2}$ : The standard deviation of the overlapping part of the sensed image
$\mu_{I_1}$ : The mean of the overlapping part of the reference image
$\mu_{I_2}$ : The mean of the overlapping part of the sensed image
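A minimal sketch of how Eq. (1) might be evaluated on the overlapping part of two images is given below. Two choices are assumptions made here rather than statements of the text: pixels outside the overlap are marked with `None`, and the double sum is divided by the number of overlapping pixels so that the result lies between -1 and 1. The function name `correlation_coefficient` is hypothetical.

```python
import math


def correlation_coefficient(reference, sensed_transformed):
    """Correlation over the pixels where both images are defined (None = outside)."""
    pairs = [(a, b)
             for row_a, row_b in zip(reference, sensed_transformed)
             for a, b in zip(row_a, row_b)
             if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    # Assumption: sums are normalised by the number of overlapping pixels.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    std_a = math.sqrt(sum((a - mean_a) ** 2 for a, _ in pairs) / n)
    std_b = math.sqrt(sum((b - mean_b) ** 2 for _, b in pairs) / n)
    return cov / (std_a * std_b)


reference = [[10, 20, 30], [40, 50, 60]]
brighter = [[2 * v for v in row] for row in reference]      # multiplicative change
inverted = [[100 - v for v in row] for row in reference]    # contrast reversal
r_same = correlation_coefficient(reference, brighter)
r_inverted = correlation_coefficient(reference, inverted)
print(r_same, r_inverted)
```

The first pair differs only by a multiplicative factor and is therefore perfectly correlated, while the contrast-reversed image is perfectly anti-correlated.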
The purpose of the registration algorithm is to choose $\mathrm{Transform}(i,j)$ so that $r(I_1, I_2)$ is maximal. It is worth noting that the way the correlation coefficient is defined means that it is invariant to global illumination changes, since a global lowering or increase in brightness in one of the images will simply multiply both numerator and denominator by the same factor and thus cancel out.

When it comes to assessing the similarity between two images of different modality, the idea is very simple: a particular physical patch in the scene has some invariant spectral characteristics which make it produce a pixel of a certain grey value when seen by one sensor, and of a different but fixed grey value when seen by the other sensor. As long as the physical object which created a pair of pixels in the two images remains the same, this pair of pixels should always have a particular pair of grey values. In measuring, therefore, the similarity between two images captured by two different sensors, instead of looking for the maximum correlation between the grey values of the corresponding pixels, we are looking for the maximum number of pairs of corresponding pixels that take the same pair of values. This is expressed by a quantity called mutual information, which is defined as follows:
\[
M_{I_1 I_2} = \sum_{g_1} \sum_{g_2} p_{I_1 I_2}(g_1, g_2) \log \frac{p_{I_1 I_2}(g_1, g_2)}{p_{I_1}(g_1)\, p_{I_2}(g_2)} \tag{2}
\]

where

$p_{I_1}(g_1)$ : The normalized histogram of grey values of the reference image
$p_{I_2}(g_2)$ : The normalized histogram of grey values of the sensed image
$p_{I_1 I_2}(g_1, g_2)$ : The normalized joint histogram of grey values $g_1$ and $g_2$ which correspond to the same pixel when the images are registered.

Choosing a transformation changes $p_{I_1 I_2}(g_1, g_2)$, since it changes which grey value of $I_1$ comes into correspondence with which grey value of $I_2$. Figure 1 shows schematically how mutual information is computed: on the left and at the bottom of the graph one can see the grey level histogram of each image. Mutual information is computed from a double entry table in which one counts the pairs of pixels that have a particular grey value in one image and a particular grey value in the other image. Note that for total dependence, $M_{I_1 I_2}$ is equal to the entropy of either image, since in this case

\[
p_{I_1}(g_1) = p_{I_2}(g_2) = p_{I_1 I_2}(g_1, g_2) \tag{3}
\]

which upon substitution into (2) yields:

\[
M_{I_1 I_2} = -\sum_{g_1} p_{I_1}(g_1) \log p_{I_1}(g_1) = -\sum_{g_2} p_{I_2}(g_2) \log p_{I_2}(g_2) \tag{4}
\]
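The two limiting values of Eq. (2) can be checked numerically. The sketch below is an illustration under simple assumptions: the images are flattened lists of grey values that are already in pixel-to-pixel correspondence, and the natural logarithm is used. The function names `mutual_information` and `entropy` and the toy grey values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Eq. (2) evaluated from the joint histogram of two registered images."""
    pairs = list(zip(image_a, image_b))
    n = len(pairs)
    joint = Counter(pairs)          # double entry table of grey-value pairs
    hist_a = Counter(image_a)
    hist_b = Counter(image_b)
    mi = 0.0
    for (g1, g2), count in joint.items():
        p12 = count / n
        mi += p12 * math.log(p12 / ((hist_a[g1] / n) * (hist_b[g2] / n)))
    return mi


def entropy(image):
    """Entropy of the normalized grey-value histogram of one image."""
    n = len(image)
    return -sum((c / n) * math.log(c / n) for c in Counter(image).values())


a = [0, 0, 1, 1, 2, 2, 3, 3]
relabelled = [{0: 7, 1: 5, 2: 9, 3: 1}[g] for g in a]   # one-to-one grey-value map
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]                    # every pair equally likely

print(mutual_information(a, relabelled), entropy(a))
print(mutual_information(a, unrelated))
```

With the one-to-one relabelling the mutual information coincides with the entropy of either image, and with the unrelated image it vanishes because the joint histogram factorises into the product of the two marginal histograms.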
The total dependence described by Eq. (4) arises when there is a one-to-one correspondence between the grey values of one modality and the grey values of the other modality. In this extreme case, one may entirely predict the image of one modality from the image of the other modality if this correspondence is known. It may be said that if one modality is totally predictable from the other, then one modality is enough to convey all the information that is to be conveyed, and the second modality is totally redundant. This would indeed have been the case if there were such a one-to-one correspondence between the grey values of the two modalities. However, this is never the case in practice and it only represents the extreme, theoretically possible maximum value of $M_{I_1 I_2}$. Figure 2 depicts schematically what the histograms of the two images and the mutual information matrix are expected to look like in such an extreme case: the grey value histogram of one image is a reshuffled version of the columns of the grey level histogram of the other image, since there is a one-to-one correspondence between the grey values of the two images.

Figure 1. Calculation of mutual information.

Figure 2. For total dependence the mutual information is equal to the entropy of either image.

On the other hand, when the two images are totally independent, $M_{I_1 I_2} = 0$, since now $p_{I_1 I_2}(g_1, g_2) = p_{I_1}(g_1)\, p_{I_2}(g_2)$. This is the case when the image of one modality cannot be predicted at all from the knowledge of the image of the other modality, and simply knowing that one pixel has value $g_1$ in one image tells us nothing about the possible range of its values in the other image: the pixel may take any value at all. In this case the two modalities convey totally different information and they are both necessary, but at the same time, we have no clue as to how to register the two images. Of course this is also an extreme case that does not arise in practice. Figure 3 shows schematically what the mutual information matrix looks like in such an extreme case.

Figure 3. For total independence the mutual information is equal to zero.

It is possible that the above defined measures may be enhanced by considering other aspects of the registration process. For example, the way the correlation coefficient is defined by Eq. (1) allows it to obtain its maximum value when the two images overlap by only a few pixels. So, it is advisable to constrain the registration so that the two registered images have as high an overlap with each other as possible. In addition, we obviously wish that one image is geometrically distorted as little as possible in order to match the other. Kovalev and Petrou (1998) used a cost function for measuring the registration error, which, in addition to the correlation function, included two more terms, one of which measured the overlapping area between the two registered images, and one of which measured the geometric distortion the one image had to be subjected to in order to match the other. The procedure they then followed in order to register the two images did not simply try to maximize the correlation between the grey values of the two images but to minimize the cost function. Their cost function was defined as follows:

\[
U = \alpha U_1 + \beta U_2 + \gamma U_3 \tag{5}
\]

where $\alpha$, $\beta$ and $\gamma$ were parameters controlling the relative importance of each term. The three terms combined were the following:

\[
U_1 \equiv 1 - r(I_1, I_2), \tag{6}
\]

where $r(I_1, I_2)$ was the correlation coefficient between the two images, defined by generalizing Eq. (1) to 3D, since they were registering 3D volume images. The second term was used to express the desire for image $I_2$ to be distorted as little as possible to fit image $I_1$. It was a purely geometric term which did not involve any grey values:

\[
\begin{aligned}
U_2 \equiv \frac{1}{N_{12}} \sum_{k \in B_{12}} \Big( & |x_{k+1} - x_k - \delta_{xx}| + |y_{k+1} - y_k - \delta_{xy}| + |z_{k+1} - z_k - \delta_{xz}| \\
& + |x_{k+N_x} - x_k - \delta_{yx}| + |y_{k+N_x} - y_k - \delta_{yy}| + |z_{k+N_x} - z_k - \delta_{yz}| \\
& + |x_{k+N_x N_y} - x_k - \delta_{zx}| + |y_{k+N_x N_y} - y_k - \delta_{zy}| + |z_{k+N_x N_y} - z_k - \delta_{zz}| \Big)
\end{aligned}
\tag{7}
\]

where $N_x$ and $N_y$ were the sizes of the image along the x and the y axes respectively, and $N_{12}$ was the number of pixels in $B_{12}$, which was the set of overlapping pixels between the two images. $\delta_{ab}$ was the difference in coordinate values along the b axis of two neighboring pixels "aligned" along the a axis. In a regular grid, $\delta_{xx} = \delta_{yy} = \delta_{zz} = 1$ and $\delta_{xy} = \delta_{yx} = \delta_{xz} = \delta_{yz} = \delta_{zx} = \delta_{zy} = 0$. Index k scans the image in a raster fashion, along the x axis on each successive slice corresponding to fixed z. More explicitly, the meaning of the first term in function (7), for example, is the following: $x_{k+1}$ and $x_k$ are the coordinate positions along the x axis of the two neighboring pixels with indices k + 1 and k respectively. At the beginning of the registration process, the difference between these two coordinates is $\delta_{xx}$, since these pixels are next to each other along the x axis. After the image has been transformed, the two pixels may shift with respect to each other, so their distance along the x axis may have changed. How different this distance is from the original value $\delta_{xx}$ expresses the distortion of the rigid grid. In a similar way, term $|x_{k+N_x} - x_k - \delta_{yx}|$ expresses the distortion of the grid away from the rigid one due to the shifting in relative position of two neighboring pixels along the y axis (indices $k + N_x$ and $k$ identify neighboring pixels along the y axis in a raster indexing format).

Finally, the third term of the cost function was used to express the desire for maximum overlap between images $I_1$ and $I_2$:

\[
U_3 \equiv 1 - \frac{N_{12}}{N} \tag{8}
\]

where N was the maximum number of pixels in the image.
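To make the roles of the three terms concrete, the following sketch evaluates a 2-D analogue of the grid distortion term of Eq. (7) together with the combination of Eq. (5). It is a simplified illustration and not the 3-D implementation of Kovalev and Petrou (1998): the sum runs over the whole grid rather than over an overlap set, neighbours beyond the image border are simply skipped, and the weights default to 1. The names `grid_distortion` and `total_cost` are hypothetical.

```python
def grid_distortion(xs, ys, nx, ny):
    """2-D analogue of Eq. (7) for a raster-indexed grid of transformed coordinates.

    xs[k], ys[k] are the coordinates of the pixel with raster index k after the
    transformation; the regular grid has unit spacing along each axis.
    """
    total = 0.0
    for k in range(nx * ny):
        col, row = k % nx, k // nx
        if col + 1 < nx:   # neighbour along x has index k + 1, rest spacing (1, 0)
            total += abs(xs[k + 1] - xs[k] - 1) + abs(ys[k + 1] - ys[k] - 0)
        if row + 1 < ny:   # neighbour along y has index k + nx, rest spacing (0, 1)
            total += abs(xs[k + nx] - xs[k] - 0) + abs(ys[k + nx] - ys[k] - 1)
    return total / (nx * ny)   # assumption: normalise by all pixels of the grid


def total_cost(r, u2, n_overlap, n_total, alpha=1.0, beta=1.0, gamma=1.0):
    """Eq. (5) with U1 = 1 - r (Eq. 6) and U3 = 1 - N12/N (Eq. 8)."""
    u1 = 1.0 - r
    u3 = 1.0 - n_overlap / n_total
    return alpha * u1 + beta * u2 + gamma * u3


NX, NY = 3, 2
regular_x = [0, 1, 2, 0, 1, 2]
regular_y = [0, 0, 0, 1, 1, 1]
stretched_x = [2 * x for x in regular_x]      # grid stretched by 2 along x

u2_regular = grid_distortion(regular_x, regular_y, NX, NY)
u2_stretched = grid_distortion(stretched_x, regular_y, NX, NY)
cost = total_cost(r=0.9, u2=u2_stretched, n_overlap=5, n_total=6)
print(u2_regular, u2_stretched, cost)
```

An undistorted or merely translated grid incurs no geometric penalty, whereas stretching, lower correlation and reduced overlap each raise the cost, which is the behaviour the three terms are meant to encode.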
In general, when one uses a cost function to measure the quality of a sought solution, the cost function takes the form:

\[
\text{Cost} = \text{Faithfulness to the data term} + \text{Prior model term} \tag{9}
\]
In Kovalev and Petrou (1998), the faithfulness to the data term was expressed by the term related to the correlation function, while the other two terms were the prior model used (i.e., they expressed the constraints the authors wished to impose on the solution, namely that the solution had to be such that the two images would have maximum overlap and also that the transformation used would be the least necessary to bring the two images into registration).

In a problem of a different nature, the two basic terms of the cost function may take a totally different form. For example, in solving the problem of motion analysis, which entails the registration of one frame in a video sequence with another, so that the motion vectors associated with each pixel can be identified, Dengler and Schmidt (1988) used as the prior model the so-called membrane model, which required that the first derivatives of the components of the motion vectors were as small as possible. Bober et al. (1998) solved the same problem by enhancing this approach. Their purpose was to estimate the two components of the motion vector at every pixel, plus the four first derivatives of these two components with respect to the two coordinates. This implied six unknowns per pixel. Bober et al. restricted themselves to the case in which the objects depicted by the video sequence had apparent motion from frame to frame described by rotation, zoom, and translation only. In such a case the unknowns become only four, since the four first derivatives of the two components of each motion vector are restricted to have a particular relationship with each other. With these simplifications, the velocity vector $\mathbf{v}$ for each pixel, in a local coordinate system centred on the pixel of interest, could be written as: $\mathbf{v} = (a_1 \tilde{x} + a_2 \tilde{y} + a_3,\; -a_2 \tilde{x} + a_1 \tilde{y} + a_4)^T$.
(The local coordinate system was defined so that its center was the pixel to which this velocity vector was assigned, and axes $\tilde{x}$ and $\tilde{y}$ were parallel to the image axes x and y.) The problem was to choose values for the parameters $a_1$, $a_2$, $a_3$ and $a_4$ for each pixel. The cost function Bober et al. (1998) used consisted of two faithfulness to the data terms: the first one measured how close the assigned parameters $a_3$ and $a_4$ were to the estimated values of the motion field obtained by matching the grey values of the pixels between the two frames. The second term encouraged pixels with high values of the first derivatives of their motion vectors (i.e., high values of $a_1$ and $a_2$) to coincide with pixels that an edge detection module had identified as edge pixels (i.e., the motion boundaries, associated with sudden changes in the value of the motion vector, were encouraged to coincide with grey level boundaries). They also included a constraint or prior model term, which said that neighboring pixels should have similar valued velocity components, and also similar valued derivatives of the velocity components (i.e., this term encouraged solutions in which neighboring pixels had similar values of $a_1$, $a_2$, $a_3$, and $a_4$). This means that they used a membrane model as well, which, just as in the case of Dengler and Schmidt (1988), was allowed to "tear" at places with high discontinuity: the membrane model tries to make the motion vectors of neighboring pixels identical to each other, but one may incorporate in the cost function some factors which, when the difference in motion vectors of adjacent pixels becomes too large, stop the model from forcing them to take similar values.

B. Similarity Measures for Feature Based Methods

Although features may be rich image representations, quite often after the features have been identified, they are stripped down to the bare minimum of representation, namely single points, on the basis of which matching is performed. So, feature based methods start from the most simplistic ones, which try to match sets of points between the two images, and progress to more complex ones, as gradually the matched objects become more complex geometrically, through the addition of attributes and pairwise (or higher order) relations between them.

1. Point Matching Methods

To match a set of points with another set of points one may use the Chamfer or Distance transform matching (Barrow et al., 1977; Rignot et al., 1991). In the reference image, we compute for each pixel its distance from the nearest feature point, using the distance transform. An example of the distance transform for a 16 × 16 image is shown in Figure 4. There are two feature points detected in this image, indicated by the black pixels. All other pixels have some integer values assigned to them. Each value indicates (to the nearest interpixel distance) the distance of the pixel from its nearest feature point.
Obviously the feature points themselves have value 0. When a match is assumed with the sensed image, each feature point of the sensed image corresponds to a pixel of the reference image. By reading the value the distance transform assigns to this corresponding pixel, we know how close the feature point of the sensed image is to a feature point of the reference image. The sum of all distance values of the feature points of the sensed image is the similarity measure between the two point sets and by extension a measure of the quality of matching. For a perfect match this sum obviously has to be zero.
Figure 4. Distance map from two feature points.
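The distance map and the resulting similarity measure can be sketched in a few lines. The sketch is illustrative only: it computes the distance map by brute force using rounded Euclidean distances (fast chamfer algorithms use local propagation instead), it considers only integer shifts of the sensed point set, and it rejects shifts that move a point off the image. The names `distance_map` and `chamfer_score` and the point coordinates are hypothetical.

```python
import math


def distance_map(height, width, feature_points):
    """Brute-force distance transform: rounded distance to the nearest feature."""
    return [[round(min(math.hypot(r - fr, c - fc) for fr, fc in feature_points))
             for c in range(width)]
            for r in range(height)]


def chamfer_score(dist_map, sensed_points, offset):
    """Sum of distance-map values read under the shifted sensed feature points."""
    dr, dc = offset
    total = 0
    for r, c in sensed_points:
        rr, cc = r + dr, c + dc
        if not (0 <= rr < len(dist_map) and 0 <= cc < len(dist_map[0])):
            return math.inf   # assumption: shifts leaving the image are rejected
        total += dist_map[rr][cc]
    return total


reference_features = [(3, 4), (10, 12)]       # (row, column) of two feature points
dmap = distance_map(16, 16, reference_features)

sensed_features = [(1, 3), (8, 11)]           # the same points displaced by (-2, -1)
candidates = [(dr, dc) for dr in range(-3, 4) for dc in range(-3, 4)]
best = min(candidates, key=lambda off: chamfer_score(dmap, sensed_features, off))
print(best, chamfer_score(dmap, sensed_features, best))
```

The distance map is computed once for the reference image, so each candidate match costs only one table look-up per sensed feature point.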
This method is fine if we consider that the two images have been captured from the same point of view and under the same perspective: the relative position of the points considered is not expected to have changed from one image to the other. This is not the case, however, in a stereo vision problem, when the imaged objects are clearly 3-dimensional, and when the images to be matched have been captured under different perspective transformations. Perspective transformations change the relative positions of points. In that case one needs to invoke projective geometry theorems on invariance, in order to solve the stereo vision problem.

Geometric invariance under perspective transformations is based on the concept of the cross ratio of four collinear points. The equation of a line with respect to an origin O can be written as $\mathbf{r} = \mathbf{b} + \mu \mathbf{d}$ where $\mathbf{r}$ is the position vector of any point along the line, $\mathbf{b}$ and $\mathbf{d}$ are some vectors that fully define the line, and $\mu$ is a parameter which takes real values, a different value for each point of the line. The cross ratio of four points A, B, C and D of the line is then defined as:

\[
[A, B, C, D] \equiv \frac{\mu_C - \mu_A}{\mu_C - \mu_B} \cdot \frac{\mu_D - \mu_B}{\mu_D - \mu_A} \tag{10}
\]

This ratio is invariant to linear scaling, skewing, rotation, translation, and perspective projections (Coxeter, 1974). The cross ratio $[l_1, l_2, l_3, l_4]$ of a pencil of four coplanar lines $l_1$, $l_2$, $l_3$, and $l_4$ is defined as the cross ratio of the four points which are defined by considering any line $l$ which intersects the four lines, and it can be shown to be independent of line $l$. Now, let us assume that we have four coplanar points $O_1$, $O_2$, $O_3$, and $O_4$, and a point P on the same plane. We can define the following cross ratios of the pencils of lines we can draw on the plane, passing through points $O_1$, $O_2$, and $O_3$, respectively:

\[
\begin{aligned}
k_1 &\equiv [O_1O_2,\; O_1O_3,\; O_1P,\; O_1O_4] \\
k_2 &\equiv [O_2O_3,\; O_2P,\; O_2O_4,\; O_2O_1] \\
k_3 &\equiv [O_3P,\; O_3O_4,\; O_3O_1,\; O_3O_2]
\end{aligned}
\tag{11}
\]
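The invariance of the cross ratio of Eq. (10) can be verified with exact arithmetic. The sketch below relies on a standard result of projective geometry, namely that a projection of one line onto another acts on the line parameter as a fractional-linear map; the coefficients of the map and the four parameter values are arbitrary choices made for illustration, and the function names `cross_ratio` and `projective_map` are hypothetical.

```python
from fractions import Fraction


def cross_ratio(mu_a, mu_b, mu_c, mu_d):
    """Eq. (10): cross ratio from the line parameters of four collinear points."""
    return ((mu_c - mu_a) * (mu_d - mu_b)) / ((mu_c - mu_b) * (mu_d - mu_a))


def projective_map(mu, a, b, c, d):
    """Projective transformation of a line: mu -> (a*mu + b) / (c*mu + d)."""
    return (a * mu + b) / (c * mu + d)


params = [Fraction(0), Fraction(1), Fraction(3), Fraction(7)]
mapped = [projective_map(mu, 2, 1, 1, 5) for mu in params]   # a*d - b*c = 9, not 0

before = cross_ratio(*params)
after = cross_ratio(*mapped)
print(before, after)
```

The individual parameter values and their spacings change under the map, but the cross ratio of the four points does not, which is what makes it usable as a measurement that survives perspective projection.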
It can be shown that if we know the positions of points $O_1$, $O_2$, $O_3$, and $O_4$, the three cross ratios $k_1$, $k_2$, and $k_3$ uniquely define the position of point P on the plane. All these definitions are demonstrated in Figure 5.

Figure 5. Definition of the projective coordinates of a point P.

Suppose now that points $O_1$, $O_2$, $O_3$, and $O_4$ are some reference points, the positions of which are known in the 3D space, and their images $o_1$, $o_2$, $o_3$, and $o_4$ can easily be identified on the image plane. Then the three cross ratios $k_1$, $k_2$, and $k_3$ for the projection p of point P on the image plane can easily be measured on the image plane, and since cross ratios are invariant to projections, their values will be the same on the reference plane. This way, we may define on the reference plane the corresponding point P of image point p. Suppose next that we have another set of reference points $O'_1$, $O'_2$, $O'_3$, and $O'_4$, the positions of which are also known in the 3D space, and their images $o'_1$, $o'_2$, $o'_3$, and $o'_4$ can also be identified easily on the image plane. Then we are able to identify the projection $P'$ of point p on the second reference plane. This way we can have the equation of line $PP'$, and we know that the true 3D position of the point that gave rise to the image point p lies along this line. If we have a second camera that sees the same points but from a different perspective, we shall be able to define another line in the 3D space along which the point that gave rise to the image point p also lies. Two lines intersect at most at one point, which must be the position of the point that gave rise to the image point p. This is the basis of projective geometry 3D reconstruction.

The problem really is how to identify which point in the image captured by the first camera corresponds to which point in the image captured by the second camera. Georgis et al. (1997, 1998) proposed ways of solving the point matching problem. Instead of matching two sets of points seen from totally different perspectives, which distorts their relative positions significantly, their idea was to construct a virtual image of the points seen by the first camera, under the perspective of the second camera, and match the raw points seen by the second camera with the virtual points of the image they constructed.
If the two cameras used form a stereo head, the lines that connect the corresponding points must all intersect at a single point E, on the epipole. This is better shown in Figure 6. The virtual image is formed from the intersections of the viewing lines of the various points on the object with one of the reference planes. These intersections can be fully defined from the cross ratios measured on the image plane and the knowledge of the 3D positions of the reference points as explained earlier. The problem then is cast as follows: The lines that connect matched points must all intersect at a single point. From all sets of matching lines you can define by considering one point from the right image and one point from the virtual image, choose the one that minimizes a cost function that measures how close a set of lines pass through a single point. Given a set of N points in the plane, one can fit through them the least square error line and measure how good this fit is. The problem of estimating how close a set of N lines pass through a single point can be thought of as the dual of the problem of fitting a line to N points. The dual representation of a line with equation y ¼ ax þ b is a point in parameter
IMAGE REGISTRATION: AN OVERVIEW
257
Figure 6. Matching the virtual points seen by the left camera with the real points seen by the right camera is much easier than matching the raw sets of points seen by the two cameras.
space (a, b). If we have a set of lines with parameters (ai, bi), all of which intersect at the same point (x, y), their dual representation in the (a, b) space must be a straight line parameterised by the values of x and y. Thus, as a measure of how accurately a set of N lines intersect at the same point can be taken to be the measure of how well the parameters of these lines are aligned in the dual space. It can be shown that such a measure can be defined as f ða1 ; b1 ; a2 ; b2 ; . . . ; aN ; bN Þ
N N X X i¼1 j¼1
2 ðai a¯ Þ bj b¯ aj a¯ bi b¯
ð12Þ
where a¯ and b¯ are the mean values of ai and bi respectively. This function can be used as a cost function that has to be minimized in order to identify the optimal matching between the two sets of points. It is a monotonic function (since it is quadratic in the unknowns), and so it can be
258
PETROU
optimized by using an algorithm like branch and bound (Georgis et al., 1997). However, such a function is not robust to outliers, being a least square error solution. A better method should be one that avoids outliers. Such a method was presented in Georgis et al. (1998). The problem is tackled there as an evidence accumulation problem, by considering triplets of matched pairs of points. Three lines connecting matched points must pass through the same point if the points are matched correctly. If we consider all possible triplets, each one of them will define, with a certain accuracy, a “point” of intersection E (see Figure 6). The word “point” is inside quotation marks because, if the matches are not correct, the three lines will not intersect at a single point. For every triplet considered, the cost function defined by Eq. (12) can be computed.

Let us suppose that there are $N$ points in the left image and $M$ points in the right image that have to be matched. We create an accumulator array of dimensions $N \times M$, with the $N$ points of the left image arranged along the vertical axis of the array, and the $M$ points of the right image arranged along the horizontal axis of the array. Every time a triplet we consider has cost function value below a certain tolerance, we increment by 1 the entries of the array which correspond to the matched points. Note that when we consider a triplet of the first image and a triplet of the second image, there is only one way to match the points: the topmost points must be matched, the middle points must be matched, and the bottommost points must be matched. This is because we cannot have matching lines that intersect between the two images: they must intersect somewhere along the epipolar line. At the end, the pairs of points that contributed to the largest number of valid matched triplets will emerge with the largest values in the accumulator array. These peaks will indicate the best match.

We may consider $N!/[(N-3)!\,3!]$ triplets in the left image and $M!/[(M-3)!\,3!]$ triplets in the right image. Since we can match any triplet of one image with any triplet of the other image, the total number of triplets of paired points we may consider is the product of these two numbers. If the points to be matched are many, the approach becomes very slow. Georgis et al. (1998) proposed a randomized approach where the accumulator array is constantly monitored for emerging peaks of matches.

A simple example is shown in Figure 7a, where we assume that we have four points in the left image ($N = 4$) which have to be matched with five points in the right image ($M = 5$). One may create $4!/(3!\,1!) = 4$ triplets from the points in the left image and $5!/(3!\,2!) = 10$ triplets from the points in the right image. Two such triplets, one from each set, are shown in Figure 7a by the encircled points. The only possible match between these points is indicated by the arrows. One may consider $4 \times 10 = 40$ such pairs of triplets, not all of which will have a cost function value below a predefined threshold.
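The voting scheme just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of Georgis et al. (1998): the function name `vote_for_matches` is hypothetical, the cost of Eq. (12) is passed in as a callable, and `toy_cost` is a made-up stand-in (it simply measures how much the three displacement vectors of a triplet differ) used so that the example can run. The point coordinates are invented for the demonstration.

```python
from itertools import combinations
from math import comb

import numpy as np


def vote_for_matches(left, right, triplet_cost, tolerance):
    """Accumulate votes for point pairs over all pairs of triplets.

    left, right  : arrays of shape (N, 2) and (M, 2) holding (x, y) positions
    triplet_cost : callable standing in for Eq. (12); it receives two (3, 2)
                   arrays of matched points and returns a scalar cost
    tolerance    : a pair of triplets votes only if its cost is below this
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    votes = np.zeros((len(left), len(right)), dtype=int)
    for tl in combinations(range(len(left)), 3):
        # Order each triplet by its vertical coordinate, so that the only
        # admissible pairing is first with first, middle with middle, and
        # last with last.
        tl = sorted(tl, key=lambda i: left[i, 1])
        for tr in combinations(range(len(right)), 3):
            tr = sorted(tr, key=lambda j: right[j, 1])
            if triplet_cost(left[tl], right[tr]) < tolerance:
                for i, j in zip(tl, tr):
                    votes[i, j] += 1
    return votes


def toy_cost(a, b):
    """Toy stand-in for Eq. (12): spread of the three displacement vectors."""
    d = b - a
    return float((d.max(axis=0) - d.min(axis=0)).sum())


# Invented data: the last four right points are the left points shifted by
# (3, 0); the first right point is an outlier.
left = np.array([[0, 0], [2, 1], [1, 3], [5, 4]])              # N = 4
right = np.array([[10, -1], [3, 0], [5, 1], [4, 3], [8, 4]])   # M = 5

print(comb(4, 3) * comb(5, 3))   # number of pairs of triplets examined
votes = vote_for_matches(left, right, toy_cost, tolerance=1e-9)
print(votes)
print(votes.argmax(axis=1))      # column of the peak in each row
```

The exhaustive double loop makes the combinatorial growth discussed above explicit; a randomized variant would sample pairs of triplets instead of enumerating them.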
Figure 7. (a) Two sets of points to be matched. The encircled points belong to two randomly chosen triplets. There is only one way to match the points of two triplets (indicated by the lines) because the joining lines cannot intersect between the two images. (b) An example accumulator array for votes gathered for paired points according to the number of valid matched triplets they contributed to. The circled entries are the peaks of the array indicating the final chosen match.
Let us suppose that the triplets with cost function value below the threshold contributed to the values of the accumulator array shown in Figure 7b. These numbers indicate that the best match is $(p_1, p'_2)$, $(p_2, p'_3)$, $(p_3, p'_4)$ and $(p_4, p'_5)$.

The methods discussed so far take into consideration only the geometric arrangement of feature points in each image. Clearly, such an approach does not make use of all available information. Even if we ignore feature attributes that may characterize each feature point, there is still information encapsulated in a point set that may be tapped for better results, for example, the relative position and orientation of the line segments defined by pairs of points in the same point set.

The process of matching may be enhanced by considering, in addition to geometric characteristics, attributes which characterize the features to be matched. If the features are more complex shapes than just points, the first choice in terms of attributes is also geometric characteristics. For example, Christmas et al. (1995) used a line detection algorithm to extract line segments that represented the roads in an image. The problem they tried to solve is exemplified in Figure 8: Where in the digital road map is the region depicted by the image? Each line segment was characterized by its length and orientation. Matching only on the basis of attributes of individual objects does not exploit the wealth of contextual information contained in the relations of objects. Christmas et al. (1995) showed that the use of binary relations resulted in a very powerful scheme for structural matching. Unlike the previously defined similarity measures, which are all global measures and quantify the global quality of matching two images,
Figure 8. Where in the digital road map is the region depicted in the image?
Christmas et al. (1995) used an object-centered measure of similarity, where two objects were matched with a certain probability according to how well their attributes and relations matched. So, their measures of similarity were probabilities referring to individual objects that were being matched. They then used probabilistic relaxation to match the line segments extracted from the image with the line segments extracted from the map, taking into consideration relative length and relative orientation of pairs of matched segments. The matching was expected to be correct within the accuracy of the measurement errors (Christmas et al., 1996a). In other words, they modelled the errors with which the relative orientation and relative lengths of pairs of matched segments could be estimated and assigned to each pair of matched segments a probability to be correct, computed from the distribution of these errors. At all steps of their algorithm, every line segment of the image was assigned a probability with which it could be matched to every line segment of the map.

Each image was represented in terms of an attribute relational graph (ARG), similar to the one shown in Figure 9. Let the set of the $N$ nodes of the graph representing the scene be $A = \{a_1, a_2, \ldots, a_N\}$. Each object $a_i$ was assigned a label $\theta_i$, which could take as its value any of the $M + 1$ model
Figure 9. The line segments in the scene and the map are represented by attribute relational graphs. The problem of matching then reduces to one of graph matching, where attributes as well as relations between the nodes have to be taken into consideration to achieve the best match.
labels that formed set $\Omega$: $\Omega = \{\omega_0, \omega_1, \ldots, \omega_M\}$, where $\omega_0$ was the null label used to label objects for which no other label was appropriate. At the end of the labelling process, it was expected that each object would have one unambiguous label value. However, labels of more than one object could have the same value (i.e., many-to-one matches were allowed). For each object $a_i$, it was assumed that a set of $m_1$ measurements $x_i$ was available, corresponding to the unary attributes of the object:

$$x_i = \left\{ x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(m_1)} \right\}$$

Examples of unary attributes are the length, color, or orientation of an object. The set of all unary measurement vectors $x_i$ made on the set $A$ of objects was denoted by $x_{i, i \in N_0} \equiv \{x_1, \ldots, x_N\}$, where $N_0 \equiv \{1, 2, \ldots, N\}$. For each pair of objects $a_i$ and $a_j$ a set of $m_2$ binary measurements $A_{ij}$ was assumed to be available:

$$A_{ij} = \left\{ A_{ij}^{(1)}, A_{ij}^{(2)}, \ldots, A_{ij}^{(m_2)} \right\}$$
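Examples of binary relations are the relative position of one object with respect to another, relative size, or orientation. The binary relations object $a_i$ sustained with the other objects in the set were denoted by $A_{ij, j \in N_i} = \{A_{i1}, \ldots, A_{i\,i-1}, A_{i\,i+1}, \ldots, A_{iN}\}$, where $N_i \equiv \{1, 2, \ldots, i-1, i+1, \ldots, N\}$. The same classes of unary and binary measurements were also made on the model, to create the model graph. These were the unary measurements, $\check{x}_\alpha$, of model label $\omega_\alpha$, and the binary measurements, $\check{A}_{\alpha\beta}$, between model labels $\omega_\alpha$ and $\omega_\beta$. The label $\theta_i$ of an object $a_i$ was given the value $\omega_{\theta_i}$, provided that it was the most probable label given all the information available for the system (i.e., all unary measurements and the values of all binary relations between the various objects). Thus, it was postulated that the most appropriate label of object $a_i$ was $\omega_{\theta_i}$ given by:

$$P\left(\theta_i = \omega_{\theta_i} \mid x_{j, j \in N_0}, A_{ij, j \in N_i}\right) = \max_{\omega_\lambda \in \Omega} P\left(\theta_i = \omega_\lambda \mid x_{j, j \in N_0}, A_{ij, j \in N_i}\right) \tag{13}$$

This quantity was eventually expressed as

$$P\left(\theta_i = \omega_{\theta_i} \mid x_{j, j \in N_0}, A_{ij, j \in N_i}\right) = \frac{P(\theta_i = \omega_{\theta_i} \mid x_i)\, Q(\theta_i = \omega_{\theta_i})}{\sum_{\omega_\lambda \in \Omega} P(\theta_i = \omega_\lambda \mid x_i)\, Q(\theta_i = \omega_\lambda)} \tag{14}$$

where

$$Q(\theta_i = \omega_\alpha) \equiv \prod_{j \in N_i} \sum_{\omega_\beta \in \Omega} P(\theta_j = \omega_\beta \mid x_j)\, p\left(A_{ij} \mid \theta_i = \omega_\alpha, \theta_j = \omega_\beta\right) \tag{15}$$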
Eqs. (14) and (15) express the matching probabilities conditioned on both unary and binary measurements as functions of the probabilities conditioned only on the unary measurements, $P(\theta_j = \omega_\beta \mid x_j)$, and information about the binary measurements. Function $p(A_{ij} \mid \theta_i = \omega_\alpha, \theta_j = \omega_\beta)$ quantifies the compatibility between match $\theta_j = \omega_\beta$ and a neighboring match $\theta_i = \omega_\alpha$. Clearly, this is a quantity that is known to us at the outset of the matching process; hence the equations effectively tell us how to update the probabilities $P(\theta_i = \omega_{\theta_i} \mid x_i)$ given information about the binary measurements. This suggests that the desired solution to the problem of matching, as defined by Eq. (13), can be obtained by combining Eqs. (14) and (15) in an iterative scheme where the probabilities $P(\theta_i = \omega_{\theta_i} \mid x_i)$ are those calculated at one level (level $n$, say) of the iteration process, and the probabilities $P(\theta_i = \omega_{\theta_i} \mid x_{j, j \in N_0}, A_{ij, j \in N_i})$ are the updated probabilities of a match at level $n + 1$:
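$$P^{(n+1)}(\theta_i = \omega_{\theta_i}) = \frac{P^{(n)}(\theta_i = \omega_{\theta_i})\, Q^{(n)}(\theta_i = \omega_{\theta_i})}{\sum_{\omega_\lambda \in \Omega} P^{(n)}(\theta_i = \omega_\lambda)\, Q^{(n)}(\theta_i = \omega_\lambda)} \tag{16}$$

where

$$Q^{(n)}(\theta_i = \omega_\alpha) \equiv \prod_{j \in N_i} \sum_{\omega_\beta \in \Omega} P^{(n)}(\theta_j = \omega_\beta)\, p\left(A_{ij} \mid \theta_i = \omega_\alpha, \theta_j = \omega_\beta\right) \tag{17}$$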
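The update rule of Eqs. (16) and (17) maps directly onto array operations. The following is a minimal sketch, assuming that the compatibilities $p(A_{ij} \mid \theta_i = \omega_\alpha, \theta_j = \omega_\beta)$ have already been evaluated and stored in a dense array; the names `relaxation_step`, `relax` and `compat`, the array layout, and the convergence test are choices made here for illustration and are not taken from Christmas et al. (1995).

```python
import numpy as np


def relaxation_step(P, compat):
    """One update of Eqs. (16)-(17).

    P      : (N, L) array, P[i, a] is the current probability that object i
             carries label a (L = M + 1 labels, including the null label)
    compat : (N, N, L, L) array, compat[i, j, a, b] is the compatibility
             p(A_ij | theta_i = omega_a, theta_j = omega_b);
             entries with j == i are ignored
    Assumes the normalising sum of Eq. (16) is non-zero for every object.
    """
    N = P.shape[0]
    # support[i, j, a] = sum over b of P[j, b] * compat[i, j, a, b]
    support = np.einsum("jb,ijab->ija", P, compat)
    # The product in Eq. (17) runs over j != i: neutralise the diagonal.
    support[np.arange(N), np.arange(N), :] = 1.0
    Q = support.prod(axis=1)                              # Eq. (17)
    unnormalised = P * Q
    return unnormalised / unnormalised.sum(axis=1, keepdims=True)  # Eq. (16)


def relax(P0, compat, max_iter=100, tol=1e-6):
    """Iterate the update until the probabilities stop changing."""
    P = np.asarray(P0, dtype=float)
    for _ in range(max_iter):
        P_new = relaxation_step(P, compat)
        if np.abs(P_new - P).max() < tol:
            return P_new
        P = P_new
    return P
```

Storing all compatibilities densely costs memory proportional to $N^2 (M+1)^2$, which is one concrete way of seeing why the procedure has to carry probabilities "of all matching to all".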
The quantity $Q^{(n)}(\theta_i = \omega_\alpha)$ expresses the support match $\theta_i = \omega_\alpha$ receives at the $n$th iteration step from the other objects in the scene, taking into consideration the binary relations that exist between them and object $a_i$. The matching probabilities were initialised considering only the lengths and orientations of the individual line segments (i.e., only the attributes of the objects). The iterative process was terminated when an unambiguous labelling was reached, that is, when each object was assigned one label only with probability 1, the probabilities for all other labels for that particular object being zero.

The process sounds very bulky, since one has to retain at all iteration steps the probabilities of all matchings to all. There are, however, ways of accelerating it, for example by pruning away matchings between very distant objects, if some prior knowledge is available about the level of deformation between image and map (Christmas et al., 1996b). So, the algorithm may run in seconds or minutes rather than hours for a typical problem. This bulkiness of the algorithm is actually its strength as well: the algorithm contains high levels of redundancy, so it can tolerate huge errors and large numbers of outliers. For example, the line segments extracted from the image do not necessarily correspond to the line segments of the map (i.e., a line segment in the image represents a particular stretch of road in the physical world). That particular stretch of road in the map representation may be represented in part by one segment and in part by another, in all possible arbitrary combinations of division, with the two map segments representing other stretches of the road as well. This is the same observation we made at the beginning: just like a pixel does not have a unique physical substance, a line segment is not really a physical object. And yet, both pixel-based and line-based methods work, due to the overkill they apply in solving the problem.

It appears that the secret of success in practice, just like in nature, is overkill: blasting the problem with multiple representations and multiple solutions to achieve tolerance of errors, outliers, occlusions, mistakes, and assumptions that do not quite apply in reality. For example, all probabilistic relaxation approaches assume the independence of the measurements. Given that the measurements we perform in an image all use the same pixels which have probably been preprocessed a few times, we
know that the independence assumption is a theoretical invention which simply allows us to proceed. And yet, probabilistic relaxation algorithms somehow recover from all that and usually work impressively well.

2. Curve Matching Methods

A very popular matching method is based on contour matching (Dai and Khorram, 1999; Li et al., 1995). For example, one may try to match contours of constant height extracted from a SAR image with similar contours representing a digital elevation model (DEM) or a map, or one may use isocontours of constant intensity extracted from the two images to be matched. The matching algorithm here tries to identify contours of the same shape. Therefore, a versatile contour shape representation scheme is needed. A good choice is the use of Freeman chain coding. Figure 10 shows an example representation of a contour using 4 and 8 connectivity. The chain code that represents the contour can be smoothed to remove unnecessary details and then treated as the signature of the curve that has to be matched with the signature of another curve. This is typically done using 1D correlation.

3. Region Matching Methods

One more step of sophistication is added to the approach when whole regions are matched between the two images. The matched regions should have similar characteristics. The similarity of the characteristics depends on the type of distortion one assumes between the two images. For example, if one assumes that the two images differ by an affine transformation, then one should describe the objects to be matched by features which are affinely invariant. Flusser and Suk (1994) used a region based approach. To cope with changes of shape in the regions between the two images, they used affinely invariant moments to describe each region, and they matched regions with the same moment values.
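Figure 10. The chain code representation of a region can be used as its signature. At the top, the chain code produced by 4 connectivity and at the bottom the signature produced using 8 connectivity.

The affinely invariant moments used were defined as follows:

$$I_1 \equiv \frac{1}{\mu_{00}^{4}} \left( \mu_{20}\mu_{02} - \mu_{11}^2 \right)$$

$$I_2 \equiv \frac{1}{\mu_{00}^{10}} \left( \mu_{30}^2\mu_{03}^2 - 6\mu_{30}\mu_{21}\mu_{12}\mu_{03} + 4\mu_{30}\mu_{12}^3 + 4\mu_{03}\mu_{21}^3 - 3\mu_{21}^2\mu_{12}^2 \right)$$

$$I_3 \equiv \frac{1}{\mu_{00}^{7}} \left[ \mu_{20}\left(\mu_{21}\mu_{03} - \mu_{12}^2\right) - \mu_{11}\left(\mu_{30}\mu_{03} - \mu_{21}\mu_{12}\right) + \mu_{02}\left(\mu_{30}\mu_{12} - \mu_{21}^2\right) \right]$$

$$\begin{aligned} I_4 \equiv \frac{1}{\mu_{00}^{11}} \big( & \mu_{20}^3\mu_{03}^2 - 6\mu_{20}^2\mu_{11}\mu_{12}\mu_{03} - 6\mu_{20}^2\mu_{21}\mu_{02}\mu_{03} + 9\mu_{20}^2\mu_{02}\mu_{12}^2 + 12\mu_{20}\mu_{11}^2\mu_{03}\mu_{21} \\ & + 6\mu_{20}\mu_{11}\mu_{02}\mu_{30}\mu_{03} - 18\mu_{20}\mu_{11}\mu_{02}\mu_{21}\mu_{12} - 8\mu_{11}^3\mu_{03}\mu_{30} - 6\mu_{20}\mu_{02}^2\mu_{30}\mu_{12} \\ & + 9\mu_{20}\mu_{02}^2\mu_{21}^2 + 12\mu_{11}^2\mu_{02}\mu_{30}\mu_{12} - 6\mu_{11}\mu_{02}^2\mu_{30}\mu_{21} + \mu_{02}^3\mu_{30}^2 \big) \end{aligned}$$

$$I_5 \equiv \frac{1}{\mu_{00}^{6}} \left( \mu_{40}\mu_{04} - 4\mu_{31}\mu_{13} + 3\mu_{22}^2 \right)$$

$$I_6 \equiv \frac{1}{\mu_{00}^{9}} \left( \mu_{40}\mu_{04}\mu_{22} + 2\mu_{31}\mu_{22}\mu_{13} - \mu_{40}\mu_{13}^2 - \mu_{04}\mu_{31}^2 - \mu_{22}^3 \right)$$

where $\mu_{pq}$ is defined by

$$\mu_{pq} \equiv \iint_{\text{region}} f(x, y)\,(x - x_t)^p\,(y - y_t)^q \, dx\, dy \tag{18}$$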
with $f(x, y)$ being the grey level image function, and $(x_t, y_t)$ the center of mass of the region.

One might have expected that region matching should be a more robust approach than curve or point matching. This is not the case, however. Regions are not usually robustly extracted from an image, and they are often extracted misshapen due to errors in the image processing methodology or due to occlusions. For example, in the above formulae, one can see that if the center of gravity of a region is miscalculated, perhaps due to slightly variable illumination, the invariance of the moments breaks down. Once the regions have been matched, all information regarding them is discarded and each region is represented by its center of gravity, which is then used in a point matching algorithm.

Region matching methods are very popular in motion detection algorithms in video sequences. The problem there lies in identifying where a pixel or a block of pixels moved between two successive frames in the video sequence. The regions matched in that case are really arbitrarily defined blocks of pixels. A block in the first image is correlated with all blocks in adjacent places, and the block that produces the maximum correlation coefficient is the one that is considered to be the matching block.
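The exhaustive block search described above can be sketched as follows. This is a minimal illustration under simple assumptions (square blocks, a square search window, grey values held in 2D NumPy arrays); the function names and the default block and window sizes are hypothetical and not taken from any of the cited works.

```python
import numpy as np


def correlation_coefficient(a, b):
    """Correlation coefficient between two equally sized grey value blocks."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0


def match_block(frame1, frame2, top, left, size=8, search=4):
    """Find where the block of frame1 at (top, left) moved to in frame2.

    All displacements (dy, dx) with |dy|, |dx| <= search are tried, and the
    one with the largest correlation coefficient is returned with its score.
    """
    block = frame1[top:top + size, left:left + size].astype(float)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidate positions that fall outside the second frame.
            if r < 0 or c < 0 or r + size > frame2.shape[0] or c + size > frame2.shape[1]:
                continue
            candidate = frame2[r:r + size, c:c + size].astype(float)
            score = correlation_coefficient(block, candidate)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift, best_score
```

The cost of this search grows with the square of the search radius for every block, which is why it is usually restricted to small displacements between successive frames.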
III. Deriving the Transformation Between the Two Images

The similarity measures described in the previous section are used to assess how similar two images are. In other words, they are used to drive the process of transforming one image so that it gradually becomes more similar to
the other image. In this section we shall discuss the ways we express the transformation from one image to the other.
A. Feature-Based Methods

In general, feature based methods produce matched pairs of points, from which the transformation needed for one image to match the other has to be inferred. What form this transformation has is a matter of choice. For example, in most cases, the transformation is assumed to be that of a rigid body, where one of the two images has simply to be rotated and translated in order to match the other. This transformation is also known as Euclidean transformation. For the 2D case, a rotation can be described by a single angle, and a translation by two numbers, the shifts along the two image axes. So, for the case of matching 2D images, this transformation depends on three parameters only. Once pairs of corresponding points have been identified, their corresponding coordinates in the two images are assumed to be related by equations of the form:

$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} + \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \tag{19}$$

Each pair of matched points supplies one such set of equations for the unknown values of parameters $\phi$, $s_1$, and $s_2$. The system of equations is solved in the least square error sense. If, in addition to rotation and translation between the two images, change of scale is also allowed, the transformation is called similarity transformation, and the above equations take the form

$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \mu \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} + \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \tag{20}$$

where $\mu$ is the scaling parameter. A more general assumed transformation is one that allows scaling by different factors along the two axes (i.e., an affine transformation) plus a possible translation. In this case the coordinates of the matched points are related by an expression of the form

$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} + \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \tag{21}$$

which depends on six parameters (namely $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$, $s_1$ and $s_2$). This transformation is a special case of a polynomial transformation, according
to which $x_1$ and $y_1$ are expressed as polynomial functions of $x_2$ and $y_2$, not necessarily of the first order.

In all cases, if the system of equations is solved in the least square error sense, wrongly matched pairs of points may cause significant errors in the estimated parameters. To avoid failures due to outliers, it has been proposed to use robust techniques for matching point sets. For example, the simple technique described earlier based on the distance transform (see Figure 4) is bound to suffer from the presence of outliers, as even a single outlier may change dramatically the values of the distance transform. The presence of erroneous matches will severely damage the solution of the system of equations used to define the transformation parameters. The RANSAC method has been proposed as an alternative to the least square error solution, in order to avoid the influence of the outliers (Kim and Im, 2003).

In some cases, the global transformation between the two images is not derived explicitly. Instead, the identified paired points are used to extract further information from the pair of images. This is the case, for example, in the problem solved by Christmas et al. (1995), where the problem was to identify where in the reference “image” (the map) the sensed image was fitting best, or in the stereo vision problem solved by Georgis et al. (1997), where the idea was to use the positions of the matched points to infer 3D information about the imaged scene.

In some cases, it is known a priori that the whole image did not change like a rigid body. In such a case, the parameters of the assumed transformation are extracted locally and they are only valid for small patches of the image. In general, this is not a very good approach as it creates seams between regions inside which different parametric transformations apply. Such problems can be avoided if pixel-based elastic registration methods are used.
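To make the contrast between the least square error solution and a robust one concrete, the sketch below estimates the six parameters of the affine model of Eq. (21) from matched points, first by ordinary least squares and then inside a basic RANSAC loop. It is a generic illustration written for this discussion, not the procedure of Kim and Im (2003); the function names, the number of trials, and the inlier tolerance are all assumptions.

```python
import numpy as np


def fit_affine(p2, p1):
    """Least square error estimate of Eq. (21): p1 ~ A @ p2 + s.

    p2, p1 : (K, 2) arrays of matched points (x2, y2) and (x1, y1)
    Returns the 2x2 matrix A and the translation vector s.
    """
    p2 = np.asarray(p2, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    design = np.hstack([p2, np.ones((len(p2), 1))])      # rows [x2, y2, 1]
    params, *_ = np.linalg.lstsq(design, p1, rcond=None)  # shape (3, 2)
    return params[:2].T, params[2]


def ransac_affine(p2, p1, n_trials=200, inlier_tol=1.0, seed=0):
    """Basic RANSAC: fit to random minimal samples, keep the largest consensus."""
    rng = np.random.default_rng(seed)
    p2 = np.asarray(p2, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    best_inliers = np.zeros(len(p2), dtype=bool)
    for _ in range(n_trials):
        # Three matched pairs determine the six affine parameters.
        sample = rng.choice(len(p2), size=3, replace=False)
        A, s = fit_affine(p2[sample], p1[sample])
        residual = np.linalg.norm(p2 @ A.T + s - p1, axis=1)
        inliers = residual < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least squares fit using only the pairs in the consensus set.
    A, s = fit_affine(p2[best_inliers], p1[best_inliers])
    return A, s, best_inliers
```

With wrongly matched pairs present, `fit_affine` applied to all pairs is pulled away from the true parameters, whereas `ransac_affine` discards the pairs that disagree with the dominant transformation before the final fit.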
B. Pixel-Based Methods

The avoidance of parametric transformations altogether allows one to have much more flexibility in the inferred transform: one assigns to each pixel a vector which indicates how it moves between the registered images. This creates a dense flow field, which is very popular in motion detection and matching in video sequences. The creation of a flow field allows elastic object registration. This is of particular importance in medicine. Figure 11 shows such a flow field from the elastic registration of brain volume data presented by Kovalev and Petrou (1998). The degree of distortion needed for one image to match the other may serve in such a case as quantification of the process which caused the distortion. For example, Figure 12 shows
Figure 11. The arrows indicate how each voxel has to change position, so that one MRI brain scan matches another of the same patient, obtained with a difference of a few months.
two slices extracted from the registration of 3D brain data, from two regions of the brain, one of which contained a growing tumor, while the other did not. The registered volumes were referring to the same patient, but they were taken four months apart. It is very clear which slice comes from the region with the tumor. The value of the cost function defined in Barrow et al. (1977) by Kovalev and Petrou (1998) can quantify this distortion and it may be used as an extra indicator of the condition. To achieve such a ‘‘flow’’ field when registering images, one has to proceed in very small steps: The steps are guided by the desire to reduce the cost function of dissimilarity between the two images [e.g., see Eq. (9)], or to increase the similarity between the images [e.g., see Eq. (1) and (2)]. In either case, the function that is optimized depends on thousands of unknown parameters: the components of the flow vector associated with each pixel or voxel. In addition, this dependency is highly non-linear and certainly not quadratic. The optimization, therefore, has to be done by some stochastic technique that explores the highly dimensional solution space in some intelligent way. Christensen et al. (1994) did exactly that, and came up with an elastic volume registration algorithm which needed many hours on a powerful computer to run. Stochastic optimization techniques are known to be notoriously slow, as they explore the configuration space by taking small steps. For example, simulated annealing at each step updates only the value of a single unknown variable (and there are thousands of them) and then dozens if not hundreds of passes are needed. There are various shortcuts one may use to accelerate the process. Kovalev and Petrou (1998) proposed the use of a library of global distortion operators. Each operator is invoked at random, with random values of the parameters on which it depends, and it
Figure 12. At the top: a slice of a 3D volume image, distorted by the process of registering one volume of data with another, belonging to the same patient but acquired a few months apart, when there is a tumor growing in that region. At the bottom: a slice of 3D data distorted by the process of registering the volume image with one acquired a few months earlier from the same subject, when there is no tumor growing. The mild distortion is due to natural human tissue variation. Quantifying these distortions allows one to use the second in order to calibrate the first, and thus come up with another indicator of the pathology observed in the first case.
is applied to a randomly chosen voxel. Each operator implies a distortion which decays exponentially away from the central point. The accumulated effect of all those “global” operators is very effective in producing a globally inhomogeneous distortion effect, as shown in Figures 11 and 12. The approach they used was a “greedy” one: the effect of an invoked operator on the image was accepted only if it reduced the cost function. So, their method could not escape from local optima, but it was many times faster
than that of Christensen et al. Their method was also accelerated because the values of whole sets of voxels were updated in one go, instead of the values of a single voxel at a time. Further gains were achieved by invoking the operators according to a probability density function expressing the usefulness of each operator, estimated during some initial training run of the algorithm: during a training stage, an operator was credited as useful if its effect was to reduce the cost function and the distortion it proposed was accepted. Operators which kept proposing distortions that kept being rejected, because their effect would have been to increase the value of the cost function, were in subsequent steps chosen with probability lower than that of other operators.

The “library” of operators they used consisted of three members only: exponential growth, exponential shrinkage, and shear distortion. The effect of each one of these operators on the volume image, when applied to a voxel, is shown in Figure 13. The effect of each operator was defined as follows: If the exponential growth (shrinkage) operator was chosen, then an arbitrary voxel $i$ was chosen at random, and all other voxels $k$ were shifted radially away from (towards) voxel $i$ by distance $d_k$ given by equation $d_k = r e^{-g d_{ik}}$, where $r$ and $g$ were some parameters and $d_{ik}$ was the distance of voxel $k$ from $i$. The values of the parameters were chosen so that the spatial order of the voxels was preserved. If the translation operator was chosen, they proceeded to choose at random a pair of voxel positions $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$, within a certain distance $d$ from each other. All the remaining voxels of image $I_2$ were shifted in location according to the following law: A voxel $k$ was moved in the direction of the vector defined from $i$ to $j$, and by a distance given by $d_k = d_{ij} e^{-s d_{ik}}$, where $d_{ij}$ was the distance between $j$ and $i$, $d_{ik}$ was the distance between $k$ and $i$, and $s$ was the “springiness” parameter that controlled the severity of the distortion.

The algorithm required 44 additions/subtractions per voxel, 14 multiplications/divisions, 1 exponential operation, and 1 square root. On a Pentium 133 MHz PC it took 1.88 s per iteration on a 100,000 voxel object. In this time the authors also included the time to convert the
Figure 13. The three operators used by Kovalev and Petrou (1998) to produce volume distortions at randomly chosen positions of the data, with randomly chosen parameters, and thus drive the process of registration by global optimization.
numbers from the 4 bytes with which they were stored to save memory, to 8 bytes that were needed for performing the calculations. Typically, about 50,000 iterations were needed for two volumes to be registered. This meant that to register two volumes using a PC, several hours were needed. However, on a 0.8 GigaFLOPS machine, on which other flexible volume registration methods (Christensen et al., 1994; Thompson and Toga, 1996) were reputed to require 9 hours, the Kovalev and Petrou algorithm would need only minutes.

Another way of accelerating such approaches is to use a multiresolution technique. The philosophy of a multiresolution optimization method is to create a “grainy” structure in the solution space. Each “grain,” or “blob,” represents a whole set of possible solutions. The optimization method then jumps from one grain to the other (i.e., from one set of possible solutions to the other) until it lands in the grain which contains the optimal solution. It is only at that stage that small careful steps of updating the values associated with individual pixels are taken, until the globally optimal solution is reached. The problem of course is how one should choose these “grains” so that the structure of the configuration space is preserved in the reduced resolution and, most important of all, the correct optimal solution is preserved. Kovalev and Petrou (1998) imposed a grainy structure in the solution space in an ad hoc way, where each grain consisted of all those solutions which, for example, had all voxels shifted away from a particular center, according to a particular rule. Considering various operators in effect allowed them to sample these grains, permitting only jumps from grain to grain which were always leading towards the optimal solution (i.e., always reducing the cost function).
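The radial operators and the greedy acceptance rule of Kovalev and Petrou (1998) can be illustrated with a small sketch. It is a simplification made here for illustration: it displaces a list of point positions instead of resampling a voxel volume, the names `radial_operator` and `greedy_step` and the parameter ranges are hypothetical, and the cost function is supplied by the caller. For the growth operator, a point at distance $\rho$ from the center moves to $\rho + r e^{-g\rho}$, which increases monotonically with $\rho$ whenever $rg < 1$; this is one simple condition (our own observation, not quoted from the paper) under which the radial order of the points is preserved. For shrinkage, $r$ must additionally be small enough that points close to the center are not pushed past it.

```python
import numpy as np


def radial_operator(points, centre, r, g, shrink=False):
    """Shift points radially away from (or towards) a centre.

    Each point k at distance d_ik from the centre moves by
    d_k = r * exp(-g * d_ik), so the distortion decays exponentially
    with distance from the centre.
    """
    points = np.asarray(points, dtype=float)
    centre = np.asarray(centre, dtype=float)
    offset = points - centre
    dist = np.linalg.norm(offset, axis=1)
    shift = r * np.exp(-g * dist)
    # Unit radial directions; a point coinciding with the centre has no
    # direction and is left where it is.
    direction = np.divide(offset, dist[:, None],
                          out=np.zeros_like(offset), where=dist[:, None] > 0)
    sign = -1.0 if shrink else 1.0
    return points + sign * shift[:, None] * direction


def greedy_step(points, cost, rng, r_max=0.5, g=1.0):
    """Propose one random operator; accept it only if the cost decreases."""
    centre = points[rng.integers(len(points))]
    candidate = radial_operator(points, centre,
                                r=rng.uniform(0.0, r_max), g=g,
                                shrink=bool(rng.random() < 0.5))
    return candidate if cost(candidate) < cost(points) else points
```

Because a rejected proposal leaves the configuration unchanged, repeated calls to `greedy_step` can never increase the cost, which is also why such a scheme cannot climb out of a local optimum.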
The theoretically optimal way to impose grainy structure in the solution space, however, is to invoke the renormalization group transform (Gidas, 1989; Petrou, 1995). The renormalization group transform guarantees that the correlations between pixels/voxels are retained at all levels of resolution, and thus the optimal solution is also retained. It turns out that the equations of the renormalization group transform are only solvable in an analytic way in very few cases. Nicholls and Petrou (1994) proposed an alternative transform, the supercoupling transform, which, although it does not preserve the correlations of the pixels/voxels in the reduced resolution space, guarantees the preservation of the optimal solution, which is really what is important. These concepts are demonstrated in Figure 14. In such an approach one creates a pyramid of coarse image versions. However, as one tries to coarsen the solution space and not just the image space, this pyramid of images is not created in the conventional way (i.e., by low pass filtering and subsampling the image) as it is done in other cases (Dengler and Schmidt, 1988). Instead, image and model are coarsened together and create
Figure 14. The cost function depends on thousands of variables, namely the components of the motion vectors associated with all pixels, which show where each pixel moves between the two registered images. A solution of the problem is a point in this multidimensional solution space, where the value of each unknown is measured along one of the axes. Stochastic optimization techniques home in on the global solution through a series of small steps, as shown in (a), and so they are very slow. Multiresolution methods use coarse solutions. Each coarse solution is consistent with many detailed solutions, so it may be thought of as representing a whole set of fine detail solutions. The coarse solutions are represented by crosses in (b) and the fine solutions represented by each cross are shown here as bubbles (grains). In a multiresolution approach, we first solve the problem by improving the coarse solution (i.e., we jump from grain to grain) until we are in the grain/bubble that contains the optimal solution. Then we proceed with careful small steps just like in the single resolution approach, until the global optimum is obtained.
a coarse “image” the values of which are the result of the original pixel values and the model parameters, appropriately blended. Such a coarse “image” is compatible with many higher resolution “images.” Thus, it can be thought of as representing all those solutions which, if coarsened, would have given rise to this coarse solution. Processing such a coarse grid in order to identify the best solution is then equivalent to moving from one set of full resolution solutions to another. Bober et al. (1998) applied this transformation to obtain the flow vectors in a video sequence motion detection problem.

Another way of speeding up these approaches is to use a multistage strategy: Bober et al. (1998) did not start from scratch in order to match the pixels in one frame with the pixels in the next frame. They first applied a block matching method, where a block of pixels in the first frame was associated with a corresponding block of pixels in the next frame, by using grey value correlation for searching exhaustively all neighboring positions
to which the original block could have been shifted between successive frames. Thus, some preliminary registration between the pixels of the first image and the pixels in the second image was inferred, and it served as the starting point for the subsequently applied optimization algorithm. So, the multiresolution optimization of cost function (9) served only for refining the matching vectors. In a similar manner, Kovalev and Petrou (1998) started by first registering the two volumes of data as rigid bodies, and then they applied their algorithm to refine the solution.

In an industrial inspection problem, Costa and Petrou (2000) followed a similar strategy: they had to inspect ceramic tiles for the correctness of the patterns depicted on them. They had first to bring the inspected tile in complete registration with the reference tile/pattern, and then subtract them point by point to detect defects. They first registered the two tiles using a rigid body rotation and translation algorithm computed by aligning the straight line edges of the two tiles detected by edge detection and the Hough transform, and then they refined the registration to sub-pixel accuracy by using a phase correlation method (Casasent and Psaltis, 1976).

Phase correlation is very powerful in detecting the translation parameters between two images. It is based on the observation that if $f(x, y)$ is a function and $F(\omega_x, \omega_y)$ its Fourier transform, then the Fourier transform of $f(x - x_0, y - y_0)$ is $\hat{F}(\omega_x, \omega_y) = F(\omega_x, \omega_y)\, e^{-j(x_0\omega_x + y_0\omega_y)}$. In other words,

$$e^{-j(x_0\omega_x + y_0\omega_y)} = \frac{\hat{F}(\omega_x, \omega_y)}{F(\omega_x, \omega_y)} \tag{22}$$

If we take the inverse Fourier transform of the expression on the right-hand side, then, if indeed the second image is a shifted version of the first image, this will be equivalent to taking the inverse Fourier transform of the complex exponential function on the left-hand side.
A complex exponential function, $e^{j\omega_0 \phi}$, is a function of a single frequency (since $e^{j\omega_0 \phi} = \cos \omega_0 \phi + j \sin \omega_0 \phi$), and its Fourier transform therefore is an impulse located at $\omega_0$. The principle of duality tells us that the inverse Fourier transform of $e^{-j(x_0 \omega_x + y_0 \omega_y)}$ will be an impulse located at the point with coordinates $(x_0, y_0)$, which are the shifting parameters between the two images. If the two images differ by a rotation as well as a translation, the search for the optimal rotation and translation parameters between the two images can be accomplished in one of two ways: either by exhaustive search (i.e., by assuming a rotation, identifying the shifting parameters for this value of the rotation, and from among all the results associated with different assumed rotation angles selecting the one that produces the strongest peak in real space), or by treating the rotation as another shift along the polar
angle, and locating the angle of rotation between the two images as another shifting parameter. The latter is based on the observation that we can ignore the shifting between the two images if we take the modulus of their Fourier transforms: the relative shift of two images affects only the phase of the transform, so by taking the modulus, we eliminate any effect the relative shift has on the Fourier transform (i.e., $|\hat{F}(\omega_x, \omega_y)| = |F(\omega_x, \omega_y)|$). If the two images then are rotated versions of each other, the moduli of their Fourier transforms will also be rotated versions of each other (i.e., shifted versions in the polar angle defined in the frequency space). But we know how to detect the relative shift between two signals: we take their Fourier transforms with respect to the shifted parameter (i.e., the polar angle in the frequency space), we divide them point by point, take the inverse Fourier transform, and identify the value of the shifting parameter from the position where the strongest peak of this inverse Fourier transform appears. Once the relative rotation of the two images has been dealt with, the second image is rotated with respect to the first, and then the relative shift of the two images is identified (Costa and Petrou, 2000). Phase correlation may also be used to identify the scale between two images. Change of scale of a real variable results in inverse scaling of the corresponding frequency variable. So, if instead of using the frequency variables directly, we use their logarithms, change in scale becomes a shift: Suppose image $f(x, y)$ is scaled to produce image $f(ax, by)$. The Fourier transform $\tilde{F}(\omega_x, \omega_y)$ of $f(ax, by)$, in terms of the Fourier transform $F(\omega_x, \omega_y)$ of $f(x, y)$, is given by $\tilde{F}(\omega_x, \omega_y) = \frac{1}{|ab|} F\left(\frac{\omega_x}{a}, \frac{\omega_y}{b}\right)$. If we plot these two functions in logarithmic coordinates, $\tilde{F}(\log \omega_x, \log \omega_y) = C F(\log \omega_x - \log a, \log \omega_y - \log b)$, where $C$ is some constant of proportionality.
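The log-frequency step can be made explicit with a short derivation. The auxiliary functions $G$ and $\tilde{G}$ are notation introduced here only for illustration, and the derivation assumes $a, b > 0$ and positive frequencies so that the logarithms exist.

```latex
\begin{align*}
\tilde{F}(\omega_x,\omega_y)
  &= \iint f(ax,by)\,e^{-j(\omega_x x+\omega_y y)}\,dx\,dy \\
  &= \frac{1}{|ab|}\iint f(u,v)\,
     e^{-j\left(\frac{\omega_x}{a}u+\frac{\omega_y}{b}v\right)}\,du\,dv
   = \frac{1}{|ab|}\,F\!\left(\frac{\omega_x}{a},\frac{\omega_y}{b}\right),
   \qquad u = ax,\; v = by, \\[4pt]
G(\xi,\eta) &\equiv F\!\left(e^{\xi},e^{\eta}\right), \qquad
\tilde{G}(\xi,\eta) \equiv \tilde{F}\!\left(e^{\xi},e^{\eta}\right), \\[4pt]
\tilde{G}(\xi,\eta)
  &= \frac{1}{|ab|}\,F\!\left(e^{\xi-\log a},e^{\eta-\log b}\right)
   = \frac{1}{|ab|}\,G(\xi-\log a,\;\eta-\log b).
\end{align*}
```

With $\xi = \log \omega_x$ and $\eta = \log \omega_y$, the scaled spectrum is thus a copy of the original shifted by $(\log a, \log b)$, and under these assumptions the constant of proportionality is $C = 1/|ab|$.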
So, scaling has again been expressed as another shift (Srinivasa Reddy and Chatterji, 1996). Kruger and Calway (1998) later generalised the approach to deal with shear as well (i.e., to register images that differ from each other by an affine transform). Recently, Kadyrov and Petrou (2003) proposed what they called the “clock algorithm” as a way of accelerating image registration by correlation. Suppose we wish to find the rotation between two images. Instead of trying all possible rotation angles, we may create $N$ replicas of one image, rotated in steps of, say, $S_1 = 360/N$ degrees, and concatenate them to form a large composite image. Then we create $M$ replicas of the other image, rotated in steps of $S_2 = S_1/M$ degrees, and concatenate them to form a second composite image. We then apply correlation between the two composite images. If the $n$th image of the first composite image matches the $m$th image of the second composite image, then the rotation between the two initial images is $(n - 1)S_1 + mS_2$. Figure 15 demonstrates this algorithm for $N = M = 4$.
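The arithmetic of the clock algorithm can be checked with a few lines. The sketch below only evaluates the formula $(n - 1)S_1 + mS_2$; the index ranges ($n = 1, \ldots, N$ and $m = 0, \ldots, M - 1$) and the function name are assumptions made for illustration, and the correlation of the composite images is not implemented.

```python
def clock_angle(n, m, N, M):
    """Rotation (degrees) implied by matching panel n of the first composite
    image with panel m of the second, using the formula (n - 1) * S1 + m * S2."""
    s1 = 360.0 / N      # coarse step of the first composite image
    s2 = s1 / M         # fine step of the second composite image
    return (n - 1) * s1 + m * s2

# With N = M = 4, the N + M = 8 stored replicas distinguish N * M = 16 angles.
N = M = 4
angles = sorted(clock_angle(n, m, N, M) for n in range(1, N + 1) for m in range(M))
print(len(set(angles)), angles[1] - angles[0])
```

The point of the construction is that the angular resolution is $S_2 = 360/(NM)$ degrees, while only $N + M$ rotated replicas have to be generated.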
Figure 15. Assume that the problem is to find the rotation angle between the two images at the top. Create $N = 4$ rotated versions of the first image, with step $S_1 = 360/4 = 90$ degrees, and concatenate them to form a single image. Then create $M = 4$ rotated versions of the second image, with step $S_2 = S_1/4 = 22.5$ degrees, and concatenate them to form the second composite image. Correlate the two composite images to find the position of maximum correlation. This will be when the third panel of the second composite image ($m = 2$) exactly matches the second panel of the first composite image ($n = 1$). Then the angle between the two initial images at the top is $(n - 1) \times 90 + m \times 22.5 = 45$ degrees. The accuracy with which this angle is estimated depends on the values of $N$ and $M$ we choose to use.
IV. Feature Extraction

In all of the previous discussion, we bypassed the issue of feature detection and concentrated on the subsequent steps which are intrinsic to the image registration problem. Feature extraction, however, is an important first step of such algorithms, and even though it is a broad subject beyond the scope of this chapter, it is worth discussing some aspects of it, in particular those aspects that have been closely developed or intertwined with the registration process. One of the first methods of feature extraction applied in remote sensing image registration is that of using a chip library (LeMoigne et al., 2001; Kim
and Im, 2003). A chip is a small image that depicts some characteristic structure that is known to be present in the image. The chip has known coordinates. The image is searched by scanning it with the chip until the position of the object depicted by the chip is identified in the image. The center of that object then is used as a point for the subsequent steps of the registration algorithm. Obviously, one repeats the process for all chips in the library that are expected to be present in the image. Other features that have been used for registration purposes are edges and corners. The extraction of corner points is a special case of extracting “interesting points.” Interesting points are points with significant information value (i.e., points that are distinct from their surroundings). The structure tensor, which measures how different the local neighbourhood of a pixel is from a flat patch, is often used for this purpose. Let us consider the image function $I(x, y)$. For small shifts $(x', y')$ away from a point $(x_0, y_0)$, the image function can be expanded into a Taylor series as follows:
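$$I(x_0 + x', y_0 + y') \simeq I(x_0, y_0) + x' \left.\frac{\partial I}{\partial x}\right|_{x=x_0,\,y=y_0} + y' \left.\frac{\partial I}{\partial y}\right|_{x=x_0,\,y=y_0} + \frac{1}{2} x'^2 \left.\frac{\partial^2 I}{\partial x^2}\right|_{x=x_0,\,y=y_0} + \frac{1}{2} y'^2 \left.\frac{\partial^2 I}{\partial y^2}\right|_{x=x_0,\,y=y_0} + x' y' \left.\frac{\partial^2 I}{\partial x \partial y}\right|_{x=x_0,\,y=y_0} \tag{23}$$
Keeping only the first order terms, we may write
$$I(x_0 + x', y_0 + y') - I(x_0, y_0) \simeq x' \left.\frac{\partial I}{\partial x}\right|_{x=x_0,\,y=y_0} + y' \left.\frac{\partial I}{\partial y}\right|_{x=x_0,\,y=y_0} \tag{24}$$
which, upon squaring, yields:
$$\left[I(x_0 + x', y_0 + y') - I(x_0, y_0)\right]^2 \simeq x'^2 I_x^2 + y'^2 I_y^2 + 2 x' y' I_x I_y = \begin{pmatrix} x' & y' \end{pmatrix} \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{25}$$
where for simplicity we have used $I_x \equiv \left.\frac{\partial I}{\partial x}\right|_{x=x_0,\,y=y_0}$ and $I_y \equiv \left.\frac{\partial I}{\partial y}\right|_{x=x_0,\,y=y_0}$. Tensor
$$\begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \tag{26}$$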
is the so-called structure tensor that can be used to identify points of interest (e.g., corners; Harris and Stephens, 1988; Vliet and Verbeek, 1995). We can easily see that if the image is flat in the vicinity of point $(x_0, y_0)$, the determinant of the structure tensor is zero.
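A minimal sketch of how the structure tensor separates flat patches, straight edges and corners is given below. One assumption goes beyond the text: the entries of Eq. (26) are summed over a small window around the pixel, as is common in corner detectors of the Harris type, because the matrix built from a single gradient vector has rank at most one and therefore a zero determinant everywhere. Central differences are used for the derivatives, and all names are hypothetical.

```python
def gradients(img, r, c):
    """Central-difference estimates of I_x and I_y at interior pixel (r, c)."""
    ix = (img[r][c + 1] - img[r][c - 1]) / 2.0  # derivative along x (columns)
    iy = (img[r + 1][c] - img[r - 1][c]) / 2.0  # derivative along y (rows)
    return ix, iy

def structure_tensor(img, r, c, half=1):
    """Entries of Eq. (26) summed over a (2*half+1) x (2*half+1) window."""
    sxx = sxy = syy = 0.0
    for rr in range(r - half, r + half + 1):
        for cc in range(c - half, c + half + 1):
            ix, iy = gradients(img, rr, cc)
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return sxx, sxy, syy

def determinant(tensor):
    sxx, sxy, syy = tensor
    return sxx * syy - sxy * sxy

# Three 7 x 7 test patterns: flat patch, vertical step edge, and corner.
size = 7
flat = [[5 for _ in range(size)] for _ in range(size)]
edge = [[1 if c >= 3 else 0 for c in range(size)] for _ in range(size)]
corner = [[1 if (r >= 3 and c >= 3) else 0 for c in range(size)] for r in range(size)]

det_flat = determinant(structure_tensor(flat, 3, 3))
det_edge = determinant(structure_tensor(edge, 3, 3))
det_corner = determinant(structure_tensor(corner, 3, 3))
print(det_flat, det_edge, det_corner)
```

Under this windowed definition the determinant vanishes both for the flat patch and for the straight edge, and is positive only where the gradient direction varies inside the window, which is what makes it usable as a corner indicator.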
By keeping up to the second order terms in (23), we can see that the deviation of the local image structure away from a plane may be expressed by another tensor, namely the Hessian matrix of the function:
$$I(x_0 + x', y_0 + y') - I(x_0, y_0) - x' I_x - y' I_y \simeq \frac{1}{2}\left(x'^2 I_{xx} + y'^2 I_{yy} + 2 x' y' I_{xy}\right) = \frac{1}{2} \begin{pmatrix} x' & y' \end{pmatrix} \begin{pmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{27}$$
where for simplicity we have used $I_{xx} \equiv \left.\frac{\partial^2 I}{\partial x^2}\right|_{x=x_0,\,y=y_0}$, $I_{yy} \equiv \left.\frac{\partial^2 I}{\partial y^2}\right|_{x=x_0,\,y=y_0}$ and $I_{xy} \equiv \left.\frac{\partial^2 I}{\partial x \partial y}\right|_{x=x_0,\,y=y_0}$. The Hessian tensor
$$\begin{pmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{pmatrix} \tag{28}$$
has also been used to detect points of interest in the image, particularly corners (Baudet, 1978; Dreschler and Nagel, 1981). Features in general are points that are somehow distinct from their surroundings. To be distinct means to differ; e.g., the first or second order derivative of the image function at that point must be high. This is the idea behind the use of the structure and the Hessian tensors to identify features. However, change also manifests itself in the Fourier domain, and in particular in high frequencies. So, another way of finding distinct points is to identify points which have high local energy. Local energy in a particular frequency band implies the necessity to localize the information both in the space and in the frequency domain. This can be achieved with the help of Gabor functions, which yield band-limited local information and can be used to extract features as “energetic” points irrespective of whether they are corners or edges. Such features have been used for image registration in quite unstructured images (Zheng and Chellappa, 1993) and even multimodal images (Li and Zhou, 1996a). Some of the most interesting feature extraction methods that have been proposed recently for the problem of image registration are based on the use of wavelets. Several variations of the method exist and one can use wavelets to extract contours and edges. However, the general idea of all wavelet-based methods is as follows: Use a pair of low and high pass filters that have been specifically designed to allow the analysis of an image into components of multiple resolution with minimum redundancy. Such pairs of filters have been designed and published by various researchers of wavelet theory. The most commonly used wavelets are the Haar wavelet, the Daubechies wavelets and the Mallat wavelets. Note that, again, explaining wavelet analysis theory is beyond
the scope of this chapter, so we assume that the reader is familiar with some basic theory, and we present here only the way this theory is applied to image registration. Wavelet based schemes in general follow the steps below:
• Convolve the image with the low pass filter along the x axis, and the output with the low pass filter along the y axis to produce the low-low pass component of the image. Subsample the result by a factor of 2 in each direction, to produce the LL1 sub-band of the image.
• Convolve the image with the low pass filter along the x axis, and the output with the high pass filter along the y axis to produce the low-high pass component of the image. Subsample the result by a factor of 2 in each direction, to produce the LH1 sub-band of the image.
• Convolve the image with the low pass filter along the y axis, and the output with the high pass filter along the x axis to produce the high-low pass component of the image. Subsample the result by a factor of 2 in each direction, to produce the HL1 sub-band of the image.
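The three steps above can be sketched in a few lines. The sketch assumes the unnormalised Haar pair (average and half-difference of two neighbouring samples) purely for illustration, applies the filters as sliding weighted sums with wrap-around at the border, and uses hypothetical function names; any of the published filter pairs could be substituted.

```python
LOW = (0.5, 0.5)     # Haar low pass: average of two neighbouring samples
HIGH = (0.5, -0.5)   # Haar high pass: half-difference of two neighbours

def filter_along_x(img, taps):
    """Apply a 2-tap filter within each row (x direction), circular border."""
    w = len(img[0])
    return [[taps[0] * row[c] + taps[1] * row[(c + 1) % w] for c in range(w)]
            for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def filter_along_y(img, taps):
    """Apply the same filter down each column (y direction)."""
    return transpose(filter_along_x(transpose(img), taps))

def subsample(img):
    """Keep every other line and every other column."""
    return [row[::2] for row in img[::2]]

def wavelet_level(img):
    """One decomposition level; returns the LL, LH and HL sub-bands."""
    low_x = filter_along_x(img, LOW)
    low_y = filter_along_y(img, LOW)
    ll = subsample(filter_along_y(low_x, LOW))    # low along x, low along y
    lh = subsample(filter_along_y(low_x, HIGH))   # low along x, high along y
    hl = subsample(filter_along_x(low_y, HIGH))   # low along y, high along x
    return ll, lh, hl

# A 4 x 4 image whose grey value changes only along x.
image = [[0, 4, 4, 4] for _ in range(4)]
ll1, lh1, hl1 = wavelet_level(image)
print(ll1, lh1, hl1)
```

Because the test image varies only along x, the whole response appears in HL1 and LH1 is identically zero. Calling `wavelet_level(ll1)` would give the second level sub-bands.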
For a complete image decomposition, one has to convolve the image with the high pass filter along both the x and y directions and subsample, in order to produce the HH1 component of the signal. This component, however, is often not used in practice for image registration, and its creation may be entirely omitted in that case. The subsampling is done by omitting every other line and every other column. The result, therefore, is 1/4 the size of the original image. The process can be repeated on the LL1 sub-band to produce the second order sub-bands LL2, LH2 and HL2, which would be 1/16 the size of the original image. Note that sub-bands LH and HL have been produced by smoothing the image in one direction and high-passing it (effectively differentiating it) along the other direction. Therefore, if the high pass filter is an antisymmetric function (i.e., a first derivative estimator), the results are nothing other than estimates of the two components of the gradient vector along the respective differentiating directions. In other words, sub-band LH can be thought of as the component of the gradient vector along the y direction, and sub-band HL as the component of the gradient vector along the x direction. In that case, the LH and HL bands may be used to indicate the components of the flow field between the two images (Olivio et al., 1995). If more than one level of resolution is used, these two sub-bands can be thought of as containing the components of the gradient vector in progressively coarser resolutions of the image. Any subsequent processing is applied to the coarsest resolution first, and the
outcome is propagated to the finer resolution. Djamdji et al. (1993) treated the high frequency bands of wavelet decomposition as the gradient component maps, and processed them by performing non-maxima suppression (i.e., keeping only the local maxima in each map) and thresholding, before using them to perform multiresolution image registration. An improved version of their algorithm was proposed in Corvi and Nicchiotti (1995). Instead of using each band separately, we may follow one of two options: we may combine the two sub-bands to produce the magnitude of the gradient vector (Fonseca and Costa, 1997), or we may combine the two sub-bands to produce an output corresponding to the output of the Laplacian of Gaussian filter (Li and Zhou, 1996b). The latter is correct only if the high pass filter used is a symmetric one (i.e., one that estimates the second derivative of the image function). In the first case we compute
$$G_k(i, j) = \sqrt{\left(LH_k(i, j)\right)^2 + \left(HL_k(i, j)\right)^2} \tag{29}$$
where $k$ indicates the level of resolution, $LH_k(i, j)$ and $HL_k(i, j)$ are the values pixel $(i, j)$ has in the corresponding sub-bands in resolution $k$, and $G_k(i, j)$ is the magnitude of the gradient vector for the particular pixel in the particular level of resolution. Then, we may keep the X% of pixels with the strongest gradient magnitudes in the reference and sensed image. This will result in thick stripes of pixels around the edges of the two images. One then matches these sets of edge pixels between the two images in the coarsest resolution and registers the two images in that resolution. The registration of the next resolution up is achieved in the same way, only now the matching between the two sets of edge pixels is sought in a restricted space, because the best matching found in the coarse resolution limits how far away the match of a certain edge pixel may be sought. This way the solution is propagated through all levels, up to the full image level. In a different scenario, we proceed on the basis of the following observation: We remember that the Laplacian is the sum of the second derivatives of a function. These are usually estimated by convolving the image with a smoothing filter along one axis and the output with a filter estimating the second derivative along the other axis, repeating the process by exchanging the order of the axes, and subsequently summing the two results. When the smoothing filter is a Gaussian and the differentiating filter is the second derivative of a Gaussian, the output is that of a Laplacian of Gaussian (LoG) filter. So, if we use as high pass filter in wavelet analysis one that looks like the second derivative of a Gaussian (i.e., like a sombrero or
Mexican hat filter), by adding the two sub-bands LH and HL we may obtain something analogous to the output of the Laplacian of Gaussian filter (Li and Zhou, 1996b). So, we may write:
$$LoG_k(i, j) \simeq LH_k(i, j) + HL_k(i, j) \tag{30}$$
where $LoG_k(i, j)$ is the value of the Laplacian of Gaussian filter at pixel $(i, j)$ in the $k$th resolution. It is known that edges manifest themselves as points of zero value in the output of an LoG filter. So, after the summation of the two sub-bands of the image, one may compute the zero crossings in the output. Note that edges computed as the zero crossing points are continuous, since there is always a zero crossing between two neighboring pixels with $LoG_k(i, j)$ values of opposite sign. This way, one may create contours from each image and then follow one of the methods of contour matching discussed in the previous section. Again, one has to start from the coarsest resolution and propagate the solution to the finer resolutions in the form of constraints imposed in the search space of the next resolution up, from the known matchings obtained at the current resolution. There is a variation of this method where the contour pixels identified are endowed with some strength value computed from Eq. (29), instead of simply being points of binary contours. The strength associated with each contour point may be used to improve the quality of the obtained matching. All these variations of the wavelet based image registration methods are schematically shown in Figure 16. The idea of using features is to capture some information on the local structure of the image. However, features tend to be sparse, and some researchers have advocated the use of dense local structure representations. For example, Nyul et al. (2001) suggested the use of cross correlation or mutual information for registering 3D volume images, in conjunction not with grey values, but with “scale” values: each voxel carries a number that is the radius of the largest sphere centered at the voxel with homogeneous intensities.
More recently, Petrou and Lazaridis (2003) proposed that each pixel carries a number that is formed from the expansion of the local neighbourhood around the pixel in terms of Walsh basis images (Petrou and Bosdogianni, 2000). The coefficients of this expansion quantify how much the local structure looks like a horizontal positive going edge, a horizontal negative going edge, a corner of certain polarity and so on. Normalizing these coefficients and using them as digits in an appropriately chosen number system to create a unique number to represent the local structure allowed them to perform correlation between images using those densely distributed feature values.
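Returning to the two sub-band combinations of Eqs. (29) and (30), the sketch below shows one possible reading of them: the gradient magnitude with selection of the strongest X% of pixels, and the sum of the sub-bands followed by detection of sign changes between neighbouring pixels. The function names, the tie-breaking in the selection, and the choice of marking the left or upper pixel of each sign change are assumptions made for illustration.

```python
import math

def gradient_magnitude(lh, hl):
    """Eq. (29): per-pixel magnitude computed from the LH and HL sub-bands."""
    return [[math.hypot(a, b) for a, b in zip(row_lh, row_hl)]
            for row_lh, row_hl in zip(lh, hl)]

def strongest_pixels(g, percent):
    """Coordinates of the `percent` % of pixels with the largest magnitude."""
    ranked = sorted(((v, i, j) for i, row in enumerate(g)
                     for j, v in enumerate(row)), reverse=True)
    keep = max(1, round(len(ranked) * percent / 100.0))
    return sorted((i, j) for _, i, j in ranked[:keep])

def log_zero_crossings(lh, hl):
    """Eq. (30), followed by sign changes between horizontal or vertical
    neighbours; the left/upper pixel of each sign change is reported."""
    log = [[a + b for a, b in zip(row_lh, row_hl)]
           for row_lh, row_hl in zip(lh, hl)]
    rows, cols = len(log), len(log[0])
    crossings = set()
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0)):
                ni, nj = i + di, j + dj
                if ni < rows and nj < cols and log[i][j] * log[ni][nj] < 0:
                    crossings.add((i, j))
    return sorted(crossings)

# Toy 2 x 2 sub-bands, chosen only to make the arithmetic easy to follow.
g = gradient_magnitude([[3, 0], [0, 0]], [[4, 0], [0, 1]])
top = strongest_pixels(g, 25)
zc = log_zero_crossings([[1, -1], [1, -1]], [[1, -1], [1, -1]])
print(g, top, zc)
```

In a multiresolution scheme these functions would be applied to the sub-bands of each level in turn, starting from the coarsest one.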
Figure 16. Image registration with wavelets.
V. Literature Survey

This section presents some image registration papers within the grand picture presented in the previous sections. A survey paper of image registration methods proposed prior to 1992 is Brown (1992). A comparative study of 16 brain image registration techniques applied to register PET, CT, and MRI volume data can be
found in West et al. (1997). Another comparative study of image registration techniques used in remote sensing can be found in Fonseca and Manjunath (1996). One of the first point matching methods for image registration was proposed by Stockman et al. (1982), who used a robust technique of vote counting to extract the parameters of a transformation between two images, by considering two pairs of matched points and inferring the parameters of the transformation from them. The finally accepted values of the parameters were those supported by most pairs of matched points. Stockman et al. (1982) extracted the points to be matched as intersections of extracted lines. Goshtasby and Page (1984) on the other hand, used as points to be matched the centers of gravity of regions extracted in the two images and employed probabilistic relaxation to obtain the matching. In another paper, Goshtasby et al. (1986) used the robust vote counting method of Stockman et al. (1982) to define the best match of the centers of gravity of extracted regions in the two images. They then refined the extracted regions and obtained the final transformation between the two images at a second step of the process, by solving the similarity transformation equations (see Eq. (20)) by the least square error method. Goshtasby (1988) later turned his attention to the issue of extracting the transformation between the two images, after the sets of matched points have been identified. He proposed the use of surface splines to represent the two images, which he treated as landscapes with the grey value playing the role of the height. He had to define the spline parameters of one ‘‘landscape,’’ to fit the other. This must have been one of the first elastic registration algorithms proposed. Flusser (1992) also used splines to achieve elastic image registration, but he applied them locally in small patches of the image. 
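The vote-counting idea of Stockman et al. (1982) described above can be sketched as follows. The sketch assumes a similarity transform written in complex form, $w = az + b$ with $a = s\,e^{j\theta}$, which is a convenience of this illustration rather than the notation of the chapter. Every pair of candidate matches proposes one value of $(a, b)$, and the value proposed most often is accepted. The function name and the rounding used to pool votes are assumptions.

```python
from collections import Counter
from itertools import combinations

def vote_similarity(matches, decimals=3):
    """matches: list of (z, w) pairs of complex points, some possibly wrong.
    Returns the most voted (a, b) of w = a * z + b and its number of votes."""
    votes = Counter()
    for (z1, w1), (z2, w2) in combinations(matches, 2):
        if z1 == z2:
            continue  # two identical points do not determine a transform
        a = (w2 - w1) / (z2 - z1)   # rotation and scale from the pair
        b = w1 - a * z1             # translation from either point
        key = (round(a.real, decimals), round(a.imag, decimals),
               round(b.real, decimals), round(b.imag, decimals))
        votes[key] += 1
    (ar, ai, br, bi), count = votes.most_common(1)[0]
    return complex(ar, ai), complex(br, bi), count

# Five correct matches under a 90 degree rotation plus a shift, one wrong match.
true_a, true_b = 1j, 2 + 1j
points = [0j, 1 + 0j, 1j, 2 + 1j, 3 + 0j]
matches = [(z, true_a * z + true_b) for z in points]
matches.append((5 + 5j, 0j))  # outlier: an incorrect correspondence
a, b, count = vote_similarity(matches)
print(a, b, count)
```

The ten pairs formed by the five correct matches all vote for the same parameters, whereas pairs that involve the outlier scatter their votes, which is the source of the robustness of the method.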
Five years after the paper by Goshtasby and Page (1984), Ton and Jain (1989) also used as landmark points the centers of easily identifiable regions, e.g., lakes or oil drills in a remote sensing image. The sets of landmark points were then matched not only on the basis of their individual attributes (e.g., a water point matched only to water points in the other image) but also on the basis of their binary relations. So, these authors were some of the first people to use contextual information in the point matching process. Four years later, Christmas et al. put the use of binary relations on a firm mathematical basis and used them for matching linear segments as opposed to just points. A method of hierarchical probabilistic relaxation for image matching was proposed in a brief note by Zhang et al. (2000). They used the correlation coefficient between the grey levels of the matched regions as the starting probability of each matched pair of points. These probabilities were subsequently updated using relational information conveyed by considering two pairs of matched points. The updating formula they used for the probabilities was the
so-called sum rule, which Christmas et al. (1995) had shown to be valid under the assumption of low contextual information. Amit and Kong (1996), similar to Christmas et al. (1995), also used relative lengths and orientations between segments defined by landmark points represented by graphs, and dynamic programming to perform graph matching. Linear segments and relative lengths and orientations had also been used as early as 1984 by Medioni and Nevatia (1984) to register maps and images. Ton and Jain (1989) used as binary relations the relative distances of pairs of points and they developed a system of updating the probabilities of the various possible matches taking into consideration attributes and relations. The matched sets of points were then used to compute the parameters of the transformation between the two images, assuming it to be a rotation and a translation, Eq. (19). Once the idea of using the characteristics of the region, of which the identified point was the center of gravity, had been put forward, it opened the way of using other region attributes to match the points, and even to escape from the idea of matching just points, and move into the idea of matching whole regions. One of the first region matching methods was presented by Ventura et al. (1990). The authors proposed the extraction of regions of interest in both images, and they described each region by a set of shape characteristics, like area, ellipticity, etc. The regions in the two images were matched on the basis of these characteristics, and their centers of gravity were identified as the corresponding sets of points that could be used to solve for the parameters of the transformation between the two images. The assumed transformation was simple: a first order polynomial, that is, the affine transform of Eq. (21).
The issue here, of course, is that if the two images are assumed to differ by an affine transformation, the descriptors used for the regions should be affinely invariant; otherwise they should not be used to identify the corresponding regions. Flusser and Suk (1994) dealt with this problem four years later, by using affinely invariant moments [see Eq. (18)] to describe the extracted regions in the two images. Later, Dai and Khorram (1999) enhanced this approach: They presented an image registration algorithm using closed image contours extracted with the help of a Laplacian of Gaussian edge operator. The shapes of the regions delineated by the extracted contours were described using a combination of affinely invariant moments and chain code representations (see Figure 10) of the contours themselves. The corresponding regions in the two images were identified on the basis of similarity with respect to the above two descriptors, and the centers of gravity of the corresponding regions were used as sets of corresponding points to find the parameters of the affine transform between the two images [Eq. (21)], just as was done by previous researchers. However, Li et al. (1995) had already introduced most of these ideas 4 years
earlier when they combined two invariant moments and chain code representation to match contours between images assumed to differ by a similarity transformation. More recently, Zana and Klein (1999) used a point matching method to register eye fundus images. The points were identified as bifurcation points of the tree of vessels in each image, and the relative angles between the branches of the tree were used to match the points and extract the parameters of the affine transformation between the two images. The angles between the tree branches were assumed invariant based on the assumption that any transformation locally can be approximated by a similarity transform [see Eq. (20)], which preserves angles. Yang and Cohen (1999) proposed some new affine invariants to improve the sensitivity of the registration algorithms to small shape details. One of the first pixel matching methods was proposed by Anuta (1970), who used the correlation function (as defined by Eq. (1)) to find the similarity transform between two images of different modalities or different spectral bands. To overcome the problem that grey values have different meaning in different modalities or spectral bands (and therefore they should not be correlated), he extracted the edges in each image (or region borders) and correlated those images, instead of the original ones. To speed the process up, he computed the correlation function in the frequency domain, but this did not allow him to deal with the issue of rotation, scaling and translation in one go. So, he first had to assess ranges of possible rotation and scaling parameter values between the two images, and then refine their values by improving on the value of the correlation function which was returning the best translation parameters for each case. The use of correlation for registering images of different modalities via edge extraction persisted for many years. In 1991, Rignot et al.
(1991) used the method to register images that were reduced to binary after the process of edge extraction. So, although this appears to be a pixel-based method, in practice it is a feature based method again, with the features being the edge pixels. The approach was compared by Rignot et al. (1991) with the method of Chamfer matching which had been proposed earlier by Barrow et al. (1977). The authors found that the Chamfer matching method was more robust than the correlation based method. Matching contours was also the method favoured by Govindu and Shekhar (1999) who argued that contour extraction is much more robust than the extraction of other features, and also their shape is more resilient to transformation changes. Their method of contour matching was based on correlating the probability density functions (normalized histograms) of the orientations of the tangents along each contour. This allowed them to identify the rotation angle between the two contours. However, they were
able to extend the approach to recover the transformation parameters between images for the cases of Euclidean, similarity, and affine transforms. How errors in the location of the extracted features (edges and straight lines) influence the registration accuracy is discussed in O’Gorman (1996). Computing the correlation between images becomes problematic when some of the data are missing due to occlusion (for example, the presence of cloud in a remote sensing image). McGuire and Stone (2000) proposed the use of the normalized convolution in the computation of the correlation coefficient, introduced in 1993 by Knutsson and Westin (1993) to solve the problem of image reconstruction from incomplete data. Normalized convolution effectively gives a weight to each pixel according to whether it is occluded or not. It then convolves the pixels and the weights and divides the two results point by point. Problems that aliasing causes for the phase correlation method [see Eq. (22)] when it is used to detect relative image shifts are dealt with by Stone et al. (2001), who claim the ability to register images with an accuracy of a few hundredths of the interpixel distance, in the presence of mild aliasing. Didon and Langevin (1998) used correlation in 3D to register MRI brain images, assuming that the two volumes differed only by a 3D rotation and a translation. An application of phase correlation techniques to augmented reality registration has been presented by Cheng and Robinson (1998). Roche et al. (2001) used as similarity measure something very similar to the first term of the cost function used by Kovalev and Petrou (1998) [see Eq. (5)] to register 3D ultrasound and MRI images. Wavelet decomposition has not only been used to extract features; it has also been used to accelerate image registration by correlation. El-Ghazawi et al. (1997) estimated the rotation and translation parameters between two images by exhaustive search and using the correlation coefficient as a measure of similarity.
However, by starting the process with the LL band of the lowest resolution and restricting the search space in the next resolution up to a range around the result of the lower resolution, they had significant gains in computational efficiency. Inconsistencies between the results that wavelet based registration techniques produce at different levels of decomposition, due to the translation sensitivity of the wavelets, are discussed in Stone et al. (1999). Rigid body registration seems to be very popular for medical applications, because it is fast, in spite of its obvious limitations (Alpert et al., 1996; Nestares and Heeger, 2000). Nestares and Heeger (2000) used the assumption of a Euclidean transformation to register MRI images, and discussed issues arising from the use of different protocols during image capture. They used a robust technique that ignored voxels which had very different values in the
two volumes that were being registered. Jacq and Roux (1995) used a genetic optimization algorithm to solve the problem of elastic volume registration. Their algorithm appears to be surprisingly fast. Elastic image registration combining physical and statistical models has been proposed by Wang and Staib (1998, 1999).
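Returning to the normalized convolution of Knutsson and Westin (1993) mentioned earlier in this section, the sketch below illustrates the operation in one dimension: the weighted signal and the weights are filtered separately and the two results are divided point by point. A binary weight (1 for visible, 0 for occluded samples), a symmetric box kernel, zero padding at the borders and the function names are all assumptions made for illustration.

```python
def convolve_same(signal, kernel):
    """Zero-padded 1D convolution with an odd-length symmetric kernel; the
    output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                total += weight * signal[j]
        out.append(total)
    return out

def normalized_convolution(values, certainty, kernel):
    """Filter certainty-weighted values and the certainties themselves, then
    divide point by point; positions with no visible support return 0.0."""
    weighted = [v * c for v, c in zip(values, certainty)]
    numerator = convolve_same(weighted, kernel)
    denominator = convolve_same(certainty, kernel)
    return [n / d if d > 0 else 0.0 for n, d in zip(numerator, denominator)]

# A constant signal with one corrupted (occluded) sample in the middle.
values = [2.0, 2.0, 99.0, 2.0, 2.0]
certainty = [1.0, 1.0, 0.0, 1.0, 1.0]
smoothed = normalized_convolution(values, certainty, [1.0, 1.0, 1.0])
print(smoothed)
```

The occluded sample contributes nothing to either sum, so the local average is formed from the visible neighbours only, and the division also compensates for the missing support at the borders.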
VI. Conclusions

Image registration methods are central to processes where more than one image is involved. In general, one tries to identify the transformation which, when applied to the one image, will make it match the other. The most general case is that of inhomogeneous elastic image registration, where each pixel or voxel is associated with its own vector which indicates the position to which it has to move to match the second image. However, methods which tackle the problem head-on are slow. That is why often the assumption of rigid body transformation is invoked. In general, it is advisable to use a global matching method first and then apply an elastic matching method, to refine the obtained match at a reduced computational cost. The next most popular assumption is that of an affine transformation. This is because many complex transformations can be approximated reasonably well by an affine transformation. All these transformations, from a mere translation to a non-linear polynomial one, are parametric representations of the general transformation where each pixel or voxel is given its own motion vector. The parameters of the transformation are usually computed in the least square error sense by considering pairs of matched points. However, robust methods should be preferred. Cross-modality registration relies heavily on the use of structural approaches, where features are matched, or pixel-based approaches where the mutual information is used as similarity measure. Structural approaches rely on extracting and matching points of interest (e.g., corners), linear segments, contours, or whole regions. These features tend to be sparse. Increasingly, however, people opt for dense structural information, to which correlation methods may be applied (Nyul et al., 2001; Petrou and Lazaridis, 2003). These approaches are alternatives to the wavelet-based approaches that may be used to extract local structure.
They either use intuitively defined structure (Nyul et al., 2001), or they use the coefficients of the expansion of the local neighborhood in terms of some orthogonal basis (Petrou and Lazaridis, 2003). In either case, they extract dense structural information at each pixel in the image.
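The least squares estimation of transformation parameters from matched points, mentioned above, may be illustrated by a minimal sketch for the affine case. It is not the procedure of any of the cited works: the function names `fit_affine` and `solve3` are hypothetical, the points are assumed to be given as (x, y) pairs, and the normal equations are solved by plain Gaussian elimination. Being an ordinary least squares fit, it is sensitive to mismatched pairs, which is why a robust estimator would be preferred in practice.

```python
def solve3(m, v):
    """Solve the 3x3 system m p = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (e.g., collinear points)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution on the upper triangular system.
    p = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][c] * p[c] for c in range(r + 1, n))
        p[r] = s / a[r][r]
    return p


def fit_affine(src, dst):
    """Least squares affine transform that maps the points src onto dst.

    Returns ((a, b, c), (d, e, f)) such that
        x' = a*x + b*y + c   and   y' = d*x + e*y + f.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    # Normal equations (A^T A) p = A^T t, where each row of A is [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atx = [0.0] * 3
    aty = [0.0] * 3
    for (x, y), (xp, yp) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            atx[i] += row[i] * xp
            aty[i] += row[i] * yp
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return tuple(solve3(ata, atx)), tuple(solve3(ata, aty))


if __name__ == "__main__":
    # Hypothetical matched points: dst is src scaled by 2 and shifted by (1, -1).
    src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    dst = [(2.0 * x + 1.0, 2.0 * y - 1.0) for x, y in src]
    print(fit_affine(src, dst))
```

The x and y parameters decouple, so the same 3 by 3 normal matrix is reused for both coordinates. With exactly three non-collinear pairs the fit is exact; with more pairs it minimizes the sum of squared residuals.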
References
Alpert, N. M., Bedrichevsky, D., Levin, Z., Morris, E. D., and Fischman, A. J. (1996). Improved methods for Image Registration. Neuroimage 3, 10–18.
Amit, Y., and Kong, A. (1996). Graphical Templates for Model Registration. IEEE Trans. on Pattern Analysis and Machine Intelligence 18, 225–236.
Anuta, P. E. (1970). Spatial registration of multispectral and multitemporal digital imagery using fast Fourier transform techniques. IEEE Trans. Geoscience Electronics 8, 353–368.
Barrow, H. G., Tenebaum, J. M., Bolles, R., and Wolf, H. C. (1977). Parametric Correspondence and Chamfer Matching: Two new techniques for Image Matching, in Proceedings of the 5th Joint Conference on Artificial Intelligence. Cambridge: Mass, pp. 659–663.
Baudet, P. R. (1978). Rotationally invariant image operators, in 4th International Conference on Pattern Recognition, pp. 579–583.
Bober, M., Petrou, M., and Kittler, J. (1998). Non-linear motion estimation using the supercoupling approach. IEEE Trans. Pattern Analysis and Machine Intelligence 20, 550–555.
Brown, L. G. (1992). A survey of image registration techniques. ACM Computing Surveys 24, 325–376.
Casasent, D., and Psaltis, D. (1976). Position oriented and scale invariant optical correlation. Applied Optics 15, 1793–1799.
Cheng, L.-T., and Robinson, J. (1998). Dealing with speed and robustness issues for video-based registration on a wearable computing platform, in Proceedings of the 2nd International Symposium on Wearable Computers. Los Alamitos, Calif.: IEEE CS Press, Vol. 24, pp. 84–91.
Christensen, G. E., Rabbit, R. D., and Miller, M. J. (1994). 3D brain mapping using a deformable neuroanatomy. Physics in Medicine and Biology 39, 609–618.
Christmas, W. J., Kittler, J., and Petrou, M. (1995). Structural matching in Computer Vision using Probabilistic Relaxation. IEEE Trans. Pattern Analysis and Machine Intelligence 17, 749–764.
Christmas, W. J., Kittler, J., and Petrou, M. (1996a). Probabilistic feature-labelling schemes: modelling compatibility coefficient distributions. Image and Vision Computing 14, 617–625.
Christmas, W. J., Kittler, J., and Petrou, M. (1996b). Labelling 2-D geometric primitives using probabilistic relaxation: reducing the computational requirements. Electronic Letters 32, 312–314.
Corvi, M., and Nicchiotti, G. (1995). Multiresolution image registration, in IEEE International Conference on Image Processing. Washington, DC, October, pp. 23–26.
Costa, C. E., and Petrou, M. (2000). Automatic registration of ceramic tiles for the purpose of fault detection. Machine Vision and Applications 11, 225–230.
Coxeter, H. S. M. (1974). Projective Geometry. University of Toronto Press: Toronto.
Dai, X., and Khorram, S. (1999). A feature-based image registration algorithm using improved chain-code representation combined with invariant moments. IEEE Trans. Geoscience and Remote Sensing 37, 2351–2362.
Dengler, J., and Schmidt, M. (1988). The dynamic pyramid-A model for motion analysis with controlled continuity. Int. J. Pattern Recognition and Artificial Intelligence 2, 275–286.
Didon, J. P., and Langevin, F. (1998). Fast 3D registration of MR brain images using the projection correlation registration algorithm. Medical and Biological Engineering and Computing 36, 107–111.
Djamdji, J.-P., Bijaoui, A., and Maniere, R. (1993). Geometrical registration of images: The multiresolution approach. Photogrammetric Engineering and Remote Sensing 59, 645–653.
Dreschler, L., and Nagel, H. H. (1981). Volumetric model and 3D trajectory of a moving car derived from monocular TV frame sequences of a street scene, in Proceedings of the International Joint Conference on Artificial Intelligence, pp. 692–697.
El-Ghazawi, T., Chalermwat, P., and Le Moigne, J. (1997). Wavelet-based image registration on parallel computers, in Proceedings of the 1997 ACM/IEEE Conference on Supercomputing. San Jose, CA, pp. 1–9.
Flusser, J. (1992). An adaptive method for image registration. Pattern Recognition 25, 45–54.
Flusser, J., and Suk, T. (1994). A moment-based approach to registration of images with affine geometric distortion. IEEE Trans. Geoscience and Remote Sensing 32, 382–387.
Fonseca, L. M. G., and Manjunath, B. S. (1996). Registration techniques for multisensor remotely sensed imagery. Photogrammetric Engineering and Remote Sensing 62, 1049–1056.
Fonseca, L. M. G., and Costa, M. (1997). Automatic registration of satellite images, in Proceedings of the Brasilian Symposium on Graphic Computation and Image Processing, IEEE Computer Society, pp. 219–226.
Georgis, N., Petrou, M., and Kittler, J. (1997). On the correspondence problem for wide angular separation of non-coplanar points. Image and Vision Computing 16, 35–41.
Georgis, N., Petrou, M., and Kittler, J. (1998). Error analysis guided 3D reconstruction without camera calibration for wide stereo. IEEE Trans. Pattern Analysis and Machine Intelligence 20, 366–379.
Gidas, B. (1989). A renormalization group approach to image processing problems. IEEE Trans. Pattern Analysis and Machine Intelligence 11, 164–180.
Goshtasby, A. (1988). Registration of images with geometric distortions. IEEE Trans. Geoscience and Remote Sensing 26, 60–64.
Goshtasby, A., and Page, C. V. (1984). Image matching by a probabilistic relaxation labelling process, in Proceedings of the 7th International Conference on Pattern Recognition, pp. 307–309.
Goshtasby, A., Stockman, G. C., and Page, C. V. (1986). A region-based approach to digital image registration with subpixel accuracy. IEEE Trans. Geoscience and Remote Sensing 24, 390–399.
Govindu, V., and Shekhar, C. (1999). Alignment using distributions of local geometric properties. IEEE Trans. Pattern Analysis and Machine Intelligence 21, 1031–1043.
Harris, C., and Stephens, M. (1988). A combined corner and edge detector, in Proceedings of the 4th Alvey Vision Conference, pp. 189–192.
Jacq, J.-J., and Roux, C. (1995). Registration of 3-D images by genetic optimization. Pattern Recognition Letters 16, 823–841.
Kadyrov, A., and Petrou, M. (2003). Fast registration for 2D images: the clock algorithm, in Proceedings of the International Conference on Image Processing. Barcelona, Spain.
Kim, T., and Im, Y.-J. (2003). Automatic Satellite Image Registration by Combination of Stereo Matching and Random Sample Consensus. IEEE Trans. Geoscience and Remote Sensing 41, 1111–1117.
Knutsson, H., and Westin, C.-F. (1993). Normalised and differential convolution: Methods for interpolation and filtering of incomplete and uncertain data, in IEEE Conference on Computer Vision and Pattern Recognition, pp. 515–523.
Kovalev, V. A., and Petrou, M. (1998). Non-rigid volume registration of medical images. J. Computing and Information Technology 6, 181–190.
Kruger, S., and Calway, A. (1998). Image registration using multiresolution frequency domain correlation, in The British Machine Vision Conference. ISBN 1-901725-04-9, pp. 316–325.
Le Moigne, J., Netanyahu, N. S., Masek, J. G., Mount, D. M., and Goward, S. N. (2001). Robust Matching of Wavelet Features for sub-pixel Registration of Landsat Data, in Proceedings of the
International Geoscience and Remote Sensing Symposium, IGARSS2001. Sydney, Australia, July 9–13.
Li, H., Manjunath, B. S., and Mitra, S. K. (1995). A contour-based approach to multisensor image registration. IEEE Trans. Image Processing 4, 320–334.
Li, H. H., and Zhou, Y.-T. (1996a). Automatic visual/IR image registration. Optical Engineering 35, 391–400.
Li, H. H., and Zhou, Y.-T. (1996b). A wavelet-based point feature extractor for multi-sensor image registration, in SPIE Proceedings, Wavelet Applications III. Vol. 2762, pp. 524–534.
Medioni, G., and Nevatia, R. (1984). Matching Images Using Linear Features. IEEE Trans. on Pattern Analysis and Machine Intelligence 6, 675–685.
McGuire, M., and Stone, H. S. (2000). Techniques for multiresolution image registration in the presence of occlusions. IEEE Trans. Geoscience and Remote Sensing 38, 1476–1479.
Nestares, O., and Heeger, D. J. (2000). Robust multiresolution alignment of MRI brain volumes. Magnetic Resonance in Medicine 43, 705–715.
Nicholls, G. K., and Petrou, M. (1994). On multiresolution image restoration, in Proceedings of the 12th International Conference on Pattern Recognition. Jerusalem, Vol. III, pp. 63–67.
Nyul, L. G., Udupa, J. K., and Saha, P. K. (2001). Task-specific comparison of 3-D image registration methods, in Medical Imaging 2001: Image processing, Proceedings of SPIE. Vol. 4322, pp. 1588–1598.
O’Gorman, L. (1996). Subpixel precision of straight-edged shapes for registration and measurement. IEEE Trans. Pattern Analysis and Machine Intelligence 18, 746–751.
Olivio, J.-C., Deubler, J., and Boulin, C. (1995). Automatic registration of images by a wavelet-based multiresolution approach, in SPIE Proceedings. Vol. 2569, pp. 234–243.
Petrou, M. (1995). Accelerated optimization in Image Processing via the Renormalisation Group Transformation, in Complex Stochastic Systems and Engineering, edited by D. M. Titterington. London: Clarendon Press, pp. 105–120.
Petrou, M., and Bosdogianni, P. (2000). “Image Processing: The Fundamentals.” London: John Wiley.
Petrou, M., and Lazaridis, G. (2002). Image Registration, in Image and Signal Processing for Remote Sensing VIII, Proceedings of SPIE (24–27 September), edited by S. B. Serpico. Crete, Greece, Vol. 4885, pp. 1–12.
Rignot, E. J. M., Kowk, R., Curlander, J. C., and Pang, S. S. (1991). Automated multisensor registration: Requirements and techniques. Photogrammetric Engineering Remote Sensing 57, 1029–1038.
Roche, A., Pennec, X., Malandain, G., and Ayache, N. (2001). Rigid registration of 3D ultrasound with MR images: A new approach combining intensity and gradient information. IEEE Trans. Medical Imaging 20, 1038–1049.
Srinivasa Reddy, B., and Chatterji, B. N. (1996). An FFT-based technique for translation, rotation, and scale-invariant image registration. IEEE Trans. Image Processing 5, 1266–1271.
Stockman, G., Kopstein, S., and Benett, S. (1982). Matching images to models for registration and object detection via clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 4, 229–241.
Stone, H. S., Le Moigne, J., and McGuire, M. (1999). The translation sensitivity of wavelet-based registration. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 1074–1081.
Stone, H. S., Orchard, M. T., Chang, E.-C., and Martucci, S. A. (2001). A fast direct Fourier-based algorithm for subpixel registration of images. IEEE Trans. Geoscience and Remote Sensing 39, 2235–2243.
Thompson, P., and Toga, A. W. (1996). A surface-based technique for warping 3D images of the brain. IEEE Trans. Medical Imaging 15, 402–417.
Ton, J., and Jain, A. K. (1989). Registering Landsat images by point matching. IEEE Trans. Geoscience and Remote Sensing 27, 642–651.
della Ventura, A., Rampini, A., and Schettini, R. (1990). Image registration by recognition of corresponding structures. IEEE Trans. Geoscience and Remote Sensing 28, 305–314.
Vliet, L. J., and Verbeek, P. W. (1995). Estimators for orientation and anisotropy in digitised images, in Proceedings of the 1st Conference of the Advanced School for Computing and Imaging, pp. 442–450.
Wang, Y., and Staib, L. H. (1998). Elastic Model based non-rigid registration incorporating statistical shape information, in Medical Image Computing and Computer Assisted intervention, Lecture Notes in Computer Science. Springer, Berlin, Vol. 1496, pp. 1162–1173.
Wang, Y., and Staib, L. H. (1999). Integrated approaches to non-rigid registration in medical images, in Proceedings of the IEEE Workshop on Applications of Computer Vision, pp. 102–108.
West, J., Fitzpatrick, J. M., Wang, M. Y., Dawant, B. M., Maurer, C. R., Jr., Kessler, R. M., Maciunas, R. J., Barillot, C., Lemoine, D., Collignon, A., Maes, F., Suetens, P., Vandermeulen, D., van den Elsen, P. J., Napel, S., Sumanaweera, T. S., Harkness, B., Hemler, P. F., Hill, D. L. G., Hawkes, D. J., Studholme, C., Maintz, J. B. A., Viergever, M. A., Malandain, G., Pennec, X., Noz, M. E., Maguire, G. Q. Jr., Pollack, M., Pelizzari, C. A., Robb, R. A., Hanson, D., and Woods, R. P. (1997). Comparison and evaluation of retrospective intermodality brain image registration techniques. J. Computer Assisted Tomography 21, 554–556.
Yang, Z., and Cohen, F. S. (1999). Cross-weighted moments and affine invariants for image registration and matching. IEEE Trans. Pattern Analysis and Machine Intelligence 21, 804–814.
Zana, F., and Klein, J. C. (1999). A multimodal registration algorithm of eye fundus images using vessels detection and Hough transform. IEEE Trans. Medical Imaging 18, 419–427.
Zhang, Z., Zhang, J., Liao, M., and Zhang, L. (2000). Automatic registration of multi-source imagery based on global image matching. Photogrammetric Engineering and Remote Sensing 66, 625–629.
Zheng, Q., and Chellappa, R. (1993). A computational vision approach to image registration. IEEE Trans. Image Processing 2, 311–326.
Index
A-optimality criterion, 22 Aperture function, 112 Aperture settings, for STEM, 132 ARG. See Attribute relational graph Artifacts, 169 Astigmatism, 65 Asymptotic efficiency, 26 Asymptotic normality, 25 Asymptotic properties, 48 Atom column(s), 8–9, 22. See also Isolated atom columns; Neighboring atom columns 1s-state for, 124f attainable precision for, 90, 90t central, 92–93, 93f, 94f, 95f, 96f chromatic aberration and, 84 energy of, 116 estimation of, 116 gold, 74, 75t heavy, 64, 123 identical, 68 light v. heavy, 79–80, 100, 128–129 maximum likelihood estimates of, 98f optimal transfer function for, 126–127 position, 61, 104 position coordinates, 79, 79f, 97 quantitative estimation of, 58 rhodium, 119 silicon, 74, 75t, 76, 77f spherical aberration constant and, 95 visualizing, 61 weight of, 108 Atomic force microscopy, 3
A Aberration correctors, 12 Accelerating voltage, 12, 60, 64 fluctuations of, 66 Accumulator array, 258–259 Adaptive image-contrast enhancement, 167 Affine invariates, 285 Affine transforms, 284, 286, 287 parameters of, 285 Affinely invariant movements, 264–266 Aliasing, 286 Amplitude spectrum, 170 Anatomical atlases, 244 Angle of rotation, 274–275 Angular tilt range full, 55 limited, 56f Annealing, simulated, 269 Annular detector(s), 105, 105f attainable precision and, 13, 119 v. axial detectors, 13, 132, 145 detector-to-aperture radius of, 135f direct visual interpretability using, 131–132, 131t inner collection radius of, 118–119 objective aperture radius for, 122 optimal configuration of, 124–127 optimal defocus value and, 126 optimal inner radius of, 132, 142 optimum design of, 139, 140f radii, 125–126 SNR and, 126 thermal diffuse and, 107 Anomaly detection, 169 Anomaly visualization, 169 Anuta, P. E., 285
Atomic resolution transmission electron microscopy (TEM), 3, 4f bright-field imaging in, 29 dark-field imaging in, 29 optimality criteria of, 16 partial optimality criteria in, 22 qualitative, 5–7 quantitative, 3–4, 7–10 Rayleigh, Lord, and, 27 statistical experimental design of, 27–58 two-point resolution and, 57 Atoms electrostatic potential of, 8, 41, 63 ionization state of, 3 position of, 4 Attainable precision, 4, 11–12, 19–25 aberration correctors and, 12 annular detectors and, 13 for atom columns, 90, 90t in bright-field imaging, 149–152 chromatic aberration constant and, 74 chromatic aberration correctors and, 91, 94, 96, 103 of CTEM, 58–59, 99, 144 in dark-field imaging, 145–149, 152–157 defocus and, 74, 90 dependence of, 72 energy spread and, 74 highest, 94, 96 for high-resolution CTEM, 80 incident electron energy and, 91 isolated atom columns and, 121 for microscope settings, 73 monochromators and, 94, 96, 103 optimal experimental design and, 103 reduced brightness and, 74 rules of thumb and, 34, 57–58 simplified models and, 27 specimen drift and, 86 spherical aberration constant and, 74
spherical aberration correctors and, 91, 103 in STEM, 119 Attainable statistical precision, 15 optimality criteria and, 102 of position coordinates, 79 Attribute relational graph (ARG), 260, 261f Autocorrelations, in Fourier space, 67 Axial coma, 65 Axial detector(s), 105, 105f v. annular detectors, 13, 132, 145 attainable precision and, 119 detector-to-aperture radius of, 135f direct visual interpretability using, 131–132, 131t objective aperture radius for, 122 optimal configuration of, 124–127 optimal defocus value and, 126 optimal outer radius of, 142 outer collection radius of, 118 outer collection semi-angle of, 106 outer detector radius of, 134 radii, 125–26 spherical aberration constant and, 135
B Barrow, H. G., 285 Beam convergence, 60, 66 semi-angle of, 71, 82–83 Beghcladi, A., 175 Bessel functions theory of, 109 zeroth-order, 67 Bifurcation points, 285 Binary measurements, v. unary, 262 Binary relations, 259, 262, 283 relative distances as, 284 Binomial distribution, 18 Bloch wave theory, 8 Block matching method, 273 Blocking effects, from transform-based image enhancement, 169
Bound algorithms, 258 Brain imaging, 282–283 data, 3D, 269 MRI, 286 Branch algorithms, 258 Bright-field imaging, 29, 30–31, 99 attainable precision in, 149–152 contrast in, 106 CRLB and, 40–42 v. dark-field images, 105 scalar measures and, 41 three-dimensional observations and, 33 two-dimensional observations and, 49 Brightness, of electron source, 13 Brown, L. G., 282
C C3(p, s) coefficients Fourier transforms with, 203f Cal-Sal Walsh Hadamard transform (C-SWHT), 172 Camera movement, 244 CCD camera. See Charged coupled device camera Center of gravity, 266 Center of mass, v. field of view, 46 Central atom columns microscope settings for, 134–137 Chain code, 264 Freeman, 264 representation, 265f, 285 Chamfer transform matching, 253, 285 Channelling approximation, 68 Channelling theory, 8–9, 102–103 Charged coupled device (CCD) camera, 8 electron counting results in, 17 quantum efficiency of, 18, 68 as recording device, 60 Cheng, L.-T., 286 Chip library, 276–277 Chips, 277
Christmas, W. J., 283, 284 Chromatic aberration, 60 atom columns and, 84 correcting, 6, 61 defocus by, 66 in electron microscopy, 10, 83 incident electron energy and, 84 information limit and, 7 in STEM, 107, 110 temporal incoherence due to, 107 Chromatic aberration constant attainable precision and, 74 optimal, 83–85 position coordinates and, 84f, 85f Chromatic aberration corrector(s), 7, 61, 72 attainable precision and, 91, 94, 96 in CTEM experiments, 83 information limit and, 83 v. monochromator, 91 v. spherical aberration corrector, 96 Circular aperture function, 65, 110 Clock algorithm, 275 Closed analytical form, 27 Coarse solutions, 272–273, 273f Coding, 167 Coefficient C1(p, s), 175, 205 Coefficient C2(p, s) v. coefficient C3(p, s), 209 Coefficient C3(p, s), 205 v. coefficient C2(p, s), 209 Cohen, F. S., 285 Coherent transfer function, 62 Collinear points, 254 Complex conjugate, 38, 64 Components distance between, 58 orientation of, 58 width of, 58 Computer vision, 243 Condenser lenses, 104 Configuration space structure, 272 Conical tilting, 31 Constant background intensity, 112
Constraining, 250 Constraint radiation sensitivity as, 122 specimen drift as, 122 Contextual information, 259, 284 Contours binary, 281 extraction, 278, 284 matching, 264, 285 rotation angles between, 285 tangents along, 285 Contrast, 105 in bright-field images, 106 definitions of, 175 delocalization, 79 enhancement, 168, 205 histogram, 175 histogram equalization and, 167 local, 175 local enhancement, 175 for medical imaging, 165–166 stretching, 167 Conventional transmission electron microscopy (CTEM), 6. See also High-resolution CTEM accelerating voltage of, 12 attainable precision of, 11, 99, 144 chromatic aberration corrector in, 83 damping envelope functions in, 110 exit wave and, 108 expectation models, 99 future of, 61 information limit of, 6–7 object structure and, 62 observations, 59–60, 70, 97 optimal statistical experimental designs of, 12, 58–104 optimality criteria of, 61 performance criteria of, 60 qualitative v. quantitative, 59 scheme of, 59f simplified channelling theory and, 108 simulations, 97, 143
spherical aberrations in, 6 statistical experimental design and, 11–12 v. STEM, 131–132 transfer function for, 110 Convergent-beam electron diffraction pattern, 104 in STEM, 111 Convolution product, of exit wave, 65 Convolution theorem, 109 Correlation coefficient, 246–247, 250, 266 image registration by, 286 between images, 286 Correlation function, 285 data term faithfulness and, 252 Cosine function, modified, 167 Cosine transforms, 167, 169, 171 EMEE and, 185f enhancements by, 181, 182f image enhancement measure for, 192f log-enhancement by, 202f α-rooting by, 183f, 191f Cost function, 250–251, 273f of dissimilarity, 269 first term of, 286 multiresolution optimization, 274 purpose of, 252 Coulomb interactions, statistical, 89, 130 Covariance matrix, 20, 97 CRLB and, 34 Covering construction of, 226–229 Cramér-Rao Lower Bound (CRLB), 11–12, 20–21 analytical form of, 57 approximations of, 27, 34–45 attainability of, 97, 137–138 attainable precision and, 19 attainable statistical precision and, 15 block diagonal, 34 bright-field imaging and, 40–42
covariance matrix and, 34 dark-field imaging and, 39–40 dependence of, 72 dependencies of, 106 derivation of, 27 diagonal elements of, 76 estimators and, 16, 97 Fisher information matrix and, 92 for isolated atom columns, 121 joint probability density function and, 34 maximum likelihood estimation and, 25 minimization of, 143 of neighboring atom columns, 133 numerical minimization of, 12 in one-dimensional observations, 34–35 optimality criteria and, 24 position coordinates and, 70, 137 scalar measure of, 71 in STEM, 104, 107 structural parameters and, 61 three-dimensional observations and, 42–45 two-dimensional observations and, 36–42 Crewe detector, 118 CRLB. See Cramér-Rao Lower Bound Cross ratio of collinear points, 254 on image plane, 256 Cross-modality registration, 287 Cs-values negative, 77 Scherzer defocus and, 78 CT volume data, 282–283 CTEM. See Conventional transmission electron microscopy Cumulation function, 167 Cumulative density function, 193–195, 195f histograms of, 197
Current density, 69 Curve matching methods, 264 Cyclic groups, 221
D Dai, X., 284 Damping envelope function, 66 bandwidth of, 102 in CTEM, 110 partial temporal coherence and, 101f Dark-field imaging, 29–30 attainable precision in, 145–149 v. bright-field imaging, 105 CRLB and, 39–40 Gaussian peaks for, 138 scalar measures and, 41 three-dimensional observations and, 32–33 tomography, 152–157 two-dimensional observations and, 45–48 Data collection geometries, 31 Data mapping, 170 Data term, 252 Daubechies wavelets, 278 Debye-Waller factor, 8, 108, 122 channelling theory and, 9 Defocus, 60, 72. See also Scherzer defocus attainable precision and, 74, 90, 119 by chromatic aberration, 66 electron probe and, 116 electron wavelength and, 76 Gaussian spread of, 66 optimal value of, 76–78, 103, 126–127 precision and, 80 spherical aberration and, 76, 77f spherical aberration constant and, 127f, 128f spread of, 107 Delocalization, 80–81 Detector function, 116–117
Detector-to-aperture radius, 125f, 135, 135f DFT. See Discrete Fourier transform(s) Didon, J. P., 286 Diffraction-error disc, 120 source image and, 136–137, 136f Diffraction-limited probe, 139 size of, 144 Digital elevation model, 264 Digital image processing, 169 Digital road map, 259, 260f Direct visual interpretability, 118 optimal, 124, 126 optimal defocus value and, 127 using annular detectors, 131–132, 131t using axial detectors, 131–132, 131t Discrete Fourier transform(s) (DFT) 1-D, 170, 225f, 234, 234f, 235f 2-D, 179, 213, 219, 219f, 220f, 221, 224, 234, 235f 2-D splitting by, 238 N N-point, 220 tensor representations and, 219, 238 Discrete integrals, 225 Displacement damage, 6, 81 incident electron energy and, 84 v. specimen drift, 91 Dissimilarity, 269 Distance. See also Extinction distance in bright-field imaging, 51 from feature points, 254f precision, 41 relative, 284 standard deviation of, 47f, 51, 53f, 54f, 55f, 56f transform, 268 transform matching, 253 true v. maximum likelihood estimates, 52t for two-dimensional objects, 37 variance of, 27, 48
INDEX
Distortion, 270 degree of, 268 volume, 270f, 271f D-optimality criterion, 22–23 Dose efficiency CRLB and, 106 as performance criteria, 106 in STEM, 115 Dual representation, 256–257 Duality, principle of, 274 Dynamic scattering of electrons, 9 undoing, 10
E Earth Observing System, 169–170 Edge(s) detection, 274 enhancements, 201 extraction, 278 horizontal negative going, 281 horizontal positive going, 281 Edge-enhancing effects of magnitude reduction algorithm, 207 Eigenvalues, 38 matrix of, 39 Eigenvectors, 38 matrix of, 39 Elastic object registration, 268 Elastic registration algorithms, 283 Elastic scattering v. inelastic scattering, 118 v. thermal diffuse, 118 Elastic volume registration, 287 Electron channelling, 63f counting results, 17, 60, 70 dynamical motion of, 109 rest mass, 63 wavelength, 60, 63, 74 Electron energies intermediate incident, 103 low incident, 104
Electron microscope(s). See also Microscope settings chromatic aberration of, 83 information limit of, 60, 102 of intermediate incident electron energies, 103 of low incident electron energies, 104 numerical results, 72–98 observations, 61 point resolution of, 60 point spread function of, 67 spherical aberrations and, 78 transfer function, 65 transform function, 100 Electron microscopy aberrations in, 9–10 chromatic aberrations in, 5, 6 high-voltage, 60 intermediate voltage, 6, 60 as measuring instrument, 5 observations, 18–19 Poisson distribution in, 19 spherical aberrations in, 5 Electron probe interaction of, 110 lobes of, 105 v. parallel incident electron beam, 108 settings of, 116 in STEM, 104 Electron source demagnifying, 104 optimal reduced brightness of, 89–91 reduced brightness of, 103, 130–131 Electron tomography attainable precision of, 12 component orientation in, 58 simulations of, 57 statistical experimental design and, 11–12 three-dimensional observations and, 31 Electron wavelength, 71 defocus and, 76
electron probe and, 116 increasing, 81 Electron-electron interaction, 69 Electron-object interaction in CTEM, 59 exit wave and, 104 as performance criteria, 106 Electrons in bright-field imaging, 50t characteristics of, 3 detected, 58 dynamic motion of, 64 dynamic scattering of, 9, 62 energy spread and, 86 expected number of, 33, 99 in Gaussian peak, 46t incident, 68–69, 70 incident number of, 86 interacting v. noninteracting, 30 intrinsic energy spread of, 71 over two-dimensional projection, 33 precision and, 35 scattering, 107 specimen drift and, 102 thermal energy of, 60 Electrostatic potential of atom types, 8 of atoms, 41, 63 El-Ghazawi, T., 286 Ellipsoid of concentration, 22, 23f EME, 176, 199f, 219 1-D α-rooting and, 233f curve of, 179f definitions of, 176–178 Fourier transform and, 188f, 189f, 190f, 191f, 199f, 214f, 217f Hartley transform and, 183 image-signal processing and, 231 modified α-rooting, 232f for negative images, 213, 216 EME(n; o), 234–235, 234f minimum of, 236 EMEE, 179–180, 219 curves of, 184f image-signal processing and, 231 optimal for, 185t
Energy probability density function, 66 Energy selection slit, 71 Energy spread detected electrons and, 86 information limit and, 101–102 intrinsic, 87 of monochromators, 74, 85–89 position coordinates and, 86f, 87f, 88f radiation sensitivity and, 87 spherical aberration constant and, 87 standard deviation of, 69 variable, 71 Enhanced counters, 213 Enhancement algorithms for 2-D signals, 180 with three zones, 211 with two zones, 208–211 Enhancement measure, 177. See also Measure of enhancement curves of, 184–185 EMEE, 179–180 by entropy, 177 maximal values of, 192t Enhancement operator X(p, s), 208–209 Enhancement operators, 169, 174 Enhancement parameters favorable, 196, 198 low-favorable, 196, 198 Entropy, 249f Entropy transformation global, 167 Envelope functions, 62 E-optimality criterion, 24 Epipole, 256 Equalization, 167 Error function, 70 Estimates distribution of, 97 maximum likelihood, 52t, 56–57 maximum likelihood v. efficient, 48 of position coordinates, 49f Estimators, 15–16
CRLB and, 21, 97 least squares, 48 maximum likelihood, 16, 25–26, 48 Euclidean norm, 64 Euclidean transforms, 267, 286 Excitation coefficients, 64 Exit plane in CTEM, 59 spatial frequency from, 60 Exit waves, 7 channelling theory and, 8–9 convolution product of, 65 in CTEM, 59, 108 electron-object interaction and, 104 Fourier transform of, 111 image intensity distribution and, 110 incident electron beam and, 63 lens aberration and, 60 parametric statistical model of observations and, 62 projected structure and, 63 in STEM, 108–110 visual interpretability, 83 Expectation models, 3, 15, 17, 27, 68 CTEM, 99 for electrons, 33 Gaussian peaked, 12 of images, 57 monochromator in, 69–70 physics based, 28 in quantitative atomic resolution TEM, 8, 18 radially symmetrical, 76, 99, 100f, 122 STEM, 113–115, 139 substitution of, 76 Expectations, vector of, 17 Experimental characterization methods, 3 Experimental design, 4 attainable precision and, 103 optimality criteria and, 22 optimizing, 14, 15 statistical experimental design and, 13
Experimental settings, joint probability density function and, 21 Experiments definition of, 14 simulation, 51, 56 Extinction distance, 68, 92 object thickness and, 82, 121, 133 in STEM, 111 Extracted features, errors in, 286 Eye fundus images, 285
F Favorable enhancement parameters , 193 Feature detection, 276 Feature extraction, 246, 276–281 Feature points, 254f Feature-based methods, 267–268 similarity measures for, 253–266 Fechner’s law, 176 Field emission guns, 7 Field of view (FOV) v. center of mass, 46 pixel size and, 72 point spread function and, 68 in STEM, 114 Filtering, 167, 168 Fisher information matrix, 11, 20 calculating, 21 CRLB and, 92 CRLB approximation and, 34 radially symmetrical expectation model and, 122 symmetric, 76 three-dimensional observations and, 42 Flow field, 268, 269 Flow vectors, 273 Flusser, J., 264, 283, 284 Focal-series reconstruction, 5, 6 interpretable resolution of, 7 methods, 10, 83 Fonseca, L. M. G., 283
Fourier detector plane, 111 Fourier optics scheme, 62 Fourier space, autocorrelations in, 67 Fourier spectrum, 225 Fourier transform(s), 167, 169. See also Discrete Fourier transform of 1s-state, 122, 142 2-D, 208f with C3(p, s) coefficients, 203f curves of, 209f decomposition of, 220 EME by, 188f, 189f, 190f, 191f, 199f, 214f, 217f EMEE and, 184f, 185f enhancements by, 182f, 194f of exit wave, 111 FN,N, 221 image enhancement measure for, 193f inverse, 65, 274 log-enhancement by, 202f modified α-rooting and, 232–233 modulus of, 275 α-rooting by, 183f, 191f, 200f, 201f tensor representation and, 170, 218, 223 by two zones, 210f, 211, 211f two-dimensional, 64–65, 65 via log-reduction, 204f via operator O, 207f FOV. See Field of view Frame registration, 252 Frequency composition, 168 sequency and, 171 variables, 275 Frequency domain, 170 image enhancement in, 231 magnitude reductions within, 205 methods, 166, 167–168 Frequency ordered systems, 170–218
Frequency-domain-based parametric image enhancement algorithms, 219 Functions of the parameters, 20
G Gabor functions, 278 Gaussian filter, Laplacian of, 280 Gaussian function, 64, 109 Gaussian incoherent effective electron source, 66 Gaussian peaks, 12, 27 for dark-field imaging, 138 electrons in, 46t narrowing, 99 v. non-Gaussian peaks, 139 widths of, 28–29, 36, 46, 53f Genetic optimization algorithm, 287 Geometric distortion, 250 Geometric invariance, 254 Global distortion operations, 269 Global maximums, 56 Gold [100] crystal microscope setting for, 134 optimal objective aperture radius and, 134, 140f Goodness of fit, 9 Gordon, I. E., 175 Goshtasby, A., 283 Govindu, V., 285 Gradient operator, 198 Gradient parameters , 193 low-favorable, 193 Gradient type parameters, 196 Gradient vectors, 279 magnitude of, 279 Grain, 272 Graph matching, 284 Gravity, centers of, 283 Gray level distribution, 175 image function, 266 modification, 166 resolution, 213
Gray values, 251 correlation, 247, 273–274 histogram of, 247–248 mean, 175 modalities and, 248–249 Gray-scale image, 178f transformations, 214
H HAADF STEM. See High-angle annular dark-field scanning transmission electron microscopy Haar wavelets, 278 Hadamard transforms, 169, 171 EMEE and, 185f enhancements by, 181, 182f log-enhancement by, 202f α-rooting by, 183f, 191f, 201f by two zones, 211, 212f via log-reduction, 206 Hartley transforms, 169, 171 EME and, 183 EMEE and, 185f α-rooting by, 183f, 191f Heeger, D. J., 286 Hessian matrix, 20, 278 Hierarchical probabilistic relaxation, 283 High pass filters, 278–279 High-angle annular dark-field scanning transmission electron microscopy (HAADF STEM), 5, 6 High-pass images, 175 Highpassing, 279 High-resolution CTEM attainable precision for, 80 main zone axis in, 62 microscope settings for, 80 simulations of, 57 statistical experimental design of, 70–102
High-voltage electron microscopy, 5, 6 Histogram(s) comparison of, 205 of cumulative density function, 197 formation, 195 of gray values, 247–248 of image enhancement, 195f of α-rooting, 196f Histogram equalization, 166–167 image contrast and, 167 Holography. See Off-axis holography Hough transform, 274 Human eye absolute threshold, 176 upper threshold, 176 Human perception, 165–166 image quality and, 168 Human visual properties, 166 image enhancement measures and, 169 light intensity value and, 176 transform-based image enhancement and, 168
I Identical transform, 181 Illuminating electron beam, 31 settings of, 71 Image(s) binary, 213 calculations, 62 capturing settings, 244 composite, 275, 276f correlation between, 286 correspondence, 245 decomposition, 279 enhanced by image-signal, 237f, 238f, 239f equalization, 167 independent, 249 low-pass versions of, 166 matching, 283 modalities of, 246 processing techniques, 60
pyramid of, 272 recording, 113–115 reference, 253 referenced, 246–267 SAR, 264 sensed, 246–267 tensor representation of, 221–226 thresholded, 213 transformation between two, 266–276 vector representation of, 217 virtual, 256 Image blocks highly active, 205 homogenous, 205 Image enhancement 1-D α-rooting, 234 automatic, 169 criteria for, 168 diagram of, 175f first derivative, 167 goal of, 166 histogram of, 195f local statistics, 167 optimal for, 185 parametric transform-domain-based, 169 spatial domain based, 169 tensor method of, 218–240, 230–233 transform-based, 167 Image enhancement measures by cosine transform, 192f for Fourier transform, 193f human visual properties and, 169 Image enhancement transform choosing, 181–183, 200–201, 203 methods of, 166 Image intensity distribution, 60 derivation of, 67–68 in STEM, 110–113 Image plane, 59 cross ratios on, 256 Image projections. See Discrete integrals
Image quality, 168–170 evaluation, 168 human perception and, 168 Image registration algorithms, 244–245, 246 constraining, 250 by correlation, 286 definition of, 243 multimodal, 244 multiresolution, 280 remote sensing, 276 with wavelets, 282f Image transformation, 246 Image wave, 59 modulus square, 67 parametric statistical model of observations and, 65–68 Image-signal(s), 217, 221, 225f 1-D DFTs of, 234, 235f, 236f as discrete integrals, 225 energy curve of, 231f image enhanced by, 237f, 238f, 239f one-dimensional, 170 optimal parameters for, 238 processing, 226f, 231 properties of, 229–230 spectral information in, 239 splitting, 170, 220–221, 239 in tensor representation, 227–229, 228f, 233–234 Improved perception, 168 Incident electron dose, 16 constraints of, 122 Incident electron energy, 63, 74 attainable precision and, 91 chromatic aberration and, 84 decreasing, 81 displacement damage and, 84 exit wave and, 63 lower, 102 Incident probe, 113 Incoherent imaging, 118 Industrial inspection, 274 Inelastic scattering, v. elastic scattering, 118
Information limit chromatic aberration and, 7 chromatic aberration corrector and, 83 of electron microscopes, 60 energy spread and, 101–102 improving, 61 v. point resolution, 60 SNR and, 86 Information theory, 205 Inner collection angle, 107 Inner collection radius, 118–119 Instrumental design, 5 Intensity distribution, 112 Interaction strength, 50f, 50t Interatomic distance, 120t, 122 Intuitive interpretation, 99 Invariance property, 26 Invariant moments, 285 Inverse transformation, 167 Irreducible covering, of square lattice, 226–227 Isocontours, of constant intensity, 264 Isolated atom columns attainable precision and, 121 CRLB for, 121 microscope settings for, 120 optimal detector radius and, 142 optimality criterion of, 75–76, 121–122 positions, 121 STEM and, 144 structure parameters of, 75, 75t, 121–124, 121t Iterative numerical optimization method, 9
J Jacobian matrix, 21, 37 Jacq, J.-J., 287 Jain, A. K., 283, 284
Joint probability density function, 11, 15, 17 attainable probability and, 19 CRLB and, 34 experimental settings and, 21 monochromators and, 70 of observations, 19, 33
K Khorram, S., 284 Klein, J. C., 285 Knutsson, H., 286 Kovalev, V. A., 250
L Labeling process, 261 Landmark points, 245, 283 matched, 246 Langevin, F., 286 Laplacian, 280 Laplacian of Gaussian (LoG) filter, 280–281 edge operator, 284 Layered magnetic materials, 2 Least error solution, 268 Least squares error methods, 283 estimators, 97 Lens(es) aberrations, 59–60 current, 60 magnetic quadrupole, 61 octopole, 61 Li, H., 284–285 Light intensity value human visual detection and, 176 ratio v. difference between, 176 Lilliefors test, 48, 97, 138 Line detection algorithm, 259 Linear combinations, 181f Local maximums, 56 Local optima, 270
Local statistics, 167 LoG filter. See Laplacian of Gaussian filter Log-enhancement, 202f Log-likelihood function, 25, 56 Log-magnitude reduction methods, 201 Log-power transformation, 214 negative α-rooting and, 216f Log-reduction Fourier transform via, 204f Hadamard transform via, 206 Low pass filters bandwidth of, 100–101 wavelets and, 278–279 Lowest energy bound state, 64 Luminance, 167
M Magnitude, 172 of coefficients, 174 Magnitude reduction, 181 algorithm, 207 comparison of, 205 Magnitude reductions within frequency domain, 205 methods, 201 Main zone axis, 61 in high-resolution CTEM, 62 Mallat wavelets, 278 Manjunath, B. S., 283 Map representation, 263 Matched points, 258, 259f feature-based methods and, 267 Matching probabilities, 262–263 Materials science, 2–3 Maximum likelihood estimators asymptotic properties of, 97 of atom columns, 98f of position coordinates, 98t simulation experiments and, 137 McGuire, M., 286 Measure of enhancement EME, 176
by entropy, 178 performance, 175–180 Measure of improvement. See Measure of enhancement Measurement errors, 260 Measurement precision, 14 Medical imaging, 165–166 analysis, 244 Medicine, 244 Medioni, G., 284 Membrane model, 252–253 Microscope settings, 71–72 attainable precision for, 73 for central atom columns, 134–137 CRLB and, 106 for high-resolution CTEM, 80 for isolated atom columns, 120 optimal, 73–74, 93–96, 117–120 original, 73–74, 73t, 117–120, 118t STEM, 107, 116–117, 132 structure parameters, 74, 120–121 Minimax criterion, in space of parameters, 23 Modalities different, 247, 285 gray values and, 248–249 Monochromator(s), 7, 61 attainable precision and, 94, 96, 103 v. chromatic aberration corrector, 91 energy selection slit and, 71 energy spread, 74 in expectation model, 69–70 joint probability density function and, 70 limitations of, 12, 144 optimal energy spread of, 85–89 precision and, 100 radiation sensitivity and, 103 specimen drift and, 85 Monotonic functions, 257–258 Motion analysis, 252 Motion detection, 273 Motion vectors, of pixels, 252–253
MRI, 269f brain images, 286 registration of, 286 volume data, 282–283 Multidimensional solution space, 273f Multipole lenses, 6 Multiresolution optimization, 272 Multistage strategies, 273–274 Murray, Walter, 9 Mutual information, 247 for 3D volume images, 281 calculation of, 248f entropy and, 249f independence of, 250f matrix, 250 total independence and, 250f
N Nanomaterials, 2 Nanoparticles, 2 Nanotechnology, 2 Nanotubes, 2 NASA, 169–170 Negative images, 212, 213 EME for, 216 enhancement, 215f Negrate, A. L., 175 Neighboring atom columns, 91, 92t, 94, 95 optimality criterion of, 133–134 STEM and, 132 structure parameters for, 133, 133t Nestares, O., 286 Neutron diffraction, 3 Nevatia, R., 284 Noise, 10 observations and, 15 precision and, 10, 106 stochastic variables and, 11 Nonlinear optimization problems, 16 Non-maxima suppression, 280 Non-paraxial rays, 130 Normalization constant, 30
Normalization factor, 68 Normalized convolution, 286 Normalized image intensity distribution, 28 in bright-field imaging, 30–31 in dark-field imaging, 29–30 of two-dimensional projection, 32 Numerical analysis, 28 Numerical optimization procedures, 10 Numerical results, 72–98
O Object aperture radius STEM, 122–124 Object detection, 169 Object function, 30 Object patch, 243 Object spatial frequency information, 102 Object structure, 62 Object thickness, 92, 116 extinction distance and, 82, 121, 133 isolated atom columns and, 121 oscillation and, 118–119 spherical aberration constant and, 80, 95 Object transfer function, 101f Object visualization, 169 Objective aperture, 69 attainable precision and, 119 semiangle, 65, 110 Objective aperture radius, 65, 118 1s-state and, 123f electron probe and, 116 for gold [100] crystal, 134, 140f optimal, 142 precision and, 122 for silicon [100] crystal, 134, 140, 141f spherical aberration constant and, 128 for STEM, 144
Objective lenses aberrations, 65 current, 66 electron probe and, 104 imperfections in, 59 Observations, 14 of constant background, 99 CTEM, 59–60, 97 electron microscope, 61 electron microscopial, 18–19 fluctuations of, 102–103 joint probability density function of, 15, 19, 33 multinomially distributed v. Poisson distributed, 35 noise and, 15 one-dimensional, 28–29, 34–35 parametric statistical models of, 15, 17–19, 28–34, 62–70, 106, 107–115 Poisson distribution of, 33 quantitative atomic resolution TEM and, 14 simulated, 48 space of, 17 STEM, 116 three-dimensional, 31–34, 42–45, 51–57 two-dimensional, 29–31, 36–42, 45–51 Occlusions, 266, 286 Off-axis holography, 5, 6, 60, 83 interpretable resolution of, 7 O’Gorman, L., 286 One-dimensional (1-D) transforms, 170 1s-state(s) approximation of, 109 for atom columns, 124f column dependent, 139 Fourier transform of, 100, 122, 142 interference of, 113 optimal objective aperture radius and, 123f width of, 74, 75t, 120t
Operator C3(p, s), 201–205 Operator O, 173–174 as filter, 201 Fourier transform via, 207f in tensor representation, 231–232 Operator O(X(p, s)), 210 Operator parameters choosing, 180–181, 200, 201–203 Operators exponential growth, 271 exponential shrinkage, 271 probability density function of, 271 shear distortion, 271 usefulness of, 271 Optimal detector configuration, 124–127 Optimal statistical experimental design, 4–5 of CTEM, 12, 58–104 derivation of, 13–14 optimality criteria and, 16 procedure of, 11 of STEM, 12, 104–142 Optimality criteria, 15, 16, 92–93. See also A-optimality criterion; D-optimality criterion; E-optimality criterion; Precision based optimality criteria asymptotic, 97 attainable statistical precision and, 102 CRLB as, 24, 48 of CTEM experiments, 61 experimental design and, 22 global, 22, 24 of isolated atom columns, 75–76, 121–122 linear, 24 partial, 22, 24–25, 93 quantitative atomic resolution TEM of, 27 scalar measure of CRLB as, 71 statistical parameter estimation theory and, 21 truncated, 22, 24–25, 93
Ordered alloys, 62 Orthogonal functions, 173 Oscillation, 118–119 Outer collection radius, of axial detector, 118 Overlapping area, 250 maximum, 251
P Page, C. V., 283 Paired representation, 239 Parallel incident electron beam, 6, 59 v. electron probe, 108 Parallel projections, 226f Parameter estimation, 14, 33–34 unknown, 14 vector, 36 Parameter space dimensionality problems of, 9 minimax criterion in, 23 Parametric statistical models, 27 of bright-field imaging, 50 of observations, 15, 17–19, 28–34, 62–70, 107–115 substitution of, 52 variance and, 33 Parametric transformations, 268 Parseval theory, 230 Partial spatial coherence, 62, 82 quasi-coherent approximation and, 66 Partial temporal coherence, 82 damping envelope function and, 101f Performance criteria of CTEM, 60 of STEM, 105–106 Periodic oscillation channelling approximation and, 68 Perspective transformations, 254 geometric invariance under, 254 PET volume data, 282–283 Petrou, M., 250 Phase, 172
contrast, 80–81 correlation, 275 correlation method, 286 shift, 65 spectrum, 170 Pixel(s) alignment, 251 arrays, 27 corresponding, 247 definition of, 243 distribution, 245 matching, 285 motion vectors of, 252–253 neighboring, 251 signal-to-noise ratio, 48 SNR, 55, 72, 117 spatial domain methods and, 166 unknowns of, 252 voxel correlation, 272 Pixel size, 36, 42–43, 72 precision and, 47–48 in STEM, 117 Pixel-based methods, 268–276 similarity measures for, 246–253 Planck’s constant, 63 Plane waves incidence, 64 incoherent, 60 Point matching methods, 253–264, 283 problems, 256 Point resolution, 6 of electron microscopes, 60 improving, 60–61, 99 v. information limit, 60 spherical aberration corrector and, 79 Point spread function, 30, 32 of electron microscopes, 67 FOV and, 68 probe narrowing and, 41 STEM and, 41 width of, 58 Points energetic, 278 of interest, 277
virtual, 256, 257f Poisson distribution, 18 in electron microscopy, 19 of observations, 33 Poisson statistics, 103 Polynomial transformation, 267–268 Position coordinates atom column, 97 attainable statistical precision of, 79 bright-field imaging and, 50–51 of central atom column, 134 chromatic aberration constant and, 84f, 85f CRLB and, 70, 137 energy spread and, 86f, 87f, 88f estimates of, 50f maximum likelihood estimates of, 49f, 98t precision, 41 reduced brightness and, 89 spherical aberration constant and, 80f, 81f, 129f standard deviation of, 53f true, 98t true v. estimated, 138t true v. maximum likelihood, 49t, 57t variance of, 48 Position estimates, variance of, 27 Power, 173 Precision. See also Attainable precision defocus and, 80 electrons and, 35 monochromators and, 100 noise and, 10, 106 objective aperture radius and, 122 optimal detector configuration and, 124–127 optimizing, 79 pixel size and, 47–48 v. resolution, 7–8 specimen drift and, 81 spherical aberration constant and, 93
variance and, 8 Precision based optimality criteria, 22–25 Prior model data term faithfulness and, 252 terms, 253 Probabilistic relaxation, 260, 263, 283 Probability density functions, 15, 17 correlating, 285 of operators, 271 Probability distribution, 205 Probe(s) narrowing, 41 optimal, 12–13 sampling distance, 117 scanning, 3 STEM, 12 Processing parameters, 169 Projected density distribution, 32 Projected images, 52t three dimensional reconstruction and, 57 Projected structure, 63 Projections number of, 55 v. pixel SNR, 55 Projective coordinates, 255f Projective geometry 3D reconstruction, 256 Properties-structure relation, 2–3
Q Quantitative atomic resolution TEM expectation model of, 18 observations and, 14 optimality criteria of, 27 statistical experimental design for, 57 Quantum efficiency, 18 Quantum mechanics, 8–9
Quasi-coherent approximations, 62, 66, 82, 102–103 limited validity of, 67
R Radiation sensitivity, 4, 12, 78 chromatic aberration constant and, 85 energy spread and, 87 monochromators and, 103 optimality criteria and, 16 reduced brightness and, 89 as relevant constraint, 122 Radioactive decay process, 15 Random-conical tilting, 31 Ransac method, 268 Rational-morphology based methods, 166 Rayleigh, Lord, 5 qualitative atomic resolution TEM and, 27 two-point resolution and, 27 Recognition, 167 Recording times, 72 constraints of, 122 longer, 103 reduced brightness and, 89 SNR and, 86 specimen drift and, 130 Reduced brightness, 69 attainable precision and, 74, 119 electron probe and, 116 of electron source, 89–91, 130–131 increasing, 91, 96, 144 position coordinates and, 89 radiation sensitivity and, 89 recording time and, 89 specimen drift and, 103, 132, 142 Reference plane, 256 Region matching, 266 methods, 264–266 Region-based methods, 166
Registration accuracy, 286 algorithm, 247 brain image, 282–283 error, 250 process, 250 Relational information, 283 Relative image shifts, 286 Remote sensing, 244, 283 Renormalization group transform, 272 -representation, 221 Resolution v. precision, 7–8 Rayleigh, Lord, and, 5 Restoration analysis, 167 Results, interpretation of, 99–102 Rigid bodies, 274 registration, 286 Rignot, E. J. M., 285 Robinson, J., 286 Robotics, 243 Roche, A., 286 α-rooting, 168 1-D, 233–238, 233f analysis of, 183–199 enhancement, 180–183, 186f, 187f, 188f by Fourier transform, 200f, 201f as gradient operator, 198 by Hadamard transform, 201f histograms of, 196f modified, 174, 200–201, 232 negative, 212–217, 216, 216f negative optimal, 212 transformation curves for, 191f by transforms, 183f weighted, 168 Rose, Waltar, 6 Rotation angles between contours, 285 finding, 276f Roux, C., 287 Rules of thumb attainable precision and, 34, 57–58 suitability of, 57
S Satellite orbital characteristics, 244 Scalar measures, 21 bright-field imaging and, 41 dark-field imaging and, 41 Scaling, 245, 275 parameter, 267 Scanning transmission electron microscopy (STEM) annular bright-field incoherent, 104 annular dark-field, 113 attainable precision in, 12, 119 axial bright-field coherent, 104 chromatic aberrations in, 107, 110 convergent-beam electron diffraction pattern in, 111 CRLB in, 104, 107 v. CTEM, 131–132 dose efficiency in, 115 electron probe in, 104 expectation models, 113–115, 139 extinction distance in, 111 Fourier detector plane of, 111 FOV in, 114 goal of, 106 illuminating probe of, 110 image recording in, 113–115 interpretations of, 138–141 isolated atom columns and, 144 microscope settings, 107, 116–117, 117–120, 132 neighboring atom columns and, 132 numerical results of, 117 objective aperture semiangle in, 110 observations, 116 optimal objective aperture radius, 122–124, 144 optimal probes for, 12–13, 139, 144 optimal statistical experimental designs of, 12, 104–142 performance criteria of, 105–106 point spread function and, 41 probes, 12 qualitative evaluation of, 105
quantitative v. qualitative, 141–142 scheme of, 105f simulations of, 57, 143 spherical aberration corrector in, 116 spherical aberrations in, 6 statistical experimental design, 11–12, 115–141 transfer function and, 110 visual interpretability of, 128 Scanning tunneling, 3 Scherzer conditions, 118 Scherzer defocus, 60, 77–78, 94, 103 Scrambling processes, 9–10 Sequency, frequency and, 171 Sequency-ordered system, 171 Sharpening, 201 Shekhar, C., 285 Shifting parameters, 274–275 Signal-to-noise ratio (SNR), 48 annular detectors and, 126 information limit and, 86 pixel, 55, 72, 117 recording time and, 86 thermal diffuse and, 125 Silicon [100] crystal microscope setting for, 134 optimal objective aperture radius and, 140 optimal objective aperture radius for, 134 Similarity measures, 246–266, 259–260, 266–267 for feature based methods, 253–266 object-centered, 260 for pixel based methods, 246–253 Similarity transform(s), 267, 285, 286 equations, 283 Simplified analysis, 28 Simplified channelling theory, 62, 63, 107 CTEM and, 108
Simplified models, 27–58 Simulation experiments, 48 CTEM, 97 maximum likelihood estimator and, 137 Sine transforms, 171 Single-axis tilt series, 31 Sinusoidal wave, 220 SNR. See Signal-to-noise ratio Source image attainable precision and, 119 diameter of, 135–136 diffraction-error disc and, 136–137, 136f electron probe and, 116 optimal width of, 142 width, 119 Spatial domain methods, 166–167 Spatial frequencies from exit plane, 60 high, 83 vector, 64 Spatial incoherence, 60 Specimen drift, 4, 78, 94 attainable precision and, 12, 86 chromatic aberration constant and, 85 v. displacement damage, 91 electron source brightness and, 13 electrons and, 102 monochromators and, 85 optimality criteria and, 16 precision and, 81 recording time and, 130 reduced brightness and, 103, 132, 142 as relevant constraint, 122 specimen holders and, 89 STEM v. CTEM, 130–131 Specimen holders, 89 mechanical instability of, 130 mechanical stability of, 96, 103, 132, 142 Spectral bands, 285 Spectral characteristics, 247 Spectral domain, 220
Spectral information in image-signals, 239 unitary transforms and, 168 Spherical aberration(s), 59–60 correction of, 60–61 corrector, 72 in CTEM, 6 defocus and, 76, 77f in electron microscopy, 5, 10 in STEM, 6 Spherical aberration constant atom columns and, 95 attainable precision and, 74, 119 axial detectors and, 135 defocus and, 127f, 128f electron probe and, 116 energy spread and, 87 object thickness and, 80, 95 objective aperture radius and, 128 optimal, 78–83, 94, 130, 142 optimal defocus value and, 126 position coordinates and, 80f, 81f, 129f precision and, 93 for STEM, 132 Spherical aberration corrector(s), 78–79, 80, 95 attainable precision and, 91, 103 v. chromatic aberration corrector, 96 negative Cs-values and, 77 in STEM, 116 Spline(s), 283 parameters, 283 Splitting 1-D signals, 218 image-signals, 170, 220–221 -splitting, 221 Square error sense, 268 solutions, 258 Staib, L. H., 287 Standard deviation(s) of distance, 47f, 53f, 54f, 55f, 56f lower bound of, 45–51, 46f, 93
of position coordinates, 53f three-dimensional observations and, 51–57 Statistical experimental design, 4, 10–13, 11–12 of atomic resolution TEM, 27–58 basic principles of, 13–26 experimental design and, 13 of high-resolution CTEM experiments, 70–102 of quantitative atomic resolution TEM, 57 of STEM, 115–141 Statistical parameter estimation theory, 10, 15 optimality criteria and, 16, 21 STEM. See Scanning transmission electron microscopy Stereo head, 256 Stereo vision problems, 254 processing, 244 Stochastic optimization techniques, 269 Stochastic variables, 11, 17 definition of, 15 Stockman, G., 283 Stone, H. S., 286 Structural matching, 259 Structure determination qualitative, 78, 83 quantitative, 10, 58, 79, 86, 104 of STEM, 106 Structure parameters, 61, 92 CRLB and, 106 of isolated atom columns, 75, 75t, 121–124, 121t microscope settings and, 120–121 of neighboring atom columns, 92t, 133, 133t precision of, 106 quantitative estimation of, 58 Structure-imaging artifacts, 79 Sub-bands, 279
Suk, T., 264, 284 Sum rule, 284 Superconducting materials, 2 Supercoupling transform, 272
T TEM. See Atomic resolution transmission electron microscopy Temporal coherence, 62, 66 Temporal incoherence, 60 due to chromatic aberration, 107 Tensor(s) Hessian, 278 method, 218–240 structure, 277 transformation, 224 Tensor representation, 170, 219, 239 of 2-D DFT, 238 elements of, 222f Fourier transform and, 218 of image, 221–226 image-signals of, 227–229, 228f, 233–234 operator O in, 231–232 in spectral domain, 220 transform coefficients in, 231 Thermal diffuse annular detector and, 107 detector radii and, 125 v. elastic scattering, 118 SNR and, 125 in STEM, 107 Thermal energy, of electrons, 60 Thomson, William, 14 3D ultrasound, 286 3D volume images, 251, 270f mutual information for, 281 Three-dimensional density distribution, 31, 32f Three-dimensional reconstruction, 34, 56–57 Tilt angles, 44 Time domain, data mapping from, 170–171
Tomography, dark-field imaging, 152–157 Ton, J., 283, 284 Total independence, mutual information and, 250f Transfer function for atom column, 126–127 CTEM and, 110 electron microscope, 65 STEM and, 110 Transform(s) coefficients, 167 with frequency ordered systems, 170–218 Transform coefficients, 166, 173, 174 length, 205 in tensor representations, 231 Transformations affine, 245 bilinear, 245 global v. local, 245 non-rigid body, 245 parameters of, 283 polynomial, 245 rigid body, 245 between two images, 266–276 Transform-based enhancement algorithm, 173–175 by operator C3(p, s), 201–205 optimal image improvement, 177, 178 problems for, 169 types of, 168 zonal, 205–211 Translation parameters, 285 Transmission-cross-coefficient, 67 Tree of vessels, 285 Trigonometric systems, parametric class of, 172 Triplets, matched, 258–259 2-D signals enhancement algorithms for, 180
Two-dimensional objects distance for, 37 Two-point resolution, 27, 105 atomic resolution TEM and, 57
U Unambiguous labeling, 263 Unary attributes, 261 Unary measurements v. binary, 262 vectors, 261 Unitary transforms, 167, 169, 171 1-D, 220 2-D, 221 inverse, 173 spectral information and, 168 two-dimensional, 173 Unsharp masking, 166 modified, 168
V Van den Bos, A., 5 iterative numerical optimization method and, 9 Variance, 8 lower bound of, 56 parametric statistical models and, 33 Vectorial representation. See also Tensor representation of images, 217 Velocity components derivatives of, 253 valued, 253 Ventura, A., 284 Video processing, 244
Volume registration methods, 272 Vote counting, 283 Voxel(s) changes, 269–271 with homogenous intensities, 281 pixel correlation, 272 position change of, 269f
W Walsh basis images, 281 Walsh ordered functions, 172 Wang, Y., 287 Wavelet(s), 278 decomposition, 280, 286 image registration with, 282f theory analysis, 278–279 transforms, 167 Waves, complex, 170 Weber’s law, 168, 176 Weighted back-projection method, 34 three-dimensional reconstruction and, 57 West, J., 283 Westin, C.-F., 286
X X-ray, 3
Y Yang, Z., 285
Z Zana, F., 285 Z-contrast imaging, 107 Zhang, Z., 283