ADVANCES IN IMAGING AND ELECTRON PHYSICS VOLUME 145
EDITOR-IN-CHIEF
PETER W. HAWKES CEMES-CNRS Toulouse, France
HONORARY ASSOCIATE EDITORS
TOM MULVEY BENJAMIN KAZAN
Advances in
Imaging and Electron Physics
EDITED BY
PETER W. HAWKES CEMES-CNRS Toulouse, France
VOLUME 145
AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Academic Press is an imprint of Elsevier
Academic Press is an imprint of Elsevier 525 B Street, Suite 1900, San Diego, California 92101-4495, USA 84 Theobald’s Road, London WC1X 8RR, UK
∞ This book is printed on acid-free paper.
Copyright © 2007, Elsevier Inc. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher’s consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (www.copyright.com), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-2007 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 1076-5670/2007 $35.00 Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail:
[email protected]. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting “Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.” For information on all Elsevier Academic Press publications visit our Web site at www.books.elsevier.com ISBN-13: 978-0-12-373907-0 ISBN-10: 0-12-373907-1 PRINTED IN THE UNITED STATES OF AMERICA 07 08 09 10 9 8 7 6 5 4 3 2 1
CONTENTS
CONTRIBUTORS . . . . . . . . . . vii
PREFACE . . . . . . . . . . ix
FUTURE CONTRIBUTIONS . . . . . . . . . . xi
Applications of Noncausal Gauss–Markov Random Field Models in Image and Video Processing
AMIR ASIF
I. Introduction . . . . . . . . . . 2
II. Terminology . . . . . . . . . . 3
III. Potential Matrix . . . . . . . . . . 7
IV. Rauch–Tung–Striebel Smoothing for Image Restoration . . . . . . . . . . 15
V. Video Compression . . . . . . . . . . 22
VI. Inversion Algorithms for Block Banded Matrices . . . . . . . . . . 37
VII. Conclusions . . . . . . . . . . 50
References . . . . . . . . . . 51
Direct Electron Detectors for Electron Microscopy
A.R. FARUQI
I. Introduction . . . . . . . . . . 55
II. Detectors—General Introduction . . . . . . . . . . 57
III. Film . . . . . . . . . . 59
IV. CCDs . . . . . . . . . . 60
V. Direct Electron Semiconductor Detectors . . . . . . . . . . 61
VI. Monte Carlo Simulations . . . . . . . . . . 62
VII. Hybrid Pixel Detectors, Medipix1, and Medipix2 . . . . . . . . . . 64
VIII. MAPS Detectors Based on CMOS . . . . . . . . . . 80
IX. Conclusions . . . . . . . . . . 90
Acknowledgments . . . . . . . . . . 90
References . . . . . . . . . . 91
Exploring Third-Order Chromatic Aberrations of Electron Lenses with Computer Algebra
ZHIXIONG LIU
I. Introduction . . . . . . . . . . 96
II. Variational Function and Its Approximations . . . . . . . . . . 96
III. Chromatic Perturbation Variational Function and Its Approximations . . . . . . . . . . 102
IV. Analytical Derivation of Third-Order Chromatic Aberration Coefficients . . . . . . . . . . 106
V. Graphical Display of Third-Order Chromatic Aberration Patterns . . . . . . . . . . 129
VI. Numerical Calculation of Third-Order Chromatic Aberration Coefficients . . . . . . . . . . 135
VII. Conclusions . . . . . . . . . . 143
Acknowledgments . . . . . . . . . . 146
Appendix . . . . . . . . . . 146
References . . . . . . . . . . 148
Anisotropic Diffusion Partial Differential Equations for Multichannel Image Regularization: Framework and Applications
DAVID TSCHUMPERLÉ AND RACHID DERICHE
Preliminary Notations . . . . . . . . . . 150
I. Introduction . . . . . . . . . . 151
II. PDE-Based Smoothing of Multivalued Images: A Review . . . . . . . . . . 160
III. Curvature-Preserving PDEs . . . . . . . . . . 174
IV. Implementation Considerations . . . . . . . . . . 181
V. Applications . . . . . . . . . . 183
VI. Conclusion . . . . . . . . . . 193
Appendix A . . . . . . . . . . 195
Appendix B . . . . . . . . . . 197
Appendix C . . . . . . . . . . 198
References . . . . . . . . . . 203
INDEX . . . . . . . . . . 211
CONTRIBUTORS
Numbers in parentheses indicate the pages on which the authors’ contributions begin.
AMIR ASIF (1), Computer Science and Engineering, York University, Toronto, Ontario, Canada M3J 1P3
RACHID DERICHE (149), Odyssée Project Team, INRIA/ENPC/ENS–INRIA, 06902 Sophia Antipolis, France
A.R. FARUQI (55), MRC Laboratory of Molecular Biology, Cambridge CB2 2QH, United Kingdom
ZHIXIONG LIU (95), Department of Electronics, Peking University, Beijing 100871, China
DAVID TSCHUMPERLÉ (149), Image Team, GREYC/ENSICAEN–UMR CNRS 6072, 14050 Caen Cedex, France
PREFACE
Image processing, electron optics, and electron detection are the subjects of this volume. We begin with a chapter by A. Asif on noncausal random field models, which are attracting considerable attention in image and video processing but require a very different treatment from the models commonly found in image processing textbooks. Asif explains in detail how these noncausal models are handled and discusses three applications that illustrate the process. This is followed by a particularly timely account by A.R. Faruqi of direct electron detectors for electron microscopy. With electron microscope image processing now commonplace, it was inevitable that new techniques for getting the electron image from the microscope to the computer would emerge. Faruqi describes two types of semiconductor pixel detectors in great technical detail and illustrates their usefulness in the area of electron cryomicroscopy. This new generation of detectors is of potential importance for a very wide audience, and I am delighted to include this survey here. The third chapter is by Z.-x. Liu, who shows how useful MATHEMATICA can be for calculating expressions for the aberration coefficients of electron lenses. In the past, these coefficients have, for the most part, been established by hand at the expense of much long and dull algebra. For the higher-order aberrations, however, the task becomes gigantic, and ever since the 1970s the help of computer algebra has been invoked. Efficient and (reasonably) user-friendly commercial packages are now available, such as MAPLE and MATHEMATICA. Here, Liu shows how formulas for the higher-order chromatic aberrations of electron lenses can be established with the aid of MATHEMATICA. We conclude with a substantial contribution by D. Tschumperlé and R. Deriche on the role of anisotropic diffusion partial differential equations in the regularization of multichannel images. The authors examine the technique in considerable detail and give several examples, notably from the realm of color image processing. As always, I thank all the authors for contributing to the series and for the trouble they have taken to make their material accessible to a wide readership. Forthcoming contributions are listed in the following pages.

Peter Hawkes
FUTURE CONTRIBUTIONS
G. Abbate
New developments in liquid-crystal-based photonic devices
S. Ando
Gradient operators and edge and corner detection
C. Beeli
Structure and microscopy of quasicrystals
V.T. Binh and V. Semet
Cold cathodes
A.B. Bleloch
Aberration correction and the SuperSTEM project
C. Bontus and T. Köhler
Helical cone-beam tomography
G. Borgefors
Distance transforms
Z. Bouchal
Non-diffracting optical beams
A. Buchau
Boundary element or integral equation methods for static and time-dependent problems
B. Buchberger
Gröbner bases
F. Colonna and G. Easley
The generalized discrete Radon transforms and their use in the ridgelet transform
T. Cremer
Neutron microscopy
H. Delingette
Surface reconstruction based on simplex meshes
R.G. Forbes Liquid metal ion sources C. Fredembach Eigenregions for image classification S. Fürhapter (vol. 146) Spiral phase contrast imaging L. Godo and V. Torra Aggregation operators A. Gölzhäuser Recent advances in electron holography with point sources D. Greenfield and M. Monastyrskii Selected problems of computational charged particle optics M. Haider Aberration correction in TEM M.I. Herrera The development of electron microscopy in Spain D.P. Huijsmans and N. Sebe Ranking metrics and evaluation measures M. Hÿtch, E. Snoeck, and F. Houdellier Aberration correction in practice K. Ishizuka Contrast transfer and crystal images J. Isenberg Imaging IR-techniques for the characterization of solar cells K. Jensen Field-emission source mechanisms L. Kipp Photon sieves G. Kögel Positron microscopy T. Kohashi Spin-polarized scanning electron microscopy O.L. Krivanek Aberration correction and STEM
R. Leitgeb Fourier domain and time domain optical coherence tomography B. Lencová Modern developments in electron optical calculations H. Lichte (vol. 150) New developments in electron holography W. Lodwick Interval analysis and fuzzy possibility theory L. Macaire, N. Vandenbroucke, and J.-G. Postaire Color spaces and segmentation M. Matsuya Calculation of aberration coefficients using Lie algebra S. McVitie Microscopy of magnetic specimens S. Morfu and P. Marquié Nonlinear systems for image processing T. Nitta Back-propagation and complex-valued neurons M.A. O’Keefe Electron image simulation D. Oulton and H. Owens Colorimetric imaging N. Papamarkos and A. Kesidis The inverse Hough transform R.F.W. Pease (vol. 150) Miniaturization K.S. Pedersen, A. Lee, and M. Nielsen The scale-space properties of natural images I. Perfilieva Fuzzy transforms V. Randle Electron back-scatter diffraction E. Rau Energy analysers for electron microscopes
E. Recami
Superluminal solutions to wave equations
J. Rodenburg (vol. 150)
Ptychography and related diffractive imaging methods
P.E. Russell and C. Parish
Cathodoluminescence in the scanning electron microscope
G. Schmahl
X-ray microscopy
J. Serra (vol. 150)
New aspects of mathematical morphology
R. Shimizu, T. Ikuta, and Y. Takai
Defocus image modulation processing in real time
S. Shirai
CRT gun design methods
H. Snoussi (vol. 146)
Geometry of prior selection
T. Soma
Focus-deflection systems and their applications
J.-L. Starck
Independent component analysis: the sparsity revolution
I. Talmon
Study of complex fluids by transmission electron microscopy
G. Teschke and I. Daubechies
Image restoration and wavelets
M.E. Testorf and M. Fiddy
Imaging from scattered electromagnetic fields, investigations into an unsolved problem
M. Tonouchi
Terahertz radiation imaging
N.M. Towghi
lp norm optimal filters
E. Twerdowski
Defocused acoustic transmission microscopy
Y. Uchikawa Electron gun optics K. Urban Aberration correction in practice C. Vachier-Mammar and F. Meyer Watersheds K. Vaeth and G. Rajeswaran Organic light-emitting arrays M. van Droogenbroeck and M. Buckley Anchors in mathematical morphology M. Wild and C. Rohwer (vol. 146) Mathematics of vision R. Withers Disorder, structured diffuse scattering and the transmission electron microscope
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 145
Applications of Noncausal Gauss–Markov Random Field Models in Image and Video Processing
AMIR ASIF
Computer Science and Engineering, York University, Toronto, Ontario, Canada M3J 1P3
I. Introduction . . . . . . . . . . 2
II. Terminology . . . . . . . . . . 3
III. Potential Matrix . . . . . . . . . . 7
   A. Two-Dimensional Gauss–Markov Random Fields . . . . . . . . . . 8
      1. Two-Dimensional Bilateral Representation . . . . . . . . . . 8
      2. Two-Dimensional Forward Unilateral Representation . . . . . . . . . . 10
      3. Two-Dimensional Backward Unilateral Representation . . . . . . . . . . 11
   B. Three-Dimensional Gauss–Markov Random Fields . . . . . . . . . . 12
      1. Three-Dimensional Bilateral Representation . . . . . . . . . . 12
      2. Three-Dimensional Forward Unilateral Representation . . . . . . . . . . 13
      3. Three-Dimensional Backward Unilateral Representation . . . . . . . . . . 14
IV. Rauch–Tung–Striebel Smoothing for Image Restoration . . . . . . . . . . 15
   A. Blurring Models . . . . . . . . . . 15
   B. Image Restoration Algorithm . . . . . . . . . . 17
   C. Image Restoration Experiments . . . . . . . . . . 19
   D. Summary . . . . . . . . . . 22
V. Video Compression . . . . . . . . . . 22
   A. SNP/VQR Encoder . . . . . . . . . . 24
   B. Computationally Efficient Implementation . . . . . . . . . . 26
      1. Structure of Three-Dimensional Forward Regressors . . . . . . . . . . 26
   C. Sub-Block SNP/VQR . . . . . . . . . . 29
   D. Computational Savings . . . . . . . . . . 31
   E. Cascaded VQ . . . . . . . . . . 33
   F. Video Compression Experiments . . . . . . . . . . 34
   G. Summary . . . . . . . . . . 36
VI. Inversion Algorithms for Block Banded Matrices . . . . . . . . . . 37
   A. Notation . . . . . . . . . . 39
   B. Theorems . . . . . . . . . . 41
   C. Inversion Algorithms for Block Banded Matrices . . . . . . . . . . 45
      1. Inversion of Full Matrix P with Block Banded Inverse A . . . . . . . . . . 45
      2. Inversion of L-Block Banded Matrices A . . . . . . . . . . 47
   D. Simulations . . . . . . . . . . 49
   E. Summary . . . . . . . . . . 49
VII. Conclusions . . . . . . . . . . 50
References . . . . . . . . . . 51
ISSN 1076-5670 • DOI: 10.1016/S1076-5670(06)45001-1
Copyright 2007, Elsevier Inc. All rights reserved.
I. INTRODUCTION

Noncausal Gauss–Markov random fields (GMRFs) have been used extensively in image processing. An analysis of the major applications reveals that GMRFs have been versatile enough to be applied in areas as diverse as stochastic relaxation for image restoration (Geman and Geman, 1984), surface reconstruction and pattern analysis (Geiger and Girosi, 1991), pattern recognition in computer vision (Rangarajan et al., 1991), emission tomography in nuclear science (Lee et al., 1993), textured image segmentation in image processing (Derin and Elliott, 1987), anomaly detection in hyperspectral imagery (Schweizer and Moura, 2000), and data assimilation in physical oceanography (Asif and Moura, 1999). This chapter reviews the central concepts of noncausal GMRFs and explains these concepts with examples from the fields of image restoration, video compression, and matrix inversion in linear algebra.

Unlike the one-dimensional (1D) GMRF models, which naturally lead to recursive processing algorithms of the Kalman–Bucy type, the two-dimensional (2D) (Moura and Balram, 1992; Tekalp et al., 1985; Woods, 1972) and three-dimensional (3D) (Schweizer and Moura, 2000) noncausal GMRFs are not conducive to recursion because of their bidirectional structure. After introducing the basic definitions, this chapter establishes several recursive, one-sided formulations (Moura and Balram, 1992) for both 2D image fields and 3D video sequences that are equivalent to the original noncausal GMRF models yet enable the optimal recursive processing of the 2D and 3D fields. These recursive, one-sided formulations are obtained by performing a Cholesky factorization of the potential matrix $A$, also referred to as the information matrix, which is the inverse of the covariance matrix $P$. The forward Cholesky factorization, $A = L^T L$, with $L$ a lower triangular matrix, leads to the forward unilateral representation, which processes the 2D image field and the 3D video sequence in the natural order of occurrence: starting with the first row $(i = 1)$, all subsequent rows are processed one after the other in lexicographic order $(1 \le i \le N_I)$, and frames are processed from first to last.

This chapter highlights the central ideas of noncausal GMRFs by considering three applications. First, the classical image restoration problem is considered, in which the input image is corrupted with additive noise and a convolutional blur resulting from such factors as sensor noise, improper image focus, and relative object-camera motion. The 2D forward unilateral model is used to develop a computationally efficient Rauch–Tung–Striebel (RTS) smoother-type algorithm, which, in comparison with the Wiener filter and filters that consider one-sided causal state models, restores blurred images at relatively higher peak signal-to-noise ratio (PSNR) and improved perceived
quality. The second application of noncausal GMRFs is chosen from video compression in multimedia communications. The proposed video codec, referred to as SNP/VQR, models the 3D video sequence as a 3D noncausal GMRF that enables scalable, noncausal prediction (SNP) based on the 3D forward recursive representation. The resulting error field is compressed with vector quantization coupled with conditional replenishment (VQR). Because multimedia communications require real-time processing of video sequences, practical implementations of SNP/VQR are derived by exploiting the block banded structure of the potential matrix $A$ of the noncausal GMRF used to model the 3D video sequence. SNP/VQR outperforms standard video codecs, including MPEG4 and H.263, at the low bit rates necessary for mobile wireless networks. Finally, a third application of noncausal GMRFs is selected from matrix inversion in linear algebra. In image and signal processing, it is often necessary to invert large, block banded matrices. The theory of GMRFs is applied to develop computationally efficient algorithms for inverting positive definite, symmetric, $L$-block banded matrices $A$ and their inverses $P$. Compared with direct inversion algorithms, the proposed algorithms provide computational savings of up to two orders of magnitude in the linear dimension of the constituent blocks of $A$ and $P$.

This chapter is organized as follows. Section II introduces terminology, as well as the basic definitions of local neighborhoods and Markov and Gaussian fields. The block banded structure of the potential matrix $A$, along with the one-sided (unilateral) expressions representing the noncausal GMRFs for both 2D image fields and 3D video sequences, is derived in Section III. Sections IV to VI consider three applications of the noncausal GMRFs in the areas of image restoration (Section IV), video compression (Section V), and block banded matrix inversion (Section VI). For each application, we compare the performance of the GMRF-based algorithms with the standardized algorithms commonly used in these areas. Finally, Section VII concludes by summarizing the main concepts.
II. TERMINOLOGY

Before formally defining the GMRFs, we introduce the terminology used in this chapter. In our exposition, we follow much of the notation used in Moura and Balram (1992). Considering a still image as a 2D finite lattice of dimensions $(N_I \times N_J)$, the pixel intensity at site $(i,j)$ is represented by the random variable $X(i,j)$. Lowercase letters $x_{i,j}$ denote the values assumed by $X(i,j)$; in other words, $x_{i,j}$ is a particular realization of the random variable $X(i,j)$. Similarly, a video sequence is modeled as a 3D lattice of dimensions $(N_I \times N_J \times N_K)$ with $X(i,j,k)$ representing the pixel intensity at spatial
location $(i,j)$ in frame $k$. As for still images, the lowercase letters $x_{i,j,k}$ denote the intensity value assumed by the 3D random variable $X(i,j,k)$. In terms of the pixel intensities, the conditional probabilities of the 2D and 3D Markov random fields are defined as follows.

2D Image Field:
\[
\operatorname{Prob}\bigl[X(i,j) = x_{i,j} \mid X(m,n) = x_{m,n},\ (m,n) \neq (i,j)\bigr]
= \operatorname{Prob}\bigl[X(i,j) = x_{i,j} \mid X(m,n) = x_{m,n},\ (m,n) \in N^{(i,j)}_{(p,2)}\bigr]. \tag{1}
\]

3D Video Sequence:
\[
\operatorname{Prob}\bigl[X(i,j,k) = x_{i,j,k} \mid X(m,n,q) = x_{m,n,q},\ (m,n,q) \neq (i,j,k)\bigr]
= \operatorname{Prob}\bigl[X(i,j,k) = x_{i,j,k} \mid X(m,n,q) = x_{m,n,q},\ (m,n,q) \in N^{(i,j,k)}_{(p,3)}\bigr], \tag{2}
\]

where $N^{(i,j)}_{(p,2)}$ is the $p$th-order neighborhood of spatial site $(i,j)$ within the 2D image. Likewise, $N^{(i,j,k)}_{(p,3)}$ is the $p$th-order local neighborhood of site $(i,j,k)$ within the 3D video sequence. In keeping with the spirit of the Markov property, the local neighborhoods are usually chosen to be of reduced order compared with the overall dimensions of the fields. Next, we define the neighborhoods on the basis of the Euclidean distance.

Local Neighborhoods: The $p$th-order neighborhoods are defined in terms of the closest neighbors of the reference pixel as follows.

2D Image Field: For 2D spatial coordinates $(i,j)$,
\[
N^{(i,j)}_{(p,2)} = \bigl\{ (m,n)\colon\ 0 < (m-i)^2 + (n-j)^2 \le D_p \bigr\}. \tag{3}
\]

3D Video Sequence: For 3D spatial coordinates $(i,j,k)$,
\[
N^{(i,j,k)}_{(p,3)} = \bigl\{ (m,n,q)\colon\ 0 < (m-i)^2 + (n-j)^2 + (q-k)^2 \le D_p \bigr\}, \tag{4}
\]

where $D_p$ is an increasing function of the order $p$ that represents the square of the Euclidean distance between a pixel and its furthest neighbor. Figure 1 shows the 2D neighborhood configurations for pixel $(i,j)$, represented by "o," with $D_p$ set to 1, 2, 4, 5, 8, 9, corresponding to orders $p = 1, 2, 3, 4, 5, 6$, respectively. Note that a neighborhood configuration of order $p$ includes all pixels marked from "1" to "$p$." As an example, the second-order $(p = 2)$ neighborhood is obtained by setting $D_p = 2$ and includes pixels labeled as 1s or 2s in Figure 1. In terms of the spatial coordinates $(i,j)$ of the reference
FIGURE 1. First- to sixth-order neighborhood configurations for 2D GMRF.
FIGURE 2. First- to third-order neighborhood configurations for 3D GMRF.
pixel $X(i,j)$, the second-order neighborhood $N^{(i,j)}_{(p=2,2)}$ consists of the pixels
\[
N^{(i,j)}_{(p=2,2)} = \bigl\{ (i-1,j),\ (i+1,j),\ (i,j-1),\ (i,j+1),\ (i-1,j-1),\ (i-1,j+1),\ (i+1,j-1),\ (i+1,j+1) \bigr\}. \tag{5}
\]
Figure 2 shows the neighborhood configurations $N^{(i,j,k)}_{(p,3)}$ for pixel "o" within the 3D fields. Unlike the 2D neighborhood configurations, the 3D configurations include pixels from more than one frame.

Following lexicographic ordering, we simplify the notation used to represent the 2D image fields and 3D video sequences. The simplified notation for the 2D image fields is obtained by arranging the intensity values for the pixels $X(i,j)$ of row $i$ into the $(N_J \times 1)$ random vector $X_i$ and then stacking the vectors corresponding to all rows $(1 \le i \le N_I)$ of the image to form an $(N_I N_J \times 1)$ vector $\tilde{X}$ as follows.

2D Image Field:
\[
\tilde{X} = \bigl[ X_1^T\ X_2^T\ \cdots\ X_{N_I}^T \bigr]^T
\quad \text{with } X_i = \bigl[ X(i,1)\ X(i,2)\ \cdots\ X(i,N_J) \bigr]^T. \tag{6}
\]
For the 3D video sequences, the aforementioned procedure is repeated to arrange the pixels of frame $k$, $(1 \le k \le N_K)$, into the lexicographically ordered vector $X^{(k)}$. The frame-ordered vectors $X^{(k)}$ are then stacked in the $(N_I N_J N_K \times 1)$ vector.

3D Video Sequence:
\[
\tilde{X} = \bigl[ X^{(1)T}\ X^{(2)T}\ \cdots\ X^{(N_K)T} \bigr]^T,
\quad \text{with } X^{(k)} = \bigl[ X_1^{(k)T}\ X_2^{(k)T}\ \cdots\ X_{N_I}^{(k)T} \bigr]^T, \tag{7}
\]
where $X_i^{(k)}$ contains the pixels of row $i$ in frame $k$. In the rest of the chapter, we use the same symbol $\tilde{X}$ to represent the lexicographically ordered pixels of both the 2D image field and the 3D video sequence; the exact description and dimensions of $\tilde{X}$ are derived from the context in which $\tilde{X}$ appears.
Gauss–Markov Models: Before defining the GMRFs, we introduce the Markov–Gibbs equivalence to describe the joint probability distribution $\operatorname{Prob}(\tilde{X})$. A Gibbs distribution is specified in terms of the energy function $U(\tilde{X})$, which sums all possible interactions of the pixels present within the field. The Gibbs distribution is defined as
\[
\operatorname{Prob}(\tilde{X}) = \frac{1}{Z} \exp\bigl( -U(\tilde{X}) \bigr), \tag{8}
\]
where $Z$ is the normalization constant given by
\[
Z = \sum \exp\bigl( -U(\tilde{X}) \bigr), \tag{9}
\]
with the summation performed over all possible configurations of $\tilde{X}$. The best known case of the Gibbs distribution is obtained by considering quadratic fields that have zero mean, $E\{\tilde{X}\} = 0$, and pairwise neighbor interactions limited to: (1) the $X^2(i,j)$ and $X(i,j)X(m,n)$ terms in a 2D image sequence, or (2) the $X^2(i,j,k)$ and $X(i,j,k)X(m,n,q)$ terms in a 3D video sequence. For the zero-mean quadratic fields, the Gibbs distribution reduces to
\[
\operatorname{Prob}(\tilde{X}) = \frac{1}{Z} \exp\Bigl( -\frac{1}{2\sigma_X^2}\, \tilde{X}^T A \tilde{X} \Bigr). \tag{10}
\]
A Markov field with the probability distribution given by Eq. (10) is referred to as the GMRF. It is straightforward to show that the covariance matrix of the GMRF is a scaled version of the inverse of $A$, commonly referred to as the potential matrix. The potential matrix $A$ is a symmetric matrix of order $N_I N_J$ for the 2D image field and $N_I N_J N_K$ for the 3D video sequences. The following section shows that the potential matrix $A$ has a block banded structure that is used in our image and video processing algorithms.
III. POTENTIAL MATRIX

In a noncausal GMRF, we use the $p$th-order neighborhood of the reference pixel to make a linear prediction of the intensity value of the pixel. The first-order $(p = 1)$ noncausal 2D and 3D GMRFs are given by

2D Image Field:
\[
\hat{X}(i,j) = \beta_v \bigl[ X(i-1,j) + X(i+1,j) \bigr] + \beta_h \bigl[ X(i,j-1) + X(i,j+1) \bigr]. \tag{11}
\]

3D Video Sequence:
\[
\hat{X}(i,j,k) = \beta_v \bigl[ X(i-1,j,k) + X(i+1,j,k) \bigr] + \beta_h \bigl[ X(i,j-1,k) + X(i,j+1,k) \bigr] + \beta_t \bigl[ X(i,j,k-1) + X(i,j,k+1) \bigr], \tag{12}
\]

where $\beta_v$, $\beta_h$, and $\beta_t$ are, respectively, the vertical, horizontal, and temporal field interactions. The symbols $\hat{X}(i,j)$ and $\hat{X}(i,j,k)$ denote, respectively, the predicted intensity values of the pixel $X(i,j)$ in the 2D image field and $X(i,j,k)$ in the 3D video sequence. The evaluation of the field interactions is based on maximizing the likelihood function specified in Eq. (10), which is computationally intensive. A near-optimal but less intensive procedure is based on the following approximate expressions. For the 2D image field, the vertical and horizontal interactions are given by
\[
\beta_v = \frac{\xi \chi_v}{|\chi_v| + |\chi_h|} \quad \text{and} \quad \beta_h = \frac{\xi \chi_h}{|\chi_v| + |\chi_h|}, \tag{13}
\]
where $\chi_v$ and $\chi_h$ denote the vertical and horizontal sample correlations of the 2D GMRF and are defined as
\[
\chi_v = \frac{1}{N_J (N_I - 1)} \sum_{i=1}^{N_I-1} \sum_{j=1}^{N_J} X(i,j) X(i+1,j), \tag{14}
\]
\[
\chi_h = \frac{1}{N_I (N_J - 1)} \sum_{i=1}^{N_I} \sum_{j=1}^{N_J-1} X(i,j) X(i,j+1). \tag{15}
\]
The term $\xi$ is a positive tolerance bounded (Moura and Balram, 1992) by $\xi < 1/(2 \cos(\pi/(N_I + 1)))$. For the 3D video sequence, the expressions for the field interactions are simple extensions of the corresponding equations for the 2D image field and
are given by
\[
\beta_v = \frac{\xi \chi_v}{|\chi_v| + |\chi_h| + |\chi_t|}, \quad
\beta_h = \frac{\xi \chi_h}{|\chi_v| + |\chi_h| + |\chi_t|}, \quad \text{and} \quad
\beta_t = \frac{\xi \chi_t}{|\chi_v| + |\chi_h| + |\chi_t|}. \tag{16}
\]
The 3D sample correlations $\{\chi_v, \chi_h, \chi_t\}$ are computed using the following expressions:
\[
\chi_v = \frac{1}{N_J N_K (N_I - 1)} \sum_{i=1}^{N_I-1} \sum_{j=1}^{N_J} \sum_{k=1}^{N_K} X(i,j,k) X(i+1,j,k), \tag{17}
\]
\[
\chi_h = \frac{1}{N_I N_K (N_J - 1)} \sum_{i=1}^{N_I} \sum_{j=1}^{N_J-1} \sum_{k=1}^{N_K} X(i,j,k) X(i,j+1,k), \tag{18}
\]
\[
\chi_t = \frac{1}{N_I N_J (N_K - 1)} \sum_{i=1}^{N_I} \sum_{j=1}^{N_J} \sum_{k=1}^{N_K-1} X(i,j,k) X(i,j,k+1). \tag{19}
\]
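As an illustration, the short routine below estimates the 2D field interactions of Eq. (13) from the sample correlations of Eqs. (14) and (15); the 3D estimates of Eqs. (16)–(19) follow the same pattern with an extra frame axis. This is a sketch under our own naming conventions; the tolerance $\xi$ is supplied by the caller, who must respect the bound quoted above.

```python
import numpy as np

def field_interactions_2d(X, xi=0.45):
    """Approximate 2D field interactions, Eq. (13). X is an (NI x NJ)
    image with the global mean removed; xi is the positive tolerance,
    which must satisfy xi < 1 / (2 cos(pi / (NI + 1)))."""
    NI, NJ = X.shape
    chi_v = (X[:-1, :] * X[1:, :]).sum() / (NJ * (NI - 1))  # Eq. (14)
    chi_h = (X[:, :-1] * X[:, 1:]).sum() / (NI * (NJ - 1))  # Eq. (15)
    denom = abs(chi_v) + abs(chi_h)
    return xi * chi_v / denom, xi * chi_h / denom           # Eq. (13)
```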
The noncausal predictive models in Eqs. (11) and (12) can alternatively be expressed in terms of the error field generated by subtracting the predicted value of each pixel from its actual value. The sections below provide the alternate one-sided (unilateral) models for the 2D and 3D noncausal GMRFs.

A. Two-Dimensional Gauss–Markov Random Fields

In 2D image fields, the error $e(i,j)$ involved in the prediction is given by $e(i,j) = X(i,j) - \hat{X}(i,j)$. Using the lexicographic notation defined in Eq. (6) for the image pixels $X(i,j)$ and for the error image $e(i,j)$, the 2D GMRF is expressed in the following three equivalent representations.

1. Two-Dimensional Bilateral Representation

The 2D bilateral representation is obtained by expressing Eq. (11) in matrix-vector format as
\[
A \tilde{X} = e, \tag{20}
\]
where $\tilde{X}$ is the $(N_I N_J \times 1)$ column vector for the original image $X(i,j)$. Similarly, $e$ is the $(N_I N_J \times 1)$ column vector for the error field $e(i,j)$. The matrix $A$ is the $(N_I N_J \times N_I N_J)$ potential matrix and has the following
structure for zero Dirichlet¹ (Andrews and Hunt, 1977) boundary conditions:
\[
A = \begin{bmatrix}
B & C & 0 & \cdots & 0 \\
C & B & C & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & C & B & C \\
0 & \cdots & 0 & C & B
\end{bmatrix} \tag{21}
\]
with
\[
B = \begin{bmatrix}
1 & -\beta_h & 0 & \cdots & 0 \\
-\beta_h & 1 & -\beta_h & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & -\beta_h & 1 & -\beta_h \\
0 & \cdots & 0 & -\beta_h & 1
\end{bmatrix}
\quad \text{and} \quad
C = \begin{bmatrix}
-\beta_v & 0 & \cdots & \cdots & 0 \\
0 & -\beta_v & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & -\beta_v & 0 \\
0 & \cdots & \cdots & 0 & -\beta_v
\end{bmatrix}. \tag{22}
\]
We use the Kronecker product to represent block banded matrices in a compact fashion. Using the Kronecker product, the potential matrix $A$ and its constituent blocks $\{B, C\}$ are represented as follows:
\[
A = I_{N_I} \otimes B + H^1_{N_I} \otimes C
\quad \text{with } B = -\beta_h H^1_{N_J} + I_{N_J}
\text{ and } C = -\beta_v I_{N_J}, \tag{23}
\]
where $\otimes$ represents the Kronecker product. The symbols $I_{N_I}$ and $I_{N_J}$ are identity matrices, while $H^1_{N_I}$ and $H^1_{N_J}$ are Toeplitz matrices that have zeros everywhere except for the first upper and first lower diagonals, which are composed of all 1s; the subscripts denote the order of the matrices. For a 2D, first-order, noncausal GMRF, the potential matrix $A$ in Eq. (23) is a sparse block tridiagonal matrix and contains all the relevant information regarding the GMRF structure (Moura and Balram, 1992). The error vector $e$ is a sample from a colored noise process with covariance $\Sigma_e = \sigma^2 A$. Beginning with $A\tilde{X} = e$, it can be shown by application of the orthogonality principle that the information matrix for the GMRP $\tilde{X}$ is given by $\Sigma_x^{-1} = \frac{1}{\sigma^2} A$. The structure of the potential matrix $A$ includes both past (left and up) and future (right and down) pixels for the prediction of a pixel value.

¹ For convenience of notation, we express our results in terms of zero Dirichlet boundary conditions. The results are generalizable to other boundary conditions. In fact, we use symmetric Neumann boundary conditions in our experiments.
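The Kronecker form of Eq. (23) translates directly into code. The sketch below assembles the dense potential matrix $A$ for a small first-order 2D GMRF with zero Dirichlet boundaries; for realistic image sizes a sparse representation would be required, and the helper names here are ours, not the chapter's.

```python
import numpy as np

def H1(n):
    """Toeplitz matrix with 1s on the first upper and lower diagonals."""
    return np.eye(n, k=1) + np.eye(n, k=-1)

def potential_matrix_2d(NI, NJ, beta_v, beta_h):
    """Block tridiagonal potential matrix A of Eq. (23)."""
    B = np.eye(NJ) - beta_h * H1(NJ)    # tridiagonal block of Eq. (22)
    C = -beta_v * np.eye(NJ)            # diagonal block of Eq. (22)
    return np.kron(np.eye(NI), B) + np.kron(H1(NI), C)

A = potential_matrix_2d(4, 3, beta_v=0.2, beta_h=0.2)
assert np.allclose(A, A.T)              # the potential matrix is symmetric
```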
Such a representation precludes recursive computations such as Kalman–Bucy filtering (KBF) or RTS smoothing. The one-sided representations for the noncausal GMRFs are derived by taking the Cholesky factorizations of $A$. We consider the one-sided (unilateral) representations next.

2. Two-Dimensional Forward Unilateral Representation

The forward representation is obtained by taking the lower Cholesky factorization $A = L^T L$, where the matrix $L$ is a lower triangular $(N_I N_J \times N_I N_J)$ block bidiagonal matrix
\[
L = \begin{bmatrix}
L_{11} & & & & 0 \\
L_{21} & L_{22} & & & \\
& \ddots & \ddots & & \\
& & L_{N_I-1\,N_I-2} & L_{N_I-1\,N_I-1} & \\
0 & & & L_{N_I\,N_I-1} & L_{N_I N_I}
\end{bmatrix} \tag{24}
\]
with $(N_J \times N_J)$ blocks $L_{ii}$ $(1 \le i \le N_I)$ on the main block diagonal and $L_{ii-1}$ $(2 \le i \le N_I)$ on the first lower block diagonal of $L$. Substituting $A = L^T L$ in $A\tilde{X} = e$, left multiplying by $L^{-T}$, and expanding in terms of the vectors $X_i$ representing the pixel values of row $i$ results in the following one-sided representation for the 2D noncausal GMRF:
\[
L_{11} X_1 = v_1, \quad \text{for } (i = 1), \tag{25}
\]
\[
L_{ii-1} X_{i-1} + L_{ii} X_i = v_i, \quad \text{for } (2 \le i \le N_I). \tag{26}
\]
The vector $v = L^{-T} e$ represents the forward 2D whitened noise field $v(i,j)$; it is obtained by lexicographically arranging the field sites $v(i,j)$ of row $i$ in a column vector $v_i$ and then stacking the vectors corresponding to all rows on top of each other, as in Eq. (6). From the relationship $v = L^{-T} e$, it follows that the noise field $v(i,j)$ is white, given that the covariance matrix of $e$ is $\Sigma_e = \sigma^2 A$. This implies that the 2D forward representation has completely removed the correlation and whitened the image field.

The 2D forward regressors, $L_{ii-1}$ and $L_{ii}$, in Eq. (24) are evaluated by solving a Riccati-type equation that follows by equating the main block diagonals and upper block diagonals in $A = L^T L$. The resulting expressions are
\[
S_{N_I} = B, \quad \text{for } (i = N_I), \tag{27}
\]
\[
S_i = B - C S_{i+1}^{-1} C, \quad \text{for } (N_I - 1 \ge i \ge 1), \tag{28}
\]
where $L_{ii}^T L_{ii} = S_i$ and $L_{ii-1} = L_{ii}^{-T} C$. With real data fields and in actual applications, evaluation of the 2D regressors $L_{ii}$ and $L_{ii-1}$ is not required
for all rows. In fact, the regressors converge asymptotically to a steady-state solution at a geometric rate (see Moura and Balram, 1992); hence, only a few of the regressors need to be computed. For a first-order GMRF, the steady-state solution for the regressors is given by
\[
S_\infty = B/2 + \sqrt{(B/2)^2 - \beta_v^2 I_{N_J}}
\quad \text{with } L_{ii,\infty}^T L_{ii,\infty} = S_\infty
\text{ and } L_{ii-1,\infty} = L_{ii,\infty}^{-T} C. \tag{29}
\]
The notation $\sqrt{G}$ defines the principal square root of a square matrix $G$ such that $\sqrt{G} \cdot \sqrt{G} = G$. Experimental results show that the spectral norm $\|S_i - S_\infty\|$ decreases geometrically to extremely low values of $O(10^{-8})$ within a few $(i \le 10)$ iterations.
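The recursion of Eqs. (27) and (28) and the regressor definitions that follow them fit in a few lines of code. The sketch below stops early once the geometric convergence to $S_\infty$ noted above is detected; it is an illustration under our own conventions, not the chapter's implementation.

```python
import numpy as np

def forward_regressors_2d(B, C, NI, tol=1e-8):
    """Riccati recursion of Eqs. (27)-(28), run from i = NI down to 1,
    with early exit on convergence. Returns the converged S together
    with the regressors of Eq. (29): L_ii such that L_ii^T L_ii = S,
    and L_{ii-1} = L_ii^{-T} C."""
    S = B.copy()                                  # Eq. (27), i = NI
    for _ in range(NI - 1):
        S_new = B - C @ np.linalg.solve(S, C)     # Eq. (28)
        converged = np.linalg.norm(S_new - S, 2) < tol
        S = S_new
        if converged:
            break
    # np.linalg.cholesky returns G with S = G G^T (G lower triangular),
    # so L_ii = G^T satisfies L_ii^T L_ii = S.
    L_ii = np.linalg.cholesky(S).T
    L_prev = np.linalg.solve(L_ii.T, C)           # L_ii^{-T} C
    return S, L_ii, L_prev
```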
3. Two-Dimensional Backward Unilateral Representation

The backward representation is obtained by taking the upper Cholesky factorization $A = U^T U$, where the matrix $U$ is an upper triangular $(N_I N_J \times N_I N_J)$ block bidiagonal matrix
\[
U = \begin{bmatrix}
U_{11} & U_{12} & & & 0 \\
& U_{22} & U_{23} & & \\
& & \ddots & \ddots & \\
& & & U_{N_I-1\,N_I-1} & U_{N_I-1\,N_I} \\
0 & & & & U_{N_I N_I}
\end{bmatrix} \tag{30}
\]
with $(N_J \times N_J)$ upper triangular blocks $U_{ii}$ $(1 \le i \le N_I)$ on the main block diagonal and blocks $U_{ii+1}$ $(1 \le i \le N_I - 1)$ on the first upper block diagonal. Using the upper Cholesky structure, the backward one-sided representation for the 2D noncausal GMRF is given by
\[
U_{N_I N_I} X_{N_I} = w_{N_I}, \quad \text{for } (i = N_I), \tag{31}
\]
\[
U_{ii} X_i + U_{ii+1} X_{i+1} = w_i, \quad \text{for } (N_I - 1 \ge i \ge 1). \tag{32}
\]
The vector $w = U^{-T} e$ represents the backward 2D whitened error field and is formed by lexicographic ordering of the backward error field $w(i,j)$. As for the 2D forward representation, the backward error field $w(i,j)$ is completely uncorrelated and white. The 2D backward regressors are obtained by expanding $A = U^T U$ in terms of the constituent blocks. The resulting expressions are
\[
R_1 = B, \quad \text{for } (i = 1), \tag{33}
\]
\[
R_i = B - C R_{i-1}^{-1} C, \quad \text{for } (2 \le i \le N_I). \tag{34}
\]
As for the 2D forward regressors, the backward regressors converge asymptotically at a geometric rate to a steady-state solution.

The 2D unilateral representations provide one-sided models for image processing. From a practical point of view, the unilateral representations enable the optimal noncausal processing of the 2D image field by recursive techniques such as the RTS smoother and the KBF. The 2D unilateral forward representation processes the first row $(i = 1)$ first and then proceeds in the natural order $(2 \le i \le N_I)$ along the rows $i$ of the image field. On the contrary, the 2D unilateral backward representation processes the last row $(i = N_I)$ first and then proceeds in reversed order $(N_I - 1 \ge i \ge 1)$ from the last row to the first row of the image field. Consequently, we use the 2D unilateral forward representation for processing images. The results are general, however, and can also be developed for the backward representation.

B. Three-Dimensional Gauss–Markov Random Fields

By following the procedure for the 2D GMRFs and assuming zero Dirichlet boundary conditions, we derive the following three equivalent representations for the 3D GMRFs.

1. Three-Dimensional Bilateral Representation

The 3D bilateral representation is given by
\[
A \tilde{X} = e, \tag{35}
\]
where
\[
A = I_{N_K} \otimes A_1 + H^1_{N_K} \otimes A_2, \tag{36}
\]
\[
A_1 = I_{N_I} \otimes B + H^1_{N_I} \otimes C, \quad \text{and} \quad A_2 = I_{N_I} \otimes D. \tag{37}
\]
In the context of 3D video sequences, recall that $\tilde{X}$ contains the pixel values of the entire video and has dimensions of $(N_I N_J N_K \times 1)$. The constituent sub-blocks $B$, $C$, and $D$ are given by
\[
B = -\beta_h H^1_{N_J} + I_{N_J}, \quad C = -\beta_v I_{N_J}, \quad \text{and} \quad D = -\beta_t I_{N_J}. \tag{38}
\]
Comparing the bilateral representation for the 2D image fields [Eqs. (20)–(23)] with that for the 3D video sequences [Eqs. (35)–(38)], we note that, apart from the increased dimensions for the video sequences, the other difference between the two bilateral representations lies in the structure of the potential matrix $A$. While $A$ is block banded in both cases, the level of blockness is different. For the 2D image fields, $A$ consists of blocks $B$ and $C$ and has, therefore, a single level of blockness. On the other hand, $A$ for the 3D video sequences has two levels of blockness: its constituent blocks $A_1$ and $A_2$ are themselves block banded and consist of the sub-blocks $B$, $C$, and $D$. The two
levels of blockness are also reflected in the 3D forward and backward unilateral representations, which are considered next.

2. Three-Dimensional Forward Unilateral Representation

As for the 2D forward representation, the 3D forward unilateral representation is achieved by taking the lower Cholesky factorization $A = L^T L$, where the lower triangular matrix $L$ has the following structure:
\[
L = \begin{bmatrix}
L^{(1)} & & & & 0 \\
F^{(2)} & L^{(2)} & & & \\
& \ddots & \ddots & & \\
& & F^{(N_K-1)} & L^{(N_K-1)} & \\
0 & & & F^{(N_K)} & L^{(N_K)}
\end{bmatrix}. \tag{39}
\]
Exploiting the above block banded structure, the 3D bilateral representation $A\tilde{X} = e$ can be expressed as
\[
L^{(1)} X^{(1)} = v^{(1)}, \quad \text{for } k = 1, \tag{40}
\]
\[
F^{(k)} X^{(k-1)} + L^{(k)} X^{(k)} = v^{(k)}, \quad \text{for } (2 \le k \le N_K), \tag{41}
\]
where $v^{(k)}$ represents the row-ordered pixels in frame $k$ of the 3D whitened error field $v(i,j,k)$ obtained from the transformation $v = L^{-T} e$. The 3D forward regressors $L^{(k)}$ in $L$ are lower triangular, whereas the $F^{(k)}$ are upper triangular. The 3D forward representation [Eqs. (39)–(41)] whitens a frame $k$ of the video sequence per iteration; in contrast, the 2D forward representation [Eqs. (24)–(26)] whitens a row $i$ of the image field per iteration. Consequently, the 3D forward regressors $L^{(k)}$ and $F^{(k)}$ for the video sequence have dimensions of $(N_I N_J \times N_I N_J)$, in comparison with the 2D forward regressors $L_{ii}$ and $L_{ii-1}$ for the image field, which are of order $N_J$. Extending the 2D Riccati equations [Eqs. (27) and (28)], the expressions for the 3D forward regressors are given by
\[
S^{(N_K)} = A_1, \quad \text{for } (k = N_K), \tag{42}
\]
\[
S^{(k)} = A_1 - A_2 \bigl( S^{(k+1)} \bigr)^{-1} A_2, \quad \text{for } (N_K - 1 \ge k \ge 1), \tag{43}
\]
where $(L^{(k)})^T L^{(k)} = S^{(k)}$ and $F^{(k)} = (L^{(k)})^{-T} A_2$. The 3D forward regressors also converge at a geometric rate, with the steady-state values given by
\[
S^{(\infty)} = A_1/2 + \sqrt{(A_1/2)^2 - \beta_t^2 I_{N_I N_J}}
\quad \text{with } \bigl( L^{(\infty)} \bigr)^T L^{(\infty)} = S^{(\infty)}
\text{ and } F^{(\infty)} = \bigl( L^{(\infty)} \bigr)^{-T} A_2. \tag{44}
\]
3. Three-Dimensional Backward Unilateral Representation

The 3D backward unilateral representation is obtained by taking the upper Cholesky factorization $A = U^T U$, where the upper triangular matrix $U$ is given by
\[
U = \begin{bmatrix}
U^{(1)} & \Theta^{(1)} & & & 0 \\
& U^{(2)} & \Theta^{(2)} & & \\
& & \ddots & \ddots & \\
& & & U^{(N_K-1)} & \Theta^{(N_K-1)} \\
0 & & & & U^{(N_K)}
\end{bmatrix}. \tag{45}
\]
The 3D backward unilateral representation is given by
\[
U^{(N_K)} X^{(N_K)} = w^{(N_K)}, \quad \text{for } k = N_K, \tag{46}
\]
\[
U^{(k)} X^{(k)} + \Theta^{(k)} X^{(k+1)} = w^{(k)}, \quad \text{for } (N_K - 1 \ge k \ge 1). \tag{47}
\]
The vector $w^{(k)}$ represents the lexicographic pixels in frame $k$ of the 3D whitened error field $w = U^{-T} e$. The 3D backward regressors $U^{(k)}$ are upper triangular and the $\Theta^{(k)}$ are lower triangular. The Riccati equations for computing the 3D backward regressors are given by
\[
R^{(1)} = A_1, \quad \text{for } (k = 1), \tag{48}
\]
\[
R^{(k)} = A_1 - A_2 \bigl( R^{(k-1)} \bigr)^{-1} A_2, \quad \text{for } (2 \le k \le N_K), \tag{49}
\]
where $(U^{(k)})^T U^{(k)} = R^{(k)}$ and $\Theta^{(k)} = (U^{(k)})^{-T} A_2$. The 3D backward regressors also converge at a geometric rate to steady-state values.

As for the 2D image fields, we use the 3D forward unilateral representation of the GMRF for processing video sequences. This allows processing of the video in the natural order of frames, that is, $(1 \le k \le N_K)$. The results derived for the 3D forward unilateral representation are, however, generalizable to the 3D backward unilateral representation.

The next sections discuss three applications of the GMRFs in image and video processing. Section IV describes a restoration technique for deblurring images corrupted with additive noise; our algorithm uses a practical implementation of the RTS smoother based on noncausal prediction that models the blurred image as a finite-lattice GMRF. Section V presents a low bit rate video coding scheme referred to as SNP/VQR, whose novelty and superior performance are due to the noncausal GMRP paradigm. Finally, Section VI develops novel matrix inversion algorithms for inverting block banded matrices such as the potential matrix $A$ of the GMRF.
IV. RAUCH–TUNG–STRIEBEL SMOOTHING FOR IMAGE RESTORATION

An image $X(i,j)$, for $(1 \le i \le N_I)$ and $(1 \le j \le N_J)$, acquired by a practical (typically imperfect) imaging system suffers from degradations resulting from such factors as sensor noise, improper camera focus, and relative object-camera motion. If the imaging system is modeled by a linear shift-invariant (LSI) system, the observed image $z(i,j)$ can be expressed in terms of the ideal or original image $X(i,j)$ as
\[
z(i,j) = g(i,j) \oplus X(i,j) + \xi(i,j), \tag{50}
\]
where $\oplus$ denotes the 2D convolution operator, $g(i,j)$ is the point spread function (PSF) that introduces blur in $X(i,j)$, and $\xi(i,j)$ represents additive noise. Image restoration filters noise and blur from the observed image to minimize the effect of degradations; in other words, the objective of image restoration is to determine $X(i,j)$ from the observed image $z(i,j)$.

Broadly speaking, robust image restoration algorithms can be classified in two categories. Category 1, based on inverse filters (Citrin and Azimi-Sadjadi, 1992), considers only blurring and performs poorly in the presence of observation noise. Category 2 uses the Wiener filter or compound Gauss–Markov random field (CGMRF)-based restoration (Molina et al., 2000) to control changes in the image model using a hidden random field. Our algorithm belongs to the second category and uses the RTS smoother (Rauch et al., 1965) for image restoration. The state model of the RTS smoother is derived by considering the image field as a 2D first-order, noncausal GMRF; the observation model comes directly from Eq. (50). The main advantage of the proposed algorithm lies in its computational speed. In CGMRF-based restoration algorithms, the process of finding the maximum a posteriori (MAP) estimate is complex, and methods such as simulated annealing, deterministic relaxation, or anisotropic diffusion are used. Such methods are computationally intensive and may lead to unstable solutions. We derive a fast implementation of the RTS smoother for image restoration that is stable, yet provides superior quality with respect to the Wiener filter and filters that use a one-sided causal state model. Unlike our earlier work in (Moura and Balram, 1991) and (Asif, 2004), which considers additive noise only, this chapter considers restoration of images corrupted with both noise and blur.

A. Blurring Models

We consider restoring images blurred by either of the following two PSFs, as defined in Tekalp et al. (1985),
Truncated Gaussian Blur:
\[
g(i,j) = \begin{cases} k_1\, e^{-\frac{i^2+j^2}{2\sigma^2}}, & \text{for } |i|, |j| \le 2, \\ 0, & \text{otherwise.} \end{cases}
\]
Out-of-Focus Blur:
\[
g(i,j) = \begin{cases} k_2, & \text{for } (i^2 + j^2) \le 5, \\ 0, & \text{otherwise,} \end{cases}
\]
and corrupted by white Gaussian noise. The values of the constants $k_1$ and $k_2$ in the PSFs are selected such that $\sum_{i,j} g(i,j) = 1$ for each blurring function. The constant $\sigma$ in the truncated Gaussian blur corresponds to the standard deviation of the blur; in our experiments, the standard deviation $\sigma$ is set equal to 6.

For the two PSFs, Eq. (50) can be expressed in the matrix-vector notation
\[
Z_i = (G_1 X_{i-2} + G_2 X_{i-1} + G_3 X_i + G_2 X_{i+1} + G_1 X_{i+2}) + \Xi_i
\quad \text{for } (1 \le i \le N_I), \tag{51}
\]
where $X_i$ is defined in Eq. (6) and $Z_i = [\, z(i,1)\ z(i,2)\ \cdots\ z(i,N_J) \,]^T$ is an $(N_J \times 1)$ vector for row $i$ of the blurred image $z(i,j)$. The observation noise vector $\Xi_i$ is obtained by lexicographic ordering of row $i$ of the noise field $\xi(i,j)$. The blocks $G_i$ are sparse and Toeplitz, with exactly five nonzero diagonals for the two aforementioned blur models. In Eq. (51), we assume zero Dirichlet conditions; hence, any pixel outside the image boundaries is assumed to be of zero intensity value.
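For concreteness, the sketch below generates the truncated Gaussian PSF on its 5 × 5 support and forms the degraded observation of Eq. (50), with the noise scaled to a requested SNR as in the 10 dB experiments of Section IV.C. The function names and the SciPy dependency are our own choices, not the chapter's.

```python
import numpy as np
from scipy.signal import convolve2d

def truncated_gaussian_psf(sigma=6.0, radius=2):
    """Truncated Gaussian PSF of Section IV.A: k1 exp(-(i^2+j^2)/(2 sigma^2))
    on the square support |i|, |j| <= radius, normalized to sum to 1."""
    i, j = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    g = np.exp(-(i ** 2 + j ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def observe(X, g, snr_db=10.0, rng=None):
    """Observation z = g (*) X + xi of Eq. (50), with white Gaussian noise
    and zero Dirichlet boundaries (pixels outside the image are zero)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    blurred = convolve2d(X, g, mode="same", boundary="fill")
    noise_var = blurred.var() / (10.0 ** (snr_db / 10.0))
    return blurred + rng.normal(0.0, np.sqrt(noise_var), X.shape)
```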
Dynamic Models: For the RTS algorithm, we use the 2D forward unilateral representation as the state model; in other words, we model the blurred image as a 2D noncausal GMRF of order $(p = 1)$. The 2D unilateral representation [Eq. (26)] and the observation model [Eq. (51)] cannot be used directly in the RTS algorithm because of the difference in the dimensions of the field vectors $X_i$ containing the state. Defining a new state vector $\Psi_i = [\, X_{i-2}^T\ X_{i-1}^T\ X_i^T\ X_{i+1}^T\ X_{i+2}^T \,]^T$ results in the following dynamical equations for the state and observation models:
\[
\underbrace{\begin{bmatrix} X_{i-1} \\ X_i \\ X_{i+1} \\ X_{i+2} \\ X_{i+3} \end{bmatrix}}_{\Psi_{i+1}}
= \underbrace{\begin{bmatrix}
0 & I_{N_J} & 0 & 0 & 0 \\
0 & 0 & I_{N_J} & 0 & 0 \\
0 & 0 & 0 & I_{N_J} & 0 \\
0 & 0 & 0 & 0 & I_{N_J} \\
0 & 0 & 0 & 0 & -L_{ii}^{-1} L_{ii-1}
\end{bmatrix}}_{\Gamma_i}
\underbrace{\begin{bmatrix} X_{i-2} \\ X_{i-1} \\ X_i \\ X_{i+1} \\ X_{i+2} \end{bmatrix}}_{\Psi_i}
+ \underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ L_{ii}^{-1} \end{bmatrix}}_{\Pi_i} V_i, \tag{52}
\]
\[
Z_{i+1} = \underbrace{[\, G_1\ \ G_2\ \ G_3\ \ G_2\ \ G_1 \,]}_{G}\, \Psi_{i+1} + \Xi_{i+1}. \tag{53}
\]
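Assembling the matrices $\Gamma$, $\Pi$, and $G$ displayed above is mostly block bookkeeping, as the following sketch shows for converged regressors $\{L_{ii}, L_{ii-1}\}$ and blur blocks $G_1$, $G_2$, $G_3$ supplied by the caller. The helper is a hypothetical illustration, not code from the chapter.

```python
import numpy as np

def state_matrices(L_ii, L_prev, G1, G2, G3):
    """Steady-state Gamma, Pi, and G of Eqs. (52) and (53).
    All input blocks are (NJ x NJ)."""
    NJ = L_ii.shape[0]
    I, Z = np.eye(NJ), np.zeros((NJ, NJ))
    L_inv = np.linalg.inv(L_ii)
    Gamma = np.block([
        [Z, I, Z, Z, Z],
        [Z, Z, I, Z, Z],
        [Z, Z, Z, I, Z],
        [Z, Z, Z, Z, I],
        [Z, Z, Z, Z, -L_inv @ L_prev],   # last row of Eq. (52)
    ])
    Pi = np.vstack([Z, Z, Z, Z, L_inv])  # noise shaping in Eq. (52)
    G = np.hstack([G1, G2, G3, G2, G1])  # observation row of Eq. (53)
    return Gamma, Pi, G
```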
These dynamic models are used in the image restoration algorithm explained next. Note that the state model [Eq. (52)] consists of a set of five simultaneous equations. The first four (rows 1 to 4) are trivial, while the fifth equation (last row) is a direct extension of the 2D forward unilateral representation for the first-order, noncausal GMRF considered in Section III.A.2. The state matrices $\Gamma_i$ and $\Pi_i$ vary from one row $i$ to another; once the 2D regressors $\{L_{ii}, L_{ii-1}\}$ reach the steady-state values $\{L_{ii\infty}, L_{ii-1\infty}\}$ after a few rows, these matrices become invariant with respect to the rows. In the following algorithm, we assume steady-state conditions such that the state matrices $\{\Gamma_i, \Pi_i\}$ equal the invariant values $\{\Gamma, \Pi\}$. The state and observation noise are assumed to be white and Gaussian, given by $V_i \sim N(0, \Sigma_w = \sigma_w^2 I)$ and $\Xi_i \sim N(0, \Sigma_r = \sigma_r^2 I)$, where $N$ denotes the Gaussian distribution with its mean and covariance matrix shown within the brackets.

B. Image Restoration Algorithm

Based on the RTS algorithm, the procedure for obtaining a restored image has three sequential stages, outlined below.

1. Parameter Estimation. After subtracting the global mean, the horizontal ($\beta_h$) and vertical ($\beta_v$) field interactions are estimated. Based on the values of the interactions $(\beta_h, \beta_v)$, the steady-state values $\{L_{ii\infty}, L_{ii-1\infty}\}$ of the regressors $(L_{ii}, L_{ii-1})$ are computed using Eq. (29).

2. Steady-State Error Covariance Approximation. Since the state and observation equations [Eqs. (52) and (53)] are shift-invariant, the predictor
covariance matrix $P_{i+1|i}$ is approximated with its steady-state value (say $P^{(p)}$) computed using the Riccati equation in the KBF:
\[
P_{i+1|i} = \Gamma P_{i|i} \Gamma^T + \Pi Q \Pi^T, \tag{54}
\]
\[
P_{i+1|i+1} = [I - K_{i+1} G] P_{i+1|i}, \tag{55}
\]
\[
\text{where } K_{i+1} = P_{i+1|i} G^T \bigl( G P_{i+1|i} G^T + R \bigr)^{-1}. \tag{56}
\]
The initial condition is $P_{1|0} = \sigma^2 I$. The Frobenius norm of the difference $(P_{i+2|i+1} - P_{i+1|i})$ is used as the convergence criterion for the predictor covariance matrix.

3. RTS Smoothing. The state model [Eq. (52)] and the observation model [Eq. (53)], with the state matrices provided by step 1 and the predictor covariance matrix given by step 2, are used as the basis for the double-sweep RTS smoother (Rauch et al., 1965). The forward sweep recursively computes the predictor estimate $(\hat{\Psi}_{i+1|i})$ and the filter estimate $(\hat{\Psi}_{i+1|i+1})$ using the KBF:

(a) Predictor update.
\[
\hat{\Psi}_{i+1|i} = \Gamma \hat{\Psi}_{i|i} \quad \text{with } \hat{\Psi}_{1|0} = 0. \tag{57}
\]
(b) Filter update.
\[
\hat{\Psi}_{i+1|i+1} = \hat{\Psi}_{i+1|i} + K \bigl( Z_{i+1} - G \hat{\Psi}_{i+1|i} \bigr), \tag{58}
\]
for $1 \le i \le (N_I - 1)$. It may be noted that the Kalman gain $K$ and the filter covariance matrix $P_{i+1|i+1}$ can be expressed in terms of the steady-state value of the predictor covariance matrix using Eqs. (55) and (56) with $P_{i+1|i} = P^{(p)}$. Both the Kalman gain $K_i$ and the filter covariance matrix $P_{i+1|i+1}$ converge and do not need any further updating during the KBF iterations; their steady-state values are denoted by $K$ and $P^{(f)}$. The predicted field $\hat{\Psi}_{i+1|i}$ and the filtered field $\hat{\Psi}_{i+1|i+1}$ are the outputs of this stage.

The backward sweep computes the smoother estimate $\hat{\Psi}_i$ from the predictor estimate $(\hat{\Psi}_{i+1|i})$ and the filter estimate $(\hat{\Psi}_{i+1|i+1})$ provided by the KBF:

(c) Smoother update.
\[
\hat{\Psi}_i = \hat{\Psi}_{i|i} + S \bigl( \hat{\Psi}_{i+1} - \hat{\Psi}_{i+1|i} \bigr) \quad \text{with } \hat{\Psi}_{N_I} = \hat{\Psi}_{N_I|N_I}, \tag{59}
\]
for $(N_I - 1 \ge i \ge 1)$. The smoother gain $S$ uses the steady-state values for the error covariances and is given by $S = P^{(f)} \Gamma^T \bigl( P^{(p)} \bigr)^{-1}$, where $P^{(f)}$ is the steady-state value of the filter covariance matrix. The smoother covariance matrix $P_i$ is obtained from the following
iteration:
\[
P_i = P^{(f)} + \Gamma \bigl( P_{i+1} - P^{(p)} \bigr) \Gamma^T \quad \text{with } P_{N_I} = P^{(f)}. \tag{60}
\]
The smoothed image $X_s(i,j)$, obtained from the state vector $\hat{\Psi}_i$, is the output of this final stage.
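The three stages condense into a short double-sweep routine. The sketch below iterates Eqs. (54)–(56) toward the steady-state covariances and then runs the forward and backward sweeps of Eqs. (57)–(59); it is a simplified illustration with dense linear algebra and a fixed iteration budget, not the fast implementation described in the text. Here Q and R stand for Σw and Σr.

```python
import numpy as np

def rts_smooth(Z, Gamma, Pi, G, Q, R, n_riccati=100):
    """Steady-state double-sweep RTS smoother of Section IV.B.
    Z is the list of observation vectors Z_1 .. Z_NI."""
    n = Gamma.shape[0]
    # Stage 2: iterate Eqs. (54)-(56) to the steady-state covariances.
    P = np.eye(n)
    for _ in range(n_riccati):
        K = P @ G.T @ np.linalg.inv(G @ P @ G.T + R)    # Eq. (56)
        Pf = (np.eye(n) - K @ G) @ P                    # Eq. (55)
        P = Gamma @ Pf @ Gamma.T + Pi @ Q @ Pi.T        # Eq. (54)
    # Forward sweep: predictor and filter updates, Eqs. (57)-(58).
    psi_p, psi_f = [np.zeros(n)], []
    for Zi in Z:
        psi_f.append(psi_p[-1] + K @ (Zi - G @ psi_p[-1]))
        psi_p.append(Gamma @ psi_f[-1])
    # Backward sweep: smoother update, Eq. (59).
    S = Pf @ Gamma.T @ np.linalg.inv(P)                 # smoother gain
    psi_s = [None] * len(Z)
    psi_s[-1] = psi_f[-1]
    for i in range(len(Z) - 2, -1, -1):
        psi_s[i] = psi_f[i] + S @ (psi_s[i + 1] - psi_p[i + 1])
    return psi_s
```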
C. Image Restoration Experiments

This section compares the performance of the RTS-based smoothing algorithm that uses noncausal predictive models against other enhancement schemes. Qualitative and quantitative studies are performed. The qualitative analysis is based on subjective evaluation of the images restored by the selected schemes. The quantitative comparison is based on the PSNR
\[
\text{PSNR (in dB)} = 10 \log_{10} \Bigl( \frac{255^2}{\text{MSE}} \Bigr),
\quad \text{where MSE} = \frac{1}{N_I N_J} \sum_{i=1}^{N_I} \sum_{j=1}^{N_J} \bigl[ X(i,j) - X_s(i,j) \bigr]^2, \tag{61}
\]
and $X_s(i,j)$ denotes the restored version of the original image $X(i,j)$. We compare our RTS-based noncausal algorithm with the following restoration techniques (a sketch of a frequency-domain Wiener implementation appears after the list):

(a) Wiener filter. Using the orthogonality principle, the transfer function $W(f_1, f_2)$ of the Wiener filter is
\[
W(f_1, f_2) = \frac{G^*(f_1, f_2)\, S_{xx}(f_1, f_2)}{|G(f_1, f_2)|^2\, S_{xx}(f_1, f_2) + S_{vv}(f_1, f_2)}, \tag{62}
\]
where $S_{xx}(f_1, f_2)$ and $S_{vv}(f_1, f_2)$ are the power spectral densities (PSDs) of the original image and the additive noise in terms of the horizontal frequency $f_1$ and vertical frequency $f_2$. The term $G(f_1, f_2)$ is the 2D Fourier transform of the PSF $g(i,j)$ defined in Eq. (50), while the notation $G^*(f_1, f_2)$ denotes the complex conjugate of $G(f_1, f_2)$.

(b) Spatial averaging. The input pixel is replaced by a spatial average of its neighborhood pixels. A $(3 \times 3)$ window around the reference pixel $X(i,j)$ is used as the neighborhood for the spatial averaging filter, which is defined as
\[
X_s(i,j) = \frac{1}{9} \sum_{\ell_1=-1}^{1} \sum_{\ell_2=-1}^{1} z(i+\ell_1, j+\ell_2). \tag{63}
\]
FIGURE 3. Image restoration of the aerial image: (a) original, (b) noisy and blurred with truncated Gaussian blur (MSE = 1181.60), (c) restored image with (3 × 3) spatial averaging (MSE = 1015.3), (d) restored image with Wiener filter (MSE = 439.6), (e) restored image with RTS algorithm using causal prediction (MSE = 407.89), and (f) restored image with RTS algorithm using noncausal prediction (MSE = 207.54).
(c) RTS with causal prediction. This resembles the RTS algorithm described above except for the prediction model, which is one-sided (causal) and is given by
\[
\hat{x}(i,j) = \beta_d^c\, x(i-1,j-1) + \beta_v^c\, x(i-1,j) + \beta_h^c\, x(i,j-1), \tag{64}
\]
where $\beta_d^c$, $\beta_v^c$, and $\beta_h^c$ are the diagonal, vertical, and horizontal field interactions of a third-order Markov mesh; the superscript $c$ denotes causal prediction.

(d) RTS with noncausal GMRP prediction. This is the RTS algorithm described in this section.
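As promised above, here is a sketch of the frequency-domain Wiener computation of scheme (a), applying Eq. (62) through the FFT. Flat (scalar) PSD estimates for Sxx and Svv are a simplifying assumption of ours; the chapter does not specify how the PSDs were estimated.

```python
import numpy as np

def wiener_restore(z, g, Sxx, Svv):
    """Wiener restoration per Eq. (62): z is the blurred noisy image,
    g the PSF, and Sxx, Svv scalar PSD estimates for image and noise."""
    G = np.fft.fft2(g, s=z.shape)          # 2D transfer function of the PSF
    W = np.conj(G) * Sxx / (np.abs(G) ** 2 * Sxx + Svv)
    return np.real(np.fft.ifft2(W * np.fft.fft2(z)))
```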
Figure 3 illustrates the experimental results of restoration schemes (a)–(d) with Neumann boundary conditions for an aerial test image distorted by the truncated Gaussian blur of Section IV.A with σ set to 6. In addition, Gaussian noise with a signal-to-noise ratio (SNR) of 10 dB is added to the
FIGURE 4. Image restoration of the Lenna image: (a) original, (b) noisy and blurred with out-of-focus blur (MSE = 501.33), (c) restored image with (3 × 3) spatial averaging (MSE = 432.24), (d) restored image with Wiener filter (MSE = 305.13), (e) restored image with RTS algorithm using causal prediction (MSE = 254.54), and (f) restored image with RTS algorithm using noncausal prediction (MSE = 166.46).
blurred image. The resulting distorted image is shown in Figure 3b, where for reference we also include the original image in Figure 3a. The output of the spatial averaging filter is shown in Figure 3c; the restored images shown here, and in the results presented later, include no additional postprocessing after restoration. Because the spatial filter does not take the blurring model into consideration, it is unable to remove the distortions introduced by blur. Figures 3d, e, and f illustrate the outputs from the Wiener and RTS filters. In Figure 3d, the Wiener filter is used. Figure 3e is obtained from the causal model given in Eq. (64), while Figure 3f uses the noncausal GMRP model described in Section IV.B. The RTS filter based on the noncausal GMRP model [scheme (d)] exhibits the best performance, restoring important features like edges more distinctly than its counterpart based on the third-order causal Markov mesh [scheme (c)], whose output includes undesired horizontal streaking. The Wiener filter is computationally intensive,
TABLE 1
QUANTITATIVE COMPARISONS BASED ON PSNR AND MSE FOR THE RTS-BASED GMRF RESTORATION SCHEMES PRESENTED IN SECTION IV

Restoration scheme              Aerial image            Lenna image
                                MSE       PSNR (dB)     MSE       PSNR (dB)
Noisy/blurred image             1181.6    17.41         501.33    21.13
Spatial averaging               1015.3    18.07         432.24    21.77
Wiener filtering                439.6     21.7          305.13    23.28
RTS with causal prediction      407.9     22.03         254.54    24.07
RTS with noncausal prediction   207.6     24.96         166.46    25.92
as it requires calculating the 2D Fourier transform (and its inverse) of the blurred image, yet it does not clearly restore most of the features present in the aerial image. A second comparison, based on the Lenna image, is shown in Figure 4, where the input image is distorted with the out-of-focus blur. The restoration results reinforce our earlier conclusions, with the RTS algorithm based on the noncausal GMRF outperforming the remaining filters. The MSEs included in Table 1 quantify the improvement achieved by the noncausal GMRF-based RTS image restoration algorithm over schemes (a)–(c) for the two test images.

D. Summary

This section presented a practical implementation of the RTS filter based on a noncausal GMRF state model for the restoration of a blurred image corrupted by additive noise. We exploit the shift-invariant characteristics of the state matrices in the noncausal GMRP predictive model and use the steady-state solution of the Riccati equation in the RTS filter. The resulting implementation is computationally practical and faster than the competing algorithms. The experimental results outperform those of the Wiener filter and illustrate the superiority of the noncausal GMRP prediction model used in the RTS filter over a causal prediction model. The following section considers a second application of the GMRF in the field of video compression.
V. VIDEO COMPRESSION
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
23
web conferencing, video on demand, and distance education. Despite tremendous growth in the area of networking, the bandwidth, or alternatively the transmission rate of the channel, remains one of the biggest bottlenecks in the transmission of streaming video. Consider, for example, the distribution of a QCIF-quality video with a resolution of 144 × 176 pixels per frame at 30 frames per second on the Internet. In the raw format, a transmission rate of 144 × 176 pixels/frame × 3 colors/pixel × 8 bits/color × 30 frames/s = 1.825 Mbps
is required to support each video stream. Because the Internet is shared, such transmission rates cannot be simultaneously supported for several audiovisual applications. Therefore, an important step in multimedia communications is compression of the raw data resulting from a video stream. In multimedia applications, a video compression/decompression system is referred to as a video codec, which is the focus of discussion in this section. To explain the application of GMRFs in video compression, we focus on the compression component of a multimedia communication system. The transmission issues such as error resilience (Li et al., 1998), scalability (Tan and Zakhor, 2001), and congestion control (Vicisano et al., 1998) are not explicitly addressed in this section. Broadly speaking, video codecs can be classified in two categories: transform codecs and predictive codecs. The transform codecs use an algebraic transform such as the discrete cosine transform (DCT) or discrete wavelet transform (DWT) to represent the video stream in a transformed domain, where the energy is bundled into a fewer number of significant coefficients. Compression is achieved by discarding the insignificant coefficients with relatively lower energy. The number of significant coefficients retained determines the compression ratio. The International Organization for Standardization (ISO) standard MPEG4 (ISO/IEC, 1999) and the International Telecomunication Union (ITU) standard H.263 (ITU-T Recommendation, 1998) are transform codecs. The predictive codecs (Asif and Moura, 1996; Asif and Kouras, 2006) use a different paradigm for compression, where the correlation present in the video stream is removed. The correlation is removed spatially within a frame by subtracting the predicted values of the pixels from the actual values, as well as temporally by using motioncompensation schemes. To illustrate the application of GMRF in video compression, we describe a new scalable video codec for low bit rates based on noncausal prediction and vector quantization (VQ). The proposed scheme is, therefore, a predictive codec and models the video sequence X(i, j, k) as a 3D GMRF, which is used to generate a 3D error field v(i, j, k) considerably less correlated than the original video sequence. Cascaded VQ is then used
24
ASIF
to compress the error field v(i, j, k). We apply extended replenishment to VQ where the index of the current vector is encoded and transmitted only if it is different from the index at the same location in the previous frame. The proposed video codec is referred to as scalable noncausal prediction with cascaded vector quantization and conditional replenishment (SNP/VQR). In our simulations, SNP/VQR outperforms the ISO standard, MPEG4 (ISO/IEC, 1999), and the ITU standard, H.263 (ITU-T Recommendation, 1998), both in terms of PSNR and perceived video quality at transmission rates below 150 Kbps. A. SNP/VQR Encoder As illustrated in Figure 5, the compression procedure of SNP/VQR has a predictive component followed by a quantization component. The predictive component, shown in part I (Predictor) of Figure 5, is based on modeling the video sequence as a 3D noncausal GMRF, described in Sections II and III, and has the following four stages. 1. In the first stage, the vertical, horizontal, and temporal interactions {βh , βv , βt } are estimated using Eq. (16) from the input video X(i, j, k). These interactions define the state or potential matrix A used for prediction. The interaction parameters are also required at the decoder to reconstruct the video and constitute overhead information transmitted to the receiver. 2. In the second stage, the 3D bilateral model is transformed to the 3D forward unilateral representation [Eqs. (40) and (41)] that uses the 3D forward regressors L(k) s and F (k) s for prediction. The recursive form is obtained by Cholesky factorization of the potential matrix A with matrices L(k) s and F (k) s being the output of this stage. Since the Cholesky regressors converge, only a limited number of these matrices are computed at this stage. 3. In the third stage, the input video is modeled as a first-order, noncausal 3D GMRF such that the unilateral prediction model is used to estimate each frame of the video. 4. Stage 4 generates the uncorrelated error field v by subtracting the values of the pixels in the predicted frame from the original pixel values. In SNP/VQR, we combine stages 3 and 4 such that the whitened error field v = L−T e is obtained separately for each frame k from the input video x(i, j, k) without computing the predicted values. In other words, Eqs. (40) and (41) are used to determine v. Since the noncausal GMRF at the receiver is based on the reconstructed frames, we also use the reconstructed frames for prediction at the encoder.
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
F IGURE 5.
25
Block diagram representation of the SNP/VQR video codec.
To achieve high compression ratios, the whitened error video v (k) is vector quantized using an N-stage cascaded VQ. The VQ step is shown in part II (Transmitter) of Figure 5. The indices of the vector-quantized blocks ∗ } constitute the output of the VQ step and are transmitted over { v1∗ , . . . , vN the channel. The vector-quantized blocks v1∗ obtained from stage 1 of the cascaded VQ form the base layer of the streaming video, while the blocks ∗ } form (N − 1) additional enhancement layers that add to the { v2∗ , . . . , vN video quality obtained from the base layer. In cascaded VQ, we also apply conditional replenishment (Goldberg and Sun, 1986) at each stage of cascaded VQ by encoding and transmitting the vector-quantized block only if its index is different from the corresponding index at the same location in the previous frame. Conditional replenishment leads to considerable reduction in the number of code vectors transmitted. Although not implemented in our codec, conditional replenishment can be coupled with motion compensation schemes to provide better tradeoffs between video quality and compression ratio than conditional replenishment alone. Part III (Decoder) of Figure 5 represents the SNP/VQR decoder, which ∗ } by inverting reconstructs the video sequence from the VQ blocks { v1∗ , . . . , vN the steps of the encoder (part I of Figure 5) in the reverse order. The
26
ASIF
expressions for obtaining the reconstructed video Xr(k) from v ∗(k) are −1 Xr(1) = L(1) v ∗(1) , for k = 1, −1 ∗(k) Xr(k) = L(k) − F (k) Xr(k−1) , for (2 k NK ), v
(65) (66)
which are derived from the transformed forward regression model [Eqs. (40) and (41)]. Since the VQ step introduces controlled distortion, the recon As in the standard structed video is not a perfect match of the input video X. codecs, SNP/VQR uses lossy compression to achieve low bit per pixel representations but, at low bit rates, the subjective quality of the reconstructed video is much superior to these standards. At bit rates below 150 Kbps, MPEG4 and H.263 exhibit several visual degradations such as blocking. SNP/VQR exhibits better visual quality with no blocking, and more details are retained in the compressed video. In terms of PSNR, SNP/VQR outperforms MPEG4 by 1 dB and H.263 by about 1.5 dB at transmission rates around 50 Kbps. A detailed comparison of SNP/VQR with H.263 and MPEG4 is performed later in this section. The following discussion provides a practical implementation of the SNP/VQR codec. B. Computationally Efficient Implementation
The 3D forward unilateral regression model Eqs. (40) and (41) is computationally impractical to implement even for a reduced video format like QCIF with a frame size of 144 × 176 pixels. For a QCIF video sequence, the linear dimension of the 3D forward regressors {L(k) , F (k) }s in the forward regression model with NI = 144 and NJ = 176 is roughly of O(2.5 × 104 ). Storage and matrix operations at such high dimensions are clearly not feasible. For example, inverting the 3D forward regressor L(k) for frame k in Eq. (66) requires computations of O(NI3 NJ3 ), or O(2.25 × 1013 ) flops assuming a QCIF resolution video. Because of the high computational complexity, direct implementation of SNP/VQR is not feasible in real-time multimedia applications. To derive practical implementations, we approximate the 3D forward regressors by an M-block banded structure. Before presenting the sub-block implementation, we comment first on the structure of the 3D forward regressors of the 3D first-order GMRF, which provide intuitive justification for the block banded approximation. 1. Structure of Three-Dimensional Forward Regressors The structure of the 3D forward regressors {L(k) , F (k) }s is illustrated through a simple example. A first-order Dirichlet field with βv = 0.156631, βh = 0.166309, and βt = 0.167446 is defined on a 3D (24 × 24 × 24) lattice, i.e.,
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
27
F IGURE 6. Illustration of the convergence properties in the 3D forward regressors for {βv = 0.156631, βh = 0.166309, βt = 0.167446}. (a) Plots of L(k) − L∞ and F (k) − F ∞ ∞ ∞ ∞ versus frame k. (b) Plots of L∞ ℓ1 +1ℓ1 +1 − Lℓ1 ℓ1 and Fℓ1 +1ℓ1 +1 − Fℓ1 ℓ1 versus block row ℓ1 . ∞ . (d) Plots of F ∞ for the (c) Plots of L∞ for the last five rows ℓ and (1 ℓ ℓ ) in L 1 2 1 ℓ1 ℓ2 ℓ1 ℓ2 first five rows ℓ1 and (ℓ1 ℓ2 NI ) in F ∞ .
NI = NJ = NK = 24. These values of the interaction parameters are derived from a real video sequence and are used to compute the 3D forward regressors {L(k) , F (k) }s using relationships in Eqs. (42) and (43). For comparison, we also compute the steady-state pair {L∞ , F ∞ } using Eq. (44). Based on the computed values, we make the following observations. Observation 1. In Figure 6(a), we plot the norm of the differences L(k) − L∞ and F (k) − F ∞ for (Nk k 1). The plots in Figure 6(a) highlight the rapid geometric convergence of the sequences L(k) and F (k) in a small number of iterations k.
28
ASIF
The remaining two observations are made for the constituent blocks {L(k) ℓ1 ℓ2 } (k)
of L(k) and {Fℓ1 ℓ2 } of F (k) , for (1 ℓ1 , ℓ2 NI ), from their values obtained from the aforementioned setup.
Observation 2. In Figure 6(b), we plot the norm of the differences −Fℓ∞ of the constituent sub-blocks in F ∞ . The plot is shown as Fℓ∞ 1 ℓ1 1 +1ℓ1 +1 a solid line marked with the symbol ∗. Likewise, the norm of the differences ∞ ∞ L∞ ℓ1 +1ℓ1 +1 − Lℓ1 ℓ1 for consecutive sub-blocks on the main diagonal in L is plotted as a solid line marked with the symbol ◦. We observe that the constituent sub-blocks in F ∞ and L∞ themselves converge along the block diagonals. Similar convergences were observed for other sub-blocks in F ∞ and L∞ along diagonals outside the main block diagonal. Observation 3. In Figures 6(c) and (d), the norms of the sub-blocks {L∞ ℓ1 ℓ2 } and {Fℓ∞ } along a block row ℓ are plotted. Interestingly, the sub-blocks 1 1 ℓ2 constituting the 3D forward regressors {L∞ , F ∞ } converge to 0 along block row ℓ1 on the respective nonzero side of the main diagonal. This illustrates that a block banded approximation to the 3D forward regressors {L∞ , F ∞ } is reasonable. To verify properties 1–3, we performed an element to element comparison between the relevant F (k) s (and L(k) s) for different values of field interactions. The evolution of the elements in F (k) s (and L(k) s) follows the pattern observed in properties 1–3. Based on observations 1–3, we approximate the 3D forward regressor F (k) in the Cholesky factor L by M1 block banded upper triangular matrices (Asif and Moura, 2005) as follows:
F (k) =
0 (k) F ℓ1 ℓ 2 0
,
(67)
for 0 (ℓ2 − ℓ1 ) M1 . Similarly, the 3D forward regressors L(k) in the Cholesky factor L are approximated by M2 block banded lower triangular
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
29
matrices as follows:
L(k) =
0
, (k) L ℓ1 ℓ2 0
(68)
for 0 (ℓ1 − ℓ2 ) M2 . Note that the upper triangular structure of F (k) and the lower triangular structure of L(k) follow directly from the forward unilateral representation [Eqs. (40) and (41)] and are not approximations. The only approximation made in Eqs. (67) and (68) is to impose a block banded structure on the nonzero triangular portion of the 3D forward regressors. When coupled with the one-sided forward representation of Eqs. (40) and (41), the block banded approximations considerably simplify the prediction model. The simplified model is referred to as sub-block SNP/VQR and is considered next. C. Sub-Block SNP/VQR To derive the sub-block SNP/VQR, we expand Eqs. (40) and (41) at the sub-block level with L(k) and F (k) approximated by the M-block banded approximations given in (67) and (68). Below we present the resulting expressions for M1 = M2 = 3. Frame (k = 1): ∀(1 ℓ1 NI ), ℓ1
(1)
τ =max(1,ℓ1 −3)
Frame (2 k NK ): min(ℓ 1 +3,NI ) τ =ℓ1
(1)
Lℓ1 τ Xτ(1) = vℓ1 .
(69)
∀(1 ℓ1 NI ),
(k) Fℓ1 τ Xτ(k−1)
+
ℓ1
τ =max(1,ℓ1 −3)
(k)
(k)
Lℓ1 τ Xτ(k) = vℓ1 . (k)
(k)
(70)
To compute the 3D forward regressor sub-blocks {Fℓ1 ℓ2 , Lℓ1 ℓ2 }, Eqs. (42) and (43) are expanded in terms of the block banded structure defined in Eqs. (67) and (68). For M1 = M2 = 3, the simplified expressions are given by Eqs. (71)–(74), where term δ¯ℓ1 NI equals 1 if ℓ1 = NI . Otherwise, δ¯ℓ1 NI = 0. (k) Since the 3D forward regressor sub-blocks {L(k) ℓ1 ℓ2 , Fℓ1 ℓ2 } in Eqs. (71)–(74) converge to a steady state, therefore only a limited number of these sub-blocks
30
ASIF
are computed. The steps involved in computing the steady state-values are as follows. Frame (k = NK ): (N )
(k)
1. Compute sub-blocks (LNI Kℓ2 ) (NI ℓ2 NI − 3), and FNI NI for the last row (ℓ1 = NI ) using Eqs. (71) and (72). (NK ) K) 2. Repeat step 1 for sub-blocks (L(N ℓ1 ℓ2 ) (ℓ1 − ℓ2 3), and (Fℓ1 ℓ2 ) (ℓ2 − ℓ1 3), for row (NI − 1 ℓ1 1), using Eqs. (71) and (72) until the sub-blocks converge. For real video sequences, we observed convergence within (ℓ1 < 20) iterations. Frame (k 2): 3. Steps 1 and 2 are repeated for each of the following frames ((NK − 1) (k) k 1) until the sub-blocks L(k) ℓ1 ℓ2 (ℓ1 − ℓ2 3), and Fℓ1 ℓ2 (ℓ2 − ℓ1 3) ((NI − 1) ℓ1 1), converge with respect to k. Convergence of (k) (k) {Lℓ1 ℓ2 , Fℓ1 ℓ2 } was observed within (k < 15) iterations. In steps 2 and 3, the convergence criterion is based on the Frobenius norm of the difference of the two sub-blocks. In step 3, for example, the convergence (k−1) −4 criterion is L(k) ℓ1 ℓ1 − Lℓ1 ℓ1 < δc , where δc is 10 . Frame (k = NK ): ⎫ (N ) (NK ) T (NK ) ⎪ , for N ) L ℓ 1, Lℓ1 ℓK1 = chol B − δ¯ℓ1 NI (Lℓ1 +1ℓ I 1 ⎪ ℓ1 +1ℓ1 1 ⎪ ⎪ (NK ) −T ⎪ (NK ) Lℓ1 ℓ1 −1 = Lℓ1 ℓ1 C, for NI ℓ1 2,⎬ (N ) Lℓ1 ℓK1 −2 = 0, for NI ℓ1 3,⎪ ⎪ ⎪ ⎪ ⎪ ⎭ (NK ) Lℓ1 ℓ1 −3 = 0, for NI ℓ1 4, (71)
(N ) K) Fℓ(N = (Lℓ1 ℓK1 )−T D, 1 ℓ1 (N )
(N )
(N )
(N )
K K Fℓ1 ℓK1 +1 = −(Lℓ1 ℓK1 )−T (Lℓ1 +1ℓ )T Fℓ1 +1ℓ , 1 1 +1
K) K ) −T K) T (NK ) = −(L(N (L(N Fℓ(N ℓ1 ℓ1 ) ℓ1 +1ℓ1 ) Fℓ1 +1ℓ1 +2 , 1 ℓ1 +2
(N )
(N )
⎫ ⎪ ⎪ ⎪ ⎪ ⎪ for (NI − 1) ℓ1 1, ⎬
for NI ℓ1 1,
(N )
(N )
K K Fℓ1 ℓK1 +3 = −(Lℓ1 ℓK1 )−T (Lℓ1 +1ℓ )T Fℓ1 +1ℓ , 1 1 +3
for (NI − 2) ℓ1 1, ⎪ ⎪ ⎪ ⎪ ⎪ ⎭ for (NI − 3) ℓ1 1.
(72)
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
31
Frame ((NK − 1) k 1): ℓ1 (k) Lℓ1 ℓ1 = chol B −
⎫ ⎪ ⎪ T (k+1) ⎪ (Fτ(k+1) ⎪ ⎪ ℓ1 ) Fτ ℓ1 ⎪ ⎪ ⎪ τ =max(1,ℓ1 −3) ⎪ ⎪
⎪ min(N ,ℓ +3) I 1 ⎪ ⎪ (k) T (k) ¯ ⎪ (Lτ ℓ1 ) Lτ ℓ1 , for (NI ℓ1 1),⎪ − δℓ 1 N I ⎪ ⎪ ⎪ ⎪ τ =ℓ1 +1 ⎪ ⎪ ⎪ ⎪ ℓ 1 −1 ⎪ ⎪ (k+1) T (k+1) (k) (k) −1 ⎪ ⎪ C− (Fτ ℓ1 ) Fτ ℓ1 −1 Lℓ1 ℓ1 −1 = (Lℓ1 ℓ1 ) ⎪ ⎪ ⎪ ⎪ τ =max(1,ℓ1 −3) ⎪ ⎪ ⎪
min(N ,ℓ +2) ⎪ I 1 ⎬ (k) T (k) (Lτ ℓ1 ) Lτ ℓ1 −1 , for NI ℓ1 2, − (73) ⎪ ⎪ τ =ℓ1 +1 ⎪ ⎪ ⎪ ⎪ ℓ 1 −2 ⎪ ⎪ (k) (k) −1 (k+1) T (k+1) ⎪ ⎪ − Lℓ1 ℓ1 −2 = (Lℓ1 ℓ1 ) (Fτ ℓ1 ) Fτ ℓ1 −2 ⎪ ⎪ ⎪ ⎪ τ =max(1,ℓ1 −3) ⎪
⎪ ⎪ ⎪ (k) T (k) ⎪ ⎪ − (Lℓ1 +1ℓ1 ) Lℓ1 +1ℓ1 −2 , for NI ℓ1 3, ⎪ ⎪ ⎪ ⎪ ⎪
⎪ ℓ 1 −3 ⎪ ⎪ ⎪ (k+1) T (k+1) (k) −1 (k) ⎪ − (Fτ ℓ1 ) Fτ ℓ1 −3 , Lℓ1 ℓ1 −3 = (Lℓ1 ℓ1 ) ⎪ ⎪ ⎪ ⎪ τ =max(1,ℓ1 −3) ⎪ ⎭ for NI ℓ1 4, ⎫ (k) −T ⎪ Fℓ(k) ) D, for N ℓ 1, = (L I 1 ⎪ ℓ ℓ ℓ 1 1 1 1 ⎪ ⎪ ⎪ (k) (k) −T (k) (k) T ⎪ ⎪ Fℓ1 ℓ1 +1 = −(Lℓ1 ℓ1 ) (Lℓ1 +1ℓ1 ) Fℓ1 +1ℓ1 +1 , ⎪ ⎪ ⎪ ⎪ for (NI − 1) ℓ1 1, ⎪ ⎪ ⎬ (k) (k) −T (k) (k) T Fℓ1 ℓ1 +2 = −(Lℓ1 ℓ1 ) (Lℓ1 +1ℓ1 ) Fℓ1 +1ℓ1 +2 ⎪ ⎪ T (k) ⎪ for (NI − 2) ℓ1 1, + (L(k) ⎪ ℓ1 +2ℓ1 ) Fℓ1 +2ℓ1 +2 , ⎪ ⎪ ⎪ ⎪ (k) (k) −T (k) (k) (k) (k) T T (Lℓ1 +1ℓ1 ) Fℓ1 +1ℓ1 +3 + (Lℓ1 +2ℓ1 ) Fℓ1 +2ℓ1 +3 ⎪ Fℓ1 ℓ1 +3 = −(Lℓ1 ℓ1 ) ⎪ ⎪ ⎪ ⎪ (k) (k) ⎭ T + (Lℓ1 +3ℓ1 ) Fℓ1 +3ℓ1 +3 , for (NI − 3) ℓ1 1. (74) D. Computational Savings
The sub-block SNP/VQR reduces the computational complexity of the block SNP/VQR in the following ways. 1. For a (NI × NJ × NK ) video sequence, storing the 2NK forward regressor blocks F (k) and L(k) requires O(2NK NI2 NJ2 ) memory size. Since the
32
ASIF
Riccati iteration converges at a geometric rate, storage of all NK blocks is not required. We stop at iteration (p ≪ NK ), reducing the memory requirements by a factor of p/NK to O(2pNI2 NJ2 ). 2. The storage requirements in item 1 are still impractical to be implemented. By imposing an M-block banded structure on the upper triangular blocks F (k) s, only constituent sub-blocks Fℓ(k) within (ℓ2 − ℓ1 ) M are stored. 1 ℓ2
Similarly, for the lower triangular blocks L(k) , only sub-blocks L(k) ℓ1 ℓ2 within (ℓ1 − ℓ2 ) M are stored. The memory requirements are reduced to O(2pMNI NJ2 ) with the M-block banded approximations. 3. The 3D forward regressor sub-blocks Fℓ(k) and L(k) ℓ1 ℓ2 themselves converge 1 ℓ2 along the block diagonals of F (k) and L(k) , respectively. Instead of storing NI such sub-blocks for each F (k) and L(k) , only (m ≪ NI ) blocks are needed. This further reduces the memory requirements to O(2pmMNJ2 ). In other words, the sub-block SNP/VQR requires only 2pmM sub-blocks of dimensions (NJ × NJ ) to be stored. The number of computations in the Riccati step is reduced by a factor of (pmMNI2 /NK ). In our experiments, we use a three-block banded approximation. Respective convergences were observed within p < 15 and m < 20 iterations, implying a computational savings of O(104 ) over the bilateral model [Eq. (37)]. 4. By using the sub-block SNP/VQR [Eqs. (69) and (70)], the number of computations required to generate the 3D error field is significantly reduced from that required in the block SNP/VQR [Eqs. (40) and (41)]. In the (k) (k) reduced model, sub-blocks Lℓ1 ℓ2 and Fℓ1 ℓ2 are of dimensions (NJ × NJ ). For a three-block banded approximation, the total number of floating point operations (flops) required to generate the error field v(i, j, k) with the subblock SNP/VQR is given by 8NI NJ2 for one frame k. The corresponding number of flops in block SNP/VQR is 2NI2 NJ2 . Therefore, the sub-block SNP/VQR reduces the number of flops at the encoder by O(NI ). The computational saving at the decoder is of O(NI2 ), as we show next. Both block and sub-block SNP/VQR require inverting the state matrices to generate the reconstructed frame x(i, j, k) from the error field v(i, j, k). To reconstruct one frame of the video sequence, the number of computations required in the block SNP/VQR is of O(NI3 NJ3 ). In the subblock SNP/VQR, the number of flops required to generate one frame is of O(NI NJ3 ), implying a saving of O(NI2 ). (k) (k) 5. Since computations of the 3D forward regressor blocks Lℓ1 ℓ2 and Fℓ1 ℓ2 are dependent on the values of the field interactions {βh , βv , βt }, therefore the sub-blocks cannot be computed offline. Schweizer and Moura (2000) show
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
33
that the field interactions are bounded by the inequality
1 π π π + |βv | cos + |βt | cos < . |βh | cos NJ + 1 NI + 1 NK + 1 2 (75) It is possible to consider a subset of values for the field interactions and construct a lookup table containing the corresponding 3D forward (k) regressor blocks {L(k) ℓ1 ℓ2 , Fℓ1 ℓ2 }. The same lookup table is also constructed at the decoder. The online computations of the 3D forward regressor blocks are avoided with this approach but at the expense of an additional approximation as the field interactions are quantized to the values included in the lookup table. We next consider a rough comparison of the computational complexity of the sub-block SNP/VQR with the standard codecs such as MPEG4 and H.263. The most time-consuming operation in the standardized codecs is the block-based motion estimation, which has a computational complexity of O(NI NJ (2R + 1)2 ) per frame for an (NI × NJ ) frame with the search range of ±R pixels. The sub-block SNP/VQR involves no motion compensation but uses 3D noncausal prediction [Eqs. (69) and (70)], which takes most (k) (k) of the encoding time. Given that the unilateral representation {Lℓ1 ℓ2 , Fℓ1 ℓ2 } is available from a lookup table, the computational complexity of the noncausal prediction is of O(NI NJ2 ) per frame, which is comparable to the computational complexity of the standard codecs. In the extreme case of an exhaustive search—(R = kNJ )—the computational complexity of the blockbased motion estimation is of O(NI NJ3 ). Compared to the standard codecs using exhaustive search motion estimation, the sub-block SNP/VQR provides an improvement by a factor of NJ . E. Cascaded VQ We briefly discuss VQ (Linde et al., 1980) for the sake of completeness. Each frame k (1 k NK ), of the 3D error field v is partitioned into contiguous, nonoverlapping, square (N × N) sub-blocks. Each sub-block is then encoded with a codebook containing P code elements. The code elements have the same dimensions as the partitioned sub-block. The optimum encoding rule is the nearest neighbor rule, in which the index R is transmitted to the decoder if the corresponding code element in the codebook yields the least distortion. The VQ decoder looks up the code element with index R from a copy of the codebook to form the sub-block of the reconstructed error field. Cascaded VQ is an extension of VQ, which reduces the computational complexity of VQ. The basic idea behind cascaded VQ is to perform VQ in
34
ASIF
stages. The first stage encodes frame k of the 3D error field and generates the difference between the original and encoded fields. The second stage vector quantizes the difference field obtained from the first stage and generates its own difference field if followed by an additional stage. The performance of cascaded VQ depends on the optimality of the codebook used at each stage and the distribution of the number of code vectors between different stages of cascaded VQ. Determination of the optimal distribution of the number of code vectors is a computationally intensive problem (Barnes and Frost, 1993). The computational complexity of such a search exceeds that of an exhaustive search. In the proposed setup, we use three stages in cascaded VQ. To determine a good distribution of code vectors, we ran an experiment for a three-stage, 6-bit vector quantizer and observed that the performance of the “321”-bit is closest to the performance of a singlestage VQ. In the sub-block implementation of SNP/VQR, we use a threestage, 6-bit vector quantizer with a “321”-bit distribution among the three stages. Application of cascaded VQ provides an additional benefit as it provides three different levels of quality of services (QoS) at which the video stream can be decoded. For the best spatial quality video, the outputs of all three stages of cascaded VQ are transmitted to the receiver to reconstruct the 3D error field v(k) . For intermediate quality, the outputs of the first two stages are used by the receiver to reconstruct v(k) . Finally, for the lowest spatial quality, the output of only the first stage is transmitted to the receiver. F. Video Compression Experiments In the following experiments, we show (1) that reasonably good quality is obtained at low bit rates using SNP/VQR and (2) that these results are superior, in terms of visual quality, to those obtained at similar bit rates using the ITU standard H.263 and the ISO standard MPEG4. The H.263 and MPEG4 encoders have many optional features, and their performances vary from one implementation to another depending on how many of the available features are selected. For H.263, we use the baseline implementation available at http://www.angelfire.com/in/H261/h263capture.html (Tanakitparpa, 1999) that incorporates half-pixel motion compensation, 3D variable-length coding of DCT coefficients, and coding of overhead information such as macroblock control data and coded block patterns. Optional features such as unrestricted motion vectors, syntax-based arithmetic coding, and advanced prediction mode are not implemented. The MPEG4 codec is downloaded from http://ffmpeg.sourceforge.net/index.php (Bellard, 2004) and is also a baseline implementation. In addition to the perceived quality, we use PSNR as the
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
35
TABLE 2 C OMPARISON OF PSNR FOR TEST VIDEO SEQUENCES RECONSTRUCTED AT 30 FRAMES PER SECOND WITH MPEG4 AND THE BLOCK SNP/VQR AT DIFFERENT BPS REPRESENTATIONS a Foreman
Highway
Rate (Kbps)
PSNR (dB) MPEG4
160 125 73 50 36
32.61 31.60 28.64 26.76 −
News PSNR (dB)
SNP/VQR
Rate (Kbps)
PSNR (dB)
SNP/VQR
Rate (Kbps)
MPEG4
MPEG4
SNP/VQR
32.75 31.84 30.32 28.98 27.79
148 100 59 27 20
35.94 35.43 33.43 29.81 −
35.95 35.48 34.06 31.66 29.50
191 119 75 44 33
36.21 32.95 30.11 26.14 −
36.25 33.68 32.57 29.68 28.29
a Some PSNR values for MPEG4 are missing because of the inability of the MPEG4 implementation (Bellard, 2004) to achieve the corresponding transmission rates without reducing the frame rate.
quantitative measure of video quality to compare the performance of the three codecs. Figure 7 plots the mean PSNR computed for the “carphone” sequence after its encoding with the H.263, MPEG4, and SNP/VQR codecs as a function of the transmission rate. In SNP/VQR, we vary the total number of bits in cascaded VQ such that the resulting bit rates range from 25 Kbps to 250 Kbps. The codebooks used at different bit rates are universal in the sense that these are generated from a set of training sequences that do not include the four test sequences used in the comparative study. Figure 7 illustrates that SNP/VQR produces sequences with higher PSNR values than H.263 for bit rates below 200 Kbps and MPEG4 for bit rates below 175 Kbps. At 125 Kbps, for example, the video sequence compressed with SNP/VQR has a PSNR value that is roughly 0.5 dB higher than the video sequence obtained at the same bit rate from H.263, while the improvement over MPEG4 is given by 0.25 dB. At a transmission rate of around 50 Kbps, the block SNP/VQR provides improvements of about 1.5 dB over H.263 and 0.75 dB over MPEG4. Table 2 provides additional PSNR comparisons between the block SNP/VQR and MPEG4 for the “foreman,” “highway,” and “news” sequences. Our earlier observations are validated with the results listed in Table 2. A higher PSNR does not necessarily imply a superior quality reconstructed video, because the perceived video quality is highly dependent on the human visual system (HVS). To provide subjective evaluation of the sequences, representative frames are extracted from the sequences compressed with H.263, MPEG4, and SNP/VQR. Figure 8 illustrates the perceived differences between frame 21 of the carphone sequence reconstructed using H.263, MPEG4, and SNP/VQR at three different bit rates. In Figure 8, the frames
36
ASIF
F IGURE 7. Comparison of the mean PSNR for the carphone sequence compressed using H.263, MPEG4, and SNP/VQR. In these results, SNP/VQR is configured at the Gold service with βh = 0.166309, βv = 0.156631, and βt = 0.167446.
compressed with SNP/VQR exhibit better visual quality with more details retained (e.g., the structure of the eyes and eyebrows, and the tower seen through the car’s window). Moreover, there is little blocking visible in the frames compressed with SNP/VQR despite the fact that VQ is prone to introducing blocking at low bit rates. Although not included here for lack of space, similar observations are made for the other three test sequences compressed with H.263, MPEG4, and SNP/VQR. G. Summary Section V illustrated the application of the 3D GMRF in the area of video compression. The resulting video codec is referred to as SNP/VQR, which models the video sequence as a 3D, first-order GMRF and generates an error field v(i, j, k) considerably less correlated than the original video sequence. Cascaded VQ coupled with conditional replenishment is used to compress the error field. SNP/VQR outperforms the standard codecs, including MPEG4 and H.263, at bit rates below 150 Kbps. The next section presents another application of GMRFs in developing fast algorithms for inverting block banded matrices.
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
37
F IGURE 8. Frame 21 extracted from the carphone sequence compressed to different bit rates using H.263, MPEG4, and the block SNP/VQR. Frame (a) is the original frame. The frames in the left column are compressed using H.263, the frames in the middle column are compressed using MPEG4, and the frames in the right column are compressed using the block SNP/VQR. The bit rate for frames (b), (c), and (d) is 39 Kbps (CR = 155), and the bit rate for frames (e), (f), and (g) is 24 Kbps (CR = 253).
VI. I NVERSION A LGORITHMS FOR B LOCK BANDED M ATRICES Block banded matrices and their inverses arise frequently in image and signal processing applications, including covariances of GMRF considered earlier in Section II. PSF to model blur in image restoration problems presented in Section IV, or, with finite difference numerical approximations to partial differential equations. In signal processing, it is often customary to invert block banded matrices. For example, Section IV derived a fast implementation of the RTS smoothing algorithm, which inverts the error
38
ASIF
covariance matrices {Pi+1|i , Pi+1|i+1 }s during each iteration of the forward and backward sweep of the algorithm. This section applies the theory of GMRFs to develop computationally efficient algorithms for inverting positivedefinite and symmetric L-block banded matrices A and their inverses P = A−1 . Several matrix inversion algorithms (Reeves, 2002; Chakraborty, 1998; Ammar and Gragg, 1988; Kavcic and Moura, 2000; Chandrasekaran and Syed, 1998; Dewilde and van der Veen, 2000; Erishman and Tinney, 1975; Duff et al., 1992) have been proposed in the literature for the inversion of (scalar, not block) banded matrices. The novelty of our work is that it applies to block banded matrices P in which the constituent blocks {Pij } are blocks themselves and are not necessarily scalars. There are relatively fewer algorithms for block banded matrices P and most of them (Golub and Loan, 1996; Chun and Kailath, 1991; Kalouptsidis et al., 1984; Kailath et al., 1979; Bini and Meini, 1999; Yagle, 2001; Corral, 2002) impose some kind of structure on P . For example (Kalouptsidis et al., 1984; Kailath et al., 1979; Bini and Meini, 1999; Yagle, 2001) assume P to be Toeplitz. Unlike existing approaches for matrix inversion, the algorithms presented in this section do not impose any additional constraint on the structure of P ; in particular, the algorithms in this section do not require P to be Toeplitz. Exploiting the earlier results for GMRFs, we show that the matrix P , whose inverse A is an L-block banded matrix, is completely defined by the blocks within its Lblock band. In other words, when the block matrix P has an L-block banded inverse, P is highly structured: any block entry outside the L-block diagonals of P can be obtained from the block entries within the L-block diagonals of P . The section proves this fact, at first sight surprising, and derives the following algorithms (Asif and Moura, 2005) for block matrices P whose inverses A are L-block banded: 1. Inversion of P. An inversion algorithm for P that uses only the block entries in the L-block band of P . This is a very efficient inversion algorithm for such P —it is faster than direct inversion by two orders of magnitude of the linear dimension of the blocks used. 2. Inversion of A. A fast inversion algorithm for the L-block banded matrix A that is faster than its direct inversion by up to one order of magnitude of the linear dimension of its constituent blocks. Compared with the scalar banded representations, the block banded implementations of Algorithms 1 and 2 provide computational savings of three orders of magnitude of the dimension of the constituent blocks used to represent A and its inverse P .
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
39
A. Notation Consider a positive-definite symmetric matrix P represented by its (I × I ) constituent blocks P = {Pij }, 1 i, j J . The matrix P is assumed to have an L-block banded inverse A = {Aij }, 1 i, j J , with the following structure:
A=
0
Aij = 0 , |i − j | L 0
(76)
where the square blocks Aij and the zero square blocks 0 are of order I . Equation (76) generalizes our earlier notation used to specify the potential matrix A for 2D and 3D GMRFs. For the 2D, first-order GMRF, the block bandwidth L equals 1 and the potential matrix A has the structure specified in Eq. (21), which implies that Ai−1i = C,
Aii = B,
and Aii+1 = C.
(77)
Ai−1i = A2 ,
Aii = A1 ,
and Aii+1 = A2 ,
(78)
The remaining blocks Aij (|i − j | 2), for the 2D first-order GMRF are all zero blocks 0. Further, the dimensions of Aij in 2D GMRF are (NJ × NJ ), so (I = NJ ). Likewise, comparing the notation used in Eqs. (35)–(38) to represent the potential matrix for the 3D first-order GMRF implies that L = 1 and with the remaining blocks Aij (|i − j | 2), for the 3D first-order GMRF being 0. For 3D GMRF, the dimensions of Aij is (NI NJ × NI NJ ), so (I = NI NJ ) in this case. To generalize our results, we use Eq. (76) to represent A. We further note that the constituent blocks Aij in A (or Pij in P = A−1 ) have dimensions of (I × I ) such that there are J block rows and J block columns in matrix A (or P ). The potential matrices for the 2D and 3D GMRFs are special cases of the generalized notation. To be concise, we borrow the MATLAB notation to refer to an ordered combination of blocks Pij . A principal submatrix of P spanning block rows and columns i through j (1 i j J ) is given by ⎤ ⎡ Pii Pii+1 · Pij ⎢ Pi+1i Pi+1i+1 · Pi+1j ⎥ ⎥. (79) P (i : j, i : j ) ⎢ .. ⎣ ⎦ . · · · Pj i+1 · Pjj Pj i
40
ASIF
The Cholesky factorization of A = U T U results in the Cholesky factor U , that is an upper triangular matrix. To indicate that the matrix U is the upper triangular Cholesky factor of A, we work often with the notation U = chol(A). Lemma 1.1 shows that the Cholesky factor of the L-block banded matrix A has exactly L nonzero block diagonals above the main diagonal. Lemma 1.1. A positive definite and symmetric matrix A = U T U is L-block banded if and only if (iff) the constituent blocks Uij in its upper triangular Cholesky factor U are Uij = 0
for (j − i) > L, 1 i (J − L).
(80)
Since the Cholesky factor U is an upper triangular matrix, its inverse U −1 is also an upper triangular matrix with the following structure: ⎡
U −1
−1 U11 ⎢ 0 ⎢ ⎢ 0 ⎢ =⎢ ⎢ · ⎢ ⎣ 0
0
∗ −1 U22 0 · · ·
∗ ∗ −1 U33 .. .
∗ ∗ ∗ .. .
· ·
0 0
· · · ·
UI−1 −1I −1 0
∗ ∗ ∗
⎤
⎥ ⎥ ⎥ ⎥ ⎥, · ⎥ ⎥ ∗ ⎦ UI−1 I
(81)
where the lower diagonal entries in U −1 are all zero blocks 0. More importantly, the main diagonal entries in U −1 are block inverses of the corresponding blocks in U . These features are used next to derive three important theorems for L-block banded matrices where we show how to obtain: 1. A block entry {Uij } of the Cholesky factor U from a selected number of blocks {Pij } of P without inverting the full matrix P . 2. The block entries {Pij } of P recursively from the blocks {Uij } of U without inverting the complete Cholesky factor U . 3. The block entries {Pij }, |i − j | > L outside the first L diagonals, of P from the blocks within the first L-diagonals of P . Since we operate at a block level, the three theorems offer considerable computational savings over direct computations of U from P and vice versa. (Readers are referred to Asif and Moura (2005) for proofs of the three theorems.)
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
41
B. Theorems Theorem 1. The Cholesky’s blocks {Uii , . . . , Ui,i+L }s2 on block row i of the Cholesky factor U of an L-block banded matrix A = U T U are determined from the principal submatrix P (i : i + L, i : i + L) of P = A−1 by: ∀i, 1 i (J − L): ⎡ Pii Pii+1 ⎢ Pi+1i Pi+1i+1 ⎢ ⎣ · ·
Pi+Li
· · .. .
Pi+Li+1
·
P (i:i+L,i:i+L)
∀i: (J − L + 1) i J : ⎡ Pi,i+1 · Pii ⎢ Pi+1,i Pi+1,i+1 · ⎢ .. ⎣ . · · PJ,i+1 · PJ i P (i:J,i:J )
.. .
UJTJ UJ J = PJ−1 J.
⎤ ⎡ U T ⎤ ⎡ −1 ⎤ Uii Pii+L ii T ⎥ ⎢ 0 ⎥ U Pi+1i+L ⎥ ⎢ i,i+1 ⎥ ⎥⎢ ⎢ . ⎥=⎢ . ⎥ , (82) ⎦ .. ⎦ ⎣ .. ⎦ ⎣ · T Pi+Li+L 0 Ui,i+L ⎤ ⎡ U T ⎤ ⎡ −1 ⎤ PiJ Uii ii T ⎥ ⎢ 0 ⎥ U Pi+1,J ⎥ ⎢ i,i+1 ⎥ ⎥⎢ ⎥ =⎢ .. ⎥ ⎦⎢ ⎣ ... ⎦ , ⎣ · . ⎦ T 0 PJ J Ui,J
(83)
(84)
Theorem 1 shows how the blocks {Uij } of the Cholesky factor U are determined from the blocks {Pij } of the L-banded P . Equations (82) and (83) show that the Cholesky blocks {Uii · · · Ui,i+L } on block row i of U , 1 i (J − L), involve only the (L + 1)2 blocks in the principal submatrix P (i : i + L, i : i + L) that are in the neighborhood of these Cholesky blocks {Uii , . . . , Ui,i+L }. For block rows i > (J − L), the dimensions of the required principal submatrix of P is further reduced to P (i : J, i : J ) as shown by Eq. (83). In other words, all block rows of the Cholesky factor U can be determined independently of each other by selecting the appropriate principal submatrix of P and then applying Eq. (82). For block row i, the required principal submatrix of P spans block rows (and block columns) i through i + L. To solve for the Cholesky blocks, Eq. (82) is not solved directly in the form presented above. An alternative to Eq. (82) can be obtained by right 2 A comma in the subscript helps in differentiating between P ii+2,τ and Pi,i+2τ that in our earlier notation is written as Pii+2τ . We will use a subscript comma only for cases where confusion may arise.
42
ASIF
multiplying both sides in Eq. (82) by Uii and rearranging terms, to get −1
⎡ UT U ⎤ ⎡ Pii ii ii T U ⎢ Pi+1i ⎢ Uii+1 ii ⎥ ⎥=⎢ ⎢ .. ⎦ ⎣ · ⎣ . T P Uii Uii+L i+Li
Pii+1 Pi+1i+1 ·
Pi+Li+1
· · .. . ·
P (i:i+L,i:i+L)
⎤⎡ II Pii+L Pi+1i+L ⎥ ⎢ 0 ⎥⎢ . ⎦ ⎣ .. ·
Pi+Li+L
⎤
⎥ ⎥, ⎦
(85)
0 IO
where II is the identity block of order I . Equation (85) is solved for {UiiT Uii , T U , . . . , UT Uii+1 ii ii+L Uii }. The blocks Uii are obtained by Cholesky factorization of the first term UiiT Uii . For this factorization to be well defined requires that the resulting first term UiiT Uii in Eq. (85) be positive definite. This is easily verified. Since block P (i : i + L, i : i + L) is a principal submatrix of P , its inverse is positive definite. The top left entry corresponding to UiiT Uii on the right-hand side of Eq. (85) is obtained by selecting the first (I × I ) principal submatrix of the inverse of P (i : i + L, i : i + L), which is then positive definite as desired. As a special case, we restate Theorem 1 for matrices with tridiagonal matrix inverses (i.e., for L = 1). Note that this case refers to the potential matrix of the first-order GMRF. Corollary 1.1. The Cholesky blocks {Uii , Uii+1 } of a tridiagonal (L = 1) block banded matrix A = U T U can be computed directly from the main diagonal blocks {Pii } and the first upper diagonal blocks {Pii+1 } of P = A−1 using the following expressions: UJ J = chol PJ−1 (86) J , −1 T )−1 Uii = chol (Pii − Pii+1 Pi+1i+1 Pii+1 for (J − 1) i 1. −1 Uii+1 = −Uii Pii+1 Pi+1i+1 (87) We now proceed with Theorem 2 expresses the blocks Pij of P in terms of the Cholesky blocks Uij . Theorem 2. The upper triangular blocks Pij in P = A−1 , with A being L-block banded, are obtained recursively from the Cholesky blocks {Uii , . . . , Uii+L } of the Cholesky factor U = chol(A) by ∀i, j, 1 i (J − 1), i j (i + L) J : Pij = −
min(J,i+L) ℓ=i+1
Uii−1 Uiℓ Pℓj ,
(88)
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
for the last row.
43
−1 min(J,i+L) T − Piℓ Uii−1 Uiℓ , Pii = UiiT Uii
(89)
PJ J =
(90)
−1 UJTJ UJ J
ℓ=i+1
Theorem 2 states that the blocks Pii · · · Pii+L on block row i and within the first L-block diagonals in P can be evaluated from the corresponding Cholesky blocks Uii · · · Uii+L in U and the L-banded blocks in the lower block rows of P , that is, Pmm · · · Pmm+L , with m > i. To illustrate the recursive nature of the computations, consider computing the diagonal block PJ −3J −3 of a matrix P that, for example, has a two-block banded (L = 2) inverse. This requires computing the following blocks PJ −3J −3
PJ −3J −2 PJ −2J −2
PJ −3J −1 PJ −2J −1 PJ −1J −1
PJ −2J PJ −1J PJ J
in the reverse zigzag order specified below 9
8 7 6 5 3
4 , 2 1
where the number indicates the order in which the blocks are computed. The block PJ J is calculated first, followed by PJ −1J , and so on with the remaining entries until PJ −3J −3 is reached. For matrices with tridiagonal matrix inverses (as for the covariances of the first-order GMRF), Theorem 2 simplifies to the following corollary. Corollary 2.1. The main and the first upper block diagonal entries {Pii , Pii+1 } of P with a tridiagonal (L = 1) block banded inverse A can be evaluated from the Cholesky factors {Uii , Uii+1 } of A from the following expressions −1 , (91) PJ J = UJTJ UJTJ −1 Pii+1 = (−Uii Uii+1 )Pi+1i+1 Pii = (UiiT Uii )−1 + (Uii−1 Uii+1 )Pi+1i+1 (Uii−1 Uii+1 )T for (J − 1) i 1. (92)
44
ASIF
Next, we present Theorem 3, which expresses the block entries outside the first L-diagonals in P in terms of its blocks within the first L-diagonals. Theorem 3. Let A be L-block banded and P = A−1 . Then ∀i, j, 1 i < (J − L), (i + L) < j J : ⎡ ⎤ ⎡ ⎤ · Pi+1i+L −1 Pi+1j Pi+1i+1 .. ⎦ ⎣ .. ⎦ . Pij = [Pii+1 · · · Pii+L ] ⎣ . · · . Pi+Li+1 · Pi+Li+L Pi+Lj (93) This theorem shows that the blocks Pij , |i − j | > L, outside the Lband of P are determined from the blocks Pij , |i − j | L, within its L-band. In other words, the matrix P is completely specified by its first Lblock diagonals. Any blocks outside the L-block diagonals can be evaluated recursively from blocks within the L-block diagonals. In the sequel, we refer to the blocks in the L-block band of P as the significant blocks. The blocks outside the L-block band are referred to as the insignificant blocks. By Theorem 3, the insignificant blocks are determined from the significant blocks of P . To illustrate the recursive order by which the insignificant blocks are evaluated from the significant blocks, consider an example where we compute block P16 in P , which we assume has a three-block banded (L = 3) inverse A = P −1 . First, write P16 as given by Theorem 3 as P16 = [ P12
P13
P22 P14 ] P32 P42
P23 P33 P43
P24 P34 P44
!−1
! P26 P36 . P46
Then, note that all blocks on the right-hand side of the equation are significant blocks (i.e., these lie in the three-block band of P , except P26 , which is an insignificant block). Thus we need to compute P26 first. By application of Theorem 3 again, block P26 can be computed directly from the significant blocks (i.e., from blocks that are all within the three-block band of P ), so that no additional insignificant blocks of P are needed. As a general rule, to compute the block entries outside the L-block band of P , we should first compute the blocks on the (L + 1)th diagonal from the significant blocks, followed by the (L + 2) block diagonal entries, and so on, until all blocks outside the band L have been computed. The following corollary simplifies Theorem 3 for matrices with tridiagonal matrix inverses.
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
45
Corollary 3.1. Given the main and the first upper block diagonal entries {Pii , Pii+1 } of P with a tridiagonal (L = 1) block banded inverse A, any insignificant upper triangular block entry of P can be computed from its significant blocks from the following expression: ∀i, j, 1 i J, (i + 2) j J : $ " j −2 # T −1 Pj −1j . Pℓ+1ℓ+1 Pℓ+1ℓ Pij =
(94)
ℓ=i
In Corollary 3.1, the following notation is used: J #
(Aℓ ) = A1 A2 · · · AJ .
(95)
ℓ=1
Note that in Eq. (94), the block Pij is expressed in terms of blocks on the main diagonal and on the first upper diagonal {Pii , Pii+1 }. Thus, any insignificant block in P is computed directly from the significant blocks {Pii , Pii+1 } without the need for a recursion. C. Inversion Algorithms for Block Banded Matrices In this section, we apply Theorems 1 and 2 to derive computationally efficient algorithms to invert the full symmetric positive-definite matrix P with an L-block banded inverse A and to solve the converse problem of inverting the symmetric positive-definite L-block banded matrix A to obtain its full inverse P . We also include results from simulations that illustrate the computational savings provided by Algorithms 1 and 2 over direct inversion of the matrices. In this section, the matrix P is (N × N), that is, N = I J with blocks Pij of order I . We only count the multiplication operations assuming that inversion or multiplication of generic (I ×I ) matrices requires I 3 floating point operations (flops). 1. Inversion of Full Matrix P with Block Banded Inverse A Algorithm 1 computes the L-block banded inverse A from blocks Pij of P using the following two steps. Since A is symmetric, ATij = Aj i (similarly for Pij ), we only compute the upper triangular blocks of A (or P ). Step 1. Starting with i = J , the Cholesky blocks {Uii , . . . , Uii+L }, are calculated recursively using Theorem 1. The blocks on row i, for example, are T U ,..., calculated using Eq. (85), which computes the terms {UiiT Uii , Uii+1 ii T Uii+L Uii } for (J − L) i 1. The main diagonal Cholesky blocks {Uii } are
46
ASIF
obtained by solving for the Cholesky factors of {UiiT Uii }. The off-diagonal Cholesky blocks Ui,i+l , 1 l L, are evaluated by multiplying the T U calculated in Eq. (85) by the inverse of {U }. corresponding entity Ui,i+l ii ii Step 2. The upper triangular block entries Aij , j > i, in the information matrix A are determined from the following expression: i T ℓ=max(1,j −L) Uℓi Uℓj for (j − i) L, (96) Aij = for (j − i) > L, 0 obtained by expanding A = U T U in terms of the constituent blocks of U . Alternative Implementation. A second implementation of Algorithm 1 that avoids Cholesky factorization is obtained by expressing Eq. (96) as i T T T −1 T ℓ=max(1,j −L) (Uℓi Uii )(Uii Uii ) (Uℓj Uii ) , (j − i) L, Aij = (j − i) > L 0, (97)
T U }, for 1 k L and solving Eq. (85) for the Cholesky products {Uii+k ii and 1 i J , instead of the individual Cholesky blocks. Throughout the chapter, we use this implementation whenever we refer to Algorithm 1. Computations. In Eq. (85), the principal matrix P (i : i + L, i : i + L) is of order LI . Multiplying its inverse with IO as in Eq. (85) is equivalent to selecting its first column. Thus, only L out of L2 block entries of the inverse of P (i : i + L, i : i + L) are needed, reducing by a factor of 1/L the computations to inverting the principal submatrix. The number T U } on row i of U of flops to calculate the Cholesky product terms {Uii+k ii 3 2 3 is therefore (LI ) /L or L I . The total number of flops to compute the Cholesky product terms for N/I rows in U is then
N 2 3 × L I = NL2 I 2 . (98) I In step 2 of Algorithm 1, the number of summation terms in Eq. (97) to compute Aij is L (except for the first few initial rows, i < L). Each term involves 2 block multiplications3 (i.e., 2LI 3 flops are needed for computing Aij ). There are roughly (L + 1)N/I nonzero blocks Aij in the upper half of the L-block banded inverse A, resulting in the following flop count: N No. of flops in step 2 = (L + 1) × 2LI 3 ≈ 2NL2 I 2 . (99) I No. of flops in step 1 =
3 Eq. (97) also inverts once for each block row i the matrix (U T U ). Such inversion 1 i N/I ii ii times requires N I 2 flops, a factor of L2 less than our result in Eq. (98), not affecting the order of the
number of computations.
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
47
The total number of flops to compute A using Algorithm 1 is therefore given by No. of flops in Algorithm 1 = 3NL2 I 2 (100)
or (3L2 I 3 J ), an improvement of O((J /L)2 ) over the direct inversion of P . As an aside, it may be noted that Step 1 of Algorithm 1 computes the Cholesky factors of an L-block banded matrix and can also be used for Cholesky factorization of A. 2. Inversion of L-Block Banded Matrices A
Algorithm 2 calculates P from its L-block banded inverse A from the following two steps. Step 1. Calculate the Cholesky blocks {Uij } from A. These can be evaluated recursively using the following expressions: ∀i, k, 2 i J, 1 k L: " i−1 Uii = chol Aii − Uii+k =
Uii−T
"
UℓiT Uℓi
ℓ=max(1,i−L)
Aii+k −
i−1
∀k, 1 k L:
UℓiT Uℓi+k
ℓ=max(1,i−L)
The boundary condition for the first row i = 1 is U11 = chol(A11 ) and
$ ,
(101) $
.
(102)
−T U11+k = U11 A11+k .
(103) Eqs. (101)–(103) are derived by rearranging terms in Eq. (96). Step 2. Starting with PJ J , the block entries {Pij }, 1 i J , j i, and j J , in P are determined recursively from the Cholesky blocks {Uij } using Theorems 2 and 3. Alternative Implementation. To compute Uii from Eq. (101) demands that the matrix Aii −
i−1
T Uℓi Uℓi
(104)
ℓ=max(1,i−L)
be positive definite. Numerical errors with badly conditioned matrices may cause the factorization of this matrix to fail. The Cholesky factorization can be avoided by noting that Theorem 2 requires only terms (UiiT Uii ) T and (Uii−1 Uii+k ), which in turn use (Uii+m Uii+k ). We can avoid the Cholesky factorization of matrix in Eq. (104) by replacing Step 1 as follows:
48
ASIF
F IGURE 9. Number of flops required to invert a full matrix P with L-block banded inverse using Algorithm 1 for L = 2, 4, 8, and 16. The plots are normalized by the number of flops required to invert P directly.
Step 1. Calculate the product terms
UiiT Uii = Aii −
i−1
ℓ=max(1,i−L)
" T −1 −1 Aii+k − Uii Uii+k = Uii Uii
UℓiT Uℓi ,
i−1
ℓ=max(1,i−L)
UℓiT Uℓi+k
T T Uii+k Uii+m = Uii−1 Uii+k UiiT Uii Uii−1 Uii+m
(105) $
, (106) (107)
for 2 i J , 1 k L, and k m L with boundary −1 −1 TU T condition, U11 11 = A11 , U11 U11+k = A11 A11+k , and (U11+k U11+m ) = −1 T A11+k A11 A11+m . We will use implementation Eqs. (105)–(107) in conjunction with Step 2 for the inversion of L-block banded matrices. Computations. Since the term (UℓiT Uℓi ) is obtained directly by iteration of the previous rows, Eq. (105) in Step 1 of Algorithm 2 only involves additions and does not require multiplications. Eq. (106) requires one (I × I ) matrix multiplication.4 The number of terms on each block row i of U is L, therefore, the number of flops for computing all such terms on block row i is LI 3 . T U Eq. (107) computes (Uii+k ii+m ) and involves two matrix multiplications. 2 There are L /2 such terms in row i, requiring a total of L2 I 3 flops. The 2 4 Eq. (106) inverts matrix U −1 U ii+k for each block row i. For (1 i N/I ), this requires N I ii flops that do not affect the order of the number of computations.
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
49
number of flops required in Step 1 of Algorithm 2 is therefore given by N 3 × LI + L2 I 3 ≈ NL2 I 2 . (108) I Step 2 of Algorithm 2 uses Theorem 2 to compute blocks Pij . Each block typically requires L multiplications of (I × I ) blocks. There are (N/I )2 such blocks in P , giving the following expression for the number of flops: 2 N × LI 3 = N 2 LI . (109) No. of flops in Step 2 = I No. of flops in Step 1 =
Adding the results from Eqs. (108) and (109) gives
No. of flops in Algorithm 2 = L(LI + N)NI
(110)
L(L + J )I 3 J
or flops, an improvement of approximately a factor of O(J /L) over direct inversion of matrix A. D. Simulations Figures 9 and 10 plot the results of Monte Carlo simulations that quantify the savings in floating point operations (flops) resulting from Algorithms 1 and 2 over the direct inversion of the matrices. The plots are normalized by the total number of flops required in the direct inversion; therefore, the region below the ordinate y = 1 in these figures corresponds to the number of computations smaller than the number of computations required by the direct inversion of the matrices. This region represents computational savings of our algorithms over direct inversion. In each case, the dimension I of the constituent blocks {Pij } in P (or of {Aij } in A) is kept constant at I = 5, while the parameter J denoting the number of (I × I ) blocks on the main diagonal in P (or A) is varied from 1 to 50. The maximum dimensions of matrices A and P in the simulation is (250 × 250). Except for the few initial cases where the overhead involved in indexing and identifying constituent blocks exceeds the savings provided by Algorithm 2, both algorithms exhibit considerable savings over direct inversion. For Algorithm 1, the computations can be reduced by a factor of 10–100, whereas for Algorithm 2, the savings can be by a factor of 10. Higher savings will result with larger matrices. E. Summary Section VI exploited the theory of GMRFs to derive inversion algorithms for block banded matrices. Algorithm 1 inverts the full matrix P with a block banded inverse A and provides a computational savings of two orders of
50
ASIF
F IGURE 10. Number of flops required to invert an L-block banded (I J × I J ) matrix A using Algorithm 2 for L = 2, 4, 8, and 16. The plots are normalized by the number of flops required to invert A directly.
magnitude of the linear dimension I of the block. Algorithm 2 solves the converse problem of inverting the block banded matrix A and is faster than its direct inversion by a factor of I . These algorithms have wide application in signal processing, for example, in the Kalman filter or the RTS smoother, where inversion of the covariance matrix is required at each iteration.
VII. C ONCLUSIONS The GMRF framework has been widely used in image and signal processing algorithms. This chapter reviews the theory of GMRFs, including the block banded structure of its potential matrix. Because of their noncausal nature, the GMRFs are bilateral, which precludes their direct application in recursive image processing algorithms such as the Kalman filter. We present two one-sided (unilateral) representations for GMRFs that are optimal and equivalent to the original bilateral model. Further, we highlight the central ideas of GMRFs by providing three examples from image processing. (1) For restoration of blurred images with additive noise, a computationally practical implementation of the RTS smoother is derived. The proposed algorithm models the blurred image as a 2D, finite lattice GMRF and outperforms the Wiener filter and other deterministic techniques. (2) For video compression, a scalable video codec, referred to as SNP/VQR, is proposed that transforms the raw video into a 3D whitened field using the 3D noncausal, GMRFbased prediction. The whitened field is compressed with VQ coupled with conditional replenishment. SNP/VQR provides a higher PSNR and better
APPLICATIONS OF NONCAUSAL GAUSS – MARKOV RANDOM FIELDS
51
subjective quality than the ISO standard MPEG4 and ITU standard H.263. (3) Finally, the theory of GMRFs is exploited to derive computationally efficient algorithms for inverting full matrices with block banded inverses and for the converse problem of inverting block banded matrices. The resulting inversion algorithms provide computational savings of up to two orders of magnitude of the linear dimension of the constituent blocks in the block banded matrices. Currently, the GMRF framework is being used to solve detection and estimation problems associated with synthetic aperture radar imagery, computer vision, tomography, and surface reconstruction.
Direct Electron Detectors for Electron Microscopy

A.R. FARUQI

MRC Laboratory of Molecular Biology, Cambridge CB2 2QH, United Kingdom
I. Introduction
II. Detectors—General Introduction
   A. Efficiency
   B. Spatial Resolution
   C. Uniformity of Response
   D. Size
   E. Dynamic Range
   F. Radiation Damage
III. Film
IV. CCDs
V. Direct Electron Semiconductor Detectors
VI. Monte Carlo Simulations
VII. Hybrid Pixel Detectors, Medipix1, and Medipix2
   A. Experimental Results for Medipix2 in Electron Microscopy
   B. Sensitivity of Medipix2 to Electrons in the 120–300 keV Range
   C. Comparison of Medipix2 with Film
   D. Resolution and Efficiency for 120-keV Electrons
   E. Modulation Transfer Function
   F. Detective Quantum Efficiency
   G. Future Prospects for Medipix2
VIII. MAPS Detectors Based on CMOS
   A. MAPS: Sensitivity and Resolution
   B. Radiation Damage in CMOS-Based Detectors
   C. Outlook for CMOS-Based Detectors
IX. Conclusions
Acknowledgments
References
I. INTRODUCTION

Image recording and storage is one of the most important experimental steps in electron microscopy (EM). Although film has traditionally been used for recording images, recent advances in microelectronics technology are changing the situation, and new types of direct detection sensors with a number of attractive properties are emerging. This review discusses the design principles and potential applications of such direct detection devices, which are based on two types of semiconductor pixel detectors. The two direct detection
technologies—hybrid pixel detectors (HPDs) and monolithic active pixel sensors (MAPSs)—are described in some detail. Both are relatively new developments, based on the latest complementary metal oxide semiconductor (CMOS) design technologies, and both have great potential for use in EM applications. Although such detectors will be generally applicable in all areas of EM, this review emphasizes applications in electron cryomicroscopy (cryo-EM). Biological samples are very sensitive to radiation damage from the high-energy electrons inherent in EM. The damage is suppressed in cryo-EM, while maintaining the samples in an aqueous environment, by embedding them in vitreous ice and cooling them to liquid nitrogen temperatures (Auer, 2000; Henderson, 2004). The three general areas of cryo-EM used most commonly in biological work are single-particle analysis, electron crystallography, and electron tomography (Henderson, 2004). The signal-to-noise ratio (SNR) obtained from imaging a single molecule is too small to be useful in structural analysis, and the signal from a large number of molecules needs to be summed to obtain significant information. In electron crystallography, the signal from the large number of molecules that form the two-dimensional (2D) array is added crystallographically to determine the final signal. In the absence of suitable 2D crystals, it is nevertheless possible to obtain structural information by computer analysis of single-particle images of large molecules and macromolecular complexes (Henderson, 2004). Because biological specimens have only slightly greater density than the surrounding aqueous medium, and radiation damage limits the dose, there is very poor contrast in the images. The SNR can be improved considerably by averaging a large number of images from “identical” particles, which is essential to obtain high-resolution structural information. A large number of images are collected from molecules in random orientations, and averaging is done using specialized software algorithms. The main advantage of single-particle analysis over electron crystallography (or X-ray crystallography) is that molecules do not need to be crystallized before imaging (Frank, 2002). Cryo-EM is particularly suitable for imaging large macromolecular complexes (MW > 250 kD), which may not be available in large quantities or are difficult to crystallize with sufficient order for crystallographic methods. It has been shown theoretically that atomic-resolution structures should be obtainable provided a sufficiently large number of particles can be used for averaging (Henderson, 1995). Structures that have been determined by X-ray crystallography and for which atomic coordinates are known can use cryo-EM as a complementary technique by superimposing the coordinates on the lower-resolution maps (Frank, 2002).
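The gain from averaging can be made concrete with a short simulation: for independent noise, the SNR of an N-image average grows roughly as the square root of N. The sketch below is illustrative only; the one-dimensional "particle", noise level, and image counts are invented for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

# A 1D stand-in for a projection of a "particle": a smooth bump.
x = np.linspace(-1, 1, 200)
signal = np.exp(-x**2 / 0.05)

sigma = 5.0  # heavy noise, mimicking very low contrast at low dose

for n_images in (1, 16, 256):
    # n_images noisy copies of the same (perfectly aligned) particle.
    noisy = signal + sigma * rng.standard_normal((n_images, signal.size))
    avg = noisy.mean(axis=0)
    # Empirical SNR: peak signal over residual noise after averaging.
    snr = signal.max() / (avg - signal).std()
    print(f"{n_images:4d} images: SNR ~ {snr:.1f}")  # grows roughly as sqrt(n)
```

In practice the particles must first be aligned and classified, which is what the specialized software algorithms mentioned above provide.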
Electron tomography is performed by recording a tilt series of micrographs of larger specimens, such as subcellular components or indeed whole cells. The specimen is tilted progressively to obtain the complete series. The data are obtained at a lower resolution (20–50 nm) than for single-particle analysis (Baumeister et al., 1999). Because of the high doses required to acquire a tomography dataset, it is very important to use a high-efficiency detector that records every electron. With the increase in the number of applications of cryo-EM to biological microscopy and the requirements for more extensive data collection, it is very important to replace film with an electronic detector. After a general introduction to detectors in the next section, we discuss, very briefly, the use of charge-coupled devices (CCDs). The remainder of the review deals with various aspects of direct detection techniques, including predicted behavior based on Monte Carlo simulations, and prospects for future improvements.
II. DETECTORS—GENERAL INTRODUCTION

Detectors for electrons (or X-rays) fall into two broad categories. First, analog detectors produce a charge proportional to the input signal and rely on signal integration. Second, digital detectors integrate and amplify the charge from each incident electron and record it as a single count if the pulse height of the signal exceeds a preset threshold discriminator level. Most noise pulses are eliminated in the process, as the signal due to each electron is orders of magnitude greater than thermal noise. As a general rule, analog detectors are capable of recording data at higher fluxes than digital detectors, but digital detectors have a higher SNR. Most of the familiar electron detectors fall in the former category and include film (in terms of optical density measurement), phosphor-coupled CCDs, image plates, and CMOS-based MAPS detectors, whereas only hybrid pixel detectors fall in the latter category. The qualities of the perfect detector for cryo-EM have been discussed in a number of recent publications (Faruqi and Subramaniam, 2000; Faruqi, 2001; Faruqi et al., 2005a). It is useful to review those properties and compare them with what can be achieved with the new direct detectors.

A. Efficiency

Each incident electron should be detected with high efficiency, adding a minimal amount of noise in the detection process, to obtain a high detective
quantum efficiency (DQE). The definition of DQE is (Dainty and Shaw, 1974):

DQE = (S/N)^2_{out} / (S/N)^2_{in},
where S and N refer to the signal and noise in the detector, respectively. Since the detector always introduces some extra noise, in addition to Poissonian noise, during detection and readout, the DQE is always less than 1. The DQE of hybrid pixel detectors is dependent on the threshold settings (see Section VII.F). If the output of the detector is directly proportional to the input energy, this improves the DQE. For CMOS-based detectors, the energy loss follows a Landau-type distribution, where a significant fraction of the events may result in 10–50 times the energy loss of the most probable events (Milazzo et al., 2005).

B. Spatial Resolution

The response of the detector to an electron should ideally be restricted to the single pixel on which the electron was incident, to obtain a narrow point spread function (PSF) and a high modulation transfer function (MTF). A high value of MTF also affects the DQE at Nyquist frequency, which is important for capturing high-resolution features in the images.

C. Uniformity of Response

The response of the detector must be reasonably uniform to be able to detect small variations in contrast. However, even if the intrinsic uniformity is not perfect, it should be possible to correct for any nonuniformities by applying a flat field correction, obtained by exposing the detector to uniform illumination (a short code sketch follows below).

D. Size

The sensitive area of the detector determines how much data can be collected in one exposure. The important parameter related to the overall size is the total number of independent pixels available in the detector. For example, film is readily available in a 10-cm square format and, with 10-µm pixels (or smaller) routinely scanned, provides 10 000 × 10 000 pixels. Some of the most demanding applications in cryo-EM are in recording single-particle images, for which most users would prefer an area of 4000 × 4000 independent pixels (Faruqi et al., 2005a).
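The flat field correction mentioned under Uniformity of Response reduces to a few lines of array arithmetic. This is a generic sketch, not the calibration procedure of any particular camera; the detector size, gain spread, and the synthetic raw and flat exposures are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-pixel gain variations of a 256 x 256 detector.
true_gain = 1.0 + 0.05 * rng.standard_normal((256, 256))

# "flat": an exposure under uniform illumination; "raw": an image of interest.
flat = true_gain * 1000.0                  # uniform dose of 1000 counts/pixel
scene = 500.0 + 200.0 * rng.random((256, 256))
raw = true_gain * scene

# Gain map normalized to unit mean, then divided out of the raw image.
gain = flat / flat.mean()
corrected = raw / gain

print("residual nonuniformity:", np.abs(corrected - scene).max())
```

In a real calibration the flat exposure is itself noisy, so several flats are usually averaged before forming the gain map.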
E. Dynamic Range

A large dynamic range is particularly useful for recording electron diffraction data, which may contain high-intensity spots adjacent to weak ones, in a single exposure (Faruqi et al., 1999). This is also true in materials science, where radiation damage to the specimen is less important. However, for cryo-EM a large dynamic range is not so important.

F. Radiation Damage

A very important requirement, relevant for direct detection semiconductor detectors, is the ability to withstand radiation damage for a reasonably long period (≥ 1 year in normal usage). An estimate, based on 10⁶ images recorded on a detector with a 25-µm square pixel, gives the integrated dose to be ∼ 5 × 10⁷ electrons/pixel, or ∼ 1 Mrad (Faruqi et al., 2006).
III. FILM

From the earliest days of EM, electron-sensitive film has been used extensively for recording micrographs and continues to be used even now in some of the most demanding applications, which require a high MTF and DQE. The main properties of film are reviewed briefly to highlight the main advantages and drawbacks compared with the newer electronic detectors. Film used in EM typically has a sensitive layer ∼ 10 µm thick, consisting of gelatin containing fine grains of silver bromide. When a grain is struck by incident electrons it can be activated, which results in its conversion into silver metal during the development process. A single incident electron can activate a number of grains along its trajectory and, within a limited range, the resulting optical density of the developed film is directly proportional to the incident electron dose (Zeitler, 1992). Unfortunately, some grains that have not been exposed to incident electrons also are converted into silver (due to cosmic rays). The resulting fog makes it impossible to distinguish single-electron events, as discussed in greater detail in Section VII.C. To summarize, the main strengths of film are:

1. Excellent spatial resolution (Henderson et al., 2006); this property is needed particularly for digitizing high-resolution images.
2. Large area coverage; a consequence of excellent spatial resolution is that a 10-cm square film can provide a medium for storing more than 100 million 10-µm pixels in a single image.
3. Data archiving is simple, as the medium (film) is the archive. This ensures that the archiving is independent of data formats and computer operating systems. The lifetime of electronic archive media is uncertain, whereas data stored on film, on past experience, should last at least several decades.

Despite the many advantages of film, there is considerable room for improvement in film detector technology, as described below.

1. Data on film are only available to the user after several laborious intermediate steps between exposing the film and accessing the data. These steps consist of developing the film, followed by scanning to digitize the optical density values into a computer in a suitable image format. Conversely, all electronic detectors are capable of rapid readout of an image into a computer, giving online access. Suitable software is used for processing, displaying, and archiving the data. Ready access to the data is a valuable tool for the user in making decisions about the quality of the specimen under study, which could lead to possible time savings and improved data.
2. The dynamic range of electronic detectors is considerably greater than for film; the dynamic range of CCD-based detectors is about two orders of magnitude greater than film and is virtually infinite for hybrid pixel detectors.
3. The inherent fog level affects the SNR in film. Very low-level exposures consisting of fewer than 5 electrons/pixel cannot be recorded above the noise level, as discussed in Section VII.C (Faruqi et al., 2005b).
4. Some applications in EM (e.g., electron tomography) require rapid feedback from the detector, which is impossible to implement with film.
IV. CCDS

Most of the earlier electronic detectors were based on phosphor-coupled fibreoptic CCDs (Faruqi and Subramaniam, 2000). The CCDs used in these detectors were developed originally for imaging faint astronomical objects, with very low readout noise but with slow readout. The dark current can be reduced by cooling to a sufficiently low temperature, as for any semiconductor detector. With low dark current and, more importantly, lower dark current noise, fairly long (>10 seconds) exposures are feasible. CCDs are semiconductor pixel detectors, not unlike the CMOS-based detectors discussed later. A brief explanation is needed of why CCDs are not used in a direct detection mode in EM (Faruqi and Subramaniam, 2000). The main reason is that CCDs are very susceptible to radiation damage (Roberts et al., 1982). CCDs have a wide range of sensitivity to electromagnetic radiation and charged particles. X-rays or electrons, in addition to light, can be detected
easily. First, the energy deposited by an incident 100-keV electron is sufficient to produce a reasonably high signal, but the signal charge is high enough to substantially fill a pixel well, thus reducing the dynamic range. Second, radiation damage to the front surface, which contains the polysilicon gates used for applying voltages needed for the readout process, is a serious problem. A practical solution, which has been adopted generally, is to use a phosphor-coated fibreoptic assembly as the first element in the detector (Faruqi and Andrews, 1997). Phosphors emit visible light photons, which are then imaged onto the CCD through fibreoptics (or sometimes lenses). Passage of light through a number of optical interfaces, however, causes light scattering, which leads to a reduction in resolution. A partial solution is to bin pixels, 2 × 2 or 4 × 4, but that reduces the number of available pixels for recording the image (a short code sketch of binning appears at the end of this section). Where the residual readout noise is less than the shot noise in the data, it can be ignored. The main application of phosphor/fibreoptic-coupled CCDs has been in situations where the somewhat lower resolution (compared with film) is acceptable (e.g., in recording electron diffraction data, in focusing and alignment of the beam, or when the advantage of immediate access to the stack of aligned images outweighs any disadvantage, such as with electron tomography). One notably successful use of CCD detectors in high-resolution biological EM has been in the acquisition of electron diffraction patterns from 2D crystals of proteins (Faruqi et al., 1999; Downing and Hendrickson, 1999). One of the key advantages of CCDs over film in this case is that the dynamic range is about two orders of magnitude greater than film, allowing diffraction spots with wide variations in intensity to be recorded in a single exposure. The SNRs of weaker spots are also higher than for film, leading to more accurate structures. Recent improvements in the technology of CCD-based detectors have produced CCDs with 4k × 4k pixels. There have been some attempts to use the new detectors for single-particle imaging (Sander et al., 2005; Booth et al., 2004). The main conclusions are that the CCDs are excellent for acquiring lower-resolution images, which lead to structures at > 9 Å, but film is still the detector of choice for higher-resolution data.
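Pixel binning, mentioned above as a partial remedy for light spread, is simple to express in code. A minimal sketch, assuming an image whose sides divide evenly by the bin factor:

```python
import numpy as np

def bin_image(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Sum `factor` x `factor` blocks of pixels (2 x 2 or 4 x 4 binning)."""
    h, w = img.shape
    assert h % factor == 0 and w % factor == 0, "sides must divide evenly"
    return (img.reshape(h // factor, factor, w // factor, factor)
               .sum(axis=(1, 3)))

img = np.arange(16, dtype=float).reshape(4, 4)
print(bin_image(img, 2))  # 2 x 2 output; each entry sums a 2 x 2 block
```

Summing (rather than averaging) preserves the integrated counts; the trade is a factor of `factor` in sampling for improved per-pixel signal.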
V. DIRECT ELECTRON SEMICONDUCTOR DETECTORS

The electrical properties of crystalline silicon, which is a semiconductor, make it eminently suitable for use as a radiation detector, provided it can be doped in a prescribed manner. According to the band theory of solids, electrons occupy two distinct energy bands, the valence and conduction bands, which are separated by a forbidden region. Electrons can be excited into the conduction band, leaving holes behind (Faruqi and Subramaniam, 2000).
Electrons in the conduction band and corresponding holes in the valence band acquire mobility under the influence of an external voltage and can move to an external amplifier. Energy deposited in silicon is converted mainly into electron–hole pairs; this conversion forms the basis for radiation detection. The signal generated depends on the band gap in silicon (1.12 eV) and the energy required to produce an electron–hole pair, 3.55 eV. The two types of direct detection detectors, which are the main part of this review, are HPDs (Campbell et al., 1998; Hall, 1995; Krüger, 2005; Llopart et al., 2002; Llopart and Campbell, 2003; and http://www.cern.ch/MEDIPIX) and MAPS (Prydderch et al., 2003; Milazzo et al., 2005; Deptuch, 2005). Although operating under different principles, the two types of detectors nevertheless share a number of common properties. Both detectors have been made possible by the high density of microelectronics device integration available in CMOS technology, which allows a large number of features to be packed into a small space and thus permits small pixel sizes. The other common feature is the fact that the incident electron deposits energy in silicon, resulting in a number of electron–hole pairs, which constitutes the signal. Despite the similarities between HPDs and MAPS, there are a number of significant differences in design and, more importantly, in performance. For HPDs, the process of electron detection and subsequent readout of the result are separated into two layers of silicon. A detector layer, which is typically silicon but could be a higher-density material, is 300 µm thick in the detectors under review; it is fully depleted by a suitable voltage applied across the detector layer. The holes (it is preferable to detect holes rather than electrons in silicon), created by the incident electron, are forced to drift toward the bump bond to the readout electronics for subsequent readout. In MAPS, as the term monolithic in the name implies, the detection and readout are integrated into the same layer of silicon. There is a relatively thin, sensitive layer (4–20 µm) that acts as the detector and has a number of collection diodes built into each pixel. The charge from the primary event, collected in the diodes, constitutes the signal.
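The signal size follows directly from the pair-creation energy quoted above. A back-of-envelope sketch (the deposited energies for the thin-epilayer case anticipate Section VIII and are illustrative):

```python
E_PAIR_EV = 3.55  # energy per electron-hole pair in silicon, as quoted above

def pairs(deposited_kev: float) -> int:
    """Electron-hole pairs created by an electron depositing `deposited_kev`."""
    return round(deposited_kev * 1000.0 / E_PAIR_EV)

# 120-keV electron stopped in a thick (300-um) hybrid detector layer:
print(pairs(120.0))            # ~33 800, consistent with the ~33 000 quoted later

# Fast electron crossing a thin (~4-um) MAPS epilayer, depositing 1-2 keV:
print(pairs(1.0), pairs(2.0))  # ~280-560 pairs, as in Section VIII
```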
VI. MONTE CARLO SIMULATIONS

Monte Carlo simulations have proven very useful in predicting the behavior of electrons in phosphor-coupled CCDs (Joy, 1995; Faruqi and Andrews, 1997; Meyer and Kirkland, 1998) and semiconductor detectors (Faruqi et al., 2003; McMullan et al., 2006). The predicted behavior of electrons in silicon is a very useful tool for optimizing the design and performance of detectors. The most basic Monte Carlo simulations plot the trajectory of a large number of electrons as they travel through silicon. The next level of
sophistication in the simulations is estimating the energy loss suffered by the electron during its progression through the detector. The latter method is used for predicting the amount of energy deposited in a given pixel, in the adjacent pixels, and as energy lost by backscattering. Some assumptions are required to enable Monte Carlo simulations to be performed using reasonable computer time. Electron interactions in the detector material are assumed to be one of two types: (1) elastic scattering, which changes the direction of the electron without loss of energy, and (2) inelastic scattering, which results in a change of direction with some loss of energy (Joy, 1995). Only the inelastic interactions lead to signal generation in the detector. Although the detector is pixellated, electrons scattered away from the pixel where the electron was incident (which we refer to as the seed pixel) can cross the pixel boundaries to be recorded in an adjacent pixel, or indeed further away, provided they have sufficient energy. This phenomenon leads to signal spreading, similar to signal spreading in phosphor-coupled CCD detectors where light scattering is responsible for the effect. Unlike CCDs, hybrid detectors are able to apply a lower energy threshold on each pixel (there is also a high-level threshold that is not useful when imaging monochromatic electrons) to reduce the impact of signal spreading. A key feature of the hybrid detectors—low or no noise—results from the large amount of energy deposited by the incident electrons compared to noise levels. As described in more detail later, it is very important to set the thresholds on each pixel to reduce signal spreading, i.e., optimizing the detector for resolution (high MTF) but without setting the threshold too high, which would lead to sacrificing efficiency (DQE). Figure 1 shows three examples of electron trajectories in silicon, computed for three different energies, 120, 200, and 300 keV (and one in cadmium telluride [CdTe] at 300 keV). The trajectories clearly show how the spread of charge (and consequently deposited energy) increases with the higher energy of the incident electron. The electron tracks shown in Figure 1 do not provide a quantitative measure of the energy deposited at different sections of the electron track, which is required to predict the degree of charge sharing expected. The Monte Carlo program (Joy, 1995) has recently been extended to include the amount of energy deposited along the track (McMullan et al., 2006). This information is used to estimate the distribution of energy deposited in the primary and adjacent pixels and is used for predicting DQE. A significant number of electrons are backscattered in silicon and exit from the entrance side. The backscattering process is particularly significant for MAPS detectors, where the initial signal is relatively small. A significant energy loss due to backscattered electrons may occur in pixels distant from the entry pixel, resulting in a large signal, which causes a reduction in the MTF and DQE (McMullan et al., 2006).
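The flavor of such a simulation can be conveyed with a deliberately simplified sketch. The step lengths, energy-loss model, and scattering angles below are invented placeholders; the real simulations (Joy, 1995; McMullan et al., 2006) use physically derived elastic and inelastic cross sections. The bookkeeping, however, is the same in spirit: a random walk whose inelastic losses are binned into 55-µm pixels.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)
PIXEL_UM = 55.0  # Medipix2 pixel pitch

def simulate_electron(e0_kev: float = 120.0):
    """Toy 2D random walk: returns energy (keV) deposited per pixel index."""
    x, z = 0.0, 0.0            # lateral position and depth (um)
    theta = 0.0                # direction relative to the beam axis
    energy = e0_kev
    deposit = defaultdict(float)
    while energy > 1.0 and 0.0 <= z < 300.0:   # 300-um detector layer
        step = rng.exponential(5.0)            # placeholder mean free path (um)
        x += step * np.sin(theta)
        z += step * np.cos(theta)
        # Inelastic event: placeholder energy loss, deposited locally.
        de = min(energy, rng.exponential(0.5 * energy * step / 100.0))
        deposit[int(np.floor(x / PIXEL_UM + 0.5))] += de
        energy -= de
        # Elastic event: direction change, stronger for slower electrons.
        theta += rng.normal(0.0, 0.3 * np.sqrt(120.0 / max(energy, 1.0)))
    return deposit

hits = [simulate_electron() for _ in range(1000)]
seed_fraction = np.mean(
    [d.get(0, 0.0) / max(sum(d.values()), 1e-9) for d in hits])
print(f"mean fraction of energy in the seed pixel: {seed_fraction:.2f}")
```

Exiting the loop with z < 0 corresponds to a backscattered electron, and z > 300 to transmission through the layer, the two loss channels discussed above.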
FIGURE 1. Monte Carlo simulation of electron trajectories in silicon at 120 (a), 200 (b), and 300 keV (c) and in CdTe (d) at 300 keV. The faint vertical lines are drawn with a spacing of 55 µm, the pixel size. All trajectories start at the center of a pixel. Note the extent of the electron tracks at 300 keV in silicon, with a significant fraction transmitted through the 300-µm detector layer. The extent of charge spreading is reduced considerably in CdTe, at the same energy, due to the higher atomic number and density compared with silicon. (From McMullan et al., 2006.)
VII. HYBRID PIXEL DETECTORS, MEDIPIX1, AND MEDIPIX2

Medipix1 and Medipix2 are HPDs containing a pixellated detector bump-bonded (also called flip-chip bonded) to a readout electronics chip with the same pixellation. Consequently, each pixel on the detector chip is read out by its individual readout pixel electronics (http://www.cern.ch/MEDIPIX). The Medipix series of HPDs evolved from a large research and development program for new detectors at the Large Hadron Collider, under construction at CERN, the particle physics laboratory in Geneva. The main properties of HPDs that make them attractive for particle physicists are that they allow precise tracking of charged particles with excellent timing resolution (Krüger, 2005). The development of Medipix has been driven largely by the need to develop X-ray photon-counting detectors with high efficiency for medical applications. It is not possible, however, to cover X-ray applications in this review, and we therefore restrict our discussion to the application with which we have been closely involved: EM with electrons in the energy range 100–300 keV. The conversion of the primary electron into a detectable signal occurs in the detector layer, which is made from high-resistivity silicon. The detector
FIGURE 2. Schematic diagram of a single pixel in a hybrid pixel detector. The top detector layer is bump-bonded to the readout chip. Electron–hole pairs created in the silicon detector layer, which is fully depleted, are separated under the influence of an externally applied field. Holes are transferred through the bump bond to the pixel readout electronics, situated in the readout chip. (Adapted from http://www.cern.ch/MEDIPIX.)
is fabricated out of p-i-n silicon, with the pixels patterned on the p side; a bias voltage is applied across the detector, resulting in complete depletion in the sensitive volume. Electron–hole pairs created by the primary electron are free to drift in the potential created by the applied voltage—for silicon, holes are collected as signal. An electron with 120-keV energy creates ∼ 33 000 electron–hole pairs in the detector layer. Because of the long lifetime of holes in silicon, a large fraction are collected at the detector node. As shown later, the pixel amplifier readout noise is typically 100e− rms, leading to an excellent SNR of > 300, which enables noise-free acquisition of images. Figure 2 shows the schematic diagram of a single pixel with the detector and electronics readout part. The detector has a thin metallic layer for electrical contact used to apply the bias voltage. Typically, the detector layer is 300 µm thick and requires 50–100 V for complete depletion in silicon. The signal charge (holes in the present case) is transferred across the bump bond into the pixel electronics, shown in schematic form in Figure 3. The process of bump-bonding is a complicated technical procedure that requires specialized skills and equipment. Examples of solder bumps on the readout chip before bonding to the sensor are shown at two magnifications in Figures 4(a) and 4(b). The pitch of the solder bumps is the same as the pitch of the pixels, that is, 55 µm, and each bump occupies a ∼ 20-µm diameter of space within the pixel. The sensor
FIGURE 3. Schematic diagram of the pixel readout electronics. The signal charge, generated by the incident electron, is converted into a voltage in the preamplifier. The analog output of the preamplifier is compared with lower and upper preset threshold voltages in the two discriminators. If the pulse height satisfies both criteria, a digital count is incremented in the shift register. (From http://www.cern.ch/MEDIPIX.)
layer must be metallized before bonding; this is shown in Figure 4(c). The process is completed when the two sides are bonded together and an electrical connection is made through all the bumps from the sensor to the readout pixel electronics (R. LaBennett, RTI International, NC, USA). The pixel readout electronics can be divided into a front-end analog part and a rear-end digital part (Llopart and Campbell, 2003). The signal from the incident electron is amplified in the input amplifier and compared against two preset voltage levels: a lower-level threshold that is used to eliminate noise (and is also used to optimize the detector performance at different energies) and an upper-level threshold, which is not needed for monochromatic electrons. Accepted signal pulses are counted in a 13-bit shift register, which is read out sequentially with other pixels at the conclusion of the image acquisition. One of the first tests on Medipix1 (64 × 64 170-µm pixels) was to establish whether the detector behaved like a Poissonian electron counter (Faruqi et al., 2003). Small nonuniformities between pixels arising from variations in semiconductor properties were corrected by flood field illumination. The detector was uniformly illuminated with electrons with gradually increased intensity. The measured and expected counts (on the basis of Poissonian statistics) confirm that this is indeed true—the noise in the measurements
FIGURE 4. Micrographs of solder bumps on the Medipix2 readout and sensor chips before bonding. The scale marks are given on the individual micrographs. (Courtesy Richard LaBennett, RTI International, NC, USA.) (a) Bumps on the readout chip at low magnification. (b) Bumps on the readout chip at higher magnification. (c) Bumps on the sensor side.
TABLE 1
COMPARISON OF THE BASIC PROPERTIES OF MEDIPIX1 AND MEDIPIX2

Medipix1                                          Medipix2
Pixel size: 170 µm × 170 µm                       55 µm × 55 µm
Number of pixels: 64 × 64                         256 × 256
Pixel amplifier: only positive input              Positive or negative input
Column-wise leakage current compensation          Pixel-wise current compensation (i.e., better)
Single discriminator (lower level)                Two discriminators (lower and upper level)
Maximum counting rate: 1 MHz/pixel                1 MHz/pixel
Not buttable                                      Three-side buttable (to form a 2 × 2 quad chip)
1-µm SACMOS technology                            0.25-µm CMOS technology
Parallel I/O                                      Serial/parallel I/O (faster readout with parallel)
1.6 M transistors/chip                            33 M transistors/chip
arises purely from counting statistics, without any added noise from the detector. Based on the extensive experience of many Medipix Collaboration members with Medipix1 (including its various shortcomings for use in EM (Faruqi et al., 2003)), several design improvements were made in the second detector designed within the collaboration: Medipix2 (Llopart et al., 2002; Llopart and Campbell, 2003). Because microelectronics technology had progressed rapidly in the period since Medipix1 was designed, it was possible to use a much smaller linewidth in the electronics layout for the Medipix2 readout chip, that is, 0.25 µm. The technology used was the more common CMOS technology instead of the nonstandard 1-µm Self Aligned Contact CMOS (SACMOS) used in the design of Medipix1. Because of the higher-density layout it was possible to pack more functionality into the pixel electronics in a much smaller pixel size (55 µm compared with 170 µm). The overall size of the chip was increased to ∼ 2 cm², containing 256 × 256 pixels. The design also allowed for three-side butting of the basic chip to enable construction of a 2 × 2 tiled array detector (the Quad) with 512 × 512 pixels. The pixel electronics were modified to be linear up to 80 000e−, which covers much of the energy range of interest in EM. A summary of the main properties of Medipix1 and Medipix2, adapted from Mikulec et al. (2003), is shown in Table 1. The input (pixel) amplifiers were designed to accept both electrons (negative input) and holes (positive input). Holes are generally collected in silicon, but it is better to collect electrons in CdTe; the latter has a much higher stopping power than silicon and is potentially very useful in detecting 300-keV electrons, as discussed earlier.
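The per-pixel counting chain just described (preamplifier, window discriminator, 13-bit counter) can be caricatured in a few lines. The pulse-height distributions and discriminator settings below are hypothetical, and the real chip implements this in analog and digital hardware rather than software:

```python
import numpy as np

rng = np.random.default_rng(4)

LOWER_KEV, UPPER_KEV = 8.0, 200.0   # hypothetical discriminator window
MAX_COUNT = 2**13 - 1               # 13-bit counter per pixel

def count_events(pulse_heights_kev: np.ndarray) -> int:
    """Increment the pixel counter once per pulse inside the window."""
    accepted = (pulse_heights_kev > LOWER_KEV) & (pulse_heights_kev < UPPER_KEV)
    return min(int(accepted.sum()), MAX_COUNT)

# Hypothetical exposure: 120-keV hits sharing variable energy with neighbors,
# plus electronic noise pulses of a few keV, mostly rejected by the threshold.
signal = 120.0 * rng.uniform(0.2, 1.0, size=500)
noise = rng.exponential(1.5, size=10_000)
print(count_events(np.concatenate([signal, noise])))
```

Raising LOWER_KEV trades counting efficiency against charge-sharing double counts, which is exactly the threshold optimization discussed in Sections VII.D–VII.F.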
A. Experimental Results for Medipix2 in Electron Microscopy

The Medipix2 single and quad chips have been tested in CM12 120-kV and FEI F30 300-kV electron microscopes at the MRC Laboratory in Cambridge. The main problem, as with all detectors for use in the electron microscope, is that, due to the high probability of electron scattering and absorption, detectors cannot have a window and need to be installed within the high vacuum of the microscope. All electrical connections are made by using
FIGURE 5. (a) Photograph of the single-chip Medipix2 mounting in the CM12 120-kV electron microscope. The detector was mounted in the film plane, with the film mechanism disabled, for these tests. The readout cabling is routed via the vacuum-compatible electrical feedthroughs fixed on the left port (the glass window normally used in the port was replaced by a steel plate). (From Faruqi et al., 2005b.) (b) Photograph of the quad detector in a specially constructed mounting before installation in the Tecnai 300-kV electron microscope. The quad peripheral electronics is protected by a metal shield, with a square hole, located above the detector. The thick metal base plate forms the bottom of the vacuum vessel. (From McMullan et al., 2006.)
vacuum-compatible electrical feedthroughs, which connect the Medipix2 chip (or quad) to the readout electronics located outside the camera vacuum. Peripheral electronic components immediately surrounding the detector are shielded with lead to prevent unnecessary damage from the primary beam. Two photographs illustrate the mounting process. Figure 5(a) shows a single chip mounted under the film plane in a CM12 microscope, and Figure 5(b) shows a quad fixed to the mounting plate, which is subsequently bolted on under the F30. Figure 6 shows the completed system, with the detector attached to the F30. The readout of the chip is made through a specially designed interface module, Muros2, using the readout software Medisoft, controlled from a PC running the National Instruments LabWindows software (Fornaini et al., 2003; San Secundo Bello et al., 2003; Conti et al., 2003).

B. Sensitivity of Medipix2 to Electrons in the 120–300 keV Range

A general overview of the Medipix2 performance over the range of energies of interest is shown in Figure 7 (McMullan et al., 2006). A standard 300-mesh EM grid was imaged at different electron energies so that the grid bars had a spacing of 665 µm in both directions. The images in Figure 7, recorded at 120, 150, 180, 200, 250, and 300 keV, show a gradual increase in
FIGURE 6. Photograph of the Medipix Quad mounted below the Tecnai F30 microscope.
blurring, or degradation in resolution, as the energy is increased to 300 keV. If the image at the highest energy, 300 keV, is compared with the image at the lowest energy, 120 keV, the difference in resolution is quite striking. A separate observation was that there was evidence of radiation damage to the readout chip at 300 keV but not at lower energies. Both results can be explained on the basis of Monte Carlo simulations for 300-keV electrons in 300-µm silicon (see Figure 1). The explanation for the worsening resolution at higher energies is that the range of 300-keV electrons is such that there is considerable sideways spread into adjacent pixels. The radiation damage to
FIGURE 7. A montage of 300-mesh grid images recorded at the following energies: 120, 150, 180, 200, 250, and 300 keV. The images at the higher energies appear much more blurred compared with the image at 120 keV (see text for greater detail). (From McMullan et al., 2006.)
the readout chip at 300 keV can be explained by the longer range of the electrons. A significant fraction (∼10%) of the primary electrons traverse the detector layer and enter the readout chip, depositing energy and causing the radiation damage. The radiation damage problem and its solution are explored in more detail in Section VIII.B. Taking the two factors into account—poor resolution and some radiation damage at 300 keV—the detailed tests on Medipix2 were restricted to 120 keV for the present. Tests at higher energies will be continued in the future, when detectors equipped with CdTe, which has a much higher stopping power compared with silicon, become available. The newer designs of the readout chip already incorporate rad-hard design principles, which should reduce radiation damage problems.

C. Comparison of Medipix2 with Film

Several different methods can be used to measure the sensitivity of Medipix2 to 120-keV electrons. The method for measuring DQE is presented later in Section VII.F, but a direct measure of Medipix2 sensitivity to 120-keV electrons can be established by comparing Medipix2 data with the optical density on film of images acquired under near-identical conditions. Film calibration, in terms of electron dose per optical density unit, is known fairly accurately provided the development is carried out under carefully controlled
FIGURE 8. Comparison of Medipix2 sensitivity with film. (a) A raster of spots generated by the spotscan program containing an average of 111 electrons/spot (with the standard deviation, σ = 11). The equivalent exposure on film gave a similar value for the number of electrons/spot (116) but with a higher standard deviation (σ = 24). (b) A similar raster to (a) but with fewer electrons/spot (mean = 4.7, σ = 1.8); the equivalent exposure on film was not measurable due to the noise levels. (From Faruqi et al., 2005b.)
conditions. The comparative tests used a spotscan routine (Faruqi et al., 1999), which generates a raster of spots with a preset intensity; the rasters were recorded both on film and on Medipix2. The intensity in a spot (i.e., the number of electrons in the spot) can be varied either by changing the dwell time per spot in the software or by altering the electron beam intensity in the microscope. Two examples of rasters, recorded on Medipix2 at two different intensities, are shown in Figures 8(a) and 8(b). In Figure 8(a), the mean intensity measured by Medipix2 was 111 with a standard deviation of 11, which statistically suggests a Poissonian distribution—as expected from a detector that does not add noise to the signal (Faruqi et al., 2005b). The results for film, using the OD calibration mentioned earlier, gave a slightly higher measured value for the number of electrons in the raster (i.e., 116) but with a considerably greater standard deviation (24) than would be expected from purely Poissonian statistics. The implication is that the extra readout noise added by film increases the noise values of the measurements. Repeating the same experiment at considerably reduced intensity, shown in Figure 8(b), with the intensity per raster spot now reduced to a mean of 4.8 electrons, the standard deviation is now 1.8—still Poissonian. However, due to the high intrinsic noise (∼ 5 electrons) the raster spots were not visible on film and could not be measured. This simple experiment demonstrates that Medipix2 can be used as an electron counter with high efficiency, obeying Poissonian statistics.

D. Resolution and Efficiency for 120-keV Electrons

Unusually for an imaging detector, it is possible to set thresholds on each pixel of Medipix2. The equivalent energies for the voltage settings range from the region close to the onset of noise (i.e., ∼ 4 keV) up to a value higher than 120 keV. Intuitively, it is apparent that very low threshold settings would result in a high detection efficiency but lower resolution due to the effects of signal sharing, whereas very high thresholds would reduce the detection efficiency but reduce signal sharing and improve resolution. A theoretical (simulations) and experimental study has been performed (McMullan et al., 2006) to determine the optimum parameters for the threshold settings.

E. Modulation Transfer Function

The experimental value for resolution was measured, using the well-established knife-edge method (Fujita et al., 1992), at 40, 80, and 120 keV; results are shown in Figure 9 (McMullan et al., 2006). The resolution given by the modulation transfer function at Nyquist frequency [MTF(Nyquist)] is plotted
FIGURE 9. Plot of modulation transfer function (MTF) at Nyquist frequency for Medipix2 at 40, 80, and 120 keV. The experimental points are shown as symbols and the theoretical results, obtained by Monte Carlo simulations, are shown as lines. The maximum value of MTF at Nyquist frequency, 2/π, is marked by a horizontal dashed line. MTF values higher than the theoretical maximum arise due to a pixel shrinkage effect, discussed in the text. (From McMullan et al., 2006.)
as a function of the threshold settings. The experimental values, indicated by different symbols for different energies, are in excellent agreement with the theoretical predictions, which are shown as continuous curves. An interesting point in Figure 9 is that the MTF(Nyquist) exceeds the theoretical maximum (2/π) at higher thresholds. The physical explanation for this effect (Tlustos et al., 2006) is that the pixel size effectively “shrinks” at higher thresholds, with only the central core able to record incident electrons. When electrons are incident on the outer parts of the pixel, part of their energy is deposited in adjacent pixels, so insufficient energy remains in the seed pixel to be above the higher threshold.

F. Detective Quantum Efficiency

All detectors add some noise during the measurement and readout process. The DQE of a detector is a measure of the efficiency and the amount of noise added by the detector and can thus be used as a figure of merit for the detector. The DQE of a detector is defined as:

DQE = (S/N)^2_{output} / (S/N)^2_{input},    (1)
FIGURE 10. Plot showing both theoretical and experimental values of the number of counting pixels per incident electron, with 120-keV energy, as a function of threshold values. The total number of incident electrons was set to 2100 for the simulations. The key to the different distributions is given in the inset: a single count per incident electron is denoted by the dots and labeled (a), 2 counts per electron are denoted by dashes and labeled (b), and so on. At extremely low thresholds, most incident electrons produce several counts and very few produce just one count. As the threshold is increased, the 4 or 5 counts per electron region (labeled (d) and (e)) is eliminated, but there are still 2 or 3 counts. Once the threshold crosses the halfway point (i.e., half of the incident energy, equal to 60 keV), only single counts are obtained. The integrated counts expected per electron are plotted as the envelope of the various curves, marked (a) to (e). The experimentally obtained points, shown as solid circles, are in excellent agreement with the predictions. (From McMullan et al., 2006.)
where S and N refer to the signal and noise, respectively. Since the value of S/N at the output is always smaller than S/N at the input, the DQE is always less than 1. A novel method for obtaining the DQE at 120 keV has been described recently (McMullan et al., 2006; Zweig, 1965; Rabbani et al., 1987). The method uses the fact that the lower threshold on Medipix2 can be preset over a wide range of values, from ∼ 5 keV up to 120 keV (or even higher, if required). As the threshold is lowered to very low values (a few keV), a single incident electron can be recorded in a number of adjacent pixels due to charge sharing. The variation in the number of “counting” pixels can be used to calculate the equivalent of a variance, from which it is possible to compute the DQE as a function of threshold, provided the number of electrons incident on
the detector is known accurately. The total number of counts was estimated from the Monte Carlo simulations by using the number of zero, single, double, triple, and so on counts per electron. Experimentally, the DQE measurements were made by exposing a Medipix2 detector to uniform radiation containing only ∼ 240 electrons/second (all images were recorded for 1 second) over the sensitive area. Since there are ∼ 65 000 pixels in the detector, the probability of a pixel being hit by two electrons in a frame, or even of two electrons impinging on adjacent pixels, is very small. This point is important, as multiple counts in the vicinity of a pixel are treated as arising from the same incident electron. The variation of counts per electron with threshold can be understood intuitively as follows. At a threshold setting close to the incident energy—120 keV—no counts are recorded, as there are few electrons that deposit all their energy in one pixel. As the threshold is lowered gradually, some single counts begin to appear as the energy deposited increases above the threshold value. At the same time, there are fewer instances when none of the pixels has sufficient energy to exceed the threshold, so the number of “no counts” decreases. With a further reduction in the threshold, some double counts occur (still arising from the impact of a single electron, with sufficient charge sharing in adjacent pixels to trigger two counts); with still lower thresholds there are some 3- and a few 4-pixel counting events. Figure 10 shows good agreement between the theoretically expected curves (solid lines) and experimental points (different symbols, explained in the figure text). The value of DQE(0) was calculated from these measurements (details in McMullan et al., 2006) and is shown in Figure 11. The high DQE figures are particularly impressive as they were obtained at extremely low doses, making Medipix2 ideal for low-dose imaging. The DQE at Nyquist frequency can be calculated as follows:

DQE(Nyquist) = DQE(0) × MTF(Nyquist)^2 / NTF(Nyquist)^2,
where NTF refers to the noise transfer function, defined by Meyer and Kirkland (2000) and shown in Figure 11. The maximum value of MTF at Nyquist frequency for a pixellated detector is 2/π (0.636), so the maximum value of DQE at Nyquist, for a detector with a DQE(0) of 1.0 and an NTF of 1.0, is 4/π² (0.405). Figure 11 shows the DQE(Nyquist) as a fraction of the maximum possible value as triangles, along with the theoretically predicted continuous line. The dotted line shows the experimental values for DQE(Nyquist).

G. Future Prospects for Medipix2

Medipix2 falls short of being an acceptable detector for EM because of two primary shortcomings. First, the total area and the number of available pixels
FIGURE 11. Theoretical and experimental plots of DQE at zero and Nyquist frequency (McMullan et al., 2006). The DQE(0) is extremely high (86%) at low thresholds and decreases gradually with increasing threshold values. The experimental points are shown as open circles and the theoretical values as a continuous line. The values for DQE(Nyquist) are depressed by the fact that the maximum value for a pixellated detector is limited to (2/π)² or 0.405. The triangles show the experimental values, and the solid line the theoretical values for DQE(Nyquist). (From McMullan et al., 2006.)
are quite small. Second, the resolution at higher energies, needed for the majority of cryo-EM work, is too low. The radiation damage at 300 keV, however, is likely to be a less serious obstacle in the future because readout chip designs will be considerably more radiation hard, as already mentioned. The low noise characteristics of the quad would be very useful in ultralow-dose microscopy (McMullan et al., 2006). One possible cause of the poor quality of high-resolution images is specimen charging or mechanical movement during imaging. These effects could be studied by dose-fractionated imaging, where a large number of images are taken in a “movie” mode, with extensive subsequent data processing (Henderson and Glaeser, 1985; Henderson, 1992). The total size of the detector used in the present tests was a quad chip, consisting of 512 × 512 pixels. In order to fulfill our criteria for the “acceptable” detector, the size would need to be increased by a factor of 8 in both directions to arrive at the required number of pixels: 4k × 4k. A start has
FIGURE 12. Projected arrangement for achieving the design of a larger format detector by tiling Medipix2 Quads into a 3 × 3 array. The readout chip connections are brought out through the chip rather than on one edge to allow four-side buttable detectors. (From Bethke et al., 2006.)
been made to tile an array of quads into larger areas in a project called High Resolution Large Area X-ray Detector (RELAXD) by PANalytical (Almelo, The Netherlands) (Bethke et al., 2006), an industrial company that has been closely associated with the Medipix2 chip. The existing Medipix2 chip is only three-side buttable, so in its present form it is not possible to consider larger area tiling. The plan is to modify the Medipix2 chip to make it four-side buttable, thus making larger area detectors possible. The concept of the supertiling is simple enough in principle—the wire connections from the readout chips are taken out through the silicon to the back side of the wafer rather than to one side, as in the original design. The plan is to use the existing Medipix2 readout but with wafer connections as shown in the insert in Figure 12. The wires also are “fanned” out slightly to simplify the connections. In the first phase, the plan is to design a detector with a 3 × 3 array
of quads (i.e., with 1.5k × 1.5k pixels in the detector). Depending on the experience with the first design, more challenging designs, with larger arrays, could be undertaken in the future. The poor-resolution problem at 300 keV is unlikely to be solved without changing the detector material from silicon to either CdTe or gallium arsenide (GaAs), both of which have much greater stopping power compared with silicon. The predicted behavior of 300-keV electrons in CdTe is very similar to that of 120-keV electrons in silicon (Figure 1), leading to the expectation that once the technology for producing a satisfactory grade of CdTe is available, it should be possible to obtain better resolution at 300 keV. The reduced range of the 300-keV electrons would also mean that radiation damage would be a lesser problem even with the current Medipix2 readout chips.
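As a numerical recap of the figures of merit from Sections VII.E and VII.F, the sketch below evaluates the DQE(Nyquist) relation; the 2/π and 4/π² limits and the 86% low-threshold DQE(0) come from the text, while the sample MTF and NTF values are invented:

```python
import math

def dqe_nyquist(dqe0: float, mtf: float, ntf: float = 1.0) -> float:
    """DQE(Nyquist) = DQE(0) * MTF(Nyquist)^2 / NTF(Nyquist)^2 (Section VII.F)."""
    return dqe0 * mtf**2 / ntf**2

MTF_MAX = 2.0 / math.pi                 # ~0.637, limit for a pixellated detector
print(dqe_nyquist(1.0, MTF_MAX))        # 4/pi^2 ~ 0.405, the ideal-case ceiling

# Low-threshold operating point: DQE(0) = 0.86 (Figure 11), with invented
# MTF(Nyquist) = 0.5 and NTF(Nyquist) = 1.0:
print(dqe_nyquist(0.86, 0.5))           # ~0.215
```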
VIII. MAPS DETECTORS BASED ON CMOS

The second type of direct imaging detector considered in this review is a MAPS, designed at Rutherford Appleton Laboratory (Oxfordshire, United Kingdom) in CMOS technology for space applications (Prydderch et al., 2003). The same technology has been used for designing CMOS sensors for the consumer market, for use in cheaper digital and video cameras instead of the traditional high-performance CCD-based cameras. The design of MAPS detectors differs from the original CMOS sensors in several important respects. The design of the sensor is modified, by adding a sensitive epitaxial layer, to allow charged particles to deposit sufficient energy to be recorded; the electronic layout can also be designed to be more resistant to radiation damage. A major advantage of CMOS-based devices over CCDs is that, as the CMOS process is widely used in a range of microelectronics applications, additional electronics (e.g., analog-to-digital converters [ADCs]) can be incorporated within the detector layout. On the other hand, CCD fabrication uses a specialized technology employed only for manufacturing CCDs by very few manufacturers. The main differences between the readout from CMOS and CCD imagers are shown in Figure 13; this helps to explain the functionality of the two devices. CCDs are read out by shifting rows of charge, along the columns, toward the bottom row, which also has an output node. The output node is connected to a low-noise charge-sensitive amplifier, which converts the charge to a voltage signal, which is digitized in an ADC. The readout occurs on a pixel-by-pixel basis, making the readout of the complete CCD a relatively slow process. The MAPS readout is arranged in such a manner that a “row select” signal allows all the pixels in that row to be read out in parallel and digitized. For very fast readout an ADC per column
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
81
F IGURE 13. Readout schemes for a CMOS-based MAPS detector and a CCD. Only one output node is shown for the CCD, but some large format CCDs have up to four nodes to speed up the readout. The readout for MAPS is much faster as the readout is done on a row by row basis. For very fast readout there could be one ADC per column; if readout times are less critical, a number of columns are multiplexed into one ADC.
could be used, but for economy a multiplexer is used to read a number of columns (∼ 8) through one ADC. Because of the parallel readout architecture intrinsic to MAPSs, the readout speed is considerably faster than for CCDs (typically 10–100 times depending on complexity of readout circuits). Since the charge to voltage conversion in CMOS devices takes place at the pixel level, unlike CCDs where charge is transferred through a large number of pixels, the detector is less susceptible to radiation damage. MAPS do not require any bump-bonding, an advantage over hybrid pixel detectors. Since the size of the bump and the bump-bonding process requires a minimum size for the pixel, it is not feasible to design HPDs with pixel sizes much smaller than 55 µm, which is far from optimum. The pixel size is much more flexible in MAPS and could be made much smaller than 55 µm to match the requirements in EM.
82
FARUQI
F IGURE 14. Schematic diagram showing the layout of a single pixel in a MAPS detector. Charge generated by the incident electron in the epilayer diffuses to, and is trapped by, the n+ diode and charges, or discharges (if already at a high voltage) the stray capacitance of the pixel. The signal readout is accomplished by just three transistors (more complex designs use more): T1 , T2 , and T3 . When a particular row is selected by switching T3 on, the signal voltage on T2 is read out through the column into an ADC. Successive rows are selected to read out all the pixels in a given column.
The layout and basic design principles for a single pixel in MAPS is shown in schematic form in Figure 14. The readout electronics component typically occupies the top 2–4 µm of thickness as shown in Figure 14. The essential components of the pixel are the n+ p− diode, where the p− is the thin epitaxial layer (abbreviated as epilayer) above the p+ bulk (Prydderch et al., 2003). The incoming electron creates signal charge (electron–hole pairs) in the epilayer, typically ∼ 80e− h+ pairs per micrometer for high-energy electrons (minimum ionizing particles), which diffuse to the n+ diode. The p+ substrate has no function in the electron detection process but strengthens and supports the thin epilayer. Because there are no electrical fields (other than potentials created due to different doping) in the epilayer, the signal electrons can travel only by diffusion. Any electrons diffusing toward the p+ bulk layer are reflected back into the epilayer due to the potential difference at the boundary; electrons diffusing toward the n+ region are trapped in a potential well and unable to escape. MAPS detectors use this process to read out the signal by making the n+ into a readout node as discussed below. The readout from a pixel can be explained with reference to Figure 14. Prior to exposure the node A, at the output of the n+ diode, is reset to a fixed positive
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
83
voltage by the transistor T1 . During exposure, electrons collected through the n+ diode discharge the stored value by a small amount, which represents the signal to be recorded. During readout, columns are selected sequentially and all pixels in a given row are read out (the row is selected by the transistor T3 ) through transistor T2 . The small signal from the pixel is amplified and then digitized in an ADC in circuits usually separate from the detector chip. All the evaluation tests described in this review were carried out on a MAPS detector that had an epilayer of ∼ 4 µm and 525 × 525 pixels on a 25-µm pitch (Prydderch et al., 2003). Due to the extremely thin epilayer (detection layer), there is little sideways spread of charge from the incident electron. This can be observed in the Monte Carlo simulation of electron trajectories in silicon at energies between 120 and 300 keV (shown in Figure 1). The thickness of the silicon for the simulations was chosen to be 300 µm. It is clear from the very top few microns of the electron tracks, which show very little lateral spread, that the resolution would be expected to be very good (spoiled partially by backscattered electrons). Although electrons recorded in CMOS sensors with a relatively thin epilayer generate a much smaller signal than in Medipix2, it is sufficiently high for efficient single-electron detection. Monte Carlo simulations predict that an electron would typically deposit 1–2 keV in the epilayer, which would convert into 280–560 electron–hole pairs in the signal. With a readout noise ∼ 50e− root mean square, which can be reduced in future designs, these simulations suggested a possible detector design based on direct detection in MAPS, able to count every electron in a given pixel with high efficiency. The MTF and DQE could be further improved by using backthinned devices, which reduce the number of backscattered electrons (see Section VIII.C). A. MAPS: Sensitivity and Resolution The sensitivity and resolution of the MAPS detector was measured over a range of energies between 40 and 300 keV, with a more detailed evaluation at 40 and 120 keV (Faruqi et al., 2005a). The shadow image of a standard 300-mesh EM grid, obtained with uniform illumination (shown in Figure 15), was found to be very useful for these measurements. The grid images provide a convenient way of recording “bright” and “dark” field images adjacent to each other. The power spectrum of the images can be used for calculating the resolution compared to film (Roseman and Neumann, 2003). Images at several different levels of illumination were recorded to explore the single-electron sensitivity of MAPS at 120 keV. Dark images were also collected immediately after the main image and subtracted to eliminate fixed pattern noise and dark current. Figure 15 shows images of the grid, with
84
FARUQI
F IGURE 15. Three sets of grid images showing the response of a MAPS detector to 120-keV electrons. The illumination was set to 6 electrons/pixel in the right-hand images, the central panel is a blank, and the panel on the left had only 6 electrons/100 pixels. With only 6 electrons/100 pixels illumination, the probability of recording two electrons in the same pixel is very small, and it is assumed that the recorded signal values in each pixel are due to one electron. (From Faruqi et al., 2005a.)
beam energy set to 120 keV, at two widely different levels of illumination; Figure 15a was recorded with the illumination set to 6 electrons/100 pixels and Figure 15c with 6 electrons/pixel. Since there is an extremely low probability of double hits in Figure 15a, the signals in the pixels correspond to MAPS response to single electrons. About 1% electrons are transmitted through the grid bars and the grid support (see Figure 15c). A similar set of images were also recorded at 40 keV (not shown) and show features similar to those at 120 keV except that, because of their lower energy, 40-keV electrons are completely stopped in the grid bars and support. Since single electrons with 40 and 120 keV energy are clearly recorded with MAPSs, is it possible to obtain the SNR and resolution from these data? The single-electron response at 40 and 120 keV was investigated by averaging the response for a number of single electrons (152 at 120 keV) (Faruqi et al., 2005a). The response function was centered on the seed pixel containing the highest counts. The single-electron profile is given in Table 2; the total counts in a 3 × 3 box were 50, which is the signal, after background subtraction (see also Figure 16).
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
85
TABLE 2 S INGLE - ELECTRON PROFILE AT 120 KE V
0.8 3.8 1.6
0.0
0.2 5.2 29.9 3.1 0.1
1.4 3.6 1.0
−0.2
TABLE 3 S INGLE ELECTRON PROFILE AT 40 KE V 0.5 0.3 0.1 −0.2 0.2
0.1 0.6 3.0 0.6 0.3
0.2 3.7 39.0 3.5 −0.1
−0.1 0.2 2.2 0.0 0.1
−0.6 −0.3 0.3 −0.2 0.0
A similar measurement at 40 keV, shown in Table 3, gives the average signal as 52.8. The noise in the MAPS detector, estimated by subtracting two blank images, produced ∼ 2 ADC units. This suggests a high SNR of ∼ 25 at 120 keV and even slightly higher at 40 keV. The method of integrating all the signal charge in a 5 × 5 box would be reduced somewhat if a slightly smaller integration box was used but would still be substantial. The SNR at higher voltages is expected to be lower as the signal charge from the passage of a 300-keV electron (from energy loss calculations) would be roughly half that at 120 keV. The resolution of MAPS was measured at 120 keV using two methods: The knife-edge method (McMullan et al., 2006) described for Medipix2 and another method (Roseman and Neumann, 2003), which provides the value of a relative MTF compared with a “perfect” detector—in this case taken to be film. In the latter case, the relative MTF is measured by recording images of the 300-mesh grid on MAPS and film. The ratio of the normalized measured amplitudes of the pattern is compared with the normalized measured values for film to determine the relative MTF (i.e., MAPS relative to film). In order to make the comparison strictly comparable, data from film, originally scanned with a 7-µm raster, were binned to give an image equivalent to the image on MAPS. Comparing the power at the Nyquist frequency (i.e., at twice the pixel size—50 µm), the power in the MAPS pattern was reduced to 52% in comparison with film at 120 keV. Due to the smaller lateral spread of charge at the lower energy, 40 keV, the power is only reduced to 75%. MTF
86
FARUQI
and DQE results at 300 keV, using the knife-edge method, will be published shortly. B. Radiation Damage in CMOS-Based Detectors As discussed in the section on the general requirements for direct detectors, one of the essential features is reasonable immunity to radiation damage during normal use. For convenience, we used the period of 1 year as acceptable before a detector might need to be replaced. CMOS circuits, which include CMOS-based sensors (i.e., MAPSs), are susceptible to radiation damage effects, especially when the absorbed radiation dose is integrated over a 1-year period of normal EM operation. Two main types of radiation damage affect CMOS circuits. The first type of damage, also called displacement damage, is caused by protons, neutrons, or other heavy charged particles. This occurs due to the knocking out of silicon atoms from the crystalline lattice as the heavy particles impart a large momentum transfer. Fortunately, the energy of electrons of interest in EM is not sufficiently high to cause this type of damage in silicon. The second type of damage occurs because of charging effects in the pixel structure and results in an increased dark current (i.e., a signal in the absence of any incident radiation). Dark current, generated by the spontaneous emission of electrons, is always present in semiconductor detectors. It is less important in HPDs as setting the lower threshold just above noise level eliminates noise counts. With higher doses of radiation, the dark current in MAPS detectors increases to such an extent that insufficient dynamic range is left to acquire any useful signal. Further, the images become noisier as it is more difficult to correct for the additional dark current noise. The first MAPS detector (Faruqi et al., 2005a), from which resolution and detection efficiency were described earlier (Section VIII.A), was not required to be radiation-hard for the original application (Prydderch et al., 2003). Rad-hard sensors are, however, needed in some space applications where they are likely to encounter a continuous flux of charged particles. Thus, there has been a strong incentive to design rad-hard sensors—a requirement that is also shared by our applications in EM and in other fields. Special design techniques have been devised to make CMOS sensors rad-hard. Although the main technical details of how this is achieved (Bogaerts et al., 2003) are beyond the scope of this review, the designs ensure that there are enclosed gates around transistors, which prevent charge buildup, which in turn, prevents buildup of current. An example of this type of design shows that the increase in dark current during irradiation can be reduced by about three orders of magnitude compared with sensors fabricated using
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
87
F IGURE 16. A plot of the distribution of the number of single-electron events with a given value of ADC counts at 40 (a) and 120 keV (b). The shape of the distribution is assumed to be a Landau distribution at high energies. The mean value of the distribution at 40 keV, which resembles a normal distribution, is 53 (σ = 18). The distribution at 120 keV has a slight tail at high ADC values, making it resemble a Landau distribution, with a mean value of 50 (σ = 28). (From Faruqi et al., 2005a.)
normal designs (Bogaerts et al., 2003). A rad-hard CMOS sensor, known as STAR250, containing 512 × 512 25-µm pixels, has been designed by the FillFactory NV (Mechelen, Belgium) (part of Cypress Corporation) for the European Space Agency (Cypress Semiconductor Corporation (Belgium) BVBA). Tests were performed to determine how well this sensor would fare when irradiated with 300-keV electrons in a Tecnai F30 electron microscope (FEI, Hillsboro, Oregon). The STAR250 was mounted on a special purpose board, with analog signals transmitted via vacuum connectors to an external control unit containing the ADC and the drive electronics needed for the sensor (Faruqi et al., 2006). The output of the ADC was used to generate images in a PC at 30 frames/second. The radiation damage was estimated by exposing the STAR250 sensor to a small, uniform beam of 300 keV electrons with known intensity. The number of electrons per square millimeter was calculated fairly precisely from the beam current monitor, which had originally been calibrated against absolute current measurements with a Faraday cup. A standard EM grid was chosen for this experiment because it has transparent grid squares with opaque borders providing damaged and undamaged areas adjacent to each other as shown in Figure 17. With the same shadow image of the 300-mesh grid as used previously, a spacing of 665 µm in both directions in the sensor plane is obtained. During irradiation the sensor was continuously operated with all the voltage drivers applying the appropriate voltages used in reading out the images. The damage was estimated by measuring the residual contrast in the
88
FARUQI
F IGURE 17. Images showing the radiation damage studies on a STAR250 sensor at 300 keV. The sensor was subjected to three areas of irradiation: A, 1 Mrad, B, 0.2 Mrad, C, 0.2 Mrad but with annealing over 4 weeks. The bright field images are shown in (a), dark field in (b), and the difference in (c). The main result is that the sensor has a residual contrast of 82% after 1 Mrad of radiation. With suitable dark current corrections, such a sensor would be useable. (From Faruqi et al., 2006.)
recorded images after irradiation. The residual contrast is defined as [BA − DA]post-irradiation , [BA − DA]pre-irradiation where BA = Recorded intensity in the bright area and DA = Intensity in the dark area.
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
89
Three small areas on the sensor were irradiated with 300-keV electrons (shown in Figure 17): area A with the equivalent of 200 krad, area B also with 200 krad, and area C with 1 Mrad. It is useful to note that a dose of 1 Mrad is approximately equivalent to 8 × 1010 electrons/mm2 on the sensor. The main conclusion from this experiment was that after an integrated dose of 1 Mrad (8 × 1010 300-keV electrons/mm2 ) the residual contrast was 82%, shown in grid image A in Figure 17(b). The lowered contrast can be explained more clearly by pointing out the greatly increased dark levels, shown in grid image A in Figure 17(c)—any subsequent signal must be measured above this new raised “baseline”. The lifetime of the STAR250 may actually be longer than predicted by our measurements, which is due to the nonlinear nature of radiation damage with dose (Bogaerts et al., 2003). The radiation damage occurring between 1 and 10 Mrad is about an order of magnitude less than the damage between 0 and 1 Mrad. Since our measurements were made in the 0–1-Mrad region, where damage is most rapid with exposure to radiation, above 1 Mrad the rate of damage might be slower (Bogaerts et al., 2003). The residual contrast obtained with an integrated dose of 0.2 Mrad is shown in both grid images B and C in Figure 17(b). The difference is that the measurements for B were done immediately after irradiation but with a gap of 4 weeks for image C. The residual contrast is slightly higher for the delayed measurements, suggesting some degree of annealing may have taken place during the 4 weeks, which acts like a repair mechanism on the sensor. Heating up the sensor could also speed up the annealing process, but this has not been verified for the present sensor. The most important conclusion from this exercise is that, provided a sensor with rad-hard design is available, it should be possible to use it at least up to a total dose of 1 Mrad, making sure to correct for dark current shifts over that time.
C. Outlook for CMOS-Based Detectors The DQE and resolution of MAPS is very good and likely to improve as suitably backthinned detectors become available. Combining the radiation resistance of the rad-hard STAR250 sensor into the MAPS detector would produce an extremely useful device, but with only 512 × 512 pixels. As already discussed, a larger format sensor, with 4k square array, is essential for many applications. New developments in CMOS designs may allow stitching together a number of smaller (1024 × 1024) sensors into a larger sensor with an adequate number of pixels.
90
FARUQI
IX. C ONCLUSIONS An urgent need exists for development of high-sensitivity, high-resolution, and low-noise electronic detectors for use in the increasingly popular field of structural studies using electron cryomicroscopy. Existing electronic detectors, based on phosphor-coupled CCDs or image plates, cannot fulfill all those requirements, and this has motivated the quest for improved electronic detectors. The most demanding applications, which require high spatial resolution, still use film, with all its drawbacks, as the detection medium. The development of direct-detection detectors is progressing in parallel with improved CMOS processing technology. Two types of direct detectors were described in this chapter: an HPD, based on Medipix2, designed by the Medipix collaboration, based at CERN (http://www.cern.ch/MEDIPIX), and a MAPS, designed at Rutherford Appleton Laboratory (Prydderch et al., 2003). HPDs, which use separate pixellated detector and readout electronics chips, have the unique property of being able to acquire data without adding any noise—a feature that is potentially extremely useful in acquiring multipleframe data for later integration or subsequent image processing. The design of MAPS, on the other hand, simplifies the construction process and allows the integration of detector and readout on the same chip. The technology used for MAPS construction may be more convenient than HPDs and therefore may be easier to produce in a large format, which would satisfy most EM requirements. The loss of energy by the incident electron in a thin layer follows a Landau distribution, which results in a large variance in the energy deposited and consequent lowering of the DQE. Improved designs of the MAPS detectors, including backthinned detectors, should lead to a lowering of the variance and an improvement in the DQE. The progress in CMOS technology, which lies at the heart of direct detection, is moving at a very rapid pace in a number of specialist laboratories. Our aim will be to harness these developments into a detector optimized for EM.
ACKNOWLEDGMENTS I thank my colleagues in the MRC Laboratory of Molecular Biology, Richard Henderson and Greg McMullan, for detailed criticisms and comments on the manuscript and for permission to use unpublished data (including Figures 1, 5(b), 7, 9, 10, and 11). I also thank my colleagues in the Medipix Project, particularly Michael Campbell at CERN for Figures 2 and 3; Richard LaBennett of RTI Corporation for Figure 4; Klaus Bethke for Figure 12; and Renato Turchetta for many discussions on all aspects of MAPS detectors.
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
91
R EFERENCES Auer, M. (2000). Three-dimensional electron cryo-microscopy as a powerful structural tool in molecular medicine. J. Mol. Med. 78, 191–202. Baumeister, W., Grimm, R., Walz, J. (1999). Electron tomography of molecules and cells. Trends Cell Biology 9, 81–85. Bethke, K., de Vries, R., Kogan, V., Vasterink, J., Verbruggen, R., Kidd, P., Fewster, P., Bethke, J. (2006). Applications and new developments in X-ray materials analysis with Medipix2. Nucl. Instrum. Methods 563, 209–214. Bogaerts, J., Dierckx, B., Meynants, G., Uwaerts, D. (2003). Total dose and displacement damage effects in a radiation-hardened CMOS APS. IEEE Trans. Elec. Dev. 50, 84–90. Booth, C.R., Jiang, W., Baker, M.L., Hong Zhou, Z., Ludke, S.J., Chiu, W. (2004). A 9 Å single particle reconstruction from CCD captured images on a 200 kV electron cryomicroscope. J. Struct. Biol. 147, 116–127. Campbell, M., Heijne, E.H.M., Meddeler, G., Pernigotti, E., Snoeys, W. (1998). A readout chip for a 64 × 64 pixel matrix with 15-bit single photon counting. IEEE Trans. Nucl. Sci. 45 (3), 751–753. Conti, M., Maiorino, M., Mettivier, G., Montesi, M.C., Russo, P. (2003). Preliminary test of Medisoft4: control software for the Medipix2 chip. IEEE Trans. Nucl. Sci. 50, 869–877. Dainty, J.C., Shaw, R. (1974). Image Science. Academic Press. Deptuch, G. (2005). Tritium autoradiography with thinned and back-side illuminated monolithic active pixel sensor device. Nucl. Instrum. Methods 543, 537–548. Downing, K.H., Hendrickson, F.M. (1999). Performance of a 2K CCD camera designed for electron crystallography at 400 kV. Ultramicroscopy 75, 215– 234. Faruqi, A.R. (2001). Prospects for hybrid pixel detectors in electron microscopy. Nucl. Instrum. Methods A 466, 146–154. Faruqi, A.R., Andrews, H.N. (1997). Cooled CCD camera with tapered fibre optics for electron microscopy. Nucl. Instrum. Methods A 392, 233–236. Faruqi, A.R., Cattermole, D.M., Henderson, R., Mikulec, B., Raeburn, C. (2003). Evaluation of a hybrid pixel detector for electron microscopy. Ultramicroscopy 94, 263–276. Faruqi, A.R., Henderson, R., Holmes, J. (2006). Radiation damage studies on STAR250 CMOS sensor at 300 keV for electron microscopy. Nucl. Instrum. Methods 565, 139–143. Faruqi, A.R., Henderson, R., Subramaniam, S. (1999). Cooled CCD detector with tapered fibre optics for electron diffraction patterns. Ultramicroscopy 75, 235–250.
92
FARUQI
Faruqi, A.R., Henderson, R., Prydderch, M., Turchetta, R., Allport, P., Evans, A. (2005a). Direct single electron detection with a CMOS detector for electron microscopy. Nucl. Instrum. Methods 546, 170–175. Faruqi, A.R., Henderson, R., Tlustos, L. (2005b). Noiseless direct detection of electrons in Medipix2 for electron microscopy. Nucl. Instrum. Methods 546, 160–163. Faruqi, A.R., Subramaniam, S. (2000). CCD detectors in high-resolution biological electron microscopy. Quart. Rev. Biophys. 33, 1–27. Fornaini, A., Boerkamp, T., Oliviera, R., Visschers, J. (2003). A multichip board for X-ray imaging in build-up technology. Nucl. Instrum. Methods 509, 206–212. Frank, J. (2002). Single-particle imaging of macromolecules by cryo-electron microscopy. Annu. Rev. Biophys. Biomol. Struct. 31, 303–319. Fujita, H., Tsai, D.-Y., Itoh, T., Doi, K., Morishita, J., Ueda, K., Ohtsuka, A. (1992). A simple method for determining the modulation transfer function in digital radiography. IEEE Trans. Med. Imaging 11, 34–39. Hall, G. (1995). Silicon pixel detectors for X-ray diffraction studies at synchrotron sources. Quart. Rev. Biophys. 28, 1–32. Henderson, R. (1992). Image contrast in high-resolution electron microscopy of biological molecules: TMV in ice. Ultramicroscopy 46, 1–18. Henderson, R. (1995). The progress and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Quart. Rev. Biophys. 28, 171–193. Henderson, R. (2004). Realizing the full potential of electron cryomicroscopy. Quart. Rev. Biophys. 37, 3–13. Henderson, R., Cattermole, D., McMullan, G., Scotcher, S., Fordham, M., Amos, W.B., Faruqi, A.R. (2006). Digitisation of electron microscope films. Ultramicroscopy, in press. Henderson, R., Glaeser, R. (1985). Quantitative analysis of image contrast in electron micrographs of beam sensitive crystals. Ultramicroscopy 16, 139– 150. Joy, D.C. (1995). Monte Carlo Modeling for Electron Microscopy and Microanalysis. Oxford Univ. Press. Krüger, H. (2005). 2D detectors for particle physics and for imaging applications. Nucl. Instrum. Methods A 551, 1–14. Llopart, X., Campbell, M. (2003). First test measurements of a 64k pixel readout chip working in single photon counting mode. Nucl. Instrum. Methods A 509, 157–163. Llopart, X., Campbell, M., Dinapoli, R., San Secundo, D., Pernigotti, E. (2002). IEEE Trans. Nucl. Sci. 49, 2279–2283. McMullan, G., Cattermole, D.M., Chen, S., Henderson, R., Llopart, X., Summerfield, C., Tlustos, L., Faruqi, A.R. (2006). Electron imaging with Medipix2 hybrid pixel detector. Ultramicroscopy, in press.
DIRECT ELECTRON DETECTORS FOR ELECTRON MICROSCOPY
93
Meyer, R.R., Kirkland, A. (1998). The effects of electron and photon scattering on signal and noise transfer properties of scintillators in CCD cameras used for electron detection. Ultramicroscopy 75, 23–33. Meyer, R.R., Kirkland, A.I. (2000). Characterization of the signal and noise transfer of CCD cameras for electron detection. Micros. Res. Tech. 49, 269– 280. Mikulec, B., Campbell, M., Heijne, E., Llopart, X., Tlustos, L. (2003). Xray imaging using single photon processing with semiconductor pixel detectors. Nucl. Instrum. Methods A 51, 282–286. Milazzo, A., Leblanc, P., Duttweiler, F., Jin, L., Bouwer, J.C., Peltier, S., Ellisman, M., Bieser, F., Matis, H.S., Wieman, H., Denes, P., Kleinfelder, S., Xuong, N. (2005). Active pixel sensor array as a detector for electron microscopy. Ultramicroscopy 104, 152–159. Prydderch, M.L., Waltham, N.J., Turchetta, R., French, M.J., Holt, R., Marshall, A., Burt, D., Bell, R., Pool, P., Eyles, C., Mapson-Menard, H. (2003). A 512 × 512 CMOS monolithic active pixel sensor with integrated ADCs for space science. Nucl. Instrum. Methods A 512, 358–367. Rabbani, M., Shaw, R., Van Metter, R. (1987). Detective quantum efficiency of imaging systems with amplifying and scattering mechanisms. J. Opt. Soc. Am. A 4, 895–901. Roberts, P.T.E., Chapman, J.N., MacLeod, A.M. (1982). A CCD-based recording system for CTEM. Ultramicroscopy 8, 385–396. Roseman, A.M., Neumann, K. (2003). Objective evaluation of the relative modulation transfer function of densitometers for digitisation of electron micrographs. Ultramicroscopy 96, 207–218. San Secundo Bello, D., Beuzekom, M., van Janweijer, P., Verkooijen, H., Visschers, J. (2003). An interface board for the control and data acquisition of the Medipix2 chip. Nucl. Instrum. Methods A 509, 164–170. Sander, B., Golas, M.M., Stark, H. (2005). Advantages of CCD detectors for de novo three-dimensional structure determination in single-particle electron microscopy. J. Struct. Biol. 151, 92–105. Tlustos, L., Ballabriga, R., Campbell, M., Heijne, E., Kincade, K., Llopart, X., Stejskal, P. (2006). Imaging properties of the Medipix2 system exploiting single and dual energy thresholds. IEEE Trans. Nucl. Sci. 53, 367–372. Zeitler, E. (1992). The photographic emulsion as analog recorder for electrons. Ultramicroscopy 46, 405–416. Zweig, H.J. (1965). Detective quantum efficiency of photodetectors with some amplification mechanisms. J. Opt. Soc. Amer. 55, 525–528.
This page intentionally left blank
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 145
Exploring Third-Order Chromatic Aberrations of Electron Lenses with Computer Algebra ZHIXIONG LIU Department of Electronics, Peking University, Beijing 100871, China
I. Introduction . . . . . . . . . . . . . . . . . . . . . . II. Variational Function and Its Approximations . . . . . . . . . . . . . A. The Second- and Fourth-Order Approximations . . . . . . . . . . . B. The Gaussian Value of the Fourth-Order Approximation . . . . . . . . III. Chromatic Perturbation Variational Function and Its Approximations . . . . . . A. The Second- and Fourth-Order Approximations . . . . . . . . . . . B. The Gaussian Values of the Second- and Fourth-Order Approximations . . . . IV. Analytical Derivation of Third-Order Chromatic Aberration Coefficients . . . . . A. Intrinsic Chromatic Aberration Coefficients . . . . . . . . . . . . B. Combined Chromatic Aberration Coefficients . . . . . . . . . . . 1. Chromatic Spherical Aberration Coefficients . . . . . . . . . . 2. Chromatic Coma 1 Coefficients . . . . . . . . . . . . . 3. Chromatic Coma 2 Coefficients . . . . . . . . . . . . . 4. Chromatic Astigmatism Coefficients . . . . . . . . . . . . 5. Chromatic Field Curvature Coefficients . . . . . . . . . . . 6. Chromatic Distortion Coefficients . . . . . . . . . . . . . 7. Anisotropic Chromatic Spherical Aberration Coefficients . . . . . . 8. Anisotropic Chromatic Coma 1 Coefficients . . . . . . . . . . 9. Anisotropic Chromatic Coma 2 Coefficients . . . . . . . . . . 10. Anisotropic Chromatic Astigmatism Coefficients . . . . . . . . 11. Anisotropic Chromatic Field Curvature Coefficients . . . . . . . 12. Anisotropic Chromatic Distortion Coefficients . . . . . . . . . C. Total Chromatic Aberration Coefficients . . . . . . . . . . . . . V. Graphical Display of Third-Order Chromatic Aberration Patterns . . . . . . . A. Two Auxiliary Procedures . . . . . . . . . . . . . . . . . B. Chromatic Aberration Patterns . . . . . . . . . . . . . . . . 1. Chromatic Spherical Aberration Patterns . . . . . . . . . . . 2. Chromatic Coma Patterns . . . . . . . . . . . . . . . 3. Chromatic Astigmatism and Field Curvature Patterns . . . . . . . 4. Chromatic Distortion Patterns . . . . . . . . . . . . . . VI. Numerical Calculation of Third-Order Chromatic Aberration Coefficients . . . . A. Calculation of Various Quantities . . . . . . . . . . . . . . . B. Calculation of Chromatic Aberration Coefficients . . . . . . . . . . 1. Isotropic Chromatic Aberration Coefficients Caused by Electric Perturbation 2. Anisotropic Chromatic Aberration Coefficients Caused by Electric Perturbation 3. Isotropic Chromatic Aberration Coefficients Caused by Magnetic Perturbation 4. Anisotropic Chromatic Aberration Coefficients Caused by Magnetic Perturbation . . . . . . . . . . . . . . . . . . . . . .
96 96 97 100 102 103 105 106 107 109 114 115 116 117 119 120 121 122 123 124 126 127 128 129 129 131 131 132 133 134 135 136 140 141 141 142 143
95 ISSN 1076-5670 DOI: 10.1016/S1076-5670(06)45003-5
Copyright 2007, Elsevier Inc. All rights reserved.
96
LIU
C. Numerical Results VII. Conclusions . . Acknowledgments Appendix . . . References . . .
. . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
143 143 146 146 148
I. I NTRODUCTION Computer algebra was introduced to electron optics in the 1970s, by Hawkes (1977) and Soma (1977), who used the computer languages CAMAL and REDUCE, respectively. Since then, computer algebra systems have developed considerably, and Mathematica, MATLAB, and Maple have become the main three ones. Among them Mathematica is the world’s only fully integrated environment for technical computing (Wolfram, 2003), which not only performs symbolic and numerical calculations, but has powerful graphical functions as well. Since it was first released in 1988, Mathematica has widely been applied to various branches of science and engineering. However, there are neither books nor articles on electron optics with Mathematica in the numerous Mathematica publications (see http://www.wolfram.com/bookstore/ and The Mathematica Journal). This chapter is the first attempt to systematically demonstrate the power of Mathematica for aberration analysis in electron optics. This work focuses on third-order chromatic aberrations of electron lenses, including analytical derivation of aberration coefficients, graphical display of aberration patterns, and numerical calculation. As for Mathematica language, it is easily understood, and we explain those used in this chapter in Appendix. The entire chapter is also a complete program, including the input instructions and output results, and can be run in the Mathematica’s environment. This chapter is only a beginning. It is expected that various complicated problems in charged particle optics will be solved by means of Mathematica.
II. VARIATIONAL F UNCTION AND I TS A PPROXIMATIONS The variational function for rotationally symmetric electron optical systems in fixed coordinates is expressed (Ximen, 1986; Hawkes and Kasper, 1989) as √ √ F = Φ 1 + R′ . R′ − ηΨ R∗ . R′ , Φ = V (z) +
Ψ =
A = R
∞ k=0
∞ k=1
(−1)k ∂ 2k V (z) (R . R)k , (22k (k!)2 ) ∂z2k
(−1)k ∂ 2k B(z) (R . R)k , 22k+1 k!(k + 1)! ∂z2k
(1)
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
97
where R = (X, Y ), R′ = (X′ , Y ′ ), and R∗ = (−Y, X) are the twodimensional (2D) vectors in a fixed-coordinate system, the prime (′ ) implies √ the derivative with respect to z, η = e/2m, and V (z) and B(z) are the axial electric potential and magnetic flux density distributions, respectively. With Mathematica the variational function can be expanded through nth order as follows: % variationalF[n_] := Module {k, term1, term2, f, nmax}, √ √ k = n2 ; term1 = V(z)Normal Series 1 + 1, {1 , 0, k} ∗ √ Normal Series R′ .R′ + 1, {R′ .R′ , 0, k} ; 2i % i& k ((−1)i ∂ ∂zV(z) 2i )(R.R) term1 = Expand term1/.1 → i=1 ; 2i 2 (2 (i!) )V(z) ∂ 2i B(z) % i ((−1) )(R.R)i & ∂z2i ; term2 = Expand η(R∗ .R′ )/. → k−1 i=0 22i+1 i!(i+1)! f = term1 − term2; nmax = k(k + 1); For nn = nmax, nn ≥ k + 1, nn = nn − 1, For i = 0, i ≤ nn, i = i + 1, For j = 0, j ≤ nn − i, j = j + 1, & f = Select f, FreeQ #1, (R.R)i (R′ .R′ )j (R∗ .R′ )−i−j+nn & ;f
where the last three For loops aim at deleting all terms in f whose orders are higher than nth order (Wolfram, 2003). It should be emphasized that there is no difference between scalar and vector in the above Mathematica input cell because they are all in the bold fonts. However, they can be recognized through the mathematical operation. For instance, R, R′ , and R∗ in this cell are all vectors as the dot between them is meant by scalar product. Using function variationalF[n], we can obtain an arbitrary order approximation of the variational function. A. The Second- and Fourth-Order Approximations In fixed coordinates, the second- and fourth-order approximations of the variational function are F2 = variationalF[2] − variationalF[0]; StringForm[”F2 = “,”, F2 ] F4 = variationalF[4] − variationalF[2]; StringForm[”F4 = “.”, F4 ] ' R . RV ′′ (z) 1 ′ 1 + R . R ′ V (z), F2 = − ηB(z)R ∗ . R ′ − √ 2 2 8 V (z)
98
LIU
1 V ′′ (z)2 (R . R)2 V (4) (z)(R . R)2 + ηR ∗ . R ′ B ′′ (z)R . R + √ 16 128V (z)3/2 128 V (z) R ′ . R ′ V ′′ (z)R . R 1 ′ ′ 2 ' − − (R .R ) V (z), √ 8 16 V (z)
F4 = −
which will prove to be valid if only a rotating coordinate transform is performed on the fixed coordinates. Note that there is also no difference between scalar and vector in the Mathematica output cell because they are all in the plain fonts. The way to recognize them is the same as those in the input cells. The rotating transform in electron optics takes the following form (Ximen, 1986; Hawkes and Kasper, 1989): R . R = r . r,
R′ . R′ =
η2 B(z)2
ηB(z) ∗ ′ r . r + r′ . r′ + √ r .r, 4V (z) V (z)
ηB(z) R∗ . R ′ = √ r . r + r ∗ . r′ , 2 V (z)
(2)
where r = (x, y), r′ = (x ′ , y ′ ), and r∗ = (−y, x) are the 2D vectors in the rotating coordinate system. Eq. (2) is written in a separate Mathematica input cell below. 2 B(z)2 rotatingTransform = R.R → r.r, R′ .R′ → η4V(z) r.r + ηB(z) ηB(z) ′ ′ ∗ ′ ∗ ′ ∗ ′ r .r + √V(z) r .r , R .R → 2√V(z) r.r + r .r ; Substituting Eq. (2) into F2 and F4 , the second- and fourth-order approximations of the variational function in rotating coordinates, f2 and f4 , are respectively found. ′ .r′ } ; f2 = CollectExpand[F /.rotatingTransform], {r.r, r 2 f4 = Collect Expand F4 /.rotatingTransform , (r.r)2 , (r.r)(r′ .r′ ), (r′ .r′ )2 , (r∗ .r′ ), (r∗ .r′ )2 ; StringForm[”f2 = “,”, f2 ] StringForm[”f4 = “.”, f4 ] f2 =
2 η B(z)2 1' V ′′ (z) , V (z)r ′ . r ′ + r . r − √ − √ 2 8 V (z) 8 V (z)
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
99
η2 V ′′ (z)B(z)2 η2 B ′′ (z)B(z) η4 B(z)4 f4 = − − + √ 32 V (z) 128V (z)3/2 64V (z)3/2
V ′′ (z)2 V (4) (z) − + (r . r)2 √ 128V (z)3/2 128 V (z)
3 η B(z)3 ηV ′′ (z)B(z) 1 − + ηB ′′ (z) r . r + r∗ . r′ − 16V (z) 16V (z) 16
2 2 ′′ η B(z) V (z) 1 + r′ . r′ − √ − √ r . r − ηB(z)r ∗ . r ′ r ′ . r ′ 4 16 V (z) 16 V (z) 2 2 2 ∗ ′ ' η B(z) (r . r ) 1 2 − (r ′ . r ′ ) V (z) − . √ 8 8 V (z)
Usually, it is well known that the second- and fourth-order approximations of the variational function are written in the following forms: f2 = −M0 (r . r) + N0 (r′ . r′ ), M0 N0 ′ ′ 2 L0 (r . r)(r′ . r′ ) − (r . r ) f4 = − (r . r)2 − 4 2 4 √ 2 − V P0 (r . r) + Q0 (r′ . r′ ) r∗ . r′ − K0 V r∗ . r′ ,
(3)
where coefficients, L0 , M0 , . . . , and K0 , are StringForm ”L0 = “,”, Simplify −4Coefficient f4 , (r.r)2 StringForm ”M0 = “,”, Simplify −2Coefficient f4 , (r.r)(r′ .r′ ) StringForm ”N0 = “,”, Simplify −4Coefficient f4 , (r′ .r′ )2 (√ V(z) StringForm ”P0 = “,”, Simplify −Coefficient f4 , (r.r)(r∗ .r′ ) (√ StringForm ”Q0 = “,”, Simplify −Coefficient f4 , (r′ .r′ )(r∗ .r′ ) V(z) ( StringForm ”K0 = “.”, Simplify −Coefficient f4 , (r∗ .r′ )2 V(z) L0 =
η4 B(z)4 + 2η2 V ′′ (z)B(z)2 − 4η2 V (z)B ′′ (z)B(z) + V ′′ (z)2 − V (z)V (4) (z) , 32V (z)3/2
η2 B(z)2 + V ′′ (z) , √ 8 V (z) √ V (z) , N0 = 2 η(η2 B(z)3 + V ′′ (z)B(z) − V (z)B ′′ (z)) , P0 = 16V (z)3/2
M0 =
100
LIU
ηB(z) , Q0 = √ 4 V (z) η2 B(z)2 K0 = . 8V (z)3/2 Obviously, they often appear in textbooks on electron optics. Nevertheless, L0 , M0 , . . . , and K0 are used instead of L, M, . . . , and K since N is an internal function in Mathematica, N[expr] meaning approximate numerical value of expr. B. The Gaussian Value of the Fourth-Order Approximation The Gaussian value of a nth-order approximation of the variational function is defined by the value of the approximation where the position and slope vectors, r and r′ , are respectively replaced by their Gaussian values (Wu, 1957; Liu, 2004), that is r = ro rβ + ro ′ rα
and r′ = ro rβ ′ + ro ′ rα ′ .
(4)
This concept is important in electron optical aberration theory because it makes the high-order aberration analysis concise and effective. Using Eq. (4), we have three scalar products: r . r = (ro . ro )rβ 2 + 2(ro . ro ′ )rα rβ + (ro ′ . ro ′ )rα 2 ,
r′ . r′ = (ro . ro )rβ ′ 2 + 2(ro . ro ′ )rα ′ rβ ′ + (ro ′ . ro ′ )rα ′ 2 , ) Vo ∗ ∗ ′ ro . r o ′ r .r = V which are written in a separate Mathematica cell for later use. gsValues = (r.r) → (ro .ro )r2β + 2(ro .ro ′ )rβ rα + (ro ′ .ro ′ )r2α , .r )(r ′ )2 + 2(ro .ro ′ )rβ ′ rα ′ + (ro ′ .r′o )(rα ′ )2 , (r′ .r′ ) → (r √o o ∗ β ′ o (ro .ro ) (r∗ .r′ ) → V√ ; V(z)
(5)
In addition, the formulas of aberration coefficients are simplified considerably both in number and in form if the following vector identities are used: 2 ∗ ro . ro ′ = (ro . ro )(ro ′ . ro ′ ) − (ro . ro ′ )2 , ∗ ∗ ro . ro ′ ro = (ro . ro ′ )ro ∗ − (ro . ro )ro ′ , ∗ ∗ (6) ro . ro ′ ro ′ = (ro ′ . ro ′ )ro ∗ − (ro . ro ′ )ro ′ , which is written in the Mathematica code as
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
101
identicalTransform = (ro ∗ .ro ′ )2 → (ro .ro )(ro ′ .ro ′ ) − (ro .ro ′ )2 , (ro ∗ .ro ′ )ro → (ro .ro ′ )ro ∗ − (ro .ro )ro ′ ∗ , (ro ∗ .ro ′ )ro ′ → (ro ′ .ro ′ )ro ∗ − (ro .ro ′ )ro ′ ∗ ;
Substituting Eq. (5) into Eq. (3), together with the first identity in Eq. (6), and then expanding it provides the Gaussian value of the fourth-order approximation of the variational function in the rotating coordinate system, f4g = Expand − L40 (r.r)2 − M20 (r.r)(r′ .r′ ) − N40 (r′ .r′ )2 − √ ′ ∗ ′ 2 V(z) P0 (r.r) + Q0 (r′ .r′ ) (r∗ .r ) − K0 V(z)(r .r ) /. gsValues/.identicalTransform ; f4g = f4g ; where f4g is assigned for the late numerical calculation. On the other hand, it is useful to write f4g in an expanded form, f4g =
l+m+n=2
fl,m,n (ro . ro )l (ro ′ . ro ′ )m (ro . ro ′ )n
l,m,n=0,1,2
l+m+n=1 fl,m,n ∗ (ro . ro )l (ro ′ . ro ′ )m (ro . ro ′ )n , = r o ∗ . ro ′
(7)
l,m,n=0,1
where all coefficients will be found by Mathematica. StringForm ”f2,0,0 = “,”, Collect Coefficient f4g , (ro .ro )2 , {L0 , M0 , N 0 , K0 } StringForm ”f1,1,0 = “,”, Collect Coefficient f4g , (ro .ro )(ro ′ .ro ′ ) , {L0 , M0 , N 0 , K0 } = “,”, Collect Coefficient f4g , (ro .ro )(ro .ro ′ ) , StringForm ”f1,0,1 {L0 , M0 , N0 , K0 } StringForm ”f0,2,0 = “,”, Collect Coefficient f4g , (ro ′ .ro ′ )2 , {L0 , M0 , N 0 , K0 } StringForm ”f0,1,1 = “,”, Collect Coefficient f4g , (ro ′ .ro ′ )(ro .ro ′ ) , {L0 , M0 , N0 , K0 } = “,”, Collect Coefficient f4g , (ro .ro ′ )2 , StringForm ”f0,0,2 {L0 , M0 , N 0 , K0 }∗ ∗ ′ StringForm ”f1,0,0 = “,”, Collect Coefficient f4g , (ro .ro )(ro .ro ) , {P0 , Q0 } StringForm ”f0,1,0 ∗ = “,”, Collect Coefficient f4g , (ro ′ .ro ′ )(ro ∗ .ro ′ ) , {P0 , Q0 }
102
LIU
StringForm ”f0,0,1 ∗ = “.”, Collect Coefficient f4g , (ro .ro ′ )(ro ∗ .ro ′ ) , {P0 , Q0 } 1 1 1 f2,0,0 = − L0 rβ4 − M0 (rβ ′ )2 rβ2 − N0 (rβ ′ )4 , 4 2 4 1 1 f1,1,0 = − L0 rα2 rβ2 − N0 (rα ′ )2 (rβ ′ )2 − K0 Vo 2 2
1 2 ′ 2 1 2 ′ 2 + M0 − rβ (rα ) − rα (rβ ) , 2 2 f1,0,1 = −L0 rα rβ3 − N0 rα ′ (rβ ′ )3 + M0 −rα ′ rβ ′ rβ2 − rα (rβ ′ )2 rβ , 1 1 1 f0,2,0 = − L0 rα4 − M0 (rα ′ )2 rα2 − N0 (rα ′ )4 , 4 2 4 3 ′ 3 ′ f0,1,1 = −L0 rβ rα − N0 (rα ) rβ + M0 −rα ′ rβ ′ rα2 − rβ (rα ′ )2 rα ,
f0,0,2 = −L0 rα2 rβ2 − 2M0 rα rα ′ rβ ′ rβ − N0 (rα ′ )2 (rβ ′ )2 + K0 Vo , ' ' f1,0,0 ∗ = −P0 Vo rβ2 − Q0 Vo (rβ ′ )2 , ' ' f0,1,0 ∗ = −P0 Vo rα2 − Q0 Vo (rα ′ )2 , ' ' f0,0,1 ∗ = −2P0 rα Vo rβ − 2Q0 Vo rα ′ rβ ′ .
III. C HROMATIC P ERTURBATION VARIATIONAL F UNCTION AND I TS A PPROXIMATIONS In the fixed-coordinate system the perturbation of the variational function caused by the fluctuation of the lens-accelerating voltage (electric perturbation) and the fluctuation of the lens magnetic field (magnetic perturbation) is defined by Hawkes and Kasper (1989) as
∂F Vo Bo ∂F FV + FB , V + B = F = ∂V ∂B Vo B0 ∂F ∂F , FB = B0 , (8) FV = Vo ∂V ∂B0 where V and B are the lens axial electric potential and magnetic flux density distributions, respectively; Vo the voltage at the object side, B0 the maximum value of B; and FV and FB are, respectively, called the electric and magnetic chromatic perturbation variational functions in fixed coordinates.
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
103
A. The Second- and Fourth-Order Approximations The second- and fourth-order approximations of the chromatic perturbation variational function in rotating coordinates are defined in the Mathematica code as follows: ∂F2 /.rotatingTransform ; fV2 = Expand Vo ∂V(z) ∂F4 /.rotatingTransform ; fV4 = Expand Vo ∂V(z) F2 = F2 /.B(z)→ B0 b(z); F4 = F4 /.B′′ (z) → B0 b′′ (z); ∂F2 fB2 = Expand B0 ∂B /.rotatingTransform /.B0 b(z) → B(z); ∂F40 fB4 = Expand B0 ∂B0 /.rotatingTransform /.B0 b′′ (z) → B′′ (z);
which can be written in a unified form,
fX4
fX2 = AX (r . r) + BX (r′ . r′ ) + CX r∗ . r′ , MX NX ′ ′ 2 LX (r . r)2 − (r . r)(r′ . r′ ) − (r . r ) =− 4 2 4 √ 2 − V PX (r . r) + QX (r′ . r′ ) r∗ . r′ − KX V r∗ . r′
(9)
where the uppercase letter “X” stands for either “V ” or “B” and all expansion coefficients are found by Mathematica. StringForm”AV = “,”, Simplify Coefficient fV2 , (r.r) StringForm”BV = “,”, Simplify Coefficient fV2 , (r′ .r′ ) ′) StringForm”CV = “;”, Simplify Coefficient fV2 , (r∗ .r StringForm”AB = “,”, Simplify Coefficient fB2 , (r.r) StringForm”BB = “,”, Simplify Coefficient fB2 , (r′ .r′ ) StringForm ”CB = “.”, Simplify Coefficient fB2 , (r∗ .r′ ) Vo (η2 B(z)2 + V ′′ (z)) , 16V (z)3/2 Vo = √ , 4 V (z) ηB(z)Vo = ; 4V (z) η2 B(z)2 =− √ , 4 V (z) = 0, 1 = − ηB(z). 2
AV = BV CV
AB BB CB
104
LIU
StringForm ”LV = “,”, Simplify −4Coefficient fV4 , (r.r)2 StringForm ”MV = “,”, Simplify −2Coefficient fV4 , (r.r)(r′ .r′ ) StringForm ”NV = “,”, Simplify −4Coefficient fV4 , (r′ .r′ )2 √ StringForm ”PV = “,”, Simplify −Coefficient fV4 , (r.r)(r∗ .r′ ) / V(z) √ StringForm ”QV = “,”, Simplify −Coefficient fV4 , (r′ .r′ )(r∗ .r′ ) / V(z) StringForm ”KV = “.”, Simplify −Coefficient fV4 , (r∗ .r′ )2 /V(z)
LV =
4 1 Vo η B(z)4 − 2η2 V ′′ (z)B(z)2 − 3V ′′ (z)2 5/2 64V (z) + V (z)V (4) (z) ,
Vo (η2 B(z)2 − V ′′ (z)) , 16V (z)3/2 Vo NV = √ , 4 V (z) MV =
ηB(z)Vo (η2 B(z)2 − V ′′ (z)) , 32V (z)5/2 ηB(z)Vo QV = , 8V (z)3/2 PV =
η2 B(z)2 Vo . 16V (z)5/2 StringForm ”LB = “,”, Simplify −4Coefficient fB4 , (r.r)2 StringForm ”MB = “,”, Simplify −2Coefficient fB4 , (r.r)(r′ .r′ ) StringForm ”NB = “,”, Simplify −4Coefficient fB4 , (r′ .r′ )2 √ StringForm ”PB = “,”, Simplify −Coefficient fB4 , (r.r)(r∗ .r′ ) / V(z) √ StringForm ”QB = “,”, Simplify −Coefficient fB4 , (r′ .r′ )(r∗ .r′ ) / V(z) StringForm ”KB = “.”, Simplify −Coefficient fB4 , (r∗ .r′ )2 /V(z) KV =
η2 B(z)B ′′ (z) , √ 8 V (z) MB = 0, LB = −
NB = 0,
ηB ′′ (z) , √ 16 V (z) QB = 0,
PB = −
KB = 0.
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
105
B. The Gaussian Values of the Second- and Fourth-Order Approximations Substituting the Gaussian trajectory into the second- and fourth-order approximations of the chromatic perturbation variational function, Eq. (9), together with a replacement of identicalTransform, we have the corresponding Gaussian values. The electric and magnetic second- and fourth-order approximations are also expressed in a unified form with “X” instead of either “V ” or “B”. As a result, the Gaussian value of the second-order approximation can be expressed by fX2g =
l+m+n=1
fX,l,m,n (ro . ro )l (ro ′ . ro ′ )m (ro . ro ′ )n
l,m,n=0,1
+ fX,0,0,0 ∗ ro ∗ .ro ′
(10)
and its coefficients are derived below, ( fX2g = Expand Expand AX (r.r) + BX (r′ .r′ ) + CX (r∗ .r′ ) .gsValues ; fX2g = fX2g; StringForm”fX,1,0,0 = “,”, CoefficientfX2g , (ro .ro ) StringForm”fX,0,1,0 = “,”, CoefficientfX2g , (ro ′ .ro ′ ) StringForm”fX,0,0,1 = “,”, Coefficient fX2g , (ro .ro ′ ) StringForm ”fX,0,0,0 ∗ = “.”, Coefficient fX2g , (ro ∗ .ro ′ ) fX,1,0,0 = AX rβ2 + BX (rβ ′ )2 , fX,0,1,0 = AX rα2 + BX (rα ′ )2 ,
fX,0,0,1 = 2AX rα rβ + 2BX rα ′ rβ ′ , √ CX Vo ∗ fX,0,0,0 = − √ . V (z) Similarly, the Gaussian value of the fourth-order approximation is fX4g =
l+m+n=2
fX,l,m,n (ro . ro )l (ro ′ . ro ′ )m (ro . ro ′ )n
l,m,n=0,1,2
l+m+n=1 fX,l,m,n ∗ (ro . ro )l (ro ′ . ro ′ )m (ro . ro ′ )n + ro ∗ .ro ′ l,m,n=0,1
whose coefficients are fX4g = Expand Expand − L4X (r.r)2 − M2X (r.r)(r′ .r′ ) − N4X (r′ .r′ )2 − √ V(z) PX (r.r) + QX (r′ .r′ ) (r∗ .r′ ) − KX V(z)(r∗ .r′ )2 /.
(11)
106
LIU
gsValues/.identicalTransform ; fX4g = fX4g ; StringForm”fX,2,0,0 = “,”, CoefficientfX4g , (ro .ro )2 StringForm”fX,1,1,0 = “,”, CoefficientfX4g , (ro .ro )(ro ′ .ro ′ ) StringForm ”fX,1,0,1 = “,”, Coefficient fX4g , (ro .ro )(ro .ro ′ ) StringForm”fX,0,2,0 = “,”, CoefficientfX4g , (ro ′ .ro ′ )2 StringForm ”fX,0,1,1 = “,”, Coefficient fX4g , (ro ′ .ro ′ )(ro .ro ′ ) StringForm”fX,0,0,2 = “,”, Coefficient fX4g , (ro .ro ′ )2 StringForm”fX,1,0,0 ∗ = “,”, CoefficientfX4g , (ro .ro )(ro ∗ .ro ′ ) StringForm”fX,0,1,0 ∗ = “,”, CoefficientfX4g , (ro ′ .ro ′ )(ro ∗ .ro ′ ) StringForm ”fX,0,0,1 ∗ = “.”, Coefficient fX4g , (ro .ro ′ )(ro ∗ .ro ′ )
1 1 1 fX,2,0,0 = − LX rβ4 − MX (rβ ′ )2 rβ2 − NX (rβ ′ )4 , 4 2 4 1 1 1 fX,1,1,0 = − LX rα2 rβ2 − MX (rα ′ )2 rβ2 − MX rα2 (rβ ′ )2 2 2 2 1 ′ 2 ′ 2 − NX (rα ) (rβ ) − KX Vo , 2 fX,1,0,1 = −LX rα rβ3 − MX rα ′ rβ ′ rβ2 − MX rα (rβ ′ )2 rβ − NX rα ′ (rβ ′ )3 , 1 1 1 fX,0,2,0 = − LX rα4 − MX (rα ′ )2 rα2 − NX (rα ′ )4 , 4 2 4 fX,0,1,1 = −LX rβ rα3 − MX rα ′ rβ ′ rα2 − MX rβ (rα ′ )2 rα − NX (rα ′ )3 rβ ′ , fX,0,0,2 = −LX rα2 rβ2 − 2MX rα rα ′ rβ ′ rβ − NX (rα ′ )2 (rβ ′ )2 + KX Vo , ' ' fX,1,0,0 ∗ = −PX Vo rβ2 − QX Vo (rβ ′ )2 , ' ' fX,0,1,0 ∗ = −PX Vo rα2 − QX Vo (rα ′ )2 , ' ' fX,0,0,1 ∗ = −2PX rα Vo rβ − 2QX Vo rα ′ rβ ′ .
In the last two Mathematica input cells, fX2g and fX4g are assigned for the late use in the numerical calculation.
IV. A NALYTICAL D ERIVATION OF T HIRD -O RDER C HROMATIC A BERRATION C OEFFICIENTS We define third-order chromatic aberration as the chromatic aberration that is dependent on the third-order small quantities of the electron trajectory position and slope but always proportional to the first-order variation of the accelerating voltage or the maximum of the axial magnetic flux density.
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
107
Therefore, the third-order chromatic aberration of an electron lens can be calculated separately either for the electrostatic or for the magnetic perturbation. The total chromatic aberration is the sum of these. They are expressed in a unified second-order inhomogeneous ordinary differential equation, V ′′ + η2 B 2 V′ rX3 ′′ + r′X3 + rX3 2V 4V +
*
d ∂fX4 ∂fX4 X o 1 − + = √ Xo dz ∂r′ ∂r rg ,rg ′ V
+ * ∂fX2 d ∂fX2 + + − ′ dz ∂r ∂r r3 ,r3 ′ +
* ∂f4 d ∂f4 + + − dz ∂r′ ∂r type of (rg .rg )rX1 ,...
+ * ∂f4 d ∂f4 + , + − dz ∂r′ ∂r type of (rg .rX1 )rg ,...
(12)
where subscript “X” represents either “V ” or “B”; other subscripts at the right lower corner of the square brackets mean that the independent variables, r and r′ , in the square brackets are replaced by them accordingly; rX1 is the first-order chromatic aberration without the factor X o /Xo . It is seen from Eq. (12) that the third-order chromatic aberration consists of the intrinsic and combined components, the former depends on the fourth-order chromatic perturbation variational function, whereas the latter on the thirdorder geometric and first-order chromatic aberrations.
A. Intrinsic Chromatic Aberration Coefficients According to Eq. (12), the third-order intrinsic chromatic aberration satisfies the following equation: V ′′ + η2 B 2 V′ rXi ′′ + rXi ′ + rXi +
* 4V 2V ∂fX4 d ∂fX4 X o 1 + − , = √ Xo dz ∂r′ ∂r rg ,rg ′ V rXi (zo ) = rXi ′ (zo ) = 0
(13)
108
LIU
and its solution at the image plane with reference back to the object plane has the form ,zi ∂fX4g 1 rXi (zi ) = − √ dz, (14) ∂ro ′ Vo zo
where the gradient of fX4g with respect to ro ′ is evaluated by a special procedure written as grad[scalarF_, var_] := Module {replacement, vr, sf}, replacement = {ro → {xo , yo }, ro ′ → {uo , vo }, ro ∗ → {−yo , xo }}; vr = var/.replacement; sf = scalarF/.replacement; D[sf, #]&/@vr Thus, we have the gradient of ∂fX4g /∂ro ′ .
fX4g = 0; For l = 0, l ≤ 2, l = l + 1, For m = 0, l + m ≤ 2, m = m + 1, n = 2 − l − m; l ′ ′ m ′ n fX4g = fX4g + fX,l,m,n (ro .ro )(ro .ro ) (ro .ro ) ; For l = 0, l ≤ 1, l = l + 1, For m = 0, l + m ≤ 1, m = m + 1, n = 1 − l − m; fX4g = fX4g + fX,l,m,n ∗ (ro ∗ .ro ′ )(ro .ro )l (ro ′ .ro ′ )m (ro .ro ′ )n ; 2 replacement1 = xo + y2o → R2o , u2o + v2o → S2o , uo xo + vo yo → T2o , vo xo − uo yo → W2o ; replacement2 = xo W2o → −yo T2o + vo R2o , yo W2o → xo T2o − uo R2o , uo W2o → −yo S2o + vo T2o , vo W2o → xo S2o − uo T2o ; fX4gso = grad(fX4g , ro ′ )/.replacement1/.replacement2;
where replacement1 is clear, while replacement2 corresponds to the component relationships of the second and third vector identities in Eq. (6). Obviously, ∂fX4g /∂ro ′ is a vector consisting of the x- and y-components. Examining its structure (which is not given out explicitly here), rXi can, in principle, be expanded as
Xo 1 rXi (zi ) = − √ Xo Vo z i , ∗ × BXi ro ′ + bXi ro ′ So2 + F1Xi ro + f1Xi ro ∗ So2 zo
∗ + F2Xi ro ′ + f2Xi ro ′ To2 + CXi ro + cXi ro ∗ To2 ∗ + DXi ro ′ + dXi ro ′ Ro2 + EXi ro + eXi ro ∗ Ro2 dz,
(15) where the uppercase and lowercase letters, BXi , F1Xi , . . . , EXi and bXci , f1Xi , . . . , eXi are associated with the isotropic and anisotropic intrinsic
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
109
chromatic aberration coefficients, respectively. They can be extracted from either the x- or y-component of ∂fX4g /∂ro ′ . In this context, the x-component is always used. The third-order isotropic intrinsic chromatic aberration coefficients are shown below: BXi = Coefficient fX4gso [[1]], S2o uo ; F1Xi = Coefficient fX4gso [[1]], S2o xo ; F2Xi = Coefficient fX4gso [[1]], T2o uo ; CXi = Coefficient fX4gso [[1]], T2o xo ; DXi = Coefficient fX4gso [[1]], R2o uo ; EXi = Coefficient fX4gso [[1]], R2o xo ; StringForm[”BXi = “, F1Xi = “, F2Xi = “,”, BXi , F1Xi , F2Xi ] StringForm[”CXi = “, DXi = “, EXi = “,”, CXi , DXi , EXi ] BXi = 4fX,0,2,0 ,
CXi = 2fX,0,0,2 ,
F1Xi = fX,0,1,1 ,
DXi = 2fX,1,1,0 ,
F2Xi = 2fX,0,1,1 , EXi = fX,1,0,1 ,
and the anisotropic intrinsic chromatic aberration coefficients are as follows: bXi = Coefficient fX4gso [[1]], −S2o vo ; f1Xi = Coefficient fX4gso [[1]], −S2o yo ; f2Xi = Coefficient fX4gso [[1]], −T2o vo ; cXi = Coefficient fX4gso [[1]], −T2o yo ; dXi = Coefficient fX4gso [[1]], −R2o vo ; eXi = Coefficient fX4gso [[1]], −R2o yo ; StringForm[”bXi = “, f1Xi = “, f2Xi = “,”, bXi , f1Xi , f2Xi ] StringForm[”cXi = “, dXi = “, eXi = “.”, cXi , dXi , eXi ] bXi = 0, cXi = 2fX,0,0,1 ∗ ,
f1Xi = 3fX,0,1,0 ∗ , dXi = −fX,0,0,1 ∗ ,
f2Xi = −2fX,0,1,0 ∗ , eXi = fX,1,0,0 ∗ .
B. Combined Chromatic Aberration Coefficients As is shown in Eq. (12), there are three terms in the combined chromatic aberration. For the sake of conciseness, it is calculated one term at a time. Before this, ∂f4g /∂ro , ∂f4g /∂ro ′ , ∂ǫ4g /∂ro ′ , ∂ǫ4g /∂ro ′ , ∂fX2g /∂ro , ∂fX2g /∂ro ′ , ∂ǫX2g /∂ro , and ∂ǫX2g /∂ro ′ are first calculated, which are associated with the third-order geometric aberration and the first-order chromatic aberration, respectively.
110
LIU
fToǫ = f2,0,0 → ǫ 2,0,0 , f1,1,0 → ǫ 1,1,0 , f1,0,1 → ǫ 1,0,1 , f0,2,0 → ǫ 0,2,0 , f0,1,1 → ǫ 0,1,1 , f0,0,2 → ǫ 0,0,2 , f1,0,0 ∗ → ǫ 1,0,0 ∗ , f0,1,0 ∗ → ǫ 0,1,0 ∗ , f0,0,1 ∗ → ǫ 0,0,1 ∗ ; fTofo = f2,0,0 → f2,0,0,o , f1,1,0 → f1,1,0,o , f1,0,1 → f1,0,1,o , f0,2,0 → f0,2,0,o , f0,1,1 → f0,1,1,o , f0,0,2 → f0,0,2,o , f1,0,0 ∗ → f1,0,0,o ∗ , f0,1,0 ∗ → f0,1,0,o ∗ , f0,0,1 ∗ → f0,0,1,o ∗ ; f4g = 0; For l = 0, l ≤ 2, l = l + 1, For m = 0, l + m ≤ 2, m = m + 1, n = 2 − l − m; l ′ ′ m ′ n f4g = f4g + fl,m,n (ro .ro ) (ro .r o ) (ro .ro ) ; For l = 0, l ≤ 1, l = l + 1, For m = 0, l + m ≤ 1, m = m + 1, n = 1 − l − m; f4g = f4g + fl,m,n ∗ (ro ∗ .ro ′ )(ro .ro )l (ro ′ .ro ′ )m (ro .ro ′ )n ; f4gro = grad(f4g , ro ); f4gso = grad(f4g , ro ′ ); ǫ 4gro =√ f4gro /.fToǫ; ǫ4gso = f4gso /.fToǫ; √ V(z)/ Vo (rβ f4gso − rα f4gro ); η3go = f4gso /.fTofo; η3g = fXToǫX = {fX,1,0,0 → ǫ X,1,0,0 , fX,0,1,0 → ǫ X,0,1,0 , fX,0,0,1 → ǫ X,0,0,1 , fX,0,0,0 ∗ → ǫ X,0,0,0 ∗ }; fXTofXo = {fX,1,0,0 → fX,1,0,0,o , fX,0,1,0 → fX,0,1,0,o , fX,0,0,1 → fX,0,0,1,o , fX,0,0,0 ∗ → fX,0,0,0,o ∗ }; ∗ ′ ′ ∗ ′ fX2g = fX,1,0,0 (ro .ro ) + fX,0,1,0 (ro ′ .r o ) + fX,0,0,1 (ro .ro ) + fX,0,0,0 (ro .ro ); ∗ fX2gro = grad(fX2g , ro ); fX2gro = −fX2gro [[2]], fX2gro [[1]] ; fX2gso = grad(fX2g , ro ′ ); fX2gso ∗ = −f X2gso [[2]], fX2gso [[1]] ; ǫ X2gro = fX2gro /.fXToǫX; ǫ X2gro ∗ = −ǫ X2gro [[2]], ǫ X2gro [[1]] ; ǫ X2gso = fX2gso /.fXToǫX; ǫ X2gso ∗ = −ǫ X2gso [[2]], ǫ X2gso [[1]] ; √ √ ηX1g = Expand V(z)/ Vo (rβ fX2gso − rα fX2gro ) ; ηX1go = fX2gso /.fXTofXo; ηX1g ∗ = −ηX1g [[2]], ηX1g [[1]] ; ηX1go ∗ = −ηX1go [[2]], ηX1go [[1]] ;
The first combined chromatic aberration satisfies the following equation: V′ V ′′ + η2 B 2 rXc1 ′ + rXc1 rXc1 ′′ + 4V
* 2V + ∂fX2 d ∂fX2 X o 1 + , − = √ Xo dz ∂r′ ∂r r3 ,r3 ′ Vo rXc1 (zo ) = rXc1 ′ (zo ) = 0,
(16)
and its solution at the image plane with reference back to the object plane has the form
EXPLORING THIRD - ORDER CHROMATIC ABERRATIONS
111
Xo 1 Vo Xo ,zi ' ∗ Vo Γ11 r3 + Γ12 r3 ′ + Γ13 r3 ∗ + Γ14 r3 ′ dz, ×
rXc1 (zi ) = −
(17)
zo
whose integrand is written in the Mathematica code. ′ ′ Γ 11 = 2A X√rα ; Γ12 = 2BX rα ; Γ 13 = CX rα ; Γ 14 = −CX rα ;
r3 = 1/ Vo rα (ǫ 4gro + η3go ) − rβ ǫ 4gso /.replacement1; √ √ √
r3 ′ = 1/ Vo rα ′ (ǫ 4gro + η3go ) − rβ ′ ǫ 4gso − Vo / V(z) η3g /. replacement1; ′∗ ′ ′
r3 ∗ = − r3 [[2]], r 3 [[1]] ; r3 = − r3 [[2]], r3 [[1]] ; integrand1 = Expand Expand √ Vo (Γ 11 r3 + Γ 12 r3 ′ + Γ 13 r3 ∗ + Γ 14 r3 ′ ∗ ) /.replacement2 ;
where replacement1 and replacement2 have been assigned in Section IV.A. The second combined chromatic aberration satisfies the following equation:

$$r_{Xc2}'' + \frac{V'}{2V}\,r_{Xc2}' + \frac{V'' + \eta^2 B^2}{4V}\,r_{Xc2} = \frac{1}{\sqrt{V_o}}\frac{\Delta X_o}{X_o}\left[\frac{d}{dz}\frac{\partial f_{4}}{\partial \mathbf{r}'} - \frac{\partial f_{4}}{\partial \mathbf{r}}\right]_{\text{type of }(\mathbf{r}_g\cdot\mathbf{r}_g)\mathbf{r}_{X1},\,\ldots},\qquad r_{Xc2}(z_o) = r_{Xc2}'(z_o) = 0, \tag{18}$$

and its solution at the image plane, with reference back to the object plane, has the form

$$\mathbf{r}_{Xc2}(z_i) = -\frac{1}{V_o}\frac{\Delta X_o}{X_o}\int_{z_o}^{z_i}\sqrt{V_o}\,\bigl(\Gamma_{21}\mathbf{r}_{X1} + \Gamma_{22}\mathbf{r}_{X1}' + \Gamma_{23}\mathbf{r}_{X1}^* + \Gamma_{24}\mathbf{r}_{X1}'^*\bigr)\,dz, \tag{19}$$
where Γ21, Γ22, Γ23, Γ24, rX1, rX1′, rX1*, and rX1′* are defined in the following Mathematica cell, together with the expansion of the integrand:

replacement3 = {xo^2 → Ro2 − yo^2, uo^2 → So2 − vo^2, uo xo → To2 − vo yo, vo xo → Wo2 + uo yo, xo^3 → (Ro2 − yo^2) xo, yo^3 → (Ro2 − xo^2) yo, uo^3 → (So2 − vo^2) uo, vo^3 → (So2 − uo^2) vo};
rg = rβ {xo, yo} + rα {uo, vo}; rg′ = rβ′ {xo, yo} + rα′ {uo, vo};
rg* = {−rg[[2]], rg[[1]]}; rg′* = {−rg′[[2]], rg′[[1]]};
Γ21 = Expand[Expand[−rα (L0 (rg.rg) + M0 (rg′.rg′) + 2P0 √V(z) (rg*.rg′))] /. replacement3];
Γ22 = Expand[Expand[−rα′ (M0 (rg.rg) + N0 (rg′.rg′) + 2Q0 √V(z) (rg*.rg′))] /. replacement3];
Γ23 = Expand[Expand[−rα′ (P0 √V(z) (rg.rg) + Q0 √V(z) (rg′.rg′) + 2K0 V(z) (rg*.rg′))] /. replacement3];
Γ24 = Expand[Expand[rα (P0 √V(z) (rg.rg) + Q0 √V(z) (rg′.rg′) + 2K0 V(z) (rg*.rg′))] /. replacement3];
rX1 = 1/√Vo Expand[rα (ǫX2gro + ηX1go) − rβ ǫX2gso];
rX1′ = 1/√Vo Expand[rα′ (ǫX2gro + ηX1go) − rβ′ ǫX2gso − √Vo/√V(z) ηX1g];
rX1* = {−rX1[[2]], rX1[[1]]}; rX1′* = {−rX1′[[2]], rX1′[[1]]};
integrand2 = Expand[Expand[√Vo (Γ21 rX1 + Γ22 rX1′ + Γ23 rX1* + Γ24 rX1′*)] /. replacement2];
The third combined chromatic aberration satisfies the following equation:

$$r_{Xc3}'' + \frac{V'}{2V}\,r_{Xc3}' + \frac{V'' + \eta^2 B^2}{4V}\,r_{Xc3} = \frac{1}{\sqrt{V_o}}\frac{\Delta X_o}{X_o}\left[\frac{d}{dz}\frac{\partial f_{4}}{\partial \mathbf{r}'} - \frac{\partial f_{4}}{\partial \mathbf{r}}\right]_{\text{type of }(\mathbf{r}_g\cdot\mathbf{r}_{X1})\mathbf{r}_g,\,\ldots},\qquad r_{Xc3}(z_o) = r_{Xc3}'(z_o) = 0, \tag{20}$$

and its solution at the image plane, with reference back to the object plane, has the form

$$\mathbf{r}_{Xc3}(z_i) = -\frac{1}{V_o}\frac{\Delta X_o}{X_o}\int_{z_o}^{z_i}\sqrt{V_o}\,\bigl(\Gamma_{31}\mathbf{r}_g + \Gamma_{32}\mathbf{r}_g' + \Gamma_{33}\mathbf{r}_g^* + \Gamma_{34}\mathbf{r}_g'^*\bigr)\,dz. \tag{21}$$

Similarly, the following Mathematica cell calculates the integrand of Eq. (21):

Γ31 = Expand[Expand[−2rα (L0 (rg.rX1) + M0 (rg′.rX1′) + √V(z) P0 (rg*.rX1′ + rX1*.rg′))] /. replacement3];
Γ32 = Expand[Expand[−2rα′ (M0 (rg.rX1) + N0 (rg′.rX1′) + √V(z) Q0 (rg*.rX1′ + rX1*.rg′))] /. replacement3];
Γ33 = Expand[Expand[−2rα′ (√V(z) P0 (rg.rX1) + √V(z) Q0 (rg′.rX1′) + V(z) K0 (rg*.rX1′ + rX1*.rg′))] /. replacement3];
Γ34 = Expand[Expand[2rα (√V(z) P0 (rg.rX1) + √V(z) Q0 (rg′.rX1′) + V(z) K0 (rg*.rX1′ + rX1*.rg′))] /. replacement3];
integrand3 = Expand[Expand[√Vo (Γ31 rg + Γ32 rg′ + Γ33 rg* + Γ34 rg′*)] /. replacement2];
Like the intrinsic chromatic aberration, the sum of these three combined aberrations is expanded as

$$\begin{aligned}
\mathbf{r}_{Xc}(z_i) = -\frac{1}{V_o}\frac{\Delta X_o}{X_o}\int_{z_o}^{z_i}\Bigl[&\bigl(B_{Xc}\,\mathbf{r}_o' + b_{Xc}\,\mathbf{r}_o'^*\bigr)S_{o2} + \bigl(F_{1Xc}\,\mathbf{r}_o + f_{1Xc}\,\mathbf{r}_o^*\bigr)S_{o2}\\
&+ \bigl(F_{2Xc}\,\mathbf{r}_o' + f_{2Xc}\,\mathbf{r}_o'^*\bigr)T_{o2} + \bigl(C_{Xc}\,\mathbf{r}_o + c_{Xc}\,\mathbf{r}_o^*\bigr)T_{o2}\\
&+ \bigl(D_{Xc}\,\mathbf{r}_o' + d_{Xc}\,\mathbf{r}_o'^*\bigr)R_{o2} + \bigl(E_{Xc}\,\mathbf{r}_o + e_{Xc}\,\mathbf{r}_o^*\bigr)R_{o2}\Bigr]\,dz, \tag{22}
\end{aligned}$$

and the corresponding chromatic aberration coefficients are extracted from the x-component of the integrand of Eq. (22). First, in order to simplify the final results, some replacements are written in the Mathematica code, including three newly introduced quantities: fmn1, fmn2, and fmn3.

commonReplacement = {
  AX rβ^2 + BX (rβ′)^2 → fX,1,0,0,
  AX rα^2 + BX (rα′)^2 → fX,0,1,0,
  AX rα rβ + BX rα′ rβ′ → fX,0,0,1/2,
  CX √Vo/√V(z) → fX,0,0,0*,
  L0 rβ^4 + 2M0 (rβ′)^2 rβ^2 + N0 (rβ′)^4 → −4f2,0,0,
  L0 rα^2 rβ^2 + N0 (rα′)^2 (rβ′)^2 + 2K0 Vo + M0 (rβ^2 (rα′)^2 + rα^2 (rβ′)^2) → −2f1,1,0,
  L0 rα rβ^3 + rβ′ (N0 rα′ (rβ′)^2 + M0 rβ (rβ rα′ + rα rβ′)) → −f1,0,1,
  L0 rα^4 + 2M0 (rα′)^2 rα^2 + N0 (rα′)^4 → −4f0,2,0,
  L0 rβ rα^3 + rα′ (N0 rβ′ (rα′)^2 + M0 rα (rβ rα′ + rα rβ′)) → −f0,1,1,
  L0 rα^2 rβ^2 − K0 Vo + rα′ rβ′ (2M0 rα rβ + N0 rα′ rβ′) → −f0,0,2,
  P0 rβ^2 + Q0 (rβ′)^2 → −f1,0,0*/√Vo,
  P0 rα^2 + Q0 (rα′)^2 → −f0,1,0*/√Vo,
  P0 rα rβ + Q0 rα′ rβ′ → −f0,0,1*/(2√Vo),
  M0 rβ^2 + N0 (rβ′)^2 → fmn1,
  M0 rα^2 + N0 (rα′)^2 → fmn2,
  M0 rα rβ + N0 rα′ rβ′ → fmn3,
  (rα rβ′ − rβ rα′) → −√Vo/√V(z),
  (rβ rα′ − rα rβ′) → √Vo/√V(z)};
Then all combined chromatic aberration coefficients in Eq. (22) are calculated one by one.

collectFactor = {f2,0,0, f1,1,0, f1,0,1, f0,2,0, f0,1,1, f0,0,2, f1,0,0*, f0,1,0*, f0,0,1*, f2,0,0,o, f1,1,0,o, f1,0,1,o, f0,2,0,o, f0,1,1,o, f0,0,2,o, f1,0,0,o*, f0,1,0,o*, f0,0,1,o*, ǫ2,0,0, ǫ1,1,0, ǫ1,0,1, ǫ0,2,0, ǫ0,1,1, ǫ0,0,2, ǫ1,0,0*, ǫ0,1,0*, ǫ0,0,1*, fX,1,0,0, fX,0,1,0, fX,0,0,1, fX,0,0,0*, fX,1,0,0,o, fX,0,1,0,o, fX,0,0,1,o, fX,0,0,0,o*, ǫX,1,0,0, ǫX,0,1,0, ǫX,0,0,1, ǫX,0,0,0*};
For[no = 1, no ≤ 12, no = no + 1,
  If[no == 1, factor = So2 uo]; If[no == 2, factor = So2 xo];
  If[no == 3, factor = To2 uo]; If[no == 4, factor = To2 xo];
  If[no == 5, factor = Ro2 uo]; If[no == 6, factor = Ro2 xo];
  If[no == 7, factor = −So2 vo]; If[no == 8, factor = −So2 yo];
  If[no == 9, factor = −To2 vo]; If[no == 10, factor = −To2 yo];
  If[no == 11, factor = −Ro2 vo]; If[no == 12, factor = −Ro2 yo];
  sum = Coefficient[Expand[integrand1[[1]] + integrand2[[1]] + integrand3[[1]]], factor];
  sum = Simplify[Collect[sum, collectFactor] /. commonReplacement /. commonReplacement];
  If[no == 1, BXc = sum]; If[no == 2, F1Xc = sum];
  If[no == 3, F2Xc = sum]; If[no == 4, CXc = sum];
  If[no == 5, DXc = sum]; If[no == 6, EXc = sum];
  If[no == 7, bXc = sum]; If[no == 8, f1Xc = sum];
  If[no == 9, f2Xc = sum]; If[no == 10, cXc = sum];
  If[no == 11, dXc = sum]; If[no == 12, eXc = sum]];
where the variable sum holds all the terms of one particular combined chromatic aberration coefficient. The above input cell serves all 12 coefficients. We have found that each combined chromatic aberration coefficient has at most eight components; they are now extracted in turn.

1. Chromatic Spherical Aberration Coefficients

replacementInBXc = {M0 rα′ rα^3 + N0 (rα′)^3 rα → rα rα′ fmn2};
BXc = BXc /. replacementInBXc;
BXc1 = Expand[Coefficient[BXc, rα^2]];
BXc2 = Expand[Coefficient[BXc, rα rβ]];
BXc3 = Expand[Coefficient[BXc, rα rα′]];
BXc4 = Expand[Coefficient[BXc, rβ^2]];
BXc5 = Expand[Coefficient[BXc, rβ rα′]];
BXc6 = Expand[Coefficient[BXc, rβ rβ′]];
BXc7 = Expand[Coefficient[BXc, rα′^2]];
BXc0 = Expand[BXc − (BXc1 rα^2 + BXc2 rα rβ + BXc3 rα rα′ + BXc4 rβ^2 + BXc5 rβ rα′ + BXc6 rβ rβ′ + BXc7 rα′^2)];
StringForm["BXc0 = ``,", BXc0]
StringForm["BXc1 = ``,", BXc1]
StringForm["BXc2 = ``,", BXc2]
StringForm["BXc3 = ``,", BXc3]
StringForm["BXc4 = ``,", BXc4]
StringForm["BXc5 = ``,", BXc5]
StringForm["BXc6 = ``,", BXc6]
StringForm["BXc7 = ``.", BXc7]
BXc0 = −4ǫ0,2,0 fX,0,0,1 + 2ǫ0,1,1 fX,0,1,0 + 8f0,2,0,o fX,0,1,0 + 12f0,2,0 ǫX,0,0,1 − 6f0,1,1 ǫX,0,1,0 + 24f0,2,0 fX,0,1,0,o,
BXc1 = −CX f0,1,0* − 3fX,0,0,0* f0,1,0* √V(z)/√Vo,
BXc2 = 0,
BXc3 = 2BX f0,1,1 − 3fmn2 fX,0,0,1,
BXc4 = 0,
BXc5 = 6fmn2 fX,0,1,0 − 8BX f0,2,0,
BXc6 = 0,
BXc7 = 0.
2. Chromatic Coma 1 Coefficients

replacementInF1Xc = {
  3P0 rβ rα^2 + Q0 rα′ (4rα rβ′ − rβ rα′) → −2rα f0,0,1*/√Vo + rβ f0,1,0*/√Vo,
  M0 rα′ rα^3 + N0 (rα′)^3 rα → rα rα′ fmn2,
  N0 rβ′ (rα′)^2 + M0 rα rβ rα′ → rα′ fmn3,
  −L0 rα^4 − 2M0 (rα′)^2 rα^2 − N0 (rα′)^4 → 4f0,2,0,
  −M0 rβ rα′ rα^2 + N0 (rα′)^2 (rβ rα′ − 2rα rβ′) → −2rα rα′ fmn3 + rβ rα′ fmn2,
  −L0 rβ rα^3 − rα′ (N0 rβ′ (rα′)^2 + M0 rα (rβ rα′ + rα rβ′)) → f0,1,1};
F1Xc = F1Xc /. replacementInF1Xc;
(* F1Xc0–F1Xc7 are separated from F1Xc, and printed with StringForm, exactly as BXc0–BXc7 above *)
F1Xc0 = −ǫ0,1,1 fX,0,0,1 + 4ǫ1,1,0 fX,0,1,0 + 2f0,1,1,o fX,0,1,0 + f0,1,1 ǫX,0,0,1 − 4f0,0,2 ǫX,0,1,0 + 8f0,2,0 ǫX,1,0,0 + 4f0,2,0 fX,0,0,1,o + 4f0,1,1 fX,0,1,0,o + 3ǫ0,1,0* fX,0,0,0* + 3f0,1,0* ǫX,0,0,0*,
F1Xc1 = −2K0 √(Vo V(z)) fX,0,0,1 + CX f0,0,1* − 2f0,0,1* fX,0,0,0* √V(z)/√Vo,
F1Xc2 = 4K0 √(Vo V(z)) fX,0,1,0 − 3CX f0,1,0* + f0,1,0* fX,0,0,0* √V(z)/√Vo,
F1Xc3 = 4BX f1,1,0 − 2fmn3 fX,0,0,1 − 2fmn2 fX,1,0,0,
F1Xc4 = 0,
F1Xc5 = −2BX f0,1,1 + fmn2 fX,0,0,1 + 4fmn3 fX,0,1,0,
F1Xc6 = 0,
F1Xc7 = 0.
3. Chromatic Coma 2 Coefficients

replacementInF2Xc = {
  M0 rα′ rα^3 + N0 (rα′)^3 rα → rα rα′ fmn2,
  2N0 rβ′ (rα′)^2 + M0 rα (rβ rα′ + rα rβ′) → rβ′ fmn2 + rα′ fmn3,
  2L0 rα^2 rβ^2 + K0 V(z)(rα′)^2 rβ^2 − 2K0 V(z) rα rβ rα′ rβ′ + K0 V(z) rα^2 (rβ′)^2 + 2N0 (rα′)^2 (rβ′)^2 + M0 (rβ rα′ + rα rβ′)^2 → −2f1,1,0 − f0,0,2,
  M0 rβ′ rα^3 + N0 (rα′)^2 (2rα rβ′ − rβ rα′) → rβ rα′ fmn2 − M0 rα^2 √Vo/√V(z) − 2N0 (rα′)^2 √Vo/√V(z)};
F2Xc = F2Xc /. replacementInF2Xc;
(* F2Xc0–F2Xc7 follow by the same decomposition and StringForm output *)

F2Xc0 = −2ǫ0,1,1 fX,0,0,1 + 4ǫ0,0,2 fX,0,1,0 + 4f0,1,1,o fX,0,1,0 + 2f0,1,1 ǫX,0,0,1 − 4f0,0,2 ǫX,0,1,0 − 8f1,1,0 ǫX,0,1,0 + 16f0,2,0 ǫX,1,0,0 + 8f0,2,0 fX,0,0,1,o + 8f0,1,1 fX,0,1,0,o − 2ǫ0,1,0* fX,0,0,0* − 2f0,1,0* ǫX,0,0,0*,
F2Xc1 = 2K0 √(Vo V(z)) fX,0,0,1 + 2M0 √Vo fX,0,0,1/√V(z) − 2CX f0,0,1*,
F2Xc2 = −4K0 √(Vo V(z)) fX,0,1,0 + 2CX f0,1,0* − 6f0,1,0* fX,0,0,0* √V(z)/√Vo,
F2Xc3 = 4BX f0,0,2 − 4fmn2 fX,1,0,0,
F2Xc4 = 0,
F2Xc5 = −4BX f0,1,1 − 2fmn2 fX,0,0,1 + 4fmn3 fX,0,1,0,
F2Xc6 = 4fmn2 fX,0,1,0,
F2Xc7 = 4N0 √Vo fX,0,0,1/√V(z).
4. Chromatic Astigmatism Coefficients

replacementInCXc = {
  N0 rβ′ (rα′)^2 + M0 rα rβ rα′ → rα′ fmn3,
  M0 rα rβ (rα rβ′ − 2rβ rα′) + N0 rα′ rβ′ (rα rβ′ − 2rβ rα′) → −rβ rα′ fmn3 − fmn3 √Vo/√V(z),
  3P0 rα^2 rβ^2 − Q0 (2rβ^2 (rα′)^2 − 6rα rβ rα′ rβ′ + rα^2 (rβ′)^2) → −2rα rβ f0,0,1*/√Vo + rβ^2 f0,1,0*/√Vo − Q0 Vo/V(z),
  CX → fX,0,0,0* √V(z)/√Vo};
CXc = CXc /. replacementInCXc;
(* CXc0–CXc7 follow by the same decomposition and StringForm output *)

CXc0 = −2ǫ0,0,2 fX,0,0,1 + 2fmn3 √Vo fX,0,0,1/√V(z) + 4ǫ1,0,1 fX,0,1,0 + 4f0,0,2,o fX,0,1,0 − 2f0,0,2 ǫX,0,0,1 − 4f1,0,1 ǫX,0,1,0 + 8f0,1,1 ǫX,1,0,0 + 4f0,1,1 fX,0,0,1,o + 4f0,0,2 fX,0,1,0,o + 2ǫ0,0,1* fX,0,0,0* + 2f0,0,1* ǫX,0,0,0* − 4f0,1,0* fX,0,0,0,o* − 2Q0 Vo fX,0,0,0*/√V(z),
CXc1 = 2f1,0,0* fX,0,0,0* √V(z)/√Vo − 8K0 √(Vo V(z)) fX,1,0,0,
CXc2 = 6K0 √(Vo V(z)) fX,0,0,1 − 6f0,0,1* fX,0,0,0* √V(z)/√Vo,
CXc3 = 4BX f1,0,1 − 8fmn3 fX,1,0,0,
CXc4 = 2f0,1,0* fX,0,0,0* √V(z)/√Vo − 4K0 √(Vo V(z)) fX,0,1,0,
CXc5 = 2fmn3 fX,0,0,1 − 4BX f0,0,2,
CXc6 = 4fmn3 fX,0,1,0,
CXc7 = 0.
5. Chromatic Field Curvature Coefficients

replacementInDXc = {
  M0 rα′ rβ^2 + N0 rα′ (rβ′)^2 → rα′ fmn1,
  M0 rα rβ (2rα rβ′ − rβ rα′) + N0 rα′ rβ′ (2rβ rα′ − rα rβ′) → 2rβ rβ′ fmn2 − rα rα′ fmn1,
  L0 rα^2 rβ^2 + 2K0 V(z)(rα′)^2 rβ^2 − 4K0 V(z) rα rβ rα′ rβ′ + 2K0 V(z) rα^2 (rβ′)^2 + N0 (rα′)^2 (rβ′)^2 + M0 (rβ^2 (rα′)^2 + rα^2 (rβ′)^2) → −2f1,1,0,
  P0 rα^2 rβ^2 + Q0 (2rβ^2 (rα′)^2 − 2rα rβ rα′ rβ′ + rα^2 (rβ′)^2) → −rβ^2 f0,1,0*/√Vo + Q0 Vo/V(z),
  rα rβ′ → rβ rα′ − √Vo/√V(z),
  CX → fX,0,0,0* √V(z)/√Vo};
DXc = Expand[DXc /. replacementInDXc] /. replacementInDXc;
(* DXc0–DXc7 follow by the same decomposition and StringForm output *)

DXc0 = −2ǫ1,1,0 fX,0,0,1 + 2ǫ1,0,1 fX,0,1,0 + 4f1,1,0,o fX,0,1,0 − 2f1,1,0 ǫX,0,0,1 − 2f1,0,1 ǫX,0,1,0 + 4f0,1,1 ǫX,1,0,0 + 2f0,1,1 fX,0,0,1,o + 4f1,1,0 fX,0,1,0,o − ǫ0,0,1* fX,0,0,0* − f0,0,1* ǫX,0,0,0* + 6f0,1,0* fX,0,0,0,o* + 4fmn2 √Vo fX,1,0,0/√V(z) + 3Q0 Vo fX,0,0,0*/√V(z),
DXc1 = 8K0 √(Vo V(z)) fX,1,0,0 − 3f1,0,0* fX,0,0,0* √V(z)/√Vo,
DXc2 = f0,0,1* fX,0,0,0* √V(z)/√Vo − 6K0 √(Vo V(z)) fX,0,0,1,
DXc3 = 2BX f1,0,1 − fmn1 fX,0,0,1,
DXc4 = 4K0 √(Vo V(z)) fX,0,1,0 − 3f0,1,0* fX,0,0,0* √V(z)/√Vo,
DXc5 = −4BX f1,1,0 + 2fmn1 fX,0,1,0 − 4fmn2 fX,1,0,0,
DXc6 = 2fmn2 fX,0,0,1,
DXc7 = 0.
6. Chromatic Distortion Coefficients

replacementInEXc = {
  3N0 rα′ (rβ′)^2 + M0 rβ (rβ rα′ + 2rα rβ′) → rα′ fmn1 + 2rβ′ fmn3,
  3P0 rα rβ^2 + Q0 rβ′ (2rβ rα′ + rα rβ′) → −rβ f0,0,1*/√Vo − rα f1,0,0*/√Vo,
  3L0 rα^2 rβ^2 + 3N0 (rα′)^2 (rβ′)^2 + M0 (rβ^2 (rα′)^2 + 4rα rβ rα′ rβ′ + rα^2 (rβ′)^2) → −2(f1,1,0 + f0,0,2),
  rα rβ′ → rβ rα′ − √Vo/√V(z)};
EXc = Expand[EXc /. replacementInEXc] /. replacementInEXc;
(* EXc0–EXc7 follow by the same decomposition and StringForm output *)

EXc0 = −ǫ1,0,1 fX,0,0,1 + 8ǫ2,0,0 fX,0,1,0 + 2f1,0,1,o fX,0,1,0 − 3f1,0,1 ǫX,0,0,1 + 4f0,0,2 ǫX,1,0,0 + 4f1,1,0 ǫX,1,0,0 + 2f0,0,2 fX,0,0,1,o + 2f1,1,0 fX,0,0,1,o + ǫ1,0,0* fX,0,0,0* + f1,0,0* ǫX,0,0,0* + f0,0,1* fX,0,0,0,o* + 4fmn3 √Vo fX,1,0,0/√V(z),
EXc1 = 0,
EXc2 = −CX f1,0,0* − f1,0,0* fX,0,0,0* √V(z)/√Vo,
EXc3 = 8BX f2,0,0 − 2fmn1 fX,1,0,0,
EXc4 = −f0,0,1* fX,0,0,0* √V(z)/√Vo,
EXc5 = −2BX f1,0,1 + fmn1 fX,0,0,1 − 4fmn3 fX,1,0,0,
EXc6 = 2fmn3 fX,0,0,1,
EXc7 = 0.
7. Anisotropic Chromatic Spherical Aberration Coefficients

replacementInbXc = {M0 rα′ rα^3 + N0 (rα′)^3 rα → rα rα′ fmn2};
bXc = bXc /. replacementInbXc;
(* bXc0–bXc7 follow by the same decomposition and StringForm output *)

bXc0 = −2ǫX,0,1,0 f0,1,0* − 2fX,0,1,0 ǫ0,1,0* − 4ǫ0,2,0 fX,0,0,0* − 4f0,2,0 ǫX,0,0,0*,
bXc1 = −CX f0,1,1 − fX,0,0,1 f0,1,0* √V(z)/√Vo,
bXc2 = 4CX f0,2,0 + 2fX,0,1,0 f0,1,0* √V(z)/√Vo,
bXc3 = fmn2 fX,0,0,0* − 2BX f0,1,0*,
bXc4 = 0, bXc5 = 0, bXc6 = 0, bXc7 = 0.
8. Anisotropic Chromatic Coma 1 Coefficients

replacementInf1Xc = {
  Q0 rα′ (4rα rβ′ − 9rβ rα′) − 5P0 rα^2 rβ → −2rα f0,0,1*/√Vo + 9rβ f0,1,0*/√Vo,
  P0 rβ rα^2 + Q0 rα′ (3rβ rα′ − 2rα rβ′) → −rβ f0,1,0*/√Vo + 2Q0 rα′ √Vo/√V(z),
  M0 rα^2 (3rβ rα′ − 2rα rβ′) + N0 (rα′)^2 (3rβ rα′ − 2rα rβ′) → rβ rα′ fmn2 + 2fmn2 √Vo/√V(z)};
f1Xc = f1Xc /. replacementInf1Xc;
(* f1Xc0–f1Xc7 follow by the same decomposition and StringForm output *)

f1Xc0 = −4ǫX,0,1,0 f0,0,1* + 3ǫX,0,0,1 f0,1,0* + 12fX,0,1,0,o f0,1,0* + 2fX,0,1,0 ǫ0,0,1* − fX,0,0,1 ǫ0,1,0* + 6fX,0,1,0 f0,1,0,o* − ǫ0,1,1 fX,0,0,0* − f0,1,1 ǫX,0,0,0* + 12f0,2,0 fX,0,0,0,o* + 2fmn2 √Vo fX,0,0,0*/√V(z),
f1Xc1 = −2CX f1,1,0 − 2fX,0,0,1 f0,0,1* √V(z)/√Vo − 6fX,1,0,0 f0,1,0* √V(z)/√Vo + 4K0 √(Vo V(z)) fX,0,0,0*,
f1Xc2 = CX f0,1,1 + 9fX,0,0,1 f0,1,0* √V(z)/√Vo,
f1Xc3 = 2BX f0,0,1*,
f1Xc4 = −4fX,0,1,0 f0,1,0* √V(z)/√Vo,
f1Xc5 = 8Q0 √Vo fX,0,1,0 − 6BX f0,1,0* + fmn2 fX,0,0,0*,
f1Xc6 = 0, f1Xc7 = 0.
9. Anisotropic Chromatic Coma 2 Coefficients

replacementInf2Xc = {M0 rβ′ rα^3 + N0 (rα′)^2 (2rα rβ′ − rβ rα′) → rβ rα′ fmn2 − fmn2 √Vo/√V(z) − N0 (rα′)^2 √Vo/√V(z)};
f2Xc = f2Xc /. replacementInf2Xc;
(* f2Xc0–f2Xc7 follow by the same decomposition and StringForm output *)

f2Xc0 = −2ǫX,0,0,1 f0,1,0* − 8fX,0,1,0,o f0,1,0* − 4fX,0,1,0 ǫ0,0,1* + 2fX,0,0,1 ǫ0,1,0* − 4fX,0,1,0 f0,1,0,o* − 2ǫ0,1,1 fX,0,0,0* − 2f0,1,1 ǫX,0,0,0* − 8f0,2,0 fX,0,0,0,o* − 2fmn2 √Vo fX,0,0,0*/√V(z),
f2Xc1 = −2CX f0,0,2 + 4fX,1,0,0 f0,1,0* √V(z)/√Vo − 2K0 √(Vo V(z)) fX,0,0,0*,
f2Xc2 = 2CX f0,1,1 − 6fX,0,0,1 f0,1,0* √V(z)/√Vo,
f2Xc3 = −4BX f0,0,1*,
f2Xc4 = 8fX,0,1,0 f0,1,0* √V(z)/√Vo,
f2Xc5 = 4BX f0,1,0* + 2fmn2 fX,0,0,0*,
f2Xc6 = 0,
f2Xc7 = −2N0 √Vo fX,0,0,0*/√V(z).
10. Anisotropic Chromatic Astigmatism Coefficients

replacementIncXc = {
  P0 rβ rα^2 + Q0 rα′ (2rα rβ′ − rβ rα′) → −rβ f0,1,0*/√Vo − 2Q0 rα′ √Vo/√V(z),
  P0 rα rβ^2 + Q0 rβ′ (2rβ rα′ − rα rβ′) → −rα f1,0,0*/√Vo + 2Q0 rβ′ √Vo/√V(z),
  M0 rα rβ (rα rβ′ − 2rβ rα′) + N0 rα′ rβ′ (rα rβ′ − 2rβ rα′) → −rβ rα′ fmn3 − fmn3 √Vo/√V(z),
  Q0 (2rβ^2 (rα′)^2 − 6rα rβ rα′ rβ′ + rα^2 (rβ′)^2) − 3P0 rα^2 rβ^2 → 2Q0 Vo/V(z) + rα^2 f1,0,0*/√Vo + rα rβ f0,0,1*/√Vo};
cXc = cXc /. replacementIncXc;
(* cXc0–cXc7 follow by the same decomposition and StringForm output *)

cXc0 = −2ǫ0,0,1* fX,0,0,1 + 4Q0 Vo fX,0,0,1/√V(z) − 2ǫX,0,0,1 f0,0,1* + 4fX,0,1,0,o f0,0,1* + 8ǫX,1,0,0 f0,1,0* + 4fX,0,0,1,o f0,1,0* − 4ǫX,0,1,0 f1,0,0* + 4fX,0,1,0 ǫ1,0,0* + 4fX,0,1,0 f0,0,1,o* − 2ǫ0,0,2 fX,0,0,0* − 2f0,0,2 ǫX,0,0,0* + 4f0,1,1 fX,0,0,0,o* + 2fmn3 √Vo fX,0,0,0*/√V(z),
cXc1 = 2fX,0,0,1 f1,0,0* √V(z)/√Vo − 2CX f1,0,1,
cXc2 = 2CX f0,0,2 + 2fX,0,0,1 f0,0,1* √V(z)/√Vo − 8fX,1,0,0 f0,1,0* √V(z)/√Vo − 4fX,0,1,0 f1,0,0* √V(z)/√Vo + 6K0 √(Vo V(z)) fX,0,0,0*,
cXc3 = 4BX f1,0,0* − 16Q0 √Vo fX,1,0,0,
cXc4 = 0,
cXc5 = 2fmn3 fX,0,0,0* − 4BX f0,0,1*,
cXc6 = 8Q0 √Vo fX,0,1,0,
cXc7 = 0.
11. Anisotropic Chromatic Field Curvature Coefficients

replacementIndXc = {
  P0 rβ rα^2 + Q0 rα′ (2rα rβ′ − rβ rα′) → −rβ f0,1,0*/√Vo − 2Q0 rα′ √Vo/√V(z),
  3P0 rα rβ^2 + Q0 rβ′ (2rβ rα′ + rα rβ′) → −rβ f0,0,1*/√Vo − rα f1,0,0*/√Vo,
  M0 rα rβ (2rα rβ′ − rβ rα′) + N0 rα′ rβ′ (3rα rβ′ − 2rβ rα′) → rα rα′ fmn1 − 2fmn3 √Vo/√V(z),
  L0 rα^2 rβ^2 + 2K0 V(z)(rα′)^2 rβ^2 − 4K0 V(z) rα rβ rα′ rβ′ + 2K0 V(z) rα^2 (rβ′)^2 + N0 (rα′)^2 (rβ′)^2 + M0 (rβ^2 (rα′)^2 + rα^2 (rβ′)^2) → −2f1,1,0,
  5P0 rα^2 rβ^2 + Q0 (−2rβ^2 (rα′)^2 + 6rα rβ rα′ rβ′ + rα^2 (rβ′)^2) → −2Q0 Vo/V(z) − rα rβ f0,0,1*/√Vo − 3rα^2 f1,0,0*/√Vo};
dXc = dXc /. replacementIndXc;
(* dXc0–dXc7 follow by the same decomposition and StringForm output *)

dXc0 = ǫ0,0,1* fX,0,0,1 − 2Q0 Vo fX,0,0,1/√V(z) + ǫX,0,0,1 f0,0,1* − 2fX,0,1,0,o f0,0,1* − 4ǫX,1,0,0 f0,1,0* − 2fX,0,0,1,o f0,1,0* − 2ǫX,0,1,0 f1,0,0* − 6fX,0,1,0 ǫ1,0,0* − 2fX,0,1,0 f0,0,1,o* − 2ǫ1,1,0 fX,0,0,0* − 2f1,1,0 ǫX,0,0,0* − 2f0,1,1 fX,0,0,0,o* − 2fmn3 √Vo fX,0,0,0*/√V(z),
dXc1 = −CX f1,0,1 − 3fX,0,0,1 f1,0,0* √V(z)/√Vo,
dXc2 = 2CX f1,1,0 − fX,0,0,1 f0,0,1* √V(z)/√Vo + 4fX,1,0,0 f0,1,0* √V(z)/√Vo + 2fX,0,1,0 f1,0,0* √V(z)/√Vo − 2K0 √(Vo V(z)) fX,0,0,0*,
dXc3 = 8Q0 √Vo fX,1,0,0 − 6BX f1,0,0* + fmn1 fX,0,0,0*,
dXc4 = 2fX,0,1,0 f0,0,1* √V(z)/√Vo,
dXc5 = 2BX f0,0,1*,
dXc6 = 0, dXc7 = 0.
12. Anisotropic Chromatic Distortion Coefficients

replacementIneXc = {
  M0 rα′ rβ^2 + N0 rα′ (rβ′)^2 → rα′ fmn1,
  P0 rα rβ^2 + Q0 rβ′ (3rα rβ′ − 2rβ rα′) → −rα f1,0,0*/√Vo − 2Q0 rβ′ √Vo/√V(z),
  L0 rα^2 rβ^2 + 2K0 V(z) rα^2 (rβ′)^2 − 4K0 V(z) rα rβ rα′ rβ′ + 2K0 V(z)(rα′)^2 rβ^2 + N0 (rα′)^2 (rβ′)^2 + M0 ((rα′)^2 rβ^2 + rα^2 (rβ′)^2) → −2f1,1,0,
  rα rβ′ → rβ rα′ − √Vo/√V(z)};
eXc = Expand[eXc /. replacementIneXc] /. replacementIneXc;
(* eXc0–eXc7 follow by the same decomposition and StringForm output *)

eXc0 = 4Q0 Vo fX,1,0,0/√V(z) + 2ǫX,1,0,0 f0,0,1* + fX,0,0,1,o f0,0,1* − 3ǫX,0,0,1 f1,0,0* − fX,0,0,1 ǫ1,0,0* + 2fX,0,1,0 f1,0,0,o* − ǫ1,0,1 fX,0,0,0* − f1,0,1 ǫX,0,0,0* + 2f1,1,0 fX,0,0,0,o*,
eXc1 = −4CX f2,0,0 − 2fX,1,0,0 f1,0,0* √V(z)/√Vo,
eXc2 = CX f1,0,1 + fX,0,0,1 f1,0,0* √V(z)/√Vo,
eXc3 = 0,
eXc4 = 2K0 √(Vo V(z)) fX,0,0,0*,
eXc5 = −4Q0 √Vo fX,1,0,0 − 2BX f1,0,0* + fmn1 fX,0,0,0*,
eXc6 = 2Q0 √Vo fX,0,0,1,
eXc7 = 0.
C. Total Chromatic Aberration Coefficients

In summary, the third-order total chromatic aberration can be expressed as

$$\begin{aligned}
\mathbf{r}_{3c} = \frac{\Delta V_o}{V_o}\Bigl[&\bigl(B_V\,\mathbf{r}_o' + b_V\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o') + \bigl(F_{1V}\,\mathbf{r}_o + f_{1V}\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o')\\
&+ \bigl(F_{2V}\,\mathbf{r}_o' + f_{2V}\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o) + \bigl(C_V\,\mathbf{r}_o + c_V\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o)\\
&+ \bigl(D_V\,\mathbf{r}_o' + d_V\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o) + \bigl(E_V\,\mathbf{r}_o + e_V\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o)\Bigr]\\
+ \frac{\Delta B_o}{B_o}\Bigl[&\bigl(B_B\,\mathbf{r}_o' + b_B\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o') + \bigl(F_{1B}\,\mathbf{r}_o + f_{1B}\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o')\\
&+ \bigl(F_{2B}\,\mathbf{r}_o' + f_{2B}\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o) + \bigl(C_B\,\mathbf{r}_o + c_B\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o)\\
&+ \bigl(D_B\,\mathbf{r}_o' + d_B\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o) + \bigl(E_B\,\mathbf{r}_o + e_B\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o)\Bigr]. \tag{23}
\end{aligned}$$

If ΞX stands for BX, F1X, …, eX, with the subscript "X" indicating either "V" or "B", then the chromatic aberration coefficients in Eq. (23) can be written in the unified form

$$\Xi_X = -\frac{1}{\sqrt{V_o}}\int_{z_o}^{z_i}\Xi_{Xi}\,dz - \frac{1}{V_o}\int_{z_o}^{z_i}\Xi_{Xc}\,dz,$$
$$\Xi_{Xc} = \Xi_{Xc0} + \Xi_{Xc1}\,r_\alpha^2 + \Xi_{Xc2}\,r_\alpha r_\beta + \Xi_{Xc3}\,r_\alpha r_\alpha' + \Xi_{Xc4}\,r_\beta^2 + \Xi_{Xc5}\,r_\beta r_\alpha' + \Xi_{Xc6}\,r_\beta r_\beta' + \Xi_{Xc7}\,r_\alpha'^2. \tag{24}$$
V. GRAPHICAL DISPLAY OF THIRD-ORDER CHROMATIC ABERRATION PATTERNS

The patterns of the third-order chromatic aberrations in the image plane can be displayed with reference back to the object plane (Wu, 1957; Liu, 2002). For this purpose, a new 2D coordinate system is introduced whose origin is located at the object point and whose two axes ξ and η are parallel to ro = (xo, yo) and ro* = (−yo, xo), respectively. If γ is the angle between the initial slope ro′ and the position ro, then the aberration patterns are expressed as a pair of parametric equations with γ as the parameter.

A. Two Auxiliary Procedures

We first develop two auxiliary procedures and then use them to display the third-order chromatic aberration patterns. Procedure 1 creates a 9-point object and is used to display all the third-order chromatic aberration patterns except distortion.

chromAberrationPatterns[fx_, fy_, C1_, C2_, c1_, c2_] :=
  Module[{x, y, ro, so, rV, a, xp, yp, δ, n},
    x = {0, 1, 1, 0, −1, −1, −1, 0, 1}; y = {0, 0, 1, 1, 1, 0, −1, −1, −1};
    ro = {0, 1, √2, 1, √2, 1, √2, 1, √2};
    rV = {0.26, 0.2, 0.14, 0.06, 0, −0.06, −0.14, −0.2, −0.26};
    a = {0, 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4};
    so = 0.02; xp = {}; yp = {}; δ = π/360; n = 0;
    For[i = 1, i ≤ 60000, i = i + 1, AppendTo[xp, 0]; AppendTo[yp, 0]];
    For[j = 1, j ≤ 9, j = j + 1,
      For[i = 1, i ≤ 9, i = i + 1,
        For[γ = 0, γ ≤ 2π, γ = γ + δ,
          n = n + 1;
          xp[[n]] = fx[γ, C1, C2, c1, c2, ro[[i]], so, rV[[j]], a[[i]]];
          yp[[n]] = fy[γ, C1, C2, c1, c2, ro[[i]], so, rV[[j]], a[[i]]]]]];
    tbl = Table[Point[{xp[[i]], yp[[i]]}], {i, 1, n}];
    Show[Graphics[{PointSize[0.008], tbl}], PlotRange → All,
      AspectRatio → Automatic, DisplayFunction → Identity]]
Procedure 2 creates a 410-point object and is used to display distortion patterns only.

distortionPatterns[fx_, fy_, C1_, c1_] :=
  Module[{x, y, ro, a, xp, yp, d, rV},
    x = {}; y = {}; ro = {}; a = {}; xp = {}; yp = {}; d = 0.05;
    rV = {−0.2, −0.1, 0, 0.1, 0.2};
    For[i = 1, i ≤ 410, i = i + 1, AppendTo[x, 0]; AppendTo[y, 0]; AppendTo[ro, 0]; AppendTo[a, 0]];
    For[i = 1, i ≤ 2200, i = i + 1, AppendTo[xp, 0]; AppendTo[yp, 0]];
    x[[1]] = 1; x[[42]] = 0.5; x[[83]] = 0; x[[124]] = −0.5; x[[165]] = −1;
    For[i = 1, i ≤ 165, i = i + 41,
      y[[i]] = −1;
      For[j = i + 1, j ≤ i + 40, j = j + 1, x[[j]] = x[[i]]; y[[j]] = y[[j − 1]] + d]];
    y[[206]] = 1; y[[247]] = 0.5; y[[288]] = 0; y[[329]] = −0.5; y[[370]] = −1;
    For[i = 206, i ≤ 410, i = i + 41,
      x[[i]] = −1;
      For[j = i + 1, j ≤ i + 40, j = j + 1, y[[j]] = y[[i]]; x[[j]] = x[[j − 1]] + d]];
    For[i = 1, i ≤ 410, i = i + 1,
      ro[[i]] = √(x[[i]]^2 + y[[i]]^2);
      If[Abs[x[[i]]] ≤ 10^−12 && y[[i]] ≥ 0, a[[i]] = 2 ArcTan[1]];
      If[Abs[x[[i]]] ≤ 10^−12 && y[[i]] < 0, a[[i]] = 6 ArcTan[1]];
      If[x[[i]] > 10^−12, a[[i]] = ArcTan[y[[i]]/x[[i]]]];
      If[x[[i]] < −10^−12, a[[i]] = ArcTan[y[[i]]/x[[i]]] + 4 ArcTan[1]]];
    n = 0;
    For[j = 1, j ≤ 5, j = j + 1,
      For[i = 1, i ≤ 410, i = i + 1,
        n = n + 1;
        xp[[n]] = fx[C1, c1, ro[[i]], a[[i]], rV[[j]]];
        yp[[n]] = fy[C1, c1, ro[[i]], a[[i]], rV[[j]]]]];
    tbl = Table[Point[{xp[[i]], yp[[i]]}], {i, 1, n}];
    Show[Graphics[{PointSize[0.012], tbl}], PlotRange → All,
      AspectRatio → Automatic, DisplayFunction → Identity]]
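As a side note on style, the preallocated lists and For loops above mirror the original worksheet. In current Mathematica, the same point cloud could be generated more compactly with Table; the following is only a sketch, assuming the fx/fy interface and the local values of C1, C2, c1, c2, ro, so, rV, and a used inside Procedure 1:

(* Hypothetical, more idiomatic equivalent of the inner loops of Procedure 1 *)
points = Flatten[
   Table[{fx[γ, C1, C2, c1, c2, ro[[i]], so, rV[[j]], a[[i]]],
          fy[γ, C1, C2, c1, c2, ro[[i]], so, rV[[j]], a[[i]]]},
     {j, 9}, {i, 9}, {γ, 0, 2 π, π/360}], 2];
Graphics[{PointSize[0.008], Point[points]}, AspectRatio -> Automatic]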
Here are two different objects (see also Fig. 1):
fx[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η}, ξ = 0; η = 0; ξ Cos[α] − η Sin[α] + ro Cos[α]]
fy[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η}, ξ = 0; η = 0; ξ Sin[α] + η Cos[α] + ro Sin[α]]
object1 = chromAberrationPatterns[fx, fy, 0, 0, 0, 0];
fx[C1_, c1_, ro_, a_, rV_] :=
  Module[{ξ, η}, ξ = 0; η = 0; ξ Cos[a] − η Sin[a] + ro Cos[a]]
fy[C1_, c1_, ro_, a_, rV_] :=
  Module[{ξ, η},
    ξ = 0; η = 0; ξ Sin[a] + η Cos[a] + ro Sin[a]]
object2 = distortionPatterns[fx, fy, 0, 0];
Show[GraphicsArray[{object1, object2}], GraphicsSpacing → 0.5];

[FIGURE 1. A 9-point object (left) and a 410-point object (right).]

B. Chromatic Aberration Patterns

The parametric equations representing the third-order chromatic aberration patterns can also be expressed in a unified form by means of the subscript "X"; all third-order isotropic and anisotropic chromatic aberration patterns are demonstrated below.

1. Chromatic Spherical Aberration Patterns

The parametric equations are
$$\xi = \frac{\Delta X_o}{X_o}\bigl(B_X\cos\gamma - b_X\sin\gamma\bigr)\,r_o'^3,\qquad \eta = \frac{\Delta X_o}{X_o}\bigl(B_X\sin\gamma + b_X\cos\gamma\bigr)\,r_o'^3 \tag{25}$$

and typical isotropic and anisotropic patterns are generated by

fx[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV (C1 Cos[γ] − c1 Sin[γ]) so^3; η = rV (C1 Sin[γ] + c1 Cos[γ]) so^3;
    ro Cos[α] + ξ Cos[α] − η Sin[α]]
fy[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV (C1 Cos[γ] − c1 Sin[γ]) so^3; η = rV (C1 Sin[γ] + c1 Cos[γ]) so^3;
    η Cos[α] + ro Sin[α] + ξ Sin[α]]
BX = 50000; bX = 30000;
isoPattern = chromAberrationPatterns[fx, fy, BX, 0, 0, 0];
anisoPattern = chromAberrationPatterns[fx, fy, BX, 0, bX, 0];
Show[GraphicsArray[{isoPattern, anisoPattern}], GraphicsSpacing → 0.5];

[FIGURE 2. Typical third-order isotropic (left) and anisotropic (right) chromatic spherical aberration patterns.]
Figure 2 clearly shows that the diameter of the chromatic spherical aberration disk increases in the presence of the anisotropic component and that the shape does not differ from the geometric spherical aberration pattern.

2. Chromatic Coma Patterns

The parametric equations are
$$\xi = \frac{\Delta X_o}{X_o}\bigl(F_{1X} + F_{2X}\cos^2\gamma - f_{2X}\sin\gamma\cos\gamma\bigr)\,r_o'^2 r_o,\qquad \eta = \frac{\Delta X_o}{X_o}\bigl(F_{2X}\sin\gamma\cos\gamma + f_{1X} + f_{2X}\cos^2\gamma\bigr)\,r_o'^2 r_o \tag{26}$$

and typical isotropic and anisotropic patterns are generated by

fx[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV (C1 + C2 Cos[γ]^2 − c2 Sin[γ] Cos[γ]) so^2 ro;
    η = rV (c1 + C2 Sin[γ] Cos[γ] + c2 Cos[γ]^2) so^2 ro;
    ro Cos[α] + ξ Cos[α] − η Sin[α]]
fy[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV (C1 + C2 Cos[γ]^2 − c2 Sin[γ] Cos[γ]) so^2 ro;
    η = rV (c1 + C2 Sin[γ] Cos[γ] + c2 Cos[γ]^2) so^2 ro;
    η Cos[α] + ro Sin[α] + ξ Sin[α]]
F1X = 2000; F2X = 1800; f1X = 400; f2X = 200;
isoPattern = chromAberrationPatterns[fx, fy, F1X, F2X, 0, 0];
anisoPattern = chromAberrationPatterns[fx, fy, F1X, F2X, f1X, f2X];
Show[GraphicsArray[{isoPattern, anisoPattern}], GraphicsSpacing → 0.5];
[FIGURE 3. Typical third-order isotropic (left) and anisotropic (right) chromatic coma patterns.]

It is clearly seen from Figure 3 that the chromatic coma patterns differ from the third-order geometric coma, both for the isotropic and for the anisotropic patterns. They become dumbbell shaped as a result of the positive and negative field fluctuation, and the shape of the dumbbell also differs slightly between the isotropic and anisotropic patterns.

3. Chromatic Astigmatism and Field Curvature Patterns

The parametric equations are
$$\xi = \frac{\Delta X_o}{X_o}\bigl[(C_X + D_X)\cos\gamma - d_X\sin\gamma\bigr]\,r_o' r_o^2,\qquad \eta = \frac{\Delta X_o}{X_o}\bigl[D_X\sin\gamma + (c_X + d_X)\cos\gamma\bigr]\,r_o' r_o^2 \tag{27}$$

and typical isotropic and anisotropic patterns are generated by

fx[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV ((C1 + C2) Cos[γ] − c2 Sin[γ]) so ro^2;
    η = rV (C2 Sin[γ] + (c1 + c2) Cos[γ]) so ro^2;
    ro Cos[α] + ξ Cos[α] − η Sin[α]]
fy[γ_, C1_, C2_, c1_, c2_, ro_, so_, rV_, α_] :=
  Module[{ξ, η},
    ξ = rV ((C1 + C2) Cos[γ] − c2 Sin[γ]) so ro^2;
    η = rV (C2 Sin[γ] + (c1 + c2) Cos[γ]) so ro^2;
    η Cos[α] + ro Sin[α] + ξ Sin[α]]
CX = 20; DX = 12; cX = 6; dX = 2;
isoPattern = chromAberrationPatterns[fx, fy, CX, DX, 0, 0];
anisoPattern = chromAberrationPatterns[fx, fy, CX, DX, cX, dX];
Show[GraphicsArray[{isoPattern, anisoPattern}], GraphicsSpacing → 0.5];
[FIGURE 4. Typical third-order isotropic (left) and anisotropic (right) chromatic astigmatism and field curvature patterns.]

Figure 4 shows that the pattern of the third-order chromatic astigmatism and field curvature is similar in shape to that of the third-order geometric astigmatism and field curvature.

4. Chromatic Distortion Patterns

The parametric equations are

$$\xi = \frac{\Delta X_o}{X_o}\,E_X\,r_o^3,\qquad \eta = \frac{\Delta X_o}{X_o}\,e_X\,r_o^3 \tag{28}$$

and typical isotropic and anisotropic third-order chromatic distortion patterns are generated by

fx[C1_, c1_, ro_, α_, rV_] :=
  Module[{ξ, η}, ξ = rV C1 ro^3; η = rV c1 ro^3; ξ Cos[α] − η Sin[α] + ro Cos[α]]
fy[C1_, c1_, ro_, α_, rV_] :=
  Module[{ξ, η}, ξ = rV C1 ro^3; η = rV c1 ro^3; ξ Sin[α] + η Cos[α] + ro Sin[α]]
EX = 0.32; eX = 0.2;
isoPattern = distortionPatterns[fx, fy, EX, 0];
anisoPattern = distortionPatterns[fx, fy, EX, eX];
Show[GraphicsArray[{isoPattern, anisoPattern}], GraphicsSpacing → 0.5];
EX = −0.32; eX = 0.2;
isoPattern = distortionPatterns[fx, fy, EX, 0];
anisoPattern = distortionPatterns[fx, fy, EX, eX];
Show[GraphicsArray[{isoPattern, anisoPattern}], GraphicsSpacing → 0.5];
[FIGURE 5. Typical third-order isotropic (left) and anisotropic (right) chromatic pincushion distortion patterns.]

[FIGURE 6. Typical third-order isotropic (left) and anisotropic (right) chromatic barrel distortion patterns.]

Figures 5 and 6 show that the chromatic distortion patterns differ from the third-order geometric patterns for both pincushion and barrel distortion, and that the third-order chromatic distortion becomes severe at the edges. In addition, each type of distortion pattern exhibits both pincushion and barrel character, since the perturbation of the lens field always fluctuates between positive and negative values.
VI. NUMERICAL CALCULATION OF THIRD-ORDER CHROMATIC ABERRATION COEFFICIENTS

For the numerical calculation, a Hutter electrostatic immersion lens, a Glaser bell-shaped magnetic lens, and a Ximen combined electromagnetic lens are chosen. Their axial electromagnetic field distributions have the unified form (Ximen and Liu, 2000)

$$V(z) = V_0\exp\left(k\tan^{-1}\frac{z}{a}\right),\qquad B(z) = \frac{B_0}{1 + (z/a)^2}\exp\left(\frac{k}{2}\tan^{-1}\frac{z}{a}\right), \tag{29}$$

where B0 = 0 for the Hutter electrostatic immersion lens, k = 0 for the Glaser bell-shaped magnetic lens, and both B0 ≠ 0 and k ≠ 0 for the Ximen combined electromagnetic lens. The lens parameters are chosen freely. The Hutter electrostatic immersion lens:

B0 = 0; V0 = 1000; a = 1; k = 4; m = −0.1;
The Glaser bell-shaped magnetic lens:

B0 = 0.01; V0 = 2000; a = 0.01; k = 0; m = −1000;

The Ximen combined electromagnetic lens:

B0 = 0.05; V0 = 10000; a = 0.01; k = 2; m = −10;

The above three input cells must be run separately. For instance, if the Glaser bell-shaped magnetic lens is calculated, then the other two input cells should be skipped. All of the following Mathematica input cells, however, are common to the three lenses.

A. Calculation of Various Quantities

First, Eq. (29) is entered:

V[z_] := V0 Exp[k ArcTan[z/a]];
B[z_] := B0/(1 + z^2/a^2) Exp[(k/2) ArcTan[z/a]];
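Although not part of the original worksheet, it can be helpful to visualize the two axial distributions of Eq. (29) before integrating; a minimal sketch in the style of the earlier graphics cells, using the Glaser parameters above, would be:

(* Illustrative only: plot the axial potential and flux density of Eq. (29) *)
B0 = 0.01; V0 = 2000; a = 0.01; k = 0;
plotV = Plot[V0 Exp[k ArcTan[z/a]], {z, -5 a, 5 a}, DisplayFunction -> Identity];
plotB = Plot[B0/(1 + z^2/a^2) Exp[(k/2) ArcTan[z/a]], {z, -5 a, 5 a}, DisplayFunction -> Identity];
Show[GraphicsArray[{plotV, plotB}], GraphicsSpacing -> 0.5];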
As usual, we introduce a new variable ϕ determined by ϕ = cot⁻¹(z/a):

newVariable = {ArcTan[z/a] → π/2 − ϕ, z^2/a^2 + 1 → 1/Sin[ϕ]^2, z → a Cot[ϕ]};
V(z) = V(z) /. newVariable; B(z) = B(z) /. newVariable;
η = √(1.7588047×10^11/2);
ω = √(3k^2/16 + a^2 η^2 B0^2/(4V0) + 1);
dzdphi = −a/Sin[ϕ]^2;
where ω is the basic lens parameter and dzdphi = dz/dϕ. Thus, for a given magnification m, the object and image positions of the lens are, respectively,

ϕo = ArcTan[(m E^(kπ/(4ω)) Sin[π/ω])/(m E^(kπ/(4ω)) Cos[π/ω] + 1)] + π;
ϕi = ϕo − π/ω;
Vo = V(z) /. ϕ → ϕo;

where Vo is the electric potential at the object position, which will be used later. Note the distinction between the subscripts "o" and "0" in this context. The two independent paraxial solutions, rα and rβ, and their derivatives with respect to z are written directly as

wb = ArcTan[ω Sin[ϕo]/(Cos[ϕo] − (k/4) Sin[ϕo])];
rα = −a/(ω Sin[ϕo]) Sin[ω(ϕ − ϕo)]/Sin[ϕ] E^((k/4)(ϕ − ϕo));
rβ = Sin[ϕo]/Sin[wb] Sin[wb + ω(ϕ − ϕo)]/Sin[ϕ] E^((k/4)(ϕ − ϕo));
rα′ = D[rα, ϕ]/dzdphi; rβ′ = D[rβ, ϕ]/dzdphi;
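A quick numeric sanity check, not in the original text, is to evaluate these cardinal quantities once a lens has been selected; inverting the substitution with z = a cot ϕ gives the corresponding planes on the optical axis:

N[{ϕo, ϕi, Vo}]           (* object angle, image angle, object potential *)
N[{a Cot[ϕo], a Cot[ϕi]}] (* object and image positions on the z axis *)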
Having completed this, the numerical calculation of the third-order chromatic aberration coefficients of the lens can easily be carried out. First, all quantities related to the chromatic aberration coefficients are expressed in ϕ:

L0 = Simplify[−4 Coefficient[f4, (r.r)^2] /. newVariable];
M0 = Simplify[−2 Coefficient[f4, (r.r)(r′.r′)] /. newVariable];
N0 = Simplify[−4 Coefficient[f4, (r′.r′)^2] /. newVariable];
P0 = Simplify[−Coefficient[f4, (r.r)(r*.r′)]/√V(z) /. newVariable];
Q0 = Simplify[−Coefficient[f4, (r′.r′)(r*.r′)]/√V(z) /. newVariable];
K0 = Simplify[−Coefficient[f4, (r*.r′)^2]/V(z) /. newVariable];
AV = Simplify[Coefficient[fV2, (r.r)] /. newVariable];
BV = Simplify[Coefficient[fV2, (r′.r′)] /. newVariable];
CV = Simplify[Coefficient[fV2, (r*.r′)] /. newVariable];
AB = Simplify[Coefficient[fB2, (r.r)] /. newVariable];
BB = Simplify[Coefficient[fB2, (r′.r′)] /. newVariable];
CB = Simplify[Coefficient[fB2, (r*.r′)] /. newVariable];
(* LV, MV, NV, PV, QV, KV and LB, MB, NB, PB, QB, KB are extracted from fV4 and fB4, respectively, with exactly the same six Coefficient patterns used for L0, M0, N0, P0, Q0, K0 above *)
Then the expansion coefficients of the Gaussian values of the fourth-order approximation of the variational function, and of the second- and fourth-order approximations of the chromatic perturbation variational function, are automatically expressed in the new variable ϕ through the following Mathematica code:

f4g = f4g;
f2,0,0 = Coefficient[f4g, (ro.ro)^2];
f1,1,0 = Coefficient[f4g, (ro.ro)(ro′.ro′)];
f1,0,1 = Coefficient[f4g, (ro.ro)(ro.ro′)];
f0,2,0 = Coefficient[f4g, (ro′.ro′)^2];
f0,1,1 = Coefficient[f4g, (ro′.ro′)(ro.ro′)];
f0,0,2 = Coefficient[f4g, (ro.ro′)^2];
f1,0,0* = Coefficient[f4g, (ro.ro)(ro*.ro′)];
f0,1,0* = Coefficient[f4g, (ro′.ro′)(ro*.ro′)];
f0,0,1* = Coefficient[f4g, (ro.ro′)(ro*.ro′)];
fX2g = fX2g;
fV,1,0,0 = Coefficient[fX2g, (ro.ro)] /. X → V;
fV,0,1,0 = Coefficient[fX2g, (ro′.ro′)] /. X → V;
fV,0,0,1 = Coefficient[fX2g, (ro.ro′)] /. X → V;
fV,0,0,0* = Coefficient[fX2g, (ro*.ro′)] /. X → V;
fB,1,0,0 = Coefficient[fX2g, (ro.ro)] /. X → B;
fB,0,1,0 = Coefficient[fX2g, (ro′.ro′)] /. X → B;
fB,0,0,1 = Coefficient[fX2g, (ro.ro′)] /. X → B;
fB,0,0,0* = Coefficient[fX2g, (ro*.ro′)] /. X → B;
fX4g = fX4g;
(* the nine fV,l,m,n and nine fB,l,m,n coefficients of fX4g are extracted with the same nine Coefficient patterns as the fl,m,n of f4g above, followed by /. X → V and /. X → B, respectively *)
All the values of the fl,m,n and fX,l,m,n in f4g and fX2g at the object plane are
f2,0,0,o = f2,0,0 /. ϕ → ϕo; f1,1,0,o = f1,1,0 /. ϕ → ϕo; f1,0,1,o = f1,0,1 /. ϕ → ϕo;
f0,2,0,o = f0,2,0 /. ϕ → ϕo; f0,1,1,o = f0,1,1 /. ϕ → ϕo; f0,0,2,o = f0,0,2 /. ϕ → ϕo;
f1,0,0,o* = f1,0,0* /. ϕ → ϕo; f0,1,0,o* = f0,1,0* /. ϕ → ϕo; f0,0,1,o* = f0,0,1* /. ϕ → ϕo;
fV,1,0,0,o = fV,1,0,0 /. ϕ → ϕo; fV,0,1,0,o = fV,0,1,0 /. ϕ → ϕo;
fV,0,0,1,o = fV,0,0,1 /. ϕ → ϕo; fV,0,0,0,o* = fV,0,0,0* /. ϕ → ϕo;
fB,1,0,0,o = fB,1,0,0 /. ϕ → ϕo; fB,0,1,0,o = fB,0,1,0 /. ϕ → ϕo;
fB,0,0,1,o = fB,0,0,1 /. ϕ → ϕo; fB,0,0,0,o* = fB,0,0,0* /. ϕ → ϕo;

In addition, a change of variable must be performed in some quantities in order to evaluate ǫl,m,n (l + m + n = 2), ǫl,m,n* (l + m + n = 1), ǫX,l,m,n (l + m + n = 1), and ǫX,l,m,n* (l + m + n = 0) in the combined aberration integrals:

dzdphix = dzdphi /. ϕ → x;
f2,0,0,x = f2,0,0 /. ϕ → x; f1,1,0,x = f1,1,0 /. ϕ → x; f1,0,1,x = f1,0,1 /. ϕ → x;
f0,2,0,x = f0,2,0 /. ϕ → x; f0,1,1,x = f0,1,1 /. ϕ → x; f0,0,2,x = f0,0,2 /. ϕ → x;
f1,0,0,x* = f1,0,0* /. ϕ → x; f0,1,0,x* = f0,1,0* /. ϕ → x; f0,0,1,x* = f0,0,1* /. ϕ → x;
fV,1,0,0,x = fV,1,0,0 /. ϕ → x; fV,0,1,0,x = fV,0,1,0 /. ϕ → x;
fV,0,0,1,x = fV,0,0,1 /. ϕ → x; fV,0,0,0,x* = fV,0,0,0* /. ϕ → x;
fB,1,0,0,x = fB,1,0,0 /. ϕ → x; fB,0,1,0,x = fB,0,1,0 /. ϕ → x;
fB,0,0,1,x = fB,0,0,1 /. ϕ → x; fB,0,0,0,x* = fB,0,0,0* /. ϕ → x;
In this way, the related eikonals ǫl,m,n, ǫl,m,n*, ǫX,l,m,n, and ǫX,l,m,n* can be defined as functions of ϕ:

eikonal[fx_, dzdphix_, ϕo_, ϕ_?NumericQ] :=
  Module[{}, If[IntegerQ[fx] && fx == 0, 0, NIntegrate[fx dzdphix, {x, ϕo, ϕ}]]]
ǫ2,0,0 = eikonal[f2,0,0,x, dzdphix, ϕo, ϕ]; ǫ1,1,0 = eikonal[f1,1,0,x, dzdphix, ϕo, ϕ];
ǫ1,0,1 = eikonal[f1,0,1,x, dzdphix, ϕo, ϕ]; ǫ0,2,0 = eikonal[f0,2,0,x, dzdphix, ϕo, ϕ];
ǫ0,1,1 = eikonal[f0,1,1,x, dzdphix, ϕo, ϕ]; ǫ0,0,2 = eikonal[f0,0,2,x, dzdphix, ϕo, ϕ];
ǫ1,0,0* = eikonal[f1,0,0,x*, dzdphix, ϕo, ϕ]; ǫ0,1,0* = eikonal[f0,1,0,x*, dzdphix, ϕo, ϕ];
ǫ0,0,1* = eikonal[f0,0,1,x*, dzdphix, ϕo, ϕ];
ǫV,1,0,0 = eikonal[fV,1,0,0,x, dzdphix, ϕo, ϕ]; ǫV,0,1,0 = eikonal[fV,0,1,0,x, dzdphix, ϕo, ϕ];
ǫV,0,0,1 = eikonal[fV,0,0,1,x, dzdphix, ϕo, ϕ]; ǫV,0,0,0* = eikonal[fV,0,0,0,x*, dzdphix, ϕo, ϕ];
ǫB,1,0,0 = eikonal[fB,1,0,0,x, dzdphix, ϕo, ϕ]; ǫB,0,1,0 = eikonal[fB,0,1,0,x, dzdphix, ϕo, ϕ];
ǫB,0,0,1 = eikonal[fB,0,0,1,x, dzdphix, ϕo, ϕ]; ǫB,0,0,0* = eikonal[fB,0,0,0,x*, dzdphix, ϕo, ϕ];

Furthermore, the other three variables are reassigned:

fmn1 = M0 rβ^2 + N0 rβ′^2; fmn2 = M0 rα^2 + N0 rα′^2; fmn3 = M0 rα rβ + N0 rα′ rβ′;

B. Calculation of Chromatic Aberration Coefficients

Before the numerical calculation of the chromatic aberration coefficients, the double-exponential method is adopted for the numerical integration in place of the less accurate default method:

SetOptions[NIntegrate, Method → DoubleExponential];

and a common procedure is written to evaluate the third-order chromatic aberration integrals:

aberrationCoefficient[ci_, c0_, c1_, c2_, c3_, c4_, c5_, c6_, c7_, dzdphi_, ra_, rb_, ras_, rbs_, ϕo_, ϕi_, Vo_] :=
  Module[{iComponent, cComponent, cComponent0, cComponent1, cComponent2, cComponent3, cComponent4, cComponent5, cComponent6, cComponent7},
    iComponent = If[IntegerQ[ci] && ci == 0, 0, −NIntegrate[ci dzdphi, {ϕ, ϕo, ϕi}]/√Vo];
    cComponent0 = If[IntegerQ[c0] && c0 == 0, 0, −NIntegrate[c0 dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent1 = If[IntegerQ[c1] && c1 == 0, 0, −NIntegrate[c1 ra^2 dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent2 = If[IntegerQ[c2] && c2 == 0, 0, −NIntegrate[c2 ra rb dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent3 = If[IntegerQ[c3] && c3 == 0, 0, −NIntegrate[c3 ra ras dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent4 = If[IntegerQ[c4] && c4 == 0, 0, −NIntegrate[c4 rb^2 dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent5 = If[IntegerQ[c5] && c5 == 0, 0, −NIntegrate[c5 rb ras dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent6 = If[IntegerQ[c6] && c6 == 0, 0, −NIntegrate[c6 rb rbs dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent7 = If[IntegerQ[c7] && c7 == 0, 0, −NIntegrate[c7 ras^2 dzdphi, {ϕ, ϕo, ϕi}]/Vo];
    cComponent = cComponent0 + cComponent1 + cComponent2 + cComponent3 + cComponent4 + cComponent5 + cComponent6 + cComponent7;
    iComponent + cComponent]
where the If statements are designed to avoid extraneous output from those numerical integrations whose integrand vanishes. In general, four types of chromatic aberration coefficients must be calculated, as follows.

1. Isotropic Chromatic Aberration Coefficients Caused by Electric Perturbation

X = V;
BV = aberrationCoefficient[BXi, BXc0, BXc1, BXc2, BXc3, BXc4, BXc5, BXc6, BXc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
F1V = aberrationCoefficient[F1Xi, F1Xc0, F1Xc1, F1Xc2, F1Xc3, F1Xc4, F1Xc5, F1Xc6, F1Xc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
F2V = aberrationCoefficient[F2Xi, F2Xc0, F2Xc1, F2Xc2, F2Xc3, F2Xc4, F2Xc5, F2Xc6, F2Xc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
CV = aberrationCoefficient[CXi, CXc0, CXc1, CXc2, CXc3, CXc4, CXc5, CXc6, CXc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
DV = aberrationCoefficient[DXi, DXc0, DXc1, DXc2, DXc3, DXc4, DXc5, DXc6, DXc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
EV = aberrationCoefficient[EXi, EXc0, EXc1, EXc2, EXc3, EXc4, EXc5, EXc6, EXc7, dzdphi, rα, rβ, rα′, rβ′, ϕo, ϕi, Vo];
StringForm["BV = ``,", ScientificForm[BV, 8]]
StringForm["F1V = ``,", ScientificForm[F1V, 8]]
StringForm["F2V = ``,", ScientificForm[F2V, 8]]
StringForm["CV = ``,", ScientificForm[CV, 8]]
StringForm["DV = ``,", ScientificForm[DV, 8]]
StringForm["EV = ``.", ScientificForm[EV, 8]]
The output results are used in the related tables and are not shown here to avoid repetition.

2. Anisotropic Chromatic Aberration Coefficients Caused by Electric Perturbation

Again with X = V, the anisotropic coefficients bV, f1V, f2V, cV, dV, and eV are evaluated by the corresponding aberrationCoefficient calls (with the intrinsic parts bXi, f1Xi, f2Xi, cXi, dXi, eXi and the combined components bXc0–bXc7 through eXc0–eXc7 as arguments) and printed with StringForm exactly as above. The outputs are also omitted.
3. Isotropic Chromatic Aberration Coefficients Caused by Magnetic Perturbation

Setting X = B, the isotropic coefficients BB, F1B, F2B, CB, DB, and EB are evaluated with the same aberrationCoefficient calls as in item 1; the outputs are omitted for the same reason.

4. Anisotropic Chromatic Aberration Coefficients Caused by Magnetic Perturbation

Setting X = B, the anisotropic coefficients bB, f1B, f2B, cB, dB, and eB are evaluated with the same calls as in item 2; the outputs are omitted.
C. Numerical Results

The numerical results of the third-order chromatic aberration coefficients for the Hutter electrostatic immersion lens are presented in Table 1, together with those calculated by the differential algebraic (DA) method (Berz, 1999) for comparison. The numerical results for the other two lenses are shown in Tables 2 and 3 and in Tables 4 and 5, respectively.
VII. CONCLUSIONS

All the analytical formulas for the third-order chromatic aberration coefficients of electron lenses, as well as the numerical results, have been verified by means of the DA method (see Tables 1–5). The expressions for the 12 third-order chromatic aberration coefficients are, however, not unique. For instance, if
TABLE 1
Numerical Results of the Third-Order Chromatic Aberration Coefficients for the Hutter Electrostatic Immersion Lens, Together with Those Calculated by the DA Method

Aberration coefficient    Mathematica          DA method
BV                        1.7784477            1.7784475
F1V                       1.0817309            1.0817308
F2V                       1.4636948            1.4636946
CV                        9.4150074×10⁻¹       9.4150065×10⁻¹
DV                        4.5575518×10⁻¹       4.5575499×10⁻¹
EV                        3.0809672×10⁻¹       3.0809662×10⁻¹
TABLE 2
Numerical Results of the Third-Order Chromatic Aberration Coefficients Caused by Electric Perturbation for the Glaser Bell-Shaped Magnetic Lens, Together with Those Calculated by the DA Method

Aberration coefficient    Mathematica          DA method
BV                        −6.7767275×10⁻¹      −6.7767249×10⁻¹
F1V                       −7.5877830           −7.5877800
F2V                       −2.4000184×10¹       −2.4000162×10¹
CV                        −2.7629165×10²       −2.7629148×10²
DV                        −2.3393893×10²       −2.3393875×10²
EV                        −2.8058779×10³       −2.8058758×10³
bV                        −4.1784198×10⁻¹      −4.1784109×10⁻¹
f1V                       −1.0325929×10¹       −1.0325914×10¹
f2V                       −1.0000397×10¹       −1.0000375×10¹
cV                        −2.6721638×10²       −2.6721595×10²
dV                        −7.8967461×10¹       −7.8967268×10¹
eV                        −2.0771734×10³       −2.0771699×10³
the three quantities fmn1, fmn2, and fmn3 had not been introduced, then the forms of those expressions would be different. The formulas presented herein are merely the simpler ones. Comparing the numerical results in Tables 2 and 3 makes it clear that the third-order chromatic aberration of pure magnetic lenses can be written as

$$\begin{aligned}
\mathbf{r}_{3c} = -\Bigl[&\bigl(B_c\,\mathbf{r}_o' + b_c\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o') + \bigl(F_{1c}\,\mathbf{r}_o + f_{1c}\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o')\\
&+ \bigl(F_{2c}\,\mathbf{r}_o' + f_{2c}\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o) + \bigl(C_c\,\mathbf{r}_o + c_c\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o'\cdot\mathbf{r}_o)\\
&+ \bigl(D_c\,\mathbf{r}_o' + d_c\,\mathbf{r}_o'^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o) + \bigl(E_c\,\mathbf{r}_o + e_c\,\mathbf{r}_o^*\bigr)(\mathbf{r}_o\cdot\mathbf{r}_o)\Bigr]\left(\frac{\Delta V_o}{V_o} - 2\frac{\Delta B_o}{B_o}\right), \tag{30}
\end{aligned}$$
which is similar to the expression of the first-order chromatic aberration of pure magnetic lenses (Hawkes and Kasper, 1989). Although Eq. (30) is not a TABLE 3 N UMERICAL R ESULTS OF THE T HIRD -O RDER C HROMATIC A BERRATION C OEFFICIENTS C AUSED BY M AGNETIC P ERTURBATION FOR THE G LASER ’ S B ELL -S HAPED M AGNETIC L ENS , T OGETHER W ITH T HOSE C ALCULATED BY THE DA M ETHOD Aberration coefficients
Mathematica
DA method
BB F1B F2B CB DB EB bB f1B f2B cB dB eB
1.3553455 1.5175566 × 101 4.8000369 × 101 5.5258329 × 102 4.6787785 × 102 5.6117558 × 103 8.3568395×10−1 2.0651857 × 101 2.0000793 × 101 5.3443276 × 102 1.5793492 × 102 4.1543469 × 103
1.3553460 1.5175556 × 101 4.8000387 × 101 5.5258320 × 102 4.6787816 × 102 5.6117566 × 103 8.3568551×10−1 2.0651885 × 101 2.0000850 × 101 5.3443363 × 102 1.5793543 × 102 4.1543549 × 103
TABLE 4 N UMERICAL R ESULTS OF THE T HIRD -O RDER C HROMATIC A BERRATION C OEFFICIENTS C AUSED BY E LECTRIC P ERTURBATION FOR THE X IMEN ’ S C OMBINED E LECTROMAGNETIC L ENS , T OGETHER W ITH T HOSE C ALCULATED BY THE DA M ETHOD Aberration coefficients
Mathematica
DA method
BV F1V F2V CV DV EV bV f1V f2V cV dV eV
2.5531641×10−3 3.6345255×10−1 −2.6806022×10−2 1.7205123 3.3632697 × 101 3.3269181 × 103 −9.6146201×10−4 4.8747764×10−1 −4.5508199×10−1 3.3974854 × 101 −5.6436396 × 101 −6.3709715 × 102
2.5531787×10−3 3.6346147×10−1 −2.6806249×10−2 1.7205741 3.3632808 × 101 3.3269249 × 103 −9.6151643×10−4 4.8747998×10−1 −4.5508080×10−1 3.3974843 × 101 −5.6436874 × 101 −6.3709532 × 102
146
LIU
TABLE 5 N UMERICAL R ESULTS OF THE T HIRD -O RDER C HROMATIC A BERRATION C OEFFICIENTS C AUSED BY M AGNETIC P ERTURBATION F OR T HE X IMEN ’ S C OMBINED E LECTROMAGNETIC L ENS , T OGETHER W ITH T HOSE C ALCULATED BY THE DA M ETHOD Aberration coefficients
Mathematica
DA method
BB F1B F2B CB DB EB bB f1B f2B cB dB eB
−1.6505495×10−3 −1.4741282 1.1362776 −9.5490930 × 101 5.6720727 × 101 −1.2760132 × 104 5.7808535×10−3 1.9232739×10−1 6.6931453×10−1 −2.2760711 × 101 2.3115603 × 102 1.0685840 × 104
−1.6505664×10−3 −1.4741311 1.1362772 −9.5490961 × 101 5.6720386 × 101 −1.2760151 × 104 5.7808659×10−3 1.9232404×10−1 6.6931596×10−1 −2.2760833 × 101 2.3115630 × 102 1.0685828 × 104
Although Eq. (30) is not a rigorous analytical proof, we believe that the conclusion is valid owing to the extremely high accuracy of the numerical calculation.

This work makes it evident that the computer algebra system Mathematica is well suited to aberration analysis in electron optics in many respects, including analytical symbolic derivation, numerical calculation, and graphics display. Moreover, its input and output styles are consistent with the traditional forms used in technical publications, which considerably increases readability.
ACKNOWLEDGMENTS

I thank Ms. J.T. Costa, Manager of Partnership Programs, Wolfram Research, Inc., for providing a single-user Mathematica license; Professor M. Berz for allowing the use of COSY INFINITY in this work; and Dr. P.W. Hawkes for encouragement in writing articles for Advances in Imaging and Electron Physics.
APPENDIX

The following definitions are taken from The Mathematica Book, 5th edition (Wolfram, 2003).

&&—and (logical operator)
{}—curly braces for lists¹
{a, b, c}—a list with three elements a, b, and c
#n—the nth variable in a pure function²
body&—another form of a pure function, Function[body]
ϕ_?NumericQ—matches for ϕ_ are tested with the function NumericQ
Apart[expr]—separate into terms with simple denominators
AppendTo[v, elem]—append elem to the value of v
Coefficient[expr, form]—coefficient of form in expr
Collect[expr, x]—group together powers of x
D[f, x]—partial derivative of f with respect to x
expr /. x → value—replace x by value in the expression expr
expr /. {x → xval, y → yval}—perform several replacements
Expand[expr]—multiply out products and powers
Factor[expr]—reduce to a product of factors
For[start, test, incr, body]—evaluate start, then repetitively evaluate body and incr, until test fails
FreeQ[list, form]—test whether form occurs nowhere in list
Function[x, body]—a pure function in which x is replaced by the argument you provide
Graphics[list]—general two-dimensional graphics
GraphicsArray[list]—array of other graphics objects
If[p, then, else]—give then if p is true, and else if p is false
Module[{a, b, . . .}, proc]—a procedure with local variables a, b, . . .
NIntegrate[f, {x, xmin, xmax}]—numerical approximation to the integral of f dx from xmin to xmax
Normal[series]—truncate a power series to give an ordinary expression
NumericQ[expr]—test whether expr is a numeric quantity
Point[{x, y}]—point at position (x, y) in a two-dimensional coordinate system
PointSize[d]—give all points a diameter d as a fraction of the width of the whole plot
Select[expr, f]—select the elements in expr for which the function f gives True
Series[expr, {x, x0, n}]—find the power series expansion of expr about the point x = x0 to at most nth order
ScientificForm[expr, tot]—use scientific notation with at most tot digits
SetOptions[function, option → value, . . .]—reset defaults
Show[plot]—redraw a plot
Show[GraphicsArray[{plot1, plot2, . . .}], option → value]—draw an array of plots with options changed

¹ Lists provide a means to make a collection of objects in Mathematica.
² Pure functions give functions that can be applied to arguments, without having to define an explicit name for the function.
Simplify[expr]—try a sequence of algebraic transformations and give the smallest form of expr found
StringForm["cccc``cccc``", x1, x2]—output a string in which successive `` are replaced by successive xi
Table[expr, {i, imax}]—make a list of values of expr with i running from 1 to imax
v[[i]]—the ith element of the list v (double brackets for indexing)
REFERENCES

Berz, M. (1999). Modern map methods in particle beam physics. In: Hawkes, P.W., Kazan, B., Mulvey, T. (Eds.), Advances in Imaging and Electron Physics, vol. 108. Academic Press, San Diego, pp. 148–169.
Hawkes, P.W. (1977). Computer calculation of formulae for electron lens aberration coefficients. Optik 48, 29–51.
Hawkes, P.W., Kasper, E. (1989). Principles of Electron Optics. Academic Press, New York.
Liu, Z. (2002). Fifth-order canonical geometric aberration analysis of electrostatic round lenses. Nucl. Instrum. Methods Phys. Res. A 488, 42–50.
Liu, Z. (2004). Improved fifth-order geometric aberration coefficients of electron lenses. J. Phys. D: Appl. Phys. 37, 653–659.
Soma, T. (1977). Relativistic aberration formulas for combined electric–magnetic focusing-deflection systems. Optik 49, 255–262.
Ximen, J. (1986). Aberration theory in electron and ion optics. In: Hawkes, P.W. (Ed.), Advances in Electronics and Electron Physics, Suppl. 17. Academic Press, Orlando, pp. 48–49.
Ximen, J., Liu, Z. (2000). Analysis and calculation of third- and fifth-order aberrations in combined bell-shaped electromagnetic lens—a new theoretical model in electron optics. Optik 111, 75–84.
Wolfram, S. (2003). The Mathematica Book, fifth ed. Wolfram Media.
Wu, M. (1957). Fifth-order transverse aberrations in rotationally symmetric electron optical systems. Acta Phys. Sinica 13, 181–205.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 145
Anisotropic Diffusion Partial Differential Equations for Multichannel Image Regularization: Framework and Applications

DAVID TSCHUMPERLÉ¹ AND RACHID DERICHE²

¹ Image Team, GREYC/ENSICAEN–UMR CNRS 6072, 14050 Caen Cedex, France
² Odyssée Project Team, INRIA/ENPC/ENS–INRIA, 06902 Sophia Antipolis, France
Preliminary Notations
I. Introduction
   A. Defining a Local Geometry for Multichannel Images
      1. Local Geometric Features
      2. Geometry From a Scalar Feature
      3. Di Zenzo Multivalued Geometry
II. PDE-Based Smoothing of Multivalued Images: A Review
   A. Variational Methods
   B. Divergence-Based Diffusion PDEs
   C. Oriented Heat Flows
   D. Trace-Based Diffusion PDEs
   E. Links Between Existing Regularization Methods
III. Curvature-Preserving PDEs
   A. The Single-Direction Case
   B. Curvature-Preserving PDEs and Line Integral Convolutions
   C. Between Traces and Divergences
   D. Extension to Multidirectional Smoothing
IV. Implementation Considerations
V. Applications
   A. Color Image Denoising and Artifact Removal
   B. Color Image Inpainting
   C. Color Image Interpolation
   D. Flow Visualization
VI. Conclusion
Appendix A
Appendix B
Appendix C
References
ISSN 1076-5670 DOI: 10.1016/S1076-5670(06)45004-7
Copyright 2007, Elsevier Inc. All rights reserved.
PRELIMINARY NOTATIONS

Throughout this chapter, we represent a multichannel or multivalued image by a continuous function I : Ω → R^n, where Ω ⊂ R² is the definition domain of the image (basically a two-dimensional (2D) rectangle W × H) and n ∈ N⁺ is the dimension of each vector-valued image pixel I(X) located at X = (x y)^T ∈ Ω. The notation I_i stands for the ith channel of the image I. Note that I_i can itself be considered a scalar-valued image I_i : Ω → R. Thus, we have

∀X = (x, y) ∈ Ω,   I(X) = ( I_1(X)  I_2(X)  · · ·  I_n(X) )^T.
For the common case of color images, we naturally get n = 3, that is, three vector components (R, G, B) per pixel, retrieved respectively from the red (I_1), green (I_2), and blue (I_3) channels of a color image I.

We also make extensive use of second-order diffusion tensors in equations. A diffusion tensor D is identified with a 2 × 2 symmetric and positive-definite matrix, having two positive eigenvalues λ1, λ2 and two associated orthonormal eigenvectors u1 ⊥ u2. The shape of a tensor D may be seen as an ellipse, oriented by the vector basis (u1, u2) and elongated by λ1 and λ2:

D = ( a  b ; b  c ) = λ1 u1 u1^T + λ2 u2 u2^T.
When λ2 ≫ λ1 (lengthened ellipse), the tensor D is said to be anisotropic and has u2 as its principal orientation. When λ1 = λ2 = β, the tensor D is isotropic and thus equal to a weighted version of the 2 × 2 identity matrix Id:

λ1 = λ2 = β  ⇒  D = β Id = ( β  0 ; 0  β ).

An isotropic tensor D has no privileged orientation, all vectors of R² being possible eigenvectors of D. Finally, we denote by G_σ a normalized 2D Gaussian function with standard deviation σ:

G_σ(x, y) = (1/(2πσ²)) exp( -(x² + y²)/(2σ²) ).
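These notations translate directly into code. The short sketch below is our own illustration (the chapter itself contains no code); Python with NumPy is our choice of language, and all function names are ours. It builds a diffusion tensor from its eigen-elements and evaluates G_σ:

    import numpy as np

    def diffusion_tensor(theta, lambda1, lambda2):
        # D = lambda1*u1*u1^T + lambda2*u2*u2^T, with u1 at angle theta
        # and u2 = u1 rotated by 90 degrees.
        u1 = np.array([np.cos(theta), np.sin(theta)])
        u2 = np.array([-np.sin(theta), np.cos(theta)])
        return lambda1 * np.outer(u1, u1) + lambda2 * np.outer(u2, u2)

    def gaussian(x, y, sigma):
        # Normalized 2D Gaussian G_sigma(x, y).
        return np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

    D = diffusion_tensor(np.pi / 4, 0.1, 1.0)   # anisotropic: lambda2 >> lambda1
    print(np.linalg.eigvalsh(D))                # recovers (0.1, 1.0)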
I. INTRODUCTION

Obtaining regularized versions of noisy or corrupted image data has always been a desirable goal in the fields of computer vision and image processing. Removing noise or scratches from degraded images is indeed a fundamental preprocessing step that can ease the further analysis of the image data by higher-level algorithms, such as detectors of important image features (edges, corners, objects, motion). The ability to create simplified versions of the image data is also very useful when considering the analysis of images at multiple scales. More generally, image regularization is one of the key stages of most computer vision algorithms, since it plays a fundamental role in solving ill-posed computer vision problems (Hadamard, 1923), including restoration, segmentation, registration, surface reconstruction, and so on. This explains why many image regularization formalisms have already been proposed and studied in the literature.

Perona and Malik (1990), in their pioneering work in the early 1990s, were the first to formulate image regularization in terms of anisotropic diffusion partial differential equations (PDEs). Their method, applied to scalar-valued images (one value per pixel), aroused strong interest in PDE-based formulations, since it succeeded in smoothing image data in a nonlinear way, removing the noise quite well while preserving significant image features, such as contours and corners (discontinuities of the signal), despite an initial formulation that was later proved to be unstable (Kichenassamy, 1997; Weickert and Benhamouda, 1997a, 1997b). First created to describe physical laws and the natural motion of mechanical objects and fluids (strings, water, wind (Wesseling, 2000)), diffusion PDEs had already been widely studied, and theoretical results from the fields of physics and mathematics have found interesting implications for the purpose of data regularization. Actually, PDEs are local formulations and are thus well adapted to degraded images in which the sources of data corruption are local or semilocal. This is not very restrictive: Gaussian noise, scratches, or compression artifacts are, for instance, local degradations usually encountered in digital (original or digitized) images.

Following the way opened by Perona and Malik (1990), many authors have since proposed variants of diffusion PDEs for image regularization, primarily for the restoration of scalar-valued datasets. Important theoretical contributions in this field concern the way the classical isotropic diffusion equation (heat flow) has been extended to deal with anisotropic smoothing (Black et al., 1998; Krissian, 2000; Perona and Malik, 1990; Sapiro, 2001; ter Haar Romeny, 1994; Weickert, 1998; Yezzi, 1998), how diffusion PDEs may be seen as gradient descents of various energy functionals (Alvarez and Mazorra, 1994; Aubert and Kornprobst, 2002; Blanc-Feraud et al.,
1995; Charbonnier et al., 1994; Chambolle and Lions, 1997; Charbonnier et al., 1997; Gilboa et al., 2002; Kimmel et al., 2000; Rudin et al., 1992; Teboul et al., 1998), and the link between regularization PDEs and the concept of nonlinear scale-spaces (Alvarez et al., 1993; Lindeberg, 1994; Nielsen et al., 1997). Extensions of these techniques to deal with color images, and more generally multichannel datasets, have been attempted more recently by (Blomgren and Chan, 1998; Chan et al., 2000; Kimmel et al., 2000; Pardo and Sapiro, 2000; Sapiro, 2001; Sapiro and Ringach, 1996; Tschumperlé, 2002; Tschumperlé and Deriche, 2003; Tschumperlé and Deriche, 2005; Weickert, 1998; Weickert, 1999; Weickert and Brox, 2002) (among others), leading to more elaborate expressions: A coupling term between image channels generally appears in the equations. Diffusion equations dealing with constrained multidimensional datasets have also been proposed, allowing regularization of images of unit vectors (Coulon et al., 2001; Kimmel and Sochen, 2002; Perona, 1998; Tang et al., 1998; Vese and Osher, 2001), orthonormal matrices (Chefd'hotel et al., 2004; Chu, 1990a, 1990b; Tschumperlé and Deriche, 2001c, 2002b), positive-definite matrices (Chefd'hotel et al., 2004; Tschumperlé and Deriche, 2001b), or image data defined on implicit surfaces (Bertalmio et al., 2001; Chan and Shen, 2000a; Tang et al., 2000). Usually these types of constrained PDEs simply add an extra constraint term to the corresponding unconstrained equation and are not discussed here.

Despite the wide range of existing constrained and unconstrained PDE formalisms for scalar and multichannel images, all proposed methods have something in common: a nonlinear regularization PDE of the form ∂I/∂t = R locally smooths the image I along one or several directions of the plane that differ at each image point, depending on the local image configuration. Typically, the principal smoothing direction is chosen to be parallel to the image contours, resulting in an anisotropic regularization that does not destroy the edges. This has an interesting interpretation in terms of scale-space: As the image data are gently regularized step by step, a continuous sequence of smoother images I(t) is generated while the evolution time t of the PDE goes by. Obviously, anisotropic regularization algorithms must let the less significant data features disappear first (preferably noise), while the interesting image details (edges) are preserved until they themselves become unimportant within the image (Alvarez et al., 1993; Lindeberg, 1994; Nielsen et al., 1997; Perona and Malik, 1990; Witkin, 1983). Roughly speaking, regularization PDEs may be seen as iterative and nonlinear filters that simplify the image little by little and thereby minimize the image variations (Figure 1).

Note that such equations generally do not converge toward a very interesting solution. Basically, the image obtained at convergence (t → ∞)
FIGURE 1. Nonlinear regularization PDEs and the notion of anisotropic scale-space. (a) Initial image I(t=0), (b) t = 50, (c) t = 250, (d) t = 1000, (e) t = 3000.
is constant everywhere, corresponding to an image without any variations. This is indeed the most simplified image obtainable. To avoid this undesired oversimplification, regularization algorithms are usually based on a modified PDE velocity R′ = R + α(I_noisy − I), including a so-called data fidelity term weighted by a user-defined parameter α ∈ R⁺. It prevents the expected solution (the regularized image) at convergence from straying too far from the original noisy image (which is, by the way, not constant). Another classical restoration technique consists of stopping the pure regularization flow ∂I/∂t = R after a finite number of iterations (which thus becomes a parameter of the method). Here, the main interest is in the regularization term R itself rather than in the velocity R′ containing the fidelity term. For a broader mathematical study of linear and nonlinear fidelity terms, please refer to (Meyer, 2001; Nikolova, 2001; Nikolova and Ng, 2001).

As local and oriented image smoothing is clearly one of the key ideas used by most PDE-based regularization methods, this leads to the problem of defining a coherent geometry from a multichannel image, which must be the first aim of a good regularization algorithm. Following this simple and general principle, recent contributions (Tschumperlé and Deriche, 2003; Tschumperlé and Deriche, 2005; Weickert, 1998) proposed two different and generic PDE-based frameworks able to design regularization processes from any given underlying local smoothing geometry. These methods have two main interests: On the one hand, they unify many previously proposed equations into generic diffusion PDEs and provide a local geometric interpretation of the corresponding regularizations. On the other hand, they clearly separate the design of the smoothing geometry from the smoothing process itself. In a first step, the geometry of the structures inside the image is retrieved (generally by the computation of the so-called structure tensor field). Then a local geometry of the desired smoothing is defined by means of a second field of diffusion tensors, depending on the first one. Finally, one step of the smoothing process itself is performed through one or several iterations of a specific diffusion PDE. This procedure is repeated until the image is sufficiently regularized.
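As a minimal illustration of this generic evolution scheme (a sketch under our own conventions, where velocity stands for any regularization term R discussed in this chapter, and the heat flow serves as a stand-in example), a discretized version of the fidelity-weighted flow could read:

    import numpy as np

    def regularize(I_noisy, velocity, alpha=0.1, dt=0.2, n_iter=100):
        # Evolve I by dI/dt = R(I) + alpha * (I_noisy - I).
        # alpha = 0 with a finite n_iter gives the "stopped flow" variant.
        I = I_noisy.copy()
        for _ in range(n_iter):
            I += dt * (velocity(I) + alpha * (I_noisy - I))
        return I

    def laplacian(I):
        # 5-point Laplacian with periodic boundaries (np.roll wraps around).
        return (np.roll(I, 1, 0) + np.roll(I, -1, 0) +
                np.roll(I, 1, 1) + np.roll(I, -1, 1) - 4 * I)

    I_noisy = np.random.rand(64, 64)
    I_smooth = regularize(I_noisy, laplacian)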
This chapter first discusses the definition of a local geometry for multichannel images, reviewing and comparing solutions proposed in the literature (Blomgren, 1998; Blomgren and Chan, 1998; Di Zenzo, 1986; Sapiro, 1996; Weickert, 1998) (Section I.A). This is followed by a review of important works already proposed for scalar and multichannel image regularization within a diffusion PDE framework. These methods may be classified into three different approaches: (1) variational formulations, (2) divergence expressions, and (3) oriented Laplacians. The main focus is on the interpretation of the algorithms in terms of local smoothing (Section II). We particularly note the advantages and drawbacks of each equation in real cases. Then we focus on a very recent alternative, formulated as a tensor-driven diffusion that regularizes multichannel images while taking specific curvature constraints into account (Section III). This formulation is mathematically positioned between the previously existing equations in such a way that it solves most issues encountered with classical regularization methods. Moreover, we show that a theoretical interpretation of the curvature-constrained formalism exists in terms of line integral convolutions (LICs), a simple filtering technique originally proposed by Cabral and Leedom (1993). This direct analogy allows the design of an explicit numerical scheme that implements the regularization PDE by successive integrations of pixel values along integral lines (Section IV). This iterative scheme has two main advantages over classical PDE implementations. On one hand, it preserves thin image structures remarkably well, since it naturally works at subpixel accuracy, thanks to the use of a fourth-order Runge–Kutta integration. On the other hand, the algorithm is able to run up to three times faster than classical explicit schemes, since it is unconditionally stable, even for large PDE time steps. The described method makes diffusion PDEs a generic and very efficient approach to solving image-processing problems that need multichannel image regularization. Finally, we illustrate this effectiveness, in terms of computational speed and visual quality, with results on color image restoration, color image inpainting, and nonlinear resizing, among other possible applications in the area of image regularization (Section V).

A. Defining a Local Geometry for Multichannel Images

1. Local Geometric Features

As stated in the introduction, image regularization may be considered a filter that reduces local pixel variations. More precisely, one wants to smooth a multichannel image I : Ω → R^n while preserving its edges (discontinuities in the image intensities), that is, performing a local smoothing mostly along
directions of the edges, avoiding smoothing orthogonal to these edges. At first glance, a naive approach would be to apply a scalar-valued regularization filter on each channel I_i of the multichannel image I, independently for each i = 1, . . . , n. However, the correlation between image channels would be ignored in this case, and it might cause important disparities in the smoothing behavior, because the local smoothing directions and amplitudes could be very different from one channel to another. Such decoupled regularization methods generally lead to undesirable oversmoothing effects, destroying significant edge structures in the image. Multichannel image regularization is instead based on a coherent image smoothing that locally uses the same smoothing directions and amplitudes for all image channels I_i. Naturally, this means that the local geometry of a multichannel image I must be measured first. Such a geometry consists in the definition of the following features at each image point X = (x, y) ∈ Ω of I:

• Two orthogonal directions θ⁺(X), θ⁻(X) ∈ S¹ (unit vectors of R²), directed respectively across and along the edges (generally the maximum and minimum variations of the image intensities at X). The direction θ⁻ generally corresponds to the edge direction, when there is one, while θ⁺ naturally extends the notion of gradient direction to multichannel images.
• A corresponding variation norm N(X) measuring the local strength of an edge. This is the extension of the vector gradient norm to multichannel images.
In order to construct such a vector geometry, different approaches have been considered; they are detailed in the following text.

2. Geometry From a Scalar Feature

One simple method consists of first computing a scalar image f(I), using a vector-to-scalar function f : R^n → R that would ideally model the human perception of vector-valued edges. This is particularly conceivable for color images: One may choose, for instance, the lightness function (the perceptual response to luminance) coming from the CIELAB color base (Poynton, 1995):

f = L* = 116 g(Y) − 16
with Y = 0.2125R + 0.7154G + 0.0721B
where g : R → R is defined by

g(s) = s^(1/3)             if s > 0.008856,
g(s) = 7.787 s + 16/116    otherwise.
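In code, this lightness computation is straightforward. The sketch below (Python/NumPy, our own illustration; the function name is ours) follows the two formulas above for channels in [0, 1]:

    import numpy as np

    def lightness(R, G, B):
        # CIE lightness L* from RGB channels (Poynton, 1995).
        Y = 0.2125 * R + 0.7154 * G + 0.0721 * B
        g = np.where(Y > 0.008856, np.cbrt(Y), 7.787 * Y + 16.0 / 116.0)
        return 116.0 * g - 16.0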
FIGURE 2. Using lightness L* to detect the geometry of a color image fails for isolightness contours. (a) Red channel R, (b) green channel G, (c) blue channel B, (d) color image (R, G, B), (e) lightness (scalar) image L*.
Thus, we may define a local vector geometry {N, θ⁺, θ⁻} of I by choosing

θ⁺ = ∇f(I)/‖∇f(I)‖,   θ⁻ ⊥ θ⁺,   and   N = ‖∇f(I)‖.

However, this method has two major drawbacks. First, it is not always possible to easily define a significant function f for multichannel images (particularly when the number of channels is n > 3). Second, there are mathematically no functions f that can detect all possible vector-valued variations. For instance, the lightness function defined above cannot detect isolightness vector contours in a color image. This is the case for the image shown in Figure 2: The contours inside the colored yin-yang symbol will not be detected by N = ‖∇f(I)‖, since f(I) is constant therein. As a consequence, the smoothing performed will be either isotropic or oriented in an incorrect direction: the existing color edges inside the yin-yang symbol will probably be blurred.

3. Di Zenzo Multivalued Geometry

A very elegant solution has been proposed by Di Zenzo (1986) to overcome this limitation. He considers a multichannel image I : Ω → R^n as a vector field and looks for the local variations of the vector norm ‖dI‖², mainly given by a variation matrix G = (g_ij). This yields

dI = I_x dx + I_y dy,   where I_x = ∂I/∂x and I_y = ∂I/∂y (∈ R^n),

then

‖dI‖² = dI^T dI = ‖I_x‖² dx² + 2 I_x^T I_y dx dy + ‖I_y‖² dy²,

that is,

‖dI‖² = dX^T G dX,   where G = Σ_{i=1}^n ∇I_i ∇I_i^T and dX = ( dx  dy )^T.
G is denoted as the structure tensor. It sums the variation contributions from each image channel I_i. It is easy to see that G is a 2 × 2 symmetric and positive-semidefinite matrix. Its coefficients (g_ij) are as follows:

g11 = Σ_{i=1}^n I_ix²,   g12 = g21 = Σ_{i=1}^n I_ix I_iy,   g22 = Σ_{i=1}^n I_iy².
In the common case of color images I = (R, G, B), G is defined as

G = ( Rx² + Gx² + Bx²           Rx Ry + Gx Gy + Bx By )
    ( Rx Ry + Gx Gy + Bx By     Ry² + Gy² + By²       ).   (1)
The interesting point about G is that its positive eigenvalues λ⁺/⁻ give the maximum and minimum values of ‖dI‖², while the orthogonal eigenvectors θ⁺ and θ⁻ are the corresponding orientations of these extrema; they are formally given by

λ⁺/⁻ = ( g11 + g22 ± √Δ ) / 2   and   θ⁺/⁻ ∝ ( 2 g12 , g22 − g11 ± √Δ )^T,   (2)

where Δ = (g11 − g22)² + 4 g12². The vectors θ⁺/⁻ are normalized to unit length afterward. With this simple and efficient approach, Di Zenzo opened a natural way to deal with the local vector geometry of multichannel images, through the use of the oriented orthogonal basis (θ⁺, θ⁻) and the variation measures (λ⁺, λ⁻).

A slight variant has been proposed by Weickert (1998), who studies instead the eigenvalues and eigenvectors of a Gaussian-smoothed version G_σ of the structure tensor G:
G_σ = ( Σ_{i=1}^n ∇I_iα ∇I_iα^T ) ∗ G_σ,   where ∇I_iα = ∇(I_i ∗ G_α),   (3)
where G_α and G_σ are 2D Gaussian kernels with variances respectively equal to α and σ. The user-defined parameters α and σ influence the smoothness of the obtained structure tensor field and, by extension, the regularity of the retrieved vector-valued image geometry. It is noteworthy that the eigenvalues of G_σ are well adapted to discriminate between different geometric cases:

• When λ⁺ ≃ λ⁻ ≃ 0, there are very few vector variations around the current point X = (x, y); the region is almost flat and does not contain any edges or corners (as is the case for the inside of the strips in Figure 3a). For this configuration, the variation norm N that must be defined should be low.
FIGURE 3. Comparing possible vector variation norms N, N⁻, and N⁺ for a synthetic color image. (a) Color checkerboard (size 40 × 40), (b) N = √λ⁺, (c) N⁻ = √(λ⁺ − λ⁻), (d) N⁺ = √(λ⁺ + λ⁻).
• When λ⁺ ≫ λ⁻, there are many vector variations. The current point may be located on a vector edge (as for the edges of the strips in Figure 3a). For this configuration, the variation norm N should be high.
• When λ⁺ ≃ λ⁻ ≫ 0, the variations are located on a saddle point of the vector surface, which could be a corner structure in the image (for instance, the intersections of the strips in Figure 3a). In this case, N should be even higher than for the previous configuration. Regularization algorithms indeed have a tendency to smooth corners quickly; a very high variation measure estimated on corner points attenuates the smoothing there, which is often a desired effect.

Many proposed regularization algorithms acting on multichannel images have implicitly or explicitly based their smoothing behavior on these Di Zenzo attributes. In particular, three different choices of vector gradient norms N have been proposed so far in the literature to measure vector-valued variations:

• N = √λ⁺, a natural extension of the scalar gradient norm viewed as the value of maximum variation (Blomgren, 1998; Sapiro, 1996; Sapiro, 1997) (Figures 3b and 4b). This norm gives no particular importance to corners compared with straight edges.
• N⁻ = √(λ⁺ − λ⁻), also called the coherence norm, has been chosen by (Sapiro and Ringach, 1996; Weickert, 1996a; Weickert, 1997a). Note that this norm fails to detect discontinuities that are saddle points of the vector-valued surface. This is illustrated at the intersections of the strips (Figure 3c), as well as in the center and the left and right parts of the child's eye (Figure 4c). This will mainly perturb any regularization process that uses this norm, since some sharp color corners, considered as homogeneous regions, will probably be oversmoothed.
• N⁺ = √(λ⁺ + λ⁻), also denoted by ‖∇I‖, is often chosen (Bertalmio et al., 2000a; Blomgren and Chan, 1998; Pardo and Sapiro, 2000; Tang et al.,
FIGURE 4. Comparing possible vector variation norms N, N⁻, and N⁺ for a real color image. (a) Color photograph (small portion 60 × 40), (b) N = √λ⁺, (c) N⁻ = √(λ⁺ − λ⁻), (d) N⁺ = √(λ⁺ + λ⁻).
1998; Tschumperlé and Deriche, 2001b; Tschumperlé and Deriche, 2001c) because it detects edges and corners well and is easy to compute. It does not require an eigenvalue decomposition of G, as the other norms do, because

N⁺ = ‖∇I‖ = √(trace(G)) = √( Σ_{i=1}^n ‖∇I_i‖² ).   (4)
Moreover, the norm N⁺ has the interesting property of giving preference to certain corners (Figure 3d). This is very valuable for image restoration purposes, since the smoothing can be attenuated on high-curvature structures that are classically difficult to preserve. Note that for the scalar case (n = 1), the structure tensor calculus reduces to

‖dI‖² = dX^T G1 dX   when n = 1,   where G1 = ∇I ∇I^T = ( Ix²     Ix Iy )
                                                        ( Ix Iy   Iy²   ).

In this case, the eigenvectors θ¹⁺/⁻ and the eigenvalues λ¹⁺/⁻ of G1 are

θ¹⁺ = η = ∇I/‖∇I‖, associated with λ¹⁺ = ‖∇I‖²,   and   θ¹⁻ = ξ = ∇I⊥/‖∇I‖, associated with λ¹⁻ = 0.
Basically, the three above-defined norms N⁺, N⁻, and N all reduce to ‖∇I‖ in the case of scalar-valued images, which is a desired property. Once a local vector geometry is defined, it can be used as a measure in many image analysis processes involving multichannel images (not only regularization algorithms). For instance, color edge detection may be performed by finding thresholded local maxima of the N⁺ norm (Figure 5 and Koschan, 1995;
FIGURE 5. Using a vector variation norm for color edge detection. (a) Color image, (b) detecting color edges with the norm N⁺ = √(λ⁺ + λ⁻).
Tschumperlé and Deriche, 2001a; Tschumperlé and Deriche, 2002a). This vector geometry computation has also been integrated as a measure of contours in some multichannel image segmentation methods (Sapiro, 1996; Sapiro, 1997).

For all the reasons listed above, the norm N⁺ = √(λ⁺ + λ⁻) associated with the Di Zenzo geometry is probably one of the best measures for detecting local variations in multichannel images; it is the one considered in the next sections of this chapter.
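The Di Zenzo geometry is easy to compute in practice. The following sketch (our own illustration in Python/NumPy; all names are ours) assembles the structure tensor G of a multichannel image, its eigenvalues [Eq. (2)], and the norm N⁺ [Eq. (4)]:

    import numpy as np

    def di_zenzo_geometry(I):
        # I has shape (H, W, n). Returns the coefficients of G,
        # the eigenvalues lam_plus >= lam_minus, and the norm N+.
        g11 = np.zeros(I.shape[:2])
        g12 = np.zeros(I.shape[:2])
        g22 = np.zeros(I.shape[:2])
        for i in range(I.shape[2]):
            Iy, Ix = np.gradient(I[..., i])     # np.gradient returns (d/dy, d/dx)
            g11 += Ix**2
            g12 += Ix * Iy
            g22 += Iy**2
        delta = np.sqrt((g11 - g22)**2 + 4 * g12**2)   # sqrt(Delta) of Eq. (2)
        lam_plus = 0.5 * (g11 + g22 + delta)
        lam_minus = 0.5 * (g11 + g22 - delta)
        n_plus = np.sqrt(lam_plus + lam_minus)         # = sqrt(trace(G)), Eq. (4)
        return (g11, g12, g22), lam_plus, lam_minus, n_plus

The eigenvectors, when needed, follow from Eq. (2) as vectors proportional to (2 g12, g22 − g11 ± √Δ), normalized afterward.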
II. PDE-BASED SMOOTHING OF MULTIVALUED IMAGES: A REVIEW

We review and propose a classification of classical smoothing methods based on diffusion PDEs into three different approaches, related to different interpretation levels of the regularization processes, from the most global to the most local. Each section starts with a description of the original idea for scalar-valued images, which is then extended to multichannel datasets.

A. Variational Methods

Contrary to the formulation of the original Perona–Malik equation, several methods have been proposed that pose image regularization as a global minimization procedure, within a variational framework. Formalisms described by (Aubert and Kornprobst, 2002; Chambolle and Lions, 1997; Charbonnier et al., 1997; Sapiro, 2001; Weickert, 1998), among numerous references, contributed to the definition of generic energy functionals measuring global image variations. The goal is to minimize adapted variation
functionals that flatten low image variations (thereby gradually removing the noise) while preserving the high ones (avoiding the smoothing of image contours). The formulation of the φ-functionals gathers some of these approaches in a general framework and yields a very unifying way to proceed. A noisy scalar image I_noisy can be regularized by minimizing the following φ-functional:

min_{I : Ω→R} E(I) = ∫_Ω φ(‖∇I‖) dΩ,   (5)

Function name      φ(s)                  Reference
Tikhonov           s²                    Tikhonov, 1963
Perona–Malik       1 − exp(−s²/K²)       Perona and Malik, 1990
Minimal surfaces   2√(1 + s²) − 2        Charbonnier et al., 1994
Geman–McClure      s²/(1 + s²)           Geman and McClure, 1985
Total variation    s                     Rudin et al., 1992
Green              log(cosh(s))          Green, 1990

FIGURE 6. List of different φ-functions and corresponding references.
where φ : R → R is an increasing function directing the regularization behavior and penalizing high gradient norms. The minimization is performed via the corresponding diffusion PDE evolution, coming from the Euler–Lagrange equations of E(I):

I(t=0) = I_noisy,   ∂I/∂t = div( ( φ′(‖∇I‖)/‖∇I‖ ) ∇I ).   (6)
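As an illustration of how Eq. (6) can be discretized, here is a minimal explicit step for the Perona–Malik choice of φ (a sketch under our own assumptions about grid spacing, boundary handling, and parameter values; not the authors' implementation):

    import numpy as np

    def phi_flow_step(I, dt=0.1, K=1.0):
        # One explicit step of Eq. (6) for phi(s) = 1 - exp(-s^2/K^2),
        # for which phi'(s)/s = (2/K^2) exp(-s^2/K^2).
        # dt must stay small for this explicit scheme to be stable.
        Iy, Ix = np.gradient(I)                 # (d/dy, d/dx) central differences
        c = (2.0 / K**2) * np.exp(-(Ix**2 + Iy**2) / K**2)   # diffusivity
        div = np.gradient(c * Ix, axis=1) + np.gradient(c * Iy, axis=0)
        return I + dt * div

Iterating this step realizes the gradient descent on E(I); swapping the diffusivity expression changes the φ-function (Figure 6).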
Different choices of the function φ lead to different regularization methods, notably the simple isotropic smoothing (equivalent to a Gaussian convolution) introduced by Tikhonov (1963), as well as the well-known Perona–Malik (Perona and Malik, 1990) and total variation (TV) anisotropic flows (Rudin et al., 1992). Many regularization methods acting on scalar-valued images have been unified by the φ-function formalism (Figure 6).

A natural extension of the φ-functionals to the regularization of multichannel images I could consist in minimizing the following cost functional E(I), measuring a global multichannel image variation:

min_{I : Ω→R^n} E(I) = ∫_Ω φ(N(I)) dΩ,   (7)
where N (I) is one of the three local variation norms defined in Section I.A.
But more generally, as vector-valued images possess two distinct variation measures λ⁺ and λ⁻ (the eigenvalues of the structure tensor G), contrary to the single measure ‖∇I‖ for scalar images, it seems natural to minimize a functional defined by a function ψ : R² → R of two variables instead of one. The ψ-functional below is thus a more complete extension of the φ-function formulation to multichannel images:

min_{I : Ω→R^n} E(I) = ∫_Ω ψ(λ⁺, λ⁻) dΩ.   (8)
The Euler–Lagrange equations of Eq. (8) can be derived and reduce to a simple form of divergence-based PDE (see Appendix A for details about this Euler–Lagrange derivation):

∂I_i/∂t = div( ( (∂ψ/∂λ⁺) θ⁺θ⁺^T + (∂ψ/∂λ⁻) θ⁻θ⁻^T ) ∇I_i )   (i = 1, . . . , n).   (9)

The choice of specific ψ-functions leads to previous vector-valued regularization approaches defined as variational methods, such as the whole range of vector-valued φ-functionals (Blomgren and Chan, 1998; Pardo and Sapiro, 2000; Tang et al., 2000):

ψ(λ⁺, λ⁻) = φ( √(λ⁺ + λ⁻) )

or the Beltrami flow framework (Kimmel et al., 1998; Kimmel et al., 2000; Kimmel and Sochen, 1999; Sochen et al., 2001; Sochen et al., 1998; Sochen, 2001):

ψ(λ⁺, λ⁻) = √( (1 + λ⁺)(1 + λ⁻) ).
Note that this last approach is also equivalent to defining the minimized functional E(I) as a Polyakov action, a physical measure of the area of the image I seen as a 2D surface embedded in an (n + 2)D space. This geometric interpretation helps in understanding how functional minimization can play a role in smoothing images by forcing them to be more regular, thereby finding a minimal surface (Figure 7).

Despite the interesting global geometric interpretation of variational formulations, such methods clearly lack flexibility. Indeed, they are formulated as global minimization processes, despite the local geometric smoothing properties that are intrinsically desired for regularization purposes. Such PDEs are obtained by the Euler–Lagrange derivation of a functional and thus cannot be finely tuned to adapt themselves to local geometric configurations (e.g., contours, corners). Unfortunately, this adaptability is essential in many situations, especially when the noise level is high.
FIGURE 7. Example of image denoising by surface area minimization. (a1) Noisy image, (a2) corresponding surface. (b1) Restored image, (b2) corresponding surface.
B. Divergence-Based Diffusion PDEs

One level of flexibility in designing regularization PDEs was reached with the introduction of more generic divergence expressions (Aubert and Kornprobst, 2002; Alvarez et al., 1992; Kornprobst et al., 1997; Sapiro, 2001; Weickert, 1998). Basically, the idea was to replace the function φ′(‖∇I‖)/‖∇I‖ in the divergence of the scalar-valued PDE (6) by expressions depending on more appropriate image features. On one hand, this gives more freedom to design regularization PDEs that better fit local constraints. On the other hand, the global interpretation of the regularization process is often lost; generally, equations so designed no longer correspond to a functional minimization.

Historically, the authors of (Alvarez et al., 1992) first proposed to use a diffusivity function g(‖∇I ∗ G_σ‖) depending on the convolved gradient norm ‖∇I ∗ G_σ‖, rather than simply considering ‖∇I‖ as a measure of image variations, for the
regularization of scalar-valued images:

∂I/∂t = div( g(‖∇I ∗ G_σ‖) ∇I ),

where G_σ is a 2D normalized Gaussian function. This was initially done to ensure that the regularization formulation is well posed. However, it also turned out to respect a more coherent local diffusion geometry, by involving a larger neighborhood in the computation of the local image variations that influence the smoothing process.

A major generalization of divergence-based equations for scalar and multichannel images was more recently proposed by Weickert (Weickert, 1996b; Weickert, 1997a; Weickert, 1997b; Weickert, 1998). Basically, the idea consists of considering image pixels as chemical concentrations or temperatures that diffuse according to some physical laws (Fick's law and continuity equations). He proposed the following very generic divergence-based equation, parameterized by a field D : Ω → P(2) of 2 × 2 diffusion tensors:
∂I_i/∂t = div(D ∇I_i)   (i = 1, . . . , n).   (10)

The tensor field D defines a gradient flux and controls the local diffusion behavior of the smoothing process [Eq. (10)]. Note that the ψ-functional formalism described in Section II.A is a particular instance of the PDE (10), with D defined as follows:

D = (∂ψ/∂λ⁺) θ⁺θ⁺^T + (∂ψ/∂λ⁻) θ⁻θ⁻^T.

More specifically, Weickert proposed to design the diffusion tensor D for each image point X = (x, y) by selecting its two eigenvectors u, v and eigenvalues λ1, λ2 as functions of the spectral elements of the smoothed structure tensor G_σ [Eq. (3)], such that

u = θ⁺,  v = θ⁻,  λ1 = β,   and   λ2 = β                                     if λ⁺ = λ⁻,
                                  λ2 = β + (1 − β) exp( −C/(λ⁺ − λ⁻)² )      otherwise,   (11)
(C > 0 and β ∈ [0, 1] are user-fixed parameters of the method). The tensor D is then computed at each image point as D = λ1 uu^T + λ2 vv^T. Note that the tensor field D is the same for all image channels I_i, ensuring that all the I_i are smoothed with a common multichannel geometry that takes the correlation between image channels into account (since D depends on G_σ), contrary to an uncorrelated channel-by-channel approach. Weickert assumed that the tensor shape at each point X = (x, y) of the field D gives the preferred smoothing geometry at this point. The idea behind the choice in Eq. (11) was then:
• On almost constant regions, we have λ⁺ ≃ λ⁻ ≃ 0, and then λ1 ≃ λ2 ≃ β, that is, D ≃ βId: the tensor D is isotropic in flat regions.
• Along image contours, we have λ⁺ ≫ λ⁻ ≫ 0, and as a result λ2 > λ1 > 0: the diffusion tensor D is anisotropic, directed mainly along the smoothed direction θ⁻ of the image contours.

However, it is important to realize that the amplitudes and directions of the local smoothing performed by the divergence-based PDE (10) are not entirely determined by the eigencharacteristics (shapes) of the diffusion tensors D. This may lead to a smoothing behavior that is not the expected one, as illustrated by the following simple example. Suppose we want to smooth a scalar image I : Ω → R anisotropically, everywhere along the gradient direction ∇I/‖∇I‖ with a constant strength of 1. (This is for illustration purposes only, since all image discontinuities would be destroyed by such a smoothing geometry.) Intuitively, we would define D at each point X ∈ Ω as
∀X ∈ Ω,   D(X) = ( ∇I ∇I^T ) / ‖∇I‖²,

leading to the simplification of Eq. (10) as

∂I/∂t = div( (1/‖∇I‖²) ∇I ∇I^T ∇I ) = div(∇I) = ΔI,

where ΔI = ∂²I/∂x² + ∂²I/∂y² stands for the Laplacian of I. As noticed in (Koenderink, 1984), the evolution of this well-known heat flow equation is similar to the convolution of the image I with a normalized Gaussian kernel G_σ of standard deviation σ = √(2t). So, this particular choice of anisotropic tensors D leads to an isotropic smoothing behavior, without preferred smoothing orientations. Note that choosing D = Id (the identity matrix) would yield the same result: Different tensor fields D with very different shapes (isotropic or anisotropic) may define the same regularization behavior. Actually, the divergence is a differential operator, so Eq. (10) implicitly depends on the spatial variations of the tensor field D. Clearly, the divergence equation (10) hampers the design of a pointwise smoothing behavior with a precise geometric meaning.
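To make Weickert's construction concrete, the sketch below (our own illustration; parameter values are arbitrary) builds the tensor field D of Eq. (11) from the smoothed structure tensor of Eq. (3):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def weickert_tensor_field(I, alpha=0.5, sigma=1.5, beta=0.01, C=1e-4):
        # I has shape (H, W, n). Returns D with shape (H, W, 2, 2).
        H, W = I.shape[:2]
        G = np.zeros((H, W, 2, 2))
        for i in range(I.shape[2]):
            Iy, Ix = np.gradient(gaussian_filter(I[..., i], alpha))  # grad(I_i * G_alpha)
            G[..., 0, 0] += Ix**2
            G[..., 0, 1] += Ix * Iy
            G[..., 1, 0] += Ix * Iy
            G[..., 1, 1] += Iy**2
        G = gaussian_filter(G, (sigma, sigma, 0, 0))     # smoothing by G_sigma, Eq. (3)
        lam, vec = np.linalg.eigh(G)                     # ascending eigenvalues
        lam_minus, lam_plus = lam[..., 0], lam[..., 1]
        theta_minus, theta_plus = vec[..., :, 0], vec[..., :, 1]
        l1 = np.full((H, W), beta)                       # along u = theta+
        diff = lam_plus - lam_minus
        l2 = np.where(diff > 1e-12,
                      beta + (1.0 - beta) * np.exp(-C / np.maximum(diff, 1e-12)**2),
                      beta)                              # along v = theta-, Eq. (11)
        D = (l1[..., None, None] * theta_plus[..., :, None] * theta_plus[..., None, :]
           + l2[..., None, None] * theta_minus[..., :, None] * theta_minus[..., None, :])
        return D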
C. Oriented Heat Flows

Oriented heat flows, also named oriented Laplacian formulations, consider that a local smoothing process can be decomposed into two orthogonal
FIGURE 8. Principle of oriented Laplacians. Two 1D smoothings are done along adapted directions.
and unidimensional heat flows, respectively oriented along two directions u1 and u2 (these vectors forming an orthonormal basis) and associated with two smoothing amplitudes c1 and c2. The smoothing amplitudes and orientations are naturally different for each image point, since they adapt themselves to the local configuration of the image (Figure 8). The resulting equation is written as the sum of these two heat flows:

∂I/∂t = c1 I_{u1u1} + c2 I_{u2u2},   (12)

where u1 and u2 are unit orthogonal vectors and c1, c2 ≥ 0. I_{u1u1} and I_{u2u2} denote the second derivatives of I in the directions u1 and u2; their vector components are formally defined as

∀i = 1, . . . , n,   I_{i,u1u1} = u1^T H_i u1   and   I_{i,u2u2} = u2^T H_i u2,

where H_i is the Hessian of I_i, defined at each point X ∈ Ω by

H_i = ( ∂²I_i/∂x²     ∂²I_i/∂x∂y )     ( I_ixx   I_ixy )
      ( ∂²I_i/∂x∂y    ∂²I_i/∂y²  )  =  ( I_ixy   I_iyy ).   (13)
Here, the diffusion behavior is defined entirely by the knowledge of the smoothing directions u1, u2 and the corresponding weights c1 and c2. This formulation was first proposed by (Kornprobst, 1998; Kornprobst et al., 1996) for the regularization of scalar-valued images I, with the following choices for c1, c2 and u1, u2:

u1 = ξ = ∇I⊥/‖∇I‖, c1 = 1,   and   u2 = η = ∇I/‖∇I‖, c2 = g(‖∇I ∗ G_σ‖),

where g : R → R is a function decreasing to 0 (the pixel diffusion must vanish on high gradients). It allows a permanent anisotropic smoothing along
the edges ξ, even on very high gradients, since c1 = 1 everywhere in the image. The general formalism of oriented Laplacians [Eq. (12)] also allows other well-known equations to be recovered, such as the mean curvature flow ∂I/∂t = I_{ξξ}, obtained with c1 = 1, c2 = 0, u1 = ξ, and u2 = η (Deriche and Faugeras, 1997). Note that this was not possible with divergence-based expressions [Eq. (10)]. Other works on scalar image regularization have used similar variants of this technique (Carmona and Zhong, 1998; Krissian, 2000).

Sapiro and Ringach (1996) proposed an extension of the mean curvature flow I_t = I_{ξξ} to multichannel images, using an oriented Laplacian formulation [Eq. (12)]. They naturally used the Di Zenzo attributes to incorporate information on the multichannel geometry into their proposed equation:

∂I/∂t = g(λ⁺ − λ⁻) I_{θ⁻θ⁻},   (14)

where g : R → R is a positive decreasing function, avoiding the smoothing of high-gradient regions. It was one of the first attempts to construct an oriented Laplacian PDE directly from a local vector-valued geometry. Indeed, all channels I_i are smoothed along a common vector edge direction with a common intensity. However, some drawbacks remain:

• The coherence norm N⁻ = √(λ⁺ − λ⁻) is used here as a measure of vector-valued variations in order to reduce diffusion on image contours. This may not be a good choice, since some corner structures do not respond to the norm N⁻, as explained in Section I.A, and will then be oversmoothed.
• In flat regions (N⁻ → 0), the diffusion is made along a single direction θ⁻, which is directed mainly by the noise, since no coherent structures exist in these regions. Undesired texture effects result from this monodirectional smoothing. This is particularly true here since, contrary to decoupled regularizations, vector components are not blended with this method (the diffusions in all image channels I_i follow a common direction). Isotropic smoothing would be better adapted to remove noise in such flat regions.

D. Trace-Based Diffusion PDEs

A simple generalization of oriented Laplacians has been proposed (Tschumperlé, 2002; Tschumperlé and Deriche, 2003; Tschumperlé and Deriche, 2005). The concept relies on the use of a generic diffusion tensor field T : Ω → P(2) to describe the diffusion geometry of Eq. (12), instead of separately describing the local smoothing directions θ⁺, θ⁻ and amplitudes c1, c2. Actually, the proposed equation is simply a rewrite of the previous PDE (12) using a trace operator:

∂I_i/∂t = trace(T H_i),   ∀i = 1, . . . , n,   (15)
where H_i is the Hessian of I_i [Eq. (13)] and the tensor field T is computed as

∀X ∈ Ω,   T(X) = c1 u1u1^T + c2 u2u2^T.
Note that in this case, each channel I_i of I is again smoothed with a common tensor field T. Equations (12) and (15) are strictly equivalent, but Eq. (15) clearly separates the smoothing geometry (defined by the tensor field T) from the smoothing itself. This is close to Weickert's method that led to the divergence PDE (10): the regularization problem now reduces to the design of a tensor field T adapted to the considered application. But in the case of trace-based PDEs, the tensor field that defines the local smoothing behavior has the interesting property of uniqueness: Two different tensor fields necessarily lead to different smoothing behaviors. Indeed, Eq. (15) has a simple geometric interpretation in terms of local filtering with oriented Gaussian kernels. Let us first consider the case where T is a constant tensor field. It can then be demonstrated that the formal solution of the PDE (15) is (see Appendix B for details)

I_{i(t)} = I_{i(t=0)} ∗ G^(T,t)   (i = 1, . . . , n),   (16)

where ∗ designates the convolution operator and G^(T,t) is an oriented Gaussian kernel, defined by

G^(T,t)(X) = (1/(4πt)) exp( −(X^T T⁻¹ X)/(4t) ),   with X = ( x  y )^T.   (17)
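The kernel of Eq. (17) can be evaluated directly, which is a convenient way to visualize the smoothing geometry encoded by a tensor T. A minimal sketch (ours, with an arbitrary example tensor):

    import numpy as np

    def oriented_gaussian(T, t, size=15):
        # Evaluate G^(T,t)(X) = exp(-X^T T^{-1} X / (4t)) / (4*pi*t)
        # on a (size x size) grid centered at the origin.
        r = size // 2
        y, x = np.mgrid[-r:r + 1, -r:r + 1]
        Tinv = np.linalg.inv(T)
        q = Tinv[0, 0] * x**2 + 2 * Tinv[0, 1] * x * y + Tinv[1, 1] * y**2
        return np.exp(-q / (4 * t)) / (4 * np.pi * t)

    # An anisotropic tensor elongated along the diagonal direction:
    u = np.array([1.0, 1.0]) / np.sqrt(2)
    v = np.array([-u[1], u[0]])
    T = 5.0 * np.outer(u, u) + 0.2 * np.outer(v, v)
    K = oriented_gaussian(T, t=2.0)

Plotting K shows the elliptic footprint discussed next: the kernel is elongated along the principal eigenvector of T.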
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
FIGURE 9. Trace-based PDEs [Eq. (15)] viewed as convolutions by oriented 2D Gaussian kernels.

FIGURE 10. Trace-based PDEs [Eq. (15)] with nonconstant diffusion tensor fields T.
performed by the trace-based PDE (15), point by point. Because the trace is not a differential operator, the local interpretation of the smoothing process as a convolution with an oriented Gaussian mask remains valid here. In the same way that the structure tensor codes, for each image pixel X, the main edge direction θ⁻ as well as the edge strength λ⁺ + λ⁻, the diffusion tensor field T similarly codes the preferred local smoothing directions, as well as the desired smoothing amplitudes along these directions,
for each image pixel X. Of course, T(X) depends on the local geometry of I and is thus defined from the spectral elements λ⁻, λ⁺ and θ⁻, θ⁺ of the smoothed structure tensor G_σ. For image denoising purposes, the choice proposed by (Tschumperlé, 2002; Tschumperlé and Deriche, 2003; Tschumperlé and Deriche, 2005) is

c1 = f⁻(λ⁺, λ⁻) = 1/(1 + λ⁺ + λ⁻)^p1   and   c2 = f⁺(λ⁺, λ⁻) = 1/(1 + λ⁺ + λ⁻)^p2,   with p1 < p2,   (18)
where p1, p2 ∈ R are parameters of the proposed method. At this point, the intended smoothing behavior is:

• If a pixel X is located on an image contour (λ⁺(X) is high), the smoothing at X is performed mostly along the contour direction θ⁻(X) (since f⁺ ≪ f⁻), with a smoothing strength inversely proportional to the contour strength.
• If a pixel X is located in a homogeneous region (λ⁺(X) is low), the smoothing at X is performed in all possible directions (isotropic smoothing), since f⁺ ≃ f⁻ and then T ≃ Id (identity matrix).

This is one possible choice for f⁻, f⁺ satisfying basic image denoising requirements. Actually, it is quite natural to design a smoothing behavior from the image structure before applying the regularization process itself. The trace-based Eq. (15) is a good attempt to separate the smoothing geometry from the smoothing process itself, while providing a geometric interpretation of how the smoothing is performed. It established natural links between PDEs and other local filtering techniques, such as bilateral filtering (Barash, 2000; Tomasi and Manduchi, 1998). Another similar approach, based on non-Gaussian convolution kernels, has also been proposed for the specific case of the Beltrami flow (Sochen et al., 2001).

Nevertheless, the fact that the trace equation [Eq. (15)] behaves locally as an oriented Gaussian smoothing whose strength and orientation are directly related to the tensor T(X) has a major drawback. Indeed, on curved structures (like corners), this Gaussian behavior is not desirable: When the local variation of the edge orientation θ⁻ is high, a Gaussian filter tends to round corners, even when the smoothing is conducted only along θ⁻. This is due to the fact that an oriented Gaussian mask is not itself curved. This classical behavior is best known as the "mean curvature flow" effect, characterized by the PDE ∂I/∂t = ∂²I/∂θ⁻².
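A minimal explicit implementation of one step of the trace PDE (15) with the tensors of Eq. (18) could look as follows (a sketch of ours, not the authors' code; the spectral elements are assumed to come from the smoothed structure tensor, as computed earlier):

    import numpy as np

    def trace_pde_step(I, lam_plus, lam_minus, theta_plus, theta_minus,
                       p1=0.5, p2=1.2, dt=0.1):
        # One explicit step of dI_i/dt = trace(T H_i), Eq. (15), with
        # T = f^- theta^- theta^-T + f^+ theta^+ theta^+T, Eq. (18).
        # theta_plus/theta_minus have shape (H, W, 2).
        s = 1.0 + lam_plus + lam_minus
        f_minus, f_plus = s**(-p1), s**(-p2)
        # tensor components of T = [[a, b], [b, c]]:
        a = f_minus * theta_minus[..., 0]**2 + f_plus * theta_plus[..., 0]**2
        b = (f_minus * theta_minus[..., 0] * theta_minus[..., 1]
           + f_plus * theta_plus[..., 0] * theta_plus[..., 1])
        c = f_minus * theta_minus[..., 1]**2 + f_plus * theta_plus[..., 1]**2
        out = I.copy()
        for i in range(I.shape[2]):
            Iy, Ix = np.gradient(I[..., i])
            Ixx = np.gradient(Ix, axis=1)
            Ixy = np.gradient(Ix, axis=0)
            Iyy = np.gradient(Iy, axis=0)
            out[..., i] += dt * (a * Ixx + 2 * b * Ixy + c * Iyy)  # trace(T H_i)
        return out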
FIGURE 11. Problems encountered using trace-based PDEs [Eq. (15)] on curved image structures. (a) Noisy synthetic color image; (b) applying the trace-based PDE (15), with p1 = 0.5, p2 = 1.2; (c) applying the trace-based PDE (15), with p1 = 0.9, p2 = 1.2; (d) applying our constrained PDE (29), with p1 = 0.5, p2 = 1.2.
This problem is illustrated in Figures 11b and 12b, where Eq. (15) has been applied to synthetic and real color images with T defined as in Eq. (18). It is obvious that the image structures are subject to the mean curvature flow effect, resulting in the rounding of the corners of the square in Figure 11b, and in the blending of parallel thin curved structures in Figure 12b. To avoid this oversmoothing effect, one may try to stop the action of the diffusion PDE on corners (by vanishing tensors T(X) there, that is, f⁻ = f⁺ = 0). But this implies the detection of curved structures in noisy or corrupted images, which is generally imprecise in the presence of noise, even when using the Di Zenzo geometry. Conversely, image undersmoothing on edges may occur when the diffusion is limited too much in regions with high-intensity variations (Figure 11c). There is thus a difficult trade-off between complete noise removal and preservation of curved structures when using trace-based PDEs [Eq. (15)]. These types of regularization processes do not take into account the curvature of the smoothing directions and, by extension, the curvature of the image contours. Taking this curvature into account is very desirable and has motivated the work presented later.
FIGURE 12. Comparison between trace-based PDEs [Eq. (15)] and our new curvature-preserving PDEs [Eq. (29)] on a real image. (a) Image of a fingerprint; (b) applying the trace-based PDE (15), with p1 = 0.5, p2 = 1.2; (c) applying our constrained PDE (29), with p1 = 0.5, p2 = 1.2.
In Section III, we describe a class of trace-based regularization PDEs that smooth an image I along a tensor field T while implicitly taking the curvatures of specific integral curves of T into account. Generally, the method locally filters the image with curved Gaussian kernels when necessary, in order to better preserve the image structures. For comparison purposes, results of the curvature-preserving equation are shown in Figures 11d and 12c.

E. Links Between Existing Regularization Methods

The link between these three formulations is generally not trivial, especially for vector-valued images. It is well known for the classical case of φ-functional regularization of scalar images (n = 1). One can start from a regularizing functional minimization (A) and find the corresponding divergence-based (B) and oriented Laplacian (C) formulations as follows:

(A): min_{I : Ω→R} ∫_Ω φ(‖∇I‖) dΩ

⇒ (B): ∂I/∂t = div( ( φ′(‖∇I‖)/‖∇I‖ ) ∇I )

⇒ (C): ∂I/∂t = ( φ′(‖∇I‖)/‖∇I‖ ) I_{ξξ} + φ″(‖∇I‖) I_{ηη},   (19)
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
173
′
( ∇I ) underlying tensor D = φ ∇I Id in the divergence expression. Also note that this global-to-local path (from variational to trace-based equations) can rarely be followed in the inverse order. For multichannel images, this link also can be found: , (A): min n ψ(λ+ , λ− ) d I : →R
⇒
(B):
⇒
(C):
∂Ii ∂ψ ∂ψ = div(D∇Ii ), where D = θ+ θ+T + θ− θ−T ∂t ∂λ+ ∂λ− n ∂Ii (20) = trace δij D + Qij Hj , ∂t j =1
where the δij is the Kronecker symbol (δij = 1 when i = j , and 0 elsewhere), Qij designates a family of n2 tensors (i, j = 1, . . . , n), defined as the T symmetric parts of the following matrices Pij (i.e., Qij = (Pij + Pij )/2):
∂α ∂α ij T T T P = α∇Ii ∇Ij Id + 2 θ+ θ + + θ− θ− ∇Ij ∇IiT G ∂λ+ ∂λ−
∂β ∂β θ+ θ+T + α + θ− θ−T ∇Ij ∇IiT +2 α+ ∂λ+ ∂λ− with α= and
f1 (λ+ , λ− ) − f2 (λ+ , λ− ) λ+ − λ − f1 =
∂ψ ∂λ+
and
β=
λ+ f2 (λ+ , λ− ) − λ− f1 (λ+ , λ− ) λ+ − λ −
and f2 =
∂ψ . ∂λ−
The development (A) ⇒ (B) from the functional to the divergence formulation is detailed in Appendix A. The development (B) ⇒ (C) from the divergence to the trace-based equation is detailed in Appendix C. This last development initially proposed by (Tschumperlé and Deriche, 2003; Tschumperlé and Deriche, 2005) unifies an entire range of previously proposed vector-valued regularization algorithms (variational and divergence based PDEs) into an extended trace-based equation, composed of several channel-diffusion contributions that have direct geometric interpretations in terms of local filtering by Gaussian kernels. Although the geometric interpretation of the overall sum of trace equations is not direct, it is interesting to see that additional diffusion tensors Qij clearly appear in the trace expressions, and contribute to modify the inner tensor D, which is not representative
174
TSCHUMPERLÉ AND DERICHE
of the smoothing behavior. More generally, tensors appearing in traces and divergences lead to different smoothing behaviors.
III. C URVATURE -P RESERVING PDE S The framework of curvature-preserving PDEs, first introduced by (Tschumperlé, 2006), defines a specific variant of multichannel diffusion PDEs. Its goal is to provide a generic tensor-driven regularization method as the divergence-based PDE (10) and trace-based PDE (15), but also focuses on the preservation of thin curved structures. We review this very efficient formalism and show how it can be understood from a local smoothing geometry viewpoint. A. The Single-Direction Case To illustrate the general idea of curvature-preserving PDEs, we first focus on image regularization along a vector field w : → R2 instead of a tensor field T. We then consider a local smoothing everywhere along a single w direction w , with a smoothing strength w . The two spatial components of w are denoted by w(X) = ( u(X) v(X) )T . The curvature-preserving regularization PDE that smoothes I along w is defined by: ∂Ii = trace wwT Hi + ∇IiT Jww , ∂t where Jw is the Jacobian of w, and Hi is the Hessian of Ii . ⎛ 2 ⎞ 2 $ " ∀i = 1, . . . , n,
Jw =
∂u ∂x ∂v ∂x
∂u ∂y ∂v ∂y
and
Hi = ⎝
∂ Ii ∂x 2 ∂ 2 Ii ∂x∂y
∂ Ii ∂x∂y ∂ 2 Ii ∂y 2
⎠.
(21)
The PDE (21) adds a term ∇IiT Jww to the trace-based Eq. (15) that smoothes I along w with locally oriented Gaussian kernels (see Section II.D). This extra term naturally depends on the variation of the vector field w. The following shows how Eq. (21) is related to w. X be the curve defining the integral curve of w, starting from X and Let C(a) parametrized by a ∈ R: ⎧ X = X, ⎨ C(0) (22) X ⎩ ∂ C(a) = w(C X ). (a) ∂a
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
(a)
175
(b)
F IGURE 13. Integral curve C X of a vector field w : → R2 . (a) Integral curve of a general field w. (b) Example of integral curves when w is the lowest eigenvector of the structure tensor G of a color image I (one block is one color pixel).
X is tracked forward, and backward When a → +∞, the integral curve C(a) when a → −∞ (Figure 13). F denotes the family of integral curves of w. X around a = 0 is: A second-order Taylor development of C(a) X X C(h) = C(0) +h
X ∂ C(a)
∂a
= X + hw(X) +
|a=0
+
2 X h2 ∂ C(a) + O h3 2 2 ∂a |a=0
h2 Jw(X) w(X) + O h3 , 2
with h → 0, and O(hn ) = hn ǫn . Then, we can compute a second-order Taylor X ) around a = 0, which corresponds to the variations of development of Ii (C(a) the image intensity near X when following the integral curve C X :
X h2 Ii C(h) = Ii X + hw(X) + Jw(X) w(X) + O h3 2
h T = Ii (X) + h∇Ii (X) w(X) + Jw(X) w(X) 2 +
h2 trace w(X) wT(X) Hi (X) + O h3 . 2
The term trace(w(X) wT(X) Hi (X) ) = derivative of Ii along w.
∂ 2 Ii ∂w2
corresponds to the second directional
176
TSCHUMPERLÉ AND DERICHE
X ) at a = 0 is then: The second derivative of the function a → Ii (C(a) X )= ∂ 2 Ii (C(a) = = ∂a 2 =
a=0
= lim
h→0
X X 1 X − 2Ii C(0) Ii C(h) + Ii C(−h) h2
1 2 T h ∇Ii Jw(X) w(X) h2 + h2 trace w(X) wT(X) Hi (X) + O h3 = trace w(X) wT(X) Hi (X) + ∇IiT Jw(X) w(X) . = lim
h→0
(23)
Note that this is precisely the right term in the proposed curvature-preserving PDE (21). Eq. (21) can be seen individually for all integral curves of F instead of each point X ∈ ; consider another point Y ∈ C X . Then there exist ǫ ∈ R such X . Indeed, C X and C Y describe the same curve [Eq. (22)] with that Y = C(ǫ) Y = CX different parameterization: ∀a ∈ R, C(a) (ǫ+a) . As Eq. (21) is verified on ∂Ii (C X )
∂ 2 Ii (C X )
Y, then ∂t(a) |a=ǫ = ∂a 2(a) |a=ǫ . This is obviously true for ǫ ∈ R since Eq. (21) is verified for all points Y lying on the integral curve C X . Then, the PDE (21) may be also written as ∀C ∈ F , ∀a ∈ R,
∂ 2 Ii (C(a) ) ∂Ii (C(a) ) = . ∂t ∂a 2
(24)
We may recognize in Eq. (24) a one-dimensional heat flow constrained on C . This is very different from a heat-flow oriented by w, as in the ∂ 2 Ii i formulation ∂I ∂t = ∂w2 since the curvatures of integral curves of w are now implicitly taken into account. In particular, the constrained Eq. (21) has the interesting property of vanishing when image intensities are perfectly constant on the integral curve C , regardless of the curvature of C . In this context, defining a field w that is tangent everywhere to the image structures will allow the preservation of these structures, even if they are curved (such as corners). This is not the case with divergence or trace-based PDEs [Eqs. (10) and (15)]. This curvature-preserving property of Eq. (21) is illustrated in Figures 11d and 12b). The constrained Eq. (21) is an elliptic PDE since the matrix wwT is positive definite. The existence and unicity of the solutions of Eq. (21) are not directly approached here. Section III.B, shows that its solution can be approximated by the technique of LICs, which is a well-posed analytical approach.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
177
B. Curvature-Preserving PDEs and Line Integral Convolutions LICs were first introduced (Cabral and Leedom, 1993) as a technique to render a textured image ILIC that represents a vector field w : → R2 . The idea, originally expressed under a discrete formula, consists in smoothing an image Inoise (containing only noise) by averaging its pixel values along the integral curves of w. A continuous formulation of an LIC is then: ∀X ∈ ,
ILIC (X)
1 = N
,+∞ X f (p)Inoise C(p) dp
(25)
−∞
where f : R → R is an even function (strictly decreasing to 0 on R+ ) and C X is defined as the integral curve [Eq. (22)] of w through X. The normalization factor N allows the preservation of the average pixel value along C X and is 1 +∞ equal to N = −∞ f (p) dp. As noted in Section III.A, the curvature-preserving PDE (21) can be seen as the one-dimensional heat flow [Eq. (24)] constrained on the integral curve X ), Eq. (24) can be C X ∈ F . Using the variable substitution L(a) = I(C(a) ′′ [t] also written as ∂L ∂t (a) = L(a) . The solution L at time t is known to be the convolution of L[t=0] by a normalized Gaussian kernel Gt (see Deriche and Faugeras, 1997; Koenderink, 1984): L[t] (a)
,+∞ L[t=0] = (p) Gt (a−p) dp
with Gt (p)
−∞
p2 . (26) =√ exp − 4t 4π t 1
X = X and Substituting L in Eq. (26) with a = 0, and remembering that C(0) Gt (−p) = Gt (p) :
∀X ∈ ,
I[t] (X)
,+∞ X I[t=0] C(p) = Gt (p) dp.
(27)
−∞
Eq. (27) is a particular form of the continuous LIC-based formulation [Eq. (25)] with a Gaussian1 weighting function f = Gt . Here, the nor+∞ malization factor is N = −∞ Gt (p) dp = 1. Intuitively, the evolution of the curvature-preserving PDE (21) may be seen as the application of local convolutions by normalized one-dimensional Gaussian kernels along integral curves C of w. Thus this type of anisotropic image smoothing considers a curved filtering, instead of just an oriented one. Applying this setting on a multichannel image I, with w being the lowest eigenvector of the structure tensor field G (i.e., the contour direction) allows
178
TSCHUMPERLÉ AND DERICHE
the anisotropic smoothing of I with edge preservation, even if these edges are curved. This is shown in Figure 13b, where few integral lines C X are computed around a typical T-junction structure. Note how the streamlines rotate when arriving at the junction, with a subpixel precision. The streamlines have been computed with a fourth-order Runge–Kutta scheme. Eq. (27) is an analytical solution of Eq. (21) when w does not evolve over time. This property is generally not verified when dealing with general nonlinear regularization PDEs, where the smoothing geometry is reevaluated at each time step (thus defining a temporal nonlinearity). To eliminate this nonlinearity, we perform several successive iterations of the LIC scheme [Eq. (27)], where the vector field w is updated at each iteration. This is a good method of approximating Eq. (21). Classical explicit schemes usually consider the smoothing geometry w as constant between two successive PDE iterations I[t] and I[t+dt] . Thus, the curvature-preserving Eq. (21) is efficiently discretized by several iterations of the LIC formulation [Eq. (27)]. This is detailed in Section IV. C. Between Traces and Divergences The curvature-preserving PDE (21) may be compared to trace and divergence expressions [Eqs. (10) and (15)], for the case of single-direction smoothing T = wwT . In this case, the divergence PDE (10) may be developed as follows: $ " 2 ∂Ii i u ∂x + uv ∂I T ∂y div ww ∇Ii = div 2 ∂Ii i uv ∂I ∂x + v ∂y
∂ 2 Ii ∂ 2 Ii ∂ 2 Ii + v2 2 = u2 2 + 2uv ∂x∂y ∂x ∂y " ∂u ∂v ∂u $ 2u ∂x + u ∂y + v ∂y + ∇IiT ∂v ∂u 2v ∂v ∂y + u ∂x + v ∂x $ " ∂u $! " ∂u u ∂x + u ∂v u ∂x + v ∂u T ∂y ∂y T = trace ww Hi + ∇Ii + ∂v ∂u + v ∂v + v ∂v u ∂x v ∂x ∂y ∂y T = trace ww Hi + ∇IiT Jww + div(w)∇IiT w. This equation is a sum of three different terms that have the following meaning: • The first term corresponds to the trace PDE (15), which smoothes locally I along w, using oriented Gaussian kernels.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
179
• The two first terms correspond to the curvature-constrained regularization PDE (21), which smooths locally I along w while taking the curvature of integral curves C of w into account. • The three terms together correspond to the classical divergence PDE (10), which performs local diffusions of I along w. This last term, div(w)∇IiT w, is mainly responsible for the perturbations of the effective smoothing direction, as described in Section II.B. It is not desirable for image regularization purposes. It is interesting to observe that the curvature-constrained PDE (21) is then “mathematically” positioned between the trace [Eq. (15)] and divergence formulations [Eq. (10)], and allows at the same time to consistently smooth along the predefined smoothing directions w, while preserving curved images structures. The curvature-preserving PDE (21) can also be written as a divergencebased PDE minus a constraint term: trace wwT Hi + ∇IiT Jww = div wwT ∇Ii − div(w)∇IiT w. Two particular cases of directions w merit study, in the case of scalar-valued images (n = 1): ⊥
∇I T • When w = ∇I (isophote direction), then ∇I Jww = −Iww , eliminating the velocity of the curvature-preserving evolution Eq. (21), by counterbalancing the trace-based term (which is nothing more than the mean curvature motion in this case). No smoothing is performed. This is natural since pixels along the isophotes have constant values, so averaging those values should not modify the image. Note by comparison that the velocity of the corresponding divergence-based expression div(wwT ∇Ii ) also is eliminated here. ∇I T • When w = ∇I (gradient direction), then ∇I Jww = 0, and the velocity of the curvature-preserving PDE (21) becomes simply Iww , which corresponds to a smoothing of the image along the gradient direction [the same as the unconstrained trace-based PDE (15)]. Note by comparison that the velocity of the corresponding divergence-based expression is I in this case, which corresponds to an isotropic smoothing of the image, instead of an anisotropic one.
These two particular cases allow better understanding of the difference of regularization behaviors among the trace, divergence, and curvaturepreserving formulations. Note that when w is a divergence-free field (i.e., div(w) = 0), the divergence-based PDE (10) and the curvature-preserving formulation Eq. (21) are strictly equivalent. This is very rarely the case.
180
TSCHUMPERLÉ AND DERICHE
D. Extension to Multidirectional Smoothing In (Tschumperlé, 2006), the single-direction smoothing PDE (21) has been extended so that it can be applied to a tensor-valued geometry T : → P(2), instead of a vector-valued geometry w. This is important since a diffusion tensor describes much more complex smoothing behaviors than single directions. In particular, it may represent both anisotropic or isotropic regularization behaviors. The extension of the curvature-preserving PDE (21) is not straightforward; the notions of curvature and integral curves of tensorvalued fields T are not as natural as with direction fields w. To resolve this problem, we proposed to locally decompose a tensordriven smoothing process into several vector-driven smoothing processes along different orientations. We first notice that ,π
aα aαT dα =
π Id 2
where aα =
α=0
cos α sin α
.
Then, any 2 × 2 tensor T may be written as follows: " ,π $ √ 2√ T T, aα aαT dα T= π α=0
' ' √ T is the square root of T = f + uuT + where T = f + uuT + f − vv√ √ √ − T f vv . It can easily be verify that ( T )2 = T and ( T )T = T. Thus, the tensor T may be decomposed as 2 T= π
,π √
α=0
Taα aαT
√
2 T dα = π T
,π √ √ ( Taα )( Taα )T dα.
(28)
α=0
√ √ The tensor T has been split into a sum of atomic tensors ( Taα )( Taα )T ; each vector √ is purely anisotropic and directed only along the direction of the vector Taα ∈ R2 . Eq. (28) suggests a means to decompose any tensordriven regularization PDE into a sum of single-direction smoothing processes, each of them respecting the overall geometry T. For instance: • If √ T = Id (identity matrix), the tensor is isotropic and: ∀α ∈ [0, π ], Taα = aα . The resulting smoothing will be then performed in all directions aα of the plane with the same strength. • If T =√uuT (where u ∈ S1 ), the tensor is purely anisotropic and: ∀α ∈ [0, π ], Taα = (uT aα )u. The resulting smoothing will be then performed only along the direction u of the tensor T.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
181
Then, using Eq. (28) and considering that each single-direction smoothing must be done with a curvature-preserving approach [Eq. (21)], the following curvature-constrained regularization PDE results, acting on a multichannel image I : → Rn and driven by a tensor-valued smoothing geometry T: ∀i = 1, . . . , n, ,π √ √ √ 2 ∂Ii = trace ( Taα )( Taα )T Hi + ∇IiT J√Taα Taα dα, ∂t π α=0
which can be simplified as ∀i = 1, . . . , n,
2 ∂Ii = trace(THi ) + ∇IiT ∂t π
,π
√ J√Taα Taα dα, (29)
α=0 T
J√Taα
stands for the Jacobian of the where aα = ( cos α sin α ) , and √ vector field → Taα . This type of smoothing decomposition along all orientations of the plane can be also found in (Weickert, 1994). As in the single-direction smoothing case, Eq. (29) may be seen as a trace-based equation [Eq. (15)], where an extra term has been added to respect the curvature of all integral lines passing through the tensor-valued geometry T.
IV. I MPLEMENTATION C ONSIDERATIONS To implement the regularization method in Eq. (29), the LIC-based interpretation of curvature-preserving PDEs presented in Section III.B can be used. Indeed, we can explicitly discretize Eq. (29) by the following Euler scheme: " N −1 $ √ 2dt R( Taα ) I[t+dt] = I[t] + N k=0
where α = kπ/N (in the interval [0, π ]), dt is the usual temporal discretization step, and R(w) represents a discretization of the unidirectional smoothing PDE velocity [Eq. (21)] that preserve curvatures along a vector √ field w. If this N −1 [t] expression is written as I[t+dt] = N1 ( k=0 I + 2dt R( Taα )), it may be expressed √ as the averaging of different Gaussian-pondered LICs along vector fields Taα : $ " N −1 [t] 1 √ , I[t+dt] = I LIC( Taα ) N k=0
182
TSCHUMPERLÉ AND DERICHE
where each Gaussian variance has a standard deviation dt. Basically, the difficulty lies is the LIC computation, which requires the tracking of integral curves of a vector field. Here we used a very simple method based on the classical Runge–Kutta (Press et al., 1992) integration scheme. Faster LIC implementations have been proposed by (Stalling and Hege, 1995) but do not deal with Gaussian-pondering functions, as needed here. This simple observation leads then to the following fast algorithm for the implementation of one iteration of the curvature-preserving PDE (29): • Compute the smoothed structure tensor field Gσ from I[t] : ⎛ ∂I [t] 2 ∂Ii[t] ∂Ii[t] ⎞ i n ∂x ∂x ∂y ⎝ ⎠ Gσ = Gσ ∗ ∂Ii[t] ∂Ii[t] ∂Ii[t] 2 i=1
∂x
∂y
∂y
σ will depend on the noise scale. We used relatively low values (between 0 and 1.5) for our experiments in Section V. • Compute the eigenvalues λ+ , λ− and eigenvectors θ + , θ − of Gσ . • Compute the smoothing geometry tensor field T from Gσ : T=
1 1 T T θ −θ − + θ +θ + . (1 + λ+ + λ− )p1 (1 + λ+ + λ− )p2
• For all α in [0, π ] (discretized with √ a user-fixed step dα ): – Compute the vector field w = Taα . – Perform an LIC of I[t] along C X in the forward and backward directions. • Average all LICs computed in previous steps. The main parameters of the algorithm are p1 , p2 , σ , and dt and the number of PDE iterations nb that are applied. The characteristics of this scheme, compared to the classical finite-difference model are as follows: • It allows the preservation of thin image structures from a numerical point of view; the smoothing is performed along integral curves of w, with a subpixel accuracy. Precise fourth-order Runge–Kutta interpolation (Press et al., 1992) is used to track the integral curves C in the image. • It allows the choice of very large time steps dt, since the scheme we proposed is unconditionally stable. Indeed, dt is simply proportional to the overall smoothing variance of the Gaussian-pondering convolutions done along C ∈ F . • As a result, the regularization algorithm performs very fast. Very few iterations are necessary to obtain the result, even if each iteration is more time consuming. For our applications (presented in Section V), we were able to choose only nb = 1 iteration with very large time steps dt. In fact,
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
(a)
(b)
183
(c)
F IGURE 14. Comparisons between classical explicit PDE schemes and LIC-based implementation of the PDE (29). (a) Noisy color image, (b) regularization using a finite-difference scheme (stopped at t = 100), (c) regularization using the LIC-based scheme (stopped at t = 100).
this leads to a rough approximation of Eq. (29) since we lost the temporal nonlinearity property of the PDE. But for images with little noise, this gave surprisingly good results. The spatial nonlinearity seems to play a more important role than the temporal nonlinearity in the PDE evolution. The smoothing is done as an averaging of multiple LICs in different directions α. The choice of the discretization step dα is important in this context. In regions where the smoothing needs to be primarily anisotropic, only a few values of α are necessary since in all cases, the smoothing is done along the same single direction. But in homogeneous regions needing isotropic smoothing, a smaller dα gives much better results. Practically speaking, we chose dα = 45◦ , which is sufficient for good precision for isotropic smoothing. Figure 14 shows the efficiency of the scheme compared with the classical finite-difference one. A synthetic noisy image is anisotropically smoothed with the PDE (29), with p1 = 0.01 and p2 = 100 (smoothing mostly along isophotes θ− , with a strength of 1). The LIC-based scheme (Figure 14c) clearly better preserves the structure along time t. This is due to the important role played by the subpixel accuracy property of the underlying LIC computation.
V. A PPLICATIONS A wide variety of problems can be resolved by the regularization PDEs proposed throughout this chapter. Particularly, the curvature-preserving regularization approach described in Section III is used in the following examples dealing with color images. We show results of color image denoising, inpainting, and resizing by nonlinear interpolation. Given processing times have been obtained on a 2.8 GHz i686 Intel processor.
184
TSCHUMPERLÉ AND DERICHE
A. Color Image Denoising and Artifact Removal Image denoising is a direct application of regularization methods. Sensor inaccuracies, digital quantifications, or compression artifacts are indeed some of the various noise sources that can affect a digital image, and suppressing them is a desirable goal. Such artifacts generally lead to small random variations that affect pixels of the image and that must be removed. Figures 15 through 19 show how the curvature-preserving PDE framework [Eq. (29)] can be successfully applied to remove such artifacts while preserving the essential structures of the processed images. • Figure 15 shows an application of the regularization method on the 512 × 512 Baboon color image, artificially degraded by adding uncorrelated Gaussian noise on (R, G, B). This color image has been then denoised with Eq. (29). Thanks to the proposed LIC-based numerical implementation, only one PDE iteration has been necessary to denoise the image, with parameters p1 = 0.5, p2 = 0.7, σ = 1.5, and dt = 50. Processing time is 19.3 seconds for the entire image. • Figure 16 illustrates a real case where a color photograph has been digitized from a grainy paper, leading to the apparition of watered effects on the digital picture. Using the PDE-based regularization method [Eq. (29)] allows removal of the grains while preserving fine structures (palm tree leaves). The image shown is a 152 × 133 portion of the original one. Only one PDE iteration was necessary, with p1 = 0.5, p2 = 0.7, σ = 1, and dt = 10. Processing time is 11 seconds for the entire image (size 586 × 367). • Figure 17 shows the restoration of a digital photograph shot under low-light conditions by a cellular telephone. Such devices usually have poor-quality cameras, leading to significant digital noise (more precisely, Poisson noise) on the acquired color images. The processed color image has a size of 262 × 280 and was restored in 4 seconds (one PDE iteration), with parameters p1 = 0.2, p2 = 0.5, σ = 2, and dt = 120. Note that the curvature-preserving PDE (29) adapts itself locally to the multichannel image geometry in order to preserve thin structures while removing the noise quite well. • Image regularization can also be useful when dealing with other types of noise. The enhancement of JPEG-compressed images is an example of interest. Figure 18 illustrates the suppression of compression artifacts in color images. A JPEG-compressed color image with a size 283 × 249, where the JPEG quality ratio has been set to 10%, is processed by the multichannel image regularization algorithm. Usual block effects inherent to the Discrete Cosine Transform (DCT) compression are visible on the
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
185
F IGURE 15. Denoising of the color image Baboon corrupted with artificial Gaussian noise. Noisy color image (left), denoised image (middle), zoom on the eye (right).
compressed image. One PDE iteration is applied then, with p1 = 0.5, p2 = 0.9, σ = 2, and dt = 200, to obtain the regularized result (right). Processing time is 5.4 seconds for the entire image. • Figure 19 illustrates another regularization with a different type of noise. In this case, the regularization method is used to improve a digital true-color image quantified in 256 colors by the Floyd–Steinberg algorithm (size = 355 × 287), which introduces some dithering effects in the image. One PDE iteration has been applied here, with p1 = 0.5, p2 = 0.8, σ = 1, and dt = 30. A 136 × 118 portion of the image is shown. Processing time is 12.8 seconds for the entire image. This kind of reconstruction may be interesting for image-compression algorithms, allowing them to retrieve true color images, even by storing them with a quantified palette. B. Color Image Inpainting Image inpainting is a very new and challenging application that consists of filling in missing image regions (defined by the user) by guessing pixel values such that the reconstructed image still looks natural. Basically, the user provides one color image I : → R3 , and one mask image M : → {0, 1}. The inpainting algorithm must fill in the regions where M(X) = 1, by means of some intelligent interpolations. For example, inpainting algorithms can be used to remove various structures in images (scratches, logos, or real objects) that are usually bigger than other image artifacts. Image inpainting was first proposed as a method based on a variational formulation by Masnou and Morel (Masnou and Morel, 1998), followed by many solutions based on diffusion or transport PDEs (Bertalmio et al., 2000b; Bertalmio et al., 2003; Chan and Shen, 2000b; Chan and Shen, 2001; Chan et al., 2002; Tschumperlé
186
TSCHUMPERLÉ AND DERICHE
F IGURE 16. Denoising of the color image Tunisia desert, containing watered effects. Noisy color image (left), denoised image (right); details are shown on the bottom row.
and Deriche, 2003; Tschumperlé and Deriche, 2005). Other papers related to inpainting without use of PDEs (Ashikhmin, 2001; Criminisi et al., 2003; Jia and Tang, 2003; Wei and Levoy, 2000) among others. In this chapter, we view the inpainting process as a direct application of the proposed curvature-preserving PDE (29). We apply the diffusion equation only on the regions to inpaint, allowing the neighbor pixels of these regions to diffuse inside; a nonlinear completion of the image data along isophote directions is thus naturally done, reconstructing the missing parts of the image, since the performed smoothing tries to follow a coherent multivalued image geometry computed from the known parts of the image. The concept of inpainting with PDEs is shown in Figures 20 through 24. • Figure 20 illustrates a simple case of removal of text from a color image, by guessing the pixel colors behind the text. The mask used here is easily detected from the degraded image by considering only green pixels. The result of the image inpainting is shown on the right. Note how structures have been naturally completed in a coherent way. This example also shows the limitation of the reconstruction with regard to reconstruction of textured regions. For instance, the algorithm has not been able to reconstruct one eye of the woman on the left. This is very difficult task, and it is not
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
187
F IGURE 17. Denoising of the color image Lolotte, real color photograph. Noisy color image (left), denoised image (right); details are shown on the bottom row.
surprising that diffusion equations that perform only local smoothing fail in such complex cases. • Figure 21 shows how the PDE-based inpainting technique can be also used to remove real objects from digital photographs. A 500 × 500 color image (left) is inpainted with a user-defined mask (middle), corresponding to the region overlapping the initial man’s glasses. The inpainted image (right) is obtained in 4 minutes 11 seconds, after 200 iterations of the PDE (29) with parameters p1 = 0.001, p2 = 100, σ = 4, and dt = 150. Note that p1 ≪ p2 encourages smoothing only along the isophote directions with a strength of 1 everywhere. Another similar example of real object removal can be seen in Figure 24, where a cage automatically disappears in a color photograph of a parrot.
188
TSCHUMPERLÉ AND DERICHE
F IGURE 18. Removing blocs artifacts from the JPEG-compressed color image Baby. JPEG color image (left), regularized image (right); details are shown on the bottom row.
F IGURE 19. Removing quantization noise in the quantized color image Penguin. Quantized color image (left), regularized image (right) (details).
• Figures 22 and 23 show the reconstruction capabilities of our inpainting technique. Here, half of the pixels of two color images have been suppressed by masking them with a checkerboard-shaped mask with squares
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
189
F IGURE 20. Removing undesirable text in color image by an inpainting method. Corrupted color image (left), inpainted image (right). Details are shown on the bottom row.
of respective sizes 16 × 16 and 32 × 32. The images are then reconstructed using several iterations of the PDE (29) applied only inside the inpainting masks, with parameters p1 = 0.001, p2 = 100, σ = 4, and dt = 50. This application is very interesting for image transmission within a network. It can be used to generate coherent reconstructed images even if corrupted network packets have been received. For each inpainting result shown, the initialization of the pixel values inside the inpainting masks at t = 0 has been done by white noise. The inpainting algorithm does not depend much on of the initialization step; Eq. (29) diffuses neighborhood values inside the inpainting mask until convergence, and there is then a strong border condition. Different types of initialization (noise, zerofilling, or linear interpolation) did not produce much difference in results. C. Color Image Interpolation These same techniques can easily be used to perform image magnification by edge-preserving interpolation. Starting from a linear or bicubic interpolation
190
TSCHUMPERLÉ AND DERICHE
F IGURE 21. Removing a real object in a color image by an inpainting method. Original color image (left), inpainting mask (middle), inpainted image (right).
F IGURE 22. Reconstruction of 50% of a color image by an inpainting method. Original color image (left), masked color image (middle), reconstruction (right).
of a small image and applying the PDE (29) on the image (except on the original known pixels that form a sparse inpainting mask), allows computation of magnified images that have been regularized while taking their local geometry into account. It allows removal of usual block or jagging effects inherent to classical linear interpolation techniques. This technique is very similar to image inpainting with a very sparse grid for the inpainting mask. • Figure 25 shows one example of image resizing. An original 195 × 173 color image is resized by a factor of 4 with classical nearest-neighbor, linear, and bicubic interpolations, and then by the PDE-based technique (29). The nonlinear regularization filter, driven by the image geometry, clearly allows removal of the aliasing effects usually encountered with simple
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
191
F IGURE 23. Reconstruction of 50% of a color image by an inpainting method. Masked color image (left), reconstruction (right).
F IGURE 24. Removing a real object in a color image by an inpainting method. Original color image (left), inpainting mask (middle), inpainted image (right).
interpolation methods, while correctly preserving the small structures of the image. Note that the original known points of the images are always preserved during the regularizing PDE flow, ensuring that resizing the image back to its original dimension (subsampling) always results in the original input data. D. Flow Visualization A final application of regularization PDEs is presented here for visualization purposes. A 2D vector field F : → R2 can be visualize in several ways. We can first use vector graphics (Figure 26) (left), but we need to subsample the field since this type of representation is not adapted to represent very dense flows. A better solution is as follows. We smooth a completely noisy (color) image I, with a regularizing flow equivalent to Eq. (29) but where T is directed by the directions of F , instead of the local geometry of I. The tensor field D
192
TSCHUMPERLÉ AND DERICHE
(a)
(b)
(c)
(d)
(e) F IGURE 25. Comparisons of image resizing methods. Original thumbnail image (a), nearest-neighbor (b), linear (c), bicubic (d) and PDE-based (e) interpolations.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
193
F IGURE 26. Visualization of a 2D vector field using arrows (left), after 5 PDE iteration (middle), after 15 PDE iteration (right).
in Eq. (29) must be defined as ∀X ∈ ,
D(X) =
1 FF T . F
(30)
This is a field of fully anisotropic tensors, each time oriented along the flow F . This technique is in fact equivalent to using a single LIC filter (Cabral and Leedom, 1993), but it is interesting to note that when the PDE evolution time t goes by, a visualization scale-space of F is explicitly constructed (Figure 27). Here, the regularization Eq. (30) used ensures that the smoothing of the pixels is done exactly in the direction of the flow F . This is not the case in (Becker et al., 2000; Buerkle et al., 2001; Diewald et al., 2000; Preusser and Rumpf, 1999), where the authors proposed a similar idea using a divergencebased expression. Using similar divergence-based techniques raises a risk of smoothing the image in false directions, as noted in Section II.B.
VI. C ONCLUSION Multichannel image regularization is a fundamental process in many image processing and computer vision algorithms. It is then necessary to gain full control of this process, as well as to understand how the equations function from a geometric point of view. This chapter has described the most classical PDE-based methods proposed for the regularization of multichannel images and introduced a very efficient curvature-preserving framework that generally outperforms its competitors. This is due not only to the particular aim of preserving fine and curved structures, but also thanks to the proposed numerical scheme that is especially efficient since it works at a subpixel level. Clearly, this kind of multichannel image regularization technique can play a role in many image-processing applications. The processing time, which was
194
TSCHUMPERLÉ AND DERICHE
F IGURE 27.
Visualization scale-space generated with regularization PDEs.
one of the main drawbacks of PDE-based methods, is no longer problematic. All these reasons make the framework of multivalued diffusion PDEs a very good choice for image regularization purposes. This chapter has shown the results on color image denoising, inpainting, and resizing, but many other applications may benefit from the proposed curvature-preserving framework. Other application results of the curvature-preserving algorithm can be found at the following web page: http://www.greyc.ensicaen.fr/~dtschump/ greycstoration/. The binaries of the algorithm can be also downloaded and tested on different architectures, as well as the source code (C++), which are available as a part of the open source image-processing library: The CImg Library.1 1 Tschumperlé, D., The CImg Library. Available at: http://cimg.sourceforge.net. The C++ Template Image Processing Library.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
195
A PPENDIX A In this Appendix, we demonstrate how the minimization of the functional , (A.1) min n E(I) = ψ(λ+ , λ− ) d I : →R
can be performed by a gradient descent and its corresponding PDE flow. The Euler–Lagrange equations corresponding to the functional Eq. (A.1) are: " ∂ψ $ ∂Iix ∂Ii (i = 1, . . . , n). (A.2) = div ∂ψ ∂t ∂Iiy
∂ψ T ∂Iiy )
∂ψ ( ∂I ix
Actually, the vector , can be written in a more comprehensive form. From the chain-rule property of the derivation, we have: " ∂ψ $ " ∂λ+ ∂λ− $ " ∂ψ $ ∂Iix ∂ψ ∂Iiy
=
∂Iix ∂λ+ ∂Iiy
∂Iix ∂λ− ∂Iiy
∂λ+ ∂ψ ∂λ−
(A.3)
.
∂ψ We know formally the expressions ∂λ since the function ψ is directly defined ± from the λ± . ∂λ± ± Finding the ∂I and ∂λ ∂Iiy is more difficult. Here is a simple way to proceed. ix As the λ± are the eigenvalues of the structure tensor G = (gkl ), we may decompose its derivatives (with respect to Iix and Iiy ), in terms of derivatives with respect to the gkl : ∂λ± ∂λ± ∂gkl ∂λ± ∂λ± ∂gkl = and = . (A.4) ∂Iix ∂gkl ∂Iix ∂Iiy ∂gkl ∂Iiy k,l
k,l
∂gkl The expressions ∂I and ix ⎧ 11 ⎨ ∂g ∂Ii = 2Iix , x
⎩ ∂g11 = 0, ∂Ii y
∂gkl ∂Iiy
are particularly simple: ⎧ ⎧ 12 22 ⎨ ∂g ⎨ ∂g ∂Iix = Iiy , ∂Iix = 0, and and ⎩ ∂g12 = Iix , ⎩ ∂g22 = 2Iiy , ∂Ii ∂Ii y
that is, Eq. (A.4) can be written as: " ∂λ± $ " ∂Iix ∂λ± ∂Iiy
=
∂λ± 2 ∂g 11 ∂λ± ∂g12
y
∂λ± ∂g12 ∂λ± 2 ∂g 22
$
∇Ii .
(A.5)
± Thus, one last obstacle remains: finding the formal expressions of ∂λ ∂gkl . Remember that λ± and θ± are the eigenvalues and eigenvectors of the
196
TSCHUMPERLÉ AND DERICHE
structure tensor G: G = λ+ θ+ θ+T + λ− θ− θ−T . The derivation of this tensor, with respect to one of its coefficient gkl is: ∂θ+ T ∂θ− T ∂G ∂λ+ ∂λ− = θ+ θ+T + θ− θ−T + λ+ θ + + λ− θ ∂gkl ∂gkl ∂gkl ∂gkl ∂gkl − + λ + θ+
∂θ T ∂θ+T + λ− θ− − . ∂gkl ∂gkl
(A.6)
Moreover, as the θ± are unitary and orthogonal eigenvectors, we have
θ+T θ+ = θ−T θ− = 1,
θ+T θ− = θ−T θ+ = 0,
and
⎧ T ⎨ ∂θ+ θ+ = θ+T ∂θ+ = 0, ∂gkl ∂gkl ⎩ ∂θ−T
T ∂θ− ∂gkl θ− = θ− ∂gkl = 0.
(A.7)
We first multiply Eq. (A.6) by θ±T at the left, by θ± at the right, and then use the properties of Eq. (A.7). It allows high simplifications and leads to these two relations: ∂G ∂λ+ = θ+T θ+ ∂gkl ∂gkl
and
∂G ∂λ− = θ−T θ− . ∂gkl ∂gkl
(A.8)
The relations in Eq. (A.8) formally show how eigenvalues of a diffusion tensor G vary with respect to a particular coefficient gkl of G. This interesting property can be proved for any symmetric matrix. For instance, authors of (Papadopoulo and Lourakis, 2000) proposed a similar demonstration in a purely matrix form, leading to the same result. They used it to deal with general covariance matrices. ∂G are very simple to write: Moreover, in our case the matrices ∂g kl ∂G = ∂g11
1 0
0 0
,
∂G = ∂g12
0 1
1 0
,
and
∂G = ∂g22
0 0 0 1
.
With all these elements, we can express (A.5) as " ∂λ+ $ ∂Iix ∂λ+ ∂Iiy
= 2θ+ θ+T ∇Ii
and
" ∂λ− $ ∂Iix ∂λ− ∂Iiy
= 2θ− θ−T ∇Ii .
(A.9)
Finally, replacing Eq. (A.9) in the Euler–Lagrange Eqs. (A.3) and (A.2) gives the vector-valued gradient descent of the functional [Eq. (8)]:
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
min
I : →Rn
,
197
ψ(λ+ , λ− ) d
+ * ∂ψ ∂Ii ∂ψ T T = 2 div ⇒ θ+ θ + + θ− θ− ∇Ii ∂t ∂λ+ ∂λ− (for i = 1, . . . , n).
(A.10)
Note that Eq. (A.10) is a divergence-based equation such that ∂Ii = div(D∇Ii ), ∂t
where D = 2
∂ψ ∂ψ θ+ θ+T + 2 θ− θ−T ∂λ+ ∂λ−
D ∈ P(2) is then a 2 × 2 diffusion tensor, whose eigenvalues are λ1 = 2
∂ψ ∂λ+
and
λ2 = 2
∂ψ ∂λ−
associated with these corresponding orthonormal eigenvectors: u1 = θ +
and u2 = θ− .
Computing this gradient descent is done exactly in the same way when dealing with image domains defined in higher-dimensional spaces ( ⊂ Rp where p > 2). More particularly, the case of 3D volume regularization (p = 3) can be written as , min n ψ(λ1 , λ2 , λ3 ) d I : →R
⇒
+ * ∂Ii ∂ψ ∂ψ ∂ψ T T T = 2 div θ1 θ + θ2 θ + θ3 θ ∇Ii . ∂t ∂λ1 1 ∂λ2 2 ∂λ3 3
In this case, the λ1,2,3 are the three eigenvalues of the 3 × 3 structure tensor G, and θ1,2,3 are the corresponding orthonormal eigenvectors.
A PPENDIX B This appendix demonstrates that the solution of the generic trace-based PDE ∀i = 1, . . . , n,
∂Ii = trace(THi ) ∂t
is the convolution of the image I Ii(t) = Ii(t=0) ∗ G(T,t)
(i = 1, . . . , n)
198
TSCHUMPERLÉ AND DERICHE
by an oriented Gaussian kernel G(T,t) defined as
1 XT T−1 X G(T,t) (X) = exp − with X = ( x 4π t 4t
y )T .
To demonstrate this, we simply derive the kernel expression G(T,t) in time and in space
1 XT T−1 X XT T−1 X ∂G(T,t) =− 1 − exp − ∂t 4t 4t 4π t 2 and
⎧ ⎨ ∇G(T,t) = − 1 2 exp − XT T−1 X T−1 X, 4t 8πt ⎩ H (T,t) = − 1 exp− XT T−1 X T−1 I − d G 4t 8πt 2
XXT T−1 2t
,
where ∇G(T,t) and HG(T,t) are, respectively, the gradient and the Hessian of G(T,t) . This means that
XXT T−1 1 XT T−1 X trace Id − trace(THG(T,t) ) = − exp − 4t 2t 8π t 2
T −1 T −1 1 X T X X T X =− 2− exp − 4t 2t 8π t 2 ∂G(T,t) . ∂t As the convolution is a linear operation, we have =
∂(Ii0 ∗ G(T,t) ) ∂G(T,t) = Ii0 ∗ = Ii0 ∗ trace(THG(T,t) ) ∂t ∂t = trace(THIi ∗G(T,t) ) 0
as well as lim Ii(t) ∗ G(T,t) = Ii0 ,
t→0
which tells us that the initial condition at t = 0 is coherent both for the PDE and the convolution process, since the Gaussian function G(T,t) is normalized. This statement is thus true for each instant t of the PDE flow.
A PPENDIX C This appendix develops tensor-driven divergence PDEs into their tracebased counterpart. Most divergence-based regularization PDEs acting on
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
199
multivalued images have the following form: ∂Ii = div(D∇Ii ) (i = 1, . . . , n), (C.1) ∂t where D is a diffusion tensor based only onfirst-order operators. D is often computed from the structure tensor G = nj=1 ∇Ij ∇IjT and depends mainly on the spatial derivatives Iix and Iiy . Intuitively, as the divergence ∂ ∂ + ∂y is itself a first-order derivative operator, we should be able div( ) = ∂x to write Eq. (C.1) only with first and second spatial derivatives Iix , Iiy , Iixx , Iixy , and Iiyy . Thus, it could be expressed with oriented Laplacians in each image channel Ii as well, that is an expression based on the trace operator ∂Ii ∂t = trace(DHi ). We want to make the link between the two different diffusion tensors D and T in the divergence-based and trace-based regularization PDEs, in the case when D is not constant: ∂Ii ∂Ii = div(D∇Ii ) and = trace(THi ). ∂t ∂t As noted in the previous section, these two formulations are almost equivalent, up to an additional term depending on the variation of the tensor field D: −−→
−−→
div(D∇Ii ) = trace(DHIi ) + ∇IiT div(D),
(C.2)
where div( ) is the matrix divergence. −−→ A natural progression is to then decompose the additional term ∇IiT div(D) into oriented Laplacians, expressed with additional diffusion tensors Q in the trace operator. For this purpose, we consider that the divergence tensor D is defined at each point X ∈ by D = f1 (λ+ , λ− )θ+ θ+T + f2 (λ+ , λ− )θ− θ−T
with f1/2 : R2 → R. (C.3)
It means that D is only expressed from the eigenvalues λ± and the eigenvectors θ± of the structure tensor G: G = λ+ θ+ θ+T + λ− θ− θ−T . This is indeed a very generic hypothesis that is verified by the majority of the proposed vector-valued regularization methods, for instance, the one proposed in Appendix A: ⎧ ⎨ f1 (λ+ , λ− ) = 2 ∂ψ , ∂Ii ∂λ+ = div(D∇Ii ) with Eq. (C.3) and ⎩ f2 (λ+ , λ− ) = 2 ∂ψ . ∂t ∂λ−
200
TSCHUMPERLÉ AND DERICHE −−→
In order to develop the additional diffusion term ∇IiT div(D) in Eq. (C.2), we propose to write D as a linear combination of G and Id : D = α(λ+ , λ− )G + β(λ+ , λ− )Id
(C.4)
that is, we separate the isotropic and anisotropic parts of D, with f1 (λ+ , λ− ) − f2 (λ+ , λ− ) and λ+ − λ − λ+ f2 (λ+ , λ− ) − λ− f1 (λ+ , λ− ) β= . λ+ − λ −
α=
Thus
(C.5)
f1 − f2 λ+ θ+ θ+T + λ− θ− θ−T λ+ − λ − λ+ f2 − λ− f1 + θ+ θ+T + θ− θ−T λ+ − λ − 1 = θ+ θ+T (λ+ f1 − λ− f1 ) + θ− θ−T (λ+ f2 − λ− f2 ) λ+ − λ −
αG + βId =
= f1 θ+ θ+T + f2 θ− θ−T = D.
Here we assumed that λ+ = λ− (i.e., the structure tensor G) is anisotropic. If G is isotropic, an isotropic diffusion tensor D is usually chosen too, in the divergence operator of Eq. (C.2), i.e., f1 (λ+ , λ− ) = f2 (λ+ , λ− ). In this case, we choose α = 0 and β = f1 (λ+ , λ− ). −−→ This decomposition is useful to rewrite the matrix divergence div(D) as: −−→
−−→
div(D) = α div(G) + G∇α + ∇β and the additional term of Eq. (C.2) would be computed as follows: −−→ −−→ ∇I T div(D) = trace div(D)∇IiT −−→ = α trace div(G)∇IiT + trace G∇α∇IiT + trace ∇β∇IiT .
(C.6)
(C.7) (C.8) (C.9)
In the following, we propose to find formal expressions of Eqs. (C.7)–(C.9).
• First, remember that the structure tensor G is defined as: G=
n j =1
∇Ij ∇IjT .
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
201
We have then: −−→
div(G) = = = =
2 n Ijx −−→ div Ijx Ijy j =1
Ijx Ijy Ij2y
n 2Ijx Ijxx + Ijx Ijyy + Ijy Ijxy Ijx Ijxy + Ijy Ijxx + 2Ijy Ijyy j =1
n Ijx (Ijxx + Ijyy ) I I + Ijy Ijxy + jx jxx Ijx Ijxy + Ijy Ijyy Ijy (Ijxx + Ijyy ) j =1
n j =1
Ij ∇Ij + Hj ∇Ij ,
where Ij and Hj are, respectively, the Laplacian and the Hessian of the image component Ij . Then, we can write Eq. (C.7) as n −−→ α trace Hj ∇IiT ∇Ij Id + ∇Ij ∇IiT . α trace div(G)∇IiT = j =1
(C.10)
• We finally need to compute ∇α and ∇β in Eqs. (C.8) and (C.9). This can be done by the decomposition ∇α =
∂α ∂α ∇λ+ + ∇λ− ∂λ+ ∂λ−
and
∇β =
∂β ∂β ∇λ+ + ∇λ− ∂λ+ ∂λ− (C.11)
and as the λ± , eigenvalues of the structure tensor G, depend on the Ijx and Ijy :
λ±x λ± y ⎞ ⎛ ∂λ ∂λ± ± n ∂Ijx Ijxx + ∂Ijy Ijxy ⎠ ⎝ = ∂λ± ∂λ± I + I j =1 ∂Ijx jxy ∂Ijy jyy " $ ∂λ ± n ∂Ix HIj ∂λ±j . =
∇λ± =
j =1
∂Iyj
202
TSCHUMPERLÉ AND DERICHE
In Appendix A, we derived eigenvalues of a structure tensor G, with respect to the spatial image derivatives. We derived the following relation: " ∂λ± $ ∂Ixj ∂λ± ∂Iyj
= 2θ± θ±T ∇Ij .
Then, ∇λ± =
n j =1
2Hj θ± θ±T ∇Ij .
(C.12)
We can replace Eq. (C.12) into the expressions of Eq. (C.11) to find the spatial gradients of α and β: ∂α ∂α ∇α = nj=1 2Hj ∂λ θ+ θ+T + ∂λ θ+ θ+T ∇Ij , + − (C.13) ∂β T + ∂β θ θ T ∇I . θ θ ∇β = nj=1 2Hj ∂λ j + + + + ∂λ− +
Using Eq. (C.13), we finally compute−−→ the two missing parts of Eqs. (C.8) and (C.9) of the additional term ∇IiT div(D): ∂α ∂α θ+ θ+T + ∂λ θ− θ−T ∇Ij ∇IiT , trace(G∇α∇IiT ) = nj=1 trace 2GHj ∂λ + − ∂β ∂β θ+ θ+T + ∂λ θ− θ−T ∇Ij ∇IiT . trace(∇β∇IiT ) = nj=1 trace 2Hj ∂λ + −
(C.14) • The final step consists in combining Eqs. (C.10) and (C.14) to express the −−→ additional term ∇IiT div(D) in the PDE (C.2). −−→
∇IiT div(D) =
n j =1
trace Hj Pij ,
(C.15)
where the Pij are the following 2 × 2 matrices: Pij = α∇IiT ∇Ij Id
∂α ∂α T T θ+ θ + + θ− θ− ∇Ij ∇IiT G +2 ∂λ+ ∂λ−
∂β ∂β T T θ+ θ + + α + θ− θ− ∇Ij ∇IiT . +2 α+ ∂λ+ ∂λ− (C.16) Note that the indices i, j in the notation Pij do not designate the coefficients of a matrix P, but the parameters of the family consisting of n2 matrices Pij (each of them is a 2 × 2 matrix).
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
203
The matrices Pii are symmetric, but generally not the Pij (where i = j ), since the gradients ∇Ii and ∇Ij are not usually aligned. However, we want to express Eq. (C.15) only with symmetric matrices, in order to interpret it as a sum of local smoothing processes oriented by diffusion tensors. Fortunately, the trace operator has this simple property:
A + AT trace(AH) = trace H , 2 where (A + AT )/2 is a 2 × 2 symmetric matrix (the symmetric part of A). Thus, we define the symmetric matrices Qij , corresponding to the symmetric parts of the Pij : Pij + Pij Q = 2
T
ij
(C.17)
and derive −−→
∇IiT div(D) =
n j =1
trace Hj Qij .
Finally, the divergence-based PDE (C.2) can be written as: div(D∇Ii ) =
n j =1
trace δij D + Qij Hj ,
(C.18)
where δij is the Kronecker’s symbol: 0 if i = j, δij = 1 if i = j. This completes the link between divergence PDEs and sums of atomic trace-based PDEs. A direct geometric interpretation of Eq. (C.18) is not direct.
R EFERENCES Alvarez, L., Guichard, F., Lions, P.L., Morel, J.M. (1993). Axioms and fundamental equations of image processing. Arch. Ration. Mech. Anal. 123 (3), 199–257. Alvarez, L., Lions, P.L., Morel, J.M. (1992). Image selective smoothing and edge detection by nonlinear diffusion (II). SIAM J. Numer. Anal. 29, 845– 866.
204
TSCHUMPERLÉ AND DERICHE
Alvarez, L., Mazorra, L. (1994). Signal and image restoration using shock filters and anisotropic diffusion. SIAM J. Numer. Anal. 31 (2), 590–605. Aubert, G., Kornprobst, P. (2002). Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations. In: Applied Mathematical Sciences, vol. 147. Springer-Verlag. Ashikhmin, M. (2001). Synthesizing natural textures. In: ACM Symposium on Interactive 3D Graphics. Research Triangle Park, NC, pp. 217–226. Barash, D. (2000). Bilateral filtering and anisotropic diffusion: Towards a unified viewpoint. Technical Report, HP Laboratories Israel. Becker, J., Preusser, T., Rumpf, M. (2000). PDE methods in flow simulation post processing. Comput. Vis. Sci. 3 (3), 159–167. Bertalmio, M., Cheng, L.T., Osher, S., Sapiro, G. (2000a). Variational problems and partial differential equations on implicit surfaces: The framework and examples in image processing and pattern formation. UCLA Research Report. Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C. (2000b). Image inpainting. In: Proceedings of the SIGGRAPH. ACM Press, Addison–Wesley– Longman, pp. 417–424. Bertalmio, M., Cheng, L.T., Osher, S., Sapiro, G. (2001). Variational problems and partial differential equations on implicit surfaces. Comput. Vis. Sci. 174 (2), 759–780. Bertalmio, M., Vese, L., Sapiro, G., Osher, S. (2003). Simultaneous structure and texture image inpainting. IEEE Trans. Image Process. 12 (8), 882–889. Black, M.J., Sapiro, G., Marimont, D.H., Heeger, D. (1998). Robust anisotropic diffusion. IEEE Trans. Image Process. 7 (3), 421–432. Blanc-Feraud, L., Charbonnier, P., Aubert, G., Barlaud, M. (1995). Nonlinear image processing: Modeling and fast algorithm for regularization with edge detection. In: Proceedings of the International Conference on Image Processing (ICIP), Washington, DC, 1995, pp. 474–477. Blomgren, P. (1998). Total Variation Methods for Restoration of Vector Valued Images. PhD dissertation, Department of Mathematics, University of California, Los Angeles. Blomgren, P., Chan, T.F. (1998). Color TV: Total variation methods for restoration of vector-valued images. IEEE Trans. Image Process. 7 (3), 304–309. Buerkle, D., Preusser, T., Rumpf, M. (2001). Transport and diffusion in timedependent flow visualization. In: Proceedings IEEE Visualization. Cabral, B., Leedom, L.C. (1993). Imaging vector fields using line integral convolution SIGGRAPH’93. Computer Graphics 27, 263–272. Carmona, R., Zhong, S. (1998). Adaptive smoothing respecting feature directions. IEEE Trans. Image Process. 7 (3), 353–358. Chambolle, A., Lions, P.L. (1997). Image recovery via total variation minimization and related problems. Numer. Math. 76 (2), 167–188.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
205
Chan, T., Kang, S.H., Shen, J. (2002). Euler’s elastica and curvature based inpainting. SIAM J. Appl. Math. Chan, T.F., Kang, S.H., Shen, J. (2000). Total variation denoising and enhancement color images based on the CB and HSV color models. J. Vis. Comm. Image Represent. 12 (4). Chan, T.F., Shen, J. (2001). Non-texture inpainting by curvature-driven diffusions (CDD). J. Vis. Comm. Image Represent. 12 (4), 436–449. Chan, T., Shen, J. (2000a). Variational restoration of non-flat image features: Models and algorithms. SIAM J. Appl. Math. 61 (4), 1338–1361. Chan, T., Shen, J. (2000b). Mathematical models for local deterministic inpaintings. Technical Report 00-11. Department of Mathematics, UCLA, Los Angeles. Charbonnier, P., Aubert, G., Blanc-Féraud, M., Barlaud, M. (1994). Two deterministic half-quadratic regularization algorithms for computed imaging. In: Proceedings of the International Conference on Image Processing (ICIP), vol. II, pp. 168–172. Charbonnier, P., Blanc-Féraud, L., Aubert, G., Barlaud, M. (1997). Deterministic edge-preserving regularization in computed imaging. IEEE Trans. Image Process. 6 (2), 298–311. Chefd’hotel, C., Tschumperlé, D., Deriche, R., Faugeras, O. (2004). Regularizing flows for constrained matrix-valued images. J. Math. Imaging Vision 20 (2), 147–162. Chu, M.T. (1990a). A list of matrix flows with applications. Technical Report. Department of Mathematics, North Carolina State University. Chu, M.T. (1990b). Matrix differential equations: A continuous realization process for linear algebra problems. Technical Report. Department of Mathematics, North Carolina State University. Coulon, O., Alexander, D.C., Arridge, S.R. (2001). A regularization scheme for diffusion tensor magnetic resonance images. In: 17th International Conference on Information Processing in Medical Imaging, Lecture Notes in Computer Science, vol. 2082. Springer, pp. 92–105. Criminisi, A., Perez, P., Toyama, K. (2003). Object removal by exemplarbased inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 721–728. Deriche, R., Faugeras, O. (1997). Les EDP en traitement des images et vision par ordinateur. Traitement Signal 13 (6). Diewald, U., Preusser, T., Rumpf, M. (2000). Anisotropic diffusion in vector field visualization on Euclidian domains and surfaces. IEEE Trans. Vis. Comput. Graphics 6 (2), 139–149. Di Zenzo, S. (1986). A note on the gradient of a multi-image. Comput. Vision, Graphics Image Process. 33, 116–125.
206
TSCHUMPERLÉ AND DERICHE
Gilboa, G., Sochen, N., Zeevi, Y. (2002). Forward-and-backward diffusion processes for adaptative image enhancement and denoising. IEEE Trans. Image Process. Hadamard, J. (1923). Lectures on the Cauchy Problem in Linear Partial Differential Equations. Yale University Press, New Haven. Jia, J., Tang, C.K. (2003). Image repairing: Robust image synthesis by adaptive ND tensor voting. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 643–650. Kichenassamy, S. (1997). The Perona–Malik paradox. SIAM J. Appl. Math. 57 (5), 1328–1342. Kimmel, R., Malladi, R., Sochen, N. (1998). Image processing via the Beltrami operator. In: Proceedings of the 3rd Asian Conference on Computer Vision, vol. 1. Hong Kong, pp. 574–581. Kimmel, R., Malladi, R., Sochen, N. (2000). Images as embedded maps and minimal surfaces: Movies, color, texture, and volumetric medical images. Int. J. Comput. Vision 39 (2), 111–129. Kimmel, R., Sochen, N. (1999). Geometric-variational approach for color image enhancement and segmentation. In: Scale-Space Theories in Computer Vision, Scale-Space’99, Lecture Notes in Computer Science, vol. 1682. Springer, pp. 295–305. Kimmel, R., Sochen, N. (2002). Orientation diffusion or how to comb a porcupine. J. Vis. Commun. Image Represent. 13, 238–248. Koenderink, J.J. (1984). The structure of images. Biol. Cybernet. 50, 363–370. Kornprobst, P. (1998). Contributions à la Restauration d’Images et à l’Analyse de Séquences: Approches Variationnelles et Solutions de Viscosité. PhD thesis, Université de Nice-Sophia Antipolis. Kornprobst, P., Deriche, R., Aubert, G. (1996). Image restoration via PDEs. In: First Annual Symposium on Enabling Technologies for Law Enforcement and Security—SPIE Conference 2942. Boston, Massachusetts. Kornprobst, P., Deriche, R., Aubert, G. (1997). Nonlinear operators in image restoration. In: Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. Puerto Rico, pp. 325–331. Koschan, A. (1995). A comparative study on color edge detection. In: Proceedings of the 2nd Asian Conference on Computer Vision, ACCV’95, pp. 574–578. Krissian, K. (2000). Multiscale analysis: Application to medical imaging and 3D vessel detection. PhD Thesis. INRIA, Sophia Antipolis. Lindeberg, T. (1994). Scale-Space Theory in Computer Vision. Kluwer Academic Publishers. Masnou, S., Morel, J.-M. (1998). Level lines based disocclusion. IEEE Int. Conf. Image Process. 3, 259–263.
ANISOTROPIC DIFFUSION PARTIAL DIFFERENTIAL EQUATIONS
207
Meyer, Y. (2001). Oscillatory Patterns in Image Processing and Nonlinear Evolution Equations. University Lecture Series, vol. 22. American Mathematical Society, Providence, RI. Nielsen, M., Florack, L., Deriche, R. (1997). Regularization, scale-space and edge detection filters. J. Math. Imaging Vision 7 (4), 291–308. Nikolova, M., (2001). Image restoration by minimizing objective functions with nonsmooth data-fidelity terms. In: IEEE Workshop on Variational and Level Set Methods. Vancouver, Canada, pp. 11–19. Nikolova, M., Ng, M. (2001). Fast image reconstruction algorithms combining half-quadratic regularization and preconditioning. In: Proceedings of the International Conference on Image Processing. IEEE Signal Processing Society. Papadopoulo, T., Lourakis, M.I.A. (2000). Estimating the Jacobian of the Singular Value Decomposition: Theory and Applications. Research Report 3961. INRIA, Sophia Antipolis. Pardo, A., Sapiro, G. (2000). Vector probability diffusion. In: Proceedings of the International Conference on Image Processing, IEEE Signal Processing Society. Perona, P. (1998). Orientation diffusions. IEEE Trans. Image Process. 7 (3), 457–467. Perona, P., Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Machine Intell. 12 (7), 629–639. Poynton, C.A. (1995). Poynton’s colour FAQ, www.inforamp.net/poynton [Web page]. Preusser, T., Rumpf, M. (1999). Anisotropic nonlinear diffusion in flow visualization. In: IEEE Visualization Conference. Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T. (1992). Runge– Kutta method. In: Numerical Recipes in FORTRAN: The Art of Scientific Computing. Cambridge University Press, pp. 704–716. ter Haar Romeny, B.M. (1994). Geometry-driven diffusion in computer vision. In: Computational Imaging and Vision. Kluwer Academic Publishers. Rudin, L., Osher, S., Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D 60, 259–268. Sapiro, G. (1996). Vector-valued active contours. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition. San Francisco, pp. 680–685. Sapiro, G. (1997). Color snakes. Comput. Vision Image Understanding 68 (2). Sapiro, G. (2001). Geometric Partial Differential Equations and Image Analysis. Cambridge University Press. Sapiro, G., Ringach, D.L. (1996). Anisotropic diffusion of multivalued images with applications to color filtering. IEEE Trans. Image Process. 5 (11), 1582–1585.
208
TSCHUMPERLÉ AND DERICHE
Sochen, N., Kimmel, R., Bruckstein, A.M. (2001). Diffusions and confusions in signal and image processing. J. Math. Imaging Vision 14 (3), 195–209. Sochen, N. (2001). On affine invariance in the Beltrami framework for vision. In: IEEE Workshop on Variational and Level Set Methods. Vancouver, Canada, pp. 51–56. Sochen, N., Kimmel, R., Malladi, R. (1998). A geometrical framework for low level vision. IEEE Trans. Image Process. [Special Issue on PDE based Image Processing] 7 (3), 310–318. Stalling, D., Hege, H.C. (1995). Fast and resolution independent line integral convolution. In: ACM SIGGRAPH, 22nd Annual Conference on Computer Graphics and Interactive Technique, pp. 249–256. Tang, B., Sapiro, G., Caselles, V. (1998). Direction diffusion. In: International Conference on Computer Vision. Tang, B., Sapiro, G., Caselles, V. (2000). Diffusion of general data on non-flat manifolds via harmonic maps theory: The direction diffusion case. Int. J. Comput. Vision 36 (2), 149–161. Teboul, S., Blanc-Féraud, L., Aubert, G., Barlaud, M. (1998). Variational approach for edge-preserving regularization using coupled PDEs. IEEE Trans. Image Process. [Special Issue on PDE based Image Processing] 7 (3), 387–397. Tikhonov, A.N. (1963). Regularization of incorrectly posed problems. Sov. Math. Dokl. 4, 1624–1627. Tomasi, C., Manduchi, R. (1998). Bilateral filtering for gray and color images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 839–846. Tschumperlé, D. (2002). PDE’s Based Regularization of Multivalued Images and Applications. PhD thesis. Université de Nice Sophia Antipolis. Tschumperlé, D. (2006). Fast anisotropic smoothing of multi-valued images using curvature-preserving PDEs. Int. J. Comput. Vision 68 (1), 65–82. Tschumperlé, D., Deriche, R. (2001a). Constrained and unconstrained PDE’s for vector image restoration. In: Proceedings of the 10th Scandinavian Conference on Image Analysis, Bergen, Norway, pp. 153–160. Tschumperlé, D., Deriche, R. (2001b). Diffusion tensor regularization with constraints preservation. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Kauai, Hawaii. Tschumperlé, D., Deriche, R. (2001c). Regularization of orthonormal vector sets using coupled PDEs. In: IEEE Workshop on Variational and Level Set Methods. Vancouver, Canada, pp. 3–10. Tschumperlé, D., Deriche, R. (2002a). Diffusion PDE’s on vector-valued images: Local approach and geometric viewpoint. IEEE Signal Process. Mag. 19 (5), 16–25. Tschumperlé, D., Deriche, R. (2002b). Orthonormal vector sets regularization with PDE’s and aplications. Int. J. Comput. Vision.
Tschumperlé, D., Deriche, R. (2003). Vector-valued image regularization with PDEs: A common framework for different applications. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 651–656.
Tschumperlé, D., Deriche, R. (2005). Vector-valued image regularization with PDEs: A common framework for different applications. IEEE Trans. Pattern Anal. Machine Intell. 27 (4).
Vese, L.A., Osher, S. (2001). Numerical methods for p-harmonic flows and applications to image processing. CAM Report 01-22. UCLA.
Wei, L.Y., Levoy, M. (2000). Fast texture synthesis using tree-structured vector quantization. In: ACM SIGGRAPH, International Conference on Computer Graphics and Interactive Techniques, pp. 479–488.
Weickert, J. (1994). Anisotropic diffusion filters for image processing based quality control. In: 7th European Conference on Mathematics in Industry, pp. 355–362.
Weickert, J. (1996a). Anisotropic diffusion in image processing. PhD thesis. University of Kaiserslautern, Laboratory of Technomathematics, Germany.
Weickert, J. (1996b). Theoretical foundations of anisotropic diffusion in image processing. Comput. Suppl. 11, 221–236.
Weickert, J. (1997a). Coherence-enhancing diffusion of colour images. In: 7th National Symposium on Pattern Recognition and Image Analysis.
Weickert, J. (1997b). A review of nonlinear diffusion filtering. In: Scale-Space Theory in Computer Vision. Lecture Notes in Computer Science, vol. 1252. Springer, Berlin, pp. 3–28.
Weickert, J. (1998). Anisotropic Diffusion in Image Processing. Teubner-Verlag, Stuttgart.
Weickert, J. (1999). Coherence-enhancing diffusion of colour images. Image Vision Comput. 17, 199–210.
Weickert, J., Benhamouda, B. (1997a). A semidiscrete nonlinear scale-space theory and its relation to the Perona–Malik paradox. In: Adv. Comput. Vision. Springer, Wien, pp. 1–10.
Weickert, J., Benhamouda, B. (1997b). Why the Perona–Malik Filter Works. Technical Report 97/22. Department of Computer Science, University of Copenhagen.
Weickert, J., Brox, T. (2002). Diffusion and regularization of vector and matrix-valued images. In: Inverse Problems, Image Analysis, and Medical Imaging. Contemp. Math. 313, 251–268.
Wesseling, P. (2000). Principles of Computational Fluid Dynamics. Springer, Berlin.
Witkin, A.P. (1983). Scale-space filtering. In: International Joint Conference on Artificial Intelligence, pp. 1019–1021.
Yezzi, A. (1998). Modified curvature motion for image smoothing and enhancement. IEEE Trans. Image Process. 7 (3), 345–352.
Index
A
Analog detectors
  digital detectors v., 57
  for electron detection, 57–59
Anisotropic diffusion PDEs
  curvature-preserving, 174–181
    applications of, 183–193
    implementation considerations in, 181–183
  history of, 151–152
  for multichannel image regularization, 149–203
    concluding remarks on, 193–194
    introduction to, 151–160
    preliminary notations for, 150
    smoothing with, 160–174
Artifact removal, of color image, 184–185, 185f–188f
Astigmatism, chromatic aberration
  anisotropic coefficients of, 124–126
  coefficients of, 117–118
  patterns of, 133, 134f
B
Backward unilateral representation
  for 2D GMRF, 11–12
  for 3D GMRF, 14
Bandwidth, for streaming video, 23
Beltrami flow framework, in image regularization, 162
Bicubic interpolation, for color image resizing, 189–191, 192f
Bilateral representation
  for 2D GMRF, 8–10
  for 3D GMRF, 12–13
Block banded matrices
  GMRFs for inversion of, 2–3
  inversion of, 37–50
    algorithms for, 45–49, 48f, 50f
    notation for, 39–40
    simulations of, 49
    summary of, 49–50
    theorems for, 41–45
Blurring models, of image restoration, 15–17
C
Cascaded vector quantization (VQ)
  in predictive codec, 23–24
  in SNP/VQR, 25–26, 25f
  in video compression, 33–34
CCDs. See Charge-coupled devices
CGMRF. See Compound Gauss–Markov random field
Charge-coupled devices (CCDs)
  CMOS v., 80–81, 81f
  for cryo-EM, 57
  as electron detectors, 60–61
  film v., 60
  Monte Carlo simulations in, 62–63, 64f
  readout of, 80–81, 81f
Cholesky blocks, solving for, 41–46
Cholesky factorization
  in block banded matrix inversion, 40
  2D GMRF from, 2, 10–11
  3D GMRF from, 2, 13–14, 24, 25f
  in video compression, 28
Chromatic aberration coefficients, 106–129
  astigmatism, 117–118
    anisotropic, 124–126
  coma 1, 115–116
    anisotropic, 122–123
  coma 2, 116–117
    anisotropic, 123–124
  combined, 109–128
  distortion, 120–121
    anisotropic, 127–128
  from electric perturbation
    anisotropic, 141–142
    isotropic, 141
  field curvature, 119–120
    anisotropic, 126–127
  intrinsic, 107–109
  from magnetic perturbation
    anisotropic, 143
    isotropic, 142–143
  spherical, 114–115
    anisotropic, 121–122
  total, 128–129
Chromatic aberration patterns, 131–134
  astigmatism and field curvature, 133, 134f
  coma, 132–133, 133f
  distortion, 134, 135f
  spherical, 131–132, 132f
Chromatic perturbation, of variational function, 102–106
  Gaussian value of second- and fourth-order approximation, 100–102
  second- and fourth-order approximations of, 103–104
CMOS. See Complementary metal oxide semiconductor
Color image. See also Multichannel image
  denoising and artifact removal of, 184–185, 185f–188f
  inpainting of, 185–189, 189f–191f
  interpolation of, 189–191, 192f
Coma 1, chromatic aberration
  anisotropic coefficients of, 122–123
  coefficients of, 115–116
  patterns of, 132–133, 133f
Coma 2, chromatic aberration
  anisotropic coefficients of, 123–124
  coefficients of, 116–117
  patterns of, 132–133, 133f
Complementary metal oxide semiconductor (CMOS)
  CCD v., 80–81, 81f
  for direct detection devices, 56
  DQE of, 58
  in HPD detectors, 62
  in MAPS detectors, 56–57, 62, 80–89
  in Medipix2, 68
  outlook for, 89
  radiation damage of, 86–89, 87f–88f
  readout of, 80–81, 81f
Compound Gauss–Markov random field (CGMRF), for image restoration, 15
Computer algebra
  with electron optics, 96
  for third-order chromatic aberrations of electron lenses, 95–146
    analytical derivation of, 106–129
    chromatic aberration variational function of, 102–106
    concluding remarks on, 143–146, 144t–146t
    graphical display of patterns of, 129–134
    introduction to, 96–97
    numerical calculation of, 135–143
    variational function of, 97–102
cryo-EM. See Electron cryomicroscopy
Curvature-preserving diffusion PDEs
  applications of, 171–172, 171f–172f, 183–193
    denoising and artifact removal, 184–185, 185f–188f
    flow visualization, 191–193, 193f–194f
    inpainting, 185–189, 189f–191f
    interpolation, 189–191, 192f
  for image regularization, 174–181
    implementation of, 181–183, 183f
    line integral convolutions and, 176–178
    multidirectional smoothing, 180–181
    single-direction case, 174–176, 175f
    traces and divergences in, 178–179
  links with, 172–174
D
Dark current, in CMOS circuits, 86
Data archiving, of film, 60
DCT. See Discrete cosine transform
Denoising, of color image, 184–185, 185f–188f
Detective quantum efficiency (DQE)
  in electron detector, 57–58
  film and, 59
  of MAPS, 83, 89
  of Medipix2, 75–77, 77f–78f
  MTF v., 58
Di Zenzo multivalued geometry
  for local geometry of multichannel image, 156–160, 158f–160f
  with trace-based PDE, 171
Diffusion PDEs. See also Anisotropic diffusion PDEs
  curvature-preserving, 171–172, 171f–172f, 174–181
    applications of, 183–193
    implementation considerations, 181–183, 183f
    line integral convolutions and, 176–178
    multidirectional smoothing, 180–181
    single-direction case, 174–176, 175f
    traces and divergences in, 178–179
  divergence-based, 163–165
  trace-based, 167–172, 169f, 171f–172f
Diffusion tensors, for multichannel image regularization, 150, 164–165
Digital detectors
  analog detectors v., 57
  for electron detection, 57–59
Direct detection devices, 55–56
Direct electron detectors, for electron microscopy, 55–90
  CCDs for, 60–61
  concluding remarks on, 90
  detectors, 57–59
  direct electron semiconductor detectors, 61–62
  film for, 59–60
  HPDs, 64–80
  introduction to, 55–57
  MAPS, 56–57, 62, 80–89
  Monte Carlo simulations, 62–63, 64f
  semiconductor, 61–62
Discrete cosine transform (DCT), in transform codecs, 23
Discrete wavelet transform (DWT), in transform codecs, 23
Displacement damage, to CMOS circuits, 86
Distortion, chromatic aberration
  anisotropic coefficients of, 127–128
  coefficients of, 120–121
  patterns of, 134, 135f
Divergence-based diffusion PDEs
  for image regularization, 163–165
  links with, 172–174
  trace-based diffusion PDEs link to, 198–203
DQE. See Detective quantum efficiency
DWT. See Discrete wavelet transform
E
Elastic scattering, in electron interactions, 63
Electric perturbation, chromatic aberration coefficients from
  anisotropic, 141–142
  isotropic, 141
Electron cryomicroscopy (cryo-EM)
  description of, 56
  Medipix2 in, 77–78
  qualities of detector for, 57–59
Electron crystallography
  with cryo-EM, 56
  single-particle analysis v., 56
Electron detector. See Direct electron detectors
Electron lenses. See also Electron optics
  third-order chromatic aberrations of, 95–146
    analytical derivation of, 106–129
    chromatic aberration variational function of, 102–106
    concluding remarks on, 143–146, 144t–146t
    graphical display of patterns of, 129–134
    introduction to, 96–97
    numerical calculation of, 135–143
    variational function of, 97–102
Electron microscopy (EM)
  direct electron detectors for, 55–90
    CCDs for, 60–61
    concluding remarks on, 90
    detectors, 57–59
    direct electron semiconductor detectors, 61–62
    film for, 59–60
    HPDs, 64–80
    introduction to, 55–57
    MAPS, 80–89
    Monte Carlo simulations, 62–63, 64f
  Medipix2 in, 69–70, 77–78
Electron optics
  computer algebra with, 96
  rotating transform of, 98
  variational function of, 97–102
    Gaussian value of fourth-order approximation, 100–102
    second- and fourth-order approximations of, 97–100
Electron tomography, with cryo-EM, 56–57
EM. See Electron microscopy
Euler–Lagrange equations, for image regularization, 162, 195–197
F
Field curvature, chromatic aberration
  anisotropic coefficients of, 126–127
  coefficients of, 119–120
  patterns of, 133, 134f
Film
  CCDs and HPDs v., 60
  for electron detection, 55, 59–60
  MAPS v., 85–86
  Medipix2 v., 72–74, 73f
Floyd–Steinberg algorithm, image regularization of, 185, 188f
Forward unilateral representation
  for 2D GMRF, 10–11
  for 3D GMRF, 13
Fourth-order chromatic perturbation
  approximations of, 103–104
  Gaussian values for, 100–102
G
Gaussian function, in image regularization, 163–165, 168–169, 169f
Gaussian value
  chromatic perturbation variational function, 105–106
  variational function, 100–102
Gauss–Markov random field (GMRF). See Noncausal Gauss–Markov random field
Gauss–Markov random process (GMRP). See Noncausal Gauss–Markov random process
Gibbs distribution, 6
Glaser’s bell-shaped magnetic lens, for numerical calculation of third-order chromatic aberration coefficients, 135–136, 143, 144t–145t
GMRF. See Noncausal Gauss–Markov random field
1D GMRF. See One-dimensional noncausal Gauss–Markov random field
2D GMRF. See Two-dimensional noncausal Gauss–Markov random field
3D GMRF. See Three-dimensional noncausal Gauss–Markov random field
GMRP. See Noncausal Gauss–Markov random process
Gradient flux, in image regularization, 164
H
H.263 codec
  computational complexity of, 33
  description of, 23
  SNP/VQR v., 3, 24, 26, 33–36, 35t, 36f–37f
High Resolution Large Area X-ray Detector (RELAXD), with Medipix2, 79–80, 79f
HPDs. See Hybrid pixel detectors
Hutter’s electrostatic immersion lens, for numerical calculation of third-order chromatic aberration coefficients, 135–136, 143, 144t
Hybrid pixel detectors (HPDs)
  as direct detection device, 56, 62
  DQE of, 58
  film v., 60
  MAPSs v., 62
  Medipix1 and Medipix2, 64–80
I
Identity matrices, for 2D GMRF, 9
Image(s). See also Color image; Multichannel image
  2D GMRF for, 3
  processing of
    GMRFs for, 2
    image regularization in, 193
  regularization of
    divergence-based diffusion PDEs for, 163–165
    Euler–Lagrange equations for, 162, 195–197
    goals of, 160–161
    of JPEG-compressed images, 184, 188f
    Perona–Malik equation for, 160, 161t
    surface area minimization in, 162–163, 163f
    trace-based diffusion PDEs for, 167–172, 169f, 171f–172f
Image restoration. See also Two-dimensional noncausal Gauss–Markov random field
  algorithm for, 17–19
  GMRFs for, 2–3
  GMRP for, 17–22, 20f–21f, 22t
  RTS smoothing for, 2–3, 15–22
    blurring models, 15–17
    experiments of, 19–22, 20f–21f
    image restoration algorithm, 17–19
    summary of, 22, 22t
  Wiener filter for, 15, 19, 20f–21f, 21–22, 22t
Inelastic scattering, in electron interactions, 63
Inpainting
  of color image, 185–189, 189f–191f
  description of, 185–186
International Organization for Standardization (ISO), MPEG4 of, 23
International Telecommunication Union (ITU), H.263 of, 23
ISO. See International Organization for Standardization
Isolightness, for multichannel image regularization, 156, 156f
ITU. See International Telecommunication Union
J
JPEG-compressed images, image regularization of, 184, 188f
K
Kalman gain, in image restoration, 18
Kalman–Bucy filter (KBF)
  for 1D GMRF, 2
  for 2D GMRF, 10, 12
  for image restoration, 17–18
KBF. See Kalman–Bucy filter
Knife-edge method
  for MAPS, 85–86
  for Medipix2 MTF, 74–75, 75f
Koenderink’s method, for image regularization, 168
Kronecker product, for 2D GMRF, 9
L
LICs. See Line integral convolutions
Line integral convolutions (LICs)
  with curvature-preserving PDEs, 182–183, 183f
  in image regularization, 154
  for 2D vector field visualization, 193
Linear interpolation, for color image resizing, 189–191, 192f
Linear shift-invariant (LSI) system, for imaging system modeling, 15
Local geometry, of multichannel image, 153–160
  Di Zenzo multivalued geometry for, 156–160, 158f–160f
  features of, 154–155
  geometry from scalar feature for, 155–156, 156f
Local neighborhoods, for 2D and 3D GMRF models, 4–6, 5f
LSI. See Linear shift-invariant system
M
Magnetic perturbation, chromatic aberration coefficients from
  anisotropic, 143
  isotropic, 142–143
Maple, Mathematica v., 96
Markov field, 6
Mathematica
  background of, 96
  definitions in, 146–148
  for third-order chromatic aberrations of electron lenses, 95–146
    analytical derivation of, 106–129
    chromatic aberration variational function of, 102–106
    concluding remarks on, 143–146, 144t–146t
    graphical display of patterns of, 129–134
    introduction to, 96–97
    numerical calculation of, 135–143
    variational function of, 96–102
MATLAB, Mathematica v., 96
Maximum a posteriori (MAP) estimate, finding of, 15
Mean squared error (MSE), in image restoration, 22, 22t
Medipix1
  description of, 64–65, 65f
  Medipix2 v., 68, 68t
  pixel readout electronics of, 65–66, 66f–67f
  testing of, 66–68
Medipix2
  in cryo-EM, 77–78
  description of, 64–65, 65f
  DQE of, 75–77, 77f–78f
  experimental results for, 69–70
  film v., 72–74, 73f
  future prospects for, 77–80
  Medipix1 v., 68, 68t
  mounting of, 69f–71f, 70
  MTF of, 74–75, 75f
  pixel readout electronics of, 65–66, 66f–67f
  resolution and efficiency of, 74
  sensitivity of, 70–72, 72f
Modulation transfer function (MTF)
  DQE v., 58
  of electron detector, 58
  film and, 59
  of MAPS, 83
  of Medipix2, 74–75, 75f
Monolithic active pixel sensors (MAPSs)
  backscattering in, 63
  as electron detector, 56–57, 62, 80–89
  film v., 85–86
  HPDs v., 62
  layout in, 82–83, 82f
  Monte Carlo simulations of, 83
  outlook for, 89
  radiation damage to, 86–89, 87f–88f
  readout in, 81–82, 81f
  sensitivity and resolution of, 83–86, 84f, 85t
Monte Carlo simulations
  for cryo-EM, 57
  of MAPS, 83
  for Medipix2, 75f
  in phosphor-coupled CCDs, 62–63, 64f
  results of, 48f, 49, 50f
MPEG4 codec
  computational complexity of, 33
  description of, 23
  SNP/VQR v., 3, 24, 26, 33–36, 35t, 36f–37f
MSE. See Mean squared error
MTF. See Modulation transfer function
Multichannel image
  curvature-preserving PDEs for, 174–181
    applications of, 183–193
    implementation considerations, 181–183, 183f
    line integral convolutions and, 176–178
    multidirectional smoothing, 180–181
    single-direction case, 174–176, 175f
    traces and divergences in, 178–179
  local geometry of, 153–160
    Di Zenzo multivalued geometry for, 156–160, 158f–160f
    features of, 154–155
    geometry from scalar feature for, 155–156, 156f
  nonlinear regularization PDE for, 152–153, 153f
  PDEs for regularization of, 149–203
    concluding remarks on, 193–194
    introduction to, 151–160
    preliminary notations for, 150
    smoothing with, 160–174
  regularization of, 160–174
    divergence-based diffusion PDEs for, 163–165
    goals of, 160–161
    method links in, 172–174
    methods for, 160–163, 161t
    oriented heat flows of, 165–167, 166f
    trace-based diffusion PDEs for, 167–172, 169f, 171f–172f
  representation of, 150
Multimedia application
  streaming video as, 22–23
  video codec of, 23
N
Nearest-neighbor interpolation, for color image resizing, 190–191, 192f
Noncausal Gauss–Markov random field (GMRF)
  applications of, 2–51
    concluding remarks on, 50–51
    image restoration with, 15–22
    introduction to, 2–3
    inversion algorithms for block banded matrices, 37–50
    potential matrix of, 7–14
    terminology of, 3–6
    video compression with, 22–37
  block banded matrices inversion with, 37–50
    algorithms for, 45–49, 48f, 50f
    notation for, 39–40
    simulations of, 49
    summary of, 49–50
    theorems for, 41–45
  first-order, 7
  Gibbs distribution and Markov field of, 6
Noncausal Gauss–Markov random process (GMRP)
  for image restoration, 17–22, 20f–21f, 22t
  for pixel intensity prediction, 7, 9
Nonlinear regularization PDE, for multichannel image regularization, 152–153, 153f
Nonlinear scale spaces, regularization PDEs and, 151–152
O
One-dimensional (1D) noncausal Gauss–Markov random field (GMRF), 2D and 3D GMRF models v., 2
Oriented heat flows, for multichannel image regularization, 165–167, 166f
Oriented Laplacian formulations. See Oriented heat flows
Oriented orthogonal basis, in Di Zenzo multivalued geometry, 157
Out-of-focus blur, PSF for, 15–16
P
Partial differential equations (PDEs)
  curvature-preserving, 174–181
    applications of, 183–193
    implementation considerations for, 181–183, 183f
    line integral convolutions and, 176–178
    multidirectional smoothing, 180–181
    single-direction case, 174–176, 175f
    traces and divergences in, 178–179
  for multichannel image regularization, 149–203
    concluding remarks on, 193–194
    curvature-preserving diffusion, 171–172, 171f–172f
    divergence-based diffusion, 163–165
    introduction to, 151–160
    method links in, 172–174
    oriented heat flows of, 165–167, 166f
    preliminary notations for, 150
    trace-based diffusion, 167–172, 169f, 171f–172f
    variational methods for, 160–163
PDEs. See Partial differential equations; specific PDEs
PDS. See Power spectral densities
Peak signal to noise ratio (PSNR)
  in image restoration, 22, 22t
  with RTS, 2–3, 19
  SNP/VQR v. MPEG4 and H.263 with, 24, 26, 34–36, 35t, 36f
Perona–Malik equation, for image regularization, 160, 161t
Pixel neighborhoods
  for 2D and 3D GMRF models, 4–6, 5f
  for GMRP, 7
Point spread function (PSF)
  of electron detector, 58
  in image restoration, 15–16
Polyakov action, in image regularization, 162
Power spectral densities (PDS), in image restoration, 19
Predictive codecs, 23–24
PSF. See Point spread function
PSNR. See Peak signal to noise ratio
Q
QCIF video sequence, SNP/VQR with, 26
R
Radiation damage
  to CCDs, 60–61
  of CMOS detectors, 86–89, 87f–88f
  electron detector and, 59
  to Medipix2, 71–72, 77–78
  to STAR250, 87–89, 88f
Rauch–Tung–Striebel smoothing (RTS)
  for 2D GMRF, 10
  for image restoration, 2–3, 15–22
    blurring models, 15–17
    experiments of, 19–22, 20f
    image restoration algorithm, 17–19
    summary of, 22, 22t
  PSNR with, 2–3, 19
  Wiener filter v., 2
Regularization PDEs
  nonlinear scale spaces and, 151–152
  visualization scale-space with, 193, 194f
RELAXD. See High Resolution Large Area X-ray Detector
Resizing, of color image, 189–191, 192f
Resolution. See also Spatial resolution
  of MAPS, 83–86, 84f, 85t, 89
  of Medipix2, 74, 80
Riccati-type equation
  for 2D GMRF, 10–11
  for 3D GMRF, 13–14
  for image restoration, 17–18
  for video compression, 31–32
Rotating transform, of electron optics, 98
RTS smoothing. See Rauch–Tung–Striebel smoothing
Runge–Kutta integration
  with curvature-preserving PDEs, 182
  for image regularization, 154
S
SACMOS. See Self Aligned Contact complementary metal oxide semiconductor
Scalable noncausal prediction with cascaded vector quantization and conditional replenishment (SNP/VQR) codec
  decoder for, 25–26, 25f
  encoder for, 24–26, 25f
  3D GMRF with, 3
  implementation of, 3
  MPEG4 and H.263 v., 3, 24, 26, 33–36, 35t, 36f–37f
  sub-block, 29–31
    computation of, 29–31
    computational complexity with, 31–33
Second-order chromatic perturbation
  approximations of, 103–104
  Gaussian values for, 100–102
Self Aligned Contact complementary metal oxide semiconductor (SACMOS), in Medipix1, 68
Sensitivity
  of MAPS, 83–86, 84f, 85t
  of Medipix2, 70–72, 72f
Signal processing, block banded matrices inversion for, 37–38
Signal to noise ratio (SNR)
  of CCDs, 61
  of cryo-EM, 56
  of film, 60
  of MAPS, 85
Silicon, electron trajectories in, 63, 64f
Single-particle analysis
  with cryo-EM, 56
  electron crystallography v., 56
SNP/VQR. See Scalable noncausal prediction with cascaded vector quantization and conditional replenishment
SNR. See Signal to noise ratio
Spatial averaging, for image restoration, 19–22, 20f–21f, 22t
Spatial resolution
  of CCDs, 61
  of film, 59
Spherical, chromatic aberration
  anisotropic coefficients of, 121–122
  coefficients of, 114–115
  patterns of, 131–132, 132f
Spotscan routine, for Medipix2 v. film, 73f, 74
STAR250, radiation damage to, 87–89, 88f
Streaming video
  bandwidth for, 23
  uses of, 22–23
Structure tensor, in Di Zenzo multivalued geometry, 156–157
Surface area minimization, in image regularization, 162–163, 163f
T
Third-order chromatic aberrations
  analytical derivation of, 106–129
    combined coefficients of, 109–128
    intrinsic coefficients of, 107–109
    total coefficients of, 128–129
  description of, 106–107
  of electron lenses, 95–146
    analytical derivation of, 106–129
    chromatic aberration variational function of, 102–106
    concluding remarks on, 143–146, 144t–146t
    graphical display of patterns of, 129–134
    introduction to, 96–97
    numerical calculation of, 135–143
    variational function of, 96–102
  numerical calculation of coefficients of, 135–143
    calculations, 140–143
    numerical results of, 143, 144t–146t
    of various quantities, 136–140
  patterns of, 129–134
    auxiliary procedures for, 129–131, 131f
    display of, 131–134
Three-dimensional backward unilateral representation, for 3D GMRF, 14
Three-dimensional bilateral representation
  for 3D GMRF, 12–13
  two-dimensional bilateral representation v., 12–13
Three-dimensional forward regressors
  computation of, 29–30, 32–33
  structure of, 26–27
Three-dimensional forward unilateral representation, for 3D GMRF, 13
Three-dimensional (3D) noncausal Gauss–Markov random field (GMRF)
  Cholesky factorization for, 2, 13–14, 24, 25f
  conditional probabilities of, 4
  1D GMRF model v., 2
  local neighborhoods for, 4–6, 5f
  matrix for, 39
  SNP/VQR models with, 3
  vertical and horizontal interactions of, 7–8
  video compression with, 22–37
    cascaded VQ, 33–34
    computational savings of, 31–33
    computationally efficient implementation for, 26–29, 27f
    experiments of, 34–36
    SNP/VQR encoder for, 24–26, 25f
    sub-block SNP/VQR with, 29–31
    summary of, 36–37
  for video sequence, 3–4, 12–14
    three-dimensional backward unilateral representation for, 14
    three-dimensional bilateral representation for, 12–13
    three-dimensional forward unilateral representation for, 13
  zero Dirichlet boundary conditions for, 12
Toeplitz matrices, for 2D GMRF, 9, 16
Trace-based diffusion PDEs
  description of, 167–170, 169f
  Di Zenzo multivalued geometry with, 171
  divergence-based diffusion PDEs link to, 198–203
  for image regularization, 167–172, 169f, 171f–172f
  links with, 172–174
  problems with, 170–172, 171f–172f
  solution of, 197–198
Transform codecs, 23
Tridiagonal matrix inverses, 43
Truncated Gaussian blur, PSF for, 15–16, 20–21, 20f
Two-dimensional backward unilateral representation, for 2D GMRF, 11–12
Two-dimensional bilateral representation
  for 2D GMRF, 8–10
  three-dimensional bilateral representation v., 12–13
Two-dimensional forward unilateral representation, for 2D GMRF, 10–11
Two-dimensional (2D) noncausal Gauss–Markov random field (GMRF)
  for blurred images, 16–17
  Cholesky factorization for, 2, 10–11
  conditional probabilities of, 4
  1D GMRF model v., 2
  local neighborhoods for, 4–6, 5f
  matrix for, 39
  for still images, 3, 8–12
    two-dimensional backward unilateral representation for, 11–12
    two-dimensional bilateral representation for, 8–10
    two-dimensional forward unilateral representation for, 10–11
  vertical and horizontal interactions of, 7
  zero Dirichlet boundary conditions for, 8–9, 9n1, 16
Two-dimensional (2D) vector field, visualization of, 191–193, 193f–194f
V
Variational function
  chromatic perturbation of, 102–106
    Gaussian value of second- and fourth-order approximation, 100–102
    second- and fourth-order approximations of, 103–104
  in electron optics, 96–102
    Gaussian value of fourth-order approximation, 100–102
    second- and fourth-order approximations of, 97–100
Variation measures, in Di Zenzo multivalued geometry, 157
Vector components, of multichannel images, 150
2D Vector field. See Two-dimensional vector field
Vector quantization (VQ). See also Cascaded vector quantization
  cascaded vector quantization of, 33–34
Vector quantization coupled with conditional replenishment (VQR), error field compression with, 3
Vector variation norms, in image regularization, 158–160, 158f–160f
Video codec
  description of, 23
  types of, 23–24
Video compression. See also Three-dimensional noncausal Gauss–Markov random field
  3D GMRF for, 22–37
    cascaded VQ, 33–34
    computational savings of, 31–33
    computationally efficient implementation for, 26–29, 27f
    experiments of, 34–36
    SNP/VQR encoder for, 24–26, 25f
    sub-block SNP/VQR with, 29–31
    summary of, 36–37
  GMRFs for, 2–3, 23–24
Video sequence, 3D GMRF, 3–4
Visualization scale-space, with regularization PDEs, 193, 194f
VQ. See Vector quantization
VQR. See Vector quantization coupled with conditional replenishment
W
Weickert’s method, for image regularization, 164–165, 168
Wiener filter
  for image restoration, 15, 19, 20f–21f, 21–22, 22t
  RTS v., 2
X
Ximen’s combined electromagnetic lens, for numerical calculation of third-order chromatic aberration coefficients, 135–136, 143, 145t–146t
Z
Zero Dirichlet boundary conditions
  for 2D GMRF, 8–9, 9n1, 16
  for 3D GMRF, 12