ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 66
EDITOR-IN-CHIEF
PETER W. HAWKES Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
ASSOCIATE EDITOR, IMAGE PICK-UP AND DISPLAY
BENJAMIN KAZAN Xerox Corporation, Palo Alto Research Center, Palo Alto, California
Advances in Electronics and Electron Physics

EDITED BY PETER W. HAWKES
Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
VOLUME 66 1986
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers Orlando San Diego New York Austin London Montreal Sydney Tokyo Toronto
COPYRIGHT © 1986 BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
Orlando, Florida 32887
United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD.
24-28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504
ISBN 0-12-014666-5
PRINTED IN THE UNITED STATES OF AMERICA
CONTENTS

CONTRIBUTORS TO VOLUME 66 ... vii
PREFACE ... ix

Applied Problems of Digital Optics
L. P. YAROSLAVSKII
I. Introduction ... 1
II. Adaptive Correction of Distortions in Imaging and Holographic Systems ... 5
III. Preparation of Pictures ... 45
IV. Automatic Localization of Objects in Pictures ... 68
V. Synthesis of Holograms ... 92
References ... 136

Two-Dimensional Digital Filters and Data Compression
V. CAPPELLINI
I. Introduction ... 141
II. Two-Dimensional Digital Filters ... 142
III. Local Space Operators ... 152
IV. Data Compression ... 158
V. Joint Use of Two-Dimensional Digital Filters and Data Compression ... 173
VI. Applications ... 176
References ... 199

Statistical Aspects of Image Handling in Low-Dose Electron Microscopy of Biological Material
CORNELIS H. SLUMP and HEDZER A. FERWERDA
I. Introduction ... 202
II. Object Wave Reconstruction ... 213
III. Wave-Function Reconstruction of Weak Scatterers ... 230
IV. Parameter Estimation ... 254
V. Statistical Hypothesis Testing ... 277
Appendix A: The Statistical Properties of the Fourier Transform of the Low-Dose Image ... 297
Appendix B: The Statistical Properties of an Auxiliary Variable ... 299
Appendix C: The Cramér-Rao Bound ... 305
References ... 306

Digital Processing of Remotely Sensed Data
A. D. KULKARNI
I. Introduction ... 310
II. Preprocessing Techniques ... 319
III. Enhancement Techniques ... 326
IV. Geometric Correction and Registration Techniques
V. Classification Techniques
VI. System Design Considerations
VII. Conclusion ... 361
References ... 361

INDEX ... 369
CONTRIBUTORS TO VOLUME 66

Numbers in parentheses indicate the pages on which the authors' contributions begin.
V. CAPPELLINI, Dipartimento di Ingegneria Elettronica, University of Florence, and IROE-C.N.R., Florence, Italy (141)

HEDZER A. FERWERDA, Department of Applied Physics, Rijksuniversiteit Groningen, 9747 AG Groningen, The Netherlands (201)

A. D. KULKARNI,* National Remote Sensing Agency, Balanagar, Hyderabad, 500037 India (309)

CORNELIS H. SLUMP, Department of Applied Physics, Rijksuniversiteit Groningen, 9747 AG Groningen, The Netherlands‡ (201)

L. P. YAROSLAVSKII, Institute for Information Transmission Problems, 101447 Moscow, USSR (1)

*Present address: Computer Science Department, University of Southern Mississippi, Hattiesburg, Mississippi 39406.
‡Present address: Philips Medical Systems, Eindhoven, The Netherlands.
PREFACE

The four chapters that make up this volume are all concerned, though in very different ways, with image handling, image processing, and image interpretation. The first contribution, which comes from Moscow, should help Western scientists to appreciate the amount of activity in digital optics in the Soviet Union. The extent of this is not always realized, for despite translation programs, some of it is not readily accessible and little is presented at conferences in Europe, the United States, and Japan. I hope that L. P. Yaroslavskii's chapter will help to correct the perspective where necessary. V. Cappellini needs no introduction to the electrical engineering community; here he surveys the difficult but very active and important fields of digital filtering in two dimensions and source coding. The list of applications in the concluding section shows the wide range of application of these ideas. The third chapter is concerned with the extremely delicate problem of radiation damage and image interpretation in electron microscopy. For some years, it has been realized, with dismay, that some specimens of great biological importance are destroyed in the electron microscope by the electron dose needed to generate a usable image. One solution is to accumulate very low dose images by computer image manipulation, but a thorough knowledge of image statistics is imperative for this, as indeed it is for other types of electron image processing. This difficult area remained largely uncharted territory until C. H. Slump and H. A. Ferwerda began to explore it in detail: their chapter here gives a very full account of their findings and sheds much light on this forbidding subject, more indeed than I suspect they dared to hope when they began. The final chapter, by A. D. Kulkarni, is concerned with yet another branch of this vast subject, in particular with enhancement and image analysis.
This should be a very helpful supplement to the basic material to be found in the standard textbooks on the subject. P. W. Hawkes
Applied Problems of Digital Optics

L. P. YAROSLAVSKII
Institute for Information Transmission Problems
Moscow, USSR
I. Introduction ... 1
II. Adaptive Correction of Distortions in Imaging and Holographic Systems ... 5
   A. Problem Formulation. Principles of Adaptation to the Parameters of Signals and Distortions ... 6
   B. Methods for Automatic Estimation of Random-Noise Parameters ... 9
   C. Noise Suppression by Filters with Automatic Parameter Adjustment ... 16
   D. Correction of Linear Distortions ... 27
   E. Correction of Nonlinear Distortions ... 34
III. Preparation of Pictures ... 45
   A. Problems of Picture Preparation. Distinctive Characteristics of Picture Preparation in Automated Systems ... 46
   B. Preparation by Means of Adaptive Nonlinear Transformations of the Video Signal Scale ... 47
   C. Linear Preparation Methods as a Version of Optimal Linear Filtration ... 55
   D. Rank Algorithms of Picture Preparation ... 61
   E. Combined Methods of Preparation. Use of Vision Properties for Picture Preparation ... 63
IV. Automatic Localization of Objects in Pictures ... 68
   A. Optimal Linear Coordinate Estimator: Problem Formulation ... 69
   B. Localization of an Exactly Known Object with Spatially Uniform Optimality Criterion ... 71
   C. Allowance for Object's Uncertainty of Definition and Spatial Nonuniformity. Localization on "Blurred Pictures" and Characteristics of Detection ... 78
   D. Optimal Localization and Picture Contours. Selection of Objects from the Viewpoint of Localization Reliability ... 82
   E. Estimation of the Volume of Signal Corresponding to a Stereoscopic Picture ... 88
V. Synthesis of Holograms ... 92
   A. Mathematical Model ... 94
   B. Discrete Representation of Fourier and Fresnel Holograms ... 98
   C. Methods and Means for Recording Synthesized Holograms ... 102
   D. Reconstruction of Synthesized Holograms ... 120
   E. Application of Synthesized Holograms to Information Display ... 128
References ... 136
I. INTRODUCTION

Improvement of the quality and information throughput of optical devices has always been the main task of optics. For the majority of applications, today's optics and electronics have, in essence, solved the
problem of generating high-quality pictures with great information capacity. Now the effective use of the enormous amount of information contained in them, i.e., the processing of pictures, holograms, and interferograms, has become topical. One might develop the information aspects of the theory of optical pictures and systems on the basis of information and signal theory and enlist the existing tools and methods for signal processing (of which the most important today are those of digital computer engineering). Armed with electronics, optics has mastered new wavelength ranges and methods of measurement, and by means of computers it can extract the information content of radiation. Computerized optical devices enhance the analytical capabilities of radiation detection, thus opening qualitatively new horizons to all areas in which optical devices find application. Historically, digital picture processing began at the turn of the 1960s with the application of general-purpose digital computers to the simulation of techniques for picture coding and transmission through communications channels (David, 1961; Huang et al., 1971; Yaroslavskii, 1965, 1968), although digital picture transmission was mentioned as early as the beginning of the 1920s (McFarlane, 1972). By the 1970s it had become obvious that, owing to the advances of computer engineering, it might be expedient to apply digital computers to other picture-processing problems (Vainshtein et al., 1969; Huang et al., 1971; Yaroslavskii, 1968) which traditionally belonged to the domain of optics and optoelectronics. The first publications appeared dealing with computer synthesis of holograms for information display, synthesis of holographic filters, and simulation of holographic processes (Brown and Lohmann, 1966, 1969; Huang and Prasada, 1966; Lesem, 1967; Huang, 1971).
Finally, by the middle of the 1970s progress in microelectronics enabled the advent of the first digital picture-processing systems, which found wide applications in Earth resource studies, medical diagnostics, and computer-aided research. The digital processing of pictures and other optical and similar signals is now emerging as a new scientific field integrating theory, methods, and hardware. We refer to this area as "digital optics" by analogy with the term "digital holography" (Huang, 1971; Yaroslavskii and Merzlyakov, 1977, 1980, 1982), which combines such segments as digital synthesis, analysis, and simulation of holograms and interferograms. The term digital optics reflects the fact that, along with lenses, mirrors, and other traditional optical elements, digital computers and processors are becoming integral to optical systems. Finally, to complete the characterization of digital optics as a scientific field, one should say that it is part of the general penetration of computer engineering and digital methods into optical studies, as recently noted by Frieden (1980).
What qualitatively new features are brought to optical systems by digital processors? There are two major ones. The first is adaptability and flexibility. Owing to the fact that the digital computer is capable of rearranging the structure of the processing without changing its own physical structure, it is an ideal vehicle for adaptive processing of optical signals and is capable of rapid adaptation to various tasks, first of all to information adaptation. It should also be noted that this capability of the digital computer to adapt and rearrange itself has found application in active and adaptive optics for the control of light beams as energy carriers. The second is the simplicity of acquiring and processing the quantitative data contained in optical signals, and of connecting optical systems with other information systems. The digital signal representing the optical one in the computer is essentially the pure information carried by the optical signal, deprived of its physical vestment. Thanks to its universal nature, the digital signal is an ideal means for the integration of different information systems. Digital optics relies upon information theory, digital signal processing theory, statistical decision theory, and the theory of systems and transformations in optics. Its methods are based on the results of these disciplines, and, similarly, these disciplines find in digital optics new formulations of their problems. Apart from general- and special-purpose computers, the hardware for digital optics also involves optical-to-digital signal converters for input into the digital processor and converters of digital signals into optical form, such as displays, photorecorders, and other devices. In the early stages of digital optics, this hardware was borrowed from other fields, including general-purpose computer engineering, computer graphics, and computer-aided design.
Currently, however, dedicated hardware is being designed for digital optics, such as devices for the input of holograms and interferograms into computers, precision photorecorders for picture processing and the production of synthesized holograms, displays, and display processors. Digital optics considerably influences trends in today's computer engineering towards the design of dedicated parallel processors of two-dimensional signals. As an area of research, digital optics interfaces with other information and computer sciences such as pattern recognition, artificial intelligence, computer vision, television, introscopy, acoustoscopy, radio holography, and tomography; therefore, the methods of digital optics are similar to those of these sciences, and vice versa. The aim of this article is to discuss the most important problems of applied digital optics as seen by the author, including those of adaptation and of continuity and discreteness in processing pictures and other optical signals. The first section deals with methods for the correction of linear and nonlinear distortions of signals in display and holographic systems and with noise
suppression. The emphasis will be on adaptive correction of distortions with unknown parameters and on means of automatic estimation of these parameters from the observed distorted signal. The second section is devoted to methods for the improvement of a picture's visual quality and to making preparations for facilitating visual picture interpretation. The term "preparation" was suggested by the present writer (Belikova and Yaroslavskii, 1974; Yaroslavskii, 1979a, 1985) expressly to stress the need for a special processing oriented to the individual user. The philosophy of the methods described in the first two sections relies upon the adaptive approach formulated in Section II,A, which has three aspects. First, it is constructed around adaptation to unknown noise and distortion parameters by means of direct estimation of them from observed distorted signals. Second, for the determination of optimal processing parameters, a new statistical concept of a picture is used that regards the picture as a combination of random object(s) to be interpreted and a random background, together with a new correction quality criterion. This consists in considering that the correction error is minimized on average over a noise ensemble and the random parameters of "interpretation objects" (see Subsection II,A,1), while the background is considered as fixed. With this method, adaptation to the background is attained. Third, the approach envisages adaptation of picture processing to the user, that is, to the specific problem faced by the user of the data contained in a picture. As noted above, it is the simplicity of adaptive processing that is one of the basic merits of digital picture processing as compared with analog (optical, photographic, electronic, etc.) methods. The third section demonstrates how this adaptive approach may be extended to the detection of objects in pictures. This is one of the fundamental problems in automated recognition.
The fourth section discusses the problems of digital holography and, by way of hologram synthesis for information display, illustrates another important and characteristic aspect of digital optics: the need to allow in digital processing for the analog nature of the processed signal, i.e., the need to observe the principle of correspondence between an analog signal transformation and its digital counterpart. Such a need exists not only in the digital processing of optical signals, but it is particularly manifest in digital holography, because the digital hologram obtained from a digital (discrete and quantized) signal is at the same time an analog object, an element of an analog optical system, and thus a most evident embodiment of the unity of discreteness and continuity.
II. ADAPTIVE CORRECTION OF DISTORTIONS IN IMAGING AND HOLOGRAPHIC SYSTEMS

There are many papers, reviews, and monographs on the correction of distortions in imaging systems (Vasilenko, 1979; Sondhi, 1972; Frieden, 1975; Huang et al., 1971; Huang, 1975; Andrews and Hunt, 1977; Gonzales and Wintz, 1977; Pratt, 1978). Their attention is focused on the elimination of distortions in systems which either may be regarded as linear, spatially invariant systems with additive and independent noise, or may be reduced to them. Distortions and their correction in holographic systems have not been sufficiently studied. Little attention has been paid to the correction of nonlinear distortions, including those due to signal quantization in digital processors, and to the suppression of random noise, which is of prime importance in real problems of processing pictures, holograms, and interferograms. Moreover, the characteristics of distortions and noise, which are required data for their correction and suppression, are usually assumed to be known, although in practical applications of picture processing this is far from being the case, and one must estimate the parameters of distortions and noise directly from the observed distorted signal. Finally, it should be mentioned that, in the majority of the existing studies of correction, insufficient attention has been paid to specific computational methods, the peculiarities of digital representation, and the processing of signals in digital computers. These problems are discussed in this section. In Subsection II,A the principles of the adaptive approach to picture distortion correction, correction quality estimation, and determination of distortion parameters from distorted signals are formulated.
Subsection II,B describes algorithms intended for noise parameter estimation from an observed noisy signal: measurements of the variance and correlation function of additive signal-independent fluctuation noise and of the intensity and frequency of harmonic components of periodic noise in pictures, and estimates of the parameters of pulse noise, quantization noise, and noise of the "striped" type. Subsection II,C is devoted to noise filtration: linear filtration with automatic adjustment of parameters for suppression of additive noise of narrow spectral composition as well as "striped" noise, and nonlinear methods of pulse noise filtration. On the basis of the adaptive approach developed, methods are proposed in Subsection II,D for the digital correction of linear distortions in imaging systems and in systems for hologram recording and reconstruction. Subsection II,E discusses the digital correction of nonlinear distortions, its relation to the problem of optimal signal quantization, practical methods of
amplitude correction, and the possibilities of automatic estimation and correction of nonlinear distortions of interferograms and holograms.

A. Problem Formulation. Principles of Adaptation to the Parameters of Signals and Distortions
The solution of the distortion correction problem is built around the assumption that it is possible to define a two-dimensional function a(x, y) describing the output of an ideal system, while the real system may be described by some transform ℱ converting the ideal signal into that actually observed,

b(x, y) = ℱ{a(x, y)}    (1)

The task of correction is then to determine, knowing some parameters of the transform ℱ, a correcting transform Φ of the observed signal such that the result of its application,

ã(x, y) = Φ{b(x, y)}    (2)

be, in the sense of some given criterion, as close to the ideal signal as possible. The choice of approaches to this problem depends on the way of describing signals and their transformations in the corrected systems and also on the correction quality criterion.
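As a minimal numerical sketch of this formulation (the one-dimensional test signal, the circular blur standing in for the distorting transform, and the inverse filter standing in for the correcting transform are all illustrative assumptions, not constructions from the text):

```python
import numpy as np

# Toy one-dimensional "ideal signal", standing in for a(x, y).
a = np.zeros(32)
a[10:14] = 1.0

# Distorting transform: a known linear, shift-invariant circular blur
# (kernel chosen so its spectrum has no zeros).
h = np.zeros(32)
h[[0, 1, -1]] = [0.6, 0.2, 0.2]
b = np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(h)))   # observed signal

# Correcting transform: the inverse filter, admissible here only because
# the blur is known exactly and the model is noise-free.
a_hat = np.real(np.fft.ifft(np.fft.fft(b) / np.fft.fft(h)))
```

With noise present, plain inversion would amplify it; the point of the adaptive approach developed below is precisely that the noise and distortion parameters needed for a better-behaved correcting transform must first be estimated from the observed signal.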
1. Description of Pictures and Correction Quality Criterion

According to the fundamental concepts of information theory and optimal signal reception theory, signals are elements of a statistical ensemble defined by the ensembles of messages carried by the signals and of random distortions and noise. The distortion correction quality is defined by the correction error of individual realizations of the signal,

⟨ε̄²(a − ã)⟩    (3)

averaged over these ensembles. Here, the overbar represents averaging over the ensemble of random distortions and noise, and the angle brackets represent an average over the ensemble of signals. For a concrete definition of averaging over the signal ensemble in Eq. (3), it is necessary to have a description of pictures as elements of a statistical ensemble. In studies of picture restoration, the statistical description relies most commonly on statistical models of Gaussian and Markov random processes and their generalizations to the two-dimensional case. As applied to picture processing, this approach, however, is very limited. It is essential in picture processing that pictures are, from the viewpoint of information theory,
signals rather than messages. It is the random picture parameters, whose determination is in essence the final aim of picture interpretation, that are messages. These may be the size, form, orientation, and relative position of picture details, picture texture, etc. Therefore, two essentially different approaches should be distinguished in the formulation of the statistical description of pictures as signals. In one of them, which may be called a local informational approach, pictures are considered as a set of "interpretation objects" and a random background. Interpretation objects involve picture details whose random parameters (e.g., mutual position, form, orientation, number, etc.) are the messages which should be determined as the result of picture interpretation. The rest of the picture, which has no informative (from the viewpoint of the given application) parameters, is the background. The other approach may be called a structure informational one. In this case, the parameters of the picture as a whole, e.g., its texture, are informative, and the picture cannot be decomposed into interpretation objects and background. For a statistical description of pictures as textures, the above-mentioned classical methods and models of random process theory may be used. A statistical description of pictures in the local informational approach is more complicated and should be based on a separate statistical description of the interpretation objects and the background, and also of their interrelations. In particular, this results in the fact that the error [see Eq. (3)] of picture distortion correction should be averaged separately over the random parameters of interpretation objects and the random background. In doing so, the correcting transform minimizing the correction error (as averaged over the background) will also be optimal on the average. However, it is usually desirable that the correcting transform be the best for a given particular corrected picture rather than on the average.
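The averaging regime described here, over noise realizations and random object parameters while the background stays fixed, can be illustrated with a small Monte Carlo sketch; the background level, object model, noise level, and stand-in correcting transform below are all hypothetical choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed background: held constant throughout, never averaged over.
background = 0.2 * np.ones(64)

def picture(pos):
    """Background plus one 'interpretation object' at random position pos."""
    a = background.copy()
    a[pos:pos + 5] += 1.0
    return a

def degrade(a, sigma=0.3):
    """Noise ensemble: additive white Gaussian noise."""
    return a + sigma * rng.normal(size=a.size)

def correct(b):
    """Stand-in correcting transform: a 3-point moving average."""
    return np.convolve(b, np.ones(3) / 3, mode="same")

# Squared correction error averaged over noise realizations and random
# object positions only; the background is the same in every trial.
errs = [np.mean((a - correct(degrade(a))) ** 2)
        for a in (picture(rng.integers(5, 54)) for _ in range(300))]
mean_err = float(np.mean(errs))
```

Here the smoothing trades residual noise against blurring of the object's edges; the averaged error stays well below the raw noise variance of 0.09, illustrating the criterion without claiming this filter is optimal.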
From the standpoint of the local informational approach, this implies that a conditionally optimal transform with fixed background is desired rather than averaging of the correction error [Eq. (3)] over the random background. It is this approach that will be studied below. Accordingly, the ε̄²(a − ã) in Eq. (3) will be understood as values of the signal correction error averaged over the set of the corrected picture samples, and the angle brackets will be understood as averaging over random interpretation object parameters only.

2. System Description

It is customary to employ for the description of signal transformations in imaging and holographic systems models built of elementary units performing pointwise nonlinear or linear transformations of signals and responsible for
the so-called nonlinear and linear signal distortions, while random corruptions of the signal are described by models of additive and multiplicative fluctuation and pulse noise. In accordance with this description, correction is divided into suppression of noise and correction of linear and nonlinear distortions, which are solved in the sequence reverse to that of the units in the system model.

3. Principles Underlying Estimation of Noise and Distortion Parameters

The distinguishing feature of the correction of pictures, holograms, and interferograms is that the characteristics of noise and distortions which are necessary for the construction of correcting transforms are mostly unknown in advance and must be extracted directly from the observed distorted signal. This refers primarily to the determination of the statistical characteristics of noise. At first sight this problem might seem intrinsically contradictory: In order to estimate noise parameters from the observed mixture, one has to separate the noise from the signal, which may be done only if the noise parameters are known. The way out of this dilemma is not to separate signal and noise for the determination of statistical noise characteristics, but to separate their characteristics on the basis of measurements of the corresponding characteristics of the observed noisy signal (Jaroslavski, 1980b). The problem of signal and noise separation may be solved either as a determinate one, if the appropriate distorted signal characteristics are known exactly a priori, or as a statistical problem of parameter estimation. In the latter case, signal characteristics should be regarded as random variables if they are numbers, or as random processes if they are number sequences, and the characteristics determined for the observed signal should be regarded as their realizations.
In this approach, construction of optimal parameter estimation procedures should be based in principle on statistical models of the characteristics under consideration which should be constructed and substantiated specifically for each particular characteristic. Fortunately enough, in the majority of practical cases, noise is a very simple statistical object; i.e., it is describable by a few parameters, and the characteristics of the distorted signal are dependent mostly on the picture background. Therefore, the reduced problem of noise parameter estimation may be solved by comparatively simple tools even if the statistical properties of the measurable video signal characteristics are given a priori in a very rough and not too detailed manner. One has only to choose among all the measurable signal characteristics those for which noise-induced distortions manifest themselves as anomalies of behavior detectable in the simplest possible way.
Without making it our aim to construct an exhaustive theory of anomaly detection and estimation, we shall just describe two easily realizable digital detection methods which are, to our mind, sufficiently universal, both relying upon a common a priori assumption about the smoothness of the nondistorted signal characteristics: those of prediction and voting (Jaroslavski, 1980b). Philosophically, these methods are akin to the recently developed robust parameter estimation methods [e.g., see Ershov (1978)]. In the prediction method, for each given element of the sequence under consideration, the difference is determined between its actual value and that predicted from the preceding, already analyzed elements. If the difference exceeds some given threshold, it is concluded that there is an anomalous overshoot. In doing so, the prediction depth, the technique for determination of the predicted value, and the threshold must be defined a priori for the given class of signals. The voting method is a generalization of the well-known median smoothing method [e.g., see Pratt (1978)], in which each element of the sequence is considered together with 2n of its neighbors (n from the left and n from the right). This sample of (2n + 1) values is arranged in decreasing or increasing order of magnitude, and the value of the given element is compared with the k extreme (i.e., greatest or smallest) values of the ordered sequence. If it falls among them, it is concluded that this element has an anomalous (great or small) value. The voting method is built around the assumption that the "normal" characteristic as a rule is locally monotonic and that deviations from local monotonicity, if any, are small. The values of n and k are given a priori on the basis of assumptions about the "normal" behavior of the nondistorted signal characteristic.
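The two detection methods just described can be sketched as follows; the text leaves the prediction rule open, so the moving-average predictor, the window sizes, and the threshold here are illustrative choices:

```python
def predict_anomalies(seq, depth=3, threshold=5.0):
    """Prediction method (sketch): flag an element whose deviation from a
    moving-average prediction over the preceding `depth` accepted values
    exceeds `threshold`; flagged values are replaced by their prediction
    so they do not contaminate later predictions."""
    flags = [False] * len(seq)
    history = list(seq[:depth])  # assume the first `depth` values are clean
    for i in range(depth, len(seq)):
        predicted = sum(history[-depth:]) / depth
        if abs(seq[i] - predicted) > threshold:
            flags[i] = True
            history.append(predicted)
        else:
            history.append(seq[i])
    return flags


def voting_anomalies(seq, n=2, k=1):
    """Voting method (sketch): flag an element that lands among the k
    greatest or k smallest values of the ordered (2n + 1)-element window
    centred on it; the ends of the sequence are left unflagged."""
    flags = [False] * len(seq)
    for i in range(n, len(seq) - n):
        window = sorted(seq[i - n:i + n + 1])
        if seq[i] >= window[-k] or seq[i] <= window[k - 1]:
            flags[i] = True
    return flags
```

On the sequence [0, 1, 2, 3, 100, 5, 6, 7, 8], for instance, both sketches flag only the overshoot at index 4, since the surrounding values are locally monotonic.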
This approach to the correction of distortions of picture signals, where correction algorithms are optimized on the average over the random parameters of interpretation objects and realizations of random noise, and the required statistical properties of noise and distortions are determined directly from the observed distorted signal, may be called "adaptive."

B. Methods for Automatic Estimation of Random-Noise Parameters
This subsection deals with methods based on the above approach and intended for automatic diagnostics of the types of noise most commonly met in practical correction of pictures, holograms, and interferograms: additive signal-independent fluctuation noise, additive narrow-band noise, pulse noise, noise of the "striped" type, and quantization noise.
1. Diagnostics of the Parameters of Additive Signal-Independent Fluctuation Wide-Band Noise in Pictures

The most important characteristics of additive and statistically signal-independent fluctuation noise are its standard deviation and correlation function. If, as is often the case, the noise is not correlated or is weakly correlated, the following simple algorithm may be constructed for determination of its variance and correlation function, based on the measurement of anomalies in the covariance function of the observed picture (Yaroslavskii, 1979a, 1985). Owing to the additivity and signal independence of the noise, the covariance function C_0(r,s), measured over observed N x M-element pictures, is the sum of the covariance function C(r,s) of the non-noisy picture, the noise covariance function C_ξ(r,s), and the realization of a random process ε(r,s) that characterizes the error of measuring the noise covariance function through its finite-dimensional representation

C_0(r,s) = C(r,s) + C_ξ(r,s) + ε(r,s)    (4)
The variance of the random process ε(r,s) is known to be inversely proportional to the number of samples Q = NM over which the measurement was done. Since this number runs to hundreds of thousands, the random error ε(r,s) in Eq. (4) is small, and C_ξ(r,s) may be estimated as

Ĉ_ξ(r,s) = C_0(r,s) − C(r,s)    (5)
Consider first the case of noncorrelated noise, where

C_ξ(r,s) = σ_ξ² δ(r,s)    (6)

σ_ξ² being the noise variance and δ(r,s) the Kronecker delta function. Thus, the covariance function of the observed picture differs from that of the non-noisy one only at the origin, the difference being equal to the noise variance

σ_ξ² = C_0(0,0) − C(0,0)    (7)

and for the rest of the values of (r,s) one may use C_0(r,s) as an estimate of C(r,s)

Ĉ(r,s) = C_0(r,s)    (8)
As measurements of picture correlation functions have demonstrated (for example, see Mirkin, 1978), in the vicinity of the origin (r = 0, s = 0) they are very slowly varying functions of r and s. The value of C(r,s) needed for the computation of the noncorrelated noise variance through Eq. (7) may therefore be estimated with high accuracy by interpolation over the values C(r,s) = C_0(r,s) at points (r,s) in the vicinity of the origin. Thus, in order to determine the variance of additive noncorrelated noise in a picture, it is sufficient to measure the covariance function C_0(r,s) of the observed picture in a small vicinity of the point (0,0), determine by interpolation the estimate Ĉ(r,s) of C(r,s), and apply

σ̂_ξ² = C_0(0,0) − Ĉ(0,0)    (9)

as a variance estimate. Experiments show that even interpolation over one-dimensional cross sections of the covariance function provides good estimates (Mirkin and Yaroslavskii, 1978). This approach may also be used for estimating the covariance function and variance of weakly correlated noise, i.e., noise whose covariance function C_ξ(r,s) is distinct from zero only in a small vicinity of the origin, where the non-noisy picture covariance function may be satisfactorily interpolated by the values of C_0(r,s) at those points where C_ξ(r,s) is known in advance to be zero. In this method, the approximate dimensions of the domain within which the nonzero values of C_ξ(r,s) are concentrated, and the smoothness of C(r,s) in the vicinity of this domain, are postulated a priori. In Fig. 1, for the sake of illustration, the covariance function of the picture shown in Fig. 2 is presented on a semilogarithmic scale. One can readily see in Fig. 1 the break of the covariance function at the origin; interpolated values of this function in the vicinity of zero are shown by the dotted line. Below, the difference between the original and interpolated functions is shown, which serves as the estimate of the covariance function of the noise in Fig. 2.
FIG.1. Estimate of the covariance function of wide-band noise in the picture of Fig. 2
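The variance estimate of Eq. (9) can be sketched in one dimension: measure the covariance of the observed signal at a few lags, extrapolate the smooth part to lag zero, and read off the break at the origin. The test signal, the lag range, and the polynomial interpolation below are illustrative assumptions, not the author's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 8.0 * np.pi, 4096)
clean = np.sin(t) + 0.5 * np.sin(0.3 * t)           # smooth "picture row"
sigma = 0.4
observed = clean + rng.normal(0.0, sigma, t.size)   # additive white noise

def autocovariance(x, max_lag):
    x = x - x.mean()
    return np.array([np.mean(x[:x.size - r] * x[r:]) for r in range(max_lag + 1)])

c0 = autocovariance(observed, max_lag=4)   # C_0(r) of the observed signal

# The clean covariance varies slowly near the origin, while white noise
# contributes only at lag 0: extrapolate C(0) from lags 1..4, Eq. (8)
lags = np.arange(1, 5)
c_interp = np.polyval(np.polyfit(lags, c0[1:], 2), 0.0)

noise_var = c0[0] - c_interp               # Eq. (9): the break at the origin
```

With the parameters above, noise_var comes out close to sigma² = 0.16, even though the noise is never observed separately.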
FIG.2. Picture used in experiments on estimation of the noise covariance function
2. Estimation of Additive Wide-Band Noise Parameters in "One-Dimensional" Interferograms
An interferogram with monotonic variation, in some direction, of the phase difference between the reference and object beams will be referred to as "one-dimensional" (Yaroslavskii and Fayans, 1975), as exemplified by the interferogram of Fig. 3a. The ideal, i.e., noiseless, interferogram is a two-dimensional sinusoidal signal. As follows from the properties of the discrete Fourier transform, in the power spectrum of this two-dimensional signal there exists a sharp peak near the mean spatial frequency of the interferogram (see Fig. 3b). If there is additive noise in the interferogram, as in Fig. 3a, the peak is also observed against the noise background (Fig. 3c). The problem of estimating signal and noise parameters from the observed power spectrum evidently boils down to that of detecting the signal peak in the spectrum and separating that area of the
FIG.3. Noise parameter estimation in interferograms: (a) example of a noisy interferogram; (b) power spectrum of a non-noisy interferogram; (c) power spectrum of the interferogram in part (a).
spectral plane where the intensity of the signal spectrum components is essentially distinct from zero. The boundaries of this area may be determined by means of a priori data on the mean spatial frequency of the interferogram, which depends on the interferometer design, and on the maximal area of the spatial spectrum, defined by the a priori data on the interferometry object. Yaroslavskii and Fayans (1975) have demonstrated that sufficiently good estimates of the noise power spectral density may be obtained by simple averaging of the noisy interferogram spectrum over the peripheral areas of the spectral plane which are known not to be occupied by the signal spectrum. Notably, apodization masks (windows usually used in spectral analysis) must be used for better cleaning of the periphery of the observed signal spectrum from the tails of the spectral peak of a non-noisy interferogram signal (Ushakov, 1981).
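The periphery-averaging estimate can be illustrated numerically. In this sketch the carrier frequency, the periphery boundary, and the spectrum normalization are illustrative assumptions; no apodization window is applied, which is acceptable here because the test carrier falls exactly on a spectral bin.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256
x = np.arange(N)

# "One-dimensional" interferogram: a 2-D sinusoid along x, plus white noise
sigma = 0.5
interferogram = (np.cos(2 * np.pi * 20 * x / N)[None, :] * np.ones((N, 1))
                 + rng.normal(0.0, sigma, (N, N)))

# With this normalization, the expected value per bin for white noise
# equals the noise variance sigma**2
spectrum = np.abs(np.fft.fft2(interferogram)) ** 2 / (N * N)

# The signal peak sits near spatial frequency 20 along x; average the
# spectrum over a periphery known to be free of the signal peak
fx = np.abs(np.fft.fftfreq(N) * N)
periphery = (fx[None, :] > 60) | (fx[:, None] > 60)
noise_psd = spectrum[periphery].mean()
```

The estimate noise_psd is close to sigma² = 0.25, obtained purely from the peripheral spectral bins.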
Even more exhaustive use of a priori data on the ideal interferogram and the additive noise is also possible in the determination of noise parameters. For example, Ushakov (1979) has used the fact that the distribution function of the noise spectral component intensity is essentially of the Rayleigh type for interferogram noise with a rather arbitrary distribution, owing to the normalizing effect of the Fourier transform. He also supposed that the signal spectral component intensity has a uniform distribution, which is equivalent to the assumption of a triangular, pyramid-shaped form for the signal peak in the frequency domain. This allowed him to construct an algorithm deciding, for each spectral component of a noisy interferogram, whether it belongs to the signal or the noise area of the spectral domain via the value of the likelihood ratio. Ushakov's experiments (1979, 1981) have demonstrated that this method of diagnostics may provide a very high degree of noise suppression in interferograms by filtration.
3. Estimation of Intensity and Frequency of Harmonic Components of Periodic and Other Narrow-Band Noises

Periodic (moiré) noise occurs most commonly in TV and photographic systems where the video signal is transmitted through radio channels. Sometimes it occurs because of discretization of pictures with high-frequency periodic structures, and sometimes it is due to interference effects in pictures obtained in coherent optical imaging systems. The characteristic feature of this noise is that its spectrum in the Fourier basis has only a few components appreciably distinct from zero. Noise having a narrow spectral composition in other bases may also be regarded as belonging to this class. At the same time, the spatial spectrum of non-noisy pictures in the Fourier and some other bases is, as a rule, a more or less smooth and monotonic function. Therefore, narrow-band noise manifests itself in the form of anomalously great and localized deviations, or overshoots, in the spectra of distorted pictures. In contrast to the above fluctuation noise, which has an overshoot of the correlation function at the origin, the localization of these overshoots is unknown. They may be localized by means of the prediction and voting methods described above. To this end, the mean value of the squared modulus of the noisy signal spectral components ⟨|β_{r,s}|²⟩, taken with respect to a chosen basis and computed by appropriate fast algorithms (Ahmed and Rao, 1975; Yaroslavskii, 1979a, 1985), is determined by averaging over all the observed pictures with similar periodic noise. If one-dimensional filtration is performed (e.g., along picture rows), averaging may be done over all the rows subject to filtration. Next, localized
APPLIED PROBLEMS OF DIGITAL OPTICS
15
noise components are detected by voting or prediction; i.e., noise-distorted spectral components ⟨|β_{r,s}|²⟩ of the observed signal are marked. By virtue of noise additivity, the ⟨|β_{r,s}|²⟩ are obviously equal to the sum of the intensities of the spectral components of the non-noisy signal ⟨|α_{r,s}|²⟩ and of the noise (see below in Subsection II,C,1). Consequently,

⟨|κ̂_{r,s}|²⟩ = ⟨|β_{r,s}|²⟩ − ⟨|α_{r,s}|²⟩    (10)
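The localization-plus-subtraction scheme of Eq. (10) can be sketched for a single row. The "smooth spectrum" prediction below uses a neighbor median (a voting-style predictor); the detection threshold, the test signal, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 512
n = np.arange(N)

# "Picture row" with a smooth, monotonically falling spectrum, plus a
# narrow-band (moire-like) interference at frequency bin 60
row = np.cumsum(rng.normal(0.0, 1.0, N))
row += 4.0 * np.sin(2 * np.pi * 60 * n / N)

power = np.abs(np.fft.rfft(row)) ** 2

def neighbor_median(p, half=4):
    """Smooth-spectrum prediction: the median of each sample's neighbors."""
    return np.array([np.median(np.concatenate((p[max(0, i - half):i],
                                               p[i + 1:i + half + 1])))
                     for i in range(p.size)])

smooth = neighbor_median(power)
overshoot = power > 10.0 * smooth          # prediction/voting-style detection
detected_bins = np.flatnonzero(overshoot)

# Eq. (10): noise intensity = observed intensity minus the interpolated
# estimate of the smooth non-noisy spectrum at the marked components
noise_intensity = np.where(overshoot, power - smooth, 0.0)
```

The interference bin stands out by orders of magnitude against the interpolated smooth spectrum and is reliably marked.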
Taking into account that the non-noisy signal a priori has a smooth spectrum, the values of ⟨|α_{r,s}|²⟩ required in Eq. (10) may be determined by interpolation over the nearest samples of ⟨|β_{r,s}|²⟩ which are not marked as noise distorted.

4. Estimation of Parameters of Pulse Noise, Quantization Noise, and "Striped" Noise
The basic statistical characteristic of pulse noise is the probability of distortion of signal samples, which defines the noise detection threshold in the filtration algorithm (see Section II,C). The threshold may be determined by means of the histogram of the distribution of the modulus of the difference between each picture sample and the value predicted from its vicinity. This histogram has two characteristic parts: one defined by the distribution of the difference signal of the non-noisy picture, and another defined by the distribution of the difference between predictions made through the non-noisy picture and the noise, as well as by the distribution of the noise prediction error. As the video signals of neighboring elements are strongly correlated, the first part decreases rather quickly. The second part of the histogram decreases much more slowly, because noise overshoots are independent (see Fig. 9b below, showing the histogram of the difference signal of Fig. 9a). A good estimate of the noise detection threshold is provided by the histogram breakpoint, which may be detected by the prediction method. Pulse noise overshoots may also be detected by the voting method if it is applied to the sequence of values of the noisy video signal in a small vicinity of each picture element (see Section II,C). Signal quantization noise depends on the number of quantization levels. In order to determine this number, it suffices to construct a signal histogram and count the number of signal values for which the histogram is distinct from zero. Striped noise in pictures is caused by random overshoots of the video signal mean value computed in the direction of the stripes. For example, this was the type of noise in the photographs transmitted by the interplanetary stations "Mars-4" and "Mars-5" (Belikova et al., 1975, 1980). It may be detected and measured in the same way as the spectrum overshoots, by
analyzing the sequence of video signal values averaged along the rows in the direction of the bands (see also Section II,C).

C. Noise Suppression by Filters with Automatic Parameter Adjustment
In this section, filters for the suppression of additive and pulse noise in pictures are described that have automatic parameter adjustment (APA) to the observed distorted picture and are based on the principles of filtration formulated in Section II,A. For brevity, they will be called APA filters.

1. Optimal Linear Automatic Parameter Adjustment Filtration of Additive Signal-Independent Noise
Linear filtration of a noisy signal is known to be the simplest tool for additive noise suppression. Filter parameters are usually determined on the basis of the optimal (Wiener) filtration theory developed for continuous signals and the rms filtration error criterion. The synthesis of rms-optimal discrete linear filters of random signals represented in an arbitrary basis was discussed by Pratt (1972, 1978; see also Ahmed and Rao, 1975). Relying upon the adaptive approach formulated in Section II,A, let us derive the basic formulas for optimal discrete linear filters. For the sake of simplicity, we shall use one-dimensional notation; in order to pass to two variables, it suffices to regard the indices as two-component vectors. Let A = {α_s} be an N-dimensional vector of picture signal spectrum samples with respect to some orthonormal basis. It is desired to restore the signal from its observed mixture

B = A + X    (11)

with independent noise X = {κ_s}, so that the squared modulus of the deviation of the signal estimate Â = {α̂_s} from the signal A, averaged over the ensemble of noise realizations and random signal parameters, and estimated per signal sample,

ε̄² = (1/N) Σ_{s=0}^{N−1} ⟨|α̂_s − α_s|²⟩    (12)

be minimal. Determine the optimal linear filter H = {η_{s,n}} which transforms signal B into Â

α̂_s = Σ_{n=0}^{N−1} η_{s,n} β_n    (13)

and meets the above criterion (mrms filter).
The optimal values of η_{s,n} are the solutions of the systems of equations

Σ_{m=0}^{N−1} η_{s,m} ⟨β_m β_n*⟩ = ⟨α_s β_n*⟩,    n = 0, 1, ..., N − 1    (14)

where the asterisk signifies the complex conjugate, or of the conjugate systems

Σ_{m=0}^{N−1} η*_{s,m} ⟨β_m* β_n⟩ = ⟨α_s* β_n⟩,    n = 0, 1, ..., N − 1    (15)

If, as is usually the case, the mean value of the noise samples is zero,

β̄_n = ᾱ_n + κ̄_n = ᾱ_n,    ⟨β_m β_n*⟩ = ⟨α_m α_n*⟩ + ⟨κ_m κ_n*⟩    (16)

Since the systems (14) and (15) for η_{s,n} and η*_{s,n} are equivalent, it suffices to solve only one of them. By substituting Eq. (16) into Eq. (15), one obtains for η_{s,n} the following system of equations

Σ_{m=0}^{N−1} η_{s,m}(⟨α_m α_n*⟩ + ⟨κ_m κ_n*⟩) = ⟨α_s α_n*⟩,    n = 0, 1, ..., N − 1    (17)
The matrix H = {η_{s,n}} defined by Eq. (17) has dimensionality N x N, and, generally, filtration of an N-element vector requires N² operations, which is objectionable for practical applications such as the processing of pictures and other two-dimensional signals of large information content. A way out of this situation is provided by two-stage filtration

Â = T^{−1} H_d T B    (18)

where T and T^{−1} are the direct and inverse matrices of transformations which may be performed by so-called "fast algorithms" (e.g., see Ahmed and Rao, 1975), and H_d is a diagonal matrix describing the so-called "scalar filter" or "filter mask" (Yaroslavskii, 1979a, 1985). This approach to the digital realization of optimal linear filters was seemingly first suggested by Pratt (1972), who considered the use of the Walsh transform as the transform T. Obviously, to bring about good filtration quality, the joint transform T^{−1}H_d T should well approximate the optimal filter matrix H

T^{−1} H_d T ≈ H    (19)
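Equations (18) and (19) can be checked numerically: with a unitary transform T, the full N x N filtering matrix collapses into a forward transform, N scalar gains, and an inverse transform. In the sketch below, T is taken as the unitary DFT and the mask is an arbitrary low-pass one — both illustrative choices, not those of the text.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 256
clean = np.sin(2 * np.pi * 3 * np.arange(N) / N)
noisy = clean + rng.normal(0.0, 0.3, N)

T = np.fft.fft(np.eye(N)) / np.sqrt(N)        # unitary DFT as the transform T
mask = (np.abs(np.fft.fftfreq(N)) < 0.05).astype(float)
Hd = np.diag(mask)                            # diagonal "scalar filter"/"filter mask"

# Equivalent full filter matrix, Eq. (19): H = T^{-1} Hd T
H = T.conj().T @ Hd @ T
filtered_matrix = np.real(H @ noisy)          # direct O(N^2) filtering

# Two-stage filtration, Eq. (18), via the fast algorithm: O(N log N)
filtered_fast = np.real(np.fft.ifft(np.fft.fft(noisy) * mask))
```

Both routes give the same result, but the second one replaces the N² matrix product with two fast transforms and N multiplications.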
Exact equality in Eq. (19) is known to be attainable only if T is the matrix of eigenvectors of the matrix H (see Ahmed and Rao, 1975); of course, there is no guarantee that this optimal transform will have a fast algorithm. In this connection, one has to check the feasibility of factorizing the transform matrix into a product of sparse matrices, and of synthesizing transforms that approximate the given one and definitely possess a fast algorithm (Yaroslavskii, 1979a, 1985; Jaroslavski, 1980c). Similarly to the above general case, one may easily demonstrate that the scalar filter H_d = {η_s} that is optimal with respect to the chosen criterion is defined by

η_s = (⟨|β_s|²⟩ − |κ_s|²) / ⟨|β_s|²⟩    (20)

where ⟨|β_s|²⟩ is the power spectrum of the observed distorted signal in the chosen basis, averaged over noise realizations and random parameters of interpretation objects, and |κ_s|² is the noise power spectrum. Another possible correction quality criterion which has proved effective is that of signal spectrum reconstruction (SSR) (e.g., see Pratt, 1978). By modifying it according to our approach, that is, by requiring that the restored signal power spectrum coincide with that of the distorted signal averaged over variations of interpretation objects and corrected by the estimate of the noise spectrum, we obtain that the scalar filter optimal with respect to this criterion is

η_s = [(⟨|β_s|²⟩ − |κ_s|²) / ⟨|β_s|²⟩]^{1/2}    (21)
The form of Eqs. (17), (20), and (21) for optimal linear filters implies that the desired signal and noise parameters may be determined through the observed noisy signal. Therefore, they define optimal linear APA filters. Depending on the depth of filtration error averaging over s in accordance with the criterion of Eq. (12), they will be adjustable either globally or locally. In the latter case, filtration errors are estimated on the average over picture fragments, and the corresponding formulas involve spectra and covariance matrices of fragments rather than of the picture as a whole. Notably, filters described by Eqs. (20) and (21) are realizable in adaptive coherent-optics systems for spatial filtration with a nonlinear medium in the Fourier plane (Yaroslavskii, 1981). Described below are some practical applications of APA filtration to additive noise in pictures and interferograms. Filtration of strongly correlated (narrow-band) additive noise, whose power spectrum |κ_s|² contains only a few components distinct from zero, or
of a similar narrow-band signal against a background of wide-band noise, is one of the practically important cases of distortion correction in pictures and other signals for which the linear filtration technique based on Eqs. (20) and (21) performs well. Narrow-band noise may be exemplified by the periodic noise characteristic of some picture transmission systems. Filtration of narrow-band signals against background wide-band noise may be represented by the suppression of additive noise in one-dimensional interferograms. The filter of Eq. (20), designed to suppress narrow-band noise, passes without attenuation the video signal spectral components with zero noise intensity and significantly attenuates those with high noise intensity. For intensities of the individual noise components |κ_s|² that are high as compared with the signal, the filter of Eq. (20) is well approximated by the so-called "rejection" filter, which completely suppresses the spectral signal components distorted by intensive noise components

η_s = 0 for the s at which intense noise components are detected; η_s = 1 otherwise    (22)

Computationally, the rejection filter is even simpler than that of Eq. (20).
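The rejection idea can be sketched on a one-dimensional signal carrying one periodic noise component. The detection threshold, the protected low-frequency band, and all names below are illustrative assumptions; a comment notes how the graded mask of Eq. (20) would differ.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1024
n = np.arange(N)
clean = np.cumsum(rng.normal(0.0, 1.0, N))              # broadband "picture row"
noisy = clean + 3.0 * np.sin(2 * np.pi * 120 * n / N)   # periodic narrow-band noise

spec = np.fft.rfft(noisy)
power = np.abs(spec) ** 2
smooth = np.array([np.median(np.concatenate((power[max(0, i - 4):i],
                                             power[i + 1:i + 5])))
                   for i in range(power.size)])

# Rejection filter: completely suppress components flagged as intense
# noise, pass everything else without attenuation.  (The mask of Eq. (20)
# would instead attenuate gradually, roughly as (power - noise)/power.)
mask = np.where(power > 10.0 * smooth, 0.0, 1.0)
mask[:32] = 1.0        # protect the lowest frequencies, where the picture
                       # spectrum itself changes too fast to interpolate
restored = np.fft.irfft(spec * mask, n=N)

rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((restored - clean) ** 2))
```

A single rejected bin removes the periodic noise almost entirely while leaving the broadband signal spectrum untouched.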
FIG.4. Periodic noise filtration: (a) original noisy picture; (b) Walsh spectrum averaged along rows; (c) characteristic of the filter mask (graph of samples of the filter frequency response); (d) results of filtration.
"Striped" noise and background nonuniformity found in the photographs from the automatic interplanetary stations "Mars-4" and "Mars-5" (Belikova et al., 1975, 1980) provide another example of additive narrow-band noise readily lending itself to linear filtration (Fig. 5a). Its discrete Fourier spectrum is concentrated in the domain of very low spatial frequencies. It is advantageous here to use not processing in the spectral domain, but rather double digital filtration of the signal by means of one-dimensional recursive filters with the following formulas

â_{k,l} = a_{k,l} − (1/N₁) Σ_{m=0}^{N₁−1} a_{k,l−m} + C,    â̂_{k,l} = â_{k,l} − (1/N₂) Σ_{m=0}^{N₂−1} â_{k−m,l} + C    (23)
where C is a constant, equal to one-half of the maximal video signal, which was used as the estimate of the unknown mean value α_{0,0} of the video signal over the frame. Each of the one-dimensional filters suppresses stripes in one direction, the parameters N₁ and N₂ being taken equal to 256 (sometimes 128) elements (depending on the size of the background spot) for 1024 x 1024 frames. In the spectral domain, these two filters completely suppress the component α_{0,0}, substitute for it a value proportional to C, and also attenuate the low-frequency spectrum components. Figure 5b shows the result of processing the photograph of Fig. 5a by this filter. The filter of Eq. (23) lends itself to high-speed digital implementation. As compared with filtration in the frequency domain using a fast Fourier transform (FFT) algorithm, the speed gain was about a factor of ten for 1024 x 1024 frames. Its disadvantage as compared with frequency-domain filtration is the difficulty of automatic determination of the parameters N₁ and N₂. Automatic parameter adjustment as discussed above is made possible by another spatial filtration method, described by the following relation

â_{k,l} = a_{k,l} − (ā_k − ā̂_k)    (24)
where ā_k is the signal mean value along row k (the direction of the rows coincides with that of the stripes), and ā̂_k is the estimate of this mean value, which is ā̂_k = (ā_{k−1} + ā_{k+1})/2 for the rows k where prediction or voting detects an overshoot, and ā̂_k = ā_k otherwise. The filter is applied twice, with the aim of suppressing vertical and horizontal stripes. A result of its application may be seen in Fig. 6. Some of the pictures processed in this way were used for the synthesis of the first color pictures of the Mars surface (Avatkova et al., 1980). It should be pointed out that without such processing the synthesis of reliable color pictures of the Mars surface would have been impossible. When speaking about this type of processing, one again has to dwell upon the necessity of eliminating contrast details with sharp boundaries during rejection filtration of the narrow-band noise. In the case under consideration these are the cross-like reference marks. Prior to filtration they were eliminated by means of a nonlinear detection and smoothing procedure, and later restored to their places (Yaroslavskii, 1979a, 1985).
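The row-mean correction of Eq. (24) can be sketched on a synthetic picture with three stripe rows. The picture size, the voting-style detector on the sequence of row means, and the detection threshold are all illustrative assumptions.

```python
import numpy as np

rows, cols = 64, 64
clean = np.tile(np.linspace(0.0, 1.0, cols), (rows, 1))   # smooth test picture
noisy = clean.copy()
for k in (10, 30, 47):
    noisy[k] += 0.8                        # stripe: the whole row is shifted

row_mean = noisy.mean(axis=1)              # mean along the stripe direction

# Detect overshoot rows: compare each row mean with the median of its
# neighborhood (a voting-style detector on the sequence of row means)
med = np.array([np.median(row_mean[max(0, k - 2):k + 3]) for k in range(rows)])
overshoot = np.abs(row_mean - med) > 0.3

# Estimate of the mean for detected rows: average of the two neighbors
pred = np.empty_like(row_mean)
pred[1:-1] = (row_mean[:-2] + row_mean[2:]) / 2
pred[0], pred[-1] = row_mean[1], row_mean[-2]
mean_est = np.where(overshoot, pred, row_mean)

restored = noisy - (row_mean - mean_est)[:, None]   # Eq. (24)
```

Only the three striped rows are detected and shifted back; all other rows pass through unchanged.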
2. Pulse Noise Filtration

Pulse noise does not manifest itself over the whole signal, but rather at spontaneously located points where a random value is substituted for the signal. As pulse noise affects only individual picture samples, the noise filtration algorithm should have two stages: detection of the noise overshoots and correction of the distorted signal samples. For filtration of pulse noise in pictures
FIG. 5. Examples of filtration of striped noise and nonuniform background noise: (a) original picture; (b) result of filtering.
FIG.6. APA filtration of striped noise: (a) noisy picture; (b) result of horizontal filtration; (c) the same after transposition; (d) result of vertical filtration; (e), (f) graphs of mean values, respectively, along rows and columns of the picture in (a); (g), (h) the same graphs after overshoot detection and interpolation.
three methods are possible, differing in the method of detecting noise overshoots and in the rule of correcting distorted samples.

a. Iterative prediction algorithm (Yaroslavskii, 1968, 1979a, 1985). Noise overshoots are detected by comparing each video signal sample with its value linearly predicted from the neighboring samples. If their difference exceeds some threshold δ₁, it is concluded that noise is present. All the distorted samples are marked and replaced by values predicted from the unmarked neighboring samples, plus some constant δ₂, which is smaller than the visual sensitivity threshold for individual details and greater than the sensitivity threshold for extended overfalls. Since the video samples used in prediction for detection may, in their turn, be distorted, the filtration algorithm should be iterative, the threshold δ₁ being reduced in the course of the iterations. Pulse noise is thus filtered by this algorithm in several iterations, each iteration being performed in two passes over the picture: one pass detects the noise overshoots, and the second corrects the distorted samples. Experiments have shown that three to four iterations suffice. Figure 7 shows simulation results for the above algorithm, and Fig. 8a-f shows graphs of the video signal for the same pictures as in Fig. 7a-f, illustrating the degree of noise suppression and the filtration-induced signal distortions.

FIG.7. Suppression of pulse noise by the iterative algorithm with prediction: (a) noise-free picture; (b) noisy picture with a probability of sample distortion of 0.3; (c)-(f) results of filtration after the 1st, 2nd, 3rd, and 4th iterations, respectively.
b. Recursive prediction algorithm (Yaroslavskii, 1968, 1979a, 1985). In this algorithm, noise pulses are detected and distorted samples are corrected in a single pass, because not all eight neighboring picture samples in the orthogonal raster are used, but rather the four already processed ones preceding the given sample in left-to-right, downward scanning (three in the preceding row plus one in the same row). The reader is referred to Section II,B for the method of automatic determination of the detection threshold δ₁ for this algorithm. The operation of the algorithm is illustrated in Fig. 9.

FIG. 8. Graphs of the video signal of the picture shown in Fig. 7.

c. Algorithm with detection by voting.
In this algorithm, noise-distorted picture samples are detected by means of the voting method described in Section II,B and corrected as in the first, iterative algorithm. Such an algorithm is a generalization of the well-known median noise filtration algorithm (Pratt, 1978; Tukey, 1971), enabling one to overcome the main drawback of median filtration: the fact that it corrects all picture samples independently of the pulse noise level. The results of algorithm verification are shown in Fig. 10.
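A one-dimensional sketch of detection by voting with median correction illustrates the point: unlike a plain median filter, only the samples detected as anomalous are replaced. The window size, the rejection depth k, and the test signal are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 400
clean = np.linspace(0.0, 4.0, N) + np.sin(np.arange(N) / 15.0)
noisy = clean.copy()
hit = rng.random(N) < 0.3
noisy[hit] = rng.uniform(-3.0, 8.0, hit.sum())

half, k = 3, 2                     # window of 2n + 1 = 7 samples; depth k = 2
restored = noisy.copy()
for i in range(half, N - half):
    window = np.sort(noisy[i - half:i + half + 1])
    # voting: the sample is anomalous if it ranks among the k smallest
    # or the k largest values of its ordered neighborhood
    if noisy[i] <= window[k - 1] or noisy[i] >= window[-k]:
        restored[i] = np.median(window)   # correct only the detected samples

rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((restored - clean) ** 2))
changed = np.mean(restored != noisy)
```

Most clean samples pass through untouched, while the detected pulses are replaced by the local median.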
FIG.9. Suppression of pulse noise by a recursive algorithm with prediction: (a) noisy picture with a probability of sample distortion of 0.3; (b) distribution histogram of the prediction error modulus and the detection threshold; (c) filtration results.
FIG.10. Suppression of pulse noise by the detection-by-voting algorithm: (a) noisy picture with a probability of sample distortion of 0.3; (b) results of filtration with rejection by the first and ninth elements; (c) the same with rejection by the second and eighth elements; (d) the same with rejection by the third and seventh elements.
The quality of these algorithms may be estimated by the probabilities of passing P_p or falsely detecting P_fd a noise overshoot, and by the mean and rms filtration errors. As might be expected, the data tabulated in Tables I and II demonstrate that the best filtration quality is provided by the iterative prediction algorithm, which reduces the rms error by a factor of 3 to 4 with a low probability of false detection. The other two algorithms are roughly equal in quality; they are inferior to the iterative algorithm in filtration quality but are much faster.

D. Correction of Linear Distortions

Several approaches to the problem of optimal correction of distortions in linear systems are being pursued, and numerous publications discuss them (Frieden, 1975; Vasilenko, 1979; Andrews and Hunt, 1977; Pratt, 1978). Here
TABLE I
COMPARISON OF PULSE NOISE FILTRATION ALGORITHMS BY THE PROBABILITIES OF PASSING (P_p) AND OF FALSE ALARM (P_fd)

[Columns: error probability per picture element of 10, 20, 30, and 40%, under P_p and under P_fd. Rows: iterative prediction algorithm (first through fourth iterations); recursive prediction algorithm; algorithm with detection by voting, boundaries (0,8) and (1,7).]
20
30
40
0
0 0 1 5 21
0 0 5 28
7 21
5 16
10
20
30
40
10
74 48 30 18 II
16 52 33 19
11 54 34 20
0 0
11
11
71 54 34 20 8
6 8
0 1 5 15
26 13
32 17
41 20
52 26
15 36
27
1
10
1
TABLE II
COMPARISON OF PULSE NOISE FILTRATION ALGORITHMS BY RMS AND MEAN FILTRATION ERRORS

[Columns: relative rms error (%) and relative mean error (%), at error probabilities per picture element of 10, 20, 30, and 40%. Rows: without filtration; iterative prediction algorithm (third and fourth iterations); recursive prediction algorithm; algorithm with detection by voting, boundaries (0,8) and (1,7).]
10
20
30
40
9.2
13
16
18.5 5.6 5 6.1
2.5 3.1 4
3.6 3.7 4.6
4.6 4.1 5.4
4.5 4.3
6.6 4.8
8.7 5.6
11
1.3
20
30
40
0.3
0.46
0.5
0.63
0.06 0.04 0.1
0.12 0.05 0.16
0.13 0.08 0.26
0.15 0.15 0.38
0.1 0.2’
0.3
0.4 1.4
0.4 1.5
10
1
we shall discuss correction methods stemming from the adaptive approach formulated above (see Section II,A), peculiarities of the realization of correction in digital picture processing systems, and also the correction of linear distortions in the synthesis and reconstruction of holograms, which is scarcely discussed in the literature. Let the signal B = {β_s} to be corrected be represented as the result of the action of a linear operator Λ on some nondistorted signal A = {α_s}, plus additive independent noise X = {κ_s}

B = ΛA + X    (25)
Assume that the operator Λ is defined by a diagonal matrix Λ = {λ_s}, and determine the correcting filter mask H = {η_s} optimal with respect to the same mrms criterion used in Section II,C. For this filter, the average mean-squared error as computed per signal sample over the noise ensemble and the variations of interpretation objects

ε̄² = (1/N) Σ_{s=0}^{N−1} ⟨|η_s β_s − α_s|²⟩    (26)

will be minimal if, as follows from Eq. (20),

η_s = (⟨|β_s|²⟩ − |κ_s|²)/(λ_s ⟨|β_s|²⟩)  for λ_s ≠ 0;    η_s = 0  for λ_s = 0    (27)
where the sense and the definition of ⟨|β_s|²⟩ and |κ_s|² are the same as in Eq. (20). Correspondingly, the frequency response of the correcting filter for the SSR criterion will be

η_s = (1/λ_s)[(⟨|β_s|²⟩ − |κ_s|²)/⟨|β_s|²⟩]^{1/2}  for λ_s ≠ 0;    η_s = [⟨|α_s|²⟩/⟨|β_s|²⟩]^{1/2}  for λ_s = 0    (28)
where the ⟨|α_s|²⟩ are the mean values, over variations of interpretation objects, of the nondistorted picture spectrum for those s for which λ_s = 0. They should either be known a priori or be determined by interpolation of |λ_s|^{−2}(⟨|β_s|²⟩ − |κ_s|²) over neighboring points, as was done in the diagnostics of narrow-band noise in Section II,B. Numerous experimental facts noted by the author and many other researchers indicate that in picture distortion correction sufficiently good results may be obtained if some typical spectrum of the given class of pictures is used as the a priori nondistorted picture spectrum ⟨|α_s|²⟩ (e.g., see Slepyan, 1967). As we see it, this typical spectrum is a picture spectrum estimate averaged over variations of interpretation objects. Denote it by |ᾱ_s|²; one then obtains the following formula for the SSR criterion

η_s = (λ_s*/|λ_s|)(|ᾱ_s|²/⟨|β_s|²⟩)^{1/2}    (29)

It follows from Eq. (29) that for picture correction it is sufficient to know only the phase characteristic of the distorting imaging system. It also follows that if the imaging system does not distort the phases of the picture Fourier
spectral components,

η_s = (|ᾱ_s|²/⟨|β_s|²⟩)^{1/2}    (30)

that is, the characteristic of the correcting filter is independent of the distorting system. This implies that pictures may be corrected even with unknown distortion characteristics, the correction being independent of the characteristics of the distorting system. An important class of imaging systems is composed of systems without signal spectrum phase distortions. They may be exemplified by systems for observation through a turbulent atmosphere (e.g., see Pratt, 1978) or by Gaussian aperture systems, that is, by practically all systems where an image is generated by an electron beam, etc. The effectiveness of this method of correction was borne out by simulation (Karnaukhov and Yaroslavskii, 1981), as may be seen in Fig. 11. It should also be stressed that a filter of the type in Eq. (30) may easily be implemented in an adaptive optical system with a nonlinear medium in the Fourier plane, similar to that described by Yaroslavskii (1981). When correcting linear distortions of imaging systems, one must take into consideration that correction usually precedes picture synthesis. The frequency response of the photographic or other recorder reproducing the corrected picture also differs from the ideal one, and this must be allowed for during correction. If one denotes by H₁(f_x, f_y) the continuous frequency response of the imaging system up to the place where correction may be done, and by H₂(f_x, f_y) the continuous frequency response of the processing system, one can readily obtain, for example, the rms-optimal continuous frequency response of the correcting Wiener filter, Eq. (31).
The digital realization of such a correcting filter is possible either by processing the discrete Fourier spectrum with an FFT algorithm or by digital filtering in the spatial domain. It is good practice to employ even signal continuation in order to attenuate the boundary effects of filtering, and combined discrete Fourier transform algorithms in order to reduce processing time [see Yaroslavskii (1979a, 1985)]. The choice between these two approaches is determined by the required amount of computation and the memory size. It turns out in practice that if a correcting digital filter cannot be satisfactorily approximated by a separable and recursive one, processing in the spectral domain with FFT algorithms usually presents a smaller computational burden.
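As a hedged sketch of the spectral-domain variant: the function below is an illustrative assumption, not the author's implementation; a scalar noise-to-signal ratio stands in for the spectra entering the Wiener formula, and even (symmetric) continuation is used against boundary effects, as recommended above.

```python
import numpy as np

def wiener_correct(picture, h_freq, noise_to_signal=1e-2):
    """Correct a linear distortion by a Wiener-type filter in the DFT domain.

    h_freq: distorting frequency response sampled on the DFT grid of the
    evenly continued picture (twice the picture size in each dimension).
    noise_to_signal: scalar standing in for the noise-to-signal power ratio.
    """
    # Even (symmetric) continuation attenuates boundary effects of filtering.
    ext = np.block([[picture, picture[:, ::-1]],
                    [picture[::-1, :], picture[::-1, ::-1]]])
    spectrum = np.fft.fft2(ext)
    # rms-optimal correcting response: conj(H) / (|H|^2 + noise/signal).
    correcting = np.conj(h_freq) / (np.abs(h_freq) ** 2 + noise_to_signal)
    restored = np.real(np.fft.ifft2(spectrum * correcting))
    m, n = picture.shape
    return restored[:m, :n]
```

For an undistorted response (h_freq identically 1) and zero noise-to-signal ratio the filter reduces to the identity, which is a convenient sanity check.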
FIG.11. Correction of unknown Gaussian defocusing: (a) original defocused picture; (b) result of correction by the filter in Eq. (30).
The above technique was employed, for example, in processing the photographs made by the automatic interplanetary stations "Mars-4" and "Mars-5" (Belikova et al., 1975, 1980). In this case, the overall frequency response of the photographing and picture transmission system was known (Selivanov et al., 1974), and correction was performed by means of a simple separable recursive digital filter transforming the samples a(k, l) of the corrected video signal through the following formula
The gain g of the difference signal and the dimensions (2N1 + 1)(2N2 + 1) of the averaging area were chosen by approximating the desired correcting filter frequency response with the continuous frequency response of the filter in Eq. (32)
H(fx, fy) = 1 + g{1 − [sinc(π(2N1 + 1)fx/2Fx)/sinc(πfx/2Fx)] × [sinc(π(2N2 + 1)fy/2Fy)/sinc(πfy/2Fy)]}   (33)
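A sketch of such a separable correction in the unsharp-masking form b = a + g(a − local mean) implied by the description of the filter (32); the running-sum implementation and all names are assumptions, not the author's code:

```python
import numpy as np

def unsharp_correct(a, g=4.0, n1=1, n2=1):
    """Separable recursive correcting filter: b = a + g * (a - local mean).

    The local mean over a (2*n1+1) x (2*n2+1) window is computed with
    running (recursive) sums, so the cost per pixel is independent of the
    window size. g = 4, n1 = n2 = 1 are the values quoted in the text.
    """
    def running_mean(x, n, axis):
        # Even continuation at the borders, then a cumulative-sum box filter.
        pad = [(0, 0), (0, 0)]
        pad[axis] = (n, n)
        xe = np.pad(x, pad, mode="symmetric")
        c = np.cumsum(xe, axis=axis)
        zero = np.zeros_like(np.take(c, [0], axis=axis))
        c = np.concatenate([zero, c], axis=axis)
        w = 2 * n + 1
        hi = np.take(c, range(w, c.shape[axis]), axis=axis)
        lo = np.take(c, range(0, c.shape[axis] - w), axis=axis)
        return (hi - lo) / w

    mean = running_mean(running_mean(a, n1, 0), n2, 1)
    return a + g * (a - mean)
```

On a constant picture the local mean equals the signal, so the filter leaves it unchanged; an isolated bright sample is amplified, which is the contour-sharpening effect used for the correction.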
where (2Fx, 2Fy) are the dimensions of the rectangle confining the spatial picture spectrum and defining the signal sampling, and H0(fx, fy) is the frequency response of the photographic recorder of the picture processing system (Yaroslavskii, 1979a, 1985). The dashed line in Fig. 12 shows the cross section of the system frequency response to be corrected (Selivanov et al., 1974), and the chain-dotted line shows the correcting frequency response, Eq. (33), for g = 4, N1 = N2 = 1. The curve labeled 1 in this figure is the post-correction frequency response disregarding the frequency response of the photorecording device, and the curve labeled 2 is the overall response. The digital correction thus more than doubled the spatial bandwidth at the 0.7 level. Its effect may be judged visually, for example, from Fig. 13, which shows a picture before (a) and after (b) correction. It should be noted that in this case correction by means of the separable recursive filter was made possible by the rather simple form of the distorting system characteristics. Correction by this filter is not completely perfect; for instance, at "middle" frequencies it somewhat overcorrects. However, the time required for picture processing by such a filter is several times less than would be required for processing in the spectral domain by an FFT algorithm. Correction of linear distortions in holographic systems has its own peculiarities. In the analysis and synthesis of holograms, linear distortions are
FIG. 12. Correction of the overall frequency response of the photo-TV system.
FIG. 13. Picture (a) before and (b) after correction of the photo-TV system frequency response.
defined mostly by the finite dimensions of the apertures of the devices for recording and sampling (measurement) of holograms and wave fields. As follows from the analysis of synthesized hologram reconstruction (see Section V,D), the finite size of the hologram recorder aperture and the limited resolution of the recording medium bring about a shadowing of the field by a masking function proportional to the squared modulus |h(x, y)|^2 of the Fourier transform of the recorder pulse response, with allowance for the characteristics of the photographic material used. This shadowing may be corrected by a corresponding predistortion of the original field amplitude distribution over the object (Yaroslavskii and Merzlyakov, 1977, 1980; Yaroslavskii, 1972a). For a rectangular Δξ × Δη recorder aperture
h(x, y) = sinc(π Δξ x/λd) sinc(π Δη y/λd)   (34)
where λ is the hologram reconstruction wavelength, and d is the distance from the point source illuminating the hologram to the observation point (see Section V,A). Therefore, if the samples of the original field are enumerated by indices k, l (k = 0, 1, ..., N − 1; l = 0, 1, ..., M − 1), the amplitude of the field distribution over the object should be multiplied by the following correcting function [see Eq. (171)]
(disregarding the modulation transfer function of the film used for recording holograms). The effect of the shadowing and of its correction is illustrated in Fig. 14a and b. Correction for the finite dimensions of signal sensors in digital reconstruction of holograms and wave fields may be done in a similar manner (Yaroslavskii and Merzlyakov, 1977, 1980).

E. Correction of Nonlinear Distortions

Nonlinear distortions are described by the system amplitude characteristic showing the dependence of the output on the input
b = Wd(a)   (36)
The ideal system amplitude characteristic is regarded as given; generally, it is a linear function. The aim of the correction is to find a pointwise correcting transformation that makes the amplitude characteristic of the system after correction coincide with the given one.
FIG. 14. Results of reconstruction of a hologram synthesized (a) without and (b) with shadowing correction.
1. Correction of Nonlinear Distortions in Imaging Systems

When determining correcting transformations for imaging systems, one should bear in mind that before and after correction in the digital system the signal is subjected to a number of nonlinear transformations, such as predistortion at the processor input, quantization, and nonlinear correction before signal reconstruction at the processor's output. The sequence of the transformations is illustrated in Fig. 15a. The task of the optimal correction is to minimize the difference between the corrected and nondistorted signals. It is akin to the well-known problem of
FIG. 15(A). Model of nonlinear distortions in imaging systems and their digital correction. The blocks of the diagram are: nonlinear distortion Wd(a); nonlinear predistortion before quantization; uniform quantization; correction of nonlinear distortion; correction of nonlinear predistortion.
APPLIED PROBLEMS OF DIGITAL OPTICS
FIG.15(B) Digital correction of nonlinear distortions in imaging systems.
optimal quantization (see Garmash, 1957; Max, 1960; Andrews, 1970; Yaroslavskii, 1979a, 1985), and may be solved by the following method for correction of a nonlinearity described by a given distorting function Wd(a) with a given predistorting function Wpd(b) (see Fig. 15b):

(1) The boundaries {a^r} of the signal quantization intervals prior to distortion
a^r = Wd^(−1)(Wpd^(−1)(b^r))   (37)
are determined from the given quantization scale {b^r} (r = 0, 1, ..., M − 1; M being the number of quantization levels of the signal b).

(2) For each rth quantization interval (a^r, a^(r+1)), the optimal value of a representative of this interval is determined, ensuring the minimal quantization error.
(3) For each rth representative, the number q of the quantization interval of the continuous variable reconstructed from its quantized values {b^r} is determined by the given function of nonlinear predistortion correction. The resulting table q(r) is the desired correction table.

2. Correction of Nonlinear Distortions in Holographic Systems
The effect of nonlinear distortions during the recording and reconstruction of holograms radically differs from what happens with pictures. Moreover, the nonlinearity of the amplitude characteristic of recording media and of devices for hologram recording and quantization has a different effect on mirror-reflecting and diffusion-reflecting objects (Yaroslavskii and Merzlyakov, 1977, 1980).
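The behavior reported below for Figs. 16-19 (speckle noise growing with thresholding depth or with coarser quantization of the orthogonal components, while the object's macrostructure survives) can be reproduced with a toy one-dimensional simulation; the model, sizes, and all names are illustrative assumptions, not the experiments of the text:

```python
import numpy as np

def speckle_contrast(levels=None, clip_sigma=None, n=4096, seed=0):
    """Toy 1-D model of nonlinear distortion of a diffuse-object hologram.

    The orthogonal (real/imaginary) components of the hologram of a
    unit-amplitude, random-phase object are optionally clipped at
    +-clip_sigma * sigma and/or uniformly quantized into `levels` levels
    over the +-3 sigma range; the speckle contrast (std/mean of the
    reconstructed intensity) is returned.
    """
    rng = np.random.default_rng(seed)
    field = np.exp(1j * rng.uniform(0, 2 * np.pi, n))  # diffuse object
    holo = np.fft.fft(field)
    s = holo.real.std()  # rms value of the orthogonal components

    def distort(x):
        if clip_sigma is not None:
            x = np.clip(x, -clip_sigma * s, clip_sigma * s)
        if levels is not None:
            step = 6 * s / levels  # quantization over the +-3 sigma range
            x = np.round(x / step) * step
        return x

    rec = np.fft.ifft(distort(holo.real) + 1j * distort(holo.imag))
    inten = np.abs(rec) ** 2
    return inten.std() / inten.mean()
```

In this toy model the contrast grows as the clipping deepens or as the number of quantization levels decreases, in qualitative agreement with the dependences discussed below.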
FIG. 16. Effect of limitation of the dynamic range of the orthogonal components of a diffuse object hologram: (a) original distribution of field amplitude; (b) reconstructed distribution under ±3σ limitation; (c) the same under ±2σ; (d) the same under ±σ.
Nonlinear distortions and quantization of holograms of mirror-reflecting objects result in the destruction of object macroforms (in particular, reconstructed images become contourlike). By an appropriate choice of the quantized corrected signal values, distortion in the reconstructed image may be reduced. Holograms of diffusion-reflecting objects are more stable to thresholding and quantization. These distortions do not result in the destruction of the reconstructed image, but manifest themselves in the occurrence of random noise called diffusion, or speckle, noise. Figure 16 shows the results of a simulation of dynamic range thresholding during recording of the orthogonal components of a diffusion-reflecting object hologram [(a) is the initial distribution of field intensity over a one-dimensional test object, and (b)-(d) are the distributions after thresholding at the levels ±3σ, ±2σ, and ±σ, respectively, where σ is the rms value of the field components]. One may easily see from these pictures that diffusion noise appears and grows with thresholding, while the object's macrostructure is preserved. Quantitatively, the noise may be evaluated through the dependence of diffusion noise intensity on the extent of thresholding of the hologram field orthogonal components, which is shown graphically in Fig. 17. In this graph
FIG. 17. Speckle contrast vs. hologram value thresholding depth.
the x axis represents the extent of thresholding of the hologram field orthogonal components with respect to their rms value, and the y axis gives the ratio of the standard deviation of the diffusion noise to the mean value of the reconstructed field intensity (the speckle contrast). The diagram was obtained for an object with a constant intensity reflection coefficient. A similar regularity is observed in the quantization of the orthogonal components of the field of a diffusion object hologram. Reduction of the number of quantization levels leads to higher diffusion noise, but the object's macrostructure is preserved [see Fig. 18, where (a) is the initial field intensity distribution, and (b)-(d) are the distributions after quantization within the range ±3σ into 128, 64, and 16 levels, respectively]. The form of the speckle contrast
FIG. 18. Influence of quantization of the orthogonal components of a diffuse object hologram: (a) original distribution of object field intensity; (b)-(d) reconstructed distribution at uniform quantization into 128, 64, and 16 levels, respectively.
FIG. 19. Speckle contrast vs. number of levels of uniform quantization of hologram orthogonal components.
as a function of the number of hologram quantization levels (Fig. 19) is very instructive. This dependence shows that, with a decrease of the number of quantization levels, the relative noise intensity at first grows comparatively slowly, but after approximately 32 levels its growth dramatically accelerates.

The comparative stability of diffusion object holograms to nonlinear distortions and quantization enables one to combat such distortions by simulating the diffusion light bias of the objects in hologram synthesis, as is also done in optical holography. This is something of an analogy to the well-known method of adding pseudorandom noise to combat picture quantization noise (e.g., see Roberts, 1962). However, this is not the only or the best way of providing hologram stability to nonlinear distortions and quantization. Some publications (e.g., Kurst et al., 1973) propose to employ so-called "regular" diffusors, which would give the same effect of "spreading" information over the hologram as a random diffusor, but without a random noise pattern over the reconstructed image. As the digital synthesis of holograms is less limited by implementation considerations than by anything else, the idea of a "regular" diffusor may be realized here at best.

A convenient and practicable method for introducing regular redundancy into a digital hologram, called the multiplication method, was proposed by
FIG. 20. Multiplication method for recording synthesized holograms.
Yaroslavskii (1974). Its essence is as follows: The synthesized hologram is broken down into several fragments of differing signal intensity, as shown in Fig. 20 (I is the signal intensity, f is the coordinate on the hologram); the signal in the central, usually most intensive, fragment is attenuated L times, where L has a value of the order of the ratio of the maximal signal in this interval to the signal maximum in a neighboring, less intensive fragment. The attenuated interval is repeated over the area L times and is summed with the signal in a neighboring interval of the hologram. As shown in Fig. 20, this procedure may be repeated several times, resulting in a multiple digital hologram with a much narrower dynamic range of values to be recorded. This method features such merits as simplicity of realization and flexibility, because all the multiplication operations are performed over the already computed hologram and, in principle, may be done in the course of hologram recording. Experimental multiplication of holograms has demonstrated that with an appropriate choice of multiplication parameters (number and size of multiple hologram fragments) this method works well (Yaroslavskii and Merzlyakov, 1977, 1980; Jaroslavski and Merzlyakov, 1979).

3. Correction of Nonlinear Distortions of Holograms and Interferograms under an Unknown Distortion Function
The form of the characteristic of nonlinear signal distortion is often unknown, as occurs in hologram and interferogram reconstruction.
In such a case, a priori knowledge about the signal may sometimes be used for determination of the distortion characteristic and, consequently, of the correcting transformation. Yaroslavskii and Fayans (1975) proposed a method for the determination and correction of nonlinear distortions of interferograms relying upon the statistical properties of an undistorted interferogram and hologram. An undistorted interferogram is described by the following equation:

Ind = I0(c + cos φ)   (38)
where I0 is the interferogram amplitude, c is a constant defining the positive bias of the interferogram signal, and φ is the phase angle of the interferogram. If the observed interferogram contains quite a few periods, φ may be regarded as uniformly distributed over the interval [−π, π], and Ind must be distributed according to the known law h0(Ind)
= {π I0 [1 − ((Ind/I0) − c)^2]^(1/2)}^(−1)   (39)
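The law (39) follows directly from the uniform distribution of φ; a quick numerical check, with arbitrarily chosen I0 and c, illustrates it (the density of I = I0(c + cos φ) at the center I = c·I0 equals 1/(π·I0)):

```python
import numpy as np

# Sample an undistorted interferogram I = I0*(c + cos(phi)), with phi
# uniform on [-pi, pi], and compare the empirical density at the center
# I = c*I0 with 1/(pi*I0), the value predicted by the arcsine-type law
# of Eq. (39). I0 and c are chosen arbitrarily for the check.
rng = np.random.default_rng(1)
I0, c = 100.0, 1.5
phi = rng.uniform(-np.pi, np.pi, 200_000)
I = I0 * (c + np.cos(phi))
hist, edges = np.histogram(I, bins=200, density=True)
center = np.searchsorted(edges, c * I0) - 1  # bin containing I = c*I0
theory = 1.0 / (np.pi * I0)
```

The empirical density near the center agrees with the theoretical value to within sampling fluctuations.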
Let
be the really observed interferogram (hologram), where Wd is the distorting function. The distribution density h1(Iob) may be empirically measured via the observed signal histogram. Thus, correction of nonlinear distortions in this case boils down to the construction of a transformation of the signal with distribution density h1(Iob) into a signal with a given distribution. The recoding table for such a transformation may be determined by means of the following simple algorithm:

(1) Construct the table through the observed histogram by means of the formula
where M is the number of quantization levels, and the function int rounds to the nearest integer value. A signal transformation done according to this table is referred to as "equalization," because it transforms an arbitrarily distributed signal into a uniformly distributed one (Belikova and Yaroslavskii, 1974; Andrews, 1972; Hummel, 1975); see also Section III,B.

(2) Construct a similar table W2(I) through the desired histogram h0(I).

(3) Permute the inputs and outputs of the table W2(I) so as to obtain the table r̂(W2), which, up to quantization effects, defines the transformation of the uniformly distributed signal into one with distribution h0(I).
FIG. 21. Correction of nonlinear distortions of interferograms: (a) distorted interferogram; (b) corrected interferogram; (c) cross section of distorted interferogram; (d) cross section of corrected interferogram.
(4) Construct from the tables W1(I) and r̂(W2) a joint table

W3(I) = r̂(W2 = W1(I))   (42)
The operation of this algorithm is illustrated in Fig. 21. It should be noted that if the distorted interferogram or hologram contains additive noise, the noise will distort the distribution of its values and, consequently, the correcting transformation defined by the algorithm. Experimental verification of the algorithm's stability to additive noise, however, has demonstrated that, even under a significant noise level, the correction quality is quite satisfactory (Ushakov and Yaroslavskii, 1984).
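The four steps above can be composed into a single recoding table; the following sketch is an illustrative assumption (in particular, the uniform-grid `desired_cdf` input format and all names), not the authors' code:

```python
import numpy as np

def match_histogram(observed, desired_cdf, M=256):
    """Correct an unknown pointwise distortion by histogram matching.

    Steps of the algorithm: (1) equalize by the observed histogram;
    (2) build the equalizing table of the desired distribution;
    (3) invert (permute) it; (4) compose both tables.
    desired_cdf: target cumulative distribution sampled at levels 0..M-1.
    """
    h = np.bincount(observed.ravel(), minlength=M) / observed.size
    w1 = ((M - 1) * np.cumsum(h)).astype(int)        # step 1: equalization
    w2 = ((M - 1) * desired_cdf).astype(int)         # step 2: desired table
    inv = np.searchsorted(w2, np.arange(M))          # step 3: permuted table
    inv = np.clip(inv, 0, M - 1)
    return inv[w1[observed]]                         # step 4: joint table
```

When the desired distribution is uniform, the composition reduces to plain equalization, which makes a convenient consistency check.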
III. PREPARATION OF PICTURES
As was already noted in the Introduction, representation of the object to the observer by means of an ideal imaging system often turns out to be insufficient for scientific and practical applications. In complicated problems requiring meticulous analysis of pictures (search, object identification, determination of various quantitative characteristics, generalizing descriptions, etc.), it is desirable to arm the observer's vision with means for the interpretation of pictures and extraction of the data necessary for analysis. These are, first, technical means, ranging from the magnifying glass, pencil, compass, ruler, and tracing paper through complicated optical and optoelectronic devices and dedicated digital picture processing systems; and, second, methods of video signal processing. This auxiliary processing we call "picture preparation." Methodologically, picture preparation may be treated in two ways. From the viewpoint of object transformation into the picture in imaging systems, preparation may be regarded as correction of the interaction of the video signal sensor with the object. From the viewpoint of interpretation and extraction of information, preparation is a preprocessing of the signal intended to coordinate it with the end user, i.e., the human interpreter responsible for decision making. Preparation as picture processing to facilitate visual perception has two aspects: preparation for the collective user of such media as TV, movies, or print art, and preparation for the individual user. In the former case, it is often referred to as "enhancement" (Huang et al., 1971; Huang, 1975; Andrews, 1972; Gonzalez and Wintz, 1977; Pratt, 1978; Rosenfeld and Kak, 1982; Rosenfeld, 1969). The latter case corresponds to nonformalizable applied problems of picture interpretation. This article pays attention mostly to the second aspect, as it is the most important in applications and largely defines the structure of the processing system.
Awareness of the importance of this aspect is very significant both for the further development of methods of picture processing oriented to interpretation and for the determination of approaches to the construction of automated picture processing systems. Section III,A classifies preparation problems and analyzes the requirements on automated picture processing systems from this standpoint. Methods of adaptive amplitude transformations of video signals are described in Section III,B. Linear methods of picture preparation are described and substantiated in Section III,C, and in Section III,D the concept of rank algorithms for picture preparation is presented. Section III,E is devoted to combined preparation methods, to preparation involving determination and
visualization of the signal's quantitative characteristics as well as decision making, and to the ways of using color and stereoscopic vision for picture preparation.

A. Problems of Picture Preparation: Distinctive Characteristics of Picture Preparation in Automated Systems

Two classes of problems in preparation may be identified: geometrical transformations and feature processing. Geometrical transformations are performed to obtain the most convenient and obvious planar representation of three-dimensional objects. In this domain, digital processors do not have significant advantages over analog (optical, TV) means. Their main merit, the capability of rapidly rearranging the transformation algorithm, does not make up for the transfers of bulky data, which require large memory space, and for the difficulty of providing high interpolation accuracy. That is why we shall not touch upon this class of problems. The processing of features is composed of the extraction, measurement, and visualization of those video signal characteristics, or features, which are most informative for the visual system in the current problem of analysis. The choice of features is dictated by the task being executed in the course of analysis and by the distinguishing features of the objects under consideration. These may be, for instance, values and local mean values of the video signal in certain spectral ranges of the registered radiation, the power of the picture spatial spectrum in certain areas of the spectral plane, the area and form of a cross section of the normalized picture correlation function at a certain level, and so on. In selecting feature measurement and transformation methods for automated digital picture processing systems, it is advisable to proceed from the efficiency requirements on the software. To this end, basic transformation classes should be identified which could underlie the construction of ramified processing procedures. In compliance with well-known principles of the theory of signals and systems, the following transformation classes may be defined: nonlinear pointwise transformations, linear transformations, and combined transformations.
Below, consideration is given to the following feature processing methods that are based on the adaptive approach and belong to the above classes: methods of adaptive amplitude transformation, linear preparation methods, combined preparation methods, preparation methods with decision making, and determination and visualization of picture quantitative characteristics. The main characteristic of preparation by means of feature processing is the lack of a formal criterion of picture informativeness for visual analysis.
Therefore, preparation should be done interactively, with the participation of the user controlling the processing by direct observation of the picture in the course of processing. For support of the interactive mode in automated picture processing systems, special devices for dynamic picture visualization (displays and display processors) should be provided. The basic functions of the display processor are as follows: (1) reproduction of high-quality black-and-white and color pictures from the digital signal arriving from the central processor of the picture processing system; (2) provision of feedback from the user to the central processor both for control and video signals; and (3) fast hard-wired picture processing in real time, coordinated with the user's inherent response and comfortable observation conditions. To perform these functions, the display processor should include: (1) digital video signal storage; (2) a bilateral data exchange channel between the memory and the central processor; (3) an arithmetic unit and hard-wired processors for fast picture processing with either subsequent visualization only, or visualization after writing into the memory; (4) a graphic processor with generators of vectors, graphs, and characters; and (5) organs of control and dialogue (functional keys, buttons, joysticks, track balls, light pens, etc.). All modern automated picture processing systems feature display processors (see, for example, Jaroslavskii, 1978; Kulpa, 1976; Machover et al., 1977; Reader and Hubble, 1981; Cady and Hodgson, 1980).

B. Preparation by Means of Adaptive Nonlinear Transformations of the Video Signal Scale
Pointwise nonlinear transformations of video signals are the simplest kind of transformations which may be classified as picture preparation and which came into practice long ago. It suffices to mention such methods as solarization, pseudocoloring in scientific and artistic photography, and gamma correction in print art and TV. With the advent of digital technology, these transformations, realizable in only one operation per picture element, have gained wide acceptance and development. Among the most popular, one
may cite such methods as equidensities, amplitude windows, bit-slicing, equalization, and histogram hyperbolization (Belikova and Yaroslavskii, 1974; Andrews, 1972; Hummel, 1975; Frei, 1977). The latter two methods are notable for the fact that their video signal transformation laws are determined through measurement of a video signal histogram, thus making the transformations adaptive. Equalization is described by the following transformation:
where m is the quantized value of the transformed signal, m = 0, 1, ..., M − 1; h(s) is the histogram of its values, s = 0, 1, ..., M − 1; m̂ is the transformed value; and int(x) is the integer part of x. Histogram equalization brings about higher contrast in those picture areas which have the most frequent values of the video signal. The selectiveness of equalization with respect to the frequency of video signal values is its major advantage over other methods of contrast enhancement. Hyperbolization is related to equalization, but there it is the histogram of the logarithm of the video signal values that is equalized. If equalization is performed simultaneously over the entire picture and is based on the histogram of the entire picture, it is globally adaptive. Often, however, local adaptation is required. In this case, picture fragments should be equalized rather than the entire picture, and the fragments may overlap each other. This mode of processing brings to its logical completion the concept of adaptation in nonlinear amplitude transformations. In fragmentwise equalization with overlapping, the distribution histogram is constructed over the whole fragment, but only its central part, corresponding to the nonoverlapping areas, is transformed. If each succeeding fragment is shifted with respect to the preceding one by one element, the transformation is called "sliding" (Belikova and Yaroslavskii, 1974). The table of the sliding transformation (equalization) varies from one picture element (k, l) to another depending on the variations of the histograms h^(k,l)(s) of the surrounding fragments
m̂(k, l) = int{(M − 1)[Σ(s=0..m) h^(k,l)(s) − h^(k,l)(0)]/[1 − h^(k,l)(0)]}   (44)
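A sketch of global equalization with the cumulative-histogram table described above (the fragmentwise and sliding variants apply the same construction to local windows); the function and its names are illustrative assumptions:

```python
import numpy as np

def equalize(picture, M=256):
    """Histogram equalization: table = int((M - 1) * cumulative histogram).

    picture: array of quantized values 0..M-1; the returned picture is
    obtained by a pointwise lookup in the transformation table.
    """
    h = np.bincount(picture.ravel(), minlength=M) / picture.size
    table = ((M - 1) * np.cumsum(h)).astype(int)  # transformation table
    return table[picture]
```

For fragmentwise equalization the same table would be rebuilt per fragment, which is why recursive (running) histogram estimation matters for the operation count.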
The fragmentwise and sliding equalizations were used in processing space photographs (Belikova et al., 1975,1980; Nepoklonov et al., 1979), geological interpretations of aerial photographs, and medical radiograms (Belikova and Yaroslavskii, 1980).
The effect of fragmentwise equalization may be seen in Fig. 22b. If the original aerial photograph (Fig. 22a) is equalized as a whole (see Fig. 22c) rather than by fragments, its total contrast is also enhanced, but the distinguishability of the details is much worse. Figure 23 shows the fragmentwise equalization of the Venus surface panoramas transmitted by the automatic interplanetary station "Venera-9." It may be easily seen that equalization enables one to distinguish numerous low-contrast details in the bright and dark areas of the panorama and to emphasize the volume of the plate. In both cases, fragments were 15 × 15 elements with a 3 × 3 step. Note that with fragmentwise and sliding equalization, the number of operations required for transformation table generation may become prohibitive if one does not use recursive algorithms for estimation of the current histograms [e.g., see Yaroslavskii (1979a, 1985)]. Equalization may be regarded as a special case of amplitude transformation of the observed signal into that with a given distribution. The
FIG.22. Picture equalization: (a) original aerial photograph; (b) effect of fragmentwise equalization; (c) equalization of the picture as a whole.
FIG. 23. Application of fragmentwise equalization to processing of a Venus surface panorama: (a) before processing; (b) after equalization.
algorithm for this transformation is presented in Section II,E. In the case of equalization, it is a uniform law. Such a transformation may be used for the standardization of various pictures, for example, in constructing photomosaics [see Milgram (1974)] or in texture analysis (Rosenfeld and Troy, 1970). Another interesting possibility of generalization lies in changing the relation between the steepness of a signal's nonlinear transformation and its histogram (Belikova and Yaroslavskii, 1974; Yaroslavskii, 1979a, 1985). At equalization, the transformation steepness is proportional to the histogram values, but it may be made proportional to some power p of the histogram, thus leading to the formula
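The behavior of this rule (transformation steepness proportional to the pth power of the histogram, with p = 1 reducing to equalization) can be sketched as follows; the function and its names are illustrative assumptions, not the original formula, and p ≥ 0 is assumed (negative p would need guarding against empty histogram bins):

```python
import numpy as np

def power_intensify(picture, p=2.0, M=256):
    """'Power intensification': steepness proportional to h(s)**p.

    p = 1 reduces to equalization; p = 0 gives a linear stretch; large p
    suppresses weak histogram modes and extends the most powerful ones.
    """
    h = np.bincount(picture.ravel(), minlength=M) / picture.size
    hp = h ** p
    table = ((M - 1) * np.cumsum(hp) / hp.sum()).astype(int)
    return table[picture]
```

The normalization by the sum of h(s)**p keeps the output in the range 0..M-1 for any p.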
At p > 1, the greater p is, the more the weak modes of the histogram are suppressed and the most powerful ones extended over the entire range. p = 0 corresponds to a linear extension of the video signal. At p < 0, the more powerful the mode, the greater its compression. Processing by Eq. (45) may be named "power intensification" of the picture, the choice of p being left to the user. Notably, Eq. (45) resembles the formulas describing the optimal signal predistortion law for quantization (see Yaroslavskii, 1979a, 1985). This similarity throws more light on the essence of adaptive amplitude transformations. From this point of view, power intensification corresponds to a model regarding the visual system as a quantizing device and the processing as a signal predistortion required for matching with this device. As p → ∞, power intensification becomes adaptive mode quantization, with quantization boundaries lying within the minima between the histogram modes. Adaptive mode quantization is a version of cluster analysis, which is very popular in pattern recognition and classification. Rosenfeld (1969) discusses the application of adaptive mode quantization to picture segmentation as the first step of automatic picture description. A method of adaptive mode quantization as a picture preparation method was developed by Belikova and Yaroslavskii (1974, 1975). This required a new approach to the substantiation of the number of histogram modes and of the criterion of mode separation at quantization. In order to establish quantitative criteria for the selection of optimal boundaries between the modes, it is necessary to have a description of the causes of the fuzziness of modes and of the losses due to misclassification. In picture preparation, the most constructive requirement seems to be that of the minimal number of incorrectly classified picture elements. Other requirements, such as smoothness of the boundaries of isolated areas, or lack of small foreign impregnations inside a large area, or similar conditions, are also possible. The degree of mode fuzziness is defined by the object's properties with respect to the chosen feature. Usually they are not easily formalized, and one has to construct more or less plausible models relying on a priori knowledge of how the properties of objects manifest themselves through the observed picture.
For instance, the picture to be subjected to preparation may be treated as the result of a transformation of an original field containing only “pure” modes (i.e., a field whose distribution of values with respect to the feature under consideration consists of a set of delta functions) effected by random operators and/or noise. Then decision rules may be determined by means of statistical decision theory, for example, for the criterion of minimal frequency of picture element classification error. The field of decisions resulting in this case may be treated as an estimate of the original picture, under the assumption that the prepared picture was obtained by distortion of the original “pure”-mode field by noise and operators. The simplest models with random operators acting upon the ideal picture and with additive or multiplicative noise, for which a closed solution may be obtained with respect to the choice of decision algorithms, usually are not sufficiently adequate to the actual relations between the object properties to be extracted and the measured features. For example, in the distribution of features over the picture, modes may be made significantly fuzzy because of a “trend” over the observed picture, which should not be treated as the result
only of the action of noise or of a linear operator on the signals; picture elements grouping into modes usually make up continuous areas or, at least for visual analysis, only continuous areas should be extracted and small ones disregarded; and so on. In order to improve adaptive mode quantization and allow for the above-mentioned factors, which are difficult to formalize, Belikova and Yaroslavskii (1975) proposed to make use of such auxiliary techniques as fragmentwise processing, separation by mode fuzziness types (fuzziness due to a linear operator and that due to additive noise), mode rejection by the value of the population, and rejection of small details. Some results of the application of the adaptive mode quantization method are illustrated in Figs. 24 through 26. Figure 24a shows the picture used in the experiments, and Figs. 24b-d show the results of its uniform quantization with different mode rejection thresholds by the value of their
Fig. 24. Adaptive mode quantization: (a) original picture; (b)-(d) quantizations with thresholds 4, 5, and 7%, respectively.
Fig. 25. Separation of individual modes: (a) the picture of Fig. 24a as quantized into 3 levels with mode power threshold 10%; (b) details of one of the modes; (c) contours of this mode; (d) superposition of the contours on the original picture.
Fig. 26. Comparison of fragmentwise and global quantizations: (a) original picture; (b) global three-level quantization; (c) the result of fragmentwise quantization without overlapping.
Fig. 26 (continued)
population (power): respectively, 4, 5, and 7%. The resulting numbers of quantization levels were 11, 8, and 4. Comparison of these pictures reveals how details disappear as the mode rejection threshold increases and the preparation appears more generalized. One may separate details pertaining to particular modes from other details, determine their boundaries, and impose them on the original photograph (see Fig. 25).
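A minimal sketch of the quantization idea just illustrated: boundaries are placed at the minima between modes of the smoothed histogram. The smoothing width, the minimum-detection rule, and all names are my own assumptions; the authors' algorithm additionally rejects modes by population and small details, which is omitted here.

```python
import numpy as np

def mode_quantize(img, smooth=5, levels=256):
    """Adaptive mode quantization sketch: quantization boundaries are
    placed at local minima between modes of the smoothed histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    h = np.convolve(hist, np.ones(smooth) / smooth, mode="same")
    # interior local minima of the smoothed histogram become boundaries
    boundaries = [v for v in range(1, levels - 1)
                  if h[v] <= h[v - 1] and h[v] < h[v + 1]]
    return np.digitize(img, np.array(boundaries))   # one label per mode

# two artificial modes at gray levels 40 and 200
img = np.concatenate([np.full(100, 40), np.full(100, 200)])
labels = mode_quantize(img)
```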
Fragmentwise and global (over the entire picture) quantization may be compared by their results as shown in Fig. 26, fragment boundaries being shown by the grid. In Fig. 26b only the rough structure of the picture is left; Fig. 26c preserves numerous details of the original, the picture is sharper, and the boundaries between impregnations are seen better than in the original. Belikova and Yaroslavskii (1980) proposed a method of controlled adaptive transformations which is a further extension of the methods of adaptive amplitude transformation. Transformation parameters are determined there by analyzing the histogram of a picture preparation or fragment, or of a picture of the same object in another radiation range, rather than of the processed picture directly.

C. Linear Preparation Methods as a Version of Optimal Linear Filtration
Numerous linear processing methods that may be regarded as picture preparation are well known. For emphasizing small details, suppression of low, and amplification of high, spatial frequencies of the Fourier signal spectrum is popular. For suppression of small hindering details, low-frequency filtration, i.e., suppression of the higher spatial frequencies of the picture, is advised (Huang et al., 1971; Huang, 1975; Andrews, 1972; Gonzalez and Wintz, 1977; Pratt, 1978; Rosenfeld and Kak, 1982; Rosenfeld, 1969). In order to provide a reasonable basis for the choice of linear transformations and their parameters, it is advisable to treat them as an optimal, in a sense, linear filtration of the useful signal against the noise background, regarding the picture details to be amplified as the useful signal and the background as noise. Let us determine the characteristics of a filter minimizing the squared modulus of the error between the signal of the extracted object (useful signal) and the result of filtration of the observed signal, averaged over all the possible variations of the useful signal and realizations of signal sensor noise. Let us confine our discussion to the most easily realizable filter masks, described by diagonal matrices, and consider the observed signal as an additive mixture of the extracted object and the background picture. Let {α_s}, {β_s}, {γ_s}, and {λ_s} be the representation coefficients with respect to some basis {φ_s(k)}, respectively, of the objects to be extracted, the observed picture, the background, and the filter mask. Then the mean-squared value of the filtration error modulus is
\[ \langle |\varepsilon|^2 \rangle = \Bigl[ \Bigl\langle \frac{1}{N} \sum_{s=0}^{N-1} \overline{|\alpha_s - \lambda_s \beta_s|^2} \Bigr\rangle \Bigr] \tag{46} \]
where the bar means averaging over signal sensor noise, square brackets mean averaging over all the possible object positions in the picture, and angle brackets mean averaging over other stochastic parameters (form, orientation, scale, etc.). It may readily be demonstrated that the values of λ_s minimizing the error are defined by

\[ \lambda_s = \frac{\bigl[\bigl\langle \overline{\alpha_s \beta_s^*} \bigr\rangle\bigr]}{\overline{|\beta_s|^2}} \tag{47} \]

By substituting into Eq. (47)

\[ \beta_s = \alpha_s + \gamma_s \tag{48} \]

one obtains

\[ \lambda_s = \frac{\bigl[\bigl\langle \overline{|\alpha_s|^2} \bigr\rangle\bigr] + \bigl[\bigl\langle \overline{\alpha_s} \bigr\rangle\bigr]\,\overline{\gamma_s^*}}{\overline{|\beta_s|^2}} \tag{49} \]

Since

\[ \alpha_s = \sum_{k=0}^{N-1} a_k \psi_k(s) \tag{50} \]

where ψ_k(s) is a basis reciprocal to {φ_s(k)}, and {a_k} are samples of the object signal,

\[ \bigl[\langle \alpha_s \rangle\bigr] = \sum_{k=0}^{N-1} \bigl[\langle a_k \rangle\bigr]\, \psi_k(s) \tag{51} \]

In the simplest and most natural case, where the object coordinates are uniformly distributed over the picture area, [a_k] is independent of k

\[ [a_k] = [a] \tag{52} \]

and Eq. (51) becomes

\[ \bigl[\langle \alpha_s \rangle\bigr] = [a]\,\psi(s) \tag{53} \]

where

\[ \psi(s) = \sum_{k=0}^{N-1} \psi_k(s) \tag{54} \]

In this case,

\[ \lambda_s = \frac{\bigl[\bigl\langle \overline{|\alpha_s|^2} \bigr\rangle\bigr] + [a]\,\psi(s)\,\overline{\gamma_s^*}}{\overline{|\beta_s|^2}} \tag{55} \]

Note that, since [ψ(s)] = √N δ(s) for the majority of practically used bases, the second term in Eq. (55) affects only that value of λ_0 which is usually
responsible for the inessential constant component over the picture field. Therefore, it will be disregarded below. One may also assume in preparation problems that the objects to be extracted occupy only a minor part of the picture and that the contribution of their variations to the squared modulus of the observed signal spectrum may be taken into account by some smoothing of the spectrum. Thus, one obtains the final formula for the optimal filter mask

\[ \lambda_s = \frac{\widetilde{\overline{|\alpha_s|^2}}}{\widetilde{\overline{|\beta_s|^2}}} \tag{56} \]
where the tilde means the above-mentioned smoothing. This is similar to the classical formula of the optimal Wiener filter, but with the denominator containing only the observed signal power spectrum, smoothed and averaged over the signal sensor noise, rather than the sum of the spectral power densities of signal and noise. Such a filter is optimal for the given observed picture on the average over all the variations of the object to be extracted and of the signal sensor noise. This filter will be referred to as an MRMS filter. If reconstruction of the signal power spectrum is used as the criterion of optimality (Pratt, 1978) instead of the minimum of the rms filtration error, one obtains the filter

\[ \lambda_s = \Bigl( \frac{\widetilde{\overline{|\alpha_s|^2}}}{\widetilde{\overline{|\beta_s|^2}}} \Bigr)^{1/2} \tag{57} \]
which will be called an RSS filter. Finally, if one desires to obtain through filtration the maximum of the ratio of the signal of the desired object at its localization point to the rms value of the background picture, one obtains the filter

\[ \lambda_s = \frac{\overline{\alpha_s^*}}{\widetilde{\overline{|\beta_s|^2}}} \tag{58} \]
which may be called MSNR (see Section IV). Thus, a family of filters (Eqs. (56), (57), and (58)) results that may be used during preparation to make objects more prominent against a hindering background. These filters are adaptive because their characteristics depend on the spectrum of the processed picture. Adaptation may be either global, if the filtration error is averaged over all the picture and, as a result, the formula of the filter frequency response involves the spectrum of the entire picture, or local, if the error is averaged over fragments and the formula involves fragment spectra. Notably, the above-mentioned recommendations about suppression of low, and amplification of high, spatial frequencies when extracting minor details, and about suppression of high spatial frequencies when smoothing pictures, are included as special cases in the above three types of filters. Indeed, the picture spectrum as a rule is a function rapidly decreasing with the growth of the spatial frequency (index s). Thus in all the filters of Eqs. (56)-(58), the
position of the passband maximum varies depending on object size, which affects the numerator. If objects are of small size, the passband maximum lies in the domain of high spatial frequencies; if large details are extracted, it shifts to lower frequencies. Experimental processing of geological and medical pictures has demonstrated the effectiveness of these filters (Belikova and Yaroslavskii, 1980). Figure 27 shows filtration with the aim of enhancing the distinguishability of microcalcinates in mammograms (roentgenograms of the mammary gland), where (a) is the original mammogram and (b) is the result of MSNR filtering. Minor impregnations of microcalcinates into the soft tissues of the mammary gland are one of the most important symptoms of malignant tissue degeneration. Their differentiation in usual mammograms presents significant difficulties, especially at the early stages of disease. Processing like that shown in Fig. 27b
Fig. 27. Optimal filtration for enhancing the distinguishability of microcalcinates in mammograms: (a) original mammogram; (b) result of the optimal MSNR filtration; (c) marks indicating detected points.
Fig. 28. Example of optimal filtration of an angiogram: (a) original brain radiogram; (b) isotropic separation of minor details, such as arbitrarily oriented blood vessels; (c) anisotropic separation of minor details, which extracts vertical vessels.
may be of great help in the early diagnosis of malignant tumors of mammary glands. Figure 28 demonstrates examples of applying similar processing to angiograms with the aim of enhancing the distinguishability of blood vessels. An interesting pseudo-relief effect is observed in Fig. 28c, resulting from the application to the radiogram of Fig. 28a of an anisotropic filter which extracts vertical vessels. Such processing might be an alternative to administering a contrast substance to the patient at examination, which is a painful and sometimes dangerous procedure. Figure 29 illustrates the application of linear filtration to the suppression of ribs and the enhancement of middle-detail contrast in X rays. Fast computer implementation of preparation by spatial filtration is important, since interactive processing requires high speed. Single or multiple (parallel or cascaded) signal filtration through the two-dimensional separable recursive filter of the type in Eq. (32) is one of the fastest approaches to optimal filtration. This filter has a rectangular impulse response and is, therefore, suitable for separation of rectangular vertical or horizontal details. Multiple parallel filtration enables generation of an arbitrarily oriented impulse response corresponding to the orientation of picture details. Successive (cascaded) or iterative processing enables a smoother and, in particular, more isotropic impulse response. Sometimes it is more convenient to perform filtration in the spectral domain. It is good practice to do so if separate spectral components of the signal or narrow intervals of the signal spectrum (as in Fig. 29) are to be suppressed or enhanced. It is important to mention that the speed of existing or predicted digital processors is insufficient for interactive real-time linear transformations. Local spectral adaptation for processing a 1024 × 1024 picture requires, for example, K × 2²⁰ operations, where K is a complexity factor which inherently
Fig. 29. Suppression of ribs and enhancement of the contrast of middle-size details by linear filtration: (a) original X ray; (b) result of filtration.
cannot be less than several tens, even for the best recursive algorithms. Since interactive processing of one frame requires about 0.1 sec, the required speed of a digital processor is in the hundreds of millions of operations per second. Optical technology is known to be much superior in speed to digital technology in linear spatial filtration. There is a simple optical representation of the correction and preparation filters developed here and in Section II. To this end, it suffices, as Yaroslavskii suggested in 1981, to place a nonlinear optical medium, whose transparency depends on the energy of the incoming radiation, into the Fourier plane of the classical coherent optical system for spatial picture filtration. Introduction of this medium makes the optical system adaptive and enables implementation of filters with frequency responses of the type in Eqs. (56)-(58).

D. Rank Algorithms of Picture Preparation

Apart from linear picture preparation methods, it is desirable to have nonlinear ones as well. Arbitrary transformation of digital signals, of course, can be realized with linear and pointwise nonlinear transformations of individual signal samples. Nevertheless, it is advisable to have units larger than pointwise transforms. The distinguishing feature of pictures as two-dimensional signals is that their individual points are related to their neighbors. Therefore, the majority of transformation algorithms are of a local nature; i.e., groups of points in some vicinity of the given point are processed simultaneously. Linear transformations readily comply with this requirement of locality and enable construction of algorithms whose computational complexity is only weakly dependent on the size of the vicinity. Nonlinear picture transformations should feature the same properties. Recently, a very useful class of nonlinear transformations has appeared.
It features both locality and computational simplicity, and consists of algorithms that might be named “rank filtration algorithms” because they are built around the measurement of local-order (rank) picture statistics. A value having rth rank, i.e., occupying the rth place in a list of sample elements ranked in increasing order (in a variational sequence of R elements), is the rth-order statistic of a sample consisting of R values. Obviously, any rth-order statistic m_r(k, l) may be determined from the local histogram h^{(k,l)}(s) through the following equation

\[ m_r(k, l) = \min\Bigl\{ q : \sum_{s=0}^{q} h^{(k,l)}(s) \ge r \Bigr\} \]
For computation of local histograms there exist fast recursive algorithms similar to those of recursive digital filtration (Yaroslavskii, 1985). Therefore, the computational complexity of rank filtration algorithms basically is almost
independent of fragment size. With the computation of specific rank statistics and their derivatives, further simplifications may be possible due, in particular, to the informational redundancy of the picture. The most popular algorithm of this class is median filtration (see Section II,C) (Pratt, 1978; Huang, 1981; Justusson, 1981; Tyan, 1981), where the samples of a processed sequence are replaced by the median of the distribution of values of the points in a given vicinity of these samples. The median is known to be an estimate of the sample mean value that is robust against distribution “tails” (Huber, 1981). It is this robustness that makes the median filter superior to filters computing the local mean for picture smoothing. The low sensitivity of the median to distribution “tails” accounts for the fact, often mentioned in the literature (e.g., see Pratt, 1978), that, in contrast to smoothing by a sliding average, smoothing by the sliding median preserves sharp transitions and detail contours. Robustness allows one to make far-reaching generalizations of median filters, for example, in the direction of constructing median matched two-dimensional filters as robust analogs of linear matched and optimal filters and, in particular, of the filters described in the preceding section. For instance, a median filter with an arbitrary window may be regarded as a robust matched filter for a detail having the form of the filter window. An algorithm based on the determination of the difference between the picture and the result of its arbitrary-window median filtration is a robust analog of the linear filters described in the preceding section and is oriented to the extraction of details in pictures. A version of this filter was described by Frieden (1980). The median represents the nth-order statistic of the local histogram constructed over a fragment consisting of (2n + 1) samples.
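The recursive local-histogram scheme the text alludes to can be sketched in one dimension as follows; the function name and the replicate-border convention are my own assumptions, not the author's implementation. The histogram is updated by one removal and one insertion per step, so the cost of maintaining it does not grow with the window size.

```python
import numpy as np

def rank_filter_1d(x, half, r, levels=256):
    """Sliding rank-r filter over windows of 2*half + 1 samples,
    maintained through a recursively updated local histogram.
    r = half picks the median; r = 0 the minimum; r = 2*half the maximum."""
    n = len(x)
    out = np.empty(n, dtype=x.dtype)
    hist = np.zeros(levels, dtype=int)
    # build the histogram of the first window (borders replicated)
    for i in range(-half, half + 1):
        hist[x[min(max(i, 0), n - 1)]] += 1
    for k in range(n):
        # rank-r value: smallest v whose cumulative count exceeds r
        c = 0
        for v in range(levels):
            c += hist[v]
            if c > r:
                out[k] = v
                break
        # recursive update: drop the leftmost sample, add the next one
        hist[x[max(k - half, 0)]] -= 1
        hist[x[min(k + half + 1, n - 1)]] += 1
    return out

x = np.array([10, 200, 10, 10, 255, 10, 10], dtype=np.uint8)
med = rank_filter_1d(x, half=1, r=1)   # sliding median of 3 samples
```

The difference of two such outputs taken to the right and to the left of the median (e.g., r = 2\*half and r = 0) gives the R-L spread estimate discussed later in this section.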
Other generalizations of the median filter are possible using order statistics different from the median, such as extremal filtration algorithms, where the maximum (2nth-order statistic) or minimum (zero-order statistic) over a (2n + 1)-point fragment is substituted for the sample under consideration. Obviously, if the rank of a point over the fragment is substituted for its value, the above sliding equalization algorithm results. Thus, both sliding equalization and the other existing adaptive amplitude transformation algorithms relying upon analysis of local histograms may be regarded as rank algorithms. This relation is also stressed by another property of the rank algorithms: their local adaptability to the characteristics of processed pictures and their potential applicability to robust feature extraction in preparation and automatic recognition of pictures, rather than to robust smoothing only. As an example of feature extraction rank algorithms, one can describe a robust algorithm for estimation of local dispersion based on computation of the difference between some given order statistics to the right and to the left of the median (R-L algorithms). In Fig. 30 this algorithm is compared with an
Fig. 30. Comparison of sliding variance and rank R-L algorithms: (a) original picture; (b) pattern of values of local variances of the picture in (a) over a 9 × 9-point fragment; (c) result of processing by an R-L algorithm with the same size fragment and R = 51, L = 31.
estimate of local variance obtained by computing the sliding mean of the squared difference between the values of picture points and their local mean values. The comparison demonstrates that R-L algorithms provide much better localization of picture nonuniformities than the sliding variance algorithm.

E. Combined Methods of Preparation. Use of Vision Properties for Picture Preparation
In real applications, the best results may obviously be obtained by using various combinations of nonlinear and linear preparation methods and by utilizing all the possibilities of visual perception. The diversity of combinations is unlimited, but two practically important classes may be distinguished among them: preparation with decision making, and preparation with determination and visualization of quantitative picture characteristics. Pictures resulting from preparation with decision making can be considered as fields of decisions with respect to selected features. A simple example of such algorithms is given by the above (Section III,B) algorithms for adaptive mode quantization, which should be complemented by various linear and nonlinear algorithms whose aim is to provide higher stability of mode selection. MSNR filters with subsequent detection and marking of the most intensive signal overshoots constitute another example of combined algorithms. This corresponds to optimal detection and localization of picture details, as shown in Section IV. Figure 31 presents examples of such processing. The diversity of methods for preparation with determination of quantitative characteristics is as great as the diversity of quantitative picture characteristics. What they have in common is that the results of quantitative measurements are represented as pictures: tables, graphs, dimetric projections of surfaces, lines of equal values, etc. Such preparation with determination and visualization of quantitative characteristics may consist of multiple stages. This may be illustrated by the detection of layers with respect to depth in lunar soil samples conveyed by the automatic interplanetary station “Luna-24” (Leikin et al., 1980). One of the methods for detecting the layered structure of soil samples is separation of layers with respect to the characteristic size of stones in the stone fraction. The following method was employed for determination and visualization of the average size of stones:

(1) Optimal filtration of the original picture (Fig.
32a) by an MRMS filter for separation of the stone fraction from the background;

(2) Binary quantization of the resulting preparation by the adaptive mode quantization algorithm to obtain the field of decisions (Fig. 32b);

(3) Measurement of normalized one-dimensional correlation functions of the preparation rows (i.e., horizontal cross sections of the soil sample) and representation of the correlation function set as a two-dimensional signal whose values are the correlation functions in the coordinates “depth of drilling, interval of the correlation”;

(4) One-dimensional smoothing of this signal by a rectangular window in the direction of increasing depth;

(5) Determination of equal-value lines of the smoothed signal and plotting them in the coordinates “depth, width of the correlation function at a given level” (see the graph for the level 0.5 in Fig. 32c).

This graph is regarded as the final preparation, which along the depth coordinate corresponds to the original picture, and along the other coordinate characterizes the average
Fig. 31. Preparation with decision making: (a) original mammogram; (b) result of linear filtration of the mammogram by an MSNR filter oriented to the detection of microcalcinates; (c) isolation of concentration domains of calcinate-like details.
Fig. 32. Preparation with determination and visualization of the picture's quantitative characteristics: (a) original radiophotograph of a soil column; (b) binary preparation, the result of stone isolation; (c) graph of the correlation function section of the picture in (b) at level 0.5.
diameter of the black spots in the preparation of Fig. 32b, i.e., the average size of stones in the sample. One can easily see hills and valleys in this graph that correspond to the specimen areas with large and small stones. Therefore, the graph is a convenient quantitative measure for division of the specimen into layers according to the average size of stones. Obviously, a single feature is insufficient in the general case for picture interpretation. To put it differently, it is desirable to generate and represent for visual analysis multicomponent or vector features. To solve this problem, the properties of vision should be exploited to full advantage. First of all, color vision might be used for representation of vector features. In this case, simultaneous representation and observation of three-component features is possible: Each of three picture preparations representing three
Fig. 32 (continued)
features is shown by a distinct color (red, blue, or green), and these pictures are mixed on the display screen into a color picture. This technique of representing preparation results may be named “colorization.” Two-component vector features may also be represented by means of stereoscopic vision. This is most natural in processing pictures which comprise a stereoscopic pair. In this case, one or both photographs of the pair are substituted by some preparation, and the observer is thus able to examine the stereoscopic picture with the effects of preparation. Another approach to using stereoscopic vision is to treat the feature resulting from picture preparation as a “relief,” and to synthesize from this relief and the original picture new pictures constituting a stereoscopic pair. The user can thus observe a pseudostereoscopic picture whose brightness is defined by one picture preparation or by the original picture, and whose relief is defined by another one. Finally, there is one more possibility for representing preparation results: picture cinematization, i.e., transformation of preparation results into movies by generating from a series of preparation results a series of movie frames shown at cinematographic speed in order to provide smoothness of the observed changes. Cinematization is best used for observation of smooth variations of a preparation parameter: e.g., fragment size in sliding equalization, the exponent in power intensification, etc. Combinations of all three methods are, of course, possible.

IV. AUTOMATIC LOCALIZATION OF OBJECTS IN PICTURES

One of the major tasks of pictures is to provide information about the relative location of objects in space. In many applications, detection and localization (measurement of coordinates) of objects is of extreme practical importance. Many other problems of automatic picture interpretation, especially those of object recognition, may also be reduced to this problem.
A copious literature exists on localization and detection of objects in pictures, but the variety of ideas used for the solution of this problem is not so rich. Essentially, in all methods, detection and localization of objects reduces to some kind of correlation of the given object with the observed picture and to subsequent comparison of the result with a threshold. The approach is justified either by a simple additive model treating the observed picture as a sum of the desired object and correlated independent noise with a known autocorrelation function (Andrews, 1970; Vander Lugt, 1964; Rosenfeld, 1969; Pratt, 1978), or by the Schwarz inequality (Rosenfeld, 1969). Numerous experimental verifications, however, reveal that, for sufficiently complicated practical pictures, the probability of erroneous identification by a
correlation detector of the desired object with foreign background objects is rather high. In order to improve detection quality, various improvements are suggested, such as signal quantization, spatial differentiation, predistortion of the form of the correlated object, etc. Being heuristic in nature, these improvements can be neither listed nor classified, nor ordered with respect to their quality. At the same time, this adherence to the correlator is no accident. The correlation detector-estimator is essentially a version of the linear detector-estimator, where a decision about the presence of the desired object and its coordinates is made pointwise, through the level of the signal at each point of the field at the output of a linear filter acting upon the observed picture. The aim of the linear filter in such devices is to transform the signal space so as to enable independent decision making on each signal coordinate of the transformed space, rather than on the signal as a whole. Owing to the decomposition into independent linear and nonlinear inertialess (pointwise) units, analysis and implementation of such a device in digital and analog processors is much simplified. This accounts for the popularity of the correlation method for object detection and localization in pictures. Simplicity of implementation is an important factor, and it turns out that one can determine the optimal characteristics of the linear detector-estimator ensuring the best localization reliability by relying upon its representation as a combination of a linear filter and a nonlinear pointwise decision unit, as well as on the adaptive approach developed here. The present section is devoted to the presentation of this approach, which has proved fruitful both for digital and for purely optical processing. In Section IV,A the problem of an optimal detector-estimator is posed.
In Section IV,B the problem of determining the optimal linear filter for localization of an exactly known object by a spatially uniform localization criterion is solved, and data are presented that bear this result out. In Section IV,C it is extended to the case of an inexactly defined object, spatially nonuniform criteria, and a distorted picture. In Section IV,D the results obtained are used to explain the well-known recommendations on the usefulness of extracting contours prior to picture correlation, and to define the very notion of contour more exactly. Moreover, the problem of selecting objects that are best from the standpoint of localization reliability is solved there. In the existing literature, this important practical problem has hardly been discussed.

A. Optimal Linear Coordinate Estimator. Problem Formulation
Let us consider an estimator consisting of a linear filter and decision unit determining the coordinates of the absolute maximum of a signal at the filter
output, and let us determine the optimal linear filter ensuring the best quality of estimation. The quality of the object coordinate estimation is defined by two kinds of errors: errors due to false identification of the object with separate details of the observed picture, and errors of measurement of the coordinates in the vicinity of their true value. Errors of the first kind give large deviations of the result, exceeding the size of the desired object. In the case of detection, they are called false-alarm errors; we shall refer to them as anomalous. Errors of the second kind, or normal errors, are of the order of magnitude of the object size and are due mostly to the distortions of the object signal by sensor noise. They are quite satisfactorily described by the additive model. Therefore, the classical estimator with a matched filter is optimal in terms of the minimum of normal error variance, as was shown by Yaroslavskii in 1972 (it may be assumed that normal errors are characterized by their variance). However, it will yield many anomalous errors. Their probability and the related threshold property of the estimator were discussed in detail by Yaroslavskii (1972b). Here we shall determine the characteristics of the linear filter of an estimator optimal in terms of anomalous errors. Let us define the notion of optimality exactly. In order to allow for possible spatial nonuniformity of the optimality criterion, let us assume that the picture is decomposed into N fragments of area S_n, n = 0, 1, ..., N − 1. Let h^{(n)}(b, x_0, y_0) be the histogram of video signal magnitudes b(x, y) at the filter output, as measured for the nth fragment at points not occupied by the object, provided that the object lies at the point with coordinates (x_0, y_0); and let b_0 be the filter output at the object localization point (it may be assumed that b_0 > 0 without restricting generality).
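The architecture just posed, a linear filter followed by an absolute-maximum decision unit, can be sketched as follows. The plain correlation template used here is only an illustrative choice of linear filter, not the optimal filter derived in this section, and all names are my own.

```python
import numpy as np

def linear_detector_estimator(picture, template):
    """A linear filter (here plain correlation with the template)
    followed by a decision unit taking the coordinates of the
    absolute maximum of the filter output."""
    th, tw = template.shape
    out = np.empty((picture.shape[0] - th + 1, picture.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(picture[i:i + th, j:j + tw] * template)
    return np.unravel_index(np.argmax(out), out.shape)  # decision unit

pic = np.zeros((16, 16))
pic[5:8, 9:12] = 1.0                 # object placed at (5, 9)
loc = linear_detector_estimator(pic, np.ones((3, 3)))
```

On a cluttered background, points of the output exceeding the object response become candidates for the anomalous errors discussed next.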
As the linear estimator under consideration decides upon the coordinates of the desired object via those of the absolute maximum at the linear filter output, the integral

\[ Q_n(x_0, y_0) = \int_{b_0}^{\infty} h^{(n)}(b, x_0, y_0)\, db \tag{59} \]
then represents that portion of the nth fragment points that can be erroneously taken by the decision unit for the object coordinates. Generally speaking, b_0 should be regarded as a random variable, because it depends on video signal sensor noise, the photographing environment, illumination, object orientation at photographing, neighboring objects, and other stochastic factors. In order to take them into consideration, we introduce a function q(b_0), which is the a priori probability density of b_0. The object coordinates also should be regarded as random. Moreover, the weight of measurement errors in localization problems may differ over different picture fragments. To allow for these factors, we introduce weighting functions
APPLIED PROBLEMS OF DIGITAL OPTICS
w^(n)(x_0, y_0) and W_n, characterizing the a priori significance of errors in the determination of coordinates within the nth fragment and of the nth fragment as a whole, respectively:

∬_{S_n} w^(n)(x_0, y_0) dx_0 dy_0 = 1,    Σ_{n=0}^{N−1} W_n = 1    (60)
Then the quality of coordinate estimation by the estimator under consideration may be described by a weighted mean, with respect to q(b_0), w^(n)(x_0, y_0), and W_n, of the integral of Eq. (59):

Q = Σ_{n=0}^{N−1} W_n ∫_{−∞}^{∞} q(b_0) db_0 ∬_{S_n} w^(n)(x_0, y_0) dx_0 dy_0 ∫_{b_0}^{∞} h_n(b, x_0, y_0) db    (61)
If we want to know the mean estimation quality over a set of pictures, Q should be averaged over this set. An estimator providing the minimum of Q will be regarded as optimal.

B. Localization of an Exactly Known Object with Spatially Uniform Optimality Criterion

Assume that the desired object is exactly defined, which means that the response of any filter to this object may be exactly calculated, or that q(b_0) is a delta function

q(b_0) = δ(b_0 − b̄_0)    (62)
Eq. (61) defining the localization quality then becomes

Q = Σ_{n=0}^{N−1} W_n ∬_{S_n} w^(n)(x_0, y_0) dx_0 dy_0 ∫_{b̄_0}^{∞} h_n(b, x_0, y_0) db    (63)
or, if the histogram averaged within each fragment over x_0 and y_0 is denoted by

h̄_n(b) = ∬_{S_n} w^(n)(x_0, y_0) h_n(b, x_0, y_0) dx_0 dy_0    (64)

it becomes

Q = Σ_{n=0}^{N−1} W_n ∫_{b̄_0}^{∞} h̄_n(b) db    (65)
L. P. YAROSLAVSKII
Suppose that the optimality criterion is spatially homogeneous, i.e., that the weights W_n are independent of n and equal to 1/N. Then

h̄(b) = (1/N) Σ_{n=0}^{N−1} h̄_n(b)    (66)

is the histogram of the filter output as measured over the whole picture and averaged with respect to the unknown object coordinates. By substituting Eq. (66) into Eq. (65), we obtain

Q = ∫_{b̄_0}^{∞} h̄(b) db    (67)
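If the filter output is available as a digital picture, the measure of Eq. (67) can be estimated directly: Q is simply the fraction of output samples that reach the response value at the object location. A minimal numpy sketch (function name and toy data are illustrative, not from the original):

```python
import numpy as np

def anomalous_error_measure(filter_output, b0):
    """Empirical estimate of Q = integral from b0 to infinity of h(b) db,
    Eq. (67): the fraction of output samples that the decision unit could
    mistake for the object location, i.e., values reaching b0. Strictly,
    the points occupied by the object itself should be excluded."""
    b = np.asarray(filter_output, dtype=float).ravel()
    return np.count_nonzero(b >= b0) / b.size

# Toy check: a flat background of 0.2 with a single response of 1.0.
out = np.full((8, 8), 0.2)
out[3, 4] = 1.0
q = anomalous_error_measure(out, 0.9)  # only the true peak reaches 0.9
```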
First, determine the frequency response H(f_x, f_y) of a filter minimizing Q. The choice of H(f_x, f_y) affects both b̄_0 and the histogram h̄(b). Since b̄_0 is the filter response at the object location, it may be determined through the object spectrum α_0(f_x, f_y) as

b̄_0 = ∬_{−∞}^{∞} α_0(f_x, f_y) H(f_x, f_y) df_x df_y    (68)
As for the relation between h̄(b) and H(f_x, f_y), it is, generally speaking, of an involved nature. An explicit dependence on H(f_x, f_y) may be written only for the second moment of the histogram h̄(b), by making use of the Parseval relation for the Fourier transform:

m_2² = ∫_{−∞}^{∞} b² h̄(b) db = (1/S̄) ∬_{−∞}^{∞} |ᾱ_bg(f_x, f_y)|² |H(f_x, f_y)|² df_x df_y    (70)

where S̄ is the area of the picture under consideration minus the area
occupied by the signal of the desired object at the filter output, α_bg(f_x, f_y; x_0, y_0) is the Fourier spectrum of the picture in which the signal in the area occupied by the desired object is set to zero (the background spectrum), and the bar denotes averaging over the object position with weight w:

|ᾱ_bg(f_x, f_y)|² = ∬_S w(x_0, y_0) |α_bg(f_x, f_y; x_0, y_0)|² dx_0 dy_0    (71)
Therefore, we shall rely upon Chebyshev's inequality, well known in probability theory, which for histograms reads

∫_{b̄_0}^{∞} h̄(b) db ≤ m_2²/b̄_0²    (72)

and require that g = m_2²/b̄_0² be minimal. This condition is equivalent to the maximum of

γ_1 = b̄_0²/m_2² = S̄ |∬_{−∞}^{∞} α_0 H df_x df_y|² / ∬_{−∞}^{∞} |ᾱ_bg|² |H|² df_x df_y    (73)
In order to determine the maximum of γ_1 with respect to H(f_x, f_y), let us make use of the Schwarz inequality

|∬_{−∞}^{∞} α_0 H df_x df_y|² ≤ ∬_{−∞}^{∞} (|α_0|²/|ᾱ_bg|²) df_x df_y · ∬_{−∞}^{∞} |ᾱ_bg|² |H|² df_x df_y    (74)

from which it follows that the maximum

γ_1,max = S̄ ∬_{−∞}^{∞} (|α_0(f_x, f_y)|²/|ᾱ_bg(f_x, f_y)|²) df_x df_y    (75)

is attained at

H(f_x, f_y) = α_0*(f_x, f_y)/|ᾱ_bg(f_x, f_y)|²    (76)
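A digital realization of the estimator built on Eq. (76), combined with the approximation of Eq. (80) for the background power spectrum, can be sketched in a few lines of numpy. The function name, the zero-padding convention, and the regularizing constant eps are assumptions of this sketch, not part of the original text:

```python
import numpy as np

def optimal_localize(picture, obj, eps=1e-6):
    """Localization by the optimal linear estimator of Eq. (76), with the
    observed picture power spectrum standing in for the background power
    spectrum as in Eq. (80). Returns the (row, col) of the absolute
    maximum of the filter output."""
    P = np.fft.fft2(picture)
    # Zero-pad the object template to picture size, anchored at the origin.
    t = np.zeros_like(picture, dtype=float)
    t[:obj.shape[0], :obj.shape[1]] = obj
    O = np.fft.fft2(t)
    H = np.conj(O) / (np.abs(P) ** 2 + eps)   # Eq. (76) with Eq. (80)
    out = np.real(np.fft.ifft2(P * H))
    return np.unravel_index(np.argmax(out), out.shape)
```

With a distinctive template embedded in weak noise, the absolute maximum of the output falls at the template position, in the spirit of the test-mark simulation described in the text.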
One may express |ᾱ_bg(f_x, f_y)|² through the spectrum of the observed picture α_p(f_x, f_y) and that of the desired object α_0(f_x, f_y). Obviously, for the object located at (x_0, y_0),

α_bg(f_x, f_y; x_0, y_0) = α_p(f_x, f_y) − α_0(f_x, f_y) exp[i2π(f_x x_0 + f_y y_0)]    (77)

Then, substitution of Eq. (77) into Eq. (71) results in

|ᾱ_bg|² = |α_p|² + |α_0|² − α_p* α_0 w − α_p α_0* w*    (78)

where

w(f_x, f_y) = ∬_S w(x_0, y_0) exp[i2π(f_x x_0 + f_y y_0)] dx_0 dy_0    (79)

is the spectrum of the weight function w(x_0, y_0). Usually, the area occupied by the desired object is much less than the area of the picture itself. Therefore, the following approximate estimate is often practicable:

|ᾱ_bg(f_x, f_y)|² ≈ |α_p(f_x, f_y)|²    (80)

Obviously, if an optimal filter is required for a set of pictures, the result of averaging the spectra over the set should be substituted into Eqs. (78) and (80) for |α_p(f_x, f_y)|². Such an optimal filter may be rather easily implemented by optical means (Yaroslavskii, 1976a, 1981) in an adaptive optical system with a nonlinear element in the Fourier plane and has been shown to give good results (Dudinov et al., 1977). With digital realization, it is most reasonable to process the signal in the frequency domain, because the frequency response [Eq. (76)] of the optimal filter is based on measurement of the observed picture spectrum. Computer simulation of the optimal linear estimator has also confirmed its advantage over the traditional correlator. Figure 33 shows a 512 × 512-element picture over which experiments were carried out on determination of the coordinates of 20 test 5 × 5-element dark marks whose disposition is shown in Fig. 34 by numbered squares. As may be seen from this scheme, the test objects are situated in structurally different areas of the aerial photograph; this fact enables us to test the correlator and the optimal linear estimator under different conditions. The contrast of the marks is about 25% of the video signal amplitude range. The ratio of the mark amplitude to the rms video signal value over the background is about 1.5. The results of the simulation are shown in Fig. 35, which presents (in the downward direction) the cross sections, passing through the centers of marks (12) and (15) in Fig. 33, of the initial video signal and of the outputs of a standard correlator and of the optimal filter. One may easily see in the graph of the correlator output the autocorrelation peaks of the test marks and false correlation peaks, including some exceeding the autocorrelation one.

FIG. 33. Test aerial photograph with square marks.

These false peaks result in false decisions (Fig. 36). Comparison of this graph with the lower one in Fig. 35 shows how much the optimal filter facilitates the task of spot localization for the decision unit. The results of the optimal estimator operation are tabulated in Table III, which lists the 31 main local maxima of the optimal filter output. As may be seen from the table, the coordinates of all twenty test marks are precisely measured, and no false decision is made. It may also be seen which areas of the picture give a smaller output response, i.e., are potentially localizable with greater difficulty (see also Fig. 34, where each spot is numbered as in Table III).

FIG. 34. Scheme of marks in Fig. 33.

FIG. 35. Graphs of a section of the original picture (Fig. 33): video signal (upper), standard correlator output (middle), and optimal filter output (lower).
FIG. 36. Scheme of decisions at the standard correlator output.
TABLE III
RESULTS OF MEASURING TEST MARKS IN FIG. 35

Serial number*   Relative local maximum     Serial number*   Relative local maximum
 1               1                          17               0.762
 2               0.88                       18               0.754
 3               0.81                       19               0.754
 4               0.83                       20               0.737
 5               0.83                       21               0.733
 6               0.83                       22               0.729
 7               0.82                       23               0.725
 8               0.8                        24               0.721
 9               0.8                        25               0.721
10               0.78                       26               0.713
11               0.78                       27               0.709
12               0.78                       28               0.709
13               0.778                      29               0.709
14               0.774                      30               0.704
15               0.77                       31               0.704
16               0.766

* 1-20 are true peaks; 21-31 are false peaks.
C. Allowance for Object’s Uncertainty of Definition and Spatial Nonuniformity: Localization on “Blurred Pictures” and Characteristics of Detection

1. Localization of an Inexactly Defined Object
This is the case when q(b_0) cannot be regarded as a delta function; i.e., the object is not exactly known. As before, the picture will be regarded as spatially uniform. Now, the optimal estimator must provide the minimum of the integral

Q_1 = ∫_{−∞}^{∞} q(b_0) db_0 ∫_{b_0}^{∞} h̄(b) db    (81)

where h̄(b) is defined by Eq. (66).
a. Estimator with selection. Decompose the interval of possible values of b_0 into subintervals within which q(b_0) may be regarded as constant. Then

Q_1 = Σ_i q_i ∫_{b_0^(i)}^{∞} h̄(b) db    (82)

where b_0^(i) is representative of the ith interval and q_i is the area under q(b_0) over the ith interval. As q_i ≥ 0, Q_1 is minimal if each

Q_1^(i) = ∫_{b_0^(i)}^{∞} h̄(b) db    (83)

is minimal. The problem is thus reduced to the above problem of localization of an exactly known object, the only difference being that now an estimator with the filter

H^(i)(f_x, f_y) = α_0^(i)*(f_x, f_y)/|ᾱ_bg(f_x, f_y)|²    (84)

should be generated separately for each “representative” of all the possible object variations. Stated differently, this means that there is more than one given object. Of course, this results in a loss of time on selection.
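The selection estimator can be sketched as a bank of filters of the form of Eq. (84), one per representative, with the decision taken on the largest output maximum. Comparing raw maxima across variants without per-filter normalization is a simplification of this sketch, and all names are illustrative:

```python
import numpy as np

def localize_with_selection(picture, variants, eps=1e-6):
    """Selection estimator of Section IV,C,1a: one whitened matched
    filter per 'representative' object variant; the variant whose
    output has the largest absolute maximum wins, together with the
    location of that maximum."""
    P = np.fft.fft2(picture)
    best = (-np.inf, None, None)
    for k, obj in enumerate(variants):
        t = np.zeros(picture.shape)
        t[:obj.shape[0], :obj.shape[1]] = obj
        O = np.fft.fft2(t)
        out = np.real(np.fft.ifft2(P * np.conj(O) / (np.abs(P) ** 2 + eps)))
        peak = out.max()
        if peak > best[0]:
            best = (peak, k, np.unravel_index(np.argmax(out), out.shape))
    return best[1], best[2]
```

The loop makes the time cost of selection explicit: the filtering work grows linearly with the number of representatives.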
b. Estimator adjusted to an averaged object. If the dispersion of the parameters is small enough, one may, at the expense of a higher rate of anomalous errors, solve the problem as though the object were exactly known; the optimal filter in this case is corrected with due regard to the object parameter dispersion. In order to correct the filter characteristic, change in Eq. (81) the variable to b_1 = b − b_0 and the order of integration:

Q_1 = ∫_{0}^{∞} db_1 ∫_{−∞}^{∞} q(b_0) h̄(b_1 + b_0) db_0    (85)
The internal integral in Eq. (85) is a convolution of distributions, i.e., the distribution of the difference of the two independent variables b and b_0. Denote this distribution by h̄_1(b_1). Its mean value is equal to the difference of the mean values of the distributions h̄(b) and q(b_0), and its variance is equal to the sum of the variances of these distributions, that is, m_2² + δ_0², where δ_0² is the variance of the distribution q(b_0). Therefore,

Q_1 = ∫_{0}^{∞} h̄_1(b_1) db_1    (86)
The problem has thus boiled down to that of Section IV,B, and similarly to Eq. (76) one may write the following expression for the optimal filter frequency response

H(f_x, f_y) = ᾱ_0*(f_x, f_y)/(|ᾱ_bg(f_x, f_y)|² + |α_ef(f_x, f_y)|²)    (87)

where ᾱ_0*(f_x, f_y) is the complex conjugate of the object spectrum averaged over the set of unknown object parameters [the result of averaging over q(b_0) in Eq. (85)], and

|α_ef(f_x, f_y)|² = mean of |α_0(f_x, f_y) − ᾱ_0(f_x, f_y)|²    (88)

is the mean-squared difference α_0(f_x, f_y) − ᾱ_0(f_x, f_y). The optimal filter is somewhat different from that of the determinate case: It relies upon an “averaged” object and a corrected power spectrum of the background picture, the correction being the mean-squared spread of the object spectrum.

2. Localization in the Case of Spatially Nonhomogeneous Criterion
Let us turn to the general formula, Eq. (61). Depending on the constraints on implementation, one of the two ways to attain the minimum of Q may be chosen.
a. Readjustable Estimator with Fragmentwise Optimal Filtration. Under given W_n, the minimum of Q is attained at the minima of all

Q_1^(n) = ∫_{−∞}^{∞} q(b_0) db_0 ∬_{S_n} w^(n)(x_0, y_0) dx_0 dy_0 ∫_{b_0}^{∞} h_n(b, x_0, y_0) db    (89)

This means that the linear filter should be readjustable and should process pictures by fragments within which the averaging in Eq. (89) is done. For each fragment, the characteristic of an optimal filter is determined through Eq. (76) or (87) on the basis of measurements of the observed local power spectrum of the fragment (with allowance for the above reservations about the influence of the object spectrum on the observed picture spectrum). According to Eq. (61), the
fragments do not overlap. It is obvious from the very sense of Eq. (61) that it also gives rise to a sliding processing algorithm based on an estimate of the current local power spectrum of the picture, because the error weights may be defined by a continuous function. Note also that, with fragmentwise and sliding processing, the readjustable filter characteristic is independent of the weights W_n or of a corresponding continuous function.

b. Nonreadjustable Estimator. When a readjustable estimator with fragmentwise or sliding processing cannot be implemented, the estimator should be adjusted to the power spectrum of picture fragments averaged over W_n. Indeed, it follows from Eq. (61) that

Q = ∫_{−∞}^{∞} q(b_0) db_0 ∫_{b_0}^{∞} h̄_Σ(b) db    (90)

where h̄_Σ(b) is a histogram averaged over {W_n} and w^(n)(x_0, y_0), whence one may conclude, by analogy with Eqs. (76) and (87), that

H(f_x, f_y) = α_0*(f_x, f_y)/|ᾱ_Σ(f_x, f_y)|²    (91)

where

|ᾱ_Σ(f_x, f_y)|² = Σ_{n=0}^{N−1} W_n |ᾱ_bg^(n)(f_x, f_y)|²    (92)

Thus, the transfer function of the optimal filter is in this case dependent on the weights {W_n}.

3. Localization on Defocused Pictures
Let the picture be distorted by a linear, spatially invariant system with frequency response H_s(f_x, f_y). Obviously, the optimal estimator should be adjusted to an object that has been subjected to the same transformation as the observed picture; i.e., the filter transfer characteristic should be as follows

H(f_x, f_y) = α_0*(f_x, f_y) H_s*(f_x, f_y) / (|ᾱ_bg(f_x, f_y)|² |H_s(f_x, f_y)|²)    (93)

Depending on which way is more convenient for filter implementation and for representation of the reference object, different modifications of this formula
are possible. For example,

H(f_x, f_y) = [1/((|ᾱ_bg|²)^{1/2} |H_s|)] · [α_0* H_s*/((|ᾱ_bg|²)^{1/2} |H_s|)]    (94)

corresponds to the estimator in which the observed defocused picture is first “whitened” by a filter making its power spectrum uniform and then correlated with the reference α_0* H_s*/((|ᾱ_bg|²)^{1/2} |H_s|). The ratio (|ᾱ_bg|²)^{1/2}/|H_s| may be regarded as a picture spectrum at the output of a filter inverse to the defocusing one, i.e., as the spectrum of a picture corrected by the inverse filter. Here a relation exists between localization in defocused pictures and correction of pictures distorted by linear systems (see Section II,D).
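Following Eq. (93), a sketch of the defocus-adjusted filter takes the blur frequency response H_s as an extra input; with H_s = 1 it reduces to the filter of Eq. (76). The function name and the regularizing eps are assumptions of this sketch:

```python
import numpy as np

def defocus_adjusted_filter(obj_spec, bg_power, Hs, eps=1e-9):
    """Eq. (93) sketch: the reference object is passed through the same
    blur H_s as the observed picture, so the estimator correlates the
    blurred picture with a blurred (and whitened) reference."""
    return np.conj(obj_spec * Hs) / (bg_power * np.abs(Hs) ** 2 + eps)
```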
4. Detection Characteristics

Sometimes it is desirable to detect an object with certain reliability without a priori knowledge of whether it is present in the picture. Detection reliability is known to be characterized by the conditional probabilities of missing the object and of a false alarm (false detection). A peculiar feature of the localization and detection problem under consideration lies in the fact that the possibilities of missing an object and of a false alarm depend on different random factors: The former depends on signal sensor noise, and the latter on the presence of foreign objects and (to a lesser degree) on signal sensor noise. Since foreign objects are assumed not to be defined a priori, it is impossible to determine the probability of a false alarm. One can only be sure that for the observed set of foreign objects it is minimized by the appropriate choice of the above linear filter. In order to determine the false alarm probability, one has to assume a statistical description of foreign objects in the form, for instance, of a signal overshoot distribution at the output of the optimal filter as defined for a given class of pictures. The noise of the video signal sensor is quite satisfactorily described by the additive Gaussian model. Therefore, the probability of missing the object may be defined as

P_miss = Φ[(b_thr − b̄_0)/δ]    (95)

where b̄_0 is the maximal signal of the desired object at the optimal filter output, b_thr is the chosen detection threshold, δ is the standard deviation of the sensor noise, and Φ(x) is the error integral

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} exp(−t²/2) dt    (96)
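Equations (95) and (96) are directly computable with the standard error function; the function below is a sketch with illustrative names:

```python
import math

def miss_probability(b0, b_thr, sigma):
    """Eq. (95): probability of missing the object when the optimal-filter
    response b0 is corrupted by additive Gaussian sensor noise of standard
    deviation sigma and compared with the detection threshold b_thr.
    phi is the standard normal error integral of Eq. (96)."""
    phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return phi((b_thr - b0) / sigma)

p = miss_probability(5.0, 0.0, 1.0)  # response far above threshold: ~0
```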
D. Optimal Localization and Picture Contours. Selection of Objects from the Viewpoint of Localization Reliability
1. Whitening and Contours

In order to gain insight into the sense of the operations performed on the observed picture by the derived optimal linear filter, its characteristic, Eq. (76), may be conveniently represented as

H(f_x, f_y) = H_1(f_x, f_y) H_2(f_x, f_y) = [1/(|ᾱ_bg(f_x, f_y)|²)^{1/2}] · [α_0*(f_x, f_y)/(|ᾱ_bg(f_x, f_y)|²)^{1/2}]    (97)

In this representation, the filter action is reduced to the picture whitening (filter H_1) mentioned above in Section IV,C, followed by correlation of the whitened picture with the identically transformed desired object (filter H_2). An interesting feature of the optimal filter of Eq. (97) is that whitening by the filter H_1(f_x, f_y) = 1/(|ᾱ_bg(f_x, f_y)|²)^{1/2} usually brings about contouring of the observed picture owing to amplification of its high spatial frequencies: as a rule, the picture power spectrum is a rapidly decreasing function of spatial frequency and, consequently, H_1(f_x, f_y) grows with frequency. This conclusion is illustrated by Fig. 37, demonstrating the result of whitening of the picture shown in Fig. 33, and also by the results of test picture whitening shown in Fig. 38. The recommendation empirically established by some researchers, that in order to enhance localization reliability it is good practice to extract contours of the picture prior to correlation by some kind of spatial differentiation, or to quantize it roughly to improve boundary sharpness, thus has a rational substantiation. Moreover, this result casts new light on what is to be regarded as picture contours and why contours are of such importance for the visual system. The concept of contours occurs often and is variously defined in publications on picture processing and recognition. From the viewpoint of object localization in pictures by the linear estimator, “contours” are what results from picture whitening. The more intensive this “contour” portion of the signal describing the object (the sharper the object picture, in particular), the more reliable is the localization.
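The whitening component H_1 of Eq. (97) is easy to demonstrate: dividing a picture spectrum by its modulus leaves only phase information, and for a step edge the reconstructed signal concentrates at the brightness transition, i.e., at the contour. A numpy sketch (eps and names are assumptions of the sketch):

```python
import numpy as np

def whiten(picture, eps=1e-6):
    """Whitening in the spirit of filter H1 of Eq. (97): divide the
    picture spectrum by its modulus, keeping only phase. Since picture
    spectra usually fall off rapidly with spatial frequency, this boosts
    high frequencies and outlines contours."""
    P = np.fft.fft2(picture)
    return np.real(np.fft.ifft2(P / (np.abs(P) + eps)))

# A vertical step edge: the whitened signal concentrates near the
# brightness transition at columns 7-8 (and, periodically, 15-0).
img = np.zeros((16, 16))
img[:, 8:] = 1.0
w = whiten(img)
```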
Possibly, from this standpoint one can explain the well-known effect in vision psychophysics that the visibility of noise and distortions near sharp brightness transitions (object boundaries) is lower than where brightness varies smoothly, i.e., where the intensity of the “contour” signal is small. Notably, when contour extraction is discussed, isotropic differentiating procedures are usually implied. The optimal whitening for localization, however, is not necessarily isotropic or differentiating, because it is defined by the spectrum of the background picture or, in the case of a spatially nonhomogeneous estimator, by the spectra of the picture fragments over which the desired object is sought. Moreover, the same phenomenon accounts for the adaptivity of whitening: the filter characteristic is adjusted to the observed picture, and the effect of whitening differs from picture to picture. For example, it is the angular points that are emphasized in rectangles and parallelograms against a background of circles; in texts, vertical and horizontal fragments of characters are contoured (practically, only angular points are left of them), while sloping fragments almost do not change because they occur rarely (see Fig. 38b).

FIG. 37. The result of “whitening” of the picture of Fig. 33.
FIG. 38. “Whitening” of the test picture consisting of geometrical figures and characters: (a) original picture; (b) after whitening.
2. Selection of Reference Objects in Terms of Localization Reliability

There are numerous applications in which the localization object is not given in advance and one has to choose it. The question is how to do this to best advantage. This problem occurs in stereogrammetry and artificial intelligence, where it is called “the problem of characteristic points.” The literature on stereogrammetry recommends taking as reference objects those fragments that have pronounced local characteristics, such as crossroads, river bends, separate buildings, etc. Zavalishin and Muchnic (1974) suggest taking those picture areas over which some specially introduced informativeness functions have extremal values. Qualitative recommendations of this sort may also be found in other publications on pattern recognition.
The above analysis gives a solution to this problem. Indeed, it follows from Eq. (75) for the maximal “signal-to-noise” ratio at the output of the optimal linear filter that the picture fragments with the maximal “whitened” spectrum power ∬ |α_0|²/|ᾱ_bg|² df_x df_y will be the best references. They will provide the greatest response of the optimal filter and, consequently, the minimum of false identification errors. Hence, the following recommendation may be made on the selection of reference objects (in stereogrammetry, for example). One of the stereo pair pictures should be decomposed into fragments, and the ratio of their spectra α(f_x, f_y) to the squared modulus of the second picture spectrum |α_p(f_x, f_y)|² should be determined. Next, for each fragment the integral of Eq. (75) (or the corresponding sum in digital processing) is computed, and the required number of greatest results is chosen. Since, as has already been observed, the picture spectrum is most commonly a rapidly decreasing function, the best reference objects will be those with slowly decreasing spectra, i.e., picture fragments that are visually estimated as containing the most intensive contours. These recommendations were checked experimentally by Belinskii and Yaroslavskii (1980). Figures 39 and 40 show some of the results of detection of reference objects by means of the above algorithm with sliding processing by a 32 × 32 window. The reliability with which an object (fragment of the original picture) can be detected is shown by the degree of blackening. It may be readily seen that the best fragments are distinguished where the original picture has some sharply pronounced local peculiarities: brightness transitions, variations of texture pattern, etc. The algorithm for reference object determination requires rather cumbersome computations, especially with sliding processing. Therefore, computationally simpler algorithms approximating the optimal one are of interest.
Experiments (Belinskii and Yaroslavskii, 1980) have shown that algorithms computing the local variance, or mean local values of video signal gradients, for which fast recursive algorithms exist, may be used as simplified algorithms. All the processing methods described in this article may be effectively implemented in a hybrid optodigital system built around an adaptive optical correlator with a nonlinear medium in the Fourier plane (Yaroslavskii, 1976a, 1981). With a purely digital implementation one has to make some simplifications in order to enhance the speed. This is exemplified by rough quantization of the whitened signal, which (Belinskii et al., 1980) enables a drastic reduction of the operations for computing the correlation between the “whitened” picture and the desired object, and by the algorithm for identification of benchmarks in aerial and space photographs (Yaroslavskii, 1976b).
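The simplified selection rule can be sketched as follows: score fragments by local variance, a cheap surrogate for the whitened spectral energy of Eq. (75), and keep the highest-scoring ones. The function name and the non-overlapping fragment grid are assumptions of this sketch:

```python
import numpy as np

def best_fragments(picture, size, count):
    """Simplified reference-object selection in the spirit of Belinskii
    and Yaroslavskii (1980): score each non-overlapping size x size
    fragment by its local variance and return the corners of the
    'count' highest-scoring fragments."""
    h, w = picture.shape
    scores = []
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            scores.append((picture[i:i + size, j:j + size].var(), (i, j)))
    scores.sort(reverse=True)
    return [corner for _, corner in scores[:count]]
```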
FIG. 39. Automatic extraction of reference objects in an aerial photograph: (a) original picture; (b) result of testing of 32 × 32 fragments.
FIG. 40. Automatic extraction of reference objects in a space photograph: (a) original picture; (b) result of testing of 32 × 32 fragments.
E. Estimation of the Volume of Signal Corresponding to a Stereoscopic Picture

The stereo effect is known to be one of the basic stereo vision mechanisms (Valyus, 1950) widely used in different projections of stereo TV and cinema (Shmakov et al., 1966), in applied TV (Shmakov et al., 1966), in aerial photography and cartography, and in many other fields of human activity making use of visual information. Therefore, it is of great practical interest to estimate the volume of signal corresponding to stereo pictures, i.e., the capacity of the channel required for storage and transmission of stereo pictures. This problem is discussed in a number of publications (see, for example, Shmakov et al., 1966; Gurevich and Odnol’ko, 1970), from which one may conclude that the volume of signal corresponding to a stereo pair is approximately twice that of one picture in the pair, i.e., the capacity of the channel for transmission and storage of stereo pictures is approximately twice that of the single-picture channel. These estimates are based on data on the visual resolution of flat pictures and of those with depth, and rely upon an implicit assumption that the resolutions of stereo vision for the brightness and relief (depth) components of the stereo picture are equal. Being unfounded, this assumption leads to an overstated estimate of the signal volume. The analysis of optimal localization of objects in pictures presented in this article enables much more optimistic estimates. From the informational standpoint, the two pictures of a stereo pair are equivalent to one picture plus the relief (depth) map of the scene. Indeed, by means of the two pictures one can construct a relief map and, vice versa, from a relief map and one of the pictures the second picture of the pair may be constructed. Therefore, the increment of signal volume provided by the second picture of a pair is equal to the signal volume corresponding to the relief map.
The number of depth grades resolved by the eye is approximately the same as the number of brightness grades (about 200 according to Gurevich and Odnol’ko, 1970). Therefore, the relative increment of signal volume will be mostly defined by the number of degrees of freedom of the relief map, i.e., by the number of its independent samples. This number may be estimated by the following simple reasoning. Each sample of the relief map may be determined by identifying corresponding areas in the photographs that form a stereo pair, measuring their parallax, and recalculating it into the relief (plan) depth with due regard to the survey (observation) geometry. All engineering systems using stereo pictures operate in this manner, and it is natural to assume that the stereo vision mechanism operates similarly. The number of degrees of freedom
(independent samples) of the relief map is obviously equal to the ratio of the picture area to the minimal area of the fragments that may be identified with confidence in the other picture of the pair. It is also evident that, in order to provide reliable identification, the dimensions of the identified fragments should exceed those of the picture resolution element, and their area should be several times that of the resolution element. This implies that the number of independent samples of the relief map and, consequently, the signal volume increment will always be several times less than the number of resolution elements in a stereo-pair picture. For example, for identified areas of 2 × 2 and 3 × 3 elements, the increment of signal volume will be, respectively, 4 and 9 times less than the signal volume of one picture. The studies of an optimal linear detector of objects in pictures (Belinskii and Yaroslavskii, 1980) demonstrate that, for reliable identification in complicated pictures, areas should be more than 8 × 8 to 10 × 10 picture elements. This fact enables one to hypothesize that the signal volume increment required for representation of the stereo effect is only several percent, or even a fraction of one percent, of the signal volume for one picture of a stereo pair. The present writer has carried out a series of experiments on stereo picture processing with the aim of indirect verification of this hypothesis. Samples of one of the stereo pair pictures were thinned out, and bilinearly interpolated samples were substituted for the rejected ones. The experiments were aimed at determining the influence of thinning out on the perception of depth and sharpness of the observed stereoscopic picture. Experiments were carried out with frames of a stereoscopic cartoon film (Fig. 41) and a training aerial photograph (Fig. 42).
The former were of interest because of the sharp steplike changes of plans over which the loss of resolution in one of the pictures caused by thinning out and interpolation might be more prominent. The stereo aerial photograph was used for a quantitative estimation of the influence of thinning out and interpolation on the precision of parallax measurements and, thus, on the accuracy of a relief map. Observing these stereo pictures, one may see that thinning out and interpolation of one picture do not noticeably affect the stereo picture quality even at 5 × 5 thinning out, when the signal volume is decreased by a factor of 25. The same is confirmed by the results of measuring the precision of parallax determination for corresponding points, as performed on the stereo comparator for the aerial photograph of Fig. 42 over 31 randomly selected fragments. These results are plotted in Fig. 43. The graph of Fig. 43a shows that at 1:3 thinning out the rms error of parallax measurement is within the precision of the stereo comparator, which is characterized by the error for nonrastered (i.e., not sampled and reconstructed) pictures. Moreover, rastering and 1:2 thinning out slightly decrease this error. This may be explained by the fact that at sampling and reconstruction of pictures by means of a rectangular aperture, pseudocontours occur at the boundaries of neighboring samples, and these somewhat improve the accuracy of localization of corresponding points. As may be seen from the graph in Fig. 43b, the loss of stereo effect becomes noticeable only with 1:7 thinning out, thus confirming the above hypothesis. At the qualitative level it is confirmed also by the well-known fact that one of the pictures in a pair may be distorted significantly (decrease of sharpness, distorted reproduction of half-tones, distortion or even complete loss of colors) without appreciable loss of the stereo effect. On the other hand, the reasoning used for estimation of the signal volume increment seems to elucidate these phenomena to some extent.

FIG. 41. Influence of thinning out of a picture from a stereo pair on the stereoscopic effect: (a) original stereo pair; (b)-(e) the right-hand frame of (a) thinned out with steps of 2:1, 3:1, 4:1, and 5:1.
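The thinning-out experiment is easy to reproduce in outline: retain every kth sample of one picture and restore the rest by separable (bilinear) interpolation, reducing the retained signal volume by a factor of k². A numpy sketch with illustrative names:

```python
import numpy as np

def thin_and_interpolate(img, k):
    """Keep every k-th sample of one stereo-pair picture and restore the
    rejected samples by separable (bilinear) interpolation, as in the
    thinning-out experiments. Beyond the last retained sample, np.interp
    holds the edge value constant."""
    h, w = img.shape
    ys, xs = np.arange(0, h, k), np.arange(0, w, k)
    coarse = img[np.ix_(ys, xs)]
    # Interpolate back to the full grid, one axis at a time.
    out = np.empty((h, len(xs)))
    for c in range(len(xs)):
        out[:, c] = np.interp(np.arange(h), ys, coarse[:, c])
    full = np.empty((h, w))
    for r in range(h):
        full[r] = np.interp(np.arange(w), xs, out[r])
    return full
```

A linear brightness ramp is restored exactly, which illustrates why smooth picture areas suffer little from thinning out.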
FIG. 42. Tutorial aerial photograph used in the experiments on thinning out.
It should be noted that the arguments about the minimal size of identifiable areas are tentative, because special pictures and objects may be imagined (e.g., sparse contrast points or linear objects against an absolutely even background) for which the increment estimate will not be so optimistic. However, it seems to hold for complicated pictures of natural origin.

V. SYNTHESIS OF HOLOGRAMS
FIG. 43. (a) rms of the parallax estimation error and (b) the rate of points with loss of stereo effect as functions of the degree of thinning out. Point “0” on the abscissa corresponds to the nonsampled picture and characterizes the precision of the stereo comparator. Point “1” corresponds to the sampled original picture without thinning out. Points 2 through 8 correspond to thinning out 1:2 through 1:8.

Hologram synthesis requires the solution of two major problems: computation of the field to be recorded on the hologram, and recording of the computation results on a physical carrier capable of interacting with radiation in a hologram reconstruction scheme or in an optical system of spatial filtration. Solution of the first problem requires an adequate digital representation of the wave field transformations occurring in optical systems. For the second problem, optical media are required which can be used for recording synthesized holograms, together with techniques and devices for controlling their optical properties, such as the transmission or reflection factor, refraction factor, or optical thickness. This section is devoted to the presentation of approaches to these problems. Section V,A formulates a mathematical model which may be used as a basis for the synthesis of holograms for data visualization. Section V,B describes, with allowance for the performance of devices for hologram recording and reconstruction, the discrete representation of Fourier and Fresnel holograms. Methods for recording synthesized holograms in amplitude, phase, and binary
media are analyzed in Section V,C, where the existing hologram recording methods and their modifications are discussed and a universal interpretation of the various methods is given. In Section V,D the reconstruction of synthesized holograms in the optical Fourier scheme is considered, and the distortions of the reconstructed image arising at construction of the continuous hologram from its discrete representation are discussed. Finally, Section V,E describes the existing methods of data visualization by means of synthesized holograms.
A. Mathematical Model

Consider a mathematical model of hologram synthesis built around the scheme of visual observation of objects shown in Fig. 44. The observer’s position with respect to the observed object is defined by the observation surface where the observer’s eyes are situated, and the set of foreshortenings is defined by the object observation angle. In order that the observer may see the object at the given observation angle, it suffices to reproduce, by means of the hologram, the distribution of intensity and phase of the light wave scattered by the object over the observation surface. For the sake of simplicity, consideration will be given to monochromatic object illumination, which enables one to describe light-wave transformations in terms of the complex wave amplitude. Although the interaction between radiation and a body at reflection from the body’s surface is of an involved nature, the object characteristics defining its ability to reflect and scatter incident radiation may be described for our purposes by a radiation reflection factor with respect to intensity, B(x, y, z), or amplitude, b(x, y, z), which are functions of the object’s surface coordinates. The intensity of the reflected
FIG. 44. Scheme of visual observation of an object by means of its hologram.
wave I_o(x, y, z) and its complex amplitude A_o(x, y, z) at the point (x, y, z) are related to the intensity I(x, y, z) and amplitude of the incident wave as follows:

I_o(x, y, z) = B(x, y, z) I(x, y, z)                    (98)

A_o(x, y, z) = b(x, y, z) A(x, y, z)                    (99)

The reflection factor with respect to amplitude may be regarded as a complex function represented as

b(x, y, z) = |b(x, y, z)| exp[iβ(x, y, z)]                    (100)

Its modulus |b| and phase β show how the amplitude modulus |A| and the light-wave phase ω change after reflection by the body surface at the point (x, y, z):

|A_o(x, y, z)| = |A(x, y, z)| |b(x, y, z)|                    (101)

ω_o(x, y, z) = ω(x, y, z) + β(x, y, z)                    (102)

where

A_o(x, y, z) = |A_o(x, y, z)| exp[iω_o(x, y, z)]                    (103)

A(x, y, z) = |A(x, y, z)| exp[iω(x, y, z)]                    (104)

According to Eqs. (98)-(104), the intensity reflection factor may be determined through the amplitude reflection factor as

B = |b|² = bb*                    (105)
The relation between the complex amplitude Γ(ξ, η, ζ) of the light-wave field over an arbitrary observation surface defined at the coordinates (ξ, η, ζ) and the complex amplitude A_o over the object surface can be described by an integral

Γ(ξ, η, ζ) = ∫∫∫_{S(x,y,z)} A_o(x, y, z) T(x, y, z; ξ, η, ζ) dx dy dz                    (106)

whose kernel T(x, y, z; ξ, η, ζ) depends on the spatial disposition of the object and the observation surface, integration being performed over the object surface S. The inverse relation reconstructs the field on the object:

A_o(x, y, z) = ∫∫_{O(ξ,η,ζ)} Γ(ξ, η, ζ) T̃(x, y, z; ξ, η, ζ) dξ dη                    (107)

where T̃ is a kernel reciprocal to T, and integration is performed over the observation surface O. Thus, hologram synthesis lies in the computation of Γ(ξ, η, ζ) through b(x, y, z) and A(x, y, z), which are defined by the object description and the illumination conditions, and in recording the result in a physical medium in a form allowing interaction with radiation for visualization, i.e., reconstruction of A_o(x, y, z) according to Eq. (107).
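In discrete form, the kernel relations of Eqs. (106) and (107) amount to multiplying the vector of object-field samples by a propagation matrix and then by its reciprocal. The following minimal sketch (an illustration of this structure only, not the author's procedure) uses a unitary DFT matrix as a stand-in for the kernel T, so that the reciprocal kernel of Eq. (107) is simply the conjugate transpose:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64

# Samples of the complex field A_o reflected by the object (cf. Eq. 99);
# a random stand-in object.
a_o = rng.normal(size=n) + 1j * rng.normal(size=n)

# Stand-in propagation kernel T(x; xi): a unitary DFT matrix, so that the
# reciprocal kernel is the conjugate transpose.
k = np.arange(n)
T = np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

gamma = T @ a_o              # field over the observation surface, cf. Eq. (106)
a_rec = T.conj().T @ gamma   # reconstruction of the object field, cf. Eq. (107)

assert np.allclose(a_rec, a_o)
```

Any invertible (in practice, unitary) discretized kernel plays the same role; the point is only that recording Γ and reapplying the reciprocal kernel returns A_o.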
The computation of integrals like that of Eq. (106) is generally a very complicated problem, but it may be significantly simplified by taking into consideration the following natural limitations of visual observation:

(1) The size of the observer's eye pupil is much smaller than the distance to the observation surface.
(2) Areas of the observation surface approximately as large as the interpupillary distance may be regarded as flat.
(3) The relief depths of objects situated at distances convenient for the observer are usually small as compared with those distances.

These conditions, first of all, enable reduction of the 3-D problem to a 2-D one. To this end, one may break down the observation surface into areas approximated by planes and, using the laws of geometrical optics, replace the distributions of amplitude and phase of the wave over the object surface by those of the wave in a plane tangent to the object (or sufficiently close to it, so as to make diffraction negligible at recalculation of the wave's amplitude and phase) and parallel to the given plane area of the observation surface. Thus, the problem of hologram synthesis over the whole observation surface boils down to the synthesis of fragmentary holograms for the plane areas of this surface, the complete hologram being composed as a mosaic of the fragmentary ones. Instead of Eq. (106), one then obtains for the fragmentary hologram
Γ(ξ, η) = ∫∫_{S(x,y)} A_o(x, y) T_{z_0}(x, y; ξ, η) dx dy                    (108)

where A_o(x, y) = |A_o(x, y)| exp[iω_o(x, y)] is a complex function resulting from recalculation of the amplitude and phase of the field reflected by the object onto the plane (x, y) tangent to the object and parallel to the observation plane (ξ, η); z_0 is the distance between these planes; and T_{z_0}(x, y; ξ, η) is the transformation kernel. If the geometrical dimensions of the body are small as compared with the distance z_0 to the observation plane, this fact, together with the smallness condition on the observation surface fragment, enables the use of the Fresnel integral as an approximation of Eq. (108) (Born and Wolf, 1959; Mertz, 1965; Papoulis, 1968):

Γ_FR(ξ, η) = ∫∫_{−∞}^{∞} A_o(x, y) exp{iπ[(x − ξ)² + (y − η)²]/λz_0} dx dy                    (109)

where λ is the radiation wavelength.
The holograms synthesized by means of this relation will be referred to as synthesized Fresnel holograms. Further simplification is possible if

z_0 ≫ (x² + y²)_max/λ                    (110)

because then the integral of Eq. (109) becomes

Γ_FR(ξ, η) = exp[iπ(ξ² + η²)/λz_0] ∫∫_{−∞}^{∞} A_o(x, y) exp[−i2π(xξ + yη)/λz_0] dx dy                    (111)

which, to within an exponential phase factor, is the Fourier transform of the function A_o(x, y):

Γ_F(ξ, η) = ∫∫_{−∞}^{∞} A_o(x, y) exp[−i2π(xξ + yη)/λz_0] dx dy                    (112)

The holograms synthesized by means of the Fourier transformation will be referred to as synthesized Fourier holograms, or simply Fourier holograms. It will readily be seen that the Fresnel hologram is the Fourier hologram of the same object as observed and recorded through a lens. This lens, described by the factor exp[iπ(ξ² + η²)/λz_0] in Eq. (111), may be excluded from the synthesis because it has a single parameter, z_0, which is common to the entire hologram; it may be restored separately at hologram reconstruction. From the standpoint of object wavefront reconstruction, the Fresnel hologram differs from the Fourier one in that it possesses focusing properties and is capable of reproducing the finite distance to the object. This implies that if a flat wavefront from a coherent light source is used for reconstruction, a Fresnel hologram produces at a distance z_0 a focused image defined by A_o(x, y). A Fourier hologram reconstructs the object as if it were situated at infinity when a flat wavefront is used for reconstruction, or in the location of the light source when a spherical wavefront from a point source is used. Mathematically, object reconstruction from Fresnel and Fourier holograms is described by the inverse Fresnel and Fourier transformations, respectively. When holograms are observed visually, these transformations are performed by the optical system of the eye. Having passed from the spatial problem to the flat one, we have, strictly speaking, lost the possibility of taking exactly into account the effect of the object's depth relief on the wavefront at the place of observation. Even the Fresnel hologram involves only the distance from the object to the observation plane rather than the object relief depth. Nevertheless, there is still a possibility of synthesizing a wave field which under certain conditions reconstructs the object, and, consequently, the most important property of
holographic visualization, the naturalness of object observation, is preserved. As for relief reproduction, it may be achieved, as is shown in Section V,E, either by varying the foreshortenings at the synthesis of fragmentary holograms, perceived owing to stereoscopic vision, or by simulating the diffuse properties of the reflecting surface.
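The Fresnel/Fourier relation described above can be checked numerically: up to the hologram-plane phase factor, the Fresnel hologram is the Fourier hologram of the object multiplied by a quadratic ("lens") phase, and each is inverted by the corresponding inverse transform. A rough sketch with arbitrarily chosen illustrative parameters (wavelength, distance, and sampling step are not from the text):

```python
import numpy as np

n = 128
lam, z0 = 0.5e-6, 1.0       # illustrative wavelength [m] and distance [m]
dx = 10e-6                  # illustrative object sampling step [m]
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x)

rng = np.random.default_rng(1)
A_o = rng.random((n, n)) * np.exp(2j * np.pi * rng.random((n, n)))  # object field

# Fourier hologram, in the spirit of Eq. (112): a plain 2-D Fourier transform.
gamma_F = np.fft.fft2(A_o)

# Fresnel hologram, in the spirit of Eq. (111): Fourier hologram of A_o times
# the quadratic "lens" phase; the common hologram-plane phase factor is
# omitted, as the text allows.
chirp = np.exp(1j * np.pi * (X**2 + Y**2) / (lam * z0))
gamma_FR = np.fft.fft2(A_o * chirp)

# Reconstruction: inverse transform; for the Fresnel hologram the lens phase
# is removed afterwards (restored separately, as at optical reconstruction).
assert np.allclose(np.fft.ifft2(gamma_F), A_o)
assert np.allclose(np.fft.ifft2(gamma_FR) * chirp.conj(), A_o)
```

The only difference between the two holograms is the single-parameter chirp factor, which is why it can be dropped during synthesis and restored at reconstruction.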
B. Discrete Representation of Fourier and Fresnel Holograms

Equations (109) and (112) underlie the synthesis of Fresnel and Fourier holograms. Their particular realization in digital processors depends on the method of discrete description of the object and of the wave field over it, A_o(x, y), as well as on the discrete representation of the hologram itself. Matrices of samples taken over a rectangular raster, according to the sampling theorem, with steps (Δx, Δy; Δξ, Δη) along the coordinates (x, y) and (ξ, η) are the simplest and most natural digital representation of the object and the hologram. We denote them A_o(k, l) and Γ_F(r, s), respectively. The passage from A_o(k, l) to A_o(x, y), or from Γ_F(r, s) to Γ_F(ξ, η), is done by interpolation in analog hologram recording and reconstruction devices.
1. Discrete Representation of Fourier Holograms

Interpolation may be described in mathematical terms as a convolution of a discrete signal

Σ_k Σ_l A_o(k, l) δ(x − (k + u)Δx) δ(y − (l + v)Δy)                    (113)

with some interpolating function h(·, ·), where δ(·) is the Dirac delta function, and u and v are parameters defining the shift of the sampling raster with respect to the coordinate system (x, y). The number of samples depends on the object's dimensions (−X_max, X_max; −Y_max, Y_max) and on the sampling steps:

N = int(2X_max/Δx);  M = int(2Y_max/Δy)                    (114)
According to the sampling theorem, exact interpolation of A_o(x, y) from A_o(k, l) is possible if

Δx = λz_0/2ξ_max;  Δy = λz_0/2η_max                    (115)

where (−ξ_max, ξ_max; −η_max, η_max) is a rectangular domain beyond which the spatial Fourier spectrum of the function A_o(x, y) in the coordinates (ξ/λz_0, η/λz_0) is zero. In accordance with this representation, it suffices to synthesize the hologram of a discrete object Ã_o(x, y) given by the matrix of samples A_o(k, l); the original continuous object A_o(x, y) may then be recovered from Ã_o(x, y) by analog means at the stage of synthesized hologram reconstruction. By substituting Eqs. (113) and (115) into Eq. (112), one obtains that the Fourier hologram of the discrete object Ã_o(x, y) may be computed as the finite sum

Γ̃_F(ξ, η) = H(ξ, η) Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} A_o(k, l) exp{−i2π[(k + u)Δx ξ + (l + v)Δy η]/λz_0}                    (116)
where H(ξ, η) is the Fourier transform of the interpolating function h(x, y). Denote by Γ_F,d(ξ, η) the sum in Eq. (116). As the object Ã_o(x, y) has limited dimensions (±X_max; ±Y_max), the function Γ_F,d(ξ, η), according to the sampling theorem mentioned above, may be computed by interpolation of its samples Γ_F(r, s):

Γ_F,d(ξ, η) = Σ_r Σ_s Γ_F(r, s) γ(ξ − (r + p)Δξ, η − (s + q)Δη)                    (117)

taken along a rectangular raster with the steps

Δξ = λz_0/2X_max;  Δη = λz_0/2Y_max                    (118)

A possible shift of the sample raster with respect to the coordinate system (ξ, η) is taken into account by pΔξ and qΔη, respectively. The problem of Fourier hologram synthesis is thus reduced to the digital computation of the matrix of samples Γ_F(r, s) and to two analog procedures: interpolation at the stage of hologram recording in order to obtain a continuous hologram Γ_F,d(ξ, η), and interpolation of the continuous object A_o(x, y) at the stages of hologram recording [the function H(ξ, η)] and reconstruction. One may substitute Eqs. (118) and (115) into Eq. (117) and finally obtain the following formula for determining the elements of the matrix {Γ_F(r, s)} in terms of the matrix of numbers A_o(k, l):

Γ_F(r, s) = Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} A_o(k, l) exp{−i2π[(k + u)(r + p)/N + (l + v)(s + q)/M]}                    (119)

which is the formula of a two-dimensional shifted discrete Fourier transform
(SDFT) with parameters (u, v; p, q), to an accuracy of a normalizing factor 1/√(NM) (Yaroslavskii, 1979a,b, 1985). The SDFT is a generalization of the standard discrete Fourier transform, which is its special case at u = 0, v = 0, p = 0, and q = 0. The generalization consists in taking into consideration a possible shift of the sampling rasters of the signal and of its Fourier spectrum with respect to the coordinate systems of the signal and the spectrum. It is especially important to allow for this shift in digital holography because the synthesized digital hologram or spatial filter is to be included in an analog optical system for image reconstruction or spatial filtration, and such a system has its own natural coordinate system related to its optical axis. By assigning various values to the shift parameters p, q, u, and v, we define the position of the synthesized hologram and of the reconstructed image in this coordinate system. The fact that the shift parameters may assume arbitrary noninteger values gives the discrete transform some features that bring it close to the integral Fourier transform. They are discussed in more detail in the monograph of Yaroslavskii and Merzlyakov (1982).

2. Discrete Representation of Fresnel Holograms
One may rewrite Eq. (109) as

Γ_FR(ξ, η) exp[−iπ(ξ² + η²)/λz_0] = ∫∫_{−∞}^{∞} A_o(x, y) exp[iπ(x² + y²)/λz_0] exp[−i2π(xξ + yη)/λz_0] dx dy                    (120)

Similar to the case of the Fourier hologram, the function Γ_FR(ξ, η) exp[−iπ(ξ² + η²)/λz_0] may be reconstructed by interpolation of its samples:

Γ̃_FR(ξ, η) exp[−iπ(ξ² + η²)/λz_0] = Σ_r Σ_s Γ_FR(rΔξ, sΔη) exp{−iπ[(r + p)²Δξ² + (s + q)²Δη²]/λz_0} γ(ξ − (r + p)Δξ, η − (s + q)Δη)                    (121)

by means of an interpolating function γ(ξ, η), where Δξ and Δη are defined by Eq. (118), and p and q are, as in the case of Fourier holograms, parameters defining the shift of the hologram sample raster with respect to the coordinate system (ξ, η). It follows that

Γ̃_FR(ξ, η) = Γ̃(ξ, η) exp[iπ(ξ² + η²)/λz_0]                    (122)

also may be reconstructed, through correction of the hologram Γ̃(ξ, η) of Eq. (121), by illuminating it with the spherical wavefront

Γ_sp(ξ, η) = exp[iπ(ξ² + η²)/λz_0]                    (123)
Let us see now how to determine Γ_FR(rΔξ, sΔη). We obtain from Eq. (120)

Γ_FR(rΔξ, sΔη) exp{−iπ[(r + p)²Δξ² + (s + q)²Δη²]/λz_0} = ∫∫_{−∞}^{∞} A_o(x, y) exp[iπ(x² + y²)/λz_0] exp{−i2π[x(r + p)Δξ + y(s + q)Δη]/λz_0} dx dy                    (124)
Similar to the case of Fourier holograms, it is natural to assume that |A_o(x, y)| may be reconstructed through its samples |A_o(k, l)| by interpolating them with some function h(x, y), analogously to Eq. (113):

|Ã_o(x, y)| = Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} |A_o(k, l)| h(x − (k + u)Δx, y − (l + v)Δy)                    (125)
Then

Γ̃_FR(rΔξ, sΔη) exp{−iπ[(r + p)²Δξ² + (s + q)²Δη²]/λz_0}
= Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} |A_o(k, l)| ∫∫_{−∞}^{∞} h(x − (k + u)Δx, y − (l + v)Δy) exp{iπ(x² + y²)/λz_0} exp[iω_o(x, y)] exp{−i2π[x(r + p)Δξ + y(s + q)Δη]/λz_0} dx dy                    (126)

Having changed the variables, we may rewrite Eq. (126) as

Γ̃_FR(rΔξ, sΔη)
= Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} |A_o(k, l)| exp(iπ{[(k + u)Δx − (r + p)Δξ]² + [(l + v)Δy − (s + q)Δη]²}/λz_0)
× ∫∫_{−∞}^{∞} h(x, y) exp{−i2π[x(r + p)Δξ + y(s + q)Δη]/λz_0}
× exp(iπ{[x + (k + u)Δx]² − (k + u)²Δx² + [y + (l + v)Δy]² − (l + v)²Δy²}/λz_0)
× exp[iω_o(x + (k + u)Δx, y + (l + v)Δy)] dx dy                    (127)
The interpolating function h(x, y) usually differs appreciably from zero only within the rectangle (±Δx; ±Δy); with an appropriate choice of the
sampling intervals Δx and Δy one may, therefore, disregard the phase errors introduced by the latter two factors in Eq. (127). Thus,

Γ̃_FR(rΔξ, sΔη) = H(r, s) Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} A_o(k, l) exp(iπ{[(k + u)Δx − (r + p)Δξ]² + [(l + v)Δy − (s + q)Δη]²}/λz_0)                    (128)

where the masking function H(r, s) is equal to

H(r, s) ≅ ∫∫_{−∞}^{∞} h(x, y) exp{−i2π[x(r + p)Δξ + y(s + q)Δη]/λz_0} dx dy                    (129)

i.e., to the samples of the Fourier transform of the interpolating function h(x, y). The sum in Eq. (128) is the discrete Fresnel transform, to an accuracy of a factor 1/√(NM) (Yaroslavskii and Merzlyakov, 1982):

Γ_FR(r, s) = (1/√(NM)) Σ_{k=0}^{N−1} Σ_{l=0}^{M−1} A_o(k, l) exp(iπ{(kμ − r/μ + w_x)²/N + (lν − s/ν + w_y)²/M})                    (130)

where

μ² = NΔx²/λz_0;  ν² = MΔy²/λz_0;  w_x = uμ − p/μ;  w_y = vν − q/ν                    (131)
The synthesis of Fresnel holograms is thus reduced to the computation, by means of the discrete Fresnel transform, of the matrix {Γ_FR(r, s)} from the matrix of samples {A_o(k, l)} describing the complex field amplitude over the object, and to analog interpolation of the samples according to Eqs. (121)-(123). Interpolation at reconstruction of the object from the hologram is done by masking the hologram with the function of Eq. (129).

C. Methods and Means for Recording Synthesized Holograms

The samples of Fourier and Fresnel holograms obtained through Eqs. (119) and (130) are, so to speak, mathematical holograms, i.e., arrays of, generally,
complex numbers defining the phase and amplitude of the synthesized wavefront. In order to transform the mathematical hologram into a physical one capable of generating the required wavefront, these numbers must be transformed into parameters of optical media that modulate, respectively, the amplitude and phase of the light wave used for hologram reconstruction. The existing optical media may be classified into three categories: amplitude media, phase media, and combined (amplitude-phase) media. In amplitude media, the controlled parameter is the factor of transmission with respect to light intensity. This is the most common and readily available class, whose typical representatives are the standard silver-halide emulsions used in photography and optical holography. In phase media, the light intensity transmission is not controllable, but the optical thickness is, for example through variation of the refractive index, or of the physical thickness, or both. Such are photothermoplastic materials, photoresists, bleached photographic materials, media based on dichromated gelatin, photopolymers, etc. Combined media allow independent control of the light intensity transmission factor and of the optical thickness. Currently, these are photographic materials with two or more layers sensitive to radiation of different wavelengths, which enables control of the transparency of certain layers and of the optical thickness of others by independent exposure of each layer to its wavelength. Special digitally controlled hologram recording devices are required for controlling the optical parameters of these media according to the computed wave field. Today no such device exists, and various displays designed for the output of characters, graphs, and grey-scale pictures are used to this end. The distinctive feature of alphanumeric and graphical displays is that they can perform only binary, or two-level, modulation of the medium optical parameters.
Therefore, amplitude and phase media employed in these devices will be referred to as binary, by virtue of the assumption that their controlled optical parameters may assume only two values. The use of amplitude and phase media in the binary mode is ineffective in terms of information capacity because the possibility of writing information on them is defined only by their spatial degrees of freedom (resolution), whereas in amplitude and phase media, in principle, the degrees of freedom related to the transmission (reflection, refraction) factor may be used as well. The major advantages of the binary mode are the simplicity of media exposure, photochemical processing, and copying, and the possibility of using the most widespread alphanumeric and graphic displays. The disadvantages of the binary mode may be overcome by using grey-scale displays for writing on amplitude and phase media. For recording holograms on combined media, color displays are used.
Medium optical parameters are most commonly controlled in these devices by an intensity-modulated beam of light or electrons affecting separate elementary areas (resolution cells) and writing (successively, or simultaneously into several cells) appropriately coded samples of the mathematical hologram. The most important characteristics of recorders are their sampling step (the distances Δξ and Δη between neighboring separately exposed resolution cells) and the total number of exposed cells. The sampling step defines the angular dimensions of the reconstructed picture. In order to make the picture's angular dimensions about ten degrees or more with a reconstruction source wavelength of about 0.5 μm, Δξ and Δη should be 3 μm at most. The best of the existing recorders have a sampling step of about 1 to 10 μm (Yaroslavskii and Merzlyakov, 1980, 1982; Dallas, 1980) and a total number of cells from 10³ × 10³ to 10⁴ × 10⁴. At the early stages of digital holography, holograms were recorded by means of standard plotters and alphanumeric printers, and the hologram hardcopies were photographically reduced in order to achieve acceptable values of the sampling step. The existing methods for recording synthesized holograms on amplitude, phase, binary, and combined media that are oriented to the above class of recorders may be classified with respect to different features. Here, the representation of the complex numbers describing the samples of the mathematical hologram will be chosen as the classification key. For other possible classifications the reader is referred to the reviews by Lee (1978) and Dallas (1980). Complex numbers may be represented in two ways: exponentially or additively (Fig. 45). From the physical standpoint, the form A exp(iφ), where A and φ are, respectively, the modulus and the phase, seems the most natural representation of complex numbers. The combined media are most suitable for this representation. Chu et al.
(1973) describe the use of three-layer color photographic films as such a medium (see also Dallas, 1980). Lesem et al. (1969) suggested disregarding the amplitude data and recording only the hologram sample phase, which enables the use of purely phase media. Although such holograms (called "kinoforms") reconstruct the wave field with some distortions (some information about them may be found in Yaroslavskii and Merzlyakov, 1980, 1982), they are advantageous in terms of energy effectiveness because almost the total energy of the reconstruction beam is transformed into the energy of the reconstructed wave field without being absorbed in the hologram. Moreover, these distortions may be reduced to some degree by an appropriate choice of the diffuse component of the wave-field phase on the object. The symmetrization method proposed by Yaroslavskii (1972a) is dual in a sense to that of the kinoform (see also Jaroslavski and Merzlyakov, 1979; Yaroslavskii and Merzlyakov, 1980). The method consists in symmetrizing the object prior to hologram synthesis so that its Fourier hologram
FIG. 45. Classification of methods for recording synthesized holograms (diagram). The methods covered are: the multiphase method (Yaroslavsky, Merzlyakov, 1982); the double-phase method (Brown, Lohmann, 1966; Haskell, Culver, 1972; Hsueh, Sawchuk, 1978; Shmarev, 1976); coding methods based on explicit introduction of a space carrier (Huang, Prasada, 1966; Burch, 1967; Kozma, Kelly, 1967; Lee, 1978; Kirk, Jones, 1971); representation by 2-D simplex (Burckhardt, 1970; Chavel, Hugonin, 1976); representation by orthogonal and biorthogonal components (Lee, 1970; Yaroslavsky, Merzlyakov, 1979, 1980); binary holograms (Brown, Lohmann, 1966, 1969; Lohmann, Paris, 1967; Haskell, Culver, 1972); the symmetrization method (Yaroslavsky, 1972; Yaroslavsky, Merzlyakov, 1979, 1980); the kinoform (Lesem et al., 1969); and referenceless on-axis complex holograms (Chu et al., 1973).
contains only real samples and, thus, may be recorded on a purely amplitude medium. As real numbers may be both positive and negative, such holograms should be recorded in the amplitude medium with a constant positive bias making all the recorded values positive. The philosophy of this method is based on the well-known property of the integral Fourier transform which may be formally written, in the notation of Section V,A, as follows: if

A_o(x, y) = A*_o(−x, −y)                    (132)

then Γ_F(ξ, η) = Γ*_F(ξ, η), i.e., the hologram is purely real; here * means the complex conjugate. For the SDFT as a discrete representation of the integral Fourier transform, a similar property requires object symmetrization through a rule depending on the shift parameters u, v, p, and q (Yaroslavskii, 1979a, 1985; Jaroslavski, 1980a; Yaroslavskii and Merzlyakov, 1982). For example, with integer 2u and 2v and p = q = 0, this rule becomes

A_o(k, l) = { A_o(k, l),             0 ≤ k ≤ N − 1,   0 ≤ l ≤ M − 1
            { A*_o(2N − 1 − k, l),   N ≤ k ≤ 2N − 1,  0 ≤ l ≤ M − 1                    (133)
which implies symmetrization by object duplication. In doing so, the number of samples of the object and, correspondingly, of its Fourier hologram is twice that of the original object. It is this double redundancy that enables one to do without recording the phase component. Symmetrization by quadruplication is possible as well; it consists in symmetrical complementation of the object according to the rule of Eq. (133) with respect to both indices k and l. The hologram redundancy then becomes equal to four. The holograms of symmetrized objects are also symmetrical and reconstruct symmetrized objects (duplicated or quadruplicated, depending on the particular symmetrization) (Fig. 46). Symmetrization exemplifies optimal coding of the information about an object necessary for matching the hologram with the properties of the recording medium. Notably, duplication and quadruplication do not imply a corresponding increase in computer time for execution of the SDFT at hologram calculation because, for computation of the SDFT, one may use combined algorithms that enhance computation speed by exploiting signal redundancy (Yaroslavskii and Merzlyakov, 1982; Yaroslavskii, 1979a, 1985). Historically, one of the first methods for recording synthesized holograms was that of binary holograms proposed by A. Lohmann and his collaborators (Brown and Lohmann, 1966, 1969; Lohmann and Paris, 1966). In this method, an elementary cell of a binary medium is allocated for storing the amplitude and phase of each sample of a mathematical hologram, the modulus of the
FIG. 46. (a) Reconstructed picture under duplicated symmetrization; (b) reconstructed picture under quadruplicated symmetrization.
complex number being represented by the size of the cell region (aperture) completely passing (or reflecting) light, and the phase by the aperture's position within the cell. All the cells corresponding to mathematical hologram samples are arranged over a regular (usually rectangular) raster. A shift of the aperture by Δξ in a given cell with respect to its raster node corresponds to a phase detour for this cell equal to (2πΔξ cos θ_0)/λ for hologram reconstruction at an angle θ_0 to the system's optical axis perpendicular to the hologram plane (Fig. 47). The use of the spatial shift of the aperture for representation of the complex number phase was named the "detour phase" method (e.g., see Dallas, 1980). As was already noted, only spatial degrees of freedom are used for recording in binary media; therefore, the number of binary medium elements should exceed the number of hologram samples by a factor equal to the product of the numbers of amplitude and phase quantization levels, which may run into

FIG. 47. Phase coding by spatial shift of the aperture (detour phase method): (1) direction of the optical axis of the hologram reconstruction system; (2) hologram resolution cell; (3) nodes of the regular raster; (4) difference of ray path resulting in the phase detour φ = 2π(Δξ/λ) cos θ; (5) hologram plane; (6) direction of picture reconstruction.
several tens or even hundreds. The low effectiveness of using the degrees of freedom of the hologram carrier is the major drawback of the binary hologram method. But its merits, such as the simpler technology of recording, photochemical processing, and copying of synthesized holograms, and the possibility of using widespread computer plotters, account for its popularity. Several modifications of the method are known today, oriented to different plotters (e.g., see Haskell and Culver, 1972) and to different ways of taking into account the boundary effects occurring when the aperture crosses the elementary cell boundaries. For further details the reader is referred to Dallas (1980).

With an additive representation, the complex number (regarded as a vector in the complex plane) is a sum of several components. For recording in amplitude media, these components should have a standard direction (phase angle) and a controllable length (amplitude). The simplest case is the representation of a vector Γ by its orthogonal components, say, its real Γ_re and imaginary Γ_im parts:

Γ = Γ_re e_re + Γ_im e_im                    (134)

where e_re and e_im are orthogonal unit vectors. When recording a hologram, the phase angle between the orthogonal components may be coded by the detour phase method, Γ_re and Γ_im being recorded into hologram resolution cells neighboring in the raster rows (see Fig. 48a) (Yaroslavskii and Merzlyakov, 1980). In doing so, the reconstructed image will be observed at an angle θ_ξ to the axis ξ defined as follows:

Δξ cos θ_ξ = λ/4                    (135)

where λ is the light wavelength used for hologram reconstruction. For recording negative values of Γ_re and Γ_im, a constant positive bias may be added to the recorded values. Since two resolution cells are used here for recording one hologram sample, such a hologram has double redundancy, as in symmetrization with duplication. With such coding and recording of a hologram, one must take into account that the optical path differences λ/2 and 3λ/4 will correspond to the next pair of hologram resolution cells (see Fig. 48a); that is, the values of Γ_re and Γ_im for each odd sample of the mathematical hologram should be recorded with opposite signs. This coding technique may be described formally as follows. Let (r, s) be indices characterizing the number of the mathematical hologram sample Γ(r, s), r = 0, 1, ..., N − 1, s = 0, 1, ..., M − 1; let m, n be indices of the number of the medium resolution cell; and let Γ̃(m, n) be the hologram coded for recording. Then the coded hologram is written as

Γ̃(m, n) = ½(−1)^r [(−i)^{m_0} Γ(r, s) + (i)^{m_0} Γ*(r, s)] + c                    (136)
FIG. 48. Hologram coding by decomposition of complex numbers in (a) orthogonal and (b), (c) biorthogonal bases. Resolution cells corresponding to one hologram sample are outlined with a bold line.
where m = 2r + m_0, m_0 = 0, 1; n = s; and c is a positive bias constant. Indeed,

Γ̃(m, n) = (−1)^r Re[Γ(r, s)] + c,   m_0 = 0
Γ̃(m, n) = (−1)^r Im[Γ(r, s)] + c,   m_0 = 1
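The coding rule of Eq. (136) is easy to exercise numerically: a matrix of complex hologram samples is mapped to a 2N × M array of nonnegative real cell values and then decoded back. A sketch under the conventions above (m = 2r + m_0, n = s), with the bias c chosen here, purely for illustration, as the largest sample modulus:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 8, 8
gamma = rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))  # hologram samples

c = np.abs(gamma).max()  # bias large enough to make all recorded values nonnegative

# Eq. (136): cell (m, n) with m = 2r + m0 stores (-1)^r Re(Gamma) + c for m0 = 0
# and (-1)^r Im(Gamma) + c for m0 = 1.
coded = np.empty((2 * N, M))
sign = (-1.0) ** np.arange(N)[:, None]
coded[0::2, :] = sign * gamma.real + c
coded[1::2, :] = sign * gamma.imag + c

assert (coded >= 0).all()

# Decoding inverts the sign alternation and removes the bias.
decoded = sign * (coded[0::2, :] - c) + 1j * sign * (coded[1::2, :] - c)
assert np.allclose(decoded, gamma)
```

The factor of two in the number of cells is the double redundancy mentioned in the text; the sign alternation accounts for the extra λ/2 path detour accumulated by each successive sample pair.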
In order to avoid the constant biasing of Γ_re and Γ_im at their recording, and the resulting loss of energy effectiveness, Lee (1970) proposed allocating four medium resolution cells, neighboring in the raster rows, for recording one hologram sample (Fig. 48b). With a reconstruction angle as defined by Eq. (135), the phase detours 0, π/2, π, and 3π/2 correspond to them. Therefore, these cells should be written in the following order: (Γ_re + |Γ_re|)/2; (Γ_im + |Γ_im|)/2; (|Γ_re| − Γ_re)/2; (|Γ_im| − Γ_im)/2. One may see that in this case all the recorded
values are nonnegative, and the method represents a vector in the complex plane in a biorthogonal basis (e_re; e_im; ē_re; ē_im) (Fig. 49a):

Γ = ½(Γ_re + |Γ_re|) e_re + ½(Γ_im + |Γ_im|) e_im + ½(|Γ_re| − Γ_re) ē_re + ½(|Γ_im| − Γ_im) ē_im                    (137)

In the notation of Eq. (136), the Lee method may be described as follows:

Γ̃(m, n) = ¼{|(−i)^{m_02} Γ(r, s) + (i)^{m_02} Γ*(r, s)| + (−1)^{m_01} [(−i)^{m_02} Γ(r, s) + (i)^{m_02} Γ*(r, s)]}                    (138)

where m = 4r + 2m_01 + m_02; m_01 = 0, 1; m_02 = 0, 1; n = s; and the vertical bars stand for taking the modulus of a number. At coding by the Lee method it is required, in order to preserve the proportions of the reconstructed picture, that the size of the medium resolution cell in one direction be four times less than that in the perpendicular one. Yaroslavskii (1972a) proposed to allocate, for each sample of the mathematical hologram coded by this method, two resolution cells in each of two
FIG. 49. Additive representation of complex numbers: (a) representation with a biorthogonal basis; (b) representation with a 2-D simplex; (c) representation by a sum of two equal-length vectors.
neighboring raster rows (Fig. 48c), i.e., to write according to the following relation:

Γ̃(m, n) = ¼{|(i)^{m_0} [(−1)^{m_0} Γ(r, s) + Γ*(r, s)]| + (−1)^{n_0} (i)^{m_0} [(−1)^{m_0} Γ(r, s) + Γ*(r, s)]}                    (139)

where m = 2r + m_0; n = 2s + n_0; m_0, n_0 = 0, 1 (Yaroslavskii and Merzlyakov, 1980). In doing so, the image is reconstructed along a direction making angles θ_ξ and θ_η with the axes ξ and η defined by

Δξ cos θ_ξ = λ/4;  Δη cos θ_η = λ/2                    (140)
Vector representation in the biorthogonal basis of Eq. (137) is redundant, which manifests itself in the fact that two of the four components are always zero. This redundancy may be reduced if complex numbers are decomposed with respect to a two-dimensional simplex (e_0, e_120, e_240) (Fig. 49b):

Γ = Γ_0 e_0 + Γ_1 e_120 + Γ_−1 e_240                    (141)

Similar to the biorthogonal basis, this basis is not linearly independent because

e_0 + e_120 + e_240 = 0                    (142)

and its redundancy may be exploited in order to ensure that the components be nonnegative. There exist two versions of hologram coding by the two-dimensional simplex, proposed by Burckhardt (1970) and by Chavel and Hugonin (1976). Burckhardt's idea is to represent an arbitrary vector in the plane as the sum of two components directed along those two of the three vectors e_0, e_120, and e_240 which confine the third part of the plane where this vector is located. This implies that of the three numbers Γ_0, Γ_1, and Γ_−1 defining the given complex number Γ through Eq. (141), two are always nonnegative and are the projections of the vector Γ on the corresponding basis vectors, and the third one is zero. The following relations may be obtained for Γ_0, Γ_1, and Γ_−1 from this condition:
r,, rl,r-
1
-
-[(1
-2J3
1 ----[(I l- - 2 g
+ sgnA)(IBI
-
B ) + (1 - sgnA)(ICI + C)]
+ sgnC)(IA/ - A ) + (1
-
sgnC)(IBI
+ B)]
(143)
where

A = (Γ + Γ*)/2,  B = [Γ exp(i2π/3) + Γ* exp(−i2π/3)]/2,  C = [Γ exp(−i2π/3) + Γ* exp(i2π/3)]/2   (144)

and sgn A, sgn B, sgn C are the signs of A, B, and C, respectively. Chavel and Hugonin (1976) noted that, by virtue of Eq. (142), the addition of an arbitrary constant V to Γ₀, Γ₁, and Γ₋₁ does not change Eq. (141). Therefore, they suggested determining Γ₀, Γ₁, and Γ₋₁ in the form

Γₖ = Γₖ′ + V,  k = −1, 0, 1   (145)

by imposing on the Γₖ′ the constraint

Γ₋₁′ + Γ₀′ + Γ₁′ = 0

and choosing V so as to make the Γₖ nonnegative. In this case, Γₖ′ is defined as

Γ₀′ = (Γ + Γ*)/3 = 2A/3
Γ₁′ = [Γ exp(−i2π/3) + Γ* exp(i2π/3)]/3 = 2C/3   (146)
Γ₋₁′ = [Γ exp(i2π/3) + Γ* exp(−i2π/3)]/3 = 2B/3
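The Chavel-Hugonin decomposition of Eqs. (145)-(146) is easy to check numerically. The following is a minimal sketch (our own illustration, not the authors' code), assuming numpy, with the bias V simply chosen as the negative of the smallest unbiased component so that all components become nonnegative:

```python
import numpy as np

def simplex_decompose(gamma):
    """Decompose a complex number into nonnegative components along the
    2D simplex directions e_0, e_120, e_240 (Chavel-Hugonin version)."""
    # Unbiased components, Eq. (146): Gamma'_k = (2/3) Re[gamma exp(-i 2 pi k / 3)]
    ks = np.array([0, 1, -1])
    comps = 2.0 * np.real(gamma * np.exp(-1j * 2 * np.pi * ks / 3)) / 3.0
    # Their sum is zero, so adding a common bias V leaves Eq. (141) unchanged
    V = -comps.min()
    return comps + V

# Check: the biased components still reconstruct gamma, because
# e_0 + e_120 + e_240 = 0 (Eq. (142)) makes the bias invisible in Eq. (141)
gamma = 0.7 - 0.4j
comps = simplex_decompose(gamma)
e = np.exp(1j * 2 * np.pi * np.array([0, 1, -1]) / 3)
assert abs((comps * e).sum() - gamma) < 1e-12
assert comps.min() >= 0
```

The choice V = −min Γₖ′ is only one of the bias choices mentioned in the text; any larger V also yields nonnegative components.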
The bias V, evidently, may differ for different hologram samples. Understandably, the choice of the bias value influences the quality of the reconstructed image. Several versions of such a choice providing high image quality were suggested by Chavel and Hugonin (1976). When writing a hologram, one may code the phase angles corresponding to the vectors e₀, e₁₂₀, and e₂₄₀ by means of the above-mentioned detour phase method, writing the components Γ₀, Γ₁, and Γ₋₁ into neighboring hologram resolution cells in the raster. If the three components of the simplex-decomposed vector are allocated, according to Burckhardt (1970), in three row-neighboring raster resolution cells enumerated with respect to m₀ from 0 to 2 (see Fig. 50a), the image is reconstructed at an angle

θ_ξ = arccos(λ/3Δξ)   (147)

to the axis ξ coinciding with the direction of the hologram raster rows. One may readily demonstrate by using Eq. (143) that the following formula, relating the values written in cells (m,n) of the medium with samples of the
FIG. 50. Hologram coding by decomposition of complex numbers with respect to the simplex: (a) the Burckhardt method; (b) a version of the method with allocation of cells representing vectors e₀, e₁₂₀, and e₂₄₀ along the orthogonal raster; (c) a version with allocation along the hexagonal raster. Triples of medium resolution cells used for coding one hologram sample are outlined with a bold line.
mathematical hologram Γ(r,s), corresponds to the Burckhardt coding method

F(m,n) = (1/2√3){(1 + sgn Re[Γ(r,s) exp(−i2πm₀/3)]) × (|Re{Γ(r,s) exp[−i2π(m₀ − 1)/3]}| − Re{Γ(r,s) exp[−i2π(m₀ − 1)/3]}) + (1 − sgn Re[Γ(r,s) exp(−i2πm₀/3)]) × (|Re{Γ(r,s) exp[−i2π(m₀ + 1)/3]}| + Re{Γ(r,s) exp[−i2π(m₀ + 1)/3]})}   (148)

where Re(z) is the real part of z; m = 3r + m₀; m₀ = 0, 1, 2; and n = s. A hologram coded by the Chavel-Hugonin method (1976) may be written in a similar manner.
When recording simplex components along raster rows, three times more hologram resolution cells are used along the rows than along the columns; therefore, the scales of the reconstructed image along the two coordinate axes differ by a factor of three. In order to equalize the scales, each hologram row may be repeated three times, but this implies excessive use of resolution cells. The redundancy may be decreased by means of two-dimensional packing of the cells representing the vectors e₀, e₁₂₀, and e₂₄₀. Figure 50b,c shows two packings (over orthogonal and hexagonal rasters) which provide scale ratios of 2:1.5 and 3√3:4, respectively, instead of 1:3. The corresponding formulas are omitted because of their awkwardness. The above methods may evidently be used for hologram recording in binary media as well as in amplitude ones. In this case, the projections of the complex number on the basis vectors are represented by varying the size of the transparent aperture in each of the appropriate resolution cells. A specifically binary method implementing the idea of additive representation of a complex number is that of "pulse code modulation" (PCM) (Goodman et al., 1974; Haskell, 1973; Kuzmenko, 1977), where each mathematical hologram sample is represented by a group of, say, K × L neighboring resolution cells (K along raster rows, L along columns), each cell being either completely transparent (or reflecting) or opaque. The total number of states of such a group is, obviously, 2^KL; therefore, 2^KL different vectors may be coded by each group, and all of them may be computed in advance. It then remains to find, for each sample of the mathematical hologram, the nearest of all 2^KL possible vectors and to write the corresponding combination of transparent and opaque cells. The search may be done either by exhaustive selection, which might require up to 2^KL steps, or by "weighting," which reduces this estimate to KL steps (Kuzmenko, 1977).
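The exhaustive version of the PCM search can be sketched as follows. This is an illustrative toy, not Kuzmenko's faster "weighting" procedure: the array `cell_phasors` is a hypothetical model of the complex contribution each transparent cell makes to the group's total, and the search simply tries all 2^(K·L) on/off patterns:

```python
import itertools
import numpy as np

def pcm_encode(target, cell_phasors):
    """Exhaustive PCM search: pick the on/off pattern of binary cells whose
    summed phasors best approximate the target complex hologram sample."""
    best_bits, best_err = None, np.inf
    # 2**(K*L) candidate code vectors, one per transparency pattern
    for bits in itertools.product((0, 1), repeat=len(cell_phasors)):
        value = sum(b * c for b, c in zip(bits, cell_phasors))
        err = abs(value - target)
        if err < best_err:
            best_bits, best_err = bits, err
    return best_bits, best_err

# Example: a group of four cells contributing unit phasors at 0, 90, 180, 270 degrees
phasors = [np.exp(1j * np.pi * k / 2) for k in range(4)]
bits, err = pcm_encode(0.9 + 0.8j, phasors)
# The best pattern turns on the 0- and 90-degree cells: 1 + i
```

With K·L cells the loop body runs 2^(K·L) times, which is exactly the cost estimate quoted in the text for selection by exhaustion.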
The PCM method makes much more effective use of the medium degrees of freedom than the Lohmann method or similar binary methods based on exponential representation of complex numbers. Indeed, with the same number KL of medium resolution cells allocated for representing one mathematical hologram sample, the Lohmann method enables coding of only KL different vectors (e.g., vectors having K and L different values of amplitude and phase, respectively) rather than 2^KL, as with PCM coding. Let us now discuss methods for recording in phase media that rely upon additive representation of complex numbers. In this case, the component vectors should have standard length and controllable direction (phase angle). The simplest version is a representation of vectors as a sum of two such components (Brown and Lohmann, 1966)

Γ = Γ₀ exp(iφ₁) + Γ₀ exp(iφ₂)   (149)
As may be easily seen from Fig. 49c, the phase angles φ₁ and φ₂ of the component vectors are defined by

φ₁ = φ − arccos(|Γ|/2Γ₀)
φ₂ = φ + arccos(|Γ|/2Γ₀)   (150)

where |Γ| is the modulus, and φ the phase angle, of the coded hologram sample. Hsueh and Sawchuk (1978) named this method two-phase coding. It may be used for hologram recording both in phase and in binary media. When recording in phase media, two neighboring medium resolution cells may be allocated for representing the two component vectors (see Fig. 48a), which, formally, is as follows
Γ(m,n) = Γ₀ exp{i[φ(r,s) − (−1)^m₀ arccos(|Γ(r,s)|/2Γ₀)]}   (151)

where m = 2r + m₀; m₀ = 0, 1; n = s.
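Eq. (150) is easy to verify numerically. A minimal sketch assuming numpy (the function name `two_phase` is ours, not from the text); it requires |Γ| ≤ 2Γ₀:

```python
import numpy as np

def two_phase(gamma, gamma0=1.0):
    """Split a complex sample into two constant-amplitude phasors, Eq. (150):
    phi_{1,2} = phi -/+ arccos(|gamma| / (2 gamma0))."""
    phi = np.angle(gamma)
    delta = np.arccos(abs(gamma) / (2 * gamma0))
    return phi - delta, phi + delta

gamma = 1.2 * np.exp(1j * 0.5)
p1, p2 = two_phase(gamma)
# The two unit-amplitude components sum back to the coded sample:
# exp(i(phi-d)) + exp(i(phi+d)) = 2 cos(d) exp(i phi) = |gamma| exp(i phi)
assert abs(np.exp(1j * p1) + np.exp(1j * p2) - gamma) < 1e-12
```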
In this case, the image is reconstructed in a direction normal to the hologram plane, because the optical path difference of rays passing along this direction through neighboring hologram resolution cells is zero. This holds, however, only for the central area of the image, through which the optical axis of the reconstruction system passes. In the peripheral areas of the image, some phase shift occurs between the rays, leading to distortions of the peripheral image, which are discussed in more detail in the section to follow. Shmaryov (1976) proposed writing separately two holograms of the two component vectors of Eq. (149) and summing the images reconstructed from them in a special optical setup. Hsueh and Sawchuk (1978) considered two versions of the two-phase method oriented to recording in binary media, with the component vector phases coded by the detour phase method. In the first version, two elementary cells of the binary medium are allocated to the two component vectors, their phases being coded by a shift of the transparent (or completely reflecting) aperture along the direction perpendicular to the line connecting the cell centers (Fig. 51a). This technique exhibits the same distortions due to the mutual spatial shift of the elementary cells as the above recording in phase media. In order to reduce the distortions, Hsueh and Sawchuk (1978) proposed decomposing each elementary cell into subcells alternating as shown in Fig. 51b. Obviously, the relative shift of the elementary cells is now Δξ/n rather than Δξ, and the peripheral image distortions may be significantly reduced. Alternation of subcells is an effective means of decreasing distortion but, like any binary recording, it requires more degrees of freedom of the medium and hologram recorder.
FIG. 51. Two-phase coding for hologram recording on a binary medium: (a) two separate cells; (b) decomposition of cells into alternating subcells.
The two-phase method is evidently generalized to the case of multiphase coding by vector decomposition into K equal-length components (Merzlyakov and Yaroslavskii, 1982)

Γ = Σ_{k=1}^{K} Γ₀ exp(iφₖ)   (152)

As Γ is a complex number, Eq. (152) represents two equations for the K unknown phases φₖ. They have a unique solution, Eq. (150), only for K = 2. For K > 2, the φₖ may be chosen in a rather arbitrary manner. For example, for odd K it is more convenient to choose the φₖ so that the phase angles form an arithmetic progression

φₖ₊₁ − φₖ = φₖ − φₖ₋₁ = θ   (153)
In this case, we obtain the following equation for the increment θ

sin(Kθ/2)/sin(θ/2) = |Γ|/Γ₀   (154)

and the phase angles φₖ

φₖ = φ + [k − (K + 1)/2]θ,  k = 1, 2, ..., K   (155)

For odd K, Eq. (154) reduces to an algebraic equation of power (K − 1)/2 with respect to sin²(θ/2). Thus, for K = 3, we obtain

θ = 2 arcsin √[(3 − |Γ|/Γ₀)/4]   (156)
For even K, it is more expedient to separate the component vectors into two groups having the same phase angles φ₁ or φ₂, defined by analogy with Eq. (150) as

φ₁ = φ − arccos(|Γ|/KΓ₀)
φ₂ = φ + arccos(|Γ|/KΓ₀)   (157)
If K > 2 is chosen, the dynamic range of possible hologram values may be extended, because the maximal reproducible amplitude is KΓ₀. The most interesting of the K > 2 cases are K = 3 and K = 4, because the two-dimensional spatial degrees of freedom of the medium and hologram recorder may then be used more effectively through allocation of the component vectors according to Figs. 50b,c and 48c. The above hologram coding methods relying on additive representation of the complex number have one more important property in common. All of them make use of some form of implicit introduction of a spatial carrier and of a nonlinear transformation of the signal with the spatial carrier, similar to the classical method of recording optical holograms (Leith and Upatnieks, 1961). It is easy to check that Eqs. (136), (138), (139), (148), and (151) may, for example, be rewritten in the following equivalent forms containing explicit hologram samples multiplied by samples of the spatial carrier with respect to one or both coordinates
F(m,n) = Re[Γ(r,s) exp(−i2πm/2)]/4 + c,  m = 2r + m₀; m₀ = 0, 1; n = s   (136′)

F(m,n) = (1/2) rctf{Re[Γ(r,s) exp(−i2πm/2)]},  m = 4r + 2m₀₁ + m₀₂; m₀₁, m₀₂ = 0, 1; n = s   (138′)

Γ(m,n) = rctf(Re{Γ(r,s) exp[−i2π(m + 2n)/4]}),  m = 2r + m₀; n = 2s + n₀; m₀, n₀ = 0, 1   (139′)
F(m,n) = (1/√3) Σ_{p=0}^{1} hlim(Re{Γ(r,s) exp[−i2π(m + p)/3]}) × rctf(Re{Γ(r,s) exp[−i2π(m + p + 1)/3]}),  m = 3r + m₀; m₀ = 0, 1, 2; n = s   (148′)
F(m,n) = Γ₀ exp{i[φ(r,s) − cos(2πm/2) arccos(|Γ(r,s)|/2Γ₀)]},  m = 2r + m₀; m₀ = 0, 1; n = s   (151′)
where rctf(z) is the "rectifier" function

rctf(z) = { z, z ≥ 0; 0, z < 0 }   (158)

and hlim(z) is the "hard-limiter" function

hlim(z) = { 1, z ≥ 0; 0, z < 0 }   (159)
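The equivalent forms above can be illustrated numerically. The following sketch (assuming numpy; the function name and the default bias choice are ours) codes a complex hologram matrix in the spirit of Eq. (136′): zero-order (repeat) interpolation along the rows, a carrier of period two samples, and a constant bias c:

```python
import numpy as np

def orthogonal_code(holo, c=None):
    """Code a complex mathematical hologram as a nonnegative real transparency
    following Eq. (136'): each sample is repeated once along the rows
    (zero-order interpolation) and multiplied by a spatial carrier of period
    two samples; c defaults to the smallest bias making the result nonnegative."""
    rows, cols = holo.shape
    m = np.arange(2 * rows)[:, None]              # m = 2r + m0
    repeated = np.repeat(holo, 2, axis=0)         # zero-order interpolation
    carrier = np.exp(-1j * 2 * np.pi * m / 2)     # (-1)^m carrier
    f = np.real(repeated * carrier) / 4
    if c is None:
        c = -f.min()
    return f + c

holo = np.exp(1j * np.random.default_rng(0).uniform(0, 2 * np.pi, (4, 4)))
F = orthogonal_code(holo)
assert F.shape == (8, 4) and F.min() >= 0
```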
As one may see from these expressions, the spatial carrier is sampled at least twice as densely as the hologram, i.e., at least two samples of the spatial carrier correspond to one hologram sample, so that the amplitude and phase of the hologram samples can be reconstructed from the modulated spatial-carrier signal. This redundancy implies that, in order to modulate the spatial carrier by a hologram, one or more intermediate samples between the basic ones are required. They may be determined by some method of hologram sample interpolation. The simplest interpolation is repetition of samples, and precisely this interpolation is implied in the above methods of recording. For instance, according to Eq. (136) each hologram sample is repeated two times for two samples of the spatial carrier, in Eq. (138) it is repeated four times for four samples, etc. Of course, such a "zero-order" interpolation, characteristic of all the codings based on the detour phase method, provides a very rough approximation of the intermediate samples. Section V,D illustrates the distortions of the reconstructed image caused by it, which manifest themselves in the superposition of noise images. For partial correction of these distortions, mostly for binary coding, several iterative algorithms of hologram calculation have been proposed. All of them are built around iterative determination of the phase of the hologram sample situated in that place of the elementary cell where, according to the detour phase method, the transparent aperture should be situated. In order to determine exact values of the desired intermediate samples, it is necessary at hologram synthesis to perform the SDFT(u, v; p, q) as many times as there are additional samples required per basic sample, varying
appropriately the shift parameters p and q (Yaroslavskii and Merzlyakov, 1982). It should also be noted that the symmetrization method may be regarded as an analog of the method of Eq. (136) with ideal interpolation of the intermediate samples. In the symmetrization method, such interpolation is done automatically and, as will be seen in Section V,D, the restored image is not distorted by noise images. Along with the methods of implicit introduction of the spatial carrier, there are methods based on its explicit introduction. The majority of them were proposed already in the first publications on digital holography, as simulations of optical hologram recording, drawing on the analogy between holograms and interferograms. Of the methods oriented to amplitude media, let us mention that of Burch (1967), who proposed to record a hologram as
F(m,n) = {1 + |Γ(m,n)| cos[φ(m,n) + 2πm/a]}/2   (160)
and that of Huang and Prasada (1966), who suggested making the bias equal to |Γ(m,n)| in order to enhance the contrast of the useful component of the hologram [the second term of the sum in Eq. (160)]
F(m,n) = |Γ(m,n)|{1 + cos[φ(m,n) + 2πm/a]}/2   (161)
Of binary-media-oriented methods, one may mention that described in the overview of Lee (1978)

Γ(m,n) = hlim{cos[φ(m,n) + 2πm/a] − cos[arcsin(|Γ(m,n)|/Γ₀)]}   (162)
In Eqs. (161) and (162), a is the period of the spatial carrier. Of phase-media-oriented methods, we may mention that of Kirk and Jones (1971), who proposed recording the function

F(m,n) = Γ₀ exp{i[φ(m,n) − h(m,n) cos(2πm/a)]}   (163)
where h(m,n) depends in some way on |Γ(m,n)| and on the number of the diffraction order in which the reconstructed picture should be obtained. In a sense, this method is equivalent to the multiphase coding methods, and at a = 2 it coincides with the two-phase coding of Eq. (151)

h(m,n) = arccos(|Γ(r,s)|/2Γ₀),  m = 2r + m₀; m₀ = 0, 1; n = s   (164)
D. Reconstruction of Synthesized Holograms

Reconstruction and observation of synthesized holograms may be described by the schemes shown in Fig. 52a,b. The synthesized discrete hologram, computed via discrete representations of field transformations, performs in
FIG. 52. Schemes of synthesized hologram reconstruction for (a) documenting and (b) visual observation of reconstructed images (1, light source; 2, collimator; 3, 7, hologram; 4, lens; 5, cassette with photographic film; 6, point light source; 8, observer).
them the role of an analog physical element, and the field generated by the hologram is subjected to continuous transformations. Therefore, sampling and the transformation of the digital signal into a physical hologram (i.e., the method of hologram recording) directly affect the final reconstruction results. Below, the reconstruction of synthesized Fourier holograms in an analog system performing the Fourier transform is discussed for three methods of hologram recording: symmetrization, orthogonal coding [Eq. (136)], and two-phase recording in a phase medium. Symmetrization relies upon the following transformation, in the recorder, of the numerical matrix of the mathematical hologram Γ_F(r,s) into the physical hologram Γ(ξ,η):
where c is the constant bias required for representation of negative values of Γ_F(r,s); Δξ, Δη are the sampling steps along the axes ξ, η of the hologram recorder; ξ₀, η₀ are constants depending on the hologram position with respect to the optical axis of the reconstruction system; the function H_R(ξ,η) describes the recorder aperture performing analog interpolation of the discrete hologram samples; and W(ξ,η) is a masking function defining the physical size of the recorded hologram (apodization function).
Γ(ξ,η) = W(ξ,η){H_R(ξ,η) ⊗ Σ_r Σ_s [Γ_F(r,s) + c] δ(ξ + ξ₀ − rΔξ) δ(η + η₀ − sΔη)}   (166)
where ⊗ stands for convolution. According to the convolution theorem, the result of the inverse continuous Fourier transform performed at the reconstruction of the hologram Γ(ξ,η) is representable as

A(x,y) = ∫∫_{−∞}^{∞} Γ(ξ,η) exp[−i2π(xξ + yη)/λz₀] dξ dη

which contains the factor

Σ_{r=−∞}^{∞} Σ_{s=−∞}^{∞} [Γ_F(r,s) + c] exp[−i2π(rΔξ x + sΔη y)/λz₀]
or, after transformations,
As may be seen from Eq. (171), at the synthesis and recording of Fourier holograms one ought to take p = q = 0 and ξ₀ = η₀ = 0. It also follows from this formula that a hologram placed into the optical Fourier system reconstructs the samples of the original field distribution over the object in several diffraction orders (whose numbers are defined by m and n), masked by the function h_R(x,y), which is the Fourier transform of the aperture function H_R(ξ,η) of the recorder's recording element, and interpolated according to a function which is a transform of the hologram window function W(ξ,η). The last term within the braces of Eq. (171) describes the so-called central spot occurring because of the constant component in the hologram. Figure 53 shows the arrangement of diffraction orders in the restored image for shift parameters u = −N/2; v = −M/2. The numbers in parentheses indicate the diffraction orders (m,n) for these shift parameters; bold and thin arrows show the original symmetrized picture. Figure 46a presents an example of an image reconstructed from a hologram recorded by means of this method. In the case of orthogonal coding, one obtains according to Eq. (136)
FIG. 53. Arrangement of the diffraction orders during reconstruction of a hologram synthesized by symmetrization by duplication. Bold and thin arrows indicate the symmetrized parts of the image; numbers in parentheses indicate diffraction orders.
Following the reasoning used in the derivation of Eq. (171), and omitting the simple but rather awkward calculations, we find that, in the continuous Fourier transformation system, such a hologram reconstructs the following function
As in the above case, one should obviously take ξ₀ = η₀ = 0 and p = q = 0 at the recording and synthesis of the Fourier hologram; the reconstructed image contains in several diffraction orders a central spot due to the constant component in the hologram [the sums with coefficient c in Eq. (173)] and is masked by the function h_R(x,y), the Fourier transform of the recorder's recording element aperture. But here, in contrast to the above case, each diffraction order contains two superposed images of the object, a direct one and its conjugate turned by 180° with respect to the direct one. Each of them is additionally masked by cos(π Δξ x/λz₀) and sin(π Δξ x/λz₀), respectively. Therefore, in the central part of the direct image the conjugate one is suppressed, but at its periphery the noise caused by the conjugate image has an intensity comparable with that of the direct image. The significant disadvantage of this method as compared with symmetrization is accounted for by the fact that here the additional hologram samples necessary for representation of the spatial carrier are obtained by stepwise interpolation of the mathematical hologram samples, while in the symmetrization method ideal interpolation occurs automatically. The pattern of diffraction orders of the direct and conjugate images is shown for illustration in Fig. 54 for u = −N/2, v = −M/2. An example of the reconstructed image is shown in Fig. 55, where one may easily see the direct image and its conjugate, a noise image whose contrast increases along the vertical axis from the center to the periphery. Two-phase recording of the hologram amplitude and phase in a phase medium according to Eq. (151) is described by

Γ(ξ,η) = Σ_r Σ_s Σ_{m₀=0}^{1} Γ₀ exp{i[φ(r,s) − (−1)^m₀ arccos(|Γ_F(r,s)|/2Γ₀)]} × δ(ξ + ξ₀ − (2r + m₀)Δξ) δ(η + η₀ − sΔη)   (174)
FIG. 54. Arrangement of diffraction orders for the orthogonal coding method. Solid and dashed arrows stand for the superposed direct and conjugate images. Boxes indicate diffraction orders. Below, the weighting functions of the direct and conjugate images are depicted.
After the Fourier transformation, this hologram reconstructs the following function

A(x,y) = ∫∫_{−∞}^{∞} Γ(ξ,η) exp[−i2π(ξx + ηy)/λz₀] dξ dη
       = Σ_r Σ_s Σ_{m₀=0}^{1} Γ₀ exp{i[φ(r,s) − (−1)^m₀ arccos(|Γ_F(r,s)|/2Γ₀)]} × exp[−i2π(2r Δξ x + s Δη y)/λz₀] exp(−i2π m₀ Δξ x/λz₀)   (175)
By substituting Eq. (119) into Eq. (175) for |Γ_F(r,s)| exp[iφ(r,s)] and introducing an auxiliary function Ã₀(k,l) defined as follows
FIG. 55. Example of a reconstruction of an orthogonally coded hologram. One may discern the superposition of the direct and conjugate images.
we obtain after some transformations

A_R(x,y) = w(x,y) ⊗ 2h(x,y) exp[−i2π(ξ₀x + η₀y)/λz₀] ...
Reconstruction of a hologram recorded on a phase medium by the two-phase method is, thus, similar to the reconstruction of an orthogonally coded hologram [see Eq. (173)]. There are again several diffraction orders of the image masked by the function h_R(x,y); one may observe superposition of the noise image, described by the function Ã₀(k,l), on the original image, described by A₀(k,l); and the original and noise images are additionally masked by the functions cos(π Δξ x/λz₀) and sin(π Δξ x/λz₀), respectively, with the result that in the center the noise is attenuated, but over the peripheral area it may be of the same intensity as the basic image. In contrast to the orthogonal coding method, the noise image here is not conjugate to the original one but is, in a sense, similar to it because, according to Eq. (176), although it has a distorted amplitude spectrum, its phase spectrum is the same. Moreover, unlike holograms coded via the symmetrization or orthogonal methods, two-phase coding does not produce a central spot in the diffraction orders of the reconstructed image, because the hologram is recorded in the phase medium without a constant amplitude component. Figure 56 shows the pattern of diffraction orders in the reconstructed picture for this case with u = −N/2 and v = −M/2. Figure 57 shows an example of image reconstruction in diffraction orders (0,0) and (0,1).

E. Application of Synthesized Holograms to Information Display
Optical data processing today is the major area for practical application of synthesized holograms, which are widely used as elements of optical
FIG. 56. Arrangement of orders for two-phase recording. Solid and dashed arrows indicate the basic and noise images; boxed figures denote diffraction orders. Below, the weighting functions of both images are depicted.
processors such as spatial filters, focusers, deflectors, special diffraction gratings, and lenses (Lee, 1978; Dallas, 1980; Yaroslavskii and Merzlyakov, 1982; Mayorov et al., 1983). But there is another, no less important and, in a sense, even more promising and attractive application for synthesized holograms: information visualization and the design of holographic displays. Although the idea of designing three-dimensional displays based on hologram synthesis dates back to the first publications devoted to digital holography (Brown and Lohmann, 1969; Lesem et al., 1968, 1969; Huang, 1971), only now can one state that it has become feasible. Three basic methods of hologram calculation for information visualization have taken shape: "multiplan" holograms, compositional stereoholograms, and programmable diffusors.
FIG. 57. Example of reconstruction of a hologram recorded by the two-phase method.
The essence of the multiplan hologram method (Brown and Lohmann, 1969; Lesem et al., 1968; Edgar, 1969; Dallas, 1980; Ichioka et al., 1971) lies in representing a three-dimensional object as a set of several planes (plane cross sections) situated at various distances from the observer, so that calculation of the object hologram reduces to calculating holograms of the individual sections. Two versions of this method exist. In the first, holograms of the individual sections are calculated in the observation plane and then summed into a complete hologram. It is assumed here that nearer sections do not cover more distant ones. In the second, more refined version, a Fresnel hologram of the first, most distant section is computed in the plane of the second section, which is nearer to the observer, and is then multiplied by the transparency function of the second section allowing, in particular, for
possible shadowing of the first section by the second. A Fresnel hologram of the resulting field distribution is then computed in the plane of the third section, and so on until the observation plane is reached. This version of the multiplan hologram method was named "ping-pong propagation" (Dallas, 1980). The method of compositional stereoholograms, suggested by Yaroslavskii (1974) and Yatagai (1974; see also Jaroslavski and Merzlyakov, 1979; Jaroslavski et al., 1980; Yatagai, 1976), relies upon stereovision, which is one of the most important mechanisms of 3D perception. According to this method, holograms of different foreshortenings of the body under consideration are synthesized separately (see Section V,A) and are then combined into a mosaic, or compositional, hologram. Scanning this mosaic by eye, one sees a smooth change of foreshortenings, as if observing the object through a window whose dimensions equal those of the compositional hologram. The greater the area of the mosaic hologram, the greater the angle of object observation. To be convenient for observation, such holograms should obviously have dimensions at least several times greater than the interpupillary distance. At the same time, the size of the hologram corresponding to one observation foreshortening may evidently be chosen on the order of the pupil size, i.e., 10 to 20 mm with allowance for eye movement. The physical size of an elementary hologram required for object reconstruction with the desired resolution is defined by the number of resolution elements over the object and the desired observation angle, that is, by the maximal spatial frequency on the synthesized hologram. For 512 × 512 resolution elements and a maximal spatial frequency of 100 lines per mm, the size of the elementary hologram will be 5 × 5 mm. This means that the hologram for one observation foreshortening may be a mosaic repetition of the elementary holograms.
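The "ping-pong propagation" scheme described above can be sketched with an FFT-based Fresnel transfer function. This is an illustrative toy, not the cited authors' code: the function names, the sampling parameters, and the simple multiplicative shadowing model are our assumptions:

```python
import numpy as np

def fresnel_propagate(field, dz, wavelength, dx):
    """Propagate a sampled complex field by distance dz using the paraxial
    (Fresnel) transfer function applied in the Fourier domain; the constant
    phase exp(i k dz) is omitted."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    fx, fy = np.meshgrid(fx, fx)
    h = np.exp(-1j * np.pi * wavelength * dz * (fx**2 + fy**2))
    return np.fft.ifft2(np.fft.fft2(field) * h)

def ping_pong(sections, dz, wavelength, dx):
    """Multiplan hologram field: start from the farthest section, propagate
    to the next plane, multiply by that section's transparency (shadowing),
    and so on up to the observation plane."""
    field = sections[0].astype(complex)
    for t in sections[1:]:
        field = fresnel_propagate(field, dz, wavelength, dx) * t
    return fresnel_propagate(field, dz, wavelength, dx)

# Toy example: two 64x64 sections, the nearer one partly shadowing the farther
far = np.ones((64, 64))
near = np.ones((64, 64)); near[20:40, 20:40] = 0.0   # opaque patch
out = ping_pong([far, near], dz=0.1, wavelength=0.5e-6, dx=10e-6)
assert out.shape == (64, 64)
```

A uniform field propagates unchanged under this transfer function, which provides a quick sanity check of the propagator.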
The complete macrohologram results from an appropriate arrangement of the mosaics constructed for each observation foreshortening. If only horizontal parallax is used for perception, the holograms may be repeated in the vertical direction as many times as required for the construction of a convenient macrohologram. This principle underlies the design of circular compositional macroholograms, i.e., synthesized holographic films (Karnaukhov et al., 1976; Merzlyakov and Yaroslavskii, 1977a; Jaroslavski and Merzlyakov, 1979) (see Fig. 58). A remarkable feature of compositional stereoholograms is that they reproduce not only the body's volume but also its motion in space. For instance, if the circular hologram is rotated, the observer sees a rotation of the bodies. Experiments demonstrate that the illusion of continuous motion is preserved even at a low speed of hologram rotation. This fact enables one to speak of a cinematographic effect. The possibility of continuous changing of frames (Fourier holograms), provided by the invariance of the
FIG. 58. Model of a synthesized circular holographic film. Looking at a point light source through the hologram, which constitutes the cylinder's surface, the observer sees a three-dimensional object as if suspended in air. With rotation of the hologram, the object rotates around its axis as well.
Fourier transform with respect to shift, is a significant advantage of holographic films as compared with conventional ones. When the objects to be reproduced do not have contours or details capable of producing a stereo effect, one may synthesize holograms by means of the programmable diffusor method (Yaroslavskii, 1974), based on simulating the play of light and shade over the diffuse surfaces of bodies. The light patches occur under illumination by directed light owing to the property of diffusely reflecting objects of scattering the incident light nonuniformly in different directions. Therefore, the intensity of light reflected by some area of the object's surface in a given direction depends on the angle between this direction and the normal to this area, as well as on the direction to the light source. In order to model this effect by hologram synthesis, one may take advantage of the fact that visualization requires reproduction of only the object's macroform, i.e., of unevennesses much larger than the illumination wavelength. Given this macroform, one may determine the distance from each point on the object's surface to the tangent plane perpendicular to the observation direction (see Section V,A). This distance defines the wave phase
detour due to the object's macroform, that is, the "regular" phase component of the reflection factor as recalculated to the tangent plane. In order to reproduce the diffuse properties of the surface, a "random" component describing the surface microform (roughness) should be added to the "regular" component. For simulation of the nonuniform diffusion of light in different directions, the "random" phase component should be a correlated process, and its power spectrum (squared modulus of its Fourier transform) should coincide with the angular distribution of the reflected light intensity at the given place on the diffusing surface. The noncorrelated component, or diffusor with noncorrelated samples, used in the majority of hologram synthesis methods for simulating diffuse illumination, corresponds to uniform light diffusion in all directions. For further details the reader is referred to Merzlyakov and Yaroslavskii (1977), Yaroslavskii and Merzlyakov (1980), and Jaroslavski and Merzlyakov (1979). The "programmable" diffusor provides a basis for synthesizing Fourier holograms containing simultaneously information about all the object's foreshortenings and, thus, about its form. Figure 59 shows some experimental results with holograms synthesized through the programmable diffusor method. For more convenient observation, it is also good practice to construct macroholograms from programmable diffusor holograms. But, in contrast to compositional holograms, only the separate hologram fragments corresponding to different foreshortenings should be repeated mosaically, rather than the entire hologram (Yaroslavskii and Merzlyakov, 1980). It is also noteworthy that by varying the multiplicity of fragment repetition one can vary the scale of object observation depending on the observation angle. All of the hologram synthesis techniques described produce holograms reconstructable "through" (in transmission) and in monochromatic light.
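The correlated "random" phase component described above can be sketched by shaping white noise in the Fourier domain: the noise spectrum is multiplied by the square root of the desired power spectrum, so that the resulting process has (on average) the prescribed spectrum. This is our own toy illustration, and the Gaussian target spectrum is an arbitrary choice:

```python
import numpy as np

def correlated_diffusor(shape, spectrum, seed=0):
    """'Programmable diffusor' sketch: a random phase pattern whose power
    spectrum follows a prescribed angular scattering distribution.
    spectrum is a nonnegative array of the same shape as the output."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(shape)
    # Shape the white noise in the Fourier domain; the spectrum is assumed
    # symmetric, so the inverse transform is real up to rounding
    shaped = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(spectrum)).real
    return shaped / (shaped.std() + 1e-12)   # normalized phase pattern

# Narrow Gaussian spectrum -> strongly correlated (smooth) phase, i.e.,
# light scattered into a narrow cone of directions
n = 64
f = np.fft.fftfreq(n)
fx, fy = np.meshgrid(f, f)
spec = np.exp(-(fx**2 + fy**2) / (2 * 0.05**2))
phase = correlated_diffusor((n, n), spec)
assert phase.shape == (n, n)
```

A flat (constant) `spectrum` reproduces the noncorrelated diffusor mentioned in the text, i.e., uniform scattering in all directions.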
For visualization, it would be more convenient to have reflection holograms reconstructable in white light. Such are the hybrid optodigital holograms. The general idea of hybrid hologram synthesis, as formulated by Yaroslavskii (Jaroslavski and Merzlyakov, 1979), lies in writing synthesized holograms on a carrier which already contains an analog hologram generated in advance, intended for matching the recorded hologram to the observation and illumination environment. In the course of recording, the synthesized hologram modulates the analog one, so that the reconstructed wavefront is defined by the product of the wavefront amplitudes of both holograms. In this case, the analog hologram should be that of a point light source. If the analog hologram is recorded in counterpropagating beams (Denisyuk, 1962), the resulting hybrid hologram will be reconstructable in white light and may be made reflecting. The hybrid hologram thus can combine the advantages of optical holograms (simplicity and convenience of observation in natural
FIG. 59. Examples of image reconstruction from a hologram synthesized by the programmable diffusor method: (a) object (uniformly colored pyramid); (b) hologram; (c) result of reconstructing the upper left fragment of the hologram: the pyramid as illuminated from the upper left; (d) result of reconstructing the upper right fragment: the pyramid as illuminated from the upper right; (e), (f) two views from the lower left and lower right.
illumination) and synthesized ones (the possibility of visualizing objects defined by a mathematical description or by a signal). Generally speaking, the production of hybrid holograms according to this method requires special recorders for synthesized holograms capable of using preliminarily exposed photographic materials. The point is that the requirements on photographic materials for optical and synthesized holograms differ significantly. The former must have high resolution (several thousand lines per mm) but may have low sensitivity. The latter may have low resolution (several hundred lines per mm suffice) but should have high sensitivity in order to ensure short hologram recording times. It is difficult to combine high resolution and high sensitivity in one material. Therefore, different photographic materials are used today for recording optical and digital holograms. Taking this into account, the following two methods of hybrid hologram production may be recommended (Jaroslavski et al., 1980).

1. Method of Rephotographing
The synthesized hologram is photographically copied on a plate or film preliminarily exposed to an optical hologram and afterwards subjected to photochemical processing. This method requires meticulous adjustment of exposures, but it better implements the concept of hybrid holograms.
2. Method of Sandwich Holograms

This method, which is technologically much simpler, consists of the separate production of synthesized and optical holograms, which are then put together to form a sandwich. An additional advantage of this method is that it enables the combination of kinoform and bleached optical holograms, which feature the highest diffraction efficiency.

Both methods, however, have not yet been implemented because of difficulties involved in the production of point-source optical holograms free of chromatic aberrations. For the time being, the method of holographic rephotographing, consisting of the production of optical holograms of images reconstructed from synthesized holograms, seems to be more promising. Case and Dallas (1978) have described an experiment carried out with volume holograms according to this technique. Karnaukhov et al. (1982) reported a holographic rephotographing method as applied to rainbow holograms. This enables generation of a hologram containing several frames of holographic film (Yaroslavskii and Merzlyakov, 1980), that is, several foreshortenings of a 3D body. The method of rephotographing on a rainbow
hologram matches well with that of compositional stereoholograms, because both ignore vertical parallax. The best existing hybrid holograms have been obtained by this method. It seems, however, that the most interesting, and also most difficult, modification of the holographic rephotographing method was suggested by MacQuigg (1977), who constructed a setup for the sequential recording on a volume medium of three holograms of transparencies on which the three components of a hologram synthesized by the Burckhardt (1970) method (see Section V,D) are written. Recording is done in three exposures, with the object beam being shifted by 120°. The hybrid volume hologram generated in this way reconstructs the image in white light as the Burckhardt hologram does in coherent light.

All the existing methods for the synthesis and recording of color holograms rely upon separate calculation of three holograms for red, green, and blue, the difference lying only in the methods of their recording. For instance, Dallas et al. (1972) proposed the synthesis of three binary holograms for red, blue, and green, and the introduction of three different spatial carriers for them. They are reconstructed by three lasers (red, blue, and green) illuminating the hologram at appropriate angles so as to merge the reconstructed color-separated images into a color image. Fienup and Goodman (1974) proposed producing color-synthesized holograms by photographing binary holograms on color film from a CRT through red, blue, and green filters. In doing so, each hologram is recorded on the corresponding layer owing to the filters. For the reconstruction of color images from such holograms, a laser with three radiation lines (red, blue, and green) is used, and a color image is formed in the focal plane of the lens performing the Fourier transformation. Being illuminated by this laser, each film layer selects the appropriate beam component and reconstructs its image.
Since the spectral selectivity of the layers of color photographic emulsions is not high enough, color distortions are possible due to the mutual influence of the layers. These distortions may be reduced if, in addition to recording into separate layers, one also separates the color-separated holograms spatially, either by shifting them with respect to each other or by the spatial interleaving of hologram elements. Moreover, the mutual influence may be offset by appropriate correction of the hologram amplitude and phase recorded into each layer. In order to simplify the adjustment of color film exposures and also to produce color holograms containing a large number of samples (macroholograms), Yaroslavskii (1978; see also Yaroslavskii and Merzlyakov, 1982) proposed to contact-print on color film, through color filters, three color-separated holograms written by one of the existing methods on black-and-white film. The resulting holograms may also be used for direct visual observation of a colored object if a point source of white light is used.
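The separate calculation of the three color-separated holograms can be sketched as follows: one Fourier hologram per color channel, each intended for recording into the matching emulsion layer. This is a simplified illustration (the random-phase diffusor and intensity-only record are assumptions; the amplitude coding and spatial carriers of the actual recording methods are omitted):

```python
import numpy as np

def color_separated_holograms(rgb_image, seed=0):
    """Compute one synthetic Fourier hologram per color channel.

    rgb_image: float array (H, W, 3).  A random-phase diffusor is attached
    to each channel before the Fourier transform (simplified sketch)."""
    rng = np.random.default_rng(seed)
    holograms = []
    for c in range(3):  # red, green, blue channels
        amplitude = np.sqrt(rgb_image[..., c])
        diffusor = np.exp(1j * 2 * np.pi * rng.random(amplitude.shape))
        field = np.fft.fft2(amplitude * diffusor)
        holograms.append(np.abs(field))  # intensity-only record (simplified)
    return holograms

rgb = np.random.default_rng(1).random((32, 32, 3))
h_r, h_g, h_b = color_separated_holograms(rgb)
```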
At present, color macroholograms may be recorded directly on color film, without the intermediate recording of color-separated holograms on black-and-white film, thanks to the advent of devices for the photorecording of color images such as the Colormation C4300 (Optronics International, USA).
REFERENCES

Ahmed, N., and Rao, K. R. (1975). "Orthogonal Transforms for Digital Signal Processing." Springer-Verlag, Berlin and New York.
Andrews, H. C. (1970). "Computer Techniques in Image Processing." Academic Press, New York.
Andrews, H. C. (1972). Endeavour 31, 88.
Andrews, H. C., and Hunt, B. R. (1977). "Digital Image Restoration." Prentice-Hall, New York.
Avatkova, N. A., Sveshnikova, O. M., Feinberg, I. S., and Yaroslavskii, L. P. (1980). In "Poverkhnost' Marsa" ("Surface of Mars"), p. 77. Nauka, Moscow.
Belikova, T. P., and Yaroslavskii, L. P. (1974). Vopr. Radioelektron. 14, 80.
Belikova, T. P., and Yaroslavskii, L. P. (1975). Izv. Akad. Nauk SSSR Tekh. Kibern. (4), 139.
Belikova, T. P., and Yaroslavskii, L. P. (1980). Avtometriya (4), 66.
Belikova, T. P., Kronrod, M. A., Chochia, P. A., and Yaroslavskii, L. P. (1975). Kosm. Issled. 13, 898.
Belikova, T. P., Kronrod, M. A., Chochia, P. A., and Yaroslavskii, L. P. (1980). In "Poverkhnost' Marsa" ("Surface of Mars"). Nauka, Moscow.
Belinskii, A. N., and Yaroslavskii, L. P. (1980). Issled. Zemli iz Kosmosa (4), 85.
Belinskii, A. N., Katyshev, V. A., and Yaroslavskii, L. P. (1980). In "Tezisy dokladov Vsesoyuznogo simpoziuma 'Problemy tsyfrovogo kodirovaniya i preobrazovaniya izobrazhenii'" (Abstracts of the All-Union Symposium "Problems of Image Digital Coding and Transformation"), p. 67. Tbilisi.
Bockstein, I. M., and Yaroslavskii, L. P. (1980). Avtometriya (3), 66.
Born, M., and Wolf, E. (1959). "Principles of Optics." Pergamon, Oxford.
Brown, B. R., and Lohmann, A. W. (1966). Appl. Opt. 6, 967.
Brown, B. R., and Lohmann, A. W. (1969). IBM J. Res. Dev. 13, 160.
Burch, J. J. (1967). Proc. IEEE 55, 599.
Burckhardt, C. B. (1970). Appl. Opt. 9, 639.
Cady, F. M., and Hodgson, R. M. (1980). Proc. Inst. Electr. Eng. 127, 197.
Case, S. K., and Dallas, W. J. (1978). Appl. Opt. 17, 2537.
Chavel, P., and Hugonin, J. P. (1976). J. Opt. Soc. Am. 66, 989.
Chu, D. C., Fienup, J. R., and Goodman, J. W. (1973). Appl. Opt. 12, 1386.
Dallas, W. J. (1980). Top. Appl. Phys. 41, 291.
Dallas, W. J., Ichioka, Y., and Lohmann, A. W. (1972). J. Opt. Soc. Am. 62, 739B.
David, E. E. (1961). Proc. IRE 49, 319.
Denisyuk, Yu. N. (1962). Dokl. Akad. Nauk SSSR 144, 1275.
Dudinov, V. N., Kryshtal, V. A., and Yaroslavskii, L. P. (1977). Geod. Kartogr. (1), 42.
Edgar, R. F. (1969). Opt. Technol. 1, 183.
Ershov, A. A. (1978). Avtom. Telemekh. (8), 66.
Fienup, J. R., and Goodman, J. W. (1974). Nouv. Rev. Opt. 5, 269.
Frei, W. (1977). Comput. Graph. Image Process. 6, 286.
Frieden, B. R. (1975). Top. Appl. Phys. 6, 77.
Frieden, B. R. (1980). Top. Appl. Phys. 41, 1.
Garmash, V. A. (1957). Elektrosvyaz' (10), 13.
Gonzalez, R. C., and Wintz, P. (1977). "Digital Image Processing." Addison-Wesley, Reading, Massachusetts.
Goodman, J. W., Chu, D. C., and Fienup, J. R. (1974). Proc. SPIE 41, 155.
Gurevich, S. B., and Odnol'ko, V. V. (1970). Tekh. Kino Telev. (5), 55.
Haskell, R. E. (1973). J. Opt. Soc. Am. 63, 504.
Haskell, R. E., and Culver, B. C. (1972). Appl. Opt. 11, 2712.
Hsue, C. K., and Sawchuk, A. A. (1978). Appl. Opt. 17, 3874.
Huang, T. S. (1971). Proc. IEEE 59, 1335.
Huang, T. S. (1975). Top. Appl. Phys. 6, 1.
Huang, T. S. (1981). Top. Appl. Phys. 41, 1.
Huang, T. S., and Prasada, B. (1966). Q. Prog. Rep. Res. Lab. Electron. MIT (81), 199.
Huang, T. S., Schreiber, W. F., and Tretiak, O. J. (1971). Proc. IEEE 59, 1586.
Huber, P. J. (1981). "Robust Statistics." Wiley Series in Probability and Mathematical Statistics. Wiley, New York.
Hummel, R. A. (1975). Comput. Graph. Image Process. 4, 209.
Ichioka, Y., Izumi, M., and Suzuki, T. (1971). Appl. Opt. 10, 403.
Jaroslavski, L. P. (1978). Proc. Congr. Int. Comm. Opt., 11th, Sept. 10-17, p. 149. Instituto de Optica "Daza de Valdes," C.S.I.C.
Jaroslavski, L. P. (1980a). In "Digital Signal Processing" (V. Cappellini and A. G. Constantinides, eds.), p. 69. Academic Press, New York.
Jaroslavski, L. P. (1980b). Prepr. Int. Congr. High Speed Photogr. Photon., p. 259.
Jaroslavski, L. P. (1980c). In "EUSIPCO-80. Signal Processing: Theories and Applications." Eur. Signal Process. Conf., 1st, Lausanne, p. 161.
Jaroslavski, L. P., and Merzlyakov, N. S. (1979). Comput. Graph. Image Process. 8, 1.
Jaroslavski, L. P., Karnaukhov, V. N., and Merzlyakov, N. S. (1980). Abstr. Pap. Symp. Opt. 80, p. 86.
Justusson, B. I. (1981). Top. Appl. Phys. 43, 161.
Karnaukhov, V. N., and Yaroslavskii, L. P. (1981). Pis'ma v Zh. Tekh. Fiz. 7, 908.
Karnaukhov, V. N., Merzlyakov, N. S., and Yaroslavskii, L. P. (1976). Pis'ma v Zh. Tekh. Fiz. 2, 169.
Karnaukhov, V. N., Merzlyakov, N. S., and Ovechkis, Yu. N. (1982). Opt. Commun. 42, 10.
Kirk, J. P., and Jones, A. L. (1971). J. Opt. Soc. Am. 61, 1023.
Kulpa, Z. (1981). In "Digital Image Processing Systems" (L. Bolc and Z. Kulpa, eds.), p. 1. Springer-Verlag, Berlin and New York.
Kurst, C. N., Hoadley, H. O., and De Palma, J. J. (1973). J. Opt. Soc. Am. 63, 1080.
Kuzmenko, A. V. (1977). Opt. Spektrosk. 42, 973.
Lee, W. H. (1970). Appl. Opt. 9, 639.
Lee, W. H. (1978). Prog. Opt. 16, 119.
Leikin, G. A., Zabaluyeva, E. V., and Yaroslavskii, L. P. (1980). Nauchn. Inf. Ser. Astrofiz. 43, 88.
Leith, E. N., and Upatnieks, J. (1961). J. Opt. Soc. Am. 51, 1469.
Lesem, L. B. (1967). American Federation of Information Processing Societies, Fall Joint Computer Conference. AFIPS Conf. 31, 41.
Lesem, L. B., Hirsch, P. M., and Jordan, J. A. (1968). Commun. ACM 11, 661.
Lesem, L. B., Hirsch, P. M., and Jordan, J. A. (1969). IBM J. Res. Dev. 13, 150.
Lohmann, A. W., and Paris, D. P. (1966). Appl. Opt. 6, 1739.
McFarlane, M. D. (1972). Proc. IEEE 60, 768.
Machover, C., Neighbours, M., and Stuart, C. (1977). IEEE Spectrum 14, 23.
MacQuigg, D. R. (1977). Appl. Opt. 16, 1380.
Max, J. (1960). IRE Trans. IT-6, 7.
Mayorov, S. A., Ochin, E. F., and Romanov, Yu. F. (1983). "Opticheskiye analogoviye vychislitel'niye mashiny" ("Optical Analog Computers"). Energoatomizdat, Leningrad.
Mertz, L. (1965). "Transformations in Optics." Wiley, New York.
Merzlyakov, N. S., and Yaroslavskii, L. P. (1977a). Dokl. Akad. Nauk SSSR 237, 318.
Merzlyakov, N. S., and Yaroslavskii, L. P. (1977b). Zh. Tekh. Fiz. 47, 1263.
Merzlyakov, N. S., and Yaroslavskii, L. P. (1982). Prikl. Vopr. Hologr. Tem. Sb., p. 175.
Milgram, D. L. (1974). Univ. Md. Tech. Rep., July, p. 313.
Mirkin, L. I. (1978). Vopr. Kibern. 38, 73.
Mirkin, L. I., and Yaroslavskii, L. P. (1978). Vopr. Kibern. 38, 97.
Nepoklonov, N. S., Leykin, G. A., Selivanov, A. S., Yaroslavskii, L. P., Aleksashin, E. P., Bockstein, I. M., Kronrod, M. A., and Chochia, P. A. (1979). In "Perviye panoramy poverkhnosti Venery" ("The First Panoramas of the Venus Surface"), p. 80. Nauka, Moscow.
Papoulis, A. (1968). "Systems and Transforms with Applications in Optics." McGraw-Hill, New York.
Pratt, W. K. (1972). IEEE Trans. C-21, 636.
Pratt, W. K. (1978). "Digital Image Processing." Wiley, New York.
Reader, C., and Hubble, L. (1981). Proc. IEEE 69, 606.
Roberts, L. G. (1962). IRE Trans. IT-8, 145.
Rosenfeld, A. (1969). "Picture Processing by Computer." Academic Press, New York.
Rosenfeld, A., and Kak, A. (1982). "Digital Picture Processing," 2nd Ed., Vol. 1. Academic Press, New York.
Rosenfeld, A., and Troy, E. B. (1970). In "Conference Record of the Symposium on Feature Extraction and Selection in Pattern Recognition," IEEE Publ. 70C51-C, p. 115.
Selivanov, A. S., Narayeva, M. K., Synelnikova, N. F., Suvorov, B. A., Yelensky, V. Ya., Alyoshkin, G. M., and Shabanov, A. G. (1974). Tekh. Kino Telev. (9), 55.
Shmakov, P. V., Kolin, K. T., and Jakoniya, V. E. (1966). "Stereotelevideniye" ("Stereotelevision"). Svyaz', Moscow.
Shmaryov, E. K. (1976). Opt. Spektrosk. 41, 905.
Slepian, D. (1967). J. Opt. Soc. Am. 57, 905.
Sondhi, M. M. (1972). Proc. IEEE 60, 842.
Tukey, J. W. (1971). "Exploratory Data Analysis." Addison-Wesley, Reading, Massachusetts.
Tyan, S. G. (1981). Top. Appl. Phys. 43, 197.
Ushakov, A. N. (1979). Avtometriya (4), 61.
Ushakov, A. N. (1981). In "Tsyfrovaya obrabotka signalov i eyo primeneniya" ("Digital Signal Processing and its Applications") (L. P. Yaroslavskii, ed.), p. 98. Nauka, Moscow.
Ushakov, A. N., and Yaroslavskii, L. P. (1984). Avtometriya (5), 115.
Vainshtein, G. G., Lebedev, D. S., and Yaroslavskii, L. P. (1969). Izv. Akad. Nauk SSSR (3), 46.
Valyus, N. A. (1950). "Stereoskopiya" ("Stereoscopics"). Acad. Sci. USSR, Moscow.
Vander Lugt, A. (1964). IEEE Trans. Inf. Theory IT-10, 139.
Vasilenko, G. I. (1979). "Teoriya vosstanovleniya signalov" ("Theory of Signal Restoration"). Sovetskoye Radio, Moscow.
Yaroslavskii, L. P. (1965). Vopr. Radioelektron., Ser. IX Tekh. Telev. 6, 17.
Yaroslavskii, L. P. (1968). "Ustroystva vvoda-vyvoda izobrazhenii dlya tsyfrovykh vychislitel'nykh mashin" ("Picture Input-Output Devices for Computers"). Energiya, Moscow.
Yaroslavskii, L. P. (1972a). In "Konferentsiya po avtomatizatsii nauchnykh issledovanii na osnove primeneniya EtsVM, 5-9 Iyunya 1972 g. Koherentno-opticheskiye elementy obrabotki informatsii" (Conference on Automatization of Scientific Research with Application of Computers, 5-9 June 1972: Coherent-Optical Elements for Information Processing), p. 7. Novosibirsk.
Yaroslavskii, L. P. (1972b). Radiotekh. Elektron. 17, 714.
Yaroslavskii, L. P. (1974). In "Konferentsiya 'Avtomatizatsiya nauchnykh issledovanii na osnove primeneniya EVM', 10-12 Iyunya 1974 g. Ispol'zovaniye novykh fizicheskikh printsipov v sistemakh avtomatizatsii. Koherentno-opticheskaya diagnostika plazmy. Opticheskiye metody obrabotki i khraneniya informatsii" (Conference on Automatization of Scientific Research with Application of Computers, 10-12 June 1974: Using New Physical Principles in Automatization Systems; Coherent-Optical Diagnostics of Plasma; Optical Methods of Information Processing and Storage), p. 87. Novosibirsk.
Yaroslavskii, L. P. (1976a). Byull. Izobr. (43), 135.
Yaroslavskii, L. P. (1976b). Geod. Kartogr. (10), 15.
Yaroslavskii, L. P. (1978). In "Avtomatizatsiya eksperimental'nykh issledovanii. Trudy Vsesoyuznoy nauchno-tekhnicheskoy konferentsii (5- Iyunya 1978)" ("Automatization of Experimental Research. Transactions of the All-Union Conference"), p. 140. KuAI, Kuybishev.
Yaroslavskii, L. P. (1979a). "Vvedeniye v tsifrovuyu obrabotku izobrazhenii" ("Introduction to Digital Picture Processing"). Sovetskoye Radio, Moscow.
Yaroslavskii, L. P. (1979b). Probl. Peredachi Inf. 15, 102.
Yaroslavskii, L. P. (1981). In "Avtomatizatsiya nauchnykh issledovanii na osnove primeneniya EVM. Tezisy dokladov VI Vsesoyuznoy konferentsii" ("Automatization of Scientific Research with Application of Computers." Abstracts of Papers Presented at the 6th All-Union Conference), p. 123. Inst. Avtom. Elektrometr. Sib. Otd. Akad. Nauk SSSR, Novosibirsk.
Yaroslavskii, L. P. (1985). "Digital Picture Processing. An Introduction." Springer-Verlag, Berlin and New York.
Yaroslavskii, L. P., and Fayans, A. M. (1975). In "Ikonika. Tsifrovaya golografiya. Obrabotka izobrazhenii" ("Iconics. Digital Holography. Picture Processing") (D. S. Lebedev, ed.), p. 29. Nauka, Moscow.
Yaroslavskii, L. P., and Merzlyakov, N. S. (1977). "Metody tsifrovoy golografii" ("Methods of Digital Holography"). Nauka, Moscow.
Yaroslavskii, L. P., and Merzlyakov, N. S. (1980). "Methods of Digital Holography." Plenum, New York.
Yaroslavskii, L. P., and Merzlyakov, N. S. (1982). "Tsifrovaya golografiya" ("Digital Holography"). Nauka, Moscow.
Yaroslavskii, L. P., Kronrod, M. A., and Merzlyakov, N. S. (1974). In "Sovremennoye sostoyaniye i perspektivy razvitiya golografii" ("State of the Art and Prospects of Development of Holography") (L. D. Bakhrakh and G. Kh. Friedman, eds.), p. 54. Nauka, Moscow.
Yatagai, T. (1974). Opt. Commun. 12, 43.
Yatagai, T. (1976). Appl. Opt. 15, 2722.
Zavalishin, N. B., and Muchnik, I. B. (1974). "Modeli zritel'nogo vospriyatiya i algoritmy analiza izobrazhenii" ("Visual Perception Models and Algorithms for Image Analysis"). Nauka, Moscow.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 66
Two-Dimensional Digital Filters and Data Compression
V. CAPPELLINI
Dipartimento di Ingegneria Elettronica, University of Florence, and IROE-C.N.R., Florence, Italy
I. Introduction
II. Two-Dimensional Digital Filters
   A. Definition of Two-Dimensional Digital Filters and General Properties
   B. Two-Dimensional Digital Filter Stability
   C. Design Methods of Two-Dimensional Digital Filters
III. Local Space Operators
   A. Local Space Operators for Image Smoothing and Enhancement
   B. Edge Detectors
IV. Data Compression
   A. Source Coding
   B. Data Compression Methods and Techniques
V. Joint Use of Two-Dimensional Digital Filters and Data Compression
   A. Some Typical Connections of the Two Digital Operations
   B. Processing System for Digital Comparison and Correlation of Images Having Different Space Resolution
VI. Applications
   A. Applications to Communications
   B. Applications to Remote Sensing
   C. Applications to Biomedicine
   D. Applications to Robotics
References
I. INTRODUCTION
The area of digital image processing is of increasing importance and interest due to the impressive advances and innovations in digital methods and technology. In particular, one may outline the following aspects: definition of new and efficient algorithms to perform linear and nonlinear operations with great application flexibility and adaptivity; production of fast new image digitizing systems and high-resolution displays; and efficient software or hardware implementation capabilities due to the large expansion and evolution of
standard computers, minicomputers, array processors, microprocessors, and high-integration digital circuits (LSI, VLSI), available at decreasing cost.

Digital image processing methods and techniques are being increasingly applied in several important fields such as communications, radar-sonar systems, remote sensing, biomedicine, office automation, moving-object recognition, and robotics. Digital operations of high interest for image processing are the following: two-dimensional (2D) digital transformations, 2D digital filtering, local space processing, data reduction or compression, and pattern recognition. Digital transformations, digital filtering, and local space operators can be used to perform smoothing, enhancement, noise reduction, and edge extraction. Data compression operations permit one to reduce the large amount of data representing the images in digital form, solving transmission or storage problems. Pattern recognition is used to extract useful configurations from images for final interpretation and utilization.

In this article 2D digital filtering and data compression operations are mainly described, pointing out their crucial importance for image processing; in particular, the joint use of these two digital operations to increase the overall efficiency of image processing is presented. Local space operators are also described, as simpler 2D digital filters defined only in the space domain (while the more complex digital filters are defined and designed both in the space and frequency domains). After a brief review of the separate digital operations (2D digital filters, local space operators, data compression) and the presentation of their joint use, some examples of applications to important fields such as communications, remote sensing, biomedicine, and robotics are described.
II. TWO-DIMENSIONAL DIGITAL FILTERS

The following aspects regarding 2D digital filters are presented: filter definition and general properties, stability, and design methods.

A. Definition of Two-Dimensional Digital Filters and General Properties
Linear shift-invariant 2D digital filters are defined by 2D difference equations of the type (Cappellini et al., 1978)

g(n1, n2) = Σ Σ_((k1,k2) ∈ Ra) a(k1, k2) f(n1 − k1, n2 − k2) − Σ Σ_((k1,k2) ∈ Sb) b(k1, k2) g(n1 − k1, n2 − k2)    (1)

where f(n1, n2) are the input data (samples of the input image), g(n1, n2) are the output data (samples of the output image), and a(k1, k2), b(k1, k2) are the coefficients which define the 2D digital filter. Ra and Sb are suitable sets over which the indices of the sums can vary to specify different classes of filters.

A first important class of 2D digital filters is obtained when Ra is defined as the set 0 ≤ k1 ≤ N1 − 1, 0 ≤ k2 ≤ N2 − 1 and Sb is the void set. In this case Eq. (1) is reduced to

g(n1, n2) = Σ_(k1=0)^(N1−1) Σ_(k2=0)^(N2−1) a(k1, k2) f(n1 − k1, n2 − k2)    (2)
and defines the class of FIR (finite impulse response) 2D digital filters. The difference equation is now a convolution between the input-data matrix and the coefficient matrix, and no feedback of previous output data is present. In the (z1, z2) plane the filter of Eq. (2) is defined by the transfer function

H(z1, z2) = Σ_(k1=0)^(N1−1) Σ_(k2=0)^(N2−1) a(k1, k2) z1^(−k1) z2^(−k2)    (3)
which can be obtained by applying the 2D z transform to both sides of Eq. (2) and has the form of a bivariate polynomial. In the above FIR case, assuming that the entire matrix to be processed is available, the sequence of computation is in principle irrelevant, and it is only a matter of computational convenience. On the contrary, in the general case, when Sb is not the void set and some samples g(n1, n2) are used in the computation of the current output sample, the sequence of computation is indeed important, because the output samples to be used in Eq. (1) have to be available (that is, previously computed or part of the initial conditions). Moreover, the sequence of computation has to be such that the corresponding linear system is stable. Therefore, the set Sb has to be chosen in a suitable way, and a set of initial conditions for the computation has to be defined in such a way as to compute any output sample as a function of the previously computed output samples or of the initial conditions. According to the above considerations, several different sequences of computation can be chosen. The most common one is that corresponding to the so-called quadrant recursive causal filter. In this case g(n1, n2) = 0 for n1 < 0 and for n2 < 0, if f(n1, n2) = 0 for n1 < 0 and for n2 < 0. The input-output equation can now be written in the form
g(n1, n2) = Σ_(k1=0)^(N1−1) Σ_(k2=0)^(N2−1) a(k1, k2) f(n1 − k1, n2 − k2) − Σ_(k1=0)^(M1−1) Σ_(k2=0)^(M2−1) b(k1, k2) g(n1 − k1, n2 − k2),   (k1, k2) ≠ (0, 0) in the second sum    (4)
By means of this equation, it is possible to compute every output sample from the previously computed ones and from the initial conditions. The recursion can be performed both along the rows and the columns of the array. In this case the 2D digital filter is described in the (z1, z2) plane by the transfer function

H(z1, z2) = A(z1, z2) / B(z1, z2) = [Σ_(k1=0)^(N1−1) Σ_(k2=0)^(N2−1) a(k1, k2) z1^(−k1) z2^(−k2)] / [Σ_(k1=0)^(M1−1) Σ_(k2=0)^(M2−1) b(k1, k2) z1^(−k1) z2^(−k2)],   b(0, 0) = 1    (5)
which is the ratio of two bivariate polynomials. The matrix {b(n1, n2)}, which defines the recursive part of the filter, is different from zero only in the region n1 ≥ 0 and n2 ≥ 0. It is normally called a first-quadrant sequence and can be indicated with the symbol {++b(n1, n2)}. In the same way it is possible to define filters whose recursive parts correspond to matrices different from zero on the second, third, and fourth quadrants. The quadrant filters are not the only form which allows the choice of the sets Ra and Sb and of initial conditions to obtain recursive implementations of Eq. (1) (Mersereau and Dudgeon, 1975). Another choice corresponds to the unsymmetrical half-plane filters, whose coefficient matrices are defined on half-planes.

B. Two-Dimensional Digital Filter Stability

Among the different definitions of stability, the most commonly used is based on the BIBO (bounded-input bounded-output) criterion (Cappellini et al., 1978). This corresponds to saying that a filter is stable if its response to a limited input is also limited. It is possible to show that, for causal linear shift-invariant filters, this corresponds to the condition
Σ_(n1) Σ_(n2) |h(n1, n2)| < ∞    (6)

where {h(n1, n2)} is the impulse response of the filter. The above definition allows the first very important observation that the stability criterion is always verified if the number of terms in the impulse response is finite, as is the case with FIR digital filters.

Obviously, the condition of Eq. (6) does not provide a viable method to test the stability of IIR (infinite impulse response) digital filters. In the one-dimensional (1D) case, it is possible to relate the BIBO stability condition to the positions of the singularities of the z transfer function (poles), and it is possible to test the stability by finding the zeros of the denominator polynomial. A similar theorem, which establishes a relation between the stability of the filter and the zeros of the denominator bivariate polynomial, can be formulated also in the 2D case. For causal quadrant filters, this theorem (Shanks et al., 1972) states that, if B(z1, z2) in Eq. (5) is a polynomial in z1 and z2, the expansion of 1/B(z1, z2) in negative powers of z1 and z2 converges absolutely if and only if

B(z1, z2) ≠ 0   for |z1| ≥ 1, |z2| ≥ 1    (7)

The corresponding sequences are defined as minimum-phase causal quadrant sequences. Correspondingly, maximum-phase and mixed-phase noncausal quadrant sequences are defined. Unfortunately, in the 2D case the formulation of the stability conditions as above does not directly produce an efficient stability test, as in the 1D case, due to the lack of an appropriate factorization theorem of algebra. An approach to the solution of the stability problem in the 2D case can be the use of the properties of the complex cepstrum (Oppenheim and Schafer, 1975) of causal minimum-phase sequences and noncausal sequences. The complex cepstrum of a sequence {f(n1, n2)} is defined as

f̂(n1, n2) = Z^(−1){ln[Z(f(n1, n2))]}    (8)
and it exists if: (1) the Fourier transform of the sequence is not equal to zero or infinity at any frequency; (2) any linear phase component has been eliminated through an appropriate shift of the original sequence (Dudgeon, 1975). These conditions are, for example, satisfied if the coefficient matrix is the coefficient matrix of a squared-magnitude transfer function. Further, if the computed cepstrum of a quadrant causal filter is quadrant causal, then the corresponding sequence is minimum phase and the filter is stable. The same considerations can also be extended to the half-plane filters.

C. Design Methods of Two-Dimensional Digital Filters
Many design methods have been defined for 2D digital filters of the FIR and IIR types. Some of the more important methods are presented in the following, with special reference to those combining good efficiency with a reasonable complexity of design and implementation for image processing.

1. Design of Two-Dimensional FIR Digital Filters
An important property of 2D FIR digital filters is that they can be designed to have completely real or completely imaginary frequency responses, modified by a linear phase term, if suitable symmetries are present in the impulse response.
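This property is easy to verify numerically. The sketch below (an illustrative check, not from the text) symmetrizes a random impulse response with respect to both axes, h(n1, n2) = h(−n1, n2) = h(n1, −n2) = h(−n1, −n2), and confirms that the frequency response of the centered (noncausal) form is purely real:

```python
import numpy as np

# Build an impulse response with symmetry about both axes.
rng = np.random.default_rng(0)
N = 7                                  # odd size, noncausal centered form
h = rng.random((N, N))
h = h + h[::-1, :] + h[:, ::-1] + h[::-1, ::-1]   # enforce the symmetries

# Centering the response at index (0, 0) removes the linear phase term,
# so the DFT samples of the frequency response are (numerically) real:
H = np.fft.fft2(np.fft.ifftshift(h))
max_imag = np.abs(H.imag).max()
```

For the causal (shifted) response the same DFT acquires the linear phase factor corresponding to the shift, as stated in the text.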
As an example, consider a 2D linear digital filter having N1 and N2 odd, in its noncausal form. The frequency response of this filter can be written in the form [starting from Eq. (3), with z1 = e^(−jω1X) and z2 = e^(−jω2X), and assuming the space-sampling interval X = 1]

H(ω1, ω2) = h(0, 0) + 2 Σ_(n1=1)^((N1−1)/2) h(n1, 0) cos(ω1 n1) + 2 Σ_(n2=1)^((N2−1)/2) h(0, n2) cos(ω2 n2) + 4 Σ_(n1=1)^((N1−1)/2) Σ_(n2=1)^((N2−1)/2) h(n1, n2) cos(ω1 n1) cos(ω2 n2)    (9)
if the filter impulse response {h(n1, n2)} has the following symmetries

h(n1, n2) = h(n1, −n2) = h(−n1, −n2) = h(−n1, n2)    (10)
which correspond to symmetries with respect to the origin of the axes and also with respect to the axes (circularly symmetric filters can be obtained in this way). The frequency response of Eq. (9) is symmetric with respect to the axes, as can be verified with a sign change of ω1 and ω2. The frequency response of the causal filter is then obtained by multiplying Eq. (9) by the linear phase term which corresponds to the shift of the impulse response, that is, by exp{−j[ω1(N1 − 1)/2 + ω2(N2 − 1)/2]}.
The design problem consists in the evaluation of the coefficient matrix {a(n1, n2)} in such a way as to meet a set of given specifications in the space or frequency domain. Several different methods have been proposed and defined, some of which are direct generalizations of their 1D counterparts.

A relatively simple design method is the so-called window method (Cappellini et al., 1978). It is based on the consideration that, the 2D frequency response being periodic, it is possible to represent it as a Fourier series whose coefficients, according to the 2D sampling theorem, are proportional to the samples of the impulse response of the filter. Therefore, it is possible to obtain, analytically or using an approximation method based on the inverse discrete Fourier transform (IDFT), the sampled impulse response, starting from the frequency-domain specifications. The problem is that the resulting impulse response is, in general, of infinite order and has to be truncated to obtain a practically usable digital filter. However, if the truncation is performed using a rectangular or circular window function, with an abrupt transition between the value equal to one in the zone where the impulse response has to be retained and equal to zero in the truncation region, quite a large error in the frequency response is obtained. Therefore, the goal is to be able to truncate the impulse response while introducing the minimum error in the frequency response. To this purpose, the obtained values of the sampled impulse response h(n1, n2) are multiplied by the samples w(n1, n2) of a window function, whose Fourier transform presents a suitable trade-off between the width of the main lobe and
DIGITAL FILTERS AND DATA COMPRESSION
the area under the side lobes; that is,

a(n₁, n₂) = h(n₁, n₂) w(n₁, n₂)    (11)
Many window functions have been defined to design filters in the 1D case. For 2D design, extensions to the 2D domain are in general used. In particular, a 2D window having circular symmetry properties can be defined starting from a 1D window w(t) as

w(x, y) = w(√(x² + y²))    (12)
Three useful window functions are: the Lanczos-extension window (Cappellini window 1), the Kaiser window, and the Weber-type approximation window (Cappellini window 2). The Lanczos-extension window w₁(t) has a 1D continuous form which is zero for |t| > τ, where m is a positive parameter controlling the correction performance, that is, the tradeoff between the obtained width of the transition band and the maximum error in the approximation, and τ is half of the window time extension. The Kaiser window has the form (Kaiser, 1966)

w_K(t) = I₀(ω_a √(τ² − t²)) / I₀(ω_a τ),    |t| ≤ τ    (14)

where I₀ is the modified Bessel function of the first kind and zero order and ω_a is a positive number, which controls the tradeoff between the width of the transition band and the maximum in-band error. The Weber-type approximation window is a close representation of a function which gives a minimum value of the uncertainty product in a modified form (Hilberg and Rothe, 1971). The obtained expression of the window w₂(t), as a third-order polynomial approximation, is the following (defined in the time interval 0-1.5)

w₂(t) = at³ + bt² + ct + d    (15)

with

0 ≤ t < 0.75:    a = 1.783724,  b = −3.604044,  c = 0.076450,  d = 2.243434
0.75 ≤ t ≤ 1.5:  a = −0.041165,  b = 1.502131,  c = −4.591678,  d = 3.651582
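As an illustration of the window method of Eqs. (11)-(12), the following sketch (in Python with NumPy; the function names, the grid sizes, and the choice of a Kaiser-type taper are illustrative assumptions, not the chapter's reference implementation) samples an ideal circular low-pass frequency response, takes its inverse DFT to get the truncated impulse response, and tapers it with a circularly symmetric window built as in Eq. (12):

```python
import numpy as np

def kaiser_1d(t, tau, beta):
    """Continuous Kaiser-type window, I0(beta*sqrt(1-(t/tau)^2))/I0(beta),
    taken as zero outside |t| <= tau (cf. Eq. (14) with beta = w_a * tau)."""
    t = np.asarray(t, dtype=float)
    w = np.zeros_like(t)
    inside = np.abs(t) <= tau
    w[inside] = np.i0(beta * np.sqrt(1.0 - (t[inside] / tau) ** 2)) / np.i0(beta)
    return w

def design_lowpass_2d(N=16, cutoff=0.25, beta=3.0):
    """Window-method design of an N x N circular low-pass FIR filter:
    ideal response -> IDFT -> centered impulse response -> 2D window."""
    # Ideal circular low-pass response sampled on an N x N frequency grid
    f = np.fft.fftfreq(N)                      # normalized frequencies
    F1, F2 = np.meshgrid(f, f, indexing="ij")
    H_ideal = (np.sqrt(F1**2 + F2**2) <= cutoff).astype(float)

    # Truncated impulse response via inverse DFT, centered in the array
    h = np.real(np.fft.ifft2(H_ideal))
    h = np.fft.fftshift(h)

    # Circularly symmetric window, w(x, y) = w(sqrt(x^2 + y^2)), Eq. (12)
    n = np.arange(N) - (N - 1) / 2.0
    X, Y = np.meshgrid(n, n, indexing="ij")
    w2d = kaiser_1d(np.sqrt(X**2 + Y**2), tau=N / 2.0, beta=beta)
    return h * w2d

h = design_lowpass_2d()
# Zero-padded frequency response: near 1 at DC, small in the far stop band
H = np.abs(np.fft.fft2(h, s=(64, 64)))
```

The tapering trades a wider transition band for much smaller ripples than abrupt truncation would give.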
V. CAPPELLINI
The windows w_K(t) and w₂(t) represent indeed near-optimum windows, giving high-efficiency digital filters; however, the simple window w₁(t) also gives good-efficiency filters. As a typical example, Fig. 1 shows the spatial-frequency response (one quadrant) of a circular 2D low-pass digital filter, using the window w₁(t) with m = 1.6, for N₁ = N₂ = 16 and ω_s/ω_c = 4 (ω_s = sampling angular spatial frequency; ω_c = cutoff angular spatial frequency). It is possible to design 2D FIR optimal digital filters by using the linear programming approach (Hu and Rabiner, 1972) and some modifications of the ascent algorithm (multiple exchange ascent algorithm) (Harris and Mersereau, 1977). The main problem with these methods is the computation time. This practically limits the maximum length of the impulse responses of the obtained filters to about 9 × 9 in the linear programming case and to about 15 × 15 in the multiple exchange ascent case, which is more efficient.
Fig. 1. Spatial-frequency response (one quadrant) of a circular 2D low-pass digital filter, using the window w₁(t).
The design problem can be made more tractable by reducing the number of variables in the linear programming through the frequency-sampling approach (Hu and Rabiner, 1972). In this case a grid of points in the frequency domain is chosen, and most of the frequency sample values are fixed through a direct translation of the filter specifications. A linear programming problem can be set up using constraint relations for the interpolated frequency response, where the variables are the frequency samples in the transition bands. As a typical example, Fig. 2 shows the spatial-frequency response (one quadrant) of a 2D digital filter, having near-circular symmetry, designed through this last procedure for N₁ = N₂ = 16 (Calzini et al., 1975). Another suboptimum design method is based on the transformation of the frequency response of a 1D filter into the frequency response of a 2D filter (McClellan, 1973). Let us consider, for instance, a linear-phase 1D digital filter with N odd: Its frequency response, dropping the linear phase term, can be
Fig. 2. Spatial-frequency response (one quadrant) of a 2D digital filter, designed by the frequency sampling procedure.
written in the form

H(e^{jω}) = Σ_{k=0}^{(N−1)/2} a(k) cos(kω)    (16)
where a(k) are the coefficients defining the frequency response. If a transformation of variables of the form

cos ω = A cos ω₁ + B cos ω₂ + C cos ω₁ cos ω₂ + D    (17)
is carried out in Eq. (16), using the properties of the Chebychev polynomials and of the trigonometric functions, it is possible to obtain a 2D function of the type

H(e^{jω₁}, e^{jω₂}) = Σ_{k₁=0}^{(N−1)/2} Σ_{k₂=0}^{(N−1)/2} a(k₁, k₂) cos(k₁ω₁) cos(k₂ω₂)    (18)
which is formally identical to the frequency response of a linear-phase 2D FIR digital filter [see Eq. (9)]. With the choice A = B = C = −D = ½, the mapping contours in 2D are approximately circular, at least for small values of ω₁ and ω₂. This design procedure can also be generalized to the use of transformation relations more complex than the simple relation of Eq. (17) (Mersereau et al., 1976), and some efficient implementation structures exist for the obtained filters (Mecklenbrauker and Mersereau, 1976).
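A minimal sketch of this transformation, using the identity cos(kω) = T_k(cos ω) (T_k = Chebyshev polynomial), so that the 2D response is obtained by evaluating the 1D coefficients a(k) at the mapped variable F(ω₁, ω₂); the toy coefficient vector is an illustrative assumption:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def mcclellan_response(a, w1, w2):
    """Map the 1D zero-phase response H(w) = sum_k a[k] cos(k w) to 2D
    via Eq. (17) with A = B = C = 1/2, D = -1/2.  Since
    cos(k w) = T_k(cos w), the 2D response is sum_k a[k] T_k(F(w1, w2))."""
    F = (0.5 * np.cos(w1) + 0.5 * np.cos(w2)
         + 0.5 * np.cos(w1) * np.cos(w2) - 0.5)
    return C.chebval(F, a)

# Toy 1D low-pass with H(0) = 1 and H(pi) = 0:  H(w) = 0.5 + 0.5 cos(w)
a = np.array([0.5, 0.5])
dc = mcclellan_response(a, 0.0, 0.0)          # pass-band corner (w1 = w2 = 0)
ny = mcclellan_response(a, np.pi, np.pi)      # stop-band corner (w1 = w2 = pi)
```

On either frequency axis F(ω, 0) = cos ω, so the 2D filter reduces exactly to the 1D prototype there, which is why the mapped contours are nearly circular near the origin.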
2. Design of Two-Dimensional IIR Digital Filters

In this case the coefficients {a(n₁, n₂)} and {b(n₁, n₂)} have to be chosen to approximate the desired frequency response with a stable recursive implementation [see Eqs. (4) and (5)]. Stability is indeed a specific and important problem of recursive structures, as already shown in Section II,B. To design 2D digital filters of the IIR type is a more difficult task than to design 1D IIR filters. In fact, the 1D techniques normally rely on the factorability of one-variable polynomials, which results in very simple algorithms for the stability test and for the stabilization of unstable filters; these techniques are unfortunately not directly generalizable to the 2D case (Cappellini et al., 1978). Two main classes of design methods have been defined. The first one is based on spectral transformations from 1D to 2D, and the second one on parameter optimization, using some filter structures, such as the second-order filter section cascade, where stability control is easily introduced into the approximation algorithm (Maria and Fahmy, 1974). A general design procedure (Ekstrom, 1980) uses a nonlinear optimization to minimize an error expression, in which the distance from an ideal frequency response and the distance from a stable implementation, obtained by means of
the cepstrum decomposition, are present. In this case, it is possible to obtain simultaneous control of the frequency domain approximation and of the stability of the filter, with a procedure which is indeed general but rather complex in implementation; further, some knowledge of general nonlinear approximation problems, when an acceptable error minimum is not automatically reached, is required. A proposed design technique (Shanks et al., 1972) consists of mapping 1D filters into 2D filters with a rotation operation. If a 1D continuous filter is given in its factored form, its transfer function can be viewed as that of a 2D filter that varies in one direction only

H(s₁, s₂) = Π_i (s₁ − q_i) / Π_i (s₁ − p_i)    (19)

where s₁, s₂ are the Laplace variables and q_i, p_i are the zeros and poles, respectively. A rotation of the (s₁, s₂) axes through an angle β can be performed by means of the transformations

s₁ = s̄₁ cos β + s̄₂ sin β,    s₂ = −s̄₁ sin β + s̄₂ cos β    (20)

A filter whose frequency response is now a function of s̄₁ and s̄₂ and corresponds to a rotation by an angle −β of Eq. (19) is obtained. Then a digital filter can be defined through the application of the bilinear z transform to both the continuous variables. The above approach, which in a direct implementation suffers from the warping effects of the bilinear z transform, has been used to obtain simple rotated blocks, which can be combined to define circularly symmetric recursive filters (Costa and Venetsanopoulos, 1974), where the conditions for the stability of the rotated sections have also been proved. Another method (Bernabo et al., 1976) is based on the transformation of the squared magnitude function of a 1D digital filter to the 2D domain, followed by a suitable decomposition of the resulting filter. With reference to Section II,A, given a first-quadrant filter (causal filter), it is possible to define the corresponding second-, third-, and fourth-quadrant filters, according to the relations

h₁(n₁, n₂) = h₂(n₁, −n₂) = h₃(−n₁, −n₂) = h₄(−n₁, n₂)    (21)
with transfer functions

H₁(z₁, z₂) = H₂(z₁, z₂⁻¹) = H₃(z₁⁻¹, z₂⁻¹) = H₄(z₁⁻¹, z₂)    (22)
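The zero-phase property implied by these reflection relations can be checked numerically; the sketch below (arbitrary random 3 × 3 filter and FFT grid size are illustrative assumptions) evaluates the four reflected responses on a DFT grid and verifies that their product is purely real and nonnegative:

```python
import numpy as np

def flip_freq(H, axis):
    """Evaluate H at negated frequencies along one axis of an FFT grid:
    frequency index k is mapped to (-k) mod N."""
    return np.roll(np.flip(H, axis), 1, axis)

rng = np.random.default_rng(0)
h1 = rng.standard_normal((3, 3))       # arbitrary real first-quadrant filter

N = 32
H1 = np.fft.fft2(h1, s=(N, N))         # H1(w1,  w2)
H2 = flip_freq(H1, 1)                  # H1(w1, -w2)   second-quadrant filter
H3 = flip_freq(flip_freq(H1, 0), 1)    # H1(-w1, -w2)  third-quadrant filter
H4 = flip_freq(H1, 0)                  # H1(-w1,  w2)  fourth-quadrant filter

# Cascade of the four filters: equals |H1|^2 |H1(w1,-w2)|^2, so it is a
# real, nonnegative (zero-phase) frequency response.
H_total = H1 * H2 * H3 * H4
```

Because h₁ is real, H₃ is the complex conjugate of H₁ and H₄ the conjugate of H₂, which is exactly why the product has zero phase.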
The cascade of the four filters is a zero-phase digital filter, whose frequency response is defined by the coefficients p(k₁, k₂) and q(k₁, k₂), determined
Fig. 3. Spatial-frequency response of a 2D IIR digital filter.
through the convolution of the coefficients of the four filters, and has the following form [see also Eq. (5)]

H(e^{jω₁}, e^{jω₂}) = [Σ_{k₁=0}^{N−1} Σ_{k₂=0}^{N−1} p(k₁, k₂) cos(k₁ω₁) cos(k₂ω₂)] / [Σ_{k₁=0}^{M−1} Σ_{k₂=0}^{M−1} q(k₁, k₂) cos(k₁ω₁) cos(k₂ω₂)]    (23)

Such a 2D frequency response can be obtained through the transformation of Eq. (17) applied to the numerator and denominator of the squared magnitude function of a 1D IIR digital filter. The obtained squared magnitude transfer function has to be factorized to get stable recursive digital filters, and the cepstrum decomposition [see Eq. (8)] can be used. In particular, to reduce the error connected with the truncation of the infinite cepstrum, windows (see Section II,C,1) can be used to reduce oscillations. Figure 3 shows an example of a 2D IIR digital filter, designed according to the above procedure: The squared magnitude function of a fourth-order Chebychev low-pass filter, having a 2% in-band ripple, a normalized cutoff space frequency 0.25, and a −20-dB space frequency 0.35, is used; the filter has a numerator and a denominator with 6 × 6 coefficients, obtained using a Kaiser window [Eq. (14)] with ω_a = 3. The filter maximum in-band ripple turns out to be 0.022, and the transition band, defined as the difference between the normalized space frequencies where the amplitude of the frequency response is, respectively, 90% and 10% of the in-band nominal value, is equal to 0.0937.

III. LOCAL SPACE OPERATORS
Local space operators can be considered as simpler 2D digital filters defined only in the space domain (see Section I), while those described above are more complex 2D digital filters defined and designed both in the space and frequency domains.
Indeed, local space operators have in general low complexity: small blocks of data (image samples) are processed in the space domain. The interest of this approach is connected to the resulting economic implementation and very fast processing capabilities, characteristics which are very important when large amounts of data (image samples) have to be processed or when real-time operations are to be performed. Further, local space operators of the nonlinear type can be defined, solving difficult image processing problems (such as noise reduction) in a fast and efficient way. Some very simple local space operators are the ones used to perform sampling or quantization variations. Through sample repetition or interpolation a zooming effect can be obtained, in order to change the sampling rate or to present particulars of the observed scenes. Through quantization variation, scale expansion or compression can be obtained to perform image enhancement or smoothing, respectively. In particular, a useful image enhancement can be accomplished by evaluating the amplitude distribution or histogram of the entire analyzed image and changing the scale in such a way as to shift the minimum amplitude to zero and the maximum amplitude to the full scale value (image stretching). The other local space operators can be classified into two main groups: operators performing image smoothing or enhancement (low-pass or high-pass filtering) and operators performing edge extraction (derivative filtering).
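The image stretching operation just described can be sketched as follows (the function name and the 8-bit full-scale value are illustrative assumptions):

```python
import numpy as np

def stretch(image, full_scale=255):
    """Linear grey-scale stretching: shift the minimum level to 0 and
    the maximum level to full scale, from the image histogram extremes."""
    lo, hi = int(image.min()), int(image.max())
    if hi == lo:                        # flat image: nothing to stretch
        return np.zeros_like(image, dtype=np.uint8)
    out = (image.astype(float) - lo) * (full_scale / (hi - lo))
    return np.round(out).astype(np.uint8)

# Low-contrast 2 x 2 example: levels 50..100 are expanded to 0..255
img = np.array([[50, 60], [70, 100]], dtype=np.uint8)
s = stretch(img)
```

Only the histogram extremes are needed, so the operator is a single fast pass over the image.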
A. Local Space Operators for Image Smoothing and Enhancement
One simple local operator of this type is the average operator, which can be used to reduce variations in the grey levels, for instance reducing noise components in the image. As an example, the arithmetic average of 9 samples (pixel values) can be performed: The obtained result is then substituted for the central pixel of the 3 × 3 sample matrix used, according to the relation

g(n₁, n₂) = (1/9) Σ_{k₁=−1}^{1} Σ_{k₂=−1}^{1} f(n₁ − k₁, n₂ − k₂)    (24)

where f(n₁, n₂) are the image samples. The operation is then iterated for all the points of the image, corresponding obviously to a sliding convolution (FIR filtering). Indeed, operators of this type are equivalent to low-pass filters, reducing variations between adjacent pixels of the image. Their importance, as already observed, lies in the fact that very fast implementations result, because no multiplications need to be performed. On the other hand, another class of operators exists to emphasize the differences between adjacent pixels. This is obviously the opposite of the
previously considered operation, and its purpose is to enhance the variations, outlining the contours of the objects in the image. Along this line, very simple separable algorithms can be defined on a row and/or column basis. As an example, let us consider the following simple procedure: the differences d between the corresponding pixels (same column) of two adjacent rows are computed; then, if d > 0 the maximum luminance value (white level) is substituted for the sample; if d < 0 the black value is chosen as the sample value; and if d = 0 an intermediate value is selected. Obviously, this procedure enhances the transitions between the rows, that is, along the columns of the image. However, the procedure can be repeated along the rows, considering the differences between the columns. Another approach is based on the computation of the differences between a pixel and the average value of the 8 adjacent values, according to the relation

g(n₁, n₂) = f(n₁, n₂) − (1/8) Σ_{(k₁,k₂)≠(0,0)} f(n₁ − k₁, n₂ − k₂)    (25)

with k₁, k₂ = −1, 0, 1. The above two operators correspond to high-pass filtering algorithms, of which the first is separable. Of increasing interest are local space operators performing nonlinear filtering (Cappellini, 1983). One interesting example is represented by the following nonlinear smoother of noisy images, which is especially useful before edge detection. By considering a block of 3 × 3 samples, the smoother is defined by the relation
f′(n₁, n₂) = (1/N_S) Σ_{f(n₁−k₁, n₂−k₂) ∈ S} f(n₁ − k₁, n₂ − k₂)    (26)

where S = {f : |f(n₁ − k₁, n₂ − k₂) − f(n₁, n₂)| < c₀}, N_S is the number of samples belonging to S, and k₁, k₂ = −1, 0, 1 with (k₁, k₂) ≠ (0, 0). By means of this smoother, the value of each pixel is replaced by the average of its neighborhood values, except those which have level differences greater than a fixed value (c₀) in absolute value. In this way, small-amplitude noise is removed, while no degradation results for edges and boundaries present in the processed image regions (a smoothing of the pixel values on either side of the edge is performed without any damage to the edge itself, as would happen with linear smoothing). Further, this nonlinear filtering procedure can be iterated to obtain a greater noise reduction; two iterations are in general sufficient. Another interesting example of nonlinear operators is represented by the following nonlinear filters, very useful in reducing noise spikes or scintillation pulses, essentially represented by high noise levels concentrated in one or two pixels (due to image sensors such as TV cameras, photodetectors, IR detectors, ultrasonic transducers, or to the analog-to-digital conversion).
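A sketch of the edge-preserving selective-average smoother of Eq. (26); as an implementation choice (an assumption, not stated in the text), the central pixel is included in the average so that the result is always defined, and border pixels are left unprocessed:

```python
import numpy as np

def selective_smooth(image, c0):
    """Edge-preserving smoother: each interior pixel is replaced by the
    average of the 3x3 neighbourhood values whose level differs from the
    central value by less than c0 (centre included, an assumption).
    Pixels across an edge are excluded, so edges are not blurred."""
    img = image.astype(float)
    out = img.copy()
    H, W = img.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            block = img[i-1:i+2, j-1:j+2]
            keep = np.abs(block - img[i, j]) < c0   # the set S of Eq. (26)
            out[i, j] = block[keep].mean()
    return out

# Dark region (10), bright region (100), one slightly noisy pixel (12):
img = np.full((5, 6), 10.0)
img[:, 3:] = 100.0
img[2, 1] = 12.0
out = selective_smooth(img, c0=5.0)
```

The noisy pixel is pulled toward its neighbours, while the 10/100 step edge is untouched because the values across it fall outside S.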
Let us evaluate the average value f_a(n₁, n₂) of the grey levels of the 8 pixels in a 3 × 3 block [excluding the central f(n₁, n₂)], as expressed by the double addition in Eq. (25): if all 8 pixels around f(n₁, n₂) have a grey level differing from f_a(n₁, n₂) by less than a suitable threshold T, and f(n₁, n₂) differs from f_a(n₁, n₂) by more than T + A (with A > 0), the value of f(n₁, n₂) is set equal to the average f_a(n₁, n₂); otherwise the central pixel maintains its original value f(n₁, n₂). By adjusting the two parameters T and A, noise spikes of different levels in more or less flat image regions can be eliminated. For a binary image, as obtained after edge detection, a nonlinear operator analogous to the preceding one can be defined through the following relations (Cappellini, 1983)

f′(n₁, n₂) = 1,  if f(n₁, n₂) = 0 and Σ_{(k₁,k₂)≠(0,0)} f(n₁ − k₁, n₂ − k₂) ≥ n_c
f′(n₁, n₂) = 0,  if f(n₁, n₂) = 1 and Σ_{(k₁,k₂)≠(0,0)} f(n₁ − k₁, n₂ − k₂) < n₀    (28)
f′(n₁, n₂) = f(n₁, n₂),  otherwise

with k₁, k₂ = −1, 0, 1.
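A sketch of the binary cleaning operator of Eq. (28), with n_c and n₀ as defined there (border pixels are left unprocessed, an implementation assumption); with n_c = 8 and n₀ = 1 it fills single-pixel holes and removes isolated pixels:

```python
import numpy as np

def clean_binary(f, nc=8, n0=1):
    """Binary noise cleaning, Eq. (28): a 0-pixel becomes 1 if at least
    nc of its 8 neighbours are 1; a 1-pixel becomes 0 if fewer than n0
    of its 8 neighbours are 1; all other pixels keep their value."""
    out = f.copy()
    H, W = f.shape
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            s = f[i-1:i+2, j-1:j+2].sum() - f[i, j]   # 8-neighbour sum
            if f[i, j] == 0 and s >= nc:
                out[i, j] = 1
            elif f[i, j] == 1 and s < n0:
                out[i, j] = 0
    return out

f = np.zeros((7, 7), dtype=int)
f[1:4, 1:4] = 1      # a solid 3x3 block of ones ...
f[2, 2] = 0          # ... with a one-pixel hole inside
f[5, 5] = 1          # an isolated noise pixel
out = clean_binary(f)
```

Note that the neighbour sums are always computed on the original image f, not on the partially updated output.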
where f′(n₁, n₂) is the updated value of the central pixel f(n₁, n₂), and in general n_c = 7 and n₀ = 2, or n_c = 8 and n₀ = 1 (the last case means that a central pixel of value 0 is changed to 1 if all the 8 near pixels are 1, and a central pixel of value 1 is changed to 0 if all the 8 near pixels are 0).

B. Edge Detectors
A very important class of 2D local space operators is represented by edge detectors, which extract edges or boundaries in the processed image. Most of these operators perform a kind of derivative filtering, evaluating the gradient through a test on a given image pixel and its close ones. Once the gradient has been estimated (magnitude and angle), it is compared with a threshold: If its value (in particular the magnitude) is greater than the threshold, the pixel is considered as part of an edge, whose direction is orthogonal to the gradient direction. These operators can be divided into two groups: To the first group belong the operators which evaluate the two orthogonal components of the gradient; the second group is based on gradient detection by means of a set of templates or masks of different orientation (Pratt, 1978; Cappellini, 1979b). In the first group, two orthogonal components D₁ and D₂ of the gradient in each pixel are evaluated, and then its magnitude is obtained by means of the
relation

D = √(D₁² + D₂²)    (29)

while its direction is evaluated as

θ = arctan(D₂/D₁)    (30)

The two components D₁ and D₂ can be evaluated by several methods, by using different weights on a given number of values near the tested pixel. By considering the (n₁, n₂) pixel, the simplest way to obtain the two components D₁ and D₂ corresponds to evaluating the differences between the adjacent pixels, that is,

D₁ = f(n₁, n₂) − f(n₁, n₂ − 1),    D₂ = f(n₁, n₂) − f(n₁ − 1, n₂)    (31)
This is the same as using the coefficient matrices or masks

D₁ = [ −1  1 ],    D₂ = [ −1
                           1 ]    (32)

in a nonrecursive digital filter implementation (performing the addition of the products of the values of the masks and the underlying values of the image pixels). Another method to find the gradient components D₁ and D₂ is the following (Roberts method)

D₁ = f(n₁, n₂ + 1) − f(n₁ + 1, n₂)
D₂ = f(n₁, n₂) − f(n₁ + 1, n₂ + 1)    (33)

with the corresponding masks

D₁ = [  0  1        D₂ = [ 1   0
       −1  0 ]             0  −1 ]    (34)

which gives two orthogonal components rotated π/4 with respect to the image axes. A more accurate estimation of the gradient can be obtained by using 3 × 3 coefficient matrices. Some of these matrices or masks are reported in the following.

(1) smoothed gradient

D₁ = [ −1  0  1        D₂ = [  1   1   1
       −1  0  1                0   0   0
       −1  0  1 ]             −1  −1  −1 ]    (35)
(2) Sobel gradient

D₁ = [ −1  0  1        D₂ = [  1   2   1
       −2  0  2                0   0   0
       −1  0  1 ]             −1  −2  −1 ]    (36)

(3) isotropic gradient

D₁ = [ −1   0  1        D₂ = [  1   √2   1
       −√2  0  √2               0    0   0
       −1   0  1 ]             −1  −√2  −1 ]    (37)
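The first-group gradient estimate of Eqs. (29)-(30), using the Sobel masks of Eq. (36), can be sketched as follows (the direct double loop over interior pixels is an illustrative implementation choice):

```python
import numpy as np

SOBEL_D1 = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]])          # differences along columns (n2)
SOBEL_D2 = SOBEL_D1.T                      # differences along rows (n1)

def gradient(image):
    """Gradient magnitude (Eq. 29) and direction (Eq. 30) at every
    interior pixel, via product sums of the Sobel masks of Eq. (36)."""
    img = image.astype(float)
    H, W = img.shape
    mag = np.zeros((H, W))
    ang = np.zeros((H, W))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            block = img[i-1:i+2, j-1:j+2]
            d1 = (block * SOBEL_D1).sum()
            d2 = (block * SOBEL_D2).sum()
            mag[i, j] = np.hypot(d1, d2)           # sqrt(D1^2 + D2^2)
            ang[i, j] = np.arctan2(d2, d1)         # arctan(D2/D1)
    return mag, ang

# A vertical step edge: large magnitude on the edge, zero in flat regions
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag, ang = gradient(img)
```

Thresholding `mag` then marks the edge pixels, with the edge direction orthogonal to `ang`.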
The expressions related to the previous masks are of the form

D₁ = f(n₁ − 1, n₂ + 1) + w f(n₁, n₂ + 1) + f(n₁ + 1, n₂ + 1)
     − f(n₁ − 1, n₂ − 1) − w f(n₁, n₂ − 1) − f(n₁ + 1, n₂ − 1)

D₂ = f(n₁ − 1, n₂ − 1) + w f(n₁ − 1, n₂) + f(n₁ − 1, n₂ + 1)
     − f(n₁ + 1, n₂ − 1) − w f(n₁ + 1, n₂) − f(n₁ + 1, n₂ + 1)    (38)

where the weight w assumes the values 1, 2, √2 for the three masks, respectively. The second group of operators for the estimation of the gradient, and then for the identification of edges, is based on gradient detection by means of a set of templates or masks of different orientation, searching sequentially at each pixel for the best match between the image subarea and the masks. Every mask of the set is superimposed on each pixel of the image, and the additions of products between the mask coefficients and the underlying pixels of the image are performed, just as in the previous group of local operators. The gradient is assumed to be detected by the mask which gives the greatest value of the addition of the products: Its direction is assumed according to the direction of the mask. Each set of masks is composed of eight different 3 × 3 masks, each of which is obtained from the previous one through a circular permutation of its elements around the central one. Thus, if we assume that the first mask of a given set is
[ A  B  C
  D  E  F
  G  H  I ]    (39)

the second and the third mask will be

[ D  A  B        [ G  D  A
  G  E  C          H  E  B
  H  I  F ]        I  F  C ]    (40)

and so on.
The sets of masks more frequently used are obtained through a permutation of the following masks (Cappellini, 1979b):

(1) Prewitt mask

[  1   1   1
   1  −2   1
  −1  −1  −1 ]    (41)

(2) Kirsch mask

[ −3  −3  5
  −3   0  5
  −3  −3  5 ]    (42)

(3) Robinson masks

[ 1  0  −1
  2  0  −2
  1  0  −1 ]    (43)
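The template-matching scheme with circular permutation can be sketched as follows; the particular orientation chosen for the base Kirsch mask is an assumption (any of the 8 rotations generates the same set), and the test block is illustrative:

```python
import numpy as np

def ring_permute(mask):
    """Rotate the 8 outer elements of a 3x3 mask one step clockwise
    around the central element, generating the next compass mask."""
    m = mask.copy()
    ring = [(0, 0), (0, 1), (0, 2), (1, 2),
            (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [mask[p] for p in ring]
    for p, v in zip(ring, vals[-1:] + vals[:-1]):
        m[p] = v
    return m

# One common orientation of the Kirsch mask (an assumed choice)
KIRSCH = np.array([[-3, -3, 5],
                   [-3,  0, 5],
                   [-3, -3, 5]])

def compass_gradient(block, base_mask):
    """Best match among the 8 permuted templates: returns the maximum
    product sum (gradient value) and the winning mask index (direction)."""
    masks = [base_mask]
    for _ in range(7):
        masks.append(ring_permute(masks[-1]))
    scores = [float((block * m).sum()) for m in masks]
    k = int(np.argmax(scores))
    return scores[k], k

block = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [0, 0, 1]], dtype=float)   # vertical edge subarea
score, direction = compass_gradient(block, KIRSCH)
```

Eight applications of the permutation return the original mask, which is why one base mask suffices to generate the whole set.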
There are also other approaches to edge detection, in particular to increase the processing speed or to process noisy images. An interesting example is represented by the following operator (Cappellini and Odorico, 1981). A block of 3 × 3 pixels is considered: To each of the 8 pixels surrounding the central one a binary value is given, according to the difference between the pixel value and the central value. In this way a binary image is obtained, having 256 possible configurations (taking the central pixel at a constant reference value of 0 or 1); these configurations are divided into 5 classes, having a decreasing probability that the central pixel is part of an edge or a contour. By setting a threshold separating these classes into two groups, it is finally estimated whether or not the central pixel is part of an edge. Interesting aspects of this operator are the following: adaptive criteria can be used for the above separation into two groups, depending on the noise characteristics in the processed image; and high-speed implementation is obtained, due to the fact that, to estimate whether a pixel is part of an edge once the binary values have been obtained, it is practically sufficient to compare the actual binary configuration with a memorized decision table.
IV. DATA COMPRESSION

Data compression is a digital operation or transformation performed to reduce the amount of redundant data (redundancy reduction) and is particularly useful for image processing, due to the large number of sampled
data representing each image. In the following, some general considerations on source coding and several data compression methods and techniques are presented, outlining the most important ones for image processing.
A. Source Coding
Let us consider an information source S emitting the information symbols s₁, s₂, …, s_q. These discrete symbols could be, for instance, the different grey levels of a sampled image. Let us assume that each symbol is emitted with a specific probability: p(sᵢ) is the probability of sᵢ (in practice, as is well known, this probability could be estimated as the ratio between the number of times that sᵢ appears and the overall number of observed or recorded symbols). The information quantity given by the symbol sᵢ is assumed to be proportional to the inverse of its probability p(sᵢ) through a logarithmic function; that is,

I(sᵢ) = k₀ log_a [1/p(sᵢ)]    (44)

where k₀ is a constant and a is the logarithmic base. In general the following values are assumed: k₀ = 1, a = 2. In this way, the information quantity connected to a symbol having probability ½ (equal probability in the case of only two symbols) turns out to be unity, that is, 1 bit (binary unit). Once having defined the information quantity of a symbol sᵢ, it is possible to define the mean information quantity or entropy of a source S = {s₁, s₂, …, s_q} of the above type, also called a zero-memory source due to the fact that the emission of one symbol is independent of the emission of the other near symbols. The entropy of the zero-memory source, H(S), is defined as (Shannon and Weaver, 1949)

H(S) = Σᵢ₌₁^q p(sᵢ) log₂ [1/p(sᵢ)]    (45)

which measures the average number of bits per symbol. It is easy to prove that the entropy function H(S) has its maximum value when the symbols sᵢ are equiprobable, that is, when p(sᵢ) = 1/q; in this case H(S) = log₂ q. One zero-memory source of particular interest is represented by the binary source, having only two symbols, s₁ = 0 and s₂ = 1. Denoting p = p(0), it results that p(1) = p̄ = 1 − p. The entropy function H₂(S) has the expression

H₂(S) = p log₂(1/p) + p̄ log₂(1/p̄)    (46)
This function, also denoted H(p), is often called the entropy function and has a maximum value H(p) = 1 for p = p̄ = ½. A slightly more complex information source is represented by the nth extension of the zero-memory source, in which blocks of n symbols sᵢ are considered (as the words representing the image samples). The symbols σᵢ of this extended source Sⁿ are qⁿ in number, and the entropy is

H(Sⁿ) = n H(S)    (47)
A more general and complex information source is represented by the Markov source, or source with memory, in which the emission of any symbol sᵢ occurs with a conditional probability p(sᵢ | s_{j1}, s_{j2}, …, s_{jm}); that is, this emission is now connected to a specific group of preceding symbols s_{j1}, s_{j2}, …, s_{jm}. The Markov source is said to be of order m (m denotes the memory extension). The above conditional probabilities are q^{m+1} in number, because there are q^m possible configurations or states of the source, corresponding to the different possible dispositions with repetition of the m symbols, and from each one of these states any one of the q source symbols can appear. The representation of Markov sources can be done through state diagrams, in which the different source states are reported as points and the possible connections among the states are outlined by conjunction lines (Abramson, 1963). Let us now consider source coding. By considering, for instance, a zero-memory source S = {s₁, s₂, …, s_q} with symbol probabilities p(s₁), p(s₂), …, p(s_q) as above, each symbol sᵢ can be transformed or mapped into a fixed sequence of lᵢ symbols taken from a finite alphabet X = {x₁, x₂, …, x_r}. This corresponds to encoding each symbol sᵢ into a code word Xᵢ, belonging to the set {X₁, X₂, …, X_q}; X is called the code alphabet. Source codes can be classified according to the code-word structure: codes using a variable-length encoding, in which the code words Xᵢ have a variable length; and codes with a fixed length of the code words.
Further, the source codes can be distinguished according to the following properties: (1) nonsingular codes, having all code words different; (2) codes which can be univocally decoded, for which the nth code extension is nonsingular for any finite value of n; (3) comma codes, having a specific symbol to separate a code word from the near ones; (4) instantaneous codes, for which any code word can be decoded into a source symbol without the necessity of considering or knowing the following symbols.
An important example of a source code is represented by the usual binary code, used to represent an image sample (quantized grey level) in digital form.
This binary code, having in general a constant word length, is a simple example of a nonsingular code which can be univocally decoded. A very important property of source codes is connected to the economy or compactness of representation of the information symbols. To make this concept precise, the average code word length can be defined as

L = Σᵢ₌₁^q p(sᵢ) lᵢ    (48)
This parameter L is very useful to measure the economy of each source code. For instance, a source code having an average word length L less than or equal to that of all other codes using the same code alphabet for the same information source is called a compact code. It is clear now that a fundamental problem of source coding consists in the search for and definition of compact codes for the different information sources. A general theoretical solution to the above problem is given by the first Information Theory Theorem, or first Shannon theorem for source coding (Shannon and Weaver, 1949; Shannon, 1959). Substantially, this theorem establishes a general bound for the average word length L in relation to the information source entropy. In simplified form this bound is expressed by the following relation

H_r(S) ≤ L    (49)

where H_r(S) is the source entropy, measured with a logarithm in base r. The bound can also be set as

H_r(S) ≤ L_n/n    (50)

where L_n represents the average word length of the nth extension of the information source, with the following limit

lim_{n→∞} (L_n/n) = H_r(S)    (51)
In general the price which is paid to reduce L or L_n/n is represented by the complexity of the source coding. The above fundamental theorem now also permits us to define in a rigorous way the efficiency of each source code [according to Eq. (49)]

η = H_r(S)/L    (52)

and the redundancy of the source code as

1 − η = [L − H_r(S)]/L    (53)

The above Eqs. (52) and (53) permit one to compare different codes for the same information source, selecting the one which has the higher efficiency (higher value of η) or less redundancy.
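The entropy of Eq. (45), the average length of Eq. (48), and the bound of Eq. (49) can be illustrated numerically; the sketch below uses a binary Huffman code as the compact code, with dyadic probabilities chosen so that the bound is met with equality (the length-tracking construction is an illustrative implementation, not the chapter's):

```python
import heapq
from itertools import count
from math import log2

def entropy(p):
    """Zero-memory source entropy, Eq. (45), in bits/symbol."""
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

def huffman_lengths(p):
    """Code-word lengths l_i of a binary Huffman code for probabilities p:
    repeatedly merge the two least probable groups; every symbol in a
    merged group gains one bit of code length."""
    tick = count()                     # tie-breaker so the heap never compares lists
    heap = [(pi, next(tick), [i]) for i, pi in enumerate(p)]
    heapq.heapify(heap)
    lengths = [0] * len(p)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tick), s1 + s2))
    return lengths

p = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(p)
L = sum(pi * li for pi, li in zip(p, lengths))   # average length, Eq. (48)
H = entropy(p)                                   # here L = H(S), efficiency 1
```

For non-dyadic probabilities the Huffman code still satisfies H(S) ≤ L < H(S) + 1, so the efficiency η = H(S)/L stays close to one.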
An example of compact or optimum codes is represented by the Huffman encoding procedure, in which the length lᵢ of each code word is inversely related to the value of the probability p(sᵢ). In this way the more probable, and therefore more frequent, words are encoded in shorter sequences compared with the less probable ones (Abramson, 1963).

B. Data Compression Methods and Techniques
Many methods and techniques of source coding or data compression have been studied, defined, and applied to image processing (for local processing, image transmission, or storage). According to the first Shannon theorem (see Section IV,A), the different data compression methods can be divided into two main classes:

(1) reversible methods, which permit, at least in principle, recovering through decompression (source decoding or inverse transformation) all the original source information;
(2) irreversible methods, which do not permit recovering all the original data and which therefore introduce some information loss or distortion.

The reversible methods obey and respect the source coding bound of Eq. (50), while the irreversible methods do not. Among the reversible methods, some important ones for image processing are:

(1) adaptive sampling and quantization
(2) prediction and interpolation (adaptive and nonadaptive)
(3) variable word-length coding (as the Huffman code)
(4) digital filtering (maintaining a useful spectrum extension)
(5) use of transformations (Fourier, Walsh-Hadamard, Haar, Karhunen-Loeve)

while some irreversible methods are:

(1) thresholding
(2) parameter extraction
(3) power spectrum
(4) digital filtering (with spectrum extension reduction)
(5) probability functions
To evaluate the performance of data compression methods, distortion functions and distortion measures can be used (Berger, 1971). In practice, three measurements are evaluated: the compression ratio C_r; the peak error e_p; and the
rms error e_r. The compression ratio C_r is defined as (Benelli et al., 1980)

C_r = L_s/L    (54)

where L_s is the mean source word length [in general equal to the entropy H(S)] and L is the mean word length after data compression. If the source messages (image samples) are represented in the standard binary form, the compression ratio C_r can be obtained as the ratio

C_r = N_s/N_c    (55)
where N_s represents the number of bits (0 and 1 values) of the source words (representing the image samples), and N_c represents the corresponding number of bits of the code words after compression. The peak error e_p represents the peak or maximum error resulting between the input message (image samples) and the corresponding reconstructed message after decoding or decompression, while the rms error e_r represents the root-mean-square error between the input and reconstructed messages, evaluated on a suitable block of data (a number of image samples, in particular the image samples overall). It is clear that in general data compression methods of the irreversible type give higher values of the compression ratio C_r, with the penalty however of higher e_p and e_r errors. Furthermore, reversible methods can also become irreversible if some parameters (such as threshold values, sampling frequency, amplitude tolerances, etc.) are changed in such a way as to obtain higher compression ratios (consequently introducing higher error values). In practice, the efficiency of the different data compression methods depends on the nature of the information source (different types of images), and each method can turn out to be more efficient for some information sources (a particular type of image) than for others. In this regard, it is very important to perform a suitable analysis of the information source (image to be processed) before the application of any data compression method:

(1) space analysis
(2) space-frequency analysis (amplitude and phase spectrum)
(3) statistical analysis (statistical average, amplitude distribution, autocorrelation, power spectral density)

1. Adaptive Sampling and Quantization
V. CAPPELLINI

Adaptive sampling methods are based on the use of the minimum sampling frequency for any processed image; the minimum sampling frequency, as is well known from the sampling theorem, is equal to double the maximum space frequency. Adaptivity can in particular be obtained by changing the sampling frequency for some sub-blocks of the processed image; for instance, sample sub-blocks of 32 x 32, 64 x 64, or 128 x 128 are considered, and the sampling frequency is locally adjusted in each image sub-block according to the local maximum space frequencies. The maximum space frequency in the image or image sub-blocks can be found in practice through the 2D FFT (fast Fourier transform). By means of the above procedures, a reduction of the number of samples is obtained in comparison with the use of a fixed sampling frequency. Also, the quantization level (number of bits representing the image samples) can be changed in an adaptive way from one image to another or between image sub-blocks, taking into account the actual grey-level range (for instance, low grey-level ranges require a lower number of bits). The grey-level range can easily be found through the amplitude distribution (grey-level histogram). In practice, the change of sampling frequency or quantization law from one image to another or between image sub-blocks can be identified through a special word expressing the particular sampling frequency or quantization level locally used.
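The adaptive bit allocation just described can be sketched as follows (a hedged illustration; the function names, the block size handling, and the test image are ours, not the chapter's):

```python
# Sketch of adaptive quantization: choose the number of bits for each image
# sub-block from its actual grey-level range (found from the local extrema,
# i.e., from the grey-level histogram support).
import numpy as np

def bits_for_block(block):
    """Minimum number of bits covering the grey-level range of a sub-block."""
    span = int(block.max()) - int(block.min()) + 1  # distinct levels needed
    return max(1, int(np.ceil(np.log2(span))))

def adaptive_bit_map(image, block_size=32):
    """Bit allocation for each block_size x block_size sub-block."""
    rows, cols = image.shape
    bit_map = {}
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            bit_map[(r, c)] = bits_for_block(
                image[r:r + block_size, c:c + block_size])
    return bit_map
```

A flat sub-block is thus assigned 1 bit, while a sub-block spanning the full 0-255 range keeps 8 bits; the per-block word lengths play the role of the special identification word mentioned above.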
2. Prediction and Interpolation

Data compression methods with prediction or interpolation are interesting due to their relatively simple structure and reasonably good efficiency (Benelli et al., 1980). Let us consider 1D algorithms, which can be applied to image processing line by line (for instance, processing the sequence of image samples along the rows). In prediction methods a priori knowledge of some previous samples (image samples along a row) is used, while in interpolation methods a priori knowledge of both previous and future samples is utilized. In both types of operation the most widely applied technique consists in comparing the predicted or interpolated sample with the actual sample: if the difference is less than a fixed error or amplitude tolerance, the actual sample is not maintained; otherwise the actual sample is maintained. Figure 4 shows a block diagram of a typical data compression system with prediction or interpolation: the nonredundant samples (i.e., the samples for which the prediction or interpolation fails) are fed into a buffer to be reorganized at constant space intervals with the space position identification (synchronization) necessary for the reconstruction of the original data from the compressed samples. The important role of the buffer is therefore to store the incoming samples remaining after the gate, so that they can be reorganized at a uniform sampling rate. In this way, while at the input of the buffer we have only bit compression, at its output we also have bandwidth compression (sampling rate reduction). In the following, we distinguish two average compression
FIG. 4. Block diagram of a data compression system with prediction or interpolation (comparator, predictor/interpolator, gate, buffer, and synchronization).
ratios, respectively, with and without including the bits added for the space identification or synchronization.

Prediction algorithms operate according to the following difference equation

f_p(n) = f(n - 1) + Δf(n - 1) + Δ²f(n - 1) + ... + Δ^N f(n - 1)    (56)
where f_p(n) is the predicted sample at space position nX (X being the space sampling interval along the image rows and columns); f(n - 1) is the sample value at the previous space position (n - 1)X; Δf(n - 1) = f(n - 1) - f(n - 2), ..., Δ^N f(n - 1) = Δ^{N-1} f(n - 1) - Δ^{N-1} f(n - 2). The value of N corresponds to the order of the prediction algorithm: with N = 0 we obtain the zero-order predictor (ZOP) and with N = 1 the first-order predictor (FOP). For the ZOP algorithm, several procedures can be followed: (1) ZOP with fixed aperture, in which the dynamic range of the data is divided into a set of fixed tolerance bands with a width of 2δ; if f(n - 1) is the last remaining sample, f(n) is not maintained when it lies in the same tolerance band; (2) ZOP with floating aperture, where a tolerance band ±δ is placed about the last remaining sample; if the following sample lies in this band, it is not maintained (in this case the next samples are compared again with the value of the last remaining sample ±δ and so on); (3) ZOP with offset aperture, in which the predicted sample is f_p(n) = f(n - 1) ± δ, where δ is a prefixed quantity and the sign + is used if the last remaining sample is out of tolerance in the positive direction and vice versa. In the FOP algorithm with floating aperture the first two samples are maintained and a straight line is drawn through them, placing an aperture ±δ
about that line: if the actual sample f(n) is within this aperture, it will not be maintained and the line will be extrapolated for a following space interval X, and so on. Data compression algorithms using interpolation differ from the corresponding ones with prediction in that with interpolation both previous and following samples are used to decide whether or not the actual sample is redundant. The most interesting algorithms, based on low-order interpolation, are the zero-order interpolator (ZOI) and the first-order interpolator (FOI). Adaptive data compression methods with prediction or interpolation represent an improvement on the preceding ones of the nonadaptive type, especially when the input images to be processed have high activity (variation of the space and frequency behavior from one image part to another). These methods can be divided into linear and nonlinear ones, depending on the specific procedure used for the adaptivity implementation. Of high interest is the adaptive-linear prediction (ALP) method, in which the predicted sample f_p(n) is evaluated by a linear weighting of M previous samples (Benelli et al., 1980)

f_p(n) = Σ_{k=1}^{M} b_k f(n - k)    (57)
where the b_k are suitable weighting coefficients. If the prediction error falls within a given threshold value γ, the actual sample is not maintained. If the considered process is a stationary Gaussian series with zero mean, the coefficients b_k can be determined in such a way as to minimize the mean-square prediction error given by

ε = Σ_{i=1}^{N} [ f(n - i) - Σ_{k=1}^{M} b_k(M, N) f(n - i - k) ]²    (58)

M being the number of preceding samples used for the prediction and N the number of samples which the predictor uses to learn the evolution of the image samples (line by line). The method turns out to be advantageous as long as the statistical characteristics of the image are maintained over suitably extended regions. In practice a counter can be used to measure the number of consecutive predictions affected by error; when this number exceeds a prefixed value T, a new set of M coefficients is computed and the algorithm proceeds with the new coefficients.

3. Differential Pulse-Code Modulation and Delta Modulation
The general block diagram of differential pulse-code modulation (DPCM) is shown in Fig. 5. A predicted sample f_p(n) is evaluated through a linear
FIG. 5. Block diagram of a DPCM system.
weighting of the M previous samples (Benelli et al., 1980)

f_p(n) = Σ_{k=1}^{M} a_k f(n - k)    (59)

The predicted samples can be obtained using any of the prediction algorithms, such as ZOP, FOP, ALP, etc. The difference e_n between the actual sample and the predicted one is quantized with quantization intervals of amplitude Δ and encoded in a code word of a given number of bits. If the image samples have high correlation and the weighting coefficients are correctly chosen, DPCM generally offers a higher efficiency with respect to the usual binary coding: with an equal number of bits, DPCM assures a higher signal-to-quantization-noise ratio (SNR), or with an equal SNR it requires a lower number of bits. Many adaptive DPCM methods (ADPCM) have also been studied and defined. In general the Δ value is varied, becoming smaller when the grey level is quiescent and vice versa, or the length of the prediction interval (M) is changed according to the signs and values of some previous differences between the predicted and the actual samples. A special data compression method, which can be considered as a DPCM with a 1-digit code, is represented by delta modulation (DM). In the DM method, the changes in the grey level between consecutive samples are substituted for the absolute grey-level values. These changes are represented in the form of binary pulses, whose sign (+ or -) depends on the sign of the amplitude change. Figure 6a shows the block diagram of a DM system, while Fig. 6b outlines the main wave forms at different points of the system. In the classical DM a single binary pulse is obtained at each sampling interval instead of a complete code word; the output pulse is in this case synchronous with the input word stream, yielding a constant compression ratio. Errors in the reconstructed data can, however, appear due to two effects: the approximation of the input wave form (grey-level variation) by a step function (granular or quantization noise); and quick variations of the input wave form, which cannot be followed with accuracy. Regarding this last
FIG. 6. DM system: (a) block diagram (pulse generator, comparator, quantizer, modulator, integrating network); (b) main wave forms at the different points of the system.
aspect, input variations cannot be followed for which the gradient of the sampled data exceeds the limit

g = Δr    (60)

where Δ is the change in amplitude of a DM pulse and r is the rate of the pulses (pulse space frequency for the processed image). The distortion due to this effect is also known as slope-overload distortion. Many studies have been developed to analyze the efficiency of the classical DM and to increase that efficiency (Benelli et al., 1980). A first method is based on changing the step amplitude according to the wave-form variations:
The step amplitude is increased when a given number N_0 of consecutive samples have the same binary value, and it is decreased in the contrary case (high information delta modulation, HIDM). Another interesting modification of the classical DM is basic asynchronous delta modulation (BADM), in which the sampling rate is increased during intervals of high activity (rapid dynamic-range variations) and decreased in lower-activity intervals. A special technique, called operational asynchronous delta modulation (OADM), avoids the errors corresponding to rapid amplitude variations in the following way: when the difference between the input and the reconstructed samples exceeds a prefixed tolerance value, the algorithm goes back m samples and inverts the Δ value, adjusting the sampling interval appropriately.

4. Use of Digital Filtering

The use of digital filtering (1D and 2D) is a very useful approach for data compression, for several reasons (see also Section V). First, if the useful information is concentrated in a limited frequency band, digital filtering can extract this band, in particular through low-pass or bandpass filtering; indeed, a lower number of data are required to represent the extracted limited band (in comparison with the overall spectral extension), and hence a data compression result is obtained. Further, low-pass digital filtering is useful as preprocessing before the application of particular data compression methods, because the smoothed data can be more efficiently compressed by specific compression algorithms. The joint use of digital filtering and data compression methods is presented in detail in Section V.

5. Use of Transformations
Orthogonal transformations, such as Fourier, Hadamard-Walsh, Haar, Karhunen-Loeve, etc., in particular in discrete or digital form, can be used for data compression, due to the fact that in general they give a more compact representation of image data. This means that the transformed data become defined and exist in a smaller region or domain than the original data; a lower number of significant transformed data then results. The 2D discrete Fourier transform (DFT) is defined as in Pratt (1978)

F(k_1, k_2) = (1/N) Σ_{n_1=0}^{N-1} Σ_{n_2=0}^{N-1} f(n_1, n_2) exp[-2πj(n_1 k_1 + n_2 k_2)/N]    (61)

while the inverse discrete Fourier transform (IDFT) is expressed as

f(n_1, n_2) = (1/N) Σ_{k_1=0}^{N-1} Σ_{k_2=0}^{N-1} F(k_1, k_2) exp[2πj(n_1 k_1 + n_2 k_2)/N]    (62)
where the k_1, k_2 indices correspond to frequencies (v_1 = k_1 Δv, v_2 = k_2 Δv, Δv being a constant space-frequency interval). With suitable symmetry properties, the discrete cosine transform and sine transform (DCT and DST, respectively) can be used. The DCT can be expressed in the following way

F(k_1, k_2) = (2/N) c(k_1) c(k_2) Σ_{n_1=0}^{N-1} Σ_{n_2=0}^{N-1} f(n_1, n_2) cos[π(2n_1 + 1)k_1/2N] cos[π(2n_2 + 1)k_2/2N]    (63)

with c(0) = 1/√2 and c(k) = 1 for k ≠ 0.
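As an illustrative sketch (not from the chapter; the function name and the separable-pass implementation are ours), the orthonormal 2D DCT can be computed as two 1D DCT-II passes:

```python
# Sketch: orthonormal 2D DCT-II of an N x N image, computed separably as
# B f B^T, where B is the 1D DCT-II basis matrix with c(0) = 1/sqrt(2).
import numpy as np

def dct2(f):
    """Orthonormal 2D DCT-II of a square N x N array."""
    n = f.shape[0]
    k = np.arange(n)
    # basis[k, m] = c(k) * sqrt(2/N) * cos(pi * (2m + 1) * k / (2N))
    basis = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0, :] /= np.sqrt(2.0)
    return basis @ f @ basis.T  # transform rows, then columns
```

Since the basis is orthonormal, the transform preserves energy (Parseval), and a constant N x N image maps to a single nonzero coefficient F(0, 0) = N times its mean value, which is the compaction property exploited below.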
The Hadamard transform is based on the properties of the Hadamard matrix (a square matrix with elements equal to ±1, having orthogonality between the rows and columns). A normalized Hadamard matrix of N x N size satisfies the relation

H Hᵀ = I    (64)

The orthonormal Hadamard matrix of lowest order is the 2 x 2 Hadamard matrix

H₂ = (1/√2) [ 1  1 ; 1  -1 ]    (65)

The above transform is also known in the literature as a Walsh transform (WT). A frequency interpretation of the above Hadamard matrix is indeed possible: the number of sign changes along any Hadamard matrix row, divided by 2, is called the sequency of the row. The rows of a Hadamard matrix of order N can also be considered as samples of rectangular functions having a subperiod equal to 1/N; these functions are called Walsh functions (Pratt, 1978). The Haar transform is based on the Haar matrix, which contains elements equal to ±1 and 0. One of the most efficient transforms is the Karhunen-Loeve transform (KLT), which can be defined in the following way

F(k_1, k_2) = Σ_{n_1} Σ_{n_2} f(n_1, n_2) A(n_1, n_2; k_1, k_2)    (66)
where the A(n_1, n_2; k_1, k_2) kernels satisfy the relation

Σ_{n_1'} Σ_{n_2'} C(n_1, n_2; n_1', n_2') A(n_1', n_2'; k_1, k_2) = λ(k_1, k_2) A(n_1, n_2; k_1, k_2)    (67)
where C(n_1, n_2; n_1', n_2') denotes the covariance function of the image data and λ(k_1, k_2) are constants (eigenvalues of the covariance function) for fixed values of k_1 and k_2. The above-considered discrete transforms can in general be evaluated in a fast form; the computation is divided into a sequence of subsequent computing steps in such a way that the results of the first computing steps
(partial results) can be utilized repetitively in subsequent steps. Efficient software packages are available for fast Fourier transforms (FFT) and fast Walsh transforms (FWT). For instance, the number of operations required to evaluate a 1D FFT becomes N log₂ N instead of N² for the DFT, and to evaluate a 1D FWT, N log₂ N instead of N² for the DWT (the difference between DFT-FFT and DWT-FWT is that for the first type of transform the operations are complex multiplications and additions, while for the second the operations are additions and subtractions). As already outlined above, the transformed data constitute a compact representation of the original image data; the number of significant transformed data is appreciably smaller than the number of original image data. For instance, an image constituted by a regular smoothed variation of grey levels will be represented by few Fourier components (few FFT data), while an image containing sudden variations in the grey level (nearly rectangular grey-level variations along the rows and columns) will be represented by few Walsh components (few FWT data). Further, the transformed data can be compressed more strongly by applying simple algorithms such as thresholding (for instance, setting to zero the values under a small threshold such as a few percent) or prediction-interpolation. The block diagram of a system applying this last approach, in particular to verify the efficiency of thresholding or the ZOP algorithm also through suitable displays, is shown in Fig. 7. Variable-word-length coding can also be used for different transformed data blocks. In practice, the transformed data are divided into several squares and a minimum word length (a bit number sufficient to represent the maximum absolute amplitude value in the square, plus 1 bit for the sign) is employed for each of them.
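The thresholding scheme just described can be sketched as follows (an illustration of ours, not the chapter's system of Fig. 7; the threshold is taken as a few percent of the largest coefficient magnitude, as suggested in the text):

```python
# Sketch of transform-domain compression: zero all 2D FFT coefficients whose
# magnitude is below a small fraction of the maximum magnitude, and
# reconstruct the image from the surviving coefficients.
import numpy as np

def fft_threshold_compress(image, rel_threshold=0.05):
    """Return the reconstruction and the fraction of coefficients kept."""
    spectrum = np.fft.fft2(image)
    mags = np.abs(spectrum)
    mask = mags >= rel_threshold * mags.max()  # keep only large coefficients
    reconstruction = np.real(np.fft.ifft2(spectrum * mask))
    return reconstruction, mask.mean()
```

For a smoothed, sine-wave-like image only a handful of Fourier coefficients survive, and the reconstruction is nearly exact; for images with sharp rectangular transitions a Walsh transform in place of the FFT would be the better choice, per the comparison later in this section.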
In the actual storing or transmission of the processed image data, an additional fixed-length word is inserted before each square of data, in order to specify the number of bits used to represent the square coefficients. With reference to the 2D FWT, if N = 2ⁿ, with n an integer, is the number of rows and columns of the sampled image and L is the number of grey levels, the maximum value which the transform will assume (corresponding to the addition of all the image samples) will be N²L. If q is the quantization value for the transformed data, the number of bits required to specify the word length used in a square will be

n_b = log₂[log₂(N²L/q + 1)]    (68)

where the logarithms are rounded up to the next highest integer. A modification of the above method for image data compression by means of the 2D FFT or FWT consists in applying the same procedure of variable-word-length coding of the transformed data in a limited number of transformed
FIG. 7. Block diagram of a data compression system using FFT with thresholding or a ZOP (floating aperture) algorithm, with inverse FFT reconstruction and rms-difference evaluation.
image subareas. In particular, no value need be maintained for those subareas where the sum of the absolute values of the transformed data is below a given threshold. From a computational viewpoint, we can further outline the following comparative considerations:

(1) the 2D FFT is in general more efficient for images having continuous regular or smoothed variations (sine-wave type) in the grey level;
(2) the 2D FWT is more efficient for images having sudden variations (of a rectangular type) in the grey level;
(3) the Karhunen-Loeve transform is the most efficient, but it is much more complex than the others, and no fast computing routine is available for its use.

V. JOINT USE OF TWO-DIMENSIONAL DIGITAL FILTERS AND DATA COMPRESSION

The above-considered 2D digital filtering and data compression operations can be joined together with significant advantages for digital image-processing efficiency. The two operations are in general performed in cascade, one after the other. Most images indeed require some kind of filtering to smooth the data or to perform space-frequency corrections and obtain enhancement, in general with the goal of reducing the noise or disturbances and of obtaining higher-quality images. Data compression is, as already outlined, a desirable operation after filtering to reduce the amount of data, which is becoming a tremendous problem for the practical use of images in many application areas. Further, the combination of the two operations can be attractive to increase the compression efficiency (see Section IV,B,4); smoothed data after low-pass filtering can surely be more efficiently compressed by the different data compression algorithms, because these now operate on 2D data having lower space-frequency values. In the following, some typical connections of the two digital operations are first presented; then a special new system, based on digital filtering and data reduction, is described for digital comparison and correlation of digital images having different space resolutions.

A. Some Typical Connections of the Two Digital Operations
Local space operators, 2D digital filters, and data compression can be connected in different useful ways to obtain some specific results on the processed image.
(1) Low-pass filtering (by means of a local space smoother or 2D digital filtering) and thresholding: high space-frequency components due to the noise and disturbances can be reduced, and hence a binary image can be obtained in which the more useful data are maintained (by selecting suitable threshold values).
(2) High-pass filtering (by means of local differential operators or 2D digital filtering) and thresholding: image enhancement is performed, giving higher contrast to image structures (grey-level variations), and hence a binary image is obtained from which some useful structures and patterns can be extracted (by selecting suitable threshold values).
(3) Low-pass filtering [as in (1)] and compression by means of prediction-interpolation, DPCM, DM, or variable-word-length coding: high space-frequency noise is reduced, and at the same time the smoothed data are more efficiently compressed.
(4) Low-pass filtering [as in (1)] followed by edge detection and hence spike elimination (by means of nonlinear operators as in Section III,A): high space-frequency noise is reduced (for instance, random noise), useful edges and boundaries are extracted (representing a compressed form of the image), and further, high-amplitude spikes or scintillation noise are eliminated.
(5) Low-pass filtering [as in (1)] and compression by means of digital transformations (2D FFT or FWT): the same result as in (3) can be obtained.
(6) Use of the 2D FFT to perform filtering and compression: once the 2D FFT has been evaluated, high space-frequency components can be discarded to obtain a filtering effect (smoothing with noise reduction); then the remaining 2D FFT components can be reduced with thresholding or variable-word-length coding (see Section IV,B,5).

B.
Processing System for Digital Comparison and Correlation of Images Having Different Space Resolution

In many application areas, such as remote sensing, biomedicine, and robotics, an important practical problem is represented by the availability of several images given by different sensors or equipment and regarding the same scene (land region, body organ, mechanical object, etc.). In general these images are taken from different viewpoints and have different space resolutions. Increasingly often a processing goal is to obtain integrated images or maps, where the data from the different images pertaining to the same observed scene are suitably correlated (for instance, through a simple addition, difference, or specific weighting of one image's data by the other images' data). To solve the above problems, there are two types of digital processing to be performed:
(1) geometrical corrections and rotations with a change of viewpoint (to refer the different images to the same viewpoint, or to the point at infinite distance and orthogonal position, producing orthoimages);
(2) space-resolution variations, in such a way as finally to have images with the same space resolution, which can then actually be integrated.

While for point (1) there are several geometrical transformations available using trigonometric functions, for point (2) there are few approximation procedures. In the following a rigorous method is presented, based on 2D digital filtering and data reduction (Cappellini et al., 1984a). Let us consider two images f_1(n_1, n_2) and f_2(n_1, n_2) in digital form, the first with high space-frequency resolution or definition and a space-sampling interval X_1, the second with lower space-frequency resolution and a sampling interval X_2 > X_1. Practically, if m = X_2/X_1, to one pixel of the image f_2(n_1, n_2) there correspond m² pixels of the image f_1(n_1, n_2). Several approaches can be used to obtain from the high-definition image f_1(n_1, n_2) an image g_1(n_1, n_2) having a space-sampling interval X_2 equal to that of the lower-definition image [g_1(n_1, n_2) is a compressed form of f_1(n_1, n_2)].
One simple technique corresponds to evaluating the g_1(n_1, n_2) data as the plain average of the f_1(n_1, n_2) data [see also Eq. (24)]; that is (with m odd),

g_1(n_1, n_2) = (1/m²) Σ_{k_1=-(m-1)/2}^{(m-1)/2} Σ_{k_2=-(m-1)/2}^{(m-1)/2} f_1(m n_1 + k_1, m n_2 + k_2)    (69)

The g_1(n_1, n_2) image obtained in this way represents a rough smoothed version of the original high-resolution image f_1(n_1, n_2). A second, more refined approach consists in evaluating the g_1(n_1, n_2) data as a weighted average of the f_1(n_1, n_2) data; that is,

g_1(n_1, n_2) = Σ_{k_1=-(m-1)/2}^{(m-1)/2} Σ_{k_2=-(m-1)/2}^{(m-1)/2} w_a(k_1, k_2) f_1(m n_1 + k_1, m n_2 + k_2)    (70)
The weights w_a(k_1, k_2) in the above relation define the form of the smoothing operation performed on the f_1(n_1, n_2) data; it is easy to verify, for instance, that with w_a(k_1, k_2) = 1/m² Eq. (70) is equivalent to Eq. (69). It can appear reasonable, in general, to give greater weight to the central pixels of the m x m subimage of the f_1(n_1, n_2) image with respect to the peripheral ones. For this purpose one solution corresponds to using a linear weighting resulting in a conical (or pyramidal) function in the 2D domain; another solution consists in using a Gaussian weighting function (the 2D function can easily be obtained through the circular rotation of a 1D Gaussian function). The above-described techniques perform a smoothing operation on the high-resolution image f_1(n_1, n_2) in a heuristic way to obtain the image
g_1(n_1, n_2) to be compared and correlated with the lower-resolution image f_2(n_1, n_2). A more rigorous and precise system is based on the use of a 2D digital filter of the low-pass type with circular symmetry (see Section II). The precise steps of this system are the following:

(1) to perform a low-pass, circular-symmetry, 2D digital filtering with a cutoff frequency ω_c/2π = 1/2X_2, obtaining the filtered image g_1(n_1, n_2);
(2) to reduce or "decimate" the obtained data g_1(n_1, n_2) down to a space-sampling interval equal to X_2, that is, to obtain the image (in digital form) g_2(n_1, n_2) = g_1(n_1 X_2, n_2 X_2).

The above digital operations indeed remove from the 2D spectrum of the high-resolution image f_1(n_1, n_2) the space-frequency components greater than ω_c/2π, giving therefore an image which is directly comparable, as regards the space resolution, to the lower-resolution image f_2(n_1, n_2). The two digital images g_2(n_1, n_2) and f_2(n_1, n_2) now turn out to have grey-level variations in the different space directions with the same maximum space frequency. It is interesting to observe that the 2D digital filter used in this last rigorous system includes, as particular cases, the approaches defined by Eqs. (69) and (70). The first is obtained by setting the coefficients of the 2D digital filter a(k_1, k_2) = 1/m² [see Eqs. (2) and (9) for the FIR case]; the second results by setting a(k_1, k_2) = w_a(k_1, k_2).
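The simplest special case of the filter-and-decimate scheme, the plain m x m block average of Eq. (69), can be sketched as follows (an illustration of ours, using a mean filter as a stand-in for the circularly symmetric low-pass filter):

```python
# Sketch: smooth the high-resolution image with an m x m mean filter and
# decimate by m in both directions in a single step (block averaging),
# reproducing the Eq. (69) special case of the rigorous system.
import numpy as np

def lowpass_and_decimate(f1, m):
    """Block-average the high-resolution image f1 over m x m tiles."""
    rows, cols = f1.shape
    rows -= rows % m  # drop edge pixels so tiles fit exactly
    cols -= cols % m
    tiles = f1[:rows, :cols].reshape(rows // m, m, cols // m, m)
    return tiles.mean(axis=(1, 3))  # one output pixel per m x m tile
```

A circularly symmetric low-pass FIR filter with cutoff 1/2X_2 would replace the mean kernel in the rigorous system; the decimation step is the same.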
VI. APPLICATIONS

In the following, some examples of applications of 2D digital filters, local space operators, and data compression to such important fields as communications, remote sensing, biomedicine, and robotics are presented.

A. Applications to Communications
In a communications system a message is transmitted from one place to another through different physical communication channels (lines, cables, satellite links, optical fibers, etc.). If the transmitted message is a signal s(t) (Fig. 8), the receiver produces an estimate s_e(t) of the original source message, trying to reduce the noise and degradation introduced by the channel. Often many messages of the same or different types have to be sent in parallel to utilize the communications medium more efficiently. Multiplex communication systems are used for this purpose. Two important types of multiplex systems are frequency division multiplex
FIG. 8. General block diagram of a communication system (information source, modulator or encoder, transmitter, channel, receiver; the received signal yields the estimate s_e(t) of the information source message).
(FDM) and time division multiplex (TDM). In the first the single messages are set in adjacent frequency bands, while in the second the messages are organized in subsequent time intervals, in general by sending one sample of each message after the other in a cycle or frame and sending one frame after the other. By representing each sample in the TDM system in digital form (a word of a given number of bits), a pulse-code-modulation (PCM) multiplex system is obtained. Digital communications of the PCM type have expanded recently in an exponential manner, due to the main aspects outlined in the Introduction (Section I). In digital communications it is easy to apply digital operations such as those previously described (digital filtering, local space operators, data compression). Considering the transmission of images, these are in general converted into a video signal through a scanning procedure, and then this signal is sampled and set in digital form. The digital operations described in the previous sections can be usefully applied to this digital signal both in transmission and in reception, according to the general block diagram in Fig. 9. In transmission, 1D digital filtering can be performed to reduce high-frequency noise components and exactly define the bandwidth. 2D digital filtering and local space operators can also be applied, if a suitable memory or buffer is available, processing the image data in such a way as to reduce high space-frequency components (image smoothing) or to obtain image enhancement. Then data compression can be performed to maintain the more significant data. By means of filtering and compression, a bandwidth reduction or bandwidth compression is achieved, which is a very important result to increase the efficiency of the digital communication system (the same image data can be transmitted by using a smaller bandwidth, or other data can be transmitted with the image by using the same bandwidth). A channel coder can be added after compression to protect the remaining important data against channel noise and disturbances (Benelli et al., 1977, 1984). In reception, after any channel decoder for error detection and correction, the image data are reconstructed (decompression), also utilizing synchronization data (given by a sequence detector), and then digital filtering is performed (1D and 2D type) to reduce remaining channel noise or to
FIG. 9. Block diagram of a digital communication system, using digital filtering, data compression, and error-control coding operations (A-D converter, digital filter, data compression, coder, sequence generator, and transmitter; receiver, decoder, data decompression, sequence detector, and processor).
increase image quality. With reference to this last filtering processing, 1D and 2D digital filters can indeed be applied in the following ways:

(1) 1D digital filtering can reduce channel noise and degradation (such as multipath and Doppler effects) through channel equalization (fixed and adaptive type) and matched filtering (Cappellini et al., 1978);
(2) 2D digital filtering can reduce space-frequency components due to noise and perform image restoration (inverse filtering) and enhancement.
Let us now consider in more detail the digital transmission of time-varying images (television) and of time-fixed or static images. In the first case, for good reproduction of movement, image sequences are required at a sufficient rate; in the European TV standard, for instance, 25 images/s are used. By using 625 lines/image and 8 bits/sample, transmission rates of 50 Mbits/s are obtained. To reduce this high value, the analysis of two subsequent images can be performed in such a way as to take into account, for instance, only the actually moved parts in the transition from one image to the subsequent one (interframe techniques). DPCM techniques (see Section IV,B,3) can be used for this purpose; through variable-word-length coding, mean word lengths of 2-2.5 bits/sample are obtained, reducing the transmission rate to 10-20 Mbits/s. A suitable encoding, as with prediction-interpolation or, more efficiently, with digital transformations (see Section IV,B), can also be performed within each single image (intraframe techniques). Combining interframe and intraframe techniques, lower bit rates (2-10 Mbits/s) are obtained, at the expense of higher complexity and cost. An interesting approach to reduce the redundancy in TV images, by means of inter-intraframe coding, is based on movement compensation: the coding consists essentially in determining for each pixel the prediction model (spatial, temporal) and transmitting, when necessary, the quantized difference and the prediction model changes (Brofferio et al., 1975). In the case of videotelephone or teleconference systems, due to the lower number of image points in movement from one image to another, and by using DPCM techniques with variable-word-length coding, 0.9-1 bits/sample are sufficient. If information about moving image objects or parts is suitably used, values of 0.4-0.5 bits/sample are reached.
For instance, for an object translation, the shift and direction values can be sent to the receiver, which can then reconstruct the grey levels of the moved image parts. Transmission rates on the order of a few Mbits/s are thus obtained. In the second case of static images, two practical situations can be considered: transmission of written documents and sheets, and transmission of photos (telephoto). Regarding documents and sheets, let us consider a standard A4 sheet (29.6 x 20.8 cm). To describe the written information and transmit it to the
V. CAPPELLINI
receiver (as in facsimile), an efficient method is the use of an optical reader of the written characters. If the sheet contains 30 lines, each with 70 characters, there are 2100 characters in a sheet. If, for instance, 7 bits are utilized for the transmission of each character, 14,700 bits are required for the representation of the sheet. If, however, variable-word-length coding is used, taking into account the character probabilities (see Section IV,A), a lower number of bits is required (as an example, for the English language, a mean word length equal to the source entropy, 4.2 bits/character, results, with about 8800 bits required to represent the sheet). A simpler and more economical system can scan the sheet (assumed to be black characters on a white background) with 1200 lines (at least 4 lines/mm are required) and represent each line by 800 equidistant space samples (points). Each sample being a binary value, 960,000 bits are required to represent the whole sheet (an amount much greater than the previous one). A somewhat more efficient coding is obtained through the representation of the lengths of black or white point sequences by means of variable-word-length coding: mean values of 0.3-0.4 bits/point are sufficient. By using the variations of the above sequences line by line (comparing one line with the following one) and suitable encoding of these variations, mean values of 0.1-0.2 bits/point are obtained (with 96,000-180,000 bits required to represent the whole sheet). Regarding the representation and transmission of black and white photos, the number of data required is much higher. Let us consider a telephoto of 13 x 18 cm. To have sufficient space resolution, 8.6 lines/mm are required, and therefore 1500 lines with 1100 samples/line result.
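The run-length idea described above for black-on-white document scanning can be sketched as follows. This is a minimal encoder/decoder pair for illustration only; the variable-word-length coding of the run lengths, which yields the quoted 0.3-0.4 bits/point, is not shown.

```python
def run_lengths(line):
    """Encode a binary scan line (0 = white, 1 = black) as the value of the
    first point plus the lengths of the alternating runs."""
    first, runs, current, count = line[0], [], line[0], 0
    for bit in line:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current, count = bit, 1
    runs.append(count)
    return first, runs

def expand(first, runs):
    """Decode the run lengths back to the original scan line."""
    out, value = [], first
    for r in runs:
        out.extend([value] * r)
        value ^= 1  # runs alternate between white and black
    return out

# An 800-point scan line that is mostly white, as in the text's example.
line = [0] * 300 + [1] * 20 + [0] * 480
first, runs = run_lengths(line)
```

A mostly white line collapses to a handful of run lengths, which is why coding the lengths is far cheaper than coding the points.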
The overall amount of data, representing the grey level of each sample by 7 bits, is therefore 11.5 Mbits (using transmission at 4800 bits/s, 40 minutes are required for the transmission of a complete photo). By means of data-compression techniques such as DPCM with variable-word-length coding (see Section IV,B,3), a mean word length of 2-3 bits/sample is obtained, and by means of digital transformations (see Section IV,B,5) a value of 1-2 bits/sample and less can be reached. With reference to this last approach, Fig. 10 shows an example of the application of the discrete cosine transform (DCT, implemented in a fast way, FCT) to a typical photograph of Florence (the Old Bridge). Thresholding compression of the transformed data is used, representing image square sub-blocks of N x N data (transformed data below a threshold are neglected). Fig. 10a shows the original digitized photo, and Fig. 10b the reconstruction in the case N = 16 (mean word length = 0.6 bits/sample, e_p = 16.8674, and e_r = 2.13%). Finally, it is important to observe that all the above data-compression and data-rate-reduction results can be improved if 2D digital filtering or local
FIG. 10. Example of the application of the fast cosine transform (FCT) to perform data compression on a photograph of the Old Bridge in Florence: (a) original digitized photo; (b) reconstructed photo (with 0.6 bits/sample).
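A minimal sketch of the block-DCT thresholding compression illustrated in Fig. 10, written with NumPy. The block size, the threshold, and the orthonormal DCT-II construction are illustrative choices, not the exact parameters of the original experiment.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def fct_threshold(image, nc=16, thr=10.0):
    """Block-DCT compression: transform each nc x nc sub-block, discard
    coefficients below the threshold, and reconstruct.  Image dimensions are
    assumed to be multiples of nc.  Returns the reconstruction and the
    fraction of coefficients kept."""
    c = dct_matrix(nc)
    out = np.empty_like(image, dtype=float)
    kept = 0
    for i in range(0, image.shape[0], nc):
        for j in range(0, image.shape[1], nc):
            block = image[i:i + nc, j:j + nc].astype(float)
            coef = c @ block @ c.T
            coef[np.abs(coef) < thr] = 0.0      # thresholding compression
            kept += np.count_nonzero(coef)
            out[i:i + nc, j:j + nc] = c.T @ coef @ c
    return out, kept / image.size

rng = np.random.default_rng(0)
img = rng.random((32, 32))
rec, frac = fct_threshold(img, nc=16, thr=0.0)  # thr = 0: lossless round trip
```

Because the DCT matrix is orthonormal, a zero threshold reproduces the image exactly; raising the threshold trades reconstruction error for fewer retained coefficients.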
space processing (such as low-pass filtering) is performed before the application of data-compression techniques.

B. Applications to Remote Sensing
Many remote sensing images and maps are currently collected by platforms aboard aircraft and satellites. Indeed, passive remote sensing systems, such as optical cameras, multispectral scanners (MSS), and microwave radiometers, and active remote sensing systems, such as side-looking radar (SLR) and laser radar (lidar), give an impressive amount of images and data. The images, maps, and data given by remote sensing systems in general need to be processed to improve their quality (geometric and sensor corrections, noise reduction, enhancement, . . .) and to obtain final useful results (extraction of specific regions and land-sea areas, as for agriculture investigation or water-resource monitoring). 2D digital filters, local space operators, and data compression are indeed very useful digital operations for achieving the above goals. 2D digital filters or local space operators can be applied as a preprocessing operation to smooth the image data (by means of low-pass filtering), to perform a space-frequency correction, or to obtain enhancement (by means of high-pass or bandpass filtering), also extracting edges and boundaries. In particular, after enhancement better-quality images can in general be obtained, and through edge extraction different earth regions can be recognized and classified, with easy evaluation of the corresponding areas. Data compression can be applied, in general after some type of filtering, to reduce the amount of data, which is becoming a tremendous problem for the practical use of satellite data and aircraft photos of large earth areas (Cappellini, 1980). In the following some typical processing examples are given regarding the application of the above digital operations. Figure 11 shows an example of the application of filtering and edge extraction to an aircraft photo (a region south of Florence). In Fig. 11a the digitized image is shown, while in Fig.
11b the result of processing by means of the nonlinear filtering operator defined by Eq. (26) (nonlinear smoother), followed by the isotropic-gradient edge detector [Eq. (37)], is shown. As it appears, the main different ground regions are isolated; adding the grey-level information, three classes can easily be obtained: forest (black and high-intensity grey levels), vineyards and olive trees (medium-intensity grey levels), and other ground regions. Another very simple processing example is shown in Fig. 12. A LANDSAT-C image of the Tyrrhenian coast (at the bottom the Arno River appears) is processed first through grey-level expansion (stretching, see Section
DIGITAL FILTERS AND DATA COMPRESSION
FIG. 11. Example of the application of nonlinear smoothing and edge extraction to an aircraft photo (region south of Florence): (a) digitized photo; (b) processed result.
III) and then through thresholding. Figure 12a shows the original image, and
Fig. 12b gives the final result, which practically corresponds to an estimate of the water resources in the analyzed region. Figure 13 shows an example of the application of a 2D FIR digital filter of the high-pass type [Eq. (2)] to a LANDSAT-C image. At the right is a part of the original image (North Africa), while at the left the filtered image appears. As is clear, a good enhancement effect results, which can be very useful for extracting some significant regions; further, through thresholding as in Fig. 12b, final estimates regarding these regions could be obtained (Cappellini, 1984). Figure 14 shows an example of the application of data compression with a ZOP algorithm and floating tolerance (see Section IV,B,2) to an ERTS-1 image. At the left the original image is shown, and at the right the reconstructed one after compression (an average compression ratio C_r = 1.56 is obtained). Figure 15 gives another example of the application of data compression on the same ERTS-1 image, using the 2D FWT with variable-word-length coding
FIG. 12. Example of the application of stretching and thresholding to a LANDSAT-C image (Tyrrhenian coast with the Arno River at the bottom): (a) original image; (b) processed result.
of transformed data blocks (4 x 4). At the left is the original image, in the middle the reconstructed one with q/N = 4% [see Eq. (68)], corresponding to a compression ratio C_r = 2.14, and at the right the reconstructed one with q/N = 8% and C_r = 3.84. Higher compression ratios can be obtained by increasing q/N (Cappellini et al., 1976). A practical application of the processing system for digital comparison and correlation of images having different space resolution, presented in Section V,B, is shown in Figs. 16-20. Figure 16 shows a SEASAT-SAR image (256 x 256) of a coastal region in South Italy (Sele River in Campania), which represents the high-resolution image f_1(n_1, n_2). Figure 17 gives a LANDSAT-C image (256 x 256) of the same region, representing the lower-resolution one f_2(n_1, n_2). Figure 18 shows the result of 2D FIR digital filtering (low-pass type, circular symmetry) of the SEASAT image. Figure 19 shows the two final images obtained for comparison and correlation. At the left is
FIG. 13. Example of the application of a 2D FIR digital filter of the high-pass type to a LANDSAT-C image (North Africa): at the right is a part of the original image, while at the left the filtered image appears.
the LANDSAT image (a part of the original image suitably rotated to be registered with the SEASAT image); at the right is the SEASAT filtered image already decimated [corresponding to g_2(n_1, n_2)]. Figure 20 finally gives a simple integration test: the addition of the two images in Fig. 19 (Cappellini et al., 1984a).
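The resolution-matching step of Figs. 18-19 (low-pass filtering of the high-resolution SEASAT image, followed by decimation so that it becomes comparable pixel-for-pixel with the lower-resolution LANDSAT image) might be sketched as follows. The separable 1-2-1 kernel and the decimation factor are illustrative stand-ins, not the circularly symmetric 2D FIR filter actually used.

```python
import numpy as np

def binomial_lowpass(img):
    """Separable 1-2-1 low-pass filter (a crude stand-in for the circularly
    symmetric 2D FIR low-pass filter mentioned in the text)."""
    k = np.array([0.25, 0.5, 0.25])
    p = np.pad(img, 1, mode='edge')
    tmp = k[0] * p[:, :-2] + k[1] * p[:, 1:-1] + k[2] * p[:, 2:]   # columns
    return k[0] * tmp[:-2, :] + k[1] * tmp[1:-1, :] + k[2] * tmp[2:, :]  # rows

def match_resolution(high_res, factor):
    """Low-pass filter (to limit aliasing), then decimate, producing an image
    g2(n1, n2) comparable with the lower-resolution sensor."""
    return binomial_lowpass(high_res)[::factor, ::factor]

sar = np.random.default_rng(2).random((256, 256))  # stand-in for the SEASAT image
g2 = match_resolution(sar, factor=4)               # assumed decimation factor
```

Filtering before decimation is the standard precaution: dropping samples without band-limiting first would alias high-frequency detail into the coarse grid.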
C. Applications to Biomedicine

With recent rapid technological evolution, much equipment has been introduced in biomedicine to produce different types of biomedical images or
FIG. 14. Example of the application of data compression with a ZOP algorithm and floating tolerance to an ERTS-1 image: at the left the original image is shown, while at the right the reconstructed one is shown (C_r = 1.56).
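The ZOP (zero-order predictor) compression with floating tolerance used for Fig. 14 can be sketched as follows: a sample is transmitted only when it leaves a tolerance band centred on the last transmitted value, and the receiver holds that value until the next transmission. This is a minimal illustration, not the exact algorithm of Section IV,B,2.

```python
def zop_compress(samples, tol):
    """Zero-order predictor with floating tolerance: transmit a sample only
    when it leaves [last - tol, last + tol]; the band then re-centres on the
    newly transmitted value (hence "floating")."""
    kept = [(0, samples[0])]
    last = samples[0]
    for i, s in enumerate(samples[1:], start=1):
        if abs(s - last) > tol:
            kept.append((i, s))
            last = s
    return kept

def zop_reconstruct(kept, n):
    """Receiver side: hold each transmitted value until the next one arrives."""
    out, pos = [], 0
    for i in range(n):
        if pos + 1 < len(kept) and i >= kept[pos + 1][0]:
            pos += 1
        out.append(kept[pos][1])
    return out

data = [0.0, 0.1, 0.05, 1.0, 1.1, 1.05, 0.0, 0.02]
kept = zop_compress(data, tol=0.5)
ratio = len(data) / len(kept)            # average compression ratio
rec = zop_reconstruct(kept, len(data))
```

By construction the reconstruction error never exceeds the tolerance, and slowly varying data compress well because long runs collapse to a single transmitted value.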
bioimages. Some examples of biomedical branches giving bioimages are: radiography (x ray), thermography (IR), scintigraphy (nuclear medicine), echography (ultrasonics), electrocardiography (ECG maps), electroencephalography (EEG maps), and computer tomography (CT). Other bioimages of increasing interest are nuclear-magnetic-resonance (NMR) images and microwave-radiometry images. The above bioimages can be processed by 2D digital filters, local space operators, and data compression to obtain several useful results. By means of low-pass filtering, a smoothing of the bioimage is obtained, reducing high space-frequency noise components. By using high-pass or bandpass filtering, enhancement effects result, outlining and extracting useful data and patterns not otherwise clearly recognized. By means of inverse filtering (restoration), noisy bioimages can be processed to obtain higher-quality images for clinical diagnosis and interpretation. Data-compression techniques can reduce the amount of data, which is increasing in an impressive way, solving storage problems (archival systems) and increasing the efficiency of bioimage
FIG. 15. Example of the application of data compression using a 2D FWT and variable-word-length coding of transformed data blocks (4 x 4) to an ERTS-1 image: at the left, the original image; in the middle, the reconstructed image (C_r = 2.14); at the right, the reconstructed image (C_r = 3.84).
transmission from one place to another (telemedicine). In the following some typical examples are reported. A first example regards ECG or EEG maps. A special hardware system was recently built in Florence, containing a fast digital processor performing up to 256 1D digital filtering operations on ECG or EEG signals (Cappellini and Emiliani, 1983). In particular, 16 signals (representing a 4 x 4 micromap) can be processed in real time, and the filtered data, for instance corresponding to the α components in an EEG, can then be processed by computer systems performing 2D digital filtering or other compression operations. Indeed, the 1D digital filtering performed on the 4 x 4 signals by the hardware system represents an interesting example of parallel processing of 2D data, extracting useful frequency components (and in this way performing also a sort of data compression). Figure 21 shows a standard chart recording of parallel filtering
FIG. 16. A SEASAT-SAR image of a region in South Italy.
of four EEG signals (2 x 2 micromap), obtaining three outputs for each input signal in the frequency bands 8-10, 10-12, and 12-14 Hz. Figure 22 shows an example of processing in infrared thermography. In Fig. 22a the original digitized image is given; in 22b the result of grey-level expansion (stretching) is reported, in conjunction with edge detection applied on a limited range of grey levels, outlining the venous traces (Prosperi, 1983). Figure 23 shows an example of the application of a 2D FIR digital filter of the bandpass type to a nuclear medicine image; at the top is the original image, and at the bottom the result of processing. Due to the special enhancement effect, a cyst now appears at the left of the image (the small black region). Figure 24 shows another example of processing, a computer tomography image. In 24a the original image is given, in 24b the result of linear stretching
FIG. 17. A LANDSAT-C image of the same region as in Fig. 16 (extended area).
performed in the limited grey-level range 80-190, and in 24c the result of 2D FIR digital filtering of the parabolic type. As it appears, this last filter can indeed be useful for obtaining special enhancement. It can be proved that a 2D parabolic filter (having flexible parameters such as the origin and slope) is a good approximation of inverse or restoration filtering (Cappellini et al., 1978). This example outlines how in computer tomography, in addition to the standard image manipulation provided, special effects can be obtained, in particular by means of 2D digital filtering. As already observed, 2D digital filtering can be very useful as preprocessing before data compression (see Section V). Table I shows some experimental results, obtained by processing nuclear medicine images first with 2D low-pass digital filtering and then with data compression using digital transformations (2D FFT and FWT). As it appears, with the same e_p and e_r errors, the compression ratio C_r is appreciably increased when the 2D digital filter is
FIG. 18. Result of 2D FIR digital filtering (low-pass type, circular symmetry) applied to the SEASAT image.
FIG. 19. The two final images obtained for comparison and correlation: at the left is the LANDSAT image; at the right, the SEASAT filtered and decimated image.
FIG. 20. A simple integration test: addition of the two images obtained in Fig. 19.

FIG. 21. Multiple parallel digital filtering of four EEG signals.
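The three-band split of Fig. 21 (8-10, 10-12, and 12-14 Hz outputs for each EEG channel) can be mimicked with an ideal FFT-mask band-pass filter. The sampling rate and the synthetic test signal below are assumptions for illustration, not parameters of the Florence hardware.

```python
import numpy as np

FS = 128  # Hz -- assumed sampling rate; the hardware's actual rate is not given

def bandpass_fft(x, lo, hi, fs=FS):
    """Ideal FFT-mask band-pass filter: a simple stand-in for the 1D digital
    filters implemented in the hardware processor."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    X[(f < lo) | (f > hi)] = 0.0        # zero everything outside the band
    return np.fft.irfft(X, n=len(x))

# One synthetic EEG channel split into the three bands of Fig. 21.
t = np.arange(FS * 4) / FS              # 4 s analysis window
eeg = np.sin(2 * np.pi * 9 * t) + 0.5 * np.sin(2 * np.pi * 13 * t)
outputs = {band: bandpass_fft(eeg, *band) for band in [(8, 10), (10, 12), (12, 14)]}
```

With the two test tones at 9 and 13 Hz, the first and third band outputs each isolate one component while the middle band is essentially empty.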
FIG. 22. Example of processing in infrared thermography: (a) original digitized image; (b) result of edge detection applied on a limited range of grey levels.
used in comparison with the situation with no prefiltering (in the first case C_r passes from 2.5 to 6, in the second from 3.5 to 8) (Cappellini, 1979a).

D. Applications to Robotics
In computer vision for robotics, in which one or more scene sensors such as TV cameras acquire information about mechanical objects or other systems in a
FIG. 23. Example of the application of a 2D FIR digital filter of the bandpass type to a nuclear medicine image: at the top is the original image; at the bottom, the result of processing.
static position or in movement, efficient processing techniques are required to analyze the images given by the sensors. In particular, due to environmental noise conditions (light changes, different colors of the objects, movement, . . .), preprocessing with fast filtering operations is required; then edge detection is useful to extract the object's shape before final recognition and classification. In the following, some examples of the application of the digital operations described in the previous sections are given.
FIG. 24. Example of processing a computer tomography image: (a) original image; (b) result of linear stretching; (c) result of 2D FIR digital filtering of the parabolic type.
The first example regards the analysis of complex objects, where the goal is to process images taken of the objects and to produce an automatic object decomposition and subpart identification and classification. The processing procedure is outlined in Fig. 25: using a TV camera, images are acquired in 3 colors (R, G, B: red, green, blue); then prefiltering is performed to reduce the noise; after boundary extraction, decomposition and syntactical analysis are performed. Figure 26 shows an example of the application of this procedure: in 26a an original digitized image (red color presented in black and white) is given, representing a circuit board (acquisition with strong noise); in 26b the result of the decomposition of the circuit board, obtained through a nonlinear
FIG. 24b.
filter of the type presented in Section III,A, an edge detector, and a homogeneity operator applied on the three R, G, B images (Cappellini et al., 1984b). Another example regards the recognition and tracking of moving objects, as on a transport tape. The processing steps are the following: preprocessing with a nonlinear smoother [as in Eq. (26)]; edge detection (e.g., a Sobel-type operator); spike elimination with a nonlinear operator [as in Eqs. (27) and (28)]; segmentation; object recognition by performing the FFT on the boundary of the object (distances of the boundary points from the centroid). An example of the application of this procedure is given in Fig. 27 on some mechanical objects. At the left there are the input digitized images of two
FIG. 24c.
positions; at the right the recognized objects are shown (each identified with a different color, here appearing as a different grey level) with perfect tracking of their movements (Cappellini and Del Bimbo, 1983). With reference also to the above examples, in these robotics applications the preprocessing step is indeed very important (fast and efficient nonlinear filtering operators and edge detectors are required), so as to reduce the noise and disturbances and extract the significant data in compressed form (that is, limited to the really significant data) for the best performance of the final recognition-classification algorithms and procedures.
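The boundary-FFT recognition step mentioned above (distances of boundary points from the centroid, Fourier transformed) can be sketched as a Fourier shape descriptor. The normalization choices below are common practice and are assumptions for illustration, not necessarily those of the cited work.

```python
import numpy as np

def fourier_descriptor(boundary, n_coef=8):
    """Distances of boundary points from the centroid, Fourier transformed.
    Taking FFT magnitudes makes the signature independent of the starting
    point on the boundary; dividing by the DC term normalizes for scale."""
    pts = np.asarray(boundary, dtype=float)
    r = np.linalg.norm(pts - pts.mean(axis=0), axis=1)   # centroid distances
    spec = np.abs(np.fft.fft(r))[:n_coef]
    return spec / spec[0]

# A four-lobed test contour; cyclically shifting its points (equivalent to a
# different starting point on the traversal) leaves the descriptor unchanged.
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
radius = 1 + 0.2 * np.cos(4 * theta)
contour = np.c_[radius * np.cos(theta), radius * np.sin(theta)]
shifted = np.roll(contour, 16, axis=0)
d1, d2 = fourier_descriptor(contour), fourier_descriptor(shifted)
```

Because the signature is invariant to where the boundary traversal starts, the same object in a different pose on the tape yields the same descriptor, which is what makes the FFT-on-boundary step useful for recognition.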
TABLE I

EXPERIMENTAL RESULTS OBTAINED APPLYING DATA COMPRESSION USING TWO-DIMENSIONAL FFT AND FWT TRANSFORMATIONS TO NUCLEAR MEDICINE IMAGES WITH OR WITHOUT A TWO-DIMENSIONAL LOW-PASS DIGITAL PREFILTERING

          C_r = 2.5 (C_r = 6 with prefiltering)    C_r = 3.5 (C_r = 8 with prefiltering)
          e_p         e_r                          e_p         e_r
  FFT     0.1322      0.0134                       0.1581      0.0164
  FWT     0.0483      0.0074                       0.1088      0.0160
FIG. 25. Processing procedure for automatic object decomposition and subpart identification and classification: ACQUISITION (different color bands acquisition, prefiltering algorithm) followed by SEGMENTATION (boundary extraction, decomposition algorithm, syntactical analysis).
FIG. 26. Example of the application of the procedure in Fig. 25: (a) original digitized image representing a circuit board; (b) result of decomposition obtained through a nonlinear operator, an edge detector, and a homogeneity operator.
FIG. 27. Example of processing images related to moving objects: at left, the input digitized images; at right, the recognized objects with movement tracking.
REFERENCES

Abramson, N. (1963). "Information Theory and Coding." McGraw-Hill, New York.
Benelli, G., Bianciardi, C., Cappellini, V., and Del Re, E. (1977). Proc. EUROCON, Eur. Conf. Electrotech., Venice.
Benelli, G., Cappellini, V., and Lotti, F. (1980). Radio Electron. Eng. 50, 29.
Benelli, G., Cappellini, V., and Del Re, E. (1984). IEEE J. Select. Areas Commun. SAC-2, 77.
Berger, T. (1971). "Rate Distortion Theory: A Mathematical Basis for Data Compression." Prentice-Hall, New York.
Bernabo, M., Cappellini, V., and Emiliani, P. L. (1976). Electron. Lett. 12, 288.
Brofferio, S., Cafforio, C., Rocca, F., and Ruffino, U. (1975). Proc. Florence Conf. Digital Signal Process., p. 158.
Calzini, M., Cappellini, V., and Emiliani, P. L. (1975). Alta Frequenza 44, 747.
Cappellini, V. (1979a). Proc. JUREMA Conf., Zagreb.
Cappellini, V. (1979b). Proc. Int. Workshop Image Process. Astron., Trieste, p. 258.
Cappellini, V. (1980). Int. J. Remote Sensing 1, 175.
Cappellini, V. (1983). Proc. IEEE Int. Symp. Circuits Systems, Newport Beach, p. 402.
Cappellini, V. (1984). Proc. EARSeL/ESA Symp. Integrated Approaches Remote Sensing, Guildford, p. 325.
Cappellini, V., and Del Bimbo, A. (1983). In "Issues in Acoustic Signal/Image Processing and Recognition" (C. H. Chen, ed.), p. 283. Springer-Verlag, Berlin and New York.
Cappellini, V., and Emiliani, P. L. (1983). Proc. MEDINFO-83, Amsterdam, p. 682.
Cappellini, V., and Odorico, L. (1981). Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Atlanta, p. 1129.
Cappellini, V., Chini, A., and Lotti, F. (1976). Proc. Int. Techn. Sci. Meet. Space, Rome, p. 33.
Cappellini, V., Constantinides, A. G., and Emiliani, P. (1978). "Digital Filters and Their Applications." Academic Press, New York.
Cappellini, V., Carla, R., Conese, C., Maracchi, G. P., and Miglietta, F. (1984a). Proc. EARSeL/ESA Symp. Integrated Approaches Remote Sensing, Guildford, p. 23.
Cappellini, V., Del Bimbo, A., and Mecocci, A. (1984b). Image Vision Comput. 2, 109.
Costa, J. M., and Venetsanopoulos, A. N. (1974). IEEE Trans. Acoust. Speech Signal Process. ASSP-22, 432.
Dudgeon, D. E. (1975). IEEE Trans. Acoust. Speech Signal Process. ASSP-23, 242.
Ekstrom, M. P. (1980). IEEE Trans. Acoust. Speech Signal Process. ASSP-28, 16.
Harris, D. B., and Mersereau, R. M. (1977). IEEE Trans. Acoust. Speech Signal Process. ASSP-25, 492.
Hilberg, W., and Rothe, P. G. (1971). Inf. Control 18, 103.
Hu, J. V., and Rabiner, L. R. (1972). IEEE Trans. Audio Electroacoust. AU-20, 249.
Kaiser, J. F. (1966). In "System Analysis by Digital Computer" (F. F. Kuo and J. F. Kaiser, eds.), p. 218. Wiley, New York.
McClellan, J. H. (1973). Proc. Annu. Princeton Conf. Inf. Sci. Systems, 7th, p. 247.
Maria, G. A., and Fahmy, M. M. (1974). IEEE Trans. Acoust. Speech Signal Process. ASSP-22, 16.
Mecklenbrauker, W. F. G., and Mersereau, R. M. (1976). IEEE Trans. Circuits Systems CAS-23, 414.
Mersereau, R. M., and Dudgeon, D. E. (1975). Proc. IEEE 63, 610.
Mersereau, R. M., Mecklenbrauker, W. F. G., and Quatieri, T. F., Jr. (1976). IEEE Trans. Circuits Systems CAS-23, 405.
Oppenheim, A. V., and Schafer, R. W. (1975).
"Digital Signal Processing." Prentice-Hall, New York.
Pratt, W. K. (1978). "Digital Image Processing." Wiley, New York.
Prosperi, L. (1983). Thesis, Department of Electrical Engineering, University of Florence.
Shannon, C. E. (1959). IRE Natl. Conv. Rec. 7, 142.
Shannon, C. E., and Weaver, W. (1949). "The Mathematical Theory of Communication." Univ. of Illinois Press, Urbana.
Shanks, J. L., Treitel, S., and Justice, J. H. (1972). IEEE Trans. Audio Electroacoust. AU-20, 115.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 66
Statistical Aspects of Image Handling in Low-Dose Electron Microscopy of Biological Material

CORNELIS H. SLUMP* AND HEDZER A. FERWERDA

Department of Applied Physics, Rijksuniversiteit Groningen, Groningen, The Netherlands
I. Introduction
   A. Need for a Fundamental Statistical Analysis
   B. Interaction of the Electron Beam with the Specimen
   C. Relation between the Object Structure and the Electron Wave Function
   D. Image Formation in the CTEM
   E. Remarks about the Contrast Mechanisms
   F. The Phase Problem in Electron Microscopy
   G. The Stochastic Process Characterizing the Low-Dose Image
II. Object Wave Reconstruction
   A. Introduction and Review
   B. Derivation of the Basic Equation
   C. Solution of the Integral Equations
   D. Statistical Analysis of an Approximate Solution
III. Wave-Function Reconstruction of Weak Scatterers
   A. Introductory Remarks
   B. Axial Illumination
   C. Tilted Illumination
IV. Parameter Estimation
   A. Maximum Likelihood Estimation in Electron Microscopy
   B. Illustrative Examples
   C. Two-Dimensional Examples
   D. Discussion and Conclusions
V. Statistical Hypothesis Testing
   A. Introduction to Statistical Hypothesis Testing in Electron Microscopy
   B. Object Detection
   C. Position Detection of Marker Atoms
   D. Statistical Significance of Image Processing
   E. Discussion and Conclusions
Appendix A: The Statistical Properties of the Fourier Transform of the Low-Dose Image
Appendix B: The Statistical Properties of an Auxiliary Variable
Appendix C: The Cramer-Rao Bound
References
* Present address: Philips Medical Systems, Eindhoven, The Netherlands.

Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. INTRODUCTION

A. Need for a Fundamental Statistical Analysis
It is well known that in the electron microscopy of biological specimens radiation damage constitutes a severe problem. This radiation damage manifests itself as the breaking of chemical bonds due to the bombardment of the specimen by electrons. Obviously, radiation damage can be reduced by limiting the number of electrons during exposure. The price to be paid for this economy is that the pictures exhibit a grainy appearance (“shot noise”) which introduces a probabilistic feature in the imaging process. The image is to be considered as the realization of a stochastic process. The processing of the noisy micrographs consequently acquires a stochastic nature. It is to be expected that the smaller the number of participating electrons the larger the uncertainty in the results will be. It is the aim of the present article to give a statistical characterization of the results obtained from noisy exposures. By this we mean that not only the average of a certain calculated quantity is needed but also its variance, etc. The present analysis puts intuitive notions such as signal-to-noise ratio on a firm basis. Ideally, the probability density function of the relevant quantity should be determined. As we shall see in the subsequent sections, such a goal may be too ambitious in general. In the past several attempts have been made in electron microscopy to improve the statistical significance of the obtained results. The most obvious approach is to repeat the experiments under identical circumstances. This will reduce the variance of the quantities in which we are interested. Unwin and Henderson (1975) achieve this for substances which can be crystallized. In this case the number of repetitions of the exposure of one single unit equals the number of unit cells in the crystal. Unfortunately, not all biological materials can be forced to crystallize. 
Periodic structures can be handled by the Fourier filtering method of Unwin and Henderson (1975) or the cross-correlating techniques of Saxton and Frank (1977). Nonperiodic objects have only been analyzed in a systematic way by techniques borrowed from pattern recognition, e.g., cluster analysis. Van Heel and Frank (1981) have applied a statistical version of such an algorithm, called correspondence analysis, to a large number of images that each contain a similar, single isolated biological macromolecule. These images could thereafter be oriented with respect to each other, aligned and averaged, resulting in a higher signal-to-noise ratio and showing many more details of the object. Even in the case of the abovementioned improvements in signal-to-noise ratio, a statistical characterization of the final quantities remains necessary.
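The shot-noise behaviour described above, and the variance reduction obtained by averaging repeated aligned exposures, can be illustrated with a small Poisson simulation (the dose, image size, and number of exposures below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 5.0            # assumed mean dose: electrons per pixel (low-dose regime)
n_exposures = 100    # number of repeated, perfectly aligned exposures (assumed)

# Each low-dose image is a realization of a Poisson ("shot noise") process.
images = rng.poisson(lam, size=(n_exposures, 64, 64))

single_var = images[0].var()     # per-pixel variance of one exposure, ~lam
averaged = images.mean(axis=0)   # "repeating the experiment" and averaging
avg_var = averaged.var()         # reduced by a factor ~n_exposures
```

The single-exposure variance is close to the mean dose (a Poisson property), while the averaged image's variance drops by roughly the number of exposures, which is exactly the gain exploited by the crystal-averaging and alignment techniques mentioned above.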
Before entering upon the statistical analysis proper we have to discuss briefly the interaction of the electron beam with the specimen and the image formation in the CTEM (conventional transmission electron microscope).
B. Interaction of the Electron Beam with the Specimen
A basic understanding of the scattering of electrons by the specimen under consideration is needed if one wants to deduce the object structure from one or more micrographs. Our discussion will be very sketchy. A more detailed account can be found in Heidenreich (1964), Haine (1961), and Misell (1973a). We distinguish two categories of interactions: (1) inelastic interactions, and (2) elastic interactions. In inelastic interactions the incident electrons transfer energy and momentum to the object, leading to an excitation of the specimen. Such an excitation could be a breaking of a chemical bond, ionization or excitation to another energy level, plasmon interactions, etc. In elastic interactions there is no internal excitation of the object whatsoever. The transfer of momentum and energy is determined by conservation of energy and momentum. The elastic scattering is the scattering of the incident electrons from the electrons and atomic nuclei (Coulomb scattering) in the specimen. When considering monoenergetic incident electrons, the characteristic difference between elastic and inelastic scattering behavior is that the spread in scattering angles is larger for elastic scattering than for inelastic scattering. Intuitively we can use this fact to make some guesses about the resolution which might be obtained when deducing the object structure from the images. This calculation is an idealization because the scattering characteristics of the objects are recorded in some complicated way in the electron micrograph, as will be clarified in the subsequent sections. Let Δp denote the spread of the momentum transfer between the incident electron and the object. According to the Heisenberg uncertainty principle, the object can be localized with an uncertainty of position of the order Δx = ℏ/Δp, ℏ being Planck's constant divided by 2π.
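As a numerical illustration of the uncertainty estimate Δx = ℏ/Δp: the electron energy and the elastic scattering-angle spread below are assumed, illustrative values, not taken from the text.

```python
import math

# Illustrative assumptions: 100 keV electrons (de Broglie wavelength ~0.037 A)
# and a maximum elastic scattering angle of ~20 mrad.
hbar = 1.054571817e-34               # J s
wavelength = 0.037e-10               # m
p = 2 * math.pi * hbar / wavelength  # de Broglie: p = h / lambda
delta_p = p * 20e-3                  # momentum spread ~ p * theta_max
delta_x = hbar / delta_p             # uncertainty-principle localization
print(f"Delta_x ~ {delta_x * 1e10:.2f} Angstrom")
```

With these numbers the bound lands well below a few angstroms, consistent with the statement that high-resolution work must rely on the elastically scattered electrons (whose angular spread, and hence Δp, is largest).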
As Ap is largest for elastic scattering, high-resolution work (aiming at discovering structural information of the order of a few angstrom) has to utilize elastically scattered electrons. It is convenient to remove the inelastically scattered electrons by a filter lens (see e.g., Henkelman and Ottensmeyer, 1974; Egerton et al., 1975). The inelastically scattered electrons form an unwanted low-resolution background signal which can be interpreted as blur. The wave function describing the elastic scattering of an electron by an object has an important property: There exists a definite phase relationship between the incident wave function and the scattered wave function, a
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
relationship which does not hold for inelastically scattered electrons. For that reason we shall assume monochromatic, spatially coherent illumination. This is an idealized situation which is experimentally approximated by the field emission gun, which has a very small apparent spot size. Strictly speaking, the illumination is partially coherent. The degree of partial coherence might be incorporated in the calculations, which become increasingly cumbersome without yielding additional insight (for a discussion see Hawkes, 1980b). Therefore, in the present contribution, we shall assume monochromatic, fully spatially coherent illumination. The monochromaticity requirement is satisfied by removing the inelastically scattered electrons by an energy filter lens.

C. Relation between the Object Structure and the Electron Wave Function

The discussion in this section only applies to the case of elastic scattering. In order to determine the object structure from the scattered wave function, whose determination will be discussed in the following sections, we need a model for the object. We shall give two examples.

(1) The object is considered as a collection of scattering centers. The position and scattering strength of each center is described by a certain number of parameters. In principle, the intensity distribution of the image can be calculated and is compared with the measured image intensity distribution. The parameters can now be determined by a fitting procedure. Needless to say, this procedure is only viable if the number of parameters is restricted, which is the case when we have considerable prior information about the object from other sources (e.g., chemical). This approach will be discussed in further detail in Section IV.

(2) The object is described by an electrostatic potential distribution V(r) which is due to all charges inside the object. According to the WKB approximation and Glauber theory (Lenz, 1971; Glauber, 1959), the phase shift α(·,·)
imparted to a plane wave incident along the z direction is proportional to the projection of V(r) on a plane perpendicular to the z axis:

α(x_0, y_0) ∝ ∫ V(x_0, y_0, z) dz    (1)
The reconstruction of the scattered wave function yields a projection of the potential distribution. It should be clear that the inverse scattering problem (i.e., the determination of the object’s structure from the scattered wave function) is more
IMAGE HANDLING IN ELECTRON MICROSCOPY
complicated than the present examples suggest and needs further investigation.
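Model (1) above amounts to ordinary least-squares parameter fitting: predict an image intensity from a small set of scattering-center parameters and adjust the parameters until the prediction matches the measurement. A deliberately simplified sketch follows; the Gaussian-spot image model, the parameter grids, and all numerical values are invented for illustration, and a realistic calculation would use the full image-formation equations of the following sections:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)

def model(position, strength):
    """Toy image intensity of one scattering center: a Gaussian spot."""
    return strength * np.exp(-((x - position) / 0.05) ** 2)

# "Measured" intensity fabricated from known parameters (noise-free here)
measured = model(0.37, 2.0)

# Brute-force least-squares fit over a grid of the two parameters
positions = np.linspace(0.0, 1.0, 101)
strengths = np.linspace(0.5, 4.0, 71)
best = min(((float(np.sum((model(p, s) - measured) ** 2)), p, s)
            for p in positions for s in strengths), key=lambda t: t[0])
_, fit_pos, fit_str = best
print(fit_pos, fit_str)
```

With noise-free data the true parameters are recovered exactly; under low-dose conditions the fit residual itself becomes a stochastic quantity, which is the theme of the later sections.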
D. Image Formation in the CTEM

The setup of the electron microscopic image formation is sketched schematically in Fig. 1. The z axis of the coordinate system is chosen along the optical axis of the microscope. The wave function in a plane perpendicular to the z axis and situated immediately behind the object shall henceforth be denoted by the name "object wave function," and is written as

ψ_0(x_0, y_0) = exp[iα(x_0, y_0) − β(x_0, y_0)]    (2)
assuming a plane wave incident along the z axis, exp(ikz), and choosing the z coordinate of the plane, z_0, equal to zero. x_0 and y_0 are the coordinates in the object plane and will be measured in units of the wavelength of the incident electrons. We shall give a physical interpretation of the quantities α and β occurring in Eq. (2). For a plane incident wave along the z axis, α(x_0, y_0) represents the phase shift which is imparted to the incident wave function by the object and which can be related to the projection of the electrostatic potential distribution on the object plane z = z_0 = 0 [see Eq. (1)]. The quantity β(x_0, y_0) has a phenomenological interpretation: This quantity describes the loss of electrons due to inelastic scattering. These electrons are supposed to have been removed by a filter lens. This filtering is incorporated phenomenologically by β(x_0, y_0). The wave function in the image plane of the microscope, i.e., the image wave function (see Fig. 1), is related to the object wave function by
ψ(x, y) = ∫∫ dx_0 dy_0 K(x_0 − x, y_0 − y) ψ_0(x_0, y_0)    (3)

where K(·,·) is given by [see, e.g., Hawkes (1980)]

K(x_0 − x, y_0 − y) = ∫∫ dξ dη exp{−iγ(ξ, η) − 2πi[(x_0 − x)ξ + (y_0 − y)η]}    (4)
Equations (3) and (4) are the equations of the linear transfer theory of image formation, and they apply when the optical system is isoplanar, i.e., when the wave-optical aberrations are independent of the object coordinates x_0 and y_0. The real function γ(·,·) introduced in Eq. (4) is the wave aberration function. It incorporates the effects of spherical aberration through the coefficient C_s and
FIG. 1. Schematic of image formation in the transmission electron microscope.
the defocus through the coefficient D:

γ(ξ, η) = 2πλ^{-1}[(1/4)C_s(ξ^2 + η^2)^2 + (1/2)D(ξ^2 + η^2)]    (5)

The value of C_s, usually in the range of 1 to 3 mm, is fixed for a given microscope. However, the defocus D is variable and is to be chosen by the experimentalist. In Eq. (5) aberrations of higher order such as coma and astigmatism are neglected. In the object plane x_0 and y_0 are measured in units of λ; in the exit pupil ξ and η are expressed in units of the (back) focal length of the optical system. In the image plane x and y are measured in units of Mλ, with M representing the lateral magnification. A wave function is not a measurable physical quantity in itself. In the image plane only the intensity, i.e., the wave function multiplied by its complex conjugate, can be observed and registered. The information about the structure of the specimen is contained in the phase function α(·,·). In an aberration-free microscope this information would disappear from the product ψ(·,·)ψ*(·,·) when the defocus parameter D is also set to zero, a situation which would correspond to the Gaussian reference plane. The aberration function γ(·,·) [cf. Eq. (5)] is primarily responsible for the generation of phase contrast. In this context phase contrast is defined (as usual) as the contrast in the image caused by the phase shift imparted to the incident electron wave function by the object. This phase contrast is
observable thanks to the phase shift provided by the aberration function γ(·,·), which plays a role similar to the phase plate in a phase-contrast microscope.
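The role of the aberration function as a built-in phase plate can be made concrete with a small numerical experiment. In the sketch below all numbers (array size, object phase, quadratic pupil phase) are invented for illustration: a weak phase object exp(iα) imaged with no aberration phase produces a perfectly flat intensity, while a defocus-like pupil phase converts the phase variations into observable contrast:

```python
import numpy as np

n = 256
x = np.linspace(-1.0, 1.0, n, endpoint=False)

# Weak phase object psi_0 = exp(i*alpha); the Gaussian phase bump and all
# numerical values are invented for illustration.
alpha = 0.05 * np.exp(-((x / 0.2) ** 2))
psi0 = np.exp(1j * alpha)

xi = np.fft.fftfreq(n, d=x[1] - x[0])    # pupil-plane coordinate

def image_intensity(gamma):
    """Object -> pupil (FFT), apply exp(-i*gamma), pupil -> image (IFFT)."""
    psi_img = np.fft.ifft(np.fft.fft(psi0) * np.exp(-1j * gamma))
    return np.abs(psi_img) ** 2

contrast_focus = np.ptp(image_intensity(np.zeros(n)))              # no aberration
contrast_defocus = np.ptp(image_intensity(0.1 * np.pi * xi ** 2))  # defocus-like
print(contrast_focus, contrast_defocus)
```

Without an aberration phase the intensity is |exp(iα)|^2 = 1 everywhere (zero contrast up to rounding); with the quadratic phase the object's phase information becomes visible as intensity contrast.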
E. Remarks about the Contrast Mechanisms

We will briefly make some comments on the physical mechanisms which give rise to image contrast. We will assume that the image is due to the elastically scattered electrons. For low-resolution microscopy one gets useful information from the scattering contrast. The scattering contrast (also called diffraction contrast) is caused by the removal of electrons which have been scattered over such large angles that they are intercepted by the apertures of the microscope. This scattering contrast should be clearly distinguished from the contrast caused by the removal of the inelastically scattered electrons, which have been removed by the energy filter lens. Scattering contrast is fully taken into account in our treatment by the finite size of the aperture in the exit pupil, which is incorporated in our formulas. In our treatment scattering contrast is caused by electrons that have been intercepted by the diaphragm in the exit pupil. This diaphragm should not be taken too literally: All the apertures inside the microscope are represented by one effective aperture taken to be located in the exit pupil. Scattering contrast can be observed when essentially only the unscattered electrons pass through the microscope. When the specimen undergoes crystalline scattering (in the context of crystalline specimens the name diffraction is usually preferred), contrast is observed when only the zero-order diffracted beam is transmitted. In some cases scattering contrast admits a simple physical interpretation. The object is considered to consist of a number of point scatterers, which scatter independently. Let us assume that the electrons are never scattered more than once; multiple scattering is excluded. Let the object be illuminated by a wave propagating in the z direction and let I(x, y, z) denote the beam intensity in an arbitrary point of space.
σ is the total cross section for scattering over angles larger than the angular half-width ε of the aperture in the exit pupil. As σ depends on the chemical composition of the scatterer, σ is taken to depend on the space coordinates: σ = σ(x, y, z). We now consider the propagation of the beam intensity I(x, y, z) when proceeding in the z direction (see Fig. 2). Going from the plane perpendicular to the z axis with z coordinate z to a similar plane with z coordinate z + Δz, we find that the number of electrons which is prevented from reaching the image plane is given by

I(x, y, z) − I(x, y, z + Δz) = σ(x, y, z) I(x, y, z) ρ(x, y, z) Δz

where ρ(x, y, z) denotes the number of scatterers per unit volume. I(x, y, z)
FIG. 2. Propagation of the beam intensity through the object.
consequently satisfies the transport equation

∂I(x, y, z)/∂z = −σ(x, y, z) ρ(x, y, z) I(x, y, z)    (6)

which is solved by

I(x, y, z) = I_inc exp[−∫_0^z σ(x, y, z′) ρ(x, y, z′) dz′]    (7)

assuming a uniform incident beam intensity I_inc before the beam hits the object. In particular, for a weakly scattering object (meaning a "small" value of σ), we obtain by expanding the exponential function and retaining only the first two terms

I(x, y, z) = I_inc − I_inc ∫_0^z σ(x, y, z′) ρ(x, y, z′) dz′    (8)
From this formula one sees that the image contrast is the projection of σ(x, y, z)ρ(x, y, z) on the object plane. σ(x, y, z)ρ(x, y, z) might be interpreted as the "scattering power" of the object per unit length. If the scattering cross section σ does not vary appreciably over the object, the image contrast is approximately proportional to the projection of the density of the scattering centers, which in its turn is approximately equal to the mass density. The image contrast due to the inelastically scattered electrons gives information on the loss characteristics of the object and is irrelevant, even a nuisance, for structure determination. The reader should be warned not to take the simple-minded discussion of scattering contrast too seriously. The state of coherence of the incident beam
has been completely ignored, a liberty that seems to be common in the study of transport phenomena. For high-resolution electron microscopy the most important contrast mechanism is phase contrast, which arises from the interference between the unscattered and the scattered electron waves. In the next section this contrast mechanism is treated in greater depth.
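The agreement between the exponential attenuation law and its weak-object linearization, Eq. (8), is easy to quantify. In the sketch below the projected scattering power of 0.05 is an arbitrary illustrative number:

```python
import math

# Illustrative (invented) projected scattering power: the integral of
# sigma * rho over the object thickness.
projection = 0.05

I_inc = 1.0
I_exact = I_inc * math.exp(-projection)   # exponential attenuation law
I_weak = I_inc * (1.0 - projection)       # first two terms of the expansion, Eq. (8)
print(I_exact, I_weak)                    # about 0.9512 vs 0.9500
```

For this weakly scattering case the two values differ by roughly 0.1%, and the linearized expression always slightly underestimates the transmitted intensity, since 1 − x < exp(−x) for x > 0.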
F. The Phase Problem in Electron Microscopy

It is clear from the foregoing that determination of the structure of the object requires the knowledge of the complex object wave function, in particular its phase, given by α(x_0, y_0) [cf. Eqs. (1) and (2)]. We shall see later on that we can determine the complex object wave function if we know the complex wave function in the image plane, ψ(x, y). Unfortunately, only the intensity in the image plane is observable, which is proportional to the square of the modulus of this wave function, |ψ(x, y)|^2. So, apparently, the phase of the image wave function is not recorded and must be considered as being lost. This prevents a unique determination of the object wave function and frustrates the determination of the object structure. This complication is known as the "phase problem." In this article the phase problem will be solved by using two or more exposures of the same specimen under different settings of the microscope, e.g., by changing the defocusing between two exposures. A more detailed account can be found in the review articles by Ferwerda (1978, 1981, 1983), where more references can be found, and Saxton (1980).
G. The Stochastic Process Characterizing the Low-Dose Image

In the limit of a virtually countless number of electrons contributing to the image intensity, the observed intensity is proportional to the squared modulus of the image wave function ψ(·,·). We will call this situation the deterministic case. In this study, however, the specimens of interest are radiation-sensitive objects of biological material. As has been stated earlier, irreparable radiation damage resulting from the imaging electrons restricts the electron dose tremendously. Lowering the dose in order to prevent uncontrollable structural changes in the specimen will cause quantum noise to become manifest. The interpretation of the image will become difficult due to quantum noise. It is obvious that a compromise must be made between radiation damage of the structure and interpretability of the image contrast [see, e.g., Kellenberger (1980)]. We will not pursue this further. We minimize the structural damage by reducing the electron dose, and we will investigate in
subsequent sections the consequences of the reduction for the retrieval of information from noisy images in general and in particular with respect to the phase problem. In low-dose circumstances the image intensity distribution recorded on the micrograph is a realization of a stochastic process. This noise process, which is directly related to the way in which images are recorded, is investigated in this section. The stochastic nature of the recorded image has the consequence that the results of image processing also become stochastic quantities, for example the results obtained with an algorithm for phase retrieval. This section is therefore of fundamental importance in this study. From among other noise sources, such as thermal fluctuations of the current in the magnetic lenses and mechanical vibrations, we restrict our analysis to the so-called quantum noise ("shot noise"). The reason for this restriction is that quantum noise is a fundamental and major noise source which cannot be made arbitrarily small by good instrumentation and intelligent operation of the microscope. Throughout this study we will assume that low-dose images are recorded in the following idealized way. The registration of the image is performed by a detector (viz., a photographic plate) which is divided into a large number N^2 (to be specified later) of identical nonoverlapping squares. We assume that each image cell counts the exact number of electrons which arrive in the cell. Hence, a recorded image consists of an N × N array of random counts n̂_{k,l}, (k, l) ∈ {1, …, N}. We use the hat symbol to denote random variables. The above assumptions are not limiting because the number of developed grains of silver in a region of a photographic emulsion is proportional to the number of incoming quanta. The step of scanning the image with a microdensitometer is omitted from the analysis presented here.
The process of digitizing the photographic plate will inevitably add noise to the signal to be processed. The approach in this section is to assume that the information in the micrograph is available in digital form. This is in line with new developments in instrumentation, where a direct interface exists between microscope image and computer, and with developments in solid-state image-receptor technology. The intensity of the electron source must be low because of the restriction to the low-dose regime. It is to be expected therefore that the electron wave packets of the successively emitted electrons do not overlap. This is experimentally supported by Munch (1975), who found an average spacing of 15 m between two electrons in a beam current of 1.5 × 10^{-12} A at 75 keV. On the average there is only one electron in the microscope at a time. Therefore, the successively emitted electrons do not interact with each other, and the emissions are statistically independent events. A further consequence is that the scattering process of the beam with the specimen and the subsequent image formation can be described by a one-electron wave function. Due to the
statistical independence of the successive emissions, the total number n̂_e of emitted electrons during the exposure time T is a realization of a Poisson process, as is shown, for example, in Davenport and Root (1958, Chap. 7). This means that n̂_e is a Poisson-distributed random variable with probability distribution

P{n̂_e = k} = exp(−λ_s T)(λ_s T)^k/k!,    k = 0, 1, 2, 3, …    (9)
where λ_s is the source intensity parameter, equal to the mean number of emissions per second. The random counts n̂_{k,l}, (k, l) ∈ {1, …, N}, of which the recorded image consists, are independent, Poisson-distributed random variables, as will be shown in the following. In the low-dose regime the probability p_l that an electron which has been emitted by the source will arrive in the lth image cell, l ∈ {1, …, N^2}, is expressed in terms of the one-electron image wave function, cf. Eq. (3), by

p_l = ∫∫_{a_l} |ψ(x, y)|^2 dx dy    (10)

in which a_l is the area of the lth image cell. Let n̂_l denote the number of detected electrons in the lth image element. The probability that k electrons will arrive in the lth image element is given by a k-times independently repeated Bernoulli trial, with p_l being the probability of success. We arrive at

P{n̂_l = k} = Σ_{m=k}^∞ exp(−λ_s T)(λ_s T)^m [k!(m − k)!]^{-1} p_l^k (1 − p_l)^{m−k},    k = 0, 1, 2, 3, …    (11)

In Eq. (11) a combination of the Poisson distribution for the total number of electrons and the binomial distribution can be recognized. Defining a new variable j = m − k, we can write for Eq. (11)

P{n̂_l = k} = exp(−λ_s T)(k!)^{-1} p_l^k Σ_{j=0}^∞ (λ_s T)^{j+k}(j!)^{-1}(1 − p_l)^j
            = exp(−λ_s T p_l)(k!)^{-1}(λ_s T p_l)^k    (12)
From Eq. (12) we conclude that n̂_l is Poisson distributed with parameter λ_s T p_l. Consider two nonoverlapping area elements of the image plane, areas i and j (see Fig. 3). We will determine the joint probability that k electrons arrive in area i and l electrons in area j. The line of approach followed here parallels the discussion by Papoulis (1965, pp. 76, 77). In the low-dose regime the electrons are independently emitted. Therefore the image formation can be modeled as
FIG. 3. The area elements i and j in the image plane.
an m-times independently repeated experiment, where m is a Poisson-distributed random variable with parameter λ_s T. We have necessarily

m ≥ k + l    (13)

The following three mutually exclusive events can be distinguished as possible outcomes of an experiment:

event i:  the electron arrives in area element i
event j:  the electron arrives in area element j
event q:  the electron arrives somewhere in the image but not in i or j

The individual probabilities of the three events are related by

p_i + p_j + p_q = 1    (14)

If m has a fixed value satisfying Eq. (13), the probability of the joint event {k electrons in i and l electrons in j} is

P_m{n̂_i = k, n̂_j = l} = {k! l! [m − (k + l)]!}^{-1} m! p_i^k p_j^l p_q^{m−(k+l)}    (15)
As m is Poisson distributed with the distribution of Eq. (9), we arrive at the following expression for the probability of the joint event that k electrons will arrive in area element i and l electrons will arrive in area element j:

P{n̂_i = k, n̂_j = l} = Σ_{m=l+k}^∞ exp(−λ_s T)(m!)^{-1}(λ_s T)^m P_m{n̂_i = k, n̂_j = l}    (16)
Using Eqs. (14) and (15) and introducing a new variable m′ = m − (k + l), we obtain from Eq. (16)

P{n̂_i = k, n̂_j = l}
    = exp(−λ_s T)(k!)^{-1}(λ_s T p_i)^k (l!)^{-1}(λ_s T p_j)^l Σ_{m′=0}^∞ (m′!)^{-1}(λ_s T)^{m′}(1 − p_i − p_j)^{m′}
    = exp(−λ_s T p_i)(k!)^{-1}(λ_s T p_i)^k exp(−λ_s T p_j)(l!)^{-1}(λ_s T p_j)^l    (17)

With Eq. (12) we obtain

P{n̂_i = k, n̂_j = l} = P{n̂_i = k}P{n̂_j = l}    (18)
which shows that the random variables n̂_i and n̂_j are statistically independent. Notice that this property does not depend on the size or shape of the areas i and j; the only requirement is that they do not overlap. Consequently, the recorded image {n̂_1, n̂_2, n̂_3, …, n̂_{N^2}} consists of N^2 independent and hence uncorrelated Poisson-distributed random variables and has the probability of occurrence
P{n̂_1, n̂_2, …, n̂_{N^2}} = ∏_{l=1}^{N^2} exp(−λ_s T p_l)(n̂_l!)^{-1}(λ_s T p_l)^{n̂_l}    (19)
As n̂_l is Poisson distributed according to Eq. (12), the mathematical expectation value of n̂_l is given by

E{n̂_l} = λ_s T p_l    (20)

and the variance is given by

var{n̂_l} = λ_s T p_l    (21)
The stochastic image process is completely specified statistically by Eqs. (9), (10), (11), and (19). The stochastic properties of the data will play a dominant role in subsequent sections, which deal with the extraction and evaluation of information from the low-dose images.
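The defining properties just derived — Poisson statistics per cell with mean equal to variance, Eqs. (20) and (21), and independence of nonoverlapping cells, Eq. (18) — can be checked by simulating the emission process directly. In the sketch below the cell probabilities, source strength, and random seed are arbitrary illustrative choices:

```python
import random

random.seed(7)

LAM_T = 30.0            # lambda_s * T: mean number of emitted electrons (arbitrary)
P = [0.05, 0.15, 0.30]  # landing probabilities of three image cells (arbitrary)

def poisson(mean):
    """Draw a Poisson variate (Knuth's multiplication method)."""
    limit, k, prod = pow(2.718281828459045, -mean), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= limit:
            return k
        k += 1

trials = 20000
counts = [[0, 0, 0] for _ in range(trials)]
for t in range(trials):
    for _ in range(poisson(LAM_T)):   # one Poisson-distributed exposure
        u = random.random()           # each electron lands independently
        if u < P[0]:
            counts[t][0] += 1
        elif u < P[0] + P[1]:
            counts[t][1] += 1
        elif u < P[0] + P[1] + P[2]:
            counts[t][2] += 1
        # otherwise: the electron lands elsewhere in the image

means = [sum(c[i] for c in counts) / trials for i in range(3)]
varis = [sum((c[i] - means[i]) ** 2 for c in counts) / trials for i in range(3)]
cov01 = sum(c[0] * c[1] for c in counts) / trials - means[0] * means[1]
print(means, varis, cov01)
```

Within sampling error, each cell's sample mean λ_s T p_l equals its sample variance, and the covariance between two different cells vanishes, as Eqs. (18), (20), and (21) predict.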
II. OBJECT WAVE RECONSTRUCTION
A. Introduction and Review
The reconstruction of the object wave function is of great importance for the imaging of the structure of biological materials, especially at high resolution by means of an electron microscope. This is due to the relation between the object wave function and the electrostatic potential of the object,
as has been briefly touched upon in Section I. In Section I,D it is indicated that the interaction between the incoming electrons and the specimen is described by a shift in the phase of the electron wave function. The amount of phase shift corresponds to the projection in the propagation direction of the object's electrostatic potential. The information about the specimen structure is thus not related to a directly measurable physical quantity, a situation which leads to the so-called phase problem. Only the intensity of a wave function, which is proportional to its squared modulus, can be recorded, for example, on a photographic plate. In this section we will first discuss the properties of the various methods and algorithms which have been proposed in the literature for solving the phase problem when they are applied to low-dose imaging conditions. We hereby exclude periodic objects, so that the noise-reducing technique of averaging over the periodic repetition of image cells cannot be applied (Unwin and Henderson, 1975). Apart from analytical techniques such as zero flipping, which are mainly of theoretical importance, basically three algorithms exist for the retrieval of the phase of the object wave function. The elegant method proposed by Frank (1973) will in practical situations run into difficulties and is omitted here, together with the Gerchberg-Saxton algorithm (Gerchberg and Saxton, 1972), which can give nonunique results even in the noise-free case. The three methods discussed in this paragraph are:

(1) Newton-Kantorovich approach (Van Toorn and Ferwerda, 1976; Roger, 1981). In this approach the nonlinear equations, describing the object wave function in terms of recorded intensities, are expanded as a kind of Taylor expansion based on a first (intelligent) guess or starting from prior information about the solution to be obtained.
Based on the first-order term of this expansion, the wave function is updated, and the updated wave function is the starting point for a new expansion; the process is repeated. Owing to the noisy data, it is difficult to formulate a criterion which has to indicate that the calculational procedure has converged. Too stringent an adaptation to the noisy data has to be prevented. A further problem with this method concerns the convergence to a correct solution when the initial starting function is not close to it. Owing to the iterative nature of the procedure, the influence of the noise on the results obtained is difficult to quantify analytically. The determination of even the expectation value and the variance of the obtained results is a hard task. However, some insight can be obtained by Monte Carlo studies. Analytically tractable is the case of an initial starting solution which is close to the true solution, when only one correction update of the wave function is necessary. However, the statistical analysis of this situation is not of great value, as not much improvement is to be expected from adaptation to noisy data. What is most likely to happen is that the good initial a priori solution will be corrupted
by noise and that the procedure ends up with an inferior result compared with the a priori information.

(2) Misell's algorithm (Misell, 1973b). This is also an iterative procedure, based on two exposures which are defocused with respect to each other. Starting from an initial image wave function with the correct modulus according to the first exposure and with an arbitrary or even random phase, the image wave function belonging to the second exposure is calculated. The modulus of this wave function is corrected to satisfy the second exposure. Based on this corrected wave function, the wave function corresponding to the first exposure is calculated, the correct modulus is enforced, etc. When this computational scheme has converged, the object wave function is obtained by inversion of the integral equation [cf. Eq. (3)] which relates the object and the image wave function to each other. With the Misell algorithm the same problems arise with respect to the convergence and the influence of noise on the result as with the Newton-Kantorovich approach. Owing to quantum noise the data of the two exposures are not consistent with each other, so that in a strict mathematical sense there exists no solution, and consequently the algorithm cannot be expected to converge.

(3) The direct approach (Van Toorn and Ferwerda, 1976). This method is also based on two exposures with a different defocusing parameter. The wave function in the exit pupil is calculated by solving two coupled nonlinear Volterra integral equations of the first kind. The algorithm is not an iterative procedure and is therefore of potential interest for statistically evaluating the solution obtained. The algorithm is very sensitive to noise, as has been reported by Van Toorn et al. (1978), because of error accumulation. Of all methods the approach of solving the integral equations directly seems to be the most relevant for the analysis of the influence of the stochastic data on the reconstructed wave function.
Iterative procedures are not tractable for statistical evaluation; therefore, in the next two sections the direct method is analyzed in greater detail.
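For concreteness, the alternating modulus-enforcement scheme of Misell's algorithm can be sketched numerically. Everything in the toy example below — the array length, the fabricated test wave, and the defocus-difference phase — is invented for illustration; in the microscope setting the two moduli would come from the recorded intensities of the two exposures, and the propagation between the planes would follow from Eq. (3):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128

# Invented "true" wave in the plane of exposure 1 (used only to fabricate data)
psi_true = (1.0 + 0.2 * rng.standard_normal(n)) * np.exp(
    1j * rng.uniform(-np.pi, np.pi, n))

# The two exposures are related by a known pupil phase (a defocus-like term)
xi = np.fft.fftfreq(n)
defocus = np.exp(-1j * 40.0 * np.pi * xi ** 2)

def to_plane2(psi1):
    return np.fft.ifft(np.fft.fft(psi1) * defocus)

def to_plane1(psi2):
    return np.fft.ifft(np.fft.fft(psi2) * np.conj(defocus))

m1 = np.abs(psi_true)              # modulus data of exposure 1
m2 = np.abs(to_plane2(psi_true))   # modulus data of exposure 2

def err(psi1):
    return np.linalg.norm(np.abs(to_plane2(psi1)) - m2)

# Misell iteration: enforce the measured moduli alternately in the two planes
psi = m1 * np.exp(1j * rng.uniform(-np.pi, np.pi, n))   # random starting phase
e_start = err(psi)
for _ in range(200):
    psi2 = to_plane2(psi)
    psi2 = m2 * np.exp(1j * np.angle(psi2))   # impose modulus of exposure 2
    psi = to_plane1(psi2)
    psi = m1 * np.exp(1j * np.angle(psi))     # impose modulus of exposure 1
e_end = err(psi)
print(e_start, e_end)
```

With consistent (noise-free) data the mismatch between the computed and measured modulus of the second exposure decreases from the random start; with two genuinely noisy exposures no exact solution exists, which is precisely the convergence difficulty noted above.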
B. Derivation of the Basic Equation
In this section we derive the basic integral equation which relates the object wave function to a recorded intensity distribution in the image plane. In order to keep the equations as simple as possible, we treat in this chapter only one lateral dimension of the images. For electron microscopes with square diaphragms (if there are any), the extension to two lateral dimensions is straightforward. With circular symmetry the mathematical treatment becomes more complicated. However, this elaborate analysis will not yield
more insight than the case of one lateral dimension, which is treated in this section. The illuminating electron beam is described by a one-electron wave function. For this we assume a quasimonochromatic plane wave propagating in the direction of the optical axis of the imaging system, which we consider to be isoplanar. The object wave function is written as

ψ_0(x_0) = exp[iα(x_0) − β(x_0)]    (22)
where the phase term α(·) denotes the phase shift due to the object's electrostatic potential and where the attenuation term β(·) describes the removal of inelastically scattered electrons from the imaging beam by the appropriate energy filter lens. The geometry of the electron microscope considered here, with one lateral dimension, is presented in Fig. 4. The optical system is characterized by the relations between the wave functions in the three planes: object plane, exit pupil, and image plane. The wave function in the exit pupil ψ_p(·) is related to the object wave function ψ_0(·) by

ψ_p(ξ) = exp[−iγ(ξ)] ∫_{−d}^{+d} ψ_0(x_0) exp(−2πi x_0 ξ) dx_0    (23)
where γ(·) is the aberration function in the exit pupil. The coordinate ξ in the exit pupil is measured in units of f, the focal length of the imaging system. In
FIG. 4. Schematic diagram of the imaging system for the case of axial illumination.
the object plane x_0 is measured in units of λ, the wavelength of the accelerated electrons. In the image plane the coordinate x is measured in units of Mλ, with M the (lateral) magnification. The aberration function γ(·) for an isoplanar system is given by

γ(ξ) = 2πλ^{-1}[(1/4)C_s ξ^4 + (1/2)D ξ^2]    (24)

where only the spherical aberration with coefficient C_s and the defocusing with coefficient D have been taken into account, neglecting higher-order aberrations such as coma and astigmatism. The image wave function ψ(·) is related to the wave function in the exit pupil ψ_p(·) by

ψ(x) = ∫_{−ε}^{+ε} ψ_p(ξ) exp(2πi xξ) dξ    (25)
As has been discussed in Section I,G, the recorded low-dose image consists of a (one-dimensional) array of statistically independent, Poisson-distributed random counts n̂ = (n̂_{−N/2}, n̂_{−N/2+1}, …, n̂_{N/2−1}). The Shannon number N equals 2 × 2d × 2ε because the intensity ψψ* in the image plane has a bandwidth of 4ε. The random counts n̂_k (k = −N/2, …, N/2 − 1) represent the electrons which have arrived in the kth image cell. In Section I,G it has been shown that n̂_k is a Poisson-distributed random variable with intensity parameter λ_k equal to its expectation value, given by

λ_k = E{n̂_k} = λ_s T(2d)^{-1} ∫_{a_k} ψ(x)ψ*(x) dx    (26)
where a_k denotes the kth image cell: (4ε)^{-1}k − (8ε)^{-1} < x < (4ε)^{-1}k + (8ε)^{-1}. Calculating the Fourier transform of the stochastic image according to

ĉ(ξ) = Σ_{k=−N/2}^{N/2−1} n̂_k exp[2πi(4ε)^{-1}kξ]    (27)
we obtain the stochastic function ĉ(·). For reasons of convenience ĉ(·) is defined here for a continuum of values of ξ. In practice Eq. (27) will be calculated using a fast Fourier transform (FFT) algorithm, which results in ĉ(ξ_l) for a discrete set of ξ values ξ_l. Where appropriate we will use this discrete representation. The function ĉ(·) is a complex stochastic function, and its statistical properties are studied in Appendix A. In Appendix A it is shown that the autocorrelation function R(·,·) of the complex stochastic process Eq. (27) is given by

R(ξ_1, ξ_2) = E{ĉ(ξ_1)ĉ*(ξ_2)} = E{ĉ(ξ_1)}E{ĉ*(ξ_2)} + Σ_k λ_k exp[2πi(4ε)^{-1}k(ξ_1 − ξ_2)]    (28)

from which it follows that ĉ(ξ_1) and ĉ(ξ_2) are correlated (ξ_1 ≠ ξ_2).
The basic integral equation is derived from Eq. (27) by taking the expectation value of both sides. Using Eq. (26) this results in

E{ĉ(ξ)} = λ_s T(2d)^{-1} Σ_{k=−N/2}^{N/2−1} exp[2πi(4ε)^{-1}kξ] ∫_{a_k} ψ(x)ψ*(x) dx    (29)
Substituting Eq. (25) and performing the integration over x yields

E{ĉ(ξ)} = λ_s T(2d)^{-1} ∫∫ dξ′ dξ″ ψ_p(ξ′)ψ_p*(ξ″) {sin[2π(8ε)^{-1}(ξ′ − ξ″)]/[π(ξ′ − ξ″)]} Σ_{k=−N/2}^{N/2−1} exp[2πi(4ε)^{-1}k(ξ + ξ′ − ξ″)]    (30)

Carrying out the summation over k leads to

E{ĉ(ξ)} = λ_s T(2d)^{-1} ∫∫ dξ′ dξ″ ψ_p(ξ′)ψ_p*(ξ″) {sin[2π(8ε)^{-1}(ξ′ − ξ″)]/[π(ξ′ − ξ″)]} {sin[2πd(ξ + ξ′ − ξ″)]/sin[π(4ε)^{-1}(ξ + ξ′ − ξ″)]}    (31)

Because d ≫ 1, the aperture in the object plane is numerically large compared with the wavelength of the accelerated electrons, and we have

sin[2πd(ξ + ξ′ − ξ″)]/sin[π(4ε)^{-1}(ξ + ξ′ − ξ″)] ≈ 4ε δ(ξ + ξ′ − ξ″)    (32)

With Eq. (32) the integration with respect to ξ″ in Eq. (31) can be evaluated, leading to [note that ψ_p(ξ) is a band-limited function, only defined in the region −ε ≤ ξ ≤ ε]

E{ĉ(ξ)} = λ_s T(2d)^{-1} 4ε {sin[2π(8ε)^{-1}ξ]/(πξ)} ∫_{−ε}^{ε−ξ} ψ_p(ξ′)ψ_p*(ξ′ + ξ) dξ′,    0 ≤ ξ ≤ 2ε    (33)

In Eq. (33) an integral equation of the first kind is presented, which is of the Volterra type. In practical situations E{ĉ(ξ)} is unknown, as we only have one realization of the stochastic function ĉ(ξ). Therefore the integral equation is stochastically driven. Note that ĉ(ξ) equals ĉ*(−ξ) because of the symmetry properties of the Fourier transform of a real function. Equation (33) is the basis for the discussion of the direct method for the determination of the object wave function. This is the subject of the next section.
IMAGE HANDLING IN ELECTRON MICROSCOPY
219
C. Solution of the Integral Equations

In this paragraph we discuss the direct method for the reconstruction of the object wave function based on two defocused exposures (Van Toorn and Ferwerda, 1976). Defining the auxiliary wave function $\psi_q(\cdot)$
$$\psi_q(\xi) = \exp(-2\pi i\lambda^{-1}\tfrac{1}{4}C_s\xi^4)\,\psi_p(\xi) \qquad (34)$$
we obtain from Eq. (33), with the two exposures $\hat{c}^1(\cdot)$ and $\hat{c}^2(\cdot)$, which correspond to the defocus parameters $D^1$ and $D^2$, respectively, the set of integral equations
$$\hat{c}^j(\xi) = \lambda_s T(2d)^{-1}\int_{-\varepsilon}^{\varepsilon-\xi}\psi_q(\xi')\psi_q^*(\xi'+\xi)\exp[-2\pi i\lambda^{-1}\tfrac{1}{2}D^j(\xi^2+2\xi\xi')]\,d\xi', \qquad 0 < \xi \le 2\varepsilon, \quad j = 1, 2 \qquad (35)$$
In Eq. (35) the sinc function in front of the integral in Eq. (33) has been approximated by a constant.
In order to reconstruct the object wave function, we have to solve $\psi_q(\cdot)$ from Eq. (35). Inversion of an equation similar to Eq. (23) then yields the required $\psi_0(\cdot)$. Introducing a new integration variable $\eta = \xi' + \tfrac{1}{2}\xi$, Eq. (35) becomes
$$\hat{c}^j(\xi) = \lambda_s T(2d)^{-1}\int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2}\psi_q(-\tfrac{1}{2}\xi+\eta)\,\psi_q^*(\tfrac{1}{2}\xi+\eta)\exp(-2\pi i\lambda^{-1}D^j\xi\eta)\,d\eta, \qquad 0 < \xi \le 2\varepsilon \qquad (38)$$
Until now the coordinate $\xi$ has been treated as a continuous variable. It is, however, a discrete variable, because in practical situations $\hat{c}(\cdot)$ is calculated by means of a fast Fourier transform (FFT) algorithm. Setting the sampling
distance $h$ in the exit pupil equal to $(2d)^{-1}$, $\xi$ is expressed as
$$\xi = 2\varepsilon - kh, \qquad k = 1, 2, \ldots, \tfrac{1}{2}N - 1 \qquad (39)$$
Approximating the integration according to the trapezoidal rule, Eq. (38) is transformed into a set of algebraic equations. For $k = 1$, $\xi$ equals $2\varepsilon - h$, and Eq. (38) becomes
$$\hat{c}^1(2\varepsilon-h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon)\exp[-2\pi i\lambda^{-1}D^1(\tfrac{1}{2}h)(2\varepsilon-h)] + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-h)\exp[-2\pi i\lambda^{-1}D^1(-\tfrac{1}{2}h)(2\varepsilon-h)]\}$$
$$\hat{c}^2(2\varepsilon-h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon)\exp[-2\pi i\lambda^{-1}D^2(\tfrac{1}{2}h)(2\varepsilon-h)] + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-h)\exp[-2\pi i\lambda^{-1}D^2(-\tfrac{1}{2}h)(2\varepsilon-h)]\} \qquad (40)$$
For the moment, we treat the two unknown constants $\psi_q^*(\varepsilon)$ and $\psi_q(-\varepsilon)$ as if they were known. Their actual values follow from the following procedure. We set $\psi_q^*(\varepsilon)$ equal to 1, which fixes the overall phase factor. The derivative of $\hat{c}(\cdot)$ at the point $\xi = 2\varepsilon$ yields the product $2\lambda_s T(2d)^{-1}\psi_q(-\varepsilon)\psi_q^*(\varepsilon)$, from which $\psi_q(-\varepsilon)$ can be evaluated. The two exposures both give rise to an estimated value for $\psi_q(-\varepsilon)$, of which the mean is taken. Note that $\partial_\xi\hat{c}^{1,2}(2\varepsilon) = \sum_k\hat{n}_k\,2\pi i(4\varepsilon)^{-1}k(-1)^k$. In the reconstruction procedure both $\psi_q^*(\varepsilon)$ and $\psi_q(-\varepsilon)$ are multiplicative coefficients. From the normalization constraint $\int d\xi\,\psi_q(\xi)\psi_q^*(\xi)$, which corresponds to the total number of electrons detected in the image, we obtain after the reconstruction a correction for the modulus of $\psi_q^*(\varepsilon)$. From Eq. (40), $\psi_q(-\varepsilon+h)$ and $\psi_q^*(\varepsilon-h)$ easily follow using Cramer's rule. Throughout this whole procedure $\psi_q(\cdot)$ and $\psi_q^*(\cdot)$ are treated as unknown functions which are determined independently; the complex conjugation relation remains to be checked afterwards. Choosing $D^2$ equal to $-D^1$ and $D^1$ equal to $D$ in the following, the determinant of Eq. (40) becomes
$$\det{}_{k=1} = -2i\sin[2\pi\lambda^{-1}Dh(2\varepsilon-h)] \qquad (41)$$
If $\det \neq 0$, the sample values $\psi_q(-\varepsilon+h)$ and $\psi_q^*(\varepsilon-h)$ are obtained. For $k = 2$,
$\xi = 2\varepsilon - 2h$, and with the trapezoidal rule Eq. (38) becomes
$$\hat{c}^1(2\varepsilon-2h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+2h)\psi_q^*(\varepsilon)\exp[-2\pi i\lambda^{-1}Dh(2\varepsilon-2h)] + 2\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon-h) + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-2h)\exp[2\pi i\lambda^{-1}Dh(2\varepsilon-2h)]\}$$
$$\hat{c}^2(2\varepsilon-2h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+2h)\psi_q^*(\varepsilon)\exp[2\pi i\lambda^{-1}Dh(2\varepsilon-2h)] + 2\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon-h) + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-2h)\exp[-2\pi i\lambda^{-1}Dh(2\varepsilon-2h)]\} \qquad (42)$$
The determinant of Eq. (42) is equal to
$$\det{}_{k=2} = -2i\sin[2\pi\lambda^{-1}D\,2h(2\varepsilon-2h)] \qquad (43)$$
With the two samples $\psi_q(-\varepsilon+h)$ and $\psi_q^*(\varepsilon-h)$ obtained in the previous step $k = 1$, and provided $\det_{k=2} \neq 0$, the values $\psi_q(-\varepsilon+2h)$ and $\psi_q^*(\varepsilon-2h)$ can be evaluated. The next step $k = 3$ leads to $\xi = 2\varepsilon - 3h$, and Eq. (38) corresponds to
$$\hat{c}^1(2\varepsilon-3h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+3h)\psi_q^*(\varepsilon)\exp[-2\pi i\lambda^{-1}D\tfrac{3}{2}h(2\varepsilon-3h)] + 2\psi_q(-\varepsilon+2h)\psi_q^*(\varepsilon-h)\exp[-2\pi i\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-3h)] + 2\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon-2h)\exp[2\pi i\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-3h)] + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-3h)\exp[2\pi i\lambda^{-1}D\tfrac{3}{2}h(2\varepsilon-3h)]\}$$
$$\hat{c}^2(2\varepsilon-3h) = \lambda_s T(2d)^{-1}\tfrac{1}{2}h\{\psi_q(-\varepsilon+3h)\psi_q^*(\varepsilon)\exp[2\pi i\lambda^{-1}D\tfrac{3}{2}h(2\varepsilon-3h)] + 2\psi_q(-\varepsilon+2h)\psi_q^*(\varepsilon-h)\exp[2\pi i\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-3h)] + 2\psi_q(-\varepsilon+h)\psi_q^*(\varepsilon-2h)\exp[-2\pi i\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-3h)] + \psi_q(-\varepsilon)\psi_q^*(\varepsilon-3h)\exp[-2\pi i\lambda^{-1}D\tfrac{3}{2}h(2\varepsilon-3h)]\} \qquad (44)$$
which has the determinant
$$\det{}_{k=3} = -2i\sin[2\pi\lambda^{-1}D\,3h(2\varepsilon-3h)] \qquad (45)$$
From Eq. (44), $\psi_q(-\varepsilon+3h)$ and $\psi_q^*(\varepsilon-3h)$ are obtained using all the sample points determined in the steps $k = 1$ and 2. If the defocus parameter $D$ is chosen not too large in relation to the size of the exit pupil $\varepsilon$, the determinant of the set of algebraic equations is zero only for $\xi = 2\varepsilon$ and $\xi = 0$. Repeating the numerical procedure for $k = 4, 5, \ldots, \tfrac{1}{2}N - 1$ results in a computed function $\psi_q(\cdot)$ on the interval $(-\varepsilon, +\varepsilon)$, except at the endpoints, where the determinant is equal to zero. The mathematical structure of the algorithm is easily seen from the explicitly presented steps $k = 1$, 2, and 3. The statistical analysis of the whole procedure is nevertheless still far from trivial, due to the many complex terms involved and the correlation of the $\hat{c}(\cdot)$ samples. We therefore analyze an idealized system of equations, similar to Eq. (38) but simplified, in order to obtain insight into the behavior of the above algorithm with noisy data. With the model described in the following, the performance of the algorithm of Van Toorn and Ferwerda (1976) is simulated in a stochastic environment.
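The stepwise structure of the procedure can be sketched in code. The following is an illustrative skeleton only: it solves a chain of 2×2 systems by Cramer's rule, with a single common kernel per step instead of the per-term kernels and complex phases of the full algorithm, so all names and the simplified forward model are assumptions for demonstration:

```python
import numpy as np

def recursive_steps(y1, y2, h, a):
    """Step-by-step solution of a simplified analogue of Eq. (38).

    At step k the data y1[k-1], y2[k-1] determine the new endpoint samples
    f(-eps + k h) and g(eps - k h) from a 2x2 system (Cramer's rule), with
    the already-determined interior samples feeding the right-hand side.
    f(-eps) = g(eps) = 1 fixes the overall constants, as in the text.
    """
    K = len(y1)
    f = np.zeros(K + 1)
    g = np.zeros(K + 1)
    f[0] = g[0] = 1.0
    for k in range(1, K + 1):
        s, c = np.sin(a[k - 1]), np.cos(a[k - 1])
        det = (h / 2) ** 2 * 2.0 * s * c          # proportional to sin(2a)
        if abs(det) < 1e-14:
            raise ZeroDivisionError(f"determinant vanishes at step {k}")
        # interior contribution (simplified: one common kernel per step)
        inner = h * sum(f[k - l] * g[l] for l in range(1, k))
        r1 = y1[k - 1] - inner * s
        r2 = y2[k - 1] - inner * c
        # r1 = h/2 (u - v) s and r2 = h/2 (u + v) c, with u = f[k], v = g[k]
        f[k] = (r1 / s + r2 / c) / h
        g[k] = (r2 / c - r1 / s) / h
    return f, g
```

Fed with data generated from the same simplified forward model, the routine reproduces the endpoint samples exactly; with noisy data the later samples degrade rapidly, which is what the model computation below makes precise.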
1. Model Computation In order to reveal the behavior of the set of integral equations (38) with stochastic driving function, we simplify the set by
$$y^1(\xi) = \int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2} f(-\tfrac{1}{2}\xi+\eta)\,g(\tfrac{1}{2}\xi+\eta)\sin(2\pi\lambda^{-1}D^1\xi\eta)\,d\eta, \qquad 0 < \xi \le 2\varepsilon$$
$$y^2(\xi) = \int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2} f(-\tfrac{1}{2}\xi+\eta)\,g(\tfrac{1}{2}\xi+\eta)\cos(2\pi\lambda^{-1}D^2\xi\eta)\,d\eta, \qquad 0 < \xi \le 2\varepsilon \qquad (46)$$
With $D = D^1 = -D^2$, Eq. (46) is constructed to have, in the computational steps $k = 1, 2, \ldots, \tfrac{1}{2}N - 1$ of the algorithm, a determinant similar to Eq. (36)
$$\det = \sin[2\pi\lambda^{-1}D\xi(2\varepsilon-\xi)] \qquad (47)$$
The two functions $f(\cdot)$ and $g(\cdot)$ to be determined from Eq. (46) are real functions. The statistical properties of the stochastic driver functions $\hat{y}^1(\cdot)$ and $\hat{y}^2(\cdot)$ will be discussed further on. The first step ($k = 1$) of the algorithm, with $\xi = 2\varepsilon - h$, leads to
$$y^1(2\varepsilon-h) = \tfrac{1}{2}h[f(-\varepsilon+h)g(\varepsilon) - f(-\varepsilon)g(\varepsilon-h)]\sin[2\pi\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-h)]$$
$$y^2(2\varepsilon-h) = \tfrac{1}{2}h[f(-\varepsilon+h)g(\varepsilon) + f(-\varepsilon)g(\varepsilon-h)]\cos[2\pi\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-h)] \qquad (48)$$
The general step, $\xi = 2\varepsilon - kh$, $k = 1, 2, \ldots, \tfrac{1}{2}N - 1$, gives
$$y^1(2\varepsilon-kh) = \tfrac{1}{2}h[f(-\varepsilon+kh)g(\varepsilon) - f(-\varepsilon)g(\varepsilon-kh)]\sin[2\pi\lambda^{-1}D\tfrac{1}{2}kh(2\varepsilon-kh)] + h\sum_{l=1}^{k-1} f(-\varepsilon+(k-l)h)g(\varepsilon-lh)\sin[2\pi\lambda^{-1}D\tfrac{1}{2}h(k-2l)(2\varepsilon-kh)]$$
$$y^2(2\varepsilon-kh) = \tfrac{1}{2}h[f(-\varepsilon+kh)g(\varepsilon) + f(-\varepsilon)g(\varepsilon-kh)]\cos[2\pi\lambda^{-1}D\tfrac{1}{2}kh(2\varepsilon-kh)] + h\sum_{l=1}^{k-1} f(-\varepsilon+(k-l)h)g(\varepsilon-lh)\cos[2\pi\lambda^{-1}D\tfrac{1}{2}h(k-2l)(2\varepsilon-kh)] \qquad (49)$$
For simplicity we assume the noise in the stochastic functions $\hat{y}(\cdot)$ to be Gaussian with zero mean and unit variance. Furthermore, the samples of $\hat{y}(\cdot)$ are taken to be statistically independent; the correlation present in the driving functions of Eq. (38) is neglected in this computation. We have
$$\hat{y}^1(\xi_k) = y^1(\xi_k) + N(0,1), \qquad \hat{y}^2(\xi_k) = y^2(\xi_k) + N(0,1)$$
$$E\{\hat{y}^{1,2}(\xi_k)\hat{y}^{1,2}(\xi_l)\} = E\{\hat{y}^{1,2}(\xi_k)\}\,E\{\hat{y}^{1,2}(\xi_l)\}, \qquad \xi_k \neq \xi_l \qquad (50)$$
With $g(\varepsilon)$ and $f(-\varepsilon)$ set equal to 1, we obtain from Eq. (48)
$$\hat{f}(-\varepsilon+h) = \{\sin[2\pi\lambda^{-1}Dh(2\varepsilon-h)]\}^{-1}\{2h^{-1}\hat{y}^1(2\varepsilon-h)\cos[2\pi\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-h)] + 2h^{-1}\hat{y}^2(2\varepsilon-h)\sin[2\pi\lambda^{-1}D\tfrac{1}{2}h(2\varepsilon-h)]\} \qquad (51)$$
and a similar expression for $\hat{g}(\varepsilon-h)$. As the functions $\hat{y}(\cdot)$ are Gaussian distributed, $\hat{f}(-\varepsilon+h)$ is also Gaussian distributed
$$E\{\hat{f}(-\varepsilon+h)\} = f(-\varepsilon+h)$$
$$\operatorname{var}\{\hat{f}(-\varepsilon+h)\} = \operatorname{var}\{\hat{g}(\varepsilon-h)\} = 4\{h^2\sin^2[2\pi\lambda^{-1}Dh(2\varepsilon-h)]\}^{-1} \qquad (52)$$
In the second step ($k = 2$) of the algorithm, the product of $\hat{f}(-\varepsilon+h)$ and $\hat{g}(\varepsilon-h)$ has to be computed in order to solve the matrix equation, Eq. (49), for $\hat{f}(-\varepsilon+2h)$ and $\hat{g}(\varepsilon-2h)$. We now analyze the stochastic properties of this product. The product $\hat{z}$ of two Gaussian-distributed variables $\hat{x}$ and $\hat{y}$ with zero mean and variances $\sigma_x^2$ and $\sigma_y^2$, respectively, is distributed according to Kendall and Stuart (1963)
$$p(\hat{z}) = (\pi\sigma_x\sigma_y)^{-1}K_0((\sigma_x\sigma_y)^{-1}|z|) \qquad (53)$$
with $\hat{x} = N(0,\sigma_x^2)$, $\hat{y} = N(0,\sigma_y^2)$, and $\hat{z} = \hat{x}\hat{y}$, and where $K_0(\cdot)$ denotes the modified Bessel function of the second kind of order zero. We have
$$E\{\hat{z}\} = 0, \qquad \operatorname{var}\{\hat{z}\} = \sigma_x^2\sigma_y^2 \qquad (54)$$
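Equation (54) is easily checked by simulation; a minimal sketch with assumed variances:

```python
import numpy as np

# Monte Carlo check of Eq. (54): for independent x ~ N(0, sx^2) and
# y ~ N(0, sy^2), the product z = x y has E{z} = 0 and var{z} = sx^2 sy^2.
rng = np.random.default_rng(1)
sx, sy = 2.0, 0.5
n = 200_000
z = rng.normal(0.0, sx, n) * rng.normal(0.0, sy, n)
print(abs(z.mean()) < 0.02, abs(z.var() - (sx * sy) ** 2) < 0.05)
```

The sample mean and variance agree with Eq. (54) to within the Monte Carlo tolerance; the heavy central peak of the $K_0$ density in Eq. (53) is also visible in a histogram of `z`.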
In order to avoid unnecessary notational complication, we simply replace the varying sine and cosine terms by $\tfrac{1}{2}\sqrt{2}$. The determinant of the matrix equation then becomes equal to unity, and the variances of $\hat{f}(-\varepsilon+h)$ and $\hat{g}(\varepsilon-h)$ are equal to $4h^{-2}$. We obtain from Eq. (49)
$$\hat{f}(-\varepsilon+2h) = h^{-1}\sqrt{2}\,\hat{y}^1(2\varepsilon-2h) + h^{-1}\sqrt{2}\,\hat{y}^2(2\varepsilon-2h) - 2\hat{f}(-\varepsilon+h)\hat{g}(\varepsilon-h) \qquad (55)$$
from which it follows that
$$E\{\hat{f}(-\varepsilon+2h)\} = f(-\varepsilon+2h), \qquad \operatorname{var}\{\hat{f}(-\varepsilon+2h)\} = 4h^{-2} + 64h^{-4} \qquad (56)$$
A similar expression with the same variance holds for $\hat{g}(\varepsilon-2h)$. Because $h \ll 1$ [$h = (2d)^{-1}$], we observe a significant increase in the variance of the estimated values $\hat{f}(-\varepsilon+2h)$ and $\hat{g}(\varepsilon-2h)$. For the determination of $\hat{f}(-\varepsilon+3h)$ and $\hat{g}(\varepsilon-3h)$ (the next step), the cross products $\hat{f}(-\varepsilon+2h)\hat{g}(\varepsilon-h)$ and $\hat{f}(-\varepsilon+h)\hat{g}(\varepsilon-2h)$
are required. These products have a larger variance than the product $\hat{f}(-\varepsilon+h)\hat{g}(\varepsilon-h)$, which leads to a larger variance of $\hat{f}(-\varepsilon+3h)$ and $\hat{g}(\varepsilon-3h)$ compared with the variance of the estimated samples in the previous step of the reconstruction algorithm. This process continues. Ultimately the variance of the estimated samples will have increased so much that the pertinent data values no longer influence the result. It must be concluded that, because of the noise in $\hat{y}^{1,2}(\cdot)$, the algorithm is unstable. The solution thus obtained has an increasing noise variance because of the inherent error propagation. This variance is very much larger than the noise variance of the data, a situation characteristic of so-called ill-posed problems. The conclusion of the above model computation applies to the algorithm of Van Toorn and Ferwerda (1976) and is consistent with the noise sensitivity of the algorithm reported by Van Toorn et al. (1978). The performance of the pertinent algorithm in low-dose imaging of biological material is severely degraded by the noise inherent in low-dose exposures. In the next paragraph an algorithm is presented with negligible error propagation and with stochastic behavior that is still analyzable. The results of this algorithm are, however, only approximate in the noise-free case, which manifests itself in the fact that the algorithm is biased. With noisy data the approximate character of the reconstruction method is not a drawback, as exact results are impossible in this situation.
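The instability can be demonstrated with a toy recursion patterned on Eq. (55): each new estimate mixes fresh unit-variance data with the product of two previous noisy estimates, so the spread of the estimates grows explosively. A minimal sketch (the exact solution is held constant at 1 for simplicity; all numerical values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
h = 0.05                          # sampling distance, h << 1
steps, trials = 6, 5000

# Noise-free data chosen so that the exact solution stays f = g = 1 at
# every step; each datum then receives zero-mean, unit-variance noise.
exact = 3.0 * h / (2.0 * np.sqrt(2.0))

f = np.ones(trials)
g = np.ones(trials)
spread = []
for _ in range(steps):
    prod = f * g
    y1a = exact + rng.normal(0.0, 1.0, trials)
    y1b = exact + rng.normal(0.0, 1.0, trials)
    y2a = exact + rng.normal(0.0, 1.0, trials)
    y2b = exact + rng.normal(0.0, 1.0, trials)
    # toy analogue of Eq. (55): new sample = sqrt(2)/h (y + y') - 2 f g
    f = np.sqrt(2.0) / h * (y1a + y1b) - 2.0 * prod
    g = np.sqrt(2.0) / h * (y2a + y2b) - 2.0 * prod
    spread.append(float(f.std()))

# the standard deviation of the estimates explodes from step to step
print(spread[0] < spread[1] < spread[-1])
```

The first step already has a standard deviation of order $2/h$, in line with Eq. (52) after the $\tfrac{1}{2}\sqrt{2}$ replacement, and each subsequent step roughly squares the previous spread through the product term.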
D. Statistical Analysis of an Approximate Solution

In this paragraph an approximate solution is presented of the fundamental set of coupled integral equations [Eq. (38)]. The solution is approximate in the sense that certain assumptions are made which are not exactly fulfilled. In practical situations where noise is manifest, however, the deviation from the true solution is expected to be smaller than the variance of the approximate solution due to the noise in the data. The method proposed in this section does not suffer from the increasing noise variance of the algorithm in the previous paragraph. With partial integration we obtain from the basic set of Eqs. (38)
$$\hat{c}^j(\xi) = \lambda_s T(2d)^{-1}(-2\pi i\lambda^{-1}D^j\xi)^{-1}\Big\{\psi_q(\varepsilon-\xi)\psi_q^*(\varepsilon)\exp[-2\pi i\lambda^{-1}D^j\xi(\varepsilon-\tfrac{1}{2}\xi)] - \psi_q(-\varepsilon)\psi_q^*(-\varepsilon+\xi)\exp[2\pi i\lambda^{-1}D^j\xi(\varepsilon-\tfrac{1}{2}\xi)] - \int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2}\frac{\partial}{\partial\eta}[\psi_q(-\tfrac{1}{2}\xi+\eta)\psi_q^*(\tfrac{1}{2}\xi+\eta)]\exp(-2\pi i\lambda^{-1}D^j\xi\eta)\,d\eta\Big\} \qquad (57)$$
For a fixed value of $\xi$ we have two equations with the four unknown variables $\psi_q(\varepsilon-\xi)$, $\psi_q^*(-\varepsilon+\xi)$, $\psi_q^*(\varepsilon)$, and $\psi_q(-\varepsilon)$. The latter two occur in the equations for every value of $\xi$ chosen, a situation like that in the algorithm of the previous section; therefore $\psi_q^*(\varepsilon)$ and $\psi_q(-\varepsilon)$ are treated in the same way. Together with the unknown terms, an integral also appears in Eqs. (57). The contribution due to this integral will be examined in greater detail. In the algorithm of Van Toorn and Ferwerda (1976) a similar term was responsible for the error propagation. The approach here is geared to controlling the noise amplification by reducing the influence of the integral term with the help of an extra exposure. In order to determine the contribution of the integral to Eq. (57), we first make the new assumption that the influence of the spherical aberration ($C_s$) is negligible. A large value of $\varepsilon$ (high-resolution imaging) corresponds to a small sampling cell in the image and therefore, for a given electron dose, to a low signal-to-noise ratio. The dominance of shot noise can be mitigated by choosing a smaller value of $\varepsilon$, which leads to a larger sampling cell with better statistics at the expense of spatial resolution. The practical reason for working at medium resolution motivates the neglect of $C_s$, as is shown in Fig. 5. Neglecting the spherical aberration $C_s$, we have from Eqs. (23), (24), and (34)
$$\psi_q(\xi) = \int_{-d}^{d} dx_0\,\psi_0(x_0)\exp(-2\pi i\xi x_0), \qquad -\varepsilon \le \xi \le \varepsilon \qquad (58)$$
Spatial frequencies of $\psi_0(\cdot)$ up to $\varepsilon$ contribute to the image contrast. In the case of noisy data we refrain from reconstructing the object wave function $\psi_0(\cdot)$ in all its spatial frequencies (bandwidth extrapolation), which would lead to a very high noise variance in the reconstruction result. Instead, we only reconstruct those spatial frequencies of the object wave function which contribute to the observed image contrast. This results in a low-pass filtered approximation $\tilde\psi_0(\cdot)$ to the original object wave function $\psi_0(\cdot)$
$$\tilde\psi_0(x_0) = \int_{-\varepsilon}^{\varepsilon} d\xi\,\exp(2\pi i\xi x_0)\int_{-d}^{d} dx_0'\,\psi_0(x_0')\exp(-2\pi i\xi x_0') \qquad (59)$$
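The band limitation described above amounts to zeroing all Fourier coefficients beyond the transmitted cutoff. A minimal one-dimensional sketch (the signal and cutoff are arbitrary illustrative choices):

```python
import numpy as np

def lowpass(signal, keep):
    """Zero all FFT coefficients with |frequency index| > keep."""
    F = np.fft.fft(signal)
    idx = np.fft.fftfreq(len(signal), d=1.0 / len(signal))  # integer indices
    F[np.abs(idx) > keep] = 0.0
    return np.fft.ifft(F).real

n = 128
x = np.arange(n) / n
# object with one in-band and one out-of-band component (illustrative)
obj = np.sin(2 * np.pi * 3 * x) + 0.5 * np.sin(2 * np.pi * 40 * x)
approx = lowpass(obj, keep=10)

# only the slow component survives the band limitation
print(np.allclose(approx, np.sin(2 * np.pi * 3 * x), atol=1e-8))
```

The retained coefficients are untouched, so in-band structure is reproduced exactly; everything beyond the cutoff is discarded, which is the price paid for avoiding the noise amplification of bandwidth extrapolation.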
With Eq. (59), $\psi_q(\cdot)$ in Eq. (58) can be represented by
$$\psi_q(\xi) = \sum_{k=-N/4}^{N/4-1}\tilde\psi_0(k/2\varepsilon)\exp[-2\pi i(2\varepsilon)^{-1}k\xi] \qquad (60)$$
FIG. 5. The influence of neglecting $C_s$ in the aberration function $\gamma(\cdot)$ [cf. Eq. (24)]. (a) Real part of $\exp[-i\gamma(\cdot)]$; solid line: $\cos[\gamma(\cdot)]$, dashed line: $\cos(2\pi\lambda^{-1}\tfrac{1}{2}D\xi^2)$. (b) Imaginary part of $\exp[-i\gamma(\cdot)]$; solid line: $-\sin[\gamma(\cdot)]$, dashed line: $-\sin(2\pi\lambda^{-1}\tfrac{1}{2}D\xi^2)$. Microscope parameters: $C_s = 1.6$ mm, $\varepsilon = 0.5 \times 10^{-2}$, $D = 160$ nm, $\lambda = 4$ pm.
With Eq. (60) the contribution of the integral term $I(\xi; D)$ to Eq. (57) can be evaluated, where $I(\xi; D)$ is defined by
$$I(\xi; D) = \int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2}\frac{\partial}{\partial\eta}[\psi_q(-\tfrac{1}{2}\xi+\eta)\psi_q^*(\tfrac{1}{2}\xi+\eta)]\exp(-2\pi i\lambda^{-1}D\xi\eta)\,d\eta \qquad (61)$$
which is equivalent to
$$I(\xi; D) = -\sum_k\sum_{k'}\tilde\psi_0(k/2\varepsilon)\tilde\psi_0^*(k'/2\varepsilon)\,2\pi i(2\varepsilon)^{-1}(k-k')\exp[2\pi i(4\varepsilon)^{-1}(k+k')\xi]\int_{-(\varepsilon-\xi/2)}^{\varepsilon-\xi/2}\exp\{-2\pi i[(2\varepsilon)^{-1}(k-k') + \lambda^{-1}D\xi]\eta\}\,d\eta$$
$$= -2i\sum_k\sum_{k'}\tilde\psi_0(k/2\varepsilon)\tilde\psi_0^*(k'/2\varepsilon)(2\varepsilon)^{-1}(k-k')\exp[2\pi i(4\varepsilon)^{-1}(k+k')\xi]\,\frac{\sin\{2\pi[(2\varepsilon)^{-1}(k-k') + \lambda^{-1}D\xi](\varepsilon-\tfrac{1}{2}\xi)\}}{(2\varepsilon)^{-1}(k-k') + \lambda^{-1}D\xi} \qquad (62)$$
The computational strategy in this section is to avoid error propagation by reducing the influence of already reconstructed sample points of the wave function on the reconstruction process. With an extra exposure these cumulative effects, as expressed in the integral term $I(\xi; D)$, can be eliminated for the greater part. Taking three exposures with defocusing parameters $D^1$, $D^2$, and $D^3$, respectively, we obtain from Eqs. (57) and (61), setting both $\psi_q(-\varepsilon)$ and $\psi_q^*(\varepsilon)$ equal to unity,
$$-2d(\lambda_s T)^{-1}2\pi i\lambda^{-1}D^{(j)}\xi\,\hat{c}^{(j)}(\xi) = \psi_q(\varepsilon-\xi)\exp[-2\pi i\lambda^{-1}D^{(j)}\xi(\varepsilon-\tfrac{1}{2}\xi)] - \psi_q^*(-\varepsilon+\xi)\exp[2\pi i\lambda^{-1}D^{(j)}\xi(\varepsilon-\tfrac{1}{2}\xi)] - I(\xi; D^{(j)}), \qquad j = 1, 2, 3 \qquad (63)$$
The integral term [Eq. (62)] can be written
$$I(\xi; D) = -2\pi i\sum_k\sum_{k'}\tilde\psi_0(k/2\varepsilon)\tilde\psi_0^*(k'/2\varepsilon)\exp[2\pi i(4\varepsilon)^{-1}(k+k')\xi]\,\frac{\sin\{2\pi[(2\varepsilon)^{-1}(k-k') + \lambda^{-1}D\xi](\varepsilon-\tfrac{1}{2}\xi)\}}{\pi[1 + (k-k')^{-1}2\varepsilon\lambda^{-1}D\xi]} \qquad (64)$$
whereby terms with $k = k'$ are omitted from the summation, because these terms are zero. When the defocusing parameters are chosen according to
$$D^1 = D - \Delta D, \qquad D^2 = D, \qquad D^3 = D + \Delta D \qquad (65)$$
with the practical values $D = 80$ nm and $\Delta D = 40$ nm, we observe that the denominator of Eq. (64) hardly depends on the different $D$ values, especially for larger values of $k - k'$. The same applies to the sine term in Eq. (64), which is nearly the same for the three $D$ values. We conclude that the integral term $I(\xi; D)$ is to a good approximation the same for the three exposures, and we can eliminate this term from Eq. (63). Subtracting the equation with the defocusing $D^2$ from the equations corresponding to $D^1$ and $D^3$, respectively, we obtain the following set of equations
$$-2d(\lambda_s T)^{-1}2\pi i\lambda^{-1}\xi[D^1\hat{c}^1(\xi) - D^2\hat{c}^2(\xi)] = \psi_q(\varepsilon-\xi)\{\exp[-2\pi i\lambda^{-1}D^1\xi(\varepsilon-\tfrac{1}{2}\xi)] - \exp[-2\pi i\lambda^{-1}D^2\xi(\varepsilon-\tfrac{1}{2}\xi)]\} - \psi_q^*(-\varepsilon+\xi)\{\exp[2\pi i\lambda^{-1}D^1\xi(\varepsilon-\tfrac{1}{2}\xi)] - \exp[2\pi i\lambda^{-1}D^2\xi(\varepsilon-\tfrac{1}{2}\xi)]\}$$
$$-2d(\lambda_s T)^{-1}2\pi i\lambda^{-1}\xi[D^3\hat{c}^3(\xi) - D^2\hat{c}^2(\xi)] = \psi_q(\varepsilon-\xi)\{\exp[-2\pi i\lambda^{-1}D^3\xi(\varepsilon-\tfrac{1}{2}\xi)] - \exp[-2\pi i\lambda^{-1}D^2\xi(\varepsilon-\tfrac{1}{2}\xi)]\} - \psi_q^*(-\varepsilon+\xi)\{\exp[2\pi i\lambda^{-1}D^3\xi(\varepsilon-\tfrac{1}{2}\xi)] - \exp[2\pi i\lambda^{-1}D^2\xi(\varepsilon-\tfrac{1}{2}\xi)]\} \qquad (66)$$
With Eq. (65) the determinant of Eq. (66) becomes
$$\det(D, \Delta D) = 2i\sin[2\pi\lambda^{-1}2\Delta D\,\xi(\varepsilon-\tfrac{1}{2}\xi)] - 4i\sin[2\pi\lambda^{-1}\Delta D\,\xi(\varepsilon-\tfrac{1}{2}\xi)], \qquad 0 < \xi < 2\varepsilon \qquad (67)$$
This determinant is presented in Fig. 6 as a function of $\xi$. Solving the set (66) for $\psi_q(\varepsilon-\xi)$ yields
$$\psi_q(\varepsilon-\xi) = \{2i\sin[2\pi\lambda^{-1}2\Delta D\,\xi(\varepsilon-\tfrac{1}{2}\xi)] - 4i\sin[2\pi\lambda^{-1}\Delta D\,\xi(\varepsilon-\tfrac{1}{2}\xi)]\}^{-1}\big\{-2d(\lambda_s T)^{-1}2\pi i\lambda^{-1}\xi[(D-\Delta D)\hat{c}^1(\xi) - D\hat{c}^2(\xi)]\{\exp[2\pi i(D+\Delta D)\xi(\varepsilon-\tfrac{1}{2}\xi)/\lambda] - \exp[2\pi iD\xi(\varepsilon-\tfrac{1}{2}\xi)/\lambda]\} + 2d(\lambda_s T)^{-1}2\pi i\lambda^{-1}\xi[(D+\Delta D)\hat{c}^3(\xi) - D\hat{c}^2(\xi)]\{\exp[2\pi i(D-\Delta D)\xi(\varepsilon-\tfrac{1}{2}\xi)/\lambda] - \exp[2\pi iD\xi(\varepsilon-\tfrac{1}{2}\xi)/\lambda]\}\big\} \qquad (68)$$
The object wave function $\tilde\psi_0(\cdot)$ follows from $\psi_q(\cdot)$ by means of a Fourier transform. Remembering that $\xi$ is in fact a discrete variable $l/2d$, $l = \{-\tfrac{1}{4}N, \ldots, \tfrac{1}{4}N - 1\}$, we have
$$\tilde\psi_0(k/2\varepsilon) = (\tfrac{1}{2}N)^{-1}\sum_{l=-N/4}^{N/4-1}\psi_q(l/2d)\exp[2\pi i(\tfrac{1}{2}N)^{-1}kl] \qquad (69)$$
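Numerically, Eq. (69) is a single inverse discrete Fourier transform of the reconstructed pupil samples. A sketch with `numpy`, where the pupil samples are arbitrary stand-ins for the output of the reconstruction and the sample count `M` stands in for $\tfrac{1}{2}N$:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 32                                  # number of pupil samples psi_q(l/2d)

# stand-in for the reconstructed pupil samples (normally the output of
# the Cramer's-rule solution of Eq. (66))
psi_q = rng.normal(size=M) + 1j * rng.normal(size=M)

l = np.arange(-M // 2, M // 2)
k = np.arange(-M // 2, M // 2)

# Eq. (69): psi0(k/2eps) = M^{-1} sum_l psi_q(l/2d) exp[+2 pi i k l / M]
kernel = np.exp(2j * np.pi * np.outer(k, l) / M)
psi0 = kernel @ psi_q / M

# unitarity check: the forward transform returns the pupil samples
back = np.exp(-2j * np.pi * np.outer(l, k) / M) @ psi0
print(np.allclose(back, psi_q))
```

Because the transform pair is unitary (up to the $M^{-1}$ normalization), no additional noise is introduced by this step; the variance of $\tilde\psi_0(\cdot)$ is set entirely by the variance of the $\psi_q(\cdot)$ samples.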
From Fig. 6 it is to be expected that for values of $\xi$ near $\xi = 0$ and $\xi = 2\varepsilon$ the reconstructed $\psi_q(\cdot)$ will have a very large variance, as the determinant Eq. (67) is close to zero. For this reason it is advisable to sacrifice some bandwidth and omit these $\psi_q(\cdot)$ values from the reconstruction of $\tilde\psi_0(\cdot)$. In the
FIG. 6. The determinant of Eq. (67) of the set of Eq. (66) for the case of Eq. (65), with the microscope parameters $D = 160$ nm, $\Delta D = 80$ nm, $\varepsilon = 0.5 \times 10^{-2}$, $\lambda = 4$ pm.
case of Fig. 6 we would take, say, $1.0 \times 10^{-3} \le \xi \le 9.0 \times 10^{-3}$. We now turn to the statistical properties of the reconstructed object wave function. A complete statistical characterization of the reconstruction result of Eq. (69) together with Eq. (68) requires the determination of the probability density function of $\tilde\psi_0(\cdot)$. This is a very complicated and elaborate task. Therefore we only consider here the first two moments of the probability density function in question. With respect to the first-order moment, we easily obtain from $E\{\hat{c}(\xi)\} = c(\xi)$ that
$$E\{\tilde\psi_0(k/2\varepsilon)\} = \tilde\psi_0(k/2\varepsilon) \qquad (70)$$
Two remarks must be made with respect to Eq. (70). In the first place, we have, according to Eq. (26), $\hat{n}_k$ being an integer, a finite number of significant decimals in $E\{\hat{c}(\xi)\}$. Hence $E\{\tilde\psi_0(\cdot)\}$ has a finite number of significant decimals also. It is in this respect that Eq. (70) is valid. Secondly, the set of equations (66) is an approximation, as the integral term $I(\xi; D)$ of Eq. (64) is not exactly equal for the three $D$ values of Eq. (65). This may result in a bias term in the reconstruction of $\tilde\psi_0(\cdot)$. The bias is, however, small in comparison with the variance of $\tilde\psi_0(\cdot)$, which is calculated in the following. The Fourier transform [Eq. (27)] resulted in correlated random variables $\hat{c}(\xi_1)$ and $\hat{c}(\xi_2)$, $\xi_1 \neq \xi_2$. The inverse Fourier transform [Eq. (69)] is now expected to be a decorrelating operator, resulting in uncorrelated random variables $\tilde\psi_0(k/2\varepsilon)$, $k = \{-\tfrac{1}{4}N, \ldots, \tfrac{1}{4}N - 1\}$. For the calculation of the variance of the reconstructed $\tilde\psi_0(\cdot)$ from Eqs. (69) and (70), we use that $\operatorname{var}\{\hat{c}(\xi)\} = \sum_k\lambda_k$, which is equal to $\lambda_s T$. This results in
$$\operatorname{var}\{\tilde\psi_0(k/2\varepsilon)\} = (\tfrac{1}{2}N')^{-2}(\lambda_s T)^{-1}\sum_{l'}\big\{2\sin[2\pi\lambda^{-1}2\Delta D(2d)^{-1}l'(\varepsilon-\tfrac{1}{2}(2d)^{-1}l')] - 4\sin[2\pi\lambda^{-1}\Delta D(2d)^{-1}l'(\varepsilon-\tfrac{1}{2}(2d)^{-1}l')]\big\}^{-2}(2\pi\lambda^{-1}l')^2\big\{[(D-\Delta D)^2 + D^2][2 - 2\cos(2\pi\lambda^{-1}\Delta D(2d)^{-1}l'[\varepsilon-\tfrac{1}{2}(2d)^{-1}l'])] + [(D+\Delta D)^2 + D^2][2 - 2\cos(2\pi\lambda^{-1}\Delta D(2d)^{-1}l'[\varepsilon-\tfrac{1}{2}(2d)^{-1}l'])]\big\}, \qquad k = -\tfrac{1}{4}N, \ldots, \tfrac{1}{4}N - 1 \qquad (71)$$
where $l'$ is chosen so as to avoid the small values of the determinant around $\xi = 0$ and $\xi = 2\varepsilon$, e.g., $l' = \{-\tfrac{1}{4}N', \ldots, \tfrac{1}{4}N' - 1\}$ with $N' = 0.9N$ in the case of the determinant of Fig. 6. From Eq. (71) we observe that the variance is independent of the sample value $k/2\varepsilon$. Furthermore, the larger the applied electron dose $\lambda_s T$, the smaller the variance in $\tilde\psi_0(\cdot)$, as is to be expected.
III. WAVE-FUNCTION RECONSTRUCTION OF WEAK SCATTERERS

A. Introductory Remarks
The imaging of the substructure of biological specimens by means of an electron microscope is greatly limited by the radiation sensitivity of these objects. Because of this sensitivity, the number of interacting electrons incident on the specimen has to be minimized in order to reduce radiation damage. In low-dose imaging, however, the contrast is very noisy. Because of this poor signal-to-noise ratio, the evaluation of nonperiodic object structures in particular is very cumbersome. In Section I,G the stochastic process that governs low-dose image formation was analyzed. This provided the basis for the discussion in Section II of the reconstruction of the object wave function from two defocused images obtained with axial illumination. The discussion was presented for one lateral spatial dimension only. In this section the same problem is discussed, but this time for the special case of weakly scattering objects. A two-dimensional analysis is presented; the first part of this section is based on axial illumination, while the second part employs tilted-beam illumination. Tilted-beam imaging has been the subject of numerous papers (see, for example, Hawkes, 1978, 1980a). This imaging mode is related to holography (see, e.g., Wade, 1980). We assume that the apertures of the microscope are square. The apertures of a microscope are in fact circular, but this geometry complicates the mathematical description and does not lead to deeper insight. The object wave function is estimated from two defocused low-dose images (Section III,B) or from low-dose images obtained with different directions of the oblique illumination (Section III,C). The stochastic properties of the recorded images are taken into account, and the probability density function of the reconstructed object wave function is given in terms of the operating parameters of the microscope. We assume coherent (quasimonochromatic) illumination.
The inelastically scattered electrons are supposed to be removed from the imaging process by means of an appropriate energy filter lens. Only the elastically scattered electrons contribute to the image contrast. B. Axial Illumination
A scheme of the optical system is shown in Fig. 4, where the diaphragms are taken to be symmetrical with respect to the optical axis. The image formation is described by specifying the relations between the electron wave function in the three planes of Fig. 4: object plane, exit pupil, and image plane.
Under the assumption that the optical system is isoplanatic, the image wave function $\psi(\cdot,\cdot)$ is related to the object wave function $\psi_0(\cdot,\cdot)$ by
$$\psi(x,y) = \int_{-d}^{d}\int_{-d}^{d} dx_0\,dy_0\,K(x_0-x, y_0-y)\,\psi_0(x_0,y_0) \qquad (72)$$
where $K(\cdot,\cdot)$ is given by [see, for example, Hawkes (1980b)]
$$K(x_0-x, y_0-y) = \int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\exp\{-i\gamma(\xi,\eta) - 2\pi i[(x_0-x)\xi + (y_0-y)\eta]\} \qquad (73)$$
The wave-aberration function $\gamma(\cdot,\cdot)$ contains the spherical aberration (with coefficient $C_s$) and the defocus (with coefficient $D$) of the optical system
$$\gamma(\xi,\eta) = 2\pi\lambda^{-1}[\tfrac{1}{4}C_s(\xi^2+\eta^2)^2 + \tfrac{1}{2}D(\xi^2+\eta^2)] \qquad (74)$$
where $\lambda$ denotes the wavelength of the accelerated electrons. In Eq. (74) higher-order aberrations such as coma and astigmatism are neglected. In the object plane $x_0$ and $y_0$ are measured in units of $\lambda$; in the exit pupil $\xi$ and $\eta$ are expressed in units of the (back) focal length $f$ of the imaging system. In the image plane $x$ and $y$ are measured in units of $M\lambda$, where $M$ is the (lateral) magnification of the optical system. The interaction between the illuminating electron beam and the specimen is now described in a simple model. The illuminating beam is represented by a coherent (quasimonochromatic) one-electron wave function, for which we take a plane wave $\exp(ikz)$ propagating along the optical $z$ axis with wave number $k = 2\pi/\lambda$. The object wave function is expressed by
$$\psi_0(x_0,y_0) = \exp[i\alpha(x_0,y_0) - \beta(x_0,y_0)] \qquad (75)$$
Apart from the two lateral dimensions, the formalism in this section has so far been identical to that of the previous section. However, as this section is restricted to (thin) weakly scattering objects, differences will arise that change the formalism completely. In the case of a weakly scattering specimen, Eq. (75) is approximated by
$$\psi_0(x_0,y_0) = 1 + i\alpha(x_0,y_0) - \beta(x_0,y_0) \qquad (76)$$
Together with Eqs. (72) and (73), we obtain for the image wave function from Eq. (76)
$$\psi(x,y) = 1 + \int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi'\,d\eta'\,\psi_p(\xi',\eta')\,\frac{\sin[2\pi d(\xi-\xi')]}{\pi(\xi-\xi')}\,\frac{\sin[2\pi d(\eta-\eta')]}{\pi(\eta-\eta')}\exp[-i\gamma(\xi,\eta) + 2\pi i(x\xi+y\eta)] \qquad (77)$$
where $\psi_p(\cdot,\cdot)$ denotes the wave function in the exit pupil, which is defined as
$$\psi_p(\xi,\eta) = \int_{-d}^{d}\int_{-d}^{d} dx_0\,dy_0\,\exp[-2\pi i(\xi x_0 + \eta y_0)][i\alpha(x_0,y_0) - \beta(x_0,y_0)] \qquad (78)$$
The constant $d$ is numerically large, since it corresponds to the aperture in the object plane measured in units of the electron wavelength $\lambda$. Therefore the two sinc functions in Eq. (77) can be approximated by $\delta$ functions. This results in
$$\psi(x,y) = 1 + \int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\psi_p(\xi,\eta)\exp[-i\gamma(\xi,\eta) + 2\pi i(x\xi+y\eta)] \qquad (79)$$
The squared modulus of the image wave function is of great importance for the image formation. Neglecting the term quadratic in $\psi_p(\cdot,\cdot)$, we obtain
$$\psi(x,y)\psi^*(x,y) = 1 + \int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\psi_p(\xi,\eta)\exp[-i\gamma(\xi,\eta) + 2\pi i(x\xi+y\eta)] + \text{c.c.} \qquad (80)$$
where c.c. denotes the complex conjugate of the preceding term. In the weak-object approximation of Eq. (76) we have neglected terms of higher order in $\alpha(\cdot,\cdot)$ and $\beta(\cdot,\cdot)$, while in Eq. (80) we assume that the integral which is quadratic in $\alpha(\cdot,\cdot)$ and $\beta(\cdot,\cdot)$ is negligible in comparison with the integrals which are linear in $\alpha(\cdot,\cdot)$ and $\beta(\cdot,\cdot)$. The assumption of Eq. (80) is not quite the same as the assumption in Eq. (76). Substituting Eq. (78) into (80) and using the symmetry properties of Fourier transforms of real functions, we obtain the well-known result (see, for example, Hawkes, 1980b)
$$\psi(x,y)\psi^*(x,y) = 1 + 2\int_{-d}^{d}\int_{-d}^{d} dx_0\,dy_0\,\alpha(x_0,y_0)\int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\sin[\gamma(\xi,\eta)]\exp\{2\pi i[(x-x_0)\xi + (y-y_0)\eta]\} - 2\int_{-d}^{d}\int_{-d}^{d} dx_0\,dy_0\,\beta(x_0,y_0)\int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\cos[\gamma(\xi,\eta)]\exp\{2\pi i[(x-x_0)\xi + (y-y_0)\eta]\} \qquad (81)$$
In the deterministic (noise-free) case the image intensity distribution is proportional to Eq. (81). This situation applies when the total number of electrons involved in the image formation is infinite. In low-dose imaging the recorded image intensity is a realization of a stochastic process. The following briefly outlines the characterization of this stochastic process.
1. Low-Dose Image Recording

The stochastic process that characterizes the low-dose image has been described extensively in Section I,G. We thus only recapitulate the main properties here. In low-dose imaging the emissions of electrons by the source are statistically independent events. Therefore the total number of electrons $\hat{n}_T$ emitted during the exposure time $T$ is a random variable distributed according to the Poisson distribution
$$P\{\hat{n}_T = k\} = \exp(-\lambda_s T)(\lambda_s T)^k/k!, \qquad k = 0, 1, 2, \ldots \qquad (82)$$
where the source intensity $\lambda_s$ is the mean number of electron emissions per second. The image intensity is assumed to be recorded in the following idealized way. The image plane is divided into a large number $N^2$ of identical nonoverlapping squares. It is assumed that each image cell exactly counts all the electrons arriving at the cell. Consequently a recorded image consists of an $N \times N$ array of independent random counts $\hat{n}_{k,l}$, $(k,l) = \{1, \ldots, N\}$. In low-dose imaging the probability that an electron which is emitted by the source will arrive in the $(k,l)$th image cell is given by
$$p_{k,l} = \int\!\!\int_{a_{k,l}} dx\,dy\,\psi(x,y)\psi^*(x,y) \qquad (83)$$
where $a_{k,l}$ denotes the area of the $(k,l)$th image cell. The recorded image is a realization of a stochastic Poisson process, characterized by
$$P\{\hat{n}_{k,l} = n_{k,l};\ k, l = 1, \ldots, N\} = \prod_{k=1}^{N}\prod_{l=1}^{N}\exp(-\lambda_{k,l})\,\lambda_{k,l}^{n_{k,l}}/n_{k,l}! \qquad (84)$$
The random counts $\hat{n}_{k,l}$ are Poisson-distributed random variables with the parameter
$$\lambda_{k,l} = \lambda_s T p_{k,l} \qquad (85)$$
The image wave function is a band-limited function of bandwidth $\varepsilon$. In Eq. (80) only the linear terms have been taken into account; in this approximation the image intensity has bandwidth $\varepsilon$ also. Applying Whittaker-Shannon sampling to the image results in $N^2$ image cells, with the Shannon number $N$ equal to $4d\varepsilon$. In the next subsection the consequences of the noise in the data for the reconstruction of $\alpha(\cdot,\cdot)$ and $\beta(\cdot,\cdot)$ are treated in detail.
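The idealized recording model of Eqs. (82)-(85) can be simulated directly: the cell counts are independent Poisson variables with parameters $\lambda_s T p_{k,l}$, or, equivalently, a Poisson-distributed total scattered multinomially over the cells. A sketch with assumed dose and intensity values:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 32                         # image is an N x N array of cells
dose = 5.0e4                   # mean number of emitted electrons (assumed)

# assumed noise-free intensity, normalized to cell probabilities p_{k,l}
yy, xx = np.mgrid[0:N, 0:N] / N
intensity = 1.0 + 0.2 * np.cos(2 * np.pi * 3 * xx) * np.cos(2 * np.pi * 2 * yy)
p = intensity / intensity.sum()

# route 1: independent Poisson counts per cell, Eqs. (84)-(85)
counts = rng.poisson(dose * p)

# route 2: Poisson total, Eq. (82), scattered multinomially via Eq. (83)
total = rng.poisson(dose)
pv = p.ravel() / p.ravel().sum()
counts2 = rng.multinomial(total, pv).reshape(N, N)

# both recordings have expectation dose * p
print(counts.shape == counts2.shape == (N, N))
```

The equivalence of the two routes is the standard thinning property of the Poisson process, and it is what justifies analyzing the cells independently in the text.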
2. Object Wave Reconstruction
We will determine the object wave function from two low-dose images with different defocusing parameters $D$. The Poisson-distributed integer data values will be transformed into new random variables $\hat{s}$ by subtracting the background intensity and scaling, as follows
$$\hat{s}_{k,l} = N^2(\lambda_s T)^{-1}\hat{n}_{k,l} - 1 \qquad (86)$$
With Eq. (86) a representation of the recorded image is obtained with the sample values $\hat{s}(k/2\varepsilon, l/2\varepsilon)$, which are denoted by $\hat{s}_{k,l}$. The data $\hat{n}_{k,l}$ are integers; therefore the number of significant decimals in $\hat{s}_{k,l}$ is governed by the value of $N^2(\lambda_s T)^{-1}$. Further decimals are beyond retrieval and are discarded here. For the (mathematical) expectation value $s_{k,l}$ of $\hat{s}_{k,l}$, we have
$$E\{\hat{s}_{k,l}\} = s_{k,l} = N^2\int\!\!\int_{a_{k,l}} dx\,dy\Big\{\int_{-\varepsilon}^{\varepsilon}\int_{-\varepsilon}^{\varepsilon} d\xi\,d\eta\,\psi_p(\xi,\eta)\exp[-i\gamma(\xi,\eta) + 2\pi i(x\xi+y\eta)] + \text{c.c.}\Big\} \qquad (87)$$
where $a_{k,l}$ is the area of the $(k,l)$th image cell, which is given by $(2\varepsilon)^{-1}k - (4\varepsilon)^{-1} \le x \le (2\varepsilon)^{-1}k + (4\varepsilon)^{-1}$, $(2\varepsilon)^{-1}l - (4\varepsilon)^{-1} \le y \le (2\varepsilon)^{-1}l + (4\varepsilon)^{-1}$.
The Fourier transform $\hat{c}(\xi,\eta)$ of the transformed data values is defined by
$$\hat{c}(\xi,\eta) = \sum_k\sum_l\hat{s}_{k,l}\exp[2\pi i(2\varepsilon)^{-1}(k\xi + l\eta)] \qquad (88)$$
For convenience the function $\hat{c}(\xi,\eta)$ is defined for a continuum of values $\xi$ and $\eta$. In actual practice Eq. (88) will be calculated using a fast Fourier transform (FFT) algorithm, which yields $\hat{c}(\xi_p,\eta_q)$ for a discrete set of $\xi_p$ and $\eta_q$ values. The function $\hat{c}(\cdot,\cdot)$ is a complex stochastic function, the properties of which are described in Appendix B. As discussed in that appendix, the discrete representation of $\hat{c}(\cdot,\cdot)$ has the advantage of consisting of uncorrelated random variables. When explicit calculations need to be performed with $\hat{c}(\cdot,\cdot)$, we shall return to this discrete representation. Appendix B shows that, to a good approximation (for not too few electrons per sample cell), the probability distribution of $\hat{c}(\cdot,\cdot)$ is complex Gaussian, with its mean equal to the true, deterministic function and a (nearly) constant variance $N^4(\lambda_s T)^{-1}$. This variance is a microscope parameter, because its value depends only on the illumination dose and the resolution (i.e., the area of the sample cell). With Eq. (87) we can write for the expectation value of $\hat{c}(\cdot,\cdot)$, by carrying out the integrations over $x$ and $y$ and evaluating the summations over $k$ and $l$,
$$E\{\hat{c}(\xi,\eta)\} = N^2(2\varepsilon)^{-2}\int\!\!\int d\xi''\,d\eta''\Big\{\psi_p(\xi'',\eta'')\exp[-i\gamma(\xi'',\eta'')]\exp[-\pi i(2\varepsilon)^{-1}(\xi+\xi''+\eta+\eta'')]\frac{\sin[2\pi(2\varepsilon)^{-1}\tfrac{1}{2}N(\xi+\xi'')]}{\sin[\pi(2\varepsilon)^{-1}(\xi+\xi'')]}\frac{\sin[2\pi(2\varepsilon)^{-1}\tfrac{1}{2}N(\eta+\eta'')]}{\sin[\pi(2\varepsilon)^{-1}(\eta+\eta'')]} + \psi_p^*(\xi'',\eta'')\exp[i\gamma(\xi'',\eta'')]\exp[-\pi i(2\varepsilon)^{-1}(\xi-\xi''+\eta-\eta'')]\frac{\sin[2\pi(2\varepsilon)^{-1}\tfrac{1}{2}N(\xi-\xi'')]}{\sin[\pi(2\varepsilon)^{-1}(\xi-\xi'')]}\frac{\sin[2\pi(2\varepsilon)^{-1}\tfrac{1}{2}N(\eta-\eta'')]}{\sin[\pi(2\varepsilon)^{-1}(\eta-\eta'')]}\Big\} \qquad (89)$$
Because the aperture in the object plane is large in comparison with the wavelength of the imaging electrons, we have
$$\frac{\sin[2\pi(2\varepsilon)^{-1}\tfrac{1}{2}N(\xi-\xi'')]}{\sin[\pi(2\varepsilon)^{-1}(\xi-\xi'')]} = \frac{\sin[2\pi d(\xi-\xi'')]}{\sin[\pi(2\varepsilon)^{-1}(\xi-\xi'')]} = 2\varepsilon\,\delta(\xi-\xi'') \qquad (90)$$
The integrations in Eq. (89) can be carried out with Eq. (90), leading to
$$E\{\hat{c}(\xi,\eta)\} = N^2(2\varepsilon)^{-2}\{\psi_p(-\xi,-\eta)\exp[-i\gamma(\xi,\eta)] + \psi_p^*(\xi,\eta)\exp[i\gamma(\xi,\eta)]\} \qquad (91)$$
where the resulting sinc functions have been set equal to unity
$$\frac{\sin[2\pi(4\varepsilon)^{-1}\xi]}{\pi(2\varepsilon)^{-1}\xi} \approx 1, \qquad \frac{\sin[2\pi(4\varepsilon)^{-1}\eta]}{\pi(2\varepsilon)^{-1}\eta} \approx 1 \qquad (92)$$
The result in Eq. (91) will be used for the determination of the object wave function. Just as in the previous section, the expectation value of $\hat{c}(\cdot,\cdot)$ is not known; instead we must use $\hat{c}(\cdot,\cdot)$ itself. For the case of weakly scattering objects we have obtained in Eq. (91) a direct relation between $\hat{c}(\cdot,\cdot)$ and the wave function in the exit pupil. The corresponding equation in the previous section is an integral equation, which is much more difficult to handle.
236
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
In terms of the functions a(.,.) and from Eq. (91) with Eq. (78) at
fi(.,.) to be reconstructed, we arrive
~ { ? ( ( , q )=N'0;~2sin[y(<,q)] }
-N2oO22COSCY(5,v)I
dJ'o expC2ni(tx, + VY,)l P(x0 Y o ) 3
(93) For the reconstruction of a(.,.) and b(-,-)two exposures with different defocusing parameters D are made. The quantities related to the two exposures are denoted by the superscripts ( I ) and ( 2 ) , respectively. By means of elimination we obtain straightforwardly from Eq. (93) with the two exposures leading to ?'(-,.) and t 2 (.)- ,
From Eq. (74) we obtain for the determinant with D 1 sin[;'*(t, q ) - ~
' ( 5q ,) ] = sin[nX'
~
D 2 = AD
A D ( t 2 + q')]
(95)
Equations (94) are the basis for the statistical estimation of the functions a(.,-) and b(.,-),which contain the information about the specimen in question. We now return to the discrete representation of ?(-,-) and denote the sample value ?(k'/2d, / ' / 2 d ) by E k f , l , . The discrete inverse Fourier transform which corresponds to Eq. (88) is represented by
The inverse Fourier transform Eq. (96) is applied to Eq. (94) in the following analysis. For the left-hand side of the first equation of Eq. (94)
IMAGE HANDLING IN ELECTRON MICROSCOPY
237
we obtain
(97) where the summation over k' is approximated as follows
c
N/2 - 1
exp[2zi(2d)-'(xb
-
x,)k']
k'= -N/Z
= exp[-zi(2d)-'(xb
N
2d
-
xo)]
sin[2n(2d)-'+N(xb - x o ) j sin[n(2d)-'(xb - x,,)]
sin[2m(x0 -- xb)] n(x0 - Xb)
and the summation over I' is approximated by a similar expression. In the expression on the right-hand side of Eq. (97) frequency components of a ( - ,.) above E are filtered out (low-pass filter equation). In general the functions a ( - , - ) and b(.,.) are not band limited. The images do not, however, contain information about frequency components beyond E . Although in principle bandwidth extrapolation can be applied, it is not widely used because of the severe drawback of the considerable increase of the noise variance. Therefore @(.,.) and b(-,.) are represented by band-limited functions denoted by c((-,-) and -), respectively
a(.,
X-
sin[2ne(xo ~XC[X,
-
(2~)-'k)]sin[2n~(y, - (2~)-'1)1 - (2~)-'13 27~~[yo
-(2~)~'kl
(99)
A similar expression applies for the relation between b(.,.) and p(.,.). By substituting Eq. (99) into Eq. (97), and using the orthonormality property of the sinc functions
s
+m
-m
sin{2ne[xo - (2~)-'k]} sin{2m[x0 - (2~)-'k']} dxo = d k , k f 271&[XO- (2e)-'k]7r[xo - (2&)-'k']
(100)
(which applies closely here because d >> l), we obtain the following set of equations [which are the discrete counterparts of Eq. (94)] by which i(-, .) and
238
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
B(-,-)are estimated N/2-l
N/2-1
1
1
!?(k/2~,1/28) = N-2
exp[ -2niN-’(kk‘
+ U’)]
k‘=-N/21‘=-N/2 ( k ’ . l ’ ) # (0.0)
x (2 sinEn2-l AD(2d)-2(k’2+1’2)]}-1 x (cos[y2(k’/2d, 1’/2d)] t:f,l’ -cos[y 1 (k’/2d),1’/2d)]t;.,,.} N/2- 1
E(k/2~,1/2&)=N-’
1
N/2- 1
exp[-2niN-’(kk’+lZ’)]
k‘= -N/21’= -N/2
x
+ lf2)]}-’
{2sin[7tX1
x {sin[yl (k’/2d, 1’/2d)]t;f,1,- sin[y2(k’/2d, 1’/2d)It:,,l,}
(101) In the equation for &(-,.) in Eq. (101) the point k‘ = 1‘ = 0 must be excluded from the summatiqn because in this case the denominator is equal to zero. In the expression for B(., .) this precaution is not necessary because in this case the numerator is also equal to zero for k‘ = 1’ = 0. With Eq. (101) we have to obtained an estimate for the band-limited approximations E(-, -) and the functions in the object wave function a(-,-)and b(.,-),respectively. We now proceed to investigate the stochastic properties of Eqs. (101). Restricting ourselves to the significant decimals of the gk.1 variables, we notice that the statistics of Eq. (91) are unbiased
p(.,.)
{ - i N , . ..,$N
E{@(k/2&, 1/28)) = E(k/2e, 1/2~),
( k ,I )
E{;(k/2~, 1/28)) = p(k/2~,1/28),
( k , 1) = { -$N,. . .,$N - l}
=
-
I} (102)
The variances of the two statistics of Eq. (101) will be calculated separately. We first consider the object Cmplitude function B(.,.). From Eq. (101) we note that the random variable p(.,.) is a weighted sum of Gaussian random variables,-and therefore it is a Gaussian variable likewise. The expectation as is expressed by Eq. (102) and the variance, value of B(.,-) is the true which defines cri, is given by $(.,a)
N/2- 1
var{~(k/2&,1/2~)}=(41,T)-’
1
N/2- 1
1
(sin[ni-’ AD(2d)-2(k’2+1’2)]}-2
k’=-N/2I‘= -N/2
x {sin2[y’(k’/2d, 1’/2d)] +sin2[y2(k’/2d, I’/2d)]}
=(4i,T)-b;,
( k , l ) = { -3N )...,$ N
-
l}
(103)
A
B(.,
From Eq. (103)we see that the variance of .) does not depend on ( k , I). The value of the constant variance (4AST)-’op2 is fully determined by the microscope parameters. From the definition of y(., .) [cf. Eq. (74); we note that
239
IMAGE HANDLING I N ELECTRON MICROSCOPY
every term in the summation is finite provided that AD (cf. Eq. (95)] is chosen in such a way that sin[nL-’ AD(”)-’(k’’ I”)] is zero only in (k’, 1’) = (0,O). For (k‘,1’) = (0,O) the term in the summation in Eq. (103) takes the finite vAalue of (AD)-’(D” + 0’’ ). To summarize, the object amplitude function B(., estimated by Eq. (101), can be described as the sum of the true .) function value plus a signal-independent Gaussian stochastic process with zero mean and constant variance
+
9)
p(-,
~ ( k / 2 ~ , 1 / 2 ~ ) = ~ ( k / 2 ~ , 1 / 2 c ) + N ( O , ( 4 L , T ) -( ’ka, I~) )=,{ - t N , . . . , + N - l } (104) where N(0, a’) denotes a normal distributed random variable with mean equal to zero and a variance of a’. Next we turn to the object phase function ti(-,-). From Eq. (101)we see that $(-,.) is also a weighted sum of Gaussian random variables. Therefore I?(-, -) is a Gaussian random variable, its expectation value is the true ti(., .) value [cf. Eq. (102)], and its variance is given by NI2 - 1
var{$(k/2s, 1/24) =(41,T)-’
N/2-1
C C k’=-N/21’=-N/Z
(sinCn1-l AD(2d)-2(k’2+1’2)]}-2
WJ’) # (0.0)
+ cos2[y1(k’/2d,If/2d)]}, ...,4 N - l}
x {cos2[y2(k’/2d,1’/2d)]
(k,I)={-*N,
(105)
Unlike the situation with the object amplitude function, not every term in the summation in Eq. (105) is finite. If the term belonging to (k’, 1’) = (0,O)had not been excluded from the summation, an infinite variance would have resulted. The variance, Eq. (105), is nevertheless still dominated by the (k’,l’) combinations close to (0,O)where the denominator of Eq. (105) is very small while the numerator is of order unity. The estimated object phase function @(-,.) obtained from Eq. (101) appears to be very sensitive to noise. The variance of :(-, .) is considerably larger than the variance of the object amplitude function p(-,.), and in fact so large that it is questionable whether this quantity is of any use at all. We will now investigate the mechanism which is responsible for the extremely high variance of g(.,-).For this purpose it is necessary to check the -) contributions of the different Fourier coefficients of the phase function Z(-, to the data iik,l.From Eq. (81) we arrive at the following expression for the contribution of the phase function a(-,.) to the recorded image intensity E{Afik,l}= 1,TN-22
s. s. d5
dqsinCy(5, q)] JuodxoJuodYoa(xo~Yo)
x exp[2ni{(k(2~)-’ - xo)5
+(
4 2 ~-) y~o ) ~q ) ]
(106)
240
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
where A& denotes the contrast resulting from the phase function alone. To a good approximation we can substitute for this expression
c
N/Z- 1
E { A k k , f }= R,TN-22N-2
N/2- 1
1 sin[y(k’/2d,1’/2d)]
k’= -N/Z f‘= - N / 2
x exp[2niN-’(kk’
+ 11’)l
In order to be detectable, the right-hand side of Eq. (107) must have at least a numerical value of one as Ai?k,l has to be an integer. The term W 2 @(r/2~, s / 2 ~exp{ ) - 2niN-’(rk’ + sl’)} in Eq. (107) is of the same .) which is known to be less than one. Taking i(-,.) order of magnitude as i(-, to be approximately 0.1, a rough guess is obtained as to which Fourier coefficients do contribute to the phase contrast. In order to observe phase contrast, the following relation now applies
In order for a specific Fourier coefficent (k’,I’) to contribute to the image intensity 6, the more stringent condition must be imposed Isin[y(k’/2d, 1‘,’2d)]l 2 5 N 2 ( i . , T ) - ’
(109)
From Eq. (74) we observe that y(-, -)isa function of (2d)-2(k’2 + I”). Therefore we can find a circle around (0,O) with a radius q given by
k”
+ 1”
I q2
(1 10)
with k‘ and 1’ such that /sin[y(k‘/2d, /‘/2d)]l I 5N2(&T)-’
(111)
The recorded intensity distribution fi does not contain information about the Fourier coefficients (k’,1’) inside the circle with radius q. Therefore these coefficients cannot be retrieved. In Eq. (105) we already observed that the variance of ;(-,-) is dominated by the (k’,!’)combinations close to (0,O). If the Fourier coefficients (k’, 1’) satisfying Eqs. (1 10)and (1 11) are excluded from the computation of g(-,.) in Eq. (101)the value of the noise variance in Eq. (105) is improved considerably. Nevertheless all the phase information contained in the images is used. This in fact represents the bandpass filtration of the object In addition to the frequency components above c, the phase function tl(-;). low-frequency Fourier coefficients are filtered away also. Therefore we do not
24 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
reconstruct the E(.,.) function. Instead a filtered version iq(-,.) is computed which does not contain frequency components in the circle with radius q. The bandpass-filtered object phase function Eq(-,.) is estimated by [cf. Eq. (lol)]
x {2sin[rrA-’ AD(2d)-2(k‘2
x
{COS[Y’(
+ l”)]}-’
’
k’/2d, 1’/2d)]c^:,,l, - cos [y (k‘/2d, 1’/2d)]c^;,,l,},
( k , l ) = {-+N, ...,+N
l}
-
(1 12)
The bandpass-filtered object phase function of Eq. (1 12) can be described as the sum of the expectation value of gq(.,-),which is equal to the true iiq(-,-) value, plus a signal-independent Gaussian stochastic process with zero mean and a constant variance (4%sT)-’0&.This variance follows from [cf. Eq. (105)l
x {cos2[ ~ ( ~ ) ( k ’ / 21’1241 d, = (42sT)-10-~q,
(k,l)
=
+ cos2[y(’)(k’/2d,1’/2d)]}
{ -4N... .,$N
-
l}
(113)
To summarize, the bandpass-filtered object phase function Gq(-,.)estimated by Eq. (1 12) can be expressed as
tq(k/2&,112~)= Eq(k/2&,1/28) + N(0,(4AsT)-’~:q),
( k , 1 )= {
-4 N, ...,4 N
-
(114)
1)
where the relation between the bandpass-filtered Eq(., .) and the true E(-, -) function is represented by
x
expf2niN-’[(k
-
r)k’
+ (1
-
( k , / ) = {-+N, ...,+N - 1)
s)1’1}, (1 15)
Equations (101), (103), (104), ( 1 12), (1 13), and (1 14) are the principal results of this section. For examples of the reconstruction of object wave functions from simulated low-dose images involving the bandpass filtering, the interested reader is referred to Slump and Ferwerda (1982). The next paragraph is devoted to a discussion of a reconstruction algorithm for the object wave
242
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
function which is also capable of reconstructing the lower Fourier coefficients of the object phase function.
C . Tilted Illumination In the previous paragraph the low-dose reconstruction of a weak phaseamplitude object was discussed. The reconstruction algorithm was based on two defocused images obtained with axial illumination. It became evident that the lower spatial frequencies of the object-phase structure were only weakly transmitted and therefore hardly contribute to the image contrast, especially when the applied electron dose is of a low level. Low-dose imaging leads to large noise variances in the reconstructed phase part of the object wave function. In this section we discuss the properties of a promising though more eleborate method of reconstructing the object wave function in the context of low-dose electron microscopy. This reconstruction algorithm is based on several image intensity distributions, which are obtained by illuminating the object structure consecutively from different directions. In Fig. 7 a diagram of the imaging system with tilted illumination is presented. The illuminating electron beam is again described by a one-electron wave function. We assume coherent illumination by a plane wave exp(ik-r,) with wave number k = Ikl = 2x12, propagating in a direction which makes an angle 8, with the optic axis (see Fig. 7). In the simple model for thin weak scattering objects which is considered in this chapter, the object wave function is represented by
$o(x,,Y,)
=
exP~ia(x,,.Yo) - P(xo,~,)l$b(xo,Yo)
(1 16)
The wave $b(.,.) represents the illuminating electron wave function; it is the object wave function in the absence of an object. The restriction to thin weak objects allows the approximation of the object wave function by $O(XO>YO)
=
I1
+ ia(xo,y,)
-
P(xo>Yo)lIl/b(xo,Yo)
(1 17)
Denoting the polar angles of the wave vector k by Bo and 40,we easily obtain $b(xo, y o ) = exp( - 27ri(x0sin 8, cos @ O
+ y o sin 0, sin 40))
( 1 18)
where xo and y o are measured in units of 1 and the position of the object plane zo = 0. Defining the background wave function Il/bg(*,-) as the image wave function in the absence of an object, we obtain by substituting Eqs. (1 18) and (73) into Eq. (72), and carrying out the integration over xo and y , dv exPC - iY(5, v ) + 2 n w X
+ YV)I
sin[2nd(5 + sin Bo cos 40)] sin[2nd(v + sin 8, sin 40)] (1 19) n(v + sin 8, sin &), n(4 sin 8, cos 40)
+
243
object plane
2.0
exit pupil
z=zp
image plane z=zi
FIG.7. Schematic diagram of the imaging system for the case of tilted illumination.
Because of the numerically large value of d (the aperture in the object plane expressed in units of A) the two sinc functions in Eq. (1 19)can be approximated by 6 functions. This yields $bg(x,y ) N exp{ -- iy( - sin 8, cos q50, -sin 8, sin 4,) - 2ni(x sin 6, cos 4, + y sin 8, sin 4,) ( 120) In the derivation of Eq. (120) it is assumed that both sin8,cos4, and sin 0, sin 4, are contained in the interval; this corresponds to bright-field imaging. In the image plane information about the object structure is contained in the image wave function &(., -), defined by dvl$~b(5,vl)expC-iy(5,vl)
+ 274x5 + y q ) ]
(121)
244
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
where the wave function in the exit pupil $&-,.) is given by
$&t> II) =
lo(, lo,
dY0 exp{ - 2niC(t
dx,
+ (v + ~
~
~
~
O
~
~
~
~
+ sin 0 0 cos 40)xo
O P (~X OY? Y OO) I l
~
C
~
(122) ~ ~
~
,
The squared modulus of the image wave $(-,.) is given by $(x,Y)$*(x,Y)=I$bg(x,~)+ $i(x>Y)12
(123)
Using Eq. (120) we obtain for the modulus of the image wave function
I$(x~Y)I
=
C1
+ $bg(x,~)lCIT(x?~) + $&(x,Y)$i(x,Y)+ $ i / i ( x > ~ ) $ T ( x , ~ ) I ~ ' ~ (124)
In the next subparagraph the recorded noisy image is expanded into a set of orthonormal functions. The properties of this expansion are then investigated.
1. Orthonormul Expansion Of the Low-dose Image The image wave function is a band-limited function of bandwidth E ; thus its squared modulus has bandwidth 2.5. In the next subparagraph we will show that the highest (spatial) frequency which is used in the reconstruction of the object wave function is equal to 3e. In order to improve the signal-to-noise ratio, we consider the squared modulus of the image wave function to have a bandwidth of ;E. Applying Whittaker-Shannon sampling to the image results in N 2 image cells, with N equal to ~ C J ~ The E . recorded image intensity is a realization of a stochastic Poisson process. The random counts Yik,l are Poisson-distributed random variables with intensity parameter
which follows from Eq. (85) and the approximation of the integral in Eq. (83). We now expand the modulus of $(.,-) into a set of orthonormal functions. It is convenient to write the two-dimensional orthonormal functions as a direct product of two one-dimensional functions. From Eq. (124) we have the expansion
The functions 4,,, and bnare chosen to be orthonormal on the interval ( - d, + d ) . This set of functions is complete if the indices rn and n in E q . (126) continue to infinity. This is not required here because the image is sampled in squares with sides of length ( 3 e ) - ' . Within the cells the value of the functions is taken to be a constant.
~
Y
IMAGE HANDLING IN ELECTRON MICROSCOPY
245
From Eq. (125), it follows that
Our purpose is to estimate the expansion coefficients a = (. . . ,am,n,. . .) from the random variables fi = (. . . , t ? k , I ,...) using Eqs. (84) and (127). As the variables fi are integers, the accuracy attainable in the coefficients a is limited to approximately (&T)-’N 2 . The maximum likelihood method claims the best estimate of a to be those values which maximize the likelihood function L(ii, a). This function is the joint probability function of the observations. When the parameters a have their true value, L(ii,a) is the probability of obtaining the recorded count pattern 3 given in
In Appendix C the likelihood function [Eq. (128)] is used to determine the amount of information about the parameters a contained in the recorded image fi. Closely related to this Fisher-information matrix is the minimum achievable error variance of the parameters a as expressed in the Cramer-Rao bound (see, for example, Kendall and Stuart, 1967; Van der Waerden, 1969; Van Trees, 1968).The estimated values for the parameters depend on the data; thus they also are random variables. Knowledge about their probability density function, or at least of the first two moments, is of as much importance as the values themselves. We will return to this subject further on. In order to (k, I ) = simplify the estimation of the parameters a, the auxiliary variable { -*N,. . . , i N - I}, is introduced E{t?k,l}
From Eq. (129)
=
&TN-’(I
+ Sk,1}’
( 1 29)
is estimated by
Appendix B shows that the probability density function of is to a good approximation Gaussian, with mean equal to s k , l and variance equal to N2(4iST)-’.From Eqs. (127), (129), and (130), the relation between the auxiliary random variables gk,k,land the parameters a is obtained as m=O n=O
By using Eq. (131), the parameters a can be estimated either by the method of least squares or by the maximum likelihood method, because the probability density function of each Fk,, is Gaussian. The variances of do not depend
246
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
on a. Thus the estimated values for a are obtained by minimizing Q 2 , which is defined as
Minimizing Q 2 with regard to a results in the following expression for the parameters up,q
x 4 p ( k / 3 & ) 4 ( ~ / 3= 40
Using the orthonormality relations
we obtain
For the expectation value of hP,, we obtain from Eq. (131) %P41
= ZPdl
thus the statistic of Eq. (135) is unbiased. We wish to remark here again that the number of significant figures which can be retrieved from the integer data values ii is limited to an accuracy of about (1,T)-'NZ.As the probability density function of the $k,l variables is (approximately) Gaussian, as is shown in Appendix B, the probability density of GP,, is also Gaussian. For the we find that variance of ip,q Nl2- 1
NlZ-1
=(4&T)-'N2,
The covariance matrix
I/ of
( p , q ) = {O,.. . ,N
-
l}
(1 37)
the estimated parameters is given by
Vr,s:p,q = E ( C 2 r . s - E { c r , s } I C ' p , q - E{'p,q}I] = (4AT)-'N26r,p6s,q (138) because of the identical independence of the P parameters, a result that follows from the independence of the ii recordings (cf. Section 1,G). From Eqs. (1 36) and (1 37) we conclude that the estimated parameters P of the expansion in Eq. (127) are uncorrelated and Gaussian distributed. The mean equals the true value given in Eq. (1 36), and the variance is given by Eq. (137), which shows that the variance is a quantity independent of the object.
IMAGE HANDLING IN ELECTRON MICROSCOPY
247
Moreover, when comparing the covariance matrix I/ in Eq. (138) with the Cramer-Rao bound in Appendix C, we see that they are identical. We therefore conclude that the expansion parameters a are efJicientZy estimated, i.e., estimated with the lowest achievable error variance. Equation (135) is therefore an efficient statistic, and all the information that is contained in the data is converted into estimated values. In the next subparagraph we use the expansions of the recorded images to reconstruct the object wave function. 2. Reconstruction of the Object Wave Function
In this subparagraph relations are derived between the object wave function and the recorded low-dose images. In order to do so we determine the wave function in the exit pupil using the orthonormal expansion of the image data described in the previous subsection. Using Eqs. (124)and (136) we obtain
where we neglected the squared modulus of 1,9~(-,-) and approximated the square root by the first two terms of its Taylor expansion. This is an admissible approximation because we have restricted ourselves to weak objects. Until now we have not specified the orthonormal functions to be used in the expansion in Eq. (127). Because of their convenient properties under Fourier transformation, we choose the prolate spheroidal functions (Slepian and Pollak, 1961). An overview of the properties of these functions is given, for example, by Frieden (1971). The prolate spheroidal functions @:(-) are the eigenfunctions of the finite Fourier transform operator, and they are defined by the integral equations (Slepian, 1964) 1
at@<( ) =
J J y
exp(icxy)@;(x)dx,
J-l
-
1I x I1
Introducing the new integration variables x' = dx and 5 the function 4j(x') = @ ; ( x ' / d ) , transforms Eq. (140) into
=+
( 140)
~ yand , defining
where we chose c = 3 x ~ d The . interval of 5 is given by - $ E I 5 $E. The superscipt c has been omitted in Eq. (141); c corresponds with the spacebandwidth product. In its discrete representation Eq. (141) is written N / Z- 1
+anN +,,(l/%)
=
1
k = -N/Z
exp(2niklN- ')4,(k/3~)
248
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
with N = 6de = 3 0 , ~ , 1 = { -+N, -+N + l , . . ., i N - 1). Calculating the Fourier transform of the left-hand side of Eq. (139), we obtain using Eq. (143)
c c
N/Z- 1
N/2- 1
exp[2ni(kk'
k = -N/21= -N/2
+ Il')N-']
c
N-1N-1 m=Om=O
For the right-hand side of Eq. (139) we obtain, with Eqs. (120) and (121) N/Z- 1
1
N/Z- 1
1
exp[2ni(kk'
+ l I ' ) i V 1 ] Re
k = -N/Z I= -N/2
1 =-
N/Z- 1 N / 2 - 1
C
1
2 k = -N/2
exp[2ni(kk'
I = -N/2
+ I~')N-']
where C.C. denotes the complex conjugate of the preceding term. The summations in Eq. (144) over k and 1 can be carried out and result in sinc functions which can be approximated by 6 functions, as in Eq. (1 19). These 6 functions allow us to carry out the integrations over 5 and v] in Eq. (144), which leads to
= t ( 3 ~ ) ~ e x p [ - - i y ( - s i n H , c o s ~ , ,-sinB,sin$,)
+ ig((2d)-'k'
-
sin0,cos~,,(2d)-'/'
x
$;((2d)-'k'
x
rect,(sin 8, cos 4,
-
k'
iy( -(2d)-
x
$p(-(2d)-1k'
x
rect,(
-
sinQOsin4,,)]
sinfIOcos4,,(2d)~~'I' - sinB,sin4,)
+ + ( 3 ~exp[i;l( )~ -sin -
-
-
( 2 d ) - 'k') rect,(sin 8, sin 4,
-
(2d)
1')
8, cos 4,,, -sin (I, sin 4,)
+ sin Q, cos 4,, -(2d)- I' + sin 8, sin 4,)] + s i n 8 , ~ 0 ~ 4 , , - ( 2 d ) - ~ /+' sinH,sin@,)
sin ( I , cos 4,,
~
( 2 d ) - k ' ) rect,(
-
sin c), sin 4,
~
(2d) '1') (145)
IMAGE HANDLING IN ELECTRON MICROSCOPY
249
where the rect function is defined as rect,(t) =
1, 0,
It1 I E
elsewhere
Equation (145) is the basic equation for the determination of the $J.,.) function. We assume that the first exposure [denoted by superscript (I)] has been made with the tilt angles B0 and 4o of the illuminating beam specified by
0;
=:
0;
62-1’2;
=
-
an
(147)
sothatsinUhsit-14; = sinUhcos4; =$(ass 10-3-10-2).Usingthevalues in Eq. (147), we obtain from the first exposure N-IN-1
bN2
1 C
a^~,,.,.m0m(k1/3~)4,(1’/3E)
m=O n = 0
=
t ( 3 ~ ) ~ e x p [ - i ~ ( - $ e-46) ,
+ iy((2d)-’k’
x $;((2d)-’k’ - +E, (2d)-’l’ x rect,($e -
-
(2d)- ‘ 1 ’ )
i~(-(2d)-’k’ + + E ,
-(2d)-’l’
-
-
is, (2d)-’l’
--$)I
+E)rect,(*s - (2d)-’k’)
+ 3(3~)~ exp[iy( -$e, -+e) -(2d)-’l’ + 3~)]$~(-(2d)-’k’+ SE,
+ tE)rect,(-$e
-
(2d)-’k’)rect,(-+e
-
(2d)-’l’) (148)
Since the rect functions in Eq. (148) do not overlap each other everywhere, regions in the (k’, 1’) plane now emerge where $,*(.,.) or t,bP(-,-) as appropriate can be determined separately. These regions are shown in Fig. 8. From Fig. 8 we see that only when (2d)- k‘ and (2d)- I‘ both lie between -3e and +E do $&-,.)and $p*(.,-)appeartogetherinEq.(148).Thus $J.,.)and $,*(.,.)cannot be determined in this region from the first exposure only. In the regions of Fig. 8 marked with $: and $c/p, these functions can be determined separately. Also from Fig. 8 we see that the highest-frequency component of the (k’,l’) plane corresponds to %E. This is why in the previous section the WhittakerShannon sampling of the recorded image was chosen to correspond to a . will now focus on the determination of $,(k‘/2d, 1‘/2d) bandwidth of 3 ~ We in the square: - E < (k‘/2d, 1’/2d) IE. By taking another exposure with tilt angles
’
250
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
FIG.8. The regions in the (k',l') plane representing the support of the rect functions in Eq. (148).
we obtain the relation
+
= $ ( 3 ~ ) ~ e x p [ - i y ( + ~ , + s )iy((2d)-'k'
+ f~,(2d)-'l'+ 3s)
x $:((2d)-'k' x rect,(-+e
-
(2d)-'k')rect,(-$&
+ i(3e)' exp[iy(+e,$e) x i+bp(-(2d)-'k' x rect,(+s
-
+ $ ~ , ( 2 d ) - l l+' $&)I
-
iy( -(2d)-
' c , - (2d)-'l'
- 2
'
(2d)- k ' ) rect,(+s
-
-
k'
(2d)-'I') Le, -( 2 d ) - l l '
- 2
-+&)I
- +&)
(2d)- ' I ' )
(150)
Equation (150) allows us to calculate the t+bP(-,-)function in that part of the square in the ( k ' , ! ' ) plane which was inaccessible in the previous step [Eq. (148)]. In the region shown in Fig. 8, where the rect functions do overlap, we still cannot determine i+bp(-,.) and $,*(.,.). The reason is that Eqs. (148) and (150) are not independent in this region, as can be seen by taking the complex
25 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
conjugate of Eq. (150) and changing k‘ into - k‘, 1’ into - 1’, and using the even symmetry of the aberration function y(-, .). The $p(., .) function is now known in the full square ( - 1 8 < k‘/2d,1’/2d < +&) from Eqs. (148) and (150) and the object wave function can be determined by inverting Eq. (122). Equation (150) contains more information, however, because the same procedure as utilized in Eq. (148) can be repeated here. Using all the available information leads to an accuracy in the determination of t,bp(-,-) which is not constant over the range of spatial frequencies involved. In order to achieve uniform accuracy we must take yet another two exposures with beam tilt angles
6;
=
6;
4;
=
-4;
= E2-1/2.
=
6:
=
8;
= E2-I”
440 -- 4; + +x -p; 1
= 27.c
With these four exposures we are able to make three complete reconstructions for each point ( k / 2 ~1/2&) , of the object wave function. Taking the mean of these reconstructed values results in a reduced variance of the noise. The reconstructed object wave functions result from inverting Eq. (122). AS highfrequency information about the object is lost due to the aperture in the exit pupil, we have to approximate ia(.,-) - B(.,.) by a band-limited (filtered) In principle, band-width extrapolation is version denoted by i i ( - , - )possible, but it increases the variance of noise considerably. Except for the “missing” quadrant of the square in the (k’,1’) plane (see Fig. 8) the $p(., .) resulting from exposure is given by
p(.,.).
+ $8, -(2d)-’I’ + + E ) = exp[-iy(-+&, - $ E ) + i ~ ( - ( 2 d ) - ’ k ’ + +E,
(3~)~$;(-(2d)-Ik’
-(2d)-I1’
$- + E ) ]
Using Eq. (122) we obtain the following relation for the object wave function
Introducing the new variables ( 2 d ) - ’ k = - ( 2 d ) - k’ + $ E and ( 2 d ) - ‘ 1 = -(2d)-I1’ + $E and taking the discrete Fourier transform of both sides of Eq. (153), we obtain by carrying out the summations over k and 1 on the
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
252
right-hand side with N' = 2 d 2 ~= o0o "72
N c -2
-
1
1
"12
-
1
&i(k/2d,1/2d)exp[-2ni(kr + Is)N'-']
k=-N'/ZI=-N'Z
X
+
2nE[(2E)- s Yo1 sin{n(2d)-'[(2~)-'s yo])
+
The integration over xo and y o in Eq. (154) can be carried out using thc orthogonality property
which applies in our case in good approximation because d is very large compared with the wavelength.
253
IMAGE HANDLING IN ELECTRON MICROSCOPY
Based on a straightforward computation we find
c
N/2- 1
N’-’
N/2- I
k l x,2&k(,,.,)..p[-2ni(kr
+ ls)N‘-’]
k = -N/2 I = - N
=(
”)
2 ~ ) - ~ [ 2E i C’(2E~ :
-
fl(z 2E ’z)]exp[ 2E in(r + s ) / 2 ]
(158)
From Eq. ( I 52) we obtain, using Eqs. (1 57) and (158) *
[iE;-( - rj2c,
-
s / ~ E-) fl( - r / 2 ~ , s / ~ E )exp[in(r ]
-&)I
= exp[-iy(-+e,
c
N’/2 - 1
”‘2 - 1
1 2ni(kr + Is)”- ‘ 3
k = - N’/2 I =
x exp[iy(k/2d,1/2d) -
x
N-lN-1
x
C
+ s)/21
-
N’/2
+
G ~ , ~ Z , , U ~ $ ~ [ - ( ~ ~Ed) ]-$~, [~- ( 3 ~ ) - ’ 1 + @ ] (159)
m=O n=O
In Eq. (1 59) we must exclude in the summation over k and 1the region in which $:(-,-) could not be resolved from exposure ( l ) . For each of exposures ( l ) through (4) we perform the calculation in Eq. (159),excluding the inaccessible zones and taking the-correct tilt angles. This results in four descriptions of the estimated i(-, .) and fl(., .) functions. These four reconstructions are equivalent to three reconstructions using the function $Jk’/2d, 1’/2d)in the whole square. The final estimated ;(.,.) and functions are the mean values of the four descriptions [Eq. (1 59)]. Since Eqs. (152) and (1531 are linear, the estimated function values are independent Gaussian random variables with
p(.,.)
*
~ { i E ; - ( - r / 2- ~- s, / ~ E ) - & Y / ~ E ,
-s/~E)}
-
=
iE(--r/2c, - s / ~ E )- f i ( - r / 2 ~ , - s / ~ E ) , (r,s) = {
-+ N ’ , ...,+N ’
-
l}
(160) The computation of the variances is more involved. From the first exposure we obtain the variance
c’
N’/2-1
var{gl}
= fN2(4i,T)-’
c
N’/2-1 N - l N - l
1’
k=-N‘/Zf=-N’/Zm=O
x 4:[-(3~)-’k
Ianl2laml2 n=O
+ $ f ] $ ; [ - ( 3 ~ ) - ~+1+ d l
(161)
The prime on the summations over k and 1 indicates that we must exclude those values of k and 1 for which $:(-,-) could not be solved from exposure According to Slepian (1964),Ian12is approximately constant and equal to N - ’ for 0 I n I N - 1 , and zero elsewhere. Thus lc1,I2 and l c ( , I 2 can be replaced by
254
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
N - ' . The three remaining exposures can all be treated in the same way. Combining the variances of the four exppures, we find for the variance of the resulting object wave function i$(.,-) - B(., .)
N/2-1
N/2-1
=&(3)N2(4&T)-' k=-N/21=-N/2
$:(
-
k / 3 ~ ) 4 , 2-( 1/34
=iN2(41,,T)-'
(162)
where we used the orthonormality of &(-,-). Note that in combining the four exposures, the summations over k and 1 cover the rectangle - $N _< k, 1 I i N - 1 three times.
IV. PARAMETER ESTIMATION A . Maximum Likelihood Estimation in Electron Microscopy
The present section introduces a second line of approach to low-dose electron microscopy; i.e., certain aspects of the optimal use of a priori information are considered. In this section we focus on the estimation of unknown parameters related to the object structure, using a single exposure of the specimen. Emphasis is on the statistical significance of the obtained results, and on the merits of numerical strategies. First a brief introduction to the statistical estimation method of maximum likelihood. From among other methods maximum-likelihood estimation is chosen because of its properties in the application to electron microscopy. This does not imply that maximum likelihood is the most suitable method in every case. Whenever considered appropriate, other methods will be used. The properties of maximumlikelihood estimation are treated in many textbooks on statistics (e.g., Kendall and Stuart, 1967; Van der Waerden, 1969).This estimation method originates from R.A. Fisher, who developed the theory in several papers around 1920(see his collected works, 1950). The principle of maximum likelihood was already applied, however, by C.F. Gauss in 1795, in dealing with observational errors in astronomy. It selects for the pertinent parameters those values which assign to the observed data the largest probability of occurrence. Electron microscopy is often applied in cases in which one already has the disposal of much information about the specimen to be observed with the microscope. This information can, for example, originate from knowledge of analytical chemistry or from x-ray diffraction. Usually the recorded image of the object structure is not of interest in itself. The aim in particular of low-dose
IMAGE HANDLING IN ELECTRON MICROSCOPY
electron microscopy should be to extract information about the observed structure, in other words to provide a description of the internal structure of the specimen in question. This task can be facilitated considerably through the use of a priori information. In order to be of practical value, the a priori information has to be translated into a description of the pertinent structure as a functional relationship between not too many unknown parameters. The type of biological specimens we focus our attention on can be modeled by their electrostatic potential V(x_0, y_0, z; θ), involving, for example, the positions of several heavy atoms, where θ = (θ_1, θ_2, …, θ_m) denotes the m distinct parameters to be estimated. The electron wave function in the object plane results from this potential distribution by taking the projection of the distribution with respect to the object plane, as is discussed in Section I,C
where u denotes the velocity of the electrons impinging on the specimen. Hence, the image wave function is also a known function of the unknown parameters. The intensity of the image Poisson process can now be calculated (in principle) for the (k, l) image cell

λ_{k,l}(θ) = λ_s T (2d)^{-2} ∫∫_{U_{k,l}} ψ(x, y; θ) ψ*(x, y; θ) dx dy,   (k, l) = {-N/2, …, N/2 - 1}   (164)
The actual recorded image n̂ is a realization of this stochastic process, which we have analyzed in depth in Section I,G. From Eq. (164), the intensity being a function of θ, all the information contained in the recorded image can be used for the estimation of the m unknown parameters. The N^2 observations are independent Poisson-distributed random variables, as has been shown in Section I,G. Their joint probability density function is therefore

L(n̂, θ) = Π_{k,l} [λ_{k,l}(θ)]^{n̂_{k,l}} [n̂_{k,l}!]^{-1} exp[-λ_{k,l}(θ)]   (165)
The function L(n̂, θ) is called the likelihood function if the n̂_{k,l} values are fixed to the values of a realization of the stochastic image process, making L(n̂, θ) a function of θ alone. According to the maximum-likelihood principle mentioned above, the "best" values for the parameters θ are those values θ̂ for which the likelihood function has its absolute maximum. A necessary condition for the existence of a maximum in L(n̂, θ) is that
(∂/∂θ_i) log L(n̂, θ) = 0,   i = 1, …, m   (167)
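The maximum-likelihood recipe of Eqs. (165)-(167) can be made concrete with a small numerical sketch. The intensity model, grid size, and counts below are illustrative assumptions, not taken from the text; in the text the cell intensities λ_{k,l}(θ) would follow from the object wave function via Eq. (164).

```python
import numpy as np

# A minimal sketch of maximum-likelihood estimation for Poisson-distributed
# counts. The one-parameter intensity model below is hypothetical.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 64)                 # stand-in image-cell coordinates

def intensity(theta):
    # mean count per cell; positive for all theta in the scanned range
    return 8.0 * (1.0 + 0.3 * theta * np.exp(-x**2))

theta_true = 0.5
n_hat = rng.poisson(intensity(theta_true))     # one recorded low-dose "image"

def log_likelihood(theta):
    lam = intensity(theta)
    # log of Eq. (165), dropping the theta-independent term log(n_hat!)
    return np.sum(n_hat * np.log(lam) - lam)

# Eq. (167) asks for a stationary point of log L; with a single parameter a
# dense grid scan suffices here.
grid = np.linspace(-2.0, 2.0, 4001)
theta_ml = grid[int(np.argmax([log_likelihood(t) for t in grid]))]
```

With more parameters the grid scan is replaced by a gradient-based search, which is exactly where the difficulties with local maxima discussed later in this section arise.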
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
We hereby explicitly assume that the maximum does not occur at the boundary of the range of permitted θ values. The maximum-likelihood estimates θ̂ are a root of the set of likelihood equations obtained from (167). Because the estimated parameters depend directly on the stochastic data, they are random variables themselves. The merits of the estimated parameter values can be judged best by their probability distribution. Often the determination of such a probability distribution is a task in itself, and usually this computation cannot be performed analytically. In this case one has to be satisfied with less complete characterizations of the accuracy of the estimates θ̂, such as the first two moments of the distribution. In particular the determination of the second moment can also become rather complicated. The analytical complications arise from the fact that the functional relationship between the parameters representing the a priori information about the specimen is generally highly nonlinear. With respect to the first moment, the bias b(θ̂) of the estimator θ̂ is defined as the deviation of the expectation value of θ̂ from the true value θ

b(θ̂) = E(θ̂) - θ   (168)

whereas the second moment is related to the mean-square error matrix V(θ̂), of which the (r, s) element is defined as

v_{r,s}(θ̂) = E{(θ̂_r - θ_r)(θ̂_s - θ_s)}   (169)

The matrix V(θ̂) is also called the error covariance matrix when the estimator θ̂ is unbiased, i.e., b(θ̂) = 0. The maximum-likelihood estimator has several properties which are of great importance in the application of the method to low-dose electron microscopy. First, it is a consistent estimator. This means that the estimates converge in probability to their true values when the number of observations increases. Furthermore, the estimator is asymptotically unbiased in the number of observations N^2. A lower bound for the mean-square error of any estimator of θ is provided by the famous Cramér-Rao inequality (if it exists) (see, e.g., Van Trees, 1968, p. 72, for a discussion in a general setting; or Snyder, 1975, p. 80, in particular for Poisson-distributed data). An intuitively appealing discussion of this bound can be found in Gardner (1979). The variance of any unbiased estimator of θ is greater than or equal to the value specified by the Cramér-Rao bound. This value appears to be equal to the inverse of the Fisher information matrix, which can be interpreted as the information contained in the data about the unknown parameters (Van der Waerden, 1969; Kendall and Stuart, 1967; Van Trees, 1968).
The amount of information that is contained in the stochastic data n̂ is of crucial importance in statistical estimation theory. An estimator can do no more than transform the information present in the data and redistribute it over the unknown parameters. An important problem in the theory of statistical estimation is whether a proposed estimator uses all of the available information efficiently, i.e., whether there is no information loss. In order to compare the performance of different estimators it is very useful to have a measure for the information that is contained in the data. According to Fisher (1950) this amount of information is expressed in the following matrix (if it exists)

f_{i,j}(θ) = E{[(∂/∂θ_i) log L(n̂, θ)][(∂/∂θ_j) log L(n̂, θ)]}   (170)
where L(·,·) is the likelihood function. Assuming that L(·,·) is regular enough to allow the interchange of differentiation and integration, Eq. (170) is equivalent to

f_{i,j}(θ) = -E{(∂^2/∂θ_i ∂θ_j) log L(n̂, θ)}   (171)
The m × m matrix F(θ) with entries f_{i,j}(θ) is called the Fisher information matrix and is widely used in the theory of statistical estimation. For Poisson-distributed data and likelihood function [Eq. (165)], the Fisher matrix is given by (Snyder, 1975, p. 81)

f_{i,j}(θ) = Σ_{k=-N/2}^{N/2-1} Σ_{l=-N/2}^{N/2-1} λ_{k,l}^{-1}(θ) [∂λ_{k,l}(θ)/∂θ_i] [∂λ_{k,l}(θ)/∂θ_j]   (172)
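Equation (172) is straightforward to evaluate numerically once the intensity model and its partial derivatives are available. The following is a minimal sketch with a hypothetical two-parameter intensity on an N × N grid; the model and numbers are illustrative, not from the text.

```python
import numpy as np

# Fisher information matrix of Eq. (172) for Poisson data, and the resulting
# Cramer-Rao bound. The two-parameter intensity model is hypothetical.
N = 32
x = np.linspace(-1.0, 1.0, N)
X, Y = np.meshgrid(x, x)

def intensity(theta):
    t1, t2 = theta
    # positive everywhere for the parameter values used below
    return 5.0 * (1.0 + t1 * np.exp(-X**2 - Y**2) + 0.5 * t2 * X * Y)

def fisher(theta):
    lam = intensity(theta)
    # analytic partial derivatives of lambda_{k,l} with respect to theta_i
    d1 = 5.0 * np.exp(-X**2 - Y**2)
    d2 = 5.0 * 0.5 * X * Y
    grads = (d1, d2)
    return np.array([[np.sum(gi * gj / lam) for gj in grads] for gi in grads])

F = fisher((0.3, 0.2))
crb = np.linalg.inv(F)   # lower bound on the covariance of unbiased estimators
```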
If the inverse of the matrix in Eq. (172) exists, then the inverted matrix yields the Cramér-Rao bound for any unbiased estimator of θ. The Cramér-Rao bound thus obtained is to be compared with the covariance matrix of the unbiased estimator for θ. This comparison yields insight into the merits of the pertinent estimator, and whether it uses all the available information efficiently. It appears possible to construct efficient estimators only for the restricted class of exponential probability distributions, irrespective of the number of observations (Koopman, 1936; Pitman, 1936). The maximum-likelihood estimator is asymptotically unbiased in the number of measurements and normally distributed, while the covariance matrix approaches the Cramér-Rao bound. Hence, asymptotically the maximum-likelihood estimator has the lowest attainable variance. It should be noted that these properties of the maximum-likelihood estimator are not generally valid but are only true when the likelihood function satisfies certain regularity conditions; for example, the interchange of differentiation and
integration with respect to θ should be admissible. In our case these conditions are fulfilled, since a physical object is imaged with a finite aperture ε. This guarantees rather smooth intensity variations. Although the Shannon number N^2 of low-dose images will in practical situations be of the order of magnitude of several thousands, and therefore the number of observations N^2 is relatively large, we cannot always rely on the nice properties of the maximum-likelihood estimator, which hold asymptotically in N^2. In many cases the asymptotic limit will be reached quite slowly. The likelihood function will often have several local maxima when the number of observations N^2 is finite. The fact that there may be several local maxima poses the difficult numerical problem of obtaining the global optimum of a function of several variables. This problem will be discussed in some detail further on. The great advantage of using a priori information is, as mentioned earlier, the use of all the information that is contained in the observations for the determination of the parameters to be estimated. Another advantage lies in its directness, as no inverse problem remains to be solved. The reconstruction procedures of the object wave function described in the previous two sections resulted in only one projection of the electrostatic potential distribution. The potential distribution of the specimen remains to be reconstructed from projections obtained under different tilt angles of the specimen. Disadvantages of the a priori information approach are the numerical complications in obtaining unique results (e.g., the global extreme of the likelihood function) and the evaluation of the statistical significance of the results obtained. It is almost impossible to make generalizations about the proposed method for information evaluation in low-dose electron microscopy.
This is due to the heavy dependence of the obtained results on the functional relationship between the pertinent parameters. A minor modification of this relationship may change the performance drastically. Every application is almost unique. Therefore, only some examples are presented here. In the next subsection we consider three simplified estimation problems, each with only a single lateral dimension. However, all features of the a priori information approach advocated in this chapter can be demonstrated with these examples. With respect to the Fisher information it should be noted that Eq. (170) is not the only measure for information. Other measures of information exist, for example, the Shannon information, which is widely used in communication theory. In information theory there exists a vast amount of specialized literature about measures of information (see, e.g., Aczel and Daroczy, 1975). A useful overview of the role of information measures in information theory is contained in Van der Lubbe (1981, Chap. 1). The quantity defined by R.A.
Fisher in Eq. (170) or (171) has the following properties, which are intuitively associated with the concept of information.

(1) The amount of information is positive definite.
(2) The available information increases directly with the number of measurements. Doubling the number of independent measurements doubles the amount of information.
(3) Information is related to the accuracy of the measurement.
(4) The amount of information is related to the knowledge to be retrieved from the measurement. Data not relevant for the chosen parameter set θ do not contribute to the information.

B. Illustrative Examples
In this subsection three examples are presented showing different ways of using a priori information about the specimen for the evaluation of low-dose images. The examples are formulated for one (lateral) dimension only; in order to concentrate on the function of the prior information, the equations are kept as simple as possible. The extension to two dimensions is straightforward for rectangular apertures in the microscope.
1. A Weak Phase Object

The first example deals with a weak phase object. This situation generally applies to thin biological specimens when scattering contrast (i.e., interception of electrons by apertures) may be neglected. The prior knowledge that we are dealing with a weak phase object allows us to expand the object wave function ψ_0(·) of Eq. (163) in the first two terms of its Taylor expansion

ψ_0(x_0) = 1 + iφ(x_0)   (173)
where φ(·) is a real function representing the projection of the object's electrostatic potential. That part of the Fourier spectrum which corresponds to spatial frequencies above ε is intercepted by the exit pupil and is not propagated to the image plane. Therefore the information contained in the spatial frequencies above ε is lost, as these frequencies do not contribute to the data. In the sequel we will discard spatial frequencies above ε by modeling the object wave function by

ψ̄_0(x_0) = 1 + iφ̄(x_0)   (174)

where the bar indicates a low-pass filtered approximation. In theory it is possible to reconstruct ψ(·) with a bandwidth higher than ε, using
regularization techniques; see Bertero et al. (1980). In general, however, these techniques amplify the noise considerably and are therefore of limited value for low-dose microscopy. With Eq. (174) the low-pass filtered approximation to the original φ(·) function is represented by N amplitudes φ(k(2ε)^{-1}). We consider these to be unknown parameters a_k, k = {-N/2, …, N/2 - 1}, that must be estimated from the recorded image n̂ using the maximum-likelihood method. From Eqs. (23) and (173) in combination with Eq. (174), the wave function in the exit pupil is expressed by Eq. (175).
Hence, the image wave function is given by (approximating the sinc function by a delta function)

ψ(x) = 1 + i Σ_k a_k ∫_{-ε}^{+ε} dξ exp{-iγ(ξ) - 2πiξ[k(2ε)^{-1} - x]}   (176)
from which the squared modulus ψ(x)ψ*(x) follows [Eq. (177)], where only terms that are linear in a_k have been taken into account. This is consistent with the approximation in Eq. (173), where the quantities related to the object have been retained up to the first order. The expectation value for the number of electrons arriving in image cell l is

E(n̂_l) = λ_l = (2d)^{-1} λ_s T ∫ dx ψ(x)ψ*(x),   |x - l(2ε)^{-1}| ≤ (4ε)^{-1}   (178)

With Eq. (177), using the relation N = 2d(2ε), we obtain

λ_l = λ_s T N^{-1} (1 + 4ε Σ_k a_k ∫_{-ε}^{+ε} dξ sin[γ(ξ)] (πξ)^{-1} sin[2πξ(4ε)^{-1}] exp[-2πiξ(2ε)^{-1}(k - l)])   (179)
The occurrence of sin γ(·) in the integrand of Eq. (179) obstructs the analytical solution of that equation. However, by setting the defocus parameter D according to Scherzer focus conditions (Scherzer, 1949) in order to maximize the phase contrast, the shape of the sin γ(·) function can be approximated by a few straight-line segments, as shown in Fig. 9.
FIG. 9. The function sin γ(ξ) under Scherzer focus conditions, approximated by straight-line segments; γ(ξ) = 2πλ^{-1}(¼C_s λ^4 ξ^4 - ½Dλ^2 ξ^2), C_s = 1.6 mm, D = 80 nm, ε = 10^{-1}, λ = 4 pm.
In order to arrive at manageable expressions, the sinc function in Eq. (179) is approximated by the constant value (2ε)^{-1}. Instead of sin γ(ξ) we take three straight-line segments, and the integral in Eq. (179) is set equal to

-sin[π(k - l)] / [π(2ε)^{-1}(k - l)] + ∫_{-ε}^{+ε} dξ φ̃(ξ) exp[-2πiξ(2ε)^{-1}(k - l)]   (180)

where we have approximated sin γ(ξ) ≈ -1 + φ̃(ξ) for |ξ| ≤ ε, with φ̃(ξ) the piecewise-linear function with breakpoints at |ξ| = ε/8 and |ξ| = ε/2 defined in Fig. 10   (181)
The integral on the right-hand side of Eq. (180) can be elementarily evaluated by repeated integration by parts, since the second derivative of φ̃(ξ) consists of four delta functions at ±ε/8 and ±ε/2. This results in

∫_{-ε}^{+ε} dξ φ̃(ξ) exp[-2πiξ(2ε)^{-1}(k - l)] = 8 sin[3π(k - l)/16] / [3επ(k - l)(2ε)^{-1}] - sin[5π(k - l)/16] / [π(k - l)(2ε)^{-1}]   (182)
FIG. 10. The function φ̃(ξ) of Eq. (181).
With Eqs. (180) and (182) we obtain from Eq. (179)

λ_l = λ_s T N^{-1} (1 - 4ε Σ_k a_k { sin[π(k - l)] / [π(2ε)^{-1}(k - l)] - 8 sin[3π(k - l)/16] / [3επ(k - l)(2ε)^{-1}] + sin[5π(k - l)/16] / [π(k - l)(2ε)^{-1}] })   (183)
Equation (183) is the required relation between the Poisson-process parameter λ_l and the unknown a_k's to be estimated, cf. Eq. (164). The parameters a_k could be evaluated by means of inversion of Eq. (183), replacing the expectation values on the left-hand side by the measured values n̂_l. In this example, however, we will follow the recipe of maximum-likelihood estimation. The maximum-likelihood estimates â are a solution of the likelihood equations, cf. Eq. (167)
Σ_{l=-N/2}^{N/2-1} (λ_s T N^{-1} - n̂_l λ_s T N^{-1} λ_l^{-1}) { sin[π(r - l)] / [π(2ε)^{-1}(r - l)] - 8 sin[3π(r - l)/16] / [3επ(r - l)(2ε)^{-1}] + sin[5π(r - l)/16] / [π(r - l)(2ε)^{-1}] } = 0,   r = {-N/2, …, N/2 - 1}   (184)
Due to the presence of the parameters a_k in the denominator, the set of equations is nonlinear and cannot be solved analytically. Instead the solution must be obtained numerically. It is usually very difficult to indicate the statistical significance of a result which has been obtained numerically. Of interest is the probability distribution, or at least the first two moments, of the estimated parameters â.
In order to be able to judge the statistical significance of the estimated values, we proceed analytically by linearizing the set of Eqs. (184). The denominator is expanded in the first two terms of the binomial series, neglecting higher-order terms. This is consistent with earlier approximations, i.e., first order in the object properties. The following set of equations results

Σ_{l=-N/2}^{N/2-1} [λ_s T N^{-1}(1 - 4ε Σ_k a_k { sin[π(k - l)] / [π(2ε)^{-1}(k - l)] - 8 sin[3π(k - l)/16] / [3επ(k - l)(2ε)^{-1}] + sin[5π(k - l)/16] / [π(k - l)(2ε)^{-1}] }) - n̂_l] { sin[π(r - l)] / [π(2ε)^{-1}(r - l)] - 8 sin[3π(r - l)/16] / [3επ(r - l)(2ε)^{-1}] + sin[5π(r - l)/16] / [π(r - l)(2ε)^{-1}] } = 0   (185)
Taking only the most significant contributions in the double summation into account, and neglecting terms which are smaller by one order of magnitude, results in

â_r ≈ N(n̂_r + 1)^{-1} Σ_{l=-N/2}^{N/2-1} (λ_s T N^{-1} - n̂_l) { sin[π(r - l)] / [πε(r - l)] - 16 sin[3π(r - l)/16] / [3πε(r - l)] + sin[5π(r - l)/16] / [πε(r - l)] },   r = {-N/2, …, N/2 - 1}   (186)
The constant 1 is added to n̂_r in the denominator because of the nonzero probability P{n̂_r = 0} that n̂_r will become zero, regardless of how large λ_r may be. By taking the expectation value of Eq. (186), using the fact that the n̂_l's are independent random variables, and using E{(n̂_r + 1)^{-1}} = λ_r^{-1}[1 - exp(-λ_r)], we obtain with the same type of approximations that were used in the derivation of Eq. (186)

E(â_r) = a_r   (187)
Calculating the variance of â_r in the same way, using E{(n̂_r + 1)^{-2}} ≈ λ_r^{-2} + λ_r^{-3} + O(λ_r^{-4}), we obtain

var(â_r) ≈ 121(1024)^{-1} N(λ_s T ε^2)^{-1} ≈ N(8λ_s T ε^2)^{-1}   (188)
The variance of the estimator in Eq. (186), as expressed in Eq. (188), can be compared to the lowest attainable variance of any estimator for a_r, as given by the Cramér-Rao bound. In order to determine this bound for a_r, the Fisher information matrix of the parameters a is calculated first [see Eq. (172)]
With the approximations used earlier in the derivation of Eqs. (186)-(188), and neglecting terms which are an order of magnitude smaller, Eq. (189) results in

f_{r,s}(a) ≈ 121(16)^{-1} λ_s T N^{-1} ε^2 δ_{r,s} ≈ 8λ_s T N^{-1} ε^2 δ_{r,s}   (190)
According to our approximations the estimator in Eq. (186) is unbiased [Eq. (187)]; therefore, the Cramér-Rao bound is equal to the inverse of the Fisher information matrix. As this matrix is to good approximation in diagonal form, the inversion is easy. It thus follows that the variance in Eq. (188) is close to the value of the Cramér-Rao bound, an indication that the estimator in Eq. (186) is (nearly) optimal. In summary, in the first example the prior knowledge about the object structure is not very detailed; it is only a weak object [cf. Eq. (174)]. However, it became apparent that rather complicated nonlinear equations must be solved. With suitable approximations we are able to obtain analytical results whose statistical features can be established. The evaluation of the statistical significance of results that are obtained numerically is a difficult task. An example of such a situation is encountered in the next example. In that example the functional form of the image intensity distribution is supposed to be known, the pertinent function depending on several parameters.
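The comparison between an estimator's variance and the Cramér-Rao bound, carried out analytically in Eqs. (188)-(190), can also be checked by simulation. The sketch below uses a simplified scalar analogue of the weak-object model: a linear unbiased estimator for a single weak amplitude. The contrast profile and numbers are illustrative, and the estimator is analogous in spirit to, but not identical with, Eq. (186).

```python
import numpy as np

# Monte-Carlo check that a simple unbiased estimator of a weak amplitude
# approximately attains the Cramer-Rao bound for Poisson data.
rng = np.random.default_rng(1)
cells = np.arange(-32, 32)
g = np.cos(np.pi * cells / 64)        # hypothetical weak-contrast profile
Lam, a_true = 20.0, 0.1               # background intensity, weak amplitude

trials = 2000
a_hat = np.empty(trials)
for t in range(trials):
    n = rng.poisson(Lam * (1.0 + a_true * g))
    # linear unbiased estimator of the amplitude
    a_hat[t] = np.sum(g * (n - Lam)) / (Lam * np.sum(g**2))

# first-order Cramer-Rao bound for this model
crb = 1.0 / (Lam * np.sum(g**2))
```

For this weakly modulated model the sample variance of the estimates comes out close to the bound, mirroring the conclusion drawn from Eqs. (188) and (190).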
2. Image Intensity Distribution in Analytical Form

In this second example we assume the image intensity distribution to be an a priori known function of several unknown parameters. This prior functional relationship, together with the observations, which can be rather severely disrupted by noise, is the usual information which estimation methods take as their starting point. In the previous example the a priori knowledge concerned the object structure, and this resulted in the pertinent object wave function. The calculation of the corresponding image intensity is generally quite complicated. In this second example this difficulty is sidestepped, because the functional form of the image intensity is assumed to be

E(n̂_l) = λ_l = λ_s T N^{-1} (1 - Σ_{k=1}^{m} a_k exp{-½s_k^{-2}[l(2ε)^{-1} - p_k]^2}),
l = {-N/2, …, N/2 - 1}   (191)

with a_k << 1, k = 1, …, m, such that every λ_l > 0. The theoretical image contrast, i.e., the intensity distribution in the limit of an infinite number of electrons contributing to the image formation, consists of the sum of m Gaussian-shaped functions. The kth Gaussian function has an amplitude parameter a_k and a position parameter p_k, and the width of the bell shape is described by s_k. From the recorded image, which is a realization of the stochastic Poisson process [Eq. (165)], the 3m parameters have to be
estimated. A functional relationship between data and parameters, such as Eq. (191), can originate from a weakly scattering phase-amplitude object which is imaged in focus (i.e., with defocusing D set equal to zero) or from the mass-density variations of an object which is imaged under scattering-contrast conditions. In order to have a manageable problem, the number of parameters in this example is made equal to three: the position parameters p_k and the width parameters s_k are set to fixed values, and the three unknown parameters are the amplitudes of the three Gaussian functions. The estimation problem is thus simplified to a linear relation between only three parameters. The remaining problem is nonetheless not trivial, as will become clear in the following analysis. The full problem will be treated in the next section, where we use two-dimensional images. From Eq. (167) and using Eq. (191) we obtain the three likelihood equations

Σ_{l=-N/2}^{N/2-1} (λ_s T N^{-1} - n̂_l λ_s T N^{-1} λ_l^{-1}) exp{-½s_k^{-2}[l(2ε)^{-1} - p_k]^2} = 0,   k = 1, 2, 3   (192)
from which a solution yields candidate values for the amplitudes a_1, a_2, and a_3. The set of equations (192) is nonlinear; hence the solution has to be obtained numerically. This can be done by a zero-locating algorithm (e.g., Powell, 1970) which attempts to find a zero in the neighborhood of a specified starting point. After a zero has been located, it remains to be checked whether it corresponds to a minimum, a maximum, or a saddle point of the corresponding log-likelihood function. In general Eq. (192) will have more than one zero. Thus, in order to obtain the global maximum of the log-likelihood function, the calculation must be repeated for various starting points. The set of equations (192) provides only a partial description of the function to be maximized:

log L(n̂, a) = Σ_{l=-N/2}^{N/2-1} [-λ_l(a) + n̂_l log λ_l(a)]   (193)
For computational purposes it is better to maximize Eq. (193) directly than to locate the zeroes of Eq. (192), because using the function values in Eq. (193) will reveal regions of low likelihood where the location of local minima is not of interest. In this way the computational effort in the determination of zeroes which correspond to local minima is avoided. A local maximum of Eq. (193) is obtained using gradient information about the log-likelihood function by applying the algorithm proposed by Fletcher (1970), which is a variation of the algorithm of Fletcher and Powell (1963). The Fletcher algorithm departs from a starting point specified by the user, and then performs Newton-like steps in the direction of the local maximum in the neighborhood of the starting point until certain convergence criteria are satisfied. The estimated values for the parameters are defined to correspond to the global maximum of the log-likelihood function. With the algorithm indicated above there is generally no certainty that the obtained maximum is indeed the global optimum. Repeating the calculation from various starting points, or from a number of randomly chosen starting points (Box, 1966), increases the likelihood that the largest maximum obtained is indeed the global optimum. The numerical strategy outlined above does not provide absolute certainty of obtaining the global optimum. We will return to this point further on, where we will indicate a method that obtains the global optimum with certainty. The reliability of the numerical strategy of using the maximizing algorithm from different starting points can be improved upon by analyzing the shape of the log likelihood at the largest maximum found as a function of the parameters. A low-dose electron micrograph consists of a large number of data points. Hence it is to be expected that, for cases that are not excessively pathological, the likelihood function will tend to its asymptotic form.
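The multi-start strategy just described can be sketched as follows for the three-amplitude model of Eq. (191). Plain gradient ascent stands in for the Fletcher algorithm, and the grid size, positions, and widths are illustrative assumptions.

```python
import numpy as np

# Multi-start maximization of the Poisson log-likelihood, Eq. (193), for the
# three Gaussian amplitudes of Eq. (191). Positions p_k and widths s_k are
# fixed, as in the text; the numbers themselves are illustrative.
rng = np.random.default_rng(2)
l = np.arange(-100, 101)                        # N = 201 image cells
p = np.array([-60.0, 0.0, 60.0])                # fixed positions
s = np.array([20.0, 30.0, 20.0])                # fixed widths
G = np.exp(-0.5 * ((l[:, None] - p) / s) ** 2)  # the three Gaussian profiles
Lam = 8.0                                       # lambda_s T / N per cell
a_true = np.array([0.25, 0.35, 0.25])
n_hat = rng.poisson(Lam * (1.0 - G @ a_true))   # simulated low-dose image

def log_lik(a):
    lam = Lam * (1.0 - G @ a)
    if np.any(lam <= 0.0):
        return -np.inf                          # outside the feasible region
    return np.sum(n_hat * np.log(lam) - lam)

def grad(a):
    lam = Lam * (1.0 - G @ a)
    return G.T @ (Lam - Lam * n_hat / lam)      # d log L / d a_k, cf. Eq. (192)

def ascend(a, step=1e-4, iters=500):
    for _ in range(iters):                      # plain gradient ascent
        a = a + step * grad(a)
    return a

# repeat from several random starting points and keep the best maximum found
starts = [rng.uniform(0.0, 0.5, size=3) for _ in range(6)]
a_ml = max((ascend(a0) for a0 in starts), key=log_lik)
```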
The asymptotic form of the likelihood function has one unique maximum, the width of which is characterized by the Cramér-Rao bound. The actual width is calculated as a function of the parameters at the maximum attained, and these values are compared. When these values are consistent with each other, the pertinent maximum is believed to be the global optimum. In Fig. 11 a realization of a simulated Poisson process with parameter λ_l, l = {-N/2, …, N/2 - 1} [cf. Eq. (191)] is presented. Table I contains a summary of a series of numerical simulations, with increasing electron dose λ_l, of the estimation of a_1, a_2, and a_3 using the procedure described above. The largest maximum found after a series of 1, 6, 11, and 16 sets of random starting points (Box, 1966) is taken to be the global maximum, the location of which determines the estimated values â_1, â_2, and â_3. It appeared from the simulations that there is no difference in the maximum found after the different starting points generated. This adds to the confidence that the maximum obtained is indeed the global optimum. The shape of this maximum, which is
FIG. 11. The simulated one-dimensional low-dose image (solid line) and λ_l (dashed line) of Eq. (191), with the following parameter setting: N = 201, a_1 = 0.25, p_1 = -d/3, s_1 = d/5, a_2 = 0.35, p_2 = 0, s_2 = d/3, a_3 = 0.25, p_3 = d/3, s_3 = d/5, ε = 0.5 × 10^{-2}, d = 10^4, λ = 4 pm, λ_s T N^{-1} = 8; the dose corresponds to 50 e/nm^2 (0.5 e/Å^2).
TABLE I. Summary of a series of numerical simulations, with increasing electron dose, of the estimation of a_1, a_2, and a_3: estimated values â_1, â_2, â_3, positive and negative confidence bounds, experimental covariance, and Cramér-Rao confidence intervals.
to be compared with the width predicted by the Cramér-Rao bound, is analyzed as follows. The parameters are set to the values corresponding to the largest maximum. For each parameter a consecutive determination is made of the two values for which the log-likelihood function has decreased by 0.5 in comparison to its value at the pertinent maximum. During this calculation of the positive and negative error bounds of the parameter in question, the other parameters retain their values corresponding to the maximum. When the log-likelihood function has its asymptotic form corresponding to an infinite number of observations, the above procedure yields the exact confidence intervals of the parameters (Eadie et al., 1971, p. 203). The decrease of the log-likelihood function by 0.5 corresponds to confidence intervals of one standard deviation. Hence in the asymptotic regime there is a probability of 68.3% that the parameter in question will be in the interval bounded by the two error values found. Evaluation of the Cramér-Rao bound yields a (3 × 3) matrix V which is the inverse of the Fisher information matrix
f_{i,j}(a) = (λ_s T N^{-1})^2 Σ_{l=-N/2}^{N/2-1} λ_l^{-1}(a) exp{-½s_i^{-2}[l(2ε)^{-1} - p_i]^2 - ½s_j^{-2}[l(2ε)^{-1} - p_j]^2}   (194)
Maximum-likelihood estimates are asymptotically unbiased in the number of observations. Hence taking the square root of the diagonal terms of V yields the predicted confidence intervals for the parameters in question. The Cramér-Rao bound depends via Eq. (194) on the true values of a. In practice this bound has to be evaluated using the estimated parameters â_1, â_2, and â_3. The confidence values according to the experimental covariance matrix are indicated in Table I. In the simulations of this example, in which the true values of a are known, we have determined for comparison purposes the exact Cramér-Rao confidence intervals. We see that with increasing dose the error bounds approach the Cramér-Rao confidence intervals quite closely. The symmetry in the parameters describing Gaussian functions 1 and 3 (cf. Fig. 11) is reflected in the error bounds of the parameters a_1 and a_3. In the following example the a priori information concerns the object structure, unlike the present example, where the intensity distribution is an a priori known function of a number of parameters.

3. A Weakly Scattering Bell-Shaped Object Structure

The third and final example of this subsection, presenting case studies about the estimation of a priori parameters in the evaluation of low-dose electron micrographs, consists of a weak Gaussian amplitude object. Like the other two examples in this section, the analysis is for simplicity limited to one lateral dimension. We assume that the object wave function
can be expressed as the following function of three unknown parameters θ = (a, p, s)

ψ_0(x_0) = exp{-½a exp[-s^{-2}(x_0 - p)^2]}   (195)

These three parameters must be estimated from a recording of the intensity distribution in the image plane of the microscope. Assuming furthermore that the parameter a is not too large, a < 1, Eq. (195) can be approximated by (weakly scattering object)

ψ_0(x_0) = 1 - ½a exp[-s^{-2}(x_0 - p)^2]   (196)
The wave function in the exit pupil of the optical system becomes Eq. (197), where it is assumed that the position p is not located near one of the endpoints -d and +d of the object plane, and that hence the Fourier transform can be carried out from -∞ to +∞. If the width s of the Gaussian function is of the order of magnitude of several sampling distances (2ε)^{-1} in the image plane, then the Gaussian function in the exit pupil is so narrow that the integration interval -ε to +ε is equivalent to -∞ to +∞. Due to the rapid decrease of the Gaussian function, the aberration function exp[-iγ(ξ)] only contributes in the region ξ ≈ 0. The image wave function follows as
ψ(x) ≈ 1 - ½a exp[-s^{-2}(x - p)^2]   (198)

We see that, under the conditions stated above, the image wave function is in good approximation equal to the object wave function, cf. Eq. (196). The intensity parameter of the recorded Poisson process n̂ = (n̂_{-N/2}, …, n̂_{N/2-1}) is given by [see, e.g., Eq. (164) or (178)]

λ_l = λ_s T N^{-1} (1 - a exp{-s^{-2}[l(2ε)^{-1} - p]^2})   (199)
In order to obtain estimated values for the parameters θ = (a, p, s), we maximize the log-likelihood function log L(·,·), which reads

log L(n̂, θ) = Σ_{l=-N/2}^{N/2-1} [-λ_s T N^{-1}(1 - a exp{-s^{-2}[l(2ε)^{-1} - p]^2}) + n̂_l log(1 - a exp{-s^{-2}[l(2ε)^{-1} - p]^2})]   (200)
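Numerically, the maximization of Eq. (200) can be sketched with a coarse grid search over (a, p, s). The simulated data and all numbers below are illustrative assumptions; in practice an iterative maximizer would be started from points of such a grid.

```python
import numpy as np

# Grid search for the maximum of a log-likelihood of the form of Eq. (200).
# The sampling distance (2*eps)**-1 is set to 1 for simplicity.
rng = np.random.default_rng(3)
l = np.arange(-50, 51)                  # N = 101 image cells
Lam = 20.0                              # lambda_s T / N per cell (illustrative)
a0, p0, s0 = 0.3, 10.0, 15.0            # "true" object parameters

def intensity(a, p, s):
    return Lam * (1.0 - a * np.exp(-((l - p) / s) ** 2))   # Eq. (199)

n_hat = rng.poisson(intensity(a0, p0, s0))

def log_lik(a, p, s):
    lam = intensity(a, p, s)
    return np.sum(n_hat * np.log(lam) - lam)

# brute-force scan over a coarse parameter grid
grid = [(a, p, s)
        for a in np.linspace(0.05, 0.6, 12)
        for p in np.linspace(-30.0, 30.0, 25)
        for s in np.linspace(5.0, 40.0, 15)]
a_ml, p_ml, s_ml = max(grid, key=lambda t: log_lik(*t))
```

Unlike the amplitude-only problem of the previous example, the dependence on p and s is nonlinear, which is what produces the multiple local maxima discussed in the text.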
A necessary condition for a maximum of Eq. (200) is that the likelihood equations vanish, (∂/∂θ_i) log L(n̂, θ) = 0, i = 1, 2, 3. For the width parameter s this yields

Σ_{l=-N/2}^{N/2-1} (λ_s T N^{-1} - n̂_l λ_s T N^{-1} λ_l^{-1}) 2s^{-3}[l(2ε)^{-1} - p]^2 exp{-s^{-2}[l(2ε)^{-1} - p]^2} = 0   (201)
These equations form a coupled set of nonlinear equations which cannot be solved analytically. Therefore we proceed numerically in order to obtain estimated values for the parameters. As indicated in the previous example, we prefer to maximize Eq. (200) iteratively from a starting point rather than to locate the zeroes of Eq. (201). Repeating the numerical procedure from various starting points will in the end reveal the global optimum. When using only a finite number of starting points, however, one has no absolute certainty that the highest maximum found in the optimizing process is the global maximum. Comparison of the shape of the maximum in question with its asymptotic form as expressed by the Cramér-Rao bound can identify the global maximum when the shape values are consistent. Table II summarizes a series of numerical simulations estimating a, p, and s, performed with increasing electron dose λ_l = λ_s T N^{-1}. The numerical procedure is identical to the one used in the previous example, which is referred to for details. We observe that for the highest dose value the width of the largest maximum approaches the asymptotic values predicted by the Cramér-Rao bound. This indicates that this maximum is indeed the global maximum. For the lower dose values the deviations are considerable, and this check on the results obtained does not work. As has already been remarked in connection with the previous example, the Cramér-Rao bound depends in many cases on the unknown parameters to be estimated. For very low-dose electron micrographs, one cannot be certain of being in the regime of asymptotic statistics. These two facts make the procedure of locating the global maximum by repeating the same calculations over and over while varying the starting values rather unsatisfactory. Fortunately, a numerical procedure exists which locates the global optimum with certainty. A discussion of this method in a general setting is beyond the scope of this article (see Slump and Hoenders, 1985). For the present discussion, an outline of the procedure is given in the following.
For the present discussion, an outline of the procedure is presented in the following. The method just referred to is based on information about the total number of stationary points of the log-likelihood function in the domain of interest. A stationary point of log L(·,·) corresponds to a zero of the set of likelihood equations (∂/∂θ_i) log L(n̂, θ) = 0, i = 1, 2, ... [cf. Eq. (201)] of this
TABLE II
Summary of the numerical simulations of the estimation of the amplitude a, the position p, and the width s of a Gaussian function.ᵃ

           Estimated values        Positive bound           Negative bound             Experimental covariance   Cramer-Rao
Dose λ_0   â      p̂      ŝ        a      p      s          a       p       s          a      p      s           a      p      s
2          0.369  598.6  950.2    0.147  340.9  >1550.     -0.205  -534.6  -544.0     0.156  395.4  548.6      0.187  339.2  491.2
4          0.346  578.8  907.9    -      267.9  760.0      -       -468.6  -312.3     -      -      -          0.133  239.9  347.4
6          0.750  492.1  148.2    0.189  31.4   82.6       -0.188  -42.8   -66.3      0.185  37.7   70.9       0.108  195.8  283.6
8          0.322  671.9  820.4    0.078  183.3  333.9      -0.079  -196.2  -208.4     0.077  179.1  243.1      0.094  169.6  245.6

ᵃ Cf. Eq. (199), from Poisson-distributed data with the following parameter setting: N = 101, ε = 10⁻…, a = 0.3, p = d/3 = 833.33, d = 2500, s = d/4 = 625.0. A horizontal bar indicates that the value in question could not be calculated due to severe nonparabolic behavior of the log-likelihood function for the corresponding parameter, caused by the neighboring local extrema. The size of the sampling cell is 4 Å². Hence the dose values correspond with 0.5, 1.0, 1.5, and 2.0 e⁻/Å².
IMAGE HANDLING IN ELECTRON MICROSCOPY
273
example. The information about the total number of local extrema present in a domain is of great value, as it tells us whether an iterative procedure to locate all of the zeros of the likelihood equations has missed a zero. This information is provided by evaluating an integral derived by Picard (1892) from previous work by Kronecker (1878) at the end of the nineteenth century. The integrands contain relatively simple algebraic quantities involving derivatives up to the third order of the log-likelihood function involved. The integration must be performed over the domain of interest. For an extensive discussion of this so-called Kronecker-Picard (KP) integral, illustrated with examples, see Hoenders and Slump (1983). The Kronecker-Picard integral yields the exact number of zeros of a set of equations in a domain, provided that the zeros are simple; i.e., the Jacobian must not be equal to zero at these points.

C. Two-Dimensional Examples

The application of the maximum-likelihood method to the estimation problems of the previous section illustrates the possibilities and properties of this method in estimating a priori parameters in the evaluation of low-dose electron micrographs. Image data are, however, essentially two dimensional. Therefore, in this section a more realistic example is presented which is based on two-dimensional data. The estimation problem presented in this section is inspired by the second example of the previous section [cf. Eq. (191)]. The a priori image intensity is assumed to be a function of 15 parameters
λ_{k,l} = λ_0 (1 + Σ_{m=1}^{3} a_m exp{−½ s_m⁻²[k(2ε)⁻¹ − p_m]² − ½ r_m⁻²[l(2ε)⁻¹ − q_m]²})    (202)

with λ_0 = λ_nTN⁻². Besides the two-dimensional data, a difference with the estimation problem in Eq. (191) is that the amplitudes a_m are not constrained to be smaller than unity. The problem of this section is the estimation of the parameters of the three Gaussian blobs from simulated low-dose images, with Poisson-distributed picture elements n̂_{k,l}, (k,l) = (−½N, ..., ½N − 1), of which the corresponding intensity λ_{k,l} is given by Eq. (202). The simulated images are presented in Fig. 12. The estimated values for the parameters are obtained from maximizing the log-likelihood function corresponding to Eq. (202). The numerical procedure is identical to the one used in the one-dimensional examples of the previous section, where the details are described. Table III summarizes the series of simulations estimating the parameters p, q, r, and s, performed with increasing electron dose λ_0, with fixed values for the amplitudes a and with the image data of Fig. 12. The amplitudes
274
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
FIG. 12. Simulated images used in the estimation calculations which are summarized in Table III. (a) contains the noise-free image corresponding to Eq. (202).
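Simulated images of this kind can be generated directly from the model. The sketch below mirrors the structure of Eq. (202), with three Gaussian blobs on a constant background; all parameter values are hypothetical (in pixel units), and the coordinate scaling by (2ε)⁻¹ is absorbed into them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-blob intensity in the spirit of Eq. (202), pixel units.
N = 128
k, l = np.meshgrid(np.arange(-N // 2, N // 2),
                   np.arange(-N // 2, N // 2), indexing="ij")
lam0 = 8.0  # electrons per pixel (the dose lam0 = lam_n * T * N**-2)

blobs = [  # (a_m, p_m, q_m, s_m, r_m), all illustrative values
    (4.0, -20.0, 25.0, 8.0, 8.0),
    (6.0, 40.0, 12.0, 10.0, 10.0),
    (5.0, -10.0, -40.0, 8.0, 12.0),
]

lam = np.ones((N, N))
for a, p, q, s, r in blobs:
    lam += a * np.exp(-0.5 * ((k - p) / s) ** 2 - 0.5 * ((l - q) / r) ** 2)
lam *= lam0

image = rng.poisson(lam)  # one realization of the low-dose image process
```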
TABLE III
Summary of the numerical simulations of the estimation of the parameters p, q, r, and s of the three Gaussian blobs.ᵃ

Dose λ_0   p̂_1      q̂_1     r̂_1     ŝ_1     p̂_2     q̂_2     r̂_2     ŝ_2     p̂_3      q̂_3      r̂_3     ŝ_3
8          -2516.3  3127.9  3026.3  2059.8  5070.6  1578.8  2607.6  2070.9  -1288.5  -5181.4  3000.6  3660.5
16         -2529.4  3170.2  3144.9  2084.0  5115.3  1594.6  2587.5  2093.1  -1296.0  -5076.4  3110.0  3707.6
32         -2537.1  3185.1  3173.7  2114.9  5098.2  1593.6  2574.1  2112.2  -1287.1  -5085.2  3156.9  3752.3
48         -2547.2  3194.9  3175.9  2105.8  5088.9  1597.0  2557.2  2115.0  -1285.5  -5118.0  3172.5  3827.8
64         -2558.6  3193.6  3170.7  2116.2  5094.6  1601.1  2568.1  2124.6  -1285.9  -5122.9  3179.0  3822.1

ᵃ Cf. Eq. (202), from Poisson-distributed data, see Fig. 12, with the following parameter setting: N = 128, ε = 0.5 × 10⁻…, d = 6400, a_1 = 4, p_1 = -2560, q_1 = 3200, r_1 = 3200, s_1 = 2133, a_2 = 6, p_2 = 5120, q_2 = 1600, r_2 = 2560, s_2 = 2133.3, a_3 = 5, p_3 = -1280, q_3 = -5120, r_3 = 3200, s_3 = 3840.0.
TABLE IV
The values of the exact Cramer-Rao bound of the parameters p, q, r, and s of the three Gaussian blobs.ᵃ
Dose λ_0   p_1    q_1    r_1    s_1    p_2    q_2    r_2    s_2    p_3    q_3    r_3    s_3
8          35.0   26.2   43.7   25.0   33.4   13.6   41.5   11.3   14.2   62.0   14.7   111.4
16         24.6   18.5   30.8   17.6   23.6   9.6    29.4   8.0    10.1   43.8   10.4   78.8
32         17.4   13.1   21.8   12.5   16.7   6.8    20.8   5.6    7.1    31.0   7.4    55.7
48         14.2   10.7   17.8   10.2   13.6   5.6    17.0   4.6    5.8    25.3   6.0    45.5
64         12.3   9.3    15.4   8.8    11.8   4.8    14.7   4.0    5.0    21.9   5.2    39.4

ᵃ Cf. Eq. (202).
a are excluded from the estimation for reasons of computational convenience. The parameters that must be estimated are now all of the same order of magnitude. The exact Cramer-Rao bounds of the estimated parameters are presented in Table IV. An in-depth analysis of the shape of the attained maxima reveals that only for the highest dose values does the width of these maxima approach the values of the Cramer-Rao bound as presented in Table IV. This is due to the fact that Eq. (202) is a highly nonlinear function of the parameters that must be estimated. The analysis of the shape of the attained maxima was greatly facilitated through the use of the MINUIT program (James and Roos, 1975), developed at CERN, Geneva, for function optimization. Even with the capabilities for global optimum search offered by the MINUIT program, we still have no guarantee of attaining this maximum.

D. Discussion and Conclusions
The subject of this section is the optimal use of a priori information about the structure of the imaged specimen in low-dose electron microscopy. In this section we take advantage of the prior information available by modeling the object structure as a functional relationship between a number of parameters. From this description a theoretical image intensity distribution results, i.e., the image contrast in the limit of an infinite number of electrons contributing to the image formation. Using the statistical technique of maximum-likelihood estimation, numerical values are obtained for the unknown parameters from the registered realization of the stochastic low-dose image process. The advantage of the approach of parameter estimation is that all the information available in the data is used to determine the relevant parameters about the imaged specimen one wants to know. A disadvantage of parameter estimation is the theoretical image contrast which is required as a function of
the parameters to be estimated. This image contrast must be based on the object wave function, a calculation which is analytically very elaborate and complicated for phase-contrast images. Furthermore, the determination of the object wave function as a function of a number of parameters is not a simple task. Of course, the required functions can be computed numerically. However, the whole estimation procedure will then become rather time consuming. More feasible is the situation at a much lower resolution scale, where scattering contrast dominates the image formation. The required image contrast as a function of the parameters can now be based on the much simpler mass-density model of the specimen involved. Because of the lower resolution, the sampling cells in the image are much larger, and better statistics in the data are achieved for low electron-dose values. A further complication with parameter estimation is the fact that in general the estimation problem is highly nonlinear in the parameters of interest. This nonlinearity manifests itself in the presence of local maxima in the likelihood function. The search for the global maximum of the likelihood function is a very complicated numerical problem when local extrema are present. Since the estimated parameters are based on stochastic data, the obtained values are also random variables. The statistical properties of the results are as important as the actual numerical values calculated. Unfortunately, the determination of even the first two moments is often a complicated task, due to the nonlinearity of the problem. A statistical characterization of the estimated parameters can only be established in the asymptotic regime of the maximum-likelihood estimator. Again the low-resolution imaging of specimens with scattering contrast is the most promising situation for the application of maximum-likelihood parameter estimation to low-dose electron microscopy in molecular biology.
V. STATISTICAL HYPOTHESIS TESTING

A. Introduction to Statistical Hypothesis Testing in Electron Microscopy
The present section is the second one devoted to the optimal use of a priori information. The evaluation of low-dose electron micrographs is considered using the techniques of statistical decision theory. First we provide a short introduction to the very useful technique of statistical hypothesis testing. This technique will be applied in consecutive subsections to three key problems in the evaluation of low-dose images:

(1) The detection of the presence of an object with specified error probabilities for missing the object and for false alarm.
(2) The detection of the positions of single heavy atoms, to be used as markers in the analysis of images of identical molecules with random orientation. The markers allow the images to be aligned and averaged, which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981).
(3) How to measure the statistical significance of, e.g., applying image processing to the low-dose image, in order to judge to what extent artefacts are introduced by the computer processing of the image.

The visual interpretation of low-dose electron micrographs of moderately stained or unstained biological material is almost impossible due to the low and noisy contrast. Therefore computer processing of these images is indispensable. However, image processing applied to electron micrographs by means of a digital computer has to be performed with great care in order to prevent artefacts and false judgements about the structure to be observed. For overcoming these complications, which can be severe, especially for low-dose images, statistical decision theory offers a tool for an independent and objective check afterwards by quantifying the statistical significance of the obtained results. This can be done by statistical hypothesis testing whenever one has prior information about the structure being observed. In many cases occurring in practice, the information in an electron micrograph is partially redundant. This redundancy of the image, which is equivalent to a priori information, offers the opportunity to reduce the influence of noise. One way in which this a priori information can be used optimally is to apply the method of maximum likelihood to the estimation of unknown parameters. This technique, which has been studied in depth in the previous section, is especially suited when detailed a priori information about the parametrization of the specimen and the resulting image intensity distribution is available.
Another approach to using the available a priori information in an optimal way is the construction of one or more hypotheses about the image distribution. Next, the statistical significance of the hypothesis under consideration is tested against the recorded image intensity, which in the case of consistency results in acceptance of the hypothesis; otherwise it is rejected. The rest of this section contains an outline of this technique of hypothesis testing (for a more general discussion see, e.g., Van der Waerden, 1969; Kendall and Stuart, 1967; Lehmann, 1959). Throughout this chapter a recorded low-dose image is represented by an N × N array of statistically independent Poisson-distributed random counts n̂_{k,l}, which correspond to the number of electrons that have arrived in the (k,l) image cell, (k,l) = {−½N, ..., ½N − 1}, with N² roughly equal to the number of degrees of freedom of the image. The probability distribution of an individual n̂_{k,l} has been discussed in Section I,G [cf. Eq. (12)]. The following example is a simple application of hypothesis testing to a recorded image.
Suppose that we have to decide between two possibilities: in the image n̂, or in a smaller region of interest, either specimen A or specimen B is imaged. The specimens A and B can be, e.g., two different biological molecules. Specimen A is characterized by the intensity parameter λ^A_{k,l} of the Poisson process of the image recording, and let specimen B correspond to the intensity parameter λ^B_{k,l}. We assume here that the intensity parameters λ^A_{k,l} and λ^B_{k,l} are completely specified; i.e., they do not depend on unknown parameters that have to be determined from the image data. In this case we have two simple hypotheses: the null hypothesis H_0: specimen A is imaged, and the alternative hypothesis H_1: specimen B is imaged. The null hypothesis is the hypothesis which is tested, here chosen to correspond to specimen A. Composite hypotheses also exist, in the case that there is not one simple alternative hypothesis but instead a number of alternatives, usually involving a free parameter. In the next subsection an example of such a composite hypothesis will be encountered. From the recorded image n̂ we now have to test hypothesis H_0 against its alternative H_1 and to decide whether specimen A or B was imaged. In order to do so, a so-called test statistic T is needed, which is a function of the experimental data to be specified further. Let W be the sample space of the test statistic, i.e., the space containing all possible sets of values of T. The space W is now divided into a critical region w and a region of acceptance W − w. If T falls within the critical region, hypothesis H_0 is rejected; otherwise it is accepted. The critical region w is chosen in such a way that a preselected level of significance α of the test is achieved. This level of significance α is defined as the probability that T is in w while H_0 is true (see Fig. 13a)

α = P{T ∈ w | H_0} = ∫_c^∞ p(T | H_0) dT    (203)
In other words, α is the probability that H_0 is rejected although the hypothesis is true. Having chosen a value of α, the value of c follows from Eq. (203), such that if T ≥ c, H_0 is rejected and thus H_1 is accepted; if T < c, H_0 is accepted and H_1 is rejected. Whether a test is useful or not depends on its ability to discriminate against the alternative hypothesis H_1. This is measured by the power of the test, which is defined as the probability 1 − β that T is in w while H_1 is true. This makes β the probability that H_0 is accepted although H_1 is true (see Fig. 13b)

β = P{T ∈ W − w | H_1} = ∫_{−∞}^c p(T | H_1) dT    (204)
The performance of a specific test is measured by the two types of error that may occur. The first is type I: H_1 is chosen while H_0 is true ("false alarm"). The probability of a type-I error is α. The second error is called type II: H_0 is chosen while H_1 is true ("miss"). The probability that a type-II error will be made is β.
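For a one-pixel toy problem with Poisson counts, the two error probabilities can be evaluated exactly. The mean counts and the threshold below are hypothetical, and the test statistic is simply the observed count:

```python
from scipy.stats import poisson

# H0: mean count lam_A; H1: mean count lam_B < lam_A (the object removes
# electrons). Reject H0 when the observed count T falls below a threshold c,
# i.e., the critical region is w = {T < c}. All numbers are illustrative.
lam_A, lam_B, c = 20.0, 10.0, 15

alpha = poisson.cdf(c - 1, lam_A)        # P{T in w | H0}: type-I error
beta = 1.0 - poisson.cdf(c - 1, lam_B)   # P{T outside w | H1}: type-II error
```

Shifting c trades α against β, exactly as in Fig. 13: lowering c reduces false alarms at the cost of more misses.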
280
CORNELIS H . SLUMP A N D HEDZER A. FERWERDA
FIG. 13. The level of significance α and the power 1 − β of testing the null hypothesis H_0 against the simple alternative hypothesis H_1.
In hypothesis testing one has to choose the significance level α, i.e., the probability of a type-I error one is willing to accept, and the test statistic T, which is to be chosen such that for a given value of α, β is minimal. In this section three test statistics will be compared: the likelihood ratio, the chi-square test, and Student's t test. These test statistics are introduced in the following. The likelihood of observing the recorded realization n̂ of the stochastic image process is given by [cf. Eq. (19)]
L(n̂, λ) = Π_k Π_l exp(−λ_{k,l}) (λ_{k,l})^{n̂_{k,l}} / n̂_{k,l}!    (205)
The likelihood ratio q is the test statistic which is defined as the ratio of the probabilities of obtaining the recorded count pattern under the hypotheses H_0 and H_1
q(n̂) = L(n̂, H_0)/L(n̂, H_1) = exp(Σ_k Σ_l [λ^B_{k,l} − λ^A_{k,l} + n̂_{k,l}(log λ^A_{k,l} − log λ^B_{k,l})])    (206)

Having calculated the likelihood ratio q according to Eq. (206), its value is to be compared with a threshold value q_0. If q ≥ q_0, hypothesis H_0 is accepted; otherwise H_1 is chosen. The test procedure is now completely specified; what remains to be solved is how the threshold value q_0 should be chosen in order to correspond to the desired α level. A further question is what the resulting
power of the test will be. In general these matters depend on the hypothesis at hand, i.e., the differences between λ^A_{k,l} and λ^B_{k,l}. The likelihood ratio is a powerful test statistic for the decision between the two simple hypotheses H_0 and H_1. Another test statistic, which is especially suited for measuring the discrepancy between observed n̂_{k,l} and expected data values, is the chi-square statistic T_{χ²} with N² − 1 degrees of freedom

T_{χ²}(N² − 1) = Σ_k Σ_l λ_{k,l}⁻¹(n̂_{k,l} − λ_{k,l})²    (207)
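Both test statistics can be evaluated on a simulated image; the two intensity maps and the object region below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two completely specified Poisson intensity maps: H0 (specimen A, flat)
# and H1 (specimen B, which removes electrons from a square region).
# All numbers are illustrative.
N = 64
lam_A = np.full((N, N), 5.0)
lam_B = lam_A.copy()
lam_B[20:40, 20:40] = 4.0

counts = rng.poisson(lam_B)   # recorded image; specimen B is actually imaged

# Log of the likelihood ratio of Eq. (206); q >= q0 accepts H0, so a
# strongly negative log q favours H1.
log_q = float(np.sum(lam_B - lam_A
                     + counts * (np.log(lam_A) - np.log(lam_B))))

# Chi-square statistic of Eq. (207), computed with the expected values of H0.
T_chi2 = float(np.sum((counts - lam_A) ** 2 / lam_A))
```

For this weak object, log q comes out strongly negative while the chi-square value stays near its H_0 expectation of N² − 1, a small illustration of the greater sensitivity of the likelihood ratio when the alternative is completely specified.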
The larger the value of T_{χ²}, the larger is the discrepancy between the observed and expected data values. The expected data values are computed on the basis of a hypothesis H_0. This H_0 is rejected if the obtained χ² value exceeds the critical value at the desired significance level, e.g., χ²_{0.95} or χ²_{0.99}, which are the critical values to be obtained from tables at the 5% and 1% significance levels, respectively (see, for example, Van der Waerden, 1969, Chap. 14, Table 6). In that case the chi-square test concludes that the observations differ significantly from the expected values at the chosen level of significance. Otherwise H_0 is accepted, or at least not rejected. In the next subsection a third test statistic will also be encountered, which has a more limited scope of application, namely Student's t test. This is the appropriate test statistic if one wants to test whether the observed mean value of a set of N² independent normal-distributed random variables (X_1, X_2, ..., X_{N²}) is consistent with the expected value μ. The test statistic t, which is defined as

t = s⁻¹(X̄ − μ)N    (208)

where X̄ = N⁻² Σ_i X_i is the sample mean and s² = (N² − 1)⁻¹ Σ_i (X_i − X̄)² denotes the sample variance, has a Student's t distribution with N² − 1 degrees of freedom. Also for this test statistic the critical values, e.g., t_{0.95} and t_{0.99}, can be obtained from tables (for example, Van der Waerden, 1969, Chap. 14, Table 7). If the test statistic exceeds the critical value, the hypothesis H_0 is rejected. In the next subsections statistical hypothesis testing is applied to problems in electron microscopy.

B. Object Detection
A critical problem in the evaluation of low-dose electron micrographs is the detection of the presence of an object in the noisy images. Once an object has been detected in a certain region of interest, various techniques can be applied to extract information about this object. However, first the question has to be answered whether there is an object present or whether the pertinent image intensity variation is just a random fluctuation. Otherwise, faulty
conclusions will arise from applying image processing to an image which consists of random noise only. The detection of objects in noise-limited micrographs has been treated by Saxton and Frank (1977), using a matched-filter approach based on cross-correlation, and by Van Heel (1982), applying a variance-image operator to the low-dose image. The visual perceptibility of objects at low intensity levels has been treated in the pioneering work of Rose (1948a,b) in the early days of television systems. The results of Rose's analysis are fundamental to low-dose electron microscopy and will be outlined briefly. In order to detect an image-resolution cell of one picture element (pixel) having a contrast C, where C is defined as the relative difference with respect to the background intensity λ_0, C = Δλ/λ_0, in an image of N × N pixels, a total number of electrons n_T is needed in the image. According to Rose (1948a,b) this number is given by
n_T = N²k²C⁻²    (209)

It is assumed that this total number of electrons is uniformly distributed over the image, and further the detection quantum efficiency (DQE) is taken to be unity, so that every impinging electron is recorded. The factor k in Eq. (209) is introduced in order to avoid false alarms and should be between 4 and 5 when the image has about 10⁵ pixels. The following example, adopted from Rose (1973), illustrates Eq. (209) and clarifies the role of the factor k. Suppose we want to detect a single picture element with a contrast value C of 10⁻² at an unknown position in an image consisting of 100 × 100 pixels. According to Eq. (209) a total number of (at least) 10⁸k² imaging electrons is needed in order to make this pixel visible. A pixel of the background receives in the mean a number of electrons λ_0 equal to 10⁴k², and the pixel to be detected expects to receive λ = 9900k² electrons. The recorded numbers of electrons, n̂_0 and n̂, respectively, are Poisson-distributed random variables, as is discussed in Section I,G. Due to the relatively large numbers involved, the Poisson distribution is well approximated by the Gaussian probability density function

P{n̂_0 = m} = exp(−λ_0) λ_0^m (m!)⁻¹ ≈ p(m) = (2πσ_0²)^{−1/2} exp[−(m − λ_0)²/(2σ_0²)]    (210)
where σ_0² equals λ_0. Figure 14 presents the two probability density functions; the shaded areas correspond to the type-I error α and the type-II error β, which depend on the decision threshold c. The distance between the two peaks is 100k². This is equal to k times the standard deviation, which is nearly the same for both density functions. If the decision threshold c were situated halfway between λ and λ_0, the error probabilities α and β would be equal.
FIG. 14. The two probability density functions p(m, λ_0) and p(m, λ), together with the shaded areas α and β, representing, respectively, the probability of a type-I error ("false alarm") and a type-II error ("miss").
Usually this is a desirable situation, but not in this case, because there are 10⁴ − 1 background pixels. Although Rose cannot use the threshold c explicitly, because his detection criterion is visual perceptibility, it is argued in Rose (1973) that the distance from c to λ_0 should be at least 4 standard deviations in order to bring the total α risk down to 0.3. With a distance from λ to c of one standard deviation, the β risk becomes 0.158, and the value for k is found to be 5. When the image contains fewer pixels, the value for k can be lowered somewhat. Note that the dose values in this example (λ_0 = 25 × 10⁴ electrons/pixel) are far away from low-dose imaging conditions, which underlines the inherent difficulty of the evaluation of low-dose electron micrographs at high resolution. Considering the detection of image detail with a larger spot size than one pixel, Eq. (209) can be rewritten as

n_T = Ak²(dC)⁻²    (211)
where d is the diameter of the spot to be detected and A is the area of the image. According to Eq. (211) the diameter of a test spot which is just visible varies inversely with the contrast C for a fixed value of the electron dose. This relation is illustrated in Fig. 15, where a test pattern adopted from Rose (1948b) is presented, which consists of a two-dimensional array of discs in a uniform background. The diameter d of the discs decreases in steps of a factor of 2 while moving to the right along a row, and the contrast C of the discs decreases in steps of a factor of 2 while moving downwards along a column. These images show that the boundary between the visible and invisible discs lies roughly along a diagonal where the product dC is constant. With Fig. 15 the discussion of Rose's detection criterion is concluded, and we turn to hypothesis testing for the detection of objects. For objects
FIG. 15. Test pattern [Rose (1948b)], illustrating Eq. (211). (a) Original test pattern. (b)-(d) Test patterns for electron-dose values of 8, 16, and 32 electrons per pixel.
consisting of one pixel, not much can be improved upon Eq. (209); however, for larger objects a significant improvement is possible, as has also been reported by Saxton and Frank (1977) and by Van Heel (1982). This can be understood from the fact that the Rose criterion is the visibility of the test spot, which is based on the contrast integrated over image elements with the size of the test spot. In the case of extended objects, detection methods based on the statistics of the individual pixels use more information from the recorded image and therefore in principle are more appropriate for the treatment of lower dose values.
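The numbers in Rose's example, and the dC trade-off of Eq. (211), can be checked with a few lines of arithmetic; the image area in Eq. (211) is taken as unity, and the doses are in the units of the example:

```python
import math

# Rose's single-pixel example around Eq. (209).
N, C, k = 100, 1e-2, 5

n_T = N**2 * k**2 / C**2           # Eq. (209): total electrons needed
lam0 = n_T / N**2                  # background mean, 10**4 * k**2 per pixel
lam = lam0 * (1.0 - C)             # mean count in the object pixel, 9900*k**2
sigma = math.sqrt(lam0)            # Poisson standard deviation, 100*k
separation = (lam0 - lam) / sigma  # peak separation in standard deviations

# Eq. (211): for a fixed dose n_T, the just-visible diameter d scales as
# 1/C, so the product d*C along the visibility boundary is constant.
def min_diameter(A, n_T, C):
    return math.sqrt(A * k**2 / n_T) / C

d1 = min_diameter(1.0, 2.5e9, 0.02)
d2 = min_diameter(1.0, 2.5e9, 0.04)  # doubled contrast halves the diameter
```

The peak separation comes out as exactly k standard deviations, which is the content of the discussion around Fig. 14.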
Applying statistical hypothesis testing to the detection of the presence of an object, one has to decide between two possibilities: either there is no object, the null hypothesis (H_0: λ_{k,l} = constant = λ_nTN⁻²), or there is an object present, the alternative hypothesis (H_1: λ_{k,l} arbitrary, but not all equal). If λ_nTN⁻² is known, then H_0 is a simple hypothesis because it is completely specified; H_1, however, is a composite hypothesis (the object is not specified). The likelihood-ratio test statistic as discussed in the previous section does not apply to this situation, because the probability of the alternative hypothesis cannot be specified. This complication is overcome by using instead the generalized or maximum-likelihood ratio as test statistic, which is defined as the ratio of the maximum-likelihood values of the two hypotheses. For both hypotheses the likelihood is maximized by variation over the pertinent hypotheses; cf. maximum-likelihood estimation. When H_0 is true, L(n̂, Ĥ_0) is maximal when λ_{k,l} is estimated by n̄ = N⁻² Σ_k Σ_l n̂_{k,l}. If H_1 is true, we have λ̂_{k,l} = n̂_{k,l}; in the case of one observation the best estimated value is the observation itself. The generalized likelihood ratio q is
q(n̂) = L(n̂, Ĥ_0)/L(n̂, Ĥ_1) = exp(N² n̄ log n̄ − Σ_k Σ_l n̂_{k,l} log n̂_{k,l})    (212)
from which it follows that 0 ≤ q ≤ 1. It can be shown (e.g., Kendall and Stuart, 1967, p. 233) that when H_0 is true, −2 log q is distributed for N² → ∞ as χ²(N² − 1). For image data we may well expect to be in the asymptotic regime, so that the probability distribution of −2 log q equals the chi-square distribution with N² − 1 degrees of freedom. When one has chosen the level of significance α, usually of the order of 1%, the threshold value c for the decision of acceptance or rejection of H_0 can be obtained from tables of the χ² distribution (see Fig. 16). The decision threshold
FIG. 16. The chi-square distribution with N² − 1 degrees of freedom of r = −2 log q. The threshold value c is chosen such that the shaded area equals 1 − α.
c with which r = −2 log q is to be compared is chosen such that

1 − α = ∫_0^c p_{χ²}{r, N² − 1} dr    (213)

However, for larger values of N² − 1 the chi-square distribution is approximated very well by the Gaussian distribution. Defining the value z as follows

z(q) = (2N² − 2)⁻¹ᐟ²(1 − N² − 2 log q)    (214)

we have obtained a variable which is distributed standard normal N(0,1). The α levels with the corresponding threshold values c are presented below.
α level              0.1      0.05     0.01
Threshold value c    1.282    1.645    2.326
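A sketch of the complete detection test of Eqs. (212)-(214) on a simulated image containing a hypothetical absorbing object; the thresholds of the table above are the corresponding standard-normal quantiles:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# H0: constant intensity; the simulated image contains an illustrative
# absorbing object, so H0 should be rejected. All numbers are hypothetical.
N = 64
lam = np.full((N, N), 6.0)
lam[24:40, 24:40] = 2.0            # object region receives fewer electrons
counts = rng.poisson(lam).astype(float)

nbar = counts.mean()
# n log n with the convention 0 * log(0) = 0
n_log_n = counts * np.log(np.where(counts > 0.0, counts, 1.0))
log_q = N**2 * nbar * np.log(nbar) - n_log_n.sum()   # Eq. (212)

# Eq. (214): under H0, z is approximately standard normal
z = (1.0 - N**2 - 2.0 * log_q) / np.sqrt(2.0 * N**2 - 2.0)

# The tabulated thresholds are standard-normal quantiles c = Phi^-1(1-alpha)
c_001 = norm.ppf(0.99)             # about 2.326
object_detected = z > c_001        # rejection of H0 at the 1% level
```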
If the value z exceeds a threshold value c, the H_0 hypothesis is rejected at the corresponding significance level α. We will follow here the conventional terminology that results significant at the α level of 1% are highly significant, results significant at the 5% level are probably significant, and results significant at levels larger than 5% are not significant. Because H_1 is the set of all possible alternatives to H_0, nothing can be said about the probability of a type-II error β. Applying the chi-square test for the detection of the presence of an object, the test statistic is the following [cf. Eq. (207)]

T_{χ²}(N² − 1) = Σ_k Σ_l n̄⁻¹(n̂_{k,l} − n̄)²    (215)

where n̄ = N⁻² Σ_k Σ_l n̂_{k,l}. Because of the asymptotic properties of the χ² distribution, which are already discussed in connection with Eqs. (213) and (214), the value z defined as

z(T) = (2N² − 2)⁻¹ᐟ²[T_{χ²}(N² − 1) − N² + 1]    (216)
is distributed standard normal N(0,1). Therefore, the threshold values c of the above table also apply to the chi-square test. With the third test statistic, i.e., Student's t test of Eq. (208), the consistency of the observed mean value is measured against the expected value μ, which corresponds to λ_nTN⁻². The value of μ is to be determined from previous exposures under the same conditions, e.g., by dose measurements using a Faraday cup. The t test requires that the random variables to be tested are
IMAGE HANDLING IN ELECTRON MICROSCOPY
287
(approximately) normally distributed. We therefore first apply the square-root transformation to the Poisson-distributed image data

ỹ_{k,l} = (n̂_{k,l} + 3/8)¹ᐟ²    (217)

The obtained ỹ_{k,l} values are in good approximation normally distributed N(μ_y, ¼), where μ_y = (μ + 3/8)¹ᐟ². The square-root transformation in Eq. (217) is discussed in Appendix B. The test statistic t

t = s⁻¹(ȳ − μ_y)N    (218)

where ȳ = N⁻² Σ_k Σ_l ỹ_{k,l} and s² = (N² − 1)⁻¹ Σ_k Σ_l (ỹ_{k,l} − ȳ)², has the Student's t distribution with N² − 1 degrees of freedom. For large values of N² − 1, t is distributed standard normal N(0,1). It is to be expected that when an object is present the number of electrons arriving in the image will be reduced, e.g., by scattering contrast or because of the energy filter lens which removes inelastically scattered electrons. Therefore, only significance levels less than the measured mean, corresponding to the lower tail of N(0,1), have to be tested. The threshold values c of the above table apply here, however with opposite sign. The three test statistics which are compared in this section are first applied to the detection of the presence of objects in simulated images of four model objects. The simulated images are arrays of 64 × 64 pixels. The objects are represented by their object wave function and consist of two amplitude objects and two phase objects:
phase:$&,,
yo)
=
exp(--0.05),
inside a circle with a diameter of 8 and 16 pixels, respectively;
=
1,
otherwise.
= exp(i/4),
=
inside a circle with a diameter of 8 and 16 pixels, respectively; otherwise.
1.
The image wave functions ψ(·,·) are calculated according to Eqs. (3)-(5) for the following setting of the microscope parameters: D = 180 nm, C_s = 1.6 mm, λ = 4 pm, and E = 5 × … From this parameter setting it follows that a resolution cell in the image, which corresponds in this simulation to a picture element, is equal to (4 × 4) Ų. The contrast calculated from the image wave function ψ(·,·) is the parameter λ_{k,l} of the image Poisson process. By means of random number generation a realization of the low-dose image is obtained. The results of the three test statistics, the generalized likelihood ratio, chi-square, and Student's t test, respectively given in Eqs. (214), (216), and (218), are summarized in Table V.
TABLE V

SUMMARY OF THE DETECTION OF THE PRESENCE OF THE MODEL OBJECTS IN THE SIMULATED IMAGES WITH THE THREE TEST STATISTICS FOR INCREASING ELECTRON DOSE^a

For each object diameter (8 and 16 pixels) the rows are the doses λ_sTN^{-2} = 12, 16, 32, 48, and 64 e⁻/pixel [λ_s = 75 (0.75), 100 (1), 200 (2), 300 (3), and 400 (4) e⁻/nm² (e⁻/Å²)]; the columns give the outcome of the likelihood-ratio, chi-square, and Student's t tests for the amplitude object and for the phase object. The entries show the t test reaching H₁ already at the lowest doses for the amplitude objects while remaining at H₀ for the phase objects at all doses; for the phase objects the likelihood-ratio test reaches significance at lower doses than the chi-square test.

^a H₀ means that the hypothesis H₀ is not rejected, the test statistic is not significant. H₁ indicates that the test statistic is highly significant, H₀ is rejected, while H₀,₁ indicates that the test statistic is probably significant.
IMAGE HANDLING IN ELECTRON MICROSCOPY
From the simulated experiments summarized in Table V we observe that the Student's t test, which has a very sharp response to amplitude contrast, is not sensitive to phase contrast at all. This is not surprising, as the t test statistic measures the deviation between the total number of expected electrons in the image and the acquired number of detected electrons in the image. In the case of amplitude contrast, the number of electrons arriving at the image detector plane is reduced compared with the case that there is no object present. In the case of phase contrast this difference is negligible in comparison with the statistical fluctuation in the total number of electrons that is involved in the image formation. For the detection of phase contrast we observe from Table V that the likelihood-ratio test is more sensitive than the chi-square test.

In a small experiment in which the presence of phase contrast in low-dose images is tested, we used the likelihood ratio as test statistic. The three low-dose electron micrographs which are the input data for this experiment are courtesy of Dr. E. J. Boekema and Dr. W. Keegstra of the Biochemisch Laboratorium, Rijksuniversiteit Groningen. The imaged specimens presented in Fig. 17 are an image of a carbon support foil, a carbon foil with a small amount of uranyl acetate, which is used as staining material, and an image of a NADH:Q oxidoreductase crystal (slightly negatively stained with uranyl acetate) from bovine heart mitochondria (see Boekema et al., 1982). The crystal structure of the last image is visualized in Fig. 17d, which is obtained by Fourier peak filtration (Unwin and Henderson, 1975). The images are small sections of 128 × 128 pixels of low-dose CTEM electron micrographs, obtained with an electron dose between 5 and 7 e⁻/Å². The micrographs have been scanned by a microdensitometer with a sampling grid of 25 μm. Since the magnification is 46,600, the size of a pixel corresponds to 5.3 Å.
The low-dose images are recorded on a photographic plate, which is not the ideal device for electron detection. Moreover, due to the scanning by the microdensitometer, we cannot expect the recorded image to have the Poisson statistics of Section I,G. The test statistics developed in this section are based on the Poisson statistics of a recorded image as derived for ideal electron-detection conditions in Section I,G. The following crude approach has been chosen to correct the statistics of the image to the Poisson regime. From the carbon foil image the mean and variance are calculated. Hence the image is scaled in such a way that an equal mean and variance value are obtained, which numerically corresponds to a dose of 6 e⁻/Å². Exactly the same scaling is applied to the two other images. The scaled images serve as input for the estimation experiment. The likelihood-ratio test statistic detects phase contrast, and thus the presence of an object, at the 5% level in the image of the carbon foil with uranyl acetate. In the image of the NADH dehydrogenase crystal, phase contrast is detected at the 1% level. Object detection by means of
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
FIG. 17. Low-dose CTEM images used for an object detection experiment. The imaged specimens are consecutively (a) carbon foil, (b) carbon foil with a small amount of uranyl acetate, and (c) NADH:Q oxidoreductase crystal. (d) is obtained from (c) by means of Fourier peak filtration.
hypothesis testing exceeds the capability of the human eye, which is very useful in the noisy images of low-dose electron microscopy.
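The crude variance-matching rescaling described above can be sketched as follows; the detector gain of 2 and the dose are illustrative stand-ins, and the simple multiplicative form is an assumption (the text only states that mean and variance are matched):

```python
import numpy as np

def rescale_to_poisson(img, reference):
    """Scale so that a featureless reference area gets mean == variance,
    the defining property of Poisson counts; the same factor is then
    applied to every image of the series."""
    m, v = reference.mean(), reference.var()
    a = m / v                   # after scaling, mean = variance = m**2 / v
    return a * img, m * m / v   # scaled image and its effective dose per pixel

rng = np.random.default_rng(1)
# toy "micrograph": Poisson counts seen through a detector gain of 2
foil = 2.0 * rng.poisson(3.0, size=(128, 128))
scaled, lam = rescale_to_poisson(foil, foil)
```

After scaling, `scaled.mean()` and `scaled.var()` agree (both equal `lam`), so the Poisson-based test statistics of this section can be applied to the corrected image.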
C. Position Detection of Marker Atoms

Another key problem in the evaluation of low-dose images is the detection of the position of single heavy atoms. These atoms are used as markers in the analysis of images of identical macromolecules with a random orientation
with respect to each other. If the marker positions are known, the images can be aligned and integrated, which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981). The imaging of single atoms in electron microscopy has been studied by many authors (see, e.g., Proceedings of the 47th Nobel Symposium, Direct Imaging of Atoms in Crystals and Molecules, Chemica Scripta 14 (1978-1979)). Calculations of the image contrast due to a single atom have been reported (e.g., by Scherzer, 1949; Niehrs, 1969, 1970; Reimer and Gilde, 1973; Iijima, 1977). Experimental observations showing evidence of imaged single atoms are, among others, in Dorignac and Jouffrey (1980), Kirkland and Siegel (1981), and Isaacson et al. (1974). For the construction of hypotheses about the location of the marker, the theoretical image contrast of one isolated heavy atom is required. However, the calculation of the theoretical image intensity distribution as a function of the lateral position of the atom in the object plane is a complicated task. Our purpose here is to discuss the detection capability of marker atoms under low-dose illumination. In order to bring out the essentials, we will simplify this calculation considerably by neglecting inelastic scattering phenomena completely. Furthermore, we represent the electrostatic potential of the pertinent atom at the position (x_a, y_a, z_a) in the specimen, which is situated just before the object plane z = z_o, by the Wentzel potential (Lenz, 1954)

V(r_a) = [Ze/(4πε₀ r_a)] exp(−r_a/R)    (219)

where r_a = [(x_o − x_a)² + (y_o − y_a)² + (z_o − z_a)²]^{1/2}. The "radius" of the atom R is given by

R = a_H Z^{-1/3}    (220)

with a_H the hydrogenic Bohr radius corresponding to 0.529 Å and Z the charge of the atomic nucleus. According to Eq. (1), plane-wave illumination exp(ikz) incident parallel to the optical axis results in the object wave function (z_o = 0)
The integral in Eq. (221) can be evaluated analytically if we extend the upper bound of the integration interval to infinity. This is allowed because of the rapid exponential decay of the potential in Eq. (219). With a change of variables we obtain (using Gradshteyn and Ryzhik, 1980, f.3.387.6, p. 322)
where K₀(·) is the modified Bessel function of the second kind and of zero order. The constant a is given by
In the weak-object approximation* we obtain the object wave function

ψ_o(x_o, y_o; x_a, y_a) ≃ 1 − 2ia K₀(R^{-1}[(x_o − x_a)² + (y_o − y_a)²]^{1/2})    (224)

If the integration is extended over the whole object plane, we obtain from Eqs. (3) and (4), introducing the new variables x'_o = x_o − x_a and y'_o = y_o − y_a, for the wave function in the exit pupil

ψ_e(ξ, η) = exp[−iγ(ξ, η)] exp[−2πi(ξx_a + ηy_a)] ∫∫ dx'_o dy'_o [1 − 2ia K₀(R^{-1}(x'_o² + y'_o²)^{1/2})] exp[−2πi(ξx'_o + ηy'_o)]    (225)

The structure of the integrand suggests the use of polar coordinates x'_o = r'_o cos φ'_o, y'_o = r'_o sin φ'_o, ξ = r_p cos φ_p, and η = r_p sin φ_p:

ψ_e(r_p, φ_p) = exp[−iγ(r_p) − 2πi r_p(x_a cos φ_p + y_a sin φ_p)] ∫₀^∞ r'_o dr'_o ∫₀^{2π} dφ'_o [1 − 2ia K₀(R^{-1} r'_o)] exp[−2πi r'_o r_p cos(φ'_o − φ_p)]    (226)

Using

∫₀^∞ dr'_o r'_o K₀(R^{-1} r'_o) J₀(2π r_p r'_o) = [R^{-2} + (2π r_p)²]^{-1}    (227)

we obtain the following expression for the image wave function, where we introduce the new coordinates x' = x − x_a, y' = y − y_a, x' = r' cos φ', y' = r' sin φ', and with the coordinate center (x_a, y_a) in the image plane

(228)

In the derivation of Eq. (228) we assume a circular exit pupil with radius ε.

* In the immediate neighborhood of (x_a, y_a) the weak-object approximation is not valid. However, this corresponds to scattering electrons passing the atomic nucleus at a very small distance. These scattering events occur with a very low probability.
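The Hankel-transform identity of Eq. (227) can be checked numerically; the values of R and r_p below are arbitrary:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j0, k0

R, r_p = 0.2, 1.0   # screening radius and pupil coordinate, arbitrary units
# integral_0^inf dr r K0(r/R) J0(2 pi r_p r); K0 decays like exp(-r/R),
# so truncating the upper limit at 10 >> R is harmless
lhs, _ = quad(lambda r: r * k0(r / R) * j0(2.0 * np.pi * r_p * r),
              0.0, 10.0, limit=200)
rhs = 1.0 / (R ** -2 + (2.0 * np.pi * r_p) ** 2)
```

The quadrature reproduces the closed form [R^{-2} + (2πr_p)²]^{-1} to high accuracy.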
The image intensity distribution is proportional to the squared modulus of the image wave function. In the weak-object approximation we obtain, with the first-order terms only,

ψ(r', φ')ψ*(r', φ') = 1 − 16π²a ∫₀^ε dr_p r_p sin[γ(r_p)] J₀(2πr_p r')[R^{-2} + (2πr_p)²]^{-1}    (229)
It is statistically more convenient to detect regions in the image with a surplus of electrons (peak detection) than it is to detect regions with fewer electrons, especially against a low-intensity background. Therefore it is important in Eq. (229) to set the defocus parameter D [cf. Eq. (5)] to a value which maximizes the contrast under the condition of contrast reversal. At the position of the imaged atom in the image plane, more electrons then arrive than in the background. The stochastic low-dose image is characterized by the parameter of the Poisson process

λ_{k,l} = λ_s T N^{-2} ψ(k(2ε)^{-1} + x_a, l(2ε)^{-1} + y_a) ψ*(k(2ε)^{-1} + x_a, l(2ε)^{-1} + y_a)    (230)
In a simulation experiment low-dose images are obtained for different dose values by means of random-number generation of Poisson-distributed variables. The images contain 128 × 128 pixels, and the presence of 3 uranium atoms is simulated. The separation between the atoms is taken such that the respective image contrast patterns do not overlap. The detection procedure is as follows. The position parameters x_a and y_a are restricted to multiples of (2ε)^{-1}, corresponding to the pixel dimensions in the image. The hypothesis is now tested for every pixel being the center of the calculated atomic contrast pattern, Eq. (230). For each pixel (r, s) the likelihood L(n̂, λ(r/2ε, s/2ε)) [cf. Eq. (205)] is calculated. The obtained likelihood values which are beyond a threshold value c correspond to detected marker positions. The choice of the threshold value c determines the error probabilities α and β, as has been discussed at length in the previous two subsections, to which we refer. The simulated images are generated with the following values for the microscope parameters: N = 128, C_s = 1.6 mm, D = 144 nm (contrast reversal), λ = 4 pm, and ε = 10^{-2}. In the simulations, which are summarized in Table VI, the presence of an object is also tested by means of the three test statistics discussed in the previous subsection. In this simulation we have simply taken the three highest likelihood values to correspond with the three atomic positions. In practical situations this extra a priori knowledge is not always available.
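The per-pixel likelihood scan can be sketched as follows. This is not the authors' code: the true atomic contrast pattern of Eq. (230) is replaced here by an assumed Gaussian surplus, and the marker positions, dose, and template width are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, lam0 = 128, 5.0                              # pixels and background dose (e-/pixel)
yy, xx = np.mgrid[-3:4, -3:4]
bump = 2.0 * np.exp(-(xx ** 2 + yy ** 2) / 4.0)  # assumed contrast pattern (surplus)

lam = np.full((N, N), lam0)
atoms = [(20, 30), (64, 64), (100, 90)]          # true marker positions (known here)
for r, c in atoms:
    lam[r - 3:r + 4, c - 3:c + 4] *= 1.0 + bump  # contrast reversal: extra electrons
counts = rng.poisson(lam)

# Poisson log-likelihood ratio of "atom centred at (r, s)" against background:
#   log q(r, s) = sum_k n_k log(1 + bump_k) - lam0 * sum_k bump_k
w = np.log(1.0 + bump)
score = -lam0 * bump.sum() * np.ones((N, N))
for dr in range(-3, 4):
    for dc in range(-3, 4):
        score += w[dr + 3, dc + 3] * np.roll(counts, (-dr, -dc), axis=(0, 1))

detected = []                                    # the three highest, separated peaks
s = score.copy()
for _ in range(3):
    r, c = np.unravel_index(np.argmax(s), s.shape)
    detected.append((r, c))
    s[max(r - 6, 0):r + 7, max(c - 6, 0):c + 7] = -np.inf
```

Taking the three highest well-separated scores mirrors the a priori knowledge of the number of markers used in the simulation described above.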
TABLE VI

SUMMARY OF THE DETECTION OF THE PRESENCE OF AN OBJECT WITH THREE TEST STATISTICS^a

Object detection: 3 uranium atoms. The rows are the doses λ_sTN^{-2} = 8, 12, 16, 20, and 52 e⁻/pixel [λ_s = 200 (2), 300 (3), 400 (4), 500 (5), and 1300 (13) e⁻/nm² (e⁻/Å²)]; the columns give the outcome of the likelihood-ratio, chi-square, and Student's t tests, and the position detection (the number of atoms at the correct position).

^a For an explanation of the symbols H₀, H₀,₁, and H₁ see the legend of Table V. The position detection states the number of marker locations which have been detected correctly.
D. Statistical Significance of Image Processing
The technique of statistical hypothesis testing is also applicable to the measurement of the statistical significance of the effect of image-enhancement algorithms on low-dose electron micrographs. When applying image-processing techniques, great care should be exercised in order to avoid the generation of artefacts and hence false conclusions about the imaged specimen. As an example we will measure the statistical significance of a popular noise-reducing ("filter") algorithm for periodic objects, i.e., the Fourier peak filtration technique of Unwin and Henderson (1975). In Fig. 17c the low-dose micrograph of a periodic object is presented, and Fig. 17d is obtained by Fourier peak filtration from the image in Fig. 17c. (For more details about these images see the end of Section V,B.) We now wish to determine quantitatively whether the filtered image is consistent with the data. Therefore two hypotheses H₀ and H₁ are constructed. The null hypothesis H₀ is the outcome of the computer processing of the filtered image (cf. Fig. 17d). The H₁ hypothesis is the set of all possible alternatives to H₀. In the latter case λ_{k,l} is estimated by λ̂_{k,l} = n̂_{k,l}. Let p = {p_{k,l}}, (k, l) ∈ {1, ..., N}, denote the processed image. We now obtain the likelihood ratio

q = ∏_{k,l} (p_{k,l}/n̂_{k,l})^{n̂_{k,l}} exp(n̂_{k,l} − p_{k,l})    (231)
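Written as −2 log q, this statistic is the Poisson deviance between the counts and the processed image. A sketch (not the authors' code): the periodic "truth" standing in for the specimen, the peak shift used as a mis-placed filter, and all numbers are illustrative assumptions.

```python
import numpy as np

def minus_two_log_q(counts, processed):
    """-2 log q of Eq. (231): H0 'the intensity is the processed image p'
    against the unrestricted alternative lambda_hat = counts."""
    n = counts.astype(float)
    safe_n = np.where(n > 0.0, n, 1.0)        # n log n -> 0 for n = 0
    return 2.0 * np.sum(n * np.log(safe_n / processed) - (n - processed))

rng = np.random.default_rng(3)
truth = 6.0 + 2.0 * np.sin(np.arange(128) / 5.0) * np.ones((128, 1))
counts = rng.poisson(truth)

d_good = minus_two_log_q(counts, truth)                     # correctly placed peaks
d_bad = minus_two_log_q(counts, np.roll(truth, 8, axis=1))  # peaks shifted by 8 pixels
dof = counts.size
# H0 is rejected at the 5% level when -2 log q exceeds the chi-square
# 95% quantile with dof degrees of freedom
```

The shifted filter inflates the deviance far beyond its value for the consistent image, which is the behaviour the significance test below exploits.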
For the images in Figs. 17c and d we conclude, by calculating the test statistic −2 log q, the rejection of the null hypothesis at the 5% level of significance; i.e., H₁ is probably significant. For a noise-reducing algorithm this result is rather poor. This procedure is also applied to a filtered image which is calculated from the image in Fig. 17c, however with a small shift in the positions of the filter peaks. With this image the H₀ hypothesis is rejected at the 1% level of significance, which means that the H₁ hypothesis is highly significant.

E. Discussion and Conclusions
This section deals with the handling in an optimal way of prior information available about the imaged specimen in low-dose electron microscopy. The prior information is quantified and transformed into a set of different hypotheses. Hence the statistical significance of each alternative hypothesis is measured against the recorded data. In the application of statistical hypothesis testing to low-dose electron microscopy, it is difficult to make general statements about performance and attainable accuracy, likewise
for the case of parameter estimation. One of the problems analyzed in this section is the basic problem of the detection of an object in a region of interest in the low-dose micrograph. Once an object has been detected, techniques from the field of image processing can be applied for further information extraction. In a simulation experiment Student's t test appears to be very effective for the detection of amplitude contrast. This could be expected, as the t statistic calculates the deviation from the mean number of electrons per pixel. Amplitude objects "absorb" electrons, leading to fewer electrons in the image, a situation which can be detected by the t statistic. For phase objects the mean contrast over the image does not change. No electrons are captured by diaphragms at high resolutions (no scattering contrast), and therefore the t statistic shows no significance at all. Also from the simulation experiment we see that the likelihood-ratio test statistic is more powerful for phase-contrast detection than the commonly used χ² statistic. The detection of the positions of heavy-atom markers (Section V,C) turned out to be quite successful even for modest electron-dose values. The detection method will benefit from a priori knowledge about the total number of marker positions to be detected. Then simply the highest likelihood values can be taken to correspond with the locations to be determined, as in the simulation experiment. An earlier attempt failed to obtain this information from the inelastically scattered electrons which are missing from the image (energy filter lens), together with knowledge of the total cross section of the atom for inelastic scattering. This failure is due to the fact that for increasing Z the cross section for inelastic scattering decreases. This implies that the inelastic signal from a U atom gets submerged in the inelastic signal from its surroundings, which consist mainly of organic material.
The technique of statistical hypothesis testing can also be applied to measure the statistical significance of image processing on the low-dose electron micrograph. The example discussed in Section V,D, which concerned the application of the Fourier peak filtration method for periodic objects of Unwin and Henderson (1975), is only included for illustrative purposes. It is not to be considered as an analysis of the statistical relevance of the Unwin and Henderson technique. For periodic objects the cross-correlating techniques of Saxton and Frank (1977) are believed to be more powerful, as these are able to accommodate lattice deformations. Another problem which can be handled with hypothesis testing is the problem of measuring radiation damage. Again, for periodic objects a measure of the radiation damage exists in the broadening of the diffraction spots for increasing electron dose. For aperiodic specimens a series of images can be made, of which the differences are quantified by means of statistical hypothesis testing. When the images of the series start to differ significantly, this is due to the radiation damage.
APPENDIX A: THE STATISTICAL PROPERTIES OF THE FOURIER TRANSFORM OF THE LOW-DOSE IMAGE
In this appendix we derive the stochastic properties of the complex stochastic process ĉ(ξ) defined in Eq. (27),

ĉ(ξ) = Σ_{k=-N/2}^{N/2-1} n̂_k exp[2πi(4ε)^{-1}kξ]    (232)

which are required in Section II. The measurements (counted electrons) n̂_k are Poisson-distributed random variables with intensity λ_k. The autocorrelation function R(·,·) of the complex stochastic process defined in Eq. (232) is given by

R(ξ₁, ξ₂) = E{ĉ(ξ₁)ĉ*(ξ₂)}    (233)
Using the statistical independence of the n̂_k's (cf. Section I,G), Eq. (233) can be rewritten as

R(ξ₁, ξ₂) = Σ_{k≠k'} E{n̂_k}E{n̂_{k'}} exp[2πi(4ε)^{-1}(kξ₁ − k'ξ₂)] + Σ_k E{n̂_k²} exp[2πi(4ε)^{-1}k(ξ₁ − ξ₂)]
= Σ_{k,k'} E{n̂_k}E{n̂_{k'}} exp[2πi(4ε)^{-1}(kξ₁ − k'ξ₂)] + Σ_k [E{n̂_k²} − E²{n̂_k}] exp[2πi(4ε)^{-1}k(ξ₁ − ξ₂)]
= E{ĉ(ξ₁)}E{ĉ*(ξ₂)} + Σ_k λ_k exp[2πi(4ε)^{-1}k(ξ₁ − ξ₂)]    (234)

where the property of Poisson-distributed variables has been used that

E{n̂_k²} = λ_k² + λ_k    (235)
In practice ĉ(·) will be calculated by means of a fast Fourier transform (FFT) algorithm. This results in the discrete values

ξ_l = (2ε)^{-1}l,    l = {−N/2, ..., N/2 − 1}
298
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
for the continuous variable ξ. The autocorrelation in Eq. (234) becomes

R(ξ_m, ξ_n) = E{ĉ_m}E{ĉ*_n} + Σ_k λ_k exp[2πiN^{-1}k(m − n)]    (236)

where ĉ_l denotes ĉ(l/2ε). We now determine the probability distribution of the complex random variable ĉ_l. It is convenient to examine the characteristic function Φ_{ĉ_l}(·) of ĉ_l, because it completely determines the probability distribution of ĉ_l and is more easily obtained. The characteristic function of the discrete random variable n̂_k is defined as

Φ_{n̂_k}(ω) = E{exp(iωn̂_k)} = Σ_{l=0}^∞ exp(iωl) exp(−λ_k) λ_k^l/l! = exp{λ_k[exp(iω) − 1]}    (237)
From Eq. (232) we have that

Φ_{ĉ_l}(ω) = E{exp[iω Σ_k n̂_k exp(2πiN^{-1}kl)]}    (238)

Straightforward computation using the fact that the n̂_k variables are statistically independent (cf. Section I,G) yields

Φ_{ĉ_l}(ω) = exp{Σ_k λ_k (exp[iω exp(2πiN^{-1}kl)] − 1)}    (239)
Inverse Fourier transformation of Eq. (239) results in the probability distribution of ĉ_l. The moments m_n of ĉ_l, which are defined as

E{ĉ_l^n} = m_n    (240)

are related to the derivatives of the characteristic function Φ_{ĉ_l}(·):

m_n = (−i)^n [d^n Φ_{ĉ_l}(ω)/dω^n]_{ω=0}    (241)

The first two moments of the distribution function of ĉ_l are most easily obtained from the logarithm of Eq. (241) [see, for example, Papoulis (1964), p. 158]:

[d log Φ_{ĉ_l}(ω)/dω]_{ω=0} = iE{ĉ_l},    [d² log Φ_{ĉ_l}(ω)/dω²]_{ω=0} = −σ²(ĉ_l)    (242)
We obtain for the expectation value E{ĉ_l}

E{ĉ_l} = Σ_k λ_k exp(2πiN^{-1}kl)    (243)
which is consistent with applying the expectation operator to the right-hand side of Eq. (232). To obtain the variance we proceed by defining the real functions â_l and b̂_l to be the real and imaginary parts of ĉ_l

ĉ_l = â_l + ib̂_l    (244)

According to Eq. (242), we have

E{â_l} = a_l,    E{b̂_l} = b_l    (245)
Straightforward computation of the variances σ²(â_l) and σ²(b̂_l) results in

σ²(â_l) = Σ_k λ_k cos²(2πN^{-1}kl),    σ²(b̂_l) = Σ_k λ_k sin²(2πN^{-1}kl)    (246)

From the definition of the variance of a complex random variable, we obtain for the variance of ĉ_l

σ²(ĉ_l) = E{|ĉ_l − E{ĉ_l}|²} = Σ_k λ_k    (247)
From Eq. (247) we conclude that the variance of ?, is equal to a constant value independent of 1.
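Equation (247) is easy to verify by simulation; the intensity profile λ_k below is arbitrary, and the FFT sign convention is chosen to match Eq. (232):

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 32, 20_000
lam = 3.0 + 2.0 * rng.random(N)            # arbitrary intensities lambda_k
n = rng.poisson(lam, size=(trials, N))     # repeated low-dose recordings
c = N * np.fft.ifft(n, axis=1)             # c_l = sum_k n_k exp(+2 pi i k l / N)
var_c = np.mean(np.abs(c - c.mean(axis=0)) ** 2, axis=0)
# Eq. (247): the variance equals sum_k lambda_k, the same for every l
```

The estimated variances agree with Σ_k λ_k for every frequency index l, illustrating that the noise level of the transform is flat even though λ_k is not.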
APPENDIX B: THE STATISTICAL PROPERTIES OF AN AUXILIARY VARIABLE

This appendix is devoted to the stochastic properties of an auxiliary random variable, together with its Fourier transform, which is used in Section III. The variable ŝ_{k,l} is defined in Section III in two places in two different ways. According to Eq. (86), ŝ_{k,l} is given by

ŝ_{k,l} = N²(λ_sT)^{-1}(n̂_{k,l} − λ_sTN^{-2})    (248)

while Eq. (130) reads as

ŝ_{k,l} = (λ_sTN^{-2})^{-1/2}[n̂_{k,l}^{1/2} − (λ_sTN^{-2})^{1/2}]    (249)

Apparently the two definitions in Eqs. (248) and (249) are quite different. They originate from

E{n̂_{k,l}} = λ_sTN^{-2}(1 + s_{k,l})    (250)
and

E{n̂_{k,l}} = λ_sTN^{-2}(1 + s_{k,l})²    (251)

respectively. Since s_{k,l} ≪ 1, it will appear that the two definitions give rise to the same probability distribution. The only difference between the two definitions is a factor of four between the variances.

A. Part 1

In the first part of Section III, we are interested in the statistical properties of the Fourier transform of ŝ_{k,l} according to Eq. (248). The Fourier transform of the variable ŝ_{k,l}, (k, l) = {−N/2, ..., N/2 − 1}, is a complex stochastic process ĉ(ξ, η), defined in Eq. (88)

ĉ(ξ, η) = Σ_k Σ_l ŝ_{k,l} exp[2πi(2ε)^{-1}(kξ + lη)]    (252)

We abbreviate the factor λ_sTN^{-2} by λ₀. The measurements (counted electrons) n̂_{k,l} are Poisson-distributed random variables with parameter λ_{k,l} = λ₀(1 + s_{k,l}). The autocorrelation function R(·,·) of the complex stochastic process [Eq. (252)]
( k ,I ) # (k', I ' )
(254)
In the second term on the right-hand side of Eq. (254), the variance of ŝ_{k,l} appears. For this variance we obtain, using the property of the Poisson-distributed variables n̂_{k,l} that E{n̂²_{k,l}} = λ²_{k,l} + λ_{k,l},

σ²(ŝ_{k,l}) = E{ŝ²_{k,l}} − E²{ŝ_{k,l}} = E{λ₀^{-2}(n̂_{k,l} − λ₀)²} − s²_{k,l}
= λ₀^{-2}[E{n̂²_{k,l}} − 2E{n̂_{k,l}}λ₀ + λ₀²] − s²_{k,l}
= λ₀^{-1}(1 + s_{k,l})    (255)

Since s_{k,l} ≪ 1, we have to a good approximation that

σ²(ŝ_{k,l}) ≃ λ₀^{-1}    (256)

With Eq. (256), we obtain for the autocorrelation function of Eq. (254)

R(ξ₁, ξ₂, η₁, η₂) = E{ĉ(ξ₁, η₁)}E{ĉ*(ξ₂, η₂)} + λ₀^{-1} Σ_k Σ_l exp{2πi(2ε)^{-1}[k(ξ₁ − ξ₂) + l(η₁ − η₂)]}    (257)
Choosing sampling points (,,, = (2d)-'m and qr = (2d)-'r, such that t1- t2 and q 1 - q z are multiples of (2d)-', we find for the autocorrelation function Nt,, m t q r , V S ) that
+
R ( t m , t n , q r > y / s ) = E(c^((5,,~r)}E{c^*(5,,~s)}A i 1 N 2 d m , n d r , s
(259)
For this choice of discrete values C(t,,,,q,) and c^((5,,qs), (m,n)# (r,s) are uncorrelated. This situation will occur when we apply Whittaker-Shannon sampling in the exit pupil. This will always be the case in practice, where E(-, -) is calculated by means of a fast Fourier transform algorithm. In the following i s treated for a continuum of 5 and q values, but the discrete analysis, t(.,-) representation is kept in mind when necessary. We examine the characteristic function Q5(w) of C((,q), since Q5(w) completely determines the probability distribution of i?(t,q). From Eq. (252) we have that
Φ_ĉ(ω) = E{exp[iω Σ_k Σ_l ŝ_{k,l} exp[2πi(2ε)^{-1}(kξ + lη)]]}    (260)
Substituting Eq. (248) leads to

Φ_ĉ(ω) = E{exp[iωλ₀^{-1} Σ_k Σ_l n̂_{k,l} exp[2πi(2ε)^{-1}(kξ + lη)] − iω Σ_k Σ_l exp[2πi(2ε)^{-1}(kξ + lη)]]}    (261)

Straightforward computation, making use of the characteristic function of the random variable n̂_{k,l} [Eq. (237)], results in

Φ_ĉ(ω) = exp{−iω Σ_k Σ_l exp[2πi(2ε)^{-1}(kξ + lη)] + Σ_k Σ_l λ_{k,l}(exp{iωλ₀^{-1} exp[2πi(2ε)^{-1}(kξ + lη)]} − 1)}    (262)
Fourier transformation of Eq. (262) yields the probability distribution of ĉ(·,·). We restrict our attention, however, to the first two moments of this distribution, which can be calculated most easily from the logarithm of the characteristic function [cf. Eq. (242)]. We obtain for the expectation value of ĉ(·,·)

E{ĉ(ξ, η)} = Σ_k Σ_l s_{k,l} exp[2πi(2ε)^{-1}(kξ + lη)]    (263)

a relation which has been used in the derivation of Eq. (89). We see that the expectation value of ĉ(·,·) is equal to its true value, which proves that Eq. (248) is an unbiased statistic. For the variance of ĉ(·,·) we obtain

σ²(ĉ(ξ, η)) = λ₀^{-1} Σ_k Σ_l (1 + s_{k,l})    (264)

As the sample values s_{k,l} ≪ 1, we approximate Eq. (264) as

σ²(ĉ(ξ, η)) ≃ λ₀^{-1}N²    (265)

so the variance of ĉ(·,·) is to a good approximation equal to the microscope parameter N⁴(λ_sT)^{-1}. Instead of calculating higher-order moments, we examine Eq. (262). In electron microscopy practice, λ₀ ≫ 1, even for modest resolution requirements, since λ₀ denotes the average number of electrons available for imaging per sampling cell. When λ₀ ≫ 1, the term exp{iωλ₀^{-1} exp[2πi(2ε)^{-1}(kξ + lη)]} of Eq. (262) can be approximated by the first terms of its Taylor expansion

exp{iωλ₀^{-1} exp[2πi(2ε)^{-1}(kξ + lη)]} ≃ 1 + iωλ₀^{-1} exp[2πi(2ε)^{-1}(kξ + lη)] − ½ω²λ₀^{-2} exp[4πi(2ε)^{-1}(kξ + lη)]    (266)
With Eq. (266) we obtain for Φ_ĉ(ω), to a good approximation,

Φ_ĉ(ω) ≃ exp{iω Σ_k Σ_l s_{k,l} exp[2πi(2ε)^{-1}(kξ + lη)] − ½ω²λ₀^{-1} Σ_k Σ_l (1 + s_{k,l}) exp[4πi(2ε)^{-1}(kξ + lη)]}    (267)

We define the real functions â(ξ, η) and b̂(ξ, η) to be the real and imaginary part of ĉ(ξ, η)

ĉ(ξ, η) = â(ξ, η) + ib̂(ξ, η)    (268)

From Eq. (267) the characteristic functions of the real part â(·,·) and the imaginary part b̂(·,·) are easily derived. These functions have the structure of the characteristic function of a Gaussian random function with means a(·,·) and b(·,·) and with variances given by

σ²(â(ξ, η)) ≃ σ²(b̂(ξ, η)) ≃ ½λ₀^{-1}N²    (269)

From this we conclude that, to a good approximation (if we restrict our attention to the two lowest-order moments, mean and variance), ĉ(·,·) is a Gaussian random function with parameters given by Eqs. (263) and (265).
B. Part 2
In the second part of this appendix we examine the probability density function of the auxiliary variable ŝ_{k,l} defined in Eq. (130). In order to simplify the notation we will drop the subscripts (k, l) and abbreviate also here the mean number of arriving electrons per image cell λ_sTN^{-2} by λ₀. From Eq. (251) we have that

E{n̂} = λ₀(1 + s)²    (270)

and s is estimated by

ŝ = λ₀^{-1/2}(n̂^{1/2} − λ₀^{1/2})    (271)

Since n̂ is distributed according to the Poisson distribution we have

P{n̂} = exp(−λ)λ^{n̂}/n̂!    (272)

with

λ = λ₀(1 + s)²    (273)
Applying the transformation Eq. (271), which has Jacobian (2λ₀^{1/2}n̂^{1/2})^{-1}, and using Eq. (272), results in the probability density function of ŝ

p(ŝ) = 2λ₀(1 + ŝ) exp[−λ₀(1 + s)²][λ₀(1 + s)²]^{λ₀(1+ŝ)²}/[λ₀(1 + ŝ)²]!    (274)

With Stirling's approximation to the factorial n!,

n! ≃ (2πn)^{1/2} n^n exp(−n)    (275)

we obtain for Eq. (274)

p(ŝ) ≃ (2π)^{-1/2}(4λ₀)^{1/2} exp{λ₀[(1 + ŝ)² − (1 + s)²] + 2λ₀(1 + ŝ)² log[(1 + s)/(1 + ŝ)]}    (276)

Expanding the logarithm in Eq. (276) into a power series, we obtain to third order in s and ŝ

p(ŝ) ≃ (2π)^{-1/2}(4λ₀)^{1/2} exp{−2λ₀(ŝ − s)² − 2λ₀[⅓(ŝ³ − s³) − sŝ(ŝ − s)]}    (277)

Defining σ² as

σ² = (4λ₀)^{-1}    (278)

Eq. (277) can be written

p(ŝ) ≃ (2πσ²)^{-1/2} exp[−½σ^{-2}(ŝ − s)²] exp{−½σ^{-2}[⅓(ŝ³ − s³) − sŝ(ŝ − s)]}    (279)

The second exponential in Eq. (279) is close to unity because s ≪ 1 (and consequently ŝ ≪ 1), since we restricted the imaging to weak objects; so we have to a good approximation

p(ŝ) ≃ (2πσ²)^{-1/2} exp[−½σ^{-2}(ŝ − s)²]    (280)
From Eq. (280) we observe that the probability density function of ŝ is Gaussian with mean equal to the true value s and with variance (4λ₀)^{-1}. This result is in agreement with the results obtained by Anscombe (1948), used in Slump and Ferwerda (1982). Anscombe applied the square-root transformation

ŷ = (n̂ + c)^{1/2}    (281)

with c an arbitrary positive constant and where n̂ is a Poisson-distributed random variable with intensity parameter λ. Anscombe observed that ŷ is distributed Gaussian, asymptotically in λ; in particular he obtained the following results

E{ŷ} ≃ (λ + c)^{1/2} − (8λ^{1/2})^{-1} + (128λ^{3/2})^{-1}(24c − 7)    (282)
var(ŷ) ≃ ¼[1 + (8λ)^{-1}(3 − 8c) + (32λ²)^{-1}(32c² − 52c + 17)]

Substituting for the arbitrary constant c the value 3/8 results in a nearly constant variance, already for moderate values of λ.
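Anscombe's variance stabilization is easy to confirm by simulation; the dose values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
variances = []
for lam in (4.0, 8.0, 16.0, 32.0):
    n = rng.poisson(lam, size=500_000)
    variances.append(np.sqrt(n + 3.0 / 8.0).var())  # y = (n + 3/8)**0.5, Eq. (281)
# with c = 3/8 the variance stays close to 1/4 over the whole dose range
```

The estimated variances all sit near 1/4, which is what makes the square-root-transformed counts suitable for the Student's t test of Section V,B.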
APPENDIX C: THE CRAMÉR-RAO BOUND

In this appendix we derive the Cramér-Rao bound for the variance of the parameters to be estimated: a = (a_{0,0}, ..., a_{m,n}, ..., a_{N-1,N-1}) in Eq. (127). This Cramér-Rao bound is the minimum value for the variance which is achievable by whatever estimation method from the recorded image n̂. For an intuitively appealing interpretation of the Cramér-Rao inequality see Gardner (1979). We first calculate the amount of information that is contained in the data n̂ about the parameters a. According to Fisher (1950) this amount of information is defined by the following matrix

F(a)_{r,s,p,q} = E{[∂ log L(n̂, a)/∂a_{r,s}][∂ log L(n̂, a)/∂a_{p,q}]}    (283)

where L(n̂, a) is the likelihood function of Eq. (128). Under the assumption that L(n̂, a) is regular enough to allow the interchanging of differentiation and integration, Eq. (283) is equivalent to

F(a)_{r,s,p,q} = E{−∂² log L(n̂, a)/∂a_{r,s}∂a_{p,q}}    (284)

Substitution of Eq. (128) into Eq. (284) results, with Eq. (127), in the following Fisher information matrix

F(a)_{r,s,p,q} = 4λ_sTN^{-2} δ_{r,p} δ_{s,q}    (285)

The Cramér-Rao bound for the variance of an unbiased statistic is equal to the inverse of the Fisher information matrix [Eq. (283)] (see, for example, Kendall and Stuart, 1967; Van der Waerden, 1969; and Van Trees, 1968). From Eq. (285) we observe that the parameters a all have the same Cramér-Rao bound and that they are not interconnected. That means that information about one specific parameter a_{k,l} does not give information about any other parameter.
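The per-pixel information of Eq. (285) can be checked by simulation, under the assumption that the parameterization of Eq. (270), λ = λ₀(1 + a)², underlies Eq. (127): the score at the true value a = 0 is then 2(n̂ − λ₀), whose variance is the Fisher information 4λ₀, and the estimator of Eq. (249) comes close to attaining the Cramér-Rao bound (4λ₀)^{-1}. The dose value is illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
lam0 = 20.0                              # lambda_s T N^-2 per pixel, illustrative
n = rng.poisson(lam0, size=1_000_000)

score = 2.0 * (n - lam0)                 # d/da log P(n | a) at the true a = 0
fisher_mc = score.var()                  # approaches 4 * lam0, cf. Eq. (285)

s_hat = np.sqrt(n / lam0) - 1.0          # the estimator of Eq. (249)
cr_bound = 1.0 / (4.0 * lam0)            # inverse Fisher information
```

The sample variance of `s_hat` sits just above `cr_bound`, consistent with the asymptotic efficiency of the square-root estimator discussed in Appendix B.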
ACKNOWLEDGMENTS

The authors gratefully acknowledge the support of the Netherlands Organization for the Advancement of Pure Research (Z.W.O.). We thank Dr. B. J. Hoenders, Dr. E. J. Boekema, and Dr. W. Keegstra for their help at the time when this research was carried out. The processing of the images in this article has been performed at the Information Theory Group of the Department of Electrical Engineering at Delft University of Technology, for which help Dr. J. J. Gerbrands is gratefully acknowledged. We thank Mrs. E. Boswijk for her patience with typing the many changes in the manuscript, which was originally submitted as a thesis to the Rijksuniversiteit Groningen by one of us (C.H.S.). We thank Mr. O. T. Schutter for skillfully drawing the figures.
REFERENCES

Aczel, J., and Daroczy, Z. (1975). "On Measures of Information and Their Characterization." Academic Press, New York.
Anscombe, F. J. (1948). Biometrika 35, 246-254.
Bertero, M., De Mol, C., and Viano, C. (1980). In "Inverse Scattering Problems in Optics" (H. P. Baltes, ed.). Springer-Verlag, Berlin and New York.
Boekema, E. J., Van Breemen, J. F. L., Keegstra, W., Van Bruggen, E. F. J., and Albracht, S. P. J. (1982). Biochim. Biophys. Acta 679, 7-11.
Box, M. J. (1966). Comput. J. 9, 67-77.
Davenport, W. B., and Root, W. L. (1958). "Random Signals and Noise." McGraw-Hill, New York.
Davidoglou, A. (1901). C. R. Hebd. 133, 784-786, 860-863.
Dorignac, B., and Jouffrey, B. (1980). Proc. Eur. Congr. Electron Microsc. 7th 1, 112.
Eadie, W. T., Drijard, D., James, F. E., Roos, M., and Sadoulet, B. (1971). "Statistical Methods in Experimental Physics." North-Holland Publ., Amsterdam.
Egerton, R. F. (1982). Int. Congr. Electron Microsc., 10th, Hamburg 1, 151-158.
Egerton, R. F., Philip, J. G., Turner, P. S., and Whelan, M. J. (1975). J. Phys. E 8, 1033-1037.
Ferwerda, H. A. (1978). In "Inverse Source Problems in Optics" (H. P. Baltes, ed.), Chap. 2. Springer-Verlag, Berlin and New York.
Ferwerda, H. A. (1981). Optics in Four Dimensions (ICO-Ensenada, 1980) (M. A. Machado and L. M. Narducci, eds.). AIP Conf. Proc. (65), 402-411.
Ferwerda, H. A. (1983). "Technical Digest of the Topical Meeting on Signal Recovery and Synthesis with Incomplete Information and Partial Constraints, Incline Village, Nevada," Paper ThA1. Optical Society of America.
Fisher, R. A. (1950). "Contributions to Mathematical Statistics." Wiley, New York.
Fletcher, R. (1970). Comput. J. 13, 317-322.
Fletcher, R., and Powell, M. J. D. (1963). Comput. J. 6, 163-168.
Frank, J. (1973). Optik 38, 582-584.
Frank, J. (1980). In "Computer Processing of Electron Microscope Images" (P. W. Hawkes, ed.). Springer-Verlag, Berlin and New York.
Frieden, B. R. (1971). In "Progress in Optics" (E. Wolf, ed.), Vol. 9, pp. 311-407. North-Holland, Amsterdam.
Gerchberg, R. W., and Saxton, W. O. (1972). Optik 35, 237-246.
IMAGE HANDLING IN ELECTRON MICROSCOPY
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 66
Digital Processing of Remotely Sensed Data

A. D. KULKARNI

National Remote Sensing Agency, Balanagar, Hyderabad, India*
I. Introduction
   A. Illustrative Spectral Signatures
   B. Atmospheric Windows
   C. Remote Sensors
   D. Applications of Remote Sensing
   E. Digital Processing
II. Preprocessing Techniques
   A. Radiometric Corrections
   B. Geometric Corrections
III. Enhancement Techniques
   A. Gray-Scale Manipulation Techniques
   B. Edge-Enhancement Techniques
   C. Filtering Techniques
   D. Spatial Smoothing Techniques
   E. Enhancement by Band Ratioing
   F. Enhancement by Principal Components
   G. Pseudocolor Components
   H. Enhancement by Stereo-Pair Decomposition
   I. Shift-Variant Enhancement Techniques
IV. Geometric Correction and Registration Techniques
   A. Interpolation Techniques
   B. Registration Techniques
V. Classification Techniques
   A. Linear Discriminant Functions
   B. Minimum Distance Classifier
   C. Supervised Classification Techniques
   D. Tree Classifiers
   E. Contextual Classification Techniques
   F. Clustering Techniques
VI. System Design Considerations
VII. Conclusion
References
* Present address: Computer Science Department, University of Southern Mississippi, Hattiesburg, Mississippi 39406.
I. INTRODUCTION

Remote sensing is the science of deriving information about an object from measurements made at a distance from the object. The different sensors and techniques used for this purpose have improved our capability to gather information about the earth's natural resources and environment. A remote sensing system can be either active or passive; in both cases the electromagnetic (EM) energy from ground targets is measured. Every object on the ground responds uniquely to the electromagnetic energy incident on it, and it also has unique radiation properties over the electromagnetic spectrum. Such properties can be used effectively to identify objects on the ground. The earliest and most useful form of remote sensing was photography. Here, the photon energy (in visible or near-visible portions of the spectrum) which is radiated or reflected from objects is collected by a camera (the sensor) and recorded on a light-sensitive film emulsion. Aerial multiband color photography can be used to identify various categories of objects on the ground. The technique of remote sensing relies to a great extent on the interaction of EM radiation with matter. Macroscopically, the interactions are absorption, transmission, reflection, and emission of radiation by the observed features; these are due to atomic and molecular absorption, as well as scattering. These physical processes affect the reflected or emitted radiation (signal) measured by the sensors. The remotely measured signal expressed as a function of wavelength is often referred to as the "spectral signature" of the target object on which the measurements have been made. In principle, spectral signatures are unique; i.e., different objects have different spectral signatures. It is therefore possible to identify an object from its spectral signature.
In brief, this is the principle of multispectral remote sensing, a powerful technique for monitoring natural resources and the environment.

A. Illustrative Spectral Signatures
Some typical spectral signatures are discussed below. A typical spectral curve for green vegetation is shown in Fig. 1. In the visible wavelengths, pigmentation dominates the spectral response of plants. In the near-infrared region, the reflectance rises noticeably because the green leaf absorbs very little energy in this region. In the mid-infrared region (MIR), water absorbs energy strongly; since green leaves have a very high moisture content, the water absorption band dominates the spectral response in this region.

FIG. 1. Significant spectral response characteristics of green vegetation.

The relationship between the incident, the reflected, the absorbed, and
the transmitted energy is given by

I_λ = R_λ + A_λ + T_λ    (1)

where I_λ is the incident energy, R_λ is the reflected energy, A_λ is the absorbed energy, and T_λ is the transmitted energy. In visible wavelengths most of the energy striking a green leaf is absorbed and very little is transmitted through the leaf. Figure 1 shows a reflectance peak at approximately 0.54 μm, which is in the green wavelength region.

The spectral characteristics of soil are shown in Fig. 2. The moisture content, the amount of organic matter, the amount of iron oxide, the relative percentages of clay, silt, and sand, the particle size, and the roughness characteristics of the soil surface all influence the spectral reflectance of soils significantly. An increase in iron oxide can cause a significant decrease in reflectance, at least in visible wavelengths.

FIG. 2. Spectral reflectance curves for a typical clay soil (Pembroke) at two moisture contents.

The spectral transmittance through turbid and clear water is shown in Fig. 3.

FIG. 3. Spectral transmittance through ten meters of water of various types: distilled (Hulburt); ocean (Clark and Jones); coastal (Hulburt); bay (Hulburt).

For bodies of water, the interactions are a result of water quality and are
further affected by various conditions of the water. Locating and delineating water bodies by remote sensing can be done most easily in the near-IR wavelengths.

B. Atmospheric Windows
Spectral transmittance up to the far-infrared regions of the EM spectrum is shown in Fig. 4. The molecules responsible for each absorption band are water vapor, carbon dioxide, and ozone, as indicated in the upper part of the figure. The transmission curve is characterized by several portions of high transmission, known as atmospheric windows. The transmission depends upon the amount of absorber along the path, the altitude, the angle that the path makes with the horizontal, and the wavelength of observation. The several atmospheric windows provide the spectral bands needed for remote sensing data acquisition. The most utilized spectral windows are 0.3-1.3 μm and 8-14 μm.

C. Remote Sensors
Radiation sensors are the instruments which measure the intensity of radiation leaving a surface or an object, as a function of time, wavelength, space, and geometry, including angular orientation of the target with reference to the observer. No single instrument can take all of these measurements well or even satisfactorily. For most applications, therefore, some parameters are stressed in each instrument at the expense of others.
FIG. 4. Transmittance through the earth's atmosphere (horizontal path at sea level, length 1828 m); the absorbing molecules (H2O, CO2, O3) are indicated above the absorption bands.
1. Photographic Systems
The technology of remote sensing originated in the science of photograph interpretation. Camera systems remain popular for aircraft remote sensing programs in spite of the increased use of satellite imaging. In a photographic system, the film functions as the detector and the lens as the optical system. A photographic system is basically a framing system in which all the data in an image are acquired simultaneously; its main limitation is a relatively restricted spectral range compared with a multispectral system.

2. Satellite Remote Sensing Systems
Satellite remote sensing began in earnest with the launch of the first earth resources satellite in July 1972 by the National Aeronautics and Space Administration (NASA) of the United States. This was followed by Landsats 2 and 3 in 1975 and 1978, respectively. Landsats 2 and 3 carried multispectral scanner (MSS) and return beam vidicon (RBV) sensors. These are sun-synchronous satellites with a local solar time of 8:30 a.m. and repetitive coverage every 18 days. The ground resolution for MSS is about 57 m × 79 m, while that of the RBV is 30 m × 30 m. The swath width for MSS is 185 km. MSS produces images in a sequential fashion. The target is scanned in raster (a line at a time), usually with an optomechanical system. Radiation passes through converging optics, which establish an instantaneous field of view (IFOV). The total field of view (FOV) is established by the scanning motion of the optical system. The radiation is then dispersed into its spectral components using prisms, gratings, or filters. An array of detectors senses the dispersed radiation, each detector responding to the wavelength region to which it is sensitive. The signals from each detector are amplified, processed, and recorded on board or transmitted by telemetry to a receiving station. The scene and the calibration source are scanned by an optomechanical system. MSS spectral bands are given in Table I.

TABLE I
MSS SPECTRAL BANDS

Band number    Spectral range (μm)
4              0.5-0.6
5              0.6-0.7
6              0.7-0.8
7              0.8-1.1

RBV contains a photoemissive surface onto which the scene of interest is
focused optically. Generally, external components are used to focus and to deflect the readout electron beam that scans the image on the photosensitive surface. The advantage of RBV over other electron-beam imagers is its high spatial resolving power (100 to 120 lines per mm). For a given ground resolution requirement, the size of the image-forming optics can be much smaller for an RBV system. It has two possible disadvantages. The first is that the antimony sulfide photoconductor has a high capacitance that causes erasure and subsequently long preparation times, on the order of 5 to 15 sec. The second is that the focus coil of the tube consumes about 30 W of power. The ground resolution of RBV is about 30 m × 30 m. The successors to Landsats 1, 2, and 3 are Landsats 4 and 5. Landsat 4 was launched by NASA in 1982 and Landsat 5 in 1983. These satellites carry, in addition to MSS, a sensor called the thematic mapper (TM). TM has seven spectral bands, of which one is a thermal band. The resolution of the six nonthermal bands is 30 m × 30 m. As with MSS, the reflected radiation is collected by a scanning mirror; in TM, however, the data are collected in both the forward and reverse scans. TM spectral bands are given in Table II. The TM data for band 3 are shown in Fig. 5. The SPOT satellite will be launched by France in the mid-1980s. It will orbit in a near-circular sun-synchronous orbit at an altitude of 832 km. The payload of the first SPOT satellite consists of two high-resolution visible (HRV) imaging instruments and a package composed of two magnetic tape data recorders and a telemetry transmitter. The resolution of HRV is 20 m × 20 m; in the panchromatic mode, the resolution is 10 m × 10 m. There is also an off-nadir viewing capability, achieved by steering the plane mirror through which light enters the HRV instrument. This capability extends over a range of ±27° relative to the vertical.
This allows the instrument to image an object within a strip extending 475 km to either side of the satellite ground track. The spectral bands for SPOT are 0.50-0.59, 0.61-0.68, and 0.79-0.84 μm in the multispectral mode and 0.51-0.73 μm in the panchromatic mode.

TABLE II
TM SPECTRAL BANDS

Band number    Spectral range (μm)
1              0.45-0.52
2              0.52-0.60
3              0.63-0.69
4              0.76-0.90
5              1.55-1.75
6              10.40-12.50
7              2.08-2.35
FIG. 5. Thematic mapper data (band 3).
The Indian Remote Sensing Satellite (IRS) is scheduled for launch in the mid-1980s. IRS will carry two payloads, both employing solid-state linear imaging self-scan (LISS) sensors. The payload system characteristics are given in Table III. The altitude of IRS will be about 900 km, giving a swath of 148 km.

TABLE III
IRS PARAMETERS

Parameter               LISS I      LISS II
Ground resolution (m)   93          36.5
Number of bands         4           4
Spectral range (μm)     0.45-0.90   0.45-0.96
Swath (km)              148         74
D. Applications of Remote Sensing

Remotely sensed spectral measurements can be a source of information for many applications, including agriculture, forestry, geology, mineral resources, hydrology, water resources, geography, cartography, meteorology, and military applications.
1. Agriculture and Forestry

The applications of remote sensing to agriculture and forestry include crop identification and area estimation, assessment of crop condition and yield potential, detection of diseases and other crop stresses, and soil mapping. The use of multistage sampling with a combination of ground, aircraft, and satellite imagery is often very useful.
2. Land-Use Inventory and Mapping

One of the major tasks confronting planners and administrators of local, state, and national government agencies is the acquisition and analysis of information concerning natural resources, land-use inventory, and mapping. Remote sensing techniques are quite helpful as a substitute for conventional land-use study practices. There is a considerable savings in the time and cost of maps produced from Landsat data, as compared with the interpretation of aerial photography.
3. Hydrology and Water Resources

Remote sensing data are used for monitoring and managing water resources. They can be used for flood mapping, snow cover measurements, determination of surface water and wetlands, detection of water pollution, and mapping of water surface temperatures.

4. Mineral Resources and Geological Structures

Aerial photography has been used in geology for many years. Recent advances indicate that satellite-acquired images can be used effectively for many geological tasks. The wide coverage of Landsat images is particularly advantageous for the detection and mapping of lineaments. Other examples include reconnaissance mapping in inaccessible regions, map revisions, regional and synoptic analysis of crustal features, assessment of dynamic surface processes, and the systematic search for minerals.

5. Meteorology
Satellites provide meteorologists with a data source of synoptic and repetitive coverage. The data are used for (1) synoptic meteorology, where
satellite observation of clouds provides measurements of winds, cyclogenesis, and estimates of rainfall; (2) atmospheric profiling, where vertical profiles of temperature, humidity, and certain gaseous constituents are provided; and (3) surface features of importance to meteorology, such as temperature, soil moisture, and sea ice coverage.

6. Military Applications
Reconnaissance by means of aerial photography has long been practiced by the military. Aerial photographs as well as satellite images are used for reconnaissance.

E. Digital Processing

The automatic or machine processing of remotely sensed data is usually carried out in two stages: preprocessing and processing. No imaging system will reproduce images without any distortion. These distortions are mainly due to the characteristics of the sensing devices and can be of two types, namely geometric distortions and radiometric distortions. In order to use or analyze the data for the various applications mentioned earlier, one needs to correct for the radiometric and geometric distortions affecting the image. Usually, the data received from a satellite are first recorded on high-density tape, in real time, without any correction. These recorded data are later replayed at a slower rate, and various geometric and radiometric corrections are applied. The data are then recorded either on film or on computer-compatible tape (CCT). Preprocessing deals with the generation of film and CCT products. It also generates additional information such as the scene latitude and longitude coordinates of the corner points and annotation information such as satellite identification, band number, and date of data acquisition. This information is provided as annotation in the film products or is stored in the header records of the CCT product. Preprocessing also includes systematic geometric and radiometric corrections to compensate for imaging system degradation. In order to use the data for the applications mentioned earlier, they can be analyzed on a computer for the detection of various categories, or they can be interpreted visually. For visual interpretation, the quality of the acquired data can be improved by using various enhancement techniques.
The data can be analyzed on a computer using various pattern-recognition techniques. The images can be improved in their geometric accuracy by using
ground control point (GCP) information. These tasks are carried out at the processing stage. Various preprocessing techniques for geometric and radiometric corrections, and processing techniques for enhancement, geometric correction, classification, and registration, which are often used in remote sensing, are discussed in the subsequent sections. The main aim of machine processing of remotely sensed images is the extraction of useful information. Many techniques derived from pattern recognition and image processing have been applied to process such data. Many review articles and a number of books have been written on the subject, including Fu (1976, 1983), Haralick (1976), Anuta (1977), Swain and Davis (1978), Bernstein (1976), Goldberg (1981), Rosenfeld (1983), and Deekshatulu and Kamat (1983).

II. PREPROCESSING TECHNIQUES
No imaging system will acquire and reproduce images without distortion. Each type of remote sensor has a typical internal calibration, and each type of sensor causes its own geometric distortions. In preprocessing, these distortions are compensated for. Usually, two types of corrections are carried out, namely, radiometric and geometric corrections. In the case of satellite data, radiometric corrections are mainly carried out to compensate for detector gain variations. The geometric corrections are carried out to rectify the distortions due to earth rotation, nonlinearity of the scanning mirror velocity profile, detector positions on the payload, etc. The corrections for these distortions can be carried out with a knowledge of the system parameters and are known as systematic corrections. Standard film and computer-compatible tape products of Landsat data are usually generated after these corrections are applied. The algorithms for these corrections are briefly discussed below.

A. Radiometric Corrections
In the case of MSS data, there are four spectral bands and six detectors for each spectral band. The TM sensor array consists of 100 detectors in 7 spectral bands: 4 detectors for the thermal band and 16 detectors in each of the six nonthermal bands. Since these detectors are not identical, there exist differences in their sensitivities. The nonuniformity of the detector sensitivities produces what is known as the striping effect in the output image. One needs to compensate for the detector sensitivity variations to rectify the image.
There are three basic approaches for radiometric corrections. These are based on the following:

(1) the prelaunch gain and bias values of the detectors;
(2) the on-board calibration parameters of the detectors;
(3) the statistical properties of the observed data.
1. Radiometric Correction Using Prelaunch Gain and Bias Values

In general, the detectors can be modeled as shown in Fig. 6. The observed digital voltage for the ith detector is given by

V_o = g_i R + b_i    (2)

where R is the input spectral radiance in milliwatts per square centimeter per steradian, g_i is the gain factor in discrete voltage counts per unit radiance for the ith detector, b_i is the offset in discrete voltage counts for the ith detector, and V_o is the digital output voltage count in discrete levels 0 to 255. The prelaunch gain and bias values of the detectors can be used to compensate for the detector sensitivity variations. The observed digital output voltage (gray value) of each pixel can be modified to correct the gray value, as shown in Fig. 6. The corrected gray value for each pixel is obtained by
V_c = g'_i V_o + b'_i    (3)

where V_c is the corrected gray value and V_o is the observed gray value; g'_i and b'_i are obtained from the prelaunch gain and bias values as follows. Let g_i and b_i denote the gain and bias values for the ith detector in the given spectral band, known from the prelaunch measurements. Let V_max = 255 and V_min = 0 be the maximum and minimum output voltage counts, respectively. The minimum and maximum input radiance for each detector can be obtained by solving Eqs. (4) and (5):

V_max = g_i R_max,i + b_i    (4)

V_min = g_i R_min,i + b_i    (5)
The reference minimum and maximum input radiance in each spectral band can be chosen as

R_min,ref = max_i (R_min,i),   i ∈ (1, N_d)    (6)

R_max,ref = min_i (R_max,i),   i ∈ (1, N_d)    (7)

where N_d represents the number of detectors in each band. The reference gain and bias for each band can be determined by using Eqs. (2), (6), and (7) as below:

g_ref = (V_max - V_min) / (R_max,ref - R_min,ref)    (8)

b_ref = V_max - g_ref R_max,ref    (9)

FIG. 6. Detector response and correction model.
Thus, for a given spectral radiance R, the observed voltage for the ith detector will be

V_o = g_i R + b_i    (10)

However, the output corresponding to the reference gain and bias will be

V_c = g_ref R + b_ref    (11)
The observed gray values are modified such that the corrected gray values for a given input radiance are the same for all the detectors. Hence, from Eqs. (3), (10), and (11) we get

g'_i = g_ref / g_i    (12)

b'_i = b_ref - b_i (g_ref / g_i)    (13)
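As an illustration, the prelaunch-calibration destriping of Eqs. (4)-(13) can be sketched in a few lines of Python. The gain and bias numbers below are invented for the example and are not real sensor calibration values.

```python
# Prelaunch-calibration destriping sketch, following Eqs. (4)-(13).
# The gain/bias numbers are invented for illustration, not real sensor data.
V_MAX, V_MIN = 255.0, 0.0

def correction_coeffs(gains, biases):
    """Return per-detector (g'_i, b'_i) from prelaunch gains and biases."""
    r_min = [(V_MIN - b) / g for g, b in zip(gains, biases)]  # from Eq. (5)
    r_max = [(V_MAX - b) / g for g, b in zip(gains, biases)]  # from Eq. (4)
    r_min_ref = max(r_min)                                    # Eq. (6)
    r_max_ref = min(r_max)                                    # Eq. (7)
    g_ref = (V_MAX - V_MIN) / (r_max_ref - r_min_ref)         # Eq. (8)
    b_ref = V_MAX - g_ref * r_max_ref                         # Eq. (9)
    return [(g_ref / g, b_ref - b * g_ref / g)                # Eqs. (12), (13)
            for g, b in zip(gains, biases)]

# Two detectors viewing the same radiance R report different raw counts
# (Eq. 10); after correction via Eq. (3) they agree up to rounding.
gains, biases = [2.0, 2.2], [5.0, 3.0]
coeffs = correction_coeffs(gains, biases)
R = 40.0
raw = [g * R + b for g, b in zip(gains, biases)]
corrected = [gp * v + bp for (gp, bp), v in zip(coeffs, raw)]
print(corrected)  # both detectors map to (nearly) the same gray value
```

Mapping every detector to the common reference gain and bias is what removes the line-to-line stripe pattern between adjacent detectors.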
The above procedure is repeated for each spectral band.

2. Radiometric Correction Using Internal Calibration Systems
In order to correct for the detector gain variations, it is also possible to use calibration data supplied by the internal calibration systems provided with the MSS and TM instruments for this purpose. The internal calibration system for TM consists of an obscuration shutter assembly, which includes a set of three calibration source lamps with associated optical conductors for bands 1 through 5 and 7, and a temperature-controlled blackbody surface for the thermal band (band 6). At the completion of each imaging scan, the obscuration shutter rotates into a range of positions in which the normal optical path for each detector is blocked; as the shutter passes through these optical paths, the calibration lamps, blackbody surface, and zero-radiance surfaces pass through the detector FOVs. As a result, calibration radiance levels and a dc restore level are provided between the active scans (Webb,
1983). As in the previous section, Eq. (3) is used to obtain corrected gray values from the observed gray values; in this case, however, the values of g'_i and b'_i are obtained as follows. Let the input radiance and the output digital voltage count be related by

V_o = g_i R + b_i + e    (14)
where V_o is the output digital voltage in discrete levels 0 to 255; R is the input radiance; g_i is the gain factor for the ith detector, in discrete voltage counts per unit radiance; b_i is the offset for the ith detector in discrete voltage counts; and e is the error term. The parameters g_i and b_i can be estimated from eight calibration samples and their associated radiance values using a regression model (Murphy, 1984). The estimation is optimized such that the sum of the squared errors is minimum. The total estimation error e_T is given by

e_T = Σ_{j=1}^{8} [V_o(j) - g_i R(j) - b_i]^2    (15)
Differentiating e_T with respect to g_i and b_i and equating the derivatives to zero, we obtain

g_i = Σ_{j=1}^{8} D_i(j) V_o(j)

b_i = Σ_{j=1}^{8} C_i(j) V_o(j)    (16)
where C_i(j) and D_i(j) for the ith detector are given by

C_i(j) = [Σ_k R(k)^2 - R(j) Σ_k R(k)] / [8 Σ_k R(k)^2 - (Σ_k R(k))^2]

D_i(j) = [8 R(j) - Σ_k R(k)] / [8 Σ_k R(k)^2 - (Σ_k R(k))^2]    (17)
where Σ_k denotes the sum over all the calibration samples. Once the g_i and b_i values for each detector are obtained, the g'_i and b'_i values can be obtained as in Eqs. (12) and (13).

3. Radiometric Corrections Using Statistical Methods

Even after applying corrections with the prelaunch gain and bias values or with internal calibration data, the detector gain and bias variations may not be fully compensated for. In such cases the histograms of the observed gray values V_o of each detector are used to correct the observed gray values. From the observed histogram for each detector, it is possible to obtain the mean
(m_i) and standard deviation (σ_i) for each detector as follows:

m_i = (1/N_s) Σ_{j=1}^{N_s} V_o(j)    (18)

σ_i = [(1/N_s) Σ_{j=1}^{N_s} (V_o(j) - m_i)^2]^{1/2}    (19)

where N_s represents the number of observed samples. The reference mean (m_ref) and the reference standard deviation (σ_ref) can be defined as

σ_ref = min σ_i,   i ∈ (1, N_d)    (21)
where Nd represents the number of detectors in each band. The observed gray values can then be corrected by using Eq. ( 3 ) ;however, gi and b; values in this case will be given by Eq. (22) below.
g: and b: for all the bands can be obtained in the same fashion.
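As a minimal sketch (not the chapter's code) of the correction of Eq. (22), with synthetic detector data and with the reference mean taken, by assumption, as the average of the detector means:

```python
import numpy as np

def destripe_gains(detector_rows, m_ref=None, s_ref=None):
    """Per-detector gain/bias in the spirit of Eq. (22):
    g' = s_ref/s_i, b' = m_ref - g'*m_i, so each corrected
    detector histogram has mean m_ref and std s_ref."""
    means = detector_rows.mean(axis=1)
    stds = detector_rows.std(axis=1)
    if m_ref is None:
        m_ref = means.mean()   # assumed choice of reference mean
    if s_ref is None:
        s_ref = stds.min()     # Eq. (21): minimum detector std
    g = s_ref / stds
    b = m_ref - g * means
    return g, b

# Two detectors of one band, each with a different gain/offset striping
rng = np.random.default_rng(0)
scene = rng.uniform(50, 150, size=(2, 1000))
striped = np.vstack([1.2 * scene[0] + 10, 0.8 * scene[1] - 5])
g, b = destripe_gains(striped)
corrected = g[:, None] * striped + b[:, None]  # Eq. (3) form: V' = g'V + b'
```

After the correction, both detector rows share the same mean and standard deviation, which removes the striping between them.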
In the above method only second-order statistics are used for the correction. It is also possible to choose the histogram of any one of the detectors as a reference histogram and to generate a look-up table for each detector, such that the histograms of the modified gray values of every detector match the reference histogram for each band.
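The look-up-table variant just described can be sketched as follows; the function name and the synthetic samples are ours:

```python
import numpy as np

def match_histogram_lut(detector_vals, reference_vals, levels=256):
    """Build a LUT mapping detector gray values so their histogram
    matches a chosen reference detector's histogram (CDF matching)."""
    d_cdf = np.cumsum(np.bincount(detector_vals, minlength=levels)) / detector_vals.size
    r_cdf = np.cumsum(np.bincount(reference_vals, minlength=levels)) / reference_vals.size
    # For each input level, pick the reference level with the closest CDF value
    lut = np.searchsorted(r_cdf, d_cdf).clip(0, levels - 1).astype(np.uint8)
    return lut

rng = np.random.default_rng(1)
ref = rng.integers(80, 170, 5000)   # reference detector samples
det = rng.integers(60, 140, 5000)   # striped detector samples
lut = match_histogram_lut(det, ref)
matched = lut[det]
```

The corrected values `matched` then follow approximately the same distribution as the reference detector.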
B. Geometric Corrections

Geometric distortions are introduced by the characteristics of the sensing device and the imaging geometry. In the case of satellite images, geometric distortions are mainly due to satellite altitude variations, earth curvature effects, earth rotation effects, yaw, roll, and pitch of the spacecraft, nonlinearity of the scanning mirror profile, etc. Corrections for these distortions are basically carried out by modeling the scanning system. The payload correction data (PCD) are used to correct these distortions. In the case of TM, models are used to correct distortions due to earth curvature, earth rotation, mirror profile nonlinearity, and detector offsets, whereas PCD information is used to correct the distortions due to TM axis misalignment and the yaw, roll, and pitch of the spacecraft. The PCD contains gyro data which give yaw, roll, and pitch information. Some of the correction models are discussed below.
324
A. D. KULKARNI
1. Earth Curvature and Panoramic Distortion Correction
The earth curvature and the panoramic distortion can be modeled together as a single distortion along each line of the image data. The amount of correction C(x_i) for the ith output pixel is found as the input pixel displacement

x̂_i = x_i + C(x_i)    (23)

where x_i is the output pixel location, x̂_i is the input location for the corresponding pixel, and C(x_i) is the distortion correction. Figure 7 shows the relationship of the corrected output scan line, scaled for pixels subtending equal increments of the central angle φ, to the raw image, with pixels subtending equal increments of the scan angle θ. From the plane geometry shown it can be seen that

(ρ + h) sin θ = ρ sin(θ + φ)    (24)
For any point φ in the output image we can write

φ(θ) = sin⁻¹{[(ρ + h)/ρ] sin θ} − θ    (25)

θ(φ) = tan⁻¹(sin φ/{[(ρ + h)/ρ] − cos φ})    (26)

where ρ is the radius of curvature of the earth's surface, and h is the satellite
FIG. 7. Earth's curvature and panoramic distortion.
altitude. If there are 2N + 1 pixels per scan line, then we get

Δθ = θ_max/N    (27)

Δφ = φ_max/N    (28)

where θ_i and φ_i are the incremental angles

θ_i = iΔθ,    φ_i = iΔφ,    for i = 1, 2, 3, ..., N    (29)

The correction C(x_i) for the ith output pixel is then

C(x_i) = θ(φ_i)/Δθ − φ_i/Δφ

or

C(x_i) = (1/Δθ) tan⁻¹{sin(iΔφ)/[(ρ + h)/ρ − cos(iΔφ)]} − φ_i/Δφ    (30)

2. Mirror Nonlinearity Correction
The nonlinearity of the mirror velocity profile in the case of TM can be expressed by the following equations. The initial smoothed profile polynomial is given by (Webb, 1983)

φ(t) = a₀ + a₁t + a₂t² + a₃t³ + a₄t⁴ + a₅t⁵    (31)

For a reverse scan the polynomial is given by

φ(t) = b₀ + b₁t + b₂t² + b₃t³ + b₄t⁴ + b₅t⁵    (32)

For the actual scan time t_s, the modified profile is given by

φ_A(t) = a₀ + a₁(t_i/t_s)t + a₂(t_i/t_s)²t² + a₃(t_i/t_s)³t³ + a₄(t_i/t_s)⁴t⁴ + a₅(t_i/t_s)⁵t⁵    (33)

where t_i is the ideal scan time (0.060743 sec), and t_s is the actual scan time. The ground calibration profile for the ith scan can be written as

φ_G(t) = a₀ᵢ + a₁ᵢt + a₂ᵢt² + a₃ᵢt³ + a₄ᵢt⁴ + a₅ᵢt⁵    (34)

where

a₀ᵢ = a₀
a₁ᵢ = a₁(t_i/t_s) + {Δf_i/[t_FH(1 − t_FH/t_s)]} + {Δf_i/[t_s t_FH(1 − t_FH/t_s)]}
a₂ᵢ = a₂(t_i/t_s)²
a₃ᵢ = a₃(t_i/t_s)³
a₄ᵢ = a₄(t_i/t_s)⁴
a₅ᵢ = a₅(t_i/t_s)⁵
where t_FH is the first half-scan time, and Δf_i is obtained from a line-length code for each scan cycle. Equation (34) gives the velocity profile for each scan and can be used for resampling along the scan, such that the output pixels are of the same size.

3. Correction due to Misalignment of TM Axis and Yaw, Roll, and Pitch of the Spacecraft

Distortions are caused by misalignment of the TM axis and can also be due to yaw, roll, and pitch of the spacecraft. The information for correction of distortions like yaw, roll, and pitch is sent in the PCD. These distortions can be corrected using Eq. (35):
where u is the input (uncorrected image) pixel number; v is the input (uncorrected image) scan number; x is the output (corrected image) pixel number; and y is the output (corrected image) scan number. Equation (35) can be simplified to a first-order transformation and rewritten as

u = a₁₁x + a₁₂y + a₁₃
v = a₂₁x + a₂₂y + a₂₃    (36)

Equation (36) can be used to generate the lookup tables for generating corrected images. The coefficients a₁₁, a₁₂, ..., a₂₃ can be calculated using the yaw, roll, and pitch angle information sent in the PCD. In order to implement the above corrections, resampling or interpolation techniques like nearest neighbor, cubic convolution, etc., are used.
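To make the scan-geometry model concrete, the panoramic displacement of Eq. (30) can be sketched as below; the orbit numbers (ρ = 6378 km, h = 705 km, θ_max = 7.5°, 2N + 1 = 6001 pixels) are illustrative, not from the text:

```python
import math

def panoramic_correction(i, n_half, theta_max, rho, h):
    """Input-pixel displacement C(x_i) of Eq. (30) for output pixel i.
    Output pixels subtend equal increments of the central angle phi;
    input pixels subtend equal increments of the scan angle theta."""
    d_theta = theta_max / n_half                                  # Eq. (27)
    phi_max = math.asin((rho + h) / rho * math.sin(theta_max)) - theta_max  # Eq. (25)
    d_phi = phi_max / n_half                                      # Eq. (28)
    phi_i = i * d_phi                                             # Eq. (29)
    theta_i = math.atan2(math.sin(phi_i),
                         (rho + h) / rho - math.cos(phi_i))       # Eq. (26)
    return theta_i / d_theta - phi_i / d_phi                      # Eq. (30)

# Illustrative (Landsat-like) numbers: rho = 6378 km, h = 705 km
c_mid = panoramic_correction(1500, 3000, math.radians(7.5), 6378.0, 705.0)
c_edge = panoramic_correction(3000, 3000, math.radians(7.5), 6378.0, 705.0)
```

Because Eqs. (25) and (26) are exact inverses of each other, the displacement vanishes at the scan center (i = 0) and at the swath edge (i = N), and amounts to a few pixels in between for these numbers.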
III. ENHANCEMENT TECHNIQUES
In order to make maximum use of the available information in visual interpretation, some sort of enhancement is usually needed. Image enhancement emphasizes viewing the image for extraction of information that may not be readily available in the original. Many enhancement techniques are used in remote sensing; principal among them are gray-scale manipulation techniques, filtering techniques, edge-extraction techniques, pseudocolor composite techniques, etc. Many textbooks and review articles are available which describe these techniques, among them Andrews and Tescher
(1972), Rosenfeld and Kak (1982), Andrews (1970), Rosenfeld (1976), Wang et al. (1983), and Gonzalez and Wintz (1977). The techniques most often used in remote sensing are described in the subsequent sections.

A. Gray-Scale Manipulation Techniques

Gray-scale manipulations are intensity-mapping techniques, also known as gray-level rescaling techniques. Gray-level rescaling directly assigns each pixel a new gray level to improve the contrast of the image. In
FIG. 8. Gray-scale transform functions (bright-region, mid-range, dark-region, and nonlinear stretching), with the corresponding input and output histograms.
general, gray-level rescaling considers one pixel at a time, independently of its neighbors. In many remotely sensed images the gray levels of the object are so close to those of the background that it is difficult to discriminate them by visual inspection. Contrast enhancement is needed to increase the gray-level differences between the object and the background. In other situations, the gray levels of a large percentage of the pixels are concentrated in a narrow portion of the histogram, which makes the fine details hardly visible. Gray-scale manipulation techniques increase the dynamic range and can be defined as

ĝ(i, j) = T(g(i, j))    (37)

where ĝ is the enhanced imagery, g is the unenhanced or input imagery, T(·) is the transform function, and i, j represent the column and the row number of the pixel. Some of the most commonly used mapping functions are shown in Fig. 8. The original and enhanced images are shown in Figs. 9 and 10, respectively (Rao et al., 1982). Histogram equalization is also a commonly used enhancement technique. Typical histograms of the input and output images are shown in Fig. 8; it can be seen that the gray values in the histogram corresponding to the output imagery are uniformly distributed. Histogram equalization is equivalent to maximizing the zeroth-order entropy. The cumulative distribution function (CDF) of a typical image is shown in Fig. 11, together with the desired linear CDF that corresponds to the equalized histogram. The CDF linearization procedure can be formulated as (Wang et al., 1983)
ĝ(i, j) = (g_max − g_min) p(g(i, j)) + g_min    (38)

where p(g(i, j)) is the value of the CDF at gray level g(i, j); g_min and g_max are the minimum and the maximum gray levels; and ĝ is the output or enhanced image.
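The CDF linearization of Eq. (38) can be sketched as follows (synthetic low-contrast image; 256 gray levels assumed):

```python
import numpy as np

def equalize(img, levels=256):
    """Histogram equalization via Eq. (38):
    g_hat = (g_max - g_min) * p(g) + g_min, with p the empirical CDF."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size          # p(g), in [0, 1]
    g_min, g_max = 0, levels - 1
    lut = ((g_max - g_min) * cdf + g_min).round().astype(np.uint8)
    return lut[img]

rng = np.random.default_rng(2)
img = rng.integers(100, 140, size=(64, 64))   # low-contrast image
eq = equalize(img)
```

The output occupies the full 0-255 range, widening the narrow input histogram described above.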
B. Edge-Enhancement Techniques
Edge-enhancement and detection techniques are important in many applications, like detecting boundaries, roads, and rivers, mapping lineaments to aid geological ground surveys, and detecting seismic horizons in the automation of geological interpretation. The problem of edge detection is difficult both to define and to solve, and a large number of schemes have been presented in the literature. Edges, represented by an abrupt change in pixel intensity, are normally detected by comparing intensities or average intensities computed over neighborhoods in the vicinity of a pixel.
FIG.9. MSS raw data (band 7).
In edge enhancement, we attempt to make the edges more visually prominent within an image; in other words, the techniques are employed to increase the gray-level difference between the boundaries of two regions. Most edge operators are local. In the case of one-dimensional data, the step edge and its derivatives are shown in Fig. 12, and the line edge and its derivatives in Fig. 13. It is obvious that for a step edge the first derivative shows a maximum at the edge point, whereas for the line edge the first derivative is zero there. The second derivative in the case of the step edge is zero at the edge point, whereas for the line edge the second derivative is a maximum there. Hence, the edges in the image can be detected by fixing a threshold on the values of the first or second derivatives. Since images are two dimensional, the derivatives in both directions need
FIG.10. Image of Fig. 9, enhanced by stretching.
to be considered. Since the gray-value variations are different for different types of edges, the threshold for detecting the edges is a subjective matter. The presence of noise in the image also causes difficulty in the detection of edges. Operators like the Laplacian, the gradient, and Roberts' operator are well known. They can be described as

ĝ(i, j) = |4g(i, j) − g(i + 1, j) − g(i, j + 1) − g(i − 1, j) − g(i, j − 1)|    (39)

ĝ(i, j) = {[g(i, j) − g(i, j − 1)]² + [g(i, j) − g(i − 1, j)]²}^{1/2}    (40)

ĝ(i, j) = |g(i, j) − g(i + 1, j + 1)|    (41)

In Eqs. (39), (40), and (41), g(i, j) corresponds to the input image gray value at (i, j), and ĝ(i, j) corresponds to the output image gray value at (i, j). In order to determine whether the pixel at (i, j) is an edge pixel or not, some threshold T is
FIG. 11. Cumulative distribution function of a typical image (actual CDF; gray values 0 to 255; ordinate, total number of pixels).
FIG. 12. Step edge and its derivatives.
FIG. 13. Line edge and its derivatives.
employed; i.e.,

if ĝ(i, j) ≥ T, the pixel at (i, j) is an edge pixel
if ĝ(i, j) < T, the pixel at (i, j) is not an edge pixel    (42)
The edge operators can also be defined in terms of masks. If we consider a region of size 3 × 3, as shown in Fig. 14, with the central pixel g₅, the gray values in the region can be represented by a vector g = [g₁, g₂, ..., g₉]. The weighting coefficients can be represented by a vector w = [w₁, w₂, ..., w₉].
FIG. 14. 3 × 3 image mask.
The edges can then be detected by fixing a threshold on the value of S, which is given by

S = g·w = Σ_{i=1}^{9} g_i w_i    (43)
where g is the gray-value vector and w the weighting coefficient vector. Different masks can be used for detecting edges in different directions. The masks for the Sobel operator are shown in Fig. 15. A class of operators has been proposed in the literature for detection of edges and lines by fitting a hypersurface to an image in the neighborhood of each image point. Hueckel (1973) treated the surface fitting using a polar version of the orthogonal Fourier basis. Haralick (1981), Morgenthaler and Rosenfeld (1981), and Chittineni (1982, 1983) used multidimensional orthogonal polynomials as the basis functions. Hypersurface fitting using orthogonal polynomials is discussed below. Let X = (x₁, x₂) be a point in a two-dimensional space. Let Γ₀ be a rectangular region such that

Σ_{x_i ∈ Γ₀} x_i = 0    for all i    (44)

Let [S_i(X), 0 ≤ i ≤ n] be a set of two-dimensional orthogonal basis functions defined over the region Γ₀. Let f(X) be the digital picture function, and let g(X) be an estimate of f(X) expressed as a weighted sum of the basis functions; i.e.,
g(X) = Σ_{i=0}^{N} a_i S_i(X)    (45)

FIG. 15. Masks for the Sobel operator:

    1   0  -1         1   2   1
    2   0  -2         0   0   0
    1   0  -1        -1  -2  -1
where [a_i, 0 ≤ i ≤ N] is the set of coefficients. The total squared estimation error E² can be written as

E² = Σ_{X ∈ Γ₀} [f(X) − g(X)]²    (46)

The partial derivatives of the continuous function g(X) give the measure of the edge at the center point of the region Γ₀. The hypersurface fitting using discrete orthogonal polynomials can be carried out as follows. Let ψ_i be the domain of x_i. Let [p_ij(x_i), 0 ≤ j ≤ N] be a set of discrete orthogonal polynomials on ψ_i, 1 ≤ i ≤ 2. The set of two-dimensional basis functions [S_l(X), 0 ≤ l ≤ N] can be constructed using the one-dimensional discrete orthogonal polynomials p_ij(x_i) as follows:
S_l(X) = p_{1j}(x₁) p_{2k}(x₂),    0 ≤ l ≤ N    (47)

Let m_ik = Σ_{x_i ∈ ψ_i} x_i^k be the kth moment of x_i over the domain ψ_i. A few discrete orthogonal polynomials are given below:

p_{i0}(x_i) = 1
p_{i1}(x_i) = x_i
p_{i2}(x_i) = x_i² − m_{i2}/m_{i0}    (48)
p_{i3}(x_i) = x_i³ − (m_{i4}/m_{i2}) x_i
Using the basis functions given in Eq. (47), the continuous quadratic hypersurface g(x₁, x₂) at (x₁, x₂), as shown in Fig. 16, can be written as

g(x₁, x₂) = Σ_{k=i−1}^{i+1} Σ_{l=j−1}^{j+1} [B₀(k − i + 2, l − j + 2) + B₁(k − i + 2, l − j + 2) s₁
    + B₂(k − i + 2, l − j + 2) s₂ + B₁₁(k − i + 2, l − j + 2) s₁²
    + B₂₂(k − i + 2, l − j + 2) s₂² + B₁₂(k − i + 2, l − j + 2) s₁s₂] f(x₁ₖ, x₂ₗ)    (49)

where s₁ = (x₁ − x₁ᵢ)/h_{x₁} and s₂ = (x₂ − x₂ⱼ)/h_{x₂}, and h_{x₁} and h_{x₂} are the sampling increments. In Eq. (49), g(x₁, x₂) represents the continuous function at (x₁, x₂) in the 3 × 3 neighborhood centered at (i, j), and f(·,·) represents the discrete image function. B₀, B₁, B₂, B₁₁, B₂₂, and B₁₂ are the matrices shown in Fig. 17. It can be seen from Fig. 17 that the matrices B₁ and B₂ correspond to the first derivatives, B₁₁ and B₂₂ to the second derivatives, and B₁₂ to the cross derivative. The amplitudes of these derivatives can be used to detect the edges in the image.

FIG. 16. Discrete image f(x₁ᵢ, x₂ⱼ).

FIG. 17. Matrices for hypersurface fitting.

To make the edges in the image sharp or to eliminate blurring effects, an antidiffusion operator has been proposed. An antidiffusion operation compensates for the loss of gray level at the edge. Figure 18 shows a perfect one-dimensional step function u(x) and its second-order derivative ∇²u(x). By subtracting ∇²u(x) from u(x) we obtain the graph illustrated in Fig. 18. It can be seen that this operation increases the gray-level difference between the edge (x = 0+) and its neighboring background (x = 0−). The edge-enhancement operation can be formulated as

ĝ(x₁, x₂) = g(x₁, x₂) − γ∇²g(x₁, x₂)    (50)

where g(x₁, x₂) is the input image gray value at (x₁, x₂), ĝ(x₁, x₂) is the output image gray value, and γ is a constant controlling the degree of enhancement.
FIG. 18. One-dimensional step function u(x), its second-order derivative ∇²u(x), and their difference, illustrating step-edge enhancement.
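Equation (50) amounts to subtracting a scaled discrete Laplacian; a sketch on a synthetic step (γ = 0.5 is an arbitrary choice):

```python
import numpy as np

def antidiffusion(g, gamma=0.5):
    """Edge enhancement per Eq. (50): g_hat = g - gamma * laplacian(g).
    Border pixels are left unchanged in this sketch."""
    g = g.astype(float)
    out = g.copy()
    lap = (g[2:, 1:-1] + g[:-2, 1:-1] + g[1:-1, 2:] + g[1:-1, :-2]
           - 4 * g[1:-1, 1:-1])
    out[1:-1, 1:-1] = g[1:-1, 1:-1] - gamma * lap
    return out

step = np.zeros((6, 6))
step[:, 3:] = 100.0          # vertical step edge
sharp = antidiffusion(step, gamma=0.5)
```

The dark side of the edge undershoots and the bright side overshoots, so the gray-level difference across the edge grows, exactly the effect illustrated in Fig. 18.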
C. Filtering Techniques
An image can be viewed as a matrix of the points with each point represented by its gray level. It can also be considered as a combination of a two-dimensional Fourier series with different spatial frequencies. Thus, a picture can be represented uniquely in the frequency domain. It has been noticed that the low-frequency Fourier components correspond to a homogeneous object and a background, whereas the high-frequency components are
due to the edges and the small details in the image. Thus, low-pass or high-pass filters can be employed to smooth the picture or to enhance the edges. Figure 19 shows a block diagram of a filter in the frequency domain. Duda and Hart (1973) considered a low-pass filter of the form

H(f_x, f_y) = [(cos f_x)(cos f_y)]^α    (52)

where α ≥ 1, and a high-pass filter of the form

H(f_x, f_y) = 1 − [(cos f_x)(cos f_y)]^α    (53)

where α > 1. In Eqs. (52) and (53), f_x and f_y are the spatial frequencies in the x and y directions, respectively. It is obvious that the low-pass filter removes the noise but blurs the image, whereas the high-pass filter sharpens the edges but also enhances the noise. Filtering can also be carried out in the spatial domain. A filtering technique using local statistics was first proposed by Wallis (1976) and then extended by Lee (1980). The filtering technique in the spatial domain can be represented as

ĝ(i, j) = C₁ ḡ(i, j) + C₂ [g(i, j) − ḡ(i, j)]    (54)

where ĝ(i, j) is the enhanced image; g(i, j) is the input or raw image; C₁ and C₂ are constants such that C₁ ≤ 1 and C₂ ≥ 1; i, j correspond to the row and the column number of the pixel; and ḡ(i, j) is the local gray-level mean surrounding the pixel (i, j). It can be seen that with C₁ = 1 and C₂ = 0 the operation is a simple smoothing, while with C₁ ≤ 1 and C₂ ≥ 1 the edges and the fine details in the image are enhanced. Oppenheim et al. (1968) have modeled image formation as a multiplicative process in which a pattern of illumination is multiplied by a reflectance pattern to produce the brightness image. The image can then be represented as

g(i, j) = g_i(i, j) g_r(i, j)    (55)
In Eq. (55), g_i(i, j) and g_r(i, j) are the illumination and the reflectance patterns, respectively. The technique for filtering images which are modeled as in Eq. (55) is the homomorphic filtering technique. The homomorphic image processor can be represented as shown in Fig. 20, where F represents the filtering
FIG. 19. Block diagram of a filter in the frequency domain: g(x, y) → F → G(f_x, f_y) → H → G′(f_x, f_y) → F⁻¹ → ĝ(x, y).
FIG.20. Homomorphic image processor. F represents the filtering operation.
operation, which can also be carried out in the spatial domain, as described in Eq. (54). The output image is given by

ĝ(i, j) = [g_i(i, j)]^{C₁} [g_r(i, j)]^{C₂}    (56)
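A minimal spatial-domain sketch of the homomorphic processor of Fig. 20 and Eq. (56); the log-domain split into low-pass (illumination) and high-pass (reflectance) parts via a box filter, and the values C₁ = 0.7, C₂ = 1.5, are our choices, not the author's:

```python
import numpy as np

def homomorphic(g, c1=0.7, c2=1.5, k=5):
    """Log -> separate low-pass (illumination) and high-pass (reflectance)
    components -> weight by C1, C2 -> exp, per Eq. (56)."""
    logg = np.log(g.astype(float))
    # crude moving-average low pass over a k x k box
    pad = np.pad(logg, k // 2, mode='edge')
    low = np.zeros_like(logg)
    for di in range(k):
        for dj in range(k):
            low += pad[di:di + logg.shape[0], dj:dj + logg.shape[1]]
    low /= k * k
    high = logg - low
    return np.exp(c1 * low + c2 * high)

rng = np.random.default_rng(3)
illum = np.linspace(50, 200, 64)[None, :].repeat(64, axis=0)  # slowly varying illumination
refl = rng.uniform(0.5, 1.0, (64, 64))                        # reflectance detail
img = illum * refl
out = homomorphic(img)
```

With C₁ < 1 the slowly varying illumination component is compressed, reducing the dynamic range while the reflectance detail is boosted.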
For simultaneous dynamic range reduction and edge enhancement, C₁ should be less than 1 and C₂ should be greater than 1.

D. Spatial Smoothing Techniques
If the image contains noise, smoothing techniques are used for cleaning the noise. Some smoothing techniques also blur the observed image, so edge enhancement may be needed afterwards. The simplest smoothing technique is a weighted averaging over a neighborhood of a pixel. It can be expressed as

ĝ(i, j) = Σ_{p=−m}^{m} Σ_{q=−n}^{n} w(p, q) g(i − p, j − q)    (57)

where the weighting coefficients w(p, q) are given by

w(p, q) = 1/[(2m + 1)(2n + 1)]    (58)
Equation (57) replaces the gray level at (i, j) by a gray level averaged over a (2m + 1) by (2n + 1) rectangular neighborhood surrounding (i, j). To reduce the blurring effect, several unequally weighted smoothing techniques have been suggested. Graham (1962) used a 3 × 3 neighborhood and a weighting factor matrix W given by

        | 0.25  0.5  0.25 |
    W = | 0.5   1.0  0.5  |    (59)
        | 0.25  0.5  0.25 |

Brown (1966) proposed a similar unequally weighted matrix.
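Equation (57) with the Graham mask of Eq. (59) can be sketched as follows (the mask is normalized so that flat regions are preserved, an assumption not stated in the text):

```python
import numpy as np

def smooth(g, w):
    """Weighted-average smoothing of Eq. (57) with a (2m+1)x(2n+1) mask w."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                      # normalize so flat areas are preserved
    m, n = w.shape[0] // 2, w.shape[1] // 2
    pad = np.pad(g.astype(float), ((m, m), (n, n)), mode='edge')
    out = np.zeros(g.shape)
    for p in range(w.shape[0]):
        for q in range(w.shape[1]):
            out += w[p, q] * pad[p:p + g.shape[0], q:q + g.shape[1]]
    return out

# Graham's 3x3 weighting matrix of Eq. (59)
W = [[0.25, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 0.25]]
noisy = np.full((16, 16), 100.0)
noisy[8, 8] = 200.0                      # isolated noise spike
clean = smooth(noisy, W)
```

The spike is reduced toward the local mean while uniform regions are left untouched, illustrating the blur/noise trade-off described above.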
Kuwahara et al. (1976) proposed a smoothing scheme which replaces the gray level at (i, j) by the average gray level of its most homogeneous neighboring region. Yasuoka and Haralick (1983) have proposed a scheme using a sloped-facet model with a t test for cleaning salt-and-pepper noise. In a linear stochastic model, the gray value of any pixel can be expressed as

g(i, j) = ai + bj + γ + ε(i, j)    (61)

where i is the row position; j is the column position; ε represents an independent, identically distributed random variable with standard deviation p; and a, b, γ, and p are the parameters of the model. Each pixel is checked for noise by considering a 3 × 3 neighborhood, and the above model is fitted for the 3 × 3 block. The estimates â, b̂, γ̂, and p̂ are found through a criterion function J, which is in this case the total mean-squared error for the block R:
J = Σ_{i,j ∈ R} [g(i, j) − ai − bj − γ]²    (62)

Minimizing J with respect to a, b, and γ, we get

â = Σ_{i,j} i g(i, j) / Σ_{i,j} i²    (63)

b̂ = Σ_{i,j} j g(i, j) / Σ_{i,j} j²    (64)

γ̂ = (1/N) Σ_{i,j} g(i, j)    (65)

In all the above summations i and j vary from −1 to +1. From these estimates, p̂ is found as

p̂ = {(1/N) Σ_{i,j} [g(i, j) − âi − b̂j − γ̂]²}^{1/2}    (66)

where N is the number of elements in the block (in this case N = 9). The estimated gray value of the pixel can now be expressed as

ĝ(i, j) = âi + b̂j + γ̂    (67)

The t test can be used to test the hypothesis H₀: g(i, j) = ĝ(i, j), i.e., that the observed gray value equals the estimated value. Here t is defined as
t = [g(i, j) − ĝ(i, j)]/(p̂√N)    (68)

We take N = 9 and p = p̂. The threshold value of t is taken as t(N − 1, 0.05), corresponding to a 95% confidence level, and it can be read from the tables. If t < t(N − 1,
FIG. 21. Landsat data with noise.
FIG. 22. Noise-filtered image of Fig. 21.
0.05), accept H₀; i.e., g(i, j) is not a noise element and is not replaced. If t ≥ t(N − 1, 0.05), reject H₀; i.e., g(i, j) is noise and is replaced by ĝ(i, j). The scheme is amenable to iteration. The image with noise and the noise-removed image using the above algorithm are shown in Figs. 21 and 22, respectively.

E. Enhancement by Band Ratioing
Band ratioing is often used in practice for enhancement of multispectral data. Spectral ratios can be defined as

G(i, j) = [a₁B₁(i, j) + a₂B₂(i, j) + ··· + a_nB_n(i, j)]/[b₁B₁(i, j) + b₂B₂(i, j) + ··· + b_nB_n(i, j)]    (69)

where a₁, a₂, ..., a_n and b₁, b₂, ..., b_n are constants; B₁, B₂, ..., B_n are the gray values in the different spectral bands; and n is the number of bands. Ratioed images often show details which are not visible in the raw image.
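With a₁ = 1, a₂ = −1 and b₁ = b₂ = 1 over two bands, Eq. (69) gives a normalized-difference ratio; a sketch with hypothetical band values:

```python
import numpy as np

def band_ratio(bands, a, b):
    """Spectral ratio image per Eq. (69)."""
    num = sum(ai * Bi for ai, Bi in zip(a, bands))
    den = sum(bi * Bi for bi, Bi in zip(b, bands))
    return num / den

b1 = np.array([[40.0, 10.0]])   # e.g., a visible band (values hypothetical)
b2 = np.array([[50.0, 60.0]])   # e.g., a near-infrared band
nd = band_ratio([b2, b1], a=[1, -1], b=[1, 1])   # (B2 - B1)/(B2 + B1)
```

The second pixel, with a much larger band-2/band-1 contrast, stands out strongly in the ratio image even though its raw gray values are unremarkable.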
F. Enhancement by Principal Components

In Landsat multispectral images there exists a correlation between the gray values of a pixel in the different spectral bands. The spectral band values can be decorrelated using the Karhunen-Loève transform (or principal component technique). The technique is discussed below. Let X = (x₁, x₂, ..., x_n)ᵀ be an n-dimensional vector representing the spectral gray values of a pixel corresponding to n spectral bands. Let the transformed vector Y = (y₁, y₂, ..., y_n) be given by

Y = [A]X    (70)

where [A] is an n × n matrix whose rows are the eigenvectors of the covariance matrix C_x of X, such that

C_x = [A]ᵀ[Λ][A]    (71)

where [Λ] is a diagonal matrix whose elements are the eigenvalues of C_x:

[Λ] = diag(λ₁, λ₂, ..., λ_n)    (72)

where λ₁ ≥ λ₂ ≥ ··· ≥ λ_n. The image corresponding to y₁ is the first principal component image; similarly, the image corresponding to y_i is the ith principal component image. The transformed values y₁, y₂, ..., y_n, which are decorrelated, concentrate most of the information in the first few principal components, depending upon the eigenvalue ratios λ_i/Σ_j λ_j. The first few principal component images may be used as enhanced images for visual interpretation purposes.
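The transform of Eqs. (70)-(72) can be sketched with an eigensolver on synthetic two-band data:

```python
import numpy as np

def principal_components(X):
    """Rows of A are eigenvectors of the covariance of X (bands x pixels),
    ordered by decreasing eigenvalue; Y = A X per Eq. (70)."""
    C = np.cov(X)
    lam, vecs = np.linalg.eigh(C)            # ascending eigenvalues
    order = np.argsort(lam)[::-1]
    A = vecs[:, order].T
    return A @ X, lam[order]

rng = np.random.default_rng(4)
base = rng.normal(100, 20, 2000)
X = np.vstack([base + rng.normal(0, 2, 2000),    # band 1
               base + rng.normal(0, 2, 2000)])   # band 2: strongly correlated
Y, lam = principal_components(X)
```

For strongly correlated bands, nearly all the variance lands in the first component, which is why the first principal component image is the natural candidate for display.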
G . Pseudocolor Composites Enhancement can also be carried out by assigning various colors to the features in the image corresponding to different gray-level ranges. For example, from a single black and white image, three images can be generated such that each generated image enhances features corresponding to a chosen gray-level range. The three images then can be assigned red, blue, and green colors to generate an enhanced color composite. It is also possible to decompose the input image into three images such that each of the three images corresponds to the different frequency pass band. The three decomposed images then can be assigned red, blue, and green colors to generate the color composite output.
H. Enhancement by Stereo-Pair Decomposition
In the visual interpretation of Landsat images, it is often desirable to add height information to the reflectance pattern or gray values in the image in order to enhance interpretation capabilities. MSS images are obtained by a scanning mechanism such that terrain height variations have almost no effect on the spatial location of the pixels in the image plane. If the height information for the pixels is known, then it is possible to generate stereo images from the MSS image using the model shown in Fig. 23. In Fig. 23, the object plane corresponds to a Landsat image. Let (x, y, z) be the coordinates of a point in the object plane; the (x, y) coordinates correspond to the column and row numbers of the pixel. Let O₁ and O₂ be the perspective centers for obtaining projections of the object plane in the image planes IP₁ and IP₂. Let (x₁′, y₁′) and (x₂′, y₂′) be the coordinates of the point in image planes IP₁ and IP₂, respectively. The object plane and the image plane coordinates can be related by (Rao et al., 1982)

x₁′ = [1 + z/(z₁ − z)](x − x₁)    (73)

y₁′ = [1 + z/(z₁ − z)](y − y₁)    (74)

where (x, y, z) are the coordinates of a point in the object plane; (x₁, y₁, z₁) are the coordinates of the point O₁ with reference to O as the origin; and x₁′ and y₁′ are the coordinates of the corresponding point in the image plane IP₁ with O₁
FIG. 23. Model for stereo image generation.
as the origin. Similar equations can be obtained for projecting a point from the object plane to the image plane IP₂. Equations (73) and (74) can be used to generate stereo-pair images from the Landsat image. It is also possible to use other information, such as the earth's magnetic field, as the z information and generate stereo images (Green et al., 1978).

I. Shift-Variant Enhancement Techniques

The techniques discussed above are shift invariant in nature; i.e., the transformation functions used for the enhancement do not change with the spatial coordinates of the pixel. However, in practice most images have different gray-value distributions and textural properties in different spatial regions of the image. Hence, shift-invariant operators may not yield good results for the entire image. In order to overcome this difficulty, shift-variant operators for intensity mapping and filtering can be used. The operators can be adaptive in nature and can be obtained by considering the local properties of the image at different spatial locations (Kulkarni et al., 1982).
IV. GEOMETRIC CORRECTION AND REGISTRATION TECHNIQUES

As described in Section I, there are two types of geometric corrections. The corrections carried out by using imaging-system characteristics are called systematic corrections; these are carried out at the preprocessing stage. The precision corrections are usually carried out at the processing stage. One of the methods used to correct the image is to use ground control point (GCP) information. Here, the uncorrected image is compared with the corrected image or a map, and a few GCPs spread throughout the image are identified. The spatial relationship between a point in the uncorrected image g(u, v) and the corrected image f(x, y) can be written as

g(u, v) = f(φ₁(x, y), φ₂(x, y))    (75)

where (u, v) are the spatial coordinates of a point in the uncorrected image, and (x, y) are the spatial coordinates of the corresponding point in the corrected image. Thus the transform relationship between the coordinates of the two images can be represented by

u = φ₁(x, y)    (76)

v = φ₂(x, y)    (77)
The functions φ₁ and φ₂ can be polynomials of the form

u = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} a_mn x^m y^n    (78)

v = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} b_mn x^m y^n    (79)
Equations (78) and (79) represent the relationship between the coordinates of a pixel in the uncorrected image and in the corrected image. The problem is to find the transformation coefficients a₀₀, a₀₁, ..., a_{M−1,N−1} and b₀₀, b₀₁, ..., b_{M−1,N−1}. These coefficients are obtained by using GCP information as follows. Let (x_i, y_i), i = 1, 2, ..., N_p, be the coordinates of the GCPs in the corrected image, where N_p represents the number of ground control points. Let (u_i, v_i), i = 1, 2, ..., N_p, be the coordinates of the corresponding GCPs in the uncorrected image. Let û_i and v̂_i represent the estimates of u_i and v_i obtained from Eqs. (78) and (79). The total error in the estimates û_i is given by

J = Σ_{i=1}^{N_p} (û_i − u_i)²    (80)
Equations (78) and (80) can be solved to get the coefficients a_mn such that J is a minimum. The coefficients b_mn can be evaluated in the same fashion. Equations (78) and (79) represent polynomials of orders M and N. In many cases the polynomials can be approximated by the first-order transformation

u = a₀₀ + a₁₀x + a₀₁y
v = b₀₀ + b₁₀x + b₀₁y    (81)
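The first-order coefficients of Eq. (81) can be estimated from GCP pairs by minimizing Eq. (80) with least squares; a sketch with synthetic control points:

```python
import numpy as np

def fit_first_order(xy, uv):
    """Least-squares fit of u = a00 + a10 x + a01 y (and likewise v)
    from ground-control-point pairs, minimizing Eq. (80)."""
    x, y = xy[:, 0], xy[:, 1]
    M = np.column_stack([np.ones_like(x), x, y])
    coef_u, *_ = np.linalg.lstsq(M, uv[:, 0], rcond=None)
    coef_v, *_ = np.linalg.lstsq(M, uv[:, 1], rcond=None)
    return coef_u, coef_v

# Synthetic GCPs generated from a known affine distortion
rng = np.random.default_rng(5)
xy = rng.uniform(0, 512, (6, 2))
true_u = 3.0 + 1.01 * xy[:, 0] + 0.02 * xy[:, 1]
true_v = -2.0 + 0.03 * xy[:, 0] + 0.99 * xy[:, 1]
cu, cv = fit_first_order(xy, np.column_stack([true_u, true_v]))
```

With exact (noise-free) control points the generating coefficients are recovered; with real GCPs the fit minimizes the residual J of Eq. (80).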
In order to carry out geometric correction, the pixels from the uncorrected image are transformed to the corrected image as defined by the transform equations, as shown in Fig. 24. The gray values in the output image are obtained from the input-image pixel gray values by using some resampling or interpolation technique; these are discussed in the next section. The uncorrected and corrected images are shown in Figs. 25 and 26, respectively.

A. Interpolation Techniques
Interpolation techniques are used in geometric correction and also for image magnification and reduction. Interpolation is a process of estimating intermediate values of a continuous event from discrete
FIG. 24. Geometric correction transformation.
FIG. 25. Modular multispectral scanner image, uncorrected.
FIG.26. Geometrically corrected image of Fig. 25.
samples. The limitations of classical polynomial interpolation, like Lagrange interpolation, are thoroughly discussed by Hou and Andrews (1978), who developed an algorithm for interpolation using cubic spline functions. More recently, Keys (1981) developed an algorithm for interpolation by cubic convolution and defined a kernel for the interpolation. There are also other methods, like nearest neighbor, bilinear interpolation, and hypersurface approximation.
The cubic convolution interpolation method is more accurate than the nearest-neighbor or bilinear interpolation methods; however, it is not as accurate as the cubic spline interpolation method. In interpolation by hypersurface approximation, a quadratic or cubic surface, defined over a two-dimensional space in the neighborhood of the point to be interpolated, is used. For equispaced one-dimensional data, the continuous interpolation function can be written as

g(x) = Σ_k c_k U[(x − x_k)/h]    (82)

where g(x) is the interpolated continuous function corresponding to a sampled function f(x_k); the x_k are the interpolation nodes; the c_k are coefficients, which depend on the sampled data f(x_k); U is the interpolation kernel; and h is the sampling interval.
The kernels for nearest-neighbor, bilinear, and cubic convolution interpolation are given in Eqs. (84), (85), and (86), respectively (Stucki, 1979):

U(s) = 1    for 0 ≤ |s| ≤ 0.5
     = 0    otherwise    (84)

U(s) = 1 − |s|    for 0 ≤ |s| ≤ 1
     = 0    otherwise    (85)

U(s) = |s|³ − 2|s|² + 1    for |s| < 1
     = −|s|³ + 5|s|² − 8|s| + 4    for 1 ≤ |s| ≤ 2
     = 0    otherwise    (86)
In the nearest-neighbor, bilinear, and cubic convolution interpolation methods, the coefficients c_k in Eq. (82) are the sampled data function f(x_k). Interpolation by hypersurface approximation can be carried out by using discrete orthogonal polynomials as the basis functions. The expression for a continuous surface in two-dimensional space using the hyperquadratic surface approximation has been given in Eq. (49); the same can be used for interpolation. As an illustration, the nearest-neighbor, cubic convolution, and hypersurface approximation interpolation algorithms have been applied to Landsat data, and the outputs are shown in Figs. 27 through 30 (Kulkarni and Sivaraman, 1984). Kekre et al. (1982) have used raised cosine functions as basis functions and have developed an algorithm for interpolation. Park and Schowengerdt (1983) have developed an algorithm for interpolation using the parametric cubic convolution technique. They have used the family of
FIG. 27. Modular multispectral scanner raw Landsat data.
FIG. 28. Interpolation by nearest neighbor of data in Fig. 27.
FIG. 29. Interpolation by cubic convolution of data in Fig. 27.
piecewise cubic polynomials, and the kernel for the same is given by

u(s) = (a + 2)|s|^3 − (a + 3)|s|^2 + 1    for |s| < 1
     = a|s|^3 − 5a|s|^2 + 8a|s| − 4a      for 1 ≤ |s| ≤ 2
     = 0                                  otherwise          (87)

In Eq. (87) the common value of a = −1 is used. It has been shown that a = −0.5 provides the interpolator with better convergence properties. An intermediate choice, a = −0.75, is also sometimes advocated.
350
A. D. KULKARNI
FIG. 30. Interpolation by hypersurface fitting of data in Fig. 27.
In the case of images, which are inherently two dimensional in nature, interpolation in two dimensions can be carried out with a one-dimensional kernel, applied first row-wise and then column-wise.
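As a concrete sketch of this separable scheme (illustrative code, not the author's implementation), the following assumes equispaced samples with unit sampling interval and clamps indices at the borders; `cubic_kernel` is the parametric kernel of Eq. (87), which reduces to the cubic convolution kernel of Eq. (86) when a = −1:

```python
import numpy as np

def cubic_kernel(s, a=-0.5):
    """Parametric cubic convolution kernel of Eq. (87); a = -1 gives Eq. (86)."""
    s = abs(s)
    if s < 1:
        return (a + 2) * s**3 - (a + 3) * s**2 + 1
    if s <= 2:
        return a * s**3 - 5 * a * s**2 + 8 * a * s - 4 * a
    return 0.0

def interp1d_cubic(f, x, a=-0.5):
    """Evaluate g(x) of Eq. (82) for equispaced samples f (h = 1)."""
    k0 = int(np.floor(x))
    total = 0.0
    for k in range(k0 - 1, k0 + 3):          # the four nearest nodes
        fk = f[min(max(k, 0), len(f) - 1)]   # clamp at the borders
        total += fk * cubic_kernel(x - k, a)
    return total
```

Applying `interp1d_cubic` along every row of an image and then along every column of the result gives the two-dimensional interpolation described above.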
B. Registration Techniques

Registration is an important aspect of any image processing system. Registration may be interpreted as the automatic determination of local similarity between two structured data sets. Misregistration results from the fact that
the sensors are separated in space and time such that spatial alignment of the sensors is impractical or impossible. Geometric distortions, scale differences, and look-angle effects can all combine to produce misregistration. Several digital techniques have been used for registration of remotely sensed images. Anuta (1970) discussed the spatial registration problem from a cross-correlation point of view and implemented the algorithm using Fourier transform techniques. Barnea and Silverman (1972) described a sequential similarity detection algorithm for translational registration. Webber (1973) combined an affine transformation with the sequential similarity detection algorithm to determine all of the parameters of the affine transformation. In all of the above methods, a portion of one image, or a window W, is taken as a reference image, and the similar portion in the other image is located by search over a search area S. There may be translational, rotational, or scale differences between the two images. The search may also be carried out using optimization techniques, as shown below. Let F_1(x, y) and F_2(x, y) be two images to be registered. Let S be the search area in image F_1(x, y) and W be the window from image F_2(x, y), as shown in Fig. 31. S is taken as an L × L array of pixels which assume one of k gray levels, such that

0 ≤ S(x, y) ≤ k − 1,   1 ≤ x, y ≤ L    (88)

W is considered to be an M × M array of digital picture elements with the same gray levels, and M is chosen such that M < L. We can write

0 ≤ W(x′, y′) ≤ k − 1,   0 ≤ x′, y′ ≤ M    (89)
It is assumed that enough a priori information is known about the dislocation between the window and the search area such that the parameters M and L are selected with the virtual guarantee that at registration a complete subimage is contained in the search area.
FIG. 31. Registration of two images (window W and subimage of size M × M within search area S).
If we assume that there exist small rotational, translational, and scale differences between the two images, the coordinates (x, y) of the search area S and the coordinates (x′, y′) of the corresponding point in the window W can be related by the first-order transformation

x′ = a_0 + a_1 x + a_2 y
y′ = b_0 + b_1 x + b_2 y    (90)

Since we consider the portion of the image represented by the window W to be small, the first-order transformation given by Eq. (90) is adequate to relate the geometric differences between the two images. In the presence of noise the two images can be represented as

S(x, y) = W(x′, y′) + n(x, y)    (91)

In Eq. (91), n(x, y) represents the nonsimilarity between the two images and can be considered as the noise in one of the images. It can be seen that the problem of registration is to evaluate the transformation coefficients in Eq. (90). If we consider normalized cross-correlation as the criterion for registration, then the problem reduces to that of finding the transformation coefficients such that the function F given in Eq. (92) is maximum

F = Σ_x Σ_y S(x, y) W(x′, y′) / [Σ_x Σ_y S^2(x, y) · Σ_x Σ_y W^2(x′, y′)]^{1/2}    (92)

If we consider the absolute difference in the gray values as the criterion for registration, then the problem reduces to that of finding the transformation coefficients such that the function F given in Eq. (93) is minimum

F = Σ_x Σ_y |S(x, y) − W(x′, y′)|    (93)

In Eqs. (92) and (93) the double summation is over the window W and the corresponding subimage in the search area S. The normalized cross-correlation criterion requires a considerable amount of computer time. The sequential similarity detection algorithm is faster: a threshold T is set, and the difference of Eq. (93) is calculated sequentially over the pixels in the window. At each step in the algorithm, a pixel in the window is chosen at random, and the error with respect to the search image is calculated and accumulated. The calculation for the particular window position is stopped as soon as the threshold T is exceeded. Only those search subimages which closely match the window require complete calculation of the function F in Eq. (93). This results in considerable savings of computer time.
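A minimal sketch of the sequential similarity detection idea for the purely translational case (function and variable names are illustrative and not taken from the cited papers):

```python
import random
import numpy as np

def ssda_register(S, W, T, seed=0):
    """Find the offset (u, v) of window W inside search area S by
    accumulating the absolute gray-level difference of Eq. (93) over
    randomly ordered window pixels, abandoning a candidate offset as
    soon as the accumulated error exceeds the threshold T."""
    rnd = random.Random(seed)
    L, M = S.shape[0], W.shape[0]
    coords = [(i, j) for i in range(M) for j in range(M)]
    best, best_err = None, float("inf")
    for u in range(L - M + 1):
        for v in range(L - M + 1):
            rnd.shuffle(coords)
            err, finished = 0.0, True
            for i, j in coords:
                err += abs(float(S[u + i, v + j]) - float(W[i, j]))
                if err > T:
                    finished = False   # candidate offset rejected early
                    break
            if finished and err < best_err:
                best, best_err = (u, v), err
    return best, best_err
```

A small threshold T rejects poorly matching offsets after only a few pixel comparisons, which is the source of the speedup over the full correlation search.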
It can be seen that, with a search area of size L × L and a window of size M × M, we can specify (L − M + 1)^2 different subimages, and each of these subimages must be compared with the window W to find the maximum of the function F in Eq. (92) or the minimum of the function F in Eq. (93). Alternatively, optimization techniques can be used to search for the optima of F. In order to determine the values of the transformation coefficients such that the function F is optimum, it is necessary to carry out a search in six-dimensional space. An algorithm has been developed to carry out the optimized search using the sequential simplex optimization technique (Kulkarni et al., 1981). They have worked out an example of the registration of Landsat images with a translational difference. The optimized search is carried out in two-dimensional space to locate the registration point. The search has been carried out using both cross-correlation and gray-level difference as the criterion for registration. The contours of the cross-correlation and gray-level difference are shown in Figs. 32 and 33, respectively. The output after registration is shown in Fig. 34. It is also possible to use the optimization technique along with the sequential similarity detection algorithm for a fast registration system.
FIG. 32. Contours of cross correlation.

FIG. 33. Contours of gray-level difference.

FIG. 34. Output after registration.
V. CLASSIFICATION TECHNIQUES

In the analysis of remotely sensed data, classification plays an important role. In remote sensing, reflectance values of the pixels in various spectral bands are recorded. The classification process labels these pixels in various ground categories; thus it converts the observed data into meaningful information. Pattern classification can be defined as the assignment of a point in the feature space (e.g., a remotely sensed "pixel" characterized by its reflectances in different spectral bands) to a proper pattern class. The techniques used to solve pattern classification problems can be grouped into two general categories, namely, decision theoretic (or statistical) and syntactic. In the statistical approach, a set of features is extracted from the patterns, and the recognition is achieved by partitioning the feature space. In the syntactic approach, each pattern class is characterized by several subpatterns and a relationship between them. Another way of grouping pattern recognition techniques is into supervised and unsupervised methods. In the case of the supervised method, a certain number of training samples are available for each class, and these are used to "train" the classifier. In the case of the unsupervised method, the training samples are not available. The decision theoretic methods can again be divided into parametric and nonparametric methods. In the parametric methods, each pattern class is characterized by a statistical distribution which in turn depends on a certain number of parameters. The nonparametric methods do not assume any such distribution. Many textbooks and review articles, such as Fukunaga (1972), Duda and Hart (1973), and Fu (1980, 1982, 1983), deal with pattern recognition problems. The pattern recognition system can be represented as shown in Fig. 35. In the analysis of remotely sensed data, the reflectance values of the pixels are often used as the feature vectors.
The pattern classification process essentially partitions the feature space, with the help of decision boundaries. These boundaries can be defined in terms of the discriminant functions, some of which are discussed below.
FIG. 35. Pattern recognition system (feature extraction followed by a classifier).
A. Linear Discriminant Functions

Let X = (x_1, x_2, ..., x_N)^T be the observation vector. Let w_1, w_2, ..., w_m be the m classes. The problem is to assign X to a proper w_i. A linear discriminant function is a linear combination of the x_k's. The decision boundary between the regions w_i and w_j is of the form

D_i(X) − D_j(X) = Σ_{k=1}^{N} c_k x_k + c_{N+1} = 0    (94)

where the c_k's are constants determined from the training samples. A sample X is assigned to a class w_i if

D_i(X) > D_j(X)    for all j, j ≠ i    (95)
B. Minimum Distance Classifier

One of the important types of linear classifiers is the minimum distance classifier. Here the distances between the sample points and the prototype training samples are used for the classification. Suppose that the reference vectors R_1, R_2, ..., R_m are given for the m classes. The minimum distance classifier assigns the input sample X to a class w_i if ‖X − R_i‖ is the minimum, where ‖·‖ represents the distance, defined as

‖X − R_i‖ = [(X − R_i)^T (X − R_i)]^{1/2}    (96)
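A sketch of this rule with NumPy arrays (names are illustrative):

```python
import numpy as np

def min_distance_classify(X, refs):
    """Assign X to the class whose reference vector R_i is nearest in
    the Euclidean sense of Eq. (96); returns the winning class index."""
    dists = [np.sqrt((X - R) @ (X - R)) for R in refs]
    return int(np.argmin(dists))
```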
C. Supervised Classification Techniques

In supervised parametric classification techniques, the training samples are used to obtain the statistical properties of each of the categories. In decision making one can use the reflectance values of the pixels as the feature vectors, or the reflectance values can be mapped from an n-dimensional space to a lower-dimensional space. The maximum likelihood classifier has been widely accepted for the analysis of multispectral data. We assume that each observation (pixel) consists of a set of measurements on n variables. With the assumption of a multivariate normal distribution, the probability that X belongs to a class k is given by

p_k(X_i) = (2π)^{−n/2} |Σ_k|^{−1/2} exp[−(1/2)(X_i − μ_k)^T Σ_k^{−1} (X_i − μ_k)]    (97)

where n is the number of measurement variables used to characterize each observation; X_i is a vector of measurements on n variables associated with the ith observation; p_k(X_i) is the probability density value associated with the observation vector X_i, as evaluated for the class k; Σ_k is the covariance matrix
associated with the kth class; and μ_k is the mean vector associated with the kth class. In the maximum-likelihood decision rule, Eq. (97) allows the calculation of the probability that an observation is a member of the kth class. The individual pixel is then assigned to the class for which the probability is the greatest. In an operational context, the mean and the covariance matrices calculated from observed samples or training sets of finite sample sizes are used in Eq. (97). Equation (97) can be rewritten as

ln p_k(X_i) = −(n/2) ln 2π − (1/2) ln|D_k| − (1/2)(X_i − m_k)^T D_k^{−1} (X_i − m_k)    (98)

where D_k is the covariance matrix associated with class k, taken as an estimator of Σ_k, and m_k is the mean vector associated with class k, taken as the estimator of μ_k. Since the log of the probability function is a monotonically increasing function, decisions can be made by comparing the values for each class obtained from Eq. (98). A simple decision rule can be derived from Eq. (98) as below, by eliminating the constants.

R_1: Choose the k which minimizes

F_k(X_i) = ln|D_k| + (X_i − m_k)^T D_k^{−1} (X_i − m_k)    (99)

If we use the a priori probabilities of the classes, then the decision rule can be modified as (Strahler, 1980)

R_2: Choose the k which minimizes

F_k(X_i) = ln|D_k| + (X_i − m_k)^T D_k^{−1} (X_i − m_k) − 2 ln p(w_k)    (100)

where p(w_k) is the probability that an observation will be a member of w_k, i.e., a prior probability for class w_k.
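The decision rules of Eqs. (99) and (100) can be sketched as follows (an illustrative implementation; the D_k and m_k are precomputed sample covariance matrices and mean vectors):

```python
import numpy as np

def ml_classify(x, means, covs, priors=None):
    """Choose the class k minimizing ln|D_k| + (x - m_k)^T D_k^{-1} (x - m_k),
    optionally minus 2 ln p(w_k) when prior probabilities are supplied."""
    scores = []
    for k, (m, D) in enumerate(zip(means, covs)):
        d = x - m
        F = np.log(np.linalg.det(D)) + d @ np.linalg.solve(D, d)
        if priors is not None:
            F -= 2.0 * np.log(priors[k])
        scores.append(F)
    return int(np.argmin(scores))
```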
D. Tree Classifiers

Single-stage classifiers are used in practice in remote sensing. However, with the development of sensors like the thematic mapper, the data are being acquired with higher spectral, as well as spatial, resolution. Analysis of such data with a single-stage classifier would take a huge amount of computer time. Tree classifiers are considered to be more effective than single-stage classifiers, since both the accuracy and the computational efficiency are improved. Tree classifiers are also known as multilevel classifiers. A typical decision tree scheme is shown in Fig. 36 (Fu, 1982). At the first level, the classes are classified into i groups using only n_1 features. Here i ≤ m and n_1 ≤ n, and the n_1 features selected are the best features to classify these i groups. The same procedure is then repeated at the second level for each of the i groups. The method is continued in this fashion for the third level, fourth level, etc., until each of the original m classes can be separately identified. Following each tree path in the decision tree, each of the m classes can be recognized.

FIG. 36. Decision tree structure classification.

The basic concerns of the tree classifier are the separation of the groups of classes at each nonterminal node and the choice of the subset of features which is most effective in separating these groups of classes. Therefore, there are three major tasks in the design of a tree classifier (Lin and Fu, 1983): (1) to set up a structure of an optimum tree, (2) to choose the most effective feature subset at each nonterminal node, and (3) to choose the decision rule at each nonterminal node.
During the design of a tree classifier, one would always desire to obtain an optimum tree classifier in the sense of achieving the highest possible classification accuracy while using the smallest possible computer time. Many research workers have defined different evaluation functions to direct a search through possible decision tree structures. However, these optimization approaches require a large amount of computer time and memory space, and the optimization is not guaranteed. One way to reduce the total number of possible tree structures is to limit the number of features selected at each stage. A binary tree can be considered as a special case of the tree classifiers. An algorithm for the binary tree has been developed and implemented (Kulkarni, 1983). The approach is motivated by the classification accuracy, as well as the computational efficiency. Here, at each nonterminal node two clusters are formed. In order to obtain the clusters of the classes, minimum distance has been used as the criterion. The two most distant class centers have been chosen as the cluster centers, and the distance, in the feature space, of the class mean from the cluster center has been used as the criterion to assign the class to one of the two clusters. After having assigned all the classes, the cluster centers are
recalculated and the process is iterated until the cluster centers are stabilized. This can be described in the steps below:

(1) Select the mean vectors of the two classes which are maximum distances apart in the feature space as the starting cluster centers.
(2) Assign each class to one of the clusters using the distance in the feature space as the criterion.
(3) Calculate the new cluster centers as the mean of the means of the classes in the cluster.
(4) If U_1 and U_2 are the old cluster centers and Ū_1 and Ū_2 are the new cluster centers, then the error, Err, can be written as

Err = Σ_{i=1}^{2} (U_i − Ū_i)(U_i − Ū_i)^T    (101)

(5) Repeat steps 2 through 4 until the cluster centers are stabilized.

In order to classify a pixel to one of the clusters at each nonterminal node, one can use a subset of the selected features. For the selection of the subset of features, the following procedure can be adopted. At each nonterminal node the scatter matrix S_1 can be defined as
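Steps (1)-(5) above can be sketched as follows, operating directly on the class mean vectors (an illustrative implementation, assuming neither cluster becomes empty):

```python
import numpy as np

def split_classes(class_means, max_iter=20):
    """Split class mean vectors into two clusters: seed with the two most
    distant class means, assign each class to the nearer center, recompute
    the centers, and iterate until the centers stabilize."""
    means = np.asarray(class_means, dtype=float)
    # step (1): the two most distant class means are the starting centers
    d = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = np.stack([means[i], means[j]])
    for _ in range(max_iter):
        # step (2): assign each class to the nearer cluster center
        labels = np.argmin(
            np.linalg.norm(means[:, None, :] - centers[None, :, :], axis=2),
            axis=1)
        # step (3): new centers are the means of the assigned class means
        new = np.stack([means[labels == c].mean(axis=0) for c in (0, 1)])
        # steps (4)-(5): stop once the error of Eq. (101) vanishes
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```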
S_1 = Σ_{i=1}^{2} (U_i − U_0)(U_i − U_0)^T    (102)

where U_i is the mean vector corresponding to cluster i and U_0 is the total pooled mean vector. The pooled covariance matrix can be defined as

S_2 = Σ_{i=1}^{2} K_i    (103)

where K_i is the covariance matrix corresponding to cluster i. The separability at each nonterminal node can be defined as (O'Toole and Stark, 1980)

J = Tr(S_1 S_2^{−1})    (104)

where Tr(·) denotes the trace of a matrix. The separability J can be used as the criterion for selecting the features, as described in the steps shown below.
(1) Select a combination of the features.
(2) Obtain the scatter matrix S_1 and the pooled covariance matrix S_2 corresponding to the two clusters, considering the selected features.
(3) Obtain the separability J.
(4) Repeat the above procedure for the desired combinations of feature vectors and select a minimum number of features with maximum separability as the subset of features to be used.
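For a given feature subset, the separability of Eqs. (102)-(104) can be computed directly (an illustrative sketch for the two-cluster case):

```python
import numpy as np

def separability(cluster_means, cluster_covs):
    """Separability J = Tr(S_1 S_2^{-1}) for two clusters."""
    U = np.asarray(cluster_means, dtype=float)
    U0 = U.mean(axis=0)                                   # pooled mean vector
    S1 = sum(np.outer(u - U0, u - U0) for u in U)         # scatter matrix
    S2 = sum(np.asarray(C, dtype=float) for C in cluster_covs)  # pooled cov.
    return float(np.trace(S1 @ np.linalg.inv(S2)))
```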
(5) Repeat this process at each nonterminal node.

Another measure of separability is the Bhattacharyya distance. In the case of Gaussian-distributed classes, the Bhattacharyya distance can be expressed as (Lin and Fu, 1983)

J = B_m + B_c    (105)

where

B_m = (1/8)(u_2 − u_1)^T [(C_1 + C_2)/2]^{−1} (u_2 − u_1)    (106)

B_c = (1/2) ln[ |(C_1 + C_2)/2| / (|C_1|^{1/2} |C_2|^{1/2}) ]    (107)

B_m is due to the difference of the means of the two clusters, and B_c is due to the covariances corresponding to the two clusters. As an illustration, an example has been worked out using the above procedure. In the example, Landsat data have been analyzed. In the area selected, 16 categories have been observed. The mean vectors corresponding to these 16 categories have been evaluated and are given in Table IV. The tree structure for these classes is obtained using the procedure described above and is shown in Fig. 37. In order to select the subset of features at each nonterminal node, the separability function J is used. Table V shows the separability J at all nonterminal nodes for various subsets. In the given example, the subsets of the features to be used have been selected heuristically. However, the procedure can be automated by defining the optimization function in terms of accuracy and the computer time required. The maximum-likelihood technique can be employed at each nonterminal node for the classification.

TABLE IV
MEAN VECTORS (REFLECTANCE VALUES)

Class no.  Description                  Band 1  Band 2  Band 3  Band 4
 1         Cropland                       39.0    38.8    82.5    79.7
 2         Fallowland                     77.9   103.2   110.0    88.6
 3         Mixed forest                   30.8    36.3    84.7    78.7
 4         Water                          52.2    57.7    37.7    10.7
 5         Wet crop                       39.7    37.8    75.2    64.3
 6         Sal forest                     40.7    38.5    99.0    94.7
 7         Cultural waste                 43.4    48.4    72.4    61.2
 8         Scrubs                         47.7    54.8    87.7    79.7
 9         Garjan forest                  34.4    30.9    88.2    88.0
10         Tank water                     35.1    31.4    29.3    11.7
11         Wet land                       57.7    71.5    70.7    48.0
12         Bamboo forest                  35.6    37.6    90.8    90.4
13         Reservoir water                28.8    22.6    14.0     3.4
14         Jhum area                      44.1    55.5    61.9    47.5
15         Tropical fruit plantation      38.0    35.6    80.0    79.2
16         Evergreen forest                 —       —       —       —

FIG. 37. Binary tree structure.
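Equations (105)-(107) translate directly into code; the sketch below assumes the two cluster means and covariance matrices are given:

```python
import numpy as np

def bhattacharyya(u1, C1, u2, C2):
    """Bhattacharyya distance J = B_m + B_c for two Gaussian clusters."""
    u1, u2 = np.asarray(u1, dtype=float), np.asarray(u2, dtype=float)
    C1, C2 = np.asarray(C1, dtype=float), np.asarray(C2, dtype=float)
    C = (C1 + C2) / 2.0
    d = u2 - u1
    Bm = d @ np.linalg.solve(C, d) / 8.0                     # mean term
    Bc = 0.5 * np.log(np.linalg.det(C) /
                      np.sqrt(np.linalg.det(C1) * np.linalg.det(C2)))  # cov. term
    return float(Bm + Bc)
```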
E. Contextual Classification Techniques

In earlier sections we have discussed classification algorithms wherein each pixel is classified individually and independently. Such classification can exploit only spectral information; there is no provision for using spatial information to help decide what a particular pixel in the image might be. Using spatial information together with the spectral information, the analyst may easily identify roads, delineate boundaries, etc. Recent studies have demonstrated the effectiveness of a contextual classifier that combines spectral and spatial information while employing a general statistical approach. Machine classification algorithms can incorporate spatial information in several ways. These approaches can be categorized as being structural and
TABLE V
SEPARABILITY J FOR DIFFERENT FEATURE VECTOR SUBSETS (NONTERMINAL NODES 1-14)
textural or contextual. In the contextual approach, the probable classification of neighboring pixels influences the classification of each pixel. Classification accuracies can be improved through this approach, because certain ground cover classes naturally tend to occur more frequently in the same context than others. A general contextual classification approach probabilistically relates the classification of a pixel to the true classes of a limited number of surrounding
pixels. Chittineni (1981) uses a general Markov model to describe the class dependencies between neighboring pixels. Chittineni developed his model for one-dimensional multispectral data. Compound decision theory is invoked to develop a classification method which exploits spatial/spectral information. The approach was formulated by Swain et al. (1981) and was further developed by Tilton et al. (1982). In compound decision theory a decision rule d(x_ij) assigns a minimum-risk classification to a pixel (i, j) as shown below. Let x_ij be n observations from location (i, j) having a fixed but unknown classification u_ij. The classification u_ij can be any of m classes from the set Ω = (w_1, w_2, ..., w_m). Define the context of a pixel at the location (i, j) as p − 1 observations spatially near to the observation x_ij, as shown in Fig. 38. Group the p observations in the p-context array into a vector of observations X_ij = (x_1, x_2, ..., x_p)^T, and let u_ij be the vector of true but unknown classifications associated with the observations in X_ij. Let C^p ∈ Ω^p be a vector of possible classifications for the elements of any p-context array. The decision rule, which defines the set of discriminant functions for the classification problem, is (Tilton et al., 1982)
d(x_ij) = the action (classification) a which maximizes

Σ_{C^p ∈ Ω^p, c_p = a} G(C^p) Π_{k=1}^{p} f(x_k | c_k)    (108)

where G(C^p) is a "context function" and is the relative frequency with which C^p occurs in the scene being analyzed, and f(x_k | c_k) is a weighted sum of multivariate normal densities. Methods for estimating the functions f(x_k | c_k) are well known from the noncontextual maximum-likelihood decision rule. Methods for estimating the context function G(C^p) are discussed by Swain et al. (1982). Tilton (1983) has implemented the algorithm using compound decision theory on a massively parallel processor (MPP) for efficient classification of multispectral data using contextual information.
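A brute-force sketch of the decision rule of Eq. (108), assuming for illustration that the pixel being classified occupies the last position of the p-context array (the data layout and names are hypothetical):

```python
def contextual_classify(densities, G):
    """densities[k][c] approximates f(x_k | c) for the k-th observation in
    the p-context array; G maps a tuple C^p of p class labels to its
    relative frequency G(C^p).  Returns the label a maximizing the sum of
    G(C^p) * prod_k f(x_k | c_k) over contexts whose last label is a."""
    p = len(densities)
    m = len(densities[0])
    best, best_score = None, -1.0
    for a in range(m):
        score = 0.0
        for Cp, g in G.items():
            if Cp[-1] != a:          # only contexts that classify the pixel as a
                continue
            prod = g
            for k in range(p):
                prod *= densities[k][Cp[k]]
            score += prod
        if score > best_score:
            best, best_score = a, score
    return best
```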
FIG. 38. Examples of p-context arrays (p = 2, 3, 5).
F. Clustering Techniques

In the processing of remotely sensed data, there are many instances where classification must, and can, be performed without a priori knowledge. Clustering techniques deal with this task. It is a common phenomenon that features belonging to the same class tend to form groups or clusters in the feature space. Suppose we want to classify N samples, each characterized by an n-dimensional vector; i.e., we are given a set of known vectors (X_1, X_2, ..., X_N). Each sample is to be placed into one of the m classes (w_1, w_2, ..., w_m), where m may or may not be known. The class to which the ith sample is assigned is denoted by w_ki. A classification Ω is a vector made up of the w_ki's, and a configuration X* is a vector made up of the X_i's; i.e.,

Ω = [w_k1 w_k2 ... w_kN]^T    (109)

X* = [X_1^T X_2^T ... X_N^T]^T    (110)
The clustering criterion J is a function of Ω and X* and can be written as (Fukunaga, 1972)

J = J(Ω, X*)    (111)

By definition, the best classification Ω_0 satisfies either

J(Ω_0, X*) = min over Ω of J(Ω, X*)    (112)

or

J(Ω_0, X*) = max over Ω of J(Ω, X*)    (113)

depending on the criterion. Many iterative algorithms to obtain the optimum J are available. Distance measures or similarity measures are basically used to define the function J. Some of the common distance measures are as below (Deekshatulu and Kamat, 1983).

(1) Minkowski metric

d(X_i, X_j) = [Σ_{k=1}^{n} |x_ik − x_jk|^p]^{1/p}    (114)

where n is the number of features.

(2) Quadratic metric

d(X_i, X_j) = (X_i − X_j)^T W (X_i − X_j)    (115)

where W is an n × n positive definite matrix.

(3) Normalized correlation

d(X_i, X_j) = X_i^T X_j / [(X_i^T X_i)(X_j^T X_j)]^{1/2}    (116)
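The metrics above can be sketched as small helpers (illustrative names):

```python
import numpy as np

def minkowski(Xi, Xj, p=2):
    """Minkowski metric, Eq. (114)."""
    return float(np.sum(np.abs(Xi - Xj) ** p) ** (1.0 / p))

def quadratic(Xi, Xj, W):
    """Quadratic metric, Eq. (115); W must be positive definite."""
    d = Xi - Xj
    return float(d @ W @ d)

def norm_correlation(Xi, Xj):
    """Normalized correlation, Eq. (116)."""
    return float((Xi @ Xj) / np.sqrt((Xi @ Xi) * (Xj @ Xj)))
```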
(4) Mahalanobis metric

d(X_i, X_j) = (X_i − X_j)^T C^{−1} (X_i − X_j)    (117)

where C denotes a within-group covariance matrix.

VI. SYSTEM DESIGN CONSIDERATIONS
As discussed in the previous sections, the analysis of remotely sensed data is carried out in two stages, namely, the preprocessing stage and the processing stage. Preprocessing involves reading data from high-density digital tapes (HDDTs), systematic corrections for geometric distortions, radiometric corrections for sensor gain and bias variations, and generation of scene latitude and longitude information for standard film or computer-compatible tape (CCT) products. Processing deals with the application of precision geometric correction, enhancements, classification, etc., for generating thematic maps. The inputs to the processing system can be data on CCT from various sensors like TM, MSS, etc. Input information can also be in the form of geometric control points, training sets, etc. The output of the processing system can be computer-compatible tapes, film products, histograms, scatter diagrams, etc. The processing system can be considered as a black box, as shown in Fig. 39. It can be seen that, since there is a variety of inputs and outputs, it is very difficult to design an optimum system to meet all of the input/output and efficiency
FIG. 39. Typical software system I/O requirements (inputs: TM, MSS, and IRS CCTs, GCP information, ground truth; outputs: classified output, geometric correction, histograms, statistical tables, films, tapes).
FIG. 40. Typical chain of modules (CCT input, reformatting, geometric correction, high-pass filter, contrast stretching, filming, film output).
requirements. Hence, the modular approach can be adopted. The typical chain of modules is shown in Fig. 40. The modular approach gives flexibility in the choice of the inputs/outputs and the processing techniques for optimum processing. There can be a number of modules for the variety of processing techniques discussed earlier. The input/output data formats for each of the modules can be designed such that the output of one module can be used as the input to another module. With respect to hardware requirements, both the preprocessing and processing systems can be configured around a minicomputer or around the mainframe of a general-purpose computer. However, many special devices,
FIG. 41. Typical system configuration (VAX 11/780 with console and video terminals, tape drives, high-density tape and decom unit, time-code translator, film recorder, display processor, and disks).
like high-density tape units, display systems, film recorders, digitizers, array processors, etc., are often used in processing remotely sensed data. Hence, a dedicated system around a minicomputer is more suitable. A typical system configuration is shown in Fig. 41.
VII. CONCLUSION

This article mainly describes some of the digital techniques which are often used in processing remotely sensed images. Some of the sensors are also discussed. It is hoped that this article will be useful as a first reading for scientists in various disciplines, specifically, those who are using remotely sensed data for a variety of applications.
REFERENCES

Andrews, H. C. (1970). "Computer Techniques in Image Processing." Academic Press, New York.
Andrews, H. C., and Tescher, A. G. (1972). IEEE Spectrum July, 20-23.
Anuta, P. E. (1970). IEEE Trans. Geosci. Electron. GE-8, 353-368.
Anuta, P. E. (1977). Geophysics 42, 468-481.
Barnea, D. I., and Silverman, H. F. (1972). IEEE Trans. Comput. C-21, 179-186.
Bernstein, R. (1976). IBM J. Res. Dev. 20, 40-57.
Brown, D. W. (1966). J. Nucl. Med. 7, 165.
Chittineni, C. B. (1981). Comput. Graphics Image Process. 16, 305-340.
Chittineni, C. B. (1982). Proc. Int. Symp. Machine Process. Remotely Sensed Data, LARS, Purdue University, pp. 245-254.
Chittineni, C. B. (1983). IEEE Trans. GE-21, 163-174.
Deekshatulu, B. L., and Bajpai, O. P. (1982). Curr. Sci. 51, 1133.
Deekshatulu, B. L., and Kamat, D. S. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 135-144.
Deekshatulu, B. L., and Krishnan, R. (1982). J. IETE 28, 447-456.
Duda, R. O., and Hart, P. E. (1973). "Pattern Classification and Scene Analysis." Wiley, New York.
Fu, K. S. (1976). IEEE Trans. Geosci. Electron. GE-14, 10-18.
Fu, K. S. (1980). "Digital Pattern Recognition." Springer-Verlag, Berlin and New York.
Fu, K. S., ed. (1982). "Application of Pattern Recognition." CRC Press, Boca Raton, Florida.
Fu, K. S. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 153-175.
Fu, K. S., and Yu, T. S. (1980). "Statistical Pattern Classification." Wiley (Research Studies Press), New York.
Fukunaga, K. (1972). "Introduction to Statistical Pattern Recognition." Academic Press, New York.
Goldberg, M. (1981). In "Digital Image Processing" (J. C. Simon and R. M. Haralick, eds.), pp. 383-437. Reidel, Dordrecht.
Gonzalez, R. C., and Wintz, P. A. (1977). "Digital Image Processing." Addison-Wesley, Reading, Massachusetts.
Graham, R. E. (1962). IRE Trans. Inf. Theory IT-8, 129.
Green, A. A., Huntington, J. F., and Roberts, G. P. (1978). Proc. Int. Symp. Remote Sensing Environ., 12th 3, 1755.
Haralick, R. M. (1976). In "Topics in Applied Physics" (A. Rosenfeld, ed.), Vol. 11. Springer-Verlag, Berlin and New York.
Haralick, R. M. (1981). Proc. Pattern Recog. Image Process. Conf., Dallas, pp. 285-291.
Hou, H. S., and Andrews, H. C. (1978). IEEE Trans. ASSP-26, 508-517.
Hueckel, C. B. (1973). J. Assoc. Comput. Mach. 20, 631-647.
Kekre, H. B., Sahasrabudhe, S. C., and Goyal, N. C. (1982). Comput. Electr. Eng. 9, 131-152.
Keys, R. G. (1981). IEEE Trans. ASSP-29, 1153-1160.
Kulkarni, A. D. (1983). Proc. Int. Symp. Remote Sensing Environ., 17th, Ann Arbor, Michigan, pp. 609-615.
Kulkarni, A. D., and Sivaraman, K. (1984). Signal Process. 7, 65-73.
Kulkarni, A. D., Deekshatulu, B. L., and Rao, K. R. (1981). Proc. Int. Symp. MPRSD, LARS, Purdue University, 7th, pp. 181-187.
Kulkarni, A. D., Deekshatulu, B. L., and Rao, K. R. (1982). Proc. Int. Symp. MPRSD, LARS, Purdue University, 8th, pp. 258-262.
Kuwahara, M., Hachimura, K., and Kinoshita, M. (1976). In "Digital Processing of Bio-Medical Images" (K. Preston and M. Onoe, eds.). Plenum, New York.
Lee, J. S. (1980). IEEE Trans. PAMI-2, 165.
Lin, Y. K., and Fu, K. S. (1983). Pattern Recog. 16, 69-80.
Morgenthaler, D. S., and Rosenfeld, A. (1981). IEEE Trans. PAMI-3, 482-486.
Murphy, J. (1984). Technical Memo No. DMD-TM-84-368, Digital Methods Division, Canada Centre for Remote Sensing, Ottawa.
Oppenheim, A. V., Schafer, R. W., and Stockham, T. G. (1968). Proc. IEEE 56, 1264-1291.
O'Toole, R. K., and Stark, M. (1980). Appl. Opt. 19, 2496-2505.
Park, S. K., and Schowengerdt, R. A. (1983). Comput. Vision, Graphics Image Process. 23(3).
Rao, K. R., Kulkarni, A. D., and Chennaiah, G. Ch. (1981). J. Photo Interpret. Remote Sensing 9, 44-48.
Rao, K. R., Kulkarni, A. D., and Chennaiah, G. Ch. (1982). J. Photo Interpret. Remote Sensing, Indian Soc. Photo Interpret. Remote Sensing 10, 1-5.
Reeves, G., ed. (1975). "Manual of Remote Sensing," Vol. I, p. 325. American Society of Photogrammetry.
Rosenfeld, A., ed. (1976). "Topics in Applied Physics," Vol. 11. Springer-Verlag, Berlin and New York.
Rosenfeld, A. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 145-152.
Rosenfeld, A., and Kak, A. (1982). "Digital Image Processing." Academic Press, New York.
Strahler, A. H. (1980). Remote Sensing Environ. 10, 135-163.
Stucki, P., ed. (1979). "Advances in Digital Image Processing." Plenum, New York.
Swain, P. H., and Davis, S. M. (1978). "Remote Sensing: The Quantitative Approach." McGraw-Hill, New York.
Swain, P. H., Vardeman, S. B., and Tilton, J. C. (1981). Pattern Recog. 13, 429-441.
Tilton, J. C. (1983). Proc. Int. Symp. Remote Sensing Environ., 17th, Ann Arbor, Michigan, pp. 1-9.
Tilton, J. C., Vardeman, S. B., and Swain, P. H. (1982). IEEE Trans. Geosci. Remote Sensing GE-20, 445-452.
Wallis, R. H. (1976). Proc. Symp. Curr. Math. Problems Image Sci., Monterey, California.
Wang, D. C. C., Vagnucci, A. H., and Li, C. C. (1983). Comput. Vision, Graphics Image Process. 26, 363-381.
Webb, W. (1983). Landsat-4 Ground Station Interference Description, Revision 7, GSFC-435-D-400, NASA, GSFC, August.
Webber, W. F. (1973). Proc. IEEE Conf. Mach. Process. Remotely Sensed Data, Oct.
Yasuoka, Y., and Haralick, R. M. (1983). Pattern Recog. 16, 113-129.
Index

A

Aberration function, 206-207, 216
Adaptation to parameters of signals and distortions, 6-9
  estimation of noise and distortion parameters, 8-9
  picture description and correction quality criterion, 6-7
  system description, 7-8
"Adaptive" correction of distortion, definition of, 9
Adaptive correction of distortions in imaging and holographic systems, 5-44
  automatic estimation of random-noise parameters, 9-16
  of linear distortions, 27-34
  noise suppression by filters with automatic parameter adjustment, 16-27
  of nonlinear distortions, 34-44
  problem formulation, 6-9
Adaptive differential pulse-code modulation, 167
Adaptive linear prediction, 166
Adaptive mode quantization, 51-54
Adaptive nonlinear transformations of the video signal scale, 47-55
Adaptive sampling and quantization, 162-164
ADPCM, see Adaptive differential pulse-code modulation
ALP, see Adaptive linear prediction
Amplitude windows, 48
Angiogram, using filters, 59
Antidiffusion operator, 334
APA, see Automatic parameter adjustment
Aperture function, 123
Apodization function, 121
Apodization masks, 13
Atmospheric windows, 313
Automatic localization of objects in pictures, 68-92
  allowance for object's uncertainty of definition and spatial nonuniformity, 78-81
  estimation of volume of signal corresponding to a stereoscopic picture, 88-92
  exactly known object, 71-77
  optimal linear coordinate estimator, 69-71
  optimal localization and picture contours, 82-87
Automatic parameter adjustment, 16
Auxiliary variable, statistical properties of, 299-305
Average operator, 153
Axial illumination, see Illumination, axial
B

BADM, see Basic asynchronous delta modulation
Band ratioing, 340
Basic asynchronous delta modulation, 169
BIBO, see Bounded-input bounded-output
Bhattacharya distance, 360
Binary hologram method, 108-109
Binary-media-oriented methods, 120
Biomedicine, 185-192
Bit-slicing, 48
Bounded-input bounded-output, 144
Burckhardt coding method, 114
C
CDF, see Cumulative distribution function
Central spot, 123
Cepstrum, complex, 145
  definition of, 145
Chavel-Hugonin coding method, 114
Chebyshev's inequality, for histograms, 73
Chi-square test, 281, 286-289
Cinematographic effect, 131
Classification techniques, 355-365
  pattern recognition, 355
Clustering techniques, 364-365
Code word length, defined, 161
Color holograms, 136
"Colorization," 68
Colormation C4300, 136
Compact code, 161
Compositional stereo holograms, 129, 131-132
Compound decision theory, 363
Compression ratio, defined, 162-163
  averages, 164-165
"Context function," 363
Contextual classification techniques, 361-363
"Contour," 82-83
Conventional transmission electron microscope, 203
Covariance function, of a picture, 10-11
Covariance matrix, 246
  identical to Cramer-Rao bound, 247
Cramer-Rao bound, 245, 271, 276, 305
Cramer-Rao confidence intervals, 269
CTEM, see Conventional transmission electron microscope
Cumulative distribution function, 328
D

Data compression, 158-173
  applications, 176-199
  methods and techniques, 162-173
    irreversible methods, 162
    reversible methods, 162
  prediction and interpolation, 164-166
    prediction algorithms, 165
  source coding, 159-162
DCT, see Discrete cosine transform
Decision compound theory, see Compound decision theory
Decision rule, 357-358
Defocus parameter, 219, 227, 260, 293
Delta modulation, 167-168
Design considerations, for remote sensing, 365-367
Detection of an object, 281-290
  in electron micrographs of carbon foil and NADH:Q oxidoreductase crystal, 289-290
"Detour phase" method, 108
Differential pulse-code modulation and delta modulation, 166-169, 179-180
Diffraction contrast, 207
Diffusion noise, 39
Digital computer engineering, 2
Digital correction of nonlinear distortions in imaging systems, 36-37
Digital filtering, 169
Digital filters, two-dimensional, 142-152, see also Two-dimensional digital filters
  applications, 176-199
  definition of, 142-144
  design methods of, 145-152
  stability of, 144-145
Digital filters, two-dimensional, and data compression, 141-200
  applications, 176-199
Digital image processing, 141-142
Digital optics, 1-140
Digital processing, 318-319
Digital processing of remotely sensed data, 310-367
  classification techniques, 355-365
  enhancement techniques, 326-343
  geometric correction and registration techniques, 343-355
  preprocessing techniques, 319-326
  system design considerations, 365-367
Direct approach, 215
  for reconstruction of object wave function, 219-224
    model computation, 221-224
    statistical analysis of approximate solution, 224-229
Discrete cosine transform, 170, 180
Discrete sine transform, 170
Distortions, correction of, 5-44
  distortion correction quality, 6
DM, see Delta modulation
DPCM, see Differential pulse-code modulation
DST, see Discrete sine transform
Duplicated symmetrization, 106-107

E
Edge detectors, 155-158
Edge-enhancement and detection techniques, 328-336
Electron microscopy of biological material
  contrast mechanisms, 207-209
  image formation in the CTEM, 205-207
    schematic of, 206
  interaction of beam with specimen, 203-204
    elastic scattering, 203
    inelastic scattering, 203
  the phase problem, 209, 214
  relation between object structure and electron wave function, 204-205
  models for the object
    stochastic process for low-dose image, 209-213
"Enhancement," 45
Enhancement by band ratioing, 340
Enhancement techniques, 326-343
Entropy function, 160
Entropy, of a source, 159-160
  definition of, 159
"Equalization," 43, 48-50
Equidensities, 48
ERTS-1 images, 183, 187
Extremal filtration algorithms, 62
F

"Fast algorithms," 17
Fast cosine transform, 180-181
Fast Fourier transform, 21, 219
Fast Fourier transform, 2D, 164, 171-172
Fast Walsh transforms, 171
FCT, see Fast cosine transform
FDM, see Frequency division multiplex
FFT, see Fast Fourier transform
Filtering techniques, 336-338
"Filter mask," 17
Finite impulse response, 143
FIR, see Finite impulse response
FIR digital filters, design of, 145-150
First-order interpolator, 166
First-order predictor, 165
First-order predictor algorithm, 165
Fisher information matrix, 245, 263-264, 305
Fletcher algorithm, 266
FOI, see First-order interpolator
FOP, see First-order predictor
Fourier holograms, 97-102
  synthesizing of, 133
Fourier transform of low-dose image, statistical properties of, 297-299
Fragmentwise equalization, 49-50
Frequency division multiplex, 176-177
Fresnel holograms, 97-102
FWT, see Fast Walsh transforms

G

GCP, see Ground control point
Geometric corrections, 323-326
  earth curvature and panoramic distortion, 324-325
  mirror nonlinearity, 325-326
  misalignment of TM axis and yaw, roll, pitch of the spacecraft, 326-327
Geometric correction and registration techniques, 343-354
  precision corrections, 343
  systematic corrections, 343
Global quantization, 55
Gray-scale manipulation techniques, 327-328
  or gray-level rescaling techniques, 327-328
Ground control point information, 319, 343-344

H

Haar transform, 170
Hadamard transform, 170
HIDM, see High information delta modulation
High information delta modulation, 169
High-resolution visible imaging instruments, 315
Histogram equalization, 328
Histogram of filter output, 72
Histogram hyperbolization, 48
Hologram coding, 109-112
Hologram synthesis, 92-136
  application to information display, 128-136
  discrete representation of Fourier and Fresnel holograms, 98-102
  mathematical model, 94-98
  reconstruction of, 120-128
  recording synthesized holograms, 102-120
Hologram window function, 123
Homomorphic filtering technique, 337
Homomorphic image processor, 337-338
HRV, see High-resolution visible
Huffman encoding procedure, 162
Hybrid hologram synthesis, 133-135
  rephotographing, 135
  sandwich holograms, 135
Hybrid optodigital holograms, 133-135
Hybrid volume hologram, 135
Hypothesis testing, see Statistical hypothesis testing
I IDD, see Independent identically distributed IIR, see Infinite impulse response
IIR digital filters, design of, 150-152
Illumination, axial, 230-242
  low-dose image recording, 233
  object wave reconstruction, 234-242
Illumination, tilted, 242-254
  orthonormal expansion of the low-dose image, 244-247
  reconstruction of the object wave function, 247-254
Image contrast, 208
Image enhancement techniques, in remote sensing, 326-343
Image handling in electron microscopy, statistical aspects, 202-308
  object wave reconstruction, 213-229
  parameter estimation, 254-277
  statistical hypothesis testing, 277-296
  wave-function reconstruction of weak scatterers, 230-254
Image intensity distribution in analytical form, 264
Image mask, 333
Image processing, statistical significance of, 295
Image wave function, 231
  relation to object wave function, 231
Independent identically distributed random variable, 339
Indian remote sensing satellite, 316
Infinite impulse response, 144
Information display, 128-136
Information Theory Theorem, first, 161
Information visualization, 129
  compositional stereo holograms, 129, 131-132
  "multiplan" holograms, 129-131
  programmable diffusors, 129, 132-134
Interferogram, equation for, 43
Interferogram, "one-dimensional," definition of, 12
Interplanetary stations, 15
Interpolation of hologram samples, 119
Interpolation techniques, 344-350
  cubic convolution method, 347
  cubic spline method, 347
  by hypersurface approximation, 347
  by nearest neighbor, 347-348
"Interpretation objects," 7
IRS, see Indian remote sensing satellite
K

Karhunen-Loeve transform, 170
"Kinoforms," 104
Kirsch mask, 158
KLT, see Karhunen-Loeve transform
KP, see Kronecker-Picard integral
Kronecker-Picard integral, 273
L

Landsat, 315, 342, 347-350
LANDSAT-C images, 182-185
Lee method, 111
Likelihood function, 245, 255, 270, 293
Likelihood ratio, 280, 285-289
Linear discriminant functions, 356
Linear distortions, correction of, 27-34
Linear filtration of noisy signal, 16
Local informational approach, 7
Localization on "blurred pictures," 78-81
  defocused pictures, 80-81
    detection characteristics, 81
  inexactly defined picture, 78-79
    estimator adjusted to averaged object, 78-79
    estimator with selection, 78
  spatially nonhomogeneous criterion, 79-80
    nonreadjustable estimator, 80
    readjustable estimator with fragmentwise optimal filtration, 79-80
Localization of objects in pictures, see Automatic localization of objects in pictures
Localization reliability, 82-87
Local space operators, 152-158
  applications, 176-199
  edge detectors, 155-158
  for image smoothing and enhancement, 153-155
Low-dose imaging, 209, 230, 242
  and stochastic process, 232
Low-resolution imaging of specimens with scattering contrast, 277
"Luna-24," 64
  lunar soil samples detected, 64, 66
M

Mammograms, after MSNR filtering, 58, 65
Markov source, 160
"Mars-4" and "Mars-5," 15, 20, 32
Mars surface, color pictures of, 21
Massive parallel processor, 363
Maximum likelihood estimation, 254-259
  applications of, 259-277
  consistency of, 256
  in electron microscopy, 254-259
Maximum-likelihood ratio, as test statistic, 285
Minimum distance classifier, 356
MINUIT program, 276
Misell's algorithm, 215
Moiré noise, 14
Markov model, 363
MPP, see Massive parallel processor
MRMS filter, 57, 64
MSNR filter, 57
MSS, see Multispectral scanner
"Multiplan" holograms, 129-131
Multiple exchange ascent algorithm, 148, 149
Multiplication method, 41-42
  for holograms, 41-42
Multispectral scanner, 314, 319, 329, 342
N

Narrow-band noise, 19-21
Newton-Kantorovich approach, 214-215
Noise covariance function, 10-12
Noise-signal separation, 8-9
Noise suppression, 16-27
  optimal linear APA filtration, 16-21
  pulse noise filtration, 21-27
    algorithm with detection by voting, 26-27
    iterative prediction algorithm, 24-25
    recursive prediction algorithm, 25-26
Nonlinear distortions, correction of, 34-44
  in imaging systems, 36-38
  in holograms and interferograms under unknown distortion function, 42-44
  in holographic systems, 38-42
O

OADM, see Operational asynchronous delta modulation
Object amplitude function, 239
Object phase function, 239, 241
Object wave function, 213
  and recorded intensity distribution, 215-218
Object wave reconstruction, 213-229
Operational asynchronous delta modulation, 169
Optical data processing, 128
Optimal filter mask, formula for, 57
Optimality criterion, 72
Optimal linear coordinate estimator, 69-71
Optimal linear filtration, 55-61
Optimal localization and picture contours, 82-87
  reference objects, selection of, 84-87
  whitening and contours, 82-83
Orthogonal coding, 123-125, 127
P

Parameter estimation, in low-dose electron microscopy, 254-277
  examples, 259-276
Parameter extraction, 162
Parseval relation, 72
Payload calibration data, 323, 326
PCD, see Payload calibration data
PCM, see Pulse-code modulation
Peak error, 162-163
Periodic noise, 14-15
Phase contrast, 209
Phase-media-oriented methods, 120
Phase problem, 209, 214
Picture distortion correction, 6, 7
Picture preparation, 45-68
  definition of, 45
Picture preparation in automated systems, 46-47
Picture preparation, in digital optics, 45-68
  by adaptive nonlinear transformations of video signal scale, 47-55
  combined methods of, 63-68
  linear preparation methods, 55-61
  problems of, 46-47
  rank algorithms of, 61-63
"Ping-pong propagation," 131
Pixel, 282
Poisson process, 211, 266, 270, 279, 293
Position detection of marker atoms, 290-294
  simulation experiment, presence of 3 uranium atoms, 293-294
"Power intensification of the picture," 50-51
Power spectrum, 162
Prediction method, of anomaly detection, 9
Prewitt mask, 158
Principal components, enhancement by, 341
"Problem of characteristic points," 84
Programmable diffusors, 129, 132-134
Prolate spheroidal functions, 247
Pseudocolor composites, 341
Pulse-code modulation, 115, 177
Pulse noise, 15-16
Pulse noise filtration algorithms, comparison of, 28
"Pure" modes, definition of, 51
Q

Quadrant recursive causal filter, 143
Quadruplicated symmetrization, 106-107
Quantization noise, 15-16
Quantization of orthogonal components, 40-41
R

Radiometric corrections, 319-323
  input radiance vs. output digital voltage, 322
  internal calibration systems, 321-322
  prelaunch gain and bias values, 320-321
  statistical methods, 322-323
Random-noise parameters, automatic estimation of, 9-16
  additive signal-independent fluctuation noise in pictures, 10-12
  additive wide-band noise parameters in one-dimensional interferograms, 12-14
  periodic and other narrow-band noises, 14-15
  pulse, quantization, and "striped" noise, 15-16
Rank algorithms of picture preparation, 61-63
RBV, see Return beam vidicon
Redundancy reduction, 158
Registration techniques, 350-354
"Regular" diffusors, 41
"Rejection" filter, 19
Remote sensing, 182-185
  laser radar, 182
  microwave radiometers, 182
  multispectral scanners, 182
  optical cameras, 182
  side-looking radar, 182
Remote sensing, applications of, 317-318
  agriculture and forestry, 317
  hydrology and water resources, 317
  land-use inventory and mapping, 317
  meteorology, 317-318
  military, 318
  mineral resources, 317
Remote sensors, 313-316
  photography, 313
  satellite remote sensing, 313-316
Rephotographing, 135
Return beam vidicon, 314
R-L algorithms, 62-63
rms error, 163
Robinson mask, 158
Robotics, 192-199
Robust-against-distribution-"tails," 62
RSS filter, 57
S

Sandwich holograms, 135
"Scalar filter," 17
Scattering contrast, 207
Scherzer focus conditions, 260-261
Schwartz inequality, 73
SDFT, see Shifted discrete Fourier transform
SEASAT-SAR images, 184-185, 188-191
Shadowing, correction of, 34
Shannon information, 258
Shannon theorem, first, 161, 162
Shifted discrete Fourier transform, 99-100, 106
Shift variant enhancement techniques, 343
"Shot noise," 202, 210, 225
Signal redundancy, 106
Signal-to-quantization-noise ratio, 167
"Sliding," 48
Slope-overload distortion, 168
Smoothing techniques, spatial, 338-340
SNR, see Signal-to-quantization-noise ratio
Source code efficiency, defined, 161
Source coding, 159-163
  comma codes, 160
  instantaneous codes, 160
  non-singular codes, 160
Source code redundancy, defined, 161
Spatial smoothing techniques, 338-340
Speckle contrast, 39-41
Spectral signatures, 310-313
  green vegetation, 310-311
  soil, 311-312
  water, 311-313
SPOT satellite, 315
SSR criterion, 29
Stability of two-dimensional digital filter, 144-145
  definition of, 144
Statistical decision theory, 3, 277
Statistical hypothesis testing, 277-296
  in electron microscopy, 277-296
Stereo effect, 88-89
Stereo image generation, 342
Stereo-pair decomposition, 342-343
Stochastic driver functions, 222
Stochastic function, 217
Stochastic Poisson process, 255, 264
Stochastic process for low-dose image, 209-213, 233
"Striped" noise, 5, 15-16
Structure informational approach, 7
Student's T test, 281, 286-289
Supervised classification techniques, 356-357
Symmetrization, 121-123
Synthesized holograms, application to information display, 128-136
  design of holographic displays, 129
  information visualization, 129
  optical data processing, 128
Synthesized holograms, reconstruction of, 120-128
  orthogonal coding, 123-125
  symmetrization, 121-123
  two-phase recording in phase medium, 125-128

T

TDM, see Time division multiplex
Telemedicine, 187
Test statistics, 280-289
  chi-square test, 281, 286-289
  likelihood ratio, 280, 285-289
  Student's T test, 281, 286-289
Thematic mapper, 315-316, 319, 325-327
Thresholding, 38-40, 162, 174, 180
Tilted illumination, see Illumination, tilted
Time division multiplex, 177
TM, see Thematic mapper
Transformations, 169-173
  Fourier, 169-170
  Haar, 170
  Hadamard-Walsh, 170
  Karhunen-Loeve, 170
Tree classifiers, 357-361
Two-dimensional digital filters, see also Digital filters, two-dimensional
Two-dimensional digital filters and data compression, joint use of, 173-176
  processing system for digital comparison and correlation of images, 174-176
  typical connections of the two digital operations, 173-174
Two-dimensional low-pass digital filter, 148
Two-phase coding, 116-117
Two-phase recording in phase medium, 125-128
U

Unsymmetrical half-plane filters, 144
V

Variable-word-length coding, 171, 179-180
Variance, 246
Vision properties for picture preparation, use of, 63-68
Volume holograms, 135
Volume of signal corresponding to a stereoscopic picture, 88-92
Voting method, of anomaly detection, 9

W

Wave-aberration function, 231
Wave-function reconstruction of weak scatterers, 230-254
  axial illumination, 230-242
  tilted illumination, 242-254
Walsh spectrum, 20
Walsh transform, 170
Weakly scattering bell-shaped object structure, theoretical electron microscopy of, 269-273
Weak phase object, theoretical electron microscopy of, 259-264
Weak scatterers, wave-function reconstruction of, 230-254
"Whitening," 82-84
Whittaker-Shannon sampling, 301
Wiener filter, continuous frequency response of, 30
Wiener filtration theory, 16
Window, 351
Window function, hologram, 123
Window method, 146
  Kaiser window, 147
  Lanczos-extension window, 147
  Weber-type approximation window, 147
X

X-ray, with linear filtration, 60
Z

Zero-locating algorithm, 265
Zero-memory source, 159-160
  binary source, 159
  nth extension, 160
Zero-order interpolator, 166
Zero-order predictor, 165
Zero-order predictor algorithm, 165, 171, 172, 183, 186
ZOI, see Zero-order interpolator
ZOP, see Zero-order predictor