Three-Dimensional Electron Microscopy of Macromolecular Assemblies
Joachim Frank
Wadsworth Center for Laboratories and Research, State of New York Department of Health, The Governor Nelson A. Rockefeller Empire State Plaza, Albany, New York, and Department of Biomedical Sciences, State University of New York at Albany, Albany, New York
Academic Press
San Diego New York Boston London Sydney Tokyo Toronto
Front cover photograph: Three-dimensional reconstruction of the Escherichia coli ribosome from 4300 projections of single particles embedded in ice and imaged with the Zeiss EM 912 energy-filtering microscope of the Max Planck Institute for Medical Research in Heidelberg, Germany (Frank et al., 1995a). The resolution of the reconstruction is 25 Å, as determined with the differential phase residual. The small and large subunits are depicted in yellow and blue. Interpretative elements have been added suggesting the mechanism of protein synthesis based on two new features seen in the reconstruction: a channel running through the neck of the small subunit and a bifurcating tunnel running through the large subunit. Orange: path of the messenger RNA going through the channel, then making a U-turn to exit toward the back. Red and green: aminoacyl- and peptidyl-site (A- and P-site) transfer RNAs placed into the positions most likely assumed during translation. The nascent polypeptide chain goes through the large subunit and exits at two possible sites. Gold: exit through the membrane; olive green: exit into the cytoplasm.
This book is printed on acid-free paper. Copyright © 1996 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495 United Kingdom Edition published by Academic Press Limited 24-28 Oval Road, London NW1 7DX
Library of Congress Cataloging-in-Publication Data
Frank, J. (Joachim), date.
Three-dimensional electron microscopy of macromolecular assemblies / by Joachim Frank.
p. cm.
Includes bibliographical references and index.
ISBN 0-12-265040-9 (alk. paper)
1. Three-dimensional imaging in biology. 2. Electron microscopy. I. Title.
QH324.9.T45F73 1996
578'.45-dc20     95-30893
CIP
PRINTED IN THE UNITED STATES OF AMERICA
96 97 98 99 00 01 EB 9 8 7 6 5 4 3 2 1
to Carol, Mariel, and Hosea
Contents

Preface

Chapter 1  Introduction
  I. The Electron Microscope and Biology
  II. Single-Particle versus Crystallographic Analysis
  III. Crystallography without Crystals
  IV. Toward a Unified Approach to Structure Research
  V. The Electron Microscope and the Computer

Chapter 2  Electron Microscopy of Macromolecular Assemblies
  I. Specimen Preparation Methods
    A. Introduction
    B. Negative Staining: Principle
    C. Negative Staining: Single Layer versus Carbon Sandwich Technique
    D. Glucose Embedment Techniques
    E. Use of Tannic Acid
    F. Cryo-electron Microscopy of Ice-Embedded Specimens
    G. Labeling with Gold Clusters
  II. Principle of Image Formation in the Electron Microscope
    A. Introduction
    B. The Weak Phase Object Approximation
    C. Contrast Transfer Theory
    D. Amplitude Contrast
    E. Optical and Computational Diffraction Analysis
    F. Determination of the Contrast Transfer Function
    G. Instrumental Correction of the Contrast Transfer Function
    H. Computational Correction of the Contrast Transfer Function
  III. Special Imaging Techniques
    A. Low-Dose Electron Microscopy
    B. Spot Scanning
    C. Energy Filtering

Chapter 3  Two-Dimensional Averaging Techniques
  I. Introduction
    A. The Sources of Noise
    B. Principle of Averaging: Historical Notes
    C. The Role of Two-Dimensional Averaging in the Three-Dimensional Analysis of Single Molecules
    D. A Discourse on Terminology: Views versus Projections
    E. Origins of Orientational Preference
  II. Digitization and Selection of Particles
    A. The Sampling Theorem
    B. Interactive Particle Selection
    C. Automated Particle Selection
  III. Alignment Methods
    A. The Aims of Alignment
    B. Homogeneous versus Heterogeneous Image Sets
    C. Translational and Rotational Cross-Correlation
    D. Reference-Based Alignment Techniques
    E. Reference-Free Techniques
  IV. Averaging and Global Variance Analysis
    A. The Statistics of Averaging
    B. The Variance Map and Analysis of Significance
    C. Signal-to-Noise Ratio
  V. Resolution
    A. The Concept of Resolution
    B. Resolution Criteria
    C. Resolution-Limiting Factors
  VI. Validation of the Average Image: Rank Sum Analysis
  VII. Outlier Rejection: "Odd Men Out" Strategy

Chapter 4  Multivariate Statistical Analysis and Classification of Images
  I. Introduction
    A. Heterogeneity of Image Sets
    B. Direct Application of Multivariate Statistical Analysis to an Image Set
    C. The Principle of Making Patterns Emerge from Data
    D. Eigenvector Methods of Ordination: Principal Component Analysis versus Correspondence Analysis
  II. Theory of Correspondence Analysis
    A. Analysis of Image Vectors in R^J
    B. Analysis of Pixel Vectors in R^N
    C. Factorial Coordinates and Factor Maps
    D. Reconstitution
    E. Computational Methods
    F. Significance Test
  III. Correspondence Analysis in Practice
    A. Image Sets Used for Demonstration
    B. Eigenvalue Histogram and Factor Map
    C. Explanatory Tools I: Local Averages
    D. Explanatory Tools II: Eigenimages and Reconstitution
    E. Preparation of Masks
    F. Demonstration of Reconstitution for a Molecule Set
  IV. Classification
    A. Background
    B. Classification of the Different Approaches to Classification
    C. Partitional Methods: K-Means Technique
    D. Hard versus Fuzzy Classification
    E. Hierarchical Ascendant Classification
    F. Hybrid Techniques
    G. Intrinsically Parallel Methods
    H. Inventories and Analysis of Trends
    I. Nonlinear Mapping
    J. Supervised Classification: Use of Templates
    K. Inference, through Classification, from Two to Three Dimensions

Chapter 5  Three-Dimensional Reconstruction
  I. Introduction
  II. General Mathematical Principles
    A. The Projection Theorem, Radon's Theorem, and Resolution
    B. Projection Geometries
  III. Rationales of Data Collection: Reconstruction Schemes
    A. Introduction
    B. Cylindrically Averaged Reconstruction
    C. Compatibility of Projections
    D. Relating Projections to One Another Using Common Lines
    E. The Random-Conical Data Collection Method
    F. Reconstruction Schemes Based on Uniform Angular Coverage
  IV. Overview of Existing Reconstruction Techniques
    A. Preliminaries
    B. Weighted Back-Projection
    C. Fourier Methods
    D. Iterative Algebraic Reconstruction Methods
  V. The Random-Conical Reconstruction Scheme in Practice
    A. Overview
    B. Optical Diffraction Screening
    C. Interactive Tilted/Untilted Particle Selection
    D. Density Scaling
    E. Processing of Untilted-Particle Images
    F. Processing of Tilted-Particle Images
    G. Reconstruction
    H. Resolution Assessment
  VI. Merging of Reconstructions
    A. The Rationale of Merging
    B. Preparation-Induced Deformations
    C. Three-Dimensional Orientation Search
    D. Reconstruction from the Full Projection Set
  VII. Three-Dimensional Restoration
    A. Introduction
    B. Theory of Projection onto Convex Sets
    C. Projection onto Convex Sets in Practice
  VIII. Angular Refinement Techniques
    A. Introduction
    B. Three-Dimensional Projection Matching Method
    C. Three-Dimensional Radon Transform Method
    D. The Size of Angular Deviations
  IX. Transfer Function Correction

Chapter 6  Interpretation of Three-Dimensional Images of Macromolecules
  I. Preliminaries: Significance, Experimental Validity, and Meaning
  II. Assessment of Statistical Significance
    A. Introduction
    B. Three-Dimensional Variance Estimation from Projections
    C. Significance of Features in a Three-Dimensional Map
    D. Significance of Features in a Difference Map
  III. Validation and Consistency
    A. A Structure and Its Component Reconstructed Separately: 80S Mammalian Ribosome and the 40S Ribosomal Subunit
    B. Three-Dimensional Structural Features Inferred from Variational Pattern: Half-Molecules of Limulus polyphemus Hemocyanin
    C. Concluding Remarks
  IV. Visualization and Segmentation
    A. Segmentation
    B. Visualization and Rendering Tools
    C. Definition of Boundaries
  V. Juxtaposition with Existing Knowledge
    A. The Organization of Knowledge
    B. Fitting of Electron Microscopy with X-Ray Results
    C. Use of Envelopes of Three-Dimensional Electron Microscopy Data
    D. Public Sharing of Low-Resolution Volume Data: The Three-Dimensional Density Database

Chapter 7  Example for an Application: Calcium Release Channel
  I. Introduction
  II. Image Processing and Three-Dimensional Reconstruction of the Calcium Release Channel

Appendix 1  Software Implementations
  I. Introduction
  II. Basic Design Features
    A. Modular Design
    B. Hierarchical Calling Structure
    C. Bookkeeping Capabilities and Data Storage Organization
    D. User Interfaces
  III. Existing Packages
  IV. Interfacing to Other Software
  V. Documentation

Appendix 2  Macromolecular Assemblies Reconstructed from Images of Single Macromolecules

Bibliography
Index
Preface
Whether it is true that nature does not make jumps (natura non facit saltus, as the Romans said), science certainly does in its historical progression. My attempt to present the approaches to three-dimensional electron microscopy of single macromolecules in a systematic way should not blind the eyes of the reader to the truth that the development of these techniques was in fact totally erratic and full of dead ends and fortuitous encounters. I owe my early involvement with correlation functions and the discovery that different electron micrographs of a carbon film can be aligned to within a few angstroms to the enthusiasm and "fidgetiness" of Anton Feltynowski, a Polish émigré professor who had just joined the group of my mentor Walter Hoppe when I started my graduate work. Unable to sit still during his first experiments, he produced a number of strongly blurred electron micrographs, which in the diffractometer revealed beautiful patterns that could be traced to drift (Frank, 1969). It was the type of drift caused by an abrupt change in the position of the specimen during exposure ("Sprungdrift"). More important than the identification of the physical cause was the realization that these Young's fringes patterns directly visualized the cross-power spectrum, i.e., the transform of the cross-correlation function. From here it was only a small step to the exploration of cross-correlation as an alignment tool, and, in hindsight, only a few small steps to the realization of single-particle averaging. But for the work on my thesis, the sudden emergence of the new agendas of "drift," "resolution," and "correlation" catalyzed by Dr. Feltynowski's presence rescued me from a thoroughly pedantic experimental project: to
measure the spherical aberration coefficient of the Siemens IA lens to the second decimal so that zone plates could be designed more accurately. Another jump occurred ten years later. I had just given a lecture on image processing during a workshop at New York University when I was approached by a friendly man from the audience who wished to show me a folder with his micrographs. Prepared to give a display of polite interest, I was instead drawn in by his Eastern European accent--he was from Czechoslovakia--and the unusual quality of the pictures. The name of this man was Miloslav Boublik, and what he showed me were 40S subunits of the mammalian ribosome. Until then the correlation methods of alignment had been developed with images of glutamine synthetase, an ill-behaved specimen in many respects despite the good will of Martin Kessel, first sabbatical SPIDER user, who had imported it to Albany. A new method is as good as its results, and the rather blurry average images (correspondence analysis had yet to be introduced!) failed to stir up a great amount of excitement. That all changed, almost overnight, when Milas Boublik's 40S subunit images revealed details of unexpected resolution after being chased through a computer. (At this point I must mention the diligent work by Adriana Verschoor who had just joined me and did most of the chasing.) Thus started a ten-year collaboration with the Roche Institute that ultimately led to the first three-dimensional reconstructions of ribosomal particles. Many more jumps occurred, and many more credits will be due when a history of the "crystallography without crystals" is eventually written. But what I wished to do here is simply give tribute to two Eastern European emigres, both now deceased, for their unique roles in this development. 
Even within the resolution range that has now been achieved (1/40 to 1/25 Å⁻¹), reconstructions of macromolecular assemblies allow very detailed models to be designed, which serve as a framework for functional interpretations. It is the goal of all groups working on improvements in methodology to increase the resolution substantially, to bring it at least into the range (1/10 Å⁻¹) where secondary structure begins to be recognized. It is my firm belief that this can indeed be achieved by concerted efforts in instrument design, specimen preparation, cryomethodology, and further development of image processing techniques. As to the contribution made by image processing, the success of the angular refinement techniques in the past two years can be taken as an indication of as yet unexplored potential for improved resolution. In writing this book I have drawn from many sources, and have benefited from many interactions with other scientists. I am very grateful for the cooperation of numerous colleagues in giving me permission to reproduce their figures, for supplying original artwork, or for making
preprints of articles in press available. Special thanks are due to colleagues at the Wadsworth Center and members of my group, Ramani Lata, Carmen Mannella, Bruce McEwen, Pawel Penczek, Michael Radermacher, Adriana Verschoor, Terry Wagenknecht, and Jun Zhu, for advice and assistance with a host of references and illustrative material. Yanhong Li assisted me in preparing the color illustration of the ribosome appearing on the cover. I would like to thank Ken Downing and Bing Jap for allowing me to use an unpublished figure, and Noreen Francis, who made a special effort to brighten up her basal body reconstruction for reproduction. Nicolas Boisset contributed the classification tree of the grimacing faces. Rasmus Schröder helped me with computer graphics representations and with a section on energy filtration, and Ken Holmes kindly let me use the facilities of his department for some of the artwork during my Humboldt-funded stay at the Max Planck Institute for Medical Research in Heidelberg. I acknowledge helpful literature hints from Fritz Zemlin and Christopher Dinges. My very special thanks go to Jose-Maria Carazo, Michael Radermacher, and Pawel Penczek for a critical reading of the manuscript and many helpful comments and suggestions at different stages of its preparation. Finally, I acknowledge support by grants from the National Institutes of Health, over a period of thirteen years, without which many of the methodological developments described in this book would not have been possible. Although the book attempts to review all methodological developments in the field, emphasis is placed on methods of random-conical reconstruction developed in Albany. For expedience, many of the examples of both 2D and 3D processing come from the archives of the Albany Image Processing group. I hope that the book will fulfill the purposes for which it is intended: as a general introduction to the subject matter and a useful laboratory resource. Joachim Frank
There is, however, a sense in which viruses and chromatin ... are still relatively simple systems. Much more complex systems, ribosomes, the mitotic apparatus, lie before us and future generations will recognise that their study is a formidable task, in some respects only just begun.
AARON KLUG, NOBEL LECTURE 1983
I. The Electron Microscope and Biology

The Middle Ages saw the different objects of the world ordered in hierarchies; every living creature, every thing of the animate and inanimate world was ranked within its own orbit according to its intrinsic power, beauty, and utility to man. Thus the king of all animals was the lion, the king of all musical instruments the organ, etc. I have no doubt that in those days the electron microscope would have earned itself the name of Queen of all Microscopes, with its awe-inspiring physical size, its unsurpassed resolving power, and its mysterious lenses that work without glass. Biology today would be unthinkable without the invention of the electron microscope. The detailed architecture of the cell and its numerous organelles, the structure of viruses, and the fine structure of muscle are all beyond the reach of light microscopy; they came into view for the first time, in spectacular pictures, as this new instrument was being perfected. One of the first discoveries made possible by electron microscopy (EM) was the origin of the iridescent colors of butterfly wings: they were traced to diffraction of light on ordered arrays of entirely "colorless" scales (Kinder and Süffert, 1943). By far the largest number of applications of electron microscopy to biology are concerned with interpretations of the image on a qualitative-descriptive level. In those cases where quantitative measurements are obtained, they normally relate to distances, sizes, or numbers of particles, etc. Precise measurements of optical densities are rarely relevant
in those uses of the instrument. In contrast, there are other studies in which measurements of the images in their entirety are required. In this approach, attempts are being made to form an accurate three-dimensional (3D) representation of the biological object. Here the term "representation" stands for the making of a 3D map, or image, of the object, which ideally reveals not only its shape but also its interior density variations. At this point it is worth contemplating the role of three-dimensional electron microscopy in the project of visualizing biological complexity. Biological structure is built up in a hierarchical way, following in ascending order the levels of macromolecule, macromolecular assembly, cell organelle, cell, tissue, and finally, the whole organism. This hierarchy encompasses an enormous span in dimensions. At one end of this span, atomic resolution is achieved, rendering a complete description of a macromolecule. At the other end, the scale is macroscopic, on the order of meters for large organisms. Within that vast range, electron microscopy bridges a gap of several orders of magnitude that is left between X-ray crystallography and light microscopy (Fig. 1.1). Thus, when we take into account the existing techniques of 3D light microscopy and those of macroscopic radiological imaging (e.g., CAT, PET, and MRI), the project of visualizing and modeling an organism in its entirety has become, at least in principle, possible. Of course, limitations of data collection, computational speed, and memory will for a foreseeable time prevent the realization of this project for all but the most primitive organisms such as Escherichia coli. Although true three-dimensionally imaging electron microscopes have actually been conceived, based on an objective lens design that allows unusually large apertures to be used (Hoppe, 1972; Typke et al., 1976), such instruments have never been realized.
Three-dimensional information is normally obtained by interpreting micrographs as projections: by combining such projections, taken over a sufficient angular range, the object is eventually recovered in the computer. In many experiments of practical importance, at not too high resolutions, the interpretation of micrographs as projections of the object is in fact a very good approximation (see Hawkes, 1992).

Fig. 1.1. The range of biological structures covered by three-dimensional electron microscopy. Drawing of bacteriophage is from Alberts et al. (1989), Molecular Biology of the Cell, 2nd Ed., Fig. 4.1, p. 8. Reproduced with the permission of Garland Publishing.

At the outset, we distinguish the true 3D imaging being considered here from stereoscopic imaging and from serial section reconstruction. The latter two techniques are unsuitable for macromolecular imaging for several reasons. In stereoscopic imaging, a 3D image is not actually formed until it is synthesized from the two views in the observer's brain. This technique poses obvious difficulties in obtaining an objective description of the structure, and its use is moreover restricted to structures that are isolated, well-delineated, and sufficiently large. In serial section reconstruction, on the other hand, a 3D shape representation is formed by stacking visually selected (or, in some cases, computationally extracted) contours. Here the thickness of sections, usually above 500 Å, precludes application to even the largest macromolecules. [Besides, the material loss and deformation due to mechanical shearing limit the serial section technique of reconstruction even when applied to much larger subcellular structures.] Three-dimensional imaging with the electron microscope follows two different methodologies, which are essentially divided according to the size range of the object and, closely related to the physical dimensions, according to its degree of "structural uniqueness" (Frank, 1989a, 1992a). On the one hand, we have cell components, in the size range of 100-1000 nm, which possess a unique structure. An example of such an object is the mitochondrion: one can state with confidence that no two mitochondria are the same in the strict sense of being congruent in three dimensions (Fig. 1.2a).
[Here the term "congruent" is used to mean that two structures could be brought into precise overlap by "rigid body movement," i.e., by a movement of the structure, in three dimensions, which leaves the relative distances and angles among components intact.] In fact, structures like the mitochondria even lack similarity of a less rigorous kind that would require two structures to be related by a precisely defined geometric transformation, such as an affine transformation. On the other hand, we have macromolecular assemblies, in the size range of 5-50 nm,¹ which exist in many structurally identical "copies" that are 3D-congruent. This implies that such macromolecules will present identical views in the electron microscope when placed on the support in the same orientation (Fig. 1.2b). The degree of "structural uniqueness" obviously reflects the way function is realized in a structure. For instance, the function of the mitochondrion as the power plant of the cell relies on the properties of its specific membrane transport proteins and its compartmentation produced by the folding of the inner and outer membrane (Tyler, 1992; Mannella et al., 1994) but, and this is the important point, not on the maintenance of a precise shape of the entire organelle. In sharp contrast to this, the function of the much smaller ribosome in the synthesis of proteins is tied to the maintenance of a precise conformation (within certain limits that have yet to be explored) that facilitates and constrains the steric interactions of the ribosome's numerous ligands in the process of protein synthesis. The varying degrees to which function dictates, or fails to dictate, the maintenance of a precise structure lead to two fundamentally different approaches to 3D imaging: for objects that have identical structure by functional necessity, powerful averaging techniques can be used to eliminate noise and reduce radiation damage. On the other hand, objects that may vary from one realization to the next without impact on their function can only be visualized as "individuals," by obtaining one entire projection series per realization. For the latter kind of objects, it is obviously more difficult to draw generalizations from a single 3D image. This volume will deal exclusively with the former problem, while the latter problem has been discussed at some length in an edited volume on electron tomography (Frank, 1992b). Methods for the retrieval of structure from nonperiodic objects were first developed in the laboratory of Walter Hoppe at the Max Planck Institute in Martinsried (Hoppe, 1974; Hoppe et al., 1974).

¹The nm (nanometer) unit is used here to allow comparison of the macromolecular with the subcellular scale. However, henceforth, throughout the rest of the book, the Å (angstrom) unit (= 0.1 nm) will be used since it is naturally linked to the usage in X-ray crystallography, the field of prime importance for EM as a source of comparative and interpretative data.
However, the approach taken by Hoppe's group failed to take advantage of averaging and was based on experiments that allowed 1000 e⁻/Å² or more to accumulate. This approach was criticized because of the high accumulated dose and the absence of a criterion that allowed radiation damage and preparation-related artifacts to be assessed (Baumeister and Hahn, 1975; Frank, 1975; Crowther, 1976). Averaging over several reconstructions (Knauer et al., 1983; Ottl et al., 1983; Oefverstedt et al., 1994) still does not allow the dose to be reduced to acceptable levels (below 10 e⁻/Å²). Only the development of computer-controlled electron microscopes (Dierksen et al., 1992, 1993; Koster et al., 1992) makes it possible to reduce the dose per projection to extremely low levels and opens the way for a true tomography of macromolecules, where a statistically significant merged reconstruction can in principle be obtained by averaging in three dimensions.

Fig. 1.2. Structural uniqueness of a biological system is a function of size. (a) Several rat liver mitochondria (diameter, 1.2 μm) embedded in Epon and cut into 0.5-μm sections, imaged with the Albany high-voltage (1.2 MV) electron microscope at an electron optical magnification of 12,500×. The "structure" of each mitochondrion is unique, although general building principles are maintained. Averaging methods cannot be used in the retrieval of the structure (unpublished data, kindly made available by Dr. Carmen Mannella). (b) Field of 40S small subunits of the mammalian ribosome negatively stained on a carbon film. Particles occur in two side views, marked "L" (left-facing) and "R" (right-facing). Each image can be understood as a superposition of an underlying unchanged structure and noise. With such a "repeating" structure, averaging methods can be used with success. Reprinted from Frank, J., Verschoor, A., and Boublik, M. Science 214, 1353-1355. Copyright 1981 American Association for the Advancement of Science.
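The recovery of an object from its projections, invoked above, rests on the projection theorem treated later in the book. As a minimal numerical check (a Python/NumPy sketch added here for illustration; the toy object is invented), one can verify that the Fourier transform of a projection coincides with a central section of the object's transform:

```python
import numpy as np

# Toy two-dimensional "object"; in three dimensions the same relation holds,
# with projection images giving central sections (planes) of the 3D transform.
rng = np.random.default_rng(0)
density = rng.random((64, 64))

# A projection: the density summed along one direction (here the y-axis).
projection = density.sum(axis=0)

# Projection theorem: the 1D Fourier transform of the projection equals the
# central section (the ky = 0 line) of the object's 2D Fourier transform.
central_section = np.fft.fft2(density)[0, :]
assert np.allclose(np.fft.fft(projection), central_section)
```

Collecting projections over many directions thus samples the object's transform on a bundle of central sections, from which the object can be recovered by interpolation and inversion, or equivalently by (weighted) back-projection.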
II. Single-Particle versus Crystallographic Analysis

Electron crystallography, either in its pure form or in the variants discussed in this volume, deals with attempts to form precise three-dimensional representations of the objects imaged. The crystallographic approach to structure analysis with the electron microscope follows a distinguished tradition established by Sir Lawrence Bragg, Director of the Cavendish Laboratory, which spawned the Molecular Biology Laboratory of the Medical Research Council in Cambridge, England (see, for instance, Watson, 1968). In the 1960s, Klug and Berger used optical diffraction as a way of evaluating images of periodic structures taken with the electron microscope (Klug and Berger, 1964). Shortly afterward, Klug and DeRosier (1966) developed optical filtration as a way of obtaining noise-free versions of a crystal image. From that starting position, the jump to computer applications of Fourier analysis in electron microscopy was obvious and straightforward, as recounted in Klug's Nobel lecture (Klug, 1983). The use of crystallographic methods to retrieve structural information from electron micrographs (e.g., Erickson and Klug, 1970; Hoppe et al., 1969; Unwin and Henderson, 1975; Henderson and Unwin, 1975) has become known as electron crystallography (Glaeser, 1985). To be suitable for this approach, the object must form ordered structures whose dimensions are in the range of distances traversed by electrons accelerated to 100 kV (in conventional transmission electron microscopes) or 200-400 kV (in intermediate-voltage electron microscopes). More specifically, the thickness range should be such that, for the electron energy used, the chances for multiple scattering are negligible. These structures can be thin crystals (Unwin and Henderson, 1975; Amos et al., 1982), no more than a few unit cells thick; helical assemblies (DeRosier and Moore, 1970; Stewart, 1988b); or spherical viruses having high symmetry (Crowther et al., 1970). By being able to deal with ordered structures that fail to form "bulk" crystals amenable to X-ray crystallography, electron crystallography claims an important place in structure research. Among the thin crystals are membrane proteins, most of which have withstood attempts at 3D crystallization (light-harvesting complex, bacteriorhodopsin, and bacterial porin PhoE; see references). Among the helical structures are the T4 phage tail (DeRosier and Klug, 1968) and the fibers formed by the aggregation of the acetylcholine receptor (Toyoshima and Unwin, 1988b). There are also numerous examples of spherical viruses whose structure has been explored by electron microscopy and three-dimensional reconstruction (Crowther and Amos, 1971; Stewart et al., 1993; Cheng et al., 1994; see the volume edited by Chiu et al., 1996). For molecules that exist as, or can be brought into the form of, highly ordered crystalline sheets, close to atomic resolution can be achieved (light-harvesting complex, Kühlbrandt et al., 1994; bacteriorhodopsin, Henderson et al., 1990; bacterial porin, Jap, 1991; see also the survey in Cyrklaff and Kühlbrandt, 1994). From the outset, it has been obvious that in nature, ordered or highly symmetric objects are the exception, not the rule. Indeed, order or symmetric arrangement does not have a functional role for most biological macromolecules. Obvious exceptions are bacterial membranes, where structural proteins form a tight ordered network, or systems that involve cooperativity of an entire array of ordered molecules that are in contact with one another. If molecules fail to form ordered arrays in the living system, they can under certain conditions be induced to do so.
However, even now that sophisticated methods for inducing crystallinity are available (see Kornberg and Darst, 1991, and the comprehensive review of techniques by Jap et al., 1992), the likelihood of success is still unpredictable for a new protein. As a result, the fraction of macromolecular assemblies whose structures have been successfully tackled with methods of electron crystallography is still quite small. One must also realize that the electron crystallographic approach has a shortcoming: the crystalline packing constrains the molecule to a certain extent so that it may assume only a small range of its possible physiologically relevant conformations. Thus, the electron crystallographic approach has a number of limitations, the most serious of which is the limited availability of crystals.
III. Crystallography without Crystals The fundamental principle of crystallography lies in the use of redundancy to achieve a virtually noise-free average. Redundancy of structural information is available in a crystal because the latter is built from a basic
structural unit, the unit cell, by translational repetition. [Helical and icosahedral structures are generated by combinations of translations and rotations.] The unit cell may be composed of one or several copies of the molecule, which are in a strict geometrical arrangement, following the rules of symmetry. Electron crystallography is distinguished from X-ray crystallography by the fact that it uses images, rather than diffraction patterns, as primary data. Translated into the Fourier domain, the availability of images means that the "phase problem" known in X-ray crystallography does not exist: the electron microscope is, in Hoppe's (1982, 1983) words, a "phase-measuring diffractometer."2 Another important difference between electron and X-ray crystallography is due to the fact that electrons interact with matter more strongly; hence even a single-layered ("two-dimensional") crystal no larger than a few micrometers provides sufficient contrast to produce, upon averaging, interpretable images, while X-ray diffraction requires "three-dimensional" crystals of macroscopic dimensions to produce statistically acceptable measurements. The catch phrase "crystallography without crystals"3 used in the title of this section alludes to the fact that there is little difference, in principle, between the image of a crystal (seen as a composite of a set of translationally repeating images of the structure that forms the unit cell) and the image of a field containing such structures in isolated form, as so-called single particles.4 In both situations, it is possible to precisely superimpose ("align") the individual images for the purpose of forming an average, although the practical way of achieving this is much more complicated in the case of single particles than in the case of the crystal. (Indeed, two entire chapters of this book, Chapters 3 and 4, are devoted to the problem of how to superimpose, classify, and average such single particle images.)
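The align-and-average principle just described can be illustrated with a minimal sketch in Python/NumPy. The toy data and all names here are hypothetical, not any published procedure: noisy, translationally shifted copies of a motif are registered against a reference by cross-correlation and then averaged, which suppresses the noise.

```python
import numpy as np

def align_and_average(images, reference):
    """Translationally align noisy single-particle images to a common
    reference by cross-correlation, then average to suppress the noise."""
    ref_ft = np.fft.fft2(reference)
    aligned = []
    for img in images:
        # Cross-correlation via the Fourier convolution theorem;
        # the peak position gives the shift between image and reference.
        cc = np.fft.ifft2(ref_ft * np.conj(np.fft.fft2(img))).real
        dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
        aligned.append(np.roll(img, (dy, dx), axis=(0, 1)))
    return np.mean(aligned, axis=0)

# Toy data: shifted, noisy copies of a square "molecule".
rng = np.random.default_rng(0)
motif = np.zeros((64, 64))
motif[24:40, 24:40] = 1.0
copies = [np.roll(motif, (rng.integers(-5, 6), rng.integers(-5, 6)), axis=(0, 1))
          + 0.5 * rng.standard_normal((64, 64)) for _ in range(50)]
avg = align_and_average(copies, motif)  # far less noisy than any single copy
```

In practice, of course, no noise-free reference is available and the particles also differ in orientation; reference-free alignment and classification are the subject of Chapters 3 and 4.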
Another complication arises from the fact that macromolecules in single particle form are more susceptible to variations in their microenvironment, in terms of stain thickness (in case stain is being used), hydrophilic properties of the support grid, etc. Because of the limited accuracy of alignment and the structural variability, it seems doubtful that atomic resolution will ever be achieved (but cf. the recent assessment by Henderson, 1995). However, attainment of information on close-to-secondary structure, at a resolution near 10 Å, perhaps even down to 7 Å, is a definite possibility, promising the chance to visualize alpha-helices (see, for instance, Henderson and Unwin, 1975). 2However, the attainment of atomic or near-atomic resolution may entail the use of electron diffraction data whose resolution lies beyond the resolution of the images. 3Title of a lecture given by the author at the Second W. M. Keck Symposium on Computational Biology in Houston, 1992. 4The term "single particles" stands for "isolated, unordered particles with--in principle--identical structure."
In summary, the two types of electron crystallography have one common goal: to suppress the noise and to extract the three-dimensional structure from two-dimensional images. Both are distinguished from X-ray crystallography by the fact that true images are being recorded, or--in the language of crystallography--phases are being measured. Both types of electron crystallography also employ the same principle: they make use of the fact that many identical copies of a structure are available and can be used to form an average. The difference between the two methods is threefold: (i) "true" crystallography, of crystals, does not require alignment; (ii) the molecules forming the unit cell of a crystal exhibit considerably less variation in structure; and (iii) only crystallography of crystals is able to benefit from the availability of electron diffraction data that supply highly accurate amplitudes, often to a resolution that exceeds the resolution of the image. For these reasons, the main contribution of single-particle methods may be confined to the exploration of "molecular morphology" (a term used by Glaeser (1985)). The value of this contribution cannot be overestimated: besides yielding information on quaternary structure, the low-resolution map of a large macromolecular assembly also provides powerful constraints useful for phasing of X-ray data.
IV. Toward a Unified Approach to Structure Research Notwithstanding the contributions of some pioneering groups with expertise in both fields, electron microscopy and X-ray crystallography have largely developed separately. The much younger field of electron microscopic image analysis has benefitted from adopting the tools and working methods of statistical optics (e.g., O'Neill, 1963), the theory of linear systems (e.g., Goodman, 1968), and multivariate statistics (Lebart et al., 1977). In recent years, a convergence has taken place, brought about mainly by the development of cryotechniques in electron microscopy. For the first time, it has become possible to relate the 3D density distribution of a molecule reconstructed by electron microscopy quantitatively to its low-resolution X-ray map. Quantitative comparisons of this kind (e.g., Jeng et al., 1989; Rayment et al., 1993; Stewart et al., 1993; Boisset et al., 1994b, 1995) will become more common as instruments with high coherence and stability become more widely available. Even with negative staining, it is often startling to see the agreement between 2D or 3D maps computed from X-ray data and electron microscopy in minute features at the boundary of the molecule; see, for instance, Lamy (1987), de Haas et al. (1993), and Stoops et al. (1991). However, with cryo data, the agreement even extends to the interior of the molecule. The ability to compare, match, and merge results from both
experimental sources offers entirely new possibilities that are beginning to be realized in collaborative research, as in the work cited above. Cryo-electron crystallography of both forms of specimens (crystals and single particles), when combined with X-ray crystallography, presents a way of resolving very large structures to atomic resolution. Often components of a macromolecular assembly can be induced to form crystals suitable for X-ray crystallography, but the same may not be true for the entire assembly because of its size or inherent flexibility. On the other hand, electron crystallography of such a structure using cryo methods may make it possible to obtain its 3D image at low resolution. By matching the high-resolution X-ray model of the component to the 3D cryo-EM image of the entire macromolecule, high positional accuracies beyond the EM resolution can be obtained, and atomic resolution within the large structure can be reached or at least approximated (Cheng et al., 1994; Stewart et al., 1993; Stewart and Burnett, 1993; Wang et al., 1992; Boisset et al., 1994b, 1995). The most recent developments of the spray-mix method in cryo-electron microscopy (Berriman and Unwin, 1994; Walker et al., 1994) and the use of caged compounds (e.g., Ménétret et al., 1991) strengthen the contribution of electron microscopy to structural biology, as these new techniques are capable of providing dynamic information on conformational changes or functionally relevant binding reactions.
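The matching of a component model into a lower-resolution map rests on a simple computational idea that can be sketched as follows. This is an illustrative translational search only, on hypothetical toy volumes; an actual fit would also scan rotations and work with properly filtered densities.

```python
import numpy as np

def best_translation(em_map, component):
    """Score all translations of `component` (3D array) against `em_map`
    (same shape) by cross-correlation and return the best shift.
    Rotational search, interpolation, and filtering are omitted here."""
    cc = np.fft.ifftn(np.fft.fftn(em_map) * np.conj(np.fft.fftn(component))).real
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Map peak indices to signed shifts in [-N/2, N/2).
    return tuple(p - n if p >= n // 2 else p for p, n in zip(peak, em_map.shape))

# Toy check: a cubic "domain" displaced inside the map is recovered exactly.
vol = np.zeros((16, 16, 16))
vol[4:8, 4:8, 4:8] = 1.0
displaced = np.roll(vol, (3, -2, 1), axis=(0, 1, 2))
shift = best_translation(displaced, vol)   # → (3, -2, 1)
```

The exhaustive translational scan costs only a few FFTs regardless of the number of positions tested, which is why correlation-based docking remains tractable even for large maps.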
V. The Electron Microscope and the Computer It has become fashionable to speak of galloping technology and the way it affects our lives. The development of the computer has had the most profound impact on the way science is done. This revolution has particularly affected all areas of research that involve the interpretation of images: it has meant the change from a qualitative to a quantitative description of the objects represented, be it a star, an ameba, or a ribosome. Up to now, image processing has relied on a tedious procedure by which the image is first recorded on a photographic film or plate and then is scanned sequentially by a specialized apparatus--the microdensitometer. For each image element, the transmission is recorded and converted to optical density, which is then converted to a digital value that can be processed by the computer. In 1968, when the first images were digitized in the laboratory of my mentor, Walter Hoppe, at the Max-Planck-Institut für Eiweiss- und Lederforschung,5 they were stored on paper tape. The scanning took all night. The paper equivalent of an area scanned into a 5This institute later became part of the Max-Planck Institute for Biochemistry in Martinsried.
512 × 512 array was several kilometers long, a length that invoked an indelible visceral feeling about the amount of information residing in a high-resolution electron micrograph. Twenty-five years later, the same job can be done within 1 s with a video scanner. Moreover, a new type of electron microscope (Downing et al., 1992; Brink et al., 1992; Brink and Chiu, 1994) that allows the image to be read directly into the computer, entirely bypassing what Elmar Zeitler (1992) has ironically called "the analog recorder"--the photographic plate--has begun to emerge. With an instantaneous readout of the image into the computer, feedback control of the instrument has now become available, along with the more outlandish possibility of running a microscope standing in California from another continent (Fan et al., 1993). Clearly, we have not yet seen the end of this development. As this volume is being written, instruments with direct image readout are still largely in an experimental phase. Virtually all of the reconstructions listed in the Appendix section of this book (Appendix 2) have been obtained from data that were digitized from photographic films or plates in the "old-fashioned" way. However, as slow-scan CCD cameras and microcircuitry fall in price, a totally integrated commercial instrument with the capability of automated three-dimensional imaging (Koster et al., 1989, 1990, 1992; Dierksen et al., 1992, 1993) is no longer a fantasy and may be only a few years away.
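The film-scanning pipeline described above (transmission measured point by point, converted to optical density, then to a digital value) amounts to a few lines of arithmetic. The numbers below (a 0-2 OD range, 8-bit quantization) are illustrative choices, not the specifications of any particular microdensitometer.

```python
import numpy as np

def transmission_to_od(transmission, t_min=1e-4):
    """Convert measured transmission T (fraction of light passed, 0..1)
    to optical density, OD = -log10(T); clip to avoid log of zero."""
    t = np.clip(np.asarray(transmission, dtype=float), t_min, 1.0)
    return -np.log10(t)

def quantize_od(od, od_max=2.0, levels=256):
    """Map optical density linearly onto integer gray levels,
    mimicking the A/D conversion step of a scanner."""
    scaled = np.clip(np.asarray(od) / od_max, 0.0, 1.0)
    return np.round(scaled * (levels - 1)).astype(int)

t = np.array([1.0, 0.5, 0.1, 0.01])  # example transmission readings
od = transmission_to_od(t)           # → [0.0, ~0.30, 1.0, 2.0]
pixels = quantize_od(od)             # → [0, 38, 128, 255]
```

Applied to every element of a 512 × 512 scan, this conversion yields the quarter-million density values that once filled kilometers of paper tape.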
I. Specimen Preparation Methods A. Introduction Ideally, we wish to form a three-dimensional (3D) image of a macromolecule in its entirety, at the highest possible resolution. Specimen preparation methods perform the seemingly impossible task of stabilizing the initially hydrated molecule so that it can be placed and observed in the vacuum. In addition, the contrast produced by the molecule itself is normally insufficient for direct observation in the electron microscope, and various contrasting methods have been developed. Negative staining (Sections I, B and I, C) with heavy metal salts such as uranyl acetate produces high contrast and protects, at least to some extent, the molecule from collapsing. However, the high contrast comes at a heavy price: instead of the molecule, with its interior density variations, only a cast of the exterior surface of the molecule is imaged, and only its shape can be reconstructed. To some extent, stain may penetrate into crevices, but this does not alter the fact that only the boundary of the molecule is visualized. As sophisticated averaging methods were developed, it became possible to "make sense" of the faint images produced by the molecule itself, but biologically relevant information could be obtained only after methods were found to "sustain" the molecule in a medium that closely approximates the aqueous environment: these methods are embedment in glucose (Section I, D), tannic acid (Section I, E), and vitreous ice ("frozen-hydrated"; Section I, F). For completeness, this section will conclude with a brief assessment
of gold-labeling techniques which have become important in 3D electron microscopy (Section I, G).
B. Negative Staining: Principle The negative staining method, which goes back to Brenner and Horne (1959), has been widely used to obtain images of macromolecules with high contrast. Typically, an aqueous suspension is mixed with 1 to 2% uranyl acetate and applied to a carbon-coated copper grid. The excess liquid is blotted away and the suspension is allowed to dry. Although, to some extent, the stain goes into aqueous channels, the structural information in the image is basically limited to the shape of the molecule as it appears in projection. As tilt studies and comparisons with X-ray diffraction data on wet suspensions reveal, the molecule shape is distorted due to air drying (see Crowther, 1976; Kellenberger and Kistler, 1979; Kellenberger et al., 1982). Nevertheless, negative staining has been used with great success in numerous computer reconstructions of viruses and other large regular macromolecular assemblies. The reason such reconstructions are legitimate is that they utilize symmetries which allow many views to be generated from the view least affected by the distortion. Hence, as Crowther (1976) argues correctly, "reconstructions of single isolated particles with no symmetry from a limited series of tilts (Hoppe et al., 1974) is therefore somewhat problematical." As this book shows, the problems in the approach of Hoppe et al. to the reconstruction of negatively stained molecules can be overcome by the use of a radically different method of data collection. Even in the age of cryo-electron microscopy, negative staining is still used in high-resolution electron microscopy (EM) of macromolecules as an important first step in identifying characteristic views and assessing whether a molecule is suitable for this type of analysis (see also the reevaluation done by Bremer et al., 1992).
Efforts to obtain a 3D reconstruction of the frozen-hydrated molecule almost always involve a negatively stained specimen as the first step, sometimes with a delay of several years [e.g., the 50S ribosomal subunit, Radermacher et al. (1987a) followed by Radermacher et al. (1992b); Androctonus australis hemocyanin, Boisset et al. (1990b) followed by Boisset et al. (1992a); nuclear pore complex, Hinshaw et al. (1992) followed by Akey and Radermacher (1993); calcium release channel, Wagenknecht et al. (1989a) followed by Radermacher et al. (1994b)]. Because of the remaining experimental difficulties in collecting tilted-specimen images in cryo-EM with high yield, it may often take months until a data set suitable for 3D reconstruction is assembled.
Chapter 2. Electron Microscopy of Macromolecular Assemblies
C. Negative Staining: Single Layer versus Carbon Sandwich Technique Biological particles beyond a certain size are often incompletely stained; parts of the particle farthest away from the carbon film "stick out" of the stain layer ("one-sided staining"; Figs. 2.1 and 2.2a). Consequently, there
Fig. 2.1. Schematic representation of particle shape as it is influenced by specimen preparation. (a) Original shape of particle. (b) Particle prepared by negative staining on a single carbon layer, showing the following features: (i) the particle is flattened and surrounded by a meniscus of stain; (ii) staining is one-sided--the top of the particle may be invisible in projection; (iii) the carbon film yields and shows an indentation at the place where the particle sits ("wrapping effect" as observed by Kellenberger et al., 1982). (c) Particle prepared by negative staining and the carbon sandwiching method. The particle is flattened, often to a stronger degree than in (b). The stain meniscus is less pronounced than in (b), but uniform stain deposition on both top and bottom sides is guaranteed. The same wrapping effect as in (b) takes place, but now the two carbon films may have a more symmetric role. Reprinted from Electron Microsc. Rev. (now Micron) 2, Frank, J.; Image analysis of single macromolecules, 53-74. Copyright 1989, with kind permission from Elsevier Science Ltd, The Boulevard, Langford Lane, Kidlington OX5 1GB, UK.
are parts of the structure that do not contribute to the projection image; we speak of partial projections. A telltale way by which to recognize partial projections is the observation that "flip" and "flop" views of a particle (i.e., views obtained by flipping the particle on the grid) are not mirror-related. A striking example of this effect is offered by the 40S subunit of the eukaryotic ribosome with its characteristic R and L side views (Fig. 2.3; Frank et al., 1981a, 1982): the two views are quite different in appearance--note, for example, the massive "beak" of the R view, which is fused with the "head," compared to the narrow, well-defined "beak" presented in the L view. Since the thickness of the stain may vary from one particle to the next, "one-sidedness" of staining is frequently accompanied by a high variability in the appearance of the particle. Very detailed--albeit qualitative--observations on these effects go back to the beginning of image analysis (Moody, 1967). After the introduction of multivariate statistical analysis (see Chapter 4), the stain variations could be studied systematically, as this technique of data analysis allows particle images to be ordered according to stain level or other overall effects. From such studies (van Heel and Frank, 1981; Frank et al., 1982; Bijlholt et al., 1982; Verschoor et al., 1985; Boekema, 1991) we know that the appearance of a particle varies with stain level in a manner reminiscent of the appearance of a rock partially immersed in water of varying depth (Fig. 2.4). In both cases, a contour marks the level where the isolated mass rises above the liquid. Depending on the depth of immersion, the contour may contract or expand.
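The rock-in-water analogy can be made concrete with a small simulation. The hemispherical particle below is hypothetical and not modeled on any real data set: stain is assumed to fill all space around the particle up to a given z level, and the image is the projection of the stain along z.

```python
import numpy as np

def partial_projection(particle, stain_level):
    """Project a negatively stained particle along z.  `particle` is a 3D
    boolean array (True = protein, sitting on the support at z = 0); stain
    occupies every non-particle voxel below `stain_level`.  The projected
    stain density is what the micrograph shows, so only the immersed part
    of the particle leaves a stain-excluding footprint."""
    stain = ~particle
    stain[:, :, stain_level:] = False   # nothing above the stain surface
    return stain.sum(axis=2)

# Hemispherical "rock" of radius 18 voxels on the support film.
n = 48
x, y, z = np.meshgrid(np.arange(n) - n // 2, np.arange(n) - n // 2,
                      np.arange(n), indexing="ij")
particle = x**2 + y**2 + z**2 < 18**2

shallow = partial_projection(particle, 6)    # particle top sticks out
deep = partial_projection(particle, 30)      # particle fully immersed
```

With shallow stain, the projection contains stain-free pixels wherever the particle rises above the liquid; with deep stain, every column contains some stain and the entire particle contributes to the image, mirroring the stain-level dependence observed in the studies cited above.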
Note, however, that despite the striking similarity between the two situations, there exists an important difference: for the negatively stained particle, seen in projection, only the immersed part is visible in the image, whereas for the rock, seen as a surface from above, only the part sticking out of the water is visible. Computer calculations can be used to verify the partial-staining model. Detailed matches with observed molecule projections have been achieved by modeling the molecule as a solid stain-excluding "box" surrounded by stain up to a certain z level. As an example, simulated images of a hemocyanin molecule, calculated by Boisset et al. (1990a), will be shown later on in Fig. 3.4 of Chapter 3. With the 40S ribosomal subunit, Verschoor et al. (1989) were able to simulate the appearance (and pattern of variability) of experimental single-layer projections by partial projections of a 3D reconstruction that was obtained from a double-layer preparation. Because partial projections do not generally allow the object to be fully reconstructed, the single-carbon layer technique is obviously unsuitable for any quantitative studies of particles with a diameter above 150 Å or so. The double-carbon layer method of staining (Tischendorf et al.,
Fig. 2.2. Appearance of negatively stained particles in single versus double carbon layer preparations, as exemplified by micrographs of the calcium release channel of skeletal fast twitch muscle. (a) Single-layer preparation, characterized by crisp appearance of the molecule border and white appearance of parts of the molecule that "stick out." From Saito et al. (1988). Reproduced from The Journal of Cell Biology, 1988, 107, 211-219 by copyright permission of The Rockefeller University Press. (b) Double-layer preparation, characterized by a broader region of stain surrounding the molecule and more uniform staining of the interior. From Radermacher et al. (1992a), Biophys. J. 61, 936-940. Reproduced with permission of the Biophysical Society.
1974; Stöffler and Stöffler-Meilicke, 1983; see also Frank et al., 1986, where the relative merits of this technique as compared to those of the single-layer technique are discussed) solves this problem by providing staining of the particle from both sides without information loss. This method has been used for immunoelectron microscopy (Tischendorf et al., 1974; Lamy, 1987; Boisset et al., 1988) precisely for that reason: antibodies attached to the surface facing away from the primary carbon layer will then be visualized with the same contrast as those close to the carbon film. Under proper conditions, the double-carbon layer method gives the most consistent results and provides the most detailed information on the
Fig. 2.2. (continued)
surface features of the particle, whereas the single-layer method yields better defined particle outlines but highly variable stain thickness. Outlines of the particles are better defined in the micrograph because a stain meniscus forms around the particle border (see Moody, 1967), producing a sharp increase of scattering absorption there. In contrast to this behavior, the stain is confined to a wedge at the particle border in sandwich preparations, producing a broader, much more uniform band of stain. Intuitively, the electron microscopist is bound to prefer the appearance of particles that are sharply delineated, but the computer analysis shows that images of sandwiched particles are in fact richer in interior
Fig. 2.3. One-sidedness of staining in a single-carbon layer preparation produces strong deviations from mirror relationship between flip/flop related projections. The average of 40S ribosomal subunit images showing the L view (top left) is distinctly different from the average of images in the R view (top right). This effect can be simulated by computing incomplete projections, i.e., projections through a partially capped volume (upper part removed), from a reconstruction that was obtained from particles prepared by the double-layer technique (bottom panels). From Verschoor et al. (1989). Reproduced with permission of Academic Press Ltd. features. Thus the suggestion that sandwiching would lead to an "unfortunate loss of resolution" (Harris and Horne, 1991) is based only on an assessment of the visual appearance of particle borders, not on quantitative analysis. For use with the random-conical reconstruction technique (see Chapter 5, Section V), the sandwiching methods by Tischendorf et al. (1974) and Boublik et al. (1977) have been found to give the most reliable results, as they yield large proportions of the grid being double-layered. There is, however, evidence that the sandwiching technique is responsible for some of the flattening of the particle (i.e., additional to the usual flattening found in single-layer preparations that is due to the drying of the stain; see Kellenberger and Kistler, 1979). As a side effect of flattening, and due to the variability in its degree, large size variations may also be found in projection (Fig. 2.5; see Boisset et al., 1993b). The degree of flattening can be assessed by comparing z dimensions (i.e., in the direction normal to the support grid) of particles reconstructed
from different views showing the molecule in orientations related by a 90° rotation. Using this comparison, Boisset et al. (1990b) have found the long dimension of the Androctonus australis hemocyanin to be reduced to 60% of its size as measured in the x-y plane. The z dimension of the calcium release channel is reduced to as little as 30-40% when the reconstruction from the negatively stained preparation (Wagenknecht et al., 1989a) is compared with the reconstruction from cryoimages (Radermacher et al., 1994b). Apparently, in that case, the unusual extent of the collapse is due to the fragility of a dome-shaped structure on the transmembranous side of the molecule. Evidently, the size of the flattening effect depends strongly on the type of specimen. It has been conjectured, for instance (Knauer et al., 1983), that molecular assemblies composed of RNA and protein, such as ribosomes, are more resistant to mechanical forces than those made entirely of protein. Indications for particularly strong resistance were the apparent maintenance of the shape of the 30S subunit in the aforementioned reconstructions of Knauer et al. and the result of shadowing experiments by Kellenberger et al. (1982), which indicated that the carbon film yields to, and "wraps around," the ribosomal particle. On the other hand, the 30S subunit portion within the 70S ribosome reconstruction, easily identifiable because of the characteristic shape of its 50S subunit counterpart, was found to be strongly flattened in a sandwich preparation (Wagenknecht et al., 1989b). The behavior of biological structures subjected to mechanical forces might be easiest to understand by considering their specific architecture, which includes the presence or absence of aqueous channels and cavities. Indeed, empty shells of the turnip yellow mosaic virus were shown to be totally collapsed in the sandwiched preparation while maintaining their spherical shape in the single layer (Kellenberger et al., 1982).
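Computationally, a flattening estimate of this kind reduces to comparing extents of the reconstructed density along different directions. The sketch below condenses the idea to measuring one toy volume along z and along an in-plane axis; the ellipsoid, the threshold, and all numbers are illustrative, not the published measurements.

```python
import numpy as np

def extent(volume, axis, threshold=0.5):
    """Length (in voxels) of the region along `axis` where the density
    exceeds `threshold`, i.e., the particle's extent in that direction."""
    other_axes = tuple(a for a in range(3) if a != axis)
    mask = (volume > threshold).any(axis=other_axes)
    idx = np.flatnonzero(mask)
    return 0 if idx.size == 0 else idx[-1] - idx[0] + 1

# Toy ellipsoidal "particle" flattened along z (semiaxes 20, 20, 10).
n = 64
x, y, z = np.meshgrid(*([np.arange(n) - n // 2] * 3), indexing="ij")
vol = ((x / 20.0)**2 + (y / 20.0)**2 + (z / 10.0)**2 < 1.0).astype(float)

flattening = extent(vol, axis=2) / extent(vol, axis=0)  # ≈ 0.5
```

As the following paragraph cautions, such a ratio taken from a single-tilt reconstruction is biased by missing-wedge elongation along z, so the true flattening is underestimated.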
The flattening of the 70S ribosome mentioned above was observed when the particle was oriented such that the interface cavity between the two subunits could be closed by compression. Also, the calcium release channel, as visualized in the 3D reconstruction from a frozen-hydrated preparation (Radermacher et al., 1994a, b), is a particularly fragile structure that includes many cavities and channels. As a general caveat, any comparisons of particle dimensions in z direction from reconstructions have to be made with some caution. As will become clear later on, reconstructions from a single projection set are to some extent elongated in the z direction, as a result of the missing angular data, so that the amount of flattening deduced from a measurement of the z dimension actually leads to an underestimation. By applying restoration (see Chapter 5, Section VII) to the published 50S ribosomal subunit
Fig. 2.5. Size variation of Androctonus australis hemocyanin-Fab complex prepared with the double-carbon layer, negative staining method. (A) Average of small, well-defined molecules that are encountered at places where the surrounding stain is deep; (B) average of large, apparently squashed molecules seen at places where the stain is shallow. The molecules were classified by correspondence analysis. Scale bar, 100 Å. From Boisset et al. (1993b). Reproduced with permission of Academic Press.
reconstructed from stained specimens, Radermacher et al. (1992b) found that its true z dimension is in fact considerably smaller than the apparent dimension. A comparison of the structures shown in this abstract suggests a factor of approximately 0.7.
D. Glucose Embedment Techniques The technique of preparing unstained specimens using glucose was introduced by Unwin and Henderson (1975). In this preparation technique, the solution containing the specimen in suspension is applied to a carbon-coated grid and washed with a 1% (w/v) solution of glucose. The rationale for the development of this technique was, in the words of the authors, "to replace the aqueous medium by another liquid which has similar chemical and physical properties, but is non-volatile in addition." Other hydrophilic molecules tried were sucrose, ribose, and inositol. X-ray evidence indicated that this substitution leaves the structure undisturbed to a resolution equivalent to a distance of 3 to 4 Å. Although highly successful for the study
Fig. 2.4. (a) Images of the 40S ribosomal subunit from HeLa cells, negatively stained in a single-carbon layer preparation. The images are sorted (by correspondence analysis) according to increasing levels of stain. (b) Optical density profiles through the center of the particle. From Frank et al. (1982). Reproduced with permission of Academic Press Ltd.
of bacteriorhodopsin, whose structure could ultimately be solved to a resolution of 1/3.5 Å⁻¹ (Henderson et al., 1990), glucose embedding has not found widespread application. The reason for this is that the scattering densities of glucose and protein are closely matched, resulting in extremely low contrast as long as the resolution falls short of the 1/7 to 1/10 Å⁻¹ range where secondary structure becomes discernible. The other reason has been the success of frozen-hydrated electron microscopy ("cryo-electron microscopy") at the beginning of the 1980s, which has the advantage, compared to glucose embedment, that the scattering densities of water and protein are sufficiently different to produce contrast even at low resolutions--provided, of course, that the electron microscopic transfer function allows the relevant Fourier components to contribute to the image (see Section II, C).
E. Use of Tannic Acid Tannin has been used with success by some researchers to stabilize and preserve thin ordered protein layers (Akey and Edelstein, 1983). It has been found to be instrumental in the collection of high-resolution data for the light-harvesting complex II (Wang and Kühlbrandt, 1991; Kühlbrandt and Wang, 1991; Kühlbrandt et al., 1994). For tannin preservation, the carbon film is floated off water and transferred with the grid onto a 0.5% (w/v) tannic acid solution adjusted to pH 6.0 with KOH (Wang and Kühlbrandt, 1991). Wang and Kühlbrandt in fact found little difference in the preservation of the high-resolution structure when prepared with vitreous ice, glucose, or tannin; the important difference was in the high success rate of crystalline preservation with tannin versus the extremely low success rate with the other embedding media. The authors discuss at length the role of tannin in essentially blocking the extraction of detergent from the membrane crystal, which otherwise occurs in free equilibrium with the detergent-free medium.
F. Cryo-electron Microscopy of Ice-Embedded Specimens The development of cryo-electron microscopy of samples embedded in vitreous ice (Taylor and Glaeser, 1976; Dubochet et al., 1982; Lepault et al., 1983; McDowall et al., 1983; Adrian et al., 1984; see also the review by Chiu, 1993) presented a quantum leap for biological electron microscopy, as it made it possible to obtain images of fully hydrated macromolecules. The specimen grid, to which an aqueous solution containing the specimen is applied, is rapidly plunged into liquid ethane, whereupon the thin water
layer vitrifies. The rapid cooling rate prevents the water from turning into cubic ice. The grid is subsequently transferred to liquid nitrogen and mounted in the cryoholder of the electron microscope. It is the small mass of the electron microscopic specimen grid that makes the required extremely high cooling rate possible. The advantages of frozen-hydrated specimen preparation are, as with glucose embedment, that specimen collapse is avoided and that the image contrast is related to the biological object itself, rather than to an extraneous contrasting agent. Thus, by combining cryo-electron microscopy with 3D reconstruction, a quantitative, physically meaningful map of the macromolecule can be obtained, enabling direct comparisons with results from X-ray crystallography.6 Another advantage of cooling is the greater resistance of organic material to radiation damage, although initial estimates proved overly optimistic. The reason for the reduction in damage is that free radicals produced by ionization during electron irradiation are trapped under these conditions, preventing, or at least reducing, the damage to the structure. More than 20 years ago, Taylor and Glaeser (1974) proved the preservation of crystalline order in thin plates of catalase cooled down to liquid nitrogen temperature. Subsequent investigations of a number of protein crystals found general improvements in radiation resistance by a factor of between two and six (see the summary given by Dubochet et al., 1988). A reminder that the last word has not been spoken on the best way to stabilize and preserve a specimen is an article by Cyrklaff and Kühlbrandt (1994) in which the use of a special form of cubic ice is explored. The authors believe that the high stability of specimens prepared in this way is due to the extraordinary mechanical properties of a cubic ice layer when compared to a layer of vitreous ice.
G. Labeling with Gold Clusters

The use of selective stains to mark specific sites or residues of a molecule was explored early on (see the review by Koller et al., 1971). Particular interest was attracted by the idea of using compounds incorporating single heavy atoms; even the sequencing of nucleic acids was thought possible in this way (Beer and Moudrianakis, 1962).

⁶ Strictly speaking, the image obtained in the transmission electron microscope is related to the Coulomb potential distribution of the object, whereas the diffraction intensities obtained by X-ray diffraction techniques are related to the electron density distribution of the object.

The difficulty
Chapter 2. Electron Microscopy of Macromolecular Assemblies
with these single-atom probes was the relatively low contrast compared to the contrast arising from a column of light atoms in a macromolecule and from the support. Subsequently, a number of heavy-atom clusters were investigated for their utility in providing specific contrast [see the detailed account in Hainfeld (1992)]. Undecagold, a compound that incorporates 11 gold atoms, is clearly visible in the scanning transmission electron microscope (STEM) but not in the conventional electron microscope (EM). In a recent breakthrough, Hainfeld and Furuya (1992; see also Hainfeld, 1992) introduced a new probe consisting of a 55-gold-atom cluster (Nanogold; Nanoprobe Inc., Stony Brook, NY), which forms a dense 1.4-nm particle bound to a single maleimide site. This compound can be specifically linked to exposed cysteine residues. From theoretical considerations and from the first experiments made with this compound, the scattering density of this gold cluster is high enough to outweigh the contribution, to the EM image, of a 200-300 Å thick projected protein mass. Because of the presence of the amplitude component, which is transferred by cos γ (see Section II, D), the Nanogold cluster stands out as a sharp density peak even in (and, as it turns out, especially in) low-defocus cryoimages, where the boundaries of macromolecules are virtually invisible (Wagenknecht et al., 1994). The usual high-defocus images show the cluster somewhat blurred but still as prominent "blobs" superimposed on the molecule. Applications of this method are currently proliferating (Braig et al., 1993; Wagenknecht et al., 1994; Boisset et al., 1992). Boisset et al. (1992) used the Nanogold cluster to determine the site of the thiol ester bonds of human α₂-macroglobulin in three dimensions. The cluster stands out as a "core" of high density in the center of the macromolecular complex. Wagenknecht et al. (1994) were able to determine the calmodulin-binding sites on the calcium release channel/ryanodine receptor. Using site-directed Nanogold labeling, Braig et al. (1993) succeeded in mapping the substrate protein to the cavity of GroEL.
II. Principle of Image Formation in the Electron Microscope

A. Introduction
Image formation in the electron microscope is a complex process; indeed, it would be an appropriate subject for a separate book. In the later chapters of this volume, there will be occasional references to the "contrast transfer function" and its dependence on the defocus. It is important to understand the principle of the underlying theory for two reasons: First, the image is not necessarily a faithful representation of the object's
projection, and hence the same can be said for the relationship between the three-dimensional reconstruction computed from such images and the 3D object it is supposed to represent. It is therefore important to know the imaging conditions that lead to maximum resemblance as well as the types of computational correction (see Section II, H in this chapter and Section IX in Chapter 5) that are needed to recover the original information. Second, the contrast transfer theory is only an approximation to a comprehensive theory of image formation (see Reimer, 1989; Rose, 1984), and attains its simplicity by ignoring a number of effects whose relative magnitudes vary from one specimen to the other. An awareness of these "moving boundaries" of the theory is required to avoid incorrect interpretations.
B. The Weak Phase Object Approximation

The basis of image formation in the electron microscope is the interaction of the electrons with the object. We distinguish between elastic and inelastic scattering. The former involves no transfer of energy, has a fairly wide angular distribution, and gives rise to high-resolution information. The latter involves transfer of energy, has a narrow angular distribution, and produces an undesired background term in the image. Because this term has low resolution, it is normally tolerated, although it interferes with the quantitative interpretation of the image (see Section III, C on energy filtering). In the wave-optical picture, the "elastic" scattering interaction of the electron with the object is depicted as a phase shift Φ(r) of the incoming wave traveling in the z direction:

Φ(r) = ∫ Φ₃D(r, z) dz,   (2.1)

where r is a two-dimensional vector, which we will write as a column vector r = [x, y]ᵀ, and Φ₃D(r, z) is the three-dimensional Coulomb potential distribution within the object. Thus, the incoming plane wave ψ = exp(ikz) is modified according to

ψ' = ψ exp[iΦ(r)].   (2.2)
The weak phase approximation assumes that Φ(r) << 1, enabling the expansion

ψ' = ψ[1 + iΦ(r) − ½Φ(r)² + ⋯],   (2.3)
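The quality of this truncation is easy to check numerically. The following sketch compares exp(iΦ) with its second-order expansion; the phase value 0.05 rad is an arbitrary illustrative choice:

```python
import numpy as np

# Numerical check of the weak-phase expansion (2.3): for a small phase
# shift phi, exp(i*phi) is well approximated by 1 + i*phi - phi**2/2.
# The phase value is an arbitrary illustrative choice, phi << 1.
phi = 0.05

exact = np.exp(1j * phi)
second_order = 1 + 1j * phi - 0.5 * phi**2

# The truncation error is of order phi**3 / 6
error = abs(exact - second_order)
```

For Φ of order 0.05 the error is of order 2 × 10⁻⁵, which is why the series is normally truncated after the second term.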
which is normally truncated after the second term. Note that this form implies a decomposition of the wave behind the object into an "unmodified" or "unscattered" wave (the term 1) and a "scattered" wave (the term iΦ(r) and following). The Fraunhofer approximation of diffraction theory (Goodman, 1968) is obtained by assuming observation at a far distance and close to the optical axis--assumptions that are always fulfilled in the imaging mode of the transmission electron microscope. In this approximation, the wave function in the back focal plane of the objective lens is--in the ideal case--the Fourier transform of Eq. (2.2) or of the approximated expression in Eq. (2.3). However, the lens aberrations and the defocusing have the effect of shifting the phase of the scattered wave by the term

γ(k) = 2πχ(k),   (2.4)
which is dependent upon the coordinates in the back focal plane. The coordinates are in turn proportional to the scattering angle and the spatial frequency, k = {k_x, k_y}. The term χ(k) is called the wave aberration function (Fig. 2.6). In a polar coordinate system with k = |k|, φ = arctan(k_x/k_y),

χ(k, φ) = −½λ[Δz + (Δz_a/2) sin 2(φ − φ₀)]k² + ¼λ³C_s k⁴,   (2.5)

where λ is the electron wavelength; Δz, the defocus of the objective lens; Δz_a, the focal difference due to axial astigmatism; φ₀, the reference angle of axial astigmatism; and C_s, the third-order spherical aberration constant. An ideal lens will transform an incoming plane wave into a spherical wavefront converging into a single point on the back focal plane. The wave aberration has the effect of deforming the spherical wavefront. The spherical aberration term acts in such a way that the outer zones of the wavefront are curved more strongly than the inner zones, leading to a decreased focal length. In summary, the wave function Ψ_bf(k) in the back focal plane of the objective can be written as the Fourier transform of the wave function ψ' immediately behind the object times a term that represents the phase shift due to the lens aberrations:

Ψ_bf(k) = F{ψ'} exp[iγ(k)].   (2.6)
Here and in the following, the symbol F{ } will be used to denote the Fourier transformation, and F⁻¹{ } its inverse. Also, as a notational convention, we will use small letters to denote functions in real space and the corresponding capital letters to refer to their Fourier transforms. Thus, F{h(r)} = H(k).
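For the astigmatism-free case (Δz_a = 0), Eqs. (2.4) and (2.5) can be sketched in a few lines of Python; the wavelength, defocus, and C_s values below are illustrative choices, not values taken from the text:

```python
import numpy as np

# Sketch of Eqs. (2.4)-(2.5) for the astigmatism-free case (dz_a = 0).
# Lengths are in angstroms; all parameter values are illustrative.

def gamma(k, wavelength, defocus, cs):
    """Phase shift gamma(k) = 2*pi*chi(k) at spatial frequency k (1/A),

    with chi(k) = -0.5*lambda*dz*k**2 + 0.25*lambda**3*Cs*k**4.
    """
    chi = -0.5 * wavelength * defocus * k**2 \
          + 0.25 * wavelength**3 * cs * k**4
    return 2.0 * np.pi * chi

# 100-kV electrons (lambda ~ 0.037 A), Cs = 2 mm = 2e7 A, 5000 A underfocus
k = np.linspace(0.0, 0.05, 500)
g = gamma(k, wavelength=0.037, defocus=5000.0, cs=2.0e7)
```

For positive (under-)defocus the defocus term dominates at low spatial frequencies, so γ(k) starts off negative; the k⁴ spherical aberration term takes over at high spatial frequencies.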
Fig. 2.6. Wave aberration function of the electron microscope. The curves give the function γ(Δẑ; k̂)/π = −Δẑ k̂² + k̂⁴/2 as a function of the generalized spatial frequency k̂ = k(C_sλ³)^{1/4}, for different choices of the generalized defocus Δẑ = Δz/(C_sλ)^{1/2}. From Hawkes and Kasper (1994). Reproduced with permission of Academic Press Ltd.
Next, the wave function in the image plane is obtained from the wave in the back focal plane, after modification by an aperture stop function A(k), through another Fourier transformation:

ψᵢ(r) = F⁻¹{F{ψ'}A(k) exp[iγ(k)]},   (2.7)

A(k) = 1 for |k| = θ/λ ≤ θ₁/λ; 0 elsewhere,   (2.8)

where θ₁ is the angle corresponding to the objective aperture. Finally, the observed intensity distribution in the image plane (ignoring irrelevant scaling factors) is

I(r) = ψᵢ(r)ψᵢ*(r).   (2.9)
If the expansion of Eq. (2.3) is broken off after the second term ("weak phase object approximation"), we see that the image intensity is dominated by a term that results from the interference of the "unmodified" wave with the "scattered" wave. In this case, the imaging mode is referred to as bright-field electron microscopy. If the unmodified wave is blocked off in the back focal plane, we speak of dark-field electron microscopy.
Because of the dominance of the term that is linear in the scattered wave amplitude, bright-field electron microscopy has the unique property that it leads to an image whose contrast is--to a first approximation--linearly related to the projected object potential. The description of the relationship between observed image contrast and projected object potential, and the way this relationship is influenced by electron optical parameters, is the subject of the contrast transfer theory (see Lenz, 1971; Hanszen, 1971; Spence, 1988; Hawkes, 1992; Hawkes and Kasper, 1994). A brief outline of this theory is presented in the following section.
C. Contrast Transfer Theory

1. The Phase Contrast Transfer Function

If we (i) ignore terms involving higher than first orders in Φ(r) and (ii) assume that the projected potential Φ(r) is real, Eq. (2.9) yields a linear relationship between O(k) = F{Φ(r)} and the Fourier transform of the image contrast, F(k) = F{I(r)}:

F(k) = O(k)A(k)2 sin γ(k).   (2.10)

Mathematically, the appearance of a simple scalar product in Fourier space, in this case with the factor H(k) = A(k)2 sin γ(k), means that in real space the image is related to the projected potential by a simple convolution operation:

I(r) = ∫ Φ(r')h(r − r') dr'   (2.11)
     ≝ Φ(r) ⊛ h(r),   (2.12)

where h(r) is called the point spread function. [The notation using the symbol "⊛" (e.g., Goodman, 1968) is practical when expressions involving multiple convolutions are to be evaluated.] The function 2 sin γ(k) (Figs. 2.7a and 2.7e) is called the phase contrast transfer function (CTF). It is characterized, as k = |k| increases, by alternately positive and negative zones that are rotationally symmetric when the axial astigmatism is fully compensated (i.e., Δz_a = 0). The zones have elliptic or more complicated shapes for Δz_a ≠ 0.
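Equations (2.10)-(2.12) translate directly into a computational sketch: the CTF is built on a discrete Fourier grid and applied to a toy object by multiplication in Fourier space, which is equivalent to convolution with h(r). All parameter values here are illustrative:

```python
import numpy as np

# Minimal sketch of Eq. (2.10): in Fourier space the image transform is the
# object transform multiplied by H(k) = A(k) * 2*sin(gamma(k)). Equivalently
# (Eqs. 2.11-2.12), the image is the object convolved with the point spread
# function h(r). All parameter values (in angstrom units) are illustrative.

def ctf_2d(shape, pixel_size, wavelength, defocus, cs, aperture_cutoff):
    """2D phase contrast transfer function on an FFT frequency grid."""
    ky = np.fft.fftfreq(shape[0], d=pixel_size)
    kx = np.fft.fftfreq(shape[1], d=pixel_size)
    k2 = ky[:, None]**2 + kx[None, :]**2
    chi = -0.5 * wavelength * defocus * k2 + 0.25 * wavelength**3 * cs * k2**2
    gamma = 2.0 * np.pi * chi
    aperture = (np.sqrt(k2) <= aperture_cutoff).astype(float)  # A(k), Eq. (2.8)
    return aperture * 2.0 * np.sin(gamma)

# Apply the CTF to a toy "projected potential" (a centered disk)
n = 128
y, x = np.mgrid[:n, :n]
obj = ((x - n // 2)**2 + (y - n // 2)**2 < 100).astype(float)

h = ctf_2d(obj.shape, pixel_size=2.0, wavelength=0.037,
           defocus=5000.0, cs=2.0e7, aperture_cutoff=0.2)
img = np.fft.ifft2(np.fft.fft2(obj) * h).real
```

Because 2 sin γ(0) = 0, the transfer of the zero spatial frequency vanishes and the mean of the filtered image is zero, a first hint of the band-pass behavior discussed below.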
2. Partial Coherence Effects: The Envelope Function

Fig. 2.7. The influence of a finite illumination angle on the contrast transfer function for two defocus values, Δẑ = 1 (a-d) and (e-h). (a) Undampened CTF; (b) q₀ = 0.05; (c) q₀ = 0.1; (d) q₀ = 0.5. From Hawkes and Kasper (1994). Reproduced with permission of Academic Press Ltd.

In Eq. (2.10), which is derived assuming completely coherent illumination with monochromatic electrons, the resolution is limited by the aperture function. In the absence of an aperture, information is--at least in principle--transferred up to high spatial frequencies, even though the increasingly rapid oscillations of the CTF make it difficult to exploit that information. In practice, however, the illumination has finite divergence (or, in other words, the source size q₀ is finite) and a finite energy spread. The resulting partial coherence dampens the CTF as we go toward higher spatial frequencies and ultimately limits the resolution. The theoretical treatment of these phenomena is somewhat complicated, and the resulting
integrals (e.g., Rose, 1984) make it difficult to gauge the effects of changing defocus, illumination divergence, or energy spread. The approximate envelope representations (Frank, 1973a; Wade and Frank, 1977) have the advantage that they reveal the influence of these parameters in a mathematically tractable form:

H_pc(k) = 2A(k) sin γ(k)E(k);   k = |k|,   (2.13)
where E(k) is the "compound envelope function"

E(k) = E_i(k)E_e(k),   (2.14)
consisting of the term E_i(k), the envelope function due to partially coherent illumination, and the term E_e(k), the envelope function due to energy spread. [For simplicity, only the radial dependence is considered here. It is straightforward to write down the full expression containing the polar coordinate dependency in the case Δz_a ≠ 0.] The effect of the partially coherent illumination alone is shown in Figs. 2.7b-2.7d and 2.7f-2.7h: increasing the source size (described by the parameter q₀, a quantity of dimension 1/length defined in the back focal plane) is seen to dampen the high spatial frequency range increasingly. The range of validity of this product representation has been explored by Wade and Frank (1977). The first term is

E_i(k) = exp[−π²q₀²(C_sλ³k³ − Δzλk)²]   (2.15)

for a Gaussian source distribution and

E_i(k) = 2J₁[2πq₀(C_sλ³k³ − Δzλk)] / [2πq₀(C_sλ³k³ − Δzλk)]   (2.16)

for a "top hat" distribution (Frank, 1973a), with J₁ denoting the first-order Bessel function. The argument (C_sλ³k³ − Δzλk) is the gradient of the wave aberration function (Frank, 1973a). It is evident from Eqs. (2.15) and (2.16) that E_i(k) = 1 wherever this gradient vanishes. The envelope due to the energy spread is (Hanszen and Trepte, 1971; Wade and Frank, 1977)

E_e(k) = exp[−(πδzλk²/2)²],   (2.17)
where δz is the defocus spread due to lens current fluctuations and to chromatic aberration in the presence of energy spread. The distinguishing feature of E_i(k) is that it is defocus dependent, whereas E_e(k) is independent of defocus. The combined effect of both envelopes can be understood as the action of two superimposed apertures, one of which changes with changing defocus and one of which remains constant. For example, if E_i(k) cuts off at k₁ and E_e(k) cuts off at k₂ < k₁, then E_i(k) has no effect whatsoever. From this "product rule," an important clue can be derived: if the band limit produced by E(k) = E_i(k)E_e(k) is independent of defocus in a given defocus range, then the energy spread is the limiting factor in that range. In the case where the band limit is observed to be defocus dependent, one part of the defocus range may be dependent and thus ruled by E_i(k), and the other part may be independent and limited by E_e(k). In instruments with a field-emission gun, both the illumination divergence and the energy spread are very small, and so both envelopes E_i(k) and E_e(k) are pushed toward high spatial frequencies, offering the opportunity for reaching atomic resolution (O'Keefe, 1992; Zhou and Chiu, 1993). Finally, it should be mentioned that, unless compensated, axial astigmatism, which was left out for notational convenience in Eqs. (2.15)-(2.17), will create an azimuthal dependence of the effective band limit through the action of the defocus-dependent illumination envelope, which now has to be written as a function of a vector argument, E_i(k).
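Equations (2.15) and (2.17) are simple to evaluate. The following sketch, with illustrative values for q₀, the defocus spread, and the electron optical parameters, shows the compound envelope of Eq. (2.14) falling off toward high spatial frequencies:

```python
import numpy as np

# Sketch of the envelope terms, Eqs. (2.15) and (2.17): a Gaussian-source
# illumination envelope E_i(k) and a defocus-spread envelope E_e(k).
# All parameter values (angstrom units) are illustrative only.

def envelope_illumination(k, q0, wavelength, defocus, cs):
    """E_i(k) for a Gaussian source; q0 is the source size parameter (1/A)."""
    gradient = cs * wavelength**3 * k**3 - defocus * wavelength * k
    return np.exp(-np.pi**2 * q0**2 * gradient**2)

def envelope_energy_spread(k, defocus_spread, wavelength):
    """E_e(k) due to defocus spread (chromatic aberration, lens current)."""
    return np.exp(-(np.pi * defocus_spread * wavelength * k**2 / 2.0)**2)

k = np.linspace(0.0, 0.25, 500)
e_i = envelope_illumination(k, q0=0.01, wavelength=0.037,
                            defocus=5000.0, cs=2.0e7)
e_e = envelope_energy_spread(k, defocus_spread=200.0, wavelength=0.037)
e_total = e_i * e_e   # compound envelope, Eq. (2.14)
```

Both envelopes equal 1 at the origin and never exceed 1; E_i(k) also returns to 1 wherever the gradient of the wave aberration function vanishes, as noted in the text.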
3. The Contrast Transfer Characteristics

When the value of sin γ(k) is plotted as a function of both k and Δz, we obtain a pattern called the contrast transfer characteristics (Thon, 1971) of the microscope. Möbus and Rühle (1993) have called this function the "contrast transfer nomogram." If we ignore the effect of axial astigmatism for the moment, the characteristics is entirely determined by the values of the remaining parameters in formula (2.5), λ (wavelength) and C_s (third-order spherical aberration coefficient). If one introduces the dimensionless quantities (see Frank, 1973a)

Δẑ = Δz/[C_sλ]^{1/2}   ("generalized defocus")   (2.18)

and

k̂ = [C_sλ³]^{1/4}k   ("generalized spatial frequency"),   (2.19)

following a suggestion by Hanszen and Trepte (1971), one obtains the standard characteristics, which is independent of voltage and of the value of the third-order spherical aberration constant and hence is the same for all electron microscopes:

CTF(k̂; Δẑ) = 2 sin[−πΔẑk̂² + (π/2)k̂⁴].   (2.20)
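The change to generalized units can be sketched as follows; the physical values are again illustrative. Note that, in these units, the first zero of the CTF at Δẑ = 1 falls at k̂ = √2 regardless of voltage or C_s:

```python
import numpy as np

# Sketch of the dimensionless "standard characteristics", Eqs. (2.18)-(2.20):
# expressed in generalized units, the CTF is the same for all microscopes.
# The physical parameter values below (angstrom units) are illustrative.

def generalized_defocus(defocus, cs, wavelength):
    """dz_hat = dz / (Cs * lambda)**(1/2), Eq. (2.18)."""
    return defocus / np.sqrt(cs * wavelength)

def generalized_frequency(k, cs, wavelength):
    """k_hat = (Cs * lambda**3)**(1/4) * k, Eq. (2.19)."""
    return (cs * wavelength**3)**0.25 * k

def standard_ctf(k_hat, dz_hat):
    """CTF(k_hat; dz_hat) = 2 sin(-pi*dz_hat*k_hat**2 + (pi/2)*k_hat**4)."""
    return 2.0 * np.sin(-np.pi * dz_hat * k_hat**2 + 0.5 * np.pi * k_hat**4)

# Example: 5000-A underfocus at lambda = 0.037 A, Cs = 2 mm (= 2e7 A)
dz_hat = generalized_defocus(5000.0, 2.0e7, 0.037)
k_hat = generalized_frequency(np.linspace(0.0, 0.05, 400), 2.0e7, 0.037)
ctf = standard_ctf(k_hat, dz_hat)
```

The first band of the CTF is negative for underfocus in this sign convention, in agreement with the negative contrast of the innermost transfer interval described in the caption of Fig. 2.9.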
This important diagram is shown in Fig. 2.8. It makes it possible to determine the optimum defocus setting that is required to bring out features of a certain size range, or to gauge the effect of axial astigmatism. Features of a size range between d₁ and d₂ require the transmission of a spatial frequency band between 1/d₁ and 1/d₂. One obtains the defocus value, or values, for which optimum transmission of this band occurs by constructing the intersection between the desired frequency band and the contrast transfer zones (Fig. 2.8). The effect of axial astigmatism can be gauged by moving back and forth along the defocus axis by the amount Δz_a/2 from a given Δz position, bearing in mind that this movement is controlled by the azimuthal angle φ according to the behavior of sin(2φ): a full 360° azimuthal range leads to two complete oscillations of defocus around the nominal value. Evidently, small values of astigmatism lead to an elliptic appearance of the contrast transfer zones, whereas large values may cause the defocus to oscillate beyond the boundaries of one or several zones, producing hyperbolic patterns.

Fig. 2.8. Representation of the contrast transfer function characteristics E(Δẑ; k̂) sin γ(Δẑ; k̂), showing the effect of the envelope functions. The horizontal axis is in generalized spatial frequency units k̂ and the vertical axis is the generalized defocus Δẑ [see Eqs. (2.18) and (2.19) for the definition of both quantities]. A profile along a vertical line Δẑ = const gives the CTF at that defocus, and allows the effective resolution to be gauged as the maximum spatial frequency (upper transfer limit) beyond which no information transfer occurs. (a) CTF with E(Δẑ; k̂) = 1 (fully coherent case). (b) Partially coherent illumination with the generalized source size q̄₀ = 0.5 but no defocus spread. The practical resolution is defocus dependent. At high spatial frequencies, a central band is also eliminated by the effect of the envelope function. (c) Defocus spread δẑ = 0.125 in generalized units, in the case of a point source. The practical resolution has no defocus dependence in this case. From Wade (1992). Reproduced with permission of Elsevier Science, Amsterdam.
4. The Effects of the Contrast Transfer Function

The main effects of the CTF on the image, as compared to those of ideal contrast transfer [i.e., CTF(k) = 1], can be described as a combined low-pass (i.e., resolution-limiting) and high-pass filtration. An effective low-pass filtration results from the fact that in the underfocus range (by convention Δz > 0), the CTF typically has a "plateau" of relative constancy followed by rapid oscillations. In this situation, the high-frequency border of the plateau acts as a virtual band limit. The use of information transferred beyond that limit, in the zones of alternating contrast, requires some type of restoration. In practice, information transferred outside the first zone is of little use in the image, unless the polarity of the subsequent, more peripheral zones is "flipped" computationally, so that a continuous positive or negative transfer behavior is achieved within the whole resolution domain. More elaborate schemes employ restoration such as Wiener filtering (Welton, 1979; Kübler et al., 1978; Lepault and Pitt, 1984; Jeng et al., 1989; Frank and Penczek, 1995), in which not only the polarity (i.e., the phase) but also the amplitude of the CTF is compensated throughout the resolution range, and two or more micrographs with different defocus values are used (see Section II, H in this chapter and Section IX in Chapter 5). The fidelity of the restoration is limited by the presence of resolution-limiting envelope terms, the accuracy of the transfer function description of image formation, and the presence of noise. The well-known band-pass filtering effect of the CTF is a result of the CTF having a small value over an extended range of low spatial frequencies. The effect of this property on the image is that the particle as a whole does not stand out from the background, but its edges are sharply defined by contours, and short-range interior density variations are exaggerated.
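The polarity "flipping" just described can be sketched as follows. The synthetic CTF and test image are arbitrary stand-ins; in a real application the CTF would be computed from the fitted defocus of the micrograph, and Wiener filtering would additionally compensate the amplitude:

```python
import numpy as np

# Minimal sketch of "phase flipping": the sign of the Fourier components
# is inverted wherever the CTF is negative, giving single-polarity transfer
# across the whole resolution domain. The CTF and image here are arbitrary
# synthetic stand-ins; only the polarity (not the amplitude) is corrected.

def phase_flip(image, ctf):
    """Multiply the image transform by the sign of the CTF."""
    flipped = np.fft.fft2(image) * np.sign(ctf)
    return np.fft.ifft2(flipped).real

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))
ky = np.fft.fftfreq(64)[:, None]
kx = np.fft.fftfreq(64)[None, :]
ctf = 2.0 * np.sin(-np.pi * 300.0 * (kx**2 + ky**2))  # synthetic oscillating CTF
corrected = phase_flip(img, ctf)
```

Components in positive CTF zones are left untouched, while those in negative zones have their sign reversed; components at exact CTF zeros carry no signal and are zeroed.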
Another way of describing the effects of the CTF is by the appearance of the associated point spread function, which is the Fourier transform of the CTF and describes the way a single point of the object would be imaged by the electron microscope. Generally, the closer this function resembles a delta function, the more faithful is the image to the object. In practice, the typical point spread function has a central maximum that is sometimes barely higher than the surrounding maxima, and it might extend over a sizable area (Fig. 2.9). The long-range oscillations of the
point spread function are responsible for the "ringing," i.e., the appearance of Fresnel fringes along the borders of the object. In Fig. 2.9, the CTF obtained with Δz = 3000 Å has been applied to a rural motif for a demonstration of the low-pass, high-pass, and ringing
Fig. 2.9. Demonstration of the effect of an electron microscopic transfer function on a rural motif. (a) "Diffractogram" (squared transfer function) for Δz = 3000 Å, C_s = 2 mm, and an additional Gaussian envelope term. The first, innermost transfer interval conveys negative contrast, the following transfer interval positive contrast. (b, center) Point spread function associated with the CTF of (a); (inset) the same function, enlarged. (c) An object before and (d) after application of the transfer function. Compared to the original, the distorted image is characterized by four features: (i) the inversion of contrast of larger areas (e.g., the head of the goat is now black on a faintly white background); (ii) diminished contrast of large areas; (iii) edge enhancement (each border is now sharply outlined); and, accompanying the borders, (iv) the appearance of fringes with alternating contrast.
effects, which produce strong degradations. One is immediately struck by the "ghost" appearance of the familiar objects and by the absence of the usual segmentation, by density, between different parts of the scene and the background. This effect is caused by the virtual absence of low-resolution information in the Fourier transform. Similarly, low-contrast objects such as single molecules embedded in ice are very hard to make out in the image, unless a much higher defocus (in the range of 10,000 to 20,000 Å) is used (see, for instance, Fig. 7.1 in Chapter 7).
D. Amplitude Contrast

Amplitude contrast of an object arises from a locally changing virtual loss of electrons participating in the "elastic" image formation, either by electrons that are scattered outside of the aperture or by those that are removed by inelastic scattering (see Rose, 1984). These amplitude components are therefore entirely unaffected by energy filtering. Rose writes in his account of information transfer, "It seems astonishing at first that a[n energy-]filtered bright-field image, obtained by removing all inelastically scattered electrons from the beam, represents an elastic image superposed on an inelastic 'shadow image.'" The shadow image that Rose is referring to is produced by amplitude contrast. Since the detailed processes are dependent on the atomic species, the ratio of amplitude to phase contrast is itself a locally varying function. Formally, the amplitude component of an object can be expressed by an imaginary component of the potential in Eq. (2.3). The Fourier transform of the amplitude component is transferred by cos γ, which, unlike the "usual" term sin γ, starts off with a maximal value at low spatial frequencies. The complete expression for the image intensity thus becomes

I(k) = 2O_r(k) sin γ(k) − 2O_i(k) cos γ(k),   (2.21)

where O_r(k) and O_i(k) are the Fourier transforms of the real (or weak-phase) and imaginary (or weak-amplitude) portions of the object, respectively (Erickson and Klug, 1970; Frank, 1972c; Wade, 1992). Equation (2.21) is the basis for heavy/light atom discrimination using a focus series (Frank, 1972c, 1973b; Kirkland et al., 1980; Typke et al., 1992; Frank and Penczek, 1995), following an original idea by Schiske (1968). Only for a homogeneous specimen (i.e., a specimen that consists of a single species of atoms; see Frank and Penczek, 1995) is it possible to rewrite Eq. (2.21) in the following way:

I(k) = O_r(k)[2 sin γ(k) − 2Q(k) cos γ(k)].   (2.22)
Here Q(k) = O_i(k)/O_r(k) is a function characteristic of each atomic species, but within the small spatial frequency range of practical interest, it is safe to assume Q(k) ≈ const = Q₀ (see Toyoshima and Unwin, 1988a,b; Stewart et al., 1993). With these approximations, it is again possible to speak of a single contrast transfer function:

H'(k) = 2 sin γ(k) − 2Q₀ cos γ(k).   (2.23)
Compared with the function obtained for a pure phase object, the function described by Eq. (2.23) has its zeros shifted toward higher radii (Fig. 2.10). The most important change lies in the fact that at low spatial frequencies, the transfer function starts off with H'(k) = −2Q₀. Thus the cosine-transferred term mitigates the pronounced band-pass filtering effect produced by sin γ, brought about by the deletion of Fourier components at low spatial frequencies. The value of Q₀ is usually determined by recording a defocus series of the specimen and measuring the positions of the zeros of H'(k) in the diffraction patterns of the micrographs. These diffraction patterns can be
Fig. 2.10. Electron-optical transfer function (C_s = 2 mm) for a mixed phase/amplitude object (Q = 0.15), for two defocus values: Δz = 0.9 μm (solid line) and Δz = 1.5 μm (broken line). The horizontal axis gives spatial frequency in nm⁻¹. From Frank and Penczek (1995). Reproduced with permission of Wissenschaftliche Verlagsgesellschaft, Stuttgart.
obtained either by optical diffraction or by computation (see Section II, E). Other measurements of Q₀ were done by following the amplitudes and phases of reflections in the computed Fourier transform of a crystal image as a function of defocus (Erickson and Klug, 1970) or by observing the lines of zero contrast transfer in optical diffraction patterns of strongly astigmatic images (Typke and Radermacher, 1982). Toyoshima and Unwin (1988a) and Toyoshima et al. (1993) obtained Q₀ measurements by comparing pairs of micrographs taken with equal amounts of underfocus and overfocus. Averaged values of Q₀ for negatively stained (uranyl acetate) specimens on a carbon film range from 0.19 (Zhu and Frank, 1994) to 0.35 (Erickson and Klug, 1970, 1971). The wide range of these measurements reflects not only considerable experimental errors, but also variations in the amount and thickness of the stain relative to that of the carbon film. For protein specimens in ice, the values range from 0.07 (Zhu and Frank, 1994; for a specimen on a carbon layer in ice) to 0.09 (Toyoshima and Unwin, 1988a), and 0.14 for tobacco mosaic virus (Smith and Langmore, 1992); the size of the last value probably reflects the presence of RNA.
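The low-spatial-frequency behavior of Eq. (2.23) is easy to verify numerically; the defocus, wavelength, and Q₀ values below are illustrative (Q₀ = 0.15 as in Fig. 2.10):

```python
import numpy as np

# Numerical sketch of Eq. (2.23): with an amplitude-contrast fraction Q0,
# the transfer function no longer vanishes at the origin but starts at
# -2*Q0, mitigating the suppression of low spatial frequencies by the
# sine term alone. All parameter values (angstrom units) are illustrative.

def h_prime(k, q0, wavelength, defocus, cs):
    """Mixed transfer function H'(k) = 2 sin gamma(k) - 2 Q0 cos gamma(k)."""
    chi = -0.5 * wavelength * defocus * k**2 + 0.25 * wavelength**3 * cs * k**4
    gamma = 2.0 * np.pi * chi
    return 2.0 * np.sin(gamma) - 2.0 * q0 * np.cos(gamma)

k = np.linspace(0.0, 0.05, 1000)
pure_phase = h_prime(k, 0.00, 0.025, 15000.0, 2.0e7)  # weak phase object only
mixed = h_prime(k, 0.15, 0.025, 15000.0, 2.0e7)       # Q0 = 0.15, as in Fig. 2.10
```

At k = 0 the pure phase CTF vanishes, whereas the mixed transfer function takes the value −2Q₀; throughout the low-frequency range the amplitude term boosts the magnitude of the transfer.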
E. Optical and Computational Diffraction Analysis

The CTF leaves a "signature" in the diffraction pattern of a carbon film, which optical diffraction analysis (Thon, 1966, 1971; Johansen, 1975) or its computational equivalent is able to reveal. Before a micrograph can be considered worthy of the great time investment that is required for digital processing (such as scanning and selection of particles), its diffraction pattern should first be analyzed. The optical diffractometer is a simple optical device, working with a coherent light source, that allows the diffraction pattern of a selected image area to be recorded (Fig. 2.11). With Eq. (2.23), and taking into account the envelope function E(k) and an additive noise term N(k), we obtain

|F(k)|² = |O(k)|²E(k)²4[sin γ(k) − Q₀ cos γ(k)]² + |N(k)|².   (2.24)

That is, the object spectrum (i.e., the squared Fourier transform of the object) is modulated by the squared CTF. In order to describe this modulation, and the conditions for its observability, we must make an approximate yet realistic assumption about the object. A thin carbon film can be characterized as an amorphous, unordered structure. For such an object, the spectrum |O(k)|² is nearly "white." "Whiteness" of the spectrum means
Fig. 2.11. Schematic sketch of an optical diffractometer. A beam-expanding telescope is used to form a spot of the coherent laser light on the viewing screen. When a micrograph is placed immediately in front of the lens of the telescope, its diffraction pattern is formed on the viewing screen. From Stewart (1988a), Introduction to the computer image processing of electron micrographs of two-dimensionally ordered biological structures. J. Electron Microsc. Tech.
that its local average is roughly the same throughout the resolution domain. More specifically, the total signal variance, which is equal to the squared object spectrum integrated over the resolution domain B (Parseval's theorem),

var{o(r)} = ∫_B |O'(k)|² dk,   (2.25)

where O'(k) is the Fourier transform of the "floated," or average-subtracted, object [o(r) − ō], is evenly partitioned within that domain. If we had an instrument that would image this kind of object without any aberration, then the optical diffraction pattern would be uniformly white. Hence, for such a structure, multiplication of its spectrum with the CTF in real instruments will leave a characteristic trace (the "signature" of which we have spoken before). We can draw the following (real-space) parallel: in order to see an image on a transparent sheet clearly, one has to place it on a light box that produces uniform, untextured illumination, as in an overhead projector. When we image a carbon film in the EM and subsequently analyze the electron micrograph in the optical diffractometer, the carbon film spectrum essentially acts as a uniform light source that makes the CTF (i.e., the transparency, in our analogy, through which the carbon spectrum is "filtered") visible in the Fourier transform of the image intensity. Instead of the carbon film, a two-dimensional crystal with a large unit cell can also
Chapter 2. Electron Microscopy of Macromolecular Assemblies
be used: in that case, the CTF is evenly sampled by the fine grid of the reciprocal lattice (see the display of the computed Fourier transform of a PhoE crystal, Fig. 2.12). The first use of optical diffraction as a means of determining the CTF and its dependence upon the defocus and axial astigmatism goes back to Thon (1966). Since then, numerous other electron optical effects have been measured by optical diffraction: drift (Frank, 1969), illumination
Fig. 2.12. Fourier transform of PhoE crystal (unit cell size 130 × 150 Å) after two passes of "lattice unbending" (see Downing, 1990). The size of the spots is a measure of the ratio between reflection amplitude and background. Although the amplitude fluctuates in its own right, due to the variation of the structure factor, its main visible modulation, which follows the circular pattern, is created by the contrast transfer function. (Electron optical parameters: Δz = 2400 Å, Cs = 2 mm, U = 200 kV.) Unpublished figure, relating to the data presented in Downing (1991). Kindly made available by K. Downing and B. Jap.
source size (Frank, 1976; Saxton, 1977; Troyon et al., 1977), and coma (Zemlin et al., 1978; Zemlin, 1989a). Typke and Koestler (1977) have shown that the entire wave aberration of the objective lens can be mapped out. Some modern electron microscopes are fitted with a digital image readout system (e.g., a CCD camera) and a fast processor capable of producing "diagnostic" Fourier transforms of the image on-line (e.g., Koster et al., 1990). With such a device, the data collection can be made more efficient, since the transmission of an image to the computer can be deferred until satisfactory imaging conditions have been established. Provided that the specimen area is the same, optical and computational diffraction patterns are essentially equivalent.

We conclude this section with a note on terminology. Often the term "power spectrum" is used to refer to the computed absolute-squared Fourier transform of a single image [|F(k)|² in Eq. (2.24)]. However, the correct name for the latter is periodogram in the signal processing field (e.g., Jenkins and Watts, 1968), whereas "power spectrum" is strictly defined only for an ensemble of images, namely as an expectation value of the absolute-squared Fourier transform. Now if the image is sufficiently large, and the structure it shows can be modeled as a stationary stochastic process (in other words, its statistics are translation invariant), then its absolute-squared Fourier transform is in fact a good approximation to the power spectrum of the ensemble to which the image belongs. So even if not strictly correct, the colloquial term "power spectrum" is not too much off the mark.

Another note concerns the display mode. In the digital presentation, it is convenient to display the modulus of the Fourier transform, i.e., the square root of the power spectrum, because of the limited dynamic range of monitor screens.
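The carbon-film argument above can be sketched numerically: a flat-spectrum ("white") object, multiplied by the CTF in Fourier space, yields a power spectrum whose azimuthal average dips at the CTF zeros. The wavelength, defocus, and spherical aberration values below are illustrative assumptions, not taken from the book.

```python
import numpy as np

# Sketch (assumed parameters): a white-noise object stands in for the
# carbon film; imaging multiplies its flat spectrum by the CTF, so the
# power spectrum of the "micrograph" shows the CTF zeros as dark rings.

def ctf(k, defocus=1.5e4, cs=2.0e7, wavelength=0.0197):
    """Phase contrast transfer function sin(gamma(k)); lengths in Angstrom."""
    gamma = np.pi * wavelength * k**2 * (defocus - 0.5 * cs * wavelength**2 * k**2)
    return np.sin(gamma)

n, pixel = 512, 2.0                           # image size (px), pixel size (A)
rng = np.random.default_rng(0)
obj = rng.normal(size=(n, n))                 # flat ("white") object spectrum

kx = np.fft.fftfreq(n, d=pixel)
k = np.sqrt(kx[:, None]**2 + kx[None, :]**2)  # spatial frequency (1/A)

power = np.abs(np.fft.fft2(obj) * ctf(k))**2  # periodogram of the image

# ring near the first CTF zero (k ~ 0.058 1/A) vs. ring near the first
# CTF maximum (k ~ 0.041 1/A): the zero ring carries almost no power
zero_ring = power[(k > 0.056) & (k < 0.060)].mean()
max_ring = power[(k > 0.039) & (k < 0.043)].mean()
```

Averaging such periodograms over many independent patches, as discussed above, would reduce the fluctuations and approach the ensemble power spectrum.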
Logarithmic displays are also occasionally used, but experience shows that these often lead to an unacceptable compression of the dynamic range, rendering zeros of the CTF virtually invisible.

F. Determination of the Contrast Transfer Function

The CTF may either be determined "by hand," specifically by measuring the positions of the zeros and fitting them to a chart of the CTF characteristics, or by using automated computer-fitting methods. In the first method, the CTF characteristics of the microscope (see Section II, C, 3) are computed for the different voltages practically used (e.g., 80 and 100 kV) and displayed on a hard copy, preferably on a scale that enables direct comparison with the print of the optical diffraction pattern. As long as the lens of the microscope remains the same, the CTF
characteristics remain unchanged. (In fact, as was pointed out above, a single set of curves covers all possible voltages and spherical aberrations, provided that generalized coordinates are used.) The determination of defocus involves "sliding" a set of measured radii of CTF zero positions against the CTF characteristics until a match is achieved. Since the slope of the different branches of the characteristics is shallow in most parts of the pattern, the accuracy of this kind of manual defocus determination is low, but it can be improved by using not one but simultaneously two or more diffraction patterns of a series with known defocus increments. Such a set of measurements forms a "comb" which can be slid against the CTF characteristics in its entirety.

Following the second method (Frank et al., 1970; Frank, 1972c; Henderson et al., 1986), the Fourier modulus |F(k)| (i.e., the square root of what would be called the diffraction pattern) is computed from a field of sufficient size. The theoretical CTF pattern is now matched with the experimental power spectrum using an iterative nonlinear least squares fitting method. The parameters being varied are Δz, Δza, φ0, and a multiplicative scaling factor. Thus the error sum is (Frank et al., 1970)
E(Δz, Δza, φ0, c) = Σj {G(kj; Δz, Δza, φ0, c) - |F(kj)|}²,   (2.26)

where

G(kj; Δz, Δza, φ0, c) = (c/|kj|) |sin γ(kj; Δz, Δza, φ0)|,   (2.27)
and c is a simple scaling constant. It has not been practical to include envelope parameters in the two-dimensional fitting procedure. The 1/|k| dependence was also used by other groups (Henderson et al., 1986; Stewart et al., 1993) to match the observed decline in power with spatial frequency. This 1/|k| dependency, obtained for negatively stained specimens, lacks a theoretical basis but accommodates some of the effects discussed by Henderson and Glaeser (1985), such as specimen drift and fluctuating local charging. The error function [Eq. (2.26)] has many local minima. The only way to guarantee that the correct global minimum is found is by trying different starting values for Δz. Intuitively it is clear that smoothing the strongly fluctuating experimental distribution |F(k)| will improve the quality of fit and the speed of convergence of the iterative algorithm. Smoothing can be accomplished by transforming |F(k)| into real space, limiting the "autocorrelation radius," and finally transforming the result back into Fourier space (Frank et al., 1970). A smooth image spectrum is also obtained by dividing the image into small regions pn(r) and computing the average of
the Fourier moduli |Fn(k)| = |F{pn(r)}| (Zhu and Frank, 1994), a method that actually comes close to the definition of the power spectrum, at a given spatial frequency k, as an expectation value of |F(k)|² (see Section II, E).

Another method of fitting, which uses the rotationally averaged, squared Fourier transform (Zhou and Chiu, 1993; Zhu and Frank, 1994), sacrifices the determination of the axial astigmatism for increased accuracy in determining the other parameters. The one-dimensional profile obtained by rotational averaging is first corrected by background subtraction, then the resulting profile is fitted to a product of the transfer function with envelopes representing the effects of partial coherence, chromatic defocus spread, and other resolution-limiting effects. Background correction is accomplished by fitting the minima of |F(k)| (i.e., regions where the Fourier transform of the image intensity should vanish, in the absence of noise) with a slowly varying, well-behaved function of |k|. Zhou and Chiu (1993) used a high-order polynomial function, while Zhu and Frank (1994) were able to obtain a Gaussian fit of the noise background, independent of defocus (Fig. 2.13), and to determine the effective source size characterizing partial coherence as well. Astigmatism can be taken into account in this fitting process, without abandoning the benefit of azimuthal averaging, by dividing the 180° azimuthal range (ignoring the Friedel-related range) into a number of sectors. For each of these sectors, the defocus is then separately determined. However, this method appears to be unreliable when applied to specimens in ice because of the reduced signal-to-noise ratio (Jun Zhu, personal communication, 1994).

Yet another group of techniques attempts to measure the entire CTF characteristics and, along with it, the parameters characterizing energy spread and partial coherence.
These attempts started with the invention of CTF versus Δz measurement using a tilted carbon grid and 1D optical diffraction using a cylindrical lens (Krakow et al., 1974). Frank et al. (1978b) used this method to verify the predicted defocus dependence of the envelope function. Burge and Scott (1975) developed a very elegant method of measurement according to which a large astigmatism is intentionally introduced. As we have seen in the previous section, the effect of a large amount of axial astigmatism is that as the azimuthal angle goes through the 360° range, the astigmatic defocus component Δza sin[2(φ - φ0)] (Eq. 2.5) sweeps forward and backward through the CTF characteristics, producing a hyperbolic appearance of most of the contrast transfer zones. For Δz large enough, the diffraction pattern, either optically derived (Burge and Scott, 1975) or computed (Möbus and Rühle, 1993), will contain a large segment of the CTF characteristics in an angularly
Fig. 2.13. Profiles of Fourier modulus (i.e., square root of "power spectrum") obtained by azimuthal averaging. Vertical, the modulus in arbitrary scale; dashed line, uncorrected profile; solid line, profile after background subtraction. (a) Uranyl acetate stain on carbon (Δz = -1 μm, Q = 0.17); (b) protein specimen on a thin carbon film in ice (Δz = 2.2 μm, Q = 0.09). From Zhu and Frank (1994). Reproduced with permission of Les Editions de Physique, Les Ulis, France.
" c o d e d " form. M 6 b u s a n d Riihle ( 1 9 9 3 ) w e r e able to m a p the hyperbolic p a t t e r n into the c u s t o m a r y C T F versus A z d i a g r a m by a c o m p u t a t i o n a l procedure.
G. Instrumental Correction of the Contrast Transfer Function

Many attempts have been made to improve the instrument so that the CTF conforms more closely with the ideal behavior. In the early 1970s,
several groups tried to change the wave aberration function by a direct manipulation in the back focal plane. Hoppe and co-workers (Hoppe, 1961; Langer and Hoppe, 1966) designed zone plates, to be inserted in the aperture plane, that blocked out all waves giving rise to destructive interference for a particular defocus setting. A few of these plates were actually made, thanks to the early development of microfabrication in Möllenstedt's laboratory. Unwin (1970) introduced a device that builds up an electric charge at the center of the objective aperture (actually a spider's thread) whose field modifies the wave aberration function at low angles and thereby corrects the transfer function in the low spatial frequency range. Thin phase plates designed to retard portions of the wave have also been tried (see Reimer, 1989). The most promising development is a magnetic corrector element that can be used to "tune" the wave aberration function, also boosting the low spatial frequency response if desired (Rose, 1990).

Corrections that are applied after the image has been taken, such as Wiener filtering and the merging of data from a defocus series (see Section II, H), have traditionally not been counted as "instrumental compensation." However, as the computerization of electron microscopes proceeds, the boundaries between image formation, data collection, and on-line postprocessing are increasingly blurred. By making use of computer control of instrument functions, it is now possible to compose output images by integrating the "primary" image collected over the range of one or several parameters. One such scheme, pursued and demonstrated by Taniguchi et al. (1992), is based on a weighted superposition of a defocus series.
H. Computational Correction of the Contrast Transfer Function

As has become clear, the CTF fails to transfer the object information, as represented by the object's Fourier transform, with correct phases and amplitudes. All computational corrections are based on knowledge of the CTF as a function of the spatial frequency, in terms of the parameters Δz, Cs, and Δza occurring in the analytical description of the CTF [Eqs. (2.4), (2.5), and (2.10)]. Additionally, envelope parameters that describe partial coherence effects must be known. The simplest correction of the CTF is by "phase flipping," that is, by performing the following operation on the image transform:
F'(k) = -F(k)  for H(k) < 0,
F'(k) =  F(k)  for H(k) > 0,   (2.28)
which assures that the modified image transform F'(k) has the correct phases throughout the resolution domain. However, such a correction leaves the misrepresentation of the amplitudes unaffected: Fourier components sitting in regions near the zeros of CTF(k) are weighted down, and in the zeros themselves they are not transferred at all.

The Wiener-filtering approach can be described as a "careful division" by the CTF that avoids noise amplification. We seek an estimate F̂(k) that minimizes the expected mean squared deviation from the Fourier transform of the object, F(k):

E[|F̂(k) - F(k)|²] → min,
(2.29)
where E[·] denotes the expectation value computed over an ensemble of images. We now look for a filter function S(k) with the property

F̂(k) = S(k)I(k);
(2.30)
i.e., it yields the desired estimate when applied to the image transform. The solution of this problem, obtained under the assumption that there is no correlation between F(k) and N(k), is the well-known expression
S(k) = H*(k) / [|H(k)|² + PN(k)/PF(k)],
(2.31)
where PN(k) and PF(k) are the power spectra of noise and object, respectively. It is easy to see that the filter function corrects both the phase [since it flips the phase according to the sign of H*(k)] and the amplitude of the Fourier transform. The additive term in the denominator of Eq. (2.31) prevents excessive noise amplification in the neighborhood of H(k) → 0. The disadvantage of using a single micrograph is that a gap in the vicinity of the zeros of the transfer function remains unfilled. It is important to realize that for raw (unaveraged) images, the ratio of the power spectra is on the order of 1, so that these gaps are rather wide. This problem can be solved by using two or more images at different defocus settings (Frank and Penczek, 1995). For two images, we seek F̂(k) = S1(k)I1(k) + S2(k)I2(k), and the filter functions become

Sn(k) = Hn*(k) / [|H1(k)|² + |H2(k)|² + PN(k)/PF(k)]   (for n = 1, 2).   (2.32)
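Phase flipping and the two-image Wiener filter can be sketched in one dimension. The toy transfer functions, noise level, and test object below are assumptions; since the CTFs are real here, H* = H.

```python
import numpy as np

# 1D sketch (assumed parameters) of phase flipping, Eq. (2.28), and the
# two-image Wiener filter, Eq. (2.32).

rng = np.random.default_rng(4)
n = 256
k = np.fft.fftfreq(n)

def ctf(k, dz):
    return np.sin(np.pi * dz * k**2)     # toy transfer function

h1, h2 = ctf(k, 800.0), ctf(k, 1200.0)   # two defocus settings

obj = rng.normal(size=n)
F = np.fft.fft(obj)

sigma = 0.3                              # additive real-space noise level
I1 = np.fft.fft(np.real(np.fft.ifft(h1 * F)) + sigma * rng.normal(size=n))
I2 = np.fft.fft(np.real(np.fft.ifft(h2 * F)) + sigma * rng.normal(size=n))

# phase flipping: corrects the phases only, amplitudes stay distorted
flip = np.real(np.fft.ifft(np.where(h1 < 0, -I1, I1)))

# two-image Wiener filter: zeros of h1 are covered by h2 and vice versa
nsr = sigma**2                           # spectral ratio P_N/P_F (white/white)
den = h1**2 + h2**2 + nsr
est = np.real(np.fft.ifft((h1 * I1 + h2 * I2) / den))

err_flip = np.mean((flip - obj) ** 2)
err_wiener = np.mean((est - obj) ** 2)   # substantially smaller
```

The filter gap at the common low-frequency zeros of both CTFs remains, which is why larger defocus differences (or more images) are used in practice.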
Normally, the zeros of H1 and H2 do not coincide, so that information gaps are entirely avoided. [It should be noted that in the Frank and Penczek application, the images are three-dimensional, resulting from two independent 3D reconstructions, and the spectral noise-to-signal ratio PN/PF is actually much smaller than that for raw images, as a consequence of averaging.]

In the above descriptions of the phase flipping and Wiener filtering approaches to CTF correction, it has been assumed that the specimen has the same scattering properties throughout. If we take into account that there are different atomic species with different scattering properties, we have to go one step back and start with Equation (2.21), which describes the different image components relating to the phase (with transform Or) and amplitude portion (with transform Oi) of the object:

I(k) = 2Or(k) sin γ(k) - 2Oi(k) cos γ(k).

In 1968, Schiske posed the question whether the two different object components can be separately retrieved, by making use of several measurements of I(k) with different defocus settings. An advantage of separating the two components is the enhanced contrast between heavy and light atoms ("heavy/light atom discrimination"; see Frank, 1972c, 1973b) that we expect to find in the amplitude component. One can easily verify that the solution for N = 2 defocus settings is simply
Fr(k) = {I1(k) cos[γ2(k)] - I2(k) cos[γ1(k)]} / {2 sin[γ1(k) - γ2(k)]},

Fa(k) = {I1(k) sin[γ2(k)] - I2(k) sin[γ1(k)]} / {2 sin[γ1(k) - γ2(k)]}.   (2.33)
For N > 2, there are more measurements than unknowns, and the problem can be solved by least squares, resulting in a suppression of noise (Schiske, 1968; Frank, 1972c; Typke et al., 1992). The first application of the Schiske formula was presented by Frank (1972c), demonstrating the enhancement of the features of stained DNA on a carbon film. In a study, Typke et al. (1992) found that small magnification differences associated with the change in defocus produce intolerable effects in the restored images and proposed a method for magnification compensation. Figure 2.14 shows the phase and amplitude portion of a specimen field, restored
from eight images of a focus series. Again it should be emphasized that in all applications to averaged images, the noise is greatly diminished, so that even the straightforward use of Eq. (2.33), derived for N = 2 images, is defensible in those situations. In an approach developed recently, CTF correction is carried out as part of the 3D reconstruction procedure (Zhu et al., 1995). Details will be found in the chapter on three-dimensional reconstruction (Chapter 5, Section IX).
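The N = 2 separation can be checked numerically in one dimension. The toy aberration function and object components below are assumptions; in the noise-free case the recovery is exact wherever the determinant is well conditioned.

```python
import numpy as np

# Sketch (assumed toy data) of the Schiske two-defocus separation:
# given two image transforms I1, I2 with known phase shifts gamma1,
# gamma2, recover the phase part Fr and amplitude part Fa of the object.

rng = np.random.default_rng(5)
n = 256
k = np.fft.fftfreq(n)

def gamma(k, dz):
    return np.pi * dz * k**2             # toy wave aberration (no Cs term)

g1, g2 = gamma(k, 700.0), gamma(k, 1300.0)

Fr = np.fft.fft(rng.normal(size=n))        # phase component of the object
Fa = np.fft.fft(0.2 * rng.normal(size=n))  # weaker amplitude component

I1 = 2 * Fr * np.sin(g1) - 2 * Fa * np.cos(g1)   # Eq. (2.21)-type transforms
I2 = 2 * Fr * np.sin(g2) - 2 * Fa * np.cos(g2)

det = 2 * np.sin(g1 - g2)
safe = np.abs(det) > 1e-3                # avoid division where sin(g1-g2) ~ 0
den = np.where(safe, det, 1.0)
Fr_est = np.where(safe, (I1 * np.cos(g2) - I2 * np.cos(g1)) / den, 0.0)
Fa_est = np.where(safe, (I1 * np.sin(g2) - I2 * np.sin(g1)) / den, 0.0)
```

With noise present, frequencies where sin[γ1(k) - γ2(k)] is small are poorly conditioned, which is why N > 2 images and a least-squares solution improve the result.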
III. Special Imaging Techniques

A. Low-Dose Electron Microscopy

Concern about radiation damage led to careful diffraction measurements at the beginning of the 1970s (Glaeser, 1971). The results were not encouraging: the electron diffraction patterns of two-dimensional crystals formed by L-valine were found to disappear entirely after exposure to 6 e⁻/Å² (Glaeser, 1971). Crystals of the less sensitive adenosine ceased to diffract at a dose of about 60 e⁻/Å², still much lower than the "normal" conditions for taking an exposure. To some extent, low temperature affords protection from radiation damage (Taylor and Glaeser, 1974), by trapping reaction products in situ, but the actual gain (5× to 8× for catalase and purple membrane crystals at -120°C) turned out smaller than expected (Glaeser and Taylor, 1978; Hayward and Glaeser, 1979). At the workshop in Gais, Switzerland (October 1973), the prospect for biological electron microscopy with a resolution better than 1/30 Å⁻¹ was discussed, and radiation damage was identified as the primary concern
Fig. 2.14. Demonstration of Schiske-type restoration applied to a focus series of eight micrographs (ranging from -5400 to +5400 Å) showing proteasomes on carbon embedded in vitreous ice. (a, b) Two of the original micrographs. (c) Restored phase part of the object, obtained by using the micrographs (a, b). (d) Restored phase part, obtained by using four micrographs. (e, f) Amplitude and phase parts of the specimen, respectively, restored from the entire focus series. In the phase part, the particles stand out more strongly than in (c, d) where fewer images were used. The amplitude part (e) reflects the locally changing pattern in the loss of electrons participating in the elastic image formation, due to inelastic scattering or through scattering outside the aperture. Another contribution comes from the fact that a single defocus is attributed to a relatively thick specimen (see Frank, 1973; Typke et al., 1992). The particles are invisible in this part of the restored object. In the phase part (f), the particles stand out strongly from the ice + carbon background on account of the locally increased phase shift. The arrows point to particles in side-view orientations that are virtually invisible in the original micrographs but now stand out from the background. From Typke et al. (1992). Reproduced with permission of Elsevier Science, Amsterdam.
(Beer et al., 1975). The use of averaging to circumvent this hurdle had been suggested early on by Glaeser and co-workers (1971; see also Kuo and Glaeser, 1975). A technique termed "minimum dose microscopy," invented by Williams and Fisher (1970), proved to preserve, at 50 e⁻/Å², the thin tail fibers of the T4 bacteriophage that were previously invisible for doses usually exceeding 200 e⁻/Å². In their pioneering work, Unwin and Henderson (1975) obtained a 1/7 Å⁻¹ resolution map of glucose-embedded bacteriorhodopsin by the use of a very low dose, 0.5 e⁻/Å², along with averaging over a large (10,000 unit cells) crystal field. In the current usage of the term, "low dose" refers to a dose lower than 10 e⁻/Å². Experience has shown that doses larger than that lead to progressive disordering of the material and eventually to mass loss. Susceptibility to radiation damage was found to vary widely among different materials and with different specimen preparation conditions. Changes in the structure of negatively stained catalase crystals were investigated by Unwin (1975). The results of his work indicate that rearrangement of the stain (uranyl acetate) occurs for doses as low as 10 e⁻/Å². The radiation damage studies by Kunath et al. (1984) on single uranyl acetate-stained glutamine synthetase (glutamate-ammonia ligase) molecules came to a similar conclusion, comparing single electronically recorded frames with an interval of 1 e⁻/Å². Frozen-hydrated specimens are strongly susceptible to radiation damage. For those specimens, bubbles begin to appear at doses in the range of 50 e⁻/Å², as first reported by Lepault et al. (1983). A study by Conway et al. (1993) compared the reconstructions of a herpes simplex virus capsid (HSV-1) in a frozen-hydrated preparation obtained with total accumulated doses of 6 and 30 e⁻/Å².
These authors report that although the nominal resolution (as obtained with the Fourier ring correlation criterion; see Chapter 3, Section V, B, 3) may change little, there is a strong overall loss of power in the Fourier spectrum with the fivefold increase of dose. Still, the surface representation of the virus based on a 1/30 Å⁻¹ resolution map shows only subtle changes. The article by Conway et al., incidentally, gives a good overview of the experimental findings to date. For another recent review of this topic, the reader is referred to the article by Zeitler (1990).

Starting with the Unwin and Henderson (1975) paper, several variations of the low-dose recording protocol have been described. In essence, the beam is always shifted or deflected to an area adjacent to the selected specimen area for the purpose of focusing and astigmatism correction, with an intensity that is sufficient for observation. The selected area is exposed only once, for the purpose of recording, with the possible exception of an overall survey with an extremely low dose (on the order of 0.01 e⁻/Å²) at low magnification.

In deciding on how low the recording dose should be chosen, several factors must be considered: (i) The fog level: when the dose on the recording medium becomes too low, the density variations disappear in the background. Control of this critical problem is possible by a judicious choice of the electron-optical magnification (Unwin and Henderson, 1975). Indeed, magnifications in the range of 40,000× to 60,000× are now routinely used following the example of Unwin and Henderson (1975). (ii) The ability to align the images of two particles: the correlation peak due to "self-recognition" or "self-detection" (Frank, 1975; Saxton and Frank, 1977; see Section III, D, 1 in Chapter 3) of the motif buried in both images must stand out from the noisy background, and this stipulation leads to a minimum dose for a given particle size and resolution (Saxton and Frank, 1977). (iii) The statistical requirements: for a given resolution, the minimum number of particles to be averaged, in two or three dimensions, is tied to the recording dose (Unwin and Henderson, 1975; Henderson, 1995). In planning the experiment, we wish to steer away from a dose that leads to unrealistic numbers of particles. However, the judgment of what is realistic is in fact changing rapidly as computers become faster and more powerful and as methods for automated recording and particle selection are being developed (see Chapter 3, Section II, C).
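The statistical argument in (iii) can be sketched with Poisson counting statistics alone. The pixel area, feature contrast, and target signal-to-noise ratio below are invented illustration values, not figures from the book.

```python
import math

# Back-of-the-envelope sketch (assumed numbers): with shot noise, the
# electron count per pixel is dose * pixel_area, the relative fluctuation
# is 1/sqrt(count), and averaging M images improves the SNR by sqrt(M).

def particles_needed(dose, pixel_area=4.0, contrast=0.05, target_snr=3.0):
    """Rough count of images to average so that a feature of the given
    fractional contrast reaches the target SNR. dose in e-/A^2,
    pixel_area in A^2."""
    n_e = dose * pixel_area                 # electrons per pixel
    snr_single = contrast * math.sqrt(n_e)  # SNR in a single image
    return (target_snr / snr_single) ** 2   # averaging gain is sqrt(M)

m10 = particles_needed(10.0)
m5 = particles_needed(5.0)   # halving the dose doubles the required number
```

This inverse proportionality between dose and the number of particles is the trade-off referred to in the text.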
B. Spot Scanning

Recognizing that beam-induced movements of the specimen are responsible for a substantial loss in resolution, Henderson and Glaeser (1985) proposed a novel mode of imaging in the transmission electron microscope whereby only one single small area, in the size range of 1000 Å, is illuminated at a time. This spot is moved over the specimen field on a regular (square or hexagonal) grid. The rationale of this so-called spot scanning technique⁷ is that it allows the beam-induced movement to be kept small since the ratio between energy-absorbing area and supporting perimeter is much reduced. After the successful demonstration of this technique with the radiation-sensitive materials vermiculite and paraffin (Downing and Glaeser, 1986; Bullough and Henderson, 1987; Downing,

⁷ The spot scanning technique was already used by Kunath et al. (1984) in experiments designed to study the radiation sensitivity of macromolecules. It was simply a rational way of organizing the collection of a radiation damage series (which the authors called a "movie") from the same specimen field. In hindsight, the extraordinary stability of the specimen (0.05 Å/sec) must be attributed to the unintended stabilizing effect of limiting the beam to a 1-μm spot.
1991), numerous structural studies have made use of it (e.g., Kühlbrandt and Downing, 1989; Soejima et al., 1993). Computer-controlled electron microscopes now contain spot scanning as a regular feature and also allow dynamic focus control (Zemlin, 1989b) so that the entire field of a tilted specimen can be kept at one desired defocus setting. The attainment of constant defocus across the image field is of obvious importance for the processing of images of tilted 2D crystals (see Downing, 1992), but it also simplifies the processing of single macromolecules following the protocol (Chapter 5, Sections III, E and V) of the random-conical reconstruction (see Typke et al., 1990). On the other hand, it may be desirable to retain the focus gradient in more sophisticated experiments where restoration or heavy/light atom discrimination are used (see Section II, above): the micrograph of a tilted specimen essentially produces a focus series of single particles. At a typical magnification of 50,000× and a tilt angle of 60°, the defocus varies by 17,000 Å across the field captured by the micrograph (assumed as 50 mm in width) in the direction perpendicular to the tilt axis.
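The defocus spread quoted above follows from simple geometry, which can be checked directly (numbers as given in the text):

```python
import math

# At magnification M, a micrograph of width w covers w/M on the specimen;
# at tilt angle theta, the defocus varies by (w/M) * tan(theta)
# perpendicular to the tilt axis.

M = 50000                  # magnification
w = 50e7                   # micrograph width: 50 mm expressed in Angstrom
theta = math.radians(60.0)

field = w / M                        # object field: 10,000 A
dz_range = field * math.tan(theta)   # ~17,300 A, the ~17,000 A in the text
```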
C. Energy Filtering

The weak phase object approximation introduced in Section II, B enabled us to describe the image formation in terms of a convolution integral. In Fourier space, the corresponding relationship between the transforms of image contrast and object is very simple, allowing the object function to be recovered by computational methods described in Section II, H. However, this description of image formation is valid only for the bright field image formed by elastically scattered electrons. Inelastically scattered electrons produce another, very blurred image of the object that appears superimposed on the "elastic image." The formation of this image follows more complicated rules (e.g., Reimer, 1989). As a consequence, the attempt to correct the image for the effect of the contrast transfer function (and thereby retrieve the true object function) runs into problems, especially in the range of the low spatial frequencies, where the behavior of the inelastic components is opposite to that expected for the elastic components. Thus, CTF correction based on the assumption that only elastic components are present will, in the attempt to undo the underrepresentation of these components, amplify the undesired inelastic component as well, leading to an incorrect, blurred estimate of the object. This problem is especially severe in the case of ice-embedded specimens, for which the cross-section for inelastic scattering exceeds that for elastic scattering.
Another problem caused by inelastic scattering is that it produces a decrease in the signal-to-noise ratio of the image to be retrieved (Langmore and Smith, 1992; Schröder et al., 1990). This affects the accuracy of all operations, to be described in the following chapters, that interrelate raw data, e.g., alignment (Chapter 3, Section III), multivariate statistical analysis (Chapter 4), and angular refinement (Chapter 5, Section VIII).

The problems outlined above are overcome by the use of energy-filtering electron microscopes (Langmore and Smith, 1992; Smith and Langmore, 1992; Schröder et al., 1990). In these, the electron beam passes an electron spectrometer at some stage after passing the specimen. The spectrometer consists of a system of magnets that separate electrons spatially on a plane according to their energies. By placing a slit into this energy-dispersive plane, one can mask out all inelastically scattered electrons, allowing only those to pass that have lost marginal amounts (0 to 15 eV) of energy ("zero-loss window"). In practice, the energy filter is either placed into the column in front of the projective lens (e.g., the Omega filter; Lanio, 1986) or added below the column as the final electron optical element (Krivanek and Ahn, 1986). The performance of these different types of filters has been compared by Uhlemann and Rose (1994). Langmore and Smith (1992) and Zhu et al. (1995) showed that CTF correction of the entire spatial frequency band can be achieved when energy-filtered data are used. Examples of structural studies on frozen-hydrated specimens in which energy filtering has been employed with success are found in the work on the structure of decorated actin (Schröder et al., 1993) and the ribosome (Frank et al., 1995a, b).
I. Introduction

A. The Sources of Noise

As one attempts to interpret an electron microscope image in terms of an object that might have given rise to it, one is faced with several obstacles: the image lacks clarity and definition, its resolution is limited, and there is evidence of instrument distortions. Part of the problem, to be addressed by the application of averaging, relates to the fact that the image contains a large amount of extraneous information not related to the object. In analogy to its definition in one-dimensional signal processing, the term noise is used to denote all contributions to the image that do not originate with the object.

We distinguish stochastic and fixed-pattern noise. In the former case, it is impossible to predict the value of the noise contribution to a given pixel but only its expected value, provided its statistical distribution is known; in the latter case, however, the value of the noise at every pixel position is the same every time a measurement is made. [Fixed-pattern noise will not be considered in the following because it can be easily eliminated by image subtraction (or division, in the case of multiplicative noise; see below) using an appropriate control image.] We further distinguish between signal-dependent and -independent noise. Stochastic noise that is signal-dependent has a spatially varying statistical distribution as dictated by the spatially varying signal. Yet another distinction is important: in the efforts to eliminate noise one must know in what way the noise combines with the signal part of the image, the most important cases being additive and multiplicative noise. Another classification relates to the imaging step from which the noise originates, or, in another way of speaking, where the noise source is
located. We will first list the various noise sources, following the imaging pathway from the biological object to the digitized image ready for processing. For each source, as we go along, we will indicate the validity of the common additive noise model.8

Macromolecules are often prepared by the negative staining technique, using a carbon film as support. This technique, besides being limited to rendering the shape of the molecule only, entails two sources of noise: (i) in the process of drying, the heavy metal salts used as stain precipitate in the form of small crystals. The irregular distribution of these crystals, and the variations in thickness of the stain layer as a whole, are an important source of noise. (ii) The carbon layer also possesses a structure whose image appears superposed on the image of the macromolecule. Since the structure of each carbon area is unique, its contribution to the image cannot be eliminated by a simple subtraction technique. Within the limits of the weak phase approximation (see Chapter 2, Section II, B), this so-called structural noise is additive, simply because the projection of two added structures is equal to the sum of the projections of each component structure. Note that the idea of eliminating the structure of the support by subtraction of two images of the same area, one with and one without the molecule, was briefly discussed in the early 1970s under the name of "image difference method" (Hoppe et al., 1969; Langer et al., 1970). This discussion was motivated by the surprising accuracy with which the images of carbon could be aligned by cross-correlation. However, the realization of the image difference method was fraught with difficulties because of the rapid buildup of contamination.
Nowadays the better vacuum of modern instruments and the availability of cryo-stages would give this method a better chance, but the ensuing development of averaging techniques has facilitated the separation of single molecule projections from their background, making experimental subtraction techniques obsolete. The recording process, whether by means of a photographic plate or by a direct image pickup device, reveals another source of noise, which is due to the quantum nature of the electron: the shot noise. This noise portion is caused by the statistical variations in the number of electrons that impinge upon the recording target and follows Poisson statistics. The size of this contribution relative to the signal in a given pixel depends upon the local average number of electrons; hence it falls in the category of signal-dependent noise. Because of the importance of minimizing radiation damage, the shot noise is one of the most serious limitations in the imaging of macromolecules. The photographic recording of electrons is a complicated process involving scattering and the ensuing formation of a shower of secondary electrons in the emulsion. In the end, after development, every primary electron gives rise to a fuzzy disk-shaped area of silver grains. Photographic noise or photographic granularity is caused by the resulting irregular distribution of silver grains in the electron micrograph (see Downing and Grano, 1982; Zeitler, 1992). It is important to realize, in assessing the contribution of photographic noise to the total noise, as well as the deterioration of the image quality due to the photographic recording, that the electron-optical magnification is a free parameter in the experiment: an increase in magnification will cause the object spectrum to contract relative to the noise spectrum. Thus it is possible to choose the magnification such that the effect of the photographic noise is minimal. The digitization process acts both as a filter, removing very short-range components of the photographic granularity, and as an additional source of noise, the digitization noise, which is due to "density binning," the conversion of a principally continuous optical density signal into one that is discrete-valued. However, by using a scanner with large dynamic range, this latter type of noise can be virtually eliminated. (For a discussion of the separate subject of the conditions for the representation of an image by sampling, see Section II, A.)

8 This section is essentially an expanded version of the section on noise in the review on image processing in electron microscopy by Frank (1973c).
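The Poisson character of the shot noise, and its signal dependence, can be illustrated numerically. The following sketch (in Python with NumPy; the dose values are arbitrary illustrative numbers, not taken from the text) draws Poisson-distributed electron counts and confirms that the signal-to-noise ratio grows as the square root of the mean count:

```python
import numpy as np

rng = np.random.default_rng(0)

# For Poisson statistics the variance equals the mean, so the per-pixel
# signal-to-noise ratio is mean/std = sqrt(mean): shot noise is signal-dependent.
for dose in (10.0, 100.0):                    # mean electron counts per pixel
    counts = rng.poisson(dose, size=100_000)  # simulated recorded counts
    snr = counts.mean() / counts.std()
    print(f"dose {dose:5.0f} e/pixel: SNR = {snr:.2f}, sqrt(dose) = {np.sqrt(dose):.2f}")
```

Doubling the dose thus buys only a factor of the square root of 2 in signal-to-noise ratio, which is why radiation damage, rather than the recording step, sets the practical limit.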
B. Principle of Averaging: Historical Notes

Averaging of images for the purpose of noise elimination may be achieved by photographic superposition. The history of image averaging starts with the curious attempts by Galton (1878) to construct the face of the "average criminal"9 at a time obsessed with the idea, going back to Johann Kaspar Lavater at the end of the 18th century, of linking traits of character to those of physiognomy. Elmar Zeitler (1990) has reproduced an illustration showing the face of "the average Saxonian recruit" obtained at the turn of the century by photographic averaging from 12 soldiers. Faces continued to be favorite objects for demonstrating the averaging technique in electron microscopy (see Fig. 3.1).

9 Galton's frustrated attempts were recalled recently in an article on the attractiveness of faces (Perrett et al., 1994; see also the accompanying News and Views article by Etcoff (1994)). Faces produced by averaging were found to be more pleasing than any of the constituent images.
In electron microscopy, Markham et al. (1963, 1964) introduced the method of averaging by photographic superposition and applied it to structures with both translational and rotational symmetry. According to the theory (see below), averaging over a correctly aligned set of images is equivalent to the result of Fourier filtration of a regular 2D montage of that set using appropriately placed δ-functions as mask. Ottensmeyer and coworkers (1972, 1977) used this principle to obtain averages of single particles: they arranged the individual images (in this case dark-field images of small biological molecules) into a regular gallery and subjected the resulting artificial two-dimensional "crystal" to optical filtration. At this point it is instructive to see the relationship between the formation of an average over N noisy realizations p_i(r) of a projection p(r), i = 1, ..., N,

    p_i(r) = p(r) + n_i(r),    (3.1)

and the formation of an average by Fourier filtering.10 Here n_i(r) denotes the additive noise function. If we arrange the N aligned noisy versions of the projection into a montage of L rows with K images each, so that i = k + (l - 1)K; K · L = N, then we formally obtain the image of a crystal. This image can be written as

    m(r) = \sum_{l=1}^{L} \sum_{k=1}^{K} p(r - ka - lb) + n(r),    (3.2)

where n(r) is the noise function resulting from placing the independent noise functions n_i(r) side by side in the montage, and a, b are orthogonal "lattice vectors" of equal length |a| = |b| = a. On the other hand, the average of the N realizations is

    \bar{p}(r) = (1/N) \sum_{i=1}^{N} p(r) + (1/N) \sum_{i=1}^{N} n_i(r).    (3.3)

For scaling purposes, this average was placed into an image that has the same size as the montage, and the empty area was "padded" with zeros.

10 The ultimate objective of averaging is the measurement of the structure factor, the 3D Fourier transform of the object. According to the projection theorem (see Section II, A in Chapter 5), each projection contributes a central section to this transform.
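The practical content of Eq. (3.3) is that the signal term is unaffected by averaging while the additive noise term is reduced by a factor of sqrt(N). A minimal numerical sketch (Python/NumPy; the Gaussian blob standing in for the projection p(r) is an arbitrary test object):

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary test "projection" p(r): a Gaussian blob on a 64 x 64 grid.
y, x = np.mgrid[-32:32, -32:32]
p = np.exp(-(x**2 + y**2) / (2 * 8.0**2))

sigma = 2.0                                # std of the additive noise n_i(r) of Eq. (3.1)
for N in (1, 16, 256):
    stack = p + rng.normal(0.0, sigma, size=(N, 64, 64))
    avg = stack.mean(axis=0)               # the average of Eq. (3.3)
    residual = (avg - p).std()             # noise left in the average
    print(f"N = {N:4d}: residual noise = {residual:.3f}, sigma/sqrt(N) = {sigma/np.sqrt(N):.3f}")
```

The residual noise falls from sigma for a single image to sigma/16 for N = 256, while the blob itself is recovered unchanged.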
The Fourier transforms of both expressions (3.2) and (3.3) can be shown to be equivalent on the points of the reciprocal lattice, defined by multiples of the reciprocal vectors a*, b*. These are defined such that for a perpendicular lattice,

    |a*| = 1/|a|,  |b*| = 1/|b|,    (3.4)
and a* is perpendicular to b, b* is perpendicular to a. For a demonstration of this relationship, Fig. 3.2a shows an 8 × 8 set of noisy aligned images of the calcium release channel (Radermacher et al., 1994b) placed into a montage which in turn was padded into a field twice the size. Figure 3.2c shows the average of these 64 images, padded into a field of the same size as in Fig. 3.2a. The Fourier transform of the crystal (Fig. 3.2b) agrees with the Fourier transform of the average (Fig. 3.2d) on the points of the reciprocal lattice; the former is a sampled version of the latter. By quasi-optical filtration, i.e., masking out everything but the reciprocal lattice points, and subsequent inverse Fourier transformation, an image that contains the average image repeated on the original lattice is created. Computer averaging by computational filtration of the reciprocal lattice of a two-dimensional crystal was first accomplished independently by Nathan (1970) and Erickson and Klug (1970) with images of catalase. The mathematically equivalent optical filtration of electron micrographs had been pioneered 4 years earlier by Klug and De Rosier (1966). Glaeser et al. (1971) demonstrated, by the use of a computational model (a checkerboard object), how the repeating motif of a two-dimensional crystal can be recovered from an image taken with extremely low dose. Although the principle underlying this recovery was well understood, the dramatic visual demonstration was an inspiring landmark.
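The sampling relationship between the two transforms can be checked directly. In the following sketch (Python/NumPy, noise-free for clarity, with an arbitrary random array standing in for the aligned projection), the transform of a K × K montage vanishes off the reciprocal lattice and agrees there, up to the factor K squared, with the transform of the zero-padded average:

```python
import numpy as np

rng = np.random.default_rng(2)

a, K = 16, 4                       # tile size and montage dimension (K x K tiles)
p = rng.normal(size=(a, a))        # stand-in for one aligned, noise-free projection

montage = np.tile(p, (K, K))               # the "crystal" of Eq. (3.2)
padded = np.zeros((K * a, K * a))
padded[:a, :a] = p                         # the average, padded to the montage size

M = np.fft.fft2(montage)
Q = np.fft.fft2(padded)

# On the reciprocal lattice (every K-th Fourier pixel) the two transforms agree
# up to the factor K**2; everywhere else the montage transform is zero.
assert np.allclose(M[::K, ::K], K**2 * Q[::K, ::K])
off_lattice = M.copy()
off_lattice[::K, ::K] = 0
assert np.allclose(off_lattice, 0.0, atol=1e-6)
print("montage transform = sampled transform of the padded average")
```

This is exactly the statement that the crystal transform is a sampled version of the continuous transform of the single (padded) average, as illustrated in Fig. 3.2.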
C. The Role of Two-Dimensional Averaging in the Three-Dimensional Analysis of Single Molecules

Historically, two-dimensional averaging of molecules presenting the same view has been an important first step in the development of methods for extraction of quantitative information from single macromolecules (e.g., Frank et al., 1978a, 1981a; van Heel and Frank, 1981; Verschoor et al.,

Fig. 3.1. Averaging applied to the participants of the 1977 EMBO course on Image Processing of Electron Micrographs (at the Biozentrum Basel, March 7-18, 1977, in Basel, Switzerland, organized by U. Aebi and P. R. Smith). (a) Collage of portraits; (b) image obtained by photographic superimposition of 22 portraits using the position of the eyes as an alignment aid.
1985). Even though most techniques of three-dimensional (3D) reconstruction from single molecules use unaveraged projections, the analysis of a molecule in single particle form still normally starts with the averaging of molecules occurring in selected, well-defined views. This is because averaging presents a fast way of assessing the quality of the data and estimating the potential resolution achievable by 3D reconstruction from the same specimen and under the same imaging conditions. Such an analysis allows the strategy of data collection for 3D reconstruction to be mapped out, as it will answer the following questions: How many distinct views occur with sufficient frequency to allow 3D reconstruction? Is there evidence for uncontrolled variations (from one micrograph to the next, from one grid to the next, etc.)?
D. A Discourse on Terminology: Views versus Projections Before coming to the subject of this section, it is necessary to clarify the terms that will be used. In the literature, the term "view" is used in two different ways: the first usage refers to the distinct appearance of a molecule lying in a certain orientation (as in the phrase "the top view of the hemocyanin molecule"), whereas the second usage refers to the actual realization of the image of a molecule presenting that view (e.g., "250 top views of the hemocyanin molecule were averaged"). These two manners of speaking are not compatible with each other. It seems more appropriate to reserve "view" for the generic orientation-related appearance of a molecule, not for each of its realizations. We will therefore use the term "view" strictly in the former sense: what we observe in the electron micrograph is an image, or a projection, of a molecule that lies in a certain orientation on the grid, presenting a particular view. Thus, in this manner of speaking, there are exactly as many distinguishable views as there are distinguishable orientations of a molecule on the specimen grid. On a
Fig. 3.2. Demonstration of the equivalence between Fourier filtering of a pseudocrystal [generated by arranging the projection of the calcium release channel (Radermacher et al., 1994b) on a regular lattice] and direct averaging of the molecules. (a) Pseudocrystal containing 8 × 8 noisy 64 × 64 images of the channel; (b) Fourier transform of images in (a) after "padding" into a field twice as large; only the 512 × 512 center of the power spectrum is shown. (c) Average of the 8 × 8 images, placed at the center of a 512 × 512 field; (d) Fourier transform of the image in (c) after padding as in (b). It is seen that the Fourier transform in (b) is simply a sample of the continuous Fourier transform of the average image. It represents the average (c) repeated on the lattice on which the images were arranged in (a). Extraction of Fourier components at the precise reciprocal lattice positions (e) and subsequent Fourier synthesis renders the average noise-free (f).
given micrograph, on the other hand, there may be hundreds of molecules presenting the same view.
E. Origins of Orientational Preference

Due to orientational preferences of the molecule, certain views are predominant in the micrograph. Observations of such preference have been made ever since the negative staining technique was invented. For instance, the 50S ribosomal subunit of Escherichia coli is seen in the crown view and the kidney view (Tischendorf et al., 1974). Limulus polyphemus (horseshoe crab) hemocyanin occurs in several orientations, giving rise to a remarkable diversity of views, classified as the pentagonal, ring, cross, and bow tie views (Lamy et al., 1982) (Fig. 3.3). The ratios between the average numbers of molecules observed in the different orientations are constant for a given molecule and preparative method. For instance, a table with the statistics of views observed in the case of the Androctonus australis (scorpion) hemocyanin lists 540 particles showing the top view, 726 the side views, and 771 the 45° view (Boisset et al., 1990b). A change in the method of carbon film preparation (e.g., glow discharging) or in the choice of stain used often leads to a change in those ratios. Such observations indicate that the orientational preferences are the result of a complex interplay between the molecule, the sample grid, and the stain. In experiments in which sandwiching is used (as in most studies using the random-conical reconstruction method with negatively stained specimens; see Frank et al. (1988a) and Section I, C in Chapter 2), the interaction between the two layers of carbon introduces an additional force usually favoring orientations that reduce the extension of the particle normal to the sample grid. Both surface charges and features of surface topology are important determinants of orientation.
Because large areas of contact are energetically preferred, the stability of the molecule in a particular orientation can often be understood "intuitively," by reference to the behavior of a physical model on a horizontal plane under the influence of gravity. Boisset et al. (1990a) have developed a program that predicts points of stability in an angular coordinate system, provided that a complete description of the shape of the molecule is given. The philosophy of this calculation is as follows: the molecule is modeled from its subunits. A sphere entirely enclosing the molecule is placed such that its center coincides with the molecule's center of gravity. For 360 × 360 orientations, covering the entire angular space, a plane is placed tangential to the sphere. In each of these orientations, the plane is moved toward the molecule until it touches it. In the touching position of the plane, the (perpendicular) distances
Fig. 3.3. Micrographs of Limulus polyphemus (horseshoe crab) hemocyanin molecule negatively stained, presenting characteristic views, and explanation of these views in terms of a model placed in different orientations. The views are classified according to the Lamy et al. (1982) terminology as follows: (a, f) cross view; (b, g) bow tie view; (c, h) ring view; (d, i) asymmetric pentagon view; and (e, j) symmetric pentagon view. From Lamy (1987). Reproduced with permission of Academic Press.
between the plane and each of the voxels within the molecule model are summed, giving a number E_x that is termed "energy." From this number, an "energy index" EI is calculated as follows:

    EI = (E_x - E_min)/(E_max - E_min) × 100,    (3.5)

where E_max and E_min are the maximum and minimum of all E_x values found. Figure 3.4 shows some preferred orientations of the Scutigera coleoptrata hemocyanin molecule computed by this method. Interestingly, the program predicts all known experimental projections satisfactorily, but fails to explain the observed relative frequencies. For instance, two of the views represent more than 95% of all observed molecules, yet the corresponding energy index values are not the lowest on the table computed by the authors. This observation clearly indicates that specific interactions of the molecule with the carbon film are at play. Boisset et al. (1990a) also pointed out that the existence of the so-called 45° view exhibited by chelicerate 4 × 6-meric hemocyanins, first observed by van Heel et al. (1983), reflects the importance of interactions with the carbon film, since it depicts the molecule in a position that leads to a rather high energy index in terms of the measure given by Eq. (3.5). In fact, in the case of the A. australis hemocyanin, the frequency of the 45° view was found to depend on the hydrophobicity of the carbon film (Boisset et al., 1990a). In the case of frozen-hydrated preparations, the air-water interface plays a strong role in defining the orientation of the molecule (Dubochet et al., 1985, 1988; Wagenknecht et al., 1990). Judged from the point of view of the simplistic "gravitational" stability model, the A. australis hemocyanin (Boisset et al., 1994b, 1995), the calcium release channel (Radermacher et al., 1994a, b), and other large macromolecular assemblies studied in vitreous ice behave counterintuitively, at least for a fraction of the molecules, in assuming orientations where the molecules "stand on their heads." For example, the A.
australis hemocyanin is rarely seen in the top view that is characteristic for the molecule in negatively stained preparations, even though the corresponding orientation of the molecule in the water layer leads to maximum contact with the air-water interface. For those molecules, specific interactions with localized surface charges apparently far outweigh other, nonspecific interactions. Another curious phenomenon is the occasional strong asymmetry in the occurrence of "flip" versus "flop" facing molecules. This phenomenon was first reported by Dubochet et al. (1988), who observed that the "group of nine" fragment of the adenovirus exhibits only a left-handed view in the ice preparation.
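The tangent-plane calculation behind Eq. (3.5) is easy to sketch. The following toy version (Python/NumPy) replaces the real subunit model with a flat brick of voxels and uses a coarse angular grid instead of the 360 × 360 grid of Boisset et al.; it illustrates the principle only, not the published program:

```python
import numpy as np

# Toy molecule model: voxel coordinates of a flat brick, centered on its
# center of gravity (the real program is built from the actual subunits).
xs, ys, zs = np.mgrid[-10:11, -6:7, -3:4]
voxels = np.stack([xs.ravel(), ys.ravel(), zs.ravel()], axis=1).astype(float)

energies = {}
for t in np.linspace(0.0, np.pi, 18):                # coarse angular grid
    for ph in np.linspace(0.0, 2 * np.pi, 36, endpoint=False):
        n = np.array([np.sin(t) * np.cos(ph), np.sin(t) * np.sin(ph), np.cos(t)])
        d = voxels @ n                    # signed distances along the plane normal
        energies[(t, ph)] = np.sum(d - d.min())   # E_x: summed distances from the touching plane

E = np.array(list(energies.values()))
EI = (E - E.min()) / (E.max() - E.min()) * 100.0  # energy index, Eq. (3.5)

best_t, best_ph = min(energies, key=energies.get)
print(f"most stable support-plane normal: theta = {np.degrees(best_t):.0f} deg")
```

As expected, the brick is most stable lying on its largest face (plane normal along z, theta = 0); and, as the text stresses, such a purely geometric index cannot account for specific molecule-film interactions.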
In experiments with the random-conical reconstruction method (Chapter 5, Section V), which relies on the occurrence of stable views, it is useful to have some control over the orientational preference of the molecule investigated. The inclusion of a thin carbon film in the specimen preparation is one of the parameters that can be varied. Apparently, the background structure added by the carbon does not interfere with the alignment (Section III) and classification (Chapter 4, Section IV). Paradoxically, inclusion of the carbon film, intended to produce stronger orientational preferences, was found to increase rather than decrease the number of orientations compared with pure ice, in the case of the E. coli 70S ribosome (Robert Grassucci, personal communication, 1991). How accurately are the orientations defined? Given the many factors that affect the molecule's orientation, it is clear that any "defined" orientation in reality possesses some range of uncertainty. The surfaces of contact between molecule and support are rough and nonplanar on the scale of molecular dimensions (Glaeser, 1992a, b; Butt et al., 1991). As studies in the 1970s indicated (Kellenberger et al., 1982), carbon films below a certain thickness appear to be malleable, forming a "seat" for the molecule. In addition, carbon films used in a cryo-environment may warp because of the difference in thermal expansion between carbon and copper (Booy and Pawley, 1992). Only recently, molybdenum grids with matching expansion coefficients have been introduced in an effort to overcome this problem (Glaeser, 1992b). Schmutz et al. (1994) introduced a very sensitive method of monitoring the flatness of support films, using reflected light microscopy. Investigation of carbon films with this method reveals a high occurrence of wrinkling, even when molybdenum grids are used.
However, the maximum angle observed is on the order of 1°, enough to interfere with the attainment of highest resolution from thin 2D crystals, but entirely unproblematic for single particles at the present resolutions (< 1/20 Å⁻¹) obtained. Because of these effects, a range of views, rather than a single view, is effectively observed: consequently, the different realizations of a "preferred view" often look as though they come from molecules that are deformed, while they are in fact produced by small changes in the orientation of the molecule. When the molecule assumes two stable positions separated by a small change in orientation, the molecule is said to rock. The rocking of the hemocyanin molecule, discovered in the first application of correspondence analysis (van Heel and Frank, 1981; Chapter 4, Section II) in electron microscopy, can be explained by a noncoplanar arrangement of the four hexameric building blocks. However, even without rocking effects, the variations in orientation around the preferred view are pronounced. Only recently, refinement techniques have revealed
the extent of these angular deviations from the average view (see Chapter 5, Section VIII, D).
II. Digitization and Selection of Particles

A. The Sampling Theorem
According to the Whittaker-Shannon theorem (Shannon, 1949), a band-limited continuous function can be represented by a set of discrete measurements ("the samples") taken at regular intervals. It can be shown that the original function can then be entirely reconstructed from its samples (see the conceptually simple explanation in Moody (1990)). An image, regarded as a two-dimensional function, requires sampling with a step size of 1/(2B) if B is the band limit in Fourier space. In practice, the resolution11 of biological specimens in single particle form rarely exceeds 1/20 Å⁻¹, and so a sampling step equivalent to 10 Å on the object scale would already satisfy the stipulation of the sampling theorem. However, such "critical sampling" makes no allowance for the subsequent deterioration of resolution which is inevitable when the digitized image has to be rotated and shifted, each step involving interpolation and resampling.

11 Resolution is a quantity in Fourier space and hence has dimension Å⁻¹. This Fourier-based resolution can be linked (see Appendix in Radermacher, 1988) to a "point-to-point" distance in real space by the Rayleigh criterion (see Born and Wolf, 1975). Rayleigh considered the diffraction-limited images of two points, which are Airy disks, each represented by the intensity distribution [J_1(2πrR)/(2πrR)]², where J_1 is the first-order Bessel function, r is the radius, and R is the radius of the diffracting aperture. According to the criterion, the two points separated by distance d_0 are just resolved when the maximum of one Airy disk coincides with the minimum of the second Airy disk. This critical distance turns out to be d_0 = 0.6/R. If we interpret R as the radius of the circular domain within which Fourier terms contribute to the crystallographic Fourier synthesis ("crystallographic resolution"), then we can say that the Rayleigh point-to-point resolution, 1/d_0, is 1.67 times the crystallographic resolution. Colloquially, and somewhat confusingly, the real-space quantity 1/resolution is also often termed "resolution." Just to eliminate this confusion, we will use the term "resolution distance" when referring to the quantity 1/resolution. Hence, if we compare the distance d_0 and 1/R, we arrive at the factor 0.6 (Radermacher, 1988): the point-to-point resolution distance according to Rayleigh is 0.6 times the inverse of the crystallographic resolution.
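The content of the sampling theorem can be verified numerically: a function band-limited to B and sampled at the critical step 1/(2B) is recoverable by sinc interpolation. In the sketch below (Python/NumPy), the test signal is an arbitrary illustrative choice, with the band limit matching the 1/20 Å⁻¹ and 10 Å figures quoted in the text:

```python
import numpy as np

B = 0.05                      # band limit, in units of 1/Angstrom (1/20 Angstrom^-1)
step = 1.0 / (2 * B)          # critical sampling step: 10 Angstrom

def f(x):
    # Band-limited test signal: both frequencies (0.02, 0.04) lie below B.
    return np.cos(2 * np.pi * 0.02 * x) + 0.5 * np.cos(2 * np.pi * 0.04 * x + 1.0)

n = np.arange(-500, 501)      # sample indices
samples = f(n * step)

# Whittaker-Shannon reconstruction at off-grid points by sinc interpolation.
x = np.linspace(-100.0, 100.0, 41)
recon = np.array([np.sum(samples * np.sinc((xi - n * step) / step)) for xi in x])

err = np.abs(recon - f(x)).max()
print(f"max reconstruction error from critically sampled data: {err:.4f}")
```

With the cardinal series truncated at ±500 samples the reconstruction is exact up to a small truncation error; a frequency above B would, by contrast, be aliased and irrecoverable.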
Fig. 3.4. Modeling of the stain pattern observed with single molecules of Scutigera coleoptrata hemocyanin. (a-d) Four of the 12 stable views of the molecule as depicted by a model; (e-h) computed patterns of stain exclusion, assuming a zero-density molecule surrounded by a high-density contrasting agent. Areas where the molecule touches the support grid appear white. From Boisset et al. (1990a). Reproduced with permission of Éditions Scientifiques Elsevier, Paris.
Interpolation is always tantamount to a weighted averaging over the pixels neighboring the point for which the value is needed. For example, the four-point bilinear interpolation (e.g., Aebi et al., 1973; Smith and Aebi, 1973) requires the computation of

    X = a(1 - x)(1 - y) + b·x(1 - y) + c(1 - x)·y + d·x·y,    (3.6)
where X is the interpolated value, a, b, c, and d are the values of the pixels surrounding the new point, and (x, y) are the fractional values of its coordinates, with 0 < x < 1 and 0 < y < 1. Another bilinear interpolation scheme given by Smith (1981) involves weighted averaging of three surrounding points. In order to keep this subsequent resolution loss small, it is customary to use a sampling step of close to 5 Å, i.e., to apply oversampling by a factor of 2 relative to the resolution of 1/20 Å⁻¹ mentioned above. Another possible issue to be considered is the problem of aliasing. Above we have assumed that the sampling step is smaller than or equal to 1/(2B). When it must be chosen larger for some reason (e.g., to accommodate a certain area with an array of given size), then artifacts will arise: the Fourier components outside the boundaries {+1/(2s), -1/(2s)}, where s is the sampling step, will be "reflected" at the border of the Fourier transform and show up as low spatial frequency components. These "contaminating" terms cannot be removed after scanning. One way to solve this problem is by filtering the image (optically) to the desired bandwidth prior to sampling. More practical is the use of a blurred optical probe in the microdensitometer. As far as the physical sampling step (on the scale of the object) is concerned, this is determined by the electron-optical magnification normally used for low-dose images, which is in the range of 50,000. According to the Unwin and Henderson (1975) rationale for this choice, an increase in magnification above 50,000 (while keeping the dose at the specimen constant) would push the exposure close to, or below, the fog level of the photographic film.
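Returning to the interpolation formula, a direct transcription of Eq. (3.6) reads as follows (Python/NumPy; the pixel layout and the tiny test image are illustrative):

```python
import numpy as np

def bilinear(img, xf, yf):
    """Four-point bilinear interpolation of Eq. (3.6) at continuous position (xf, yf)."""
    l, m = int(np.floor(xf)), int(np.floor(yf))   # indices of the surrounding pixel block
    x, y = xf - l, yf - m                         # fractional coordinates in [0, 1)
    a, b = img[m, l], img[m, l + 1]               # the four surrounding pixel values
    c, d = img[m + 1, l], img[m + 1, l + 1]
    return (a * (1 - x) * (1 - y) + b * x * (1 - y)
            + c * (1 - x) * y + d * x * y)

img = np.array([[0.0, 1.0],
                [2.0, 3.0]])
print(bilinear(img, 0.5, 0.5))    # center of the 2 x 2 patch: (0+1+2+3)/4 = 1.5
```

Every rotation or shift of a digitized image evaluates such a weighted average once per output pixel, which is the source of the gradual resolution loss mentioned above.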
For magnifications significantly lower than 50,000, on the other hand, the transfer function of the photographic film causes information to be lost [see the previous note on the scale transformation (Section I, A) and the authoritative paper on the properties of electron microscope photographic films by Downing and Grano (1982)]. In the remainder of this book, we make use of the following notation. Each image of an image set {p_i(r); i = 1, ..., N} is represented by a set of J = L × M measurements on a regular grid

    {r_lm} = {l·Δx, m·Δy; l = 1, ..., L; m = 1, ..., M},    (3.7)
where Δx, Δy are (usually equal) sampling increments in the x- and y-directions, respectively. We will use the lexicographic notation (introduced by Hunt, 1973) when referring to the image elements, i.e., a notation referring to a one-dimensionally indexed array. The index is

    j = (m - 1)·L + l    (3.8)

and runs from j = 1 to j = L·M (= J). Thus the discrete representation of an entire image set is {p_ij; i = 1, ..., N; j = 1, ..., J}.

B. Interactive Particle Selection

The digitized micrograph is displayed on the screen of the workstation for selection of suitable particles. The monitors of present-day workstations cannot easily accommodate the entire array (approximately 3000 × 2000 for a 100 × 75-mm film), and therefore one of several measures must be taken: (i) divide the array into subfields that are sufficiently small; (ii) develop a program that allows particle selection with the entire scanned array scrolling through the monitor "window"; (iii) reduce the micrograph to fit the screen. The first option is inconvenient as it leads to logistic and administrative problems. The second option is not very convenient as only a small fraction of the entire micrograph is seen at a given time, and a method for systematically accessing the whole field must be found. The third option, using a size reduction, is the most straightforward solution to this problem, although, when the factor becomes too large, it leads to loss of image quality. This impairs the ability to recognize particular views or eliminate damaged particles. In order to arrive at a practical solution without sacrificing particle visibility, it is probably best to combine moderate size reduction with a moderate amount of scrolling. Size reduction is accomplished by first using "box convolution," i.e., a local averaging operator that replaces the value of each pixel by the average of pixels within a box surrounding it, followed by sampling of the resulting smoothed image. The ensuing boost in signal-to-noise ratio is a consequence of the fact that box convolution performs a low-pass filtration and thereby eliminates those portions of the micrograph's Fourier transform that have a low signal-to-noise ratio (see Section IV, C, 1).
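The effect of box convolution followed by resampling can be sketched as follows (Python/NumPy; for simplicity the averaging boxes are aligned with the sampling grid, and the test object is an arbitrary low-frequency blob buried in pixel noise):

```python
import numpy as np

rng = np.random.default_rng(5)

def reduce_size(img, f):
    """Average over f x f boxes, then keep one value per box (size reduction by f)."""
    n = (img.shape[0] // f) * f
    return img[:n, :n].reshape(n // f, f, n // f, f).mean(axis=(1, 3))

n = 256
y, x = np.mgrid[:n, :n]
signal = np.exp(-((x - 128.0)**2 + (y - 128.0)**2) / (2 * 30.0**2))  # broad "particle"
noisy = signal + rng.normal(0.0, 1.0, size=(n, n))

snrs = []
for f in (1, 2, 4):
    red, ref = reduce_size(noisy, f), reduce_size(signal, f)
    snrs.append(ref.std() / (red - ref).std())   # particle contrast vs. residual noise
    print(f"reduction factor {f}: SNR = {snrs[-1]:.2f}")
```

The particle contrast survives the low-pass step nearly unchanged while the pixel noise is averaged down, so the signal-to-noise ratio roughly doubles with each factor-of-2 reduction, as long as the particle itself remains oversampled.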
Moderate size reduction (by a factor of 2 or 3) of an oversampled micrograph therefore improves the contrast of particles.

C. Automated Particle Selection

Interactive selection is subjective, tedious, and time-consuming. Inevitably, the push toward higher resolution in 3D reconstruction of single macromolecules (see Chapter 5, Section VIII) involves larger and larger numbers of particles. Currently, particle numbers in the range of 1500 to 2000 are often encountered, and numbers in the range of 10,000 may have to be handled routinely in the future. It is clear that this massive data collection requires automation and the use of quantitative, reproducible criteria for selecting particles. In the past several attempts have been made to automate particle selection, but at that time the computers were too slow and contained insufficient memory to make these approaches practical. van Heel (1982) used a technique based on the computation of the local variance in the neighborhood of each pixel. A similar technique was used earlier by Lutsch et al. (1977) to find the positions of particles in a micrograph. In the resulting "variance map" (to be distinguished from the variance map that is a byproduct of averaging of an aligned image set; cf. Section IV, B), particles show up as peaks of high variance. This method has the disadvantage that it cannot discriminate between true particles and any other mass (e.g., aggregates of stain). Frank and Wagenknecht (1984) developed a method based on cross-correlation search with an azimuthally averaged reference image that shows the molecule in a selected orientation (Fig. 3.5). The azimuthal averaging assures that particles presented in any (in-plane) orientation produce the same value of correlation. Moreover, the correlation function in the vicinity of a particle strongly resembles the particle's autocorrelation function and can therefore be used immediately for the translation-invariant rotation search. Again, this method is poor in discriminating between genuine particles and any other mass in their size range. Andrews et al. (1986) developed a procedure for detecting particles in a dark-field micrograph, incorporating low-pass filtering, thresholding, edge detection, and determination of mass within the particle boundaries found.
Indeed, molecules imaged in dark field show up with much higher contrast than in bright field, allowing them to be separated from the background by application of a threshold. The procedure of Andrews et al. is, however, problematic for the majority of applications where bright-field electron microscopy is used. The advent of more powerful computers has made it possible to explore algorithms with more extensive statistical analysis. Harauz and Fong-Lochovsky (1989) proposed a sophisticated scheme comprising (i) noise suppression and edge detection, (ii) component labeling (i.e., identifying pixels connected with one another) and feature computation, and (iii) symbolic object manipulation. More recently, Lata et al. (1994) proposed a method based on standard methods of discriminant analysis (Fig. 3.6). Particle candidates are first identified by performing a peak search on a low-pass filtered version
Fig. 3.5. Automatic particle selection scheme of Frank and Wagenknecht (1984). A rotationally averaged version of the particle is created and cross-correlated with the full image field. Each repeat of the particle is marked by the appearance of a version of the autocorrelation function (ACF) in the cross-correlation function. The program uses the ACF subsequently to obtain a version of the particle that is rotated and correctly aligned. From Frank and Wagenknecht (1984). Reproduced with permission of Elsevier Science, Amsterdam.
Fig. 3.6. Automatic particle selection based on discriminant analysis of texture-sensitive parameters. See text for explanation. From Lata et al. (1994). Reproduced with permission of the Microscopy Society of America.
of the micrograph. Subsequently, the data surrounding the peaks are extracted, and several statistical measures, such as variance, skewness, kurtosis, and entropy, are computed. In a training session, approximately 100 fields are visually categorized into the three groups "good particles," "junk," and "noise." On the basis of this reference information, a discriminant analysis can now be performed on any data with the same statistical characteristics, resulting in a classification of all putative particles. For instance, a typical reconstruction project might require selection of particles from 10 micrograph pairs (i.e., of the tilted and untilted specimen; see Chapter 5, Section III, E, where the random-conical reconstruction scheme is outlined). All these images have the same statistical properties as they
relate to the same specimen grid, defocus, carbon thickness, etc. The discriminant function set up in the training session with the first micrograph pair can therefore be used for the entire data set. The results obtained with this last method are encouraging: in the application to a cryo-data set from the 30S ribosomal subunit, the total percentage of correct choices was 87%. The remaining 13% were almost equally divided between false positives (7% particles selected that should have been rejected) and false negatives (6% particles rejected that should have been accepted).
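As a rough illustration of the feature-extraction step in such a scheme, the four texture statistics named above (variance, skewness, kurtosis, and entropy) can be computed for one candidate window as follows. This is a minimal sketch in Python/NumPy; the function name and the 32-bin gray-level histogram used for the entropy are assumptions of this sketch, not details of the Lata et al. implementation:

```python
import numpy as np

def texture_features(window):
    """Compute the four texture statistics named in the text
    (variance, skewness, kurtosis, entropy) for one candidate window."""
    x = window.astype(float).ravel()
    mu, sigma = x.mean(), x.std()
    var = sigma ** 2
    skew = np.mean(((x - mu) / sigma) ** 3)
    kurt = np.mean(((x - mu) / sigma) ** 4) - 3.0  # excess kurtosis
    # entropy of the gray-level histogram (bin count is an arbitrary choice)
    hist, _ = np.histogram(x, bins=32)
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))
    return np.array([var, skew, kurt, entropy])
```

A feature vector of this kind would then be fed, together with the visually assigned training labels, into a standard discriminant analysis.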
III. Alignment Methods

A. The Aims of Alignment

Alignment is initially understood as an operation that is performed on two or more images with the aim of bringing a common motif contained in those images into register. Implicit in the term "common motif" is the concept of homogeneity of the image set: the images are deemed "essentially" the same; they differ only in the noise component and perhaps in the presence or absence of a relatively small ligand. That the difference is small is an important stipulation; it assures, in all alignment methods making use of the cross-correlation function (see Section III, C), that the contribution from the correlation of the main component with itself (the "autocorrelation term") is very large compared to the contribution stemming from its correlation with the extra ligand mass. Alignment so understood is directly related to our visual concept of likeness and order; it is the challenge to make the computer perform as well as a 3-year-old child in arranging building blocks that have identical shapes into a common orientation. The introduction of dissimilar images, occurring in a heterogeneous image set, forces us to generalize the term alignment: in this more expanded meaning, dissimilar motifs occurring in those images are considered "aligned" when they are positioned so as to minimize a given functional, such as the generalized Euclidean distance (= the variance of their difference). In that case, the precise relative position (meaning both shift and orientation) between the motifs after digital "alignment" may not be endorsable by visual assessment, which relies on the perception of edges and marks in both images, not on the pixel-by-pixel comparison employed by digital methods. The concept of homogeneous versus heterogeneous image sets is fundamental in understanding averaging methods, their limitations, and the
ways these limitations can be overcome. These topics will form the bulk of the remainder of this chapter and Chapter 4.
B. Homogeneous versus Heterogeneous Image Sets
1. Alignment of a Homogeneous Image Set

Assume that we have a micrograph that shows N "copies" of a molecule in the same view. By using an interactive selection program, these molecule images are separately extracted, normally within a square "window," and stored in arrays
{p_ij, i = 1…N; j = 1…J}.   (3.9)
Within the selection window, the molecule is roughly centered, and it normally has random azimuthal "in-plane" orientations. We then seek coordinate transformations T_i such that their application to p_ij results in the precise superimposition of all realizations of the molecule view. Any pixel indexed j in the transformed arrays p′_ij refers to the same point in the molecule projection's coordinate system. When this goal is achieved, it is possible to form a meaningful average¹² (Figs. 3.7 and 3.2a, c):

p̄_j = (1/N) Σ_{i=1}^{N} p′_ij.   (3.10)
In contrast, the average would be meaningless if the different pixels with the same index j originated from different points of the coordinate system of the molecule projection, resulting from a failure of alignment, or from molecules with different conformations ("apples and oranges"), resulting from structural or orientational heterogeneity. If the deviations are small, however, the resulting average will at least resemble the ideal average that would be obtained without alignment error: the former will be a blurred version of the latter. For random translational deviations, the blurring can be described in Fourier space by a Gaussian function, analogous to the temperature factor of X-ray crystallography:

F{p̄} = F{p̄_0} exp[−k²/k_0²],   (3.11)

where F{·} stands for the Fourier transform of the term within the braces, p̄_0 denotes the ideal average, and k_0² = 1/r_0² is a "temperature" parameter (Debye-Waller factor) due to random translations characterized by the size of the rms (root mean square) deviation r_0.

¹² Throughout this book, ensemble averages are denoted by a bar over the symbol that denotes the observed quantity; e.g., p̄_j denotes the average over multiple measurements of the pixel j. In contrast, averages of a function over its argument range will be denoted by angle brackets, as in the following example:

⟨p⟩ = (1/J) Σ_{j=1}^{J} p_j.

Fig. 3.7. Definition of the average image. We imagine the images to be stacked up. For each pixel (indexed i, k), the column average is computed and stored in the element (i, k) of the resulting image. At the same time, the variance of the pixel is computed and stored in the element (i, k) of the variance map. From Frank (1984b). Reproduced with permission of Electron Microscopy Foundation, Budapest.
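The pixel-wise average of Eq. (3.10), together with the variance map described in the legend of Fig. 3.7, can be sketched for an aligned stack as follows. This assumes the aligned images are stored as a NumPy array of shape (N, ny, nx); the function name is illustrative only:

```python
import numpy as np

def average_and_variance(stack):
    """Given an aligned stack of images {p'_ij} of shape (N, ny, nx),
    compute the average image (Eq. 3.10) and the variance map: for each
    pixel, the column of N values is averaged, and the variance of that
    column is stored at the same position in the variance map."""
    stack = np.asarray(stack, dtype=float)
    avg = stack.mean(axis=0)          # p_bar_j = (1/N) sum_i p'_ij
    var = stack.var(axis=0, ddof=1)   # sample variance per pixel
    return avg, var
```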
2. Alignment of a Heterogeneous Image Set

In the case of a heterogeneous image set, such as a set comprising molecules presenting different views, alignment does not have the clear and unambiguous meaning it had before. Rather, it must be defined in an operational way: as an operation that establishes a defined geometrical relationship among a set of images by minimizing a certain functional. A well-behaved algorithm will have the effect that particles within a homogeneous subset are "aligned" in the same sense as defined above for homogeneous sets, while particles belonging to different subsets are brought into geometrical relationships that are consistent with one another. To put it more concretely, in the alignment of 50S ribosomal subunits falling into two views, the crown view and the kidney view, all particles in the crown view orientation will be oriented consistently, and the same will be true for all particles in the kidney view orientation. The orientation between any of the crown view and any of the kidney view particles will be fixed, but the angle between particles of the two groups will depend on the choice of the alignment algorithm. Although the size of this relative angle is irrelevant, a fixed spatial relationship is required for an objective, reproducible, and meaningful characterization of the image set by multivariate statistical analysis and
classification. Exceptions are those methods that produce an alignment implicitly ("alignment through classification" of Dube et al., 1993; Marabini and Carazo, 1994a) or use invariants that lend themselves to classification (Schatz, 1992; Schatz and van Heel, 1990, 1992). Heterogeneity may occur because particles in an initially homogeneous population change shape. Particles or fibers that are thin and extended may flex without changing their local structure, the true object of the study. From the point of view of studying the high-resolution structure, the diversity of overall shape may be seen as a mere obstacle and not in itself worthy of attention. In those cases, a different approach to alignment and averaging may be possible, in which the idealized overall particle shape is first restored by computational means. In general these entail curvilinear coordinate transformations. Structural homogeneity can thus be restored. Such "unbending" methods have been introduced to straighten fibers and other linear structures in preparation for processing methods, thus far covered here, that assume rigid body behavior in all rotations and translations. The group of Alasdair Steven at the National Institutes of Health has used these methods extensively in their studies of fibrous structures (Steven et al., 1986, 1988, 1991; Fraser et al., 1990). Geometrical unbending enables the use of helical reconstruction methods on structures whose shapes do not conform to the path of the ideal helix (Steven et al., 1986; Egelman, 1986; Hutchinson et al., 1990). Yet a different concept of alignment comes in when an attempt is made to orient different projections with respect to one another and to a common three-dimensional frame of reference; see Section III in Chapter 5. We will refer to that problem as the problem of 3D alignment. It is equivalent to the search for the common phase origin in the 3D reconstruction of two-dimensional crystal sheets.
An even closer analogy can be found in the common lines methods used in the processing of images of spherical viruses (Crowther et al., 1970; Cheng et al., 1994).
C. Translational and Rotational Cross-Correlation

1. The Cross-Correlation Function Based on the Euclidean Distance
The cross-correlation function is the most important tool for the alignment of two images. It can be derived in the following way: we seek, among all relative positions of the images (produced by rotating and translating one image with respect to the other), the one that maximizes a measure of similarity. The images, represented by J discrete measurements on a regular grid, {f_1(r_j), j = 1…J} and {f_2(r_j), j = 1…J}, may be interpreted as vectors in a J-dimensional Cartesian coordinate system (see also Chapter 4
where extensive use will be made of this concept). The length of the difference vector, or the Euclidean distance between the vector end points, can be seen as a measure of their dissimilarity or as an inverse measure of their similarity. By introducing search parameters for the rotation and translation, (R_α, r′), we obtain the expression:

E_12²(R_α, r′) = Σ_{j=1}^{J} [f_1(r_j) − f_2(R_α r_j + r′)]².   (3.12)
The rotation matrix R_α performs a rotation of the function f_2 by the angle α, and the vector r′ performs a shift of the rotated function. In comparing two images represented by the functions {f_1} and {f_2}, we are interested to find out whether similarity exists for any combination of the search parameters. This kind of comparison is similar to the comparison our eyes perform, almost instantaneously, when judging whether or not shapes presented in arbitrary orientations are identical. By writing out the expression (3.12) explicitly, we obtain

E_12²(R_α, r′) = Σ_{j=1}^{J} [f_1(r_j)]² + Σ_{j=1}^{J} [f_2(R_α r_j + r′)]² − 2 Σ_{j=1}^{J} f_1(r_j) f_2(R_α r_j + r′).   (3.13)
The first two terms are invariant under the coordinate transformation

r_j → R_α r_j + r′.   (3.14)
The third term is maximized, as a function of the search parameters (R_α, r′), when E_12 assumes its minimum. This third term is called the cross-correlation function:

Φ_12(R_α, r′) = Σ_{j=1}^{J} f_1(r_j) f_2(R_α r_j + r′).   (3.15)
In practice, the use of Eq. (3.15) is somewhat clumsy because determination of its maximum requires a three-dimensional search (i.e., over the ranges of one rotational and two translational parameters). Functions that explore the angular space and the translational space separately have become more important. Two additional impractical features of this formula are that Φ_12(R_α, r′) is not normalized, so that it is suitable for comparing images from the same experiment only, and that it is dependent
on the size of the "bias" terms ⟨f_1⟩ and ⟨f_2⟩, the averaged pixel values, which should be irrelevant in a meaningful measure of similarity.
2. Cross-Correlation Coefficient and Translational Cross-Correlation Function

a. Definition of the Cross-Correlation Coefficient. The cross-correlation coefficient is a well-known measure of similarity and statistical interdependence. For two functions represented by discrete samples, {f_1(r_j); j = 1…J} and {f_2(r_j); j = 1…J}, the cross-correlation coefficient is defined as

ρ_12 = Σ_{j=1}^{J} [f_1(r_j) − ⟨f_1⟩][f_2(r_j) − ⟨f_2⟩] / {Σ_{j=1}^{J} [f_1(r_j) − ⟨f_1⟩]² Σ_{j=1}^{J} [f_2(r_j) − ⟨f_2⟩]²}^{1/2},   (3.16)

where

⟨f_i⟩ = (1/J) Σ_{j=1}^{J} f_i(r_j);  i = 1, 2,   (3.17)
and a factor 1/J has been left out in both numerator and denominator. A high value of ρ_12 (−1 ≤ ρ_12 ≤ 1) means that the two functions are very similar for the particular choice of relative shift and orientation. However, in comparing images, it is more relevant to ask whether ρ_12 is maximized for any "rigid body" coordinate transformation applied to one of the images. In the defining formula (3.16), only the numerator is sensitive to the quality of the alignment between the two functions, while the denominator contains only the variances, which are invariant under shift or rotation of the argument vector. When a variable coordinate transformation is applied to one of the functions, the numerator becomes identical (except for the subtraction of the averages ⟨f_1⟩ and ⟨f_2⟩, which only changes the constant "bias" term of the resulting function) to the cross-correlation function introduced in the previous section.
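Equation (3.16) translates directly into code. The following sketch (Python/NumPy, with an illustrative function name) computes ρ_12 for two images at a fixed relative position:

```python
import numpy as np

def cross_correlation_coefficient(f1, f2):
    """Cross-correlation coefficient of Eq. (3.16): mean-subtracted
    scalar product, normalized by the geometric mean of the variances."""
    a = f1.astype(float).ravel() - f1.mean()
    b = f2.astype(float).ravel() - f2.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))
```

By construction the result lies between −1 and +1, with +1 obtained only for two images that are identical up to scaling and an additive constant.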
b. Translational Cross-Correlation Function. Specifically, the translational cross-correlation function (CCF) is obtained by restricting the transformation to a two-dimensional translation. However, for any r_k ≠ 0, the images no longer overlap completely, so that the summation can now be carried out only over a subset J′ of the J pixels. This is indicated by the stipulation r_j + r_k ∈ A under the summation symbol, meaning that only those indices j should be used for which the sum vector r_j + r_k is still within the boundary A of the image:

Φ_12(r_k) = (1/J′) Σ_{j=1 (r_j + r_k ∈ A)}^{J} [f_1(r_j + r_k) − ⟨f_1⟩][f_2(r_j) − ⟨f_2⟩].   (3.18)
By letting the "probing vector" r_k assume all possible positions on the grid, all relative displacements of the two functions are explored, allowing the position of best match to be found: for that particular r_k, the CCF assumes its highest value (Fig. 3.8). The formulation of the overlap contingency is somewhat awkward and indirect in the lexicographic notation [Eq. (3.18)]. However, it is easy to write down a version of this formula that uses separate indexing for the x- and y-coordinates. In such a revised formula, the index boundaries become explicit. Because the practical computation of the CCF is normally done by fast Fourier transformation (see below), we will refrain from restating the formula here.
c. Computation Using the Fast Fourier Transform. In practice, the computation of Φ(r_k) is speeded up through the use of a version of the convolution theorem. The convolution theorem proper, in its continuous form, can be expressed in the following way: the convolution product of two functions,

C_12(r′) = ∫ f_1(r′ − r) f_2(r) dr,   (3.19)
Fig. 3.8. Definition of the cross-correlation function. Image 1 is shifted with respect to image 2 by the vector r_pq. In this shifted position, the scalar product of the two image arrays is formed and put into the CCF matrix at position (p, q). The vector r_pq is allowed to assume all positions on the sampling grid. In the end, the CCF matrix has an entry in each position. From Frank (1980). Reproduced with permission of Springer-Verlag, New York.
is equal to the inverse Fourier transform of the product F{f_1}F{f_2}, where F{·} denotes the Fourier transformation. Similarly,

Φ_12(r_k) = F⁻¹{F{f_1} F*{f_2}},   (3.20)

where F⁻¹ denotes the inverse Fourier transformation, and * denotes complex conjugation. Thus the computation involves three fast discrete Fourier transformations and one scalar matrix multiplication. The usual choice of origin in the digital Fourier transform (at element [1, 1] of the array) requires one adjustment: for the origin to be in the more convenient position, in the center of Φ(r_k), a shift by half the image dimensions in the x- and y-directions is required. For even array dimensions (and thus for any power-of-two based algorithm) this is easily accomplished by multiplication of the Fourier product in Eq. (3.20) with the factor (−1)^(k_x + k_y), where k_x, k_y are the integer grid coordinates in Fourier space.
cIJ12 -- f l | f 2 ,
(3.21)
C12 = fl (3 f2
(3.22)
in anology to the use of
for the convolution product. The computation via the Fourier route implies that the 2D image array p ( l , m ) (switching back for the moment to a 2D formulation for clarity) is replaced by a circulant array, with the properties (3.23)
p(l + L, m + M) = p(l, m).
This has the consequence that the translational cross-correlation function 9 ( l ' , m ' ) computed in this way contains contributions from terms p l ( l + l' - L , m + m ' ) p 2 ( l , m ) where l + l' > L. We speak of a "wraparound effect": instead of the intended overlap of image areas implied in the definition of the CFF [Eq. (3.18) and Fig. 3.8], the bottom of the first image now overlaps the top of the second, etc. To deal with this problem, it is common practice to extend the images twofold in both directions, by "padding" them with their respective averages, 1 (p,)
=
J
Y'. p , ( r j )
1
and
( P2 ) = ~ J
J __ 1
pE(rj).
(3.24)
Alternatively, the images may be "floated," by subtraction of ⟨p_1⟩ or ⟨p_2⟩, and then "padded" with zeros prior to the FFT calculation, as described by DeRosier and Moore (1970). The result is the same as with the above recipe, because any additive terms have no influence on the appearance of the CCF; they merely add a "bias" to this function. In fact, programs calculating the CCF frequently eliminate the Fourier term F_00 at the origin in the course of the computation, rendering the outcome insensitive to the choice of padding method. Another difference between the Fourier-computed CCF and the real-space CCF [Eq. (3.18)] is that the normalization by 1/J′ (J′ being the varying number of terms contributing to the sum) is now replaced by 1/J (or, respectively, by 1/(4J) if twofold extension by padding is used).
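The Fourier route of Eq. (3.20), including floating, twofold zero-padding against the wrap-around effect, and the shift of the origin to the array center, might be sketched as follows (Python/NumPy; the helper names are assumptions of this sketch):

```python
import numpy as np

def ccf_via_fft(f1, f2):
    """Translational CCF by the Fourier route, Eq. (3.20):
    Phi_12 = F^{-1}{ F{f1} F*{f2} }.  The images are "floated"
    (average subtracted) and padded twofold with zeros to suppress the
    wrap-around effect of the circulant extension; the origin is then
    moved to the center of the array."""
    ny, nx = f1.shape
    a = np.zeros((2 * ny, 2 * nx)); a[:ny, :nx] = f1 - f1.mean()
    b = np.zeros((2 * ny, 2 * nx)); b[:ny, :nx] = f2 - f2.mean()
    ccf = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    return np.fft.fftshift(ccf)   # origin at the array center

def shift_from_peak(ccf):
    """Locate the CCF maximum and convert it to a relative shift."""
    py, px = np.unravel_index(np.argmax(ccf), ccf.shape)
    return py - ccf.shape[0] // 2, px - ccf.shape[1] // 2
```

For two images differing by a pure translation, the position of the CCF peak relative to the array center directly gives the shift of the first image with respect to the second.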
3. Rotational Cross-Correlation Function

The rotational cross-correlation function (analogous to the rotation function in X-ray crystallography) is defined in a similar way, but this time with a rotation as probing coordinate transformation. Here each function is represented by samples on a polar coordinate grid defined by Δr, the radial increment, and Δφ, the azimuthal increment:

{f_i(lΔr, mΔφ); l = 1…L; m = 1…M};  i = 1, 2.   (3.25)

We define the discrete, weighted, rotational cross-correlation function in the following way:

C(k) = Σ_{l=l_1}^{l_2} w(l) Σ_{m=0}^{M−1} f_1(lΔr, mod[m + k, M]Δφ) f_2(lΔr, mΔφ) Δφ lΔr
     = Σ_{l=l_1}^{l_2} w(l) c(l, k) lΔr.   (3.26)

For weights w(l) = 1 the standard definition of the rotational CCF is obtained. The choice of nonuniform weights, along with the choice of a range of radii {l_1…l_2}, can be used to place particular emphasis on the contribution of certain features within that range.
The computation of the inner sums c(l, k) along rings normally takes advantage (Saxton, 1978; Frank et al., 1978a, 1986) of the Fourier convolution theorem:

c(l, k) = Σ_{m′=0}^{M−1} F_1(lΔr, m′Δφ′) F_2*(lΔr, m′Δφ′) Δφ′ exp[2πi(m′Δφ′ kΔφ)],   (3.27)

where F_i(lΔr, m′Δφ′), i = 1, 2 are the discrete Fourier transforms of the lth ring of the functions f_i. A further gain in speed is achieved by reversing the order of the summations over rings and over Fourier terms in Eqs. (3.26) and (3.27), as this reduces the number of inverse one-dimensional Fourier transformations to one (Penczek et al., 1992).
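A sketch of the ring-by-ring computation of Eqs. (3.26) and (3.27) is given below (Python/NumPy). The nearest-neighbor polar sampling about the image center and the function signature are conveniences of this sketch, not part of the cited implementations; the accumulation of the weighted ring terms in Fourier space before a single inverse FFT follows the idea attributed to Penczek et al. (1992):

```python
import numpy as np

def rotational_ccf(f1, f2, radii, M=360, weights=None):
    """Discrete weighted rotational CCF in the spirit of Eq. (3.26),
    with the ring correlations c(l, k) computed via 1D FFTs as in
    Eq. (3.27).  Rings are sampled by nearest-neighbor lookup."""
    ny, nx = f1.shape
    cy, cx = ny // 2, nx // 2
    phi = 2 * np.pi * np.arange(M) / M
    if weights is None:
        weights = np.ones(len(radii))
    total = np.zeros(M, dtype=complex)
    for w, r in zip(weights, radii):
        ys = np.clip(np.round(cy + r * np.sin(phi)).astype(int), 0, ny - 1)
        xs = np.clip(np.round(cx + r * np.cos(phi)).astype(int), 0, nx - 1)
        ring1, ring2 = f1[ys, xs], f2[ys, xs]
        # accumulate weighted ring terms in Fourier space, so that only a
        # single inverse 1D FFT is needed at the end
        total += w * r * np.fft.fft(ring1) * np.conj(np.fft.fft(ring2))
    return np.fft.ifft(total).real  # index k ~ rotation of k*(360/M) degrees
```

For two identical images the profile peaks at k = 0; for a rotated copy the peak index gives the relative rotation in units of 360/M degrees, up to the sign convention of the angular sampling.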
4. Peak Search

The search for the precise position of a peak is a common feature of all correlation-based alignment techniques. As a simple safeguard against detection of spurious peaks, not just the highest but at least the three highest-ranking peaks, p_1, p_2, and p_3, are searched. For a significant peak, one would expect that the ratio p_1/p_2 is well above p_2/p_3, assuming that the subsidiary peaks p_2 and p_3 are due to noise. Computer programs designed for this purpose are straightforward: the array is scanned for the appearance of relative peaks, i.e., elements that stand out from their immediate neighbors. In the one-dimensional search (typically of a rotational correlation function), each element of the array is compared with its two neighbors. In the 2D search (typically of a 2D translational CCF), each element is compared with its eight neighbors. Those elements that fulfill this criterion are put on a stack in ranking order, and at the end of the scan, the stack contains the desired list of highest peaks. The peak position so found is given only as a multiple of the original sampling distance. However, the fact that the peak has finite width and originates mathematically from many independent contributions all coherently "focused" on the same spot means that the position can be found with higher accuracy by some type of fitting. First the putative peak region is defined as a normally circular region around the element with the highest value found in the peak search. Elements within that region can now be used to determine an effective peak position with noninteger coordinates. The methods widely used are the parabolic fit and the center of gravity.
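The eight-neighbor peak scan and the parabolic fit for a noninteger peak position can be sketched as follows (Python/NumPy; function names are illustrative). The parabolic fit is applied separately along each axis, using the peak element and its two neighbors:

```python
import numpy as np

def peak_search(arr, n_peaks=3):
    """Scan a 2D array for relative peaks (elements exceeding all eight
    neighbors) and return the n highest, in ranking order."""
    peaks = []
    for i in range(1, arr.shape[0] - 1):
        for j in range(1, arr.shape[1] - 1):
            patch = arr[i - 1:i + 2, j - 1:j + 2]
            if arr[i, j] == patch.max() and (patch == arr[i, j]).sum() == 1:
                peaks.append((arr[i, j], i, j))
    peaks.sort(reverse=True)
    return peaks[:n_peaks]

def parabolic_fit_1d(ym, y0, yp):
    """Subpixel offset of a peak from three samples by parabolic fit."""
    denom = ym - 2 * y0 + yp
    return 0.0 if denom == 0 else 0.5 * (ym - yp) / denom

def refine_peak(arr, i, j):
    """Noninteger peak position from separate parabolic fits in y and x."""
    di = parabolic_fit_1d(arr[i - 1, j], arr[i, j], arr[i + 1, j])
    dj = parabolic_fit_1d(arr[i, j - 1], arr[i, j], arr[i, j + 1])
    return i + di, j + dj
```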
D. Reference-Based Alignment Techniques
1. Principle of Self-Detection

Reference-based alignment techniques were developed primarily for the case of homogeneous image sets, i.e., images originating from particles containing identical structures and presenting the same view. In that case, all images of the set {p_n(r); n = 1…N} have a "signal component" in common, the projection p(r) of the structure as imaged by the instrument, while differing in the noise component n_i(r). Any image can then act as reference for the rest of the image set (principle of self-detection; see Frank, 1975). Formally,

p_1(r) = p(r) + n_1(r),   (3.28)
p_2(r) = p(r) + n_2(r),   (3.29)
so that, using the notation introduced in Section III, C, 2,

Φ_12(r) = p_1(r) ⊗ p_2(r) = p(r) ⊗ p(r) + n_1(r) ⊗ p(r) + p(r) ⊗ n_2(r) + n_1(r) ⊗ n_2(r).   (3.30)
The first term is the autocorrelation function (ACF) of the structure common to both images, which has a sharp peak at the origin, while each of the other three terms is a cross-correlation of two uncorrelated functions. The shape of the peak at the center is determined by the ACF of the point spread function associated with the contrast transfer function: according to Eq. (2.11), with p(r) = I(r) and p_0(r) = φ(r), and the convolution theorem, one obtains

p(r) ⊗ p(r) = [h(r) ∗ p_0(r)] ⊗ [h(r) ∗ p_0(r)] = [h(r) ⊗ h(r)] ∗ [p_0(r) ⊗ p_0(r)].   (3.31)
The term p_0(r) ⊗ p_0(r) is the Patterson function of the projection of the original structure. Its most important feature in Eq. (3.31) is that it acts for all practical purposes as a delta function, due to the sharp self-correlation peak in its center, and as such essentially reproduces the ACF Φ_hh = h(r) ⊗ h(r) of the point spread function of the instrument.¹³ Its value at the origin, important for the ability to detect the peak at low s/n ratios, is determined by the size of the integral

Q_B = ∫_B |H(k)|² dk,   (3.32)

¹³ This property was previously exploited by Al-Ali and Frank (1980), who proposed the use of the cross-correlation function of two micrographs of the same "stochastic" object to obtain a measure of resolution.
where B is the resolution domain and H(k) the contrast transfer function. This follows from Parseval's theorem, which is simply a statement of the invariance of the norm of a function upon transforming this function into Fourier space (see Section IV, C, 1). In this context, it should be noted that the alignment of images of the same object taken at different defocus settings leads to a CCF peak whose shape is determined by

Φ_h1h2 = h_1(r) ⊗ h_2(r),   (3.33)

with h_1, h_2 being the point spread functions corresponding to the two focus settings. In this case, the peak height is determined by the size of the Fourier integral (or its discrete-valued equivalent)

∫_B H_1(k) H_2(k) dk,   (3.34)
which is critically dependent on the relative positions of contrast transfer zones with different polarity (Frank, 1972b; Al-Ali, 1976; Al-Ali and Frank, 1980). In fact, the value of the integral [Eq. (3.34)] is no longer positive-definite, and unfortunate defocus combinations might result in a CCF with a peak that has inverted polarity or is so flat that it cannot be detected in the peak search (Frank, 1980; Saxton, 1994; Zemlin, 1989b; see Fig. 3.9). Saxton (1994) has discussed remedies for this situation. One of them is the obvious "flipping" of transfer zones in case the transfer functions are known, with the aim of assuring a positive-definite (or negative-definite) integrand (see Fig. 3.10; Typke et al., 1992). A more sophisticated procedure suggested by Saxton (1994) is to multiply the transform of the CCF with a factor that acts like a Wiener filter:

W(k) = P_1(k) P_2(k) / (|P_1(k)|² |P_2(k)|² + ε),   (3.35)

where ε is a small quantity that ensures boundedness of W(k), to keep the noise amplification in any spectral domain within reasonable margins. Apart from the degradation of the CCF peak due to the mismatch in CTF polarities, which can be fixed by "flipping" the polarity of zones in the Fourier domain, there are other effects that diminish the size of the peak and thus may lead to difficulties in the use of the CCF in alignment. Typke et al. (1992) identified magnification changes and local distortions of the specimen and demonstrated that by applying an appropriate compensation, a strong CCF peak can be restored (see Fig. 3.10). However, these effects
come into play mainly in applications where large specimen fields must be related to one another, while they are negligible in the alignment of small (typically in the range of 64 × 64 to 128 × 128) single particle fields. The reason for this insensitivity is that all local translational components are automatically accounted for by an extra shift, while the small local rotational components of the distortions (in the range of maximally 1°) affect the Fourier integral underlying the CCF [Eq. (3.30)] only marginally in the interesting resolution range of 1/40 to 1/20 Å⁻¹ (see Appendix in Frank and Wagenknecht, 1984).
2. ACF/CCF-Based Search Strategy

In order to avoid a time-consuming three-dimensional search of the {α, Δx, Δy} parameter space, translation and rotation search are normally performed separately. One search strategy that accomplishes this separation, introduced by Langer et al. (1970), makes use of the translation invariance of the ACF (Figs. 3.11a-3.11d). In fact, this method goes back to search techniques in X-ray crystallography involving the Patterson function. According to this scheme (Fig. 3.11e), the orientation between the images is first found by determining the orientation between their ACFs, and subsequently the shift between the correctly oriented molecules is found by translational cross-correlation. The ACF of an image {p(r_j); j = 1…J}, represented by discrete samples on a regular grid r_j, is obtained by simply letting f_1(r_j) = f_2(r_j) = p(r_j) in the formula for the translational cross-correlation function [Eq. (3.18)]:

Φ(r_k) = (1/J′) Σ_{j=1 (r_j + r_k ∈ A)}^{J} p(r_j) p(r_j + r_k).   (3.36)
The property of shift invariance is immediately clear from the defining formula, since the addition of an arbitrary vector to r_j will not affect the outcome (except for changes due to the boundary terms). Another property (not as desirable as the shift invariance; see below) is that the two functions p(r_j) and p′(r_j) = p(−r_j) have the same ACF. As a result, the ACF is always centrosymmetric. For fast computation of the ACF, the convolution theorem is again used as in the case of the CCF (Section III, C, 2): the ACF is obtained by inverse Fourier transformation of the squared Fourier modulus |F(k)|², where F(k) = F{p(r)}. Here the elimination of "wrap-around" artifacts, by
Fig. 3.10. The effect of image restoration and magnification correction on the cross-correlation function of two micrographs of a focus series. One of eight images is cross-correlated against the remaining seven images. Only the central 128 × 128 portion of the CCFs is shown. (Top row) Without correction. (Second row) Sign reversal applied to CTFs to make the CCF Fourier integral in Eq. (3.34) positive-definite. (Third row) Micrographs corrected only for magnification changes and displacements. The CCF peak comes up strongly now, but in three cases with reversed sign. (Bottom row) Micrographs additionally corrected for sign reversal. All CCF peaks are now positive. From Typke et al. (1992). Reproduced with permission of Elsevier Science, Amsterdam.
twofold extension and padding of the array prior to the computation of the Fourier transform, is particularly important, since these artifacts may lead to incorrect angles in the rotation search. We consider an idealized situation, namely two images of a motif that differ by a shift, a rotation a, and the addition of noise. The ACFs of these images are identical except that they are rotated by a relative to Fig. 3.9. Peak value of cross-correlation function ~,~,~ between two point spread functions with different defocus, plotted as a function of their defocus difference. Theoretically, the cross-correlation function of two micrographs is obtained as a convolution product between the CCF ~h,h, and the Patterson function of the common structure (see Eq. 3.31). (a) A2 = 1 (Scherzer focus) used as reference: (b) As = 10 used a reference. It is seen that in the case of (a), as the defocus difference increases, the peak drops off to zero and changes sign several times. From Zemlin (1989b). Dynamic focussing for recording images from tilted samples in small-spot scanning with a transmission electron microscope. J. Electron Microsc. Tech. Copyright 9 1989 John Wiley & Sons. Inc. Reprinted by permission of John Wiley & Sons, Inc.
Chapter 3. Two-Dimensional Averaging Techniques
each other. As in Section III, C, 2, we now represent the ACFs in a polar coordinate system and determine the relative angle by computing their rotational cross-correlation function:

C(k) = Σ_{l=l1}^{l2} Σ_m w(l) Φ(lΔr, mΔφ) Φ(lΔr, mΔφ + kΔφ) l Δr Δφ.   (3.37)

The rotational cross-correlation function will have a peak centered at the discrete position k_max = int(α/Δφ) [where int(·) denotes the integer closest to the argument], which can be found by a maximum search over the entire profile. The precise position is subsequently found using a parabolic fit or center of gravity determination in the vicinity of k_max (see Section III, C, 4). The weights w(l), as well as the choice of minimum and maximum radii l1, l2, are used to "focus" the comparison on certain ring zones of the autocorrelation function. For instance, if the weights are chosen such that a radius r = r_w is favored, then features in the molecule separated by the distance |r1 − r2| = r_w will provide the strongest contributions in the computation of the rotational cross-correlation function. The weights can thus be used, for instance, to select the most stable, reliable (i.e., reproducible) distances occurring in the particle.

A problem of the ACF-based orientation search is the above-mentioned symmetry, which makes the ACFs of two functions p1(r) and p2(r) = p1(−r) indistinguishable. Consequently, the fact that the ACFs of two images optimally match for kΔφ = α may indicate that the images match with the relative orientation α, α + 180°, or both (the last case would be true if the images themselves are centrosymmetric). Zingsheim et al. (1980) solved this problem by an "up/down cross-correlation test," in the following way:

i. rotate image 2 by α;
ii. compute the CCF between image 1 and image 2, giving the peak maximum ρ_α at {Δx_α, Δy_α};
iii. rotate image 2 by α + 180°;
iv. compute the CCF between image 1 and image 2, giving the peak maximum ρ_{α+180} at {Δx_{α+180}, Δy_{α+180}};
v. if ρ_α > ρ_{α+180} then shift image 2 by {Δx_α, Δy_α}, else by {Δx_{α+180}, Δy_{α+180}};

i.e., both ways are tried, and the orientation that gives maximum cross-correlation is used.
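The up/down test described above can be sketched in a few lines. The following is a minimal illustration, not the original authors' code; it assumes NumPy and SciPy are available, that the candidate angle α has already been obtained from the ACF comparison, and that the images are square with periodic boundaries. All function names are mine.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def ccf_peak(img1, img2):
    """FFT-based circular cross-correlation; returns (peak value, shift)
    such that rolling img2 by the shift best matches img1."""
    c = np.fft.ifft2(np.fft.fft2(img1) * np.conj(np.fft.fft2(img2))).real
    idx = np.unravel_index(np.argmax(c), c.shape)
    # convert FFT peak indices to signed shifts
    dy = idx[0] if idx[0] <= c.shape[0] // 2 else idx[0] - c.shape[0]
    dx = idx[1] if idx[1] <= c.shape[1] // 2 else idx[1] - c.shape[1]
    return c[idx], (dy, dx)

def updown_align(ref, img, alpha):
    """Up/down test: try alpha and alpha + 180 degrees, keep whichever
    rotation gives the higher CCF peak, then apply the peak shift."""
    best = None
    for ang in (alpha, alpha + 180.0):
        rot = rotate(img, ang, reshape=False, mode="wrap")
        peak, (dy, dx) = ccf_peak(ref, rot)
        if best is None or peak > best[0]:
            best = (peak, ang, dy, dx)
    _, ang, dy, dx = best
    aligned = shift(rotate(img, ang, reshape=False, mode="wrap"),
                    (dy, dx), mode="wrap")
    return aligned, ang
```

In practice the CCF would be computed on padded ("floated") images to suppress wrap-around terms; that step is omitted here for brevity.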
Later the introduction of multivariate statistical analysis and classification (see Chapter 4) provided another way of distinguishing between
III. Alignment Methods
particles lying in 180°-rotation-related positions. In the multivariate statistical analysis of such a mixed data set, it is clear that the factor accounting for the up versus down mass redistribution predominates.
3. Refinement and Vectorial Addition of Alignment Parameters

Inevitably, the selection of an image for reference produces a bias in the alignment. As a remedy, Frank et al. (1981a) proposed an iterative refinement (Fig. 3.12) whereby the average resulting from the first alignment is used as reference in the next alignment pass, etc. When this scheme was applied to the set of 80 40S ribosomal subunit images, the resolution of the average, as measured by the differential phase residual, showed clear improvement up to the second pass. Reference-based alignment procedures therefore normally incorporate iterative refinement. However, the discrete representation of images poses practical difficulties when multiple steps of rotations and shifts are successively applied to an image. As the image is subjected to several steps of refinement, the necessary interpolations gradually degrade the resolution. As a solution to this problem, the final image is directly computed from the original image, using a rotation and shift combination that results from vectorial addition of the rotations and shifts obtained in each step.
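The vectorial addition of rotations and shifts can be sketched as follows. This is a minimal illustration with my own function names, not code from the text; for rotations about a common origin the angles simply add, while the earlier shift must be rotated before being added.

```python
import numpy as np

def rot_matrix(angle_deg):
    """2 x 2 rotation matrix for a rotation by angle_deg about the origin."""
    a = np.deg2rad(angle_deg)
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def compose(a1_deg, d1, a2_deg, d2):
    """Combine r' = A1 r + d1 followed by r'' = A2 r' + d2 into a single
    step: the resulting rotation is A2 A1 (angles add) and the resulting
    shift is A2 d1 + d2."""
    a_res = a1_deg + a2_deg
    d_res = rot_matrix(a2_deg) @ np.asarray(d1, float) + np.asarray(d2, float)
    return a_res, d_res
```

Applying the composed rotation and shift to the original image requires only a single interpolation, which is the point of this bookkeeping.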
Fig. 3.11. Alignment using the autocorrelation function. (a) A projection of the calcium release channel, padded into a 128 × 128 field. (b) Autocorrelation function of (a). (c) Eight noisy realizations of (a), shifted and rotated randomly. (d) Autocorrelation functions of the images in (c). Two important properties of the ACF can readily be recognized: it is always centered at the origin, irrespective of the translational position of the molecule, and it has a characteristic pattern that in our case always appears in the same orientation, reflecting the sameness of the (in-plane) orientations of the molecules in (c) from which the ACFs were derived. Thus the ACF can be used to determine the rotation of unaligned, uncentered molecules. (e) Scheme of two-step alignment utilizing the translation invariance of the autocorrelation function. Both the reference and the image to be aligned (images on the top left and top right, respectively) are "floated," or padded, into a larger field to avoid wrap-around artifacts. The ACFs are then computed and rotationally cross-correlated. The location of the peak in the rotational CCF establishes the angle θ between the ACFs. From this it is inferred that the image on the top right has to be rotated by −θ to bring it into the same orientation as the reference. (Note, however, that this conclusion is only correct if the motif in the images to be aligned is centrosymmetric. Otherwise, the angle θ between the ACFs is compatible with both the angles θ and θ + 180°; both positions have to be tried in the following cross-correlation; see text.) Next, the correctly rotated, padded image is translationally cross-correlated with the reference, yielding the CCF. The position r of the peak in the CCF is found by a two-dimensional peak search. Finally, the rotated version of the original image is shifted by −r to achieve alignment. From "Electron Microscopy at Molecular Dimensions: State of the Art and Strategies for the Future," Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs, Frank, J., and Goldfarb, W. (1980), pp. 154-160. Reproduced with permission of Springer-Verlag, Berlin.
Fig. 3.12. Reference-based alignment with iterative refinement. Starting with an arbitrary reference particle (picked for its "typical" appearance), the entire set of particles is aligned in the first pass. The aligned images are averaged, and the resulting average image is used as reference in the second pass of alignment. This procedure is repeated until the shifts and rotation angles of the entire image set remain unchanged. From Frank (1982). Reproduced with permission of Wissenschaftliche Verlagsgesellschaft, Stuttgart.
For any pixel with coordinates r in the original image, we obtain new coordinates in the first alignment step as

r′ = α₁r + d₁,   (3.38)

where α₁ and d₁ are the 2 × 2 rotation matrix and translation vector found in the alignment. In the next step (the first step of refinement), we have similarly

r″ = α₂r′ + d₂,   (3.39)

where α₂ and d₂ are the rotation matrix and translation vector expressing the adjustments. Instead of using the image obtained by applying the second transformation, one applies a consolidated transformation to the original image. This consolidated transformation is

r″ = α₂(α₁r + d₁) + d₂ = α₂α₁r + α₂d₁ + d₂ ≡ α_res r + d_res,   (3.40)

with the resulting single-step rotation α_res and single-step translation d_res.

4. Multireference Techniques
Multireference methods have been developed for situations in which more than one motif (i.e., view, particle type, conformational state, etc.) is present in the data set (van Heel and Stöffler-Meilicke, 1985). Alignment of a set of N images with L references (i.e., prototypical images showing the different motifs) leads to an array of L × N correlation coefficients, and each image is put into one of L bins according to which of its L correlation coefficients is maximum. In a second round, averages are formed over the subsets so obtained, which are higher signal-to-noise ratio (SNR) realizations of the motifs and can take the place of the initial references. This refinement step produces improved values of the rotational and translational parameters, but also a migration of particles from one bin to another, on account of the improved discriminating power of the cross-correlation in "close-call" cases. This procedure may be repeated several times until the parameter values stabilize and the particle migration stops. Elaborate versions of this scheme (Harauz et al., 1988) have incorporated multivariate statistical analysis and classification in each step. The experience of van Heel and co-workers (Boekema et al., 1986; Boekema and Boettcher, 1992; Dube et al., 1993) has indicated that the multireference procedure is not necessarily stable; initial preferences may be amplified, leading to biased results, especially for small particles. The failure of this procedure is a consequence of an intrinsic problem of the reference-based alignment approach. In fact, it can be shown that the multireference alignment algorithm is closely related to the K-means clustering technique (Section IV, C in Chapter 4), sharing all of its drawbacks (see the note in Section IV, J of Chapter 4; Penczek, personal communication, 1995).
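One round of the bin-and-average step can be sketched as follows. This is a bare-bones illustration with my own function names, not code from the text; the rotational and translational alignment of each image to each reference, which the real procedure performs before computing each correlation coefficient, is omitted for brevity.

```python
import numpy as np

def normalized_cc(a, b):
    """Correlation coefficient between two images of equal size."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float((a * b).mean())

def multireference_assign(images, refs):
    """Assign each image to the reference with the highest correlation
    coefficient; the per-bin averages (higher-SNR realizations of the
    motifs) are returned as the references for the next round."""
    bins = [[] for _ in refs]
    for img in images:
        scores = [normalized_cc(img, ref) for ref in refs]
        bins[int(np.argmax(scores))].append(img)
    # empty bins keep their old reference
    return [np.mean(b, axis=0) if b else ref
            for b, ref in zip(bins, refs)]
```

Iterating this function until the bin assignments stop changing mirrors the repetition described above, and also exposes the kinship with K-means noted in the text: the references play the role of cluster centroids.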
E. Reference-Free Techniques

1. Introduction

As we have seen, reference-based alignment methods fail to work for heterogeneous data sets when the SNR drops below a certain value. The
choice of the initial reference can be shown to bias the outcome (see also Penczek et al., 1992). van Heel et al. (1992a, b) demonstrated that a replica of the reference (the authors chose a portrait of Einstein) emerges when the correlation averaging technique is applied to a set of images containing pure random noise. A similar observation was reported earlier by Radermacher et al. (1986b) for correlation averaging of crystals: when the SNR is reduced to 0 (i.e., only a noise component remains in the image model), an average resembling the reference motif is still formed. This phenomenon is easily understood: maximum correlation between a motif and an image field occurs, as a function of translation/rotation, where the image has maximal similarity. If (as is often the case) the correlation averaging procedure employs no correlation threshold, such areas are always found, even in a noise field. By adding up those areas of maximum similarity, one effectively reinforces all noise components that tend to replicate the reference pattern. This becomes quite clear from an interpretation of images as points in a multidimensional space; see Chapter 4. The emergence of the signal out of "thin air," an inverse Cheshire cat phenomenon, is in some way analogous to the outcome of an experiment in optical filtering: a mask that passes reflections on the reciprocal grid is normally used to selectively enhance Fourier components that build up the structure repeating on the crystal lattice. If one uses such a mask to filter an image that consists of pure noise, the lattice and some of the characteristics of the structure are still generated, albeit with spatially varying distortions, because the assignment of phases is a matter of chance. Application of rigorous statistical tests would of course eliminate the spurious correlation peaks as insignificant.

However, the problem with such an approach is that it requires a detailed statistical model (which varies with many factors such as object type, preparation technique, imaging conditions, and electron dose). Such a model is usually not available. Attempts to overcome these problems have led to the development of reference-free methods of alignment. Three principal directions have been taken: the first, proposed by Schatz and van Heel (1990, 1992; see also Schatz et al., 1990; Schatz, 1992), eliminates the need for alignment among different classes of images by the use of invariants; the second, proposed by Penczek et al. (1992), solves the alignment problem by an iterative method; and the third, developed recently for large data sets (Dube et al., 1993; Marabini and Carazo, 1994a), is based on the fact that among a set of molecules presenting the same view, subsets with similar ("in-plane") orientation can be found. In explaining these approaches, we are occasionally forced to make reference to the subject of multivariate statistical analysis and classification, which will be described in some detail later
(Chapter 4). For an understanding of the present issues, it may be sufficient to think of classification as a "black box" procedure that is able to sort images (or any patterns derived from them) into groups or classes according to their appearance.
2. Use of Invariants: The Double Autocorrelation Function

Functions that are invariant both under rotation and translation of an image can be used to classify a heterogeneous image set. The autocorrelation function of an image is invariant under a translation of a motif contained in it. This property has been used in the reference-based alignment originally proposed by Langer et al. (1970; see also Frank, 1975; Frank et al., 1978a). The double autocorrelation function (DACF) has the additional property that it is invariant under rotation as well; it is derived from the normal translational autocorrelation function by subjecting it to an operation of rotational autocorrelation (Schatz and van Heel, 1990). Following Schatz and van Heel's procedure (Fig. 3.13), the ACF is first resampled on a polar grid to give Φ(r_n, φ_m). The rotational autocorrelation function is obtained by computing one-dimensional fast Fourier transforms (FFTs) and conjugate Fourier products along rings, using the same philosophy as in the computation of the rotational CCF; see Eqs. (3.26) or (3.37). The resulting function has the remarkable property of being invariant under both translation and rotation, because the addition of an arbitrary angle to φ in Φ(r_n, φ) above again does not affect the outcome. Later, Schatz and van Heel (1992; see also van Heel et al., 1992b) introduced another function, which they termed the double self-correlation function (DSCF), that avoids the "dynamic" problem caused by the twofold squaring of intensities in the computation of the DACF. The modification consists of a strong suppression of the low spatial frequencies by application of a band-pass filter. Through the use of either the DACF or the DSCF, classification of the aligned, original images can be effectively replaced by a classification of their invariants, which are formed from the images without prior alignment.
In these schemes, the problem of alignment of a heterogeneous image set is thus entirely avoided; alignment need only be applied separately, after classification of the invariants, to images within each homogeneous subset. This approach of invariant classification poses a principal problem, however (Schatz and van Heel, 1990; Penczek et al., 1992; Frank et al., 1992): the DACFs and DSCFs are twofold degenerate representations of the initial images. Forming the autocorrelation function (or its "self-correlation" equivalent) of an image corresponds to eliminating the phase information in its Fourier transform. In the second step, in forming the
Fig. 3.13. Demonstration of the invariance of the double autocorrelation function (DACF) under rotation and shift of an image. (a) Image of worm hemoglobin in side view and the same image rotated and shifted. (b) Autocorrelation functions of the images in (a). The ACF of the image on the left is seen to be identical to that on the right, but rotated by the same angle as the image. (c) The ACFs of (b) in polar coordinates (horizontal: azimuthal angle, from 0° to 360°; vertical: radius). The rotation between the ACFs is reflected by a horizontal shift of the polar coordinate representation. (d) A second, one-dimensional autocorrelation in the horizontal direction produces identical patterns for both images: the DACF in a polar coordinate representation. (e) The DACF mapped into the Cartesian coordinate system. From Schatz (1992). Reproduced with permission of the author.
DACF or DSCF from the ACF, the phases describing the rotational features in the 1D Fourier transforms along rings are also lost. Therefore a classification of the invariants does not necessarily induce a correct classification of the images themselves. Theoretically, a very large number of patterns have the same DACF; the saving grace is that they are unlikely
to be realized in the same experiment. Experience will tell whether the problem of ambiguity is a practical or merely an academic problem. For completeness, three other approaches making use of invariants, which have not gained practical importance because of inherent problems, should be mentioned. They are based, respectively, on the use of moments (Goncharov et al., 1987; Salzman, 1990), of bispectra (Marabini and Carazo, 1994b, 1995), and of a new shift-invariant function (Frank et al., 1992). Both the method of moments and the method of bispectra are plagued by extreme noise sensitivity and numerical instability, as noted by the respective authors, while the third method cannot be easily extended to render the pattern rotationally invariant (Penczek and Frank, 1992, unpublished).
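The construction of the DACF described above (translational ACF, polar resampling, rotational autocorrelation along rings) can be sketched as follows. This is an illustrative implementation with my own function names; the "floating" of the image into a larger field to suppress wrap-around, and interpolation finer than nearest-neighbour polar sampling, are omitted.

```python
import numpy as np

def polar_resample(img, n_r=32, n_phi=128):
    """Nearest-neighbour resampling of an image onto a polar (r, phi)
    grid centred on the middle of the array."""
    cy, cx = img.shape[0] / 2.0, img.shape[1] / 2.0
    r = np.linspace(0.0, min(cy, cx) - 1.0, n_r)[:, None]
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)[None, :]
    yy = np.clip(np.round(cy + r * np.sin(phi)).astype(int), 0, img.shape[0] - 1)
    xx = np.clip(np.round(cx + r * np.cos(phi)).astype(int), 0, img.shape[1] - 1)
    return img[yy, xx]

def dacf(img):
    """Double autocorrelation function: translational ACF via FFT,
    polar resampling, then a 1D rotational autocorrelation along
    each ring (Schatz and van Heel's scheme, in outline)."""
    F = np.fft.fft2(img)
    acf = np.fft.fftshift(np.fft.ifft2(F * np.conj(F)).real)
    rings = polar_resample(acf)
    G = np.fft.fft(rings, axis=1)
    return np.fft.ifft(G * np.conj(G), axis=1).real
```

Because the ACF discards translational phases and the second, ring-wise autocorrelation discards rotational phases, the result is unchanged when the input image is shifted or rotated; this also makes the twofold degeneracy discussed in the text explicit.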
3. Exhaustive Sampling of Parameter Space

These approaches are based on the premise that groups of particles with closely related rotations can be found by applying multivariate statistical analysis to a particle set that has been merely translationally aligned. If a data set is sufficiently large, then the number of particles presenting a similar appearance and falling within an angular range Δφ is on the average

n_v = (Δφ/2π) · n_tot · p_v,   (3.41)

where n_tot is the total number of particles, and p_v is the probability of encountering a particular view. For example, if the particle presents five views with equal probability (p_v = 0.2) and n_tot = 10,000, then n_v ≈ 60 for Δφ = 10°. Any particles that have the same structure and occur in similar azimuths will then fall within the same region of factor space. In principle, corresponding averages could be derived by the following "brute force" method: the most significant subspace of factor space (which might be just two-dimensional) is divided according to a coarse grid, and images are averaged separately according to the grid element into which they fall. The resulting statistically well-defined averages could then be related to one another by rotational correlation, yielding the azimuthal angles. The within-group alignment of particles can obviously be refined so that eventually an angle can be assigned to each particle.

The Dube et al. (1993) method of "alignment through classification" proceeds somewhat differently from the general idea sketched out above, taking advantage of the particle's symmetry, and in fact establishing the symmetry of the molecule (in this case, the head-to-tail connector protein, or "portal protein," of bacteriophage φ29) as seen in the electron micrograph. The molecule is rosette-shaped, with a symmetry that has been
given variously as 12- or 13-fold by different authors. In the first step, molecules presenting the circular top view were isolated by a multireference approach. For unaligned molecules appearing in the top view with N-fold symmetry, a given pixel at the periphery of the molecule is randomly realized with high or low density across the entire population. If we proceed along a circle to a pixel that is 360°/2N away from the one considered, the pattern of variation across the population has consistently shifted by 180°: if the pixel in a given image was dark, it is now bright, and vice versa. Dube et al. (1993) showed that these symmetry-related patterns of variation are indeed reflected in the eigenimages produced by multivariate statistical analysis (Chapter 4).

4. Iterative Alignment Method
Penczek et al. (1992) introduced a method of reference-free alignment based on an iterative algorithm that also avoids singling out an image as "reference." Its rationale and implementation are described in the following. We first go back to a definition of what constitutes the alignment of an entire image set. By generalizing the alignment between two images to a set of N images, Frank et al. (1986) proposed the following definition: a set of N images is aligned if all images are pairwise aligned. In the Penczek et al. notation, alignment of such a set P = {p_i; i = 1, ..., N} is achieved if the functional

L(P, S) = Σ_{i=1}^{N−1} Σ_{k=i+1}^{N} ∫ [p_i(r; s_φ^i, s_x^i, s_y^i) − p_k(r; s_φ^k, s_x^k, s_y^k)]² dr   (3.42)

is minimized by appropriate choice of the set of 3N parameters

S = {s_φ^i, s_x^i, s_y^i; i = 1, ..., N}.   (3.43)
The actual computation of all pairwise cross-correlations, as would be required by an implementation of (3.42), would be quite impractical; it would also have the disadvantage that each term is derived from two raw images having low SNR. Penczek et al. show that minimization of (3.42) is equivalent to minimization of

L′(P, S) = Σ_{i=1}^{N} ∫ [p_i(r; s_φ^i, s_x^i, s_y^i) − p̄_i(r)]² dr,   (3.44)
where

p̄_i(r) = [1/(N − 1)] Σ_{k=1, k≠i}^{N} p_k(r; s_φ^k, s_x^k, s_y^k).   (3.45)
In this reformulated expression, each image numbered i is aligned to a partial average of all images, created from the total average by subtracting the current image numbered i. This formula lends itself to the construction of an iterative algorithm involving partial sums, which have the advantage of possessing a strongly increased SNR when compared to the raw images. In detail, the algorithm consists of two steps (Fig. 3.14). The first step, the random approximation of the global average, proceeds as follows (for simplicity, the argument vector r is dropped):

i. pick two images p_i and p_k at random from the set P of images;
ii. align p_i and p_k (using any algorithm that minimizes ||p_i − p_k||);
iii. initialize the global average (general term to be denoted by a_m) by setting a_2 = (p_i (+) p_k), where the symbol (+) is used in this and the following description to denote the algebraic sum of the two images after application of the rotation and shifts found;
iv. set a counter m to 3;
v. pick the next image p_t from the set of N − m + 1 remaining images;
vi. align p_t and a_{m−1};
vii. update the global average to give a_m = (p_t (+) (m − 1) a_{m−1})/m;
viii. increase counter m by 1. If m = N then stop, else go to step v.

The second part of the algorithm performs an iterative refinement of a_N, the average A:

i. set counter m = 1;
ii. create the modified average A′ by subtracting the current image p_k in its current position: A′ = (NA − p_k)/(N − 1);
iii. align p_k with A′;
iv. update the global average A as follows: A = (p_k (+) (N − 1)A′)/N;
v. increase m by 1. If m < N then go to step ii;
Fig. 3.14. Scheme for reference-free alignment by Penczek et al. (1992). The algorithm consists of two parts (a and b). (+): the sum formed after the best orientation is found. For details see text. From Penczek et al. (1992). Reproduced with permission of Elsevier Science, Amsterdam.
vi. if in step iii any of the images changed its position significantly, then go back to step i; else stop.

The algorithm as outlined has the following properties: (a) No reference image is used, and so the result of the alignment does not depend on the choice of a reference, although there is some degree of dependency on the sequence in which the images are picked. (b) The algorithm is necessarily suboptimal, since it falls short (by a wide margin) of exploring the entire parameter space. (c) Experience shows that, when a heterogeneous image set is used, comprising dissimilar subsets (e.g., relating to different particle orientations), then the images within each subset are aligned to one another upon completion of the algorithm.
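The two-part scheme can be sketched for the translation-only case. This is a simplification of the method just described: the full algorithm also searches the in-plane rotation at each alignment step, and stops when no image moves significantly rather than after a fixed number of sweeps. Function names are mine.

```python
import numpy as np

def best_shift(ref, img):
    """Translational alignment via FFT cross-correlation; returns the
    integer shift that best maps img onto ref (circular convention)."""
    c = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(img))).real
    iy, ix = np.unravel_index(np.argmax(c), c.shape)
    dy = iy if iy <= c.shape[0] // 2 else iy - c.shape[0]
    dx = ix if ix <= c.shape[1] // 2 else ix - c.shape[1]
    return dy, dx

def reference_free_align(images, n_iter=5):
    """Two-part reference-free scheme: (1) build the global average from
    randomly ordered images; (2) refine each image against the average
    with that image's own contribution removed."""
    n = len(images)
    rng = np.random.default_rng(0)
    order = rng.permutation(n)
    shifts = [(0, 0)] * n
    # part 1: random approximation of the global average
    avg = images[order[0]].astype(float).copy()
    for m, i in enumerate(order[1:], start=2):
        dy, dx = best_shift(avg, images[i])
        shifts[i] = (dy, dx)
        avg = (np.roll(images[i], (dy, dx), axis=(0, 1)) + (m - 1) * avg) / m
    # part 2: iterative refinement (a fixed sweep count stands in for
    # the "no image moved significantly" stopping rule)
    for _ in range(n_iter):
        for i in range(n):
            cur = np.roll(images[i], shifts[i], axis=(0, 1))
            partial = (n * avg - cur) / (n - 1)   # average minus image i
            dy, dx = best_shift(partial, images[i])
            shifts[i] = (dy, dx)
            avg = (np.roll(images[i], (dy, dx), axis=(0, 1)) + (n - 1) * partial) / n
    return avg, shifts
```

Note that no image is singled out as reference; the absolute position of the final average is arbitrary, but all images end up mutually aligned.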
IV. Averaging and Global Variance Analysis

A. The Statistics of Averaging

After successful alignment, each image element j in the coordinate system of the molecule is represented by a series of measurements
{p_i(r_j); i = 1, ..., N},   (3.46)
which can be characterized by an average

p̄_(N)(r_j) = (1/N) Σ_{i=1}^{N} p_i(r_j),   (3.47)

and a variance

V_(N)(r_j) = s²_(N)(r_j) = [1/(N − 1)] Σ_{i=1}^{N} [p_i(r_j) − p̄_(N)(r_j)]².   (3.48)

Both {p̄_(N)(r_j); j = 1, ..., J} and {V_(N)(r_j); j = 1, ..., J} can be represented as images, or maps, which are simply referred to as average image and variance map. The meaning of these maps depends on the statistical distribution of the pixel measurements. If we use the simple assumption of additive noise with zero-mean Gaussian statistics,

p_i(r_j) = p(r_j) + n_i(r_j),   (3.49)

then p̄_(N)(r_j) as defined above is an unbiased estimate of the mean and thus represents the structural motif p(r) more and more faithfully as N is increased. The quantity V_(N)(r_j) = s²_(N)(r_j) is an estimate of σ²(r_j), the squared standard deviation of the noise. A display of the variance map was first used in the context of image averaging in electron microscopy by Carrascosa and Steven (1979) and in single particle averaging by Frank et al. (1981a). The variance is a locally varying function for two reasons: (i) the noise statistics vary with the electron dose, which in turn is proportional to the local image intensity, and (ii) part of the observed variations are "signal-associated"; that is, they arise from components of the structure that differ from one particle to the other in density or precise location. Usually no statistical models exist for these variations.
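Equations (3.47) and (3.48) amount to pixel-wise first and second moments over the aligned stack; a minimal sketch (function name is mine):

```python
import numpy as np

def average_and_variance(stack):
    """Pixel-wise average image (Eq. 3.47) and variance map (Eq. 3.48)
    from an aligned stack of shape (N, ny, nx)."""
    stack = np.asarray(stack, dtype=float)
    avg = stack.mean(axis=0)
    var = stack.var(axis=0, ddof=1)   # 1/(N-1) normalisation, as in Eq. (3.48)
    return avg, var
```

Under the additive-noise model of Eq. (3.49), the average converges to the motif and the variance map to σ²(r_j) as N grows.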
The variance map is particularly informative with regard to the signal-associated components, as it allows regions of local inconsistency to be spotted. For instance, if half of a set of 50S ribosomal subunits were depleted of the L7/L12 stalk, and the other half not, then a region of strong inconsistency would show up in the variance map precisely at the place of the stalk.
B. The Variance Map and Analysis of Significance

We have seen in the foregoing that one of the uses of the variance map is to pinpoint image regions where the images in the set vary strongly. The possible sources of interimage variability are numerous:

i. presence versus absence of a molecule component, e.g., in partial depletion experiments (Carazo et al., 1988);
ii. presence versus absence of a ligand, e.g., in immunoelectron microscopy (Boisset et al., 1993b; Gogol et al., 1990) or a ribosome-factor binding complex (Srivastava et al., 1992);
iii. conformational change, i.e., movement of a mass (Carazo et al., 1988, 1989) or of many thin flexible masses (Wagenknecht et al., 1992);
iv. compositional heterogeneity;
v. variation in orientation, e.g., rocking or flip/flop variation (van Heel and Frank, 1981; Bijlholt et al., 1982);
vi. variation in stain depth (Frank et al., 1981a, 1982);
vii. variation in magnification (Bijlholt et al., 1982).

Striking examples are found in numerous studies of negatively stained molecules, where the variance map reveals that the stain depth at the boundary of the molecule is the strongest varying feature. The 40S ribosomal subunit of eukaryotes shows this behavior very clearly (Frank et al., 1981a); see Fig. 3.15. An example of a study in which structural information is gleaned from the variance map is found in the paper by Wagenknecht et al. (1992) (Fig. 3.16). Here a core structure (E2 cores of pyruvate dehydrogenase) is surrounded by lipoyl domains which do not show up in the single particle average because they do not appear to assume fixed positions. Their presence at the periphery of the E2 domain is nevertheless reflected in the variance map by the appearance of a strong white halo of high variance. However, the "global" variance analysis made possible by the variance map has some shortcomings.
While it alerts us to the presence of variations and inconsistencies, and gives their location in the image field, it fails to characterize the different types of variation and to flag those images that have an outlier role. For a more specific analysis, the tools of
Fig. 3.15. Average and variance map obtained from an aligned set of macromolecules. (a) Sixteen of a total set of 77 images showing the 40S ribosomal subunit of HeLa in L-view orientation; (b) average image; (c) variance map; and (d) standard deviation map, showing prominent variations mainly at the particle border where the amount of stain fluctuates strongly (white areas indicate high variance). Reprinted from Frank, J., Verschoor, A., and Boublik, M. Science 214, 1353-1355. Copyright 1981 American Association for the Advancement of Science.
multivariate statistical analysis and classification must be employed (see Chapter 4).

Another important use of the variance map is the assessment of significance of local features in the average image (Frank et al., 1986), using standard methods of statistical inference (e.g., Cruickshank, 1959; Sachs, 1984): each pixel value in that image, regarded as an estimate of the mean, is accompanied by a confidence interval within which the true value of the mean is located with a given probability. In order to construct the confidence interval, we consider the random variable

t = [p̄(r_j) − p(r_j)] / ŝ(r_j),   (3.50)

where ŝ(r_j) = [V(r_j)/N]^{1/2} is the standard error of the mean, which can be
Fig. 3.16. Example of the type of information contained in a variance map: visualization of a "corona" of highly flexible lipoyl domains surrounding the E2 core of pyruvate dehydrogenase complex of Escherichia coli. (A) Electron micrograph of frozen-hydrated E2 cores presenting fourfold symmetric views. Each is surrounded by fuzzy structures believed to be the lipoyl domains bound to the molecule. Due to their changing positions, the average map (B, left) depicts only the E2 core, but the variance map (B, right) shows a ring-shaped region of high variance. From Wagenknecht et al. (1992). Reproduced with permission of Academic Press.

computed from the measured variance map expressed by Eq. (3.48) (see Fig. 3.15). The true mean lies in the interval p̄(r_j) ± t·ŝ(r_j), the confidence interval, with probability P if t satisfies
P = ∫_{−t}^{+t} s_{N−1}(τ) dτ,   (3.51)
where s_{N−1}(τ) is the Student distribution with N − 1 degrees of freedom. Beyond a number of images N = 60 or so, the distribution changes very little, and the confidence intervals for the most frequently used probabilities become t = 1.96 (P = 95%), t = 2.58 (P = 99%), and t = 3.29 (P = 99.9%). It must be noted, however, that the use of the variance map to assess the significance of local features in a difference map must be reserved for regions where no signal-related inconsistencies exist. In fact, the statistical analysis outlined here is only meaningful for a homogeneous image set in the sense discussed in Section II, B, 1.

Let us now discuss an application of statistical hypothesis testing to the comparison between two averaged pixels, p̄₁(r_j) and p̄₂(r_k), which are obtained by averaging over N₁ and N₂ realizations, respectively (Frank et al., 1988a). This comparison covers two different situations: in one, two pixels j ≠ k from the same image are being compared, and the question is whether the difference between the values of these two pixels, separated by the distance r_j − r_k, is significant. In the other situation, the values of the same pixel j = k are compared as realized in two averages resulting from different experiments. A typical example might be the detection of extra mass at the binding site of the antibody in an immunolabeled molecule when compared with a control. Here the question is whether the density difference detected at the putative binding site is statistically significant. In both these situations, the standard error of the difference between the two averaged pixels is given by
s_d[\bar{p}_1(r_j), \bar{p}_2(r_k)] = \left[ v_1(r_j)/N_1 + v_2(r_k)/N_2 \right]^{1/2}.   (3.52)
Differences between two averaged image elements are deemed significant if they exceed the standard error by at least a factor of three. This choice corresponds to a significance level of 0.2% [i.e., P = 99.8% in Eq. (3.51)]. In single-particle analysis, this type of significance analysis was first done by Zingsheim et al. (1982), who determined the binding site of bungarotoxin on the projection map of the nicotinic acetylcholine receptor molecule of Torpedo marmorata. Another example is the Wagenknecht et al. (1988) determination of the anticodon binding site of P-tRNA on the 30S ribosomal subunit as it appears in projection (Fig. 3.17). An example of a t-map showing the sites where an undecagold cluster is localized in a 2D difference map is found in Crum et al. (1994). The same kind of test is of course even more important in three dimensions, when reconstructions are compared that show a molecule with and without a ligand, or in different conformational states (see Section II, C in Chapter 6).
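As a concrete illustration, the test can be written down in a few lines. The following NumPy sketch (the function name and synthetic data are my own; this is not the implementation used in the cited studies) forms the difference of two average images and flags pixels where the difference exceeds three times the standard error of Eq. (3.52):

```python
import numpy as np

def significance_map(stack1, stack2, t_thresh=3.0):
    """Pointwise significance of the difference of two average images.

    The standard error of the difference follows Eq. (3.52); pixels where
    |difference| > t_thresh * standard error are flagged as significant.
    """
    n1, n2 = len(stack1), len(stack2)
    mean1, mean2 = stack1.mean(axis=0), stack2.mean(axis=0)
    v1 = stack1.var(axis=0, ddof=1)       # sample variance maps, cf. Eq. (3.48)
    v2 = stack2.var(axis=0, ddof=1)
    diff = mean1 - mean2
    std_err = np.sqrt(v1 / n1 + v2 / n2)  # Eq. (3.52)
    return diff, std_err, np.abs(diff) > t_thresh * std_err
```

Applied to a labeled image set versus a control, the boolean map corresponds to panel (F) of Fig. 3.17.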
Chapter 3. Two-Dimensional Averaging Techniques
C. Signal-to-Noise Ratio
1. Concept and Definition

The SNR is the ratio between the variance of the signal and the variance of the noise in a given image.¹⁴ This measure is extremely useful in assessing the quality of experimental data, as well as the power of two- and three-dimensional averaging methods. Unfortunately, the definition varies among different fields, with the SNR often being the square root of the above definition, or with the numerator being measured peak to peak. In the following we will use the ratio of variances, which is the most widely used definition in digital signal processing. The sample variance of an image {p_{ij}; j = 1...J} is defined as

var(p_i) = \frac{1}{J-1} \sum_{j=1}^{J} \left[ p_{ij} - \langle p_i \rangle \right]^2,   (3.53)

with the sample mean

\langle p_i \rangle = \frac{1}{J} \sum_{j=1}^{J} p_{ij}.   (3.54)
According to Parseval's theorem, the variance of a band-limited function can be expressed as an integral (or its discrete equivalent) over its squared Fourier transform:

var(p_i) = \int_{B_0} |P_i(\mathbf{k})|^2 \, d\mathbf{k},   (3.55)
where B_0 denotes a modified version of the "resolution domain" B, i.e., the bounded domain in Fourier space representing the signal information. The modification indicated by the subscript 0 is that the integration exempts the term |P(0)|^2 at the origin.

¹⁴ Note that in engineering fields, the signal-to-noise ratio is defined as the ratio of the signal power to the noise power. This ratio is identical with the variance ratio only if the means of both signal and noise are zero.

Fig. 3.17. Example of a significance test: localization of the P-site tRNA anticodon on the 30S ribosomal subunit of E. coli. (A) Average of 53 images of photo-cross-linked tRNA-30S complexes. (B) Average of 73 control images. (C) Difference image (A)-(B) with a contour interval that is half of the interval in (A) and (B). (D) Standard error of the difference image. (E) Map of statistical significance (difference image minus 3× standard error of the difference), which shows regions of high significance in white. (F) Only regions where the map (E) is positive, indicating a highly significant difference, are shown in white. From Wagenknecht et al. (1988). Reproduced with permission of Academic Press Ltd.

Applied to both the signal and noise portions of the image p(r) = o(r) + n(r), one obtains
\mathrm{SNR} = \frac{\mathrm{var}(o)}{\mathrm{var}(n)} = \frac{\int_{B_0} |O(\mathbf{k})|^2 H^2(\mathbf{k}) \, d\mathbf{k}}{\int_{B_0} |N(\mathbf{k})|^2 \, d\mathbf{k}}.   (3.56)
Often the domain, B', within which the signal portion of the image possesses appreciable values is considerably smaller than B. In that case, it is obvious that low-pass filtration of the image to band limit B' leads to an increased SNR without signal being sacrificed. For uniform spectral density of the noise power up to the boundary of B, the gain in SNR upon low-pass filtration to the true band limit B' is, according to Parseval's theorem (3.55), equal to the ratio area{B}/area{B'}. Similarly, it often happens that the noise power spectrum |N(k)|^2 is uniform, whereas the transferred signal transform |O(k)|^2 H^2(k) falls off radially. In that case, the elimination, through low-pass filtration, of a high-frequency band may boost the SNR considerably without affecting the interpretable resolution.
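The area-ratio argument is easy to verify numerically. In the following sketch (synthetic data, illustrative only), white noise fills the whole Nyquist domain B while the signal is confined to a disk B', and low-pass filtration to B' raises the variance-ratio SNR by approximately area{B}/area{B'}:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 128
f = np.fft.fftfreq(n)
radius = np.hypot(*np.meshgrid(f, f, indexing="ij"))
band = radius < 1.0 / 8                      # B': true signal band limit

# band-limited "signal" and white noise (the noise fills all of B)
S = np.zeros((n, n), dtype=complex)
S[band] = rng.normal(size=band.sum()) + 1j * rng.normal(size=band.sum())
signal = np.fft.ifft2(S).real
signal /= signal.std()
noise = rng.normal(size=(n, n))

def lowpass(img):
    """Zero all Fourier components outside B'."""
    return np.fft.ifft2(np.where(band, np.fft.fft2(img), 0)).real

snr_before = signal.var() / noise.var()
snr_after = lowpass(signal).var() / lowpass(noise).var()
gain = snr_after / snr_before
print(gain, n * n / band.sum())              # gain ~ area{B}/area{B'}
```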
2. Measurement of the Signal-to-Noise Ratio

Generally, the unknown signal is mixed with noise, so the measurement of the SNR of an experimental image is not straightforward. Two ways of measuring the SNR of "raw" image data have been put forward: one is based on the dependence of the sample variance [Eq. (3.53)] of the average image on N (the number of images averaged), and the other on the cross-correlation of two realizations of the image.

a. N-Dependence of Sample Variance. We assume the noise is additive, uncorrelated, stationary (i.e., with shift-independent statistics), and Gaussian, and that it is uncorrelated with the signal. In that case, the variance of a "raw" image p_i is, independently of i,
\mathrm{var}(p_i) = \mathrm{var}(p) + \mathrm{var}(n).   (3.57)

The variance of the average \bar{p} of N images is

\mathrm{var}(\bar{p}) = \mathrm{var}(p) + (1/N)\,\mathrm{var}(n),   (3.58)
i.e., with increasing N, the proportion of the noise variance relative to the signal variance is reduced. This formula suggests a plot of var(\bar{p}) versus 1/N as a means to obtain the unknown var(p) and var(n) (Hänicke, 1981; Frank
et al., 1981a; Hänicke et al., 1984): if the assumptions made at the beginning are correct, the measured values of var(\bar{p}) should lie on a straight line whose slope is the desired quantity var(n) and whose intersection with the var(\bar{p}) axis (obtained by extrapolating to 1/N \to 0) gives the desired quantity var(p). Figure 3.18 shows the example obtained for a set of 81 images of the negatively stained 40S ribosomal subunit of HeLa cells (Frank et al., 1981a). It is seen that the linear dependency predicted by Eq. (3.58) is indeed a good approximation for this type of data.

b. Measurement by Cross-Correlation. Another approach to the measurement of the SNR makes use of the definition of the cross-correlation coefficient (CCC). The CCC of two realizations of a noisy image, p_{ij} and p_{kj}, is defined as [see Eq. (3.16)]

\rho_{12} = \frac{\sum_{j=1}^{J} [p_{ij} - \langle p_i \rangle][p_{kj} - \langle p_k \rangle]}{\left\{ \sum_{j=1}^{J} [p_{ij} - \langle p_i \rangle]^2 \sum_{j=1}^{J} [p_{kj} - \langle p_k \rangle]^2 \right\}^{1/2}},   (3.59)
where \langle p_i \rangle and \langle p_k \rangle again are the sample means defined in the previous section. When we substitute p_{ij} = p_j + n_{ij}, p_{kj} = p_j + n_{kj}, and observe that according to the assumptions both noise functions have the same variance var(n), we obtain the very simple result (Frank and Al-Ali, 1975)

\rho_{12} = \frac{\alpha}{1 + \alpha},   (3.60)
Fig. 3.18. Decrease of the variance of the average image as a function of the number of images averaged, N, or (linear) increase of the variance as a function of 1/N. Extrapolation to N = ∞ allows the variance of the signal to be measured. Averaged were N = 81 images of the 40S ribosomal subunit from HeLa cells, negatively stained and showing the L-view. Reprinted from Frank, J., Verschoor, A., and Boublik, M. Science 214, 1353-1355. Copyright 1981 The American Association for the Advancement of Science.
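The extrapolation shown in Fig. 3.18 amounts to a straight-line fit of var(\bar{p}) against 1/N. A sketch with synthetic images (the numbers are arbitrary, chosen only so that the recovered slope and intercept can be checked against the known var(n) = 4 and var(p) = 1):

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.normal(size=(64, 64))                     # stands in for the projection, var(p) ~ 1
stack = signal + 2.0 * rng.normal(size=(200, 64, 64))  # additive noise, var(n) ~ 4

ns = np.arange(10, 201, 10)
var_avg = np.array([stack[:n].mean(axis=0).var(ddof=1) for n in ns])

# Eq. (3.58): var(p_bar) = var(p) + var(n)/N, i.e., linear in 1/N
slope, intercept = np.polyfit(1.0 / ns, var_avg, 1)
print(f"var(n) ~ {slope:.2f}, var(p) ~ {intercept:.2f}")
```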
from which the SNR is obtained as

\alpha = \frac{\rho_{12}}{1 - \rho_{12}}.   (3.61)
Thus the recipe for estimating the SNR is very simple: choose two images from the experimental image set, align them, and filter them to the resolution deemed relevant for the SNR measurement. Then compute the CCC and use formula (3.61) to obtain the SNR estimate. Because of the inevitable fluctuations of the results, it is advisable to repeat this procedure with several randomly picked image pairs and use the average SNR as a more reliable figure. In practice, the measurement of the SNR is somewhat more complicated, since the microdensitometer measurement introduces another noise process which, unchecked, would lead to overly pessimistic SNR estimates. To take account of this additional contribution, Frank and Al-Ali (1975) used a control experiment in which two scans of the same micrograph were evaluated with the same method, and a corresponding SNR for the microdensitometer, \alpha_d, was found. The true SNR is then obtained as
\alpha_{\mathrm{true}} = \frac{1}{(1 + 1/\alpha)/(1 + 1/\alpha_d) - 1},   (3.62)
where \alpha is the SNR deduced from the uncorrected experiment following Eq. (3.61).
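Equations (3.59)-(3.62) are simple enough to state directly in code. A minimal NumPy sketch (function names are illustrative, not from any published package):

```python
import numpy as np

def ccc(a, b):
    """Cross-correlation coefficient of two aligned images, Eq. (3.59)."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def snr_from_ccc(rho):
    """Eq. (3.61): alpha = rho / (1 - rho)."""
    return rho / (1.0 - rho)

def snr_corrected(alpha, alpha_d):
    """Eq. (3.62): correct alpha for digitization noise of SNR alpha_d,
    estimated from two scans of the same micrograph."""
    return 1.0 / ((1.0 + 1.0 / alpha) / (1.0 + 1.0 / alpha_d) - 1.0)
```

With two realizations of SNR 1 (signal and noise of equal variance), `ccc` returns roughly 0.5 and `snr_from_ccc` recovers roughly 1, as Eq. (3.60) predicts.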
V. Resolution

A. The Concept of Resolution

Optical definitions of resolution are based on the ability of an instrument to resolve two points separated by a given distance, d. This concept is problematic if applied to experiments on the nanometer scale, in which any test object, as well as the support it must be placed on, reveals its atomic makeup. Another criticism (Di Francia, 1955) concerns an information-theoretical aspect of the experiment: if it is known a priori that the object consists of two points, then measurement of their mutual distance from the image is essentially a pattern recognition problem, which is limited by noise, not by the size of d. In both crystallography and statistical optics, it is common to define resolution by the orders of (object-related) Fourier components available
for the Fourier synthesis of the image. This so-called crystallographic resolution R_c and Rayleigh's point-to-point resolution distance d for an instrument which is diffraction-limited to R_c are related by

d = 0.61/R_c.   (3.63)
In electron crystallography, the signal-related Fourier components of the image are distinguished from noise-related components by the fact that the former lie on a regular lattice, the reciprocal lattice, while the latter form a continuous background. Thus, resolution can be specified by the radius of the highest orders that stand out from the background. What "stand out" means can be quantified by relating the amplitude of the peak to the mean of the background surrounding it (see below). In single-particle averaging, on the other hand, there is no distinction in the Fourier transform between the appearance of signal and noise, and resolution estimation must take a different route. There are two categories of resolution tests; one is based on the comparison of two independent averages in the Fourier domain ("cross-resolution"), while the other is based on the multiple comparison of the Fourier transforms of all images participating in the average. The differential phase residual (Frank et al., 1981a), Fourier ring correlation (van Heel et al., 1982; Saxton and Baumeister, 1982), and method of Young's fringes (Frank et al., 1970; Frank, 1972a) fall into the first category, while the spectral signal-to-noise ratio (Unser et al., 1987, 1989) and the Q-factor (van Heel and Hollenberg, 1980; Kessel et al., 1985) fall into the second. Averaging makes it possible to recover signal that is present in very small amounts. Its distribution in Fourier space, as shown by the signal power spectrum, shows a steep falloff. For negatively stained specimens, this falloff is due to the imperfection of the staining: on the molecular scale, the stain salt forms crystals and defines the boundary of the molecule only within a margin of error. In addition, the process of air-drying causes the molecule to change shape, and one must assume that this shape change is variable, as it depends on stain depth, orientation, and properties of the supporting carbon film.
For specimens embedded in ice, gross shape changes of the specimen are avoided, but residual variability (including genuine conformational variability) as well as instabilities of recording (drift, charging, etc.) are responsible for the decline of the power spectrum. More about these resolution-limiting factors will be discussed below (Section V, C); for the moment, it is important to realize that there is a practical limit beyond which the signal power makes only marginal contributions to the image.
However, all resolution criteria listed below in this section have in common that they ignore the relative distribution of the signal in Fourier space. Thus, a resolution of 1/20 Å⁻¹ might be found by a consistency test, even though the signal power might be quite marginal beyond 1/30 Å⁻¹. An example of this kind of discrepancy was given by van Heel and Stöffler-Meilicke (1985), who studied the 30S ribosomal subunits from two eubacterial species by 2D averaging: they found a resolution of 1/17 Å⁻¹ by the Fourier ring correlation method, even though the power spectrum indicated the presence of minimal signal contributions beyond 1/20 Å⁻¹, or even beyond 1/25 Å⁻¹ when a more conservative assessment is used. The lesson to be learned from these observations is that for a meaningful statement on the information actually present in the recovered molecule projection, a resolution assessment ideally should be accompanied by an assessment of the range of the power spectrum. The same concern about the meaning of "resolution" in a situation of diminishing diffraction power has arisen in electron crystallography. Henderson et al. (1986) invented a quality factor ("IQ," for "image quality") that expresses the SNR of each crystal diffraction spot and applied a rigorous cutoff in the Fourier synthesis depending on the averaged size of this factor, with the averaging being carried out over rings in Fourier space. (An example of an IQ-rated image transform was shown earlier (Fig. 2.2 in Chapter 2).) Glaeser and Downing (1992) have demonstrated the effect of including higher diffraction orders in the synthesis of a projection image of bacteriorhodopsin. It is seen that, with increasing spatial frequency, as the fraction of the total diffraction power drops to 15%, the actual improvement in the definition of the image (e.g., in the sharpness of peaks representing projected α-helices) becomes marginal.
B. Resolution Criteria

1. Definition of Area to be Tested

The resolution criteria to be detailed below all make use of the discrete Fourier transform of the images to be analyzed. It is of crucial importance for a meaningful application of these measures that no correlations are unwittingly introduced when the images are prepared for the resolution test. It is tempting to use a mask to narrowly define the area in the image where the signal--the averaged molecule image--resides, as the inclusion of surrounding material with larger inconsistency might lead to overly pessimistic results. However, imposition of a binary mask, applied to both images that are being compared, would produce correlation extending to the highest resolution. Hence, the resolution found in any of the tests
described below would be the highest possible--corresponding to the Nyquist limit. To avoid this effect one has to use a "soft" mask whose falloff is so slow that it introduces low-resolution correlation only. Fourier-based resolution criteria are thus governed by a type of uncertainty relationship: precise localization of the features for which resolution is determined makes the resolution indeterminate; on the other hand, precise measurement of resolution is possible only when the notion of localizing the features is completely abandoned.
2. Differential Phase Residual

For the determination of the differential phase residual (DPR) the aligned image set is arbitrarily partitioned into two subsets of equal size. For example, the use of even- and odd-numbered images of the set normally avoids any systematic trends, such as an origin in different micrographs or different areas of the specimen. Each subset is averaged, leading to the average images \bar{p}_1(r), \bar{p}_2(r) ("subset averages"). Let F_1(\mathbf{k}) and F_2(\mathbf{k}) be the discrete Fourier transforms of the two subset averages, with the spatial frequency \mathbf{k} assuming all values on the regular Fourier grid {k_x, k_y} within the Nyquist range. If \Delta\phi(\mathbf{k}) is the phase difference between the two transforms, then the differential phase residual is computed as

\Delta\phi(k, \Delta k) = \left\{ \frac{\sum_{[k, \Delta k]} [\Delta\phi(\mathbf{k})]^2 [|F_1(\mathbf{k})| + |F_2(\mathbf{k})|]}{\sum_{[k, \Delta k]} [|F_1(\mathbf{k})| + |F_2(\mathbf{k})|]} \right\}^{1/2}.   (3.64)

The sums are computed over Fourier components falling within rings defined by spatial frequency radii k ± \Delta k; k = |\mathbf{k}|, and \Delta\phi(k, \Delta k) is plotted as a function of k (Fig. 3.19). Thus, for any spatial frequency, \Delta\phi gives a measure of (amplitude-weighted) phase consistency. In principle, as in the case of the Fourier ring correlation below, the entire curve is needed to characterize the degree of consistency between the two averages. However, it is convenient to use a single figure, k_{45}, for which \Delta\phi(k_{45}, \Delta k) = 45°. As a conceptual justification for the choice of this value, one can consider the effect of superposing two sine waves differing in phase by \Delta\phi. If \Delta\phi is significantly smaller than 45°, the waves reinforce each other, whereas for \Delta\phi > 45° the maximum of one wave already tends to fall in the vicinity of the zero of the other, and destructive interference begins to occur. It is of crucial importance in the application to electron microscopy that the phase residual is computed "differentially," over successive rings, rather than globally, over the entire Fourier domain within a circle of radius k. Such global computation is being used, for instance, to align particles with helical symmetry (Unwin and Klug, 1974). Since |F(\mathbf{k})| falls
Fig. 3.19. Resolution assessment by comparison of two subaverages (calcium release channel of skeletal fast twitch muscle) in Fourier space using two different criteria: differential phase residual (DPR, solid line, angular scale 0...100°) and Fourier ring correlation (FRC, dashed line, scale 0...1). The scale on the x-axis is in Fourier units, denoting the radius of the rings over which the expressions for DPR or FRC were evaluated. The DPR resolution limit (\Delta\phi = 45°; see arrowhead on y-axis) is 1/30 Å⁻¹ (arrowhead on x-axis). For FRC resolution analysis, the FRC curve is compared with twice the FRC for pure noise (dotted curve). In the current example, the two curves do not intersect within the Fourier band sampled, indicating an FRC resolution of better than 1/20 Å⁻¹. From Radermacher et al. (1992a). Reproduced with permission of the Biophysical Society.
off rapidly as the spatial frequency increases, the figure k_{45} obtained with the global measure is not very meaningful in our application; for instance, excellent agreement in the lower spatial frequency range can make up for poor agreement in the higher range and thus produce an overoptimistic value for k_{45}. The differential form of the phase residual [Eq. (3.64)] was first used by Crowther (1971) to assess the preservation of icosahedral symmetry as a function of spatial frequency. It was first used in the context of single-particle averaging by Frank et al. (1981a). It can be easily verified from Eq. (3.64) that the differential phase residual is sensitive to changes in scaling between the two Fourier transforms. In computational implementations, Eq. (3.64) is therefore replaced by an expression in which |F_2(\mathbf{k})| is scaled, i.e., replaced by s|F_2(\mathbf{k})|, where the scale factor s is allowed to run through a range from a value below 1 to a value above 1, large enough to include the minimum. The desired differential phase residual then is the minimum of the curve formed by the computed residuals. This computation is quite
fast, so the scaling sensitivity can barely be counted as a disadvantage in comparing the DPR with other measures of resolution (cf. van Heel, 1987a). One of the advantages of the differential phase residual is that it relates to the measure frequently used in electron and X-ray diffraction to assess reproducibility and preservation of symmetry.
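A direct (unoptimized) implementation of Eq. (3.64) over rings might look as follows; the amplitude-scaling minimization described above is omitted for brevity, and the function name is my own:

```python
import numpy as np

def dpr(img1, img2, nrings):
    """Differential phase residual, Eq. (3.64), per Fourier ring (degrees)."""
    F1, F2 = np.fft.fft2(img1), np.fft.fft2(img2)
    f = np.fft.fftfreq(img1.shape[0])
    radius = np.hypot(*np.meshgrid(f, f, indexing="ij"))
    dphi = np.angle(F1 * np.conj(F2))          # phase difference, radians
    weight = np.abs(F1) + np.abs(F2)           # amplitude weighting
    edges = np.linspace(0.0, 0.5, nrings + 1)
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = (radius >= lo) & (radius < hi)
        w = weight[ring]
        if w.sum() == 0:
            out.append(0.0)
            continue
        out.append(np.degrees(np.sqrt((dphi[ring] ** 2 * w).sum() / w.sum())))
    return np.array(out)
```

The resolution figure k_{45} is then the first ring radius at which the curve crosses 45°.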
3. Fourier Ring Correlation

The Fourier ring correlation (FRC; Saxton and Baumeister, 1982; van Heel et al., 1982) is similar in concept to the DPR, as it is based on a comparison of the two Fourier transforms over rings:

\mathrm{FRC}(k, \Delta k) = \frac{\mathrm{Re}\left\{ \sum_{[k, \Delta k]} F_1(\mathbf{k}) F_2^*(\mathbf{k}) \right\}}{\left\{ \sum_{[k, \Delta k]} |F_1(\mathbf{k})|^2 \sum_{[k, \Delta k]} |F_2(\mathbf{k})|^2 \right\}^{1/2}}.   (3.65)
Here the resolution criterion (see Fig. 3.19) is based on a comparison of FRC as determined from Eq. (3.65) with a safe multiple of the value of FRC expected for pure noise, 2 × 1/(N_{[k,\Delta k]})^{1/2}, where N_{[k,\Delta k]} denotes the number of samples in the Fourier ring zone with radius k and width \Delta k. Experience has shown that the FRC (with the above threshold factor of 2) normally gives a more optimistic answer than the DPR. In order to avoid confusion in comparisons of resolution figures, many authors use both measures in their publications. Some light on the relationship between DPR and FRC was shed by Unser et al. (1987) when these authors introduced the spectral signal-to-noise ratio (see below). Their theoretical analysis confirms the observation that always k_{FRC} > k_{45} for the same model data set. Further illumination of the relative sensitivity of these two measures was provided by Radermacher (1988), who showed, by means of a numerical test, that the FRC = 2 × 1/N^{1/2} cutoff is equivalent to a signal-to-noise ratio of SNR = 0.2, whereas the \Delta\phi = 45° cutoff is equivalent to SNR = 1. Thus the FRC cutoff, and even the DPR cutoff with its fivefold increased SNR, seem quite optimistic; however, for well-behaved data the DPR curve is normally quite steep, so that even a small decrease in the cutoff will often lead to a rapid increase in SNR.¹⁵

¹⁵ I thank Michael Radermacher for a discussion of this point.

Unpublished results of L. G. de la Fraga, J. Dopazo, and J. M. Carazo (Carazo, personal communication, 1994) are interesting in this context. By numerical computations, these authors established confidence limits for DPR and FRC resolution tests applied to a data set of 300 experimental images of DnaB helicase. The data set was divided into half-sets using many permutations, and corresponding averages were formed in each case. The resulting DPR and FRC curves were statistically evaluated. The
confidence limits for both the DPR determination of 10 Fourier units and the FRC determination of 13 units were found to be ±1 unit. This means that even with 300 images, the resolution estimate obtained by DPR or FRC may be as much as 10% off. The authors also confirmed another observation by Radermacher (1988), namely that the FRC cutoff of 2 × 1/(N_{[k,\Delta k]})^{1/2} corresponds to a DPR cutoff of 85°. A substitution of a factor of 3 in the FRC cutoff appears to give better agreement between the two criteria.
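The FRC of Eq. (3.65), together with the 2/(N_{[k,\Delta k]})^{1/2} noise threshold, can be sketched the same way (illustrative code, not a published implementation):

```python
import numpy as np

def frc(img1, img2, nrings):
    """Fourier ring correlation, Eq. (3.65), and the 2/sqrt(n) noise threshold."""
    F1, F2 = np.fft.fft2(img1), np.fft.fft2(img2)
    f = np.fft.fftfreq(img1.shape[0])
    radius = np.hypot(*np.meshgrid(f, f, indexing="ij"))
    edges = np.linspace(0.0, 0.5, nrings + 1)
    curve, thresh = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = (radius >= lo) & (radius < hi)
        num = np.real((F1[ring] * np.conj(F2[ring])).sum())
        den = np.sqrt((np.abs(F1[ring]) ** 2).sum() * (np.abs(F2[ring]) ** 2).sum())
        curve.append(num / den if den > 0 else 0.0)
        thresh.append(2.0 / np.sqrt(max(ring.sum(), 1)))
    return np.array(curve), np.array(thresh)
```

The FRC resolution is then taken where the curve drops below the noise threshold.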
4. Young's Fringes

Because of its close relationship with the DPR and the FRC, the method of Young's fringes (Frank et al., 1970; Frank, 1972a, 1976) should be mentioned here, even though this method is rarely used to measure the resolution of computed image averages. The method is based on the result of an optical diffraction experiment: two micrographs of the same specimen (a carbon film) are first brought into precise superimposition, and then a small relative translation \Delta\mathbf{x} is applied. The diffraction pattern (Fig. 3.20) shows the following intensity distribution:

I(\mathbf{k}) = |F_1(\mathbf{k}) + F_2(\mathbf{k}) \exp(2\pi i \mathbf{k} \cdot \Delta\mathbf{x})|^2 = |F_1(\mathbf{k})|^2 + |F_2(\mathbf{k})|^2 + 2|F_1(\mathbf{k})||F_2(\mathbf{k})| \cos[2\pi \mathbf{k} \cdot \Delta\mathbf{x} + \phi_1(\mathbf{k}) - \phi_2(\mathbf{k})].   (3.66)
The third term, the Young's fringes term proper, is modulated by a cosine pattern whose wavelength is inversely proportional to the size of the image shift and whose direction is the direction of the shift. Since \phi_1 ≈ \phi_2 as long as the Fourier transform is dominated by the signal common to both images, the fringe pattern is visible only within the resolution domain. In other words, the modulation induced by the shift can be used to visualize the extent of that domain. The Young's fringes term in Eq. (3.66) is sensitive to the phase difference between the two transforms. Consistent phase shifts affecting an entire region of Fourier space show up as shifts of the fringe system. For example, the phase shift of alternate zones of the contrast transfer function by 180° (leading to \phi_1 - \phi_2 = 180°) can be made visible by using two images of the same carbon film with different defocus settings (Frank, 1972a). Using digital Fourier processing, the waveform of the fringes can be freely designed by superposing normal cosine-modulated patterns with different frequencies (Zemlin and Weiss, 1993). The most sensitive detection of the band limit is achieved when the waveform is square. Zemlin and
Weiss (1993) have obtained such a pattern of modulation experimentally (Fig. 3.20).
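The fringe experiment is easily reproduced digitally: adding an image to a shifted copy of itself modulates the power spectrum by 1 + cos(2\pi \mathbf{k} \cdot \Delta\mathbf{x}), per Eq. (3.66), so minima appear wherever \mathbf{k} \cdot \Delta\mathbf{x} is a half-integer. A sketch with synthetic data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
n, dx = 256, 8                               # image size, shift in pixels along x
img = rng.normal(size=(n, n))
I = np.abs(np.fft.fft2(img + np.roll(img, dx, axis=1))) ** 2

# averaged over ky, the spectrum follows 2|F|^2 [1 + cos(2*pi*kx*dx)]:
ridge = I.mean(axis=0)
# zeros fall where kx*dx is a half-integer, i.e., at sample indices
# n/(2*dx), 3n/(2*dx), ... along the kx axis
```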
5. The Q-Factor

This measure (van Heel and Hollenberg, 1980; Kessel et al., 1985) is easily explained by reference to a vector diagram (Fig. 3.21b) depicting the summation of equally indexed Fourier components P_i(\mathbf{k}) in the complex
Fig. 3.20. Diffraction patterns showing Young's fringes. The patterns are obtained by adding micrographs of the same specimen area and computing the Fourier transform of the resulting image. (a-c) Two micrographs are added with different horizontal displacements (10, 30, and 50 Å). The intensity of the resulting pattern follows a cosine function. (d) A pattern with an approximately rectangular profile is obtained by linear superposition of the patterns (a-c) using appropriate coefficients. From Zemlin and Weiss (1993). Reproduced with permission of Elsevier Science, Amsterdam.
Fig. 3.21. Averaging of signals in the complex plane. (a) Corresponding (i.e., same-\mathbf{k}) Fourier components F_1, F_2 of two aligned images that represent the same signal (P) but differ by the additive noise components N_1, N_2. (b) Definition of the Q-factor: it relates to the addition of vectors (in this case N = 7) representing same-\mathbf{k} Fourier components in the complex plane. F_sum/N is the Fourier component of the average image. The Q-factor is defined as the ratio between the length of F_sum and the sum of the lengths of the component vectors F_i. Only in the absence of noise is the maximum Q = 1 achieved. (c) Q-factor obtained in the course of averaging over an increasing number of repeats of a bacterial cell wall. As N increases, the Fourier components in the noise background perform a random walk, while the signal-containing Fourier components add up in the same direction, as in (b). As a result, the signal-related Fourier components stand out in the Q-factor map for N sufficiently high. From Kessel et al. (1985). Reproduced with permission of Blackwell Science Ltd, Oxford.
plane which takes place when an image set is being averaged. Because of the presence of a noise component (Fig. 3.21a), the vectors associated with the individual images zig-zag in the direction of the averaged signal. The Q-factor is simply the ratio between the length of the sum vector and the total pathway of the vectors contributing to it:

Q(\mathbf{k}) = \frac{\left| \sum_{i=1}^{N} P_i(\mathbf{k}) \right|}{\sum_{i=1}^{N} |P_i(\mathbf{k})|}.   (3.67)
Obviously, 0 ≤ Q ≤ 1. For pure noise, Q(\mathbf{k}) ≈ 1/\sqrt{N}, since this situation is equivalent to the random wandering of a particle in a plane under Brownian motion (Einstein equation). The Q-factor is a quite sensitive indicator for the presence of a signal component, since the normalization is specific for each Fourier coefficient. A display of Q(\mathbf{k}) (the "Q-image," first used by Kessel et al., 1985; see Fig. 3.21c) readily shows weak signal components at high spatial frequencies standing out from the background and thus enables the ultimate limit of resolution ("potential resolution") to be established. Again, a quantitative statement can be derived by averaging this measure over rings in the spatial frequency domain and plotting the result, \bar{Q}(k), as a function of the ring radius k = |\mathbf{k}|. The stipulation that \bar{Q}(k) should be equal to or larger than 3/\sqrt{N_{[k,\Delta k]}} can be used as a resolution criterion. Sass et al. (1989) introduced a variant of the Q-factor, which they termed the S-factor:

S(\mathbf{k}) = \frac{\left| (1/N) \sum_{i=1}^{N} P_i(\mathbf{k}) \right|^2}{(1/N) \sum_{i=1}^{N} |P_i(\mathbf{k})|^2}.   (3.68)

The expectation value of S(\mathbf{k}) for unrelated Fourier coefficients is 1/N_{[k,\Delta k]}. Thus, a resolution criterion can be formulated by stipulating that the ring zone-averaged value of \bar{S}(k) > 3/N_{[k,\Delta k]}.
6. Spectral Signal-to-Noise Ratio

The spectral signal-to-noise ratio (SSNR) was introduced by Unser et al. (1987; see also Unser et al., 1989) as an alternative measure of resolution. It is based on a measurement of the signal-to-noise ratio as a function of spatial frequency. Compared with the resolution criteria discussed above, it offers several advantages. First, as summarized by Unser et al. (1989), the SSNR relates directly to the Fourier-based resolution criteria commonly used in crystallography. Second, its statistical uncertainty is lower than those of the DPR, FRC, and Q-factor. The third, very important
advantage is that it allows the resolution improvement that is expected when the data set is expanded by a certain amount to be gauged, and the asymptotic resolution that would be expected for an infinitely large data set to be computed. Finally, as we will see, the SSNR provides a framework for relating FRC and DPR to each other in a meaningful way. As before, the individual images, which can represent unit cells of a crystal or single molecules, are modeled assuming a common signal component {p(r_j); j = 1...J} and zero-mean, additive noise:

p_i(r_j) = p(r_j) + n_i(r_j)    (i = 1...N).   (3.69)

An equivalent relationship holds in Fourier space:

P_i(\mathbf{k}_t) = P(\mathbf{k}_t) + N_i(\mathbf{k}_t)    (i = 1...N),   (3.70)

where \mathbf{k}_t are the spatial frequencies on the discrete two-dimensional Fourier grid. In real or Fourier space, the signal can be estimated by averaging:

\bar{p}(r_j) = \frac{1}{N} \sum_{i=1}^{N} p_i(r_j); \qquad \bar{P}(\mathbf{k}_t) = \frac{1}{N} \sum_{i=1}^{N} P_i(\mathbf{k}_t).   (3.71)
The SSNR \alpha_B is based on an estimate of the signal-to-noise ratio in a local region B of Fourier space. It is given by

\alpha_B = \frac{\hat{\sigma}_s^2}{\hat{\sigma}_n^2 / N} - 1.   (3.72)

The numerator, \hat{\sigma}_s^2, is the signal variance ("signal energy"), which can be estimated as

\hat{\sigma}_s^2 = \frac{1}{n_B} \sum_{B} |\bar{P}(\mathbf{k}_t)|^2,   (3.73)

where n_B is the number of Fourier components in the region B. The denominator in (3.72), \hat{\sigma}_n^2, is the noise variance, which can be estimated as

\hat{\sigma}_n^2 = \frac{\sum_{B} \sum_{i=1}^{N} |P_i(\mathbf{k}_t) - \bar{P}(\mathbf{k}_t)|^2}{(N-1)\, n_B}.   (3.74)

By taking the regions B in successive computations of \alpha_B to be concentric rings of equal width in Fourier space, the spatial frequency dependence of
By taking the regions B in successive computations of a~ to be concentric rings of equal width in Fourier space, the spatial frequency dependence of
V. Resolution
121
a B can be found, and we obtain a curve ce(k). Generally, the SNNR decreases with increasing spatial frequency. The resolution limit is taken 4 (Fig. 3.22). to be the point where c~(k) falls below the value a ( k ) Consideration of a model has shown that this limit is roughly equivalent to DPR = 45 ~ The statistical analysis given by these authors also allows upper and lower confidence intervals for the a -- 4 resolution limit to be established. For the N = 30 images of herpes simplex virus particles used as an example data set, the resolution was estimated as 1/29 A-1 (~ = 4), but the confidence intervals ranged from 1/27 to 1/30 ,~-1. This indicates that for such small data sets, resolution estimates obtained with any of the measures discussed should be used with caution. Unser et al. (1987) also address the question of "how much is enough?", referring to the number of noisy realizations that are averaged. For a given SSNR curve, the resolution improvement available by increasing N to N ' > N can be estimated by shifting the threshold from ofu - - 4 to a N, = 4 N / N ' (see Fig. 3.22).
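Evaluated over concentric rings, Eqs. (3.72)-(3.74) can be sketched as follows (an illustrative implementation, not the original program of Unser et al.):

```python
import numpy as np

def ssnr_curve(stack, nrings):
    """SSNR per Fourier ring, Eqs. (3.72)-(3.74)."""
    N = stack.shape[0]
    F = np.fft.fft2(stack, axes=(-2, -1))
    Fbar = F.mean(axis=0)
    f = np.fft.fftfreq(stack.shape[-1])
    radius = np.hypot(*np.meshgrid(f, f, indexing="ij"))
    edges = np.linspace(0.0, 0.5, nrings + 1)
    curve = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ring = (radius >= lo) & (radius < hi)
        n_b = ring.sum()
        sig = (np.abs(Fbar[ring]) ** 2).sum() / n_b                           # Eq. (3.73)
        noi = (np.abs(F[:, ring] - Fbar[ring]) ** 2).sum() / ((N - 1) * n_b)  # Eq. (3.74)
        curve.append(sig / (noi / N) - 1.0)                                   # Eq. (3.72)
    return np.array(curve)
```

For N images of per-image spectral SNR s, the curve fluctuates around N·s; for pure noise it fluctuates around zero, which is why the \alpha = 4 threshold rejects rings with negligible signal.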
Fig. 3.22. Experimental SSNR curve obtained for a set of 30 images of the herpes virus Type II capsomer. With increasing spatial frequency, the curve drops rapidly from a high value (> 8) to values close to 0. The SSNR resolution limit (here 1/29 Å⁻¹) is given by the spatial frequency where the experimental curve (b) intersects SSNR = 4. The dashed lines represent the (a) upper and (c) lower (±2σ) confidence limits for a measurement of SSNR = 4. The solid line at the bottom (d) represents the upper (+2σ) confidence limit for a measurement of SSNR = 0. From Unser et al. (1987). Reproduced with permission of Elsevier Science, Amsterdam.
C. Resolution-Limiting Factors

Although the different resolution criteria stress different aspects of the signal, it is possible to characterize the resolution limit qualitatively as the limit, in Fourier space, beyond which the signal is "drowned" in noise. Thus the practical resolution limit, as measured by the Q-factor or by the SSNR, is the result of many adverse effects that diminish the contrast at high spatial frequencies. The effects of various factors on the contrast of specimens embedded in ice have been discussed by Henderson and Glaeser (1985) and by Henderson (1992). More useful is a plot of the following ratio, the so-called Wilson plot:
    w(k) = (image Fourier amplitudes) / (electron diffraction amplitudes × CTF),        (3.75)
where both the numerator and the denominator are derived by averaging over a resolution ring with mean radius k = |k|. The rationale is that the electron diffraction amplitudes are not affected by "phase" effects such as drift, charging, and illumination divergence. Thus "ideal" imaging, or imaging affected by the CTF only, would be characterized by w(k) = constant. Instead, a plot of w(k) for tobacco mosaic virus (as an example of a widely investigated specimen) shows a falloff by more than an order of magnitude from 0 to 1/10 Å⁻¹. It is clear that the well-known envelope functions describing partial illumination coherence (Frank, 1973a) and energy spread (Hanszen and Trepte, 1971; Wade and Frank, 1977) are not the limiting factors in that resolution range. Henderson (1992) considered the relative importance of contributions from four physical effects: radiation damage, inelastic scattering, specimen movement, and charging. Of these, the last two, which are difficult to control experimentally, were thought to be most important. Meanwhile, the concern about beam-induced specimen movement has led to the development of the spot-scanning technique (Downing and Glaeser, 1986; Bullough and Henderson, 1987; see Section III, B in Chapter 2). As yet, on the other hand, no effective remedies for the charging effect have been found. In addition to the factors discussed by Henderson, we have to consider the effect of conformational variability, which is intrinsically larger in single molecules than in molecules bound together in a crystal. Such variability not only reduces the resolution directly, but also decreases the accuracy of alignment, since it leads to a reduced similarity among same-view images, causing the correlation signal to drop (see Fourier computation of the cross-correlation function, Section III, C, 2). In this context, it should be mentioned that the attainment of high resolution by averaging of
low-dose images follows basic statistical requirements that have to do with the information content of a structure of a given size at a given resolution. Henderson's (1995) analysis of these relationships (by necessity, in three dimensions) concludes with the surprising finding that the number of particle images required is independent of particle size. These numbers are estimated to be 2000 for 1/40 Å⁻¹, 4000 for 1/20 Å⁻¹, and 10,000 for 1/3 Å⁻¹, respectively. We note that a resolution of 1/25 Å⁻¹, as measured with the relatively conservative differential phase residual criterion, has just been obtained for the 70S ribosome based on 4300 projections (Frank et al., 1995a,b).
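A minimal sketch of how the Wilson-plot ratio of Eq. (3.75) might be evaluated. The image, the electron diffraction amplitudes, and the CTF values are all synthetic stand-ins; in practice the diffraction amplitudes come from a separate electron diffraction experiment on the same specimen.

```python
# Sketch of the Wilson-plot ratio w(k) of Eq. (3.75): rotationally averaged
# image Fourier amplitudes divided by electron diffraction amplitudes times
# |CTF|, evaluated on concentric Fourier-space rings. All inputs are synthetic.
import numpy as np

def radial_average(amps, nbins):
    """Average a 2D amplitude array over concentric rings in Fourier space."""
    n = amps.shape[0]
    ky, kx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    k = np.hypot(kx, ky)
    rings = np.minimum((k / k.max() * nbins).astype(int), nbins - 1)
    return (np.bincount(rings.ravel(), weights=amps.ravel(), minlength=nbins)
            / np.bincount(rings.ravel(), minlength=nbins))

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))                 # stand-in micrograph
img_amps = radial_average(np.abs(np.fft.fft2(image)), nbins=16)

diff_amps = np.full(16, img_amps.mean())              # invented diffraction data
ctf = np.abs(np.sin(np.linspace(0.3, 3.0, 16)))       # invented |CTF| per ring

w = img_amps / (diff_amps * ctf)   # Eq. (3.75); a falloff exposes "phase" losses
assert w.shape == (16,) and np.all(np.isfinite(w))
```

For a real specimen, a decay of w(k) with spatial frequency beyond what the envelope functions predict points to specimen movement, charging, or conformational variability.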
VI. Validation of the Average Image: Rank Sum Analysis

The question of significance of features in the average can be addressed by a multiple comparison method that makes no specific assumptions on the statistics, the rank sum method (Hänicke, 1981; Hänicke et al., 1984). Given a set of N images, each represented by J pixels:

    p_ij = {p_i(r_j);  j = 1 ... J;  i = 1 ... N}.        (3.76)
The nonparametric statistical test is designed as follows: each pixel p_ij of the ith image is ranked according to its value among the entire set of pixels {p_ij; j = 1 ... J}: the pixel with the smallest value is assigned rank 1, the second smallest rank 2, and so forth. Finally, the pixel with the largest value receives the rank J. In case of value ties, all pixels within the equal-value group are assigned an average rank. In the end, the image is represented by a set of rank samples r_i1, r_i2, ..., r_iJ, which is a permutation of the set of ordered integers {1, 2, ..., J} (except for those rank samples that result from ties). After forming rank representations for each image, we form a rank sum for each pixel:
    R_j = Σ_{i=1}^{N} r_ij,    1 ≤ j ≤ J.        (3.77)
In order to determine whether the difference between two pixels, indexed j and k, of the average image,

    p̄_j = (1/N) Σ_{i=1}^{N} p_ij,    p̄_k = (1/N) Σ_{i=1}^{N} p_ik,        (3.78)
is statistically significant at a specified significance level α, we can use the test statistic

    c_jk = |R_j − R_k|.        (3.79)
Whenever c_jk is greater than a critical value D(α, J, N), the difference between the two pixels is deemed significant. This critical value has been tabulated by Hollander and Wolfe (1973) and, for large N, by Hänicke (1981). In addition to answering questions about the significance of density differences, the rank sum analysis also gives information on the spatial resolution. Obviously, the smallest distance ("critical distance") between two pixels that fulfills

    c_jk > D(α, J, N)        (3.80)
is related to the local resolution in the region of those pixels. The nature of this relationship and a way to derive a global resolution measure from it are discussed by Hänicke et al. (1984).
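The rank sum procedure of Eqs. (3.76)-(3.79) can be sketched as follows. The critical value D used here is a made-up placeholder; proper values must be taken from the tables of Hollander and Wolfe (1973) or Hänicke (1981), and ties are absent here because the toy data are continuous.

```python
# Sketch of the rank sum test: rank pixels within each image (Eq. 3.76 ff.),
# accumulate rank sums R_j (Eq. 3.77), and compare |R_j - R_k| (Eq. 3.79)
# against a critical value. D below is a made-up placeholder, not a tabulated
# value; average ranks for ties are not needed for continuous data.
import numpy as np

def pixel_ranks(img):
    """Assign ranks 1..J to the pixels of one image, smallest value first."""
    r = np.empty(img.size)
    r[np.argsort(img)] = np.arange(1, img.size + 1)
    return r

rng = np.random.default_rng(1)
N, J = 30, 100
images = rng.standard_normal((N, J))      # toy image set, one row per image
images[:, 0] += 5.0                       # pixel 0 carries a genuine feature

ranks = np.array([pixel_ranks(img) for img in images])
R = ranks.sum(axis=0)                     # rank sums R_j, Eq. (3.77)

c_01 = abs(R[0] - R[1])                   # test statistic c_jk, Eq. (3.79)
D = 400.0                                 # hypothetical critical value D(alpha, J, N)
significant = c_01 > D
assert significant                        # the planted feature is detected
```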
VII. Outlier Rejection: "Odd Men Out" Strategy

An average image formed by the alignment and averaging procedures introduced in Chapter 3 is meaningful only if the image set is homogeneous, or internally consistent. This would be the case when the image set originates from a single common structural motif. Heterogeneous data sets require some type of classification so that they can be subdivided into homogeneous subsets. Computer-assisted or automated methods of classification, which are among the most important tools of macromolecular reconstruction, will be discussed in Chapter 4. In principle, the techniques to be discussed are able to deal with the problem of outlier rejection as well. The subject of the following brief section is a method of outlier rejection, based on a strict statistical (but not multivariate) analysis, which assumes that the data set indeed originates from a common structural motif. Often, the analysis starts with the selection of molecules exhibiting a well-recognizable view from the image field; the images are aligned, and the gallery of aligned images is inspected for the presence of outliers. The Odd Men Out strategy of Unser et al. (1986) replaced subjective methods of "weeding out" anomalous images by application of a computer algorithm. The basic idea is very simple and can be paraphrased as follows: "Determine that image, from a given image set, for which the variance of the remaining set is minimum."
Unser et al. (1986) developed a test statistic to determine the probability that the image under consideration is statistically consistent with the remaining images. Successive application of this algorithm to a set of HSV capsomer images led to a probability curve with a steep falloff, allowing six images to be unambiguously rejected as outliers. The authors compare the performance of this algorithm with the use of a two-dimensional factorial map obtained by correspondence analysis (see Sections II and III in Chapter 4) and note that the latter analysis concurs only partially with the Odd Men Out result. This discrepancy is a consequence of the fact that only two factors were inspected. Obviously, the Odd Men Out algorithm could also be applied, with much benefit, to the coordinates of images in factor space, in order to weed out outliers in a systematic way.
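The variance-based selection step of the Odd Men Out idea can be paraphrased in code. This toy version omits the statistical consistency test of Unser et al. (1986) and simply ranks images by the residual variance of the set after their removal; all data are invented.

```python
# Toy version of the "Odd Men Out" selection step: find the image whose
# removal leaves the remaining set with minimum variance. The probability
# test of Unser et al. (1986) is not reproduced here.
import numpy as np

def odd_man_out(images):
    """Index of the image whose removal minimizes the residual variance."""
    residual = [np.delete(images, i, axis=0).var(axis=0).mean()
                for i in range(len(images))]
    return int(np.argmin(residual))

rng = np.random.default_rng(2)
motif = rng.standard_normal(64)                       # invented common motif
images = motif + 0.1 * rng.standard_normal((10, 64))  # 10 noisy realizations
images[3] = -motif                                    # plant one gross outlier

assert odd_man_out(images) == 3
```

Applied iteratively, with the rejected image removed each round, this reproduces the successive rejection scheme described above.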
Chapter 4. Statistical Analysis and Classification of Images

I. Introduction

A. Heterogeneity of Image Sets

The need for multivariate statistical analysis (MSA) became obvious in the treatment of two-dimensional (2D) image averaging in the previous chapter. For one, macromolecules exist in a large range of conformational states. Most of the associated variations are too small to be observed in the electron microscope, but some are big enough to be detected. Second, the forces acting on the molecule in the course of specimen preparation may produce deformations. Air-drying of negatively stained specimens, especially, is known to "flatten" the molecule in a direction normal to the support; hence it is an effect that critically depends on the orientation of the molecule relative to the support. Finally, any variations in the orientation of the molecule on the support will result in variations of the projected image, which are, in the case of negative staining, further amplified by meniscus effects. What is important here is the conceptual difference between variations in the image entirely unrelated to the object (which we might call "true noise") and those variations that are due to something that happens to the object (previously termed "signal-related variations"). Since the ultimate goal of our study is to obtain meaningful 3D images of the macromolecule, we can merge only data that originate from particles showing the macromolecule in the same state (i.e., shape, conformational state, degree of flattening, etc.). At this point it might be useful to introduce the concept of a proper statistical ensemble. A set of images of single macromolecules, as extracted from an electron micrograph, is normally heterogeneous, as it contains
projections of the molecule in different orientations and "states" as defined above. If the total number of images is sufficiently large, then each of those orientations and states is realized more than once. The subset consisting of these realizations is a proper statistical ensemble that allows a meaningful average to be formed. In contrast, any subset containing realizations of different orientations and states would not form a proper statistical ensemble. The estimation of an ensemble average in that latter case would be forbidden by the same rule as the one that forbids the proverbial comparison of apples and oranges. In practice, the terms "homogeneous" and "heterogeneous" are relative. For instance, a certain range of orientation or variation is acceptable as long as the effect of these variations is much smaller than the resolution distance. To consider the magnitude of this tolerance, we might take the 50S ribosomal subunit, whose base is roughly 250 Å wide. A tilt of the subunit by 5° would lead to a foreshortening of the particle base to 250 × cos(5°) ≈ 249 Å, which is a very small change of shape compared to the size of the smallest resolved features (30 Å). From this consideration, we could infer that orientation changes by 5° will not change the appearance of the projection significantly. In reality, however, the ability to differentiate among different patterns is a function of noise. In the absence of noise, even a change that is small compared with the resolution distance will be detectable. A general criterion for homogeneity could be formulated as follows (Frank, 1990): "a group of images is considered homogeneous if the intra-group variations are small compared with the smallest resolved structural detail." Boisset et al. (1990b) used such a test to determine homogeneous subgroups in a continuously varying molecule set.
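The foreshortening arithmetic above is easily checked numerically:

```python
# Numerical check of the foreshortening estimate: a 5-degree tilt changes the
# projected width of a 250 Angstrom particle base by well under 1 Angstrom,
# far below the 30 Angstrom resolution distance.
import math

width = 250.0                                  # particle base, in Angstrom
foreshortened = width * math.cos(math.radians(5.0))
assert width - foreshortened < 1.0             # about 0.95 Angstrom
```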
Several ways of scrutinizing the image set based on a statistical analysis have already been introduced: the variance map, the rank sum analysis, and the "odd man out" strategy. None of these methods, however, is capable of dealing with an entirely heterogeneous data set, i.e., a data set consisting of two or more different groups of images.
B. Direct Application of Multivariate Statistical Analysis to an Image Set

We assume that a set comprising N images is given, {p_i(r); i = 1 ... N}. Each of the images is represented by a set of discrete measurements on a regular Cartesian grid,

    {p_ij;  j = 1 ... J},
where, as before, the image elements are lexicographically ordered (see Section II, A in Chapter 3). We assume that, by using one of the previously outlined procedures, the image set has already been aligned; that is, any given pixel j has the same "meaning" in the entire image set. In other words, if we were dealing with projections of "copies" of the same molecule lying in the same orientation, that pixel would refer to the same projection ray in a coordinate system affixed to the molecule. Variations among the images are in general due to linear combinations of pixel variations, rather than to variations of single pixels. In the latter case, only a single pixel would change while all other pixels would remain exactly the same. Such behavior is extremely unlikely as an outcome of an experiment, for two reasons: for one, there is no precise constancy over any extended area of the image, and second, unless the images are undersampled or critically sampled (a condition normally avoided; see Section II, A in Chapter 3), pixels will not vary in isolation but in concert with surrounding pixels. For example, a flexible protuberance may assume different positions in the particle set. This means that an entire set of pixels undergoes the same coordinate transformation, e.g., a rotation around a point of flexure (see Fig. 4.1). Another source of correlation between neighboring image points is the instrument itself: each object point, formally described by a delta function, is imaged as an extended disk (the point spread function of the instrument; see Section II, C, 1 and 4 in Chapter 2). Image points within the area of that disk are correlated; that is, they do not vary independently (Section III, D, 1 in Chapter 3).

Fig. 4.1. A specimen used to demonstrate the need for multivariate statistical analysis: although the micrograph shows the same view of the molecule (the crown view of the 50S ribosomal subunit), there are differences due to variations in staining and in the position of flexible components such as the "stalk" (indicated by arrowhead). From Frank et al. (1985). Reproduced with permission of Van Nostrand-Reinhold, New York.
C. The Principle of Making Patterns Emerge from Data

Making Patterns Emerge from Data is the title of a ground-breaking paper by Benzécri (1969a), introducing correspondence analysis to the field of taxonomy. Benzécri's treatment (see also Benzécri, 1969b) marks a departure from a statistical, model-oriented analysis in the direction of a purely descriptive analysis of multidimensional data. This new approach toward data analysis opened up the exploration of multivariate data for which no models, or at best sketchy models, exist, e.g., in anthropology and laboratory medicine. Data displayed as patterns appeal to the eye; they are easily captured "at a glance." Whenever data can be represented in visual form, their interrelationships can be more easily assessed than from numerical tables (see the fascinating book on this subject by Tufte, 1983). In a certain way, Benzécri's phrase "patterns emerging from data" anticipates the new age of supercomputers and fast workstations that allow complex numerical relationships and vast numerical fields to be presented in three-dimensional form on the computer screen. In recent years, the power of this human interaction with the computer has been recognized, and it has become acceptable to study the behavior of complex systems from visual displays of numerical simulations. Looking at such visual displays may be the only way for an observer to grasp the properties of a solution of a nonlinear system of equations. (A good example is provided by the July 1994 issue of IEEE Transactions on Computers, which specialized in visualization and featured on its cover a parametric solution to the fourth-degree Fermat equation.)
D. Eigenvector Methods of Ordination: Principal Component Analysis versus Correspondence Analysis

The methods we employ for multivariate statistical analysis (see Lebart et al., 1977, 1984) fall under the category of eigenvector methods of ordination. What these methods have in common is that they are based on a decomposition of the total variance into mutually orthogonal components that are ordered according to decreasing magnitude. This decomposition is tailored specifically to the data set analyzed.
The lexicographical notation we introduced earlier preconceives the idea of the image as a vector in a multidimensional space R^J (also referred to as hyperspace), where J = L × M is the total number of pixels. Alternatively, we can think of the image as a point (namely, the vector end point) in that space. As an exercise in the use of this concept, note that the points representing all images that differ in the value of a single pixel at the point with coordinates (l, m) but are constant otherwise lie on a line parallel to the coordinate axis numbered j = (m − 1) × L + l. This example shows, at the same time, that the original coordinate axes in R^J are not useful for analyzing variations among images, since such variations involve simultaneous changes of numerous pixels, whereas a movement along one of the coordinate axes is incapable of expressing the change of more than one pixel. If a set of images is given, the vector end points form a "cloud" in the hyperspace. For an introduction to this concept, let us consider a set of images generated from a meaningful pattern (which may be the noise-free projection image of a macromolecule) by the addition of random noise. In the space R^J, any image of this set would be represented by the addition of a vector with random orientation and random length (as governed by the statistical distribution) to the fixed vector representing the image of the molecule. The resulting vector end points would form a multidimensional cloud whose center lies in the point that represents the average of all images. As an aside, this "cloud" concept makes it easy to understand the problem of reference-based alignment and averaging of very noisy images (Chapter 3, Section III, D). The problem becomes evident, as we recall, when a set of images containing pure noise is aligned and averaged: the average tends to look like the reference, an image-processing version of a self-fulfilling prophecy.
If we compute the cross-correlation function (CCF) of a reference with such a set of pure noise images and average only those with highest correlation, we can describe this procedure as follows: the (assumed zero-mean) noise set is represented by a cloud centered in the origin of the coordinate system. Computation of the CCF is equivalent to computing the distances between each noise point and the point representing the reference (Fig. 4.2). The process of alignment involves a series of coordinate transformations in the argument space of the image functions; each coordinate transformation creates a new version of each image, described by another point again in the cloud. In the absence of a signal--an assumption we have made here--there is no principal difference between generating a set of noise functions by a random process, and generating it by rotating and translating a given noise function. The only difference is that for small
Fig. 4.2. Explanation of the reference problem using a description in the space R^J. Each point represents a noise function, for simplicity assumed zero-mean, so that the cloud is centered in the origin O. Computation of cross-correlation functions with a reference f, and rejection of poorly correlating noise functions, selects those that lie within a hypersphere (dashed line). The average of these selected functions lies between O and f, either on the vector joining O and f or at least in close vicinity to it. In other words, a reference induces an image of itself if allowed to select images most similar to itself from a pool of pure noise functions.
steps of the argument coordinate transformations, the points representing the generated functions lie on "similarity pathways" in the cloud. If we average all noise images, we obtain the zero-mean image within statistical margins. However, if we pick only those noise images that lie within a (hyper-)sphere centered at the reference, then we obtain a point that (again within statistical margins) must lie on the reference vector. The reason for this fact is that the point set "carved out" by the hypersphere has a center of mass that lies on the line connecting the origin with the end point of the reference. To come back to the subject of multivariate analysis, the aim of this analysis is to find the vectors that define the directions of the principal extensions of the data cloud. These principal directions are constructed as follows: (i) find the maximum extension of the cloud; (ii) find the vector, perpendicular to the first, that points in the direction of the next-largest extension of the cloud; (iii) find the vector, perpendicular to the first and the second, and so on. The different successive "size measurements," which describe the shape of the data cloud with increasing accuracy, are components of the total interimage variance, and the method of finding these measurements, along with the new data-adapted coordinate system, is called principal component analysis (PCA) in physics and mathematics, and eigenanalysis in statistics. Another term for this method, used in image analysis, is the Karhunen-Loève transformation (see Rosenfeld and Kak, 1982). In the following, the mathematical principles of principal component analysis are outlined before a closely related technique is introduced.
The simplest introduction to the concepts of multivariate analysis and classification of images is by considering a two-dimensional scatter diagram of two variables, e.g., the elevation of a geographic site and its average temperature. Such a plot may reveal a thin stretched "cloud" whose trend suggests an anticorrelated behavior; i.e., higher elevations on average go along with lower temperatures. This trend can be represented by a straight line, thus reducing the two-dimensional to a one-dimensional relationship. The simplest classification problem would arise when the points on the scatter diagram were to fall into two clusters. In both cases, the act of representing the data in a systematic way, by assigning a spatial coordinate to each variable, reveals the underlying statistical pattern. The two-dimensional diagram in which the data are represented would be sufficient to represent "images" having no more than two pixels. We could for instance assign pixel 1 to the horizontal and pixel 2 to the vertical axis. Obviously, any pattern that we would normally call an "image" contains many more pixels and thus involves an initial representation in a space with a much higher dimensionality. Even though the dimension of the space is higher, the principle of the analysis (finding a low-dimensional subspace in which the variations can be expressed with minimum loss of information) remains the same. In a least-squares sense, the "direction of maximum extension" of the data cloud, the term used above in describing the purpose of MSA, is defined as follows (Fig. 4.3): we seek a vector u in R^J such that the sum of squared *projections¹⁶ of the N image vectors x_i onto u is maximum. Before proceeding, we observe that this problem can be formulated in the N-dimensional subspace R^N because the number of images is normally smaller than the number of pixels, i.e., N < J. The above least-squares condition thus leads to the equation

    Σ_{i=1}^{N} OP_i² = Σ_{i=1}^{N} (x_i′u)² = (Xu)′Xu = u′X′Xu = max,        (4.1)
¹⁶Here and in the following, the term "projection" has two different meanings: first, the ordinary meaning, employed so far, as an image of a three-dimensional object resulting from a projection operation, i.e., by adding up all density values of the object that lie along defined "rays of projection"; second, the result of a vector projection operation in the J-dimensional Euclidean space R^J, or in a factor subspace. Obviously, electron microscopic images will keep occurring in both contexts: as the result of a type-1 projection and as an element subjected to a type-2 projection. To avoid confusion, and to draw attention to this distinction, the second type will henceforth be denoted by an asterisk: *projection.
Fig. 4.3. Definition of a principal axis in R^J resulting from the analysis of a point cloud P_i. For a given vector u, we can construct the *projections of all the vectors x_i = OP_i. Among all possible choices of vector u, we select the one for which the sum of the squared *projections is maximum. In that case, u points in the direction of the cloud's principal extension (after Lebart et al., 1977).
with the constraint

    u′u = 1        (4.2)
(orthonormalization condition). Here we have introduced a matrix X that contains the image vectors as rows:

    X = | x_11  x_12  ...  x_1J |
        | x_21  x_22  ...  x_2J |
        | ...               ... |
        | x_N1  x_N2  ...  x_NJ |        (4.3)
The problem posed by Eqs. (4.1) and (4.2) can be tackled by introducing the Lagrange multiplier λ and posing the eigenvector equation

    Du = λu    with the definition    D = X′X,        (4.4)
where X′ is the transpose of X. D is called the covariance matrix (or sometimes the variance-covariance matrix), since its general elements are

    d_ii′ = Σ_{j=1}^{J} x_ij x_i′j.        (4.5)
Equation (4.4), solved by diagonalizing the matrix D, has at most q solutions {u_1, u_2, ..., u_q}, where q = min{N, J}. These solutions form an orthogonal system of basis vectors in R^J and are called eigenvectors of matrix D. They are associated with the set of eigenvalues {λ_1, λ_2, ..., λ_q}. Since D is symmetric and positive-semidefinite, the eigenvalues and eigenvectors are real, and the eigenvectors may be ranked by the size of the eigenvalues in descending order. Each eigenvalue gives the share of total interimage variance expressed by the associated eigenvector, so that the sum of all eigenvalues is the total interimage variance. This relationship is expressed by

    Tr(D) = Σ_{α=1}^{q} λ_α,        (4.6)
which is invariant under a rotation of the coordinate system.

Correspondence analysis (CA) is distinguished from PCA by a different metric, which affects the computation of distances: instead of Euclidean distances, χ² distances between data vectors are computed. The reasons for the prominent role of correspondence analysis in electron microscopic applications are in part historical¹⁷ and in part due to a fortuitous match of problem with technique. What counts in favor of CA compared to PCA in the context of electron microscopic applications is the fact that CA ignores multiplicative factors between different images; hence it appears ideally suited for analyzing images obtained in the electron microscopic bright-field mode irrespective of their exposure. Thus particle images from different micrographs can in principle be combined without rescaling. CA, as a statistical analysis technique based on the χ²-metric, requires positive input data; yet the contrast transfer function is known to invert contrast at least in parts of the image. On the surface, it might therefore appear as though electron microscope data could not be analyzed by CA.

¹⁷It originated with the fortuitous encounter between two scientists looking for a tool for classifying particle images and a third scientist applying a standard tool of laboratory medicine to a totally unrelated problem.
However, it must be noted that the bright-field image is always positive, arising from a modulation of a constant "background" term that is due to the unscattered beam. On the other hand, CA has the same property of "marginal equivalence" for the pixel vectors, and here this property can be detrimental because all pixels enter the analysis with the same weight, irrespective of their average value across the image series. This means, for instance, that in the analysis of cryoimages, the variations among the realizations of a given pixel within the particle and outside of it have the same weight.
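Before turning to the formal theory, the eigenanalysis machinery of Eqs. (4.4)-(4.6), which CA shares with PCA up to the choice of metric, can be illustrated with a small numerical sketch (random data standing in for an aligned image set):

```python
# Numerical sketch of Eqs. (4.4)-(4.6): form D = X'X from a random stand-in
# image set, diagonalize it, and check that the trace equals the sum of the
# eigenvalues and that at most q = min(N, J) of them are nonzero.
import numpy as np

rng = np.random.default_rng(3)
N, J = 20, 50                         # fewer images than pixels, as in the text
X = rng.standard_normal((N, J))       # rows are (invented) image vectors

D = X.T @ X                           # covariance matrix, Eq. (4.4)
lam = np.linalg.eigvalsh(D)[::-1]     # eigenvalues, descending order

q = min(N, J)
assert np.allclose(lam[q:], 0.0, atol=1e-9)    # rank of D is at most q
assert np.isclose(np.trace(D), lam.sum())      # Eq. (4.6)
```

Because N < J, only q = N eigenvalues are nonzero, which is also why the conjugate-space computation discussed in Section II, B below is so advantageous.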
II. Theory of Correspondence Analysis

A detailed introduction to the theory of correspondence analysis is found in the books by Lebart et al. (1977, 1984). A brief formulation using matrix notation was given by Frank and van Heel (1982a). In the current introduction, we follow the lucid presentation of Borland and van Heel (1990), who use Lebart's notation in a more compact form. Correspondence analysis deals with the analysis of N image vectors {x_ij; j = 1 ... J} in R^J and, simultaneously, with the analysis of J pixel vectors {x_ij; i = 1 ... N} in R^N (the space said to be conjugate to R^J). These two legs of the analysis are symmetric and closely linked. The geometric interpretation of image vectors as a point cloud in R^J, which was introduced in Section I, D, is now complemented by the simultaneous interpretation of pixel vectors as a point cloud in R^N.
A. Analysis of Image Vectors in R^J

With the choice of the χ²-metric unique to CA, the least-squares condition derived before (in Section I, D) to seek a principal direction of the data cloud leads to the new eigenvector-eigenvalue equation

    X′NXMu = uλ        (4.7)
with the orthonormalization constraint

    u′Mu = 1,        (4.8)
where N (N × N) and M (J × J) are diagonal matrices describing the metrics associated with the images and with the pixels, respectively. The elements of N and M are, respectively,

    n_ii = 1/x_i.,    i = 1 ... N        (4.9)

and

    m_jj = 1/x_.j,    j = 1 ... J,        (4.10)
with the standard notation (see Lebart, 1984) of x_i. and x_.j as "marginal weights":

    x_i. = Σ_{j=1}^{J} x_ij;    x_.j = Σ_{i=1}^{N} x_ij.        (4.11)
As mentioned in the introductory comments, the rescaling of each row and column has the effect that the analysis treats images that are related by a multiplicative factor as identical. For example, a pair of bright-field images of the same object recorded with two different shutter times t_1, t_2 are related by

    x_2j = x_1j × t_2/t_1,        (4.12)
provided, first, that the optical density is linearly related to the electron exposure (which holds in good approximation) and, second, that the object is sufficiently resistant to beam damage. In that case, the two images {x_1j} and {x_2j} are treated as identical by correspondence analysis. Equation (4.7) generates an entire set of solutions {u_α; α = 1 ... q} with associated eigenvalues {λ_α; α = 1 ... q}. The eigenvectors u_α are of dimension J and can again be interpreted as images. Borland and van Heel (1990) used a compact formulation of the eigenvector-eigenvalue equation,

    X′NXMU = UΛ_q,        (4.13)
in which the eigenvectors u_α are "bundled up" so that they form the columns of the eigenvector matrix U, and the eigenvalues λ_α form the diagonal of the diagonal eigenvalue matrix Λ_q. The orthonormalization condition now becomes

    U′MU = I,        (4.14)

with the identity matrix I.
B. Analysis of Pixel Vectors in R^N

As before, we seek vectors that maximize the projections of the data vectors in principal directions:

    XMX′NV = VΛ_p,        (4.15)

with the diagonal eigenvalue matrix Λ_p and the orthonormalization condition

    V′NV = I_N.        (4.16)
According to a fundamental theorem of matrix algebra, based on the properties of the matrix X'NXM, the eigenvalues in Eqs. (4.15) and (4.13) are the same, assuming that all eigenvalues are different from zero. In contrast to the eigenvectors in R J, which lend themselves to an interpretation as images ("eigenimages," see below), the eigenvectors that form the columns of the eigenvector matrix V do not have an easily interpretable meaning. The eigenvectors that form the columns of V and those that form the columns of U are linked by the so-called transition formulas:
V = XMUΛ^(-1/2),
(4.17)

U = X'NVΛ^(-1/2).
(4.18)
These relationships are of eminently practical importance since they allow the eigenvector set contained in U to be computed in the conjugate space R^N, where the matrix to be inverted is of dimension N × N, instead of the space R^J with the normally much larger matrix D (dimension J × J). One of the advantages of CA is that, because the data are simultaneously normalized by rows and columns, the eigenvectors U and V are on the same scale, as are the image and pixel coordinates, so that images and pixels can be represented on the same factor map. Proximity between images and pixels on the same map can therefore be meaningfully interpreted (see Lebart et al., 1984). However, thus far, this unique property has received little attention in the EM-related literature (Frank et al., 1982; Borland and van Heel, 1990).
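The transition formulas can be checked numerically. The following numpy sketch (invented toy data, not from the book; the names N, M, U, V, Λ follow Eqs. (4.13)-(4.18)) solves the small N × N eigenproblem and recovers the R^J eigenvectors through the transition formula (4.18):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data matrix X: N images (rows) by J pixels (columns), strictly
# positive, with N << J as is typical in electron microscopy.
n_img, J = 20, 400
X = rng.uniform(0.5, 1.5, size=(n_img, J))

# Diagonal weight matrices of correspondence analysis:
# N = diag(1/x_i.) from the row sums, M = diag(1/x_.j) from the column sums.
x_i = X.sum(axis=1)
x_j = X.sum(axis=0)
N = np.diag(1.0 / x_i)
M = np.diag(1.0 / x_j)

# Solve the small eigenproblem XMX'N V = V Lambda (Eq. 4.15).  XMX'N is not
# symmetric, so diagonalize the similar symmetric matrix
# S = N^(1/2) (XMX') N^(1/2) and map its eigenvectors back.
core = X @ M @ X.T
S = np.diag(x_i ** -0.5) @ core @ np.diag(x_i ** -0.5)
lam, W = np.linalg.eigh(S)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]
V = np.diag(x_i ** 0.5) @ W           # satisfies V'NV = I (Eq. 4.16)

# Transition formula U = X'NV Lambda^(-1/2) (Eq. 4.18): the R^J eigenvectors
# are obtained from the small-space solution.
q = 5                                  # keep a few nontrivial factors
U = X.T @ (N @ V[:, 1:q + 1]) @ np.diag(lam[1:q + 1] ** -0.5)

# U indeed solves the conjugate eigenproblem X'NXM U = U Lambda (Eq. 4.13),
# confirming that both spaces share the same eigenvalues.
lhs = X.T @ (N @ (X @ (M @ U)))
rhs = U @ np.diag(lam[1:q + 1])
print(np.allclose(lhs, rhs))           # True
```

The largest eigenvalue, equal to 1, is the trivial factor associated with the image marginals and is skipped; only the nontrivial factors carry interimage variance.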
C. Factorial Coordinates and Factor Maps

Each image {x_ij; j = 1...J} can now be represented by a set of coordinates in the new space formed by the set of eigenvectors contained in U. These eigenvectors are commonly called factors, hence the term factorial coordinates for the image coordinates. (In the following, the terms "factor" and "eigenvector" will be used interchangeably.) As in principal component analysis, the new space has the property that the interimage variance (or "energy") is concentrated in a small number of factors, q_pract << q. However, this number of useful factors depends on the number of images and on the type of specimen, as will become clear later.
138
Chapter 4. Statistical Analysis and Classification of Images
Here it is important to reemphasize that there are as many factors as there are independent data vectors. The concentration of energy is the result of two principles underlying the analysis: first, the search for new, optimum directions in the vector space (see Fig. 4.3), and second, the ranking of these factors according to the size of their contribution to the total energy. In practice, therefore, only q_pract coordinates need be computed for each image. These are obtained from the original set of images, stored in the rows of matrix X, by an operation of projection:

Φ = NXMU = NVΛ^(1/2).
(4.19)
Similarly, the coordinates of the pixels are obtained as

Ψ = MX'NV = MUΛ^(1/2).
(4.20)
(In the second step of these derivations, the transition formulas were used.) In practice, the coordinates of the images, forming the elements of Φ, {φ_iα; α = 1...q_pract; i = 1...N}, constitute the immediately most useful result of correspondence analysis. Factor maps (for an example, see Fig. 4.4) are two-dimensional maps in which each image is printed as a symbol with the coordinates {φ_iα1, φ_iα2} of two selected factors α_1 and α_2. The symbol that is printed can be (i) a simple point, if one is only interested in the distribution of the images on the map; (ii) the image "ID," normally a number identifying the image; or (iii) a class symbol, if a classification has been performed (see below) and one is interested in the distribution of the classes on the factor map but not in the identity of the class members. The second option (image ID) is most often used, preceded by a symbol characterizing the status of the image in the analysis: "a" for active, "i" for inactive. Active images are those that participate in the analysis from the very start, forming the rows of matrix X [see Eq. (4.1)]. Inactive images can be any images that have the same dimensions as the active set and have been passed through the same mask, but were not included in the matrix. Therefore they do not participate in the spanning of the factor space, but are merely "coprojected" with the active images and shown on the factor maps so that their relative position can be investigated. What precisely is the use of such a map? We can think of it as an inventory of the image set according to rationales whose principles are laid down in the least squares formula. To discover the rationales (i.e., the reasons for a particular ordering of the images along the two factors on
the map) for a given data set requires some work; in other words, merely staring at the distribution of image symbols on the map will not help. An important tool for the interpretation of the factor map will be introduced below: reconstitution, which is a recipe for building up an image from its factorial components as though from individual building blocks.
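As a toy illustration of how factorial coordinates spread a heterogeneous image set out on a factor map, the following sketch (synthetic data; all names and parameters invented) computes the coordinates Φ = NVΛ^(1/2) of Eq. (4.19) for two classes of one-dimensional "images" and shows that the first nontrivial factor separates them:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes of toy "images": a common background plus a class-specific
# blob on the first 40 pixels, with additive noise.
J = 256
base = rng.uniform(1.0, 2.0, size=J)
blob = np.zeros(J)
blob[:40] = 1.0
labels = np.array([0] * 15 + [1] * 15)
X = np.stack([base + 0.8 * blob * lab + 0.2 * rng.random(J)
              for lab in labels])

# Factorial coordinates Phi = N X M U = N V Lambda^(1/2) (Eq. 4.19),
# obtained from the small N x N eigenproblem of Eq. (4.15).
x_i, x_j = X.sum(axis=1), X.sum(axis=0)
S = np.diag(x_i ** -0.5) @ X @ np.diag(1 / x_j) @ X.T @ np.diag(x_i ** -0.5)
lam, W = np.linalg.eigh(S)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]
V = np.diag(x_i ** 0.5) @ W                        # V'NV = I
Phi = np.diag(1 / x_i) @ V @ np.diag(np.sqrt(lam))

# Factor 1 (column 1; column 0 is the trivial factor with eigenvalue 1) is
# one axis of a factor map; it cleanly separates the two classes.
f1 = Phi[:, 1]
print(f1[labels == 0].mean(), f1[labels == 1].mean())
```

Plotting Phi[:, 1] against Phi[:, 2] for real data gives exactly the kind of 1-versus-2 factor map discussed in the text.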
D. Reconstitution

To recapitulate, the eigenvectors {u_α; α = 1...q} form an orthonormal basis, where q = min{N, J} is the number of nonzero eigenvalues, which means that it is possible to reassemble each image by a linear combination of these new vectors (Fig. 4.5):
x_ij = x_i. x_.j + x_i. Σ_{α=1}^{α_max} φ_iα u_jα.
(4.21)
By letting α_max = q, the image indexed i is restored without loss of information. Another form of this so-called full reconstitution makes use of the relationship (4.20) and expresses the eigenvectors in terms of the pixel coordinates (Bretaudiere and Frank, 1986; van Heel, 1986a):
x_ij = x_i. x_.j [1 + Σ_{α=1}^{α_max} φ_iα ψ_jα / √λ_α].
(4.22)
A more interesting use of Eq. (4.21) is obtained when the series is broken off after a smaller number of factors, α_max < q; in that case, the expansion approximates the original image, with small, less significant components of interimage variance left out. Because of the strong condensation of variance in the highest-ranking factors in many experiments, the number of factors can be much smaller than q, and thus a substantial compression of information is achieved. For example, we take 1000 images, each of which is represented by 64 × 64 pixels (Frank, 1990). Let us assume that the molecule is situated within a mask that has 2500 elements. Let us further assume that 10 terms are sufficient to represent the important variations among the image set. We then need 10 × 1000 image coordinates, plus 10 × 2500 pixels of the factors to be stored, or a total of 35,000 numbers. This should be compared with the original number of pixels, namely 2.5 million. Thus, the data compression factor achieved in this case is on the order of 70. The form (4.21) of the reconstitution formula means that an image indexed i may be built up by adding to the average image the individual components

x_ij^(α) = x_i. φ_iα u_jα,
(4.23)
Fig. 4.4. (continued)
where x_i. is a scaling factor (the marginal weight of each image), φ_iα is the αth coordinate of image i, and u_jα the jth element of the αth eigenvector. Depending on the value of coordinate φ_iα, the element u_jα is added
Fig. 4.4. Demonstration of correspondence analysis for a set of macromolecules that strongly vary in one feature. (a) Set of worm hemoglobin molecules with (1-16) and without (17-32) extra central subunit. (b) Factor map (1 versus 2) separates images (represented by their ID numbers) unambiguously into two clusters along factor 1, the most prominent factor. Insets show images obtained by averaging over the respective clusters. From Frank and van Heel (1982). Reproduced with permission of Deutsche Gesellschaft für Elektronenmikroskopie, e.V.
Fig. 4.5. Principle of reconstitution. Each image, represented by a point in the space R^J, can be approximated by addition of vectors, which point along the eigenvectors, to the vector representing the average image that is centered in the data cloud. Only a few of these additional vectors may be needed to recreate the distinguishing features of the image.
(or subtracted) with different weights to (or from) the total sum. Interpreted as an image, {u_jα; j = 1...J} is also known as an eigenimage. The analogy to the Fourier synthesis of an image is now obvious: in that case, the eigenvectors are complex exponentials. Fourier analysis is in essence the computation of a sequence of projections of the vector that represents the image onto the orthonormalized vectors representing the complex exponentials. The resulting set of components F_l is referred to as the Fourier transform of the image. In the resynthesis of the image from its Fourier transform, terms of the form F_l exp(2πi r_j k_l) are summed, which can be understood as "eigenimages" with different weights. The real and imaginary parts of these complex eigenimages are the elementary cosine and sine waves. As terms with increasing spatial frequencies k_l are added, a more and more refined approximation of the image is created. As in the case of reconstitution, the series can often be terminated at a low resolution |k_l| < k_max without losing essential information. In fact, low-pass filtration was already discussed (Section IV, C in Chapter 3) as an effective way of suppressing noise lying in spatial frequency bands where the signal amplitude is weak or nonexistent.
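The reconstitution and compression arithmetic above can be mirrored in a small numerical experiment. The sketch below (toy data invented for illustration; 30 "images" of 100 pixels) carries out Eq. (4.21) and verifies that the full series restores the data exactly, while the residual after each added factor equals the interimage variance left in the remaining eigenvalues. Note that the reconstitution formula presumes the data matrix is normalized to a grand total of 1:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 30 "images" of 100 pixels with one strong variational component
# (a bright block in half the images), normalized to a grand total of 1.
n_img, J = 30, 100
X = rng.uniform(0.5, 1.5, size=(n_img, J))
X[:15, :20] += 1.0
X /= X.sum()

x_i, x_j = X.sum(axis=1), X.sum(axis=0)
S = np.diag(x_i ** -0.5) @ X @ np.diag(1 / x_j) @ X.T @ np.diag(x_i ** -0.5)
lam, W = np.linalg.eigh(S)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]
V = np.diag(x_i ** 0.5) @ W
lam_nt, V_nt = lam[1:], V[:, 1:]                    # drop trivial factor
U = X.T @ np.diag(1 / x_i) @ V_nt @ np.diag(lam_nt ** -0.5)  # eigenimages
Phi = np.diag(1 / x_i) @ V_nt @ np.diag(np.sqrt(lam_nt))     # coordinates

def reconstitute(a_max):
    """Eq. (4.21): x_ij ~= x_i. x_.j + x_i. sum_{a<=a_max} phi_ia u_ja."""
    return (np.outer(x_i, x_j)
            + np.diag(x_i) @ Phi[:, :a_max] @ U[:, :a_max].T)

def chi2_err(a_max):
    """Chi-square discrepancy; equals the sum of the omitted eigenvalues."""
    E = reconstitute(a_max) - X
    return (E * E / np.outer(x_i, x_j)).sum()

err_full = np.abs(reconstitute(len(lam_nt)) - X).max()
print(chi2_err(0), chi2_err(1), err_full)
```

With 10 retained factors, an N-image, J-pixel set costs 10(N + J) numbers instead of N·J, which is exactly the roughly 70-fold compression worked out in the text for 1000 images and 2500 masked pixels.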
The important difference between the synthesis of an image by reconstitution from its factorial coordinates and its synthesis from its Fourier components is that in the former case, the eigenvectors are "tailored" to the variational pattern of the entire image set that the given image belongs to, whereas in the latter case, the eigenvectors are fixed. Another way of saying this is that the coordinate system of the Fourier expansion is absolute, whereas the coordinate system of PCA or correspondence analysis is relative and data-dependent. This difference is important because it implies that the results of PCA or correspondence analysis cannot be compared, factor by factor, unless the experimental methods are precisely matched and, in addition, the pixel-defining mask and the number of images are precisely the same. Normally only a few very strong factors are found to be the same, e.g., the one that expresses the strong stain variation on the periphery of particles in a single-layer preparation. With increasing factor number, the coordinate systems obtained for different sets of data will increasingly differ by "rotations" (see Section IV, A). For these reasons, reconstitution of an image requires that the image be a member of the ensemble for which the eigenvector analysis has been carried out, whereas Fourier resynthesis of a single image can be carried out entirely without reference to such an ensemble. Reconstitution is extremely useful as a means of exploring the meaning of factors and of gauging the most important variations in the data set (Bretaudiere and Frank, 1986; see Section III, F). For this purpose, only a few selected factors, or even only a single factor, are used in the reconstitution sum (4.22).
E. Computational Methods

The computation of the eigenvectors and eigenvalues is done by direct matrix inversion, either of matrix X'NXM [Eq. (4.7) or (4.13)] or of XMX'N [Eq. (4.5)], depending on the number of images, N, compared with the number of image elements in the mask, J. The transition formulas can always be used to convert the eigenvectors from one space to the conjugate space. Normally, N < J, making the inversion of XMX'N the fastest. For example, a set of 1000 images with a mask passing 2000 pixels poses no problems, because in that case the smallest matrix has the size 1000 × 1000, which occupies 4 MB in core memory and is easily accommodated in current workstations. If neither the N × N nor the J × J matrix fits into memory, the
so-called iterated power algorithm (Lebart et al., 1984) can be used, which essentially applies the eigenvalue equation iteratively, starting with a random seed vector. In a convenient image processing implementation, the program switches automatically from the matrix inversion to the iterated power algorithm depending upon the size of the problem.
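The iterated power scheme can be sketched in a few lines (a generic numpy illustration, not the implementation referenced in the text). A symmetric stand-in matrix is used here; in an actual MSA setting one would apply X, M, X', and N in sequence rather than forming the product matrix, and deflate to obtain further eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(3)

# Symmetric, positive-semidefinite test matrix standing in for the
# (symmetrized) matrix whose eigenvectors are sought.
A = rng.random((50, 50))
A = A @ A.T

# Iterated power method: start from a random seed vector, repeatedly apply
# the matrix, and renormalize; the iterate converges to the dominant
# eigenvector whenever the top eigenvalue is well separated.
v = rng.random(50)
for _ in range(500):
    w = A @ v
    v = w / np.linalg.norm(w)
eig_power = v @ A @ v                  # Rayleigh quotient estimate

eig_direct = np.linalg.eigvalsh(A).max()
print(abs(eig_power - eig_direct))
```

Because only matrix-vector products are needed, the scheme never requires the full matrix to reside in memory at once, which is precisely its appeal for large image sets.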
F. Significance Test

In electron microscopic applications with low signal-to-noise ratio, it is important to know whether the eigenvectors found by MSA merely reflect the multidimensional fluctuations of noise. More specifically, it may happen that the eigenvectors are significant only up to a certain number, and then it is important to know up to which eigenvector the analysis can be trusted. The answer to these questions is obtained by an analysis of noise alone. A data set consisting of pure noise has a characteristic eigenvalue spectrum which depends on the number of images, the number of image elements, and the noise statistics (Lebart et al., 1984). For the eigenvectors to be significant, the associated eigenvalues should stand out from the noise eigenvalue spectrum. Because of the dependency of this "control" spectrum on the noise statistics, it is not possible to tabulate it in a useful way. Instead, the control analysis has to be done with direct reference to the data to be investigated (Lebart et al., 1984), by applying MSA to a noise distribution that has the same first-order statistics as the data. For this purpose, a new data set is created by a process of "random shuffling." Each image is subjected to the following process:

i. set counter m = 0;
ii. draw two numbers j_1, j_2 at random from the lexicographic range of pixels, 1...J;
iii. exchange pixels p_j1 and p_j2;
iv. increment counter m = m + 1;
v. if m < J then go to (ii), else stop.

The new data set no longer has a common signal since the random shuffling is independent for each image. All signal-related systematic variations will be eliminated. However, the resulting noise data set has the desired property that it has the same statistical distribution as the original data set and therefore acts as a true control. An analysis of this kind was done to test the significance of results obtained when applying correspondence analysis to partially averaged
("patch-averaging") low-dose images of purple membrane (Frank et al., 1993).
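The shuffling control can be simulated directly. In the sketch below (invented data; a full random permutation of each image is used, which has the same effect as the J pairwise swaps of steps i-v), one genuine variational component produces an eigenvalue that stands clear of the shuffled control spectrum:

```python
import numpy as np

rng = np.random.default_rng(4)

def top_eigenvalues(X, k=3):
    """First k nontrivial correspondence-analysis eigenvalues of X."""
    x_i, x_j = X.sum(axis=1), X.sum(axis=0)
    S = (np.diag(x_i ** -0.5) @ X @ np.diag(1 / x_j)
         @ X.T @ np.diag(x_i ** -0.5))
    lam = np.sort(np.linalg.eigvalsh(S))[::-1]
    return lam[1:k + 1]                 # skip the trivial eigenvalue 1

# Data with one real variational component buried in noise.
n_img, J = 40, 300
X = rng.uniform(1.0, 2.0, size=(n_img, J))
X[:20, :50] += 2.0                      # the common "signal" variation

# Control set: shuffle the pixels of each image independently, destroying
# all common signal while keeping each image's first-order statistics.
X_ctrl = X.copy()
for img in X_ctrl:
    rng.shuffle(img)                    # in-place pixel permutation

lam_data = top_eigenvalues(X)
lam_ctrl = top_eigenvalues(X_ctrl)
print(lam_data)     # first eigenvalue stands out ...
print(lam_ctrl)     # ... from the noise ("control") spectrum
```

Only eigenvalues of the real data that exceed the control spectrum would be trusted in a subsequent classification.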
III. Correspondence Analysis in Practice

A. Image Sets Used for Demonstration

To understand how correspondence analysis works in practice, it is most instructive to use an image set with well-defined variational patterns. In the following, two sets will be used. One (Fig. 4.8a) is totally synthetic and presents a face varying bimodally in three different features, so that eight (= 2³) different versions of the face can be distinguished (Bretaudiere and Frank, 1986). The other set (Fig. 4.6a) derives from a reconstruction of the 70S ribosome of Escherichia coli (Frank et al., 1991; Penczek et al., 1992), which is projected into nine directions arranged in groups of three. In both cases, noise was added to simulate the experimental conditions (i.e., structure of the support, shot noise, etc.). These two sets will be used to exemplify the main points of MSA and classification.
B. Eigenvalue Histogram and Factor Map

In preparation for correspondence analysis,¹⁸ the image set (Fig. 4.6a) is put into a compact form and stored as a single file. Usually, the data contained in each image are selectively copied under the control of a mask file that is the same for all images. This binary-valued image must be carefully prepared to assure that all particles are passed. In our example, a simple circular mask was used. (The preparation of a binary mask of general shape will be described in Section III, E.) Next it is necessary to decide how many factors should be computed, bearing in mind that the number of eigenvectors determines the computational effort but that, on the other hand, it is better to err on the side of too many factors, since it is always possible to disregard those that are unneeded, but it is not possible to add factors without redoing the entire analysis. In the example of the 70S E. coli ribosome, 20 factors were specified, so that the result of the analysis was a set of 20 eigenvectors and associated eigenvalues and one set of 20 coordinates for each image in the new coordinate system. (The

¹⁸ The description of processing steps has been kept generic for the most part, without reference to a specific image processing package.
Fig. 4.6. Correspondence analysis of a model data set, created from the 70S E. coli ribosome reconstruction (Frank et al., 1991) by projection into three main orientations (defined by angles {φ, θ} = {0°, 0°}, {0°, 90°}, and {50°, -90°}), each varied by a "rocking" of the θ-angle by (-5°, 0°, +5°). (a) The original nine projections (first column) and images generated by adding Gaussian noise with SNR = 0.2 (following columns). (b) Eigenvalue histogram of the first 20 factors. (c) Factor map 1 (horizontal) versus 2 (vertical). The images, denoted by their numbers, are visibly grouped according to their main orientations: 1-15, 16-30, and 31-45. However, the finer division according to the "rocking" by ±5° (i.e., grouping into subgroups 1-5, 6-10, 11-15, etc.) does not take place because of the large amount of noise. (d) Interactive generation of local averages. The display program WEB, which accompanies the SPIDER system, makes it possible to draw closed polygons defining groups of closely related images on the map. On request, an average of the marked images is displayed. (The average image is automatically shifted so as to avoid overlap with the numbers.) The map is shown at a stage where the third polygon (bottom left) has been closed, and the menu offers the choice of displaying or storing the local average generated.
discussion on the practical number of eigenvalues will be deferred to the section on classification, Section IV.) At one glance, the eigenvalue histogram (Fig. 4.6b) allows an assessment of the number of factors carrying meaningful information: it often happens that a group of highest-ranking factors stands out from the rest,
Fig. 4.6. (continued)
claiming the lion's share of the total interimage variance. If we look at the histogram, it is apparent that the first two eigenvalues stand out, together claiming 7% of the total interimage variation. The remaining eigenvalues trail off smoothly without another break, showing the characteristics of variations due to noise. From this behavior one can conclude that the first two factors capture the essence of the variational pattern. Therefore the next step is to generate a factor map of factor 1 versus factor 2. We recall that the factor map is a projection of the entire point cloud onto a Cartesian plane spanned by the selected directions. On the map, the position of each image is determined by its coordinates (Section II, C). For the choice of factors α_1 = 1, α_2 = 2, the coordinates of image i are [φ_i1, φ_i2]. Each image is marked by a symbol. For symbols one normally chooses the numbers identifying the images throughout the analysis ("image ID"), but they may also be chosen to be letters identifying images as having originated from different experimental batches or micrographs, or they may be chosen to indicate the association of images to classes according to a prior classification. Finally, it is sometimes sufficient to use the same symbol for all images (e.g., "." or ",") if the map is used to make a point about the shape of the distribution. This shape may contain clues to the following questions: Are the data clustered in any way? Is there any evidence for a linear ordering? Back to the 70S data set. The 1 versus 2 factor map (Fig. 4.6c) shows the grouping of the 70S ribosome images, but instead of the expected nine groups, only three are found. The reason for this result is that only three main orientations are present and that the subdivision of each group according to a ±5° change of orientation is completely obscured by the noise. It is easy to verify from the numbering (main orientation 1, 1-15; 2, 16-30; 3, 31-45) that the partition of the images into the three clusters is accurate and without error. As an aside, the result exemplifies the concepts of "heterogeneity" and "homogeneity" introduced earlier (Section I, A). The entire data set is clearly heterogeneous as it consists of three distinguishable groups that cannot be mixed (e.g., in computing a total average). However, each of the groups is itself heterogeneous in origin since each consists of projections with different, albeit closely related, orientations. The presence of noise, however, renders this finer distinction meaningless, so that each of the groups is homogeneous for all practical purposes. Next comes the task of identifying the physical meaning of the factors if at all possible. (Although, a priori, there is no compelling reason for the
existence of a one-to-one relationship between factors and physical causes, such relationships have often been found.) To this end, the following tools are used in conjunction with factor maps: explanatory images, local averages, eigenimages, and reconstituted images.
C. Explanatory Tools I: Local Averages

In attempting to interpret factor maps we must recall that the entire set of eigenvectors is unique for every given data set. Only if the experimental protocol and image preprocessing steps are exactly duplicated and the number of images is comparable in two data sets do the eigenvectors have the same meaning. The most straightforward way of determining the rationale underlying an eigenvector is by displaying images that fall at the extremes of the map along the axis under consideration. Such "explanatory images" typify the effect of addition and subtraction of the variational component expressed by the corresponding eigenvector. A shortcoming of this kind of analysis is that raw images may be quite noisy, tending to obscure the variational component brought out by the eigenvector. Averaging even over a few images that lie in close vicinity of one another obviously improves this situation. We call images generated in this way "local averages," where the term "local" refers to the size of a contiguous region, in factor space, from which images are drawn to form the average. This could be a circumscribed region on a particular (2D) factor map, or a multidimensional region, up to the number of dimensions for which factors are computed. Examples of such simple regions are

-a < φ_iα < a;  α = 1...α_max  ("hypercube")
(4.24)

Σ_{α=1}^{α_max} [φ_iα - φ_α^c]² ≤ R²  ("hypersphere"),
(4.25)
where φ_α^c are the center coordinates and R is the radius of the multidimensional sphere. In the foregoing, the attention has been exclusively on the explanation of factors, but it should be stressed that local averages are equally useful in exploring the meaning of a group of images on the map: why they cluster together and what precisely makes them special compared to the rest of the image set. In deciding on the size of the region in factor space, one must strike a balance between the need to average out noise and the need to obtain a true sample of the varying features: obviously, when the region is chosen too large, the variations will be partially leveled.
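A hyperspherical region of the kind defined by Eq. (4.25) translates into a few lines of code. In this sketch, all data are invented and the coordinate array stands in for factorial coordinates produced by a previous analysis; it selects the images whose coordinates lie within radius R of a chosen center and averages them:

```python
import numpy as np

def local_average(X, Phi, center, radius, a_max):
    """Average all images whose first a_max factorial coordinates lie
    within a hypersphere (Eq. 4.25) of the given center."""
    d2 = ((Phi[:, :a_max] - center[:a_max]) ** 2).sum(axis=1)
    members = d2 <= radius ** 2
    return X[members].mean(axis=0), members

# Toy demonstration: two clusters of "images" separated along one factor.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0.0, 1.0, (10, 64)) + 5.0,
               rng.normal(0.0, 1.0, (10, 64)) - 5.0])
Phi = np.vstack([rng.normal(+1.0, 0.1, (10, 1)),   # stand-in coordinates
                 rng.normal(-1.0, 0.1, (10, 1))])

avg, members = local_average(X, Phi, center=np.array([1.0]),
                             radius=0.5, a_max=1)
print(members.sum())    # images drawn from the first cluster only
```

The hypercube criterion of Eq. (4.24) would simply replace the squared-distance test with per-coordinate bounds.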
The most basic way of generating local averages involves a printout of one or several factor maps, a felt pen, a notepad, and a computer standing by. An interesting region of the map is marked with the felt pen, the encircled numbers are written down on the notepad, and the corresponding average is computed and printed out. The image is cut out from the hard copy and pasted on the map, etc. Meanwhile, sophisticated computer graphics programs that allow averages to be computed "on the fly" are available. Polygonal regions are drawn with a cursor on the map, and corresponding averages are calculated interactively. Figure 4.6d shows such a program in action. At the moment the window was captured, local averages had already been computed and displayed for three regions. A fourth region has just been marked, and the menu that gives the choice of how to proceed is shown, with "display averaged image" and "store averaged image" being two obvious choices. As expected, the three local averages generated are seen to display the characteristics of the three main views of the 70S ribosome. It is often helpful to make an exhaustive display of local averages computed on a regular grid of a factor map ("local average maps"); at one glance, one is able to see the rationale behind a particular ordering. Figure 4.7 shows such a map made in the course of the study of a hemocyanin-Fab complex (Boisset et al., 1994a): here factor 1 (vertical) accounts for the flip/flop distinction (i.e., the side view seen from different sides of the molecule), while factor 2 (horizontal) accounts for the different degrees of flattening. Local averages in some parts of the map appear noisy, reflecting the small number of molecules and the consequent poor statistics there. Another display of this kind will be shown later in the context of the reconstruction of the calcium release channel (Fig. 7.3b).
D. Explanatory Tools II: Eigenimages and Reconstitution

For a demonstration of these tools, the "face" data set will be used (Fig. 4.8a). For each of the eight combinations of the variations face oval/round, eyes left/right, and mouth narrow/wide, nine noisy realizations were created, bringing the total to 72 images. The eigenvalue histogram (Fig. 4.8b) obtained after applying correspondence analysis to these images reveals that the eigenvalues of the first three factors (representing, respectively, 14, 5, and 4% of the total interimage variance; see below where the cumulative percentage will be introduced as a measure of representativeness) stand out from the remaining ones (representing 2.5% and less) and suggest that examination of the maps involving the first three factors (i.e., 1 versus 2, 1 versus 3, and 2 versus 3) might already give an indication of existing clustering. Each of these maps (Fig. 4.8c) indeed shows four
Fig. 4.7. Interimage variations along two factors, represented by a grid of local averages. The factor map was divided into squares according to a regular two-dimensional grid. For each square, the images whose symbols fall into it are averaged, and the resulting local average is depicted at that location. Because of the varying numbers of molecules within the squares, the local averages have varying degrees of statistical definition. Analysis of Androctonus australis hemocyanin molecules from electron micrographs of negatively stained specimens reveals two types of variations, immediately apparent from the map: (1) strong stain ring/well defined/small versus weak stain/poorly defined/large; (2) flip/flop (interpreted as resulting from flipping the rhombic molecule on the carbon grid). Modified form of an illustration in Boisset et al. (1994a).
clusters in which two of the three feature variations are separated, whereas the third one is indistinguishable. It turns out that the noisy realizations of the eight prototypical images, labeled 1-8, are separated into eight clusters which sit on the eight corners of a parallelepiped. If we look along the three orthogonal directions of this parallelepiped (which is what we do when forming the projection maps), then we always find four pairs of clusters superposed.
1. Reconstitution of Local Averages

Under certain conditions, the computation of a local average of a factor space region centered at

{φ_α^c; α = 1...q}
(4.26)

can be replaced by reconstituting an image using φ_α^c as coordinates in Eq. (4.21) or Eq. (4.22). Obviously, the reconstituted image will not contain any features that are not captured in the factorial expansion. Thus, if the number of factors included in the reconstitution is too small, the reconstituted image will fail to approximate the local average. Specifically, it is the ratio of the interimage variance captured by the first α_max factors to the total interimage variance,
r = Σ_{α=1}^{α_max} λ_α / Σ_{α=1}^{q} λ_α,
(4.27)
that determines the degree of "representativeness" of the reconstituted image. (r, expressed in percent, is simply the cumulative percentage normally printed out along with the eigenvalues themselves.) If this ratio r is in the range of only a few percent, then such an image might lead to serious misinterpretations. Even for high values of r, it is good practice to make at least spot checks verifying that the reconstituted image captures the features of the local average. In the demonstration chosen (Fig. 4.8e, displayed below Fig. 4.8a for comparison), the ratio proved to be r = 23%. Each of the eight prototype images of the "face" series was stepwise reconstituted by using the coordinates of the eight cluster centers and, starting with the average image, carrying the summation in Eq. (4.21) to term 1 (top row), term 2 (middle row), and finally to term 3 (bottom row). It is interesting to see that in the first step, the image is sharply defined with respect to feature 1 (shape of face), while the other two features are blurred. In the second step, the eye position is in addition sharply defined, but the mouth remains blurred. Only the third step adds the precise definition of the shape of the mouth (narrow or wide). The fact that the three-step reconstitution retrieved the original images confirms the assumption, made at the beginning, that the three-dimensional factor space was sufficient to capture the significant variations.
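The ratio of Eq. (4.27) is simple to compute from an eigenvalue spectrum. Below, a hypothetical spectrum echoing the "face" example (three strong factors at 14, 5, and 4% of the variance, followed by an invented flat noise tail) yields a cumulative percentage near the r = 23% quoted in the text:

```python
import numpy as np

def representativeness(eigenvalues, a_max):
    """Eq. (4.27): ratio of the variance captured by the first a_max
    nontrivial factors to the total interimage variance, i.e. the
    cumulative percentage printed alongside the eigenvalues."""
    lam = np.asarray(eigenvalues, dtype=float)
    return lam[:a_max].sum() / lam.sum()

# Hypothetical spectrum: three strong factors, then a flat noise tail.
lam = np.array([0.14, 0.05, 0.04] + [0.025] * 30)
r = representativeness(lam, 3)
print(round(100 * r, 1))    # → 23.5
```

As the text cautions, a reconstitution based on factors whose cumulative percentage is only a few percent should not be trusted as a stand-in for a local average.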
2. Eigenimages

The individual terms in the reconstitution formula, whose addition to the average image produces one of the two variants of the three independent features, are (apart from the image-dependent coefficients) the eigenimages. These images have quite peculiar properties. For example, the term relating to factor 2 must take away mass at two points relating to the "unwanted" position of the eyes in the average image and add it where it
Fig. 4.8. Demonstration of correspondence analysis and reconstitution, using a model image set with three variational components: eyes left versus right; head round versus oval; and mouth small versus wide. (a) First row: eight (= 2³) prototypical images generated from the three variations. Second row: examples of noisy versions created by adding noise to the prototypes, with twice the amplitude of the signal. Altogether nine noisy versions were created for each prototype, giving a total of 72 images. (b) Eigenvalue histogram showing the distribution of interimage variance among the first eight factors. The first three eigenvalues stand out from the rest; hence one would conclude from this histogram (without seeing the images) that the first three factors are probably sufficient for representing the variational patterns. This information can be inferred by examining three two-dimensional factor maps. (c) The three factor maps (top, 1 versus 2; middle, 1 versus 3; bottom, 2 versus 3), each showing clustering according to four different attributes. Factor 1 is seen to express the shape of the face, while factor 2 expresses the direction of the eyes. For example, all images in which the face is round, irrespective of the other two variations, are on the left of map 1 versus 2. Of these, the ones with the eyes looking left are at the top, those with the eyes looking right at the bottom. The distinction between oval mouth and round mouth is expressed by factor 3. Thus it divides prototype 4 from 8 and 3 from 7 in the two clusters on which we just focused (see map 1 versus 3; middle). Since the 1 versus 3 map projects along factor 2, the distinction eyes left versus right is lost, and images originating from prototypes 7 and 8 are lumped together, for instance. In three dimensions, we would see precisely eight clusters, lying on the eight corners of a parallelepiped. (d) Columns 1 and 2: eigenimages and their negative versions for the three factors. Factor 1 (top): the oval shape is created from the average by adding "material" on top and bottom; the round shape is created by adding slivers of material at the two sides. Factor 2 (middle): the movement of the eye from the average center position to the left is created by subtracting material from the center and adding material on the left, etc. Factor 3: the oval mouth is created from the average mouth by adding material on the sides. Columns 3 and 4: six versions of the image, created by reconstituting images on the two extremes of each factor. For each reconstituted image, the feature of the selected factor is sharply expressed while the other features appear fuzzy: for instance, the images reconstituted at the two extremes of factor 1 (top images in columns 3 and 4) have sharp facial boundaries but blurred centered eyes and blurred mouths, etc. (e) Stepwise reconstitution of the prototype images by stepwise addition of eigenimages to the average. Top row: average + factor 1; middle row: average + factor 1 + factor 2; bottom row: average + factor 1 + factor 2 + factor 3. The images on the bottom row exactly match the prototype images (a, first row). From Bretaudiere and Frank (1986). Reproduced with permission of Blackwell Science Ltd., Oxford.
Chapter 4. Statistical Analysis and Classification of Images
Fig. 4.8. (continued) (b) Eigenvalue histogram. (c) Factor maps 1 versus 2, 1 versus 3, and 2 versus 3.
is "needed." Thus the term, visualized as an image, will be a pair of patterns (one for each eye) consisting of a white dot adjacent to a black dot (Fig. 4.8d, middle row, columns 1 and 2). Similarly, the widening of the mouth is accomplished by adding mass on both sides (Fig. 4.8d, bottom row), and the elongation of the face involves the addition of mass at the top and bottom (Fig. 4.8d, top row). In each case, the two extremes of a feature variation are obtained by adding or subtracting the same term from the average. Figure 4.8d makes it clear that the display of eigenimages can be extremely helpful in understanding the nature of a variation expressed by a factor. Another helpful display is obtained by forming an exaggerated realization of an image along a particular factor α to be investigated:

x̃_ij = x_i. {x_.j + K u_jα},   (4.28)

with values of the coefficient K that are outside the value range assumed in the set.
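A minimal numerical sketch of this kind of display, using plain principal component analysis as a stand-in for correspondence analysis (CA additionally applies a χ²-metric weighting); the function names and the toy one-dimensional "images" below are ours, not from any published package:

```python
import numpy as np

def eigenimages(images, n_factors=3):
    """Average image and leading eigenimages of a stack of flattened
    images (one image per row), via SVD of the centered data matrix.
    Plain PCA is used here as a stand-in for correspondence analysis."""
    X = np.asarray(images, dtype=float)
    avg = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - avg, full_matrices=False)
    eigenvalues = (s ** 2) / len(X)      # interimage variance per factor
    return avg, Vt[:n_factors], eigenvalues[:n_factors]

def exaggerate(avg, eigenimage, K):
    """Exaggerated realization along one factor, in the spirit of
    Eq. (4.28): the average plus K times the eigenimage, with |K|
    larger than any coefficient actually occurring in the image set."""
    return avg + K * eigenimage

# Toy stack of 1-D "images": a fixed background plus one variational
# pattern with image-dependent coefficients and a little noise.
rng = np.random.default_rng(0)
pattern = np.array([1.0, -1.0, 0.0, 0.0])
stack = np.array([10.0 + c * pattern + rng.normal(0, 0.01, 4)
                  for c in (-1.0, -0.5, 0.5, 1.0)])
avg, eig, lam = eigenimages(stack, n_factors=1)
x_extreme = exaggerate(avg, eig[0], K=5.0)  # feature pushed beyond the data
```

The first eigenimage recovered by the SVD is, up to sign, the variational pattern built into the model stack.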
E. Preparation of Masks

The binary-valued mask file defines the region of the image that is to be analyzed throughout the image set. The purpose of the mask in correspondence analysis is to prevent the noise and adjacent molecules in the vicinity of the molecule from participating in the analysis. Exclusion of these irrelevant data obviously increases the signal-to-noise ratio and makes the eigenvalue spectrum more compact. In other words, masking assures that existing variations within the molecule population studied are represented by the smallest number of factors.
III. Correspondence Analysis in Practice
For generating a mask function, two tools are required: one that allows a binary image to be generated from a continuous-valued template image and one that allows different masks to be combined. If the image set is homogeneous, the shape of the molecule projection is more or less described by the shape of the global average

p̄(r) = (1/N) Σ_{i=1}^{N} p_i(r).   (4.29)
Since the average is reasonably smooth, a first approximation to a mask function would be generated by "thresholding," i.e.,

M(r) = { 0  for p̄(r) < T
       { 1  elsewhere,          (4.30)
with an empirically selected threshold T. In this form, the mask function is as yet unsatisfactory for the following reason: in practice, the choice of T allows only little control over the size of the mask when the boundary of the molecule is well defined in the average of the majority of the molecules. Of concern are "minority" molecules that have features outside of the average boundary. If the shape follows the boundary of the global average too closely, then CA is unable to distinguish those minority molecules on the basis of "outside" features. It is therefore always advisable to expand the mask produced by Eq. (4.30) and create a safety margin. A simple procedure for expanding (or shrinking) a binary mask function was described by Frank and Verschoor (1984): the mask function is low-pass filtered using a Gaussian function exp[−k²/k₀²] in Fourier space, followed by a second thresholding step (Fig. 4.9). The smooth function generated by low-pass filtration has an approximate falloff width of r₀ = 1/k₀ near the edge of the binary shape of M(r) and, within this distance, the choice of a new threshold T′ will generate a family of shapes all completely contained within one another in the manner of two-dimensional Russian dolls. The expansion procedure can be repeated if necessary; a small falloff width r₀ has the advantage that the expansion follows the previous boundary closely, but has the disadvantage that only a small margin can be gained. A complication arises when the image set shows the molecules in several dissimilar projections. In that case, the "all-inclusive" shape may be found, in principle, from the global average by choosing the threshold sufficiently low (Frank and Verschoor, 1984). However, this method breaks down, because of its sensitivity to noise, if the fraction of images in any of the subsets is small.
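The threshold-and-expand procedure can be sketched as follows (a minimal NumPy illustration; the function names and the toy disk "average" are ours):

```python
import numpy as np

def threshold_mask(avg, T):
    """Binary mask from an average image, cf. Eq. (4.30):
    1 where the average reaches the threshold T, 0 elsewhere."""
    return (avg >= T).astype(float)

def expand_mask(mask, k0, T_new=0.5):
    """Expand (T_new < 0.5) or shrink (T_new > 0.5) a binary mask by
    Gaussian low-pass filtering exp[-k^2/k0^2] in Fourier space,
    followed by a second thresholding, after the procedure of
    Frank and Verschoor (1984). k0 controls the falloff width
    r0 ~ 1/k0 of the smoothed edge."""
    ny, nx = mask.shape
    ky = np.fft.fftfreq(ny)[:, None]
    kx = np.fft.fftfreq(nx)[None, :]
    gauss = np.exp(-(kx**2 + ky**2) / k0**2)
    smooth = np.fft.ifft2(np.fft.fft2(mask) * gauss).real
    return (smooth >= T_new).astype(float)

# Toy "average": a bright disk on a dark background.
y, x = np.mgrid[:64, :64]
avg = (np.hypot(x - 32.0, y - 32.0) < 10).astype(float)
m0 = threshold_mask(avg, 0.5)
m1 = expand_mask(m0, k0=0.05, T_new=0.1)   # safety margin added
```

Lowering the second threshold below 0.5 yields a mask that strictly contains the original one, which is exactly the "Russian dolls" behavior described above.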
Alternatively, an all-inclusive mask M_I(r) can be generated from a given set of masks representing the constituent molecule projection shapes
Fig. 4.9. Creation of binary masks from averages. The scheme shows two concepts: the creation of an inclusive mask designed to pass several shapes and a method for expanding (or contracting) a mask. When a binary mask is used to define the pixels to be analyzed, we wish to make sure that it will not truncate part of the important distinguishing information. For a heterogeneous set of molecules consisting of three subsets A, B, C, the mask has to pass the shape of molecules belonging to subset A and the shapes of those belonging to B and to C. This can be accomplished by forming preliminary averages of the subsets A, B, and C, and then going through the procedure indicated on the drawing: pass the averages
m₁(r), m₂(r), ..., m_N(r) by inclusive OR logic. Those individual masks may be obtained from (preliminary) subset averages by thresholding and expansion as outlined above. The inclusive mask M_I(r) is derived in the following way (Frank and Verschoor, 1984):

M(r) = Σ_{i=1}^{N} m_i(r)   (4.31)

M_I(r) = { 0  for M(r) < 1
         { 1  elsewhere      (4.32)
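A sketch of Eqs. (4.31) and (4.32) (the function name is ours; any set of equal-sized binary masks will do):

```python
import numpy as np

def inclusive_mask(masks):
    """Inclusive OR combination of binary masks, cf. Eqs. (4.31)-(4.32):
    sum the masks m_i(r) into a multilevel mask M(r), then set every
    pixel where M(r) >= 1 to 1 and all others to 0."""
    M = np.sum(masks, axis=0)        # multilevel mask, values 0..N
    return (M >= 1).astype(int)      # 0 where M(r) < 1, 1 elsewhere

# Three overlapping one-row "projection shapes."
m1 = np.array([[1, 1, 0, 0]])
m2 = np.array([[0, 1, 1, 0]])
m3 = np.array([[0, 0, 1, 1]])
Mi = inclusive_mask([m1, m2, m3])    # passes pixels inside any shape
```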
Fig. 4.9. (continued) (first row) through a threshold operation, which results in binary masks with values of 0 or 1 (second row). These are summed, resulting in a mask with values 0, 1, 2, or 3 (third row). From this four-leveled mask, an inclusive binary mask is created by assigning 1 (or "pass") to all regions that have values 1-3 (fourth row). With appropriate choices of the threshold in the first step, the inclusive binary mask will pass molecules from all three subsets. The additional two steps at the bottom of this scheme can be used to change the binary mask. The mask is first low-pass filtered (by using box convolution or Fourier filtration), then passed through another thresholding step. Depending upon the choice of the threshold, the mask either expands or contracts. From Frank and Verschoor (1984). Reproduced with permission of Academic Press Ltd.

F. Demonstration of Reconstitution for a Molecule Set

Finally, reconstitution will be demonstrated as applied to an experimental image set that has been analyzed by correspondence analysis after passing through a binary mask. The image set is of negatively stained 40S ribosomal subunits presenting the L-view. A particular image is gradually built up by adding more and more terms of the reconstitution formula. In Fig. 4.10, the top row shows the average image followed by the first four terms of the expansion (essentially the eigenimages multiplied by a coefficient); the bottom row shows the partial reconstitutions of the image. The terms of the expansion are displayed at maximum scale, but they are actually very small compared to the average, and so the effects visible in the partially reconstituted image are quite subtle. As usual for negatively stained images, the first factor expresses the different degrees of overall staining: the factor 1 term shows high density values surrounding the particle boundary, indicating that the image to be approximated has a smaller amount of stain in this region than the average. This effect is so small for the image selected that it is virtually invisible in the reconstitution compared to the pure average. The next factor expresses an imbalance in staining between the upper and lower halves of
Fig. 4.10. Gallery of eigenimages, and gradual buildup of a molecule image by reconstitution. 329 images of the 40S ribosomal subunit were analyzed by correspondence analysis. Top row: average image (0) and the first four eigenimages, which show stain variations and particle movements of different types: (1) strong versus weak peripheral staining, (2) top versus bottom stain imbalance, (3) diagonal wobble, and (4) variation due to rocking around the vertical axis. Bottom row: one of the molecules is built up from the average image by stepwise reconstitution, each step involving addition of a weight times an eigenimage. From Frank (1989b). Reprinted from Electron Microsc. Rev. (now Micron), 2, Frank, J., Image analysis of single macromolecules, 53-74. Copyright 1989, with kind permission from Elsevier Science Ltd, The Boulevard, Langford Lane, Kidlington OX5 1GB, UK.
the particle. Factors 3 and 4 finally appear to reflect small changes in orientation, the most pronounced effect of which is the elongation of a protuberance on the right just below the middle of the particle.
IV. Classification

A. Background

When applied to a heterogeneous set of images, multivariate statistical analysis groups similar images into clusters, which show up very clearly on the factor maps. Principal component analysis and correspondence analysis are data reduction techniques, and the arrangement of heterogeneous data into subgroups becomes much more evident in the representations produced by these techniques, because of their reduced dimensionality. Thus, visual inspection of maps of high-ranking factors (e.g., 1 versus 2, 1 versus 3) is often sufficient to classify molecules presenting different views. Note that the meanings of the two terms "clustering" and "classification" are obviously linked, but there is an important semantic difference:
clustering is any technique that groups data points in R^J on the basis of their (multidimensional) geometrical constellation, while classification involves an attempt to interpret the resulting clusters in the widest sense: by attachment of a label that points outside the mere geometric description, and by an assignment of meaning, or specifically, of a physical origin, to the diversity found. Another note on terminology should be added: classification is a technique used in many fields of science outside image processing. It is customary to refer to the vectors in R^J as "objects" irrespective of their meaning. However, since "object" already has a defined meaning in the subject area of this volume, its use in the above general sense has been entirely avoided. Thus, whenever reference is made to the elements subjected to MSA, even in a general context, they will be called images or molecules, always keeping the application to images in mind. When correspondence analysis was first applied to electron microscopic images (van Heel and Frank, 1981; Frank and van Heel, 1982b), the results were so striking, due to a fortuitous choice of specimens and the exaggerating effects of negative staining, that the problem of classification and screening of molecule projections seemed virtually solved. Indeed, in the analysis of hemocyanin, prominent factors were often associated with distinct, physically interpretable components of interimage variation, suggesting that a comprehensive analysis of data grouping could always be accomplished by inspection of a few factor maps. [The special role of hemocyanin in the development of MSA has been previously emphasized (Frank, 1984a).]
However, as single particle analysis was being extended to an increasing variety of specimens, macromolecules with less clear behavior were encountered that posed more difficult problems, often leading to a flat distribution of eigenvalues (i.e., of interimage variance across eigenvector space) and an absence of recognizable clustering on factor maps. Finally, the application of single particle methods to images of ice-embedded specimens, with their low contrast and extremely low signal-to-noise ratio (SNR), has forced a reevaluation of all tenets with regard to the choices of parameters in MSA. Latest experience has shown that q_pract = 60 or more factors may have to be used to capture the significant variational features of such data (P. Penczek and A. Verschoor, personal communication, 1994). To some extent, the differences among the results of different data sets by correspondence analysis are a matter of scaling. On a factor map resulting from the analysis of an image set comprising different views of a molecule, each of the views may produce a defined cluster. If, instead, the analysis is confined to images of particles exhibiting a single view, then we have effectively expanded the scale of factor space and are now looking at
density variations within one of the previous clusters. The factor space in the latter analysis is a magnified picture of a small portion of the data set, albeit with factors that bear no relationship to the factors from the entire data set. The reason for the absence of a relationship is that the spatial configuration of clusters in R^J is unrelated to the shapes of the individual clusters. For example, in a data set consisting of two populations of molecule projections, related to each other by flipping the molecule by 180°, the associated change in appearance will normally represent the largest component of interimage variation and show up as the first factor. In the 1 versus 2 factor map, two distinct clusters will be recognized. If only the molecules falling into one of the clusters are analyzed, on the other hand, then the first factor will express the largest variations within that cluster, which may be associated with a small change in orientation, or level of staining if a stained specimen is used. The first factor in this "magnified" correspondence analysis is not normally related to the second factor of the original analysis either, since the computation of that second factor was constrained by the requirement of orthogonality to the first factor and is now no longer so constrained. Another issue of scaling arises when one attempts to compare results from MSA of different data sets comprising different numbers of images. As the number of images increases, the eigenvalue spectrum spreads out, reducing and "diluting" the significance of the highest-ranking, lower-order factors. As a result, a clear-cut behavior that might be observed for small N, involving only a few factors in important experimental variations, may become more complex as N is increased. One of the problems with classification based on visual interpretation of factor maps is factor rotation.
Factors with closely matched eigenvalues may interchange their positions in the ranking order, or entirely new factors may spring up even when similar data sets are analyzed. This behavior all but excludes the use of factors as a unique "fingerprint" for an experiment. The rotation of factors is caused by small changes in the multidimensional shape of the data cloud as new data are added, or one data set is exchanged for another from the same experiment. It is clear that a much more reliable characterization of the data set is achieved by describing the interior structure of the data, which is invariant under rotation of the factorial coordinate system and not appreciably affected by the addition of a small amount of data. There exists an obvious close analogy between the appreciation of an object's shape and interior density distribution from a few projections and the task of inferring the clustering of a data set from a small number of factor maps: in both cases, the information is insufficient to obtain a
certain answer, and any guess made on the basis of that information is strongly dependent on the choice of "viewing directions." In summary, the rationale for multidimensional clustering is twofold: first, the relative multidimensional arrangement of clusters is a "fingerprint" of the data set, which is for once objective and reproducible; and second, knowledge of the delineations of the clusters is necessary to obtain statistically meaningful averages, but this knowledge often cannot be obtained from the factor maps alone.
B. Classification of the Different Approaches to Classification

A number of terms must first be introduced; classification approaches themselves need to be classified. Two entirely different types of approaches can be distinguished: supervised and unsupervised. Supervised classification groups data according to the degree of their similarity to known prototypes, whereas unsupervised classification groups them according to their mutual relationships without such guidance. Virtually all classification in electron microscopy of macromolecular assemblies has been of the latter type since prototypes are normally not available. However, Fraser et al. (1990) have explored the use of supervised classification, and a brief description will follow. Another distinction concerns the way partitions in a data set are defined: "hard" or "fuzzy." With hard partitions, images are classified as belonging to one group only, with the membership in that group excluding membership in all others. Fuzzy partitions allow an image to belong to several groups at once with different degrees of pertinence. Classifications of the hard type are the norm, although fuzzy techniques have been introduced into electron microscopy (Carazo et al., 1990). Yet a different distinction has to do with the practical way in which the clustering problem is solved. Here partitional methods such as the K-means method (or aggregation around moving centers) are juxtaposed with the hierarchical clustering techniques. The former proceed by dividing the set of images, whereas the latter proceed by merging the images successively into groups, keeping track of an index that reflects the proximity of objects merged. Both techniques also appear mixed in some hybrid techniques. Finally, all approaches discussed thus far make use of sequential algorithms. Neural networks represent an entirely different approach as they are based on the simultaneous interaction of processing nodes.
(The fact that these networks are actually simulated by sequential machines in most implementations is irrelevant in this context.)
C. Partitional Methods: K-Means Technique

In the classification using the K-means technique, the data set is iteratively divided into a presupposed number of classes. If K classes are expected, K "seeds" (i.e., individual images) are randomly drawn from the entire data set and designated as centers of aggregation. A partition is devised based on the Euclidean distances between the centers of aggregation and the members of the data set. From the subsets created by the partition, new centers of aggregation are derived by computing the centers of gravity (Fig. 4.11), which now take the place of the original seeds. Eventually, iteration of this algorithm produces a partition that remains more or less stable. However, a big disadvantage of the method is
Fig. 4.11. The K-means method of clustering, illustrated for the choice of K = 2 clusters. (a) From among the objects, two are picked at random as "seeds" (marked with two circles). They define a first partition of the set. (b) Next, the centers of gravity (filled dots) are computed for the two subsets created by the partition, and these centers define an improved partition, etc. In our example, the two clusters are already separated in the second step. From Frank (1990). Reproduced with the permission of Cambridge University Press.
that the individual classes tend to be (hyper-)spherically shaped. Depending on the particular application, this partition can be totally wrong: elongated clusters or those with more complex shapes will not be recognized as cohesive classes. Another disadvantage of the K-means method is that the final partition depends critically on the choice of initial seeds. The need to overcome this dependency is the rationale behind the dynamic clouds technique of Diday (1971): the K-means technique is applied repeatedly with different choices of seeds. A cross-tabulation of all partitions obtained then reveals "stable clusters."
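The iteration described above can be sketched as follows (a minimal NumPy illustration with 2-D points standing in for images; the function name and the optional deterministic seed indices are ours, added for reproducibility — the textbook method draws the seeds at random):

```python
import numpy as np

def kmeans(X, K, seed_idx=None, n_iter=50, rng=None):
    """K-means ('aggregation around moving centers'): draw K images
    as seeds, partition the set by Euclidean distance to the current
    centers, replace each center by the center of gravity of its
    subset, and iterate until the partition is stable."""
    if rng is None:
        rng = np.random.default_rng(0)
    if seed_idx is None:                    # the usual random draw
        seed_idx = rng.choice(len(X), K, replace=False)
    centers = X[np.asarray(seed_idx)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # distance of every point to every center of aggregation
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k] for k in range(K)])
        if np.allclose(new, centers):       # partition is stable
            break
        centers = new
    return labels, centers

# Two well-separated blobs: the partition stabilizes after one update.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
labels, centers = kmeans(X, K=2, seed_idx=[0, 20])
```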
D. Hard versus Fuzzy Classification

Intrinsic to the usual concept of classification is the exclusive categorical membership function: if three choices of classes are provided, for instance, then an image is either a member of Class 1 and not of Class 2 or 3, or a member of Class 2 and not of Class 1 or 3, etc. We also speak of a hard membership function. An altogether different concept is introduced when a continuous membership function is allowed. In that case, an image is able to "belong" to different classes at once, with different degrees of pertinence. This fuzzy classification is appropriate, in general, when the object has vaguely defined properties. It has been previously pointed out (Frank, 1990) that the fuzzy membership concept actually runs counter to the physical reality (at least as long as we are not dealing with quantum-physical behavior). Only the specimen preparation may introduce effects, such as flattening and fluctuating stain levels, that occur in different degrees, or extents, and thus lend themselves to a description by fuzzy membership functions. However, the real utility of the fuzzy approaches to classification in our applications lies in the fact that they have algorithmic advantages in all iterative methods such as the K-means method: in this latter method, outlier images may be frequently reassigned to different "centers" and thus constitute a source of instability of the entire classification process.¹⁹ In contrast, a "soft" classification using a continuous-valued membership function avoids such abrupt changes. If needed, a hard classification can be readily obtained at the very end of the iterative scheme by applying a threshold to the membership function of each image.
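The continuous membership function can be illustrated with the generic fuzzy c-means membership formula (a sketch only — this is the standard formula, not necessarily the variant of Carazo et al. (1990), and the function names are ours):

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Continuous ('fuzzy') membership functions for given cluster
    centers, in the fuzzy c-means style: each image belongs to every
    class with a degree in [0,1], and the degrees sum to 1.
    m > 1 is the usual fuzziness exponent."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)               # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    u = inv / inv.sum(axis=1, keepdims=True)
    return u                               # shape (n_images, n_classes)

def harden(u):
    """Final 'hard' classification: assign each image to the class
    with the largest membership value (thresholding at the maximum)."""
    return u.argmax(axis=1)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.45, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
u = fuzzy_memberships(X, centers)
# The first two images belong almost fully to one class each; the
# middle image has appreciable membership in both.
```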
E. Hierarchical Ascendant Classification

Hierarchical ascendant classification (HAC) is based on the construction of an indexed hierarchy among the images to be classified. Images are

¹⁹ I am indebted to Jose-Maria Carazo for drawing my attention to this important point.
successively merged based on their distance in factor space (or the original space R^J, for that matter), s_ik, and on a merging rule. For the distance, the Euclidean distance in factor space is normally used:

s_ik = E_ik = Σ_{α=1}^{α_max} |ψ_α^(i) − ψ_α^(k)|²,   (4.33)
but below we will formulate the merging criteria in the most general way, referring to some distance measure s_ik. The choice of merging criterion is very important because it determines the merging sequence and thereby the way the images coalesce into clusters. Based on the merging rule, each merger gives us an index of similarity which can be used to construct a tree that depicts the hierarchical relationships. Such a tree is called a dendrogram. The merging rule has the following form (Anderberg, 1973; Mezzich and Solomon, 1980): let us assume two clusters p and q have been merged to form cluster t. The distance s_tr between the new cluster t and a given cluster r is determined by some function of the individual distances, s_pr and s_qr, between the clusters p and q and cluster r. For the single linkage criterion, the function is

s_tr = min(s_pr, s_qr).   (4.34)
It is easy to see (Anderberg, 1973) that HAC using this criterion is unable to distinguish clusters that virtually touch each other. Complete linkage overcomes this difficulty to some extent, by using instead the criterion

s_tr = max(s_pr, s_qr).   (4.35)
Average linkage is obtained by computing the average of all links and thus assumes a position somewhat intermediate between single and complete linkage. For details of this method the reader is referred to the monographs cited above. The centroid criterion merges those clusters whose means are most similar. The formula is in this case (this time restricted to Euclidean distances)

s_tr = [m_p/(m_p + m_q)] s_pr + [m_q/(m_p + m_q)] s_qr − [m_p m_q/(m_p + m_q)²] s_pq,   (4.36)
where m_p and m_q are the numbers of elements (i.e., images in our case) in each cluster. The problem with this criterion is that mergers normally shift
the centroids, so that the distance between the most similar clusters may increase or decrease from one stage to the next. As a result, the tree depicting the merging hierarchy may feature crossovers of branches. This difficulty is overcome by the widely used criterion of Ward (1963), which is based on the principle of minimum added intraclass variance. That is, the method finds, at each stage, the two clusters whose merger results in the minimum increase ΔE_pq in the within-group variance. This quantity can be shown (e.g., Anderberg, 1973) to be

ΔE_pq = [m_p m_q/(m_p + m_q)] Σ_{α=1}^{α_max} |ψ̄_αp − ψ̄_αq|²,   (4.37)
where ψ̄_αp and ψ̄_αq are the mean factorial coordinates of the pth and qth clusters,

ψ̄_αp = (1/m_p) Σ_{i∈P} ψ_α^(i);   ψ̄_αq = (1/m_q) Σ_{i∈Q} ψ_α^(i),   (4.38)
where P and Q have been used to denote the index subsets of images belonging to the two clusters. That is, the increase in the intragroup variance is proportional to the squared Euclidean distance between the centroids of the two merged clusters. The updating formula for Ward's criterion is

ΔE_tr = [1/(m_r + m_t)] [(m_r + m_p) ΔE_rp + (m_r + m_q) ΔE_rq − m_r ΔE_pq].   (4.39)

Three examples will be given here, the first with the 70S model data set introduced in Section III,A, the second with the "face" series, and the third with an experimental set of 70S ribosomes. In the case of the 70S ribosome model data set, Ward's merging criterion was used, and the classification was performed in the 20-dimensional space created by correspondence analysis (Fig. 4.12a-c). We recall that the model data set was created, by addition of noise, from nine projections falling into three clusters of similar orientations. Figure 4.12a shows a tabulation of similarity indices for each image, resulting from HAC. From this tabulation, a classification tree, or dendrogram, can be constructed. At level 0.5 (Fig. 4.12b), the three main orientations are separated into three classes. When the tree is cut just below 0.24
Fig. 4.12. Demonstration of hierarchical ascendant classification (HAC) with the 70S model projection set (see Fig. 4.6). (a) Result of HAC classification with Ward's criterion, as it is recorded in a SPIDER document file. Here the third column is the particle ID and the fourth column is the similarity index. (b and c) WEB-generated dendrograms which end in class averages that are automatically generated for the desired cutoff levels (here 0.5 and 0.05, for the purpose of the demonstration). The diagram has to be read from bottom to top. Each node connecting branches represents a merged class. For instance, in (b), the classes 1 (N = 17) and 2 (N = 13) are merged into a new class with 30 members; that new class is then merged with class 3 (N = 15) on the highest level to form the total population. The vertical height of the horizontal lines is a measure of the gain in intraclass variance obtained through the merging; if that gain is small, then the two classes are very similar. For example, in (c) classes 4 (N = 9) and 5 (N = 6) are connected on a low level (similarity index = 0.051) and hence are very similar, whereas merged classes (1 + 2) (N = 9 + 8) and 3 (N = 13) are
(Fig. 4.12c), one of the classes splits up into two, and just below 0.05, another class splits up. By checking the image numbers going into the various classes (not shown), it is easy to see that the subdivision of the main classes at 0.24 and 0.05 is unrelated to the orientational "fine structure"; rather, the subdivisions found by HAC are due to noise, which creates larger fluctuations than the changes in orientation. The face data set (reproduced for this purpose in slightly different form by N. Boisset), subjected to HAC using Ward's criterion, splits into 2 or 4 classes or into the 8 original classes, depending on the placement of the cutoff level (Fig. 4.13). This data set enables us to introduce the important concept of the stability of a classification. Each of the 3 classifications was obtained by a decision on where to place the cutoff level. We judge the stability of the classification by the size of the (vertical) step in the similarity index between the previous branching and the next branching relative to the cutoff chosen. The idea behind this criterion is that, for a classification to be stable, small changes in the choice of cutoff level should not affect the outcome. By that criterion, the classification producing 2 classes (uppermost level in Fig. 4.13) is the most stable; the one producing 4 classes (middle level) is less stable but still sound; and the one producing 8 classes is rather uncertain because a small increase of the cutoff level might have resulted in 7 or 6 classes, while a slight decrease might have given 9, 10, 11, or even more classes. The third example is the classification of an experimental set comprising 667 images of the 70S E. coli ribosome embedded in ice by Penczek et al. (1994) with HAC using the complete linkage criterion (Fig. 4.14a).
Here the stability criterion fails because, as apparent from the dendrogram, there is no single choice of cutoff, except for some choices at the highest levels, that would result in a classification insensitive to small changes in the cutoff value. This result is typical for a data set that is quite noisy. In this case the choice of cutoff level was made according to pragmatic considerations: to obtain classes that are large enough to permit the
connected on a much higher level (similarity index = 0.75) and are therefore less similar. The actual division into groups ("classes") is produced by a decision on where along the vertical axis to place the cutoff. Preferable and most stable is a division that is insensitive to a variation of the precise cutoff level. From that point of view, the only choices in the present case are two or three classes, one obtained by choosing a cutoff above 0.5 and the other by choosing one between 0.05 and 0.5. Both are equally "valid"; since this is a case of a tie, one would normally prefer the division into the greater number of classes because it captures more of the structure of the data.
170
Chapter 4. Statistical Analysis and Classification of Images
Fig. 4.13. HAC classification of the "face" data set introduced earlier in Fig. 4.8. The three cutoff levels indicated produce two, four, or eight classes. The weakest feature distinction, the variation in the width of the mouth, comes out on the lowest of the three indicated levels in the dendrogram, and the corresponding classification is unstable, since a small change in the cutoff level changes the number of classes (N. Boisset, unpublished).
computation of well-defined averages, yet small enough to prevent losing high resolution. It is seen from the display of the class averages (Fig. 4.14b) that some of the classes could well have been joined without harm, as for instance classes 1, 3, and 4 in the top row. Hierarchical classification was first used in the electron microscopy context to classify the groups found by Diday's cross-tabulation of K-means clusters; see below. It was first applied directly to the images by van Heel (1984a). Van Heel used an important modification, however, in which the partitions created by HAC are to some extent reshuffled. This modified HAC technique will therefore be discussed below, under hybrid techniques.
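The basic HAC machinery discussed above can be sketched in a few lines. The following Python fragment is illustrative only (it uses SciPy's agglomerative clustering, not the SPIDER/WEB tools referred to in the text), with synthetic "images" standing in for aligned particle projections:

```python
# Sketch of HAC with Ward's criterion on a synthetic image set. The data,
# class count, and library calls are stand-ins for the SPIDER/WEB procedure
# described in the text.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# 45 "images" (flattened to 16-pixel vectors) drawn from 3 noisy prototypes
prototypes = rng.normal(size=(3, 16))
images = np.vstack([p + 0.1 * rng.normal(size=(15, 16)) for p in prototypes])

Z = linkage(images, method="ward")               # full merging history (the dendrogram)
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the tree into 3 classes
print(sorted(np.bincount(labels)[1:].tolist()))  # class sizes
```

Cutting the tree at a distance threshold instead (criterion="distance" in fcluster) corresponds to placing a cutoff level on the dendrogram, as in the stability discussion above.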
F. Hybrid Techniques

We call hybrid classification techniques those techniques that are not pure implementations of either partitional or hierarchical merging techniques, such as HAC, but combine elements of both. As we have seen, the K-means techniques have the disadvantage that they tend to produce clusters with (hyper-)spherical shapes irrespective of the actual shape of the clusters in the data. Hierarchical merging techniques, on the other hand, can be misled when the data are relatively unstructured, so that incorrect "data bridges" that persist till the end may be formed. This problem exists even though a certain degree of control is possible by the choice of merging criterion. van Heel (1984a) introduced a classification procedure whereby HAC is used initially and then a "postprocessor" is employed, which decides for each image whether it should be reassigned to another class. The rationale behind the introduction of this postprocessor is that it is able to overcome too-early "marriages" between images, which in the course of the normal HAC scheme cannot be broken. If Ward's merging criterion is used (as was done by van Heel), then the reshuffling amounts to an attempt to overcome the trapping in a local variance minimum. [For applications and further discussions of this modified HAC technique, see van Heel and Stöffler-Meilicke (1985), Harauz et al. (1987, 1988), and van Heel (1989).] Another hybrid technique, due to Wong (1982), addresses the specific problem posed by a quasi-continuous distribution. In such a case, the K-means techniques lead to rather arbitrary partitions depending upon the choice of seeds, and the HAC techniques suffer strongly from the "early marriage" problem. Wong combined Diday's (1971) dynamic clouds method with HAC. The dynamic clouds method is based on a repeated use of the K-means algorithm in which cross-partitions that reflect the clustering history are defined.
It is typical for quasi-continuous distributions that many small clusters are found in the cross-partitions. Subsequent HAC applied to the cluster centers then leads to a consolidation into larger clusters which hopefully represent "sharpened" divisions in the data set. The first automatic classification of single particles in electron microscopy was done in this way (van Heel et al., 1982). Frank et al. (1988b) applied this method to images of negatively stained 70S ribosomes from E. coli but found that elimination of cross-partition clusters containing fewer particles than a certain percentage of the total population helps to stabilize the results. A disadvantage of this latter hybrid technique is that its outcome depends on a number of discretionary parameters: the number of K-means iterations; the number of repeats of the K-means, which determines the cross-partitions; and the cutoff percentage.
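The two-stage idea can be illustrated with the following Python sketch, which over-partitions a synthetic data set with K-means and then consolidates the surviving cluster centers by HAC. All sizes, seeds, and the size cutoff are arbitrary demonstration choices, not the parameters used in the cited studies:

```python
# Illustrative hybrid scheme in the spirit of Wong (1982): K-means
# over-partitioning followed by HAC on the cluster centers.
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(loc=m, scale=0.2, size=(60, 2))
                  for m in ([0, 0], [3, 0], [0, 3])])

# Stage 1: over-partition with K-means (K much larger than the expected
# number of classes); drop sparsely populated clusters (the size cutoff).
np.random.seed(1)                      # kmeans2 draws its seeds from the global state
centers, assign = kmeans2(data, 12, minit="points")
sizes = np.bincount(assign, minlength=12)
keep = sizes >= 5

# Stage 2: HAC applied to the surviving cluster centers consolidates them
# into larger clusters, "sharpening" the division of the data set.
Z = linkage(centers[keep], method="complete")
center_labels = fcluster(Z, t=3, criterion="maxclust")
print(len(set(center_labels)))
```

The discretionary parameters mentioned in the text appear here explicitly: the number of K-means clusters, the size cutoff, and the final number of consolidated classes.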
G. Intrinsically Parallel Methods

The methods of classification discussed thus far are essentially sequential in design. Juxtaposed to these are methods that lend themselves to implementation in parallel machine architectures. This may have advantages in computational speed when actually implemented on such machines, or it may at least lead to new insights and paradigms when implemented artificially (and normally rather inefficiently) on a sequential machine. Another boon provided by parallel methods is that they tend to reduce the influence of noise. Marabini and Carazo (1994a) explored the use of a self-organizing map algorithm, which lends itself to implementation on a neural network. Such an algorithm maps a set of N-dimensional vectors into a two-dimensional array of "node vectors" in such a way that similar vectors "project" onto closely neighboring nodes while dissimilar vectors project onto nodes lying more separated from one another. The algorithm operates on all nodes simultaneously and leads to a "steady state" result in which the different groups of a heterogeneous image set are spatially separated. The performance of the self-organizing map algorithm can be strikingly demonstrated by applying it to an unaligned particle set (Fig. 4.15): in that case, the particles (side view projections of the TCP-1 complex) are mapped onto the nodes in a systematic fashion according to their rotation angle. First successful applications have been reported by other groups as well (Egelman et al., 1995). A variant of the algorithm described, which is termed learning vector quantization and was introduced by Kohonen (1990), can be used to effect supervised classification on the basis of a preexisting set of vectors already classified. Again, Marabini and Carazo (1994a) were the first to try this approach with electron microscopy data.
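The mapping behavior described can be reproduced with a minimal self-organizing map written in plain NumPy. The grid size, learning rate, and decay schedule below are arbitrary demonstration values (the cited work used Kohonen's algorithm proper on image vectors):

```python
# Minimal self-organizing map (SOM) sketch: similar input vectors come to
# "project" onto neighboring nodes of a 2D grid. All parameters are demo choices.
import numpy as np

rng = np.random.default_rng(2)
grid = np.array([(i, j) for i in range(6) for j in range(6)])  # 6 x 6 node grid
codes = rng.normal(size=(36, 2))        # one code vector per node

# Training data: points on a circle, i.e., a one-parameter ("rotation") family
thetas = rng.uniform(0, 2 * np.pi, 500)
data = np.c_[np.cos(thetas), np.sin(thetas)]

for t, x in enumerate(data):
    best = np.argmin(((codes - x) ** 2).sum(axis=1))   # best-matching node
    radius = 3.0 * np.exp(-t / 200)                    # shrinking neighborhood
    dist2 = ((grid - grid[best]) ** 2).sum(axis=1)
    h = np.exp(-dist2 / (2 * radius ** 2))             # neighborhood function
    codes += 0.5 * np.exp(-t / 300) * h[:, None] * (x - codes)

# After training, adjacent nodes hold more similar code vectors than arbitrary
# node pairs (the topology-preserving property of the map).
adjacent = np.mean([np.linalg.norm(codes[i * 6 + j] - codes[i * 6 + j + 1])
                    for i in range(6) for j in range(5)])
overall = np.mean([np.linalg.norm(a - b) for a in codes for b in codes])
print(adjacent, overall)
```

The final comparison makes the "steady state" claim of the text quantitative: the mean distance between code vectors of adjacent nodes is smaller than the mean distance over all node pairs.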
Fig. 4.14. HAC classification of experimental images of 70S ribosome embedded in ice. (a) Averages of 25 classes. (Note that the average of class 26, consisting of three images only, has been left out in this gallery). From Penczek et al. (1994). Reproduced with permission of Elsevier Science, Amsterdam. (b) Classification dendrogram, showing the hierarchical similarity relationships among the 26 classes. From left to right, each branch of the dendrogram on its lowest level corresponds to a class, in the order (left to right, top to bottom) they are arranged in the gallery above. The numbers on each branch are the numbers of images falling into the corresponding class. The varying degrees of statistical definition are reflected in the appearance of the class averages. For instance, class 1 has 38 members; hence its average (first in the top row of the gallery (a)) is very well defined, whereas class 2 (second in top row) is quite noisy because it has only 7 members (unpublished results; P. Penczek, R. Grassucci, and J. Frank, 1994).
Fig. 4.15. Unsupervised classification of TCP-1 complex presenting predominantly side view, using Kohonen's self-organizing algorithm. In the words of Marabini and Carazo (1994a), the algorithm maps "a set of n-dimensional nodes (i.e., the images) onto a two-dimensional array of nodes in such a way that vectors projected onto adjacent nodes are more similar than those projected onto distant nodes." Each node has a code vector assigned to it that characterizes its state at any given time. A total of 407 centered but not rotationally aligned images of the molecule were analyzed. (a) Gallery of some of the particles, (b) "tuned" code vectors (i.e., after application of the iterative algorithm), (c) detailed display of every other code vector in (b), (d) example molecules assigned to the code vectors shown in (c). It is seen that the algorithm has ordered the molecules in a two-dimensional continuum according to their (in-plane) orientation. The molecule assigned to the central unstructured code vector [marked by a circle on (d)] is in fact a top view. From Marabini and Carazo, 1994a. Reproduced with permission of the Biophysical Society.
H. Inventories and Analysis of Trends

Often it is clear from the factor maps that no distinct classes exist, but rather continuous variations. Many physical effects underlying variational patterns are continuous. Examples are "rocking" of the molecule on the support grid, rotational misalignment of a set of molecules, and movement of a flexible component within a finite range. In that case, we might wish to divide the entire set into subgroups that are big enough to allow meaningful averages to be formed, yet small enough to avoid the blurring of structural features. [The second stipulation was used above in an operational definition of the term "homogeneous."] This project is essentially one of inventory, i.e., of an exhaustive description of important structural variations, which is facilitated by the ordination of information achieved by multivariate statistical analysis. While a systematic fine division of an α_max-dimensional factor subspace according to a grid into "hypercubes" is not feasible in practice for α_max > 3, because of the immense number of grid elements, a coarse division of a two- or three-dimensional subspace into (respectively) squares or cubes is quite easy to accomplish. Whether or not this type of inventory is meaningful depends on the proportion of interimage variance captured by the reduced subspace. In situations where the point cloud in factor space follows a geometrical, essentially one-dimensional path, a division into homogeneous subgroups is accomplished by cuts that are perpendicular to the local pathway, the way one would cut a ring-shaped sausage into equal pieces. Pathways of this kind are likely to point to an underlying physical process that constrains the variational manifold. They are found in situations where the image set shows the molecules in a continuous ("one-dimensional") range of orientations. For instance, each projection in a single-axis tilt series has a "precursor" and a "successor" with high similarity.
Frank and van Heel (1982b) demonstrated the mapping of a tilt pathway into factor space by creating a tilt series of a model, combining regular conical tilting with single-axis tilting: in accordance with the "logical" pathway of tilting, which has three branches connecting two points, the topology of the data pathway found in factor space is that of the Greek letter θ. Similar model computations were performed by van Heel (1984a). Molecules with roughly cylindrical shape, such as the 30S ribosomal subunit of E. coli, are likely candidates for single-tilt behavior; i.e., they often occur in multiple orientations related by tilting around the cylinder axis. The Boekema and van Heel (1989) analysis of Lumbricus terrestris hemocyanin side views provides a perfect example of this behavior: the
barrel-shaped particle exists in many views related by the "rolling" and, to some extent, the "tumbling" of the barrel on the support grid, and the factor-space distribution consequently follows a narrow pathway. [Note that in this case, the apparent sixfold symmetry of the molecule creates a sixfold degeneracy in the mapping of an entire tilt series into factor space: two molecules rotated by 60° relative to each other are indistinguishable in projection.] Some three-dimensional reconstructions based on an interpretation of factor-space distributions of projections have been attempted (van Heel, 1984a; Verschoor et al., 1984; Harauz and van Heel, 1985b). The general problem of how the outcome of MSA is related to variations of parameters of the data set (in this case, the variation of angles) was later treated by Harauz and co-workers in a series of papers (Harauz and Chiu, 1991, 1993; Harauz et al., 1994). A description of this "event covering" approach to data analysis would lead too far here; however, it is a direction of research that may yet find useful applications in the 3D reconstruction of macromolecules.
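The statement that a continuous orientation change traces out an essentially one-dimensional pathway in factor space can be checked numerically. In the following sketch, PCA via the singular value decomposition stands in for correspondence analysis, and the rotating two-blob model image is an arbitrary invention for the demonstration:

```python
# Sketch: images varying continuously (here, by in-plane rotation) map onto a
# one-dimensional pathway in factor space. SVD-based PCA substitutes for
# correspondence analysis.
import numpy as np

def model_image(angle, n=32):
    """Asymmetric two-blob pattern rotated by `angle` (radians), on an n x n grid."""
    y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
    xr = np.cos(angle) * x + np.sin(angle) * y
    yr = -np.sin(angle) * x + np.cos(angle) * y
    return np.exp(-((xr - 0.4) ** 2 + yr ** 2) / 0.05) + \
           0.5 * np.exp(-(xr ** 2 + (yr - 0.3) ** 2) / 0.02)

angles = np.linspace(0, 2 * np.pi, 60, endpoint=False)
stack = np.array([model_image(a).ravel() for a in angles])
stack -= stack.mean(axis=0)            # center the data cloud

# Coordinates on the first two factors, from the SVD of the centered stack
_, _, vt = np.linalg.svd(stack, full_matrices=False)
coords = stack @ vt[:2].T              # (60, 2) factor coordinates

# Along the pathway, neighbors in angle are neighbors in factor space: the
# mean step between successive projections is small compared with the spread.
step = np.linalg.norm(np.diff(coords, axis=0), axis=1).mean()
spread = np.linalg.norm(coords - coords.mean(axis=0), axis=1).mean()
print(step, spread)
```

The small step-to-spread ratio is the numerical signature of the narrow, ordered pathway described for the hemocyanin side views.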
I. Nonlinear Mapping

As has become clear in the previous section, the analysis of continuous variations (and the subsequent partitioning of the data into homogeneous subsets) is solely based on visual analysis of one or two factor maps, since there exists no ready tool for automatic data-directed partitioning. (The only exceptions are neural networks which, as we have seen in Section IV, G, can be used to order and partition the data according to intrinsic rules.) Visual analysis becomes difficult to carry out when several factors are involved in the variation. Nonlinear mapping (Radermacher and Frank, 1985; Radermacher, 1988) is a way of presenting multidimensional variational information in two dimensions: either as a "point cloud," as a montage where each point is replaced by the image it represents, or as a montage of local averages (similar to the one shown in Fig. 4.7). From such a display, inferences can be drawn on the nature (and, if applicable, on the branching structure) of the variations. Formally, the goal of the nonlinear mapping procedure is to represent an N-dimensional point distribution on a 2D map in such a way that the interpoint distances are optimally preserved. The procedure is iterative and uses one of the factor maps as a starting map. The projected positions of the points are changed such that the differences between the Euclidean distances in 2D, $d_{ii'}$, and those in the α_max-dimensional factor subspace, $d_{ii'}^{\alpha_{\max}}$, are minimized. As an error measure, the following
expression is used:

$E = \left[\sum_{i<i'} \left(d_{ii'}^{\alpha_{\max}}\right)^{2-a}\right]^{-1} \sum_{i<i'} \left(d_{ii'}^{\alpha_{\max}} - d_{ii'}\right)^2 \big/ \left(d_{ii'}^{\alpha_{\max}}\right)^{a}.$   (4.40)

The distances are defined as follows:

$d_{ii'} = \left[\sum_{k=1}^{2} \left(x_k^{(i)} - x_k^{(i')}\right)^2\right]^{1/2},$   (4.41)

where $\{x_k^{(i)}, k = 1, 2\}$ are the coordinates of the ith point on the nonlinear map; and

$d_{ii'}^{\alpha_{\max}} = \left[\sum_{\alpha=1}^{\alpha_{\max}} \left(\kappa_\alpha^{(i)} - \kappa_\alpha^{(i')}\right)^2\right]^{1/2},$   (4.42)
where $\{\kappa_\alpha^{(i)}, \alpha = 1, \ldots, \alpha_{\max}\}$ are the coordinates of the ith point in the α_max-dimensional factor subspace. The parameter a (0 ≤ a ≤ 1) in the denominator of expression (4.40) controls the importance of short distances $d_{ii'}^{\alpha_{\max}}$ relative to long distances. At the one extreme, a = 0 is used when clustering is expected, and short-range relationships within a cluster are thought to be unimportant; at the other extreme, the choice a = 1 is used when importance is placed on the maintenance of short-range relationships. Figure 4.16 shows an example of the use of a nonlinear map in the sorting of images of the 50S ribosomal subunit. Partial reconstitution of the 50S subunit with 23S rRNA and the protein fraction results in a particle that lacks 5S rRNA (Radermacher et al., 1990). A negatively stained sample was subjected to the usual procedures of electron microscopy (tilted/untilted; see Section V in Chapter 5) and preprocessing. There appeared to be a continuum of shape variation among the 943 particles analyzed, not just the expected presence versus absence of a mass corresponding to the 5S rRNA. To get an overview of the diversity of particle shapes, and to obtain an objective way to separate homogeneous subsets, Radermacher et al. applied nonlinear mapping to factors 4-8 from the correspondence analysis. The resulting map (Fig. 4.16) showed continuous transitions from a particle resembling the crown view of the native 50S subunit (map region I), through forms in which part of the central protuberance has disappeared (map region II), to forms that show a complete absence of mass in this area (map region III).
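A gradient-descent minimization of an error measure of the form of expression (4.40) can be sketched as follows. The step size, iteration count, and starting map (random, rather than a factor map as prescribed above) are simplifications for the demonstration:

```python
# Sketch of nonlinear ("Sammon-type") mapping: minimize the distance-
# preservation error of Eq. (4.40) by plain gradient descent.
import numpy as np

def nonlinear_map(X, a=1.0, iters=200, lr=0.1, seed=0):
    """Map N points in R^d to 2D, approximately preserving pairwise distances."""
    rng = np.random.default_rng(seed)
    N = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # factor-space distances
    np.fill_diagonal(D, 1.0)                             # avoid division by zero
    mask = np.triu(np.ones((N, N), dtype=bool), 1)       # each pair i < i' once
    norm = (D[mask] ** (2 - a)).sum()                    # normalization of E
    Y = rng.normal(scale=0.1, size=(N, 2))               # starting 2D map
    for _ in range(iters):
        diff = Y[:, None] - Y[None, :]
        d = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(d, 1.0)
        w = (d - D) / (D ** a * d)                       # per-pair error weight
        np.fill_diagonal(w, 0.0)
        Y -= lr * 2.0 * (w[:, :, None] * diff).sum(axis=1) / norm  # gradient step
    d = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)
    E = (((D - d)[mask] ** 2) / D[mask] ** a).sum() / norm
    return Y, E

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 5))     # 20 points in a 5D "factor subspace"
Y, E = nonlinear_map(X, a=1.0)
print(round(E, 4))
```

With a = 1 the error measure weights short distances in the Sammon manner; a = 0 de-emphasizes them, as discussed in the text.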
Fig. 4.16. Example of the use of nonlinear mapping: partially reconstituted 50S ribosomal subunits from Escherichia coli, which were aligned and passed through correspondence analysis. Nonlinear mapping was used to generate a 2D map that best preserves the distances in factor space (only factors 4-8 were used). Local averages were then created by averaging over particles falling into different squares of a grid overlaid on the map. The subunit is seen to change gradually in the horizontal direction, from more normal looking crown views (region I) to a variety lacking mass in the 5S RNA region (region III). There is an indication of vertical ordering according to the inclination of the stalk (extended feature on the left of the subunit) with respect to the main body. (Note, however, that there are no "axes" on such a map as we know them from factor maps.) From Radermacher et al. (1990). Reproduced with permission of San Francisco Press, San Francisco.
J. Supervised Classification: Use of Templates

Supervised classification methods group the images according to similarity to existing prototypes, or templates. Correlation averaging of crystals (Saxton and Baumeister, 1982; Frank, 1982) incorporates a supervised classification of some sort: here a reference is simply picked, or generated by Fourier filtration. Its correlation with the full crystal field produces a correlation peak wherever a match occurs. The normal practice of rejecting lattice repeats for which the correlation peak falls below a given threshold value amounts to a supervised classification. The dependency of the result on the choice of the template is striking in the case of a double layer (Kessel et al., 1985), where repeats of the front layer are selected by the front-layer template, and those of the back layer by the back-layer template. Sosinsky et al. (1990) demonstrated this selectivity beautifully by using an artificial field of periodically repeating patterns (a hand) that differ in the position of a component (the index finger) (Fig. 4.17). The obvious disadvantage of supervised classification is that the result depends strongly on the choice of the template, thus allowing subjective decisions to influence the course and outcome of the structural analysis. The only applications of supervised classification to single particle averaging have been pursued by Alasdair Steven's group (Fraser et al., 1990). A brief description of the principle of this approach is found in Trus et al. (1992): for each image, Euclidean distances are computed to a set of templates. As pointed out in other sections of this book, the Euclidean distance is a measure of similarity between images and is computed between multicomponent vectors formed by the (lexicographically) ordered set of pixels. Each template is found by averaging images that have been selected as representatives of a given class.
Some kind of intuitive classification thus precedes the actual automated, supervised classification, which is in turn achieved by using a standard minimum distance classifier (Duda and Hart, 1973). We are now also equipped with the necessary vocabulary to discuss the multireference alignment technique mentioned earlier (Section III, D, 4 in Chapter 3). According to this technique, several templates are generated from class averages following a preliminary classification, and these are used to realign the image series, after which another classification takes place, in which new templates are found, etc. It is easy to see that this scheme is very similar to the K-means method of classification, and shares its main disadvantage, which lies in the dependency of the final partition on the initial set of seeds. The only difference between the two schemes is the fact that the K-means technique does not incorporate alignment.
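The minimum distance classifier itself is a short computation over pixel vectors. In the following Python sketch, the templates and images are synthetic, and the `classify` function is a hypothetical stand-in for the cited software:

```python
# Minimum distance classifier (Duda & Hart, 1973): assign each image (pixel
# vector) to the template (class average) at smallest Euclidean distance.
import numpy as np

def classify(images, templates):
    """Return, for each image, the index of the nearest template."""
    # pairwise distances, shape (n_images, n_templates)
    d = np.linalg.norm(images[:, None, :] - templates[None, :, :], axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(4)
templates = rng.normal(size=(3, 64))              # three "class averages"
labels = rng.integers(0, 3, size=200)             # true class of each image
images = templates[labels] + 0.3 * rng.normal(size=(200, 64))  # noisy copies
assigned = classify(images, templates)
print((assigned == labels).mean())                # fraction correctly assigned
```

At this noise level the templates are well separated, so nearly all images return to their class of origin; as the text stresses, the result would change with a different choice of templates.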
Fig. 4.17. The power of supervised classification using correlation with different references for discrimination. (A) Part of a model "crystal" made by placing 240 images into a regular two-dimensional lattice. Three versions of the right-hand motif were used, distinguished by the position of the index finger; (B) noisy version of (A); (C1-C3) averages, each computed from the 25 images in (A) that correlate best with the motif; (C4) Fourier average of the entire field in (A), which shows the blurring of the distinguishing feature; (D1-D3) as (C1-C3), but obtained from the noisy image (B). The correlation averages still show the position of the index finger while the Fourier average is blurred as before. From Sosinsky et al. (1990). Reproduced with permission of the Biophysical Society.
K. Inference, through Classification, from Two to Three Dimensions

Obtaining a valid 3D reconstruction rests on the availability of a procedure that divides the macromolecules (from which projections have been obtained) into homogeneous subsets. The principal problem posed by this requirement has been discussed in a previous treatment of classification in
electron microscopy (Frank, 1990). This problem is contained in the ambiguity of the term "homogeneous" in this context: while our goal is to assure that the particles from which we collect data are very similar, we can compare them only on the basis of their projections. Comparing projections in fact produces a twofold uncertainty: on the one hand, two different 3D objects can have identical projections in certain directions; on the other hand, the fact that their projections differ from each other is not sufficient to conclude that the objects themselves are different: we may just be looking at different views of the same object. Even though the latter situation may lead to inefficiencies in the processing (because it forces us to treat the data separately in groups until the very end), it is not harmful, as it merely leads to pseudoclasses of particles differing in orientation only. The former situation, the multiplicity of objects giving rise to a given projection, is more serious, since it poses the risk of obtaining a reconstruction that is not grounded in reality. This problem of 3D inference from 2D classification and its solution will become clearer when the random conical approach to reconstruction (Chapter 5, Sections III, E and V) and the angular refinement methods have been described (Chapter 5, Section VIII).
I. Introduction

The value of projection images is quite limited if one wishes to understand the architecture of an unknown structure (Fig. 5.1). This limitation is illustrated by the early controversies regarding the three-dimensional (3D) model of the ribosome, which was inferred, with different conclusions, by visual analysis of electron micrographs [see, for instance, the juxtaposition of different models in Wittmann's (1982) review]. In 1968, DeRosier and Klug published the first 3D reconstruction of a biological object, a phage tail with helical symmetry (DeRosier and Klug, 1968). Soon after that, Hoppe published an article (Hoppe, 1969) that sketches out the strategy for reconstruction of a single macromolecule lacking order and symmetry [see Hoppe et al. (1974) for the first 3D reconstruction of such an object from projections]. Since then, the methodologies dealing with the two types of objects have developed more or less separately, although the existence of the same mathematical thread (Crowther et al., 1970) has often been emphasized. This part of the volume is organized in the following way: first, some basic mathematical principles underlying reconstruction are laid down. Next, the different data collection schemes and reconstruction strategies are described, which answer the questions of how to maximize information, minimize radiation damage, or determine the directions of projections, while leaving open the choice of reconstruction algorithm. The main algorithmic approaches are subsequently covered: weighted backprojection, Fourier interpolation methods, and iterative algebraic methods.
II. General Mathematical Principles
183
Fig. 5.1. A single projection image is plainly insufficient to infer the structure of an object. Drawing by John O'Brien; © 1991 The New Yorker Magazine.
Against this background, the random-conical reconstruction scheme is described as a scheme of data collection and processing that has gained practical importance and underlies most single particle reconstructions to date (see Bibliography at the end of the volume). In a final section, restoration and angular refinement are covered.
A. The Projection Theorem, Radon's Theorem, and Resolution

The projection theorem, which is of fundamental importance in the attempts to recover the object, is implied in the mathematical definition of a multidimensional Fourier transform. In two dimensions, let us consider the Fourier representation of a function,
$f(x, y) = \int\!\!\int F(k_x, k_y) \exp[-2\pi i(k_x x + k_y y)]\, dk_x\, dk_y.$   (5.1)
184
Chapter 5. Three-Dimensional Reconstruction
Now we form a one-dimensional projection in the y-direction. The result is

$g(x) = \int f(x, y)\, dy = \int_y \int_{k_x} \int_{k_y} F(k_x, k_y) \exp[-2\pi i(k_x x + k_y y)]\, dk_x\, dk_y\, dy,$   (5.2)

which immediately yields

$g(x) = \int_{k_x} \int_{k_y} F(k_x, k_y)\, \delta(k_y) \exp[-2\pi i k_x x]\, dk_x\, dk_y = \int_{k_x} F(k_x, 0) \exp[-2\pi i k_x x]\, dk_x,$   (5.3)
where $\delta(k_y)$ is the delta function. This means that the projection of a two-dimensional function f(x, y) can be obtained as the inverse one-dimensional Fourier transform of a central section through its 2D Fourier transform $F(k_x, k_y) = F[f(x, y)]$. The above "proof" is very simple when the projections in the x and y directions are considered, making use of the properties of the Cartesian coordinate system. Of course, the same relationship holds for any choice of projection direction; see, for instance, the formulation by Dover et al. (1980). An analogous relationship holds between the projection of a three-dimensional object and the corresponding central section of its 3D Fourier transform. This suggests that reconstruction can be achieved by "filling" 3D Fourier space with data on 2D central planes that are derived from the projections by 2D Fourier transformation (Fig. 5.2). More rigorously, the principal possibility of reconstructing a 3D object from its projections follows from Radon's (1917) quite general theory, which has as its subject "the determination of functions through their integrals over certain manifolds." The parallel projection geometry we use to describe the image formation in the transmission electron microscope [as modeled in Eq. (5.2)] is a special case of Radon's theory where the integrals are performed along parallel lines [see the integral for the 3D case, Eq. (5.43)]. According to Radon, an object can be reconstructed uniquely when all of its line projections are known. Taken literally, this theorem is rather useless because it does not address the questions of how to reconstruct the object from a limited number of experimental (noisy) projections to a finite resolution, and whether, for a limited number of projections, such a reconstruction would be unique.
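The projection theorem can be verified numerically in a few lines; here the discrete Fourier transform plays the role of the continuous one, and the test object is an arbitrary smooth blob:

```python
# Numerical check of the projection theorem: the 1D Fourier transform of a
# projection equals the central section (the k_y = 0 line) of the 2D transform.
import numpy as np

n = 64
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
f = np.exp(-((x - 0.2) ** 2 + 2 * (y + 0.1) ** 2) / 0.1)  # model object

g = f.sum(axis=0)                 # projection along y
G = np.fft.fft(g)                 # 1D transform of the projection
F = np.fft.fft2(f)                # 2D transform of the object
central = F[0, :]                 # central section k_y = 0

print(np.allclose(G, central))    # the two agree to machine precision
```

The same identity, one dimension higher, underlies the filling of 3D Fourier space with the transforms of 2D projections.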
The effect of restricting the reconstruction problem to finite resolution can be understood by considering the projection theorem (the fact that each projection furnishes one central section of the object's Fourier transform)
Fig. 5.2. Illustration of the projection theorem and its use in 3D reconstruction. From Lake (1971). Reproduced with permission of Academic Press Ltd.
and taking into account the boundedness of the object (see Hoppe, 1969). We make reference to Fig. 5.18 (later used to explain the related fact that adjacent projections are correlated up to a resolution that depends on the angular increment and the size of the object). A bounded object o(r) can be described mathematically as the product of an unbounded function õ(r) that coincides with the object inside of the object's perimeter and a shape function, i.e., a function that describes the object's shape and has the value 1 within the boundary of the object and 0 outside:

$o(\mathbf{r}) = \tilde{o}(\mathbf{r})\, s(\mathbf{r}).$   (5.4)

The Fourier transform of o(r), which we seek to recover from samples supplied by projections, is

$O(\mathbf{k}) = \tilde{O}(\mathbf{k}) \otimes S(\mathbf{k}),$   (5.5)

i.e., every Fourier component of the unlimited object is surrounded by (convoluted with) the shape transform $S(\mathbf{k}) = F\{s(\mathbf{r})\}$. For a smooth shape, the shape transform normally has a main maximum that occupies a region of Fourier space whose size is 1/D if D is the size of the object. This means that the Fourier transform O(k) varies smoothly over this distance or, conversely, that measurements only have to be available on a grid in Fourier space with that spacing. As a consequence (see Fig. 5.18), roughly

$N = \pi D / d$   (5.6)

equispaced projections need to be available to reconstruct an object with diameter D to a resolution R = 1/d (Bracewell and Riddle, 1967; Crowther et al., 1970). The same conclusion can be reached when one uses
a least-squares approach and formulates the reconstruction problem as the problem of finding the values of the Fourier transform on a finite polar grid from a finite number of experimental projections (Klug and Crowther, 1972). In conclusion, we can state that, as a consequence of Radon's theorem and the boundedness of the object, an object can be recovered to a given resolution from a finite number of projections, provided that these projections cover the angular space evenly. For the time being, we leave this formulation general, but the problems related to gaps in angular coverage will surface throughout this chapter.

B. Projection Geometries
The purpose of this section is to define the relationship between the coordinate system of the projection and that of the molecule. Furthermore, using this formalism, we will define the two most important regular data collection geometries, single-axis and conical. Let $\mathbf{r} = (x, y, z)^T$ be the fixed coordinate system of the molecule. By projecting the molecule along the direction $z^{(i)}$, defined by the three angles $\psi^{(i)}$, $\theta^{(i)}$, and $\phi^{(i)}$, we obtain the projection $p^{(i)}(x^{(i)}, y^{(i)})$. The transformation between the vectors in the coordinate system of the molecule and those in the coordinate system of the projection indexed i is expressed by three Eulerian rotations. In the convention used by Radermacher (1991),

$\mathbf{r}^{(i)} = \mathbf{R}\mathbf{r},$
(5.7)
$\mathbf{R} = \mathbf{R}_\psi \mathbf{R}_\theta \mathbf{R}_\phi,$

(5.8)

where

$\mathbf{R}_\phi = \begin{pmatrix} \cos\phi_i & \sin\phi_i & 0 \\ -\sin\phi_i & \cos\phi_i & 0 \\ 0 & 0 & 1 \end{pmatrix},$   (5.9)

$\mathbf{R}_\theta = \begin{pmatrix} \cos\theta_i & 0 & \sin\theta_i \\ 0 & 1 & 0 \\ -\sin\theta_i & 0 & \cos\theta_i \end{pmatrix},$   (5.10)

$\mathbf{R}_\psi = \begin{pmatrix} \cos\psi_i & \sin\psi_i & 0 \\ -\sin\psi_i & \cos\psi_i & 0 \\ 0 & 0 & 1 \end{pmatrix}.$   (5.11)
These rotations are defined as in classical mechanics and can be understood by reference to the sketch in Fig. 5.3: first the molecule is rotated by the angle $\phi_i$ in positive direction around its z axis, then by the angle $\theta_i$ in negative direction around its new y axis, and finally by the angle $\psi_i$ in positive direction around its new z axis. It is seen that the first two angles define the direction of projection, while the third Eulerian rotation amounts to a trivial rotation of the object around an axis perpendicular to the projection. One commonly associates the angle $\theta_i$ with the concept of "tilt," although the exact tilt direction must first be defined by the size of the first "azimuthal" angle $\phi_i$. The orientations of projections accessible in a given experiment are defined by technical constraints; these constraints are tied to the degrees of freedom of the tilt stage and to the way the molecules are distributed on the specimen grid. Referring to the geometry defined by these constraints, we speak of the data collection geometry. The regular single-axis tilt geometry (Fig. 5.4a) is generated by

$\psi_i = 0$  and  $\phi_i = 0$;   (5.12)

the molecule is tilted by $\theta_i$ in equal increments around the y axis and then projected along $z^{(i)}$.
Fig. 5.3. Definition of Eulerian angles; see text for details. From Sommerfeld (1964). Reproduced with permission of Wissenschaftliche Verlagsgesellschaft Geest & Portig K. G.
Chapter 5. Three-Dimensional Reconstruction
Fig. 5.4. Data collection by (a) single-axis and (b) conical tilting. From Radermacher (1980).
The regular conical tilt geometry (Fig. 5.4b) is generated by

    ψ_i = 0  and  θ_i = θ_0 = constant.    (5.13)

The molecule is first rotated around its z axis by the "azimuthal angle" φ_i in equal increments and then tilted by θ_0 around the new y axis. Finally, for later reference, the random-conical geometry is equivalent to the regular conical tilt geometry (without an explicit azimuthal rotation), except that the azimuth is randomly distributed in the azimuthal range [0, 2π].
III. Rationales of Data Collection: Reconstruction Schemes

A. Introduction

In attempting to reconstruct a macromolecule from projections to a resolution of 1/30 Å⁻¹ or better, we must satisfy several mutually contradictory requirements: (i) We need many different projections of the same structure to cover Fourier space as evenly as possible (this requirement often excludes the direct use of images of molecules showing preferred orientations, since the number of those is normally insufficient, and the angular coverage is far from even).
(ii) The total dose must not exceed 10 e⁻/Å² (this excludes tomographic experiments of the type that Hoppe et al. (1974) introduced). (iii) The reconstruction should be representative of the ensemble of macromolecules in the specimen (this excludes the use of automated tomography, by collecting all projections from a single particle while keeping the total dose low (cf. Dierksen et al., 1992, 1993), unless a sizable number of such reconstructions are obtained which can be subsequently combined to form a statistically meaningful 3D average).

The three pathways that lead from a set of molecule projections to a statistically significant 3D image were summarized by Frank and Radermacher (1986) in a diagram (Fig. 5.5): following the first pathway, individual molecules are separately reconstructed from their (tomographic) tilt series, then their reconstructions are aligned in 3D and averaged (Fig. 5.5a). Following the second, molecule projections found in the micrograph are aligned, classified, and averaged by class. When a sufficient number of views are present, the molecule can be reconstructed from the class averages (Fig. 5.5b). The third possibility is to relate projections that vary widely in viewing direction to one another, so that an averaged 3D reconstruction can be directly computed (Fig. 5.5c).

(iv) Since stipulations (i) through (iii) imply that the projections have to be drawn from different "copies" (i.e., different realizations of the same structure) of the molecule, we need to establish the relative orientations of those molecules in a common frame of reference. In other words, for a data collection and reconstruction scheme to be viable, it must be able to "index" projections reliably; i.e., it must be able to find their orientations. Of all the stipulations listed above, the last is perhaps the most difficult one to fulfill in practice.
The reason that the random-conical scheme, to be described below (Section III, E), has found wide popularity among the several schemes proposed over the years is that it solves the problem of finding the relative orientations of different projections unambiguously, by the use of two exposures of the same specimen field. Other methods, such as the method of angular reconstitution (van Heel, 1987b; Goncharov et al., 1987; Orlova and van Heel, 1994), have to find the angles a posteriori based on common lines. In the following, I will first, for the sake of completeness, outline a method of data collection and reconstruction that draws from a single averaged projection and thus does not require orientation determination. Next, an important issue, the question of compatibility of projections, which determines the validity of all schemes that combine data from different particles, will be discussed. After that, the methods of angular
Fig. 5.5. Three principal ways of combining projection information into a statistically well-defined 3D structure. (a) Molecules are separately reconstructed from different projection sets (normally tilt series), and then the reconstructions are merged after appropriate orientation search. (b) A "naturally occurring" projection set is divided into classes of different views, an average is obtained for each class, the viewing direction is established for each average, and, if sufficient views are available, the molecule is reconstructed. (c) Projections are directly merged into a 3D reconstruction after their viewing directions have been found. From "Advanced Techniques in Biological Electron Microscopy," Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs, Frank, J., and Radermacher, M., Vol. III, pp. 1-72 (1986). Reproduced with permission of Springer-Verlag, Berlin.
reconstitution and random-conical data collection will be outlined in two separate sections.
B. Cylindrically Averaged Reconstruction

For some structures, the deviation from cylindrical symmetry is not recognizable at the resolution achieved (1/20 to 1/40 Å⁻¹), and so there is no way, except possibly by antibody labeling, to distinguish the particle orientation (with respect to its long axis that is running parallel to the grid) from the appearance of its side views. The presentation of a cylindrically averaged reconstruction that is consistent with the observed views is the best one can do under these circumstances. The way from the projection to the 3D reconstruction is provided by the inversion of the Abel transform (Vest, 1974; Steven et al., 1984): Let us consider the two-dimensional case. The projection of a function f(x, y) can be represented by the line integral

    f̂(R, θ) = ∫∫ f(x, y) δ(x cos θ + y sin θ − R) dx dy,    (5.14)

which, considered as a function of the variables R and θ, is the Radon transform of f(x, y). A choice of θ defines the direction of projection, and R defines the exact projection ray. Now if f(x, y) is a slice of a cylindrically symmetric structure, it depends only on r = (x² + y²)^{1/2}. In that case, Eq. (5.14) simplifies into the Abel transform:

    f̂_A(x) = 2 ∫_x^∞ f(r) r dr / (r² − x²)^{1/2}.    (5.15)

Equation (5.15) can be inverted and solved for the unknown profile f(r) by the use of the inverse Abel transform

    f(r) = −(1/π) ∫_r^∞ [df̂_A(x)/dx] / (x² − r²)^{1/2} dx.    (5.16)
The practical computation makes use of the fact that for a rotationally symmetric function, the Abel transform is equivalent to the Fourier transform of the Hankel transform. This method has been used with success in the investigation of flagellar basal bodies both negatively stained (Stallmeyer et al., 1989a, b) and frozen-hydrated (Sosinsky et al., 1992; Francis et al., 1994). Basal bodies are molecular motors effecting the rotation of flagella which are used for propulsion in water by certain bacteria. The 3D reconstructions of basal bodies from several mutants of two bacteria, Salmonella and Caulobacter, obtained by the Brandeis group over the course of the past few years, have much advanced our understanding of this fascinating "natural wheel" (Fig. 5.6). Of course, the detailed exploration of this structure will ultimately involve the more general methods discussed below, which are not based on the assumption of cylindrical symmetry. Better preparation methods and higher resolution are expected to make this improvement possible.
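A minimal numerical sketch of Eqs. (5.15) and (5.16) follows (illustrative only; practical implementations use the Fourier-Hankel route mentioned above). The substitution r = (x² + t²)^{1/2} removes the integrable singularity at the lower limit:

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal quadrature (avoids NumPy version differences)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def abel_forward(f, x, tmax=8.0, n=4000):
    """Abel transform, Eq. (5.15). With r = sqrt(x^2 + t^2) the integral
    2 Int_x^inf f(r) r dr / sqrt(r^2 - x^2) becomes
    2 Int_0^inf f(sqrt(x^2 + t^2)) dt, free of singularities."""
    t = np.linspace(0.0, tmax, n)
    return 2.0 * _trapz(f(np.sqrt(x * x + t * t)), t)

def abel_inverse(F, r, tmax=8.0, n=4000, h=1e-4):
    """Inverse Abel transform, Eq. (5.16), with the same substitution
    x = sqrt(r^2 + t^2) and F' approximated by central differences."""
    t = np.linspace(0.0, tmax, n)
    x = np.sqrt(r * r + t * t)
    dF = (F(x + h) - F(x - h)) / (2.0 * h)
    return -_trapz(dF / x, t) / np.pi

# Round trip on a Gaussian profile, whose Abel transform is known in
# closed form: f(r) = exp(-r^2)  <->  F(x) = sqrt(pi) exp(-x^2).
f = lambda r: np.exp(-r * r)
F = lambda x: np.sqrt(np.pi) * np.exp(-x * x)
```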
Fig. 5.6. Flagellar motor (basal body) of Salmonella, reconstructed from single-particle averages of frozen-hydrated specimen, assuming cylindrical symmetry. EBB, extended basal body. HBB, hook plus basal body, lacking the C ring complex and the switch protein. From Francis et al. (1994). Reproduced with permission of Academic Press Ltd.
C. Compatibility of Projections

When projections from different particles are combined in a three-dimensional reconstruction, the implicit assumption is that they represent different views of the same structure. If this assumption is incorrect, that is, if the structure is differently deformed in different particles, then the reconstruction will not produce a faithful 3D image of the macromolecule. Moreover, some methods of data collection and reconstruction determine the relative directions of projections by making use of mathematical relationships among them, which are fulfilled only when the structures they originate from are identical. However, macromolecules are often deformed because of an anisotropic environment: when prepared by negative staining and air-drying on a carbon grid, they are strongly flattened, down to as much as 50% of their original z dimension, in the direction normal to the plane of the grid. Even ice-embedment may not avoid a deformation entirely, because of the forces acting on a molecule at the air-water interface. One important consideration in assessing the viability of different reconstruction schemes is therefore whether or not they mix projections from particles lying in different orientations. If they do, then some kind of
check is required to make sure that the structure is not deformed in different ways (see following). If they do not, and the molecule is deformed, then it is at least faithfully reconstructed, without resolution loss, in its unique deformed state. It has been argued (e.g., van Heel, 1987) that by using solely 0° projections, i.e., projections perpendicular to the plane of the specimen grid, one essentially circumvents the problem of direction-dependent deformation, as this mainly affects the dimension of the particle perpendicular to the grid. Following this argument, a secondary effect could be expected to increase the width for each view, leading to a reconstruction that would render the macromolecule in a uniformly expanded form. As yet, this argument has not been tested. It would appear that specimens will vary widely in their behavior and that the degree of expansion may be orientation-dependent as well.

Conservation of the 3D shape of the molecule on the specimen grid can be checked by an interconversion experiment. Such experiments play an important role in visual model building, i.e., in attempts to build a physical model intuitively, by assigning angles to the different views, and shaping a malleable material so that the model complies with the observed views. The experiment is designed to establish an angular relationship between two particles presenting different views, A and B: on tilting, the particle appearing in view A changes its appearance into A', and the one appearing in view B into B'. The experiment tests the hypothesis that the two particles are in fact identical but lie in different orientations. In that case, it should be possible to find a tilt angle α, around an appropriate axis, that renders A' and B identical. Inverse tilt around that axis by an angle −α should also render B' and A identical. An example of a successful interconversion experiment is the study of Stoops et al. (1991), who found that the two prominent views of negatively stained α₂-macroglobulin, the "lip" and "padlock" views, interconvert for a tilt angle of 45° around the long axis of the molecule. Numerous interconversion experiments were also done, in an earlier phase of ribosome morphology research, to relate the views of ribosomes and their subunits to one another (e.g., Wabl et al., 1973; Leonard and Lake, 1979) so that the 3D shape could be inferred.
D. Relating Projections to One Another Using Common Lines

Methods designed to relate different projections of a structure to one another make use of the common lines (Crowther et al., 1970). These are lines along which, according to the projection theorem, the Fourier transforms of the projections should be identical in the absence of noise. The
common lines concept is important in several approaches to electron microscopic reconstruction and will be discussed again later (Section VIII). At present we make use of a simple model: let us represent two arbitrary projections of an object in 3D Fourier space by their associated central sections (Fig. 5.7). These intersect on a line through the origin, their common line. Suppose now that we do not know the relative orientation of these projections. We can then find their common line by "brute force," by comparing every (one-dimensional) section of one 2D Fourier transform with every one of the other transform. The comparison is done by cross-correlation, which in Fourier space is equivalent to forming the summed conjugate product. This product will assume a maximum when a match occurs. Another frequently used measure of the fidelity of match is the differential phase residual; see Section V, B, 2 in Chapter 3. Once the common line is found, the (in-plane) rotations of the two central Fourier sections (and thereby, of the corresponding projections) are fixed. The two central sections can still move, in an unconstrained way, around the fixed common line, which thereby acts as a "hinge." Obviously, a third projection and its central section, provided that it is not coplanar with either of the first two, will fix this movement and lead to a complete determination of the relative angles among the three projections (apart from an ambiguity of handedness). Starting with this system of orientations, new projections are added by combining them with pairs of projections already placed. This, in essence, is the method of angular reconstitution (described by van Heel, 1987b; Goncharov et al., 1987; Orlova and van Heel, 1994). In practice, the common line search is performed in real space with the help of the so-called sinogram; this is a data table that contains in its rows the 1D projections of a 2D image (in our case, of a 2D projection)
Fig. 5.7. The principle of common lines. Two projections of the same object, represented by central sections of the object's 3D Fourier transform, intersect each other along a common line through the origin. Along that line, their Fourier transforms must be identical.
exhaustively computed for all angles. (Note that the sinogram can be understood as a discrete version of the two-dimensional Radon transform.) If the 2D projections originate from the same object, then there exists an angle for which their 1D projections are identical (or closely similar): in real space, the equivalent to the common line is the common 1D projection. The angle is found by comparing or correlating the sinograms of the two 2D projections (Vainshtein and Goncharov, 1986; van Heel, 1987b; Goncharov et al., 1987). Although elegant in concept, the common line (or common 1D projection) method of orienting "raw data" projections, and thus the method of angular reconstitution as proposed originally, is normally hampered by the low signal-to-noise ratio of the data. However, as we know, the signal-to-noise ratio can be dramatically improved by averaging, either over projections of molecules presenting the same view, or over symmetry-related projections. Examples of sinograms for molecule class averages are presented in Fig. 5.8. From these examples it is clear that the determination of the common 1D projection, and hence the determination of the relative angle between two projections represented by class averages, should be quite robust. On the other hand, the averaging of molecules within classes entails a resolution loss which will be reflected in the quality of the reconstruction. Only by adding an angular refinement step, to be described in Section VIII of this chapter, can the full resolution in the data be realized. Reconstructions utilizing this concept, mostly applied to macromolecules with symmetries, have been reported by van Heel and co-workers (Schatz, 1992; Dube et al., 1994; Schatz et al., 1994). Figure 5.9 presents a model of worm hemoglobin obtained by Schatz (1992), partly making use of the sinogram-based angle assignments.
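As an illustration of the sinogram idea (not the implementation of the cited authors), the following numpy sketch computes sinograms and the row-by-row correlation matrix (SINECORR) for the degenerate, easily verifiable case of two identical but in-plane-rotated 2D images; the offset of the correlation peak then recovers the relative angle:

```python
import numpy as np

def rotate_bilinear(img, angle):
    """Rotate a square image about its center by `angle` (radians),
    using bilinear interpolation; points from outside are set to zero."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    yy, xx = np.mgrid[0:n, 0:n] - c
    ca, sa = np.cos(angle), np.sin(angle)
    xs = ca * xx + sa * yy + c          # source coordinates
    ys = -sa * xx + ca * yy + c
    x0 = np.clip(np.floor(xs).astype(int), 0, n - 2)
    y0 = np.clip(np.floor(ys).astype(int), 0, n - 2)
    fx, fy = np.clip(xs - x0, 0, 1), np.clip(ys - y0, 0, 1)
    out = (img[y0, x0] * (1 - fx) * (1 - fy) + img[y0, x0 + 1] * fx * (1 - fy)
           + img[y0 + 1, x0] * (1 - fx) * fy + img[y0 + 1, x0 + 1] * fx * fy)
    inside = (xs >= 0) & (xs <= n - 1) & (ys >= 0) & (ys <= n - 1)
    return np.where(inside, out, 0.0)

def sinogram(img, n_angles=180):
    """Rows are 1D projections of `img`, ordered by angle over 360 degrees."""
    angles = np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False)
    return np.vstack([rotate_bilinear(img, a).sum(axis=0) for a in angles])

def sinecorr_peak(sg1, sg2):
    """Row-vs-row correlation coefficients of two sinograms; the indices of
    the maximum give the best-matching pair of 1D projections."""
    a = sg1 - sg1.mean(axis=1, keepdims=True)
    b = sg2 - sg2.mean(axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    cc = a @ b.T
    return np.unravel_index(np.argmax(cc), cc.shape)
```

In the real application the two sinograms come from projections of a 3D object in different viewing directions, and only a single pair of rows (the common 1D projection) is expected to match; a 1D shift search per row pair would be added for uncentered data.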
The first full-sized article, describing the reconstruction of the calcium release channel from 3000 cryoprojections, has just appeared as this book is being completed (Serysheva et al., 1995). The results indicate (see Section III in Chapter 6 on validation), especially because they closely match the model previously obtained independently by the method of random-conical reconstruction (Radermacher et al., 1994a, b), that the angular reconstitution method is a new viable approach to 3D electron microscopy of macromolecules, especially those exhibiting symmetries. Since it is based on the presence of multiple views covering the angular space as evenly as possible, the method can be seen as complementary to the random-conical method of reconstruction which is based on a different situation: the presence of a few preferred views, or even a single one. Another reconstruction of this type, applied to scanning transmission electron microscope data from
Fig. 5.8. Sinograms and their use in finding relative projection angles. (a) Class averages of worm hemoglobin showing the molecule in different views; (b) corresponding sinograms. Each horizontal line in the sinogram represents a 1D projection of the corresponding molecule image in a certain direction. The lines are ordered according to increasing angle of 1D projection, covering the full 360° range. A rotation of an image is reflected by a cyclical vertical shift of the corresponding sinogram. (c) Sinogram correlation functions (SINECORR for short). The SINECORR between sinograms of projections 1 and 2 is derived in the following way: the cross-correlation coefficient is computed between the first row of sinogram 1 and each row of sinogram 2, and the resulting values are placed into the first row of the SINECORR, and so on with the following rows of 1. The position of the maximum in the SINECORR indicates the angular relationship between the projections. Meaning of the
freeze-dried specimens of the signal sequence-binding protein SRP54, was presented by Czarnota et al. (1994). Recognizing the noise sensitivity of van Heel's and Goncharov's method in its original form (which intended to recover relative orientations from raw data), Farrow and Ottensmeyer (1992; Ottensmeyer and Farrow, 1992) developed an extension of the technique. Solutions are found for many
panels from left to right, top to bottom: SINECORRs of 1 vs 1, 1 vs 2, 1 vs 3, 1 vs 4, 4 vs 4, 1 vs 9, 3 vs 9, vs 9, and 8 vs 8. Multiple maxima occur because of the sixfold symmetry of the molecule. From Schatz (1992). Reproduced with permission.
Fig. 5.9. Three-dimensional reconstruction of Lumbricus terrestris erythrocruorin embedded in ice from class averages shown in Fig. 5.8a. In part, the angles were assigned based on the technique of "angular reconstitution" (see also Schatz et al., 1994). From Schatz (1992). Reproduced with permission.
projection triplets using the common-lines triangulation that is at the core of the angular reconstitution technique, and the results are reconciled using quaternion mathematics (see Harauz, 1990). The simultaneous minimization technique (Penczek et al., 1995) attempts to solve the orientation search problem simultaneously for any number of projections greater than three. Concurrent processing of a large number of projections relaxes the requirement of high SNR for the input data. The method uses a discrepancy measure which accounts for the uneven distribution of common lines in Fourier space. The minimization program begins the search from an initial random assignment of angles for the projections. Penczek and co-workers were able to demonstrate that the correct solution (as it is known from the result of 3D projection alignment to a merged random-conical reconstruction; Frank et al., 1995a) was found in 40% of the trials.
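The quaternion bookkeeping used to reconcile many triplet solutions can be sketched as follows (an illustrative fragment, not the Farrow and Ottensmeyer implementation; the sign-independent eigenvector average is standard quaternion-averaging practice):

```python
import numpy as np

def quat_from_matrix(R):
    """Unit quaternion (w, x, y, z) from a proper rotation matrix.
    (Assumes the rotation angle is well below 180 degrees, so w > 0.)"""
    w = 0.5 * np.sqrt(max(0.0, 1.0 + R[0, 0] + R[1, 1] + R[2, 2]))
    x = (R[2, 1] - R[1, 2]) / (4.0 * w)
    y = (R[0, 2] - R[2, 0]) / (4.0 * w)
    z = (R[1, 0] - R[0, 1]) / (4.0 * w)
    return np.array([w, x, y, z])

def quat_average(quats):
    """Average of unit quaternions as the dominant eigenvector of the
    accumulated outer products; unlike a naive mean, this is insensitive
    to the q / -q sign ambiguity of each input."""
    M = np.zeros((4, 4))
    for q in quats:
        M += np.outer(q, q)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, -1]          # eigenvector of the largest eigenvalue
```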
Another use of common lines is in "bootstrapping" methods, where new projections are matched to an existing reconstruction, or where the orientations of experimental projections are refined. This topic will be discussed further in Section VIII.

E. The Random-Conical Data Collection Method
The principle of this data collection scheme was first mentioned in the context of two-dimensional averaging of molecule projections (Frank et al., 1978a) as an effective way for extending the single particle averaging into three dimensions. An explicit formulation and a discussion of the equivalent Fourier geometry was given by Frank and Goldfarb (1980). First attempts to implement the reconstruction technique led to a Fourier-based computer program that proved unwieldy (W. Goldfarb and J. Frank, unpublished, 1981). The first practical implementation of a reconstruction method making use of the random-conical data collection was achieved by Radermacher et al. (1986a, 1987a, b). For the implementation, numerous problems had to be solved, including the determination of the precise tilt geometry, the practical problem of pairwise particle selection, the alignment of tilted projections, the relative scaling of projection data, the weighting of projections in a generalized geometry, and, last but not least, the massive bookkeeping required. A detailed description of the method and its implementation is found in Radermacher (1988). In the following, the different solutions to these problems will be described in some detail.

The method is based on the fact that single macromolecules often assume preferred orientations on the specimen grid (see Section I, E in Chapter 3). Any subset of molecules showing identical views in an untilted specimen forms a rotation series with random azimuths φ_i. When the specimen grid is tilted by a fixed angle θ_0 (Fig. 5.10a, b), the above subset will appear in the micrograph as a conical projection series with random azimuths and θ_0 as cone angle (Fig. 5.10c). In the actual experiment, the specimen field is recorded twice: once tilted and once untilted (in this order). The first micrograph is used to extract the projections for the reconstruction.
The purpose of the second micrograph is twofold: to (i) separate the particles according to their views ("classification") and (ii) within each subset (or class), determine the relative azimuths of all particles ("alignment"). The advantages of this scheme are evident: it allows the orientations of all molecules to be readily determined while allowing the dose to be kept to a minimum. Because of these advantages, the random conical reconstruction has come into widespread use (see the list of 3D reconstructions in Appendix 2).
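Each particle of a class thus receives the azimuth found for it in the untilted exposure, while the cone angle θ_0 is common to all. A minimal sketch of the resulting projection directions follows (hypothetical names; the handling of the tilt-axis direction and of in-plane conventions is omitted):

```python
import numpy as np

def projection_directions(azimuths_deg, theta0_deg):
    """Unit vectors of the projection directions for one class of particles
    in random-conical geometry: azimuth phi_i from the in-plane alignment
    of the untilted exposure, fixed tilt theta0 from the goniometer.
    All vectors lie on a cone of half-angle theta0 about the z axis."""
    phi = np.deg2rad(np.asarray(azimuths_deg, dtype=float))
    t = np.deg2rad(theta0_deg)
    return np.column_stack([np.sin(t) * np.cos(phi),
                            np.sin(t) * np.sin(phi),
                            np.full(phi.shape, np.cos(t))])

# e.g., four particles of one class, azimuths taken from the 0-degree image:
dirs = projection_directions([12.0, 97.5, 203.1, 311.9], theta0_deg=50.0)
```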
Fig. 5.10. Principle of the random-conical data collection. (a) Untilted and (b) tilted specimen field with molecule attached to the support in a preferred orientation; (c) equivalent projection geometry. From Radermacher et al. (1987b). Reproduced with permission of Blackwell Science Ltd., Oxford, from Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J., Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit. J. Microsc. 146, 113-136.
There are some obvious limitations that restrict the resolution of the reconstruction: one is due to the fact that the observed "preferred orientation" in reality encompasses an entire orientation range (see Section VIII, D for the likely size of the angular range). Another, related limitation stems from the need to classify particles on the basis of their 0° appearance, a task which may have ambiguous results (see Section IV, K in Chapter 4). A third limitation has to do with the fact that the azimuthal angles (as well as the subsequent classification) are determined from the images of particles (at 0°) that have already been exposed to the electron beam and may have been damaged. All three limitations can be removed by the use of a refinement method according to which each projection is allowed to vary its orientation with respect to the entire data set (see Section VIII). However, the starting point is always a random-conical reconstruction of the "basic" type outlined above.
The instructive drawing from Lanzavecchia et al. (1993), reproduced in Fig. 5.11, shows the coverage of Fourier space afforded by the conical geometry. Since each projection is sampled on a square grid, its discrete Fourier transform is available within a square-shaped domain. The body formed by rotating an inclined square around its center resembles a yo-yo with a central cone spared out. Since the resolution of each projection is limited to a circular domain (unless anisotropic resolution-limiting effects such as astigmatism intervene; see Section II, B in Chapter 2), the coverage of the 3D Fourier transform by useful information is confined to a sphere contained within the perimeter of the yo-yo (not shown in Fig. 5.11).
Fig. 5.11. Coverage of 3D Fourier space achieved by regular conical tilting. (a) Relationship between the inclined Fourier plane representing a single projection and the 3D Fourier transform. (b) Yo-yo-shaped body (hollowed out by a double cone) covered by filling 3D Fourier space, assuming each plane contains information up to the sampling resolution. For random-conical data collection, the spacings between successive planes are irregular. Adapted from Lanzavecchia et al. (1993). Reproduced with permission of Blackwell Science Ltd., Oxford.
We will come back to the procedural details of the random-conical data collection and reconstruction after giving a general overview of reconstruction algorithms.
F. Reconstruction Schemes Based on Uniform Angular Coverage

For completeness, two reconstruction schemes that rely on a uniform coverage of the space of orientations should be mentioned. Both use spherical harmonics as a means to represent the object and its relationship to the input projections. The requirement of statistical uniformity and the choice of the rather involved mathematical representation have restricted the use of these schemes to model computations and a few demonstrations with experimental data. The first scheme, proposed by Zvi Kam (1980), is based on a sophisticated statistical approach difficult to paraphrase here. The second scheme, introduced by Provencher and Vogel (1983; see also Provencher and Vogel, 1988; Vogel and Provencher, 1988), is designed to determine the relative orientations of the projections of a set of particles by a least squares method, but it requires approximate starting orientations. Thus far, only a single reconstruction of a nonsymmetric particle, the 50S ribosomal subunit, has been obtained with this latter method (Vogel and Provencher, 1988).
IV. Overview of Existing Reconstruction Techniques

A. Preliminaries

Given a data collection scheme that produces a set of projections over an angular range of sufficient size, there are still different techniques for obtaining the reconstruction. Under "technique" we understand the mathematical algorithm and, closely linked to it, its computational realization. The value of a reconstruction technique can be judged according to its mathematical tractability, computational efficiency, stability in the presence of noise, and many other criteria. Weighted back-projection (Section IV, B) has gained wide popularity because it is very fast compared to any iterative technique. However, apart from computational efficiency, two mutually contradictory criteria that are considered important in the reconstruction of single macromolecules from electron micrographs are the linearity of a technique and its ability to allow incorporation of constraints.

Here linearity implies that the reconstruction technique can be considered a black box with "input" and "output" channels and that the output
signal (the reconstruction) can be derived by linear superposition of elementary output signals, each of which is the response of the box to a delta-shaped input signal (projections of a point). In analogy to the point spread function, defined as the point response of an optical system, we speak of the "point spread function" of the combined system formed by the data collection and the subsequent reconstruction (see Radermacher, 1988). The linearity of the weighted back-projection technique (Section IV, B) has been important in the development and practical implementation of the random-conical reconstruction method because of its mathematical tractability. (Some iterative techniques, such as the algebraic reconstruction technique (ART) and the simultaneous iterative reconstruction technique (SIRT), which also share the property of linearity, have not been used for random-conical reconstruction until recently because of their slow speed.) The practical importance of linearity also lies in the fact that it allows the 3D variance distribution to be readily estimated from projection noise estimates (see Section II, A in Chapter 6). On the other hand, the second criterion, the ability of a technique to allow incorporation of constraints, is important in connection with efforts to fill the angular gap. Weighted back-projection as well as Fourier reconstruction techniques fall into the class of linear reconstruction schemes, which make use only of the projection data and fail to consider the noise explicitly. In contrast, the different iterative algebraic techniques lend themselves readily to the incorporation of constraints and to techniques that take the noise statistics explicitly into account. However, these techniques are not necessarily linear. For example, the modified SIRT in Penczek et al. (1992) incorporates nonlinear constraints.
In comparing the two different approaches, one must bear in mind that one of the disadvantages of weighted back-projection, its failure to fill the missing gap, can be mitigated by subsequent application of restoration, which is, however, again a nonlinear operation. Thus when one compares the two approaches in their entirety (weighted back-projection plus restoration versus any of the nonlinear iterative reconstruction techniques), the importance of the linearity stipulation is somewhat weakened by its eventual compromise.
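As a toy illustration of how an iterative algebraic scheme accommodates a constraint (in the spirit of, but much simpler than, the modified SIRT cited above), the following sketch solves a small consistent system of "projection" equations with a nonnegativity constraint applied after each simultaneous update:

```python
import numpy as np

def sirt(A, b, n_iter=500, relax=0.9, nonneg=True):
    """Toy SIRT: simultaneous correction of all unknowns from all
    equations A x = b, with an optional nonnegativity constraint
    (a nonlinear operation) applied after each iteration."""
    m, n = A.shape
    row = A.sum(axis=1)
    row[row == 0] = 1.0                     # per-equation normalization
    col = A.sum(axis=0)
    col[col == 0] = 1.0                     # per-unknown normalization
    x = np.zeros(n)
    for _ in range(n_iter):
        resid = (b - A @ x) / row           # normalized residuals
        x = x + relax * (A.T @ resid) / col  # simultaneous update
        if nonneg:
            x = np.maximum(x, 0.0)          # constraint: densities >= 0
    return x

# Recover a nonnegative "object" from its noiseless row and column sums
# (a 2x2 toy image flattened to four unknowns):
x_true = np.array([1.0, 0.0, 2.0, 3.0])
A = np.array([[1, 1, 0, 0],                  # row sums
              [0, 0, 1, 1],
              [1, 0, 1, 0],                  # column sums
              [0, 1, 0, 1]], float)
b = A @ x_true
x = sirt(A, b)
```

The system is underdetermined, so the iteration converges to one consistent nonnegative solution rather than necessarily to x_true; the point of the sketch is only that the constraint is imposed inside the loop, which is what makes the scheme nonlinear.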
B. Weighted Back-Projection

Back-projection is an operation that is the inverse of projection: while the projection operation produces a 2D image of the 3D object, back-projection "smears out" a 2D image into a 3D body (the "back-projection body"; see Hoppe et al., 1986) by translation in the direction normal to the plane of the image (Fig. 5.12). The topic of modified back-projection,
Chapter 5. Three-Dimensional Reconstruction
Fig. 5.12. Illustration of the back-projection method of 3D reconstruction. The density distribution across a projection is "smeared out" in the original direction of projection, forming a "back-projection body." Summation of these back-projection bodies, generated for all projections, yields an approximation to the object. For reasons that become clear from an analysis of the problem in Fourier space, the resulting reconstruction is dominated by low-spatial-frequency terms. This problem is solved by Fourier weighting of the projections prior to the back-projection step. From Frank et al. (1985). Reproduced with permission of Van Nostrand-Reinhold, New York.
as it is applied to the reconstruction of single particles, has been systematically presented by Radermacher (1988, 1991, 1992), and some of this work will be paraphrased here. Let us consider a set of N projections at arbitrary angles. As a notational convention, we keep track of the different 2D coordinate systems of the projections by a superscript; thus, p_i(r^(i)) is the ith projection, r^(i) = {x^(i), y^(i)}^T are the coordinates in the ith projection plane, and z^(i) is the coordinate perpendicular to that plane. With this convention, the back-projection body belonging to the ith projection is

b_i(r^(i), z^(i)) = p_i(r^(i)) t(z^(i)),
(5.17)
where t(z) is a "top hat" function:

t(z) = 1   for -D/2 < z < D/2
       0   elsewhere.

(5.18)
Thus b_i is the result of translating the projection over a distance D (a distance that should be chosen larger than the anticipated object diameter). As
IV. Overview of Existing Reconstruction Techniques
more and more such back-projection bodies for different angles θ are added together, a crude reconstruction of the object is obtained:

ρ̃(r) = Σ_{i=1}^N b_i(r^(i), z^(i)),

(5.19)
with r = {x, y, z} being the coordinate system of the object. The reason why such a reconstruction is crude is found by an analysis of the back-projection summation in Fourier space: it essentially corresponds to a simple filling of Fourier space by adding the central sections associated with the projections. It is immediately seen (Fig. 5.13) that the density of sampling points decreases with increasing spatial frequency, so that low spatial frequencies are overemphasized. As a result, the 3D image formed by back-projection appears like a blurred version of the object. Intuitively, it is clear that multiplication with a suitable radius-dependent weight might restore the correct balance in Fourier space.

Fig. 5.13. Density of sampling points in Fourier space obtained by projections decreases with increasing spatial frequency. Although this is shown here for single-axis tilting, the same is obviously true for all other data collection geometries. From "Advanced Techniques in Biological Electron Microscopy." Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs. Frank, J., and Radermacher, M., Vol. III, pp. 1-72 (1986). Reproduced with permission of Springer-Verlag, Berlin.

Weighted back-projection makes use of a weighting function W_S(k) tailored to the angular distribution of projections:
ρ(r) = F^{-1}{W_S(k) F{ρ̃(r)}}

(5.20)

or, equivalently, by application of a two-dimensional weighting function to the projections,

p'(r) = F^{-1}{W_2(k) F{p(r)}}.

(5.21)
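As a numerical illustration of Eqs. (5.19)-(5.21), the following minimal sketch (all names hypothetical; not the implementation used in practice) back-projects the 1D projections of a point object into a 2D grid and reproduces the characteristic blur that the radius-dependent weighting is designed to remove:

```python
import numpy as np

def backproject(projections, angles, n):
    """Simple (unweighted) back-projection, Eq. (5.19): each 1D projection
    is smeared back across the 2D grid along its projection direction
    (nearest-neighbor rays) and the back-projection bodies are summed."""
    yy, xx = np.mgrid[0:n, 0:n] - (n - 1) / 2.0
    recon = np.zeros((n, n))
    for p, theta in zip(projections, angles):
        # coordinate of each grid point along the 1D projection axis
        t = xx * np.cos(theta) + yy * np.sin(theta)
        idx = np.clip(np.round(t + (n - 1) / 2.0).astype(int), 0, n - 1)
        recon += p[idx]            # value is constant along each ray
    return recon

# a point object: every projection of a centered delta is itself a delta
n = 65
angles = np.linspace(0.0, np.pi, 60, endpoint=False)
projections = []
for _ in angles:
    p = np.zeros(n)
    p[(n - 1) // 2] = 1.0
    projections.append(p)

recon = backproject(projections, angles, n)
center = recon[(n - 1) // 2, (n - 1) // 2]
halo = recon[(n - 1) // 2, 5]
# strong central peak surrounded by a slowly decaying halo: the
# low-frequency blur discussed in the text
```

The point is recovered, but surrounded by the halo that corresponds to the overemphasis of low spatial frequencies; the weighting of Eqs. (5.20)/(5.21) would restore the balance.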
The weighted back-projection algorithm makes it possible to design weighting functions for arbitrary projection geometries and, specifically, to deal with the random azimuths encountered in the random-conical data collection (Radermacher et al., 1986a, 1987a, b). In the following, we must distinguish between the coordinates affixed to the object (denoted by uppercase letters) and those affixed to the individual projections (lowercase letters). Similarly, we need to distinguish between the 2D Fourier coordinates k^(i) = {k_x^(i), k_y^(i)} of the ith projection and the 3D Fourier coordinates of the object, K = {K_x, K_y, K_z}. X, Y, Z are object coordinates, with Z being the coordinate perpendicular to the specimen plane. {x^(i), y^(i), z^(i)} are the coordinates of the ith projection body. The transformation from the coordinate system of the object to that of the ith projection (azimuth φ_i, tilt angle θ_i) is defined as follows:

r^(i) = R_y(θ_i) R_z(φ_i) R,

(5.22)
where R_y, R_z represent rotations around the y axis and z axis, respectively. The weighting function for arbitrary geometry is now derived by comparing the Fourier transform of the reconstruction resulting from back-projection, F{ρ̃(R)}, with the Fourier transform of the original object (Radermacher, 1991):

F{ρ̃(R)} = Σ_{i=1}^N F{b_i(r^(i), z^(i))}    (5.23)

         = Σ_{i=1}^N P(k^(i)) D sinc(Dπk_z^(i)).    (5.24)
Here P(k^(i)) denotes the Fourier transform of the ith projection. The function sinc(x) stands for sin(x)/x, frequently used because it is the "shape transform" of a top-hat function and thus describes the effect, in Fourier space, of a real-space limitation of an object. Each central section associated with a projection is "smeared out" in the k_z^(i) direction and thereby "thickened." In the discrete formulation, each Fourier coefficient
of such a central section is spread out and modulated in the direction normal to the section plane. This is in direct analogy to the continuous "spikes" associated with the reflections in the transform of a thin crystal (Amos et al., 1982). However, in contrast to the sparse sampling of the Fourier transform by the reciprocal lattice in the case of the crystal, we now have a sampling that is at least in principle continuous. It is at once clear that the degree of overlap between adjacent "thick central sections" is dependent on the spatial frequency radius. The exact radius beyond which there is no overlap is the resolution limit of Crowther et al. (1970). The different degrees of overlap produce an imbalanced weighting, which the weighting function is supposed to overcome. For the reconstruction of a point (represented by a delta function) from its projections,

P(k^(i)) = 1,

(5.25)

so that the Fourier transform of the reconstruction by back-projection becomes

H(K) = Σ_{i=1}^N D sinc(Dπk_z^(i)).

(5.26)
By definition, Eq. (5.26) can be regarded as the transfer function associated with the simple back-projection operation. From this expression, the weighting function appropriate for the general geometry can be found as

W(K) = 1/H(K).

(5.27)
From this weighting function for general geometries, any weighting function for special geometries can be derived (Radermacher, 1991). Specifically, the case of single-axis tilting with regular angular intervals yields the well-known "r*-weighting" (in our nomenclature, |K|-weighting) of Gilbert (1972). In constructing the weighting function W(K) according to Eq. (5.27), the regions where H(K) → 0 require special attention as they lead to singularities. In practice, to avoid artifacts, Radermacher et al. (1987a) found it sufficient to impose a threshold on the weighting function,

W(K) ≤ 1.66;
(5.28)
i.e., to replace W(K) by 1.66 wherever 1/H(K) exceeds that limit. In principle, a more accurate treatment can be conceived that would take the spectral behavior of the noise explicitly into account.
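The construction of Eqs. (5.26)-(5.28) can be sketched numerically; the following is a minimal illustration (function and variable names hypothetical), restricted to a 2D slice of Fourier space and a single-axis geometry for brevity:

```python
import numpy as np

def weighting_function(angles, D, kmax, nk=65, wmax=1.66):
    """Transfer function H(K) of simple back-projection, Eq. (5.26), on a
    2D (k_x, k_z) slice of Fourier space, and the weighting function
    W = 1/H of Eq. (5.27), thresholded at wmax as in Eq. (5.28).
    Note: np.sinc(x) = sin(pi*x)/(pi*x), so D*np.sinc(D*k) equals
    D*sinc(pi*D*k) in the book's sinc(x) = sin(x)/x convention."""
    k = np.linspace(-kmax, kmax, nk)
    KX, KZ = np.meshgrid(k, k)
    H = np.zeros_like(KX)
    for theta in angles:
        # Fourier coordinate normal to the i-th central section
        kz_i = -KX * np.sin(theta) + KZ * np.cos(theta)
        H += D * np.sinc(D * kz_i)
    # replace W by wmax wherever 1/H would exceed it (or H is near zero)
    W = 1.0 / np.maximum(H, 1.0 / wmax)
    return H, W

angles = np.linspace(0.0, np.pi, 30, endpoint=False)
H, W = weighting_function(angles, D=1.0, kmax=5.0)
# at the origin all central sections overlap, so H(0) = N*D and W(0) = 1/(N*D)
```

Flooring H at 1/wmax is one simple way to implement the threshold; as noted in the text, a more accurate treatment would take the noise spectrum into account.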
Note that W(K) as derived here is a 3D function and can be used directly in 3D Fourier space to obtain the corrected reconstruction. In practice, its central sections W(k^(i)) are frequently used instead and applied to the 2D Fourier transforms of the projections. It should be noted that the two ways of weighting are mathematically, but not necessarily numerically, equivalent. Radermacher (1992) discussed reconstruction artifacts caused by the approximation of Eq. (5.27), and has recommended replacing the sinc function in Eq. (5.24) by a Gaussian function, tantamount to the assumption of a "soft," Gaussian-shaped object limitation.
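Applying a 2D weighting function to a projection, as in Eq. (5.21), amounts to a multiplication in Fourier space. A minimal sketch (the |k_x|-style weight and all names are illustrative assumptions, echoing the single-axis r*-weighting mentioned above):

```python
import numpy as np

def weight_projection(p, w2):
    """Apply a 2D weighting function to a projection in Fourier space,
    Eq. (5.21): p' = F^-1{ W2(k) F{p} }. w2 is given in centered
    (fftshifted) order and moved to standard FFT order before use."""
    P = np.fft.fft2(p)
    return np.real(np.fft.ifft2(P * np.fft.ifftshift(w2)))

# hypothetical |k_x| weighting on a 32 x 32 grid
n = 32
k = np.fft.fftshift(np.fft.fftfreq(n))
KX, KY = np.meshgrid(k, k)
w2 = np.abs(KX)                    # weight grows with distance from the tilt axis
p = np.zeros((n, n))
p[16, 16] = 1.0                    # a delta "projection"
p_w = weight_projection(p, w2)
# since W2(0) = 0, the weighted image has zero mean
```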
C. Fourier Methods

Fourier approaches to reconstruction [see, for instance, Radermacher's (1992a) brief overview] utilize the projection theorem directly and regard the Fourier components of the projections as samples of the 3D transform to be determined. In most cases, the positions of these samples do not coincide with the regular three-dimensional Fourier grid. This situation leads to a complicated interpolation problem, which can be stated as follows: given a number of measurements in Fourier space at arbitrary points not lying on the sampling grid, what set of Fourier components on the sampling grid is consistent with these measurements? The key to this problem lies in the fact that the object is of finite dimensions (Hoppe, 1969); because of this, the arbitrary measurements are related to those on the grid by Whittaker-Shannon interpolation (Hoppe, 1969; Crowther et al., 1970; Radermacher, 1992a; Lanzavecchia et al., 1993; Lanzavecchia and Bellon, 1994). Following Radermacher's account, we consider the unknown object bounded by a rectangular box with side lengths a, b, and c. In that case, its 3D Fourier transform is completely determined when all samples on the regular 3D sampling grid with grid size (1/a, 1/b, 1/c) are known. We index them as F_hkl = F(h/a, k/b, l/c). For an arbitrary position (denoted by the coordinate triple x*, y*, z*) not lying on this grid, the Whittaker-Shannon theorem yields the relationship

F(x*, y*, z*) = Σ_h Σ_k Σ_l F_hkl [sin π(ax* - h)/π(ax* - h)] [sin π(by* - k)/π(by* - k)] [sin π(cz* - l)/π(cz* - l)].

(5.29)
[The three terms whose product forms the coefficient of F_hkl are again sinc functions.] This relationship tells us how to compute the value of the Fourier transform at an arbitrary point from those on the regular grid, but the problem we wish to solve is exactly the opposite: how to compute the values of the Fourier transform on every point h, k, l of the grid from a given set of measurements at arbitrary positions {x_j*, y_j*, z_j*; j = 1 ... J}, as furnished by the projections. By writing Eq. (5.29) for each of these measurements, we create a system of J equations for the H · K · L unknown Fourier coefficients on the regular grid. The matrix C representing the equation system has the general element

C_jhkl = [sin π(ax_j* - h) sin π(by_j* - k) sin π(cz_j* - l)] / [π(ax_j* - h) π(by_j* - k) π(cz_j* - l)].

(5.30)
To solve this problem, we must invert the resulting equation system:

F_hkl = Σ_j F(x_j*, y_j*, z_j*) (C^-1)_jhkl,

(5.31)

where (C^-1)_jhkl are the elements of the matrix that is the inverse to C. It is obvious that this approach is infeasible because of the large number of terms. Basically, this intractability is the result of the fact that at any point that does not coincide with a regular Fourier grid point, the Fourier transform receives contributions from sinc functions centered on every single grid point. Remedies designed to make the Fourier approach numerically feasible have been discussed by Lanzavecchia et al. (1993) and Lanzavecchia and Bellon (1994). These authors use the so-called "moving window" method to curb the number of sinc functions contributing to the interpolation, and thus obtain an overall computational speed that is considerably faster than that of the efficient weighted back-projection technique. A demonstration with experimental data--albeit with evenly distributed projections--indicated that the results of the two techniques were virtually identical. Other remedies are to truncate the sinc functions or to use different (e.g., triangular) interpolation functions.
D. Iterative Algebraic Reconstruction Methods

In the discrete representation, the relationship between the object and the set of projections can be formulated by a set of algebraic equations (Crowther et al., 1970; Gordon et al., 1970). For the parallel projection
geometry, the jth sample of projection i is obtained by summing the object ρ along parallel rays (indexed by j) defined by the projection direction θ_i. The object is a continuous density function, represented by samples ρ_k on a regular grid, along with some rule (interpolation rule) for how to obtain the values at points not falling on the grid from those lying on the grid. Consequently, the discrete points of the object contribute to the projection rays according to weights w_jk^(i) that reflect the angle of projection and the particular interpolation rule:

p_j^(i) = Σ_k w_jk^(i) ρ_k.

(5.32)

With a sufficient number of projections at different angles, Eq. (5.32) could be formally solved by matrix inversion, as pointed out by Crowther et al. (1970), but the number of unknowns is too large to make this approach feasible. Least-squares, pseudoinverse methods (see Carazo, 1992) involve the inversion of a matrix whose dimensionality is given by the number of independent measurements, still a large number but somewhat closer to being manageable for the 2D case (see Zhang, 1992). There is, however, an entire class of reconstruction algorithms based on an approach that estimates the solution to Eq. (5.32) iteratively. The principle of these algorithms is that they start from an initial estimate ρ^(0) and compute its projection p̂^(i) following Eq. (5.32). The discrepancy between the actually observed projection p^(i) and the "trial projection" p̂^(i) can now be used to modify each sample of the estimate ρ^(0), giving a new estimate ρ^(1), etc. In the algebraic reconstruction technique (ART) [proposed by Gordon et al. (1970), but essentially identical with Kaczmarz's (1937) algorithm for approximating the solutions of linear equations], the discrepancy is subtracted from the object estimate along the projection rays in each step, so that perfect agreement is achieved for the particular projection direction considered. In the simultaneous iterative reconstruction technique (SIRT), proposed by Gilbert (1972), the discrepancies of all projections are simultaneously corrected. For an exhaustive description of these and other iterative techniques, the reader is referred to Herman (1980). Iterative methods have the advantage over the other approaches to 3D reconstruction that they are quite flexible, allowing constraints and statistical considerations to be introduced into the reconstruction process (e.g., Penczek et al., 1992). They have the disadvantage of a much larger computational expense than the weighted back-projection method.
The use of nonlinear constraints (e.g., prescribed value range) introduces another disadvantage: the reconstruction process is no longer linear, making its characterization by a point spread function (see Section IV, B) or the 3D variance estimation by projection variance back-projection (Section II, A in Chapter 6) impossible to achieve.
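A minimal sketch of ART in the sense described above (Kaczmarz row-action updates; the toy system, with rays taken as row and column sums of a 2 x 2 object, and all names are illustrative assumptions):

```python
import numpy as np

def art(A, p, n_sweeps=500, relax=1.0):
    """Algebraic reconstruction technique (Gordon et al., 1970; essentially
    Kaczmarz's 1937 method): for each ray j in turn, subtract the
    discrepancy between the measured sample p_j and the trial projection
    (A rho)_j back along that ray, cf. Eq. (5.32), so that the ray
    equation is satisfied exactly after the update."""
    rho = np.zeros(A.shape[1])
    for _ in range(n_sweeps):
        for j in range(A.shape[0]):
            a = A[j]
            rho += relax * (p[j] - a @ rho) * a / (a @ a)
    return rho

# toy 2 x 2 "object" (flattened), rays = row sums and column sums
rho_true = np.array([1.0, 2.0, 3.0, 4.0])
A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1]], dtype=float)
p = A @ rho_true
rho = art(A, p)
# the estimate reproduces all measured projections; since the system is
# underdetermined, rho itself need not equal rho_true
```

Constraints (e.g., a prescribed value range) would be applied to rho between sweeps, which is exactly the step that makes the scheme nonlinear.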
V. The Random-Conical Reconstruction Scheme in Practice

A. Overview

The concept of the random-conical data collection was introduced above (Section III, E). In the following section, all steps of the reconstruction scheme that makes use of this data collection method will be outlined. In this, we will follow the detailed account given by Radermacher (1988), but with the addition of the multivariate statistical analysis (MSA)/classification step and some modifications that reflect changes in the procedures as they have developed since 1988. To start with an overview of the procedure (Fig. 5.14), the particles are first selected ("windowed") simultaneously from the tilted and the untilted specimen fields (steps 1 and 2, respectively), yielding two sets of images. Next, the untilted set is subjected to alignment (step 3), producing a set of "aligned" images. These are then analyzed using multivariate statistical analysis and classification (step 4), resulting in the rejection of certain particles and in the division of the data set into different classes according to particle view. From here on, the different classes are processed separately. For each class, the following steps are followed: the tilted-particle images are sorted according to the azimuthal angle found in the alignment procedure. Once in correct order, they are aligned with respect to one another or to a common reference (step 5). After this, they may be used to obtain a 3D reconstruction (step 6). Thus, in the end, as many reconstructions are obtained as there are classes, the only requirement being that each class has to be large enough for the reconstruction to be meaningful.
B. Optical Diffraction Screening

The tilted-specimen micrograph covers a specimen field with a typical defocus range of 1.4 μm (at 50,000× magnification and 50° tilt) perpendicular to the direction of the tilt axis. Because of the properties of the contrast transfer function (Section II, C in Chapter 2), useful imaging conditions require the defocus to be in the range of underfocus [i.e., Δz < 0 in Eq. (2.5) of Chapter 2]. In addition, the entire range must be restricted, to prevent a blurring of the reconstruction on account of the defocus variation, which is equivalent to the effect of energy spread (Section I, C, 2 in Chapter 2). How much the range must be restricted, to a practical "defocus corridor" parallel to the tilt axis, depends on the resolution expected and can be inferred by reference to the transfer function characteristics (Chapter 2, Section II, C, 4) and the sharp falloff in the "energy spread" envelope produced by the defocus spread. When the micrograph shows a field with carbon film, the useful defocus range can be found by optical diffraction analysis. Details of this screening procedure have been described by Radermacher (1988). The selection aperture must be small enough that the focus variation across the aperture is kept within limits. By probing different parts of the micrograph, the tilt axis direction is readily found as the direction in which the optical diffraction pattern remains unchanged. Perpendicular to that direction, the diffraction pattern changes most dramatically (see Fig. 5.15), and following this direction of steepest defocus change, the useful range must be established by comparison with the transfer function characteristics. Electron microscopes equipped for spot scanning allow an automatic compensation for the defocus change perpendicular to the tilt axis (Zemlin, 1989b; Downing, 1992). For micrographs or data collected electronically from such microscopes, particles from the entire specimen field can be used for processing (Typke et al., 1992). Another way of including all data in the processing, irrespective of their defocus, is to keep track of a spatial variable, in the course of selecting and storing the individual particle windows, that gives the particle position within the micrograph field in the direction perpendicular to the tilt axis. This variable can later be interpreted in terms of the effective local defocus, and used to make compensations or to assign appropriate Fourier weighting functions in the reconstruction (see also Section IX).

C. Interactive Tilted/Untilted Particle Selection
Selection of particles from a pair of tilted- and untilted-specimen micrographs is a tedious task. A computer program for simultaneous interactive selection was first described in the review by Radermacher (1988). The two fields are displayed side by side on the screen of the workstation (Fig. 5.16), with a size reduction of 1:3 to 1:4. The size reduction makes it possible to select all particles from a micrograph field at once. It has the beneficial side effect that the particles stand out with enhanced contrast, since size
Fig. 5.14. Schematic diagram of the data flow in the random-conical reconstruction. Simultaneously, particles are selected from the micrograph (1) of the tilted specimen and that (2) of the untilted specimen. Those from the untilted field are aligned (3), resulting in azimuthal angles φ_i, and classified (4), resulting in the separation into classes C1-C3. Separately, for each of these classes, tilted-specimen projections are now aligned (5) and passed to the 3D reconstruction program (6). From Radermacher et al. (1987b). Reproduced with permission of Blackwell Science Ltd., Oxford, from Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J., Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit. J. Microsc. 146, 113-136.
reduction with concurrent band limitation enhances the signal-to-noise ratio (SNR), provided (as is the case here) that the signal spectrum falls off faster than the noise spectrum with increasing spatial frequency (see Chapter 3, Section IV, C, 1). Such a program is laid out to calculate the geometrical relationships between the coordinate systems of the two fields as sketched in the paper by Radermacher et al. (1987b). For this purpose, the user selects a number of particle images in pairs--tilted and untilted--by going back and forth with the cursor between the two fields. After this initial "manual" selection, the program is prompted to compute the direction of the tilt axis and the precise coordinate transformation between the two projections. This transformation is given as (Radermacher, 1988)
( x' )   (  cos β  sin β ) ( cos θ  0 ) ( cos α  -sin α ) ( x - x0 )   ( x0' )
( y' ) = ( -sin β  cos β ) (   0    1 ) ( sin α   cos α ) ( y - y0 ) + ( y0' )

(5.33)
where α is the angle between the tilt axis in the untilted-specimen image and its y axis; β is the corresponding angle for the tilted-specimen image; θ is the tilt angle; x0, y0 are the coordinates of an arbitrarily selectable origin in the untilted-specimen image; and x0', y0' are the corresponding origin coordinates in the tilted-specimen image. After these geometrical relationships have been established, the user can select additional particles from one field (normally the untilted-specimen field, as the particles are better recognizable there), and the program automatically finds the particle as it appears in the other field. A document file (see Appendix 1) is used by the program to store all particle positions in both fields, later to be used for windowing the particle images from the raw data files. As an alternative to interactive particle selection, Lata et al. (1994, 1995) have developed an automated particle picking program (see Section II, C), which has already proved its practical value in the 3D reconstruction of the 30S ribosomal subunit from Escherichia coli (Lata et al., 1995).
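The transformation of Eq. (5.33) is straightforward to implement; the sketch below uses hypothetical names, and the sign conventions for α and β are as reconstructed above:

```python
import numpy as np

def untilted_to_tilted(xy, alpha, beta, theta, origin0, origin_t):
    """Map a particle position from the untilted to the tilted micrograph,
    following the form of Eq. (5.33): rotate by alpha (tilt-axis angle in
    the untilted image), foreshorten by cos(theta) perpendicular to the
    axis, rotate by beta (tilt-axis angle in the tilted image), and shift
    between the two origins."""
    Ra = np.array([[np.cos(alpha), -np.sin(alpha)],
                   [np.sin(alpha),  np.cos(alpha)]])
    S = np.diag([np.cos(theta), 1.0])          # foreshortening along x
    Rb = np.array([[np.cos(beta),  np.sin(beta)],
                   [-np.sin(beta), np.cos(beta)]])
    return (Rb @ S @ Ra @ (np.asarray(xy) - np.asarray(origin0))
            + np.asarray(origin_t))

# axes aligned (alpha = beta = 0), 50 degree tilt: only x is foreshortened
p_t = untilted_to_tilted([100.0, 40.0], 0.0, 0.0, np.radians(50.0),
                         [0.0, 0.0], [0.0, 0.0])
```

In the interactive program, the parameters α, β, θ, and the origins are fitted from the manually selected particle pairs, after which this mapping locates the companion of any newly selected particle.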
D. Density Scaling

As in electron crystallography of crystals (Amos et al., 1982), correct mutual scaling of the projection densities is necessary so that projections are prevented from entering the analysis with undue weight, which would lead to a distorted representation of the structure.
Fig. 5.15. Change of contrast transfer function across the micrograph of a tilted specimen. (a) Theoretical behavior; (b) optical diffraction patterns obtained at different positions of the area-selecting aperture of the optical diffractometer along a line perpendicular to the tilt axis. From Zemlin (1989b). Dynamic focussing for recording images from tilted samples in small-spot scanning with a transmission electron microscope. J. Electron Microsc. Tech. Copyright © 1989 John Wiley & Sons, Inc. Reprinted by permission of John Wiley & Sons, Inc.
Radermacher et al. (1987b) normalized the individual images representing tilted particles as follows: next to the particle (but staying away from the heavy stain accumulation if the preparation is with negative staining), the average density of a small reference field, ⟨D⟩, is calculated. The density values D_i measured in the particle window are then rescaled according to the formula

D_i' = (D_i - ⟨D⟩)/⟨D⟩.

(5.34)
Boisset et al. (1993) introduced another scaling procedure that makes explicit use of the statistical distribution of the background noise: it is assumed that the background noise surrounding the particle has the same statistical distribution throughout. Using a large portion of one of the tilted-specimen micrographs, a reference histogram is calculated. Subsequently, the density histogram of the area around each particle is compared with the reference histogram, and the parameters a and b of a linear density transformation

D_i' = a D_i + b
(5.35)
are estimated. This transformation is then applied to the entire particle image, with the result that in the end all images entering the reconstruction have identical noise statistics. In fact, the match in statistics extends beyond the first and second moments.

E. Processing of Untilted-Particle Images

1. Alignment and Classification
Following the random-conical scheme, the untilted-specimen projections are first aligned and then classified. The alignment furnishes the azimuthal angle φ_i of the particle, which is needed to place the corresponding tilted-specimen projections into the conical geometry. The classification results in a division into L subsets of particles which are presumed to have

Fig. 5.16. Interactive particle selection using WEB. Two micrographs of the same field (left, untilted; right, tilted by 36°) are displayed side by side. Equivalent particle images are identified and numbered in both micrographs. Shown here is the beginning phase of the program, where each particle has to be tracked down in both micrographs. After the initial phase, the parameters of the underlying coordinate transformation are known, and the program is able to identify the companion in the second micrograph for any particle selected in the first micrograph.
different orientations and ideally should be processed separately to give L different reconstructions. In practice, however, one chooses the classification cutoff level rather low, so that many classes are initially generated. By analyzing and comparing the class averages, e.g., by using the differential phase residual criterion (Section V, B, 2 in Chapter 3), it is possible to gauge whether some mergers of similar classes are possible without compromising resolution. This procedure leads to some small number of L1 < L "superclasses," which are fairly homogeneous on the one hand but contain sufficient numbers of particles on the other to proceed with three-dimensional reconstruction. A substantial number of particles that fall into none of the superclasses are left out at this stage, their main fault being that their view is underrepresented, with a number that is insufficient for a 3D reconstruction. These particles have to "wait" till a later stage of the project, when they can be merged with a well-defined reconstruction based on a merger of the superclass reconstructions; see Section VI below.
2. Number of Particles Needed: Angular Histogram

The number of projections required for computing a "self-standing" reconstruction is determined by the statistics of the angular coverage (Radermacher et al., 1987b; see Fig. 5.17). If we require a self-standing reconstruction to be mathematically supported according to the conical-reconstruction resolution formula (Section V, I below), there should be no gap in the azimuthal distribution larger than

Δθ_min = 360°/N = 360° · d/(2πD sin θ0),

(5.36)
Fig. 5.17. Distribution of azimuthal angles of 50S ribosomal particles extracted from five pairs of micrographs. The angles were determined by alignment of particles showing the crown view as they appear in the untilted-specimen micrographs. From Radermacher et al. (1987b). Reproduced with permission of Blackwell Science Ltd., Oxford, from Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J., Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit. J. Microsc. 146, 113-136.
where N is the number of equidistant projections in a regular conical series, d = 1/R is the resolution distance aimed for, D is the diameter of the object, and θ0 is the cone angle. To reconstruct the 50S ribosomal subunit of E. coli (D = 200 Å) to a resolution of R = 1/30 Å⁻¹ (d = 30 Å) from a 50° tilted specimen (i.e., θ0 = 40°), the minimum angular increment works out to be Δθ_min ≈ 13°. Because of the statistical fluctuations in a random coverage of the azimuthal range, the required number of projections is much larger than the minimum of N = 360/13 ≈ 28. In practice, a set of 250 projections proved sufficient to cover the 360° range with the largest gap being 5° (Radermacher et al., 1987b). Although there is no iron-clad rule, molecules in that size range (200 to 300 Å) and at that resolution (1/30 Å⁻¹) appear to require a minimum of 200 projections.
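Equation (5.36) and the gap statistics of a random azimuthal distribution can be checked numerically; a sketch with hypothetical names, using the 50S figures from the text:

```python
import numpy as np

def azimuthal_gap_check(azimuths_deg, d, D, theta0_deg):
    """Largest gap in the azimuthal coverage of a random conical series,
    compared against the limit of Eq. (5.36):
    delta_theta_min = 360 * d / (2 * pi * D * sin(theta0))."""
    limit = 360.0 * d / (2.0 * np.pi * D * np.sin(np.radians(theta0_deg)))
    phi = np.sort(np.mod(azimuths_deg, 360.0))
    gaps = np.diff(np.concatenate([phi, [phi[0] + 360.0]]))  # wrap-around gap
    return gaps.max(), limit

# d = 30 A, D = 200 A, theta0 = 40 degrees, 250 random azimuths
rng = np.random.default_rng(1)
gmax, limit = azimuthal_gap_check(rng.uniform(0.0, 360.0, 250),
                                  30.0, 200.0, 40.0)
# limit works out to about 13.4 degrees, consistent with the ~13 degrees
# quoted above; gmax is the largest gap actually realized
```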
F. Processing of Tilted-Particle Images

1. Alignment
Reference to the Fourier description (Fig. 5.18) explains why an alignment of neighboring tilted-particle projections by cross-correlation is possible: each Fourier component is surrounded by a "circle of influence," as a result of the boundedness of the object. This is the very reason that reconstruction from a finite number of projections is feasible. There are four methods of alignment that have been tried at one stage or another: (i) sequential alignment "along the circle" with cosine stretching (Radermacher et al., 1987b), (ii) alignment to the corresponding 0° projection with cosine stretching (Radermacher, 1988; Carazo et al., 1988), (iii) alignment to a perspectively distorted disk (Radermacher, 1988), and (iv) alignment to a disk or "blob" (Penczek et al., 1992). The reference to cosine stretching requires an explanation. According to Guckenberger's (1982) theory, the projection to be aligned to the untilted-particle projection (method (ii) above) must first be stretched by the factor 1/cos(θ_t) in the direction perpendicular to the tilt axis, where θ_t is the tilt angle. When this philosophy is applied to the alignment of two adjacent tilted-particle projections (method (i) above), both must be stretched by that factor prior to alignment. The cosine stretching procedure appears to work well for oblate objects, i.e., those objects that are more extended in the x and y directions than in the z direction, as for instance the 50S ribosomal subunit in the negatively stained double-layer preparation (Radermacher et al., 1987a, b; Radermacher, 1988). The reason that it works well for such specimens is that the common signal includes the surrounding carbon film, which is flat. For globular objects, such as the 70S ribosome embedded in ice (Frank et al., 1991, 1995a, b;
Fig. 5.18. Statistical dependence of Fourier components belonging to different projections. We consider the Fourier transform along central sections P1, P2 representing two projections of an object with diameter D. Each Fourier component is surrounded by a "circle of influence" with diameter 1/D. Thus the central section is accompanied on both sides by a margin of influence, whose boundaries are indicated by the dashed lines. The diagram can be used to answer two interrelated questions: (1) to what resolution are two projections separated by Δθ correlated? (2) what is the minimum number of projections with equispaced orientations that are required to reconstruct the object to a resolution R without loss of information? From "Advanced Techniques in Biological Electron Microscopy." Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs. Frank, J., and Radermacher, M., Vol. III, pp. 1-72 (1986). Reproduced with permission of Springer-Verlag, Berlin.
Penczek et al., 1992, 1994), alignment to a nonstretched disk (blob) whose diameter is equal to the diameter of the particle leads to better results. Sequential alignment of neighboring projections is error prone because errors can accumulate in the course of several hundred alignments, as may be easily checked by computing the closure error (Radermacher et al., 1987b). Radermacher and co-workers developed a fail-safe variant of the sequential method in which neighboring projections are first averaged in groups spanning an angle of 10°; these averaged tilt projections are aligned "along the circle," and, in the end, the individual projections are aligned to their corresponding group average. However, for cryospecimens with their decreased SNR, sequential alignment has largely been abandoned in favor of the alignment of each tilted-particle projection with the 0° projection or, as pointed out before, with a blob. It may seem that the availability of 3D refinement methods (see Section VIII) has relaxed the requirement for precise tilted-particle projection alignment at the stage of the first reconstruction; on the other hand, it must be realized that the accuracy of the refinement is determined
V. The Random-Conical Reconstruction Scheme in Practice
by the quality of the reference reconstruction, which is in turn critically dependent on the success of the initial alignment.

2. Screening
Screening of tilted-particle projections is a way of circumventing the degeneracy of 0° classification (Section IV, K in Chapter 4). In the first study of a molecule whose structure is unknown, it is advisable to verify that the 0° projection does not hide a mixture of two or more molecule orientations. The most important stipulation to apply is the continuity of shape in a series of angularly ordered projections. As members of a conical tilt series, projections must occur in a strict one-dimensional similarity order: for instance, the ordering of five closely spaced projections along the cone in the sequence A-B-C-D-E implies that the cross-correlation coefficients (denoted by the symbol ⊗) are ranked in the following way: A⊗B > A⊗C > A⊗D > A⊗E. If we look at projections forming an entire conical series, then the associated similarity pathway is a closed loop (see Section IV, H). In any short segment of the loop, we find a cross-correlation ranking of the type stated above as a local property. If all projections of the conical series are closely spaced, then the shape variation from one neighbor to the next is quasi-continuous. In practice this means that a gallery of tilted-particle projections presented in the sequence in which they are arranged on the cone should show smooth transitions, and the last one of the series should be similar to the first. The most sensitive test of shape continuity is a presentation of the entire projection series as a "movie": any discontinuity is immediately spotted by eye. The correct ordering on a similarity pathway can also be monitored by multivariate statistical analysis. In the absence of noise, the factor map should show the projections ordered on a closed loop (Frank and van Heel, 1982b; van Heel, 1984a). In fact, the ordering is on a closed loop in a high-dimensional space R^J, and this closed loop appears in many different *projected versions in the factor maps. In the presence of noise, it proves difficult to visualize the similarity pathway (and thereby spot any outliers that lie off the path). One way out of this problem is to form "local" averages over short angular intervals, and apply MSA to these very robust manifestations of the projections (Frank et al., 1986). A closed loop then indeed becomes visible. It can be used, in principle, to screen the original projections according to their distance, in factor space, from the averaged pathway. This method has not been pursued, however, and has been partly replaced by the 3D projection matching (Section VIII, B) and 3D Radon transform (Section VIII, C) methods.
Flip/flop ambiguities are normally resolved by classification of the 0° views, on the basis of the mirroring of the projected molecule shape (van Heel and Frank, 1981). However, a peculiar problem emerges when the shape of the molecule in projection is symmetric. In that case, there is no distinction between flip and flop projections, unless induced by one-sidedness of staining. Lambert et al. (1994a) had to deal with this problem when they processed images of the barrel-shaped chiton hemocyanin, which sits on the grid with one of its (unequal) round faces. As these authors realized, the cylindric shape is unique in that it allows flip and flop orientations of the molecule to be sorted by MSA of the tilted-particle images. Perfect cylinders uniformly give rise to double-elliptic barrel projections, irrespective of the tilt direction. Any asymmetry in the z distribution of mass (in this case, the existence of a "crown" on one side of the molecule) leads to a difference in appearance between molecules tilted to one side and those tilted to the other. This difference was clearly picked up by correspondence analysis, and the two different populations could be separated on this basis.
G. Reconstruction

After the screening step to verify that the tilted-particle projections follow one another in a reasonable sequence, the particle set is ready for 3D reconstruction. Both weighted back-projection (Radermacher et al., 1987b) and a modification of SIRT (Penczek et al., 1992) are being used. For the weighted back-projection, the weighting step is either performed individually on each projection (Radermacher et al., 1987b) (implying the steps FT → Weighting → FT⁻¹) or, summarily, by applying a 3D weighting function to the volume obtained by simple back-projection (Hegerl et al., 1991). The reconstruction exists as a set of slices (see Fig. 5.19). Each slice is represented by a 2D image that gives the density distribution on a given z level, which is a multiple of the sampling step. The representation as a gallery of successive slices (the equivalent of a cartoon if the third dimension were time) is both the most comprehensive and the most confusing way of representing 3D results. Instead, surface representations (Section IV, B, 1, in Chapter 6) are now in common use. A general discussion of visualization options is contained in Section V, 4.
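For the per-projection route (FT → Weighting → FT⁻¹), the simplest instance is the |k| ("ramp") weighting of the single-axis geometry; the conical weighting functions of Radermacher et al. (1987b) are more involved, so the following numpy sketch is only meant to convey the flavor of the weighting step:

```python
import numpy as np

def ramp_weight(proj, axis=1):
    """Weight one projection in Fourier space with |k| along the
    direction perpendicular to the tilt axis, then transform back."""
    n = proj.shape[axis]
    k = np.abs(np.fft.fftfreq(n))          # ramp filter, zero at k = 0
    shape = [1, 1]
    shape[axis] = n
    P = np.fft.fft(proj, axis=axis) * k.reshape(shape)
    return np.fft.ifft(P, axis=axis).real
```

The weighting removes the zero-frequency term along the weighting direction, counteracting the 1/|k| pile-up of densities produced by simple back-projection.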
H. Resolution Assessment

The mathematical resolution of a reconstruction is determined through the Crowther et al. (1970) formula (Section II, A). Corresponding formulae
Fig. 5.19. The 50S ribosomal subunit, reconstructed from cryoimages, represented as a series of slices. From Radermacher (1994). Reproduced with permission of Elsevier Science, Amsterdam.
have been given by Radermacher (1991) for the conical projection geometry. Here we mention the result for an even number of projections:

d = 2π (D/N) sin θ0    (5.37)
where d = 1/R is the "resolution distance," i.e., the inverse of the resolution, D the object diameter, and θ0 the tilt of the specimen grid. Furthermore, for such a data collection geometry, the resolution is direction-dependent, and the above formula gives only the resolution in the directions perpendicular to the direction of the electron beam. In directions that form an oblique angle to those directions, the resolution is deteriorated. In the beam direction, the effect of the missing cone is strongest, and the resolution falls off by a factor of 1.58 (for θ0 = 45°) or 1.23 (for θ0 = 60°). The account given thus far relates to the theoretical resolution expected from the data collection geometry. However, whether this resolution is actually realized is quite another matter. The reasons that it is normally not realized are manifold: the structure of the biological particle may not be defined to that level of resolution because of conformational variability; in stained preparations, the stain fluctuations and finite graininess limit the definition of small specimen features; and there are a number of electron optical effects (partial coherence, charging, specimen movement) that limit the transfer of information from the specimen to the image. Another important limitation is due to errors in the assignment of projection angles (see Section VIII, D). For these reasons, the significant resolution of a reconstruction needs to be independently assessed. (Here the term "significant resolution" denotes the resolution up to which object-related features are represented in the 3D image.) The procedure is quite similar to the assessment of 2D resolution (Section V, B in Chapter 3): two reconstructions are computed from randomly drawn subsets of the projection set and compared in Fourier space using the differential phase residual (DPR) or Fourier ring correlation (FRC) criteria. However, in the 3D case, the summation in the defining formulas [Chapter 3, Eqs. (3.64) and (3.65)] now has to go over shells |k| = constant.
This extension of the differential resolution criterion to three dimensions was first implemented (under the name of "Fourier shell correlation") by Harauz and van Heel (1986a) for the case of the FRC. Thus far, a resolution assessment analogous to the spectral signal-to-noise ratio (SSNR) has not been developed for the 3D reconstruction, but some considerations along these lines have been developed by Liu (1993). Resolution assessment based on the comparison of experimental projections with projections "predicted" from the reconstruction is useful, but cannot replace the full 3D comparison mentioned above. One of the consistency tests for a random-conical reconstruction is the ability to predict the 0° projection from the reconstruction, a projection that does not enter the reconstruction procedure yet is available for comparison from the analysis of the 0° data. Mismatch of these two projections is an indication that something has gone wrong in the reconstruction; on the other hand, an excellent match, up to a resolution R, is not a
guarantee that such a resolution is realized in all directions. This is easy to see by invoking the projection theorem as applied to the conical projection geometry (see Fig. 5.10): provided that the azimuths φi are correct, the individual tilted planes representing tilted projections furnish correct data for the 0° plane, irrespective of their angle of tilt. It is therefore possible to have excellent resolution in directions defined by the equatorial plane, as evidenced by the comparison between the 0° projections, while the resolution might be severely restricted in all other directions. The reason that this might happen is that the actual tilt angles of the particles differ from the nominal tilt angle assumed (see Penczek et al., 1994). Another resolution test, used initially for assessing the results of 3D reconstruction, employs the pairwise comparison of selected slices, again using the two-dimensional DPR and FRC (e.g., Radermacher et al., 1987b; Verschoor et al., 1989; Boisset et al., 1990b). Although such a test, in contrast to the 0° projection check mentioned above, is indeed sensitive to the accuracy of assignment of tilt angles to the particles, it still fails to give an overall assessment of resolution including all spatial directions, as only the Fourier shell measures can give. Because of these shortcomings of 2D resolution tests in giving a fair assessment of 3D resolution, the use of DPR and FRC computed over shells is now common practice (e.g., Akey and Radermacher, 1993; Radermacher et al., 1994b; Serysheva et al., 1995). (Unfortunately, the two measures in use give widely different results for the same data set, with FRC being as a rule more optimistic than DPR; see Section V, B in Chapter 3. It is therefore a good practice to quote both in the publication of a 3D map.) However, there is as yet no convention on how to describe the direction dependence of the experimental resolution.
Thus, the resolution sometimes relates to an average over the part of the shell within the measured region of 3D Fourier space; sometimes (e.g., Boisset et al., 1993, 1995; Penczek et al., 1994) it relates to the entire Fourier space without exclusion of the missing cone. It is clear that, as a rule, the latter figure gives a more pessimistic estimate than the former.
VI. Merging of Reconstructions

A. The Rationale of Merging

Each reconstruction shows the molecule in an orientation that is determined by the orientation of the molecule on the specimen grid (and by a trivial "in-plane" rotation angle whose choice is arbitrary but which also figures eventually in the determination of relative orientations). The goal
of filling the angular gap requires that several reconstructions with different orientations be combined, to form a "merged" reconstruction as the final result. The procedure to obtain the merged reconstruction is easily summarized in three steps: (i) 3D orientation search, (ii) expression of the different projection sets in a common coordinate system, and (iii) reconstruction from the full projection set. The premise of the merging is that the reconstructions based on molecules showing different 0° views represent the same molecule, without deformation. Only in that case will the different projection sets be consistent with a common object model. The validity of this assumption will be investigated in the next section. In each case, the merging must be justified by applying a similarity measure: until such verification is achieved, the assumption that the particles reconstructed from different 0° sets have identical structure and conformation remains unproven. This extra scrutiny is required because molecules are potentially more variable in the single-particle form than when ordered in crystals.
B. Preparation-Induced Deformations

Molecules prepared by negative staining and air-drying show evidence of flattening. Quantitative data on the degree of flattening are scarce; however, as an increasing number of comparisons between reconstructions of molecules negatively stained and embedded in ice become available, some good estimations can now be made. Another source of information is the comparison of molecules that have been reconstructed in two different orientations related to each other by a 90° rotation. In interpreting such data, it is important to take incomplete staining into account. Without using the sandwiching technique, some portions of the particle, pointing away from the grid, may "stick out" of the stain layer and thus be rendered invisible. On the other hand, the sandwiching may be responsible for an increase in flattening. Boisset et al. (1990b) obtained two reconstructions of negatively stained, sandwiched Androctonus australis hemocyanin, drawing either from particles lying in the top view or from those in the side view, and reported a factor of 0.6 when comparing a particle dimension perpendicular to the specimen grid with the same dimension parallel to the grid. Since the flattening is accompanied by an increase in lateral dimensions, one can assume that the flattening is less severe than this factor might indicate. Another estimate, a factor of 0.65, comes from a comparison between two reconstructions of an A. australis hemocyanin-Fab complex, one obtained with negative staining (Boisset et al., 1993b) and the other using vitreous ice (Boisset et al., 1994b; 1995).
Cejka et al. (1992) obtained a somewhat larger factor (12 nm/18 nm ≈ 0.67) for the height (i.e., the dimension along the cylindric axis) of the negatively stained (unsandwiched) Ophelia bicornis hemoglobin. They were able to show that molecules embedded in aurothioglucose and frozen-hydrated have essentially the same height (18.9 nm) when reconstructed in their top view as molecules negatively stained presenting the side view. The 50S ribosomal subunit, reconstructed in its crown view from negatively stained, sandwiched (Radermacher et al., 1987a, b), and ice-embedded preparations (Radermacher et al., 1992), appears to be flattened according to the ratio 0.7:1. A factor of 0.6:1 holds for the calcium release channel, as can be inferred from a comparison (Radermacher et al., 1994b) of the side views of the cryo-reconstruction (Radermacher et al., 1994a, b) with the reconstruction from the negatively stained specimen (Wagenknecht et al., 1989a). In summary, then, it is possible to say that, as a rule, molecules prepared by negative staining are flattened to 60-70% of their original dimension. All evidence suggests that the flattening is normally avoided when ice or aurothioglucose embedment is used. In addition, the high degree of preservation of 2D bacteriorhodopsin crystals in glucose-embedded preparations (Henderson and Unwin, 1975; Henderson et al., 1990) would suggest that single molecules embedded in glucose might also retain their shape, although to date this has been neither proved nor disproved by a 3D study. The important lesson for data merging is that 3D reconstructions from negatively stained molecules cannot be merged unless they are based on the same view of the molecule, i.e., on images showing the molecule facing the support grid in the same orientation.
C. Three-Dimensional Orientation Search
1. Orientation Search Using Volumes

The reconstructed volumes have to be aligned both translationally and with respect to their orientation. This is achieved by a computational search in which the different parameters of shift and 3D orientation (five in all) are varied and the cross-correlation coefficient is used as similarity measure (Knauer et al., 1983; Carazo and Frank, 1988; Carazo et al., 1989; Penczek et al., 1992). The search range and the computational effort can be kept small if the approximate matching orientation can be estimated. Often it is clear from other evidence (e.g., interconversion experiments, see Section III, C; symmetries; or knowledge of architectural building principles, as in the case of an oligomeric molecule) how particular views
are related to one another in angular space. Examples are the top and side views of A. australis hemocyanin, which are related by a 90° rotation of the molecule around a particular axis of pseudosymmetry (see Appendix in Boisset et al., 1988). A major difficulty in finding the relative orientation of different reconstructions is presented by the missing cone. In Fourier space, the correlation between two volumes in any orientation is given by the sum over the terms F1(k)F2*(k) for all possible k. Here F1(k) denotes the Fourier transform of the first volume in its original orientation, and F2*(k) the complex conjugate of the Fourier transform of the second volume after it has been subjected to a "probing" rotation. Since the missing cones of data sets originating from different molecule views lie in different orientations, the orientation search is biased by a "vignetting effect." This effect results from the cones either intersecting each other to different extents, depending on the angles, or sweeping through regions of the companion transform that carry important parts of the structural information. Real-space methods of orientation determination generally fail to deal with this complication and may therefore be inaccurate. Fourier methods, in contrast, allow precise control over the terms included in the correlation sum and are therefore usually preferable. Figure 5.20 shows the outcome of an orientational search for the 70S Escherichia coli ribosome. In the matching orientations, the correlation coefficients are in the range between 0.8 and 0.86, justifying the assumption that we are indeed dealing with the same structure. The angles obtained when three or more structures are compared can be checked for closure. Penczek et al. (1992) applied this principle in the case of the 70S ribosome of E. coli, where three reconstructions S1, S2, S3 were available: the combination of the rotations found in the orientation searches (S1, S2) and (S2, S3) should give a result close to the rotation resulting from the search (S1, S3). [Note that each of these rotations is expressed in terms of three Eulerian angles (see Section II, B), so that the check actually involves the multiplication of two matrices.] More generally, in the situation where N reconstructions Si (i = 1, ..., N) are compared, any "closed" string of pairwise orientation determinations (i.e., a string that contains one volume twice, in two different comparisons),

(S_i1, S_i2), ..., (S_in, S_i1)
should result in rotations that, when combined, amount to no rotation at all. In the case of the 70S ribosome, the resulting closure error for three reconstructions was found to be in the range of 2° (Penczek et al., 1992), which signifies excellent consistency.
Fig. 5.20. Result of orientation search among three independent reconstructions of the ribosome (DPR resolution, 1/47 Å⁻¹). Top row, the three reconstructions before orientational alignment; bottom row, reconstructions 2 and 3 after alignment with reconstruction 1. From Penczek et al. (1992). Reproduced with permission of Elsevier Science, Amsterdam.
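A caricature of such a volume-to-volume search: only rotations in 90° steps about one axis and integer shifts are scanned here, whereas the real search samples all five parameters finely; the cross-correlation coefficient serves as the similarity measure, as in the papers cited.

```python
import numpy as np
from itertools import product

def ccc(a, b):
    """Cross-correlation coefficient between two volumes."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def orientation_search(ref, vol):
    """Exhaustive search over a tiny parameter grid: rotations about z
    in 90-degree steps and integer shifts of -2..2 voxels per axis."""
    best_score, best_params = -2.0, None
    for k, dz, dy, dx in product(range(4), *[range(-2, 3)] * 3):
        cand = np.roll(np.rot90(vol, k, axes=(1, 2)), (dz, dy, dx), axis=(0, 1, 2))
        score = ccc(ref, cand)
        if score > best_score:
            best_score, best_params = score, (k, dz, dy, dx)
    return best_score, best_params
```

In practice, a coarse grid of this kind is followed by a local refinement around the best-scoring parameters.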
2. Orientation Search Using Sets of Projections (OSSP)

Instead of the reconstructions, the projection sets themselves can be used for the orientation search (Frank et al., 1992; Penczek et al., 1994). In Fourier space, each random-conical projection set is represented by a set of central sections, tangential to a cone, whose mutual orientations are fixed (Fig. 5.21). It is obvious that the real-space search between volumes can be replaced by a Fourier space search involving the two sets of central sections. Instead of a single common line, the comparison involves N common lines (N being the number of projections in the two sets combined) simultaneously, with a concomitant increase in signal-to-noise ratio. The method of comparison uses a discrepancy measure 1 − ρ12, where ρ12 is the cross-correlation coefficient, computed over Fourier coefficients along the common line. Penczek et al. (1994) described the geometry underlying this search: the angle between the cones is allowed to vary over the full range. For any given angle, it is necessary to locate the "common line" intersections between all central sections of one cone with all central sections of the other.
Fig. 5.21. Principle of the method of orientation search using sets of projections (OSSP). The two cones around axes 1 and 2 belong to two random-conical data collection geometries with different orientations of the molecule. Two given projections in the two geometries are represented by central Fourier sections tangential to the cones. These central sections intersect each other along the common line C. The OSSP method simultaneously considers every common line generated by the intersection of every pair of central sections. From Penczek et al. (1994). Reproduced with permission of Elsevier Science, Amsterdam.
While the location of the common line is constructed in Fourier space, the actual computation of the discrepancy measure is performed in real space, exploiting the fact that the one-dimensional common line found for any particular pairing of central sections is the Fourier transform of a one-dimensional projection. The OSSP method has two advantages over the orientation search between reconstruction volumes: it makes the computation of reconstructions prior to merging unnecessary, and it offers a rational way of dealing with the angular gap.
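The real-space evaluation rests on the projection theorem: the common line of two central sections is the Fourier transform of a 1D line projection that both 2D projections share. The toy check below (our own construction, using two orthogonal projections of a random volume) illustrates the discrepancy measure 1 − ρ12:

```python
import numpy as np

def line_projection(img, axis=0):
    """1D projection of a 2D image; its FT is a central line of the image's FT."""
    return img.sum(axis=axis)

def discrepancy(p, q):
    """1 - rho_12, the measure minimized in the OSSP search."""
    p = p - p.mean()
    q = q - q.mean()
    return 1.0 - float((p * q).sum() / np.sqrt((p * p).sum() * (q * q).sum()))

vol = np.random.rand(16, 16, 16)     # toy "molecule"
proj_z = vol.sum(axis=0)             # projection along z -> (y, x)
proj_y = vol.sum(axis=1)             # projection along y -> (z, x)
# both share the 1D projection onto the x axis (their "common line"):
p1 = line_projection(proj_z, axis=0)
p2 = line_projection(proj_y, axis=0)
assert discrepancy(p1, p2) < 1e-12
```

For the correct relative orientation the discrepancy vanishes (up to noise); the OSSP search sums this measure over all common lines of the two cones.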
D. Reconstruction from the Full Projection Set

Once the relative orientations of the projection sets are known, either from an orientation search of the reconstruction volumes or from an orientation search of the projection sets (as outlined in the previous section), the Eulerian angles of all projections of the combined projection sets can be formulated in a common coordinate system. It is then straightforward to compute the final, merged reconstruction. It is important to realize that because of the properties of the general weighting functions, it is not possible to merge reconstructions by simply adding them. Instead, one must first go back to the projections, apply the appropriate coordinate transformations so that all projection angles relate to a single coordinate system, and then perform the reconstruction from the entire set. The same is true when one uses iterative reconstruction techniques, where the solution is also tied to the specific geometry of a projection set.
VII. Three-Dimensional Restoration

A. Introduction

3D restoration (as distinct from restoration of the contrast transfer function; see Section II, H in Chapter 2 and Section IX in this chapter) is the term we will use for techniques designed to overcome the angular limitation of a reconstruction, which leads to resolution anisotropy and an elongation of the molecule in the direction of the missing data. Maximum entropy methods present one approach to restoration (Barth et al., 1989; Farrow and Ottensmeyer, 1989; Lawrence et al., 1989). These methods are known to perform well for objects composed of isolated peaks, e.g., stars in astronomical applications, but less well for other objects; see Trussell (1980). Another approach, based on a set-theoretical formulation of the restoration and the enforcement of mathematical constraints, is known under the name of projection onto convex sets (POCS). POCS was developed by Youla and Webb (1982) and Sezan and Stark (1982) and introduced into electron microscopy by Carazo and Carrascosa (1987a, b). An overview chapter by Carazo (1992) addresses the general question of fidelity of 3D reconstructions and covers POCS as well as related restoration methods. As was mentioned earlier, iterative reconstruction techniques allow nonlinear constraints to be incorporated quite naturally. In this case they actually perform a reconstruction-cum-restoration, which can also be understood in terms of the theory of POCS (Penczek, unpublished work, 1993). All these methods, incidentally, along with multivariate statistical analysis, are examples of a development that treats image sets in the framework of a general algebra of images. Hawkes (1993) has given a glimpse into the literature in this rapidly expanding field.
B. Theory of Projection onto Convex Sets

What follows is a brief introduction into the philosophy of the POCS method, which can be seen as a generalization of a method introduced by Gerchberg and Saxton (1971) and Gerchberg (1974). A similar method of "iterative single isomorphous replacement" (Wang, 1985) in X-ray crystallography is also known under the name of solvent flattening. Still another, related method of "constrained thickness" in reconstructions of one-dimensional membrane profiles was proposed earlier by Stroud and Agard (1979). An excellent primer for the POCS method was given by Sezan (1992).
Similarly as in multivariate statistical analysis of images (Chapter 4), which takes place in the space of all functions with finite 2D support, we now consider the space of all functions with finite 3D support. In the new space (Hilbert space), every conceivable bounded 3D structure is represented by a (vector end-) point. Constraints can be represented by sets. For instance, one conceivable set might be the set of all structures that have zero density outside a given radius R. The idea behind restoration by POCS is that the enforcement of known constraints that were not used in the reconstruction method itself will yield an improved version of the structure. This version will lie in the intersection of all constraint sets and thus closer to the true solution than any version outside of it. In Fourier space, the angular gap will tend to be filled. The only problem to solve is how to find a pathway from the approximate solution, reconstructed, for instance, by back-projection or any other conventional technique, to one of the solutions lying in the intersection of the constraint sets. Among all sets representing constraints, those that are both closed and convex proved of particular interest. Youla and Webb (1982) showed that for such sets the intersection can be reached by an iterative method of consecutive *projections. A *projection from a function f(r) onto a set C in Hilbert space is defined as an operation that determines a function g(r) in C with the following property: "of all functions in C, g(r) is the one closest to f(r)," where "closeness" is defined by the size of a distance, for instance by the generalized Euclidean distance in Hilbert space:

E = ||g(r) − f(r)|| = [ Σ_{j=1}^{J} |g(r_j) − f(r_j)|² ]^{1/2}    (5.38)
[Note that, by implication, repeated applications of *projection onto the same set lead to the same result.] In symbolic notation, if Pi denotes the operation of *projection onto set Ci, so that f' = Pi f is the function obtained by *projecting f onto Ci, the iterative restoration proceeds as follows:

f^(1) = P1 P2 ... Pn f^(0)
f^(2) = P1 P2 ... Pn f^(1)    (5.39)
etc. As the geometric analogy shows (Fig. 5.22), by virtue of the convex property of the sets, each iteration brings the function (represented by a point in this diagram) closer to the intersection of all constraint sets. Carazo and Carrascosa (1987a, b) already discussed closed, convex constraint sets of potential interest in electron microscopy: spatial boundedness (as defined by a binary mask), agreement with the experimental
Fig. 5.22. Principle of restoration using the method of *projection onto convex sets. C1 and C2 are convex sets in Hilbert space, representing constraints, and P1, P2 are the associated *projection operators. Each element fi is a 3D structure. We seek to find a pathway from a given blurred structure f0 to the intersection set (shaded). Any structure in that set fulfills both constraints and is thus closer to the true solution than the initial structure f0. From Sezan (1992). Reproduced with permission of Elsevier Science, Amsterdam.
measurements in the measured region of Fourier space, value boundedness, and energy boundedness. Thus far, in practice (see following), only the first two on this list have gained much importance, essentially comprising the two components of Gerchberg's (1974) method.²⁰ Numerous other constraints of potential importance [see, for instance, Sezan (1992)] still await exploration.
C. Projection onto Convex Sets in Practice

In practice, the numerical computation in the various steps of POCS has to alternate between real space and Fourier space for each cycle. Both support (mask) and value constraints are implemented as operations in real space, while the "replace" constraint takes place in Fourier space. For the typical size of a 3D array representing a macromolecule (between 64 × 64 × 64 and 100 × 100 × 100), the 3D Fourier transformations in both directions constitute the largest fraction of the computational effort.
20 Gerchberg's (1974) method, not that of Gerchberg and Saxton (1972), is a true precursor of POCS since it provides for replacement of both modulus and phases.
The support-associated *projector is of the following form:

Ps f = { f(i),  i ∈ M
       { 0,     otherwise,    (5.40)
where M is the set of indices defining the "pass" regions of the mask. In practice, a mask is represented by a binary-valued array with "1" representing "pass" and "0" representing "stop." The mask array is simply interrogated, as the discrete argument range of the function f is being scanned in the computer, and only those values of f(i) are retained for which the mask M indicates "pass." The support constraint is quite powerful if the mask is close to the actual boundary, and an important question is how to find a good estimate for the mask in the absence of information on the true boundary (which represents the normal situation). We will come back to this question later after the other constraints have been introduced. The value constraint is effected by the *projector (Carazo, 1992)
P, f =
a f b
l
fb
(5.41)
The measurement constraint is supposed to enforce the consistency of the solution with the known projections. This is rather difficult to achieve in practice because the projection data in Fourier space are distributed on a polar grid while the numerical Fourier transform is sampled on a Cartesian grid. Each POCS projection would entail a complicated Fourier-sinc interpolation. Instead, the measurement constraint is normally used in a weaker form, as a "global replace" operation: within the range of the measurements (i.e., in the case of the random-conical data collection, within the cone complement that is covered with projections; see Fig. 5.11), all Fourier coefficients are replaced by the coefficients of the solution found by weighted back-projection. This kind of implementation is somewhat problematic, however, because it reinforces a solution that may not be consistent with the true solution, because it incorporates a weighting that is designed to make up for the lack of data in the missing region. A much better "replace" operation is implicit in the iterative schemes in which agreement is enforced only between projection data and reprojections. In Fourier space, these enforcements are tantamount to a "replace" that is restricted to the Fourier components for which data are actually supplied. The use of the global replace operation also fails to realize an intriguing potential of POCS: the possibility of achieving anisotropic
superresolution, beyond the limit given by Crowther et al. (1970). Intuitively, the enforcement of "local replace" (i.e., only along central sections covered with projection data) along with the other constraints will fill the very small "missing wedges" between successive central sections on which measurements are available much more rapidly, and out to a much higher resolution, than the large missing wedge or cone associated with the data collection geometry. The resolution factor might be as large as two; theoretical calculations are still pending. What could be the use of anisotropic superresolution? An example is the tomographic study of the mitochondrion (Mannella et al., 1994), so far hampered by the extremely large ratio between size (several microns) and the size of the smallest detail we wish to study (5 nm). The mitochondrion is a large structure that encompasses, and is partially formed by, a convoluted membrane. We wish to obtain the spatial resolution in any direction that allows us to describe the spatial arrangements of the different portions of the membrane: Do they touch? Are compartments formed? What is the geometry of the diffusion-limiting channels? The fact that the important regions where membranes touch or form channels occur in different angular directions makes it highly likely in this application to pick up the relevant information. The subject of tomography is outside the scope of this book, but similar problems where even anisotropic resolution improvement may be a bonus could well be envisioned in the case of macromolecules. Examples for the application of POCS to experimental data are found in the work by Akey and Radermacher (1993) and Radermacher et al. (1992b, 1994b). In all these cases, only the measurement and the finite support constraints were used. In the first case, the nuclear pore complex was initially reconstructed from data obtained with merely 34° and 42° tilt and thus had an unusually large missing-cone volume.
Radermacher et al. (1994b) observed that POCS applied to a reconstruction from a negatively stained specimen led to a substantial contraction in z direction (see Fig. 5.23 in this Chapter and Fig. 7.6 in Chapter 7) while the ice reconstruction was relatively unaffected.
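In outline, one cycle of the alternation described in this section, combining the Fourier-space "replace" step with the real-space support [Eq. (5.40)] and value [Eq. (5.41)] projectors, might be sketched as follows. All function and array names are illustrative assumptions, not taken from any published implementation:

```python
import numpy as np

def pocs_cycle(volume, mask, measured_ft, measured_region, a, b):
    """One POCS cycle (illustrative sketch, not a published implementation).

    volume          -- current 3D estimate (real-valued array)
    mask            -- binary support mask M, Eq. (5.40)
    measured_ft     -- Fourier transform of the weighted back-projection
    measured_region -- boolean array marking measured Fourier coefficients
    a, b            -- lower/upper density bounds, Eq. (5.41)
    """
    # Measurement ("global replace") constraint, applied in Fourier space:
    # inside the measured region, restore the known coefficients.
    F = np.fft.fftn(volume)
    F[measured_region] = measured_ft[measured_region]
    v = np.fft.ifftn(F).real
    # Value constraint, Eq. (5.41): clip densities to the interval [a, b].
    v = np.clip(v, a, b)
    # Support constraint, Eq. (5.40): zero everything outside the mask M.
    return v * mask
```

In a real application the cycle would be iterated until the change between successive estimates falls below a threshold; as noted above, the two 3D Fourier transforms per cycle dominate the cost.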
VIII. Angular Refinement Techniques A. Introduction The random-conical method of data collection was developed as a way of providing a defined angular relationship among a set of molecules. Since the selection of groups is based on the classification of molecule views in
Fig. 5.23. Example for application of POCS. (a) Four views of the 50S ribosomal subunit, reconstructed from images of a negatively stained specimen using random-conical data collection and weighted back-projection; (b) the structure in (a) after application of POCS using only "replace" and "boundedness" projectors. The particle is seen to flatten in z direction, as a result of some filling of the missing cone in the low spatial frequency region. Reproduced with permission of M. Radermacher (unpublished).
the 0° micrograph, there exists an uncertainty in the actual size of the θ angle which defines the inclination of the central section associated with the projection. This angular uncertainty can easily reach ±10° and reduce the resolution of the reconstruction substantially (see the estimates given in Section VIII, D). At this point it should be reiterated (see Section V, H) that the cross-resolution comparison between the 0° average of a molecule set in a certain view and the 0° projection of the reconstruction, valuable as it is as a check for internal consistency (Section V, I), nevertheless fails to provide an adequate estimate of over-all resolution. It has been pointed out (Penczek et al., 1994) that the 0° projection is insensitive to incorrect assignments of θ, since its associated central section in Fourier space is built up from one-dimensional lines, each of which is an intersection between the 0° section and a 50° section. Angular refinement, by giving each projection a chance "to find a better home" in terms of its orientation and phase origin, improves the resolution of the reconstruction substantially. For the 70S ribosome, an improvement from 1/47.5 to 1/40 Å⁻¹ was reported (Penczek et al., 1994). An angular refinement technique essentially based on the same principle, of using an existing lower-resolution reconstruction as a template, has been used recently in the processing of virus particles (Cheng et al., 1994). The two schemes resemble earlier schemes proposed by
van Heel (1984) and Harauz and Ottensmeyer (1984a). In fact, all four schemes, although differing in the choice of computational schemes and the degree of formalization, can be understood as variants of the same approach.
B. Three-Dimensional Projection Matching Method In the 3D projection matching scheme, reference projections (in the following termed "reprojections") are computed from the (low-resolution) template structure such that they cover the entire angular space evenly (Fig. 5.24). As template structure, an existing 3D volume is used, which might have been obtained by merging several random-conical projection sets. In the application by Penczek et al. used to demonstrate the technique (the 70S ribosome from E. coli), 5266 such reference projections were obtained. A given experimental projection is cross-correlated with all reference projections. The angle giving the largest CCF peak is the desired projection angle. The cross-correlation function between reference and current projection gives, at the same time, the shift and the azimuthal orientation of the particle under consideration. Using the new parameters for each experimental projection, a new reconstruction is computed, which normally has improved resolution. This refined reconstruction can now be used as a new template, etc. Usually, the angles no longer change by significant amounts after two or three iterations of this scheme (see Section VIII, D for a discussion of this point in the light of experimental results). In the demonstration by Penczek et al. (1994), additional 0° or low-tilt data could be used, following this approach, to improve the resolution from 1/40 to 1/29 Å⁻¹. [0° data here means "the collection of particle images in an untilted specimen field."] The end result of the refinement, and the evenness of the distribution, can be checked by plotting a chart that shows the 3D angular distribution of projections (Fig. 5.25). Before refinement, each random-conical set of projections will be mapped into a circle of data points on such a chart. It is seen that the angular corrections in the refinement have wiped out all traces of these circular paths.
The 3D projection matching technique as described above (Penczek et al., 1994; Cheng et al., 1994) is closely related to Radermacher's 3D Radon transform method (Radermacher, 1994; see below). [This relationship can be understood in terms of the so-called X-ray transform versus Radon transform (see Natterer, 1986).] It also has close similarities with projection matching techniques that were formulated some time ago, when exhaustive search techniques with the computer were quite time-consuming and therefore still impractical to use. These relationships will be briefly summarized in the following:

Fig. 5.24. Principle of 3D angular alignment or refinement. From an existing reconstruction (top left), a large number of projections are obtained, covering the 3D orientation space as evenly as possible. In the case illustrated, 5266 projections are computed. 3D angular alignment of new data: a given experimental projection that is not part of the projection set from which the reconstruction was obtained is now cross-correlated with all trial projections. The direction for which maximum correlation is obtained is then assigned to the new projection. In this way, an entire new data set can be merged with the existing data so that a new reconstruction is obtained. Angular refinement of existing data: each projection that is part of the experimental projection set is matched in the same way with the computed projections to find a better angle than originally assigned. However, in this case, the search range does not have to extend over the full space of orientations, because large deviations are unlikely. The best choice of search range can be gauged by histograms of angular "movements"; see Section VIII, D and particularly Fig. 5.27. From Penczek et al. (1994). Reproduced with permission of Elsevier Science, Amsterdam.

(i) van Heel (1984b) proposed a method to obtain the unknown orientations of a projection set using the following sequence: step 0: assign random orientations to the projections; step 1: compute 3D reconstruction; step 2: project 3D reconstruction in all directions in space to match experimental projections. The parameters for which best matches are
Fig. 5.25. Distribution of directions of 567 projections after refinement through 3D projection alignment. A single random-conical projection set prior to refinement would be represented by a circular pattern of dots on this map. From Penczek et al. (1994). Reproduced with permission of Elsevier Science, Amsterdam.
obtained yield new orientations for the projections; compute differences between current model projections and experimental projections; if summed squared differences are larger than a predefined value then GO TO step 1, otherwise STOP. This procedure thus contains the main ingredients of the algorithm of Penczek et al., except for the choice of starting point (step 0), which makes it difficult for the algorithm to find a satisfactory solution except in quite fortuitous cases where the random assignment of angles happens to come close to the actual values. In all other cases, the initial reconstruction will not likely resemble a low-resolution version of the true structure, and will fail to steer the orientation assignments in the correct directions. (ii) The Harauz and Ottensmeyer (1984a, b) approach differs from all approaches discussed thus far in that it uses a computer-generated model of the predicted structure rather than an experimental reconstruction as an initial 3D template and a combination of visual and computational
analyses to obtain an optimum fit for each projection. Because of the small size of the object (nucleosome cores whose phosphorus signal was obtained by energy filtering) and the type of specimen preparation used (air-drying), the result was greeted with considerable scepticism. There is also a question of principle whether the use of an imposed 3D reference (as opposed to an experimental 3D reference) might not bias the result. We recall the reports about reference-induced averages in the two-dimensional case (Radermacher et al., 1986b; Boekema et al., 1986), and our previous discussion of this question as an introduction to reference-free alignment schemes (see Chapter 3, Section III, E, 1). Like van Heel (1984b), Harauz and Ottensmeyer (1984a, b) also make use of the summed squared difference, rather than the cross-correlation, to compare an experimental projection with the reprojections. It was earlier pointed out (Chapter 3, Section III, C, 1) that there is no practical difference between the summed squared difference (or Euclidean distance) and the cross-correlation as measures of "goodness of fit" when comparing 2D images that are rotated and translated with respect to one another. This is so because the variance terms in the expression of the Euclidean distance are translation- and rotation-invariant. In contrast, the two measures do behave differently when employed to compare projections of two structures, since the variance of a projection may strongly depend on the projection direction. (iii) Alignment of correlation-averaged projections. Saxton et al. (1984) developed a somewhat related alignment technique as part of a strategy to reconstruct a crystal from projections that have been obtained by correlation averaging. New projections are added incrementally to a data set by aligning each to a "pseudoprojection" generated from the existing layer lines.
A refinement pass was designed in which each projection, again, has the opportunity "to find a new home": each projection is matched with the data set which has been modified by exclusion of that projection. (iv) Multiresolution approach. Dengler (1989) discussed projection matching in the general framework of a multiresolution approach to reconstruction. He addressed the important problem of error propagation, which can be solved only by, in the words of the author, "iterative control strategy from coarse to fine." Some of Dengler's ideas, relating to the modeling of a space-variant displacement vector field, have yet to be implemented and tested. (v) Method of "inactive" projection in factor space (Carazo et al., 1989). When a projection set is analyzed by correspondence analysis or by a similar method of multivariate statistical analysis, closely matched projections come to lie in close proximity to one another in factor space. When an existing model structure is projected successively along a closed
angular pathway, and the resulting projection set is analyzed by correspondence analysis, the data points representing the projections can be observed to fall on a closed loop (see the demonstration of this principle by Frank et al., 1986). In principle, this offers the possibility of determining orientations of experimental projections: these projections are "inactively" (i.e., without participating in the factor analysis) projected into the factor space spanned by the model projection set. Assignment of angles then is on the basis of proximity to active data points with known angles. A similar strategy could be used for refinement (although the computational effort would be substantial): by making each projection of a data set in turn "inactive." (vi) Matching of projections with a theoretical model. For completeness, it should be mentioned that projection matching plays a role in attempts to fit a theoretical model to experimental projections. The approach is initially similar to that of Harauz and Ottensmeyer (1984a, b), in that a theoretical model is used, but differs from the latter in the important fact that no attempt is made to obtain an experimental 3D reconstruction, mainly because of insufficient angular coverage. For example, De Haas and van Bruggen (1994) investigated the orientations of the four hexamers of the tarantula hemocyanin by cross-correlating their averaged projections with a model projected into a large range of orientations. van Heel and Dube (1994) refined the parameters of the architectural model of Limulus polyphemus hemocyanin using a similar matching method. Boisset et al. (1990a) were able to explain the appearance of stained molecules of Scutigera coleoptrata hemocyanin on a single carbon film by calculating projections of a stain-exclusion model generated in the computer (see Fig. 3.4 in Chapter 3).
C. Three-Dimensional Radon Transform Method Although its principle is quite similar to the projection matching approaches described above, the Radon transform method (Radermacher, 1994) is distinguished from the former by its formal elegance and the fact that it brings out certain relationships that may not be evident in the other approaches. Radermacher's method takes advantage of the relationship between the 3D Radon transform and the 2D Radon transform. The 3D Radon transform is defined for a 3D function f(r) as
  f̂(p, ξ) = ∫ f(r) δ(p − ξ^T r) dr,                (5.42)
where r = (x, y, z)^T and δ(p − ξ^T r) represents a plane defined by the direction of the (normal) unit vector ξ. The 2D Radon transform is defined for a 2D function g(r), in analogy to Eq. (5.42), as
  ĝ(p, ξ) = ∫ g(r) δ(p − ξ^T r) dr,                (5.43)
where r = (x, y)^T and δ(p − ξ^T r) now represents a line defined by the direction of the (normal) unit vector ξ. The discrete 2D Radon transform is also known under the name sinogram (e.g., van Heel, 1987b). Radermacher (1994) shows that the determination of the unknown orientation of a projection is solved by cross-correlating its discrete 2D Radon transform with the discrete 3D Radon transform of the existing model (Fig. 5.26). Translational alignment (equivalent to "phasing" in Fourier space) can be done simultaneously. As we recall from Section III, D, the use of the cross-correlation function between 2D sinograms was proposed by van Heel (1987b) under the name angular reconstitution, as a means of determining the relative orientation between two or more raw, experimental projections. Applied to such data, that method generally fails because of the low SNR values normally encountered in electron micrographs of stained or frozen-hydrated specimens, unless high point symmetries are present. However, the method has proved viable when applied to the matching of an experimental to an averaged projection, or the matching of one averaged projection to another (see van Heel et al., 1994; Orlova and van Heel, 1994; Serysheva et al., 1995).

Fig. 5.26. Demonstration of angle search with the 3D Radon transform, using the cryo-reconstruction of the 50S ribosomal subunit of Escherichia coli (see Fig. 5.19). (a) Computed projection of the reconstruction into the direction given by the angles {ψ = 0°, θ = 45°, φ = 30°}, with noise added (SNR = 0.88). (b) The ψ = 0 plane of the 3D cross-correlation between the Radon transform of the projection in (a) and the 3D Radon transform of the 50S-subunit reconstruction. The peak is found centered at {θ = 45°, φ = 30°}. From Radermacher (1994). Reproduced with permission of Elsevier Science, Amsterdam.
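For reference, the discrete 2D Radon transform of Eq. (5.43), i.e., the sinogram, can be sketched by binning each pixel according to p = ξ^T r. The nearest-neighbour version below is purely illustrative; practical implementations use proper interpolation:

```python
import numpy as np

def sinogram(img, angles_deg):
    """Discrete 2D Radon transform (sinogram), Eq. (5.43): for each angle,
    sum the image along lines of constant p = xi^T r, with
    xi = (cos a, sin a).  Nearest-neighbour binning; illustrative only."""
    n = img.shape[0]
    c = (n - 1) / 2.0
    yy, xx = np.mgrid[0:n, 0:n] - c       # centered pixel coordinates
    out = np.empty((len(angles_deg), n))
    for k, ang in enumerate(np.deg2rad(angles_deg)):
        # Bin index of each pixel along the direction xi.
        p = np.rint(xx * np.cos(ang) + yy * np.sin(ang) + c).astype(int)
        row = np.zeros(n)
        valid = (p >= 0) & (p < n)
        np.add.at(row, p[valid], img[valid])   # accumulate line sums
        out[k] = row
    return out
```

At 0° the rows of the sinogram reduce to column sums of the image, and at 90° to row sums, which provides a quick sanity check.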
D. The Size of Angular Deviations Little is known about the actual size of the angular deviations of a molecule within its class,²¹ but inferences can be drawn from the results of the angular refinements. Some data (unpublished) are available from the ribosome study of Penczek et al. (1994). As described in Section VIII, B the refinement by projection matching is done in several passes. For each pass, one can make a histogram of angular adjustments of the particles. These are given in terms of three Eulerian angles (ψ, θ, φ), of which ψ and φ have to do with the rotations of the particle around axes not of interest here, and only θ gives the information about its "tilt." In the first pass (Fig. 5.27a), one half of the particles "moved" by less than 10°, one-third moved by angles between 10° and 35°, and the rest moved by larger angles. In the second pass (Fig. 5.27b), a full 80% of the particles moved by less than 5° (50% are even within 1°; not shown in this figure), indicating approximate stabilization of the solution. One can therefore take the angular adjustments in the first pass as estimates for the actual angular deviations. The size of these deviations (a full half of the particles are tilted by more than 10° away from the orientation of their class) is at first glance surprising. It certainly explains (along with ψ- and φ-deviations not depicted in Fig. 5.27) the great gain in resolution that was achieved by correcting the angles. The deviations are the result of both misalignment and misclassification of the extremely noisy data. What it means is that the unrefined reconstruction is essentially a superposition of a high-quality reconstruction (where the resolution limitation is due to factors unrelated to angular deviations, namely electron-optical limitations, conformational changes, etc.)
and a blurred reconstruction, with the former based on projections whose θ angles are closely matching and the latter based on projections whose θ angles fall into a wide range. With such a concept, it is now possible to understand the power of the angular refinement method: basically, a high-resolution reconstruction is already "hidden" in the unrefined reconstruction, and it furnishes weak but essentially accurate reference information in the course of the angular refinement passes (see Fig. 5.27).

²¹ This obviously depends on the definition of the class obtained by MSA and classification. However, the classification in the example used is rather typical for particles in the 200 Å size range.
Fig. 5.27. Histograms showing the change of the theta angle during angular refinement (number of particles versus change of theta, in degrees). (a) First refinement pass; (b) second refinement pass. For explanation, see text. (P. Penczek, R. Grassucci, and J. Frank, 1994, unpublished data.)
Histograms similar to those shown in Fig. 5.27 were already obtained, in model computations, by Harauz and Ottensmeyer (1984a), who gave the projections intentionally incorrect angular assignments. Since these authors used a model structure, they could study the actual angular improvements as a function of the number of iterations. What is interesting in the results of Harauz and Ottensmeyer is that angular error limits of ±10° are rapidly compensated, to a residual error below 2°, while limits of ±20° lead to residual errors in the range of 8°. Since the algorithm driving the correction of angles in this work differs somewhat from that employed by Penczek et al. (1994), the behavior of angular correction is not strictly comparable. However, it is likely that there is again a limit of rms angular deviation below which the angular refinement is well behaved and very efficient, but above which only small improvements might be achievable.
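Per-pass histograms of the kind discussed above are straightforward to compute; a minimal sketch follows, with bin edges chosen arbitrarily to mirror the ranges quoted in the text (they are an illustrative assumption, not the binning actually used):

```python
import numpy as np

def theta_change_histogram(theta_before, theta_after,
                           edges=(0, 5, 10, 35, 180)):
    """Count particles by the magnitude of their theta adjustment in one
    refinement pass (bin edges in degrees; illustrative choice)."""
    change = np.abs(np.asarray(theta_after) - np.asarray(theta_before))
    counts, _ = np.histogram(change, bins=edges)
    return counts
```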
IX. Transfer Function Correction Without correction for the effects of the contrast transfer function, the reconstruction will have exaggerated features in the size range passed by the CTF spatial frequency band. Most importantly, the definition of the whole particle against the background is affected. Procedures for CTF correction were already discussed in Chapter 2. These can be applied either to the raw data (i.e., the projections) or to the 3D volume (or volumes). The most effective correction is obtained by combining data sets obtained with two or more different defocus settings. In deciding whether to apply correction before or after the 3D reconstruction, one has to consider the following pros and cons: when applied to averages, such as the 3D reconstructions, all procedures mentioned in Section II, H of Chapter 2 are very well behaved numerically because of the high SNR of the data. The opposite is true when these procedures are applied to raw data. On the other hand, correction after reconstruction runs into the difficulty that the projections from tilted-specimen micrographs have different defocus values (see Frank and Penczek, 1995). In order to proceed in this way, one has to sort the raw data according to the distance from the tilt axis and perform separate reconstructions for each defocus strip. For processing data from untilted specimens taken at different defocus settings, Zhu and co-workers (1995) developed a method of reconstruction that implicitly corrects for the CTF. This is done by including the CTF into the mathematical model describing the relationship between the three-dimensional model and the observed projections.
In algebraic form, this relationship can be formulated as

  p_k = H_k P o,                                   (5.44)
where p_k is a vector containing the projection data for the k-th defocus setting, o is a vector representing the elements of the 3D object in lexicographic order, P is a non-square matrix describing the projection operations, and H_k is the CTF belonging to the k-th defocus. In reality, the data are subject to noise, and the equation system underlying Eq. (5.44) is ill-conditioned. Zhu and co-workers (1995) make use of a regularized least-squares approach to find a solution to the expression

  Σ_k (1/N_k) ‖p_k − H_k P o‖² → min,              (5.45)
where N_k is the number of projections in the k-th set. A least-squares solution is found iteratively, by the use of Richardson's method:

  o^(n+1) = o^(n) + λ Σ_k (1/N_k) P^T H_k^T {p_k − H_k P o^(n)},    (5.46)
where λ is a small constant controlling the speed of convergence. This method of "reconstruction-cum-CTF correction" was successfully used in the reconstruction of the ribosome from energy-filtered data that were taken with 2.0 and 2.5 μm defocus (Frank et al., 1995a, b).
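On a toy dense-matrix problem, the iteration of Eq. (5.46) might be sketched as follows. All matrices, sizes, and the step constant are illustrative; real implementations never form the projection operator P explicitly:

```python
import numpy as np

def richardson_ctf(P, H_list, p_list, lam=0.5, n_iter=200):
    """Iterative least-squares solution of p_k = H_k P o, Eq. (5.46).
    P: projection matrix; H_list: per-defocus CTF matrices; p_list: data."""
    o = np.zeros(P.shape[1])
    for _ in range(n_iter):
        step = np.zeros_like(o)
        for H_k, p_k in zip(H_list, p_list):
            r = p_k - H_k @ (P @ o)                  # residual, k-th group
            step += P.T @ (H_k.T @ r) / len(p_k)     # 1/N_k weighting
        o = o + lam * step
    return o

# Toy example: 3 "pixels", 2 unknowns, two defocus groups with
# different (diagonal) CTFs; noise-free, so o is recovered exactly.
P = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H1 = np.diag([1.0, 0.5, 0.8])
H2 = np.diag([0.6, 1.0, 0.4])
o_true = np.array([1.0, -0.5])
p1, p2 = H1 @ P @ o_true, H2 @ P @ o_true
o_est = richardson_ctf(P, [H1, H2], [p1, p2])
```

Because the combined system is overdetermined and consistent, the iteration converges to the true object vector; with noisy data it converges to the regularized least-squares solution instead.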
Chapter 6. Interpretation of 3D Images of Macromolecules

I. Preliminaries: Significance, Experimental Validity, and Meaning It is necessary, at first, to ask two questions that have to do with the relationship between the 3D reconstruction and the molecule it represents: (i) Are observed density changes statistically significant? (ii) What assurances do we have that the reconstruction represents something real? The first question arises legitimately in any experiment where the result is derived by averaging over a number of measurements. The question can be answered by application of statistical hypothesis testing, but this requires knowledge of another type of information not yet mentioned: the three-dimensional (3D) variance. In conjunction with this analysis of significance, it is possible to shed new light on the important question of how far the radiation dose can be reduced, in recording the projections, before the significance of the 3D reconstruction is lost (Hegerl and Hoppe, 1976; Hoppe and Hegerl, 1981; van Heel, 1986b). The significance assessment and 3D variance estimation will be treated below in Section II. The second question arises due to the fact that, even to a larger extent than the electron crystallographic approach, the single-particle approach to electron microscopic structure research involves a large battery of complex operations with a number of discretionary choices of parameters, e.g., which particles to exclude, how many classes to seek in the classification, and which classes to choose for 3D reconstruction. As previously mentioned, there also exists an ambiguity in the combination of 2D classification with 3D reconstruction which needs to be resolved by careful
checks. Taken together, these uncertainties call for some type of validation and testing of the results for experimental consistency. Section III will attempt to address some of these issues. On the other hand, one should be aware of the fact that electron crystallography, before it reaches the resolution level where atomic structure validation is possible, also involves many discretionary choices. Exclusion of data in the merging process is an obvious example, where the opportunities offered by multivariate statistics have yet to be appreciated. Once satisfied that the reconstructed 3D density distribution is (i) statistically significant and (ii) reproducible in a strict sense (i.e., involving completely independent experiments), we proceed to the next question: (iii) What is the meaning of the results? The search for the "meaning" of a 3D density distribution involves different kinds of representation (a subject to be treated in Section IV) and a juxtaposition of the represented molecule with existing knowledge (Section V).
II. Assessment of Statistical Significance A. Introduction In this section, we follow the discussion and analysis of Liu (1991, 1993) as presented in the article by Frank et al. (1992) and in the papers by Liu and Frank (1995) and Liu et al. (1995). There are two types of questions that involve the notion of statistical significance: (i) Is a density maximum observed in the reconstruction significant, given the set of observations, or can it be explained as the result of a random fluctuation? (ii) Let us assume that we need to localize a feature in three dimensions by forming the difference between two reconstructions, A-B, where A is the reconstruction obtained from the modified specimen (e.g., molecule-ligand complex or the molecule in a modified conformational state) and B is the reconstruction obtained from the control. A and B are sufficiently similar in structure so that alignment of the common features is possible and meaningful, and the structural difference can be obtained by simple subtraction. A and B are each derived from a projection set. We now wish to answer the question: Is the observed difference in the 3D difference map meaningful? Both types of questions can be solved through the use of the 3D variance map; in the first case, it must be computed for the single reconstruction volume considered, and in the second case, such a map must be computed for both volumes to be compared. Once the variance
map, or maps, have been obtained, we can proceed by applying the methods of statistical inference, just as we have done in two dimensions (Section IV, B in Chapter 5). The only problem is that the 3D variance is normally not available in a similar way as in two dimensions, namely as a byproduct of averaging. We would have a similar situation only if we had an entire statistical ensemble of reconstructions, i.e., N reconstructions for each of the experiments (i.e., modified specimen and control). This situation is common in experiments where biological objects with helical symmetry are studied (e.g., Trachtenberg and DeRosier, 1987). In that case, we could derive an estimate of the 3D variance, in strict analogy to the 2D case, by computing the "average reconstruction"

  B̄(R) = (1/N) Σ_{i=1}^N B_i(R),                  (6.1)
and from that the variance estimate

  V(R) = (1/(N − 1)) Σ_{i=1}^N (B_i(R) − B̄(R))².  (6.2)
Experiments on single macromolecules that came the closest to this situation were done in Hoppe's group (see Ottl et al., 1983; Knauer et al., 1983), albeit with numbers N that were much too small to allow statistical testing. Automated tomography (Koster et al., 1992) might produce larger sets of reconstructions where an estimation along the lines of Eq. (6.2) is meaningful (comparable applications are found in radiological imaging of the brain where abnormalities are determined relative to an "averaged" brain; see, e.g., Andreasen et al., 1994). However, the normal situation is that only one projection is available for each projection direction and that each originates from a different particle. Only one reconstruction results from each experiment, and the 3D variance information, if any, has to be inferred from an analysis of projections only. Liu and co-workers (Liu, 1991, 1993; Frank et al., 1992; Liu and Frank, 1995; Liu et al., 1995) developed an approach where a variance is estimated whose definition is different from the one in Eq. (6.2), relating to a different statistical ensemble. This variance is derived from a "gedankenexperiment" in which it is assumed that each projection is available in many versions, all differing in the realization of the (additive) noise. Hence, many sets of projections are available, each of which can be used to reconstruct a version of the object. The two types of variance are fundamentally different. The first, following Eq. (6.2), allows definite conclusions on the location of conformational variations, while the second type primarily facilitates the analysis
250
Chapter 6. Interpretationof 3D Images of Macromolecules
of significance, allowing only tentative conclusions on conformational variations.
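When an ensemble of N reconstructions is in fact available, Eqs. (6.1) and (6.2) amount to the voxel-wise sample mean and unbiased variance. A minimal numpy sketch (the function name and the convention of stacking the reconstructions along the first axis are ours, not the book's):

```python
import numpy as np

def ensemble_mean_and_variance(reconstructions):
    """Voxel-wise average [Eq. (6.1)] and sample variance [Eq. (6.2)]
    over a stack of N reconstructions B_i(R), stacked along axis 0."""
    B = np.asarray(reconstructions, dtype=float)
    N = B.shape[0]
    B_mean = B.mean(axis=0)                        # (1/N) sum_i B_i(R)
    V = ((B - B_mean) ** 2).sum(axis=0) / (N - 1)  # unbiased variance per voxel
    return B_mean, V
```

With such a variance map, the voxel-wise t statistic discussed in Section II, C can be formed directly.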
B. Three-Dimensional Variance Estimation from Projections

The estimation from projections starts with the concept of the projection noise. Each experimental projection is seen, in a gedankenexperiment, as a member of a (virtual) statistical ensemble of M projections in the same ith direction. Hence, the noise belonging to projection j in direction i is

$$n_j^{(i)}(\mathbf{r}) = p_j^{(i)}(\mathbf{r}) - \bar{p}^{(i)}(\mathbf{r}), \qquad (6.3)$$
where the overbar denotes the ensemble average. Its contribution to a reconstruction based on the weighted back-projection algorithm is obtained by application of the weighting function:

$$\tilde{n}_j^{(i)}(\mathbf{r}) = n_j^{(i)}(\mathbf{r}) \otimes w^{(i)}(\mathbf{r}). \qquad (6.4)$$
Now, drawing projections at random from the virtual bins (each containing the projection ensemble for a different direction), we can produce N different virtual reconstructions $\tilde{B}_j$ and again define the variance according to

$$V'(\mathbf{R}) = \frac{1}{N-1} \sum_{j=1}^{N} \left[ \tilde{B}_j(\mathbf{R}) - \bar{\tilde{B}}(\mathbf{R}) \right]^2. \qquad (6.5)$$

It is easy to see that $V'(\mathbf{R})$ can be estimated by back-projecting the weighted-projection noise variances (now leaving out the drawing index j):

$$\hat{V}'(\mathbf{R}) = \sum_{i} \mathrm{BP}^{(i)} \left\{ \overline{\left[ \tilde{n}^{(i)}(\mathbf{r}) \right]^2} \right\}, \qquad (6.6)$$

where $\mathrm{BP}^{(i)}$ denotes back-projection along the ith direction.
The only remaining task is to estimate, for each projection direction, the projection noise $n^{(i)}(\mathbf{r})$. This can be done in two different ways. One is by obtaining a reconstruction from the existing experimental projections by weighted back-projection, and then reprojecting this reconstruction in the original directions. The difference between the experimental projection and the reprojection can be taken as an estimate of the projection noise. The other way (Fig. 6.1), closely examined by Liu and co-workers (Liu and Frank, 1995; Liu et al., 1995), takes advantage of the fact that in normal data collections the angular range is strongly oversampled, so that for each projection, close-neighbor projections that differ little in the signal are available. In other words, an ensemble of projections, of the
II. Assessment of Statistical Significance
Fig. 6.1. Flow diagram of the 3D variance estimation algorithm. Provided the angular spacing is small, the difference between neighboring projections can be used as a noise estimate $\tilde{n}^{(i)}(k, l)$. When such an estimate has been obtained for each projection, we can proceed to compute the 3D variance estimate by using the following steps: (i) convolve with the real-space equivalent of the back-projection weighting function (i.e., the inverse Fourier transform of the weighting in Fourier space); (ii) square the resulting weighted noise functions, which results in estimates of the noise variances; (iii) back-project (BP) the noise variance estimates, which gives the desired 3D variance estimate. From Frank et al. (1992). Reproduced with permission of Scanning Microscopy International.
kind assumed in the gedankenexperiment, actually exists in the close angular neighborhood of any given projection. It is obviously much easier to go the first way, by using reprojections, which after all relate to consistency checks routinely done (see Fig. 6.2). The detailed analysis by Liu and co-workers, in the works cited above, needs to be consulted for an evaluation of systematic errors that might favor the second way in certain situations. It is also important to realize that there are certain systematic errors, such as errors due to interpolation, inherent to all reprojection methods, that cause discrepancies not related to 3D variance (see Trussel et al., 1987). We have implicitly assumed in the foregoing that the weighted back-projection algorithm is being used for reconstruction. It must be emphasized that the estimation of the 3D variance from the projection variances only works for reconstruction schemes that, like weighted back-projection, maintain strict linearity. Iterative reconstruction methods such as the algebraic reconstruction technique (ART) (Gordon et al., 1970) and the simultaneous iterative reconstruction technique (SIRT) (Gilbert, 1972) do maintain the linearity of the relationship between projections and reconstruction, unless they are modified to incorporate nonlinear constraints (e.g., Penczek et al., 1992).
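The pipeline of Fig. 6.1 can be illustrated on a toy problem. The sketch below is our own simplification, not the algorithm of Liu and co-workers: it uses a 2D "volume", only two orthogonal projection directions, plain unweighted back-projection, and a pair of noisy realizations standing in for close-neighbor projections; the weighting-function convolution of step (i) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.3                                # additive noise level per pixel

# toy 2D "volume"; its projections are 1D sums along two orthogonal axes
obj = np.zeros((32, 32))
obj[10:20, 12:24] = 1.0

var_map = np.zeros_like(obj)
for axis in (0, 1):
    clean = obj.sum(axis=axis)
    # two noisy realizations stand in for a pair of close-neighbor projections
    p1 = clean + rng.normal(0.0, sigma, clean.shape)
    p2 = clean + rng.normal(0.0, sigma, clean.shape)
    n_hat = (p1 - p2) / np.sqrt(2.0)       # noise estimate, variance sigma^2
    n_sq = n_hat ** 2                      # step (ii): square the noise
    # step (iii): back-project (smear) the squared noise along the axis
    # that was summed over, and accumulate over projection directions
    if axis == 0:
        var_map += np.broadcast_to(n_sq[np.newaxis, :], obj.shape)
    else:
        var_map += np.broadcast_to(n_sq[:, np.newaxis], obj.shape)

# with two directions, var_map fluctuates around 2 * sigma**2
```

In a real application, each of the many projection directions would contribute, the noise estimates would first be convolved with the real-space weighting function, and the back-projection would use the same geometry as the reconstruction itself.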
Fig. 6.2. Proteasomes, negatively stained: comparison of experimental tilted-specimen projections (top row) with reprojections of the random-conical reconstruction (obtained by SIRT; middle row) and corresponding cross-correlation functions (bottom row). The comparison shows the extent of agreement obtained without angular refinement. In the work by Hegerl et al. (1991), from which this figure is taken, the comparisons were performed to check the accuracy of translational alignment, which was found to be within a pixel. Reproduced with permission of Elsevier Science, Amsterdam.
C. Significance of Features in a Three-Dimensional Map

To ascertain the significance of a feature in a 3D map, we proceed in a way similar to that outlined in the two-dimensional case (Chapter 3, Section IV, B). We need to know the statistical distribution of the deviation of the 3D image element $p(\mathbf{R}_j)$ from the averaged element $\bar{p}(\mathbf{R}_j)$, or the distribution of the random variable

$$t(\mathbf{R}_j) = \frac{p(\mathbf{R}_j) - \bar{p}(\mathbf{R}_j)}{s(\mathbf{R}_j)}, \qquad (6.7)$$

where $s(\mathbf{R}_j) = \sqrt{V(\mathbf{R}_j)/N}$ is again the standard error of the mean, which can now be estimated either from the variance estimate $\hat{V}(\mathbf{R}_j)$ [Eq. (6.2)] (which is normally not available, unless we can draw from many independent 3D reconstructions; see Section II, B for the detailed argument) or from the estimate [Eq. (6.6)] that is always available when weighted back-projection or one of the linear iterative reconstruction algorithms is used as the method of reconstruction. Having constructed the random variable $t(\mathbf{R}_j)$, we can now infer the confidence interval for each element of the 3D map, following the considerations outlined in the 2D case earlier on. On this basis, the local features within the 3D reconstruction of a macromolecule can now be accepted or rejected (see, e.g., the use of t-maps by Trachtenberg and DeRosier (1987) to study the significance of features in reconstructions of frozen-hydrated filaments).
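A voxel-wise version of Eq. (6.7) is a one-line computation once a variance map is available. A minimal sketch (function names are ours; the critical value would be taken from a t table for the chosen confidence level):

```python
import numpy as np

def t_map(p, p_mean, var_map, N):
    """Student-t variable of Eq. (6.7) for every voxel:
    t(R) = (p(R) - p_mean(R)) / s(R), with s(R) = sqrt(V(R) / N)."""
    s = np.sqrt(var_map / N)               # standard error of the mean
    return (p - p_mean) / s

def significant_voxels(t, t_crit):
    """Mask of voxels whose |t| exceeds the critical value for the
    chosen confidence level (e.g., from a t table with N - 1 d.o.f.)."""
    return np.abs(t) > t_crit
```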
D. Significance of Features in a Difference Map

The most important use of the variance map is in the assessment of the statistical significance of features in a 3D difference map. Due to experimental errors, such a difference map contains many features that are unrelated to the physical phenomenon studied (e.g., addition of antibodies, deletion of a protein, or conformational change). Following the 2D formulation [Eq. (3.52) in Chapter 3], we can write down the standard error of the difference between two corresponding elements of the two maps, $\bar{p}_1(\mathbf{R}_j)$ and $\bar{p}_2(\mathbf{R}_k)$:

$$s_d\left[\bar{p}_1(\mathbf{R}_j), \bar{p}_2(\mathbf{R}_k)\right] = \left[ V_1(\mathbf{R}_j)/N_1 + V_2(\mathbf{R}_k)/N_2 \right]^{1/2}, \qquad (6.8)$$

where $V_1(\mathbf{R}_j)$ and $V_2(\mathbf{R}_k)$ are the 3D variances. In studies that allow the estimation of the variance from repeated reconstructions according to Eq. (6.2) (see, e.g., Trachtenberg and DeRosier, 1987; Milligan and Flicker, 1987), $N_1$ and $N_2$ are the numbers of reconstructions that go into the computation of the averages $\bar{p}_1(\mathbf{R})$ and $\bar{p}_2(\mathbf{R})$. In the other methods of estimation, from projections, $N_1 = N_2 = 1$ must be used. On the basis of the standard error of the difference [Eq. (6.8)], it is now possible to reject or accept the hypothesis that the two elements are different. As a rule of thumb, differences between the reconstructions are deemed significant in those regions where they exceed the standard error by a factor of three or more (see Section IV, B in Chapter 3). Analyses of this kind have been used by Milligan and Flicker (1987) to pinpoint the position of tropomyosin in difference maps of decorated actin filaments. In the area of single-particle reconstruction, they are contained in the work of Liu (1993), Frank et al. (1992), and Boisset et al. (1994a). In the former two works, the 50S-L7/L12 depletion study of Carazo et al. (1988) was reevaluated with the tools of variance estimation. The other
application concerns the binding of a Fab fragment to the Androctonus australis hemocyanin molecule in three dimensions (Liu, 1993; Liu et al., 1995; Boisset et al., 1994a). In the case of the hemocyanin study (Fig. 6.3), the appearance of the Fab mass on the corners of the complex is plain enough, and hardly requires statistical confirmation. However, the t map (Fig. 6.3F) also makes it possible to delineate the epitope sites on the surface of the molecule with high precision.
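Eq. (6.8), together with the factor-of-three rule of thumb, translates directly into a voxel-wise test of a difference map. A minimal sketch (the function name and interface are ours):

```python
import numpy as np

def difference_map_test(p1, p2, v1, v2, n1, n2, factor=3.0):
    """Difference map and significance mask following Eq. (6.8):
    a voxel is flagged where |p1 - p2| exceeds `factor` times the
    standard error of the difference."""
    diff = p1 - p2
    s_d = np.sqrt(v1 / n1 + v2 / n2)       # Eq. (6.8)
    return diff, np.abs(diff) > factor * s_d
```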
III. Validation and Consistency

In the following, two case studies are shown which demonstrate the self-consistency of the entire suite of experimental and reconstruction methods. The demonstration goes beyond an analysis of statistical significance within a single study by comparing results from experiments that are entirely independent. Before coming to these comparisons, which still relate to data from the same group, I should point out that the calcium release channel is the first striking example of a single-molecule structure reconstructed independently by two different groups using entirely different data collection methods (Radermacher et al., 1994b; Serysheva et al., 1995). The agreement between the two published structures is quite detailed and indeed supports the validity and equivalence of both approaches to 3D electron microscopy of macromolecules. Another example is provided by the most recent publications of cryo-EM reconstructions of the 70S E. coli ribosome (Frank et al., 1995a, b; Stark et al., 1995), even though the differences in processing (presence versus absence of CTF correction, respectively) and presentation (low versus high threshold in contouring) make the results difficult to compare (see also Moore, 1995).
A. A Structure and Its Component Reconstructed Separately: 80S Mammalian Ribosome and the 40S Ribosomal Subunit

In the course of the translation cycle, the 80S eukaryotic ribosome is formed by association of the 60S large subunit with the 40S small subunit. To understand the translation process, one must come to a detailed understanding of the stereochemistry of the interaction between the ribosome, the mRNA, the tRNAs, and various factors. To date, 3D electron microscopy of single particles has been the only technique²² that has given moderate resolutions in the range of 1/25 to 1/40 Å⁻¹.

²² For a comparison with resolutions quoted in electron crystallographic studies of ribosomes (e.g., Arad et al., 1987), it has to be appreciated that the latter are often much more optimistic than those obtained with the DPR criterion (Section V, B, 2 in Chapter 3).
Fig. 6.3. Example of variance estimation, applied to the assessment of significance of a feature in a 3D difference map. Two reconstructions are compared: one (selected slices in row A) derived from hemocyanin molecules that are labeled at all four sites, and another (selected slices in row C) derived from molecules labeled at three sites only. Rows B and D: 3D variance estimates computed for reconstructions A and C, respectively, following the procedure outlined in the flow diagram of Fig. 6.1. Row E: difference volume A - C. Row F: 3D t-test map, computed for a 99% confidence level, showing the 3D outline of the Fab molecule. From Boisset et al. (1994a). Reproduced with permission of Academic Press.
Both 80S and 40S particles exhibit distinct views, and were thus good candidates for application of the random-conical reconstruction. The oval-shaped 60S particle, on the other hand, lacks a characteristic view and has been difficult to analyze. The shape of the 40S subunit reconstructed separately (Verschoor et al., 1989) from images of the isolated subunit, versus its shape as part of the 80S particle (Verschoor and Frank, 1990), provides an obvious check for the validity and consistency of the reconstruction method. In addition, a reconstruction of the 40S subunit-eIF3 complex (the "native 40S subunit"; Srivastava et al., 1992a) is available for comparison. As was pointed out earlier on, reconstructions done from negatively stained particles can strictly be compared only when they originate from particles facing the grid in the same way because, in that case, the forces that cause the deformations act in the same direction. Fortuitously, the different versions of the 40S subunit (single, complexed with eIF3, and associated with the 60S subunit in the 80S ribosome) have similar preferred orientations, enabling the comparison of the 3D shapes in some detail (40S versus 80S: Verschoor et al., 1989; 40S versus 40S-eIF3 complex: Srivastava et al., 1994, unpublished). These comparisons reveal a generally good agreement in all major features. However, these results have now been superseded by the data from cryo-electron microscopy (40S: Srivastava et al., 1995; 80S: Verschoor et al., 1995, in preparation), which show detailed agreement, the extent of which can be appreciated from the juxtaposition in Fig. 6.4. A new feature of the 40S subunit, previously not recognized in the negatively stained preparation (presumably because of an infolding or collapse induced by the air-drying), is a platform similar to the well-known platform of the 30S ribosomal subunit of Escherichia coli. The feature promptly shows up in the 40S subunit part of the 80S ribosome, at a place that would be expected from a superposition of the 70S and 80S ribosomes. In fact, as we have seen, the validation test has been passed with such a good grade that one can now begin to formulate an ambitious program of evolutionary comparisons of ribosomes of the different kingdoms (Verschoor, personal communication, 1995).

Fig. 6.4. Comparison between morphologies of the 40S ribosomal subunit obtained from independent studies. (a) The 40S subunit as part of the 80S ribosome cryo-reconstruction. From Verschoor et al. (1995). (b) The 40S subunit reconstructed separately. From Srivastava et al. (1995). Reproduced with permission of Academic Press.
B. Three-Dimensional Structural Features Inferred from Variational Pattern: Half-Molecules of Limulus polyphemus Hemocyanin

There are two ways by which 3D structural information is expressed in variational information gleaned from projections: by "rocking" (systematic changes of the molecule view produced by changes in its orientation) and by "waterline" information (the change in appearance caused by different levels of staining). Both effects are pronounced when the molecule is large and the staining is one-sided. For one-sided staining, small changes in z (the elevation above the specimen grid) of a molecule component may produce dramatic changes in the stain pattern; one could speak of an amplification effect. In this section, in describing some of the basic features of the architecture of arthropod hemocyanins, we draw on numerous earlier works by Lamy and co-workers (e.g., Sizaret et al., 1982; Lamy et al., 1982, 1985; Lamy, 1987), unless a specific reference is given. In the case of the half-molecule of Limulus polyphemus hemocyanin, the rocking was first discovered when an electron micrograph was analyzed by correspondence analysis (van Heel and Frank, 1981; Frank and van Heel, 1982). For molecules in their "flip" view (the name arbitrarily given to one of the top views; see Lamy et al., 1982), the stain imprints created by two diagonally apposed hexamers change in relative intensity, whereas the other two remain unchanged (Fig. 6.5). For molecules in their "flop" view, related to the "flip" view by flipping the molecule, the variation affects the two formerly constant hexamers, and the formerly variable hexamers are now constant. This behavior was explained by a postulated noncoplanar arrangement of the four hexamers, which would create two pivotal points (formed by the hexamers resting on the support) defining a "rocking axis" around which the molecule would be able to move. According to this rocking hypothesis, the two hexamers producing a varying
Fig. 6.5. Gallery of half-molecules (produced by proteolytic digestion of the full molecule) from Limulus polyphemus hemocyanin in different states of "rocking." The molecule is thought to be noncoplanar, so that it "rests" either on hexamers on the long axis (top row) or on the short axis (bottom row), depending on its flip/flop orientation. The diagrams depict the varying stain patterns found by correspondence analysis in an exaggerated way. From van Heel and Frank (1981). Reproduced with permission of Elsevier Science, Amsterdam.
pattern are those that have the freedom to move in a see-saw fashion. A given specimen field, so the explanation went, shows the molecules "frozen" in different rocking-related positions. At that stage, the rocking explanation was a mere conjecture. Indirect confirmation was, however, provided by the appearance of the peculiar 45° views (van Heel et al., 1983), which show the two dodecamers in a slightly skewed arrangement. Only recently did the reconstructions of the A. australis hemocyanin (which is virtually identical in its architecture to the half-molecule of the Limulus variety) provide clear evidence for the noncoplanar arrangement of the four hexamers (Boisset et al., 1990b, 1994a, b, 1995; van Heel et al., 1994). Of these results, only those obtained from cryo-reconstructions (Boisset et al., 1994b, 1995; van Heel et al., 1994) can be expected to maintain the integrity of the geometrical arrangement (Fig. 6.6). Based on the numerical fitting of the X-ray map of the
Fig. 6.6. Surface representations of Androctonus australis hemocyanin immunocomplex reconstructed from cryo-EM images. Starting with the top view orientation, the volume is incrementally rotated by 45° around its horizontal axis. White arrows in C and G point to the skewed arrangement of the two dodecameric half-molecules, which is responsible for the rocking of the whole molecule. From Boisset et al. (1995). Reproduced with permission of Academic Press.
hexameric Panulirus hemocyanin (supposed to be quite similar to the hexameric building blocks of Limulus and Androctonus hemocyanin), it has now been possible to measure the angle between the long axes of the dodecamers (Boisset et al., 1995). Although this example fails to demonstrate quantitative consistency, it nevertheless amounts to an instance of inductive reasoning, based on experiment A, which allowed a prediction to be made (i.e., that the 3D arrangement of the hexameric building blocks is skewed) that was subsequently proved right in experiment B. Since both experiments involve image processing procedures along different pathways, they clearly cross-validate each other.
C. Concluding Remarks

The purpose of this section was to stress that the application of image processing procedures must be accompanied by careful validation. This is all the more important as a molecule that is not constrained in a crystal lattice assumes a much larger variety of conformational states; this means that changes of protocol in many steps of the analysis, such as alignment, the number of factors in the multivariate statistical analysis, and classification, which have an influence on the selection of data, will also have an impact on the final result. Ultimately, self-consistency must be invoked as a check for validity. The most convincing case of self-consistency is made when a component of the macromolecular complex reconstructed separately exhibits the same structure, to the resolution to which this can be checked, as it has within the complex. By this criterion, the comparison between the structures of the 40S ribosomal subunit and the 80S ribosome, and similar comparisons of work now in progress (e.g., 30S and 50S subunits of the 70S ribosome; see Frank et al., 1995b), are testimonies to the reproducibility of a complex experimental protocol that includes specimen preparation, cryo-electron microscopy, and a large suite of image processing procedures.
IV. Visualization and Segmentation

Visualization tools help explore a structure represented by a 3D density distribution and display its relevant features. Segmentation is the general term we will use for the identification of known components or markers in the reconstruction volume. Since the most straightforward visualization of a structure already involves a decision about the placement of the boundary against the background (the solvent, support, or embedding medium), visualization and segmentation will always go hand in hand.
A. Segmentation

Below atomic resolution, 3D density distributions are segmented to different degrees. Down to resolutions of 1/7 to 1/10 Å⁻¹, the outlines of helical regions can still be recognized, but below that, all connection between secondary structure and the shape of domains is lost. Thus, in a sense, the result of a 3D reconstruction is the opposite of a puzzle: the whole is given, but without clues as to the location of its parts. Without the most basic information about the meaning of the features in a 3D representation of a molecule, the result is useless. Where could this information come from? Following is a list of possible sources, roughly in the order of increasing certainty: (i) recognition of morphological features,
(ii) known volume of a component with higher density, (iii) 3D antibody labeling, (iv) functional mapping, (v) difference mapping, and (vi) use of element-specific signals in the imaging (e.g., electron energy loss spectroscopy, EELS).
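Criterion (ii) in the list above can be made quantitative: if the partial volume of a component is known (and hence, via the voxel size, a voxel count), the density threshold that encloses exactly that many voxels delineates the component. A minimal sketch (the function name is ours):

```python
import numpy as np

def threshold_for_volume(density_map, target_voxels):
    """Density threshold at which exactly `target_voxels` voxels lie at
    or above it, so that the enclosed volume matches a known partial
    volume expressed as a voxel count."""
    ranked = np.sort(density_map.ravel())[::-1]   # densities, highest first
    return ranked[target_voxels - 1]
```

As discussed under criterion (ii) below, this presupposes that the density ranking in the map reflects the true ranking in the object.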
1. Recognition of Morphological Features

This is a subjective method but may provide a basic orientation. Often this recognition involves an inference from a labeling study in two dimensions, and in this case it is possible to "harden" the evidence by projecting the structure in the orientation that matches the views from which the labeling study was done. Examples of this type of inference are found in the identification of the L1 "shoulder" and the L7/L12 "stalk" in the reconstruction of the 50S ribosomal subunit (Radermacher et al., 1987b). Morphological features with obvious functional implications are the channels immediately recognized in the 3D image of the calcium release channel (Wagenknecht et al., 1989a; Radermacher et al., 1994b).
2. Known Volume of Component with Higher Density

When a component possesses a higher density than the rest of the structure, due to its atomic composition, segmentation can be based on its known partial volume. Frank et al. (1991) and Stark et al. (1995) estimated the region occupied by ribosomal RNA within the 70S ribosome in this way. Obviously, the validity of this kind of segmentation relies on the accuracy of the volume information and on the assumption that the density ranking among the voxels in the 3D map reflects the true ranking in the object. The validity of the second assumption must be examined separately in each case, because the contrast transfer function produces some scrambling of the ranking, due to its suppression of low spatial frequencies.

3. Antibody Labeling
This method involves a separate experiment for each protein or epitope to be identified, yielding a 3D map for each experiment. Comparison with the unlabeled control structure then allows the labeling site to be identified with a precision that is normally higher than the resolution (see Boisset et al., 1995). Studies of this kind are well known in the virus field (e.g., Smith et al., 1993). For molecules with lower symmetry considered in this book, the first 3D labeling studies were done by Srivastava and co-workers (1992b) and Boisset and co-workers (1992b, 1993b) on negatively stained specimens (3D mapping of protein L18 on the 50S ribosomal subunit and of the four copies of protein Aa6 on the surface of A. australis hemocyanin, respectively). The first such studies on ice-embedded specimens were done by Boisset et al. (1995), again on the A. australis hemocyanin-Fab complex.
In terms of the goal of segmentation, to delineate the 3D boundaries of a component, antibody labeling evidently falls short since it can only point to the location of a single accessible site on the molecule's surface. Multiple labeling experiments with different monoclonal antibodies would be necessary to cover the exposed area of a protein and thereby determine its surface boundaries. Even then, one must use additional information (known volume, approximate shape) in order to infer its boundaries in three dimensions.
4. Functional Mapping

This method is based on the use of natural ligands that are either visualized directly or, if too small to be detectable, visualized after tagging with a heavy-atom cluster compound such as monomaleimido-Nanogold, a 1.4-nm gold cluster (see Section I, G in Chapter 2). An example is provided by the localization by Wagenknecht et al. (1994) of the sites of calmodulin binding on the calcium release channel (see also the commentary by Franzini-Armstrong, 1994). Although this localization was two-dimensional only, additional experiments with a biotin-streptavidin-gold cluster compound narrowed down the sites to the cytoplasmic face of the receptor.
5. Difference Mapping

This mapping, involving the removal of a component in a separate experiment (or partial reconstitution), produces accurate 3D segmentation information, provided that the component does not have a role in stabilizing the entire macromolecular complex. The example of the 50S ribosomal subunit depleted of L7/L12 protein (Carazo et al., 1988b) was only a partial success since, although L7/L12 was unambiguously traced to the morphological feature of the stalk, the precise boundary could not be established because of a substantial conformational change apparently triggered by the removal of this protein dimer. An effort to localize 5S rRNA in the 50S ribosomal subunit by partial reconstitution (Radermacher et al., 1990) was already mentioned. Another example is the long-term project of exploring the molecular motor of bacteria by single-particle reconstruction of basal bodies of mutants that lack certain components (Stallmeyer et al., 1989a, b; Sosinski et al., 1992; Francis et al., 1994).
6. The Use of Element-Specific Signals

This method has great potential, but this potential has not been realized except in a few applications (Harauz and Ottensmeyer, 1984a, b) because it requires specialized energy-filtering instruments. The location of RNA components should be revealed on account of their high phosphorus content. One serious limitation is that a very small fraction of the dose (less than 1%) is utilized, so that the buildup of a statistically significant image requires a
total specimen dose that exceeds the low dose (as defined in Section III, A, Chapter 2) by a large factor.
B. Visualization and Rendering Tools

Less than 10 years ago, the visualization of electron microscopic objects reconstructed by the computer still relied largely on physical model building. Computer output was primarily in the form of crude printouts and contour maps, from which slices could be cut. Meanwhile we have witnessed an explosion of graphics capabilities that have reached the personal computer. This section cannot possibly cover the many options of representation currently available in both commercial and public-domain software, but can only describe the underlying principles briefly. An overview of the different techniques that have potential importance in electron microscopy was given by Leith (1992). Many visualization techniques in biomedical imaging, whose progress is featured in annual conferences [e.g., Visualization in Biomedical Computing; see Robb (Ed.), 1994], can obviously be adapted to electron microscopy.
1. Surface Rendering

The objective of surface rendering is to create the 3D appearance of a solid object by the use of visual cues such as intensity, cosine shading, reflectance, perspective, and shadows. Radermacher and Frank (1984) developed a combination of intensity and cosine shading first employed to obtain a surface representation of ribosomes (Verschoor et al., 1983, 1984). Similar algorithms were developed by other groups (van Heel, 1983; Vigers et al., 1986a). The intensity is used to convey a sense of distance, with closer parts of the structure made to appear brighter than more distant parts. Cosine shading mimics the appearance of matte surfaces that are inclined relative to the viewing direction, with the highest brightness attached to surface elements whose normal coincides with the viewing direction, and zero brightness to surface elements whose normal is perpendicular to the viewing direction.

The schematic diagram in Fig. 6.7 explains how, in general, the distance of the surface from the viewer is derived from the 3D density distribution stored in the volume. First, a threshold density d₀ must be defined, to create the distinction between "inside" (d > d₀) and "outside" (d < d₀). The choice of this threshold is critical in the visualization of the structure, as different choices may bring out different aspects of it. With the given threshold, the algorithm scans the volume starting from a reference plane (which can be inside or outside the volume), counting the (perpendicular) distance between that plane and the first encounter of a voxel that is equal to or larger than d₀. The precise distance must then be evaluated by interpolation in the close vicinity of that boundary. The resulting distances are stored in the so-called distance buffer. On completion of the scan, the distance buffer contains a faithful representation of the topography of the molecule as seen from the direction perpendicular to the reference plane.

Fig. 6.7. Principle of surface representation combining shading and depth cue. The representation of a surface element is based on its distance from a reference plane measured normal to this plane. R and R', reference planes chosen inside and outside the structure, respectively. S1 and S2, two portions of the surface. S1 is hidden by S2 unless the interior reference plane R is used. O, observer; t1 and t2, distances of the two adjacent surface elements. The slope |t1 - t2| can be used to control the local shading, while the distance itself can be used to make elements close to the observer appear bright, and those far from the observer, dark. From "Advanced Techniques in Biological Electron Microscopy." Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs. Frank, J., and Radermacher, M., Vol. III, pp. 1-72, Fig. 42 (1986). Reproduced with permission of Springer-Verlag.

The desired surface representation is now achieved by using the distance buffer in the computation of the two representational components, intensity and shading, and mixing the two resulting signals. A 50:50 mixture of the two components is usually quite satisfactory. Much more elaborate representations incorporate one or two sources of light, a choice of surface texture, and the use of colors.
However, one must bear in mind that beyond conveying shape information, surface representations reflect no physical property of the molecule and that any extra effect has merely esthetic appeal. Examples of spectacular representations employing color to indicate segmentation are especially found in the virus field (Salunke et al., 1986; Namba et al., 1988; Stewart et al., 1991, 1993; Shaw et al., 1993; Dryden et al., 1993; Yeager et al., 1994), but color is also increasingly used for other macromolecular assemblies of high complexity (e.g., Frank et al., 1991; Akey and Radermacher, 1993; Frank et al., 1995a, b).
Surface representations are especially effective when presented in the form of stereo pairs. To achieve this, the representation is calculated for two viewing directions separated by 6° around the vertical axis, and the two images are mounted side by side with a distance equal to the horizontal distance between the two eyes (Fig. 6.8). The properties of the visual system also make it possible to perceive an entire rotation series at once as a field of 3D objects (van Heel, 1983; see Verschoor and Frank, 1990, for an example of this effect).
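A sketch of how such a pair might be computed, assuming a density volume indexed (z, y, x) and using scipy.ndimage for the rotation (an assumption of this illustration; any 3D rotation routine would do, and the function name is invented):

```python
import numpy as np
from scipy import ndimage

def stereo_pair(volume, separation=6.0):
    """Render a stereo pair: rotate the volume by +/- half the angular
    separation (degrees) about the vertical axis, project each view along
    z, and mount the two images side by side."""
    half = separation / 2.0
    views = []
    for angle in (-half, half):
        # axes=(0, 2): rotate in the z-x plane, i.e., about the vertical y axis
        rotated = ndimage.rotate(volume, angle, axes=(0, 2), reshape=False)
        views.append(rotated.sum(axis=0))  # parallel projection along z
    return np.hstack(views)
```

In practice the projections would be replaced by the surface representation described above; the geometry (two views 6° apart, mounted side by side) is the same.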
2. Volume Rendering Volume rendering is the term used for techniques that give a view of the entire volume, from a given viewing direction, to which all voxels are able
Fig. 6.8. Example of surface representations displayed as stereo pairs: front and back of the 50S ribosomal subunit of Escherichia coli reconstructed by Radermacher et al. (1987a). From Frank (1989a). Reproduced with permission of Eaton Publishing.
Chapter 6. Interpretation of 3D Images of Macromolecules
to contribute. Thus the structure appears transparent or semitransparent, and all interior features appear overlapped. The simplest rendering of this kind is obtained by parallel projection. However, more complicated ones have been devised in which interior densities are reassigned and gradient information is allowed to contribute. The value of volume rendering is still controversial in many areas of image processing applications. The problem is that such a rendering corresponds to no known experience in the physical world. Perhaps the closest experience is watching a jellyfish from a close distance and seeing its organs overlapped as the animal moves. However, this example shows at the same time that we need to watch a transparent structure in motion in order to appreciate the spatial relationships of its components. Stereo representation also helps in conveying these relationships, but experience shows that a presentation as a "movie" is by far the most powerful tool. Somewhere in between surface and volume rendering falls the use of several three-dimensional contours ("wire mesh") in different colors representing different choices of threshold. Examples of the use of this technique, in widespread use in X-ray crystallography (e.g., using FRODO and its successor O; Jones, 1978; Jones et al., 1991), are found in the works of Milligan and Flicker (1987) and Frank et al. (1991). In the first instance, the boundaries of actin filament regions with highest density (above 15% of the peak density) are demarcated by a separate three-dimensional contour. In the second instance, the putative boundary of ribosomal RNA inside the E. coli ribosome was demarcated by an interior contour. Such contour representations are most effective if presented as stereo pairs.
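The parallel projection mentioned above is a one-line operation on a voxel array; the sketch below also hints at the reassignment of interior densities via an optional transfer function. Names and interface are illustrative only.

```python
import numpy as np

def volume_render(volume, axis=0, density_transfer=None):
    """Simplest volume rendering: a parallel projection in which every
    voxel contributes. The optional `density_transfer` callable stands in
    for the reassignment of interior densities mentioned in the text."""
    v = volume if density_transfer is None else density_transfer(volume)
    return v.sum(axis=axis)
```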
C. Definition of Boundaries The definition of the molecular boundary in the reconstruction volume has already been brought up (Section IV, A) as an issue of segmentation. In the following, this question will be addressed as it relates to the practical questions of visualization. It is clear from the outset that, due to the limited resolution, the molecule lacks "hard" boundaries within the volume. However, it is difficult with existing rendering techniques to convey the uncertainty of a boundary. Therefore, it is common practice to show the molecule as a solid object, with the understanding that the "hard" surface shown is accompanied by an inner and an outer confidence margin (see the discussion of this point by Frank et al., 1991). Several methods can be employed to determine the molecular boundary: (i) the gradient criterion, (ii) analysis of the histogram of voxels, and (iii) a molecular volume constraint.
IV. Visualization and Segmentation
According to the gradient criterion, formulated by Verschoor et al. (1984), the boundary is placed at the midpoint of the steep gradient region separating the "interior" from the "exterior" of the particle. Such a steep gradient is found between the stain-excluding protein and the stain, which makes this method particularly suitable for negatively stained specimens. The placement of the threshold at the midpoint is evidently arbitrary, but it leads to consistent results when different reconstructions of the same specimen are compared.

The analysis of the voxel histogram was used by Frank et al. (1991) to determine the boundary of the 70S ribosome in vitreous ice (Fig. 6.9). As mentioned in Section II, A of Chapter 2, the physical quantity that is being imaged is the Coulomb potential of the object. In the histogram, any chemically homogeneous mass (which therefore has the same scattering density) of sufficient size will appear as a high, narrow peak, while any inhomogeneous mass will appear as a broad peak. The term "inhomogeneous" simply refers to the fact that components smaller than the resolution distance cannot be separately resolved in a mixture of components that have different scattering density.

Fig. 6.9. Histogram of the density distribution in the reconstruction of the 70S ribosome embedded in ice. [Horizontal axis: density (0 to 10); vertical axis: number of voxels (0 to 15,000); peaks labeled ICE, PROTEIN, and RNA.] The sharp peak at lower densities is due to the ice, the broader peak to the particle mass, which combines protein and the higher-density ribosomal RNA. From Frank et al. (1991). Reproduced from The Journal of Cell Biology, 1991, 115, 597-605, by copyright permission of The Rockefeller University Press.

Analysis of the ribosome reconstruction with the histogram reveals a narrow high peak produced by the uniform density of the ice in the background and a broad peak originating from the particle itself. The minimum separating the two peaks, arising from an underrepresented density range between the density of ice and that of protein, can be used to demarcate the boundary of the molecule. The reason that the "ice peak" has finite width is that the contrast transfer function (CTF) leads to a scrambling of density values close to the boundary of any region with homogeneous density. One of the criteria for proper CTF correction could therefore be the ability of the algorithm to make the ice peak in the histogram as close as possible to the ideal delta function.

Finally, the known molecular volume can be used to postulate the density cutoff associated with the boundary between two components with different densities, again utilizing the density histogram of the reconstruction. If the volume of the mass that has the highest density is known, then the density cutoff can simply be found by integrating the area of the histogram "from the top down," i.e., starting with the highest density and proceeding in the direction of decreasing densities, until the known volume is reached (Frank et al., 1991; Stark et al., 1995). This approach is somewhat risky, as it is not based on the experimental data but entirely on extraneous information. It breaks down when the basic assumption, that the function describing the density mapping from object to image is monotonic, is not fulfilled. The occupied-volume criterion should therefore be used with caution and in conjunction with other information directly derived from the data.
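The "top down" integration of the histogram amounts to picking the density of the N-th highest voxel, where N is the known molecular volume expressed in voxels. A sketch (the function name is illustrative, and ties between equal density values are not treated specially):

```python
import numpy as np

def threshold_from_volume(density_map, target_voxels):
    """Occupied-volume criterion: find the density cutoff such that exactly
    `target_voxels` voxels lie at or above it, by accumulating the histogram
    "from the top down" (highest densities first)."""
    values = np.sort(density_map.ravel())[::-1]   # descending densities
    # the cutoff is the density of the target_voxels-th highest voxel
    return values[target_voxels - 1]
```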
V. Juxtaposition with Existing Knowledge A. The Organization of Knowledge The process by which conclusions are drawn based upon a comparison of the 3D map with existing knowledge is the essential part of interpretation. What immediately comes to mind are the attempts to fit an atomic structure into portions of an electron microscope (EM) reconstruction, as exemplified by the work of Stewart et al. (1991, 1993) for the adenovirus and by Rayment et al. (1993), Schröder et al. (1993), and Schmid et al. (1994) for the functional interpretation of actin complexes with myosin and scruin. These kinds of comparisons and fittings have now also been accomplished for macromolecular assemblies reconstructed from single particles lacking symmetry, as will be reviewed below (Section V, B). However, the
fact that there is virtually no instance of a fitting among EM data alone points to a gap in the organization of our knowledge: while atomic coordinates are routinely deposited in publicly accessible archives, no such organization exists to date for the large and ever-increasing amount of low-resolution 3D data. While it is possible to make use of X-ray coordinates in many contexts the original authors might not have thought of or dreamed of, the results from 3D electron microscopy are largely hidden away and are inaccessible except by personal interaction and negotiation with the author. If they were publicly accessible, then it would be possible to form difference maps, transfer the results of antibody mappings readily from one low-resolution structure to another, search for "motifs" that exist as morphological shapes, etc. Section V, D introduces the concept of a distributed database for 3D density distributions as a step in this direction.
B. Fitting of Electron Microscopy with X-Ray Results The interpretation of a 3D density map from single molecule reconstructions in terms of a meaningful structure is a task that is made difficult by the fact that the resolution of such maps is quite low compared to atomic resolution. In order to correlate features in such a map with features on an atomic scale, it is necessary to employ chemical tagging, antibody binding, and matching with X-ray structures already solved. In the case of the hemocyanin, for instance, an oligomeric molecule consisting of building blocks with virtually identical structure, the analysis of projection images has always gone hand in hand with their interpretation in terms of a three-dimensional model. Because of the high interspecies conservation of the basic hexameric building block among arthropods, the only existing X-ray map (at 1/5 Å⁻¹ resolution) of Panulirus hemocyanin (Schaick et al., 1982) has been used in numerous efforts of model building (e.g., Lamy et al., 1990; Boisset et al., 1990a; van Heel and Dube, 1994). The resulting models could only be qualitative, however, for three reasons: (i) preparation by negative staining and air-drying distorts the 3D structure; (ii) single-carbon layer preparations, used in many of the studies, often lead to incomplete staining (see Section I, C in Chapter 2) and thus to incorrect projections; and (iii) the process of matching a multicomponent model to a projection, where all geometric parameters (rotations, translations) are allowed to be free, is highly unstable. Even though the visual quality and consistency of some of the simulations is remarkable (Lamy, 1987; Boisset et al., 1990a; de Haas et al., 1991; see Fig. 6.10), the angles and other parameters extracted (de Haas et al., 1991; van Heel and Dube, 1994) and inferences drawn on conformational changes (de Haas et al., 1993) and cooperativity (de Haas and Bruggen, 1994) must be regarded as rather speculative.

Fig. 6.10. Example of the degree of correspondence between results from X-ray crystallography and electron microscopy of a negatively stained specimen. (a) Hemocyanin from Panulirus interruptus as obtained by X-ray crystallography. Only the main chain of the polypeptide is shown. Straight arrows show four protruding alpha helices. Curved arrows on the bottom right point to the regions where these alpha helices were removed from the model, to provide a check. The simulation shows that the envelope of the molecule is distinctly different in the absence of the short helical segment. (b) Low-resolution projection of the structure in (a). (c) Averaged projection of the 50 best images of a molecule set. (d) Same as in (c) after threefold averaging. The outline of the EM-derived average closely matches the outline of the calculated projection in the regions where the alpha helix was left in place. From De Haas et al. (1993). Reproduced with permission of Elsevier Science, Amsterdam.

Only the most recent work, which resulted in cryo-reconstructions of arthropod hemocyanins, antibody-labeled (Boisset et al., 1994b, 1995) or unlabeled (van Heel et al., 1994), is offering the first opportunities to obtain geometric parameters based on true three-dimensional fittings. Boisset et al. (1994b, 1995) have fitted four copies of the Panulirus structure to the reconstructed density distribution of the A. australis hemocyanin-monoclonal Fab complex. In a second step, they were able to narrow down the epitope region on the subunit by fitting the experimental three-dimensional Fab "blobs" with the atomic structure of a generic Fab. Other examples of an integration between EM and X-ray results are found in the most recent attempts to accommodate the A- and P-site tRNAs in the available intersubunit space of the 70S (Frank et al., 1995a, b; Stark et al., 1995) and 80S ribosomes (Verschoor et al., work in progress). In this case, the problem is one of docking, and here one of the guiding principles is space exclusion. Apart from this rule, the docking has to follow a fundamental constraint (see, for instance, Lim et al., 1992, and Malhotra and Harvey, 1994): on both sides, the two elbow-shaped tRNA molecules have to join each other to within a few angstroms. On one side of the tRNA molecules, their anticodon loops must interact in close vicinity with the codons of the mRNA, which in turn interacts with an exposed region of the 16S ribosomal RNA of the small subunit. On the other side, the aminoacyl ends must be in close vicinity to the peptidyl transferase center on the 50S subunit. The difficulty in these studies is that there are many speculations about the locations of these two important reference sites, but no firm evidence. For example, Wagenknecht et al.
(1988) were able to localize the tRNA-anticodon interaction site on the 30S subunit (see Fig. 3.17 in Chapter 3), but this study was in 2D only and involved negative staining. In the absence of morphological clues or results from 3D studies where some kind of markers are visualized, a number of plausible configurations can be discussed but it is impossible to arrive at a unique solution. Frank et al. (1995a, b) were able to derive a plausible hypothetical model of the translational apparatus based on the observation of a channel in the 30S subunit and a tunnel in the 50S subunit (see cover illustration).
C. Use of Envelopes of Three-Dimensional Electron Microscopy Data Another use of low-resolution electron microscopic data is as constraint information: in X-ray crystallography, the envelope of a structure can be used as a powerful constraint for its phasing or refinement. In the
constraint satisfaction modeling of ribosomal RNA (Malhotra and Harvey, 1994), the envelope derived from an EM reconstruction was used to delineate the particle border which is expressed as a "hard" potential wall. Only within that wall is RNA folded according to rules of stereochemistry and subject to constraints that derive from cross-linking and chemical protection experiments. Finally, the existence of a low-resolution model is a crucial prerequisite for interpreting neutron scattering experiments involving the binding of ligands (Stuhrmann et al., 1995) or localization of individual proteins (Zhao and Stuhrmann, 1993).
D. Public Sharing of Low-Resolution Volume Data: The Three-Dimensional Density Database The few examples above, and those cited earlier from the muscle and virus fields, have demonstrated the need for a mechanism for sharing low-resolution 3D density maps relating to published work. Such data have the property that they cannot be readily segmented into distinct natural building blocks, in contrast to X-ray electron density maps, which can be interpreted in terms of a sequence of amino acids. Since, apart from a few exceptions, conventional publications cannot even reproduce a single 3D reconstruction (out of several that usually constitute the output of the work) in its entirety, the author must seek appropriate ways to convey a particular finding, normally by using surface representations. Thus the author is forced to select a particular density threshold, and a small number of orientations, to represent the relevant aspects of the structure. Inevitably, this process of selection withholds numerous aspects of the structure that are potentially relevant in other contexts of research. Of course, the most severe restriction is the unavailability of the complete density distribution in comparisons that involve 3D correlation searches over the entire volume. There will be an increasing number of situations where low-resolution data obtained in different groups have a part:whole relationship, as in the case of the 40S subunit and the 80S ribosome, where it is important to establish the precise spatial relationship. Currently, such data are exchanged only by special arrangements among different investigators. The first attempts are presently being made to establish a database for low-resolution 3D density data from different areas of microscopy (Carazo et al., 1994; Marabini et al., 1994, 1995). This database is conceived as a distributed, worldwide publicly accessible resource.
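The whole-volume 3D correlation search mentioned above can be sketched with FFTs. This is an unnormalized, circular (wrap-around) version for illustration only; the function name and interface are invented here.

```python
import numpy as np

def correlation_search(volume, motif):
    """Search a small density motif over an entire volume by 3D
    cross-correlation computed via FFTs; return the correlation map and
    the position of its peak. The correlation is circular and
    unnormalized, which suffices to illustrate the principle."""
    padded = np.zeros_like(volume)
    mz, my, mx = motif.shape
    padded[:mz, :my, :mx] = motif
    cc = np.fft.ifftn(np.fft.fftn(volume) * np.conj(np.fft.fftn(padded))).real
    return cc, np.unravel_index(np.argmax(cc), cc.shape)
```

A real search would additionally normalize the correlation locally and scan over motif orientations.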
Once an agreement is reached on the organization of the database, in terms of contents, format, and network and interfacing protocols, it will be possible to access such volume data, and any accompanying descriptive information, in the same way as is now possible with sequence information.
I. Introduction Thus far, different three-dimensional (3D) reconstructions have been mentioned only in passing, to illustrate certain arguments or steps of the analysis. It will be useful to conclude this book with an outline of the processing steps used in one selected application, the reconstruction of the calcium release channel (CRC) of fast-twitch skeletal muscle by Radermacher et al. (1994a, b), who used the random-conical reconstruction technique. The CRC was independently reconstructed by Serysheva et al. (1995) using the angular reconstitution technique of van Heel (1987b). The importance of this channel in the function of muscle is evident from a passage in the beginning paragraph of the article by Radermacher et al. (1994b): ...The CRC is an intracellular integral membrane protein of the sarcoplasmic reticulum. Depolarization of the sarcolemma causes the CRC to open (thereby releasing calcium, which triggers muscle contraction; the author) by mechanisms whose elucidation represents the central remaining problem in understanding excitation-contraction coupling. The channel complex is known to be composed of a single 565-kDa polypeptide whose precise sequence is known. To date, the noncrystallographic methods described in this book are the only means of obtaining a structural model. Electron micrographs of the isolated complex, in a negatively stained preparation, show a rosette-shaped particle with apparent fourfold symmetry (see Fig. 2.2 in Chapter 2).
II. Image Processing and Three-Dimensional Reconstruction of the Calcium Release Channel Cryoimages (Fig. 7.1) show the same preponderance of top views as observed with negative staining. Of several hundred pairs of micrographs (tilted/untilted grid), nine were selected according to the following criteria: (i) yield (i.e., how many free-standing particles are in the micrograph?) and (ii) usefulness of the tilted-grid micrograph (i.e., does it have adequate resolution, as judged by the diffractogram?). Most critical was the second
Fig. 7.2. Gallery of untilted and corresponding tilted molecules. The first, third, and fifth rows show untilted molecules, and the second, fourth, and sixth rows the corresponding tilted molecules. From Radermacher, Rao, Grassucci, Frank, Timerman, Fleischer and Wagenknecht (1994, unpublished data).
Chapter 7. Example for an Application: Calcium Release Channel
criterion, because only a few of these micrographs proved to be free of drift or blurring due to charging effects. Of these nine micrograph pairs, a total of 1665 particle images were selected in pairs, and the 0° images were subjected to alignment (using the translational and rotational correlation techniques described in Section III, D of Chapter 3), correspondence analysis, and classification. To give an idea of the extremely low signal-to-noise ratio (SNR), a few images of the untilted and tilted particles are reproduced in Fig. 7.2. The reasons that the usual reference-based alignment could be used in this case are that only a single view is present (if we disregard the minor peripheral differences due to flipping; see below) and that the particle is relatively large, producing a strong correlation peak even in the cryoimage. The point cloud on the correspondence analysis factor map 1 versus 2 (Fig. 7.3a) is visibly divided into three clusters, marked as I, II, and III. The
Fig. 7.3. Result of correspondence analysis applied to 1665 0° images of the CRC. (a) Map of factor 1 (horizontal) versus factor 2 (vertical). Each image is represented by an asterisk. The images are seen to form three clusters, labeled I, II, and III. From Radermacher et al. (1994). Reproduced from The Journal of Cell Biology, 1994, 127, 411-423, by copyright permission of The Rockefeller University Press. (b) The same map represented by local averages computed on a coarse grid. From Radermacher, Rao, Grassucci, Frank, Timerman, Fleischer, and Wagenknecht (1994, unpublished data).
Fig. 7.3. (continued)
meaning of this division becomes at once clear when one uses a display of the map in which local averages over grid squares are shown (Fig. 7.3b): images falling into clusters II and III evidently correspond to the two flip/flop-related orientations of the molecule, while the origin of the class I appearance, which lacks handedness, is unknown. Automated classification was performed using a mixed classification method (dynamic clouds followed by hierarchical ascendant classification (HAC); Section IV, F in Chapter 4). Figure 7.4 shows the three class averages based on this classification. In an attempt to "zoom in" further on other possibly existing heterogeneities, the images falling within each of the three classes were again separately analyzed by correspondence analysis. However, no additional systematic variations were found. The tilted-particle images belonging to the three classes of particles (for examples, see Fig. 7.2) were subsequently used in three separate reconstructions.

Fig. 7.4. Averages of the untilted molecules, corresponding to the three clusters in Fig. 7.3a. Scale bar, 100 Å. From Radermacher et al. (1994b). Reproduced from The Journal of Cell Biology, 1994, 127, 411-423, by copyright permission of The Rockefeller University Press.

While the reconstruction of class I is mainly of methodological interest, the class II and III reconstructions both represent the channel complex to a resolution of 1/32 Å⁻¹ and are also virtually indistinguishable from each other to within 1/31 Å⁻¹, after having been brought into register by a 180° rotation ("flipping") of one of the reconstructions around an axis perpendicular to the fourfold axis. (The two resolution values given above were obtained by using the differential phase residual averaged over shells in 3D Fourier space without exclusion of the missing cone. However, since the values of the Fourier modulus are small in the missing cone, the values obtained essentially reflect the resolution in the 3D directions supported by data.) In an attempt to reduce the effects of the missing cone (in the absence of data from particles in non-top view orientations), projection onto convex sets (POCS) (Chapter 5, Section VII, C) was used with spatial boundedness (as defined by a 3D binary mask) and "replace" as the only constraints. The 3D binary mask was obtained by thresholding a low-pass filtered version of the reconstruction, in analogy to the preparation of 2D binary masks for correspondence analysis (Chapter 4, Section III, E). While the reconstruction obtained from negatively stained particles (Wagenknecht et al., 1989a) is reduced in the z direction on application of POCS, the cryo-reconstruction changed its dimensions very little. The main effect of POCS is a better definition of features in the z direction. The surface representation of the CRC (Fig. 7.5) shows a convoluted, surprisingly open structure with many compartments accessible from both the cytoplasmic and sarcoplasmic sides. The complex architecture is easily comprehended from this stereo display. Even though the resolution of the structure is currently limited to 29 Å, the reconstruction can serve as a framework for functional interpretations.
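The POCS iteration with spatial boundedness and "replace" as the only constraints can be sketched as follows. This is a schematic illustration, not the original implementation; the mask conventions and function name are assumptions made here.

```python
import numpy as np

def pocs_missing_cone(volume, support_mask, measured_mask, n_iter=50):
    """Projection onto convex sets: alternately (i) enforce spatial
    boundedness by zeroing density outside a 3D binary support mask, and
    (ii) "replace" the Fourier coefficients in the measured region with
    the original data. `measured_mask` is True wherever 3D Fourier data
    are supported (i.e., outside the missing cone)."""
    F_measured = np.fft.fftn(volume)
    estimate = volume.copy()
    for _ in range(n_iter):
        estimate *= support_mask                       # spatial boundedness
        F = np.fft.fftn(estimate)
        F[measured_mask] = F_measured[measured_mask]   # "replace" constraint
        estimate = np.fft.ifftn(F).real
    return estimate
```

Each of the two steps is a projection onto a convex set, so the iteration converges toward a volume consistent with both the measured Fourier data and the support constraint.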
I have already cited the results on the architecture of the calcium release channel as a demonstration of the extraordinary reproducibility of single-particle reconstructions by different groups [Radermacher et al. (1994) and Serysheva et al. (1995)] working with entirely different methods: the surface representations of the two models are virtually indistinguishable. Of obvious interest is the nature of the conformational changes that must be involved in the rapid release of calcium into the cytoplasm. This question can now be addressed by 3D difference imaging. Another question concerns the role of calmodulin, which binds to the channel and modulates its action. Two-dimensional (Wagenknecht et al., 1994) and three-dimensional studies (Wagenknecht et al., 1995) have now established the sites of calmodulin binding, which are found on the four peripheral vestibules of the channel.
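A minimal sketch of 3D difference imaging between two aligned maps follows; the normalization to zero mean and unit standard deviation is one simple choice for bringing the two arbitrary density scales into register, not a prescription from the text.

```python
import numpy as np

def difference_map(vol_a, vol_b):
    """Form a 3D difference map between two aligned reconstructions,
    after bringing both to zero mean and unit standard deviation so the
    subtraction is not dominated by arbitrary density scales."""
    a = (vol_a - vol_a.mean()) / vol_a.std()
    b = (vol_b - vol_b.mean()) / vol_b.std()
    return a - b
```

Because the normalization removes any affine rescaling of the densities, two maps that differ only in scale and offset give a difference of zero.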
Fig. 7.5.
Stereo representations of the class-III reconstruction in different orientations. (a) Cytoplasmic side, (b) sarcoplasmic side of the channel. (c) Side view. The putative structural domains are labeled with numbers. From Radermacher et al. (1994b). Reproduced from The Journal of Cell Biology, 1994, 127, 411-423, by copyright permission of The Rockefeller University Press.
Fig. 7.6. Side view of the calcium release channel after application of POCS to the reconstruction. (a) As obtained by cryo-electron microscopy, (b) as obtained by using negative staining. From Radermacher et al. (1994b). Reproduced from The Journal of Cell Biology, 1994, 127, 411-423, by copyright permission of The Rockefeller University Press.

Also of some methodological interest is the comparison of the cryo-reconstruction with the 3D model obtained earlier from negatively stained molecules (Fig. 7.6). The negative-stain reconstruction shows strong flattening in the z direction and the complete collapse of the dome-shaped central domain (pointing to the right in Fig. 7.6). As mentioned before, application of POCS has resulted in particle dimensions that show the full extent of this effect. It is seen that in the negative-stain reconstruction, the z dimension is reduced to 60%.
I. Introduction The sophistication of data analysis in electron crystallography in general, and particularly in single molecule reconstruction, is such that it is virtually impossible for a newcomer group to write the necessary computer programs "from scratch." Thus the software packages of pioneering groups are of prime importance as the most effective "vectors" for the spread of new methodology. The purpose of this appendix is to list existing software packages and to discuss important considerations in the design of such packages. After the first wave of articles describing electron microscopy image processing systems (Smith, 1978; Saxton et al., 1979; van Heel and Keegstra, 1981; Frank et al., 1981; Trus and Steven, 1981; Hegerl and Altbauer, 1982), there has been relative silence, despite the fact that computers and computing have since undergone radical changes. The newly developed salient features of the various programs are mostly hidden in Materials and Methods sections or in figure legends of pioneering papers. Hegerl (1992) gave a brief overview of existing packages, with a table indicating the types of methods and applications supported. However, what is currently lacking is a more detailed description that also covers options for visualization and interfacing to other software in the academic or commercial realm. Ross Smith, New York University, and Bridget Carragher, University of Illinois, have taken the initiative to solicit articles from software-originating groups, to be published in a forthcoming issue of the Journal of Structural Biology. It is to be expected that this special issue will finally close the gap and provide newcomers to the field with some guidance.
II. Basic Design Features The need for a modular system with a hierarchical calling structure and comprehensive bookkeeping capabilities became obvious in the context of applications where large numbers of small images had to be processed. I
will briefly explain the importance of the principal design features that are highlighted in the preceding sentence. It will become clear that another important feature, flexibility in programming, follows from the other properties more or less automatically. Finally, a more recent development should be mentioned: the addition of the user interface, a "friendly" layer between the package and the user.
A. Modular Design The program is divided into semiautonomous units, or modules, that have precisely defined functions, called "operations," and are invoked by commands. Typically, such a module operates on an image (or set of images) of the database ("input files") and creates one or several images ("output files"). Beyond a certain level of complexity, it becomes impossible to design a program not based on this principle, because modularity ensures that separate parts of the code can be changed without affecting the entire program. However, what is initially a mere programming convenience creates the elements of a higher-level programming language: the commands, along with the data that they operate on (e.g., file names and registers), can be put together in many different sequences, according to rather simple rules that can be mastered by users who have no background in formal "low-level" programming languages (e.g., FORTRAN, C).
B. Hierarchical Calling Structure As in human communication, the richness of which rests on an organization of language into building blocks following the hierarchy word, sentence, essay, the power and flexibility of a software package increase tremendously when larger building blocks can be constructed in a hierarchical way. The operations forming the "menu" of the software system are at the very bottom of a hierarchy of procedures with increasing complexity and specialization. A sequence of operations, performed on specific data, may be called a task. Thus a task is defined by a string of commands, each of which invokes one of the operations, and by explicit references to data (file names, values of parameters). An example of such a task might be Fourier-based low-pass filtration, which is realized by the sequence {Fourier transform}, {apply filter}, {inverse Fourier transform}, along with the names of input and output files and the filter radius. Now we can derive from this a "generic task" by stripping off all explicit specifications, leaving symbols in their place. These act as "placeholders" for actual data. In SPIDER, such a generic task (in our example, {low-pass filtration}) is called a "procedure." Other packages have corresponding high-level command structures, which also come under the name of "script."
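By way of illustration, the low-pass filtration task just described might look as follows when written out in Python rather than in a package's command language. The function, its interface, and the dictionary standing in for the image database are all invented for this sketch; this is not SPIDER syntax.

```python
import numpy as np

def low_pass(in_file, out_file, radius, images):
    """The generic task {low-pass filtration}, realized as the operation
    sequence {Fourier transform}, {apply filter}, {inverse Fourier
    transform}. The file names and the filter radius (in cycles/pixel)
    are the "placeholders" bound only when the procedure is invoked;
    `images` stands in for the image database."""
    img = images[in_file]
    F = np.fft.fft2(img)                         # {Fourier transform}
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    F *= (fy**2 + fx**2) <= radius**2            # {apply filter}
    images[out_file] = np.fft.ifft2(F).real      # {inverse Fourier transform}
```

Invoking `low_pass` with concrete file names and a radius corresponds to calling the procedure with its placeholders filled in.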
Appendix 1. Software Implementations
What I have termed a generic task is strictly equivalent to the operations on the lower level, which are also defined in terms of a generic function (e.g., Fourier transform) that is to be applied to as-yet-unknown data. It is therefore a logical step in the design of an image processing system to allow a generic task to be called at any place where the next operation is solicited. If one applies this principle to the command sequence within the generic task itself, the result is a hierarchical calling structure in which generic tasks can call other generic tasks, and so on. The structure that one obtains is similar to the structure in other formal programming languages and provides a virtually infinite number of possibilities. In practice, the need to keep the values of local variables makes it necessary to maintain a stack in memory, which places a limit on the number of levels in the calling hierarchy.
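The hierarchical calling structure can be made concrete with a toy model, a sketch only: the operation and procedure names below are invented, operations are stand-ins that merely record their invocation, and the explicit stack of frames mirrors the memory stack that limits the calling depth in a real package.

```python
# "Operations" are primitive; "procedures" are named command sequences
# that may invoke operations or other procedures.
OPERATIONS = {
    "ft":  lambda x: x + ["ft"],    # stand-ins for real image operations
    "flt": lambda x: x + ["flt"],
    "ift": lambda x: x + ["ift"],
}

PROCEDURES = {
    "lowpass": ["ft", "flt", "ift"],          # the generic task from the text
    "refine":  ["lowpass", "lowpass", "ft"],  # a procedure calling procedures
}

def run(command, data, stack=None, max_depth=8):
    stack = [] if stack is None else stack
    if command in OPERATIONS:                  # bottom of the hierarchy
        return OPERATIONS[command](data)
    if len(stack) >= max_depth:                # finite stack => finite nesting
        raise RecursionError("calling hierarchy too deep")
    stack.append({"proc": command})            # push a frame of local registers
    for sub in PROCEDURES[command]:
        data = run(sub, data, stack, max_depth)
    stack.pop()
    return data

trace = run("refine", [])
```

Running "refine" expands, level by level, into the flat sequence of primitive operations actually applied to the data.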
C. Bookkeeping Capabilities and Data Storage Organization
In the averaging and 3D reconstruction of single biological macromolecules, a number of parameters unique to each particle have to be carried from one processing step to the next. Examples are statistical parameters, information identifying the origin of the particle, shifts and rotation angles determined in the alignment, codes identifying class membership, etc. These parameters are used not only as input to image processing operations but also to control the flow of processing: particles may be rejected on account of anomalous correlation coefficients with a template, anomalous shift vectors, outlier positions on factor maps, or membership in marginal clusters in the classification. Particles that are accepted have to be processed separately according to the class to which they belong. The parameters may be stored along with the images as "header information," in a separate readable ASCII "document file," or in both. The philosophy of how to store the images and the supporting information varies widely. One can take the position that the entire set of images forms a natural database unit that should be stored in a single file. This has enormous advantages, since it takes the burden off the operating system, but the price to be paid for this convenience is that a large fraction of the data, which may have low quality or no pertinence to the aim of the analysis, is carried around or must be purged explicitly under control of the software system. At one extreme, the use of the computer's operating system in creating and accessing data is abandoned altogether. However, as computers become more powerful and the tools in the operating systems improve, single-file storage of images remains a viable form of organization. Nevertheless, it is clear that the only way to achieve higher resolution is through a sharp increase in the number of particles.
When these numbers go into the thousands or even tens of thousands, the storage, accessing, and handling of single-particle data must clearly be rethought as a formal database organization problem.
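The bookkeeping role of a document file can be sketched as follows. This is a hypothetical example: the field names, thresholds, and file layout are invented for illustration and do not reproduce SPIDER's actual register conventions, but the flow-control logic (reject on anomalous parameters, then group the survivors by class) is the one described above.

```python
import csv
import io
from collections import defaultdict

# A "document file": one readable ASCII row per particle, carrying the
# parameters determined in earlier processing steps (illustrative fields).
doc_file = io.StringIO(
    "particle,micrograph,cc,shift_x,shift_y,angle,class\n"
    "1,mic01,0.82,1.5,-0.5,12.0,2\n"
    "2,mic01,0.31,9.0,8.0,301.0,2\n"   # anomalous cc and shift: reject
    "3,mic02,0.77,-0.5,0.0,45.0,1\n"
)
particles = list(csv.DictReader(doc_file))

def accepted(p, cc_min=0.5, shift_max=5.0):
    """Flow control: reject particles with anomalous parameters."""
    return (float(p["cc"]) >= cc_min
            and abs(float(p["shift_x"])) <= shift_max
            and abs(float(p["shift_y"])) <= shift_max)

# Accepted particles are routed to separate processing streams per class.
by_class = defaultdict(list)
for p in particles:
    if accepted(p):
        by_class[p["class"]].append(p["particle"])
```

Here particle 2 is rejected on both criteria, and particles 1 and 3 end up in the streams for classes 2 and 1, respectively.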
D. User Interfaces
Initially written for batch operation, most packages have been further developed to include a Windows-based graphics user interface. The role of this interface can be to organize the processing flow (as, for instance, in MDDP; Ross Smith, personal communication), to visualize images and 3D volumes (a universal capability), or to perform a variety of mapping options related to multivariate statistical analysis and classification (as in the WEB interface of SPIDER; see, for instance, Figs. 4.6d and 4.12c).
III. Existing Packages

Table A.1 Packages Capable of Single-Particle Averaging, Classification, and 3D Reconstruction (a)

Name | Reference | Source code | Contact
EM | Hegerl and Altbauer, 1982 | S | Dr. R. Hegerl, Max-Planck-Institut für Biochemie, Am Klopferspitz, 8031 Martinsried, Germany
IMAGIC | van Heel and Keegstra, 1982 | | Dr. M. van Heel, IMAGE SCIENCE Software GmbH, Mecklenburgische Strasse 27, 1000 Berlin, Germany
MDDP | Smith, 1978 | S | Dr. P. R. Smith, Department of Cell Biology, NYU Medical Center, 550 First Avenue, New York, NY 10016
PIC | Trus and Steven, 1981 | | Dr. B. Trus, National Institutes of Health, Bethesda, MD 20892
SEMPER | Saxton et al., 1979 | | Dr. W. O. Saxton, Synoptics Ltd., 271 Cambridge Science Park, Cambridge, CB4 4WE, UK
SPIDER | Frank et al., 1981b, 1995c | S | Dr. J. Frank, Wadsworth Center, P.O. Box 509, Albany, NY 12201-0509

(a) Follows Hegerl's (1992) survey. All packages are supported with Windows-based graphics interfaces. "S" indicates that source code is distributed with the package. IMAGIC and SEMPER are distributed commercially. (Note that all packages listed will be featured in the 1995 special software issue of the Journal of Structural Biology, edited by Ross Smith and Bridget Carragher.)
IV. Interfacing to Other Software
Interfacing among the electron microscopy-related packages, as well as between these packages and general visualization software, relies on the format conversion options to standard image formats provided by the individual packages. Examples of such standards are gif, tiff, postscript, and rgb. Interconversion among these standard formats is facilitated by a variety of proprietary display and screen-capturing tools, or explicitly by a general conversion tool such as xv (shareware provided by John Bradley, 1053 Floyd Terrace, Bryn Mawr, PA 19010). Thus, package ONE might produce gif-formatted files as an option while package TWO accepts postscript format. The images can then be readily converted following the sequence ONE → gif → xv → ps → TWO. Six frequently used packages will be mentioned here: ANALYZE is a package developed by the group of Richard Robb at the Mayo Clinic (Biomedical Imaging Resource, Mayo Foundation, 200 First Street, Rochester, MN 55905). It is mainly designed for radiological applications but has also found use in confocal light microscopy. However, it has many general features and options for data segmentation that might be very useful in 3D electron microscopy. AVS (Advanced Visual Systems, Inc., 300 Fifth Avenue, Waltham, MA 02154) is a general-purpose package that makes it very convenient to design visualizations through the use of icons for the data items and "pipelines" for processing pathways. EXPLORER, a package distributed by Silicon Graphics, Inc. (2011 North Shoreline Boulevard, Mountain View, CA 94039-7311), is based on a similar principle. Examples of AVS-based visualizations are provided by Stewart et al. (1991, 1993). Examples of EXPLORER-based visualizations are found in the work of Frank et al. (1995a, b). O (Jones et al., 1991) is the upgraded version of FRODO (Jones, 1978), which is frequently used in X-ray crystallography for molecular modeling and fitting.
Fig. A.1. Example of a hypertext-organized software documentation, presented using the Mosaic graphics interface tool. The page shown is the index page of all SPIDER operations; it was accessed from the "SPIDER Home Page." Each highlighted phrase (in the original color version, highlighted in blue) is a pointer to another documentation file, which is retrieved when the pointer is "clicked" with the mouse. Thus, clicking on "BC" will fetch the document describing the operation "box convolution," etc. Above the index is a bar of icons, each of which provides immediate access to another category of documentation, for example the warning triangle for explanations of error messages and the question mark at the end for help files.

Both models based on imported Protein Data Bank coordinates
and general 3D density maps can be simultaneously represented in stereo and fitted to one another interactively. Wire-mesh representations provide fast interactive operation and transparent display. For a 3D density map to be accepted by the "Map_file" option of O, it must be preprocessed by the "BRIX" program, which divides the 3D density distribution into "bricks" of equal size, a remnant from the time when computers had much smaller capacity. Interfacing of electron microscopy-oriented software therefore goes through the generation of BRIX-compatible files. Examples of fittings between X-ray structures created from PDB-formatted coordinates and EM-based reconstructions can be found in the works of Rayment et al. (1993), Schröder et al. (1993), and Frank et al. (1995a, b). VOXELVIEW (Vital Images, Inc., 505 North 4th Street, Fairfield, IA 52556) is most frequently used for volume rendering of complex biological objects obtained by electron tomography or confocal light microscopy. Views of the object are first computed in a "rotation series" around a predefined axis, stored in the frame buffer of the workstation, and then presented as a movie loop.
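The brick decomposition performed by BRIX can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the BRIX program itself: the real format also rescales densities to bytes and writes a specific file header, both of which are omitted here, and the brick size of 8 is an assumption.

```python
import numpy as np

def to_bricks(density, brick=8):
    """Divide a 3D density map into equal bricks, BRIX-fashion.

    Pads the map so each dimension is a multiple of the brick size, then
    returns an array of shape (nz, ny, nx, brick, brick, brick), i.e. a
    3D grid of small cubes that can be loaded independently.
    """
    pad = [(0, (-s) % brick) for s in density.shape]
    d = np.pad(density, pad)
    nz, ny, nx = (s // brick for s in d.shape)
    return (d.reshape(nz, brick, ny, brick, nx, brick)
             .transpose(0, 2, 4, 1, 3, 5))

vol = np.arange(20 * 17 * 9, dtype=float).reshape(20, 17, 9)
bricks = to_bricks(vol)   # a 3 x 3 x 2 grid of 8**3-voxel bricks
```

The point of the decomposition is that a display program need only hold the bricks intersecting the current view in memory, which is what made the format attractive on small machines.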
V. Documentation
Each operation is usually accompanied by a menu chapter which explains its usage and the basic philosophy. Apart from the menu chapters, there is an assortment of introductory material, examples of procedures, application notes, etc. The introduction of hypertext referencing has made it possible to string such a set of documents together into a network of texts and images. Moreover, the documentation can be made accessible by the serving laboratory through the World Wide Web and the Mosaic or Netscape graphics interfaces, so that it does not have to reside on the computer of the user group. Some of the packages (e.g., IMAGIC and SPIDER) are now set up for remote documentation access (see the example in Fig. A.1).
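The hypertext organization of Fig. A.1 amounts to generating an index page in which each operation name links to its own documentation file. The sketch below shows the idea; the operation list and file-naming scheme are invented for the example (only "BC," box convolution, is taken from the text) and do not reproduce SPIDER's actual documentation layout.

```python
# Illustrative operation menu: name -> one-line description.
operations = {
    "BC": "box convolution",
    "FT": "Fourier transform",
    "RT": "rotation",
}

def index_html(ops):
    """Build an index page linking each operation to its own document."""
    links = "".join(
        f'<li><a href="{name.lower()}.html">{name}</a>: {desc}</li>\n'
        for name, desc in sorted(ops.items())
    )
    return f"<html><body><h1>Operations</h1><ul>\n{links}</ul></body></html>"

page = index_html(operations)
```

Served over the Web, such generated pages let the documentation live on the server laboratory's machine while remaining one click away for any user group.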
This appendix includes publications prior to October 1, 1995. Where identical results are published both in abstract form and as a full paper, the abstract is not listed in most instances. The listing does not cover helical structures or virus particles reconstructed by Fourier-Bessel techniques. Bibliographies covering these objects are found in Stewart (1988) and Moody (1990) (for helical structures) and in the volume edited by Chiu et al. (1996) (for viruses). Reconstruction techniques as well as specimen preparation methods used in the various reconstructions are indicated by keys. Reviews are indicated by [r]. The meanings of the other keys are as follows:

Key to Data Collection and Reconstruction Methods
[1] Random-conical (Frank and Goldfarb, 1980; Radermacher et al., 1987a)
[2] Use of averaged projections, sinogram/angular reconstitution, and/or symmetries (van Heel, 1987b)
[3] Spherical harmonics synthesis (Provencher and Vogel, 1983)
[4] Abel transform (Steven et al., 1984)
[5] Tomographic (i.e., derived by tilting a single particle around a single axis; Hoppe et al., 1974)
[6] Tomographic, using symmetry (Vigers et al., 1986a)
[7] Tomographic, using more than one tilt axis (Penczek et al., 1995)
[8] Use of unaveraged high-dose projections and sinograms (Ottensmeyer et al., 1994)

Key to Specimen Preparation Methods
[ad] (unstained) air-dried
[atg] aurothioglucose-embedded
[bms] Butvar-film supported, methylamine tungstate stained
[cpd] (unstained) critical-point dried
[fh] frozen-hydrated ("ice-embedded")
[ns] negatively stained, uranyl acetate
[pe] positively stained, plastic-embedded

Appendix 2. Macromolecular Assemblies Reconstructed
Sample: reference(s), with the data collection and specimen preparation key in brackets.

α2-Macroglobulin, human
  Native: Larquet et al., 1994a, b [1fh]
  Transformed: Schroeter et al., 1991 [1ns]
  Chymotrypsin complex: Boisset et al., 1993a [1fh]
  Transformed + nanogold: Boisset et al., 1992a, 1994a [1fh]
Basal bodies/flagellar motors
  Caulobacter crescentus: Stallmeyer et al., 1989a [4ns]
  Salmonella typhimurium: Stallmeyer et al., 1989b [4ns]; Sosinsky et al., 1992 [4fh]; Francis et al., 1994 [4fh]
Calcium release channel/ryanodine receptor of skeletal muscle
  Rabbit: Wagenknecht et al., 1989a [1ns]; Radermacher et al., 1994a, b [1fh]; Franzini-Armstrong, 1994 [r]; Serysheva et al., 1995 [2fh]
Chaperonin groEL (see also Heat shock protein, below): Saibil et al., 1993 [6ns]; Saibil and Wood, 1993 [6ns]; Chen et al., 1994 [6fh]; Carazo et al., 1992 [1fh]
Clathrin cages: Vigers et al., 1986a, b [6fh]
Erythrocruorin: see Hemoglobin (worm)
Fatty acid synthetase: Hoppe et al., 1974 [5ns]; Stoops et al., 1992a [1bms]
Flagellar motors: see Basal bodies
Head-tail connector/portal protein
  Bacteriophage φ29: Tsuprun et al., 1994 [2ns]
  Bacteriophage SPP1: Dube et al., 1994 [2ns]
Heat shock protein (Pyrodictium occultum): Phipps et al., 1993 [1fh]
Helicases
  Bacteriophage T7: Egelman et al., 1995 [1ns]
  DnaB: San Martin et al., 1995 [1fh]
  RuvB branch migration protein (Escherichia coli): Stasiak et al., 1994 [2ns]
Hemocyanin
  Chiton (Lepidochiton sp.): Lambert et al., 1994a [1fh]
  Cuttlefish (Sepia officinalis): Lambert et al., 1995c [1fh]
  Horseshoe crab (Limulus polyphemus): van Heel et al., 1994 [2fh]
  Octopus (Octopus vulgaris): Lambert et al., 1994b [1fh]
  Protobranch bivalve mollusc (Nucula hanleyi): Lambert et al., 1995b [1fh]
  Roman snail (Helix pomatia): Lambert et al., 1995a [1fh]
  Scorpion (Androctonus australis)
    Native: Boisset et al., 1990b [1ns]
    Aa + 4Fab: Boisset et al., 1992b, 1993b [1ns]; Boisset et al., 1994b [1fh]; Boisset et al., 1995 [1fh]
Hemoglobin
  Worm (Ophelia bicornis): Cejka et al., 1992 [1ns, 1atg/fh]
  Worm (Lumbricus terrestris): Schatz, 1992 [2fh]; Schatz et al., 1994 [2fh]; Schatz et al., 1995 [2fh]
Klenow fragment of DNA polymerase: Ottensmeyer and Farrow, 1992 [2ad]
Macroglobulin: see α2-Macroglobulin
Nuclear pore complex (Xenopus): Hinshaw et al., 1992 [1ns]; Akey and Radermacher, 1993 [1fh]; Pante and Aebi, 1994 [r]
Nucleosome: Harauz and Ottensmeyer, 1984a, b
Portal protein: see Head-tail connector
Proteasomes (Thermoplasma acidophilum): Hegerl et al., 1991 [1ns]
Pyruvate dehydrogenase (Saccharomyces cerevisiae): Stoops et al., 1992b [1bms]
Ribosomes
  Eukaryotic
    40S: Verschoor et al., 1989 [1ns]; Srivastava et al., 1995 [1fh]
    40S + initiation factor eIF3: Srivastava et al., 1992a [1ns]
    80S: Verschoor and Frank, 1990 [2cpd]
  Prokaryotic
    30S: Knauer et al., 1983 [5ns]; Verschoor et al., 1984 [1ns]; Lata et al., 1995 [1fh]
    30S (rRNA): Beniac and Harauz, 1995 [2ns]
    50S: Radermacher et al., 1987a, b [1ns]; Radermacher, 1988 [1ns]; Vogel and Provencher, 1988 [3ns]; Oettl et al., 1983 [5ns]; Radermacher et al., 1992b [1fh]
    50S-L7/L12: Carazo et al., 1988 [1ns]
    50S + Fab(L18): Srivastava et al., 1992b [1ns]
    50S + Fab(L9): Srivastava et al., 1995 [1ns]
    50S-5S rRNA: Radermacher et al., 1990 [5ns]
    70S: Carazo et al., 1989 [1ns]; Wagenknecht et al., 1989b [1ns]; Frank et al., 1991, 1995a, b [1fh]; Penczek et al., 1992 [1fh]; Penczek et al., 1994 [1fh]; Stark et al., 1995 [2fh]; Öfverstedt et al., 1994 [5pe]
Signal recognition protein SRP54: Ottensmeyer et al., 1994 [8ad]; Czarnota et al., 1994 [8ad]
Bibliography
Adrian, M., Dubochet, J., Lepault, J., and McDowall, A. W. (1984). Cryo-electron microscopy of viruses. Nature 308, 32-36.
Aebi, U., Smith, P. R., Dubochet, J., Henry, C., and Kellenberger, E. (1973). A study of the structure of the T-layer of Bacillus brevis. J. Supramol. Struct. 1, 498-515.
Akey, C. W., and Edelstein, S. J. (1983). Equivalence of the projected structure of thin catalase crystals preserved for electron microscopy by negative stain, glucose or embedding in the presence of tannic acid. J. Mol. Biol. 163, 575-612.
Akey, C. W., and Radermacher, M. (1993). Architecture of the Xenopus nuclear pore complex revealed by three-dimensional cryo-electron microscopy. J. Cell Biol. 122, 1-19.
Al-Ali, L. (1976). Translational alignment of differently defocused micrographs using cross-correlation. In: "Developments in Electron Microscopy and Analysis" (J. A. Venables, Ed.). Academic Press, London.
Al-Ali, L., and Frank, J. (1980). Resolution estimation in electron microscopy. Optik 56, 31-40.
Alberts, B., et al. (1989). "Molecular Biology of the Cell," p. 8. Garland Publishing, New York.
Amos, L. A., Henderson, R., and Unwin, P. N. T. (1982). Three-dimensional structure determination by electron microscopy of two-dimensional crystals. Prog. Biophys. Mol. Biol. 39, 183-231.
Anderberg, M. R. (1973). "Cluster Analysis for Applications." Academic Press, New York.
Andreasen, N. C., Arndt, S., Swayze, V., II, Cizadlo, T., Flaum, M., O'Leary, D., Ehrhardt, J. C., and Yuh, W. T. C. (1994). Thalamic abnormalities in schizophrenia visualized through magnetic resonance image averaging. Science 266, 294-298.
Andrews, D. W., Yu, A. H. C., and Ottensmeyer, F. P. (1986). Automatic selection of molecular images from dark field electron micrographs. Ultramicroscopy 19, 1-14.
Arad, T., Piefke, J., Weinstein, S., Gewitz, H. S., Yonath, A., and Wittmann, H. G. (1987). Three-dimensional image reconstruction from ordered arrays of 70S ribosomes. Biochimie 69, 1001-1006.
Barth, M., Bryan, R. K., and Hegerl, R. (1989). Approximation of missing-cone data in 3D electron microscopy. Ultramicroscopy 31, 365-378.
Baumeister, W., and Hahn, M. (1975). Relevance of three-dimensional reconstructions of stain distributions for structural analysis of biomolecules. Hoppe-Seyler's Z. Physiol. Chem. 356, 1313-1316.
Beer, M., Frank, J., Hanszen, K.-J., Kellenberger, E., and Williams, R. C. (1975). The possibilities and prospects of obtaining high-resolution information (below 30 Å) on biological material using the electron microscope. Rev. Biophys. 7, 211-238.
Beer, M., and Moudrianakis, E. N. (1962). Determination of base sequence in nucleic acids with the electron microscope. III. Visibility of a marker: Chemistry and microscopy of guanine-labeled DNA. Proc. Natl. Acad. Sci. USA 48, 409-416.
Beniac, D. R., and Harauz, G. (1995). Structures of small subunit ribosomal RNAs in situ from Escherichia coli and Thermomyces lanuginosus. Mol. Cell. Biochem. 148, 165-181.
Benzecri, J. P. (1969a). In: "Methodologies of Pattern Recognition" (S. Watanabe, Ed.). Academic Press, New York.
Benzecri, J. P. (1969b). "L'Analyse des Données," Vol. 1. La Taxinomie. Dunod, Paris.
Berriman, J. A., and Unwin, P. N. T. (1994). Analysis of transient structures by cryo-electron microscopy combined with rapid mixing of spray droplets. Ultramicroscopy 56, 241-252.
Bijlholt, M. M. C., van Heel, M. G., and van Bruggen, E. F. J. (1982). Comparisons of 4 × 6-meric hemocyanins from three different arthropods using computer alignment and correspondence analysis. J. Mol. Biol. 161, 139-153.
Billiald, P., Lamy, J., Taveau, J. C., Motta, G., and Lamy, J. (1988). Mapping of six epitopes in haemocyanin subunit Aa6 by immunoelectron microscopy. Eur. J. Biochem. 175, 423-431.
Boekema, E. J. (1991). Negative staining of integral membrane proteins. Micron and Microscopica Acta 22, 361-369.
Boekema, E. J., and Boettcher, B. (1992). The structure of ATP synthase from chloroplasts: Conformational changes of CF1 studied by electron microscopy. Biochim. Biophys. Acta 1098, 131-143.
Boekema, E. J., and van Heel, M. (1989). Molecular shape of Lumbricus terrestris erythrocruorin studied by electron microscopy and image analysis. Biochim. Biophys. Acta 957, 370-379.
Boekema, E. J., Berden, J. A., and van Heel, M. G. (1986). Structure of mitochondrial F1-ATPase studied by electron microscopy and image processing. Biochim. Biophys. Acta 851, 353-360.
Boisset, N., Frank, J., Taveau, J. C., Billiald, P., Motta, G., Lamy, J., Sizaret, P. Y., and Lamy, N. (1988). Intramolecular localization of epitopes within an oligomeric protein by immunoelectron microscopy and image processing. Proteins 3, 161-183.
Boisset, N., Taveau, J. C., Pochon, F., Tardieu, A., Barray, M., Lamy, J. N., and Delain, E. (1989a). Image processing of proteinase- and methylamine-transformed human α2-macroglobulin. J. Biol. Chem. 264, 12046-12052.
Boisset, N., Taveau, J.-C., and Lamy, J. N. (1990a). An approach to the architecture of Scutigera coleoptrata haemocyanin by electron microscopy and image processing. Biol. Cell 68, 73-84.
Boisset, N., Taveau, J.-C., Lamy, N., Wagenknecht, T., Radermacher, M., and Frank, J. (1990b). Three-dimensional reconstruction of native Androctonus australis hemocyanin. J. Mol. Biol. 216, 743-760.
Boisset, N., Grassucci, R., Penczek, P., Delain, E., Pochon, F., Frank, J., and Lamy, J. N. (1992a). Three-dimensional reconstruction of a complex of human α2-macroglobulin with monomaleimido Nanogold (Au 1.4 nm) embedded in ice. J. Struct. Biol. 109, 39-45.
Boisset, N., Grassucci, R., Motta, G., Lamy, J., Radermacher, M., Taveau, J. C., Liu, W., Frank, J., and Lamy, J. N. (1992b). Three-dimensional immunoelectron microscopy of Androctonus australis hemocyanin: The location of monoclonal Fab fragments specific for subunit Aa6. In: "Proceedings of the 10th European Congress on Electron Microscopy" (A. Rios, J. M. Arias, L. Megias-Megias, and A. Lopez-Galindo, Eds.), Vol. 1, pp. 407-408. Secretariado de Publicaciones de la Universidad de Granada, Granada, Spain.
Boisset, N., Penczek, P., Pochon, F., Frank, J., and Lamy, J. (1993a). Three-dimensional architecture of human α2-macroglobulin transformed with methylamine. J. Mol. Biol. 232, 522-529.
Boisset, N., Radermacher, M., Grassucci, R., Taveau, J.-C., Liu, W., Lamy, J., Frank, J., and Lamy, J. N. (1993b). Three-dimensional immunoelectron microscopy of scorpion hemocyanin labeled with a monoclonal Fab fragment. J. Struct. Biol. 111, 234-244.
Boisset, N., Penczek, P., Pochon, F., Frank, J., and Lamy, J. (1994a). Three-dimensional reconstruction of human α2-macroglobulin and refinement of the localization of thiol ester bonds with monomaleimido Nanogold. Ann. N.Y. Acad. Sci. 737, 229-244.
Boisset, N., Taveau, J. C., Penczek, P., and Frank, J. (1994b). Three-dimensional reconstruction in vitreous ice of Androctonus australis hemocyanin labelled with a monoclonal Fab fragment. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 1, pp. 527-528. Les Editions de Physique, Les Ulis, France.
Boisset, N., Taveau, J. C., Penczek, P., and Frank, J. (1995). Three-dimensional reconstruction in vitreous ice of Androctonus australis hemocyanin labelled with a monoclonal Fab fragment. J. Struct. Biol. 115, 16-29.
Booy, F. P., and Pawley, J. B. (1992). Cryo-crinkling: What happens to carbon films on copper grids at low temperature. Ultramicroscopy 48, 273-280.
Borland, L., and van Heel, M. (1990). Classification of image data in conjugate representation spaces. J. Opt. Soc. Am. A 7, 601-610.
Born, M., and Wolf, E. (1975). "Principles of Optics," 5th ed. Pergamon, Oxford.
Boublik, M., Hellmann, W., and Kleinschmidt, A. K. (1977). Size and structure of E. coli ribosomes by electron microscopy. Cytobiology 14, 293-300.
Bracewell, R. N., and Riddle, A. C. (1967). Inversion of fan-beam scans in radio astronomy. Astrophys. J. 150, 427-434.
Braig, K., Simon, M., Furuya, F., Hainfeld, J. F., and Horwich, A. L. (1993). A polypeptide bound by chaperonin groEL is located within a central cavity. Proc. Natl. Acad. Sci. USA 90, 3978-3982.
Bremer, A., Henn, C., Engel, A., Baumeister, W., and Aebi, U. (1992). Has negative staining still a place in biomacromolecular electron microscopy? Ultramicroscopy 46, 85-111.
Brenner, S., and Horne, R. W. (1959). A negative staining method for high resolution electron microscopy of viruses. Biochim. Biophys. Acta 34, 103-110.
Bretaudiere, J. P., and Frank, J. (1986). Reconstitution of molecule images analysed by correspondence analysis: A tool for structural interpretation. J. Microsc. 144, 1-14.
Brimacombe, R. (1995). Ribosomal RNA: A three-dimensional jigsaw puzzle. Eur. J. Biochem. 230, 365-385.
Brink, J., and Chiu, W. (1994). Applications of a slow-scan CCD camera in protein electron crystallography. J. Struct. Biol. 113, 23-34.
Brink, J., Chiu, W., and Dougherty, M. (1992). Computer-controlled spot-scan imaging of crotoxin complex crystals with 400 keV electrons at near-atomic resolution. Ultramicroscopy 46, 229-240.
Bullough, P., and Henderson, R. (1987). Use of spot-scan procedure for recording low-dose micrographs of beam-sensitive specimens. Ultramicroscopy 21, 223-230.
Burge, R. E., and Scott, R. F. (1975). Electron microscope calibration by astigmatic images. Optik 43, 503-507.
Butt, H.-J., Wang, D. N., Hansma, P. K., and Kühlbrandt, W. (1991). Effect of surface roughness of carbon films on high-resolution electron diffraction of protein crystals. Ultramicroscopy 36, 307-318.
Carazo, J. M. (1992). The fidelity of 3D reconstruction from incomplete data and the use of restoration methods. In: "Electron Tomography" (J. Frank, Ed.), pp. 117-166. Plenum, New York.
Carazo, J. M., and Carrascosa, J. L. (1987a). Restoration of direct Fourier three-dimensional reconstructions of crystalline specimens by the method of convex projections. J. Microsc. 145, 159-177.
Carazo, J. M., and Carrascosa, J. L. (1987b). Information recovery in missing angular data cases: An approach by the convex projections method in three dimensions. J. Microsc. 145, 23-43.
Carazo, J. M., and Frank, J. (1988). Three-dimensional matching of macromolecular structures obtained from electron microscopy: An application to the 70S and 50S E. coli ribosomal particles. Ultramicroscopy 25, 13-22.
Carazo, J. M., Wagenknecht, T., Radermacher, M., Mandiyan, V., Boublik, M., and Frank, J. (1988). Three-dimensional structure of 50S E. coli ribosomal subunits depleted of proteins L7/L12. J. Mol. Biol. 201, 393-404.
Carazo, J. M., Wagenknecht, T., and Frank, J. (1989). Variations of the three-dimensional structure of the Escherichia coli ribosome in the range of overlap views. Biophys. J. 55, 465-477.
Carazo, J. M., Rivera, F. F., Zapata, E. L., Radermacher, M., and Frank, J. (1990). Fuzzy sets-based classification of electron microscopy images of biological macromolecules with an application to ribosomal particles. J. Microsc. 157, 187-203.
Carazo, J. M., Benavides, I., Rivera, F. F., and Zapata, E. (1992). Identification, classification, and 3D reconstruction on hypercube computers. Ultramicroscopy 40, 13-32.
Carazo, J. M., Marabini, R., Vaquerizo, C., and Frank, J. (1994). Towards a data-bank of three-dimensional macromolecular structures: A WWW-based prototype. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 1, pp. 519-520. Les Editions de Physique, Les Ulis, France.
Carrascosa, J. L., and Steven, A. C. (1979). A procedure for evaluation of significant structural differences between related arrays of protein molecules. Micron 9, 199-206.
Cejka, Z., Kleinz, J., Santini, C., Hegerl, R., and Ghiretti Magaldi, A. (1992). The molecular architecture of the extracellular hemoglobin of Ophelia bicornis: Analysis of individual molecules. J. Struct. Biol. 109, 52-60.
Chen, S., Roseman, A. M., Hunter, A. S., Wood, S. P., Burston, S. G., Ranson, N. A., Clarke, A. R., and Saibil, H. R. (1994). Location of a folding protein and shape changes in GroEL-GroES complexes imaged by cryo-electron microscopy. Nature 371, 261-264.
Cheng, R. H., Reddy, V. S., Olson, N. H., Fisher, A. J., Baker, T. S., and Johnson, J. E. (1994). Functional implications of quasi-equivalence in a T = 3 icosahedral animal virus established by cryo-electron microscopy and X-ray crystallography. Structure 2, 271-282.
Chiu, W. (1993). What does electron cryomicroscopy provide that X-ray crystallography and NMR spectroscopy cannot? Annu. Rev. Biophys. Biomol. Struct. 22, 233-255.
Chiu, W., Burnett, R., and Garcea, R. (1996). "Structural Biology of Viruses." Oxford Univ. Press, Oxford.
Conway, J. F., Trus, B. L., Booy, F. P., Newcomb, W. W., Brown, J. C., and Steven, A. C. (1993). The effects of radiation damage on the structure of frozen-hydrated HSV-1 capsids. J. Struct. Biol. 111, 222-233.
Crowther, R. A. (1971). Procedures for three-dimensional reconstruction of spherical viruses by Fourier synthesis from electron micrographs. Philos. Trans. R. Soc. Lond. B 261, 221-230.
Crowther, R. A. (1976). The interpretation of images reconstructed from electron micrographs of biological particles. In: "Proceedings of the Third John Innes Symposium" (R. Markham and R. W. Horne, Eds.), pp. 15-25. North-Holland, Amsterdam.
Crowther, R. A., and Amos, L. A. (1971). Three-dimensional image reconstructions of some small spherical viruses. Cold Spring Harbor Symp. Quant. Biol. 36, 489-494.
Crowther, R. A., DeRosier, D. J., and Klug, A. (1970). The reconstruction of a three-dimensional structure from projections and its application to electron microscopy. Proc. R. Soc. Lond. A 317, 319-340.
Cruickshank, D. W. J. (1959). "International Tables for X-Ray Crystallography," Vol. II, pp. 84-98. Kynoch Press, Birmingham, England.
Crum, J., Gruys, K. J., and Frey, T. G. (1994). Electron microscopy of cytochrome c oxidase crystals: Labeling of subunit III with a monomaleimide undecagold cluster compound. Biochemistry 33, 13719-13726.
Cyrklaff, M., and Kühlbrandt, W. (1994). High-resolution electron microscopy of biological specimens in cubic ice. Ultramicroscopy 55, 141-153.
Czarnota, G. J., Andrews, D. W., Farrow, N. A., and Ottensmeyer, F. P. (1994). A structure for the signal sequence binding protein SRP54: 3D reconstruction from STEM images of single molecules. J. Struct. Biol. 113, 35-46.
Dengler, J. (1989). A multi-resolution approach to the 3D reconstruction from an electron microscope tilt series solving the alignment problem without gold particles. Ultramicroscopy 30, 337-348.
DeRosier, D., and Klug, A. (1968). Reconstruction of three-dimensional structures from electron micrographs. Nature (London) 217, 130-134.
DeRosier, D. J., and Moore, P. B. (1970). Reconstruction of three-dimensional images from electron micrographs of structures with helical symmetry. J. Mol. Biol. 52, 355-369.
De Haas, F., Bijlholt, M. M. C., and van Bruggen, E. F. J. (1991). An electron microscopic study of two hexameric hemocyanins from the crab Cancer pagurus and the tarantula Eurypelma californicum: Determination of their quaternary structure using image processing and simulation models based on X-ray diffraction data. J. Struct. Biol. 107, 86-94.
De Haas, F., and van Bruggen, E. F. J. (1994). The interhexameric contacts in the four-hexameric hemocyanin from the tarantula Eurypelma californicum. J. Mol. Biol. 237, 464-478.
De Haas, F., van Breemen, J. F. L., Boekema, E. J., and Keegstra, W. (1993). Comparative electron microscopy and image analysis of oxy- and deoxy-hemocyanin from the spiny lobster Panulirus interruptus. Ultramicroscopy 49, 426-435.
De Jong, A. F., and van Dyck, D. (1993). Ultimate resolution and information in electron microscopy. II. The information limit of transmission electron microscopes. Ultramicroscopy 49, 66-80.
Di Francia, T. (1955). Resolution power and information. J. Opt. Soc. Am. 45, 497-501.
Diday, E. (1971). La méthode des nuées dynamiques. Rev. Stat. Appl. 19, 19-34.
Dierksen, K., Typke, D., Hegerl, R., Koster, A. J., and Baumeister, W. (1992). Towards automatic tomography. Ultramicroscopy 40, 71-87.
Dierksen, K., Typke, D., Hegerl, R., and Baumeister, W. (1993). Towards automatic tomography. II. Implementation of autofocus and low-dose procedures. Ultramicroscopy 49, 109-120.
Dover, S. D., Elliot, A., and Kernaghan, A. K. (1980). Three-dimensional reconstruction from images of tilted specimens: The paramyosin filament. J. Microsc. 122, 23-33.
Downing, K. H. (1991). Spot-scan imaging in TEM. Science 251, 53-59.
Downing, K. H. (1992). Automatic focus correction for spot-scan imaging of tilted specimens. Ultramicroscopy 46, 199-206.
Downing, K. H., and Grano, D. A. (1982). Analysis of photographic emulsions for electron microscopy of two-dimensional crystalline specimens. Ultramicroscopy 7, 381-404.
Downing, K. H., and Glaeser, R. M. (1986). Improvement in high resolution image quality of radiation-sensitive specimens achieved with reduced spot size of the electron beam. Ultramicroscopy 20, 269-278.
Downing, K. H., Koster, A. J., and Typke, D. (1992). Overview of computer-aided electron microscopy. Ultramicroscopy 46, 189-197.
Dryden, K. A., Wang, G., Yeager, M., Nibert, M. L., Coombs, K. M., Furlong, D. B., Fields, B. N., and Baker, T. S. (1993). Early steps in reovirus infection are associated with dramatic changes in supramolecular structure and protein conformation: Analysis of virions and subviral particles by cryoelectron microscopy and image reconstruction. J. Cell Biol. 122, 1023-1041.
Dube, P., Tavares, P., Lurz, R., and van Heel, M. (1993). The portal protein of bacteriophage SPP1: A DNA pump with 13-fold symmetry. EMBO J. 12, 1303-1309.
Dube, P., Tavares, P., Orlova, E. V., Zemlin, F., and van Heel, M. (1994). Three-dimensional structure of portal protein from bacteriophage SPP1. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 3, pp. 533-534. Les Editions de Physique, Les Ulis, France.
Dubochet, J., Lepault, J., Freeman, R., Berriman, J. A., and Homo, J.-C. (1982). Electron microscopy of frozen water and aqueous solutions. J. Microsc. 128, 219-237.
Dubochet, J., Adrian, M., Lepault, J., and McDowall, A. W. (1985). Cryo-electron microscopy of vitrified biological specimens. Trends Biochem. Sci. 10, 143-146.
Dubochet, J., Adrian, M., Chang, J.-J., Homo, J.-C., Lepault, J., McDowall, A. W., and Schultz, P. (1988). Cryo-electron microscopy of vitrified specimens. Rev. Biophys. 21, 129-228.
Duda, R., and Hart, P. (1973). "Pattern Classification and Scene Analysis," pp. 10-43, 189-260. Wiley, New York.
Egelman, E. (1986). An algorithm for straightening images of curved filamentous structures. Ultramicroscopy 19, 367-374.
Egelman, E. H., Yu, X., Wild, R., Hingorani, M. M., and Patel, S. S. (1995). T7 helicase/primase proteins form rings around single-stranded DNA that suggest a general structure for hexameric helicases. Proc. Natl. Acad. Sci. USA 92, 3869-3873.
Erickson, H. P., and Klug, A. (1970). The Fourier transform of an electron micrograph: Effects of defocussing and aberrations, and implications for the use of underfocus contrast enhancement. Ber. Bunsenges. Phys. Chem. 74, 1129-1137.
Etcoff, N. L. (1994). Beauty and the beholder. Nature 368, 186-187.
Fan, G., Mercurio, P., Young, S., and Ellisman, M. H. (1993). Telemicroscopy. Ultramicroscopy 52, 499-503.
Farrow, N. A., and Ottensmeyer, F. P. (1989). Maximum entropy methods and dark field microscopy images. Ultramicroscopy 31, 275-284.
Farrow, N. A., and Ottensmeyer, F. P. (1992). A posteriori determination of relative projection directions of arbitrarily oriented macromolecules. J. Opt. Soc. Am. A 9, 1749-1760.
Francis, N. R., Sosinsky, G. E., Thomas, D., and DeRosier, D. J. (1994). Isolation, characterization and structure of bacterial flagellar motors containing the switch complex. J. Mol. Biol. 235, 1261-1270.
Frank, J. (1969). Nachweis von Objektbewegungen im licht-optischen Diffraktogramm von elektronenmikroskopischen Aufnahmen. Optik 30, 171-180.
Frank, J. (1972a). Observation of the relative phases of electron microscopic phase contrast zones with the aid of the optical diffractometer. Optik 35, 608-612.
Frank, J. (1972b). Two-dimensional correlation functions in electron microscope image analysis. In: "Proceedings of the Fifth European Congress on Electron Microscopy, Manchester," pp. 622-623. The Institute of Physics, London.
Frank, J. (1972c). A study on heavy/light atom discrimination in bright field electron microscopy using the computer. Biophys. J. 12, 484-511.
Frank, J. (1973a). The envelope of electron microscopic transfer functions for partially coherent illumination. Optik 38, 519-536.
Frank, J. (1973b). Use of anomalous scattering for element discrimination. In: "Image Processing and Computer-Aided Design in Electron Optics" (P. W. Hawkes, Ed.), pp. 196-211. Academic Press, San Diego.
Frank, J. (1973c). Computer processing of electron micrographs. In: "Advanced Techniques in Biological Electron Microscopy" (J. K. Koehler, Ed.), pp. 215-274. Springer-Verlag, Berlin.
Frank, J. (1975). Averaging of low exposure electron micrographs of nonperiodic objects. Ultramicroscopy 1, 159-162.
Frank, J. (1976). Determination of source size and energy spread from electron micrographs using the method of Young's fringes. Optik 44, 379-391.
Frank, J. (1979). Image analysis in electron microscopy. J. Microsc. 117, 25-38.
Frank, J. (1980). The role of correlation techniques in computer image processing. In: "Computer Processing of Electron Microscope Images" (P. W. Hawkes, Ed.). Springer-Verlag, Berlin.
Frank, J. (1982). New methods for averaging non-periodic objects and distorted crystals in biologic electron microscopy. Optik 63, 67-89.
Frank, J. (1984a). The role of multivariate statistical analysis in solving the architecture of the Limulus polyphemus hemocyanin molecule. Ultramicroscopy 13, 153-164.
Frank, J. (1984b). Recent advances of image processing in the structural analysis of biological macromolecules. In: "Proceedings of the 8th European Congress on Electron Microscopy," Vol. 2, pp. 1307-1316. Electron Microscopy Foundation, Program Committee, Budapest.
Frank, J. (1985). Image analysis of single molecules. Electron Microsc. Rev. 7, 53-74.
Frank, J. (1989a). Three-dimensional imaging techniques in electron microscopy. BioTechniques 7, 164-173.
Frank, J. (1989b). Image analysis of single macromolecules. Electron Microsc. Rev. 2, 53-74.
Frank, J. (1990). Classification of macromolecular assemblies studied as "single particles." Rev. Biophys. 23, 281-329.
Frank, J. (1992a). Three-dimensional reconstruction at the molecular level. Microsc. Microanal. Microstruct. 3, 45-54.
Frank, J. (Ed.) (1992b). "Electron Tomography." Plenum, New York.
Frank, J., and Al-Ali, L. (1975). Signal-to-noise ratio of electron micrographs obtained by cross-correlation. Nature 256, 376-378.
Frank, J., and Goldfarb, W. (1980). Methods of averaging of single molecules and lattice fragments. In: "Electron Microscopy at Molecular Dimensions. State of the Art and Strategies for the Future" (W. Baumeister and W. Vogell, Eds.), pp. 154-160. Springer-Verlag, Berlin.
Frank, J., and Penczek, P. (1995). On the correction of the contrast transfer function in biological electron microscopy. Optik 98, 125-129.
Frank, J., and Radermacher, M. (1986). Three-dimensional reconstruction of non-periodic macromolecular assemblies from electron micrographs. In: "Advanced Techniques in Biological Electron Microscopy" (J. K. Koehler, Ed.), Vol. 3, pp. 1-72. Springer-Verlag, Berlin.
Frank, J., and Radermacher, M. (1992). Three-dimensional reconstruction of single particles negatively stained or in vitreous ice. Ultramicroscopy 46, 241-262.
Frank, J., and van Heel, M. (1982a). Correspondence analysis of aligned images of biological particles. J. Mol. Biol. 161, 134-137.
Frank, J., and van Heel, M. (1982b). Averaging techniques and correspondence analysis. In: "Proceedings of the 10th International Congress on Electron Microscopy," Vol. 1, pp. 107-114. Deutsche Gesellschaft für Elektronenmikroskopie, Frankfurt (Main).
Frank, J., and Verschoor, A. (1984). Masks for prescreening of molecule projections. J. Mol. Biol. 178, 696-698.
Frank, J., and Wagenknecht, T. (1984). Automatic selection of molecular images from electron micrographs. Ultramicroscopy 12, 169-176.
Frank, J., Bussler, P., Langer, R., and Hoppe, W. (1970). Einige Erfahrungen mit der rechnerischen Analyse und Synthese von elektronenmikroskopischen Bildern hoher Auflösung. Ber. Bunsenges. Phys. Chem. 74, 1105-1115.
Frank, J., Goldfarb, W., Eisenberg, D., and Baker, T. S. (1978a). Reconstruction of glutamine synthetase using computer averaging. Ultramicroscopy 3, 283-290.
Frank, J., McFarlane, S. C., and Downing, K. H. (1978b). A note on the effect of illumination aperture and defocus spread in bright field electron microscopy. Optik 52, 49-60.
Frank, J., Verschoor, A., and Boublik, M. (1981a). Computer averaging of electron micrographs of 40S ribosomal subunits. Science 214, 1353-1355.
Frank, J., Shimkin, B., and Dowse, H. (1981b). SPIDER--A modular software system for electron image processing. Ultramicroscopy 6, 343-358.
Frank, J., Verschoor, A., and Boublik, M. (1982). Multivariate statistical analysis of ribosome electron micrographs. J. Mol. Biol. 161, 107-137.
Frank, J., Radermacher, M., Wagenknecht, T., and Verschoor, A. (1986). A new method for three-dimensional reconstruction of single macromolecules using low dose electron micrographs. Ann. N.Y. Acad. Sci. 483, 77-87.
Frank, J., Verschoor, A., and Wagenknecht, T. (1985). Computer processing of electron microscopic images of single macromolecules. In: "New Methodologies in Studies of Protein Configuration" (T. T. Wu, Ed.), pp. 36-89. Van Nostrand-Reinhold, New York.
Frank, J., Radermacher, M., Wagenknecht, T., and Verschoor, A. (1988a). Studying ribosome structure by electron microscopy and computer image processing. Methods Enzymol. 164, 3-35.
Frank, J., Bretaudiere, J. P., Carazo, J. M., Verschoor, A., and Wagenknecht, T. (1988b). Classification of images of biomolecular assemblies: A study of ribosomes and ribosomal subunits of Escherichia coli. J. Microsc. 150, 99-115.
Frank, J., Penczek, P., Grassucci, R., and Srivastava, S. (1991). Three-dimensional reconstruction of the 70S E. coli ribosome in ice: The distribution of ribosomal RNA. J. Cell Biol. 115, 597-605.
Frank, J., Penczek, P., and Liu, W. (1992). Alignment, classification, and three-dimensional reconstruction of single particles embedded in ice. In: "Scanning Microscopy Supplement 6: Proceedings of the Tenth Pfefferkorn Conference, Cambridge University, England, September 1992" (P. W. Hawkes, Ed.), pp. 11-22. Scanning International, Chicago.
Frank, J., Chiu, W., and Henderson, R. (1993). Flopping polypeptide chains and Suleika's subtle imperfections: Analysis of variations in the electron micrograph of a purple membrane crystal. Ultramicroscopy 49, 387-396.
Frank, J., Zhu, J., Penczek, P., Li, Y., Srivastava, S., Verschoor, A., Radermacher, M., Grassucci, R., Lata, R. K., and Agrawal, R. K. (1995a). A model of protein synthesis based on cryo-electron microscopy of the E. coli ribosome. Nature 376, 441-444.
Frank, J., Verschoor, A., Li, Y., Zhu, J., Lata, R. K., Radermacher, M., Penczek, P., Grassucci, R., Agrawal, R. K., and Srivastava, S. (1995b). A model of the translational apparatus based on three-dimensional reconstruction of the E. coli ribosome. Biochem. Cell Biol., in press.
Frank, J., Radermacher, M., Penczek, P., Zhu, J., Li, Y., Ladjadj, M., and Leith, A. (1995c). SPIDER and WEB: Processing and visualization of images in 3D electron microscopy and related fields. J. Struct. Biol., in press.
Franzini-Armstrong, C. (1994). Unraveling the ryanodine receptor. Biophys. J. 67, 2135-2136.
Fraser, R. D. B., Furlong, D. B., Trus, B. L., Nibert, M. L., Fields, B. N., and Steven, A. C. (1990). Molecular structure of the cell-attachment protein of reovirus: Correlation averaging of computer-processed electron micrographs with sequence-based predictions. J. Virol. 64, 2990-3000.
Furcinitti, P. S., van Oostrum, J., and Burnett, R. M. (1989). Adenovirus polypeptide IX revealed as capsid cement by difference images from electron microscopy and crystallography. EMBO J. 8, 3563-3570.
Galton, F. J. (1878). Nature 18, 97-100.
Gerchberg, R. W. (1974). Super-resolution through error energy reduction. Opt. Acta 21, 709-720.
Gerchberg, R. W., and Saxton, W. O. (1971). A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik 34, 275-284.
Gilbert, P. F. C. (1972). The reconstruction of a three-dimensional structure from projections and its application to electron microscopy. II. Direct methods. Proc. R. Soc. London B 182, 89-117.
Glaeser, R. M. (1971). Limitations to significant information in biological electron microscopy as a result of radiation damage. J. Ultrastruct. Res. 36, 466-482.
Glaeser, R. M. (1985). Electron crystallography of biological macromolecules. Annu. Rev. Phys. Chem. 36, 243-275.
Glaeser, R. M. (1992a). Specimen flatness of thin crystalline arrays: Influence of the substrate. Ultramicroscopy 46, 33-43.
Glaeser, R. M. (1992b). Cooling-induced wrinkling of thin crystals of biological macromolecules can be prevented by using molybdenum grids. In: "Proceedings of the 50th Annual Meeting, EMSA (Boston)," pp. 520-521.
Glaeser, R. M., and Downing, K. H. (1992). Assessment of resolution in biological electron microscopy. Ultramicroscopy 47, 256-265.
Glaeser, R. M., and Taylor, K. A. (1978). Radiation damage relative to transmission electron microscopy of biological specimens at low temperature: A review. J. Microsc. 112, 127-138.
Glaeser, R. M., Kuo, I., and Budinger, T. F. (1971). Method for processing of periodic images at reduced levels of electron radiation. In: "Proceedings of the 29th Annual Meeting, EMSA," pp. 466-467.
Gogol, E. P., Johnston, E., Aggeler, R., and Capaldi, R. A. (1990). Ligand-dependent structural variations in Escherichia coli F1 ATPase revealed by cryoelectron microscopy. Proc. Natl. Acad. Sci. USA 87, 9585-9589.
Goncharov, A. B., Vainshtein, B. K., Ryskin, A. I., and Vagin, A. A. (1987). Three-dimensional reconstruction of arbitrarily oriented particles from their electron photomicrographs. Sov. Phys. Crystallogr. 32, 504-509.
Goodman, J. W. (1968). "Introduction to Fourier Optics." McGraw-Hill, New York.
Gordon, R., Bender, R., and Herman, G. T. (1970). Algebraic reconstruction techniques. J. Theor. Biol. 29, 471-482.
Guckenberger, R. (1982). Determination of a common origin in the micrographs of tilt series in three-dimensional electron microscopy. Ultramicroscopy 9, 167-174.
Hänicke, W. (1981). "Mathematische Methoden zur Aufbereitung elektronenmikroskopischer Bilder." Thesis, Universität Göttingen.
Hänicke, W., Frank, J., and Zingsheim, H. P. (1984). Statistical significance of molecule projections by single particle averaging. J. Microsc. 133, 223-238.
Hainfeld, J. F. (1992). Site-specific cluster labels. Ultramicroscopy 46, 135-144.
Hainfeld, J. F., and Furuya, F. R. (1992). A 1.4 nm gold cluster covalently attached to antibodies improves immunolabelling. J. Histochem. Cytochem. 40, 177-184.
Hanszen, K.-J. (1971). The optical transfer theory of the electron microscope: Fundamental principles and applications. Adv. Opt. Electron Microsc. 4, 1-84.
Hanszen, K.-J., and Trepte, L. (1971). The contrast transfer of the electron microscope with partially coherent illumination. A. The ring condenser. Optik 33, 166-181. B. Disc-shaped source. Optik 33, 182-198.
Harauz, G. (1990). Representation of rotations by unit quaternions. Ultramicroscopy 33, 209-213.
Harauz, G., and Chiu, D. K. Y. (1991). Covering events in eigenimages of biomolecules. Ultramicroscopy 38, 307-317.
Harauz, G., and Chiu, D. K. Y. (1993). Complementary applications of correspondence analysis and event covering of noisy image sequences. Optik 95, 1-8.
Harauz, G., and Fong-Lochovsky, A. (1989). Automatic selection of macromolecules from electron micrographs by component labelling and symbolic processing. Ultramicroscopy 31, 333-344.
Harauz, G., and Ottensmeyer, F. P. (1984a). Direct three-dimensional reconstruction for macromolecular complexes from electron micrographs. Ultramicroscopy 12, 309-320.
Harauz, G., and Ottensmeyer, F. P. (1984b). Nucleosome reconstruction via phosphorus mapping. Science 226, 936-940.
Harauz, G., and van Heel, M. (1986a). Exact filters for general geometry three-dimensional reconstruction. Optik 73, 146-156.
Harauz, G., and van Heel, M. (1986b). Direct 3D reconstruction from projections with initially unknown angles. In: "Pattern Recognition in Practice II" (E. S. Gelsema and L. N. Kanal, Eds.), pp. 279-288. North-Holland, Elsevier, Amsterdam.
Harauz, G., Stöffler-Meilicke, M., and van Heel, M. (1987). Characteristic views of prokaryotic 50S ribosomal subunits. J. Mol. Evol. 26, 347-357.
Harauz, G., Boekema, E., and van Heel, M. (1988). Statistical image analysis of electron micrographs of ribosomal subunits. Methods Enzymol. 164, 35-49.
Harauz, G., Chiu, D. K. Y., MacAulay, C., and Palcic, B. (1994). Probabilistic inference in computer-aided screening for cervical cancer: An event covering approach to information extraction and decision rule formulation. Anal. Cell. Pathol. 6, 37-50.
Harris, R., and Horne, R. (1991). Negative staining. In: "Electron Microscopy in Biology" (J. R. Harris, Ed.), pp. 203-228. IRL Press at Oxford Univ. Press, Oxford.
Hawkes, P. W. (1980). Image processing based on the linear transfer theory of image formation. In: "Computer Processing of Electron Microscope Images," pp. 1-33. Springer-Verlag, Berlin.
Hawkes, P. W. (1992). The electron microscope as a structure projector. In: "Electron Tomography" (J. Frank, Ed.), pp. 17-38. Plenum, New York.
Hawkes, P. W. (1993). Reflections on the algebraic manipulation of sets of electron images or spectra. Optik 93, 149-154.
Hawkes, P. W., and Kasper, E. (1994). "Principles of Electron Optics," Vol. 3: Wave Optics. Academic Press, London.
Hayward, S. B., and Glaeser, R. M. (1979). Radiation damage of purple membrane at low temperature. Ultramicroscopy 4, 201-210.
Hegerl, R. (1992). A brief survey of software packages for image processing in biological electron microscopy. Ultramicroscopy 47, 417-423.
Hegerl, R., and Hoppe, W. (1976). Influence of electron noise on three-dimensional image reconstruction. Z. Naturforsch. 31a, 1717-1721.
Hegerl, R., and Altbauer, A. (1982). The "EM" program system. Ultramicroscopy 9, 109-116.
Hegerl, R., Pfeifer, G., Pühler, G., Dahlmann, B., and Baumeister, W. (1991). The three-dimensional structure of proteasomes from Thermoplasma acidophilum as determined by electron microscopy using random conical tilting. FEBS Lett. 283, 117-121.
Henderson, R. (1992). Image contrast in high-resolution electron microscopy of biological specimens: TMV in ice. Ultramicroscopy 46, 1-18.
Henderson, R. (1995). The potential and limitations of neutrons, electrons, and X-rays for atomic resolution microscopy of unstained biological molecules. Rev. Biophys. 28, 171-193.
Henderson, R., and Glaeser, R. M. (1985). Quantitative analysis of image contrast in electron micrographs of beam-sensitive crystals. Ultramicroscopy 16, 139-150.
Henderson, R., and Unwin, P. N. T. (1975). Three-dimensional model of purple membrane obtained by electron microscopy. Nature 257, 28-32.
Henderson, R., Baldwin, J. M., Downing, K. H., and Zemlin, F. (1986). Structure of purple membrane from Halobacterium halobium:
Recording, measurement and evaluation of electron micrographs at 3.5 A resolution. Ultramicroscopy 19, 147-178.
Henderson, R., Baldwin, J. M., Ceska, T. A., Zemlin, F., Beckmann, E., and Downing, K. H. (1990). Model of the structure of bacteriorhodopsin based on high-resolution electron cryomicroscopy. J. Mol. Biol. 213, 899-929.
Herman, G. T. (1980). "Image Reconstruction from Projections: The Fundamentals of Computerized Tomography." Academic Press, New York.
Hinshaw, J. E., Carragher, B. O., and Milligan, R. A. (1992). Architecture and design of the nuclear pore complex. Cell 69, 1133-1141.
Hollander, M., and Wolfe, D. A. (1973). "Nonparametric Statistical Methods." Wiley, New York.
Hoppe, W. (1961). Ein neuer Weg zur Erhöhung des Auflösungsvermögens des Elektronenmikroskops. Naturwissenschaften 48, 736-737.
Hoppe, W. (1969). Das Endlichkeitspostulat und das Interpolationstheorem der dreidimensionalen elektronenmikroskopischen Analyse aperiodischer Strukturen. Optik 29, 617-621.
Hoppe, W. (1972). Drei-dimensional abbildende Elektronenmikroskope. Z. Naturforsch. 27a, 919-929.
Hoppe, W. (1974). Towards three-dimensional "electron microscopy" at atomic resolution. Naturwissenschaften 61, 239-249.
Hoppe, W. (1981). Three-dimensional electron microscopy. Annu. Rev. Biophys. Bioeng. 10, 563-592.
Hoppe, W. (1983). Elektronenbeugung mit dem Transmissions-Elektronenmikroskop als phasenbestimmendem Diffraktometer--von der Ortsfrequenzfilterung zur dreidimensionalen Strukturanalyse an Ribosomen. Angew. Chem. 95, 465-494.
Hoppe, W., and Hegerl, R. (1980). Three-dimensional structure determination by electron microscopy (nonperiodic specimens). In: "Computer Processing of Electron Microscope Images" (P. W. Hawkes, Ed.), pp. 127-185. Springer-Verlag, Berlin, New York.
Hoppe, W., and Hegerl, R. (1981). Some remarks concerning the influence of electron noise on 3D reconstruction. Ultramicroscopy 6, 205-206.
Hoppe, W., Langer, R., Knesch, G., and Poppe, C. (1968). Protein-Kristallstrukturanalyse mit Elektronenstrahlen. Naturwissenschaften 55, 333-336.
Hoppe, W., Langer, R., Frank, J., and Feltynowski, A. (1969). Bilddifferenzverfahren in der Elektronenmikroskopie. Naturwissenschaften 56, 267-272.
Hoppe, W., Gassmann, J., Hunsmann, N., Schramm, H. J., and Sturm, M. (1974). Three-dimensional reconstruction of individual negatively
stained fatty-acid synthetase molecules from tilt series in the electron microscope. Hoppe-Seyler's Z. Physiol. Chem. 355, 1483-1487.
Hoppe, W., Schramm, H. J., Sturm, M., Hunsmann, N., and Gassmann, J. (1976). Three-dimensional electron microscopy of individual biological objects. I. Methods. Z. Naturforsch. A 31, 645-655.
Hu, J. J., and Li, F. H. (1991). Maximum entropy image deconvolution in high resolution electron microscopy. Ultramicroscopy 35, 339-350.
Hunt, B. R. (1973). The application of constrained least squares estimation to image restoration by digital computer. IEEE Trans. Comput. 22, 805-812.
Hutchinson, G., Tichelaar, W., Weiss, H., and Leonard, K. (1990). Electron microscopic characterization of helical filaments formed by subunits I and II (core proteins) of ubiquinol:Cytochrome c reductase from Neurospora mitochondria. J. Struct. Biol. 103, 75-88.
Jap, B. (1989). Molecular design of PhoE porin and its functional consequences. J. Mol. Biol. 205, 407-419.
Jap, B. (1991). Structural architecture of an outer membrane channel as determined by electron crystallography. Nature (London) 350, 167-170.
Jap, B. K., Zulauf, M., Scheybani, T., Hefti, A., Baumeister, W., Aebi, U., and Engel, A. (1992). 2D crystallization: From art to science. Ultramicroscopy 46, 45-84.
Jeng, T. W., Crowther, R. A., Stubbs, G., and Chiu, W. (1989). Visualization of alpha-helices in TMV by cryo-electron microscopy. J. Mol. Biol. 205, 251-257.
Jenkins, G. M., and Watts, D. G. (1968). "Spectral Analysis and Its Applications." Holden-Day, Oakland, CA.
Johansen, B. V. (1975). Optical diffractometry. In: "Principles and Techniques of Electron Microscopy: Biological Applications" (M. A. Hayat, Ed.), Vol. 5, pp. 114-173. Van Nostrand-Reinhold, New York.
Jones, T. A. (1978). A graphics model building and refinement system for macromolecules. J. Appl. Crystallogr. 11, 268-272.
Jones, T. A., Zou, J.-Y., Cowan, S. W., and Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A 47, 110-119.
Kam, Z. (1980). The reconstruction of structure from electron micrographs of randomly oriented particles. J. Theor. Biol. 82, 15-39.
Kaczmarz, S. (1937). Angenäherte Auflösung von Systemen linearer Gleichungen. Bull. Acad. Pol. Sci. Lett. A 35, 355-357.
Kellenberger, E., and Kistler, J. (1979). The physics of specimen preparation. In: "Advances in Structure Research by Diffraction Methods"
(W. Hoppe and R. Mason, Eds.), Vol. 3, pp. 49-79. Vieweg, Wiesbaden.
Kellenberger, E., Häner, M., and Wurtz, M. (1982). The wrapping phenomenon in air-dried and negatively stained preparations. Ultramicroscopy 9, 139-150.
Kessel, M., Radermacher, M., and Frank, J. (1985). The structure of the stalk surface layer of a brine pond microorganism: Correlation averaging applied to a double layered lattice structure. J. Microsc. 139, 63-74.
Kinder, E., and Süffert, F. (1943). Biol. Zentr. 63, 268.
Kirkland, E. J., Siegel, B. M., Uyeda, N., and Fujiyoshi, Y. (1980). Digital reconstruction of bright field phase contrast images from high resolution electron micrographs. Ultramicroscopy 5, 479-503.
Kirkland, E. J. (1984). Improved high resolution image processing of bright field electron micrographs. Ultramicroscopy 15, 151-172.
Klug, A. (1983). From macromolecules to biological assemblies (Nobel lecture). Angew. Chem. 22, 565-636.
Klug, A., and Berger, J. E. (1964). An optical method for the analysis of periodicities in electron micrographs, and some observations on the mechanism of negative staining. J. Mol. Biol. 10, 565-569.
Klug, A., and Crowther, R. A. (1972). Three-dimensional image reconstruction from the viewpoint of information theory. Nature 238, 435-440.
Klug, A., and DeRosier, D. J. (1966). Optical filtering of electron micrographs: Reconstruction of one-sided images. Nature (London) 212, 29-32.
Knauer, V., Hegerl, R., and Hoppe, W. (1983). Three-dimensional reconstruction and averaging of 30S ribosomal subunits of Escherichia coli from electron micrographs. J. Mol. Biol. 163, 409-430.
Kohonen, T. (1990). The self-organizing map. Proc. IEEE 78, 1464-1480.
Koller, T., Beer, M., Müller, M., and Mühlethaler, M. (1971). Electron microscopy of selectively stained molecules. Cytobiologie 4, 369-408.
Kornberg, R., and Darst, S. A. (1991). Two-dimensional crystals of proteins on liquid layers. Curr. Opinion Struct. Biol. 1, 642-646.
Koster, A. J., de Ruijter, W. J., van den Bos, A., and van der Mast, K. D. (1989). Autotuning of a TEM using minimum electron dose. Ultramicroscopy 27, 251-272.
Koster, A. J., Typke, D., and de Jong, M. J. C. (1990). Fast and accurate autotuning of a TEM for high resolution and low dose electron microscopy. In: "Proceedings of the XII International Congress for Electron Microscopy," pp. 114-115. San Francisco Press, San Francisco.
Koster, A. J., Chen, J. W., Sedat, J. W., and Agard, D. A. (1992). Automated microscopy for electron tomography. Ultramicroscopy 46, 207-227.
Krakow, W., Downing, K. H., and Siegel, B. M. (1974). The use of tilted specimens to obtain the contrast transfer characteristics of an electron microscopy imaging system. Optik 40, 1-13.
Krivanek, O. L., and Ahn, C. (1986). Energy-filtered imaging with quadrupole lenses. In: "Electron Microscopy 1986, Proceedings of the XI International Congress on Electron Microscopy" (T. Imura, S. Maruse, and T. Suzuki, Eds.), Vol. 1, pp. 519-520. The Japanese Society of Electron Microscopy, Tokyo, Japan.
Krivanek, O. L., and Mooney, P. E. (1993). Applications of slow-scan CCD cameras in transmission electron microscopy. Ultramicroscopy 49, 95-108.
Kübler, O., Hahn, M., and Serendynski, J. (1978). Optical and digital spatial frequency filtering of electron micrographs. I. Theoretical considerations. Optik 51, 171-188. II. Experimental results. Optik 51, 235-256.
Kuo, I., and Glaeser, R. M. (1975). Development of methodology for low exposure, high resolution electron microscopy of biological specimens. Ultramicroscopy 1, 53-66.
Kühlbrandt, W. (1982). Discrimination of protein and nucleic acids by electron microscopy using contrast variation. Ultramicroscopy 7, 221-232.
Kühlbrandt, W., and Downing, K. H. (1989). Two-dimensional structure of plant light-harvesting complex at 3.7 A resolution by electron crystallography. J. Mol. Biol. 207, 823-828.
Kühlbrandt, W., and Wang, D. N. (1991). 3-dimensional structure of plant light harvesting complex determined by electron crystallography. Nature (London) 350, 130-134.
Kühlbrandt, W., Wang, D. N., and Fujiyoshi, Y. (1994). Atomic model of plant light-harvesting complex by electron crystallography. Nature 367, 614-621.
Kunath, W., Weiss, K., Sack-Kongehl, H., Kessel, M., and Zeitler, E. (1984). Time-resolved low-dose microscopy of glutamine synthetase molecules. Ultramicroscopy 13, 241-252.
Lake, J. (1971). Biological structures. In: "Optical Transforms" (H. Lipson, Ed.), p. 174. Academic Press, London.
Lambert, O., Boisset, N., Taveau, J.-C., and Lamy, J. N. (1994a). Three-dimensional reconstruction from frozen-hydrated specimen of the chiton Lepidochiton sp. hemocyanin. J. Mol. Biol. 244, 640-647.
Lambert, O., Boisset, N., Penczek, P., Lamy, J., Taveau, J. C., Frank, J., and Lamy, J. N. (1994b). Quaternary structure of Octopus vulgaris hemocyanin: Three-dimensional reconstruction from frozen-hydrated specimens and intramolecular location of functional units Ove and Ovb. J. Mol. Biol. 238, 75-87.
Lambert, O., Boisset, N., Taveau, J.-C., Preaux, G., and Lamy, J. N. (1995a). Three-dimensional reconstruction of the αD- and βc-hemocyanins of Helix pomatia from frozen-hydrated specimens. J. Mol. Biol. 248, 431-448.
Lambert, O., Taveau, J.-C., Boisset, N., and Lamy, J. N. (1995b). Three-dimensional reconstruction of the hemocyanin of the protobranch bivalve mollusc Nucula hanleyi from frozen-hydrated specimens. Arch. Biochem. Biophys. 319, 231-243.
Lambert, O., Boisset, N., Taveau, J.-C., and Lamy, J. N. (1995c). Three-dimensional reconstruction of Sepia officinalis hemocyanin from frozen-hydrated specimens. Arch. Biochem. Biophys. 316, 950-959.
Lamy, J. (1987). Intramolecular localization of antigenic determinants by molecular immunoelectron microscopy. In: "Biological Organization: Macromolecular Interactions at High Resolution" (R. M. Burnett and H. J. Vogel, Eds.), pp. 153-191. Academic Press, San Diego.
Lamy, J., Sizaret, P.-Y., Frank, J., Verschoor, A., Feldmann, R., and Bonaventura, J. (1982). Architecture of Limulus polyphemus hemocyanins. Biochemistry 21, 6825-6833.
Lamy, J., Lamy, J., Billiald, P., Sizaret, P.-Y., Cavé, G., Frank, J., and Motta, G. (1985). Approach to the direct intramolecular localization of antigenic determinants in Androctonus australis hemocyanin with monoclonal antibodies by molecular immunoelectron microscopy. Biochemistry 24, 5532-5542.
Lamy, J., Billiald, P., Taveau, J.-C., Boisset, N., Motta, G., and Lamy, J. N. (1990). Topological mapping of 13 epitopes on a subunit of Androctonus australis hemocyanin. J. Struct. Biol. 103, 64-74.
Lamy, J., Gielens, C., Lambert, O., Taveau, J. C., Motta, G., Loncke, P., De Geest, N., Preaux, G., and Lamy, J. (1993). Further approaches to the quaternary structure of Octopus hemocyanin: A model based on immunoelectron microscopy and image processing. Arch. Biochem. Biophys. 305, 17-29.
Langer, R., and Hoppe, W. (1966). Die Erhöhung der Auflösung und Kontrast im Elektronenmikroskop mit Zonenkorrekturplatten. Optik 24, 470-489.
Langer, R., Frank, J., Feltynowski, A., and Hoppe, W. (1970). Anwendung des Bilddifferenzverfahrens auf die Untersuchung von Strukturänderungen dünner Kohlefolien bei Elektronenbestrahlung. Ber. Bunsenges. Phys. Chem. 74, 1120-1126.
Langmore, J., and Smith, M. (1992). Quantitative energy-filtered microscopy of biological molecules in ice. Ultramicroscopy 46, 349-373.
Lanio, S. (1986). High-resolution imaging magnetic filter with simple structure. Optik 73, 99-107.
Lanzavecchia, S., Bellon, P. L., and Scatturin, V. (1993). SPARK, a kernel of software programs for spatial reconstruction in electron microscopy. J. Microsc. 171, 255-266.
Lanzavecchia, S., and Bellon, P. L. (1994). A moving window Shannon reconstruction algorithm for image interpolation. J. Vis. Commun. Image Repres. 5, 255-264.
Larquet, E., Boisset, N., Pochon, F., and Lamy, J. (1994a). Three-dimensional cryoelectron microscopy of native human α2-macroglobulin. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 3A, pp. 529-530. Les Editions de Physique, Les Ulis, France.
Larquet, E., Boisset, N., Pochon, F., and Lamy, J. (1995b). Architecture of native human alpha 2-macroglobulin studied by cryoelectron microscopy and three-dimensional reconstruction. J. Struct. Biol. 113, 87-98.
Lata, K. R., Penczek, P., and Frank, J. (1994). Automatic particle picking from electron micrographs. In: "Proceedings of the 52nd Annual Meeting MSA (New Orleans)" (G. W. Bailey and A. J. Garratt-Reed, Eds.), pp. 122-123. San Francisco Press, San Francisco.
Lata, K. R., Penczek, P., and Frank, J. (1995). Automated particle picking from electron micrographs. Ultramicroscopy 58, 381-391.
Lawrence, M. C., Jaffer, M. A., and Sewell, B. T. (1989). The application of the maximum entropy method to electron microscope tomography. Ultramicroscopy 31, 285-301.
Lebart, L., Morineau, A., and Tabard, N. (1977). "Techniques de la Description Statistique." Dunod, Paris.
Lebart, L., Morineau, A., and Warwick, K. M. (1984). "Multivariate Descriptive Statistical Analysis." Wiley, New York.
Leith, A. (1992). Computer visualization of volume data in electron tomography. In: "Electron Tomography" (J. Frank, Ed.), pp. 215-236. Plenum, New York.
Lenz, F. (1971). In: "Electron Microscopy in Material Science" (U. Valdre, Ed.). Academic Press, New York.
314
Bibliography
Leonard, K. R., and Lake, J. A. (1979). Ribosome structure: Hand determination by electron microscopy of 30S subunits. J. Mol. Biol. 129, 155-163.
Lepault, J., and Pitt, T. (1984). Projected structure of unstained, frozen-hydrated T-layer of Bacillus brevis. EMBO J. 3, 101-105.
Lepault, J., Booy, F. P., and Dubochet, J. (1983). Electron microscopy of frozen biological suspensions. J. Microsc. 129, 89-102.
Lim, V., Venclovas, C., Spirin, A., Brimacombe, R., Mitchell, P., and Müller, F. (1992). How are tRNAs and mRNA arranged in the ribosome? An attempt to correlate the stereochemistry of the tRNA-mRNA interaction with constraints imposed by the ribosomal topography. Nucleic Acids Res. 20, 2627-2637.
Liu, W. (1991). 3-D variance of weighted back-projection reconstruction and its application to the detection of 3-D particle conformational changes. In: "Proceedings of the 49th Annual Meeting, EMSA" (G. W. Bailey, Ed.), pp. 542-543. San Francisco Press, San Francisco.
Liu, W. (1993). "Three-Dimensional Variance of Weighted Back-Projection." Ph.D. thesis, State University of New York at Albany.
Liu, W., and Frank, J. (1995). Estimation of variance distribution in three-dimensional reconstruction. I: Theory. J. Opt. Soc. Am., in press.
Liu, W., Boisset, N., and Frank, J. (1995). Estimation of variance distribution in three-dimensional reconstruction. II: Applications. J. Opt. Soc. Am., in press.
Lutsch, G., Pleissner, K.-P., Wangermann, G., and Noll, F. (1977). Studies on the structure of animal ribosomes. VIII. Application of a digital image processing method to the enhancement of electron micrographs of small ribosomal subunits. Acta Biol. Med. Germ. 36, K-59.
Malhotra, A., and Harvey, S. C. (1994). A quantitative model of the Escherichia coli 16S RNA in the 30S ribosomal subunit. J. Mol. Biol. 240, 308-340.
Mannella, C., Marko, M., Penczek, P., Barnard, D., and Frank, J. (1994).
The internal compartmentation of rat-liver mitochondria: Tomographic study using high-voltage electron microscopy. Microsc. Res. Tech. 27, 278-283.
Marabini, R., and Carazo, J. M. (1994a). Pattern recognition and classification of images of biological macromolecules using artificial neural networks. Biophys. J. 66, 1804-1814.
Marabini, R., and Carazo, J. M. (1994b). Practical issues on invariant image averaging using the bispectrum. Signal Process. 40, 119-128.
Marabini, R., and Carazo, J. M. (1995). "On a New Computationally Fast Image Invariant Based on Bispectral Projections." Submitted for publication.
Marabini, R., Vaquerizo, C., Fernandez, J. J., Carazo, J. M., Ladjadj, M., Odesanya, O., and Frank, J. (1994). On a prototype for a new distributed data base of volume data obtained by 3D imaging. In: "Visualization in Biomedical Computing" (R. A. Robb, Ed.), pp. 466-472. Proc. SPIE 2359.
Marabini, R., Vaquerizo, C., Fernandez, J. J., Carazo, J. M., Engel, A., and Frank, J. (1995). Proposal for a new distributed data base of macromolecular and subcellular structures from different areas of microscopy. J. Struct. Biol., in press.
Markham, R., Frey, S., and Hills, G. J. (1963). Methods for the enhancement of image detail and accentuation of structure in electron microscopy. Virology 22, 88-102.
Markham, R., Hitchborn, J. H., Hills, G., and Frey, S. (1964). The anatomy of the tobacco mosaic virus. Virology 22, 342-359.
McDowall, A. W., Chang, J. J., Freeman, R., Lepault, J., Walter, C. A., and Dubochet, J. (1983). Electron microscopy of frozen-hydrated sections of vitreous ice and vitrified biological samples. J. Microsc. 131, 1-9.
Menetret, J.-F., Hofmann, W., Schröder, R. R., Rapp, G., and Goody, R. S. (1991). Time-resolved cryo-electron microscopic study of the dissociation of actomyosin induced by photolysis of photolabile nucleotides. J. Mol. Biol. 219, 139-144.
Mezzich, J. E., and Solomon, H. (1980). "Taxonomy and Behavioral Science." Academic Press, London.
Milligan, R. A., and Flicker, P. F. (1987). Structural relationships of actin, myosin, and tropomyosin revealed by cryo-electron microscopy. J. Cell Biol. 105, 29-39.
Möbus, G., and Rühle, M. (1993). A new procedure for the determination of the chromatic contrast transfer envelope of electron microscopes. Optik 93, 108-118.
Moody, M. F. (1967). Structure of the sheath of bacteriophage T4. I. The structure of the contracted sheath and polysheath. J. Mol. Biol. 25, 167-200.
Moody, M. F. (1990). Image analysis in electron microscopy. In: "Biophysical Electron Microscopy" (P. W. Hawkes and U. Valdré, Eds.), pp.
145-287. Academic Press, London.
Moore, P. B. (1995). Ribosomes seen through a glass less darkly. Structure 3, 851-852.
Namba, K., Caspar, D. L. D., and Stubbs, G. (1988). Enhancement and simplification of macromolecular images. Biophys. J. 53, 469-475.
Nathan, R. (1970). Computer enhancement of electron micrographs. In: "Proceedings of the 28th Annual Meeting, EMSA," pp. 28-29.
Natterer, F. (1986). "The Mathematics of Computerized Tomography." John Wiley & Sons, Stuttgart.
O'Brien, J. (1991). The New Yorker, Feb. 25, p. 37.
Öfverstedt, L.-G., Zhang, K., Tapio, S., Skoglund, U., and Isaksson, L. A. (1994). Starvation in vivo for aminoacyl-tRNA increases the spatial separation between the two ribosomal subunits. Cell 79, 629-638.
Öttl, H., Hegerl, R., and Hoppe, W. (1983). Three-dimensional reconstruction and averaging of 50S ribosomal subunits of Escherichia coli from electron micrographs. J. Mol. Biol. 163, 431-450.
O'Keefe, M. A. (1992). "Resolution" in high-resolution electron microscopy. Ultramicroscopy 47, 282-297.
O'Neill, E. L. (1969). "Introduction to Statistical Optics." Addison-Wesley, Reading, Mass.
Orlova, E., and van Heel, M. (1994). Angular reconstitution of macromolecules with arbitrary point-group symmetry. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 1, pp. 507-508. Les Editions de Physique, Les Ulis, France.
Ottensmeyer, F. P., and Farrow, N. A. (1992). Three-dimensional reconstruction from dark-field electron micrographs of macromolecules at random unknown angles. In: "Proceedings of the 50th Annual Meeting, EMSA" (G. W. Bailey, J. Bentley, and J. A. Small, Eds.), pp. 1058-1059. San Francisco Press, San Francisco.
Ottensmeyer, F. P., Schmidt, E. E., Jack, T., and Powell, J. (1972). Molecular architecture: The optical treatment of dark field electron micrographs of atoms. J. Ultrastruct. Res. 40, 546-555.
Ottensmeyer, F. P., Andrew, J. W., Bazett-Jones, D. P., Chan, A. S. K., and Hewitt, J. (1977). Signal to noise enhancement in dark field electron micrographs of vasopressin: Filtering of arrays of images in reciprocal space. J. Microsc. 109, 259-268.
Ottensmeyer, F. P., Czarnota, G. J., Andrews, D. W., and Farrow, N. A. (1994). Three-dimensional reconstruction of the 54 kDa signal recognition protein SRP54 from STEM darkfield images of the molecule at random orientations. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 1, pp.
509-510. Les Editions de Physique, Les Ulis, France.
Panté, N., and Aebi, U. (1994). Towards understanding the three-dimensional structure of the nuclear pore complex at the molecular level. Curr. Opinion Struct. Biol. 4, 187-196.
Penczek, P., Radermacher, M., and Frank, J. (1992). Three-dimensional reconstruction of single particles embedded in ice. Ultramicroscopy 40, 33-53.
Penczek, P., Grassucci, R. A., and Frank, J. (1994). The ribosome at improved resolution: New techniques for merging and orientation refinement in 3D cryoelectron microscopy of biological particles. Ultramicroscopy 53, 251-270.
Penczek, P., Zhu, J., and Frank, J. (1995). A common-lines based method for determining orientations simultaneously for N > 3 particle projections. Ultramicroscopy, in press.
Perrett, D. I., May, K. A., and Yoshikawa, S. (1994). Facial shape and judgements of female attractiveness. Nature 368, 239-242.
Phipps, B. M., Typke, D., Hegerl, R., Volker, S., Hoffmann, A., Stetter, K. O., and Baumeister, W. (1993). Structure of a molecular chaperone from a thermophilic archaebacterium. Nature 361, 475-477.
Provencher, S. W., and Vogel, R. H. (1983). Regularization techniques for inverse problems in molecular biology. In: "Progress in Scientific Computing" (S. Abarbanel, R. Glowinski, G. Golub, and H.-O. Kreiss, Eds.). Birkhäuser, Boston.
Provencher, S. W., and Vogel, R. H. (1988). Three-dimensional reconstruction from electron micrographs of disordered specimens. I. Method. Ultramicroscopy 25, 209-222.
Radermacher, M. (1980). Dreidimensionale Rekonstruktion bei kegelförmiger Kippung im Elektronenmikroskop. Thesis, Technical University, Munich.
Radermacher, M. (1988). The three-dimensional reconstruction of single particles from random and non-random tilt series. J. Electron Microsc. Tech. 9, 359-394.
Radermacher, M. (1991). Three-dimensional reconstruction of single particles in electron microscopy. In: "Image Analysis in Biology" (D.-P. Häder, Ed.), pp. 219-249. CRC Press, Boca Raton.
Radermacher, M. (1992). Weighted back-projection methods. In: "Electron Tomography" (J. Frank, Ed.). Plenum, New York.
Radermacher, M. (1994). Three-dimensional reconstruction from random projections: Orientational alignment via Radon transforms. Ultramicroscopy 53, 121-136.
Radermacher, M., and Frank, J. (1984). Representation of objects reconstructed in 3D by surfaces of equal density. J. Microsc. 136, 77-85.
Radermacher, M., and Frank, J. (1985). Use of nonlinear mapping in multivariate statistical analysis of molecule projections. Ultramicroscopy 17, 117-126.
Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J. (1986a). A new 3-D reconstruction scheme applied to the 50S ribosomal subunit of E. coli. J. Microsc. 141, RP1.
Radermacher, M., Frank, J., and Mannella, C. A. (1986b). Correlation averaging: Lattice separation and resolution assessment. In: "Proceedings of the 44th Annual Meeting, EMSA," pp. 140-143.
Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J. (1987a). Three-dimensional structure of the large ribosomal subunit from Escherichia coli. EMBO J. 6, 1107-1114.
Radermacher, M., Wagenknecht, T., Verschoor, A., and Frank, J. (1987b). Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit. J. Microsc. 146, 113-136.
Radermacher, M., Nowotny, V., Grassucci, R., and Frank, J. (1990). Three-dimensional image reconstruction of the 50S subunit from Escherichia coli ribosomes lacking 5S-rRNA. In: "Proceedings of the XII International Congress on Electron Microscopy," pp. 284-285. San Francisco Press, San Francisco.
Radermacher, M., Wagenknecht, T., Grassucci, R., Frank, J., Inui, M., Chadwick, C., and Fleischer, S. (1992a). Cryo-EM of the native structure of the calcium release channel/ryanodine receptor from sarcoplasmic reticulum. Biophys. J. 61, 936-940.
Radermacher, M., Srivastava, S., and Frank, J. (1992b). The structure of the 50S ribosomal subunit from E. coli in frozen hydrated preparation reconstructed with SECRET. In: "Proceedings of the 10th European Congress on Electron Microscopy," Vol. 3, pp. 19-20. Secretariado de Publicaciones de la Universidad Granada, Granada, Spain.
Radermacher, M., Rao, V., Wagenknecht, T., Grassucci, R., Frank, J., Timerman, A. P., and Frank, J. (1994a). Three-dimensional reconstruction of the calcium release channel by cryo electron microscopy. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 3A, pp. 551-552. Les Editions de Physique, Les Ulis, France.
Radermacher, M., Rao, V., Grassucci, R., Frank, J., Timerman, A. P., Fleischer, S., and Wagenknecht, T. (1994b). Cryo-electron microscopy and three-dimensional reconstruction of the calcium release channel/ryanodine receptor from skeletal muscle. J. Cell Biol. 127, 411-423.
Radon, J. (1917). Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten. Berichte über die Verhandlungen der Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig, Math.-Phys. Klasse 69, 262-277.
Rayment, I., Holden, H., Whittaker, M., Yohn, C., Lorenz, M., Holmes, K., and Milligan, R. (1993). Structure of the actin-myosin complex and its implications for muscle contraction. Science 261, 58-65.
Reimer, L. (1989). "Transmission Electron Microscopy." Springer-Verlag, Berlin.
Robb, R. A. (Ed.) (1994). "Visualization in Biomedical Computing." Proc. SPIE 2359.
Rose, H. (1984). Information transfer in transmission electron microscopy. Ultramicroscopy 15, 173-192.
Rose, H. (1990). Outline of a spherically corrected semiaplanatic medium-voltage transmission electron microscope. Optik 85, 19-24.
Rosenfeld, A., and Kak, A. C. (1982). "Digital Picture Processing," 2nd ed. Academic Press, New York.
Sachs, L. (1984). "Applied Statistics. A Handbook of Techniques," 2nd ed. Springer-Verlag, Berlin/New York.
Saibil, H., and Wood, S. (1993). Chaperonins. Curr. Opinion Struct. Biol. 3, 207-213.
Saibil, H. R., Zheng, D., Roseman, A. M., Hunter, A. S., Watson, G. M. F., Chen, S., auf der Mauer, A., O'Hara, B. P., Wood, S. P., Mann, N. H., Barnett, L. K., and Ellis, R. J. (1993). ATP induces large quaternary rearrangements in a cage-like chaperonin structure. Curr. Biol. 3, 265-273.
Saito, A., Inui, M., Radermacher, M., Frank, J., and Fleischer, S. (1988). Ultrastructure of the calcium release channel of sarcoplasmic reticulum. J. Cell Biol. 107, 211-219.
Salzman, D. B. (1990). A method of general moments for orienting 2D projections of unknown 3D objects. Comput. Vis. Graphics Image Proc. 50, 129-156.
Salunke, D. M., Caspar, D. L. D., and Garcia, R. L. (1986). Self-assembly of purified polyomavirus capsid protein VP1. Cell 46, 895-904.
San Martin, M. C., Stamford, N. P. J., Dammerova, N., Dixon, N., and Carazo, J. M. (1995). A structural model of the Escherichia coli DnaB helicase based on three-dimensional electron microscopy data. J. Struct. Biol. 114, 167-176.
Sass, H. J., Bueldt, G., Beckmann, E., Zemlin, F., van Heel, M., Zeitler, E., Rosenbusch, J. P., Dorset, D. L., and Massalski, A. (1989). Densely packed β-structure at the protein-lipid interface of porin is revealed by high-resolution cryo-electron microscopy. J. Mol. Biol. 209, 171-175.
Saxton, W. O. (1977). Spatial coherence in axial high resolution conventional electron microscopy. Optik 49, 51-62.
Saxton, W. O. (1978). "Computer Techniques for Image Processing in Electron Microscopy," Supplement 10 to Advances in Electronics and Electron Physics. Academic Press, New York.
Saxton, W. O. (1986). Focal series restoration in HREM. In: "Proceedings of the XI International Congress on Electron Microscopy, Kyoto," post-deadline paper 1. The Japanese Society of Electron Microscopy, Tokyo, Japan.
Saxton, W. O. (1994). Accurate alignment of sets of images for superresolving applications. J. Microsc. 174, 61-68.
Saxton, W. O., and Baumeister, W. (1982). The correlation averaging of a regularly arranged bacterial cell envelope protein. J. Microsc. 127, 127-138.
Saxton, W. O., and Frank, J. (1977). Motif detection in quantum noise-limited electron micrographs by cross-correlation. Ultramicroscopy 2, 219-227.
Saxton, W. O., Pitt, T. J., and Horner, M. (1979). Digital image processing: The SEMPER system. Ultramicroscopy 4, 343-354.
Saxton, W. O., Baumeister, W., and Hahn, M. (1984). Three-dimensional reconstruction of imperfect two-dimensional crystals. Ultramicroscopy 13, 57-70.
Schatz, M. (1992). "Invariante Klassifizierung elektronenmikroskopischer Aufnahmen von eiseingebetteten biologischen Makromolekülen." Thesis, Freie Universität Berlin. Deutsche Hochschulschriften 452.
Schatz, M., and van Heel, M. (1990). Invariant classification of molecular views in electron micrographs. Ultramicroscopy 32, 255-264.
Schatz, M., and van Heel, M. (1992). Invariant recognition of molecular projections in vitreous ice preparations. Ultramicroscopy 45, 15-22.
Schatz, M., Jäger, J., and van Heel, M. (1990). Molecular views of ice-embedded Lumbricus terrestris erythrocruorin obtained by invariant classification. In: "Proceedings of the XII International Congress on Electron Microscopy," pp. 450-451. San Francisco Press, San Francisco.
Schatz, M., Orlova, E., Jäger, J., Kitzelmann, E., and van Heel, M. (1994). 3D structure of ice-embedded Lumbricus terrestris erythrocruorin. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 3, pp. 553-554. Les Editions de Physique, Les Ulis, France.
Schatz, M., Orlova, E. V., Dube, P., Jäger, J., and van Heel, M. (1995). Structure of Lumbricus terrestris hemoglobin at 30 Å resolution determined using angular reconstitution. J. Struct. Biol. 114, 28-40.
Schiske, P. (1968). Zur Frage der Bildrekonstruktion durch Fokusreihen. In: "Fourth European Conference on Electron Microscopy, Rome," pp. 145-146. Tipografia Poliglotta Vaticana, Rome.
Schmid, M. F., Agris, J. M., Jakana, J., Matsudaira, P., and Chiu, W. (1994).
Three-dimensional structure of a single filament in the Limulus acrosomal bundle: Scruin binds to homologous helix-loop-beta motifs in actin. J. Cell Biol. 124, 341-350.
Schmutz, M., Lang, J., Graff, S., and Brisson, A. (1994). Defects of planarity of carbon films supported on electron microscope grids revealed by reflected light microscopy. J. Struct. Biol. 112, 252-258.
Schröder, R. R., Hofmann, E., and Menetret, J.-F. (1990). Zero-loss energy filtering as improved imaging mode in cryo-electron microscopy of frozen-hydrated specimens. J. Struct. Biol. 105, 28-34.
Schröder, R. R., Manstein, D. J., Jahn, W., Holden, H., Rayment, I., Holmes, K. C., and Spudich, J. A. (1993). Three-dimensional atomic model of F-actin decorated with Dictyostelium myosin S1. Nature 364, 171-174.
Schroeter, J. P., Wagenknecht, T., Kolodziej, S. J., Bretaudiere, J.-P., Strickland, D. K., and Stoops, J. K. (1991). Three-dimensional structure of the chymotrypsin-human α2-macroglobulin complex. FASEB J. 5, A452.
Serysheva, I. I., Orlova, E. V., Chiu, W., Sherman, M. B., Hamilton, S., and van Heel, M. (1995). Electron cryomicroscopy and angular reconstitution used to visualize the skeletal muscle calcium release channel. Nat. Struct. Biol. 2, 18-24.
Sezan, M. I. (1992). An overview of convex projections theory and its application to image recovery problems. Ultramicroscopy 40, 55-67.
Sezan, M. I., and Stark, H. (1982). Image restoration by the method of convex projections. II. Applications and numerical results. IEEE Trans. Med. Imaging 1, 95-101.
Shannon, C. E. (1949). Communication in the presence of noise. Proc. IRE 37, 10-21.
Shaw, A. L., Rothnagel, R., Chen, D., Ramig, R. F., Chiu, W., and Prasad, B. V. V. (1993). Three-dimensional visualization of the rotavirus hemagglutinin structure. Cell 74, 693-701.
Sizaret, P.-Y., Frank, J., Lamy, J., Weill, J., and Lamy, J. N. (1982). A refined quaternary structure of Androctonus australis hemocyanin. Eur. J. Biochem. 127, 501-506.
Smith, M. F., and Langmore, J. P. (1992). Quantitation of molecular densities by cryoelectron microscopy. Determination of radial density distribution of tobacco mosaic virus. J. Mol. Biol. 226, 763-774.
Smith, P. R., and Aebi, U. (1973). Filtering continuous and discrete Fourier transforms (Appendix). J. Supramol. Struct. 1, 516-522.
Smith, P. R. (1978). An integrated set of computer programs for processing electron micrographs of biological structures. Ultramicroscopy 3, 153-160.
Smith, P. R. (1981). Bilinear interpolation of digital images. Ultramicroscopy 6, 201-204.
Smith, T.
J., Olson, N. H., Cheng, R. H., Chase, E. S., and Baker, T. S. (1993). Structure of a human rhinovirus-bivalently bound antibody complex: Implications for viral neutralization and antibody flexibility. Proc. Natl. Acad. Sci. USA 90, 7015-7018.
Soejima, T., Sherman, M. S., Schmid, M. F., and Chiu, W. (1993). 4-Å projection map of bacteriophage T4 DNA helix-stabilizing protein (gp32*I) crystal by 400-kV electron cryomicroscopy. J. Struct. Biol. 111, 9-16.
Sommerfeld, A. (1964). "Vorlesungen über Theoretische Physik: Mechanik," 7th ed. Akademische Verlagsgesellschaft Geest & Portig K.-G., Leipzig.
Sosinsky, G. E., Baker, T. S., Caspar, D. L. D., and Goodenough, D. A. (1990). Correlation analysis of gap junction lattice images. Biophys. J. 58, 1213-1226.
Sosinsky, G. E., Francis, N. R., Stallmeyer, M. J. B., and DeRosier, D. J. (1992). Substructure of the flagellar basal body of Salmonella typhimurium. J. Mol. Biol. 223, 171-184.
Spence, J. (1988). "Experimental High-Resolution Electron Microscopy." Oxford Univ. Press, New York.
Srivastava, S., Verschoor, A., and Frank, J. (1992a). Eukaryotic initiation factor 3 does not prevent association through physical blocking of the ribosomal subunit-subunit interface. J. Mol. Biol. 226, 301-304.
Srivastava, S., Radermacher, M., and Frank, J. (1992b). Three-dimensional mapping of protein L18 on the 50S ribosomal subunit from E. coli. In: "Proceedings of the 10th European Congress on Electron Microscopy," Vol. 1, pp. 421-422. Secretariado de Publicaciones de la Universidad Granada, Granada, Spain.
Srivastava, S., Verschoor, A., Radermacher, M., Grassucci, R., and Frank, J. (1995). Three-dimensional reconstruction of the mammalian 40S ribosomal subunit embedded in ice. J. Mol. Biol. 245, 461-466.
Stallmeyer, M. J. B., Hahnenberger, K. M., Sosinsky, G. E., Shapiro, L., and DeRosier, D. J. (1989a). Image reconstruction of the flagellar basal body of Caulobacter crescentus. J. Mol. Biol. 205, 511-518.
Stallmeyer, M. J. B., Aizawa, S.-I., Macnab, R. M., and DeRosier, D. J. (1989b). Image reconstruction of the flagellar basal body of Salmonella typhimurium. J. Mol. Biol. 205, 519-529.
Stark, H., Müller, F., Orlova, E. V., Schatz, M., Dube, P., Erdemir, T., Zemlin, F., Brimacombe, R., and van Heel, M. (1995). The 70S Escherichia coli ribosome at 23 Å resolution: Fitting the ribosomal RNA. Structure 3, 815-821.
Stasiak, A., Tsaneva, I. R., West, S. C., Benson, C. J.
B., Yu, X., and Egelman, E. H. (1994). The E. coli RuvB branch migration protein forms double hexameric rings around DNA. Proc. Natl. Acad. Sci. USA 91, 7618-7622.
Steinkilberg, M., and Schramm, H. J. (1980). Eine verbesserte Drehkorrelationsmethode für die Strukturbestimmung biologischer Makromoleküle durch Mittelung elektronenmikroskopischer Bilder. Hoppe-Seyler's Z. Physiol. Chem. 361, 1363-1369.
Steven, A. C., Hainfeld, J. F., Trus, B. L., Steinert, P. M., and Wall, J. S. (1984). Radial distributions of density within macromolecular complexes determined from dark-field electron micrographs. Proc. Natl. Acad. Sci. USA 81, 6363-6367.
Steven, A. C., Stall, R., Steinert, P. M., and Trus, B. L. (1986). Computational straightening of images of curved macromolecular helices by cubic spline interpolation facilitates structural analysis by Fourier methods. In: "Electron Microscopy and Alzheimer's Disease" (J. Metuzals, Ed.), pp. 31-33. San Francisco Press, San Francisco.
Steven, A. C., Trus, B. L., Maizel, J. V., Unser, M., Parry, D. A. D., Wall, J. S., Hainfeld, J. F., and Studier, F. W. (1988). Molecular substructure of a viral receptor-recognition protein: The gp17 tail-fiber of bacteriophage T7. J. Mol. Biol. 200, 351-365.
Steven, A. C., Kocsis, E., Unser, M., and Trus, B. L. (1991). Spatial disorders and computational cures. Int. J. Biol. Macromol. 13, 174-180.
Stewart, M. (1988a). Introduction to the computer image processing of electron micrographs of two-dimensionally ordered biological structures. J. Electron Microsc. Tech. 9, 301-324.
Stewart, M. (1988b). Computer image processing of electron micrographs of biological structures with helical symmetry. J. Electron Microsc. Tech. 9, 325-358.
Stewart, M. (1990). Electron microscopy of biological macromolecules. In: "Modern Microscopies" (P. J. Duke and A. G. Michette, Eds.), pp. 9-40. Plenum, New York.
Stewart, P. L., and Burnett, R. M. (1993). Adenovirus structure as revealed by X-ray crystallography, electron microscopy, and difference imaging. Jpn. J. Appl. Phys. 32, 1342-1347.
Stewart, P. L., Burnett, R. M., Cyrklaff, M., and Fuller, S. D. (1991). Image reconstruction reveals the complex molecular organization of adenovirus. Cell 67, 145-154.
Stewart, P. L., Fuller, S. D., and Burnett, R. M. (1993). Difference imaging of adenovirus: Bridging the resolution gap between X-ray crystallography and electron microscopy. EMBO J. 12, 2589-2599.
Stöffler, G., and Stöffler-Meilicke, M. (1983). The ultrastructure of macromolecular complexes studied with antibodies. In: "Modern Methods in Protein Chemistry" (H. Tschesche, Ed.), pp. 409-455.
De Gruyter, Berlin.
Stoops, J. K., Schroeter, J. P., Bretaudiere, J.-P., Olson, N. H., Baker, T. S., and Strickland, D. K. (1991a). Structural studies of human α2-macroglobulin: Concordance between projected views obtained by negative-stain and cryoelectron microscopy. J. Struct. Biol. 106, 172-178.
Stoops, J. K., Momany, C., Ernst, S. R., Oliver, R. M., Schroeter, J. P., Bretaudiere, J.-P., and Hackert, M. L. (1991b). Comparisons of the low-resolution structures of ornithine decarboxylase by electron microscopy and X-ray crystallography: The utility of methylamine tungstate stain and Butvar support film in the study of macromolecules by transmission electron microscopy. J. Electron Microsc. Tech. 18, 157-166.
Stoops, J. K., Kolodziej, S. J., Schroeter, J. P., Bretaudiere, J.-P., and Wakil, S. J. (1992a). Structure-function relationships of the yeast fatty acid synthetase: Negative-stain, cryoelectron microscopy, and image analysis studies of the end views of the structure. Proc. Natl. Acad. Sci. USA 89, 6585-6589.
Stoops, J. K., Baker, T. S., Schroeter, J. P., Kolodziej, S. J., Niu, X.-D., and Reed, L. J. (1992b). Three-dimensional structure of the truncated core of the Saccharomyces cerevisiae pyruvate dehydrogenase complex determined from negative stain and cryoelectron microscopy images. J. Biol. Chem. 267, 24769-24775.
Stroud, R. M., and Agard, D. A. (1979). Structure determination of asymmetric membrane profiles using an iterative Fourier method. Biophys. J. 25, 495-512.
Stuhrmann, H. B., Burkhardt, N., Dietrich, G., Jünemann, R., Meerwinck, W., Schmitt, M., Wadzack, J., Willumeit, R., Zhao, J., and Nierhaus, K. H. (1995). Proton- and deuteron spin targets in biological structure research. Nucl. Instrum. Methods A 356, 124-132.
Taniguchi, Y., Takai, Y., and Shimizu, R. (1992). Spherical-aberration-free observation of TEM images by defocus-modulation image processing. Ultramicroscopy 41, 323-333.
Taylor, K., and Glaeser, R. M. (1974). Electron diffraction of frozen, hydrated protein crystals. Science 186, 1036-1037.
Taylor, K., and Glaeser, R. M. (1976). Electron microscopy of frozen-hydrated biological specimens. J. Ultrastruct. Res. 55, 448-456.
Thomas, D., Flifla, M. J., Escoffier, B., Barray, M., and Delain, E. (1988). Image processing of electron micrographs of human α2-macroglobulin half-molecules induced by Cd2+. Biol. Cell 64, 39-44.
Thon, F. (1966). Zur Defokussierungsabhängigkeit des Phasenkontrastes bei der elektronenmikroskopischen Abbildung.
Z. Naturforsch. 21a, 476-478.
Thon, F. (1971). Phase contrast electron microscopy. In: "Electron Microscopy in Material Science" (U. Valdré, Ed.). Academic Press, New York.
Tischendorf, G. W., Zeichhardt, H., and Stöffler, G. (1974). Determination of the location of proteins L14, L17, L18, L19, L22, and L23 on the surface of the 50S ribosomal subunit of Escherichia coli by immunoelectron microscopy. Mol. Gen. Genet. 134, 187-208.
Toyoshima, C., and Unwin, P. N. T. (1988a). Contrast transfer for frozen-hydrated specimens: Determination from pairs of defocused images. Ultramicroscopy 25, 279-292.
Toyoshima, C., and Unwin, P. N. T. (1988b). Ion channel of acetylcholine receptor reconstructed from images of postsynaptic membranes. Nature 336, 247-250.
Toyoshima, C., Yonekura, K., and Sasabe, H. (1993). Contrast transfer for frozen-hydrated specimens. II. Amplitude contrast at very low frequencies. Ultramicroscopy 48, 165-176.
Trachtenberg, S., and DeRosier, D. J. (1987). Three-dimensional structure of the frozen-hydrated flagellar filament. J. Mol. Biol. 195, 581-601.
Troyon, M. (1977). A method for determining the illumination divergence from electron micrographs. Optik 49, 247-251.
Trus, B. L., and Steven, A. C. (1981). Digital image processing of electron micrographs: The PIC system. Ultramicroscopy 6, 383-386.
Trus, B. L., Unser, M., Pun, T., and Steven, A. C. (1992). Digital image processing of electron micrographs: The PIC system II. In: "Scanning Microscopy Supplement 6: Proceedings of the Tenth Pfefferkorn Conference, Cambridge University, England, September 1992" (P. W. Hawkes, Ed.), pp. 441-451. Scanning International, Chicago.
Trussell, H. J. (1980). The relationship between image restoration by the maximum a posteriori method and the maximum entropy method. IEEE Trans. Acoust. Speech Signal Proc. 28(1), 114-117.
Trussell, H. J., Orun-Ozturk, H., and Civanlar, M. R. (1987). Errors in reprojection methods in computerized tomography. IEEE Trans. Med. Im. 6, 220-227.
Tsuprun, V., Anderson, D., and Egelman, E. H. (1994). The bacteriophage φ29 head-tail connector shows 13-fold symmetry in both hexagonally packed arrays and as single particles. Biophys. J. 66, 2139-2150.
Tufte, E. R. (1983). "The Visual Display of Quantitative Information." Graphics Press, Cheshire, CT.
Tyler, D. D. (1992). "Mitochondria in Health and Disease." VCH, New York.
Typke, D., and Köstler, D. (1977).
Determination of the wave aberration of electron lenses from superposition diffractograms of images with differently tilted illumination. Ultramicroscopy 2, 285-295.
Typke, D., and Radermacher, M. (1982). Determination of the phase of complex atomic scattering amplitudes from light-optical diffractograms of electron microscope images. Ultramicroscopy 9, 131-138.
Typke, D., Pfeifer, G., Hegerl, R., and Baumeister, W. (1990). 3D reconstruction of single particles by quasi-conical tilting from micrographs recorded with dynamic focusing. In: "Proceedings of the XII
International Congress for Electron Microscopy" (L. D. Peachey and D. B. Williams, Eds.), Vol. 1, pp. 244-245. San Francisco Press, San Francisco.
Typke, D., Hoppe, W., Sessler, W., and Burger, M. (1976). Conception of a 3-D imaging electron microscope. In: "Proceedings of the 6th European Congress on Electron Microscopy" (D. G. Brandon, Ed.), Vol. 1, pp. 334-335. Tal International, Israel.
Typke, D., Hegerl, R., and Kleinz, J. (1992). Image restoration for biological objects using external TEM control and electronic image recording. Ultramicroscopy 46, 157-173.
Uhlemann, S., and Rose, H. (1994). Comparison of the performance of existing and proposed imaging energy filters. In: "Proceedings of the 13th International Congress on Electron Microscopy" (B. Jouffrey and C. Colliex, Eds.), Vol. 1, pp. 163-164. Les Editions de Physique, Les Ulis, France.
Unser, M., Steven, A. C., and Trus, B. L. (1986). Odd men out: A quantitative objective procedure for identifying anomalous members of a set of noisy images of ostensibly identical specimens. Ultramicroscopy 19, 337-348.
Unser, M., Trus, B. L., and Steven, A. C. (1987). A new resolution criterion based on spectral signal-to-noise ratios. Ultramicroscopy 23, 39-52.
Unser, M., Trus, B. L., Frank, J., and Steven, A. C. (1989). The spectral signal-to-noise ratio resolution criterion: Computational efficiency and statistical precision. Ultramicroscopy 30, 429-434.
Unwin, P. N. T. (1970). An electrostatic phase plate for the electron microscope. Ber. Bunsenges. Phys. Chem. 74, 1137-1141.
Unwin, P. N. T. (1975). Beef liver catalase structure: Interpretation of electron micrographs. J. Mol. Biol. 98, 235-242.
Unwin, P. N. T., and Henderson, R. (1975). Molecular structure determination by electron microscopy of unstained crystalline specimens. J. Mol. Biol. 94, 425-440.
Unwin, P. N. T., and Klug, A. (1974). Electron microscopy of the stacked disk aggregate of tobacco mosaic virus protein. J. Mol. Biol. 87, 641-656.
Vainshtein, B.
K., and Goncharov, A. B. (1986). Determination of the spatial orientation of arbitrarily arranged identical particles of an unknown structure from their projections. In: "Proceedings of the 11th International Congress on Electron Microscopy, Kyoto," pp. 459-460. The Japanese Society for Electron Microscopy, Tokyo, Japan. van Heel, M. (1982). Detection of objects in quantum-noise limited images. Ultramicroscopy 8, 331-342.
van Heel, M. (1983). Stereographic representation of three-dimensional density distributions. Ultramicroscopy 11, 307-314. van Heel, M. (1984a). Three-dimensional reconstruction with unknown angular relationships. In: "Proceedings of the 8th European Congress on Electron Microscopy (Budapest)," pp. 1347-1348. Electron Microscopy Foundation, Program Committee, Budapest. van Heel, M. (1984b). Multivariate statistical classification of noisy images (randomly oriented biological macromolecules). Ultramicroscopy 13, 165-184. van Heel, M. (1986a). Finding the characteristic views of macromolecules in extremely noisy electron micrographs. In: "Pattern Recognition in Practice" (E. S. Gelsema and L. N. Kanal, Eds.), Vol. 2, pp. 291-299. Elsevier/North-Holland, Amsterdam. van Heel, M. (1986b). Noise-limited three-dimensional reconstructions. Optik 73, 83-86. van Heel, M. (1987a). Similarity measures between images. Ultramicroscopy 21, 95-100. van Heel, M. (1987b). Angular reconstitution: A posteriori assignment of projection directions for 3D reconstruction. Ultramicroscopy 21, 111-124. van Heel, M. (1989). Classification of very large electron microscopical image data sets. Optik 82, 114-126. van Heel, M., and Dube, P. (1994). Quaternary structure of multihexameric arthropod hemocyanins. Micron 25, 387-418. van Heel, M., and Frank, J. (1980). Classification of particles in noisy electron micrographs using correspondence analysis. In: "Pattern Recognition in Practice" (E. S. Gelsema and L. N. Kanal, Eds.), Vol. 1, pp. 235-243. Elsevier/North-Holland, Amsterdam. van Heel, M., and Frank, J. (1981). Use of multivariate statistics in analysing the images of biological macromolecules. Ultramicroscopy 6, 187-194. van Heel, M., and Harauz, G. (1986). Resolution criteria for three-dimensional reconstruction. Optik 73, 119-122. van Heel, M., and Hollenberg, J. (1980). The stretching of distorted images of two-dimensional crystals.
In: "Electron Microscopy at Molecular Dimensions" (W. Baumeister, Ed.), pp. 256-260. Springer-Verlag, Berlin/New York. van Heel, M., and Keegstra, W. (1981). IMAGIC: A fast, flexible and friendly image analysis software system. Ultramicroscopy 7, 113-130. van Heel, M., and Stöffler-Meilicke, M. (1985). The characteristic views of E. coli and B. stearothermophilus 30S ribosomal subunits in the electron microscope. EMBO J. 4, 2389-2395.
van Heel, M., Keegstra, W., Schutter, W., and van Bruggen, E. F. J. (1982a). In: "Structure and Function of Invertebrate Respiratory Proteins" (E. J. Wood, Ed.), pp. 69-73. Harwood Academic, Reading, UK. van Heel, M., Bretaudiere, J.-P., and Frank, J. (1982b). Classification and multireference alignment of images of macromolecules. In: "Proceedings of the 10th International Congress on Electron Microscopy," Vol. 1, pp. 563-564. Deutsche Gesellschaft für Elektronenmikroskopie e.V., Frankfurt (Main). van Heel, M., Keegstra, W., Schutter, W. G., and van Bruggen, E. F. J. (1983). Arthropod hemocyanin studied by image analysis. Life Chem. Rep. Suppl. 1, 69-73. van Heel, M., Schatz, M., and Orlova, E. (1992a). Correlation functions revisited. Ultramicroscopy 46, 307-316. van Heel, M., Winkler, H., Orlova, E., and Schatz, M. (1992b). Structure analysis of ice-embedded single particles. In: "Scanning Microscopy Supplement 6: Proceedings of the Tenth Pfefferkorn Conference, Cambridge University, England, September 1992" (P. W. Hawkes, Ed.), pp. 23-42. Scanning International, Chicago. van Heel, M., Dube, P., and Orlova, E. V. (1994). Three-dimensional structure of Limulus polyphemus hemocyanin. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 3, pp. 555-556. Les Editions de Physique, Les Ulis, France. van Oostrum, J., Smith, P. R., Mohraz, M., and Burnett, R. M. (1987). The structure of the adenovirus capsid. III. Hexon packing determined from electron micrographs of capsid fragments. J. Mol. Biol. 198, 73-89. Verschoor, A., and Frank, J. (1990). Three-dimensional structure of the mammalian cytoplasmic ribosome. J. Mol. Biol. 214, 737-749. Verschoor, A., Frank, J., Radermacher, M., Wagenknecht, T., and Boublik, M. (1983). Three-dimensional reconstruction of the 30S ribosomal subunit from randomly oriented particles. In: "Proceedings of the 41st Annual Meeting, EMSA" (G. W. Bailey, Ed.), pp. 758-759.
Verschoor, A., Frank, J., Radermacher, M., Wagenknecht, T., and Boublik, M. (1984). Three-dimensional reconstruction of the 30S ribosomal subunit from randomly oriented particles. J. Mol. Biol. 178, 677-698. Verschoor, A., Frank, J., and Boublik, M. (1985). Investigation of the 50S ribosomal subunit by electron microscopy and image analysis. J. Ultrastruct. Res. 92, 180-189. Verschoor, A., Zhang, N. Y., Wagenknecht, T., Obrig, T., Radermacher, M., and Frank, J. (1989). Three-dimensional reconstruction of the mammalian 40S ribosomal subunit. J. Mol. Biol. 209, 115-126.
Verschoor, A., Srivastava, S., Grassucci, R., and Frank, J. (1995). Native 3D structure of the eukaryotic 80S ribosome: Morphological homology with the E. coli 70S ribosome. Submitted. Vest, C. M. (1974). Formation of images from projections: Radon and Abel transforms. J. Opt. Soc. Am. 64, 1215-1218. Vigers, G. P. A., Crowther, R. A., and Pearse, B. M. F. (1986a). Three-dimensional structure of clathrin in ice. EMBO J. 5, 529-534. Vigers, G. P. A., Crowther, R. A., and Pearse, B. M. F. (1986b). Location of the 100 kd-50 kd accessory proteins in clathrin coats. EMBO J. 5, 2079-2085. Vogel, R. W., and Provencher, S. W. (1988). Three-dimensional reconstruction from electron micrographs of disordered specimens. II. Implementation and results. Ultramicroscopy 25, 223-240. Wabl, M. R., Barends, P. J., and Nanninga, N. (1973). Tilting experiments with negatively stained E. coli ribosomal subunits. An electron microscopic study. Cytobiologie 7, 1-9. Wade, R. H. (1992). A brief look at imaging and contrast transfer. Ultramicroscopy 46, 145-156. Wade, R. H., and Frank, J. (1977). Electron microscopic transfer functions for partially coherent axial illumination and chromatic defocus spread. Optik 49, 81-92. Wagenknecht, T., Frank, J., Boublik, M., Nurse, K., and Ofengand, J. (1988). Direct localization of the tRNA-anticodon interaction site on the Escherichia coli 30S ribosomal subunit by electron microscopy and computerized image averaging. J. Mol. Biol. 203, 753-760. Wagenknecht, T., Grassucci, R., Frank, J., Saito, A., Inui, M., and Fleischer, S. (1989a). Three-dimensional architecture of the calcium channel/foot structure of sarcoplasmic reticulum. Nature 338, 167-170. Wagenknecht, T., Carazo, J. M., Radermacher, M., and Frank, J. (1989b). Three-dimensional reconstruction of the ribosome from Escherichia coli in the range of overlap views. Biophys. J. 55, 465-477. Wagenknecht, T., Grassucci, R., and Schaak, D. (1990).
Cryo electron microscopy of frozen-hydrated α-ketoacid dehydrogenase complexes from Escherichia coli. J. Biol. Chem. 265, 22402-22408. Wagenknecht, T., Grassucci, R., Berkowitz, J., and Forneris, C. (1992). Configuration of interdomain linkers in pyruvate dehydrogenase complex of Escherichia coli as determined by cryoelectron microscopy. J. Struct. Biol. 109, 70-77. Wagenknecht, T., Berkowitz, J., Grassucci, R., Timerman, A. P., and Fleischer, S. (1994). Localization of calmodulin binding sites on the ryanodine receptor from skeletal muscle by electron microscopy. Biophys. J. 67, 2286-2295.
Walker, M., White, H., and Trinick, J. (1994). Electron cryomicroscopy of acto-myosin-S1 during steady-state ATP hydrolysis. Biophys. J. 66, 1563-1572. Wang, B.-C. (1985). Resolution of phase ambiguity in macromolecular crystallography. In: "Methods in Enzymology: Diffraction Methods in Biology, Part B" (H. W. Wyckoff, C. H. W. Hirs, and S. N. Timasheff, Eds.), Vol. 115. Academic Press, Orlando, FL. Wang, D. N., and Kühlbrandt, W. (1991). High-resolution electron crystallography of light-harvesting chlorophyll a/b-protein complex in three different media. J. Mol. Biol. 217, 691-699. Wang, G., Porta, C., Chen, Z., Baker, T. S., and Johnson, J. E. (1992). Identification of a Fab interaction footprint on an icosahedral virus by cryo electron microscopy and x-ray crystallography. Nature 355, 275-278. Ward, J. H., Jr. (1963). Hierarchical grouping to optimize an objective function. J. Am. Statist. Assoc. 58, 236-244. Watson, J. D. (1968). "The Double Helix." Atheneum, New York. Welton, T. A. (1979). A computational critique of an algorithm for image enhancement in bright field electron microscopy. Adv. Electron. Electron Phys. 48, 37-101. Williams, R. C., and Fisher, H. W. (1970). Electron microscopy of TMV under conditions of minimal beam exposure. J. Mol. Biol. 52, 121-123. Wittmann, H. G. (1983). Architecture of prokaryotic ribosomes. Annu. Rev. Biochem. 52, 35-65. Wong, M. A. (1982). A hybrid clustering method for identifying high-density clusters. J. Am. Statist. Assoc. 77, 841-847. Yeager, M., Berriman, J. A., Baker, T. S., and Bellamy, A. R. (1994). Three-dimensional structure of the rotavirus haemagglutinin VP4 by cryo-electron microscopy and difference map analysis. EMBO J. 13, 1011-1018. Youla, D. C., and Webb, H. (1982). Image restoration by the method of convex projections. I. Theory. IEEE Trans. Med. Imaging 1, 81-94. Zeitler, E. (1990). Radiation damage in biological electron microscopy.
In: "Biophysical Electron Microscopy: Basic Concepts and Modern Techniques" (P. W. Hawkes and U. Valdrè, Eds.), pp. 289-308. Academic Press, London. Zeitler, E. (1992). The photographic emulsion as an analog recorder for electrons. Ultramicroscopy 46, 405-416. Zemlin, F. (1989a). Interferometric measurement of axial coma in electron-microscopical images. Ultramicroscopy 30, 311-314. Zemlin, F. (1989b). Dynamic focussing for recording images from tilted samples in small-spot scanning with a transmission electron microscope. J. Electron Microsc. Tech. 11, 251-257.
Zemlin, F., and Weiss, K. (1993). Young's interference fringes in electron microscopy revisited. Ultramicroscopy 50, 123-126. Zemlin, F., Weiss, K., Schiske, P., Kunath, W., and Herrmann, K.-H. (1978). Coma-free alignment of high resolution electron microscopes with the aid of optical diffractograms. Ultramicroscopy 3, 49-60. Zhang, N. (1992). "A New Method of 3D Reconstruction and Restoration in Electron Microscopy: Least Squares Method Combined with Projection onto Convex Sets (LSPOCS)." Thesis. State University of New York at Albany. Zhao, J., and Stuhrmann, H. B. (1993). The in situ structure of the L3 and L4 proteins of the large subunit of E. coli ribosomes as determined by nuclear spin contrast variation. J. Phys. IV 3, 233-236. Zhou, Z. H., and Chiu, W. (1993). Prospects for using an IVEM with a FEG for imaging macromolecules towards atomic resolution. Ultramicroscopy 49, 407-416. Zhu, J., and Frank, J. (1994). Accurate retrieval of transfer function from defocus series. In: "Proceedings of the 13th International Congress on Electron Microscopy (Paris)," Vol. 1, pp. 465-466. Les Editions de Physique, Les Ulis, France. Zhu, J., Penczek, P., Schröder, R., and Frank, J. (1995). Three-dimensional reconstruction of 70S ribosome using energy-filtered EM data. In preparation. Zingsheim, H. P., Neugebauer, D. C., Barrantes, F. J., and Frank, J. (1980). Structural details of membrane-bound acetylcholine receptor from Torpedo marmorata. Proc. Natl. Acad. Sci. USA 77, 952-956. Zingsheim, H. P., Barrantes, F. J., Frank, J., Hänicke, W., and Neugebauer, D.-Ch. (1982). Direct structural localization of two toxin recognition sites on an acetylcholine receptor protein. Nature 299, 81-84.
Index
Abel transform, 191 Acetylcholine receptor, 105 ACF, see Autocorrelation function Actin, 253, 266, 268 Adenovirus, 64, 268 Airy disk, 67 Aliasing, see Sampling, aliasing Alignment, 73-101 accuracy, 53, 122, 252 ACF-based, 85-89 aims, 73-74 classification, through, 75, 97 definition, formal, 98 homogeneous image set, 74 invariants, use of, 95-97 iterative refinement, 89-93 minimum dose, for achieving, 51 multireference, 93, 98, 179 projections, of, 199, 211, 213, 219 reference-based, 89-92, 276 reference-free, 93-101, 240 self-detection, 51, 83 3D alignment problem, 76 tilted projections, of, 199, 219-221 translation-invariant rotation search, 70, 85-89 vectorial addition of alignment parameters, 89-93 α-helix, 6, 112
α2-macroglobulin, 24, 193 Amplitude contrast, 36-38 ANALYZE, 287 Androctonus australis (scorpion), see Hemocyanin, Androctonus australis Angular reconstitution, see Projection, assignment of orientations, angular reconstitution Antibody labeling, 105, 190, 253-255, 261-262, 269, 271 ART, see Reconstruction, algebraic reconstruction technique (ART) Astigmatism, axial, see Objective lens, axial astigmatism Aurothioglucose, 227 Autocorrelation function, 70-71, 73, 83, 85, 87-91 definition, 85 double (DACF), 95-97 one-dimensional, 96 rotational, 95 Average class, 195, 198 global, 99, 157 map, 75, 103-104 notations, 75 partial, 99 rotational, 71 subset, 113
Averaging complex plane, 118 computer, 59 filtration, by, 57 local, 70 low-dose images, 122 minimum number of particles, 51 photographic superimposition, by, 56-57 techniques, 55 3D, 6 AVS, 287
B Back focal plane, see Objective lens, back focal plane Back-projection, see Reconstruction, back-projection Bacterial porin, 7 Bacteriophage, 2, 97 Bacteriorhodopsin, 2, 7, 22, 49-50, 145, 227 Band limit, 31, 34, 107, 116; see also Resolution, limiting factors Band-limited function, 67 Basal bodies, flagellar, 191-192, 262 Bessel function, 67 Bispectrum, 97 Boundedness, of the object, 185-186 Box convolution, 69 Bragg, Sir Lawrence, 6 Bright-field electron microscopy, 27-28, 70, 134-135 Brownian motion, 119 Bungarotoxin, 105 Butterfly wings, 1
C Calcium release channel, 13, 16-17, 19, 60, 90, 114, 150, 195, 227, 254, 261, 273-281 calmodulin binding, 24, 262, 279 Carbon film optical diffraction, 211 properties, 65 spectrum, 39, 44 Catalase, 23, 49-50, 59 Caulobacter, 191
Cavendish Laboratory, 6 CCC, see Cross-correlation, coefficient CCD, slow-scan cameras, 11 CCF, see Cross-correlation, function
Charging effects, of specimen, 42, 111, 122, 224 Cheshire cat phenomenon, inverse, 94 Classification, 72, 76, 93, 124, 138, 160-181, 199, 212, 222, 247, 260, 276, 285 analysis of trends, 175, 176 degeneracy, 221 fuzzy, 165 hierarchical ascendant, 163, 165-170, 173, 218, 277 cutoff level, 218 dendrogram, 166-168, 169, 173 intragroup variance, 167 merging rules, 166-169 similarity index, 166, 167 Ward's criterion, 167-168, 169 hybrid techniques, 169-171 inference problem, 2D to 3D, 180-181, 247 invariants, of, 95-97 inventories, 175-176 K-means, 93, 164-165, 168-169, 179 centers of aggregation, 164 dynamic clouds technique, 165, 171, 277 membership function, 165 misclassification, 243 neural networks, 163, 171, 176 parallel methods, 173, 174 stability, 169 supervised, 163, 179-180 tree, 166 unsupervised, 163, 174 Closure error, 220, 228 Coma, 41 Common lines, 76, 230, see also Projection, assignment of orientations, common lines Component labeling, 68 Confidence interval, 103-105, 253 Conformational changes, 10, 102, 105, 122, 243, 249-250, 253, 260, 279 Constrained thickness method, 231 Contamination, 55 Contrast transfer characteristics, 31-34, 211 Contrast transfer function, 22, 24, 28-49, 84, 116, 122, 215, 261, 268 correction, 25 computational, 34, 45-49, 52, 85 flipping of transfer intervals, 34, 45, 84
instrumental, 44-45 3D, 245, 246 defocus dependence, 24, 31-34, 40 degradation caused by, 35 determination, 41-44 Convolution product, 28, 52, 79-80 theorem, 79, 82, 91 Coordinate transformation rigid body, 78 variable, 78 Correlation, see also Cross-correlation averaging, of crystals, 94, 179, 240 mask-introduced, 112 neighboring image points, 128 Correspondence analysis, 21, 125, 129, 134-136, 257-258, 276 cluster averages, 141 delineation of, 162 conjugate space, 135, 137, 143 coordinate system of, 143 data compression, 139 data pathway, 175, 176 eigenimage, 142, 149, 153, 156, 160 eigenvalue histogram, 145-148, 150, 153 ranking, 161 spectrum, 144, 156, 161 eigenvector, 142 eigenvector-eigenvalue equation, 136 space, 161 explanatory images, 149 factor, 137 coordinates, 137, 162 expansion, 152 maps, 138, 140, 145, 162, 163 rotation, 143, 162 hypercube, 149, 175 hypersphere, 149 inactive images, 138 local averages, 149-150, 178, 277 marginal equivalence, 135 marginal weights, 136, 141 mask, binary, 145, 156-157 from averages, by thresholding and expansion, 158, 159 nonlinear mapping, 176-178 orthonormalization condition, 136-137 reconstitution, 142-143 analogy with Fourier synthesis, 142
demonstration of, 153, 159, 160 images obtained by, 149 representativeness, 150, 152 tilted molecules, 222 transition formula, 137, 138, 143 Cosine-stretching, 219-220 Coulomb potential, 23, 29, 267 Cross-correlation coefficient, 78, 93, 109-110, 197, 221, 228
definition, 78, 79 defocus dependence, 84-85 Fourier computation, 80-81 function, 71, 73, 86, 122, 130-131, 237 peak search, 82-86, 88 point-spread functions, between, 87 ranking, 221 rotational, 81-82, 87-88, 276 SNR measurement, 108-110 translational, 76-80, 85, 276 up-down cross-correlation test, 88 wrap-around effect, 80, 85 Crowther resolution, see Reconstruction, resolution Cryo-electron microscopy, 9, 13, 22-24, see also Vitreous ice Crystal pseudocrystal, 60 radiation resistance, 23 thin 2D, 7, 39 CTF, see Contrast transfer function Cubic ice, 23
D DACF, see Autocorrelation function, double (DACF) Dark-field electron microscopy, 27, 70 Data collection geometries, 187-188 conical, 186, 201, 217, 221, 223 random-conical, 188, 190, 195-202, 217, 234, 235 single-axis, 186-187 Debye-Waller factor, 75 Defocus corridor, 211 dynamic defocus control, 52 generalized, 27, 31-32 local, 213
Defocus (continued) optimum setting, 32 spread, 30, 32-33, 43 Density binning, 56 Difference map, 105, 106, 253-254, 261-262 standard error, 107, 253 Differential phase residual, see Resolution, differential phase residual Diffractogram, see Optical diffraction Discriminant analysis, 70 Distance buffer, 264 DnaB helicase, 115 Double layer preparation, see Sandwiching technique; Negative staining, double layer Double self-correlation function, 95 Drift, 40, 42, 111, 122, 222 Dynamic focusing, 87 Dynamic range, 56
E Edge detection, 70 Edge enhancement, 35 Eigenanalysis, 131 Eigenimage, s e e Correspondence analysis, eigenimage Eigenvalue, s e e Correspondence analysis, eigenvalue Eigenvector, s e e Correspondence analysis, eigenvector Eigenvector methods of ordination, 129 Einstein equation, 119 Electron crystallography, 6-7, 10, 112, 214, 247 Electron energy loss spectroscopy (EELS), 261-262 Electron microscope, 6 computer-interfaced, 6, 41, 51 energy-filtering, 36, 52-53, 240, 246 high-voltage, 5 image formation, 24-49 contrast, 28 intensity distribution, 27 wave-optical description, 25-28 intermediate-voltage, 6 tomography, s e e Tomography Energy concentration of, 137 index, 64
filtering, see Electron microscope, energy-filtering spread, 30, 43, 211 Envelope function, 34, 38 defocus dependence, 43 energy spread, 30, 122, 213 illumination, 30, 122 Escherichia coli, 2, 62, 65, 104, 214, 219, 228, 242, 265 Euclidean distance, 73, 232, 240 Eulerian rotations, 187, 228, 230 EXPLORER, 287
F Fab labeling, see Antibody labeling Filtration, see also Wiener filter computational, 59 crystal, 57 high-pass, 34 low-pass, 34, 70, 108 optical, 6 quasi-optical, 59 Flagellar motor, see Basal bodies, flagellar Fog level, see Photographic film, fog level Fourier average, 180 filtering, see Filtration interpolation, 208, 234 moving window, 209 reconstruction techniques, see Reconstruction, Fourier reconstruction techniques representation, of a 2D function, 183 ring correlation, see Resolution, Fourier ring correlation statistical dependence of Fourier coefficients, 220 synthesis, 60, 67, 112, 142 transform, digital, 80 Fraunhofer approximation, 26 FRC, see Resolution, Fourier ring correlation FRODO, 266
G Galton, 56 Glow discharging, 62 Glucose embedment, 12, 21-23, 227 Glutamine synthetase, 50
Gold-labeling, 12, 23 Nanogold, 24, 262 Undecagold, 24, 105 Gradient criterion, 267 GroEL, 24
H HAC, see Classification, hierarchical ascendant Hankel transform, 191 Heavy/light atom discrimination, 36, 47 HeLa cells, 21, 109 Helical structures, 6-7, 249, 260 Hemocyanin, 151, 161, 255, 269 Androctonus australis (scorpion), 13, 19, 21-22, 64, 151, 226, 254, 258-259, 261, 271 chiton, 222 Limulus polyphemus (Horseshoe crab), 62-63, 241, 257-259 Panulirus interruptus, 259, 269-270 rocking, 65 Scutigera coleoptrata, 64, 66, 241 tarantula, 241 Hemoglobin Lumbricus terrestris (earthworm), 96, 140, 195, 196-197 Ophelia bicornis, 227 Herpes simplex virus (HSV), 50, 121, 125 Heterogeneity compositional, 102 image sets, of, 73, 75-76, 100, 124-125, 158, 173 orientational, 75 structural, 76 Hilbert space, 232 Histogram angular, 218, 237, 243-245 projection density, 217 reference, 217 voxel, 267-268 Homogeneity of image sets, 73, 105, 127, 175, 180 Hoppe, Walter, 5, 10 Horseshoe crab, see Hemocyanin, Limulus polyphemus (Horseshoe crab) HSV, see Herpes simplex virus Hyperspace, 130 Hypersphere, 131
I Illumination coherent, 28 divergence, 29-30, 122 field-emission gun, 31 Gaussian source, 30 partially coherent, 29-32 source size, 40-41 Image difference method, 55 Immunoelectron microscopy, 102, 105, see also Antibody labeling Inelastic scattering, see Scattering, inelastic Interconversion, of projections, 193, 227 Interpolation, see also Fourier, interpolation bilinear, 68 errors, 251 rule, 210 IQ (Image quality), 112-113
K Karhunen-Loeve transformation, 131
L L7/L12, see Ribosome, subunit, 50S, stalk (L7/L12) Lattice reciprocal, 40, 59-60 unbending, see Unbending vector, 57 Lavater, Johann Kaspar, 56 Lens aberrations, see Objective lens, aberrations Lexicographic ordering, 69, 128 Light-harvesting complex, 7 Limulus polyphemus, see Hemocyanin, Limulus polyphemus (Horseshoe crab) Linear systems, 9 Lipoyl domains, 102-103 Low-dose techniques, 49-51, 247
M Magnification, electron-optical, 51, 56, 68 correction, 47, 87 variation, 47, 102 Mask, see also Correspondence analysis, mask, binary binary, 112, 279 soft, 113
Max-Planck-Institut für Eiweiss- und Lederforschung, 10 Maximum entropy methods, 231 Medical Research Council, 6 Microdensitometer, 10, 110 Minimum dose microscopy, 50 Missing cone, 224, 225, 228, 235, 279 Missing wedge, 235 Mitochondrion, 2-3, 235 Molybdenum grids, 65 MSA, see Multivariate statistical analysis Multiresolution approach, 240 Multivariate statistical analysis, 9, 15, 53, 75, 89, 93, 97, 126-160, 211, 221, 231-232, 240, 243, 248, 260, 285, see also Correspondence analysis
N Negative staining, 9, 12-21, 50, 55, 62, 63, 126, 235, 256, 270, 279 double layer, 14-19, 64, 227 exaggerating effects of, 161 fluctuating levels, 15-16, 21, 38, 101-102, 165 incomplete, 13, 226 meniscus effects, 14, 17, 126 modeling, 66, 233, 270 one-sided, 13, 14-15, 18 particle size variations, 18, 21 specimen flattening, 14, 18-21, 126, 226-227, 256 uranyl acetate, 12, 38, 44, 50 variations, 15, 102, 143, 224 wrapping effect, 14, 19 Noise additive, 38, 54-55, 101, 108, 118-119 digitization, 56 fixed-pattern, 54 Gaussian, 108 photographic, 56 shot, 55 signal-dependence, 54-55 spectrum, 56 stationary, 108 statistics, 101 stochastic, 54 structural, 55 Nonlinear mapping, see Correspondence analysis, nonlinear mapping
Nuclear pore complex, 13, 235 Nucleosome cores, 240
O O (molecular modeling package), 266, 287-288 Object spectrum, 56 Objective lens aberrations, 26-27 aperture, 27-28 axial astigmatism, 26, 28, 31-32, 38, 40, 43 back focal plane, 26-27, 30, 45 spherical aberration constant, 26, 31 Odd men out, 124-125 Omega filter, 53 Optical diffraction, 38-39 pattern, 35, 37-38, 116, 215 screening, 211-213, 215, 275 Orientation definition, 65 deviations from average view, 65, 102 flip/flop asymmetry, 64, 277 preference, 62, 64, 65 search, 228 3D, 190, 226, 227-229 using sets of projections (OSSP), 229-231 OSSP, see Orientation, using sets of projections Outlier rejection, 124-125
P Panulirus interruptus, see Hemocyanin, Panulirus interruptus
Parseval's theorem, 39, 83, 107 Partial coherence, 28-31, 43, 45, 122, 224, see also Envelope function Particle selection automated, 51, 69-73, 214 interactive, 216 tilted/untilted, 212, 213, 274-275 Particle symmetry, 57, 191-192 Patch averaging, 145 Patterson function, 83, 86 PCA, see Principal component analysis Periodogram, 41 Phase contrast transfer function, see Contrast transfer function object, 25-28 origin, common, 76
problem, 8 shift, of scattered wave, 26, 49 PhoE, 41 Photographic film fog level, 51, 68 granularity, 56 recording, 10-11 POCS, see Projection, in hyperspace, onto convex sets Point spread function data collection/reconstruction, 202, 210 Fresnel fringes, 35 instrument, 28, 34-35, 83, 128 Poisson statistics, 55 Portal protein, 97 Power spectrum, 39, 41-42, 60 definition, 41 falloff, 111 noise, 107 white, 38-39 Principal component analysis, 131-132, 143, 160 Projection, in hyperspace, 132, 138, 141 inactive, 240 onto convex sets (POCS), 231-235, 279, 281 constraints, 231-233 superresolution, 235 3D angular alignment, 238 angular distribution, 237 assignment of orientations, 189 angular reconstitution, 189, 194-198, 242 bootstrapping methods, 199 common lines, 189, 193-199 common 1D projection, 195 least-squares method, 202 simultaneous minimization technique, 198 sinogram, 194-197 3D projection matching, 237-241 3D Radon transform, 237, 241-243 central section associated with, 57, 186, 202-207, 220, 229-230 compatibility, 189, 192 conical series, 221 coordinate transformations, 214, 216 flip/flop orientations, 14, 18, 257 geometries, 186-188
matching, see Projection, assignment of orientations, 3D projection matching minimum number, 219 noise estimate, 203, 250-251 orientations, 202 partial, 14-15, 18 random-conical set, 229, 237-238 scaling, 199, 214-217 terminology, 61 theorem, 57, 183-184, 208 tilt-projection, 220 variance, 210, 240 Proteasomes, 49, 252 Purple membrane, see Bacteriorhodopsin Pyruvate dehydrogenase, 102-104
Q Q-factor, see Resolution, Q-factor Q-image, 119 Quaternion mathematics, 198
R Radiation damage, 23, 49-51, 55, 122, 182 Radon transform, see Projection, assignment of orientations, 3D Radon transform Radon's theorem, 183-184, 186 Rayleigh criterion, 67, 111 Random-conical reconstruction, see Reconstruction, random-conical Rank sum analysis, 123-124 Reconstruction algebraic reconstruction technique (ART), 203, 209, 251 algorithm, 182 angular reconstitution, using, 194-199 artifacts, 207 back-projection, 203-204 body, 203-204 summation, 205 transfer function, associated with, 207 weighted, 182, 202-208, 250-257 weighting functions, 199, 205-208, 222 crystal sheets, 76 cylindrically averaged, 190-192 Fourier interpolation, 181, 202, 207 Fourier reconstruction techniques, 202 helical, 76 iterative algebraic, 182, 202, 209-210, 231, 251, 253
Reconstruction (continued) linear reconstruction schemes, 202 merging, 225-230 pseudoinverse methods, 210 random-conical, 18, 62, 65, 72, 181, 183, 195, 202, 205, 211-225, 252, 256, 273 reconstruction-cum-restoration, 49, 231, 246 resolution, 69, 183, 184, 207, 218-219, 222-225, 236 direction dependence, 223-224 simultaneous iterative reconstruction technique (SIRT), 203, 210, 251 modified, 203, 222 spherical harmonics-based, 202 techniques, 202-210 linearity, 202 Reference problem, 130 Refinement angular, 53, 195, 220, 235-245 techniques, 65 Resolution anisotropy, see Reconstruction, resolution, direction dependence biological specimens, 67 concept, 110-112 criteria, 112-121 critical distance, 67 cross-resolution, 111 crystallographic, 67, 111 definition, 67 degradation, due to interpolation, 89 differential phase residual, 89, 113-115, 122, 218, 224, 254, 279 distance, 223 domain, 107, 116-117 electron-optical, 28, 30-32 Fourier ring correlation, 50, 115-116, 225 Fourier shell correlation, 225 improvement, 120 limiting factors, 111 mathematical, 222 Nyquist limit, 113-114 point-to-point, 67 potential, 119 Q-factor, 111, 117-119 S-factor, 119 spectral signal-to-noise ratio (SSNR), 111, 119-121, 224 theoretical, 224 Young's fringes, 111, 116-117
Restoration, see also Contrast transfer function fidelity, 34 image, 34, 87 Schiske-type, 49 3D, 231-235 Ribosome, 2-3, 10, 182, 193, 229, 254, 262 factor binding, 102, 254 mRNA, 254, 271 rRNA, 19, 177, 266-267, 271 subunit 30S, 19, 73, 105, 107, 112, 175, 214, 271 40S, 5, 14-15, 18, 21, 89, 102-103, 109, 159, 254, 256-257, 260 50S, 13, 19, 62, 75, 102, 177-178, 202, 218, 219, 223, 227, 236, 242, 261, 265, 271 crown view, 62, 75, 218, 227 kidney view, 62, 75 stalk (L7/L12), 102, 128, 253, 261-262 60S, 254, 256 70S, 19, 66, 123, 145, 150, 167, 171, 173, 219, 228, 236, 254, 257, 261, 267, 271 80S, 254-257, 260, 271 tRNA, 106, 107, 254, 271 RNA, 19, 38, 262 Rocking, of molecules, 65, 175, 257-259 Rotation function, 81
S S-factor, see Resolution, S-factor Salmonella, 191 Sampling aliasing, 68-69 theorem, 67-69, 208-209 Sandwiching technique, see Negative staining Scanning transmission electron microscope, 195 Scattering angle, 26 elastic, 25, 36, 49, 52 inelastic, 25, 36, 49, 52, 122 multiple, 6 Scherzer focus, 87 Scorpion, see Hemocyanin Scutigera coleoptrata, see Hemocyanin, Scutigera coleoptrata
Segmentation, 260-263
Shape continuity, 221 transform, 185, 206 Signal energy, 120 spatially varying, 54 Signal-to-noise ratio, 52, 69, 93-94, 98, 107-110, 119, 144, 156, 195-198, 214, 220, 229, 242 ice-embedded specimens, images of, 161, 276 measurement, 108-110 spectral, see Resolution, spectral signal-to-noise ratio (SSNR) Significance, 105-107, 247-248, 250, 252-254 test, 107, 252 Similarity order, 221 pathway, 131, 221 Single-layer preparation, 143 Sinogram, see Projection, assignment of orientations, sinogram SIRT, see Reconstruction, simultaneous iterative reconstruction technique (SIRT) Skeletal fast twitch muscle, see Calcium release channel SNR, see Signal-to-noise ratio Solvent flattening, 231 Specimen charging, see Charging effects, of specimen deformations, preparation-induced, 226 flattening, 165, 192 movement, see Drift preparation, 12-24, 64 frozen-hydrated, see Vitreous ice negatively stained, see Negative staining Spectral signal-to-noise ratio (SSNR), see Resolution, spectral signal-to-noise ratio (SSNR) Spherical viruses, 7, 76 SPIDER/WEB image processing package, 146, 168, 216, 283-284, 285-288 Spot scanning, 51-52, 87, 122 Spray-mix method, 10 SRP54, signal sequence-binding protein, 197 SSNR, see Resolution, spectral signal-to-noise ratio (SSNR) Stain variation, see Negative staining, variations Standard error of the mean, 105, 252
Statistical optics, 9 Stereo representation, 265, 288 Stereoscopic imaging, 3 Student distribution, 105 Structure factor, 40, 57 Surface representation, 222, 263-265, 279-280 topology, 62, 264
T t-test, 104, 253-255 Tannic acid, 12, 22 TCP-1 complex, 174 Temperature factor, 74 Tilt axis direction, 213, 214 geometry, see Data collection, geometries series, 221 stage, 187 Tilting, single-axis, 205, 207 Tobacco mosaic virus, 38, 122 Tomography, 3, 6, 189, 235, 249, 288 Torpedo marmorata, 105 Transfer function, see Contrast transfer function Tropomyosin, 253 Turnip yellow mosaic virus, 19
U Unbending, 40, 76 Undecagold, s e e Gold-labeling, Undecagold Underfocus, 34, 38, 211
V Validation, 247-248, 254-260 Variance eigenvector space, 161 electron-dose dependence, 101 function of number of images, 109 interimage, total, 131-132, 148 intragroup, 167 map, 70, 75, 102-105 noise, 107, 108 sample, 107, 108 signal, 39, 108 3D, 203, 210, 247-252, 255 Vitreous ice, 12 amplitude contrast, 38 boundary particle/ice, 267-268
images, 49, 52-53, 220 orientational preferences in, 65 power spectrum, 44 preparation, 22-23 3D structures, 191, 198, 260, 271, 273-281 Volume criterion, 267-268 Volume rendering, 265-266 VOXELVIEW, 288
W Ward's criterion, see Classification, hierarchical ascendant, Ward's criterion Wave aberration function, 26-27, 30 Weak phase object approximation, 25-28, 36, 52 WEB, see SPIDER/WEB image processing package
Weighted back-projection, see Reconstruction, back-projection, weighted Whittaker-Shannon theorem, see Sampling theorem Wiener filter, 34, 45-47, 84 Wilson plot, 121 Wrap-around effect, see Cross-correlation, wrap-around effect
X X-ray crystallography, 3, 7-9, 23, 287 diffraction, 23 transform, 237
Y Young's fringes, see Resolution, Young's fringes