PROGRESS IN OPTICS VOLUME I
FROM THE SERIES IN PHYSICS

General Editors:
J. DE BOER, Professor of Physics, University of Amsterdam
H. BRINKMAN, Professor of Physics, University of Groningen
H. B. G. CASIMIR, Director of the Philips Research Laboratories, Eindhoven

Monographs:
H. C. BRINKMAN, Application of Spinor Invariants in Atomic Physics
H. G. VAN BUEREN, Imperfections in Crystals
S. R. DE GROOT, Thermodynamics of Irreversible Processes
E. A. GUGGENHEIM, Thermodynamics
E. A. GUGGENHEIM, Boltzmann's Distribution Law
E. A. GUGGENHEIM and J. E. PRUE, Physicochemical Calculations
H. JONES, The Theory of Brillouin Zones and Electronic States in Crystals
H. A. KRAMERS, Quantum Mechanics
H. A. KRAMERS, The Foundations of Quantum Theory
J. G. LINHART, Plasma Physics
J. MCCONNELL, Quantum Particle Dynamics
A. MERCIER, Analytical and Canonical Formalism in Physics
I. PRIGOGINE, The Molecular Theory of Solutions
E. G. RICHARDSON, Relaxation Spectrometry
P. ROMAN, Theory of Elementary Particles
M. E. ROSE, Internal Conversion Coefficients
J. L. SYNGE, Relativity: The Special Theory
J. L. SYNGE, Relativity: The General Theory
J. L. SYNGE, The Relativistic Gas
H. UMEZAWA, Quantum Field Theory
A. VAŠÍČEK, Optics of Thin Films
A. H. WAPSTRA, G. J. NIJGH and R. VAN LIESHOUT, Nuclear Spectroscopy Tables

Edited Volumes:
J. BOUMAN (editor), Selected Topics in X-Ray Crystallography
J. M. BURGERS and H. C. VAN DE HULST (editors), Gas Dynamics of Cosmic Clouds. A Symposium
P. M. ENDT and M. DEMEUR (editors), Nuclear Reactions, Volume I
C. J. GORTER (editor), Progress in Low Temperature Physics, Volumes I-III
G. L. DE HAAS-LORENTZ (editor), H. A. Lorentz, Impressions of his Life and Work
J. KISTEMAKER, J. BIGELEISEN and A. O. C. NIER (editors), Proceedings of the International Symposium on Isotope Separation
J. KOCH (editor), Electromagnetic Isotope Separators and Applications of Electromagnetically Enriched Isotopes
Z. KOPAL (editor), Astronomical Optics and Related Subjects
H. J. LIPKIN (editor), Proceedings of the Rehovoth Conference on Nuclear Structure
N. R. NILSSON (editor), Proceedings of the Fourth International Conference on Ionization Phenomena in Gases, Uppsala, 1959
K. SIEGBAHN (editor), Beta- and Gamma-Ray Spectroscopy
Symposium on solid state diffusion (Colloque sur la diffusion à l'état solide, Saclay, 1958)
Symposium on corrosion (3e Colloque de métallurgie sur la corrosion, Saclay, 1959)
Turning Points in Physics. A series of lectures given at Oxford University in Trinity Term 1958
J. G. WILSON and S. A. WOUTHUYSEN (editors), Progress in Elementary Particle and Cosmic Ray Physics, Volumes I-V
E. WOLF (editor), Progress in Optics, Volume I
B. VAN DER POL, Selected Scientific Papers
P. EHRENFEST, Collected Scientific Papers
EDITORIAL ADVISORY BOARD

M. FRANÇON, Paris
A. C. S. VAN HEEL, Delft
E. INGELSTAM, Stockholm
K. S. KRISHNAN, New Delhi
H. KUBOTA, Tokyo
E. L. O'NEILL, Boston
J. PICHT, Potsdam
A. RUBINOWICZ, Warszawa
W. H. STEEL, Sydney
G. TORALDO DI FRANCIA, Firenze
W. T. WELFORD, London
H. WOLTER, Marburg
PROGRESS IN OPTICS VOLUME I

EDITED BY
E. WOLF
University of Rochester, N.Y.

Contributors
R. J. PEGIS, K. MIYAMOTO, R. BARAKAT, D. GABOR, H. WOLTER, H. KUBOTA, A. FIORENTINI, A. C. S. VAN HEEL

1961 NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM

No part of this book may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the publisher

PUBLISHERS: NORTH-HOLLAND PUBLISHING CO., AMSTERDAM
SOLE DISTRIBUTORS FOR U.S.A.: INTERSCIENCE PUBLISHERS INC., NEW YORK

PRINTED IN THE NETHERLANDS
PREFACE

With a continually increasing volume of research, workers in all branches of the sciences are experiencing difficulties in keeping abreast of the numerous developments. It is the aim of this new series to provide information in the form of review articles about current researches in Optics and in related fields.

Optical research carried out in recent times is covering a wide range of subjects. In particular mention may be made of phase and interference microscopy, optics of thin films and fiber optics. The exploration of the analogy between optics of visible radiation and microwave optics has helped to provide solutions to old problems and is posing new ones. Other fruitful lines of development have come about from the exploration of the similarities that exist between optical systems and other systems used for the transfer of information (e.g. electric circuits), from the increasing use of correlation techniques in problems relating to coherence and polarization, and from the introduction of high speed electronic computers in optical design.

These and other developments present new opportunities both for basic research and for technical developments. It is hoped that Progress in Optics will reflect these activities and will give help and provide stimulus to workers in Optics and in related sciences.

Institute of Optics
University of Rochester
Rochester 20, New York
November, 1960

EMIL WOLF
CONTENTS

PREFACE
CONTENTS

I. THE MODERN DEVELOPMENT OF HAMILTONIAN OPTICS by R. J. PEGIS
1. INTRODUCTION
2. THE CHARACTERISTIC FUNCTIONS
   2.1 Preliminary remarks
   2.2 Fermat's principle
   2.3 Illustrative example
   2.4 Snell's law
   2.5 The point characteristic
   2.6 The eikonal
   2.7 The choice of variables
3. THE DEPENDENCE OF THE ABERRATIONS UPON OBJECT AND STOP POSITION
   3.1 Preliminary remarks
   3.2 Notation for the eikonal
   3.3 Aberrations of the stop
   3.4 Statement of the transformation
   3.5 Crossed brackets
   3.6 Theory of the transformation
   3.7 Relation to the focal eikonal
4. CONCLUSION
REFERENCES

II. WAVE OPTICS AND GEOMETRICAL OPTICS IN OPTICAL DESIGN by K. MIYAMOTO
1. INTRODUCTION
2. WAVE SURFACE AND CHARACTERISTIC FUNCTION (EIKONAL)
3. INTENSITY DISTRIBUTION OF LIGHT IN AN OPTICAL IMAGE
4. THE RESPONSE FUNCTION
   4.1 Incoherent illumination
   4.2 Comparison of wave optical and geometric-optical response functions
   4.3 Partially coherent illumination
5. IMAGE EVALUATION BY SPOT DIAGRAM
   5.1 Image evaluating method
   5.2 Single figure of merit for cybernetic design with digital computer
REFERENCES

III. THE INTENSITY DISTRIBUTION AND TOTAL ILLUMINATION OF ABERRATION-FREE DIFFRACTION IMAGES by R. BARAKAT
1. INTRODUCTION
2. KIRCHHOFF DIFFRACTION THEORY
3. SPECIAL PROBLEMS
   3.1 Point source - uniform amplitude distribution
   3.2 Point source - variable amplitude distribution
   3.3 Point source - high numerical aperture
   3.4 Imaging of extended objects
   3.5 Total illumination
   3.6 Experimental results
4. VECTOR DIFFRACTION THEORIES
REFERENCES

IV. LIGHT AND INFORMATION by D. GABOR
1. INTRODUCTION
2. GEOMETRICAL OPTICS
3. CLASSICAL WAVE OPTICS
4. THE PARADOX OF "OBSERVATION WITHOUT ILLUMINATION"
5. A FURTHER PARADOX: "A PERPETUUM MOBILE OF THE SECOND KIND"
6. THE METRICAL INFORMATION IN LIGHT BEAMS
7. CONCLUSION
APPENDICES
   I. Diffraction of a wave at a plane object
   II. Non-redundant specification of optical objects
   III. The effect of illumination
   IV. Notes to the perpetuum mobile problem
   V. Occupation numbers in light beams and in electron beams
   VI. Information capacity and selective entropy
REFERENCES

V. ON BASIC ANALOGIES AND PRINCIPAL DIFFERENCES BETWEEN OPTICAL AND ELECTRONIC INFORMATION by H. WOLTER
1. INTRODUCTION
2. ANALOGIES BETWEEN TRANSMISSION LINES IN ELECTRONICS AND LAYER SYSTEMS IN OPTICS
   2.1 The general wave-analogy
   2.2 The analogy relations
   2.3 The four terminal matrix for optical waves in the layer system and its general analogy to the waves in systems of series circuits consisting of homogeneous transmission lines and four terminal networks
   2.4 Limits of the analogy caused by differences of dimensional multiplicity
   2.5 Limits of the analogy because of the condition of violation of transversality of the two-conductor system
   2.6 Limits of the analogy because of the different rôle of reflection
   2.7 Examples of analogy between layer optics and conduction theory
   2.8 Possibilities of extension
3. ANALOGIES BETWEEN OPTICAL AND HERTZIAN WAVES
   3.1 The problem of the non-reflecting metallic wall for hertzian waves
   3.2 The ray shift with light and long waves
   3.3 The overcoming of the optical unsharpness condition by means of the analogy with the radio direction finding procedure
   3.4 Limits of the analogy in the domain of radiation
4. THE PSEUDOANALOGY BETWEEN TIME AND COORDINATE, OR FREQUENCY AND DIRECTION VARIABLE
   4.1 Zernike's phase contrast method and its communication technique analogy - the phase demodulation
   4.2 The Fourier formalism in optical and electronic information theory
   4.3 Solutions of the basic problem in the domain of communication technique
   4.4 Failure of the analogous solution method in optics and the incompleteness of the coordinate-time analogy
   4.5 The problem of analytic continuation of the spectral function F(y) in optics
   4.6 Solution of the basic information-theoretical problem in optics
   4.7 Common and distinctive factors between information theories of electronics and optics
REFERENCES

VI. INTERFERENCE COLOR by H. KUBOTA
1. INTRODUCTION. EVALUATION OF COLOR
2. INTERFERENCE COLOR OF MONOLAYER
   2.1 Two types of layers
   2.2 Color of non-reflection layer
   2.3 Effect of multiple reflection and dispersion
   2.4 Oblique incidence
3. INTERFERENCE COLOR OF MULTILAYER
   3.1 Double layer
   3.2 Triple layer
   3.3 Multilayer
4. COLOR OF A THIN FILM ON METALLIC SURFACE
5. INTERFERENCE COLOR OF CHROMATIC POLARIZATION
   5.1 Birefringent crystal
   5.2 Sensitive color
   5.3 Sensitivity of the sensitive color
   5.4 Hypersensitive color
   5.5 Optically active crystal
6. INTERFERENCE COLOR IN OTHER PHENOMENA
NEW TABLES OF THE INTERFERENCE COLOR
REFERENCES

VII. DYNAMIC CHARACTERISTICS OF VISUAL PROCESSES by A. FIORENTINI
1. INTRODUCTION
2. DYNAMIC THEORIES OF VISUAL ACUITY
3. INVOLUNTARY MOVEMENTS OF THE EYE
4. VISION WITH STABILIZED RETINAL IMAGES
5. DISCUSSION ON THE POSSIBLE ROLE OF INVOLUNTARY EYE MOVEMENTS
6. DYNAMIC CHARACTERISTICS OF BINOCULAR VISION
7. SOME VISUAL EFFECTS PRODUCED BY INTERMITTENT ILLUMINATION
8. THE PERCEPTION OF CONTOURS
REFERENCES

VIII. MODERN ALIGNMENT DEVICES by A. C. S. VAN HEEL
1. INTRODUCTION
2. CUSTOMARY METHODS EMPLOYING COLLIMATORS AND TELESCOPES
   2.1 Telescopes
   2.2 Telescopes and collimators
3. INTERFERENCE ARRANGEMENTS
4. DISCUSSION OF THE PRECISION
5. THE USE OF REFLECTING SPHERES AND OF SPHERES WITH A CONCENTRIC CAP
6. SPHERE WITHOUT REFLECTION, PRODUCING A LUMINOUS "LINE"
7. SINGLE LENS AS ALIGNMENT COLLIMATOR
8. THE USE OF THE RAINBOW
9. ALIGNMENT OF SURFACES
10. THE AXICON
11. ADDITIONAL EXAMPLES
12. SOME TECHNICAL REMARKS ON THE MANUFACTURE OF ZONE PLATES
SUPPLEMENTARY NOTE
REFERENCES

AUTHOR INDEX
SUBJECT INDEX
I

THE MODERN DEVELOPMENT OF HAMILTONIAN OPTICS

BY

R. J. PEGIS
Bausch & Lomb Inc., Rochester, N.Y.

CONTENTS

§ 1. INTRODUCTION
§ 2. THE CHARACTERISTIC FUNCTIONS
§ 3. THE DEPENDENCE OF THE ABERRATIONS UPON OBJECT AND STOP POSITION
§ 4. CONCLUSION
REFERENCES
§ 1. Introduction

The method of Sir William Hamilton in mechanics and geometrical optics was undoubtedly one of the most profound mathematical discoveries to come from the nineteenth century. From the time of the communication of his "Theory of Systems of Rays" to the Royal Irish Academy in 1827, Hamilton continued to startle the scientific world with his new idea of a "characteristic function" in physics. He died in 1865. In the field of mechanics the new theory took hold immediately, so that today no one doubts its place in theoretical and applied science. But the theory was intended for use in geometrical optics as well as in mechanics, and the task of further developing Hamilton's ideas along these lines was left to a small handful of followers whose work is almost exclusively confined to this century. We mention STEWARD [1928], SYNGE [1937], LUNEBERG [1944], HERZBERGER [1958].

Probably the most prolific and difficult writer on Hamiltonian optics this century is T. Smith, an English mathematician who has spent most of his life adapting Hamilton's methods to modern lens design. His basic articles appeared in the 1920's, though related articles continue through the 1940's. What is unfortunate is that for the most part these articles have been neglected or misunderstood. The reason for this lies partly in the inherent difficulty of the material, and partly in the enormous economy of expression exercised by their author. There is considerable need today for an understandable presentation of Hamiltonian optics to the contemporary scientific world, with special attention to the ideas of T. Smith which in practice would be otherwise unavailable.

This article is intended as an introduction to the modern developments of Hamiltonian optics. Section 2 develops the more basic and classical ideas, while Section 3 introduces the more radical algebra of aberrations, first discussed by T. SMITH [1922]. It is hoped at some future date to discuss the rest of Smith's work.
§ 2. The Characteristic Functions

2.1. PRELIMINARY REMARKS

The distinguishing feature of Hamilton's method is the use of a "characteristic function" to describe the performance of an optical system. This is not to be confused with the current use of a "merit function" in lens design, for the latter is a performance function defined by itself and applicable to any system, while the characteristic function is actually a function of the system and completely describes the geometric optical properties of that system. Several types of characteristic function are possible, for the properties of a system can be described in terms of points or rays or both. HAMILTON [1828] was the first to use such functions and the originator of the idea, though it was BRUNS [1895] who independently singled out the so-called angular characteristic or "eikonal" as basic for aberration theory. In this second part we discuss what is commonly known about the theory of two of the characteristic functions, the point characteristic and the eikonal. There is also a "mixed characteristic" discussed by SYNGE [1937] and LUNEBERG [1944], but it is similar in properties to the other two and will not be discussed here. We shall show, in fact, that only the eikonal has certain special advantages and that because of these its use is almost always preferable.

2.2. FERMAT'S PRINCIPLE
We are given a general optical system which images one space (called the object space) into another (called the image space). No special assumptions are made about the transformation between the spaces - it may not be one-to-one, so that a point may be imaged into a spot, or vice versa. We retain only the physically obvious assumption that a straight line ray entering the system is imaged into a straight line ray leaving the system. This implicitly involves us in another assumption which we make about the spaces - they are homogeneous and isotropic. In the object space, whose refractive index we denote by n, we choose a right-handed system of perpendicular axes x, y, z, and similarly in the image space of index n' we choose axes x', y', z'. Let the direction cosines of a general ray in the object space be L, M, N and the direction cosines of the optically corresponding ray in the image space be L', M', N'. All quantities in the object space are measured with respect to the x, y, z system, all quantities in the image space with respect to the x', y', z' system. The
two systems may be arbitrarily oriented with respect to each other, though in most applications we make them parallel or even coincident. Let P(x, y, z) be a general point in the object space and P'(x', y', z') a general point in the image space. All the laws of geometrical optics are contained in Fermat's Variational Principle which states that the path taken by light from P through the system to P' will be such that the time of propagation along it is stationary in the Calculus of Variations sense - this means that the path is such that if it were altered infinitesimally, the resulting infinitesimal change in time of propagation would be zero. Now we know that the time of propagation through a medium is proportional to the optical path (refractive index multiplied by geometrical path) taken through the medium; hence by Fermat's principle the optical path must be stationary.

2.3. ILLUSTRATIVE EXAMPLE
In many statements of Fermat's principle the phrase "optical path must be stationary" is replaced by "optical path must be a minimum". We give here an example from LUNEBERG [1944] pp. 96-97, which demonstrates that the stationary optical path need not be a minimum.
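The distinction between a stationary path and a minimal one can be checked numerically. The Python sketch below assumes a concave spherical mirror of radius R centred at the origin, in air, with the two points placed at heights +h and -h on the line through the centre perpendicular to the mirror axis. The function names, the values R = 1 and h = 0.3, and the finite-difference step are illustrative assumptions, not taken from Luneberg's treatment.

```python
import math

def mirror_path(phi, R=1.0, h=0.3):
    """Path length P0 -> Q' -> P1 in air (index unity).

    The mirror is a circle of radius R centred at the origin M;
    Q' is the mirror point at polar angle phi (phi = 0 is the vertex Q).
    P0 = (0, +h) and P1 = (0, -h) lie symmetrically about M.
    """
    qx, qy = R * math.cos(phi), R * math.sin(phi)
    return math.hypot(qx, qy - h) + math.hypot(qx, qy + h)

def classify_vertex(R=1.0, h=0.3, eps=1e-4):
    """Central-difference first and second derivatives of the path at phi = 0."""
    plus = mirror_path(eps, R, h)
    minus = mirror_path(-eps, R, h)
    centre = mirror_path(0.0, R, h)
    first = (plus - minus) / (2.0 * eps)
    second = (plus - 2.0 * centre + minus) / eps ** 2
    return first, second

if __name__ == "__main__":
    first, second = classify_vertex()
    # A vanishing first derivative with a negative second derivative
    # indicates a stationary path that is a local maximum.
    print("d(path)/d(phi) at vertex:", first)
    print("d2(path)/d(phi)2 at vertex:", second)
```

Under these assumptions the first derivative vanishes at the vertex while the second derivative is negative, so the reflected path through the vertex is stationary but is a maximum among neighbouring mirror points.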
Fig. 2.1
In a medium of air (index unity) consider a spherical mirror with center M and vertex Q, as shown in Fig. 2.1. Let P₀ and P₁ be symmetrically located about M with respect to the mirror axis. We know
in advance that the ray path which will be taken between P₀ and P₁ via the mirror is P₀QP₁ since it is the only path fulfilling the reflection law. Since the medium is air the optical path here is P₀Q + QP₁. We show that if Q' be any point on the mirror and in the plane of P₀, Q, P₁ then the optical path via Q' is shorter than that via Q. To do this we construct the ellipse E through Q with P₀ and P₁ as focal points. Its radius of curvature at Q is certainly greater than MQ, the radius of the mirror, so it will lie outside the mirror. Extend P₀Q' till it intersects the ellipse, say at Q''. Then

$$P_1Q'' + Q''Q' > P_1Q',$$
$$\therefore\quad P_1Q'' + Q''Q' + Q'P_0 > P_1Q' + Q'P_0,$$
$$\therefore\quad P_1Q'' + Q''P_0 > P_1Q' + Q'P_0.$$

But on the other hand

$$P_1Q'' + Q''P_0 = P_1Q + QP_0,$$
$$\therefore\quad P_1Q + QP_0 > P_1Q' + Q'P_0,$$

which demonstrates our assertion that the true optical path need not be a minimum.

2.4. SNELL'S LAW
To further illustrate and confirm Fermat's principle we use it to deduce Snell's law. Consider the case of refraction by a single surface whose equation is given in the form

$$x = f(y, z). \tag{2.1}$$

Let the x, y, z and x', y', z' coordinate systems be coincident, so that all coordinates are measured in the same system. Fig. 2.2 shows a proposed ray path from P to P', where all symbols have the meanings assigned to them in subsection 2.2. If P̄(x̄, ȳ, z̄) denotes a point on the surface, then the optical path from P to P' via P̄ is given by

$$nD + n'D', \tag{2.2}$$

where we have written

$$D = \left[(\bar{x} - x)^2 + (\bar{y} - y)^2 + (\bar{z} - z)^2\right]^{1/2}, \tag{2.3}$$

$$D' = \left[(x' - \bar{x})^2 + (y' - \bar{y})^2 + (z' - \bar{z})^2\right]^{1/2}. \tag{2.4}$$

By Fermat's principle we shall have the optically correct ray joining P
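and P' if infinitesimal alterations in the ray leave (2.2) unaffected. Such alterations are accomplished by varying (x̄, ȳ, z̄) slightly, maintaining x̄ = f(ȳ, z̄) so that P̄ will remain on the surface. In effect, then, we are requiring that the derivatives of (2.2) with respect to ȳ and z̄, where x̄ = f(ȳ, z̄), must be zero.

Fig. 2.2

Performing the differentiations and equating to zero we have

$$0 = \frac{n(\bar{y} - y)}{D} + \frac{n(\bar{x} - x)}{D}\frac{\partial \bar{x}}{\partial \bar{y}} + \frac{n'(\bar{y} - y')}{D'} + \frac{n'(\bar{x} - x')}{D'}\frac{\partial \bar{x}}{\partial \bar{y}},$$

$$0 = \frac{n(\bar{z} - z)}{D} + \frac{n(\bar{x} - x)}{D}\frac{\partial \bar{x}}{\partial \bar{z}} + \frac{n'(\bar{z} - z')}{D'} + \frac{n'(\bar{x} - x')}{D'}\frac{\partial \bar{x}}{\partial \bar{z}}, \tag{2.5}$$

where D and D' are as defined in eqs. (2.3) and (2.4). Now from Fig. 2.2 we have

$$L = \frac{\bar{x} - x}{D},\quad M = \frac{\bar{y} - y}{D},\quad N = \frac{\bar{z} - z}{D},\quad L' = \frac{x' - \bar{x}}{D'},\quad M' = \frac{y' - \bar{y}}{D'},\quad N' = \frac{z' - \bar{z}}{D'}, \tag{2.6}$$

so that our derivative equations may be written

$$nM - n'M' = (n'L' - nL)f_{\bar{y}},\qquad nN - n'N' = (n'L' - nL)f_{\bar{z}}, \tag{2.7}$$

where we have used the notation

$$\frac{\partial \bar{x}}{\partial \bar{y}} = f_{\bar{y}},\qquad \frac{\partial \bar{x}}{\partial \bar{z}} = f_{\bar{z}}.$$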
nL
- n’L’ =
(n’L’ - nL)(- 1).
(2.8)
We now regard (I,, M, N) as components of a unit vector s in the incident direction and (L’, M’, N’) as components of a unit vector s’ in the refracted direction. Also, since the equation of the surface may be written in the form -
x + f(7, z) = 0
and since the direction of the normal to any surface g(x,y , z ) given by the vector
=
0 is
(Z, $>t>. it is clear that (- I , fi, fz) are direction numbers for the normal to our surface. Denote this vector by Ap where ilis a length and p is the unit normal. In terms of s, s’, p our eqs. (2.7) and (2.8) may be written in the convenient vector form
ns
- n’s‘ =
(n’L’ - nL)Ap.
Taking the vector cross-product of this equation with p we have
n(s x p ) - n’(s’ x p ) = 0,
(2.9)
since the cross-product of the vector p with itself is zero. It is easily seen that (2.9) is Snell’s law. For from the directions of the vectors we see that the plane defined by s and p is parallel (and therefore coincident with) the plane defined by s’ and p ; and from the magnitudes of the vectors we have
n sin(s,p ) = n’ sin(s’, p ) ,
(2.10)
where the symbol (s,p ) means the angle between s and p . 2.5. THE P O I N T CHARACTERISTIC
If in the previous discussion we had actually carried out a solution for the (x̄, ȳ, z̄) of an optically correct ray from P to P' and substituted these values in the expression (2.2) we would have found nD + n'D' as a function of the initial and final points alone, i.e. a function of P and P'. This function would be the true optical path from P to P' and a function of their six coordinates. We denote it by

$$V(x, y, z, x', y', z') \tag{2.11}$$
and call it the point characteristic function of the system. If the system consisted of several surfaces, we would have to impose the conditions for stationary path at each surface, eliminate all the intermediary coordinates, and end up with a function V of the initial and final coordinates alone. There are special difficulties and special methods associated with carrying out this scheme - meanwhile we only wish to examine the usefulness of the function V on the supposition that we could obtain it.

We apply Fermat's principle to an arbitrary optical system, using the same symbols P, x, y, z, L, M, N, P', x', y', z', L', M', N' with the same meanings as before now applied to the system as a whole. For greatest generality we take the coordinate systems x, y, z and x', y', z' to be unrelated. The most important property of the point characteristic comes to light when we investigate the derivatives of V with respect to its six variables. In the literature see, for instance, SYNGE [1937] pp. 17-24, STEWARD [1928] pp. 19-20. With reference to Fig. 2.3, let PQ be a ray entering the system and let Q'P' be the corresponding emerging ray.

Fig. 2.3

We know that small changes in Q and Q' do not affect V, so now we consider the effect on V of small changes in P and P'. Define a point P + δP near P with coordinates (x + δx, y + δy, z + δz) and a point P' + δP' near P' with coordinates (x' + δx', y' + δy', z' + δz'). Let Q + δQ and Q' + δQ' be points on the ray defined by P + δP and P' + δP', near Q and Q' respectively. To facilitate the writing of equations, a distance enclosed in square brackets, e.g. [P, Q], shall denote an optical path. Hence V, the optical path from P to P', is given by

$$V = [P, Q] + [Q, Q'] + [Q', P']. \tag{2.12}$$

If now we denote by V + δV the point characteristic (optical path) for the points P + δP and P' + δP', we have

$$V + \delta V = [P + \delta P,\ Q + \delta Q] + [Q + \delta Q,\ Q' + \delta Q'] + [Q' + \delta Q',\ P' + \delta P'].$$

Now by Fermat's principle the change in V must be due to the change in P and P' alone, for if the optical path is stationary, our diversion of the intermediary points Q and Q' to the nearby points Q + δQ and Q' + δQ' produces no change in V. Hence we may ignore the diversion of Q and Q' and write

$$V + \delta V = [P + \delta P,\ Q] + [Q, Q'] + [Q',\ P' + \delta P'].$$

Subtracting eq. (2.12) from this we have

$$\delta V = [P + \delta P,\ Q] + [Q',\ P' + \delta P'] - [P, Q] - [Q', P']$$
$$\phantom{\delta V} = \{[P + \delta P,\ Q] - [P, Q]\} + \{[Q',\ P' + \delta P'] - [Q', P']\}$$
$$\phantom{\delta V} = \{[P + \delta P,\ Q] - [P, Q]\} - \{[P' + \delta P',\ Q'] - [P', Q']\}.$$

But [P + δP, Q] - [P, Q] = δ[P, Q] taken with respect to x, y, z, and [P' + δP', Q'] - [P', Q'] = δ[P', Q'] taken with respect to x', y', z', so that we have

$$\delta V = \left(\frac{\partial}{\partial x}[P, Q]\,\delta x + \frac{\partial}{\partial y}[P, Q]\,\delta y + \frac{\partial}{\partial z}[P, Q]\,\delta z\right) - \left(\frac{\partial}{\partial x'}[P', Q']\,\delta x' + \frac{\partial}{\partial y'}[P', Q']\,\delta y' + \frac{\partial}{\partial z'}[P', Q']\,\delta z'\right). \tag{2.13}$$
The various derivatives of [P, Q] and [P', Q'] may be worked out as follows. Let Q have coordinates (u, v, w). Then

$$[P, Q] = n\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{1/2}.$$

Taking the partial derivative of this with respect to x we have

$$\frac{\partial}{\partial x}[P, Q] = n(x - u)\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{-1/2}.$$

But in Fig. 2.3

$$(x - u)\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{-1/2} = -L.$$

Hence we may write

$$\frac{\partial}{\partial x}[P, Q] = -nL$$

and similarly

$$\frac{\partial}{\partial y}[P, Q] = -nM,\qquad \frac{\partial}{\partial z}[P, Q] = -nN.$$

In the same way, letting the coordinates of Q' be (u', v', w') we find from Fig. 2.3 (since [P', Q'] is negative)

$$\frac{\partial}{\partial x'}[P', Q'] = -\frac{\partial}{\partial x'}[Q', P'] = -n'L',\qquad \frac{\partial}{\partial y'}[P', Q'] = -n'M',\qquad \frac{\partial}{\partial z'}[P', Q'] = -n'N'.$$

Hence eq. (2.13) may be written in the striking form

$$\delta V = -n(L\,\delta x + M\,\delta y + N\,\delta z) + n'(L'\,\delta x' + M'\,\delta y' + N'\,\delta z'). \tag{2.14}$$
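From this we have all the derivatives of V. Denoting partial derivatives by subscript letters here and henceforth in this article we may write

$$V_x = -nL,\quad V_y = -nM,\quad V_z = -nN,\quad V_{x'} = n'L',\quad V_{y'} = n'M',\quad V_{z'} = n'N'. \tag{2.15}$$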
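The relations between the derivatives of V and the direction cosines that follow from eq. (2.14) can be illustrated numerically in the simplest case. The Python sketch below assumes a plane refracting surface at x̄ = 0 and a ray confined to the x, y plane; the point characteristic is obtained by locating the stationary crossing height by bisection. The function names, the sample points, and the indices n = 1, n' = 1.5 are illustrative assumptions.

```python
import math

def stationary_height(P, P_prime, n, n_prime, iters=200):
    """Height y_bar on the plane x = 0 at which n*D + n'*D' is stationary."""
    x, y = P
    xp, yp = P_prime

    def slope(yb):
        # Derivative of the optical path with respect to y_bar.
        D = math.hypot(x, yb - y)
        Dp = math.hypot(xp, yp - yb)
        return n * (yb - y) / D - n_prime * (yp - yb) / Dp

    lo, hi = min(y, yp), max(y, yp)   # slope <= 0 at lo and >= 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def point_characteristic(P, P_prime, n, n_prime):
    """Optical path V from P (x < 0) to P' (x' > 0) along the true ray."""
    yb = stationary_height(P, P_prime, n, n_prime)
    return (n * math.hypot(P[0], yb - P[1])
            + n_prime * math.hypot(P_prime[0], P_prime[1] - yb))

def direction_cosines(P, P_prime, n, n_prime):
    """Return (L, M, L', M') of the incident and refracted rays."""
    yb = stationary_height(P, P_prime, n, n_prime)
    D = math.hypot(P[0], yb - P[1])
    Dp = math.hypot(P_prime[0], P_prime[1] - yb)
    return (-P[0] / D, (yb - P[1]) / D,
            P_prime[0] / Dp, (P_prime[1] - yb) / Dp)

if __name__ == "__main__":
    P, Pp, n, npr, h = (-2.0, 0.5), (3.0, -1.0), 1.0, 1.5, 1e-5
    L, M, Lp, Mp = direction_cosines(P, Pp, n, npr)
    Vx = (point_characteristic((P[0] + h, P[1]), Pp, n, npr)
          - point_characteristic((P[0] - h, P[1]), Pp, n, npr)) / (2 * h)
    print("Snell residual n*M - n'*M':", n * M - npr * Mp)
    print("V_x (finite difference):", Vx, " versus -n*L:", -n * L)
```

With the surface normal along x, the quantities M and M' are the sines of the angles of incidence and refraction, so a vanishing residual n M - n' M' is Snell's law, and the finite-difference derivative of V with respect to x can be compared directly with -nL.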
It may be noted that V satisfies Hamilton's partial differential equation in (x, y, z) and in (x', y', z'), as discussed by LUNEBERG [1944] pp. 103-110, STEWARD [1928] pp. 19-20, SYNGE [1937] pp. 18-19.

$$V_x^2 + V_y^2 + V_z^2 = n^2,\qquad V_{x'}^2 + V_{y'}^2 + V_{z'}^2 = n'^2. \tag{2.16}$$

Interesting as these relations seem, they constitute in reality a serious disadvantage in the use of the function V. For because of eq. (2.16) not every function of our six coordinate variables can be the point characteristic of an optical system, but rather, only those functions satisfying two given non-linear partial differential equations. For the analogous situation in mechanics see LANCZOS [1949] pp. 222-228.

2.6. THE EIKONAL
The angular characteristic function or "eikonal" may be defined geometrically in the following way. In Fig. 2.4 let O and O' be origins for the (x, y, z) and (x', y', z') coordinate systems. Let P and P' be the points where two optically corresponding rays cross the (y, z) and (y', z') planes; then

$$V(0, y, z, 0, y', z') = [P, P']. \tag{2.17}$$

Fig. 2.4

Let perpendiculars from O and O' meet the entering and departing rays in S and S', and define

$$E = [S, S']. \tag{2.18}$$

Then the optical distance E is called the eikonal. Now if we project OP and O'P' upon the two portions of the ray we have

$$SP = My + Nz,\qquad P'S' = -S'P' = -M'y' - N'z',$$

where L, M, N, L', M', N' are defined as before. Hence the eikonal E is given by

$$E = V + n(My + Nz) - n'(M'y' + N'z'). \tag{2.19}$$

Taking the first variation of this we have

$$\delta E = \delta V + n(M\,\delta y + y\,\delta M + N\,\delta z + z\,\delta N) - n'(M'\,\delta y' + y'\,\delta M' + N'\,\delta z' + z'\,\delta N').$$

Substituting from eq. (2.14) for δV with δx = δx' = 0 (since P and P' are confined to the planes x = 0 and x' = 0 respectively) we have the simplification

$$\delta E = ny\,\delta M + nz\,\delta N - n'y'\,\delta M' - n'z'\,\delta N'. \tag{2.20}$$

Hence for the derivatives of E when it is regarded as a function of M, N, M', N' we may write

$$E_M = ny,\quad E_N = nz,\quad E_{M'} = -n'y',\quad E_{N'} = -n'z'. \tag{2.21}$$
Thus it is to our advantage to regard $E$ as a function of the four independent direction cosines $M, N, M', N'$ alone, and it will be seen that the properties of the system are completely determined when the form of this function is known. Hence $E$, the eikonal, is also known as the 'angular characteristic function' of the system. It is more convenient than $V$, since it does not have to satisfy any given differential equations. If we allow for variation of $x$ and $x'$ as well (which is seldom done) the eqs. (2.21) may easily be shown to assume the slightly more complicated form
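$$E_M = n\left(y - \frac{xM}{L}\right), \quad E_N = n\left(z - \frac{xN}{L}\right), \quad E_{M'} = -n'\left(y' - \frac{x'M'}{L'}\right), \quad E_{N'} = -n'\left(z' - \frac{x'N'}{L'}\right), \tag{2.22}$$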
as discussed in SYNGE [1937] pp. 29-36. Finally we note the analytical significance of $E$. Substituting (2.15) in (2.19) for $nM$, $nN$, $n'M'$, $n'N'$, we have

$$E = V - yV_y - zV_z - y'V_{y'} - z'V_{z'}.$$
Thus $-E$ is the Legendre transform of $V$ with respect to $y, z, y', z'$, and analytically its new variables $V_y, V_z, V_{y'}, V_{z'}$ are by eq. (2.15) $nM, nN, n'M', n'N'$, or equivalently the direction cosines, as we have already chosen for $E$. The connection of $V$ and $-E$ via the Legendre transform is the same as the connection between the Lagrangian and the Hamiltonian in classical mechanics, so that many of the advantages of the Hamiltonian accrue to the eikonal. For an interesting discussion of the situation in mechanics, see LANCZOS [1949] pp. 262-280.

2.7. THE CHOICE OF VARIABLES
When $O$ and $O'$ are chosen, we have seen that $E$ is a function of $M, N, M', N'$. However, our main concern is with the symmetrical optical system, which has an axis of symmetry such that planes normal to it are imaged into other normal planes. We choose the $x$- and $x'$-axes to coincide, and nearly always take the $y$- and $y'$-axes (therefore also the $z$- and $z'$-axes) to be parallel. The origins $O$ and $O'$ on the common $x$-axis are not necessarily optically corresponding. Because of the symmetry, if the $y$- and $z$-axes are rotated through an angle $\theta$ about the common $x$-axis, and the $y'$- and $z'$-axes rotated through the same angle, there should be no change in the optical path. Hence the point characteristic and the eikonal may be written purely in terms of the invariants of the rotation. In the case of the point characteristic whose variables are $x, y, z, x', y', z'$, if the dependence on $y, z, y', z'$ is invariant under rotation about the common $x$-axis, then these four variables may be replaced by three: the lengths of the vectors $(y, z)$ and $(y', z')$ and the angle between them, or equivalently by $y^2 + z^2$, $yy' + zz'$, $y'^2 + z'^2$. In the case of the eikonal, since the variables are $M, N, M', N'$, we replace them with the three symmetric variables of the rays: the angles made with the axis by the incident and refracted rays and the angle between these two rays, or equivalently, $L$, $L'$, $LL' + MM' + NN'$. But since

$$L^2 + M^2 + N^2 = L'^2 + M'^2 + N'^2 = 1,$$

it is just as correct to choose as symmetric variables the quantities

$$(1) = M^2 + N^2, \qquad (2) = MM' + NN', \qquad (3) = M'^2 + N'^2. \tag{2.23}$$
The use of these numbers to denote variables was introduced by T. SMITH [1922], and while confusing at first sight leads to great convenience in the writing of subscripts. We denote the derivatives of $E$ with respect to these three variables by $E_1, E_2, E_3$, and we consider $E$ as a function $E[(1), (2), (3)]$ of them. Then we have in eq. (2.21)

$$\begin{aligned} ny &= E_M = 2ME_1 + M'E_2, & nz &= E_N = 2NE_1 + N'E_2, \\ -n'y' &= E_{M'} = ME_2 + 2M'E_3, & -n'z' &= E_{N'} = NE_2 + 2N'E_3. \end{aligned} \tag{2.24}$$
Let us now take $O$ and $O'$ to be corresponding points in the system. Then the conditions for the plane $x = 0$ to be imaged onto the plane $x' = 0$ without image errors are

$$n'y' = Gny, \qquad n'z' = Gnz, \tag{2.25}$$

where $G$ is the 'reduced magnification', or ratio of the sizes of image and object (measured in optical rather than geometrical length). It is convenient at this point to choose the initial and final media to be air, so that $n = n' = 1$, and $G$ may be thought of as a geometrical magnification. From eqs. (2.24) and (2.25) we then have

$$\begin{aligned} 0 &= Gy - y' = M(2GE_1 + E_2) + M'(GE_2 + 2E_3), \\ 0 &= Gz - z' = N(2GE_1 + E_2) + N'(GE_2 + 2E_3). \end{aligned}$$

Hence we may write

$$\frac{M(2GE_1 + E_2)}{N(2GE_1 + E_2)} = \frac{-M'(GE_2 + 2E_3)}{-N'(GE_2 + 2E_3)}, \quad \text{i.e.} \quad \frac{M}{N} = \frac{M'}{N'} \ \text{for all rays.} \tag{2.26}$$
This is easily seen to be a contradiction, for it implies that all rays lie in planes through the axis. The only situation in which the contradiction is avoided is if in (2.26)

$$2GE_1 + E_2 = 0, \qquad GE_2 + 2E_3 = 0. \tag{2.27}$$
These may be thought of as the conditions for freedom from image errors. Multiplying the first by $G$ and adding the second we have after dividing by 2

$$G^2E_1 + GE_2 + E_3 = 0. \tag{2.28}$$

Again, multiplying the first of eqs. (2.27) by an arbitrary constant $S$ and adding the second we have

$$2SGE_1 + (S + G)E_2 + 2E_3 = 0. \tag{2.29}$$
Our last two equations suggest that great simplicity would result from a linear change of variables from (1), (2), (3) to I, II, III, say, in such a way that eqs. (2.28) and (2.29) would become the equations $E_{\mathrm{II}} = E_{\mathrm{III}} = 0$. It turns out more convenient to use $-E_{\mathrm{II}}$, so we define

$$-E_{\mathrm{II}} = 2SGE_1 + (S + G)E_2 + 2E_3, \qquad E_{\mathrm{III}} = G^2E_1 + GE_2 + E_3.$$

To keep the formulae symmetric in $S$ and $G$ (which will prove advantageous later) we must choose $E_{\mathrm{I}}$ as

$$E_{\mathrm{I}} = S^2E_1 + SE_2 + E_3.$$
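To give these differential relations we must have for our linear equations

$$\begin{aligned} (1) &= S^2\,\mathrm{I} - 2SG\,\mathrm{II} + G^2\,\mathrm{III}, \\ (2) &= S\,\mathrm{I} - (S + G)\,\mathrm{II} + G\,\mathrm{III}, \\ (3) &= \mathrm{I} - 2\,\mathrm{II} + \mathrm{III}, \end{aligned} \tag{2.30}$$

from which we solve for the equations of transformation, obtaining

$$\begin{aligned} (S - G)^2\,\mathrm{I} &= (1) - 2G(2) + G^2(3), \\ (S - G)^2\,\mathrm{II} &= (1) - (S + G)(2) + SG(3), \\ (S - G)^2\,\mathrm{III} &= (1) - 2S(2) + S^2(3). \end{aligned} \tag{2.31}$$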
The variables I, II, III, first introduced by T. SMITH [1922], are most convenient for aberration theory, since we know that when $E$ is expressed in terms of them, the conditions for freedom from image errors are

$$E_{\mathrm{II}} = E_{\mathrm{III}} = 0, \tag{2.32}$$

i.e. $E$ must be a function of I alone. Thus if $E$ for a system could be expanded as a power series in I, II, III, the various aberrations could be identified with terms such as I II (third order distortion), III³ (fifth order spherical aberration) which do not involve I alone. For a discussion of the geometrical aberrations from this point of view, see STEWARD [1926; 1928, pp. 30-49]. The arbitrary constant $S$ in the transformation is carried along for purposes of symmetry, and since it enters into the equations in exactly the same manner that $G$ does we interpret it as a magnification, usually the magnification associated with the pupil planes of the optical system.
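The change of variables of eqs. (2.30) and (2.31) can be sketched numerically. The following is a minimal Python sketch, not taken from the original text: the function names and the sample ray are hypothetical, and it assumes $S \neq G$ so that the division by $(S - G)^2$ is defined. It forms the rotational invariants (1), (2), (3) of eq. (2.23), maps them to I, II, III, and confirms that eq. (2.30) recovers the starting values.

```python
import math

def rotational_invariants(M, N, Mp, Np):
    """Return the invariants (1), (2), (3) of eq. (2.23)."""
    return M * M + N * N, M * Mp + N * Np, Mp * Mp + Np * Np

def smith_variables(inv, S, G):
    """Eq. (2.31): map (1), (2), (3) to I, II, III (requires S != G)."""
    v1, v2, v3 = inv
    d = (S - G) ** 2
    I = (v1 - 2 * G * v2 + G * G * v3) / d
    II = (v1 - (S + G) * v2 + S * G * v3) / d
    III = (v1 - 2 * S * v2 + S * S * v3) / d
    return I, II, III

def invariants_from_smith(I, II, III, S, G):
    """Eq. (2.30): the inverse map back to (1), (2), (3)."""
    v1 = S * S * I - 2 * S * G * II + G * G * III
    v2 = S * I - (S + G) * II + G * III
    v3 = I - 2 * II + III
    return v1, v2, v3

# Arbitrary sample ray and magnifications (illustrative values only).
inv = rotational_invariants(0.10, -0.05, 0.02, 0.07)
S, G = 1.5, -0.5
back = invariants_from_smith(*smith_variables(inv, S, G), S, G)
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(inv, back))
```

The round trip succeeds for any choice of $S \neq G$, which reflects the fact that (2.30) and (2.31) are inverse linear maps.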
§ 3. The Dependence of the Aberrations upon Object and Stop Position

3.1. PRELIMINARY REMARKS
We have seen that in the expansion of the eikonal for a symmetrical optical system working at a magnification $G$ as a power series in the variables I, II, III, all terms save powers of I alone represent image errors. Now the variables I, II, III involve the magnification, so if we change $G$ we obtain new variables I', II', III', defined in the same way as I, II, III except with the old magnification $G$ replaced by the new magnification $G'$, and the new image errors will be represented by the terms in the new eikonal at the new magnification which do not involve I' alone. Similarly the variables I, II, III involve $S$, so that changes in $S$ also affect the terms in the eikonal. Our purpose in this third part is to investigate the dependence of the terms of $E$ on $G$ and $S$, where we shall take $S$ to be the magnification associated with the pupil planes of the optical system. The algebra of this dependence may be treated very generally, and all orders of aberration considered. Our primary source is T. SMITH [1922], one of his most difficult and important papers, and it is essential to understand this algebra of aberrations in the interpretation of his later papers.

As a first step, however, we must investigate the significance of $E$ as a power series in I alone. Clearly any such series leads to freedom from image errors, and we should like to find some standard form for the series for $E$, such that any and all departures from it (even in powers of I) may be regarded as aberration, even if not all are errors in the image.

3.2. NOTATION FOR THE EIKONAL
We find it convenient to let the focal points of the symmetrical optical system be origins for the object and image spaces, and as before to choose the $x$- and $x'$-axes coincident with the axis of revolution of the system. In this situation we represent the eikonal by $E$, and call it the focal eikonal. Now if with reference to the given origins we define the symbol $E'$ to represent the eikonal of the same system with axial points $(x, 0, 0)$ in the object space and $(x', 0, 0)$ in the image space, where $x$ and $x'$ are measured positively to the right from their respective focal points, we have

$$E' = E - nLx + n'L'x'. \tag{3.1}$$
Again we assume that the end media are air, so that $n = n' = 1$. Suppose now that the axial points at $x$ and $x'$ are conjugate. Then if $f$ is the focal length of the system and $G$ the magnification at which it is working, we have from Newton's lens formula as developed, for example, in STEWARD [1928], p. 3,

$$x = f/G, \qquad x' = -fG, \tag{3.2}$$

so that writing $E^G$ to identify the conjugates we have for $E'$

$$E^G = E - \frac{L}{G}f - L'fG. \tag{3.3}$$

It is customary to write $K = 1/f$, the power of the system, so that

$$E^GK = EK - \frac{L}{G} - L'G. \tag{3.4}$$
Let $S$ be the magnification associated with another pair of conjugate planes perpendicular to the axis, which we shall take to be the pupil planes. For them we have

$$E^SK = E^GK + (S - G)\left(\frac{L}{SG} - L'\right). \tag{3.5}$$
3.3. ABERRATIONS OF THE STOP
We assume that the image is free from aberrations, so that $E^G$ is a function of I alone. But we should like to determine $E^G$ uniquely and for this purpose we find it convenient to impose the additional condition that any ray passing through the axial point of the stop, i.e. through the axial point of the plane in the object space corresponding to magnification $S$, be refracted through the axial point of the corresponding image plane. This will uniquely determine the coefficients in the power series for $E^G$ in the variable I. Optically, the condition we are imposing means that we would like the form of the eikonal when there is no spherical aberration of any order at the axial points of the pupil planes, i.e. at the center of the stop. Let $(Y, Z)$ and $(Y', Z')$ be the coordinates of intersection of a general ray with the pupil planes. Then by eq. (2.21) we may write the derivatives of $E^S$ as

$$Y = E^S_M, \qquad Z = E^S_N, \qquad Y' = -E^S_{M'}, \qquad Z' = -E^S_{N'}.$$
Now $E^G$ is given as a function of I alone, so that writing $E^S$ in terms of $E^G$ by means of eq. (3.5) we have

$$Y = E^G_M + \frac{S - G}{SGK}\,\frac{\partial L}{\partial M}, \qquad Y' = -E^G_{M'} + \frac{S - G}{K}\,\frac{\partial L'}{\partial M'},$$

with similar equations in $Z$ and $Z'$. But in $E^G$ the differentiation with respect to $M$ and $M'$, $N$ and $N'$ can be written in terms of differentiation with respect to I. For from the definition of I in eqs. (2.31) and (2.32) we have
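$$\frac{\partial}{\partial M} = \frac{\partial \mathrm{I}}{\partial M}\,\frac{\partial}{\partial \mathrm{I}} = \frac{2(M - GM')}{(S - G)^2}\,\frac{\partial}{\partial \mathrm{I}}, \qquad \frac{\partial}{\partial M'} = \frac{\partial \mathrm{I}}{\partial M'}\,\frac{\partial}{\partial \mathrm{I}} = -\frac{2G(M - GM')}{(S - G)^2}\,\frac{\partial}{\partial \mathrm{I}},$$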
with similar equations in $N$ and $N'$. Hence, using the relations

$$\frac{\partial L}{\partial M} = -\frac{M}{L}, \qquad \frac{\partial L}{\partial N} = -\frac{N}{L}, \qquad \frac{\partial L}{\partial M'} = 0, \qquad \frac{\partial L}{\partial N'} = 0,$$

which follow immediately from the differentiation of $L$ and $L'$ as functions of $M$ and $N$, $M'$ and $N'$ respectively, we have
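$$Y(S - G)^2 = 2(M - GM')E^G_{\mathrm{I}} - \frac{(S - G)^3}{SGK}\,\frac{M}{L}, \qquad Y'(S - G)^2 = 2G(M - GM')E^G_{\mathrm{I}} - \frac{(S - G)^3}{K}\,\frac{M'}{L'}, \tag{3.6}$$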
with similar equations for $Z$ and $Z'$. Now for freedom from axial aberration at magnification $S$, if $Y$ and $Z$ are zero, $Y'$ and $Z'$ must also be zero, independently of the values of the direction cosines. Using eq. (3.6) and its counterpart in $Z$ and $Z'$ we then have

$$\frac{M}{L} = \frac{2SG(M - GM')}{(S - G)^3}\,E^G_{\mathrm{I}}K = \frac{SM'}{L'}, \qquad \frac{N}{L} = \frac{2SG(N - GN')}{(S - G)^3}\,E^G_{\mathrm{I}}K = \frac{SN'}{L'}. \tag{3.7}$$
If we eliminate $M$ and $M'$ from the first pair of these equations (or $N$ and $N'$ from the second) we find

$$(SL - GL')\,2GE^G_{\mathrm{I}}K = (S - G)^3 \tag{3.8}$$
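and from this we could find $E^G$ if we could find the form which $L$ and $L'$ take under these conditions in terms of I alone. To simplify the notation we write $E_{\mathrm{I}}$ for $E^G_{\mathrm{I}}$, the $G$ being understood. Then squaring and adding the two equations of (3.7) we have

$$\frac{M^2 + N^2}{L^2} = \frac{4S^2G^2[(M - GM')^2 + (N - GN')^2]K^2E_{\mathrm{I}}^2}{(S - G)^6} = \frac{S^2(M'^2 + N'^2)}{L'^2},$$

i.e.

$$\frac{1 - L^2}{L^2} = \frac{4S^2G^2\,\mathrm{I}\,K^2E_{\mathrm{I}}^2}{(S - G)^4} = \frac{S^2(1 - L'^2)}{L'^2}.$$

Set $u = 2GKE_{\mathrm{I}}/(S - G)^2$. Then we have

$$1 - L^2 = L^2S^2\,\mathrm{I}\,u^2, \qquad 1 - L'^2 = L'^2\,\mathrm{I}\,u^2,$$

i.e.

$$L = \frac{1}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}}, \qquad L' = \frac{1}{(1 + \mathrm{I}\,u^2)^{1/2}}.$$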
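Substituting these values in eq. (3.8) we have

$$2GK\left\{\frac{S}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}} - \frac{G}{(1 + \mathrm{I}\,u^2)^{1/2}}\right\}E_{\mathrm{I}} = (S - G)^3,$$

i.e.

$$u\left\{\frac{S}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}} - \frac{G}{(1 + \mathrm{I}\,u^2)^{1/2}}\right\} = S - G. \tag{3.9}$$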
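This is equivalent to a quartic equation in $u$. If we let ${}_{-1/2}C_n$ be the usual binomial coefficient, i.e. the coefficient of $t^n$ in $(1 + t)^{-1/2}$, we have

$$u\sum_{n=0}^{\infty} {}_{-1/2}C_n\,e_n\,\mathrm{I}^n u^{2n} = 1, \tag{3.10}$$

where we have written

$$e_n = \frac{S^{2n+1} - G}{S - G}.$$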
The series (3.10) may be solved for $u$ as a series in I by successive approximation or by formal series reversion to give

$$u = 1 + \tfrac{1}{2}e_1\,\mathrm{I} - \tfrac{3}{8}(e_2 - 2e_1^2)\,\mathrm{I}^2 + \ldots$$

and since from the definition of $u$ we have

$$GKE^G = \tfrac{1}{2}(S - G)^2\int u\,d\mathrm{I}$$

we may therefore write the series for $E^G$ as

$$\frac{2GKE^G}{(S - G)^2} = \mathrm{I} + \tfrac{1}{4}e_1\,\mathrm{I}^2 - \tfrac{1}{8}(e_2 - 2e_1^2)\,\mathrm{I}^3 + \tfrac{1}{64}(5e_3 - 24e_2e_1 + 24e_1^3)\,\mathrm{I}^4 - \tfrac{1}{128}(7e_4 - 40e_3e_1 - 18e_2^2 + 132e_2e_1^2 - 88e_1^4)\,\mathrm{I}^5, \tag{3.11}$$

up to terms of the fifth degree. This is the form which the eikonal must take in the absence of all image errors and all orders of spherical aberration of the pupil. There is no constant term in the aberration since its value is quite arbitrary, only the derivatives of $E$ being significant.
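The reversion of eq. (3.9) into a power series for $u$ can be checked numerically. The sketch below is a minimal Python illustration under stated assumptions: the function names are hypothetical, the sample magnifications are arbitrary, $e_n$ is taken as $(S^{2n+1} - G)/(S - G)$, and the coefficients of $\mathrm{I}^3$ and $\mathrm{I}^4$ in $u$ are obtained by differentiating the bracketed polynomials of eq. (3.11) with the numerical factors $1/64$ and $1/128$ assumed there. The root of (3.9) is found by simple fixed-point iteration, which is adequate only for small I.

```python
import math

def solve_u(I, S, G, iterations=60):
    """Solve eq. (3.9) for u by fixed-point iteration starting from u = 1."""
    u = 1.0
    for _ in range(iterations):
        bracket = (S / math.sqrt(1 + S * S * I * u * u)
                   - G / math.sqrt(1 + I * u * u))
        u = (S - G) / bracket
    return u

def series_u(I, S, G):
    """Power series for u through I**4, i.e. d/dI of the series in eq. (3.11)."""
    e1, e2, e3, e4 = [(S ** (2 * n + 1) - G) / (S - G) for n in range(1, 5)]
    return (1
            + e1 * I / 2
            - 3 * (e2 - 2 * e1 ** 2) * I ** 2 / 8
            + (5 * e3 - 24 * e2 * e1 + 24 * e1 ** 3) * I ** 3 / 16
            - 5 * (7 * e4 - 40 * e3 * e1 - 18 * e2 ** 2
                   + 132 * e2 * e1 ** 2 - 88 * e1 ** 4) * I ** 4 / 128)

# Small value of I so that the neglected fifth-order term is negligible.
I, S, G = 1e-3, 1.5, -0.5
assert abs(solve_u(I, S, G) - series_u(I, S, G)) < 5e-13
```

For these sample values the fourth-order term is of order $10^{-12}$, so the tolerance used is tight enough to exercise every coefficient retained in the series.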
Equation (3.11) gives us a reference for the coefficients of powers of I in the eikonal of any system. When an imperfect system is being considered we subtract eq. (3.11) from its eikonal, and all of the terms which remain, i.e. a power series in I, II, III, will represent aberrations. However, in the transformation theory which follows it is more convenient to transform the full eikonal $E^G$, remembering that when all is done the coefficient of the term in I alone at any order must have a correction applied to it if it is to represent the aberration at the center of the stop. We may now find the form of the focal eikonal $E$ under the aberration-free conditions described above. To do this we substitute from eq. (3.11) in eq. (3.4), using the latter in the form

$$EK = E^GK + \frac{L}{G} + L'G$$

and writing for I its value

$$(S - G)^{-2}[(1) - 2G(2) + G^2(3)].$$
The extra factor $(S - G)^{-2n}$ introduced by $\mathrm{I}^n$ in this substitution is most simply absorbed into the coefficient $e_n$ by writing

$$e_n' = \frac{S^{2n+1} - G}{(S - G)^{2n+1}}.$$
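Then the terms in the focal eikonal $E$ of the first three orders when aberrations are absent are

$$EK = -(2) - \frac{1}{8}\left\{\frac{(1)^2}{G} + (3)^2G - \frac{e_1'}{G}\,[(1) - 2G(2) + G^2(3)]^2\right\} - \frac{1}{16}\left\{\frac{(1)^3}{G} + (3)^3G + \frac{e_2' - 2e_1'^2}{G}\,[(1) - 2G(2) + G^2(3)]^3\right\}. \tag{3.12}$$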
When aberrations are present, however, it is not at all obvious what form $E$ will take when the form of $E^G$ is given. This question, as well as the question of the dependence of the aberration terms on $G$ and $S$, will be discussed with the general transformation theory in what follows. First, however, it might not be out of order to review the terminology used in describing the orders of aberration.

If we keep only the linear terms in I, II, III in $E^G$, i.e. if we consider $M, N, M', N'$ as small quantities, the ray will become a paraxial ray and we shall have Gaussian optics. Since when the system is in focus there are no Gaussian aberrations, we would suspect that the linear part of the eikonal $E^G$ has only a term in I, and this suspicion is indeed correct. Again, the quadratic terms in $E^G$, viz. the terms in I², I II, I III, II², II III, III², are the next to be considered, and of these all but one (the term in I²) represent aberrations. The five aberration terms are related to the Seidel aberrations, as is shown in a slightly different notation in STEWARD [1928] pp. 30-49. Steward, Smith and nearly all British writers call these first aberrations first order or primary aberrations, while in America they are called third order aberrations. As the order of the aberrations increases, the British terminology is first, second, third, etc., or primary, secondary, tertiary, while the American (and some more recent British) is third, fifth, seventh, etc. Here we shall adopt the older British terminology, because it is more suited to the variables with which we are dealing. Thus the quadratic terms in the eikonal give the five first order image errors plus the first order spherical aberration of the pupil, and in general the $n$'th order terms in the eikonal give the aberrations of order $n - 1$, of which all but one are image errors, and one is an aberration of the pupil. We now go on to discuss the general transformation expressions.

3.4. STATEMENT OF THE TRANSFORMATION
It is desired to express the coefficients in the eikonal at object and stop magnifications $G'$ and $S'$ in terms of those at $G$ and $S$. In such a transformation we have seen that the old variables I, II, III will become new variables I', II', III', but it should be carefully noted that the quantities (1), (2), (3) in terms of which the old and new variables are defined do not change in the transformation, for they are independent of $G$ and $S$, being functions of the direction cosines alone. By analogy with eqs. (2.31), the variables I', II', III' are defined by the equations

$$\begin{aligned} (S' - G')^2\,\mathrm{I}' &= (1) - 2G'(2) + G'^2(3), \\ (S' - G')^2\,\mathrm{II}' &= (1) - (S' + G')(2) + S'G'(3), \\ (S' - G')^2\,\mathrm{III}' &= (1) - 2S'(2) + S'^2(3). \end{aligned} \tag{3.13}$$
Solving these equations for (1), (2), (3) in terms of I', II', III' either directly or by the equation analogous to eq. (2.30), and substituting the results in eq. (2.31) we obtain the transformation from I, II, III to I', II', III' as

$$\begin{aligned} \mathrm{I}(S - G)^2 &= \mathrm{I}'(S' - G)^2 - 2\,\mathrm{II}'(S' - G)(G' - G) + \mathrm{III}'(G' - G)^2, \\ \mathrm{II}(S - G)^2 &= \mathrm{I}'(S' - G)(S' - S) - \mathrm{II}'\{(S' - G)(G' - S) + (G' - G)(S' - S)\} + \mathrm{III}'(G' - G)(G' - S), \\ \mathrm{III}(S - G)^2 &= \mathrm{I}'(S' - S)^2 - 2\,\mathrm{II}'(S' - S)(G' - S) + \mathrm{III}'(G' - S)^2. \end{aligned} \tag{3.14}$$
These relations may be expressed more concisely in a notation borrowed from invariant theory, the crossed brackets, which we proceed to define. For a more detailed study, see GRACE and YOUNG [1903], pp. 1-20.

3.5. CROSSED BRACKETS
By $(a_0, a_1, \ldots, a_n \between x, y)^n$ we agree to mean

$${}_nC_0\,a_0x^n + {}_nC_1\,a_1x^{n-1}y + \ldots + {}_nC_n\,a_ny^n,$$

where ${}_nC_r$ is the usual binomial coefficient. We may describe this expression by saying that $(x + ty)^n$ is to be expanded and $t^r$ replaced by $a_r$ throughout. This description enables us to interpret expressions such as

$$(a_0, a_1, \ldots, a_n \between x, y)^k(x', y')^{n-k}$$

as long as $n \ge k$. For we simply take

$$(x + ty)^k(x' + ty')^{n-k}$$

and replace $t^r$ by $a_r$ throughout. Another obvious extension is

$$(b_0, b_1, \ldots, b_{2n} \between x, y, z)^n,$$

which is defined by the operation of evaluating $(x + ty + t^2z)^n$ and replacing $t^r$ by $b_r$ throughout. This again may be extended to

$$(b_0, b_1, \ldots, b_{2n} \between x, y, z)^k(x', y', z')^{n-k}$$

precisely as above.

3.6. THEORY OF THE TRANSFORMATION
Returning to our transformation, we see that if we divide eqs. (3.14) through by $(S - G)^2$ we may write the result in crossed bracket form as

$$\begin{aligned} \mathrm{I} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s, -g)^2, \\ \mathrm{II} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s, -g)(s, 1 - g), \\ \mathrm{III} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between s, 1 - g)^2, \end{aligned} \tag{3.15}$$

where we have written

$$s = \frac{S' - S}{S - G}, \qquad g = \frac{G' - G}{S - G},$$

so that $s$ and $g$ represent the displacements of the stop image and of the object image respectively as fractions of the original separation of these images, as seen from eq. (3.2). The relation of the eikonal $E^G$ (where $S$ is implied as the stop magnification) to the eikonal $E^{G'}$ (where $S'$ is implied as the stop magnification) may be inferred from eq. (3.5) as

$$E^{G'} = E^G + \frac{G' - G}{K}\left(\frac{L}{GG'} - L'\right). \tag{3.16}$$
But to perform the transformation explicitly we must assume that $E^G$ and $E^{G'}$ are expanded as infinite series of some form chosen to simplify the work as much as possible. Ordinary power series in I, II, III and in I', II', III' would lead to hopelessly complicated transformation expressions, so we follow a different approach and investigate the transformation through the structure of its invariants. First we note the identity

$$(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2) \equiv (1)(3) - (2)^2 = (S' - G')^2(\mathrm{I'\,III'} - \mathrm{II}'^2) = (MN' - M'N)^2. \tag{3.17}$$
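The invariance expressed by eq. (3.17) is easy to confirm numerically. The following minimal Python sketch uses hypothetical function names and arbitrary sample values; it evaluates $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ for one ray at two different pairs of magnifications and compares both with $(MN' - M'N)^2$, and then checks that the quantity vanishes for a ray with $M/N = M'/N'$.

```python
import math

def smith_variables(M, N, Mp, Np, S, G):
    """I, II, III of eq. (2.31) for a ray with direction cosines M, N, M', N'."""
    v1, v2, v3 = M * M + N * N, M * Mp + N * Np, Mp * Mp + Np * Np
    d = (S - G) ** 2
    return ((v1 - 2 * G * v2 + G * G * v3) / d,
            (v1 - (S + G) * v2 + S * G * v3) / d,
            (v1 - 2 * S * v2 + S * S * v3) / d)

def invariant(M, N, Mp, Np, S, G):
    """Left-hand side of eq. (3.17)."""
    I, II, III = smith_variables(M, N, Mp, Np, S, G)
    return (S - G) ** 2 * (I * III - II * II)

ray = (0.12, -0.03, 0.04, 0.09)                       # arbitrary skew ray
skew = (ray[0] * ray[3] - ray[2] * ray[1]) ** 2        # (MN' - M'N)^2
assert math.isclose(invariant(*ray, S=1.5, G=-0.5), skew, rel_tol=1e-9)
assert math.isclose(invariant(*ray, S=0.8, G=2.0), skew, rel_tol=1e-9)

# A ray in a plane through the axis (M/N = M'/N') gives zero.
assert abs(invariant(0.1, 0.2, 0.05, 0.1, 1.5, -0.5)) < 1e-12
```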
This shows that $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ is an invariant of the transformation, and, moreover, vanishes for rays in a plane through the axis of the system, since we have seen that a ray will lie in a plane containing the axis only if

$$\frac{M}{N} = \frac{M'}{N'}. \tag{3.18}$$

Consider now the terms in $E^G$ of order $n$, i.e. the aberrations (including stop aberration) of order $n - 1$. These terms will form a homogeneous expression of order $n$ in the variables I, II, III. Since the transformation is linear and homogeneous, the new terms of the $n$'th order will be derived from and only from the old terms of the same order. But in virtue of the identity (3.17) if we represent the $n$'th order terms as a finite series of powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ with coefficient polynomials tailored to bring each term up to the $n$'th degree, then upon transformation the powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ will be invariant and therefore the old polynomial coefficient of each power will alone determine the new coefficient of the same power of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$. The decomposition of the $n$'th order terms into a series of powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ is not unique if we allow arbitrary coefficient polynomials. However, if we use crossed bracket polynomials the decomposition has been shown by T. SMITH [1922] to be unique, though the original proof is tedious. Writing out the terms in the various series explicitly we have for the $n$'th order terms

$$\begin{aligned} &(D^n_0, D^n_1, \ldots, D^n_{2n} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^n \\ &\quad + (\mathrm{I\,III} - \mathrm{II}^2)(S - G)^2\,(D^n_2, D^n_3, \ldots, D^n_{2n-2} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2} \\ &\quad + (\mathrm{I\,III} - \mathrm{II}^2)^2(S - G)^4\,(D^n_4, D^n_5, \ldots, D^n_{2n-4} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-4} \\ &\quad + \ldots + (\mathrm{I\,III} - \mathrm{II}^2)^w(S - G)^{2w}\,(D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w} + \ldots, \end{aligned} \tag{3.19}$$
where the $D$'s are to be regarded as aberration coefficients and the different "series" are simply the groups of terms involving the different powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$. It should be noted that the terms in eq. (3.19) may have a common factor depending on $n$ applied to them all. This will be important when we consider that in the transformation there will be extra terms of each order arising from the quantity

$$\frac{G' - G}{K}\left(\frac{L}{GG'} - L'\right)$$

in eq. (3.16), and all of them will be of series zero. The transformation equations for series zero will of course be more complicated because of the extra terms, so we treat this series and the problem of choosing over-all coefficients for the terms of the various orders a little later. Hence we concentrate on series $w$, $w > 0$, at first. From eq. (3.14) we find that for any $b$ we have identically
$$(\mathrm{I}, \mathrm{II}, \mathrm{III} \between 1, -b)^2 = (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s - sb,\ -g - b + gb)^2. \tag{3.20}$$
MODERN HAMILTONIAN OPTICS
[I,
93
We wish to consider what happens to eq. (3.19) under a change of stop and conjugates from S and G to S’ and G’. By eq. (3.16), if we avoid series zero with its extra terms, each of the series in the n’th order of E G goes directly into the same series in EG ‘ . By eq. (3.17) the factor (S - G ) 2 w ( I I11 - I I 2 ) w at the head of every series of eq. (3.19) is invariant. We thus only need consider what happens to terms of the form (3.21)
under the transformation. Now the left hand side of eq. (3.20) is $\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III}$, which is exactly what we would put for the second bracket of eq. (3.21), with $b$ as the dummy $t$, in its expansion. This will allow us to find the relation between the $D$'s before and after transformation. The definition of the $D$'s after transformation is of course analogous to that before transformation, so that

$$(D'^n_{2w}, D'^n_{2w+1}, \ldots, D'^n_{2n-2w} \between \mathrm{I}', -2\,\mathrm{II}', \mathrm{III}')^{n-2w} = (D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w}.$$

The definition of the crossed brackets allows this to be written as

$$(\mathrm{I}' - 2B\,\mathrm{II}' + B^2\,\mathrm{III}')^{n-2w} = (\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III})^{n-2w}, \tag{3.22}$$
where $B^v$ is to be replaced by $D'^n_{2w+v}$, and $b^v$ is to be replaced by $D^n_{2w+v}$. But $\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III}$ is the left hand side of eq. (3.20), and hence equals

$$\mathrm{I}'(1 + s - sb)^2 - 2\,\mathrm{II}'(1 + s - sb)(g + b - gb) + \mathrm{III}'(g + b - gb)^2. \tag{3.23}$$
We thus wish to find the $(n - 2w)$th power of this expression and equate to the left hand side of eq. (3.22), with the convention on $B^v$ and $b^v$. Coefficients of like powers of the variables could then be equated. Consider the term

$$\mathrm{I}'^r(-2\,\mathrm{II}')^p\,\mathrm{III}'^t, \qquad r + p + t = n - 2w.$$

By eq. (3.22) the particular $D'$ that serves as coefficient for this term depends on $p + 2t = v$, the power of $B$. Then for all $p$, $t$ such that $p + 2t = v$, the coefficient of $\mathrm{I}'^r(-2\,\mathrm{II}')^p\,\mathrm{III}'^t$ in the left hand side of eq. (3.22) is $D'^n_{2w+v}$ multiplied by whatever trinomial coefficient is associated with these powers. On the right hand side, obtained from eq. (3.23), the corresponding coefficient will be the same trinomial coefficient multiplied by

$$(1 + s - sb)^{2r}(1 + s - sb)^p(g + b - gb)^p(g + b - gb)^{2t},$$

with our convention on $b^v$. If we write $\binom{n-2w}{p,\,t}$ for the trinomial coefficient of $p$, $t$ on $n - 2w$ we have with the usual convention on $b^v$

$$\binom{n-2w}{p,\,t}D'^n_{2w+v} = \binom{n-2w}{p,\,t}(1 + s - sb)^{2r+p}(g + b - gb)^{p+2t},$$

$$\therefore\quad D'^n_{2w+v} = (1 + s - sb)^{2r+p}(g + b - gb)^v = (1 + s - sb)^{2n-4w-v}(g + b - gb)^v.$$

That is to say

$$D'^n_{2w+v} = (D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between 1 + s, -s)^{2n-4w-v}(g, 1 - g)^v. \tag{3.24}$$
This important equation gives the aberration coefficients for $S'$ and $G'$ in terms of those for $S$ and $G$. We must of course remember that $w > 0$, since series 0 needs separate treatment. As noted previously, the result (3.24) requires modification for the special case $w = 0$ owing to the presence of the terms in $L$ and $L'$ in eq. (3.16), all of which contribute to series 0. The additional terms of order $n$ are evidently

$$\frac{G' - G}{K}\left\{\frac{(-1)^n\,{}_{1/2}C_n\,(1)^n}{GG'} - (-1)^n\,{}_{1/2}C_n\,(3)^n\right\} \tag{3.25}$$

in which $(-1)^n$ is just what it says, but $(1)^n$ and $(3)^n$ are powers of the variables (1) and (3). In terms of the variables I', II', III' the expression (3.25) is

$$\frac{G' - G}{K}\,(-1)^n\,{}_{1/2}C_n\left\{\frac{(S'^2\,\mathrm{I}' - 2S'G'\,\mathrm{II}' + G'^2\,\mathrm{III}')^n}{GG'} - (\mathrm{I}' - 2\,\mathrm{II}' + \mathrm{III}')^n\right\}.$$

Next we evaluate the binomial coefficient, which readily gives

$$(-1)^n\,{}_{1/2}C_n = -\frac{(2n - 2)!}{2^{2n-1}\,n!\,(n - 1)!}. \tag{3.26}$$
M O D E R N H A M I L T O N I A N OPTICS
Hence the additional terms of order n are
(G’ - G) - _-__
K
(S’2I’ XI
(2% - 2) ! 22n-l.n!.(n - l ) ! - 2S’G’II’
GG’
+ G’2III’)n - (I’ - 211’ + 111’)n}.
(3.27)
Now for all terms of the same order we had proposed an over-all coefficient, say $c_{p+q+r}$, where $p + q + r = n$. Calling this $c_n$ it seems wise to take

$$c_n = -\frac{(2n - 2)!}{2^{2n-1}\,n!\,(n - 1)!}, \tag{3.28}$$

for then every term of order $n$ in the eikonals would have the coefficient $c_n$ included in its definition, and this would be the same as in the additional terms of the same order. The factor may then be cancelled in the transformation expressions, order zero included. Again, the quantities expressing the aberrations are preferably dimensionless, so that the factor $1/K$ which introduces the unit of length must also be excluded. To do this we re-write eq. (3.16) in the form

$$KE^{G'} = KE^G + (G' - G)\left(\frac{L}{GG'} - L'\right)$$

and regard all coefficients as coefficients of the dimensionless eikonal $KE$. Now in the expansion of (3.27) in terms of the form $\mathrm{I}'^r(-2\,\mathrm{II}')^p\,\mathrm{III}'^t$ the same trinomial coefficients are encountered as in the regular terms of the series, so that again these cancel in the transformation formulae. Note too that aside from the trinomial coefficient, the coefficient of $\mathrm{I}'^r(-2\,\mathrm{II}')^p\,\mathrm{III}'^t$ in the large parentheses in (3.27) is

$$\frac{S'^{2r}\cdot S'^pG'^p\cdot G'^{2t}}{GG'} - 1, \quad \text{i.e.} \quad (S'^{2r+p}G'^{v-1}G^{-1} - 1).$$

Thus with all factors accounted for, the relations for the aberrations of the zero series may be written in the form
$$D'^n_v = (D^n_0, D^n_1, \ldots, D^n_{2n} \between 1 + s, -s)^{2n-v}(g, 1 - g)^v + (G' - G)(S'^{2n-v}G'^{v-1}G^{-1} - 1). \tag{3.29}$$
Thus we have explicit stop-shift and conjugate-shift expressions for all orders of aberrations.
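The stop-shift and conjugate-shift rule of eq. (3.24) can be exercised numerically for a series with $w > 0$. The Python sketch below is a hedged illustration, not the author's procedure: the function names are hypothetical, the coefficients $D$ are made-up numbers, and the invariant prefactor $(S - G)^{2w}(\mathrm{I\,III} - \mathrm{II}^2)^w$ is omitted because it is unchanged by the transformation. The check is that the bracket of eq. (3.21), evaluated with the old coefficients and old variables, equals the same bracket evaluated with the transformed coefficients and the new variables.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials in t given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def crossed_bracket(coeffs, factors):
    """Expand the product of the factors (tuples of coefficients in t) and
    replace t**r by coeffs[r], following Section 3.5."""
    poly = [1.0]
    for f in factors:
        poly = poly_mul(poly, list(f))
    if len(poly) != len(coeffs):
        raise ValueError("number of coefficients does not match the degree")
    return sum(c * a for c, a in zip(poly, coeffs))

def smith_variables(inv, S, G):
    """Eq. (2.31), or eq. (3.13) when the primed magnifications are passed."""
    v1, v2, v3 = inv
    d = (S - G) ** 2
    return ((v1 - 2 * G * v2 + G * G * v3) / d,
            (v1 - (S + G) * v2 + S * G * v3) / d,
            (v1 - 2 * S * v2 + S * S * v3) / d)

def transform_coefficients(D, s, g):
    """Eq. (3.24) for one series w > 0; D lists D_{2w}, ..., D_{2n-2w}."""
    top = len(D) - 1                      # equals 2n - 4w
    return [crossed_bracket(D, [(1 + s, -s)] * (top - v) + [(g, 1 - g)] * v)
            for v in range(top + 1)]

def series_bracket(D, I, II, III):
    """The bracket (D ... | I, -2 II, III)^(n-2w) of eq. (3.21)."""
    power = (len(D) - 1) // 2
    return crossed_bracket(D, [(I, -2 * II, III)] * power)

inv = (0.0125, -0.0015, 0.0053)           # sample values of (1), (2), (3)
S, G, S2, G2 = 1.5, -0.5, 1.2, -0.8       # old and new magnifications
s, g = (S2 - S) / (S - G), (G2 - G) / (S - G)
D = [0.3, -1.1, 0.7, 2.0, -0.4]           # made-up coefficients, n - 2w = 2

D_new = transform_coefficients(D, s, g)
old_value = series_bracket(D, *smith_variables(inv, S, G))
new_value = series_bracket(D_new, *smith_variables(inv, S2, G2))
assert math.isclose(old_value, new_value, rel_tol=1e-9)
```

Series zero is deliberately left out of this sketch, since eq. (3.29) adds the extra terms arising from $L$ and $L'$.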
3.7. RELATION TO THE FOCAL EIKONAL
We now mention the relations between the $D$'s of any order and the coefficients of the standard focal eikonal $E$. If we define aberration coefficients for $E$ by the relations

$$(D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w} = (E^n_{2w}, E^n_{2w+1}, \ldots, E^n_{2n-2w} \between (1), -2(2), (3))^{n-2w} \tag{3.30}$$

then it follows as in T. SMITH [1922] that for all series except series zero we have

$$D^n_{2w+v} = (E^n_{2w}, E^n_{2w+1}, \ldots, E^n_{2n-2w} \between S, -1)^{2n-4w-v}(G, -1)^v \tag{3.31}$$

and for series zero the result is modified to

$$D^n_v = (E^n_0, E^n_1, \ldots, E^n_{2n} \between S, -1)^{2n-v}(G, -1)^v - G - S^{2n-v}G^{v-1}. \tag{3.32}$$
These results make it possible to write out at once the form of the focal eikonal in the variables (1), (2), (3) when $E^G$ is given.
§ 4. Conclusion

In the space of one article it is impossible to discuss or even mention many of the remarkable things that have been done this century in geometrical optics, especially by T. Smith. It is hoped that this article will at least draw attention to the fact that a general aberration theory is far from impossible and that the first step towards it seems to be a detailed review of the great work of T. Smith.
References

BRUNS, H., 1895, Saechs. Ber. d. Wiss. 21.
GRACE, J. H. and A. YOUNG, 1903, Algebra of Invariants (Cambridge).
HAMILTON, Sir W., 1931, Collected Mathematical Papers, Vol. I (Cambridge).
HERZBERGER, M., 1958, Modern Geometrical Optics (Interscience, New York).
LANCZOS, C., 1949, The Variational Principles of Mechanics (Toronto).
LUNEBERG, R. K., 1944, Mathematical Theory of Optics (Brown University).
SMITH, T., 1921, 1922, Trans. Opt. Soc. (London) 23 (1921-22); Reprinted in "National Physical Laboratory Collected Researches" 17, Paper 13, with an Appendix of Proofs.
STEWARD, G. C., 1926, Trans. Camb. Phil. Soc. 23, No. 9.
STEWARD, G. C., 1928, The Symmetrical Optical System (Cambridge).
SYNGE, J. L., 1937, Geometrical Optics (Cambridge).
II

WAVE OPTICS AND GEOMETRICAL OPTICS IN OPTICAL DESIGN

BY

KENRO MIYAMOTO *

Department of Optical Design, Nippon Kogaku K.K., Tokyo, Japan

* Temporarily at the Institute of Industrial Science, The University of Tokyo, Tokyo, Japan, and now at the Institute of Optics, the University of Rochester, Rochester, New York, U.S.A.
CONTENTS

§ 1. INTRODUCTION
§ 2. WAVE SURFACE AND CHARACTERISTIC FUNCTION (EIKONAL)
§ 3. INTENSITY DISTRIBUTION OF LIGHT IN AN OPTICAL IMAGE
§ 4. THE RESPONSE FUNCTION
§ 5. IMAGE EVALUATION BY SPOT DIAGRAM
REFERENCES

§ 1. Introduction
In recent years the theory of image formation and evaluation has witnessed many useful developments, following the introduction of Fourier techniques and information theory into optics. These new fields arose from the conversion of the description in co-ordinate space, represented by "Kirchhoff's integral", to that in spectrum space, represented by the so-called "response function", the latter being more easily connected with other fields (for example, with information theory) by its mathematical adaptability. Furthermore, methods and instruments for measuring response functions are being explored and have already been applied as a new tool for image evaluation. Nevertheless, when optical designers try to apply these useful results to optical design, they at once encounter a great obstacle; for the recent researches on image formation and evaluation depend on wave optics, while practical design resorts to geometrical optics according to its long-standing tradition. There are good reasons for this latter approach; the amount of calculation is much smaller than if wave optics is used and also the quality of lenses can be predicted fairly accurately from the knowledge of the geometric optical aberrations in the design stage, if the lens designers are sufficiently experienced. Therefore the relationship between the two methods has remained obscure quantitatively, and the exchange of ideas between the two fields is rather unsatisfactory. However the advent of high speed computing machines has removed the labour of a prodigious amount of calculation and also the introduction of Fourier analysis and information theory has led to the development of new aspects in optics. In view of this fortunate and hopeful situation, the relationship between wave optics and geometrical optics is described here with special reference to image formation and evaluation, and future aspects of optical design are discussed.
WAVE OPTICS AND GEOMETRICAL OPTICS

§ 2. Wave Surface and Characteristic Function (Eikonal)
The aberration theory, which is one of the most important theoretical backgrounds of geometrical optics, has been systematized gradually since W. R. HAMILTON [1827] introduced into optics the idea of a characteristic function connected with the optical length of a ray. As the method of this theory was so analogous to that of the analytical mechanics developed by J. L. Lagrange, Hamilton could transfer this method to the general problem of mechanics and thus deduced the famous Hamilton’s equation of motion (CARATHÉODORY [1937]). However, his optical work remained little known for a long time, and came to light again only after the eikonal of H. Bruns † was developed and brought fruitful results. For example, the Seidel aberration theory was refined and systematized by K. SCHWARZSCHILD [1905]. Accordingly the theory of the characteristic function is the most important part of geometrical optics even from the historical point of view. The connection between this characteristic function and waves was established by A. SOMMERFELD and J. RUNGE [1911]. They showed, using a suggestion of Debye, that the wave equation becomes the differential equation of the characteristic function in the limit of vanishingly small wave-length $\lambda$ (BORN [1933], BORN and WOLF [1959]). Let the amplitude of the light disturbance be

$$ f = u(x, y, z)\,\exp[-ikL(x, y, z) + i\omega t], $$

where $u$ and $L$ are considered as functions which vary slowly with $(x, y, z)$ within an interval of the order of a wave-length. Substituting $f$ in the wave equation

$$ \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) f - \frac{n^2}{c^2}\,\frac{\partial^2 f}{\partial t^2} = 0, $$

we have

$$ k^2 u\,(n^2 - \operatorname{grad}^2 L) - ik\,(u\,\Delta L + 2\operatorname{grad} u \cdot \operatorname{grad} L) + \Delta u = 0. $$

In these equations, $n$ is the refractive index of the medium, $k \equiv \omega/c = 2\pi/\lambda$ is the propagation number, and $\omega$ and $c$ are the angular frequency and the light velocity respectively. If $u\,\Delta L + \operatorname{grad} u \cdot \operatorname{grad} L$ and $\Delta u$ are much smaller than $k$, only
the term involving $k^2$ is dominant and we then obtain the following equation:

$$ \left(\frac{\partial L}{\partial x}\right)^2 + \left(\frac{\partial L}{\partial y}\right)^2 + \left(\frac{\partial L}{\partial z}\right)^2 = n^2. $$

This equation is precisely the formula which Hamilton’s characteristic function must satisfy. When we define the unit vector $\mathbf{s}$ by $n\mathbf{s} = \operatorname{grad} L$, it coincides with the geometric optical ray and it becomes clear that $L$ itself is the characteristic function. The foregoing theory gives a physical meaning to the characteristic function or eikonal, connecting the ray with the physical quantity, the wave surface. It is interesting to note that the foregoing relations are analogous to the relations between wave mechanics and classical mechanics when Planck’s constant $h$ is compared to $\lambda$ (DIRAC [1947]). However, these relations are not valid in the region where the intensity of light changes rapidly ($\Delta u$, $\operatorname{grad} u$ large) or where many rays concentrate ($\Delta L = \operatorname{div}\operatorname{grad} L = \operatorname{div} n\mathbf{s}$ large). The fact that the theory becomes powerless in the most important place, the image region, left many questions to be solved and was responsible for some hesitation in using geometrical optics.

† Interesting discussions concerning the Hamilton characteristic function and the Bruns eikonal were held between M. HERZBERGER [1936] and J. L. SYNGE [1937].

B. R. A. NIJBOER [1943] investigated this problem systematically in the image plane. He introduced the reference sphere having its centre at a Gauss image point and passing through the centre of the exit pupil, and defined the so-called wave aberration function as the difference between the wave surface of the optical system and the reference sphere. (In these discussions and in the following, the refractive index of the image space is assumed to be constant.) In general, the point at which the normal to the wave surface intersects the image plane deviates from the Gauss image. Nijboer connected this geometric optical aberration with the wave aberration function $V$ by a simple geometrical consideration. Let the radius of the reference sphere be $R$ (Fig. 2.1), and the co-ordinates on the exit pupil be $(\xi, \eta)$. The equation of the wave surface is then expressed by the formula
$$ \xi^2 + (\eta - Y_0)^2 + \zeta^2 = [R + V(\xi, \eta)]^2. $$

Neglecting the term involving the second power of $V$, the direction cosines of the normal at the point $(\xi, \eta, \zeta)$ on the wave surface are proportional to

$$ \xi - R\,\frac{\partial V}{\partial \xi}, \qquad (\eta - Y_0) - R\,\frac{\partial V}{\partial \eta}, \qquad \zeta. $$

Thus the equation of the normal is

$$ \frac{X - \xi}{\xi - R\,\partial V/\partial \xi} = \frac{Y - \eta}{(\eta - Y_0) - R\,\partial V/\partial \eta} = \frac{Z - \zeta}{\zeta}. $$

Accordingly the deviations $(\Delta X, \Delta Y)$ from the Gauss point $(0, Y_0)$ of the intersection point in the image plane at $Z = 0$ are given by

$$ \Delta X = R\,\frac{\partial V}{\partial \xi}, \qquad \Delta Y = R\,\frac{\partial V}{\partial \eta}. \tag{2.1} $$

This analytical method was developed later by H. H. HOPKINS [1950], and E. WOLF [1952] rigorously examined the relation between the wave surface and Hamilton’s characteristic function. According to his work, eq. (2.1) contains an error of order $R\,O[(\xi/R)^7]$ in its approximation; he pointed out that eq. (2.1) can therefore be applied only to aberration theory of not higher than the fifth order. However, as shown by TORALDO DI FRANCIA [1954], if the wave is thought of as coming from infinity, and we regard the function $V$ as the wave aberration function corresponding to the reference sphere having infinite radius, eq. (2.1) is strictly valid. In any case, eq. (2.1) provides a powerful method for comparing the aberrations of geometrical optics with the wave surface belonging to wave optics.
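As a hedged numerical illustration of eq. (2.1), the following Python sketch evaluates the ray aberrations $\Delta X = R\,\partial V/\partial\xi$ and $\Delta Y = R\,\partial V/\partial\eta$ by central differences. The function names, the step size `h` and the defocus-type test function are assumptions introduced here for illustration only; they do not appear in the original text.

```python
def ray_aberration(V, xi, eta, R, h=1e-6):
    """Return (dX, dY) from eq. (2.1): dX = R dV/dxi, dY = R dV/deta.

    V   : callable V(xi, eta), the wave aberration function (assumed smooth)
    R   : radius of the reference sphere
    h   : finite-difference step (an assumption of this sketch)
    """
    dV_dxi = (V(xi + h, eta) - V(xi - h, eta)) / (2.0 * h)
    dV_deta = (V(xi, eta + h) - V(xi, eta - h)) / (2.0 * h)
    return R * dV_dxi, R * dV_deta


def defocus(xi, eta, c=1e-4):
    """Hypothetical test aberration V = c (xi^2 + eta^2)."""
    return c * (xi ** 2 + eta ** 2)


# For V = c (xi^2 + eta^2) the exact result is dX = 2 c R xi, dY = 2 c R eta.
dx, dy = ray_aberration(defocus, 2.0, -1.0, R=100.0)
```

For a quadratic $V$ the central difference is exact up to rounding, so the sketch can be compared directly with the closed-form derivative.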
Fig. 2.1. Wave aberration function and geometric aberrations
§ 3. Intensity Distribution of Light in an Optical Image

When the aberrations are sufficiently small, the image has to be analysed from the standpoint of diffraction theory. The first detailed studies of this kind were made by PICHT [1925] and STEWARD [1928]. More recently the problem was investigated again by several writers. The most complete theory seems to be NIJBOER’S [1943, 1947].
After he introduced the wave aberration function $V$, Nijboer (see also NIENHUIS and NIJBOER [1949]) analysed the diffraction pattern of a point image in the case of small aberration. Using the polar co-ordinates $(r, \varphi)$, the wave aberration function was previously expanded in terms of $r^n \cos^m \varphi$ (STEWARD [1928]). In Nijboer’s work the expansion is in terms of expressions of the form $r^n \cos m\varphi$, and furthermore he classified the wave aberration functions in terms of the circle polynomials $R_n^m(r) \cos m\varphi$, following a suggestion of F. Zernike:

$$ V(r, \varphi) = \frac{1}{\sqrt{2}} \sum_{\substack{n=2 \\ (n\ \mathrm{even})}}^{\infty} f_{n0}\,R_n^0(r) + \sum_{n=1}^{\infty} \sum_{\substack{m=1 \\ (n-m\ \mathrm{even})}}^{n} f_{nm}\,R_n^m(r) \cos m\varphi. $$

In this classification, new sets of aberrations appear, but they do not represent different kinds of aberrations. Owing to the orthogonality and completeness of circle polynomials within the unit circle,

$$ \int_0^1\!\!\int_0^{2\pi} R_n^m(r) \cos m\varphi \; R_{n'}^{m'}(r) \cos m'\varphi \; r\,dr\,d\varphi = \frac{\pi}{2n+2}\,\delta_{nn'}\,\delta_{mm'} \qquad (m \neq 0), $$

Strehl’s definition (S.D.) in the case of small aberration (the normalized light intensity at centre) is given by the following expression:

$$ \mathrm{S.D.} = 1 - \frac{1}{\pi} \int_0^1\!\!\int_0^{2\pi} [V(r, \varphi)]^2\, r\,dr\,d\varphi = 1 - \sum_{n=1}^{\infty} \sum_{\substack{m=0 \\ (n-m\ \mathrm{even})}}^{n} \frac{f_{nm}^2}{2n+2}. \tag{3.1} $$
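The sum in eq. (3.1) is simple enough to evaluate directly. The sketch below is an illustration only: the function name and the dictionary layout are assumptions made here, and the coefficients $f_{nm}$ are assumed to be expressed in the same units as $V$ in eq. (3.1).

```python
def strehl_small_aberration(coeffs):
    """Evaluate S.D. = 1 - sum f_nm^2 / (2n + 2), eq. (3.1).

    coeffs maps (n, m) -> f_nm, with n >= 1, 0 <= m <= n and n - m even.
    The formula is only meaningful for small aberrations.
    """
    total = 0.0
    for (n, m), f in coeffs.items():
        if n < 1 or m < 0 or m > n or (n - m) % 2 != 0:
            raise ValueError(f"invalid circle-polynomial index (n={n}, m={m})")
        total += f * f / (2 * n + 2)
    return 1.0 - total


# Hypothetical coefficients: a spherical-aberration term (4, 0) and a coma term (3, 1).
sd = strehl_small_aberration({(4, 0): 0.3, (3, 1): 0.2})
```

Each coefficient lowers the result by its own amount $f_{nm}^2/(2n+2)$, independently of all the others.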
As a result, each term contributes independently a negative term to the S.D., and each aberration represented by a circle polynomial cannot be counterbalanced by the other terms. Many diffraction patterns were calculated for small aberrations (Fig. 3.1a, b) and it became clear that they are quite different from those expected from geometrical optics (NIENHUIS [1948]).

Fig. 3.1. a) Diffraction pattern of primary astigmatism, $\phi(u, v) = 0.64\lambda(u^2 - v^2)$, $u^2 + v^2 \le 1$. The dotted circle indicates the boundary of the geometrically illuminated area. (After NIENHUIS [1948]) b) Diffraction pattern of primary coma, $\phi(u, v) = 1.43\lambda(u^2 + v^2 - \tfrac{2}{3})u$. The boundary of the geometrically determined coma flare is also shown. (After NIENHUIS and NIJBOER [1949])

However, if we try to use these methods for large aberrations, the orthogonality of circle polynomials loses its utility, and they are no longer so useful for analytical treatment. Effects of large aberrations were successfully treated by VAN KAMPEN [1949, 1950]. He applied “the method of stationary phase” to Kirchhoff’s integral and gave an asymptotic expansion of this integral in the limit of $k \to \infty$ ($\lambda \to 0$). In Kirchhoff’s integral
when the variation of $f(\xi, \eta)$ is large in the domain of integration (when the aberration is large), the exponential factor of the integrand changes its sign many times; accordingly the main contribution to the integral comes only from the neighbourhood of a point (called a critical point of the first kind) satisfying the conditions $\partial f/\partial \xi = 0$, $\partial f/\partial \eta = 0$.
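Shifting the origin of $(\xi, \eta)$ to this point and expanding the integrand in the Taylor’s series, we obtain an expansion whose leading factors are

$$ \exp(ik\,a_{0,0}) \cdot \exp\!\left[ik\,\tfrac{1}{2}\left(a_{2,0}\,\xi^2 + 2a_{1,1}\,\xi\eta + a_{0,2}\,\eta^2\right)\right] \cdots $$

Integrating this, the following series in powers of $1/k$ is deduced:

$$ \ldots \tag{3.2} $$

where $\varepsilon = \pm 1$ when $a_{2,0}\,a_{0,2} > a_{1,1}^2$ and $a_{2,0} \gtrless 0$; and $\varepsilon = -i$ when $a_{2,0}\,a_{0,2}$ …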
only pointed out that the first term which remains in eq. (3.2) in the limit of $\lambda \to 0$ ($k \to \infty$) corresponds to geometrical optics. While these analytical methods based on Kirchhoff’s integral have somewhat clarified the relation between image formation and aberrations (MARÉCHAL [1954]), investigations based on geometrical optics had also been carried out. M. HERZBERGER [1947] proposed the method of image evaluation by means of the spot diagram; this diagram consists of the pattern of points formed in the image plane by a set of rays, distributed uniformly over the entrance pupil (so that each ray represents the same amount of light energy) and emanating from a single object point. He calculated the spot diagrams of an existing lens by the interpolation formula and published photographs of point images. He showed that, if fine details are neglected in the intensity distribution, both figures are very similar to each other (cf. Fig. 3.2). At this point, in order to investigate the relationship quantitatively, it is necessary to have a concrete analytical formula for the geometric optical intensity distribution, and the formula (2.1) proves its utility here. For simplicity we introduce the normalized co-ordinates $(x, y)$ and $(u, v)$ defined as follows:

$$ x = \frac{\Delta X \cdot H_\xi}{R}, \qquad y = \frac{\Delta Y \cdot H_\eta}{R}, \qquad u = \frac{\xi}{H_\xi}, \qquad v = \frac{\eta}{H_\eta}, $$

where $H_\xi$, $H_\eta$ are the largest values of $|\xi|$, $|\eta|$ respectively, and $R$ is the length of $AP'$ in Fig. 2.1. Then eq. (2.1) becomes

$$ x = \frac{\partial \phi}{\partial u}, \qquad y = \frac{\partial \phi}{\partial v}, \tag{2.1'} $$

where we have changed the notation: $\phi(u, v) = V(\xi, \eta)$. As the energy flux through the small area $da = du\,dv$ of the exit pupil is converged on the small area $d\sigma = dx\,dy$ of the image plane (Fig. 3.3), the geometric optical intensity distribution $I_g(x, y)$ can be expressed in the form

$$ I_g(x, y) = \frac{1}{a}\,\frac{da}{d\sigma} = \frac{1}{a} \left| \frac{\partial(x, y)}{\partial(u, v)} \right|^{-1}, \tag{3.3} $$
Fig. 3.2. Comparison of spot diagram with photographs of point images from a manufactured lens (After Y. UKITA and J. TSUJIUCHI [1958])
assuming the uniformity of the flux through the pupil † (MIYAMOTO [1957]). Here $a$ is the area of the $(u, v)$ region and serves as the normalizing factor:

$$ \iint I_g(x, y)\,dx\,dy = \frac{1}{a} \int \frac{da}{d\sigma}\,d\sigma = 1. $$
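The normalization above can be checked on a discrete spot diagram. The following Python sketch is an illustration under stated assumptions: it assumes a square pupil $|u|, |v| \le 1$ sampled on a uniform grid, the ray mapping $x = \partial\phi/\partial u$, $y = \partial\phi/\partial v$ of eq. (2.1'), and a hypothetical defocus-type aberration $\phi = W(u^2 + v^2)$. The function names and the histogram cell size are choices made here, not part of the original text.

```python
import math


def spot_diagram(phi_u, phi_v, n=200):
    """Map an n x n uniform grid of pupil points (|u|, |v| <= 1) to the image
    plane using x = dphi/du, y = dphi/dv (eq. 2.1')."""
    spots = []
    for i in range(n):
        u = -1.0 + (2 * i + 1) / n          # cell-centre sampling of the pupil
        for j in range(n):
            v = -1.0 + (2 * j + 1) / n
            spots.append((phi_u(u, v), phi_v(u, v)))
    return spots


def histogram_intensity(spots, cell):
    """Estimate I_g on square cells of side `cell`: the fraction of rays that
    fall in a cell divided by the cell area."""
    counts = {}
    for x, y in spots:
        key = (math.floor(x / cell), math.floor(y / cell))
        counts[key] = counts.get(key, 0) + 1
    n_rays = len(spots)
    return {key: c / (n_rays * cell * cell) for key, c in counts.items()}


# Assumed test case: phi = W (u^2 + v^2), hence x = 2 W u and y = 2 W v.
W = 1.0
CELL = 0.5
spots = spot_diagram(lambda u, v: 2 * W * u, lambda u, v: 2 * W * v)
intensity = histogram_intensity(spots, CELL)
total_flux = sum(intensity.values()) * CELL * CELL
```

For this test case the spots fill a square of side $4W$ uniformly, so the estimated intensity is $1/(16W^2)$ in every occupied cell and the total flux is unity.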
It is to be noted that the expression (3.3) has the same form as the denominator in eq. (3.2), if we relate the $a_{i,j}$’s to the wave aberration function $\phi(u, v)$. However, as eqs. (2.1') and (3.3) express the geometric optical intensity distribution $I_g(x, y)$ in terms of the parameters $(u, v)$, it is hardly possible to compare the formula (3.3) directly with Kirchhoff’s integral. We shall now turn our attention to the treatment based on Fourier analysis, which has been developed in considerable detail in recent years.

Fig. 3.3. A, centre of exit pupil; O, centre of image plane

§ 4. The Response Function

4.1. INCOHERENT ILLUMINATION
Between the intensity distributions in the image plane and in the object plane of an optical system, the superposition principle is applicable in the case of incoherent illumination. Accordingly we can consider an optical system as a linear filter of spatial frequencies, and Fourier analysis can be easily applied. This line of thought was initiated by P. M. DUFFIEUX [1946], and O. H. SCHADE [1951] contributed greatly to the advancement of these ideas in the experimental field, connecting them with the problems of television. Furthermore, H. H. HOPKINS [1951, 1953] utilized Fourier analysis to formulate the general diffraction theory of optical image formation.

† This assumption is not essential and a similar analysis can be made without it.

Now, if we take the Fourier transform of $I_w(x, y)$, the wave optical intensity distribution of the point image, we obtain the response function $R_w(s, t)$, which is the auto-correlation function of the pupil function $\exp[2\pi i \phi(u, v)/\lambda]$:

$$ R_w(s, t) = \frac{1}{a} \iint_{A_{\lambda s, \lambda t}} \exp\left\{ -\frac{2\pi i}{\lambda} \left[ \phi\!\left(u + \tfrac{1}{2}\lambda s,\ v + \tfrac{1}{2}\lambda t\right) - \phi\!\left(u - \tfrac{1}{2}\lambda s,\ v - \tfrac{1}{2}\lambda t\right) \right] \right\} du\,dv, \tag{4.2} $$

$$ I_w(x, y) = \iint_{-\infty}^{\infty} R_w(s, t) \exp[2\pi i (sx + ty)]\,ds\,dt, \tag{4.3} $$
where $A_{\lambda s, \lambda t}$ is the region shown in Fig. 4.1. When $\tfrac{1}{2}|\lambda s| \ge 1$ or $\tfrac{1}{2}|\lambda t| \ge 1$, the area $a_{\lambda s, \lambda t}$ of $A_{\lambda s, \lambda t}$ is zero and hence $R_w(s, t)$ is then also equal to zero. The frequency variables $(s, t)$ used in the foregoing are connected with the line number per unit length $(N_\xi, N_\eta)$ by the following relation:

$$ s = \frac{N_\xi \cdot R}{H_\xi}, \qquad t = \frac{N_\eta \cdot R}{H_\eta}. $$

Fig. 4.1. Region $A_{\lambda s, \lambda t}$ of integration for the frequency pair $(s, t)$

HOPKINS [1955] discussed defocusing of an aberration-free lens as a simple example, comparing the wave optical response function with the geometric optical one, and deduced that, if the magnitude of the wave aberration function is larger than $2\lambda$, both values coincide fairly well with each other in the low frequency domain. M. DE [1955] and N. S. BROMILOW [1958] also investigated the cases of astigmatism and spherical aberration in similar ways. These results can be deduced from a more general standpoint (MIYAMOTO [1957]). If we expand the exponent of the integrand in eq. (4.2) in the Taylor series around the point $(u, v)$, we have
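$$ -\frac{2\pi i}{\lambda} \cdot 2 \left[ \left( \frac{\lambda s}{2} \frac{\partial}{\partial u} + \frac{\lambda t}{2} \frac{\partial}{\partial v} \right) \phi + \cdots + \frac{1}{(2m+1)!} \left( \frac{\lambda s}{2} \frac{\partial}{\partial u} + \frac{\lambda t}{2} \frac{\partial}{\partial v} \right)^{2m+1} \phi + \cdots \right]. \tag{4.4} $$

Only the first term in the above series is independent of $\lambda$, and the other terms approach zero when $\lambda$ tends to zero. Accordingly, as $\lambda \to 0$, $R_w(s, t)$ approaches $R_g(s, t)$, where

$$ R_g(s, t) = \frac{1}{a} \iint_{A_{0,0}} \exp\left[ -2\pi i \left( s \frac{\partial \phi}{\partial u} + t \frac{\partial \phi}{\partial v} \right) \right] du\,dv. \tag{4.5} $$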
This is a formula for the geometric-optical response function (MIYAMOTO [1957], HOPKINS [1957b]). This equation also follows by taking the Fourier transform of the geometric-optical intensity distribution $I_g(x, y)$:

$$ R_g(s, t) = \iint_{-\infty}^{\infty} I_g(x, y) \exp[-2\pi i (sx + ty)]\,dx\,dy = \frac{1}{a} \iint_{-\infty}^{\infty} \frac{da}{d\sigma} \exp[-2\pi i (sx + ty)]\,d\sigma = \frac{1}{a} \iint_{A_{0,0}} \exp[-2\pi i (sx + ty)]\,du\,dv. $$
4.2. COMPARISON OF WAVE OPTICAL AND GEOMETRIC-OPTICAL RESPONSE FUNCTIONS

The difference between the functions $R_w(s, t)$ and $R_g(s, t)$ arises from the difference between the regions of integration $A_{\lambda s, \lambda t}$ and $A_{0,0}$, and from the higher order terms in the Taylor series of eq. (4.4).
In the case of small aberrations, the factor of the difference in their integral regions is dominant; and the approximate relation

$$ R_w(s, t) \simeq R_{g+d}(s, t) \equiv R_g(s, t) \cdot R_d(s, t) \tag{4.6} $$

is easily deduced, $R_d$ representing the response function of an ideal lens of the same aperture. On the other hand, when the aberration becomes large, the relationship is a little more complicated. It is convenient to introduce the quantities $p$, $q$, $\theta(u, v)$, $r_w(p, q)$ and $r_g(p, q)$ defined by

$$ p = \tfrac{1}{2}\lambda s, \quad q = \tfrac{1}{2}\lambda t, \quad \phi(u, v) = \omega\lambda\,\theta(u, v), \quad |\theta(u, v)| \sim O(1), \quad r_w(p, q) = R_w(s, t), \quad r_g(p, q) = R_g(s, t), $$

and to employ these instead of $s$, $t$, $\phi(u, v)$, $R_w(s, t)$ and $R_g(s, t)$ respectively, $\omega$ being the parameter which expresses the ratio of the magnitude of the aberration function to the wave-length $\lambda$. Then the wave optical response function can be written as follows:

$$ r_w(p, q) = \frac{1}{a} \iint_{A_{p,q}} \exp\left[ -4\pi i \omega \left( p \frac{\partial \theta}{\partial u} + q \frac{\partial \theta}{\partial v} \right) \right] du\,dv + O(p^3 \omega) + \cdots $$

As the magnitude of $(a - a_{p,q})/a$ is $O(p)$, $r_w(p, q)$ is given by

$$ r_w(p, q) = r_g(p, q) + O(p) + O(p^3 \omega) + \cdots $$
Accordingly, when we consider the low frequency region $|p|, |q| < \omega_0/\omega$ (choosing an appropriate finite constant $\omega_0$), …

$$ \frac{\partial^2 \theta}{\partial u^2} \frac{\partial^2 \theta}{\partial v^2} - \left( \frac{\partial^2 \theta}{\partial u\,\partial v} \right)^2 = \frac{1}{a(\lambda\omega)^2 I_g(x, y)} = 0. $$

… out a rigorous analysis. However, if fine details in the intensity distribution can be neglected, as in the case of image evaluation taking into account a receiver (emulsion etc.) with a large turbidity, the results are similar to the one above. Numerical analyses on the basis of the foregoing approach will now be carried out for each case of the Seidel aberrations. For ease of mathematical treatment, we assume that the optical system has a square aperture $|u| \le 1$, $|v| \le 1$. (The results are of direct interest in connection with the spectrograph.)
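Defocusing and Astigmatism

In the case of defocusing (or curvature of field) or astigmatism, the wave aberration function is given by

$$ \phi(u, v) = \omega\lambda(u^2 \pm v^2). $$

The regions $A_{p,q}$ and $A$ are expressed by $|u| \le 1 - |p|$, $|v| \le 1 - |q|$ and $|u| \le 1$, $|v| \le 1$ respectively. Then the response functions are

$$ r_w(p) = \frac{\sin[8\pi\omega p(1 - |p|)]}{8\pi\omega p} \quad \text{when } |p| \le 1, \qquad = 0 \quad \text{otherwise}, $$

$$ r_g(p) = \frac{\sin[8\pi\omega p]}{8\pi\omega p} $$

and

$$ r_d(p) = 1 - |p| \quad \text{when } |p| \le 1, \qquad = 0 \quad \text{otherwise}. $$

These curves are shown in Fig. 4.2a, b, c for various values of $\omega$.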
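To make the comparison of the three response functions concrete, the following Python sketch evaluates the closed forms for astigmatism on the square aperture and checks the wave optical one against a direct midpoint quadrature of the auto-correlation integral. It is an illustration only: the function names, the grid size and the sample values of $\omega$ and $p$ are assumptions made here, and $\theta(u, v) = u^2 - v^2$ with $q = 0$ is taken as the test case.

```python
import cmath
import math


def r_g(p, w):
    """Geometric optical response, sin(8 pi w p) / (8 pi w p)."""
    a = 8 * math.pi * w * p
    return 1.0 if a == 0 else math.sin(a) / a


def r_d(p):
    """Response of the aberration-free square aperture along one axis."""
    return max(0.0, 1.0 - abs(p))


def r_w(p, w):
    """Wave optical response, sin[8 pi w p (1 - |p|)] / (8 pi w p) for |p| <= 1."""
    if abs(p) >= 1:
        return 0.0
    if p == 0:
        return 1.0
    a = 8 * math.pi * w * p
    return math.sin(a * (1 - abs(p))) / a


def r_w_numeric(p, w, n=120):
    """Midpoint quadrature of the auto-correlation integral for
    theta(u, v) = u^2 - v^2 on the square pupil, with q = 0."""
    theta = lambda u, v: u * u - v * v
    half = 1.0 - abs(p)                 # |u| <= 1 - |p| is the overlap region
    if half <= 0:
        return 0.0
    du, dv = 2 * half / n, 2.0 / n
    total = 0j
    for i in range(n):
        u = -half + (i + 0.5) * du
        for j in range(n):
            v = -1.0 + (j + 0.5) * dv
            diff = theta(u + p, v) - theta(u - p, v)
            total += cmath.exp(-2j * math.pi * w * diff)
    return (total * du * dv / 4.0).real  # divide by the pupil area a = 4


closed = r_w(0.1, 2.0)
numeric = r_w_numeric(0.1, 2.0)
small_aberration_gap = abs(r_w(0.3, 0.05) - r_g(0.3, 0.05) * r_d(0.3))
```

With a small aberration ($\omega = 0.05$ here) the product $r_g r_d$ stays close to $r_w$, which is the behaviour stated in eq. (4.6).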
Fig. 4.2a, b, c. Response function for astigmatism, $\phi(u, v) = \omega\lambda(u^2 - v^2)$, $|u|, |v| \le 1$. Full lines, broken lines (- - - -) and chain lines (-.-.-) are the curves $r_w(p)$ (wave optics), $r_g(p)$ (geom. optics) and $r_{g+d}(p) = r_g(p) \cdot r_d(p)$, respectively
Now, in order to estimate how well $r_{g+d}(p)$ and $r_g(p)$ approximate $r_w(p)$, we carry out the integrations over the corresponding regions. By the Fourier theorem, these values are identified with the intensities of the line images at centre, which are an important measure of image evaluation (Strehl’s definition). They are expressed as follows: …

$$ I_{g+d}(0) = \frac{2}{\lambda} \int_{-1}^{1} \frac{\sin 8\pi\omega p}{8\pi\omega p}\,(1 - |p|)\,dp = \frac{1}{4\omega\lambda} \left[ \frac{2}{\pi} \operatorname{Si}(8\pi\omega) - \frac{1}{4\pi^2\omega}\,(1 - \cos 8\pi\omega) \right], $$

where

$$ C(x) - iS(x) = \int_0^x \frac{\exp(-it)}{(2\pi t)^{1/2}}\,dt, \qquad \operatorname{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt. $$

These curves are shown in Fig. 4.3.
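The closed form for the semigeometric central intensity can be verified numerically. The sketch below is an illustration only: it assumes the relation $4\omega\lambda\,I_{g+d}(0) = 8\omega\int_{-1}^{1} r_g(p)\,r_d(p)\,dp$, which follows from $ds = (2/\lambda)\,dp$, and it evaluates $\operatorname{Si}$ by a simple midpoint rule. The function names and step counts are choices made here.

```python
import math


def sine_integral(x, n=20000):
    """Si(x) = integral of sin(t)/t from 0 to x, by the midpoint rule."""
    h = x / n
    return sum(math.sin((k + 0.5) * h) / ((k + 0.5) * h) for k in range(n)) * h


def relative_intensity_closed(w):
    """4 w lambda I_{g+d}(0) = (2/pi) Si(8 pi w) - (1 - cos 8 pi w) / (4 pi^2 w)."""
    a = 8 * math.pi * w
    return (2 / math.pi) * sine_integral(a) - (1 - math.cos(a)) / (4 * math.pi ** 2 * w)


def relative_intensity_numeric(w, n=20000):
    """8 w * integral over |p| <= 1 of (1 - |p|) sin(8 pi w p)/(8 pi w p) dp."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        p = (k + 0.5) * h
        a = 8 * math.pi * w * p
        total += (1 - p) * math.sin(a) / a
    return 8 * w * 2 * total * h       # factor 2: the integrand is even in p


closed_value = relative_intensity_closed(1.0)
numeric_value = relative_intensity_numeric(1.0)
large_aberration_value = relative_intensity_closed(50.0)
```

For large $\omega$ the value tends to unity, which is the geometric optical result $4\omega\lambda\,I_g(0) = 1$ obtained by integrating $r_g$ over all $p$.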
Fig. 4.3. Relative intensity at centre. ___ line, $4\omega\lambda I_w(0)$; _ - _ - line, $4\omega\lambda I_g(0)$ and - - - line, $4\omega\lambda I_{g+d}(0)$

Fig. 4.4a, b, c. Response function for coma, corresponding to the line image which is perpendicular to the sagittal direction, $\phi(u, v) = \omega\lambda(u^2 + v^2)v$, $|u|, |v| \le 1$. The lines are the same as in the case of Fig. 4.2

Fig. 4.5a, b, c, d, e, f. Response function for coma corresponding to the line image which is perpendicular to the meridional direction, $\phi(u, v) = \omega\lambda(u^2 + v^2)v$, $|u|, |v| \le 1$. The lines are the same as in the case of Fig. 4.2. $\Theta(q)$ is the phase shift of $r(q) = |r(q)| \exp[-i\Theta(q)]$
Coma

The primary coma has the following wave aberration function:

$$ \phi(u, v) = \omega\lambda\,v(u^2 + v^2). $$

Putting $q = 0$ in $r_w(p, q)$, $r_g(p, q)$, we have the response functions corresponding to the line image which is perpendicular to the sagittal direction, that is

$$ r_w(p) \equiv r_w(p, 0) = \frac{1}{4} \int_{-(1-|p|)}^{1-|p|} du \int_{-1}^{1} dv\, \exp(-4\pi\omega i \cdot 2puv) = (8\pi\omega p)^{-1} \operatorname{Si}[8\pi\omega p(1 - |p|)] \quad \text{when } |p| \le 1, \qquad = 0 \quad \text{otherwise}, $$

$$ r_g(p) \equiv r_g(p, 0) = (8\pi\omega p)^{-1} \operatorname{Si}(8\pi\omega p). $$

These curves are shown in Fig. 4.4a, b, c with those of $r_{g+d}(p)$. In the same way the response functions $r_w(q)$, $r_g(q)$ corresponding to the line image that is perpendicular to the meridional direction are obtained by putting $p = 0$ in $r_w(p, q)$, $r_g(p, q)$. When $\omega q \ge 0$,
Fig. 4.6. a) The geometrical optical intensity distribution $I_g(x, y)$ for coma $\phi(u, v) = \omega\lambda(u^2 + v^2)v$; circular aperture $u^2 + v^2 \le 1$. From the formulae (2.1'), (3.3), we have $I_g(x, y) = (4|\omega|\pi)^{-1}(y^2 - 3x^2)^{-1/2}$ in region I, $= (8|\omega|\pi)^{-1}(y^2 - 3x^2)^{-1/2}$ in region II, $= 0$ otherwise. b) The wave optical intensity distribution for coma; $\phi(u, v) = 6.4\lambda(u^2 + v^2)u$. (After KINGSLAKE [1948])
If $\omega q < 0$, then $r_w(q) = r_w^*(-q)$ and $r_g(q) = r_g^*(-q)$. (The asterisk means the complex conjugate.) These curves are shown in Fig. 4.5a-f. It is also of interest to compare the geometric optical intensity distribution with that computed by R. KINGSLAKE [1948] (Fig. 4.6a, b).

Spherical Aberration

The functions $r_w(p)$, $r_g(p)$ of the lens having primary spherical aberration

$$ \phi(u, v) = \omega\lambda(u^2 + v^2)^2 $$

are given by

$$ r_w(p) = \ldots\left[\,\ldots - \sin 16\pi|\omega p|(u^3 + p^2 u)\; S(16\pi|\omega p|u)\right] \quad \text{when } |p| < 1, \qquad = 0 \quad \text{otherwise}, $$

$$ r_g(p) = \ldots\left[\,\ldots - \sin 16\pi|\omega p|u^3\; S(16\pi|\omega p|u)\right]. $$
These curves are shown in Fig. 4.7a, b, c. From the foregoing analyses, we can deduce the following results. When the magnitudes of wave aberrations are so small that the ray aberrations $(x, y)$ are almost included within the circle whose radius is half of Airy’s, the product $r_{g+d}$ of $r_g$ and $r_d$ approximates to $r_w$, and therefore the convolution of $I_g(x, y)$ and the intensity distribution $I_d(x, y)$ of an ideal lens approximates to $I_w(x, y)$. On the other hand, when the magnitudes of the wave aberration function are larger than $2\lambda$, the geometric optical response functions of any Seidel terms approach the wave optical ones in the region of $|p| < 0.2$ and $|q| < 0.2$. In other words, in the case where the extension of the spot diagram is 10 or 20 times as large as the diameter of Airy’s disk, and if details finer than 2.5 times the diameter of Airy’s disk ($\approx 5\lambda F \approx F/400$ (mm); $F$ is the f-number) are smoothed, the geometric and wave optical intensity distributions are almost equal to each other. Furthermore the fine details in the intensity distribution corresponding to $|p| > 0.2$, $|q| > 0.2$ change rapidly and irregularly with the change of $\omega$, and form fluctuating terms superposed on the intensity distribution determined by the low-frequency Fourier components; they do not play dominant roles in the image formation. Even when a lens has an arbitrary aperture with arbitrary aberration, similar conclusions may perhaps hold if the Seidel terms are dominant. The investigations along the lines discussed in this section point the way towards the appropriate and convenient evaluation of image quality, with the help of spot diagrams.

Fig. 4.7a, b, c. Response function for spherical aberration, $\phi(u, v) = \omega\lambda(u^2 + v^2)^2$. The lines are the same as in the case of Fig. 4.2

4.3. PARTIALLY COHERENT ILLUMINATION
In the foregoing, the relation between wave optics and geometrical optics has been discussed only for the case of incoherent illumination. It is also necessary to consider the relationship for the case of partially coherent illumination. Let the $(u, v)$ region of the effective source be $\Sigma$ and let the co-ordinates of a point $P$ in the object plane be the same as the co-ordinates $(x, y)$ of its geometric-optical image $P'$. (The co-ordinates $(x, y)$ on the image plane and $(u, v)$ on the exit pupil are the same as in the foregoing (see Fig. 4.8).)

Fig. 4.8. Optical system with partially coherent illumination. $\Sigma$, $O$, $A$ and $O'$ are effective source, object plane, exit pupil and image plane, respectively

Fig. 4.9. Region of integration $A_{\lambda s_1, \lambda t_1;\ \lambda s_2, \lambda t_2}$ for the frequency variables $s_1, t_1;\ s_2, t_2$
We denote the complex transmission in the object plane by $E(x, y)$ and its Fourier transform by $\mathcal{E}(s, t)$:

$$ \mathcal{E}(s, t) = \iint_{-\infty}^{\infty} E(x, y) \exp[-2\pi i (sx + ty)]\,dx\,dy. $$

The wave optical intensity distribution $\Phi'_w(x, y)$ of the image formed under the illumination of the effective source is given by H. H. HOPKINS [1953] in the following form:

$$ \Phi'_w(x, y) = \iiiint_{-\infty}^{\infty} ds_1\,dt_1\,ds_2\,dt_2\; t(s_1, t_1; s_2, t_2) \times \mathcal{E}(s_1, t_1) \exp[2\pi i (s_1 x + t_1 y)] \times \mathcal{E}^*(s_2, t_2) \exp[-2\pi i (s_2 x + t_2 y)], \tag{4.7} $$
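where

$$ t(s_1, t_1; s_2, t_2) = \frac{1}{a} \iint_{A_{\lambda s_1, \lambda t_1;\ \lambda s_2, \lambda t_2}} du\,dv\, \exp\left\{ -2\pi i \lambda^{-1} \left[ \phi(u + \lambda s_1, v + \lambda t_1) - \phi(u + \lambda s_2, v + \lambda t_2) \right] \right\}. \tag{4.8} $$

$A_{\lambda s_1, \lambda t_1;\ \lambda s_2, \lambda t_2}$ is the region of $(u, v)$ shown in Fig. 4.9 and $a$ is the area of $A_0 = A_{0,0;\,0,0}$. If $\lambda$ approaches zero in eq. (4.8), $t(s_1, t_1; s_2, t_2)$ becomes

$$ \lim_{\lambda \to 0} t(s_1, t_1; s_2, t_2) = \frac{1}{a} \iint_{A_0} du\,dv\, \exp\left\{ -2\pi i \left[ (s_1 - s_2) \frac{\partial \phi}{\partial u} + (t_1 - t_2) \frac{\partial \phi}{\partial v} \right] \right\} = R_g(s_1 - s_2,\ t_1 - t_2), $$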
and changes to a function of only $s = s_1 - s_2$, $t = t_1 - t_2$. This is the geometric optical response function itself (MIYAMOTO [1958]). Then eq. (4.7) becomes

$$ \Phi'_g(x, y) = \iint_{-\infty}^{\infty} ds\,dt\; R_g(s, t) \exp[2\pi i (sx + ty)] \times \iint_{-\infty}^{\infty} ds_1\,dt_1\; \mathcal{E}(s_1, t_1)\, \mathcal{E}^*(s_1 - s,\ t_1 - t). $$
On the other hand, the following relationship is obtained by Fourier theory:

$$ \varphi(s, t) \equiv \iint_{-\infty}^{\infty} ds_1\,dt_1\; \mathcal{E}(s_1, t_1)\, \mathcal{E}^*(s_1 - s,\ t_1 - t) = \iint_{-\infty}^{\infty} dx\,dy\; |E(x, y)|^2 \exp[-2\pi i (sx + ty)]. $$
Thus $\varphi(s, t)$ is the Fourier transform of the intensity distribution, $\Phi(x, y) = |E(x, y)|^2$, in the object plane. Furthermore, according to the convolution theorem, $\Phi'_g(x, y)$ may be expressed in the following form:

$$ \Phi'_g(x, y) = \iint_{-\infty}^{\infty} ds\,dt\; \varphi(s, t)\, R_g(s, t) \exp[2\pi i (sx + ty)] = \iint_{-\infty}^{\infty} dx'\,dy'\; \Phi(x', y')\, I_g(x - x',\ y - y'). $$
This is the formula for image formation by geometrical optics. It has been shown that $t(s_1, t_1; s_2, t_2)$ becomes $R_g(s, t)$ when $\lambda$ approaches zero keeping $(s, t)$ constant. These relationships are, mathematically as well as physically, quite equivalent to the fact that the wave optical response function approaches the geometrical optical response function when the wave aberration function $\phi(u, v)$ becomes large and the frequency variables $(s, t)$ become small, keeping $\lambda$ constant. So the results obtained for the case of incoherent illumination and large aberrations are also valid for partially coherent illumination. In conclusion, when the aberrations are relatively large, such as in a photographic lens, the image quality can be evaluated quantitatively by geometrical optics even for partially coherent illumination. In the case where the diameter of Airy’s disk is large compared with the extension of the spot diagram, such as in a telescope objective, the image may be analysed by semigeometric optical methods [$R_{g+d}(s, t)$]. However, when the diameter of Airy’s disk and the extent of the spot diagram are comparable with each other, for example in the case of a microscope objective, we must resort to wave optics.
§ 5. Image Evaluation by Spot Diagram

5.1. IMAGE EVALUATING METHOD
As is well known, the various curves representing the generalized Seidel aberrations are used as a measure of image quality in usual lens design. However, as systems with large apertures and fields have become of particular interest in recent years, more appropriate image evaluation is desirable. Although we can consider the wave aberration function (characteristic function) and other functions as a proper measure, the starting point should be the method of the spot diagram, proposed by M. Herzberger. As described in section 4, this shows the intensity distribution fairly accurately, if the diffraction effect can be neglected, and also gives a good “visual” representation of the image. Furthermore only ray tracing is necessary to obtain the co-ordinates $(x_i, y_i)$ of the spot diagram, and the method of ray tracing has already been investigated for a long time and is well established. The large amount of computing which was necessary to calculate a spot diagram made it impractical as long as desk calculating machines were the only means of computing. At the present time the labour of ray tracing has almost been removed by the advent of high speed computers, and this method has become practical and is widely used. However, if the method is used as it is, then as R. E. HOPKINS [1955] said: “Faced with this fortunate situation, many designers started tracing a large number of rays through their optical systems for purposes of evaluation. They very shortly reached a serious dilemma in that they did not know how to put the resulting data in a form which lead to easy interpretation.” To meet this problem, which naturally occurs, some trials have already been carried out. For example, the radius of gyration of the spot diagram is taken as a measure of the image quality,
(As the centre of the spot diagram, one can take the intersection point of the principal ray with the image plane, or preferably, the centre of gravity, which makes the radius of gyration a minimum.) This method was applied to the error balancing of the Schmidt camera by E. H. LINFOOT [1955] and was also applied to various other cases. This criterion may be appropriate when only small aberrations are present, as in a Schmidt camera, but it is not so satisfactory when the spot diagram consists of two parts, one being the core of the image, in which most of the spots are well concentrated, and the other being a halo with spots spread broadly around the image core. This type frequently appears in lenses with large apertures. The images are evaluated too unfavourably by such a radius of gyration, because the distant spots are too heavily weighted. In view of this, F. A. LUCY [1956] proposed the intensity criterion
$$ \mathrm{I.C.} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{(x_i^2 + y_i^2)^{1/2} + \delta r}. $$
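The two spot-diagram criteria discussed here, the radius of gyration and the intensity criterion with its regularizing term $\delta r$, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the radius of gyration is taken as the root-mean-square distance of the spots from the chosen centre (its formula is not legible in the source), and all function names and the sample spot diagram are hypothetical.

```python
import math


def centroid(spots):
    """Centre of gravity of the spot diagram."""
    n = len(spots)
    return (sum(x for x, _ in spots) / n, sum(y for _, y in spots) / n)


def radius_of_gyration(spots, centre=None):
    """Root-mean-square distance of the spots from `centre`
    (default: the centre of gravity, which gives the minimum value)."""
    cx, cy = centroid(spots) if centre is None else centre
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in spots) / len(spots)
    return math.sqrt(mean_sq)


def intensity_criterion(spots, delta_r):
    """I.C. = (1/N) sum 1 / (sqrt(x_i^2 + y_i^2) + delta_r);
    delta_r keeps the sum finite for an aberration-free spot diagram."""
    return sum(1.0 / (math.hypot(x, y) + delta_r) for x, y in spots) / len(spots)


# Hypothetical spot diagram: four spots on the unit circle.
ring = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
rg = radius_of_gyration(ring)
ic = intensity_criterion(ring, delta_r=1.0)
```

Because the radius of gyration weights each spot by the square of its distance, a few halo spots far from the core raise it sharply, whereas the intensity criterion is dominated by the spots close to the centre.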
$\delta r$ is the radius of Airy’s disk and prevents the right-hand side from becoming infinite in the case of an aberration-free lens. These foregoing criteria are practical, but still rather arbitrary, and their theoretical foundation seems insufficient. On the other hand, as described previously, the introduction of Fourier analysis into optics has provided various possibilities for image evaluation. P. B. FELLGETT and E. H. LINFOOT [1955] combined the treatment based on Fourier analysis with information theory and deduced the statistical mean information (S.M.I.) content per unit area of image by making a few assumptions based on practical conditions:
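$$ \mathrm{S.M.I.} = \frac{1}{f^2} \iint \log\left( 1 + \frac{|\tau\tau_1|^2\, \overline{|\epsilon_0|^2}}{|\tau\tau_1|^2\, \overline{|\nu_0|^2} + \overline{|\nu_1|^2}} \right) du\,dv; $$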
here $(u, v)$ are frequency variables, $\tau$, $\tau_1$ are the response functions of the lens system and the receiver respectively, $\overline{|\epsilon_0|^2}$ is the statistical mean of spectral powers of the intensity distribution on the object plane, and $\overline{|\nu_0|^2}$, $\overline{|\nu_1|^2}$ are the means of spectral powers of the noise in the object and in the image plane respectively; and $f$ is the focal length. Based on this theory, Linfoot proposed three criteria, pointing out that an evaluating method should have the following properties (LINFOOT [1956, 1958]):
(1) It must include means of taking into account the characteristics of the receiving surface;
(2) The criterion should take account of the type of object on which the system is to be used.
The three criteria proposed by Linfoot are:
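Relative structural content:

$$ T = \iint \langle I^2 \rangle\,dx\,dy \Big/ \iint \langle o^2 \rangle\,dx\,dy, $$

Fidelity:

$$ \Phi = 1 - \iint \langle (o - I)^2 \rangle\,dx\,dy \Big/ \iint \langle o^2 \rangle\,dx\,dy, $$

Correlation quality:

$$ Q = \iint \langle oI \rangle\,dx\,dy \Big/ \iint \langle o^2 \rangle\,dx\,dy. $$

Here $o(x, y)$ and $I(x, y)$ are the intensity distributions in the object and image plane respectively, normalized in such a way that

$$ \iint o(x, y)\,dx\,dy = \iint I(x, y)\,dx\,dy, $$

and the notation $\langle\ \rangle$ expresses the statistical mean. (Among $T$, $\Phi$, $Q$, the relation $Q = \tfrac{1}{2}(T + \Phi)$ holds.) With the help of Fourier theory, $T$, $\Phi$, $Q$ may also be expressed as follows:

$$ T = \frac{\iint |\tau|^2\, \overline{|\epsilon_0|^2}\,du\,dv}{\iint \overline{|\epsilon_0|^2}\,du\,dv}, \qquad \Phi = 1 - \frac{\iint |1 - \tau|^2\, \overline{|\epsilon_0|^2}\,du\,dv}{\iint \overline{|\epsilon_0|^2}\,du\,dv}, \qquad Q = \frac{\iint \operatorname{Re}(\tau)\, \overline{|\epsilon_0|^2}\,du\,dv}{\iint \overline{|\epsilon_0|^2}\,du\,dv}. $$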
If we take into account the effect of the receiver, we only have to
62
WAVE OPTICS AND GEOMETRICAL OPTICS
111,
55
replace T by 771. Thcse quantities have a decp physical meaning. T may be considered to express a statistical mean information content in the case where the noise of the object plane can be neglected and the image details are almost completely smothered in the noise of the -image plane, that is 1~012, 1~012 4 represents the degree of similarity between the intensity distribution of object and image planes as the formula shows. Q is the mean of T and 6 and furthermore, if is constant, Q becomes equal (using the Fourier formula see (4.3))to the intensity at the centre, except for a constant coefficient. When the effcct of the receivcr is considered, Q becomes equivalent to the value of the resultant intensity distribution of the total system a t the ccntre. Hence, Q has a close relationship with the Strehl deiinition, which has been discussed by inany authors. G. KUWABARA [ 19551 already showed that the image evaluation by the Strehl definition coincides fairly well with visual evaluation in the case of spherical aberrations, and K. SAYANAGI :1956] discussed also the effect of the receiver along such lines. Returning to thc first problem, wc can say that in order to apply these theories to lens design, it is necessary to find an easy method for calculating the rcsponse function from the results of ray tracing. As it is very difficult, although noL impossible, to calculate the wave optical response function in practice, it will be useful to obtain the geometric optical response function as the second best method. As it is clear from eqs. (2.1') and (4.5), computation from the following formula seems to be advisable (LUKOSZ [1958], MIYAMOTO [ 19581):
$$R_g(s, t) = \frac{1}{N}\sum_{i=1}^{N} \exp\{2\pi i\,(s x_i + t y_i)\}. \tag{4.5'}$$
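As a rough illustration of how eq. (4.5') might be evaluated from ray-trace output, the following Python sketch sums the phase factors over a spot diagram. It assumes that (4.5') has the form R_g(s, t) = (1/N) Σ exp{2πi(s x_i + t y_i)}; the function name `geometric_response` and the nine-ray spot diagram are hypothetical and serve only as an example.

```python
import cmath
import math


def geometric_response(spots, s, t):
    """Geometric optical response R_g(s, t) of a spot diagram.

    spots: list of (x_i, y_i) ray intersection points in the image plane.
    s, t:  spatial frequencies, in cycles per unit length of x and y.
    """
    n = len(spots)
    total = sum(cmath.exp(2j * math.pi * (s * x + t * y)) for x, y in spots)
    return total / n


# Hypothetical spot diagram: chief ray plus 8 rays on a circle of radius 0.01.
spots = [(0.0, 0.0)] + [
    (0.01 * math.cos(2 * math.pi * j / 8), 0.01 * math.sin(2 * math.pi * j / 8))
    for j in range(8)
]

for freq in (0.0, 10.0, 25.0, 50.0):
    r = geometric_response(spots, freq, 0.0)
    print(f"s = {freq:5.1f}  |R_g| = {abs(r):.4f}")
```

The cost is one complex exponential per ray and per frequency, so a dense spot diagram mainly costs ray-tracing time rather than summation time.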
For a good approximation, a dense spot diagram is necessary, which can be calculated more easily using interpolation formulae (FOCKE [1952], STAVROUDIS and FEDER [1954], HERZBERGER [1958]). The value of R_g(s, t) calculated by eq. (4.5') coincides fairly well with the measured one in the case of a photographic lens with a relatively large aperture (KUBOTA, MIYAMOTO and MURATA [1960]).

5.2. SINGLE FIGURE OF MERIT FOR CYBERNETIC DESIGN WITH DIGITAL COMPUTER
In conventional design, the rough arrangement of lenses or lens powers is first determined from a knowledge of the aperture and the field size required. Gaussian optics or Seidel aberration theory
may be useful in the early stages of optical design, or in the case of a lens with a small aperture and field. For a lens having larger aperture and field, designers usually select the proper set of the important residual aberrations and reduce them by using various minimization processes. In other words, the optical design may be regarded as a problem of solving a set of multi-dimensional, nonlinear equations under many awkward boundary conditions. However, the recent development of electronic computers has made possible the introduction of new lens design procedures, utilizing the computer’s large memory and ability to perform complicated logical operations. When the problem of cybernetic design with a digital computer is considered, it is convenient to use a single figure of merit instead of the set of residual aberrations, because if one finds a proper single figure of merit φ(p_i) which can be calculated from the construction parameters p_i of the optical system, optical design is reduced to the problem of determining the values of p_i which make φ(p_i) best under the limitations of the optical system. Accordingly there are some prospects for suitable programming, and various methods have already been discussed. As typical examples, there are the variable-by-variable method (BLACK [1955]), the method of steepest descent (FEDER [1957], MEIRON and LOEBENSTEIN [1957]), and the least squares method (ROSEN and ELDERT [1954], R. E. HOPKINS, MCCARTHY and WALTERS [1955], WYNNE [1959]). Fig. 5.1 shows a flow diagram of the variable-by-variable method proposed by G. Black.
Fig. 5.1. A flow diagram of the variable-by-variable method. This flow diagram is rather primitive at the present time, but it gives some idea of cybernetic design.
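The variable-by-variable idea can be sketched in a few lines of Python. This is only a generic one-parameter-at-a-time search, not a transcription of Black's flow diagram; the function `variable_by_variable`, its step-halving rule and the quadratic `toy_merit` standing in for φ(p_i) are all assumptions made for illustration.

```python
def variable_by_variable(merit, params, step=0.1, shrink=0.5, tol=1e-6, max_sweeps=200):
    """Minimise merit(params) by changing one construction parameter at a time."""
    p = list(params)
    best = merit(p)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(p)):
            # Try a step up, then a step down, in parameter i only.
            for delta in (step, -step):
                trial = p[:]
                trial[i] += delta
                value = merit(trial)
                if value < best:
                    p, best, improved = trial, value, True
                    break
        if not improved:
            # No single-parameter change helps at this step size: refine it.
            step *= shrink
            if step < tol:
                break
    return p, best


def toy_merit(p):
    # Hypothetical stand-in for phi(p_i): a smooth bowl with coupled parameters.
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 0.5) ** 2 + 0.5 * p[0] * p[1]


solution, value = variable_by_variable(toy_merit, [0.0, 0.0])
print(solution, value)
```

In a real design program each call to `merit` would involve tracing rays through the system, which is why the text asks that the figure of merit be obtainable in a time comparable to that of the ray trace itself.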
It is thus necessary to examine the properties which φ(p_i) must have, and we easily notice that a third one is required in addition to
the two conditions described concerning the assessments T, Φ, Q; that is, (3) φ(p_i) must be easily calculated from the construction parameters p_i of the optical system. It is also desirable to obtain φ(p_i) in a length of time comparable to the time needed for ray tracing. As practically used figures of merit φ(p_i), the radius of gyration of the spot diagram, Lucy’s intensity criterion, a properly weighted sum of squares of generalized Seidel aberration residuals, and others are considered. But it is clear that they do not satisfy the conditions (1), (2). At this point, Q is considered as one of the assessments satisfying all three conditions. Taking the inverse Fourier transformation of τη, we have a convolution t(x, y) of the intensity distribution I(x, y) of the point image and the turbidity η(x, y) of its receiver,
$$t(x, y) = \iint I(x', y')\,\eta(x - x', y - y')\,dx'\,dy'.$$
If I(x, y) can be replaced by the geometric optical intensity distribution, we have (MIYAMOTO [1959])
$$t(x, y) = \frac{1}{N}\sum_{i=1}^{N} \eta(x - x_i, y - y_i),$$
using here the co-ordinates (x_i, y_i) of the spot diagram. Accordingly, when the spectral power of the object plane is constant, we obtain for Q the following expression:
$$Q = \frac{1}{N}\sum_{i=1}^{N} \eta(x_i, y_i). \tag{5.1}$$
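A figure of merit of this kind is cheap to evaluate once the spot diagram is known. The Python sketch below assumes the form Q = (1/N) Σ η(x_i, y_i), that is, the receiver kernel averaged over the ray intersection points; the Gaussian kernel and the two five-ray spot diagrams are hypothetical choices, not taken from the source.

```python
import math


def merit_q(spots, kernel):
    """Q = (1/N) * sum of kernel(x_i, y_i) over the spot diagram."""
    return sum(kernel(x, y) for x, y in spots) / len(spots)


def gaussian_turbidity(sigma):
    # Hypothetical receiver spread: Gaussian of width sigma with peak value 1.
    def kernel(x, y):
        return math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return kernel


# Two hypothetical spot diagrams along the x axis: a tight one and a loose one.
tight = [(0.001 * i, 0.0) for i in range(-2, 3)]
loose = [(0.010 * i, 0.0) for i in range(-2, 3)]
eta = gaussian_turbidity(0.01)

print(merit_q(tight, eta), merit_q(loose, eta))
```

With this kernel Q approaches 1 when every ray falls well inside the receiver spread and decreases as the spot diagram grows, so a larger Q corresponds to a better image in this sketch.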
Even if the spectral power of the object plane is not constant, this effect can be taken into account in the form of η(x, y). It may be interesting to examine how to determine the proper form of the function η(x, y) in any particular situation. The function η(x, y) must have a physical meaning and must be easily calculated, but we can also choose its form more freely. (In this case the meaning of Q is lost.) For example, if we take η(x, y) = [(x² + y²)^{1/2} + d]^{-1}, Q becomes Lucy’s intensity criterion, and when exp 2πi(sx + ty) is selected, eq. (5.1) changes to the geometric optical response function R_g(s, t); see eq. (4.5). In the case where the extent of the spot diagram is small compared
with the turbidity of the receiver, only the values of R_g(s, t) in the low frequency domain relate to the image evaluation, and we may discuss it by expanding eq. (4.5') in a power series. If we choose the origin at the centre of gravity of the spot diagram,
$$\sum_{i=1}^{N} x_i = 0, \qquad \sum_{i=1}^{N} y_i = 0,$$
the power series starts as
$$R_g(s, t) = 1 - \frac{2\pi^2}{N}\sum_{i=1}^{N} (s x_i + t y_i)^2 + \dots;$$
we then find that the evaluation by the radius of gyration is its first approximation. Even in the case of no receiver, if the extent of the spot diagram is smaller than that of Airy’s disk, we have a similar result to the above (the case of the Schmidt camera). These discussions have a close connection with the tolerance criterion proposed by H. H. HOPKINS [1957a].
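The link with the radius of gyration can be written out explicitly. The following is a sketch under the assumption that (4.5') reads R_g(s, t) = (1/N) Σ exp{2πi(s x_i + t y_i)}; the second moments and the symbol ρ_g for the radius of gyration are notation introduced here, not taken from the source. Expanding the exponential and using the centroid condition to remove the linear term gives

$$R_g(s,t) \approx 1 - 2\pi^2\left(s^2\,\overline{x^2} + 2st\,\overline{xy} + t^2\,\overline{y^2}\right), \qquad \overline{x^2} = \frac{1}{N}\sum_{i=1}^{N} x_i^2,\ \ \overline{xy} = \frac{1}{N}\sum_{i=1}^{N} x_i y_i,\ \ \overline{y^2} = \frac{1}{N}\sum_{i=1}^{N} y_i^2 .$$

For a spot diagram whose second moments are isotropic, so that $\overline{x^2} = \overline{y^2} = \tfrac12\rho_g^2$ and $\overline{xy} = 0$ with $\rho_g^2 = \frac{1}{N}\sum_i (x_i^2 + y_i^2)$, this reduces to

$$R_g(s,t) \approx 1 - \pi^2 \rho_g^2\,(s^2 + t^2),$$

so that at low frequencies the response depends on the spot diagram only through its radius of gyration.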
References

BLACK, G., 1955, Proc. Phys. Soc. B 68, 729.
BORN, M., 1933, Optik (Julius Springer, Berlin).
BORN, M. and E. WOLF, 1959, Principles of Optics (Pergamon Press, New York).
BROMILOW, N. S., 1958, Proc. Phys. Soc. B 71, 231.
CARATHÉODORY, C., 1937, Geometrische Optik (Julius Springer, Berlin).
DE, M., 1955, Proc. Roy. Soc. A 233, 91.
DIRAC, P. A. M., 1947, The Principles of Quantum Mechanics (Oxford).
DUFFIEUX, P. M., 1946, L’Intégrale de Fourier et ses Applications à l’Optique (Rennes).
FEDER, D. P., 1957, J. Opt. Soc. Am. 47, 902.
FELLGETT, P. B. and E. H. LINFOOT, 1955, Trans. Roy. Soc. A 247, 367.
FOCKE, J., 1952, Jenaer Jahrbuch (Jena).
HAMILTON, W. R., 1827, Theory of Systems of Rays (Trans. Roy. Irish Acad.).
HERZBERGER, M., 1936, J. Opt. Soc. Am. 26, 177.
HERZBERGER, M., 1947, J. Opt. Soc. Am. 37, 485.
HERZBERGER, M., 1958, Modern Geometrical Optics (Interscience Pub. Inc., New York).
HOPKINS, H. H., 1950, Wave Theory of Aberrations (Oxford).
HOPKINS, H. H., 1951, Proc. Roy. Soc. A 208, 263.
HOPKINS, H. H., 1953, Proc. Roy. Soc. A 217, 408.
HOPKINS, H. H., 1955, Proc. Roy. Soc. A 231, 81.
HOPKINS, H. H., 1957a, Proc. Phys. Soc. B 70, 449.
HOPKINS, H. H., 1957b, Proc. Phys. Soc. B 70, 1162.
HOPKINS, R. E., 1955, Report Inst. Optics, Univ. Rochester.
HOPKINS, R. E., C. A. MCCARTHY and R. WALTERS, 1955, J. Opt. Soc. Am. 45, 363.
KINGSLAKE, R., 1948, Proc. Phys. Soc. 61, 147.
KUBOTA, H., K. MIYAMOTO and K. MURATA, 1960, Optik 17 (In Press).
KUWABARA, G., 1955, J. Opt. Soc. Am. 45, 309 and 625.
LINFOOT, E. H., 1955, Recent Advances in Optics (Oxford).
LINFOOT, E. H., 1956, J. Opt. Soc. Am. 46, 740.
LINFOOT, E. H., 1958, Opt. Acta 5, 1.
LUCY, F. A., 1956, J. Opt. Soc. Am. 46, 699.
LUKOSZ, W., 1958, Opt. Acta 5, 299.
MARÉCHAL, A., 1954, Optical Image Evaluation, Nat. Bur. Stand. Circ. 526 (Washington, D.C.).
MEIRON, J. and H. M. LOEBENSTEIN, 1957, J. Opt. Soc. Am. 47, 1104.
MIYAMOTO, K., 1957, J. Appl. Phys. Japan 26, 421.
MIYAMOTO, K., 1958a, J. Opt. Soc. Am. 48, 57 and 567.
MIYAMOTO, K., 1958b, J. Appl. Phys. Japan 27, 585.
MIYAMOTO, K., 1959, J. Opt. Soc. Am. 49, 35.
NIENHUIS, K., 1948, Thesis, Groningen.
NIENHUIS, K. and B. R. A. NIJBOER, 1949, Physica 14, 590.
NIJBOER, B. R. A., 1943, Physica 10, 679.
NIJBOER, B. R. A., 1947, Physica 13, 605.
OGURA, I., 1958, J. Opt. Soc. Am. 48, 579.
PICHT, J., 1925, Ann. der Physik 77, 685.
PICHT, J., 1926, Ann. der Physik 80, 491.
ROSEN, S. and C. ELDERT, 1954, J. Opt. Soc. Am. 44, 250.
SAYANAGI, K., 1956, J. Appl. Phys. Japan 25, 193.
SCHADE, O. H., 1951, J. Soc. Motion Pict. Telev. Engr. 56, 137.
SCHWARZSCHILD, K., 1905, Astronom. Mitteil. Kgl. Sternwarte, Göttingen.
SOMMERFELD, A. and J. RUNGE, 1911, Ann. Phys. 35, 277.
STAVROUDIS, O. N. and D. P. FEDER, 1954, J. Opt. Soc. Am. 44, 163.
STEWARD, G. C., 1928, The Symmetrical Optical System (Cambridge).
SYNGE, J. L., 1937, J. Opt. Soc. Am. 27, 138.
TORALDO DI FRANCIA, G., 1954, Optical Image Evaluation, Nat. Bur. Stand. Circ. 526 (Washington, D.C.) 161.
UKITA, Y. and J. TSUJIUCHI, 1958, Opt. Acta 5, 39.
VAN KAMPEN, N. G., 1949, Physica 14, 575.
VAN KAMPEN, N. G., 1950, Physica 16, 817.
WOLF, E., 1952, J. Opt. Soc. Am. 42, 547.
WYNNE, C. G., 1959, Proc. Phys. Soc. 73, 777.
III
THE INTENSITY DISTRIBUTION AND TOTAL ILLUMINATION OF ABERRATION-FREE DIFFRACTION IMAGES
BY
RICHARD BARAKAT
Optics Department, Itek Corporation, Boston, Mass., U.S.A.
CONTENTS
§ 1. INTRODUCTION
§ 2. KIRCHHOFF DIFFRACTION THEORY
§ 3. SPECIAL PROBLEMS
§ 4. VECTOR DIFFRACTION THEORIES
ACKNOWLEDGMENTS
REFERENCES
§ 1. Introduction

One of the major problems of physical optics is a quantitative description of the various diffraction phenomena. In fact, one might even say that physical optics is the product of the successful attempt to describe diffraction as a manifestation of wave interactions. To a great extent this quantitative description has been carried out in the realm of the Kirchhoff theory, although there are notable exceptions. In view of the success of the transfer function approach there is a tendency to believe that the intensity distribution in the diffraction image is of secondary importance since it can be obtained in principle from the transfer function. Of course, the complete solution to any problem would include both descriptions. In this review paper we will discuss the intensity distributions and the total illumination (or encircled energy) of aberration-free optical systems. Although there will be occasional references to the diffraction theory of aberrations, they will be incidental to the main topic. For recent reviews of the diffraction theory of aberrations see WOLF [1951b] or BORN and WOLF [1959] p. 458. No attempt has been made to assemble an exhaustive bibliography, although the more important papers are listed. One of the unfortunate features of an historical study is that many of the important papers were published in relatively obscure journals which are extremely difficult to obtain. Among the books which are partially devoted to a history of physical optics we mention MEYER [1934] and VERDET [1881]. A very recent volume by RONCHI [1957] deals with the general history of optics and is illustrated by valuable photographs from a number of the older manuscripts. Finally, we mention MACH’s well-known volume [1913] on physical optics, which contains an excellent historical treatment of the subject (colored, to be sure, by Mach’s dislike of general analytical arguments). The chief contributor to the technical aspects of early diffraction
theory was G. AIRY; a convenient summary of much of his work is contained in his tract “Undulatory Theory of Optics” [1877]. The work of the period 1820-1885 is skillfully summarized and critically discussed by LORD RAYLEIGH in “Wave Theory of Light” [1888]. This article is one of the cornerstones of any serious study of optical diffraction theory. Earlier surveys of the diffraction literature are given in the following “Handbuch” type articles: VON LAUE [1915, 1928], POCKELS [1906], MÖGLICH [1927], JENTSCH [1929], WOLFSOHN [1928]. The latest Handbuch article by FRANÇON [1956] contains much valuable information and is profusely illustrated. Recently three volumes devoted exclusively to optical diffraction theory have appeared and are unreservedly recommended. The first is “La Diffrazione della Luce” by TORALDO DI FRANCIA [1958] and contains an excellent introduction to optical diffraction theory. The second is by RUBINOWICZ and is entitled “Die Beugungswelle in der Kirchhoffschen Theorie der Beugung” [1957]; the volume is devoted to Rubinowicz’s exposition of his boundary wave theory but also contains a very thorough discussion of the Kirchhoff theory. Finally, the volume “Diffraction, Structure des Images” [1960] by MARÉCHAL and FRANÇON is a modern treatise on optical image formation from the point of view of the French school of optical physicists. No reference is made to any Russian work for the simple reason that there is little published literature available. It is difficult to believe that there is no interest in optical diffraction theory in the USSR, especially when men of the stature of Fock, Vajnstejn, etc. are working in closely allied fields.

The plan of the article is first to give a critical résumé of the foundations of the Kirchhoff diffraction theory on which most optical diffraction work is based. We then pass to a discussion of various special problems of theoretical and practical interest. Finally, we outline some recent work on vector diffraction.
§ 2. Kirchhoff Diffraction Theory

The classical theory for the treatment of diffraction problems for the high frequencies of optics is due to G. KIRCHHOFF [1891]. In spite of a number of basic objections (to be discussed below) the Kirchhoff
theory is entirely adequate for the usual problems of instrumental optics provided we are not too close to the diffracting edge. One of the fundamental unsolved problems of optical diffraction theory is to understand why the Kirchhoff theory successfully predicts the intensity distributions in spite of the fact that from the mathematical standpoint the Kirchhoff theory appears to be a poor approximation to the rigorous formulation of the diffraction problem (wave equation, boundary conditions, radiation condition). Paraphrasing a remark of Poincaré on the law of errors: “The theoreticians believe in the Kirchhoff theory because they hold it to be an experimental fact, while the experimentalists think it to be a mathematical theorem.” Important progress towards understanding the true reason for the success of Kirchhoff’s theory has recently been made by H. M. NUSSENZVEIG [1957, 1959] in his study of diffraction by the double wedge. We briefly sketch the Kirchhoff theory, referring to the standard works of BAKER and COPSON [1950], SOMMERFELD [1954], RUBINOWICZ [1957], BORN and WOLF [1959] for detailed accounts. Assuming the disturbance U to be a scalar quantity, a straightforward application of Green’s theorem to the Helmholtz equation ∇²U + k²U = 0 yields
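$$U(P) = \frac{1}{4\pi}\iint_S \left[ U\,\frac{\partial}{\partial n}\!\left(\frac{e^{ikr}}{r}\right) - \frac{e^{ikr}}{r}\,\frac{\partial U}{\partial n} \right] dS, \tag{2.1}$$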
where S is a closed surface (the sources of the field are assumed to lie outside S), r denotes the distance between the field point P and the source point, and ∂/∂n denotes differentiation with respect to the normal. This integral expresses the effect at any point P in the field in terms of a surface integral taken over the surface S (which we can take as including the aperture); in other words, we consider U(P) as the resultant of a superposition of secondary sources situated over S. The problem would be completely solved if U and ∂U/∂n were known on the boundary which, of course, requires a knowledge of the boundary conditions. The Kirchhoff theory is based on the assumption that the unknown distribution of light on the boundary can be replaced to a good accuracy by certain simple approximations, so that no account need be taken of the actual boundary conditions. Kirchhoff’s method consists of the simultaneous prescription of the boundary values of U and its normal derivative. It is assumed that immediately behind the screen there is no disturbance, while the actual field in the aperture is replaced by the unperturbed field. These two assumptions are essentially geometrical optics approximations, and may be expected to be reasonable approximations when the dimensions of the aperture are large compared to the wavelength. Consequently, U and ∂U/∂n are chosen equal to the incident wave values in the aperture and taken to be zero on the boundary. As POINCARÉ [1892] p. 187 has shown, the Kirchhoff approximation is not self-consistent in the sense that it cannot reproduce the boundary values by substituting the geometrical optics approximations into the original integral. The reason is that we cannot simultaneously specify the scalar function U and its normal derivative on the boundary, since Helmholtz’s equation is of elliptic and not hyperbolic type. The Kirchhoff approximation is a plausible one but implies that U and ∂U/∂n are discontinuous at the edge of the aperture. Since Green’s theorem is valid only for continuous functions, we have violated the assumptions made in applying Green’s theorem. In spite of all these assumptions (mutually contradictory!) the theory yields excellent results. A factor tending to work in favor of the approximate theory is the rapid decrease of intensity within the (geometric) shadow zone, limiting the usual measurements of diffraction patterns to small angles of diffraction. As a direct consequence of the application of the Kirchhoff theory to a spherical wavefront, (2.1) becomes
$$U(P) = D \iint_S T\,(1 + \cos\theta)\,\frac{e^{ik(r + r_0)}}{r\,r_0}\,dS, \tag{2.2}$$
where D is a constant, S is the surface of the unobstructed wavefront (area of the aperture) and θ is the angle of diffraction. The optical path length from the source to dS is denoted by r, while r_0 is the optical path length from dS to the field point P in the specified receiving plane in the diffraction field. The function (1 + cos θ) is the obliquity factor and T is the amplitude distribution over the converging wavefront. It is common practice to ignore the variation of the obliquity factor and to bring r r_0 outside the integral, leaving
$$U(P) = C \iint_S T\,e^{ik(r + r_0)}\,dS, \tag{2.3}$$
where C is a constant.
The function (r + r_0) depends upon x, y and may be expanded in a two-dimensional Taylor series. By definition we have Fraunhofer diffraction when we keep only the terms up to the first order in the Taylor series. Inclusion of the higher order terms yields Fresnel
diffraction. If the amplitude distribution over the converging wavefront is constant, we have the classic Airy systems so familiar from undergraduate physics courses. The Airy-Kirchhoff theory, with its neglect of the obliquity factor and assumption of uniform amplitude distribution over the exit pupil, has been justified by THEIMER, WASSERMANN and WOLF [1952] for natural light and with aperture semi-angles up to about 10 degrees. With the introduction of optical systems of high numerical aperture, it has now become a problem of great importance to develop the necessary theory to cover these new situations. In this respect it is not enough to consider the scalar diffraction integrals with the inclusion of second order terms. The scalar theory itself appears to be inadequate and a vector theory must be substituted in its place, for at high numerical apertures polarization effects must be taken into account. The very great number of assumptions introduced to obtain (2.3) should be kept constantly in mind, as the mathematical pyrotechnics necessary for the evaluation of the basic diffraction integral tend to relegate the physics to a secondary role. A number of attempts have been made to improve the Kirchhoff approximations. BORN [1933] was the first to suggest that the Kirchhoff theory was the first approximation to an accurate solution which could be obtained by repeated iteration. FRANZ [1949, 1957] and SCHELKUNOFF [1951] have proved that this assertion is false. Contrary to the opinion held by some recent workers, this does not mean that there are no other methods which will start with the Kirchhoff theory and finally yield the rigorous solution. This approach has not yet been attempted. Using an entirely different approach, KOTTLER [1923] has proved that the Kirchhoff solution is the rigorous solution to a “saltus” problem (problem involving discontinuities) and not of a boundary value problem.
The main contribution of Kottler lies in his careful examination of the “black screen” concept. It is implied in the Kirchhoff theory that the screen is perfectly absorbing; however, from the electromagnetic point of view there can exist no perfectly absorbing screen (black screen). Quoting BAKER and COPSON [1950] p. 101: “It is impossible to give a satisfactory physical definition of a thin black screen; Kottler’s work shows us what analytical definition of ‘blackness’ gives rise to Kirchhoff’s formula.” Reference is made to the extensive and critical study of diffraction theories (at longer wavelengths where the boundary conditions must
be taken into account) by BOUWKAMP [1953] and their use in acoustic and electromagnetic problems. Excellent accounts of electromagnetic diffraction theory are also given in FRANZ [1957] and TORALDO DI FRANCIA [1956]. The articles by Bouwkamp and Franz contain detailed bibliographies.
§ 3. Special Problems
We now pass to consideration of various special problems. All the work reviewed is based upon the Kirchhoff theory except for subsection 3.3, which is based upon the Luneberg approach. In spite of the fact that optical diffraction theory is over one hundred fifty years old, solutions to all but the simplest problems are still wanting. It is true that formal solutions have been obtained, but what is presently needed is a systematic numerical study which will partially complete the program initiated by Airy.

3.1. POINT SOURCE - UNIFORM AMPLITUDE DISTRIBUTION
We now specialize our analysis to cover only the case of a uniform amplitude distribution (T = constant) over the converging wavefront and discuss in this context Fraunhofer and Fresnel diffraction by various apertures. We can write the complex amplitude due to a point source as
$$U(x, y) = N \iint e^{ik(px + qy)}\,dp\,dq \tag{3.1}$$
for Fraunhofer diffraction. The aperture coordinates are p and q, the direction cosines are x and y, and N is essentially a normalizing constant. The integration is over the aperture. The point to bear in mind is that the incident waves are plane waves and consequently the Fraunhofer diffraction integral is “properly a function of the direction in which the light is to be estimated”. As we have previously remarked, THEIMER, WASSERMANN and WOLF [1952] (also BORN and WOLF [1959] p. 386) have shown that it is permissible to use a single scalar function U in calculating the light intensity provided that the aperture semi-angle is small. A second restriction involves the interpretation of an averaging procedure; we refer to the references for full details. OSTERBERG [1951] p. 245 has given a very clear statement as to the
physical meaning of the complex diffraction integral U; we can do no better than to quote him. “The complex function U is of direct physical significance in the sense that |U|² gives the distribution of energy density produced in the image plane by an unpolarized dipole radiator. An unpolarized dipole radiator may be regarded as one that changes its orientation in a random manner in a period of time which is short compared with the smallest interval of time that can be distinguished by the receptor of the energy density, or it may be regarded as a group of independent dipole radiators oriented at random in an element of area or volume which can be considered as being infinitesimally small. |U|² is the distribution of energy density produced by these unpolarized, that is, randomly oriented, dipole radiators. It is important to appreciate that, whereas the phase and amplitude distribution produced by an unpolarized radiator and hence by U is fictitious. . . . We shall continue to call U an amplitude and phase distribution, but we shall not claim that either it or the amplitude and phase distribution derived from it are real amplitude and phase distributions.”

BRIDGE [1858] has proved a number of elementary but highly important theorems relating to Fraunhofer diffraction. We follow RAYLEIGH [1888]:
A) A diminution of the wavelength λ leads to a simple proportional shrinkage of the diffraction pattern, attended by an augmentation of brilliancy in proportion to λ^{-2}.
B) If the wavelength remains unchanged, similar effects are produced by an increase in the scale of the aperture. The linear dimension of the diffraction pattern is inversely as that of the aperture and the brightness at corresponding points is as the square of the area of the aperture.
C) If the aperture and wavelength increase in the same proportion, the size and shape of the diffraction pattern undergo no change.
The interested reader should consult TORALDO DI FRANCIA [1958] p. 201 for an elegant treatment of this topic. Application of (B) allows us to compute, for example, the pattern for an elliptic aperture given the intensity distribution for a circular aperture. The number of aperture shapes, besides the circular, which have been considered is small. AIRY [1841] treated the annular aperture following the experimental work of Herschel. Other apertures discussed are the rectangular, equilateral triangle, isosceles triangle, and elliptic aperture. Full details are available in BASSET [1891].
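Theorems (B) and (C) are easy to check numerically for the rectangular aperture, for which the Fraunhofer integral can be evaluated in closed form. The Python sketch below assumes the integral has the form N ∬ exp{ik(px + qy)} dp dq over |p| ≤ a, |q| ≤ b with N = 1 and k = 2π/λ; the function names and the numerical values are hypothetical. Since N is held fixed, the λ^{-2} brightness factor of theorem (A) is not represented.

```python
import math


def sinc(z):
    """sin(z)/z with the limiting value 1 at z = 0."""
    return 1.0 if abs(z) < 1e-12 else math.sin(z) / z


def rect_intensity(x, y, a, b, wavelength):
    """|U|^2 for the rectangle |p| <= a, |q| <= b, taking N = 1."""
    k = 2.0 * math.pi / wavelength
    amplitude = 4.0 * a * b * sinc(k * a * x) * sinc(k * b * y)
    return amplitude ** 2


# Hypothetical values: lengths in mm, x and y are direction cosines.
a, b, lam = 1.0, 0.5, 5e-4
x, y = 1e-4, 2e-4
scale = 2.0

# Theorem B: doubling the aperture halves the pattern scale and multiplies
# the brightness at corresponding points by (area ratio)^2 = scale^4.
i_ref = rect_intensity(x, y, a, b, lam)
i_big = rect_intensity(x / scale, y / scale, scale * a, scale * b, lam)
print(i_big / i_ref)

# Theorem C: scaling aperture and wavelength together leaves the normalised
# pattern unchanged.
shape_ref = i_ref / rect_intensity(0.0, 0.0, a, b, lam)
shape_scaled = (rect_intensity(x, y, scale * a, scale * b, scale * lam)
                / rect_intensity(0.0, 0.0, scale * a, scale * b, scale * lam))
print(shape_ref, shape_scaled)
```

The same scaling argument, applied with different factors along the two axes, is what allows the pattern of an elliptic aperture to be obtained from that of a circular one.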
An aperture shape of considerable importance is that made by the sector of a circle; in particular the semi-circular aperture (heliometer) is used in astronomy. STRUVE [1882b] outlined the first analytical treatment; however, BRUNS [1883] established the complete analysis (see also STRAUBEL [1888]). Bruns expanded the Fraunhofer diffraction integral into a series of Bessel functions; no computations were attempted. With the availability of extensive Bessel function tables it would be a simple matter to use Bruns’ expansions and study how the intensity pattern varies as the sector of the circle is varied. Formulae equivalent to those of Bruns are given in STEWARD [1928] p. 100. EVERITT [1919] carried out the computation of the intensity pattern of the heliometer by direct quadrature of the integral (see Fig. 3.1). In Fig. 3.2 we show a photograph of the diffraction pattern. MITRA [1920] reconsidered the problem in the light of the Rubinowicz interpretation of the Kirchhoff integral and succeeded in explaining the major features of the intensity distribution by qualitative arguments. STRAUBEL [1888, 1895] enunciated a series of theorems on the symmetry of the Fraunhofer diffraction pattern (see also VON LAUE [1928]). A valuable collection of photographs of the Fraunhofer diffraction pattern of various apertures is contained in the work of SCHEINER and HIRAYAMA [1894]. The photographs are also reproduced in DIMITROFF and BAKER [1945] p. 295.

Fig. 3.1. Contour lines of intensity (isophotes) in the paraxial receiving plane for the semi-circular aperture (Everitt).

Fig. 3.2. Fraunhofer diffraction pattern for the semi-circular aperture (Scheiner and Hirayama).

Fig. 3.3. Fresnel diffraction pattern of a circular aperture. The left hand figure (a) shows the pattern at p = 18π and the right hand figure (b) shows the pattern at p = 20π. The former is a maximum and the latter a minimum (Taylor and Thompson).

Fresnel, as is well known, was the first to study that class of diffraction phenomena subsumed under his name. His work is concerned mainly with diffraction by a straight edge. The first person to make a serious study of Fresnel diffraction was LOMMEL [1884, 1886], who gave an exhaustive treatment of the circular and the rectangular aperture as well as the complementary problem of the circular and the rectangular disc. Many more problems of Fresnel diffraction can be reduced to the evaluation of the integral (WALKER [1904] p. 130)
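$$U = \iint \exp\left\{ i\left[\tfrac{1}{2}\,p\,(x^2 + y^2) + q_1 x + q_2 y\right]\right\} dx\,dy, \tag{3.2}$$
the integration again extending over the diffraction aperture. Here x and y are the aperture coordinates; q_1, q_2 are the lateral displacements in the x and y directions; p is the defocusing term. Following Lommel we can write (3.2) as
$$U = \frac{\pi}{2} \int (q_1 x)^{1/2}\left[J_{-1/2}(q_1 x) + i J_{1/2}(q_1 x)\right] e^{\frac{1}{2} i p x^2}\,dx \times \int (q_2 y)^{1/2}\left[J_{-1/2}(q_2 y) + i J_{1/2}(q_2 y)\right] e^{\frac{1}{2} i p y^2}\,dy. \tag{3.3}$$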
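The appearance of Bessel functions of order ±1/2 in Lommel's form follows from an elementary identity, stated here as a reminder and on the assumption that the integrand of (3.2) contains the linear phase factors exp(i q_1 x) and exp(i q_2 y). The half-order Bessel functions are elementary,

$$J_{-1/2}(z) = \sqrt{\frac{2}{\pi z}}\,\cos z, \qquad J_{1/2}(z) = \sqrt{\frac{2}{\pi z}}\,\sin z,$$

so that

$$e^{iz} = \cos z + i \sin z = \sqrt{\frac{\pi z}{2}}\,\left[J_{-1/2}(z) + i J_{1/2}(z)\right].$$

Applying this once with z = q_1 x and once with z = q_2 y, the two square-root prefactors combine into the factor (π/2)(q_1 x)^{1/2}(q_2 y)^{1/2}, and the double integral separates into a product of two one-dimensional integrals.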
The analysis of these integrals can be found in WALKER [1904] p. 131. The end result is that the integrals are expressed in terms of the Lommel functions of two variables. Although Lommel studied the rectangular aperture (and as a limiting case the slit aperture) and gave a detailed investigation of the location of the maxima and minima, he did not attempt to give a graphical representation of the three-dimensional light distribution near focus. These computations have been completed by THOMPSON [1959]. The intensity is expressed in terms of the Lommel functions of orders 1/2 and 3/2. As a by-product of the rectangular aperture, it is a simple matter
to obtain the diffraction pattern due to a rectangular obstacle. By letting one edge of the rectangle become infinitely large, we can also obtain the diffraction due to a half-plane. The intensity is given in terms of Fresnel integrals (WALKER [1904] p. 139 or BORN and WOLF [1959] p. 432). Fortunately there exists a rigorous electromagnetic solution of the (perfectly conducting) half-plane problem by SOMMERFELD [1894, 1954] with which we can compare the approximate solution. The difference is very small and leads Sommerfeld to remark: “It is amazing that the classical diffraction theory nevertheless yields for all practical purposes satisfactory results.” The effects of diffraction on the interference by a Fresnel prism have been investigated by STRUVE [1882a] and WEBER [1879]. The results are summarized in WALKER [1904] p. 142; again the analysis can be carried out in terms of Lommel functions. It is more convenient, when discussing diffraction from a circular aperture or circular disc, to employ polar coordinates; the complex amplitude for the circular aperture then becomes (BORN and WOLF [1959] p. 436)
$$U(p, q) = 2\int_0^1 e^{\frac{1}{2} i p r^2} J_0(q r)\, r\,dr, \tag{3.4}$$
where, in suitable units, p is the longitudinal displacement and q the lateral displacement (see Fig. 3.3). The intensity is given by Lommel in the form:
The U and V functions are the Lommel functions of two variables
$$U_\nu(p, q) = \sum_{n=0}^{\infty} (-1)^n \left(\frac{p}{q}\right)^{2n+\nu} J_{2n+\nu}(q), \qquad V_\nu(p, q) = \sum_{n=0}^{\infty} (-1)^n \left(\frac{q}{p}\right)^{2n+\nu} J_{2n+\nu}(q). \tag{3.6}$$
The first expression is suitable for numerical computations for values of |p/q| < 1, the second for |p/q| > 1. Details of the integration are given in BORN and WOLF [1959], LINFOOT [1951], WALKER [1904], GRAY, MATHEWS and MACROBERT [1931], BASSET [1891], as well as in Lommel’s original article.
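The agreement between the diffraction integral and the Lommel series can be checked with a short computation. The Python sketch below assumes the amplitude U(p, q) = 2 ∫_0^1 exp(½ i p r²) J_0(qr) r dr and the standard textbook form of the first Lommel expression, I = (2/p)² [U_1²(p, q) + U_2²(p, q)], as given for example by Born and Wolf; the latter is quoted from memory and is not the (missing) equation of this article. The helper `bessel_j`, which evaluates Bessel's integral by the trapezoidal rule, and the sample point p = 2, q = 5 are choices made here for illustration.

```python
import cmath
import math


def bessel_j(n, x, m=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*tau - x*sin(tau)) dtau."""
    h = math.pi / m
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # half-weighted end points
    for j in range(1, m):
        tau = j * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def lommel_u(n, p, q, terms=30):
    """Lommel function U_n(p, q) = sum_s (-1)^s (p/q)^(n+2s) J_(n+2s)(q)."""
    return sum((-1) ** s * (p / q) ** (n + 2 * s) * bessel_j(n + 2 * s, q)
               for s in range(terms))


def amplitude_quadrature(p, q, steps=400):
    """U(p, q) = 2 * int_0^1 exp(i p r^2 / 2) J_0(q r) r dr by Simpson's rule."""
    h = 1.0 / steps
    total = 0j
    for j in range(steps + 1):
        r = j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * cmath.exp(0.5j * p * r * r) * bessel_j(0, q * r) * r
    return 2.0 * total * h / 3.0


p, q = 2.0, 5.0  # |p/q| < 1, where the U-series converges quickly
direct = abs(amplitude_quadrature(p, q)) ** 2
series = (2.0 / p) ** 2 * (lommel_u(1, p, q) ** 2 + lommel_u(2, p, q) ** 2)
print(direct, series)
```

The series form is singular at p = 0 and converges slowly as |p/q| approaches 1, which is the practical reason for having a second expression in terms of the V functions.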
Recently other analytical methods have been devised to evaluate the integral. ZERNIKE and NIJBOER [1949] utilized the theory of circle polynomials (used in the diffraction theory of aberrations) to obtain the following expression for U:
(see Fig. 3.4 for isophotes near focus).
Fig. 3.4. Contour lines of intensity (isophotes) near focus for circular aperture. The abscissa represents longitudinal defocusing and the ordinate lateral displacement (Zernike and Nijboer)
BOIVIN [1952] has obtained two other expressions which are useful because their regions of convergence are different from those delimited by Lommel’s functions. The first is
where F_n is the incomplete exponential function. This expression converges rapidly when q² < 2p. The second expansion is obtained by expanding the Bessel function in (3.4) and integrating termwise:
As Boivin points out, this result is best adapted to computations on given coaxial cones where q² > 2p. In this important paper Boivin has also treated diffraction by concentric arrays of ring-shaped apertures. Numerical calculations have been completed and Boivin’s thesis containing them will appear shortly (private communication). An alternate series useful for arbitrary q and |p| < 1 is given in LANSRAUX [1947]. In the region of the geometric shadow (i.e., p = q) the Lommel functions are slowly convergent; STRUVE [1886] derived useful approximations for this region. Other papers on the problem are: BEREK [1926], BUXTON [1921, 1923], CONRADY [1919], EPSTEIN [1949], MARTIN [1922] and SCHWARZSCHILD [1898]. An extension of Lommel’s classical analysis to diffraction at an annular aperture was made by LINFOOT and WOLF [1953].

3.2. POINT SOURCE - VARIABLE AMPLITUDE DISTRIBUTION
In the previous subsection we discussed the diffraction pattern under the assumption that the amplitude distribution over the incoming wavefront is constant. When the amplitude distribution varies over the wavefront the diffraction pattern is altered. In general, the amplitude distribution or, as it is called in recent literature, the pupil function, can depend upon both aperture coordinates (x, y in the square aperture; r, θ in the circular aperture); furthermore, it may be complex. We consider only amplitude modulation of the wavefront (i.e., T is real) in this article. HOPKINS [1949] extended Lommel’s work on the circular aperture by assuming that the amplitude distribution over the wavefront was parabolic

$$T(r) = a + b r^2, \tag{3.10}$$
where a and b are constants. The intensity distributions are given in terms of Lommel functions and their first derivatives, the $X_n$ and $Y_n$ functions, as Hopkins terms them (see WOLF [1953]). The paper contains a wealth of graphical results for various defocused receiving planes. Hopkins’ conclusion is that for all practical purposes the effect of the $r^2$ term on the intensity distribution is negligible. BOIVIN [1952] has outlined the analysis for diffraction by an annular array when the amplitude distribution is parabolic. There is a powerful theorem relating the amplitude distribution in the Fraunhofer diffraction pattern and the amplitude distribution over the wavefront, namely: the pupil function T and the amplitude
distribution in the diffraction pattern U are Fourier transform pairs :
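$$
\begin{aligned}
U(x, y) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} T(p, q)\, e^{ik(px + qy)}\, dp\, dq, \\
T(p, q) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(x, y)\, e^{-ik(px + qy)}\, dx\, dy.
\end{aligned}
\tag{3.11}
$$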
The infinite limits of integration are only formal since we define the pupil function to be zero outside the aperture. This theorem was undoubtedly known to Michelson and Rayleigh but its first extensive use is by DUFFIEUX [1946] and LANSRAUX [1947, 1953]. See also FRANCON [1956] and O’NEILL [1958]. In the case of rotational symmetry the Fourier transform pairs (3.11) (actually Hankel transform pairs) are (with suitable normalization)
$$T(r) = \int_0^{\infty} U(q)\, J_0(qr)\, q\, dq. \tag{3.12}$$
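The rotationally symmetric transform lends itself to a quick numerical check. The following is a minimal sketch, not taken from the text: it assumes a unit-radius aperture, adopts the normalization U(q) = 2 ∫ T(r) J_0(qr) r dr over 0 ≤ r ≤ 1 (so that a uniform pupil gives U(0) = 1), and evaluates J_n from Bessel's integral with the trapezoidal rule. The function names and the sample coefficients a = 1, b = -0.3 for a parabolic pupil of the form (3.10) are illustrative assumptions.

```python
import math

def bessel_j(n, x, panels=120):
    """Integer-order Bessel function J_n(x) from Bessel's integral
    (1/pi) * int_0^pi cos(n*t - x*sin t) dt, using the trapezoidal rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # half-weights at t = 0 and t = pi
    for k in range(1, panels):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def fraunhofer_amplitude(pupil, q, intervals=200):
    """U(q) = 2 * int_0^1 T(r) J_0(q r) r dr by composite Simpson's rule.
    The factor 2 is an assumed normalization: a uniform pupil gives U(0) = 1."""
    h = 1.0 / intervals
    total = 0.0
    for i in range(intervals + 1):
        r = i * h
        weight = 1 if i in (0, intervals) else (4 if i % 2 else 2)
        total += weight * pupil(r) * bessel_j(0, q * r) * r
    return 2.0 * total * h / 3.0

def uniform(r):
    return 1.0

def parabolic(r, a=1.0, b=-0.3):
    # Illustrative coefficients only; the form T(r) = a + b r^2 follows (3.10).
    return a + b * r * r

if __name__ == "__main__":
    for q in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(q, fraunhofer_amplitude(uniform, q), fraunhofer_amplitude(parabolic, q))
```

For the uniform pupil the computed values should agree with the Airy amplitude 2J_1(q)/q, which gives a simple sanity check on the quadrature before other pupil functions are tried.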
(The change in the meaning of q is evident; it is simply the lateral displacement.) An important problem is to determine the pupil function so as to increase resolution in the image by decreasing the radius of the first zero of the Airy disc. This subject has been extensively investigated, especially in France, and is termed apodization (“cutting off the toes”). STRAUBEL [1935] and LUNEBERG [1944] were the first to call attention to the benefits which could accrue from permitting the pupil function to vary. The theory of apodization is simply the study of the various pupil functions (possibly complex) which achieve some prespecified intensity distribution over the designated receiving plane. The usual attempts involve an expansion of the pupil function into a convenient set of functions such as Bessel, Hermite, lambda or Legendre functions. The procedure is to choose the constants in the expansion to obtain the desired results. Even though these methods are elegant, they nevertheless rest on an essentially ad hoc basis. Consult WOLF [1951b] for a résumé of work in this field up to 1951. The culmination of this approach is DOSSIER’S thesis [1954, 1956]. One person to attack the problem on a rigorous mathematical basis was LUNEBERG [1944] p. 386. Using the calculus of variations together
with the method of Lagrange multipliers, he demanded that the Fraunhofer pattern satisfy certain conditions together with physical constraints. Although he formulated four problems (Luneberg Apodization Problems) he published a solution only to the first problem. As Luneberg’s notes are not generally available we will discuss the problems in some detail. The first problem is to determine the amplitude distribution (pupil function) giving the maximum value to the Strehl definition of the Fraunhofer pattern subject to the condition that the total energy passing through the aperture be constant. Luneberg shows that the amplitude distribution which yields the maximum Strehl definition is the uniform amplitude distribution (T = constant). This result is proved only for the circular aperture but modification of the argument to apply to other aperture shapes is not difficult. There is a close connection between this problem and the theory of “super-resolving” pupils, TORALDO DI FRANCIA [1958] p. 229. The second problem is to maximize the Strehl definition of the diffraction pattern with constant energy with the added condition that the first zero of the diffraction pattern move inward from the Airy radius to a prespecified radius B. The solution has been given by BARAKAT [1961b] for both circular and slit apertures, and in both cases amounts to solving an inhomogeneous Fredholm equation of the second kind for the pupil function. For the circular aperture the amplitude distribution is given by
where (3.14)
The resultant distributions weigh against the center of the aperture (Fig. 3.5). The principal conclusion is that this procedure is useless when we try to bring the first zero of the Airy disc in more than about 20% (Fig. 3.6). The loss in Strehl definition and the increase of intensity in the secondary maxima are sufficient to overcome the beneficial effects of increased resolution. Similar results hold for the slit aperture.
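The result of the first problem (uniform amplitude maximizes the Strehl definition at fixed transmitted energy) can be made plausible with the Cauchy-Schwarz inequality. This is a sketch rather than Luneberg's variational argument, and it assumes a real pupil function T over an aperture of area A, a central amplitude proportional to the integral of T over the aperture, and a transmitted energy E equal to the integral of T squared:

```latex
I(0) \;\propto\; \left( \int_{\mathcal{A}} T \, dS \right)^{2}
     = \left( \int_{\mathcal{A}} T \cdot 1 \, dS \right)^{2}
     \;\le\; \int_{\mathcal{A}} T^{2}\, dS \int_{\mathcal{A}} 1^{2}\, dS
     = E\,A .
```

Equality holds only when T is proportional to 1, that is, T = constant, which is the uniform distribution. Any non-uniform real pupil with the same energy therefore gives a lower central intensity under these assumptions.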
Fig. 3.5. Amplitude distribution over circular aperture for the second Luneberg apodization problem (Barakat)
Another scheme would be to concentrate as much energy as possible into the smallest area in the receiving plane consistent with the physical constraints. That is, we choose the amplitude distribution such that the total illumination (encircled energy) in a circle of specified radius is made a maximum. The amplitude distribution is given as the solution of a homogeneous Fredholm equation of the second kind. BARAKAT [1961b] has also solved this problem for both circular and slit apertures. For the circular aperture the pupil function is given by
$$T(r) = a_0 + a_2 r^2 + a_4 r^4 + a_6 r^6, \tag{3.15}$$
where the a’s are functions of β (radius of the circle of maximum intensity). The amplitude distribution weighs against the edge of the aperture (Fig. 3.7). As β goes to zero the pupil function approaches a constant with the result that the classical Airy objective (T = constant) maximizes the total illumination in an infinitely small circle. The main effect of this apodization procedure is to slightly lower the central intensity (Strehl definition) while moving the first zero of the pattern slightly outward (Fig. 3.8). As a consequence, the resolution is lowered. The results for the slit aperture are qualitatively similar.
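The qualitative trend just described (an edge-tapered pupil slightly lowers the Strehl definition and pushes the first zero outward) can be illustrated numerically. The sketch below does not use Barakat's solution: the taper T(r) = 1 - 0.5 r^2 is a hypothetical stand-in, the Strehl definition is taken as the central intensity divided by the transmitted energy (normalized so that the uniform pupil gives unity), and all function names are assumptions made for the example.

```python
import math

def bessel_j0(x, panels=120):
    """J_0(x) from Bessel's integral (1/pi) * int_0^pi cos(x sin t) dt (trapezoidal rule)."""
    h = math.pi / panels
    total = 1.0  # the two end points each contribute 0.5 * cos(0)
    for k in range(1, panels):
        total += math.cos(x * math.sin(k * h))
    return total * h / math.pi

def simpson(f, a, b, intervals=200):
    """Composite Simpson's rule with an even number of intervals."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def amplitude(pupil, q):
    """U(q) = 2 * int_0^1 T(r) J_0(q r) r dr (unit-radius aperture, assumed normalization)."""
    return 2.0 * simpson(lambda r: pupil(r) * bessel_j0(q * r) * r, 0.0, 1.0)

def strehl(pupil):
    """Central intensity divided by transmitted energy; equals 1 for a uniform pupil."""
    energy = 2.0 * simpson(lambda r: pupil(r) ** 2 * r, 0.0, 1.0)
    return amplitude(pupil, 0.0) ** 2 / energy

def first_zero(pupil, lo=2.0, hi=6.0, steps=40):
    """Bisection for the first zero of U(q); assumes exactly one sign change in [lo, hi]."""
    f_lo = amplitude(pupil, lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = amplitude(pupil, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def uniform(r):
    return 1.0

def tapered(r):
    # Hypothetical edge taper, chosen only for illustration.
    return 1.0 - 0.5 * r * r

if __name__ == "__main__":
    for name, pupil in (("uniform", uniform), ("tapered", tapered)):
        print(name, "Strehl:", strehl(pupil), "first zero:", first_zero(pupil))
```

The bracket [2, 6] for the bisection is a choice that happens to isolate the first zero for both sample pupils; a different pupil function may need a different bracket.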
Fig. 3.6. Intensity distribution in paraxial receiving plane for the second Luneberg apodization problem (Barakat)
The fourth Luneberg problem involves resolution of two points in both coherent and incoherent light. Barakat has carried out the full analysis, and computations are in progress. Thus far the circular aperture pupil functions have been rotationally symmetric (i.e., T = T(r)). The diffraction image of a circular aperture having a sinusoidal angular variation was studied by SAITO [1959]. The
amplitude distribution is given by
$$T(\theta) = \sin n\theta. \tag{3.16}$$
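The angular behaviour of the image for the pupil (3.16) can be anticipated from the Jacobi-Anger expansion. The following is a sketch under stated assumptions: polar coordinates (q, φ) in the receiving plane, a unit-radius aperture, the Fraunhofer kernel written as exp[iqr cos(θ - φ)], integer n, and constant factors dropped. The symbol φ is introduced here for the illustration.

```latex
\begin{aligned}
U(q,\varphi) &\propto \int_0^1\!\!\int_0^{2\pi} \sin n\theta \; e^{\,iqr\cos(\theta-\varphi)}\, d\theta\, r\, dr, \\
e^{\,iz\cos(\theta-\varphi)} &= \sum_{m=-\infty}^{\infty} i^{m} J_m(z)\, e^{\,im(\theta-\varphi)}, \\
\int_0^{2\pi} \sin n\theta \; e^{\,iz\cos(\theta-\varphi)}\, d\theta &= 2\pi\, i^{n} J_n(z)\, \sin n\varphi, \\
U(q,\varphi) &\propto i^{n} \sin n\varphi \int_0^1 J_n(qr)\, r\, dr, \qquad
I(q,\varphi) \propto \sin^2 n\varphi \left[\int_0^1 J_n(qr)\, r\, dr\right]^2 .
\end{aligned}
```

Under these assumptions the amplitude carries the same angular factor as the pupil, and the intensity vanishes along the 2n directions φ = kπ/n (k = 0, ..., 2n - 1).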
By carrying out the analysis it can be shown that the amplitude of the diffraction image also varies sinusoidally with the same period as that of the pupil function. The intensity distribution possesses 2n dark lines and 2n bright leaves radiating from the center of the pattern. Saito applies his analysis to a pupil function having a square wave angular variation and compares the result with experiments. In Fig. 3.9 the diffraction patterns corresponding to an integral number of square waves over the exit pupil are shown. Pupil functions depending on both r and θ occur in the KUBOTA and INOUE [1959] theory of the diffraction image in a polarizing microscope. In the usual polarizing microscope complete extinction does not occur when the polarizers are crossed. The plane of polarization of the transmitted light is rotated during passage through the system as the light vectors perpendicular and parallel to the plane of incidence have different transmittances at the refracting surfaces. Kubota and Inoue show that the resulting diffraction image due to a point source is very different from the Airy disc and is given by (3.17)
Fig. 3.8. Intensity distribution in paraxial receiving plane for the third Luneberg apodization problem (Barakat)
where θ is measured from the plane of polarization of the polarizer. Fig. 3.10, reproduced from their paper, shows the contours of equal intensity in the diffraction image. The diffraction image has the form of a four-leaf clover. The resolving power is considerably lower, as the radius of the first dark ring can be shown to be about 1.7 times larger than that of the Airy disc. The paper contains a large number of
Fig. 3.9. Diffraction patterns when the pupil function T(θ) is a square wave. The number of square waves in the aperture is half the number of bright leaves in the pattern (Saito)
Fig. 3.10. Contour lines of intensity (isophotes) for the diffraction image due to crossed polarizers (Kubota and Inoue)
theoretical and experimental intensity distributions corresponding to a variety of situations.

3.3. POINT SOURCE - HIGH NUMERICAL APERTURE
In all the previous work we made the tacit assumption that the aperture semi-angle α was small enough so that its square could be neglected. We have thus been dealing with essentially a paraxial theory (α ≈ 0). Although this formulation suffices for many problems it is of interest to examine the effect of the second order terms. The first investigation into the effect of a large aperture semi-angle (within the context of scalar theory) is contained in Chapter 6 of STREHL’S volume [1894]. It is shown there that the normalized intensity distribution in the case of Fraunhofer diffraction is proportional to
where α is the aperture semi-angle. The classic Airy intensity pattern is recovered as α approaches zero. Note that the amplitude distribution over the exit pupil is no longer uniform as a consequence of including second order terms in α. A somewhat similar problem was studied by H. H. HOPKINS [1943], who also considered the effects of polarization of the incident wave. The approach of Hopkins is not scalar, but since he does not take into account the full Maxwell equations his results do not constitute a rigorous electromagnetic treatment. By a rather complicated analysis, he was able to derive the following expression for the intensity distribution in a meridian perpendicular to the direction of the incident light vector:
where the constants $A_1, A_2, \ldots$ are functions of the aperture semi-angle α. A similar formula is obtained for the intensity in a meridian parallel with the direction of the incident light vector. Hopkins has shown that the contours of the intensity distribution are no longer circular but are of an elliptical form. Again, as the angle α goes to zero, the elliptical distribution degenerates to a circular one and the Airy disc is approached. From an examination of curves accompanying this paper, it is evident that the first zero of the Airy disc is moved inward while simultaneously the energy in the secondary maxima is increased. The paper also contains a discussion of the intensity distribution when the sine condition is to be satisfied. Other investigations, whether scalar or vectorial, have imposed the Abbe sine condition; DRUDE [1933] p. 59 or STEWARD [1928] p. 49. This condition is not as restrictive as it would seem; after all, one of the main reasons for these diffraction studies is to investigate the imaging qualities of optical systems. If the sine condition is not required, then the amplitude distribution over the converging wavefront could be of considerable generality. In accordance with the sine law, the amplitude distribution cannot remain constant but becomes a function of the aperture semi-angle α (ABBE [1910] p. 30), or equivalently a function of the numerical aperture and refractive indices in the object and image space (OSTERBERG and WILKINS [1949]). If the absolute value |M| of the magnification ratio is greater than unity (e.g., microscope objective) the amplitude distribution increases towards the outer portions of the wavefront so we weigh against the central
region of the aperture. The reverse situation holds for |M| < 1 (e.g., a telescope). A complete reformulation of optical diffraction theory was made by LUNEBERG [1944] (see § 4). An important scalar specialization of Luneberg's work was effected by OSTERBERG and WILKINS [1949], also OSTERBERG [1951], by demanding that the Abbe sine law hold. The diffraction integral for an aberration-free system in the paraxial plane is given by (3.20)
here 10 = nMr/$n_0$, where M is the magnification, n and $n_0$ are the refractive indices in image and object space, and T(r) is the pupil function. The Airy diffraction integral is recovered by setting T(r) equal to the denominator of the integrand in (3.20). Note that even when 10 = 0 (so that the aperture semi-angle α = 0) we do not really recover the classical Airy integral but instead we have (3.21)
Numerically the difference is small, to be sure, but functionally there is a considerable difference. Osterberg and collaborators have examined
Fig. 3.11. Comparison of intensity distributions for Airy type objective (dashed curve) and Luneberg-Osterberg objective (solid curve, N = 0.95) for slit aperture where N = (NA)/wo (Barakat and Lev)
the implications of (3.20) in a number of papers (a convenient summary of the work is given in OSTERBERG and MCDONALD [1954]), the chief results being for the case |M| > 1: a) Strehl definition is greater than the classical value of unity. b) The first zero of the diffraction pattern is moved inward giving a slightly better resolution. c) The secondary maxima are higher than the analogous values for the Airy objective. As a typical example we show (Fig. 3.11) the intensity curves for a slit aperture obeying the Luneberg-Osterberg theory (BARAKAT and LEV, unpublished work). In spite of the success of the theory it cannot cope with polarization phenomena, being only scalar in formulation. Although it suffers from this defect, it is a powerful tool and certainly deserves a broader following than it currently enjoys.

3.4. IMAGING OF EXTENDED OBJECTS
Thus far we have considered only the diffraction patterns due to point sources. From a theoretical point of view, the study of the intensity distributions due to a point source of light is the simplest physical and mathematical situation. However, the experimental problem involved in obtaining a bright point source is not a trivial one. Even if this were not an experimental limitation, it is still valuable to study the intensity patterns of extended luminous sources (e.g. bright disc) as the situations depicted occur in astronomy and in microscopy. Although we are not primarily interested in the diffraction theory of aberrations, it should be pointed out that the use of the point source is of limited use in the study of aberrated systems. As WEINSTEIN [1954] has remarked: “It therefore appears reasonable to study the images produced, not of point sources, but of objects such that an increase in aberration in the optical system causes general deterioration of image sharpness and contrast rather than complicated fringe structure. If the object acts as an incoherent source of light, every point of the object will be an independent source. In this case the distribution of light in the image plane is most easily obtained by summation of the intensities. All the classic work (work done prior to 1945) is for incoherent illumination. If the object is partially or fully coherent, the flux density is best obtained in a more indirect manner using the theory of Fourier transforms. ”
For the present we restrict ourselves to Fraunhofer diffraction so that the amplitude distribution due to a point source is (for a circular aperture) $2J_1(q)/q$. The illumination or flux density for an incoherent object of finite extent is proportional to
where the integration is over the area of the object. The function B(r, θ) represents the intensity variation of the object. In general, the integral is too complicated to be evaluated analytically and must be treated by numerical methods. Fortunately for the objects of practical interest (disc, half-plane, etc.) the integration can be performed analytically. In particular, the case of a uniform disc (B(r, θ) = constant) occurs in astronomical work, in microscopy, and in diffraction by very small pinholes. At the center of the aperture q = 0 and the integral can easily be evaluated to yield

$$1 - J_0^2(a) - J_1^2(a), \tag{3.23}$$
Fig. 3.12. Flux density at center of circular aperture due to incoherent disc. The circles represent experimental values (Slater and Weinstein)
where a is the reduced radius of the disc. SLATER and WEINSTEIN [1958] have verified this equation experimentally using a 25-micron diameter pinhole (Fig. 3.12). Three solutions for the disc are available in the literature: NAGAOKA [1898, 1920], WEINSTEIN [1955], OSTERBERG and SMITH [1960], SMITH [1960]. Nagaoka’s interest in the problem stems from his study of the drop formation of a planet during transit; the analysis is extremely complicated (he uses approximations involving elliptic functions) but in the end he is able to obtain isophotes in a variety of interesting situations, for example, a luminous point and luminous disc (Fig. 3.13). TORALDO DI FRANCIA [1958] p. 263 has reproduced some of Nagaoka’s isophotes. Weinstein evaluates (3.22) by series expansions and studies the intensity distribution as a function of the disc radius. Finally there is an elaborate treatment of the complementary problem of a dark disc in a light background by OSTERBERG and SMITH [1960], p. 362, who are motivated by microscopy. Specifically their study is for a microscope adjusted for Köhler illumination. The heavy analysis precludes any short description of the work and the interested reader is referred to the original publications. When the object is an incoherent line source we can follow STRUVE [1882] (GRAY, MATHEWS and MACROBERT [1931] have an elaborate summary) and integrate the intensity due to a point source. An easier procedure due to RAYLEIGH [1888] is to postpone the integration over the circular aperture until the integration with respect to the direction of the line source has been carried out. In either case the intensity is expressed in terms of the Struve function of order one, (3.24) where q is the distance of the point in the receiving plane from the axis of the system. The intensity distribution is different from that of a point source; it can be shown that (3.24) is always greater than zero, although it possesses maxima and minima (Fig. 3.14). 
Using the asymptotic expansion of the Struve function, one can easily show “the intensity of the image of a luminous line is ultimately inversely proportional to the square of the distance from the central axis, or geometrical image.” RAYLEIGH [1888] has examined the case of two parallel incoherent line sources in connection with the resolution of a telescope objective. The semi-infinite incoherent plane source is obtained by integrating
Fig. 3.13. Isophotes for a luminous disc and luminous point (Nagaoka)
Fig. 3.14. Intensity distribution for a circular aperture due to a line source
the line source intensity from q to ∞. At large distances from the geometrical image of the plane, the intensity is inversely proportional to the distance q and to the radius of the aperture, RAYLEIGH [1888]. Using variations on this theme, we can construct various useful objects such as gratings consisting of black and white bars, etc. Additional sources of information are: BUXTON [1926], BYRAM [1944], HARIHARAN [1955], LAMAR [1949], RAYLEIGH [1896, 1903]. TORALDO DI FRANCIA [1958] p. 252 summarizes some of the earlier work along with graphs and tables. MOURASHINSKY [1923] has studied the general case of two incoherent plane sources of brightness a and b separated by a strip of width D and brightness c. The effect of varying the width D and of varying the brightness is examined. The effect of a central obstruction on the circular aperture (annular aperture) has been the subject of a doctoral thesis by STEEL [1953]. In this thesis the image of an incoherent line source is studied as well as an infinite resolution chart (sinusoidal intensity variation). A brief study of coherently illuminated objects is also made. The major portion of the work is devoted to aberrations and the theory of aberration balancing for annular apertures. The combination of an incoherent line source and a slit aperture is of spectroscopic importance; the resultant intensity is simply the $(\sin x/x)^2$ distribution (RAYLEIGH [1888]). When we admit Fresnel diffraction the problem becomes more useful (and more intricate). WEINSTEIN [1954] studied the defocused image of an incoherently illuminated edge, extending Struve’s and Rayleigh’s work. The defocused image of a sinusoidal grating is studied by STEEL [1956]. Attention should be given to the defocused image of an incoherently illuminated disc. So much for special apertures and sources. The general problem of the imaging of extended objects is properly part of coherence theory. 
At one extreme we have completely incoherent objects and at the other completely coherent objects. What we really need is a theory which takes into account partially coherent objects. Such a theory has been outlined by DUMONTET [1955], HOPKINS [1953] and PARRENT [1961]; however, considerable work remains. MARECHAL [1954] outlines the computations for various extended objects in coherent light. FRANCON [1956] p. 331 gives an excellent résumé of the entire problem of the imaging of extended sources (see also BORN and WOLF [1959] p. 479).
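The incoherent summation of intensities used throughout this subsection is easy to demonstrate in the one-dimensional case, where a slit aperture images a line source as (sin x/x)^2. The sketch below builds the image of a semi-infinite bright plane (an edge) by integrating that line image. It is an illustration with a slit swapped in for the circular aperture, so no Struve functions are needed; the function names and the normalization to unit brightness far inside the bright region are assumptions.

```python
import math

def line_spread(x):
    """Image intensity of an incoherent line source through a slit: (sin x / x)^2."""
    if abs(x) < 1e-8:
        return 1.0
    return (math.sin(x) / x) ** 2

def edge_image(x, step=0.01):
    """Normalized image intensity at x of a bright half-plane occupying x' > 0.

    I(x) = (1/pi) * int_{-inf}^{x} line_spread(t) dt.  Using the symmetry of the
    line-spread function and int_{-inf}^{inf} (sin t / t)^2 dt = pi, this equals
    0.5 + (1/pi) * int_0^x line_spread(t) dt, evaluated by Simpson's rule."""
    if x == 0.0:
        return 0.5
    intervals = 2 * max(1, int(math.ceil(abs(x) / (2.0 * step))))  # even count
    h = x / intervals  # a negative h handles x < 0
    total = line_spread(0.0) + line_spread(x)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * line_spread(i * h)
    return 0.5 + total * h / (3.0 * math.pi)

if __name__ == "__main__":
    for x in (-10.0, -3.0, 0.0, 3.0, 10.0):
        print(x, edge_image(x))
```

The intensity at the geometrical edge is one half of the bright-field value, and the approach to the limiting values on either side is slow because the tails of the line image fall off only as the inverse square of the distance.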
3.5. TOTAL ILLUMINATION
All the work discussed thus far is devoted to the calculation of the intensity distribution in specified receiving planes. As RAYLEIGH [1888] has pointed out, it is also of interest to know the total illumination in the various rings of the diffraction pattern; that is, we wish to know the fraction L of the total energy that falls within a circle of radius q about the axial point in a given receiving plane. Obviously L vanishes when q is zero and approaches unity as q becomes infinite. WOLF [1951a] (also BORN and WOLF [1959] p. 434 and LINFOOT [1955] p. 39) has carried out a detailed study of L for a circular aperture. The fraction of the total energy (or encircled energy) that falls within a circular domain of the diffraction pattern of radius q and centered on the axis is given by (3.25) where I is the intensity of the aberration-free image (see (3.7)). The constant N is a normalizing constant chosen such that L is unity when q is infinite. Wolf evaluates the total illumination analytically by direct integration of the Lommel functions. A particularly interesting feature of Wolf’s work is the construction of a graph showing the contour lines of L as a function of longitudinal and lateral displacement (Fig. 3.15). There is also a comparison with geometrical optics predictions.
Fig. 3.15. Contour lines of total illumination for a perfect system (Wolf)
Asymptotic expressions for L are derived in a paper by FOCKE [1956] and a comparison is made with Wolf’s exact solution in three different receiving planes. The agreement is excellent.
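In the focal plane (p = 0) the definition of L can be exercised numerically. The sketch below assumes that the encircled energy has the form L(q) = N ∫ I(q') q' dq' taken from 0 to q, with I(q) = [2J_1(q)/q]^2 and N = 1/2 (the value that makes L tend to unity), and compares the quadrature with the closed form 1 - J_0^2(q) - J_1^2(q), which has the same form as (3.23). The integral expression for L is not reproduced in the text as extracted here, so it should be read as an assumption, and the function names are illustrative.

```python
import math

def bessel_j(n, x, panels=120):
    """Integer-order Bessel function J_n(x) from Bessel's integral
    (1/pi) * int_0^pi cos(n*t - x*sin t) dt, using the trapezoidal rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # half-weights at t = 0 and t = pi
    for k in range(1, panels):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def airy_intensity(q):
    """Aberration-free focal-plane intensity [2 J_1(q)/q]^2, with the limit 1 at q = 0."""
    if abs(q) < 1e-8:
        return 1.0
    return (2.0 * bessel_j(1, q) / q) ** 2

def encircled_energy(q_max, intervals=400):
    """L(q_max) = (1/2) * int_0^q_max I(q) q dq by composite Simpson's rule."""
    if q_max == 0.0:
        return 0.0
    h = q_max / intervals
    total = 0.0
    for i in range(intervals + 1):
        q = i * h
        weight = 1 if i in (0, intervals) else (4 if i % 2 else 2)
        total += weight * airy_intensity(q) * q
    return 0.5 * total * h / 3.0

def closed_form(q):
    """Closed form 1 - J_0(q)^2 - J_1(q)^2 used for comparison."""
    return 1.0 - bessel_j(0, q) ** 2 - bessel_j(1, q) ** 2

if __name__ == "__main__":
    for q in (1.0, 3.8317, 7.0156, 10.0):
        print(q, encircled_energy(q), closed_form(q))
```

Evaluated at the first dark ring (q ≈ 3.8317) both expressions give a fraction close to 0.84, the familiar share of the energy contained in the Airy disc.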
When p = 0, (3.25) reduces to
(3.26) given by RAYLEIGH [1888] for Fraunhofer diffraction. BARAKAT (to be published) has examined the total illumination for a rectangular aperture following essentially Wolf’s direct approach. Here the energy in a rectangular area is computed. There exists an alternate interpretation of L due to LANSRAUX and BOIVIN [1958]. Consider an image of a circular disc of radius q, illuminated by incoherent light of uniform intensity and contrasting with a dark background; the total illumination (encircled energy) is the contrast of the center of the image with respect to the background. STOKES [1853] and RAYLEIGH [1888] proved a number of general theorems about the total illumination, of which one is of special importance. This concerns the fact that the total illumination is obtained by integrating the intensity over a receiving plane, whereas in a strict sense the integration should be over a hemisphere of large radius. In the context of the circular aperture the intensity when integrated over a hemisphere yields the total illumination multiplied by a factor proportional to (3.27) where R is the radius of the aperture. If the linear dimensions of the aperture are much larger than the incident wavelength then kR ≫ 1 and the second term is essentially negligible. We are so far away from the aperture in terms of wavelength that the hemispherical surface of integration is replaced by a plane. This point should be kept in mind as failure to do so can lead to paradoxical interpretations. BOUWKAMP [1953] p. 14 also emphasizes this point in a different context. Wolf’s work has been supplemented by BARAKAT [1961a] who studied the effect of third and fifth order spherical aberration on L using Zernike polynomials. This case was not susceptible to analytical treatment and was therefore investigated numerically using Gauss quadrature theory. Contour maps of L similar to Wolf’s were constructed for various amounts of aberration. 
As an example, we show the curves for a half wave of third order spherical aberration (Fig. 3.16). The isophotes are no longer symmetric about the central plane (p = 0). In this regard we note that LANSRAUX [1953, 1955] has derived
Fig. 3.16. Contour lines of total illumination for system with half-wave of spherical aberration given by R40(v) (Barakat)
analytical expressions for L (valid for small aberrations); however, no extensive calculations were performed. In an important paper LANSRAUX and BOIVIN [1958] have developed a numerical method of evaluating L so as to include the effects of variable pupil function. Further references to French work on the subject can be found in a monograph by LANSRAUX [1953] p. 72.

3.6. EXPERIMENTAL RESULTS
No survey of diffraction theory would be complete without some mention of experimental work. Most experimental studies of diffraction phenomena have been confined to the use of photographic techniques, namely the direct reproduction of the diffraction pattern on film and subsequent measurement of the densities. We have already mentioned the qualitative study by SCHEINER and HIRAYAMA [1894]. KATHAVATE [1945] has also conducted qualitative studies. As typical examples of the quantitative use of this technique, see TURNER [1924], who measured the diffraction pattern of a narrow slit, or HUFFORD and DAVIS [1929] for the circular aperture. LYMAN [1930] made a series of measurements
of the intensity distribution in a diffraction pattern of an edge. He measured the ratio of maxima to minima and obtained an agreement of about 4 per cent between theory and experiment. The culmination of this method is probably by NIENHUIS [1948] in his work on the diffraction theory of aberrations. The intensity range involved is large and this is the limiting factor for precise photographic work. HAUSE, WOODWARD and MCCLELLAN [1939] used a photocell and measured directly the intensity distributions for slit apertures. The agreement with theory is excellent (about 2-3 per cent) and only when the intensity is very low is there any marked deviation from the theoretical curve. This is probably due to anomalies in the photocell response. TAYLOR and THOMPSON [1958] measured the diffraction patterns of circular and annular apertures using the optical diffractometer. With the advent of microwaves, it was only natural that they would be used in diffraction experiments. Here the problem is somewhat different in that polarization effects are not completely negligible and hence the scalar theory itself is probably not valid at these longer wavelengths. Experiments have been performed by ANDREWS [1947, 1950], BOIVIN et al. [1956], MATHEWS and CULLEN [1956], to mention a few of the papers. BACHYNSKI and BEKEFI [1957] have utilized the microwave phase plotter and studied both aberrated and non-aberrated systems using plastic lenses. The measured isophotes are shown in Fig. 3.17 (compare with Fig. 3.4). The experimental setup is such that the intensity is plotted directly versus the distance from the lens. The physical distance is proportional to p only when x ≪ R, where R is the distance
Fig. 3.17. Contour lines of intensity for circular aperture obtained experimentally. Compare with Fig. 3.4 (Bachynski and Bekefi)
from the center of the exit pupil to the field point. Replotting the curves would involve considerable labor. A number of other isophotes are shown for lenses with spherical aberration, coma, astigmatism, and mixtures of aberrations. Useful information on microwave measurements is given in KING and WU [1959] p. 174.
§ 4. Vector Diffraction Theories
When we ascribe vectorial properties to the incoming wave then difficulties increase since the description of the diffracted field must take into account polarization phenomena. For the purposes of this review paper we restrict ourselves to the high frequency region of the spectrum. Of course, the correct approach would be to solve a rigorous steady-state diffraction problem; however, we are interested in approximate methods which will utilize the high frequency constraint. An obvious answer is to modify the scalar Kirchhoff theory to handle vectorial problems. Surprisingly, even the analytical formulation of an electromagnetic Huygens’ principle is fraught with difficulties (BAKER and COPSON [1950] p. 102, KOTTLER [1923]). Although we can carry out the proposed modification of the Kirchhoff theory (STRATTON [1941] p. 470), there are difficulties of a non-trivial nature. Modifications of the Kirchhoff theory have been advanced whose primary purpose is to secure a self-consistent theory as regards the boundary conditions. A major constraint on these theories is that they are restricted to planar diffraction problems (e.g. infinitely thin disc). We set the aperture at z = 0 and assume incoming waves in the region z < 0; our problem is to determine what happens at z > 0. The most successful of these theories is due to RAYLEIGH [1913]. In Rayleigh’s modified scalar theory the Green’s functions for each region are (BOUWKAMP [1953] p. 7): (4.1a)
$$U = -\frac{1}{2\pi}\int U\,\frac{\partial}{\partial z}\left(\frac{e^{ikr}}{r}\right) dS \qquad (z > 0). \tag{4.1b}$$
Integration is over the aperture. This modified theory is self-consistent (BOUWKAMP [1953] p. 8). Now the values of U and its normal derivative
are to be replaced by their geometrical optics approximation just as in the original Kirchhoff theory. As Bouwkamp points out the Rayleigh modified solutions are exact solutions to saltus problems. BREMMER [ 1951a, 1951b] has utilized the second Rayleigh Green’s function (4.1b) to develop his (scalar) diffraction theory of Gaussian optics. I n a latter paper [ 19521 he expands the scalar wave function via iterative procedures and applies the results to phase contrast microscopy. LUNEBERG [1944] p. 344 also uses the second Rayleigh formula (4.1b ) but his work is vectorial. Luneberg sets himself the problem of determining a solution of Maxwell’s equation which has prescribed behavior at infinity and is valid for the half space z > 0. No actual boundary conditions are imposed. We seek a solution of Maxwell’s equations which has the same boundary values at infinity as the solution of geometrical optics. Calling U1 the Luneberg solution and UO the geometrica1 optics solution we demand that lim R(U1 - UO)= 0, R-.w
where R is the distance from the aperture. The geometrical optics solution is given by eikr
Uo = - - YA ,
(4.3)
where A is a given vector function of direction. Upon carrying out the analysis, Luneberg derives the following representation for the amplitude distribution in the image space due to a dipole source: (4.4) where K is the Gaussian curvature of the wavefront, W is Hamilton's mixed characteristic function, p is the vector normal to the wavefront, and n is the refractive index. Eq. (4.4) is the fundamental integral of the Luneberg theory and can be interpreted as the vector generalization of Debye's scalar integral (1909), which represents the disturbance as the superposition of plane waves of different directions of propagation. PICHT [1931] p. 75 outlines and extends Debye's work; SOMMERFELD [1954] p. 318 also has a number of comments on Debye's integral representation. However, the widely held view that the Debye-Picht representation is more rigorous than Kirchhoff's does not appear to be correct, at least in connection with analysis of the field in the focal region (see WOLF [1955], KOTTLER [1957]). The formula (4.4) may be re-written in the form (4.5)
here p, q, m are the components of the vector p and A(p, q) is a (generally complex) “amplitude vector”. A scalar specialization of (4.5) was utilized by Osterberg for his high numerical aperture work (subsection 3.3). LUNEBERG [1944] p. 375 shows that the electric and magnetic vectors describing the diffracted field can be expressed in terms of a single scalar function (4.6)
where ψ(p, q) is a measure of the amplitude at infinity. By demanding that the sine condition hold, it is possible to obtain Osterberg's diffraction integral (3.20). WOLF [1959] has given an alternate derivation of (4.5) using an integral representation of the image field. The absence of an obliquity factor in (4.5) is of great importance in that we are not restricted to systems of low numerical aperture, as we are with formalisms based on the direct use of Huygens' principle. An extensive investigation based on this approach of the diffraction pattern of a non-aberrated system obeying the sine condition (aplanatic system) with a circular aperture has been carried out by Richards and Wolf (RICHARDS [1956], RICHARDS and WOLF [1959]). On applying the sine law constraint, (4.5) becomes (for linearly polarized incident light) (4.7)
where
b = unit vector along the direction of polarization
Q = constant depending upon system and source
α = aperture semi-angle
ε = angle between the ray vector p and the vector r to the point of observation in the image field
θ = angle which the rays in the image space make with the axis.
Fig. 4.1. Contours of the time-averaged electric energy density in the focal plane of an aplanatic system of aperture semi-angle α on the image side. The electric vector of the incident field is assumed to be linearly polarized in the direction φ = 0. The first figure (α → 0) is identical with the Airy disc. The dotted circles in the other figures represent the dark rings of the Airy disc (Richards and Wolf).
The evaluation of (4.7) was accomplished by numerical integration using a high speed computer. The time-averaged electric energy density contours are not circular but of an elliptical nature, as predicted by Hopkins' partial solution (subsection 3.3). The electric energy density isophotes are approximately circular when α is very small. As α increases, the departure from the circular form becomes pronounced and the isophotes become essentially elliptical (Fig. 4.1). The magnetic energy density isophotes have exactly the same form as their electric counterparts, except that they are rotated through 90° about the optical axis. As a consequence of the elliptical nature of the electric energy density contours, Wolf and Richards conclude that “with linearly polarized light and using detectors of electric energy our solution predicts an increase in resolving power in wide aperture aplanatic optical systems for measurements in the azimuth at right angles to the electric vector of the incident wave”. When the light is unpolarized (and quasi-monochromatic), the electric and magnetic energy densities are, of course, equal. A rather surprising thing occurs: although the energy density isophotes are circles, their radii are somewhat larger than the radii calculated by the Airy formula. The time-averaged energy flow is given by the time-averaged Poynting vector ⟨S⟩.
The authors show that the magnitude of ⟨S⟩ is symmetric with respect to the focal plane. A detailed study is made of the limiting form of the solution for small α, and comparison is made with the scalar solutions derived in § 3. They show that ⟨S⟩ is proportional to the scalar function |U|² for very small α. The same problem was treated by IGNATOWSKY [1919], who derived the same integrals as Wolf and Richards. He did not attempt such an ambitious numerical program as they did, but gave Bessel function expansions of the integrals and discussed the behavior of the Poynting vector in the image field. Ignatowsky†, of course, did not have available the Luneberg-Wolf theory but obtained his basic integrals in a somewhat similar manner. His study is confined to aplanatic systems. Other less comprehensive work has been carried out by BURTIN [1956] and by FOCKE [1957].
† I am indebted to Dr. E. Wolf for an English translation of this paper.
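As a point of reference for the small-aperture limit discussed above, in which the vector solution reduces to the scalar Airy pattern, the following is a minimal Python sketch of the Airy intensity. It assumes the standard normalized form [2 J1(v)/v]², with v a dimensionless radial coordinate in the focal plane; the function names and the quadrature-based Bessel routine are illustrative choices and are not taken from Richards and Wolf.

```python
import math

def bessel_j1(x, n=2000):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x sin tau) d tau.

    Evaluated with the midpoint rule; the integrand extends to a smooth
    periodic function, so the rule converges very rapidly.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi

def airy_intensity(v):
    """Normalized scalar Airy intensity [2 J1(v) / v]^2, equal to 1 at v = 0."""
    if v == 0.0:
        return 1.0
    return (2.0 * bessel_j1(v) / v) ** 2

# The first dark ring of the Airy disc lies at the first zero of J1.
FIRST_DARK_RING = 3.8317059702

for v in (0.0, 1.0, 2.0, FIRST_DARK_RING):
    print(f"v = {v:.4f}  I = {airy_intensity(v):.6e}")
```

The dotted circles of Fig. 4.1 correspond to the zeros of this scalar intensity; the vector treatment described in the text departs from it as the aperture semi-angle grows.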
Acknowledgments
I am indebted to Drs. E. L. O'Neill and W. Brouwer for their continued encouragement. Thanks are also due to Drs. E. Wolf, M. Herzberger, N. Chako and F. Kottler for stimulating conversations on diffraction theory. I gratefully acknowledge the support of the Itek Corporation during the preparation of the article.
References
ABBE, E., 1910, Die Lehre von der Bildentstehung im Mikroskop, edited by O. Lummer and F. Reiche (Braunschweig, Friedrich Vieweg und Sohn).
AIRY, G., 1877, Undulatory Theory of Optics (Cambridge University Press).
ANDREWS, C., 1947, 1950, Phys. Rev. 71 (1947) 777; J. Appl. Phys. 21 (1950) 767.
BACHYNSKI, M. and G. BEKEFI, 1957, J. Opt. Soc. Amer. 47, 428.
BAKER, B. B. and E. T. COPSON, 1950, The Mathematical Theory of Huygens' Principle, 2nd Edition (Oxford University Press).
BARAKAT, R., 1961a, b, J. Opt. Soc. Amer. 50 (1961a), in press; Ibid. 50 (1961b), in press.
BASSET, A., 1892, Treatise on Physical Optics (Cambridge, Bell).
BEREK, M., 1926, Z. Phys. 40, 421.
BOIVIN, A., 1952, J. Opt. Soc. Amer. 42, 60.
BOIVIN, A., A. DION and H. KOENIG, 1956, Can. J. Phys. 34, 166.
BORN, M., 1933, Optik (Berlin, Springer) p. 33.
BORN, M. and E. WOLF, 1959, Principles of Optics (New York, Pergamon Press).
BOUWKAMP, C. J., 1953, 1954, Diffraction Theory, a Critique of Some Recent Developments, New York University Research Report EM-50 (1953). A shortened version is in Rep. Prog. Phys. 17 (1954) 35.
BREMMER, H., 1951, 1952, Symposium on Theory of Electromagnetic Waves (New York, Interscience, 1951) p. 125; Physica 27 (1951) 63; Ibid. 28 (1952) 469.
BRIDGE, J., 1858, Phil. Mag. Series 4, 16, 321.
BRUNS, H., 1883, Astron. Nachr. 104, 1.
BURTIN, R., 1956, Optica Acta 3, 104.
BUXTON, A., 1921, 1923, 1926, Mon. Not. R. Astr. Soc. 81 (1921) 547; Ibid. 83 (1923) 475; Proc. Opt. Conv. 2 (1926) 759.
BYRAM, G., 1944, J. Opt. Soc. Amer. 34, 571.
CONRADY, A., 1919, Mon. Not. R. Astr. Soc. 79, 575.
DIMITROFF, G. and J. BAKER, 1945, Telescopes and Accessories (Philadelphia, Blakiston).
DOSSIER, B., 1954, 1956, Rev. d'Optique 33 (1954) 57, 147, 267; Astronomical Optics (Amsterdam, North-Holland, 1956) p. 163.
DRUDE, P., 1933, Theory of Optics, translated by C. Mann and R. Millikan (London, Longmans).
DUFFIEUX, P., 1946, L'Intégrale de Fourier et ses Applications à l'Optique (Besançon, Faculté des Sciences).
DUMONTET, P., 1955, Publ. Sci. Univ. d'Alger B 1, 33.
EPSTEIN, L., 1949, J. Opt. Soc. Amer. 39, 226.
EVERITT, P., 1919, Proc. R. Soc. A 83, 302.
FOCKE, J., 1956, 1957, Optica Acta 3 (1956) 161; Ibid. 4 (1957) 124.
FRANÇON, M., 1956, Handbuch der Physik, Band 24 (Berlin, Springer) p. 171.
FRANZ, W., 1949, 1957, Z. Phys. 125 (1949) 563; Theorie der Beugung elektromagnetischer Wellen (Berlin, Springer, 1957).
GRAY, A., G. MATHEWS and T. MACROBERT, 1931, A Treatise on Bessel Functions (London, Macmillan).
HARIHARAN, 1955, J. Opt. Soc. Amer. 45, 44.
HAUSE, WOODWARD and MCCLELLAN, 1939, J. Opt. Soc. Amer. 29, 147.
HOPKINS, H. H., 1943, 1949, 1953, Proc. Phys. Soc. London 55 (1943) 116; Ibid. B 62 (1949) 22; Proc. Roy. Soc. A 217 (1953) 408.
HUFFORD, M. and H. DAVIS, 1929, Phys. Rev. 33, 589.
IGNATOWSKY, B., 1919, Trans. Opt. Inst. Petrograd 1, Nos. 3 and 4.
JENTSCH, F., 1929, Handbuch der Physik, Band 21 (Berlin, Springer) p. 885.
KATHAVATE, Y., 1945, Proc. Ind. Acad. Sci. 21, 177.
KING, R. and T. T. WU, 1959, The Scattering and Diffraction of Waves (Cambridge, Harvard University Press).
KIRCHHOFF, G., 1891, Vorlesungen über mathematische Optik (Leipzig, Teubner).
KOTTLER, F., 1923a, b, 1957, Ann. Phys. 70 (1923a) 405; Ibid. 71 (1923b) 457; J. Opt. Soc. Amer. 47 (1957) 569.
KUBOTA, H. and S. INOUE, 1959, J. Opt. Soc. Amer. 49, 191.
LAMAR, E., 1939, J. Opt. Soc. Amer. 39, 929.
LANSRAUX, G., 1947, 1953, 1955, Rev. d'Optique 26 (1947) 24; Diffraction Instrumentale (Paris, Éditions de la Revue d'Optique, 1953); Rev. d'Optique 34 (1955) 65.
LANSRAUX, G. and G. BOIVIN, 1958, Can. J. Phys. 36, 1696.
LINFOOT, E. H., 1955, Recent Advances in Optics (Cambridge University Press) cf. p. 66.
LINFOOT, E. H. and E. WOLF, 1953, Proc. Phys. Soc. B 66, 145.
LOMMEL, E., 1884, 1886, Abh. Bayer. Akad. 15 (1884) 229; Ibid. 15 (1886) 529.
LUNEBERG, R. K., 1944, Mathematical Theory of Optics (Providence, Brown University).
LYMAN, T., 1930, Proc. Nat. Acad. Sci. 16, 71.
MACH, E., 1913, The Principles of Physical Optics, translation by J. Anderson and A. Young (New York, Dover, 1953), German Edition, 1913.
MARECHAL, A., 1954, Optical Image Evaluation, National Bureau of Standards Circular 526 (Washington, Bureau of Standards) p. 9.
MARECHAL, A. and M. FRANÇON, 1960, Diffraction, Structure des Images (Paris, Éditions de la Revue d'Optique).
MARTIN, L. C., 1922, Mon. Not. R. Astr. Soc. 82, 310.
MATHEWS, P. and A. CULLEN, 1956, Proc. Inst. Elect. Engrs. 103, 449.
MEYER, C., 1934, The Diffraction of Light, X-Rays, and Material Particles (University of Chicago Press).
MITRA, S., 1920, Proc. Ind. Assoc. Cult. Sci. 6, 1.
MÖGLICH, F., 1927, Handbuch der Physikalischen Optik, Band 1 (Leipzig, Barth) p. 499.
MOURASHINSKY, R., 1923, Phil. Mag. 46, 1008.
NAGAOKA, H., 1920, Astroph. J. 51, 74.
NIENHUIS, K., 1948, Thesis, University of Groningen.
NUSSENZVEIG, H., 1957, 1959, Notas de Fisica, Suppl. 1957; Phil. Trans. R. Soc. A 252 (1959) 1.
O'NEILL, E., 1958, Selected Topics in Optics and Communication Theory (Boston, Itek Corporation, 1958).
OSTERBERG, H., 1951, Appendix in “Phase Microscopy” (New York, Wiley) p. 238.
OSTERBERG, H. and R. MCDONALD, 1954, Optical Image Evaluation, National Bureau of Standards Circular 526 (Washington, Bureau of Standards) p. 23.
OSTERBERG, H. and L. SMITH, 1960, J. Opt. Soc. Amer. 50, 362.
OSTERBERG, H. and J. WILKINS, 1949, J. Opt. Soc. Amer. 39, 553.
PARRENT, G., 1961, J. Opt. Soc. Amer. 51, 143.
PICHT, J., 1931, Optische Abbildung (Braunschweig, Vieweg).
POCKELS, F., 1906, Winkelmanns Handbuch der Physik, Band 6 (Leipzig, Barth).
POINCARÉ, H., 1892, Théorie Mathématique de la Lumière, Vol. 2 (Paris, Carré).
RAYLEIGH, Lord, 1888, 1896, 1903, 1913, Encycl. Brit. 24 (1888) 430; also in Collected Papers, Vol. 3 Article 148 (1888) p. 47; Ibid. Vol. 4 Article 222 (1896) p. 235; Ibid. Vol. 5 Article 289 (1903) p. 118; Ibid. Vol. 6 Article 375 (1913) p. 161.
RICHARDS, B., 1956, Astronomical Optics (Amsterdam, North-Holland Publ. Company) p. 352.
RICHARDS, B. and E. WOLF, 1959, Proc. Roy. Soc. A 253, 358.
RONCHI, V., 1957, Histoire de la Lumière (Paris, Colin).
RUBINOWICZ, A., 1957, Die Beugungswelle in der Kirchhoffschen Theorie der Beugung (Warsaw, Polska Akademia Nauk).
SAITO, H., 1959, Jap. J. Appl. Phys. 28, 502.
SCHEINER, J. and S. HIRAYAMA, 1894, Abh. Königl. Akad. Wissensch., Anhang 1.
SCHELKUNOFF, S., 1951, Comm. Pure Appl. Math. 4, 44.
SCHWARZSCHILD, K., 1898, Sitz. München. Akad. Wiss., Math.-Phys. Kl. 28, 271.
SLATER, P. and W. WEINSTEIN, 1958, J. Opt. Soc. Amer. 48, 146.
SMITH, L., 1960, J. Opt. Soc. Amer. 50, 369.
SOMMERFELD, A., 1894, 1954, Nachr. Akad. Wiss. Göttingen, Math. Phys. Kl. 1 (1894) 383; Optics, translated by O. Laporte and P. Moldauer (New York, Academic Press).
STEEL, W., 1953, 1956, Rev. d'Optique 32 (1953) 4, 143, 269; Optica Acta 3 (1956) 49.
STEWARD, G. C., 1928, The Symmetrical Optical System (Cambridge University Press).
STOKES, G., 1853, Ed. Trans. 20, 317.
STRATTON, J., 1941, Electromagnetic Theory (New York, McGraw-Hill).
STRAUBEL, R., 1888, 1895, 1935, Thesis, Jena 1888; Wied. Ann. 56 (1895) 746; P. Zeeman Verh. (Hague, Nijhoff, 1935) p. 302.
STREHL, K., 1894, Theorie des Fernrohrs, Vol. 1 (Leipzig, Barth).
STRUVE, H., 1882a, b, 1886, Wied. Ann. 25 (1882a) 407; Ibid. 27 (1882b) 1008; Mem. Akad. Sci., St. Petersbourg 34 (1886) 1.
TAYLOR, C. and B. J. THOMPSON, 1958, J. Opt. Soc. Amer. 48, 844.
THEIMER, O., G. WASSERMANN and E. WOLF, 1952, Proc. R. Soc. A 212, 426.
THOMPSON, B. J., 1959, Proc. Phys. Soc. London 73, 905.
TORALDO DI FRANCIA, G., 1956, 1958, Introduction to the Modern Theory of Electromagnetic Diffraction (Firenze, Atti Fondaz. G. Ronchi, 1956); La Diffrazione della Luce (Torino, Edizioni Scientifiche Einaudi, 1958).
TURNER, R., 1924, J. Opt. Soc. Amer. 14, 649.
VERDET, E., 1881, Wellentheorie des Lichtes, Band 1, translation from French by K. Exner (Braunschweig, Vieweg).
VON LAUE, M., 1915, 1928, Enzykl. d. Math. Wiss. Band 5 (1915) p. 359; Handbuch der Exp. Phys. Band 18 (1928) 211.
WALKER, J., 1904, The Analytical Theory of Light (Cambridge University Press).
WEBER, H., 1879, Wied. Ann. 18, 407.
WEINSTEIN, W., 1954, 1955, J. Opt. Soc. Amer. 44 (1954) p. 610; Ibid. 45 (1955) 1006.
WOLF, E., 1951a, b, 1953, 1955, 1959, Proc. R. Soc. A 204 (1951a) 533; Repts. Progr. Phys. 14 (1951b) 95; J. Opt. Soc. Amer. 43 (1953) 218; Math. Rev. 16 (1955) 1074; Proc. R. Soc. A 253 (1959) 349.
WOLFSOHN, G., 1928, Handbuch der Physik, Band 20 (Berlin, Springer) p. 263.
ZERNIKE, F. and B. NIJBOER, 1949, La Théorie des Images Optiques (Paris, Revue d'Optique) p. 227.
IV
LIGHT AND INFORMATION †
BY
D. GABOR
Imperial College, London
† This article is the substance of a Ritchie lecture, delivered by the author on March 2, 1951 at the University of Edinburgh. The contents of the lecture became known to a wider audience through the distribution of a limited number of mimeographed notes, which have since become widely quoted in the literature. The wish has often been expressed that a permanent record of the lecture should be made generally available. We are glad to be able to meet this wish.
CONTENTS
§ 1. INTRODUCTION
§ 2. GEOMETRICAL OPTICS
§ 3. CLASSICAL WAVE OPTICS
§ 4. THE PARADOX OF “OBSERVATION WITHOUT ILLUMINATION”
§ 5. A FURTHER PARADOX: A PERPETUUM MOBILE OF THE SECOND KIND
§ 6. THE METRICAL INFORMATION IN LIGHT BEAMS
§ 7. CONCLUSION
APPENDICES
REFERENCES
§ 1. Introduction
Light is our most powerful source of information on the physical world. Anthropologists have often emphasized that the privileged position of Man is due as much to his exceptionally perfect eye as to his large brain. I was much impressed by a remark of Aldous Huxley, that we owe our civilisation largely to the fact that vision is an objective sense. An animal with an olfactory sense or with hearing, however well developed, could never have created science. A smell is either good or bad, and even hearing is never entirely neutral; music can convey emotions with an immediateness of which the sober visual arts are incapable. No wonder that the very word “objective” has been appropriated by optics. But on the other hand it is probably the peculiar character of vision which is chiefly responsible for one of the most deep-rooted of scientific prejudices: that the world can be divided into an outer world and into an “objective” observer, who observes “what there is”, without influencing the phenomena in the slightest. In this lecture an attempt will be made to discuss optics from the point of view of information theory. But before doing this, I must start with a disclaimer. I do not want to give the impression that we have now a valuable new epistemological principle, which we want to hand over to the physicist. Nothing irritates the physicist more than when the philosopher tries to look over his shoulder and to give him advice, and this is hardly surprising in view of the past record of philosophers, from Aristotle to Hegel, to mention only those who are safely dead.
Information theory does not originate from philosophers, but also from a group of outsiders: from mathematically interested electrical engineers, and mathematicians interested in communications. They may not be quite as suspect of conceit as philosophers, but it is just as well to point out from the start that the point of view of information theory was never quite absent from physics, and has been growing stronger and stronger in modern physics long before information theory became fashionable. Again and again in the course of this lecture I shall be able to point out the work of physicists in this direction. But having said this, I may be allowed also to say that the points of view of information theory, consistently applied, may yet prove of appreciable heuristic value to physics. What then are the points of view of information theory? I want to say that I am stating my own views, not necessarily shared by others who are working in this field. There are two steps in the approach. In the first step we specify the degrees of freedom of the phenomenon, in such a way that we operate always with discrete degrees, and in all practical cases with a finite number of them. This, in MacKay's useful terminology, specifies the structural aspects of information. Once we have found the right coordinates, the second step is to specify the phenomenon by attaching a measure to each coordinate. But it is essential that we must never expect an exact measure. We must take account of the fact that in every physical measurement there is an unavoidable amount of uncertainty, fluctuation or “noise”, so that the best we can do is to specify the measure between certain limits, with a certain probability. A convenient way of doing this is to lay down a certain “scale of distinguishable steps”, also called a “proper scale”. This means that we proceed along the scale in steps roughly equal to the uncertainty. Of course some sort of convention must be made regarding what one considers as distinguishable, e.g. by agreeing that if one says that the value is in a certain interval, this means that on repetition of the experiment one would find this interval in, say, 50% of the cases. Once such a convention is made (and in practice it is easy to fix one in most cases), the measurement is expressed by an integer, namely the order of the interval, counted from the lowest step.
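The “proper scale” described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `lowest` and `step` are hypothetical, the step width is taken as constant along the scale, and the interval order is counted from zero at the lowest step.

```python
import math

def proper_scale_index(value, lowest, step):
    """Return the integer order of the interval that contains `value`.

    `lowest` is the bottom of the scale and `step` is the width of one
    distinguishable step (roughly the measurement uncertainty). Both names
    are illustrative assumptions, not notation from the lecture.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if value < lowest:
        raise ValueError("value lies below the scale")
    return math.floor((value - lowest) / step)

def distinguishable_levels(lowest, highest, step):
    """Number of distinguishable steps that fit between the scale limits."""
    return math.ceil((highest - lowest) / step)

# Three hypothetical readings on a scale with uncertainty 0.5 per step.
readings = [0.93, 1.07, 2.49]
orders = [proper_scale_index(r, lowest=0.0, step=0.5) for r in readings]
print(orders)  # prints [1, 2, 4]
```

Each measurement is thereby reduced to one integer, and a scale of finite range carries only a finite number of such integers.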
Thus in information theory every phenomenon is described by a finite number of integers. There is no continuity, except in the probabilities. There is no need to emphasize how close this view comes to the method of quantum physics, and the authors of information theory do not wish to plead ignorance of this fact. On the contrary, this was always emphasized, especially in a paper by MacKay, and in those of the author. The “structural” information, i.e. the free coordinates of the phenomenon to be studied, has also been called the “a priori” part of it. What is meant by this can perhaps best be illustrated by Eddington's famous “parable of the fishing net” (EDDINGTON [1939], pp. 16, 62). If an ichthyologist casts a net with meshes two inches wide for exploring the life of the ocean, he must not be surprised if he finds that “no sea-creature is less than two inches long”. Similarly, if one tries to explore atmospherics by means of a radio set with a bandwidth of a thousand cycles, there is no need to look out for surges with a “base” of less than a millisecond. But one must be very careful with the word “a priori”. We do not always know our instrument as well as the ichthyologist ought to know his net, and the specification of the free coordinates of the instrument requires physical knowledge, and not only the knowledge of formal logic, as may be suggested by the word “a priori”. Later in this lecture there will be opportunity for showing that an important part of our knowledge of light is in fact embodied in the system of “free coordinates” suitable for its description. The handling of the metrical information (sometimes called a posteriori) in information theory has a distinctive feature which may be briefly mentioned. The number which appears as the result of the measurement is often considered as a selection from a number of possible values. Historically this may be attributed to the fact that the first authors in the field which later became known as “communication theory”, Nyquist, Küpfmüller and Hartley, were interested in telegraphy, where the signals are in fact selections from a certain discrete set. This view may appear a little strange to the physicist, but he may remember that once he has set his galvanometer, every possible reading is a selection from the distinguishable marks on his scale, at any rate if we include the reading “off the scale”. Nor is this concept such a stranger to physics as it might appear at first sight, as we shall see later when we come to the discussion of light, information and thermodynamics.
§ 2. Geometrical Optics
After a few rather unsuccessful attempts of the ancients, the laws of light were first formulated round the turn of the 16th century in the form of geometrical optics. This is built on the concept of a “ray of light”, which for a long time was naively identified with a geometrical line (sometimes a curve). From the point of view of information theory this is a completely unsatisfactory departure. Every point of an object plane sends out a double infinity of rays, and if we had a perfect lens, which is no impossibility in geometrical optics, we could unite this whole pencil of rays in one point of an image plane, and study the object plane point-for-point. But there is no need for a perfect lens. Let us take instead a camera obscura with a “point-hole”, and we have automatically perfect representation. The number of “free coordinates” is infinite in this naive view; we have not only an infinity of points or rays, but a transfinite infinity. It is evident how strongly these naive views, and the crude experiments on which they are based, are responsible for our belief in a continuous geometry. It was, of course, a very sound instinct which led Snell, Descartes and others to base the infant theory of light on what appeared to them the safe foundation of euclidean geometry. To this day we cannot do without the concept of a continuous space, though it is no longer euclidean. Attempts to eliminate it appeared to Einstein as promising as “breathing in a vacuum”. Some day it may be possible to discard it, but the time has not come yet, and we shall have to use continuous space as a background, though it will soon become evident that what we can physically distinguish in it are not points, but at the best small, diffuse patches. Yet geometrical optics gives at least a hint which way to look for a basis in applying information theory to light. Information is something which is propagated from the object to the image without destruction, if the imaging system is a perfect one; thus we must look for the invariants of the imaging process. Moreover we must look for a geometrical invariant for the structural specifications: one which exists as soon as we set up the image plane, the object plane and the lens system, irrespective of what object we put in the plane, and how we illuminate it. But there exists only one of this type: the Smith-Lagrange invariant (Fig. 1). This is the product of any small line element at right angles to the optic axis with the angular divergence of the rays which issue from any one of its points and pass through the lens aperture. This is the same for the object as for its image, α′a′ = α″a″.
Fig. 1. The Smith-Lagrange invariant in geometrical optics
This holds exactly true only for perfect imaging, but, excluding certain types of lens errors, it will be true also for less perfect ones if we restrict both the elements and the divergences to very small values. We can now write conveniently

$$dS \cdot d\Omega = \text{inv.},$$

where dΩ is the solid angle of a very narrow cone of rays, and dS is the projection of the area of a very small element, viewed from the direction of the cone. It will be seen later that this is, in fact, an important cue.
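The Smith-Lagrange invariant can be checked numerically in the simplest case. The sketch below assumes an ideal thin lens in air, paraxial (small-angle) rays, and an aperture of half-width h at the lens; the function names and the numerical values are hypothetical and serve only to show that the product of line element and angular divergence is the same on the object side and on the image side.

```python
import math

def thin_lens_image_distance(f, s):
    """Image distance from the thin-lens equation 1/s + 1/s_img = 1/f (s > f)."""
    return 1.0 / (1.0 / f - 1.0 / s)

def lagrange_products(f, s, y, h):
    """Return (object-side, image-side) products of element size and divergence.

    f: focal length, s: object distance, y: size of the small line element,
    h: half-width of the lens aperture. Paraxial approximation throughout.
    """
    s_img = thin_lens_image_distance(f, s)
    y_img = -y * s_img / s        # image of the line element (inverted)
    alpha_obj = 2.0 * h / s       # full divergence of the cone leaving an object point
    alpha_img = 2.0 * h / s_img   # full convergence of the cone reaching the image point
    return y * alpha_obj, abs(y_img) * alpha_img

# Hypothetical numbers: 100 mm lens, object at 300 mm, 1 mm element, 5 mm half-aperture.
obj_side, img_side = lagrange_products(f=0.1, s=0.3, y=1e-3, h=5e-3)
print(obj_side, img_side)
print(math.isclose(obj_side, img_side, rel_tol=1e-12))
```

The image is smaller by the magnification, while the cone of rays converging on it is wider by the same factor, so the product is unchanged.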
§ 3. Classical Wave Optics
After Snell, Descartes and Fermat the next great progress came with Christiaan Huygens, who formulated what we would nowadays call the scalar wave theory of light. It is known that this had to be replaced later by the “vector theory” of Young and Fresnel, and that their mechanical vectors had to be reinterpreted by Maxwell as electromagnetic ones, but these steps, important as they were, are not as fundamental from the point of view of information theory as Huygens' step from rays to waves. But we must not forget that Newton, though he opposed the wave theory, supplied what is perhaps the most important element in it, by his celebrated experiments in which he decomposed light into spectral colours, and showed that these could not be further decomposed. This made it possible later, in the hands of Young and of Fresnel, to associate a characteristic length, the wavelength, with every spectral colour. It is this characteristic length which changes the picture completely from the point of view of information theory. In wave optics the concept of a “ray” is not at all elementary. Its place is taken by the simplest solution of the wave equation: the plane, monochromatic wave. Unfortunately, like most of the simple concepts with which we do our thinking, this turns out to be a very remote abstraction from reality, because it must be infinite in extension. But we must retain it because of its mathematical simplicity, with much the same reservation which we have made about geometry. Let us therefore consider for a start what appears the simplest case: a plane, monochromatic wave with wavelength λ impinging on a plane object. But in order to represent such a wave mathematically we must make another questionable assumption. Consider for simplicity “scalar light” with amplitude u. (In the vector theory we can instead consider any Cartesian component of the vectors involved.) This must satisfy the wave equation

$$\nabla^2 u = \frac{1}{c^2}\,\frac{\partial^2 u}{\partial t^2},$$
which expresses the fact that light propagates in all directions with the velocity c. But if we want to satisfy this equation, we must assume that the wave which is periodic in space with period λ is periodic in time with a frequency ν = c/λ. This introduces an air of unreality into classical wave optics, because the frequency of light has never been measured in any optical experiment, nor the phase of this hypothetical vibration. What we measure are always wavelengths and relative phases, which are entirely determined by geometry. The behaviour of light in time appears in wave optics as an ad hoc construction, so contrived as to account for the velocity of propagation. But we will accept it for the present, because for long wavelengths, for radio waves, frequency and phase become really measurable quantities, and the vibrations can be followed in time by means of oscillographs. Why frequency should be measurable for long waves but not for short ones is a question to which classical wave theory has evidently no answer, and which we must leave for later. Consider now that such a wave, whose mathematical expression is

$$u = e^{2\pi i (z/\lambda - \nu t)},$$

falls in the z-direction on a plane object in the plane z = 0 (Fig. 2). Immediately behind the object the amplitude will be given by some expression of the form

$$u(x, y, +0, t) = t(x, y)\, e^{-2\pi i \nu t}. \tag{2}$$
t(x, y) is the complex “amplitude transmission” of the object. There is no need here to discuss its meaning, and how it is related to physical properties of the object, because in this experiment the function t(x, y) is the object. That is to say, it contains everything that we can expect to find out about the object; in fact, as we shall see in a moment, it contains much more, and only a small part of it is actually observable. The amplitude being given by eq. (2) immediately behind the object, the problem is to calculate it for any z. One could solve it by using the method of Huygens and Fresnel, by superimposing the elementary spherical wavelets issuing from all surface elements of the object. But another method, connected with the name of Fourier, but which, I believe, was first introduced into optics by Rayleigh, is far more appropriate. One starts by decomposing the transmission function t(x, y) into its Fourier components, by the formula

$$t(x, y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} T(\xi, \eta)\, e^{2\pi i (x\xi + y\eta)}\, d\xi\, d\eta. \tag{3}$$

Fig. 2. Propagation of light waves

Each Fourier component represents a simple periodic infinite “standing wave” of transmission, with the periods 1/ξ and 1/η in the x and y direction. Thus the “Fourier variables” ξ and η can be interpreted as wave numbers in the plane z = 0. The (complex) amplitude T(ξ, η) of these components is called the Fourier transform of t(x, y). By the principle of superposition (first noticed for water waves by Leonardo da Vinci), the amplitude at any point x, y, z can be calculated by determining separately the wave issuing from every one of the Fourier components, and summing them. The calculation, carried out in Appendix I, gives a very simple and significant result: those Fourier components whose period in the object plane is longer
than a wavelength will be propagated as plane waves, while those with a shorter period will be continued as exponentially damped (,evanescent waves”, which means that they will be practically damped out in a matter of a few wavelengths at most. It is intuitively clear that if the Fourier components below a wavelength are cut out, all details of the object (that is t o say of t(x,y ) ) which are finer than about half a wavelength will be cut out with them. Thus we arrive at the first significant result of wave theory, that light with a wavelength il will under no circumstances carry with it information on detail below Ql. We obtain a very clear idea of the propagation of the remaining information if we follow the transformations of the amplitude u with increasing distance z from the object plane in space, and simultaneously in “Fourier space”. This is illustrated in Fig. 2, but for simplicity only the intensity is shown, i.e. the squared absolute value of the amplitude u,and the modulus of its Fourier transform. The striking feature is that while the intensity pattern changes rapidly, so that the object soon becomes unrecognizable, the modulus of the Fourier transform does not change at all. This can be easily understood if it is remembered that each point 6,7 of the Fourier pattern corresponds to a certain direction in space, in which the corresponding plane wave is propagated, and this does not of course change in free space. The phase (argument), of the Fourier component changes with z , but here again we have a law which is very much simpler than the one for the phase change of u:- The phase factor of T depends only on z , 6 and TI, i.e. it is independent of all other points in the Fourier diagram, and it can be easily calculated, as shown in Appendix I. 
This is the advantage of the method of Fourier transforms, which does not apply to the simple case only which we have here considered, and which is finding increasing applications in instrumental optics, after having
Fig. 3. Connection between Fourier variables and angular variables
been for many years one of the chief mathematical tools of communication engineering. If the distance z of the screen on which we observe the intensity is further increased, all resemblance to the object is gradually lost, and finally the intensity pattern becomes identical with the Fourier modulus diagram. This is illustrated in Fig. 3. This happens at a distance so large that every plane wavelet issuing from the object can be considered as a “ray”. If we use a lens between the object and the screen, there is no need to go to infinity; we find the same conditions very nearly realized in the rear focal plane of the lens. This is the plane in which to place a ray-limiting aperture, if one wants the same angular limitation for every point of the object. In this simple case it is quite evident that we lose some further information, because we have cut out all Fourier components outside a certain area. We are now in a position to answer the first question of information theory, the question of the degree of freedom, or of “free coordinates”. We can reformulate this question in the form: “How many independent variables are necessary to express as much of the function t(x, y) as we can learn from an optical image, under certain conditions of ray limitation?” Consider first, for simplicity, the last example, in which the Fourier variables were all limited to the same region (by an aperture at a large distance), independently of the space coordinates x, y. We now build up the complicated beam which issues from the object out of elementary beams, every one of which has a non-zero Fourier transform only inside the allowed region, and we try to expand t(x, y) in a series of these. We find that we get into difficulties, because if the Fourier spectrum is sharply cut off, as assumed, these beams will spread out at the base, i.e. in the object plane, to infinity, hence we cannot have, as we wished, a sharply limited object.
Without going into technicalities which have been dealt with elsewhere (GABOR [1946]) we will only mention that there exists a relation of the form

$$\frac{\text{smallest effective beam area} \times \text{solid angle of divergence}}{\text{square of wavelength}} \approx 1 \tag{4}$$

and that the smallest possible value of this ratio is achieved for the rotationally symmetrical “gaussian elementary beam” illustrated in Fig. 4. This smallest value is of the order unity, with any reasonable definition of the quantities which figure in the numerator in eq. (4). Thus we see that so long as the product of object area and Fourier
area is of the order unity or smaller, we cannot even start to answer the question regarding the degrees of freedom, because we cannot construct even one elementary beam to satisfy the cut-off conditions. The question is evidently of a statistical nature, and can be answered
Fig. 4. Gaussian elementary beam. (The intensity distribution always remains Gaussian.)
with an accuracy of order 1/M if the product of object area and Fourier area is of the order M, a large number. But with this qualification we can give an answer to the question: a monochromatic beam of light has F degrees of freedom, where

$$F = 2 \times 2 \times \text{object area} \times \text{accessible Fourier area} \tag{5}$$

because it takes this number of independent terms to build up what remains of t(x, y) inside the object area, after cutting out the Fourier components outside a certain area †. The first factor 2 is due to the fact that each term has an arbitrary complex coefficient, equivalent to two real data, the other is due to the vector nature of light. In principle light can transmit two independent images, polarized at right angles to one another. This result is essentially contained in an important paper by MAX VON LAUE [1914], though not in connection with the transmission of information by light. It may be mentioned that the theorem has not yet been proved with a rigour which would satisfy mathematicians, but physicists have their own standards in these matters. The result is illustrated in Fig. 5. The information space has really four dimensions, x, y, ξ and η, but in the simple case where the solid angle Ω is independent of x, y three dimensions suffice. The theorem can be evidently generalized: the degree of freedom is 2 (or 4) times the volume of the information space available. So far we have talked of the stationary case only, i.e. of a steady, unchanging image. What happens if the object is moving or changing?

† Appendix II contains two examples of such series expansions for the “non-redundant” representation of what is left of t(x, y) after cutting off Fourier components. Physicists will need no reminding of how similar that is to the procedure in quantum mechanics, especially in Dirac’s formulation.
Fig. 5. Information space (axes: object area/λ² and solid angle Ω)
We can give an answer immediately, by availing ourselves of the now fairly generally known results of communication theory. Every degree of freedom can be conceived as a separate and independent communication line, which has (2)ΔνΔt degrees of freedom in a frequency interval Δν and the time interval (observation time) Δt. The factor (2) in brackets is to be used if the “temporal phase” is measurable. The author has shown in a recent paper (GABOR [1950]) that in the case of light this is possible only with quite extraordinary intensities, combined with high spectral purity, which it may never be possible to realize with existing light sources. But in the region of radio waves phase is easily measurable, and the factor 2 is justified. We can now write down our result for the degree of freedom of any beam of light (which need no longer be monochromatic or coherent †) in the general form
$$F = 2 \cdot 2 \cdot (2) \iiint\!\!\iiint dx\, dy\, d\xi\, d\eta\, d\nu\, dt, \tag{6}$$

or, in terms of the cross section dS and the differential solid angle dΩ,

$$F = 2 \cdot 2 \cdot (2) \iiiint \frac{dS\, d\Omega}{\lambda^2}\, d\nu\, dt. \tag{7}$$

This is evidently a significant quantity, because dS dΩ/λ² and dν dt are both relativistic invariants. But the result is hardly written down before doubts arise whether it can really stand on its own legs. We have already seen that the bracketed factor 2 becomes physically real only at very high intensities. But another question, even more elementary, is suggested by eqs. (6) and (7): What happens if we do not cut off the area or the angular variables sharply, as we have assumed up to now, but e.g. just almost cut out a part of the waves, by an almost black filter? Are we still allowed to measure the information space just as if it were fully accessible? This is a familiar dilemma in problems of a statistical nature, to which classical theory has no answer. The weighting factor which is evidently necessary will have to come from another side. But before approaching this question, we will sharpen the dilemma, by an example which throws into relief the logical insufficiency of the classical scheme.

† Cf. Appendix III.
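For limits which do not vary across the beam, the integral in eq. (7) reduces to a simple product, and the count of degrees of freedom can be sketched numerically. The following is only an illustration under that assumption; the function name and the sample values (a small object, unit solid angle, green light, one time cell) are hypothetical and do not come from the text.

```python
def degrees_of_freedom(area, solid_angle, wavelength, bandwidth, duration,
                       phase_measurable=False):
    """Evaluate eq. (7) for sharply and uniformly limited beams.

    F = 2 * 2 * (2) * (area * solid_angle / wavelength**2) * bandwidth * duration
    The bracketed factor (2) is applied only if the temporal phase is measurable.
    """
    spatial_cells = area * solid_angle / wavelength**2  # dS dOmega / lambda^2
    time_cells = bandwidth * duration                   # dnu dt
    temporal_factor = 2 if phase_measurable else 1
    return 2 * 2 * temporal_factor * spatial_cells * time_cells

# Hypothetical example: 0.1 mm x 0.1 mm object (1e-8 m^2), 1 steradian,
# wavelength 0.5 micron, and a single time cell (bandwidth * duration = 1).
F = degrees_of_freedom(area=1e-8, solid_angle=1.0, wavelength=0.5e-6,
                       bandwidth=1.0, duration=1.0)
print(F)
```

The sketch counts every cell with full weight, which is precisely the feature of the classical formula that is questioned above for the case of an almost black filter.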
§ 4. The Paradox of “Observation without Illumination”
The classical theory of light claims validity at all levels of intensity, however small. This appears a harmless assumption. Combined with the elementary experience that in fact every observation requires a certain minimum, finite light sum, one would at first sight conclude
Fig. 6. Paradox I: observation “without illumination”
only that one has to wait a correspondingly long time for an observation. But it will now be shown that if the classical theory were true, however large the minimum energy, we could make an observation with a light sum passing through the object which could be made as small as we like. Let us take a Zehnder-Mach interferometer, as shown in Fig. 6, in which coherent light is divided into two very unequal parts. Only a very small fraction is directed through the branch which contains the object; the rest is branched through the other arm, and united with the weak beam only at the receptor, which may be e.g. a photographic plate. Thus we have divided the light into two parts; a weak one which carries all the information, and a strong one which carries almost all the energy. For simplicity let us fix our attention on one resolvable element of the object, say a square whose edges are equal to the resolution limit, so that the result of the observation is expressed by a single number; the light sum S through the image of the element during the observation time. Let us call A₀ the amplitude in the strong, uniform background, a the amplitude which the image-carrying beam would produce by itself. As the two are coherent, the resulting intensity is

$$I = A_0^2 + a^2 + 2 A_0 a \cos\varphi \tag{8}$$

if φ is the phase angle between the two, which depends on the optical paths and also on the phase delay in the object. Similarly we have the relation between the resulting and the partial light sums

$$S = S_0 + s + 2 (S_0 s)^{1/2} \cos\varphi. \tag{9}$$

This is the sum of the large uniform background term S₀, known beforehand, the light sum s which has penetrated through the object, and which can be assumed as very small, and an interference term. This can happen to be zero if the two amplitudes are in quadrature, but if necessary we can repeat the experiment with a quarter-wave plate introduced into one branch or the other. The absolute expectation value of this term is

$$\frac{4}{\pi} (S_0 s)^{1/2}, \tag{10}$$

which means that we can amplify the effect of the weak image-carrying beam roughly in the ratio (S₀/s)^½. It is true that the contrast is still small, of the order (s/S₀)^½, but as the background is known
and uniform, we can subtract it. Subtraction is particularly easy if electrical methods are used; one takes the image on a television screen and suppresses the d.c. component in the transmission. But it can be done also with photographic plates if the grain is fine enough to be negligible, e.g. by using the Foucault-Toepler “Schlieren” method. We are allowed to neglect the grain, because however large the area and the corresponding minimum light sum, and however small s, we can make S₀ large enough for the product (10) to become observable. Thus, in the limit, we could make an observation with as small a total illumination of the object as we like. Instinct of course tells us that this cannot be true. The weak point in the argument is evidently the subtraction of the strong but uniform background. The argument would break down if, in increasing the intensity in the background, we would, at the same time, increase its uncontrollable fluctuations to such an extent that in the end the interference term (10), which indicates the object, could not be told against the background of “noise”. But we could eliminate the imperfections of the apparatus unless these fluctuations arose from the nature of light itself. Let us now make the reasonable assumption, that the experiment is bound to fail if the light sum s which has gone through the object element is smaller than a certain minimum energy ε₀, because for s < ε₀ the interference term (10) becomes smaller than the root of the mean fluctuation square of the background, i.e.

$$S_0 s < (\delta S_0)^2.$$

Assume that equality, i.e. possible observation, is just achieved for s = ε₀; this can be written

$$(\delta S_0/\varepsilon_0)^2 = S_0/\varepsilon_0. \tag{11}$$

S₀/ε₀ is a pure number, the light sum in the background in units of the minimum energy which makes an elementary observation possible. But eq. (11) is Poisson’s “law of rare events”. It could be exactly accounted for by the hypothesis that monochromatic light arrives in quanta of some size ε₀, which arrive at random, subject only to the condition that an average of S₀/ε₀ arrives during the observation time. No observation can be made with less than one quantum passing through the observed object.
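The argument of this section can be checked with a few lines of arithmetic. The sketch below assumes, as in eq. (11), that the background count S₀/ε₀ fluctuates according to Poisson's law, so that its root-mean-square fluctuation is the square root of its mean, and it drops numerical factors of order unity from the interference term, as the inequality preceding eq. (11) does. The function name is hypothetical.

```python
import math

def signal_to_noise(background_quanta, object_quanta):
    """Ratio of the interference term to the background fluctuation.

    Both arguments are light sums in units of epsilon_0 (numbers of quanta).
    Signal: sqrt(S0 * s), the interference term without factors of order unity.
    Noise:  sqrt(S0), the Poisson r.m.s. fluctuation of the background count.
    """
    signal = math.sqrt(background_quanta * object_quanta)
    noise = math.sqrt(background_quanta)
    return signal / noise

# Raising the background by ten orders of magnitude leaves the ratio unchanged.
for s0 in (1e2, 1e6, 1e12):
    print(s0, signal_to_noise(s0, 0.01), signal_to_noise(s0, 1.0))
```

On these assumptions the ratio is (s/ε₀)^½ whatever the strength of the background: the amplification gained from a strong reference beam is exactly cancelled by its fluctuations, and the ratio reaches unity only when about one quantum has passed through the object.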
§ 5. A Further Paradox: “A Perpetuum Mobile of the Second Kind”

We see that the conviction that one cannot get something for nothing, “not even an observation”, leads to the first result of the quantum theory of light, that monochromatic light is perceived in discrete quanta †. It will now be shown that this belief can be based on one of the strongest convictions of the physicist; the belief in the Second Principle of Thermodynamics. For our purpose it will be best to formulate this principle in the orthodox way: No cyclically operating machine is possible which produces work at the expense of the heat in one store. This great principle has always remained a challenge to physicists by the extreme generality which it claims, and it can be safely said that something has always been learned every time one tried to break through it. One of the most fruitful ideas in this direction came from Clerk Maxwell, who posed the question of demons opening a valve for fast molecules in a gas, and shutting it for slow ones. This led to the even simpler question: Why not spring-load the valve, so that only a fast molecule can open it? This was answered only by SMOLUCHOWSKI in his classical papers [1912-13]. These, however, did not deal with the question of an “intelligent demon”. L. SZILARD took this up [in 1929], and cleared the ground first by showing that a simple observation, which amounts to a selection from n equally likely possibilities, enables the observer to decrease the entropy of the system observed by a maximum of
k log n. Hence, in order to save the Second Principle, it must be assumed that such an observation could not be made by any “demon”, intelligent or mechanical, without an entropy increase of at least this amount. Szilárd proved this in detail in one example ††.

† It may be recalled that the existence of light quanta or photons was historically first inferred by EINSTEIN [1905] from the fluctuations of black radiation. Einstein obtained this by boldly applying the fluctuation law of statistical mechanics which he had previously discovered to the radiation in a black cavity. L. SZILARD showed [in 1925] that this fluctuation law is indeed a thermodynamical necessity.

†† Here is a somewhat simplified account of Szilárd’s work. He considers a molecule performing Brownian motion in a cylinder. At some instant this volume is divided into two parts by a shutter. An observation is made to determine which part-volume contains the molecule, and the shutter, now serving as a piston, is moved so that the one-molecule gas expands until it fills the whole volume. It is easy to see that the expectation value of the entropy gain is maximum if the two part volumes are equal, and in this case the entropy decrease is k log 2, corresponding to a binary choice. But the generalisation to an n-fold selection is so obvious, that we thought it fairer to formulate Szilárd’s result in this more general form.

Fig. 7. Paradox II: “perpetuum mobile of second kind”

We now consider a Perpetuum Mobile similar to Szilárd’s, in that it is based on Brownian motion, but different in so far as we aim to obtain large gains of entropy, and that we are using light for the
observations. This will enable us to prove once more the failure of the classical theory, and to learn a few somewhat surprising facts about the properties of photons. The imaginary machine is shown in Fig. 7. A single “molecule” is in thermal motion in an evacuated cylinder, connected with a large heat store at a temperature T. A part of the cylinder walls is transparent, and this fraction 1/X of the volume is flooded by a light beam, coming from both directions from a filament, which is the only part of the apparatus at a temperature different from T. We can imagine this filament coated with a selective emitter, which emits and absorbs only a narrow spectral range dν, so that the light is nearly monochromatic. The mirrors, lenses and windows are assumed to be ideal, so that the light stream could circulate without losses for an indefinite time if nothing came into its way. Assume now that the molecule drifts into the light stream. A part of the light will be scattered, and collected by one or the other or both photosensitive elements. These work a relay which sets the mechanism in motion. A frictionless piston slides into the cylinder, and will be slowly raised by the molecule, until it reaches the top of the cylinder. A cam ensures that the pressure of the molecule is always very nearly balanced according to Boyle’s law. Thus the expansion is isothermic, and the work gained at the expense of the heat in the store is

$$kT \log X, \tag{12}$$

where X is the expansion ratio, X = V/v. The entropy decrease is

$$k \log X. \tag{13}$$

At the same time as the piston is set in motion two ideal mirrors slide over the transparent windows, so that during the long working phase of the device there can be no further loss of light, even if the molecule should happen, as it will from time to time, to visit the part volume v again. The process can be cyclically repeated †, if the work gained, kT log X, is larger than the energy lost from the light beam. In this case a part of the work is used to restore the energy to the filament, and we have indeed a cyclically operating perpetuum mobile of the second kind. This of course cannot be true, but it is not easy to see where we have gone wrong. The evident objection against frictionless pistons, ideal mirrors, selective emitters etc. can be discarded. These are all thermodynamically sound assumptions, and have been used in the classical imaginary experiments of Boltzmann and Willy Wien. The disturbing feature of the problem is that, however high we assume the minimum energy ε₀ required for an observation, we can always make the expansion ratio X = V/v so large that

$$kT \log X > \varepsilon_0.$$

† In order to show the final part of the cycle a few modifications should be added to Fig. 7. One may imagine, for example, that at the stop the piston is again slipped out sideways, and does work on a machine, say an electric generator, while descending to its original position.

But X depends on that part of the volume into which light has not penetrated at all, according to our assumptions. Unless we can explain that a loss of light energy from the ordered beam has taken place somewhere in the cycle, which increases at least with the logarithm of the unexplored volume, we cannot disprove the perpetuum mobile. Before showing that classical light theory has no answer to this question, it may be mentioned that classical statistics has an answer, but one which we cannot accept. If, with Max Planck and Max von Laue, we apply Boltzmann’s statistical method to the degrees of freedom of light which we have previously discussed, it is easy to show that any amount of energy ε₀, when passing from the ordered state, in which it fills the partial volume v and the solid angle Ω, to the disordered state in which it fills the whole volume of the apparatus (necessarily larger than V) and the solid angle 4π, thereby increases the entropy of the system by more than

$$k \left( \log \frac{V}{v} + \log \frac{4\pi}{\Omega} \right).$$

But this is merely a mathematical expression, so contrived that the second principle shall be satisfied. It is useless unless classical theory can also explain how this entropy change has come about, i.e. unless it provides a mechanism by which the filament must lose an energy at least equal to kT log V/v. It will now be shown that there is, in fact, no such classical mechanism. The energy loss required to save the Second Principle can have taken place in three phases of the cycle:
1. During the transient process which precedes the steady state of the beam, when the sliding mirrors are removed from the windows after the completed cycle.
2. During the waiting time, before the molecule has appeared in the part-volume v, provided that there is a possibility for the light energy to be scattered by the molecule while it is still outside the volume v. This energy would be either absorbed by the walls of the cylinder, or if the walls are not absorbing, it would escape through the windows.
3. During the working phase. A certain amount of radiation is imprisoned in the cylinder when the windows are shut. This will be either dissipated to the walls by the molecule, or merely disordered, in which case it will escape when the windows are opened again for the next cycle.
Let us see first what classical theory has to say regarding these three possibilities.
Ad 1. There is indeed, by Huygens’ Principle, a transient process, during which the wavelets emitted by the progressing wavefront explore the whole available space, before they destroy each other by interference outside v. But the energy consumed or disordered in this process must be proportional to the final intensity, and this can be made as small as we like, as will be shown in a moment.
Ad 2. Scattering outside the space v can be made as small as we like, as according to classical theory the intensity can be made to fall off very sharply, e.g. as exp[−(x/x₀)²] with the distance x from the beam edge.
Ad 3. The imprisoned radiation is proportional to the intensity.
It remains to be shown that the intensity can in fact be made as small as we like. Let ε₀ be the energy necessary to make the relay work. This is estimated in Appendix IV; here we require only the evident result that it is in no way dependent on the potential expansion X. Let I_ν dν be the light-flux per unit time, Δt the mean waiting time, of the order of the “return time” of the molecule. The mean time which it spends in v will be of the order Δt · v/V = Δt/X; this is the time during which the observation must be made.
If κ is the fraction of the flux which is in the mean successfully scattered by the molecule on to the photosensitive elements, we have the relation

$$\kappa\, I_\nu\, d\nu\, \Delta t / X > \varepsilon_0. \tag{15}$$

The factor κ causes no difficulties, as it can be made of the order unity just in the most favourable case for large gains, when v is made so small that it is entirely filled by the molecule. Thus, however large ε₀ or X, we can keep the intensity I_ν dν as low as we like, if
only we make Δt sufficiently large, and this we can achieve by using a large and sluggish “molecule”. Such a machine would operate very slowly, but it would none the less certainly break through the Second Principle in the long run. But there is a hidden assumption here, which we must mention. This is that we can make I_ν as large as we like †, relative to the background of black radiation at the temperature T which fills the whole space. Experience (Wien’s Law) confirms that this is possible if we make the filament sufficiently hot. But this is a datum which we take directly from experience; the classical theory of light and thermodynamics have nothing to say on this point. (Classical statistics has something to say, but we are not here concerned with disproving classical statistics.) This is an essential assumption, as otherwise we could not expect the mechanism to pick out the weak signal from the background. We will return to it later.
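The classical escape route contained in relation (15) can be put into numbers. The following sketch merely solves (15) for the smallest admissible flux, in arbitrary units; the function name and the sample values are hypothetical, and the scattered fraction (written `kappa` here) is set to unity, the favourable case mentioned above.

```python
def minimum_flux(epsilon0, expansion_ratio, waiting_time, kappa=1.0):
    """Smallest light flux I_nu * dnu compatible with relation (15):

        kappa * flux * waiting_time / X > epsilon0
    """
    return epsilon0 * expansion_ratio / (kappa * waiting_time)

# Hypothetical values: relay energy 1, expansion ratio 10^4, and
# progressively more sluggish "molecules" (longer waiting times).
for waiting_time in (1.0, 1e3, 1e6):
    print(waiting_time, minimum_flux(1.0, 1e4, waiting_time))
```

Whatever the values of ε₀ and X, the required flux falls in inverse proportion to the waiting time, which is why classical theory cannot find the energy loss in the intensity of the beam.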
Fig. 8. Confining a beam by superimposing n Fourier components (cf. Appendix IV, p. 143)

† While keeping I_ν dν very small.
Thus the conclusion is that, on the basis of the classical light theory, we cannot disprove this perpetuum mobile. But even elementary quantum theory is not sufficient. A single quantum hν is sufficient for an observation, or at any rate a small number of quanta, if there is sufficient certainty that the photons do not come from the thermal background. But however large hν, we can still make the expansion ratio X so large that the gain exceeds the loss. But the modern quantum theory of radiation easily accounts for this queer phenomenon. The essence of this method is that it uses classical theory to the point of decomposing the general field into simple components which can be easily quantized: plane waves, or in the case of cavities, eigenfunctions. In our case plane waves will be appropriate, because we have not to deal with a closed cavity during the waiting time, when the windows are open, but we will talk of these, for simplicity, as “modes”, or “Fourier components”. Evidently only the vertical dimension of the cylinder is of importance; thus we can restrict our explanations to one dimension. It is a fundamental and elementary result of Fourier analysis, that in order to confine non-zero amplitudes essentially to a fraction 1/X of an interval, we must superimpose at least X components. This is illustrated in Fig. 8, where it is also shown how amplitude distributions can be produced which are essentially flat inside a region, and almost zero outside. That is to say, in order to confine a beam of light to a fraction 1/X of a volume, we must simultaneously excite at least X modes. The question now arises how strongly must we excite them. The answer is evidently: strongly enough, so that the peak in v rises sufficiently above the general level of fluctuations.
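The Fourier result used here can be verified numerically in one dimension. The sketch below is not the construction of Fig. 8 or Appendix IV; it assumes the simplest case of equal, in-phase components on an interval of unit length, and measures what fraction of the energy falls inside a window of half-width 1/X about the centre. The function name is hypothetical.

```python
import cmath
import math

def confined_fraction(n_modes, half_width, samples=4000):
    """Energy fraction within |x| < half_width for n_modes equal, in-phase
    Fourier components exp(2*pi*i*k*x), k = 0..n_modes-1, on [-1/2, 1/2)."""
    dx = 1.0 / samples
    inside = 0.0
    total = 0.0
    for j in range(samples):
        x = -0.5 + (j + 0.5) * dx  # midpoint rule
        amplitude = sum(cmath.exp(2j * math.pi * k * x) for k in range(n_modes))
        intensity = abs(amplitude) ** 2
        total += intensity * dx
        if abs(x) < half_width:
            inside += intensity * dx
    return inside / total

X = 16
print(confined_fraction(X, 1.0 / X))       # X components: most of the energy lies in the window
print(confined_fraction(X // 4, 1.0 / X))  # X/4 components: the same window holds far less
```

With X components the central peak, whose width is of the order 1/X of the interval, holds the bulk of the energy; with only a quarter as many components the beam cannot be squeezed into the same window.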
Sufficiently means that the peak must be about X times stronger in intensity than the mean energy level due to the fluctuations elsewhere, because the molecule spends about X times more time outside the volume v than inside it. The calculations are carried out in Appendix IV. Here we mention only the result, which is that every one of the X modes must contain in the mean about one half photon inside the volume V, in order to have about an even chance for a correct observation. But what happens if we want to avoid this danger (i.e. making a wrong observation during the waiting time) and increase the intensity sufficiently? In this case we fall into another trap; we imprison at least ½X photons, and these will be dissipated by the molecule during the long “working phase”. Thus the Second Principle is amply safeguarded, because the dissipated energy is at least ½Xhν, and this is always larger than kT log X, because

$$h\nu > kT, \qquad \tfrac{1}{2} X > \log X. \tag{16}$$

The first relation follows from the fact that a relay at temperature T cannot be safely worked by an energy less than kT; the second is a purely mathematical relation. It shows also that the more we try to gain, the more we are going to lose, because for large X the logarithm of X will be very much smaller than ½X. Thus the Second Principle reveals a rather curious and unexpected property of light. One could call it the “ubiquitousness of photons”, and sum it up in the form: Very weak beams of light cannot be concentrated. But lest this demonstration might have given the impression that one can sit down and work out the laws of nature from purely imaginary experiments, it will be useful to remember that an important element of experience has gone into our proof; the fact that the thermal spectrum falls off at short wavelengths. We cannot get something for nothing, not even an observation, far less a law of nature! But it remains remarkable how small a hint from experience is sometimes sufficient to reveal phenomena apparently quite unconnected with it †.
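The balance expressed by relation (16) is easily tabulated. The sketch below takes the dissipated energy as ½X quanta of energy hν and the work gained as kT log X, both measured in units of kT, and uses the least favourable admissible case hν = kT as the default; the function name is hypothetical.

```python
import math

def net_loss_in_kT(expansion_ratio, quantum_over_kT=1.0):
    """Dissipated light energy minus work gained, in units of kT.

    Loss: X/2 photons of energy h*nu each.
    Gain: kT * log X, the isothermal work of eq. (12).
    quantum_over_kT stands for h*nu / kT, which relation (16) requires to exceed 1.
    """
    loss = 0.5 * expansion_ratio * quantum_over_kT
    gain = math.log(expansion_ratio)
    return loss - gain

for X in (2, 10, 1e3, 1e6):
    print(X, net_loss_in_kT(X))
```

The difference is smallest near X = 2, where it is still positive, and it grows almost linearly with X thereafter, which is the sense in which the more we try to gain, the more we lose.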
† The considerations of this section give support to Max Born’s thesis that the Second Principle can be understood and satisfactorily interpreted only on the basis of quantum mechanics (BORN [1949]).

§ 6. The Metrical Information in Light Beams

We can now return to the problem of the information content of light, which we had to leave in a rather unsatisfactory state. Classical theory enabled us to count the degrees of freedom, but it did not provide a metric. In quantum theory we can count light energy in terms of photons, which provides a natural measure. In classical theory there is no upper limit to field intensities, and quantum theory, at least for the present, retains this feature by allowing any number of photons in one cell, i.e. in any one degree of freedom. It is interesting to consider for a moment to what an extent we can avail ourselves in practice of this generous theoretical
permission. We consider three powerful sources in different parts of the electromagnetic spectrum. A power generating station, 10 000 kW, 50 ± 0.01 cycles/sec, puts about 10⁴¹ photons into a single cell †. A large magnetron, in pulsed operation on 10 cm wavelength, 3 × 10⁹ ± 0.5 × 10⁶ cycles, though with an instantaneous power only ten times smaller than the generating station, produces 10²⁴ photons per cell. This is 10¹⁵ times smaller than in the first case, but still a large enough number to make electrical engineers indifferent to quantum theory. But a powerful high pressure mercury lamp, emitting 1 watt per cm² arc area in the form of the green line λ = 5461 ± 10 Ångström, achieves less than 10⁻³ photons per cell, i.e. the best it can do is about one photon for a thousand cells! Hence light optics, when it comes to metrical problems, is entirely outside the classical region. The classical theory has given us the formula
(7) for the degrees of freedom in an arbitrary light beam, called also the “number of logons”. We can now consider every logon separately from the point of view of information capacity. It is convenient to define this as the logarithm of the number of distinguishable steps s(n) if 1, 2, ... photons are packed into it, up to a level n. This problem was the subject of a recent investigation by the author [1950] where it was found that the number of distinguishable steps is, approximately,
† Cf. Appendix V.
and that the factor (2) in eq. (7) must be suppressed †. n_T is here the number of thermal photons at the temperature T at which the observations are made, which is, by Planck’s law

$$n_T = \frac{1}{e^{h\nu/kT} - 1}. \tag{18}$$

† It can be directly verified that this is the number of distinguishable energy levels, by using an extension of Einstein’s law for the energy fluctuations in a Hohlraum. One might ask whether, at high quantum levels, this is the same as the number of distinguishable states, because classical theory associates two quantities with every level: an amplitude and a phase. These are also considered as observable in the quantum theory of radiation, but only to a certain accuracy, determined by the uncertainty relation. BOHR and ROSENFELD [1933, 1950] have proved that these measurements can be indeed carried out, if no restriction is imposed on the particles used in the imaginary experiments, i.e. if one admits test bodies composed of nuclear matter, or even denser. On the other hand, I have found l.c. that if one uses electrons, and the type of electronic amplifier which appears the most promising for this purpose, the total information contained in the best possible measurements of amplitude and phase will be, at most, equal to the number of distinguishable energy levels, i.e. to s(n). The reason for this remarkable divergence from the “ideal” experiments is the shot effect in electron beams. If one assumed that, at least for very high n, eq. (17) has to be replaced by a linear law, this would merely restore the factor (2) which we have suppressed.

Now consider the case of a beam with many degrees of freedom, as given by eq. (7), e.g. the beam which is issuing from an object under a microscope. We define the information capacity again as the logarithm of the number of distinguishable states. In order to calculate this by combining the number s_i(n_i) for the different degrees of freedom i we must have some condition for the n_i, the photons which may appear in i. We obtain such a condition in the simplest and most natural form if we separate the “time cell” dν dt from the integrand, and put it equal to unity. In this case all elementary beams contained in eq. (7) are necessarily coherent. We now imagine that the object has been illuminated with N photons, in the same unit time cell, which means of course “coherent illumination”. Thus a total maximum of N photons can appear in the beam issuing from the object; this will be the case if the object is not absorbing but has only “phase contrast”. The problem is now clearly given: in how many ways can we distribute N photons over F degrees of freedom, and what is the total number of distinguishable patterns, formed by combinations of distinguishable steps? The quantity defined in this way is very close to, though not quite identical with, MAX PLANCK’s [1924] definition of the entropy of a quantized system as k times the logarithm of the “probability” P, where P is defined as the number of ways in which a given energy can be distributed over the states. The difference is only that we have replaced “states” by “distinguishable states”. The calculation, carried out in Appendix VI, gives the asymptotic formula, valid for large N and F,

$$\log P = \tfrac{1}{2} F \log \frac{2\pi e N}{F(1 + 2 n_T)} \tag{19}$$

for the maximum information capacity in a beam with $F$ degrees of freedom and containing $N$ photons. This formula still has the weakness that it gives equal weight to all degrees of freedom. What happens if we do not cut out some of the degrees but weaken them by absorbing screens? These screens are a part of our experimental set-up; they are part of our a priori information. We can answer the question immediately by associating a transmission coefficient $\tau_i$ with the $i$-th degree of freedom, which is a real, positive number, smaller than or at most equal to unity. Eq. (19) now changes into

$$\log P = \tfrac{1}{2} \sum_{i=1}^{F} \log \frac{2\pi e N \tau_i}{F(1 + 2 n_T)} \qquad (20)$$
This formula at last answers the objections to the classical theory; the degrees of freedom are properly weighted. It may be noted that the formula is an asymptotic one; it must not be extended to $\tau_i$ so small that an added degree of freedom might appear to make a negative contribution, which happens if the argument of the logarithm falls below unity, i.e. we must cut off at

$$\frac{N \tau_i}{F} = \frac{1 + 2 n_T}{2\pi e}.$$

For zero thermal noise, $n_T = 0$, this limit is about 17 times smaller than that given by MacKay's intuitive rule: “Adding a degree of freedom is useless if it will contain in the mean less than about one metron per logon”. The reason for this appreciable discrepancy is that if the logon is one of many, and a large energy is distributed over them, it can still make a useful contribution in the cases where it receives an energy above the average, in other words by making use of the fluctuations.
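The weighted capacity formula and its cut-off can be tried out numerically. The sketch below is an illustration under stated assumptions, not a definitive implementation: it takes the asymptotic form $\tfrac12 \sum_i \log[2\pi e N \tau_i / F(1 + 2n_T)]$, simply omits every logon whose argument falls below unity, and keeps $F$ equal to the total number of logons supplied (the last point is a choice made here, which the text leaves open). All names are hypothetical.

```python
import math


def log_capacity(n_photons, taus, n_thermal=0.0):
    """Asymptotic log P for transmissions `taus`, dropping logons below the cut-off."""
    f = len(taus)                       # F: number of degrees of freedom supplied
    total = 0.0
    for tau in taus:
        arg = 2.0 * math.pi * math.e * n_photons * tau / (f * (1.0 + 2.0 * n_thermal))
        if arg > 1.0:                   # a smaller argument would contribute negatively
            total += 0.5 * math.log(arg)
    return total


def cutoff_photons_per_logon(n_thermal=0.0):
    """Mean photon number N*tau/F below which a logon is cut off."""
    return (1.0 + 2.0 * n_thermal) / (2.0 * math.pi * math.e)


uniform = log_capacity(10000, [1.0] * 100)     # 100 equally weighted logons
screened = log_capacity(10, [1.0, 1e-9])       # second logon lies below the cut-off
print(uniform, screened, 1.0 / cutoff_photons_per_logon())
```

For $n_T = 0$ the reciprocal of the cut-off is $2\pi e \approx 17$, which is the factor by which the limit falls below one photon per logon.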
It may be pointed out that the entropies (19) and (20) which have a very close relation with what goes under this name in statistical mechanics, are not to be identified with Shannon’s “entropy” or measure of information in communication theory. The relations between them are discussed in Appendix VI.
§ 7. Conclusion
This, I believe, does not by any means exhaust what information theory can give to the physics of light. I have mentioned the unavoidable increase of disorder which every observation must create, but I could not go into the question of the unavoidable disorder which an observation creates in the object itself. This question was first raised by Bohr and by Heisenberg, and most important further developments are due to L. DE BROGLIE [1947]. It is a problem of the greatest interest to those who, like the author, are engaged in extending the limits of microscopic vision. I hope to have shown that information theory is of some heuristic use in physics, by asking the right sort of questions. But even if this were questioned, another advantage is, I believe, evident beyond doubt. This is that it prepares the mind for quantum theory, whose strange methods are so difficult to assimilate for those who have been too long engaged in classical physics. As we must now give up all hope of ever understanding the physical world on classical lines, it is gratifying that in information theory we appear to have the right tool for introducing the quantum point of view into classical physics.
Appendices

I. DIFFRACTION OF A WAVE AT A PLANE OBJECT
Let the object plane be $z = 0$, and assume that a plane monochromatic wave

$$u_0 = e^{2\pi i (z/\lambda - \nu t)} \qquad (z \le 0)$$

is incident on it. From the experience that the wavelength of light is not changed by a stationary object we can write the amplitude immediately behind the object plane in the form

$$u(x, y, +0, t) = t(x, y)\, e^{-2\pi i \nu t}, \qquad (1)$$
where $t(x, y)$ is some complex function, called the complex amplitude transmission of the object. As we do not go into the question how this function is correlated with known physical properties of the object, or how it varies with different illumination, in the present discussion $t(x, y)$ is the object, and the problem is only how to obtain it, or as much of it as possible, by observing the amplitude $u$ in other planes $z > 0$. Using Fourier's theorem we write

$$t(x, y) = \iint_{-\infty}^{\infty} T(\xi, \eta)\, e^{2\pi i (x\xi + y\eta)}\, d\xi\, d\eta, \qquad (2)$$
where $T$ is the Fourier transform of $t$. Substituting this into (1), the amplitude for $z = 0$ appears as the sum of Fourier components, periodic in $x$, $y$ and in $t$. We extend this to $z > 0$ by replacing each component by

$$\exp 2\pi i\,[(x\xi + y\eta + z\zeta) - \nu t].$$

$\zeta$ must be so determined that each component is a solution of the wave equation $\Box u = 0$. This gives

$$\xi^2 + \eta^2 + \zeta^2 = 1/\lambda^2.$$
Thus for $\xi^2 + \eta^2 < 1/\lambda^2$ we obtain plane waves, propagating in the direction of the wave normal $\alpha, \beta, \gamma$, given by

$$\cos\alpha = \lambda\xi, \qquad \cos\beta = \lambda\eta, \qquad \cos\gamma = \lambda\zeta = [1 - \lambda^2(\xi^2 + \eta^2)]^{1/2},$$

while for $\xi^2 + \eta^2 > 1/\lambda^2$ we obtain evanescent waves. This means that Fourier components with a periodicity in the object plane smaller than a wavelength are not propagated to any appreciable distance. In order to avoid changing the limits in the integral (2) we re-interpret $T$, so that it is assumed to vanish outside the circle $\xi^2 + \eta^2 = 1/\lambda^2$. With this understanding we have the solution, valid for all positive $z$,

$$u(x, y, z, t) = e^{-2\pi i \nu t} \iint_{-\infty}^{\infty} T(\xi, \eta)\, \exp\Big\{2\pi i \Big[x\xi + y\eta + \frac{z}{\lambda}\big(1 - \lambda^2(\xi^2 + \eta^2)\big)^{1/2}\Big]\Big\}\, d\xi\, d\eta.$$

Thus the amplitude appears as the transform of $T$, multiplied by a unitary factor which is a function of $z$ and of the Fourier variables.
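The distinction between propagated and evanescent Fourier components can be made concrete with a short numerical sketch. It assumes only the relation $\xi^2 + \eta^2 + \zeta^2 = 1/\lambda^2$ implied by the wave equation; the function names and the sample values (lengths in microns) are hypothetical and serve for illustration only.

```python
import cmath
import math


def axial_frequency(xi, eta, wavelength):
    """zeta from xi^2 + eta^2 + zeta^2 = 1/lambda^2; imaginary for evanescent waves."""
    return cmath.sqrt(1.0 / wavelength ** 2 - xi ** 2 - eta ** 2)


def propagation_factor(xi, eta, z, wavelength):
    """Factor multiplying the Fourier component (xi, eta) after a distance z."""
    return cmath.exp(2j * math.pi * z * axial_frequency(xi, eta, wavelength))


# Wavelength 0.5 micron, so the limiting spatial frequency is 2 periods per micron.
propagating = propagation_factor(1.0, 0.0, 1.0, 0.5)   # period of 1 micron
evanescent = propagation_factor(3.0, 0.0, 1.0, 0.5)    # period of 1/3 micron
print(abs(propagating), abs(evanescent))
```

The first factor has unit modulus, that is, a pure phase shift, while the second has decayed to less than one part in a hundred thousand within a distance of two wavelengths.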
This result, here obtained for propagation in free space, can be extended also to optical systems with certain geometrical errors, as shown by DUFFIEUX [1950] and by the author in a forthcoming paper (GABOR [1951]; cf. also BOOKER, RATCLIFFE and SHINN [1950]).

II. NON-REDUNDANT SPECIFICATION OF OPTICAL OBJECTS
Describing an object by a continuous transmission function $t(x, y)$ is unsatisfactory, as it contains infinitely more data than can be physically ascertained. The same objection has been raised in communication theory against the description of signals by continuous functions of time (GABOR [1946]) and it can be met in optics by the same method as used there: by representing $t(x, y)$ as an expansion in terms of suitably chosen elementary functions, and by taking only as many terms of the series as there are degrees of freedom. A few examples of such “non-redundant” representations may be given.

The simplest case arises if the object plane is limited by a rectangle so that the transmission $t$ is non-zero only in the limits $-\tfrac12 x_0 < x < \tfrac12 x_0$, $-\tfrac12 y_0 < y < \tfrac12 y_0$, and the Fourier plane is similarly limited by $-\tfrac12 \xi_0 < \xi < \tfrac12 \xi_0$, $-\tfrac12 \eta_0 < \eta < \tfrac12 \eta_0$. (Rectangular aperture.) In this case we replace $t(x, y)$ by “the cardinal function”

$$[t(x, y)] = \sum_n \sum_m t\Big(\frac{n}{\xi_0}, \frac{m}{\eta_0}\Big)\, \frac{\sin \pi(\xi_0 x - n)}{\pi(\xi_0 x - n)}\, \frac{\sin \pi(\eta_0 y - m)}{\pi(\eta_0 y - m)}. \qquad (1)$$
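The replacement of a transmission function by its cardinal function may be illustrated in one dimension. The sketch below assumes the standard Whittaker form, with samples taken at the interpolation points $x = n/\xi_0$ and elementary functions of the type $(\sin x)/x$; the function names and the gaussian test object are hypothetical choices made for the illustration.

```python
import math


def sinc(u):
    """sin(pi u) / (pi u), with the limiting value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def cardinal(samples, xi0, x):
    """One-dimensional cardinal function built from samples[n] = t(n / xi0)."""
    return sum(t_n * sinc(xi0 * x - n) for n, t_n in samples.items())


XI0 = 4.0                                              # Fourier-plane width (assumed)
samples = {n: math.exp(-(n / XI0) ** 2) for n in range(-20, 21)}

at_sample_point = cardinal(samples, XI0, 2 / XI0)      # an interpolation point
between_points = cardinal(samples, XI0, 0.1)           # a point between two samples
print(at_sample_point, between_points, math.exp(-0.01))
```

At the interpolation points the cardinal function reproduces the samples, and for an object whose spectrum lies practically inside the aperture, such as the gaussian used here, it also follows the object closely between them.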
The cardinal-function formula (1), known in interpolation theory (WHITTAKER [1915]), has been applied in signal analysis (in the one-dimensional case) by SHANNON [1949], OSWALD [1949] and by VAN DER POL [1950] to produce “signals of limited spectrum” which are cut off at a certain maximum frequency. It is easy to see that this function assumes the same values as $t(x, y)$ at the interpolation points $x = n/\xi_0$, $y = m/\eta_0$, but its Fourier transform is non-zero and constant only inside a certain rectangle, as specified, and zero outside it. As the transmission $t$ is also zero outside a certain rectangular region, the summation limits are effectively

$$-\tfrac12 x_0 \xi_0 < n < \tfrac12 x_0 \xi_0, \qquad -\tfrac12 y_0 \eta_0 < m < \tfrac12 y_0 \eta_0;$$

that is to say there are altogether $x_0 y_0 \xi_0 \eta_0$ terms, equal to the degrees of freedom. It may be noted that the function defined by eq. (1) does not go exactly to zero at the limits of the rectangle, as the elementary functions of the type $(\sin x)/x$ spread out a little, but the error is relatively small if the number of degrees of freedom is large.

In the more frequent case in which the useful parts of the object plane and of the Fourier or aperture plane are of circular shape, a solution of the problem can be written in the form

[expansion in Bessel functions, eq. (2); not legible in the source]
where we have introduced polar coordinates $r$, $\theta$ in the object plane, and $\rho_0$ is the maximum Fourier radius. $J_m$ is the Bessel function of order $m$. The expansion has $\tfrac12 N(N + 1)$ non-zero terms, and equating this to the degrees of freedom, we obtain

$$\tfrac12 N(N + 1) = (\pi R \rho_0)^2$$

if $R$ is the radius in the object plane, or approximately $N = \pi \sqrt{2}\, R \rho_0$. We now prove that the spectrum is in fact inside $\rho_0$, by investigating the spectrum of a single term. This is

and this is indeed zero, unless $\rho$ assumes one of the values

$$\rho = \frac{n}{N}\, \rho_0,$$

which are all inside or at the limit $\rho_0$. It is seen that the spectrum consists of discrete circles. This is a consequence of the fact that we have extended the integration in (3) to infinity. If there were a sharp cut-off in the object plane, these would spread out.

In the general case, when both the object and the aperture are of arbitrary shape, an approximate method can be applied which has been sketched out by the author in a previous paper (GABOR [1949]). The object is considered as formed by the superposition of “gaussian patches”, arranged in a honeycomb pattern. This method will be applied later in connection with the problem of coherence.

III. THE EFFECT OF ILLUMINATION
Equations (6) and (7) of the text contain no explicit reference to the mode of illumination other than the wavelength, and it might appear doubtful whether they are of as general validity as claimed. The effect of illumination on the resolving power of optical instruments, and especially the effect of coherence, was not long ago the object of a heated discussion between the followers of Abbe and those of Rayleigh. It has now become clear that the Abbe school had greatly exaggerated the influence of coherence, but the effect, though small, is not unimportant, and it may be useful to establish the connection with the current optical theory. It may be remembered that we have considered only one optical experiment carried out on one object, and in this $t(x, y)$ or rather $[t(x, y)]$ was, of necessity, the object. If now we consider different ways of illuminating the same object, we must make the assumption which is at the basis of all theories of optical instruments based on Kirchhoff's approximation:

$$\psi(x, y, +0, \nu) = t(x, y, \nu)\, \psi(x, y, -0, \nu). \qquad (1)$$
That is to say, the complex amplitude associated with the frequency $\nu$ immediately behind the object is connected with the corresponding quantity immediately before the object by a complex transmission coefficient $t$, which is a function of $x$, $y$ and of the frequency, but of nothing else. It is an approximation which gives good service for all but extreme angles of incidence. We can now treat the effect of illumination by reducing the general case to the special case of plane, parallel illumination in the $z$-direction which we have previously considered, by replacing the object $t(x, y, \nu)$, illuminated in some general way, by another object $t'(x, y, \nu)$ which would present the same appearance if it were illuminated in the standard way. Consider first monochromatic illumination, with an amplitude $A(\alpha, \beta)\, d\Omega$ for those plane components whose wave normal is inside the solid angle element $d\Omega$. Thus

$$t'(x, y) = t(x, y) \iint A(\alpha, \beta)\, e^{(2\pi i/\lambda)(x \cos\alpha + y \cos\beta)}\, d\Omega. \qquad (2)$$
In order to obtain the image of the object t with this illumination in an ideal microscope, which has no other errors than diffraction, we must take the “square bracket” substitute of this function t‘ according to the rules of the previous Appendix. Its Fourier transform is
We now drop the assumption of monochromatic radiation, but still operate with plane waves, as only for these can we assume the amplitude to be an arbitrary function of time, without affecting the spatial character. The passage is easiest if we replace every plane wavelet of frequency $\nu$ (at $z = 0$) by the “gaussian elementary signals” studied in communication theory (GABOR [1946]). These are pulses whose time-description is, in standardized form,

and whose Fourier transform is

with

$$\Delta t\, \Delta\nu = 1.$$

It has been shown that any arbitrary signal can be represented by dividing up the time-frequency plane into cells $\Delta t\, \Delta\nu$ of unit size, and associating such an elementary signal with each cell. For $\Delta t = \infty$ one obtains infinite monochromatic wavetrains as a special case. Incoherent, “natural” light may be more conveniently treated with a finite choice of $\Delta t$, but all descriptions are of course equivalent, and the question never arises whether a description by short pulses might be “truer” than a description by long wavetrains, as Rayleigh pointed out long ago †. Thus the factor $d\nu\, dt$ which we have attached to eqs. (6) and (7) of the text covers all cases of coherent or incoherent illumination.

† Cf. A. SOMMERFELD [1950] p. 362.

A word may be said in this connection on the representation of “natural” light, issuing from a source such as a hot body or an excited gas. If the beam is analysed into bundles, each corresponding to one (complex) degree of freedom, each bundle is independent of all others (“incoherent” with them), in whatever way the analysis is performed. As a convenient example let us analyse the beam issuing from a plane $x, y$ into gaussian beams. The space representation is, for $z = 0$,

and the Fourier or angular representation

with

$$\Delta x\, \Delta\xi = 1, \qquad \Delta y\, \Delta\eta = 1.$$

These elementary beams, if they are sufficiently narrow, will propagate in such a way that the amplitude distribution is gaussian in every cross section. (Cf. MOTT and MASSEY [1949], p. 6.)

IV. NOTES TO THE PERPETUUM MOBILE PROBLEM
The minimum energy required to operate safely a relay of any kind at a temperature $T$ can be estimated from NYQUIST's theorem [1928], which states that the noise power for one degree of freedom, in a frequency range $\Delta f$, is

$$kT\, \Delta f. \qquad (1)$$

The relay, in our case, has to expect a signal during a time interval of the order $\Delta t/X$, if $\Delta t$ is the mean waiting time. Thus the optimum setting of the frequency band $\Delta f$ transmitted to the relay is of the order

$$\Delta f = \frac{X}{\Delta t}.$$

An instrument set in this way will integrate the signal over a time of the order $1/\Delta f$, and if the signal energy received is $\varepsilon_0$, the mean signal power during this time is about $\varepsilon_0\, \Delta f$. As this must exceed the noise power (1), the minimum perceptible energy, received during the interval $1/\Delta f$ or less, will indeed be of the order $kT$.
Fourier analysis of restricted beams. Fig. 8 illustrates an example of the series

$$\sum_{k=1}^{n} \cos \frac{2\pi k z}{L} = \frac{\cos\,(\pi(n + 1)z/L)\, \sin\,(n\pi z/L)}{\sin\,(\pi z/L)}, \qquad (3)$$
that is to say it consists of $n$ equal harmonic components, which have all the same phase at $z = 0$ and substantially destroy each other by interference outside an interval of about $L/n$. This function also satisfies exactly the boundary condition “amplitude zero at $z = \pm\tfrac12 L$” if $n$ is an even number. It is also shown in Fig. 8 how three such wave-sets can be superimposed in order to produce an amplitude which is substantially flat inside an interval $L/n$, and vanishes even a little more rapidly outside this interval than the function (3). In this second example $2n$ modes are excited, because the functions (3), shifted to $z_0$, can be written

$$\sum_{k=1}^{n} \cos \frac{2\pi k (z - z_0)}{L},$$

i.e. sin components are excited, as well as cos components. In order to satisfy the boundary conditions at $z = \pm\tfrac12 L$ only one condition must be imposed on the components, because they are periodic in $L$. Note that it is not necessary to impose the boundary conditions on every component separately, because they are coherent. That is to say the components need not be eigenfunctions of the interval $L$.

The fluctuations of the intensity. Assume that we have decomposed the beam into plane waves inside the volume $V$, by the scheme just described. (We neglect the fact that we have two beams going in opposite directions, requiring at least $2X$ components for their description, as even $X$ waves will give us a sufficiently sharp criterion.) Plane waves can be directly quantized, due to their “particle-like” classical properties (HEITLER [1944], p. 18). We assume that each of those waves, $1 \ldots i \ldots n$, contains an integer number of quanta $q_i$ inside the volume $V$. If $n$ is a reasonably large number, the probability of an additional quantum appearing in any one of the Fourier components is small; we can therefore, at least approximately, apply the law of Poisson's distribution, according to which the probability of the $i$-th mode containing $q_i$ photons is

$$p(q_i) = \frac{\bar q_i^{\,q_i}}{q_i!}\, e^{-\bar q_i}. \qquad (4)$$

For simplicity we assume that the mean value

$$\bar q_i = q/n \qquad (5)$$

is the same for all components, $q$ being the total number of quanta in the beam inside $V$. The probability of scattering taking place at any point in the volume is proportional to the classically calculated intensity, but with the assumption that the energies in the modes are distributed according to the law (4). As we need only compare the intensities, or probabilities, in a point outside the volume $v$ with those inside it, we can use quantum units, and write for the probability

$$P = \Big(\sum \sqrt{q_i}\, \cos\varphi_i\Big)^2 + \Big(\sum \sqrt{q_i}\, \sin\varphi_i\Big)^2. \qquad (6)$$
The $\varphi_i$ here are the classically defined relative phases (relative to any one of the components); they do not fluctuate. It is simpler to replace these by complex unit vectors

$$c_i = e^{i\varphi_i}, \qquad (7)$$

so that the probability appears in the form

$$P = \Big(\sum c_i \sqrt{q_i}\Big)\Big(\sum c_k^{*} \sqrt{q_k}\Big). \qquad (8)$$

Outside $v$ we assume that the $c_i$ form a closed polygon

$$\sum c_i = 0, \qquad (9)$$

so that the intensity or probability would be zero if all the $q_i$ were exactly equal. We now take account of their fluctuations, by writing

$$q_i = \bar q_i + \delta q_i, \qquad \sqrt{q_i} = \sqrt{\bar q_i}\, \Big[1 + \frac{1}{2}\frac{\delta q_i}{\bar q_i} - \frac{1}{8}\Big(\frac{\delta q_i}{\bar q_i}\Big)^2 + \ldots\Big]. \qquad (10)$$

We shall have to use the second approximation, as $n/q$ will turn out to be of the order 2, and this is not sufficient to justify the first approximation. With the condition (9) this gives for the mean probability of scattering
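where we have assumed that the fluctuations are independent in the mean, $\overline{\delta q_i\, \delta q_k} = 0$ for $i \ne k$. For the Poisson distribution eq. (4),

$$\overline{\delta q_i^{2}} = \bar q_i = \frac{q}{n}, \qquad \overline{\delta q_i^{4}} = \bar q_i + 3 \bar q_i^{2} = \frac{q}{n} + 3\Big(\frac{q}{n}\Big)^2. \qquad (12)$$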
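The two moments of the Poisson law used in (12) are easily checked by direct summation. The sketch below is only a numerical illustration; the function name and the truncation of the sum at `kmax` are assumptions made here.

```python
import math


def poisson_central_moment(mean, order, kmax=200):
    """Central moment of the given order for a Poisson law, by direct summation."""
    p = math.exp(-mean)                 # probability of finding q = 0 quanta
    total = 0.0
    for q in range(kmax + 1):
        total += p * (q - mean) ** order
        p *= mean / (q + 1)             # recurrence P(q + 1) = P(q) * mean / (q + 1)
    return total


Q_BAR = 0.5                             # mean occupation, of the order suggested by n/q = 2
second = poisson_central_moment(Q_BAR, 2)
fourth = poisson_central_moment(Q_BAR, 4)
print(second, fourth)
```

With a mean of one half the second moment is 0.5 and the fourth is $0.5 + 3 \times 0.25 = 1.25$, so that the fourth-order term is by no means negligible at such small occupation numbers.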
Substituting these into ( 1 1 ) the probability outside v is found to be
Po = $n
(13)
Inside $v$ all Fourier components are in phase, i.e. $c_i = 1$, and the corresponding probability $P_i$ is, by eq. (8), in the mean

$$P_i = (n\, \bar q_i^{1/2})^2 = nq. \qquad (14)$$

If now we want at least an even chance for a correct observation, we must postulate

$$P_i \ge (X - 1)\, P_0, \qquad (15)$$
because the molecule will spend in the mean $X - 1$ times more time outside than inside $v$. For simplicity we will write $X$ instead of $X - 1$, and write $n = kX$, where $k$, as we have previously seen, must be at least unity if the beam is to be confined to the fraction $1/X$ of the volume. Substituting (13) and (14) into (15) we thus obtain the condition

for approximately even chance of a successful observation of the molecule. A few values are

k        1       2       3
q/X >    0.45    0.54    1.0

Thus even for the smallest possible number $k = 1$ we have for the minimum number of quanta lost during the cycle

$$q > 0.45\, X > \log X,$$

which, as we have shown in the text, amply safeguards the Second Principle.

V. OCCUPATION NUMBERS IN LIGHT BEAMS AND IN ELECTRON BEAMS
A power generating station can be considered as a coherent source of long-wave energy in time intervals $dt = 1/d\nu$. One can estimate the frequency as constant within about ± 0.01 cycles/sec for short times. This gives a cell length $dt$ of 200 seconds. During this time a 10 000 kW station sends out $2 \times 10^{9}$ joule. A quantum at 50 cycles has an energy of $3.3 \times 10^{-32}$ joule, hence the figure of $10^{41}$ quanta per cell, quoted in the text.

A pulsed microwave transmitter, emitting pulses of 1 microsec duration, has ipso facto a bandwidth of $10^{6}$ cycles. Each pulse is one cell. With an instantaneous power of 1 megawatt the energy in the pulse is 1 joule. One quantum at 10 cm wavelength, $\nu = 3 \times 10^{9}$ cycles, has an energy of $2 \times 10^{-24}$ joule, which gives about $10^{24}$ photons per cell.

A high pressure mercury lamp may radiate 1 watt per cm² arc surface in the form of the green line with nominal wavelength $\lambda = 5461$ Ångström and a line width of 20 Ångström. The duration of the cell is only $0.5 \times 10^{-12}$ sec. As the arc emits in a solid angle of $2\pi$, the coherent emitting area is $\lambda^2/4\pi = 0.25 \times 10^{-9}$ cm². (It would be only half as much if we counted only one direction of polarization.) The energy in the cell is about $1.3 \times 10^{-22}$ joule, while the energy of one quantum is $3.5 \times 10^{-19}$ joule. This gives the figure of less than one in a thousand quoted in the text.

The Sun produces somewhat higher occupation numbers. Using Planck's law for thermal emitters

$$n_T = \frac{1}{\exp(h\nu/kT) - 1}$$

(for one polarization), and substituting $\lambda = 0.6 \times 10^{-4}$ cm (yellow light), $T = 5500$ °K, we obtain $n = 1.3 \times 10^{-2}$. This figure can be somewhat exceeded by “extra-high pressure” mercury lamps. Appreciably higher figures cannot be achieved by any steady source, but only by exploding wires and the like.

It is of some interest to put the expression for the unit cell in a form in which it applies to all kinds of particles, to electrons for instance as well as to photons. The unit cell for light is defined by

$$[2] \cdot 2\, \frac{dS}{\lambda^2}\, d\Omega\, d\nu\, dt = 1.$$
The first factor [2] relates to the polarization; this is peculiar to light. The second 2 stands for the “spatial phase”, i.e. for the fact that for every spatial period one can distinguish a sine and a cosine component. We have dropped the factor (2), considering the “time phase” as unobservable. We now put this into a more general form by making use of Einstein's equation

$$E = h\nu \qquad (2)$$

where $E$ is the energy of the particle, and of de Broglie's relation

$$p = \frac{h}{\lambda} \qquad (3)$$

where $p$ is its momentum. This gives the general definition of the cell

$$[2]\, \frac{2p^2}{h^3}\, dS\, d\Omega\, dE\, dt = 1. \qquad (4)$$

In the case of light $p = h\nu/c = E/c$, and $[2] = 2$; we can write (4) in the form

$$\frac{4}{h^3 c^2}\, dS\, d\Omega\, E^2\, dE\, dt = 1, \qquad (5)$$

while in the case of slow electrons $p^2 = (mv)^2 = 2mE$, and the cell is

$$\frac{4m}{h^3}\, dS\, d\Omega\, E\, dE\, dt = 1. \qquad (6)$$
There is a fundamental difference between light optics and electron optics, where by Pauli's exclusion principle the maximum occupation of a cell is one or two. It will be shown in a moment that two antiparallel electrons in a cell in a free beam is an extremely unlikely case; it is questionable whether it can occur at all. Two electrons in a cell with opposite spins could not be distinguished from a particle with double charge and zero spin, and this is probably an extremely unstable particle.

We will now calculate the maximum current density in an electron beam if every cell is occupied by a charge $e$. Eq. (6) gives for every elementary beam a maximum current of

$$\frac{4em}{h^3}\, E\, dE \quad \text{(per unit area and steradian)}. \qquad (7)$$

The maximum density per cm² arises at a source where the solid angle is $2\pi$, and this is

$$\frac{8\pi e m}{h^3}\, E\, dE = \frac{8\pi e^3 m}{h^3}\, V\, dV \quad \text{(per cm}^2\text{)},$$

where we have expressed the electron energy also in terms of equivalent volts, $E = eV$. This gives numerically

$$3.2 \times 10^{10}\ \text{amp/cm}^2 \times \text{volt energy} \times \text{volt energy spread}.$$
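The numerical coefficient can be verified directly from the constants. The following sketch assumes the expression $(8\pi e^3 m/h^3)\, V\, dV$ in practical (SI) units, converted to amp/cm²; the function name, the rounded constants and the sample figures for field emission (3 volts energy, 3 volts spread, $10^5$ amp/cm²) are assumptions chosen for illustration.

```python
import math

E_CHARGE = 1.602e-19   # electron charge, coulomb (rounded)
M_E = 9.109e-31        # electron mass, kg (rounded)
H = 6.626e-34          # Planck constant, J s (rounded)


def max_current_density(volts, volt_spread):
    """Current density in amp/cm^2 if every cell carries one charge e."""
    per_m2 = 8.0 * math.pi * E_CHARGE ** 3 * M_E / H ** 3 * volts * volt_spread
    return per_m2 * 1.0e-4             # convert amp/m^2 to amp/cm^2


coefficient = max_current_density(1.0, 1.0)
occupation = 1.0e5 / max_current_density(3.0, 3.0)   # assumed field-emission figures
print(coefficient, occupation)
```

With these assumed figures the occupation number comes out well below one in a million, in keeping with the remark that the cells of even the densest beams are almost empty.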
This is an enormous density. The highest figures experimentally realized under stationary conditions have been obtained by autoelectronic emission, where the energy and the energy spread were presumably of the order of a few electron volts, and the current densities were $10^{4}$ to $10^{5}$ amp/cm². Even assuming that the emission was to some extent directed, so that the angular spread was less than $2\pi$, it is clear that the occupation numbers were very small even in these rather extreme experiments. Hence the almost complete identity of light optics and electron optics, in spite of the extreme difference between Einstein-Bose and Fermi-Dirac statistics.

VI. INFORMATION CAPACITY AND SELECTIVE ENTROPY
It has been shown in the text that the first problem which arises is to count the number of distinguishable configurations which can be produced by distributing up to $N$ photons over $F$ degrees of freedom, or “logons”. Let $n_i$ be the number of photons in the $i$-th logon. We have the condition

$$n_1 + \ldots + n_i + \ldots + n_F \le N \qquad (1)$$

with the understanding that all $n_i$ are positive. The number of distinguishable steps up to $n_i$ is

$$s_i = 2\,(1 + 2 n_T)^{-1/2}\, n_i^{1/2}. \qquad (2)$$
Let us now imagine the $s_i$ as certain coordinates in an $F$-dimensional space. The number of distinguishable configurations is now equal to the number of integer lattice points inside a hypersphere with the radius

$$R = 2\,(1 + 2 n_T)^{-1/2}\, N^{1/2} \qquad (3)$$

in the sector where all $s_i$ are positive. In other words, the number of distinguishable configurations is the volume of this sector of the hypersphere, which is

$$P = \frac{\pi^{F/2}}{\Gamma(\tfrac12 F + 1)} \Big(\frac{R}{2}\Big)^{F}, \qquad (4)$$

and the information capacity, as defined, is the logarithm of this number. Using Stirling's formula

$$\log \Gamma(\tfrac12 F + 1) \approx \tfrac12 F\,(\log \tfrac12 F - 1), \qquad (5)$$

valid for large $F$, we obtain
$$\log P = \tfrac12 F \log \frac{2\pi e N}{F(1 + 2 n_T)}, \qquad (6)$$

which is eq. (19) of the text.

The next problem is to calculate this number if the set-up contains absorbing screens with transmission coefficients $\tau_i$ ($0 < \tau_i < 1$), so that the maximum energy which can appear in the $i$-th logon is not $N$ but $N\tau_i$. (The whole energy can appear in any single logon if the object has only a single Fourier component, and this one with pure phase contrast, without absorption.) This problem is easily reduced to the previous one, if we replace the $n_i$ by $n_i/\tau_i$. We have now to consider the volume of a hyper-ellipsoid, with semi-axes $R\, \tau_i^{1/2}$. The result is

$$\log P = \tfrac12 \sum_{i=1}^{F} \log \frac{2\pi e N \tau_i}{F(1 + 2 n_T)}, \qquad (7)$$

which is eq. (20) of the text.

This is about as far as the interest of the physicist usually goes in these matters. But as many research workers are now busy studying the works of Wiener and of Shannon, it may be useful to establish the liaison with communication theory. This introduces an idea rather strange to physics, but very natural in communication engineering: The pattern to be observed is not entirely unknown. It is a new term in a long series, of which we know the statistical characteristics. As an illustration the pattern in question could be a moving picture, or a television picture. In this case one can generally predict for instance that the upper half of the picture will be brighter than the lower half. In the long run such statistical knowledge can be used to restrict, for example, the waveband used for television transmission, though this is hardly as yet a practical possibility. As a first step let us assume that we know that the transmission coefficient of the picture in the $i$-th logon will have a probability distribution

$$p_i(\tau_i')\, d\tau_i' \qquad (0 < \tau_i' < 1).$$

Treating these transmissions just as we have treated the a priori known $\tau_i$ we can at once write down the expectation value of $\log P$, or what is the same, its mean value in the long run

$$\overline{\log P} = \tfrac12 \sum_{i=1}^{F} \int_0^1 p_i(\tau_i')\, \log \frac{2\pi e N \tau_i'}{F(1 + 2 n_T)}\, d\tau_i'. \qquad (8)$$
This is the answer, but it is neither convenient in form, nor general enough from the point of view of communication theory. Eq. (8) is based on the assumption that the best possible analysis has been applied to every logon, so that patterns which are different only in a single step in a single logon will be recognized as distinguishable. But this, evidently, need not be the case; in general it will depend on the “context”, on the rest of the pattern, whether a difference in one logon, distinguishable in itself, will make a distinguishable change in the whole. Shannon's analysis is a short cut through these difficulties. In his theory the contributions of the different degrees of freedom are not considered separately, but only the global effect, the “signal”, or in our case the pattern. Let there be altogether $M$ distinguishable patterns numbered $1 \ldots k \ldots M$. Consider now an ensemble of $N$ such patterns, in which the patterns $1 \ldots k \ldots M$ occur $n_1 \ldots n_k \ldots n_M$ times, so that

$$\sum_{k=1}^{M} n_k = N. \qquad (9)$$

The probability of such an ensemble, $P_s$, may be defined as the number of ways in which it can be realised. This is

$$P_s = \frac{N!}{n_1! \ldots n_k! \ldots n_M!},$$

and its logarithm, using Stirling's formula, is

$$\log P_s = N \log N - \sum_{k=1}^{M} n_k \log n_k.$$
Now let $N$ go to infinity (or take a grand ensemble of ensembles), and define the “selective entropy” $S_s$ as the mean value of the limit of $(\log P_s)/N$. The ratios $n_k/N$ are assumed to approach beyond any limit certain values $p_k$, called the probabilities. Thus, as $\sum p_k = 1$,

$$S_s = \lim \frac{\log P_s}{N} = \lim \frac{1}{N}\Big[N \log N - \sum n_k \log n_k\Big] = -\sum p_k \log p_k. \qquad (11)$$

This is Shannon's well known formula for the “entropy of a source” (considering the changing pattern as the source of information). It is also called the “selective value of information”. If all probabilities were equal, $p_k = 1/s$, the expression reduces to $\log s$, and one can say that the expectation value of the information is equal to that of a selection between $s$ equally probable cases if

$$\log s = -\sum p_k \log p_k.$$
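The passage from the counting of ensembles to the selective entropy can be followed numerically. The sketch below compares $(\log P_s)/N$, computed exactly from the factorials by means of the log-gamma function, with $-\sum p_k \log p_k$; the function names and the sample frequencies are assumptions made for the illustration.

```python
import math


def selective_entropy(probs):
    """Selective entropy -sum p_k log p_k (natural logarithms)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def log_count_per_pattern(counts):
    """(1/N) log [N! / (n_1! ... n_M!)], evaluated without Stirling's formula."""
    n = sum(counts)
    log_p = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_p / n


counts = [500000, 250000, 250000]                  # an ensemble of N = 10^6 patterns
probs = [c / sum(counts) for c in counts]
exact = log_count_per_pattern(counts)
limit = selective_entropy(probs)
print(exact, limit)
```

For an ensemble of a million patterns the two values already agree to a few parts in a hundred thousand, the remaining difference being of the order $(\log N)/N$.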
It may be useful to point out once more the difference between this “selective entropy” and the physical entropy of Boltzmann and Planck:

Physical entropy of a system is $S = k \log P$, where $P$ is the number of ways in which energy, up to a certain prescribed limit, can be distributed over the degrees of freedom of the system, taking into account the a priori given restrictive conditions.

Selective entropy of a series is the mean value of $(\log P)/N$, if $P$ is the number of ways in which any given distribution of distinguishable patterns can be realised in $N$ repetitions, $N$ tending to infinity, and the relative frequency of these patterns tending to certain asymptotic values $p_k$.

Those who meet the subject for the first time will hardly fail to notice that the “selective value of information” has little to do with what one is inclined to consider as the value of information in everyday life. The selective information is an expected value, not the value of information after it has been received. But one can say that in everyday life one values information not so much for its unexpectedness but rather for its usefulness in foreseeing the future. It may be shown, in a simple example, that this “prediction value of information” can be also cast into a form similar to Shannon's. Let us consider a “Markoff chain”, i.e. a series in which the probabilities in one event depend only on the last, but not on those before the last. Assume that the event “i” has happened, and that $p_{ki}$ is the probability for the event “k” to follow “i”. We can now define

$$\sum_k p_{ki} \log p_{ki}$$

as the predictive value of the information “the event i has happened”. This is a negative quantity; its largest value is zero, corresponding to absolute certainty regarding the next event. Mathematically this has the form of the “conditional entropies” discussed by SHANNON [1948].

Finally it may be pointed out that though Shannon's selective entropy is at an appreciable remove from the physical entropy, the two are connected by Szilárd's theorem, which can be expressed in the most general form as follows: In order to make repeated observations whose selective entropy is $S_s$, in the long run the physical entropy of the system including the observer must be increased by at least $k S_s$ per observation.
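The predictive value defined above for a Markoff chain lends itself to a very short numerical sketch. It assumes the form $\sum_k p_{ki} \log p_{ki}$ for the probabilities of the events which may follow “i”; the function name and the sample transition probabilities are hypothetical.

```python
import math


def predictive_value(next_event_probs):
    """sum_k p_ki log p_ki over the events k which may follow the event i."""
    return sum(p * math.log(p) for p in next_event_probs if p > 0.0)


certain = predictive_value([1.0, 0.0])      # the next event is known for certain
even_odds = predictive_value([0.5, 0.5])    # the event i tells nothing about the next
print(certain, even_odds)
```

The first case gives zero, the largest possible value, while the second gives $-\log 2$, the lowest value attainable with two alternatives.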
References

BOHR, N. and L. ROSENFELD, 1933, Mat. Fys. Medd. Dan. Vid. Selsk. 12, No. 8; 1950, Phys. Rev. 68, 794.
BOOKER, H. G., J. A. RATCLIFFE and D. H. SHINN, 1950, Phil. Trans. Roy. Soc. A 242, 579.
BORN, M., 1949, Ann. Inst. Henri Poincaré, Paris 11, 1.
DE BROGLIE, L., 1947, Reprinted in Optique Electronique et Corpusculaire (Hermann & Cie, Paris, 1950) p. 227.
DUFFIEUX, P. M., 1950, Réunion d'opticiens, Ed. Rev. d'Optique, Paris, 1950 lists the works of this author between 1935 and 1938.
EDDINGTON, A., Sir, 1939, The Philosophy of Physical Sciences (Cambridge).
EINSTEIN, A., 1905, Ann. d. Phys. [4] 17, 132.
GABOR, D., 1946, Journ. I.E.E. 93, III, 429; Ibid., 1947, 94, III, 369; 1949, Proc. Roy. Soc. A 197, 454; 1950, Phil. Mag. [7] 41, 1161; Nature 166, 724; 1951, Proc. Phys. Soc. B 64, 449.
HEITLER, W., 1944, The Quantum Theory of Radiation (Oxford, 2d Ed.).
v. LAUE, M., 1914, Ann. Physik [4] 44, 1197; Ibid. [4] 48, 668.
MACKAY, D. M., 1950, Phil. Mag. [7] 41, 189.
MOTT, N. F. and H. S. W. MASSEY, 1949, Theory of Atomic Collisions (Oxford).
NYQUIST, H., 1928, Phys. Rev. 32, 753.
OSWALD, J., 1949, C. R. Acad. Sci. Paris 229, 21.
PLANCK, M., 1924, Ber. Preuss. Akad. Wiss. Berlin 24, 442.
VAN DER POL, B., 1950, U.R.S.I. report, Geneva, unpublished.
SHANNON, C. E., 1948, Bell Syst. T. J. 27, 379, 623, reprinted in SHANNON, C. E. and W. WEAVER, 1949, The Math. Theor. of Comm., Urbana, Illinois; 1949, Proc. I.R.E. 37, 10.
SMOLUCHOWSKI, M. v., 1912, Phys. Zeitschr. 13, 1069; Ibid. 14, 261.
SOMMERFELD, A., 1950, Vorlesungen über theoret. Physik, Bd. IV, Optik (Dieterich, Wiesbaden).
SZILARD, L., 1925, Z. Physik 32, 753; 1929, Ibid. 53, 840.
WHITTAKER, E. T., 1915, Univ. of Edinburgh, Math. Dept. Res. Paper No. 8.
WIENER, N., 1949, “Stationary Time Series” and “Cybernetics” (Chapman & Hall).
ON BASIC ANALOGIES AND PRINCIPAL DIFFERENCES BETWEEN OPTICAL AND ELECTRONIC INFORMATION

BY

HANS WOLTER
Institut für angewandte Physik der Universität Marburg (Lahn), Germany

CONTENTS

§ 1. INTRODUCTION
§ 2. ANALOGIES BETWEEN TRANSMISSION LINES IN ELECTRONICS AND LAYER SYSTEMS IN OPTICS
§ 3. ANALOGIES BETWEEN OPTICAL AND HERTZIAN WAVES
§ 4. THE PSEUDOANALOGY BETWEEN TIME AND COORDINATE, OR FREQUENCY AND DIRECTION VARIABLE
REFERENCES
§ 1. Introduction

Various analogies are known to exist between optics and electronics. They are essentially related to three classes of phenomena, clearly distinguishable from one another. The first class concerns the optics of parallel layer systems and its analogy to transmission lines in series or series of four terminal networks (§ 2). The modern optical interference filters are above all in analogy to transmission and quadrupole filters; further, the methods of reducing optical reflexion are analogous to the procedures for reflexion-free adaptation of transmission lines and quadrupoles. But this analogy encounters limitations in several directions. The most important one seems to be the different rôle stimulation plays at the filter input where it is decisive for all practical purposes. In optics the incident wave alone is almost always considered as the input stimulation; it is referred to in the definition of the transmission coefficient. As a rule, in four terminal network technique the whole input voltage or input current simply serves as input stimulation; a separation into an incident wave and a reflected wave is rarely of any interest. This difference in conception rests on the fact that in optics there is no great difficulty if a reflected wave is to be separated from an incident wave; this is based on the possibility in optics of “oblique incidence” to which there is no analogous counterpart in electronics. This additional optical degree of freedom, enhanced by the possibility of two polarisations, makes it easier to obtain the transmission line and four terminal network theory by means of specialisation and analogies from optics than vice versa. A third limit of analogy is the departure from transversality in transmission lines in contrast to optical waves of perpendicular incidence.
Transition from one optical medium into an adjacent one is not completely analogous to transition from one transmission line to a second transmission line of different cross-section, as alteration of cross-section always causes the building of non-transverse fields.
This third defect of the analogy is also peculiar to hollow tube conductors. But the first two analogy limits disappear in increasing measure, when one transfers from usual transmission lines to hollow tube transmission lines and finally to electromagnetic radiation (§ 3). As light is likewise electromagnetic radiation, this (second) analogy would approach identity if the difference of wave lengths and thus of quantum energies did not necessarily create large differences in all questions about the limits of the accuracy of measurement. For, with respect to these questions, the statistics of the quanta plays the decisive part, and not the information theoretical basic theorem. That is proved by the optical minimum-ray-characteristic, which can partly be explained as an analogy of radio direction finding procedures. Information theory places optical imaging in two ways in analogy with electronics. Apart from the analogy between light and electromagnetic radiation treated in § 3, there also exists a formally very close analogy between the optical image of a function of position by means of an optical system of restricted aperture on the one hand and the electronic distortion of a time function serving as a communication through a communication channel of limited bandwidth on the other hand (§ 4). But here it is a question of pseudo-analogy as time and coordinate do not prove to be sufficiently analogous; this analogy limit is conditioned by the principle of causality “no effect before its cause” and makes necessary, in optics, substantially deeper mathematical investigations than in electronics if the validity of the basic theorems of information theory is being examined. Nevertheless the results are related in both cases; for in optics as well as in electronics one can determine the object - the original communication - as exactly as one wishes from a sufficiently exact measurement of the “image”, in contrast to the basic theorems of information theory.
Though we do not wish to minimize the importance of analogies for mutual stimulation of optical and electronic research, it seemed appropriate to point out the limits rather than to force analogies where they do not exist in nature. Consequently, the terminology appropriate to each sphere was retained, as any attempt at uniformity might have led to distortion of the representation. Generally accepted symbols have been retained as far as possible in order to facilitate comparison with references. This, however, leads to some inconsistency; for instance the symbols $G_l$ and $g_l$ have different meanings in § 2 from that attached to the symbols $G$ and $g$ in § 3 and § 4.
§ 2. Analogies between Transmission Lines in Electronics and Layer Systems in Optics

2.1. THE GENERAL WAVE-ANALOGY
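The analogy between laminated media of optics on the one hand and transmission lines in series of electronics on the other hand rests on the similarity of the solution

$$\mathbf{E}(\mathbf{r};t) = \mathbf{E}_0 \exp\left(i\omega t - i\omega\,\frac{n}{c}\,\mathbf{r}\cdot\mathbf{s}\right), \tag{2.1}$$

$$\mathbf{H}(\mathbf{r};t) = \mathbf{H}_0 \exp\left(i\omega t - i\omega\,\frac{n}{c}\,\mathbf{r}\cdot\mathbf{s}\right) \tag{2.2}$$

of the Maxwell equations for a plane wave, which travels in the direction of the unit vector $\mathbf{s}$, and of the solution

$$U(z;t) = U_0 \exp(i\omega t - \Gamma z), \tag{2.3}$$

$$J(z;t) = J_0 \exp(i\omega t - \Gamma z), \tag{2.4}$$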
which comes from the equation of telegraphy likewise derived from the Maxwell equations. $\mathbf{E}$ and $\mathbf{H}$ respectively denote the complex electric and magnetic field strengths in Gaussian units. $U$ and $J$ are complex voltages and currents, $\omega$ is the angular frequency, $i$ the imaginary unit, $\mathbf{r}$ a position vector, $z$ a coordinate, $t$ the time and $c$ the light velocity in vacuum. $n$ is the complex refractive index in the optical case which, through its real and imaginary parts, determines the phase and amplitude relations of the wave. In the same way this will, in the electronic case, be determined by

$$\Gamma = B + iA = \sqrt{(R + i\omega L)(G + i\omega C)}; \tag{2.5}$$

here $A$ is the phase measure and $B$ the damping of the transmission line, $\Gamma$ represents the “propagation constant”, $R$ is the longitudinal resistance per cm along the transmission line, $G$ the cross conductivity, $L$ the inductance and $C$ the capacitance, all per cm of the transmission line.

$$Z = \sqrt{\frac{R + i\omega L}{G + i\omega C}} \tag{2.6}$$

is the surge impedance of the transmission line, which in the case of a pure progressive wave gives the relation

$$U = ZJ$$
for the coupling between voltage and current. This, in the optical analogy, corresponds to the coupling of an electric and a magnetic field component in the following manner:

$$E_\parallel = \frac{\mu}{n}\, H_\perp;$$

here $\mu$ is the magnetic permeability; the field strengths themselves, however, are not analogous to current and voltage. The boundary conditions for the transition between two conductors require continuity of both current and voltage. The analogous boundary conditions are valid only for the tangential components of the fields, not for the field strengths themselves. Only these tangential components are subject to the condition of continuity. The analogy, therefore, can be made consistent only for boundary surfaces all parallel to one another.
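As a numerical illustration of the propagation constant (2.5) and of the surge impedance of a line, the following minimal Python sketch evaluates both from the line constants. The function name `line_constants` and the sample values (a lossless line with L = 2.5e-9 H/cm and C = 1e-12 F/cm at 100 MHz) are assumptions chosen for illustration only; they do not come from the text.

```python
import cmath
import math

def line_constants(R, L, G, C, omega):
    """Return (Gamma, Z) for per-cm line constants R, L, G, C at angular frequency omega."""
    series = complex(R, omega * L)        # R + i*omega*L
    shunt = complex(G, omega * C)         # G + i*omega*C
    gamma = cmath.sqrt(series * shunt)    # eq. (2.5): Gamma = B + iA
    z = series / gamma                    # surge impedance sqrt(series/shunt), same branch as gamma
    return gamma, z

# Hypothetical lossless line (illustrative values, not from the text)
omega = 2 * math.pi * 1e8
gamma, z = line_constants(0.0, 2.5e-9, 0.0, 1.0e-12, omega)
damping_B, phase_A = gamma.real, gamma.imag
```

For a lossless line the damping B vanishes and the surge impedance reduces to the real value sqrt(L/C), here 50 ohm; with losses both quantities become complex, just as the refractive index does in an absorbing optical medium.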
Fig. 2.1a, b. Layer system in optics and conduction line in electronics

2.2. THE ANALOGY RELATIONS

To a transmission line of several homogeneous line elements in series (Fig. 2.1b) the optical analogy is a system of parallel layers of homogeneous media (Fig. 2.1a).
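Suppose a wave enters the m-th medium. In each medium an incident wave

$$\mathbf{E}_l^{i}(\mathbf{r}) = \mathbf{E}_l^{i} \exp\left(-\,\frac{i\omega (x \sin\varphi_l - z \cos\varphi_l)\, n_l}{c}\right), \tag{2.7}$$

$$\mathbf{H}_l^{i}(\mathbf{r}) = \mathbf{H}_l^{i} \exp\left(-\,\frac{i\omega (x \sin\varphi_l - z \cos\varphi_l)\, n_l}{c}\right) \tag{2.8}$$

and a “reflected wave”

$$\mathbf{E}_l^{r}(\mathbf{r}) = \mathbf{E}_l^{r} \exp\left(-\,\frac{i\omega (x \sin\varphi_l + z \cos\varphi_l)\, n_l}{c}\right), \tag{2.9}$$

$$\mathbf{H}_l^{r}(\mathbf{r}) = \mathbf{H}_l^{r} \exp\left(-\,\frac{i\omega (x \sin\varphi_l + z \cos\varphi_l)\, n_l}{c}\right) \tag{2.10}$$

for $l = 0, \ldots, m$ are admitted. In the last, the 0-th medium, the reflected wave disappears: $\mathbf{E}_0^{r} = \mathbf{H}_0^{r} = 0$. We assume that the incident wave $\mathbf{E}_m^{i}$, $\mathbf{H}_m^{i}$ in the m-th medium is known. The time factor $\exp(i\omega t)$ is implied in the quantities $\mathbf{E}_l^{i}$, $\mathbf{E}_l^{r}$, $\mathbf{H}_l^{i}$, $\mathbf{H}_l^{r}$. The angles $\varphi_l$ obey the Snell sine condition

$$n_0 \sin\varphi_0 = n_1 \sin\varphi_1 = \ldots = n_l \sin\varphi_l = \ldots = n_m \sin\varphi_m.$$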
When looking for the optical analogy of the surge impedance it is necessary, because of polarisation, to distinguish two cases. The linear combination of the corresponding solutions will then result in the most general solution.
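A. Transverse-E-waves

In the case of TE-waves, the electric field is perpendicular to the reference plane of Fig. 2.1a. Sufficiently characteristic for every wave of this kind are then $E_{ly}^{i}$ and $E_{ly}^{r}$, since the x- and z-components disappear and the magnetic field strength can be computed from the $E_y$ by means of the Maxwell equation

$$\operatorname{curl}\mathbf{E} = -\,\frac{i\omega\mu}{c}\,\mathbf{H}, \tag{2.11}$$

i.e.

$$\frac{\partial E_y}{\partial z} = \frac{i\omega\mu}{c}\, H_x. \tag{2.12}$$

For the incident and reflected waves there follow, by means of differentiation with respect to z in the equations (2.7) and (2.9), the relations

$$\frac{i\omega H_{lx}^{i}\,\mu_l}{c} = \frac{i\omega \cos\varphi_l\, E_{ly}^{i}\, n_l}{c}, \tag{2.13}$$

$$\frac{i\omega H_{lx}^{r}\,\mu_l}{c} = -\,\frac{i\omega \cos\varphi_l\, E_{ly}^{r}\, n_l}{c}. \tag{2.14}$$

Hence one obtains the coupling relations

$$H_{lx}^{i} = g_l E_{ly}^{i}, \tag{2.15}$$

$$H_{lx}^{r} = -\,g_l E_{ly}^{r} \quad \text{for } l = 0, \ldots, m, \tag{2.16}$$

with

$$g_l = \frac{n_l \cos\varphi_l}{\mu_l}. \tag{2.17}$$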
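B. Transverse-H-waves

In the case of TH-waves, the magnetic field is perpendicular to the reference plane of Fig. 2.1a. Sufficiently characteristic of every wave of this kind then are $H_{ly}^{i}$ and $H_{ly}^{r}$, as the x- and z-components vanish and the electric field strength can be computed from the $H_y$ by means of the Maxwell equation

$$\mu \operatorname{curl}\mathbf{H} = 4\pi\sigma\,\frac{\mu}{c}\,\mathbf{E} + i\omega\,\frac{\varepsilon\mu}{c}\,\mathbf{E} = i\omega\,\frac{n^2}{c}\,\mathbf{E}, \tag{2.18}$$

i.e.

$$-\,\mu\,\frac{\partial H_y}{\partial z} = i\omega\,\frac{n^2}{c}\, E_x. \tag{2.19}$$

$\sigma$ is the specific conductivity of the medium. For incident and reflected waves there follow, by means of differentiation with respect to z in the equations (2.8) and (2.10), the relations

$$-\,\frac{i\omega \mu_l \cos\varphi_l\, H_{ly}^{i}\, n_l}{c} = \frac{i\omega\, n_l^2\, E_{lx}^{i}}{c}, \tag{2.20}$$

$$\frac{i\omega \mu_l \cos\varphi_l\, H_{ly}^{r}\, n_l}{c} = \frac{i\omega\, n_l^2\, E_{lx}^{r}}{c}. \tag{2.21}$$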
This gives the coupling relations

$$E_{lx}^{i} = -\,g_l H_{ly}^{i}, \tag{2.22}$$

$$E_{lx}^{r} = g_l H_{ly}^{r}, \tag{2.23}$$

$$g_l = \frac{\mu_l \cos\varphi_l}{n_l} \tag{2.24}$$

for $l = 0, \ldots, m$. To extend the analogy, the coupling relations (2.15), (2.16), (2.22) and (2.23) must be compared with the couplings between current and voltage of the conduction theory. The transmission line according to Fig. 2.1b permits an “incident” wave in the l-th line element

$$U_l^{i} = U_{l0}^{i} \exp(\Gamma_l z), \tag{2.25}$$

$$Z_l J_l^{i} = U_{l0}^{i} \exp(\Gamma_l z), \tag{2.26}$$

and a “reflected” wave

$$U_l^{r} = U_{l0}^{r} \exp(-\Gamma_l z), \tag{2.27}$$

$$Z_l J_l^{r} = -\,U_{l0}^{r} \exp(-\Gamma_l z); \tag{2.28}$$

obviously these satisfy the coupling conditions

$$U_l^{i} = Z_l J_l^{i}; \qquad U_l^{r} = -\,Z_l J_l^{r}. \tag{2.29}$$
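For the TE-wave we can draw an analogy between

$$U_l^{i} \leftrightarrow E_{ly}^{i}, \tag{2.30}$$

$$J_l^{i} \leftrightarrow H_{lx}^{i}, \tag{2.31}$$

$$U_l^{r} \leftrightarrow E_{ly}^{r}, \tag{2.32}$$

$$J_l^{r} \leftrightarrow H_{lx}^{r}. \tag{2.33}$$

Then the surge impedance is

$$Z_l \leftrightarrow \frac{\mu_l}{n_l \cos\varphi_l} \tag{2.34}$$

analogous in the case of TE-waves. With equal application of a right handed system for $\mathbf{E}$, $\mathbf{H}$ and the direction of propagation, one has to set in analogy for the TH-waves

$$U_l^{i} \leftrightarrow -E_{lx}^{i}, \tag{2.35}$$

$$J_l^{i} \leftrightarrow H_{ly}^{i}, \tag{2.36}$$

$$U_l^{r} \leftrightarrow -E_{lx}^{r}, \tag{2.37}$$

$$J_l^{r} \leftrightarrow H_{ly}^{r}. \tag{2.38}$$

Then the surge impedance is

$$Z_l \leftrightarrow \frac{\mu_l \cos\varphi_l}{n_l} \tag{2.39}$$

analogous for TH-waves. The optical values $p_l$, introduced in the Encyclopedia of Physics, vol. 24, p. 467, are

$$p_l = i\omega\,\frac{n_l}{c}\cos\varphi_l \leftrightarrow \Gamma_l, \tag{2.40}$$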
analogous for both TE and TH waves, as is shown by the comparison of the eqs. (2.7) to (2.10) with (2.25) to (2.28). The x-dependence of the optical waves has, of course, an analogy only for the fields in the transmission lines, not for the “integral” voltages and currents. A fixed x, for instance x = 0, must therefore be considered.

2.3. THE FOUR TERMINAL MATRIX FOR OPTICAL WAVES IN THE LAYER SYSTEM AND ITS GENERAL ANALOGY TO THE WAVES IN SYSTEMS OF SERIES CIRCUITS CONSISTING OF HOMOGENEOUS TRANSMISSION LINES AND FOUR TERMINAL NETWORKS

A homogeneous line element transforms voltage and current according to eqs. (2.25) to (2.28) from the point 0 to the point z (Fig. 2.1b) according to the relations

$$U(z) = U^{i}(z) + U^{r}(z) = U_0^{i} \exp(\Gamma z) + U_0^{r} \exp(-\Gamma z), \tag{2.41}$$

$$Z J(z) = Z\big(J^{i}(z) + J^{r}(z)\big) = U_0^{i} \exp(\Gamma z) - U_0^{r} \exp(-\Gamma z), \tag{2.42}$$

$$U(0) = U_0^{i} + U_0^{r}, \tag{2.43}$$

$$Z J(0) = U_0^{i} - U_0^{r}. \tag{2.44}$$

From the last two equations it follows that

$$U_0^{i} = \tfrac{1}{2}\big(U(0) + Z J(0)\big), \tag{2.45}$$

$$U_0^{r} = \tfrac{1}{2}\big(U(0) - Z J(0)\big). \tag{2.46}$$

By substituting these in (2.41) and (2.42) we obtain

$$U(z) = U(0) \cosh(\Gamma z) + J(0)\, Z \sinh(\Gamma z), \tag{2.47}$$

$$J(z) = \frac{U(0)}{Z} \sinh(\Gamma z) + J(0) \cosh(\Gamma z). \tag{2.48}$$

If the starting point is not the origin but another point $z_0$, then the relations

$$U(z) = U(z_0) \cosh\big(\Gamma(z - z_0)\big) + J(z_0)\, Z \sinh\big(\Gamma(z - z_0)\big), \tag{2.49}$$

$$J(z) = \frac{U(z_0)}{Z} \sinh\big(\Gamma(z - z_0)\big) + J(z_0) \cosh\big(\Gamma(z - z_0)\big) \tag{2.50}$$

hold. Thus the l-th line element in Fig. 2.1b transforms the values at its lower end $z_l$ to its higher end $z_{l+1}$ according to the formulae

$$U(z_{l+1}) = U(z_l) \cosh(\Gamma_l d_l) + J(z_l)\, Z_l \sinh(\Gamma_l d_l), \tag{2.51}$$

$$J(z_{l+1}) = \frac{U(z_l)}{Z_l} \sinh(\Gamma_l d_l) + J(z_l) \cosh(\Gamma_l d_l). \tag{2.52}$$
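The transformation matrix

$$\|T_l\| = \begin{pmatrix} \cosh(\Gamma_l d_l) & Z_l \sinh(\Gamma_l d_l) \\ \dfrac{1}{Z_l}\sinh(\Gamma_l d_l) & \cosh(\Gamma_l d_l) \end{pmatrix} \tag{2.53}$$

has the optical analogy (with $d_l' = d_l \cos\varphi_l$)

$$\|T_l^{E}\| = \begin{pmatrix} \cosh(i\omega n_l d_l'/c) & \dfrac{\mu_l}{n_l \cos\varphi_l}\sinh(i\omega n_l d_l'/c) \\ \dfrac{n_l \cos\varphi_l}{\mu_l}\sinh(i\omega n_l d_l'/c) & \cosh(i\omega n_l d_l'/c) \end{pmatrix} \tag{2.54}$$

for TE-waves and

$$\|T_l^{H}\| = \begin{pmatrix} \cosh(i\omega n_l d_l'/c) & \dfrac{\mu_l \cos\varphi_l}{n_l}\sinh(i\omega n_l d_l'/c) \\ \dfrac{n_l}{\mu_l \cos\varphi_l}\sinh(i\omega n_l d_l'/c) & \cosh(i\omega n_l d_l'/c) \end{pmatrix} \tag{2.55}$$

for TH-waves. The layer of the l-th medium is a symmetric four terminal network just as a homogeneous line element. The components of the field strength $E_y = E_y^{i} + E_y^{r}$ and $H_x = H_x^{i} + H_x^{r}$ at the point $z_l$ can be computed from those at the point $z_{l+1}$ by means of the formulae (2.56) and (2.57) for TE-waves. For TH-waves we have to substitute $1/\cos\varphi_l$ for $\cos\varphi_l$. Both matrices (2.54) and (2.55) fulfil the reciprocity principle, viz. the determinant of the matrix is equal to 1. This is here fulfilled in a trivial way because $\cosh^2(\Gamma_l d_l) - \sinh^2(\Gamma_l d_l) = 1$. The matrix which represents the transition of the values from point $z_{l+1}$ to that of point $z_l$ is therefore

$$\|T_l\|^{-1} = \begin{pmatrix} \cosh(\Gamma_l d_l) & -Z_l \sinh(\Gamma_l d_l) \\ -\dfrac{1}{Z_l}\sinh(\Gamma_l d_l) & \cosh(\Gamma_l d_l) \end{pmatrix} \tag{2.58}$$

and corresponding equations are valid for $\|T_l^{E}\|^{-1}$ and $\|T_l^{H}\|^{-1}$.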
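The properties of the four terminal matrix stated above can be checked numerically. The following Python sketch is only an illustration: it writes the matrix of a line element with hyperbolic functions, as the exponentials of (2.41) to (2.46) require, and verifies that the determinant is 1, that the inverse is obtained by reversing the sign of the length, and that the matrix reproduces the sum of incident and reflected waves. The helper names and all numerical values are hypothetical.

```python
import cmath

def line_matrix(gamma, z, d):
    """Matrix taking (U, J) at z_l to (U, J) at z_l + d, cf. eqs. (2.51) to (2.53)."""
    c, s = cmath.cosh(gamma * d), cmath.sinh(gamma * d)
    return [[c, z * s], [s / z, c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical lossy line element (illustrative values only)
gamma, z, d = complex(0.02, 0.3), complex(50.0, -1.0), 2.0
T = line_matrix(gamma, z, d)
T_inv = line_matrix(gamma, z, -d)      # reversing d gives the inverse, cf. eq. (2.58)
identity = matmul(T, T_inv)

# Cross-check against the wave decomposition of eqs. (2.41), (2.45), (2.46)
U0, J0 = complex(1.0, 0.0), complex(0.01, 0.005)
Ui, Ur = 0.5 * (U0 + z * J0), 0.5 * (U0 - z * J0)
U_waves = Ui * cmath.exp(gamma * d) + Ur * cmath.exp(-gamma * d)
U_matrix = T[0][0] * U0 + T[0][1] * J0
```

Several elements in series are then described by repeated calls of `matmul`, and the optical layer matrices follow by replacing the product of propagation constant and length by i*omega*n*d'/c and the surge impedance by the analogue given in (2.34) or (2.39).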
If several line elements are connected in series as in Fig. 2.1 this is mathematically described by the multiplication of matrices:

$$\begin{pmatrix} U_m \\ J_m \end{pmatrix} = \left(\prod_{l=1}^{m-1} \|T_l\|\right) \begin{pmatrix} U_0 \\ J_0 \end{pmatrix}. \tag{2.59}$$

For optical TE-waves the corresponding equation is

$$\begin{pmatrix} E_{my} \\ H_{mx} \end{pmatrix} = \left(\prod_{l=1}^{m-1} \|T_l^{E}\|\right) \begin{pmatrix} E_{0y} \\ H_{0x} \end{pmatrix} \tag{2.60}$$

and the equivalent for TH-waves. By means of this analogy the results of the transmission line theory are related to the optics of plane-parallel layers. The next section shows, however, the narrow limits of the analogy.

References: HLUCKA [1926], SCHUSTER [1949], HÜBNER [1950], ABELÈS [1950], WOLTER [1956a].

2.4. LIMITS OF THE ANALOGY CAUSED BY DIFFERENCES OF DIMENSIONAL MULTIPLICITY
The drawing of an analogy between the surge impedance and

$$Z \leftrightarrow \frac{\mu}{n \cos\varphi} \quad \text{for the TE-waves} \tag{2.61}$$

and

$$Z \leftrightarrow \frac{\mu \cos\varphi}{n} \quad \text{for the TH-waves} \tag{2.62}$$

shows that it is not possible to introduce an analogy free from contradiction for the two polarisations simultaneously or for natural light unless

$$\cos\varphi = 1, \tag{2.63}$$

i.e. unless the incidence is normal. The transmission lines permit transmitted waves of a single “polarisation” and of a single “direction” of the multiplicity of a set

$$\{Z;\, -Z\}. \tag{2.64}$$

An optical wave packet even of a very small but finite cone aperture represents, even in the case of a single polarisation, the multiplicity (2.65); to an element of the set (2.64) in the case of transmission lines there corresponds a two-dimensional continuum in optics. This difference poses many interesting optical problems which have no counterpart in the so-called “two conductor technique” - as distinguished from the “hollow conductor technique” - e.g. the “polarisation divisor”. At the same time, the transmission line engineer may well ignore many difficulties which the optician has to overcome if he is to find technical solutions of a problem of finite apertures, e.g. when reflection is to be extinguished.

2.5. LIMITS OF THE ANALOGY BECAUSE OF THE CONDITION OF VIOLATION OF TRANSVERSALITY OF THE TWO-CONDUCTOR SYSTEM
The solutions of the Maxwell equations of optical layer systems are accurate to a higher degree than the solutions of the equation of telegraphy for two conductor systems, because the latter presuppose, in the same way as the equation of telegraphy itself, the transversality of the fields. This condition is, at best approximately, fulfilled when the conductor separation is small compared with the wave length and if we can disregard the transversality error at the junction as well as the cross-sectional variations. An example illustrating this difference will be given in § 2.6.

2.6. LIMITS OF THE ANALOGY BECAUSE OF THE DIFFERENT ROLE OF REFLECTION
If we disregard the analogy limits discussed in § 2.4 and § 2.5, permitting only perpendicular incidence, a further restriction on the analogy will nevertheless remain.

The solution of the problem of transmission lines, formulated in § 2.2, represents the entire voltage and the entire current at the line beginning as a function of the two quantities at the line end. The portions composed of incident and reflected waves have been added. That only this total voltage and total current are of interest today in technical transmission line problems is the result of the practical use of four terminal networks, and of measuring methods based on them. In optics, on the other hand, the most common applications are those in which reflected and incident waves are separated. This separation is based on the possibility of deflecting the reflected wave to one side by giving the layer system a minute inclination. It can also be done in the case of almost perpendicular incidence. In the end, it is based on the difference in nature mentioned in § 2.3. This separation by means of radiation divisors (half-transparent mirrors) has for a long time been successfully performed even in cases of strictly normal incidence. In transmission line techniques the analogous problem has only recently appeared and has been solved by means of direction-couplers and similar methods. Therefore the conduction technique can today fall back on the solutions of the optical layer problem in the case of separation. This solution is related to, but generally not identical with, the solution discussed in § 2.3. It is not necessary to discuss it here since we can find it in detail in the author’s article in the Encyclopedia of Physics, vol. 24, p. 472. With the analogy relations of § 2.2, the transfer of the results to the transmission line theory is trivial. We shall give here only some examples.

2.7. EXAMPLES OF ANALOGY BETWEEN LAYER OPTICS AND CONDUCTION THEORY
The difference, emphasized in § 2.5, between layer optics and transmission line theory is important in the case of interference filters. Certainly, in both cases the numerator of the “quotient of transmission” is always the amplitude at the output. In the denominator, however, it is the incident wave alone that is, in optics, inserted into the definition; in transmission line technique, as with four terminal networks, the total amplitude will be included. It is, therefore, convenient in the case of new problems for conduction filters which are analogous to optical problems, to quote transmission and reflection from the equations (10.12) to (10.14) of table 1 given in the article “Optik dünner Schichten” in the Encyclopedia of Physics, vol. 24, p. 471, and to adapt them to transmission line theory by means of the above-mentioned analogy relations.
Fig. 2.2. Interference filters in optics and in electronics
Fig. 2.2 shows some filter transmission curves obtained in this way, both for the optical layer filter (left) and for the conduction systems (right). For further details see Encyclopedia of Physics, vol. 24, p. 505 ff. Fig. 2.2a shows the transmission curve of an optical filter consisting essentially of a thin foil, silvered on both sides so as to be moderately permeable (GEFFCKEN [1941]). An entirely analogous transmission curve is shown by a Lecher line element (Fig. 2.2a right) which is in two places equipped with complex cross resistances of suitable measure. With three cross resistances or layers, it is a case of coupled eliminators with a two-peaked resonance curve; with four layers, i.e. three intervals, the transmission curve will be a three-peaked band filter curve (Fig. 2.2c). The illustrations on the right can also be interpreted as longitudinal sections across hollow tubes. With such tubes analogous filters for centimetre waves are constructed by means of a cross construction of complex resistances, often simply with screw pins or perforated walls. In the case of long waves, e.g. radio waves,
conductors or even hollow tubes would, of course, become too large; resonance circuits are used instead of cavity resonators; i.e. combinations of so-called concentrated impedance elements such as coils and condensers. If the definition of transmission usually applied to four terminal networks had been employed, the curves would differ from those of Fig. 2.2. The difference would be small at the transmission maxima; at places of little transmission it might grow considerably.
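The difference between the two definitions of transmission can be made concrete with a small numerical sketch. The Python example below is an illustration under assumed values, not a reproduction of Fig. 2.2: a single lossless line section of surge impedance 500 ohm is placed between two 50 ohm lines, and the output amplitude is referred once to the incident wave alone (the optical definition) and once to the total input voltage (the four terminal definition). The function name and the impedances are hypothetical.

```python
import cmath
import math

def transmissions(z_in, z1, z_out, theta):
    """Return (optical, four-terminal) transmission of a lossless section of phase length theta."""
    gd = 1j * theta                          # Gamma*d of a lossless section
    c, s = cmath.cosh(gd), cmath.sinh(gd)
    u_out = 1.0                              # amplitude at the output
    j_out = u_out / z_out                    # behind the filter only an outgoing wave exists
    u_in = c * u_out + z1 * s * j_out        # total input voltage, cf. eq. (2.51)
    j_in = s / z1 * u_out + c * j_out        # total input current, cf. eq. (2.52)
    u_incident = 0.5 * (u_in + z_in * j_in)  # incident part alone, cf. eq. (2.45)
    return u_out / u_incident, u_out / u_in

# Half-wave section: transmission maximum; quarter-wave section: transmission minimum
t_opt_max, t_4_max = transmissions(50.0, 500.0, 50.0, math.pi)
t_opt_min, t_4_min = transmissions(50.0, 500.0, 50.0, math.pi / 2)
```

At the maximum both definitions give the modulus 1, since no reflected wave is present; at the minimum the optical value is 1/5.05 while the four terminal value is 1/10, so the two curves separate precisely where the transmission is small.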
Fig. 2.3. Transmission free from reflection by means of one intermediate layer of λ/4 thickness
Examples where this difference is, of course, negligible are related to the extinction of reflection, as in these cases the formulation of the question unequivocally presupposes the separation of the incident and the reflected waves. The example for transition free from reflexion and loss from one transmission line of surge impedance $Z_0$ to another transmission line of surge impedance $Z_2$ (Fig. 2.3 right) by means of a connecting transmission line λ/4 long of the surge impedance

$$Z_1 = \sqrt{Z_0 Z_2} \tag{2.66}$$

is entirely analogous to the compensation or anti-reflection coating of optical systems (Fig. 2.3 left) given by Smakula. Here a λ/4-layer with a refractive index $n_1 = \sqrt{n_0 n_2}$ is inserted between two media of refractive indexes $n_0$ and $n_2$. This is, however, only valid for normal incidence. In the case of oblique incidence there follows because of analogy (2.34)

$$Z_k \leftrightarrow \frac{1}{n_k \cos\varphi_k} \tag{2.67}$$

(for $\mu_k = 1$) from eq. (2.66)

$$n_1 \cos\varphi_1 = \sqrt{n_0 n_2 \cos\varphi_0 \cos\varphi_2} \quad \text{for TE-waves.} \tag{2.68}$$

Because of the analogy (2.39), having for $\mu_k = 1$ the form

$$Z_k \leftrightarrow \frac{\cos\varphi_k}{n_k}, \tag{2.69}$$

we obtain from eq. (2.66)

$$\frac{n_1}{\cos\varphi_1} = \sqrt{\frac{n_0 n_2}{\cos\varphi_0 \cos\varphi_2}} \quad \text{for TH-waves.} \tag{2.70}$$
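The statements about the λ/4-layer can be verified with a short calculation. The Python sketch below is a hedged illustration: it uses the four terminal matrix of a single layer with the analogues (2.67) and (2.69) for the surge impedance (magnetic permeability 1) and evaluates the amplitude reflection for a layer whose phase thickness is a quarter wave. The function name, the indexes 1, 1.5 and 2.25, and the angle of 45 degrees are assumptions chosen for illustration.

```python
import cmath
import math

def reflection(n_inc, n1, n_exit, phi_inc, phase, pol):
    """Amplitude reflection of one layer; `phase` is omega*n1*d1*cos(phi1)/c."""
    s = n_inc * math.sin(phi_inc)                     # Snell invariant n*sin(phi)

    def g(n):
        cos_phi = cmath.sqrt(1 - (s / n) ** 2)
        # analogue of 1/Z for mu = 1: n*cos(phi) for TE, n/cos(phi) for TH
        return n * cos_phi if pol == "TE" else n / cos_phi

    g_inc, g1, g_exit = g(n_inc), g(n1), g(n_exit)
    c, sh = cmath.cosh(1j * phase), cmath.sinh(1j * phase)
    u_out, j_out = 1.0, g_exit                        # only an outgoing wave behind the layer
    u_in = c * u_out + sh / g1 * j_out
    j_in = g1 * sh * u_out + c * j_out
    return (u_in - j_in / g_inc) / (u_in + j_in / g_inc)

n_inc, n_exit = 1.0, 2.25
n1 = math.sqrt(n_inc * n_exit)                        # index of the matching layer
r_normal = reflection(n_inc, n1, n_exit, 0.0, math.pi / 2, "TE")
r_te = reflection(n_inc, n1, n_exit, math.radians(45), math.pi / 2, "TE")
r_th = reflection(n_inc, n1, n_exit, math.radians(45), math.pi / 2, "TH")
```

At normal incidence the reflection vanishes. At 45 degrees the same layer, even when its phase thickness is readjusted to a quarter wave, leaves a residual reflection of about 0.07 in amplitude for both polarisations, with opposite signs, because the geometric mean of the indexes satisfies neither (2.68) nor (2.70) there.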
The conditions (2.68) and (2.70) can be satisfied simultaneously only for normal incidence. Further consequences are described in the quoted article above in the Encyclopedia of Physics, p. 475. If it is not a question of extreme transmission but of freedom from reflection for wide frequency intervals, then the solution hitherto considered will be given up in favour of another solution of the problem. A boundary surface between the two media 0 and 2 with refractive indexes $n_0$ and $n_2$ can be made free from reflection for waves incident normally from medium 2 (Fig. 2.4a) by using a very thin (compared with wavelength) metallic intermediate layer of thickness $d_1 \ll \lambda$ and refractive index $n_1$ with

$$\operatorname{Re}(n_1^2) = n_0 n_2 \tag{2.71}$$

and

$$\operatorname{Im}(n_1^2) = (n_2 - n_0)\,\frac{\lambda}{2\pi d_1}. \tag{2.72}$$
This has been confirmed and discussed, with optical examples, in the article on pages 482 and 483 of the Encyclopedia. The analogue is a discontinuity of the surge impedance at a conduction line (Fig. 2.4b). If the surge impedance at the one side is $Z_2$ and at the other side is $Z_0 \neq Z_2$, we can remove the reflection for waves falling in from transmission line 2 by inserting a very short lossy line of length $d_1 \ll \lambda$ between the two lines. To find the analogous solution one cannot start from the equations (2.71) and (2.72) because the rôle that the refractive index $n$ plays in the analogy to $Z$ or to $p$ has been completely mixed up.

Fig. 2.4. Freedom from reflection for waves entering from medium 2 to medium 0 by means of very thin layers or lines involving losses

One has to go back to eq. (2.55) of the Encyclopedia article (p. 481) where it is shown that

$$\frac{d_1 p_1}{g_1} = \frac{1}{g_2} - \frac{1}{g_0} + \frac{d_1 p_1 g_1}{g_0 g_2}. \tag{2.73}$$

This according to the analogy relations (2.34) and (2.39) changes to

$$d_1 \Gamma_1 Z_1 = Z_2 - Z_0 + \frac{d_1 \Gamma_1 Z_0 Z_2}{Z_1}. \tag{2.74}$$

This formula becomes on transformation

$$d_1 \Gamma_1 Z_1 - \frac{d_1 \Gamma_1}{Z_1}\, Z_0 Z_2 = Z_2 - Z_0. \tag{2.75}$$

If the relations (2.5) and (2.6) between $\Gamma_1$ or $Z_1$ and the conduction variables $R_1$, $G_1$, $C_1$ and $L_1$ are taken into consideration, we have

$$d_1 (R_1 + i\omega L_1) - Z_2 Z_0\, d_1 (G_1 + i\omega C_1) = Z_2 - Z_0. \tag{2.76}$$

If a cross resistance $G_1 = 0$ is specially applied, we must demand that

$$\frac{L_1}{C_1} = Z_2 Z_0, \tag{2.77}$$

that is

$$\frac{L_1}{C_1} = \sqrt{\frac{L_2 L_0}{C_2 C_0}} \tag{2.78}$$

and

$$R_1 d_1 = Z_2 - Z_0. \tag{2.79}$$
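The one-sided, wide band character of this adaptation can be illustrated numerically. The Python sketch below is a minimal example under assumed values: lines of 50 ohm and 75 ohm are joined by a lossy section 0.1 cm long whose total series resistance equals the difference of the surge impedances and whose L/C equals the product of the two surge impedances, with vanishing cross conductivity. The function name and all numbers are hypothetical; the short section is treated exactly with the four terminal matrix rather than in the thin-layer approximation.

```python
import cmath
import math

def input_reflection(z_from, z_load, R, L, G, C, d, omega):
    """Reflection seen from a line z_from through a short section (R, L, G, C, d) into z_load."""
    series, shunt = complex(R, omega * L), complex(G, omega * C)
    gamma = cmath.sqrt(series * shunt)       # eq. (2.5)
    z1 = series / gamma                      # surge impedance of the section
    c, s = cmath.cosh(gamma * d), cmath.sinh(gamma * d)
    z_in = (z_load * c + z1 * s) / (z_load * s / z1 + c)
    return (z_in - z_from) / (z_in + z_from)

Z0, Z2, d1 = 50.0, 75.0, 0.1                 # ohm, ohm, cm (illustrative)
R1 = (Z2 - Z0) / d1                          # total series resistance = Z2 - Z0
C1 = 1.0e-12                                 # F/cm (assumed)
L1 = C1 * Z0 * Z2                            # L1/C1 = Z0*Z2
freqs = [1e6, 1e7, 1e8, 1e9]                 # three decades of frequency

from_2 = [input_reflection(Z2, Z0, R1, L1, 0.0, C1, d1, 2 * math.pi * f) for f in freqs]
from_0 = [input_reflection(Z0, Z2, R1, L1, 0.0, C1, d1, 2 * math.pi * f) for f in freqs]
```

For waves arriving from the 75 ohm line the reflection stays below one per cent in amplitude over the whole range, whereas for waves arriving from the 50 ohm line it remains near 1/3; the price of the wide band behaviour is the power dissipated in the series resistance.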
The total resistance will thus be equal to the difference of the surge impedances; that is almost trivial, because a series arrangement of $R_1 d_1$ and $Z_0$ appears as terminal resistance when the wave is incident from conductor 2. This condition can, of course, only be fulfilled when $Z_2$ is the bigger of the two wave resistances $Z_0$ and $Z_2$. For waves incident from the conductor 0, the discontinuity is not free of reflection. This disadvantage (it is often an advantage) of one-sidedness is made up for by the advantage, compared with the foregoing example, of the wide band property of extinguishing reflection. Whilst there the λ/4-condition of the connecting conduction could only be fulfilled for discrete frequencies, both conditions (2.78) and (2.79) may obviously be fulfilled here for a very large frequency range (many octaves). In many cases that makes up for the disadvantage of a loss of power in the resistance $R_1 d_1$. The mean value condition (2.78), which requires that the connecting line be given an L/C proportional to the geometrical mean of the L/C values of the other two lines, will generally be nearly satisfied through a conic transition line K according to Fig. 2.4c. Then, by extrapolation, the case can be derived from this that a longitudinal resistance $R_1 d_1 = Z_2$ with an exponential cone terminates a line almost free from reflection. In this example attention should be drawn to the transversality condition, which is by no means always fulfilled. Not only the cone but properly every sudden change of the cross section violates it, because the electric field lines rest perpendicularly on the well conducting wall, instead of lying in a plane perpendicular to the z-axis. This deficiency is missing in the optical analogue.

The optical example treated in the article of the Encyclopedia is yet more closely approached in the case where $G_1 \neq 0$. If one chooses $R_1 = 0$, then the eq. (2.76) becomes

$$d_1\, i\omega (L_1 - C_1 Z_2 Z_0) - Z_2 Z_0 G_1 d_1 = Z_2 - Z_0. \tag{2.80}$$

Thus the connecting line has to fulfil the conditions

$$\frac{L_1}{C_1} = \sqrt{\frac{L_2 L_0}{C_2 C_0}} \tag{2.81}$$

and

$$G_1 d_1 = \frac{1}{Z_2} - \frac{1}{Z_0}. \tag{2.82}$$

Condition (2.81) is identical with condition (2.78). Condition (2.82) says that the total conductivity $G_1 d_1$ of the connecting line shall be equal to the difference of the wave conductivities. This is only realized when the transmission line 2 has the smaller surge impedance $Z_2 < Z_0$. If we adapt the transmission line 0 (Fig. 2.4d) to a new line (surge impedance $Z_{-2}$) by means of a very thin layer (thickness $d_{-1} \ll \lambda$) with

$$G_{-1} d_{-1} = \frac{1}{Z_0} - \frac{1}{Z_{-2}}, \tag{2.83}$$

then the total system is again free of reflection. The procedure may be continued and finally leads to a reflection-free termination like that of Fig. 2.5a, that is equivalent to the wave trap in Fig. 2.5b.

Fig. 2.5. Termination of a line

The conductivity of the third layer is

$$G_{-3} d_{-3} = \frac{1}{Z_{-2}} - \frac{1}{Z_{-4}}, \tag{2.84}$$

etc. The total conductivity of the wave trap is

$$G_1 d_1 + G_{-1} d_{-1} + G_{-3} d_{-3} + \ldots = \frac{1}{Z_2} - \frac{1}{Z_0} + \frac{1}{Z_0} - \frac{1}{Z_{-2}} + \ldots \tag{2.85}$$

Thus the wave trap offers a resistance equal to $Z_2$ as could be expected. The conic termination with longitudinal resistance, mentioned above qualitatively, can be proved quantitatively in a manner similar to eq. (2.85) by means of the equation

$$R_1 d_1 + R_{-1} d_{-1} + \ldots = Z_2 - Z_0 + Z_0 - Z_{-2} + Z_{-2} - \ldots \tag{2.86}$$
Furthermore, we shall also see that the form of the outer tube (Fig. 2.4c) can be computed from (2.79) with constant longitudinal resistance $R_1$ per cm. For the function $Z = Z(z)$ the eq. (2.79) implies the differential equation

$$\frac{dZ}{dz} = -R_1, \tag{2.87}$$

which has the integral

$$Z(z) = Z_0 - R_1 z. \tag{2.88}$$

For concentric tube transmission lines

$$Z = 60\ \text{ohm}\,\sqrt{\frac{\mu}{\varepsilon}}\,\ln\frac{r_a}{r_i}, \tag{2.89}$$

provided $\mu$ and $\varepsilon$ are the permeability and the dielectric constant for the intermediate medium of the cable and $r_i$ and $r_a$ are the inner and outer radii of the conductors. Thus it follows that

$$\ln\frac{r_a}{r_i} = \frac{Z_0 - R_1 z}{60\ \text{ohm}\,\sqrt{\mu/\varepsilon}}, \tag{2.90}$$

$$r_a = r_i \exp\left(\frac{Z_0 - R_1 z}{60\ \text{ohm}\,\sqrt{\mu/\varepsilon}}\right). \tag{2.91}$$

Hence the outer radius of the conductor follows an exponential course with a constant radius of the inner conductor. Conversely, if the form of the outer conductor is made conical by (2.92), then $R_1(z)$ as function of position can be calculated from eq. (2.87): (2.93), (2.94). The resistance per cm has then to be diminished towards the narrower end of the cone. At the end $z_1$ of a termination link, where $r_a(z_1) = r_i$, $R_1$ will have the value $k$ given by (2.95). We have from eq. (2.92) a relation by which $k$ can be replaced, (2.96).

Corresponding results follow from an analogous consideration of the case illustrated in Fig. 2.5. A combination of longitudinal resistance and cross conductivity is, of course, possible too. As the last example the reflection-free termination of a transmission line without cross-sectional variation will be considered. Its optical analogue is calculated in the Encyclopedia article on p. 494, with the result that an incident wave (Fig. 2.6) from medium 3 is not reflected in this medium when the layer 1 has the thickness λ/4 and the thin metallic layer 2 is adjusted for the TE-wave according to the formula (2.97)
The transmission line analogue is represented in Fig. 2.6. The metallic layer 2 is to be adjusted in a manner analogous to that of eq. (2.97), because of the correspondence $p \leftrightarrow \Gamma$; $g \leftrightarrow 1/Z$ for TE-waves, from

$$\frac{d_2 \Gamma_2}{Z_2} = \frac{1}{Z_3}. \tag{2.98}$$

Because of the relations (2.5) and (2.6) it follows that

$$d_2 (G_2 + i\omega C_2) = \frac{1}{Z_3}, \tag{2.99}$$

and therefore when $\omega C_2 \ll G_2$ is valid,

$$d_2 G_2 \approx \frac{1}{Z_3}. \tag{2.100}$$

The resistance $1/d_2 G_2$ of the layer 2 is then to be made equal to the surge impedance of the line from which the wave is incident, in order to make the system reflection free. That is one of the rules familiar to the transmission line engineer; for the metallic termination of the resistance 0 situated at a distance λ/4 behind the layer 2 is transformed to the position of the thin layer 2 as resistance $\infty$. It lies parallel to the resistance of the layer and is therefore without effect; thus the layer resistance $1/d_2 G_2$ alone takes care of the termination and is, of course, to be made equal to $Z_3$, when reflection free. The resistance $1/d_2 G_2$ may thereby be realized by means of a resistance body in the form of a rod, instead of the layer. In this example the advantage must be recognised of transferring the trivial procedure from the transmission line theory into optics where the result of the eqs. (2.24) and (2.13) given on p. 495 in the Encyclopedia article is considerably less trivial.

Fig. 2.6. a. Elimination of reflection for a wall of massive metal by means of a thin metallic layer at distance λ/4 from the wall (layers shown: thin metallic layer; λ/4-layer without absorption; massive metal). b. Analogous system to transmission lines
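The rule just described can be followed in a few lines of Python. The sketch below is an illustration under assumed values: the massive metal is represented by a short circuit, the absorption-free spacer by a lossless line of the same surge impedance as the incident line, and the thin metallic layer by a pure cross conductance, cf. eq. (2.100). The function name and the value of 377 ohm are hypothetical.

```python
import math

def sheet_reflection(z3, sheet_conductance, z_spacer, theta):
    """Reflection from a resistive sheet backed by a shorted spacer of phase length theta."""
    # admittance of a lossless line short-circuited at its far end: -i*cot(theta)/Z
    y_stub = -1j * math.cos(theta) / (z_spacer * math.sin(theta))
    y_total = sheet_conductance + y_stub     # the sheet lies parallel to the transformed short
    y3 = 1.0 / z3
    return (y3 - y_total) / (y3 + y_total)

Z3 = 377.0                                   # assumed surge impedance of the incident line
r_quarter = sheet_reflection(Z3, 1.0 / Z3, Z3, math.pi / 2)   # spacer a quarter wave long
r_detuned = sheet_reflection(Z3, 1.0 / Z3, Z3, math.pi / 4)   # spacer an eighth of a wave long
```

With the spacer a quarter wave long the short circuit is transformed into an infinite resistance, the sheet alone terminates the line, and the reflection vanishes; for the detuned spacer the modulus of the reflection rises to 1/sqrt(5), which shows that this solution, unlike the thin lossy transition above, is a narrow band one.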
2.8. POSSIBILITIES OF EXTENSION
Any number of examples of analogies between conduction theory and layer optics may be given: almost every paragraph in the article from the Encyclopedia quoted above, including that dealing with inhomogeneous layers (p. 491) and that concerned with measuring methods, produces interesting analogies. In the light of the analogy relations (2.34), (2.39) and (2.40), however, the transfer must be left to the reader, in order to prevent the extent of this section from becoming comparable with that of the article in the Encyclopedia. For this reason hollow tube transmission lines, which occupy an intermediate position between the transmission lines dealt with here and optics, will not be discussed. Attention should be called to the fact that all the Figures 2.3 to 2.6 can just as well be used to interpret hollow tube waves. If, for instance, the inner conductor is omitted in Fig. 2.4, the thin layer in the hollow conductor will then have the surface resistance given in the article in the Encyclopedia of Physics, vol. 24, p. 495 (compare also p. 483!). The analogy between hollow tube waves and optics is, among other things, much closer than it is in the case of two-conductor systems, as there exist several possibilities of polarisation. Besides, doubly refracting or optically rotating media find here very conspicuous analogues. Hollow tube waves constitute an intermediate stage towards free radiation, which will be treated in the next section.

References: SCHLICK [1904], SMAKULA [1935, 1940, 1942], GOOS [1936], WOLTER [1937], [1956a], BLODGETT [1940], GEFFCKEN [1941], HIESINGER [1947, 1948], MAYER [1950], CABRERA [1952a, b], FRAU [1952], GRADMANN [1956].
§ 3. Analogies between Optical and Hertzian Waves

3.1. THE PROBLEM OF THE NON-REFLECTING METALLIC WALL FOR HERTZIAN WAVES
The last example of § 2.7 is historically interesting. When in World War II radar detection with centimetre to metre waves from English and American aeroplanes constituted an immediate danger for every German submarine rising to the surface, it was necessary to “blacken” the towers of the submarines, as reflection by the towers made detection possible. It is as simple to blacken a metallic wall against light as it is difficult
to do the same against Hertzian waves. The black paint used in the case of light is essentially a wave trap consisting, for instance, of carbon or metallic particles which, by their very loose structure and great layer thickness compared with the wave length, scatter and absorb the waves before they reach the wall. An entirely similar “paint” for metre waves was at that time unknown. A simple solution of the problem was the arrangement sketched in Fig. 2.6a. (See the article “Optik dünner Schichten” in Encyclopedia of Physics, vol. 24, p. 494.) A layer having the surface resistance $120\pi$ ohm is arranged in front of the metallic wall at a distance of $\lambda/4$. This layer is either a foil (in the case of centimetre waves) or a net with square meshes (in the case of metre waves) composed of resistances of 377 ohm (Fig. 3.1a). The meshes must be sufficiently small compared with the wave length, and the resistance of one separate square cut out from the net and equipped with electrodes $E_1$, $E_2$ must amount to $120\pi$ ohm $\approx$ 377 ohm.

Fig. 3.1. Layer of 377 ohm surface resistance, a: with electrodes for resistance measurement, b: realized as a net for radar camouflage
To reduce the distance $\lambda/4$ as much as possible, it is expedient to apply material of high dielectric constant directly on the wall. For wide frequency bands any reflection will be eliminated by multilayer systems as is done in optics, analogous, for instance, to the inhomogeneous system described in the Encyclopedia of Physics, vol. 24, p. 492. Though superficially the analogy between optical and Hertzian waves seems almost complete, the example of the “camouflage” net shows distinctly (Fig. 3.1b) the great difference due to the fact
that the difference in wave length amounts to many powers of ten. While in this example optics stimulated Hertzian wave technique, an example will now be discussed showing the reverse, viz. a method of communication technique inspiring optics.

3.2. THE RAY SHIFT WITH LIGHT AND LONG WAVES
GOOS and HÄNCHEN [1943, 1947] have discovered and measured the sideways shift of a ray in the case of total reflection. As the effect is, of course, very small, the difficulty arose that the “detour of the light” through the less dense medium was itself smaller than the “breadth of the path”, that is, here, the breadth of the ray. How problematic the question of the light path really is became obvious when the phenomenon was investigated (ARTMANN [1948], WOLTER [1949a, b, c]). Strictly speaking, there is no ray, but at the most a wave packet, whose breadth, even at the points of narrowest contraction, amounts to the wave length. The publication of Goos and Hänchen reminded the author of similar phenomena observed with long electromagnetic waves. The shift observed shortly after the beginning of World War II, when fields around a wire loop were measured at and under the surface of sea water, did not refer to a “ray” nor to a bundle of waves but to a zero plane of the field; it revealed the paradox that the “reflection” in water seemed to occur in the one polarisation direction (in fact at the so-called penetration depth, where the field strength has fallen to the e-th part of its surface value), whilst “reflection” for the other polarisation direction seemed distinctly to have occurred above the surface. Recollection of this observation suggested an attempt to carry over to optics the investigations of Goos and Hänchen using zero planes of a radiation field (WOLTER [1949a, b, c], [1950]), with a view to studying the polarisation effects. The dependence of the Goos-Hänchen effect on polarisation, however, seemed at the beginning to present difficulties. The use of the zero plane of light made possible an unequivocal and unrestricted definition of the “light path”, and produced a substitute for the conception of a ray which in many cases proved sufficient.

At the same time this characteristic of a light path led, through a great number of zero points, to increased accuracy in measurement, not only with regard to the Goos-Hänchen effect but also in the case of many optical measurements. This will be the subject of the next section.
3.3. THE OVERCOMING OF THE OPTICAL UNSHARPNESS CONDITION BY MEANS OF THE ANALOGY WITH THE RADIO DIRECTION-FINDING PROCEDURE
There is a rule of thumb according to which a group dipole antenna for centimetre or metre wave technique (Fig. 3.2a) can concentrate the radiation in an angular region of half-width

$$\Delta(\sin\alpha) \ge \frac{\lambda}{\Delta x},$$

where $\Delta x$ is the breadth of the group. The same applies when we choose as antenna a concave mirror, which is irradiated by a dipole, or also for instance an electromagnetic horn (Fig. 3.2).

Fig. 3.2. a, b, c: Directional antennas; d: Horizontal diagram of the directional antenna a by “maximum feeding”; e: Horizontal diagram of the same antenna by “minimum feeding”

The statement

$$\Delta x\,\Delta(\sin\alpha) \ge \lambda \tag{3.1}$$

is entirely analogous to the familiar diffraction relation in optics. This explains, for instance, how accurately we can measure the angular deflection with a light pointer apparatus, e.g. a mirror galvanometer,
or how accurately we can resolve (Abbe) an object with a microscope of aperture $\Delta(\sin\alpha)$.

Fig. 3.3. On Heisenberg’s uncertainty condition (monochromatic photons, $p = h/\lambda$)

This limit of measuring accuracy is related to the Heisenberg uncertainty relation

$$\Delta x\,\Delta p_x \ge h, \tag{3.2}$$

where $h$ is Planck’s constant. This relation gives for monochromatic photons of momentum

$$|p| = \frac{h}{\lambda}, \tag{3.3}$$

because of the relation (cf. Fig. 3.3)

$$\Delta p_x = p\,\Delta(\sin\alpha) = \frac{h}{\lambda}\,\Delta(\sin\alpha), \tag{3.4}$$

directly

$$\frac{h}{\lambda}\,\Delta x\,\Delta(\sin\alpha) \ge h, \tag{3.5}$$
and this relation is the same as (3.1). The classical diffraction relations for waves such as eq. (3.1), which had at first only been established by examples, have, as is well known, contributed fundamentally to establishing the Heisenberg uncertainty relation and have then, in the way sketched in eqs. (3.2) to (3.5), returned to optics with the well-founded and more extensive claim to general and fundamental validity. The limits of error in principle for measurements have been described by eq. (3.1) for direction finding procedures with Hertzian waves, light pointer procedures in optics, striae measuring procedures, microscopic resolution of objects, etc. This, however, holds only with the following qualification: The source of a photon and the direction it has taken from there will
be found to “scatter” in such a way that $\Delta x\,\Delta(\sin\alpha) \ge \lambda$ if the two quantities relating to several photons equally treated are measured. But this does not express the same thing for the limits of measuring errors when many photons (number $N$) are available and when the experiment can be repeated $N$ times. In this case the accuracy of measurement can in principle be raised to the limit

$$\Delta x\,\Delta(\sin\alpha) \ge \frac{\lambda}{\sqrt{N}}. \tag{3.6}$$
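The orders of magnitude involved in relation (3.1) may be illustrated by a short calculation. The sketch below is only an illustration: the antenna and mirror dimensions are invented for the example, and the improvement by a factor $1/\sqrt{N}$ for $N$ equally treated photons is assumed in the form suggested by the Gaussian rule for repeated measurements.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck's constant in J s

def sine_spread(wavelength, width):
    """Smallest spread of sin(alpha) permitted by eq. (3.1)."""
    return wavelength / width

def momentum_spread(wavelength, width):
    """Transverse momentum spread from eq. (3.4), taken at the limit of eq. (3.1)."""
    return (PLANCK_H / wavelength) * sine_spread(wavelength, width)

def sine_spread_repeated(wavelength, width, photons):
    """Assumed 1/sqrt(N) improvement for N equally treated photons."""
    return sine_spread(wavelength, width) / math.sqrt(photons)

radar = sine_spread(0.10, 2.0)        # 10 cm waves, antenna group 2 m wide (assumed)
mirror = sine_spread(500e-9, 5e-3)    # 500 nm light, mirror 5 mm wide (assumed)
print(radar, mirror, sine_spread_repeated(500e-9, 5e-3, 10**6))
```

At the limit of eq. (3.1) the product of the breadth and the momentum spread returned by `momentum_spread` equals $h$, which is the content of eqs. (3.2) to (3.5).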
This result is to be expected from the familiar Gaussian relation about the increase of accuracy of measurement by repeated measuring. It was explained in detail and proved in the author’s article [1958a]. Certainly nothing would have been gained for practical purposes with the knowledge of the principle (3.6) unless the way had been shown to obtain values exceeding those of (3.1) in accuracy; for the striae measuring procedures practically used, such as those of Philpot-Svensson, Töpler and Lamb, and also the resolution observed in the microscope and the light pointer apparatus, show such diffraction unsharpnesses as correspond to the statement (3.1). It was supposed for a long time that with (3.1) the limit of information had been reached in principle. From this arose the fundamental theorems of information theory, and it is with these that the next part will deal. Almost simultaneously with the origin of information theory there began a development which finally broke through the information limit (3.1) and which can most easily be understood from the transfer of the direction finding procedure of Hertzian waves to optical procedures. It is possible in high-frequency technique to make measurements with much greater accuracy using a minimum of the directivity diagram of the antenna than by using a maximum. For instance, sharp “linear” zeros, well suited to our purpose, are obtained in the directional diagram if the two halves of the group are closely connected to the receiver with opposite phase. For such a minimum there is, in principle, no lower limit to its double-width when the possible disturbances are small enough. The double-width plays the same role for minima as the half-width does for maxima, but the double-width of the minima is smaller. There is an analogy valid in optics. If directions in space are not characterized by means of accumulations of photons but by surfaces in which the light intensity is zero, there exists in principle no lower
limit of their double-width, i.e. of the angle or position characteristic, it being assumed that there is no disturbing light present and that a sufficient number of photons is available to ascertain the statistical dispersion. This corresponds to the fact that it is impossible to define the position of an electron within an atom for a definite state with any desired exactness; it is, however, possible to define the node surfaces on which one can be sure no electron will be found. The “minimum ray characteristic” can be practically realized with particular simplicity, for instance in the case of the light-pointer procedure. Using a mirror galvanometer it is sufficient to cover half of the mirror with a thin vapour-deposited metallic layer about [...] in thickness, when the mirror is used near the surface normal. Then the light which has passed the one half of the mirror is shifted in phase by 180° relative to the light of the other half of the mirror. Fig. 3.4 shows actual photographs taken with such a light-pointer. The fine minimum passing symmetrically through the centre of a light pointer with minimum characteristic allows more exact measurements than a diffraction maximum. The strict proof of the superiority of the minimum ray characteristic over the maximum ray characteristic follows from an examination of photon statistics. To take a concrete situation, consider first photographic registration; it can be proved that the kind of registration is, in principle, immaterial. If $n$ is the number of photons active in blackening per mm² during the exposure, this figure will vary by $\sqrt{n}$ from photograph to photograph in spite of identical previous treatment. Then, in the case of idealized photon information, when every photon manifests itself identically and independently of the others, we find that

$$\Delta n = \sqrt{n}. \tag{3.7}$$

If we use a photographic emulsion which requires several photons to achieve one blackening, a statistical coarseness appears. In general

$$\Delta n = n\,w(n). \tag{3.8}$$

Here $w(n)$ is a characteristic function of the receiver, and in the case of the ideal receiver it is equal to $1/\sqrt{n}$. As a measure of the accuracy in measuring the coordinate $z$ of an intensity edge let us use $1/\Delta z$. The breadth $\Delta z$ of the edge results from the photon or “grain” fluctuation. We assume that there is a sufficient optical preliminary enlargement so that many “grains” or photons fall into each area considered. With the intensity distribution $I(z) \propto n(z)$, the accuracy on the edges can be computed as follows:

$$\frac{1}{\Delta z} = \frac{dn/dz}{\Delta n} = \frac{1}{w(z)\,n(z)}\,\frac{dn(z)}{dz}, \tag{3.9}$$

$$\frac{1}{\Delta z} = \frac{1}{w(z)}\,\frac{dI(z)/dz}{I(z)} = \frac{1}{w(z)}\,\frac{d\ln I(z)}{dz}. \tag{3.10}$$

Fig. 3.4. Photographs of the light pointer; increase in accuracy of measurement by means of the “minimum ray characteristic” (a), with respect to the “maximum ray characteristic” (b), (c)
If we denote the data gained in the case of a “maximum characteristic” by the index “max” and those gained in the case of a “minimum” characteristic with the index “min” the relative gain factor of the minimum-ray characteristic is
Each of the two photographs is to be exposed so that the edges which serve as characteristics are optimally situated in the grey region.
Fig. 3.5. Theoretical diagrams of the “ray characteristic” by means of a) a 180° phase shift layer (“minimum ray characteristic”), b) a slit of optimal width, c) a screen edge
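The role of the logarithmic derivative in eq. (3.10) can be made concrete with a simple model. The sketch below assumes a uniformly illuminated aperture of width $D$ in the Fraunhofer approximation, with $q = \pi D \sin\alpha/\lambda$; without the phase layer the intensity is $(\sin q/q)^2$, and with a 180° phase shift over one half it is $((1-\cos q)/q)^2$. These model patterns and the function names are assumptions of the illustration, not the author's measured curves, and the comparison does not reproduce the full gain factor, which requires equal blackening in the two photographs.

```python
import math

def intensity_maximum(q):
    """Uniform aperture: I = (sin q / q)^2."""
    return 1.0 if q == 0.0 else (math.sin(q) / q) ** 2

def intensity_minimum(q):
    """Aperture whose two halves are in opposite phase: I = ((1 - cos q) / q)^2."""
    return 0.0 if q == 0.0 else ((1.0 - math.cos(q)) / q) ** 2

def log_derivative(intensity, q, step=1e-6):
    """Central-difference estimate of d ln I / dq."""
    upper = math.log(intensity(q + step))
    lower = math.log(intensity(q - step))
    return (upper - lower) / (2.0 * step)

# The logarithmic derivative grows roughly as 2/q on approaching the zero at q = 0.
for q in (0.1, 0.01):
    print(q, log_derivative(intensity_minimum, q))

# On the flank of the maximum near half intensity it stays of order one.
print(log_derivative(intensity_maximum, 1.3916))
```

In this model the logarithmic derivative near the central zero increases without bound as the zero is approached, which is the behaviour that eq. (3.10) turns into measuring accuracy, subject in practice to stray light and photon number.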
Then we have

$$n_{\max} = n_{\min} \tag{3.12}$$
in the two places used. If both photographs are made on the same material, the result will be (3.13)
(3.14)
The logarithmic derivative of the intensity variation is therefore decisive for the measuring accuracy (Fig. 3.5); it goes towards $\infty$ as $z$ goes towards a point $z_0$ where $I(z_0) = 0$. $g_{\mathrm{rel}}$ is ultimately limited only by stray light or, in other cases, by an insufficient number of photons in the vicinity of the point $z_0$. Thus the minimum ray characteristic is proved to be optimal among all conceivable “ray characteristics”. The minimum ray characteristic has been used in various forms. Among these applications, those carried out in the case of the stria measuring procedure were of principal importance for the analogy of the optical and electronic information theory treated in the next section.

References: SCHARDIN [1942], GOOS and HÄNCHEN [1943, 1947], ANTWEILER and KAYSER [1950], WOLTER [1949a, b, c], [1950b, c, d, e], [1953], [1956b], [1958a], KOSSEL and STROHMAIER [1951], ARMBRUSTER, KOSSEL and STROHMAIER [1951], MOSER and WITTMANN [1951], MOSER and SCHMIDT [1953], KRAUSBAUER [1957].

3.4. LIMITS OF THE ANALOGY IN THE DOMAIN OF RADIATION
The analogy between electronics and optics was developed more and more closely in part 1, until it became almost an identity in part 2; for light has the same nature as the electromagnetic radiation of Hertzian waves. But the difference of the wave length $\lambda$, amounting to several powers of ten, implies a limit to the analogy, because it is connected with the corresponding difference in the quantum value $hc/\lambda$. The quantum value and the quantum quantity $N$ stipulated by it determine the final limit of measuring error, at least in principle, in accordance with eq. (3.6). This is often increased by statistical coarseness of the receiver which, in order to effect a registration, requires not one quantum but several quanta.
Several optical procedures applying the minimum ray characteristic have already reached this limit today; they find their limit of measuring accuracy no longer in disturbing light but in photon statistics. That is essentially different in all practical procedures with Hertzian waves. Here the disturbance and not the photon statistics has so far been the limit of measuring error. The relation (3.6), which is already important practically in optics, has therefore only figurative significance in electronics; there $N$ signifies the number of charge quanta, not the number of radiation photons. In this case, however, the time in which the charge quanta are transported becomes decisive, instead of one coordinate, and that is another kind of analogy. It will be the subject of the last section (§ 4) of this article.
§ 4. The Pseudoanalogy between Time and Coordinate, or Frequency and Direction Variable

4.1. ZERNIKE’S PHASE CONTRAST METHOD AND ITS COMMUNICATION TECHNIQUE ANALOGY - THE PHASE DEMODULATION
Fig. 4.1 shows in cross-section an “optical communication channel”, that is, a device which transilluminates a more or less transparent object, situated in the object plane, with coherent monochromatic light issuing from a point source and made parallel by means of a collimator; it forms an image of the object on the image plane by means of the object lens. As the eye only perceives differences of intensity, the object is unrecognisable for us in the image if it is a pure “phase object” which causes no intensity minima but only phase changes in the light. Many microscopic objects are phase objects, e.g. uncoloured bacteria, colourless crystals in their mother liquor or concentration striae in a solution. F. ZERNIKE [1931] with his phase-contrast method indicated a way whereby, by means of simple intervention in the ray path, one can make visible the previously unrecognisable object by changing the phase differences into amplitude differences. The place at which this intervention occurs is the focal plane of the object lens nearer the observer. The “modus operandi” of the intervention will be described for a particularly simple object, namely a “phase grating”, according to Zernike’s method. A phase grating consists for instance of a plane-parallel glass plate in which parallel
grooves are cut, as Fig. 4.1 indicates in cross-section. The remaining glass ridges and the grooves will have the same breadth.

Fig. 4.1. Optical “channel” (labels in the figure: object plane, object lens, image plane)
If such a phase grating lies on the object table of the microscope, that is, in the object plane of the optical communication channel according to Fig. 4.1, then the part of the apparatus from the light source to the focal plane of the object lens forms a grating spectrograph. The spectra of order zero, first, minus first, second, etc. then appear in the focal plane of the object lens, and each of them, in monochromatic light, consists of one point only, or more exactly of an image of the light source. The spectrum of order $m$, which is produced at the angle $\alpha_m$ to the grating normal, has according to Huygens’ principle the complex disturbance of light

$$F_m = \frac{1}{g}\,\exp\bigl(i(\omega t + \Phi_0)\bigr)\int_0^g f(x)\,\exp\Bigl(-2\pi i\,\frac{x\sin\alpha_m}{\lambda}\Bigr)\,dx, \tag{4.1}$$

since the path difference of the ray deflected by the grating at the angle $\alpha_m$ at position $x$, compared with the ray deflected at position 0, is $x\sin\alpha_m$. $\omega$ is the angular frequency of the light and $\Phi_0$ a phase constant; $f(x)$ is the object function; for our phase grating

$$f(x) = \begin{cases}\exp(i\Phi) & \text{for the ridges,}\\ 1 & \text{for the interspaces.}\end{cases} \tag{4.2}$$

$\Phi$ is the phase shift caused by the grating ridges against the interspaces. It is well known that one obtains the spectrum of order $m$ for an angle $\alpha_m$ which is connected with the wave length $\lambda$ and the grating constant $g$ by

$$g\sin\alpha_m = m\lambda. \tag{4.3}$$
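As a small worked illustration of eq. (4.3), the directions of all spectra that can actually appear for a given grating may be tabulated. The grating constant and wave length in the sketch below are assumed example values, and the function name is hypothetical.

```python
import math

def diffraction_angles(grating_constant, wavelength):
    """Angles alpha_m (radians) of all orders allowed by g*sin(alpha_m) = m*lambda."""
    highest = int(grating_constant // wavelength)
    angles = {}
    for m in range(-highest, highest + 1):
        # Clamp against rounding so that asin always receives a value in [-1, 1].
        sine = max(-1.0, min(1.0, m * wavelength / grating_constant))
        angles[m] = math.asin(sine)
    return angles

angles = diffraction_angles(2.2e-6, 0.5e-6)  # assumed: g = 2.2 um, lambda = 0.5 um
for m in sorted(angles):
    print(m, round(math.degrees(angles[m]), 2))
```

Only the orders with $|m| \le g/\lambda$ exist; in this example these are the orders from $-4$ to $4$.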
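If we substitute for $\sin\alpha_m$ in eq. (4.1) then

$$F_m = \frac{1}{g}\,\exp\bigl(i(\omega t + \Phi_0)\bigr)\int_0^g f(x)\,\exp\Bigl(-2\pi i\,\frac{m x}{g}\Bigr)\,dx; \tag{4.4}$$

this is formally the $m$-th Fourier coefficient of the Fourier series for $f(x)\exp\bigl(i(\omega t+\Phi_0)\bigr)$. Thus the representation for the object function will be

$$f(x)\,\exp\bigl(i(\omega t + \Phi_0)\bigr) = \sum_{m=-\infty}^{+\infty} F_m \exp\Bigl(2\pi i\,\frac{m x}{g}\Bigr). \tag{4.5}$$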
Using this equation one can obviously determine, in the case of an ideal image, the image function, which is identical with the object function if only the light disturbance and the coordinate $x$ are measured on a suitable scale. If one disturbs the image by intervention in the spectra, forcing on every spectrum of order $m$ a factor $S_m$, for instance by means of a plate placed in the focal plane of the object lens, then the image function becomes

$$f_B(x) = \sum_{m=-\infty}^{+\infty} S_m F_m \exp\Bigl(2\pi i\,\frac{m x}{g}\Bigr). \tag{4.6}$$
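The Fourier description of the phase grating and the effect of factors $S_m$ can be tried out numerically. The following sketch is an illustration under stated assumptions: the common time factor is omitted, the ridges are taken to occupy the first half of each period, the coefficients are estimated by a midpoint sum, and the series is truncated at a finite order; the function names are hypothetical.

```python
import cmath
import math

def object_function(x, g, phi):
    """Phase grating of eq. (4.2); ridges assumed on the first half of each period."""
    return cmath.exp(1j * phi) if (x % g) < g / 2 else 1.0 + 0j

def fourier_coefficient(m, g, phi, samples=2000):
    """Midpoint-sum estimate of F_m from eq. (4.4), time factor omitted."""
    dx = g / samples
    total = 0j
    for k in range(samples):
        x = (k + 0.5) * dx
        total += object_function(x, g, phi) * cmath.exp(-2j * math.pi * m * x / g)
    return total * dx / g

def image_function(x, g, phi, factor, m_max=25):
    """Truncated sum of S_m F_m exp(2 pi i m x / g) with S_m = factor(m)."""
    total = 0j
    for m in range(-m_max, m_max + 1):
        s_m = factor(m)
        if s_m != 0:
            total += s_m * fourier_coefficient(m, g, phi) * cmath.exp(2j * math.pi * m * x / g)
    return total

g, phi = 1.0, 0.2
zero_order_only = lambda m: 1.0 if m == 0 else 0.0
all_orders = lambda m: 1.0
print(image_function(0.1, g, phi, zero_order_only), image_function(0.7, g, phi, zero_order_only))
print(image_function(0.25, g, phi, all_orders), cmath.exp(1j * phi))
```

When only the spectrum of order zero is admitted, the computed disturbance is the same at every position; when all orders pass unchanged, the truncated sum approaches the object function away from the edges of the ridges.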
The following consideration shows which intervention is suitable for the transformation of the phase differences into amplitude differences. Fig. 4.2a illustrates the complex light disturbance in the image plane when no intervention occurs. If one traverses the image plane, the complex light disturbance moves to and fro between the two vectors drawn in Fig. 4.2a, according to whether one is in the image position of a ridge or an interspace. This light disturbance is caused physically by interference of the waves emerging from the point spectra in the back focal plane. If only the spectrum of order zero were admitted, all the “side spectra” being covered by means of the walls of a slit in the focal plane of the object lens, the “picture” would have a light disturbance independent of position; it would be given by the spectrum of order zero according to eq. (4.4),

$$F_0 = \frac{1}{g}\,\exp\bigl(i(\omega t + \Phi_0)\bigr)\int_0^g f(x)\,dx. \tag{4.7}$$
It is represented in the complex number plane of Fig. 4.2b by means of the dashed vector.
Fig. 4.2. Zernike’s phase contrast method, illustrated by means of a phase grating object. Light disturbance in the image plane, described by means of vectors in a Gaussian plane of complex numbers

The fact that the light disturbance jumps from the vector “1” to the vector “2” as we go through the image is clearly caused by the side spectra (order 1; −1; 2; −2; ...) together. Their contribution is indicated in Fig. 4.2b by means of the dotted and dot-dashed vectors. The fact that the grating ridges cannot be distinguished by the eye from the interspaces finds its expression in Fig. 4.2b, viz. in the fact that the two vectors “1” and “2” have the same length and therefore signify the same intensity. This can, according to Zernike, be remedied by a 90° phase rotation of all side spectra together. Then they take up positions according to Fig. 4.2b, and vector “1” becomes “1′” and “2” becomes “2′”. As 1′ and 2′ now have different lengths, the grating ridges are pictured darker than the interspaces (Fig. 4.3). The phase objects have become visible because the phase differences have been turned into amplitude differences. The intervention which in practice must be undertaken for this purpose is done by placing a Zernike phase plate in the focal plane of the object lens; at the point of the spectrum of order zero this plate causes a 90° phase shift in comparison with the rest of the plate. That is equivalent to the phase shift of the side spectra, since only relative phases have any significance. Though reference must be made to the literature concerning further details of the phase-contrast method, the example explained here is sufficient for the purpose of comparison with the communication technical analogy.

Fig. 4.3. Amplitude grating (left) and phase grating (right), photographed by means of a usual microscope (a), and by means of a phase contrast microscope (b)

This analogy consists in the phase modulation, which is explained in Fig. 4.4. With an oscillation equal to the real part of the function

$$f(t) = a\,\exp(2\pi i\nu_0 t + i\Phi), \tag{4.8}$$

the real $a$ would be called “amplitude”, $\nu_0$ “carrier frequency” and the real $\Phi$ “phase”. If the amplitude is a slowly varying time function (slowly compared with $1/\nu_0$), then there arises the “amplitude-modulated oscillation”

$$f_a(t) = a(t)\,\exp(2\pi i\nu_0 t + i\Phi_0). \tag{4.9}$$
If, on the other hand, $a$ is held constant together with $\nu_0$ but $\Phi$ is made a function of time, then there arises the “phase-modulated oscillation”

$$f_\Phi(t) = a_0\,\exp\bigl(2\pi i\nu_0 t + i\Phi(t)\bigr). \tag{4.10}$$

Fig. 4.4. Amplitude demodulation (labels in the figure: amplitude demodulating link, load)

Footnote: By the real oscillation is always to be understood a real function, namely the real part of $f_a(t)$ etc. In the case of purely linear operations the equality of the real parts follows naturally from the equality of the complex functions; therefore the symbol for the real part can usually be omitted. In non-linear processes, such as the formation of the absolute value in eq. (4.12), the explicit notation “Re” is indispensable.
The “phase modulation” $\Phi(t)$ is then the expression of the “communication”, just as the “amplitude modulation” $a(t)$ is in eq. (4.9). If the oscillation is rectified in a “demodulating link” and averaged with a time constant which is large compared with $1/\nu_0 = T_0$ but small compared with the time of variation of $a(t)$ or $\Phi(t)$, the time function at the output of the demodulating link becomes equal to

$$|\mathrm{Re}\,f_a(t)| = K\,a(t), \tag{4.11}$$

which is valid for amplitude modulation as in eq. (4.9). But it is

$$|\mathrm{Re}\,f_\Phi(t)| = K\,a_0, \tag{4.12}$$

when we have a phase modulation as in eq. (4.10). $K$ is a constant. The demodulating link then reproduces, as desired, the communication $a(t)$ liberated from the carrier oscillation in the case of amplitude modulation; but with phase modulation it only shows the constant amplitude $a_0$ and nothing of the content $\Phi(t)$ of the communication. It is therefore practical to convert a phase-modulated communication (4.10) into an amplitude-modulated communication of the form (4.9) before feeding it into the demodulating link (Fig. 4.4). We can obtain this here in the same way as in optics with a “modulation transformer”, the effect of which will again be exemplified by a periodically phase-modulated communication according to Fig. 4.5, which corresponds to the phase grating in optics. It has a time-dependent phase

$$\Phi(t) = \begin{cases}\Phi_1 & \text{for intervals of even numbers,}\\ 0 & \text{for intervals of odd numbers.}\end{cases} \tag{4.13}$$

For the sake of clarity $\Phi_1 = 180^\circ$ was chosen in Fig. 4.5a. In practice $\Phi_1 < 180^\circ$, as in Fig. 4.6. The total phase-modulated oscillation (4.10) is then a periodic function with the period length $T$:

$$f(t) = \begin{cases} a_0\exp(2\pi i\nu_0 t + i\Phi_1) & \text{for intervals of even numbers,}\\ a_0\exp(2\pi i\nu_0 t) & \text{for intervals of odd numbers.}\end{cases} \tag{4.14}$$
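This function can be developed into a Fourier series in the usual way:

$$f(t) = \sum_{m=-\infty}^{+\infty} F_m \exp\Bigl(2\pi i\Bigl(\nu_0 + \frac{m}{T}\Bigr)t\Bigr). \tag{4.15}$$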
In spite of the close analogy between the eqs. (4.15) and (4.5) a fundamental difference becomes evident here. A genuine frequency $\omega/2\pi$ of the light oscillation corresponds to the carrier frequency $\nu_0$ in optics also; but in optics “pseudofrequencies” $m/g$, which do not have the dimension of a reciprocal time but that of a reciprocal length, correspond to the “side bands” of communication technique, represented in eq. (4.15) by the genuine frequencies $m/T$. While in optics, therefore, the “carrier frequency” and the “side bands” are of different physical natures, in communication technique they are of the same nature and are therefore combined in eq. (4.15) into the frequencies

$$\nu_0 + \frac{m}{T} \quad \text{for } m = 0;\ 1;\ -1;\ 2;\ -2;\ \ldots \tag{4.16}$$

Fig. 4.5. a. A simple phase modulated communication. b. A simple amplitude modulated communication
In spite of this difference in nature, we must formally proceed in a similar way as in the optical case; because we can describe here also an intervention ($S_m$) in the spectra in eq. (4.15) and the image function resulting from it; that is, the communication appearing at the output of the apparatus causing the intervention can be described by means of

$$f_B(t) = \sum_{m=-\infty}^{+\infty} S_m F_m \exp\Bigl(2\pi i\Bigl(\nu_0 + \frac{m}{T}\Bigr)t\Bigr). \tag{4.17}$$

Fig. 4.6. Vectors in a Gaussian plane of complex numbers; a: before, c: after 90° phase shifting of the side bands in the Fourier spectrum (labels in the figure: 1, vector for intervals of odd numbers; 2, vector for intervals of even numbers; side bands)
The intervention that here causes the transformation of the phase modulation into an amplitude modulation is illustrated in Fig. 4.6. Fig. 4.6a gives the complex oscillation (4.14) in the complex plane as a pair of vectors that is to be thought of as rotating with the carrier frequency $\nu_0$ in the positive sense. For the time intervals of even numbers (Fig. 4.5) the vector 2 describes the oscillation, and for the time intervals of odd numbers the vector 1 describes the oscillation. The usual demodulator (Fig. 4.4) takes account of the vector length only and allows no difference between the vectors 1 and 2 to be recognized. The pure carrier oscillation contained in $f(t)$ according to eq. (4.15), denoted by $m = 0$, contributes to $f(t)$ with a coefficient $F_0$ which according to the Fourier series theorem is the arithmetic mean

$$F_0 = \frac{1}{T}\int_0^T f(t)\,\exp(-2\pi i\nu_0 t)\,dt. \tag{4.18}$$
The dashed vector in Fig. 4.6b corresponds to it. The totality of all
side spectra causes the jump from this vector position to the positions (1) and (2) and is therefore itself reproduced by means of the vectors which are denoted by dots and dot-dashes in Fig. 4.6b. They must be rotated by 90° in phase in order to effect the transformation of the phase modulation into an amplitude modulation, indicated in Fig. 4.6c. As the absolute phase is irrelevant, one can also let the side spectra remain unchanged and rotate the carrier alone by 90° in phase; that is technically carried out by the device illustrated in Fig. 4.7. The carrier is filtered out of the communication and added again to the unchanged side bands after a 90° phase shift. The phase-modulated communication has thereby been transformed into an amplitude-modulated communication, the demodulation of which can be done in the usual way as illustrated in Fig. 4.4.

Fig. 4.7. Device for transforming phase modulation into amplitude modulation and for amplitude demodulation of the transformed communication message (labels in the figure: frequency branching filter, side bands)
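The action of such a device can be imitated in a few lines. The sketch below is an idealized illustration, not a model of a real circuit: it assumes equal even and odd intervals, an integral number of carrier cycles per interval, a perfect carrier filter, and a demodulator that simply averages the rectified real oscillation over one interval; all names and numerical values are assumptions.

```python
import cmath
import math

def rectified_mean(envelope, carrier_cycles=50, samples=5000):
    """Average of |Re(envelope * exp(2 pi i nu0 t))| over one interval."""
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) / samples  # time in units of the interval length
        total += abs((envelope * cmath.exp(2j * math.pi * carrier_cycles * t)).real)
    return total / samples

def demodulate(phi1, a0=1.0, shift_carrier=False):
    """Demodulator output for even and odd intervals of the message (4.14)."""
    even = a0 * cmath.exp(1j * phi1)   # complex amplitude in even intervals
    odd = a0 + 0j                      # complex amplitude in odd intervals
    carrier = 0.5 * (even + odd)       # component m = 0, the arithmetic mean
    if shift_carrier:
        # Remove the carrier, rotate it by 90 degrees, add it to the unchanged side bands.
        even = even - carrier + 1j * carrier
        odd = odd - carrier + 1j * carrier
    return rectified_mean(even), rectified_mean(odd)

plain = demodulate(0.2)
shifted = demodulate(0.2, shift_carrier=True)
print(plain, shifted)
```

Without the phase shift of the carrier the two outputs coincide, with $K = 2/\pi$ for this idealized rectifier; after the shift the even and odd intervals give different outputs, so that the phase modulation has become an amplitude modulation.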
The phase modulation is of considerable importance in communication technique, mainly because of certain possibilities of reducing disturbances. A historical relation between it and the Zernike phase-contrast method does not seem to exist; very probably, the two procedures arose quite independently of each other.

References: ZERNIKE [1931; 1934a, b; 1935; 1946], WOLTER [1954, 1956b].

4.2. THE FOURIER FORMALISM IN OPTICAL AND ELECTRONIC INFORMATION THEORY
The distortion of a communication by means of an electric communication channel and the modification in a picture due to the behaviour
of an optical image-forming system have a close relationship with one another because in both cases the Fourier transformation is applicable as the common tool for calculation. Let us suppose that a message is described by means of a function f(t) of time t ; f(t) is piecewise continuous, piecewise monotonic and bounded. Furthermore the “transmitting time” T is finite, and f ( t ) is zero for t < 0 and for t > T . A communication channel with its spectral transmission function S(v) for frequency v acts upon this communication; one describes this by multiplying the Fourier transform
F(v)=
i:
f ( t ) exp (-
2nivt) dt
(4.19)
of the communication with $S(\nu)$. The Fourier transform of the function which leaves the communication channel is

$$F_B(\nu) = S(\nu)\,F(\nu), \tag{4.20}$$

and the “image communication” itself, which leaves the communication channel, is the time function obtained by means of the inverse transformation,

$$f_B(t) = \int_{-\infty}^{+\infty} F_B(\nu) \exp(2\pi i \nu t)\, d\nu = \int_{-\infty}^{+\infty} S(\nu)\,F(\nu) \exp(2\pi i \nu t)\, d\nu. \tag{4.21}$$

Fig. 4.8. Schema of an optical and an electronic channel of communication
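As a minimal numerical sketch of eqs. (4.19) to (4.21), the following Python fragment replaces the integrals by a discrete Fourier sum. The number of samples, the rectangular test message and the Gaussian transmission function are assumptions chosen only for illustration; none of them is taken from the text.

```python
import cmath
import math

N_SAMPLES = 32  # assumed number of samples over the transmitting time T


def dft(samples):
    """Discrete stand-in for eq. (4.19)."""
    n_pts = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def idft(spectrum):
    """Discrete stand-in for eq. (4.21), with the 1/N normalisation of the DFT."""
    n_pts = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def signed_index(k, n_pts):
    """Map the DFT index k to a signed frequency index."""
    return k if k <= n_pts // 2 else k - n_pts


def transmit(message, transmission):
    """Eq. (4.20): multiply the spectrum by S, then transform back."""
    spectrum = dft(message)
    image_spectrum = [s * f for s, f in zip(transmission, spectrum)]
    return [value.real for value in idft(image_spectrum)]


# Hypothetical message: a rectangular pulse inside the transmitting time.
message = [1.0 if 8 <= n < 16 else 0.0 for n in range(N_SAMPLES)]
# Hypothetical smooth channel: Gaussian roll-off, symmetric in frequency.
channel = [math.exp(-(signed_index(k, N_SAMPLES) / 6.0) ** 2) for k in range(N_SAMPLES)]

image = transmit(message, channel)            # distorted "image communication"
ideal = transmit(message, [1.0] * N_SAMPLES)  # S = 1 reproduces the message
```

With $S = 1$ the output coincides with the input, which is the discrete counterpart of the Fourier integral theorem; any other $S$ changes the shape of the message while, for $S(0) = 1$, leaving its sum unchanged.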
The analogous formalism (Fig. 4.8) governs optical image formation, for instance with coherent illumination. An object function $f(x)$ describing the object has a diffraction function

$$F(y) = \int_{-G/2}^{+G/2} f(x) \exp\!\left(-2\pi i x\, \frac{\sin\alpha}{\lambda}\right) dx, \tag{4.22}$$

manifested in the focal plane of the objective lying on the side of the observer. $G$ is the breadth of the illuminated object field, i.e. $f(x) = 0$ for $|x| > \tfrac12 G$. Let $f(x)$ be piecewise continuous and piecewise monotonic, and let us choose two constants, $H$ and $K$, such that $|f(x)| \le H$ and the total fluctuation is $\le K$ in the interval $\langle -\tfrac12 G;\ \tfrac12 G\rangle$. If we refer to the “direction variable”

$$y = \frac{\sin\alpha}{\lambda} \tag{4.23}$$
as the frequency of the image formation, the relation between $f(x)$ and $F(y)$ is analogous to the relation between a message $f(t)$ and its spectral function $F(\nu)$. The coordinate $x$ takes over the part of the time $t$. The direction variable $y$, with the dimension (length)$^{-1}$, has taken the place of the frequency $\nu$, with the dimension (time)$^{-1}$. When the image-forming optical system is ideal and the coordinate in the image is measured in adequate orientation and suitable scale, the image is reproduced in accordance with the formula

$$f_B(x) = \int_{-\infty}^{+\infty} F(y) \exp(2\pi i y x)\, dy = f(x); \tag{4.24}$$
the image function is then identical with the object function according to the Fourier integral theorem. But if the imaging system restricts the aperture or otherwise affects the spectral distribution function according to a selectivity function $S(y)$, the image function becomes

$$f_B(x) = \int_{-\infty}^{+\infty} S(y)\,F(y) \exp(2\pi i y x)\, dy = \int_{-\infty}^{+\infty} F_B(y) \exp(2\pi i y x)\, dy. \tag{4.25}$$

The aperture restriction is specially described by means of

$$S(y) = \begin{cases} 1 & \text{for } |y| < W, \\ 0 & \text{otherwise.} \end{cases} \tag{4.26}$$
The basic problem of information theory, in optics as well as in communication technique, is the recovery of the original message $f(t)$ or of the original object function $f(x)$ from the immediately observable image function $f_B(t)$ or $f_B(x)$. The possibility of solving this problem is regarded at present as limited by the Küpfmüller relation, which says that one can separately resolve two events with a channel of band width $2W$ only if their time interval $\Delta t$ exceeds a minimum value, which is such that

$$W\,\Delta t \ge \tfrac12. \tag{4.27}$$
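Relation (4.27) can be turned into a small calculation. The sketch below is only an illustration; the function names and the band limit of 4000 cycles per second are hypothetical and do not come from the text.

```python
def min_time_interval(half_band_w):
    """Smallest resolvable time interval according to W * dt >= 1/2, eq. (4.27)."""
    if half_band_w <= 0:
        raise ValueError("the band limit W must be positive")
    return 0.5 / half_band_w


def resolvable(half_band_w, delta_t):
    """True if two events separated by delta_t satisfy eq. (4.27)."""
    return half_band_w * delta_t >= 0.5


W_HZ = 4000.0  # hypothetical band limit W of the channel
dt_min = min_time_interval(W_HZ)
```

For this assumed channel two events closer together than `dt_min` would, according to the Küpfmüller relation, not be resolved separately.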
In the optical case, remembering the meaning of the optical frequency $y$, one has $\Delta y = 2W$, where

$$W = \frac{\Delta(\sin\alpha)}{2\lambda}. \tag{4.28}$$

Analogous to the Küpfmüller condition one now has the relation

$$\Delta x\,\Delta(\sin\alpha) \ge \lambda. \tag{4.29}$$
This is Abbe's statement concerning the lateral resolving power of a microscope, and it is directly comparable with the statement (3.1). The Abbe-Küpfmüller conditions have in the last few years received a new formulation and a more general foundation through SHANNON's sampling theorem [1949a], which says: If a time signal $f(t)$, which vanishes outside the time interval $0 \le t \le T$, is passed by a communication channel transmitting a band of frequencies from 0 to $W$ cycles per second, only $2WT$ numerical data are thereby transmitted, corresponding to the resolving power according to Küpfmüller. The sampling points, marking the values (4.30), determine, with the corresponding values (4.31) of the signal, the total information that can be gathered from the image function.

The expansion theorem (GABOR [1956]) says the same for optics. It can immediately be deduced from the sampling theorem just formulated by means of the correspondence coordinate ↔ time, object function ↔ communication, direction variable ↔ frequency. It has already been sketched here in its essential details by means of eq. (4.29).

The experience with the minimum-ray characteristic, described in § 3, gives rise to the suspicion that the basic theorems of information theory do not have the general validity that is widely attributed to them; it is taken for granted that they are valid for communication technique as well as for optics. The reasons for their applicability are, however, quite different in the two fields, because the analogy coordinate ↔ time is imperfect. The solution of the basic information-theoretical problem, that is, the crossing of the information barriers asserted by the basic theorems, must therefore be carried out separately for communication technique and for optics.

References: NYQUIST [1928], ZERNIKE, F. [1934a, b; 1935; 1946], WHITTAKER [1935], KÜPFMÜLLER [1949], SHANNON [1949a, b], DOETSCH [1950], MEYER-EPPLER [1952], SCHMIDT [1953], HOPKINS [1955], GABOR [1956], INGELSTAM [1956], LOHMANN [1957], ROSENHAUER and ROSENBRUCH [1957], KUBOTA and OHZU [1957], LUKOSZ [1958].

4.3. SOLUTIONS OF THE BASIC PROBLEM IN THE DOMAIN OF COMMUNICATION TECHNIQUE
The basic problem of communication technique, viz. the computation of the input function $f(t)$ from the observable output function $f_B(t)$, assumes the following simple form in the sphere of the spectral function, called the “sub-sphere” (the sphere of the time function is called the “upper-sphere”). Eq. (4.20),

$$F_B(\nu) = S(\nu)\,F(\nu), \tag{4.32}$$

should be solved for the spectral function $F(\nu)$ of the true communication message. Formally this is carried out simply by means of division,

$$F(\nu) = \frac{F_B(\nu)}{S(\nu)}. \tag{4.33}$$

The conversion to the functions of the upper-sphere is achieved according to the Fourier integral theorem by means of the Fourier transformations (4.19) and (4.21), namely

$$F_B(\nu) = \int_{-\infty}^{+\infty} f_B(t') \exp(-2\pi i \nu t')\, dt', \tag{4.34}$$

$$f(t) = \int_{-\infty}^{+\infty} F(\nu) \exp(2\pi i \nu t)\, d\nu. \tag{4.35}$$

The result is

$$f_{BR}(t) = f(t) = \int_{-\infty}^{+\infty}\!\int_{-\infty}^{+\infty} \frac{f_B(t')}{S(\nu)} \exp\bigl(2\pi i \nu (t - t')\bigr)\, dt'\, d\nu. \tag{4.36}$$
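The division (4.33) followed by the inverse transformation (4.35) can be tried out on sampled data. The following is a minimal sketch under stated assumptions: a discrete Fourier sum stands in for the integrals, and the test message and the nowhere-vanishing Gaussian transmission function are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Discrete stand-in for eq. (4.34)."""
    n_pts = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def idft(spectrum):
    """Discrete stand-in for eq. (4.35)."""
    n_pts = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def apply_channel(message, transmission):
    """Forward problem, eq. (4.32): F_B = S * F."""
    image_spectrum = [s * f for s, f in zip(transmission, dft(message))]
    return [value.real for value in idft(image_spectrum)]


def restore(image, transmission):
    """Inverse problem, eqs. (4.33) and (4.35): F = F_B / S, then transform back."""
    if any(s == 0 for s in transmission):
        raise ZeroDivisionError("S vanishes somewhere: the division (4.33) is not feasible")
    spectrum = [fb / s for fb, s in zip(dft(image), transmission)]
    return [value.real for value in idft(spectrum)]


n_pts = 32
message = [1.0 if 8 <= n < 16 else 0.0 for n in range(n_pts)]  # hypothetical message
# Hypothetical channel without zeros: Gaussian in the signed frequency index.
channel = [math.exp(-((k if k <= n_pts // 2 else k - n_pts) / 6.0) ** 2) for k in range(n_pts)]

image = apply_channel(message, channel)
restored = restore(image, channel)
```

In this noise-free setting the message is recovered to rounding accuracy; the sketch deliberately refuses a transmission function that contains zeros, which is exactly the case discussed next.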
The solution procedure assumes, however, that the division (4.33) is feasible. In optics, in the example where the aperture is restricted according to eq. (4.26), the division is not feasible for “frequencies” $|y| > W$, because then $S(y) = 0$. If, like Shannon, Küpfmüller and other authors in communication technique, we assume a rectangular-band filter with a transmission function

$$S(\nu) = \begin{cases} 1 & \text{for } |\nu| < W, \\ 0 & \text{for } |\nu| > W, \end{cases} \tag{4.37}$$

then for $|\nu| > W$, $S(\nu)$ as well as $F_B(\nu)$ are, according to (4.32), equal to zero, and eq. (4.33) contains the meaningless right-hand side 0/0. But this situation is fortunately impossible in communication technique, because a communication channel with the rectangular band limitation (4.37) is not realizable. If it existed, a Dirac short-time pulse

$$f(t) = A\,\delta(t - t_0), \tag{4.38}$$

which at time $t_0$ is given at the channel input and according to eq. (4.19) possesses the spectrum

$$F(\nu) = A \int_{-\infty}^{+\infty} \delta(t - t_0) \exp(-2\pi i \nu t)\, dt = A \exp(-2\pi i \nu t_0), \tag{4.39}$$

would give the time function

$$f_B(t) = A \int_{-W}^{+W} \exp\bigl(2\pi i \nu (t - t_0)\bigr)\, d\nu, \tag{4.40}$$

i.e.

$$f_B(t) = 2WA\, \frac{\sin\bigl(2\pi W (t - t_0)\bigr)}{2\pi W (t - t_0)} \tag{4.41}$$
at the channel output according to eq. (4.21). Thus the observer could conclude from the “forerunners” (Fig. 4.9) whether and even when the pulse was switched on. The communication channel with the square-band limitation according to eq. (4.37) would be a “prophet” foretelling the future. This prophet would of course have difficulties if the person who takes care of the switching at the channel input should decide in the intervening time to omit the switching; the forerunners would later have to be “fetched back”. The existence of a communication channel with a band function (4.37) would violate the causality principle: “no effect takes place before its cause”.

Fig. 4.9. Cause $f(t)$ (original communication) and effect (output function) by means of a communication channel with a rectangular spectral function represented by eq. (4.37)

Moreover it can be shown that a real communication channel can at most possess isolated zeros of its transmission function for real $\nu$. If $\nu_0$ is such a zero it follows that (4.42), because $F(\nu)$, the Fourier transform of a function $f(t)$ for which the assumptions formulated at the beginning are valid, is a continuous, even an analytic, function in the whole $\nu$ plane. How the statements of this section follow from familiar mathematical principles has been shown in the author's publication (WOLTER [1958c]).

Even when disturbances and errors of measurement play their parts, the following is valid in any case: If we require a certain accuracy for the communication computed according to eq. (4.36), that is, if we prescribe a mean error tolerance for the Hilbert distance, then an accuracy of measurement $\delta > 0$ for the measurement of $f_B(t)$ and a disturbance tolerance not equal to zero can always be specified in such a manner that the condition (4.43) is satisfied. That is to say, there is finite spread of error from the measurement to the required communication (WOLTER [1958a, b, c; 1959a]).

The basic problem of communication technique can be regarded as solved. An analogue computer can automatically carry out the calculation according to eq. (4.36). Figs. 4.13a to c show an example.

4.4. FAILURE OF THE ANALOGOUS SOLUTION METHOD IN OPTICS AND THE INCOMPLETENESS OF THE COORDINATE ↔ TIME ANALOGY
The arguments given in the preceding paragraphs against the existence of a rectangular transmission function of a communication channel are not valid in the optical analogy. The causality principle “There are indeed effects after, but not before, their causes” has no analogy, for such an analogy would imply something like “a bright object point can indeed cause brightness to the right of its image but not to the left of it”. The difference between left and right is much less radical than the difference between yesterday and tomorrow. It is in agreement with this that the realisation of the rectangular transmission function (4.26) is quite elementary in optics; two walls of a slit in the aperture plane already restrict the spectrum in the manner required by eq. (4.26). Thus it is certain that the division by $S(y)$ for $|y| > W$ is impracticable and that eq. (4.36) cannot generally reach the goal in optics.
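The contrast between the two fields can be made concrete with eq. (4.41). The sketch below evaluates the response of a rectangular band to a point-like input; the function and parameter names, and the numerical values, are illustrative assumptions. The side lobes lie symmetrically on both sides of the centre: this is harmless to the left and right of an image point, whereas in the time domain the lobes on the early side are the “forerunners” that would violate causality.

```python
import math


def rect_band_response(t, t0, amplitude, half_band):
    """Eq. (4.41): f_B(t) = 2WA sin(2 pi W (t - t0)) / (2 pi W (t - t0))."""
    arg = 2.0 * math.pi * half_band * (t - t0)
    if arg == 0.0:
        return 2.0 * half_band * amplitude  # limiting value at t = t0
    return 2.0 * half_band * amplitude * math.sin(arg) / arg


T0 = 5.0  # hypothetical switching time (or image position)
A = 1.0   # hypothetical pulse strength
W = 1.0   # hypothetical band limit

peak = rect_band_response(T0, T0, A, W)
before = rect_band_response(T0 - 0.75, T0, A, W)  # "forerunner" at t < t0
after = rect_band_response(T0 + 0.75, T0, A, W)   # mirror-image lobe at t > t0
```

The value `before` is different from zero although it belongs to a time earlier than the cause, and it equals `after`, the lobe on the other side.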
4.5. THE PROBLEM OF ANALYTIC CONTINUATION OF THE SPECTRAL FUNCTION F(y) IN OPTICS
When the object function $f(x)$ is piecewise continuous and piecewise monotonic, and vanishes beyond a finite interval (illumination boundaries at $-\tfrac12 G$ and $+\tfrac12 G$), we call $f(x)$ a “finite” object function, and then its proper spectral function (4.22),

$$F(y) = \int_{-G/2}^{+G/2} f(x) \exp(-2\pi i y x)\, dx, \tag{4.44}$$

is always analytic in the whole $y$ plane. However, one can deduce $F(y)$ directly from the image only for the interval $-W < y < W$, since the value of $F(y)$ beyond this interval does not affect the image because of the aperture restriction. But one could try to continue $F(y)$ analytically from the interior of the interval to the exterior region. That would indeed be successful if we could determine $F(y)$ exactly in the interior of the interval. But very small errors in the interior of the interval could extend analytically as errors of any size into the exterior of the interval. Fig. 4.10 illustrates graphically an analytic “error function” which is as small as desired in the interior ($< \varepsilon$, with $\varepsilon > 0$); in the exterior, however, it exceeds an arbitrarily large prescribed value $A$ at a point $y'$ chosen as close to $W$ as desired. In spite of a small uncertainty in the interior, $F(y)$ can, therefore, be continued analytically into the exterior in various ways (Fig. 4.11). The analytic continuation has infinitely strong “error propagation” if nothing is known except that the function $F(y)$ is analytic. That one knows more is decisive for the solution of our basic information-theoretical problem. That will be shown in the next section.

Fig. 4.10. Analytic function, small in the frequency band $(-W;\ +W)$, but very large outside this band

Fig. 4.11. Analytic continuation of the spectral function $F(y)$ from the frequency band $(-W;\ +W)$ into the external region (abscissa: frequency $y$)
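An error function of the kind sketched in Fig. 4.10 is easy to construct. The example below is not the function used in the text; it is an assumed illustration that uses a scaled Chebyshev polynomial $\varepsilon\,T_n(y/W)$, which is analytic everywhere, never exceeds $\varepsilon$ in magnitude inside the band, and grows rapidly just outside it.

```python
import math


def chebyshev(degree, x):
    """T_n(x) by the three-term recurrence T_{k+1} = 2 x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if degree == 0:
        return t_prev
    for _ in range(degree - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


def error_function(y, half_band, eps, degree):
    """Assumed analytic error: eps * T_n(y / W)."""
    return eps * chebyshev(degree, y / half_band)


W = 1.0      # hypothetical band limit
EPS = 1e-3   # hypothetical error level inside the band
DEGREE = 20  # hypothetical polynomial degree

inside = max(abs(error_function(-W + 2.0 * W * j / 400, W, EPS, DEGREE)) for j in range(401))
outside = error_function(1.1 * W, W, EPS, DEGREE)  # only 10 % beyond the band edge
```

Raising the degree makes `outside` as large as desired while `inside` stays bounded by `EPS`, which is the infinitely strong error propagation described above.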
4.6. SOLUTION OF THE BASIC INFORMATION-THEORETICAL PROBLEM IN OPTICS

The solution of the basic information-theoretical problem is given by the theorem: If $f(x)$ is a finite object function with $f(x) = 0$ for $|x| > \tfrac12 G$, we have for every required “resultant error tolerance” $\varepsilon > 0$ a “measuring error tolerance” $\delta > 0$ and an image interval of the width $G'$ such that a “computed object function” $f_{BR}(x)$ can be computed from measurements of the image function in the interval $|x| < \tfrac12 G'$ made with measuring errors $< \delta$. The function $f_{BR}(x)$ differs from the real object function $f(x)$ in the Hilbert sense (that is, in the mean square) by less than $\varepsilon$, that is

$$\lVert f_{BR}(x) - f(x) \rVert < \varepsilon. \tag{4.45}$$

The proof of this theorem was published by the present author in AEÜ [1959b]. For lack of space it cannot be reproduced here in full, but a sketch of the proof will be given. Since $f_B(x) \to f(x)$ for $W \to \infty$, there exists a $W' = W'(\varepsilon) > W$ such that the function (4.46) satisfies the condition that its Hilbert distance from $f(x)$ is smaller than $\tfrac13\varepsilon$ (4.47)
as long as $F(y)$ itself is exact. Thus, in breaking off the integral (4.46) at $-W'$ and $W'$, we dispose of one third of the available tolerance $\varepsilon$. We dispose of a further $\tfrac13\varepsilon$ by breaking off the power series (4.48) for $F(y)$ after $N$ terms. That is legitimate because $F(y)$ is analytic in the whole $y$-plane and its power series converges absolutely and uniformly in $|y| < W'$. If $H$ is the upper bound for all the permissible $|f(x)|$, it follows similarly to eqs. (4.50) to (4.53) that

$$F(y) = \sum_{n=0}^{\infty} \frac{(2\pi)^n}{i^n\, n!} \int_{-G/2}^{+G/2} x^n f(x)\, dx \; y^n = \sum_{n=0}^{\infty} c_n y^n;$$

viz.

$$|c_n| \le \frac{(2\pi)^n}{n!}\, H \int_{-G/2}^{+G/2} |x|^n\, dx \le GH\, \frac{(\pi G)^n}{n!}.$$

Thus a convergent dominating series exists for $\sum c_n y^n$ in $|y| < W'$,

$$\sum_{n=0}^{\infty} (\pi W' G)^n\, \frac{GH}{n!},$$

which makes it possible to find a number $N = N(\varepsilon)$ for all the permissible functions $f(x)$, provided that $W$, $K$, $H$, $G$ and $G'$ are given.
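How such a number $N$ can be found from the dominating series may be sketched as follows. This is a crude illustration only: it bounds the pointwise remainder of the series and not the Hilbert-norm estimate of the actual proof, and the values of $G$, $H$, $W'$ and the tolerance are hypothetical.

```python
import math


def dominating_tail(width_g, bound_h, cutoff_w, first_index, extra_terms=200):
    """Sum of G*H*(pi*W'*G)**n / n! over n >= first_index (truncated after extra_terms)."""
    a = math.pi * cutoff_w * width_g
    term = width_g * bound_h  # n = 0 term
    for n in range(1, first_index + 1):
        term *= a / n         # term now holds G*H*a**first_index / first_index!
    total = 0.0
    for n in range(first_index, first_index + extra_terms):
        total += term
        term *= a / (n + 1)
    return total


def terms_needed(width_g, bound_h, cutoff_w, tolerance):
    """Smallest N such that the dominating series from n = N onwards stays below tolerance."""
    n_terms = 0
    while dominating_tail(width_g, bound_h, cutoff_w, n_terms) >= tolerance:
        n_terms += 1
    return n_terms


G = 1.0                 # hypothetical breadth of the object field
H = 1.0                 # hypothetical bound on |f(x)|
W_PRIME = 2.0           # hypothetical extended band limit W'
TOLERANCE = 1e-3 / 3.0  # one third of a hypothetical resultant tolerance

n_terms = terms_needed(G, H, W_PRIME, TOLERANCE)
```

Because the terms fall off factorially, $N$ grows only slowly as the tolerance is tightened, although it grows with the product $W'G$.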
Consequently one obtains (4.49). The last $\tfrac13\varepsilon$ is sufficient to cover the effect of the measuring errors on the coefficients, as has been shown in the publication just quoted. First of all we get the coefficients quite formally from the inverse of eq. (4.25) by means of power series expansion of the exponential function:

$$F(y) = \int_{-\infty}^{+\infty} f_B(x) \exp(-2\pi i y x)\, dx \quad \text{for } |y| < W, \tag{4.50}$$

$$F(y) = \sum_{n=0}^{\infty} c_n y^n, \tag{4.51}$$

$$c_n = \frac{(2\pi)^n}{i^n\, n!}\, M_n, \tag{4.52}$$

with

$$M_n = \int_{-\infty}^{+\infty} x^n f_B(x)\, dx = \int_{0}^{\infty} x^n \bigl\{ f_B(x) + (-1)^n f_B(-x) \bigr\}\, dx. \tag{4.53}$$
But this formal calculation is so far not legitimate, because the “moment integrals” (4.53) do not converge (Fig. 4.12). But we can legitimize them by means of the Abel-Poisson limitation procedure (DOETSCH [1950]) by replacing each $M_n$ by

$$M_n = \lim_{s \to 0+} M_n(s) = \lim_{s \to 0+} \int_{0}^{\infty} \exp(-sx)\, x^n \bigl\{ f_B(x) + (-1)^n f_B(-x) \bigr\}\, dx. \tag{4.54}$$

Fig. 4.12. Integrand of a moment integral
It may be shown that these integrals and the $\lim_{s \to 0+}$ converge for a finite object function, because we know that $f_B(x)$ is an image function (imaged with restricted aperture) of a limited object function. From this there follows an asymptotic expansion for $f_B(x)$ by means of repeated integration by parts in (4.25). When $x \ne 0$,

$$f_B(x) = \int_{-W}^{W} F(y) \exp(2\pi i y x)\, dy = \frac{F(W) \exp(2\pi i W x) - F(-W) \exp(-2\pi i W x)}{2\pi i x} - \int_{-W}^{W} F'(y)\, \frac{\exp(2\pi i y x)}{2\pi i x}\, dy \tag{4.55}$$

$$= \frac{F(W) \exp(2\pi i W x) - F(-W) \exp(-2\pi i W x)}{2\pi i x} - \frac{F'(W) \exp(2\pi i W x) - F'(-W) \exp(-2\pi i W x)}{(2\pi i x)^2} + \frac{F''(W) \exp(2\pi i W x) - F''(-W) \exp(-2\pi i W x)}{(2\pi i x)^3} - \dots + (-1)^N \int_{-W}^{W} F^{(N)}(y)\, \frac{\exp(2\pi i y x)}{(2\pi i x)^N}\, dy. \tag{4.56}$$

According to the estimates given at the beginning of this paragraph, it is sufficient for our resultant error tolerance to approximate $F(y)$ by a polynomial of degree $N - 1$. Thus we can say that the last integral written in eq. (4.56) disappears. Then we have with sufficient accuracy

$$f_B(x) \approx \sum_{l=0}^{N-1} (-1)^l\, \frac{F^{(l)}(W) \exp(2\pi i W x) - F^{(l)}(-W) \exp(-2\pi i W x)}{(2\pi i x)^{l+1}}. \tag{4.57}$$
If we substitute eq. (4.57) in eq. (4.54), the integral (4.54) breaks up into $N$ integrals, all of which converge. The $\lim_{s \to 0+}$ exists for each of these integrals. That is evident for the integrals with $n < l$, because even the integral for $s = 0$ itself converges. For the integrals

$$\psi_k^{\pm} = \int_{0}^{\infty} \exp(-sx)\, x^k \exp(\pm 2\pi i W x)\, dx \tag{4.58}$$

with $k = n - l > 0$ one obtains, by means of repeated integration by parts, if we set

$$a = \frac{1}{s \mp 2\pi i W}, \tag{4.59}$$

the expressions (4.60) and (4.61). Because of (4.62) one has

$$\psi_k^{\pm} = k!\, a^{k+1}; \tag{4.63}$$

that is, (4.64) and (4.65). The $\psi_k^{\pm}$ are continuous in the interval $0 \le s < 1$; they are even analytic functions; one has

$$\lim_{s \to 0+} \psi_k^{+} = \frac{k!}{(-2\pi i W)^{k+1}}, \tag{4.66}$$

$$\lim_{s \to 0+} \psi_k^{-} = \frac{k!}{(2\pi i W)^{k+1}}. \tag{4.67}$$
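The behaviour of the damped integrals (4.58) can be checked numerically. The sketch below is an illustration under assumptions: it compares a composite Simpson rule with the closed form $k!/(s \mp 2\pi i W)^{k+1}$, which is the standard value of such an integral for $s > 0$; the parameter values and the finite upper limit of integration are hypothetical choices.

```python
import cmath
import math


def psi_closed_form(k, s, half_band, sign=+1):
    """k! / (s - sign * 2 pi i W)**(k + 1), the value of (4.58) for s > 0."""
    a = 1.0 / (s - sign * 2j * math.pi * half_band)
    return math.factorial(k) * a ** (k + 1)


def psi_numeric(k, s, half_band, sign=+1, upper=80.0, steps=40000):
    """Composite Simpson rule on [0, upper]; exp(-s x) makes the neglected tail tiny."""
    h = upper / steps

    def integrand(x):
        return math.exp(-s * x) * x ** k * cmath.exp(sign * 2j * math.pi * half_band * x)

    total = integrand(0.0) + integrand(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3.0


K = 2     # hypothetical power k
S = 0.5   # hypothetical damping parameter s > 0
W = 0.25  # hypothetical band limit

numeric = psi_numeric(K, S, W)
closed = psi_closed_form(K, S, W)
limit_value = math.factorial(K) / (-2j * math.pi * W) ** (K + 1)  # eq. (4.66)
```

For small `S` the closed form approaches `limit_value` continuously, although the undamped integral with $s = 0$ does not converge in the ordinary sense; this is the content of the Abel-Poisson procedure.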
Because the finite summation does not alter anything in the convergence behaviour, the $M_n(s)$ are continuous as $s \to 0$ from the right side, and the substitution of $M_n$ by $\lim_{s \to 0+} M_n(s)$ is permissible. The summability of the moment integrals is thereby secured, and a way is at the same time indicated for the calculation of $M_n$ from the measurements of $f_B(x)$. The limited error propagation from the values $f_B(x)$ to the moments $M_n$ is certainly easier to establish with series instead of with integrals (that has been proved in the article quoted above); but it can also be seen here in the integrals (4.53) by means of (4.54), because the partial integration leading to eqs. (4.66) and (4.67) has a result which also leads to uniform convergence when one applies it to the finite sum (4.54) of integrals of the type (4.58). Each moment $M_n$ can in this way be obtained immediately in practice from (4.53) by means of partial integration. A suitable basic function is to be formed for that purpose from the observed $f_B(x)$, a new basic function is to be formed from the old one, and so on. Random errors thereby cancel out for the most part. The object function follows from $F_{BR}(y)$ by means of the relation

$$f_{BR}(x) = \int_{-W'}^{W'} F_{BR}(y) \exp(2\pi i y x)\, dy. \tag{4.68}$$
An example illustrated by Fig. 4.13 shows graphically the improvement beyond the Abbe limit of resolution obtained by means of this calculation.

Fig. 4.13. Upper three curves: example relating to electronic communication engineering; $f(t)$ = original communication; $f_B(t)$ = image communication at the channel output, the frequency band width being made smaller than Küpfmüller's limit; $f_{BR}(t)$ = communication evaluated by means of an automatic calculating device. The three lower curves: optical example; $f(x)$ = object function; $f_B(x)$ = image function by means of an optical system with an aperture smaller than Abbe's limit; $f_{BR}(x)$ = calculated object function

4.7. COMMON AND DISTINCTIVE FACTORS BETWEEN INFORMATION THEORIES OF ELECTRONICS AND OPTICS
The result of § 3 and § 4 common to optics and electronics is that one can in principle approach the object function from the image function as closely as desired, provided the image function is determined sufficiently exactly and disturbed sufficiently slightly. The difference manifesting itself in a single case rests above all on the causality principle.
A second type of difference has to do with the “condition of isoplanatism”, which requires that the same function $S(y)$ is effective for all image positions, i.e. that the image errors are independent of position. Only when this condition is fulfilled is the Fourier transformation significant, in the case of incoherent as well as in the case of coherent illumination. To assume geometric-optical image errors as independent of position implies a strong limitation of applicability. This is quite different in communication technique, where we can satisfy the time constancy of the channel with extraordinary accuracy over the period of a communication. Natural laws are the same yesterday, today and tomorrow. The optical axis distinguishes an image position in geometrical optics; therefore information theory in optics is above all suited for elucidating purely diffraction-theoretical basic phenomena. Presumably because of this fact, opticians have hesitated to describe geometrical-optical properties of imaging systems by means of spectral functions. In communication engineering, on the contrary, the description of channel properties by means of frequency functions was the primary method, and only recently did physicists begin to use a short-time pulse at the channel input as a test; this is the up-to-date elegant test method for communication channels and is analogous to the common method for testing an optical system by means of a point source.

References

ABELÈS, F., Ann. de Physique 56, 596, 706.
ANTWEILER, H. J. and H. KAYSER, 1950, Angew. Chemie 62, 412.
ARMBRUSTER, D., W. KOSSEL and K. STROHMAIER, 1951, Z. Naturf. 6a, 510.
ARTMANN, K., 1948, Ann. Physik (6) 2, 87.
BLODGETT, K., 1940, Phys. Rev. 57, 921.
CABRERA, N., 1952a, C.R. 234, 1045; 1952b, C.R. 234, 1146.
DOETSCH, G., 1950, Handbuch der Laplace-Transformation, Bd. 1 (Basel, Verlag Birkhäuser, 1950).
FRAU, D. C., 1952, Rev. d'Optique 31, 161.
GABOR, D., 1956, Light and Information, in Astronomical Optics and Related Subjects; ed. Z. Kopal (Amsterdam, North-Holland Publ. Co., 1956).
GEFFCKEN, W., 1941, D.R.P. 716 153/42h (Anm.: Jenaer Glaswerk Schott u. Gen.) and Ann. Physik (5) 40, 385.
GOOS, F., 1936, Z. Physik 100, 95; 1959, Physikal. Blätter 5, 234.
GOOS, F. and H. HÄNCHEN, 1943, 1947, Ann. Physik (5) 43 (1943) 383; Ibid. (6) 1 (1947) 333.
GRADMANN, U., 1956, Optica Acta 3, 30.
HIESINGER, L., 1947, Naturw. 34, 121; 1948, Optik 3, 485.
HLUČKA, FR., 1926, Z. Physik 38, 589.
HOPKINS, H. H., 1955, Proc. Roy. Soc. A 231, 91.
HÜBNER, W., 1950, Optik 7, 128.
INGELSTAM, E., E. DJURLE and B. SJÖGREN, 1956, J. Opt. Soc. Amer. 46, 707.
KOSSEL, W. and K. STROHMAIER, 1951, Z. Naturf. 6a, 504.
KRAUSBAUER, L., 1957, Diplomarbeit Marburg.
KUBOTA, K. and H. OHZU, 1957, J. Opt. Soc. Amer. 47, 666.
KÜPFMÜLLER, K., 1949, Die Systemtheorie der elektrischen Nachrichtenübertragung (Stuttgart, S. Hirzel Verlag, 1949) 175.
LOHMANN, H., 1957, Optik 14, 510, 516.
LUKOSZ, W., 1958, Diss. TH Braunschweig.
MAYER, H., 1950, Physik dünner Schichten, Teil 1 (Stuttgart, Wissenschaftliche Verlagsgesellschaft m.b.H., 1950).
MEYER-EPPLER, W., 1952, Naturw. 15, 341; 1959, Grundlagen und Anwendungen der Informationstheorie (Springer Verlag, Heidelberg).
MOSER, H. and W. SCHMIDT, 1953, Z. Physik 134, 546.
MOSER, H. and J. WITTMANN, 1951, Z. Physik 131, 48.
NYQUIST, H., 1928, Transact. Amer. Inst. Elect. Engrs. 47, 617.
ROSENHAUER, K. and K. J. ROSENBRUCH, 1957, Optica Acta 3, 21.
SCHARDIN, H., 1942, Ergebn. exact. Naturwiss. 20, 303.
SCHLICK, M., 1904, Über die Reflexion des Lichtes einer inhomogenen Schicht, Diss. Berlin.
SCHMIDT, K. O., 1953, Fernmeldetechn. Z. 12, 555.
SCHUSTER, K., 1949, Ann. Phys. (6) 4, 352.
SHANNON, C., 1949a, Proc. I.R.E. 37, 10; 1949b, The mathematical theory of communication (Urbana, University of Illinois Press).
SMAKULA, A., 1935, D.R.P. 685 767 (1.11.1935) Anm. C. Zeiss, Jena; 1940, Z. Instr. K. 60, 33; 1942, Z. Physik 43, 217.
WHITTAKER, J. M., 1935, Interpolatory Function Theory (Cambridge Tract) 33.
WOLTER, H., 1937, Z. Physik 105, 269; 1949a, Physikal. Blätter 5, 234; 1949b, D.R.P. 819 925/42h, Gr. 34, 11, Anm. E. Leitz, Wetzlar; 1949c, D.R.P. 819 728, Anm. E. Leitz, Wetzlar; 1950a, Ann. Physik (6) 7, 33; 1950b, Ann. Physik (6) 7, 182; 1950c, Ann. Physik (6) 7, 341; 1950d, Z. Naturf. 5a, 143; 1950e, Z. Naturf. 5a, 276; 1953, Z. Physik 135, 531; 1954, Fortschr. chem. Forsch. 3, 1; 1956a, Optik dünner Schichten, in Encyclopedia of Physics, ed. S. Flügge (Heidelberg, Springer-Verlag, 1956), pp. 461-554; 1956b, Schlieren-, Phasenkontrast- und Lichtschnittverfahren, in Encyclopedia of Physics, ed. S. Flügge (Heidelberg, Springer-Verlag, 1956) pp. 556-645; 1958a, Physica 24, 457; 1958b, Arch. elektr. Übertr. 12, 335; 1958c, Ibid. 13, 101; 1959a, Ibid. 13, 171; 1959b, Ibid. 13, 267; 1959c, Ibid. 13, 393; 1960, Optica Acta 7, 53.
ZERNIKE, F., 1931, D.R.P. 636 168/42h/670, Anm. C. Zeiss, Jena; 1934a, Physica 1, 43; 1934b, Roy. Astron. Soc. M.N. 94, 377; 1935, Z. techn. Physik 16, 454; 1946, Phase contrast, a new method for the microscopic observation of transparent objects, in Ch. IV (Physical Optics) in A. BOUWERS, 1946, Achievements in Optics (New York and Amsterdam).
VI

INTERFERENCE COLOR

BY

HIROSHI KUBOTA
Tokyo University, Japan

CONTENTS

§ 1. INTRODUCTION. EVALUATION OF COLOR
§ 2. INTERFERENCE COLOR OF MONOLAYER
§ 3. INTERFERENCE COLOR OF MULTILAYER
§ 4. COLOR OF A THIN FILM ON METALLIC SURFACE
§ 5. INTERFERENCE COLOR OF CHROMATIC POLARIZATION
§ 6. INTERFERENCE COLOR IN OTHER PHENOMENA
NEW TABLES OF THE INTERFERENCE COLOR
REFERENCES
§ 1. Introduction. Evaluation of Color

A thin film of oil spread on water, or a soap bubble, whose layer thickness is of the order of the wavelength of light, shows beautiful color when illuminated with white light. The color is due to the interference of light reflected from the upper and lower boundaries of the film and is called the interference color. The same kind of coloring due to interference is also seen when white light is passed through a thin section of birefringent or optically active crystal between Nicol prisms. This phenomenon is known as chromatic polarization. Although many authors in the past have given theoretical explanations of these phenomena, the color itself was not treated quantitatively; most of their discussions were qualitative. In recent times, along with the development of techniques for coating non-reflection layers on lens surfaces, for making interference filters, etc., an exact quantitative treatment of color is required, for which modern colorimetry has to find solutions. In this review, we shall show some of this treatment.

To describe color quantitatively, the well-known law of Grassmann, according to which any color can be reproduced by suitably mixing three primary colors, is utilized even in the modern theory of colorimetry. Accordingly, any color is defined by three variables corresponding to the amounts of these primaries mixed in it, which are called tristimulus values. Psychologically, color has three attributes: hue, purity and brightness. If we are interested in hue and purity only, that is in chromaticity, it is given by two variables called chromaticity coordinates. The colorimetric reference system internationally adopted at present is that of the Commission Internationale de l'Éclairage (CIE). In this system, the amounts of the three primaries in a light of wavelength $\lambda$ are denoted by $\bar{x}(\lambda)$, $\bar{y}(\lambda)$ and $\bar{z}(\lambda)$, and their recommended numerical values were given by CIE (e.g. HARDY [1936]). Accordingly, the tristimulus values of a light of spectral intensity distribution $I(\lambda)$ are given as

$$X = \int_{0}^{\infty} I(\lambda)\,\bar{x}(\lambda)\, d\lambda, \qquad Y = \int_{0}^{\infty} I(\lambda)\,\bar{y}(\lambda)\, d\lambda, \qquad Z = \int_{0}^{\infty} I(\lambda)\,\bar{z}(\lambda)\, d\lambda.$$

Chromaticity co-ordinates are denoted as $x$ and $y$ and given as

$$x = \frac{X}{X + Y + Z}, \qquad y = \frac{Y}{X + Y + Z}.$$
Chromaticity is given as a point in a rectangular co-ordinate system with $x$ as abscissa and $y$ as ordinate. This diagram is called the CIE-chromaticity diagram. Points on a straight line connecting the white point and a color (F) on this diagram have the same hue as the color F, and the nearer the point lies to the spectral color, the higher the purity. Fig. 1 (A) shows the CIE-chromaticity diagram. The horseshoe-shaped curve is the locus of the spectral colors, and the straight line which forms its base is the locus of the purples. Relations between the names of colors and their co-ordinates were given by JUDD and KELLY [1939]. Two of the three primaries (corresponding to $X$ and $Z$) of the CIE system are so selected that their luminance factors are zero. Accordingly, the luminance factor of any color is given directly by its $Y$-tristimulus value. This is a great merit of using the CIE system. Color is completely specified by $x$, $y$ and $Y$.
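The specification of a color by $x$, $y$ and $Y$ can be sketched in a few lines. In the following illustration the integrals are replaced by sums over equally spaced wavelengths, and the three-sample tables are invented placeholders, not CIE data; the real color-matching functions would have to be taken from the CIE tables.

```python
def tristimulus(intensity, x_bar, y_bar, z_bar, step):
    """Sum stand-in for X = integral of I * x_bar over wavelength, and likewise Y and Z."""
    big_x = sum(i * v for i, v in zip(intensity, x_bar)) * step
    big_y = sum(i * v for i, v in zip(intensity, y_bar)) * step
    big_z = sum(i * v for i, v in zip(intensity, z_bar)) * step
    return big_x, big_y, big_z


def chromaticity(big_x, big_y, big_z):
    """Chromaticity co-ordinates x = X / (X + Y + Z), y = Y / (X + Y + Z)."""
    total = big_x + big_y + big_z
    if total == 0:
        raise ValueError("X + Y + Z must not vanish")
    return big_x / total, big_y / total


# Placeholder tables at three wavelengths (NOT CIE values, for illustration only).
intensity = [1.0, 1.0, 1.0]
x_bar = [0.3, 0.4, 0.3]
y_bar = [0.1, 0.8, 0.1]
z_bar = [0.9, 0.1, 0.0]

X, Y, Z = tristimulus(intensity, x_bar, y_bar, z_bar, step=1.0)
x, y = chromaticity(X, Y, Z)
specification = (x, y, Y)  # chromaticity plus luminance factor
```

Multiplying the intensity by a constant changes $Y$ but leaves $x$ and $y$ untouched, which is why the chromaticity diagram can ignore brightness.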
§ 2. Interference Color of Monolayer

2.1. TWO TYPES OF LAYERS
The color of light reflected from a thin layer has been studied ever since the days of Newton and Herschel. Many other investigators followed (e.g. POCKEL [1906]), and recently MÖNCH [1952] repeated similar experiments. In all of these observations, however, colors are given merely by their names; also in the paper by BLODGETT [1934] on a monolayer of stearic acid spread on water, which led to the non-reflection layer, colors are specified only by names. BUCH [1950] has also discussed the color of non-reflection layers employing the names of colors. The aim of the present study is to describe color quantitatively on the CIE-chromaticity diagram.

If the amplitude of the incident light is assumed to be unity, the reflection coefficients of the upper and lower boundaries of the layer are $A$ and $B$, and the thickness and refractive index of the layer are $d$ and $n$, then the intensity of the reflected light is

$$I = \frac{A^2 + B^2 + 2AB\cos\delta}{1 + A^2B^2 + 2AB\cos\delta}. \tag{1}$$

When light is incident perpendicularly onto the layer, $\delta = 4\pi(nd)/\lambda$ and

$$A = \frac{n_1 - n}{n_1 + n}, \qquad B = \frac{n - n_2}{n + n_2}, \tag{2}$$

where $n_1$ and $n_2$ are the refractive indexes of the ambient media on the upper and lower sides of the layer, respectively. In most cases, $A$ and $B$ can be considered very small compared to unity, so that we can take approximately

$$I = A^2 + B^2 + 2AB\cos\delta. \tag{3}$$
This is the formula for the superposition of two waves of amplitudes $A$ and $B$ with the phase difference $\delta$ between them. Hence the above approximation is for the case in which the multiple reflection within the layer is ignored. This formula is rewritten in the following form so as to keep $\varepsilon > 0$, according as $A$ and $B$ have

the same sign,

$$I \to \varepsilon + \cos^2 \tfrac12\delta, \qquad \varepsilon = \frac{(A - B)^2}{4AB} > 0, \tag{4}$$

opposite sign,

$$I \to \varepsilon + \sin^2 \tfrac12\delta, \qquad \varepsilon = -\frac{(A + B)^2}{4AB} > 0. \tag{5}$$
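Eqs. (1) to (5) lend themselves to a short numerical check. The sketch below is an illustration under assumptions: the refractive indexes (a coating of index 1.38 on glass of index 1.52, and a soap film of index 1.33 in air) and the optical thickness are hypothetical example values, and the function names are not from the text.

```python
import math


def boundary_amplitudes(n1, n, n2):
    """Eq. (2): reflection coefficients A (upper boundary) and B (lower boundary)."""
    return (n1 - n) / (n1 + n), (n - n2) / (n + n2)


def phase_difference(optical_thickness, wavelength):
    """delta = 4 pi (nd) / lambda for perpendicular incidence."""
    return 4.0 * math.pi * optical_thickness / wavelength


def intensity_exact(a, b, delta):
    """Eq. (1), including multiple reflection."""
    cross = 2.0 * a * b * math.cos(delta)
    return (a * a + b * b + cross) / (1.0 + a * a * b * b + cross)


def intensity_two_beam(a, b, delta):
    """Eq. (3), multiple reflection ignored."""
    return a * a + b * b + 2.0 * a * b * math.cos(delta)


def epsilon_and_type(a, b):
    """Eqs. (4) and (5): epsilon and the layer type, 'cos' or 'sin'."""
    product = a * b
    if product == 0.0:
        raise ValueError("A or B vanishes; eqs. (4) and (5) do not apply")
    if product > 0.0:
        return (a - b) ** 2 / (4.0 * product), "cos"
    return -((a + b) ** 2) / (4.0 * product), "sin"


def normalised_intensity(a, b, delta):
    """epsilon + cos^2(delta/2) or epsilon + sin^2(delta/2), proportional to eq. (3)."""
    eps, kind = epsilon_and_type(a, b)
    half = 0.5 * delta
    return eps + (math.cos(half) ** 2 if kind == "cos" else math.sin(half) ** 2)


coat_a, coat_b = boundary_amplitudes(1.0, 1.38, 1.52)  # assumed coating on glass
soap_a, soap_b = boundary_amplitudes(1.0, 1.33, 1.0)   # assumed soap film in air
delta = phase_difference(137.5, 550.0)                 # (nd) = quarter wave at 550
```

The factor of proportionality hidden in the arrow of eqs. (4) and (5) is $4AB$ for the cos-type and $-4AB$ for the sin-type, so that it is positive in both cases.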
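If the energy distribution of the light source is represented by $E(\lambda)$ and $\cos^2\tfrac12\delta$ or $\sin^2\tfrac12\delta$ by $f(\lambda)$, we have for the tristimulus values of the interference color $X$, $Y$ and $Z$ the following expressions, in which $\varepsilon$ is assumed not to contain the wavelength $\lambda$:

$$X = \int E(\lambda)\,[\varepsilon + f(\lambda)]\,\bar{x}(\lambda)\, d\lambda = \varepsilon X_0 + X',$$

$$Y = \int E(\lambda)\,[\varepsilon + f(\lambda)]\,\bar{y}(\lambda)\, d\lambda = \varepsilon Y_0 + Y',$$

$$Z = \int E(\lambda)\,[\varepsilon + f(\lambda)]\,\bar{z}(\lambda)\, d\lambda = \varepsilon Z_0 + Z',$$

where

$$X_0 = \int E(\lambda)\,\bar{x}(\lambda)\, d\lambda, \qquad X' = \int E(\lambda)\,f(\lambda)\,\bar{x}(\lambda)\, d\lambda,$$

$$Y_0 = \int E(\lambda)\,\bar{y}(\lambda)\, d\lambda, \qquad Y' = \int E(\lambda)\,f(\lambda)\,\bar{y}(\lambda)\, d\lambda,$$

$$Z_0 = \int E(\lambda)\,\bar{z}(\lambda)\, d\lambda, \qquad Z' = \int E(\lambda)\,f(\lambda)\,\bar{z}(\lambda)\, d\lambda.$$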
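$(X_0, Y_0, Z_0)$ are the tristimulus values of the light source, and $(X', Y', Z')$ are those of the layer of thickness $(nd)$ and $\varepsilon = 0$. From these equations, we have as the chromaticity co-ordinates

$$x = \frac{\varepsilon X_0 + X'}{\varepsilon S_0 + S'}, \qquad y = \frac{\varepsilon Y_0 + Y'}{\varepsilon S_0 + S'},$$

where

$$S_0 = X_0 + Y_0 + Z_0, \qquad S' = X' + Y' + Z'.$$

This is a straight line passing through the light source $(x_0, y_0)$ and the layer of $\varepsilon = 0$, $(x', y')$, where

$$x_0 = \frac{X_0}{S_0}, \quad y_0 = \frac{Y_0}{S_0}, \qquad x' = \frac{X'}{S'}, \quad y' = \frac{Y'}{S'}.$$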
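The straight-line statement can be checked numerically. In the sketch below the tristimulus values of the source and of the $\varepsilon = 0$ layer are hypothetical numbers chosen for illustration; the check is that the chromaticity point for any $\varepsilon \ge 0$ is collinear with the two end points and moves toward the source as $\varepsilon$ grows.

```python
def chromaticity(big_x, big_y, big_z):
    """x = X / S and y = Y / S with S = X + Y + Z."""
    total = big_x + big_y + big_z
    return big_x / total, big_y / total


def layer_chromaticity(eps, source_xyz, layer_xyz):
    """Chromaticity of (eps * X0 + X', eps * Y0 + Y', eps * Z0 + Z')."""
    mixed = [eps * s + l for s, l in zip(source_xyz, layer_xyz)]
    return chromaticity(*mixed)


def cross(p, q, r):
    """Twice the signed area of the triangle p, q, r; zero for collinear points."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def distance(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


SOURCE_XYZ = (98.0, 100.0, 118.0)  # hypothetical (X0, Y0, Z0) of the light source
LAYER_XYZ = (20.0, 35.0, 10.0)     # hypothetical (X', Y', Z') of the eps = 0 layer

source_point = chromaticity(*SOURCE_XYZ)
layer_point = chromaticity(*LAYER_XYZ)
points = [layer_chromaticity(eps, SOURCE_XYZ, LAYER_XYZ) for eps in (0.0, 0.05, 0.5, 5.0)]
```

The point for $\varepsilon = 0$ coincides with $(x', y')$, and larger $\varepsilon$ pulls the point along the line toward $(x_0, y_0)$, i.e. toward lower purity at unchanged hue.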
The chromaticity of the layer of $\varepsilon \ne 0$, i.e. the point $(x, y)$, lies on this straight line, and as $\varepsilon$ is always positive, $(x, y)$ has the same hue as $(x', y')$ but a different purity. The purity is maximum when $\varepsilon = 0$ and becomes less with larger $\varepsilon$. Accordingly it is only necessary to calculate the case of $\varepsilon = 0$ to obtain the hue of the light from the layer (KUBOTA [1950a]).

$A$ and $B$ of a layer with an index of refraction lying between $n_1$ and $n_2$, i.e. a non-reflection film on glass, have the same sign. The reflectivity is therefore given by eq. (4), and we shall call the layer cos-type. For a non-reflection film on glass satisfying the amplitude condition of non-reflection ($n = (n_1 n_2)^{1/2}$), $A = B$ and $\varepsilon = 0$; hence its color is the purest among cos-type layers. On the other hand, when $n$ is larger or smaller than both $n_1$ and $n_2$, $A$ and $B$ have opposite signs and the reflectivity is given by eq. (5). We shall call the layer sin-type. For a soap bubble in air or a thin air film between two plates of glass (Newton ring experiment), $n_1 = n_2$ and $A = -B$; hence $\varepsilon$ is zero in eq. (5) and the color is the purest among sin-type layers. The colors of light reflected from these two types of layers are complementary when both layers are of equal thickness. If we put eqs. (4) and (5) into eq. (6), we can calculate the

TABLE 1
Interference color
Author
Light source
RAYLEIGH [1900]
Illuminant E
LOMMEL [1891] BAUD and WRIGHT [1930] 3000°K blackbody BUCHWALD [1940] B KUBOTA [1950a] C LE GRAND [1956] A KUBOTA [1960] C and B
Type of interference color
Calculated optical thickness
sin-type cossin-, cos-
2 ( 9 4 5 2450 m p 2320 960
sinsin-, sin-, sin-, sin-,
0.5
2000 ca. 2700 1200 1200 3000
coscoscoscos-
Fig. 1 (A). CIE-chromaticity diagram and interference color (figures are (nd) in mμ)
coordinates of the interference color. RAYLEIGH [1900] first calculated them for both cos- and sin-type layers and plotted the result on the Maxwell color triangle. There are many other calculations, as listed in Table 1. Results of calculations of the color of chromatic polarization are also listed in the Table, as they are calculated in the same way. Figs. 1 (A) and 1 (B) show the interference color of a cos-type layer of (nd) ≤ 1500 mμ. Recently MIYAKE [1957] used an electronic calculating machine. In his paper, a table of weighted co-ordinates at equal intervals of wavenumber, which is convenient for the calculation of color with such a calculating machine, is given. The last one in Table 1 (KUBOTA [1960]) was calculated using this table.

Fig. 1 (B). Interference color (figures are (nd) in mμ; light source: illuminant C)

When the thickness (nd) is vanishingly small, we can write

$$
\cos^2 \tfrac{1}{2}\delta \approx 1, \qquad
\sin^2 \tfrac{1}{2}\delta \approx \left(\tfrac{1}{2}\delta\right)^2 = \left[\frac{2\pi (nd)}{\lambda}\right]^2 .
$$
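Accordingly, the color of a cos-type layer becomes white and that of a sin-type layer of ε = 0 approaches the limit given by

$$
x = \frac{X_{\lim}}{X_{\lim} + Y_{\lim} + Z_{\lim}}, \qquad
y = \frac{Y_{\lim}}{X_{\lim} + Y_{\lim} + Z_{\lim}},
$$

where

$$
X_{\lim} = \int E(\lambda)\,\bar{x}(\lambda)\,\lambda^{-2}\,d\lambda, \quad
Y_{\lim} = \int E(\lambda)\,\bar{y}(\lambda)\,\lambda^{-2}\,d\lambda, \quad
Z_{\lim} = \int E(\lambda)\,\bar{z}(\lambda)\,\lambda^{-2}\,d\lambda .
$$

If the intensity of scattered light in the sky were proportional to 1/λ², this would be the blue color of the sky. But in reality it is proportional to 1/λ⁴ and the blue of the sky has a much richer color than this (RAYLEIGH [1900]). According to RÖSCH [1959], the limiting colors when various kinds of light sources are used are as follows:

for illuminant A, x = 0.3983, y = 0.3902;
for illuminant B, x = 0.2968, y = 0.3085;
for illuminant C, x = 0.2630, y = 0.2674;
and for illuminant E, x = 0.2821, y = 0.2858.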
2.2. COLOR OF NON-REFLECTION LAYER
The thickness of the most effective non-reflection layer is such as to make V minimum, where

$$
V = \int I(\lambda)\,v(\lambda)\,d\lambda,
$$

v(λ) being the spectral sensitivity of the receptor, i.e. the eye, photographic emulsion, etc., and I(λ) being the spectral intensity of the light reflected from the layer. For optical instruments of visual use such as a camera finder, binoculars, etc., only the luminance factor need be taken into account and we can put v(λ) = ȳ(λ), so that V = Y. In Fig. 2, the curve shown by the full line is the Y-value of a thin layer when illuminant C is used as the light source. The thickness of the layer which gives minimum Y is, from this curve,
(nd) = 138 mμ.
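The search for the thickness of minimum Y can be sketched as follows. This is only an illustration under stated assumptions: the luminosity curve is replaced by a toy Gaussian centred at 555 mμ (not the CIE ȳ(λ) table), the light source is taken as equal-energy rather than illuminant C, and the reflected intensity is taken as the cos-type form cos² ½δ with δ = 4π(nd)/λ and ε = 0. The function names are hypothetical.

```python
import math


def luminosity(lam):
    """Toy stand-in for the luminosity curve: a Gaussian centred at 555 (assumption)."""
    return math.exp(-0.5 * ((lam - 555.0) / 40.0) ** 2)


def reflected_Y(nd, lam_min=380.0, lam_max=780.0, step=1.0):
    """Rectangle-rule integral of luminosity * cos^2(delta/2), delta = 4*pi*nd/lambda."""
    total = 0.0
    lam = lam_min
    while lam <= lam_max:
        delta = 4.0 * math.pi * nd / lam
        total += luminosity(lam) * math.cos(0.5 * delta) ** 2 * step
        lam += step
    return total


def best_thickness(candidates):
    """Optical thickness (same unit as the wavelength) giving the smallest Y."""
    return min(candidates, key=reflected_Y)


# Scan optical thicknesses from 100 to 180 in steps of 0.5.
nd_best = best_thickness([0.5 * k for k in range(200, 361)])
print(nd_best)
```

With these toy inputs the minimum falls close to a quarter of the wavelength of peak sensitivity, which is the reason the visual optimum lies near 138 mμ; the exact figure depends on the real ȳ(λ) and light source.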
Its co-ordinates are x = 0.205, y = 0.063 and the color is bluish purple. For the layer of camera lenses, a plot of V against (nd), using v(λ) of a usual panchromatic photographic emulsion, is shown in the same figure by the dotted line.

Fig. 2. Luminance factors of the non-reflecting layer

The thickness which gives minimum V for a panchromatic film and for other films thus calculated is about (nd) = 113 to 118 mμ, and its color is orange. Lenses coated with a layer of this thickness are sometimes said to be amber-coated (KUBOTA [1949]). The color of light transmitted through the layer and that of the reflected light are complementary when there is no absorption. The colors of transmitted light of such layers are shown in Fig. 3 (KUBOTA [1950b]), it being assumed that the layers satisfy the amplitude condition for non-reflection. In the figure, m is the number of such layers through which light passes successively. The ellipses drawn with dotted lines show the area of the least perceptible difference (LPD-ellipse) of MACADAM [1942] with the center at illuminant C. The inner ellipse is the LPD-ellipse itself and the outer one is the LPD-ellipse magnified three times, within which the color appears almost white. From this figure, we can see that, when the thickness of the layer is (nd) = 138 mμ, the transmitted light is almost uncolored so long as m is less than about 10. In most practical cases, the layer does not satisfy the amplitude condition for non-reflection and the color of the transmitted light is much less pure, so that the light may
be colorless even for larger m. Special kinds of glass used in modern camera lenses have stronger absorption in the shorter wavelength region. Use of a suitable combination of layers of different thickness is therefore required to compensate for the color due to absorption (WATANABE [1954], SUZUKI [1954] and SCHARF [1952]).

Fig. 3. Chromaticity of light after transmission through m layers of thickness (nd)

2.3. EFFECT OF MULTIPLE REFLECTION AND DISPERSION
To calculate the color of a thin layer, we have used eq. (3), in which multiple reflection of light within the layer is ignored. As long as the difference between the refractive indexes of the layer and the ambient media is not large and A or B is small, we can ignore this effect. But when a layer of high refractive index is used, the difference becomes large and we have to take the effect into account. Eq. (1), which gives the correct intensity of the reflected light, can be rewritten in the following form to make calculation easier:

$$
I = 1 + \frac{F - G}{G + \cos\delta}, \tag{7}
$$

where

$$
F = \frac{A^2 + B^2}{2AB}, \qquad G = \frac{1 + A^2B^2}{2AB}.
$$
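The rewritten form can be checked against the usual multiple-reflection expression. The sketch below assumes that eq. (1) is the Airy-type formula I = (A² + B² + 2AB cos δ)/(1 + A²B² + 2AB cos δ); eq. (1) itself is not reproduced in this section, so this is an assumption, and the function names are hypothetical.

```python
import math


def airy_reflectivity(A, B, delta):
    """Assumed form of eq. (1): reflected intensity with multiple reflection."""
    c = math.cos(delta)
    return (A * A + B * B + 2 * A * B * c) / (1 + A * A * B * B + 2 * A * B * c)


def rewritten_reflectivity(A, B, delta):
    """Eq. (7): I = 1 + (F - G) / (G + cos(delta))."""
    F = (A * A + B * B) / (2 * A * B)
    G = (1 + A * A * B * B) / (2 * A * B)
    return 1 + (F - G) / (G + math.cos(delta))


def max_difference(A, B, steps=63):
    """Largest absolute difference between the two forms over delta in [0, 6.2]."""
    return max(
        abs(airy_reflectivity(A, B, 0.1 * k) - rewritten_reflectivity(A, B, 0.1 * k))
        for k in range(steps)
    )


# A and B of the same sign (cos-type) and of opposite sign (sin-type).
print(max_difference(0.2, 0.3), max_difference(-0.2, 0.3))
```

The form of eq. (7) isolates the whole dependence on δ in the single term cos δ of the denominator, which is what makes the tabulated calculation shorter.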
The result of a calculation of the color based on eq. (6) after substitution from eq. (7), assuming A² = B² for simplicity, is shown in Fig. 4 (KUBOTA

Fig. 4. Effect of multiple reflection (A, B are reflection coefficients of the upper and lower boundaries of the layer, m is the number of multiple reflections)
The outermost curve is the one corresponding to the case when multiple reflection is not taken into account. As is clear from the figure, the effect of multiple reflection is to reduce the purity of the color while its hue is retained almost unchanged. If we expand eq. (7) into a Fourier series,

$$
I = R_0 + \sum_{m=1}^{\infty} R_m \cos m\delta,
$$

where
each term with m = 1, 2, 3, ... corresponds to the effect of multiple reflection, from which we can obtain the effect of each step of multiple reflection on the color. To illustrate this, the case of B² = 0.60 was calculated, and the results are shown in the figure with marks. It is seen that the first few reflections (m ≤ 3) considerably affect the purity, whereas further multiple reflections do not. These calculations, as well as all the calculations hitherto made, were performed ignoring the dispersion of the refractive index. But if a layer of higher refractive index is used, the dispersion of n can no longer be neglected; there will be a considerable change in the hue of the reflected light. Taking dispersion into account, MURRAY [1956] has also examined the color of the reflected light from a layer on various kinds of glass.

2.4. OBLIQUE INCIDENCE
Let us now consider the case in which the light is incident obliquely onto the layer. The reflectivity of the layer is again given by eq. (1), but with

$$
\delta = \frac{4\pi}{\lambda}(nd)\cos i_1,
$$

where i1 is the angle of refraction in the layer. Both A and B are different for the components parallel and perpendicular to the plane of incidence and they are given as follows: for the component parallel to the plane of incidence (p-component),

$$
A = \frac{N_2 - N_1}{N_2 + N_1}, \qquad B = \frac{N_1 - N_0}{N_1 + N_0},
$$

for the component perpendicular to the plane of incidence (s-component),

$$
A = \frac{M_2 - M_1}{M_2 + M_1}, \qquad B = \frac{M_1 - M_0}{M_1 + M_0},
$$

where

$$
M_j = \tan i_j, \qquad N_j = \sin 2i_j; \qquad j = 0, 1, 2.
$$

If we replace M_j and N_j by the refractive indexes, the above equations become the same as eq. (2), which gives the reflection coefficients for normal incidence. So we can apply the results obtained for normal incidence directly to the case of oblique incidence.
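A short numerical sketch of the p-component coefficients may help here. It assumes that medium 2 is the medium of incidence, medium 1 the layer and medium 0 the substrate, with the angles i_j linked by Snell's law; the function names and the sample indexes (a layer of index 2.20 on glass of index 1.50 in air) are choices of this sketch only.

```python
import math


def refraction_angles(n0, n1, n2, i2_deg):
    """Angles (radians) in media 0, 1, 2 for incidence at i2_deg in medium 2."""
    i2 = math.radians(i2_deg)
    s = n2 * math.sin(i2)          # Snell invariant n_j * sin(i_j)
    return math.asin(s / n0), math.asin(s / n1), i2


def p_coefficients(n0, n1, n2, i2_deg):
    """A and B for the p-component, using N_j = sin(2 i_j).

    Valid for 0 < i2_deg < 90; at exactly 0 all N_j vanish.
    """
    N0, N1, N2 = (math.sin(2.0 * i) for i in refraction_angles(n0, n1, n2, i2_deg))
    A = (N2 - N1) / (N2 + N1)
    B = (N1 - N0) / (N1 + N0)
    return A, B


n0, n1, n2 = 1.50, 2.20, 1.00
polarizing_angle = math.degrees(math.atan(n0 / n2))
print(polarizing_angle, p_coefficients(n0, n1, n2, polarizing_angle))
print(30.0, p_coefficients(n0, n1, n2, 30.0))
```

At the angle tan⁻¹(n0/n2) the two outer angles add up to 90°, so N0 = N2 and A = −B; at other angles the two coefficients differ in magnitude.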
Fig. 5. Change of N_j = sin 2i_j with the angle of incidence
In the case of normal incidence, it was shown that two types of layers, cos- and sin-types, may be distinguished according as the refractive index of the layer lies between the indexes of the ambient media or not. In the case of oblique incidence, N_j varies with the angle of incidence as shown in Fig. 5, the cos-type layer becoming the sin-type or vice versa according to the angle of incidence, if we observe the p-component. That the center of the Newton ring becomes dark or bright according to the viewing direction (MAHAN [1952]) is nothing but the outcome of the reasoning described above. With the sin-type layer, the color becomes the purest when A = −B, and if the p-component is considered this is the case where

$$
N_0 = N_2, \qquad \text{i.e.,} \qquad \sin 2i_0 = \sin 2i_2,
$$

and where the angle of incidence is

$$
i_p = \tan^{-1}(n_0/n_2);
$$

that is the angle of polarization for the light entering from the medium of refractive index n2 directly into that of n0. This angle is denoted by i_p in Fig. 5. With the cos-type layer, the color becomes the purest for the p-component when N1 = (N0 N2)^½, that is when

$$
\sin 2i_1 = (\sin 2i_0 \, \sin 2i_2)^{1/2}.
$$

This is the amplitude condition for non-reflection. Accordingly, any layer, even though not satisfying the amplitude condition for perpendicular incidence (n1 ≠ (n0 n2)^½), is bound to have a direction in which the interference color becomes most vivid if the p-component is observed.
Fig. 6. Chromaticity of oblique incidence (optical thickness of the layer (nd) = 137 mμ)
When the refractive indexes of the media on both sides of the layer are equal, that is n0 = n2, then i0 = i2. Accordingly, for both p- and s-components, A = −B, and the color is vivid whatever the direction of observation is. This is the reason why a soap bubble illuminated by sunlight shows beautiful color in all directions. The variation of the interference color with the direction of observation for some layers is shown in Figs. 6 and 7 (KUBOTA and ARA [1951a]). As is clear from Fig. 6, a cos-type layer satisfying the amplitude condition for non-reflection in perpendicular incidence changes its hue rapidly as the angle of incidence varies, its purity being not much affected. With a sin-type layer the opposite is the case, as shown in Fig. 7: the purity changes rapidly as the angle of incidence varies from about 68° to 75° while the hue remains almost unaffected.

Fig. 7. Chromaticity of oblique incidence. Optical thicknesses of the layers are: (1) (nd) = 165 mμ, (2) (nd) = 148 mμ, (3) (nd) = 142 mμ, (4) (nd) = 138 mμ, (5) (nd) = 135 mμ

§ 3. Interference Color of Multilayer

3.1. DOUBLE LAYER
Although the reflectivity of a monolayer is not very high, it is possible to obtain a film of high reflectivity by coating a number of layers of high and low refractive indexes alternately. By selecting a suitable refractive index and thickness for every layer and combining them properly, we are able to get a filter of desired form and width of pass band. As such filters have almost no absorption and the colors of transmitted and reflected light are complementary, some of them, called “dichroic mirrors”, are used to separate the color into its primaries without any loss. Calculation of the spectral reflectivity of such a multilayer from the thickness and refractive index of its component layers is very complicated; it is hardly possible to treat it generally. But there are some relatively simple rules for special cases, for example, if every one of the layers has the same optical thickness of ¼λ0, where λ0 is a wavelength near the middle of the visible region (550 mμ or thereabouts); 2π(nd)/λ then becomes nearly equal to ½π and we can therefore neglect the terms higher than cos² δ. Then if we denote the reflection coefficients at the boundaries between layers by r_j, and ignore the terms higher than r_j⁴, we have for double layers,

$$
R_2^{\,2} \sim \varepsilon_2 + \cos^2\delta, \qquad
\varepsilon_2 = \frac{(r_0 - r_1 + r_2)^2}{4\,[(r_0r_1 + r_1r_2) - 4\,r_2r_0]},
$$

and for triple layers:

$$
R_3^{\,2} \sim \varepsilon_3 + \cos^2\delta, \qquad
\varepsilon_3 = \frac{(r_0 - r_1 + r_2 - r_3)^2}{4\,[(r_0r_1 + r_1r_2 + r_2r_3) - 4\,(r_3r_1 + r_2r_0) + 9\,r_3r_0]}.
$$
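The triple-layer approximation can be compared with a direct sum of the singly reflected amplitudes. This is a minimal sketch under the assumption that every layer has the same phase thickness δ = 2π(nd)/λ, so that the wave reflected at boundary j carries the phase factor exp(−2ijδ), and that multiple reflections are left out; the coefficients `R_COEFFS` are hypothetical values.

```python
import cmath
import math


def first_order_reflectivity(r, delta):
    """|sum_j r_j exp(-2 i j delta)|^2 for equal phase thickness delta per layer."""
    amplitude = sum(rj * cmath.exp(-2j * j * delta) for j, rj in enumerate(r))
    return abs(amplitude) ** 2


def quarter_wave_approximation(r, delta):
    """Triple-layer form kept to cos^2(delta): proportional to eps3 + cos^2(delta)."""
    r0, r1, r2, r3 = r
    constant = (r0 - r1 + r2 - r3) ** 2
    factor = 4.0 * ((r0 * r1 + r1 * r2 + r2 * r3) - 4.0 * (r3 * r1 + r2 * r0) + 9.0 * r3 * r0)
    return constant + factor * math.cos(delta) ** 2   # = factor * (eps3 + cos^2 delta)


# Hypothetical reflection coefficients of the four boundaries.
R_COEFFS = (0.2, -0.15, 0.1, -0.25)

for offset in (0.0, 0.02, 0.05):
    delta = 0.5 * math.pi + offset
    print(offset,
          first_order_reflectivity(R_COEFFS, delta),
          quarter_wave_approximation(R_COEFFS, delta))
```

The two values agree at δ = ½π and differ only by terms of order cos⁴ δ near it, which is the sense in which the higher terms are neglected.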
The hue of such a film is therefore the same as that of a monolayer with the thickness (nd), only the purity being different. This rule was established by BANNING [1947] and is useful for making such multilayer films.

Fig. 8. Chromaticity of an achromatic non-reflection double layer (λ0 = 510 mμ)

As to the film composed of layers of different optical thickness, we shall consider in detail, as an example, the case of an achromatic non-reflection coating with double layers of Sb2O3 and MgF2 on glass, following the calculations of SAWAKI [1958]. In Fig. 8, the dotted line shows the change of color with increasing thickness of an Sb2O3 layer coated on glass, up to (nd)1 = ½λ0 (indicated by P1 in the figure). When MgF2 is coated on top of this ½λ0 Sb2O3 layer up to the thickness (nd)2 = ¼λ0 (indicated by P2), the color changes along the full line. Theoretically, this double layer has an excellent property when coated on glass of refractive index 1.52: the reflectivity is nearly zero at 440 and 600 mμ, in between it is not over 1.8% at any wavelength, and it is about 1% at both ends of the visible region if Sb2O3 is assumed to have no absorption. Measurement of this film coated on glass shows over 98% transmission between 425 and 700 mμ, with a little absorption in the region below 425 mμ; the transmission at 400 mμ is 91%. Another interesting example of a double layer for practical purposes is that of stain on a glass surface, considered by MIYAKE [1957]. Stain is due to the oxidation of glass and is usually found only by experienced workers. But the difference of color between stained and unstained parts of glass becomes conspicuous when a reflection reducing layer is coated thereon, and this is a demerit of coating on glass. To calculate the reflectivity of stain coated with an MgF2 layer, Miyake treated it as a double layer. In the case of triple layer we have
$$
R_3^{\,2} = 1 - \frac{(1 - r_0^2)(1 - r_1^2)(1 - r_2^2)(1 - r_3^2)}{D_3},
$$

$$
\begin{aligned}
D_3 = \bigl|\,1 &+ r_0r_1\,e^{-2i\delta_1} + r_1r_2\,e^{-2i\delta_2} + r_2r_3\,e^{-2i\delta_3}
 + r_0r_2\,e^{-2i(\delta_1+\delta_2)} + r_1r_3\,e^{-2i(\delta_2+\delta_3)} \\
 &+ r_0r_3\,e^{-2i(\delta_1+\delta_2+\delta_3)} + r_0r_1r_2r_3\,e^{-2i(\delta_1+\delta_3)}\,\bigr|^2,
\end{aligned}
$$
where δ_j = 2π(nd)_j/λ, (nd)_j being the optical thickness of the layers. The reflectivity of the double layer is obtained by putting r3 = 0 and δ3 = 0 in this equation. Fig. 9 gives the results of calculations of the color of stain coated with MgF2 of thickness 138 mμ (shown by the dotted lines) and 115 mμ (full line). These thicknesses of MgF2 layers are the most effective thicknesses for achieving non-reflection for visual and photographic purposes respectively. The index of the glass used was n0 = 1.74 (an easily stainable glass) and that of the stain was taken as n1 = 1.47, a value obtained from another experiment. The color of the stain (thickness (nd)1) coated with MgF2 (thickness (nd)2) is nearly the same as the color of an MgF2 layer of thickness [(nd)1 + (nd)2] directly coated on glass. As will be seen later, the color of an MgF2 layer on glass changes rapidly with the change of thickness when its thickness is around 138 mμ. Therefore the stain causes a remarkable change of color and becomes conspicuous when an MgF2 layer of about 138 mμ in thickness is coated thereon. The difference of color between stained and unstained parts is 2 to 3 times the least perceptible difference (2 to 3 LPD-units) when (nd)1 is 1 mμ, 5 to 10 LPD-units when (nd)1 = 5 mμ and more than 30 LPD-units when (nd)1 is over 15 mμ. Hence the stain would become striking when its thickness is over a few mμ.

Fig. 9. Chromaticity of stain coated with MgF2 layer ((nd)1 = optical thickness of the stain, (nd)2 = optical thickness of MgF2 layer)
3.2. TRIPLE LAYER
Calculation of the color of a triple layer is shown as an example for the case when the surface of glass is coated with a triple layer in the order of ZnS, MgF2 and ZnS. For simplicity the light is assumed to be incident normally on the layer. As such a triple layer has 50% reflectivity for light of wavelength λ0 when (nd)1 = (nd)2 = (nd)3 = ¼λ0, and has almost no absorption, we can use it as an excellent beam divider. In Fig. 10 the color is shown with the broken line when ZnS (n1 = 2.20) is coated on glass of n0 = 1.50 from zero up to the thickness (nd)1 = ¼λ0. The chain line starting from P2 is the color when MgF2 (n2 = 1.40) is coated thereon up to (nd)2 = ¼λ0, and finally the full line from P3 shows the color when ZnS is coated again to the thickness (nd)3 = ¼λ0. At P4, as all the layers have the same thickness ¼λ0, its color and that of a cos-type monolayer of thickness (nd) = ¼λ0 are complementary. This chromaticity diagram is used to control the thickness of the layers during the evaporation process. As the Y-value of the layer is maximum at P4, it varies only a few percent even if the thickness of any layer departs from ¼λ0, as long as the departure is less than 15% or thereabouts.

Fig. 10. Chromaticity of a triple layer beam divider
3.3. MULTILAYER
As the number of layers composing a film becomes larger, rise of the spectral reflectivity curve becomes steeper and such a film can be used as an interference filter or dichroic mirror. POHLACK 1119581 made a detailed calculation of the color of multilayer films consisting of 3, 5, 7 and 9 alternating layers of MgFz of (nd) = $10 and ZnS of &to. These films have the same spectral reflectivity as the multilayer of similar construction with the thickness of ZnS layer of $1.0 for an angle of incidence. But shift of the spectral reflectivity curve by the change of the angle of incidence is less with the former than with the latter, that is, the field of view is larger with the former than with the latter (KUBOTA and SAT^ [1953]). Q 4.
Color of a Thin Film on Metallic Surface
It was made clear by EVANS [1952] that the beautiful color produced by an oxide film on a metallic surface is due to interference. CHARLESBY and POLLING [1955] studied this color in detail in the case of a film on tantalum. Theoretically, an oxide film on a metallic surface can be treated in the same way as a film on glass provided one assigns a complex number (n + ik) to the refractive index of the metal surface. MACSWAN [1958] studied this problem using the Smith chart and calculated the spectral reflectivity when a thin film of n' = 2.0 is formed on the surface of a metal of n = 3.0 and k = 1.41. When the thickness of the film is not large (d < 1000 Å), the reflectivity curve is of a simple form with a minimum in the visible region and we can predict the color of reflected light from the curve. For example, it is straw colored when d = 250 Å, purple when d = 400 Å and blue when d = 1000 Å. These agree well with the results of observation made on tantalum. But when the thickness of the film becomes larger, there occur many maxima and minima within the visible region, and the color cannot be estimated without calculating its co-ordinates. The color of a sin-type layer, e.g. a ZnS layer on glass, does not change so rapidly with the thickness as that of a cos-type layer when the thickness is small. This makes the control of the film thickness by color during the evaporation process difficult. NAWATA [1953] found that when the layer is coated on Al metal, the color becomes sensitive to the thickness when it is around (nd) = 80 mμ. Fig. 11 is the result of his calculation of the color plotted on the CIE diagram. We can coat a thin layer of ZnS accurately on glass by using this diagram, putting Al at the side of the glass as a reference surface.

Fig. 11. ZnS layer on Al ((nd) = optical thickness of the layer)

The author and OSE [1955a] have calculated the color of a thin layer on a metallic surface by means of the following formula relating to the intensity of the reflected light:
where η is the phase change by reflection at the metallic surface. B is the reflection coefficient at the metal-layer boundary and it depends strongly on the wavelength, whereas A, the reflection coefficient at the upper surface of the layer, scarcely varies with the wavelength and we assume it to be constant throughout the calculation. B is very large as compared with A when the light is incident normally, but the two become nearly equal when the incidence is sufficiently oblique. In the case of a so-called white metal such as Ag, the color of reflected light is white and η is given by the following equation:

$$
\eta = \eta_0 + b(\lambda - \lambda_0) + c(\lambda - \lambda_0)^2,
$$
where b and c are constants. As b and c differ considerably from metal to metal, it is difficult to give a general discussion. Some cases were calculated and the results are plotted on the CIE-chromaticity diagram as shown in Fig. 12, where the thin lines refer to the case where B is considered constant. The color, when the dispersion of B is taken into account, is represented by thick lines. If we connect the points on the thick line with those on the thin line, joining the colors of equal thickness by straight lines (e.g. Q and Q' for (nd) = 23.6 mμ), all the straight lines run through a point P. This shows that the color assumed with dispersion changes as if the color shown by P were added to the color assumed without dispersion. This is clear from the expression of the tristimulus values of the color assumed with dispersion; for by substituting eq. (8) into eq. (6) we obtain

$$
X = \int E(\lambda)\,\bar{x}(\lambda)\,\varepsilon\,d\lambda
  + \int E(\lambda)\,\bar{x}(\lambda)\,B(\lambda)\cos^2 \tfrac{1}{2}(\delta + \eta)\,d\lambda; \quad \text{etc.},
$$

and the first term in these equations does not contain (nd); hence P is nothing but the color given by these terms.

Fig. 12. Thin layer on Ag ((nd) = optical thickness of the layer)
§ 5. Interference Color of Chromatic Polarization

White light, when passed through a thin piece of crystal between Nicol prisms, is colored. This phenomenon is called “chromatic polarization”, named by Arago who first found it in 1811 (POCKELS [1906]). The color is due to the birefringence or optical activity of the crystal. RÖSCH [1954, 1959] classified the interference colors. He proposed the name “Norrenberg color” for the color due to birefringence without dispersion, and the name “Fresnel color” for the color due to optical activity. As color of any hue is reproducible, these phenomena are utilized to produce the standards of color for color instruments. BRÜCKE [1887] and MEISLING [1904] each made instruments called “Schistoskop” utilizing the double refraction of gypsum and quartz and measured the amount of haemoglobin in blood by its color. ARONS [1910] made the “Chromoskop” utilizing the optical activity of quartz. There were many other instruments of a similar kind (e.g. RÖSCH [1949]). We will now show the color produced by chromatic polarization on the CIE-chromaticity diagram.

5.1. BIREFRINGENT CRYSTAL
Assuming that the light is incident normally on a piece of birefringent crystal and designating the intersections of the plane of the crystal with the planes of polarization of the ordinary and extraordinary rays in the crystal as the X and Y axes, we shall calculate the intensity of light I(λ) passed through the crystal between Nicols. When the planes of the polarizer and analyser make angles φ and θ with the X-axis respectively, and if we assume the amplitude of the incident light to be unity, I becomes

$$
I = \cos^2(\varphi + \theta)\,\sin^2 \tfrac{1}{2}\delta + \cos^2(\varphi - \theta)\,\cos^2 \tfrac{1}{2}\delta,
$$

where δ = 2π(μd)/λ, μ is the difference between the refractive indexes of the ordinary and extraordinary rays in the crystal and d the thickness of the crystal. The relative intensity can be obtained from the above equation as:

$$
I \sim \varepsilon + \cos^2 \tfrac{1}{2}\delta, \qquad
\varepsilon = \frac{\cos^2(\varphi + \theta)}{\sin 2\varphi \cdot \sin 2\theta}, \tag{9}
$$

or

$$
I \sim \varepsilon + \sin^2 \tfrac{1}{2}\delta, \qquad
\varepsilon = -\,\frac{\cos^2(\varphi - \theta)}{\sin 2\varphi \cdot \sin 2\theta}. \tag{10}
$$
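The two relative forms can be checked against the intensity formula with a few lines of code. In this sketch the polarizer and analyser angles are called `phi` and `theta` (radians); the cos-type form is obtained from the intensity formula by the identity cos²(φ − θ) − cos²(φ + θ) = sin 2φ · sin 2θ, and the function names are assumptions of the sketch.

```python
import math


def transmitted_intensity(phi, theta, delta):
    """I = cos^2(phi+theta) sin^2(delta/2) + cos^2(phi-theta) cos^2(delta/2)."""
    return (math.cos(phi + theta) ** 2 * math.sin(0.5 * delta) ** 2
            + math.cos(phi - theta) ** 2 * math.cos(0.5 * delta) ** 2)


def eps_cos_type(phi, theta):
    """epsilon of the cos-type form, I = k * (eps + cos^2(delta/2))."""
    return math.cos(phi + theta) ** 2 / (math.sin(2 * phi) * math.sin(2 * theta))


def eps_sin_type(phi, theta):
    """epsilon of the sin-type form, I = -k * (eps + sin^2(delta/2))."""
    return -math.cos(phi - theta) ** 2 / (math.sin(2 * phi) * math.sin(2 * theta))


def form_errors(phi, theta, delta):
    """Differences between I and each relative form; k = sin(2 phi) sin(2 theta)."""
    k = math.sin(2 * phi) * math.sin(2 * theta)
    i = transmitted_intensity(phi, theta, delta)
    cos_form = k * (eps_cos_type(phi, theta) + math.cos(0.5 * delta) ** 2)
    sin_form = -k * (eps_sin_type(phi, theta) + math.sin(0.5 * delta) ** 2)
    return i - cos_form, i - sin_form


print(form_errors(0.7, 0.4, 1.3))
print(eps_cos_type(0.5, 0.5 * math.pi - 0.5))   # phi + theta = pi/2
print(eps_sin_type(0.3 + 0.5 * math.pi, 0.3))   # phi - theta = pi/2
```

Both forms reproduce I up to the constant factor ±sin 2φ · sin 2θ, and ε vanishes in the cos-type form when φ + θ = ½π and in the sin-type form when φ − θ = ½π.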
These are the same formulae as eqs. (4) and (5), in which δ = 4π(nd)/λ, for the intensity of light reflected from a thin layer within which multiple reflection is ignored. The color of chromatic polarization is therefore identical with that of the thin layer shown in Fig. 1 when ½(μd) is substituted for (nd). When φ − θ = ½π (crossed Nicols), the color is of sin-type, and when φ + θ = ½π (the case of polarizer and analyser situated symmetrically with respect to the bisectors of the X and Y axes, named symmetrical Nicols by KUBOTA and ARA [1951b]), the color corresponds to that of a cos-type layer. The well-known case of the Nicols being parallel with each other and the crystal being set in the diagonal position (φ = θ = ¼π) is a special case of symmetrical Nicols. In both crossed and symmetrical Nicols, ε = 0 in eqs. (9) and (10), and the color is more vivid than in all the other cases. Calculations in these two cases with (μd) as parameter are listed in Table 1. These calculations were made on the assumption that μ does not depend on the wavelength, that is, its dispersion is negligible, as was the case of the thin layer in which n was considered constant. According to EHRINGHAUS [1920], the interference color becomes anomalous when |N| is smaller than 30, where

$$
N = \frac{\mu_D}{\mu_F - \mu_C}.
$$
Here, μ_D is the birefringence for the D-line, etc. For quartz and gypsum, N is about 34 and the interference color can be considered normal. But for a few crystals, such as epidote, clinochlore, vesuvianite and apophyllite, it becomes very small. As N is different for different crystals, it is hardly possible to treat their color generally. We show here, as an example, the case of apophyllite, for which N = −4.0 (TROLLE [1906]). The result of a calculation of the interference color using the CIE system is shown in Fig. 13; the figure is different from the usual one (Fig. 1).

Fig. 13. Interference color of apophyllite (KUBOTA and OSE [1955b])

5.2. SENSITIVE COLOR
If we examine the rate of change of color with the change of (μd) in Fig. 1, we notice at once that this rate is not constant. The color changes rapidly with the change of (μd) when the color is purple, but it changes slowly when the color is green. The former fact has long been known as the sensitive color. When the crystal is very thin and its color is nearly white or green, the color does not change sensitively with the change of (μd), but if we superpose thereon a thin piece of crystal showing purple color between Nicols, which is called a sensitive color plate, the color of the field of view becomes purple as a result, changing sensitively with the change of (μd) of the base crystal and making even its slight irregularity detectable. MARCELIN [1931] measured the thickness of the cleavage steps of a mica sheet using the sensitive color and found that the thicknesses are integral multiples of 0.4 mμ, the thickness of one layer of mica molecules. The thickness of the most effective non-reflecting layer for visual purposes is (nd) = 138 mμ, which is also the thickness that gives this sensitive color. This fact is very advantageous in coating, for the thickness of the layer can be accurately controlled by observing the color; on the other hand, even slight unevenness of the thickness of the layer is much exaggerated in the change of color and the irregularity becomes conspicuous. For treating the sensitivity of the sensitive color quantitatively, the ratio ΔF/Δ(μd) has been used, where ΔF is the change of color and Δ(μd) is that of the corresponding retardation. But as there was no exact way of representing the color quantitatively, the discussions hitherto made (WENZEL [1917], LOMMEL [1891]) were qualitative. The author used a uniform chromaticity scale (UCS) diagram based on observed values of LPD to measure ΔF. In a UCS diagram, the length of the line element is proportional to the difference of color.
Although there are many kinds of such UCS diagrams, MacAdam's UCS (MACADAM [1942]) was used in the author's first work, as it is not only very simple for the calculation but also has the necessary accuracy for the present purpose. The sensitivities S for the cases of cos- and sin-type interference colors thus calculated are shown with full and chain lines respectively in Fig. 14. From the figure it is seen that the maximum sensitivity of the broken line (crossed Nicols) lies at about (μd) = 520 mμ and that of the full line (symmetrical Nicols) at about (μd) = 260 mμ. The existence of the former peak has been well known, but the latter peak has only been observed in the special case of parallel Nicols with the crystal set in the diagonal position (WEINSCHENK [1919]); its more general existence (symmetrical Nicols) and its sensitivity, twice as large as that of the crossed Nicols, have been left unnoticed.

Fig. 14. Sensitivity of the sensitive colors (KUBOTA, ARA and SAITO [1951b])

TABLE 2
Sensitive colors (theoretical)

Arrangement | (μd) in mμ | Sensitivity
Symmetrical Nicols, 1st order | 261.4 | 2.2
Crossed Nicols, 1st order | | 1.0
Symmetrical Nicols, 2nd order | | 0.6

The retardations (μd) of maximum sensitivity obtained from the calculation are given in Table 2, in which illuminant C was used as the light source. KUBOTA and SHIMIZU [1957] verified the figures given in Table 2 by experiments.

5.3. SENSITIVITY OF THE SENSITIVE COLOR
Values of the sensitivity of the sensitive color given in Table 2 are those obtained with the Nicols and crystal placed precisely in the position prescribed by φ ± θ = ½π. When the Nicols are crossed, that is φ − θ = ½π, rotation of the crystal does not affect this relation and the sensitivity is independent of the orientation of the crystal. In the case of symmetrical Nicols, rotation of the crystal, that is the movement of the X and Y axes, breaks the condition φ + θ = ½π; consequently ε is no longer zero and the sensitivity decreases. Sensitivity curves for ε = 0.02, 0.04, etc., are given with dotted lines in the same figure. It is seen from the figure that the sensitivity of symmetrical Nicols becomes less than that of crossed Nicols when ε > 0.03, that is when |(φ + θ) − ½π| > 5°. Moreover, as the sensitivity curve in the case of symmetrical Nicols is steeper than in the case of crossed Nicols, even a slight deviation of (μd) from the value which gives peak sensitivity causes a sudden decrease of the sensitivity in the former case. The reason why the high sensitivity of symmetrical Nicols has not been found is in all probability the difficulty encountered in obtaining the peak sensitivity (KUBOTA, ARA and SAITO [1951b]).

TABLE 3
Sensitive colors (experimental)

Author | (μd) in mμ, parallel Nicols | (μd) in mμ, crossed Nicols
MASCART [1891] | - | 575
WÜLFING [1919] | 281 | 575
WEINSCHENK [1919] | 275 | 550
In Table 3, values of (pi)for sensitive colors given by past authors are showri. Although these values agree in the main with those of Table 2, close examination reveals considerable differences that surpass in magnitude the error of experiment as well as of calculation. For all the values given in Table 2, illuminant C of CIE was used as the light source, but as to the values given in Table 3, details of tlic light source used in those bygone experiments are untraceable. Difference between the values listed in the two tables could possibly be due to the difference in light sources. To ascertain this, the author has calculated the interference colors using various light sourct:, (black body radiation of various temperatures) and the sensitive colors therefrom. Results are plotted on CIE chromaticity diagram as shown in Fig. 15. From Ihis figure, it is seen that the sensitive color (markcd with 0) is purple n.hcii the temperature T of the light source
is high (T > 2500°K), purple-pink and pink when the temperature is between 2500° and 1500°, and orange for lower temperatures. Conversely, if (μd) of the sensitive color is given, the corresponding color temperature of the light source can be found from Fig. 15. Thus (μd) = 550 mμ, which is the value given by the past authors for the sensitive color, corresponds to a light source at 2000°K, this being just the color temperature of a carbon-filament lamp. The past authors must have used a carbon-filament lamp as the light source. The hue of the sensitive color in this case is purple-pink, which is rather reddish and could naturally have been called red, as mentioned in old papers. As for the sensitive color plate, it should be of the proper thickness to match the color temperature of the light source, or else the sensitivity will decrease.

Fig. 15. Sensitive colors for light sources of various color temperatures

5.4. HYPERSENSITIVE COLOR
To treat more generally the intensity of light passed through a crystal between Nicols, we shall add a retardation q between ordinary and extraordinary rays. Then the intensity is given by
In the case of symmetrical Nicols q = 0, and the case of crossed Nicols corresponds to q = − z. We shall now examine the case when q has an arbitrary value, but is independent of the wavelength. Such an achromatic retardation can be obtained by utilizing the phase shift that occurs in total reflection: the shift is between component rays with planes of polarization parallel and perpendicular to the plane of incidence. The amount of the shift varies with the angle of incidence. By a brief consideration (KUBOTA [1952]), the sensitivity when q is added is found to be
which means that the nearer q approaches z, the larger S becomes. The sensitive colors in crossed and symmetrical Nicols are special cases of those given here, and we are able to obtain colors more sensitive than with symmetrical Nicols. We shall call these sensitive colors hyper-sensitive colors.

5.5. OPTICALLY ACTIVE CRYSTAL
A piece of z-cut quartz (cut perpendicular to the optical axis), when placed between two Nicol prisms, exhibits interference colors of chromatic polarization because of its optically active nature. Representing the rotatory power per unit thickness of the crystal by ρ(λ), and denoting the angle between the Nicol prisms by θ, the intensity of light emerging from the analyzer is
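Eq. (11) can be regarded as a special case of this equation. Tristimulus values obtained by substituting the preceding formulae into eq. (6) are

$$
\begin{aligned}
X &= \int E(\lambda)\,\bar{x}(\lambda)\,I(\lambda)\,d\lambda = X_0 + X_r\cos 2\theta + X_i\sin 2\theta,\\
Y &= \int E(\lambda)\,\bar{y}(\lambda)\,I(\lambda)\,d\lambda = Y_0 + Y_r\cos 2\theta + Y_i\sin 2\theta,\\
Z &= \int E(\lambda)\,\bar{z}(\lambda)\,I(\lambda)\,d\lambda = Z_0 + Z_r\cos 2\theta + Z_i\sin 2\theta,
\end{aligned}
$$

where E(λ) is the energy distribution of the light source and

$$
\begin{aligned}
X_0 &= \tfrac12\int E(\lambda)\,\bar{x}(\lambda)\,d\lambda, &
X_r &= \tfrac12\int E(\lambda)\,\bar{x}(\lambda)\cos(2\rho d)\,d\lambda, &
X_i &= \tfrac12\int E(\lambda)\,\bar{x}(\lambda)\sin(2\rho d)\,d\lambda,\\
Y_0 &= \tfrac12\int E(\lambda)\,\bar{y}(\lambda)\,d\lambda, &
Y_r &= \tfrac12\int E(\lambda)\,\bar{y}(\lambda)\cos(2\rho d)\,d\lambda, &
Y_i &= \tfrac12\int E(\lambda)\,\bar{y}(\lambda)\sin(2\rho d)\,d\lambda,\\
Z_0 &= \tfrac12\int E(\lambda)\,\bar{z}(\lambda)\,d\lambda, &
Z_r &= \tfrac12\int E(\lambda)\,\bar{z}(\lambda)\cos(2\rho d)\,d\lambda, &
Z_i &= \tfrac12\int E(\lambda)\,\bar{z}(\lambda)\sin(2\rho d)\,d\lambda.
\end{aligned}
$$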
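The split of each tristimulus value into a constant term, a cos 2θ term and a sin 2θ term can be checked in one line. The step below is a sketch that assumes the single-crystal intensity has the Malus-type form I(λ) = cos²{ρ(λ)d − θ}, i.e. the N = 1 case of the product formula for alternating crystals and Nicols given further on in this section; that form is an assumption made here for illustration.

```latex
\begin{aligned}
I(\lambda) &= \cos^2\{\rho(\lambda)d - \theta\}
            = \tfrac12\bigl[1 + \cos\bigl(2\rho(\lambda)d - 2\theta\bigr)\bigr] \\
           &= \tfrac12 + \tfrac12\cos\bigl(2\rho(\lambda)d\bigr)\cos 2\theta
                       + \tfrac12\sin\bigl(2\rho(\lambda)d\bigr)\sin 2\theta .
\end{aligned}
```

Multiplying by E(λ) and the appropriate color-matching function and integrating over λ leaves θ outside the integrals, which gives the three-term form with a factor ½ in front of each of the nine integrals.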
The locus of the color obtained by varying the angle between the Nicol prisms is found by eliminating θ from the above equations. The elimination is carried out by first solving for sin 2θ and cos 2θ and then adding their squares, thus:
Putting

$$
x = \frac{X}{S}, \qquad y = \frac{Y}{S},
$$

where

$$
S = X + Y + Z,
$$

the above equation becomes:
Since the locus should be a closed curve on the CIE chromaticity diagram and the above equation is quadratic in x and y, the locus is an ellipse on the CIE diagram. In Fig. 16, these ellipses for various thicknesses of crystal are shown. BUCHWALD [1940] also calculated such loci and plotted them on a color triangle with a remark that the curves are "ellipsenähnlich" (ellipse-like).
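The elimination of θ described above can be sketched numerically. The snippet below is a minimal illustration: the nine coefficients are hypothetical numbers chosen only so that S stays positive, not values from the text or from any CIE table. It solves the two chromaticity relations for cos 2θ and sin 2θ by Cramer's rule and checks that the sum of their squares equals one, which is the quadratic relation in x and y.

```python
import math

# Hypothetical coefficients (C0, Cr, Ci) for X, Y and Z. Illustrative only,
# not taken from the text or from CIE data.
X = (0.50, 0.10, -0.05)
Y = (0.50, -0.08, 0.12)
Z = (0.55, 0.02, 0.20)
S = tuple(a + b + c for a, b, c in zip(X, Y, Z))  # S = X + Y + Z, term by term


def tristimulus(coeffs, theta):
    """Evaluate C0 + Cr cos 2θ + Ci sin 2θ."""
    c0, cr, ci = coeffs
    return c0 + cr * math.cos(2 * theta) + ci * math.sin(2 * theta)


def chromaticity(theta):
    """Chromaticity coordinates x = X/S, y = Y/S for a given Nicol angle."""
    s = tristimulus(S, theta)
    return tristimulus(X, theta) / s, tristimulus(Y, theta) / s


def conic_residual(x, y):
    """Return Nc^2 + Ns^2 - D^2, where cos 2θ = Nc/D and sin 2θ = Ns/D.

    From x*S = X and y*S = Y one gets two linear equations in
    c = cos 2θ and s = sin 2θ:  a1*c + b1*s = r1,  a2*c + b2*s = r2.
    """
    a1, b1, r1 = x * S[1] - X[1], x * S[2] - X[2], X[0] - x * S[0]
    a2, b2, r2 = y * S[1] - Y[1], y * S[2] - Y[2], Y[0] - y * S[0]
    det = a1 * b2 - b1 * a2
    n_c = r1 * b2 - b1 * r2
    n_s = a1 * r2 - r1 * a2
    return n_c ** 2 + n_s ** 2 - det ** 2


# Every point of the locus should satisfy the quadratic relation.
residuals = [abs(conic_residual(*chromaticity(k * math.pi / 36)))
             for k in range(36)]
print(max(residuals))
```

The products x·y cancel in `det`, `n_c` and `n_s`, so each is linear in x and y and the residual is a second-degree polynomial, consistent with the statement that the locus is a conic.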
If light is passed through a series of crystals and Nicols placed alternately, then

$$
I(\lambda) = \prod_{n=1}^{N} \cos^2\{\rho(\lambda)\,d_n - \theta_n\},
$$

where d_n is the thickness of the nth crystal, θ_n is the angle between the nth and (n+1)th Nicol prisms and N the number of crystals. When d_n = 2^n d and θ_n = 2^n θ, then

$$
I(\lambda) = \prod_{n=1}^{N} \cos^2\bigl[2^n\{\rho(\lambda)\,d - \theta\}\bigr].
$$
The intensity distribution given by this formula has maxima at the wavelengths given by

$$
\rho(\lambda)\,d - \theta = m\pi, \qquad m = 1, 2, 3, \ldots \tag{12}
$$
and the width of these maxima can be made as narrow as desired by increasing the number of crystals. This is the principle of Lyot's filter. LYOT [1933] used birefringent crystals to make such a filter. But if we use optically active crystals and change the angle θ while retaining the relation θ_n = 2^n θ between the angles, then, as is clear from eq. (12), the center of the pass band will move continuously as θ varies and the width of the band is still kept narrow. Thus we can obtain a variable filter. (If we use birefringent crystals, we have to devise a mechanism for varying the thickness d of the crystal while keeping the relation d_n = 2^n d, which will be very complicated.) Fig. 17 is the locus of color of such a variable filter with three crystals, that is N = 3. From the figure, we see that d = 6 mm gives the purest hue for all colors. According to other calculations (KUBOTA and SAITO [1954]), purity is not sufficient when N < 2, and increasing N beyond 3 may not improve the purity much. Hence N = 3 with d = 6 mm will be the best for practical use. Change of color when d is varied and θ is kept constant gives a locus similar to the one given in Fig. 1.

Fig. 17. Color of three element filter, figures representing the angle θ between the first and the second Nicol prisms (KUBOTA and SAITO [1954])
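The narrowing of the pass band with the number of crystals can be illustrated with a short sketch. It works directly with the phase ρ(λ)d − θ rather than with wavelength, so no rotatory-dispersion data are needed, and it assumes that the product runs over n = 1, ..., N with factors 2^n. The function names and the scanning step are choices made here for illustration.

```python
import math


def transmission(phase, n_crystals):
    """Product over n = 1..N of cos^2(2^n * phase), with phase = rho(lambda)*d - theta."""
    t = 1.0
    for n in range(1, n_crystals + 1):
        t *= math.cos(2 ** n * phase) ** 2
    return t


def half_width(n_crystals, centre=math.pi, step=1e-5):
    """Phase offset from `centre` at which the transmission first falls below 1/2."""
    offset = 0.0
    while transmission(centre + offset, n_crystals) >= 0.5:
        offset += step
    return offset


# Pass-band half-widths (in phase units) for one, two and three crystals.
widths = [half_width(n) for n in (1, 2, 3)]
for n, w in zip((1, 2, 3), widths):
    print(n, round(w, 4))
```

Because the transmission depends on λ and θ only through ρ(λ)d − θ, turning the Nicols shifts the wavelength at which the phase equals mπ without changing the shape of the band in phase, which is the tuning property described in the text.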
§ 6. Interference Color in Other Phenomena
When the aperture of an aberration-free optical system is circular, the intensity distribution in the image plane of a point source is given by :
where r is the distance from the geometrical optical image of the light source and (2a/f) is the numerical aperture of the optical system. As the intensity distribution depends on the wavelength, the diffraction image will also be seen colored. MECKE [1920] was the first to calculate this color and plotted it on a color triangle. Fig. 18 is the recalculation of Mecke's figure using the CIE system (the author is indebted to Mr. S. ENDŌ [1960]). If we draw the MacAdam LPD-ellipse, enlarged about ten times, with its center at C in this figure, and suppose that the area within this ellipse can be considered practically as white, the diffraction image is white for about 1.5 < z < 1.8.

Fig. 18. Chromaticity of diffraction image (figures representing z = …)

Recently, SOHDA [1959] calculated the luminance factor of a diffraction image taking z as parameter and concluded that the resolving power (based on the Rayleigh limit) when white light is used is nearly the same as that calculated by assuming that the light is of the wavelength of green light. This conclusion holds even if we take the chromaticity into account, for the resolving power of the eye is known to be maximum for green light. Hence, for example, in the NTSC color television system, green is sent by the main carrier to give detail of the image while the two other primaries, red and violet, are sent by subcarriers to color the image (HOWELLS [1954]). As to the color of light through mist, Mecke has calculated it and plotted it on a color triangle, taking the radius of the water drop as parameter. Colors of the rainbow were examined by EXNER and PERNTER [1922], who calculated the intensity of light diffracted by water drops, but they have not so far been investigated in the light of modern colorimetry.
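The wavelength dependence that colors the diffraction image can be sketched as follows. The block assumes the standard Airy expression [2J₁(z)/z]² for a circular aperture, with z = π(2a/f)r/λ; both the expression and this definition of z are assumptions made here, since the formula itself is not reproduced above. The aperture ratio and the two wavelengths are hypothetical illustrative values.

```python
import math

FIRST_ZERO = 3.8317059702  # first zero of the Bessel function J1


def bessel_j1(z, samples=400):
    """J1(z) from Bessel's integral (1/pi) * int_0^pi cos(t - z sin t) dt."""
    h = math.pi / samples
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - z * math.sin(math.pi)))
    for k in range(1, samples):
        t = k * h
        total += math.cos(t - z * math.sin(t))
    return total * h / math.pi


def airy(z):
    """Relative intensity [2 J1(z) / z]^2 of the Airy pattern."""
    if z == 0.0:
        return 1.0
    return (2.0 * bessel_j1(z) / z) ** 2


def relative_intensity(r_um, wavelength_um, aperture_ratio):
    """Intensity at radius r, using the assumed form z = pi * (2a/f) * r / lambda."""
    return airy(math.pi * aperture_ratio * r_um / wavelength_um)


aperture_ratio = 0.1        # hypothetical value of 2a/f
green, red = 0.55, 0.65     # illustrative wavelengths in microns

# Radius of the first dark ring for green light.
r_dark_green = FIRST_ZERO * green / (math.pi * aperture_ratio)
i_green = relative_intensity(r_dark_green, green, aperture_ratio)
i_red = relative_intensity(r_dark_green, red, aperture_ratio)
print(r_dark_green, i_green, i_red)
```

At the radius where green light has its first dark ring, the longer wavelength still carries appreciable intensity, so that zone of the image is tinted rather than black.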
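NEW TABLES OF THE INTERFERENCE COLOR

$$
Z_a(\gamma) = \int \cos^2\!\frac{2\pi\gamma}{\lambda}\;E(\lambda)\,\bar{z}(\lambda)\,d\lambda,
\qquad
Z_b(\gamma) = \int \sin^2\!\frac{2\pi\gamma}{\lambda}\;E(\lambda)\,\bar{z}(\lambda)\,d\lambda.
$$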
Here x̄, ȳ and z̄ are the tristimulus values of the spectrum of the CIE, and E(λ) is the energy distribution of the CIE light source. Numerical integration was performed by NEAC (Nippon Electric Co. Automatic Computer). Y_a and Y_b in the Tables are divided by ∫E(λ)ȳ(λ)dλ = Y_a(γ = 0). Figures in the fifth and higher decimal places were discarded.
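The kind of integration behind the tables can be sketched in a few lines. The block below is only an illustration of the procedure: it replaces the CIE ȳ(λ) table by a crude Gaussian stand-in and the CIE sources by an equal-energy spectrum, so its numbers are not those of Tables 1 and 2. It assumes the integrands cos²(2πγ/λ) and sin²(2πγ/λ) and the normalisation by Y_a(γ = 0); γ and λ are in mμ, which is the same unit as the nanometre.

```python
import math


def ybar_stand_in(wavelength_nm):
    """Crude Gaussian stand-in for the CIE ybar(lambda); NOT the CIE table."""
    return math.exp(-0.5 * ((wavelength_nm - 555.0) / 40.0) ** 2)


def source_stand_in(wavelength_nm):
    """Equal-energy stand-in for E(lambda); the tables use CIE sources C and A."""
    return 1.0


def luminance_pair(gamma_nm, lo=380.0, hi=780.0, step=1.0):
    """Return (Y_a, Y_b), each divided by Y_a(gamma = 0), by a rectangle-rule sum."""
    ya = yb = norm = 0.0
    n = int(round((hi - lo) / step))
    for k in range(n + 1):
        lam = lo + k * step
        w = source_stand_in(lam) * ybar_stand_in(lam) * step
        phase = 2.0 * math.pi * gamma_nm / lam
        ya += math.cos(phase) ** 2 * w
        yb += math.sin(phase) ** 2 * w
        norm += w
    return ya / norm, yb / norm


for gamma in (0.0, 139.0, 278.0):
    print(gamma, luminance_pair(gamma))
```

With this normalisation Y_a + Y_b = 1 for every γ, which is also a convenient consistency check on the rows of a table computed in this way.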
TABLE 1
C-light source (γ in mμ)

    γ      Y_a      x_a      y_a      Y_b      x_b      y_b
    0   1.0000   0.3101   0.3163   0.0000   0.0000   0.0000
   20   0.9492   0.3131   0.3193   0.0508   0.2638   0.2686
   40   0.8073   0.3228   0.3291   0.1927   0.2663   0.2720
   60   0.6038   0.3425   0.3480   0.3962   0.2708   0.2778
   80   0.3808   0.3808   0.3812   0.6192   0.2775   0.2863

  100   0.1840   0.4592   0.4336   0.8159   0.2870   0.2981
  110   0.1085   0.5147   0.4472   0.8915   0.2931   0.3054
  120   0.0534   0.5129   0.3675   0.9466   0.3004   0.3138
  124   0.0378   0.4607   0.2878   0.9622   0.3036   0.3175
  126   0.0315   0.4229   0.2424   0.9685   0.3053   0.3195
  128   0.0261   0.3806   0.1980   0.9739   0.3071   0.3214
  130   0.0217   0.3375   0.1581   0.9783   0.3089   0.3235
  132   0.0183   0.2968   0.1251   0.9817   0.3108   0.3256
  134   0.0159   0.2609   0.1002   0.9841   0.3127   0.3277
  136   0.0145   0.2310   0.0830   0.9855   0.3147   0.3299

  140   0.0147   0.1892   0.0680   0.9853   0.3190   0.3345
  145   0.0206   0.1631   0.0719   0.9794   0.3248   0.3406
  150   0.0326   0.1546   0.0870   0.9674   0.3310   0.3471
  160   0.0742   0.1595   0.1236   0.9258   0.3454   0.3615
  170   0.1373   0.1732   0.1565   0.8627   0.3627   0.3777
  180   0.2182   0.1878   0.1840   0.7818   0.3835   0.3957

  200   0.4154   0.2156   0.2287   0.5846   0.4378   0.4345
  220   0.6250   0.2424   0.2675   0.3750   0.5019   0.4545
  240   0.8040   0.2711   0.3056   0.1960   0.5036   0.3694
  250   0.8706   0.2871   0.3254   0.1294   0.4372   0.2664
  260   0.9174   0.3046   0.3459   0.0826   0.3390   0.1622
  270   0.9423   0.3240   0.3671   0.0577   0.2503   0.0970
  280   0.9443   0.3456   0.3890   0.0557   0.1929   0.0758
  290   0.9238   0.3695   0.4111   0.0762   0.1641   0.0834

  300   0.8821   0.3958   0.4322   0.1179   0.1541   0.1052
  320   0.7465   0.4522   0.4638   0.2535   0.1629   0.1633
  340   0.5671   0.4964   0.4556   0.4329   0.1892   0.2258
  350   0.4727   0.5015   0.4245   0.5273   0.2061   0.2575
  360   0.3816   0.4874   0.3725   0.6184   0.2251   0.2894
  370   0.2984   0.4523   0.3047   0.7016   0.2463   0.3215
  380   0.2271   0.4005   0.2327   0.7729   0.2698   0.3536
  390   0.1711   0.3414   0.1694   0.8289   0.2955   0.3852

  400   0.1329   0.2847   0.1237   0.8671   0.3232   0.4154
  410   0.1139   0.2369   0.0982   0.8861   0.3526   0.4427
  420   0.1148   0.2006   0.0912   0.8852   0.3826   0.4653
  430   0.1349   0.1756   0.0991   0.8650   0.4118   0.4805
  440   0.1728   0.1607   0.1185   0.8272   0.4381   0.4856
  460   0.2918   0.1555   0.1811   0.7082   0.4708   0.4568
  480   0.4450   0.1765   0.2638   0.5550   0.4630   0.3764

  500   0.6008   0.2193   0.3554   0.3992   0.4146   0.2714
  520   0.7285   0.2789   0.4405   0.2715   0.3444   0.1801
  540   0.8048   0.3450   0.4971   0.1952   0.2735   0.1265
  560   0.8180   0.4006   0.5042   0.1820   0.2148   0.1183
  580   0.7693   0.4300   0.4589   0.2307   0.1748   0.1553
TABLE 1 (continued)

    γ      Y_a      x_a      y_a      Y_b      x_b      y_b
  600   0.6720   0.4290   0.3804   0.3280   0.1595   0.2351
  620   0.5481   0.4051   0.2954   0.4519   0.1752   0.3460
  640   0.4231   0.3684   0.2229   0.5769   0.2227   0.4565
  660   0.3120   0.3263   0.1728   0.6790   0.2871   0.5209
  680   0.2599   0.2826   0.1506   0.7401   0.3432   0.5155

  700   0.2485   0.2398   0.1632   0.7515   0.3754   0.4585
  720   0.2855   0.2024   0.2202   0.7145   0.3850   0.3831
  740   0.3603   0.1811   0.3243   0.6397   0.3800   0.3119
  760   0.4558   0.1907   0.4453   0.5442   0.3673   0.2545
  780   0.5525   0.2311   0.5167   0.4475   0.3505   0.2139
  800   0.6320   0.2785   0.5070   0.3680   0.3307   0.1922
  820   0.6805   0.3132   0.4489   0.3195   0.3073   0.1942
  840   0.6914   0.3332   0.3815   0.3086   0.2791   0.2287
  860   0.6656   0.3435   0.3233   0.3344   0.2477   0.3032
  880   0.6110   0.3487   0.2787   0.3890   0.2229   0.4013

  900   0.5403   0.3512   0.2481   0.4597   0.2191   0.4674
  920   0.4679   0.3519   0.2316   0.5321   0.2362   0.4663
  940   0.4078   0.3497   0.2314   0.5922   0.2604   0.4232
  960   0.3700   0.3417   0.2517   0.6300   0.2827   0.3724
  980   0.3597   0.3235   0.2954   0.6403   0.3018   0.3294

 1000   0.3765   0.2932   0.3532   0.6235   0.3187   0.2975
 1020   0.4147   0.2605   0.3964   0.5853   0.3347   0.2767
 1040   0.4653   0.2408   0.4031   0.5347   0.3500   0.2664
 1060   0.5176   0.2388   0.3824   0.4824   0.3635   0.2668
 1080   0.5616   0.2489   0.3539   0.4384   0.3718   0.2784
 1100   0.5899   0.2655   0.3289   0.4101   0.3686   0.2998
 1120   0.5986   0.2857   0.3110   0.4014   0.3482   0.3246
 1140   0.5881   0.3081   0.3003   0.4119   0.3134   0.3423
 1160   0.5622   0.3318   0.2964   0.4378   0.2776   0.3462
 1180   0.5274   0.3548   0.2980   0.4726   0.2533   0.3396
1200 1220 1240 1260 1280 1300 1320 1340 1360 1380
4912 4605 4405 4340 4406 4576 4806 5044 5246 5375
3729 3802 3713 3470 3150 2853 2646 2550 2562 2669
5088
5424 5194 4956 4754 4625
2443 2480 2614 2816 3062 3321 3545 3680 3688 3569
3298 3220 3180 3179 3203 3233 3244 3213 3140 3044
1400 1420 1440 1460 1480 1500
5417 5374 5265 5120 4972 4948
2848 3066 3277 3435 3508 3493
4583 4626 4735 4880 5028 5151
3366 3137 2932 2782 2704 2704
2962 2923 2944 3023 3145 3284
1
3034 3142 3143 31 13 3083
1 ~
3184 3273 3356 3403 3389 3309 3181 3044
1
5595
I iiE
11
TABLE 2
A-light source (γ in mμ)

    γ      Y_a      x_a      y_a      Y_b      x_b      y_b
    0   1.0000   0.4476   0.4075   0.0000   0.0000   0.0000
   20   0.9518   0.4502   0.4084   0.0482   0.3991   0.3909
   40   0.8168   0.4583   0.4110   0.1832   0.4020   0.3925
   60   0.6214   0.4737   0.4154   0.3786   0.4069   0.3953
   80   0.4039   0.4999   0.4206   0.5961   0.4140   0.3991

  100   0.2066   0.5440   0.4218   0.7934   0.4236   0.4040
  110   0.1278   0.5737   0.4131   0.8722   0.4294   0.4067
  120   0.0675   0.5957   0.3798   0.9325   0.4360   0.4097
  124   0.0493   0.5908   0.3515   0.9507   0.4389   0.4109
  126   0.0415   0.5816   0.3325   0.9585   0.4404   0.4116
  128   0.0346   0.5660   0.3099   0.9654   0.4420   0.3122
  130   0.0286   0.5427   0.2841   0.9714   0.4435   0.4128
  132   0.0236   0.5107   0.2557   0.9764   0.4451   0.4135
  134   0.0195   0.4702   0.2266   0.9805   0.4468   0.4141
  136   0.0163   0.4226   0.1989   0.9837   0.4485   0.4148

  140   0.0129   0.3203   0.1585   0.9871   0.4520   0.4161
  145   0.0139   0.2202   0.1496   0.9861   0.4566   0.4177
  150   0.0209   0.1757   0.1756   0.9791   0.4614   0.4193
  160   0.0521   0.1865   0.2475   0.9479   0.4721   0.4225
  170   0.1047   0.2286   0.2994   0.8953   0.4840   0.4255
  180   0.1761   0.2665   0.3330   0.8239   0.4974   0.4280

  200   0.3596   0.3226   0.3732   0.6404   0.5284   0.4297
  220   0.5661   0.3635   0.3978   0.4339   0.5637   0.4210
  240   0.7555   0.3981   0.4156   0.2445   0.5890   0.3844
  250   0.8322   0.4144   0.4229   0.1678   0.5818   0.3452
  260   0.8918   0.4306   0.4293   0.1082   0.5416   0.2873
  270   0.9318   0.4467   0.4348   0.0682   0.4540   0.2195
  280   0.9504   0.4629   0.4392   0.0496   0.3333   0.1711
  290   0.9470   0.4793   0.4424   0.0530   0.2306   0.1693

  300   0.9221   0.4959   0.4440   0.0779   0.1817   0.2064
  320   0.8150   0.5288   0.4413   0.1850   0.2005   0.3048
  340   0.6519   0.5586   0.4264   0.3481   0.2641   0.3763
  350   0.5594   0.5701   0.4123   0.4406   0.2961   0.4016
  360   0.4657   0.5771   0.3925   0.4353   0.3264   0.4216
  370   0.3753   0.5768   0.3660   0.6247   0.3549   0.4374
  380   0.2925   0.5657   0.3324   0.7075   0.3816   0.4496
  390   0.2213   0.5395   0.2928   0.7787   0.4067   0.4586

  400   0.1648   0.4946   0.2510   0.8352   0.4304   0.4647
  410   0.1255   0.4312   0.2144   0.8745   0.4527   0.4680
  420   0.1049   0.3559   0.1932   0.8951   0.4737   0.4684
  430   0.1036   0.2824   0.1956   0.8964   0.4931   0.4659
  440   0.1213   0.2260   0.2226   0.8787   0.5108   0.4603
  460   0.2077   0.1912   0.3191   0.7923   0.5401   0.4395
  480   0.3447   0.2336   0.4125   0.6553   0.5581   0.4050

  500   0.5042   0.3023   0.4729   0.4959   0.5593   0.3573
  520   0.6551   0.3702   0.5011   0.3449   0.5359   0.3008
  540   0.7698   0.4282   0.5045   0.2302   0.4794   0.2481
  560   0.8286   0.4742   0.4895   0.1714   0.3886   0.2252
  580   0.8235   0.5078   0.4609   0.1765   0.2864   0.2645
TABLE 2 (continued)

    γ      Y_a      x_a      y_a      Y_b      x_b      y_b
  600   0.7591   0.5295   0.4229   0.2409   0.2243   0.3657
  620   0.6507   0.5394   0.3790   0.3493   0.2337   0.4740
  640   0.5211   0.5367   0.3335   0.4789   0.2913   0.5374
  660   0.3955   0.5197   0.2920   0.6045   0.3587   0.5499
  680   0.2969   0.4851   0.2640   0.7031   0.4159   0.5289

  700   0.2418   0.4292   0.2656   0.7582   0.4585   0.4912
  720   0.2373   0.3556   0.3172   0.7627   0.4880   0.4471
  740   0.2810   0.2885   0.4205   0.7190   0.5072   0.4027
  760   0.3614   0.2661   0.5265   0.6386   0.5181   0.3613
  780   0.4612   0.2947   0.5759   0.5388   0.5217   0.3260
  800   0.5606   0.3449   0.5656   0.4394   0.5172   0.3004
  820   0.6416   0.3930   0.5249   0.3584   0.5018   0.2910
  840   0.6908   0.4316   0.4763   0.3092   0.4707   0.3081
  860   0.7018   0.4614   0.4300   0.2982   0.4202   0.3628
  880   0.6757   0.4844   0.3898   0.3243   0.3591   0.4502

  900   0.6202   0.5022   0.3571   0.3798   0.3152   0.5298
  920   0.5480   0.5152   0.3330   0.4520   0.3099   0.5592
  940   0.4736   0.5221   0.3199   0.5264   0.3342   0.5408
  960   0.4105   0.5196   0.3214   0.5895   0.3695   0.5010
  980   0.3690   0.5019   0.3428   0.6310   0.4051   0.4581

 1000   0.3547   0.4640   0.3866   0.6453   0.4378   0.4200
 1020   0.3674   0.4094   0.4441   0.6326   0.4670   0.3889
 1040   0.4023   0.3579   0.4919   0.5977   0.4924   0.3654
 1060   0.4508   0.3312   0.5101   0.5492   0.5131   0.3498
 1080   0.5029   0.3337   0.5002   0.4971   0.5267   0.3432
 1100   0.5490   0.3559   0.4755   0.4509   0.5291   0.3471
 1120   0.5817   0.3875   0.4472   0.4183   0.5154   0.3628
 1140   0.5964   0.4219   0.4208   0.4036   0.4828   0.3893
 1160   0.5928   0.4554   0.3987   0.4072   0.4357   0.4211
 1180   0.5737   0.4854   0.3815   0.4263   0.3878   0.4488

 1200   0.5441   0.5093   0.3695   0.4559   0.3551   0.4645
 1220   0.5107   0.5239   0.3633   0.4893   0.3453   0.4668
 1240   0.4797   0.5260   0.3634   0.5203   0.3563   0.4589
 1260   0.4560   0.5131   0.3702   0.5439   0.3815   0.4452
 1280   0.4431   0.4857   0.3831   0.5569   0.4137   0.4293
 1300   0.4413   0.4484   0.4005   0.5587   0.4469   0.4132
 1320   0.4495   0.4100   0.4193   0.5505   0.4767   0.3984
 1340   0.4647   0.3803   0.4357   0.5353   0.4994   0.3859
 1360   0.4835   0.3660   0.4466   0.5165   0.5120   0.3767
 1380   0.5020   0.3687   0.4504   0.4980   0.5132   0.3718

 1400   0.5173   0.3856   0.4472   0.4827   0.5029   0.3722
 1420   0.5274   0.4111   0.4380   0.4726   0.4827   0.3782
 1440   0.5317   0.4392   0.4249   0.4683   0.4563   0.3894
 1460   0.5304   0.4646   0.4102   0.4696   0.4286   0.4046
 1480   0.5250   0.4838   0.3961   0.4750   0.4051   0.4210
 1500   0.5169   0.4947   0.3846   0.4831   0.3906   0.4353
References

ARONS, L., 1910, Ann. Phys. (4) 33, 799.
BANNING, M., 1947, Jour. Opt. Soc. Am. 37, 792.
BAUD, R. V. and W. D. WRIGHT, 1930, Jour. Opt. Soc. Am. 20, 381.
BLODGETT, K. B., 1934, Jour. Opt. Soc. Am. 24, 313.
BRÜCKE, 1887, Die Physiologie der Farben für die Zwecke der Kunstgewerbe (Leipzig) p. 44.
BUCH, S., 1950, Zeits. Wiss. Photogr. Photophys. u. Photochem. 45, 212.
BUCHWALD, E., 1940, Ann. der Phys. (5) 38, 245, 325.
CHARLESBY, A. and J. J. POLLING, 1955, Proc. Roy. Soc. A 227, 434.
EHRINGHAUS, A., 1920, Neues Jb. Mineral., Beilage-bd. 43, 557.
ENDŌ, S., 1960, Ōyōbutsuri† 29, 726.
EVANS, U. R., 1952, Proc. Roy. Soc. A 107, 228.
EXNER, E. M. and J. M. PERNTER, 1922, Meteorologische Optik, II Aufl. (Wien), p. 565.
HARDY, A. C. and collaborators, 1936, Handbook of Colorimetry (Tech. Press, Cambridge, Mass.).
HOWELLS, P. W., 1954, Proc. I.R.E. 42, Color T.V. issue, p. 134.
JUDD, D. B. and K. L. KELLY, 1939, J. Research Natl. Bur. Standards 23, 355, RP 1239.
KUBOTA, H., 1949, Ōyōbutsuri† 18, 247.
KUBOTA, H., 1950a, Jour. Opt. Soc. Am. 40, 146.
KUBOTA, H., 1950b, Jour. Photographic Soc. Japan 12, 23.
KUBOTA, H., 1950c, Jour. Opt. Soc. Am. 40, 621.
KUBOTA, H., 1952, Jour. Opt. Soc. Am. 42, 144.
KUBOTA, H., 1960, Proc. Japan. Acad. 36, 418.
KUBOTA, H. and T. ARA, 1951a, Jour. Opt. Soc. Am. 41, 16.
KUBOTA, H., T. ARA and H. SAITO, 1951b, Jour. Opt. Soc. Am. 41, 537.
KUBOTA, H. and T. OSE, 1952, Jour. Phys. Soc. Japan 7, 470.
KUBOTA, H. and T. OSE, 1955a, Jour. Opt. Soc. Am. 45, 89.
KUBOTA, H. and T. OSE, 1955b, Ōyōbutsuri† 24, 63.
KUBOTA, H. and T. SATŌ, 1953, Japanese Jour. Soc. of Television 7, 27.
KUBOTA, H. and H. SAITO, 1954, Ōyōbutsuri† 23, 354.
KUBOTA, H. and K. SHIMIZU, 1957, Jour. Opt. Soc. Am. 47, 1121.
LE GRAND, Y., 1956, Handbuch der Physik (Berlin) Vol. 24, p. 209.
LOMMEL, E., 1891, Ann. Phys. u. Chem. 43, 473.
LYOT, B., 1933, Compt. Rend. 197, 1593.
MACADAM, D. L., 1942, Jour. Opt. Soc. Am. 32, 247.
MACSWAN, A. M., 1958, Proc. Phys. Soc. 27, 742.
MAHAN, A. I., 1952, Jour. Opt. Soc. Am. 42, 259.
MARCELIN, A., 1931, Jour. Chem. Phys. 28, 605.
MASCART, M. E., 1891, Traité d'Optique (Paris) Vol. 2, p. 74.
MECKE, K., 1920, Ann. Phys. 61, 623.
MEISLING, 1904, Zeit. analyt. Chem. 43, 137.
MIYAKE, K., 1957, Science of Light (Tokyo) 6, 77, 85.
MÖNCH, G. C., 1952, Optik 9, 75, 97.
MURRAY, A. E., 1956, Jour. Opt. Soc. Am. 46, 790.
NAWATA, S., 1953, Sci. Report of Res. Institute, Tohoku Univ., Sendai, Japan 5, 179.
POCKELS, F., 1906, Lehrbuch der Kristalloptik (Leipzig) p. 216; M. FRANÇON, 1956, Handbuch der Physik (Berlin) Vol. 24, p. 208; etc. (G. QUINCKE, 1869, Pogg. Ann. 129, 180).
POHLACK, H., 1958, Jenaer Jahrbuch, II. Teil, p. 102.
RAYLEIGH, Lord, 1900, Sci. Papers, Vol. 2 (The Univ. Press, Cambridge) p. 498.
RÖSCH, S., 1949, Fortschritte d. Mineralog. 28, 72; Ber. Oberhess. Ges. Nat.- u. Heilk., Giessen, 27 (1954) 128; Optica Acta 6 (1959) 186.
SAWAKI, T., 1958, Science of Light (Tokyo) 7, 1.
SCHARF, P. T., 1952, S.M.P.T.E. 59, 91.
SOHDA, J. S., 1959, Optik 16, 2i6.
SUZUKI, M., 1954, Ōyōbutsuri† 23, 371.
TROLLE, R., 1906, Phys. Zeits. 7, 700.
WATANABE, T., 1954, Ōyōbutsuri† 23, 369.
WEINSCHENK, E., 1919, Polarisationsmikroskop (Freiburg) p. 86.
WENZEL, A., 1917, Phys. Zeits. 18, 472.
WÜLFING, E. A., 1910, Sitzungsber. Heidelberg. Akad. Wiss. Nr. 24, 1.

† Ōyōbutsuri (Journal of the Society of Applied Physics, Japan), in Japanese with English title and abstract, issued by the "Japanese Society of Applied Physics, Japan", c/o Department of Engineering, University of Tokyo, Japan.
VII

DYNAMIC CHARACTERISTICS OF VISUAL PROCESSES

BY

ADRIANA FIORENTINI
Istituto Nazionale di Ottica, Arcetri, Firenze, Italy

CONTENTS

§ 1. INTRODUCTION
§ 2. DYNAMIC THEORIES OF VISUAL ACUITY
§ 3. INVOLUNTARY MOVEMENTS OF THE EYE
§ 4. VISION WITH STABILIZED RETINAL IMAGES
§ 5. DISCUSSION ON THE POSSIBLE ROLE OF INVOLUNTARY EYE MOVEMENTS
§ 6. DYNAMIC CHARACTERISTICS OF BINOCULAR VISION
§ 7. SOME VISUAL EFFECTS PRODUCED BY INTERMITTENT ILLUMINATION
§ 8. THE PERCEPTION OF CONTOURS
REFERENCES

§ 1. Introduction
Most of the information from the surrounding world is conveyed to the human brain through the visual system. The quantity of information that can be obtained during observation of an object is related to the number of details that the eye can perceive, namely to visual acuity. Visual acuity is sometimes defined as the ability of the eye to detect a very small object, and at other times as the ability to resolve two neighbouring elements of a periodic pattern. According to the first definition, visual acuity is inversely related to the visual angle subtended by a just detectable object (for instance a dark spot on a bright background); according to the second definition, acuity is inversely related to the angle subtended by one of the just resolvable elements (for instance one of the parallel bars of a grating). Both these quantities depend on a number of factors, and in particular on the luminous level of the test field and on the level of adaptation of the eye (the eye is adapted to a given level when it has been exposed to that level of illumination for a sufficiently long time). The acuity of the eye is maximal when the eye is adapted to a daylight level and the test field is illuminated at the same level (photopic vision). Under these conditions, visual acuity is highest within a small central area of the visual field. According to classical theories of vision, the final limitations to visual acuity are the blurring of retinal images and the "grain" of the retinal structure. The blurring of retinal images is produced by diffraction, aberrations of the optical system of the eye and diffusion in the dioptric media. The "grain" of the retinal tissue is represented by independent retinal units, which consist of groups of receptors linked to a single fiber of the optic nerve. The size of retinal units varies widely across the retina and is minimum in the centre of the fovea, where every cone can be considered as an independent unit;
here the separation between the axes of two adjacent cones can be as small as 1.5-2 microns, corresponding to a visual angle of about 20 sec of arc. Experimental data on visual acuity, however, show that, at least under the best conditions, (a) the visual perception of contours is much sharper than one could expect on the basis of limiting optical factors, (b) the eye can detect details which subtend a visual angle smaller than the angular size of the narrowest foveal units. Many difficulties are encountered when one attempts to explain these facts by the classical theories of vision. This is one of the reasons which led to a modern interpretation of visual functions which is quite different from the classical ones: the hypothesis has been made that time gradients of illumination on retinal receptors have a fundamental role in visual perception. This hypothesis is supported by the fact that pulsing in the nervous fibers which connect the retinal cells to the brain is strongly activated by any rapid change of illumination on the receptors. The retinal illumination may change even during steady fixation of constantly illuminated objects, owing to the small involuntary movements performed by the eye, which make the retinal images shift continually back and forth across the retina. The theories which try to account for the high values of visual acuity on the basis of time gradients of illumination produced by eye movements have been called dynamic to distinguish them from the classical static theories. Since the formulation of dynamic theories of visual acuity, the investigation of the time characteristics of the visual system has developed considerably and new interesting visual phenomena have been discovered. The present review will cover some of the more recent findings in this field. It starts with a summary of the basic concepts of dynamic theories of visual acuity (§ 2).
The various kinds of involuntary movements are then described according to the results of recent accurate measurements (§ 3). Next, the experimental findings of a number of researches on the effects of eye movements on vision are reported (§ 4) and discussed (§ 5), with particular reference to visual acuity. Some evidence is presented for the possible dynamic nature of the correspondence between the two eyes (§ 6). A section is devoted to some visual phenomena which occur when the eye is exposed to intermittent illumination (§ 7): first, some experiments are reported which show that the visual system acts as a low-pass filter as regards the response to periodically modulated lights; second, the effects of periodic variations of illumination on visual acuity are described. In conclusion, recent developments in the investigation of the mechanisms underlying the perception of contours are reported and discussed (§ 8) in relation to the results of the experiments described in the previous sections.
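The correspondence quoted in this introduction between a cone spacing of 1.5-2 microns and a visual angle of about 20 seconds of arc can be checked with a small-angle conversion. The sketch below assumes a distance of 17 mm from the posterior nodal point of the eye to the retina, a typical reduced-eye figure that is not given in the text; the function name is likewise a choice made here.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def visual_angle_arcsec(retinal_size_um, nodal_distance_mm=17.0):
    """Visual angle subtended by a retinal distance (small-angle approximation).

    nodal_distance_mm is an assumed reduced-eye value, not a figure from the text.
    """
    return retinal_size_um * 1e-3 / nodal_distance_mm * ARCSEC_PER_RAD


# Cone separations of 1.5 and 2 microns, as quoted for the centre of the fovea.
angles = [visual_angle_arcsec(size) for size in (1.5, 2.0)]
print(angles)
```

Under this assumption the two spacings bracket the quoted value of about 20 seconds of arc.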
§ 2. Dynamic Theories of Visual Acuity
The dynamic theory of visual acuity, suggested for the first time by HERING [1899], was further developed by WEYMOUTH et al. [1928] and, under different forms, by MARSHALL and TALBOT [1942], BYRAM [1944], JONES and HIGGINS [1948]. The basic concept of the dynamic theory is the notion that the natural flutter of the eye has a fundamental role in the vision of details, as it causes the retinal image of observed objects to be continually shifted back and forth across the retina. If the target contains, say, a dark and a bright area with a common edge, the illumination of receptors which are swept by the image of the edge varies with time, and the average illumination will be intermediate between the illuminations of receptors lying in the dark area and in the bright area, respectively. According to classical theories, this would blur the image of the edge. Weymouth et al., on the contrary, saw in this fact the way for visual acuity to overcome the limitation imposed by the grain of the retinal network. They thought that the perceived image of a point or a line corresponds to the average position assumed in the retina by the image during the oscillations produced by eye movements. The average position would be defined with great accuracy because it results from a great number of different instantaneous positions. This would explain why the minimum size of perceptible objects is smaller than the size of a single cone. Marshall and Talbot put forward a similar explanation, but assumed that the averaging mechanism is located in the brain, where the grain of nervous cells is much finer than in the retina. Byram considered visual acuity as the result of brightness discrimination and explained the perception of a boundary between a bright and a dark area on the basis of photochemical considerations. In the absence of any eye movement, the receptors at either side of the border would adapt to the respective levels and the rates of the photochemical reactions would become the same in the two regions, in spite of the different illumination levels; no brightness discrimination would be possible after having reached such a steady state of adaptation. The movements of the eye would have the role of levelling off the photochemical state of adaptation in the neighbourhood of the border and thus of making the rate of the photochemical reaction greater on the side of the border where the illumination is higher. Jones and Higgins suggested that the perception of details is mediated by the rate of change of illumination on retinal receptors with respect to time, rather than by the spatial variations of illumination in the observed pattern. The physiological basis for this assumption is to be found in the well-known properties of the nervous mechanisms of the eye, first discovered by HARTLINE [1938] in the frog eye. There are three kinds of nervous fibers in the optic nerve of the frog eye: (a) "on" fibers, which transmit a train of nervous pulses at the onset of illumination, (b) "off" fibers, which transmit a train of pulses when the illumination is switched off, (c) "on-off" fibers, which transmit both at the beginning and at the end of the illumination period. Large bursts of pulses are transmitted from the retina to the brain every time the illumination is rapidly changed, while steady illumination gives rise to a very low number of pulses per second. It is conceivable that the same kinds of effect occur in the human eye. The role of eye movements would be to transform spatial illumination gradients into time gradients, thus giving rise to strong responses in the retina. Practically no response could be obtained from the retina during voluntary fixation if involuntary movements were not present.
At the time the dynamic theories were put forward, there was little experimental evidence to support the concept of a dynamic nature of visual processes, apart from the results of Hartline on the time characteristics of nervous mechanisms and from some data on involuntary movements. The situation is quite different at present, because of an increase in the knowledge of the time characteristics of visual processes. A discussion on the dynamic theory is now possible, and will be found in the following sections.
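The averaging idea attributed above to Weymouth et al. can be illustrated with a toy simulation. It is not a model taken from the literature cited here: the uniform jitter of one receptor width, the receptor grid of unit spacing and the sample count are all assumptions chosen for illustration. A point image at a fixed sub-receptor position is jittered, the receptor it falls on is recorded at each instant, and the mean receptor index is taken as the position estimate.

```python
import random


def mean_unit_index(point_position, n_samples, rng):
    """Average index of the retinal unit on which a jittered point image falls.

    Positions are measured in units of the receptor spacing; the jitter is
    assumed uniform over one receptor width (a toy assumption).
    """
    total = 0
    for _ in range(n_samples):
        jitter = rng.uniform(-0.5, 0.5)           # eye movement, in receptor widths
        total += round(point_position + jitter)   # receptor hit at this instant
    return total / n_samples


rng = random.Random(0)                  # fixed seed so the run is repeatable
estimate = mean_unit_index(0.3, 20000, rng)
print(estimate)
```

Although every single reading is an integer receptor index, the mean over many instants settles close to the true fractional position, which is the sense in which an average over eye movements could define a position more finely than the receptor grain.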
§ 3. Involuntary Movements of the Eye

The ordinary movements of the eye can be described as rotations about a point situated approximately on the anterior-posterior axis of the eyeball, at about 13 mm from the cornea. The oculomotor system, which is responsible for the rotations of the eye, consists of three pairs of antagonistic muscles, contained in the orbit. Under normal visual conditions, voluntary rotations are performed to carry onto the fovea (the central region of the retina where visual acuity is maximal) the images of objects which attract the attention of the subject, successively in time. Apart from the voluntary rotations, there are small involuntary rotations which take place even when the subject makes an attempt to look as steadily as possible at a fixed point. The existence of involuntary movements (which are frequently reported in the literature under the name of physiological nystagmus) was suspected before the end of the last century, but accurate measurements of the small involuntary oscillations could not be carried out by the techniques available at that time. The formulation of the dynamic theories of visual acuity has given a new impulse to the researches in this field and the various kinds of involuntary movements have been accurately measured by new techniques during the past ten years.
Fig. 1. Apparatus for recording the involuntary movements of the eye
The most accurate method for recording eye movements is based on the principle of the optjcal lever (Fig. 1). A small plane mirror attached to the eye reflects a light beam issued by a point source. An image of the source is focused, after reflection, onto the running film of a recording apparatus. Any rotation of the eye causes the reflectld
260
VISUAL PROCESSES
[VII,
43
beam to rotate and the image on the recording film to be shifted by a proportional amount. Only the horizontal components of eye movements can be recorded by the apparatus shown in Fig. 1, but supplementary arrangements can be introduced to obtain records of both horizontal and vertical components. Movements of the head are minimized by means of a chin rest and a bite board; large voluntary movements of the eye are prevented, as the subject is asked to look steadily at a fixation point during the records. The small plane mirror may be attached directly to the eye, as was done in earlier researches (ADLER and FLIEGELMAN [1934]), or to a contact lens which tightly adheres to the eye (RATLIFF and RIGGS [1950]; DITCHBURN and GINSBORG [1953]; DITCHBURN [1955]). Other recording methods are based on the observation of images reflected by the cornea (LORD and WRIGHT [1948]) or on the direct photography of a small detail of the eyeball: BARLOW [1952] used a droplet of mercury placed on the cornea, while HIGGINS and STULTZ [1953] observed the blood vessels of the sclera. According to the results of these researches, the involuntary movements of the eye during fixation can be analyzed as follows:
a) tremor: irregular oscillations having a mean amplitude of 10-15 sec of arc and a frequency variable from 20 to 100 c.p.s. The maximum angular velocity is about 20 min of arc per sec.
b) flicks or saccades: very rapid rotations (angular velocity about 600 min of arc per sec) whose amplitude ranges from 1 to 25 min of arc and which occur irregularly. The intervals between two flicks vary from 0.03 to 5 sec.
c) drifts: slow oscillations and slow unidirectional movements whose amplitude does not exceed 5 min of arc. Unidirectional drifts occur during the interval between two flicks. Their angular velocity is of the order of 1 min of arc per sec.
The movements described are horizontal and vertical components of eye rotations. FENDER [1955] also recorded torsional movements (rotations about the visual axis) of the same kind but of somewhat different amplitudes: the torsional tremor has an amplitude of about 45 min of arc, the torsional flicks are very small (2 min of arc). The involuntary movements are not a peculiarity of the human eye, since HEBBARD and MARG [1957] observed the same kinds of movements in the eye of the cat. The displacements produced by the involuntary eye movements
of the images across the retina can be inferred from the data reported above † (RATLIFF and RIGGS [1950]; DITCHBURN [1955]). The shifts of the retinal images produced by the tremor are very small: the mean amplitude of tremor corresponds to a displacement smaller than the width of a single receptor in the fovea, while the maximum amplitudes correspond to two or three receptors. During very short intervals of time, which do not include movements other than the tremor (0.01-0.02 sec), the image is likely to remain on a single retinal unit. The combined effect of slow drifts and rapid flicks is to make the image wander irregularly inside an area having an angular diameter of about 20 min of arc which contains about 2000 receptors. This area may well correspond to the central territory of the fovea described by POLYAK [1941], where the cones are extremely thin and which “represents the peak of the structural perfection of the eye”. The characteristics of drifts are quite different from those of flicks (CORNSWEET [1956]). The drifts are more likely to produce a displacement of the image from the centre of the fixation area than towards it; further, the drift rates measured during fixation are essentially identical to the drift rates recorded while the subject is in total darkness. This leads to the conclusion that the drifts are not under direct visual control, and that they are the result of an instability in the balance of the antagonistic muscles of the eye. The flicks, on the contrary, have been found to be corrective movements which tend to return the retinal image to the central region, when it has been shifted away by drifts. The flicks are probably controlled by the retinal events through at least two different mechanisms: one mechanism would control the direction and the other the amplitude of the flicks, according to the position assumed by the retinal image at a given time, with respect to the centre of the fixation area.
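The figure of about 2000 receptors inside a 20 min of arc area can be checked with a rough estimate. The sketch below is only an order-of-magnitude check, not a calculation taken from the cited authors: it assumes square packing of receptors and a receptor width of 20 to 25 sec of arc (the widths quoted elsewhere in this chapter); the function name is illustrative.

```python
import math


def receptors_in_area(diameter_arcmin, receptor_width_arcsec):
    """Rough number of receptors tiling a circular retinal area.

    Assumption: each receptor occupies a square of side receptor_width_arcsec.
    """
    radius_arcsec = diameter_arcmin * 60.0 / 2.0
    area_arcsec2 = math.pi * radius_arcsec ** 2
    return area_arcsec2 / receptor_width_arcsec ** 2


# Assumed receptor widths (sec of arc); the 20 min of arc diameter is from the text.
for width in (20.0, 25.0):
    print(width, "sec of arc ->", round(receptors_in_area(20.0, width)), "receptors")
```

Under these assumptions the count falls between roughly two and three thousand, which is consistent with the quoted order of magnitude.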
Moreover, it appears that the eye is unable to see anything during a flick period (DITCHBURN [1955]). It is possible to conclude from what precedes that, during normal fixation, the image drifts slowly across the retina as a consequence of the instability of the oculomotor system. As it drifts farther and farther away from a particular region of the fovea, it becomes more and more probable that a flick will occur, to bring the image back to the central region. Very small and rapid oscillations are superimposed on these two kinds of movements. In total darkness, in the absence of a fixation point, the eye is unable to maintain an approximately constant position: the lack of visual control makes the visual axis move farther and farther away from the mean position assumed during fixation.

† CORNSWEET [1958] has recently developed a technique for recording the displacements of the images on the retina.

§ 4. Vision with Stabilized Retinal Images
The physiological nystagmus has been regarded either as an unavoidable disturbance for the perception of details (static theory) or as the fundamental mechanism for visual acuity (dynamic theory). The conflict between these two opposite interpretations has given rise to a great number of experimental investigations on the effect of eye movements on vision †.

† The results of these investigations are reported in the present section without giving any comment on them. A summary of the results and a general discussion are contained in the following section.

Various methods have been used to investigate the influence of the involuntary movements on visual acuity. RATLIFF [1952] recorded the movements performed by the eye during the observation of an acuity test object. He found that a correlation exists between the percentage of exact responses given by the subject in the acuity judgements and the amplitude of the movements which took place during the single observations. With a very short observation time (0.075 sec) visual acuity is found to be greater during relatively small involuntary movements than during large ones. Accordingly, eye movements should be considered as a hindrance for visual acuity, at least with short observation times. The most impressive results have been obtained by quite a different method, in which the displacements of retinal images, normally produced by eye movements, are eliminated. To prevent movement of the images with respect to the retina, the position of the observed target must be controlled by the eye, in the sense that for any rotation of the eye the target must rotate in the same direction and through the same angle. In this way, the image remains stationary on the same retinal receptors, in spite of eye movements. An apparatus for obtaining stabilization of the images on the retina has been devised by DITCHBURN and GINSBORG [1952] and independently by RIGGS, RATLIFF et al. [1953]. The observer wears a contact
lens which carries a small plane mirror m (Fig. 2). The mirror reflects light from a projector onto a screen placed in front of the observer. An image of the test object is focused on the screen. A rotation α of the eye produces a rotation 2α of the reflected beam. The observer views the image on the screen by reflection at mirrors M, and the path from the screen to the nodal point N of the eye is twice as long as the path from m to the screen. In this way, the displacement of the image on the screen produced by a rotation 2α of the reflected beam is viewed by the observer under the angle α equal to the rotation of the eye, and the image remains stationary on the retina. Owing to the position of the mirror m, the compensation is complete only for the horizontal components of eye movements and vertical targets must be used to minimize the effects of the incomplete vertical compensation.

Fig. 2. Apparatus for obtaining stationary retinal images (labelled parts: screen, test object)

A very accurate stabilization of retinal images, with compensation in both horizontal and vertical planes, has been obtained by DITCHBURN and FENDER [1955]. Their apparatus is based on the same principle as the previous one; the modifications introduced concern the position of the mirror on the contact lens and the compensating path. A number of other methods have been devised to obtain stabilization of retinal images. These methods are not suitable for all the types of experiments which can be carried out with the method previously
described, but are more convenient for some particular experiments, since they require much simpler optical arrangements. TEN DOESSCHATE [1954] described a method for obtaining stabilization on the retina of a blurred image of the eye pupil. DITCHBURN and PRITCHARD [1956] mounted a small Fabry-Pérot etalon on a contact lens worn by the observer and obtained stabilization on the retina for the interference pattern. The same authors also used a unit made of a calcite crystal cemented between two “polaroid” sheets; the unit was supported by a contact lens and produced a stationary fringe pattern, when illuminated by a convergent beam. RATLIFF [1958] suggested that the so-called Haidinger brushes † may be considered as a stationary pattern which requires no attachment to the eye. Vision with compensation of eye movements is a very unusual experience and can hardly be described to a person who has never experienced it. The most remarkable effect of stabilization is the disappearance of the details of the observed target after a few minutes from the beginning of the observation. If the target consists of a black bar on a luminous background, light from the field appears to invade more and more of the bar area, until finally the whole bar disappears and the field appears uniformly illuminated. The bar may reappear later, and then fade again. The total time of disappearance over one minute of observation depends on the accuracy of stabilization and on the bar width. Fine lines disappear very soon and fail to reappear later, while wide bars take long to disappear and remain invisible only for a few seconds (RIGGS, RATLIFF et al. [1953]). With a very accurate stabilization, a bar as wide as 5 min of arc on a field of luminance 25 millilamberts is visible only during 50% of the observation time (DITCHBURN and FENDER [1955]).
† The Haidinger brushes are an entoptical phenomenon. They may be seen by looking through a polarizer at a field of blue light. The two opposite brushes are dark, are symmetrically oriented with respect to the centre of the fovea and their orientation depends on the plane of polarization of the incident light. They are assumed to be due to the dichroism of some fibers contained in the retina and oriented symmetrically with respect to the fovea. The orientation of the fibers with respect to the polarization plane does not change for a rotation of the eye in a meridian plane. Consequently, the brushes remain stationary during involuntary rotations of the eye. Only torsional movements are not compensated.

When the observation is prolonged for some minutes, the bar never reappears as well defined as at the beginning: the edges are somewhat blurred and there is a loss in contrast. Very similar effects occur with parallel bar gratings (FIORENTINI and ERCOLES [1956]) or with fringe patterns (DITCHBURN [1956]). With a 1° circular field divided into two parts having different luminances, the line of division vanishes after a few seconds of stabilization, and the field appears uniformly bright for a period. Then vision of the two parts of the field is suddenly restored, after a few seconds it fades again, and so on (DITCHBURN [1955]). Stabilization of large luminous patches on a dark ground produces at first blur of the edges. Then the shape of the figure becomes less and less defined and its brightness decreases until only a very irregular and dim patch is perceived.

Apart from the disappearance of the details, a person who experiences vision with compensated movements for the first time is bewildered by the fact that the observed target follows any movement performed by his eye and that he does not succeed in shifting the fixation direction from one point of the field to another, even by large voluntary movements.

During prolonged observations with stationary images, the eye undergoes very large and slow involuntary oscillations, whose amplitude tends to increase with time (TEN DOESSCHATE [1954]; HEDLUN and WHITE [1959]). The amplitude of these oscillations is of the order of 30°; the period is constant for each subject, but varies from subject to subject, being of the order of 2 sec. Apparently, the oculomotor system is unable to control the position of the eye when it is deprived of the usual optical feedback, that is the displacement of the image of the fixated object from the centre of the fixation area. The large oscillations may be the result of a much coarser feedback mechanism; perhaps, when the contraction of one oculomotor muscle exceeds a certain degree, proprioceptive stimuli produce contraction of the antagonistic muscle; this in turn is not controlled by an optical feedback and goes on until contraction of the opposite muscle is produced, and so on.

If large voluntary movements and involuntary “pendular” oscillations are to be avoided, a fixation spot whose image is not stabilized on the retina must be presented to the observer. But even in this case, the disappearance of the stationary image and the unusual fading of the visual sensation sometimes lead the eye to a sharp movement. In the methods where a contact lens is used, sharp movements of the eye may cause slippage of the lens on the eye surface which momentarily destroys stabilization and makes the image reappear. The various kinds of involuntary micromovements which occur during vision with stationary images have also been investigated.
It has been found that in “stopped” viewing conditions, with an “unstopped” fixation mark, the mean rate of drifts is higher, while the mean number of flicks per unit time is smaller, than in normal viewing conditions (CORNSWEET [1956]). This finding confirms the different characteristics of drifts and flicks, described in the previous section. The loss of the ability to focus the images correctly on the retina is another peculiar effect of stabilization. It is probably due to the disappearance of any sharp edge in the perceived pattern, a short time after the beginning of the observation. The progressive blur of the stabilized image possibly causes convulsive accommodation changes from time to time, which can destroy stabilization and make the image reappear †. This may be prevented by the use of a fixation mark whose image is not stabilized. There are very few data on the effect of stabilization on colour. It seems that colours in the blue-green part of the spectrum appear much desaturated as a consequence of stabilization (DITCHBURN [1956]). Before undertaking a discussion on the effects of the compensation of eye movements on vision, some other experiments will be reported, which are of great importance for the understanding of retinal mechanisms. Once the images have been “stopped” on the retina, it is possible to restore normal vision in two ways: (a) by introducing movements of the test object controlled by the experimenter, (b) by intermittent illumination of the test object. DITCHBURN [1956] subjected the test objects to movements which simulated the various kinds of involuntary natural movements. Sharp movements simulating a flick are found to regenerate immediately vision of a stabilized image. The effect of stabilization is also considerably reduced by oscillatory movements simulating the natural tremor; the greatest effect is obtained with oscillations having a maximum angular velocity of about 30 min of arc per sec.
Slower or faster oscillations are less effective in restoring vision; oscillations of 100 min of arc per sec have practically no effect. Unidirectional movements simulating drifts restore vision to an extent which increases with increasing velocity, at least up to 100 min of arc per sec.

† The position of the nodal point of the eye depends on the degree of accommodation. A change of accommodation produces a variation of the compensating path (Fig. 2) and consequently stabilization becomes incomplete.
CORNSWEET and RIGGS [1954] investigated the effect of the vibration of otherwise stationary images on visual acuity. They used dark lines of various widths on a uniform background; a line was considered as visible when it was perceived during the whole observation time (2 sec). The lines were oscillating at three frequencies, 30, 50 and 70 c.p.s., and with amplitudes which varied from 0 to 150 sec of arc. The visual acuity was found to be constant for all three frequencies in the range of amplitudes from 0 to 60 sec of arc. For larger amplitudes, the visual acuity regularly decreases with increasing amplitude. KRAUSKOPF [1957] has extended the research to a wider frequency range. He measured the contrast threshold for a bright bar on a uniform background, under stationary image conditions. The contrast threshold was defined as the contrast at which the bar was seen during 50% of the observation time. Imposed oscillations of the retinal image have no effect on the contrast threshold when the amplitude is less than 1 min of arc. When the amplitude exceeds this value, the effect of the oscillation depends on the frequency. High frequency oscillations are detrimental to the contrast threshold, in agreement with the results of Cornsweet and Riggs. Low frequency oscillations, on the contrary, are beneficial for the visibility of the target, as compared with the stationary image condition. The critical frequency appears to be of the order of 10 c.p.s. RIGGS, RATLIFF et al. [1953] measured visual acuity under three different conditions: (a) stationary image condition, (b) normal viewing, (c) “exaggerated” condition. In the “exaggerated” condition the displacements of the retinal image produced by the involuntary movements of the eye were twice as large as in normal vision †. The acuity test object consisted of a dark line of variable width with a bright background.
Visual acuity was measured with a long observation time (1 minute) and with “flash” exposures of variable duration. In the first case visual acuity was expressed as the width of the line which was visible during 50% of the observation time, in the second case as the width of the line which was seen in 50% of flashes having the same duration. The movements of the retinal image have been found to be beneficial for visual acuity when the observation time exceeds 0.15 sec: in this case the “exaggerated” condition yields the best acuity. When the observation time is shorter than 0.15 sec, eye movements are bad for acuity: the results obtained with stationary images are better than those obtained with “exaggerated” movements. Normal vision always gives intermediate results between those of stationary and “exaggerated” viewing conditions (Fig. 3).

† The “exaggerated” movements were obtained by introducing a reversing prism in the compensating path (Fig. 2). In this way, when the eye rotates through an angle α in one direction, the image on the screen is viewed as if it were rotated through an angle α in the opposite direction. The total displacement of the image on the retina corresponds to an angle 2α.
Fig. 3. Visual acuity as a function of log exposure time (abscissa: log flash duration, time in sec). (a) stationary images; (b) normal viewing; (c) “exaggerated” movements (after RIGGS, RATLIFF et al. [1953])
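The three viewing conditions compared here differ only in how far the viewed target follows the eye, and the compensation of Fig. 2 rests on the beam turning through 2α while the viewing path is twice the mirror-to-screen path. The following is a minimal sketch of that geometry, not the authors' own calculation: it assumes a flat screen normal to the undeflected beam, and the names `target_gain`, `retinal_shift` and `viewed_angle` are introduced only for illustration.

```python
import math

ARCMIN = math.pi / (180 * 60)  # one minute of arc in radians


def retinal_shift(eye_rotation, target_gain):
    """Angular shift of the image on the retina for a given eye rotation.

    target_gain (hypothetical parameter) is the ratio of apparent target
    rotation to eye rotation: +1 stabilized, 0 normal viewing,
    -1 with the reversing prism ("exaggerated" condition).
    """
    return eye_rotation * (1.0 - target_gain)


def viewed_angle(eye_rotation, mirror_to_screen=1.0):
    """Angle under which the displaced screen image is seen in Fig. 2.

    The reflected beam turns through twice the eye rotation; the viewing
    path is taken as twice the mirror-to-screen distance (flat screen assumed).
    """
    displacement = mirror_to_screen * math.tan(2.0 * eye_rotation)
    return math.atan(displacement / (2.0 * mirror_to_screen))


alpha = 25 * ARCMIN  # a large flick, upper end of the quoted range
for name, gain in [("stationary", 1.0), ("normal", 0.0), ("exaggerated", -1.0)]:
    print(name, retinal_shift(alpha, gain) / ARCMIN, "min of arc")

residual = viewed_angle(alpha) - alpha
print("compensation residual:", residual / ARCMIN * 60, "sec of arc")
```

Under these assumptions the doubled path cancels the doubled beam rotation to well within a second of arc even for the largest flicks, so the residual of the idealized geometry is negligible next to the tremor amplitude.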
The effect of intermittent illumination on vision with stationary images has been investigated by DITCHBURN and FENDER [1955] and separately by CORNSWEET [1956]. Ditchburn and Fender used a 1° circular field, luminance 25 millilamberts, with a black bar 5 minutes wide. The image of the target was stabilized on the retina and the light on the field was interrupted with different frequencies. The total time of visibility of the bar over 30 sec observation time was recorded for the various frequencies of interruption. The results show that interrupted light partially prevents the disappearance of the bar caused by stabilization of the image on the retina: with any one of the frequencies used, the total time of visibility of the bar is greater than with steady light (Fig. 4). The greatest effect is obtained at 20 c.p.s.; at this frequency the bar is perceived during the whole observation time, as would occur if the image were not stabilized. Lower or higher frequencies are less effective in producing visibility of the bar, but some effect is still present at very high frequencies. The authors report that, in the course of this experiment, the subject complained
of the brightness of illumination at high frequencies of interruption, in spite of the fact that the objective luminance was relatively low. One subject estimated the luminance of the field to be 100 times higher than its actual level. According to Ditchburn, this might be due to the high rate of “on-off” responses generated in the nervous fibers by intermittent light; at a suitable frequency, the rate of responses would even reach a saturation level in some fibers. The resulting brightness would thus be very high, even if the objective luminance is relatively low.

Fig. 4. Time of visibility of a stationary line in percent of the total observation time, during intermittent illumination, as a function of frequency (abscissa: flicker frequency, c.p.s.) (after DITCHBURN and FENDER [1955])

Cornsweet used a different technique: instead of interrupting light on the whole field, he made only the dark line flicker against the background, the line being alternately present and absent in the otherwise uniform field. The total time of visibility of the line, over 45 sec observation time, is found to increase with decreasing frequency in the range from 4.4 to 0.8 c.p.s. The line is practically always visible for a flicker rate of about 1 c.p.s. Cornsweet did not use frequencies greater than 9.6 c.p.s. The disagreement between these results and those of Ditchburn, as regards the rate of flicker which is most effective in restoring vision, may possibly be ascribed to the different conditions of flicker used in the two experiments.
§ 5. Discussion on the Possible Role of Involuntary Eye Movements
The results reported in the two previous sections may be summarized as follows:
Three types of involuntary movements occur during fixation of a steady point: tremor, flicks and drifts. The effect of tremor is to move the retinal image back and forth, 30 to 100 times per second, with a mean excursion which is less than the width of a single foveal receptor. The drifts make the image shift slowly, covering a distance corresponding to a few cones; these are uncontrolled movements, whose rate does not change, whether the eye fixates or not. The function of flicks is to carry the image back to the centre of the fixation area, when it has been brought far away by drifts. The flicks are controlled by the visual system. Vision is inhibited during flick periods.

The movements of the retinal image can be eliminated by a servo-system which makes the image remain stationary on the same retinal receptors, regardless of eye movements. When this stationary image condition is realized: (a) details of the “stopped” pattern vanish in a few seconds and the field appears uniformly bright; periods of “seeing” may alternate with periods of “fade out”, the total duration of “fade out” periods depending on the accuracy of stabilization and on the characteristics of the test object; (b) the ability of the eye to focus the images on the retina appears to be partially lost, possibly owing to the disappearance of the edges in the perceived image; (c) proper fixation cannot be maintained for a long time, unless a fixation spot is provided whose image is not stabilized.

Vision of “stopped” images can be restored by intermittent illumination. Constant visibility of a dark line is obtained: (a) with a frequency of 20 c.p.s., if the light is periodically interrupted in the whole field, (b) with a frequency of about 1 c.p.s., if the line is flickered against a steadily illuminated background.

The effect of natural eye movements having been neutralized, vision of a steady object can be compared to vision of an object which undergoes controlled movements.
The main results are: (a) with long exposure times, oscillatory movements facilitate differential sensitivity if the frequency is below 10 c.p.s. and the amplitude exceeds 1 min of arc; oscillations at higher frequencies have no effect if the amplitude is less than 1 min, while they are detrimental to both visual acuity and contrast sensitivity, if the amplitude exceeds 1 min; (b) movements simulating the three kinds of natural involuntary movements are effective in restoring vision of a “stopped” target, as soon as they are introduced; (c) natural eye movements producing a displacement of the retinal image twice as large as the normal one cause a loss of
visual acuity with respect to the stationary image condition, when the exposure time is not greater than 0.1 sec; the same movements cause an enhancement of visual acuity for observation times exceeding 0.2 sec. An attempt can be made to derive some conclusions from this complex body of results, as regards the influence of eye movements on vision. The decay of visual sensation during the observation of “stopped” images proves that some kind of motion is required to maintain vision of details. Presumably, in the absence of motion, a stationary state is reached in which firing in the nervous fibers becomes independent of the stimulus intensity and fails to give any information about luminance differences in various parts of the field. A slow motion of the retinal image can prevent disappearance of details, probably because it produces illumination gradients at some receptors and consequently causes “on-off” responses to be generated in the fibers. This is shown by the experiments with imposed movements or interrupted light. On the other hand, high frequency movements whose amplitudes exceed the total width of two or three cones are detrimental to visual acuity. In this case, the variation of illumination on those receptors which are swept by the boundary between a dark and a bright area of the image is too rapid for the nervous response to follow it. The stimulus intensity is averaged over time and the average stimulation of receptors swept by the boundary is less effective than stimulation of receptors in the bright area and more effective than stimulation of receptors in the dark area. The resulting effect is blur of the image and, as a consequence, a loss of visual acuity. From what precedes, the conclusion might be drawn that the natural tremor of the eye has practically no effect on visual acuity under normal viewing conditions, its frequency being too high to be beneficial and its amplitude too small to be detrimental.
Of the other two kinds of natural movements, the drifts seem to have the characteristics required to play a part in the maintenance of vision: they are sufficiently slow and large to cause “on-off” impulses to be generated in a certain number of receptors. The flicks have the quite different function of keeping the image of the observed object inside the retinal area where acuity is highest. Flicks, however, are so rapid and large that they would produce a considerable amount of blur in the perceived image. That is probably the reason why vision is inhibited during flicks.
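The time-averaging argument used above for rapid oscillations can be illustrated numerically. The sketch below is an idealization rather than a model taken from the cited experiments: it assumes a sharp dark/bright boundary oscillating sinusoidally, a response that simply averages the stimulus over a period, and a receptor width of 25 sec of arc; all names are illustrative.

```python
import math


def averaged_edge(x, amplitude):
    """Time-averaged luminance (0 = dark, 1 = bright) at retinal position x.

    The edge sits at amplitude*sin(wt), with the bright side at x > edge.
    A point is bright for the fraction of the period in which sin(wt) < x/amplitude.
    """
    if x >= amplitude:
        return 1.0
    if x <= -amplitude:
        return 0.0
    return 0.5 + math.asin(x / amplitude) / math.pi


def sampled_edge(x, amplitude, steps=20000):
    """Same quantity obtained by brute-force sampling of one period."""
    bright = 0
    for k in range(steps):
        edge = amplitude * math.sin(2 * math.pi * (k + 0.5) / steps)
        if x > edge:
            bright += 1
    return bright / steps


CONE = 25.0  # assumed receptor width, sec of arc
for amplitude in (12.0, 75.0):  # roughly tremor-sized vs. about three cones
    print("amplitude", amplitude, "sec of arc: blurred zone",
          2 * amplitude / CONE, "cone widths;",
          "level one cone from centre:", averaged_edge(CONE, amplitude))
```

With a tremor-sized amplitude the blurred zone stays within about one assumed cone width, whereas an amplitude of three cones spreads intermediate luminance levels over several receptors, which is the blur described in the text.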
The situation is quite different when short observation times are considered. In this case, natural or “exaggerated” eye motions are detrimental to visual acuity. The typical extent of natural eye movements during 0.1 sec is 25 sec of arc. Hence it appears that a movement of the image equal to the width of a cone results in a loss of visual acuity. This is peculiar to short exposures: with long exposures an imposed oscillation whose amplitude corresponds to the width of two cones has no effect on visual acuity, no matter what the frequency may be. According to the writer's view, this intriguing phenomenon can possibly be ascribed to the absence of lateral inhibition † in the retina during very short observation times. BARLOW, FITZHUGH and KUFFLER [1957] have shown that the effect of lateral inhibition is highly reduced when brief exposures are used, probably because the inhibitory mechanism takes a longer time to build up its full activity than the excitatory mechanism which is responsible for the direct response to stimulation. The blur caused by a movement of the image would be overcome by inhibition on condition that it is contained within a few receptors (amplitude of the oscillation less than 1 min of arc) and that the exposure time is long enough for the inhibitory action to take place. The relationship between eye movements and visual acuity seems, therefore, to be much more complex than is assumed by the dynamic theory. Instantaneous visual acuity is not mediated by the physiological nystagmus; on the contrary, the natural micromovements of the eye are a hindrance to visual acuity over brief intervals of time. The visual acuity task, however, cannot be carried out for a long time by the same set of receptors, unless illumination of the observed pattern undergoes a low frequency variation: a new set of receptors must come into action to yield again information on details, as soon as the rate of response from the other has reached a stationary level.
In this sense, eye movements are good for acuity: they counteract the loss of vision during fixation of steadily illuminated objects.

† Lateral inhibition is the depression of the response to illumination from a given receptor produced by stimulation of neighbouring receptors. Inhibition seems to be particularly emphasized near the edges of bright and dark areas in the retinal image (see § 7) and is assumed to be one of the mechanisms which overcome the blurring effect of diffraction, aberrations, etc. Similarly, inhibition could neutralize the blurring effect produced by a rapid oscillation of the retinal image, at least under certain conditions.
§ 6. Dynamic Characteristics of Binocular Vision

The monocular retinal image is a two-dimensional representation of the visual space and consequently the perception of depth is very rough in monocular vision. In binocular vision, the perception of the third dimension arises from the disparity between the two monocular images which is due to the different positions of the two eyes with respect to the same objects. The sensitivity to small differences of depth is called stereoscopic acuity.
Fig. 5. Binocular perception of depth. P and Q object points; L and R centres of the eye pupils; PL′, QL′, PR′ and QR′ retinal images
The stereoscopic acuity may be defined as follows. Let L and R be the centres of the pupils of the left and right eye, respectively (Fig. 5). P and Q are two object points; P is placed slightly farther away than Q from the line LR. The visual angle PLQ is somewhat greater than the visual angle PRQ and consequently the two images PL′ and QL′ of points P and Q in the left eye are farther apart than the images PR′ and QR′ in the right eye. The stereoscopic acuity is inversely related to the minimum disparity ∠PLQ − ∠PRQ which yields the perception of a depth difference between P and Q.
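The disparity defined above can be evaluated directly from the geometry of Fig. 5. The sketch below is illustrative only: the coordinates, the 65 mm pupil separation and the 2 mm depth step are assumed values chosen for the example, not data from the text, and the small-angle estimate a·Δd/(d(d+Δd)) is the usual approximation for points on the median line.

```python
import math

ARCSEC = math.pi / (180 * 3600)  # one second of arc in radians


def signed_angle(vertex, a, b):
    """Signed angle a-vertex-b in radians, counter-clockwise positive."""
    ang_a = math.atan2(a[1] - vertex[1], a[0] - vertex[0])
    ang_b = math.atan2(b[1] - vertex[1], b[0] - vertex[0])
    diff = ang_b - ang_a
    return (diff + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)


def disparity(left, right, p, q):
    """Magnitude of the binocular disparity (angle PLQ minus angle PRQ), radians."""
    return abs(signed_angle(left, p, q) - signed_angle(right, p, q))


# Assumed geometry in metres: pupils 65 mm apart, Q at 1 m, P 2 mm farther away.
L, R = (-0.0325, 0.0), (0.0325, 0.0)
Q, P = (0.0, 1.000), (0.0, 1.002)

eta = disparity(L, R, P, Q)
approx = 0.065 * 0.002 / (1.000 * 1.002)  # small-angle estimate
print("disparity:", eta / ARCSEC, "sec of arc")
print("small-angle estimate:", approx / ARCSEC, "sec of arc")
```

Signed angles are used so that the result also holds for points on the median line, where the two unsigned visual angles are equal by symmetry. Inverting the small-angle estimate gives the depth step that corresponds to any chosen threshold disparity at a given viewing distance.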
The definition of stereoscopic acuity is based on the assumption that single receptors or groups of receptors in one eye are anatomically correlated to corresponding receptors in the other eye. When the two images of the same object lie on corresponding retinal units, the disparity would be zero and no sense of depth could be aroused in the subject. The stereoscopic acuity can be exceedingly high; a disparity as small as 2 sec of arc can be used as a cue for the perception of depth. This value is very small compared with the dimension of retinal receptors. It is difficult to understand how a mechanism based on the anatomical correlation between retinal units, whose minimum width is about 20 sec of arc, could reach an accuracy of 2 seconds. Here we find the same contradiction which is encountered by the anatomical theories of visual acuity, when the diameter of foveal cones is compared with the width of a just perceptible dark line (0.5 sec of arc) or with the accuracy in judging the alignment of two lines (2 sec of arc). In the case of monocular acuity an attempt has been made by the dynamic theory to resolve this contradiction. A similar hypothesis has been advanced to account for the accuracy in the perception of depth in binocular vision (RIGGS and RATLIFF [1951]). Owing to the involuntary movements, the relative position of the two eyes would continually change even if the subject tried to maintain steady fixation at the same point with both eyes. As a consequence, the two images of the fixation point would wander inside two small regions of the two retinae, rather than remain for a long time on a single pair of corresponding receptors. A mechanism may be assumed to exist which would “compute” the mean locations of the two monocular images with respect to time, so that the binocular correspondence would be based on mean locations of retinal images, rather than on retinal receptors.
This hypothesis is supported by the fact that the stereoscopic acuity is lower with short observation times than with long ones. Indeed, OGLE [1957] has recently demonstrated that the threshold for the perception of depth regularly increases as the observation time decreases from 1 sec to a small fraction of a second. There is, however, very little direct experimental evidence of the role of involuntary movements in binocular vision. The only data available on the subject are the records of involuntary movements of both eyes obtained during binocular fixation (RIGGS and RATLIFF [1951]; DITCHBURN [1955]; KRAUSKOPF, CORNSWEET and RIGGS [1958]). The records show that the lateral positions of the two eyes are moderately correlated. Of the three kinds of involuntary movements (a) the tremor movements are independent in the two eyes, (b) the drifts are also more or less independent, (c) the flicks are synchronous in the two eyes. Over brief intervals of time, excluding flicks, the tremor causes the separation between two corresponding points of the two retinae to vary very rapidly; the instantaneous separations may differ by 20 sec of arc from the mean separation value. The separations vary more widely during longer intervals including drifts. These movements may have opposite directions in the two eyes and are much larger than tremor. The flicks, on the contrary, seem to have the role of correcting deviations in convergence, just as they correct the action of drifts in monocular fixation. A flick in one eye is accompanied by a flick in the other eye, having the same direction; the amplitudes of simultaneous flicks, however, are slightly different so as to correct deviations from the mean relative position of the two eyes.

It is perhaps worth while to mention that a dynamic theory of binocular vision was advanced a long time before involuntary movements were recorded and has since been supported by many authors. Its basic assumption is that the stereoscopic acuity is mediated by variations in convergence, that is to say by the voluntary movements of the two eyes. It is known that the binocular fixation of a point at a finite distance from the observer is obtained through a voluntary contraction of the oculomotor muscles of both eyes. The rotations of the two eyes are balanced so that the two visual directions converge on the observed point. A variation in convergence is required when the gaze is shifted from one point to another lying at a different distance from the observer: this would give the brain a cue for the perception of depth. (For instance, in the case illustrated by Fig. 5, the stereoscopic acuity would result from the difference between the two angles LQR and LPR, rather than from the retinal disparity PLQ - PRQ.) However, the fact that the perception of depth can be aroused even during steady binocular fixation seems to show that the voluntary movements are not of primary importance for stereoscopic vision. Perhaps the voluntary movements could be useful for maintaining the stereoscopic sensitivity during prolonged binocular observations. It seems, indeed, that the perception of depth is almost completely lost after a prolonged fixation, and that it shows up again soon after a voluntary movement of the eyes. In this sense, the function of voluntary movements for binocular vision would be similar to that of involuntary movements for monocular acuity.
§ 7. Some Visual Effects Produced by Intermittent Illumination
It has been shown in the previous sections that one of the roles of eye movements is to transform the spatial variations of retinal illumination into time variations of the light stimulus to retinal receptors. Hence the study of the visual effects produced by intermittent light is closely related to the investigation of the effects produced by eye movements. A great deal of experimental work has been carried out with intermittent light, even recently; however, only a few investigations, which are of particular interest for the present review, will be reported here. The visual effects of intermittent light are known to be the following. Suppose that the illumination of a homogeneous field varies periodically with time so that periods of dark alternate with periods of light, each having the same duration. Suppose also that the light level is constant during each light period and the transitions from light to dark and vice versa are instantaneous. The visual appearance of the field depends on the frequency of interruption: for any given luminance level L of the field during light intervals, a critical fusion frequency (cff) can be found such that (a) for any frequency greater than the cff fusion occurs and the field appears as if it were steadily illuminated, (b) for frequencies lower than the cff, light variations are perceived. The apparent brightness of the field, when fusion occurs, equals that of a steadily illuminated field whose luminance has the value corresponding to the mean luminance Lm of the first field during a light-dark period (Talbot's law). For frequencies slightly below the cff the observer reports a flicker sensation, and at much lower frequencies the light pulses are clearly distinguished from dark intervals. At low interruption rates, the field appears brighter during light pulses than a field steadily illuminated at level L (Brücke effect). This effect occurs at fairly high luminance levels and is maximum for frequencies of 8-10 c.p.s.
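Talbot's law as stated above reduces to a time average, which the following minimal sketch computes for an idealized rectangular light-dark cycle. The function name and the numerical values are hypothetical; only the averaging rule itself comes from the text.

```python
def talbot_mean_luminance(level, light_duration, dark_duration):
    """Mean luminance Lm over one light-dark period of a rectangular waveform.

    level          -- luminance L during the light interval
    light_duration -- duration of the light interval
    dark_duration  -- duration of the dark interval (luminance taken as zero)
    """
    period = light_duration + dark_duration
    if period <= 0:
        raise ValueError("the light-dark period must have a positive duration")
    return level * light_duration / period


# Equal light and dark intervals, as in the case described above: Lm = L / 2.
equal_intervals = talbot_mean_luminance(100.0, 1.0, 1.0)

# A light-to-dark ratio of 1:3 gives Lm = L / 4.
short_pulses = talbot_mean_luminance(100.0, 1.0, 3.0)

print(equal_intervals, short_pulses)
```

Above the cff, a field flickering with luminance L during the light intervals would thus be matched by a steady field of luminance Lm, whatever the frequency.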
The cff is linearly related to the logarithm of the mean luminance Lm, at least over a certain range of mean luminance values (Ferry-Porter law). The cff may reach 80-100 c.p.s. at high levels, and be as low as 3-4 c.p.s. at very low levels. The cff depends also on a number of other factors, such as the size of the test field, its location in the visual field with respect to the line of sight, the spectral composition of light, etc. In more general cases, intermittent light may be used which has different
light-to-dark ratios (ratio of the duration of a light pulse to the duration of a dark interval) or different pulse “shape” (variation of light with time during a single light pulse). Periodic variations of light may also be considered which include no dark period, but which consist of light modulations between two levels which are both above the eye threshold. A fundamental investigation of the visual response to pulsating lights has been carried out by DE LANGE [1952 and later papers]. He has applied to the “human fovea-brightness perception” system the method for the dynamic investigation of a linear system which is often used in communication engineering. In a linear system, a sinusoidal variation of the input quantity yields a sinusoidal variation of the output quantity. For a given input frequency $f$, the output amplitude may be higher or lower than the input amplitude, according to whether the system produces amplification or attenuation. The ratio $\eta$ of the output amplitude to the input amplitude may be measured as a function of frequency $f$. The curve obtained by plotting $\eta$ against $f$, both on logarithmic scales, is called the attenuation characteristic, and its shape indicates to what extent sinusoidal variations are amplified or attenuated in different parts of the frequency band when they are transmitted through the system. The application of this method of analysis requires the visual system to work linearly. Now, evidence can be presented in many cases for the non-linearity of the visual system: for instance, the relationship between stimulus intensity and brightness perception is far from being a linear one when the eye is presented with a single flash of light (Broca-Sulzer effect) † or with a light flickering at a low frequency (Brücke effect). However, it can be easily demonstrated on the basis of Talbot's law that the visual system works linearly at fusion of intermittent lights (DE LANGE [1957]).
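What an attenuation characteristic looks like can be illustrated with the simplest linear system of communication engineering, a first-order low-pass filter. This choice is an assumption made purely for illustration: it is not de Lange's model of the eye, and the cut-off frequency of 10 c.p.s. and the function names are hypothetical.

```python
import math


def lowpass_gain(frequency, cutoff):
    """Ratio of output to input amplitude for a first-order low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (frequency / cutoff) ** 2)


def characteristic(frequencies, cutoff):
    """Points of the attenuation characteristic, both axes on logarithmic scales."""
    return [(math.log10(f), math.log10(lowpass_gain(f, cutoff))) for f in frequencies]


# Hypothetical cut-off at 10 c.p.s.; three decades of input frequency.
points = characteristic([1.0, 10.0, 100.0], cutoff=10.0)
for log_f, log_gain in points:
    print(f"log f = {log_f:.2f}   log gain = {log_gain:.3f}")
```

For this filter the gain is close to unity well below the cut-off and falls steadily above it, so the plotted curve is flat at low frequencies and descends at high ones. A measured characteristic is read in the same way, whatever system produced it.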
† Broca-Sulzer effect: when a field of high luminance is presented for a short time, the response of the eye overshoots the brightness level which should correspond to that luminance and which would be reached if the field were steadily illuminated at the same level for a longer time.

The input quantity for the visual system is retinal illumination, the output quantity is brightness perception. Obviously, the output amplitude cannot be measured directly. Nevertheless, de Lange succeeded in obtaining attenuation characteristics on the basis of the following reasoning. Suppose that the luminance of the observed field varies sinusoidally between $L + rL$ and $L - rL$. For a given frequency $f$ and a fixed value of the average luminance $L$, the critical value of the ripple ratio r% can be experimentally determined at which flicker just disappears. This means that the ripple ratio r% of the input is reduced by attenuation to the critical value $r_0$, which is the threshold of a mechanism located somewhere in the visual system between the receptors and the visual cortex. The attenuation is $\eta = r_0/r$. The threshold value $r_0$ is independent of frequency; if we assume that at a very low frequency (1 c.p.s.) neither attenuation nor amplification occurs in the system ($\eta = 1$), $r_0$ can be taken equal to the critical input amplitude $r$ for that frequency. To obtain the attenuation characteristic, once $r_0$ is known, it is only required to measure the critical ripple ratio r% for different frequencies. The attenuation characteristics obtained by DE LANGE [1952] with
Fig. 6. Attenuation characteristics of the visual system for three levels of average retinal illumination (white light, 2° circular field with light surround). The shapes and the modes of interruption of the various stimuli used are indicated in the figure (after DE LANGE [1958]).
a 2° white field are shown in Fig. 6. The ripple ratio r% is plotted against cff for three different values of mean retinal illumination. The 2° field was surrounded during the experiment by a large field which was steadily illuminated at a level equal to the mean luminance of the test field. As could be expected, these curves show that the visual system acts as a low-pass filter for modulated light. The dynamic characteristics of the system depend on the mean luminance level: the bandwidth becomes greater and the descending branch of the curve becomes steeper as the mean luminance increases. The most interesting result is obtained at the highest levels, where amplification occurs ($\eta > 1$, that is $r < r_0$) for frequencies below 15 c.p.s., with a maximum at 8-10 c.p.s. At frequencies higher than 15 c.p.s. there is attenuation ($\eta < 1$, that is $r > r_0$). It is also interesting to note that, for r of about 2% and above, the attenuation increases very rapidly with increasing frequency. This shows that, if a non-sinusoidal input variation is considered, the effect of the higher harmonics in the Fourier expansion is negligible, as the higher harmonics are attenuated to a much greater extent than the fundamental. For any shape of modulation or mode of interruption, the visual system “reacts solely to the ripple ratio r, as regards the passing of the flicker limit” †. The relationship between these phenomena and the effects of eye movements is shown, for instance, by a very recent experiment made by KELLY [1959]. He emphasized that a system like the eye, which shows in many cases a “transient overshooting behaviour” (Brücke effect, Broca-Sulzer effect, etc.), must act as a differentiator: that is to say, the amplification should increase much more rapidly with frequency than is shown in de Lange's characteristics, at least in the low frequency range. Kelly points out that the flattening of de Lange's curves at low frequencies is an artifact resulting from the presence of a sharp edge in the 2° test field.
The tremor of the eye supplies a modulation of the stimulus wave form at the edge of the field, when the frequency of the stimulus is lower than the tremor frequency. Since the receptors act as differentiators, they accentuate the modulation resulting from tremor, so that the low-frequency sensitivity to the intermittent light is artificially increased. Kelly actually obtains very sharp characteristics by using a very large field with a blurred edge (Fig. 7). It is quite possible that all photometric judgements based on the match of an intermittently illuminated field to a steadily illuminated field are influenced by the modulations introduced by eye movements at the edge.

† It must be emphasized that these results hold for a small central part of the retina, including the fovea. The peripheral part of the retina seems to have different time characteristics (see, for instance, BITTINI [1958]).
Fig. 7. Attenuation characteristics of the visual system obtained with three different fields: (a) 2° circular field with light surround; (b) large field with blurred edge; (c) 4° field with dark surround (after KELLY [1959]). Abscissa: frequency in c.p.s.
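The bookkeeping behind de Lange's procedure, described earlier in this section, can be sketched in a few lines: the ripple ratio is taken as the amplitude of the fundamental Fourier component divided by the mean level, and the attenuation as the ratio of the threshold ripple to the critical ripple. The sketch below is only an illustration under these stated definitions. The function names, the sampled waveforms and the numerical thresholds are hypothetical and are not measured data.

```python
import math


def fundamental_ripple_ratio(samples):
    """Amplitude of the fundamental divided by the mean level.

    `samples` holds one full period of the waveform at equally spaced instants.
    """
    n = len(samples)
    mean = sum(samples) / n
    a = 2.0 / n * sum(s * math.cos(2 * math.pi * k / n) for k, s in enumerate(samples))
    b = 2.0 / n * sum(s * math.sin(2 * math.pi * k / n) for k, s in enumerate(samples))
    return math.hypot(a, b) / mean


def attenuation(r_threshold, r_critical):
    """Attenuation eta = r0 / r for a critical ripple ratio r measured at one frequency."""
    return r_threshold / r_critical


n = 10000
# Rectangular light-dark cycle with equal intervals (dark level taken as zero).
square = [1.0 if k < n // 2 else 0.0 for k in range(n)]
# Sinusoidal modulation of 30 per cent about the mean level.
sine = [1.0 + 0.3 * math.cos(2 * math.pi * k / n) for k in range(n)]

r_square = fundamental_ripple_ratio(square)  # close to 4 / pi
r_sine = fundamental_ripple_ratio(sine)      # close to 0.3

# Hypothetical thresholds: r0 = 1 per cent, critical ripple 2 per cent at some frequency.
eta = attenuation(0.01, 0.02)
print(r_square, r_sine, eta)
```

Describing a rectangular waveform by the ripple ratio of its fundamental alone is justified only to the extent that the higher harmonics are strongly attenuated, which is the condition stated in the text.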
Another group of investigations, which are of interest for the study of the dynamic characteristics of vision, deals with visual acuity under intermittent illumination. SENDERS [1949] has found that interrupted light is more efficient than steady light for the readability of printed matter. The investigation has been further developed by NACHMIAS [1958], who used various kinds of intermittent lights which differed from each other in frequency and in light-to-dark ratio. As regards perceived brightness, all the stimuli used were equivalent to a steady light having the same mean energy content. Visual acuity was measured with both long and short
exposures (45 sec and 250 msec, respectively). The results show that: (a) intermittent lights which are equivalent for brightness are not equivalent for visual acuity; (b) with long exposures, intermittent light is more efficient than steady light for visual acuity, as was previously found by Senders; (c) with short exposures the opposite occurs (Fig. 8). The results (b) and (c) strongly recall the findings of Riggs, Ratliff et al. about the effect of eye movements on visual acuity (see § 4): intermittent illumination and eye movements are both detrimental to visual acuity during short observation times and beneficial during long observation times. The agreement of the results obtained in the two different experiments provides further evidence for the hypothesis that the main effect of eye movements, as far as acuity is concerned, is to produce time variations of illumination of retinal receptors. The suggestion advanced by the author in § 5 to account for the detrimental effect of eye movements on visual acuity with short exposures should hold also in the case of intermittent illumination.

Fig. 8. Log of relative luminance required for resolution of a grating with intermittent lights having different frequencies and light-dark periods (broken lines). The abscissae have been chosen so that the solid line represents a more general form of Talbot's law holding even in the case of short exposures (after NACHMIAS [1958]).

As regards (a), it seems not surprising that visual acuity and brightness do not show the same behaviour under intermittent
illumination. This indicates that the visibility of details is not merely a matter of brightness discrimination, as is assumed by the classical photochemical theory, but also involves the perception of contours, a far more complicated phenomenon which cannot be described in terms of photochemical reactions. Some of the aspects of the visual perception of contours will be discussed in the next section.
§ 8. The Perception of Contours
Many examples have been presented which show the importance of edges in the observed pattern for visual perception. It has been shown, for instance, that the disappearance of the edges during observations with stationary images makes the whole field appear uniform in spite of objective variations of luminance across the field. On the contrary, the presence of a non-stabilized edge in an otherwise uniform field maintains vision of the whole field and perception of luminance differences, even if the receptors which are located far from the edge undergo a constant stimulation. Contours are also very important for binocular vision. Convergence is adjusted so as to obtain fusion of important contours in the binocular field. When the two eyes are presented with two different patterns, and a zone of one monocular image which contains contours corresponds to a uniform zone of the other monocular image, the latter is suppressed from the binocular pattern, while the contours of the first are perceived. Apparently, the information from the edges is of substantial significance for the brain and provides sufficient knowledge for the perception of the whole pattern, even if the specification of light intensity is not very precise. Investigation of the perception of contours seems, accordingly, to be of basic importance for the understanding of visual performance. There are at least two points to be ascertained: (a) which distribution of illumination is required to yield the perception of a contour, (b) what kind of mechanism may be assumed to be responsible for the perception of contours. We shall try to give a brief account of some recent investigations which deal with these two points. In the following, we shall call “contour” a narrow region where the eye perceives a sharp variation in brightness.
Contours are perceived at steps in the luminance distribution and they generally appear sharper than could be expected by taking into account the smoothing
effects of diffraction, aberrations, etc. Moreover, subjective contours are seen in regions of the field where no sharp variation of luminance occurs. The so-called Mach bands are one of the most striking examples of subjective contours.
Fig. 9. Objective luminance distribution along one direction of the test field (solid curve). Qualitative representation of the corresponding retinal illumination (broken curve) and of the subjective brightness distribution (dot and dash curve). Abscissa: distance, with the regions a, b and c marked.
Let us consider a field where the luminance is constant along one direction and varies in the direction at right angles to it, according to the diagram of Fig. 9 (solid curve). The luminance has the constant value $L_1$ in region a of the field, the constant value $L_2$ in region c, and varies linearly from $L_1$ to $L_2$ in the intermediate region b. The retinal distribution of illumination is smoothed by diffraction and aberrations, as is qualitatively represented by the broken curve in Fig. 9. An observer looking at the field reports a brightness sensation which does not correspond to the objective luminance distribution: a bright band is perceived at the edge of region a and a dark band at the edge of region c. The first band appears brighter than region a and the second darker than region c. The two contours of each band are located at opposite sides with respect to the edge of the uniform region in the objective distribution of luminance. The brightness distribution is qualitatively indicated by the dot and dash curve of Fig. 9. Both bands are subjective in nature and are perceived, for instance, at the boundaries of the penumbra projected on a diffusing surface by an
object illuminated with an extended source. They were first described by MACH [1865] and are usually referred to in the literature as the Mach bands. It is to be noted that the field represented in Fig. 9 is but one of the many different luminance distributions which give rise to the Mach bands. LUDVIGH [1953] has carried out a general investigation of the relationship between retinal distribution of illumination and location of subjective contours. He used a sigmoid distribution of luminance and chose experimental conditions such that the actual distribution of retinal illumination could be known with sufficient accuracy. The external light distribution could also be changed in such a way that the derivatives of retinal illumination with respect to retinal distance were systematically varied by an amount controlled by the experimenter. Ludvigh pointed out that the rate of change of illumination with respect to retinal distance is an unimportant factor for the formation of a contour and that the lowest derivative of illumination with respect to retinal distance which has significance for the perception of contours is the second derivative. According to Ludvigh, two contours are perceived in any region where the second derivative is maximal and are symmetrically disposed about that region. The separation between the two contours depends on the fourth derivative: the greater the fourth derivative, the smaller the separation of the doublet contour. This means that a sharp bend in the luminance distribution produces a very narrow edge doublet, while a region where the curvature varies smoothly produces a wide edge doublet. When the bend is very sharp, and accordingly the fourth derivative has a very high value, the two contours may fuse and no band is perceived. If patterns are employed which contain several sharp bends, the contour at one bend may fuse with the contour at the adjacent bend and resolution of details can be affected.
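The role of the second derivative can be illustrated numerically on a ramp profile of the kind shown in Fig. 9. The sketch below is a rough illustration, not Ludvigh's procedure: the optical smoothing is replaced by a simple moving average (an assumption), the luminance values and the sampling step are hypothetical, and the second derivative is estimated by finite differences.

```python
def smooth(values, half_width):
    """Moving average, used here as a crude stand-in for optical blur (assumption)."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def second_difference(values, dx):
    """Finite-difference estimate of the second derivative at the interior points."""
    return [(values[i - 1] - 2 * values[i] + values[i + 1]) / dx ** 2
            for i in range(1, len(values) - 1)]


# Hypothetical profile: bright plateau (region a), linear ramp (b), dark plateau (c).
profile = [10.0] * 40 + [10.0 - 0.2 * k for k in range(1, 41)] + [2.0] * 40
blurred = smooth(profile, 3)
d2 = second_difference(blurred, dx=1.0)

# Positions (indices into `profile`) where the second derivative is most
# negative and most positive; d2[j] belongs to profile index j + 1.
i_min = min(range(len(d2)), key=d2.__getitem__) + 1
i_max = max(range(len(d2)), key=d2.__getitem__) + 1
print("upper bend near index", i_min, "lower bend near index", i_max)
```

In this example the extreme values of the second derivative fall at the two bends of the ramp, at the edges of regions a and c, which is where the bright and dark Mach bands are reported. The first derivative, by contrast, is largest in the middle of region b, where no contour is seen.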
The mechanism responsible for the perception of contours might be located in the retina or in the brain. Most of the experimental findings previously reported can be explained on the basis of retinal mechanisms. However, this does not imply that the whole process is mediated by the retina; it seems conceivable that the higher centres make their own active contribution to the perception of contours, at least as regards the evaluation of retinal responses on the basis of experience. Ludvigh assumes the existence of three different kinds of elements
in the retina: the first would be responsible for the absolute response to light, the second for the discrimination of brightness, the third for the perception of contours. The three sets of elements would be involved in the evaluation of the illumination level, of the first derivative of illumination and of the second derivative, respectively. The third set would consist of “on-off” elements. Owing to eye movements, these last elements would be maximally excited in the vicinity of regions where the first derivative of illumination has a maximum, but the “gradient of excitation” of these elements would be maximum where the second derivative is maximum. More frequently, however, the perception of contours has been assumed to be produced by lateral inhibition in the retina. GREEN [1958] has reviewed the theories proposed by several authors and has pointed out that many of the visual phenomena connected with contour perception (Mach bands, detectability of a line, resolution of two lines) can be explained on the basis of a rather simple inhibitory mechanism in the retina. He assumes, as previously postulated by FRY [1948], that the mechanism responsible for the perception of contours is composed of two kinds of retinal elements: a set of elements whose response depends on the level $E$ of local illumination, and a second set of overlapping elements which evaluate $\bar{E}$, the mean of $E$ over a retinal region of diameter $a$. The response from the second set of elements would inhibit the response from the first set, so that the final response would depend on $E - \bar{E}$. A contour would be perceived in regions where the “contour response” $E - \bar{E}$ presents a discontinuity, or at least a sharp variation.
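The inhibitory scheme just described can be tried out on a one-dimensional ramp profile. The sketch below is a minimal illustration under simplifying assumptions: illumination is sampled at equal steps, the mean is an unweighted average over a window of `width` samples (standing in for the region of diameter a), and the profile values are hypothetical.

```python
def local_mean(values, width):
    """Unweighted mean of `values` over a window of `width` samples centred on each point."""
    half = width // 2
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def contour_response(values, width):
    """Local illumination minus its local mean at every sample point."""
    means = local_mean(values, width)
    return [e - m for e, m in zip(values, means)]


# Hypothetical profile: bright plateau, linear ramp, dark plateau.
profile = [10.0] * 40 + [10.0 - 0.2 * k for k in range(1, 41)] + [2.0] * 40
response = contour_response(profile, 9)

brightest = max(range(len(response)), key=response.__getitem__)
darkest = min(range(len(response)), key=response.__getitem__)
print("positive peak at index", brightest, "negative peak at index", darkest)
```

The response vanishes on the plateaus and in the middle of the ramp, is positive at the bend adjoining the bright plateau and negative at the bend adjoining the dark one. This reproduces, qualitatively, the bright and dark bands of Fig. 9.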
Green has proved that in the one-dimensional case, and with the assumption that the averaging interval $a$ is small, the difference $E - \bar{E}$ can be satisfactorily approximated by the sum of two terms which are respectively proportional to the second and fourth derivatives of $E$ with respect to retinal distance. These derivatives are just the ones that Ludvigh demonstrated to be of significance for the formation of contours: they would derive their significance from the fact that they determine the value of $E - \bar{E}$. The assumption that retinal inhibition is responsible for the perception of contours is supported by the fact that contours are not seen under conditions which are unsuitable for the inhibitory processes to take place, namely short observation times and low illumination levels. This holds for contours both at sharp edges and at smooth edges (Mach bands). CHEATHAM [1952] has shown that sharp contours take
a finite time to be perceived (from 30 to 100 msec) and that the latency increases when luminance is decreased. Mach bands are not visible at low luminance levels (ERCOLES and FIORENTINI [1959]) nor during short presentations of the field (FIORENTINI [1956]). The movements of the retinal images have been found to affect visibility of subjective contours in the same way as they affect visibility of details. The Mach bands perceived at a diffused edge disappear very soon after stabilization of the retinal image. On the other hand, slow and wide oscillations of the field enhance the visibility of Mach bands, while rapid motions impair visibility of the bands (FIORENTINI and ERCOLES [1957]). This seems to suggest that the disappearance of details during prolonged observations of “stopped” images is due to fatigue of the contour mechanism, rather than to a fading of the direct response to illumination. If this hypothesis is correct, the following conclusions can be drawn about the dynamic characteristics of perception. The vision of details is possible through differential brightness sensitivity and perception of contours. The first is activated even by flash exposures, while the second takes longer to occur, since it is mediated by lateral inhibition. Prolonged exposures in the absence of movement cause fatigue of the contour mechanism and as a consequence contours disappear. Differential brightness sensitivity is also severely impaired by the lack of a contour response, and brightness tends to become uniform in spite of luminance variations in the field. Time variations of retinal illumination, produced either by intermittent illumination or by motions of the retinal image, can prevent disappearance of details.
Brightness discrimination can be maintained even during steady illumination of large sets of receptors, provided that the contour response at the boundaries is maintained through local time variations of illumination and activation of inhibitory mechanisms.
Acknowledgments
The author gratefully acknowledges the permission granted by various authors and publishing bodies for the reproduction of diagrams.
References
ADLER, F. H. and F. FLIEGELMAN, 1934, Arch. Ophthal. 12, 475.
BARLOW, H. B., 1952, J. Physiol. 116, 290.
BARLOW, H. B., R. FITZHUGH and S. W. KUFFLER, 1957, J. Physiol. 137, 338.
BITTINI, M., 1958, Atti Fond. G. Ronchi 13, 442.
BYRAM, G. M., 1944, J. Opt. Soc. Am. 34, 718.
CHEATHAM, P. G., 1952, J. Exp. Psych. 43, 369.
CORNSWEET, T. N., 1956, J. Opt. Soc. Am. 46, 987.
CORNSWEET, T. N., 1958, J. Opt. Soc. Am. 48, 808.
CORNSWEET, T. N. and L. A. RIGGS, 1954, Paper delivered at the East. Psych. Ass. Meeting, 1954.
DE LANGE DZN, H., 1952, Physica 18, 935.
DE LANGE DZN, H., 1954, J. Opt. Soc. Am. 44, 380.
DE LANGE DZN, H., 1957, Thesis, Technical Univ. Delft.
DE LANGE DZN, H., 1958, J. Opt. Soc. Am. 48, 777.
DITCHBURN, R. W., 1955, Optica Acta 1, 171.
DITCHBURN, R. W., 1956, Research 9, 466.
DITCHBURN, R. W. and D. H. FENDER, 1955, Optica Acta 2, 128.
DITCHBURN, R. W. and B. L. GINSBORG, 1952, Nature 170, 36.
DITCHBURN, R. W. and B. L. GINSBORG, 1953, J. Physiol. 119, 1.
DITCHBURN, R. W. and R. M. PRITCHARD, 1956, Nature 177, 434.
ERCOLES, A. M. and A. FIORENTINI, 1959, Atti Fond. G. Ronchi 14, 230.
FENDER, D. H., 1955, Brit. J. Ophthal. 39, 65.
FIORENTINI, A., 1956, Problems in Contemporary Optics (Florence, Ist. Naz. di Ottica).
FIORENTINI, A. and A. M. ERCOLES, 1956, Problems in Contemporary Optics (Florence, Ist. Naz. di Ottica).
FIORENTINI, A. and A. M. ERCOLES, 1957, Optica Acta 4, 150.
FRY, G. A., 1948, Am. J. Optom. Monograph 45.
GREEN, P. H., 1958, Factors in Visual Acuity (Aeromedical Division, U.S. Air Force Office of Scientific Research, AF 18(600), Project 9777 - Univ. of Chicago, 1958).
HARTLINE, H. K., 1938, Am. J. Physiol. 121, 400.
HEBBARD, F. W. and E. MARG, 1957, J. Opt. Soc. Am. 47, 112.
HEDLUN, J. M. and C. T. WHITE, 1959, J. Opt. Soc. Am. 49, 729.
HERING, E., 1899, Ber. Gesellsch. Leipzig 51, 16.
HIGGINS, G. C. and K. F. STULZ, 1953, J. Opt. Soc. Am. 43, 1136.
JONES, L. A. and G. C. HIGGINS, 1948, J. Opt. Soc. Am. 38, 398.
KELLY, D. H., 1959, J. Opt. Soc. Am. 49, 730.
KRAUSKOPF, J., 1957, J. Opt. Soc. Am. 47, 740.
KRAUSKOPF, J., T. N. CORNSWEET and L. A. RIGGS, 1958, J. Opt. Soc. Am. 48, 288.
LORD, M. P. and W. D. WRIGHT, 1948, Nature 162, 25.
LUDVIGH, E., 1953, Perception of Contours, I and II (U.S. Naval School of Aviation Medicine, Reports N.M. 001.075.01.04 and N.M. 001.075.01.05).
MACH, E., 1865, Sitz. Wien. Akad. Wiss. 52/2, 303.
MARSHALL, W. H. and S. A. TALBOT, 1942, Biological Symposia, Vol. 7 (Lancaster, The Jaques Cattell Press) p. 117.
NACHMIAS, J., 1958, J. Opt. Soc. Am. 48, 726.
OGLE, K. N., 1957, J. Opt. Soc. Am. 47, 343.
POLYAK, S. L., 1941, The Retina (Univ. of Chicago Press) p. 425.
RATLIFF, F., 1952, J. Exp. Psych. 43, 163.
RATLIFF, F., 1958, J. Opt. Soc. Am. 48, 274.
RATLIFF, F. and L. A. RIGGS, 1950, J. Exp. Psych. 40, 687.
RIGGS, L. A. and F. RATLIFF, 1951, Science 114, 17.
RIGGS, L. A., F. RATLIFF, J. C. CORNSWEET and T. N. CORNSWEET, 1953, J. Opt. Soc. Am. 43, 495.
SENDERS, V. L., 1949, J. Exp. Psych. 39, 453.
TEN DOESSCHATE, J., 1954, Ophthalmologica 127, 65.
WEYMOUTH, F. W., D. C. HINES, L. H. ACRES, J. E. RAAF and M. C. WHEELER, 1928, Am. J. Ophthal. 11, 947.
MODERN ALIGNMENT DEVICES BY
A. C. S. VAN HEEL Technische Hogeschool, Delft
CONTENTS
§ 1. INTRODUCTION
§ 2. CUSTOMARY METHODS EMPLOYING COLLIMATORS AND TELESCOPES
§ 3. INTERFERENCE ARRANGEMENTS
§ 4. DISCUSSION OF THE PRECISION
§ 5. THE USE OF REFLECTING SPHERES AND OF SPHERES WITH A CONCENTRIC CAP
§ 6. SPHERE WITHOUT REFLECTION, PRODUCING A LUMINOUS “LINE”
§ 7. SINGLE LENS AS ALIGNMENT COLLIMATOR
§ 8. THE USE OF THE RAINBOW
§ 9. ALIGNMENT OF SURFACES
§ 10. THE AXICON
§ 11. ADDITIONAL EXAMPLES
§ 12. SOME TECHNICAL REMARKS ON THE MANUFACTURE OF ZONE PLATES
REFERENCES
§ 1. Introduction
By alignment we understand the act of placing three or more points on a straight line, including the determination of (small) departures from a straight line. The ways to ascertain the departures of points from a flat surface, though not strictly belonging to our subject, will be considered too, since methods similar to those used for pure alignment can often be applied. In many cases stretched steel wires are used. The wind unfavorably influences the results obtained with these. Moreover they are more cumbersome than light rays, which have no mass. Since alignment methods using light rays can be adapted to meet practically every demand and yield at the same time accurate and rapid results, they alone will be considered here. One general limitation must be brought to mind. Light rays are straight lines only in a homogeneous medium. Turbulence of the air
Fig. 1. Huygens’ principle
impedes the use of optical sighting methods, atmospheric refraction limits their use to relatively small distances. In geodetic measurements the influence of the atmosphere on the determination of azimuthal angles and elevations must be eliminated by applying theoretical
corrections and by multiplying the number of observations under varying circumstances. It is not our intention to dwell on these features, well known in geodetic practice.

We must, however, say a few words on what will be understood by a light ray. We will start with the concept of wave fronts. When a wave front V at time $t$ (Fig. 1) travels from left to right, each point of V may be considered as a centre of secondary wavelets, each of which will have reached a given position at time $t + \Delta t$. The enveloping surface V’ of these wavelets is (according to Huygens’ principle) the wavefront at the time $t + \Delta t$. Fresnel has combined this principle with the principle of superposition. To calculate the intensity at a point P one has to add the light vectors originating from the different surface elements $d\sigma$ of V, taking into account the distance of each element $d\sigma$ from P. Experiment shows that, in all cases considered here, phase and intensity at P are correctly obtained in this way (except for a constant phase lag, which does not play a role in our considerations). In a homogeneous medium the wavelets are spherical.

It can be shown that the light at a point P at a distance $l$ from a plane wavefront V (Fig. 2) can be considered as proceeding from a circular portion of V around the foot A of the perpendicular from P on V, of which the diameter is of the order of magnitude $1.4\sqrt{\lambda l}$, i.e. from the inner half of the first Fresnel zone; here $\lambda$ is the wavelength of the light. The normal to the wavefront at A can be considered as the “light ray” through A, which has a certain “play” in its “origin” on V. The angular play corresponding to a point P is $1.4\sqrt{\lambda/l}$.

Fig. 2. Effective portion of an infinite wavefront
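As a rough numerical illustration of the two estimates above, the following Python sketch evaluates the patch diameter $1.4\sqrt{\lambda l}$ and the angular play $1.4\sqrt{\lambda/l}$. The function name and the sample distances are hypothetical; the wavelength of 0.56 × 10⁻³ mm is the value adopted elsewhere in this article.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265 seconds of arc per radian


def effective_patch(distance_m, wavelength_m=0.56e-6):
    """Return (patch diameter in m, angular play in seconds of arc).

    Uses the order-of-magnitude expressions 1.4*sqrt(lambda*l) and
    1.4*sqrt(lambda/l) for the inner half of the first Fresnel zone.
    """
    diameter = 1.4 * math.sqrt(wavelength_m * distance_m)
    play_rad = 1.4 * math.sqrt(wavelength_m / distance_m)
    return diameter, play_rad * ARCSEC_PER_RAD


# Sample distances (assumed for illustration only).
for l in (1.0, 10.0, 100.0):
    d, play = effective_patch(l)
    print(f"l = {l:6.1f} m: patch diameter {d * 1e3:.2f} mm, play {play:.0f} arcsec")
```

The diameter grows and the angular play shrinks with the square root of the distance, so that doubling the precision of a "ray" defined in this way requires four times the distance.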
By a similar consideration the failure of “sighting” by means of three small holes to obtain high precision is explained. Let O in Fig. 3 be a small hole and let V be a plane assumed to be about midway between O and P, and let OP = $l$. When the light path OAP does not exceed the path OP by more than, say, $\tfrac{1}{2}\lambda$, the light arriving at P is still more or less in phase. The patch on V, of which the diameter $d$ = AB, is the part of the wavefront at V responsible for the illumination at P. Now, the angle $\alpha$ is the play in the alignment and its value is easily seen to be $2\sqrt{\lambda/l}$. With $\lambda = 0.56 \times 10^{-3}$ mm we have $\alpha = 300/\sqrt{l}$ in seconds of arc, $l$ being measured in meters. If we make a circular hole at V which just admits the patch AB, we can expect to align with a precision of not too small a fraction of this value. Indeed we find, for example, in an article of BONNAFFÉ [1930] a play of 2 mm for P when $l$ = 30 m. This amounts to 14 seconds for $2\alpha$, while the above reasoning gives 110 seconds. The result is much better than expected, for which the symmetry of the phenomenon is responsible. Still, even careful settings by experienced observers obviously do not yield satisfactory results.

Fig. 3. Play in directions of light rays, when sighting is done with three holes

In these experiments the first hole must be small enough to provide an illumination at V with more or less coherent light. Furthermore the hole at V must be circular to a good approximation, as otherwise the setting on maximum light at P does not guarantee the straightness of the sighting line. One objection to the method is the lack of illumination on account of the required smallness of the holes, which often makes it necessary to observe during the night. Another serious drawback of this method is the fact that nothing is seen until alignment of the three holes is achieved, an unsatisfactory feature indeed, as much time is wasted
by groping in the dark in search of the sighting line. It is especially because of these two drawbacks that methods have been developed which yield accurate results with less loss of time; they offer a field of view even when alignment is far from realized. Moreover, the brightness is sufficient for observations in daylight.

That sighting along pegs or through holes is an old technique which gave astonishingly good results in ancient times is well known. One may quote as an example the construction of a tunnel of 1 km length to conduct water through a mountain at Samos in the sixth century B.C., started at two opposite points [HERODOTUS]. HERON OF ALEXANDRIA, living six centuries later, gives an elaborate account of the way geodetic measurements for this and similar purposes were performed in his time. The ingenious contrivance he used, the dioptra, was a horizontal tube of about 1.5 m length with two vertical end tubes of glass, filled with water. The sighting took place by viewing through narrow adjustable slits along the two water menisci. The instrument, acting at the same time as a level and as a sighting apparatus, had manifold uses, described in the existing book of Heron. Its precision may have been a few minutes of arc. In later times, up to the application of lenses, the observations of directions for astronomical, navigational and geodetic purposes were made with holes or pegs. Some remarks on these instruments can be found in the historical chapter of the book by DANJON and COUDER [1935].
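The three-hole estimate made earlier in this section can be checked numerically. The sketch below, with hypothetical function names, evaluates $\alpha = 2\sqrt{\lambda/l}$ for $\lambda = 0.56 \times 10^{-3}$ mm and converts the observed play of 2 mm at 30 m into seconds of arc for comparison.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def sighting_play_arcsec(distance_m, wavelength_m=0.56e-6):
    """Estimated play alpha = 2*sqrt(lambda/l), in seconds of arc."""
    return 2.0 * math.sqrt(wavelength_m / distance_m) * ARCSEC_PER_RAD


def observed_play_arcsec(lateral_m, distance_m):
    """Small-angle conversion of a lateral play at a given distance."""
    return lateral_m / distance_m * ARCSEC_PER_RAD


# alpha at 1 m: close to the rounded value 300/sqrt(l) used in the text.
print(f"alpha(1 m)    = {sighting_play_arcsec(1.0):.0f} arcsec")
# Twice alpha at 30 m: the estimate from the reasoning above.
print(f"2*alpha(30 m) = {2.0 * sighting_play_arcsec(30.0):.0f} arcsec")
# The observed play of 2 mm at 30 m.
print(f"observed      = {observed_play_arcsec(2e-3, 30.0):.1f} arcsec")
```

The unrounded coefficient is slightly above 300, which is why the computed value of $2\alpha$ at 30 m comes out a little higher than the rounded figure of 110 seconds.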
§ 2. Customary Methods Employing Collimators and Telescopes

2.1. TELESCOPES
In the sixteenth century the invention of the telescope provided a more powerful technique. For establishing the direction of points at infinity this technique is excellent, because the parallel pencils of rays originating from such points are brought to points of the focal plane G of the object glass; directions are transformed into points (see Fig. 4). The position of the points in the focal plane is ascertained by means of cross wires or a graticule, of which a multitude of forms exist, suited to different purposes. This holds only in the case of “perfect” object glasses. The only optical limitation to the fixation of the direction is then set by effects of diffraction. The angular diameter of the diffraction patch is of the
order of $2\lambda/b$, where $b$ is the diameter of the object glass, the focal length playing no part. It must be understood that not only the geometrical aberrations of the optical system, but also errors arising from imperfect form of the surfaces, from decentring and from the lack of homogeneity must be below certain limits. Furthermore the mechanical parts must have fixed form and dimensions. The influence of changes in temperature, especially of irregular temperature distribution, can be fatal to the ultimate precision. In what way the influence of the bending of the telescope tube of a meridian circle can be ascertained has been described by VAN HEEL [1956] and by VAN HERK [1958].

Fig. 4. Lens transforming directions into points in the focal plane
The improvement of pointing on sights at great distances by means of zonal rings placed before the object glass has been discussed by RICHARDUS [1954]. This last method will be described later. Since the main subject of this article is the alignment of points at finite distances we will not dwell on these methods. Another feature of pointing with telescopes, however, must be discussed. For many purposes directions of points are being ascertained by means of a telescope coupled with a spirit level. In a theodolite the position of the telescope can be read on a horizontal as well as on a vertical divided scale. When the targets are not very far off, the telescope must be focussed for each distance. This can be done by changing the distance between the object glass and the cross wires or by moving a lens between object glass and cross wires (the internal focussing lens). The centring of the moving parts on one line is only approximately realized. When, for example, the eyepiece of a telescope of which the object glass has a focal length of 25 cm has a lateral
mechanical play of, say, 0.005 mm, the angular play in pointing is 4 seconds of arc on account of this defect.

This source of errors, very important for measurements involving high precision, can be completely obviated by an arrangement in which the telescope need not be touched during or between the observations. For this purpose consider a small hole (illuminated from behind with a small lamp, if necessary with interposition of a lens) as the object to be pointed at. The eyepiece of the telescope is focussed on a plane between the object glass and its focus. Thus neither objects at infinity nor those at any real distance can be seen sharply. The light “points” form a blur, a circle of diffusion, whatever their distance. Now place before the object glass a plate with alternately transmitting and absorbing zones. The width of the zones need not follow any specific law, provided they do not resemble Fresnel zones. The practical realization of the zone plates is treated below, but let us assume that these plates can be made with sufficient precision and ease (see below, § 12). The circular zones usually are equidistant. With such a plate before the object glass of the telescope, the “image” of a luminous point appears to be a disc marked by concentric coloured rings. The light distribution in the disc can be calculated (for monochromatic light) from elementary diffraction theory; the delicate nuances of the colours, when white light is used, cannot and indeed need not be calculated. They vary with the different distances of the object points. It is important, however, to note that they are concentric around the optical axis of the object glass. No asymmetrical errors in the latter are allowed, of course, if precision is intended. Errors of centring the object glass itself are especially pernicious. But spherical and chromatic aberration are both harmless; they do not disturb the symmetry of light distribution around the axial point.
The reticule of the eyepiece should preferably contain a set of concentric rings. By placing the object point on the axis of the telescope its position with reference to this axis can be observed, and thus alignment of several points can be reached without touching the telescope. For geodetic measurements, from the most precise primary triangulation to levelling at short distances, the method has proven to be preferable to the usual ones in respect of precision and rapidity. It is to be noted that even when the object point is not in line, it is seen in the field of the telescope as far as the lateral field of the object glass allows.
Fig. 5. Pointing to a small illuminated hole (0.2 mm); zone plate in front of the object glass
In these measurements the telescope is moved in its mount if necessary, but need not be touched itself. For a practical use we refer the reader to Richardus (l.c.). This author also stresses that accuracy is nearly twice as good as with the customary methods, that the influence of contrast (by haze) is strongly reduced, that field illumination is unnecessary and that the additional apparatus is simple and cheap. Observing with the sighting telescope can be conceived as working with an imaginary straight line, the optical axis, extrapolated from the telescope into the object space. In principle there is no objection to its use, provided the instrument need not be touched between observations. Fig. 5 gives an image obtained by a telescope with magnification 20 and a zone plate with a period of 1 mm before the object glass.

A step in the direction of employing diffraction, instead of being hampered by it, has been taken by Bonnaffé (l.c.), who used a point source and a telescope focussed on the source for farthest and nearest points and who used a unilateral screen at the intermediate point. The telescope was stopped down to a diameter of a few millimeters. When the screen was laterally displaced until the first diffraction rings around the Airy disc disappeared, he noted the position of the edge of the screen. By first pushing the screen into the pencil from the left and then from the right and by taking the mean of the two readings, he got a much better precision than with sighting alone. He thus attained an accuracy of about 1 second of arc, performing his observations in the dark.

2.2. TELESCOPES AND COLLIMATORS
One might ask whether it would not be feasible to use the zone plate alone, without object glass. This indeed is the case, but we will discuss this possibility later on and we will treat first the use of telescopes in autocollimation and also in conjunction with collimators, both for alignment purposes. The use of a telescope to determine directions, especially for rifles and other military purposes, is treated by KÖNIG [1937] in his book on telescopes and range-finders. In order to test the straightness of lathe beds, for instance, a telescope T is mounted near the bed V, while a collimator C is placed on the surface to be tested (see Fig. 6). The optical parts of both must be of very good quality. In the focal plane P of the collimator a
graticule has been mounted, of which one point or a small circle indicates the focal point. The object glass of the telescope forms an image of it in its back focal plane Q and the position of this image is determined with reference to a graticule with appropriate lines or circles on it.
Fig. 6. Alignment with collimator and telescope
When the optical axis of the collimator is inclined to that of the telescope, the image of A does not coincide with the focal point B of the telescope. Thus the inclinations of the collimator can be ascertained. Let us assume the distance between the two points of support of the collimator to be $\Delta x$, the lateral deviations of the points of the surface V to be $\Delta y$ and the distance of the first point of support from the left extremity to be $x$. The deviations in Q are a measure for $\Delta y/\Delta x$ as a function of $x$. The form $y(x)$ of the surface can be derived from the measured values by means of an approximate integration. This is an unsatisfactory procedure.

Important as the determination of the angular deviations as a function of $x$ may be in some cases, interest in practice is usually centred on the deviations $\Delta y$ themselves, again as a function of $x$. To measure these, the collimator is provided with a second graticule against its object glass and the telescope is focussed on it. This, however, introduces a serious source of errors due to re-focussing of the telescope, as has been pointed out above. We need not enter into the technical details of this customary way of alignment, as they are dealt with in several treatises (see, for example, RÄNTSCH [1949], who also indicates several variants).

It is not surprising that high precision is impaired when lenses are used, even when aberrations are not directly a hindrance. Errors of centring in the optical parts themselves are of much more importance. Moreover it seems worth-while to seek methods by which the deviations themselves are measured directly. With telescopes and collimators the deviations are derived from determinations of direction.
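The "approximate integration" of the measured inclinations mentioned above can be sketched as a running sum. The Python example below is a minimal illustration under stated assumptions, not the procedure of any particular treatise: the collimator is assumed to be advanced in steps equal to its support distance $\Delta x$, each reading is taken as the mean slope $\Delta y/\Delta x$ over that step, and the function names and sample readings are hypothetical.

```python
def profile_from_slopes(slopes, step):
    """Sum slope readings (in radians) taken at successive stations.

    Each reading is treated as the mean slope over one support length
    `step`, so that y[k+1] = y[k] + slope[k] * step.  The first point
    is taken as the zero of height.
    """
    heights = [0.0]
    for s in slopes:
        heights.append(heights[-1] + s * step)
    return heights


def remove_end_line(heights):
    """Refer the profile to the straight line through its end points."""
    n = len(heights) - 1
    first, last = heights[0], heights[-1]
    return [h - first - (last - first) * i / n for i, h in enumerate(heights)]


# Hypothetical slope readings in radians, support distance 100 mm.
slopes = [1e-5, 2e-5, -1e-5]
raw = profile_from_slopes(slopes, 100.0)
print([round(h * 1e3, 3) for h in remove_end_line(raw)])  # deviations in microns
```

Since every height is a sum of all preceding readings, the errors of the individual readings accumulate along the bed, which is one reason why a direct measurement of the deviations is to be preferred.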
Further requirements for a useful method are a field of view of sufficient dimensions, images of adequate brightness, simplicity in use and manufacture, and high precision. It is desirable that for low precision (in preliminary observations) no essential change in the apparatus or the mode of observation is required. It will be shown below that the use of lenses must and can be cut down with profit, and it appears to be no longer quite true that “the high attainments of modern instruments have . . . placed the superiority of the optical pointer beyond all question”, where by optical pointer is meant “a lens system carrying on its optical axis a cross wire or other pointer, made to coincide with the image of the object” (MARTIN [1924]).

§ 3. Interference Arrangements
The problem of alignment can be attacked in quite another way, without the use of lenses in the essential parts, that is without the formation of images. The simplest arrangement is a double slit (VAN HEEL [1946]). When a screen with a double slit is illuminated by coherent light, interference of the light diffracted by each of the slits gives rise to a pattern of dark and light fringes when monochromatic light is used, and of colored fringes with white light. The fringes are parallel to the slits. In order to secure sufficient coherence it is necessary to make use of a single slit as a light source. The general arrangement is shown in Figs. 7 and 8.

Fig. 7. Alignment with double slit; image of light source on slit

This set-up has proved to be useful in precision alignment. The farthest point (line) is the central line of slit A, the intermediate point (line) is the centre line of the double slit B, and the nearest point is the intersection of the cross wires at C. Two auxiliary lenses D and G are added, both of which can be single, uncorrected lenses (spectacle lenses), both being outside the region where alignment takes place. The lens D acts as a condensing lens. It can be used to form an image of the wire of the glow lamp E on the single slit (Fig. 7), or at infinity (Fig. 8). The first way of illumination has the advantage that the lateral play in the position of the lamp is large, as the screen with the double slit easily falls within the cone of light emerging from A. The second way ensures a more even brightness of the pattern along the fringes. There is no difference between the two set-ups in regard to brightness of the pattern, provided they are correctly realized.

The pattern can be observed with the naked eye or with a single lens of low power G (focal length say 10 to 20 cm). This lens has the advantage that stray light, not proceeding from the double slit, can easily be cut off by means of a stop K in the focal plane of G. The lens forms an image of the screen B in or near that focal plane. The hole in K must be slightly smaller than this image. Thus all the light passing along B or coming from aside is prevented from entering the eye of the observer, making it possible to observe in daylight, even against a very bright background (the sky, for instance; cf. FRANK and VAN HEEL [1951]). Fig. 9 is the photograph of a pattern produced by such a double slit.
The width of the single slit depends on the distance $a$ between A and B and on the mutual distance of the two slits at B. Referring to Fig. 10 we have for the difference of the light paths ME and NE the approximate value $pq/2a$. Thus the width $q$ of the single slit should not exceed $a\lambda/2p$ to prevent this difference being greater than $\tfrac{1}{4}\lambda$. This proved good enough in practice. A smaller width is unprofitable as it only diminishes the brightness of the interference patterns without any gain in contrast. With $q/a$ not larger than $\lambda/2p$, the contrast is reasonable. Further it can be assumed from extensive experience that the
mutual distance $h$ of the centre lines of the double slits should be about $3 \times 10^{-4} \cdot 2a$ in order to give patterns of the right sort, while the width of each of these slits ought to be about $0.2h$.

Taking $p$ to be about equal to $h$ we obtain the following working rule with $\lambda = 0.56 \times 10^{-3}$ mm: the width of the single slit must be about 0.5 mm, that of each of the double slits $6 \times 10^{-5}\,l$, and their mutual distance about $3 \times 10^{-4}\,l$. Here $l$ is the distance between A and C; the range of $a$ is between $0.15l$ and $0.85l$. For many purposes the slits can be cut in brass with a milling cutter. In extreme cases, for instance with $l$ small, the above rules do not hold. An observation with $l$ = 5 cm and its application has been described (VAN HEEL [1950]) †.

It is astonishing that this simple arrangement, being nothing else than Young’s experiment of 1807, or rather that of Fresnel (with slits instead of holes) of 1816 (FRESNEL, Œuvres complètes), has not been applied to practical uses earlier than about fifteen years ago. It has the advantage of being very simple to set up and to observe with, and it makes it possible to see the approach of alignment, long before it is reached, by inspecting the interference pattern; this allows rapid corrections in the position of its parts to obtain alignment. The absence of any parts which might produce deleterious aberrations is the more to be appreciated as the results are of astounding precision.

Before discussing the precision that can be obtained by this method, we will describe how alignment based on the same principles can be extended to two dimensions. Up to now only deviations in the plane of the figures have been considered. If, however, one uses a hole instead of the single slit, a series of concentric circular slots (a zone plate) instead of the double slit, and a glass plate with blackened concentric circles instead of the cross wires, deviations in all directions perpendicular to the sighting line can be observed. Diffraction of the light proceeding from the slots produces an interference pattern in each plane perpendicular to the line of sight. The pattern consists of light and dark rings (monochromatic light) or colored rings (with white light). The zone plate must not be of the type of Fresnel’s zone plates (with rings whose radii are proportional to the square roots of whole numbers), as these produce foci, not needed here and even troublesome. Equidistant zones proved to meet every practical case. The plates must be carefully made; see § 12 below. The setting consists of placing the pattern exactly concentric to the black rings of the reticule at C, or the reverse. Precision again is very great.

† See especially § 3. The reader should be warned that in this article, on account of a misunderstanding, the sign for seconds of arc has been mistaken for inch, thus giving the wrong impression that precision was low instead of very high.
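The working rule for the double slit given earlier in this section can be sketched as a small calculation. The Python function below is a hedged illustration with hypothetical names: it assumes $h = 3 \times 10^{-4} \cdot 2a$, a width of $0.2h$ for each of the double slits, $p = h$, and the limit $q = a\lambda/2p$ for the single slit, with $\lambda = 0.56 \times 10^{-3}$ mm.

```python
def double_slit_design(a_mm, wavelength_mm=0.56e-3):
    """Slit dimensions (mm) for a distance a_mm between single and double slit."""
    h = 3e-4 * 2.0 * a_mm                 # mutual distance of the centre lines
    slit_width = 0.2 * h                  # width of each of the double slits
    p = h                                 # the text takes p about equal to h
    q = a_mm * wavelength_mm / (2.0 * p)  # largest useful width of the single slit
    path_difference = p * q / (2.0 * a_mm)
    return {"h": h, "slit_width": slit_width, "q": q,
            "path_difference": path_difference}


# Double slit midway in a 1 m line (a = 500 mm, an assumed example).
design = double_slit_design(500.0)
for name, value in design.items():
    print(f"{name:16s} {value:.4g} mm")
```

With these assumptions the width of the single slit comes out close to 0.5 mm whatever the value of $a$, because $h$ is taken proportional to $a$; the path difference then equals a quarter of the wavelength.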
§ 4. Discussion of the Precision

Let us first discuss the case of double slits. The distance $d$ between the minima in the pattern for one wavelength $\lambda$, observed at a distance, say, $\tfrac{1}{2}l$ between B and C, is $\tfrac{1}{2}l\lambda/h$, where $h$ again represents the distance between the centre lines of the slits. With the assumption that $h$ is appropriately chosen, about $3 \times 10^{-4}\,l$, the fringe distance is of the order of 1 mm. It is difficult to measure the position of the minima or of the maxima with a greater precision than one tenth or one twentieth of $d$. Taking the smaller value and dividing the value of $0.05d$ by $\tfrac{1}{2}l$, we find that the position is known only to about 20 seconds of arc.

By using white light, however, the results are quite different. Provided brightness is high enough and dimensions of the pattern large enough, it is possible, without much preliminary trying, to point at color transitions in the pattern with a precision of well below 1 second of arc without using a magnifying eyepiece. This means that setting is to be trusted to within 1/1200 of the fringe distance. With magnification 5 of the eyepiece, 0.06 seconds of arc is not unusual (VAN HEEL [1950], p. 810, reading again seconds of arc for inches). This extreme precision is only attained when the fringes are vertical or horizontal and, of course, when there is no hindrance from turbulence of the air.
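The orders of magnitude quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes $l$ = 1 m, the double slit midway so that the pattern is observed at $\tfrac{1}{2}l$ = 500 mm, $h = 3 \times 10^{-4}\,l$ = 0.3 mm, and it refers the setting error to the distance between B and C; the function names are hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def fringe_spacing(distance_mm, h_mm, wavelength_mm=0.56e-3):
    """Fringe distance d = lambda * D / h at a distance D behind the double slit."""
    return wavelength_mm * distance_mm / h_mm


def setting_error_arcsec(fraction_of_fringe, h_mm, wavelength_mm=0.56e-3):
    """Angle subtended at the double slit by a given fraction of a fringe.

    (fraction * d) / D reduces to fraction * lambda / h, independent of D.
    """
    return fraction_of_fringe * wavelength_mm / h_mm * ARCSEC_PER_RAD


print(f"fringe distance      {fringe_spacing(500.0, 0.3):.2f} mm")
print(f"1/20 of a fringe     {setting_error_arcsec(0.05, 0.3):.0f} arcsec")
print(f"1/1200 of a fringe   {setting_error_arcsec(1.0 / 1200.0, 0.3):.2f} arcsec")
```

Because $h$ is chosen proportional to $l$, the angular value of a given fraction of a fringe falls off as $1/l$; the figures printed here therefore hold only for the assumed length of 1 m.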
With circular fringes another faculty of the seeing process enters to aid the eye to make the setting correctly, namely the symmetry of the system of fringes. Here too a precision of 0.2 seconds of arc is attainable. By using either of these very simple methods a precision of 0.01 mm or better can be depended on, even when the observer has no specialized training for this work. This holds for distances from, say, 0.5 to 20 or 30 m. For larger values of $l$ the brightness of the interference pattern drops below a level where colors are easily seen. It is even possible to take pictures of the pattern on color film, which can be measured later. Precision is not appreciably impaired (VAN HEEL [1950], p. 810).

For several years the contrivances described here have been in practical use. They are, however, by no means exhaustive as regards the application of interference to alignment purposes. Before discussing other possibilities, we shall make some general remarks about the precision that can be reached in this way. It has been known for a long time that setting with a symmetrical pattern can be performed with a much higher precision than corresponds to the resolving power of the eye (MICHELSON [1927]). This fact must evidently have some connection with the much larger number of retinal elements that play a part when decisions about settings are to be taken. It is not our intention to enter here into discussion of this difficult matter, but there is one point worth mentioning. The remarkable precision of the eye results only when the pattern and the fiducial marks are practically perfect, straight or concentric-circular according to the case in question. As far as we know, such perfection of straightness or concentric-circularity never occurs in nature. By “nature” is meant here the surroundings of men in prehistoric times. This faculty of the eye only made itself felt in the last few centuries of civilisation. Even if we take into account several thousands of years, one wonders what it “did”, or “why” it was incorporated in the visual functions of primitive man, and one asks oneself whether it is a phenomenon in the same class with the colors of fishes living in the darkness of the deep sea. We only wanted to draw attention to the fact that the eye can be used to perform much more accurate settings than needed in “nature”.
§ 5. The Use of Reflecting Spheres and of Spheres with a Concentric Cap

It is sometimes necessary to perform the alignment from one side only, as the farther side may be blocked. We might think of borings of firearms and the like. For these cases an autocollimation alignment device has been developed.

Fig. 11. Alignment in autocollimation
This device consists of two parts, the “auto-collimator” A and the reflector K; see Fig. 11 (VAN HEEL [1960]). The first receives light from a small hole B; this light is made parallel by the object glass C. The light reflected back from K is partially reflected upwards by a semitransparent mirror and reaches the eye of the observer through a low power microscope E. The parts B, D and C must be connected very rigidly. The fiducial mark G too must be firmly attached to this collimator-telescope.

Fig. 12. Autocollimator

This requirement is fulfilled by the construction represented in Fig. 12. The whole of this optical part is made of glass. Spherical aberration is well corrected, chromatic aberrations too. The length RC is 14 cm, the focal length, reduced to air, is 9.2 cm. The diameter of the glass cylinder is 14 mm. With an aperture of 1/6.1 in air the circle of confusion is $10^{-4}$ times the focal length. The diameter of the hole B is about 0.05 mm. The distance HG is smaller by several millimeters than BH. Thus the
fiducial mark G is inside the focus. The observing microscope can be focussed on G. The position of this microscope (magnification about 15) can be changed without touching the collimator; the mutual position of B, G, the reflecting and the refracting surfaces is not influenced by screwing the microscope up or down. In order to prevent irregular temperature distribution within the collimator, it is enclosed in a cylindrical tube of copper and an outer tube of aluminum, with appropriate air spaces and with holes for entrance of light at B and C and for observation of G. It should be mentioned that G, the fiducial mark, is applied on a plane polished surface perpendicular to the plane of Fig. 12. A useful lamp is a glow lamp of 3 watt, as used for medical purposes, mounted so as to minimize its heating effect on the collimator. It is worth-while to mount this optical part on a support which is kept in position by two sets of robust sheet springs. Translations and rotations can be executed without play and controlled by precision screws with divided heads.
Fig. 13. Reflecting sphere
The reflector R consists of a sphere of homogeneous glass, aluminized at the rear. It fulfils the task of showing the lateral displacements of its centre, while it is indifferent to rotations around the centre. It functions in the following way (cf. Fig. 13). When a parallel or nearly parallel pencil of light falls on the sphere from the left, it is refracted by the front surface, reflected at the rear surface and again refracted by the front surface. It then proceeds to the left, emerging from a point F situated at a distance $\tfrac{1}{2}nr/(2 - n)$ to the right of the centre M, where $n$ is the refractive index and $r$ the radius of curvature of the sphere. This, of course, holds only for the central part of the emergent pencil. As the distance of R from the collimator is many times $r$, we need not take into account the spherical aberration of this pencil: the part of the reflected pencil received by the collimator is small.
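The position of F can be verified by a paraxial ray trace through the sphere. The Python sketch below is an illustration, not taken from the original article: it follows a ray parallel to the axis through refraction, reflection and refraction, using the common convention in which the refractive index and the path length change sign after reflection, and compares the result with $\tfrac{1}{2}nr/(2 - n)$. The function name and the sample values of $n$ and $r$ are hypothetical.

```python
def sphere_virtual_source(n, r):
    """Distance of F to the right of the centre M for a rear-silvered sphere.

    Paraxial trace of a ray entering from the left, parallel to the axis.
    theta is the reduced angle (index times slope).  The centre M is at
    z = 0, the front vertex at z = -r, the rear vertex at z = +r.
    """
    y, theta = 1.0, 0.0
    # Refraction at the front surface, air -> glass (curvature +1/r).
    theta -= y * (n - 1.0) / r
    # Transfer across the sphere, thickness 2r in glass.
    y += 2.0 * r * theta / n
    # Reflection at the rear surface (curvature -1/r): index n -> -n.
    theta -= y * (-n - n) * (-1.0 / r)
    # Transfer back to the front surface: thickness -2r in index -n.
    y += (-2.0 * r) * theta / (-n)
    # Refraction at the front surface, glass -> air: index -n -> -1.
    theta -= y * (-1.0 - (-n)) / r
    slope = theta / (-1.0)
    # The emergent ray, prolonged backwards, meets the axis at this z.
    return -r - y / slope


n, r = 1.5, 10.0  # assumed sample values, r in mm
print(sphere_virtual_source(n, r), 0.5 * n * r / (2.0 - n))
```

For $n$ = 1.5 the point F lies at $1.5r$ behind the centre, that is half a radius beyond the rear surface.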
The light emerging from the “light source” F is concentrated by the refracting surfaces of the collimator C to its focal point, well above G. Thus when looking into the microscope focussed on G, one observes a light patch, a circle of diffusion. To the outer surface of C a zone plate is applied. Here it is preferable not to use a metallic one, but to make this surface into a zone plate by evaporating aluminum on it through a (well centred) metallic zone plate. It is even better to evaporate on it zinc sulphide to an optical thickness of a fraction of the wavelength $\lambda$. The surface produces in this way diffracted light. The pencils going from left to right already are “marked” or subdivided in this way. The diffracted light at this instance soon loses its power to mark the light pencil. The light entering C from the right, however, is again diffracted by the zones and forms an interference pattern in the plane of G, consisting of colored concentric circles. When F is on the optical axis of the collimator these rings are concentric with the (well adjusted) blackened ring, the circular fiducial mark G. This is true for all distances of the sphere, in practice from 1 to 5 metres.

Fig. 14. Use of autocollimator and reflecting sphere
By these means the centre, or at least the point F near the centre, of the sphere is put on the optical axis of the collimator. The use for the alignment of borings, and also of straight-edges etc., as illustrated by the diagrams of Fig. 14, is obvious. The sphere is placed at X, at Y or at any appropriate intermediate position. The precision of, say, 0.5 second of arc, easily obtained when the instrument is well made, means 2.5 microns at 1 m and 13 microns at 5 m. Of course, the reflector must be very nearly spherical. A discussion of errors gives as the tolerance of sphericity about $\tfrac{1}{2}\lambda$, that
is, one interference ring in the test glass. Spheres with this precision are more readily made than plane surfaces, especially as the absolute value of the radius of curvature has no significance. We extensively used spheres of 2 to 5 cm diameter. We draw attention to the fact that the orientation of the sphere does not play a role. It can be turned around the centre in any direction without producing any change in the position of the interference pattern (provided there is a reflecting part at the rear surface). The sphere gives indications on the lateral positions, the lateral displacements of the centre, irrespective of its orientation around the centre. Placing and replacing the sphere against cylinders can be safely done. It will be obvious that these reflectors show some of the useful features of the corner cube or triple mirror; they are, however, easier to manufacture. It is unfortunate that this procedure cannot be used for larger distances; the interference pattern for these distances is not bright enough. With the preservation of the useful properties, mentioned in the last paragraph, this defect can be remedied by a simple addition.
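The conversion between an angular precision and the corresponding lateral displacement, used several times in this article, is a small-angle calculation. The following Python sketch, with hypothetical function names, evaluates it for the precision of 0.5 second of arc quoted above and, for comparison, for the eyepiece play of 0.005 mm at a focal length of 25 cm mentioned in § 2.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def lateral_shift(angle_arcsec, distance):
    """Lateral displacement (same unit as distance) for a small angle."""
    return angle_arcsec / ARCSEC_PER_RAD * distance


def angle_arcsec(lateral, distance):
    """Small angle in seconds of arc for a lateral displacement at a distance."""
    return lateral / distance * ARCSEC_PER_RAD


for metres in (1.0, 5.0):
    microns = lateral_shift(0.5, metres) * 1e6
    print(f"0.5 arcsec at {metres:.0f} m: {microns:.1f} microns")

print(f"0.005 mm at 250 mm: {angle_arcsec(0.005, 250.0):.1f} arcsec")
```

The computed displacements are slightly smaller than the rounded figures given in the text, which evidently take 1 second of arc as about 5 microns per metre.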
Fig. 15. Reflecting sphere with cap
Instead of a simple sphere the reflector can be a sphere with a cap, a peel (of thickness d), made from the same glass, cemented to the sphere and reflecting at its outer surface; see Fig. 15. By appropriately choosing d/r, the point F can be placed at any distance from the centre. The farther F is, the less divergent is the reflected pencil and the more light enters the collimator. Very large distances must be avoided, as the alignment then corresponds to a point (F) too far from the plane of contact with the cylindrical ring; see Fig. 16. The distance MF = a is given by the formula

$$a = \frac{n\,r\,(r+d)}{(2-n)\,r-(n-1)\,d}.$$

Fig. 16. Use of reflecting sphere with cap
It is even possible to make a set of two spheres plus cap where a is negative. By combining observations with two reflectors with values of a of equal absolute magnitude and opposite sign, alignment can be performed for M. When d = r(2 − n)/(n − 1), F lies at infinity and the position of the sphere is no longer indicated: it is useless for alignment.
Fig. 17 gives a view of the autocollimator aligner, Fig. 18 some of the spheres and spheres with cap, and Fig. 19 shows a pattern obtained with the arrangement containing a reflector of borosilicate crown glass, r = 12.5 mm, d = 9 mm, at a distance of 4 m. This apparatus proved useful to prolong a line given by two points (two positions of the reflector) at distances of 1 m and 1.5 m to 8 m. Precision was below 1 second of arc, that is below 0.05 mm. Four points at distances of 1, 1.5, 2.5 and 8 m were set with this precision within one hour. Observation took place in full daylight. In order to attain this precision decentering of sphere and cap must be within a few microns. A useful feature is the fact that the microscope can be raised during the preliminary alignment, until the reflectors themselves are seen sharp. This speeds up the first trials considerably.
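The dependence of the distance a = MF on the cap thickness can be explored numerically. The sketch below is not taken from the original text. It assumes paraxial refraction at the front surface of the sphere (radius r, index n) and requires the refracted pencil to come to a focus on the reflecting outer surface of the cap, at a distance 2r + d from the front vertex. This assumption reproduces the condition d = r(2 − n)/(n − 1) for F at infinity. The function names, the value n = 1.517 for the borosilicate crown, and the convention that a is counted positive from M towards the reflecting surface are likewise our assumptions.

```python
import math


def focus_distance(r: float, d: float, n: float) -> float:
    """Paraxial distance a = MF for a sphere of radius r with a cap of thickness d.

    Assumption: the pencil aimed at F is focused by the front surface
    onto the reflecting outer surface of the cap.
    """
    denom = (2.0 - n) * r - (n - 1.0) * d
    if denom == 0.0:
        return math.inf  # F lies at infinity
    return n * r * (r + d) / denom


def cap_thickness_for_infinity(r: float, n: float) -> float:
    """Cap thickness for which F recedes to infinity, d = r(2 - n)/(n - 1)."""
    return r * (2.0 - n) / (n - 1.0)


# Reflector quoted in the text: r = 12.5 mm, d = 9 mm; n = 1.517 is assumed.
a = focus_distance(12.5, 9.0, 1.517)
print(f"a = {a:.0f} mm")
print(f"d for F at infinity = {cap_thickness_for_infinity(12.5, 1.517):.2f} mm")
```

Under these assumptions the reflector of Fig. 19 has F at roughly 0.3 m from the centre, and a cap thicker than about 11.7 mm would make a negative.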
§ 6. Sphere without Reflection, Producing a Luminous “Line”
We can supplement the above with the description of a device in certain respects much more promising, that has not been discussed in the literature. It consists of a sphere of very homogeneous glass receiving light from a point nearby and producing a pencil of light, subdivided into light and dark zones by interference, without any addition of an artificially made zone plate. Here nature itself produces the pattern of concentric circles, so profitable for alignment.
Fig. 17. Autocollimator for alignment
Fig. 18. Spheres with and without cap
Fig. 19. Pattern obtained by autocollimator with a sphere with cap, placed at 4 m distance, photographed in full daylight
In Fig. 20 the sphere with centre M, radius r and refractive index n receives light (to begin with monochromatic light of wavelength λ) from the point P. This is the image of a small hole at several dm distance from the microscope object glass B (numerical aperture preferably 0.5 or greater) of good quality. The cone of light proceeding from P is refracted by the front and by the rear surface of the sphere. The distance PM is chosen so that one cone of rays is refracted into a cylinder of emergent light, of which DE and GH are describing lines. (This can always be done by taking P within the caustic surface of the sphere, acting as a lens with light incident from the right.)
Fig. 20. Sphere producing “light ray”
In order to describe the course of the rays after passing the sphere, consider one of the wavefronts. This has the form indicated by Σ in Fig. 20, where the deviations from flatness are exaggerated. The wavefront is sketched again in Fig. 21. It is a surface of revolution around IK, the prolongation of PM. At L and L’ there are points of inflection; the rays through these points diverge from IK (in practical cases by an angle not more than 0.003 radian). Rays through points of Σ farther from the axis than L have a slope diminishing to zero and even to negative values. Rays through points nearer to the axis have also a smaller slope than those through L. Fig. 22 gives a schematic drawing of these rays.
Fig. 21. Form of a wavefront emerging from a sphere
Fig. 22. Normals to wavefront emerging from a sphere
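The behaviour of the slopes described above (zero on the axis, largest at the point of inflection L, then falling through zero to negative values) can be reproduced with an elementary meridional ray trace. The sketch below is our own illustration, not the author's calculation. It uses the law of sines in the triangle formed by P, M and the point of incidence, and the fact that a ray crossing a sphere is deviated twice by the same amount i − i′. The ratio PM/r = 1.4154 is a hypothetical value chosen only so that the largest divergence is close to the 0.003 radian mentioned in the text. The index 1.519 is that of the glass described in this section.

```python
import math


def emergent_slope(u: float, k: float, n: float) -> float:
    """Angle (radians) between the emergent ray and the axis.

    u : angle between the axis and the ray leaving the point source P
    k : PM / r, distance of P from the centre M in units of the radius
    n : refractive index of the sphere
    Positive values mean divergence from the axis, negative values convergence.
    """
    sin_i = k * math.sin(u)          # law of sines in the triangle P, M, point of incidence
    if abs(sin_i) >= 1.0:
        raise ValueError("ray misses the sphere")
    i = math.asin(sin_i)             # angle of incidence (equal to the exit angle by symmetry)
    i_prime = math.asin(sin_i / n)   # angle of refraction inside the glass
    return u - 2.0 * (i - i_prime)   # two equal deviations towards the axis


def analyse(k: float, n: float, u_max: float = 0.5, steps: int = 5000):
    """Return (u at largest slope, largest slope, u of the ray emerging parallel to the axis)."""
    us = [u_max * j / steps for j in range(1, steps + 1)]
    slopes = [emergent_slope(u, k, n) for u in us]
    j_max = max(range(steps), key=slopes.__getitem__)
    # First sample beyond the maximum at which the slope has become negative.
    j_neg = next(j for j in range(j_max, steps) if slopes[j] < 0.0)
    lo, hi = us[j_neg - 1], us[j_neg]
    for _ in range(60):              # bisection on the sign change
        mid = 0.5 * (lo + hi)
        if emergent_slope(mid, k, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return us[j_max], slopes[j_max], 0.5 * (lo + hi)


u_L, slope_L, u_parallel = analyse(k=1.4154, n=1.519)
print(f"largest divergence {slope_L:.4f} rad for u = {u_L:.3f} rad")
print(f"emergent ray parallel to the axis for u = {u_parallel:.3f} rad")
```

In this simple model P lies slightly inside the paraxial focal distance nr/2(n − 1) from the centre, which corresponds to the requirement that P be taken within the caustic.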
The optical paths from the points of Σ to a point such as Q differ from each other and we can expect a pattern of light and dark parts (in monochromatic light) or with colors (in white light). On account of symmetry around the axis, the light point N on the axis will be surrounded by dark and light (or colored) rings, and this holds for any distance of N from the sphere (provided it is large enough to be reached by normals from Σ). Thus the spherical aberration of the sphere, instead of being a disadvantage - as aberrations are, as a rule - performs the useful duty of forming neat patterns of concentric circles. These are of utmost utility to define the line IK, that is the prolongation of PM.
It proved possible to produce in this way a luminous line of perfect straightness, thanks to the information procured by the circular interference pattern in any point of the line. Precision is here a few tenths of a second of arc. The line was tested from 1 to 80 metres. At such distances illumination is still quite adequate and it seems possible to observe, even in daylight, at much larger distances. Here again atmospheric refraction sets the limit to precision.
It must be understood, however, that the sphere ought to be well made. Half an interference ring on the test glass seems to be good enough and is readily obtainable by the usual polishing means. Much more difficulty was met in finding a piece of optical glass of sufficient homogeneity. Thanks to the courtesy of the optical firm “Old Delft”, we obtained pieces from which spheres up to 50 mm diameter were made. The refractive index 1.519 190 ± 0.000 002 was constant to within a few units of the sixth decimal. Experiments with a concave spherical mirror instead of a sphere have been started.
A suggestion to make use of systems with aberration was made by my collaborator A. Walther. It soon appeared that the complete sphere is the suitable system. A more detailed account is given in a publication by WALTHER [1959].
The distance of the light source P to M is within wide limits immaterial. For precision it is of utmost importance that no change of the position of these two points occurs during the measurements. Change of temperature during the (usually very rapid) observations must be obviated at all costs. Even touching the sphere with the hands is deleterious; it is advisable to wait an hour after such an awkward action is done before starting observations. Locking up sphere and microscope object glass in a sphere of copper with appropriate holes is indicated.
Two additional remarks must be made. In the first place it is to be noted that white light can be used too. As the point on the axis is a maximum, irrespective of the wavelength, it will be a maximum also in white light and for every distance from the sphere. The concentric rings, however, are lacking for obvious reasons. Thus only the track given by these light peaks indicates the course of the “light ray”. A precision of 1 to 2 seconds of arc can be obtained even in this way. In monochromatic light a few tenths of a second is readily attained. Fig. 23 shows some photographs with a sphere of the said glass, with a diameter of 48 mm, in monochromatic and in white light, at distances of a) 1 m, b) 9 m and c) 81 m.
Fig. 23a. Patterns produced by solid sphere without reflection: a1) monochromatic light, 1 m; b1) monochromatic light, 9 m; c1) monochromatic light, 81 m
Fig. 23b. Patterns produced by solid sphere without reflection: a2) white light, 1 m; b2) white light, 9 m; c2) white light, 81 m
In the second place we wish to draw attention to the fact that the testing of large flat surfaces might be speeded up by such a sphere provided with two microscope objectives.
The two lines of alignment intersect in one point, the centre of the sphere, and thus define a plane. For other and perhaps more practical means we refer to § 9 below.
§ 7. Single Lens as Alignment Collimator
Recently an ingenious device has been described by STEEL [1960]. He used a point source throwing light on a lens, which transmits light after two refractions, but also after one refraction, two reflections and one refraction. With an appropriate form of the lens, a point in the object space in or near to the optical axis gives two emergent rays as shown in Fig. 24. The only adjustment to be made is the placing of the light source on the optical axis of the lens. This is effectuated by observation of the reflected images on the two lens surfaces. A precision of about 2 seconds of arc and a range of a few metres are attained. As compared to the sphere the adjustment of the point source seems to be an inconvenience. It might be preferable to perform this experiment with monochromatic light, as here too a circular interference pattern may help to improve the precision. We refrained from this, as the sphere completely solved the problem; the range is much larger with the sphere, and the adjustment of the light source is not necessary. Still it is noteworthy that with the method of Steel the lens too does not form an image and spherical aberration, both for the direct and for the twice reflected rays, is responsible for its functioning.
Fig. 24. Alignment with single lens
§ 8. The Use of the Rainbow
Before discussing the testing of large flat surfaces, we have to mention a phenomenon that can be readily adapted for that purpose. As is well known (DESCARTES [1637], NEWTON, Opticks, BOYER [1959]) spheres with a refractive index higher than that of the surrounding medium give rise to rainbows. We will not enter any further into this matter than is needed for the practical applications described below: a rainbow formed after two internal reflections, as illustrated in Fig. 25. A parallel pencil of light entering the sphere of index n from the left gives rise to emerging rays a, b and c. It is easily proved that the deviation δ has a minimum value, which is in the neighbourhood of 270° for spheres with a refractive index of about 1.52. Symmetry around the line AM, parallel to the incident pencil through the centre M of the sphere, accounts for the fact that a cone is formed with AM as axis and of which b, the limiting ray in the figure, is one of the generating lines. With the use of white light, dispersion produces a series of cones, one for each wavelength, of different vertex half angle. Thus a “rainbow”, lying in a plane perpendicular to the incident pencil, is produced.
Fig. 25. Border line b in light emerging from sphere after two reflections
One might think for a moment that it might be possible to choose refractive index and wavelength in such a way that the minimum value of δ is exactly 270°, thus producing a flat surface. This index should be 1.517 992. This, however, is not feasible and, moreover, not necessary. It is again nature, helping us with an interference phenomenon, that solves the problem of producing a flat surface in a satisfactory way. When monochromatic light is used, it appears that the borderline between the dark region and the region where light emerges is not sharp, but provided with a host of fine “diffraction” fringes of high contrast. The theory of this phenomenon has been developed first by AIRY [1838] and STOKES [1850] and is described in the well-known treatise by MASCART [1889] and other books. In Fig. 26 the phenomenon is illustrated by a photograph and by the corresponding graph of the intensity. The dotted line indicates the position of the border line in the absence of interference (the “geometrical” border line).
An explanation is obtained by comparing Fig. 27, where part of the emergent wavefront Σ is drawn, with Figs. 21 and 22. The wavefront again has the S-form; this is, of course, closely connected with the fact that there is a limiting direction of the rays.
Fig. 27. Form of wavefront emerging from sphere after two reflections
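The statement that the minimum deviation equals 270° for an index of 1.517 992 can be verified numerically from the geometrical deviation after two internal reflections, δ = 360° + 2i − 6i′, together with the law of refraction sin i = n sin i′. The sketch below is ours. Setting dδ/di = 0 gives cos² i = (n² − 1)/8, which is the only step not written out in the text. The function names and the search interval are assumptions.

```python
import math


def min_deviation_deg(n: float) -> float:
    """Minimum geometrical deviation (degrees) after two internal reflections."""
    # d(delta)/di = 0 together with sin i = n sin i' gives cos^2 i = (n^2 - 1) / 8.
    i = math.acos(math.sqrt((n * n - 1.0) / 8.0))
    i_prime = math.asin(math.sin(i) / n)
    return 360.0 + math.degrees(2.0 * i - 6.0 * i_prime)


def index_for_deviation(target_deg: float, lo: float = 1.3, hi: float = 1.7) -> float:
    """Index giving the requested minimum deviation, found by bisection.

    The minimum deviation increases with n over the interval searched.
    """
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if min_deviation_deg(mid) < target_deg:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_flat = index_for_deviation(270.0)
print(f"index for a minimum deviation of 270 degrees: {n_flat:.6f}")
```

The bisection is chosen only for transparency; any root finder would serve equally well.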
Theory indicates that the “geometrical” limiting ray (as indicated by the broken line in Fig. 26) is determined by the formula for the “geometrical” deviation δ:

$$\delta = 360^\circ + 2i - 6i',$$

where i and i′ are the angles of incidence and of refraction respectively. It is independent of the dimensions of the sphere. The positions of the minima and maxima of the interference pattern have been calculated by Airy and others (see, for instance, the table in Mascart, l.c., p. 400). The pattern has the same form for all cases and its (angular) scale is determined by the parameter z:

$$z = 2\delta\left(\frac{6r^{2}}{h\lambda^{2}}\right)^{1/3},$$
where r is the radius of the sphere, δ the angular distance from the geometrical border, and

$$h = \frac{64}{9\,(n^{2}-1)}\sqrt{\frac{9-n^{2}}{n^{2}-1}}.$$
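To give an idea of the angular scale of the fringes, the parameter z can be evaluated for a sphere of given radius. The following sketch is ours and rests on an assumption: for h it uses the general expression of Airy's theory, h = (p² − 1)²/[p²(n² − 1)] · [(p² − n²)/(n² − 1)]^½, with p − 1 the number of internal reflections, so that p = 3 in the present case. The radius of 20 mm and the index 1.518 are illustrative values within the ranges mentioned in this section.

```python
import math


def h_factor(n: float, p: int = 3) -> float:
    """Coefficient h of Airy's theory for p - 1 internal reflections (assumed general form)."""
    n2_minus_1 = n * n - 1.0
    return ((p * p - 1.0) ** 2 / (p * p * n2_minus_1)) * math.sqrt((p * p - n * n) / n2_minus_1)


def z_parameter(delta_rad: float, r: float, wavelength: float, n: float) -> float:
    """z = 2 delta (6 r^2 / (h lambda^2))^(1/3); r and wavelength in the same unit."""
    return 2.0 * delta_rad * (6.0 * r * r / (h_factor(n) * wavelength ** 2)) ** (1.0 / 3.0)


def delta_for_z(z: float, r: float, wavelength: float, n: float) -> float:
    """Angular distance (radians) from the geometrical border for a given z."""
    return z / z_parameter(1.0, r, wavelength, n)


# Illustrative values: sphere radius 20 mm, green mercury line, lengths in mm.
r_mm, wavelength_mm, n_glass = 20.0, 0.546074e-3, 1.518
delta_unit = delta_for_z(1.0, r_mm, wavelength_mm, n_glass)
print(f"z = 1 corresponds to {math.degrees(delta_unit) * 3600.0:.0f} seconds of arc")
```

Since z varies as r^(2/3), larger spheres give a finer pattern.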
Once the refractive index and the diameter of the sphere are known, we can calculate the position of the geometrical border with respect to the minima and maxima. On the other hand a sphere of appropriate refractive index, a little smaller than 1.517 992 for the green mercury line (λ = 0.546 074 × 10⁻³ mm), can be chosen of such a diameter that a convenient minimum, say the third, has a deviation of exactly 270°. The author and his collaborators have performed this experiment and it appears that with glass spheres of diameter between 20 and 50 mm, aluminized on the parts as indicated in Fig. 27, sufficient light is present to measure the pattern at distances up to 10 m with a precision of 1 to 2 seconds of arc. It should be noted that the position of the plane is not changed when the sphere is turned around its centre, or when it undergoes translations in that plane. For other useful applications of the “rainbow” we refer the reader to the thesis of Walther.
Fig. 26. Interference pattern produced by sphere after two reflections
Fig. 29. Pattern produced by zone plate: a) stationary, b) rotating
§ 9. Alignment of Surfaces
Closely related to alignment is the subject of determining the deviations of a large surface from flatness. Instead of using an artificial word like “plano-testing” we will call this “alignment” of a surface. Amongst the practical problems of this type we encounter in the first place the determination of the departure from a strict plane of the surface of a large fraise, or of the two sides of the bed of a turning lathe. The methods which will be described are in principle also applicable to smaller and larger surfaces, provided, of course, turbulence of the atmosphere plays an insignificant role.
Fig. 28. Alignment of two intersecting lines
The first approach seems to be alignment of lines that intersect one another. One might think of a zone plate Z (Fig. 28) traversed by light from different directions. With the skew pencils the diffraction pattern is broadened in the plane of the figure. For pencils with large inclination the pattern is not satisfactory for precise settings. However, there is a simple solution: W. de Bruin, working in the author’s laboratory in 1955, suggested that the plate be rotated around an axis perpendicular to the plane of Fig. 28. Deficient centring of the centre of the zones on the axis of rotation can be corrected by a mounting with appropriate adjustment. With frequency of rotation of more than 20 per second there is practically no change in the pattern; see Fig. 29, where a) shows the pattern with stationary zone plate, used square to the alignment line, and b) with rotating zone plate.
The explanation of this at first sight somewhat startling phenomenon is elucidated in Fig. 30, which shows a cross section of the plate by the plane of Fig. 28. The zone plate, consisting of metallic concentric rings, has a certain thickness. As soon as the angle of inclination of the normal to its surface has attained a certain value, the plate blocks out all the rays. Before this position is reached the pattern broadens out, but this broadening is not large as the lateral dimensions increase proportionally to 1/cos α.
Fig. 30. Rotating zone plate
A second possibility which secures two intersecting straight lines has been indicated above with the complete, non-reflecting sphere (see end of § 6). These methods can now be supplemented by the “rainbow” set-up described in § 8, which can be realized in the following way: At C (Fig. 31), a sphere as described is mounted on a collimator, which receives light from a small hole at A, illuminated by a mercury lamp (with green filter) B. The “light plane” DE can be observed from any point such as G, where crosswires can be set on that maximum or minimum of the pattern for which DE is really a plane. Now the mounting of G, with the eyepiece, can be lowered or raised until the setting is satisfactory. The height above the point H of the surface to be tested is read on a dial in 0.01 mm. With precision of about 2 seconds of arc, deviations are determined to within 0.1 mm up to 10 m from the sphere, and surfaces of 20 m square can be tested rapidly and conveniently in this way.
Fig. 31. “Alignment” of a plane surface
The collimator must be sturdy. A practical form is illustrated in Fig. 32. It is all-glass; rays proceeding from A are rendered parallel by reflection at P, a second reflection at Q and refraction at R. The central part of the emergent pencil is not used. Deviations from parallelism are not more than 0.6 seconds of arc. The diameter is 6 cm, the length 12 cm, the focal length 24 cm. Spheres with a diameter between 30 and 55 mm can be used; their refractive index should be about 1.516.
Fig. 32. Collimator for “alignment” of a plane surface
§ 10. The Axicon
MCLEOD [1954] described quite a different device for alignment purposes. In its simplest form it consists of a glass cone, which produces a luminous line, when used as in one of the diagrams of Fig. 33, with a point source on the axis. With well-made cones a diffraction (interference) pattern can be observed along a considerable portion of the axis in the “image” space. A striking appearance, McLeod mentions, is given by a cone of 6 inches in diameter with a white point source of 0.003 inches in diameter, placed at 100 feet distance from it on the axis. The pattern could be observed at 100 feet at the other side from it and a cross-hair could be located with an accuracy of about 0.001 inch, that is of about 0.1 of a second of arc, while the pattern is of satisfactory luminosity.
Fig. 33. Axicons
This method is a worthy addition to the arsenal of alignment auxiliaries. We have not tried out the method, as we do not have the means to produce cones with the required precision. Further, the cited precision of pointing is no guarantee that the line through the centres of the pattern is straight. This ought to be tested in each case, because slight deviations from homogeneity and perfect form of the cone give erratic and unpredictable deviations from straightness. With the sphere, testing of homogeneity and sphericity is easily carried out by turning the sphere around its centre; the sphericity, moreover, can be tested by a test glass. We therefore preferred the sphere, the near-perfection of which is much more easily achieved.
§ 11. Additional Examples
That it is possible to make very accurate alignments with extremely simple means may be illustrated by the following practical examples:
a) The bridge near Hedel on the river Meuse in Holland (span width 124 m), damaged by bombardment in 1940, repaired, and again damaged in 1945, had to be repaired immediately. Because material was scarce then, it was necessary to use the available parts and supplement these with some new pieces. In order to speed up the work, reconstruction started at the same time on the river (for the building of the floor) and at the construction shop at Delft (for the arches and suspending bars). The so-called berth pieces (at T in Fig. 34) were needed at the river and at the same time in Delft for the location of the point of intersection of the arch and the floor construction. Therefore the so-called “theoretical point of intersection”, indicated by a small hole, was localized with respect to the adjacent part C of the arch by means of a single and a double slit, welded on part C at a mutual distance of 12 m and in such a way that the theoretical point was lying on one of the critical color transitions in the diffraction pattern. The distance of this point from the double slit was 16 m, and was measured to within a few tenths of a mm. The piece T was then taken away, and an iron slab was substituted and the theoretical point reconstructed on it. The piece T could then be sent to the river and set up there, while the formation of the arc was started from the reconstructed theoretical point. In this way the repair had been speeded up by several months. The accuracy proved to be amply sufficient.
Fig. 34. Alignment as a help in construction of a bridge
b) Another instance where the double slit proved useful was the measurement of the amplitude of the oldest church-tower in the Hague, of which the vertical fissures were to be repaired. These fissures showed a tendency of widening and the pealing of the two heavy bells was interrupted. It was desirable to know the amplitude of the tower during that operation. To measure this, a single slit was mounted in the tower at a height of 53 m, the emerging light was rendered horizontal by means of a pentaprism and was intercepted by a double slit; see Fig. 35. The pattern was observed at a distance of 50 m. With the two bells, weighing 3500 and 7500 kg, in phase the amplitude was 1.7 mm. When the pealing was done with a phase difference of π, the amplitude was 0.3 mm, of the same order of magnitude as the irregular disturbances arising from the wind pressure on the tower. The apparatus was set up in one afternoon, the measurements themselves took less than two hours. The amplitudes might have been measured by means of a telescope from the ground, preferably with a point source in the tower and a zone plate before the object glass. Though the accuracy of pointing might have been 0.3 seconds of arc, vibrations of soil and trembling of the telescope would have prevented the achievement of this precision, necessary to ascertain the position of the point source to 0.1 mm.
Fig. 35. Measurement of small oscillations of a tower
c) The alignment of the bearings of ship’s shafts can be done, and is in some places currently done, by means of a point source, a zone plate and a circular reticule, along the lines described in § 3. (The method employing the complete sphere as indicated in § 6 has been available for only a few months.) Precision is quite satisfactory with the zone plate. A great advantage, especially in comparison to the sighting with three holes, is, of course, the fact that the observer always sees something and can indicate in what direction and to what extent correction of the position of the parts has to be made.
d) By making use of the concepts of projective geometry it is possible to solve some special problems by means of alignment without recourse to metrical measurements. An example is illustrated by Fig. 36: it refers to the prolongation of the straight line AB when it is blocked by a wall W. One takes an additional point P not lying on AB and an additional point C on AB. Next a judicious, but otherwise arbitrary point D on PC is chosen. The point of intersection E of BD and AP and the point of intersection F of AD and BP are then found. The procedure is then repeated with another point P’ outside AB. The intersection of EF and E’F’ is a point G on the prolongation of the line AB. To find a second point, another point on AB must be chosen. The point P’ need not lie in the plane ABP. As all the described operations can be carried out by one of the alignment devices already described, there is no essential difficulty to solve the problem in practice. It depends on the available lateral space as to the precision with which the prolonged line is in line with AB.
Fig. 36. Prolongation of a line beyond an obstacle
e) A problem of special difficulty is that of finding the orientation with reference to the meridian line, the azimuth, of a line in a mining gallery. Though the required precision is somewhat less than 1 minute, magnetic compasses are of no avail on account of the possible presence of layers of magnetic minerals. MOONEN [1955] solved the problem by using a zone plate; see Fig. 37. Two point sources C and D throw light upwards through the zone plate Z. At ground level the points A and B are found on the lines DZ and CZ, care having been taken that A and B are on the same level; C and D too are on a horizontal line. CD then is prolonged into the gallery and the azimuth of AB is determined at ground level. It proved quite possible to determine the azimuth of CD with a precision of below one minute of arc in a mine of 500 m depth, of which the shaft had a width of 5 m. Comparison, in a certain case, with measurements made in the usual geodetical way confirmed the precision following from the analysis of internal errors, that is well below one minute of arc. For details we refer the reader to the cited article of Moonen.
Fig. 37. Alignment in mine shaft
There is an important advantage of the alignment method, which we want to stress. In deep mines a gale blows upwards owing to ventilation. For geodetic measurements moments are awaited when ventilation is switched off, perhaps for one day in several weeks. Otherwise the telescopes would be shaking too much to allow pointing with any accuracy. The prolongation of the optical axis of the telescope can only be done securely when the instrument is in a fixed position, without any wavering. With the zone plate, however, the deviations of the points from a straight line are measured directly, not derived from a determination of angles, as is the essential way of measuring with telescopes. Therefore the measurements with the zone plate were done with the ventilation on, with the mine in working conditions. The trepidations of the fiducial marks, of the light sources and of the zone plate are not more than a few tenths of a mm and this is good enough for the required precision. One can go even further and presume that the light travels in the turbulent, ventilated air of the shaft along lines that are more straight than when the ventilation is off. In the latter case the air settles in layers with temperatures decreasing from bottom to top; a possible effect is that deleterious atmospheric refraction arises of which the influence is not easy to correct.
f) Another practical case might be mentioned: the placing of sixty shafts in a large machine of 56 m length parallel to one another, with tolerance 0.1 mm and the shaft length 2 m. With zone plates the work could be done in about one week.
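The projective construction of example d) can be followed numerically. The sketch below is ours and makes simplifying assumptions: all points are taken in one plane (the text notes that P’ need not lie in the plane ABP), coordinates are exact rational numbers, and the particular positions of A, B, C, P, P’ and D are arbitrary illustrative choices. Points and lines are represented by homogeneous coordinates, so that both the line through two points and the intersection of two lines are given by a cross product.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two homogeneous 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def point(x, y):
    """Homogeneous coordinates of the point (x, y)."""
    return (Fraction(x), Fraction(y), Fraction(1))


def line(p, q):
    """Line through two points."""
    return cross(p, q)


def meet(l, m):
    """Point of intersection of two lines, normalised to third coordinate 1."""
    x, y, w = cross(l, m)
    if w == 0:
        raise ValueError("lines are parallel")
    return (x / w, y / w, Fraction(1))


def between(p, q, t):
    """Point p + t (q - p) on the line pq."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]), Fraction(1))


def auxiliary_line(A, B, C, P, t):
    """Line EF obtained from the auxiliary point P and a point D on PC."""
    D = between(P, C, t)
    E = meet(line(B, D), line(A, P))
    F = meet(line(A, D), line(B, P))
    return line(E, F)


A, B = point(0, 0), point(1, 0)
C = between(A, B, Fraction(3, 4))            # additional point on AB
EF = auxiliary_line(A, B, C, point(0, 1), Fraction(1, 2))
EF_prime = auxiliary_line(A, B, C, point(2, 3), Fraction(1, 3))
G = meet(EF, EF_prime)
print("G =", G[0], G[1])
```

In this example G falls at (3/2, 0), on the line AB beyond B. G is the harmonic conjugate of C with respect to A and B, which is why its position depends neither on P nor on D.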
§ 12. Some Technical Remarks on the Manufacture of Zone Plates
For relative displacements zone plates on glass can be used. It often does not matter if the line through point source, centre of zone plate and fiducial mark is straight, as only the change of the form of this line with time or temperature or any other variable is important, not the “absolute” straightness. In that case the plates can be scratched in aluminized glass on a good lathe. The glass carrier must be of good homogeneous optical glass, but the parallelism of its well made plane surfaces need not be approximated closer than several minutes of arc. For an application in the field of soil mechanics, where a hundred points had to be measured, we used pieces of glass plate on which a pattern had been ground with emery by means of a stamp of the form of Fig. 38, held in place by a hollow cylinder.
Fig. 38. Manufacture of a zone plate by grinding
For the determination of the absolute straightness of lines, a metal zone plate is needed. This can be made as follows. A bronze plate is cemented on a bar and clamped in the head of a precision lathe. The grooves are cut carefully until the material of the supporting bar is reached. Three cross wires making an angle of 120° with one another are very cautiously soldered on the rings. Then the rings, joined in this way, are disconnected from the supporting material by dissolving the cement.
In summing up we can conclude that rapid, relatively simple optical methods are available to perform alignments with utmost precision, as well as to test the straightness of lines and the flatness of surfaces. The described methods can be adapted to special problems. We may add that in industrial as well as in laboratory practice we prefer the use of glass spheres and the autocollimator.
Supplementary Note, added July 1960.
Referring to § 6 it seems worth while to add a short note concerning another device which has yielded remarkable results since this article was written. The sphere of homogeneous glass with a cap and one or two small holes (diameter 2 microns) on the outer surface of the cap proved very practical and useful, one hole for pure alignment of a line, two holes for the testing of flat surfaces. Now the fact that the glass must be very homogeneous in order to attain a precision of a few tenths of a second of arc suggested another form, in which the light is reflected instead of refracted. The spherical aberration of a spherical mirror enables us to diminish greatly the dependence on material. A mirror always causes trouble because the reversal of the direction in which the light travels gives rise to partial interceptions of the pencils and makes it necessary to add a plane mirror. Both the spherical concave mirror and the plane mirror can be produced with high precision in the conventional way, provided the glass is of good quality and thick enough. We chose the well-known borosilicate crown with n_d = 1.517, ν = 64. The set-up is illustrated in Fig. 39.
White light from the small hole A is transmitted by the cylindrical boring of the thick glass block B. At C either a microscope object glass or a single glass sphere is mounted. We will first describe the experiment with the microscope object glass. This forms a reduced image
Fig. 39. Alignment with a concave mirror and a plant: mirror
of A, magnification being 1/90. Thus a very small light source of a few microns diameter is produced in the neighbourhood of the focus of the spherical concave mirror N. The light point is slightly nearer t o N than the focus. Aperture of C is large enough to cover the whole of N. The light reflected by N is again reflected at the front surface V of B. This surface is made as plane as possible and N is made as close to spherical as possible. The thickness of B is 5 cm, of N about 3.5 cm. Departures from ideal form are below 0.05 part of a wavelength. Radius of curvature of N is 20 cm, its diameter is 12 cm. The shape of the wave front emerging from V to the right is as indicated in Figs. 21 and 22. With an appropriate distance of C from N the outer zone of N gives an “image” at P at a distance of about 4 m , while the smallest zone around the boring (of diameter about 26 mm) gives an “image” at Q at 40 m distance. To each point between P and Q corresponds one zone of mirror
N. The zones with larger or smaller diameter are out of phase. They give stray light. It is as if the appropriate zone is an annular aperture, giving the well-known diffraction image of rings, on which is superimposed a general illumination of stray light. At all points between P and Q the diffraction pattern was clear enough to allow precise settings, i.e. to 0.02 mm in Q and to a few microns at P. Illumination is strong enough (the white lamp at D being a Sylvania concentrated arc 100-watt lamp) to allow measurements in full daylight. The maximum deviation from straightness of the line joining the centres of the diffraction patterns can be predicted from the known precision with which the parts have been made. For the whole range this amounts to somewhat more than 0.1 of a second of arc. Curiosity led us to test this in an independent way. At Q we used a fiducial mark consisting of concentric black rings, at P a screen with a circular aperture which could be centred with respect to the diffraction pattern by means of a small black spot in the centre of the aperture. At R between P and Q a third setting was made with a screen with an aperture of special form. This aperture could not be mounted on a glass substrate as that would be a source of error. We chose a pattern of steel balls of the same diameter producing an aperture of the form of Fig. 40.
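The setting precisions quoted above can be turned into angles for comparison with the predicted straightness. The following is a minimal sketch under stated assumptions: the distances of 4 m and 40 m are used as the lever arms, and "a few microns" at P is taken as 3 microns purely for illustration. The function name is hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi   # about 206265 seconds of arc per radian

def setting_to_arcsec(lateral_m, distance_m):
    """Angle, in seconds of arc, subtended by a lateral setting error at a distance."""
    return math.atan2(lateral_m, distance_m) * ARCSEC_PER_RAD

# 0.02 mm at Q, taken as 40 m away.
at_q = setting_to_arcsec(0.02e-3, 40.0)
# "A few microns" at P, taken as 4 m away; the value of 3 microns is an assumption.
at_p = setting_to_arcsec(3.0e-6, 4.0)

print(f"setting at Q: {at_q:.3f} seconds of arc")
print(f"setting at P: {at_p:.3f} seconds of arc")
```

Both settings correspond to between 0.1 and 0.2 of a second of arc, that is, to the same order as the predicted maximum deviation from straightness.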
Fig. 40. Diffracting aperture T produced with steel balls
Having centred T with respect to the diffraction pattern produced by the concave mirror plus plane mirror, we illuminated the circular aperture at P uniformly with an auxiliary white lamp with collimating lens. Thus P, R and Q could be aligned by setting the fiducial mark at Q symmetrically on the diffraction pattern (with a 120° symmetry) produced by R. During these measurements we were much hampered by air currents, temperature gradients of the air and vibrations of the building in the cellar of which we worked. The best results were
obtained during the night. Six settings on the pattern produced by the mirrors were followed by six settings on that produced by the balls, and again six settings on the first pattern were made. In the most reliable series deviations amounted to 0.15 seconds of arc in horizontal direction and to 0.07 seconds of arc in vertical direction, the standard deviations of the mean of each series being 0.12 seconds of arc. The prediction about straightness appeared to be fulfilled. Though the testing experiment seems to be of academic use only, there is a practical side to it. We do not pretend the apparatus is very appropriate for practical purposes (indeed the glass sphere with cap described in § 6 is easier to handle), but there is a great advantage to having tested the double mirror with utmost precision, for it can be used now for testing other alignment apparatus intended to give straight lines (like the sphere of § 6). Reverting now to the optical piece C in Fig. 39, it ought to be remarked that the object for the microscope object glass (the small aperture at A) must be carefully placed on the optical axis of the object glass, preferably at a distance of about 16 cm before it. The object glass being an apochromatic one, the least deviation of A from the optical axis results in an asymmetrical coloring of the diffraction pattern. This rather delicate adjustment can be obviated by replacing the object glass by a “perfect” glass sphere of small dimensions (between 2 and 6 mm diameter). First the small hole A is put in the centre of curvature of mirror N, then the small sphere is put in place at an appropriate distance from N. The optical parts now have an optical axis accurately defined by the centre of A and the centre of the sphere at C, no further adjustments by trial and error being necessary. We did this experiment with spheres of 6 and of 4 mm diameter.
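To give an idea of the scale of such a sphere as an imaging element, its paraxial focal length can be estimated from the standard formula for a solid sphere in air. This is a rough sketch only: the refractive index of the spheres is not stated here, so the value 1.517 of the borosilicate crown is assumed for illustration, and the function name is hypothetical.

```python
def ball_lens_focal_lengths(diameter_mm, n):
    """Paraxial focal lengths of a solid glass sphere in air.

    Returns (efl, bfl): the effective focal length measured from the centre of
    the sphere, and the back focal length measured from its rear surface.
    """
    efl = n * diameter_mm / (4.0 * (n - 1.0))
    bfl = efl - diameter_mm / 2.0
    return efl, bfl

N_ASSUMED = 1.517   # assumed index; the glass of the spheres is not specified

focal = {d: ball_lens_focal_lengths(d, N_ASSUMED) for d in (4.0, 6.0)}
for d, (efl, bfl) in focal.items():
    print(f"sphere of {d:.0f} mm: EFL = {efl:.2f} mm, BFL = {bfl:.2f} mm")
```

Under these assumptions the focal lengths come out at about 2.9 mm and 4.4 mm for the spheres of 4 and 6 mm. The formula is paraxial and says nothing about the spherical aberration of the sphere at wide apertures.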
The diffraction pattern was no less nice and crisp than when using the microscope objective; it had in fact better contrast. The chromatic aberration of the sphere reduces a little the coloring of the diffraction rings. With regard to practical applications we can add that the concentrated arc lamp has been replaced by a 4-watt glowlamp (as used for surgical applications) and even so illumination is strong enough to allow measurements up to 40 m distance with a magnification of ten times. This apparatus, once having been tested under favorable circumstances, is now used to test the straightness of the lines provided by other apparatus. If both apparatuses are placed on the same rigid support, neither the vibration of this nor the irregular refraction of the air has any influence. It should be added that up to now the sphere
with cap, as described in § 6, seems to be the most convenient contrivance for alignment in practice. Of the several collaborators who worked in the alignment team in Delft, the author wishes to mention the names of Messrs. J. G. Doekes and W. de Bruin, who helped to develop the zone plates, Mr. A. Walther, who added much to our knowledge of the sphere and its uses, Messrs. W. Brouwer and G. J. Beernink, who formerly worked in this field, and Messrs. Liem S. H. and R. F. van Ligten, who now carry on the measurements. The author wishes to thank here Messrs. M. J. Wijsman, W. C. van der Vaart, Th. Kersbergen and A. C. F. van Kuyk for the manufacture of the optical and mechanical parts, and Messrs. W. Ham, H. W. A. van der Meer, H. R. A. Wessels and J. Hoogland for their help during the experiments.