Optics Second Edition
FLOW DIAGRAM
I
1
1
2 Comblnat~onsof waves
3 Four~ertheory and wove groups
14
Wove propoga...
1060 downloads
5638 Views
35MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Optics Second Edition
FLOW DIAGRAM
I
1
1
2 Comblnat~onsof waves
3 Four~ertheory and wove groups
14
Wove propogat~on
I
II
5 Polor~tottonof
9 Interference and
6 Geometr~coptlcs
I
1 The wave nature of l~ght
7 Aberrot~ons
L 8 Opt~calInstruments
1-
12 Young's double sl~t
I
I I
13 lnterference by dlvls~on of ompl~tude
14 Resolut~onllrn~tsIn opt~calInstruments
I
15 Interferometers and spectrometers
I
--
I
I
16 Coherence and cor relot on -
18 Holography
-
I
I
I
19 Rod~atlonby accelerated charges
20 Physlcol processes ~nrefroctlon a
21 Detection of llght b
This flow diagram shows the main logical connections between chapters. Any specific chapter can be understoodif all the chapters above it which are connected by a line have been covered. Where more than one chapter is shown in one box the chapters concerned form a unit for the purposes of later chapters.
Optics
The Manchester Physics Series General Editors F. MANDL : R. J. ELLISON : D. J. SANDIFORD Physics Department, Faculty of Science, University of Manchester
Properties of Matter:
B. H. Flowers and E. Mendoza
Optics: (2nd edition)
F. G. Smith and J. H. Thomson
Statistical Physics: (2nd edition)
F. Mandl
Solid State Physics:
H. E. Hall
Electromagnetism:
I. S Grant and W. R . Phillips
Atomic Physics:
J. C. Willmott
Electronics:
J. M. Calvert and M. A. H. McCausland
Electromagnetic Radiation: F. H. Read
In preparation:
Statistics:
R. J. Barlow
Particle Physics:
B. R . Martin and G.. Shaw
OPTICS Second Edition
Sir Francis Graham Smith F.R.S. J. H. Thomson Department of Physics, University of Manchester
John Wiley & Sons Ltd. CHICHESTER
NEW YORK
@
BRISBANE
TORONTO
SINGAPORE
Copyright O 1971, 1988 by John Wiley & Sons Ltd. All rights reserved. No part of this book may be reproduced by any means, or transmitted, or translated into a machine language without the written permission of the publisher.
Library of Congress Cataloging in Publication Data: Smith, F. Graham (Francis Graham), 1923Optics/F. Graham Smith, J. H. Thomson. -2nd ed. p. cm. -(The Manchester physics series) Bibliography: p. Includes index. ISBN 0 471 91534 3: ISBN 0 471 91535 1 (pbk.): 1. Optics. I. Thomson, J. H. (John Hunter) 11. Title. 111. Series. QC355.2.S55 1988 87-22174 535-dc19 CIP
British Library Cataloguing in Publication Data: Smith, F. Graham Optics. -2nd ed. -(The Manchester physics series). 1. Optics 11. Thomson, J. H. 111. Series I. Title 535 QC355.2 ISBN 0 471 91534 3 ISBN 0 471 91535 1 Pbk Phototypeset by Dobbie Typesetting Service, Plymouth, Devon Printed by St Edmundsbury Press, Bury St Edmunds Suffolk.
Editors' Preface to the Manchester Physics Series The first book in the Manchester Physics Series, Properties of Matter, was published in 1970. Since then, eight volumes have appeared in the series and we have been extremely encouraged by the response of readers, both colleagues and students. These books have been reprinted many times and are being used world-wide in the English language edition and in translations. All the same, both the editors and authors feel the time is right for a revision of some of the books in order to take into account the feedback received and to reflect the changing style and needs of undergraduate courses. The Manchester Physics Series is a series of textbooks on physics at undergraduate level. It grew out of our experience at Manchester University Physics Department, widely shared elsewhere, that many textbooks contain much more material than can be accommodated in a typical undergraduate course and that this material is only rarely so arranged as to allow the definition of a shorter self-contained course. In planning these books, we have had two objectives. One was to produce short books: so that lecturers should find them attractive for undergraduate courses; and so that students should not be frightened off by their encyclopaedic size or their price. To achieve this, we have been very selective in the choice of topics, with the emphasis on the basic physics together with some instructive, stimulating and useful applications. Our second aim was to produce books which allow courses of different length and difficulty to be selected, with emphasis on different applications. To achieve such flexibility we have encouraged authors
vi
Editors' preface to the Manchester Physics Series
to use flow diagrams showing the logical connections between different chapters and to put some topics in starred sections. These cover more advanced and alternative material which is not required for the understanding of later parts of each volume. Although these books were conceived as a series, each of them is selfcontained and can be used independently of the others. Several of them are suitable for use outside physics courses; for example, Electronics is used by electronic engineers, Properties of Matter by chemists and metallurgists. Each author's preface gives details about the level, prerequisites, etc., of the particular volume. We are extremely grateful to the many students and colleagues, at Manchester and elsewhere, whose helpful criticisms and stimulating comments have led to many improvements. Our particular thanks go to the authors for all the work they have done, for the many new ideas they have contributed, and for discussing patiently, and often accepting, our many suggestions and requests. We would also like to thank the publishers, John Wiley & Sons, who have been most helpful. January, 1987
Author's Preface This book is based on wave concepts, and includes a thorough introduction to the physics of wave motion. There are major changes from the original first edition. We now introduce Fourier theory at an early stage, so that reference can be made to it throughout the book, greatly simplifying much of the later material. New material on modern optics has been included, and peripheral material on vibrations has been omitted. The basic concepts of geometric optics, and of interference and diffraction, are presented in a sufficiently concise form to allow room for several developments in modern optics, and for the description of practical spectrometers and interferometers. Later chapters provide an introduction to lasers, holography, and to the generation and detection of light. Much of this material has been re-written to take account of developments in the last fifteen years. Physical concepts are emphasized throughout, so that optics can be seen in relation to other disciplines such as electricity and atomic physics. The first sixteen chapters form a conventional course, which branches into several independent topics in later chapters. Geometric optics is treated by wave concepts; it forms an independent section which may be omitted without loss of continuity. Sections marked with a star may also be omitted on first reading. A flow diagram is included on the inside cover. The number of examples has been considerably increased: these are now found at the end of the first fifteen chapters, and numerical answers and notes are collected together near the end of the book.
viii
Author's preface
It is sad for me to record that my friend and colleague, John Thomson, died before this book was completed. Although much of his work remains, I have to take full responsibility for this second edition. Dr John Ellison provided invaluable suggestions for the major reorganization and improvement of the text.
Contents 1 THE WAVE NATURE OF LIGHT
**
1.1 1.2 1 . 14 1.5 1.6 1.7
Photons and Waves. Wave Theory . Electromagnetic Waves . Energy Flow in Electromagnetic Waves Cosinusoidal Waves Dispersion . Group Velocity . PROBLEMS1 .
.
2 COMBINATIONS OF WAVES
Waves and Phasors. The Complex Amplitude. Positive and Negative Frequencies . Beats between Oscillators. Standing Waves . Crossing Plane Waves . Similarities between Beats and Standing Wave Patterns Standing Waves and the Doppler Effect. Standing Waves at a Reflector. PROBLEMS 2 .
* Starred sections may be omitted as they are not required later in the book.
.
x
Contents
3 FOURIER THEORY AND WAVE GROUPS
* *
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
Fourier Series and Spectra . Fourier Series with Exponentials . Fourier Transforms. Finding the Fourier Transform of a Non-Periodic Function The Transform of a Modulated Wave. Convolution . Delta and Grating Functions . Autocorrelation and the Power Spectrum. Wave Groups. The Schrodinger Wave Group. An Angular Spread of Plane Waves. PROBLEMS 3 .
4 WAVE PROPAGATION
** *
4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 4.10
Rays and Wavefronts . Reflection at a Plane Surface . Refraction at a Plane Boundary . Oscillatory Waves and Huygen' s Principle. Fermat's Principle . Refractive Index and Wave Impedance . Partial Reflection of Plane Waves . Reflection at Oblique Incidence. Anisotropic Propagation. Double Refraction in a Uniaxial Crystal. PROBLEMS 4 .
5 POLARIZATION OF LIGHT
Polarization of Transverse Waves . Formal Description of Polarized Light-Elliptical Polarization . Analysis of Elliptically Polarized Waves . Polarizers . Birefringent Polarizers . Quarter- and Half-Wave Plates. Optical Activity . Induced Birefringence . Liquid Crystal Displays . PROBLEMS 5 .
6 GEOMETRIC OPTICS 6.1 The Concept of Rays 6.2 Perfect Imaging .
.
Contents
6.3 6.4 6.5 6.6 6.7 6.8
Perfect Imaging of Surfaces . Spherical Surfaces-the Paraxial Approximation General Properties of Imaging Systems . Ray Tracing -Separated Thin Lenses in Air . Ray Tracing-The Sine Law . Spherical Surfaces -The Concept of Wavefronts PROBLEMS 6 .
.
.
xi
80 81 83 85 87 88 89
7 ABERRATIONS
Ray and Wave Aberrations . Wave Aberration on Axis -Spherical Aberration . Off-Axis Aberrations . Influence of Aperture Stops . The Reduction of Spherical Aberration by Corrector Plates . 7.6 The Correction of Chromatic Aberration. 7.7 Achromatism in Separated Lens Systems. PROBLEMS 7 .
7.1 7.2 7.3 7.4 7.5
8 OPTICAL INSTRUMENTS 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 8.10
Lenses and Mirrors. Refraction in a Prism . The Simple Lens Magnifier . The Telescope. Advantages of the Various Types of Telescope. Binoculars . The Compound Microscope . Image-Forming Instruments -The Camera . Intensity, Flux, Illumination, and Brightness . Illumination in Optical Instruments. PROBLEMS 8 .
9 INTERFERENCE AND DIFFRACTION
9.1 9.2 9.3 9.4 9.5
Young's Experiment . Diffraction at a Single Slit . The General Aperture . Rectangular and Circular Apertures. The Field at the Edge of an Aperture PROBLEMS 9 .
.
127 130 135 136 140 141
Contents
xii
10 FRESNEL DIFFRACTION 10.1 Fraunhofer and Fresnel Diffraction- the Rayleigh 10.2 10.3 10.4 10.5 10.6
Criterion . Shadow Edges-Fresnel Diffraction at a Straight Edge. Diffraction of Cylindrical Wavefronts . Fresnel Diffraction by Slits and Strip Obstacles. Spherical Waves and Circular Apertures -Half-Period Zones . Kirchhoff's Diffraction Theory . PROBLEMS 10.
11 THE DIFFRACTION GRATING AND RELATED TOPICS The Diffraction Grating. Diffraction Pattern of the Grating. The Effect of Slit Width and Shape. Fourier Transforms in Grating Theory . Missing Orders and Blazed Gratings . Making Gratings . Radio Antenna Arrays . X-Ray Diffraction with a Ruled Grating. X-Ray Diffraction by a Crystal Lattice. PROBLEMS 11 12 YOUNG'S DOUBLE SLIT AND RELATED TOPICS
Ir
12.1 12.2 12.3 12.4 12.5
Young's Slits, Fresnel's Biprism, and Lloyd's Mirror The Effect of Slit Width . Source Size and Coherence . Michelson's Stellar Interferometer. The Intensity Interferometer. PROBLEMS 12.
13 INTERFERENCE BY DIVISION OF AMPLITUDE 13.1 13.2 13.3 13.4 13.5 13.6 13.7
Division of Amplitude -Newton's Rings. Interference Effects with a Plane-Parallel Plate. Thin Films . Michelson's Spectral Interferometer. Multiple Beam Interference . The Fabry-Perot Interferometer . Interference Filters PROBLEMS 13
.
Contents
xiii
14 THE RESOLUTION LIMITS OF OPTICAL INSTRUMENTS
* ** * ,
14.1 14.2 14.3 14.4 14.5 14.6 14.7 14.8 14.9 14.10
Introduction . The Telescope . Resolving Power and Sensitivity . The Microscope . Coherent Illumination in the Microscope (Abbe Theory) . Phase-Contrast Microscopy . Spatial Filtering . Resolution in Wavelength: The Prism Spectrometer. The Grating Spectrometer . Resolving Power of a Fabry-Perot Interferometer. PROBLEMS 14
209 210 213 214 215 217 219 219 222 224 225
15 INTERFEROMETERS AND SPECTROMETERS 15.1 15.2 15.3 15.4 15.5 156 15.7 15.8 15.9 15.10 15.1 1 15.12 15.13 15.14
The Measurement of Length. The Rayleigh Refractometer . Wedge Fringes and End Gauges . The Twyman and Green Interferometer. The Standard of Length. The Michelson-Morley Experiment . The Ring Interferometer . The Choice of Spectrometers. The Grating Spectrometer. Concave Gratings . Echelle and Echelon Gratings. The Fabry-Perot Interferometer . Twin-Beam Spectroscopy . Fourier Transform Spectroscopy . PROBLEMS 15
226 226 227 229 230 23 1 232 232 233 234 235 236 237 239 240
16 COHERENCE AND CORRELATION
* *
16.1 16.2 16.3 16.4 16.5
The Region of Coherence . Correlation as a Measure of Coherence. Autocorrelation and Coherence . Aperture Synthesis The Intensity Interferometer.
242 244 246 247 249
17 LASERS 17.1 Light Amplification . 17.2 The Laser as an Oscillator 17.3 Optical Cavities and their Properties.
.
25 1 253 255
xiv
Contents
17.4 17.5 17.6 17.7 17.8 17.9 17.10 17.1 1
Pulsed Lasers Mode Locking . Glass, Gas and Liquid Lasers. Semiconductor Lasers . Fibre Optic Communications. Properties of Laser Light . . _ . Harmonic Generation and Heterodyning Homodyne Detection of Laser Light .
.
18 HOLOGRAPHY 18.1 18.2 18.3 18.4 18.5 18.6
Introduction . Gabor's Original Method . A More Sophisticated System: Carrier Fringes. Aspect Effects . Three-Dimensional Holograms -Colour Holography Holography of Moving Objects .
19 RADIATION BY ACCELERATED CHARGES 19.1 The Electromagnetic Field 19.2 The Hertzian Dipole . 19.3 Cerenkov Radiation . 19.4 Synchrotron Radiation . 19.5 Bremsstrahlung .
.
20 PHYSICAL PROCESSES IN REFRACTION
.
Introduction Polarization in Dielectrics . Free Electrons . Resonant Atoms in Gases . Anisotropic Refraction . Rayleigh Scattering. Raman Scattering . Thomson and Compton Scattering by Electrons. 21 DETECTION OF LIGHT 21.1 2 1.2 2 1.3 21.4 2 1.5
Detection over the Electromagnetic Spectrum. Thermal Detectors. Photoconductive Detectors . Photoelectric Detectors. Photon-Counting Image Detectors.
.
Contents
21.6 21.7 21.8 21.9
Electron Beam Scanners and Intensifiers. Photodiodes and Charge-Coupled Devices (CCDs). Photography. TheEye .
FURTHER READING
.
SOLUTIONS TO PROBLEMS
INDEX
.
300 303 304 305 308
.
310 315
The Electromagnetic Spectrum Units, Symbols and Data
xv
.
facing inside back cover
. inside back cover
CHAPTER
The wave nature of light
1
PHOTONS AND WAVES
Optics in the days of Newton was a branch of physics which was clearly defined and separate from mechanics, heat, electricity, and magnetism. Today it is a subject which acts as a bridge, bringing together concepts from all other parts of physics, and linking closely with such diverse subjects as the engineering of communications and high-energy nuclear physics. The link is the wave nature of light. The behaviour of light waves is typical of the wide range of waves known as electromagnetic waves. Light occupies a place in the electromagnetic spectrum intermediate between the long radio waves, which we are accustomed to think of as oscillating and propagating electric fields, and the short X-rays and gamma rays, which we think of as energetic particles. The whole of this range is in fact governed by the same basic laws, and the whole of our study of light will be found to apply in some way to other parts of the spectrum. The differences of behaviour are, however, very large. The frequencies of the oscillations vary from 104Hz for long radio waves to more than 1021Hz for gamma rays (1 Hertz equals one cycle per second), and the wavelengths from 30 km to less than 1 picometre ('pico' = 10- 12). Radio waves are generated and detected as an oscillating electric or magnetic field, and it is unusual (but not unknown) to hear a physicist refer to a quantum process in the radio frequency spectrum. Gamma rays are packets
2
The wave nature of light
Chap. 1
of energy, photons, emitted and absorbed in atomic and nuclear reactions: only rarely will a physicist refer to a frequency of a gamma ray, but he will instead refer to its energy. The laws governing the propagation of electromagnetic waves are the same for all wavelengths; some are sufficiently general to apply equally to many other quite different kinds of waves. The quantum nature of a wave, which is responsible for the obvious differences between a gamma ray and a radio wave, is only revealed when waves and matter interact. Waves are found to exchange energy and momentum with their generators and absorbers only in discrete packets, according to quantum laws. These discrete packets of energy, the quanta, are photons. They need not be considered to exist at all during the propagation of a wave, but only as a limitation on the possible ways in which radiation interacts with matter. The relation of the size of the quantum to common levels of energy in matter determines the relative importance of the quantum at different parts of the spectrum: gamma-rays, with a high photon energy and a high photon momentum, can act on matter explosively or like a high-velocity billiard ball, while long infrared or radio waves, with a low photon energy, do not interact so much with individual atoms as with matter on a larger scale, through classical electric and magnetic induction. The photon energy of light waves is such that quantum effects dominate only some of the processes of emission and reception. The visible spectrum contains the marks of quantum processes in the profusion of colour from line emission and absorption, but it also can display a continuum of emission over a wide range of wavelengths, giving 'white' light, whose actual colour is determined by the large-scale structure of the continuum spectrum rather than its fine detail. The wavelength of visible light is at the same time sufficiently short for us to observe effectively rectilinear propagation, as though light were propagated as a stream of particles, and long enough for us to observe easily many interference and diffraction effects with apparatus whose dimensions are in some way commensurate with the wavelength. We find therefore in the study of light most of the physical phenomena of electromagnetic waves, and although we may refer to other wavelengths for illustration, light provides our main examples of the propagation, interference, and diffraction of waves, the main themes of this book. Photons are referred to only briefly, in Chapters 17 and 19 on the generation and Chapter 21 on the detection of light; from our point of view a photon does not exist except in a transfer of energy and momentum to or from the electromagnetic field, and we do not therefore describe light as a stream of photons, or speculate about the interference of one photon with another. It is wave propagation we are dealing with, and we start therefore with a general study of waves.
1.2
Wave theory
3
1.2 WAVE THEORY A simple expression of a wave of a variable quantity $, travelling in the direction x with velocity c is: $=f (x-ct)
(1 1)
The function f (x) describes the shape of $ at the moment t = 0, and the equation states that the shape of $ is unchanged at later times, with only a movement along the x axis (Fig. 1.1). The variable quantity $ may be a scalar, for example the pressure in a sound wave, or it may be a vector. If it is a vector, it may be orientated longitudinally along the wave direction, as are the material displacements in a sound wave; or it may be transverse, as in a stretched string or in the electromagnetic waves which are our main concern. For most of optics it is sufficient to consider only the transverse electric field; indeed, as we shall see later, the results of scalar wave theory are sufficiently general that for many purposes we may just think of the magnitude of the electric field and forget about its vector nature. At any time the variation of $ with x, i.e. the slope of the graph in Fig. 1.1, is given by a$/ax, which we denote by f '(x-ct). JI Tlme 0
Time
t
Velocity c
*
Fig. 1.1. A wave travelling in the x direction with unchanging shape and with velocity c. At time t = O the shape is $=f (x), and at time t it is $=f ( x - c t ) .
At any one place the quantity $ will change with time as a$/at, where
The second differential of $ with respect to x, i.e. a2$/ax2,which is the curvature of the graph in Fig. 1.1, is denoted by f "(x- ct). The second differential with respect to time, i.e. the acceleration of $, is given by
4
The wave nature of light
Chap. 1
It follows that any wave propagating in the x direction with uniform velocity and without change of form obeys the equation
Conversely, if a physical system obeys an equation like Eq. (1.4), then it will support waves which will propagate without change of form. The wave equation (1.4) may be extended to cover three dimensions. The three-dimensional wave motion in a quantity is:
+
or in a more general and concise notation:
The formulation of Eq. (1.6) is independent of any particular coordinate system; it covers for example a different formulation of (1.5) for a system of spherical coordinates r, 13, 4, which would be used for waves spreading
Fig. 1.2. Polar coordinate systems. (a) Spherical
(b) Cylindrical.
1.2
Wave theory
5
from a point, or a system of cylindrical coordinates p, z, $ suitable for waves radiated from a line source. Fig. 1.2 illustrates each of these systems of coordinates. The wave equations for these two systems are:
- 1 a2+ (spherical) c2 at2
a2+ +
--
ap2
1
a$ ap
+
1
a2+
p2 a42
+
a2+ az2
1
a2+ (cylindrical) .
c2 at2
(1.8)
We will now derive a wave equation (1.4) for the simple example of a stretched string, showing that the velocity c is related to the tension and the mass per unit length of the string. Consider a wave which curves a small portion away from the equilibrium position without changing the tension or making a large slope (Fig. 1.3). Let the mass per unit length be m. The net force deflecting the string is determined by the tension and the curvature: the force on a portion of string Ax long can be equated to the mass m Ax multiplied by the acceleration. The upward component of the string tension at any point is T a y / a x . The
Fig. 1.3. A wave travelling along a stretched string. The portion A x is accelerated upwards by the difference in vertical components of the tension T. The mass of the portion is m Ax.
6
The wave nature of light
Chap. 1
differential of this measures its change with x, so that the difference in upward force on the two ends of the piece Ax long is T(a2y/ax2)k. Therefore
Comparison of Eqs. (1.4) and (1.9) shows that the wave propagates with velocity (T/m)lI2. Since the equations are identical in form, it follows also that the wave propagates without change in wave shape. Transverse waves of this type can be sent for long distances along stretched wires, such as the stretched cables of mountain cable cars: the acute observer will notice that there is in practice always some change of wave shape as the wave travels, which must always mean that the equation of motion of the wire is not precisely given by Eq. (1.4). As we mentioned earlier, electromagnetic waves are waves both of electric and magnetic fields. The two are, however, closely linked; in free space the ratio of the field strengths is fixed, and the field directions are mutually perpendicular and also both perpendicular to the line of propagation of the wave. We may usually concentrate on one; since the electric field is generally more easily detected we choose it as our variable, and speak only of the field E. The wave equation for electromagnetic waves propagating without change of form will therefore be
* 1.3
ELECTROMAGNETIC WAVES
In the next two sections we discuss electromagnetic waves more fully. This is needed for some of the applications discussed later in this book, but not for most of the central ideas of optics. For readers who are already familiar with Maxwell's equations, however, the following sections will help to unify the subjects of optics and electromagnetism. We may expect that this wave equation for electromagnetic waves can be derived from simple laws governing the behaviour of the electric and magnetic fields, E and H. The laws are not, however, as simple as for the stretched string, since E and H are vector quantities, and the two vital laws involve both E and H. These are the laws of induction, in which a changing field of E or H induces a loop component of H or E. In vector terminology, and for free space: curl E =
-h-
aH , at
curl H = Q-
aE at
.
*
1.4
Energy flow in electromagnetic waves
7
The first of these is Faraday's law, and the second is the complementary law introduced by Maxwell. The constants and q-, are the permeability and permittivity of free space. The wave equation for the vector fields E and H is found by using the identity curl curl = grad div - V
.
(1.12)
Then from equations (1.1 1):
V2E = po curl and
aH - a at - Po% curl H
-
a2E V2E=~pOat2
-
The velocity of light in free space is therefore given by
The simple wave equation (1.10) is correct only for free space, but when it applies the electromagnetic waves propagate without change of form at a velocity which is related to the values of q and h. Electromagnetic waves can also propagate in a material medium; in this case the Eqs. (I. 11) must include additional factors e and p to represent the dielectric constant and magnetic permeability of the medium. Of these factors, the dielectric constant is the more important, since it is unusual to encounter waves in media where p differs appreciably from unity. In a dielectric the ratio of the velocity in free space and the velocity in the medium is the refractive index of the medium (n). Hence for the dielectric n = ell2. The dielectric constant and magnetic permeability are not necessarily simple quantities: in general each may be a tensor. We will encounter a comparatively simple form of this complication in the propagation of light through a crystal (Chapter 4). We will also find that the dielectric constant e varies with wavelength for all media; we must therefore develop analytical methods that distinguish between waves of different wavelengths.
* 1.4
ENERGY FLOW IN ELECTROMAGNETIC WAVES
The energy in an electromagnetic wave resides in the medium through which it propagates, even if that medium is free space. The flow of energy is measured by the Poynting vector
8
The wave nature of light
Chap. 1
This is the amount of energy crossing unit area perpendicular to the vector per unit time. (Strictly speaking, this is only correct if S is integrated over the surface of a volume, when it measures the total flux of energy out of that volume.) The relation between E and H is easily found for a plane wave in free space. We first show that there is no component along the direction of propagation. In free space div E=O, i.e.
For a wave propagating along the x-axis with no variation along y and z, all terms of (1.17) are zero, and the longitudinal component of E is zero. A similar proof shows that H, is also zero. We can now show that E and H are orthogonal in a plane wave in free space. This follows from Maxwell's equation, curl E = - p h aH/at, by expanding the components in x, y, and z, and putting Hx=0, Ex=0, Ez=0. Then
and only Hzexists. For the elementary plane wave of (1.1) we have a/at = - c a/ax, and therefore
In free space (p = E = 1) the ratio of Eyto Hzis a constant, which is known as the impedance of free space and has the value of 377 ohms. The Poynting vector for a plane electromagnetic wave in free space is directed along the wave normal, and substituting Eq. (1.19), its value is E2/377 watts m-2, where E is the root mean square electric field in volts per metre. 1.5 COSINUSOIDAL WAVES
The general wave t,b =f ( x - ct) may have any wave shape provided that it keeps the same shape as it progresses. The waveform
1.5
I
+
Wavelength A
27k =
Cosinusoidal waves
9
*
Fig. 1.4. The progressive cosinusoidal wave $ = A cos (wt - kx+ 6).
is mathematically correct as a wave travelling without change of form, since $ is a function of x - (o/k)t. The velocity of this wave is therefore c = o/k, and its form at time t = 0 is a simple cosine wave along the x axis. The phase term 6 determines the position of the cosine wave at t = 0 (Fig. 1.4). Adding u/2 to 4 makes the wave into a sine wave; we will refer to the waves generally as sine waves, although the form of Eq. (1.20) is preferable for analytical reasons that emerge later. This particular form of wave lends itself well to analytical work, and fortunately it is also a type of wave that is easily recognizable, as for example in a sound wave composed of a single pure note, or in a monochromatic light wave. The constants in Eq. (1.20) are often referred to by name; o is the angular frequency, k is the wave number, and 6 is the phase. A cycle of oscillation in y occurs at time intervals of one phriod T = 2 d 0 , and at distance intervals of one wavelength X = 2 d k . (Note that in spectroscopy the term 'wave number' is used for l/X, differing from this usage by the factor 2u.) The actual waveform of any wave may differ markedly from the pure sine wave, and its shape may change with time. It is found, however, that any periodic waveform can be represented as the sum of sine waves, by adding a series of harmonics at angular frequencies which are integral multiples of o. If the waveform is not truly periodic, but changes with time (and therefore with distance), then it can be constructed by adding a continuous spectrum
10
The wave nature of light
Chap. 1
of sine waves, with an infinite range of angular frequency. The link between a waveform and its components is by way of Fourier analysis, which will be described in Chapter 3. 1.6 DISPERSION A sine wave may be propagated in a medium which does not obey the general wave equation (1.2). This particular waveform may then be propagated without change, but it is now to be expected that a more complicated waveform, made up of different frequencies, will change as it propagates. For example, waves on the surface of a liquid do not obey the wave equation (1.4) in which the velocity c is independent of X or w . Instead the velocity varies with wavelength as
where g is gravitational acceleration, p is density, and T is surface tension (see Problem 1.2 at the end of this chapter). The two separatk terms in the bracket dominate respectively at long and at short wavelengths. For long wavelengths, corresponding to deep ocean waves, velocity varies as A l l 2 , while for short wavelengths, where the waves become surface tension ripples, the velocity varies as XWater waves therefore provide examples both of waves whose velocity increases with wavelength, which are the deep water waves, and of waves whose velocity decreases with increasing wavelength, which are the surface tension ripples. In either case the medium is said to be dispersive, and the waveform of any wave other than a sinusoid will change progressively as the wave propagates. Dispersion is, of course, very familiar in optics, where the velocity of light waves through glass varies with wavelength. White light has a complicated wave shape, which is sorted out by a prism into separate wavelengths, each representing a different pure colour, and each being refracted independently by the prism. 1.7 GROUP VELOCITY
The large dispersion of velocity with wavelength exhibited by water waves suggests that for these waves there is no identifiable velocity for any wave apart from a sinusoidal wave with a specified wavelength. Experience shows that on the contrary, the waves or ripples travelling out from an impulse, such as the splash from a stone falling into the water, do have an identifiable velocity. This is not now the velocity with which an unchanging waveform
1.7
? -
Group velocity
11
Velocity C
Time r
t
Time t
+
r
Fig. 1.5. The velocity of a group of waves, related to the velocities of two components of the group. After time T the crests of the two waves coincide at a point one wavelength back, so that the group velocity for these waves is less than the wave velocity.
travels outwards from the splash, but rather the velocity of the centre of a group of waves. It is the velocity with which the wave energy travels. The V-shaped bow wave of a ship often comprises a group of three or four waves; the whole group travels with a definite velocity but changes shape as it travels, with wave crests actually moving through the group at a different velocity. The two velocities, of the group of waves and of the component waves making up the group, are called the group and wave velocities. The wave velocity is also known as the phase velocity, since any point on the wave can be defined in terms of phase. Since the difference between the two velocities is a result of dispersion, we can expect to find a relation between them in terms of the rate of change of velocity with wavelength. The two wavelengths in Fig. 1.5 are sinusoidal waves with slightly different wavelengths X and X + 6X, travelling with different wave velocities c and c + 6c. These two are intended to be part of a group of waves, and the sum of the two waves is representative of the sum of a spectrum of waves, possibly with a very wide range of wavelengths. At some time t a pair of crests from these two waves will reinforce exactly as in Fig. 1.5(a), but since their velocities are different the waves will not again reinforce exactly until a later time t + T , when a different pair of wave crests coincide as in Fig. 1.5(c). If the longer wave is moving faster, the point of coincidence moves back along the wave. This point marks the centre of the group.
12
The wave nature of light
Chap. 1
In time 7 the longer wave, moving with a relative velocity 6c, has caught up a distance 6X, and the group, whose centre is at the point where the waves reinforce, has slipped back by one wavelength. The waves have moved a distance CT (approximately, since we ignore 6c in comparison with c). Therefore
and in this time the group has moved a distance CT - X . The group velocity v, is therefore given by
or adopting differential notation by allowing 6X to tend to zero:
This analysis has used only two component waves, with adjacent wavelengths, but the expression for vg shows that a single identifiable group velocity may exist for a wide range of wavelengths, depending on the form of the dispersion. For water waves the dispersion actually changes in sign between the short wavelengths and long wavelengths, as may be seen by differentiating Eq. (1.21), so that the approximate wavelength must be stated for v, to be known. For light waves in air or in glass the dispersion may be so similar over the whole range of visible wavelengths that only one value of ug may apply. Electromagnetic waves provide examples of various different ways in which the phase velocity can depend on wavelength; the physical reasons for this dependence are discussed in Chapter 20. Another useful expression for group velocity is
The identity with Eq. (1.24) may be proved by substituting o=kc in Eq. (1.25). Eq. (1.25) may also be derived more directly, as follows. If the components of a wave group, each represented by a cosinusoidal wave of the form given in Eq. (1.20), are to add to make the centre of the group x= 0 when t = 0, then 4 = 0 for each wave, and the group centre will follow the point where (ot-kx) is independent of frequency, i.e. where t - x dk/do = 0. The group velocity is then vg=x/t = do/d k.
Problems 1
13
The energy in a wave packet is carried by the group of waves, and therefore travels at the group velocity. In some circumstances the phase velocity of an electromagnetic wave may exceed c; as expected from relativity theory, the group velocity can never exceed c. PROBLEMS 1 1.1 What is the ratio of the velocities of transverse and longitudinal waves in a thin wire which is stretched elastically by a fraction x of its length? 1.2 Waves propagate on the surface of a liquid with a velocity V depending on wavelength X . Use dimensional arguments to find the form of this dependence for very long and for very short wavelengths, given that: (i) the velocity for long wavelengths also depends on gravitational acceleration g (ii) the velocity for short wavelengths also depends on surface tension T and density p. 1.3 Only the first or second terms of Eq. (1.21) are needed for finding the velocity of water waves with very long and very short wavelengths respectively. Calculate the wavelengths for which this simplification gives a velocity which is 1010 in error. N m-I.) (Surface tension T=7 x 1.4 In an electrical transmission line with inductance L per unit length and capacity C per unit length, show that a wave of voltage V and current I propagates with velocity (LC)-'I2. (Hint : consider a length Ax and find expressions for A V in terms of L and for A I in terms of C.) 1.5 Demonstrate the equivalence of the following expression for group velocity v, , when phase velocity u = c/n :
1.6 A plane wave propagates in a dispersive medium with phase velocity V given by
where a and b are constants. Find the group velocity. Show that any impulsive waveform will reproduce its shape at times separated by intervals of 7 = 1/ b, and at distance intervals of a/b. (Hint: consider any pair of component waves separated in wavelength by 6X as in Section 1.5.) 1.7 Calculate the group velocity for the following types of waves, given the variation of phase velocity V with wavelength X. (i) Surface waves controlled by gravity: V= aX1I2. (ii) Surface waves controlled by surface tension: V=aX-'I2. (iii) Transverse waves on a rod: V= a h - l. (iv) Radio waves in an ionized gas: v=/(c2 b2X2). 1.8 The refractive index for electromagnetic waves propagating in an ionized gas is given by
+
where w,, the plasma frequency, is determined by the density of the gas. Show that the product of the group and phase velocities is c2.
CHAPTER
Combinations of waves 2.1 WAVES AND PHASORS
In almost all of this book our concern is with waves in a steady state. That is to say that the optical systems have been in existence long enough for light to have reached every part of them, so that all reflections, refractions and so on that can happen will have happened. If one imagines a set-up on an optical bench of length L, it would take a time somewhat greater than L/c after switching on to reach such a state. In such a system, if light of only one frequency v is present, that is to say the light is monochromatic, the magnitude of the electric vector El at any point P1would be found to be varying at the frequency v of the light, so that a plot of I E, I against time would be a sinusoidal variation, such as we have considered in Section 1.5. If we made a similar plot for some other point PZ we should get a similar plot to that for P1.These plots have one important thing in common: (i) the frequency v is the same for both, and two important things are different: (ii) the amplitudes of the sinusoids are different, (iii) the phases of the sinusoids are different. These last two are independent quantities: we can draw a sinusoid of any amplitude and then slide it along the time axis to have any phase. Similarly, having decided the phase, that is to say the position of the crossover points on the time axis, we can draw a sinusoid of any amplitude through them.
2.1
Waves and phasors
15
It is usual to take the phase at some point Po as a reference: the sinusoidal variation at Po is taken to be a cosine. Then to know the variation at P 1 for example we need to know two things: the amplitude and the phase relative to P o . Suppose that we denote the magnitude of the electric vector I E 1 by the symbol U (sometimes called the optical disturbance) then U is a scalar, so that a single number serves to fix U at a given place and time. Thus U is a function of position and time, and we can write symbolically
If we are concerned with a monochromatic system we also know that the time-variation of U at, say, P 1 will be a cosine of angular frequency w = 27rv and of some amplitude A l and phase 41. So we may write
If the cosine factor is expanded by the well known trigonometric formula, we have
which can be written
The two quantities A, and As convey the same information about the disturbance at P 1 as did the amplitude A and the phase 6. They are the coefficients of the unit cosine and sine disturbances at P1which, added together, give the disturbance of Eq. (2.2). An example of this summation of a cosine and sine is given in Fig. 2.1. Looked at in this way it is clear that if there were further disturbances to be added to that already at P 1 , this could be done by adding all the A , and all the As terms, and replacing A, and As in Eq. (2.4) with CAc and C A s . A, and As thus behave like components of a two-dimensional vector, and the result of the superposition of a number of disturbances at a point may be obtained by means of a vector diagram. To save confusion with threedimensional vectors like E and H, A, and As are said to be the components of a phasor A, and the diagram on which they are added is called a phasor diagram. The resultant of such a phasor diagram is the amplitude and phase of the disturbance at the point considered. At optical (as opposed to radio) wavelengths, the observable quantity is usually the intensity, which is proportional to the square of the amplitude. Thus in optics whilst the phase plays a vital part in the construction of a phasor diagram, it is not directly observable.
16
Combinations of waves
Chap. 2
3 cos ( w t )
4 sin ( w t )
Fig. 2.1. A pure cosine and a pure sine of different amplitudes adding to give an offset cosine. The amplitude and phase of the resultant wave may be obtained from the phasor diagram.
2.3
Positive and negative frequencies
17
2.2 THE COMPLEX AMPLITUDE If we form a complex number from A, and A, by writing
then this number is called the complex amplitude. We have here underlined A to distinguish it from A, but shall not do so in future as it will always be clear from the context if a complex amplitude is intended. It will be evident that the addition of such complex numbers will correctly sum all the A, parts as the real part of the sum, and all the A, parts as the coefficient of i, forming the imaginary part. The intensity, being the square of the amplitude, will be correctly calculated from
where the asterisk denotes the complex conjugate. There is no mystery in using imaginary quantities in the description of a physically real quantity: the complex amplitude is a convenient mathematical way to handle A, and A,, the real coefficients of cos wt and sin of, which when added according to Eq. (2.4) give the real disturbance U(xl, yl ,zl ,t). Drawing phasor diagrams then may be regarded as adding complex numbers in the complex plane, and the concepts tend to be used interchangeably in the literature. The manipulation of complex numbers is very convenient compared with the trigonometrical functions. For example, to rotate a phasor by an angle a corresponds to multiplying the complex amplitude by exp(ia). Thus ArOtated = (Ac + i A,) exp (icx) = A exp (i+) exp (ia)
The addition of several oscillations at the same frequency is represented by the addition of their complex amplitudes. An example of the addition +2 is of two oscillations with arbitrary amplitudes A l , A2 and phases shown in Fig. 2.2. The vector addition of the two phasors is shown to give a resultant phasor A, 6, representing the complex amplitude of the resultant oscillation. 2.3 POSITIVE AND NEGATIVE FREQUENCIES We have seen above how the introduction of the complex amplitude greatly simplifies the manipulation of the non-time-dependent parts of any disturbance U(x, y, z, t). The simplification arose from the use of complex numbers rather than trigonometrical functions to describe a two-dimensional vector. The time-dependent part of U(x,y, z, t) still has sine and cosine
18
Combinations of waves
Chap. 2
2.3
Positive and negative frequencies
19
components of unit amplitude: these may also be expressed in terms of complex numbers, leading to a similar simplification. Consider the behaviour of the complex number of unit magnitude exp(iot) as t varies. It represents a point at unit distance from the origin in the complex plane rotating anticlockwise round the origin o/2x = u times per second. Such a term is called an exponentialfrequency term. It has the property that when a phasor is multiplied by it the phasor is made to rotate anticlockwise u times per second. Now consider an exponential frequency term in which the frequency is a negative quantity. Evidently this can be used to rotate a phasor in the opposite direction, continually subtracting from its phase rather than adding to it. A completely real oscillation can be made by adding two exponential frequency terms. The two terms exp ( i d ) = cos o t + i sin o t
(2-8)
exp ( - iwt) = cos wt - i sin o t
(2.9)
can be combined to form C ~ wt S
=
+(exp( i d ) + exp - i d ) ) (
and sin a t = [exp(iwt) - exp(- i d ) ]
(2.10)
.
The real oscillation Ac cos ot + As sin o t can be written as A exp (iot) + A* exp( - iot)
where A = +(A,
+ iA,)
and
The two terms in Eq. (2.13) represent phasors of equal magnitude, each multiplied by an exponential frequency term. They are therefore made to rotate in opposite directions in the complex plane at the same angular frequency. Further, the rotating phasors are at all times each other's mirror image in the real axis, so that their imaginary parts cancel;
20
Combinations of waves
Chap. 2
their real parts always add, as shown in Fig. 2.3. This then is the trick: for all values of t the imaginary parts cancel, while the real parts add to give a real cosine wave. This representation of an oscillation as a sum of positive and negative exponential frequency terms is particularly useful in Fourier analysis (Chapter 3). It is nevertheless often sufficient to think of an oscillation cos ot as the real part of a single exponential exp (iwt).
Fig. 2.3. A cosine of phase 4 formed from rotating phasors. The two phasors are at & 4 at t = 0 and rotate in opposite directions. Their resultant is always real and is the offset cosine wave cos(2xvt + 4). The imaginary parts always cancel. Notice that the amplitude is 2A if each phasor is A long.
2.4 BEATS BETWEEN OSCILLATORS
The addition of phasors in a phasor diagram provides a convenient way of understanding problems in which any number of oscillations are to be added. Their phases are specified relative to a reference oscillation, and the addition is then performed by a simple addition of phasors with the specified amplitudes and phases. The addition of two oscillations, one of which is smaller in amplitude and which has a slowly increasing relative phase, is shown in Fig. 2.4. Here the phase of the larger oscillation is taken as the reference phase, so that the
2.4
Beats between oscillators
(a) The sum of t w o sinusoidal oscillations with different amplitudes and with a slowly changing relative phase corresponding t o slightly different frequencies
(6)Phasor diagrams representing (iii)
( iv)
the addition of the t w o oscillations. The amplitude A of the resultant varies between A, A, and A, - A,, and its phase oscillates about that of the larger oscillation. In this figure A, 3A2
+
A2
-
-
--
(c) The variations i n amplitude of the resultant oscillation
(d) The variations i n phase of the resultant oscillation relative to the phase of the larger oscillat~on
Fig. 2.4.
21
22
Combinations of waves
(i)
(ii)
Chap. 2
(iii)
Fig. 2.5. The sum of two sinusoidal oscillations with equal amplitudes but different frequencies. (a) The resultant oscillation. The variation of amplitude is shown dotted. (b) The phasor diagrams at three typical maxima in the resultant. (c) The variation in phase of the resultant relative to one of the original oscillations. This figure shows the limiting case of Fig. 2.4 for equal amplitudes. The frequency difference shown here is greater than in Fig. 2.4. This figure shows beats having maximum modulation.
phasor representing the smaller oscillation rotates; the sum oscillation is represented by a phasor whose tip traces out a circle. The middle of the figure shows the pattern of the actual oscillation. The phasor diagram gives the amplitude and relative phase of the oscillation. The phase follows the lower graph, advancing and retarding through the cycle. This sum is familiar in the beating of two sound waves. The phase variations are not usually as important as the familiar beat in the amplitude. Note that there is no actual oscillation at the beat frequency, but only a modulation of the amplitude and phase of the main oscillation. If the two oscillations have equal amplitudes A, the vector diagram of the phasors becomes as in Fig. 2.5. The amplitude is now modulated from 2A to zero, while the phase increases linearly, with steps of ?r at each zero. The phase reference has been taken as one of the two waves, with the other phase increasing uniformly with time. This means that one oscillation has angular frequency w, while the other has w + Aw. The beat frequency is
2.5
Standing waves
23
Amplitude
I
Fig. 2.6. Phasor diagrams showing successive stages in beats between two sinusoidal oscillations. The phase is referred to an oscillation at the mid frequency. The resultant remains always vertical in the figure, and varies sinusoidally as indicated in a period of l/Av where Av is the frequency difference between the oscillators.
A o / 2 ~ .A more convenient phase reference may be taken as that of an oscillator with frequency half-way between these two, so that we are adding regular frequencies o - A d 2 and o + A d 2 . The phasor diagram now looks like Fig. 2.6, which shows half a beat cycle. The phase of the resultant is now constant for a half period, reversing at the instants of zero amplitude. 2.5
STANDING WAVES
Two sinusoidal waves travelling in opposite directions add to give a pattern of standing waves. This may be seen in water waves reflected from a pond wall, or heard in sound waves; it is often conspicuous in VHF radio, where standing wave patterns inside a room may be explored by moving a portable receiver. We will use the phasor representation to describe the pattern of standing waves made by two waves travelling in opposite directions. The two waves, amplitude y, and y2, travel in the directions + x and - x. Let their oscillations be in phase at x = 0.Then as x increases, the phase of one decreases as kx and the phase of the second increases as kx as indicated in Fig. 2.7. The phase reference is the phase of the oscillation at x=O. At x=O the oscillations add, giving maximum amplitude. A minimum is obtained one quarter wavelength away, and at further odd numbers of quarter wavelength, from x=O. These minima are called nodes, and the maxima are called an tinodes.
24
Combinations of waves
Chap. 2
Fig. 2.7. Relative phases of two waves, in phase at x=O, and travelling in opposite directions.
Fig. 2.8. Envelope pattern of standing waves generated by two waves of equal amplitude. The phasor diagrams show the waves in phase at x= 0, with the resultant amplitude depending on the relative phase difference as x increases.
The patterns of standing waves in Fig. 2.8 are shown with the corresponding phasor diagrams, which give the amplitude and phase of the oscillation at any point. For y, =y2, as in a perfectly reflected plane wave, the phase is the same at all x, apart from a reversal at each zero point, or node, so that a time series of graphs of y against x is the simple pattern of Fig. 2.9. If the two waves travelling in opposite directions do not have equal amplitudes the standing wave pattern does not have zero amplitude at the nodes, and the phase is not constant through the region of an antinode. The corresponding phasor diagrams are shown in Fig. 2.10. This pattern contains a progressive wave, whose amplitude is the difference in amplitude of the two waves. There is therefore a continuous change of
2.5
Standing waves
25
Fig. 2.9. Successive plots of the oscillation in a standing wave pattern. The displacement y is plotted at intervals of one sixteenth of the period, i.e. at phase intervals of a/8.
Fig. 2.10. (a)The envelope of the standing wave pattern, with corresponding phasor diagrams, for waves of unequal amplitude. (b) The elliptical locus of the resultant phasor.
26
combinations of waves
Chap. 2
phase, in contrast to the discontinuous change of phase at the nodes in the pattern of Fig. 2.9. The changes of amplitude and phase of the oscillation along the x direction are represented by the phasors of Fig. 2.10(a).The phasors trace out an elliptical locus once each wavelength of x, as in Fig. 2.10(b).Successive points round the locus are marked with the distance along x, so that the amplitude and phase of the standing wave may be followed. The major and minor axes represent amplitudes at antinodes and nodes. For waves of equal amplitude the ellipse degenerates into a straight line. 2.6 CROSSING PLANE WAVES A wave pattern similar to the standing wave pattern is formed by plane waves A and B with equal amplitudes and crossing at an angle. The wave pattern does not, however, remain fixed in space. In Fig. 2.11 the two wave directions each make an angle 8 with the x-axis. The positions of maximum positive amplitude are shown by continuous lines.
Fig. 2.11. Two sets of plane waves crossing at angle 28. The waves add in phase along the line z = 0 , and again at z = z , . The wavelength of the pattern along the x axis is X sec 8, and the wavelength along the z axis is X cosec 8. The pattern moves in .the x direction with velocity c sec 0.
2.6
Crossing plane waves
27
The intersections lie along lines parallel to the x-axis, separated by X cosec 8. As time progresses, these intersections travel in the x direction with velocity c sec 8. Between the lines of maximum amplitude lie lines of zero amplitude. The waves are said to 'interfere' with one another, giving constructive interference along one set of' planes (seen in section z = 0 and Z= zl in Fig. 2.1 1) and destructive interference along the intermediate set, where zeros always coincide with maxima. The phasors for any part of this diagram may be constructed as in Fig. 2.12 by setting one at the phase angle - kxcos 6 - kz sin 8 and the other
Fig. 2.12. Phasors representing the oscillations in a pattern of crossing waves. On
+
the planes z = 0, z = z , , the waves are always in phase. At z = z , / 2 they are in antiphase, and the amplitude is zero. Along a line of constant x the phase is constant, as in a standing wave pattern. The pattern for the two waves with unit amplitudes is represented algebraically by exp i(ot + k ( l x + n z ) )+ exp i(ot + k ( l x - nz))
where I= cos 0 and n =sin 0. The sum reduces to 2 exp iwt exp i klx sin knz
28
Combinations of waves
Chap. 2
at - kxcos 8 + kz sin 8. The variation with z is then seen to be the usual standing wave pattern, repeating at distance zl = 1/(2k sin 8). The pattern in x is a wave progressing with a velocity greater than the velocity of a single wave by the factor sec8. (Note that this is not a group velocity, which for light waves must always be less than c.) An interesting feature of this pattern is the existence of the planes of zero amplitude which cut the plane of Fig. 2.1 1 half-way between the broken lines. For a radio wave these planes can coincide with a metal surface, so that a portion of the crossing wave pattern may be trapped between plane-parallel metal surfaces and will continue to propagate like the full pattern. This is an elementary 'wave guide' mode of propagation. The wave crests, and any other portion of the wave, move forward with velocity c sec 8, which is always greater than c, providing an example of wave velocity, or phase velocity, greater than the velocity of light in free space. 2.7
SIMILARITIES BETWEEN BEATS AND STANDING WAVE PATTERNS
The common use of the phasor diagram to illustrate the phenomena of beats and of standing waves demonstrates their underlying similarity. Beats are variations of amplitude with time at one point, whilst standing waves are variations of amplitude at different positions at one time. Both phenomena can in fact be produced simultaneously in very simple circumstances. In Fig. 2.1 3 two sources of sinusoidal waves S1, S2are separated by a distance of several wavelengths, so that the distances SIX, S2X to a point X may differ by anything from zero to several wavelengths. Such a situation may occur, for example, with two sources of sound, or with two radio transmitters. At X the two waves add, with a phase relation that depends on the relative
Fig. 2.13. Waves from the two sources S, , S2 reach X by different path lengths. As X moves it explores an interference pattern: alternatively if S, and S, transmit different frequencies beats will be heard at X.
2.7
Similarities between beats and standing wave patterns
29
phase of the sources S1 and S2, and on the difference SIX- S2X expressed in terms of the wave number k. Beats can now be produced at X by keeping the geometric arrangement fixed, and transmitting two different frequencies from S1 and S2. Alternatively, the two transmitters can be set to the same frequency, and arranged to transmit exactly in phase. Then if (SIXS2X)= nX the waves will arrive in phase at X, while if at another point X' the path difference (SIX' - S2X1)= (n ++)A the waves will arrive there out of phase. There is therefore a pattern of waves resulting from the interference of the waves, with maxima and minima of amplitude following a simple geometric pattern. Now let the phase of S1 change slowly with respect to S2. The result can be described in two ways: either the interference pattern is moving transversely, or at each point there is a beat between the two transmitters. The physical situation can be reversed, so that X represents a single source or transmitter and S1 and S2 represent two receivers connected together in such a way that the relative phase of the two waves they receive determines the sum of their signals. This situation occurs in optical and radio interferometers, such as the Michelson stellar interferometer. The radio interferometer, as used in radio astronomy, collects waves from a single radio source in two separate antennas, adding them in a single receiver. As the earth rotates a celestial source moves across the interferometer, the path difference changes, and the receiver output varies. Alternatively, in the addition of the two waves an extra phase difference can be inserted deliberately, and if this increases steadily with time, the rate of variation can be increased or decreased according to its relation to the rotation of the Earth. The basic equivalence in these examples is that the linear change of phase with time in a sinusoidal oscillation is equivalent to a change of frequency.
-
Transmitter -m
C
I
Receiver Target velocity
v Frequency difference
Fig. 2.14. A Doppler radar system. The reflected wave from a target receding with velocity v is at a frequency lower by 2 y / X . This is measured as a beat against the transmitted frequency.
30
2.8
Combinations of waves
Chap. 2
STANDING WAVES AND THE DOPPLER EFFECT
A similar comparison of interference in space and in time can be made when a standing wave pattern is formed in front of a moving reflector. This is well illustrated by the Doppler radar system used for measuring the speed of moving aeroplanes or automobiles. In Fig. 2.14 a transmitter T sends out a constant sinusoidal wave, say with wavelength X=3cm (frequency f = 10 GHz). The reflected wave is received at R and added to part of the transmitted wave; the relative phase of these two waves is determined by the distance TXR. If the target X moves with velocity u, this distance changes at a rate 2v, and the beat frequency will be 2v/X. For wavelength 3 cm and a velocity u = 50 km per hour = 14 m s- I , the beat frequency is 93 Hz; this may be used to give a direct reading of the velocity of the target. Alternatively, the target may be stationary and the Doppler radar may be moving in the opposite direction with velocity u. There is a standing wave pattern between the radar and the target, and the position of nodes and antinodes in the pattern is determined by the position of the reflecting target, not the radar. The radar therefore moves through a standing wave pattern, which appears as a beat frequency as before. Yet another way of looking at the same problem is depicted in Fig. 2.15, where the signal reaching the receiver is considered to have originated in an image T ' of the transmitter, which moves with velocity 2v away from the radar. The beat is now between the frequency f and the Doppler shifted frequency f (1 - 2v)/c, giving a beat frequency 2v/ X as before. Doppler radar measurements give velocity, not distance. Distance, or range, is measured by the time of flight of a reflected radar pulse, which travels at the group velocity.
-
Fig. 2.15. The same Doppler radar as in Fig. 2.14 showing an image T' of the transmitter receding at twice the velocity of the reflector. The receiver measures the beat between the transmitters T and T ' .
2.9
Standing waves at a reflector
31
2.9 STANDING WAVES AT A REFLECTOR
The standing wave pattern and the crossing wave patterns of Sections 2.5 and 2.6 can most easily be demonstrated by arranging for the total reflection of a plane wave. A classic demonstration of the standing waves of light reflected by a mirror was carried out in 1891 by Lippmann. He used a photographic plate with very fine grain, backed with a layer of mercury to act as a smooth reflector. A plane wave of monochromatic light, falling on the plate, formed a standing wave which could be seen in the developed film by cutting a section zt a very shallow angle. (Lippmann also showed that a plate exposed and developed in this way could be used for colour photography, since light was reflected selectively according to its wavelength from the planes of silver left in the emulsion.) The Lippmann demonstration can now he repeated much more easily by using radio waves with wavelength of a few centimetres; but it had a particular historical importance in that it showed not merely the wave pattern but the way in which the pattern was related to the reflecting surface. The two standing wave patterns in Fig. 2.16 show the patterns obtained at two different kinds of boundaries. This difference is well known in wind instruments: the resonant oscillations in an organ pipe have a node or antinode at the end of the pipe for closed and open pipes respectively. For electromagnetic waves the relevant boundary conditions are determined by the dielectric and magnetic properties of the material at the boundary. In
Fig. 2.16. Standing wave patterns due to reflections at different types of boundary. (a) Zero amplitude at the boundary. (b) Maximum amplitude at the boundary.
32
Combinations of waves
Chap. 2
1891 it was still interesting to prove that a metal surface, being an excellent conductor, would determine that there would be essentially zero electric field at the surface, giving the pattern of Fig. 2.16(a).The Lippmann films showed this clearly, giving layers of silver starting one quarter wavelength above the surface. The pattern of Fig. 2.16(a) is produced by two waves out of phase at the surface, while in Fig. 2.16(b) the two waves are in phase at the surface. The properties of the surface determine the relation between the incident and the reflected waves, i.e. the reflection coefficient. We will return to this in Chapter 4. PROBLEMS 2 2.1
(i) Show that the energy in the sum of two oscillations is equal to the sum of their individual energies, provided that they differ in frequency and a suitable time average is taken. (ii) For many oscillations with random phases, show that the energy of the sum is nearly equal to the sum of the individual energies. Discuss the necessity for taking a time average. (iii) Under what circumstances can two electromagnetic waves add so that the intensity of the sum is always equal to the sum of their two separate intensities? 2.2 Two plane waves exactly in phase combine to form a wave with double amplitude, i.e. with quadruple power. Where does the extra energy come from? 2.3 A crossing pair of electromagnetic waves as in Fig. 2.11 is confined between two metal planes a distance a apart. Show that the wavelength A, in the direction of propagation is related to the free-space wavelength X by
where n is an integer. What is the wave configuration when X approaches 2a? 2.4 The Relativistic Doppler Effect. A signal received from an oscillator with frequency v moving in a space vehicle with velocity V directly away from an observer has an apparent frequency v 1 where
Compare v - v1 with the value from the non-relativistic formua v1 = v(1- V/c) for an oscillator at 6 GHz moving at 6 km s-I in the line of sight. Find also the transverse Doppler frequency shift for a velocity of 6 km s-I in the line of sight where
CHAPTER
t-our~ertheory and wave groups
3.1 FOURIER SERIES AND SPECTRA
Our analysis so far has dealt with harmonic oscillation at a single frequency, with an infinite extent in time and space. Light is never exactly like that: in all practical circumstances we are dealing with waves covering a range of frequencies and travelling in various directions. To handle these cases we need to add a spectrum of waves, and relate the distribution of the spectrum in frequency and in angle to the waveform of the light. There is a very broad family of such relationships, connecting the waveform of a periodic wave, or even an impulsive or aperiodic wave, to its spectrum. The relationships are universally expressed through Fourier theory, which we now explore. We start by relating the shape of a harmonic wave to its spectrum, through a Fourier series. We found in Section 2.1 that a cosine wave of any phase t$ and amplitude A can be made by adding up cosine and sine waves of zero phase with amplitudes selected as A cos t$ and A sin 6 respectively. We now extend the analysis by asserting that any periodic function can be obtained by adding up a large number of cosine and sine waves with suitably chosen amplitudes and frequencies. The frequencies needed are given by a simple rule: they are harmonics (integral multiples) of the frequency of the original periodic function. So if the original function repeats with a period T, giving it a frequency of 1/T, we can express it as the sum of many sine and cosine waves
34
Fourier theory and wave groups
Chap. 3
Fig. 3.1. An example of Fourier transformation. The spectrum of the saw tooth periodic function shown in (a) is 1 ao, = O for all k and b U , = zfor k a 1 Hence if we take sine waves of frequencies 1/T, 2 / T , 3/T, 4 / T , . . . , and amplitudes 1 1 of 1 , T1 , 7 ,a,. . . , and add them up for various values of t, the saw tooth function should appear. In (b) above the sum of the first four sine waves is shown. The individual sine waves are shown in (c), (d), (e) and (f). It is quite surprising how well these four sine waves can represent the saw tooth. The addition of more components would lead to a closer approximation to the saw tooth.
3.2
Fourier series with exponentials
35
with frequencies 1/T, 2/T, 3/T, . . . , and so on. These are the harmonics of 1/T. Unless the function has an average value of zero (as do cosine and sine waves) a constant term will also be required (which can be regarded as a cosine of zero frequency). In general it takes the sum of an infinitely large number of harmonics to reproduce the original function exactly, but often a good approximation is given by only a few. An illustration of this is given in Fig. 3.1. That a periodic function can be built up in this way out of cosine and sine waves is Fourier's theorem. The mathematical expression for the periodic function f (t) which repeats with a period T is 00
f(t)=+a,+
ak cos k= 1
27rkt
00
j T ) ,C, bk +
27rkt (3.1)
It is well worth taking a lot of trouble to understand this expression. For k = 1 the cos and sin terms have a period T, that is to say the same period as f (t). The sum of the cos and sin terms at k = 1 gives a cosinusoid of amplitude j(a: + b:) and of phase tan - (bl /al ). For k = 2 the cos and sin terms have twice the frequency, half the period, and add to give a cosinusoid of amplitude ,,/(a:+ b:). And so it goes; the cos and sin terms for any k produce a cosinusoid with k cycles in T, the period off (t), and with amplitude and phase as determined by akand bk. The example of Fig. 3.1 should help in understanding this process. The set of numbers +ao and the ak and bp is called the spectrum off (t). It can be plotted on two separate graphs against frequency, one for the cosine terms (ak) and one for the sine terms (bk), in which the size of each term is shown at its correct frequency. Such a spectrum is plotted in Fig. 3.2 for the triangular periodic function shown. The cosine wave has the property of being symmetrical about t = 0, so that cos(2xvt) = cos( - 27rvt) always. Iff (t) has this property so that f (t) =f ( - t) it is said to be an even function. It is easy to see that an even function must be built only from cosines. So for an even function all bk are zero, and the spectrum is referred to as a cosine transform. On the other hand, the sine wave is antisymmetrical or odd, having the property that sin (27rvt) = - sin ( - 27rvt). If f (t ) = -f ( - t ) it too is odd, all ak (including ao) are zero and the spectrum is referred to as a sine transform. 3.2 FOURIER SERIES WITH EXPONENTIALS A great simplification of the Fourier series can be achieved by introducing the idea of exponential frequency terms, allowing the sin and cos to be included in one expression. Let us see how Eq. (3.1) can be simplified as in Eqs. (2.10) and (2.1 1) by expressing the cos and sin terms as exponentials. We obtain
36
Fourier theory and wave groups 00
f (t) = )ao
+)
k = l
Chap. 3
(ak + ibk)exp
The second sum can be made to go from k = the sign of k, giving
-1
to k = - oo by changing
And we can define a complex Fourier coefficient by Ak = )(ak + ibk) for k > 0 Ak = )ao for k = 0 Ak = )(ak - ibk) for k < 0
.
Notice that Ak for kO. With these provisions Eq. (3.2) contracts to the much simpler expression
Each term in the sum represents an exponential frequency term with complex amplitude Ak. The frequencies follow a series, at intervals of 1/T, from - oo through zero to + oo. Because A - =A;, f (t) always remains real. The real and imaginary parts of the complex Ak together form the complex spectrum of the periodic function f (t), and may be plotted as lines on two graphs versus frequency, one for the real and one for the imaginary part. Notice that (having understood it) there is no need always to plot the negative frequency half, as it is irnplied by the conjugate property. (A function with such a conjugate property is said to be Hermitian.)
3.3 FOURIER TRANSFORMS As the period T off (t) gets longer, the separation 1/ T between lines in the spectrum is reduced, and eventually we can introduce a new continuous function F(v) which expresses the density of lines in the spectrum. Since k/T is the number of oscillations in the period T we have v = k/T, and Eq. (3.5) becomes the integral expression
3.3
Fourier transforms
37
Spectrum
COS
Fig. 3.2. Functions and their spectra. Whilst the functions shown are real their spectra have in general both real and imaginary parts. This is because a real cosine (top) is made up of wholly real components, whilst a real sine (middle) is made up of wholly imaginary components. The chopped triangular function is even, like a cosine, so its spectrum is real, made up of lines spaced at 1/T. As its average is not zero it has a zero frequency component, corresponding to a fixed offset.
38
Fourier theory and wave groups
f (t) =
[+
a,
Chap. 3
F(v) exp (2~ivt)dv.
(3 6)
In the above expression F(v) is not a function of t and integration over v removes V,leaving a function only of t. The two functions f (t) and F(v) are said to be a Fourier transform pair, and t and v to be transform variables. We do not prove it here, but it is true that +a,
F (v) =
-
f (t) exp ( - 2 ~ i v t ) d t.
(3 07)
m
Notice the negative sign in the exponential. The Fourier transform, as opposed to the Fourier series, applies to a function with infinite period, which means a function with no period at all, such as a single impulse. A periodic function, whatever its shape, has a spectrum consisting of discrete line components at frequencies which are harmonics of the basic repetition frequency. A non-periodic function has a spectrum which is continuous, and which can only be described by a density of components per unit frequency range. Fourier transforms are, of course, purely mathematical relationships, and apply to any variable quantity. The time-varying expressions Eqs. (3.6) and (3.7) may be written more generally
and F (u) =
[
00
f(x) exp ( - 2riux) d~
-
(3 9)
where the product xu is dimensionless. For example, x may be a length, when u must have the dimensions of an inverse length. 3.4
FINDING THE FOURIER TRANSFORM OF A NON-PERIODIC FUNCTION
We illustrate the evaluation of the continuous spectrum F(v) of a function f (t) through the transform integral Eq. (3.7) in the useful example where f (t) is a single square pulse (Fig. 3.3), with height h and width b, centred on the origin of t. For this 'top-hat' function:
and
f (t) = 0
elsewhere
.
3.4
Finding the Fourier transform of a non-periodic function
39
Fig. 3.3(a). Fourier transformation of a 'top-hat' function.
Fig. 3.3(b). Fourier transformation of a Gaussian function.
The integral (3.6) becomes
2rvb sin $ where $ = 2 $
= hb-
The Fourier transform of a top-hat function is called a sinc function. We will see in Chapter 9 that the sinc function describes the Fraunhofer diffraction pattern of a single slit. The width of the sinc function varies inversely as the width of the top-hat; this corresponds to the angular width of the diffraction pattern varying inversely as the width of the slit. Another useful transformation is that of the Gaussian function
40
Fourier theory and wave groups
Chap. 3
Evaluation by the same process, using the identity
gives the transform
The transform of a Gaussian function is therefore another Gaussian function, whose width l / r u is inversely proportional to the original width a. The two functions chosen for these examples are symmetrical about the origin. In this case the transform is always real. Any asymmetrical part of f (t) would result in a complex transform, in which any component of F(v) must be specified in terms of a phase as well as an amplitude. Note however, that a change of origin of t would not affect the amplitudes of the components of F(v), but only their phases. The evaluation of Fourier transforms in practice may require large-scale numerical analysis on a computer. Fortunately this is such a frequent requirement that standard computer programs are available even for functions requiring a very large number of points for their definition. 3.5
THE TRANSFORM OF A MODULATED WAVE
A further class of interesting functionsf (t) is provided by cosinusoidal waves whose amplitude varies with time, either periodically as in the beat pattern of Fig. 2.4, or aperiodically as in an isolated group of waves. If the modulating function is g(t), so that the wave is g(t) cos(27rvlt), then its spectrum is
where G(v) is the transform of the modulating function g(t). Examples are depicted in Fig. 3.4. A cosine modulation has a spectrum which shows 'sidebands' on either side of the components at L-v l . This is a familiar situation in modulated radio waves: it also explains the appearance of 'ghosts' in the diffraction pattern of a grating (Section 11.6). The 'tophat' modulation of Fig. 3.4(b) broadens the line components into sinc functions covering a frequency band inversely proportional to the length of the wave train. The Gaussian modulation of Fig. 3.4(c) has Gaussian spectral line components; this is the spectrum of a wave group, such as the group of matter waves associated with a single particle.
3 -6
Convolution
41
Fig. 3.4. Fourier transforms of modulated cosine waves. (a) Cosine modulation, period 1,.( b ) Top-hat modulation. (c) Gaussian modulation.
3.6 CONVOLUTION We now state the convolution theorem which enables us to find the Fourier transform of a further class of functions, those which are obtainable by
convolving together two functions, sayf (T)and g(7). A convolution is defined by the function
42
Fourier theory and wave groups
Chap. 3
The convolution function occurs naturally in the response of any optical instrument which is intended to 'resolve' light either according to direction or wavelength, but which has a limited resolving power which inevitably modifies and degrades the image or spectrum. For example, a photograph of a point source taken by an astronomical telescope appears to show the light originating from an extended source, which is the result of diffraction. Let a cross-section of this apparent source have brightness distribution g(8). Then a photograph of an object which has an actual brightness distribution f (8) has a blurred image h(8) made by a combination of the two functions. The image at 8 is made up of contributions from a range of angles covered by the blurring function. At a point a the true brightness is f (a), while the blurring function centred on the point 8 is g(8 - a). The resultant at 8 is the integral over a:
This is a convolution of the source function f (8) with the instrumental response g (8). We now write the Fourier transforms of the functions f (t), g(t), h(t) as F(v), G(v), H(v); it can then easily be shown that
This is the convolution theorem, which may be stated as follows: The Fourier transform of the convolution of two functions is the product of their individual transforms. We will apply the convolution theorem in diffraction theory (Chapter 9), where the functions f (t), etc., are replaced by amplitude distributions across a diffraction aperture; their transforms then represent the corresponding diffraction patterns.
3.7
DELTA AND GRATING FUNCTIONS
When a single pulse becomes infinitely narrow, its transform becomes infinitely wide. It is convenient to describe the infinitely narrow function as a delta-function. The area under a delta-function is unity, so that for any reasonable function f (x): for rr<xo
3.9
Wavegroups
43
A regular sequence of delta-functions is known as the grating function. This may be regarded as a periodic function with an infinite series of harmonics: in fact its transform is another infinite sequence of deltafunctions. The periodicities in the grating function and its transform are related inversely.
* 3.8
AUTOCORRELATION AND THE POWER SPECTRUM
If the amplitude A(t) of a time-varying quantity is convolved with itself, the result is the Fourier transform of its power spectrum. Although this follows from Section 3.6, the result is so valuable that we set out a proof as follows. The self-convolution is also known as the autocorrelation function rll. We define
Now take the Fourier transform of this and manipulate
=
1[I + A (t
1
r)AS(t)dt exp (2rirv)dr
This is the power spectrum. 3.9 WAVE GROUPS
Modulated waves, and in particular wave groups such as those of Fig. 3.4(c), are of great importance in many branches of physics. In view of this we now consider modulated waves in a simple physical way, so as to illuminate the mathematical results of the Fourier approach in Section 3.5. We start with two wave components only. The addition of two waves with equal amplitude a but different angular frequencies w f A d 2 and wave numbers k k Ak/2 is expressed as:
Fourier theory and wave groups
44
a
x[
= tz exp,i(ot
= 2o sin
(
i
Chap. 3
- ~ ) t + ( k - F x ] ]
+ ,,[exp[i
Aot + Akx
)
(y+ F)]+ [exp
exp (i(wt + kx))
Awt i(i
+ Akx i)]]
.
The exponential term is a wave at the centre frequency, and the sine term is a modulation of the wave in time at angular frequency A d 2 and in space with wave number Ak/2. The wave at the centre frequency moves as before with a phase velocity o/k. The modulation moves with a different velocity given by the ratio
Any pair in a group of waves can be analysed in this way, so we may deduce that the whole group will move with the same velocity as the sinusoidal modulation pattern. In the limit, a group must be considered as an infinite series of waves all with angular frequencies and wave numbers near w and k. The group velocity is then the differential do/dk, as found in Section 1.7. The limitation of the duration of a wave, or its extent in space, requires the superposition of not only two waves but of an infinite series. Two waves differing by Af in frequency reinforce over a time T= l/Af, which is the time between successive beat minima. A group lasting for time T must consist of sinusoidal waves spread over a range Af = 1 / ~ so , that they are in phase during the time T, and outside this time their relative phases become random. Similarly a limitation of a group of waves to a length L implies that the group contains a range of wavelengths such that the component waves'become out of step outside the group; if L = nX then the range of AX is given by X/AX = n.
* 3.10
THE SCHRODINGER WAVE GROUP
As an example from another branch of physics we consider the Schrodinger wave which describes the behaviour of material particles. A single particle may be regarded as a wave group, or wave packet. In wave mechanics we equate the group velocity vg = dw/dk of the wave packet to the particle velocity. From diffraction experiments it is found that the wavelength of
*
3.10
The Schrodinger wave group
45
particle waves is related to the momentum p by k = p/h, where h = h / 2 ~ , so that a simple dispersion equation can be written. Since u, = p/m we obtain
and by integration
where V is an integration constant. Assuming the wave has the form
+ = +o exp (i(ot - kx) j we see that the time-dependent Schrodinger equation
+
is consistent with Eq. (3.24). The wave quantity has been introduced as a variable, although it is only observable as $$*, when it represents the particle density. The wave packet is made up of waves with a range AX of wavelengths, or Ak in wave numbers. The shape of the packet, and the shape of the spectrum, may both be Gaussian, as in Fig. 3.4(c). Then the width of the packet Ax and the range of wave numbers Ak are related by
Using the relation between k and p we obtain Heisenberg's uncertainty relation
The interpretation in terms of a particle is that the accuracy with which its position can be determined depends inversely on the accuracy with which its momentum can be determined at the same time. In terms of waves, a narrow spectral distribution produces a wide group, and vice versa. This relationship appears explicitly in the Fourier transforms of Section 3.4.
46
Fourier theory and wave groups
Chap. 3
3.11 AN ANGULAR SPREAD OF PLANE WAVES
The wave groups discussed in previous sections are limited in extent only along the direction of travel. A wave packet describing a particle, or a simple light beam, must also have a limited extent laterally. Can this also be regarded as a result of superposing plane waves? The longitudinal extent of a wave group is governed by the range of wavelengths of the plane waves constituting the group: we now show that the lateral extent is determined by a spread in wave directions rather than by a spread in wavelengths. In Section 2.6 we considered the addition of two consinusoidal waves crossing at an angle 28, and we found that the resultant wave pattern showed a cosine variation of amplitude across the wavefront, i.e. perpendicular to the direction of propagation. For small 8, the maxima are separated by a distance X/8. If we now add more pairs of waves, with the same wavelength but crossing at different values of 8, we add further cosinusoidally varying components with different scales X/8. Following the same idea as in the longitudinal limitation of the wave group, we see that these different scales of lateral variation can add to produce a wave limited in space transverse to the direction of propagation. A wavefront limited in this way to a lateral extent D requires a range of crossing waves with angles from zero to 8 = X/D. The required distribution of wave amplitude with angle depends on the shape of the distribution of amplitude across the wavefront: following the example of the wave group we may expect the relation to be given again by a Fourier transform. This is explored in more detail in Chapter 9.
PROBLEMS 3 3.1 The negative half-cycles of a sinusoidal waveform E = Eo cos wt are removed by a half-wave rectifier. Show that the resulting wave is represented by the Fourier series 2 3a
w t + -cos
2
2wt- -cos 4wt
15 a
+-.
Show that a full-wave rectifier, which inverts the negative half-cycles, has an output 2
4
cos 2wt + even harmonics
3 . 2 A Gaussian function with height h and width o is f ( t ) = h exp - (t2/202). Show that its Fourier transform is
Problems 3
47
F(v) = d(2~)ohexp ( - 27rv202) .
1
00
(You will require the integral - ,exp ( - 9) dx = all2.)
3.3 Show that the Fourier transform F(v) of an isosceles triangular function, height h, width b, is sin2 $ ab F(v) = h2b2-where $ = -v . P 2
3.4 A decaying wave train is represented by
Show that the Fourier transform o f f (t) is
and hence that the energy spectrum for w close to wo is given by
3.5 What is the spectrum of the modulated wave F ( t ) = (A
+ Bcos wt) cos p t
where p is the carrier frequency and w the modulating frequency? Find the spectrum of the frequency-modulated wave
f (t) = A cos ( p t B coswt) when B is small. (Hint : expand the waveform as a power series in B coswt, and neglect B2). What distinguishes this spectrum from the amplitude-modulated spectrum?
CHAPTER
Wave propagation
4.1 RAYS AND WAVEFRONTS
Everyday experience of light leads naturally to the concept of rays of light, propagating in straight lines, and reflected or refracted according to geometrical laws. The introduction of the concept of a light wave, and its reconciliation with ray theory, is due to Huygens. He postulated that light propagates as a wavefront, and that at any instant every point on the wavefront is the origin of a secondary wave which propagates outwards as a spherical wave. The secondary waves then combine to form a new wavefront, so that successive positions of the wavefront may be calculated by a step-by-step process. Huygens' theory of secondary wavelets provides an explanation both for rectilinear propagation and for diffraction; it is empirical and approximate, but it is still widely used to give a simple insight into many problems such as propagation in anisotropic media and the scattering of light by small particles. Huygens' principle may be extended to cover a periodic wave motion as compared with an impulsive wavefront, and indeed it is incapable in its original form of dealing with situations where propagation is dependent on wavelength. For example, the scattering of light from a dust particle may be strongly dependent on wavelength; again the sum of scattered waves from a number of particles may be dependent on wavelength because of interference effects between the separate waves. A ray of light appears according to Huygens as the normal to a wavefront. Every part of the spherical wavefront W, originating from the point P in
4.2
Reflection at a plane surface
49
Fig. 4.1, gives rise to a wavelet which merges into the new wavefront W ' . The rays indicate the direction of travel of the energy; if the wavefront covers only part of a sphere the edges will show the spreading diffraction waves, but over the uniform part of the wave the energy travels out radially. The empirical nature of this principle is obvious from the necessity of neglecting secondary waves propagating backwards from the wavefront. Even the concentration of energy into rays is not obvious, since sideways propagation is allowed to explain diffraction while no explanation is given of the cancellation of radiation in directions away from the normal to the 'new wavefront. Despite these difficulties, Huygens' principle is a good useful piece of physics, basically because it is closely related to the rigorous propagation theory developed by Kirchhoff.
Fig. 4. 1. Secondary wavelets originating on a spherical wavefront W. They combine to form the new wavefront W ' .
4.2 REFLECTION AT A PLANE SURFACE Using Huygens' principle, a plane wavefront reaching a reflecting surface sets up secondary wavelets successively along the surface, as in Fig. 4.2. The envelope of the secondary wavelets defines the direction of the reflected wavefront. The geometry is obvious, and leads to the law that the inclination of the incident and reflected wavefronts is the same.
50
Wave propagation
Chap. 4
Fig. 4.2. Reflection of a wave at a plane surface, according to Huygens' principle. W, and Wi are successive positions of the incident wave, while the reflected wave W, is made up of wavelets generated as W, reaches the surface.
It is interesting to compare this construction with the first explanation of the law of reflection, attributed to Hero of Alexandria. A light ray between two points A, B is reflected in the mirror M (Fig. 4.3). Hero showed that the path from A to B was a minimum if the angles i and r are equal. A simple demonstration of this is obtained by constructing the mirror image A' of A, when the line A'B must be straight for a minimum distance. The idea of a minimum path length can also be applied to refraction, where we find that it is better expressed as a minimum time of travel.
Fig. 4.3. Reflection of a ray in plane surface. The law of reflection is found by making the path AB a minimum distance.
4.4
4.3
Oscillatory wavcs and Huygcns' principle
51
REFRACTION AT A PLANE BOUNDARY
Applying Huygens' principle t o a boundary between two media in which the velocities of propagation are c, and c2, we obtain Fig. 4.4. Here the secondary wavelets W2 travel at. velocity c2, whereas the original wavefront is made up of wavelets W, travelling with velocity c,. Snell's law of refraction is obtained from
c2 A C sin r -- -- - - . c, BD sin i The refractive index of the second medium is then cl/c2= n, assuming that the first medium is free space. Again following the ray approach introduced by Hero, we find that the same law is obtained from Fig. 4.5 if the condition is that A P + nPB is a
Fig. 4.4. Refraction of a wavefront, according to Huygens' principle. The incident wavefront moving from W, to W[ generates secondary wavelets, which form the refracted wavefront W2.
minimum. This does not now give the shortest distance between A and B, but the shortest time of travel. (The condition may be verified by setting AC = y,, and BD = y2. We require a minimum value of A P + nPB = y , sec i + ny2 sec r given that y, tan i + y2 tan r = CD = constant. Differentiation of both equations gives the required result.) The relation between the condition of minimum light travel time and Huygens' principle emerges when we consider the propagation of oscillatory waves. 4.4
OSCILLATORY WAVES AND HUYGENS' PRINCIPLE
Huygens' principle of secondary waves is a simple and useful concept, but it obviously suffers badly from a close scrutiny. What happens, for example, to the wavelets propagated backwards? Does as much energy propagate sideways as forwards? How much of the original wavefront contributes to
52
Wave propagation
Chap. 4
Fig. 4.5. Refraction of a ray at a plane surface. If n is the refractive index, the laws of refraction correspond to a minimum of the optical path AP + nPB. the formation of one particular piece of the new wavefront? These questions require an extension of Huygens' principle to oscillatory waves as compared with impulsive wavefronts. Secondary waves which are not simple impulses, but oscillatory waves with a definite wavelength, must be added with due regard to their phase relationships. For the section of plane wavefront in Fig. 4.6 the wavelets arriving at P are derived from all points of the wavefront W. The phase is constant over W, so that the phase at P from a typical point Y depends on the path YP. The normal from P meets the wavefront at 0, at a distance OP = D. Then if OY = y , the path YP = (D2 + y2)lI2 or to a first approximation YP - OP = y2/2D. The phase of the wavelet from Y relative to that at 0 is therefore 4 ( y )= 27rA y2/2D. If the wavelets are to add constructively at P then 6 must be less than a/2, giving the condition:
-
Wavelets from outside this distance arrive with all possible phases, and their resultant sum is negligible. The condition (4.2) therefore defines the extent of the wavefront which contributes significantly to the wave at P, and shows that the concept of a ray as the path of an unchanging section of wavefront can only be valid when the section is larger than a definite minimum size.
4.6
Refractive index and wave impedance
53
Fig. 4.6. Wavefront geometry. Only wavelets originating close to 0 contribute to the wave at P.
4.5 FERMAT'S PRINCIPLE
The propagation of a ray is now seen to be governed by a simple law, which is that the phases of all contributions traversing paths close to the geometrical ray are the same. The phase of any contribution depends on the path it follows, and on the refractive index along that path. It can be defined in terms of an optical length along the path, which is determined by the time of travel along the path. Fermat's statement of this fundamental principle is that the optical length of the path is at a stationary value with respect to any small change of path; i.e., for the path from PI to PZ:
1::
n cis is at a stationary value
.
The relation of this law to the simple laws of reflection and refraction may be seen by reference back to Sections 4.2 and 4.3. 4.6 REFRACTIVE INDEX AND WAVE IMPEDANCE A wave encountering a boundary between media with different refractive indices will not only be refracted, as in Section 4.3; it will, in general, also be partially reflected. We will investigate the relation between the incident wave and the refracted and reflected waves, i.e. the transmission and reflection coefficients. In optics it is usually appropriate to use the refractive index to characterize the two media; it is, however, easy to extend the analysis to cover different
54
Wave propagation
Chap. 4
kinds of waves, such as sound waves and guided electromagnetic waves, by using the more general characteristic known as 'wave impedance'. By analogy with mechanical or electrical impedance, a wave impedance is the ratio of two variables in the wave. The gas propagating a sound wave, for example, has at any instant a longitudinal velocity v and an excess pressure p; the ratio p/v is the wave impedance z. The wave impedance is a property of the medium, not the wave; for a sound wave z = cp, where c is the velocity in the medium and p is its density. For electromagnetic waves the impedance is the ratio between the magnitudes of the E and H fields. For a medium with dielectric constant E the wave impedance varies as E - lI2. In geometric optics, where we use refractive index n, we find that n varies as in most practical circumstances; the wave impedance is then z = zo/n, where zo is the impedance of free space (see Section 1.4). 4.7
PARTIAL REFLECTION OF PLANE WAVES
A wave reaching a discontinuity, as for example a light wave entering a glass surface, or a radio wave incident on the surface of the sea, will generally be partly reflected and partly transmitted. If a wave impedance can be assigned to each of the two media, and if the wave is incident normally, the amplitude coefficients of reflection and transmission are easily calculated. Consider the sound wave of Fig. 4.7 reaching a boundary where the wave impedance z = cp changes from zl to z2. The boundary might be a transition from air to water or to a solid material, or it might be discontinuous change in gas composition as at the surface of a balloon. The pressure and the particle velocity in the incident sound wave at the boundary may be expressed as po exp i(ot - klx) and vo exp i(ot - klx) and in the reflected and transmitted waves as p, exp i(ot + klx), v, exp i(ot + klx), p, exp i(ot - k2x), v, exp i(wt - k2x). For each medium p = + zv for a wave travelling in the positive direction, a n d p = - zv for the reflected wave. The reflection and transmission coefficients will be defined as ratios of the pressures, so that R = pr/po and T = p, /po. The relation of R and T to the impedances zl and z2 is found from the conditions at the boundary. Across the boundary the particle velocities must be continuous, as the media do not separate; the pressures are also equal, since the boundary has no mass. Equating velocities and pressures on either side:
4.7
Partial reflection of plane waves
Incident wave
Transm~ttedwove
Pressure po
_
55
\'oI~city v0
Pressure pa
*
,
Retlected wove
)
pressure pr Velocity v,
------------
----J. - - - - - - - - - - - - - - - - - - Medium 2
Medium I
Chorocteristic impedance
Chorocteristic impedance Z2
L,
Fig. 4.7. Reflected and transmitted components of a sound wave incident normally on a boundary between media with different characteristic impedances.
Substituting for impedances z , and z2, Eqs. (4.3) and (4.4) become:
l+R=T
(4.5)
Hence
The reflection coefficient therefore depends directly on the change of impedance between the two media. This is an amplitude reflection coefficient; the power flux is p2/z = zv2. For example, for transmission from air to water we have
pwater
- 10' kgm-3,
c,,,,,
=
1,470 m s.-
I , SO
that
56
Wave propagation
Chap. 4
The reflection coefficient is 0.9994; all but 0.06% of the power is reflected. The same expressions for reflection and transmission coefficients apply for any situation where similar boundary conditions apply to two wave parameters related by a wave impedance. For electromagnetic waves (Section 1.4) the appropriate pair of parameters are the values of electric and magnetic field which are related by:
where p, E are permeability and dielectric constant of the medium, and v is the velocity in the medium. The constant h, EO are the permeability and permittivity of free space. When p1 = p2 = 1 the reflection and transmission coefficients become
where nl and n2 are the refractive indices of the medium, given by n = E +
* 4.8
'I2.
REFLECTION AT OBLIQUE INCIDENCE
A plane electromagnetic wave incident on a plane surface from a direction away from the normal will be reflected differently according to its state of polarization (see Chapter 5 for definitions of polarization). For a sound wave, which is a longitudinal wave and therefore has no transverse polarization, it is possible to resolve the wave into components which are normal and tangential to the surface, and then apply the previous analysis to the normal component. For the electromagnetic wave, it is necessary to specify the state of polarization and relate the field components to the boundary conditions. The characteristic impedances of the two media are zl and 22, and the angles of incidence and refraction are el and e2. The analysis follows from Section 4.7 if it is realized that the impedance z in Eq. 4.7 must be replaced by the ratio of field components tangential to the boundary, since it is these components which must be continuous across the boundary. For Fig. 4.8(a) the components of E cos 8 and H must be continuous, and the reflection coefficient therefore contains z cos 8 instead of z; for Fig. 4.8(b) the coefficient contains 'z/cos 8.
*
4.8
Reflection at oblique incidence
57
Fig. 4.8. Reflected and refracted rays at a boundary. The directions of the vector fields are shown for (a) E-vector in plane of incidence (b) H-vector in plane of
incidence.
The reflection and transmission coefficients are: For E-vector in plane of incidence: RII=
z, COS 8, - 2, COS O] 22 Cos C12
+ 21cos BI
22, cos 0, = z2cos or + z, cos el
(4.12) '
For E-vector perpendicular to plane of incidence
If, as is often the case, the media have both p, n2/nl = sin 0, /sin 8, and the coefficients become: tan (8, - 0,) R~~ = tan (el + 0,) R
sin (0, - el ) - sin@, + e l )
-
9
= p2 = 1
then zl/z2 =
4 sin e2cos O1 = sin 28, + sin 2 4
(4.15)
2 sin 13, cos 0, sin (8, + 0,)
(4.16)
'I TI =
58
Wave propagation
Chap. 4
Angle of inc~dence
Fig. 4.9. Fresnel reflection coefficients for an air/glass boundary, as a function of angle of incidence
These are known as the Fresnel coefficients. A typical plot of the reflection coefficients against O1 is seen in Fig. 4.9. It will be seen that R goes through zero when O1 + O2 = 7r/2, and that it changes sign. At this point the angle of incidence is known as the Brewster angle. The change of sign indicates a phase reversal. Light reflected at the Brewster angle becomes completely linearly polarized, with the electric vector normal to the plane of incidence. It is this behaviour that makes polaroid glasses useful in reducing glare from light reflected off a wet road, and in allowing fishermen to see into a lake despite the reflection of the bright sky in its surface.
* 4.9
ANISOTROPIC PROPAGATION
The direction of a ray, in which energy is propagated, does not necessarily coincide with the wave normal, which is defined by the front of constant phase in the wave. For an anisotropic medium, in which phase velocity varies with direction, the refractive index is not a single number, but a quantity that varies according to the wave direction. Thus the refractive index can be represented as a three-dimensional surface; for example, it can for some kinds of material be represented by an ellipsoid: A Huygens' secondary wave
*
4.10
Double refraction in a uniaxial crystal
59
Fig. 4.10. Ray (R) and wave normal (N) directions for an anisotropic medium.
setting out from a point in an anisotropic medium will then be ellipsoidal also, but the radial distances in the elliptical wave will be the inverse of the radii in the refractive index ellipsoid. Figure 4.10- shows the divergence between ray and wave normal for such a medium. Anisotropy occurs in the propagation of electromagnetic waves through media where an electric field produces a dielectric polarization which is not, in general, in the same direction as the field. (A similar effect might be expected when the magnetic permeability is not a simple scalar quantity, but refractive index is usually determined primarily by the dielectric constant.) This occurs for light propagating through crystals, and for radio waves propagating through an ionized gas containing a static magnetic field. A further complication is that the refractive index in any one direction may be a function of the state of polarization of the wave, so that a ray entering the medium with a random polarization will be split into two components which will be refracted differently. The medium is then said to be doubly refracting or birefringent. The two components may be plane polarized, as for propagation of light in crystals such as calcite, or circularly polarized, as under some circumstances for the propagation of light in quartz or for the propagation of radio waves in the ionosphere.
* 4.10
DOUBLE REFRACTION IN A UNIAXIAL CRYSTAL
An object seen through a slab of calcite crystal appears double, even when the slab has parallel sides and is normal to the line of sight. Instead of the usual single ray path from an object point to the eye, there are two. One behaves normally, but the other ray path behaves as in Fig. 4.10 and does not obey Snell's law of refraction. Calcite is a 'uniaxial' crystal, in which only one of the two secondary waves is non-spherical. The spherical wave is known as the ordinary wave and the elliptical wave as the extraordinary wave. The two wavefronts touch in one direction,
60
Wave propagation
Chap. 4
?
Optic axis
(0)
Section in plane of optic axis. Ordinary wave is polarized perpendicular to section, extraordinary wave is polarized in plane of section.
(a)
(6) (b) Optic axis perpendicular to refracting surface. A11
rays and wave normals coincide.
( c ) Optic
surface. Rays and wave normals coincide, but wave velocities differ. axis parallel Lo refracting
Fig. 4.11. Double refraction in a uniaxial crystal. Normal incidence. known as the optic axis. (For biaxial crystals both secondary waves are ellipsoidal.) The refraction at a crystal face depends on the direction of incidence, on the polarization, and on the orientation of the optic axis in the crystal. The series of examples shown in Fig. 4.11 illustrates the splitting of a randomly polarized ray into the ordinary and extraordinary components. The ordinary wavelets are spherical, touching the ellipsoidal extraordinary wavelets along the optic axis. The general case for oblique incidence may be understood by a similar Huygens' construction, as in Fig. 4.12. Here two points on the surface originate wavelets at successive times as the wavefront reaches them, and
Problems 4
61
Fig. 4.12. Double refraction in a uniaxial crystal. Oblique incidence, in plane of optic axis.
the refracted wavefronts may be constructed by joining the secondary waves as before.
PROBLEMS 4 4.1 In the Pulfrich refractometer (Fig. 4.13), the refractive index of a liquid is found by measuring the emergent angle e from the prism whose refractive index is N. Show that if i is nearly 90°, n = ( N 2- sin2 e)1/2
Fig. 4.13.
4.2 The angular width of a rainbow may be found from the geometry of the ray in Fig. 4.14, which lies in the meridian plane of a spherical drop of water
62
Wave propagation
Chap. 4
with refractive index n. The angular width is a stationary value of the deviation angle; show that it is given by
'
cos i = Jn2; -.
Fig. 4.14.
4.3 Show that the apparent diameter of the bore of a glass capillary tube, as seen normally from the outside, is independent of the outer diameter. 4.4 Show that the lateral displacement d of a ray passing through a plane-parallel plate of glass refractive index n, thickness t, is related to the angle of incidence 4 by
provided that 4 is small. 4.5 If the refractive index n of a slab of material varies in a direction y, perpendicular to the x-axis, show by using Huygens' construction that a ray travelling nearly parallel to the x-axis will follow an arc with radius
4.6 Show that the distance of the horizon as seen by an observer at height h metres is approximately 3.6 h" kilometres. Use the result of Problem 4.5 to calculate how this is affected by atmospheric refraction, if this is due to pressure changes only with an exponential scale height of 10 kilometres. The refractive index of air at ground level is approximately 1.00028. 4.7 A ray of light falls at angle of incidence i on a mirror surface with a velocity v small compared with c. Use Huygens' construction to show that the angle of reflection r differs from i by approximately (2v/c)i for small i. 4.8 Two glass slabs, with refractive indices n, and n,, are glued together with a thin layer of transparent material, thickness d, with refractive index n,. Show that for n,l= X/2 the reflection coefficient for normal incidence is
Problems 4
63
while if n,l= X/4 the reflection coefficient is
Hence show that for a quarter-wave layer with correctly chosen refractive index the reflection coefficient is zero. 4.9 What fraction of light is reflected at the surface of a lens with refractive index 1.5? Show how this may be improved by a suitable surface coating, and discuss the effect if a wide range of wavelengths is involved. 4.10 In the solar spectrum the same Fraunhofer line at 600 nm appears at wavelengths differing by 0.004 nm at the pole and at the edge of the disc near the equator, Find the velocity at the equator, and deduce the rotation period, given that the Sun's distance is 500 light-seconds and that it subtends an angle of 30' at the Earth. 4.11 The Crab Pulsar emits a precisely periodic pulse train whose frequency is close to 30 Hz. It lies in a direction close to the ecliptic plane, in which the Earth orbits round the Sun. Calculate the peak-to-peak variation in the observed pulse frequency due to the Earth's annual motion given that the Sun's distance is 500 light-seconds. 4.12 Find the half-width of the Ha line (X = 656 nm) emitted by hydrogen at a temperature 50 "C assuming that the width is entirely due to the Doppler effect. The half-width is AX = X,- A,/,, where the intensity is half maximum at wivelength XI,,. 4.13 The concept of wave impedance (Section 4.6) is useful in calculations of energy flow in waves, as in the following examples: (i) For electromagnetic waves the energy flow per unit area (the Poynting Flux) is E2/z where z is the impedance of the medium. In free space z = 377 ohms. Calculate the electric field due to normal illumination from a desk lamp. (ii) For sound waves the impedance z is the ratio of pressure p to velocity v of the wave in the medium, and the energy flow is p2/z per unit area. For air z = 412 kg m-2 s-l; estimate the power entering the earhole in a normal conversation and hence find the pressure changes.
CHAPTER
Polarization of Iight
5.1 POLARIZATION OF TRANSVERSE WAVES
In a transverse wave along a direction x the variable quantity is a vector which may be in any direction in the orthogonal plane y,z. The relevant variable for electromagnetic waves is the electric field E (Section 1.3). The polarization of the wave is the description of the behaviour of the vector E in the plane y,z. The plane of polarization is defined as the plane containing the ray, i.e. the x-axis, and the electric field vector. If the vector E remains in a fixed direction, the wave is said to be linearly or planepolarized; if the direction changes randomly with time, the wave is said to be randomly polarized, or unpolarized. The vector E can also rotate uniformly at the wave frequency, as observed at a fixed point on the ray; the polarization is then circular, either right- or left-handed. It is convenient to consider the components Eyand Ezon the orthogonal axes y and z as independent plane-polarized waves with individual amplitudes and phases. The vector addition of these two components can produce any state of polarization of the actual electric field. The addition must be done at every instant of time, but if the two oscillations have the same frequency the addition need only be done once. However, the relative phase of the two oscillations must now be taken into account. If the two oscillations are in phase, the successive vector sums are as in Fig. 5.l(a). The resultant is a vector at a constant angle to the y-axis. Two plane-polarized waves have combined to produce another wave which is also plane polarized.
5.1
Polarization of transverse waves
65
If the two oscillations are in quadrature, so that their values of 4 in , successive equations of the form E = a cos(ot - kx + 4) differ by ~ / 2 the vector additions follow Fig. 5.l(b). The resultant now rotates, following a circle if the two amplitudes a, b are the same, or an ellipse if a # b. If a is the amplitude on the y axis, and b the amplitude on the z axis, and if the a oscillation is phase advanced on b the rotation is anticlockwise; if it is retarded the rotation is clockwise. The two plane-polarized waves have combined to produce a wave which is elliptically polarized. Linearly polarized light is a surprisingly common phenomenon in everyday circumstances. It can be detected by the use of the 'polaroid' material in glasses used, especially by motorists, to reduce glare in bright sunlight. Polaroid transmits light which is plane polarized in one direction only, and absorbs light polarized perpendicular to this direction. Light reflected from any smooth surface, such as a wet road or a polished table top, is partially linearly polarized; this is easily demonstrated by turning the polaroid glass, which gives a change in brightness according to the change in angle between the plane of polarization and the transmission axis of the polaroid. The maximum effect is found for reflection at a particular angle of incidence, the 'Brewster angle' (Section 4.8). The light of the blue sky, which is sunlight scattered through an angle, is also noticeably polarized.
Fig. 5.1. Vector addition of two oscillating electric fields, at successive moments through one half-cycle. The two fields have equal amplitudes and are mutually perpendicular: in (a) they are in phase and in (b)they are in quadrature. The resultant oscillation is linearly polarized in (a) and circularly polarized in (b).
66
Polarization of light
Chap. 5
Circular polarization is less easily observed, but it is important in several phenomena concerned with propagation in anisotropic media, such as the propagation of light in crystals such as quartz, and in some liquids such as sugar solution; it also has a particular importance in the propagation of radio waves through the ionosphere, where the Earth's magnetic field is responsible for the anisotropy. The common characteristic of these particular media is that light of different hands of circular polarization travels with different velocities, so that if two hands are propagated together, their phase relation changes progressively along the line of sight. The addition of two circularly polarized oscillations or waves, of equal amplitudes but with opposite hands, results in a plane-polarized wave whose plane depends on the relative phase of the two circular oscillations. The difference in propagation velocity therefore results in a rotation of the plane of polarization, as observed for example in the propagation of plane-polarized light in sugar solution; it is also observed in the Faraday effect (Section 5.8).
Ir 5.2 FORMAL DESCRIPTION OF POLARIZED LIGHT ELLIPTICAL POLARIZATION We shall show how any state of polarization in a ray of light can be described in terms of elementary components of plane-polarized light. It is necessary to distinguish first between the polarized and unpolarized components of the ray. The intensity of the ray, as opposed to the amplitude, is made up of a simple addition of these components; the unpolarized component is one in which there is no consistent phase relation between plane vector components in any two mutually perpendicular directions in a plane normal to the direction of propagation. The polarized component of a wave propagating along the x-axis of an orthogonal system of axes may be analysed in terms of components along the axes y,z. These components are the two plane-polarized waves Ey = a exp(i(wt - kx))
where 4 is the phase difference referred to in Section 5.1. If $ = 0,Ey and Ezcombine vectorially to give a resultant field E with magnitude (a2+ b2)lI2, and at an angle $ to the y-axis given by
If 4 = 7r/2 and a = b, the wave is circularly polarized; the hand reverses when 4 = - 7r/2. (Definition of hand is opposite in convention between optics
*
5.2
Formal description of polarized light-elliptical polarization
67
Fig. 5.2. Elliptically polarized oscillation, combining linearly polarized oscillations on the z and y axes, with amplitudes a and b, and with an arbitrary phase difference.
and radio; in optics the hand is defined looking back towards the source of the ray, when the wave vector in any one plane rotates clockwise for a righthanded circular polarization.) More generally, adding orthogonal vector oscillations when is not zero or 7/2 produces an elliptical polarization whose major axis does not lie along y or z. We add the real parts of the oscillations, with components Ey and Ez
+
EY a
-=
COS of,
Ez cos(ot++)
-=
b
= cos o t cos
+ - sin o t sin + .
(5.3)
Eliminating the time factor:
This equation describes an ellipse, as in Fig. 5.2. At any time the resultant field vector E reaches to a point on the ellipse moving round as time progresses.
68
Polarization of light
Chap. 5
The ellipse is contained in a rectangle with sides 2a and 2b. The position of the major axis of the ellipse, at an angle $ to they axis, is found as follows: The amplitude E: + E: is maximum on the major axis. Therefore at this point
Also from:
On the major axis we have tan $=Ez/Ey, so that combining Eqs. ( 5 . 5 ) and ( 5 . 6 ) 1 cos$ ---
a2
1 tan$=--ab b2
cost$ cot $ ab
.
Since tan $ - cot $ = - 2 cot 2$, we find: tan 2$ =
2 ab cos $ a2 - b2
The ratio of the maximum and minimum axes of the ellipse may be found by rotating the coordinate axes through the angle $. The axial ratio may in this way be shown to1 be given by R in:
R = + ab sin $ 1+R2 a2 + b2
.
The sign depends on the hand of rotation.
It. 5.3 ANALYSIS OF ELLIPTICALLY POLARIZED WAVES By a suitable choice of amplitudes, a, b, and relative phase $, the vibration ellipse of Fig. 5 . 2 may be set with any axial ratio and with the major axis at any position angle. It follows that any elliptical oscillation or elliptically polarized wave may be analysed into two linearly polarized components at right angles, using axes at any angle to the major axis of the ellipse. The relative phase depends on this position angle; it is only important to note that the two components are in quadrature if they are aligned with the major and minor axes of the ellipse.
5.4
Polarizers
69
The analysis may be done practically by using a device which will change the relative phases of two orthogonal components of a ray by a known amount. A particularly useful device is the 'quarter-wave plate', a transparent slice of anisotropic crystalline material in which the wave velocity differs between two perpendicular directions by such an amount that one component takes a quarter-period longer to propagate than the other (see Section 5.6). This, in combination with a polaroid analyser, can be used for analysis of elliptically polarized light by turning the axes of the quarter-wave plate until a position is found where the emergent light is fully plane polarized. A combination of an analyser and a quarter-wave plate can also be used to determine the state of polarization of an arbitrarily polarized wave. For this purpose it is convenient to think of the most general state of polarization as a combination of elliptical and random polarization. The procedure is as follows: (i) Using the analyser discussed above the amount of plane-polarized light can be determined by rotating this analyser. The remaining light when the analyser is set to admit a minimum intensity may be circularly or randomly polarized. (ii) The quarter-wave plate is inserted before the analyser, and the orientations of both are changed independently to produce a minimum intensity. Elliptically polarized light will give zero intensity in these circumstances, since the quarter-wave plate if properly oriented will turn it into plane-polarized light which will be rejected by the analyser. Any remaining light at minimum intensity must have a random polarization.
5.4
POLARIZERS
Light from most sources is unpolarized. It can be converted into fully polarized light by the removal of one component, usually either plane or circularly polarized. A simple example is the wire grid polarizer used originally for radio waves, but which can be demonstrated to work for infrared light at about 1 pm wavelength. This is simply a parallel grid of thin conducting wires whose diameter and spacing are small compared with the wavelength. In such a grid only the electric field component perpendicular to the wires can exist, and the grid reflects the other component. Such devices are called polarizers when they are used to create polarized light, or analysers when they are used to explore the state of polarization, as in the previous section. If linearly polarized light is incident normally on a linear analyser, with the polarization at angle 8 to the plane of
70
Polarization of light
Chap. 5
transmission, the amplitude transmitted is reduced to cos 0, and the intensity is reduced to cos2B (Malus' law). The metallic wire grid has an analogue in the aligned molecular structure of polaroid sheet. This is a stretched film of polyvinyl alcohol containing iodine; the iodine is in aligned polymeric strings which absorb light polarized parallel to the direction of alignment. A general class of crystals, including the well-known material tourmaline, have the same property of selectively absorbing one plane of polarization: for uniaxial crystals these are known as the dichroics.
5.5 BIREFRINGENT POLARIZERS The refractive index of a crystal depends generally on the direction of polarization in relation to the crystal structure. The two refracted rays inside a crystal (the E and 0 waves; see Section 5.6), obtained from a single incident
Fig. 5.3. Nicol prism. The refractive index of the transparent cement between the parts of the calcite crystal is 1.55, intermediate to the index for the E ray (1.49) and the index of the 0 ray (1.66).
Fig. 5.4. Wollaston prism. The two parts are made of quartz, with the optic axis A in the two directions orthogonal to the ray. The E and 0 rays are separated at the interface.
5.6
Quarter- and half-wave plates
71
ray (see Fig. 4.12), may have orthogonal linear polarizations. In the Nicol prism one of these rays is removed by total internal reflection, giving a single emergent ray which is accurately linearly polarized (Fig. 5.3). In the Wollaston prism (Fig. 5.4) the two polarized rays are both transmitted, but they are separated by a sufficient angle for them to be treated individually; for example, they may go to separate photoelectric detectors. Such devices are used in optical telescopes for measuring the plane-polarized component of starlight. The advantage over the Nicol prism is symmetry: both components are transmitted through similar paths in the crystal, and any absorption is the same for both. A linear polarizer can also be made by reflecting light from a polished surface at oblique incidence (Section 4.8). 5.6 QUARTER- AND HALF-WAVE PLATES
Consider the propagation of plane-polarized light incident normally on a parallel-sided thin slab of crystal such as calcite, cut so that the optic axis is in the plane of the slab. The component of the wave with electric vector parallel to the optic axis travels faster than the perpendicular component; these two components are known as the E and 0 waves (for extraordinary and ordinary). They are shown as wavefronts in Fig. 4.1 l(c). These two components are in phase as they enter the slab, but the E wave travels faster and a phase difference 6 grows as they travel. If the two refractive indices are n, and no, the phase difference after a distance d is: 27r X
6 = -(no
Quarter wave
- n,)d
Halfwave
radians.
Quarterwave
Fig. 5.5 A plane-polarized wave at 4 5 O passes through a quarter-wave plate and becomes circular; the hand is reversed in a half-wave plate; and the orthogonal plane is produced in a second quarter-wave plate. The fast axis is indicated by F.
72
Polarization of light
Chap. 5
Fig. 5.6. A plane-polarized wave at angle 0 to the fast direction of a quarter-wave and a half-wave plate, converted into an elliptical and a plane-polarized wave respectively.
Fig. 5.7. Changes of polarization through a series of quarter-wave plates.
The amplitudes of the two components are A cos 8 and A sin 8, where 8 is the angle between the incident plane of polarization and the optic axis and A is the amplitude in the incident ray (Fig. 5.5). Combining these two again with phase difference 6 produces a different state of polarization (Fig. 5.6); for 6 = a/2 this is an ellipse with a principal axis along the optic axis, while for 6 = a the polarization is again plane but rotated by angle 28. In the particular case where 8 = 45" the ellipse becomes a circle. The opposite hand of circular polarization is obtained when 8 = 135". The successive changes in polarization are shown in Fig. 5.7. Crystal slabs giving 6 = a/2 and a are known as quarter-wave and halfwave plates respectively. We have already shown how a quarter-wave plate can be used in the analysis of polarized light. 5.7
OPTICAL ACTIVITY
The double refraction, or 'birefringence' of calcite depends on its anisotropic crystal structure. In some crystals, notably quartz, and in the molecules of
5.8
Induced birefringence
73
many organic substances such as sugar, the molecular structure is a spiral. The refractive index for circularly polarized light then depends on the relation between the hand of polarization and the hand of the spiral structure. A plane-polarized ray traversing an 'optically active' crystal must then be thought of as the combination of two circularly polarized rays, which travel at different speeds. Their relative phases, which determine the position angle of the linear polarization, change along the ray path, and the plane rotates. The rate of rotation of the plane of polarization in quartz, for light propagated along the optic axis, is 21 " per millimetre. In liquids the rotation is normally less, but for the so-called 'liquid crystals', which are liquids in which molecules are partially oriented as in a crystal lattice, the rotation may be very much larger. Cholesteric liquid crystals, in which the molecules have a helical structure, have rotations up to 40 000' per mm. These are used in the digital displays of wristwatches and other instruments (Section 5.9).
5.8 INDUCED BIREFRINGENCE Some materials can be made birefringent by an external electric or magnetic field. The effects can be understood at the atomic or molecular level, as in the permanently birefringent materials. In the Kerr effect the birefringence is induced in a liquid by a transverse electric field. As in a uniaxial crystal, quarter-wave and half-wave plates can be created in a parallel-sided cell, although a cell several centimetres long may be needed in practice. The difference in refractive indices is related to the field E and the wavelength X by
where K is a constant depending on the liquid. The Kerr cell is used to modulate a ray of plane-polarized light. A cell about 10 cm long containing nitrobenzene becomes a half-wave plate when a transverse field of 10 kV cm-I is applied. If the incident polarization is at 45" to the field, the emergent beam is rotated by 90" and it can be rejected by an analyser set in the original plane. The Kerr cell can therefore be used as an electrically operated light switch. The Pockels effect is a birefringence induced in a crystal by a longitudinal electric field. This can also be used for electrical switching of a light beam. The Faraday effect (Fig. 5.8) is induced optical activity, in which a longitudinal magnetic field can induce a rotation of the plane of polarization in an isotropic material such as glass. The rotation is proportional to the magnetic field strength. A particularly simple explanation is available for propagation of a radio wave through a cloud of free electrons, such as in the ionosphere and in interstellar space, where the Faraday effect is easily demonstrated. The oscillation of the electrons in response to the electric field of the
74
Polarization of light
Chap. 5
Fig. 5.8. The Faraday effect. Between the planes A, A ' , a longitudinal magnetic field separates the refractive index into different values for the two hands of circular polarization. The relative phases of the two circularly polarized components of the ,plane-polarized wave change, and the plane rotates.
Fig. 5.9. Molecular alignment in a liquid crystal display (LCD). (a) With no field between the electrodes, the molecules align with the surface structure, which is arranged to give a twisted structure. (b) An electric field aligns the molecules and removes the twist.
Problems 5
75
wave now includes a gyration round the steady magnetic field, and the magnitude of the oscillation depends on the hand of the circular polarization as compared with the direction of natural gyration round the magnetic field. 5.9 LIQUID CRYSTAL DISPLAYS
The familiar LCD digital displays of pocket calculators and wrist watches employ a different form of electro-optic modulator shown in Fig. 5.9. The active element is a liquid crystal, in which long organic molecules align naturally parallel to a liquid-glass interface, but can be realigned by an electric field. The natural alignment is determined by conditions at the surface, so that a twisted structure can be produced in a thin cell, as in Fig. 5.9. The effect is now to rotate the plane of polarization, as in a quartz crystal, and in the arrangement of an LCD display the reflected light from a mirror behind the cell is rejected by a polarizer. Application of an electric field removes the spiral structure and allows light to pass both ways through the cell. PROBLEMS 5 5.1 A pair of crossed polarizers, with axes at angles 8 = 0" and 90°, is placed in a beam of unpolarized light with flux density I,, so that light emerges from the first with I,= + I , and from the second with I, = 0. A third polarizer is placed between the two at angle 8 = 45". What then is I,? If the third polarizer rotates at angular frequency w show that '0
I* = -(1 - cos 4ut) 16
.
5.2 Calculate the thickness of a calcite quarter-wave plate for sodium D light (A = 589 nm), given the refractive indices no = 1.658 and n, = 1.486 for the two linearly polarized modes. 5.3 A plane-polarized wave propagates along the optical axis of quartz as two circularly polarized waves, so that the difference in refractive indices n, - n, introduces a phase difference 6 between the two. Show that the plane of polarization is rotated by angle 6/2. Calculate the thickness of quartz plate that will rotate the plane by 90" at wavelength 760 nm, given ( n, - n, 1 = 6 x 5.4 A printed page appears double if a doubly refracting crystalline plate is placed upon it. Why is it that a distant scene does not appear double when viewed through the same plate? 5.5 Why does a thin plate of doubly refracting crystal generally appear faintly coloured when it is placed between two polarizers? 5.6 Show that an elliptically polarized wave can be regarded as a combination of circularly and linearly polarized waves. 5.7 An elliptically polarized beam of light is passed through a quarter-wave plate and then through a sheet of polaroid. The quarter-wave plate is rotated to two positions where the polaroid shows the light to be plane polarized, and it is found that the plane of polarization is then at angles of 24" and 80" to the vertical. Describe the original elliptical polarization.
CHAPTER
Geometric optics
6.1 THE CONCEPT OF RAYS
Light, which is propagated as a wave motion, may often be represented by rays, which are geometrical lines along which the light is supposed to flow. The concept of a ray must be insufficient to describe what happens at a precise focus in an optical system, but it is sufficiently good to describe the progress of a wavefront through any part of an optical system provided only that the wavelength is small enough. The effects of a finite wavelength are completely neglected in the subject of geometric optics, and it is quite usual to treat the subject entirely in terms of rays. It is nevertheless interesting to recall that Fermat's principle of least time, which is the basis of optical theory, is itself only understandable in terms of a wave theory. If many rays pass through an optical system from an object point to an image point, the optical distance must be the same along each ray, giving a proper addition of waves in phase at the object point. However, most optical systems are so far from perfect that such an accurate superposition only occurs for a small bundle of rays passing through part of the system, while other bundles focus correctly at another image point. Generally speaking a point object will, at any point near a nominal image point, provide rays from different parts of the optical system whose relative paths differ by many wavelengths, and which do not therefore add coherently; the total intensity is then found by adding the separate intensities and not the amplitudes. This general behaviour is illustrated in Fig. 6.1, which shows an optical system concentrating the light from a point source Po into an image PI.
6.2
Perfect imaging
77
Fig. 6.1. The caustic curve of light intensity from a source Po imaged in a general optical system. The curve is a section of a surface known as the caustic surface.
Ideally every ray from Po would pass through P I , but as everyone knows who has tried to focus a point source of light with a cheap lens, or with a good lens held at an angle, or who has observed sunlight reflected on the inside wall of a beer mug, it is a common experience to see a concentration of rays well outside P I . The figure shows a concentration of rays in a curve known as a caustic curve, in which the true image point is only a bright cusp. On any point of the caustic curve rays from one region of the system are adding in intensity. At any point in the image space a ray can be drawn, originating in Po, but the image is of course limited to a region where the rays are concentrated. Any ray path, even if it is to a point outside the image, must be traceable by using only Snell's law, or of course by Fermat's principle. An optical ray is therefore any possible path through an aperture small compared with the whole optical system, but large compared with the wavelength. The idea of intensity can be introduced by tracing a bundle of rays, with intensity varying as the density of rays in the bundle as its crosssection changes through the optical system. To allow for the different behaviours of differently polarized light, any ray may have a state of polarization associated with it, so that at any boundary the correct refraction and reflection laws may be applied. 6.2
PERFECT IMAGING
We start the study of optical systems by specifying the behaviour of an ideal system. The specification must then be relaxed until useful systems can actually be built, since the ideal is both difficult to attain and fundamentally of limited usefulness. In much of physics, as in less exact disciplines, progress is made through compromise.
78
Geometrical optics
Refractive index
Chap. 6
n,
Fig. 6.2(a). Maxwell's theorem for a 'perfect' optical system. Optical path lengths must be equal for corresponding parts of the object and image, so that for cxamplc n , A , B, = n z A 2 B 2 .
I
Object space
Image
space
Fig. 6.2(b). In a 'perfect' system the optical paths A, B, and A2B2 are equal (Lenz's proof of Maxwell's theorem).
An ideal, or perfect, optical system must surely be one in which every point in a n object space corresponds precisely to a point in a n image space, being connected to it by rays passing through all points of the optical system. The optical path from any object point to its image point is therefore the same along all rays.
6.2
Perfect imaging
79
It is clearly desirable that any optical instrument should be perfect in this sense, but it can easily be shown that almost no useful instrument can be so, for a fundamental reason first formulated by Maxwell. He showed that a perfect optical system can only give a magnification equal to the ratio of the refractive indices in the object and image spaces (Fig. 6.2(a)). For example, if object and image are both in air, the magnification can only be unity, which may not be very useful. A mirror can evidently be perfect, but a magnifying lens cannot. The following demonstration of Maxwell's theorem is due to Lenz. The theorem states effectively that if two object points Al , B1 give rise to image points A2, B2, the optical paths A1B1 and A2B2must be equal. Suppose in the system (Fig. 6.2(b)) the rays Al B1 and B1A, can both pass through the optical system. They must then pass through B2A2and A2B2 respectively. Since both paths from Al to A2 must have the same length, and also both paths from B1 to B2, it follows immediately that the paths Al B1 and A2B2 must be the same. This proof appears at first glance to be very limited, since the rays AB and BA can hardly be expected both to pass through the aperture. It may, however, be generalized by constructing a curve similar to that in Fig. 6.2(b), but which is made of many segments of actual rays, and integrating the whole path between the intersections of these segments. The simplest example of a perfect optical system is a plane mirror. Even a single plane refracting surface only approaches perfection for rays which are nearly normal to its surface; away from the normal a bundle of rays from a single point does not form a perfect point, or 'stigmatic' image. A theoretical example of a perfect system, known as the 'fish-eye' lens, was invented by Maxwell; this uses an infinite spherical lens with refractive index varying with radius in such a way that all rays diverging from any point would converge on another point (see Born and Wolf, 1980).
Fig. 6.3. Stigmatic imaging (a) in an ellipsoidal mirror, (b) in a refracting Cartesian oval.
80
Geometric optics
Chap. 6
If a single object point and its image point are specified, they can be connected by stigmatic rays in the two optical systems of Fig. 6.3. The elliptical mirror has the two points as its two foci; if one point is infinitely distant, the reflector becomes the familiar paraboloid of revolution used in telescopes and car headlights. The refracting surface is the more complicated Cartesian oval, named after Descartes. 6.3 PERFECT IMAGING OF SURFACES
The severe restriction of the 'perfect' optical system, in which magnification can only be equal to the ratio of refractive indices in the object and image spaces, does not apply if the object points are restricted to lie on a single definite surface. This surface need not be plane, but the corresponding image points must lie on another definite surface if all points have sharp, or 'stigmatic', images. An example of a curved, but truly stigmatic imaging surface is provided by a spherical lens. (Microscope objectives commonly use such a spherical lens, but with a single flat face. The flat face forms a 'perfect' image, and the spherical lens then provides a stigmatic image of part of this 'object' if it is correctly located.) Fig. 6.4 shows a homogeneous spherical lens, centre 0, with radius a and refractive index n. A point source Po inside the lens is supposed to be imaged at PI outside the lens. If the image is stigmatic, all rays from Po must appear to diverge from P, It is found that this is only possible when OPo= a/n and OP1= na, so that the conjugate surfaces are spherical. It is, of course, not always convenient to restrict object and image surfaces to a special curve such as a sphere, but if it is required that either or both
.
Refractive index = l
Fig. 6.4. Spherical lens. All rays diverging from the point Po appear to diverge from the point P, when OP,=a/n and OP, =na. The points Po and P, lie on spherical conjugate surfaces.
6.4
Sphcrical surfaccs-thc paraxial approximation
81
should be plane in any useful instrument it will be necessary to abandon the requirement that the images should be strictly stigmatic. The departures from true point imaging are called' 'aberrations'. 6.4 SPHERICAL SURFACES-THE TION
PARAXIAL APPROXIMA-
Although it must be accepted that most optical systems are not 'perfect', or even 'stigmatic', there exists a very wide variety of systems which provide more or less exact images over a considerable volume of object and image spaces. The spherical lens is exact only for a pair of conjugate surfaces; if, however, rays are confined close to an axis lying along a radius of the spherical surface, there can be good image formation over a range of distances, as required for a 'perfect' system. We now explore the geometry of imaging in a single spherical refracting surface, using only the 'paraxial' rays. In geometric optics it is essential to use a sign convention in which the focal length, and the power, of a converging lens are both positive; more generally the power of an optical instrument is said to be positive if it produces an inverted image of an infinitely distant object, and negative if it produces an upright image of an infinitely distant object. There are two sign conventions in common use which conform to this criterion; they are the Cartesian and the 'real-positive' conventions. The Cartesian system is to be preferred for the logical development of geometrical formulae, especially when several surfaces are involved, and we will illustrate its use in our geometric treatment of refraction of rays. The 'real-positive' system, by contrast, is based on the actual distances travelled by a light wave; it is therefore well suited to discussions of the physical processes occurring in a
Fig. 6.5. The Cartesian sign convention. All quantities labelled in this diagram of refraction at a spherical surface are positive.
82
Gcomctric optics
Chap. 6
wavefront. In the Cartesian system, the signs of distances are determined by the normal conventions of coordinate geometry. Usually, but not necessarily, rays are considered to travel from left to right. Quantities referring to the left-hand space are distinguished from those in the right-hand space by subscripts. The diagram of Fig. 6.5 may be used for reference; it is arranged so that all the labelled quantities are positive. Refraction of a ray at a single spherical surface, radius r, bounding media with refractive indices, n, and n2, is shown in Fig. 6.6. The point A is taken to be not far from the axis P,CP2. The relation between object distance P I A = -II and image distance AP2 = + I2 is obtained by constructing perpendiculars P l M l and P2M2to the radius through A, when the similar triangles P,M,C, P2M2C give:
-IIcos4,+r - 12cos42-r - l1 sin 4, f2 sin qi2 Using the relation nl sin 4, = n2 sin Q2 this becomes: n2 --n, +-= 11
Refractive index n,
12
n2 cos 42- nl cos 41 r
(6.3)
Refractive index n2
Fig. 6.6. Geometry of a ray refracted at a spherical surface between media of refractive
index n, and n,. P, and P, are conjugate points. The surface as shown has positive power since n2> n,.
6.5
General properties of imaging systems
83
In the paraxial approximation the angles and 42are small, their cosines are near unity, and the familiar relation is obtained:
where P is defined as the power of the surface. The simple lens in air, with two convex surfaces, is analysed by adding two equations of this form, giving a negative sign to the second radius since the centre of curvature is to the left. We obtain:
The power of a thin lens is the sum of the powers of the two surfaces. If the object is at infinity, l3 in (6.5) becomes the focal length f. The power is then l / f ; measuring f in metres, the power of a lens is specified in dioptres.
6.5 GENERAL PROPERTIES OF IMAGING SYSTEMS There are several properties of optical systems, such as the existence of focal planes and some laws governing the paths of rays, which may be demonstrated in actual examples such as the spherical surface we have already discussed. We have found, however, that the single surface, and by extension any number of surfaces, can behave fairly well as a perfect system. We therefore return to the perfect system to find a series of laws which apply quite generally to all practical systems. The geometric specification of a perfect system is that points and lines in the object space correspond to points and lines in the image space. Mathematically, the two spaces are linked by a projective transformation, and it can be shown that for this to be true there must be a linear relation between distances in the object and image spaces. Eq. (6.5) is an example of such a relation; it is only part of the general relation since it involves axial distances only. There is also a linear relationship between perpendicular distances, giving the transverse magnification of the system; the magnification depends of course on the positions of the conjugate planes containing the object and image. A pair of conjugate planes between which the magnification is unity are called the principal planes. The planes whose conjugates are at infinity are the focal planes. The magnification of an object on a focal plane is infinite and negative. The general linear relationship between distances in object and image planes becomes very simple when the system is axially symmetrical and when axial
84
Geometric optics
Chap. 6
Fig. 6.7. Coordinate systems for general axially symmetric optical systems. PI and P2 are conjugate points. Axial distances Z , , Z2 are measured from the focal planes F, , F,.
distances are measured from the focal planes, as in Fig. 6.7. Denoting axial and transverse distances by Z and Y, and using subscripts for the object and image spaces, the relation becomes:
as can be verified by substituting Z1=fi, obtaining unity magnification corresponding to a principal plane, and by substituting Z1= 0, obtaining infinite magnification. The constants fl and f2 are the focal lengths of the system. Equation (6.6) contains the well-known Newton's equation
A longitudinal magnification ML can be found by differentiating Newton's equation:
This indicates, for example, the amount of refocusing required when an object moves closer to a camera lens. An angular magnification MA is defined by the ratio of angles at which a ray cuts the axis in the image and object planes. It is given by the ratio of the transverse and longitudinal magnifications
6.6
Ray tracing-separated
6.6 RAY TRACING-SEPARATED
thin lenses in air
85
THIN LENSES IN AIR
Many optical systelrls use corrlporlerlts which are themselves made up of two or more lenses, combined to provide a minimum aberration over a given field of view. The analysis of any such complex system by the repeated use of the simple lens formula (6.5) soon leads to tedious algebra, and it is more usual to follow a ray-tracing procedure for each individual system. It is instructive to compare the algebraic computation and the geometry of a ray tracing for a pair of thin lenses, in air, separated by a distance d (Fig. 6.8). First we apply the simple lens formula to the two lenses in turn. If the image distance for the first lens is 12, then the object distance for the second is 12- d, so that the image distance 13, measured from the second lens, is found from:
Eliminating l2 gives
The focal points may be found by setting I, and I3 in turn to infinity. Then (6.11) may be recast in the form of Newton's equation (6.7), giving
(I,
1 - dP1 Pi + P2 - dP, P2
Power
)(I3-
1 - dP2 PI + P 2 - d P l P 2
Power P,
Fig. 6.8. Separated thin lenses treated by the simple lens formula.
86
Geometric optics
Chap. 6
This shows that the double lens behaves like a single surface with focal points at distances 1 - dP2 dP, and PI P2- dP, P2 PI P2- dP, P2
+
1
-
+
outside the thin lenses, and with a power ( P I + P2- dP1P2). The alternative approach, tracing a ray through the system, is illustrated in Fig. 6.9. We require first to find the angular deviation D of a ray through a single thin lens, which behaves like a prism whose angle increases with distance y above the axis. A prism with small angle a deviates a ray falling nearly normally upon it by an angle ( n - 1)a.where n is the refractive index. (A proof of this is given in Section 8.2.) The angle a is y/rl -y/r2, so that the deviation D is
Following a ray which cuts the first lens at distance y l from the axis, and the second at distance y2 from the axis, the total deviation is given by the sum
As in Fig. 6.9, consider a ray initially parallel to the axis. Then
Combining Eqs. (6.14) and (6.15):
Fig. 6.9. Ray tracing in a separated thin lens system. A ray parallel to the axis is deviated in both lenses, and crosses the axis at the focus F, .
6.7
Ray tracing-the sine law
87
showing immediately that the power of the combination is ( P I + P2dPl P2). The focal points may also be found geometrically from the figure, without recourse to tedious algebra. SINE LAW
6.7 RAY TRACING-THE
More complex lens systems are evidently analysed better by a ray-tracing procedure than by the continued application of the simple lens formula. Furthermore, the quality of a lens system is determined by the precision with which all rays from an object point pass through an image point. This is tested by ray tracing, including rays at the largest possible angle from the axial direction. A simple condition which must be obeyed by all rays in such a wide pencil of rays'joining a small object near the axis and its stigmatic image is Abbt's sine condition. This may be obtained from Fig. 6.10, which shows two rays of such a pencil through a single spherical surface. One is the undeviated ray Plop2through the centre of curvature C, the other PIAP2 is a ray inclined at a large angle to the axis. Since PI and P2are both near the axis, the angles O1 and O2 represent the inclination of the extreme ray of the pencil before and after refraction. From the undeviated P I0 P 2 the well-known formula for magnification at a single surface gives (the negative sign indicates an inversion):
The triangle Pl AC contains angles O1 and
?r -
4I related by:
sin 0, sin 4, ---. -
r
Object height y,
PIC
lrrioge heiaht I
Fig. 6.10. Two rays from a wide pencil joining P, with its stigmatic image P,.
88
Geometric optics
Chap. 6
Similarly from triangle AP2C: sin O2 - sin 42 r P2C Hence, since n, sin
= n2 sin 4 :
This is AbbC's sine condition, which must be obeyed by all rays setting off from P1 at any angle which can be accepted by the system. It will be noted that the derivation contains no approximations, so that the condition applies strictly to any system producing stigmatic images of objects lying on an object plane. It is therefore a useful test of the precision of any optical system. 6.8 SPHERICAL SURFACEST H E CONCEPT OF WAVEFRONTS
The development of ray optics, following the two previous sections, is essential for the detailed design of optical instruments. There is, however, an enormous step between the simple lens formulae and the work involved in the design, for example, of a lens suitable for wide-angle photography. Two stages are involved. The first is taken by physical intuition: it is the choice of suitable components based on a general knowledge of their behaviour. The second is the detailed tracing of rays through the system, including a wide range of positions and angles. The advent of high-speed digital computation has allowed this second stage to be repeated for a range of parameters in the optical system; the computer output can then include a specified measure of image quality as a function of any system parameter, and a pattern of adjustment can be followed automatically so as to 'optimize' the system according to predetermined criteria. It is no use giving the computer the wrong systems to work on in the first place. We need, therefore, to develop a knowledge of the behaviour of simple components, including the aberrations they introduce. This development is begun in the next two chapters. It is based on the concept of wavefronts rather than rays; we therefore end this chapter with wavefront treatment of refraction at a spherical surface, for comparison with the ray treatment of Section 6.4. In Fig. 6.11 the wavefront from a point P1 is seen as it arrives at the refracting surface, and as it leaves it. Between these positions there is an equal optical path between all corresponding points on the wavefront. Considering a portion of the wavefront at a distance h from the axis P1P2, this means that the optical path ABC, traversed in medium with
Problems 6
89
Kel ind
Fig. 6.11. A wavefront refracted at a single spherical surface. The optical paths ABC and DE are equal. refractive index n,, is equal to the path DE traversed in medium n2. We will now measure distances traversed by light between the object and the image as positive quantities, using the symbols u and v t o distinguish from the Cartesian symbols used in Section 6.4. If h is small compared with the distances u and v, and with the radius r of the surface, then approximately AB = h2/2u and BC = h2/2r, while DE = h2/2r- h2/2v. The equal optical paths then give:
giving
Apart from the change of sign convention, this is identical to Eq. (6.4). The surface acts on the wavefront so as to change its curvature. The precision of the emergent wavefront which converges on P2 depends on the approximation adopted in Eq. (6.21); we now examine this approximation further in a study of aberrations. PROBLEMS 6 6.1 A thin concave mirror is silvered on both sides. If an object 0 is reflected in it as an image at 0', check that an object at 0' will be imaged correctly at 0.
90
Geometric optics
Chap. 6
6.2 Two identical plano-convex lenses are each silvered on one face only, one on the plane face and the other on the convex face. Find the ratio of their focal lengths for light incident on the unsilvered side. 6.3 A lens with refractive index 1.52 is submerged in carbon disulphide, which has refractive index 1.63. What happens to its focal length? 6.4 A small fish swims along the diameter of a spherical gold fish bowl, directly towards an observer. Find how its apparent position varies in terms of the bowl radius R and liquid refractive index n. 6.5 Find the focal length of two lenses with focal lengths f l and f,, separated by a distance d which is filled with a liquid with refractive index n. 6.6 In a plane-parallel circular disc of refracting material, the refractive index n(r) is a function only of the distance from the axis of the disc. Show that the radius of curvature of a ray nearly parallel to the axis is (l/n dn/dx)-l. Find the form of n(r) which will make the disc act as a concave lens. Design a microwave lens with focal length 5 m from such a disc with radius 1 m, allowing the refractive index to vary between 1.1 and 1.4. 6.7 A reflecting surface giving stigmatic images of two conjugate points is a paraboloid of revolution when one of the conjugate points is on the axis and at infinity. Show that a single refracting surface between refractive indices n1 and n, is similarly aplanatic for an object at infinity when it is (a) an ellipsoid of revolution, (b) a hyperboloid of revolution. 6.8 The magnification between the object and image at the foci of the elliptical mirror (Fig. 6.3) might be calculated from paraxial rays reflected either to the left or to the right of the foci. These evidently give different magnifications, while symmetry apparently demands unit magnification. What is wrong with this analysis? 6.9 If a glass filter, thickness d and refractive index n, is inserted between a camera lens and the photographic plate, show that the plate must be moved a distance d/n towards the lens for focus to be maintained. 6.10 The distance between an object and its image formed by a thin lens is D. The same distance is found in a second position of the lens, when it is moved a distance X. Show that the focal length of the lens is
Show incidentally that the minimum distance between an object and its image is 4f.
CHAPTER
Aberrations
7.1 M Y AND WAVE ABERRATIONS
The discussion of geometric optics in the previous chapter has shown that a useful optical instrument can ideally only give stigmatic (sharp) images of points on a single surface, while even under this restriction the lenses or mirrors in the instrument cannot have simple spherical surfaces unless only a small bundle of paraxial rays is used to form the image. In spite of this it is evident that many very useful optical instruments exist which do not conform strictly to these conditions. The quality of their images may not be ideal, but the departures from perfection, known as aberrations, may be tolerable for their purpose. The design of an optical system is mainly concerned with the calculation of the various aberrations, and with their suppression below a tolerable level. Aberration may be specified for any ray which contributes to the formation of a point image. The distance between an ideal image point and the intersection of the ray with the image plane is called the 'ray aberration'. The total effect on the image is found by tracing sufficient rays from an object point so that the spread of intensity across the image can be found. Ray aberration is therefore a measure of the size of an ideal image point; its importance may be judged in relation to the size of the diffraction patch which is the lower limit to the size of the image of a point object below which even an ideal instrument cannot go. Alternatively the point image may be considered as the centre of a convergent wave, ideally spherical but in practice departing from sphericity
92
Aberrations
Chap. 7
because of inequalities of optical path along various ray paths. The amount of departure, measured along a ray path, is known as the 'wave aberration'. The ideal of zero wave aberration need only be approached sufficiently closely that the ideal diffraction image is unaffected; this means that the phasors representing the various components of the wave are still adding approximately in phase, say within much less than a radian of the correct direction, so that the wave aberration must be considerably less than X/27r.
Fig. 7.1. Ray and wave aberration. A spherical wave W leaving the point P is focused by the optical system into a converging wave W, which departs from the ideal spherical shape W,, centred on 0.Wave aberrations are shown as a, and ray aberrations as b, for two rays at different distances from the axis.
Wave and ray aberrations are shown in Fig. 7.1. The wave aberration may amount to some tens of wavelengths in a good camera lens, but it usually is much less than one wavelength in an astronomical telescope. The corresponding ray aberrations may be found by geometric ray tracing rather than by analysis of optical path lengths; there is, however, no need to draw a sharp distinction since both approaches lead to similar analytic results. The wave aberrations offer a clearer physical picture, and are described in this chapter. Ray aberrations may be found from any pattern of wavefront aberrations by drawing ray normals from the wavefront, as in Fig. 7.1. The intensity at the nominal image point is best found from the wave aberrations, since these give directly the pattern of phasors which must be added to give the amplitude at the image point. The efficiency with which light is concentrated into the image point increases with decreasing wave aberration until the aberration becomes small compared with X/27r.
7.2
7.2
Wave aberration on axis-spherical aberration
93
WAVE ABERKATION ON AXIS -SPHERICAL ABERRATION
As soon as it is admitted that an optical instrument is not ideal, or even that it will not function strictly with objects and images confined to aplanatic surfaces, the possible range of optical designs at once becomes infinite, and with this the variety of aberration pattern over the image plane also becomes infinite. A fairly simple pattern does, however, emerge from refraction or reflection at a spherical surface. This pattern provides an easy separation of aberrations into parts which depend on the distance off axis that the ray intersects the surface, and the distance of the object point off the axis. For objects on the axis the only effect is spherical aberration, representing the different actions of the axial and peripheral rays. It is instructive to follow the geometry of refraction at a spherical surface in some detail so that the amount of aberration can be calculated for a ray refracted at any distance from the axis of symmetry. We follow Section 6.8 in measuring optical paths as positive quantities, and calculate the small path differences involved in the wave aberrations.
Refractive
'active
index n,
Fig. 7.2. Spherical aberration at a single spherical surface. The wave aberration for a ray y from the axis found from the small difference between the optical paths PIAP, and P, OP,
.
Following Fig. 7.2, the difference a in optical path between the axial ray and the ray intersecting the surface at a distance y from the axis is given by:
The distance y is taken for convenience as the chord CA, so that the following geometrical relations hold:
94
Aberrations
cos $ = y/2r
Chap. 7
(from the triangle AOC)
The aberration is found to depend on the axial distance y as
+ terms of higher order in y .
(7.3)
According to the size of the aperture, the approximation in (7.3) must be taken to higher orders in y. For paraxial rays where y is small, only the term in y2 need be considered, and a is zero when the first bracket of Eq. (7.3) is zero. This gives the simple relation between conjugate points (Eq. (6.22)). When the coefficient of y2 is zero, giving correct imaging for paraxial rays, the term in y4 represents the spherical aberration expressed as a wave aberration. The magnitude evidently increases as the fourth power of the aperture of a spherical refractor. The ray passing through A is normal to the wavefront, so that its direction departs from the correct direction AP2 by the angle between the ideal wavefront and the actual wavefront. This angle is found from the rate of variation of wavefront aberration a with increasing y ;the angular deviation of a ray at P2therefore varies as da/dy, that is as y3 rather than y4. Correction of spherical aberration is achieved very simply by changing the shape of the refracting surface. This can be made exactly correct for any chosen pair of conjugate points. A lens may be corrected very well for spherical aberration by the 'bending' illustrated in Fig. 7.3, where the two surfaces are still spherical but have different radii of curvature. An exact correction requires the use of aspheric surfaces. It is important to note that any qorrection can only apply exactly to one particular object distance, and that objects at a different distance will still suffer from spherical aberration. Reflecting telescopes, and particularly the large reflector radio telescopes, commonly use apertures with diameters of the same order as their focal lengths. It is usual to remove the spherical aberration by making the surface a paraboloid of revolution (Fig. 7.4), when the spherical aberration for an object at infinity and on the axis is exactly zero. A paraboloid of revolution does not, however, form a perfect image for objects off the axis, and if it is intended to use an extended field of view in an optical telescope it will be necessary to consider the off-axis aberrations, which grow more rapidly with angle for a paraboloid than for a spherical reflector.
7.3
Off-axis aberrations
95
Fig. 7.3. Lenses with spherical surfaces, and the same focal lengths, but 'bent' by different amounts. Spherical aberration is minimized by using a lens shaped so that the refraction is shared roughly equally by the two surfaces, so that the plane or concave surface should be closest to the nearer of the conjugate object and image points.
Fig. 7.4. A section through a paraboloidal reflector telescope, showing rays from a distant object converging on the focus F. All optical paths from the wavefront W to the focus are exactly equal, so that there is no spherical aberration for waves from a distant object.
7.3 OFF-AXIS ABERRATIONS The more general analysis which we now require concerns object and image points which d o not lie on a radius of the refracting surface. Figure 7.5 shows a small refracting surface, centred on A, and which may be considered as part of a larger spherical surface. The point A is at a distance y from the radius P10P2on which the object P I and image P2 ideally both lie. The distance y is not precisely the same as the distance y in Fig. 7.2, but for ease of analysis they are regarded as identical.
96
Aberrations
Chap. 7
The wavefront aberration which is required is now the difference between the aberration, as already calculated, for rays between the centre of the pencil, at A, and for other points across the disc, such as B, in Fig. 7.5. Let B be a distance ,o from A, and let AB be at angle 0 to the vertical in Fig. 7.5. The wavefront aberration may be obtained for a ray passing through the point B(p, 0) within the disc in terms of y, p and 0 as follows: Assign rectangular coordinates {, q to the point B, where = p sin 0 and q = p cos 0. Then the wavefront aberration a ( r , q, y) is proportional to the fourth power of the distance OB, so we can write
r
In ascending powers of y:
The direction of the ray leaving the point B depends on the slope of the wavefront across the aperture. The ray reaches the image plane, not at the correct point P2 but at a distance b from it. This is the ray aberration. It is convenient to find components bx and by which are respectively perpendicular and parallel to the plane of Fig. 7.5. As may be seen from Fig. 7.6, these components are given by:
By differentiating Eq. (7.5) and transforming back into polar coordinates ( p ,0), these components may be expressed in ascending powers of y, so that the effect of increasing distance off-axis can be clearly seen. They take the form:
+ Dy2psin 0 by = Bp3 sin e - Fyp2(1 + 2 cos2e) + ( 2 +~ b,
= Bp3 sin 0 - Fyp2sin 20
cos 6 - Ey3
.
(7.7)
Here the coefficients B, C, D,E, F are those used in the traditional developments of aberration theory. They may be obtained for any particular configuration of object and image distances for a spherical surface, but they are also used for aspheric surfaces in the same form. The advantage of this form of expression for the wave aberration is that the various terms can be considered as distinguishable types of aberration which depend in different ways on the combination of the factors y, p and 0, which determine their size and shape.
Fig. 7.5. Off-axis aberrations. The above figure shows two 90" projections. A pencil of rays from P, crosses the refracting surface at a distance y from the axis P,O. The intersection of the pencil with the surface is shown shaded. The enlargement of the shaded portion is shown so as to indicate the system of axes in this region.
Magnified version of shaded portion of above
98
Aberrations
Chap. 7
Fig. 7.6. The relation between wave aberration a in a wavefront at a distance y from the axis and the corresponding ray aberration b at the image point P,. The angle a between the actual and the ideal wavefronts W and W, is the error in ray direction, so that at the image-distance v the ray aberration is b = av. Since the geometrical displacement of the wavefront is a/n, the angle a! is the gradient (l/n)da/dy.
The first term Bp3 sin 8 is the spherical aberration already considered. It applies when y = 0, i.e. on the axis, and it shows that an annulus of the aperture, radius p , produces an annular image with radius proportional to /I3. The wave aberrations corresponding to the terms of Eq. (7.7) are shown in Fig. 7.7 as deviations from an idealized plane wave. The effects are illustrated in Fig. 7.8, which represents the ray aberrations around nominal point images. Astigmatism and curvature are here combined, since they depend on p and y in the same way. Spherical aberration, depending on the cube of p , the ray distance off axis, gives a point focus for the paraxial rays; the extra wave curvature for large values of p directs rays to cut the axis in points in front of or behind the point focus. These rays form a halo round the focal point. Coma is a wavefront distortion additional to spherical aberration, which only appears for object points off the axis. Rays intersect the image plane in a comet-like spread image, whose length increases as the square of the distance off axis. Rays from a line across the aperture, at 8 = 90°, do not contribute to coma, but the rays along the line at 8 = O0 cannot focus at a point and spread out beyond the true image. The typical comatic image consists of superposed circular images, successively shifted further from the axis and focused less sharply. The increasing shift can be seen in Fig. 7.7(b) as an increasing slope in wavefront aberration along the axis of q . Astigmatism is a result of a cylindrical wavefront aberration, which increases as the square of the distance off axis and the square of the aperture p. The effect is unfortunately familiar in many human eyes, which show
7.3
Off-axis aberrations
( a )Spherical aberration ( , = - l g4 4 p
( b ) Como
( c ) Astigmatism o = - cyZp2 C O S ~e
( d ) Curvature of the field O = 1~ - D Y2p2
99
o = F~~~cos 8
(e)Distortion
o=tYJp~~~e Fig. 7.7. Wavefront distortions for the primary aberrations. The ray aberrations vary as the same powers of y and as one smaller power of p. astigmatism even for objects on axis. The focus shown in Fig. 7.8(c) consists of two concentrations of rays known as the focal lines, with a blurred circular region between representing the best approximation to a point focus. This is called the circle of least confusion. The curvature term in which the wavefront has an added curvature proportional t o y2, shows that the focal length of the lens changes for
100
Aberrations
Chap. 7
( b ) Como
(c)Astigrnotisrn, showing
( d ) Distortion, showing o
focal lines
rnognificotion which increases or decreoses with d~stonceoff 0x1s Fig. 7.8. The effects of the various aberrations.
Fig. 7.9. The positioning of an aperture-stop. In (a) the stop is spaced away from the lens, so that off-axis points are focused by the outer part of the lens. The shape of the lens may then be changed so that aberrations are reduced. In (b) the same part of the lens is used for all ray inclinations. Aberrations are then less controllable, although they will be smaller for spherical surfaces.
7.5
The reduction of spherical aberration by corrector plates
101
off-axis points. A flat object plane will then give a curved image surface. It is usual to find curvature still present in a lens which is corrected for astigmatism; this remaining curvature is referred to as the Petzval curvature. Distortion represents an angular deviation of the wavefront, increasing as y3. This spreads or contracts the image, destroying the linear relation between dimensions in object and image. Since we know from the start that all aberrations cannot be eliminated from a useful optical system, it becomes a matter for choice which aberrations are the most nuisance and which can most easily be tolerated. For example, a photograph with distortion may be more displeasing to the eye than one with some blurring due to spherical aberration or coma. An astronomical photograph might.on the other hand be required to show small symmetrical point images over the whole of a plate covering a large solid angle, while it might be less important to minimize the distortion of angular scale near the edges of the plate. Two particular methods of controlling aberrations will be described in the following sections. 7.4 INFLUENCE OF APERTURE STOPS The amount of spherical aberration introduced by an uncorrected lens or reflector system varies as the cube of the lens aperture. If a large aperture is necessary, the aberration must either be tolerated or corrected, but an improvement in images can obviously be made by restricting the aperture by a stop. For the single purpose of restricting spherical aberration in a lens the stop would be placed against the lens itself, but the other aberrations are also affected by the stop in ways which depend on the separation of the stop from the lens. This is demonstrated in Fig. 7.9, which shows a pencil of rays from an off-axis point passing through an aperture stop in front of a lens, as in the analysis of Section 7.3. When the aperture stop is separated from the lens the rays from an offaxis point are constrained to pass through the outer part of the lens, as in (a). Depending on the shape of the lens, this may reduce or increase the offaxis aberrations. An important difference between (a) and (b) is that the magnification of off-axis points will be greater in (b) than (a), since the object distance is less in (b) where the stop allows rays to reach the lens by a shorter path. Distortion of the two types illustrated in Fig. 7.8(d) can therefore be controlled by the correct positioning of the aperture stop. 7.5 THE REDUCTION OF SPHERICAL ABERRATION BY CORRECTOR PLATES The usefulness of keeping in mind both the concepts of ray and wave aberrations may be illustrated by contrasting the ray approach appropriate
102
Aberrations
Chap. 7
to the discussion of aperture stops with the wave approach which gives the simplest insight into the use of corrector plates in reflecting optical telescopes. A reflector telescope using a parabolic mirror provides a perfect image of a point on axis, but suffers principally from coma for points off axis. A spherical mirror by contrast shows aberrations which increase less rapidly for points off axis, but all points suffer from spherical aberration. If this can be removed, the spherical mirror can be used in telescopes with a wide field of view. The 'focal plane' for a spherical mirror is in fact a spherical surface with half the radius of the mirror, as seen in Fig. 7.10, but the practical difficulty of using curved photographic plates is not too severe in large astronomical telescopes, where the curvature may be so small that a flat glass plate may be bent into the required shape. A stop at the centre of curvature of the mirror ensures that the same amount of mirror surface is used at all parts of the field. The symmetry of this telescope arrangement, using a spherical mirror, is so attractive that it has been found worthwhile to make a special system to correct for spherical aberration, using a corrector plate devised by Schmidt. Spherical aberration may be regarded as an advance of the wavefront by an amount proportional to the fourth power of the distance of the ray from the axis (Section 7.2). This can be removed by introducing a glass plate which is very nearly plane, but whose thickness varies slightly with radial distance so as to provide a wave retardation exactly compensating for the spherical wave aberration. The profile of the plate will then follow the wave aberration
Fig. 7.10. (a) A spherical mirror has a spherical focal surface at half the radius of the mirror. (b) The Schmidt corrector plate retards the wave in the outer parts of the aperture, removing spherical aberration. It is placed at the centre of curvature so that its effect is nearly independent of the ray inclination.
7.6
The correction of chromatic aberration
103
Fig. 7.1 1. A Schmidt corrector plate considered by ray optics. Pencils of rays near the edge of the aperture are made to diverge by the plate, and pencils near the centre are made to converge, cancelling out the spherical aberration.
shape of Fig. 7.7(a). If the plate is placed at the centre of curvature of the mirror, its effect is nearly the same at considerable angles off axis, so that very good images can be obtained over a very wide field of view. This is the principle of the Schmidt system, used for example in the 48-inch telescope on Mt Palomar which has provided survey photographs of the whole of the northern sky. The useful field of view of this telescope is 7" across. A practical Schmidt correcting plate follows the shape shown in Fig. 7.11; in comparison with the plate in Fig. 7.10 an average curvature has been removed so that a thinner plate can be used. The only effect of this is a small change in the focal length of the telescope, as though a slightly converging lens had been added to the corrector plate. It will by now be realized that a ray approach to spherical aberration will give the same results as a wave approach; so it is simple to see that the effect of the practical form of the plate is to increase the power of the mirror for rays near the centre, and reduce it for rays near the edge of the aperture. The Schmidt plate therefore has the same effect as the use of aspheric surfaces in a lens, but it has the added advantage of its symmetrical position at the centre of curvature of a spherical mirror. 7.6 THE CORRECTION OF CHROMATIC ABERRATION
The power of a spherical refracting surface, radius r, is given by Eq. (6.4) as (n2- n l ) / r ; where n2 - n I is the difference of refractive index across the
104
Aberrations
Chap. 7
surface. So far no account has been taken of the need to focus light of a wide range of colour by the same optical system; since refractive index inevitably varies with the wavelength of the light, any optical system which depends on refraction rather than reflection will behave differently for different colours. Chromatic aberration is a measurement of the spread of an image point over a range of colours. It may be represented either as a longitudinal movement of an image plane, or as a change in magnification, but basically it is a result of the dependence of the power of a refracting surface on wavelength. It may be compensated for by combining lenses made of different materials. The power of a single lens used in air may be written as P=(n - 1)R, where R = l/rl + l/r2, the sum of the curvature of the two surfaces. A small change of refractive index 6n therefore changes the power by 6P where
Varieties of optical glass differ quite widely in the way in which n varies with wavelength, so that it is possible to combine two lenses with power P1and P2in such a way that 6P1+ 6P2=0 without at the same time making the total power PI+ P2= 0. Since 6n (n - 1) has the same sign for all glasses, this means that P1and P2 must have the opposite sign, and the combined lens system has a lower power than either component. Two colours separated in wavelength by 6X will be focused together when:
The powers must therefore be of opposite sign and inversely proportional to the value of l/(n - 1)6n/6X for the two glasses. This quantity is the dispersivepower of the glass. It is often quoted in terms of refractive index at two specific wavelengths. The two lenses may be in contact if two surfaces have the same radii of curvature. It is advantageous to reduce the number of interfaces between glass and air, since light is lost by partial reflection at each step in refractive index. The step between the two kinds of glass is smaller than for interfaces between glass and air, but the advantage is lost unless the two lenses are cemented together with a transparent glue with a refractive index approximately the same as that for glass. Both the power and the focal plane
7.7
Achromatism in separated lens systems
105
of such an achromatic doublet can be made the same over a range of wavelengths, or at any two widely separated wavelengths. Outside these wavelengths, however, it will generally still suffer from chromatic aberration.
7.7 ACHROMATISM IN SEPARATED LENS SYSTEMS The provision of a focal length which does not vary with wavelength is not a sufficient condition to provide completely achromatic images, since it provides only a constant lateral magnification. The position of the focal plane can still vary with wavelength, giving a longitudinal chromatic aberration. This is often less important than the reduction of lateral chromatic aberration, especially in the eyepieces of microscopes and telescopes, so that it is common to find in these a very simple system for achromatism, using two identical lenses separated by the focal length of one lens. This provides a constant lateral magnification, giving a great improvement over an uncorrected lens at very little cost. The power of two lenses, separated by a distance d, is given by Eq. (6.12) as
At a different wavelength the net change in total power is given by
The change in total power is zero when
If the lenses are made of the same glass then 6P1/P1= 6P2/P2for all wavelengths, and the achromatic condition becomes:
The doublet therefore is achromatic when the lenses are separated by half the sum of their focal lengths. This configuration is used in the Huygens and Ramsden eyepieces of microscopes and telescopes. PROBLEMS 7 7.1 A point source of light is at distance u from a concave spherical mirror, radius
of curvature r, aperture 2h. Following the method and approximations of Section 7.2, show that:
106
Aberrations
Chap. 7
(i) the wave aberration a at the edge of the mirror is given by
(ii) the transverse ray aberration b is given by
(iii) the longitudinal ray aberration c is given by
7.2 Find the transverse and the longitudinal spherical aberration for a concave spherical mirror 1 m diameter, focal length 10 m, for a distant source of light such as a star. 7.3 Find the difference in thickness across a Schmidt corrector plate, refractive index 1.4, used to correct the spherical aberration of 7.2 without moving its focal plane. 7.4 Plane-parallel light is incident normally on the vertex of a glass hemisphere with radius 70 mm. If the refractive indices for red and blue light are 1.61 and 1.63 respectively, find the axial chromatic aberration. 7.5 A narrow pencil of light is incident on a glass sphere off axis by a distance ar where r is the radius, and the refracted ray meets the opposite side of the sphere exactly on axis. Find cr in terms of the refractive index n. 7.6 A thin equiconvex lens with radii of curvature 220 mm is made of crown glass with refractive indices 1.515 and 1.508 for blue and red light respectively. Find the focal length of the lens and the axial chromatic aberration. A thin planoconcave lens of flint glass is to be used to compensate this chromatic aberration, with the concave face towards the first lens. The refractive indices of flint glass for blue and red light are 1.632 and 1.615 respectively. Find the required radius of curvature of the concave surface and the focal length of the combination.
CHAPTER
Optical instruments
8.1 LENSES AND MIRRORS
The geometric arguments of the two previous chapters may suggest that the behaviour of a lens is to be thought of in terms of formulae such as Equation (6.5), with some subtle modifications when any reasonable aperture is used. The geometric relations are of course very useful, and not only because they are easily tested in elementary practical examinations, but they do not always give a direct indication of the way that lenses can be put to use in optical instruments. It is often more appropriate to think instead of the action of a lens or a mirror on a wavefront, as indeed we have already done in one of the derivations of the simple lens formula (Section 6.8). If there is a clear idea of what sort of changes can be made to a wavefront, then the selection of optical components for the various needs of optical instruments can be made intuitively; the precise geometrical theory will only be required later if a precise specification is to be met. The concept of an optical instrument as a means of modifying a wavefront is particularly appropriate for the simple instruments, such as a prism, or a lens used either as a simple magnifier or as a means of condensing light. The concept may however be extended to cover the more complicated instruments such as microscopes and telescopes, in which the purpose of the instrument may be stated in wavefront terms even though the detailed analysis must be carried out by ray calculations or by ray tracing.
108
Optical instruments
Chap. 8
8.2 REFRACTION IN A PRISM
The bending of light in a prism is a simple example of wavefront modification. The ray approach is to analyse the two refractions, each following Snell's law. Instead of tracing through the geometry of both refractions, we will consider the action of the prism as a whole. A particularly simple situation is found for a prism with small vertex angle, but the wave approach is useful for all prisms; it also leads naturally into calculations of dispersive power and resolving power. In Fig. 8.1, a plane wavefront is deviated by an angle 0 by passing through a prism with apex angle a and refractive index n. The angles are small, and the wavefront falls nearly normally on the surface. Then the optical paths AA' and BB' are equal. While the wavefront at B passes through a length a1 of the prism, the wavefront at A passes through a length I(@ + a )of air. Therefore if n is the refractive index
and the angle of deviation 0 is given very simply by
The same approach can be used as an exact method if the light passes symmetrically through the prism, when the two paths become 2nl sin & and 21sin k(0 + a).It is easily seen from Fig. 8.1 that the deviation is a minimum
Fig. 8.1. A prism with small angle a refracts a wavefront through an angle which is nearly independent of the angle of incidence, provided it is near normal.
8.2
Refraction in n prism
109
in this configuration, since tilting the prism about its apex will increase the path B by a cosine factor without increasing the path A. The symmetrical position is therefore the position of minimum deviation, where 8 is given by
The approximation in Eq. (8.1) is therefore one of equating an angle with its sine. The prism is useful as a wavefront modifier mainly because refractive index n varies with wavelength X, so that different colours emerge with separate wavefronts when a single superposed plane wavefront is incident on the prism. The angular separation 68 of two wavefronts with wavelengths separated by a small amount 61 depends on dn/dX, since differentiation of Eq. (8.1) leads to
The 'angular dispersion' of the prism is therefore directly proportional to the angle a and to dn/dX. The proper design of a practical spectrometer requires this angle to be large enough for an adequate separation of components with different wavelengths; there is however a limit placed on the possible resolution of close wavelengths by diffraction at the whole aperture. This limit is found to depend mainly on the size of the prism rather than on its angle or refractive index (Section 14.8). Equation (8.3) may be stated exactly for any configuration of the prism, whether symmetrical or not, if the path p ( = al) in the prism is used as a parameter (Fig. 8.2). Evidently it is the change of optical path n that causes the angular dispersion. This change of path p 6n affects the emergent wavefront along a width w, so that the angle 60 is now given by
Fig. 8.2. The dispersion of a prism. The optical length of the pathp changes by p6n with wavelength change 6X. The change p6n over width w deviates the emergent wavefront by angle p / w 6n.
110
Optical instrumcnts
Chap. 8
A practical prism spectrometer is now seen to require in essence an incident plane wavefront, and a means of recording separately the plane wavefronts which emerge in different directions. The incident plane wave may be obtained from a source at a great distance, or from a point source of diverging waves which are made plane by a lens with a positive power, such as a simple biconvex lens. The emergent wavefronts may be examined by eye, since the eye is designed to sort out waves travelling in different directions, or by the eye with the help of a telescope, or by a form of camera, as shown in Fig. 8.3. It will be seen that each point on the plate is a monochromatic image of the original point source. If the source is made a line, such as a slit in front of a flame or a discharge tube, then a line will appear for each colour of the spectrum.
Point source
Fig. 8.3. A simple prism spectrometer, using a photographic plate to record the spectrum of a point source of light.
8.3 THE SIMPLE LENS MAGNIFIER The prism changes the direction of a wavefront: a curved refracting or reflecting surface changes the curvature of a wavefront. This is illustrated by the use of a simple lens as a magnifier. The eye analyses light by focusing wavefronts from different directions on to different parts of the retina. The wavefronts are very nearly plane, although by accommodating the eye to slightly diverging wavefronts a correct focusing can be obtained for objects as close as a limiting distance D, known as the nearest distance of distinct vision. Although this varies between individuals, it is usually quoted as a nominal distance of 25 cm. The angular resolution of the eye is determined by the separation of sensitive elements on the retina, which matches well the limit of angular resolution set by diffraction at the iris, the aperture of the main part of the eye lens (Fig. 8.4).
8.3
The simple lens magnifier
111
Retina
Fig. 8.4. Focusing system of the eye. Most of the refraction occurs at the first surface, whcre thc cornea C separates a liquid N, (n = 1.34) from the air. The lens L, which has a refractive indcx varying from 1.42 in the centre to 1.37 at the edge, is focused by tcnsion at the cdges. The main volume contains a jelly-like 'vitreous humour' (n = 1.34). The image is formed on the retina. The iris I adjusts the aperture according to the available illumination.
Fig. 8.5. Simple lens used as a magnifier. An object at 0 close to the eye can be focused by the eye as though it were at a more distant point 0'. Since the angular resolution is very nearly unchangeable, it follows that the linear resolution of the unaided eye is greatest for objects as close as possible, i.e. at 25 cm. Closer objects are out of focus, but if the eye is aided by a lens an object at very small distance can be focused, with a corresponding increase in linear resolution. The lens assists the eye by converting a wavefront which is diverging too sharply into the nearly plane wave which the eye can focus unaided. The magnifying power M o f a simple lens used in this way is given by the ratio of the apparent size of a n object, seen as a n image at 25 cm from the eye, to its actual size. This is the ratio of the curvatures of the wavefront from a single point, i.e., D/d where D is the image distance (25 cm) and d is the object distance (Fig. 8.5).
112
Optical instruments
Chap. 8
From the simple lens equation, if the lens has power P:
and therefore
Usually M% 1, and it is convenient to approximate by putting M-- PD. Practical values of M are limited by the difficulty of making a single lens with very high power and free of aberration. The compound microscope therefore replaces the simple magnifier for magnifications greater than about 10. The compound microscope is considered in Section 8.7.
8.4 THE TELESCOPE When the eye attempts to distinguish details of a distant object, it is attempting to separate plane waves which are inclined at small angles to each other. The limit of resolution can now only be improved by using an instrument which increases the angular separations of a range of plane waves. This is the action of a telescope. If a plane wave at a small angle O1 to the axis of a telescope is to emerge as a plane wave at a larger angle 02, the refractive index at both ends being the same, then it follows that the width of the wavefront is reduced in the ratio O1 /02, as can be seen from Fig. 8.6.
Fig. 8.6. The action of a telescope. A wavefront W, along the axis is undeviated, while a wavefront W1 enters at angle 8, and emerges at angle e2. The widths of the wavefronts as they enter and emerge are y1 and y,. The optical paths I are identical; it follows that y18, = y28,. The angular magnification is therefore y2/yl.
8.5
Advantages of the various types of telescope
113
The difference in angle between the two wavefronts as they enter the telescope is determined by the optical path I and the width of wavefront y 1 which is accepted by the telescope. The optical path difference I is preserved as the wavefronts traverse the telescope, so that the difference in angle as the wavefronts leave the telescope is determined by I and the new width y2 of the wavefront. The angular magnification is therefore:
(It is possible for the wavefront to be inverted, so that the angular magnification is negative. This occurs when the wavefront goes through a real focus inside the telescope .) Practical arrangements for the reduction of the size of a plane wavefront are shown in Fig. 8.7, which includes many of the conventional varieties of telescope. The angular magnification of all these arrangements is given by the ratio of the widths of the plane wavefronts entering and leaving the telescope; this is numerically equal to the ratio of the focal lengths of the two optical elements, either lenses or mirrors, which form the 'objective' and 'eyepiece' elements of the system. A telescope also has the advantage over the eye that it gathers a larger area of plane wavefront, so that a point source of light becomes more easily visible. It is most important not to confuse this increase in sensitivity with the question of the visibility of a uniformly bright object with a finite size: the moon is no brighter as seen through a telescope, while stars not visible to the naked eye are easily seen through a telescope. If the object is already resolved in angle by the eye, then its visibility is related to its brightness, which is the light emitted per unit area into unit solid angle (Section 8.9). The apparent brightness as seen through the telescope cannot be greater than the original, since the apparent size of the object increases in proportion to the increase in total light collected. There can in practice only be a loss of brightness in a telescope, due to partial reflection at lens surfaces or to incomplete reflection at a mirror surface. 8.5
ADVANTAGES OF THE VARIOUS TYPES OF TELESCOPE
The types of telescope in Fig. 8.7 are clearly divided mainly by the use of mirrors or of lenses, and by the use of second elements with positive or negative power. A lens has the great advantage over a mirror that a small distortion due to gravity or uneven temperature has no first-order effect on the optical path through it, whereas if part of a mirror bends forward by an amount z it shortens the optical path by 22. If nearly perfect images are required, in which
114
Optical instruments
Chap. 8
(a)
Galilean
(6)
Astronomical
(c)
Newtonian
(d) Cosseqroin
(el
Greqorian
(f)
Herschel
Fig. 8.7. The reduction in width of a wavefront in various types of telescope. The telescopes are all adjusted for direct viewing of the emergent beam; the emergent wave could instead be made convergent, so that a photographic plate could be placed at the focus. all optical paths are near equal, this means that the mounting of the mirror must be considered much more carefully than that of a lens. O n the other hand, a mirror has n o inherent chromatic aberration, and very little light is lost at a reflection. The largest telescopes, and some of the best survey theodolites, use mirror systems. A mirror must, of course, be used for wavelengths where n o good lens can be made, as at infrared o r radio wavelengths.
8.6
Binoculars
115
Systems such as the Galilean telescope and the Cassegrain telescope may be unsuitable for survey and position measurement because they have no real image point at which a graticule can be placed as a reference mark and for making measurements of angular size; on the other hand they have the advantage that they are shorter than the corresponding instruments using second elements with positive power (astronomical and Gregorian). The Herschel system uses the whole of the aperture without obstruction, but it is easier to control aberrations with the more symmetrical mirror arrangements (c), (d) and (e). The control of aberrations has already been discussed in the previous chapter, but the detailed application to a full telescope system becomes complicated. Modern astronomical telescopes commonly use a Cassegrain system, and the control of aberrations may entail the addition of a Schmidt corrector plate in the beam as it enters the telescope, or a corresponding asphericity of both primary and secondary.
8.6 BINOCULARS The binocular telescope, or the 'binoculars', must be one of the most widely used optical instruments. Let us imagine that we are to design one for general use. We know already that the magnification of a telescope focused for object and image at infinity, is given by the ratio (f,/ f,) between the focal lengths of objective and eyepiece. A hand-held instrument does not usually have a magnification greater than about x 8 or x 10, otherwise the image cannot be held sufficiently steady. A magnification of x 8 is common. We also know that the total amount of light entering the instrument is determined by the aperture of the objective, and that this affects the visibility of point sources of light. A large-diameter objective is therefore important for looking at stars or the like. We have not yet discussed what factors determine the field of view of the binoculars, what sorts of lenses we must use, and what determines the diameter of the eyepiece. The eye is especially sensitive to chromatic aberration, which has the effect of colouring the edges of objects away from the axis. Binocular objectives must therefore be carefully corrected; they are therefore made as cemented achromatic doublets (Section 7.6). The eyepiece is usually more complex, as a single cemented achromatic pair introduces other aberrations. It is made basically as a separated pair, which incidentally helps to correct for lateral chromatic aberration (Section 7.7); in a good eyepiece one of the pair may itself be a cemented doublet. Figure 8.8 shows only a simple pair in the form known as the Ramsden eyepiece. An important part of eyepiece design concerns the position of the eye. The rays from a point source in Fig. 8.8 cross the axis beyond the eyepiece; at
116
Optical instruments
Chap. 8
this point they fill the exit pupil of the system. The eye is placed at the exit pupil, which is separated from the eye lens by a distance known as the eye relief. The size of the exit pupil is determined by the size of the pupil of the eye. If the exit pupil is smaller than the eye pupil then the eye is used inefficiently since only part of the eye pupil is illuminated. If the exit pupil is larger than the eye pupil some light is wasted and the telescope is being used inefficiently. In practice the exit pupil should be somewhat larger than the eye pupil so that the exact position of the eye pupil is not too critical. The binoculars are then easier to use. The focal plane of eyepieces of the Ramsden type is close to the first lens. A real image of any object at infinity exists at this point, so that angular width of the field of view is therefore determined by the aperture of the first lens. An aperture which limits the field in this way is known as afield stop; the first lens is therefore often called the field lens. The angular width of the field of view is the diameter of the field lens divided by the focal length of the objective. A long objective focal length fo therefore gives a large magnification but a small field of view. We might therefore expect to see large magnifications obtained instead by using a small eyepiece focal length f,. However, as we will see, the diameter of the eyepiece is fixed by other considerations, and reducing f, becomes difficult without introducing aberrations. Let us assume that the magnification is fixed at the comfortable limit of x 10, and find the diameters and focal lengths which must be used for eyepiece and objective. We shall find that all of these depend on our requirements for the field of view. Consider again the rays entering the eye at the exit pupil in Fig. 8.8. The angular spread of rays at this point is the angular field of view multiplied by the magnification; it is therefore a large angle, often about 50". The eye lens must be larger than the exit pupil to accommodate these rays, and the field lens must be somewhat larger again. The field lens must therefore be at least 15 mm in diameter; the size of the real image at this point must be the same size, as this lens constitutes the field stop. The field of view, which in this example would be 50" divided by the magnification, determines the objective focal length f,; this must therefore be about 17 cm. The focal length of the eyepiece is therefore one-tenth of this, i.e. 1.7 cm. Finally, the diameter D of the objective must be M times the diameter of the exit pupil, which is usually about 4 mm to match the pupil of the eye. The objective is therefore 40 mm in diameter. We have reached the specification in the form usually quoted: these binoculars would be 10 x 40. Two remaining problems are solved simultaneously by the use of a pair of prisms, as seen in Fig. 8.8(b).These invert the image, so that it appears upright, and they fold the telescope so that the'total length is much less than
8.7
The compound microscope
117
Eye relief -
f
e
w
I
I
Objective lens
(Achromatic pair)
Field stop
\~xit. Pup11
Ramsden eve~iece Field lens lens
ye
Fig. 8.8(a). Astronomical telescope, as used in the binocular telescope
Field of view 0, =
field lens diameter objective focal length
Fig. 8.8(b) A pair of erecting prisms, as used in the binocular telescope.
f, +f,. A Galilea11 arrangement would of course provide an upright irriage without prisms, but with no folding the telescope length is too great for anything more than the small magnification used in opera glasses. 8.7 THE COMPOUND MICROSCOPE
The simple lens magnifier is required t o convert a diverging wavefront into a nearly plane wavefront, so that separate points of an image give rise to separate components of an angular range of plane wavefronts. If further
118
Optical instmmcnts
Chap. 8
Objective Fig. 8.9. The compound microscope. A wide angle divergent wavefront is converted into a narrow, ncarly parallel, wavefront erncrging from the eycpiece.
Fig. 8.10. Microscope objectives. (a) Oil-immersion objective in which the object 0 and virtual objective 0' are on the aplanatic surfaces of a sphere S. The wavefront curvature is again reduced by a series of meniscus lenses M. (b) Reflecting objective. This is a close relation of the Cassegrain telescope of Fig. 8.7. linear resolution is required, then the angular spread of these wavefronts can be increased by a telescopic system. The combination of the two systems into a compound microscope does not in practice have two separable parts, since a separate telescope objective is unnecessary. In Fig. 8.9 the objective lens, which has the short focal length required of the simple magnifier, is placed so that the object is just beyond the focal plane. The rest of the instrument is like the astronomical telescope. The diffraction theory of the microscope, dealt with in Section 14.4, adds
8.8
Image-forming instruments-the
-
Focal plane of objective
119
-I
F F
camera
g
-
4
Image Eyepiece Power P,
Objective Power g Fig. 8.11. Geo~~ietry Tor calculating the magnification of a compound microscope.
the further requirement that the objective lens must collect the spherical wavefront emerging from an object point over as wide an angle as possible. The design of the objective lens is crucial to the success of a microscope, since aberrations are easily introduced when rays traverse the lens at large distances from the axis. The highest magnification is obtained with an oil-immersion lens (Fig. 8.10(a)), in which the oil film has the same refractive index as the glass. The property of aplanatic surfaces (Section 6.3) in which all rays focus correctly is used to form a virtual image of O f , and successive meniscus lenses M are then used to reduce the curvature of the wavefront. The magnification of the compound microscope is calculated in two stages. Firstly the objective lens forms a real image; this is then magnified further by the eyepiece (Fig. 8.1 1). If the real image is formed at a distance g beyond the focus F of the objective, whose power is P I ,then the magnification is Pig. (See Section 6.5 for a general derivation; alternatively this may be shown to be equal to u/u for a simple lens.) The magnification of the eyepiece is given as before by P2D, giving the overall magnification as PIP2gD.The length g is known as the optical tube length of the microscope, since it accounts for most of the length of the instrument.
8.8 IMAGE-FORMING INSTRUMENTS-THE CAMERA Astronomical research is seldom conducted by looking through a telescope: instead an image is formed either on a photographic plate or on the slit of a spectrograph. (Many modern extensions of these recording techniques are now in use, such as image intensifiers and interferometric spectrum analysers.) The telescope then becomes a camera, which is an artificial eye; the
120
Optical instruments
Chap. 8
-
Fig. 8.12. Comparison of Galilean telescope (a) and telephoto lens (b). In the telescope the diverging secondary lens is placed so that the foci of the two lenses coincide; in the telephoto lens the secondary is moved so that a real focus is located on a
photographic plate. photographic plate is the retina and the lens of the eye is the primary lens or mirror of the telescope. A camera may of course be arranged to focus on very near objects, when it becomes a photomicroscope. For larger distances the linear size of the image is given directly by the product of the focal length of the lens and the angular size of the object. Photographic and television cameras are for this reason provided with interchangeable sets of lenses with a range of focal lengths, so that the scale of a picture may be selected according to the required angular resolution; alternatively a 'zoom' lens may be used which is an adjustable compound lens whose focal length can be varied over a range of, say, five to one. Small cameras commonly have a lens with focal length about 5 cm; an astronomical telescope may have a focal length of 10 metres or more, so as to provide a sufficiently large linear scale on the photographic plate. Even with a focal length of 10 metres an angle of 1 second of arc corresponds to only 0.05 millimetres at the focal plane; since diffraction images smaller than 1 second of arc are obtainable in large telescopes a stellar image usually has a microscopic scale on the image plane. The effective focal length may be adjusted by any of the devices of Fig. 8.7, since we have already seen how these affect the angular scale of a pattern of plane waves. It is in fact only necessary to change the position of the secondary lens or mirror to obtain a real image at any desired distance. For example, the Galilean telescope may
8.9
Intensity, flux, illumination, and brightness
121
be converted into the telephoto lens (Fig. 8.12) by moving the secondary away from the primary. The same advantage is obtained, that a long focal length is available without a corresponding and inconveniently long distance between the first lens and the photographic plate. 8.9 INTENSITY, FLUX, ILLUMINATION, AND BRIGHTNESS
Until recent years there was a multiplicity of photometric units. Lighthouses, for example, used the sea-mile candle as a unit of illumination, in competition with the lux, the phot, the foot-candle, and the nox. Ultimately the units for illumination should be in terms of electromagnetic units, but this would only be possible if the spectrum of the light in question was completely known and the spectral sensitivity of the detector could be completely standardized. Units of light are therefore referred back to a single standard source of light. The basic unit in the SI (Systeme Internationale d9Unites)system is the candela, which is evaluated in terms of the luminous intensity of a black body at the temperature of solidification of platinum. This provides a reproducible standard source of light whose spectrum is suitable for comparisons with other sources of visible light. Four quantities are required in photometry: in SI units they are defined as follows: Luminous intensity The candela (cd), which replaced 'candle power', is a unit of total light emitted from a standard source which is effectively a point source. It refers to the total power emitted in all directions in the visible part of the spectrum. Luminous flux A point source emits light in all directions. Practical sources, however, do not emit isotropically; the amount of light flowing in a given direction is specified as the luminous flux. The unit of luminous flux is the lumen (lm). This is the flux from a point source of one candela within a solid angle of one steradian. Illumination The amount of light falling on a surface is expressed in terms of illumination, which for a surface normal to the light rays is proportional to the energy in the electromagnetic wave crossing unit area in unit time. The unit of illumination is the lux (lx), which is the illumination produced by one lumen falling on an area of one square metre. The illumination of
122
Optical instruments
Chap. 8
a spherical surface with radius one metre and with a point source of one candela at the centre is therefore one lux.
Luminance A surface may emit light either because it is self-luminous or because it is illuminated. The brightness, or luminance, of the surface may depend on the angle at which it is viewed, but for a given direction the brightness may be related to the unit of intensity by expressing it in terms of candelasper square metre (cd m - 2).
The ideas lying behind these four definitions may be illustrated by reference to the optical instruments we have discussed in this chapter. Consider first a star, observed as a photographic image by means of an astronomical telescope. The total energy output of the star in the optical spectrum is described by its luminous intensity. That part of the energy which falls upon the telescope aperture is focused on to a point on the photographic plate and produces a developable image. The energy in the optical range collected by the telescope is the product of the aperture area and the illumination provided by the star. If the star emits more light in one direction than another, like a terrestrial lighthouse beacon (the pulsar in the Crab Nebula is a good instance of this), then the illumination is determined by the luminous flux emitted in the direction of an observer. For an isotropic source this is numerically equal to the luminous intensity, but in the lighthouse beam it is of course much greater. A telescope or camera photographing an extended object produces an image in which we need to know the amount of light energy per unit area, i.e. the illumination. This depends on the light energy leaving unit area of the source in the direction of the observer, i.e. on the luminance of the source. The illumination of the image is proportional to the aperture area, but it also varies inversely as the area of the image, which is itself proportional to the square of the focal length. The intensity on the photographic plate therefore varies as the ratio (F/D)-2, where F/D is the familiar 'focal ratio', or 'F number', of a camera lens; it is the ratio of focal length to aperture diameter.
8.10 ILLUMINATION IN OPTICAL INSTRUMENTS The discussion of the compound microscope started by assuming that wavefronts left the object over a wide range of angles. This does of course occur naturally if the object is illuminated naturally by diffuse light, but this does not usually provide sufficient brightness even before the many optical elements of the microscope have successively reduced the brightness by partial
8.10
Illumination in optical instruments
123
Fig. 8.13. Condenser systems for a microscope. (a) Concave mirror. (b) Lens system. A short focal length is needed to collect light over a large solid angle Q and to cover a wide range of angles as it enters the microscope.
reflection and absorption. Extra illumination must be provided, and for efficiency and good angular resolution the light must be encouraged to leave the object in the right range of directions. This is achieved for transparent objects by the use of a condenser, which may be a concave mirror or a lens system, as in Fig. 8.13. No great optical quality is required, since only a rough image of a diffuse source of light need be formed on or near the objective plane of the microscope. The total light entering the microscope depends on the solid angle Q over which the condenser collects the light; the condenser must therefore have a short local length both for this reason and so that the object plane is well illuminated by light traversing it over a wide range of angles. A similar problem is encountered in projection systems and enlargers; there the requirement is to obtain as much light as possible through a system consisting of a transparency, a projection lens, and a screen. The illumination of the transparent object must be even, but there is no requirement for illumination over a wide range of angles. Figure 8.14(a) shows the way in which light from a small source traverses a projection system: it is important not to confuse this diagram with the more conventional ray diagram of Fig. 8.14(b) which is concerned with the image on the screen of a point on the transparent object. This image is formed by a narrow pencil of rays within the light paths of Fig. 8.14(a). The condenser lens of a projector need not be an accurate high-quality component. For the familiar 'overhead' projector a stepped lens or 'Fresnel' lens is used (Fig. 8.15); the simple lens which it replaces would be an impossibly thick and massive piece of glass, and the imperfections of the stepped lens are unimportant.
124
Optical instruments
Chap. 8
Fig. 8.14. Illumination in a projector system. (a) The action of the condenser lens is to collect light emerging from a small bright source over a wide angle, providing an even illumination over the transparent object, and concentrating the light as it passes through the projector lens. (b) The projector lens forms an image of each part of the object in a narrow pencil of rays determined by the illumination. In the episcope, which projects a n image of a n opaque object, the object scatters the illuminating light in all directions. The problem is therefore to provide a high level of illumination, and then to use a projection lens which collects light over as wide a n angle as possible. Since the objects may be large, this means not only a small F/D ratio, but a large lens
Problems 8
125
Fig. 8.15. Overhead projector. The light from the lamp L is concentrated into the projector lens P by the stepped lens plate S.
diameter. Adequate brightness in the image is such a serious problem that image quality must usually be sacrificed by the use of a projection lens with very large aperture. PROBLEMS 8 8.1 Show that the magnification of the astronomical telescope Fig. 8.7(b) can be 8.2
8.3
8.4
8.5
measured by placing a scale across the objective lens, and measuring the magnification of this scale in the image formed by the eyepiece. Self-luminous or diffusely reflecting surfaces obey Lambert's law, which states that the luminous intensity of any surface element is proportional to cos 8, where 8 is the angle between the normal and the line of sight. Compare the distribution of luminance across the discs of the Sun and the full moon, assuming that their surfaces follow Lambert's law. Show that the total flux of light from a flat surface, with area S and luminance L, and which obeys Lambert's law, is TLS. Assuming that the Moon's surface obeys Lambert's law, and also that it re-emits ;of the light incident on it, find the ratio of illumination on the Earth in full sunlight to that in full moonlight. You may assume that the solar illumination of the Moon's surface is the same as that of the Earth. The angular diameter of the Moon seen from the Earth is &, radian. (a) What is the illumination of a horizontal surface under a hemispherical sky with luminance L? (b) What is the illumination of horizontal surface parallel to an infinite flat slab with luminance L? (a) The illumination of the Earth's surface by the Sun at normal incidence is lo5 lux and the Sun subtends an angle a = 30' arc. What is the luminance of the Sun?
126
Optical instruments
Chap. 8
(b) An image of the Sun is formed by a lens with diameter 10 cm and focal length 100 cm. What is the illumination and luminance of the image? 8.6 The immersion technique, in which a liquid fills the thin space between a highpowered microscope objective and a slide cover glass, can give improved image brightness by avoiding total internal reflection of oblique rays in the cover glass. Calculate the improvement which can be obtained for a cover glass thickness h = 0.2 mm, with refractive index 1.5, and an objective diameter D = 3 mm. 8.7 Compare the fields of view of the Gregorian and astronomical (Keplerian) telescopes of Fig. 8.7, in the following example. The objective diameters are both 2 cm, with focal lengths 20 cm, and the eyepiece diameters are both 1 cm, with local lengths - 10 and 10 cm respectively. Show that the magnifications are + 2 and - 2 respectively, and the fields of view are 1/10 and 1/30 radian respectively. Show that the exit pupil of the Gregorian telescope is located between the objective and the eyepiece lenses, whereas for the astronomical telescope it is beyond the eye lens.
+
CHAPTER
Interference and diffraction
9.1 YOUNG'S EXPERIMENT
We have seen in Chapter 4, where the idea of Huygens' secondary waves was introduced, that the future position of a wavefront may be derived from a past position by considering every point of the wavefront to be a source of secondary waves. The progress of the wavefront can often be described by ray optics; we now turn to the phenomena of interference and diffraction, where ray optics provides a totally inadequate description. Interference effects occur when two or more wavefronts are superposed, as in the crossing periodic waves of Fig. 2.12. Diffraction effects occur when part of the wavefront is removed by an obstacle in an otherwise infinite wavefront, or when all but a part of the wavefront is removed by an aperture or stop. The importance of diffraction depends on the scale of the obstacle or aperture compared with the wavelength. We will start with simple examples of interference and diffraction which can be understood intuitively or by using the phasor ideas and constructions of Chapter 2. We will find, however, that this simple approach is transcended by the Fourier theory of Chapter 3. The simplest example of interference is Young's experiment, which provided the first demonstration of the wave nature of light. Two pinholes, or slits, A and B in Fig. 9.1, transmit two elements of a light wave. The two wavelets overlap and interfere; if they then illuminate a screen, there will be light and dark bands across the illuminated patch. We will consider the effect for monochromatic light, and for a screen at a large distance.
Interference and diffraction
Chap. 9
-
y axis
Plane wavefront
Screen
Resultant
Fig. 9.1. (a) Young's experiment. Two wavelets spread out from the pair of slits, and interfere. Constructive interference occurs at P according to whether the path difference I is NX or ( N + ;)A. (b) Phasor diagram for the sum at P. The phase reference is at 0 when P is at a large distance from the slits.
The incident light is a plane wave, so that each slit is the source of identical expanding waves. The point P & z ) is sufficiently far from A and B that the distances AP and BP are nearly equal, and the small difference I = d sine. The two waves differ in phase by 2?rl/X, and the resultant amplitude is
9.1
Young's experiment
129
and the intensity is
where A. and I, are the amplitude and intensity of each wave at P. When P is at a large distance from the pair of slits, the phase reference may be taken at 0, half-way between the slits. The waves are then advanced and retarded on the reference by 7 r l / X , as in the phasor diagram of Fig. 9.l(b). Thus where constructive interference occurs the intensity is four times that due to one slit, or twice the intensity due to two slits if interference did not happen. The conditions for constructive and destructive interference are as follows: constructive PA - PB = I = f NX
(9.3)
destructive PA - PB = I = f ( N + k ) X
(994)
where N is an integer. N is called the order of the interference; it is simply the number of wavelengths difference in the paths to points where constructive interference takes place. The bright and dark 'fringes' which appear on a screen placed anywhere to the right of the double slit system can therefore each be labelled according to their order. The highest order is the integral part of d/X. The loci of points of maximum and minimum intensity in the y, z plane are defined by Eq. (9.3)and E q . (9.4).At large distances from the screen, where I = d sin 8, the conditions Eqs. (9.3) and (9.4) become
and
The fringes are then spaced uniformly in sin 8 at intervals NX/d, and at a distance z the linear spacing is z a / d . The condition that the distance is sufficient to make I = d sin 8 is important; it is equivalent to the condition that the phase of any elementary wave from the screen is a linear function of distances x and y in the plane of the screen. This is the condition for Fraunhofer diffraction, of which two-slit interference is a special case. The Fraunhofer diffraction pattern of a pair of slit sources is the fringe pattern
130
Interference and diffraction
Chap. 9 Intensity
Fig. 9.2. Young's fringes in intensity. Note that each slit alone would give an intensity I,. The average intensity is 2Zo, but the interference effects put energy into the fringes out of the dark troughs.
A (8) is the amplitude of the diffracted wave; negative values in Eq. (9.7) represent a reversal of phase. The intensity is the square of A(0). Notice that the average intensity across several fringes is 24, (see Fig. 9.2), because the peaks of the upper half of the cos2 fringes just fill the troughs of the lower half. This is an example of an important general principle of all interference and diffraction effects: the energy is redistributed in space by these effects, but remains in total the same. This rather obvious remark enables some not so obvious predictions to be made; for example, if light is diffracted into the geometrical shadow of an object, the intensity must begin to fall outside the geometrical shadow to compensate.
9.2 DIFFRACTION AT A SINGLE SLIT The simplest case of diffraction is for a single slit, width w, illuminated by a plane wavefront with uniform amplitude. The slit is perpendicular to the paper in Fig. 9.3, so we take as our basic element a strip of width dy, small compared with the wavelength. Then each strip contributes an equal amplitude ady to the total at P, but the phase of each contribution depends on y as 4 ( ~ =)
21ry sin 8 X
As in the previous section the point P is sufficiently far away that the phase is a linear function of y. The contributions of the elementary strips can be added in the phasor diagram of Fig. 9.3. The central strip, at 0, is taken as the phase reference, appearing as a horizontal phasor in the diagram. Contributions from below 0
9.2
Plane wavefront
Diffraction at a single slit
131
i Resultant A (8)
Phasor contr~buted by str~pat orlgln
Phasor contr Ibuted by str~pdy at y
Fig. 9.3. Diffraction at a slit. Each elementary strip dy contributes an equal phasor with a phase varying linearly with y. The phasor diagram is then part of a circle, allowing the resultant to be easily calculated.
have phases retarded on this reference; these have been placed on the lefthand side of the phasor diagram. Contributions from above 0 appear on the right-hand side, and the diagram then becomes an arc of a circle. The resultant is a chord, representing the amplitude A (8) of the resultant wave in direction 8. When 8 = 0 the phasor diagram is a straight line, representing the maximum amplitude A(0). Hence from Fig. 9.3 we see that
Interference and diffraction
132
Chap. 9
where $ = 7rw sin 8/X is the phase of the component from one edge of the slit. More precisely, using exponentials, we can express the sum of the elementary contributions as the integral A (8) =
S
w/2
a exp (im) dy
- w/2
.
This integrates directly to give the result Eq. (9.9). We now see the connection between diffraction and the Fourier transform, which is so important that we repeat Eq. (3.8):
1
m
f(x) =
~ ( uexp ) (2rinr) du
(9.1 1)
-m
The equations become identical if x = sin 8/X and u = y. The integral need not extend to infinity, as there is no contribution from beyond the edges of the slit at y = + w/2 and - w/2. The integral (9.10) is in fact the Fourier transform of the 'top-hat' function, which was evaluated in Section 3.4 and found to be the sinc function (sin $)/$, as in Eq. (9.9). This equality between the Fraunhofer diffraction pattern and the Fourier transform of the aperture function is universal. The sinc function is important in many branches of physics. By considering the behaviour of the phasor diagram as $ changes we can easily see its main properties. At 0 = 0, $ = 0 and the phasor is a straight line; sinc (0) is therefore unity. As $ moves away from zero the phasor diagram begins to curve into the arc of a circle; the resultant, the chord, shortens but remains parallel to the contribution from the centre of the slit. Thus the amplitude decreases but the phase remains the same. When t,b = a, so that sin 8 = X/w, the phasor diagram is a closed circle and the amplitude is zero. As $ increases beyond 7r there is again a resultant, but it is in the opposite direction; the amplitude has gone negative, or we may say that the phase has changed by 7r in going through the zero. At approximately t,b = 3 ~ / the 2 resultant reaches its extreme negative value and then begins to shorten as $ increases. At $ = 27r, so that sin 8 = 2(X/w), the resultant is again zero. These processes are illustrated in Fig. 9.4. It is easy to see that this oscillatory behaviour continues, giving zeros at exactly sin 8 = X/ w, 2(X/ w), 3(X/ w), etc., corresponding to $ = f T, 27r, 37r, etc. The values of $ to give the maximum and minimum values where dA/d$ = 0 may be found by differentiation:
+
+
+
+
+
da - d (sin $) - $ cos $ - sin $ ---d$ d$ $ 112 dA d$
=0
-
where $ = tan $
.
(9.12)
9.2 .. ~.
Diffraction at a single slit
.
....
Central maximum sine=O JI=O
~
133
-
the centrol
No resultant First zero sin e = A +=i7
Close to the first moximum 3 A sine=-.-
2 w
JI=- 3lr 2
a Resultant
Second zero sin 8 = 2 A
JI = 2 s Fig. 9.4. As 0 the direction of diffraction moves away from zero, the phasor, straight at 0 = 0, curls up as shown. Each zero corresponds to the phasor being wrapped round an integral number of times.
134
Interference and diffraction
Chap. 9
sin 8
sin 8
Fig. 9.5. The amplitude function
and its square, the intensity function, for diffraction at a slit of width w.
This intrinsic equation is best solved either graphically or numerically. If n is an integer the extremes for large n are at
For smaller values of n the maxima and minima are somewhat closer in than the values given by Eq. (9.14). For example, the first extremes come at t,b = _+ 1.437 rather than at 1 . 5 ~ In . terms of the phasor diagram these extremes correspond to the same length of phasor being formed into a circle by being wrapped round approximately l;, 2; times, etc. Similarly, the zeros are given exactly by the same length of phasor being wrapped round 1,2, 3, etc., times, and the central maximum by its being straight. Remember that the change in sin 0 between the zeros on each side of the central maxima
+
9.3
The general aperture
135
Fig. 9.6. A general aperture in the xy-plane. The contribution from Q in the P-direction specified by the direction cosines (I, m, n) is advanced in phase compared with those from the origin by 27r 27r -QQt =-(lx+ my) .
X
X
is twice that between subsequent zeros. In amplitude the first subsidiary lobe is negative and about 22% of the central lobe. The eye or a photographic plate is directly sensitive to the intensity, proportional to the amplitude squared. The amplitude and intensity are plotted in Fig. 9.5. In intensity the first subsidiary maximum is only about 5% of the main maximum, and has fallen to 0.5% by the fourth. 9.3 THE GENERAL APERTURE Having looked at a simple but important case of Fraunhofer diffraction, we now go on to make an important generalization. We generalize in two ways: (i) by considering diffraction in two dimensions; (ii) by allowing the complex amplitude in the aperture to be non-uniform: that is to say it can have arbitrary amplitude and phase. Let the aperture be any shape in the xy-plane. Then the direction P of interest may be specified by the direction cosines (I, m, n). Remember that the direction cosines are the cosines of the angles between the direction they define and the coordinate axes. The reference plane is now a plane through 0 perpendicular to the direction P given by (I, m, n). As before Q is a general
136
Interference and diffraction
Chap. 9
point in the aperture and Q ' the point in which the line in the direction ( I , m, n) to P cuts the reference plane. The distance from Q to the distant point P is shorter than that from 0 by QQ'. From Fig. 9.6,
Hence on account of path difference the phase of light from Q at P will be advanced by (27r/X)(lx + m y ) . So much for the extension to two dimensions. Now let both the amplitude and phase of the wavefront in the aperture be functions of (x, y ) . Mathematically we can express this by letting the amplitude be a complex function of position F(x, y ) . An element dx dy at (x, y) will then make a contribution to the amplitude at P of F(x, y ) dx dy, but rotated in phase by (27r/X)(lx+ my). Or expressing the sum of all such components over the aperture by a double integral,
I
(Ix+my) dxdy Aperture
.
(9.16)
Be clear what this integral represents. Each element of the aperture dx dy contributes a phasor of length F(x, y ) dx dy and phase given by the initial phase of F(x, y ) advanced by the phase due to the path difference QQ ' . The complex integral is just a way of arriving at the resultant in the phasor diagram. Equation (9.16) allows the calculation of complex amplitude at a distant point P in terms of the complex amplitude F(x, y ) in the aperture. The dimensions must be the same on each side, and this is taken care of by the constant C' with dimensions [length] -*. For present purposes we are concerned with the form of the diffraction pattern rather than its absolute value. Equation (9.16) has the form of a Fourier transform in two dimensions, using the pairs of transform variables x/X and I , y/X and m. The coordinates x/X and y/X in the aperture are lengths measured in wavelengths, while I and m are sines of angles measured from the normal. Thus we have the important general result:
The Fraunhofer diffraction pattern in amplitude of an aperture is the Fourier transform of the complex amplitude distribution across the aperture. 9.4 RECTANGULAR AND CIRCULAR APERTURES We can now apply the general theorem of Eq. (9.16) to various apertures, using Fourier transforms directly.
9.4
Rectangular and circular apertures
137
(i) Uniformly illuminated single slit
As in Section 9.2, the slit is represented by an amplitude distribution for lyl< w/2 for lyl>w/2
Fo
.
The Fourier transform, i.e. the diffraction pattern Eq. (9.9), is: A (8) = A (01 sinc
( ;in' li
(ii) Two infinitesimally narrow slits
The Young's double slit of Section 9.1 is represented by the amplitude distribution
where the slits become two delta functions (Section 3.7). The Fourier transform, i.e. the double slit interference pattern, is:
(iii) Uniformly illuminated rectangular aperture
Here F(x, y) = Fo, within the aperture sides extending from - a/2 to + a/2 and - b/2 to + b/2. The aperture sides are aligned along the x and y axes. The diffraction pattern in terms of direction cosines 1 and m is +a/2
+ b/2
A(1, m) = C ' Fo
2lii --(lx+my)
a
The double integrals in Eq. (9.21) are separable, so
Now
-
sin (lila/X) rl/X
d ~ d y. (9.21)
138
Interference and diffraction
Chap. 9
Resultant
~ h a s b rdiagram for single y -strip
Fig. 9.7. Phasor diagram for a rectangular aperture for diffraction in a direction off both 1and m axes. The diagram corresponds to a point on the central maximum close to the first minimum in the m-direction, but fairly far up the central maximum in the /-direction.
with a similar expression for the y integral. The amplitude A(0, 0) at the centre of the pattern where 1 and m are zero is C'Foab, so the amplitude A(/, m) is given in terms of that at (0, 0) by A(1, m) = A(0, 0)
sin (?rla/X) sin (?rmb/X) ?rla/X mb/X
The intensity is given by Eq. (9.24) squared. Thus the expression for the Fraunhofer diffraction pattern of a uniformly illuminated rectangular aperture is proportional to the product of the expressions for the separate diffraction patterns of two crossed slits having the same widths. Along the 1 and m axes the subsidiary maxima have the same values as those of the slit, as one of the product terms of Eq. (9.24) is unity on each axis. Faint subsidiary maxima exist in the four quadrants. For these both product terms are of the order of a few percent, and the brightest of them, at approximately ( f 1.5X/a, f 1.5X/b), is (0.047)2 or 2.2 x 10- of the intensity at the centre. Physically we can see why this is so by considering the phasor diagrams shown in Fig. 9.7. Along the I-axis each strip parallel to the y-axis is in the same phase all over. Phasors from each of these strips combine to give the diffraction pattern as in the single slit. Similarly for the m-axis. If we consider a general point (I, m) however, the phasor from a strip parallel to the y-axis is already bent into the arc of a circle because of the phase along the strip. The total phasor diagram is thus constructed of phasors already bent, and hence comes out very small.
9.4
Rectangular and circular apertures
139
(iv) Uniformly illuminated circular aperture How can we now apply the general theorem to a circular aperture? First we note that the diffraction pattern must be circularly symmetrical. The result is simply written as the Fourier transform A([, rn) = C f Fo
11
e x p ( ~ [ l x +my]) cix dy
circular
aperture
Evaluating this integral is messy (because the limits link x and y). It is simpler to change directly to polar coordinates (r,8) in the aperture and (w,+) in the diffraction pattern. Now r cos 8 = x; r sin 8 = y; and an elementary area is r dr do. In the diffraction pattern coordinates w cos 4 = I ; w sin 4 = m, so that w = J(12 + m2) = sin of the angle that (I, m) makes with the axis (I, m = 0). The amplitude is now given by
The integral in Eq. (9.26) is only soluble analytically in terms of Bessel functions. However, most integrals can only be solved in terms of some sort of tabulated functions; it is merely that Bessel's are less familiar than the sines needed to perform the integrals of Eq. (9.23) for example. This being done, we have
The square of the right-hand side of Eq. (9.27) gives the intensity pattern for Fraunhofer diffraction at a circular aperture. This famous and important result was first derived by George Airy in 1835 at about the time he became Astronomer Royal. It is of especial importance to astronomers as it is the pattern produced in the focal plane of an ideal telescope with a circular lens (or mirror) when a plane wavefront from a distant star is incident. The circular edge of the objective lens of the telescope acts as the aperture, and it is the size of the ring systems due to two nearby stars that determines whether or not they can be distinguished. Airy's pattern both in amplitude and intensity is plotted in Fig. 9.8. At first sight it looks like the similar plots for the slit in Fig. 9.5, but there are several differences. Most important, it is a ring system, so that the plots are radial sections of a pattern possessing circular symmetry. The first zero is at 1.22 X/D(where D = 2a is the diameter of the aperture), compared to X/w for the slit. The zeros are not equally spaced but tend to a separation of X/D
140
Interference and diffraction
Chap. 9
Amplitude
Fig. 9.8. The amplitude and intensity functions
and its square for diffraction at a circular aperture of diameter D. The substitutions w = sin 8 and D = 2a have been made in the expression in Eq. (9.27) to allow direct comparison with Fig. 9.5 for a single slit.
for large values of w. The first subsidiary maximum of intensity is lower, 1.75% compared with 4.72% for the slit. 9.5
THE FIELD AT THE EDGE OF AN APERTURE
In the diffraction theory of this chapter, and indeed in most of the later diffraction theory, we have assumed that the wave can be described by a scalar variable, and made no mention of polarization. It is not usually
Problems 9
141
necessary to calculate diffraction separately for each component of the polarization of the vector wave, but we can easily see one situation where this is necessary. It concerns the assumption, made in Section 9.2, that the diffraction of a plane wave at a slit may be calculated as if the amplitude of the wave were uniform over the whole slit. Suppose the diffracting slit is made of perfectly conducting metal sheet. Then the E field must be zero in the sheet; immediately outside the sheet the component parallel to the slit edge must also be zero. Only at distances greater than about one wavelength from the edge can the field reach its full value. The wavefront is therefore narrower for polarization parallel to the edge than it is for polarization perpendicular to the edge, and the width of the diffraction patterns will correspondingly be somewhat different. This effect is only important if the scale of the diffracting object or slit is not large compared with one wavelength. Evidently there can be considerable complications introduced by the behaviour of the wavefront close to a diffracting object. The full solution of such problems involves a detailed description of the wavefront, which must accord with the boundary conditions at the edge of the object. When the wave is described, then the diffraction pattern can be calculated either by the simple theory of this chapter, or in more difficult cases by a full wave theory due to Kirchhoff. This theory is discussed briefly in Section 10.6. Fortunately it is often possible to proceed without the full rigour of the Kirchhoff theory. PROBLEMS 9 9.1 A simple demonstration of diffraction and interference can be made by scratching lines through the emulsion of an undeveloped photographic plate, and looking through the lines at a distant bright light with the plate held close to the eye. Find the angular breadth of the pattern given by a sodium lamp (A = 589 nm) with a slit width of 0.1 mm. What will be the effect for two such slits 1 mm apart? 9.2 A single slit width D is made into a double slit by obscuring its centre with a progressively wider opaque strip, leaving two slits with width a. Draw phasor diagrams for the single slit diffraction pattern at the edge of the main maximum and at the first zero, and show how these are changed as the opaque strip is widened. Sketch the diffraction and interference patterns for the single slit and double slit on the same scales of intensity and angle. 9.3 Two pinholes 0.1 mm in diameter and 0.5 mm apart, are illuminated from behind by a parallel beam of monochromatic light, wavelength 500 nm. A convex lens of diameter 1 cm and focal length 1 m is placed 110 cm from the holes. Describe the pattern formed on a screen placed (a) 1 m, (b) 11 m, from the lens. 9.4 Show that the slit interference pattern in Fig. 9.5 can be observed using a source of light which is extended along a line parallel to the slit. What is observed when the line source is not parallel to the slit?
142
Interference and diffraction
Chap. 9
9.5 Estimate the smallest possible beam width of (i) a paraboloid radio telescope, 80 m in diameter, used at a wavelength of 20 cm, (ii) a laser operating at a wavelength of 600 nm, with an aperture of 1 cm. 9.6 An aperture in the form of an equilateral triangle, emits a plane monochromatic wave. The side of the triangle is 20 wavelengths long. Find the directions of the zeros of the diffraction pattern closest to the normal. 9.7 From the asymptotic expansion for large z
J,(z) =
sin z - cos z (a)
show that the angular distance between diffraction minima far from the axis of a circular aperture, diameter d, is approximately X/d.
CHAPTER
Fresnel diffraction
1 0 1 FRAUNHOFER AND FRESNEL DIFFRACTION -THE
RAYLEIGH CRITERION In Chapter 9 Fraunhofer diffraction was characterized by a linear variation of the phase of contributions from elements across an aperture. At a point close to an aperture or an obstacle the phase of these contributions may no longer vary linearly with distance across the aperture, and quadratic terms must be introduced. This is typical of Fresnel diffraction; the results are no longer given by Fourier transforms as in Fraunhofer diffraction. The transition from Fresnel to Fraunhofer diffraction is illustrated for slit diffraction in Fig. 10.1. To determine the relative phases of contributions across the aperture for a point P inside the distance R it is not sufficient to draw a reference plane in the aperture; instead a sphere must be drawn to define equal distances from P. For a point P beyond the distance R the difference between a sphere and a plane becomes unimportant, since if the maximum deviation is less than X / 8 it has little effect on the resultant at P. The distance R, dividing the two regimes, is known as the Rayleigh distance; for an aperture width w it is given by
Two further factors may complicate some applications of Fresnel diffraction theory:
144
Fresnel diffraction
Chap. 10
Fig. 10.1. Transition from Fresnel to Fraunhofer diffraction. A portion of a wave W passes through a slit, width d. Intensity distributions across the wave are shown for planes P, (close to the slit). P, (just inside the Rayleigh distance), and P, (beyond the Rayleigh distance).
(ii) Different parts of the aperture may be at considerably different distances from P; this variation may have a significant effect on wave amplitude as well as phase. (ii) The line to P from different parts of the aperture may make considerably different angles with the normal to the surface of the aperture; this also may have a significant effect on amplitude. These factors can be incorporated into a general diffraction integral through the Kirchhoff theory (Section 10.6). For the simplest case of a distant point source of illumination, ro from the aperture, the wave amplitude at the aperture can be written as & exp (ikro)/ro. The diffracted wave amplitude A ( P ) at P is given by the integral ik A ( P ) = - -Ao 4~
exp(ikro)j' r0
aperture
exp (ikr) (1 + cos x)da r
(10.2)
10.2
Shadow edges-Fresnel
diffraction at a straight edge
145
Inside the integral sign the exponential term determines the phase of each component from an area da, while the amplitude is proportional to l/r, the distance of P from the area da. The factor (1 + cos X ) is the inclination factor, where x is angle to the normal of the aperture surface. Outside the integral the factor - i k / 4 normalizes ~ the amplitude and phase of A ( P). The evaluation of Eq. (10.2) is of particular interest for diffraction at a straight edge and at circular holes or obstacles. 10.2 SHADOW EDGES -FRESNEL DIFFRACTION AT A STRAIGHT
EDGE One of the most interesting predictions of the wave theory of light is that there should be some light within a geometric shadow, and interference fringes just outside it. The effect, seen in Fig. 10.2, is that at the geometrical edge of a shadow the intensity is already reduced to a quarter of the undisturbed intensity, falling monotonically to zero within the shadow. Outside the shadow the intensity increases to more than its undisturbed value and oscillates with increasing frequency as it approaches a uniform value.
I
Fig. 10.2. Fresnel diffraction; intensity distribution for a straight edge.
Most problems of interest are treated adequately by a drastically simplified model, that of the diffraction of a plane wave by a half-plane. It is natural in this case to evaluate the contributions of strips parallel to the edge of the half-plane to the amplitude at the point P. The extra phase in the path length from a strip distant h from the edge of the half-plane is easily shown to be
146
Fresnel diffraction
Chap. 10
Fig. 10.3. (a) Fresnel diffraction. Portions of the wavefront W contribute to the wave at P according to their amplitude (proportional to d h ) and phase relative to the contribution from 0 (proportional to h2). (b) The decline in area of the half-period zones.
Since the phase 4 increases as the square of h, it follows that equal increments of 4 correspond to smaller increments of h as h increases. The plot of +(h) in Fig. 10.3 is marked off at intervals of a in phase, showing the width of zones across which the phase reverses. Zones marked off in this way at intervals of ?r in phase are known as 'half-period zones'. We can now construct a phasor diagram made up of the contribution of infinitesimal strips to the resultant at P. The inclination factor can be ignored at first, so that the contribution of an infinitesimal strip of width dh at h has a phase given by Eq. (10.3) and an amplitude proportional to dh. Its components along two axes x and y (taking the x-axis as the phase reference) are d x = dhcos-
ah2 Xs
and
dy=dhsin-
ah2 Xs
(10.4)
The x and y coordinates of contributions from the wavefront from the origin up to any value of h are now given by the integrals of the expressions of Eq. (10.4). Their solution enables a plot to be made of the phasor diagram giving the intensity at P. This takes the form of a spiral, with the property that the angle $ ( h ) that its tangent makes with the x-axis is proportional to the square of the distance along it from the origin. (It is instructive to notice
10.2
Shadow edges -Fresnel diffraction at a straight edge
147
here that if + ( h ) were proportional to the distance along it the spiral would become a circle through the origin with its diameter along the y-axis. This is why Fraunhofer phasor diagrams are circular!) It is usual when plotting this spiral to do so in terms of a dimensionless variable v, related in the present case to the variable h by
Then any point on the spiral is given by X=
Cvcosy
Irv2 dv,
y=
[
V
7rv2
sin,dv
.
The integrals of Eq. (10.6) are called Fresnel's integrals, and the plot of their value as v varies is called Cornu's spiral. As v gets large the spiral tends to a point Z ( x = t , y = t).The edges of the half-period zones correspond to points on the spiral where its tangent is parallel to the x-axis. The Mth such position is given by
The spiral moves in towards Z quickly at first: after v = 4 corresponding to the 8th half-period zone it is entirely contained within 0.5 f0.1 in both x and y coordinates. The inclination factor (1 + cos X ) in Eq. (10.2) will reduce the size of components at large values of v, making the spiral somewhat tighter. Again, the evaluation of $(h) in Eq. (10.3) ignores higher-order terms in h, which also should tighten the spiral. In practice these factors are seldom important, and we will take no account of them. The intensity at P is the resultant of the whole spiral from the origin to Z. Now imagine removing the half-plane entirely. Then the half of the plane wavefront that was covered by the screen can clearly be treated similarly, and contributes another branch of the Cornu spiral in the third quadrant. The whole curve is shown in Fig. 10.4. The resultant Z ' Z when there is no screen at all is clearly double the resultant OZ with the screen in place. This explains why the intensity at the position of the geometric shadow is a quarter of that of the undisturbed wave.
148
Fresnel diffraction
Chap. 10
Fig. 10.4. Cornu's spiral.
Now starting with the undisturbed wave, consider how the amplitude and intensity at P vary as a half-plane is moved from infinity across the plane wave. Then starting at 2' a growing proportion of the spiral is deleted (Fig. 10.5). The resultant instead of being Z ' Z is DZ. D moves round the spiral, so the amplitude begins to show oscillations above and below is undisturbed value. Each extreme represents the deletion of one half-turn of spiral, corresponding to a movement in of one half-period zone. If w is the coordinate of the edge of the plane, the spatial frequency of the oscillations is proportional to w2, so as the plane moves in the amplitude of the oscillations increases as the spiral gets bigger; at the same time their frequency declines as w2. The last minimum is 0.78 of the undisturbed intensity, and the last maximum is up to 1.34 of the undisturbed intensity. From here on the resultant moves smoothly on, arriving at the origin, at just half the amplitude and a quarter the intensity. As w becomes positive the resultant becomes a short vector joining Z to a point on the spiral around Z which rotates, shortening smoothly in length and reducing rapidly in size as P gets deeper into the geometric shadow.
10.2
Shadow edges: Fresnel diffractiorl at a straiglit edge
149
Fig. 10.5. Edge diffraction. Phasor diagrams for successive positions of a shadow edge: (a) a diffraction minimum, (b) the first diffraction maximum, (c) inside the geometric shadow.
150
Fresnel diffraction
Chap. 10
An interesting point to notice is that whilst the spatial frequency of the fringe is determined by the wavelength and the distance s from the edge to the plane on which the shadow is observed, the ratio of the oscillations to the undisturbed intensity is not. So however short the wavelength, these effects still occur-one case in which wave optics does not approach geometrical optics as the wavelength shortens. It is quite easy to observe the first bright fringe around the edge of a shadow in white light, though of course the further ones get progressively out of step due to the large range of wavelengths. For example, the shadow cast by the back of a chair placed half-way across a room, illuminated with a car headlamp bulb at one side of the room shows the bright fringe quite convincingly around its shadow on the opposite wall. If one looks back from a position in the shadow area towards the obstacle, the edge of the obstacle appears bright. This is the light which is diffracted into the shadow; it appears to originate at the edge itself, and it is sometimes referred to as an 'edge wave'. 10.3 DIFFRACTION OF CYLINDRICAL WAVEFRONTS
The treatment in the last section was of the diffraction of a plane wavefront at an edge. It is very easy to extend this to the diffraction of cylindrical wavefronts-such as those emerging from slits. If the source of the wavefront
Fig. 10.6. Geometry for the diffraction of cylindrical wavefronts.
10.4
Fresnel diffraction by slits and strip obstacles
151
is r from the diffracting screen as shown in Fig. 10.6, the extra phase in the path that passes a distance h from the centre line is given by
This is similar to Eq. (10.3) with the addition of the h2/2r term to take account of the curvature of the wavefront before diffraction. The same argument can be followed through with this slightly more complicated expression. It turns out that the Cornu spiral still can be applied as before if instead of the change of variable in Eq. (10.5) the change
is made. The same spiral can then be used with just this change of scale factor. For example, if r = s the diffraction pattern would be scaled by 1/& compared with the plane wave (r= 00) case. For simplicity we shall go on to discuss the Fresnel diffraction of plane waves by slits, but all the results are easily adapted to cylindrical waves by the change of scale given by Eq. (10.9). 10.4 FRESNEL DIFFRACTION BY SLITS AND STRIP OBSTACLES
The Cornu spiral will now be seen as the phasor diagram obtained by adding the contributions at some point of infinitesimal strips across the whole of a plane or cylindrical wavefront. If an obstacle or a slit deletes some of the spiral, the remainder allows us easily to obtain the amplitude and phase. From this point of view it is natural to work in terms of v as variable, always remembering that v is related to h, the actual dimensional coordinate perpendicular to the strips, by Eq. (10.5) for a plane wave or Eq. (10.9) for a cylindrical. In the case of slits it is again conventional to think of the slit being moved past P as was done in the case of the half-plane; the contribution of the uncovered portion of the wavefront is then represented by a fixed length us of the Cornu spiral. Moving the slit relative to P, the point of observation, moves v, along the spiral. To illustrate this Fig. 10.7 shows successive positions of us= 1.2 moved along the spiral in unit steps. This corresponds to a slit too narrow for the undisturbed brightness ever to be attained. As the free length of phasor moves out from the centre-where it is almost straight -to the spiral portions, its resultant decreases monotonically until the spiral is of small enough diameter for it to be wrapped once round once. From then on the resultant increases till about where it is wrapped l t times
152
Fresrlel diffraclior~ Chap. 10
Fig. 10.7. Fresnel diffraction by a slit. A fixed length us= 1.2 of the Cornu spiral forms phasor diagrams (a) at the centre of the diffraction pattern, (b) and (c) at increasing distance off centre.
10.4
Fresnel diffraction by slits and strip obstacles
153
round, after which it again declines, repeating the process in a series of fringes getting smaller and smaller. This process is highly reminiscent of the Fraunhofer diffraction at a single slit, as indeed it should be! As a glance at Eq. (10.5) shows, to make us small at a given A, we must make s, the distance from the slit to the observation point, large, which is just the condition for Fraunhofer diffraction. Indeed we can be more exact. If us = 2 the phasor diagram is just beginning to be seriously bent in the centre: this is evident by inspection of the Cornu spiral. Equation (10.5) then may be used to give the actual slit width in terms of X and s. Thus
which is half the Rayleigh distance (Section 10.1) where Fresnel effects should still matter. On the other hand, if us = 1, the bending of the phasor at the centre of the spiral is negligible, its resultant being 99.4% of its unbent length, and a similar calculation to that of Eq. (10.10) shows we are at twice the Rayleigh distance. As vs increases the intensity in the centre goes up, reaching the undisturbed intensity at us = 1.4, and rising to 1.82 times the undisturbed intensity at ~ ~ ~ 2This . 4 great . increase in brightness can be thought of as the (coherent) superposition of the bright fringes near each edge diffracting separately. The general character of the diffraction from wider and wider slits will now be clear. As the fixed length of phasor slides along the Cornu spiral there are two conditions:
t
Intensity
Fig. 10.8. A slit diffraction pattern in the Fresnel region.
154
Fresnel diffraction
Chap. 10
( i ) In the geometrically bright area. The two ends of the phasor are in opposite parts of the spiral. The intensity is of the same order as the undisturbed intensity but because the resultant joins the ends of the phasor which are independently going round different spirals, complicated beating effects may be seen, as shown in Fig. 10.8. (ii)In the geometrical shadow. The two ends of the phasor are on the same part of the spiral. The intensity is low but, because the ends of the phasor are on the same spiral, fringes are produced having maxima if the two ends are opposite and minima if they are close. Between these conditions is the rapid transition through the edge of the geometric shadow, where for a change of position of about one unit of v the edge of the phasor sweeps from one arm of the spiral to the other. The effect of strip obstacles can be analysed in a similar way by the use of the Cornu spiral. Here a limited portion of the spiral is removed, so that if this is in the centre the two coils Z and Z ' move closer together. In the centre of the shadow of a strip obstacle there is always some light, although it rapidly becomes less as the strip is made wider. We will find that the centre of the shadow of a perfectly circular object contains a narrow spot of light which has the same intensity as the unobstructed light. 10.5
SPHERICAL WAVES AND CIRCULAR APERTURES HALF-PERIOD ZONES
This section deals with Fresnel diffraction in axially symmetrical systems, as for example along the axis of a circular diffracting disc or hole. In the limit a large enough circular hole offers no obstacle at all, so the case of free-space propagation is also covered.
Fig. 10.9. Fresnel's half-period zone construction.
10.5
Spherical waves and circular apertures- half-period zones
155
In Fig. 10.9 the wave amplitude at a point P due to a point source Po is to be calculated by integrating all contributions originating from a spherical surface surrounding Po. The limit of the integral will depend on the size of the diffracting aperture, whose circular edge lies on the sphere. A reference sphere radius b, centred on P, touches the surface on the axis; the phase of a contribution from an annulus at a distance h from the axis is then given to a first approximation by
and the area of an annulus between h and h + d h is 2a h d h. The integral over the surface is conveniently carried out in terms of 4, since the element of area ds = 27rh dh is proportional to d4. The wave amplitude at P is given by an integral of the form U(P)=CS
exp(-i4)db
(10.12)
aperture
where c is a constant depending on the amplitude of the source, and on ro, b, A. For a circular hole with radius o, the integral is between 0 and
giving
The intensity at P is therefore proportional to sin2(qh0/2). This means that if the radius of the hole is increased progressively from zero, the intensity at P increases until the extra path
is a half wavelength, then decreases to zero when it becomes a whole wavelength, and changes cyclically thereafter. The successive annuli on the sphere between these turning points are known as the Fresnel half-period zones. The contributions making up the integral (10.13) are shown in the phasor diagram Fig. 10.10. This is a circle, within the approximation we have made
156
Fresnel diffraction
Chap. 10
Fig. 10.10. The spiral phasor diagram for a spherical wavefront.
in neglecting distance and inclination effects. Ultimately these effects shrink the circle to a point at its centre, giving a resultant amplitude for free space which is half the amplitude obtained when a single zone is exposed. A circular aperture in which alternate half-period zones are blacked out, called a zone plate, is shown in Fig. 10.1 1. Alternate semicircles are now removed from the phasor diagram, and a large concentration of light appears at P. The zone plate is acting like a lens; the relation between object and image distance roand b conforms to a simple lens formula for a given zone plate. An improved zone plate can be made by reversing the phase of alternate zones instead of blacking them out; this is done by a change in thickness of a transparent plate. A circular disc, acting as an obstacle, requires the integral (10.12) to be carried out from a limit 4o out to infinity. The contribution from large values of 4 tends to zero and the integral becomes
U( P)= c( - exp (i$,)) . Surprisingly, the modulus of U(P) is independent of &; the wave amplitude at a point on the axis behind any circular obstacle is the same as the unobstructed wave amplitude. The prediction that there should be a bright spot at the centre of the shadow of circular disc was first made by Poisson on reading a dissertation by Fresnel on diffraction, submitted to the French Academy of Sciences in 1818. When the test was made, and the bright spot was found, the wave theory of light was firmly and finally established.
158
Fresnel diffraction
Chap. 10
10.6 KIRCHHOFF'S DIFFRACTION THEORY
In both Fresnel and Fraunhofer theory we have assumed that a diffracted wave amplitude can be calculated from the sum of secondary waves originating at an aperture. We have assumed also that the wave can be represented by a scalar, so that polarization can be neglected; and we have assumed that the amplitude distribution across an aperture is that of the undisturbed wavefront. These latter assumptions may be improved in specific cases; for example, we know that at the edge of a slit in a metal sheet the electric field must be perpendicular to the conducting surface, so that the parallel component is zero near the edge of the slit. The diffraction pattern will therefore be narrower for a wave polarization perpendicular to the slit length than for the wave polarized parallel to the slit length. Such cases can be dealt with by the application of boundary conditions in determining the amplitude distribution across the aperture. The concept of secondary wavelets, as proposed by Huygens and developed by Fresnel, was originally an arbitrary hypothesis. It was justified later by a more fundamental analysis due to Kirchhoff. One of the problems of the Huygens-Fresnel principle was to assign to each wavelet an inclination factor, which would give it unit amplitude in the forward direction and zero backwards. Fresnel assumed, incorrectly, that it was also zero at 90" to the forward direction. The inclination factor is obtained explicitly in Kirchhoff's analysis, which involves not only the amplitude and phase on a diffracting surface but also their differentials along the wave normal. The time-independent form of the wave equation for electromagnetic waves, expressed in scalar form, is
Kirchhoff's theorem applies to any field obeying Eq. (10.15). It states that the complex amplitude A ( P) at a point P is related to the complex amplitude A on a surface S enclosing P by
where r is the distance from P to the surface element da, and n is the normal to da. If the wave on the surface S originates from a point source Q (Fig. 10.12) it is given by A=Ao
exp (ikro) ro
10.6
Kirchhoff's diffraction theory
159
Fig. 10.12. Fresnel Kirchhoff theory. An aperture forms part of the surface S enclosing P.
and
The term (ik - l/ro), and the similar term (i k - l/r) from the first term of the integral Eq. (10.16), differ significantly from ik only when ro and r are small compared with the wavelength. (Note, however, that inside this region there is a phase shift of 7r/2 as well as an amplitude factor.) If P and Q are many wavelengths from the surface, the integral becomes
Ss~o
ik A ( P )= - 47r
exp ik(r + ro) (cos xo + cos X)da rro
where xo and x are the angles (n, ro) and (n, r). This result, known as the Fresnel Kirchhoff diffraction formula, provides the basis for the diffraction theory which has been developed in previous chapters. In most of the diffraction problems encountered in this chapter the surface S may be made to coincide with a wavefront, so that the incidence angle xo is zero. The inclination factor (cos xo + cos X)then becomes (1 + cos x). The propagation of Huygens' wavelets forwards but not backwards is now clear, as the inclination factor becomes zero for x = 180". The correct factor for x = 90" is not zero, but half the forward amplitude; we should point out, however, that diffraction through such a large angle is very dependent on the boundary conditions at the edge of the aperture. A consequence of the Kirchhoff theory, due to Babinet, concerns complementary diffracting screens. If in one surface S1 all the apertures are made opaque, and all the opaque regions opened, forming a complementary surface S2, then the complex amplitude at P without any obstruction can be regarded as Al + A2, the two diffracted amplitudes from S1 and S2. If
160
Fresnel diffraction
Chap. 10
the unobstructed amplitude is zero, it follows that A, = -A2, and the intensities are identical. Babinet's principle applies to any situation where light is diffracted by an obstacle or aperture into an otherwise dark region. For example, if a small obstruction is placed in a large parallel light beam, the light diffracted out of the beam is the same as that which would be due to an aperture of the same shape and size. Astronomical photographs often show this effect as a cross-like diffraction pattern extending from images of bright stars: this is due to a support structure for a secondary mirror, forming an obstructing cross in the telescope aperture. The diffraction pattern is the same as would be obtained from crossed slits of the same dimensions in an otherwise totally obscured telescope aperture.
PROBLEMS 10 10.1 The beam shape of a 15 m diameter millimetre wave telescope is to be measured from beyond the Rayleigh distance. Calculate this distance for a wavelength of 0.5 mm. 10.2 Consider the possibility of observing optical Fresnel diffraction, as in Fig. 10.2, when a star is occluded by the Moon. Calculate (i) the width of the first halfwave zone at the Moon, (ii) the angular width of a star just filling this zone. Compare these with the size of irregularities on the Moon's surface, and the actual angular width of bright stars. (Moon's distance = 376 x lo6 km.) 10.3 A distant point source of light is viewed through a glass plate dusted with opaque particles of a uniform size. The light now appears to have a diffuse halo about l o across. Use Babinet's principle to explain this and estimate the size of the particles. 10.4 Compare the intensities of light focused from a point source by a zone plate and by a lens of the same diameter and focal length. What change is made by reversing the phase of alternate zones rather than blacking them out? Where has the remaining energy gone? 10.5 An infinite screen is made of polaroid, and divided by a straight line into two areas in which the polaroid is oriented parallel to and perpendicular to the division. Describe the diffraction pattern due to the edge when unpolarized light is incident normally on the screen. 10.6 A shadow edge for demonstrating Fresnel diffraction is made by depositing a metallic film on glass. What will be the effect of using a film that transmits one-quarter of the light intensity?
CHAPTER
The diffraction grating and related topics
11.1 THE DIFFRACTION GRATING
In Chapter 9 the Fraunhofer diffraction pattern in intensity of two slits illuminated by a plane wavefront was shown to be a set of cos2 fringes equally spaced in sine. We now discuss the more general problem of how such a system performs when it has a large number of slits instead of just two. The general solution is to evaluate the Fourier transform of the aperture function. However, much physical insight can be gained by using a phasor approach . Consider first the simple case of five slits illustrated in Fig. 11.1, each separated by d from its neighbours. With two slits maxima were produced by both slits being in phase, which occurred according to Eq. (9.5) when d sin 8 = N A. Clearly we can have a situation when allfive slits are in phase, and this will be when
+
This condition means that the path difference from the reference plane to each slit is in all cases an integral multiple of the wavelength. The phasors lie on a straight line and add up to the same value of amplitude as when there is no path difference between the slits at sin 8 = 0. These maxima in the diffraction pattern of a grating are called its orders, first, second, third,
The diffraction grating and related topics
162
Chap. 1 1
I
Graling
Plone wovefr
d s'in e -I
Fig. 11.1. A diffraction grating with five slits. All the light diffracted into the direction 8 is brought to one point in the focal plane of the lens where the Fraunhofer diffraction pattern may be seen.
etc., according to the value of N. The central or zeroth order is the one for
N=0. The situation as sin 0 moves away from zero is illustrated in Fig. 11-2; for comparison the amplitude pattern for a pair of slits at spacing 5d/2 is also shown. The remarkable thing is that in the five-slit case the light has been diffracted mainly into the strong maxima at the several orders, with only weak maxima coming between them. The first zeros on each side of an order are +X/Sd apart. The orders then are approximately the angular width we should expect for the whole diffraction pattern of a slit as wide as the whole gralirig, ilia1 is Lo say 5din this case. Also wt: can narrow the orders by adding more slits at the same spacing. Diffraction gratings in use at optical wavelengths often have many thousand slits per centimetre, or lines as they are more usually called. In practice gratings are often used in refection rather than transmission. The discussion will be continued in terms of transmission which is probably easier to follow, but reference will be made to reflection gratings where necessary. Figure 11.3, showing the relations of a reflection grating, has been arranged so that the discussion that follows can easily be related to it. In general a grating is not
11.1
The diffraction grating
163
Y
0 sin 8
I~ / 5 d
I
Fig. 11.2. Amplitude diffraction patterns for five slits d apart and two slits 5d/2 apart. Phase is in each case referred to the centre of the slit pattern. Notice that in the 5 slit case the phasor contributed by the central slit remains unchanged throughout.
Fig. 11.3. Geometry for a reflection grating.
illuminated with a wavefront arriving exactly parallel to its plane, but with one at an angle O0 as measured in a plane perpendicular to the lines. We shall continue in this analysis to assume the lines are narrow enough for each to be considered as the source of a single cylindrical wavelet. Take the first line as the phase reference and let there be n lines, so that the optical path from the plane wavefront incident to the reference plane for emergence at 8 in Fig. 11.4 is
nd (sin O0 + sin 0)
.
(1 1.2)
164
The diffraction grating and related topics
Chap. 11
Fig. 11.4. Geometry for a transmission grating.
It will be convenient to write the phase difference between light paths that pass through two adjacent slits as $. Then the phase of light that has gone through the nth line is (with respect to the first)
The complex amplitude obtained as the sum of all the light leaving the grating in the direction 8 is then
This geometrical series may be summed in the usual way, giving 1 - exp (in$) 1 - exp (i+)
I sin ($/2)
The exponential term gives the phase of the resultant A (8, 80) relative to the zero of phase which was taken to be the first line. If instead we take the centre of the grating (be it line or non-line) as our phase reference, the exponential term becomes unity and we see once again that the amplitude remains real, although the phase can be reversed.
11.2
Diffraction pattern of the grating
165
If we are interested in the pattern as a function of 8 for a fixed O0 it is convenient to regard the inclination of the illuminating wavefront as putting a linear phase shift across the grating. This may be conveniently designated by 6 radians per line. Then 6=- 21rd sin O0 X
.
Rewriting Eq. (1 1.5) in terms of 8 and 6 and with the grating centre as phase reference gives nIrd sin (Tsin 8 + A(8,6) =Ao
sin
(
-sin 8
+
4
-
This is an important general expression for the diffraction pattern from n narrow lines d apart. 11.2 DIFFRACTION PATTERN OF THE GRATING
The general diffraction pattern of Eq. (1 1.7) is illustrated in Fig. 11.5 where the intensity is plotted for a grating with six lines. It has several important properties. (i) The orders are equally spaced in sin 8, and occur whenever the phase difference between adjacent lines is an integral multiple of X . This occurs when the numerator and denominator of Eq. (1 1.7) bath tend to zero together. This will happen if
6X
sin 8 = - 2rd
+-a
NX
d
Here N is the order number. So 6, the phase shift per line caused by the angle of the illumination, determines the position of the orders. Hence changing 6 by altering Oo shifts the pattern so that (sin 8 + sin $) remains constant.
166
The diffraction grating and related topics
Chap. 11
Fig. 11.5. General grating pattern for a grating with very narrow lines.
(ii) On the other hand, the separation of the orders in sin 8 is independent of 6 . Thus (sin O)N- (sin 8)N-
X d
=-
and the orders are equally spaced in sin 8, this spacing depending only on the separation of the lines d and the wavelength. (iii) Zeros are given by the numerator of Eq. (1 1.7) being zero when the denominator is not. That is to say when
nrd X
n6 sin 8 + -2
P
)
is an integer
(11.11)
11.3
The effect of slit width and shape
167
or, with the same restriction
This is similar to the expression in Eq. (1 1.9), i.e. the condition for orders; except for the restriction that p/n is not integral. So to sum up Eqs. (1 1.9) and (1 1.12) we see that the n phasors produced by the n lines of the grating form a closed polygon at each zero; zeros thus appear whenever the phase shift across the whole grating is a multiple of 27r. However, in the rare cases where this condition is satisfied, but also the phase shift between each line is a multiple of 2a, the phasor diagram is not a polygon at all, but a straight line, and instead of a zero an order is produced. 11.3 THE EFFECT OF SLIT WIDTH AND SHAPE
So far the diffraction grating has been taken to consist of n slits so narrow that the phase change across each could be neglected. This condition is very restrictive as lines of actual gratings may be several wavelengths wide. To analyse the situation where this restriction is not valid,
I I
I I
I /'I1
.'
I
I
I
line
diffraction d p o t tern
I I
I I
I
Fig. 11.6. Grating pattern for a grating with lines of width w. The line diffraction pattern envelopes the general pattern of Fig. 11.5.
168
The diffraction grating and related topics
Chap. 11
consider first the case of a grating which is completely opaque except for lines of width w spaced as before d apart. Then the same principle applies as in the case of Young's slits of finite width (see Section 12.2): the diffraction pattern of the whole grating only appears in directions in which the individual lines diffract light. The amplitude contributed by an individual line to the light transmitted in any direction by the whole grating is governed by the diffraction pattern of that line itself. Hence the pattern of the grating is the product of the intensity pattern of a single line with the intensity pattern of Fig. 11.5 for the ideal grating. This is illustrated in Fig. 11.6.
11.4 FOURIER TRANSFORMS IN GRATING THEORY We pointed out in Section 9.3 that the Fraunhofer diffraction pattern of an aperture is the Fourier transform of the amplitude distribution across the aperture. An idealized grating has the aperture distribution shown in Fig. 11.7(a); this equidistant series of lines, or delta-functions, is known as the grating function. Its Fourier transform is also a grating function, as shown in Fig. 11.7(b). Following Section 9.3, the transform applies to the diffraction grating if the scales are in terms of wavelengths (for the aperture distribution) and in terms of direction cosines (for the angular distribution). As expected, the closer the lines of the grating, the wider apart in angle are the diffracted beams of successive orders.
Fig. 11.7. The Fourier transform of a grating function is another grating function. The two scales are inversely proportional.
The grating function represents an idealized grating, with infinitely wide grating with infinitely narrow lines. Practical gratings, with finite overall width and with lines of finite width, are also very conveniently analysed by Fourier theory; the theory becomes essential for the more complex case of threedimensional diffraction encountered in X-ray crystallography. We now develop the Fourier transform approach. As we will see, this easily extends to cover the general case of arbitrary slit structure and of arbitrary distribution of illumination over the grating. Mathematically, the aperture distributions are constructed as products and convolutions (see Section 3.6)
11.4
Fourier transforms in grating theory
169
of various functions with the grating function. These cases develop as follows: (i) The grating function: an infinite array of delta functions, spacing a, with the form +(x)= C 6(x- na)
(11.13)
(ii) Repeated line structure, i.e. an infinite array of elements each with the form f (x). The overall aperture distribution F(x) is then a convolution of +(x) with f (x):
(iii) Finite grating length, i.e. a grating with narrow (delta-function) lines, and limited in extent by a function H(x), where H(x) = 1 for 1 xl < L and H(x) = O for 1x1 > L. Then F(x) is the product
(iv) Finite array of structured lines, i.e. a combination of (ii) and (iii) above. Then
Recalling from Section 3.6 that the Fourier transform of the convolution of two functions is the product of their individual Fourier transforms, and correspondingly the Fourier transform of the product of two functions is the convolution of their individual Fourier transforms, we find the required transforms, i.e. the diffraction patterns, of the four cases as follows: (i) The ideal grating +(x) transforms into an ideal angular distribution A (I), consisting of diffracted beams at equal intervals of the direction cosine I. (ii) The grating with finite line width transforms into the product of the transforms of the grating function and the line structure, as in the envelope curve of Fig. 11.6. (iii) The grating with finite length transforms into the convolution of the transforms of the grating function and the overall illumination function of the grating, as in Fig. 11.5. (iv) The general case is a combination of (ii) and (iii), as in Fig. 11.6. In this example the line structure is a 'top-hat' function, which transforms
170
The diffraction grating and related topics
Chap. 11
into the wide sinc function forming the overall envelope; the illumination of the grating is also a top-hat function, which transforms into the narrow sinc function which is convolved with the grating function to produce diffracted beams with finite width. 11.5 MISSING ORDERS AND BLAZED GRATINGS
The combination of the individual line pattern and the grating pattern can produce the effect of missing orders. Suppose the line width w is commensurate with d; that is to say that d / w is a rational fraction. Then a zero of the individual line diffraction pattern will fall on an order of the grating pattern, so that this order will not appear. This effect is shown in Fig. 11.8 for w = d / 2 . So it is possible to remove orders by the skilful choice of the diffraction pattern of the lines. More important, particularly in reflection gratings, it is possible to arrange the grating so that most of the light goes into one particular order. Such a grating, illustrated in Fig. 11.9, is called a blazed grating. The lines are ruled so that each reflects specularly in the direction of the desired order. Another way of looking at this is to observe that the angle of illumination and the tilt of the reflecting lines is such that each line has across it the appropriate phase shift to make it (as a single slit) diffract in the direction of the required order. It is amusing to notice that a plane mirror can be regarded as a limiting case of a diffraction grating, in which the separation d is equal to the line width w. A little thought shows that now every order but the zero order is missing, as it is suppressed by the zeros of the line pattern. Only the zero order is produced, its width depending upon the width of the whole grating, that is to say upon the diffraction of the whole mirror as an aperture.
11.6 MAKING GRATINGS The main grating effects are easy to observe, and do not require very fine gratings. If a handkerchief is held in front of the eye and one looks through it at a well defined edge (such as a distant roof against the sky) it is found that as well as the actual edge at least two more can be seen. These, which are displaced by equal increments of angle, correspond to the first and second orders of the grating formed by the threads of the handkerchief and are spaced at X/d. The human eye can resolve angles down to about 1 minute of arc, or 0.0003 rad. With X = 500 nm, this means that grating effects can just be detected if a grating of spacing 1 mm is held in front of the eye. The millimetre graduations on a transparent ruler will serve, but only just. A better grating to look through may be made by ruling a 10 cm x 10 cm square with 100 lines 1 mm apart. Photographic reduction to produce a negative 5 mm x 5 mm then
1 1.6
-
A --+ d -7,/T\,
I
I
A
\ \
\
:
\ \
I I I I
\ \ \
\
I I
\ \
I
\ \ \
I
/ /
\
/
\
/
-1
1__ \
/
\
/ +
Miss~ng order \
/
L/.
171
\
/
I
Missing order
Making gratings
+
x --
0
'+-
i
- sin 8 . +1
dl2
dl2
Fig. 11.8. Missing orders. A zero of the line diffraction pattern suppresses order entirely.
Incident light
Direction of blaze
Blazed grating
Fig. 11.9. A blazed grating.
172
The diffraction grating and related topics
Chap. 11
gives a grating with a line spacing of 0.005 cm, in which the order separation is 0.01 rad or about 34 min of arc. The Sun viewed through this shows a spectacular and colourful series of orders, overlapping more and more as higher orders are reached. Fraunhofer made his first grating in 1819 by winding fine wire between two screws. Later he made them by ruling with the help of a ruling machine, in which a ruling point was advanced between lines by means of a screw. The ruling was either of a gold film deposited on glass, or directly onto glass with a diamond point. Later in the 19th century Rowland improved the design of ruling machines and was able to rule 14,000 lines to the inch on gratings as much as 6 inches wide. He also invented the concave grating which diffracts and focuses the light at the same time; this dispenses with the need for a lens, which allows spectra to be made of ultraviolet light which would be absorbed by glass. If a grating is ruled by a machine which is not perfect, confusing effects are observed which make its use for spectroscopy difficult. Each single spectral line is seen with several equally spaced and dimmer lines on each side of it. These are called ghosts and in a complicated spectrum of many lines may be difficult to distinguish from genuine lines. They arise from periodic errors in the ruling. Suppose that the machine's error was such that the depth of the ruling varied so as to go through a cycle of deep, shallow, deep every m lines, and that the transmission of the lines was proportional to their depth. Then the grating would be like a perfect grating with another perfect grating m times as coarse in front of it. When illuminated by monochromatic light, each order of the perfect grating would be further split into orders separated by X/md caused by the coarse grating. It is these satellite orders that are the ghosts. In fact any type of imperfection with a periodicity every m lines causes such ghosts spaced at X/md, whether it be of amplitude or phase. The case we considered was of amplitude, in which the ability to transmit light was periodically variable. In a phase variation the spacing of the lines, whilst remaining on the average d, is periodically variable. In this case the lines are first too close, then too far apart, repeating this cyclically. This is like phase-modulation of a carrier wave in radio engineering. The spectrum produced has numerous ghosts spaced at multiples of X/md and of various intensities.
* 11.7
RADIO ANTENNA ARRAYS
From metre wavelengths to centimetre wavelengths it is often convenient to construct large antennas or aerials from many similar radiating or receiving elements, arranged on a one- or two-dimensional grid. Such an arrangement is called an array. The radiating elements do not here concern us: they may for example be half-wave dipoles. In the present discussion
Ir
(a)
(b)
11.7
Radio antenna arrays
173
End-fire array
Broadside array
Fig. 11.10. Directive radio antennas made from separately excited dipole elements spaced X/2 apart. The polar diagrams show the main beam. The broadside array (b) is constructed a short distance (typically X/8) above a reflecting sheet S. A progressive phase difference between elements changes the direction of the main beam, as shown in the polar diagram.
they are considered to be identical so that they have all the same polar diagram. The power polar diagram used in radio engineering is simply the angular pattern of intensity produced at a large distance from the antenna, often normalized to a fraction of the maximum of intensity. It is a Fraunhofer diffraction pattern. Similarly the less-familiar voltage polar diagram is the complex amplitude. Further nomenclature that is usual in aerial work is that the main maximum or maxima of a polar diagram are called the main beam or beams. Subsidiary maxima are referred to as side-lobes.
174
The diffraction grating and related topics
Chap. 11
Consider first a one-dimensional array of elements uniformly spaced d apart. The elements can all be excited separately, using suitable lengths of transmission line from the transmitter. New possibilities now arise as compared with the optical diffraction grating, since the phase of each element can be separately controlled. The main beam can be directed at any angle to the line of elements, as in the following examples, illustrated in Fig. 11.10.
(i) End-fire array shooting equally in both directions This means an array with a polar diagram having equal main beams directed each way along its length. To achieve this it must be arranged that one order appears at sin 8 = 1, with no order in between. To make the distance in sin8 between orders equal to 2,
+
To put a main beam at sin 8 = - 1
So an array of spacing X/2 between (say) dipole elements phased alternately positive and negative would have the required property. It is easy to see that in either direction along the array the contributions from each dipole would be in phase, the delay in space from dipole to dipole being matched by the shift 6 = .?r between the dipoles. On the other hand, in the direction at right angles to the array (for example) the contributions from alternate dipoles would cancel each other.
(ii) End-fire array shooting in only one direction Here we must put an order at sin 8 = - 1, but not another anywhere. To ensure the last condition
And to put an order at sin 8 = - 1 again
*
11.7
Radio antenna arrays
175
A reasonable arrangement now might be to make d = X/4 and hence 6 = ~ / 2 .
Seen from one end the delay in space would now be compensated for by the phase shift of 7r/2 between elements. Seen from the other end the dipoles' contributions would cancel in pairs. The technical problem of obtaining the required phase shift is not here our concern. In practice, because of the mutual coupling between elements, these sorts of arrays are difficult to realize. The familiar Yagi aerial is one such realization of (ii) in which a single 'driven' element couples with a reflector and several 'director' elements to approximate to the conditions of (ii). (iii) The broadside array
This is by far the most important type of array, in which a single main beam is required, emerging perpendicular to the array at sin 8 = 0. The single main beam with low side-lobe level elsewhere is ensured if
+
The side-lobes along the array at sin 8 = 1 will then be those midway between two orders of the general grating pattern and hence low. The main beam will emerge perpendicular to the array if
So in a broadside array all the elements must be fed in phase. Suppose we do not make 6 = 0.Then the beam will emerge at an angle determined by
So with a fixed broadside array the main beam can be steered simply by altering the phase shift from element to element at the rate given by Eq. (1 1.23). This principle is used in many large fixed arrays for beam swinging. (iv) Two-dimensional broadside arrays
In the discussions above of one-dimensional arrays we have considered only the plane of the elements. Considering now all three dimensions the onedimensional broadside array above would have a main beam rather like a
176
The diffraction grating and related topics
Chap. 11
pancake with its plane perpendicular to the line of the array. Only the polar diagrams of the individual elements (by which the generalized grating polar diagram is in practice multiplied) prevent there being completely circular symmetry about the line of the array as axis. Beam swinging by phase shifting makes the original flat main beam become a cone of semi-angle (7r/2 -8). In almost all applications a beam small in both dimensions is required, and this may be achieved by making the array two dimensional. As in the rectangular aperture treated in Section 9.4 the two-dimensional polar diagram is now given by the product of the two grating patterns appropriate to each dimension, and the resultant pencil beam may be steered independently in each direction by applying suitable phase shifts. 11.8 X-RAY DIFFRACTION WITH A RULED GRATING
Diffraction gratings are expected to work well when the line spacing is a few wavelengths. If the spacing is very many wavelengths, the diffraction angles will be small and hard to measure. Diffraction of X-rays by a ruled grating is therefore rather difficult: furthermore X-rays are not easily reflected or absorbed, so that no ordinary grating can be used. Compton solved these difficulties very neatly by using a metal grating at nearly glancing incidence, when X-rays are reflected quite well. For X-rays the refractive index of all substances is less than unity by a few parts in lo6 and hence the phenomenon of total internal reflection is observed when Xrays attempt to pass from free space into a medium rather than the other way round. As the angle of incidence of a beam of X-rays on a highly polished surface approaches 90"there is a critical glancing angle OR(=90 - i) at which reflection is observed; since the refractive index n is near unity we write: cOseR=n = 1-6,
sin OR = OR = J(26)
.
(1 1.24)
A typical example is copper, for which OR = 20' 24" at a wavelength of 0.1537nm, giving 6 = 1 7 . 7 ~ The observation of OR is not a very accurate method of determining 6 because absorption and other unwanted effects make the transition from non-reflection to reflection gradual rather than sharp. But the ability to reflect X-rays at once leads to the possibility of measuring their wavelength with a grating. Compton and Doan in 1925 successfully achieved this measurement. Using a grating of speculum metal with 50 lines per mm they found the wavelength of the Ka line of molybdenum to be 0.0707 0.0003 nm. In Fig. 11.1 1 the relationships of the zeroth and first order are shown. For the Nth order
+
11.9
X-Ray diffraction by a crystal lattice
177 First order
X-rays
order Zero
Fig. 11.11. Diffraction of X-rays at glancing incidence on a ruled grating.
Since a and fl are small this may be simplified to
Notice that there is no difficulty in knowing which order the observed diffraction is at; the orders starting at zero simply come out in order as 6 increases. 1 1 9 X-RAY DIFFRACTION BY A CRYSTAL LATTICE The short wavelengths of X-rays are evidently not well suited for diffraction by ruled gratings. They are, however, conveniently close to the spacings of atoms in crystal lattices, which therefore provide excellent three-dimensional diffraction gratings for X-rays. The diffraction pattern is intimately connected with the arrangement and spacing of atoms within a crystal, so that X-rays can be used for determining the lattice structure. It was Laue who first suggested that a crystal might behave towards a beam of X-rays rather as does a ruled diffraction grating to ordinary light. At the time it was not certain either that crystals really were such regular arrangements, or that X-rays were short wavelength electromagnetic radiation. In 1912 Friedrich, Knipping and Laue performed the experiment illustrated in Fig. 11.12. X-rays from a metallic target bombarded by an electron beam were collimated by passing through holes S, and S2 and fell on a single crystal C of zinc blende, passing through to a photographic plate. When the plate was developed, as well as the central spot at 0 where the beam struck the plate there were present many fainter discrete spots also. This showed that there were a few directions into which the three-dimensional array of atoms in the crystal selectively diffracted the X-rays. These depended on the orientation of the single crystal used. How could the significance of these directions be understood, when the three-dimensional and doubtless very complicated crystal structure was not known?
178
The diffraction grating and related topics
Chap. 1 1
Photographic plate
Fig. 1 1.12. The experiment of Friedrich, Knipping and Laue.
X-rays
Fig. 11.13. Diffraction of X-rays at the Bragg planes in a crystal. The extra path of ray 2 is the distance CR,B = 2d sin 8.
11.9
X-Ray diffraction by a crystal lattice
179
The answer was given by W. L. Bragg in 1912 and is so simple that (in hindsight) it may seem completely obvious. Bragg pointed out that whatever the structure of a crystal, so long as it is repetitive in three dimensions, it is possible to draw sets of parallel planes on each of which the arrangements of atoms will be the same. Such planes are called Bragg planes and their separations Bragg spacings. In any crystal structure it is possible to draw many such sets of planes, but the numbers of atoms on each will vary very much. If plane monochromatic X-rays fall upon the atoms of a Bragg plane, each atom acts as a scatterer and a secondary wavelet spreads out in all directions. If we consider a single Bragg plane these wavelets will combine in phase in the undeviated direction, which .is uninteresting, but also in a direction corresponding to ordinary reflection just like a mirror. Now add all the other Bragg planes parallel to the first. The specularly reflected waves from the various planes will in general be out of phase and will interfere destructively. They will all combine in phase, however, if a rather simple condition involving the glancing angle 8 is satisfied. In Fig. 11.13 it is easy to see that the paths RIA and RIC are identical, so that the extra path in reflection number 2 is CR2+ R2B. The conditions for reinforcement are therefore the simple law of reflection, 8=8', plus
These two statements together form Bragg's law for X-ray reflection from a crystal. The spots found in the original experiment of Fig. 11.12 represented reflection at Bragg planes. The X-ray source contained a wide range of wavelengths, so that the Bragg law was obeyed for each spot by a wavelength within the range. Modern X-ray diffraction techniques use monochromatic sources; since X is fixed a spot can then only be found by rotating the crystal until a suitable value of 8 occurs. An alternative to rotating a single crystal is to use a powder containing crystals at all possible orientations. The importance of the many methods of investigation of crystal and molecular structures which sprang from these fundamental discoveries of Laue and Bragg is hard to over-emphasize. The full analysis of an X-ray diffraction pattern requires measurement not only of spot positions but also of their relative intensities, since these reveal the internal structure of the repeating elements in the crystal lattice. When this is undertaken, however, the structure can be completely determined, even if it contains some tens or hundreds of thousands of atoms, as in a protein molecule.
180
The diffraction grating and related topics
Chap. 11
PROBLEMS 11 11.1 Sketch the diffraction pattern (in intensity) of a grating of total width D, in which the slit width is w , and the spacing is d, when the grating is illuminated normally by a plane monochromatic wave. Indicate how this pattern would be modified if: (a) the illumination is reduced at the edges of the grating (b) alternate lines are blacked out (c) the amplitude in alternate lines is reduced to a fraction a. 11.2 How is the same pattern modified if transparent material which retards the wave by half a wavelength covers (a) alternate slits (b) every third slit (c) one half of the grating? 11.3 A plane diffraction grating consists of alternate transparent and opaque lines with width a and b respectively. Babinet's principle shows that the Fraunhofer diffraction pattern is the same for a grating with widths a and b interchanged, apart from the central maximum. Confirm this by means of Fourier analysis. 11.4 Derive the Fraunhofer diffraction pattern of a set of three equally spaced similar slits by adding the transform of the outer pair of slits to that of the single central slit. Repeat the exercise for a four-slit grating, taking the inner and outer pairs separately. 11.5 The line spacing s in a badly ruled grating increases with distance x across the grating so that
Show that the nth-order diffraction maximum, formed by plane-incident light, focuses at a distance (bnX)-I from the grating. 11.6 The transparency of the lines in a badly ruled grating varies periodically across the grating, so that the pthline has transparency proportional to 1 + a sin(2~p/q), where a is small. Use the convolution theorem to find the diffraction pattern of the grating.
CHAPTER
Young's double slit and related topics
12.1 YOUNG'S SLITS, FRESNEL'S BIPRISM, AND LLOYD'S MIRROR
The simplest possible demonstration of Young's double source interference is to look directly at a bright lamp through a double slit made by scratching two parallel lines on an undeveloped pho~ographicplate. The double slil, which should have a spacing of about half a millimetre, is held immediately in front of the eye. The light source should be monochromatic and compact. It is apparently broadened by diffraction in each slit; the interference fringes are then seen within this broadened image.
Fig. 12.1. Overlapping beams from two slits, S,, S,, illuminated by the same source
L, Interference occurs in the whole of the shaded area.
182
Young's double slit and related topics
Chap. 12
Fig. 12.2. (a) Fresnel's biprism, (b) Fresnel's double mirror, ( c ) Lloyd's mirror. In each asrange~ne~it twill bearils fro111 the sanir sou~.ceoverlap ill the shaded area, appearing to diverge from the twin sources S,, SZ. The more general representation of double source interference in Fig. 12.1 shows two pinholes or slits S1, S2 illuminated by a point source of monochromatic light L. Light from S1 and S2 spreads and overlaps in the shaded region: throughout this region there is interference between the two sets of waves, and interference fringes would appear on a screen or photographic plate placed anywhere in the region. The interference fringes are said t o be non-localized. There are several practical ways of producing two overlapping light beams from a single small source. Fresnel used the thin biprism of Fig. 12.2(a) and the nearly coplanar mirrors of Fig. 12.2(b).A particularly simple arrangement is Lloyd's mirror (Fig. 12.2(c)): here the two sources are the slit S, and its image S2. A n interesting feature of Lloyd's mirror is that the light reflected at grazing incidence t o form S2 suffers a phase reversal at reflection, so that the interference fringes are exactly out of step with those of the double slit. Similar arrangements can be devised for the much longer-wavelength radio waves. A classic example is the equivalent of ~ l o ~ dmirror 's (Fig. 12.3(a)),
12.2
0°
2O
The effect of slit width
Sun's elevat~on(centre of d ~ s k ) 4O 6O 8O
183
1O0
A -
0 5 ~ 1 0 0~ 5 20" ~
0 5 3om ~ 0 5 ~ 4 0 " 0 5 50m ~
0 6 00" ~
0 6 10m ~
0 6 20" ~
0 6 30" ~
Eastern Austral~anstandard time
Fig. 12.3. Lloyd's mirror in radioastronomy. ( a ) The radio telescope receives radio waves both directly and indirectly from the sea. (b) The recorded radiation from the Sun as it rises above the horizon. The interference fringes are disturbed by refraction near the horizon, and by solar outbursts.
used in early radioastronomical observations in Sydney, Australia. The radio telescope, mounted on a cliff overlooking the sea, received radio waves of around 1 1/2 metres wavelength (frequency 200 Mhz) from the Sun and other celestial radio sources as they rose above the horizon. Both direct and reflected radio waves were received. A typical trace of the interference fringes is shown in Fig. 12.3(b). 12.2 THE EFFECT OF SLIT WIDTH
So far each slit in Young's experiment has been supposed to be the source of a uniform cylindrical wave. We must now investigate the effect of the variation of amplitude with 0 which we must expect from any practical source of light, such as a slit which may be several wavelengths wide. We first consider the Fraunhofer fringes of a system made up of two slits of width w, spaced d apart at their centres. The simplest way to think of what happens is to notice that fringes cannot be observed in any particular direction unless each slit, acting as a diffracting aperture, sends some light that way. Each slit contributes according to its own Fraunhofer diffraction pattern, while the relative phase of the two contributions is (2nd/X) sin 0. Young's fringes are then observed within the intensity envelope of the diffraction pattern of a single slit, as shown in Fig. 12.4.
184
Young's double slit and related topics
Chap. 12
Fig. 12.4. Young's fringes with slits of finite width w and separation d . The dotted envelope of the fringes is the diffraction pattern of each slit and would reach its first zero at k X / w .
Following the methods of Chapters 9 and 11, we can immediately describe the fringe pattern as the product of a sinc function and a cosine fringe system, using the direction cosine m = 8: A@)=
wm / A ) ~wm/A
sin
(T
This is another example of the convolution theorem in Fourier transforms (Section 11.4). 12.3 SOURCE SIZE AND COHERENCE In Chapter 9 it was implicitly assumed that a plane monochromatic wavefront can be produced and portions of it removed or allowed through apertures to produce diffraction and interference effects. No wavefront is ideally plane and no wave is perfectly monochromatic, although radio waves and laser light can approach the ideal very nearly. We must investigate interference between non-ideal wavefronts, such as those obtained from a source of finite size. In any ordinary light source, such as an incandescent filament or a sodium lamp, the output of the lamp is made up of the sum of the amplitudes produced by a very large number of individual atoms. Each atom starts radiating with some arbitrary phase, unconnected with the phase of its neighbours, and produces a wave train of finite length for a finite time at some frequency. This wave spreads out spherically and the wavefront from the source as a whole is the sum of all the contributions from individual emitters. In general the emitters will not be on the same frequency either because of the spread of emission all over the optical spectrum (as in an
12.3
Source size and coherence
185
Finite source
Fig. 12.5. A source of finite size illuminating Young's slits. Fringes of high visibility can only be seen if AS-BS only changes by a small fraction of a wavelength for all positions of S from S t to S".
incandescent filament) or because of the width of the emitted spectral lines (as in a sodium lamp). Hence light derived from different sources, or different parts of the same source will not interfere, and interference is only observable if the two interfering beams are each derived from the same source. Any individual part of a source contributes equally to both beams, and every component of each beam will have its fellow to interfere with when the beams are combined. This property of two beams of light, which makes them capable of interfering, is called coherence, and two such beams are said to be coherent. The problem then is in deriving two beams of light that will be mutually coherent from an ordinary light source. Let us look at the problem in the case of Young's slits. Suppose that the plane wave we have so far considered is replaced by the wavefront from a finite source of width w , at a distance r, as in Fig. 12.5. Then for the whole of the light passing through A to be capable of interference with that from B the relative phase of all its components from different parts of the source must be the same, in spite of the different journeys. The light at one slit A is made up of the sum of contributions from all source points from S ' to S". The phasor diagram at A is a random sum of all these contributions. At B the phasor diagram must be the same shape (but not necessarily the same orientation), so that the relative phases of all contributions must be the same as it is at A. This will be true if, on moving from A to B the relative path length to each end of the source, changes by
186
Young's double slit a ~ i drelated topics
Chap. 12
Fig. 12.6. Michelson's stellar interferometer. Light from parts of the star's wavefront d M apart is made to interfere in the focal plane of the telescope. The visibility of
the resultant fringes as d, is varied allows some estimate to be made of the star's angular diameter. very much less than a wavelength. Then the condition for coherence between the light at A and R is
A perfect point source would have the property that all pairs of points on its wavefront would be coherent. The important result of Eq. (12.2) gives the condition under which a finite source may be regarded as a point source as far as a diffracting system is concerned. Looked at from the point of view of the diffracting system Eq. (12.2) may be restated as angle subtended by source eX/linear size of diffracting system
(12.3)
One way then to produce two coherent beams of light is to separate two portions of the wavefront when conditions Eqs. (12.2) and (12.3) are satisfied. This is called 'division of wavefront', in contrast to 'division of amplitude', as discussed in Chapter 13. A further discussion of the way in which coherence varies across a wavefront will be found in Chapter 16.
12.4
Michelson's stellar interferometer
187
12.4 MICHELSON'S STELLAR INTERFEROMETER
In Section 12.3 it was shown that one condition for fringes to be produced by Young's slits was Eq. (12.3)' a restriction on the angular size of the source seen from the slits. This suggests that such a system might be used to measure the angular size of a source by altering the slit separation until fringes could not be seen. This is the principle of Michelson's stellar interferometer. Some modification of the simple Young system is needed to make an interferometer capable of producing fringes by the light of a star; this concerns the method of combining the light that comes through the two slits. In Young's system diffraction at the slits is relied upon to make part of the emergent light from each slit reach the same region so that interference can take place. This means that the slits have to be very narrow. In Michelson's system (Fig. 12.6) the slits are replaced by two plane mirrors M1 and M2, which can be large in size, each reflecting light inwards to two further inclined mirrors M3 and M4. These reflect two parallel beams of light into a telescope objective (which may be reflecting or refracting). Each beam forms an image of the star in the focal plane F, and fringes are seen crossing the diffraction disc of the combined image. Two slits S1 and S2 in front of the objective are used to limit the size of the beams. Suppose first that a star is observed of so small an angular diameter that Eq. (12.3) is satisfied. Then the wavefronts falling on M1 and M2 are coherent and hence (if the mirrors are perfectly adjusted) S1 and S2 are illuminated by coherent wavefronts. In the focal plane F the interference pattern of the slits is observed; it consists of intensity fringes with a spacing (X/d,) f. Notice that the fringe spacing is independent of the separation dM of the outer mirrors M1 and M2. The extent of the area over which fringes can be observed is limited by the width w , of the individual slits and is therefore of the order of ( X / w , ) f, the approximate width of the central maximum of each slit's diffraction pattern. Now consider the effect of observing a star of finite angular diameter 40. Suppose it to be divided up into very narrow strips of width 64 each of which satisfies the condition Eq. (12.3), which in this case is
Then each such strip produces a set of fringes, but each set is non-coherent with any other set, originating as it does from a different part of the source. These sets of fringes then add non-coherently, that is to say their intensities add. Three cases can easily be distinguished, and are illustrated in Fig. 12.7, where only the central portion of the pattern is considered, so that the (sin $)/$ term in the single slit pattern is taken as unity. The sets of fringes
188
Young's double slit and related topics
,
<< A
IdM
a= vd.
Chap. 12
4o>> A
~
M
Fig. 12.7. Fringes in three cases with the Michelson interferometer. The angular diameter of the observed star is 4,. Each elementary strip on the star produces its own fringe system which overlaps the others to a greater or lesser degree according to the star's angular diameter. Note that there is no interference between fringes from different parts of the star (which are non-coherent) only addition of intensity.
from different parts of the star will be shifted sideways in y, the total shift in y being 4of, as is easily shown by geometrical considerations. An angular movement 4 amounting to X/dM causes a difference of path of one wavelength for the light going by the two routes. Examining the three cases in turn shows that more or less blurring of the fringes occurs. (i) 4,4X/dM. The condition Eq. (12.3) is satisfied and the transverse shift of the fringes from different parts of the source is slight. Completely dark minima will be seen. (ii) 40=X/dM.The transverse shift of the fringes from different parts of the source is the same order as their spacing X/d,. There will be no completely dark minima, but a sinusoidal variation of the intensity will be seen. (iii) 4 0X/dM. ~ The overlapping of the sets of fringes from different parts of the source is complete; no variation of the intensity will be seen.
12.4
Michelson's stellar interferometer
189
A quantity, the visibility V(dM), can be measured, given by
v= I m m -
Imin
Zmax + I m i n
The maximum and minimum intensities here used are illustrated in Fig. 12.7. To see how V varies quantitatively with dMit is necessary to define the brightness distribution of the source as a function of +. Let this be B(+), and let B(4) be symmetrical about the centre of the source. Then each elementary strip of the source, d 4 wide and at angle 4 from the centre of the source, produces a fringe system with intensity proportional to B($) and displaced by angle 4 from the centre of the fringe system. The intensity of a fringe maximum now becomes the integral
A
J source
=:I
2
source
x
B(+)(I+COS
where c is a constant. Similarly the minimum intensity becomes
2
~ ( d ) ( l - c o s2ndM4)d+
x
source
.
The fringe visibility is therefore
For a source of uniform brightness B(4) over an angular width V(+) =
t$o.
sin ( ~ d ~ + ~ / k ) r d ~
which is a sinc function similar to that representing the Fraunhofer diffraction pattern of a single slit. There is in fact a close relation between the fringe visibility function Eq. (12.8) and the diffraction formula Eq. (9.16), which can be traced through the fact that the denominator of Eq. (12.8) is the cosine Fourier transform of the brightness distribution across the source. Equation (12.9) may be taken to represent the variation of visibility with at a fixed spacing d M ;alternatively if d M can be varied the fall of visibility can be used to measure for a given star.
mo
mo
190
Young's double slit and related topics
Chap. 12
Michelson in the early 1920s set up such an interferometer at Mount Wilson on the 100" telescope, in which the maximum spacing of the outer mirrors was about 6 m, allowing the measurement of angular diameters down to about 0.02 seconds of arc. Unfortunately this is large compared to the angular diameters of most nearby stars, and only a small number of red giants could be measured. The first of these was Betelgeuse, which was found to have an angular diameter of 0.047 seconds of arc. From this measurement and the distance (known from measurements of parallax) the linear size turned out to be three hundred times that of the sun, and large enough to enclose the earth's orbit! Only a few such stars could be measured with the 6 m spacing. Michelson attempted to use larger spacings, but thermal instabilities in the separated light paths made the interference fringes impossible to observe. Larger spacings can however be used, as has been demonstrated recently by J. Davis in Sydney, Australia.
* 12.5
THE INTENSITY INTERFEROMETER
The difficulties of Michelson's stellar interferometer were overcome by the intensity interferometer of Hanbury Brown and Twiss. This was originally conceived by Hanbury Brown at Jodrell Bank for use at radio wavelengths. As the baseline of radio versions of Michelson's interferometer got longer in order to increase their resolving power, the methods of conveying the radio frequency signals picked up at each end back to the central station where they could interfere and produce fringes became more difficult. In a short interferometer of (say) up to 1 km baseline this is usually done with cables. For longer baselines radio links are often used in which the received signals modulate a carrier at a much higher frequency for transmission to the base station, where after their recovery from their carriers the two signals can interfere. Hanbury Brown's suggestion was that instead of conveying back the amplitude of the radio waves received at each end from the radio source, the intensity only need be conveyed. This suggestion may seem ridiculous, as the intensity does not depend on phase; however, the signals from radio sources, like other naturally occurring radio signals, are characteristically noisy and have fluctuations in intensity. According to Hanbury Brown these fluctuations in intensity would correlate if the two stations were close, but as the baseline was increased the correlation would fall off in a way that would allow the angular diameter of the source to be determined. These two radio versions of Michelson's interferometer are shown in Fig. 12.8. The easing of the problem of conveying the information to the central station this allowed was enormous. The radio frequencies occupy a bandwidth of several MHz at hundreds of MHz: the intensity fluctuations are instead at the low frequencies normally carried by telephone lines.
* I
1
Amplifier
12.5
- Detector
-1
I
a
t
191
The intensity interferometer
Wavefront from radio source
Multiplier
Output a I1 I*
I2 = A2 '2
Amplifier
'2 Detector
.
' 1 Amplifier
Wavefront from radio source
Multiplier
Outlet
a Te
Amplifier
:A2
Fig. 12.8. Radio interferometers. Below is a conventional form similar to Michelson's. The chief difficulty is in conveying the amplitudes A , and A, from the radio telescopes T, and T , which may be intercontinental distances apart. Above is the intensity interferometer where the much more easily handled intensity is brought in, the amplitude being squared (detected) at each end.
The system was used first by Jennison and das Gupta at Jodrell Bank in 1951 in a measurement of the angular diameter of the second most powerful radio source in the sky, that in Cygnus. Encouraged by its success at radio wavelengths, Hanbury Brown then proceeded to apply the same technique at optical wavelengths. The equipment consisted of two searchlight mirrors focusing the light of the observed star onto the cathodes of two photomultipliers. The observed fluctuations in current were then correlated in an electronic multiplying circuit. The equipment was used to observe Sirius on every possible night in the winter of 1955-1956, when a total of only 18 hours observing was achieved. As the two ends of the interferometer were
192
Young's double slit and related topics
Chap. 12
moved apart to a final spacing of over 9 m there indeed was a fall-off in correlation, giving an angular diameter for Sirius of 0.0069 0.0004 seconds of arc. This result and the laboratory experiments in intensity interferometry that had preceded it started a storm of controversy amongst theoretical physicists. The conventional explanation of the intensity fluctuations was that the numbers of photons arriving at each photocathode had statistical variations; how could the photons caught by searchlight mirrors several metres apart have correlated fluctuations? Clearly the work of Hanbury Brown and Twiss stands straddled across from the wave nature of light (which makes interferometers work) to the quantum nature (which is clearly demonstrated in photoelectric emission). The correct approach to the problem is to apply wave theory to the propagation, diffraction, and interference of the light waves, and quantum theory to the interaction of the light waves with the detectors. The wave theory shows what correlation exists between the waves at the detectors, while the quantum theory shows how this correlation may become masked by a statistical 'photon noise'. Moving to clearer skies than those of Cheshire, Hanbury Brown set up at Narrabri in Australia a very large version of his intensity interferometer. He was then able to measure the diameters of several hundred of the brightest stars, the first direct measurement of the diameters of any stars smaller than the red giants.
+
PROBLEMS 12 12.1 A simple double slit with separation 1 mm is held immediately in front of the eye, and a distant sodium street lamp is observed. If the lamp is 10 cm across,
how far away must it be for clear interference fringes to be observed? 12.2 In an experiment to demonstrate Young's fringes, light from a source slit falls on two narrow slits 1 mm apart and 100 mm from a slit source; the fringes are observed on a screen 1 m away. The source is white light filtered so that only the wavelength band from 480 to 520 nm is used. (a) What is the separation of the fringes? (b) Approximately how many fringes will be clearly visible? (c) How wide can the source slit be made without seriously reducing the fringe visibility? 12.3 The altitude of aircraft approaching land is controlled by a 'glide path' in which a radio transmitter of wavelength 90 cm forms interference fringes. The fringes are formed from a transmitter at height h above a conducting ground plane. Find the height h for a maximum signal to be received along a path at 3" elevation from the airfield. 12.4 Hanbury Brown's development of the Michelson interferometer can operate at optical wavelengths with spacings up to 50 m. Calculate the diameter of an object just resolvable by this instrument at a distance equal to the Earth's diameter.
Problems 12
193
12.5 An image of a narrow slit, illuminated from behind by light of wavelength 500 nm, is formed on a screen by a convex lens of focal length 100 cm. The slit is 200 cm from the lens. A second slit, parallel to the first, now limits the beam of light to a width of 0.5 mm. This slit is placed successively (a) 100 cm from the screen; (b) in contact with the lens; (c) 100 cm from the first slit. What is the width between the first zeros of the diffraction pattern in each case?
CHAPTER
Interference by division of amplitude
13.1 DIVISION OF AMPLITUDE -NEWTON'S RINGS In Section 12.3 the problem of making two beams of light coherent so that they could interfere was explained; the condition was derived, showing under what restrictions two portions of the wavefront from an actual source could be made to interfere. This type of interference is called 'division of wavefront'. The other way to observe interference effects is by 'division of amplitude'. A wavefront, which may be from an extended source, is made to produce two beams, usually by partial reflection. On recombining the beams, interference effects are seen. As an introduction to this type of interference we shall consider Newton's rings. These may be observed with the optical system shown in Fig. 13.1. A long-focus lens is placed in contact with an optical flat, and illuminated at vertical incidence by the source reflected by the glass plate tilted at 45'. Observations are made with the microscope. Notice that the 45" plate is not anything to do with the interference effects; it provides a convenient method of illumination. The wavefront travelling downwards is partially reflected at each boundary of the lens and plate. The two wavefronts that interfere to cause Newton's rings arise from partial reflection at the lower surface of the lens and the upper surface of the flat. As they are both derived from the same source, the two waves can interfere with each other. Their relative phases are determined by two factors:
13.1
Division of amplitudc -Ncwton's rings
195
in reflection
Ex tended llght source
,-----, Microscope I
;
t , , , , position
to
p -----l observe rinas
Fig. 13.1. Newton's rings. The rings may conveniently be observed in reflection with the system shown. The much lower visibility rings seen in transmission may be seen from the position indicated.
(i) the reflection coefficients of the two boundaries are of opposite sign, (ii) there is an extra path traversed by the wave reflected by the flat surface. The first factor provides a phase difference of T,so that the centre of the pattern is dark, as would be expected where the glass is in contact and effectively continuous (no reflection at a boundary can take place if there is no boundary). Succeeding rings are light and dark as the extra path length is (N+ %)A or NX. If the radius of curvature of the bottom face of the lens is R the condition for brightness at r from the axis is path difference= 2 h = ( N + M)X ,
(13.1)
196
Interference by division of amplitude
Chap. 13
where the value of h in terms of R and r can be determined approximately by an expansion:
This approximation often appears in optics; it is important to be clear under what circumstances it is applicable. The expression
is a parabola that approximates to a circle of radius R for small r. Usually this approximation will be good enough if the deviation between circle and parabola is small compared with a wavelength. If the expansion of Eq. (13.2) is taken to one more term, this condition becomes
so if R=100cm and X=500nm=5x10-5cm, we find that 0 9 5 . 8 ~ 10- rad or a few degrees. Now at r = 1 cm, h is &, cm = 0.005 cm so 2 h = 0.01 cm. The order N of the ring is then 10-2/(5 x = 200. The approximation then is good for many tens of rings. These ring-fringes are localized in the sense that to see them the microscope must be focused in depth to the boundary between the lens and flat. When this condition is satisfied the light back-scattered from a point on the lens surface is brought ultimately to an image point by the same optical path, whatever direction it started out in. The light scattered in all directions from the corresponding point below on the flat is likewise brought to the same focal point, the only difference being that the path for all rays from the flat is longer by approximately 2h than that from the lens surface. Combining Eqs. (13.1) and (13.3) gives
as the condition for brightness. The ring system then has a dark centre surrounded by rings getting more and more crowded as r increases. The visibility of Newton's rings is not intrinsically close to unity, as in the case of Young's fringes with a point source. It is instead determined by the relative amplitudes of the two waves reflected from the lower surface of the lens and the upper surface of the flat. In practice these are nearly the same, and the ring system is well defined and of high visibility. Quite the opposite is true in the case of Newton's rings seen in transmission rather
13.2
Interference effects with a plane-parallel plate
197
than reflection. The transmitted fringes are complementary: bright where those reflected are dark, but the visibility is very low. A system for observing them is shown in Fig. 13.1. One way to see that they must arise is from a consideration of the conservation of energy. Most of the light incident on the system is transmitted, but due to the interference effects discussed above some areas (the bright rings) reflect light back somewhat, whilst others (the dark rings) reflect hardly at all. Hence, as the energy not reflected back is transmitted, the complementary low-visibility system is seen in transmission. The well known property of a photographic negative to look like a positive if seen in reflected light from the front is somewhat analogous to this. The high transmission (transparent) parts look dark compared with the low transmission (opaque) parts. If instead of a monochromatic source, a white light source is substituted, a dark spot is still seen in the centre, where light of all wavelengths cancels because of the phase change of a on reflection. The overlapping ring systems in all the different wavelengths get increasingly out of step, and give a white field a few orders away from the centre. Between the dark and white regions is a system of coloured rings in a sequence called Newton's colours, as the various colours are added or subtracted according to their wavelength and the distance between the surfaces. 13.2 INTERFERENCE EFFECTS WITH A PLANE-PARALLEL PLATE The case of Newton's rings is only one example of interference effects observed between two beams derived by the division of amplitude by partial reflections from two surfaces. Consider first a monochromatic point source S illuminating a parallel-sided slab of transparent material of refraction index n, shown in Fig. 13.2. Then if the direct path is excluded there are two paths from S to P corresponding to reflections from the front and back of the slab, and P will be light or dark according to whether the optical paths (taking into account the phase change at the upper reflection) differ by an odd or even number of half wavelengths. Clearly there will be circular symmetry about the line SN through the source normal to the slab. The system is like Young's, the interference being between light from the two images S1 and S2 of S, in the front and back surfaces of the slab. The fringes are nonlocalized in space, and a photographic plate in any plane parallel to the slab (for example) would record circular fringes centred on SN. The order of the fringes is high if the thickness of the plate is large compared to the wavelength, so that the source must be highly monochromatic if fringes are to be observed. Similarly if the source is not a point, but extended, the fringes will quickly be lost. A surprising change is made in the system by inserting a lens as shown in Fig. 13.3, and observing the distribution of brightness in its focal plane.
198
Interference by division of amplitude
Chap. 13
Fig. 13.2. A point source S has two images S, and S2 in the front and back surfaces of a parallel-sided slab. The interference in the light from these gives non-localized fringes similar to Young's.
Fig. 13.3. The effect of introducing a lens which brings all parallel rays to a single point P is to allow fringes of equal inclination to be observed with an extended source.
13.2
Interference effects with a plane-parallel plate
199
The the lens brings together at a point P all the rays that leave the plate at a particular angle; there are two of these for each point in the source. The source may now be extended, since the path difference for all pairs of rays reflected in the front and back faces of the slab is the same, if they leave the plate at the same angle, regardless of which part of the extended source they come from. Each element of the extended source thus contributes twice to the light wave at P, once by reflection at the back, and once by reflection at the front. These contributions interfere either constructively or destructively, as determined by their different path lengths. As only the angle of incidence, i, determines the brightness or darkness, these fringes are called fringes of equal inclination. It is easy to see from Fig. 13.4 that the conditions for bright and dark fringes are related to the angle of refraction r, inside the slab, by 2nhcosr=(N+ %)A 2nhcosr=NA
for bright fringes for dark fringes .
The visibility of the fringes is again not necessarily unity. Let A f and A b be the amplitudes from the front and back of the slab. Then the intensity will be proportional to
where $=
4 r n h cos r
+r
The visibility V is defined in terms of the maximum and minimum intensities as
v= Imax - Imin Imax
+ Imin
From Eq. (13.7), putting cos $ = 1 for maximum intensity and cos $ = - 1 for minimum intensity,
So with a thick slab of material and a monochromatic extended source, fringes may be observed in the focal plane of the lens in Fig. 13.3. They are
200
Interference by division of amplitude
Chap. 13 E
h
c\
\
\
\ \ \
r \ \,
D'
Fig. 13.4. The path difference for two rays A,BCDE and A,DE. The paths are equal to the points D and H, where DH is perpendicular to BC. The path difference is therefore HCD, which by construction is seen to be equal to HD'. Since HCD is within the film, the extra path is 2nh cos r.
cos2 fringes, and not of unit visibility. Notice that the present discussion has excluded multiple reflections, a good approximation unless special measures are taken to improve the reflectivity. This point is return to in Section 13.5.
13.3 THIN FILMS The parallel-sided slab discussed in the previous section can produce fringes of a very high order; that is to say the path difference between the interfering beams might be many thousands of wavelengths. Thin films in which the thickness is only a few wavelengths display interesting and somewhat different interference phenomena, which do not depend on the film having parallel sides. Interference fringes now appear in the surface of the film; their position is determilied mainly by the thickness of the film and very little by the angle at which they are seen. Consider first the familiar observation of light from the sky reflected in a thin oil film on water. Each colour forms a set of interference fringes across the film, the positions of the fringes depending on wavelength and the thickness of the film. Each part of the film reflects two waves, one from the front and one from the back of the oil film; the phase difference of these two waves depends both on angle r and thickness h, but in practice mainly on h, so that the interference effects appear to outline the contours of equal thickness in the film.
13.4
Michelson's spectral interferometer
20 1
Observer
Fig. 13.5. Fringes are seen in a thin film by reflection of an extended source of light. There are in fact two extreme cases for interference in thin films. If the thickness is completely uniform, then only a variation of the angle r can change the path difference; fringes will therefore be seen outlining directions where r is constant, as in the thick slab already discussed. The fringes will be very broad, as a large change in cos r corresponds to a small change in path difference when h is small. These are fringes of constant inclination. Alternatively, as with the oil film or a soap bubble, the thickness may vary rapidly while cos r changes little; fringes of constant thickness are then seen. The fringes of constant thickness seem to be located within the film, as their position is determined by h and not by r; by contrast the fringes of constant inclination are seen in fixed directions, and therefore appear to be at infinite distance. In practice the fringes seen in oil films are determined in small part by the angle r, so that they are located just behind the film, as can be verified by the observer moving his head from side t o side, looking for parallactic motion of the fringes across the film. Fringes of equal thickness have many obvious practical applications, as they allow measurements of thickness to be made to a fraction of a wavelength.
13.4 MICHELSON'S SPECTRAL INTERFEROMETER An instrument that has been very influential in the development of interference optics, both theoretical and practical, is the Michelson interferometer (not to be confused with his stellar interferometer, Section 12.4). Its simplest form is shown in Fig. 13.6, which may seem at first sight to bear little relationship to the optics discussed in connection with thin films in the last section. I n fact they are closely related.
202
Interference by division of amplitude
Chap. 13
Fig. 13.6. Michelson's interferometer. Observations are made from P either directly or with a microscope or camera. An observer at P sees the extended light source S reflected in a 'thin film' consisting of the mirror M I and the image Mi of the mirror M,. Fringes of constant thickness are generally seen in this thin film, although if M I and M; are made accurately parallel the fringes take the form of a circular pattern of fringes of constant inclination. Light from the extended source S is split in amplitude by a half-silvered mirror D, and the two beams are then reflected by the mirrors MI and M2. A further partial reflection in D sends the two beams out towards the observer at P. The observer can see the source S reflected simultaneously in the two rr~irrvrsM1 and M2. He can also see an image M; of the mirror M2 close to the surface of M,. The mirror M, is equipped with screws to allow its inclination to be adjusted, so that M; can be made parallel with MI; MI itself is mounted on a screw-controlled carriage so that the perpendicular distance between MI and M; can be adjusted over a considerable range. When M I and nearly coincide, the view of the source S is exactly the same as the view reflected in a thin film, and fringes will be seen in the surface of M I . The equivalent thin film is the space between MI and M;, so that the path difference between two rays reaching P from a point on S is
MI
13.5
Multiple beam interference
203
determined by the thickness of this space. Exact equality of the paths is ensured for the situation when M1 and M; coincide by the insertion of the glass plate C, which is the same thickness as D. Each path then traverses a glass plate twice, and the paths will remain equal for all wavelengths even if the refractive index of the glass is dispersive. Suppose that the distance between M1 and M; is small, but that MI is not parallel to M;. As the mirrors are accurately plane the fringes will then be light and dark lines across the mirrors; these are fringes of constant thickness. The angle of M2 can then be adjusted, and, MI and M; can be made parallel to a small fraction of a wavelength; the linear fringes of equal thickness are then replaced by circular fringes of equal inclination. Adjustment of the position of MI will now make the fringes expand outwards from the centre if the distance M ~ M ;is decreasing, and grow outwards from the centre if it is increasing. Finally a position can be attained when the field is of uniform grey all over. This corresponds to the planes of MI and M; coinciding. The lightness or darkness of the field in this condition depends on the value of the phase change 4 at the reflection in D. If 4 = a the field will be black if the division in amplitude by D has been accurately equal. Suppose that M2 is now tipped about its centre so that the linear fringes of equal inclination are seen. The centre black fringe then corresponds to the zero order. If the monochromatic source S is now replaced by a white light, the central dark fringe will remain, and a few fringes will be seen either side of it displaying Newton's scale of colour and merging into white light a few fringes on each side when the overlapping of the different coloured fringes is complete. So substitution of a white light source is useful as it allows identification of the zeroth-order fringe. The basic properties then of the Michelson interferometer are the ability: (i) to make both arms equal in optical length to a fraction of a wavelength; (ii) to measure changes of position as measured on a scale (the positions of MI) in terms of wavelength by counting fringes; (iii) to produce interference fringes of a known high order (number of wavelengths difference in path lengths). The last of these is the basis for the use of Michelson's interferometer for measuring the width and shape of spectral lines, as discussed in Chapter 15.
13.5 MULTIPLE BEAM INTERFERENCE In the discussion of interference effects in parallel-sided slabs and in thin films we have so far taken account of only two reflected beams, and ignored any multiple reflections. Unless special measures are taken to increase them such reflections are very weak, but if they are enhanced, so that perhaps ten or more beams are combined, the fringes change their character and become very much sharper.
204
Interference by division of amplitude
Chap. 13
Fig. 13.7. Multiple reflections in a parallel-sided slab.
By suitably coating the faces of a slab with a thin metallic film it is possible to increase the reflection coefficient, so that the front face reflects a large fraction of the light incident upon it. The small amount that does get through to the inside of the slab is reflected back and forth many times, a small proportion emerging at each reflection. It is the interference of these many emerging rays, rather than just two, that gives multiple interference its special character. This simple case is illustrated in Fig, 13.7. Most of the incident light S is reflected into the ray R,, but after that the further rays on the reflection side R2,R3, . . . are all of similar strength, dying away gradually. Let A be the amplitude of the incident ray. Then if r and t are the reflection and transmission coefficients from the surrounding medium to the slab, and r' and t ' the corresponding quantities from slab to medium, we can write down the amplitude of the reflected rays
and of the transmitted rays
With the exception of the first large reflected ray these amplitudes go in geometric progression, and the closer to unity that r ' is made, the more slowly does their size die away. Their phase depends on the angle of refraction inside the plate, its thickness and its refraction index. The relative phase of the rays on a plane outside the slab perpendicular to the ray changes by
13.5
Multiple beam interference
205
If now, as in the case of two-beam interference, we provide a lens to bring the transmitted rays to a focus we can discuss the conditions for constructive and destructive interference. For constructive interference, all the slowly declining vectors will be in phase if
Consequently with an extended source fringes of equal inclination, that is to say circles, will be seen, each corresponding to a particular value of N. It is when we consider the spaces between these fringes that the difference comes in. The angle between each of the many phasors is given by $ in Eq. (13.13); a change in $ of 27r takes us to the next maximum, but a change from 2~ of only a very small amount causes the long string of nearly equal phasors to curl up into a near circle. For this reason the fringes are very sharp, a plot of the intensity showing almost no light at all transmitted except close to the maxima. If p beams are transmitted the complex amplitude is the sum of the geometric series
Multiplying both sides by r l 2 exp (i $) and subtracting the result from Eq. (13.14) gives the sum of the series as A(p)=Attl
1 - r ' 2 ~ e x(ip p $) 1 - rt2exp(i$)
Now if p becomes large so that r I 2 p gets very small, we can ignore that term, and calculate the intensity I from A ( a ) A * ( a )
In this expression we can recognize A 2 as the intensity of the incident wave, Ii. The expression for transmitted intensity is more neatly expressed in terms of sin2(%$): It --
Ii
(tt ' (1 - r'2)2+ 4rt2sin2('/2$)
206
Interference by division of amplitude
Chap. 13
Intensity
Fig. 13.8. Cross-sections of Fabry-Perot fringes for several values of the parameter F.
Now it may be shown from a study of the Fresnel coefficients in Section 4.8 that t t ' = 1 - r t 2 , so that Eq. (13.17) may be further simplified to
where
Notice that the parameter F is very big when r t 2is only fairly close to unity. For example if r t 2 = 0 . 8 , F=80. The effect of this is to keep Eq. (13.18) always very small except when $ is close enough to a multiple of 2a for sin2 (1/2 $) to become less than 1/F. The expression then shoots rapidly up to unity when t,b is a multiple of 2a. Figure 13.8 is a plot of I,/Ii for several values of F. The sharp fringes of multiple-beam interference are known as Fabry-Perot fringes. It is easy to see that here we are dealing with a phenomenon very like that of the diffraction grating: the many phasors arising from the multiple reflections give only a small resultant unless the angle 8 at which they traverse the slab is just right to put them all in phase. So the slab is a device that will allow light of any fixed wavelength to traverse it only at certain extremely well defined angles.
13.6 THE FABRY-PEROT INTERFEROMETER This instrument, illustrated in Fig. 13.9, uses the effect discussed in the previous section to produce circular, sharply defined interference fringes from
208
Interference by division of amplitude
Chap. 13
A very high reflection coefficient is essential for high resolution in the interferometer and in filters. A thin metal film then absorbs rather too much
light, and it is preferable to use dielectric coatings on glass. The reflection coefficient then depends on the step of refractive index at the interface; it can be increased by using a series of layers of dielectric, with alternate high and low refractive indices. PROBLEMS 13 13.1 Young's doiible-slit fringes are formed by two side-by-side coherent sources. Consider the fringe pattern formed by two coherent point sources one in front of the other, and relate it to the pattern from a single source seen reflected in a parallel-sided slab. 13.2 The colours in a soap bubble often fade just before the bubble bursts, and the film becomes dark. Estimate the film thickness at this stage. 13.3 The centre of Newton's rings is usually dark. Is this the same phenomenon as in the soap film of Problem 13.2? What happens if oil with refractive index between those of the lens and the flat surface fills the air space between them? 13.4 Show that, contrary to expectation, a transparent film on a perfectly reflecting surface does not show interference fringes in reflected light. This will require analysis following the lines of Section 13.5, noting the change of phase at internal reflection at the top face. 13.5 An air-filled wedge between two plane glass plates is illuminated by a diffuse source of light, wavelength 600 nm. Fringes are seen in the light reflected by the air wedge, spaced 5 mm apart. Find the angle cr between the glass plates. What happens to the fringes if a liquid with refractive index n fills the wedge? What happens if n approaches the refractive index of the glass? 13.6 Find the highest-order fringe visible in a Fabry-Perot etalon with spacing 1 cm, for light with wavelength 600.000 nm. What is the highest order for a wavelength 600.010 nm? What are the angular radii of the highest-order fringes for these wavelengths? 13.7 The structure of a doublet spectral line is to be examined in a Fabry-Perot spectrometer. If the separation of the doublet is 0.0043 nm at a wavelength of 475 nm, what is the spacing of the etalon which places the nth order of one component on top of the (n+ 1)th order of the other? 13.8 If the reflection coefficient (intensity) in a Fabry-Perot etalon is 60070, find the ratio of the intensity at maximum to that halfway between maxima. 13.9 A parallel beam of white light is passed through a Fabry-Perot etalon and focused on the slit of a spectrograph. Describe the appearance of the spectrum. The slit is illuminated also by mercury light. If 200 bands are seen in the spectrum between the blue and green mercury lines (wavelengths 546 and 436 nm), what is the thickness of the etalon?
CHAPTER
The resolution limits of optical instruments
14.1 INTRODUCTION When any of the optical instruments described in Chapter 8 is in use it forms part of a system which starts with light transmitted or reflected from an object and ends with an image on a detector. Information is conveyed by means of light from the object to the detector through the instrument. There are thus three stages, at any or all of which information may be lost. (i) The light may not be able to convey the information at all. This depends on the relation of the wavelength to the size of detail; for example, detail on a scale smaller than the wavelength cannot be conveyed. The wavelength difficulty is fundamental and has led to the development both of electron microscopes and of the methods of X-ray crystallography, in both of which the use of wavelengths much shorter than those of light allows information to be obtained about atomic structures which are lo3 times smaller than optical wavelengths. The use of very high-energy electron accelerators in the study of charge distributions in nuclei and nucleons extends the same argument by a further factor of lo6 or more. (ii) There will be imperfections in the optical system. These can be geometric aberrations which can be avoided by good design, or effects due to diffraction within the instrument; the latter are unavoidable and provide a basic limit to the performance of the instrument. Diffraction must always
210
The resolution limits of optical instruments
Chap. 14
occur because light on its way from a point on the object to a point on the image passes through various stops or apertures, so that the image is made up of many superimposed diffraction patterns. (iii) The final image of any system will be formed on some detecting device such as the retina of the eye, a photographic plate or the photoelectric surface of a television camera. Such a device will have some sort of grid of detectors which will be of finite spacing and will be unable to detect detail smaller than that. The retina itself is surprisingly good: a millimetre scale viewed from 1.5 m is quite distinct, though the separation of the graduations in the image on the retina is only 0.01 mm. The limiting distance between detecting cells (rods or cones) is something like 0.002 mm. The best photographic plates do somewhat better and their grain size enables lines about 0.001 mm apart to be distinguished. In round numbers, think of a newspaper with 300 lines of type reduced on a microfilm from 1 m2 to 1 cm2. The typeface is reduced from 3 mm to 0.03 mm so that a photographic grain size of about 0.003 mm is required if reproduction is not to suffer. The prime example of an optical system that is limited by its final detector is of course the television camera where a raster of only 625 lines is used for the whole picture. Reproduced on a 60 cm screen and viewed from 1.5 m the lines of the raster are separated by 0.01 mm on the retina and all too easily seen! 14.2 THE TELESCOPE
This is the simplest case in which to consider the effect of diffraction. Suppose a telescope is used to observe two stars close together. In the focal plane the best possible images instead of being points are two Airy diffraction patterns, each being a circular spot surrounded by alternate bright and dark rings; if the angular separation of the two stars is a then the centres of the two diffraction patterns are separated by a distance af in the focal plane. This distance can be made any size simply by increasing f, which may be done either by actually altering the focal length of the objective mirror or lens, or, in the Cassegrain or Gregorian telescopes, by altering the secondary mirror. Unfortunately this increase in linear separation does not help in resolving the two stars, because the scale of their diffraction patterns increases in proportion. The distance to the first zero from the bright central spot is given by 1.22 X f/D, where D is the diameter of the objective, so that changing f simply scales the system up or down without altering the relative size of the diffraction patterns and their separation. It is important to appreciate clearly the difference between altering the magnification, which simply changes the size of the image, and increasing the resolution, which allows more detail to be seen. Figure 14.1 should help. To make the diffraction patterns smaller, without altering their separation, it is necessary to increase D, the diameter of the objective mirror or lens.
14.2
The telescope
21 1
Fig. 14.1. Magnification and resolution. The lower lens is twice the focal length of the upper. Because of this the images of two stars are formed at A, and B, twice as far apart as A, and B,. The diffraction patterns caused by the aperture D which is the same in both cases simply scale up in linear size also, so there is no gain in resolution. It is obviously harder and harder to distinguish two stars as their diffraction patterns get closer together, and it is useful to have some arbitrary criterion by which to measure the resolving power of optical instruments. Rayleigh suggested that the angular resolution of a telescope should be defined as the angle between two stars when the maximum of the diffraction pattern of one falls exactly on the first dark ring of the other. That is to say,
X angular resolving power = a,,= 1.22 -
D
(14.1)
In Fig. 14.2'the two Airy functions are drawn, with their sum dotted. It will be seen that a minimum of the intensity occurs which is down to 0.735
212
The resolution limits of optical instruments
Chap. 14
Position
Fig. 14.2. Rayleigh's criterion. The maximum of the diffraction pattern due to one point source (such as a star) falls on the first minimum of the other.
of that at the two peaks, allowing the presence of two stars to be determined. Notice that the intensities add and there are no interference effects between the light from the two stars because they are not coherent. Rayleigh's criterion is also usually applied if the objective is not circular, but a slit or rectangle, in which case the minimum of one pattern falling on the first minimum of the other again gives a minimum of intensity, in this case 0.81 1 of that at the two peaks, which are now separated by X/w where w is the slit width. This result will be useful in the next sections where we consider spectroscopes. If an astronomical telescope is used photographically, a photographic plate or television camera may be placed at the prime focus. Any such detector has detecting elements, or pixels, with a finite size; these may be the individual diodes in a solid-state detector such the charge-coupled device or CCD (Chapter 21), or the individual grains of a photographic plate. The pixel size must be small enough to accommodate the resolution of the telescope. Consider a medium sized optical telescope with D = 0.5 m. Then the angular resolving power at X = 500 nm would be 1.22X/D = 1.2 x rad, or 0.25 seconds of arc. If the focal length were 10 metres, an ideal star image would be 1.2 x lo-* m across, i.e. 12 pm, close to the limit of resolution of a photographic plate. As the eye is rather like an optical telescope, or a camera in which the retina replaces the photographic plate, we are now in a position to calculate the limit set by diffraction on the eye's resolving power. The pupil of the eye varies in size from about 1 to 6mm according to the strength of the illumination. Taking 2 mm and the same value of X, the angular resolving rad or just 1 minute of arc. If the focal power comes out at about 3 x length of the eye is taken as 15 mm this means that the distance between
Ir
14.3
Resolving power and sensitivity
213
images on the retina is 0.005 mm. As the distance between detecting cells is about 0.002 mm, evolution has produced in the eye an optical instrument with its detector well matched to the diffraction limits of its objective! Whilst on the subject of eyes and evolution, it is interesting to notice that the camera eye, having a lens that can be focused, a pupil of variable size, and a retina, has been produced twice. Most familiarly of course it is found in the phylum of chordates (vertebrates) to which we belong. Mice, men, fish and frogs-to take some random examples-are all descended from common ancestors with camera eyes. But the molluscs have also in the squid and the octopus developed camera eyes strikingly similar to our own. We do not share ancestors with camera eyes with these creatures: their eyes have evolved separately. This is a prime example of convergent evolution, the production separately of very similar structures to fulfil a similar function. Giant squids are the largest of the invertebrates, sometimes reaching a length of 50 feet; in proportion eyeballs of 15 inches diameter have been reported!
* 14.3
RESOLVING POWER AND SENSITIVITY
The action of the pupil of the eye in changing size according to the strength of the illumination leads to an interesting point which affects the design of radio telescopes, though (so far) it has had little impact on optical telescopes. If there is strong illumination, such as sunlight, the pupil contracts to as little as 1 mm in diameter. On the other hand, on a moonless night the pupil expands to as much as 6 mm, increasing the eye's light-gathering power by a factor of 36. But these changes in pupil size also alter the resolution limit set by diffraction by a factor of 6. Clearly the inverse connection of lightgathering power and resolution is undesirable, as we may well want to see fine angular detail on a sunny day. Similarly in an optical telescope the increase of light-gathering power which is given by bigger objectives brings with it an increase of resolving power. The world's biggest telescopes have been made big for light gathering rather than high resolution, and in practice the theoretical diffraction limit of resolution is not attained because of bad 'seeing' through the atmosphere. This is why the best Earth-based photographs of the Moon and the planets come from observatories famous for good seeing rather than for big telescopes. But in the optical astronomy of objects outside the solar system the resolution is almost always good enough and usually much better than needed to match the light-gathering power of the telescope. In radio telescopes the situation is reversed, particularly at the longer wavelengths of 1 m upwards. That is to say that the astronomical situation is such that a radio telescope of, say, 1 degree resolving power has a collecting area sufficient for there always to be several clearly detectable radio sources in the beam, and it becomes impossible to sort out the positions and strengths
214
The resolution limits of optical instruments
Chap. 14
of the individual sources. In many situations in radio astronomy then, it is resolution that is needed, not collecting area. It is instructive to work out the size needed to make a radio telescope at a wavelength of 1 m have a resolving power as good as the human eye, which as we have seen can resolve down to about 1 minute of arc, or 3 x rad. Then
so D;. 4 km. Hence the radio telescopes in the form of paraboloids working on the same principle as optical reflecting telescopes cannot be made even to equal the human eye in resolving power. 14.4 THE MICROSCOPE
The resolution of a microscope is limited by diffraction at the objective, just as it is in a telescope. Two points in the object, separated by a distance s, form overlapping diffraction patterns in the image plane of the objective; at the limit of resolution, these patterns are just distinguishable from each other. A further complication may arise; since all points of the object are illuminated from some single source of light there is the possibility that there may be interference between light from separate points on the object. We first consider noncoherent illumination of the object. It was pointed out in Section 8.7 that the microscope can be regarded as a simple lens magnifier followed by a telescope. The nlagnifier converts the strongly diverging wavefronts from the points of the object into nearly plane wavefronts at different angles, and the telescope further increases the angular spread of these wavefronts in the usual way. In practice a separate objective for the telescope is not needed, but in Fig. 14.3 two lenses have been drawn to allow the plane wavefronts between the magnifier and telescope to be
Fig. 14.3. The objective of a microscope split into two halves to ease the discussion of resolving power. The diagram is much exaggerated in vertical scale.
*
14.5
Coherent illumination in the microscope (Abbe theory)
215
seen. Working backwards we notice that on the Rayleigh criterion separate points may be distinguished if separated by a distance 1.22Xf2/D at the image plane. This means that the plane waves between the lenses going to these points must be inclined to each other at an angle 1.22X/D. If the objective has a focal length fi,the wavefronts must have arisen from points separated by
The resolution therefore depends on the angular spread of rays 2i= D/fl. In practice the angular spread of rays entering the objective is very large, and a stricter geometrical argument leads to 1.22X 2 sin i
s=-.
Another way of looking at the problem is to notice that if the angle a in Fig. 14.3 is to be 1.22X/D as required by Rayleigh's criterion, the optical distance A'A must be (D/2) 1.22X/D which means that the extra path length in OLA compared with 0'LA must be 1.22X/2. Similarly B ' B must be (D/2) (1.22h/D) or 1.22)\/2, so that 0 ' LB - OLBmust be 1.22)\/2. Each of these optical path differences is seen by dropping a perpendicular from 0' onto A 0 or from 0 onto AO' , to be approximately s sin i, agreeing with Eq. (14.4) more convincingly. Further, if an oil-filled objective of refractive index n is used, the optical paths above will be in oil, giving the result S=
1.22X 2n sin i
which is the usual form for the resolution distance of a microscope using noncoherent illumination of the object. The quantity n sin i is known as the numerical aperture (denoted by NA) of the objective.
* 14.5
COHERENT ILLUMINATION IN THE MICROSCOPE (ABBE THEORY)
When an object is illuminated by a plane coherent wavefront it is no longer possible to treat each point separately, and add the intensities of the separate diffraction patterns in the object plane. It now becomes more appropriate to consider the whole object as a source of diffracted waves and follow the course of these waves through the microscope. The image, which is formed
*
14.6
Phase-contrast microscopy
217
spacing less than d = X/sin i are removed by the microscope, leaving an object which has lost any detail smaller than this scale. If an oil-immersion lens is used, the diffraction angle is given by sin 0 = X/(nd), and the resolution limit becomes d = X/(n sin i). The result differs from the incoherent case only by a small factor.
rt 14.6 PHASE-CONTRAST MICROSCOPY Frequently in microscopy objects are encountered that do not substantially absorb any of the light passing through them. This is true of many biological specimens, which must be stained with various dyes if they are to be visible in an ordinary microscope. An unstained object may however affect a coherent wavefront passing through it by delaying the wave and producing a pattern of phase changes rather than amplitude changes. Zernike's method of phase-contrast microscopy has the effect of transforming phase changes in the object plane to amplitude changes in the image plane, so making the object visible. Consider again a simple grating in the object plane (Fig. 14.4); the grating has no effect on the modulus of the amplitude, but it introduces a small phase shift which varies periodically across the object plane. The diffraction pattern then contains components So, S1, S' as before, except that the wave at S1 and S f is in quadrature with the light at So, as may be seen by describing the complex amplitude in the object plane as A (x) = A, exp (im0 cos
F)
If +o is small
and the wave has two components, one with the unchanged amplitude, the other in phase quadrature and with an amplitude varying periodically across the aperture. The periodic component produces the diffraction spots S1 and Sf1.
The basic idea of phase-contrast microscopy is to retard the phase of the large undeviated component So by a quarter wavelength, so as to reproduce the diffraction pattern of an amplitude grating. This may be achieved in various ways, the most straightforward of which is to put a glass plate in the focal plane of F, with an extra thickness in the central region so as to retard the light at So by X/4 or 3X/4. In the first case regions of the object having greater optical thickness will appear brighter, and in the second darker.
218
The resolution limits of optical instruments
Chap. 14
Object
Condenser
Auxiliary condenser
Light source
Fig. 14.5. Practical arrangement of a phase-contrast microscope. The ring source of light R is focused on the phase contrast plate P, within the objective lens system. The undeviated light is retarded by the plate, while light diffracted by the object passes through the thinner part of the plate and appears in quadrature in the final image.
These are called bright and dark, or positive and negative phase contrast respectively. The undeviated light at So forms an image of the light source. If this is a point source, it is difficult to obtain sufficient illumination. In a practical instrument (Fig. 14.5) the souce is in the form of a ring (R), and the phasecontrast plate P has a corresponding retarding region which covers the image of R in the objective and condenser lens system.
*
14.8
Resolution in wavelength: the prism spectrometer
219
Fig. 14.6. A generalized imaging system, showing an angular spread of waves and the corresponding amplitude distribution in the focal plane F.
Ir 14.7 SPATIAL FILTERING The diffraction theqry of microscope resolution set out in Section 14.5 may be generalized to include many kinds of image processing, of which phasecontrast microscopy is one example. Figure 14.6 shows a simple lens forming an image I of object 0.As in Fig. 14.4, the object 0 transmits a diffraction pattern which appears as a distribution of amplitude in the focal plane F. The image I is constructed by waves originating in this plane. Any alteration of the amplitude distribution in F will have a corresponding effect on the image; the effect can be predicted since the amplitude distribution in F is the Fourier transform of the amplitude distribution across 0.The image may be modified by blocking out or otherwise changing a chosen region of the Fourier transform; for example, in phase-contrast microscopy the transform is altered by changing the phase of the central region. The process is generally analogous to signal filtering in electrical communications; since it applies to images rather than to signals it is known as spatial filtering. A striking example of image processing by blocking out part of a Fourier transform is provided by the need to remove a pattern of stripes in a picture; this might be a photograph or television image showing a prominent modulation by the line scan. The Fourier transform of such a picture shows a prominent series of points at the positions expected from diffraction at the regular line pattern. Removal of these spots by a mask smooths the picture with little or no loss of its essential content. 14.8 RESOLUTION IN WAVELENGTH: THE PRISM SPECTROMETER
An instrument in which the resolving-power limitation is very similar to that of the telescope is the prism spectrometer, which was described in Section 8.2.
220
The resolution limits of optical instruments
Chap. 14
Whereas in the telescope the problem is to distinguish between the light from two stars separated by a small angle, in the spectroscope the problem is to distinguish between light separated by a small wavelength difference 6X. The prism of the spectrometer by its dispersive property changes the wavelength difference 6X into a difference of emergence angle 68 in the wavefronts at the two wavelengths. In Eq. (8.3) the exchange rate between 6X and 68 was shown in the case of thin prisms to be determined by the angle a of the prism and the dispersive power dn/dX of the material of the prism. The connection as in (8.3) is
This quantity is called the angular dispersion of the prism. The problem of sorting out the wavelength difference 6X thus becomes one of sorting out the angular difference 68 and is hence directly comparable to that of the telescope. The main difference is that the prism is allowed to provide the limiting aperture of the system, rather than the telescope. A complete optical system is shown in Fig. 14.7, beginning with a slit of width w, followed by a collimating lens to make the wavefronts falling on the prism plane. After refraction in the prism plane wavefronts emerge having been deviated by an angle 8 dependent on wavelength. If the position of minimum deviation has been satisfied then Eq. (14.8) applies. The width of the emergent wavefront w, (which is not the same as the width of either face of the prism) is the limiting aperture of the system. At the focal plane of the telescope lens a series of slit diffraction patterns formed by the several wavelengths present, the distance from maximum to first zero beingfX/ww. Applying Rayleigh's criterion in its second form for a slit rather than a circular aperture, two spectral lines can be distinguished if the maximum of one falls on the first minimum of the other; that is to say, if
This quantity X/6X is obviously a useful measure of the power of any device to distinguish different wavelengths and is called the chromatic resolving power.
14.8
Resolution in wavelength: the prism spectrometer
22 1
ww
Fig. 14.7. Optical system of a prism spectroscope. Notice that each wavelength gives rise to an image of the slit broadened by diffraction.
Fig. 14.8. Geometry for the resolving power of a prism. The discussion so far has applied to thin prisms and connected directly with the earlier section on the telescope. For thick prisms an interesting simple expression for the chromatic resolving power can be derived by a direct consideration of optical paths and Rayleigh's criterion. In Fig. 14.8 a thick prism is shown at the position of minimum deviation with the approximate paths for light of wavelengths A and A + 6X shown. Now for Rayleigh's criterion to be satisfied the emerging wavefronts from the prism must be at such an angle that there is one wavelength difference between them at the top in Fig. 14.8 where they emerge from the prism completely. So for light of wavelength A, equating optical path lengths for the extreme rays
222
The resolution limits of optical instruments
Chap. 14
and at X + 6X
which by subtraction give
The reader will find it instructive to reconcile this expression for a thick prism with the expression Eq. (14.10) for a thin prism. This is perhaps a good point to digress somewhat on the subject of spectral lines. The concept is very deeply in the language: we talk of atoms having emission lines, and of the 21 cm hydrogen line seen in absorption, and so on. But of course the atoms do not have lines, they emit or absorb at certain wavelengths. It is the spectroscope that displays the different wavelengths in the light presented to it as a series of lines, each of which is an image of the slit, each at a different wavelength. So that when we speak of lines in an X-ray spectrum, or of the emission lines of the OH radical in radio astronomy, we are using a word which is an interesting fossil originating in the simplest and oldest technique of spectral analysis, the prism spectrometer. 14.9 THE GRATING SPECTROMETER
The simplest form of grating spectrometer is represented by Fig. 14.9; the optical arrangements follow the prism spectrometer of Fig. 1'4.7. The grating equation, from Section 11.1, is d(sin ON
+ sin 00) = N X .
(14.14)
If there are two wavelengths present in the light being examined, Rayleigh's criterion can be applied; the orders produced by the two wavelengths X and X + 6X will just be distinguishable if the order of one falls on the first zero before or after the same order of the other. For this to be true
14.9
The grating spctrometer
223
Fig. 14.9. A grating showing the angles of incident and emergent light.
That is to say, the chromatic resolving power of a grating is the product of the total number of lines across it and the order in which it is used. The order acts like a kind of gearing: in the third order a given change of wavelength 6X changes the path difference between adjacent lines by three times as much as it does in the first order. We can also derive an expression for the angular dispersion of the grating, the rate of change of angle of emergence against wavelength for an order. From Eq. (14.14) this is
Thus for any given order a grating may be used just like a prism, except that it is possible to get much higher resolving powers with a grating. A new problem arises, however, when a wide range of wavelengths is being observed: the spectra in different orders may overlap. If a range from XI to X2 is observed in the Nth and ( N + 1)th order, there is an overlap if
This is easily seen from Fig. 14.10 where the path lengths concerning two lines of the grating are shown. It is natural here to compare the prism and the grating as devices to put in a spectroscope. Equation (14.13) for the chromatic resolving power of a thick prism shows that the angle of a prism is unimportant: the thing that matters is the distance B traversed in the prism by the extreme ray, and the value of dn/dX for the material of the prism. For a heavy flint glass dn/dX
224
The resolution limits of optical instruments
Chap. 14
Nth
order
A, A1
A,
(N+l)th order
Fig. 14.10. Two lines of a grating showing the emerging rays for light of two different wavelengths in the different orders. The extra path travelled by light from the upper line increases by one wavelength.
can be about lo3cm- so that a wavelength of 500 nm and with a very large prism with B = 10 cm we have X 6X
-=
lo4 and
6X=
5 x 10-6 lo"
= 0.05
nm
.
To obtain the same resolving power with a grating in the second order it would need 5000 lines across it. So a grating on the (rather heroic) scale of the prism which was lOcm across would need only 5001ines per cm. Fraunhofer, Rowland and Michelson all improved techniques for ruling conventional gratings, Michelson eventually producing gratings more than 15 cm across giving resolving powers of 4 x lo5. Such a grating enables the fine structure of spectra on a scale of 0.001 nm to be explored. Babcock, also working at Caltech, produced gratings up to 25 cm across. 14.10
RESOLVING POWER OF A FABRY-PEROT INTERFEROMETER
The resolving power of a Fabry-Perot interferometer (see Section 13.6) depends on the order of interference N, which is determined by the spacing between the reflecting surfaces, and on the number of reflecting beams effectively contributing to the interference fringes. A useful parameter in determining the effective number of beams is the parameter F defined by Eq. (13.19).
Problems 14
225
The Rayleigh criterion for resolution of two adjacent wavelengths cannot be applied directly to the Fabry-Perot interferometer, since there is no minimum of intensity close to each interference fringe. However, a good equivalent criterion is that fringes of equal intensity from adjacent wavelengths should be distinguishable as separate peaks of intensity, with a dip between to about 0.8 of the maximum intensity in each line. Equation (13.18) can then be used to show that the minimum separation in wavelength is given by
where the order N = 2d/X, d being the plate separation. This result is very similar to (14.16) which gives the resolving power of a grating as the product of the order and the number of lines in the grating. The number ( a / 2 ) J ' ~ is therefore specially named the finesse of the interferometer; it plays the same part as the number of lines in the grating.
PROBLEMS 14 14.1 Calculate approximate values for the theoretical angular resoluhon of: (i) A 100 m radio telescope working at X = 5 cm. (ii) The unaided human eye, aperture 4 mm. (iii) A 3 m diameter optical telescope. (iv) An optical interferometer with 100 m baseline. (v) A radio interferometer, working at X = 10 cm, with baseline 6000 km. 14.2 The wavelength of a beam of particles, mass m and velocity v is given by X = h/mv, where h is Planck's constant. (i) Show that the wavelength for electrons accelerated by a field of V volts is approximately 1.23 V - ' I 2nm. (ii) Calculate the best possible resolving power of an electron microscope with numerical aperture 0.1 and accelerating field 30,000 volts. 14.3 Calculate the spectral resolving power for wavelengths near 500 nm of the following spectrometers: (i) A glass prism, baselength 4 cm, with refractive index varying linearly between n = 1.5477 at X = 546 nm and n = 1.5537 at X = 486 nm. (ii) A grating 4 cm across with 1500 lines per cm, used in the third order. (iii) A Fabry-Perot interferometer in which F = 40, and with a spacing 4 cm between the plates. 14.4 A 'cross' type of radio telescope consists of two perpendicular strips of receiving area each with length D and width d. (They may for example be large arrays of dipoles or parabolic reflectors). The radio signals from these two are multiplied together in a receiver which records only their product. What is the angular resolution of the system?
CHAPTER
Interferometers and spectrometers
15.1 THE MEASUREMENT OF LENGTH Optical interferometers, such as the Michelson (Section 12.4) and the FabryPerot (Section 13.6), can be used to measure distances in terms of the wavelength of light. Most of the interferometers described in this chapter are two-beam interferometers, using either wavefront division (as in Young's double slit) or amplitude division (as in the Michelson). The latter may often be improved by using multi-beams, as in the Fabry-Perot; the principle is the same, but the fringes are sharper. The length measurements we will consider range from a small fraction of a wavelength, in which the purpose may be to test the optical quality of a surface, to some tens or hundreds of metres, where the object may be to measure the stability of a large structure or to test the theory of relativity. Optical interferometry measures optical path; it is therefore also possible to measure refractive index and its variations across a light beam. We start with the simplest of refractive index measurements. 15.2 THE RAYLEIGH REFRACTOMETER In any two-path interferometer there is a comparison of the optical length of two separate paths. Rayleigh put this to use in measuring the refractive index of a gas. His refractometer (Fig. 15.1) was based on Young's double
15.3
Wedge fringes and end gauges
227
Fig. 15.1. The Rayleigh refractometer. The two slits S, and S2 are illuminated by a common source of light. Interference fringes are formed at the focal plane F of the lens L, and viewed with an eyepiece E. The fringes move across the field of view when the gas pressure is changed in one of the tubes T, , T2. slits, although evidently any other two-beam interferometer could be used. The tubes T, and T2 are in the separate light paths from the slits S, and S2, illuminated coherently from a single source. When the pressure of gas is changed in one of the tubes, the fringe system, viewed by an eyepiece E at the focus of a long-focus lens, moves across the field of view. A count of the fringes as they move provides a direct measurement of the change in optical path through the tube, and hence the change in refractive index as the amount of gas changes. The refractive index can then be obtained for any pressure by simple proportion, since n - 1 is proportional t o pressure.
15.3 WEDGE FRINGES AND END GAUGES The thin-film fringes typified by Newton's rings can be used to measure a very small wedge angle between two surfaces. This can be put to use in comparing standards of length in the form of end gauges, metal bars with polished ends that can be used as reflecting mirrors in interferometers. The simplest comparison that can be made is betwecn two end gauges which are nominally of the same length. They can be placed together on a flat surface, and a partially silvered optically flat glass plate placed on top (Fig. 15.2). Thin-film fringes are then seen in the two wedge cavities, and
Fig. 15.2. Comparison of the lengths of two end gauges. Wedge fringes are formed in the air gap between a glass plate and the ends of the gauges. The spacing of the fringes, which are viewed from above the plate, is a measure of the wedge angle.
228
Interferometers and spectrometers
Chap. 15
Fig. 15.3. End gauge interferometer. This is a Michelson interferometer with fixed mirrors. The gauge G IGz is in contact with MI , so that the centre of the field of view, seen from above, shows fringes from G I , while the outer part shows fringes from M , , as shown in the inset diagram.
the difference in height of the two gauges can easily be found from the spacing of the fringes. Irregularities in the reflecting surfaces are seen as deviations from straight lines in the reflected wedge fringes. The partial silvering of the glass plate gives a multibeam interference effect, like the transmission fringes in the Fabry-Perot interferometer. These fringes are so sharp that surface discontinuities down to 0.3 nm can be detected. Interferometers for measuring larger differences in light paths are usually based on Michelson's interferometer. An example of the direct measurement of the length of an end gauge is shown in Fig. 15.3. Here the end gauge G1G2 is placed firmly in contact with an optically flat reflector M1, so that M1 and the end of the gauge G1 can both reflect light in the field of the interferometer. The reflector M2 is inclined at a small angle, so that the field of view is crossed with parallel fringes, as in Fig. 15.3. Part of the field shows fringes from G1 and part from M1. The distance GIG2 is measured as a shift of the fringe pattern, in wavelengths of the particular light in use. Only the fractional part can be
15.4
The Twyman and Green interferometer
229
measured, however. The whole number of fringes can be found by repeating the measurement with other wavelengths of light, for example using four different wavelengths of cadmium light, each of which is known t o about 1 part in lo8. When the fractional parts are all known, the whole numbers can be found by computation.
15.4 THE TWYMAN AND GREEN INTERFEROMETER The interferometer in Fig. 15.3 can evidently be used to measure the flatness of a reflecting surface. The Twyman and Green interferometer is a n adaptation which can also test the transmission properties of transparent optical components. It is a Michelson interferollleter in which the source of light is a plane wavefront coherent over the whole field rather than a broad incoherent source.(Fig. 15.4(a)).Originally this was achieved by the use of
Fig. 15.4. (a) Twyman and Green interferometer. The half-silvered plate H and the two mirrors MI,M2 are arranged as in a Michelson interferometer, but the source of light is a coherent plane wavefront W, and the fringes are recorded either direct on a photographic plate at P or viewed at the focus of the lens L2.The field of view is uniformly bright if M Iand M2 are exactly aligned. Imperfections in optical path, as for example through a glass slab at G, show as variations of brightness. (b) (inset) M, can be replaced by a spherical mirror when a converging lens is to be tested.
230
Interferometers and spectrometers
Chap. 15
Fig. 15.5. Ray paths in the Mach-Zehnder interferometer. The beam-splitter S, and recombiner S, are half-reflecting mirrors. The mirrors M are fully reflecting. An extended light source L may be used.
a pinhole source at the focus of a high-quality lens, but now a suitable source is a laser beam. With this arrangement light paths can be compared over the whole field. For example, the arrangement of Fig. 15.4(a) may be used to compare the surfaces of two mirrors, while in (b) the optical path through a lens is measured across the whole field. As with the measurement of end gauges in Fig. 15.3, the mirror MI is slightly tilted, so that a perfect optical system yields a system of parallel fringes across the field of view; imperfections then show as deviations from straightness. The Mach-Zehnder interferometer, shown in Fig. 15.5, is another amplitude-splitting device intended for comparing optical paths in separated beams. The separation may be large, so that one may for example traverse a wind tunnel in the region of a shock wave, or it may traverse a plasma cloud in a thermonuclear test reactor. Variations of refractive index across the field are seen as deviations from straightness in a set of plane-parallel fringes.
15.5 THE STANDARD OF LENGTH The internationally defined standard of length is the distance travelled in vacuo by light in unit time. The velocity of light c is therefore a defined constant chosen to give agreement to the best possible accuracy, between the new and old definitions of the unit of length. If the frequency of a narrowbandwidth laser can be measured in terms of the unit of time, then the wavelength is known in standard units, and it can be used to calibrate a secondary standard such as a metre-long end gauge at the wavelength of a chosen spectral line. (In fact the previous length definition was in terms of a specific spectral line wavelength.) Michelson made the first measurements of the metre in terms of wavelengths in a different era, when the metre was defined by a mechanical
15.6
The Michelson-Morley experiment
23 1
standard; he was therefore measuring the wavelength of light rather than measuring the metre. The moving mirror of a Michelson interferometer was mounted on a carriage on accurate sliding ways alongside the standard, and with a microscope attached for setting on the fiducial marks of the metre. A direct count of fringes as the carriage traversed the metre would involve some millions of fringes; furthermore, the spectral line sources available to Michelson were not sufficiently narrow to allow the use of path difference as large as a metre. He therefore used an intermediate sized standard of length called an etalon, and built up the full length by adding a series of etalons.
15.6 THE MICHELSON-MORLEY EXPERIMENT This classic experiment, which is now regarded as a test of special relativity, was originally devised as a measurement of the velocity of the Earth through space. If light could be regarded as a wave in a medium, called the aether, through which the Earth was moving, the velocity of light as measured on Earth would depend on its direction of travel. A Michelson interferometer, with two light paths at right angles, would show the effect. In the Michelson-Morley experiment both light paths were increased to 11 metres by folding them between a series of mirrors. The expected effect, according to the aether theory, of the Earth's orbital velocity was nevertheless only 0.4 of a fringe. This was to be detected by rotating the whole apparatus on a stone bed floated in mercury. As we now expect from the theory of special relativity, no fringe shift was detected, demonstrating that the velocity of light is invariant.
Fig. 15.6. Ray paths in the Sagnac interferometer. The rays which recombine at the beam-splitter S traverse the circuit in opposite directions, reflected by the three mirrors M.
232
Interferometers and spectrometers
Chap. 15
15.7 THE RING INTERFEROMETER
In 1911 Sagnac constructed an interferometer in which the two beams follow the same optical path but in opposite directions (Fig. 15.6). The ring may have three or four mirrors; a small angular misalignment of any of them will produce a parallel fringe pattern. The ring interferometer is insensitive to any effect apart from rotation about an axis perpendicular to the ring. In such a rotation one optical path is effectively shortened in comparison with the other. Special relativity in no way suggests that rotation cannot be detected, in contrast to a linear velocity, but the theory warns us that the following analysis can only be regarded as a useful approximation. If the ring has a mean radius R, and angular velocity o , then the times of travel in the two directions are 2 r R (c + d?)and 2?rR(c- oR)- l . The difference is easily seen to be
where A is the area of the ring. The ring interferometer can therefore be used to measure rotation: it is the basis of the laser gyroscope (p. 268). 15.8 THE CHOICE OF SPECTROMETERS
Interferometers which can be used to measure length in terms of the wavelength of light can obviously be used to measure the wavelength of a spectral line; twin-beam interferometers can also be used to analyse a wide and complex spectrum through Fourier transform spectroscopy. We describe this very important and fundamental technique in Section 15.14. We now consider some factors which influence the choice between various types of spectrometer. (i) Resolving power
This has already been discussed in Chapter 14: it is a measure of the ability of the instrument to distinguish between close features of a spectrum. (ii) Overlapping orders
The high resolution of some instruments, notably the Fabry-Perot interferometer, which work at a very high order of diffraction, is accompanied by the necessity of adding another low-resolution spectrometer or filter, so that overlapping adjacent diffraction orders can be separated.
15.9
The grating spectrometer
233
(iii) Instrumental profile
Even though the resolving power may be adequate, an instrument may nevertheless confuse the light in adjacent parts of the spectrum. If, for example, the important part of a stellar spectrum were an absorption line, and it was required to measure the depth of absorption by comparison with the adjacent parts of the spectrum, then it would be necessary to know the amount of unabsorbed light of other wavelengths which was scattered by the instrument into the absorption line. This is achieved by measuring the line profile recorded when a single very narrow line is used as a calibrating source: this profile is called the instrumental profile. The instrumental profile may be limited by basic diffraction theory, so that it is like the patterns of Figs. 11.6 and 13.8. A grating at very high resolution is, however, likely to produce a line profile limited by imperfections in the ruling of the grating, which cause 'ghosts' close to each order (Section 11.5). The ghosts add up to a profile with a spread response amounting often to a quarter of the whole response. (iv) Signal-to-noise ratio
It is obvious that as much energy as possible should be put into the device which records the spectrum, whether it is a photographic plate, a photomultiplier, or a thermal detector. It is less obvious that all recording devices produce their own random output signals, whether they are in the form of photographic grains developed at random, spontaneously emitted electrons, or thermal fluctuations. The definition of the performance of detectors is discussed in Chapter 21; here we note only that it may be important to obtain sufficient sensitivity in an interferometer that the required 'signal' is large compared with the 'noise'. 15.9 THE GRATING SPECTROMETER
The resolving power of the best grating spectrometers is in the range lo5- 10% This requires a very large grating, 20cm or more across, containing possibly 200,000 ruled lines. The highest resolution is obtained by using the grating beyond the first order; the fifth order is often used. The problem of obtaining a good line profile has already been mentioned. There is, however, no alternative to the grating if a direct spectrum is required, with no overlapping orders, and it is often used for optical spectroscopy. When there is difficulty with the signal-to-noise ratio, it is important to use a 'blazed' grating which sends most of the light into a specific order. For extremely poor illumination it may not be sufficient to record the spectrum photographically. Then an improvement can only be made by the use of
234
Interferometers and spectrometers
Chap. 15
elaborate electron-optical devices such as image converters (Chapter 21), which amplify the illumination of an image by photoelectric and electron multiplication processes. Any attempt to record the whole spectrum by scanning a simple photoelectric cell across it would of course lose in signalto-noise ratio, since the cell can only record one part of the spectrum at one time, while the rest of the light is wasted.
15.10 CONCAVE GRATINGS In an ordinary spectrograph a grating is usually illuminated by a plane wave, requiring a collimator lens or mirror with the light source at the focus. The emergent spectrum is then focused on to a detector, so that two lenses or mirrors are needed; these may introduce losses and aberrations. The difficulty may be avoided by using an arrangement due to Rowland in which a concave grating is used for focusing. In this grating the lines are ruled on the surface of a concave mirror. An interesting piece of geometry shows that if the slit is located on a circle tangential to the mirror and with diameter equal to its radius of curvature, then the several orders of diffraction are also in focus along this circle.
Fig. 15.7. The geometry of the Rowland circle.
15.1 1
Echelle and echelon gratings
235
In Fig. 15.7 S is the slit and C the centre of curvature of the grating. Then any ray from S that is reflected from the mirror at R has an angle of incidence a! = SRC because CR is normal to the mirror. The directly reflected ray then crosses the circle again at P, where CRP = a!. Now if R were on the circle the angle SRP = 2a! would be independent of the position of R, so that all rays from S would pass through P. In fact if the size of the mirror is small compared with the diameter of the circle this is true enough. Now if one wavelength in one of the orders is diffracted by an angle P more than the direct reflection, so that it cuts the circle at P ' , the same argument applies to SRP ' which is constant at 2a + 0. Hence if the slit, the grating and the photographic plate are all located on this Rowland circle, sharp spectra may be recorded without the intervention of further optics. 15.11 ECHELLE AND ECHELON GRATINGS
We have seen that for a grating the resolving power is nN,the product of the number of lines and the order of the diffraction. One approach to high resolving powers is to make the number of lines as great as possible. This has led to gratings having resolving powers of the order of 4 x 1@. To obtain still higher resolution from gratings it is easier to use fewer lines but increase the order of the diffraction. With a conventional flat grating the order cannot be higher than d/X, the number of wavelengths in the space between lines, so that close ruling does not permit the use of high orders. For example, if a grating with 5000 lines/cm was to be used at high resolution, the highestorder it could be possibly used in would be
and to realize this the light would be incident at 7r/2 (parallel with the surface) and be diffracted back through T. High-order diffraction is achieved in practice by the use of blazed gratings, and the echelle and echelon gratings. The idea of all these basically similar systems is to separate the fixed relationship between the line spacing and the order by making the grating not flat but rather like a flight of stairs viewed from a distance. The riser of the stairs corresponds to the line, and the tread to a displacement backwards of each line. The lines have thus become reflecting surfaces, each one displaced backwards from the previous one to give a high order of interference. The angle of these reflecting surfaces can now be adjusted to reflect light into the direction in which it is desired to observe spectra. An echelle grating about 25 cm across with lo4 steps or grooves can operate in the 1000th order and have a resolved power of 10% Similar systems due to Michelson called echelons consist of a pile of glass plates arranged like a
236
Interferometers and spectrometers
Chap. 15
Fig. 15.8. A Michelson transmission echelon. The individual glass plates are vertical and staggered to form the flight-of-stairs effect, with very many wavelengths between each emerging beam.
flight of stairs, which may be used in either transmission or reflection at orders as high as 20,000. The difficulties of realizing the system become very great, and Michelson never in fact realized the reflection echelon, though he made transmission echelons (Fig. 15.8) successfully with some tens of plates. With the very high orders of interference obtained in these devices the problem of overlapping orders becomes extreme. Overlapping orders may however be dealt with by crossing any high-resolution spectrometer with a low-resolution spectrometer, such as a prism, whose resolution is in a perpendicular direction. The various orders are then separated in a twodimensional format. An example is shown in Fig. 15.9. 15.12 THE FABRY-PEROT INTERFEROMETER The Fabry-Perot interferometer (Section 14.10) easily attains a very high resolving power, in excess of lo6, with a fairly good instrumental profile. It also has the great advantage of simplicity. The one disadvantage is the overlapping of orders of diffraction, which confines its use to the detailed analysis of a small section of spectrum which has already been isolated, as for example by the arrangement of Fig. 15.9. An extended source of light is suitable for a Fabry-Perot interferometer. This has been advantageous in studies of diffuse objects such as galactic nebulae and the solar corona. The high reflection coefficient which is used
15.13
Twin-beam spectroscopy
237
Fig. 15.9. Combination of Fabry-Perot and prism spectrometers. The spectral lines S , , S,, S , are images of the source S, dispersed by the prism P. The Fabry-Perot
etalon E is placed in the parallel beam of light, and produces a ring system with high dispersion within each of the spectral lines. for high resolution does however imply that a large proportion of the light is lost in the first reflection. It is interesting to compare this situation with twin-beam interferometry, in which there is the minimum loss of light. 15.13 TWIN-BEAM SPECTROSCOPY
We have so far described the performance of a twin-beam interferometer in terms of an ideally narrow spectral line. The next step is to consider its action with a single spectral line with finite width or structure, and then to generalize for a wide and complex spectrum. Suppose that sodium light provides the illumination in a Michelson interferometer. As is known from examination in any spectrometer that can resolve wavelengths separated by a few Angstrom units (1 A = 10-'Om), the prominent yellow light from sodium is made up of the two D lines, at approximately X I = 589.0 and X2 = 589.6 nm of almost equal intensity. These wavelengths differ by just over 0.1%. As a first approximation consider them as very narrow compared to their separation. Suppose that the interferometer (Fig. 13.6) is first set up with MI and Mi in coincidence at the centre and at a slight angle so that vertical fringes are seen. Then as MI is moved further away the fringes move sideways, and as each crosses the centre of the field of view it indicates a change of X in the optical paths between the two arms, that is to say a movement of X/2 in the position of MI. As the order of the interference increases, the fringes get less and less visible. This is because the two sets of fringes formed get progressively out of phase until a point is reached when they are exactly in
238
Interferometers and spectrometers
Chap. 15
antiphase and give a uniform intensity. The condition for this is that the mirror M1 moves a distance dl where
We must be clear that there is no interference between the two sets of fringes, as light on different frequencies cannot be coherent. It is simply that the addition in intensity of two nearly equal but antiphase sine waves gives a more or less uniform intensity. Increased d still further, the visibility improves and the fringes become sharp again when
Changing d by about 3 cm allows about 100 such cycles of visibility variation to be counted, each with about 1000 fringes between them. Clearly the appropriate reduction of such data allows the separation of the two lines to be accurately determined. Supposing though that it was not known that the sodium light contained two nearly equal spectral lines; could similar analysis yield a knowledge of the frequency spectrum of the light used? In general when light from a spectral line without prominent frequency structure is examined in the Michelson interferometer it is found to give high-visibility fringes at zero d, which decrease in visibility as d is increased, and finally disappear. A recording of the intensity at the centre of the field as d is varied will in general be of the form of fringes of varying visibility finally going to zero for large values of d. Such a recording is called an interferogram. In the example of sodium light above it is easy to see that the form of the interferogram implies the spectrum of the light causing it. Michelson realized this and correctly described the Fourier transform relationship between the two quantities. Because he was not equipped to measure intensity quantitatively, and had no means other than hand computation to do Fourier transforms, he was unable to apply his elegant method except to fairly simple spectral lines. Measuring the visibility as a function of d (that is to say not attempting to measure absolute fringe position) Michelson nevertheless made discoveries of fundamental importance. One of these was that lines previously thought to be monochromatic contained several close components. This is the hyperfine structure and is caused sometimes because several isotopes are present in the sample of atoms emitting the analysed light, and sometimes by the atomic nucleus having several possible magnetic moments. Typical hyperfine splittings in wavelength are of the order of 0.001 nm and are detected with values of d of the order of tens of centimetres. Another important discovery was that all atomic lines had some width, so that a monochromatic line is a purely theoretical concept. With lasers we can now come much closer to monochromatic light than does any spectral line.
15.14
Fourier transform spectroscopy
239
15.14 FOURIER TRANSFORM SPECTROSCOPY Michelson's work showed that a twin-beam interferometer could be used to find the shape and structure of a single spectral line. This involves the measurement of fringe visibility as a function of interference order; a subsequent Fourier transformation of the visibility curve gives the profile of the line. The resolving power is equal to the order of interference reached,
Fig. 15.10. Efficient twin-beam interferometers. (a) The double mirrors in a Michelson interferometer allow the returning beam to be detected at D, in addition to the beam at D2. (b) All the light from the source S reaches a single detector in this arrangement due to Strong. The path difference is changed by moving the multiple mirror M2, which interleaves with a fixed mirror M,.
240
Interferometers and spectrometers
Chap. 15
which is naturally limited to the order necessary to reduce the visibility practically to zero and thus resolve the line. For many years the Fabry-Perot interferometer proved the simpler and more practical instrument. In 1950, however, Fellgett suggested that the twinbeam spectrometer could be used at many wavelengths, and particularly at infrared wavelengths, with great advantage in respect of signal-to-noise ratio. When photographic recording can be used in a spectroscope, the whole of the light passing through the instrument is recorded. But if the spectrum can only be registered by scanning a single detector line by line across the spectrum, as might be necessary in the infrared where photography becomes inefficient, then light is wasted and sensitivity is lost. In a twin-beam instrument, by contrast, all the light entering the instrument always reaches the detector, and no signal is lost. The only loss of light occurs at the arrangement for splitting the beam. Figure 15.10 shows two ways of avoiding even this loss, one by detecting both the beams which leave a conventional instrument, and the other, due to Strong, which uses an ingenious interleaved mirror which reflects all the light into a single detector. The computation of the line profile is, of course, easier now than it was for Michelson. It is a matter of routine to record the complete visibility curve in digital form, and obtain the profile very quickly from a digital computer.
PROBLEMS 15 15.1 A Rayleigh interferometer is used to measure the refractive index of hydrogen. The tubes are 100 cm long, and a pressure change of 50 cm mercury gives a count of 145.7 fringes at X=5893 nm. What is the refractive index at normal atmospheric pressure? Starting with both tubes at atmospheric pressure, what pressure change will give a minimum fringe visibility due to the pair of sodium D lines at 589.0 and 589.6 nm?
15.2 A spectrograph used in radio astronomy is required to resolve the structure of the hydrogen spectral line at 1420 MHz, as observed when a radio telescope is
receiving radiation from several hydrogen gas clouds moving with different speeds in the line of sight. The spectrograph divides the radio signal into two paths, inserting a variable digital delay into one path, and then recombines them in a multiplier. The smallest increment of delay is r and the largest total delay is N r . If gas clouds with velocity differences between 10 km s- and 1000 km sare to be distinguished, what values of 7 and N are required? 15.3 Light falls normally on a reflection echelon grating in which the step height is h and the step width is w. Show that the path difference between light beams reflected in direction 8 to the normal from corresponding points on adjacent faces is h (1 + cos 8) - w sin 8 For small B the mth order then emerges at
PROBLEMS 15
24 1
For an echelle with h = 1 cm and w = 0.1 cm, and with 40 such steps, find for wavelengths near 500 nm: (i) the order m for 8 near zero (ii) the angular separation of orders (iii) the resolving power. 15.4 The resolving power X / 6 X of a grating spectrograph is the difference between extreme optical paths measured in wavelengths (Section 14.8). Show that the resolvable frequency difference 6 v is related to the difference T in travel times in the extreme paths by
For a Lummer plate (Fig. 15.1l), which produces a fringe pattern related to that of a Fabry-Perot etalon, show that the resolving power is approximately
Note that the plate is used with the emergent beams at a very small angle to the surface of the plate. The light beam enters the plate via prism P, the refractive index is n and the length is L.
Fig. 15.11
CHAPTER
Coherence and correlation
16.1 THE REGION OF COHERENCE
In much of the discussion of diffraction and interference phenomena so far we have been concerned with monochromatic light produced by a point source, or with portions of wavefront produced by such a source. In such a monochromatic wavefront the amplitude of the vibration at any point on it is constant, while the phase varies linearly with time. So fundamental is this concept that we have almost always worked in terms of a timeindependent but spatially variable complex amplitude A (x, y, 2): this is the coefficient of exp (2a ivt), which gives the monochromatic time variation. We must now recall the discussion of Section 2.1, which shows that the scalar quantity actually proportional to E(x, y, z, t ) or H(x, y, z, t ) is the quantity U(x, Y, z, t ) in
No actual source is either a point or strictly monochromatic, so that no light is a perfect sine wave extending indefinitely in space or time. The consequences have already been encountered in the discussion of two types of interferometer, of which Michelson's stellar interferometer and spectral interferometer are examples. The stellar interferometer investigates the waves from a source which is nearly, but not quite, a point, finding that the lack of coherence across the wavefront is a measure of the angular diameter of the source. The spectral interferometer investigates the waves from a narrow
16.1
The region of coherence
243
spectral line by exploring the loss of coherence between two points separated along the path of the wave. The loss of coherence along the path of a wave from a source which is nearly, but not quite, monochromatic can be understood by supposing that the wave is made up of a large number of wave trains of finite length, each produced by a single atom or other emitter, and that a large number of such wave trains pass a point in the time taken to make an observation of intensity. As the phase of different trains is random (that is to say all phases are equally likely), there can be no interference effects except between light derived from the same wave train. When the path difference gets longer than the wave train, no interference effects can be seen. Each such wave train may itself (indeed must) be made up of a spectrum of frequencies, so we can express its waveform as a function of time by means of a Fourier integral
At a particular point (x, y, z) the scalar U(x, y, z, t), which for the moment will be written Up(t), is made up of some large number N of such disturbances. So
where tn is a time of arrival for each wave train. If we now measure the intensity at P, we in practice measure the time average of (Up(t))2. The time over which such an average would be taken will be very long compared with the time for one of the wave trains to pass P. So the intensity at P is given by
where the brackets ( ) mean that the time average is to be taken of the quantity within. Now consider the process of adding the F ( t - tn), and squaring, followed by time averaging which is implied by the right-hand side of Eq. (16.4):
Each of the wave trains is of random phase so that the cross terms F1F2, etc., time average to zero. The squared terms F:, etc., represent sinusoidal wave trains squared, which of course are all positive and therefore add to give a resultant intensity which is proportional to the number of such wave trains passing the point of observation at any one time.
244
Coherence and correlation
Chap. 16
In the Michelson spectral interferometer the two light beams are derived from the same source, and they are brought together after travelling different path lengths. Each elementary wave train appears in both beams, so that if the path difference is less than the length of an elementary wave train there can be interference. There is a typical coherence length in the light beam, which is the length of an elementary wave train. There is also a typical coherence time, which is the time for the wave train to pass any point. Coherence time is fundamentally related to spectral width, as may be seen from the Fourier analysis of Section 3.9. A Gaussian wave group with spectral width Av will have a coherence time to where
The coherence length of a wave packet is similarly inversely proportional to the range of wave number in its constituent waves. In the discussions of the stellar interferometer (Section 12.4) the condition for obtaining interference between light derived from points transversely separated across the wavefront was derived and expressed in the inequality (12.4). Put in a way more suitable for the present discussion this becomes a measure of the transverse coherence distance which is of the order of X (angle subtended by source)
We are thus led to the idea that around any point in the field produced by a real source there is a region of coherence, with a transverse size governed by the angular diameter of the source, and a longitudinal size governed by the bandwidth of the radiation from the source. Any interferometer that is to produce fringes from the light of the source must derive its two beams from points within this volume. The two sorts of Michelson interferometer we have discussed are the archetypes, the stellar using transverse separation and the spectral longitudinal. 16.2 CORRELATION AS A MEASURE OF COHERENCE
In the last section it was shown that given the picture of each atomic radiator in a source radiating wave trains of a finite length at random times, the existence of a region of coherence in the radiated field is implied. We now examine how this idea can be made more quantitative. First let us set out the essential operations carried out in the stellar interferometer (Fig. 12.6): (i) Light is collected from two positions in the field by the outer mirrors.
16.2
Correlation as a measure of coherence
245
(ii) It is brought by equal path lengths to the same point in the focal plane of the telescope objective. (iii) Its amplitudes are added and squared over a range of relative delays in time. This range of relative delays is achieved by the differing lighttimes from the two inner slits to different parts of the focal plane, and gives rise to the fringes. (iv) Whether fringes are photographed or examined visually, a time average is taken of the intensity. These processes can be expressed mathematically. Let Al(t) and a2(t) be the amplitudes at points P1 and P2 in the field. Then the intensity as a function of T, the relative delay, is given by
As usual the asterisk denotes the complex conjugate and the brackets ( ) denote that a time average is taken. If this expression is multiplied out it gives
The first two terms are simply the two intensities at P1 and P2, and are not functions of T. It is the variation with T of the second two terms (which are each other's complex conjugate) that gives fringes. Suppose the field at PI and P2 is from a monochromatic point source of period T. Then when 7 = N T (N is an integer) all four terms are equal and the intensity is four times that at P, or P2. On the other hand, when T = ( N + k) T, the second pair of terms are negative (each being the average of the product of cosines in antiphase) and they exactly cancel the first pair of terms. Thus fringes of 100% visibility are observed. Evidently it is the second pair of terms that are of interest, expressing the relationship between the complex amplitudes at the two points. As they are each other's conjugate each has the same information as the other and conventionally the first is taken. Mathematically this is the cross-correlation of A,(t) and A2(t), regarding these as complex functions of time; in optics it is the mutual coherence rI2(7). Thus
Notice that when P1 and P2 coincide and T = 0, the mutual coherence reduces to ( ~ ~ ( t ) A ; ( t which )) is simply the intensity. r12(r)may thus be normalized to give the complex degree of coherence
246
Coherence and correlation
Chap. 16
The function ~ ~ ~makes ( 7 )precise the somewhat vague ideas of the previous section. If we regard P 1 as fixed and P2 as exploring the space around it, there is in general a complex number Y12(7)for each position of P 2 and value of T . Most of the interference phenomena we have discussed will be seen to be interpretable in these terms. For example, in the case of Young's slits with narrow slits, the points PI and P2are in the slits, and the various path lengths to different points where the intensity is measured allow T to be varied. If the light illuminating the slits is monochromatic and from a point source, the light at each slit will be completely coherent and Y l 2 ( T ) unity. Fringes of unit visibility result. If a broad monochromatic source is used the light at each slit is only partially coherent, y12(r)is less than unity and so is the visibility. Conversely, if a white point source is used the fringe on axis where T = 0 will be visible, but those off axis where T #0 rapidly decline in visibility. We can explain both these reductions of visibility by the overlapping of fringes, in the case of the broad source from different parts of the source, and in the case of the white light source from different wavelengths in the light from the source. In succeeding sections we shall see how these ideas on coherence have been ' applied in many different cases. 16.3 AUTOCORRELATION AND COHERENCE
The stellar interferometer samples a wave at two separated points across the wavefront, and measures the coherence between the two samples. We can describe the coherence between the two sample points as a function of distance between them. Similarly we can sample the wave at two points along its direction of travel, obtaining the coherence as a function of time interval. The whole area of coherence can be sampled by these two processes. The description of the coherence is expressed by two autocorrelation functions, one in space and the other in time. Autocorrelation in space, i.e. transverse to the wave, is related to the angular distribution of the source; autocorrelation in time, i.e. along the wave, is related to its spectrum. In Section 3.8 we considered autocorrelation in a time-varying quantity A ( t ) . The autocorrelation function is defined as
This was shown to be the Fourier transform of the power spectrum of A ( t ) . The transverse autocorrelation r(x)similarly is the Fourier transform of the angular distribution of brightness across the source. Any interferometer which
Ir
16.4
Aperture synthesis
247
measures coherence along a wave train can find r ( t ) , and hence the spectrum of the wave train; any interferometer which measures coherence transverse to a wavefront can find r(x), and hence the brightness distribution across the source. Fourier transform spectroscopy, as described in Section 15.14, is therefore a process of measuring the autocorrelation along a wave train, using an interferometer such as Michelson's spectral interferometer over a range of path differences. The fringe amplitude is measured as a function of path difference, and a Fourier transformation gives the spectrum. The spectrum can be measured with a resolution which depends only on the maximum delay T between the two beams; the frequency resolution is approximately 1 / ~ . Fourier transform spectroscopy is particularly useful when a very high resolution is required, or when conventional components are ineffective; for example, at infrared and millimetre wavelengths. At radio wavelengths it is also applied in an electrical form. Here a radio spectral line can be analysed by measuring the autocorrelation in an electrical signal, using an electrical circuit to provide the variable delay T.
* 16.4
APERTURE SYNTHESIS
The principle discussed in the last section, where the autocorrelation in time of a signal was Fourier transformed to give the frequency power spectrum, has a striking parallel in the spectacular method of aperture synthesis in radio astronomy. Suppose that the amplitude at a point in the (x, y), plane is A (x, y), and that at another is A (x + X, y + Y). Then, keeping T equal to zero, the twodimensional mutual coherence is
r(X, Y) = (A(x + X, y + Y)A*(x, y))
.
(16.11)
By a similar argument to that in the time-frequency case, the two-dimensional Fourier transform of this turns out to be the two-dimensional distribution of power with angle. Put in simpler terms, this is the distribution of brightness giving rise to the sampled amplitudes. The resolution in angle is of the order of X/X and X/Y in the x and y directions. Now the right-hand side of Eq. (16.11) can be obtained from a radio interferometer of the type shown in Fig. 12.8, so that by varying the spacing and orientation (that is by varying X and Y), a knowledge of r(X, Y) out to some limiting values X ' and Y ' can be obtained. As radio interferometers can successfully be operated with baselines up to some thousands of kilometres in extent it is possible by this method to construct maps of radio brightness with resolutions down to a small fraction of an arcsecond, in fact considerably better than the resolution of the largest optical telescopes. (See Figs. 16.1 and 16.2.)
248
Coherence and correlation
Chap. 16 Sky radio brightness distribution
-
Fig. 16.1. Aperture synthesis in radio astronomy. The interferometer's output is the sky radio brightness distribution multiplied by the interferometer fringe pattern. This will be seen to be one Fourier component of the brightness distribution. By successively extracting the different components, corresponding to different interferometer spacings, maps such as that shown in Fig. 16.2 can be constructed by the computer.
Cygnus A
151 MHz
0
30
60
arc seconds
Fig. 16.2. Aperture synthesis map of the radio galaxy Cygnus A, made with the MERLIN interferometer at a wavelength of 2 metres. The interferometer baselines extend to 134 km, giving an angular resolution of 3 seconds of arc.
*
* 16.5
16.5
The intensity interferometer
249
THE INTENSITY INTERFEROMETER
In Section 12.5 the history of the intensity interferometer was recounted because in optical work it has become the successor to Michelson's stellar interferometer. No explanation could be given there as to why the intensity fluctuations observed at separated positions should correlate; indeed if the source were strictly monochromatic there would be no intensity fluctuations. A full discussion of the intensity interferometer would be out of place here, but it is easy to see why it works in terms of the ideas in the first section of this chapter. Each atomic emitter in the source gives rise to a finite wave train of random phase. We can imagine this multiplicity of spherical waves spreading out from it. At any point P1 in space the amplitude at a particular time depends on how many trains are present and how their phases happen to be arranged. Sometimes favourable interference will take place and the amplitude-and hence the intensity- will go up; sometimes destructive interference will make it go down. In these rather oversimplified terms one can see that intensity fluctuations should exist. Now let us consider whether the fluctuations at another point P2 will be correlated with those at P I . The same wave trains reach P2 as reach P1: the only difference is in their relative phases caused by the different paths they have travelled. The condition for identical fluctuations at P1 and P2 is the same as that 'for interference at PI and P2: the relative phases of the wave trains must be the same, which is to say that the waves are coherent at P1 and P2. The phase condition has already been found in Section 12.4; it is
where $o is the angular width of the source and d is the separation between P1 and P2. The discussion can be put on a quantitative basis in terms of the coherence r. We have seen that the intensities at P1 and P2 are
The intensity interferometer takes the time average of the product of Zl and It,
which is
(1113= ((( A 1 4 ) ) (( ~244;)))
250
Coherence and correlation
0
5
Chap. 16
10
15
20
Baseline d in metres
Fig. 16.3. Hanbury Brown and Twiss' results for the variation of intensity fluctuations with baseline for the star Sirius. The full curve is the variation of r:, with d, calculated from an assumed angular diameter of 0.0069 sec of arc. The optical system consisted of two standard Army searchlight mirrors 1.56 m in diameter and 0.65 m focal length, capable only of focusing the light from a star into an area 8 mm in diameter. A photomultiplier was mounted at the focus of each mirror and the anode currents were multiplied and integrated to give r:,.
The crucial step here is that between the first and second lines. The reader should try to convince himself it is justified. This being done we see that the intensity interferometer measures the modulus of the mutual coherence between P I and P2, but not its phase. A variation of the spacing of P1 and P2 thus allows 1 r121 to be measured over the lateral coherence area of the source. As in the aperture synthesis discussed in the last section the source brightness distribution is then obtained by Fourier transformation of 1 r12 1. The loss of phase of r12 means that it must be assumed that the source is symmetrical, -not a serious limitation in Hanbury Brown's work on stars, as it is fairly certain that stars are symmetrical! The results obtained on Sirius at Jodrell Bank are shown in Fig. 16.3, where the fall-off of y:2 with increasing baseline can be clearly seen. Notice that this account of the intensity interferometer is a purely wave explanation. Reconciliation with quantum theory- which is beyond the scope of this book-is a long story.
CHAPTER
Lasers
17.1 LIGHT AMPLIFICATION
The acronym laser stands for 'light amplification by stimulated emission of radiation', but the device commonly referred to as a laser is not an amplifier of light but an oscillator generating light. It is convenient now to refer to the process of light amplification as laser action, and reserve the term 'laser' for its common usage for the oscillating device. Light may be amplified in its transmission through a medium by the process of stimulated emission, in which energy stored in the medium is released by the light radiation, adding to its strength. This is analogous to electronic amplifier systems, in which energy is stored in a power supply and amplification is analysed without reference to quantum physics. The interaction of radiation with matter is by contrast determined entirely by quantum physics, and we must analyse laser action through the relevant photon processes. Suppose a unit volume of the transmission medium contains atoms or atomic systems which have only two energy levels, N2systems being in the higher level and Nl in the lower. The ratio of the populations when the medium is in equilibrium at temperature T is given by:
where a1 and a2 are the statistical weights of the two levels, and E is their energy separation. For simplicity we consider only a system where al= a2.
252
Lasers
Chap. 17
Enerqy --
A
A
I
- -
Spontaneous CVlr)
Stimulated
hv
hv
hv
E2
I
Absorption
I
v
1
Fig. 17.1. The three ways in which an atomic system can move between two energy levels, upwards by absorption and downwards by either spontaneous or stimulated emission.
Radiation of suitable frequency V, so that E = hv, can interact with the atomic systems in three ways: (i) Radiation can be absorbed by systems with the lower energy. This process occurs at a rate BN,,ov, where ,ow is the radiation energy density, and B is a constant. (ii) Radiation can stimulate the emission of further radiation from systems' with the higher energy. This process occurs at a rate BN2,0v,where B is the same constant as in (i). (iii) Spontaneous emission from systems with higher energy can occur, with no dependence on p,. This occurs at a rate AN2. The coefficients A and B are the Einstein transition probabilities. The balance of radiation energy density is given by the net effect of the three processes:
This is illustrated in Fig. 17.1. Amplification is only concerned with the first two terms; it requires N2>N1 which according to Eq. (17.1) cannot occur under equilibrium conditions at any physically real temperature. The ratio N2/N1at room temperature as given by Eq. (17.1) is of the order of e- loo for optical frequencies, so only a tiny proportion of systems is in the upper state. If by some means N2 can be made greater than Nl the levels are said to be inverted. Such means will always be a special process rather than an increase of temperature, as according to Eq. (17.1) only a negative value of T could make N2>Nl. The injection of energy so as to invert the
17.2
The laser as an oscillator
253
'312
Caesium line
3.2p m Caes~umline
388.8 nm Helium line
Fig. 17.2. An example of pumping by line absorption. The energy levels shown are some of those of the caesium atom. A photon of the 388.8 nm helium line has just the right energy to raise caesium from 6SIl2 to BPIl2, creating inversions between this and the 8SIl2 and 6D3/, levels.
populations of levels is known as 'pumping'; it corresponds to the power supply in an electronic amplifier. The pump ideally fills the higher energy level by selective excitation of atomic systems which are in a different and much lower energy level. Figure 17.2 shows such a system, where the 388.8 nm line of helium happens to be suitable for exciting a specific energy level in caesium. Usually, however, the energy comes from a broadband illumination or from an electrical discharge, and a series of atomic processes may be involved. Amplification by laser action has been observed to occur naturally in some interstellar gas clouds, where the hydroxyl radical OH produces strong line emission at 16 cm wavelength, in the radio spectrum. The populations of the energy levels in the OH radicals are distributed by energy from a hot star, whose ultraviolet light acts as a pump. The normal faint line emission is therefore amplified before it leaves the gas cloud, and it appears with the characteristic high intensity and narrow bandwidth of amplified line radiation. 17.2 THE LASER AS AN OSCILLATOR
Given an amplifier it is usually easy to construct an oscillator; it is simply necessary to return a small portion of the output to the input in the correct phase. If the power gain of the amplifier is g, so that an input power P produces an output power gP, the fraction of power returned in the correct
254
Lasers
Output power
Chap. 17
Feed bock
-1 Resonant circuit (with losses)
A m p l ~~f e r
DC power supply
A
Fig. 17.3. A simple transistor oscillator. All its components and activities have their counterparts in the laser.
phase to the input must be at least l/g. This is referred to by saying that the loop gain must be greater than unity. Oscillations will then occur and grow uncontrollably. In an electronic oscillator this growth will usually be limited by some portion of the circuit saturating. A further essential part of an oscillator is a resonant system that defines the frequency of the oscillations which stores energy and from which both the output of the oscillator and the feedback power are taken. These considerations are illustrated by the simple electronic oscillator of Fig. 17.3. Here all the essential parts of an oscillator are neatly separated, and it is clear how the DC power is turned into output power at the frequency of the resonant circuit. For the oscillation to occur the loss of energy from the resonant circuit due to the output power, the feedback power and its own resistive losses must be made up by the amplifier. All these parts and processes have their analogues in a laser. Stimulated emission has two properties which are of great importance in the laser: it is emitted in the same direction and with the same phase as the incident radiation. It is therefore possible to reflect radiation many times in a closed system, with continuous amplification on the path, and still preserve phase relations as for unamplified radiation. This leads to the use of a resonant cavity to define the wavelength at which amplification occurs. A simplified diagram of a laser is shown in Fig. 17.4. The resonant system is an optical cavity like that of a Fabry-Perot interferometer, but long and thin, with mirrors at each end. In this is contained the 'lasing' medium with a population inversion maintained usually by an electrical discharge in a gas or by optical pumping in a solid. Radiation arising from spontaneous emission
17.3
Optical cavities and their properties
255
Pump power maintaining inversion
t
t
t
L a s ~ n gmedium
/
/
c /
Par t~ally reflect inq mirror
Resonant optical cavity
Wholly ref Iecting mirror
Fig. 17.4. A simplified diagram of a laser. The optical cavity formed by the two mirrors contains the lasing medium in which an inversion is maintained by the pump power. The cavity contains a series of orthogonal standing wave systems, the modes, which are maintained against losses of all kinds by the laser action. The left-hand mirror allows through a small fraction of the power falling on it to form the output beam.
and travelling down the tube will be amplified by laser action. It is reflected by the end mirrors and makes many journeys up and down the tube. One or both of the mirrors is made partly transparent, so that some small fraction of the radiation escapes to provide the output of the laser. Losses occur in the cavity due to absorption in the gas, at the mirrors, and also due to light heing lost sideways from the system. The amplification along the path for a single pass can be measured for many laser systems. For example, a commonly used lasing medium is a mixture of helium and neon, in which a number of population inversions can be maintained by an electric discharge. Amplifications measured for these actions are of the order of 1.5% per metre of path. Hence cavity lengths of the order of a metre and very many reflections are needed for a useful output. It might be thought at first sight that the precise frequency of oscillation of a laser was determined by the frequency of the transition v between the inverted states. If this were the case, laser light would be no better defined in frequency than any other light made by an atomic transition, and a laser would simply be an extraordinarily complicated way to excite a spectral line. For the usual reasons of intrinsic line width, collisions and Doppler broadening, an inverted population allows laser action over a range of frequency, corresponding to the width of a normal spectral line. It is in fact the resonant properties of the optical cavity that give the laser its precisely determined frequency or frequencies.
17.3 OPTICAL CAVITIES AND THEIR PROPERTIES As mentioned above, the optical resonant cavity used in a laser resembles that of a Fabry-Perot interferometer except that its length is great compared with the linear size of its mirrors. As an introduction to cavity resonance, consider the system shown in Fig. 17.5. Then the condition for constructive
256
Lasers
Chap. 17
M~rror
Mirror
Fig. 17.5. Optical cavity with wavefront.
interference to take place between the multiply reflected light over many reflections back and forth is that
2d cos 8 = qX
= qc/v
(17.3)
where q is an integer. What freqencies of oscillation will the cavity support? Clearly one important parameter is q, the number of wavelengths in a double journey. If d = 1 m, and X = 5 x 10-7m then q is about 4 x lo6. The separation in frequency of oscillations differing by one in their q-value is c/2d which is 150 MHz for our 1 metre cavity, corresponding to a change of wavelength of about 10- l 3 m. Each value of q defines a different longitudinal mode of resonance of the cavity. The calculation above did not take variation of 8 into account, that is to say it assumed that light was being reflected back and forth with the wavefronts plane and parallel to the mirrors. Owing to diffraction effects at the mirrors, and scattering, such a situation cannot be maintained and light will always be present travelling in directions not quite along the axis. It is found that due to such effects each resonance is split into a number of transverse (or spatial) modes separated in frequency from each other by amounts small compared with the frequency separation of the resonances. As the modes are primarily generated by diffraction at the mirrors, a computation of their frequencies can only be made by taking into account the shape and size of the mirrors. Calculations of mode frequencies and the corresponding amplitude distribution at the mirrors are usually achieved by an iterative process. An arbitrary amplitude distribution at a mirror surface is assumed and the amplitude distribution resulting at the other mirror is calculated by diffraction theory. The process is repeated back and fort'h till further calculation does
17.3
Optical cavities and their properties
257
Fig. 17.6. Laser modes for rectangular and circular mirrors. Arrows indicate the direction of the electric vector over the cross-section of the emerging beam for several modes. It will be seen that only the TEMw modes represent a plane wavefront emerging parallel to the mirrors; other modes give an inclined or curved wavefront.
not alter the distributions. It is found that a number of discrete solutions corresponding to the modes are possible for each value of q. The patterns of amplitude and phase of the emergent light differ for the different modes, and can be characterized by analogy with waveguide modes by two numbers in addition to q. Several examples of this are shown in Fig. 17.6,both for circular and rectangular mirrors. Here the designation TEM refers to waveguide modes in which the electric and magnetic fields are transverse to the propagation direction. A rough calculation of the frequency shift between modes can be made by noticing that in the TEMlw mode there is a phase shift of 2~ from edge to edge of the mirror in the rectangular case. If the width of the mirror is w we can calculate cos 8 as 1 - @/w)~. ,For w = 1 cm and X = 500 nm this gives cos 8 = 1 - 1.25 x The left-hand side of Eq. (17.3)has been changed by this small amount, so that for the same value of q the frequency must shift proportionately. This gives a shift of 0.75 MHz.
258
Lasers
Chap. 17 Spectral line
(I-2
(I-
1
(I
(I+1
(I +
2
Fig. 17.7. Longitudinal resonances, each consisting of a group of transverse modes, shown against frequency. Behind is the wide spectral line over which laser action is possible. Usually several resonances fall within it. A resonance (shown dotted) of a shorter cavity is sometimes used to select one group of modes; its other resonances will fall well outside the spectral line.
So the various modes are separated by an amount small compared with the frequency differences of 150MHz between the longitudinal resonances. More complicated cavities which actual lasers often use in preference to the simp'le Fabry-Perot cavities discussed here sometimes have modes separated by only a few hundred Hz. This can be demonstrated by allowing laser light with several modes present to fall on a photoelectric intensity detector. The squaring effect of the detector causes beats between the modes which are in the audio range and can be heard with a suitable amplifier and speaker system. As remarked above, the population inversion allows laser action over a wide range of frequency corresponding to the spectral line width for the transition. There are therefore usually several resonances, each with its varying modes present in the linewidth over which laser action is possible. Not all resonances and modes will be supported, but the resultant light from the laser may well be quite a mixture. However, it will be noted from Fig. 17.6 that only the TEM,, mode consists of uniform plane waves which will all pass through the focal region of a good lens. These modes can hence be sorted out by a pinhole aperture in a screen at the focal plane of a lens following the laser, the other modes not passing through the fine hole. This device is termed a spatial filter. Unwanted longitudinal resonances can be removed
17.4
Pulsed lasers
259
by passing light through a much shorter resonant cavity which has only one resonance within the bandwidth of laser action. Figure 17.7 shows diagrammatically the resonances and modes with the broad spectral line over which laser action is possible superimposed. Also shown dotted is the resonance curve of a shorter cavity used to select one resonance. With all precautions taken to select only one mode of one resonance, the bandwidth of the resultant laser light is almost unbelievably small. Light from the best single line low-pressure gas lamp is about 1000 MHz wide in frequency. The laser light on the other hand has a bandwidth of a few tens of hertz. In fact mechanical vibration of the mirrors is one of the limitations in the purity of laser output. The extreme frequency purity of a few parts in 1014 should not be confised with knowing the absolute frequency (and hence the wavelength) to the same accuracy. As the frequency is determined by the dimensions of the cavity it varies as these dimensions do. Hence a laser is not useful as a standard of length. 17.4 PULSED LASERS
Lasers may be made to oscillate continuously, or their output may be confined to very short pulses of high intensity. Pulsed lasers have many applications; for example, they are used in range-finding, and they are used in the study of short-lived chemical processes. Short laser pulses can be produced by any switching process which rapidly changes the resonant cavity of a laser. In an early example, one of the laser mirrors was rotated rapidly, so that the resonance in the cavity was destroyed except for a very short time. By analogy with electrical circuits, the quality of the resonator is measured as a factor Q; the process is therefore referred to as Q-switching. The switch is now usually an electro-optic switch such as a Pockels cell or a Kerr cell (Chapter 20). While the laser medium is being excited, one mirror is obscured by the optical switch; the switch is opened at the time of maximum population inversion, and a large pulse is therefore produced as soon as laser action starts. Laser amplifiers acting on these short pulses can not only increase the pulse power: they may also shorten the pulses through non-linear effects. The shortest pulses from such amplifiers are about one femtosecond (femto = 10-15); they contain only a few oscillations, and are only about one micrometre (micron) long. The measurement of such short pulse lengths poses special problems, since no conventional detector has a short enough response time. It is achieved by splitting and recombining the laser beam with a variable delay, so as to measure the autocorrelation along the length of the pulse. Coherence between the two beams is only observed when the delay is less than the pulse length;
260
Lasers
Chap. 17
it is then detectable because the electric field becomes strong enough to generate second harmonic light in a non-linear medium (Section 17.10). 17.5 MODE LOCKING A laser with a high level of excitation may oscillate simultaneously in several longitudinal modes. We noted in Section 17.3 that these are separated in frequency by Av = c/2d, i.e. typically by some hundreds of megahertz. The modes then beat together, but with random phase relation. The laser light amplitude therefore fluctuates irregularly, with a typical fluctuation time 7 determined by the total frequency spread of the modes. If all modes can be momentarily in phase, they will stay in phase for a time T , giving a sharp pulse. This is the process of mode locking, which is used for making laser pulses of very short duration. Mode locking is achieved by switching the gain of the laser periodically at the mode difference frequency Av; the switch is an electro-optic device as in the simple Q-switched laser. The shortest pulses are produced by lasers with the widest frequency spread of longitudinal modes, i.e. those involving atomic transitions with a wide bandwidth, as shown in Fig. 17.7. For example, a mode-locked neodymium-glass laser produces pulses less than 1 picosecond (pico = 10- 12) long. 17.6 GLASS, GAS AND LIQUID LASERS
The essential ingredients of a laser, i.e. stimulated emission, a pump, and
Ground stote
Ground state
Fig. 17.8. Energy levels in laser materials. (a) Ruby laser, showing absorption A into two bands, and the laser transition L; (b) a more efficient level scheme in which the laser transition does not reach the ground state.
17.6
Glass, gas and liquid lasers
26 1
Collisionol
Fig. 17.9. Vibrational energy levels in the CO, laser. a resonator, can be realized in many different ways. We briefly describe three, the ruby laser, the carbon dioxide laser, and the dye laser. The ruby laser is in the form of a transparent rod, with mirrors formed directly on the ends. The mirrors are usually concave, which eases the problem of precise alignment. Ruby is a crystal of aluminium oxide; the active ion is, however, the chromium ion Cr3+,which replaces about 0.05% of aluminium in the crystal lattice. Some energy bands of Cr3 are widened by interaction with the lattice, and pumping is easily achieved in white light, usually from a discharge tube. The laser transition itself remains narrow. Ruby lasers produce pulsed radiation at a wavelength of 694.3 nm. The ruby laser transition is to the ground state (Fig. 17.8(a)), which is naturally always populated. More than half of the ions must therefore be excited before the population inversion is achieved. Lasers in which the transition is to a level well above the ground state are more efficient. Figure 17.8(b) shows the energy levels that are then required: this is the arrangement for example in the Nd-YAG solid-state laser, which consists of Nd3 ions in yttrium aluminium garnet. The C02gas laser involves four vibrational energy levels, as in the scheme of Fig. 17.8(b); the broad highest level is, however, an excited level in nitrogen, which is an essential added gas component. The upper level of the C 0 2 is populated from this state by collisions (Fig. 17.9). The gas also contains helium, which assists the depletion of the lower levels by collisional de-excitation. A similar scheme is used in the helium-neon laser, in which excited helium atoms collide with neon atoms and raise them to an upper energy level. The laser action is due to the neon atoms falling to a lower level, and not to the helium. The four-level scheme of the C 0 2 laser allows efficient operation, especially at the 10.6 pm wavelength. Large power outputs, up to some tens +
+
262
Lasers
Chap. 17
as flow
as 'flow
Fig. 17.10. CO, laser. Electrodes in the side tubes S,, S, sustain a discharge in the laser tube. Windows W,, W, are set at the Brewster angle; the mirrors M,, M, are outside the laser tube.
Wavelength (nm)
Fig. 17.11. Absorption (A) and emission (E) spectra of a typical organic molecule used in a dye laser.
of kilowatts, are obtainable; as the gas densities are comparatively low, highpowered C 0 2 lasers must be very large to contain a sufficient number of molecules. The excitation is by electron collisions, and the electrons'are produced in an electric discharge within the laser tube. Figure 17.10 shows the arrangement of the discharge tube, with external mirrors. The windows ot the discharge tube are set at the Brewster angle (Section 4.8) so that light of one linear polarization passes through with minimum loss. In a dye loser the active ingredient is a solution of an organic dye, which has the property of fluorescence. Figure 17.1 1 shows the typical absorption spectrum, with the fluorescent emitted light at longer wavelength. Laser action can occur over the wide band of this fluorescent light, and these lasers are often referred to as tunable dye lasers. Typically a single dye may be tuned over a 5-10% range of wavelengths, and a series of dyes is available covering the range from ultraviolet to infrared. A very short path is sufficient in the
17.7
Semiconductor lasers
263
liquid; to avoid overheating and to maintain optical uniformity the liquid may be flowing across the optical beam. The dye laser may be operated continuously or pulsed. 17.7 SEMICONDUCTOR LASERS
Semiconductor junctions which emit light are known as light emitting diodes (LED). Although the energy bands involved are wide compared with those of ion or molecular lasers, pumping is easily and efficiently achieved by a DC electrical input. Stimulated emission, producing laser light, is achieved by enclosing the emitting region in a simple cavity, consisting of reflecting layers of semiconductor either side of the diode gap, and reflection at the ends of the gap.
N- region
I
Junction
I
P - region
Fig. 17.12. Energy levels in a semiconductor laser. A Fermi distribution of electron energies extends into the junction from the N-region. Electrons fall into levels near the top of the valence band, where there are holes extending from the P-region. Pumping is here achieved by direct electrical energy.
Figure 17.12 shows energy levels in the junction region of a semiconductor diode. On one side of the junction, in the N region, there is an excess of electrons in the conduction energy band; on the other side, in the P region, there is an excess of electrons in the valence band, which has lower energy. In the transition region electrons can fall from filled states of the conduction band to empty states of the valence band, emitting energy in the form of a photon. This occurs particularly in a class of semiconductors known as 'direct-gap' semiconductors, in which the transition is not coupled to vibrations of the crystal lattice. The electrons can be provided by an electrical circuit, which feeds electrons into the N region. The resonant cavity is
264
Lasers
Chap. 17
provided, as for the ruby laser, by polishing opposite faces of the semiconductor crystal. Semiconductor lasers made of gallium arsenide give infrared light at about 900 nm. A mixed compound, gallium arsenide phosphide, gives visible light, down to 600 nm wavelength. The extreme simplicity of these lasers, which are small and efficient, makes them very attractive. The light output can easily be modulated by varying the applied voltage. The main useful features of the semiconductor laser are as follows: (i) Small size; the largest dimension is less than 1 millimetre. (ii) High efficiency and simple operation from low-power electric circuits. (iii) Easy modulation up to frequencies of order 10 GHz (1 gigahertz = lo9 HZ.) (iv) Easy integration with glass fibres for communications, and with optoelectronic circuits. (v) Suitability for mass production. Semiconductor lasers are therefore widely used for communications, usually through glass fibres. The conventional communications networks of coaxial cables and microwave links are rapidly being replaced by fibre optical systems.
Fig. 17.13. Internal reflection in a light guide.
17.8 FIBRE OPTIC COMMUNICATIONS A small-diameter glass cylinder, or a bundle of glass fibres, may be used
as a light guide. Light rays are confined to the glass by total internal reflection at glancing angles of incidence, even when the cylinder is bent (Fig. 17.13). The development of silica glass with very low absorption coefficient for light, particularly at infrared wavelengths, led to the use of glass fibres for communications. Semiconductor lasers are used as transmitters, and semiconductor photodiodes as receivers, allowing pulse rates up to 500 megabits
17.9
Properties of laser light
265
per second. The absorption is due to scattering in irregularities within the glass, and to resonant absorption in impurities. Scattering follows Rayleigh's law, and falls as the fourth power of the wavelength; this favours infrared wavelengths in the region of 1.5 pm near the limit of the semiconductor techniques. An absorption resonance at 1.4 pm due to hydroxyl ions limits the choice of wavelengths to the regions of 1.3 and 1.5 pm. Here the attenuation of a fibre can be astonishingly small: the minimum possible is only 0.2 decibels per kilometre; i.e. the intensity only falls by a factor of 10 in a distance of 50 kilometres. The transmission of short pulses, of order 10 cm long, over such large distances requires a very small velocity dispersion. From the ray standpoint of Fig. 17.13 the more oblique rays will travel slowest, and it would be advantageous to use only axial rays. In very thin fibres the ray concept is inapplicable; wave propagation occurs in a set of modes, each of which corresponds roughly to rays at various obliquities. A single-mode fibre, carrying only an axial ray, is necessarily only a few wavelengths in diameter. Part of the wave is carried immediately outside the fibre, and it is necessary to embed the fibre in a sheath of glass with a lower refractive index. Even with single-mode propagation there are two further effects which tend to disperse and lengthen the optical pulses. The wave mode itself is dispersive, and the group velocity varies with wavelength. Second, the material refractive index is dispersive, and the dispersion varies with wavelength. Fortunately it is possible to choose a wavelength where these two effects are of opposite sign, and a very low dispersion of group velocity can be obtained over a wide bandwidth. Pulses only lOcm long can be preserved over a 50 kilometre link between repeater stations. 17.9 PROPERTIES OF LASER LIGHT
The coherence of laser light leads to some remarkable properties which are not evident in naturally occurring spectral line radiation. It is, for example, easy to produce a very narrow collimated beam of laser light. A typical singlemode He-Ne laser produces a coherent beam about 3 mm across at a wavelength of 633 nm; the diffraction width of this beam is easily seen to be about 0.7 minutes of arc. The same beam can be expanded, essentially by using a telescope in reverse (see Fig. 8.7); a 10cm beam then gives a diffraction width of 1; arcseconds, provided that the telescope optics are of sufficiently high quality. The brightness of laser light can be astonishingly high in comparison with incoherent sources. The Sun, for example,. has a brightness of 1.3 x lo6 watts per square metre per steradian per unit wavelength (Wm - sr - 'Ah- ). Coherence both reduces the solid angle and the wavelength range of laser light. A simple He-Ne laser has a brightness of 101° Wm-2sr-1Ah-1.
266
Lasers
Chap. 17
The very high brightness allows a very high energy concentration in a focal spot; lasers can therefore be used for burning or cutting small holes in materials, including thick metal sheet. Laser light is immediately recognizable to the unaided eye because of the phenomenon of speckle. A surface illuminated by a laser is covered with bright speckles, which rapidly change their pattern as the eye moves sideways, and which change in size as the eye distance from the surface is changed. This is due to the coherence of the light over an area of the object corresponding to the resolution of the eye: within this area there are many coherent sources which add randomly according to the roughness of the surface. 17.10 HARMONIC GENERATION AND HETERODYNING In the propagation theory of Chapter 1 we tacitly assumed that dielectric constant E, or refractive index n = E%,were indeed constants for any material. The dielectric constant depends on the polarizability of the material, and we were effectively assuming that the polarization P varies linearly with electric field E. With the large values of E available from lasers this is no longer a valid assumption, and we must add higher-order terms to express the non-linearity:
The effect of these terms is easily seen by putting E = Eo exp ( i d ) , when it is obvious that the coefficient a2 gives rise to polarization varying at double frequency, so that a second harmonic appears in the wave. The coefficient a3 similarly gives a third harmonic, and so on. If two waves at frequencies oland w2 are added, the non-linear term will produce waves at ol - a2and wl + u2; this is familiar in radio receivers as a mixing or heterodyning process. The generation of a second harmonic does not occur in symmetrical crystal structures, since q must then be zero. Crystals such as potassium dihydrogen phosphate (KDP) or quartz, which do not possess inversion symmetry, are therefore used for frequency doubling or for heterodyne generation of sum and difference frequencies. The efficient generation of a second harmonic requires an appreciable path length through a non-linear crystal. It is then important that the phase velocity of the two waves should be identical, so that the second harmonic generated at different points along the path should add in phase. This can be achieved in a narrowly defined range of propagation angles through a crystal. 17.11 HOMODYNE DETECTION OF LASER LIGHT
Beats between two independent lasers can be detected without the non-linear
17.11
Homodyne detection of laser light
267
comblner
Output
s~gnal
Laser local osc~l lator
Fig. 17.14. Homodyne detector of laser light. M
Gas laser tubes
Fig. 17.15. A ring laser. Two modes of oscillation correspond to the two directions of circulation. Any difference in frequency will be detected as a beat frequency in the detector output.
effects of the last section, provided that the detector system can respond sufficiently rapidly. The simple 'homodyne' system of Fig. 17.14 allows beats to be detected as a noise-like signal, due to the randomness of the phases between the two lasers. It acts as a very narrow-bandwidth detector of the weaker of the two laser signals, and it can therefore act as a very sensitive detector. If the two laser beams in Fig. 17.14 were derived from the same laser there would, of course, be no such noise fluctuation. An intermediate situation
268
Lasers
Chap. 17
is depicted in Fig. 17.15, where the same laser excites two modes, each corresponding to one hand of circulation round the ring. Most of the instabilities in the laser are common to the two modes, and the phase noise is small. As in the Sagnac interferometer (Fig. 15.6) the ring laser can be used to detect rotation; it is at the heart of a modern optical form of the gyroscope. The two-laser system of Fig. 17.14 can also be used in a sensitive version of the Michelson-Morley experiment. If the two lasers are mounted on a turntable, rotation through a right angle should show any effect due to motion through the aether. As expected, the result is a null, in conformity with special relativity.
CHAPTER
Holography
18.1 INTRODUCTION
If a scene is viewed through an aperture large compared with the wavelength it is seen three-dimensionally and is completely lifelike. The only difference from viewing the scene directly is that only those parts can be seen which are not obscured by the screen containing the aperture. We commonly overcome this difficulty by altering our viewpoint: we approach a window and look up through it to see an object in the sky for example. As the viewpoint alters, objects in the scene show parallactic displacements relative to each other; if we move from left to right nearby objects seem to move from right to left compared with distant ones. These displacements are an important factor in distance estimation. Another effect is that of being able to focus the eye on a particular object at a particular depth. This focusing action can be duplicated by a camera, which may (according to the f / D ratio used) have a small or great depth of focus. How different is this view through an aperture from a photograph of the same view! The photograph is neither lifelike nor three-dimensional. No parallactic displacements of objects within it may be seen by a shift of viewpoint. The eye must be focused on the plane of the photograph to see it, and no eye focusing can make sharp any part of the photograph not originally brought into focus in the camera. The above remarks seem trivially obvious, yet clearly in the photograph a great deal of information has been lost. According to the Huygens-Fresnel principle (and more precisely according to Kirchhoff's diffraction theory),
270
Holography
Chap. 18
the amplitude and phase on the viewer's side of the screen can be deduced at every point from the amplitude and phase in the aperture. It is by altering where the eye interacts with this complex amplitude on the viewer's side (viewpoint) and how it interacts (focus) that all the qualities lending verisimilitude to the scene as opposed to the photograph arise. Perhaps in recording the photograph, which superficially resembles the view, we are recording the wrong thing. The complex amplitude in the aperture is the true paradigm that would enable us to reconstruct, completely lifelike and indistinguishable, the view recorded. This is the basic idea of holography. By recording the complex amplitude over an aperture we should at least in principle store all the information necessary to reconstruct a complex amplitude distribution throughout the volume on the viewer's side that would be indistinguishable from the real thing . This quite revolutionary idea was first suggested by Gabor in 1948 in connection with work on electron microscopy. He proposed and demonstrated a method of actually recording the amplitude and phase to produce a hologram, from which he was able to reconstruct the original scene, which showed the qualities of parallax and three-dimensionality mentioned above. Gabor's method was somewhat ahead of the technology of his time: it depended upon the illumination of the scene with coherent light. No strong source of such light was available until the advent of the laser in 1960. This made the holographic technique, previously only of scientific interest, into a practical proposition, and great research and development efforts were and are being put into its development. In the following sections we shall sketch out the principles of some of the holographic systems.
18.2 GABOR'S ORIGINAL METHOD The system originally used by Gabor is shown in Fig. 18.1. Light is brought to a point focus which is stopped down sufficiently to make the emergent beam, of semiangle a,coherent. The object, which must be a transparency small compared with the beam cross-section, is placed close to the focus in the diverging beam. Far away compared with the distance from the focus to the object is placed a photographic plate, so that from the viewpoint of geometrical optics the object would cast a shadow of itself on the photographic plate. What actually happens is that at each point of the plate interference occurs between the undisturbed amplitude and the transmission diffraction field of the object. As the undisturbed amplitude is arranged to be much stronger than the diffracted amplitude, the photographic plate is darkened more where they are in phase and less where they are in antiphase. When developed the plates carries on it a crude representation on a uniform
18.2
Gabor's original method
27 1
Photographic plate
Eye
observing Hologram
Fig. 18.1. Cabor's original system of making a hologram and reconstructing the image. background of the amplitude of the diffracted wave, positive amplitude being represented by dark, and negative by less dark. By using the undiffracted wave as a carrier, on which the diffracted wave produces a modulation, the phase of the diffracted wave is partially recorded. The plate is printed to form a positive transparency called the hologram, and placed in the original plate position in the apparatus, the object having been removed. On looking through the hologram t o where the object was, a reconstructed three-dimensional image is seen. In fact if a microscope is used for viewing, its focal plane can be racked through different parts of the holographic image, demonstrating its true three-dimensionality. Another interesting property is revealed by movement of the microscope's focal plane. That is the existence of a second image of the original object
272
Holography
Chap. 18
located behind the focus. This is inverted and has the property that each point on it is the same distance from the focus as the corresponding point on the first image. This relationship is unique to holographic systems and the second image is said to be pseudoscopic. The action of this original holographic system is most easily understood by considering the relationship between the light scattered by the object and the undisturbed light. The light scattered by the object to H is much weaker than that from F directly and is of different phase because the paths traversed are of different length. The resultant of all the scattered light at H from the whole object will be small compared with the amplitude of the unscattered light. The sum will be larger or smaller than the amplitude of the unscattered light, depending on the phase of the scattered light. The amplitude therefore contains a partial record of the phase, and the intensity on the photographic plate is a good representation of the scattered wavefront. When the object is removed and the photographic plate is replaced by the hologram, the magnitude variations are impressed on the wavefront by the hologram, and to the right of the hologram the wavefront is a tolerably good reproduction of the original. Hence looking through the hologram much the same scene appears as when the object was present, but with the addition of the pseudoscopic image behind the focus. We can now see how this extra image arises. Consider the point P ' behind the focus. Then if light were scattered from P ' it would arrive at H with the same phase ahead of that from F as light scattered from P arrives behind that from F . The magnitude of the resultant amplitude at H is therefore the same whether light is scattered from P or P ' ; only the phase is different, and it is just this phase information that is lost in the photographic process. Hence the view through the hologram shows images in both the positions that could have given rise to it. With hindsight one could paraphrase Gabor's method as follows. Faced with the necessity of recording both amplitude and phase, but possessing a device- the photographic plate- that would only record the intensity, he devised an optical system in which at the position of the photographic plate almost all the information resided in the magnitude of the amplitude, and little in its phase. The method succeeded, but has the disadvantage of showing the reconstruction always with the extra image behind it, and illuminated from the rear by a point source. 18.3 A MORE SOPHISTICATED SYSTEM: CARRIER FRINGES
The problem of recording both amplitude and phase can be solved, if strong coherent plane waves are available, by the method of carrierfringes. Several optical systems can be used to do this, but basically the idea is to generate fringes in intensity which by their phase (transverse position) record phase,
18.3
A morc sophisticatcd systcm: carricr fringcs Plane wave front
Prism
w
-
Object
y<
Photographic plate forming hologram
Fig. 18.2. One of several possible systems for making a hologram using carrier fringes. Both the amplitude and phase information of the original wavefront are preserved in the amplitude and phase of the fringes. and by their visibility record amplitude. The system illustrated in Fig. 18.2 uses a prism to provide a reference beam. (i) The recordingprocess. Suppose part of a plane wavefront is scattered by the object, while another part is deflected through angle 9 by a prism and thus made to interfere with the scattered light at the photographic plate. The amplitude A. of this reference field is constant, and the phase varies linearly with x, so that b,,f=2irxsin9/h, which we write as bref=ax.The total amplitude at the plate is
A (x) = A. exp (iax) + A (x) exp (id (x))
(18.1)
and
A (x) = A;
+ A ( x ) ~+ 2 A. A (x) cos ( a x + 4 (x)) .
(1 8.2)
274
Holography
Chap. 18
Notice that phase has not been lost. The third term of the expression for I(x) represents a fringe system recorded on the photographic plate, with a fringe spacing 2?r/a.These 'carrier' fringes have an amplitude proportional to A(x) and a phase, +(x). They thus contain all the information in the original wavefront. (The photographic plate in general records some power, g, of I(x) but if A0&A this can be shown by a binomial expansion of the expression for (I(x))g to be unimportant.) (ii) Reconstruction. If the hologram recorded as above is illuminated with parallel coherent light, a wavefront is formed of constant phase but with amplitude AR varying as Pl(x), where fl is a constant transmission factor.
Since A(x) is always small compared with A. the first two terms give a beam in the same direction as the illuminating wavefront, with only a slight diffraction due to A(x). The third term may be written
This is the original wavefront A(x)exp(i+(x))multiplied by Aoexp( + iaxj and its complex conjugate (remembering A(x) is real) multiplied by A. exp ( - ix). These terms have phase shifts linear with x but of opposite sign, and cause two beams to emerge at angles 8 to the axis of symmetry. The first is the reconstructed wavefront, making an off-axis virtual image of the original object, and the second is the complex conjugate of the reconstructed wavefront and forms a real image of the object also off axis, and pseudoscopic (Fig. 18.3). The use of carrier fringes has here overcome several of the disadvantages of the original method. One does not have to look straight into the light source, and it is easy to avoid seeing the unwanted image. The hologram with its phase and amplitude modulated fringes is rather like a diffraction grating, with the slightly diffracted beam the equivalent of the zero order, and the wavefronts forming the two images equivalent to the two first diffraction orders.
+
18.4 ASPECT EFFECTS
When the reconstructed object is viewed through the hologram the edges of the hologram act rather like a window frame. Within the limits set by it movement of viewpoint changes the aspect of the reconstructed scene, so that if it is three dimensional one can see more round to the right by moving the head to the left. Similarly parallactic displacements may be observed. A consideration of these aspect effects brings out another interesting property. From a given direction of observation light reaching the eye from
18.4
+
Aspect effects
275
+
Plane wavefront
Reconstructed image
Hologram
Pseudoscopic real image
x
Sl~ghtly diffracted beam
Reconstructed wavef ront
Fig. 18.3. Reconstruction of the hologram produced as in Fig. 18.2. The two images and the undeviated beam are now separated in angle.
the image only comes from a small portion of the hologram determined by the position of the eye and the angle subtended by the object as shown in Fig. 18.4. Evidently if all the rest were removed but this piece the object could still be seen, but only from that aspect. Thus if a hologram is broken into fragments, the reconstructed object can still be seen through each fragment, as seen from the appropriate aspect. This is rather like looking through a window that is completely obscured except for a small hole. The view is still to be seen, but only from the position allowed by the hole. An amusing aspect of these properties is the recording of several quite different pictures on one hologram. Supposing a number of photographic
276
Holography
Chap. 18
Reconstructed object
Hologram
Fig. 18.4. To see a reconstructed object from one aspect, only a small portion of a hologram is used. The same portion viewed from another angle allows a different reconstruction (or a different part of the same one) to be seen.
transparencies are arranged so that during the recording process none obscures any other as seen from any point of the photographic plate. Then looking through the resultant hologram, or any part of it, the different pictures may all be seen. 18.5 THREE-DIMENSIONAL HOLOGRAMS -COLOUR HOLOGRAPHY A disadvantage of the methods described above is that both the recording
and reconstruction processes demand laser light and produce a monochromatic image. Reconstructions seen through such holograms are in bright red, or whatever monochromatic laser light is used; the threedimensional and aspect effects have been gained at the expense of unreality of colour. If white light is used to illuminate such a hologram no reconstruction is seen at all, as an infinite number of overlapping and different sized images are produced by each wavelength present. Curiously enough the modern key to making holograms that can be viewed in white light and produce naturally coloured images is to be found in the historic work of Lippmann in 1891 (Section 2.9) and of Bragg in 1912 (Section 11.9). It will be recalled that Lippmann demonstrated the real existence of standing waves close to a reflecting surface by showing that a thick photographic emulsion on top of a mirror was darkened in layers
18.6
Holography of moving objects
277
corresponding to maxima of the interference pattern between the direct and reflected waves. Such a plate serves as a selective reflector of light of the wavelength in which it was made, for only at that wavelength do the reflections from the different layers in depth add constructively. This effect was the basis of Lippmann's process of colour photography, historically the first. Bragg's work on crystals shows how a three-dimensional structure can single out a particular direction or directions and send radiation to it selectively. The holograms we have considered so far have been essentially flat twodimensional patterns. To make a three-dimensional hologram with sufficient structure in depth two main methods are used. Firstly a thick emulsion (up to a few millimetres) which is transparent except where it has been blackened by an interference maximum during exposure is used for making the hologram. Secondly the angle between the reference beam and the scattered light from the object is made large. In the most extreme case of this the angle is made almost 180"' so that the scattered light and the reference beam arrive at the photographic plate from opposite sides and interference structure of the order of X/2 in depth is produced. Such holograms when illuminated by diffused white light from say a tungsten filament lamp will only transmit light of the right colour which is going in the appropriate direction. The final step to full colour holograms is to illuminate the object with red, green and blue light from three separate lasers, three corresponding reference beams being used. When illuminated with an ordinary white light source this hologram produces a passably convincing 3-D coloured reconstruction. 18.6 HOLOGRAPHY OF MOVING OBJECTS Here we come to one of holography's greatest difficulties. The process of forming the hologram depends on the phase relationship between the reference beam and the scattered light from the object remaining constant during the exposure to a few degrees. Clearly the object must not move more than a fraction of a wavelength. Be clear that movement will not just cause blurring of the reconstruction; there will be nothing to reconstruct. A consideration of a simple case is helpful. Suppose we were making a diffraction grating by allowing two coherent beams of light to meet at an angle and interfere at a photographic plate. Then obviously a movement of a wavelength or so of the source of one beam would move the interference fringes on the photographic plate so we would get no grating at all. The grating formed here is of course a simple case of a hologram. Happily lasers which make the light for holographic work can produce quite incredibly short pulses. It is instructive to calculate how short an exposure is needed. If we take it that for human scenes we need to record
278
Holography
Chap. 18
objects moving at up to 10 m/s the exposure must be so short that movement of only X/10 (say) happens in that time. If X = 5 x lo-' m the exposure can last only for 5 x sec= 5 ns. Five nanoseconds is a short time by any standards: light itself only moves 1.5 m in that time. A laser pulse can however be as short as one nanosecond, and repetitive pulses are easily obtained. The three-dimensional cinematograph is no longer an impossible dream.
CHAPTER
Radiation by accelerated charges
19.1 THE ELECTROMAGNETIC FIELD
In this chapter we illustrate the application of some of the principles of diffraction and coherence to the radiation produced by an accelerated charge. We will not derive the basic equation for the radiation from a single charge; we will discuss it briefly and then apply it to three classical situations in which (i) an electron is accelerating in simple harmonic motion (the Hertzian dipole), (ii) it moves with a velocity which is uniform but higher than the local phase velocity (Cerenkov radiation), and (iii) it is moving at relativistic velocities and accelerated by a magnetic field (synchrotron radiation). In classical electrodynamic theory the electric field generated by a single charge has three components. The first is the electrostatic field, whose strength varies as the inverse square of the distance from the charge. A uniformly moving electron generates a second field component which is proportional to its velocity; this field also falls away as the square of the distance from the charge. The third component of the field is only present when the charge is accelerated. This is the radiation field, which falls away only as the first power of the distance. The strength of the radiation field at a point some way from the accelerating charge depends on the component of the acceleration perpendicular to the line of sight. Since radiation reaches the field point after a finite travel time, the acceleration must be measured at the time the
Radiation by accelerated charges
Chap. 19
point P
Fig. 19.1. Radiation from an electron. The radiated electric field at the field point P at a time t depends on the acceleration of the electron at time t ' = t - r/c.
radiation leaves the charge rather than when it arrives (Fig. 19.1). The field is given by
E= -
3oc2r
(acceleration)
where by (acceleration) we mean: (i) the component of acceleration projected on to a plane perpendicular to the line of sight from the field point to the charge (ii) the acceleration is measured at time r/c ' earlier, where r is the distance from the charge and c' is the phase velocity of the radiated wave. The radiated electric field is therefore always transverse to the line of sight. The magnetic field B is also transverse; it is perpendicular to the electric field E. 19.2 THE HERTZIAN DIPOLE
A dipole is a pair of equal and opposite charges separated by some distance. The product of charge and distance is the dipole moment. A dipole whose moment oscillates sinusoidally, and whose dimensions are small compared with the corresponding radiated wavelength, is known as a Hertzian dipole. It radiates in the same way as a single oscillating charge, since the two opposite charges oscillate with opposite accelerations and their radiated fields must therefore add. We therefore consider only one charge q, moving linearly according to
19.3
Cerenkov radiation
28 1
Fig. 19.2. Radiated field from a Hertzian dipole. (a) The field at P is aligned along the surface of the sphere, with strength proportional to sin 8/r. (b)The variation of field strength with 8 can be represented as a 'polar diagram'.
The component of acceleration as seen from the field point P at a distance r in free space, and in a direction 8 from the axis of the dipole (Fig. 19.2), is then easily found from Eq. (19. I), giving a field:
E = - wLqxo sin 8 exp iw(t - r/c) eOc2r
.
(19.3)
The variation with 8 is known as the 'radiation pattern' or 'polar diagram'. This is shown in Fig. 19.2(b); note that the radiation pattern of a radio antenna is often plotted as a variation of power (E2) rather than field (E). The term r/c in the oscillatory function is already familiar as a phase variation with distance, while the amplitude variation with r has been discussed in Fresnel diffraction theory (Chapter 10). The total power W radiated by the dipole is found by integrating over a sphere the power flux E 2 / z= E ~ E ~giving: C,
The radiation power W depends on the square of the dipole moment, and on the fourth power of the frequency. 19.3 CERENKOV RADIATION It is easy for a boat to travel faster than the waves it generates, and it is commonplace for an aircraft to travel faster than the speed of sound. In both
282
Radiation by accelerated charges
Chap. 19
cases a sharp wavefront travels out in a V-shape, at an angle whose sine is the ratio between the group velocity of the waves and the velocity of the vehicle. By analogy, it might be expected that electromagnetic waves can be generated by a charge moving faster than the wave velocity c/n. Light emitted from an electron moving with a relativistic velocity in a medium with n > 1 was first observed by Cerenkov, and is named after him. We note first that the electron itself has no acceleration which can account for the radiation. It is in fact possible to analyse the radiation as generated by the electron, by expressing the field in terms of the electron charge, its velocity and its acceleration, at a suitable retarded time (t - nr/c), but we prefer to regard the electron as exciting the dielectric medium as it progresses, and to consider the dielectric as the source of the radiation (Fig. 19.3). The dielectric becomes polarized, the electron attracting positive charge and repelling negative, so that as the electron passes each part of the dielectric acts like a small dipole giving only one impulsive oscillation. The radiation from all impulses along the track adds in phase where the Huygens' wavelets coincide. What is the angle a between the Cerenkov ray and the electron track? Consider a single wavelength component of the impulsive radiation from a point in the medium. The track is now a one-dimensional array of radiators, phased progressively as 2~ nvx (b(x)=-X
c
where v is the velocity of the electron. By analogy with the diffraction grating (Section 11.3) we know that this radiates a cone at angle a = cos- l (clnv), provided that n/c is the phase velocity of the wave. There is a small paradox to resolve here, since the Huygens' wavelets of Fig 19.3 obviously travel at the group velocity, leaving us uncertain about the true value of a. The answer is simply that a true impulse only continues without change of form if the medium is nondispersive, when the two refractive indices coincide. The correct value for a when the medium is dispersive is obtained from the group velocity in the range of wavelengths which is being observed. The high-energy particles of a cosmic ray shower generate a flash of Cerenkov light, which shines within a few degrees of the direction of the shower axis. Light generated by high-energy particles as they pass through liquids or gases can be used as a sensitive detector for cosmic ray showers, or for particles from accelerators. When the Cerenkov angle a can be measured it gives a direct and simple measurement of the velocity of highenergy particles.
19.3
Cerenkov radiation
283
(a) Shock wave from a bullet, forming a cone with semi-angle 8.
Cerenkov angle a = cos' =/(nu)
Uc-
(b) Cerenkov wave from electron in medium with refractive index n. Huygens' wavelets form the wavefront.
Fig. 19.3. Comparison of shock wave in air with Cerenkov radiation from an electron.
284
Radiation by accelerated charges
Chap. 19
19.4 SYNCHROTRON RADIATION
Electrons may be accelerated by a variety of forces, such as the electric fields in a radio transmitter or the electrostatic field near ions in an ionized gas. High-energy electrons and protons in synchrotrons are continuously accelerated radially by the magnetic field which confines them to a circular orbit. This acceleration, as for any other acceleration, produces radiation, and the power radiated is a serious limitation on the energy which can be given to electrons in a synchrotron. This 'synchrotron radiation' is also responsible for light and radio waves from astronomical sources such as quasars and supernovae. Consider an electron with large energy, y times the rest energy, moving with velocity near c perpendicular to a magnetic field H (Fig. 19.4). The electron orbit is a circle, which is followed at a frequency v l / y, where v~ is the Larmor frequency eH/moc. Seen perpendicular to the orbit, the acceleration is obviously into a circle, and a circularly polarized wave is emitted along the magnetic field. The much more intense radiation in directions close to the plane of the orbit must however be calculated using the acceleration at the corrected time (t - r/c), which has a large effect when the electron velocity is close to c. Instead of a sine wave the radiated field appears as in Fig. 19.4(b). The effect is to compress the time during which the electron moves towards the observer, and stretch the time during which it is receding. Each time the electron travels towards the observer a sharp peak is radiated. The electric field therefore consists of a regular series of spikes, separated
Magnetic f ~eld
(0)
(16) Fig. 19.4. Synchrotron radiation. (a)An electron gyrating in a magnetic field is viewed from a point in the plane of its orbit. The radiated electric field (b) has a sharp maximum each time the electron travels towards the observer.
19.5
by the orbital period of y / i L . The width
Brernsstrahlung
285
of the peaks decreases rapidly with increasing electron energy; it may be shown geometrically that T
The waveform of the radiation can now be Fourier analysed. It contains harmonics of the period y/ vL up to a frequency of the order of r - l , i.e. up to a frequency of approximately 2vL. The light from the Crab Nebula is generated by electrons with cosmic ray energies, probably up to loL2electron volts, moving in a magnetic field of the order of 10- tesla (10- gauss). Radio astronomy has shown that the interstellar gas contains a magnetic field of only 10-lo tesla gauss) but this is still sufficient to accelerate the cosmic ray electrons and produce measurable radio waves. By contrast an electron synchrotron machine might have a field of 1 tesla, and visible light would be emitted by electrons with energies of 1O9 electron volts. 19.5 BREMSSTRAHLUNG
An electron passing through an ionized gas is attracted by the nuclei which it encounters, and is accelerated along a curved path at each encounter (Fig. 19.5). This acceleration results in radiation, and a consequent loss of energy by the electron. The radiation is known as bremsstrahlung, (from the German, 'braking radiation'). It is an important process in astrophysical plasmas such as the solar corona; it is also important in high-energy laboratory physics, since it is responsible for energy loss of high-energy electrons in solid matter. The radiation from a single encounter is an impulse; it therefore has a broad spectrum. The closest encounters give the narrowest pulses, containing the
Fig. 19.5. Brernsstrahlung. (a) A moving electron is deflected by the electrostatic field of a positive ion. (b) The acceleration of the electron in the direction of travel x and in the perpendicular direction y.
286
Radiation by accelerated charges
Chap. 19
highest frequencies: a full calculation of the overall spectrum requires an integration over a range of impact parameters and electron energies. An upper limit to the emitted frequency is set by the photon energy: this cannot be greater than the original electron energy. This limit is encountered in an Xray discharge tube, where electrons are accelerated by a high voltage V. The minimum wavelength of the X-rays emitted when the electrons reach the target anode is
and there is a continuum of radiation emitted at lower frequencies.
CHAPTER
Physical processes In retraction
20.1 INTRODUCTION
The interaction of electromagnetic radiation with matter has two aspects, wave and quantum. When the photon energy hv is small compared with energies such as the binding energies of atoms, it is appropriate to use the macroscopic properties of materials, such as conductivity and dielectric constant, to explain the way in which they reflect, absorb, or transmit waves. At larger photon energies this approach is difficult to reconcile with the way in which photons obviously act like particles, almost like the 'material' particles such as the electron and proton. It is found nevertheless that even at these higher quantum energies the refractive index of materials conforms to a semi-classical theory, in which the macroscopic properties can be related to microscopic properties which at least take some account of the atomic nature of matter, and of the resonant frequencies of atoms. Except for the highest-energy electromagnetic waves, i .e. the gamma rays, the interaction is with electrons, since these are the least massive charged particles. The simplest interactions involve free electrons and waves with relatively low photon energy; this applies to radio waves in an ionized gas and to X-rays in a metal. Our analysis then progresses to electrons bound in atoms or molecules. Finally we briefly describe some of the quantum interactions which occur for relatively high photon energies.
288
Physical processes in refraction
Chap. 20
20.2 POLARIZATION IN DIELECTRICS
The effect of a dielectric on an electromagnetic wave will now be discussed in terms of the electric polarization induced by the electric field of the wave. The polarization represents a charge separation within the material, forming a dipole moment which oscillates at the wave frequency: this re-radiates a secondary wave, which combines with the original wave. The combination may have a different phase from that of the original wave: this means that the phase velocity of the resultant wave has been determined by the polarization properties of the dielectric, that is, by its dielectric constant. The velocity of electromagnetic waves is given generally by c/(eP)lI2, where c is the velocity in free space. For a pure dielectric, whose magnetic susceptibility is unity, the refractive index therefore equals E 'I2, the square root of the dielectric constant. The relation between e and the polarization P is given by simple electrostatic theory as:
The polarization P is defined as the electric dipole moment per unit volume. If there are N atoms per unit volume, each polarized by the field E to form a dipole with moment cx E, then
We must allow for the possibility that the atomic polarization cx E is not in phase with the field E, which may happen when the atom behaves like a resonant atomic oscillator. The refractive index E 'I2 is therefore complex, having real and imaginary parts. If it is n-ix, the effect is that the wave becomes
in which k is the wave number in free space. The quadrature component x then represents absorption, since Eq. (20.3) becomes
The wave proceeds with phase velocity c/n, while the amplitude decays exponentially with distance. Eq. (20.2) becomes:
Free electrons
20.3
289
and if x is nearly zero, as in a dielectric where the attenuation is small,
A further approximation may be made if n is close to unity:
We now have the required relationships between the refractive index of a medium and the response of its charged particles to the oscillatory electric field. The response, represented by the polarizability a, will now be found for two particularly interesting cases. 20.3
FREE ELECTRONS
A particularly simple situation arises when the dielectric polarization is entirely due to electrons which are not bound to nuclei. This is the case for radio wave propagation through the ionosphere, and also for X-rays in metals, which behave like dielectrics for these short wavelengths. Neglecting the effect of collisions between electrons and ions in the ionosphere, and the effects of lattice irregularities in metals, the polarizability a for a single electron is found from the simple equation of motion
d2y - eE - eEo exp ( i d ) ----dt2
rn
rn
.
The electrons all follow an oscillatory displacement y = y o exp ( i d ) , and the polarizability a! is the dipole moment per unit field:
Substituting in Eq. (20.6) we find
This gives the refractive index for any dielectric where the polarization is due to free electrons. It may be written as
290
Physical processes in refraction
Chap. 20
where V, is typical of the dielectric. For the ionosphere v, reaches 10MHz where the electron density is greatest (i.e. in the F-region), while for metals it corresponds to ultraviolet light. For frequencies above V, the refractive index is real and less than unity, and the phase velocity is therefore greater than the free space velocity c. It is easy to show that if Eq. (20.1 1) holds, then the product of the group velocity and phase velocity is c2.
20.4 RESONANT ATOMS IN GASES The second case for which we can easily evaluate the polarizability, and hence obtain the refractive index, concerns un-ionized atoms in gases. In a lowpressure gas the atoms may be assumed to act independently. Electrons are bound within the atoms at a series of energy levels; the atoms emit and absorb light with photon energies equal to differences between energy levels. The classical analogy is a simpler picture, in which the atoms are considered as oscillators which resonate at the appropriate spectral line frequencies. These resonators respond to the electric field of a wave with a forced oscillation, whose amplitude depends on the resonant angular frequency wo, and the sharpness of resonance, characterized by a width A w %. The approach of Eqs. (20.8) and (20.9) then leads to a polarizability
Some awkward algebra is needed to relate a! through Eq. (20.12) to the refractive index n - ix, but if x and n - 1 are both small, as is appropriate for a low-pressure gas, a good approximation may be found:
In the quantum approach the resonant width Aw 1/, is replaced by a lifetime 7, where
The form of the variation of n and x with the wave frequency is shown in Fig. 20.1. Well away from resonance, the refractive index increases steadily
20.4
Resonant atoms in gases
29 1
Fig. 20.1. The variation of absorption and refractive index in the region of a resonance. The width of the resonance is determined by the lifetime 7. The absorption varies with o as K = KO/[ 1 + ( T Aw ) ~ ;] refractive index varies as A o/ [ 1 + ( T Ao )2] .
with frequency, and the absorption is small. As the wave frequency approaches the atomic resonance, and the amplitude of the atomic oscillators increases, the absorption increases and the dispersion in the refractive index changes sign. The 'anomalous dispersion' is found at every resonant frequency of the atoms in the gas. Anomalous dispersion is most marked in gases at low pressure, since it is only when the atoms behave independently that a sharp resonance is found. When resonators of the same frequency are coupled together the combined resonance curve is broadened; in quantum terms this means that the lifetime of an excited state is reduced, while in terms of refractive index there is less anomalous dispersion. In real gases the atoms will have many resonances, each of which has an associated anomalous dispersion. Molecular gases also have rotational and vibrational resonances, occurring generally at lower frequencies. A complete curve showing refractive index and absorption for a gas therefore shows a series of the dispersion curves of Fig. 20.1. Away from resonances, and where the absorption is negligible, the refractive index increases with wave frequency; this is seen in glass, where the refractive index is higher at the violet than at the red end of the spectrum. When the wave frequency is very much higher than the resonant frequencies, the refractive index becomes less than unity; this is the free-electron case discussed in the previous section. The quadrature component x of the complex refractive index is called the extinction coefficient. The wave propagates with phase velocity c/n, and with an amplitude decreasing exponentially with distance. This attenuation is
292
Physical processes in refraction
Chap. 20
expressed by an absorption coefficientp, such that the fraction of the intensity lost in a thin slab with thickness dx is pdx. An expansion of Eq. (20.4) shows that
The absorption may be regarded as the sum of the absorptions by all the individual atoms, each of which must then have a cross-sectionfor absorption a=p/N.
20.5 ANISOTROPIC REFRACTION In most substances the refractive index is independent of the direction of propagation and independent of the polarization of the light. From the foregoing discussion this is obviously due to isotropy in structure of the substance: the polarization is independent of the direction of the electric field. For other materials, however, this may not be so. In a crystal the unit cell may be anisotropic, so that the polarizability depends on the direction of the electric field in relation to the directions of the crystal axes. This leads to the birefringence already discussed in Section 5.5. Again, a liquid may be composed of molecules with a permanent dipole moment, so that they may be aligned by an external steady electric field. In this way a dielectric liquid may be made birefringent, and used to modify the polarization of light passing through it. Induced birefringence (Section 5.8) in a liquid is known as the Kerr effect; a related phenomenon in which the existing birefringence of a crystal is modified by an electric field is known as the Pockels effect. Both these effects relate to the refractive indices of plane-polarized components of a light ray. The two components will also show anomalous dispersion for frequencies close to atomic resonances. Figure 20.2 shows a further effect in which the resonant frequency of an atomic oscillator is changed by a strong electric or magnetic field. The dispersion curve is then displaced between two planes of polarization, giving a large birefringence in the region of the resonance. This is the Voigt effect. Optical activity, discussed in Section 5.7, is due to a difference in refractive index between the two hands of circular polarization. 20.6 RAYLEIGH SCATTERING
The scattering and absorption of electromagnetic waves in gases often involves quantum processes. In a neutral gas, however, and for low photon energies, it is often appropriate to treat the neutral atoms as classical oscillators, as in Section 20.4; these oscillators respond to the electric field of the
20.6
Rayleigh scattering
293
Fig. 20.2. Anisotropic dispersion. Strong magnetic or electric fields can change the resonant frequency of atomic oscillators so that the dispersion curve is displaced between two planes of polarization. The dielectric then shows magnetic or electric double refraction. The magnetic effect is known as the Voigt effect. Some substances show an induced birefringence well away from resonances. This is due to the reorientation of molecules in the field. These effects are known as the Kerr effect (electric) and the Cotton-Mouton effect (magnetic).
electromagnetic wave, and reradiate some of the incident energy in directions away from the incident ray. The amplitude of the oscillation is found from the analysis of atomic polarization (Eq. (20.8)):
where Eo is the incident field. The phase difference 4 is of no consequence; furthermore, at low frequencies (i.e. low photon energies) the terms in w 4 and ( ~ A w % can ) ~be neglected in comparison with u:. The result is that the amplitude of oscillation is independent of the wave frequency. The radiated electric field from the electron is proportional to its acceleration, found by twice differentiating Eq. (20.17). This is proportional to u2, and the reradiated power is therefore proportional to u4, or to This is the Rayleigh law of scattering. Its most familiar application is to the scattering of visible sunlight in the Earth's atmosphere, where blue light is scattered more than the longer-wavelength red light. The Rayleigh law can also apply to the scattering of light in transparent solids or liquids, where the scattering elements are irregularities in density or structure rather than individual atoms. An important example is in the transmission of light along glass fibres, as used in long-distance communications. Here the lowest attenuation is mainly limited by Rayleigh scattering in small irregularities in the glass. The Rayleigh law dictates the use of the longest infrared wavelengths for which light sources and detectors are available; in practice the lowest attenuation is obtained at about 1.55 pm. There is a resonant absorption in silica glass at 1.39 pm, due to a small population of hydroxyl ions (OH), which results in an appreciable dispersion in refractive index at adjacent wavelengths; in the best optical fibres this can be cancelled out with an opposite dispersion due to the geometry of the light
294
Physical processes in refraction
Chap. 20
guide itself (Section 17.8). These fibres will then carry a very broad-band modulated light signal with an attenuation as low as 0.2 decibels per kilometre, i.e. a loss by a factor of only 10 in 50 kilometres. 20.7
RAMAN SCATTERING
The scattering of light by atoms and molecules can be described by the Rayleigh law only for light with low photon energy. When the photon energy is comparable with or greater than the resonance energies in the scatterer, there may be a quantum interchange of energy, so that a photon emerges from a collision with a different energy. The response of the classical oscillator, as in Eq. (20.17), is maximum at resonance, when w = wo and the photon energy is close to a resonant energy. The scattering which follows from this response corresponds to the re-emission of photons with unchanged energy. This is not always the case: the atom or molecule may absorb a photon and subsequently decay to a different excited state by emitting a photon of different energy; this inelastic scattering is known as Raman scattering. Two types of Raman scattering may be distinguished. Normally the emergent photon has a lower energy than the incident photon: this is the usual inelastic scattering. If, in contrast, the scatterer is already in an excited state before the interaction, a photon may be emitted with higher energy than the incident photon: this is referred to as super-elastic scattering. Analysis of both types must provide both for conservation of energy and for conservation of momentum, as for Compton scattering of free electrons. 20.8
THOMSON AND COMPTON SCATTERING BY ELECTRONS
The electrons of an ionized gas may scatter electromagnetic radiation in a simple classical process, analogous to Rayleigh scattering by neutral atoms. This is known as Thomson scattering, for which the individual electrons have the Thomson cross-section
where ro is the classical electron radius. In Thomson scattering the radiation is not absorbed, but reappears as radiation travelling in a different direction. There is no change of frequency; in quantum terms the photons collide elastically with the electrons. This applies when the photon energy hv is much less than the rest mass of the electron moc2. For higher photon energies the scattering can only be regarded as a quantum process in which a collision between an electron and a photon
20.8
Thomson and Compton scattering by electrons
295
involves an exchange both of energy and momentum. This the realm of Compton scattering, in which the photon emerging after the collision has a lower energy, i.e. a longer wavelength, the amount of the change depending on the geometry of the collision. The analysis which depends on the conservation both of energy and of momentum, gives the Compton scattering formula for the increase in wavelength AX as a function of 8, the angle through which the photon is deviated: AX=-
h (1 - cos 8) moc
Compton scattering is observed in X-rays passing through a solid or a gas. The essential interaction is between a high-energy photon and an individual electron, whether or not that electron is bound to an atomic nucleus.
CHAPTER
Detection of light
21.1 DETECTION OVER THE ELECTROMAGNETIC SPECTRUM
The interactions between electromagnetic radiation and matter offer an enormous variety of methods of detecting radiation. The range of quantum energies, from the negligibly small quanta of long radio waves to the overwhelmingly large quanta of gamma rays, is responsible for the main division between methods; there is an obvious difference between the measurement of the oscillating electric field of a radio wave and the measurement of momentum interchange in a collision between a gammaray photon and a material particle. There is, however, a common element through the whole range: all methods are measuring energy, and when we think of the ultimate sensitivity of any method we specify it in terms of the smallest quantity of radiation energy that can be detected. The attainable sensitivity of any detecting system is determined either by random processes in the detector itself or by the quantum nature of interactions between radiation and any detector. If the random processes within the detector are dominant, as for example the electrical noise in a radio receiver dominates over the small quantum energy of radio signals, then a classical analysis is appropriate: Boltzmann's law is applied to the random process and a minimum detectable energy level is found. If the quantum dominates, as for X-rays and gamma-rays, then it is often possible to detect every photon individually and the question of noise within the detector does not arise. Light occupies a middle position in the spectrum, where detection
21.2
Thermal detectors
297
processes are sometimes regarded as the detection of a steady flow of energy, to be measured against a random thermal background, and sometimes as quantum processes in which the detector can approach the ultimate limit of a separate detection of the arrival of each individual photon. The performance of a detector of light is therefore to be measured against a theoretical detector which adds no random output signals to the fluctuations that already exist in the light itself. The performance is specified as a detective quantum efficiency (DQE). The 'noise' fluctuations at the input and output of the detector can be expressed in terms of a detected energy flux, or 'signal'. The DQE is defined as a ratio: DQE =
Signalhoise at the input Signalhoise at the output
An ideal detector therefore has DQE = 1. We may expect the DQE of devices at wavelengths longer than visible light to be considerably smaller than unity. In the visible spectrum the physical and chemical processes involved in detection are such that a DQE of about 0.1 can be attained in several different ways. 21.2 THERMAL DETECTORS
The quanta of infrared light are not sufficiently energetic to be detected photoelectrically. In the far infrared (wavelengths 100 pm to 1 mm) the usual methods of detection depend on the heating effect of radiation on an absorbing surface. The resultant temperature change may be detected by a change of resistance, in a bolometer, or by a change in thermoelectric potential at a metal junction, in the thermocouple. In both types of thermal detector it is essential to use a sensitive element with small thermal capacity, well insulated from its surroundings, so that its temperature can respond reasonably rapidly to incident infrared energy. A response time of a few milliseconds can be achieved. The resistive element in a bolometer may be a metal strip, usually of platinum. The resistance changes are small; they are defined in terms of a temperature coefficient
where o, is the resistivity of the material. Most metals at room temperature have a values of around 0.005 K- l . Larger values are available in semiconductors, usually oxides of manganese, nickel or cobalt. Here the temperature coefficient depends on the bandgap in the semiconductor; values of 0.06 K - l are obtained at room temperature.
298
Detection of light
Chap. 21
The resistance of a superconductor at the transition temperature from normal conductivity to superconductivity is very sensitive to temperature. As a bolometer element this provides a very sensitive detector, but one that is difficult to use. A more practical cooled bolometer element is a carbon resistance cooled to liquid helium temperature; this has been used in far infrared astronomy. Thermocouples are generally less sensitive than bolometers, but they are also generally more rugged and convenient to use. The two metals of the junction are usually bismuth and antimony. 21.3 PHOTOCONDUCTIVE DETECTORS Although the photons of infrared light have insufficient energy to eject an electron from a solid, they may nevertheless be able to create a free electron or hole in a semiconductor, and thereby increase the conductivity of the material. For an intrinsic semiconductor the necessary photon energy is the bandgap energy, which typically corresponds to photons with wavelengths less than a few micrometres. Lead sulphide is commonly used for wavelengths from 1 to 4pm, and indium antimonide for wavelengths up to 7pm. The wavelength range may be extended by using doped semiconductors or semimetals. Here the energy required to excite an electron into a conduction band may be made arbitrarily small, allowing photoconductive detectors to be made for wavelengths up to 100pm. It will be obvious that these detectors can only be used at very low temperatures to avoid thermal excitation of electrons into the conduction bands. If the excitation energy is E, the probability of an electron being excited thermally is given by the Boltzmann factor exp( - E/kT); this must be smaller than in practice. A detector for 100pm radiation therefore requires cooling to the temperature of liquid helium, 4.2 K. 21.4 PHOTOELECTRIC DETECTORS The photoelectric effect, which provided the first demonstration of the quantum effect in light, provides the basis for many types of light detector. A clean metal surface emits electrons when photons with sufficient energy fall upon it. Only about 10% of the photons release electrons from a metal surface; electrons from the other photons are trapped in the metal. The minimum energy is the work function of the metal; for gold this is 4.5 eV, so that only light with a wavelength shorter than 270 nm can release electrons from gold. There is a random emission of electrons from any surface, which depends on the work function and the average electron energy. At room temperature
21.4
Photoelectric detectors
299
Fig. 21.1. Photomultiplier tube. An electron emitted by the photocathode P C is accelerated to D,, the first of a series of dynodes. At each dynode each electron stimulates the emission of several secondary electrons, and a large pulse is collected by the anode A.
this energy is about 1/40eV, so that the random emission from gold is negligible. Semiconductors are widely used as photoemitting surfaces. Here the minimum energy for a photon is the energy required to lift an electron across the Fermi energy gap, plus a surface effect. An alloy of caesium and strontium is often used; this requires a photon energy of 1.9 eV, corresponding to a wavelength of 660 nm. There is a considerably greater random emission from such a surface, and for observations of the greatest possible sensitivity it may be necessary to cool the photoelectric surface so as to reduce the random emission. In a simple photoelectric detector it may be sufficient to measure the total electric current of the photoemission; the best sensitivity and efficiency is obtained instead by detecting the individual photoelectrons. This is achieved in a photomultiplier tube (Fig. 21. I), in which the electrons are accelerated on a second metal surface, or dynode, which emits several electrons each time a primary electron strikes it. A series of these secondary emitting stages can be used, eventually multiplying the charge by a factor of 105 or 10% Photomultipliers may be used either to determine an average light intensity by recording the direct current or for short pulses of light when a current pulse is detected electronically. A single primary photon falling on the photocathode produces an output pulse of appreciable length; this may be important since it limits the photon counting rate which can be achieved without overlapping pulses. The output pulse length is determined partly by the variations in travel time of electrons between the dynodes, and partly by the finite response time of the pulse amplifier which follows the last dynode. A time resolution of a few nanoseconds (10-9s) is typical of photomultipliers designed for a high counting rate.
300
Detection of light
Chap. 21
21.5 PHOTON-COUNTING IMAGE DETECTORS A photomultiplier tube ideally gives a signal for all photons reaching its photocathode, without discriminating between locations on the photocathode. A camera which will build up an image by detecting individual photons must count electrons in many separate picture elements, or pixels. This requires the intensification of the photomultiplier to be repeated for each pixel, which can be achieved in a microchannel plate intensifier (Fig. 21.2).
Fig. 21.2. Microchannel plate. An electron emitted from the photocathode PC enters a channel, strikes the wall and stimulates a shower of secondary electrons. Part of its shower development is shown. A large bunch of electrons reaches the fluorescent screen S.
The microchannel plate is a closely packed array of glass tubes, around 15 pm diameter, forming an insulating plate about 1 mm thick. The glass surface acts as a dynode when struck by an energetic electron. A potential difference of around 1 kV between the surfaces of the plate provides the acceleration, and a large multiplication factor can be achieved. Bunches of electrons then strike a phosphor screen, each bunch indicating the arrival of a photon at the corresponding point on the photoelectric surface. A rapidly scanning television camera can then be used to record the arrival position for each photon, and eventually to build up the complete image. Photon-counting image systems of this kind are widely used in astronomy. 21.6 ELECTRON BEAM SCANNERS AND INTENSIFIERS Television cameras do not normally count the individual photoelectrons emitted from a photocathode surface at the camera focus, but detect only the
21.6
/
/
Electron beam scanners and intensifiers
30 1
Photocathode
,Target
plate P
--- -----,s
iEM1
+
----qq - i s
DT
Output 4
Electron collector grid
Fig. 21.3. The image orthicon. The image on the photocathode releases electrons which are accelerated to the target plate P, releasing a large number of secondary electrons. These are collected by a fine wire grid, leaving an image on P in the form of a stored charge. An electron beam from a gun G is scanned by the deflector plates D. The number of scattered electrons S depends on the charge on P. The electrons are detected in an electron multiplier EM surrounding the electron gun.
distribution of photoelectric current over the surface. In the image orthicon, shown diagrammatically in Fig. 21.3, the current is stored as a charge in a non-conducting surface, and the state of charge is explored by a scanning electron beam. Electrons emitted from the photocathode are accelerated to a thin nonconducting plate P, which may consist of glass or magnesium oxide. The potential of P is about 300 volts above that of the photocathode, so that secondary electrons are emitted on impact; these are collected by a fine wire mesh, at a potential slightly above P. The plate P therefore accumulates a positive surface charge density depending on the illumination of the cathode. A beam of electrons now scans P with the electron energies so arranged that the beam is scattered back more or less according to the potential of the part of P close to the beam. At the same time the beam neutralizes the charge on P. The scattered electrons are detected in the secondary electron multiplier arrangement surrounding the electron gun. In another type of television camera tube, the vidicon, the basic effect is photoconduction instead of photoemission (Fig. 21.4). A layer of lead monoxide absorbs photons and becomes more or less conducting according to the illumination. The electron beam which scans the layer of lead monoxide now conveys a current through the lead monoxide to a transparent conductor, stannic oxide; variations in this current are amplified to form the television signal.
302
Detection of light I
Chap. 21
Stannic oxide
Output
Fig. 21.4. The vidicon. The conductivity of the lead monoxide layer is proportional to the number of photons that fall on it. An electron beam from the gun G scans over the layer, and a current is collected by the transparent conducting layer of stannic oxide. Photocathode Phosphor screen
I
Magnetic field coils
'
+ 10 kV
Fig. 21.5. Image intensifier. Electrons from a photocathode surface are accelerated towards a phosphor screen, which is at a high potential. They are focused by an electromagnet, as shown, or by electric field lenses. The phosphor screen may be replaced by a photographic plate, either inside the vacuum tube, and replaceable only by breaking the vacuum, or outside a mica window which is thin enough to allow the electrons to reach the plate.
The detective quantum efficiency (DQE) of the image orthicon approaches 0.1, which is considerably better than the human eye. This performance is only possible over a limited range of illumination; nothing like the adaptability of the eye has been achieved in any of these devices. A simple intensification of an image on a photocathode is achieved in the image intensifier of Fig. 21.5. This uses electron optics to focus the p h ~ t o electrons from the cathode on to a phosphor screen, which may be viewed
21.7
Photodiodes and charge-coupled devices (CCDs)
303
directly or scanned by a television camera. The focusing is achieved by a series of electrodes or magnetic field coils, which form an electron lens between the cathode and the phosphor screen. 21.7
PHOTODIODES AND CHARGE-COUPLED DEVICES (CCDS)
Photons are detected in semiconductor photodiodes by the electron-hole pairs which they create. The pairs separate in the internal field of the diode junction, creating a potential difference across the diode terminals. At very high light intensities electric power may be extracted from the diode, as in the solar cell. At very low intensities the accumulated charge from a small number of photons may be detected by discharging the diode into a pulsedetecting circuit. A two-dimensional array of photodiodes on a silicon chip can be used as a very efficient detector of an image, with sufficient pixels for use as a television camera. It would, however, be quite impractical to make a separate wired connection to each diode. The charge-coupled device (CCD) is a selfscanning array, in which the output of the diodes is read sequentially after their exposure to light. Two principles are involved: charge storage and charge coupling. In Fig. 21.6 the photodiodes appear as a row of electrodes insulated from a semiconducting silicon substrate by a thin film of metal oxide. The substrate is usually p-type, and the electrodes are biassed positively. Under each electrode the positive carriers (holes) are repelled, leaving a depletion layer at a positive potential. Photoelectrons accumulating in this potential well are stored with very little loss for periods of some hours, allowing long integration times for faint-light photography, as for example in astronomical photography. The charge is accumulated under every third electrode, where a deeper potential well is created by applying a larger voltage. Charge can be transferred along a row of electrodes by the successive transfer of this bias
Fig. 2 1.6. Charge-coupled device, showing the deeper potential wells under the electrodes with higher voltages. One of the wells in ( a ) has accumulated electrons, which are shifted to the adjacent store in ( b ). A three-phase cycle of voltages moves the charges progressively along the line.
304
Detection of light
Chap. 21
voltage, as shown in Fig. 21.6(b). The action is often likened to a 'bucket brigade', in which buckets of water are passed from hand to hand to fill a tank. The last electrode on the line is the input to a transistor amplifier. The outputs of the individual rows can similarly be read sequentially. Efficient illumination of a CCD array requires the surface electrodes to be transparent: they are often made of polysilicon. Alternatively the diode array can be illuminated from the rear, when the silicon chip must be thinned to allow the photodiode action to occur close to the potential wells. CCDs are extensively used in astronomy, where their main advantages are a high DQE, reaching over 50%, and a linear response. They retain a useful performance to short infrared wavelengths (about 1.1 pm), in contrast to photoemissive detectors which have little response beyond the visible region (about 700 nm). CCDs are cooled to liquid nitrogen temperature for use at the lowest light levels, to avoid thermal generation of free electrons. CCDs are also very convenient for television cameras, where they have the advantages of compactness, simplicity, and operation at low voltages. In astronomical telescopes they show a large advantage over photographic plates and vidicons; if they are operated at the temperature of liquid nitrogen they have a DQE for red light which can reach 0.7. 21.8 PHOTOGRAPHY
The photographic process has two great differences from all photoelectronic devices. First, it can store permanently an enormous amount of information. The linear resolution on a photographic plate is about 1 pm, while plates over lOcm across are commonly used in Schmidt telescopes (Section 7.5). Second, it can be used to build up an image during very long exposure times. This is a process of integration. Whether this long integration time is a disadvantage or an advantage depends on whether one wishes to study time dependances of signals or not. A photographic emulsion contains individual grains of silver halide crystals, each about half to one micrometre in diameter. Each crystal can be developed by a reducing agent to produce a grain of silver. The reducing agent is not, however, powerful enough to develop grains unless they contain an imperfection in the form of a single silver ion which has already been converted into a silver atom. This conversion can occur when a photon is absorbed by the crystal. The photon acts by exciting an electron into the conduction energy band, when it is free to move until it is trapped by an imperfection in the lattice; here a silver atom is produced. There is a natural relaxation process in the crystal, and a developable grain is only obtained if more than one photon is absorbed within some minutes of time. Further, red and green light has insufficient quantum energy for direct action on the crystal, and absorption must be arranged to occur in
21.9
TheEye
305
organic dyes which coat the grains. Photography is not therefore a process with a high DQE.The high linear resolution and integration properties are therefore best combined with the photoelectric process in an image intensifier for photography of the faintest astronomical objects. 21.9 THE EYE
The optics of animal eyes embrace an astonishingly wide range of systems, but in all types of eyes the detection process is remarkably similar. We are all familiar with the camera-like human eye (Section 8.3), in which the curved front surface of the cornea, assisted by an internal lens, focuses an image on to the retina. Very simple eyes may act merely as a pinhole camera with no lens. Many other animal eyes have a spherical lens in which the refractive index is graded radially, very like the fish-eye lens devised by Maxwell; these short-focus lenses are very well corrected for spherical aberration over a wide field of view. There are even multi-element lenses: the copepod crustacean Pontella has a triplet lens, again well corrected for spherical aberration but with a narrower field of view. Reflection, rather than refraction, optics is rarer but not unknown in animal eyes. No metallic surfaces are available, as there is no natural process for depositing metal films. Instead the biological mirrors are constructed by the thin film technique, using a succession of layers, a quarter-wavelength thick, of organic materials with two different refractive indices. There are examples of optics using a concave reflector (some molluscs and the deep-sea ostracod Gigantocypris), but the commonest use of reflection is in compound eyes. Compound eyes, usually thought of as flies' eyes, are found in at least half of the animal kingdom. In the simplest types, the mosaic of separate lenses on the convex surface each has its own detector. Each element detects light from a single direction, with a typical angular resolution of about 1". In others the mosaic of lenses is separated from a continuous photosensitive surface. These are the 'superposition' eyes, in which several lens elements contribute to each element of the image. The lenses themselves are elongated and use the graded refractive index principle of the fish-eye lens. In some eyes, notably those of the decapod crustacea such as shrimps and lobsters, these lenses are replaced by a form of light guide, using internal reflection. The guides are square, and for light arriving nearly along the axis they act like single lenses, each ray being reflected once only in each of two faces of the square tube. All these different optical systems have in common a single basic photochemical process by which the retina detects light. It is not, however, a simple process, and it is indeed remarkable that it is common to the many different types of eye which must have evolved independently. The detection of light in the retina is a quantum process, in which a single molecule undergoes a chemical change in absorbing a single photon. The
306
Detection of light
Chap. 21
Fig. 21.7. The chromophore 11-cis-retinal, and the trans configuration to which it is converted when it absorbs a photon of light with a wavelength near 500 nm.
molecule is rhodopsin, which consists of a large protein molecule, known as an opsin, with a small part attached which contains the specific absorbing structure, or chromophore. This part is the chemical 1 1-cis retinal. It contains a chain of carbon atoms, with alternating double and single bonds (Fig. 21.7). An isomeric form, 11-trans retinal, is the same molecule with a different configuration, and with an energy greater by the energy of a photon near 500 nm wavelength. The absorption of a band of wavelengths in this vicinity gives rhodopsin its purple colour. The chromophore is common to all eyes, although its absorption band is found at somewhat different wavelengths according to the various forms of opsin molecule to which it can be attached. The changes in rhodopsin after the absorption of a photon are complex, and need not concern us. It is sufficient to say that during the first few milliseconds a nerve signal is sent out from the cell containing the excited molecule. There are also random impulses, the 'noise' of the detection process, and there is a most complex recognition process involved in distinguishing real signals from noise. Recognition of a light stimulus requires the simultaneous firing of several cells: in fact the most sensitive condition in the eye occurs when light falls on a considerable area of the retina, corresponding to at least one square degree in the field of view. The most sensitive detection corresponds to about 50 photons falling on this area of retina, which corresponds to about 100 photons on the cornea, since half the light is lost on its way to the retina.
21.9
TheEye
307
Evidently the DQE of the eye is only about at best; however, the astonishing adaptability of the eye is such that it preserves a DQE above 10- over a range of lo7 in light intensity. The adaptation involves several processes, including a change of up to a factor of 10 in the area of the pupil; the most important is a change between two types of detecting cells, the rods and the cones. The cones are concerned with higher intensities, and they contain three forms of rhodopsin, absorbing at three different wavelengths: this provides for colour vision in sufficiently bright light. The rods are the most sensitive elements, but they are not colour selective; in consequence, colours cannot be seen in faint illumination, as for example in moonlight.
Further reading
Advanced Texts
M. Born & E. Wolf, Principles of Optics, 6th ed., Pergamon, Oxford, 1980. Systematic and comprehensive theoretical treatment. M. V. Klein and T. E. Furtak, Optics, 2nd ed., Wiley, 1986. General text in classical optics. R. Loudon, The Quantum Theory of Light, 2nd ed., O.U.P., 1983. Individual Topics
W. H. Steel, Interferometry, 2nd ed., O.U.P., 1983. H. Haken, Light Vol. 2: Laser Light Dynamics, N. Holland, 1985. J. F. James & R. S. Sternberg, The Design of Optical Spectrometers, Chapman and Hall, 1969. Nils H. Abramson, The Making and Evaluation of Holograms, Academic Press, 1981. P. M. Duf fieux, The Fourier Transform and its Applications to Optics, Wiley , N.Y., 1983. M. F. Land, Optics and Vision in Invertebrates. In: H. Autrum (Ed.), Handbook of Sensory Physiology VII/6B, Springer, Berlin, 1981. General Texts
R. S. Longhurst, Geometrical and Physical Optics, 3rd ed., Longman, 1973.
Further reading
309
F. A. Jenkins and H. E. White, Fundamentals of Optics, 4th ed., McGrawHill, 1976. E. Hecht and A. Zajac, Optics, Addison-Wesley, 1982. S. G. Lipson and H. Lipson, Optical Physics, 2nd ed., C.U.P., 198 1 . Interaction of waves with matter-classical treatment. M. Garbuny, Optical Physics, Academic Press, 1965. Interaction of waves with matter, from the standpoint of atomic physics.
Worked Examples The Optics Problem Solver, Staff of Research and Education Association, R.E.A., N.Y., 1981.
Solutions to Problems 1.1 Vl = ( T/Ap)l12, V2 = ( Y/p)lj2 where A is cross-sectional area, T is tension, Y is Young's Modulus and Hence Vl/ V2= all2. 1.2 v l ~ g 1 / 2 ~ v21O/C2 T ~ / ~ X - ~ / 1.3 X = 2.4 mm, X = 12 cm. av ai and ai av 1.4 Show first that - = L= C-. ax at ax at 1.6 Group velocity = V- (d/dX) V= a = constant. Any pair of wavelengths have the same relative phase at the prescribed intervals of time and distance, so that a group made up of any number of component waves will reproduce exactly. 1.7 Group velocities are V/2, 3 V/2, 2 V, c2/ V. 2.1 (iii) Suitable pairs of polarizations would be orthogonal linear, or circular with opposite hands. 2.2 If the waves are in phase everywhere, the source amplitude must be doubled also. 2.3 As X approaches 2a only one mode (n = 1) can propagate, and X, becomes long. 2.4 The difference between the relativistic and non-relativistic calculations is 2 Hz. This is also the transverse Doppler shift. 3.5 The amplitude modulated wave has sidebands at p w. The frequency modulated wave also has sidebands at p 2w, p 3w, etc. 4.3 Apparent bore is n times actual. 4.4 Displacement is d = t sin 4(1- cos +/n cos 4' ) where 4 ' is angle of refraction. 4.5 Consider a section of wavefront 6y. In time 7 secondary waves from one end travel a distance cr/n; those from the other travel a distance c7/ [ n + (dn/dv)6y ] . Hence the angle turned is (c.r/n)(l/n(dn/dy)), and the radius of curvature is (1/n )(dn/dy ). 4.6 The radius of curvature is ho(n- I)-', where ho is the scale height. In the example this is 36 000 km as compared with the Earth's radius, 6000 km. The horizon distance is increased by 10%. 4.7 Points on the wavefront 6y apart in the plane of incidence arrive on a stationary 9
+
+
+
Solutions to problems
4.8 4.9 4.10 4.11 4.12 4.13
5.1 5.2 5.3 5.4 5.5 5.7 6.2 6.3
311
mirror at times (6y/c)sin i apart. In this time the mirror moves Gy(v/c)sin i, and the secondary waves are retarded by 6y(2v/c)sin i. n3 must be the geometric mean of n, and n2. The amplitude reflected is 20%; the intensity reflected is 4%. A quarter-wave coating with refractive index 1.225 is needed; a low reflection coefficient over a wide wavelength range can only be achieved by a multi-layer coating. The velocity is 2 km s-l, giving a rotation period of 24 days. (The true value is 27 days). The Earth's velocity is of the velocity of light, giving a peak-to-peak variation of 6 mHz. Finding the velocity from tm$ = k~ gives a half-width 0.02 nm. (A full calculation gives a lower value.) (i) If 10 watts is radiated from a distance of 1 metre, then E- 0.05 V m-l. (ii) Estimating 0.1 watt at 2 metres, and the diameter of the earhole as 2 mm, the power is 3 x lo-* watt. The power flux = P2/z, where P is the r.m.s. pressure change, hence P= 1.8 N m-2, i.e. 0.02% of atmospheric pressure. Resolve the amplitude emerging from the first on to the direction of the second, and again on to the third, giving 1/8 1,. 0.85pm. 6.3 mm. The plate produces a small lateral displacement, not an angular displacement. There is dispersion in the difference between the refractive indices. Major axis at 52", axial ratio cot 28" = 1.88. Ratio is (n - l)/n. Multiplied by - 7.7.
6.6 ~hickness= f m, n(r) = 1.4 - 39/10. 6.7 With the origin of coordinates at the apex of the surface, and a focal length f, the optical path through the point (x,y,z) is equal to the path fn, for an axial ray. Hence:
Hence the equation for the surface is
:
:
When n > n this is a prolate ellipsoid of revolution: when n < n it is a hyperboloid of revolution. The hyperboloid gives a virtual image. Note that if n, = n2 the aplanatic surface corresponds to a paraboloid of revolution: this is the perfect reflector. 6.8 The focusing property applies only to a single point, and no image can be formed of an extended object. 7.1 Follow Eqs. (7.2) and use
7.2 Using Problem 7.1, with l / u =0, b=0.3 mm, c = 3 mm.
312
Solutions to problems
7.3 The plate is as in Fig. 7.10(b) rather than Fig. 7.1 1. The extra thickness is ( 1 / n )a, where a is found from Problem 7.1. The plate is thicker by 1.3 pm at the edge. 2-n 7.4 Image distance beyond the second surface is 2(n tain chromatic abberation = 4 mm.
r - Differentiate to ob-
7.6 f = 215 mm, 6f = 3 mm. Compensating lens f = 485 mm. The combined focal length is 386 mm. 8.1 Use Newton's formula (6.6) with 2, equal to the focal length of the objective. 8.2 The Sun appears uniformly bright, while the Moon's brightness falls towards the limb by the factor cos 8 in the illumination. 8.3 The integral of the luminous flux F is required, in the form 2,
1:"'
sin
e do
Since F = L cos 8, the result follows. An elementary ring on the Moon has area 2 ~ sin 9 8 do, and its illumination is S cos 8, where S is the illumination of full sunlight. The luminous intensity of the ring is ( a / a ) S cos 8 2 a$ sin 8 d8 where a is the reflectivity. In the direction 8, back towards the Earth, the total luminous intensity is therefore a
j'
"I2cos2 8 sin 0 d8 = f a r 2 s
.
0
and the illumination at the surface of the Earth is f a ( r 2 / d 2 ) s . The ratio is therefore + a ( r 2 / d 2 )= 2.3 x 8.4 ( a ) and ( b ) . a L . 8.5 ( a ) ( 4 / a a 2 ) S = 2 x lo9 lux. ( b ) a02/(4f 2a2)=lo7 lux.
( 31
8.6 Approximately 1
-
- n ( n - ( n 2 - l)lI2)=3.4.
9.1 Angular breadth 20 arc minutes. The double slit gives 10 interference fringes across this pattern. 9.2 Derive the phasor diagrams from Fig. 9.4. On the first zero the phasor diagram is a closed circle; removing a central strip increases the amplitude to a maximum value of a w l . 9.3 (a) In this region there are two crossing wide beams, giving Young's fringes at 1 mm spacing. (b) Here the two pinholes are focused, giving images 5 mm apart, with Airy discs 0.5 mm across. 9.4 Each point of the source forms a pattern independently. The elongated source at an angle therefore forms fringes parallel to the line source, separated by angle X / D in a direction perpendicular to the diffracting slit. 9.5 (i) 10 minutes of arc. (ii) & second of arc. 9.6 In directions lying in a plane perpendicular to a side there are no zeros; the phasor diagram is a spiral that does not close. In planes parallel to the sides there are zeros where symmetrical halves of the triangle cancel. The angular distance to the zeros is very approximately 2X/L where L is the side of the triangle. 9.7 The zeros occur where tan z = 1.
Solutions to problems
313
10.1 450 km. 10.2 The Fresnel zone is about 500 m across, comparable with the height of surface features. The angular width is about 250 micro arcseconds, about ten times the
angular diameter of the nearest typical stars. 10.3 The particles are about 35 pm in diameter. 10.4 The phasor diagram for each zone is a semicircle, and alternate zones are missing, giving an amplitude ratio of +x$, and an intensity ratio of n-'. Reversing the phase of alternate zones quadruples the intensity. The remaining energy is diffracted around the focal point. 10.5 The two polarized sectors act independently, and the two edge patterns add in intensity giving shadow fringes on both sides of the geometric shadow edge. 10.6 The pattern is a superposition of a Fresnel shadow from a half-amplitude wavefront on an unobstructed half-amplitude wavefront. Shadow fringes will be seen on both sides of the geometric shadow edge, as can be shown by using the Cornu spiral. 11.1 (a) The effect is similar to the diffraction from a circular aperture as compared to a rectangular aperture. The diffraction maxima become wider, and the subsidiary maxima become smaller. (b) The grating spacing becomes 2d, so that the maxima of all orders are closer spaced. The intensity in the direction of the original maxima is reduced by a factor of four. The widths of the maxima are unchanged. (c) A superposition of two grating patterns, one with spacing d and amplitude a , the other with spacing 2d and amplitude 1 - a , should be considered. Then the original maxima will be reduced in intensity by a factor [ ( I + a ) / 2 ] ', and subsidiary maxima appear at half spacing in sin 9, with intensity a2. 11.2 (a) The grating pattern is shifted by angle 9, where sin 9 = X/2d. The envelope pattern, determined by the width of a single slit, remains unchanged. (b) Superpose two patterns, one from a grating with spacing d and unit amplitude, the other from a grating with spacing 3d and amplitude - 2. Then subsidiary maxima appear at one third intervals of sin 6, with intensity (+)', while the original maxima are reduced in intensity to (f)'. (c) Each half should be considered separately, then the two halves added to give an interference pattern, repeating at intervals of 2 X / D in sin 9. This is half the total width of a diffraction maximum, so that each maximum breaks up into two halves, with zero in the centre. 11.6 Such a modulation gives ghosts on either side of a spectral line, apparently from light with wavelength difference 6X = X/q. 12.1 At least 200 metres. 12.2 (a) Angular separation = 5 x rad, giving linear separation 0.5 mm. (b) Approximately X/6X = 25. (c) Angular width seen from double slit must be small compared with 5 x rad. Hence width less than 2 x mm. 12.3 Interference fringes are formed between the transmitter and its image, which has a reversed phase. The minimum value of h is therefore 5 wavelengths, i.e. 4TI m. 12.4 Approximately 10 cm. 12.5 At each position the slit spreads the rays by angle X/D. The widths of the standard slit diffraction patterns in the three cases are: (a) 2 mm (b) 1 mm (c) 2 mm. 13.1 The interference fringes are concentric circles. 13.2 The reflected waves from the front and back of the film are in opposite phase. The film thickness must therefore be considerably less than a quarter wavelength.
314
Solutions to problems
13.3 At the centre the glass surfaces are in contact and there is no reflection. The oil will reduce the reflection coefficient at both surfaces, giving fringes with reduced intensity and with smaller diameters. r + exp i8 , whose modulus is unity. 13.4 The reflection coefficient takes the form 1+ r exp i0 radians, or about 1' arc. 13.5 If the spacing is d, then dar = X/2. Hence a = 3 x The spacing reduces to d/n. The reflection coefficient becomes small, and the fringes become faint. The visibility does not fall. 13.6 The order is greatest at the centre. Let this be n + E, where E < 1. Then
For 600.000 nm we find n, + E, = 33,333.33. For 600.010 nm, n + E is decreased by 1 part in 6 x 104 and n2+ e2 = 33,332.788. Hence nl = 33,333 and n2= 33,332. Fringes appear at 8 where cos 8 = nX/2d. Putting cos 0 = 1 - 02/2, the first fringe appears at e2= 2 ~ / n .Hence el = 45 mrad and 8, = 73 mrad. 13.7 Thickness d=n(X/2). If a change of 6X changes n by 6n:
Here 6n = 1
and
= 2.62 cm.
13.8 The parameter F = (4 x 0.6)/(0.4)2 = 15. The ratio required is (1 + F)= 16. 13.6 The spectrum consists of a regular sequence of bright bands at wavelengths X=2t/n, where n is an integer. The thickness is 0.43 mm. 14.1 Use approximate values 1.2h/d for single apertures, and V 2 d for interferometers. arc sec, (v) 1.5 x lo-' arc sec. (i) 2', (ii) 0.6', (iii) 0.04", (iv) 5 x 14.2 (ii) Microscope resolving power is 1.2X(sin 8)-l, giving a resolution of approximately 0.1 nm. 14.3 (i) 4 400. (ii) 18,000. (iii) Resolution X/6X is approximately 1 . 5 p , / ~where , p = 2d/X. Hence for this spacing the resolution is 1.6 x lo6. 14.4 The reception pattern is the product of the two amplitude diffraction patterns. The angular resolution is X/D. 15.1 n - 1 = 1.7 x The visibility is at minimum when the path difference corresponds to 500 fringes, which requires 2.7 atmospheres. 15.2 7=20ps, N = 100. 15.3 m = 4 x lo4; 100 arc seconds; resolving power = 8 x lo5.
Index AbbC's sine condition 87 AbbC's theory of microscope 215 Aberrations 81, 91 chromatic 103 Absorption at a resonance 290 coefficient 292 in dielectrics 288 of photons 252 Achromatic lens 105 Aerial arrays broadside 175 end-fire 174 Airy disc 139, 21 1 Airy function 21 1 Amplitude complex 17 of a wave 15 Angular dispersion 220 Angular frequency 9 Angular magnification 84 Anisotropic propagation 58 Anomalous dispersion 29 1 Antinodes 23 Aperture stops 101 Aperture synthesis 247 Astigmatism 98, 99 Astronomical telescope 114 Autocorrelation 43, 246
Babinet's Principle 159 Beats 20, 28 Biaxial crystal 60 Binoculars 115 Birefringence 59 Bolometers 297 Bragg reflection law 179 Bremsstrahlung 285 Brewster angle 58, 65 Brightness 113 Broadside radio array 175 Camera 119 Candela 121 Cartesian oval 80 Cartesian sign convention 81 Cassegrain telescope 114 Caustic 77 Cavity resonator 255 Cerenkov radiation 279, 281 Charge-coupled device (CCD) 303 Chromatic aberration 103 Chromatic dispersion in grating 222 in prism 109, 220 Circular polarization 64, 71, 73 C 0 2 laser 261
316
Index
Coherence 184, 242 length 244 time 244 Coma 98 Complex amplitude 17 Compound microscope 117 Compton scattering 295 Compton X-ray diffraction 176 Condenser lens 133 Conjugate planes 83 Convergent evolution 213 Convex lens 83, 85 Convolution 42, 169, 184 Cornu spiral 147 Correlation 242 Cotton-Mouton effect 293 Cross-correlation 245 Cross-section for absorption 292 Curvature of field 99 Delta function 42 Descartes 30 Detective quantum efficiency (DQE) 297 Detectors CCD 303 image orthicon 301 photoconductive 298 photoelectric 298 photographic 304 the eye 111, 305 thermal 297 vidicon 302 Dielectric polarization 288 Diffraction circular aperture 139 Fraunhofer 129 Fresnel 143 general aperture 135 grating 161 rectangular aperture 136 slit 130 Diffraction grating 161 blazed 170 concave 234 construction 170 ghosts 172 missing orders 170 orders 161 X-rays 176 Dipole 280 Dispersion 10
angular 220 anomalous 29 1 chromatic 104 Dispersive power 104 Distortion 99 Doppler effect 30 relativistic 32 Doppler radar 29 Dye laser 262 Echelle grating 235 Edge wave 150 Einstein transition coefficients 252 Electric dipole radiation 280 Electromagnetic radiation 6, 279 Electron beam scanners 300 Elliptical polarization 65 Emission from accelerated charge 280 spontaneous 252 stimulated 252 End-fire radio array 174 End-gauges 227 Energy in waves 7, 13 Etalon 207 Exit pupil 116 Exponential frequency term 19, 35 Extinction coefficient 299 Extraordinary wave 59 Eye 111, 305 Eyepieces 105 F number 122 Fabry-Perot interferometer 206,224, 236 Faraday effect 73 Fellgett 240 Fermat's principle 53, 76 Fibre optics 264, 293 Field stop 116 Field of view 116 Finesse 225 Fish-eye lens 79 Flux 121 Focal length 84 Focal ratio 122 Fourier theory 33 Fourier series 35 Fourier transform 36 in diffraction 132 in grating theory 168 spectroscopy 239, 247
Index Fraunhofer diffraction 129, 143, 153 Frequency angular 9 negative 17 Fresnel biprism and double mirror 182 Fresnel diffraction 143 circular apertures 154 cylindrical wavefronts 150 integrals 147 slits and strips 151 straight edge 145 Fresnel reflection coefficients 58 Fresnel lens 123 Friedrich, Knipping and Laue 177 Fringe visibility 189, 199 Fringes, see Interference fringes Gabor 270 Galilean telescope 114, 120 Gaussian function 39 Grating function 43, 168 Grating spectrometer 222, 233 Gregorian telescope 114 Group velocity 10, 44 Half-period zones 146, 154 Half-wave plate 71 Hanbury Brown and Twiss 190, 250 Harmonic generation 266 Heisenberg's uncertainty principle 45 Helium-neon laser 261 Hero of Alexandria 51 Herschel telescope 114 Hertzian dipole 279, 280 Heterodyning in light 266 Holography 269 colour 276 moving objects 277 Homodyne detection 266 Huygens eyepieces 105 principle 48 secondary waves 48, 158, 283 Illumination 121 Image, perfect 77 Image intensifiers 300 Image orthocon 301 Infrared detectors 297 Instrumental profile 233 Intensity 121
Intensity interferometer 190 Interference filter 207 Interference fringes of constant thickness 201 of equal inclination 199, 201 visibility 189, 199 Ionosphere 289 Intensity correlations 190, 249 Interferogram 238 Interferometer 29 Fabry-Perot 206, 224, 236 intensity 249 Mach-Zehnder 230 Michelson, spectral 201, 247 Michelson, stellar 187 radio 29, 190, 247 Sagnac ring 232, 268 twin-beam 239 Jennison and das Gupta 191 Kerr cell 259 Kerr effect 73 Kirchhoff theory 141, 144, 158 Larmor frequency 284 Lasers 251 carbon dioxide 261 dye 262 four-level 260 gas, glass and liquid 260 modes 256 pulsed 259 pumping 253 ruby 261 semiconductor 263 Laue X-ray diffraction 177 Length measurement 226, 228 Lens achromatic 105 condenser 123 magnifier 110 projector 124 Lens aberrations 81 Lens equation 83 separated pair of lenses 85, 105 Lenz 79 Light guide 264 Lippmann 32, 277 Liquid crystal display 75 Lloyd's mirror 182
317
318
Index
Longitudinal magnification 84 Lumen 121 Luminance 122 Luminous flux 121 Lummer plate 24 1 Lux 121 Mach-Zehnder interferometer 230 Magnification 83 angular and longitudinal 84 Magnifying lens 110 Malus' law 70 Maxwell's equations 6 Maxwell's fish-eye lens 79 Maxwell's theorem 78 Michelson spectral interferometer 201, 247 stellar interferometer 187, 242 Michelson-Morley experiment 23 1, 268 Micro-channel plate 300 Microscope 117, 214 Abbe's theory 215 objective 118 phase contrast 217 resolving power 214 Missing orders 170 Modes, in laser 256 Modulated wave 40 Monochromatic light 14 Multiple beam interference 203 Negative frequency 17 Newton's colours 197 Newton's equation 84 Newton's rings 194 in transmission 197 Newtonian telescope 114 Nodes, in standing waves 23 Nonlinear optics 266 Numerical aperture 2 15 Oil-immersion objective 118 Optic axis 60 Optical activity 72 Optical cavities 255 Optical disturbance 15 Optical path 53, 76 Optical pumping 253 Ordinary wave 59 Overlapping orders 236
Parabolic reflector 95 Paraxial approximation 8 1 Perfect image 77, 80 Phase contrast microscopy 2 17 Phase velocity 12 Phasor 15 Photoconductive detectors 298 Photodiode 303 Photoelectric detectors 298 Photographic camera 119, 122 Photography 304 Photomultiplier tube 299 Photon 2, 192, 294, 298 Photon counting image detectors Plane wave propagation 3, 26 angular spread 26, 46 Pockels effect 73, 259, 292 Poisson 156 Polar diagram 173, 281 Polarization 64 analysis 68 circular 66 dielectric 288 elliptical 65, 66 plane 64 random 64 Polarizers 69, 70 Polaroid 70 Population inversion 252 Power of lens 83 Power spectrum 43 principal planes 83 Principle of least time 51, 53 Prism chromatic resolving power 220 dispersion 109 spectrometer 100, 219 thin 108 Prismatic binoculars 115 Projector 123 Propagation of waves anisotropic 58 dispersive 10, 44 in waveguide 28, 265 Kirchhoff theory 159 Pseudoscopic image 272 Pulfrich refractometer 61
Quantum efficiency 297 Quarter-wave plate 69, 71
Index Radar, Doppler 29 Radiation Cerenkov 281 dipole 280 electromagnetic 279 synchrotron 284 Radio antenna arrays 172 Radio interferometer 29, 190, 247 Raman scattering 294 Ramsden eyepiece 105, 115 Ray aberration 91 Ray tracing 85 sine law 87 Rays 50, 76 Rayleigh criterion 21 1, 215, 222, 224 Rayleigh distance 143 Rayleigh scattering 265, 293 Rayleigh's refractometer 226 Reflecting microscope 118 Reflection 50 coefficient 55 oblique incidence 56 of X-rays 176 partial 54 Refraction 5 1 at spherical surface 82, 89 double 59 Refractive index 53, 289 Refractometer 61, 226 Relativistic Doppler effect 32 Resolving power angular 2 11 chromatic 220 Fabry-Perot interferometer 224 grating 222 microscope 214 prism spectrometer 2 19 spectrometer 232 Resonant atoms in gases 290 Rhodopsin 306 Ring laser 267 Rowland concave grating 234 Sagnac interferometer 232, 268 Scattering 294 Rayleigh 292 Schmidt corrrector plate 102 Schrodinger 44 Secondary waves 48, 158 Semiconductor laser 263 Sign conventions 81
3 19
Signal-to-noise ratio 233 Sinc function 39, 170 Sine condition 87 Snell's law 50 Sound waves 54 Spatial filtering 219, 258 Speckles 266 Spectrometers 232 grating 222 prism 219 Fabry-Perot 206, 234, 236 Fourier transform 239, 247 instrumental profile 253 Spectrum 35 complex 36 Spherical aberration 93, 99, 101 Spherical lens 79 Standing waves 23, 28, 30 Stigmatic image 79 Stimulated emission 252 Stops 101, 116 Superposition of waves 15 Synchrotron radiation 284 Telephoto lens 120 Telescopes 112 resolving power 21 1 Schmidt 102 Thermal detectors 207 Thermocouple 297 Thomson scattering 294 cross-section 294 Thin film interference 200 Thin lens 83, 85 separated pair 85, 105 Transmission coefficient 54 Twin-beam spectroscopy 237 Twyman and Green interferometer 229 Velocity of light 7 group 10, 44 Vidicon 302 Visibility of fringes 189, 199 Voigt effect 292 Waves electromagnetic 6 on liquid surface 10 on stretched string 5
320
Index
Wave aberration 92 Wave equation 4 electromagnetic waves 7 Schrodinger 44 Wave group 10, 43, 46 Wave impedance 53 Wave packet 44 Wave velocity 1 1 Wavefront 48 Waveguide 28 Wedge fringes 227
X-ray crystallography 179 X-ray diffraction 176 X-ray refraction 289 Yagi radio array 175 Young's double slit 127, 137, 181 Zernike 2 17 Zone plate 156
The Electromagnetic Spectrum Wavelength
Frequency
30 k m
1o4HZ
Name and Typical Uses VLF (very l o w frequency) radio. Long distance and submarine communication. Long .wave radio. Broadcasting. Short wave radio. Long distance communication via the ionosphere. UHF (ultra high frequency) radio. Television, radar, radio astronomy. Centimetric radio. Satellite communications, radar. Shortest radio waves, longest infrared. Infrared. Molecular spectra.
Visible light. Ultraviolet. Atomic spectra. X-rays. Quantum energy 2.5 x 103eV. 7-rays. Quantum energy 0.25 MeV, extending up t o cosmic ray energies approaching 10 2 0 eV.
Units, Symbols and Data Micron ( p )
=
metre
Nanometre (nm)
=
metre
Angstrom unit
(A)
= 10-'*metre
Planck's constant ( h)
= 6.626x 1 0-34 joule sec
Velocity of light in vacuo ( c )
= 2.99792x 1 O8 metre sec-
Mass of electron
=9.11 x
Electronic charge ( e )
= 1.602x 1 0-l9coulomb
Electron volt (eV)
= 1.602~ 10-19joule
Wavelength of non-relativistic electron with energy v, the energy being expressed in electron volts
= 1.226~
Boltzmann's constant ( k )
=1.380x
gram
v-4 metre joule deg-I
The wave narure of light, which provides the theme of this book, is a link between optics and all other branches of physics. It is emphasized here i n the treatments of geometric optics and optical instruments, as well as i n diffraction and interference. Fourier theory and the concept of coherence are introduced, leading to the modern topics of lasers and holography. The interaction between radiation and matter is treated 8s the emission, absorption and refraction of waves. This book is intended to provide a connected account of this wide range of topics, suitable for an undergraduate course i n physics. In this second edition the basic material has been reorganized and revised to give a more central importance to Fourier theory. The applications to modern optics have been completely updated. The Manchester Physics Series is a collection of textbooks suitable for an undergraduate degree course in Physics. Each book has been individually developed to provide a reliable, self-contained text for an up-to-date course. Each book can be used independently of the other books in the series, while the organization and scope of each book allows considerabte flexibility in the selection and arrangement of different courses.
The Manchester Physics Series General Editors F.MANDL; R.J.ELUSON; D.J.SANDIFORO Physics Department, Faculty of Science, University of Manchesfer Properties of Matter:
B.H. Flowers and E. Mendoza
Optics :
F.G. Smith and J.H. Thornson
Second Edition Statistical Physics: Second Edition
Solid State Physics:
Electromagnetism: Atomic Physics: Electronics: Electromagnetic Radiation:
H.E. Hall 1.S. Grant and W.R. Phillips J.C. Willmott
J.M. Calvert and M.A.H. McCausfand F.H. Read
In preparation Statistics: Particle Physics:
R.J. Barlow B.R, Martin and G . Shaw
JOHN WlLEY & SONS Chichester New York Brisbana - Toranto .Singapore