PROGRESS IN OPTICS VOLUME X
EDITORIAL ADVISORY BOARD
M. FRANCON,
Paris, France
E. INGELSTAM,
Stockholm, Sweden
K. KINOSITA,
Tokyo, Japan
A. LOHMANN,
San Diego, U.S.A.
W. MARTIENSSEN,
Frankfurt am Main, Germany
M. E. MOVSESYAN,
Erevan, U.S.S.R.
A. RUBINOWICZ,
Warsaw, Poland
G. SCHULZ,
Berlin, Germany (G.D.R.)
W. H. STEEL,
Sydney, Australia
G. TORALDO DI FRANCIA,
Florence, Italy
W. T. WELFORD,
London, England
PROGRESS IN OPTICS VOLUME X
EDITED BY
E. WOLF
University of Rochester, N.Y., U.S.A.
Contributors
T. S. HUANG, R. W. SMITH, M. O. SCULLY, K. G. WHITNEY, C. G. WYNNE, D. Y. SMITH, D. L. DEXTER, E. K. SITTIG, C. W. HELSTROM
1972
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM · LONDON
© NORTH-HOLLAND PUBLISHING COMPANY - 1972
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the Copyright owner.
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 61-19297
NORTH-HOLLAND ISBN: 0 7204 1510 1
AMERICAN ELSEVIER ISBN: 0 444 10394 5
PUBLISHERS:
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM
NORTH-HOLLAND PUBLISHING COMPANY, LTD. - LONDON
SOLE DISTRIBUTORS FOR THE U.S.A. AND CANADA:
AMERICAN ELSEVIER PUBLISHING COMPANY, INC.
52 VANDERBILT AVENUE
NEW YORK, N.Y. 10017
PRINTED IN THE NETHERLANDS
CONTENTS OF VOLUME I (1961)
I. THE MODERN DEVELOPMENT OF HAMILTONIAN OPTICS, R. J. PEGIS . . . 1-29
II. WAVE OPTICS AND GEOMETRICAL OPTICS IN OPTICAL DESIGN, K. MIYAMOTO . . . 31-66
III. THE INTENSITY DISTRIBUTION AND TOTAL ILLUMINATION OF ABERRATION-FREE DIFFRACTION IMAGES, R. BARAKAT . . . 67-108
IV. LIGHT AND INFORMATION, D. GABOR . . . 109-153
V. ON BASIC ANALOGIES AND PRINCIPAL DIFFERENCES BETWEEN OPTICAL AND ELECTRONIC INFORMATION, H. WOLTER . . . 155-210
VI. INTERFERENCE COLOR, H. KUBOTA . . . 211-251
VII. DYNAMIC CHARACTERISTICS OF VISUAL PROCESSES, A. FIORENTINI . . . 253-288
VIII. MODERN ALIGNMENT DEVICES, A. C. S. VAN HEEL . . . 289-329
CONTENTS OF VOLUME II (1963)
I. RULING, TESTING AND USE OF OPTICAL GRATINGS FOR HIGH-RESOLUTION SPECTROSCOPY, G. W. STROKE . . . 1-72
II. THE METROLOGICAL APPLICATIONS OF DIFFRACTION GRATINGS, J. M. BURCH . . . 73-108
III. DIFFUSION THROUGH NON-UNIFORM MEDIA, R. G. GIOVANELLI . . . 109-129
IV. CORRECTION OF OPTICAL IMAGES BY COMPENSATION OF ABERRATIONS AND BY SPATIAL FREQUENCY FILTERING, J. TSUJIUCHI . . . 131-180
V. FLUCTUATIONS OF LIGHT BEAMS, L. MANDEL . . . 181-248
VI. METHODS FOR DETERMINING OPTICAL PARAMETERS OF THIN FILMS, F. ABELÈS . . . 249-288
CONTENTS OF VOLUME III (1964)
I. THE ELEMENTS OF RADIATIVE TRANSFER, F. KOTTLER . . . 1-28
II. APODISATION, P. JACQUINOT AND B. ROIZEN-DOSSIER . . . 29-186
III. MATRIX TREATMENT OF PARTIAL COHERENCE, H. GAMO . . . 187-332
CONTENTS OF VOLUME IV (1965)
I. HIGHER ORDER ABERRATION THEORY, J. FOCKE . . . 1-36
II. APPLICATIONS OF SHEARING INTERFEROMETRY, O. BRYNGDAHL . . . 37-83
III. SURFACE DETERIORATION OF OPTICAL GLASSES, K. KINOSITA . . . 85-143
IV. OPTICAL CONSTANTS OF THIN FILMS, P. ROUARD AND P. BOUSQUET . . . 145-197
V. THE MIYAMOTO-WOLF DIFFRACTION WAVE, A. RUBINOWICZ . . . 199-240
VI. ABERRATION THEORY OF GRATINGS AND GRATING MOUNTINGS, W. T. WELFORD . . . 241-280
VII. DIFFRACTION AT A BLACK SCREEN, PART I: KIRCHHOFF'S THEORY, F. KOTTLER . . . 281-314
CONTENTS OF VOLUME V (1966)
I. OPTICAL PUMPING, C. COHEN-TANNOUDJI AND A. KASTLER . . . 1-81
II. NON-LINEAR OPTICS, P. S. PERSHAN . . . 83-144
III. TWO-BEAM INTERFEROMETRY, W. H. STEEL . . . 145-197
IV. INSTRUMENTS FOR THE MEASURING OF OPTICAL TRANSFER FUNCTIONS, K. MURATA . . . 199-245
V. LIGHT REFLECTION FROM FILMS OF CONTINUOUSLY VARYING REFRACTIVE INDEX, R. JACOBSSON . . . 247-286
VI. X-RAY CRYSTAL-STRUCTURE DETERMINATION AS A BRANCH OF PHYSICAL OPTICS, H. LIPSON AND C. A. TAYLOR . . . 287-350
VII. THE WAVE OF A MOVING CLASSICAL ELECTRON, J. PICHT . . . 351-370
CONTENTS OF VOLUME VI (1967)
I. RECENT ADVANCES IN HOLOGRAPHY, E. N. LEITH AND J. UPATNIEKS . . . 1-52
II. SCATTERING OF LIGHT BY ROUGH SURFACES, P. BECKMANN . . . 53-69
III. MEASUREMENT OF THE SECOND ORDER DEGREE OF COHERENCE, M. FRANCON AND S. MALLICK . . . 71-104
IV. DESIGN OF ZOOM LENSES, K. YAMAJI . . . 105-170
V. SOME APPLICATIONS OF LASERS TO INTERFEROMETRY, D. R. HERRIOTT . . . 171-209
VI. EXPERIMENTAL STUDIES OF INTENSITY FLUCTUATIONS IN LASERS, J. A. ARMSTRONG AND A. W. SMITH . . . 211-257
VII. FOURIER SPECTROSCOPY, G. A. VANASSE, H. SAKAI . . . 259-330
VIII. DIFFRACTION AT A BLACK SCREEN, PART II: ELECTROMAGNETIC THEORY, F. KOTTLER . . . 331-377
CONTENTS OF VOLUME VII (1969)
I. MULTIPLE-BEAM INTERFERENCE AND NATURAL MODES IN OPEN RESONATORS, G. KOPPELMAN . . . 1-66
II. METHODS OF SYNTHESIS FOR DIELECTRIC MULTILAYER FILTERS, E. DELANO AND R. J. PEGIS . . . 67-137
III. ECHOES AT OPTICAL FREQUENCIES, I. D. ABELLA . . . 139-168
IV. IMAGE FORMATION WITH PARTIALLY COHERENT LIGHT, B. J. THOMPSON . . . 169-230
V. QUASI-CLASSICAL THEORY OF LASER RADIATION, A. L. MIKAELIAN AND M. L. TER-MIKAELIAN . . . 231-297
VI. THE PHOTOGRAPHIC IMAGE, S. OOUE . . . 299-358
VII. INTERACTION OF VERY INTENSE LIGHT WITH FREE ELECTRONS, J. H. EBERLY . . . 359-415
CONTENTS OF VOLUME VIII (1970)
I. SYNTHETIC-APERTURE OPTICS, J. W. GOODMAN . . . 1-50
II. THE OPTICAL PERFORMANCE OF THE HUMAN EYE, G. A. FRY . . . 51-131
III. LIGHT BEATING SPECTROSCOPY, H. Z. CUMMINS AND H. L. SWINNEY . . . 133-200
IV. MULTILAYER ANTIREFLECTION COATINGS, A. MUSSET AND A. THELEN . . . 201-237
V. STATISTICAL PROPERTIES OF LASER LIGHT, H. RISKEN . . . 239-294
VI. COHERENCE THEORY OF SOURCE-SIZE COMPENSATION IN INTERFERENCE MICROSCOPY, T. YAMAMOTO . . . 295-341
VII. VISION IN COMMUNICATION, L. LEVI . . . 343-372
VIII. THEORY OF PHOTOELECTRON COUNTING, C. L. MEHTA . . . 373-440
CONTENTS OF VOLUME IX (1971)
I. GAS LASERS AND THEIR APPLICATION TO PRECISE LENGTH MEASUREMENTS, A. L. BLOOM . . . 1-30
II. PICOSECOND LASER PULSES, A. J. DEMARIA . . . 31-71
III. OPTICAL PROPAGATION THROUGH THE TURBULENT ATMOSPHERE, J. W. STROHBEHN . . . 73-122
IV. SYNTHESIS OF OPTICAL BIREFRINGENT NETWORKS, E. O. AMMANN . . . 123-177
V. MODE LOCKING IN GAS LASERS, L. ALLEN AND D. G. C. JONES . . . 179-234
VI. CRYSTAL OPTICS WITH SPATIAL DISPERSION, V. M. AGRANOVICH AND V. L. GINZBURG . . . 235-280
VII. APPLICATIONS OF OPTICAL METHODS IN THE DIFFRACTION THEORY OF ELASTIC WAVES, K. GNIADEK AND J. PETYKIEWICZ . . . 281-310
VIII. EVALUATION, DESIGN AND EXTRAPOLATION METHODS FOR OPTICAL SIGNALS, BASED ON USE OF THE PROLATE FUNCTIONS, B. R. FRIEDEN . . . 311-407
PREFACE
Even if the writing of a characteristically different preface to each new volume in this series becomes a more and more difficult task, there is at least no problem to find suitable authors, or topics for review. A glance at the literature shows that physicists and engineers working in optics continue to exercise originality and vitality whether in connection with the traditional problem of instrumentation or with the still rapidly expanding and diversifying field of quantum optics. In addition the series continues to be well served by an international board of editors which never fails to suggest not only appropriate topics but also appropriate authors. This has enabled us once again to offer a number of reviews, which, it is hoped, will satisfy the most catholic of tastes. An innovation in this volume, which we believe may well become useful, is a cumulative index to be found on pp. 392-393.
EMIL WOLF
Department of Physics and Astronomy
University of Rochester, N.Y., 14627
July 1972
CONTENTS

I. BANDWIDTH COMPRESSION OF OPTICAL IMAGES
by T. S. HUANG (Cambridge, Mass.)
1. INTRODUCTION
   1.1 The digital transmission and storage of optical images
   1.2 Rate-distortion theory
   1.3 A practical system layout
   1.4 References
2. CHANNEL CODING
   2.1 Channel characteristics
   2.2 Modem (Modulator-demodulator)
   2.3 Error-detection-and-correction codes
   2.4 Tradeoffs between source and channel coding
3. IMAGE DIGITIZATION
   3.1 Sampling
   3.2 Quantization
   3.3 Combined sampling and quantization
4. REDUNDANCY REDUCTION
   4.1 Statistical coding
   4.2 Psychovisual coding
5. INTERPOLATIVE CODING
   5.1 The basic principle
   5.2 Piecewise-linear approximation
   5.3 Lowpass and corrective signals
   5.4 Two-dimensional edge detection
   5.5 Contour interpolation
6. PREDICTIVE CODING
   6.1 Predictive coding
   6.2 Differential PCM (DPCM)
   6.3 Delta modulation
   6.4 Two-dimensional prediction
7. RUN-LENGTH CODING
   7.1 Run-length coding
   7.2 Area coding
8. REDUCING QUANTIZATION NOISE
   8.1 Randomizing quantization noise
   8.2 Reducing quantization noise by filtering
   8.3 Block quantization
9. DUAL-MODE CODING
   9.1 The basic principle
   9.2 Coarse-fine quantization
   9.3 Synthetic highs
   9.4 Contour coding
10. TRANSFORMATIONAL CODING
   10.1 Preliminaries
   10.2 Bandwidth compression by low-passing the image
   10.3 Thresholding the Fourier transform
   10.4 Piecewise Fourier transform coding
11. CODING OF MOTION PICTURES
   11.1 Preliminaries
   11.2 Interpolative coding
   11.3 Frame-correction coding
   11.4 Pseudorandom scanning
   11.5 Varying the spatial resolution
12. CODING OF COLOR PICTURES
   12.1 Sampling and quantizing color images
   12.2 DPCM and delta-modulation
   12.3 Frame-to-frame coding of NTSC color TV
13. SOME PRACTICAL CONSIDERATIONS
   13.1 Image quality and bit rate
   13.2 Economy
14. COMMENTS ON IMAGE QUALITY
   14.1 Mean-square error criteria
   14.2 A proposed distortion measure
15. CONCLUDING REMARKS
ACKNOWLEDGEMENT
REFERENCES

II. THE USE OF IMAGE TUBES AS SHUTTERS
by R. W. SMITH (London)
1. INTRODUCTION
   1.1 High speed camera systems
2. IMAGE TUBE COMPONENTS
3. ELECTRON-OPTICAL SYSTEMS
   3.1 Biplanar image tubes
   3.2 Electrostatic lenses
   3.3 Short magnetic lenses
   3.4 Uniform magnetic field
   3.5 Strong uniform magnetic field
   3.6 Photocathode resistance and space charge
4. IMAGE INTENSIFICATION
   4.1 Cascade image intensifiers
   4.2 Transmission secondary electron multiplication intensifier (TSEM)
   4.3 Fibre optic coupling
   4.4 Image decay time in image intensifiers
5. EARLY APPLICATIONS OF IMAGE TUBES TO HIGH SPEED PHOTOGRAPHY
6. IMAGE TUBES DESIGNED FOR USE AS SHUTTERS
   6.1 The Mullard ME 1201
   6.2 Image dissector cameras
7. DEFLECTION SHUTTERS
   7.1 Russian image tubes with deflection shutters
   7.2 British developments of deflection shutter tubes
   7.3 Streak operation of deflection shutter tubes
8. GRID CONTROLLED IMAGE TUBES
   8.1 Electrostatically focussed tubes
   8.2 Magnetically focussed mesh tubes
9. THE USE OF CONVENTIONAL IMAGE INTENSIFIERS AS HIGH SPEED SHUTTERS
10. STORAGE METHODS
   10.1 Phosphor screen storage
   10.2 Dynamic electron image storage
   10.3 An alternative dynamic storage method
11. BIPLANAR IMAGE TUBES
12. MULTICHANNEL SYSTEMS
REFERENCES

III. TOOLS OF THEORETICAL QUANTUM OPTICS
by M. O. SCULLY and K. G. WHITNEY (Tucson, Arizona)
1. INTRODUCTION
2. DENSITY MATRIX THEORY
3. GREEN'S FUNCTION THEORY
4. QUANTUM NOISE OPERATOR THEORY
APPENDIX I
APPENDIX II
APPENDIX III
APPENDIX IV
REFERENCES

IV. FIELD CORRECTORS FOR ASTRONOMICAL TELESCOPES
by C. G. WYNNE (London)
1. INTRODUCTION
2. NEWTONIAN TELESCOPE CORRECTORS
   2.1 Ross correctors
   2.2 The Baker corrector
   2.3 Aspheric plate correctors
   2.4 Four-lens correctors
   2.5 Two-mirror correctors
3. RITCHEY-CHRÉTIEN TELESCOPE PRIME FOCUS CORRECTORS
   3.1 Single aspheric plate Ritchey-Chrétien prime correctors
   3.2 Doublet lens Ritchey-Chrétien prime correctors
   3.3 Multiple aspheric plate Ritchey-Chrétien prime correctors
   3.4 Three component prime focus correctors for Ritchey-Chrétien telescopes
4. SECONDARY FOCUS CORRECTORS
REFERENCES

V. OPTICAL ABSORPTION STRENGTH OF DEFECTS IN INSULATORS
The f-Sum Rule, Smakula's Equation, Effective Fields, and Application to Color Centers in Alkali Halides
by D. Y. SMITH (Argonne, Ill.) and D. L. DEXTER (Rochester, N.Y.)
1. INTRODUCTION
2. SMAKULA'S CLASSICAL TREATMENT OF DEFECT ABSORPTION
3. DECOUPLING OF DEFECT AND HOST AND THE SUM RULE
   3.1 The f-sum rule for the total defect-host system
   3.2 Separation of defect and host and the partial f-sum rule
   3.3 Applications to defect problems
4. CROSS SECTION, SMAKULA'S EQUATION, AND EFFECTIVE MASSES
   4.1 The generalized Smakula's equation
   4.2 Effective masses
5. LOCAL FIELD CORRECTIONS
   5.1 Correlation effects and effective local fields
   5.2 Classical local fields
   5.3 Quantum mechanical formulation
   5.4 Quantum mechanical definition of E_eff
6. ABSORPTION STRENGTHS OF COLOR CENTERS IN ALKALI HALIDES
   6.1 The convergence of f-sums
   6.2 Direct measurement of oscillator strengths
   6.3 An example - the F center
   6.4 Relative oscillator strengths
   6.5 Possible tests of E_eff/E_0 as a function of n_0
SUMMARY
ACKNOWLEDGEMENTS
REFERENCES

VI. ELASTOOPTIC LIGHT MODULATION AND DEFLECTION
by E. K. SITTIG (Murray Hill, N.J.)
1. INTRODUCTION
2. PHENOMENOLOGICAL THEORY OF ELASTOOPTICS
   2.1 Dielectric relations and optics in crystals
   2.2 Elastic relations and sound in crystals
   2.3 Piezoelectricity
   2.4 Electrooptic and elastooptic constitutive relations
   2.5 Simplified description: The acoustooptic figure of merit
3. MATERIALS FOR ELASTOOPTIC DEVICES
   3.1 Heuristic approaches to the selection of materials
   3.2 Data of elastooptic materials
4. CLASSIFICATION OF ELASTOOPTIC MODULATORS AND DEFLECTORS. GENERAL CRITERIA
   4.1 Elementary description
   4.2 Some general relations
5. REFRACTIVE AND BIREFRINGENT LIGHT DEFLECTION
   5.1 Refractive light deflection
   5.2 Birefringent modulation
6. DIFFRACTIVE LIGHT DEFLECTION
   6.1 Survey of the theory
   6.2 The design of Bragg deflectors and modulators
   6.3 Reduction of sound power, beam steering
   6.4 Bragg diffraction in anisotropic media
   6.5 Thermal and sound absorption problems
7. PIEZOELECTRIC TRANSDUCERS FOR DIFFRACTION LIGHT DEFLECTORS
   7.1 Transducer theory
   7.2 Technology of transducers
8. AREAS OF APPLICATION
9. OUTLOOKS AND CONCLUSION
ACKNOWLEDGMENTS
BIBLIOGRAPHY

VII. QUANTUM DETECTION THEORY
by C. W. HELSTROM (La Jolla, Calif.)
1. DETECTION THEORY
   1.1 Binary detection
   1.2 Discrete data and randomization
   1.3 Composite hypotheses
   1.4 Threshold detection
   1.5 Multiple hypotheses
2. DETECTION THEORY IN QUANTUM MECHANICS
   2.1 Binary detection
   2.2 The choice between pure states
   2.3 Threshold detection
   2.4 Multiple hypotheses
3. DETECTION OF A COHERENT SIGNAL
   3.1 The transmission-line receiver: classical analysis
   3.2 Quantization of the receiver
   3.3 The coherent signal of known phase
   3.4 Reception of a signal of random phase
   3.5 Amplification
4. MODAL DECOMPOSITION OF APERTURE FIELDS
5. DETECTION OF INCOHERENT LIGHT
   5.1 The optical fields
   5.2 The optimum receiver
   5.3 Detection of point sources
   5.4 Detection of extended objects
6. ESTIMATION THEORY
   6.1 Classical parameter estimation
   6.2 Quantum estimation
   6.3 The Cramér-Rao inequality
ACKNOWLEDGMENT
REFERENCES

AUTHOR INDEX
SUBJECT INDEX
CUMULATIVE INDEX OF AUTHORS
I
BANDWIDTH COMPRESSION OF OPTICAL IMAGES
BY
T. S. HUANG
Department of Electrical Engineering, M.I.T., Cambridge, Mass., USA
and
Institut für Technische Physik, ETH, Zurich, Switzerland
CONTENTS
§ 1. INTRODUCTION
§ 2. CHANNEL CODING
§ 3. IMAGE DIGITIZATION
§ 4. REDUNDANCY REDUCTION
§ 5. INTERPOLATIVE CODING
§ 6. PREDICTIVE CODING
§ 7. RUN-LENGTH CODING
§ 8. REDUCING QUANTIZATION NOISE
§ 9. DUAL-MODE CODING
§ 10. TRANSFORMATIONAL CODING
§ 11. CODING OF MOTION PICTURES
§ 12. CODING OF COLOR PICTURES
§ 13. SOME PRACTICAL CONSIDERATIONS
§ 14. COMMENTS ON IMAGE QUALITY
§ 15. CONCLUDING REMARKS
ACKNOWLEDGEMENT
REFERENCES
§ 1. Introduction
1.1. THE DIGITAL TRANSMISSION AND STORAGE OF OPTICAL IMAGES
The transmission of images finds applications in many diverse fields, such as picturephone, computer-computer and man-computer communications, and remote sensing (in space exploration, reconnaissance, biomedical engineering, and other areas). In still other cases, although image transmission to a remote location is not required, one does need to store the images for future retrieval and analysis. Some examples are the filing and storage of engineering drawings, fingerprints, and library books and journals.
The trend in image transmission and storage is to use digital instead of analog techniques. This is due to the many inherent advantages of digital communication systems (OLIVER, PIERCE and SHANNON [1948]) in the case of transmission, and the flexibility and ubiquity of digital computers in the case of storage. In both cases, one major advantage of digital over analog techniques is that errors are much easier to control.
Since images generally contain a large amount of information, a common problem one encounters in the digital transmission and storage of images is that the required channel or storage capacity is often excessive. It is desirable and sometimes mandatory to find ways to reduce this capacity requirement.
This capacity reduction is possible for two reasons. First, there is statistical redundancy in images: image points which are spatially close to each other tend to have nearly equal brightness levels. Secondly, there is psychovisual redundancy in images: one can intentionally destroy some of the information contained in an image without causing a loss in its subjective quality.
Many schemes have been devised in which some of the statistical and psychovisual redundancies of images are removed to reduce the channel (storage) capacity requirement. The purpose of this paper is to review some of these schemes.
1.2. RATE-DISTORTION THEORY
Before we get down to the description of the redundancy reduction schemes, it is worthwhile to first discuss briefly the general problem of designing an image transmission or storage system. A general block diagram for a signal transmission (storage) system is shown in Fig. 1. The source puts out signals, which in our case are images.
Fig. 1. A general block diagram for a transmission (storage) system.
The signals are transferred to the sink, in our case a human observer, via the channel or storage. The encoder transforms the source images into a form suitable for the channel or storage, and the decoder transforms the output from the channel or storage into a form suitable for the human observer.
The problem facing the system engineer is to optimize this system: Given a source, a sink, and a fidelity criterion or distortion measure for picture quality, how do we design the encoder and decoder so that the channel (storage) capacity requirement is a minimum? Alternatively: Given a source, a sink, and a channel (storage), how do we design the encoder and decoder so that the quality of the image reaching the sink is maximized?
The ideal mathematical framework for our optimization problem is Shannon's rate-distortion theory (SHANNON and WEAVER [1949]). The main result of this theory states that, for a given source and a given fidelity criterion or distortion measure D, we can find a function R(D), called the rate of the source, such that we can transfer our signal to the sink with a distortion as close to D as we wish as long as the channel (storage) capacity C is larger than R(D).
There are, however, considerable difficulties in trying to apply the rate-distortion theory to practical problems. First, in order to calculate the rate R, we need to have a realistic mathematical model (in statistical terms) of the source and a mathematical expression of the distortion measure D which agrees reasonably well with subjective judgement. Both of these are yet to be found. Second, even if we should find a realistic source model and a good distortion measure, they would probably be so complicated that we would have great difficulty in calculating the rate R analytically. Third, the rate-distortion theory tells us only what is the best we can do; it does not tell us how to do it.
In summary, then, although the rate-distortion theory gives us guidelines as to how an optimum system should behave, it does not really help us in designing such a system.
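To make the statement "C larger than R(D)" concrete, consider one of the few cases in which R(D) is known in closed form. This case is not taken from the present review; it is the standard textbook result for a memoryless Gaussian source of variance sigma^2 under a mean-square-error distortion measure, for which R(D) = (1/2) log2(sigma^2 / D) bits per sample when 0 < D <= sigma^2, and R(D) = 0 otherwise. The sketch below evaluates this formula; the function names gaussian_rate and channel_suffices are hypothetical, and a Gaussian model is an assumption that real images satisfy only roughly, if at all.

```python
import math


def gaussian_rate(distortion, variance=1.0):
    """R(D) in bits per sample for a memoryless Gaussian source.

    Assumes a mean-square-error distortion measure (an illustrative
    assumption, not a model of real images).
    """
    if distortion <= 0:
        raise ValueError("distortion must be positive")
    if distortion >= variance:
        # Sending nothing already achieves a distortion equal to the variance.
        return 0.0
    return 0.5 * math.log2(variance / distortion)


def channel_suffices(capacity, distortion, variance=1.0):
    """True if a channel of the given capacity (bits per sample) exceeds R(D)."""
    return capacity > gaussian_rate(distortion, variance)


if __name__ == "__main__":
    for d in (0.5, 0.25, 0.0625):
        print(f"D = {d}: R(D) = {gaussian_rate(d):.3f} bits/sample")
```

Even in this idealized case the formula gives only the bound; as noted above, it says nothing about how to build an encoder that approaches it.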
1.3. A PRACTICAL SYSTEM LAYOUT
In practice, instead of the general block diagram of Fig. 1, we prefer to work with a more detailed layout as shown in Fig. 2. A two-dimensional image is first spatially filtered, and then sampled in space and quantized in brightness. The psychovisual and statistical encoders remove, respectively, some of the psychovisual and statistical redundancies in the digitized image. The error-detection-and-correction encoder adds redundancy to the binary sequence representing the coded image to protect it from channel (storage) noise. The output binary sequence of this encoder is transformed by the modulator into waveforms suitable for transmission through the channel (or putting into storage). On the other side of the channel (storage), the demodulator and the decoders try to recover as well as they can the output of the quantizer. Finally, the two-dimensional (spatial) post-filter smooths out the recovered digital image into a continuous image for viewing by human observers.
Fig. 2. A practical block diagram for an image transmission (storage) system. [Panels: TRANSMITTER, RECEIVER.]
The operations of the filters, the sampler, the quantizer, and the psychovisual and statistical coders are based on the properties of the source: they are called source coding. The operations of the error-detection-and-correction coders and the modem (modulator and demodulator) are based on the properties of the channel (or storage): they are called channel coding.
In the remainder of this paper, we shall phrase our discussions in terms of image transmission, although most of the results apply to image storage as well.
Since all the blocks in Fig. 2 interact with each other, it is an impossible task to optimize the total system. One can only hope to design a good system instead of an optimum one, and to do so is an art as well as a science. Also, because of the lack of a mathematical distortion measure, the image quality has to be judged subjectively.
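The statistical redundancy described in § 1.1, which the statistical encoder of Fig. 2 is meant to remove, can be illustrated with a small numerical sketch. It is an illustration under stated assumptions, not a scheme from this review: the scan line is a synthetic, slowly varying sinusoid rather than a real image, the quantizer is uniform with 64 levels, and the helper names quantize and entropy_bits are hypothetical. The sketch compares the first-order entropy of the quantized brightness values with that of the differences between neighbouring samples.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """First-order entropy, in bits per symbol, of a sequence of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def quantize(x, levels):
    """Uniform quantizer: map brightness x in [0, 1] to an integer level."""
    return min(int(x * levels), levels - 1)


# Synthetic scan line (assumption): one period of a slowly varying brightness.
N, LEVELS = 512, 64
line = [quantize(0.5 + 0.5 * math.sin(2 * math.pi * i / N), LEVELS)
        for i in range(N)]

# Differences between spatially adjacent samples.
diffs = [b - a for a, b in zip(line, line[1:])]

h_pixels = entropy_bits(line)
h_diffs = entropy_bits(diffs)

if __name__ == "__main__":
    print(f"entropy of brightness levels : {h_pixels:.2f} bits/sample")
    print(f"entropy of neighbour differences: {h_diffs:.2f} bits/sample")
```

Because neighbouring samples of this line differ by at most one quantization level, the differences take only three distinct values and their entropy is far below that of the brightness levels themselves. Real images are less regular than this test signal, so the gain there would be smaller.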
As mentioned earlier, the purpose of the present paper is not to describe the design of an image transmission or storage system in its totality, but rather to review some of the psychovisual and statistical coders for redundancy removal. However, in order to gain the proper perspective, channel coding and image digitization are discussed briefly in § 2 and § 3, respectively. The redundancy reduction of monochrome still images is discussed in §§ 5-10; that of moving and color images in §§ 11 and 12, respectively. The effect of channel noise on the redundancy reduction schemes is discussed briefly in § 13, where we also comment on the equipment complexity. Finally, in § 14 we present some preliminary thoughts on the search for a measure of picture quality.
1.4. REFERENCES
Several bibliographies on picture bandwidth compression have been published (PRATT [1967], ROSENFELD [1968], WILKINS and WINTZ [1969]). Most of the papers in a special issue of the Proceedings of the IEEE on redundancy reduction were concerned with images (CUTLER [1967]). A symposium on picture bandwidth compression was held at the Massachusetts Institute of Technology, Cambridge, Mass., in April 1969 (HUANG and TRETIAK [1970]). Journals that frequently publish papers on picture bandwidth compression include the IEEE Proceedings, the IEEE Transactions on Information Theory, the IEEE Transactions on Communication Technology, and the Bell System Technical Journal.
§ 2. Channel Coding
(BENNETT and DAVEY [1965], LUCKY, SALZ and WELDON [1968], MARTIN [1969].)
2.1. CHANNEL CHARACTERISTICS
Physical communication channels, such as telephone wires, cables, and microwave links, in addition to being band limited, suffer from various imperfections:
(a) Noise - There are two kinds of noise: Gaussian random noise due to thermal and shot effects, and impulse noise due to lightning, switching transients, etc.
(b) Interference - Crosstalk between channels.
(c) Frequency distortion - Within the passband, the amplitude of the frequency response of a channel is not flat, and the phase is not linear. Therefore, even if the signal waveform we wish to transmit through the channel has only frequency components inside the passband of the channel, it will still be distorted at the receiving end.
(d) Fading - Due to multipath in wireless transmission.
(e) Dropout - Due to equipment failure somewhere along the channel.
Theoretical estimation of the capacity of a channel is difficult except in the case of a channel with additive white Gaussian noise only, in which case we have the well-known formula of Shannon (SHANNON and WEAVER [1949]):
$$C = W \log_2\left(1 + \frac{S}{N}\right) \ \text{bits/sec}, \qquad (1)$$
where $W$ is the bandwidth of the channel in cycles per second, and $S/N$ is the signal-to-noise ratio. Thus, a 3 kc voice channel with an $S/N$ of 30 db ($S/N \approx 1000$) has a theoretical channel capacity of about 30 000 bits per second.
2.2. MODEM (MODULATOR-DEMODULATOR)
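As a yardstick for the modem rates discussed in this section, the capacity figure quoted with eq. (1) can be checked numerically. The following Python sketch is only an illustration; the function name `shannon_capacity` and the conversion from decibels are our own choices, not part of the cited references.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Capacity in bits/sec of an additive-white-Gaussian-noise channel, eq. (1)."""
    snr_linear = 10.0 ** (snr_db / 10.0)  # 30 db corresponds to a power ratio of 1000
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# The 3 kc voice channel with S/N = 30 db used in the text.
capacity = shannon_capacity(3000.0, 30.0)
print(f"capacity = {capacity:.0f} bits/sec")
```

The result falls just under 30 000 bits per second, in agreement with the rounded figure quoted in § 2.1.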
The purpose of the modulator is to transform an incoming sequence of binary digits (bits) into a waveform suitable for transmission over the channel. The demodulator transforms the received waveform back to the binary sequence. For a given channel, one wishes to design a modem which can squeeze the greatest number of bits per second through the channel, achieve the least probability of error, and is cheap. Usually, the operation of the modulator consists of two stages. First, the incoming bit stream is transformed into a baseband signal. Then, this baseband signal is used to modulate a high-frequency carrier. Let the incoming bit sequence be $\{a_k;\ k = 0, 1, 2, \ldots\}$, where $a_k = 0$ or 1. The baseband signal is
$$g(t) = \sum_k b_k h(t - kT), \qquad (2)$$
where
$$b_k = \begin{cases} +d, & \text{if } a_k = 1, \\ -d, & \text{if } a_k = 0, \end{cases} \qquad (3)$$
$h(t)$ is a fixed time function (usually pulse shaped), and $T$ and $d$ are positive constants. More generally, we can group the incoming bits into blocks (with $n$ bits per block, say) and use one term in eq. (2) to represent each block, allowing $b_k$ to have $2^n$ possible values (each value corresponding to a specific bit pattern in a block). The baseband signal $g(t)$ modulates either the amplitude, the frequency, or the phase of a high-frequency carrier (sinewave). This modulated carrier
is then transmitted through the channel. At the receiver, the demodulator first recovers the baseband signal, then samples it at the appropriate time instants ($kT$, say) to get $b_k$ and hence $a_k$. The number of bits per second we can transmit through the channel is proportional to the logarithm of the number of levels in $b_k$, and inversely proportional to $T$. The number of levels we can have in $b_k$ is limited by channel noise. The size of $T$ is limited by the channel bandwidth. Theoretically, the Nyquist criterion tells us that for a channel band limited to $W$ cycles per second, the smallest $T$ we can have is
$$T_0 = \frac{1}{2W} \qquad (4)$$
when we use
$$h(t) = \frac{\sin(\pi t/T_0)}{\pi t/T_0} \qquad (5)$$
(which has a bandwidth $W$) and sample at $t = kT_0$. It is easy to see that in this case, if the channel has a perfect frequency response within the passband (flat magnitude, linear phase), then there is no intersymbol interference, i.e., the sample at $t = kT_0$ is $b_k$, without any contribution from $b_j h(t - jT)$, $j \neq k$. In practice, the channel frequency response is not perfect. A filter, called an equalizer, is inserted at either the transmitter or the receiver end to compensate for the channel frequency response. Since telephone channels are usually time-varying, the equalizer characteristics should change automatically in accordance with the channel variation. The modern trend in designing automatic equalizers is to use digital transversal filters (employing tapped delay-lines). Because we cannot synthesize $h(t)$ of eq. (5) exactly, in practice we can signal only at around half of the Nyquist rate, in order to avoid intersymbol interference. Recently, however, an alternative approach to baseband signal design, known as partial response signaling (which includes in particular the duobinary technique), has emerged. In this approach, the pulse $h(t)$ is chosen to introduce intersymbol interference intentionally but in a controlled manner, so that the demodulator will have no difficulty in decoding the individual symbols. Nyquist-rate signalling then becomes possible. The price we have to pay is that, for $n$-bit blocks, the baseband signal has to have $(2^{n+1} - 1)$ levels instead of the $2^n$ levels required in the conventional design. The $S/N$ requirement of the channel is therefore more stringent. As we mentioned earlier, the baseband signal can be used to modulate a carrier by either AM, FM, or PM. Modern-day high-speed modems for
data transmission generally use AM, because of the possibility of single-side-band (SSB) and vestigial-side-band (VSB) signalling in AM, which cuts the bandwidth requirement in half compared to double-side-band methods. In SSB and VSB signalling, we must have a reference carrier at the demodulator which has the same frequency and phase as the carrier at the modulator. Any carrier frequency or phase offset will introduce distortion in the recovered baseband signal. It turns out that it is this carrier phase offset, rather than intersymbol interference or channel noise, that puts a limit on the present state of the art of high-speed modem design. To conclude this section, we give an example of a modem developed by Honeywell Communications Center for the Air Force (Anonymous [1970]). The modem uses AM and VSB with a digital automatic equalizer containing a 66-tap delay line. The distance between successive symbols in the baseband signal is $T = 1/4800$ second. The baseband symbols may have 2, 4 or 8 levels, corresponding to a data rate of 4800, 9600 or 14 400 bits per second. In performance tests over a 400-mile Type 2300 C-2 conditioned telephone channel with an $S/N \approx 30$ db and a bandwidth of about 3 kc, the bit error probabilities at the three transmission rates are, respectively, lo-*, $10^{-6}$ and 5x The performance should be compared to the theoretical capacity of the channel, which is about 30 000 bits per second (see § 2.1).
2.3. ERROR-DETECTION-AND-CORRECTION CODES
The purpose of the error-detection-and-correction encoder is to add check bits to the incoming information bits to protect them from channel disturbances. There are two major categories of such codes: block codes, and the convolutional codes. In both cases, the incoming information bits are divided into blocks. In the case of block codes, the check bits added to each block depend only on the information bits in that block; while in the case of convolutional codes, they depend also on the information bits in other blocks. Much more is known about block codes than about convolutional codes. Generally speaking, the larger the block size, the more efficient the code, i.e., the smaller the ratio of the number of check bits to the number of information bits. However, when the block size is large, the mechanics of encoding and decoding may become very complicated. Therefore, in designing error-detection-and-correction codes, the main problem is not just to find an efficient code, but rather to find an efficient code which has a lot of structure so that encoding and decoding become relatively simple. A class of block codes that are quite efficient and relatively simple to implement is the BCH codes. Let n be the total number of bits per block, k
the number of information bits, and t the number of bit errors the code can correct. Some typical numbers for BCH codes are: (n, k, t) = (7, 4, 1); (7, 1, 3); (15, 11, 1); (15, 5, 3); (255, 247, 1); (255, 231, 3); (255, 139, 15).
2.4. TRADEOFFS BETWEEN SOURCE AND CHANNEL CODING
To speed up the transmission of a given image through a given channel, we can either do source coding to reduce the information bit rate or do channel coding to squeeze more information bits per second through the channel. The more source coding we do, the more vulnerable is the picture quality to information bit errors. The more channel coding we do, the more information bit errors we will have. The kind of compromise we make depends on whether we are striving for picture quality, speed of transmission, minimum cost, etc., or any combination thereof. To illustrate the manner in which we may try to achieve an optimum design, we offer the following simplified hypothetical example. We assume that in our system, by varying a parameter in the channel coders, we can achieve
Fig. 3. Tradeoff between source and channel coding. (a) Probability of bit error vs. transmission rate (information bits/sec) for channel coding. (b) Equal-quality curves for source coding. (c) Determining the optimum operating point.
the operating curve (probability of bit error vs. information bit rate) as shown in Fig. 3(a). Also, by carrying out subjective tests, we determine a family of equal-quality curves for source coding as shown in Fig. 3(b). A typical point $(B_0, P_0)$ in Fig. 3(b) represents an image which is first source coded to $B_0$ information bits per second, then corrupted with a bit error probability of $P_0$, and finally reconstructed and its quality judged subjectively. The superposition of Figs. 3(a) and (b) yields an operating point for maximum image quality.
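The superposition described above amounts to a one-dimensional search along the operating curve for the point of highest quality. The following Python sketch illustrates the idea only: both `bit_error_probability` and `quality` are invented stand-ins for the curves of Figs. 3(a) and 3(b), whose actual shapes would have to come from channel measurements and subjective tests.

```python
import math


def bit_error_probability(bit_rate):
    """Hypothetical operating curve (stand-in for Fig. 3(a)): errors rise with rate."""
    return 10.0 ** (-9.0 + bit_rate / 3000.0)


def quality(bit_rate, p_error):
    """Hypothetical quality score (stand-in for Fig. 3(b)).

    Quality grows with the information bit rate and falls with the
    bit error probability; the constants are arbitrary.
    """
    return math.log2(bit_rate) - 1000.0 * p_error


# Candidate information bit rates along the operating curve.
rates = list(range(2400, 19201, 1200))

# Superposition: evaluate the quality at each point of the operating curve.
best_rate = max(rates, key=lambda b: quality(b, bit_error_probability(b)))
print("optimum operating point:", best_rate, "information bits/sec")
```

With these assumed curves the optimum lies in the interior of the range: pushing the rate higher buys less quality than the added bit errors destroy.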
§ 3. Image Digitization
3.1. SAMPLING
To concentrate on the sampling process, let us consider the simplified subsystem depicted in Fig. 4. With respect to this system, the basic question is: For a fixed number of samples per image frame, how should we choose the prefilter and the postfilter to optimize the output image quality?
Fig. 4. The sampling process: input image, two-dimensional prefilter, ideal (impulse) sampler, two-dimensional post-filter, output image.
Let the picture be sampled at a square array of points. PETERSON and MIDDLETON [1962] showed that, for a fixed number of samples per frame, pre- and post-filtering with two-dimensional ideal low-pass filters (whose cutoff frequencies are chosen to avoid aliasing) gives the least mean-square difference between the output of the post-filter and the input to the pre-filter. Subjective tests (HUANG and TRETIAK [1965]) indicated that these same filters also give reconstructed pictures with the best subjective quality in the case of very low resolution (64 x 64 samples per frame) systems. For higher resolution systems (256 x 256 samples per frame), high-spatial-frequency accentuation at the post-filter seems to improve the output image quality; however, no extensive subjective tests have been done to verify this. Note that to obtain a received image with resolution comparable to that of present-day US commercial television pictures, about 500 x 500 samples per frame are required.
3.2. QUANTIZATION
To each input sample (with a continuous brightness range) the quantizer assigns a discrete level. The quantization can be either uniform or nonuniform (Fig. 5). If uniform quantization is used, about 5 to 8 bits per sample or 32 to 256 brightness levels (depending on the S/N of the original, the viewing conditions, etc.) are required to eliminate artificial contours (the so-called quantization noise). One can save about 1 bit per sample by using logarithmic quantization to take advantage of the properties of human vision (Weber-Fechner law).
Fig. 5. The quantization process (output levels vs. input brightness). (a) Uniform or linear quantization. (b) Logarithmic quantization.
Some examples of uniformly and logarithmically quantized images are shown in Figs. 6 and 7. The original image used in these examples contains a cameraman as the central object with grass and sky as background, and has an S/N of about 40 db before quantization and a resolution of 256 x 256 sample points. We can do even better (HUANG et al. [1967]) than logarithmic quantization (i.e., get by with fewer quantization levels), if we make use of the frequency distribution of the brightness of the image points (to use more levels in the brightness ranges where the frequency distribution curve is high). This last scheme, however, is in most cases impractical, because it is image-dependent.
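The difference between the two quantizers of Fig. 5 can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the brightness range of 1 to 100, the use of step centres as output levels, and the function names are our own choices and are not taken from the cited work.

```python
import math


def quantize_uniform(x, bits, lo=1.0, hi=100.0):
    """Map brightness x in [lo, hi] to the centre of one of 2**bits equal steps."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    index = min(int((x - lo) / step), levels - 1)
    return lo + (index + 0.5) * step


def quantize_log(x, bits, lo=1.0, hi=100.0):
    """Quantize log(x) uniformly, so that output levels are spaced by a constant ratio."""
    levels = 2 ** bits
    step = (math.log(hi) - math.log(lo)) / levels
    index = min(int((math.log(x) - math.log(lo)) / step), levels - 1)
    return math.exp(math.log(lo) + (index + 0.5) * step)


def relative_error(x, y):
    return abs(y - x) / x


# Relative error for a dark and a bright sample at 3 bits (8 levels).
for x in (2.0, 90.0):
    print(x,
          round(relative_error(x, quantize_uniform(x, 3)), 3),
          round(relative_error(x, quantize_log(x, 3)), 3))
```

The uniform quantizer makes a very large relative error in dark regions, whereas the logarithmic quantizer keeps the relative error roughly constant over the whole range, which is what the Weber-Fechner law suggests the eye cares about.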
3.3. COMBINED SAMPLING AND QUANTIZATION
If we digitize an image into L x L samples with B bits (or $2^B$ levels) per sample, then the total number of bits required to represent the digitized image is N = L x L x B. The following question then arises: For a given
Fig. 6. Images uniformly quantized to various numbers of levels (256 x 256 samples per frame). (a) 2 bits or 4 levels. (b) 3 bits or 8 levels. (c) 5 bits or 32 levels.
value of N, how should we choose values for L and B to get the best received picture? The answer, of course, depends on what we mean by "best". The requirements on a reconnaissance picture, for example, are quite different from those on commercial television pictures. For general-purpose pictures (such as commercial television and picturephone pictures), the judgment on the quality of the picture is necessarily subjective. For this class of pictures, a series of subjective tests (SCOVILL [1965]) were conducted in an attempt to answer the question posed in the preceding paragraph. Three original pictures, containing different amounts of detail, were used: a face (Fig. 8), a scene with a cameraman at the center, and a crowd (Fig. 9). Pictures with different values of L and B were generated, and observers were asked to rank order them according to their subjective quality. The results are presented in
Fig. 7. Image logarithmically quantized to various number of levels (256 x 256 samples per frame). (a) 2 bits or 4 levels. (b) 3 bits or 8 levels. (c) 5 bits or 32 levels.
Fig. 8. Pictures of a face received through simulated PCM systems. (a) Number of samples = 128 x 128; number of brightness levels = 64. (b) Number of samples = 256 x 256; number of brightness levels = 16.
Fig. 9. Pictures of a crowd received through simulated PCM systems. (a) 128 x 128 samples; 64 brightness levels. (b) 256 x 256 samples; 16 brightness levels.
Fig. 10 in the form of isopreference curves in the L-B plane. Each point in the L-B plane represents a received picture, with values of L and B equal to the coordinates of that point. An isopreference curve is one on which the points represent pictures of equal subjective quality. Figure 10 also shows
Fig. 10. Isopreference curves for: (a) Face; (b) Cameraman; (c) Crowd. Horizontal axis: number of samples in each direction, L (32 to 256).
curves of constant N, which are dotted. By inspection of the isopreference curves, the following conclusions can be drawn:
(1) The isopreference curves depart markedly from the curves of constant picture bit rate (N).
(2) The isopreference curves depend very much on the picture types. The curves become more vertical as the picture details increase. This indicates that for pictures with a large amount of detail, only a few brightness levels are needed; see Fig. 9(b).
(3) In some cases, for a fixed number of spatial samples, the picture quality will improve with a decrease in the number of brightness levels. A probable reason is that decreasing the number of brightness levels increases the apparent contrast of the picture.
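The curves of constant N against which the isopreference curves are compared follow directly from N = L x L x B. The short Python sketch below tabulates the bit counts for the PCM pictures of Figs. 8 and 9 and lists the grid points lying on one constant-N curve; the helper names and the choice of N = 32 768 are ours, made only for illustration.

```python
def bits_per_frame(samples_per_side, bits_per_sample):
    """Total bits N = L x L x B for direct PCM."""
    return samples_per_side * samples_per_side * bits_per_sample


def constant_n_points(n_bits, sides=(32, 64, 128, 256), max_bits=8):
    """Grid points (L, B) in the L-B plane that lie exactly on the curve N = n_bits."""
    return [(side, b) for side in sides for b in range(1, max_bits + 1)
            if bits_per_frame(side, b) == n_bits]


# Pictures (a) and (b) of Figs. 8 and 9: 64 levels = 6 bits, 16 levels = 4 bits.
n_a = bits_per_frame(128, 6)
n_b = bits_per_frame(256, 4)
print(n_a, n_b)

# Doubling L must be paid for by a fourfold reduction in B to stay on the same curve.
print(constant_n_points(32768))
```

Because N grows with the square of L but only linearly with B, the constant-N curves are steep in the L-B plane, which is one reason the isopreference curves can depart from them so markedly.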
§ 4. Redundancy Reduction
4.1. STATISTICAL CODING
To transmit a digitized image by direct PCM requires N = L x L x B bits per frame, where L x L is the number of samples per frame and B the number of bits per sample ($2^B$ being the number of discrete levels used for the brightness of each sample). Since the channel capacity requirement increases with an increase in the number of bits used to represent the image, it is the purpose of the psychovisual and the statistical encoders (Fig. 2) to reduce the number of bits needed to characterize the digitized image. We shall first take up statistical coding. We can characterize a digitized image by a sequence of messages. The messages can be, for example, the brightness levels of each individual sample. Or, each message may contain the brightness levels of a pair of neighboring samples. Still a third example is that the messages may be first differences of adjacent samples along each horizontal line. There are infinitely many ways in which we can choose our messages, the only requirement being that we should be able to reconstruct the digitized image from the sequence of messages. For a particular choice, let the possible messages be $m_1, m_2, \ldots, m_n$; and let the probability distribution of these messages (over the class of digitized images we are interested in) be $p_1, p_2, \ldots, p_n$. The main idea in statistical coding is to use variable-length binary codewords for the messages, using short codewords for the more probable messages and longer codewords for the less probable ones so that on the average we will have a small number of bits per message. Shannon's theory (SHANNON and WEAVER
[1949]) tells us that we can always find a code such that the average number of bits per message $r$ satisfies the inequality
$$H \le r \le H + 1, \qquad (6)$$
where the entropy $H$ is by definition
$$H \equiv -\sum_{i=1}^{n} p_i \log_2 p_i. \qquad (7)$$
The simple and elegant procedure of HUFFMAN [1952] guarantees that we will get a code with the minimum $r$. The entropy $H$ for a probability distribution is maximum when all $p_i$ are equal, and is minimum when all $p_i$ but one are zero. Generally speaking, the more nonuniform or peaky a probability distribution, the smaller its entropy. Therefore, in order to do effective statistical coding, we should choose a message set which has a peaky probability distribution.
4.2. PSYCHOVISUAL CODING
If the received picture is to be viewed by humans, then one can take advantage of the properties of human vision. Here, the purpose is to distort the picture in such a way that it can be described by a smaller number of bits, while keeping the distortion too small to be noticeable or objectionable to the human viewer. Psychovisual encoding, then, can be considered as an operation to derive a sequence of messages from the digitized image such that these messages require less channel capacity to transmit than the original digital image and that from these messages we can reconstruct a reasonable replica of the original digital image. Statistical encoding can of course be applied to the output messages of a psychovisual encoder to reduce their statistical redundancy. We shall see that it is in psychovisual coding that we can hope for large amounts of redundancy reduction. Indeed, without taking advantage of the psychovisual properties of human vision, we would have had neither movies nor black-and-white television, not to mention color television. The discrete-frame approach of movies and television (around 30 frames per second) is satisfactory because of the limited resolution of the temporal response of human vision. Color television is possible because we can synthesize any subjective color by using a finite number (3 or 4) of color components. In both cases, a continuum is reduced to a finite discrete set: the bandwidth compression or redundancy reduction ratio is infinite. In the remainder of this paper, we shall describe some specific redundancy reduction schemes. Most of these schemes use both psychovisual and statistical coding. We note that because of the Weber-Fechner law, generally speaking we can obtain better results by coding the density instead of the transmittance of a picture.
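Since most of the schemes that follow finish with a statistical coder, the bound of § 4.1 is worth checking on a small example. The Python sketch below computes the entropy of a message distribution and the codeword lengths produced by Huffman's merging procedure; the two probability distributions and the function names are illustrative assumptions, not data from the cited studies.

```python
import heapq
import math


def entropy(probabilities):
    """Entropy in bits per message of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def huffman_code_lengths(probabilities):
    """Codeword length of each message under Huffman's procedure."""
    # Heap entries: (probability, unique tie-breaker, message indices in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    lengths = [0] * len(probabilities)
    counter = len(probabilities)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, group2 = heapq.heappop(heap)
        for index in group1 + group2:          # merging adds one bit to each member
            lengths[index] += 1
        heapq.heappush(heap, (p1 + p2, counter, group1 + group2))
        counter += 1
    return lengths


def average_length(probabilities, lengths):
    return sum(p * n for p, n in zip(probabilities, lengths))


for probs in ([0.5, 0.25, 0.125, 0.125], [0.7, 0.1, 0.1, 0.1]):
    lengths = huffman_code_lengths(probs)
    print(lengths, round(entropy(probs), 3), round(average_length(probs, lengths), 3))
```

For the first distribution the average codeword length equals the entropy; for the second, peakier one it lies between H and H + 1, and both are well below the 2 bits per message that a fixed-length code for four messages would need.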
§ 5. Interpolative Coding
5.1. THE BASIC PRINCIPLE
Judiciously chosen samples in the picture are omitted in transmission. At the receiver, these missing samples are filled in by interpolation from the transmitted samples. Sometimes corrective values are also transmitted for the interpolated samples. In a good scheme, these corrective values should have a very peaky distribution so that efficient statistical coding can be used.
5.2. PIECEWISE-LINEAR APPROXIMATION
Youngblood was among the first to investigate interpolative coding (YOUNGBLOOD [1958]). His method was piecewise-linear approximation. For each scan line, consider the intensity $z$ as a function of position $n$:
$$z_n = f(n). \qquad (8)$$
He approximated $f(n)$ by a (sampled) piecewise-linear curve $g(n)$ such that
$$|g(n) - f(n)| \le \varepsilon, \qquad (9)$$
where $\varepsilon$ was a pre-selected threshold. At the transmitting end, only the locations and amplitudes of the joint points of the straight line segments were sent. From these, the receiver could reconstruct $g(n)$ by linear interpolation. Using this scheme, Youngblood was able to reconstruct good-quality pictures at an average bit rate of about 1 bit per sample for a picture of a girl's face, and 3 bits per sample for a crowd scene. The pictures contained about 240 x 240 samples.
5.3. LOWPASS AND CORRECTIVE SIGNALS
In this scheme due to CUNNINGHAM [1958], a picture was divided into small square blocks, and the average intensity of each block was transmitted to represent the intensity of the central sample of the block. At the receiver, linear interpolation was used to obtain intervening intensities. Coarsely quantized correction signals had to be sent for each sample to make the resulting picture acceptable. With this scheme, good-quality pictures could be obtained at an average bit rate of about 0.9 bit per sample for a picture of a girl’s face, and 1.7 bits per sample for a crowd scene. These were the same pictures used by Youngblood.
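The structure of such a lowpass-plus-corrective scheme can be sketched in a few lines of Python. The sketch is a one-dimensional simplification along a single scan line, whereas the scheme described above used square blocks; the block size, the correction step, the sample values, and all function names are assumptions made for illustration.

```python
def block_averages(line, block):
    """Lowpass signal: one average per block (line length assumed a multiple of block)."""
    return [sum(line[i:i + block]) / block for i in range(0, len(line), block)]


def interpolate(averages, block, length):
    """Linear interpolation between block centres; held constant outside them."""
    centres = [k * block + (block - 1) / 2.0 for k in range(len(averages))]
    out = []
    for n in range(length):
        if n <= centres[0]:
            out.append(averages[0])
        elif n >= centres[-1]:
            out.append(averages[-1])
        else:
            k = int((n - centres[0]) // block)
            t = (n - centres[k]) / block
            out.append((1 - t) * averages[k] + t * averages[k + 1])
    return out


def quantize(value, step):
    """Coarse uniform quantizer for the correction signal."""
    return step * round(value / step)


line = [10, 10, 12, 14, 40, 42, 41, 43, 20, 18, 19, 17]
block, step = 4, 4

averages = block_averages(line, block)                  # transmitted lowpass samples
lowpass = interpolate(averages, block, len(line))       # receiver-side interpolation
corrections = [quantize(z - g, step) for z, g in zip(line, lowpass)]  # transmitted
reconstructed = [g + c for g, c in zip(lowpass, corrections)]

print(averages)
print(corrections)
```

The corrections cluster around a few small multiples of the step, which is the peaky distribution that § 5.1 asks for, and the reconstruction error never exceeds half the correction step.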
5.4. TWO-DIMENSIONAL EDGE DETECTION
HUANG [1960] studied a scheme in which edge points were transmitted in addition to regular coarse samples. Figure 11 shows an anatomy of this scheme. A set of basic points (e.g., one out of every sixteen samples as in Fig. 11(a)) were transmitted for all picture frames. These points essentially constituted the low-frequency part of the picture. In addition, extra edge points (see Fig. 11(b)) were sent for each frame. Whether any given point was an edge point or not was determined by a threshold function which depended only on the basic points. Therefore, if the transmitter and receiver
Fig. 11. Pictures pertaining to Huang's two-dimensional edge-detection scheme. (a) Basic points. (b) Edge points. (c) Points sent from the transmitter = (a)+(b). (d) Reconstructed picture. Average bit rate = 0.85 bit/sample.
agreed on the threshold beforehand, the positions of the edge points need not be sent. At the receiver the blanks were filled in by linear interpolation. For the particular picture shown in Fig. 11(d), the bit rate is about 0.85 bit per sample. The picture was the same one used by Youngblood and Cunningham.
5.5. CONTOUR INTERPOLATION
A straightforward way of reducing the bit rate by a factor of 2 is to transmit every other scan line and at the receiver to fill in the missing scan lines by interpolation. However, the use of intensity interpolation (e.g., linear interpolation) will introduce staircase-like structure along the contours in the picture. GABOR and HILL [1961] proposed a way of doing contour interpolation which resulted in pictures of good quality.
§ 6. Predictive Coding
6.1. PREDICTIVE CODING (ELIAS [1955])
In predictive coding, an equation of prediction is set up:
$$z'_n = f(z_1, z_2, \ldots, z_{n-1}), \qquad (10)$$
where $z'_n$ = predicted intensity of the next sample and $z_1, z_2, \ldots, z_{n-1}$ = intensities of the past $(n-1)$ samples. Let $z_n$ = actual intensity of the next sample. Then the picture can be described by the values of $(z_n - z'_n)$ instead of those of $z_n$. If the equation of prediction, eq. (10), is well chosen, then the probability distribution of $(z_n - z'_n)$ will concentrate near 0, and efficient statistical coding can be used. Linear predictive coding (i.e., $f$ in eq. (10) is linear) was studied by HARRISON [1952]. He tried in particular both previous-value prediction
$$z'_n = z_{n-1}$$
and slope prediction
$$z'_n = 2z_{n-1} - z_{n-2}.$$
He found that the redundancy reduction achieved by the latter method was only slightly greater than that by the former.
6.2. DIFFERENTIAL PCM (DPCM)
The differential quantizing scheme of Cutler (CUTLER [1952], GRAHAM [1958], O'NEAL [1966b]) is essentially previous-value prediction. In this
scheme, first-differences of adjacent samples were transmitted. These differences were quantized nonuniformly to a fixed number of bits. The quantum levels became larger when the difference became larger. No statistical coding was used. DPCM and its variants (LIMB [1969], LIMB and MOUNTS [1969], BROWN [1970], CANDY [1970]) represent a class of simply implementable redundancy reduction techniques which can yield good quality pictures at a fixed bit rate of 2 to 3 bits per sample.
6.3. DELTA MODULATION
Delta modulation can be considered as a special case of DPCM where only 1 bit is used to transmit the difference between adjacent samples. The transmission of speech by delta modulation (DE JAGER [1952]) has been quite successful. The application of ordinary delta modulation to picture transmission, however, gives poor received pictures (O'NEAL [1966a]) because picture signals vary much more rapidly than speech signals. A modified scheme (WITT [1963]; WINKLER [1965]; REMM, COTTON and STROHMEYER [1966]), called Δ² (delta-squared) modulation, gives much sharper pictures than ordinary delta modulation. An ordinary digital delta-modulation system is shown in Fig. 12. Q is a two-level quantizer whose output is "1" when the input is positive, or "0" when the input is negative. G is a two-level generator that puts out Δ or -Δ when the input is 1 or 0, respectively. D is a one-unit (time between successive samples) delay.
Fig. 12. Delta-modulation system (transmitter, channel, and receiver; the receiver output is the reconstructed video signal).
In the Δ²-modulation system, the two-level generator G is replaced by a multilevel generator $G_1$. If in the input to $G_1$ we have a string of n 1's after one or more 0's, the output of $G_1$ corresponding to the string of 1's will be $a_1, a_2, \ldots, a_n$, where the $a_i$'s are positive numbers. Similarly, if in the input we have a string of n 0's after one or more 1's, the output corresponding to the string of 0's will be $-a_1, -a_2, \ldots, -a_n$. To shorten the rise time, one usually chooses $a_1, a_2, \ldots, a_n$ to be more or less exponentially increasing. Notice that G is a special case of $G_1$ with $a_i = \Delta$. Although Δ² modulation will give sharp pictures, it may introduce overshooting and ringing. To avoid excessive overshooting, some kind of limiting can be used; and by properly adjusting the output levels of $G_1$, a compromise might be reached between ringing and loss of edge sharpness. Figures 13(a) and (b) show a Δ-modulated picture and a Δ²-modulated one; the number of sample points is the same in both pictures.
Fig. 13. (a) Ordinary Δ-modulated picture, with Δ = 2. (b) Δ²-modulated picture, with $a_i = 2^{i-1}$, limited at 8.
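The behaviour of the two generators at a sharp edge can be illustrated with a small Python sketch of the loop of Fig. 12. It is a simplified model under stated assumptions: the step sizes follow our reading of the caption of Fig. 13 (a fixed step of 2 for ordinary delta modulation, and steps 1, 2, 4, 8, 8, ... for Δ² modulation), a zero difference is arbitrarily coded as 0, and the test signal and function names are invented for illustration.

```python
def delta_modulate(samples, step_for_run):
    """Encode samples to one bit each and track the receiver's reconstruction."""
    bits, reconstructed = [], []
    estimate = 0.0                  # output of the delay D (previous reconstruction)
    run_bit, run_length = None, 0
    for z in samples:
        bit = 1 if z - estimate > 0 else 0          # two-level quantizer Q
        run_length = run_length + 1 if bit == run_bit else 1
        run_bit = bit
        step = step_for_run(run_length)             # generator G (or G1)
        estimate += step if bit == 1 else -step     # accumulate, as the receiver does
        bits.append(bit)
        reconstructed.append(estimate)
    return bits, reconstructed


def ordinary_step(run_length):
    """Two-level generator G: the same step regardless of the run length."""
    return 2.0


def delta_squared_step(run_length):
    """Multilevel generator G1: a_i = 2**(i-1), limited at 8 (assumed values)."""
    return float(min(2 ** (run_length - 1), 8))


# A flat region followed by a sharp edge of height 40.
samples = [0.0] * 4 + [40.0] * 12

_, rec_ord = delta_modulate(samples, ordinary_step)
_, rec_sq = delta_modulate(samples, delta_squared_step)
print(rec_ord[-1], rec_sq[-1], max(rec_sq))
```

After twelve samples the ordinary modulator is still climbing toward the edge, while the Δ² modulator has reached it, at the cost of an overshoot before settling. The receiver needs no extra information, since it repeats the same accumulation from the received bits.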
6.4. TWO-DIMENSIONAL PREDICTION
The prediction schemes described in the preceding sections worked along scan lines. HARRISON [1952] discussed the possibility of doing prediction in two dimensions. CORRADETTI [1959] tried one particular such planar prediction scheme. He considered 5 scan lines at a time and predicted that the future points would lie on the plane determined by 3 suitably chosen previous points. The results did not show any improvement over the one-dimensional schemes.
§ 7. Run-Length Coding
7.1. RUN-LENGTH CODING
In a digitized picture, a "run" is by definition a sequence of consecutive samples (along a scan line) of the same intensity level. In run-length coding
instead of sample values, the length and the intensity level of each run are transmitted. A reduction in the average bit rate is achieved by using statistical coding. A prominent example of run-length coding is the work of Cherry and his students (CHERRY et al. [1963]), who achieved an average bit rate of around 3 bits for TV quality pictures. An alternative approach (SPENCER and HUANG [1969]) in run-length coding is to code the individual bit-planes of the digitized picture, instead of the picture as a whole. Let $s_{ij}$ be the intensity level of the sample in the ith row and jth column of the digitized picture. Then the nth bit-plane of the picture is by definition a matrix whose ijth element is equal to the nth bit (either 0 or 1) of the binary codeword for $s_{ij}$. By using Gray code for the intensities and by using a combination of conventional run-length coding and line-to-line run-length difference coding (in which essentially the differences between the corresponding run-lengths in successive scan lines were transmitted), it was possible to achieve an average bit rate of from 2 to 4 bits per sample depending on picture complexity. The original digitized pictures (before run-length coding) contained 256 x 256 samples with 6 bits per sample. Only statistical coding was used so that the reconstruction was exact (assuming no channel disturbances).
7.2. AREA CODING
A direct extension of run-length coding to two dimensions is to transmit the areas in a digitized image (MOTT-SMITH and BAER [1970]). An area is by definition a connected group of picture samples which have the same intensity level. To transmit an area, we can transmit its boundary points. By following the boundary we need to use only 3 bits per boundary point, since in a sampled picture each point has only 8 neighbors. A picture of a girl standing in front of a house was sampled with 256 x 256 points, and quantized to 4 bits or 16 intensity levels. By computer measurement, it was found that the number of areas is 2240, and the number of boundary points of the areas is 41 359. A simple calculation (SCHREIBER, HUANG and TRETIAK [1968]) shows that the average bit rate for the picture, using area coding, is about 2.5 bits per sample, which is rather disappointing when we compare it to the original 4 bits per sample. If we should use a 6-bit original picture, the number of boundary points for the areas would increase greatly and no redundancy reduction could be expected by using area coding.
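The order of magnitude of this figure can be reproduced with a rough bit count. The Python sketch below uses one plausible accounting, which is our assumption and not necessarily the calculation of the cited reference: each area costs a starting address and an intensity level, and each boundary point costs 3 bits.

```python
SIDE = 256                    # samples per line and lines per picture (from the text)
INTENSITY_BITS = 4            # 16 intensity levels (from the text)
NUM_AREAS = 2240              # measured number of areas (from the text)
NUM_BOUNDARY_POINTS = 41359   # measured number of boundary points (from the text)
BITS_PER_BOUNDARY_POINT = 3   # one of 8 neighbors

# Assumed per-area overhead: address of a starting point plus the intensity level.
address_bits = (SIDE * SIDE - 1).bit_length()      # 16 bits address any of 65 536 samples
overhead_bits = NUM_AREAS * (address_bits + INTENSITY_BITS)

boundary_bits = NUM_BOUNDARY_POINTS * BITS_PER_BOUNDARY_POINT
total_bits = overhead_bits + boundary_bits
bits_per_sample = total_bits / (SIDE * SIDE)

print(total_bits, round(bits_per_sample, 2))
```

Under this accounting the boundary points alone take nearly 2 bits per sample, so the total lands close to the 2.5 bits per sample quoted above and well short of a worthwhile saving over 4 bits.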
§ 8. Reducing Quantization Noise
8.1. RANDOMIZING QUANTIZATION NOISE
The human eye finds noise with strong structure, such as quantization noise, much more objectionable than random noise. Therefore, a smaller number of quantization levels can be tolerated if means can be found to transform quantization noise into random noise. An example is the scheme of ROBERTS [1962]. He added pseudorandom noise to a picture before quantization, and later at the receiver subtracted the same noise from the quantized picture. It can be shown that by this maneuver the quantization noise is transformed into random noise with the same r.m.s. value. This scheme gives good quality pictures with only 4 bits per sample. See Fig. 14.
Fig. 14. Results from Roberts' scheme. (a) 1 bit per sample. (b) 2 bits per sample. (c) 3 bits per sample. (d) 4 bits per sample.
8.2. REDUCING QUANTIZATION NOISE BY FILTERING
Quantization noise can be reduced by putting a pre-filter and a post-filter around the quantizer. Approximating the quantizer by an additive noise source, GRAHAM [1962] calculated the optimum (in the least mean-square sense) two-dimensional pre- and post-filters. These were later simulated on a digital computer (POST [1966]). It was found that, by using these filters, one can obtain a picture essentially free of artificial contours with only 8 brightness levels or 3 bits per sample. The quality of the reconstructed picture, however, was not good. Since Roberts' scheme transforms quantization noise into random noise, the noise filtering approach can be applied to it even better than to the original quantization noise. This was tried by WACKS [1970]. The filters used had the frequency responses shown in Fig. 15. Some results are shown in Fig. 16. Note that the 3-bit picture is of very good quality.
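Roberts' pseudorandom-noise scheme, on which the filtered variant builds, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the noise is taken as uniform over one quantization step, the quantizer has unlimited range, the two ends share a seed, and the test signal is a made-up ramp; none of these details is given in the text.

```python
import math
import random


def quantize(value, step):
    """Uniform quantizer: return the midpoint of the cell containing `value`.

    The range is unlimited here; a real quantizer would also clip.
    """
    return (math.floor(value / step) + 0.5) * step


def roberts_channel(samples, bits, seed=1):
    """Add pseudorandom noise, quantize, then subtract the same noise."""
    step = 1.0 / (1 << bits)
    tx_noise = random.Random(seed)     # transmitter's noise generator
    rx_noise = random.Random(seed)     # receiver regenerates the same sequence
    received = []
    for s in samples:
        q = quantize(s + tx_noise.uniform(-step / 2, step / 2), step)
        received.append(q - rx_noise.uniform(-step / 2, step / 2))
    return received


def rms(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def lag1_correlation(errors):
    """Normalized correlation between neighboring error samples."""
    num = sum(a * b for a, b in zip(errors, errors[1:]))
    return num / sum(e * e for e in errors)


bits = 3
step = 1.0 / (1 << bits)
ramp = [i / 4096 for i in range(4096)]            # slowly varying test signal

plain_err = [quantize(s, step) - s for s in ramp]
roberts_err = [r - s for r, s in zip(roberts_channel(ramp, bits), ramp)]
```

Both error sequences have nearly the same r.m.s. value, about one step divided by the square root of 12, but the plain quantization error is strongly correlated from sample to sample (the visible contours), while the error left by the noise-added scheme is essentially uncorrelated.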
Fig. 15. Frequency responses of the circularly symmetrical pre-filter H_1(ω) and post-filter H_2(ω). Horizontal axis: ω (radians/picture height).
8.3. BLOCK QUANTIZATION
Fig. 16. Results from Roberts' scheme with pre- and post-filters. (a) 1 bit per sample. (b) 2 bits per sample. (c) 3 bits per sample. (d) 4 bits per sample.
The reduction of bit rate by linear transformation and block quantization was suggested and analyzed by KRAMER and MATHEWS [1956] and HUANG and SCHULTHEISS [1963]. The basic scheme is as follows. A block of N data samples x_i is linearly transformed into y_i by an N x N matrix A. The y_i are quantized and transmitted. At the receiver, the quantized y_i are transformed by another N x N matrix B into z_i. For a given bit rate, the matrices A and B are chosen to minimize the mean-square error between z_i and x_i. It turns out that the optimum matrix A consists of the eigenvectors of the correlation matrix of the samples x_i, and the optimum matrix B is the inverse of A. It appears, however, that a much simpler type of matrix, the so-called Hadamard matrix, works almost as well as the optimum (RADER and CROWTHER [1966], HABIBI and WINTZ [1970]). A Hadamard matrix contains only 1 and -1 as its elements, and its rows are mutually orthogonal. The inverse of a Hadamard matrix is its transpose multiplied by a scalar; for a symmetric Hadamard matrix this is the matrix itself multiplied by a scalar. Fast computer algorithms are available for multiplication by Hadamard matrices (PRATT et al. [1969]).
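The properties of Hadamard matrices mentioned above can be checked with a short Python sketch. It assumes the Sylvester construction (block sizes that are powers of two) and the usual butterfly form of the fast transform; the function names and the sample vector are illustrative and do not come from the cited papers.

```python
def hadamard(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = ([row + row for row in h] +
             [row + [-v for v in row] for row in h])
    return h


def matvec(m, x):
    """Direct matrix-vector product, about N*N operations."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def fwht(x):
    """Fast Hadamard transform in natural order, about N*log2(N) additions."""
    y = list(x)
    h = 1
    while h < len(y):
        for start in range(0, len(y), 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    return y


x = [3, 1, 4, 1, 5, 9, 2, 6]       # arbitrary block of N = 8 samples
y = fwht(x)                        # forward transform
x_back = [v // 8 for v in fwht(y)] # inverse: the same transform divided by N
```

Because the Sylvester matrix is symmetric and its rows are orthogonal, applying the same transform twice returns N times the input, so one routine serves as both encoder and decoder.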
Fig. 17. Hadamard block quantization (block size = 8 x 8). Average number of bits per sample: (a) 1; (b) 2; (c) 3.
The application of Hadamard matrices to block quantizing images was studied by Huang and Woods (HUANG and WOODS [1969], WOODS and HUANG [1970]). Let us assume, for simplicity, that the sampled image is divided into 2 x 2 blocks. Call the intensities of the 4 samples in a block x_1, x_2, x_3 and x_4. These intensities are transformed into y_i by a 4 x 4 Hadamard matrix:
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} \tag{13}$$
Fig. 18. Hadamard block quantization (block size = 16 x 16). Average number of bits per sample: (a) 1; (b) 2; (c) 3.
Fig. 19. Hadamard block quantization (block size = 1 x 256). Average number of bits per sample: (a) 1; (b) 2; (c) 3.
Since in a typical image neighboring samples tend to have nearly equal intensities, the y_i (i ≠ 1) tend to be very small. Therefore, in quantizing the y_i, we use more bits for y_1 and fewer bits for y_2, y_3 and y_4, hoping that we may end up with a small average number of bits per sample and yet get a good quality reconstructed image. This scheme was applied to several images, and various block sizes were tried. It was found that for a given average bit rate, the use of a large block size tended to make the degradation in the reconstructed image appear as random noise, while the use of a small block size made the degradation appear in the form of discontinuities at block boundaries. Some results (using square blocks) are shown in Figs. 17 and 18 (all pictures contain 256 x 256 samples). Note that with 3 bits per sample, the picture quality becomes as good as that of 6-bit originals. In implementing image coding schemes in real time, it is easier to work along a scan line rather than in two dimensions. Therefore, the Hadamard block quantization scheme was also tried with one-dimensional blocks. Some results are shown in Fig. 19 (all pictures contain 256 x 256 samples). Note that the 3-bit picture is as good as the 3-bit picture using 16 x 16 blocks, but the 2-bit and 1-bit pictures have inferior quality when compared to the corresponding pictures using 16 x 16 blocks.
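The 2 x 2 case described above can be sketched as follows. The bit allocation (6, 2, 2, 2), the 6-bit intensity range, the clipping range for y_2 to y_4 and the uniform quantizers are all assumptions chosen for illustration; the text does not state the allocations or quantizers actually used.

```python
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def quantize(value, lo, hi, bits):
    """Clip to [lo, hi] and return the midpoint of one of 2**bits equal cells."""
    levels = 1 << bits
    step = (hi - lo) / levels
    index = int((min(max(value, lo), hi) - lo) / step)
    return lo + (min(index, levels - 1) + 0.5) * step


def transform(block):
    """Multiply a 4-sample block by the 4 x 4 Hadamard matrix."""
    return [sum(h * x for h, x in zip(row, block)) for row in H4]


def code_block(block, bits=(6, 2, 2, 2), peak=63, ac_limit=8):
    """Transform, quantize each y_i with its own bit count, and reconstruct."""
    y = transform(block)
    ranges = [(0, 4 * peak)] + [(-ac_limit, ac_limit)] * 3
    yq = [quantize(v, lo, hi, b) for v, (lo, hi), b in zip(y, ranges, bits)]
    return [v / 4 for v in transform(yq)]   # H4 is its own inverse up to 1/4


smooth = [30, 31, 30, 32]            # neighboring samples nearly equal
y = transform(smooth)                # y_1 is large, the other y_i are small
decoded = code_block(smooth)
bits_per_sample = sum((6, 2, 2, 2)) / 4
```

For this smooth block only y_1 is large, so the hypothetical allocation spends 3 bits per sample on average while every reconstructed sample stays within one intensity level of the original. A block containing a sharp edge would exceed the assumed clipping range and be reproduced less faithfully.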
§ 9. Dual-Mode Coding
9.1. THE BASIC PRINCIPLE
In this section we shall describe a group of redundancy reduction schemes which take advantage of the fact that the response of the human eye to areas of high detail (in particular, contours) in an image is different from its response to areas of low detail.
9.2. COARSE-FINE QUANTIZATION
The human eye can tolerate a lot of quantization noise in areas of high detail in a picture, but relatively little in areas of low detail. Therefore, one can use different numbers of quantization levels for the two types of areas. Based on this fact, Kretzmer devised a "band-splitting" technique (KRETZMER [1956]). He separated the original picture into its low-frequency and high-frequency parts. The low-frequency part was sampled at a low rate and quantized finely, while the high-frequency part was sampled at a high rate but quantized coarsely. At the receiving end the two parts were combined to obtain the resulting picture. A variation of this scheme was reported by KITSOPOULOS and KRETZMER [1961]. The reader is also referred to the coarse-fine scheme of Bisignani et al. (BISIGNANI et al. [1966], RICHARDS and BISIGNANI [1967]). All these schemes gave good quality pictures at an average bit rate of about 3 bits per sample for low resolution pictures (100 x 100 to 200 x 200 samples per frame).
9.3. SYNTHETIC HIGHS
Schreiber's synthetic highs system (SCHREIBER et al. [1959]) took advantage of the fact that the human eye tends to emphasize edges (abrupt changes in brightness) in a picture but is relatively insensitive to the exact amount of the brightness change across an edge; on the other hand, in areas where the brightness changes slowly, quantization noise is easily discernible. Therefore, the edges and the slowly varying parts of a picture were treated differently. The video signal s(x), derived from a picture by scanning, is passed through a low-pass filter with frequency response L(jω). If the bandwidth of the low-pass filter is a given fraction of that of the original video signal s(x), then the output a(x) needs to be sampled only that same fraction as often as s(x); each sample of a(x) still has to have 6 bits to avoid quantization noise. The video signal s(x) is also passed through a differentiator; since ds/dx is large at the edges, this signal contains mainly edge information. If both a(x) and ds/dx are transmitted exactly (and if the channel is noiseless), then we can synthesize the high-frequency part of s(x) by passing ds/dx through a "synthetic highs generator" with a frequency response
$$H(j\omega) = \frac{1 - L(j\omega)}{j\omega} \tag{14}$$
The output of H(jω) will be
$$b(x) = s(x) - a(x) \tag{15}$$
and the sum of a(x) and b(x) is exactly s(x), the original picture. In Schreiber's system, the edge signal, ds/dx, was quantized to eight levels (3 bits); the first level was chosen high enough so that only a few noise points were mistaken for edges, yet low enough so that no significant edge points were missed. The edge information was sent by run-length coding (essentially, the magnitude and the position of each edge point were transmitted). An average bit rate of about 1.5 bits per sample gave good TV quality pictures.
9.4. CONTOUR CODING
To obtain more reduction, an obvious step would be to extend Schreiber's system to two dimensions. This was tried in an ad hoc manner by PAN [1962]. Later, Schreiber suggested a direct mathematical extension of his synthetic highs system to two dimensions (SCHREIBER [1963]). The differentiator is replaced by a gradient operator, and a pair of two-dimensional filters, H_1 and H_2, are required to synthesize the high-frequency part. It can be shown readily that if the low-frequency part a(x, y) and the gradient components ∂s/∂x and ∂s/∂y are sent exactly (and if the channel is noiseless), then one can synthesize the high-frequency part, viz., s(x, y) - a(x, y), exactly, by using appropriate H_1(ju, jv) and H_2(ju, jv), where u and v are spatial frequencies, and the original picture will be reproduced exactly. Graham worked on the problem of how to approximate the gradient so that we can achieve a large amount of reduction and at the same time obtain good received pictures (GRAHAM [1967]). He considered as edge points all points whose gradients had magnitudes greater than a certain threshold. The gradients of these edge points were then transmitted by contour tracing. Fig. 20 illustrates this scheme. Note that a tremendous amount of redundancy reduction was possible using this scheme. However, the reconstructed image suffered from a loss of textures. This was because textures are often high-frequency, low-amplitude signals: they are on the one hand not included in the low-frequency part of the image, and on the other hand not large enough to pass the gradient threshold.
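The exact-reconstruction claim for the two-dimensional synthetic highs system can be checked numerically. The sketch below assumes NumPy, a Gaussian low-pass response, a random stand-in picture and one particular pair of synthesis filters satisfying ju·H_1 + jv·H_2 = 1 - L; these choices are illustrative assumptions and are not necessarily the filters of the cited reports.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
s = rng.random((n, n))                        # stand-in picture s(x, y)

# Discrete spatial frequencies along the two axes (radians per sample).
u = 2 * np.pi * np.fft.fftfreq(n)[:, None]
v = 2 * np.pi * np.fft.fftfreq(n)[None, :]
rho2 = u**2 + v**2

S = np.fft.fft2(s)
L = np.exp(-rho2 / (2 * 0.5**2))              # assumed Gaussian low-pass response
a = np.fft.ifft2(L * S).real                  # low-frequency part a(x, y)

# Gradient components in the frequency domain: differentiation <-> ju, jv.
grad_x = 1j * u * S
grad_y = 1j * v * S

# One possible pair of synthesis filters with ju*H1 + jv*H2 = 1 - L.
safe_rho2 = np.where(rho2 == 0, 1.0, rho2)    # avoid 0/0 at zero frequency
H1 = -1j * u * (1 - L) / safe_rho2
H2 = -1j * v * (1 - L) / safe_rho2

highs = np.fft.ifft2(H1 * grad_x + H2 * grad_y).real   # synthetic highs
reconstruction = a + highs
```

When the gradient is sent exactly, the synthetic highs equal s - a and the sum reproduces the picture to rounding error. The loss of texture described above appears only once small gradients are discarded by the threshold.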
Fig. 20. The contour coding scheme of Schreiber and Graham. (a) Original (256 x 256 samples, 6 bits/sample). (b) Low-frequency part of the picture. (c) Gradients. (d) Synthetic highs. (e) Reconstruction (0.3 bit/sample).
§ 10. Transformational Coding
10.1. PRELIMINARIES
In § 8.3 we described a scheme of block quantizing images using Hadamard transforms. It turns out that we can also achieve good results using Fourier instead of Hadamard transforms. Furthermore, we can encode the transformed variables in many ways other than dividing them into groups and using a fixed number of quantization levels for each group. In short, a variety of transformational coding schemes exist. In this section, we shall describe one such scheme. In order to gain additional insight, we shall approach this scheme from a point of view different from that of block quantization.
10.2. BANDWIDTH COMPRESSION BY LOW-PASSING THE IMAGE
One direct way of reducing the bandwidth of an image is to pass it through a two-dimensional low-pass (spatial) filter. Let the Fourier transform of the original image be F(ju, jv), where u and v are spatial frequencies. The low-passed image has the Fourier transform
$$G(ju, jv) = \begin{cases} F(ju, jv), & \text{if } \sqrt{u^2 + v^2} < R \\ 0, & \text{otherwise} \end{cases} \tag{16}$$
where R is a constant. The low-passed image needs fewer samples and hence fewer bits to transmit. The disadvantage of this simple-minded approach is of course that the low-passed image will appear blurred. In particular, the high-frequency part of the line structures in the Fourier spectrum (due to sharp edges in the original image) is cut off by the low-pass filter; therefore, the edges in the image are blurred.
10.3. THRESHOLDING THE FOURIER TRANSFORM
We can do better, if instead of throwing away indiscriminately all the frequency components higher than R, we look at all the frequency components and transmit only those whose magnitudes are greater than a preset threshold. Note that both the complex amplitude and the location of each frequency component above the threshold have to be transmitted. This scheme, however, still has the disadvantage that high-frequency details that exist only in small areas will be blurred, because the strength of these frequency components may not be large enough to pass the threshold.
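The difference between the two strategies can be illustrated with NumPy on a synthetic picture. Everything in this sketch is an assumption made for illustration: the test picture (smooth shading, one sharp edge, faint texture), the radius R, and the rule of setting the threshold so that about as many components are kept as by the low-pass filter.

```python
import numpy as np

n = 64
rows, cols = np.mgrid[0:n, 0:n]
rng = np.random.default_rng(0)
# Synthetic test picture: smooth shading, one sharp vertical edge, faint texture.
image = (0.5 * np.cos(2 * np.pi * rows / n) + (cols >= n // 2)
         + 0.05 * rng.standard_normal((n, n)))

F = np.fft.fft2(image)
u = np.fft.fftfreq(n, d=1.0 / n)[:, None]    # integer frequency indices
v = np.fft.fftfreq(n, d=1.0 / n)[None, :]

R = 6
lowpass_mask = np.sqrt(u**2 + v**2) < R      # keep components inside radius R
k = int(lowpass_mask.sum())

magnitude = np.abs(F)
threshold = np.sort(magnitude, axis=None)[-k]    # k-th largest magnitude
threshold_mask = magnitude >= threshold          # keep (about) the k strongest


def mse(mask):
    """Mean-square error after keeping only the masked frequency components."""
    reconstruction = np.fft.ifft2(np.where(mask, F, 0)).real
    return float(np.mean((image - reconstruction) ** 2))


mse_lowpass = mse(lowpass_mask)
mse_threshold = mse(threshold_mask)
```

For the same number of retained components, thresholding keeps the strong high-frequency line produced by the edge and therefore gives the smaller mean-square error. The comparison ignores the extra bits needed to transmit the locations of the retained components.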
10.4. PIECEWISE FOURIER TRANSFORM CODING
To circumvent the disadvantage of the Fourier transform thresholding scheme, ANDERSON and HUANG [1969] proposed that the image should be divided into little blocks and the threshold coding applied to each block individually. In this manner, not only can we save small-area details, but we can also make the threshold adaptive, i.e., varying from block to block depending on the contents of the blocks. A digitized image and the magnitude of its Fourier transform are shown in Figs. 21(a) and (b), respectively. Fig. 21(d) shows the magnitude of the Fourier transform of a subsection of the same image (see Fig. 21(c)). Notice that the edge in the subsection (the boundary between the cameraman's head and the sky) gave rise to a strong 45° line in the Fourier transform, Fig. 21(d), which is not observable in the transform (Fig. 21(b)) of the entire image. Therefore, thresholding the subsection transform will preserve the sharpness of the edge in question much better than thresholding the transform of the entire image.
Fig. 21. (a) Digitized image (256 x 256 samples, 6 bits per sample). (b) Logarithm of magnitude of Fourier transform of (a). (c) Digitized image with subsection (16 x 16 samples) indicated. (d) Logarithm of magnitude of Fourier transform of subsection as indicated in (c).
Fig. 22. (a) Image resulting from piecewise Fourier transform coding, subsection size 16 x 16 samples. Average bit rate = 1 bit per sample. (b) Image low-passed to an equivalent bit rate of 1 bit per sample.
The piecewise Fourier transform scheme was studied extensively by Anderson and Huang, who tried various subsection sizes and thresholding methods. It was found that a good way of setting the threshold was to make it proportional to the variance of the intensity values of the samples in each subsection. In Fig. 22(a), we show a reconstructed image from this scheme, with an average bit rate of about 1 bit per sample. The Fourier components above the thresholds were quantized to 16 bits (8 bits for the magnitude, and 8 bits for the phase), and their locations were transmitted by using run-length coding. For comparison, Fig. 22(b) shows a simply low-passed image with the same bit rate (1 bit per sample), and Fig. 23 shows an image resulting from thresholding the Fourier transform of the entire original image (the average bit rate is also 1 bit per sample). Again, for practical reasons, we may want to code along scan lines instead of in two dimensions. Fig. 24 shows a reconstructed image from piecewise Fourier transform coding using each scan line (256 x 1 samples) as a subsection. The average bit rate is about 2 bits per sample.
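A minimal NumPy sketch of the piecewise scheme with a variance-proportional threshold is given below. The proportionality constant c, the small floor, the rule of always keeping the block mean, and the idealized test picture with a single 45° edge are assumptions of this sketch; the quantization of the retained components and the run-length coding of their locations are omitted.

```python
import numpy as np


def piecewise_threshold_code(image, block=16, c=0.5, floor=1e-6):
    """Threshold the Fourier transform of each block separately.

    The threshold is c times the variance of the block's samples. The value
    of c, the small floor (which keeps rounding noise in flat blocks from
    being counted) and always keeping the DC term are assumptions.
    """
    decoded = np.empty(image.shape, dtype=float)
    kept = 0
    for r in range(0, image.shape[0], block):
        for col in range(0, image.shape[1], block):
            tile = image[r:r + block, col:col + block]
            F = np.fft.fft2(tile)
            mask = np.abs(F) > c * tile.var() + floor
            mask[0, 0] = True                      # always send the block mean
            kept += int(mask.sum())
            decoded[r:r + block, col:col + block] = np.fft.ifft2(
                np.where(mask, F, 0)).real
    return decoded, kept


n = 64
rows, cols = np.mgrid[0:n, 0:n]
image = 10.0 + 40.0 * (rows + cols < n)      # a single 45-degree edge
decoded, kept = piecewise_threshold_code(image)
kept_fraction = kept / image.size
```

For this idealized edge every significant component of the edge blocks lies above the threshold, so the edge is reproduced essentially exactly while only a small fraction of the components is retained; flat blocks send only their mean. A natural picture would lose some detail, and the achieved bit rate would also include the cost of coding the component locations.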
Fig. 23. Image resulting from thresholding the Fourier transform of the entire picture. Average bit rate = 1 bit per sample.
Fig. 24. Image resulting from piecewise Fourier transform coding, subsection size 256 x 1 samples. Average bit rate = 2 bits per sample.
§ 11. Coding of Motion Pictures
11.1. PRELIMINARIES
The intraframe coding schemes discussed in §§ 5-10 can of course be used in coding motion pictures. However, in motion pictures, a new form of statistical constraint exists, viz., the constraint among successive frames. Also, the properties of human visual perception of motion should be put to good use. Some of the schemes to be discussed below promise good quality reconstructions at about 1 bit per sample.
11.2. INTERPOLATIVE CODING
The inter-frame redundancy of motion pictures is huge. Therefore, it might be possible to omit alternate (or more) frames in transmission, and synthesize these missing frames at the receiver from the transmitted frames. Linear interpolation yields pictures with jerky motion (CUNNINGHAM [1963 and 1970]). However, the contour interpolation scheme of GABOR and HILL [1961] gave excellent results.
11.3. FRAME-CORRECTION CODING
Extensive statistical measurements (SEYLER [1965]) showed that only about 10% of the samples in a TV picture change by more than 8% in brightness from frame to frame. A bandwidth reduction can be achieved by updating only the samples which change in brightness by more than a certain threshold. This scheme was tried by CUNNINGHAM [1963 and 1970] and MOUNTS [1969], and appears to be the most promising motion-picture coding scheme to date.
11.4. PSEUDORANDOM SCANNING
In some cases, the frame rate is not dictated by motion rendition but rather by flicker. DEUTSCH [1962] used slow pseudorandom dot scanning combined with a long-persistence phosphor in the receiver scanner to reduce the frame rate without introducing flicker.
11.5. VARYING THE SPATIAL RESOLUTION
Being based on the assumption that successive frames are much alike, the schemes discussed above will encounter difficulties during a scene change. Different bandwidth reduction methods have to be used during scene changes. Subjective tests done by SEYLER and BUDRIKIS [1965] indicated that during the first few hundred milliseconds of a new scene, the human eye can tolerate a lot of blurring of the frames. Therefore, these frames can be sampled very coarsely. A similar idea was recently suggested by BEDDOES and MEIER [1970]. They established a relation between the critical flicker frequency and the spatial frequency of the image (the critical flicker frequency becomes lower when the spatial frequency becomes higher), and suggested that this relation be exploited in coding motion pictures by presenting high-quality pictures interleaved with low-quality ones.
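The frame-correction idea of § 11.3 can be sketched in Python as follows. The sketch assumes that each new frame is compared with what the receiver currently displays (so that small changes cannot accumulate unnoticed), that brightness runs from 0 to 63, and that the threshold is 8% of that range; these details and the tiny one-dimensional frames are illustrative assumptions, not the design of the cited systems.

```python
def frame_correction(displayed, new_frame, threshold):
    """Return the samples to transmit and the receiver's updated frame.

    `displayed` is the frame currently held at the receiver. Only samples of
    `new_frame` that differ from it by more than `threshold` are sent, each
    as an (index, value) pair.
    """
    updates = []
    updated = list(displayed)
    for i, (old, new) in enumerate(zip(displayed, new_frame)):
        if abs(new - old) > threshold:
            updates.append((i, new))
            updated[i] = new
    return updates, updated


peak = 63                               # assumed 6-bit brightness range
threshold = 0.08 * peak                 # 8% of the range, as an example value

previous = [10, 10, 10, 10, 50, 50, 50, 50]
current = [10, 11, 10, 30, 50, 50, 49, 20]

updates, shown = frame_correction(previous, current, threshold)
fraction_sent = len(updates) / len(current)
```

Only the two samples that changed markedly are transmitted, and the small fluctuations are left uncorrected. In a real system the positions of the updated samples must also be coded, and a buffer is needed because the number of updates varies from frame to frame.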
§ 12. Coding of Color Pictures
12.1. SAMPLING AND QUANTIZING COLOR IMAGES
An extensive study on the efficient digitization of color images was undertaken by GRONEMANN [1964]. He found that in order to get a good quality reconstruction, at least 5 bits should be used for the luminance and for each of the two chrominance components of a sample. However, the chrominance can be sampled very coarsely, and as a result it requires only about half a bit per sample more to transmit a color picture (TV quality) than to transmit a monochromatic picture of comparable quality.
12.2. DPCM AND DELTA-MODULATION
BHUSHAN [1970] carried out a series of experiments in the real-time coding of color TV signals. DPCM and delta-modulation were applied to the three color components (green, red and blue) derived from the video signal. It was found that at a fixed bit rate of 6 Megabits per second, reconstructed images from delta-modulation were preferred to those from 2-bit and 3-bit DPCM. The reason was that at a fixed bit rate, delta-modulation had more samples than DPCM; therefore, the delta-modulated images appeared sharper (though noisier) than the images from DPCM and were subjectively preferred. Bhushan also tried coding only one color component at a time and found that the eye was most sensitive to degradations in the green component, and least sensitive to degradations in the blue component.
12.3. FRAME-TO-FRAME CODING OF NTSC COLOR TV
Schaphorst and his colleagues at Philco-Ford (SCHAPHORST [1970]) designed an experimental coding system to digitally transmit NTSC color TV signals over AT&T common carrier video circuits. The encoding process is accomplished in two steps. First the NTSC color signal is pre-processed by decoding the input video into the Y luminance signal and one chrominance signal. The chrominance signal alternates between I and Q on a line-by-line basis in a manner similar to the French SECAM system. In the second encoding step the two video signals are converted to digital form by a frame-to-frame coding process and multiplexed for transmission at a rate of 16 x 10^6 bits/sec. At the receive terminal a frame-to-frame decoder converts the received digital signal back to one luminance and one chrominance signal, which in turn are reconverted to the NTSC signal by a post-processor unit. Subjective evaluation of the output picture reveals minor distortion in highly animated scenes. However, the system performance is considered satisfactory for many applications.
§ 13. Some Practical Considerations
13.1. IMAGE QUALITY AND BIT RATE
In §§ 5-12, we described in some detail a variety of redundancy reduction schemes. We shall now discuss very briefly some of the practical considerations in evaluating the performance of redundancy reduction schemes. The two most important things one looks for in a redundancy reduction scheme are of course the quality of the reconstructed image and the average bit rate. However, as we emphasized earlier and emphasize again now, redundancy reduction is only part of the overall image transmission system. In designing an image transmission system, we have to consider all the blocks in the system diagram of Fig. 2 and ponder their interactions in order to reach a good compromise. Therefore, instead of asking for the average bit rate of a redundancy reduction scheme at a particular image resolution (sampling rate), we should investigate how the bit rate changes with a change in the resolution. While the bit rate of a direct PCM system increases as the square of the linear resolution of the image, that of a good redundancy reduction scheme should increase much more slowly (SCHREIBER, HUANG and TRETIAK [1968]). And we should not only judge the quality of the reconstructed image assuming a noiseless channel, but also demand to see the reconstructed image when the channel is noisy. Generally speaking, the more efficient a redundancy reduction scheme is, the more vulnerable it is to channel noise.
13.2. ECONOMY
So far in our discussion, we have completely neglected the problem of economy, which may very well be the overriding factor in most practical system designs. It is therefore imperative to estimate the equipment complexity of redundancy reduction schemes. Most of the schemes discussed in our paper have only been simulated on digital computers. Their real-time implementation will be very complex and expensive. In particular, schemes which encode two-dimensional areas, such as contour coding and transformational coding, need large storage at the transmitter and the receiver. Schemes which use statistical coding also need buffer storage. However, with the rapid progress in digital computers and large-scale integration, the implementation of these complex schemes may become economically feasible in the not too distant future.
§ 14. Comments on Image Quality
14.1. MEAN-SQUARE ERROR CRITERIA
Since image quality is one of the most important factors in evaluating the performance of image transmission systems, it is extremely desirable to have a mathematical expression for image distortion. Without such a distortion measure, comparisons of system performance can only be made through tedious subjective tests. The only distortion measures which have been seriously considered in evaluating image transmission systems are the mean-square error and its variants, such as the weighted mean-square error. These measures have the distinct advantage that they are mathematically tractable. They also appear to agree reasonably well with subjective evaluation in many cases (HUANG [1970]). Let f(x, y) be the input picture, and g(x, y) the output picture, where (x, y) are the spatial coordinates and f and g are brightness. We define the error as
$$e(x, y) = f(x, y) - g(x, y) \tag{17}$$
and denote its Fourier transform as E(u, v), where (u, v) are spatial frequencies. Then the mean-square error is
$$\iint e^2(x, y)\,dx\,dy = \iint |E(u, v)|^2\,du\,dv$$
and the weighted mean-square error
$$\iint W(u, v)\,|E(u, v)|^2\,du\,dv$$
where W(u, v) is called the weighting function. The weighting function reflects the sensitivity of the eye to various spatial frequency components in the picture. The mean-square error criteria have at least two defects. First, the subjective quality of a degraded image g(x, y) depends not only on the error e(x, y) but also on the original image f(x, y). Secondly, some image transmission systems degrade the images geometrically; for example, block quantization using the Hadamard transform (§ 8.3) sometimes yields pictures containing "staircases" along the edges (contours). The mean-square error criteria do not seem appropriate for geometrical distortions. A more satisfactory criterion should be based on some kind of edge error.
14.2. A PROPOSED DISTORTION MEASURE
We propose the distortion measure
$$D = A\,D_a + B\,D_b$$
where A and B are positive constants, D_a is a weighted mean-square error modified to take care of the dependence on the original image, and D_b is a measure of edge error. One possible choice for D_a is
[double integral over the infinite plane; integrand not recoverable from the source]
where W_1 reflects the eye sensitivity, and W_2 reflects the dependence on the original image (and is a function of f). Much experimentation needs to be done to determine the suitable forms of D_b and W_2.
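The mean-square error and weighted mean-square error of § 14.1 can be computed on sampled pictures as in the NumPy sketch below. The random stand-in picture, the 8-level quantization used as the degradation, the Gaussian weighting function and the discrete normalization are assumptions chosen for illustration; the text does not specify a particular W(u, v).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
f = rng.random((n, n))                    # stand-in input picture
g = np.round(f * 7) / 7                   # degraded output: 8 brightness levels

e = f - g                                 # error, as in eq. (17)
E = np.fft.fft2(e)                        # its discrete Fourier transform
total = e.size

# Mean-square error in the spatial and in the frequency domain (Parseval).
mse_space = float(np.mean(e**2))
mse_freq = float(np.sum(np.abs(E)**2) / total**2)

# Hypothetical weighting that emphasizes low spatial frequencies.
u = np.fft.fftfreq(n)[:, None]            # cycles per sample
v = np.fft.fftfreq(n)[None, :]
W = np.exp(-(u**2 + v**2) / (2 * 0.15**2))

weighted_mse = float(np.sum(W * np.abs(E)**2) / total**2)
```

The two unweighted values agree, which is the discrete form of the equality between the spatial and frequency expressions. With a weighting that never exceeds 1, the weighted error cannot exceed the unweighted one; how well any such weighting tracks subjective quality is exactly the open question raised above.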
§ 15. Concluding Remarks
Looking toward the future, we expect that experimental studies in frame-to-frame and color coding will play a growing role. On the theoretical side, we are rather pessimistic about seeing any significant breakthrough, although some progress may be expected through the development of refined psychovisual models and of adaptive coding systems. Since images are perceived in terms of the objects they represent, and not as multi-dimensional random processes (KOLERS [1970]), a useful theory of image coding can probably only be achieved through progress in pattern recognition. In the framework of pattern recognition, one may eventually be able to apply Shannon's model in a meaningful way.
Acknowledgement
This work was supported principally by NIGMS Grants 5 P01 GM 14940-04 and GM 15006-03.
References
ANDERSON, G. B. and T. S. HUANG, 1969, Picture Bandwidth Compression by Piecewise Fourier Transformation, Proc. Purdue Centennial Symp. on Information Processing, Purdue University, Lafayette, Indiana.
Anonymous, 1970, All-Digital Equalizer Gives Modem 14 400-bps Rate over C-2 Line, Communications Designer's Digest, pp. 35-36.
BEDDOES, M. P. and O. MEIER, 1970, IEEE Trans. on Inf. Theo. IT-16, pp. 214-218.
BENNETT, W. R. and J. R. DAVEY, 1965, Data Transmission (McGraw-Hill).
BHUSHAN, A. K., 1970, Transmission and Coding of Color Pictures, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
BISIGNANI, W. T., G. P. RICHARDS and J. W. WHELAN, 1966, IEEE Proc. 54, pp. 376-390.
BROWN, E. F., 1970, Expanded DPCM Coding Techniques for Television, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
CANDY, J. C., 1970, Refinement of a Delta Modulator, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
CHERRY, C., M. H. KUBBA, D. E. PEARSON and M. P. BARTON, 1963, IEEE Proc. 51, pp. 1507-1517.
CORRADETTI, M., 1959, A Method of Planar Prediction for Coding Pictures, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 55, pp. 110-114.
CUNNINGHAM, J. E., 1958, Recording Pictures by Generation of Lowpass and Correction Signals, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report, pp. 136-137.
CUNNINGHAM, J. E., 1963, Image Correction-Transmission Experiments, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 70, p. 244.
CUNNINGHAM, J. E., 1970, Frame-Correction Coding, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
CUTLER, C. C., 1952, Patent no. 2,605,361, July 29 (applied for June 29, 1950).
CUTLER, C. C. (ed.), 1967, IEEE Proc., special issue on Redundancy Reduction.
DE JAGER, F., 1952, Delta Modulation: A Method of PCM Transmission Using the 1-unit Code, Philips Res. Rept., Vol. 7, pp. 442-466.
DEUTSCH, S., 1962, Electronics 35, pp. 49-51.
ELIAS, P., 1955, IRE Trans. on Inf. Theo. IT-1, pp. 16-33.
GABOR, D. and P. C. J. HILL, 1961, Television Band Compression by Contour Interpolation, Proc. IEE (British), Part B.
GRAHAM, D. N., 1962, Optimum Filtering to Reduce Quantization Noise, M.S. thesis, Department of Electrical Engineering, M.I.T.
GRAHAM, D. N., 1967, Proc. IEEE 55, pp. 336-346.
GRAHAM, R. E., 1958, Predictive Quantizing of Television Signals, IRE Wescon Conv. Rec., Part 4, pp. 147-157.
GRONEMANN, U. F., 1964, Coding Color Pictures, M.I.T. Research Laboratory of Electronics, Technical Report no. 422.
HABIBI, A. and P. A. WINTZ, 1970, Linear Transformation for Encoding 2-Dimensional Sources, Technical Report TR-EE 70-2, School of Engineering, Purdue University.
HARRISON, C. W., 1952, Experiments with Linear Prediction in Television, BSTJ.
HUANG, T. S., 1960, A Method of Picture Coding, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 57, p. 109.
HUANG, T. S., 1970, Comments on Picture Distortion Measures, G. T. & E. Laboratories, Waltham Research Center, Technical Memorandum no. 70-428.2.
HUANG, T. S. and O. J. TRETIAK, 1965, Research in Picture Processing, in: Optical and Electro-Optical Information, eds. J. Tippett et al. (M.I.T. Press) Ch. 3.
HUANG, T. S. and O. J. TRETIAK (eds.), 1970, Picture Bandwidth Compression (Gordon and Breach).
HUANG, T. S., O. J. TRETIAK, B. PRASADA and Y. YAMAGUCHI, 1967, IEEE Proc. 55, pp. 331-335.
HUANG, T. S. and J. W. WOODS, 1969, Picture Bandwidth Compression by Block Quantization, Intern. Symp. of Information Theory, Ellenville, New York. (Abstracts in IEEE Trans. on Inf. Theo. IT-16 (1970) no. 1.)
HUANG, J. Y. and P. M. SCHULTHEISS, 1963, IEEE Trans. on Comm. Syst. CS-11, pp. 289-296.
HUFFMAN, D. A., 1952, Proc. IRE 40, pp. 1098-1101.
KITSOPOULOS, S. C. and E. R. KRETZMER, 1961, IRE Proc. 49, pp. 1076-1077.
KOLERS, P., 1970, Reading Pictures: Some Cognitive Aspects of Visual Perception, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
KRAMER, H. P. and M. W. MATHEWS, 1956, IRE Trans. on Inf. Theo. 2, p. 41.
KRETZMER, E. R., 1956, Reduced-Alphabet Representation of Television Signals, IRE Nat. Conv. Rec., part 4.
LIMB, J. O., 1969, Design of Dither Waveforms for Quantized Visual Signals, BSTJ 48, pp. 2555-2582.
LIMB, J. O. and F. W. MOUNTS, 1969, Digital Differential Quantizer for Television, BSTJ 48, pp. 2583-2599.
LUCKY, R. W., J. SALTZ and E. J. WELDON, 1968, Principles of Data Communication (McGraw-Hill).
MARTIN, J., 1969, Telecommunication and the Computer (Prentice-Hall).
MOTT-SMITH, J. C. and T. BAER, 1970, Area and Volume Coding of Picture, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
MOUNTS, F. W., 1969, A Video Encoding System Using Conditional Picture Element Replenishment, BSTJ 48, pp. 2545-2554.
OLIVER, B. M., J. R. PIERCE and C. E. SHANNON, 1948, Proc. IRE 36, pp. 1324-1331.
O'NEAL Jr., J. B., 1966, Delta Modulation Quantizing Noise, BSTJ 45, pp. 117-142.
O'NEAL Jr., J. B., 1966, Predictive Quantizing Systems for the Transmission of Television Signals, BSTJ 45, pp. 689-722.
PAN, J. W., 1962, Picture Processing, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 66, p. 224.
PETERSON, D. P. and D. MIDDLETON, 1962, Information and Control 5, pp. 279-323.
POST, A., 1966, Filtering Quantization Noise, B.S. thesis, Department of Electrical Engineering, M.I.T.
PRATT, W. K., 1967, IEEE Trans. on Inf. Theo. IT-13, pp. 114-115.
PRATT, W. K., J. KANE and H. G. ANDREWS, 1969, IEEE Proc., January.
RADER, C. M. and W. R. CROWTHER, 1966, IEEE Proc. 54, pp. 1594-1595.
REMM, R. L., R. V. COTTON and G. R. STROHMEYER, 1966, Analysis and Implementation of a Delta Modulation Pictorial Encoding System, National Telemetering Conference Proc., pp. 27-34.
RICHARDS, G. P. and W. T. BISIGNANI, 1967, Proc. IEEE 55, pp. 1707-1717.
ROBERTS, L. G., 1962, IRE Trans. on Inf. Theo. 8, pp. 145-154.
ROSENFELD, A., 1968, IEEE Trans. on Inf. Theo. IT-14, no. 4.
SEYLER, A. J., 1965, IEEE Proc. 53, pp. 2127-2128.
SEYLER, A. J. and F. L. BUDRIKIS, 1965, IEEE Trans. on Inf. Theo. 11, pp. 31-43.
SCHAPHORST, R., 1970, Frame-to-Frame Coding of NTSC Color TV, in: Picture Bandwidth Compression, eds. T. S. Huang and O. J. Tretiak (Gordon and Breach).
SCHREIBER, W. F., 1963, The Mathematical Foundation of the Synthetic Highs System, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 68, p. 140.
SCHREIBER, W. F., T. S. HUANG and O. J. TRETIAK, 1968, Contour Coding of Images, Wescon Conv. Record.
SCHREIBER, W. F., C. F. KNAPP and N. D. KAY, 1959, Journal SMPTE 68, p. 525.
44
B A N D W I D T H COMPRESSION OF OPTICAL IMAGES
[I
SCOVILL, F. W., 1965, The SubjectiveEffect of Brightness and Spatial Quantization, M.I.T. Research Laboratory of Electronics, Quarterly Progress Report no. 78. SHANNON, C. E. and W. WEAVER,1949, The Mathematical Theory of Communication (Univ. of Illinois Press). SPENCER, D. A. and T. S. HUANG,1969, Bit-Plane Encoding of Continuous-Tone Pictures, Proc. Symp. on Computer Processing in Communications (PIB Press). WACKS,K., 1970, Analyzing the Performance of an Image Processing System, M.S. thesis, Department of Electrical Engineering, M.I.T. WILKINS, L. C. and P. A. WINTZ,1969, Bibliography on Data Compression, Picture Properties, and Picture Coding, Technical Report no. TR-EE69-10, School of Engineering, Purdue University. WINKLER, M. R., 1965, Pictorial Transmission with HIDM, IEEE International Conv. Record, part I, pp. 285-291. WITT,R., 1963, Private Communication, Comm. and Data Proc. Div., Raytheon Co., Norwood, Mass. WOODS,J. W. and T. S. HUANG,1970, Picture Bandwidth Compression by Linear Transformation and Block Quantization, in: Picture Bandwidth Compression, eds. T.S. Huang and 0. J. Tretiak (Gordon and Breach). YOUNGBLOOD, W. A., 1968, Picture Processing, M.I.T. Research Laboratories of Electronics, Quarterly Progress Report, pp. 95-100.
Note added in proof: Since the completion of this manuscript, two more symposia on picture coding have been held, one at the North Carolina State University, USA, 1970, and the other at Purdue University, USA, 1971. Many papers from the North Carolina symposium were published in a special issue of the IEEE Trans. on Communication Technology, December 1971. Also, the IEEE Proceedings will have a special issue on Digital Picture Processing, July 1972.
II
THE USE OF IMAGE TUBES AS SHUTTERS
BY
R. W. SMITH
Applied Optics Section, Physics Department, Imperial College of Science and Technology, London
CONTENTS
§ 1. INTRODUCTION
§ 2. IMAGE TUBE COMPONENTS
§ 3. ELECTRON-OPTICAL SYSTEMS
§ 4. IMAGE INTENSIFICATION
§ 5. EARLY APPLICATIONS OF IMAGE TUBES TO HIGH SPEED PHOTOGRAPHY
§ 6. IMAGE TUBES DESIGNED FOR USE AS SHUTTERS
§ 7. DEFLECTION SHUTTERS
§ 8. GRID CONTROLLED IMAGE TUBES
§ 9. THE USE OF CONVENTIONAL IMAGE INTENSIFIERS AS HIGH SPEED SHUTTERS
§ 10. STORAGE METHODS
§ 11. BIPLANAR IMAGE TUBES
§ 12. MULTICHANNEL SYSTEMS
REFERENCES
§ 1. Introduction
Some important requirements of a high speed photographic system are (a) the provision of an adequately fast shutter, (b) synchronisation of this shutter to the event being studied and (c) ensuring that enough light reaches the photographic film during the exposure to produce a recording. In principle image tubes can be designed to meet all these requirements, and it is the purpose of this paper to review the image tubes and systems which have been investigated.
The image tube, or image converter, was developed by HOLST, DE BOER, TEVES and VEENEMOUS [1934] as a method for making visible scenes illuminated by infra-red radiation. At one end of the image tube was a photocathode at which the infra-red image of the scene was converted to a photoelectron image. This electron image was transferred by a simple electron-optical system to a phosphor screen. The electrons were given sufficient energy to excite the phosphor material and an image of the scene in the visible region of the spectrum was produced. The subsequent development, during the 1939-45 war, of similar tubes with military applications has been described by SCHAFFERNICHT [1948] and PRATT [1947, 1948].
After the war many workers began to apply existing image tubes to high-speed photography, making use of the ease with which the photoelectron image could be controlled by electric and magnetic fields. It was not until the early 1950's that image tubes were specifically designed for use as shutters.
1.1. HIGH SPEED CAMERA SYSTEMS
There are four main types of high speed photographic system.
1) Single exposure. In its simplest form this is just an objective lens followed by a shutter and the recording photographic plate.
2) Multiple frame. This produces a continuous series of short exposures or frames, usually separated by a fixed time interval. It requires a shutter and some method of either moving the photographic film or arranging that the separate frames are directed to different regions of the film. The speed of the system is usually indicated as the number of exposures or frames per second (f.p.s.).
3) Multiple channel systems. This is an alternative to the multiple frame method when a sequential record is required. Instead of having one camera which performs both the actions of shuttering and spatially separating the frames, a series of separate shutters or cameras in parallel with each other is used. Each camera or channel can either have its own objective lens, which gives parallax between the images, or alternatively a single objective lens followed by a series of beam splitters can be used.
4) Streak systems. This method is used to study the distribution of light along a line element of the event as a function of time. An image of the event is focussed onto a slit and the light which passes through the slit is focussed onto the photographic plate. A suitable arrangement is introduced to sweep this slit image across the plate in a direction normal to the slit. Neighbouring line sections of the final recording show the distribution of light along the selected event line element at successive times.
In the simplest framing camera the film is moved intermittently so that while the exposure is made the film is stationary. These cameras will attain speeds of a few hundred frames per second but they are limited by the strength of the film, which tends to tear at higher framing rates. Their range can be extended by using a continuously moving film and passing the light through a rotating glass prism, which is synchronized to the film, to compensate for the movement of the film during the exposure. There are many types of these prism cameras and they can be used to record at speeds up to ~$10^{4}$ f.p.s. A similar compensation can be obtained by reflecting the images from a rotating mirror onto the film, which is fitted inside a rotating drum. The highest framing rates achieved mechanically are obtained using rotating mirror cameras.
The image of the event is focussed onto a rapidly rotating mirror and is reflected onto a series of lenses arranged in an arc centred on the mirror. As the reflected light beam passes over each lens an image is recorded on the stationary film, which is arranged to lie at the foci of the array of lenses. Framing rates of $2 \times 10^{7}$ f.p.s. can be obtained in this way. This camera can be converted into a streak camera by removing the array of lenses, putting a slit at a suitable intermediate image plane and focussing the image at the film rather than onto the mirror as before.
The Kerr cell has been used to obtain single exposures as short as 5 ns. When many frames are required a multichannel system is used, for example the twelve channel camera described by BARNSLEY [1962]. The four channel system described by EREZ and EYLON [1962] used a single objective lens followed by beam splitters to overcome the problem of parallax. GOSS [1960] used a Kerr cell as a shutter in a system which was a cross between a multiple frame and a multiple channel camera. After a single objective lens the light was split by a series of beam splitters and travelled along different optical paths before it arrived at the Kerr cell. The optical arrangement was such that the separate beams formed well separated images on the film following the Kerr cell. By operating the Kerr cell shutter once, a series of images corresponding to different times at the event was obtained. The exposure time was 5 ns and the framing rate was $7 \times 10^{7}$ f.p.s. Shutters based on other electro-optical effects, e.g. the Pockels effect, have also been discussed.
The review articles by COURTNEY-PRATT [1957] and COLEMAN [1963] describe many other techniques which have been used for high speed photography. For shorter exposure times image tubes are used, and in this paper we shall consider the development and uses of such tubes.
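The optical-delay framing described for the Goss system can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the beams travel in air (refractive index taken as 1) and that successive frames are separated purely by extra path length; the function names are hypothetical and not from the original paper.

```python
# Illustrative sketch: interframe interval and extra optical path per frame
# for a camera that separates frames by optical delay.
# Assumption: propagation in air, refractive index ~1.

C = 299_792_458.0  # speed of light in vacuum, m/s


def interframe_interval(frames_per_second):
    """Time between successive frames, in seconds."""
    return 1.0 / frames_per_second


def path_difference(frames_per_second, refractive_index=1.0):
    """Extra path length (m) needed between successive beams."""
    return C * interframe_interval(frames_per_second) / refractive_index


rate = 7e7  # framing rate quoted in the text, f.p.s.
dt = interframe_interval(rate)
dl = path_difference(rate)
print(f"interframe interval: {dt * 1e9:.1f} ns")
print(f"extra optical path per frame: {dl:.2f} m")
```

On these assumptions each successive beam needs a few metres of additional path, which suggests why only a small number of frames is practical with this approach.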
§ 2. Image Tube Components
The basic image tube system is shown diagrammatically in Fig. 2.1. Radiation from the object O is transferred by the objective lens L onto the semi-transparent photocathode PC. The photoelectrons which leave the photocathode are focussed by an electron-optical system EOS onto a phosphor screen P where the output image is produced. The whole arrangement is contained within a vacuum envelope.
Fig. 2.1. The basic image tube.
The number of photoelectrons emitted from the photocathode depends on the wavelength of the incident radiation and on the illumination; at a given wavelength the photoelectron current is directly proportional to the illumination. The spectral sensitivities of several photocathodes are shown in Fig. 2.2. The different photocathode sensitivity curves are classified by S-numbers, the more common being the S1, silver-oxygen-caesium, the S9, antimony-caesium, and the S20, antimony-caesium-sodium-potassium (tri-alkali).
Fig. 2.2. Photocathode sensitivity and phosphor output.
The curves labelled quantum efficiency can be considered to show the efficiency of the photoelectric process as the probability that a photon will cause the emission of a photoelectron from the photocathode surface. The photoelectrons are emitted with a distribution of emission energies and directions. For monochromatic illumination the emission energy can lie between zero and a maximum energy E given by Einstein’s equation
$$E = h\nu - \phi \qquad (2.1)$$
where $\phi$ is the work function of the surface and $h\nu$ is the energy of a photon of frequency $\nu$. For a typical photocathode $\phi \sim 2$ eV, and so putting $h\nu \sim 3$ eV (blue light) this maximum energy is ~1 eV.
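As a numerical check on Einstein's equation, the following sketch evaluates the maximum emission energy for a given wavelength and work function. The wavelength of 413 nm is an assumed value chosen so that the photon energy is close to 3 eV; it does not appear in the original text, and the function names are hypothetical.

```python
# Illustrative sketch of the maximum photoelectron emission energy,
# E = h*nu - phi. The 413 nm wavelength is an assumption chosen to give
# a photon energy near 3 eV.

H = 6.62607015e-34          # Planck constant, J s
C = 299_792_458.0           # speed of light, m/s
E_CHARGE = 1.602176634e-19  # electron charge, C (also J per eV)


def photon_energy_ev(wavelength_m):
    """Photon energy h*nu in electron-volts."""
    return H * C / wavelength_m / E_CHARGE


def max_emission_energy_ev(wavelength_m, work_function_ev):
    """Maximum emission energy in eV; zero if the photon cannot free an electron."""
    return max(0.0, photon_energy_ev(wavelength_m) - work_function_ev)


blue = 413e-9  # assumed wavelength, m
print(f"photon energy: {photon_energy_ev(blue):.2f} eV")
print(f"max emission energy: {max_emission_energy_ev(blue, 2.0):.2f} eV")
```

With a work function of 2 eV the result is close to the 1 eV quoted above, while a red photon of about 700 nm would fall below threshold for such a surface.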
The direction of emission of an electron with respect to the normal to the surface follows an approximately cosine distribution. The momentum $p$ of a particular electron which is emitted at an angle $\theta$ to the normal to the surface, and with emission energy $eV_0$, can be resolved into components $p\cos\theta$ and $p\sin\theta$ which are normal and parallel to the surface respectively. The velocity of emission $v$ of this electron is
$$v = p/m = (2eV_0/m)^{1/2}$$
where $e$ and $m$ are the electron charge and mass. In some image tubes the momentum component $p\cos\theta$ is important as it can determine the time resolution of the device, while the spatial resolution can be a function of both momentum components.
The delay between the absorption of a photon and the emission of the corresponding photoelectron is estimated to be ... s, and so when an image of a rapidly changing event is focussed onto a photocathode the electron stream which leaves it contains spatial and temporal information about the event. A section of the electron stream represents the event at a given time, and a section taken at the same position in the tube at a later time represents the event at a later time.
The electrons can be detected by accelerating them to a high kinetic energy, ~10 keV, before they strike a phosphor screen where ~1000 photons are emitted per incident photoelectron (MANDEL [1955]). The spectral composition of the emitted light depends on the phosphor material and, corresponding to the S numbers of the photocathodes, the phosphors are labelled by P numbers; for example P11 phosphor is silver activated zinc sulphide and has a spectral output with a peak at a wavelength of ~4700 Å (Fig. 2.2).
The phosphor material is usually deposited on a glass plate and is covered by a thin film of aluminium. This film is sufficiently thin to allow the electrons to pass through and serves several purposes. Firstly, it provides the electrical conductivity which defines the potential across the phosphor screen and prevents it from charging up. Secondly, the light produced at the phosphor screen is emitted in all directions and the aluminium film reflects more of it in the required direction. It also prevents the feedback of light to the photocathode, which otherwise would cause overloading and instability. Thirdly, the aluminium prevents transmission of any of the incident light which was not absorbed by the photocathode.
It is necessary to include an electron-optical system to transfer the electron image from the photocathode to the phosphor screen and also to give the electrons sufficient energy to excite the phosphor screen material.
§ 3. Electron-Optical Systems
3.1. BIPLANAR IMAGE TUBES
The image tube developed by HOLST, DE BOER, TEVES and VEENEMOUS [1934] consisted of a plane photocathode PC and a plane phosphor screen P arranged parallel to each other and a small distance apart, Fig. 3.1. The phosphor screen did not have an aluminium backing but this did not cause serious problems as the spectral sensitivity curve of the photocathode, mainly infra-red, and the output from the phosphor screen, predominantly green, were spectrally separated.
Fig. 3.1. Diagram of the Holst image tube.
In this simple electron-optical system a potential difference is applied between the photocathode and the phosphor screen, and an electron which leaves the photocathode travels along a parabolic path before it strikes the phosphor screen. An electron which is emitted with an energy $eV_0$ at an angle $\theta$ to the normal has a component of its velocity parallel to the photocathode surface
$$u_p = (2eV_0\sin^2\theta/m)^{1/2}. \qquad (3.1)$$
The transit time of this electron between the photocathode and phosphor screen is
$$t = \left(\frac{2m}{e}\right)^{1/2}\frac{d}{V_A}\left[(V_A + V_0\cos^2\theta)^{1/2} - (V_0\cos^2\theta)^{1/2}\right] \approx d\left(\frac{2m}{eV_A}\right)^{1/2} \qquad (3.2)$$
where $d$ is the photocathode to phosphor screen spacing and $V_A$ is the potential difference between them. In practice $V_A \gg V_0$ and so the approximation shown in eq. (3.2) can be made. The distance $x$ at which the photoelectron strikes the phosphor screen from the normal through its point of emission is
$$x = t\,u_p = 2d(V_0\sin^2\theta/V_A)^{1/2}. \qquad (3.3)$$
The distribution of the electron arrival points $x$ over the phosphor screen will be a complicated function depending on the spectral distribution of the illuminating radiation, the spectral sensitivity of the photocathode and the distributions of emission energy and angle. Eq. (3.3) can be used to define an order of magnitude point spread function, normally called the disc of confusion, which has a radius ~$x$. Typically the maximum value of $V_0\sin^2\theta$ is ~1 V and the diameter of the disc of confusion is ~$4d/(V_A)^{1/2}$, with $V_A$ expressed in volts. Thus putting $d = 2$ mm and $V_A = 10$ kV, this diameter is ~0.08 mm and image resolutions of 10-20 lp/mm* are obtainable. The simple geometry of this biplanar arrangement has the advantage that there is no particular axis to the device, and hence the resolution is uniform over the output phosphor screen and there is no distortion. The image magnification is unity.
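The disc-of-confusion estimate can be reproduced numerically. The sketch below assumes the large-$V_A$ limit of the transit time, $t \approx d(2m/eV_A)^{1/2}$, which is the form implied by eq. (3.3), and takes the maximum of $V_0\sin^2\theta$ as 1 V as in the text. The function names are hypothetical.

```python
# Illustrative sketch of the biplanar (proximity focussed) tube estimates.
# Assumption: V_A >> V_0, so the transit time is d * sqrt(2m / (e * V_A)).
import math

E_OVER_M = 1.75882001076e11  # electron charge-to-mass ratio, C/kg


def transit_time(d, v_a):
    """Approximate photocathode-to-screen transit time in seconds."""
    return d * math.sqrt(2.0 / (E_OVER_M * v_a))


def landing_offset(d, v_a, v0_sin2):
    """Lateral displacement x = 2 d sqrt(V0 sin^2(theta) / V_A), in metres."""
    return 2.0 * d * math.sqrt(v0_sin2 / v_a)


d = 2e-3     # photocathode to screen spacing, m
v_a = 10e3   # accelerating potential, V
t = transit_time(d, v_a)
diameter = 2.0 * landing_offset(d, v_a, 1.0)  # disc of confusion diameter

print(f"transit time: {t * 1e12:.0f} ps")
print(f"disc of confusion diameter: {diameter * 1e3:.2f} mm")
```

The diameter agrees with the ~0.08 mm quoted above, and the transit time of a few tens of picoseconds indicates why the normal momentum component matters for time resolution.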
3.2. ELECTROSTATIC LENSES
By incorporating a structure of axially symmetrical electrodes, held at various potentials, between the photocathode and the phosphor screen it is possible to focus the electrons which leave a point on the photocathode onto an image surface at the phosphor screen. Unfortunately it is not possible to design a system which has no field curvature and so most designs have a curved photocathode or phosphor screen. The 'chromatic' aberration caused by the distributions of emission energy and angle limits the image resolution. The field of view is limited by the distortion and field curvature, and the image is usually demagnified. Particular electrode arrangements will be discussed later.
3.3. SHORT MAGNETIC LENSES
This system is often a combination of electrostatic and magnetic lenses, the magnetic field being produced by a short coil. There is usually a series of focussing conditions giving images of different magnification and orientation, the latter being produced by image rotation in the magnetic field.
3.4. UNIFORM MAGNETIC FIELD
A uniform magnetic field is arranged parallel to the axis of an image tube which has a uniform electric field between its photocathode and phosphor screen. The path followed by a photoelectron is a helix of increasing pitch as the electron travels through the tube. The diameter of the helical path is
$$b = 2(2m/e)^{1/2}(V_0\sin^2\theta)^{1/2}/B, \qquad (3.4)$$
where $B$ is the magnetic field. The time taken by the electron to complete one cycle of the helix is dependent only on $B$, i.e. $2\pi m/eB$, and so if the electron completes an integral number of cycles during its transit time between the photocathode and phosphor screen the image will appear in focus. The condition for focus is that
$$(e/2m)^{1/2}\,dB/(\pi V_A^{1/2}) = n \quad \text{(integer)}. \qquad (3.5)$$
However, a finite disc of confusion is produced because the electrons have slightly different transit times, as indicated by eq. (3.2). Image resolutions of ~100 lp/mm are possible and this method is referred to as loop focussing.
* The limiting spatial resolution in image tubes is usually indicated in terms of line pairs/mm, i.e. the number of pairs of black and white lines per millimetre which are just resolvable.
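The loop focussing condition can be illustrated by solving it for the field needed to give $n$ complete loops, that is, by requiring the transit time $d(2m/eV_A)^{1/2}$ to equal $n$ cyclotron periods $2\pi m/eB$. The tube dimensions used below ($d = 10$ cm, $V_A = 10$ kV) are assumed purely for illustration and are not taken from the text; the function names are hypothetical.

```python
# Illustrative sketch of loop focussing in a uniform magnetic field.
# Assumptions: V_A >> V_0, and hypothetical tube values d = 0.1 m, V_A = 10 kV.
import math

E_OVER_M = 1.75882001076e11  # electron charge-to-mass ratio, C/kg


def focus_field(n, d, v_a):
    """Field B (tesla) giving n complete helical loops over the spacing d."""
    return n * math.pi * math.sqrt(2.0 * v_a / E_OVER_M) / d


def cyclotron_period(b_field):
    """Time for one loop of the helix, 2*pi*m/(e*B), in seconds."""
    return 2.0 * math.pi / (E_OVER_M * b_field)


def helix_diameter(b_field, v0_sin2):
    """Helix diameter b = 2 sqrt(2m/e) sqrt(V0 sin^2(theta)) / B, in metres."""
    return 2.0 * math.sqrt(2.0 / E_OVER_M) * math.sqrt(v0_sin2) / b_field


d, v_a = 0.1, 10e3            # assumed spacing (m) and potential (V)
b1 = focus_field(1, d, v_a)   # single-loop focus
print(f"single-loop focus field: {b1 * 1e3:.1f} mT")
print(f"helix diameter for V0 sin^2(theta) = 1 V: {helix_diameter(b1, 1.0) * 1e3:.2f} mm")
```

Doubling the field gives the two-loop focus and halves the helix diameter, which is the trend exploited when the field is made strong enough that exact loop focussing is no longer needed.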
3.5. STRONG UNIFORM MAGNETIC FIELD
If the magnetic field $B$ in eq. (3.4) is increased the diameter of the helical path decreases and can be made sufficiently small so that the image resolution is adequate without the necessity of using loop focussing. This is referred to as 'brute force' focussing and is particularly useful in tubes which have storage or electron drift sections (§ 10).
3.6. PHOTOCATHODE RESISTANCE AND SPACE CHARGE
The photocathodes used in image tubes are normally deposited on glass substrates and can have resistances of ~MΩ/square. Under moderate conditions of illumination the photoelectron current may be sufficient to give potential differences across the photocathode of several volts. However, when short exposures are considered the currents taken during the exposure time are much larger and can produce large electric fields across the surface. These electric fields increase distortion and degrade the images, and hence they must be avoided. This can be done by reducing the effective photocathode resistance by depositing a transparent conducting layer onto the glass substrate before the photocathode is formed. A resistance of ~100 Ω/square is readily obtainable using a nesa coating (stannic oxide). Other methods of reducing the resistance are to use metallic conducting layers or to embed a metal mesh into the surface of the glass substrate (GARFIELD et al. [1969]). The latter gives a resistance of ~0.1 Ω/square and can be used with the S20 photocathode, for which the nesa layer is unsuitable as it is attacked by the sodium used in the preparation of the cathode.
An alternative method of reducing the effect of photocathode resistance is to fix a mesh parallel to the photocathode on the outside of the tube. The capacitance between this mesh and the photocathode is sufficient to hold the potential of the photocathode surface constant during the short exposures when large currents are drawn. This has been studied by STEWART and WANIECK [1963].
Space charge can also contribute to the fall in image resolution and increase in distortion when large currents are drawn from the photocathode. The effects are largest where the electrons are moving slowly, which is usually near the photocathode, and can therefore be reduced by increasing the electric field near the photocathode. It must be pointed out that it is very difficult to distinguish between the effects caused by photocathode resistance and space charge, and it is usually advisable to avoid both if possible.
§ 4. Image Intensification
The number of photons emitted from a phosphor screen per incident electron is ~700-1000. If it is assumed that the spectral quality of the light incident at the photocathode and that leaving the phosphor screen are the same, and that the photocathode is 10 % efficient, then the light gain through the image tube is ~70-100. Unfortunately the lens which is used to transmit the image from the phosphor screen to the photographic plate collects only ~5 % of the light and so an overall gain of ~3-5 is obtained in practice.
In terms of light collection efficiency an image tube system has an important advantage over the high speed rotating mirror cameras in that it can usually be operated with a high aperture optical system (~f/2) for transferring the light from the event to the photocathode, whereas the rotating mirror camera in general operates at ~f/20. The efficiency of the photocathode is ~10 % whereas that of the photographic plate is ~0.1 %, and thus the photocathode is a much more sensitive detector.
The detective quantum efficiency of a detector is defined as the ratio of the square of the signal to noise ratio that occurs in the output signal to the square of the signal to noise ratio of the input light signal (ROSE [1948], JONES [1959]). The quantum efficiency of a complete system has been discussed by FELLGETT [1958] and MANDEL [1959] based on the assumption that the distribution of input light is Poissonian (MANDEL [1958]). The quantum efficiency of an image tube-lens-photographic film combination is given by FELLGETT [1958] as
$$q_s = q_0\left(1 + \frac{1}{m\alpha q_p}\right)^{-1} \qquad (4.1)$$
where $q_s$, $q_0$ and $q_p$ are the quantum efficiencies of the system, the photocathode and the photographic emulsion respectively, and $\alpha$ is the lens coupling efficiency. The quantity $m$ is the number of photons which leave the phosphor screen for each photoelectron which leaves the photocathode. Table 4.1 shows the quantum efficiency of the system as a function of $m$ with $\alpha = 5$ %, $q_p = 0.1$ % and $q_0 = 10$ %.

TABLE 4.1
$m$          700    7000   $5\times10^{4}$   $3.5\times10^{6}$   $\infty$   700*   $5\times10^{4}$*
$q_s/q_0$    0.03   0.25   0.7               0.95                1          0.25   0.96
$q_s/q_p$    3      25     70                95                  100        25     96
* see § 4.3.
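The dependence of the system quantum efficiency on $m$ can be explored with a short calculation. The sketch below assumes Fellgett's expression in the form $q_s = q_0/(1 + 1/m\alpha q_p)$, with $q_0 = 10$ % for the photocathode and $q_p = 0.1$ % for the photographic emulsion as stated for Table 4.1; the function names are hypothetical. Because of rounding in the published table, the computed ratios need not agree with every tabulated entry to the last digit.

```python
# Illustrative sketch of the overall light gain and of the system quantum
# efficiency, assuming q_s = q0 / (1 + 1 / (m * alpha * qp)).

Q0 = 0.10    # photocathode quantum efficiency (10 %)
QP = 0.001   # photographic emulsion quantum efficiency (0.1 %)


def overall_light_gain(photons_per_electron, q_cathode, alpha):
    """Photons collected at the film per photon arriving at the photocathode."""
    return photons_per_electron * q_cathode * alpha


def system_qe(m, alpha, q0=Q0, qp=QP):
    """System quantum efficiency for m photons per photoelectron."""
    return q0 / (1.0 + 1.0 / (m * alpha * qp))


print(f"overall gain, simple tube with lens: {overall_light_gain(700, Q0, 0.05):.1f}")

# (m, alpha) pairs: lens coupling (5 %) and fibre optic coupling (50 %)
cases = [(700, 0.05), (7000, 0.05), (5e4, 0.05), (700, 0.5), (5e4, 0.5)]
for m, alpha in cases:
    qs = system_qe(m, alpha)
    print(f"m = {m:g}, alpha = {alpha:g}: q_s/q_0 = {qs / Q0:.2f}, q_s/q_p = {qs / QP:.0f}")
```

The loop makes the main point of the table visible: the system only approaches the photocathode efficiency once the product $m\alpha q_p$ is well above unity, either through more gain or through better coupling.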
For a simple image tube $m$ is ~700 and hence the quantum efficiency of the system is three times greater than that of the photographic plate. However, the full potential of the photocathode is not being utilised. To do this $m$ must be increased by introducing a gain mechanism between the primary photocathode and the output phosphor screen.
4.1. CASCADE IMAGE INTENSIFIERS
A small gain could be obtained by using several image tubes in series with lenses transferring the image from one phosphor screen to the following photocathode. A more efficient way of coupling the image tubes is to make a composite tube with the phosphor screen of one stage deposited onto one side of a thin transparent membrane and the photocathode of the next stage deposited on the other side (Fig. 4.1). If the membrane is sufficiently thin the light is efficiently transferred with only a small loss in resolution caused by light spreading within the membrane. The membrane is usually mica of thickness ~4 μm. The phosphor material and photocathodes are chosen such that their spectral emission and sensitivity curves are matched. If the quantum efficiency of the photocathode is 10 % and each incident electron produces ~700-1000 photons at the phosphor screen, then the electron gain at such a phosphor-mica-photocathode multiplying screen is ~70-100. An image tube having three such stages of gain would have a value of $m$ of ~$3.5 \times 10^{6}$ and the quantum efficiency of the system would be ~95 % that of the primary photocathode. The number of photons reaching the photographic film for each photon at the primary photocathode is $m q_0 \alpha$.
Thus with the cascade intensifier the exposure at the film is 2000 times what it would have been had the photographic plate been used alone. It is assumed in this comparison that the objective lenses which transfer the light from the event to the photocathode and from the event to the photographic film are the same. It has also been assumed that no primary electrons are lost, e.g. by scattering at the first phosphor screen. The detailed analysis of this kind of image intensifier has been given by MANDEL [1959]. These tubes are usually loop focussed in a uniform magnetic field, but GUYOT, DRIARD and SIROU [1966] describe cascade tubes in which each stage is focussed by a short magnetic lens.
4.2. TRANSMISSION SECONDARY ELECTRON MULTIPLICATION INTENSIFIER (TSEM)
The gain process in this kind of intensifier is secondary electron emission, which occurs when an energetic electron strikes a thin layer of potassium chloride. An incident electron with an energy of ~5 keV will cause the emission of 5-6 secondary electrons from the other side of the layer. The secondary emitting layer is usually deposited on a supporting layer of aluminium oxide with a conducting layer of aluminium between them, the total thickness being ~0.1 μm (WILCOCK, EMBERSON and WEEKLY [1960]). By using five of these screens in series an electron gain of ~3000 can be obtained. The tubes are loop focussed in a uniform magnetic field.
4.3. FIBRE OPTIC COUPLING
If the output phosphor screen is deposited on one side of a vacuum-tight fibre optic plate and the photographic emulsion is pressed against the other side, an optical coupling of 50 % is attainable. This is ten times greater than that for a lens system, and the figures marked with an asterisk in Table 4.1 give the effective quantum efficiency of an image tube system incorporating fibre optic coupling. If the photocathode of the tube is also deposited on a fibre optic plate it is possible to butt a series of tubes together to form a cascade intensifier. An advantage of this system is that the fibre optic plates can be made plane on one side and curved on the other to accommodate the curved photocathodes and phosphor screens required in electrostatically focussed tubes (SIEGMUND [1968]).
4.4. IMAGE DECAY TIME IN IMAGE INTENSIFIERS
The output of light from the phosphor screen of a simple image tube exposed to a short pulse of light shows a characteristic exponential decay, the time constant of which depends on the particular phosphor material. For P11 phosphor it is ~400 μs. When several image tube stages are used in series, as in a cascade intensifier, the response is no longer exponential but shows a maximum. It is an advantage that the light is emitted over a relatively long period as this reduces the current density in the later stages of the tube. If this were not so and the phosphor decay times were very short, space charge and photocathode resistance would seriously affect the resolution in the later stages. The TSEM image intensifiers do not have this advantage as the secondary emission process has a shorter decay time than the light emission from a phosphor material.
§ 5. Early Applications of Image Tubes to High Speed Photography
COURTNEY-PRATT [1949] applied an image tube developed at AEG in Germany during the war (SCHAFFERNICHT [1948] and PRATT [1948]) to high speed photography. This was an electrostatically focussed tube which had a curved photocathode PC and a plane phosphor screen P with an aluminium backing (Fig. 5.1). Courtney-Pratt used the streak technique (§ 1.2) to study the initiation of explosive reactions by electric sparks. The slit was placed near the explosion and the light which passed through it was focussed onto the photocathode. The photoelectron image was deflected magnetically, along a direction perpendicular to the slit, as it passed through the image tube to give the streaked image on the phosphor screen. Magnetic deflection had to be used as the tube was electrostatically screened. Courtney-Pratt used a sinusoidal deflection current, making use of the reasonably linear part of the waveform. The streak velocity and the image tube resolution were 5 mm/μs and 10 lp/mm respectively, corresponding to a time resolution of 0.5 μs. He also used the tube to produce a series of nine frames of a small propeller rotating at 3600 r.p.m., using a mechanical commutator to switch the currents supplied to two orthogonal sets of coils. The interframe exposure time was 0.8 ms and the currents in the coils were changed sufficiently quickly so that no further shutter was required.
Fig. 5.1. Diagram of the AEG tube (length approx. 15 cm).
Another early application of image tubes was made by HOGAN [1951a, 1951b], who used an electrostatically focussed tube, the 1P25, developed by RCA. The electron optical system of this tube was a series of coaxial cylindrical electrodes (Fig. 5.2). Hogan used the tube as a single exposure camera by applying the overall tube potential as a pulse of 2 kV amplitude and 2 μs duration. The correct working potential was distributed to each electrode by means of a capacitative potential divider. The image resolution was 10 lp/mm. He also used the system as a stroboscope by applying the voltage pulses at a frequency of $2 \times 10^{6}$ pulses per sec. Unfortunately the tube did not have an aluminium backing on the phosphor screen and leakage of light from the event to the recording film was a problem. However, as the photocathode was mainly sensitive in the infrared region of the spectrum and the light emitted by the phosphor screen was blue-green, a filter could be used to prevent leakage of the infrared radiation to the film.
Fig. 5.2. Diagram of the RCA type 1P25 tube (length approx. 10 cm).
Both the AEG tube and the 1P25 tube produced large amounts of distortion and the image resolution at the edge of the field was low.
§ 6. Image Tubes Designed for Use as Shutters
6.1. THE MULLARD ME1201
JENKINSand CHIPPENDALE [I9511 described an image tube which had been designed specifically for use as a shutter. Between its plane photocathode PC and phosphor screen P there was a control or grid electrode G (Fig. 6.1). The tube was focussed by means of a short magnetic lens MC and with the
/
1'
!zm
/
1I11
J
Fig. 6.1. Diagram of the Mullard ME1201 tube (length approx. 24 cm).
phosphor screen and photocathode potentials at 6 kV and zero respectively, conditions of focus were obtained for a series of combinations of the magnetic lens current and the potential applied to the grid electrode. Usually a focussing condition with the grid potential at 3 kV was used, it being necessary to stabilize the magnetic lens current to 3 %. The image magnification was four. The tube could be operated as a shutter in various ways. It was found that no photocurrent passed to the phosphor screen when the grid potential was held at -60 V with respect to the photocathode. One method of shuttering was to hold the grid potential at -60 V and to switch it to 3 kV for the duration of the exposure -1O-’s. Alternatively, with the grid at 3 kV, the photocathode could be held at 3.1 kV and a pulse applied to reduce its potential to zero during the exposure. Pulses could be applied to both the grid and cathode. For example, with the grid and cathode initially set at 3.1 kV and 3.0 kV respectively, the shutter was opened* by switching the cathode to zero. The shutter was closed by switching the grid to -100 V. JENKINS and CHIPPENDALE [1951] discussed the importance of having low photocathode resistance to avoid distortions which were produced when large currents were drawn. TURNOCK [1951] studied the distortions produced by space charge near the photocathode by comparing two tubes which had different axial electric fields near the photocathode. The tube with the higher electric field could be used to give shorter exposures without serious distortion. For the two tubes these exposure times were 1 µs and 0.1 µs. These experiments were also discussed by MEEK and TURNOCK [1952]. In modern tubes which are designed to work at even shorter exposure times it is important to keep the photocathode resistance as low as possible to avoid distortion.
To reduce the effects of space charge it is necessary to have a high axial electric field at the photocathode and to use the most efficient optical coupling between the phosphor screen and the photographic plate as this reduces the photocurrent required to give an adequately bright image. A modified version of the ME1201 was used by COURTNEY-PRATT [1952]. This was the ME1200 which had no grid electrode. He operated it as a streak camera using magnetic deflection as in his previous work. He was able to demonstrate the time resolution of the streak method by sending the light which passed through one part of the slit, along an optical delay before it formed an image at the photocathode. The streak record of the event,
* As the image tube is being used as a shutter it is convenient to use the terms open and shut to refer to the conditions for which the photo-electrons can or cannot travel to the phosphor screen respectively.
an explosion, was divided into two parts corresponding to the direct and delayed light. The displacement between the two sections of the record corresponded to the time delay. By this means he was able to measure delays of 10⁻⁸ s to an accuracy of 20 % and he concluded that the probable instrumental resolving time was 10⁻⁹ s. This corresponded to a streaking speed of 100 mm/µs and a spatial resolution of 10 lp/mm. In practice a speed of 60 mm/µs was obtained. RICHARDS [1952] discussed in detail the ancillary equipment required for a high speed camera system based on the ME1201 image tube. He reduced image distortion by using two short magnetic lenses separated from each other. The distortion was found to be proportional to the integral of the magnetic field along the electron path whereas focussing conditions depend on the integral of the square of the magnetic field. Hence by having opposite magnetic fields produced by the lens coils the focussing condition could be met whilst the distortion produced by one lens was compensated by the distortion produced by the other. The image resolution was 9 lp/mm and 13 lp/mm for exposures of 2 µs and 10 µs respectively. CHIPPENDALE [1952] described the operation of the tube as an effective two electrode tube or diode tube, with the grid electrode connected to the phosphor screen electrically. This had the advantage that it prevented any photoelectrons emitted from the grid electrode from reaching the phosphor screen. During the manufacture of the tube some of the caesium which is used in the processing of the photocathode can stray onto the grid electrode and, by reducing the surface work function, can increase the probability of photoemission from it. If the grid potential is less than that of the phosphor screen these photoelectrons are able to travel to the phosphor screen even when the tube is shut, i.e. when photoelectrons from the photocathode cannot travel to the phosphor screen.
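The compensation argument attributed to Richards above can be written compactly. The following is an idealised sketch rather than anything taken from the source: it assumes two identical short lenses, each of effective length $\ell$, carrying uniform axial fields $B_1$ and $B_2$ (these symbols are introduced here for illustration only).

```latex
% Assumed model: two identical lenses of effective length \ell with
% uniform axial fields B_1 and B_2 (illustrative symbols, not from the source).
\text{distortion} \;\propto\; \int B\,dz = (B_1 + B_2)\,\ell ,
\qquad
\text{focusing} \;\propto\; \int B^{2}\,dz = (B_1^{2} + B_2^{2})\,\ell .
% With opposed fields of equal strength, B_1 = B and B_2 = -B:
\int B\,dz = 0 ,
\qquad
\int B^{2}\,dz = 2B^{2}\ell .
```

Under these assumptions the first-power integral cancels while the squared integral adds, which is why reversing the field in one coil can remove the distortion without losing the focussing action.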
Chippendale showed a series of single exposure photographs of a flash tube at exposures of 0.1 µs taken at 1, 2, 35, 60, 70 µs after the triggering of the flash tube. The minimum exposure time obtained was 3 x IO-'s as below this the pulse which was applied between the phosphor screen and the photocathode departed seriously from the ideal square pulse shape and caused defocussing of the image. When the tube is operated in the triode mode, i.e. with the shutter pulses applied to the grid electrode, all the electrons which reach the phosphor screen arrive with the same energy, ~6 keV. It is assumed that any effects due to the electron transit time can be neglected. The image at the phosphor screen will be in focus for one particular setting of the grid electrode potential. Hence, although a flat top shutter pulse is used, the electrons emitted during the rise and fall times of this pulse form a slightly defocussed image
and this becomes a more serious problem as the exposure time gets shorter. However, when the tube is operated in the diode mode, with the grid electrode and the phosphor screen at the same potential, only those electrons emitted during the flat portion of the shutter pulse arrive at the phosphor screen with an energy of 6 keV. Those electrons emitted during the rise and fall times of the pulse arrive with lower energies and hence cause the emission of less light from the phosphor screen. This helps to reduce the problem of the defocussed image and also means that the effective exposure time is shorter than the shutter pulse. This was dealt with by RICHARDS [1952] and discussed analytically by BARRAULT and KEKEZ [1969]. GIBSON et al. [1954] used the ME1201 in this diode mode to study explosions with single exposures of 100 ns. GROVER [1961] used exposures of 10 ns although he was troubled by leakage of light through small holes in the aluminium backing of the phosphor screen. KING [1955] and KING and HETT [1961] described a framing camera system based on the ME1201 used in the diode mode. The images were deflected magnetically using a staircase deflection waveform so that each image was stationary on the phosphor screen. The exposure times available were 0.1, 0.3, 1, 3, 10 µs with an interval of not less than five times the exposure time between the frames. The maximum framing rate was 2 × 10⁶ f.p.s. MARTONE and SEGRE [1962] described a similar system. SAXE and CHIPPENDALE [1955] and SAXE [1957] obtained single exposures of 3 ns by building an ME1201 tube into a coaxial line system and used it to photograph the development of an electric spark in air. When such short exposures are being considered the problem of synchronization between the camera and the event becomes very important. One method is to use a photomultiplier to detect emission of light from the event and to trigger the camera from the photomultiplier. This works well for mechanical shutters, e.g.
rotating mirror cameras, as the photomultiplier detects the event long before there is sufficient light to record on the photographic emulsion. However, when an image tube is used as the shutter the photocathode of the tube is of the same order of sensitivity as the photomultiplier and so there must only be a very small delay between the detection of the light by the photomultiplier and the operation of the shutter. Typically the delay caused by the electronic equipment is 10-50 ns and this is unimportant when exposures greater than 100 ns are being used. However, it is necessary to introduce a compensating delay between the event and the image tube if the exposure time is 2 or 3 ns. This delay can be introduced optically by sending the light from the event along a long optical path before it reaches the camera, the delay being 3 ns/m.
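The optical path needed for a given compensating delay follows directly from the speed of light. The short Python sketch below assumes propagation in air at essentially the vacuum speed of light, which gives about 3.3 ns per metre (the text rounds this to 3 ns/m); the function names are illustrative and not from the source.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s (air assumed equivalent)


def optical_path_for_delay(delay_ns):
    """Path length in metres that delays light by delay_ns nanoseconds."""
    return delay_ns * 1e-9 * C_M_PER_S


def delay_per_metre_ns():
    """Delay in nanoseconds contributed by one metre of optical path."""
    return 1e9 / C_M_PER_S


if __name__ == "__main__":
    print(f"delay per metre: {delay_per_metre_ns():.2f} ns")
    # Electronic delays quoted in the text: 10 to 50 ns.
    for delay_ns in (10.0, 50.0):
        print(f"{delay_ns:.0f} ns needs {optical_path_for_delay(delay_ns):.1f} m")
```

On this estimate, compensating the 10-50 ns electronic delay calls for an optical path of roughly 3 to 15 m, which is why a folded delay line is attractive for exposures of a few nanoseconds.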
Synchronization would not be a problem if the event could be initiated by the camera control unit. In practice few events can be triggered accurately enough for this method to be used. Saxe and Chippendale passed the light from the spark along an optical delay and used the pulse produced by the breakdown of the spark gap to control the image tube. The tube was used in the diode mode and although the shutter pulse was 4 ns long the effective exposure time was shortened as discussed above.
6.2. IMAGE DISSECTOR CAMERAS
An image dissection system based on the ME1201 was developed by LUNN [1957] and LUNN and CHIPPENDALE [1957]. The continuous photocathode was replaced by a rectangular array of photosensitive areas, the dimensions of which were about a tenth of the spacing between them. Light from the event caused emission of electrons from each element and an image consisting of discrete spots was formed on the output phosphor screen. By displacing the electron streams by one element dimension it was possible to produce a second image of the event interlaced with the first. Fifty such images were produced using a magnetic deflection system with a 1 µs continuous scan. A positive transparency of the record was made and the individual images were produced by contact printing through this positive transparency using the image tube phosphor screen as the source of light. The photocathode was uniformly illuminated and, as the same deflection magnetic field was used, any distortions introduced by the tube during the recording were compensated by this read out process. The sensitive areas of the photocathode occupied only 1 % of the total area and hence much of the light from the event was not used. To increase the efficiency of the image dissection COURTNEY-PRATT and THACKERAY [1957] suggested the use of a lenticular plate. This was an array of small rectangular lenses each centred on a particular element and each collecting light and focussing it at the element. In practice it was not necessary to use a mosaic photocathode. A description of many mechanical image dissection systems is given by DUBOVIC [1958].
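The 1 % figure quoted above follows from the geometry of the mosaic. The sketch below assumes square elements on a square grid, which the source does not state explicitly; the function name and the idea of counting non-overlapping positions per grid cell are illustrative only.

```python
def dissection_budget(element_to_pitch):
    """Return (sensitive area fraction, image positions per grid cell).

    element_to_pitch is the element dimension divided by the element
    spacing. Square elements on a square grid are assumed here.
    """
    fill_fraction = element_to_pitch ** 2
    positions_per_axis = round(1.0 / element_to_pitch)
    return fill_fraction, positions_per_axis ** 2


if __name__ == "__main__":
    fraction, positions = dissection_budget(0.1)
    print(f"sensitive fraction: {fraction:.1%}")
    print(f"non-overlapping positions per cell: {positions}")
```

With elements a tenth of the spacing, only about 1 % of the photocathode is sensitive, and under the stated assumption each grid cell has room for up to 100 non-overlapping spot positions, so the fifty interlaced images reported are comfortably within that bound.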
§ 7. Deflection Shutters
7.1. RUSSIAN IMAGE TUBES WITH DEFLECTION SHUTTERS
During the 1950's workers in Russia developed a new type of image tube shutter. This work has been described in detail by BUTSLOV et al. [1958] and ZAVOISKY and FANCHENCO [1965]. The image tubes, for example type
PIM3, were electrostatically focussed. A metal plate S with a slot in it was situated at the position where the electron stream was narrowest (Fig. 7.1). In front of this plate was an electrostatic deflection system D consisting of two plates whose separation was much greater than the width of the
electron stream. A rapidly varying voltage waveform was applied to the deflection plates D to sweep the electron stream across the slot. The combination of the deflection system and the slot plate could thus be used as a shutter and the electrons which passed through the slot were directed by two orthogonal sets of deflection plates D, , D,, to form an image on the output phosphor screen P. Unfortunately, as the electrons were swept across the slot they received some transverse momentum and this caused blurring of the final image. To overcome this it was necessary to introduce a set of compensating plates CP after the slot and to these was applied a waveform identical to the shutter waveform but of opposite polarity. By matching the sensitivity of the two deflection systems this transverse momentum was cancelled and a stationary image was obtained. An advantage of this type of shutter is that the only requirement on the shutter voltage waveform is that it should vary at a sufficient rate. The minimum exposure time was limited by the stray inductance and capacitance of the deflection plates as these introduced phase shifts making accurate compensation difficult. The electric field at the photocathode was made as large as possible to reduce space charge effects, and also to reduce the differences in electron transit times caused by the distribution of electron emission energies. An estimate of this transit time difference, for electrons emitted normal to the surface with emission energies of zero and eV₀, can be found as the time taken by the zero energy electron to attain a kinetic energy eV₀. Very little extra time difference is added during the rest of the flight from the photocathode to the shutter plate. The electron acceleration is eE/m
where E is the electric field at the photocathode. By integrating and substituting for the velocity corresponding to energy eV₀, the transit time difference Δt is found to be

$$\Delta t = (2m/e)^{1/2}\, V_0^{1/2} / E .$$
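A quick numerical check of this estimate is sketched below in Python. It uses the constant-acceleration result, i.e. the time for an electron starting from rest with acceleration eE/m to reach the velocity (2eV₀/m)^(1/2). The function name is illustrative, and the two field strengths are simply sample values with V₀ taken as 1 V.

```python
import math

ELECTRON_CHARGE_C = 1.602176634e-19   # elementary charge, C
ELECTRON_MASS_KG = 9.1093837015e-31   # electron rest mass, kg


def transit_time_difference_s(v0_volts, field_v_per_cm):
    """Transit time difference (s) for emission energies 0 and e*V0.

    Evaluates sqrt(2 m V0 / e) / E with the field converted to V/m.
    """
    field_v_per_m = field_v_per_cm * 100.0
    return math.sqrt(2.0 * ELECTRON_MASS_KG * v0_volts / ELECTRON_CHARGE_C) / field_v_per_m


if __name__ == "__main__":
    for field_v_per_cm in (340.0, 34_000.0):
        dt = transit_time_difference_s(1.0, field_v_per_cm)
        print(f"E = {field_v_per_cm:g} V/cm -> {dt:.2e} s")
```

For a 1 V emission energy spread this gives about 10⁻¹⁰ s at 340 V/cm and about 10⁻¹² s at 34 kV/cm, showing the inverse dependence on the field at the photocathode.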
Δt can be reduced by increasing E, e.g. with V₀ = 1 V, Δt is 10⁻¹⁰ s for E = 340 V/cm and 10⁻¹² s for E = 34 kV/cm. KOMELKOV, NESTERIKHIN and PERIGAMENT [1962] used a PIM3 tube with a series of pulses applied to the shutter and a staircase waveform applied to the deflection plates. This gave 16 frames of 50 ns exposure time at a framing rate of 5 × 10⁶ f.p.s. ZAVOISKY, BUTSLOV, PLAKHOV and SMOLKIN [1957] incorporated a PIM3 and a magnetically focussed five stage cascade image intensifier within the same tube. The camera system was designed to photograph tracks in scintillation chambers, the intensifier being necessary as the light intensity from the tracks was very low. KOROBKIN and SCHELEV [1968] described a versatile camera system in which a PIM type tube or a tube with two stages of cascade intensifier gain can be used. The minimum exposure was 5 ns. SIMONOV and KUTUKOV [1962] operated the tube by applying a staircase waveform to one pair of deflection plates. The waveform was produced by mismatched transmission lines and it deflected the photoelectron stream to a series of positions on the output phosphor screen. The step determined the deflection and the flat part of the waveform the exposure time. A framing rate of 10⁷ f.p.s. was obtained with exposures of 50 ns.
7.2. BRITISH DEVELOPMENTS OF DEFLECTION SHUTTER TUBES
An image tube of similar construction to the PIM type tubes was described by HUSTON and WALTERS [1962]. They discussed the action of the shutter in detail, paying particular attention to the tolerances required in the compensation system. It was shown that, if ramp voltage waveforms were used, accurate compensation required that either the two sets of deflection plates had equal sensitivities or potentials rising at different rates should be applied to them. The latter was undesirable. A plate separation tolerance of 6I .J in 4 mm was needed to give an image of 20 lp/mm resolution. In the early tubes the separation could be adjusted by means of vacuum bellows but it was found later that the tubes could be made to sufficient accuracy and so this was not necessary. Electrons which are already between the shutter plates at the time of application of the shuttering potential will not be correctly compensated if they
are allowed to pass through the shutter aperture. It is therefore necessary to ensure that the rising shutter potential does not start to open the shutter for a period equal to the transit time of an electron from the entrance of the shutter plates to the aperture. This necessitates an additional fixed bias potential on the deflection plates. The tube was further described by WALTERS et al. [1963]. HUSTON [1964] introduced a new method for operating the shutter. Fig. 7.2 shows the tube and the waveforms used to operate it at high framing rates. A sinusoidal voltage waveform was applied to the shutter plates S and a second sinusoid of the same amplitude and frequency but of different phase was applied to the compensation plates CP. Exposures were obtained as the sinusoids passed through zero, corresponding to no deflection of the beam.
Fig. 7.2. Waveforms applied to deflection type shutter tube (length approx. 30 cm).
This occurred twice per cycle. The phase of the sinusoids was such that alternate exposures were immobilised by different parts of the compensation waveform, i.e. exposure ab was compensated by a'b' whereas exposure cd was compensated by c'd'. The compensation was therefore not exact but if the ratio of exposure time to interframe time was restricted, typically 1 : 5, adequate immobilization of the image was obtained. The effect of this incomplete compensation was to cause alternate images to appear at different positions on the output phosphor screen. A staircase voltage waveform was applied to the deflection or shift plates D to separate the pairs of images into two rows.
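The timing of the sinusoidal shutter can be illustrated with a simple model. The sketch below is not Huston's analysis: it assumes an infinitely thin beam swept as A sin(2πft) across a slot of width w, neglects the electron transit time through the plates, and uses a hypothetical parameter `slot_over_sweep` equal to w divided by the peak-to-peak sweep 2A.

```python
import math


def sinusoid_shutter_timing(frames_per_second, slot_over_sweep):
    """Return (oscillator frequency in Hz, frame interval in s, exposure in s).

    slot_over_sweep is the slot width divided by the peak-to-peak sweep of
    the beam at the slot plate (hypothetical design parameter, 0 < value <= 1).
    """
    frequency_hz = frames_per_second / 2.0   # two zero crossings per cycle
    frame_interval_s = 1.0 / frames_per_second
    # Beam position x(t) = A sin(2*pi*f*t); the slot passes the beam while
    # |x| < w/2, i.e. while |sin(2*pi*f*t)| < w/(2A) = slot_over_sweep.
    exposure_s = math.asin(slot_over_sweep) / (math.pi * frequency_hz)
    return frequency_hz, frame_interval_s, exposure_s


if __name__ == "__main__":
    # Slot-to-sweep ratio chosen so that exposure : interval is 1 : 5.
    ratio = math.sin(math.pi / 10.0)
    f_hz, interval_s, exposure_s = sinusoid_shutter_timing(2e7, ratio)
    print(f"oscillator frequency: {f_hz / 1e6:.0f} MHz")
    print(f"frame interval: {interval_s * 1e9:.0f} ns, exposure: {exposure_s * 1e9:.0f} ns")
```

In this model a framing rate of 2 × 10⁷ f.p.s. requires a 10 MHz sinusoid, and a slot about 0.31 of the peak-to-peak sweep gives the 1 : 5 ratio of exposure to interframe time mentioned in the text.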
Fig. 7.3. Exploding copper wire, 10 ns exposures taken at 2 × 10⁷ f.p.s.
This mode of operation of the tube has been further described by HUSTON [1966, 1967]. Fig. 7.3 shows a series of 8 exposures of an exploding copper wire taken at a rate of 2 × 10⁷ f.p.s. with an exposure time of 10 ns. Exposures of 2 ns at the rate of 6 × 10⁷ f.p.s. have been obtained with an image of 30 × 30 resolvable elements. More recently the oscillator which produced the continuous sinusoidal waveform has been replaced by one which could be triggered to produce a damped sinusoidal waveform (HUSTON and MAJUMDAR [1968]). This oscillator is triggered prior to the event and the staircase waveform generator is switched on when the first light from the event has been detected. Thus the
first two frames of the recording contain information which was received during the camera switch-on time-delay. PRUDENCE and COLMER [1968] describe the use of this tube with a triggered sinusoid and also with a square pulse train applied to the shutter. HUSTON [1970] modified the method of SIMONOV and KUTUKOV [1962] and overcame the problem of producing a high speed staircase by generating it in two stages. A continuous sawtooth waveform was applied to the plates D and linear ramp waveforms of opposite slope were applied to the plates CP (Hadland Imacon camera). The nett deflection corresponded to that which would have been generated by a staircase waveform applied to a single set of deflection plates. To obtain a stationary set of images it was necessary to match the rate of rise of the linear ramp to the rate of fall of the linear part of the sawtooth waveform. In practice this could be done by observing a flash-illuminated chart having prominent bars in a direction perpendicular to the direction of deflection in the tube. The amplitude of the sawtooth waveform controlled the frame separation and hence the number of frames; the frequency of the sawtooth controlled the framing rate. The rate of rise of the linear ramp waveform was adjusted to give a stationary row of images. Fig. 7.4 shows a series of nine frames of a bar chart taken at 3 × 10⁸ f.p.s. The exposure time was 1.5 ns and the frame size was 6 × 14 mm. The limiting image resolution was 3 lp/mm. A method for doubling the framing rate was described in which a beam-splitter was introduced before the photocathode.
Fig. 7.4. A series of frames of bar chart taken at 3 × 10⁸ f.p.s., exposure time 1.5 ns, frame size 6 × 14 mm.
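The two-stage staircase described above for the sawtooth and ramp waveforms can be checked with an idealised model. The Python sketch below assumes equal deflection sensitivities for the two sets of plates, an instantaneous sawtooth flyback, and arbitrary units; the function names are illustrative and not from the source.

```python
def sawtooth(t, period, amplitude):
    """Falling sawtooth: drops linearly from 0 to -amplitude, then flies back."""
    return -amplitude * ((t / period) % 1.0)


def ramp(t, period, amplitude):
    """Linear ramp rising at the same rate as the sawtooth falls."""
    return amplitude * t / period


def net_deflection(t, period, amplitude):
    """Sum of the two deflections (equal plate sensitivities assumed)."""
    return sawtooth(t, period, amplitude) + ramp(t, period, amplitude)


if __name__ == "__main__":
    # Sample the net deflection at points inside three successive periods.
    for t in (0.25, 0.75, 1.25, 1.75, 2.25, 2.75):
        print(f"t = {t:.2f} periods -> net deflection {net_deflection(t, 1.0, 1.0):.2f}")
```

Within each sawtooth period the two slopes cancel, so the image is stationary, and each flyback steps the net deflection by one sawtooth amplitude. This reproduces the behaviour stated in the text: the amplitude sets the frame separation and the sawtooth frequency sets the framing rate.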
The reflected light was optically delayed before it formed an image alongside the direct image on the photocathode. When the optical delay was half the normal frame interval, two rows of images were formed, the lower row interlaced in time with the upper row. The exposure time remained the same.
7.3. STREAK OPERATION OF DEFLECTION SHUTTER TUBES
These image tubes can be operated as streak cameras by applying a ramp waveform to the deflection or shift plates and keeping the shutter and compensation plates at fixed potentials. Light from the event is focussed onto a slit and then onto the photocathode. The electron image is streaked in the direction perpendicular to the slit. HUSTON [1964] obtained streak writing speeds of 2500 mm/µs, which is many times higher than that obtained using magnetic deflection systems.
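The time resolution of a streak record is the ratio of the spatial resolution to the sweep speed. The Python sketch below applies that ratio, taking one line pair as the smallest resolvable distance (an assumption) and expressing the sweep speed in mm per microsecond; the 10 lp/mm figure used with the higher speed is a sample value, not one reported for that tube.

```python
def streak_time_resolution_s(resolution_lp_per_mm, speed_mm_per_us):
    """Time resolution in seconds of a streak record.

    One line pair is taken as the smallest resolvable distance.
    """
    resolvable_mm = 1.0 / resolution_lp_per_mm
    return resolvable_mm / speed_mm_per_us * 1e-6


if __name__ == "__main__":
    for speed in (100.0, 2500.0):
        dt = streak_time_resolution_s(10.0, speed)
        print(f"10 lp/mm at {speed:g} mm/us -> {dt:.1e} s")
```

At 10 lp/mm a sweep of 100 mm per microsecond corresponds to about 10⁻⁹ s, and a sweep 25 times faster would bring the same spatial resolution to a few times 10⁻¹¹ s, provided the transit time spread at the photocathode is smaller still.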
Fig. 7.5. (a) Streak record of beats between axial modes of a neodymium laser. (b) Structure in high power light pulses generated by self locking of axial modes of a neodymium laser.
KOROBKIN and SCHELEV [1968] used a PIM type tube to study mode locking in neodymium lasers. The sweep speed was 5 × 10⁴ mm/µs and the record duration was 2 ns with a sweep linearity of 80 %. Fig. 7.5a shows the recording of beats between the laser axial modes with a period of 50 ps and Fig. 7.5b shows the structure in one of the high power light spikes generated by self locking of axial modes. Similar work has been described by BASOV, DROZHBIN, NIKITIN, SENENOV, STEPANOV and YAKOVLEV [1968]. The time resolution in a streak record is given by the ratio of the spatial resolution to the sweep speed. It is assumed that the electric field near the photocathode is
sufficiently large to make the differences in transit time caused by the different emission energies negligible. ZAVOISKI and FANCHENCO [1964] and KOROBKIN and SCHELEV [1968] claim time resolutions of 3 x lo-’’ s for their systems. The ultimate time resolution was estimated by BUTSLOV, ZAVOISKY, PLAKHOV, SMOLKIN and PANCHENKO [1959] and ZAVOISKI and FANCHENCO [1964] to be 10⁻¹³ to 10⁻¹⁴ s. The problem of synchronization is just as serious in very high speed streak systems as in high speed framing cameras. BUTSLOV, ZAVOISKY, PLAKHOV, SMOLKIN and PANCHENKO [1959] described a system using a circular deflection produced by applying sinusoidal waveforms with a relative phase difference of π/2 to two orthogonal sets of deflection plates. The sweep was run continuously and so the event was recorded whenever it occurred. A shutter which could be closed after the event had occurred was needed to prevent double exposure. This method, which has been termed ‘electron optical chronography’, can only record what happens at one point in the object. ZAVOISKI and FANCHENCO [1965] discussed the use of a 0.5 MW, 1 m wavelength triode oscillator to supply the deflection plate waveforms and obtained a time resolution of 3x s. In the same paper they described a later tube built by Butslov and used by FANCHENCO [1961]. This tube, PIMv, had a deflection system consisting of a resonant cavity driven by a 10 cm magnetron. The instrumental time resolution was given as 10⁻¹³ s.
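The circular sweep can be illustrated numerically. The Python sketch below is an assumption-laden illustration, not the authors' calculation: it drives two orthogonal deflections with sinusoids in quadrature, and estimates the time resolution as the spot size divided by the writing speed 2πRf. The sweep radius and spot size are hypothetical values; the 3 GHz frequency corresponds to a free-space wavelength of about 10 cm.

```python
import math


def beam_position_mm(t_s, frequency_hz, radius_mm):
    """Spot position for two sinusoidal deflections in quadrature."""
    phase = 2.0 * math.pi * frequency_hz * t_s
    x = radius_mm * math.sin(phase + math.pi / 2.0)  # leads y by pi/2
    y = radius_mm * math.sin(phase)
    return x, y


def circular_sweep_resolution_s(frequency_hz, radius_mm, spot_mm):
    """Time resolution as spot size divided by the writing speed 2*pi*R*f."""
    writing_speed_mm_per_s = 2.0 * math.pi * radius_mm * frequency_hz
    return spot_mm / writing_speed_mm_per_s


if __name__ == "__main__":
    # Hypothetical geometry: 10 mm sweep radius and 0.1 mm resolvable spot.
    x, y = beam_position_mm(1.23e-10, 3e9, 10.0)
    print(f"distance from centre: {math.hypot(x, y):.3f} mm")
    print(f"time resolution: {circular_sweep_resolution_s(3e9, 10.0, 0.1):.1e} s")
```

The spot stays on a circle of constant radius, so the writing speed is constant, and with these sample values the estimate falls in the sub-picosecond range. The instrumental figures quoted in the text depend on the actual tube geometry, which is not given here.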
§ 8. Grid Controlled Image Tubes
8.1. ELECTROSTATICALLY FOCUSSED TUBES
SCHAGEN et al. [1952] published the design of an image tube consisting in principle of two concentric spherical surfaces with a potential difference between them. Photoelectrons leaving the inner surface of the outer sphere, the photocathode, experience an inverse square law electric field. The motions of the electrons can be calculated in a closed form and it is found that they form an image beyond the centre of curvature of the system, for example through a hole cut in the inner sphere. In practice a spherical photocathode PC is used with a conical anode structure A followed by the output phosphor screen P (Fig. 8.1). The advantage of this electrode configuration over many others is that the focussing condition is determined by the geometrical dimensions alone. It is a function of the curvatures of the surfaces and does not depend on the applied tube potentials. In most other tubes the focussing conditions are determined by the potentials applied to the various electrodes and by the
magnetic field if one is used. The image surface is spherical and, as flat phosphor screens are usually used, distortion occurs. HUSTON [1967] describes the use of such a tube in a single exposure camera with exposure times of 30 ns. In the same paper he describes the use of a similar tube, the FE11, which has a spherical phosphor screen to match the image curvature. For use with this camera REID [1961] has discussed a double Schmidt mirror system to produce a curved image field at the photocathode and a 1 : 1 copying lens followed by a fibre optic field flattener to transfer the output image from the phosphor screen to the photographic film. The AEG tube used by COURTNEY-PRATT [1949], which was described by SCHAFFERNICHT [1948], was of similar design (Fig. 5.1). LINDEN and SNELL [1957] extended SCHAGEN'S design by incorporating a spherical mesh M close to the photocathode surface (Fig. 8.2). It was found that the flow of electrons could be cut off by holding this mesh at a potential of -300 V with respect to the photocathode when the overall tube potential was 10 kV.
Fig. 8.2. Diagram of the tube due to LINDEN and SNELL [1957] (length approx. 25 cm).
STOUDENHEIMER and MOORE [1957] described the development of a similar tube in which the mesh was replaced by a grid of wires (Fig. 8.3).
Fig. 8.3. Diagram of the RCA type 4449A tube (length approx. 25 cm).
Various combinations of electrode potentials gave focussed images but the least distortion was obtained when the grid wires were held at the potentials which would have existed at the positions of the wires before they were introduced. The distortion took the form of segmentation of the image. The tube had deflection plates inside the conical anode electrode A so that it could be used as a framing camera. This design of tube, type RCA4449, has been used in the TRW (Thompson Ramo Wooldridge Inc.) framing camera system which gives exposure times as short as 5 ns at a framing rate of 2 × 10⁷ f.p.s. Three images of size 17 × 25 mm and resolution 8 lp/mm are obtainable. A similar camera system is described by MENIGER and BUNTENBACH [1957]. BULPITT [1968] described developments in the TRW camera system which included a tube which had a stage of cascade intensifier gain, followed by a proximity focussed section and a fibre optic output window. The photographic film was pressed against this output window and the number of photons arriving at this film for each primary photoelectron was 600 times the corresponding number in the single stage tube with conventional optics and no image intensification.
Fig. 8.4. Diagram of the tube due to REED and NIKLAS [1959] (length approx. 35 cm).
REED and NIKLAS [1959] developed a multielectrode electrostatically focussed tube in which the electrodes were short coaxial cylinders C₁, C₂ etc. and the photocathode PC and phosphor screen P were plane (Fig. 8.4). A magnetic deflection system was used. The shutter was a mesh M near the photocathode; it could be operated by pulses of 10 V amplitude and gave exposure times down to 10 ns. The maximum framing rate was limited to 10⁶ f.p.s. by the magnetic deflection system D. An image resolution of 10 lp/mm was obtained. GUYOT, DRIARD and SIROU [1966], GUYOT, KAPLAN and BALOSKOVIC [1967] and GUYOT, KAPLAN, DOMALAIN, LAMARRAGUE and DURANT [1968] have developed an electrostatically focussed mesh tube with electrostatic deflection D (Fig. 8.5). The shutter pulse was 400 V and could be applied to the mesh M or the following electrode E. This electrode was held at a lower potential than the mesh to prevent any secondary electrons produced by collisions at the mesh from reaching the output phosphor screen. Exposure times of 20 ns have been obtained. A similar tube incorporating a cascade image intensifier has been produced with each stage of the intensifier magnetically focussed by a short coil.
Fig. 8.5. Diagram of the tube due to GUYOT, DRIARD and SIROU [1966] (length approx. 12 cm).
GAVGANEN, DIAMANT, ISKOLDSKI, NESTERIKHIN and FEDOROV [1968] briefly described a mesh shutter tube with an electrostatic deflection system. The minimum exposure time was 10 ns and four frames of 120 × 120 elements were obtained. CHARLES, WENDT and CARVENNEC [1968] described a tube similar to that developed by Stoudenheimer except that the mesh has been replaced by an annular electrode. The shortest exposure time was 3 ns and the system gave nine frames.
8.2. MAGNETICALLY FOCUSSED MESH TUBES
MANDEL [1962] developed a mesh shutter tube which was loop focussed in a uniform magnetic field (Fig. 8.6). The mesh M₁ was close to the photocathode PC and pulses of 10 V amplitude were sufficient to operate the shutter. The penetration of the strong axial electric field through the mesh apertures caused collection of appreciable current when the mesh was held at a
Fig. 8.6. Diagram of the tube due to MANDEL [1961] (length approx. 25 cm).
low potential. This problem was overcome by using a second mesh M₂ to screen the first mesh electrostatically from the rest of the tube. The presence of this second mesh also reduced the tolerances on the shutter pulse shape to give good resolution. Later tubes incorporated a TSEM image intensifier. The tube was used by MAGYAR and MANDEL [1963] to study the interference between independent lasers.
§ 9. The Use of Conventional Image Intensifiers as High Speed Shutters
Several workers have investigated the operation of conventional image intensifiers as high speed shutters. The shuttering system is usually contained in the first stage of the tube and the applications are usually as single exposure cameras. RUGGLES, SLARK and WOOLGAR [1963] described methods of shuttering a five stage TSEM (transmission secondary electron emission) image intensifier. The shutter pulses were applied to the photocathode to switch it to its correct working potential. Typically pulses of 2 kV amplitude and 2 µs duration were used. Unfortunately any stray capacitance had to be charged through the high resistance of the photocathode and this limited the rate at which the potential at the photocathode surface could be changed. To overcome this a mesh was fixed across the photocathode on the outside of the tube and electrically connected to the photocathode (Fig. 9.1). The capacitance between this mesh and the photocathode shunted the photocathode resistance and hence short rise time pulses could be applied. Alternatively the photocathode could be deposited onto a conducting substrate.
This same method of shuttering has been used by BRADLEY, HIGGINS and KEY [1970]. A laser-triggered spark-gap provided the pulse to shutter a four stage cascade intensifier. The exposure time was 1.5 ns.
Fig. 9.1. Method of shuttering an image intensifier.
EMBERSON [1962, 1967] used two methods for shuttering TSEM and cascade intensifiers. The pulses were either applied to the photocathode with an external mesh or a conducting substrate as described above, or to one of the electrodes in the first stage of the tube. This electrode was otherwise biased to a potential below that of the photocathode. Both methods of shuttering suffer from the disadvantage that the shape and size of the shutter pulse must be controlled carefully to satisfy the focussing conditions of the tube. A further problem when cascade intensifiers are used is that there is thermal emission from the subsequent photocathodes even when the shutter is closed to prevent electrons from the first photocathode from passing through the tube. Very little such thermal emission occurs in TSEM tubes. Emberson demonstrated the gain capabilities of the systems by photographing a candle using a TSEM intensifier with a 150 ns exposure. The effect of the pulse shape on the exposure time has been studied by BARRAULT and KEKEZ [1969].
§ 10. Storage Methods
The development of scintillation chambers for use in nuclear physics led to the need for a system capable of storing the images of the scintillation tracks while the associated electronic circuitry determined whether the images contained useful information and hence whether they should be recorded or not.

10.1. PHOSPHOR SCREEN STORAGE
A suitable method for storing the image was to use two image tubes in series. The first of these had a long persistence output phosphor screen on which the image was stored and the second tube was used as a shutter to select the required events. Image tubes were also necessary because the amount of light from the scintillation tracks was very small. PERL and JONES [1962] reviewed many image tubes available for this purpose. ANDERSON, GOETZE and KANTER [1962] described a phosphor screen storage system using two TSEM image intensifiers. HILL, CALDWELL and SCHLUTER [1962] used phosphor screen storage followed by an intensifier orthicon, a television camera incorporating an image intensifier.

10.2. DYNAMIC ELECTRON IMAGE STORAGE
PERL and JONES [1962] and HILL, CALDWELL and SCHLUTER [1962] discussed an alternative way of storing the images. This involved the introduction of a long electron drift space between the photocathode and the shutter in a magnetically loop-focussed image tube. The image storage or delay was provided by the time of flight of the electrons through this drift space. MCGEE [1962] suggested a practical arrangement and this was developed by MCGEE, BEESLEY and BERG [1966], BERG, SMITH and PROSSER [1966] and SMITH [1968, 1969].
Fig. 10.1. Diagram of the electron image storage tube (without cascade intensifier section) (length approx. 70 cm).
The device is shown diagrammatically in Fig. 10.1. The image tube is 70 cm long. Photoelectrons which leave the photocathode PC are accelerated by an axial electric field through a mesh M1 into a tubular metal drift section D which is held at a fixed potential V (~ 100 V). The time of flight of the electrons through this drift section is approximately 100 ns. The shutter is a second mesh M2 which can be opened or closed by applying positive or negative potentials with respect to that of the photocathode.
The electron image was 'brute force focussed' (§ 3.5) by means of a strong axial magnetic field. It is not possible to loop focus the image (§ 3.4) because the variation in the electron transit time caused by the distribution of emission energies is 0.5 ns, whereas the time for an electron to complete a loop in a magnetic field of 250 gauss is 1.4 ns. Thus the transit time cannot correspond to an integral number of loop times for all electrons. However, if the magnetic field is sufficiently large, e.g. 600 gauss, the diameter of the disc of confusion is 0.1 mm and thus adequate image quality can be obtained. Electrons which passed through mesh M2 were accelerated to a high kinetic energy and produced the output image on the phosphor screen P. The shutter could be controlled by potential changes of 10 V, but it was found that the exposure time obtained when short pulses were applied to the mesh was a function of the emission energies of the particular electrons, being shorter for electrons of lower emission energy. This limits the time resolution of the shutter, and it was found that the effect could be reduced a) by decreasing the electron transit time through the region in the neighbourhood of the shutter mesh by having a large electric field between the drift section and the mesh, and b) by using pulses with rise and fall times as short as possible. The difference in time of flight through the storage section for electrons with different emission energies also limits the time resolution. It was found that when the exposure time was 1 ns the shuttered electron stream contained 85 % of the electrons which were generated within 1 ns at the photocathode. The electrons which did not pass through mesh M2 were reflected and returned through the drift section. If the potential at mesh M1 was changed to a negative value before these electrons arrived at it, they were again reflected and returned towards the shutter mesh M2.
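The loop time and drift-section figures quoted above can be checked with a short calculation. The following is a minimal sketch, assuming non-relativistic electrons that enter the drift section with exactly the energy set by its ~ 100 V potential (emission energies and the accelerating gap are ignored); the function names are illustrative and not taken from the original work.

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge magnitude, C
E_MASS = 9.1093837015e-31    # electron rest mass, kg

def cyclotron_period_ns(b_gauss):
    """Time for one loop in an axial magnetic field (non-relativistic)."""
    b_tesla = b_gauss * 1e-4
    return 2 * math.pi * E_MASS / (E_CHARGE * b_tesla) * 1e9

def drift_speed_m_per_s(potential_volts):
    """Speed of an electron that has fallen through the given potential."""
    return math.sqrt(2 * E_CHARGE * potential_volts / E_MASS)

loop_250 = cyclotron_period_ns(250.0)   # loop time at 250 gauss
loop_600 = cyclotron_period_ns(600.0)   # loop time at 600 gauss
# Distance covered in 100 ns at 100 V, in cm (assumed idealisation)
drift_length_cm = drift_speed_m_per_s(100.0) * 100e-9 * 100.0

print(f"loop time at 250 gauss: {loop_250:.2f} ns")
print(f"loop time at 600 gauss: {loop_600:.2f} ns")
print(f"distance covered in 100 ns at 100 V: {drift_length_cm:.1f} cm")
```

Under these assumptions the 250 gauss loop time comes out close to the 1.4 ns quoted in the text, and a 100 ns flight at 100 V corresponds to a path of roughly 60 cm, which is compatible with a tube about 70 cm long.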
The electrons were now trapped between the two meshes and travelled back and forth through the tube. A second pulse was applied to the shutter mesh, delayed by approximately one double electron transit time with respect to the first, and allowed a second sample or exposure from the electron stream. By using a set of suitably timed shutter pulses, a series of exposures of the event was obtained. It was necessary to include a deflection system after the shutter mesh to separate spatially the electron images before they reached the output phosphor screen. A pair of deflection plates DP was used and the combination of the axial magnetic field B and the transverse electric field E caused a deflection in a direction parallel to the plane of the plates. The deflection velocity is given by E/B. The deflection is given by the product of this velocity with the transit time through the plates, and as the latter depends on the position of the electrons between the plates, the electrons near the plate at the higher potential are deflected less than those near the other plate. The main advantages of this dynamic image storage are a) although the samples taken from the electron stream can be as little as 1 ns apart, the shutter pulses need only be applied at intervals of one double transit time (~ 200 ns), b) similarly the deflection system operates with a double transit time between exposures and c) the initial electron transit time allows easy synchronisation of the camera to the event.
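The timing argument in advantage a) can be made concrete with a small sketch. It assumes an idealised tube in which every electron takes exactly 100 ns for a single transit and reflections at the meshes are instantaneous; the pulse spacing of 202 ns is an illustrative value chosen to give the 2 ns frame interval that corresponds to 5 × 10⁸ frames per second, and the function name is hypothetical.

```python
def sampled_emission_times(first_pulse_ns, pulse_spacing_ns, n_frames,
                           single_transit_ns=100.0):
    """Emission times (ns) of the electrons passed by each shutter pulse.

    Pulse k opens the shutter mesh at first_pulse_ns + k * pulse_spacing_ns.
    Electrons reaching the mesh at that moment have made one forward
    transit plus k complete round trips between the two meshes.
    """
    double_transit_ns = 2.0 * single_transit_ns
    times = []
    for k in range(n_frames):
        pulse_time = first_pulse_ns + k * pulse_spacing_ns
        times.append(pulse_time - single_transit_ns - k * double_transit_ns)
    return times

# Shutter pulses 202 ns apart (assumed), first pulse 100 ns after the event starts
frames = sampled_emission_times(100.0, 202.0, 4)
frame_interval_ns = frames[1] - frames[0]
frame_rate = 1e9 / frame_interval_ns

print(frames)       # emission times sampled by successive pulses
print(frame_rate)   # effective framing rate in frames per second
```

The shutter pulses are some 200 ns apart, yet successive frames sample the event at points only 2 ns apart, because each extra round trip in the storage section absorbs one double transit time.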
Fig. 10.2. A series of single exposures of an electric spark in air. The times are the settings on the delay control circuit (ns). (Frames labelled 6.0, 8.0 and 14.0.)
The latter is demonstrated by Fig. 10.2, which shows a series of single exposures of a small electric spark in air as the delay between the discharge and the application of the shutter pulse was varied. The times indicate the setting on the delay circuit (in ns) which controlled the delay between the detection of the event and the operation of the shutter. The operation of the device as a framing camera was tested by photographing an oscilloscope trace as it passed across the screen. Exposures at a framing rate of 5 × 10⁸ f.p.s. were obtained by SMITH [1969]. Many of the tubes incorporated a cascade image intensifier and hence the current density in the tube was low, so that space charge and photocathode resistance effects were not as important as in many other systems. The spatial resolution was 10 lp/mm and the image sizes were 2 × 0.7 cm² for a single exposure and 0.7 × 0.5 cm² for a series of frames. The time resolution in the final image of a series was as good as in the first because, although the electrons with higher emission energies travelled faster through the drift section and introduced a time dispersion, they travelled further into the decelerating field when they were reflected at a mesh and hence took longer to come to rest. By suitable design a compensated system was built.

10.3. AN ALTERNATIVE DYNAMIC STORAGE METHOD
A similar device which used a drift space was investigated by BRADLEY and MAJUMDAR [1966], Fig. 10.3. The tube was focussed by a strong uniform magnetic field. Photoelectrons from the photocathode PC passed through a hole in a metal plate AP. Beyond this aperture plate there was a drift space defined by two parallel deflection plates DP held at different potentials. The deflection produced by the combination of the transverse electric field E and the axial magnetic field B was parallel to the plates. The shutter was a mesh M which was initially held at a suitable negative potential and so the electron stream was reflected. The aperture plate was also held at a negative potential and hence the returning electron stream was again reflected. The deflection is always in the same direction as it is a function of E × B/|B|². The electrons were able to pass through the aperture due to the strong electric fields on either side of it. Between the subsequent reflections the electron image was progressively deflected and when all the electron stream had passed through the aperture the shutter was opened. The electron stream ABC passed out through A, stream CDE through C and stream EF through E. In this way a series of exposures at a high framing rate was obtained while using a single shutter pulse. A serious limitation of this system was that the electric field between the plates caused different parts of the image to travel with different velocities and hence limited the possible time resolution.

Fig. 10.3. Principle of the operation of the tube due to BRADLEY and MAJUMDAR [1966] (length approx. 40 cm).
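The statement that the deflection is always in the same direction follows from the form of the crossed-field drift velocity, which contains no reference to the axial velocity of the electron. A minimal sketch, assuming SI units, an axial field along z and a transverse field along x (so that the plates lie in planes of constant x); the field values are illustrative assumptions, not taken from the paper cited.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def exb_drift(e_field, b_field):
    """Crossed-field drift velocity E x B / |B|^2 in m/s.

    The result depends only on the fields, so electrons travelling
    towards the shutter mesh and electrons returning from it drift
    in the same transverse direction.
    """
    b_squared = sum(component * component for component in b_field)
    return tuple(component / b_squared for component in cross(e_field, b_field))

B_AXIAL = (0.0, 0.0, 0.06)       # 600 gauss along the tube axis (assumed value)
E_TRANSVERSE = (1.0e4, 0.0, 0.0) # 10 kV/m between the plates (assumed value)

drift = exb_drift(E_TRANSVERSE, B_AXIAL)
print(drift)
```

The only non-zero component is along y, perpendicular to both fields and therefore parallel to the plates, and its magnitude equals E/B.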
§ 11. Biplanar Image Tubes
Recently there has been renewed interest in the Holst type image tube. HEALEY and OWREN [1967] compare the properties of the biplanar tube with those of other image tubes. Modern biplanar tubes can be made with large area photocathodes and phosphor screens with a separation of a few millimetres and give a resolution of 20 lp/mm over areas 10 cm in diameter. The resolution is uniform over the field and there is no distortion. The high electric field required to give adequate resolution (§ 3.1) makes space charge unimportant and also makes the tubes insensitive to external stray electric and magnetic fields. The shutter is controlled by applying the tube working potential for the required exposure time. Biplanar image tubes can only be used to give single exposures as there is no possibility of incorporating a deflection system. Light leakage through the tube can be a serious problem, but as the light which gets through is usually of wavelength > 5000 Å, a red blocking filter can be used to prevent this light from reaching the photographic plate. Fig. 11.1 shows a series of exposures taken with exposure times of 5 ns (HEALEY and OWREN [1967]). ESCHARD and POLAERT [1968a, 1968b, 1969] describe a series of biplanar devices which were made by fabricating the phosphor screen and photocathode sections separately and then joining the two parts together. Tubes with diameters up to 12 cm and with a resolution of 18 lp/mm are described. They also describe a two stage tube which has a cascade phosphor-mica-photocathode multiplying screen and a fibre optic output window. Experimental arrangements to give exposures between 5 and 500 ns are discussed by BACCI and MARILLEAU [1968]. When the two stage tube is used, the second stage is pulsed on for a short time so that light only reaches the recording film during the decay time of the first phosphor screen, as this reduces the background. LAVIRON and BACCI [1968] describe the use of these biplanar tubes in a matched transmission line system, show exposures of 1 ns and indicate that it should be possible to develop a system to give an exposure time of 0.3 ns. The pulse generators used in this system are described by BACCI and BLANCHET [1968].

Fig. 11.1. An eight frame sequence of two simultaneous air gap discharges. The exposure time was 5 ns and the relative frame times are: 0, 0.5, 1.0, 1.5, 2.5, 3.5, 4.5 and 6.5 μs.
§ 12. Multichannel Systems

The multiple exposure systems considered so far have been capable of taking several exposures within a single tube. An alternative approach is to use many simple single exposure image tubes in parallel with each other. Each channel produces its own output image and may receive light from the event either through its own objective lens or from a system of beam splitters following a single objective lens. The former method suffers from parallax of the images and the latter from light loss at the beam splitters. The image tubes can be either shuttered at different times or at the same time but with different optical paths between the event and each tube.
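For the second option, in which all tubes are shuttered together and the frame separation is set by optical path differences, the required extra path is simply the speed of light multiplied by the desired delay. A minimal sketch, assuming propagation in air or vacuum unless a refractive index is supplied; the function name and the example delays are illustrative.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def extra_path_m(delay_ns, refractive_index=1.0):
    """Additional optical path (m) that delays the light by delay_ns."""
    return C_VACUUM / refractive_index * delay_ns * 1e-9

path_1ns = extra_path_m(1.0)    # about 0.3 m per nanosecond
path_20ns = extra_path_m(20.0)  # about 6 m for a 20 ns frame interval

print(f"{path_1ns:.3f} m per ns, {path_20ns:.2f} m for 20 ns")
```

Frame intervals of a few nanoseconds therefore need path differences of the order of a metre, whereas intervals of tens of nanoseconds need several metres per channel.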
NESTERIKHIN and KOMELKOV [1959] described the use of several PIM3 tubes in a multichannel system and obtained 15 ns exposures with 20 ns intervals between them. The use of biplanar image tubes in a multichannel system is discussed by HEALEY and OWREN [1967] and BACCI and MARILLEAU [1968]. HUSTON [1966] used electrostatically focussed diode tubes (§ 8.1) in six separate channels but the serious pincushion distortion in these tubes was a problem. A multichannel system was used by VOROBJEV, ISKOLDSKI, GRUGLIAKOV, NESTERIKHIN and STCHELEV [1968] to study the formation of a ruby laser giant pulse. Eight channels were used and the single shutter pulse was applied to the tubes through cables of different lengths. No objective lenses were used as the laser light was only very slightly divergent. The light passed through a series of wedge shaped beam splitters which directed the reflected light onto the photocathodes of the image tubes. As there was a large change in brightness (1 : 10⁷) during the build up of the giant laser pulse, attenuators were introduced into each channel to prevent overloading of the photocathodes. The aluminium backings of the phosphor screens were unusually thick to prevent feedthrough of the laser light. An advantage of the multichannel system over the multiple frame camera is that intensity attenuation can be introduced into each channel and hence large changes in brightness during the event can be accommodated. Fig. 12.1 shows the build up of a giant laser pulse.

Fig. 12.1. A Q-switched ruby laser spot structure as a function of time. The vertical axis gives the time in ns, and the horizontal axis gives the attenuation (log. scale) in each of the channels.

GAVGANEN, DIAMANT, ISKOLDSKI, NESTERIKHIN and FEDOROV [1968] described a multichannel system based on a two electrode electrostatically focussed tube. The photocathode was deposited on a conducting substrate and the electrodes and leads were designed so that their resonant frequency was > 1000 Mc/s. The minimum exposure time was 0.5 ns and there were eight separate channels with attenuators in each channel. Separate objectives were used and could be arranged to view the event from different directions. A system was described for photographing low light level events and consisted of four channels followed by a single cascade image intensifier.

References

The references to the Proceedings of the International Congresses on High Speed Photography have been abbreviated following the scheme given by COURTNEY-PRATT [1957].

HSP2, 1956, Actes Deuxième Congrès International de Photographie et de Cinématographie Ultra Rapides, Paris, 1954, eds. P. Naslin and J. Vivie (Dunod, Paris).
HSP3, 1957, Proceedings of the Third International Congress on High Speed Photography, London, 1956, ed. R. B. Collins (Butterworths, London).
HSP4, 1959, Kurzzeitphotographie IV, Internationaler Kongress für Kurzzeitphotographie und Hochfrequenzkinematographie, Köln, 1958, eds. H. Schardin and O. Helwich (Verlag Dr. Othmar Helwich, Darmstadt).
HSP5, 1962, Proceedings of the Fifth International Congress on High Speed Photography, Washington, 1960, ed. J. S. Courtney-Pratt (Society of Motion Picture and Television Engrs., New York).
HSP6, 1963, Proceedings of the Sixth International Congress on High Speed Photography, The Hague, 1962, eds. J. G. A. de Graaf and P. Tegelaar (Tjeenk Willink, Haarlem).
HSP7, 1967, Kurzzeitphotographie VII, Internationaler Kongress für Hochfrequenzkinematographie, Zürich, 1965, ed. O. Helwich (Verlag Dr. Othmar Helwich, Darmstadt).
HSP8, 1968, Proceedings of the Eighth International Congress on High Speed Photography, Stockholm 1968, eds. N. R. Nilsson and L. Hogberg (Wiley, New York and Almqvist and Wiksell, Stockholm).
ANDERSON, A. E., G. W. GOETZE and H. KANTER, 1962, HSP5, 95.
BACCI, H. and H. BLANCHET, 1968, L'Onde Electrique 48, 430.
BACCI, H. and J. MARILLEAU, 1968, HSP8, 57.
BARNSLEY, D. A., 1962, HSP6, 341.
BARRAULT, M. R. and M. M. KEKEZ, 1969, J. Sci. Instr. (J. of Phys. E) Series 2, 2, 1041.
BASOV, N. G., Yu. A. DROZHBIN, V. V. NIKITIN, A. S. SENENOV, B. M. STEPANOV and V. A. YAKOVLEV, 1968, HSP8, 33.
BERG, A. D., R. W. SMITH and R. D. PROSSER, 1966, Advan. Electron. Electron. Phys. 22B, 969.
BRADLEY, D. J. and S. MAJUMDAR, 1966, Advan. Electron. Electron. Phys. 22B, 985.
BRADLEY, D. J., J. F. HIGGINS and M. H. KEY, 1970, Appl. Phys. Letters 16, 53.
BULPITT, T. H., 1968, HSP8, 31.
BUTSLOV, M. M., E. K. ZAVOISKY, A. G. PLAKHOV, G. E. SMOLKIN and S. D. PANCHENKO, 1959, HSP4, 230.
CHARLES, D. R., G. WENDT and F. LE CARVENNEC, 1968, HSP8, 51.
CHIPPENDALE, R. A., 1952, Phot. J. 92B, 149.
COLEMAN, K. R., 1963, Rep. Prog. Phys. 26, 269.
COURTNEY-PRATT, J. S., 1949, Research (London) 2, 293.
COURTNEY-PRATT, J. S., 1952, Phot. J. 92B, 137.
COURTNEY-PRATT, J. S. and D. P. C. THACKERAY, 1957, J. Phot. Sci. 5, 32.
COURTNEY-PRATT, J. S., 1957, Rep. Prog. Phys. 20, 379.
COURTNEY-PRATT, J. S., 1962, HSP5, 197.
DUBOVIC, A. S., 1965, Photographic Recording of High-speed Processes, NASA Technical Translation TT F-377.
EMBERSON, D. L., 1962, IRE (Inst. Radio Engrs.) Trans. Nucl. Sci. NS-9, 107.
EMBERSON, D. L., 1967, HSP7, 454.
EREZ, A. and S. EYLON, 1962, HSP6, 333.
ESCHARD, G. and R. POLAERT, 1968a, HSP8, 54.
ESCHARD, G. and R. POLAERT, 1968b, L'Onde Electrique 48, 426.
ESCHARD, G. and R. POLAERT, 1969, Advan. Electron. Electron. Phys. 28B, 989.
ESCHARD, G. and J. GRAF, 1969, Advan. Electron. Electron. Phys. 28A, 499.
FANCHENCO, S. D., 1961, Pribory i Techn. Ekperim. 1, 5.
FELLGETT, P., 1958, in: Present and Future Use of Telescopes of Moderate Size, ed. F. B. Wood (University of Pennsylvania Press) p. 51.
GARFIELD, B. R. C., J. R. FOLKES and B. T. LIDDY, 1969, Advan. Electron. Electron. Phys. 28A, 375.
GAVGANEN, L. V., L. M. DIAMANT, A. M. ISKOLDSKI, Yu. E. NESTERIKHIN and V. M. FEDOROV, 1968, HSP8, 41.
GIBSON, F. C., M. L. BOWSER, C. W. RAMALY and F. H. SCOTT, 1954, Rev. Sci. Instr. 25, 173.
GOSS, W., 1962, HSP5, 137.
GROVER, F. H., 1961, J. Sci. Instr. 38, 86.
GUYOT, L. F., B. DRIARD and F. SIROU, 1966, Advan. Electron. Electron. Phys. 22B, 949.
GUYOT, L. F., B. DRIARD and P. BALOSKOVIC, 1967, HSP7, 448.
GUYOT, L. F., D. KAPLAN, M. DOMALAIN, P. LAMARRAGUE and M. DURANT, 1968, HSP8, 47.
HEALEY, T. J. and H. H. OWREN, 1967, HSP7, 531.
HILL, D. A., D. O. CALDWELL and R. A. SCHLUTER, 1962, Advan. Electron. Electron. Phys. 16, 484.
HOGAN, A. W., 1951a, Proc. IRE (Inst. Radio Engrs.) 39, 268.
HOGAN, A. W., 1951b, J. Soc. Motion Picture and Television Engrs. 56, 635.
HOLST, G., J. H. DE BOER, M. C. TEVES and C. F. VEENEMOUS, 1934, Physica 1, 297.
HUSTON, A. and F. WALTERS, 1962, Advan. Electron. Electron. Phys. 16, 249.
HUSTON, A. E., 1964, Appl. Opt. 3, 1231.
HUSTON, A. E., 1966, Advan. Electron. Electron. Phys. 22B, 957.
HUSTON, A. E., 1967, HSP7, 93.
HUSTON, A. E. and S. MAJUMDAR, 1968, HSP8, 25.
HUSTON, A. E., 1970, Proc. Ninth International Congress on High Speed Photography, Denver, Colorado, Aug. 1970 (Soc. of Motion Picture and Television Engrs., New York) to be published.
JENKINS, J. A. and R. A. CHIPPENDALE, 1951, J. Br. Inst. Radio Engrs. 11, 505.
JENKINS, J. A. and R. A. CHIPPENDALE, 1953, Philips Tech. Rev. 14, 213.
JONES, R. C., 1959, J. Opt. Soc. Am. 49, 645.
KING, R. W., 1955, IRE (Inst. Radio Engrs.) Trans. Telemetry and Remote Control TRC-1, No. 2, p. 8.
KING, R. W. and J. H. HETT, 1961, J. Soc. Motion Picture and Television Engrs. 70, 270.
KOMELKOV, V. S., Yu. E. NESTERIKHIN and M. I. PERIGAMENT, 1962, HSP5, 118.
KOROBKIN, V. V. and M. SCHELEV, 1968, HSP8, 36.
LAVIRON, E. and H. BACCI, 1968, HSP8, 61.
LINDEN, B. R. and P. A. SNELL, 1957, Proc. IRE (Inst. Radio Engrs.) 45, 513.
LUNN, G. H., 1957, HSP3, 102.
LUNN, G. H. and R. A. CHIPPENDALE, 1957, Electronic and Radio Engineer 34, 156.
MCGEE, J. D., 1962, Some Problems in Photoelectronic Image Intensifiers for use in High Energy Physics, in: Proc. Symp. on Nuclear Instruments, Harwell, Sept. 1961, ed. J. B. Birks (Heywood and Company, London) p. 1.
MCGEE, J. D., J. BEESLEY and A. D. BERG, 1966, J. Sci. Instr. 43, 153.
MAGYAR, G. and L. MANDEL, 1963, Nature (London) 198, 255.
MANDEL, L., 1955, J. Sci. Instr. 32, 405.
MANDEL, L., 1958, Proc. Phys. Soc. (London) 71, 1037.
MANDEL, L., 1959, Brit. J. Appl. Phys. 10, 233.
MANDEL, L., 1962, HSP5, 110.
MARTONE, M. and S. E. SEGRE, 1962, J. Sci. Instr. 39, 112.
MEEK, J. M. and R. C. TURNOCK, 1952, Phot. J. 92B, 161.
MENIGER, R. C. and R. W. BUNTENBACK, 1957, IRE (Inst. Radio Engrs.) Convention Record, Part V, 88.
NESTERIKHIN, Yu. E. and V. S. KOMELKOV, 1959, HSP4, 243.
PERL, M. L. and L. W. JONES, 1962, HSP5, 98.
PRATT, T. H., 1947, J. Sci. Instr. 24, 312.
PRATT, T. H., 1948, Electron. Eng. 20, 274, 314.
PRUDENCE, M. B. and R. A. COLMER, 1968, HSP8, 21.
ROSE, A., 1948, Adv. Electron. Electron. Phys. 1, 131.
REED, W. O. and W. F. NIKLAS, 1959, J. Soc. of Motion Pictures and Television Engrs. 68, 1.
REID, C. D., 1967, HSP7, 431.
RICHARDS, M. A., 1952, Proc. Inst. Elec. Engrs. 99, Pt. IIIA (Television) p. 729.
RUGGLES, P. C., N. A. SLARK and A. G. WOOLGAR, 1963, HSP6, 362.
SAXE, R. F. and R. A. CHIPPENDALE, 1955, Brit. J. Appl. Phys. 6, 336.
SAXE, R. F., 1957, HSP3, 126.
SCHAFFERNICHT, W., 1948, Bildwandler, in: Fiat Review of German Science 1939-1946, Electronics, Pt. 1, eds. G. Georg and J. Zenneck (Office of Military Government for Germany, Field Information Agencies Technical, Wiesbaden, Germany) p. 79.
SCHAGEN, P., H. BRUINING and J. C. FRANKCKEN, 1952, Philips Res. Rep. 7, 119.
SIEGMUND, W. P., 1968, HSP8, 90.
SIMONOV, P. and A. KUTUKOV, 1962, HSP5, 123.
SMITH, R. W., 1968, HSP8, 18.
SMITH, R. W., 1969, Adv. Electron. Electron. Phys. 28B, 1011.
STEWART, G. W. and M. WANIECK, 1963, Rev. Sci. Instr. 34, 512.
STOUDENHEIMER, R. G. and J. C. MOORE, 1957, RCA Rev. 18, 322.
TURNOCK, R. C., 1951, Proc. IEE (Inst. Elec. Engrs.) 98, Pt. II, p. 635.
VOROBJEV, V. V., A. M. ISKOLDSKI, E. P. GRUGLIAKOV, Yu. E. NESTERIKHIN and M. Ya. STCHELEV, 1968, HSP8, 45.
WALTERS, F., R. A. CHIPPENDALE and R. P. BROWN, 1963, HSP6, 357.
WILCOCK, W. L., D. L. EMBERSON and B. WEEKLEY, 1960, IRE (Inst. Radio Engrs.) Trans. Nucl. Sci. NS-7, 126.
ZAVOISKY, E. K. and S. D. FANCHENKO, 1956, Soviet Phys. Doklady 1, 285.
ZAVOISKY, E. K., M. M. BUTSLOV, A. G. PLAKHOV and G. E. SMOLKIN, 1957, J. Nucl. Energy 11, 4, 340.
ZAVOISKY, E. K. and S. D. FANCHENKO, 1965, Appl. Opt. 4, 1155.
TOOLS OF THEORETICAL QUANTUM OPTICS*
BY
MARLAN O. SCULLY† and KENNETH G. WHITNEY
Department of Physics and Optical Sciences Center, University of Arizona, Tucson, Arizona, USA

* Work supported in part by the U.S. Air Force (Office of Scientific Research) and in part by the U.S. Air Force (Kirtland).
† John Simon Guggenheim Fellow.
CONTENTS
§ 1. INTRODUCTION
§ 2. DENSITY MATRIX THEORY
§ 3. GREEN'S FUNCTION THEORY
§ 4. QUANTUM NOISE OPERATOR THEORY
APPENDIX I
APPENDIX II
APPENDIX III
APPENDIX IV
REFERENCES
§ 1. Introduction
In the last several years three different theoretical techniques have been used extensively in quantum optical problems, to wit, density matrix, Green's function, and quantum noise operator techniques. However, while many workers use one or the other of these approaches, most do not include all three in their "tool boxes". It is the purpose of this review to give a simple discussion of each while demonstrating some of their strengths and interconnections. We propose to accomplish this by treating the well-known problem of a single mode of the radiation field coupled to a reservoir using the density matrix, Green's function and quantum noise operator approaches. This will enable us to make simple comparisons between these theoretical approaches and to see clearly how they provide equivalent complementary descriptions of the reservoir interaction. We will also comment in two appendices on the density matrix and Green's function descriptions of more complicated problems.

How do the various theoretical approaches differ? One generally has a basic system of quantum field theoretic equations of motion which describe the fundamental interactions taking place on the microscopic level. From these equations of motion, one wants to obtain information about the macroscopic many-body system. From the macroscopic theory one may calculate relaxation and drift rates, correlation times, probability distributions, etc. The basic quantum mechanical connection between macroscopic and microscopic theory is contained in the statement that associates the time evolution of a physical observable to the time evolution of an operator as an average over the initial random state of the many-body system. Thus, for the radiation field, the physical variables are the (average) field strengths, $\alpha(t)$ and $\alpha^*(t)$, which are operator averages over the density operator $\rho$ at some initial time $t_0$, of the annihilation, $\hat a$, and creation, $\hat a^\dagger$, operators; e.g.,

$$\alpha(t) = \langle \hat a(t) \rangle \equiv \mathrm{Tr}\,\{\rho(t_0)\,\hat a(t)\}. \qquad (1.1)$$
Density matrix, Green's function, and quantum noise operator theories represent three distinct approaches to the problem of calculating operator moments, such as eq. (1.1), or correlation functions, such as $\langle \hat a^\dagger(t')\,\hat a(t) \rangle$. The idea for each approach is contained in eq. (1.1). Suppose, for simplicity, that the dynamics of the system under study is governed by the time independent Hamiltonian $H$. To determine the time dependence of $\langle \hat a(t) \rangle$ the equation of motion for $\hat a$ (in units such that $\hbar = 1$)

$$i\,\partial_t \hat a(t) = [\hat a(t), H], \qquad (1.2)$$

which couples $\hat a$ with the variables of some atomic (reservoir) system, must be solved and averaged, or vice versa. Eq. (1.2) can be solved in terms of the time translation operator, $U$, which satisfies the equation of motion,

$$i\,\partial_t U(t, t_0) = H\,U(t, t_0), \qquad (1.3)$$

along with the initial condition $U(t_0, t_0) = 1$. A formal solution to eq. (1.2) is

$$\hat a(t) = U^\dagger(t, t_0)\,\hat a(t_0)\,U(t, t_0), \qquad (1.4)$$

where, for the particular case of a time independent Hamiltonian, $U$ can be expressed as $U(t, t_0) = \exp\{-iH(t - t_0)\}$. $U$ is a unitary operator so that $U^\dagger(t, t_0) = U^{-1}(t, t_0) = \exp\{iH(t - t_0)\}$. When eq. (1.4) is substituted into eq. (1.1), one finds that

$$\alpha(t) = \mathrm{Tr}\,\{U^\dagger(t, t_0)\,\hat a(t_0)\,U(t, t_0)\,\rho(t_0)\}. \qquad (1.5)$$

On making use in eq. (1.5) of the cyclic property of the trace, one can shift the focus of attention from $\hat a(t)$ in eq. (1.1) to the density matrix $\rho(t)$ as follows:

$$\alpha(t) = \mathrm{Tr}\,\{\hat a(t_0)\,U(t, t_0)\,\rho(t_0)\,U^\dagger(t, t_0)\} = \mathrm{Tr}\,\{\hat a(t_0)\,\rho(t)\}. \qquad (1.6)$$

Eq. (1.6) implies that the density matrix satisfies the "Schrödinger" equation,

$$i\,\partial_t \rho(t) = [H, \rho(t)]. \qquad (1.7)$$
As expressed in terms of the usual quantum mechanical jargon, the shift of time dependence from $\hat a(t)$ to $\rho(t)$ takes one from the Heisenberg to the Schrödinger picture. The primary elements of each of the three approaches that have been devised for computing moments and correlation functions quantum mechanically are displayed in eqs. (1.2)-(1.7). Density matrix theory is an approach that emphasizes the statistical side of the computations. One first solves eq. (1.7) in some fashion for $\rho$, the density operator which is common to all averaging operations. One then obtains the average value of an operator or operator product as a secondary step (eq. (1.6) for example) of the analysis. $\rho$ provides a valuable probabilistic point of view from which to regard many-body and, in particular, the reservoir interaction.

The second approach is suggested by eq. (1.5). In this second so-called Green's function approach, the emphasis is on $U$ rather than $\hat a$ or $\rho$. One introduces (fictitious) external, time varying forces that act on the system under study. $U$ then becomes a functional of these forces. As they vary, the system responds (while it interacts with other systems) and one can measure its macroscopic properties. However, to fully explore all of the statistical and dynamical properties of the system, one must probe the $t_0 \to t$ transition separately from the $t \to t_0$ transition as a purely formal mathematical device. We will go into more detail on this procedure later. For those readers who are accustomed to thinking in terms of Heisenberg, Schrödinger, and interaction pictures there might be some confusion about the mathematical formulation of this theory. In Appendix I we discuss the time dependence of states, operators, and density matrices, in order to clarify the mathematical relationships of the three approaches.

The third approach, quantum noise operator theory, deals at the operator level with the dynamics of a system-plus-reservoir interaction. Thus, for example, when the radiation field is coupled to a reservoir, eq. (1.2) can be cast into a quantum noise operator equation of motion. This equation incorporates the macroscopic and microscopic (noise) effects of the reservoir in eq. (1.2) as damping-plus-frequency-shift terms and Langevin forces respectively. Because the Langevin force appears explicitly, this equation of motion offers physical insight into the nature of the eq. (1.1) averaging procedure.
Quantum noise operator theory emphasizes the fundamental aspects of microscopic dynamical theory by working with coupled unaveraged operator equations of motion. This can lead to certain mathematical difficulties, for eq. (1.2) must in general be solved in conjunction with other operator equations of motion. Eq. (1.7) apparently represents some analytical economy by replacing several coupled operator equations of motion with one equation of motion for the density matrix. A bridge between $\hat a(t)$ and $\rho(t)$ analyses is provided by $U$. Thus, it may be anticipated that Green's function theory constitutes a bridge between quantum noise operator theory and density matrix theory.

To summarize, eqs. (1.1)-(1.7) suggest three approaches to the problem of analyzing the statistical and dynamical properties of a many-body interaction. One can work with the operator equations of motion for $\hat a(t)$ and $\hat a^\dagger(t)$; when one of the systems with which the radiation mode interacts is a reservoir, the reservoir coordinates can be eliminated and these equations become quantum noise operator equations of motion. Alternatively, one can use the relation $\mathrm{Tr}\,\{\rho(t_0)\,\hat a(t)\} = \mathrm{Tr}\,\{\rho(t)\,\hat a(t_0)\}$ and confine one's attention to solving the density matrix equation of motion. This approach yields the statistical state of the radiation mode and is particularly convenient for computing, in theory, any moment of the $\hat a$ and $\hat a^\dagger$ variables. The third approach is suggested by eq. (1.5) and is referred to as Green's function theory, Z-functional, or the forward-backward time approach interchangeably. The latter terminology will become clear when it is seen in context.

The simple radiation mode, reservoir interaction that we will use to compare and contrast the above three theoretical approaches consists of a single harmonic oscillator of frequency $\omega_0$ (a single mode of the radiation field) coupled to an ensemble of harmonic oscillators (e.g. phonons), which have closely spaced frequencies $\{\omega_k\}$ (GORDON, WALKER and LOUISELL [1963]). The single oscillator, with Hamiltonian $H_0 = \omega_0\,\hat a^\dagger \hat a$, is taken to interact with the reservoir of oscillators, having Hamiltonian $H_R = \sum_k H_k = \sum_k \omega_k\,\hat b_k^\dagger \hat b_k$, via single emission and absorption events: $V = \sum_k (\kappa_k\,\hat a^\dagger \hat b_k + \kappa_k^*\,\hat b_k^\dagger \hat a)$. In this problem, then, $\{\hat b_k\}$ is the reservoir and $\hat a$ the system whose behavior under the influence of $\{\hat b_k\}$ will be discussed from density matrix, Green's function, and quantum noise operator points of view.

This review is constructed in the following fashion. The most extensive and familiar development in quantum optics has been density matrix theory, and therefore we review it first. The natural, though less familiar, continuation of the $\rho$-story is provided by the theoretical development that is based on the Green's function or on the generalized partition functional $Z$.
$Z$ theory complements $\rho$ theory in that the mathematical procedures are reversed. In $Z$-theory, the probability distribution is constructed from the moments and correlation functions rather than vice versa. Next, an analysis of the operator equations of motion is given which links up all the preceding discussion in one formula, the quantum noise operator equation of motion. The radiation mode is shown to be a damped harmonic oscillator driven by a random Gaussian noise current. For a Markov process, this noise current is a randomizing Langevin force acting on the radiation mode. All of the density matrix, Z-functional, and quantum noise operator theory for the radiation mode applies to atomic systems as well. This paper is intended to be pedagogical in nature. It is not our intention to assign credit or provide an exhaustive bibliography; instead we refer the reader to those papers we have found useful.
§ 2. Density Matrix Theory
The physics of a system-reservoir interaction can be best appreciated if it is first viewed from a slightly revised version of the $\{\hat a, \{\hat b_k\}\}$ model (SCULLY and LAMB [1967]). The $\{\hat a, \{\hat b_k\}\}$ model is an idealized description of a radiation mode, reservoir interaction. In essence, it demonstrates the randomizing (damping) effect on the radiation mode of a spatially incoherent ensemble of atoms or phonons. The important point is that each phonon or atom of the $\hat b_k$ variety will have a different characteristic frequency $\omega_k$ as a result of spatial inhomogeneity. These frequency differences are produced by the interatomic interactions of the randomly situated atoms. This model is treated in detail in Appendix II. An interesting, and in some ways simpler, complement to this model is provided by one in which the spatial randomness of the reservoir is replaced by a temporal randomness. The correlation and relaxation times of the radiation mode are then no longer determined by the measure of the spatial incoherence of the atomic system but by the measure of its temporal incoherence. The relevance of this revamped $\{\hat a, \{\hat b_k\}\}$ model to the original one is that in gases, in particular, the atoms are in random motion and space-time incoherences are not separable. Hence, it is worth seeing that all paths lead to the same type of reservoir theory. Suppose therefore that all of the $\hat b_k$ systems are noninteracting and share the same level spacing, $\omega_0$. Imagine that the radiation mode is oscillating without loss in a cavity. The $\hat b_k$ atoms will be injected into the cavity at a rate of $w$ atoms per second. They will be assumed to interact with the mode uniformly for a time $\delta t$ with identical coupling strengths, $\kappa_0$. They then pass out of the cavity. Clearly, the atoms will take energy from the mode if they are injected into the cavity in their lower (ground) state. The effect of such a beam is to damp the mode, that is, to provide the cavity with a finite $Q$.
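It ultimately depletes the mode of all its energy. Our aim is to determine the secular motion of the radiation density matrix, $\rho^{r}$, by summing over incoherent contributions from the atoms. Each atom exchanges energy with the radiation mode via the "potential"
$$V' \equiv \kappa_0^{*}\,\hat b^{\dagger}\hat a + \kappa_0\,\hat a^{\dagger}\hat b .$$
The following density matrix calculation will be carried out in the interaction picture. It will be assumed that $\omega_0 = \omega_{a0}$; this will keep $V'$ time independent. When an atom enters the cavity at time $t$, it will change the density matrix for the radiation mode from $\rho^{r}(t; \hat a^{\dagger}, \hat a)$ to some other value at $t+\delta t$:
$$\rho^{r}(t+\delta t; \hat a^{\dagger}, \hat a) = \mathrm{Tr}_{a}\,\rho^{a,r}(t+\delta t; \hat a^{\dagger}, \hat a),$$
where $\rho^{a,r}$ is the density matrix of the combined atom-radiation system and $\mathrm{Tr}_a$ is the trace over the atom. The net change in $\rho^{r}$ is
$$\delta\rho^{r}(t; \hat a^{\dagger}, \hat a) = \rho^{r}(t+\delta t; \hat a^{\dagger}, \hat a) - \rho^{r}(t; \hat a^{\dagger}, \hat a).$$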
Since $\delta t$ is small, the only way for the mode to be damped is for many atoms to act on it. It will be assumed that all of these atomic interactions are uncorrelated with one another. When the atomic beam has such a complete temporal incoherence, the effect of $N$ atoms on changing $\rho^{r}$ is $N$ times the effect of one atom. For a time duration $\Delta t$ long with respect to $\delta t$ but short with respect to the field decay time $(\omega_{a0}/Q)^{-1}$ (produced by the cumulative effect of the beam), $\rho^{r}$ changes by
$$\Delta\rho^{r}(t) = N_A\,\delta\rho^{r}(t),$$
where $N_A$ is the number of atoms that pass through the cavity in the time $\Delta t$. This number is given by $w\,\Delta t$; therefore, a "coarse-grained" time derivative can be defined by
$$\frac{\Delta\rho^{r}(t)}{\Delta t} = w\,\delta\rho^{r}(t).$$
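To calculate $\delta\rho^{r}$, we will employ the solution by iteration to the interaction picture density matrix equation of motion,
$$i\,\frac{d\rho^{a,r}(t)}{dt} = [V', \rho^{a,r}(t)].$$
This solution is
$$\rho^{a,r}(t+\delta t) = \rho^{a,r}(t) - i\int_t^{t+\delta t}dt'\,[V', \rho^{a,r}(t)] + (-i)^2\int_t^{t+\delta t}dt'\int_t^{t'}dt''\,[V', [V', \rho^{a,r}(t)]] + \dots .$$
Moreover, because $V'$ is time independent,
$$\rho^{a,r}(t+\delta t) = \rho^{a,r}(t) - i\,\delta t\,[V', \rho^{a,r}(t)] - \tfrac12(\delta t)^2[V', [V', \rho^{a,r}(t)]] + \dots .$$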
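At the start of the interaction, $\rho^{a,r}(t) = \rho^{r}(t)\,|g\rangle\langle g|$, where $|g\rangle$ is the ground state of the atom. Since $\hat b|g\rangle = \langle g|\hat b^{\dagger} = 0$, it follows that $\mathrm{Tr}_a[V', \rho^{a,r}(t)] = 0$ and
$$\mathrm{Tr}_a[V', [V', \rho^{a,r}(t)]] = |\kappa_0|^2\{\hat a^{\dagger}\hat a\,\rho^{r}(t) + \rho^{r}(t)\,\hat a^{\dagger}\hat a - 2\hat a\,\rho^{r}(t)\,\hat a^{\dagger}\}.$$
Thus to lowest order in $\delta t$,
$$\rho^{r}(t+\delta t) \equiv \mathrm{Tr}_a\,\rho^{a,r}(t+\delta t) \approx \rho^{r}(t) - \tfrac12(|\kappa_0|\,\delta t)^2\{\hat a^{\dagger}\hat a\,\rho^{r}(t) + \rho^{r}(t)\,\hat a^{\dagger}\hat a - 2\hat a\,\rho^{r}(t)\,\hat a^{\dagger}\}.$$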
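The coarse-grained or secular time rate of change of $\rho^{r}$, induced by the atomic beam, is therefore given by
$$\frac{\Delta\rho^{r}}{\Delta t} = w\big(\rho^{r}(t+\delta t) - \rho^{r}(t)\big) = -\frac12\,\frac{\omega_{a0}}{Q}\{\hat a^{\dagger}\hat a\,\rho^{r}(t) + \rho^{r}(t)\,\hat a^{\dagger}\hat a - 2\hat a\,\rho^{r}(t)\,\hat a^{\dagger}\},$$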
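where the decay rate $\omega_{a0}/Q = w|\kappa_0|^2(\delta t)^2$. The ground state atomic beam damps the radiation mode and has the ability to drain it of all its energy. In the case of the $\{\hat a, \{\hat b_k\}\}$ model, the atoms are permanently situated within the cavity and, for this reason, in equilibrium, they maintain the mode in a finite nonzero state of incoherent excitation. The $\rho^{r}$ equation of motion that describes this and a frequency pulling effect is calculated in Appendix II. With the inclusion of free-field oscillations, it is
$$\frac{d\rho^{r}(t)}{dt} = -i\omega_{a0}[\hat a^{\dagger}\hat a, \rho^{r}(t)] + \Big(\frac{d\rho^{r}}{dt}\Big)_R , \qquad (2.1)$$
where the reservoir contribution $(d\rho^{r}/dt)_R$ to this equation is an augmented version of the reservoir contribution,
$$\frac{\omega_{a0}}{2Q}\{[\hat a, \rho^{r}\hat a^{\dagger}] - [\hat a^{\dagger}, \hat a\rho^{r}]\},$$
given above. It now includes frequency shift, $\Delta\omega_{a0}$, terms and $\sum_k n_k\gamma_k$ terms that imply a finite excitation of the mode in steady state:
$$\Big(\frac{d\rho^{r}}{dt}\Big)_R = -i\Delta\omega_{a0}[\hat a^{\dagger}\hat a, \rho^{r}(t)] + \tfrac12\gamma_a\{[\hat a, \rho^{r}(t)\hat a^{\dagger}] - [\hat a^{\dagger}, \hat a\rho^{r}(t)]\} - \tfrac12\sum_k n_k\gamma_k\big([\hat a^{\dagger}, [\hat a, \rho^{r}(t)]] + [\hat a, [\hat a^{\dagger}, \rho^{r}(t)]]\big). \qquad (2.2)$$
Here $\gamma_k = 2\pi|\kappa_k|^2\delta(\omega_{a0}-\omega_k)$ is the amount by which the $k$-th atom damps the mode, $\gamma_a = \sum_k\gamma_k$, and $n_k$ is the excitation level of the $k$-th atom: $n_k = \langle\hat b_k^{\dagger}\hat b_k\rangle$. Eq. (2.1) (together with eq. (2.2)) is known in the literature (modulo other interactions) as the master equation. For it to be valid, the interaction must be small in a sense to be stated momentarily. It is convenient to write $(d\rho^{r}/dt)_R$ in the form, eq. (2.2), since equations of motion for the moments $a(t) = \mathrm{Tr}(\rho\,\hat a(t)) = \mathrm{Tr}_r(\rho^{r}(t)\,\hat a)$ and $n(t) = \langle\hat a^{\dagger}(t)\hat a(t)\rangle = \mathrm{Tr}_r(\rho^{r}(t)\,\hat a^{\dagger}\hat a)$ can be quickly computed if the trace property $\mathrm{Tr}([A, B]C) = \mathrm{Tr}(A[B, C])$ is used:
$$\Big(\frac{d}{dt} + i\omega_a + \tfrac12\gamma_a\Big)a(t) = 0, \qquad (2.3)$$
$$\Big(\frac{d}{dt} + \gamma_a\Big)n(t) = \gamma_a\bar n. \qquad (2.4)$$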
Here, $\omega_a = \omega_{a0} + \Delta\omega_{a0}$ and $\bar n$ is defined by $\gamma_a\bar n = \sum_k n_k\gamma_k$. Note that $a^{*}a$ satisfies $(d/dt + \gamma_a)\,a^{*}a = 0$. Thus, the fact that the (macroscopic) motion of the $\hat a$ system is correlated to the statistical state of the $\{\hat b_k\}$ reservoir system has produced three physically identifiable effects. Macroscopically, $\hat a$ behaves like a damped harmonic oscillator which oscillates at a frequency shifted from its natural frequency. Furthermore, it is apparently driven by a (random) force that, in steady state, maintains $\hat a$ at an incoherent level of excitation given by $\bar n$, i.e. $\langle\hat a^{\dagger}\hat a\rangle$ remains nonzero while $a^{*}a$ goes to zero. It can now be understood in what sense the interaction $V$ must be small. The secular motion is determined from the damping rate $\gamma_a$ and the frequency shift $\Delta\omega_{a0}$. These must be small, in general, in comparison to $\omega_{a0}$ for the expansion of the dynamics in powers of $V$ to converge.

Having extracted information about the macroscopic motions of the $\hat a$ system, we would now like to solve for $\rho^{r}$ and, to do this, it is convenient to recast eq. (2.1) into another form. The oscillatory motion of $\rho^{r}$ can be eliminated from further consideration by defining a secular density matrix: $\rho^{s}(t) = \exp\{i\omega_a\hat a^{\dagger}\hat a\,(t-t_0)\}\,\rho^{r}(t)\,\exp\{-i\omega_a\hat a^{\dagger}\hat a\,(t-t_0)\}$. This is a slightly modified version of the interaction picture density matrix which includes the frequency shift. Eq. (2.1) can be rewritten in terms of $\rho^{s}$ as follows:
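$$\frac{d\rho^{s}(t)}{dt} + \tfrac12\gamma_a(1+2\bar n)\big(\hat a^{\dagger}\hat a\,\rho^{s}(t) + \rho^{s}(t)\,\hat a^{\dagger}\hat a\big) + \gamma_a\bar n\,\rho^{s}(t) = \gamma_a\bar n\,\hat a^{\dagger}\rho^{s}(t)\,\hat a + \gamma_a(1+\bar n)\,\hat a\,\rho^{s}(t)\,\hat a^{\dagger}. \qquad (2.5)$$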
In this form, the master equation can be transcribed quickly into one of its two principal numerical representations. Taking expectation values of eq. (2.5) between two photon states $|n'\rangle$ and $\langle n|$ yields
$$\frac{d\rho^{s}_{n,n'}}{dt} = -\tfrac12\gamma_a(1+2\bar n)(n+n')\,\rho^{s}_{n,n'} - \gamma_a\bar n\,\rho^{s}_{n,n'} + \gamma_a\bar n\sqrt{nn'}\,\rho^{s}_{n-1,n'-1} + \gamma_a(1+\bar n)\sqrt{(n+1)(n'+1)}\,\rho^{s}_{n+1,n'+1}. \qquad (2.6)$$
From eq. (2.6), it is a simple matter to solve for the steady state probability distribution of photons in the excitation spectrum of the $\hat a$ system. In steady state, $\rho^{s}_{n,n'} = 0$ for $n \neq n'$. Let $\rho^{ss}_{n,n}$ denote the steady state values of the diagonal elements of $\rho^{s}$. We see that the equations, $d\rho^{ss}_{n,n}/dt = 0$ for all $n$, can be solved for $\rho^{ss}_{n,n}$ by detailed balancing on a flow of probability diagram (SCULLY and LAMB [1967]):
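[Flow of probability diagram showing the photon levels $n+1$, $n$ and $n-1$: between levels $n$ and $n-1$, probability flows downward at the rate $\gamma_a(1+\bar n)\,n\,\rho^{ss}_{n,n}$ and upward at the rate $\gamma_a\bar n\,n\,\rho^{ss}_{n-1,n-1}$.]

Two term recursion relations are explicitly solvable and in this case the solution is, from $\gamma_a(1+\bar n)(1+n)\,\rho^{ss}_{n+1,n+1} = \gamma_a\bar n(1+n)\,\rho^{ss}_{n,n}$,
$$\rho^{ss}_{n,n} = \frac{\bar n^{\,n}}{(1+\bar n)^{n+1}}, \qquad (2.7)$$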
satisfying $\sum_n\rho^{ss}_{n,n} = 1$. This solution is the familiar blackbody photon distribution and indicates, as expected, that, to achieve steady state, the reservoir must bring the radiation mode into a common thermal equilibrium with itself and that the average excitation level of $\hat a$ is $\bar n$. The photon representation is useful for finding the steady state distribution of the density matrix, but the time dependent solution of $\rho^{s}$ involving $\gamma_a$ is much more difficult to obtain from eq. (2.6) than from the coherent state transcription of the master equation. Moreover, from the transient solution for $\rho^{s}$, relaxation times and correlation times can be calculated. The coherent states are an overcomplete set of states, which provide a valuable alternative transcription of the master equation. Well-prescribed means are available for expressing an arbitrary operator in terms of them (GLAUBER [1963]). In particular, the density matrix satisfies a coherent state P-representation (GLAUBER [1963], SUDARSHAN [1963], KLAUDER and MCKENNA [1965]):
$$\rho^{s}(t) = \int d^2\alpha\;|\alpha\rangle\langle\alpha|\;P^{s}(\alpha, t), \qquad (2.8)$$
where $\alpha = \alpha_r + i\alpha_i$ (real and imaginary parts) and $d^2\alpha = d\alpha_r\,d\alpha_i$. From $P$ the classical statistical aspects of the previous reservoir analysis become identifiable. Perhaps the best way to understand the analogy between $P$ and a classical probability distribution function is to view it removed from the context of the coherent states. From a macroscopic point of view, we want to obtain $\rho^{s}$ in order to compute the moments of fluctuating physical variables. These moments can be obtained by differentiation from the characteristic function, defined quantum mechanically in terms of $\rho^{s}$ as (GLAUBER [1966])
$$\chi(\lambda, t) = \mathrm{Tr}\{\rho^{s}(t)\,e^{\lambda\hat a^{\dagger}}e^{-\lambda^{*}\hat a}\}.$$
Namely,
$$\langle(\hat a_s^{\dagger}(t))^m(\hat a_s(t))^n\rangle = \Big(\frac{\partial}{\partial\lambda}\Big)^m\Big(-\frac{\partial}{\partial\lambda^{*}}\Big)^n\chi(\lambda, t)\Big|_{\lambda=0}, \qquad (2.9)$$
so that the expectation value of an arbitrary physical operator of the $\hat a$ system can be calculated from $\chi$ by performing the appropriate derivative operation. The inherent quantum uncertainties associated with measurements, made on the $\hat a$ system, of these physical quantities are convolved with the statistical uncertainties of the random many-body systems. For this reason, the theory that follows has all of the features of a classical noise theory, in which a deterministic motion is randomized by the random initial conditions or multiplicity of coordinates of a many-body system. At any rate, we expect the Fourier transform of the characteristic function to reproduce the "classical" probability distribution (GLAUBER [1966], MOLLOW and GLAUBER [1967]):
$$P^{s}(\alpha, t) = \frac{1}{\pi^2}\int d^2\lambda\;e^{\lambda^{*}\alpha - \lambda\alpha^{*}}\chi(\lambda, t), \qquad (2.10)$$
$$\langle(\hat a_s^{\dagger}(t))^m(\hat a_s(t))^n\rangle = \Big(\frac{\partial}{\partial\lambda}\Big)^m\Big(-\frac{\partial}{\partial\lambda^{*}}\Big)^n\int d^2\alpha\;e^{\lambda\alpha^{*} - \lambda^{*}\alpha}P^{s}(\alpha, t)\Big|_{\lambda=0} = \int d^2\alpha\;\alpha^{*m}\alpha^{n}P^{s}(\alpha, t). \qquad (2.11)$$
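As a concrete check on the photon-number picture, the sketch below integrates the diagonal part of the master equation, written as probability flows between neighbouring photon levels, and compares the late-time result with the thermal distribution $\bar n^{\,n}/(1+\bar n)^{n+1}$. It also evaluates the normally ordered moments $\langle(\hat a^{\dagger})^m\hat a^m\rangle$ as factorial moments of the photon distribution; for a thermal state these should equal $m!\,\bar n^{\,m}$, which is the value the $\alpha$-integral of eq. (2.11) gives for a Gaussian $P$. The truncation `nmax`, the step size, the time unit $\tau = \gamma_a t$ and all function names are assumptions made for illustration; this is not part of the original treatment.

```python
import math

def thermal_distribution(nbar, nmax):
    """p_n = nbar^n / (1 + nbar)^(n+1), truncated at nmax (assumed cutoff)."""
    return [nbar**n / (1.0 + nbar)**(n + 1) for n in range(nmax + 1)]

def rate_rhs(p, nbar):
    """Diagonal master equation as flows between levels n and n+1 (tau = gamma_a * t)."""
    dp = [0.0] * len(p)
    for n in range(len(p) - 1):
        up = nbar * (n + 1) * p[n]                  # n -> n+1
        down = (1.0 + nbar) * (n + 1) * p[n + 1]    # n+1 -> n
        net = up - down
        dp[n] -= net
        dp[n + 1] += net
    return dp

def evolve(p, nbar, tau, h=0.004):
    """Classical fourth-order Runge-Kutta integration over a time tau."""
    for _ in range(int(round(tau / h))):
        k1 = rate_rhs(p, nbar)
        k2 = rate_rhs([x + 0.5 * h * k for x, k in zip(p, k1)], nbar)
        k3 = rate_rhs([x + 0.5 * h * k for x, k in zip(p, k2)], nbar)
        k4 = rate_rhs([x + h * k for x, k in zip(p, k3)], nbar)
        p = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

def factorial_moment(p, m):
    """<(a^dagger)^m a^m> = sum_n n(n-1)...(n-m+1) p_n."""
    total = 0.0
    for n, pn in enumerate(p):
        term = 1.0
        for j in range(m):
            term *= (n - j)
        total += term * pn
    return total

nbar, nmax = 1.0, 40
p0 = [1.0] + [0.0] * nmax                 # start from the vacuum
p_late = evolve(p0, nbar, tau=15.0)       # many damping times later
p_th = thermal_distribution(nbar, nmax)
max_dev = max(abs(a - b) for a, b in zip(p_late, p_th))
mean_at_1 = factorial_moment(evolve(p0, nbar, tau=1.0), 1)

print("largest deviation from thermal distribution:", max_dev)
print("mean photon number at tau = 1:", mean_at_1)
print("second and third factorial moments:",
      factorial_moment(p_th, 2), factorial_moment(p_th, 3))
```

With $\bar n = 1$ the vacuum should relax toward the thermal distribution, and the mean photon number at $\tau = 1$ should be close to $\bar n(1-e^{-1})$, as the relaxation law $(d/d\tau + 1)\,n = \bar n$ suggests.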
As an aside, it should be commented that the substitution of eq. (2.8) into eq. (2.9) confirms the identity of the $P^{s}$ defined in eq. (2.10) with the $P^{s}$ defined in eq. (2.8). It might be objected that $P^{s}$ should not be called a probability distribution function because, for certain states, it has negative values and for others (a photon state) it is not well defined (GLAUBER [1964]). However, in thermal equilibrium, it is a well defined probability function and presumably it evolves in time from thermal equilibrium to well-defined probability functions. The pathologies may therefore have no physical significance and we shall ignore them. A general prescription based on $\chi$ for deriving P-function equations of motion from operator equations of motion, when the P-function contains atomic as well as radiation field variables, is possible (GORDON [1967], HAKEN, RISKEN and WEIDLICH [1967], AGARWAL and WOLF [1970]). When applied to the problem where atomic population coordinates are present in $P$, the $P$ equation of motion involves an infinity of variable differentiations and is, in general, quite formal and complicated (HAKEN, RISKEN and WEIDLICH [1967], LAX [1968]). As an illustrative example consider our single mode field interacting with its reservoir. Eq. (2.5) may be written as
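$$\frac{d\rho^{s}(t)}{dt} = \gamma_a\Big\{\tfrac12[\hat a, \rho^{s}(t)\hat a^{\dagger}] - \tfrac12[\hat a^{\dagger}, \hat a\rho^{s}(t)] - \tfrac{\bar n}{2}[\hat a^{\dagger}, [\hat a, \rho^{s}(t)]] - \tfrac{\bar n}{2}[\hat a, [\hat a^{\dagger}, \rho^{s}(t)]]\Big\}.$$
On substituting this equation into the time derivative of eq. (2.10) and utilizing the commutation relations
$$[\hat a, e^{\lambda\hat a^{\dagger}}e^{-\lambda^{*}\hat a}] = \lambda\,e^{\lambda\hat a^{\dagger}}e^{-\lambda^{*}\hat a}, \qquad [\hat a^{\dagger}, e^{\lambda\hat a^{\dagger}}e^{-\lambda^{*}\hat a}] = \lambda^{*}e^{\lambda\hat a^{\dagger}}e^{-\lambda^{*}\hat a},$$
along with the aforementioned trace property $\mathrm{Tr}(A[B, C]) = \mathrm{Tr}([A, B]C)$, one finds (with $\tau = \gamma_a t$) that
$$\frac{\partial P^{s}}{\partial\tau} = \frac{\partial}{\partial\alpha}\big(\tfrac12\alpha P^{s}\big) + \frac{\partial}{\partial\alpha^{*}}\big(\tfrac12\alpha^{*}P^{s}\big) + \frac{\partial^2}{\partial\alpha\,\partial\alpha^{*}}\big(\bar n P^{s}\big). \qquad (2.12)$$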
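This is a Fokker-Planck equation having the standard classical form (LAX and LOUISELL [1967])
$$\frac{\partial P}{\partial\tau} = -\frac{\partial}{\partial\alpha}(AP) - \frac{\partial}{\partial\alpha^{*}}(A^{*}P) + \frac{\partial^2}{\partial\alpha\,\partial\alpha^{*}}(2DP), \qquad (2.13)$$
where the drift vectors, $A$ and $A^{*}$, are read off from the equations of motion for $\alpha(\tau) = \exp\{i(\omega_a/\gamma_a)\tau\}\,a(\tau)$ and $\alpha^{*}(\tau) = \exp\{-i(\omega_a/\gamma_a)\tau\}\,a^{*}(\tau)$:
$$\frac{d\alpha(\tau)}{d\tau} = A(\tau), \qquad \frac{d\alpha^{*}(\tau)}{d\tau} = A^{*}(\tau),$$
and the diffusion coefficient is read from the equation of motion of $n(\tau) = \langle\hat a^{\dagger}(\tau)\hat a(\tau)\rangle$:
$$\Big(\frac{d}{d\tau} + 1\Big)n(\tau) = 2D.$$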
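A quick numerical sanity check of the drift and diffusion terms can be made in real variables, $\alpha = x + iy$, where eq. (2.12) becomes $\partial_\tau P = P + \tfrac12(xP_x + yP_y) + \tfrac{\bar n}{4}(P_{xx} + P_{yy})$. The sketch below tests, by central finite differences, that a Gaussian whose mean decays as $e^{-\tau/2}$ and whose width parameter grows as $\bar n(1-e^{-\tau})$ leaves a vanishing residual in this equation. The Gaussian trial form, the sample point, the step `h` and the function names are assumptions chosen for illustration.

```python
import math

def gaussian_trial(x, y, tau, x0, y0, nbar):
    """Trial solution: mean (x0, y0)*exp(-tau/2), width zeta = nbar*(1 - exp(-tau))."""
    zeta = nbar * (1.0 - math.exp(-tau))
    mx = x0 * math.exp(-0.5 * tau)
    my = y0 * math.exp(-0.5 * tau)
    r2 = (x - mx) ** 2 + (y - my) ** 2
    return math.exp(-r2 / zeta) / (math.pi * zeta)

def fokker_planck_residual(x, y, tau, x0, y0, nbar, h=1e-3):
    """dP/dtau - [P + (x P_x + y P_y)/2 + (nbar/4)(P_xx + P_yy)] by central differences."""
    def P(a, b, t):
        return gaussian_trial(a, b, t, x0, y0, nbar)
    p = P(x, y, tau)
    p_t = (P(x, y, tau + h) - P(x, y, tau - h)) / (2.0 * h)
    p_x = (P(x + h, y, tau) - P(x - h, y, tau)) / (2.0 * h)
    p_y = (P(x, y + h, tau) - P(x, y - h, tau)) / (2.0 * h)
    p_xx = (P(x + h, y, tau) - 2.0 * p + P(x - h, y, tau)) / h ** 2
    p_yy = (P(x, y + h, tau) - 2.0 * p + P(x, y - h, tau)) / h ** 2
    rhs = p + 0.5 * (x * p_x + y * p_y) + 0.25 * nbar * (p_xx + p_yy)
    return p_t - rhs

residual = fokker_planck_residual(0.3, -0.2, 1.0, x0=1.0, y0=0.5, nbar=1.5)
value = gaussian_trial(0.3, -0.2, 1.0, 1.0, 0.5, 1.5)
print("P at the sample point:", value)
print("Fokker-Planck residual:", residual)
```

The residual should be small compared with the value of $P$ itself, limited only by the finite-difference step.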
One of the nice features of Fokker-Planck theory is that it can be interpreted physically from the standpoint of Langevin forces (LAX [1960]). Furthermore, it demonstrates that far removed from the realm of basic quantum uncertainties, a quantum many-body system has classical statistical interpretations. The main utility of the Fokker-Planck equation is that, from its Green's function solution, one can calculate transition probabilities for discrete problems and relaxation rates for continuum problems. One seeks the retarded Green's function, $G_r(\alpha, \tau|\alpha', \tau') = \eta_+(\tau-\tau')\,P(\alpha, \tau|\alpha', \tau')$, where $\eta_+(\tau) = 1$ for $\tau > 0$ and $\eta_+(\tau) = 0$ for $\tau < 0$, and $P(\alpha, \tau|\alpha', \tau') \to \delta^{(2)}(\alpha-\alpha')$ when $\tau \to \tau'$. Through the Kolmogoroff-Smoluchowski relation,
$$P(\alpha, \tau) = \int d^2\alpha'\;P(\alpha, \tau|\alpha', \tau')\,P(\alpha', \tau'),$$
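$P(\alpha, \tau|\alpha', \tau')$ can be physically interpreted as the conditional probability for the system to move from $\alpha'$ to $\alpha$ in $\tau-\tau'$ under a Markovian statistical dynamics. There are general techniques available for solving a Fokker-Planck equation like eq. (2.13) (WAX [1954]). In particular, eq. (2.12) has the following closed solution for $P^{s}(\alpha, \tau|\alpha', \tau')$ valid for $\tau \ge \tau'$:*
$$P^{s}(\alpha, \tau|\alpha', \tau') = \frac{1}{\pi\bar n\,(1-e^{-(\tau-\tau')})}\exp\Big\{-\frac{|\alpha - \alpha' e^{-(\tau-\tau')/2}|^2}{\bar n\,(1-e^{-(\tau-\tau')})}\Big\}, \qquad (2.14)$$
which tends, for $\tau-\tau' \to \infty$, to the steady state distribution
$$P^{ss}(\alpha) = \frac{1}{\pi\bar n}\exp\{-|\alpha|^2/\bar n\}. \qquad (2.15)$$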
Except for one important statistical consideration, our review of the density matrix description of a radiation mode coupled to a reservoir is done. The solution for $P(\alpha, t)$ allows one to calculate relaxation times for any of the moments of eq. (2.11). According to the mathematical theory outlined in LAX and LOUISELL [1967], $P^{s}$ has the general eigenfunction expansion
$$P^{s}(\alpha, \tau) = \sum_l\exp\{-\Lambda_l(\tau-\tau')\}\,\varphi_l(\alpha)\int d^2\alpha'\;\varphi_l(\alpha')\,P^{s}(\alpha', \tau').$$
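* To obtain the Green's function solution in closed form, one looks for a solution of the type
$$P^{s}(\alpha, \tau|\alpha', \tau') = \nu\exp\{-|\alpha-\theta|^2/\zeta\},$$
where $\theta$ and $\zeta$ are solutions to the drift and diffusion equations of motion
$$\frac{d\theta}{d\tau} = A(\theta) = -\tfrac12\theta, \qquad \Big(\frac{d}{d\tau} + 1\Big)\zeta(\tau) = 2D,$$
and $\nu$ is determined from $\zeta$ by the equation
$$\frac{1}{\nu}\frac{d\nu}{d\tau} = -\frac{1}{\zeta}\frac{d\zeta}{d\tau}.$$
In order to satisfy the boundary condition, $\nu\exp\{-|\alpha-\theta|^2/\zeta\} \to \delta^{(2)}(\alpha-\alpha')$ as $\tau \to \tau'$, it is necessary that $\nu$, $\zeta$ and $\theta$ have limit values $\nu(\tau) \to 1/\pi\zeta(\tau)$, $\zeta(\tau) \to 0$, $\theta(\tau) \to \alpha'$ as $\tau \to \tau'$.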
Thus, depending on the amount of overlap of $\alpha^{*m}\alpha^{n}$ with $\varphi_l(\alpha)$, one can determine the relaxation rates of $\langle(\hat a^{\dagger}(t))^m(\hat a(t))^n\rangle$.

The one remaining consideration has stimulated considerable theoretical thought in recent years. It concerns the role of the correlation function in quantum optics; both its measurability and calculability have been at issue. Near thermal equilibrium, it has long been known (CALLEN and WELTON [1951], KUBO [1957], BERNARD and CALLEN [1959]) that the relaxation times and correlation times of a system are related. The fluctuation-dissipation theorem connects a system's dissipation or relaxation rate with its noise fluctuations, which, according to the noise spectrum, determine the time interval over which the fluctuating variable remains correlated to itself or other variables. Near other steady state points, one would like to have the same relationship between fluctuation and dissipation hold so that the density matrix theory outlined above could be utilized for the calculation of correlation times. An analysis of an atomic beam spectrum analyzer has been carried out (SCULLY and LAMB [1968]) which verifies that the inverse measured linewidth of the spectrum analyzer, the correlation time of the correlation function $\langle\hat a^{\dagger}(t)\hat a(0)\rangle$, and the relaxation time of $a(t)$ are all equal. It may also be shown (HANBURY-BROWN and TWISS [1956, 1957, 1958], GLAUBER [1963b, 1964]) that, within experimental limitations, a photoelectron correlation experiment can be used to measure the Hanbury-Brown, Twiss effect embodied in the correlation function $\langle\hat a^{\dagger}(0)\hat a^{\dagger}(t)\hat a(t)\hat a(0)\rangle$. An early analysis of this effect* showed that for a thermal light source, this correlation time was equal to the relaxation time of the quantity $n(t)$. The theoretical situation regarding the equality of relaxation and correlation times away from thermal equilibrium has been recently studied (LAX [1963]). In general, for a Markovian statistical dynamics, the equality holds true.
This result is useful in density matrix P-function theory and is called the quantum regression theorem (LAX [1967]). A discussion of this theorem is given in Appendix III. It implies that under the physical conditions where the radiation mode undergoes a Markovian time development, the spectral correlation function $\langle\hat a^{\dagger}(t')\hat a(t)\rangle$ and the Hanbury-Brown, Twiss correlation function $\langle\hat a^{\dagger}(t)\hat a^{\dagger}(t')\hat a(t')\hat a(t)\rangle$ can be evaluated in steady state in terms of the transient response of $\rho$ which is determined from $P^{s}(\alpha, \tau|\alpha', \tau')$:
$$\langle\hat a_s^{\dagger}(\tau)\hat a_s(0)\rangle = \int d^2\alpha\int d^2\alpha'\;\alpha^{*}\alpha'\,P^{s}(\alpha, \tau|\alpha', 0)\,P^{ss}(\alpha'), \qquad (2.16)$$
$$\langle\hat a_s^{\dagger}(0)\hat a_s^{\dagger}(\tau)\hat a_s(\tau)\hat a_s(0)\rangle = \int\!\!\int d^2\alpha\,d^2\alpha'\;|\alpha|^2|\alpha'|^2\,P^{s}(\alpha, \tau|\alpha', 0)\,P^{ss}(\alpha'), \qquad (2.17)$$

* This result is implicit in SCHWINGER [1961].

where $\hat a_s(\tau) = \exp(i\omega_a t)\,\hat a(t)$ ($\gamma_a t \equiv \tau$) is the secular part of $\hat a$. The simplest Markovian steady state to which eqs. (2.16) and (2.17) apply is the thermal steady state that is brought about by the $\{\hat b_k\}$ system. In this case, eqs. (2.14) and (2.15) should be substituted into eqs. (2.16) and (2.17). The $\alpha$ and $\alpha'$ integrations of eqs. (2.16) and (2.17) can then be carried out. It is found that, in thermal equilibrium under a Markovian dynamics,
$$\langle\hat a_s^{\dagger}(0)\hat a_s^{\dagger}(\tau)\hat a_s(\tau)\hat a_s(0)\rangle = \bar n^2(1+e^{-\tau}).$$
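The thermal result $\bar n^2(1+e^{-\tau})$ can also be cross-checked without coherent states. In the sketch below (an illustration under stated assumptions, not the derivation used in the text), the regression idea is applied in the photon-number basis: removing one photon from the thermal state gives the conditional distribution $q_m = (m+1)\,p_{m+1}/\bar n$, this distribution is relaxed for a time $\tau$ with the birth-death rate equations that follow from the diagonal part of eq. (2.5), and the Hanbury-Brown, Twiss correlation is $\bar n$ times the resulting mean photon number. The truncation `nmax`, the Runge-Kutta step and the function names are assumptions.

```python
import math

def rate_rhs(p, nbar):
    """Probability flows between photon levels n and n+1 (tau = gamma_a * t)."""
    dp = [0.0] * len(p)
    for n in range(len(p) - 1):
        net = nbar * (n + 1) * p[n] - (1.0 + nbar) * (n + 1) * p[n + 1]
        dp[n] -= net
        dp[n + 1] += net
    return dp

def evolve(p, nbar, tau, h=0.004):
    """Fourth-order Runge-Kutta integration of the rate equations."""
    for _ in range(int(round(tau / h))):
        k1 = rate_rhs(p, nbar)
        k2 = rate_rhs([x + 0.5 * h * k for x, k in zip(p, k1)], nbar)
        k3 = rate_rhs([x + 0.5 * h * k for x, k in zip(p, k2)], nbar)
        k4 = rate_rhs([x + h * k for x, k in zip(p, k3)], nbar)
        p = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

def hbt_correlation(nbar, tau, nmax=40):
    """<a+(0) a+(tau) a(tau) a(0)> for a thermal mode, via regression in the number basis."""
    p_ss = [nbar**n / (1.0 + nbar)**(n + 1) for n in range(nmax + 2)]
    # Distribution after one photon has been removed from the thermal state.
    q = [(m + 1) * p_ss[m + 1] / nbar for m in range(nmax + 1)]
    q_tau = evolve(q, nbar, tau)
    return nbar * sum(m * qm for m, qm in enumerate(q_tau))

nbar = 1.0
for tau in (0.0, 0.7, 3.0):
    print(tau, hbt_correlation(nbar, tau), nbar**2 * (1.0 + math.exp(-tau)))
```

The two printed columns should agree closely, with the correlation falling from $2\bar n^2$ at $\tau = 0$ toward $\bar n^2$ at long delays.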
§ 3. Green's Function Theory
Let us now discuss a methodology due to SCHWINGER [1961] (these techniques were applied to the laser problem by KORENMAN [1966]) that enables one to deal directly with the correlative aspects of a system-reservoir interaction. We shall employ external current sources in a somewhat formal fashion in order to study the effects of a reservoir on the macroscopic description of a radiation mode: its response to energy inputs, its internal degree of excitation, and its excitation spectrum. The principal mathematical function in this analysis might be termed a generalized partition functional $Z$. We have called Z-functional theory Green's function theory because perhaps the most important physical information in $Z$ is contained in its second functional derivatives, which define a matrix Green's function. In general, certain variables describe the macroscopic state of the system and one would like to compute their expected value. However, one would also like to compute correlation functions, such as $\langle\hat a^{\dagger}(t')\hat a(t)\rangle$, which determine the frequency spectrum of the radiation field. Moreover, as we shall see later, this correlation function governs, in part, the dynamical evolution of the atomic system. It therefore influences the dynamics from which it must be determined. Thus, although both density matrix and Green's function theory begin from certain common computational desires, their scope and emphasis are slightly different. In particular, an important element of the present approach is that one can functionally generate information about higher order moments and correlation functions from lower order moments. By definition, a macroscopic dynamical variable is one that can be coupled with an external force to excite and drive the system to study its response. One can couple to and drive the radiation field with external current sources $K$ and $K^{*}$ through the interaction
$$K^{*}(t)\,\hat a + \hat a^{\dagger}K(t).$$
The total (Schrödinger) Hamiltonian, $H_{SH}$, that determines the time evolution of the transition operator, $\hat U$, is time dependent:
$$H_{SH}(t) = (H_R + H_S + V)(t_0) + K^{*}(t)\,\hat a(t_0) + \hat a^{\dagger}(t_0)\,K(t).$$
Under the dynamics of $H_{SH}$, $i\partial_t\hat U(t, t_0) = H_{SH}(t)\,\hat U(t, t_0)$ and
$$a(t) = \mathrm{Tr}\{\hat U^{\dagger}(t, t_0)\,\hat a(t_0)\,\hat U(t, t_0)\,\rho(t_0)\} \qquad (3.1)$$
is a functional of $K$. By going to an interaction picture this $K$ dependence can be made explicit. In the interaction picture that we have in mind, $U$ satisfies $\partial_tU(t, t_0) = -i\big(K^{*}(t)\hat a(t) + \hat a^{\dagger}(t)K(t)\big)U(t, t_0)$. The formal solution to this equation is
$$U(t, t_0) = \Big(\exp\Big\{-i\int_{t_0}^{t}dt'\,\big(K^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K(t')\big)\Big\}\Big)_+ ,$$
where $(\ )_+$ means positive time ordering of the operators and, for example,
$$\hat a(t) = \exp\{i(H_S + H_R + V)(t-t_0)\}\,\hat a(t_0)\,\exp\{-i(H_S + H_R + V)(t-t_0)\}.$$
Also
$$U^{\dagger}(t, t_0) = \Big(\exp\Big\{i\int_{t_0}^{t}dt'\,\big(K^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K(t')\big)\Big\}\Big)_- ,$$
where $(\ )_-$ means negative time operator ordering. Now let us call the time development operator $U(t, t_0)$ the forward time operator, i.e., it takes us from $t_0$ to the later (forward) time $t$. In a like fashion, let us call $U^{\dagger}(t, t_0) = U(t_0, t)$ the backward time operator, i.e., it takes us from $t$ backward to the earlier time $t_0$. Eq. (3.1) now becomes
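$$a(t) = \mathrm{Tr}\Big[\Big(\exp\Big\{i\int_{t_0}^{t}dt'\,\big(K^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K(t')\big)\Big\}\Big)_-\,\hat a(t)\,\Big(\exp\Big\{-i\int_{t_0}^{t}dt'\,\big(K^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K(t')\big)\Big\}\Big)_+\,\rho(t_0)\Big]. \qquad (3.2)$$
Eq. (3.2) makes the current sources implicit in eq. (3.1) appear explicitly.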
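To make the idea of probing the mode with an external current concrete, the following sketch integrates the mean amplitude of a damped mode driven by a classical source. It assumes, as a simplification that goes beyond this point in the text, that the reservoir has already been eliminated so that $a(t)$ obeys $da/dt = -(i\omega_a + \tfrac12\gamma_a)\,a - iK(t)$ with $a(0) = 0$; the parameter values and function names are hypothetical. For a resonant source $K(t) = K_0e^{-i\omega_at}$ the numerical response is compared with the closed form $a(t) = -(2iK_0/\gamma_a)(1-e^{-\gamma_at/2})\,e^{-i\omega_at}$.

```python
import cmath
import math

def driven_response(K, omega, gamma, t_end, h=1e-3):
    """Integrate da/dt = -(i*omega + gamma/2)*a - i*K(t) from a(0) = 0 with RK4."""
    def f(t, a):
        return -(1j * omega + 0.5 * gamma) * a - 1j * K(t)
    a, t = 0j, 0.0
    for _ in range(int(round(t_end / h))):
        k1 = f(t, a)
        k2 = f(t + 0.5 * h, a + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, a + 0.5 * h * k2)
        k4 = f(t + h, a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += h
    return a

# Hypothetical parameters: mode frequency, damping rate, source strength.
omega, gamma, K0 = 5.0, 0.4, 0.1
t_end = 3.0

def resonant_source(t):
    return K0 * cmath.exp(-1j * omega * t)

a_num = driven_response(resonant_source, omega, gamma, t_end)
a_exact = (-2j * K0 / gamma) * (1.0 - math.exp(-0.5 * gamma * t_end)) \
    * cmath.exp(-1j * omega * t_end)
print("numerical response:", a_num)
print("closed form       :", a_exact)
```

Setting `K0 = 0` returns a vanishing amplitude, which mirrors the statement that the mean amplitude is generated entirely by the external source in this simplified picture.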
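Now comes a very clever trick due to SCHWINGER [1961]. Notice that if we let $K$, as it appears in $(\ )_+$ and $(\ )_-$, differ, i.e., if we pretend that the Hamiltonian is different for the $U$ and $U^{\dagger}$ operators, then it becomes possible to obtain $\langle\hat a^{\dagger}(t')\hat a(t)\rangle$ or $\langle\hat a(t)\hat a^{\dagger}(t')\rangle$ etc. by functional differentiation. We define
$$U_+(t, t_0) = \Big(\exp\Big\{-i\int_{t_0}^{t}dt'\,\big(K_+^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K_+(t')\big)\Big\}\Big)_+ ,$$
$$U_-^{\dagger}(t, t_0) = \Big(\exp\Big\{i\int_{t_0}^{t}dt'\,\big(K_-^{*}(t')\hat a(t') + \hat a^{\dagger}(t')K_-(t')\big)\Big\}\Big)_- ,$$
and
$$a[t|K_+, K_-] = \frac{1}{Z}\,\mathrm{Tr}\{U_-^{\dagger}(t, t_0)\,\hat a(t)\,U_+(t, t_0)\,\rho(t_0)\}, \qquad (3.3)$$
where $Z = \mathrm{Tr}\{U_-^{\dagger}(t, t_0)\,U_+(t, t_0)\,\rho(t_0)\}$ is introduced to insure that a unit operator has an expectation value of unity. $Z$ is a normalization functional which equals 1 when $K_+ = K_-$, $K_+^{*} = K_-^{*}$, for then $U_-^{\dagger}(t, t_0) = U_+^{-1}(t, t_0)$. Note that, as defined,*
$$a[t|K_+, K_-] = i\,\frac{\delta\ln Z}{\delta K_+^{*}(t)}, \qquad (3.3a)$$
so that $a[t|K_+, K_-]$ can be functionally generated from $Z$. Then by functional differentiation, for example,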
and
In both of these expressions, t is greater than t’. They illustrate what is meant by the statement: using external sources, K, that are distinguished on the forward, K+ , and backward, K - , time path, one can separately probe the system on these paths.
*
For example
The above discussion was intended to motivate the following review of SCHWINGER's [1961] treatment of the radiation mode, reservoir interaction. The mathematical quantity $Z$, on which his work is based, is an expectation value of $U_+$ and $U_-^{\dagger}$ operators that probe the system under study over a forward time path running from $t_0$ to $t_1$, and a backward time path from $t_1$ to $t_0$, encompassing the dynamical time interval, $[t_0, t_1]$, of interest (see Fig. 3.1):
$$Z[K_\pm, K_\pm^{*}] = \mathrm{Tr}\{U_-^{\dagger}(t_1, t_0)\,U_+(t_1, t_0)\,\rho(t_0)\}. \qquad (3.4)$$
Eq. (3.4) makes the relationship of $Z$ to $\rho(t)$ evident. For a particular burst of current at the time $t$: $K_-(t') = -i\lambda\,\delta(t'-t)$, $K_+^{*}(t') = -i\lambda^{*}\delta(t'-t)$, $K_-^{*} = K_+ = 0$, $Z$ becomes the characteristic function
$$Z = \mathrm{Tr}\{e^{\lambda\hat a^{\dagger}(t)}e^{-\lambda^{*}\hat a(t)}\rho(t_0)\} = \chi(\lambda, t), \qquad (3.5)$$
and contact is made with the previous section on density matrix theory. Operationally, we will want to obtain $Z$ from a functional integration. This will be possible, for instance, if the first functional derivatives of $Z$ can be calculated in closed form as functionals of $K_\pm$ and $K_\pm^{*}$. One could functionally differentiate eq. (3.4) to investigate this possibility. The functional form that results for $Z$ on varying the $K$'s in eq. (3.4) is
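$$\delta Z = \int_{t_0}^{t_1}dt\,\Big[\frac{\delta Z}{\delta K_+(t)}\,\delta K_+(t) + \frac{\delta Z}{\delta K_+^{*}(t)}\,\delta K_+^{*}(t) + \frac{\delta Z}{\delta K_-(t)}\,\delta K_-(t) + \frac{\delta Z}{\delta K_-^{*}(t)}\,\delta K_-^{*}(t)\Big]. \qquad (3.6)$$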
In order to obtain $Z$ we will make use of eq. (3.3a) and write, instead of eq. (3.6), the functional differential equation for $\ln Z$ as
$$\delta\ln Z = -i\int_{t_0}^{t_1}dt\,\big(a_+^{*}(t)\,\delta K_+(t) + a_+(t)\,\delta K_+^{*}(t) - a_-^{*}(t)\,\delta K_-(t) - a_-(t)\,\delta K_-^{*}(t)\big), \qquad (3.7)$$
where as implied by eq. (3.3a)
$$a_+^{*}(t) = i\,\frac{\delta\ln Z}{\delta K_+(t)}, \qquad (3.8a)$$
$$a_+(t) = i\,\frac{\delta\ln Z}{\delta K_+^{*}(t)}, \qquad (3.8b)$$
$$a_-^{*}(t) = -i\,\frac{\delta\ln Z}{\delta K_-(t)}, \qquad (3.8c)$$
$$a_-(t) = -i\,\frac{\delta\ln Z}{\delta K_-^{*}(t)}. \qquad (3.8d)$$
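Equation (3.7) must be functionally integrated subject to the boundary condition
$$Z[K_\pm = K_\pm^{*} = 0] = 1, \qquad (3.9)$$
which follows from the property $\mathrm{Tr}\,\rho = 1$. $Z$ can be found explicitly from eqs. (3.8) and (3.9) if $a_\pm$ and $a_\pm^{*}$ are known functionals of $K_\pm$ and $K_\pm^{*}$. The determination of $a_\pm$ and $a_\pm^{*}$ as functionals of $K_\pm$ and $K_\pm^{*}$ must be made from their equations of motion. The straightforward approach to finding an equation of motion for $a_+(t)$, for example, is to apply the operator $i\,d/dt$ to $a_+(t)$ as given by eq. (3.8b). Since, by definition of the time ordering,
$$\big(\hat a(t)\,U_+(t_1, t_0)\big)_+ = U_+(t_1, t)\,\hat a(t)\,U_+(t, t_0),$$
we may write
$$i\frac{d}{dt}a_+(t) = \frac{1}{Z}\,\mathrm{Tr}\Big\{U_-^{\dagger}(t_1, t_0)\Big[\Big(i\frac{d}{dt}U_+(t_1, t)\Big)\hat a(t)\,U_+(t, t_0) + U_+(t_1, t)\Big(i\frac{d\hat a(t)}{dt}\Big)U_+(t, t_0) + U_+(t_1, t)\,\hat a(t)\Big(i\frac{d}{dt}U_+(t, t_0)\Big)\Big]\rho(t_0)\Big\}, \qquad (3.10)$$
and because $(i\,d/dt - \omega_{a0})\,\hat a(t) = \sum_k\kappa_k\hat b_k(t)$, one finds that
$$\Big(i\frac{d}{dt} - \omega_{a0}\Big)a_+(t) = \sum_k\kappa_k\,b_{k+}(t) + K_+(t), \qquad (3.11)$$
where
$$b_{k+}(t) = \frac{1}{Z}\,\mathrm{Tr}\{U_-^{\dagger}(t_1, t_0)\,\big(\hat b_k(t)\,U_+(t_1, t_0)\big)_+\,\rho(t_0)\}. \qquad (3.12)$$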
This is formally the same equation of motion for $a_+(t)$ as the one $a(t)$ satisfies (eq. (3.1)) except that a plus subscript has been added. Thus, we are tempted (or should be tempted) to generalize $a(t) = \langle\hat a(t)\rangle$ to the statement, $a_+(t) = \langle\hat a_+(t)\rangle$, where $\hat a_+(t)$ is the Heisenberg radiation mode operator on the forward time path. Thus, when the external current on the forward time path is distinguished from the external current on the backward time path, time dependent Heisenberg operators (and states) must also be distinguished and labeled with plus and minus subscripts.
The picture that one should bear in mind is that of the system's operators undergoing a continuous time evolution over a bent time path:

Fig. 3.1. The dynamical evolution of a radiation mode, which is being driven by external source currents, $K_+$ and $K_-$, from some initial state at time $t_0$, is to be studied over a continuous time path that runs from $t_0$ to a later time, $t_1$, and then back to $t_0$.

$t_1$ is the continuity point at which $a_+(t_1) = a_-(t_1)$ and thus $\hat a_+(t_1) = \hat a_-(t_1)$. This assertion can be verified by considering the interaction picture definitions of $a_\pm(t)$, for example:
$$\langle\hat a_+(t_1)\rangle = \mathrm{Tr}\{U_-(t_0, t_1)\,\big(\hat a(t_1)\,U_+(t_1, t_0)\big)_+\,\rho(t_0)\} = \mathrm{Tr}\{U_-(t_0, t_1)\,\hat a(t_1)\,U_+(t_1, t_0)\,\rho(t_0)\} = \mathrm{Tr}\{\big(U_-(t_0, t_1)\,\hat a(t_1)\big)_-\,U_+(t_1, t_0)\,\rho(t_0)\} = \langle\hat a_-(t_1)\rangle. \qquad (3.13)$$
In summary, equations of motion for the first functional derivatives of $\ln Z$ are simply derived by taking an appropriate expectation value of the $\hat a_\pm$ and $\hat a_\pm^{\dagger}$ operator equations of motion. We have argued for this statement as a natural generalization of $\hat a(t)$ operator theory. The resulting equations of motion for $a_\pm$ and $a_\pm^{*}$ tell how the radiation field that is generated externally interacts with the $\{\hat b_k\}$ system. From a more significant point of view, however, the solutions for the first functional derivatives of $\ln Z$ tell how the radiation field responds to external probing as a composite system whose properties are acquired in part from the $\hat b_k$. Thus, in the model problem of the $\{\hat a, \{\hat b_k\}\}$ system, expectation values of the $\hat a_\pm$ operator equations of motion and their Hermitian adjoints give $a_\pm$ and $a_\pm^{*}$ responding to $K_\pm$ and $K_\pm^{*}$ through the intervention of $b_{k\pm}$ and $b_{k\pm}^{*}$:
$$\Big(i\frac{d}{dt} - \omega_{a0}\Big)a_\pm(t) = \sum_k\kappa_k\,b_{k\pm}(t) + K_\pm(t) \equiv k_\pm(t) + K_\pm(t), \qquad (3.14)$$
$$\Big(-i\frac{d}{dt} - \omega_{a0}\Big)a_\pm^{*}(t) = \sum_k\kappa_k^{*}\,b_{k\pm}^{*}(t) + K_\pm^{*}(t) \equiv k_\pm^{*}(t) + K_\pm^{*}(t). \qquad (3.15)$$
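Note again that $\langle\hat a_\pm(t)\rangle$ and $\langle\hat a_\pm^{\dagger}(t)\rangle$ satisfy equations of motion that are obtained from the $a(t)$ and $a^{*}(t)$ equations of motion by the addition of $+$ and $-$ subscripts to these latter variables. The corresponding equations of motion for $b_{k\pm}$ and $b_{k\pm}^{*}$, found from their operator equations of motion, are
$$\Big(i\frac{d}{dt} - \omega_k\Big)b_{k\pm}(t) = \kappa_k^{*}\,a_\pm(t), \qquad (3.16)$$
$$\Big(-i\frac{d}{dt} - \omega_k\Big)b_{k\pm}^{*}(t) = \kappa_k\,a_\pm^{*}(t). \qquad (3.17)$$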
The calculational program of generalized partition functional theory is contained in eqs. (3.7), (3.9), (3.14)-(3.17). From a solution to the lowest order moment equations of motion, which give the reservoir-influenced response of the radiation mode to external current sources, $Z$ is assembled by functional integration and, among other things, the P-function is calculated from eq. (3.5). Higher order moments or correlation functions can be obtained by functionally differentiating $Z$. Next we must solve eqs. (3.14)-(3.17) for $a_\pm(t)$ and $a_\pm^{*}(t)$ so that we may insert the solutions into eq. (3.7) and thus obtain $Z$. This is not conceptually difficult but does involve a fair amount of algebra and is therefore relegated to Appendix IV. We are most interested in problems associated with evolution to and from different states of thermal equilibrium and we therefore let $t_0$ and $t_1$ go to minus and plus infinity, respectively. As shown in detail in Appendix IV, we obtain the following important equation for the generalized functional:
$$\delta\ln Z^{ss} = \delta\Big(-i\int_{-\infty}^{\infty}dt\int_{-\infty}^{\infty}dt'\;\big(K_+^{*}(t),\ (-)K_-^{*}(t)\big)\;G(t-t')\begin{pmatrix}K_+(t')\\ (-)K_-(t')\end{pmatrix}\Big). \qquad (3.18)$$
$Z^{ss}$ is the steady state solution for $Z$. In other words, as was just stated, the correlation functions that are computed from it are all thermal equilibrium correlation functions. The matrix $G$ as given in Appendix IV is expressed in terms of the lowest order correlation functions of the $\hat a$ system:
$$G(t-t') = \begin{pmatrix}G_{++}(t-t') & G_{+-}(t-t')\\ G_{-+}(t-t') & G_{--}(t-t')\end{pmatrix} = \begin{pmatrix}-i\langle(\hat a(t)\hat a^{\dagger}(t'))_+\rangle & i\langle\hat a^{\dagger}(t')\hat a(t)\rangle\\ -i\langle\hat a(t)\hat a^{\dagger}(t')\rangle & i\langle(\hat a(t)\hat a^{\dagger}(t'))_-\rangle\end{pmatrix}. \qquad (3.19)$$
The fourfold information of $G$ reduces to twofold information as follows:
$$G_{++}(\tau) = \eta_+(\tau)\,G_{-+}(\tau) - \eta_-(\tau)\,G_{+-}(\tau), \qquad (3.20a)$$
$$G_{--}(\tau) = \eta_+(\tau)\,G_{+-}(\tau) - \eta_-(\tau)\,G_{-+}(\tau). \qquad (3.20b)$$
Eq. (A4.25) gives the values of $G_{-+}$ and $G_{+-}$ in thermal equilibrium:
$$iG_{-+}(t-t') = \langle\hat a(t)\hat a^{\dagger}(t')\rangle = (\bar n+1)\,e^{-i\omega_a(t-t')}e^{-\frac12\gamma_a|t-t'|}, \qquad (3.21a)$$
$$-iG_{+-}(t-t') = \langle\hat a^{\dagger}(t')\hat a(t)\rangle = \bar n\,e^{-i\omega_a(t-t')}e^{-\frac12\gamma_a|t-t'|}. \qquad (3.21b)$$
Note that a spectral function for the radiation mode can be defined in terms of the expectation value of the mode variable commutator $A(t-t') = \langle[\hat a(t), \hat a^{\dagger}(t')]\rangle$. The spectrum of each correlation function can be defined as the Fourier transform $G(\omega) = (2\pi)^{-1}\int_{-\infty}^{\infty}d\tau\,\exp(i\omega\tau)\,G(\tau)$. Because $\hat a$ and $\hat a^{\dagger}$ satisfy the equal time commutation relation $[\hat a, \hat a^{\dagger}] = 1$, $A(\omega)$ satisfies the sum rule $\int d\omega\,A(\omega) = 1$. From eqs. (3.21a) and (3.21b), two important facts are learned about mode spectra for modes in thermal equilibrium under a Markovian dynamics. First, the spectrum of field fluctuations, $\Delta F_a(\omega)$, which is the radiation mode spectrum determined experimentally, is proportional to $A(\omega)$, and second, $A(\omega)$ is Lorentzian:
$$A(\tau) = e^{-i\omega_a\tau}e^{-\frac12\gamma_a|\tau|}, \qquad A(\omega) = \frac{1}{2\pi}\,\frac{\gamma_a}{(\omega-\omega_a)^2 + (\tfrac12\gamma_a)^2}.$$
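The Lorentzian form can be confirmed numerically. The sketch below evaluates the Fourier transform $(2\pi)^{-1}\int d\tau\,e^{i\omega\tau}A(\tau)$ of $A(\tau) = e^{-i\omega_a\tau}e^{-\frac12\gamma_a|\tau|}$ by the trapezoid rule and compares it with $\gamma_a/2\pi\big[(\omega-\omega_a)^2 + (\tfrac12\gamma_a)^2\big]$. The cutoff `T`, the step `h`, the parameter values and the function names are assumptions made for illustration.

```python
import math

def spectral_function_numeric(omega, omega_a, gamma, T=80.0, h=0.002):
    """(1/2pi) * integral over tau of exp(i*omega*tau) * A(tau), trapezoid rule.

    By symmetry in tau only the cosine part survives, so the integral is
    twice the integral of cos((omega - omega_a)*tau) * exp(-gamma*tau/2) over [0, T].
    """
    d = omega - omega_a
    n = int(round(T / h))
    total = 0.5 * (1.0 + math.cos(d * T) * math.exp(-0.5 * gamma * T))
    for k in range(1, n):
        tau = k * h
        total += math.cos(d * tau) * math.exp(-0.5 * gamma * tau)
    return 2.0 * total * h / (2.0 * math.pi)

def lorentzian(omega, omega_a, gamma):
    """Closed-form spectral function with full width gamma at half maximum."""
    return gamma / (2.0 * math.pi) / ((omega - omega_a) ** 2 + (0.5 * gamma) ** 2)

omega_a, gamma = 5.0, 1.0
for w in (omega_a - 1.0, omega_a, omega_a + 0.3):
    print(w, spectral_function_numeric(w, omega_a, gamma), lorentzian(w, omega_a, gamma))
```

The sum rule $\int d\omega\,A(\omega) = 1$ corresponds to $A(\tau = 0) = 1$, and the peak value at $\omega = \omega_a$ is $2/\pi\gamma_a$.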
In effect, although eq. (3.18) presents $Z_{ss}$ in closed form, the information we have just obtained about the lowest order correlation functions comes from a functional differentiation of $a_+(t)$ and $a_-(t)$. In fact, depending on one's point of view, $Z$ in closed form contains either too much or too little information. For example, we would like to determine the relaxation rates of $a(t)$ and $n(t)$, yet eq. (3.18) only gives us their steady state values, for which $da/dt = dn/dt = 0$. Relaxation rates of $a(t)$ and $n(t)$ can be determined directly from the basic equations of motion (A4.16) and (A4.17) of Appendix IV (as can correlation times of lowest order correlation functions). On setting $K_+ = K_- = 0$, one finds that eq. (A4.16) vanishes and eq. (A4.17) specializes to
This equation is the same as eq. (2.3) and verifies that the relaxation time of $a(t)$ (or $a^*(t)$) is equal to the correlation time of $\langle\hat a^\dagger(t')\hat a(t)\rangle$ in thermal equilibrium since
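A functional derivative, $-\tfrac12 i\,\delta/\delta K_-(t')$, of (A4.16) less (A4.17) yields
$$\left(\frac{\partial}{\partial t} + i\omega_a\right)\langle\hat a^\dagger(t')\hat a(t)\rangle + \tfrac12\gamma_a\langle(\hat a(t)\hat a^\dagger(t'))_-\rangle = (\bar n+\tfrac12)\,\gamma_a\,\eta_-(t-t')\,A(t-t'). \tag{3.22}$$
A similar equation giving the $t'$ dependence of $\langle\hat a^\dagger(t')\hat a(t)\rangle$ can be found by functionally differentiating the $a_-^*(t')$ equation of motion with respect to $i\,\delta/\delta K_+^*(t)$:
$$\left(\frac{\partial}{\partial t'} - i\omega_a\right)\langle\hat a^\dagger(t')\hat a(t)\rangle + \tfrac12\gamma_a\langle(\hat a(t)\hat a^\dagger(t'))_+\rangle = (\bar n+\tfrac12)\,\gamma_a\,\eta_+(t-t')\,A(t-t'). \tag{3.23}$$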
The addition of eq. (3.22) to eq. (3.23), taken with the limit $t' \to t$ from either direction, produces an equation for $n(t)$ identical with eq. (2.4):
Observe that while $a_d \to 0$ when $K_\pm \to 0$, $\delta a_d/\delta K_\pm$ does not vanish when $K_\pm \to 0$, and is responsible for the term $\gamma_a\bar n$, which implies that the mode is incoherently excited. That $\langle\hat a^\dagger(t')\hat a^\dagger(t)\hat a(t)\hat a(t')\rangle$ has a correlation time in thermal equilibrium equal to the relaxation time $\gamma_a^{-1}$ of $n$ can be verified from eq. (3.18). From $\ln Z$ one calculates fluctuation correlation functions. Thus, for example, for $t > t'$,
and i3
63 In z = (6 1( t )ci+ ( t )ci+ ( t ‘)>- ( 2 ( t )6 +(t)>. + ( t ’) 6K: ( t ’) 6 K + ( t 0 )6 K ( t )
- i4
+
:
:
64 In z 6 K -( t ’) 6K*,( t ’) 6K +(t 0)6K*,( t )
+
= K
+
=K *
=0
(ci +(t’) ci +(t ) ci( t ) a( t ’)>
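Because $\ln Z_{ss}$ is a quadratic functional of the form $K^*GK$, the fourth order functional derivative of $\ln Z_{ss}$ is zero, as are second order functional derivatives involving two $K$'s or two $K^*$'s. Thus $\langle\hat a(t)\hat a(t')\rangle = \langle\hat a^\dagger(t')\hat a^\dagger(t)\rangle = 0$ and
$$\langle\hat a^\dagger(t')\hat a^\dagger(t)\hat a(t)\hat a(t')\rangle = \langle\hat a^\dagger(t)\hat a(t)\rangle\langle\hat a^\dagger(t')\hat a(t')\rangle + \langle\hat a^\dagger(t)\hat a(t')\rangle\langle\hat a^\dagger(t')\hat a(t)\rangle = \bar n^2\left(1 + e^{-\gamma_a|t-t'|}\right). \tag{3.24}$$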
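The factorization behind eq. (3.24) can be illustrated with a few lines of code. This is a minimal sketch, assuming the thermal first order correlation function of eq. (3.21b); the function names and numbers are illustrative only.

```python
import cmath
import math

def first_order(tau, n_bar, omega_a, gamma_a):
    """<a_dagger(t') a(t)> for tau = t - t', thermal equilibrium form of eq. (3.21b)."""
    return n_bar * cmath.exp(-1j * omega_a * tau) * math.exp(-0.5 * gamma_a * abs(tau))

def second_order(tau, n_bar, omega_a, gamma_a):
    """Gaussian factorization: product of mean occupations plus |first order function|^2."""
    g1 = first_order(tau, n_bar, omega_a, gamma_a)
    return n_bar ** 2 + abs(g1) ** 2

# Illustrative values: the equal-time result is twice the long-delay result.
bunched = second_order(0.0, n_bar=2.0, omega_a=10.0, gamma_a=0.5)
uncorrelated = second_order(100.0, n_bar=2.0, omega_a=10.0, gamma_a=0.5)
```

The factor of two between zero delay and long delay is the photon bunching of a thermal mode; the optical frequency drops out because only the modulus of the first order function enters.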
This is the same Hanbury-Brown, Twiss correlation effect for a radiation mode in thermal equilibrium that we derived before from an application of the quantum regression theorem. As a final comparison between $Z$ or $G$ theory and $\rho$ theory, we compute the P-function as determined from eqs. (3.18) and (3.5). To begin with,
$$Z_{ss} = \exp\left[-i\int_{-\infty}^{\infty}dt\int_{-\infty}^{\infty}dt'\,\bigl(K_+^*(t),\,-K_-^*(t)\bigr)\,G(t-t')\begin{pmatrix}K_+(t')\\ K_-(t')\end{pmatrix}\right],$$
satisfying $Z_{ss}[K_\pm = K_\pm^* = 0] = 1$. Hence,
$$\chi(\lambda) = Z_{ss}\bigl[K_-(t'') = -i\lambda\,\delta(t''-t),\ K_+^*(t') = -i\lambda^*\delta(t'-t),\ K_-^* = K_+ = 0\bigr] = \exp\{i|\lambda|^2G_{+-}(0)\} = \exp\{-\bar n|\lambda|^2\}$$
and from (3.5) and (2.10) we have
$$P(\alpha) = \frac{1}{\pi\bar n}\,e^{-|\alpha|^2/\bar n},$$
in agreement with $\rho$ theory.
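The correspondence between the Gaussian characteristic function and the thermal P-function can be checked by direct quadrature. The sketch below assumes the usual normally ordered convention $\chi(\lambda) = \int d^2\alpha\,P(\alpha)\exp(\lambda\alpha^* - \lambda^*\alpha)$ for the transform referred to as eq. (2.10); that convention, the function names, and the grid parameters are assumptions, not taken from the text.

```python
import cmath
import math

def p_thermal(x, y, n_bar):
    """Thermal P-function (1/(pi*n_bar)) * exp(-|alpha|^2/n_bar), alpha = x + i*y."""
    return math.exp(-(x * x + y * y) / n_bar) / (math.pi * n_bar)

def chi_numeric(lam, n_bar, half_width=8.0, steps=300):
    """Midpoint quadrature of the assumed transform of P over a square in the alpha plane."""
    w = half_width * math.sqrt(n_bar)
    h = 2.0 * w / steps
    total = 0j
    for i in range(steps):
        x = -w + (i + 0.5) * h
        for j in range(steps):
            y = -w + (j + 0.5) * h
            alpha = complex(x, y)
            phase = lam * alpha.conjugate() - lam.conjugate() * alpha  # purely imaginary
            total += p_thermal(x, y, n_bar) * cmath.exp(phase)
    return total * h * h

lam = 0.7 + 0.4j                    # illustrative argument
chi = chi_numeric(lam, n_bar=1.5)   # should reproduce exp(-n_bar*|lam|^2)
```

Setting `lam = 0` in the same routine returns the normalization of $P(\alpha)$, which is one.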
§ 4. Quantum Noise Operator Theory
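The operator equations of motion for the oscillators as they evolve under $H_S + H_R + V$ are
$$\left(\frac{d}{dt} + i\omega_{a0}\right)\hat a(t) = -i\sum_k\kappa_k\hat b_k(t), \tag{4.1}$$
$$\left(\frac{d}{dt} + i\omega_k\right)\hat b_k(t) = -i\kappa_k^*\hat a(t), \qquad k = 1, 2, \ldots. \tag{4.2}$$
By solving for the $\hat b_k(t)$ in terms of $\hat a(t')$ and the initial value of $\hat b_k$:
(4.3)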
one can eliminate the reservoir coordinates as active variables from the $\hat a$ equation of motion:
(4.4)
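For orientation, the elimination step can be written out explicitly. The block below is a hedged reconstruction, obtained by integrating the linear reservoir equation of motion directly; it assumes the coupled equations $(d/dt + i\omega_{a0})\hat a = -i\sum_k\kappa_k\hat b_k$ and $(d/dt + i\omega_k)\hat b_k = -i\kappa_k^*\hat a$, and its layout is not copied from the source.

```latex
% Formal solution of the reservoir equation (assumed form of eq. (4.3)):
\hat b_k(t) = \hat b_k(t_0)\,e^{-i\omega_k(t-t_0)}
  - i\kappa_k^*\int_{t_0}^{t}dt'\,e^{-i\omega_k(t-t')}\,\hat a(t')

% Substitution into the mode equation (assumed form of eq. (4.4)):
\left(\frac{d}{dt}+i\omega_{a0}\right)\hat a(t)
  + \int_{t_0}^{t}dt'\sum_k|\kappa_k|^2 e^{-i\omega_k(t-t')}\,\hat a(t')
  = -i\sum_k\kappa_k\,\hat b_k(t_0)\,e^{-i\omega_k(t-t_0)}
```

The right-hand side depends only on the initial reservoir operators; it is the quantity that plays the role of the noise current in what follows.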
We saw a similar integration in § 3. In keeping with previous arguments, it will be assumed that the destructive interference time $\tau_R$ of $\sum_k|\kappa_k|^2\exp(-i\omega_k\tau)$ is much smaller than the time over which significant phase and amplitude modulations of $\hat a$ take place. Then, for times $t \gg t_0 + \tau_R$,
$$\int_0^{t-t_0}d\tau\sum_k|\kappa_k|^2e^{-i\omega_k\tau}\,\hat a(t-\tau) \approx \hat a(t)\int_0^{\infty}d\tau\sum_k|\kappa_k|^2e^{i(\omega_{a0}-\omega_k)\tau} = \hat a(t)\left(\tfrac12\gamma_a + i\Delta\omega_{a0}\right).$$
When an ensemble of atoms radiates, two kinds of currents flow. One, in response to ambient fields, gives rise to stimulated emission and absorption; the other is autonomous and responsible for spontaneous emissions. It is appropriate therefore to identify the autonomous current, $-i\sum_k\kappa_k\hat b_k(t_0)\exp\{-i\omega_k(t-t_0)\} = \hat L(t)$, as a random Langevin-type current source and rewrite eq. (4.4) as
Apparently this quantum noise operator equation contains all of the information about the $\hat a$ system that we had extracted at some length in reviewing density matrix and Green's function theories. In particular, therefore, it contains the same information as the $a_\pm$ equations of motion. This observation permits us to make the following comments about and comparisons of the various theoretical techniques we are discussing. The relationship of quantum noise operator theory as represented by eq. (4.5) to density matrix theory is through the identity $\mathrm{Tr}(\rho(t)\hat a(t_0)) = \mathrm{Tr}(\rho(t_0)\hat a(t))$, which constructs density matrix theory as the Schrödinger picture side of the Heisenberg picture quantum noise operator theory. For a Markovian interaction, density matrix theory, used in conjunction with the quantum regression theorem, furnishes a complete description of the interaction; hence, eq. (4.5) must be a distillation of that description into one operator equation. The average of eq. (4.5) over the factorized initial density matrix $\rho(t_0)$, whose reservoir factor is $\rho_R(t_0) = \prod_k\rho_k(t_0)$, illustrates the gain in physical interpretation from this point of view. Since $\langle\hat L(t)\rangle = 0$, we obtain, as before, the equation $(d/dt + \tfrac12\gamma_a + i\omega_a)\,a(t) = 0$. The expectation value equation of motion loses
contact with the spontaneous noise current, but gains some physical information about it; namely, the $a$ equation of motion describes how coherent energy is degraded into incoherent energy at a rate $\gamma_a$, the rate of energy input into the $\{\hat b_k\}$ system. Having identified the "hidden variable" of density matrix and Green's function reservoir theory, we would like to employ it to rederive the previous density matrix and Green's function results and so obtain yet another perspective on them. This is the final variation on the theme of calculating reservoir correlation functions, moments, and relaxation rates. The working equations are eq. (4.5) along with its steady state solution:
which is attained when $t \gg \gamma_a^{-1}$, after the initial excitation level of $\hat a$ has been thermalized by the $\{\hat b_k\}$. We begin with a calculation of the equation of motion of $\hat n(t) = \hat a^\dagger(t)\hat a(t)$. Since $d\hat n/dt = \hat a^\dagger(d\hat a/dt) + (d\hat a^\dagger/dt)\hat a$, eq. (4.5) and its Hermitian adjoint imply that
$$\left(\frac{d}{dt} + \gamma_a\right)\hat n(t) = \hat a^\dagger(t)\hat L(t) + \hat L^\dagger(t)\hat a(t) = 2D + \hat L_n(t), \tag{4.7}$$
where $2D = \langle\hat a^\dagger\hat L + \hat L^\dagger\hat a\rangle_R$, and $\hat L_n(t) = \hat a^\dagger(t)\hat L(t) + \hat L^\dagger(t)\hat a(t) - 2D$. This procedure of separating noise from average values, apart from operator ordering considerations, proceeds along classical Langevin theory lines (LAX [1960]; see, e.g., LAX [1966]). As a reminder of the analogy to Brownian motion theory, the reservoir average value of $\hat a^\dagger\hat L + \hat L^\dagger\hat a$ is suggestively denoted as a diffusion coefficient. $\hat L_n(t)$ is a Langevin force associated with $\hat n$, analogous to $\hat L$ in the sense that $\langle\hat L_n(t)\rangle_R = \langle\hat L(t)\rangle_R = 0$. In steady state, when $dn/dt = 0$, $\gamma_an = 2D$ is regarded, in Brownian motion language, as a balancing of drift and diffusion (LAX [1960]). $D$ can be evaluated in two ways. By definition,
$$D = \mathrm{Re}\,\langle\hat L^\dagger(t)\hat a(t)\rangle_R = \mathrm{Re}\left\{i\sum_k\kappa_k^*e^{i\omega_k(t-t_0)}\,\mathrm{Tr}_R\bigl(\rho_R\,\hat b_k^\dagger(t_0)\,\hat a(t)\bigr)\right\}. \tag{4.8}$$
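The balance of drift and diffusion can be made concrete by integrating the reservoir-averaged number equation, in which the Langevin force averages to zero. The sketch below assumes the averaged form $dn/dt = -\gamma_an + 2D$ with $2D = \gamma_a\bar n$; the function name, the explicit Euler scheme, and all numbers are illustrative choices.

```python
import math

def relax_mean_photon_number(n0, gamma_a, two_d, t_end, steps=100_000):
    """Explicit Euler integration of dn/dt = -gamma_a*n + 2D (noise term averaged out)."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += dt * (-gamma_a * n + two_d)   # drift toward zero plus constant diffusion feed
    return n

gamma_a, n_bar = 0.5, 3.0                  # illustrative values
two_d = gamma_a * n_bar                    # steady-state balance gamma_a * n_bar = 2D
n_late = relax_mean_photon_number(10.0, gamma_a, two_d, t_end=40.0)
n_early = relax_mean_photon_number(10.0, gamma_a, two_d, t_end=2.0)
```

Whatever the initial excitation, the mean occupation relaxes at the rate $\gamma_a$ to $2D/\gamma_a$, the thermal value.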
The mode operator, $\hat a(t)$, is correlated to $\hat b_k^\dagger(t_0)$ through eq. (4.6) and the definition of $\hat L$:
There are several ways of evaluating $D$; the one we give reproduces previous Green's function and density matrix results. We first substitute eq. (4.9) into eq. (4.8), interchange sum and integration, and then use the fact that $\gamma_a^{-1} \gg \tau_R$:
(4.10)
$$= \tfrac12\sum_k\gamma_kn_k = \tfrac12\gamma_a\bar n \qquad (\text{valid for } t - t_0 \gg \tau_R). \tag{4.11}$$
In the above calculations of $D$, the important contributions to the integral came from the integration region close to $t$. Thus, for purposes of calculating $\hat L^\dagger\hat a$ and $\hat a^\dagger\hat L$ moments, secular changes in $\hat a$ can be neglected and one can take
Sli_d:t'
qt)z
e-'"'-")A
L(t ') + 'i(t - dt)
(4.12)
along with
, z
M
qs(t - t ' -
=
0,
W
>
R
0 )+ s(t - t'+O)),
(4.13) (4.14)
where $2D$ is given by eq. (4.10) and $dt > \tau_R \approx 0$. The $\delta$-function correlation property enforces or represents the fact that $\tau_R$ is the smallest time constant in the problem. The noise current uncorrelates from itself and from $\hat a$, eqs. (4.13) and (4.14), much faster than it can eradicate $\hat a$ correlations, which decay through interaction with the $\{\hat b_k\}$ system, but only at the rate $\gamma_a$. To calculate thermal equilibrium correlation functions for the $\hat a$ system, we want to utilize eq. (4.6). Again, the field-fluctuation and Hanbury-Brown, Twiss correlation functions are of interest:
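For the calculation of the field-fluctuation correlation function, eq. (4.13) is useful; the calculation of the Hanbury-Brown, Twiss correlation function hinges on the fact that $\hat L$ is a Gaussian noise current (SENITZKY [1961]). From the definition of $\hat L$ in terms of $\hat b_k(t_0)$ one finds that
$$\langle\hat L^\dagger(t_1)\hat L^\dagger(t_2)\hat L(t_3)\hat L(t_4)\rangle = \langle\hat L^\dagger(t_1)\hat L(t_3)\rangle\langle\hat L^\dagger(t_2)\hat L(t_4)\rangle + \langle\hat L^\dagger(t_1)\hat L(t_4)\rangle\langle\hat L^\dagger(t_2)\hat L(t_3)\rangle. \tag{4.17}$$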
Therefore eq. (4.13) can be used to evaluate both eqs. (4.15) and (4.16). Since $2D = \gamma_a\bar n$, the answers come out in agreement with eqs. (3.44) and (3.47). The final link of this classical-noise-theory-made-quantum story concerns the quantum noise operator calculation of the P-function. Not too surprisingly, quantum noise operator theory furnishes the quantum analogue of the classical Langevin force derivation of the Fokker-Planck equation. To begin with, some reservoir averages, $\langle\ldots\rangle_R = \mathrm{Tr}_R(\rho_R\ldots)$, of $d\hat a(t) = \hat a(t) - \hat a(t-dt)$ and $d\hat a^\dagger(t) = \hat a^\dagger(t) - \hat a^\dagger(t-dt)$ need to be analyzed. Eq. (4.12) and its Hermitian adjoint imply that
$$\langle d\hat a^\dagger(t)\,d\hat a(t)\rangle_R = \int_{t-dt}^{t}dt'\int_{t-dt}^{t}dt''\,\langle\hat L^\dagger(t')\,e^{i\omega_{a0}(t'-t'')}\,\hat L(t'')\rangle_R = 2D\,dt. \tag{4.18}$$
Similarly, $\langle\hat L^\dagger(t)\hat L^\dagger(t')\rangle_R = \langle\hat L(t)\hat L(t')\rangle_R = 0$ implies $\langle d\hat a^\dagger(t)\,d\hat a^\dagger(t)\rangle_R = \langle d\hat a(t)\,d\hat a(t)\rangle_R = 0$. Furthermore, eq. (4.17) implies the result
$$\langle(d\hat a^\dagger(t))^2(d\hat a(t))^2\rangle_R = 8D^2(dt)^2. \tag{4.19}$$
For a Gaussian noise current, therefore, only the reservoir average $\langle d\hat a^\dagger(t)\,d\hat a(t)\rangle_R$ is proportional to $dt$. All other averages of moments of $d\hat a$ and $d\hat a^\dagger$ are of higher order in $dt$. The calculation of the P-function from eq. (4.5) proceeds, as in Green's function and density matrix theory, through eq. (2.9) for the characteristic function. Since we are now working in the Heisenberg rather than the Schrödinger picture, eq. (2.9) must be rewritten as
dn,t,
=
aci+(t) - n * r i ( t )
Tr (~I(~O)~R(~O)~
>.
(4.20)
There is a quick solution to eq. (4.20). If $\hat L$ is a Gaussian noise current then $\hat a$ is Gaussian in response to it. Therefore, according to the Gaussian evaluation of characteristic functions (KLAUDER and SUDARSHAN [1968]),
$$\chi(\lambda, t) = \langle e^{\lambda\hat a^\dagger(t)}e^{-\lambda^*\hat a(t)}\rangle = e^{-|\lambda|^2\langle\hat a^\dagger(t)\hat a(t)\rangle} = e^{-|\lambda|^2\bar n}.$$
Another procedure for calculating $\chi$ is to derive an equation of motion for it from eqs. (4.5) and (4.20) (LAX and LOUISELL [1967]). The method of evaluating the $\chi$ time derivative from eq. (4.20) (LAX and LOUISELL [1967]) is to work from the definition of the operator time derivative:
$$\frac{d}{dt}\left(e^{\lambda\hat a^\dagger(t)}e^{-\lambda^*\hat a(t)}\right) = \lim_{dt\to0}\frac{1}{dt}\left(e^{\lambda(\hat a^\dagger(t-dt)+d\hat a^\dagger(t))}e^{-\lambda^*(\hat a(t-dt)+d\hat a(t))} - e^{\lambda\hat a^\dagger(t-dt)}e^{-\lambda^*\hat a(t-dt)}\right).$$
Eq. (4.12) implies that $[\hat a(t-dt), d\hat a(t)] = [\hat a^\dagger(t-dt), d\hat a(t)] = 0$ and therefore,
The expression on the right must be averaged over the reservoir before the limit is taken. Invoking the Markov property, one can uncorrelate $d\hat a(t)$ and $d\hat a^\dagger(t)$ (that is, $\hat L(t)$ and $\hat L^\dagger(t)$) from $\hat a(t-dt)$ and $\hat a^\dagger(t-dt)$. Thus, the calculation is factorized:
-A*b(t))>.
NN
;;yo
((edh'('-dt)e-Ald(Z-dt)
>R.
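The expression multiplying $\langle\exp\{\lambda\hat a^\dagger(t-dt)\}\exp\{-\lambda^*\hat a(t-dt)\}\rangle_R$ can, by considerations associated with the derivation of eqs. (4.18) and (4.19), be expanded as a power series in $dt$:
$$\frac{1}{dt}\left\langle e^{\lambda\,d\hat a^\dagger(t)}e^{-\lambda^*d\hat a(t)} - 1\right\rangle_R = \lambda\left\langle\frac{d\hat a^\dagger(t)}{dt}\right\rangle_R - \lambda^*\left\langle\frac{d\hat a(t)}{dt}\right\rangle_R - |\lambda|^2\left\langle\frac{d\hat a^\dagger(t)\,d\hat a(t)}{dt}\right\rangle_R + O(dt).$$
In the limit $dt \to 0$, the terms of order $dt$ are negligible so that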
An eq. (2.10)-type Fourier transformation of this equation, apart from oscillatory motion, yields the Fokker-Planck equation for P that we had obtained earlier. The physical assumptions that go into this derivation appear more stringent than those made in the density matrix derivation, but they are physically equivalent. Once the Markov approximation is made, the reservoir, radiation mode theory becomes Gaussian in the usual second-order, Fokker-Planck, differential equation sense. However, even when the Markov approximation is invalid, the reservoir interaction is Gaussian. For example, suppose that
can no longer be meaningfully approximated by $\sum_kn_k\gamma_k\,\delta(t-t')$. Notwithstanding this nonlocality in the theory, the P-function remains unchanged in form in steady state. Only the method of computing $\bar n$ changes. A slightly more general quantum noise theory than eq. (4.5) shows this. Suppose that eq. (4.4) is not converted into eq. (4.5) form, i.e.:
where $\Gamma(t-t') \equiv \sum_k|\kappa_k|^2\exp\{-i\omega_k(t-t')\}$. This equation is valid for $t - t_0 \gg$ the relaxation time of $\hat a$. Eq. (4.6) is then replaced by
$$\hat a(t) \approx \int_{-\infty}^{\infty}dt'\,G_r(t-t')\,\hat L(t'),$$
where $G_r$, a retarded Green's function, satisfies
$$\left(\frac{d}{dt} + i\omega_{a0}\right)G_r(t-t') + \int_{-\infty}^{\infty}dt''\,\eta_+(t-t'')\,\Gamma(t-t'')\,G_r(t''-t') = \delta(t-t').$$
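Consequently, in steady state,
$$\chi(\lambda, t) = \left\langle\exp\left\{\lambda\int_{-\infty}^{\infty}dt'\,G_r^*(t-t')\,\hat L^\dagger(t')\right\}\exp\left\{-\lambda^*\int_{-\infty}^{\infty}dt''\,G_r(t-t'')\,\hat L(t'')\right\}\right\rangle$$
where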
Eq. (4.21) identifies $\bar n$ as $\sum_kn_k|g_k|^2 = 4\pi^2\sum_k|\kappa_k|^2n_k|G_r(\omega_k)|^2$. The simplicity of this reservoir model and our assumption that
$$\rho_R = \prod_k\left(\mathrm{Tr}_k\exp\{-\beta H_k\}\right)^{-1}\exp\{-\beta H_k\}$$
has made the computation of the correlation properties of the noise current an easy matter.

Appendix I

One usually encounters the distinction between the Schrödinger and Heisenberg pictures from an equation such as eq. (1.1) with $\rho(t_0) = |\gamma, t_0\rangle\langle\gamma, t_0|$:
$$a(t) = \langle\gamma, t_0|\hat a(t)|\gamma, t_0\rangle.$$
Rather than calculate $a(t)$ from the operator equation of motion for $\hat a(t)$, one can define Schrödinger states $|\gamma, t\rangle_s$ and ${}_s\langle\gamma, t|$, which time evolve from the initially specified states $|\gamma, t_0\rangle$ and $\langle\gamma, t_0|$ respectively, according to the condition $\langle\gamma, t_0|\hat a(t)|\gamma, t_0\rangle = {}_s\langle\gamma, t|\hat a(t_0)|\gamma, t\rangle_s$. This condition is satisfied if $|\gamma, t\rangle_s = e^{-iH(t-t_0)}|\gamma, t_0\rangle$ or $i\partial_t|\gamma, t\rangle_s = H|\gamma, t\rangle_s$. The shift from calculating the time dependence of $\hat a(t)$ to calculating the time dependence of $|\gamma, t\rangle_s$ is equivalent to the transition $\hat a(t) \to \rho(t)$. Note that $\rho(t) = |\gamma, t\rangle_s\,{}_s\langle\gamma, t|$. The Schrödinger state or the Schrödinger picture is employed as a mathematical device, which provides an alternative method for calculating quantities like $a(t)$. Green's function theory (or $Z$-functional theory) is also based on a mathematical device, one less familiar than the Schrödinger picture density matrix yet equally powerful. It is an approach that can also be conveniently formulated in terms of time dependent states. These states define the time evolution of operators as well as the basic measurement processes of quantum mechanics. For example, let $|n, t_0\rangle$ be an $n$ photon state at the time $t_0$: $\hat a^\dagger(t_0)\hat a(t_0)|n, t_0\rangle = n|n, t_0\rangle$. If it were possible to prepare and measure such a state, one might ask the question: what is the probability that at the time $t\,(> t_0)$ the radiation field has evolved into the state of $m$ photons $\langle m, t|$, where $\langle m, t|\hat a^\dagger(t)\hat a(t) = \langle m, t|m$? The quantum mechanical answer is $|\langle m, t|n, t_0\rangle|^2$. The state $\langle m, t|$ has a time dependence determined from $i\partial_t\langle m, t| = \langle m, t|H$, that is, $\langle m, t| = \langle m, t_0|e^{-iH(t-t_0)}$.
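The equivalence of the two pictures can be checked on a toy system. The sketch below is an illustration only: it assumes a two-level system with Hamiltonian $H = \mathrm{diag}(0, \omega)$ and a lowering operator standing in for $\hat a$; the matrices, helper names, and numbers are all hypothetical.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Hermitian adjoint of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def propagator(omega, t):
    """U = exp(-i*H*t) for the toy Hamiltonian H = diag(0, omega)."""
    return [[1 + 0j, 0j], [0j, cmath.exp(-1j * omega * t)]]

a_op = [[0j, 1 + 0j], [0j, 0j]]                              # lowering operator of the toy system
rho0 = [[0.7 + 0j, 0.3 - 0.1j], [0.3 + 0.1j, 0.3 + 0j]]      # initial state with coherence

u = propagator(omega=2.0, t=0.9)
# Heisenberg picture: the operator carries the time dependence, Tr(rho(t0) a(t)).
heisenberg = trace(matmul(rho0, matmul(dagger(u), matmul(a_op, u))))
# Schroedinger picture: the density matrix carries it, Tr(rho(t) a(t0)).
schroedinger = trace(matmul(matmul(u, matmul(rho0, dagger(u))), a_op))
```

The two numbers agree because of the cyclic invariance of the trace, which is all that the identity between the pictures uses.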
In particular, a differentiation with respect to time shows that $\langle m, t|\hat a(t)|n, t\rangle$ is time independent. The difference in viewpoint between $|n, t\rangle$ and $|n, t\rangle_s$ is a difference between a plus and minus rotation of coordinate system respectively. Suppose, for example, that initially $\rho(t_0) = \sum_{n,m}|n, t_0\rangle\rho_{nm}\langle m, t_0|$. Then, at a later time,
$$\rho(t) = \sum_{n,m}|n, t_0\rangle\,\rho_{nm}(t)\,\langle m, t_0| \quad \text{(fixed coordinate system)}$$
$$\phantom{\rho(t)} = \sum_{n,m}|n, t\rangle_s\,\rho_{nm}(t_0)\,{}_s\langle m, t| \quad \text{(negative rotation of coordinate system).}$$
By way of contrast, $\hat a^\dagger(t_0)\hat a(t_0) = \sum_n|n, t_0\rangle n\langle n, t_0|$ and at a later time
$$\hat a^\dagger(t)\hat a(t) = \sum_{n,m}|n, t_0\rangle\,n_{nm}(t)\,\langle m, t_0| \quad \text{(fixed coordinate system)}$$
$$\phantom{\hat a^\dagger(t)\hat a(t)} = \sum_n|n, t\rangle\,n\,\langle n, t| \quad \text{(positive rotation of coordinate system).}$$
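The operator time-dependent states provide a useful conceptualization of the third approach to computing moments and correlation functions. Let $\hat a(t)$ be expanded in terms of some complete set of operator states at the time $t$:
$$\hat a(t) = \sum_{\gamma',\gamma''}|\gamma', t\rangle\langle\gamma'|\hat a|\gamma''\rangle\langle\gamma'', t|.$$
Then
$$a(t) = \sum_{\gamma,\gamma',\gamma'',\gamma'''}\langle\gamma, t_0|\gamma', t\rangle\langle\gamma'|\hat a|\gamma''\rangle\langle\gamma'', t|\gamma''', t_0\rangle\langle\gamma'''|\rho|\gamma\rangle \tag{A1.1}$$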
expresses $a(t)$ as a particular linear combination of elemental transition amplitudes $\langle\gamma, t_0|\gamma', t\rangle\langle\gamma'', t|\gamma''', t_0\rangle$. The transition amplitude $\langle\gamma'', t|\gamma''', t_0\rangle$ measures the amount by which the system (radiation field plus atoms), initially in the state $\gamma'''$ at the time $t_0$, will be found through the intervention of dynamical processes in the state $\gamma''$ at the later time $t$; if one could prepare and measure such many-body states, one would eventually measure the probability for the transition to occur: $|\langle\gamma'', t|\gamma''', t_0\rangle|^2$. Conceptually, therefore, $\langle\gamma, t_0|\gamma', t\rangle$ must be interpreted as the transition amplitude of a time reversed process or as a question about the past; namely, what is the amount by which the system in the state $\gamma'$ at time $t$ was in the $\gamma$ state at the time $t_0$. The calculation of the time history of expected values of macroscopic variables is therefore a complicated summation of elementary microscopic processes, which, in some conceptual sense, involves a knowledge about the time reversed motion of the overall system. Eq. (A1.1) suggests an approach to calculating $a(t)$ that has many mathematical advantages. One probes separately, with external forces, the forward and backward time development of the system contained in the summation over transition amplitudes. For example, the time behavior of the total system can be studied by driving the radiation field with an external source of current $K$. If $K$ were operating while the system evolved from the state $|\gamma''', t_0\rangle$ to other states, the "forward time" transition amplitude $\langle\gamma'', t|\gamma''', t_0\rangle$ would be $K$ dependent. As a purely formal mathematical procedure, it is possible to compute $a(t)$ as a functional of external currents $K$ such that, for the forward time transition amplitudes $\langle\gamma'', t|\gamma''', t_0\rangle$, there is a different $K$ operating than for the backward time transition amplitudes $\langle\gamma, t_0|\gamma', t\rangle$. This formal procedure converts the problem of computing correlation functions into one of calculating functional derivatives. Another way of writing eq. (A1.1) is in terms of the transition operator $U$. When external forces are acting, the physical picture is as in Fig. A.1,
Fig. A.l. External (complex) currents, K and K*, feed energy into a radiation mode, system S, which is coupled to a reservoir, R, through an interaction potential, V.
and the total Hamiltonian $H$ is time dependent:
$$H(t) = (H_S + H_R + V)(t) + K^*(t)\hat a(t) + \hat a^\dagger(t)K(t).$$
The solution to the equation $i\partial_t\langle\gamma'', t|\gamma''', t_0\rangle = \langle\gamma'', t|H(t)|\gamma''', t_0\rangle$ satisfying the appropriate boundary condition at $t = t_0$ can be expressed as
$$\langle\gamma'', t|\gamma''', t_0\rangle = \langle\gamma'', t_0|U(t, t_0)|\gamma''', t_0\rangle$$
where $U(t, t_0)$ is the transition operator which satisfies
$$i\partial_tU(t, t_0) = H_{SH}(t)\,U(t, t_0) = U(t, t_0)\,H(t), \qquad U(t_0, t_0) = 1.$$
The Schrödinger time dependent Hamiltonian $H_{SH}(t)$ is given by
$$H_{SH}(t) = (H_S + H_R + V)(t_0) + K^*(t)\hat a(t_0) + \hat a^\dagger(t_0)K(t)$$
and is related to $H(t)$ by
$$H(t) = (U(t, t_0))^{-1}H_{SH}(t)\,U(t, t_0).$$
Thus
$$i\partial_t\langle\gamma'', t|\gamma''', t_0\rangle = \langle\gamma'', t_0|H_{SH}(t)\,U(t, t_0)|\gamma''', t_0\rangle = \langle\gamma'', t_0|U(t, t_0)\,H(t)|\gamma''', t_0\rangle = \langle\gamma'', t|H(t)|\gamma''', t_0\rangle.$$
Similarly
$$\langle\gamma, t_0|\gamma', t\rangle = \langle\gamma, t_0|U^\dagger(t, t_0)|\gamma', t_0\rangle,$$
where $U^\dagger(t, t_0) = (U(t, t_0))^{-1} = U(t_0, t)$ satisfies
$$-i\partial_tU^\dagger(t, t_0) = U^\dagger(t, t_0)\,H_{SH}(t).$$
Using these solutions for the transition amplitudes, we can rewrite eq. (A1.1) as
$$a(t) = \mathrm{Tr}\{U^\dagger(t, t_0)\,\hat a(t_0)\,U(t, t_0)\,\rho(t_0)\}.$$

Appendix II
One has two systems with Hamiltonians $H_1$ and $H_2$, interacting via $H_{12}$. Theoretical analysis begins from the operator equation of motion for the total density matrix $\rho$ of the two systems (LAX [1964]):
$$i\frac{d\rho(t)}{dt} = [H, \rho(t)], \tag{A2.1}$$
where $H = H_1 + H_2 + H_{12}$. One defines density matrices, $\rho_1$ and $\rho_2$, for the one and two system respectively by tracing with respect to a complete set of states of the other system:
(A2.3)
Hereafter, operators written without explicit time dependence are understood to be evaluated at the initial time $t_0$. One procedure for solving eqs. (A2.2) and (A2.3) is to expand $\rho$ around the Hartree approximation, $\rho(t) \approx \rho_1(t)\rho_2(t)$. Since $[H_{12}, \rho_1(t)\rho_2(t)] = [H_{12}\rho_1(t), \rho_2(t)] + [\rho_2(t)H_{12}, \rho_1(t)]$, in the Hartree approximation each system evolves in time interacting with an average potential, $V_1(t) = \mathrm{Tr}_2(H_{12}\rho_2(t))$ or $V_2(t) = \mathrm{Tr}_1(H_{12}\rho_1(t))$, of the other system:
$$i\frac{d\rho_1(t)}{dt} = [H_1 + V_1(t), \rho_1(t)],$$
$$i\frac{d\rho_2(t)}{dt} = [H_2 + V_2(t), \rho_2(t)].$$
Although $\rho(t) \approx \rho_1(t)\rho_2(t)$ solves eqs. (A2.2) and (A2.3), it does not solve eq. (A2.1). One must set $\rho(t) = \rho_1(t)\rho_2(t) + \rho_{12}(t)$ and then obtain an equation of motion for $\rho_{12}(t)$ from eqs. (A2.1)-(A2.3). This procedure can be thought of, therefore, as beginning by neglecting correlations between the systems and then correcting by going back to calculate what was missed. It is through the statistical correlations contained in $\rho_{12}(t)$ that the two systems become coupled in the irreversible thermodynamic sense. These correlations produce relaxations and qualitative changes of the systems and, moreover, drive each system with noise emissions from the other. If we assume that $V_1 = V_2 = 0$, which is often the case, then the coupled set of equations of motion for $\rho_1$, $\rho_2$, and $\rho_{12}$, obtained from eqs. (A2.1)-(A2.3), are
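$$i\frac{d\rho_1(t)}{dt} = [H_1, \rho_1(t)] + \mathrm{Tr}_2[H_{12}, \rho_{12}(t)], \tag{A2.4}$$
$$i\frac{d\rho_2(t)}{dt} = [H_2, \rho_2(t)] + \mathrm{Tr}_1[H_{12}, \rho_{12}(t)], \tag{A2.5}$$
$$i\frac{d\rho_{12}(t)}{dt} - [H, \rho_{12}(t)] = [H_{12}, \rho_1(t)\rho_2(t)] - \mathrm{Tr}_2\bigl([H_{12}, \rho_{12}(t)]\bigr)\rho_2(t) - \rho_1(t)\,\mathrm{Tr}_1\bigl([H_{12}, \rho_{12}(t)]\bigr). \tag{A2.6}$$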
One method of analyzing these equations is to neglect the terms involving traces over $H_{12}$ and $\rho_{12}$ in eq. (A2.6) and to solve
$$i\frac{d\rho_{12}(t)}{dt} - [H, \rho_{12}(t)] = [H_{12}, \rho_1(t)\rho_2(t)] \tag{A2.6'}$$
for $\rho_{12}$. The formal solution is
$$\rho_{12}(t) = -i\int_{t_0}^{t}dt'\,e^{-iH(t-t')}\,[H_{12}, \rho_1(t')\rho_2(t')]\,e^{iH(t-t')}, \tag{A2.7}$$
where we have assumed that initially the two systems are uncorrelated so that $\rho_{12}(t_0) = 0$. Eq. (A2.7) apparently begins an iterative solution for $\rho_{12}$ in terms of $\rho_1$ and $\rho_2$. Eqs. (A2.4), (A2.5) and (A2.6) treat both systems on an equal footing and enable one to study how each system reacts back on the other in the
course of energy exchanges between them. These equations also permit one to study a reservoir interaction when one of the two systems is too large to respond significantly to energy inputs from the other. A good example of this situation is provided by the ensemble of $\hat b_k$ oscillators regarded collectively as one reservoir system in interaction with $\hat a$ through $V$. Let $\rho_2(t)$ be the density matrix for the $\{\hat b_k\}$ system and $\rho_1(t)$ be the density matrix for the $\hat a$ system. If it were not assumed that the $\{\hat b_k\}$ constitute a reservoir, eqs. (A2.4)-(A2.6) would have to be solved self-consistently for $\rho_1$ and $\rho_2$. The reservoir assumption allows one to take $d\rho_2(t)/dt \approx 0$. Then, on assuming that the $\hat b_k$ are statistically independent, one can set $\rho_2(t) \approx \rho_2 = \prod_k\rho_k$, where the $\{\rho_k\}$ are, for example, $\{\hat b_k\}$ thermal equilibrium mixtures. In thermal equilibrium, $\langle\hat b_k\rangle_k = \mathrm{Tr}(\hat b_k\rho_k) = 0 = \langle\hat b_k^\dagger\rangle$. Hence, $V$ yields no drift potential, and eqs. (A2.4), (A2.5) and (A2.7) can be utilized in the form,
where $H_S$ and $V$ are given in the introduction and $V(t) = \exp\{iH(t-t_0)\}\,V(t_0)\exp\{-iH(t-t_0)\}$. When eq. (A2.9) is substituted into eq. (A2.8), an equation of motion for $\rho_1$ is obtained that couples it to second order correlation functions of the reservoir system:
$$i\frac{d\rho_1(t)}{dt} - [H_S, \rho_1(t)] = -i\int_{t_0}^{t}dt'\,\mathrm{Tr}_2\bigl\{[V, [V(t'-t+t_0),\ e^{-iH(t-t')}\rho_1(t')e^{iH(t-t')}\rho_2]]\bigr\}. \tag{A2.10}$$
Eq. (A2.10) is a well-formulated version of the more simplified reservoir theory that was used in the early days of nuclear magnetic resonance theory (ABRAGAM [1961]). More recently, this equation was employed together with various phenomenological interaction Hamiltonians to determine the quantum statistics of nonlinear optical processes (SHEN [1967]). Because $V$ is generally small, it is a good approximation to take
$$e^{-iH(t-t')}\rho_1(t')\,e^{iH(t-t')} \approx \rho_1(t)$$
under the integral of eq. (A2.10). This amounts to a neglect of some of
the secular motion induced in $\rho_1$ by the reservoir interaction. Because the $t'$ integration is over correlation functions of the reservoir, which have short correlation times compared to the radiation mode correlation and decay times, this procedure for propagating $\rho_1$ is a good approximation for a Markov theory. The model we are dealing with allows us to work out explicitly the consequences of interaction-induced correlations and to note in what sense $V$ must be small and to what degree a Markov approximation is satisfactory. The simplest way to evaluate the trace over the reservoir coordinates, as is indicated in eq. (A2.10), is to approximate the time dependence of $V(t'-t+t_0)$ by the free-field time dependence:
$$V(t'-t+t_0) \approx e^{i(H_S+H_R)(t'-t)}\,V(t_0)\,e^{-i(H_S+H_R)(t'-t)}.$$
We are assuming that the $\hat b_k$'s are distributed in uncorrelated thermal equilibrium mixtures of states: hence,
where $n_k = \mathrm{Tr}_2(\hat b_k^\dagger\hat b_k\rho_2) = (\exp(\beta\omega_k) - 1)^{-1}$, the trace in eq. (A2.10) has the evaluation,
We can now look at the t' integration in detail. The integrations that must be performed in eq. (A2.10) are
and the complex conjugate integration. As $t - t_0$ increases, the integration over $\tau$ acts as a frequency filter. Only those oscillators $\hat b_k$ whose frequencies lie closest to $\omega_{a0}$ have a significant effect on the motion of $\hat a$. This is why, from an interaction point of view, the photon attributes of the radiation field are derived from its interaction with discrete quantized matter. If the spectrum of $\{\omega_k\}$ is densely and broadly distributed around $\omega_{a0}$, the sum $\sum_k|\kappa_k|^2\exp\{i(\omega_{a0}-\omega_k)\tau\}$ vanishes rapidly as $\tau$ increases due to destructive interference. Hence the correlative effects of the interaction accumulate quickly, over the time $\tau_R$ it takes $\sum_k|\kappa_k|^2\exp\{i(\omega_{a0}-\omega_k)\tau\}$ to nullify itself. If negligible secular changes in the $\hat a$ system occur over the time interval $\tau_R$, the Markov approximation is valid. Furthermore, for $t - t_0 > \tau_R$,
$$\int_0^{t-t_0}d\tau\sum_k|\kappa_k|^2\exp\{i(\omega_{a0}-\omega_k)\tau\} = \tfrac12\gamma_a + i\Delta\omega_{a0}.$$
Therefore, eq. (A2.10) can be rewritten in Markov form for $t - t_0 \gg \tau_R$. The result is listed as eq. (2.2) in the text.
Appendix III
LAX [1967] states the quantum regression theorem as follows. If M is a member of a complete set of physical Markovian variables M,, and if the (M,(t)) satisfy linear equations of motion d(M,(t))/dt = ( A , ) = ‘&l,,(M,)+A~ so that ( M ( t ) ) = x , O , ( t - t ’ ) ( M , ( t ’ ) ) , then in a stationary state, within the expectation value ( N , ( t ’ )iM(t)N,(t’)), M ( t ) can be related to M(t’) in the same way that ( M ( t ) ) is related to (M,(t’)): (Nl(t’) M ( t ) N , ( t ) )
=
c O,(t - t ’ ) ( N , ( t ’ )
MP(t’) N2(W
Ic
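As a worked illustration of the theorem, consider the single damped mode. The block below is a sketch under stated assumptions: it takes $M = \hat a$ as the only variable needed, $N_1 = \hat a^\dagger$, $N_2 = 1$, and the mean-value equation $(d/dt + \tfrac12\gamma_a + i\omega_a)\langle\hat a\rangle = 0$; the symbol $O(\tau)$ for the propagator is shorthand introduced here.

```latex
% Propagator of the mean value for tau = t - t' > 0:
\langle \hat a(t)\rangle = O(t-t')\,\langle \hat a(t')\rangle ,
\qquad O(\tau) = e^{-(i\omega_a + \frac12\gamma_a)\tau}

% Regression of the two-time function with N_1 = \hat a^\dagger, N_2 = 1:
\langle \hat a^\dagger(t')\,\hat a(t)\rangle
  = O(t-t')\,\langle \hat a^\dagger(t')\,\hat a(t')\rangle
  = \bar n\, e^{-i\omega_a(t-t')}\, e^{-\frac12\gamma_a (t-t')}
```

The result has the same form as the thermal equilibrium correlation function obtained from Green's function theory in eq. (3.21b), which is the agreement between the two methods noted in the main text.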
The problem of calculating ( N , ( t ‘ ) M ( t ) N , ( t ’ ) ) is thus reduced to the problem of calculating ( N , ( t ’ ) M , ( t ’ ) N , ( t ’ ) ) and the relaxation rates of the transient solution for ( M ( t ) ) . All this can be done from the solution for P ( 4 t ) (P”% 7)). The most significant application of the quantum regression theorem has been to the problem of calculating correlation functions for the radiation field (LAXand LOUISELL [1967]). When pRMsatisfies a Markov equation of motion, the entire ci system is Markovian. The complete set of variables M , is constructed from the entire set of normally ordered ci and ci’ products
and taken to be the combinations $\{\delta(\hat a^\dagger - \alpha^*)\,\delta(\hat a - \alpha)\}$, where the index $\mu$ becomes the continuous indices $\alpha$ and $\alpha^*$, and the sum over $\mu$ is an integration $\int d^2\alpha$. The Kolmogoroff-Smoluchowski relation connects $\langle M(t)\rangle$ to $\langle M_\mu(t)\rangle$ as follows:
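$$\langle M(t)\rangle = \int d^2\alpha\,M(\alpha)\,P(\alpha, t) = \int d^2\alpha\int d^2\alpha'\,M(\alpha)\,P(\alpha, t|\alpha', t')\,\langle\delta(\hat a^\dagger(t') - \alpha'^*)\,\delta(\hat a(t') - \alpha')\rangle,$$
where
$$\langle\delta(\hat a^\dagger(t') - \alpha'^*)\,\delta(\hat a(t') - \alpha')\rangle = P(\alpha', t'). \tag{A3.1}$$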
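One identifies $O_\mu(t-t')$ as $\int d^2\alpha\,M(\alpha)\,P(\alpha, t|\alpha', t')$. In P-function theory, therefore, one calculates correlation functions in stationary state from equations such as
$$\langle\hat a^\dagger(\tau)\hat a(0)\rangle = \int d^2\alpha\int d^2\alpha'\,\alpha^*\alpha'\,P(\alpha, \tau|\alpha', 0)\,P_{ss}(\alpha'), \tag{A3.2}$$
$$\langle\hat a^\dagger(0)\hat a^\dagger(\tau)\hat a(\tau)\hat a(0)\rangle = \int d^2\alpha\int d^2\alpha'\,|\alpha|^2|\alpha'|^2\,P(\alpha, \tau|\alpha', 0)\,P_{ss}(\alpha'), \tag{A3.3}$$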
derived from the quantum regression theorem. LOUISELL and MARBURGER [1967] have given an alternative derivation of equations having the form of eqs. (A3.2) and (A3.3). Let to be << 0 so that, as above, the ci system has equilibriated at the time t = 0 with the reservoir, which is described by pR = time independent. Thus, for t , t‘ > 0, (ci’(t)ci(t’)) will depend only on t-t’ because p‘ is time independent for t > 0: ( ~ i + ( ~&() t’))
~
-
T~ {eiH(t- t o ) & + e- iH(f -1’)a e- i H ( t ’ - t o ) P> T~ { 5+e- i H ( t - t ’ ) b eiH(t-t’) P(t>> T~ { eiH(t - I , ) & + e- iH(f-f’)dp,(0)pR}.
If some coherent state theory is used and p‘(0) stituted into the above equation; then,
d2ala)P”(a)(al is sub-
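$$\langle\hat a^\dagger(\tau)\hat a(0)\rangle = \int d^2\alpha\int d^2\alpha'\,\alpha^*\alpha'\,\frac{1}{\pi}\,\mathrm{Tr}_R\bigl\{\rho_R\,|\langle\alpha|e^{-iH\tau}|\alpha'\rangle|^2\bigr\}\,P_{ss}(\alpha').$$
The Markov property holds if it is possible to identify $P(\alpha, \tau|\alpha', 0)$ with
$$\frac{1}{\pi}\,\mathrm{Tr}_R\bigl\{\rho_R\,|\langle\alpha|\exp(-iH\tau)|\alpha'\rangle|^2\bigr\}.$$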
In order to have the right initial condition, the correct identification is
Appendix IV
In order to integrate our eqs. (3.14)-(3.17) uniquely we must specify the constants of integration. This we do in the usual way; we look for boundary or initial conditions. Consider the $\{\hat b_k\}$ system. At the time $t_1$ (the latest time) we must have $b_{k+}(t_1) = b_{k-}(t_1)$; that is, we think of our $\hat b_k$ systems as developing under $U_+(t_1, t_0)$ from $t_0$ to $t_1$ and then under $U_-^\dagger(t_1, t_0) = U_-(t_0, t_1)$ from $t_1$ back to $t_0$: see Fig. 3.1. Furthermore we may obtain a simple relation between $b_{k+}(t_0)$ and $b_{k-}(t_0)$ as follows: $\rho(t_0) = \rho'(t_0) \times (1 - \exp(-\beta\omega_k))\exp(-\beta\omega_k\hat b_k^\dagger\hat b_k)$ and
$$b_{k+}(t_0) \propto \mathrm{Tr}\{U_-(t_0, t_1)\,U_+(t_1, t_0)\,\hat b_k(t_0)\,\rho(t_0)\},$$
while
$$b_{k-}(t_0) \propto \mathrm{Tr}\{\hat b_k(t_0)\,U_-(t_0, t_1)\,U_+(t_1, t_0)\,\rho(t_0)\}.$$
But, by cyclic invariance of the trace, the latter trace equals
$$\mathrm{Tr}\{U_-(t_0, t_1)\,U_+(t_1, t_0)\,\hat b_k(t_0)\,e^{+\beta\omega_k}\rho(t_0)\},$$
so that $b_{k-}(t_0) = b_{k+}(t_0)\,e^{+\beta\omega_k}$. Collecting together the boundary conditions at $t_1$ and $t_0$ we have
$$b_{k-}(t_1) = b_{k+}(t_1), \tag{A4.1}$$
$$b_{k-}(t_0) = b_{k+}(t_0)\,e^{+\beta\omega_k}. \tag{A4.2}$$
In an entirely like manner,
$$b_{k-}^*(t_1) = b_{k+}^*(t_1), \tag{A4.3}$$
$$b_{k-}^*(t_0) = b_{k+}^*(t_0)\,e^{-\beta\omega_k}. \tag{A4.4}$$
Let us return to the problem of integrating eqs. (3.16) and (3.17) in terms of the boundary conditions eqs. (A4.1)-(A4.4). A direct integration of eqs.
(3.16) in terms of the boundary values
However, by the boundary condition for bk+(tl), we may write
bk+(tO)
bk-(tl) =
and
bk-(fl)
[Ill
is
bk+(tl) and the equation
fl
dt’ exp { -icok(t-t‘)} a - ( t ’ )
bk-(t) = iK:/ t
xexp {-icok(t-tl)).
(A4.6)
This equation for $b_{k-}(t)$ can be used, together with the boundary condition $b_{k-}(t_0) = \exp(+\beta\omega_k)\,b_{k+}(t_0)$, to solve for $b_{k+}(t_0)$ as a functional of $a_+$ and $a_-$:
$$b_{k+}(t_0)\left(e^{+\beta\omega_k} - 1\right) = i\kappa_k^*\int_{t_0}^{t_1}dt'\,\exp\{-i\omega_k(t_0-t')\}\,a_-(t') - i\kappa_k^*\int_{t_0}^{t_1}dt'\,\exp\{-i\omega_k(t_0-t')\}\,a_+(t')$$
or
$$b_{k+}(t_0) = i\kappa_k^*n_k\int_{t_0}^{t_1}dt'\,\exp\{i\omega_k(t'-t_0)\}\,\bigl(a_-(t') - a_+(t')\bigr), \tag{A4.7}$$
where $n_k = (\exp(\beta\omega_k) - 1)^{-1}$. Eq. (3.17) can similarly be integrated. The corresponding boundary condition for $b_{k+}^*(t_0)$ is
$$b_{k+}^*(t_0) = i\kappa_k(n_k+1)\int_{t_0}^{t_1}dt'\,\exp\{-i\omega_k(t'-t_0)\}\,\bigl(a_-^*(t') - a_+^*(t')\bigr). \tag{A4.8}$$
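The thermal factors $n_k$ and $n_k + 1$ in eqs. (A4.7) and (A4.8) follow from the Boltzmann factors $e^{\pm\beta\omega_k}$ in the boundary conditions at $t_0$. The sketch below checks the two identities involved, namely $(e^{\beta\omega_k} - 1)^{-1} = n_k$ and $(1 - e^{-\beta\omega_k})^{-1} = n_k + 1$; the function name and the numerical values are illustrative.

```python
import math

def bose_occupation(beta, omega_k):
    """Thermal occupation n_k = 1/(exp(beta*omega_k) - 1), using expm1 for accuracy."""
    return 1.0 / math.expm1(beta * omega_k)

beta, omega_k = 0.5, 2.0                   # illustrative values, beta*omega_k = 1
n_k = bose_occupation(beta, omega_k)
# Factor produced by the e^{-beta*omega_k} boundary condition on the conjugate amplitude.
conj_factor = 1.0 / (1.0 - math.exp(-beta * omega_k))
```

The ratio $(n_k + 1)/n_k = e^{\beta\omega_k}$ is the detailed balance between emission into and absorption from the reservoir oscillators.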
Note that $b_{k+}(t_0) \neq 0$ unless $a_+ = a_-$, which is to say unless $K_+ = K_-$. Eqs. (A4.5)-(A4.8) can be used to express $b_{k\pm}(t)$ (and $b_{k\pm}^*(t)$), and therefore the currents, entirely as linear functionals of $a_\pm$ and $a_\pm^*$. To summarize the physics of our calculations up to this point, imagine for a moment that the $\hat b_k$ are atomic current variables. Then, the current response to the field would be formally nonlinear and the $b_{k\pm}$ would be formal nonlinear functionals of $a_\pm$. However, if the ensemble of $\hat b_k$ constitutes a reservoir, their reaction to $a_\pm$ will be predominantly linear. The model we are considering takes,
I111
131
A P P E N D I X IV
in fact, the b k response to be strictly linear. The solutions for the currents k , and k*, that result from the integrations of eqs. (3.16) and (3.17), following the enforcement of boundary conditions as outlined above, are k + ( t ) = -i
J =dt’{B++ i t - t ’ ) a + ( t-)B + - ( t - t’) a-(t’)},
(A4.9)
to tl
k-(t)
=
- i s dt’{B-+(t- t’)a+(t’)-BB-(t-t’)a_(t’)},
(A4.10)
f0
tl
k*,( t )
=
- iIt0d t ‘{a *,( t ’) B + + ( t ‘- t ) - u ? ( t ’)B- + ( t ’ - t ) } , (A4.1 I )
k*_(t ) = - iJ ‘ d t ’{ a ( t ’) B + -( t - t ) - a i( t ’ ) ~ -- ( t ’ - t ) } ,
(A4.12)
t0
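where

$$B_{++}(\tau) = \sum_k |K_k|^2\,(n_k + \eta_+(\tau))\,e^{-i\omega_k\tau}, \qquad B_{+-}(\tau) = \sum_k |K_k|^2\,n_k\,e^{-i\omega_k\tau},$$

$$B_{-+}(\tau) = \sum_k |K_k|^2\,(n_k + 1)\,e^{-i\omega_k\tau}, \qquad B_{--}(\tau) = \sum_k |K_k|^2\,(n_k + \eta_-(\tau))\,e^{-i\omega_k\tau},$$

and

$$\eta_+(\tau) = \begin{cases} 1, & \tau > 0 \\ 0, & \tau < 0, \end{cases} \qquad \eta_-(\tau) = \begin{cases} 0, & \tau > 0 \\ 1, & \tau < 0. \end{cases}$$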
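Note that the solution for $k_-^* = \sum_k K_k^* b_{k-}^*$ is formally obtained from the $k_+ = \sum_k K_k b_{k+}$ solution by simultaneously performing the * operation and interchanging + and - labels. Furthermore, it can be verified that the $B$'s are composed of expectation values of the $\hat b_k$ systems taken in the absence of interaction between $\hat a$ and $\{\hat b_k\}$:

$$\begin{pmatrix} B_{++}(t-t') & B_{+-}(t-t') \\ B_{-+}(t-t') & B_{--}(t-t') \end{pmatrix} = \sum_k |K_k|^2 \begin{pmatrix} \langle(\hat b_k(t)\,\hat b_k^\dagger(t'))_+\rangle_k & \langle\hat b_k^\dagger(t')\,\hat b_k(t)\rangle_k \\ \langle\hat b_k(t)\,\hat b_k^\dagger(t')\rangle_k & \langle(\hat b_k(t)\,\hat b_k^\dagger(t'))_-\rangle_k \end{pmatrix}$$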
where
We now have solutions for the atomic currents that express them as linear functionals of the field strengths $a_\pm$ and $a_\pm^*$. We would next like to solve eqs. (3.14) and (3.15) for $a_\pm$ and $a_\pm^*$ as functionals of $K_\pm$ and $K_\pm^*$. The $a_\pm$ equations of motion are best analyzed in terms of the combinations $a_d(t) = (a_- - a_+)(t)$ and $a_s(t) = (a_- + a_+)(t)$. Substitution of eqs. (A4.9) and (A4.12) into eqs. (3.14) produces the following equations of motion for $a_d$ and $a_s$:

(A4.13)

(A4.14)

where

$\Gamma$ determines the response of $a_d$, $a_s$, and therefore $a_\pm$ to external currents; $\Phi$ determines the excitation level of the radiation mode in the absence of external forces and is determined by the fluctuation spectrum of the reservoir. Eq. (3.8) can be rewritten in terms of $K_d$, $K_s$, $K_d^*$ and $K_s^*$ driving currents:
In view of the above noted correspondence between the current solutions, $k_\pm$ and $k_\mp^*$, it will not be necessary to further analyze the $a_\pm^*$ equations of motion. An interchange of + and - labels and a simultaneous complex conjugation of $a_\pm$ solutions yield $a_\mp^*$ solutions. The mathematical task is now to solve eqs. (A4.13) and (A4.14) for the partial functional derivatives of $\ln Z$ and to rearrange eq. (A4.15) into integrable form.

When the frequency spectrum $\{\omega_k\}$ is ample, $B_{+-}$ and $B_{-+}$ are effectively nonzero only over short time intervals. The Markov approximation can be made and tested, the test criterion being that secular time changes of $a_\pm$ be small over the correlation time period $\tau_R$ of the $B$'s. If this test criterion is satisfied, then for times $t_1 \gg t \gg \tau_R \approx 0$, one can use $a_d(t') \approx a_d(t)\exp\{-i\omega_{a0}(t'-t)\}$ and $a_s(t') \approx a_s(t)\exp\{-i\omega_{a0}(t'-t)\}$ to rewrite eqs. (A4.13) and (A4.14) as local equations of motion:

$$\left(\frac{d}{dt} + i\omega_a'\right) a_d(t) = -iK_d(t), \qquad (A4.16)$$

where

The solutions to

in general, are

(A4.18)

$$a_s(t) = a_s(t_0)\exp\{-i(\omega_a - \tfrac{1}{2}i\gamma_a)(t-t_0)\} + \int_{t_0}^{t} dt'\,\exp\{-i(\omega_a - \tfrac{1}{2}i\gamma_a)(t-t')\}\,(-iK_s(t') + (2\bar n + 1)\gamma_a\,a_d(t')). \qquad (A4.19)$$
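Because of the fact that $a_d(t_1) = 0$ and $K_d(t') = 0$ for $t' \geq t_1$, eq. (A4.18) can be rewritten as

$$a_d(t) = i\int_{t}^{\infty} dt'\,\exp\{-i\omega_a'(t-t')\}\,K_d(t') = -i\int_{-\infty}^{\infty} dt'\,G_a(t-t')\,K_d(t'), \qquad (A4.20)$$

where $G_a(\tau) = (-)\eta_-(\tau)\exp\{-i(\omega_a + \tfrac{1}{2}i\gamma_a)\tau\}$.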
Moreover, letting $t_0 \to -\infty$ will produce the following equation for $a_s(t)$:

$$a_s(t) = \int_{-\infty}^{\infty} dt'\,G_r(t-t')\,(-iK_s(t') + (2\bar n + 1)\gamma_a\,a_d(t')), \qquad (A4.21)$$

where $G_r(\tau) = \eta_+(\tau)\exp\{-i(\omega_a - \tfrac{1}{2}i\gamma_a)\tau\}$. Combining eqs. (A4.20) and (A4.21) gives, finally,
$$a_s(t) = -i\int_{-\infty}^{\infty} dt'\,G_r(t-t')\,K_s(t') - i(2\bar n + 1)\gamma_a\int_{-\infty}^{\infty} dt'\int_{-\infty}^{\infty} dt''\,G_r(t-t')\,G_a(t'-t'')\,K_d(t''). \qquad (A4.22)$$
If we had not let $t_0 \to -\infty$, we would have had to determine the $a_s(t_0)$ boundary condition as we had the $b_{k+}(t_0)$ boundary condition. The solution for $a_s(t)$ in terms of $a_s(t_0)$ would enable us to determine, for example, the progression of the radiation mode from one state of thermal equilibrium to another, as was done by Schwinger. Steady state is achieved on a time scale set by $\gamma_a^{-1}$. We only want the solutions, eqs. (A4.20) and (A4.22), to eqs. (A4.16) and (A4.17) that lead to the functional evaluation of thermal equilibrium correlation functions. $G_a(\tau)$ and $G_r(\tau)$ are advanced and retarded Green's functions respectively. Since $(G_a(\tau))^* = (-)G_r(-\tau)$, it follows from eqs. (A4.20) and (A4.22) that

$$a_d^*(t) = -i\int_{-\infty}^{\infty} dt'\,K_d^*(t')\,G_r(t'-t) \qquad (A4.23)$$

and

$$a_s^*(t) = -i\int_{-\infty}^{\infty} dt'\,K_s^*(t')\,G_a(t'-t) - i(2\bar n + 1)\gamma_a\int_{-\infty}^{\infty} dt'\int_{-\infty}^{\infty} dt''\,K_d^*(t')\,G_r(t'-t'')\,G_a(t''-t). \qquad (A4.24)$$
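The relation between the advanced and retarded Green's functions used above is easy to confirm numerically. The sketch below is illustrative only: it takes $G_a(\tau) = -\eta_-(\tau)\exp\{-i(\omega_a + \tfrac{1}{2}i\gamma_a)\tau\}$ and $G_r(\tau) = \eta_+(\tau)\exp\{-i(\omega_a - \tfrac{1}{2}i\gamma_a)\tau\}$ as read from the text, sets both step functions to zero at $\tau = 0$ (a choice the text leaves open), and uses arbitrary sample values for $\omega_a$ and $\gamma_a$.

```python
import cmath


def eta_plus(tau):
    """Step function: 1 for tau > 0, 0 for tau < 0 (0 at tau = 0 by assumption)."""
    return 1.0 if tau > 0 else 0.0


def eta_minus(tau):
    """Step function: 1 for tau < 0, 0 for tau > 0 (0 at tau = 0 by assumption)."""
    return 1.0 if tau < 0 else 0.0


def G_a(tau, omega_a, gamma_a):
    """Advanced Green's function: nonzero only for tau < 0, decaying as tau -> -infinity."""
    return -eta_minus(tau) * cmath.exp(-1j * (omega_a + 0.5j * gamma_a) * tau)


def G_r(tau, omega_a, gamma_a):
    """Retarded Green's function: nonzero only for tau > 0, decaying as tau -> +infinity."""
    return eta_plus(tau) * cmath.exp(-1j * (omega_a - 0.5j * gamma_a) * tau)


if __name__ == "__main__":
    omega_a, gamma_a = 2.0, 0.5   # arbitrary sample values
    for tau in (-2.0, -0.3, 0.4, 1.7):
        lhs = G_a(tau, omega_a, gamma_a).conjugate()
        rhs = -G_r(-tau, omega_a, gamma_a)
        print(tau, abs(lhs - rhs))
```

Both functions decay at the rate $\tfrac{1}{2}\gamma_a$ away from $\tau = 0$, which is the origin of the $\gamma_a^{-1}$ time scale for reaching steady state.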
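When eqs. (A4.20), (A4.22)-(A4.24) are inserted into eq. (A4.15), the right hand side of the equation can be reassembled as a total variation (at steady state)

where

$$G(t-t') = \begin{pmatrix} G_{++}(t-t') & G_{+-}(t-t') \\ G_{-+}(t-t') & G_{--}(t-t') \end{pmatrix} = \begin{pmatrix} -i\langle(\hat a(t)\,\hat a^\dagger(t'))_+\rangle & i\langle\hat a^\dagger(t')\,\hat a(t)\rangle \\ -i\langle\hat a(t)\,\hat a^\dagger(t')\rangle & i\langle(\hat a(t)\,\hat a^\dagger(t'))_-\rangle \end{pmatrix}. \qquad (A4.26)$$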
References

ABRAGAM, A., 1961, The Principles of Nuclear Magnetism (Oxford University Press, London).
AGARWAL, G. S. and E. WOLF, 1970, Phys. Rev. D 2, 2161, 2187, 2206.
BERNARD, W. and H. B. CALLEN, 1959, Rev. Mod. Phys. 31, 1017.
CALLEN, H. B. and T. A. WELTON, 1951, Phys. Rev. 83, 34.
FLECK, J. A., 1966, Phys. Rev. 149, 309.
GLAUBER, R. J., 1963, Phys. Rev. 131, 2766.
GLAUBER, R. J., 1963b, Phys. Rev. 130, 2529.
GLAUBER, R. J., 1964, in: Quantum Optics and Electronics, eds. C. DeWitt, A. Blandin and C. Cohen-Tannoudji (Gordon and Breach, New York) p. 65, in particular, lecture XIII.
GLAUBER, R. J., 1966, in: Physics of Quantum Electronics, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co. Inc., New York) p. 788.
GORDON, J. P., L. WALKER and W. H. LOUISELL, 1963, Phys. Rev. 130, 806.
GORDON, J. P., 1967, Phys. Rev. 161, 361.
HAKEN, H., H. RISKEN and W. WEIDLICH, 1967, Z. Physik 206, 355.
HANBURY-BROWN, R. and R. Q. TWISS, 1956, Nature 177, 27.
HANBURY-BROWN, R. and R. Q. TWISS, 1957, Proc. Roy. Soc. (London) A242, 300.
HANBURY-BROWN, R. and R. Q. TWISS, 1958, Proc. Roy. Soc. (London) A243, 291.
KLAUDER, J. R. and J. MCKENNA, 1965, J. Math. Phys. 6, 734.
KLAUDER, J. R. and E. C. G. SUDARSHAN, 1968, Fundamentals of Quantum Optics (W. A. Benjamin, Inc., New York) p. 32.
KORENMAN, V., 1966, Ann. Phys. (N.Y.) 39, 72.
KUBO, R., 1957, J. Phys. Soc. Japan 12, 570.
LAX, M., 1960, Rev. Mod. Phys. 32, 25.
LAX, M., 1963, Phys. Rev. 129, 2342.
LAX, M., 1964, J. Phys. Chem. Solids 25, 487.
LAX, M., 1966, in: Physics of Quantum Electronics, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) and Phys. Rev. 145, 110.
LAX, M., 1967, Phys. Rev. 157, 213.
LAX, M. and W. H. LOUISELL, 1967, IEEE J. Quant. Elect. QE-3, 47.
LAX, M., 1968, Phys. Rev. 172, 350.
LOUISELL, W. H. and J. MARBURGER, 1967, IEEE J. Quant. Elect. QE-3, 348.
LOUISELL, W. H., 1968, in: The Physics of Quantum Electronics, eds. S. Jacobs and J. Mandelbaum (University of Arizona Optical Sciences Center, Tucson, Arizona) p. 311.
MOLLOW, B. R. and R. J. GLAUBER, 1967, Phys. Rev. 160, 1076.
SCHWINGER, J., 1961, J. Math. Phys. 2, 407.
SCULLY, M. and W. E. LAMB Jr., 1967, Phys. Rev. 159, 208.
SCULLY, M. and W. E. LAMB Jr., 1968, Phys. Rev. 166, 246.
SENITZKY, I. R., 1961, Phys. Rev. 124, 642.
SHEN, Y. R., 1967, Phys. Rev. 155, 921.
SUDARSHAN, E. C. G., 1963, Proc. Symp. on Optical Masers (Polytechnic Press, Brooklyn, New York and J. Wiley and Sons Inc., New York) p. 45.
WAX, N., ed., 1954, Selected Papers on Noise and Stochastic Processes (Dover Publications, New York).
WEIDLICH, W. and F. HAAKE, 1965, Z. Physik 185, 30; 186, 203.
IV

FIELD CORRECTORS FOR ASTRONOMICAL TELESCOPES

BY

C. G. WYNNE
Imperial College, London, England

CONTENTS

§ 1. INTRODUCTION
§ 2. NEWTONIAN TELESCOPE CORRECTORS
§ 3. RITCHEY-CHRÉTIEN TELESCOPE PRIME FOCUS CORRECTORS
§ 4. SECONDARY FOCUS CORRECTORS
REFERENCES
§ 1. Introduction
Most large astronomical telescopes are made in a form that can be converted to work at different focal lengths for different types of observation. The majority of those at present in use are of Newtonian-Cassegrain form, used either at the prime focus of the paraboloid mirror, or at Cassegrain or coudé foci formed behind the main mirror, by interposing hyperboloid secondaries. More recently Ritchey-Chrétien telescopes are being built, of the same general configuration but with different mirror shapes. The useful field of the Newtonian-Cassegrain telescope is restricted by uncorrected coma, particularly at the prime focus. The Ritchey-Chrétien prime focus in addition suffers from spherical aberration; its secondary focus is corrected for spherical aberration and coma, but is limited in field size by astigmatism.

This review is mainly concerned with subsidiary optical systems which can be inserted into these types of telescope to give good aberration correction over more extended fields of view, the basic telescope being either Newtonian-Cassegrain or Ritchey-Chrétien, and therefore available for use in those forms when used without a corrector. This review does not include a general survey of special purpose telescopes designed to give an extended field of high resolution at a single focal station, such as Schmidt cameras. But the distinction is not wholly clear cut, since there is an intermediate region. Optical systems have been proposed, and at least one has been made, for the secondary focus of a telescope whose mirrors are required to depart somewhat from the Ritchey-Chrétien form; in consequence, the secondary focus will suffer from coma when the corrector is removed. And the true correctors, that work with unmodified Newtonian-Cassegrain or Ritchey-Chrétien systems, evolved historically from designs that did require changes to the mirror shapes. Such systems are included in this review.
§ 2. Newtonian Telescope Correctors

The Newtonian-Cassegrain telescope, or any system similarly based on the focal properties of conicoids of revolution, has perfect aberration correction on axis. For extra-axial imagery, the Seidel coma gives an adequate description of the performance for the practically useful range of apertures and field angles, and in these systems this gives a coma flare of angular extent $\delta U$, at an angle $U$ from the axis, given by $\delta U/U = \tfrac{3}{4}u^2$, where $u$ is the semi-aperture angle. Thus the coma is most serious at the prime focus, where the aperture angle is greatest. Older paraboloid telescopes (e.g. the 100 inch at Mount Wilson) had a prime focal ratio of f/5; if the maximum tolerable coma spread be taken as 1 arc second (about the limit set by atmospheric seeing) the useful angular field then extends to 2.3 arc minutes from the axis. With the more recent trend to shorter telescopes (larger aperture angles) the field rapidly reduces. On the Isaac Newton telescope, at f/3, coma reaches 1 arc second at 0.8 arc minutes from the axis. Thus for the simple paraboloid the field size is very small.

The first suggestion for extending the field of good resolution of a Newtonian-type telescope seems to have been made by SAMPSON [1913b]; (he had earlier (SAMPSON [1913a]) considered an extended field Cassegrain-type system, and this is referred to below). For the Newtonian focus, Sampson investigated the use of a system of lenses, in the converging beam between the mirror and the focus, near enough to the latter to allow the lenses to be of practicable size on a large telescope. He took a system of three thin lenses, with small separations between them, the system being nearly afocal, and all the lenses being made of the same type of glass, so that the two primary chromatic aberrations could be corrected, and secondary spectrum effects eliminated. In addition, Sampson considered the correction of spherical aberration, coma, and flatness of field, by which he meant the mean field, midway between the sagittal and tangential foci.
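The coma-limited field sizes quoted earlier in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the coma relation in the form $\delta U/U = \tfrac{3}{4}u^2$ and the small-angle approximation $u \approx 1/(2N)$ for a mirror of focal ratio $N$; the function names are hypothetical.

```python
def coma_flare_arcsec(field_angle_arcmin, focal_ratio):
    """Angular coma flare (arc seconds) at a given off-axis angle for a paraboloid."""
    u = 1.0 / (2.0 * focal_ratio)          # semi-aperture angle, small-angle assumption
    return 0.75 * u**2 * field_angle_arcmin * 60.0


def coma_limited_field_arcmin(focal_ratio, tolerance_arcsec=1.0):
    """Off-axis angle (arc minutes) at which the coma flare reaches the tolerance."""
    u = 1.0 / (2.0 * focal_ratio)
    return tolerance_arcsec / (0.75 * u**2) / 60.0


if __name__ == "__main__":
    for n in (5.0, 3.0):
        print(f"f/{n:g}: field radius = {coma_limited_field_arcmin(n):.2f} arc minutes")
```

Under these assumptions the f/5 and f/3 cases come out close to the 2.3 and 0.8 arc minute figures given in the text; the small difference at f/5 is within the rounding implied by "about" 1 arc second.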
With the rather small separations between lenses that he had chosen, Sampson found it impossible to correct all these aberrations, with a paraboloid primary mirror, with small surface curvatures that he considered desirable on the lenses. He accordingly suggested that the prime mirror depart from the paraboloidal shape; for the design quoted, the mirror shape is "nearly as far beyond the paraboloid as the paraboloid is beyond the sphere".

2.1. ROSS CORRECTORS
The first field correctors for a Newtonian telescope to be actually made appear to be due to ROSS [1933, 1935]. In his second paper, Ross refers to Sampson's work on Cassegrain focus correctors (SAMPSON [1913a]) but not that on Newtonian correctors (SAMPSON [1913b]). Ross first considered a doublet lens of one positive and one negative element. For such a system thin lens theory shows that chromatic difference of focus and of magnification can only both be corrected if the two lenses are in contact. Ross, like Sampson, chose a substantially afocal system, so that secondary spectrum errors may be eliminated by using the same type of glass for each lens element.

Ross assumed that the field curvature and astigmatism of the prime paraboloid mirror were negligible, and using thin lens aberration theory he showed that if the two elements of an afocal spherical surfaced close doublet be bent to such shapes that, with respect to a stop on the prime mirror, the Seidel astigmatism is zero and the coma of the prime mirror is annulled, then the doublet introduces spherical aberration and distortion which depend only on the position of the doublet in the converging beam. If the distance from the prime mirror to the doublet be $DF$, where $F$ is the focal length of the mirror, then under Ross' conditions the spherical aberration introduced by the doublet is $4S(1-D)/D$, where $S$ is a spherical aberration corresponding to the aspherising of the prime mirror to a paraboloid from the sphere osculating at its vertex. Ross showed that for spherical surfaced lenses this spherical aberration is independent of the powers (equal and opposite) given to the two lens elements, and of the refractive index of the glass of which they are made; but he apparently believed it could be eliminated by the use of an aspheric lens surface (ROSS [1933]). WYNNE [1949] has shown that the spherical aberration has the same value, for the given coma and astigmatism correction, independent of whether spherical or aspheric surfaces are used, and for an afocal system of any number of thin lenses in contact. If the afocal lens system be designed to correct the prime mirror astigmatism as well as the coma, then the spherical aberration is somewhat increased, to $4S(1-D)/D^2$. The spherical aberration introduced by the corrector decreases as the separation $D$ approaches unity, i.e. as the corrector approaches the focus of the mirror.
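The size of this residual spherical aberration can be estimated from the two expressions above. The following sketch is not taken from the source: it assumes the standard third-order result that a spherical mirror of focal ratio $N$ gives a blur of angular diameter $1/(128N^3)$ radians at the best compromise focus, takes that blur as the measure of $S$, and scales it linearly by $4(1-D)/D$ or $4(1-D)/D^2$. The function names are hypothetical.

```python
import math

ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0


def sphere_blur_arcsec(focal_ratio):
    """Best-focus blur diameter of a spherical mirror (third-order, assumed formula)."""
    return ARCSEC_PER_RADIAN / (128.0 * focal_ratio**3)


def ross_doublet_blur_arcsec(focal_ratio, D, correct_astigmatism=True):
    """Blur from the doublet's spherical aberration, 4S(1-D)/D or 4S(1-D)/D**2."""
    factor = 4.0 * (1.0 - D) / (D**2 if correct_astigmatism else D)
    return factor * sphere_blur_arcsec(focal_ratio)


if __name__ == "__main__":
    for n in (5.0, 3.3):
        blur = ross_doublet_blur_arcsec(n, D=0.95)
        print(f"f/{n:g}, D = 0.95: image spread = {blur:.1f} arc seconds")
```

With $D = 0.95$ this gives a little under 3 arc seconds at f/5 and about 10 arc seconds at f/3.3, in line with the image spreads quoted for the doublet correctors in this section.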
The corrector also introduces distortion, which increases as D is increased, but this is not generally important. There is a limit to how near to the focus the corrector can be taken, since as D approaches unity, the curvatures of the surfaces of the lenses to give the required aberration correction become greater, and higher-order aberrations become significant, so that the useful field size is restricted by these; Ross chose a compromise value of D of about 0.95. The first-order thin lens design requires some modification, using ray-tracing, to yield a final design in which first-order aberrations are balanced against higher-order ones, and lenses have the necessary finite thicknesses and separations.

Whereas Sampson had proposed that the spherical aberration introduced by his corrector should be removed by changing the prime mirror shape, Ross considered at first that fairly small amounts of spherical aberration (which expands a star image symmetrically) would be acceptable for stellar photometry and astrometry. Several of his doublet correctors were designed and made, and ROSS [1933] published photographs, over a field of about 50 arc minutes, taken with one of these correctors on the Mount Wilson 60 inch f/5 telescope. With a value of D of 0.95, the spherical aberration in this case corresponds to an angular spread of the image at the best compromise focus of about 2.8 arc seconds. On an f/3.3 paraboloid, with the same value of D, the image spread would be about 10 arc seconds. These doublet correctors have not been widely used, presumably on account of this spherical aberration.

Ross subsequently designed a different form of corrector for a Newtonian telescope, consisting of a thin meniscus lens concave to the prime mirror, with a doublet lens some distance behind it (Fig. 1). Several of these were made, but Ross did not publish any description of them. WYNNE [1965] published the data for one of Ross' designs of three-element corrector that was made for the Palomar Observatory 200 inch f/3.3 telescope. This gives good spherical aberration correction, and a comatic image spread of about 4 arc seconds over the spectral range 405 to 656 nm at 10 arc minutes from the axis. Ross also designed correctors of significant negative power, to decrease the numerical aperture at the prime focus, as well as correcting the coma. This necessarily requires the use of glasses of differing dispersions, and introduces some secondary spectrum aberrations.

Fig. 1. Section drawing of Ross' triple corrector for the 200 inch telescope of Palomar Observatory.

PAUL [1935], in a paper discussing a variety of telescope systems covering extended fields of good imagery, included a treatment of the Ross doublet corrector. He revived Sampson's suggestion that the corrector lens spherical aberration be corrected by a change of prime mirror shape, and pointed out that the mirror shape would then depart from the paraboloidal in the same direction as the prime mirror of a Ritchey-Chrétien telescope, so that doublet correctors might be useful on Ritchey-Chrétien prime mirrors. This possibility was not taken up until recently, see § 3.2 below. The idea of correcting the spherical aberration of a Ross doublet corrector by an appropriate change of prime mirror shape, from paraboloid to approximately hyperboloidal, was again suggested by ROSIN [1961]; but in this case the author regarded it as a development from the Baker corrector system discussed in § 2.2 below.

2.2. THE BAKER CORRECTOR
Baker investigated a further development from the Ross doublet corrector, which he reported to the American Astronomical Society in 1947, but did not publish until some years later (BAKER [1953]). A close doublet lens, at a distance DF behind the mirror, can be designed to correct coma and astigmatism of the mirror with respect to a stop at a distance DF in front of the mirror. The doublet then introduces rather more spherical aberration than it would if the stop were on the mirror. If the doublet be given a positive power such that its Petzval field curvature annuls that of the mirror, the spherical aberration of the doublet is greater than an afocal doublet would have. But this spherical aberration can now be corrected by introducing a suitably aspherised plate at the stop, and this will not affect the coma, astigmatism and field curvature correction of the system. The aspheric plate has a central hole, which accommodates the doublet corrector lens (Fig. 2), the plate with the lens being mounted together, so that they constitute a corrector assembly that can be attached to a paraboloid to convert it to an extended flat field system. Without this assembly, the paraboloid can be used alone.

Fig. 2. Section drawing of Baker's paraboloid corrector system.

Compared with a Schmidt camera, this system has the further advantage of a flat image surface, and a much shorter overall length. In the Schmidt camera, with a separation of 2F between the aspheric plate and the mirror, the entrance pupil must be substantially smaller than the mirror diameter if the system is to be free from vignetting over an extended field. For the Baker corrector, with a plate to mirror separation of about 0.8F in the design Baker gives, this effect is reduced, though his design does vignette oblique pencils beyond about 1° from the axis. By a judicious choice of the types of glass used for the doublet lens, it is possible to use a cemented form giving the required aberration correction and this has been done in the example in Baker's paper, which has a relative aperture of f/4.5. Baker suggests that his system may be used over a semi-angular field of 3°.

Since the doublet has finite power, and hence glasses of differing dispersions, secondary spectrum errors are present. These image defects increase as D is decreased, and for Baker's design with D = 0.8, these effects are quite large, so that the system only gives a high performance over rather restricted spectral ranges. It is designed to give optimum results at 434 nm, and if focussed at this wavelength, the geometrical image spread at 2° from the axis is about 2 arc seconds over the range 405-486 nm, but this becomes about 8 arc seconds if the range be extended to 588 nm. If the system is focussed for 588 nm, then for the range 588-656 nm, the image spread at 2° from the axis is about 5 arc seconds. At 3° from the axis, the image spreads are substantially larger.

Several of these Baker corrector telescopes have been made. The largest of these is the Queen Elizabeth telescope at the Cape Observatory, made by Sir Howard Grubb Parsons and Co. Ltd. This has a 39 inch diameter mirror with a 35 inch aspheric plate. With a focal length for the complete system of 137.8 inches, the relative aperture is somewhat greater than in Baker's example; the field size covered (2° × 2°) is smaller. The actual design of the elements is very similar to Baker's specification.

WYNNE [1949] applied the Baker corrector system to designs covering a smaller field angle. With a larger value of D (about 0.9) the secondary spectrum errors are smaller, and he gave an example with an image spread, over an unvignetted image field of ±1°, of within 1 arc second over the spectral range 436 to 656 nm. The doublet in this case was not cemented. Wynne also gave designs where the aspheric plate is located between the mirror and the doublet lens, and where this plate is replaced by a mirror; but these arrangements are less suitable for a convertible installation.

2.3. ASPHERIC PLATE CORRECTORS
In the wide-ranging paper already cited, PAUL [1935] discussed the possible use of aspheric plates between mirror and focus to correct the field of a Newtonian telescope. He suggested the use of two spaced aspheric plates to correct coma and astigmatism, pointed out that these necessarily introduced spherical aberration, and proposed that this might be corrected by a change of mirror shape. The use of aspheric plate correctors was taken up again by MEINEL [1953] who proposed three such plates, where Paul used two and a mirror figuring.

Considering first plates with a fourth-power asphericity, and the lowest-power aberrations, such plates introduce no first order chromatic aberrations, and no field curvature, so that the field curvature of the paraboloid mirror cannot be corrected using figured plates. This leaves four Seidel aberrations: spherical aberration, coma, astigmatism and distortion. If we introduce two aspheric plates, we have four available parameters (their positions relative to the paraboloid, and their fourth-power asphericity coefficients) so that it might appear that the four Seidel aberrations could be corrected in a two-plate system. This is indeed the case, but the solution is not useful. It is the solution where one plate coincides with the paraboloid mirror, as a double-pass plate, effectively converting the paraboloid to a spherical mirror, while the second plate is at the centre of curvature of the mirror; the system becomes effectively a Schmidt camera, with its disadvantages of overall length and vignetting.

If, to reduce vignetting, the condition be imposed that the aspheric plates be located in the converging beam between the mirror and its focus, then it can be shown that a minimum of three plates is necessary even to satisfy the conditions for correction of spherical aberration, coma and astigmatism. (If distortion correction is required, a fourth plate would be needed, but this is not generally demanded in astronomical instruments.) Moreover, for a corrector system in the converging beam, none of its components must be near to the mirror, for it would then give rise to a large central obstruction of the aperture. Meinel, in considering such aspheric plate correctors, suggested that all the elements should be within 15% of the focal length in front of the focal plane. This places a severe constraint on the design. Some general characteristics of aspheric plate correctors can be derived from Seidel theory.
Expressing lengths in terms of the focal length of the mirror, whose semi-angular aperture is $u$, then for an aspheric plate whose spherical aberration coefficient is $S$, and which is located at a distance $l$ from the focus, the coma coefficient will be $SE$ and the astigmatism $SE^2$, where

$$E = (1/l - 1)/u^2.$$

For the paraboloidal mirror at the aperture stop of the system, the spherical aberration coefficient is zero, the coma is $-\tfrac{1}{2}u^2$ and the astigmatism 1. Denoting quantities relating to the three plates by suffices, numbering the plates in order from the mirror, the conditions for correction of the three aberrations are:

$$S_1 + S_2 + S_3 = 0,$$

$$S_1E_1 + S_2E_2 + S_3E_3 = \tfrac{1}{2}u^2,$$

$$S_1E_1^2 + S_2E_2^2 + S_3E_3^2 = -1.$$

Substituting for $E_1$, $E_2$, $E_3$ in terms of $l_1$, $l_2$ and $l_3$ gives

$$S_1 = -\frac{u^4\,l_1^2\,(l_2 + l_3)}{2\,(l_1 - l_2)(l_1 - l_3)},$$

with corresponding expressions for the other two plates. Hence for any chosen values of $l_1$, $l_2$ and $l_3$, the fourth-power asphericities of the three plates may be calculated. Since $1 > l_1 > l_2 > l_3$, it follows that the three aberrations at the first and third plates are negative, and at the second are positive; and hence it follows that the coma of the second plate $S_2E_2$, which is of opposite sign to that of the mirror, must be numerically greater than this. The third plate, nearest to the focal plane, cannot approach the focus very closely, or its asphericity becomes too large, with consequent large higher order aberrations. Taking $l_1 = 0.15$ as suggested by Meinel, and $l_3$ at a minimum of say 0.02, the range of possible solutions all give aberration correction by the mutual cancellation of terms, with coma and astigmatism terms on the plates of much bigger numerical values than those of the mirror. In consequence of this there are quite heavy monochromatic higher-order aberrations arising mainly from the interaction of the aberrations at one plate with those of preceding ones; and because of the dispersion of the plate materials there are large higher-order chromatic defects. The addition to the three plates of small vertex curvatures and higher-power figurings than fourth-power ones enables these higher-order effects to be alleviated, but not removed. It is presumably because of these problems that no actual design of aspheric plate corrector has been proposed for use with a Newtonian telescope. Such systems have been investigated in more detail as Ritchey-Chrétien prime mirror correctors, discussed below.

2.4. FOUR-LENS CORRECTORS
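As a numerical footnote to the aspheric plate correctors of § 2.3, the three-plate Seidel conditions can be solved directly. The sketch below is an illustration under stated assumptions, not a design from the source: it takes the conditions as $\sum S_i = 0$, $\sum S_iE_i = \tfrac{1}{2}u^2$ and $\sum S_iE_i^2 = -1$ (signs inferred from mirror coefficients of $-\tfrac{1}{2}u^2$ for coma and 1 for astigmatism), with $E = (1/l - 1)/u^2$. The middle plate position $l_2 = 0.08$ and the f/3.3 aperture are hypothetical choices, and the function names are invented for the example.

```python
def plate_leverage(l, u):
    """E = (1/l - 1)/u**2 for a plate at distance l from the focus (focal length = 1)."""
    return (1.0 / l - 1.0) / u**2


def three_plate_asphericities(l1, l2, l3, u):
    """Solve sum(S) = 0, sum(S*E) = u**2/2, sum(S*E**2) = -1 for (S1, S2, S3)."""
    E = [plate_leverage(l, u) for l in (l1, l2, l3)]
    coma_target = 0.5 * u**2      # cancels the mirror coma, assumed to be -u**2/2
    astig_target = -1.0           # cancels the mirror astigmatism, assumed to be 1
    S = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        # Closed-form (Lagrange interpolation) solution of the 3x3 Vandermonde system.
        numerator = astig_target - coma_target * (E[j] + E[k])
        denominator = (E[i] - E[j]) * (E[i] - E[k])
        S.append(numerator / denominator)
    return S, E


if __name__ == "__main__":
    u = 1.0 / (2.0 * 3.3)         # semi-aperture of an f/3.3 mirror, small-angle assumption
    S, E = three_plate_asphericities(0.15, 0.08, 0.02, u)   # l2 = 0.08 is hypothetical
    for i, (s, e) in enumerate(zip(S, E), start=1):
        print(f"plate {i}: S = {s:+.3e}, "
              f"coma / mirror coma magnitude = {s * e / (0.5 * u**2):+.2f}, "
              f"astigmatism / mirror astigmatism = {s * e**2:+.2f}")
```

For these plate positions the first and third plates come out negative and the second positive, and the individual coma and astigmatism terms are several times larger than those of the mirror, which illustrates the cancellation of large terms described in § 2.3.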
To obtain better aberration correction over wider field angles with lens correctors, it appears to be necessary to use more than the three elements used by Ross. WYNNE [1967] has described correctors of four lens elements which give a substantially higher performance. The final stages of the design of these correctors were carried out using lens optimisation computer programs, of the general type described by WYNNE [1959], NUNN and WYNNE [1959], WYNNE and WORMELL [1963]. In order to obtain good results from such programs, it is necessary to give the computer an initial design which has the potentiality of a high degree of aberration correction.

Wynne used an initial design consisting of a separated pair of afocal doublets, each of two thin lenses in contact. For such a system it can be arranged that the Seidel spherical aberrations of the two doublets are equal and opposite, that the sum of their astigmatism coefficients is equal and opposite to that of the mirror, and that the correction of the mirror coma is equally distributed between the two doublets, with a view to minimising higher-order aberrations. At this level of thin lens Seidel aberration correction, designs of the two doublets can be derived analytically. Real doublet lenses must have finite thickness and then if chromatic difference of focus is zero, chromatic difference of magnification cannot in general be exactly corrected; but the sign of this is opposite in doublets having the positive or the negative element nearer to the mirror. Wynne therefore took one doublet in one sense, and the other in the reverse. This arrangement allows the computer the possibility of separating the elements of the two doublets and still conserving correction of the two primary chromatic aberrations. In practice this proved to be useful. The two thin afocal doublets of the initial design have of course zero Petzval field curvature, but in the optimisation process the computer modified this, for the lens system as a whole, to balance the small field curvature of the mirror.

WYNNE [1967] gave numerical data for a four lens corrector of this type designed for the Palomar Observatory 200 inch f/3.3 telescope. The spot diagrams for this show a geometrical image spread within about ½ arc second for the spectral range 365 to 1014 nm over a field of 25 arc minutes diameter. A similar corrector fitted to the Isaac Newton telescope of the Royal Greenwich Observatory (98 inch f/3.0) gives a similar performance over a field of 40 arc minutes diameter (Fig. 3). Better aberration correction could be achieved by optimising these designs over somewhat smaller field angles, but since under the best atmospheric conditions the seeing disc is about 1 arc second, a corrector system giving a geometrical spread of one half of this is considered a reasonable compromise. For observing sites where the seeing is inherently worse, or for smaller telescopes where resolution is limited by photographic film granularity rather than by the seeing, wider fields of view are possible with correctors of this type.

Fig. 3. Section drawing of corrector fitted to the Isaac Newton telescope, designed by Wynne.

The aberration balancing achieved by an optimisation program is such that the image spread is maintained within some fairly small value over some linear diameter of field, outside which the aberrations increase rapidly; there is a rather sudden catastrophic breakdown of aberration correction. The diameter of field before aberration breakdown, and the size of image spread within it, both increase with the actual size of the corrector lens system. Now it follows from aberration theory that if a corrector of this type be scaled in size and in position from the mirror focus, then the (zero) Seidel spherical aberration and the coma are almost unchanged, so that the correction of these aberrations is undisturbed. There is a change in the astigmatism and field curvature, but these are both small in size, and can be corrected by small variations to the design, made by the computer. These correctors may therefore be scaled up or down in size, only small other changes being needed, to give systems of a higher degree of aberration correction over a smaller field, or a lower correction over a larger field. As an example of the latter, Wynne has designed a corrector for the projected 30 inch f/4 telescope of the University of Oporto Observatory, covering a field of 2° diameter, with image spread within about 2 arc seconds over the spectral range 365 to 852 nm.

2.5. TWO-MIRROR CORRECTORS
A quite different form of Newtonian focus corrector derives from a discussion of three-mirror anastigmat systems given by PAUL [1935]. One
Fig. 4. Paul's three mirror system.
system arising from his general analysis has a paraboloidal prime mirror, followed by convex and concave secondary and tertiary spherical mirrors of equal radius, the complete system being corrected for spherical aberration, coma and astigmatism (Fig. 4); the two spherical mirrors, with a separation equal to their common radius, therefore constitute a field corrector to the paraboloid, giving sharp imagery over a curved focal surface. Paul also considered other configurations of three mirrors, including a flat-field system of better obscuration characteristics in which the focal plane lies at the pole of the second mirror; in this case he showed that aberration correction is incompatible with a paraboloidal primary mirror. Of these three-mirror systems in general, Paul comments that they are “interesting theoretically, unfortunately of very limited application on account of obscuration”; for most applications this verdict is probably just. Paul’s two-sphere paraboloidal corrector can be modified to give a flat field in a system where the secondary mirror is approximately ellipsoidal, the tertiary remaining a sphere centred on the secondary. The focal plane falls between the secondary and tertiary mirrors. This system was briefly, and incorrectly, referred to by DIMITROFF and BAKER [1945], and discussed in some detail by BAKER [1969], who gave a series of specific designs for different field sizes for use with a 200 inch f/3.3 paraboloidal mirror. These give an extremely high level of aberration correction, with of course perfect freedom from chromatic errors. But as compared with the other forms of corrector discussed above, these two-mirror correctors are of much greater physical size, for a given size of unvignetted field of view. 
For example, Baker’s “case 2/3”, which gives an unvignetted field of 20 arc minutes diameter, requires corrector mirrors of 23 and 32 inches diameter, with 147 inches separation between them, and a baffle of 42 inches diameter; such a system would be less easily fitted and removed from a telescope than smaller devices. Moreover the location of the focal plane midway between the mirrors has disadvantages. On the other hand, the level of aberrational image spread in these systems, of the order of 0.01 arc second, is very much less than that caused by atmospheric turbulence, even at the best observing sites, so that full advantage cannot be taken of the potentially high performance of these correctors in terrestrial astronomy. Baker suggests that these systems might find applications in outer space, where observations are not seeing-limited, and would be required over a wide spectral range.
§ 3. Ritchey-Chrétien Telescope Prime Focus Correctors
The conicoid telescopes (Newtonian, Cassegrain, Gregorian), with perfect correction of axial aberrations, all have uncorrected coma and astigmatism, and generally field curvature. The field of view of good resolution is limited by coma, which depends on the first power of the field size, rather than by astigmatism and field curvature, which depend on the square. SCHWARZSCHILD [1905] pointed out that by suitable departure from conicoid form, a system of two mirrors could be made aplanatic (corrected for spherical aberration and coma). Astigmatic correction is not generally also possible in physically convenient configurations (SCHWARZSCHILD [1905], WYNNE [1969]) but an aplanatic mirror system gives a considerable extension of field size. Schwarzschild gave a detailed analysis of aplanatic mirror pairs, and applied this to a specific design of “Gregorian” form (two concave mirrors) in which the sagittal and tangential curvatures were approximately equal and opposite, giving a nearly plane mean image field lying between the mirrors. Apparently no telescope of this form was made at that time, but CHRÉTIEN [1922] revived the idea, and applied Schwarzschild’s analysis to a Cassegrain type of telescope, which he suggested should be used with curved photographic plates to minimise astigmatic effects. Telescopes of this type were made by G. W. Ritchey, including a 30 inch instrument for the U.S. Naval Observatory, Washington, and the type has become known as Ritchey-Chrétien. Again, the idea was not followed up for some time, but is now being generally adopted for several new telescopes being built or planned. The largest Ritchey-Chrétien telescope at present in use is the 107 inch (2.72 m) at McDonald Observatory, completed in 1969. The general characteristics of Ritchey-Chrétien telescopes have been discussed by WYNNE [1968]. The prime mirror, if used alone, would suffer from spherical aberration, and as originally conceived, the Ritchey-Chrétien telescope provided only a ‘Cassegrain’ focus, the prime and coudé foci being discarded. 
It is an obvious extension to provide an alternative secondary mirror of suitable vertex curvature to give a longer focal length coudé focus, the mirror being so aspherised as to correct the spherical aberration of the prime mirror. The Ritchey-Chrétien coudé focus has much heavier coma than the corresponding conicoid telescope, but for the on-axis spectroscopic work normally done at this focus, coma is not significant. The recovery of a prime focus is more difficult. The prime mirror of a Ritchey-Chrétien telescope has the same Seidel coma, astigmatism and field curvature as a paraboloid of the same size and focal length, and in addition has overcorrected spherical aberration. For the case where the Ritchey-Chrétien focus (the secondary focus) lies at the pole of the prime mirror, the spherical aberration, expressed as a wavefront aberration at the edge of the aperture, is given by
$$ w = \tfrac{1}{16}\, h^{4} / (F^{3} R^{3}), $$
where h is the semidiameter of the mirror, F is its focal length, and R is the ratio of the numerical apertures at the prime and the secondary foci. If the secondary focus lies behind the prime mirror pole, as is generally the case, the value of w is somewhat greater, typically by 10 to 20 %, depending on the back focal distance. For large telescopes, this spherical aberration makes the direct prime focus of the mirror useless, even on axis. For example, on the Anglo-Australian 154 inch (3.91 m) telescope now under construction, with a primary aperture of f/3.25 and a Ritchey-Chrétien focus of aperture f/7.79 (R = 2.40) lying 66 inches behind the prime mirror pole, w is 66 wavelengths at λ = 588 nm, and the geometrical image patch at the best compromise focus has an angular diameter of 8 arc seconds. Some form of aberration correction is therefore necessary if the prime focus of a Ritchey-Chrétien telescope is to be used. It is desirable to make this corrector such as to extend the field of good resolution at the prime focus, as well as correcting the axial aberration. This turns out to be possible at a series of different levels of field size and corrector complexity discussed below, the correction of the aberrations of oblique imagery being in fact in some ways facilitated, as compared with the paraboloid telescope, by the presence of spherical aberration.
3.1. SINGLE ASPHERIC PLATE RITCHEY-CHRÉTIEN PRIME CORRECTORS
The spherical aberration of the prime mirror can be corrected by a suitably shaped aspheric plate located anywhere in the converging beam between mirror and focus. GASCOIGNE [1965] pointed out that if the plate position be properly chosen, the prime mirror coma can also be corrected. This is the simplest possible corrector, involving less loss of light from absorption and surface reflections than any other. The size of field of good resolution that it gives is rather small, since the plate necessarily introduces considerable astigmatism, and some chromatic aberrations. For example, on the Anglo-Australian 154 inch (3.91 m) f/3.25 : f/7.79 telescope mentioned above, an aspheric plate corrector about 35 cm in diameter and about 1 m from the focus gives image spreads within arc second in diameter over the spectral range 850-360 nm over a flat field of a little under 7 arc minutes in diameter, or if the plate be suitably curved concave to the mirror, over a field approaching 10 arc minutes diameter.
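As a rough numerical check on the size of the prime mirror spherical aberration that such a plate must remove, the Python sketch below evaluates w and the resulting image patch for the Anglo-Australian telescope. It rests on assumptions not stated in the text: the coefficient 1/16 in the expression for w (equivalent to a conic departure of 2/R³ from the paraboloid), a standard textbook expression for the Ritchey-Chrétien primary when the secondary focus lies a fraction beta of F behind the pole, and the third-order result that the blur at the best compromise focus is one quarter of that at the paraxial focus. The function and variable names are illustrative only.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
INCH = 0.0254  # metres per inch


def rc_prime_wavefront(h, F, R, beta=0.0):
    """Edge wavefront aberration w (same units as h and F) of a
    Ritchey-Chretien prime mirror used alone.

    beta is the back focal distance behind the mirror pole in units of F
    (assumed textbook form); beta = 0 reduces to w = h**4 / (16 F**3 R**3).
    """
    conic_departure = 2.0 * (1.0 + beta) / (R**2 * (R - beta))
    return conic_departure * h**4 / (32.0 * F**3)


def best_focus_blur_arcsec(w, h):
    """Angular diameter of the least-confusion patch for pure third-order
    spherical aberration: the edge ray error at paraxial focus is 4w/h,
    so the blur diameter there is 8w/h, and one quarter of that at best focus."""
    return 0.25 * (8.0 * w / h) * ARCSEC_PER_RAD


# Figures quoted in the text for the Anglo-Australian telescope.
aperture = 154.0 * INCH
h = aperture / 2.0
F = 3.25 * aperture
R = 2.40
beta = (66.0 * INCH) / F
wavelength = 588e-9

w_pole_waves = rc_prime_wavefront(h, F, R) / wavelength
w_bfd = rc_prime_wavefront(h, F, R, beta)
w_bfd_waves = w_bfd / wavelength
blur_arcsec = best_focus_blur_arcsec(w_bfd, h)

print(f"w, focus at pole:        {w_pole_waves:.1f} wavelengths")
print(f"w, focus 66 in behind:   {w_bfd_waves:.1f} wavelengths")
print(f"best-focus image patch:  {blur_arcsec:.1f} arc seconds")
```

Under these assumptions the sketch gives roughly 55 wavelengths for a focus at the pole, about 66 wavelengths with the 66 inch back focal distance (an increase of about 20 %), and an image patch of about 8 arc seconds, in line with the figures quoted in § 3.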
3.2. DOUBLET LENS RITCHEY-CHRÉTIEN PRIME CORRECTORS
Since the Seidel coma and astigmatism are the same for a Ritchey-Chrétien prime mirror as for a paraboloid of the same focal length and aperture, the characteristics of a close doublet corrector for these aberrations, discussed
in § 2.1 above, are the same in the two cases. It was shown above that such a doublet lens has spherical aberration, of magnitude depending on its position in the converging beam; this spherical aberration is of opposite sign to that of the Ritchey-Chrétien prime mirror. It is therefore possible to choose a position for the doublet such that its Seidel spherical aberration annuls that of the Ritchey-Chrétien mirror. A simple doublet corrector can therefore give an extended field of good imagery at the prime focus of the Ritchey-Chrétien mirror, in a way that is not possible with a paraboloid. The spherical aberration w of the Ritchey-Chrétien mirror depends not only on its focal length F and semi-aperture h, but also on the ratio R. As R is decreased, w increases, so that for a prime mirror of given aperture and focal length, a smaller R corresponds to a doublet position further from the focus, which for the required coma and astigmatism correction has shallower surface curvatures, and smaller higher order aberrations. But unlike the other lens correctors discussed in this paper, these doublet correctors are limited in their field size not mainly by higher order aberrations but by primary chromatic ones. If a doublet with finite lens thicknesses be corrected for chromatic difference of focal position, then there is in general a small chromatic difference of magnification. If these doublet correctors are to be used over a wide spectral range (as has generally been proposed) then it is this chromatic difference of magnification that limits the size of field that can be covered. In angular measure, this limit on field size varies only slowly with the relative aperture of the prime mirror, and the ratio R. 
And the limit is a fairly sharp one, since if the criterion of image spread be relaxed, and the doublet be designed to cover a larger field, the thicknesses of the lens elements must be increased, with a consequent increase in the chromatic difference of magnification, so that there are rapidly diminishing returns. WYNNE [1968] gave data for a doublet corrector for a 105 inch f/4 : f/9 (R = 2.25) Ritchey-Chrétien telescope; this was the original specification of the 107 inch instrument at the McDonald Observatory. This covers an angular field of 28 arc minutes diameter with a total image spread within arc second over the spectral range 770 to 365 nm, and significantly better than this over a more restricted range. For a prime mirror of larger numerical aperture and larger ratio R, the field covered is slightly less. For example, for the Anglo-Australian telescope now being constructed (154 inch f/3.25 : f/7.79, R = 2.40) Wynne has designed a doublet corrector covering a field of 25 arc minutes diameter for which the image spread amounts to 0.75 arc seconds over the spectral range 852 to 365 nm, but is within 0.5 arc seconds for each of the ranges 852 to 546 nm and 546 to 365 nm. A variant of these doublet correctors has been described by SCHULTE
[1966a, b] in which an aspheric plate is inserted between the mirror and the doublet lens. This form derives from earlier work by WYNNE [1949] on paraboloid correctors. Design data are given by Schulte for a corrector for a 380 cm f/2.8 : f/9 telescope, with an aspheric plate about 1 metre in diameter. Spot diagrams are given for a single wavelength only (589 nm) extending to a semifield angle of 20 arc minutes. The image spread is about one arc second at 15 arc minutes from the axis. This system therefore appears to have little advantage over the simple doublet.
3.3. MULTIPLE ASPHERIC PLATE RITCHEY-CHRÉTIEN PRIME CORRECTORS
For the projected European Southern Observatory (E.S.O.) Ritchey-Chrétien telescope (3½ m f/3 : f/8) KOHLER [1966, 1967, 1968]* has designed a prime focus corrector based essentially on the suggestion of MEINEL [1953] for correctors of three aspheric plates for paraboloid telescopes, discussed above. Kohler has added a fourth component, a spherical surfaced lens, close to the focal plane behind the aspheric plates, to correct the field curvature of the mirror; this lens introduces small amounts of the other Seidel aberrations, in particular some over-correct astigmatism. Moreover, for reasons discussed below (§ 4) the mirror pair of the E.S.O. design departs slightly from the exact Ritchey-Chrétien form, but the effect of this on the corrector design is small. The Seidel aberration analysis of the three plate system then follows the same lines as that given for paraboloid correctors in § 2.3 above, except that the spherical aberration, coma and astigmatism of the three plates must annul the corresponding combined aberrations of the prime mirror and the rear lens. The main difference from the paraboloid case arises from the substantial negative spherical aberration of the Ritchey-Chrétien mirror. The effect of this is that, while the aberrations of the first, second and third plates are still negative, positive and negative, their numerical values are somewhat lower than they would be in a corresponding paraboloid case; but they are still quite considerable. In Kohler's design the coma coefficient on the central plate, which is of opposite sign to that of the mirror, is about three times its size. Higher order aberrations, in particular chromatic ones, while less than in paraboloid correctors of this form, are still significant. 
In order to reduce the effect of these, some vignetting of the aperture of oblique pencils has been introduced beyond a semi-angle of field of about 20 arc minutes, the meridian plane aperture being reduced to about 0.75 of its axial value at the full semi-field angle of 30 arc minutes for which the system is designed. In the final design in which aspheric plates of suitable thickness have been introduced, the aspheric plate profiles have been optimised, giving them small vertex curvatures as well as fourth-power figurings, and the largest plate a sixth-power figuring. Spot diagrams show an image spread over the spectral range 486 to 656 nm within 3 arc second on axis, increasing to rather more than 1 arc second at 21 arc minutes from the axis, and falling to a little below 1 arc second over the vignetted aperture at 30 arc minutes from the axis. Over the extended spectral range 365 to 1014 nm the image spreads are in each position up to three times greater. Spot diagrams for this corrector, and for another type discussed below in § 3.4, are given in Fig. 7. Fig. 5 shows the form of the corrector, the aspheric profiles being grossly exaggerated. The diameter of the large aspheric plate is metre, and the overall length of the corrector system is 1.09 m. A corrector of the basic Meinel three aspheric plate form, designed for an f/2.8 : f/9 Ritchey-Chrétien telescope, has been described by SCHULTE [1966a, b]. This shows an image spread of about 1 arc second at a field angle of 15 arc minutes from the axis, at 420 nm; no diagrams are given for other wavelengths.
* The contents of these three papers are substantially identical, except that the spot diagrams in the first do not show the full extent of the image spread. These have been amended in the second and third publications.
3.4. THREE COMPONENT PRIME FOCUS CORRECTORS FOR RITCHEY-CHRÉTIEN TELESCOPES
WYNNE [1965, 1966, 1968] has described a type of corrector consisting of three separated lenses, all with spherical surfaces. This form derived from
Fig. 5. Corrector designed by Kohler for the prime focus of the ESO 3.5 m Ritchey-Chrétien telescope; the shapes of the aspheric surfaces are greatly exaggerated.
the following considerations. For the close doublet correctors discussed in § 3.2 above, simultaneous correction of Seidel spherical aberration, coma and astigmatism is only possible for a unique separation between the mirror and the doublet, this separation depending on the particular Ritchey-Chrétien configuration; for other separations, in general any two of these aberrations can be corrected. If the separation between the mirror and the doublet be less than that giving correction of all three aberrations, and the shapes of the lenses comprising the doublet be chosen so as to correct the spherical aberration and coma of the prime mirror, then the system will have undercorrected Seidel astigmatism. This may be corrected, with only small contributions to the other aberrations, by a weak positive lens of suitable shape, near to the focus. The doublet, of larger size than in the simple doublet case, will have shallower curvatures, and smaller higher-order aberrations, and moreover the addition of the rear convergent lens allows the correction of the chromatic difference of magnification of the single doublet corrector; on both counts, correction over wider field angles might be expected, and is in fact realised. The earlier correctors of this general form described by WYNNE [1965, 1966] were of two types. In one, the close doublet had its converging lens nearer to the mirror, and the rear converging lens was a single element. In the other form, the close doublet had its diverging lens nearer to the mirror, and the rear converging lens was a cemented triplet; this form has rather better corrected higher-order chromatic aberrations, but involves the use of flint glass in the triplet, with some consequent loss of transmission in the near ultra-violet. 
In both these earlier forms of corrector, the rear convergent lens lay quite close to the focal plane, so that the back focal distance was small; this could be inconvenient, for example if the correctors were used with image tubes. Later designs (WYNNE [1968]), derived from the earlier ones by computer optimisation procedures, depart considerably from the simple concept described above. In these designs the front close doublet is separated, and the rear positive lens removed from the vicinity of the focal plane, the correction of the mirror aberrations being distributed in a more complicated way between the three single lenses, and the level of aberration correction being higher than in the earlier designs. The three lenses may all be made of the same material, so there are no secondary spectrum effects; and this material can be fused silica, or a glass with a good short wavelength transmission. A Ritchey-Chrétien prime mirror has the same Seidel coma, astigmatism and field curvature as a paraboloid of the same focal length and aperture, and spherical aberration of magnitude depending on the ratio R (see § 3
above). Ritchey-Chrétien telescopes of differing configurations, and hence values of R, therefore require prime focus correctors with correspondingly differing spherical aberration characteristics. If now a corrector system designed for some given prime mirror be scaled in size and position from the focus by a factor n, then its spherical aberration is also scaled up by n, its coma is approximately unchanged, and its astigmatism and field curvature are approximately scaled by 1/n; but these last are in any case small. A corrector designed for any Ritchey-Chrétien prime mirror may therefore serve as the basis of the design of corrector for a different prime mirror, corresponding to a configuration of different value of R, by appropriate scaling, the final aberration balancing being achieved by computer optimisation. In general, telescopes with smaller values of R, and hence larger spherical aberration in the prime mirror, will require corrector systems of larger physical size, which will cover larger field angles. WYNNE [1968] gives designs of triple lens corrector for the Kitt Peak National Observatory 150 inch f/2.8 : f/8 telescope (R = 2.86) covering a full unvignetted field of 50 arc minutes, and for the McDonald Observatory 107.6 inch f/3.9 : f/8.8 telescope (R = 2.25) covering a full field of 2°. Fig. 6 shows a section drawing of the former.
Fig. 6. Corrector designed by Wynne for the prime focus of the Kitt Peak National Observatory 150 inch Ritchey-Chrétien telescope.
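The scaling argument of this section can be put in a few lines of Python. This is only a sketch of the approximate Seidel scaling rules stated in the text; the coefficient values are made-up placeholders, and the helper that chooses n assumes, beyond what the text says, that the two prime mirrors have the same aperture and focal length and that the mirror spherical aberration varies as 1/R³, as in the expression for w in § 3.

```python
def scale_corrector(seidel, n):
    """Approximate Seidel coefficients of a corrector scaled in size and in
    distance from the focus by a factor n (rules as stated in the text)."""
    return {
        "spherical": seidel["spherical"] * n,               # scales with n
        "coma": seidel["coma"],                             # roughly unchanged
        "astigmatism": seidel["astigmatism"] / n,           # scales with 1/n
        "field_curvature": seidel["field_curvature"] / n,   # scales with 1/n
    }


def scale_factor_for_ratio(r_old, r_new):
    """Hypothetical helper: factor n that rescales the corrector's spherical
    aberration to match a mirror of ratio r_new instead of r_old, assuming
    equal aperture and focal length and mirror aberration ~ 1 / R**3."""
    return (r_old / r_new) ** 3


# Placeholder coefficients in arbitrary units, for illustration only.
base = {"spherical": 1.0, "coma": -0.5, "astigmatism": 0.02, "field_curvature": 0.01}

n = scale_factor_for_ratio(2.86, 2.25)
scaled = scale_corrector(base, n)

print(f"scale factor n = {n:.2f}")
for name, value in scaled.items():
    print(f"{name:16s} {value:+.4f}")
```

A smaller R gives n greater than 1, that is, a physically larger corrector, as the text states; the residual changes in astigmatism and field curvature would still be left to computer optimisation.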
These triple lens correctors give a higher degree of aberration correction than the other forms described above. For purposes of comparison WYNNE [1968] designed a three-lens system to correct the prime focus field of the
European Southern Observatory 3½ metre f/3 : f/8 telescope (R = 2.67) for which KOHLER [1966, 1967, 1968] has designed an aspheric plate corrector. In each case the full field covered is 1°, with some vignetting (the same for each corrector) beyond ±21 arc minutes. Fig. 7 shows spot diagrams for the two systems, on axis and at semi-field angles of 15′, 21′ and 30′, at wavelengths 656, 588, 486 and 405 nm, the circle in each case corresponding to an angular spread of 1 arc second. In addition to having better aberration correction, the triple lens system is much smaller, and having spherical surfaces it is easier to make and to test. The largest residual aberrations in these triple lens correctors are higher-order chromatic ones. In the photography of star fields with these correctors, a series of plates is normally taken, each through a different filter; the total spectral range that can be covered should be large, but is more restricted for each separate exposure. For the highest performance over wide field angles, therefore, a series of two or three correctors may be used, each computed for a different spectral range. Since these three-component correctors are of relatively small size and weight, it is not impracticable to interchange them between exposures. It is proposed that Kitt Peak's new 158 inch telescope should be provided in this way with a series of these correctors. WYNNE [1968] has investigated two further elaborations of these correctors, one in which some surfaces of the three lenses are allowed to take aspheric forms, and the other in which four single spherical surfaced lenses are used in place of three, giving correctors similar in general form to those described in § 2.4 above. 
In the case of a three-lens corrector whose performance has been optimised with spherical surfaces, there appears to be little gain in performance to be achieved by allowing the computer to aspherise fewer than three surfaces (one on each lens), and even with three aspherics the improvement is relatively slight, amounting to a reduction of image spread at the worst part of the field of about 25 %. The use of four spherical surfaced lenses in place of three gives a similar scale of improvement. Two other Ritchey-Chrétien prime focus correctors consisting of three separated lenses have been mentioned in the literature. The first of these, described briefly by KOHLER [1966, 1967, 1968] in the paper in which his aspheric plate corrector for the ESO telescope was described (§ 3.3 above), was an earlier approach to this problem, discarded in favour of the aspheric plate design. Another three-lens corrector, also designed for the ESO telescope prime mirror, was described by BARANNE [1966]. The published account contains numerous misprints and inconsistencies. Two designs are given, each consisting of a central positive lens between two divergent menisci, one surface
Fig. 7. Comparison of spot diagrams of Kohler's aspheric plate corrector (K) and Wynne's triple lens corrector (W), both designed for the ESO 3.5 m telescope. Image spreads are shown for four wavelengths (656, 588, 486 and 405 nm) (a) on axis, (b) at 15 arc minutes from the axis, (c) at 21 arc minutes, and (d) at 30 arc minutes.
being aspherised. It would appear from the spot diagrams given that, for the better design, over a somewhat vignetted field of ±30 arc minutes, the image spread is about 24 arc seconds for the spectral range 370 to 500 nm.
§ 4. Secondary Focus Correctors
The earliest proposals for extending the field of view of a two-mirror system, as for a single mirror one, were thought of as special purpose instruments, for use at only one focal station. From these systems have evolved the true correctors, which can be added to an unmodified Ritchey-Chrétien telescope to extend the field. Alongside these true correctors, designs continue to be proposed which require some modification of the Ritchey-Chrétien mirrors, so that their performance is degraded when used without the corrector. The first proposal for a corrected Cassegrain reflector was made by SAMPSON [1913a], who also devised the first corrected Newtonian (SAMPSON [1913b]). Sampson’s system consisted of an approximately ellipsoidal prime mirror, a secondary consisting of a spherical surfaced meniscus lens with its rear, concave surface silvered, followed by a spherical surfaced separated doublet lens system. The lens system was corrected for both primary chromatic aberrations, and the same glass type was used for all three refracting elements so that secondary spectrum effects were eliminated. The complete system, corrected for spherical aberration, coma and a flat mean field, had an aperture of f/14.05, and images of 2.2 arc seconds were claimed at a field angle of 1° from the axis. Subsequent work related to Ritchey-Chrétien type systems. In the same year as Chrétien’s paper describing these, Rear-Admiral VIOLETTE [1922] proposed that the astigmatism and field curvature of the Ritchey-Chrétien telescope be corrected by a thin close doublet lens near to the focal plane, together with a change of the mirror asphericities. 
Violette based his initial analysis on thin lens aberration theory, which leads to the general conclusions that simultaneous correction of the two primary chromatic aberrations is only possible for two thin lenses if these are in contact, and that if this thin close doublet is to have finite Petzval curvature, and hence finite power, it must be made up of positive and negative elements with differing dispersions. Since the normal Ritchey-Chrétien mirror system has a finite positive field curvature, Violette proposed a close doublet, of appropriate negative power, compounded of one crown and one flint glass element, and hence having some secondary spectrum defect. He considered that this could only be eliminated, in a doublet of zero power with two elements of the same
glass type, if the Ritchey-Chrétien configuration were of the type having zero Petzval curvature; this has an unusually large central obstruction. In fact thin lens theory, under usual conditions a useful guide, becomes a quite poor approximation for lenses close to a focus. This is discussed further below. Following Violette, no further consideration seems to have been given to secondary focus correctors for more than forty years, by which time astronomers were planning the construction of a number of Ritchey-Chrétien telescopes, some of considerable size. The angular field that can conveniently be used is limited, on the larger telescopes (say of apertures 3.5 m and upwards) by the size of photographic plates available, and on smaller telescopes by the size of hole in the prime mirror; in general on larger telescopes, field diameters of from 30 to 50 arc minutes may be desired, and on smaller instruments up to about 1° or at most 13" diameter. In the simple Ritchey-Chrétien telescope using a flat image surface, astigmatism and field curvature limit the field to much smaller sizes than these. Since the sagittal and tangential field curvatures are both of the same sign, the field of good resolution is considerably extended if the photographic plate is given a suitable curvature, concave toward the mirrors; figures given by WYNNE [1968] for various Ritchey-Chrétien instruments of focal ratios about f/8 show that with a bent plate and image spreads up to 3 arc second, field diameters of 25 to 29 arc minutes are covered. Substantially the same level of correction would be obtained over a flat field if instead of bending the photographic plate, a field lens of suitable power were inserted, so close to the focus that its aberrations other than field curvature are negligible. A variety of more complicated secondary focus correctors have been proposed. These will be described in order of complexity, starting with the simplest, rather than by date of publication. 
GASCOIGNE [1965] proposed a single element corrector in the form of an aspheric plate, fairly near to the focus. The plate can be figured so as to introduce astigmatism of amount to give a best balance against the Petzval field curvature. The sagittal and tangential field curvatures will then be equal and opposite, and over a flat image surface the field of good resolution is substantially increased over what can be attained with a simple field lens. The aspheric plate will necessarily introduce some spherical aberration and coma; if the plate is located near enough to the focus, these can be made negligible, but the plate asphericity is then rather large. Alternatively, for a plate further from the focus, the spherical aberration and coma can be corrected by making the two mirrors depart appropriately from the true Ritchey-Chrétien condition; this is what Gascoigne proposed. Another single element corrector has been proposed by KOHLER [1966,
1967, 1968] for the European Southern Observatory telescope. This consists of a single spherical surfaced lens, near to the focus, of such shape and power as to correct the astigmatism and field curvature of the mirror pair. This lens introduces spherical aberration and significant coma, which are removed by altering the mirror asphericities appropriately, so that in the absence of the corrector, there is coma at the secondary focus. Moreover the single lens necessarily introduces some chromatic difference of focus and more particularly of magnification. This may be acceptable if the corrector is only used for photography over fairly restricted spectral ranges. The field size covered is ±15 arc minutes, with an image spread at any single wavelength in the range 365 to 768 nm within about 0.4 arc second. For the two systems mentioned above that do not introduce large chromatic aberrations, Gascoigne's aspheric plate gives control of the astigmatism of the system, so that this can be balanced against the Petzval field curvature of the mirror system, which is unchanged by the plate; the field lens corrector gives control of the Petzval curvature, which can be balanced against the astigmatism of the mirror system, which is substantially unchanged by the field lens. SCHULTE [1966b, c] proposed a two element corrector of an aspheric plate well in front of the focus, together with a field lens close to the focus, which enables both astigmatism and field curvature to be corrected. He gave a design, for the secondary f/7.5 focus of the 152 cm telescope of the Cerro Tololo Inter-American Observatory, which gives image spreads within arc second for the spectral range 340 to 660 nm over a field of ±0.75°. The aspheric plate introduces significant coma, so that it is once again necessary to modify the mirror shapes from the true Ritchey-Chrétien form. 
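The two kinds of balance described here for the aspheric plate and the field lens can be illustrated with the usual third-order relation between the field curvatures. The notation below (κ_P, κ_s and κ_t for the Petzval, sagittal and tangential curvatures, and a for the fixed astigmatic half-difference) is introduced for illustration and is not the author's; primary aberrations only are assumed.

```latex
\begin{aligned}
\kappa_t - \kappa_P &= 3\,(\kappa_s - \kappa_P)
  && \text{(third-order relation between the three surfaces)}\\[4pt]
\text{aspheric plate } (\kappa_P \text{ fixed}):\quad
\kappa_t = -\kappa_s
  &\;\Rightarrow\; \kappa_s = \tfrac{1}{2}\kappa_P,\quad \kappa_t = -\tfrac{1}{2}\kappa_P \\[4pt]
\text{field lens } (\kappa_t - \kappa_s = 2a \text{ fixed}):\quad
\kappa_t = -\kappa_s
  &\;\Rightarrow\; \kappa_s = -a,\quad \kappa_t = +a
\end{aligned}
```

In either case the mean image surface is flat and the sagittal and tangential curvatures are equal and opposite; which device leaves the smaller residual depends on the relative sizes of the Petzval curvature and the astigmatism of the mirror pair.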
The first systems correcting both astigmatism and field curvature without departing from the Ritchey-Chrétien mirror shapes were proposed by WYNNE [1965] and consisted of two spherical-surfaced lenses. It follows from thin lens theory that a system of two lenses, each of zero axial thickness, can only be corrected for the two first-order chromatic aberrations if the axial separation between the lenses is zero; and that then the two lenses can only be made of the same type of glass if their combined power is zero, in which case their Petzval field curvature is zero. These considerations led VIOLETTE [1922] to conclude that in general a doublet corrector for a Ritchey-Chrétien focus must contain glasses of different dispersions, and hence secondary spectrum aberrations. Under most conditions, these thin lens propositions are approximately valid for lenses of small finite thicknesses, but not then for all shapes of the lenses; and for lenses of finite thicknesses and separations near to a focus, the discrepancies from thin lens theory become greater.
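The thin-lens statements above can be made explicit with the standard first-order expressions. The sketch below assumes two thin lenses in contact with powers $K_1$, $K_2$, refractive indices $n_1$, $n_2$ and Abbe numbers $V_1$, $V_2$; these symbols are introduced here for illustration and are not the chapter's own notation.

```latex
\begin{align*}
\text{first-order colour (lenses in contact):}\quad
  & \frac{K_1}{V_1} + \frac{K_2}{V_2} = 0, \\
\text{same glass } (V_1 = V_2 = V,\ n_1 = n_2 = n):\quad
  & K_1 + K_2 = 0, \\
\text{Petzval sum:}\quad
  & P = \frac{K_1}{n_1} + \frac{K_2}{n_2} = \frac{K_1 + K_2}{n} = 0 .
\end{align*}
```

On these assumptions a same-glass thin doublet in contact that is free of first-order colour contributes nothing to the Petzval curvature, which is why finite thicknesses and separations are needed before such a corrector can flatten the field.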
WYNNE [1965], in two designs of corrector for the Ritchey-Chrétien focus of the Kitt Peak 150 inch telescope, showed that it was possible with two separated spherical-surfaced lenses, each made of the same material, to obtain good correction of field curvature and astigmatism, with substantial freedom from chromatic aberrations, without disturbing the Ritchey-Chrétien mirror shapes. The earlier designs (WYNNE [1965]) used either fused silica or a glass of high U.V. transmission, and gave image spreads within about 0.2 arc seconds with a field of ±15 arc minutes, over the spectral range 405 to 644 nm. A later design (WYNNE [1968]), again for the Kitt Peak 150 inch telescope, covered a field of ±25 arc minutes, with image spreads within about arc second over the range 365 to 770 nm. ROSIN [1966] has described a different form of spherical-surfaced doublet corrector for an unmodified Ritchey-Chrétien mirror pair, employing two glasses of slightly differing dispersions. REFSDAL [1968] gave a design of a fused silica doublet with one surface aspheric, giving a high level of correction over a field of ±45 arc minutes on a 1.5 metre f/3.5 : f/7.5 telescope, with the mirrors departing from the Ritchey-Chrétien condition. For a similar 1.5 metre f/3 : f/8 modified Ritchey-Chrétien, WILSON [1968] designed a silica doublet corrector covering a field of ±30 arc minutes. An attempt to find a satisfactory doublet solution with the two lenses of the same material for a true Ritchey-Chrétien mirror pair was unsuccessful. A doublet design is given, using two different optical glasses, one rather liable to staining, and also a three-lens design which requires refocussing for different wavelength regions.
References (up to November, 1970)
BAKER, J. G., 1953, Amateur Telescope Making, Book Three (Scientific American Inc.) p. 1.
BAKER, J. G., 1969, I.E.E.E. Trans. AES-5, 261.
BARANNE, A., 1966, The Construction of Large Telescopes (I.A.U. Symp. No. 27, 1965) (Academic Press, London) p. 22.
CHRÉTIEN, H., 1922, Rev. d'Opt. 1, 13 and 49.
DIMITROFF, G. Z. and J. G. BAKER, 1945, Telescopes and Accessories (J. and A. Churchill Ltd., London) p. 105.
GASCOIGNE, S. C. B., 1965, Observatory 85, 79.
KÖHLER, H., 1966, The Construction of Large Telescopes (I.A.U. Symp. No. 27, 1965) (Academic Press, London) p. 9.
KÖHLER, H., 1967, European Southern Obs. Bull. No. 2, p. 13.
KÖHLER, H., 1968, Appl. Opt. 7, 241.
MEINEL, A. B., 1953, Astrophys. J. 118, 335.
NUNN, M. and C. G. WYNNE, 1959, Proc. Phys. Soc. 74, 316.
PAUL, M., 1935, Rev. d'Opt. 14, 169.
REFSDAL, I. N., 1968, Appl. Opt. 7, 1645.
ROSIN, S., 1961, J. Opt. Soc. Am. 51, 331.
ROSIN, S., 1966, Appl. Opt. 5, 675.
ROSS, F. E., 1933, Astrophys. J. 74, 316.
ROSS, F. E., 1935, Astrophys. J. 81, 2.
SAMPSON, R. A., 1913a, Phil. Trans. Roy. Soc. 213, 27.
SAMPSON, R. A., 1913b, Mon. Not. R.A.S. 73, 524.
SCHULTE, D. H., 1966a, Appl. Opt. 5, 313.
SCHULTE, D. H., 1966b, The Construction of Large Telescopes (I.A.U. Symp. No. 27, 1965) (Academic Press, London) p. 32.
SCHULTE, D. H., 1966c, Appl. Opt. 5, 309.
SCHWARZSCHILD, K., 1905, Astr. Mitt. der k. Stern. z. Göttingen, ii.
VIOLETTE, H., 1922, Rev. d'Opt. 1, 397.
WILSON, R. N., 1968, Appl. Opt. 7, 253.
WYNNE, C. G., 1949, Proc. Phys. Soc. B62, 772.
WYNNE, C. G., 1959, Proc. Phys. Soc. 73, 717.
WYNNE, C. G., 1965, Appl. Opt. 4, 1185.
WYNNE, C. G., 1966, The Construction of Large Telescopes (I.A.U. Symp. No. 27, 1965) (Academic Press, London) p. 25.
WYNNE, C. G., 1967, Appl. Opt. 6, 1227.
WYNNE, C. G., 1968, Astrophys. J. 152, 675.
WYNNE, C. G., 1969, J. Opt. Soc. Am. 59, 572.
WYNNE, C. G. and P. M. J. H. WORMELL, 1963, Appl. Opt. 2, 1233.
OPTICAL ABSORPTION STRENGTHS OF DEFECTS IN INSULATORS*
The f-Sum Rule, Smakula’s Equation, Effective Fields, and Application to Color Centers in Alkali Halides
BY
D. Y. SMITH†
Argonne National Laboratory, Argonne, Illinois 60439, USA and Michigan State University, East Lansing, Michigan 48823, USA
and
D. L. DEXTER‡
University of Rochester, Rochester, New York 14627, USA and University of Rome, Rome, 00100, Italy
* Research supported in part by U.S. Atomic Energy Commission, USAF Office of Scientific Research, and the Italian National Research Council.
† Present address: Argonne National Laboratory.
‡ Present address: University of Rochester.
CONTENTS
§ 1. INTRODUCTION
§ 2. SMAKULA’S CLASSICAL TREATMENT OF DEFECT ABSORPTION
§ 3. DECOUPLING OF DEFECT AND HOST AND THE f-SUM RULE
§ 4. CROSS SECTION, SMAKULA’S EQUATION, AND EFFECTIVE MASSES
§ 5. LOCAL FIELD CORRECTIONS
§ 6. ABSORPTION STRENGTHS OF COLOR CENTERS IN ALKALI HALIDES
§ 7. SUMMARY
ACKNOWLEDGEMENTS
REFERENCES
§ 1. Introduction
The optical properties of insulating solids may be dramatically altered by the presence of defects such as trace impurities or lattice imperfections. In nature this accounts for the coloration of many specimens of normally colorless minerals such as yellow rock salt, rose quartz, and various gems such as sapphire, topaz and ruby. In the laboratory, defect absorption has been observed in a wide range of crystals prepared by the intentional addition of impurities, by electrochemical treatment, and by irradiation with X-rays, U.V. light and various high-energy particles. The optical properties of such defects have been investigated extensively over the last fifty years both because of their practical significance and because of the deeper understanding of the properties of all solids that they have afforded (SEITZ [1946, 1954], MCCLURE [1959], SCHULMAN and COMPTON [1962], PICK [1965, 1972] and FOWLER [1968]). In the present article we propose to review several general problems associated with the strength of optical absorption by defects in nonmetals. In this we shall limit consideration to electronic transitions between defect levels; we shall not explicitly consider the details of phonon-induced transitions. We shall consider the question of defect energy levels only incidentally since this has been widely studied and many up-to-date reviews are available (GOURARY and ADRIAN [1960], FOWLER [1968], BENNETT [1968, 1969]). In the past the strength of defect absorptions has not received as much attention as the companion question of transition energies and eigenvalues. This is partly a result of the experimental fact that measurements of absorption cross section are considerably more difficult than the measurement of the energy of an absorption band.
In preparing this review, we seek to emphasize the importance of an understanding of the oscillator strength of defects and hope to show that, though difficult, precise measurements of absorption cross sections can lead to far deeper insight into the defects involved. We further wish to review critically a number of widely held notions concerning sum rules, effective fields and effective masses. Our
aim is to establish limits of the validity of these ideas and examine what understanding can be gained by applying them to experimental data. Consider the oscillator strength, f, which is not only the most convenient dimensionless parameter for describing the intensity of a particular transition, but is also a useful test for the accuracy of wave functions. Certainly it is possible to construct approximate Hamiltonians whose eigenvalue differences correspond reasonably well with observed transition energies, but whose associated eigenfunctions bear little resemblance to reality. On the other hand, if the eigenfunctions have sensible form, i.e., have the proper numbers of nodes and satisfy the boundary conditions, and furthermore predict the oscillator strength correctly, one can more reasonably expect to be able to predict the actual electronic charge distribution to be measured, for example, by magnetic resonance experiments (see SEIDEL and WOLF [1968]). Accordingly, the f-numbers are important quantities to understand for imperfections in solids, even more than for free atoms, and it would be desirable to make use of all formal knowledge available in their analysis. In the study of localized imperfections in insulators a common simplification is to assume that the defect and the surrounding medium can be decoupled and then handled separately. Usually, but not always, attention is focused on the defect and modification of the host crystal by the defect is neglected. In this approach the effect of the host on the energy levels of the defect is commonly accounted for by (1) a crystal field or crystal potential (BETHE [1929], VAN VLECK [1932], SEITZ [1938], BALLHAUSEN [1962]) which approximates the average defect-host interaction and, sometimes, (2) an effective mass (LAX [1956]) which accounts for the electron energy dispersion in the periodic structure of the host crystal.
In treating the interaction of the defect with an external field a third quantity, (3) a “local” or “effective” field, is frequently introduced (LORENTZ [1909], SMAKULA [1930], DEXTER [1958]). This local field accounts for the modification of the external field by the host medium and is made up of the external field plus the field arising from polarization induced in the host. As applied to optical properties of defects this approach has led to a generalized form of Smakula’s equation (see § 4.1) relating the oscillator strength for a transition associated with the “isolated” defect to the integrated absorption cross section of the absorption band of the defect in the solid. It also has led to intuitive notions that sum rules for oscillator strengths should apply to imperfections in the same way as for isolated atomic systems. Unfortunately, as we stress in this article, the oscillator strengths associated with particular atoms or imperfections cannot be treated as independent of the remainder of the system.
In § 3 we consider this “non-separability” of defect and system oscillator strengths from several points of view and show that it leads to non-trivial deviations from the f-sum rule as it would apply to the isolated species. For those readers who may not be familiar with the classical theory of defect absorption as developed by SMAKULA [1930] and by MOLLWO and ROOS [1934b], we sketch the derivation of Smakula’s celebrated relation from classical dispersion theory in § 2. While this development has been superseded by a quantum mechanical one, the review of classical theory serves to bring out those points that must be handled with care in the more general treatment. In § 4 we review the quantum mechanical generalization of Smakula’s expression and briefly analyze the approximations involved, including the introduction and significance of the effective mass. In § 5 we discuss the related problem of the connection between defect oscillator strength and the experimentally observable integrated cross section, particularly the so-called “effective-field ratio”. The theory behind the various choices of effective field is reviewed. It is then shown that unless the effective field ratio is unity, as for extremely diffuse centers, or the ONSAGER [1936] cavity field value in the case of point centers, the effective field ratio cannot be properly defined in terms of macroscopic properties of the host crystal and the properties of the defect. In § 6 we survey the known experimental results on color centers and some impurities in alkali halide crystals. In the case of the F center we conclude that f-sums significantly greater than unity are expected for the total absorption in the F, K and L bands. Further, we find that the net effective field experienced by the electron bound to an F center in the alkali halide is close to the average field in the medium; in no case is it even approximately the commonly assumed Lorentz value.
§ 2. Smakula’s Classical Treatment of Defect Absorption
The first treatment of the absorption strength of defects was given by SMAKULA [1930] and was based on the classical theory of dispersion (LORENTZ [1909], WOLF and HERZFELD [1928], ROSENFELD [1951]). As an introduction to our subject we shall briefly review Smakula’s treatment since in many ways it is qualitatively correct and serves to emphasize the basic assumptions which will be investigated in subsequent sections. In classical dispersion theory the electrons in a solid are described as a collection of independent Lorentz oscillators each with a frequency-dependent polarizability
$$\alpha_\nu = \frac{e^2/m}{\omega_\nu^2 - \omega^2 + i\gamma_\nu\omega}, \tag{2.1}$$
where $e$ and $m$ are the electronic charge and mass, respectively, and $\omega_\nu$ and $\gamma_\nu$ are the eigenfrequency and damping constant of the $\nu$th oscillator.* To find the optical properties of such a system we calculate the dielectric function, $\varepsilon(\omega)$, i.e., the polarization response of the system to an external field. In such a calculation it must be kept in mind that each oscillator experiences the fields arising from the polarization of all the other oscillators, so that the dipole moment induced on any one oscillator is determined by the local effective field, $\mathcal{E}_{\rm eff}$, and not necessarily the average field in the medium, $\mathcal{E}_{\rm av}$. The polarization of the various oscillators in this classical system is additive so that the polarizability of the entire system is just $\chi = (\varepsilon - 1)/4\pi = (\mathcal{E}_{\rm eff}/\mathcal{E}_{\rm av}) \sum_i \alpha_i$. In his original calculation for defects in cubic crystals Smakula made the choice, customary for a cubic or randomly oriented insulating medium, of a Lorentz-Lorenz local field (LORENTZ [1909], FRÖHLICH [1958]) so that we take $\mathcal{E}_{\rm eff} = \mathcal{E}_{\rm L} = \tfrac{1}{3}(\varepsilon + 2)\,\mathcal{E}_{\rm av}$. Introducing the oscillator strength, $f_\nu$, as the number of dispersion (i.e., optically active) electrons having eigenfrequency $\omega_\nu$ yields the well known classical dispersion relation for the complex index of refraction, $\tilde n = n - ik$,
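$$\frac{\tilde n^2 - 1}{\tilde n^2 + 2} = \frac{\varepsilon - 1}{\varepsilon + 2} = \frac{4\pi}{3}\sum_i \alpha_i = \frac{4\pi}{3}\sum_\nu f_\nu\,\frac{e^2/m}{\omega_\nu^2 - \omega^2 + i\gamma_\nu\omega}, \tag{2.2}$$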
where the second sum runs over the groups of oscillators having various frequencies $\omega_\nu$. Now Smakula observed that the change in optical properties caused by introducing defects could be calculated from eq. (2.2) in terms of the difference between the total polarizability of the defect, $\alpha_D$, and that of the host oscillator(s) it replaces, $\alpha_H$. Taking differentials in eq. (2.2) yields
$$\delta\tilde n = \frac{2\pi}{\tilde n}\left(\frac{\tilde n^2 + 2}{3}\right)^2 (\alpha_D - \alpha_H). \tag{2.3}$$
* This expression for the polarizability (with $\gamma_\nu > 0$) is appropriate for fields with a time dependence $\exp(i\omega t)$. If the time dependence $\exp(-i\omega t)$ had been chosen, the sign of the imaginary term in the denominator of eq. (2.1) would be negative and the complex index of refraction would be $\tilde n = n + ik$.
For the moment we assume with Smakula that the defect introduces a single absorption band in the region of transparency of the host crystal where the host index of refraction is real and equal to $n_0$. Then the optical absorption coefficient, $\mu$, may be calculated by equating the imaginary parts of eq. (2.3) and using the relation $\mu = -(2\omega/c)\,\mathrm{Im}\,\tilde n = 4\pi k/\lambda$, where $c$ is the velocity and $\lambda$ the wavelength of light in vacuo. For $\rho$ defects per unit volume with oscillator strength $f_D$ and damping constant $\gamma_D$ one finds
$$\rho f_D = \frac{9}{2}\,\frac{mc}{e^2 h}\,\frac{n_0}{(n_0^2 + 2)^2}\,\mu_{\max}\,\Gamma_D, \tag{2.4}$$
where $\mu_{\max}$ is the absorption coefficient at the maximum of the defect absorption band, $h$ equals Planck’s constant, and $\Gamma_D = \hbar\gamma_D$ is the full width of the defect band at half maximum in units of energy. This is Smakula’s celebrated relation (as given by MOLLWO and ROOS [1934b]) between the oscillator strength of a defect absorption and the area under its absorption band, which was assumed (see eq. (2.2)) to be a Lorentzian. This result also assumes $\rho$ to be small enough that $k$ and $\delta n/n_0$ are everywhere much less than unity. That is, it applies to dilute solutions, not compounds. This approximation will be made throughout. Comparison of this expression with that for the Lorentz field shows that the local field correction appears squared. Thus, Smakula’s equation can be rewritten as
$$n_0\,\mu_{\max}\,\Gamma_D = \frac{2e^2 h}{mc}\left(\frac{\mathcal{E}_{\rm eff}}{\mathcal{E}_{\rm av}}\right)^2 \rho f_D,$$
where the quantities on the left-hand side refer to the observable macroscopic properties of the imperfect solid while those on the right are primarily microscopic properties of the defect and its environment. The appearance of the square of the effective field ratio will be seen to be a consequence of the fact that quantum mechanically the transition probability depends on the square of the matrix element for the interaction of the defect with the radiation field. As we shall see in the following sections, this form may be generalized. In arriving at this result we have explicitly assumed that 1) the defect and host can be decoupled and treated separately, and, further, that their polarizations are additive; 2) there is a well-defined local field at the defect; and 3) the local field at the defect is the same as that at a host atom and in this example has the Lorentz value.
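The step from eq. (2.3) to eq. (2.4) is only outlined in the text. A sketch of the intermediate algebra follows, under assumptions stated here rather than in the original: a single defect oscillator of strength $f_D$ evaluated at band centre $\omega = \omega_D$ (the symbol $\omega_D$ is introduced for this sketch), a real $\alpha_H$ in the transparent region, and the polarizability difference in eq. (2.3) weighted by the defect density $\rho$.

```latex
\begin{align*}
\operatorname{Im}\alpha_D(\omega_D) &= -\frac{f_D\,e^2}{m\,\gamma_D\,\omega_D}, \\
k_{\max} = -\operatorname{Im}\delta\tilde n
  &= \frac{2\pi}{n_0}\left(\frac{n_0^2+2}{3}\right)^2
     \rho\,\frac{f_D\,e^2}{m\,\gamma_D\,\omega_D}, \\
\mu_{\max} = \frac{2\omega_D}{c}\,k_{\max}
  &= \frac{4\pi e^2}{mc}\,\frac{1}{n_0}\left(\frac{n_0^2+2}{3}\right)^2
     \frac{\rho f_D}{\gamma_D}, \\
\rho f_D
  &= \frac{9}{4\pi}\,\frac{mc}{e^2}\,\frac{n_0}{(n_0^2+2)^2}\,\mu_{\max}\gamma_D
   = \frac{9}{2}\,\frac{mc}{e^2 h}\,\frac{n_0}{(n_0^2+2)^2}\,\mu_{\max}\Gamma_D,
  \qquad \Gamma_D = \hbar\gamma_D .
\end{align*}
```

The last line uses $\gamma_D = 2\pi\Gamma_D/h$, which is where the factor $h$ (rather than $\hbar$) in the prefactor comes from.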
An important consequence of this classical treatment is that $f_D$ should be an integral number for a single absorption band since it represents the number of dispersion electrons for the defect. Of course, several bands with fractional oscillator strengths are allowed quantum mechanically, but still
the classical picture predicts an integral f-sum for the various bands arising from the same defect. As will become evident in the following, the decoupling of defect and host is only partly possible even in the one-electron approximation. This will be seen to lead to deviations from the f-sum rule and to a close connection between electronic states of the host and total defect oscillator strength. It will further become evident that a local field generally cannot be defined. In those cases in which $\mathcal{E}_{\rm eff}$ is well defined it is generally not the same at a defect as at a host atom. In spite of this, an expression similar to Smakula’s equation is found to hold provided the quantities such as $\mathcal{E}_{\rm eff}$ and $f$ are given the proper quantum mechanical interpretation.
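To put numbers to Smakula's relation, eq. (2.4), the following minimal sketch evaluates the prefactor $(9/2)\,mc/e^2h$ in CGS-Gaussian units and converts it so that the band width may be entered in eV; the result is about $1.29 \times 10^{17}$ cm$^{-2}$ eV$^{-1}$, the value usually quoted for a Lorentzian band. The function names and the sample inputs ($n_0 = 1.5$, $\mu_{\max} = 10$ cm$^{-1}$, width 0.3 eV) are hypothetical and chosen only for illustration.

```python
# CGS-Gaussian constants (rounded CODATA values)
M_E = 9.1093837e-28          # electron mass, g
C_LIGHT = 2.99792458e10      # speed of light, cm/s
E_ESU = 4.80320471e-10       # electron charge, esu
H_PLANCK = 6.62607015e-27    # Planck constant, erg s
ERG_PER_EV = 1.602176634e-12


def smakula_prefactor():
    """(9/2) m c / (e^2 h), scaled so the band width can be given in eV.

    Units: cm^-2 eV^-1, i.e. rho*f [cm^-3] per (mu_max [cm^-1] * width [eV]).
    """
    return 4.5 * M_E * C_LIGHT / (E_ESU**2 * H_PLANCK) * ERG_PER_EV


def lorentz_field_ratio(n0):
    """Lorentz local-field ratio (n0^2 + 2) / 3 assumed in eq. (2.4)."""
    return (n0**2 + 2.0) / 3.0


def defect_density_times_f(n0, mu_max, width_ev):
    """rho * f_D in cm^-3 from eq. (2.4): Lorentzian band, Lorentz field."""
    return smakula_prefactor() * n0 / (n0**2 + 2.0) ** 2 * mu_max * width_ev


if __name__ == "__main__":
    print(f"prefactor = {smakula_prefactor():.3e} cm^-2 eV^-1")
    # Hypothetical band: n0 = 1.5, peak absorption 10 cm^-1, width 0.3 eV
    print(f"rho * f_D = {defect_density_times_f(1.5, 10.0, 0.3):.3e} cm^-3")
```

Because only the product $\rho f_D$ is obtained, an independent measurement of the defect density is still needed before an oscillator strength can be extracted.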
§ 3. Decoupling of Defect and Host and the f-Sum Rule
We shall begin our discussion of defect absorption by considering the common assumption of the separability of defect and host. Since this is closely connected with the f-sum rule as applied to part of a system, we first review the quantum mechanical derivation of the f-sum rule for a system as a whole. Then we consider the isolation of a portion of the system and the extent to which a partial f-sum rule holds for the isolated portion.
3.1. THE f-SUM RULE FOR THE TOTAL DEFECT-HOST SYSTEM
The f-sum rule for a system of $N$ electrons asserts that
$$\sum_l f_{jl} = N, \tag{3.1}$$
where $f_{jl}$, the average oscillator strength* of a transition from state $\Psi_j(r_1, r_2, \ldots r_N)$ to state $\Psi_l(r_1, r_2, \ldots r_N)$, is defined as
$$f_{jl} = \frac{2m}{3\hbar^2}\,(E_l - E_j)\,\bigl|\langle\Psi_l|\mathbf{R}|\Psi_j\rangle\bigr|^2. \tag{3.2}$$
Here $E_k$ is the energy of state $\Psi_k$, $m$ is the electronic mass, and $\mathbf{R} = \sum_s \mathbf{r}_s$. A common proof (BETHE and SALPETER [1957]) of this rule proceeds by the evaluation of the fermion commutator
$$[r_{\alpha s}, p_{\beta s'}] = i\hbar\,\delta_{\alpha\beta}\,\delta_{ss'}, \tag{3.3}$$
where $p_{\alpha s}$ is the momentum operator canonically conjugate to the position coordinates $r_{\alpha s}$, $s = 1, 2, \ldots N$. If the Hamiltonian of the system is
$$H = \sum_s \frac{p_s^2}{2m} + V(r_1, r_2, \ldots r_N), \tag{3.4}$$
$p_{\alpha s}$ is given by
$$p_{\alpha s} = \frac{m}{i\hbar}\,[r_{\alpha s}, H]. \tag{3.5}$$
Equations (3.3) and (3.5) may be combined to give the double commutator
$$[r_{\alpha s}, [r_{\alpha s'}, H]] = -\frac{\hbar^2}{m}\,\delta_{ss'}. \tag{3.6}$$
The expectation value of this for the state $j$ of the system contains matrices of the form $\langle\Psi_j|r_{\alpha s} H r_{\alpha s'}|\Psi_j\rangle$ with $r$'s and $H$ arranged in various orders. These matrices may be simplified by using the closure property of the complete set of eigenfunctions, $\Psi_l$, of $H$. The result is
$$\frac{m}{\hbar^2}\sum_l (E_l - E_j)\bigl[\langle\Psi_j|r_{\alpha s}|\Psi_l\rangle\langle\Psi_l|r_{\alpha s'}|\Psi_j\rangle + \langle\Psi_j|r_{\alpha s'}|\Psi_l\rangle\langle\Psi_l|r_{\alpha s}|\Psi_j\rangle\bigr] = \delta_{ss'}. \tag{3.7}$$
Summation over the 3N coordinates then yields
$$\sum_l f_{jl} = \frac{2m}{3\hbar^2}\sum_l (E_l - E_j)\,\langle\Psi_j|\mathbf{R}|\Psi_l\rangle\cdot\langle\Psi_l|\mathbf{R}|\Psi_j\rangle = N. \tag{3.1'}$$
* The oscillator strength is strictly a tensor quantity (SEITZ [1940], BETHE and SALPETER [1957]). We shall follow the standard procedure of averaging over orientations, and all our oscillator strengths are average values. Furthermore, although all the optical parameters are tensor quantities, we shall for simplicity treat them throughout as scalars. This would be valid in treating S-P transitions in cubic or isotropic host media, and as we shall see there are sufficient non-trivial problems in even the simplest case to make unwise the introduction of unnecessary complications.
This is the famous Thomas-Reiche-Kuhn sum rule for transitions starting from a definite state $\Psi_j$. It is of completely general validity for any atomic, molecular, or solid system as a whole provided all transitions are included. The step from eq. (3.6) to eq. (3.7), and consequently a proof of the f-sum rule, cannot be made whenever we cannot use the closure property for a complete set of states. This could occur for the (theoretically) trivial reason that we exclude, say, high-energy transitions from consideration. Another possible reason might be that an approximate Hamiltonian is employed for which there exists no complete set of states. The most important example of this comes about through considerations of the Pauli Principle applied to just part of the system. This last point will be discussed in some detail. In many cases, including the subject of this paper, interest is centered on absorptions in a more or less limited energy range which are primarily associated with a particular part of the system, so that it is pertinent to ask if an f-sum rule holds for a given portion of the system (SMITH and DEXTER [1968a, 1968b]). For example, essentially all absorption of the F center in the alkali halides occurs in the transparent region of the host crystal, and
can be measured (see § 6.1). It would seem reasonable (DOYLE [1958b], SCHULMAN and COMPTON [1962] p. 90, MARKHAM [1966]) - though incorrect - to assume that this absorption corresponds to an f-sum of unity. The same would seem to apply to substitutional atomic Li in solid Ar, a donor impurity in a semiconductor, or absorption involving a single electron in a many-electron atom.
3.2. SEPARATION OF DEFECT AND HOST AND THE PARTIAL f-SUM RULE
As an illustration of the division of a system into two subsystems and incompleteness of the separation of their individual f-sums we consider the particularly simple case of the isolated alkali atom. This example serves three purposes. It is of historical interest, it explains why the italicized proviso above is necessary, and it has direct bearing on the properties of color centers. The alkali atoms consist of a single valence electron and a number of closed shells. The closed shells have zero spin and zero orbital angular momentum and their charge distributions are found to be almost independent of the state of the valence electron. As a consequence the valence electron may be described to a very good approximation as moving in an effective central potential arising from the closed shells plus the nuclear Coulomb potential (HELLMANN [1935, 1936], HARTREE [1957], BETHE and SALPETER [1957] § 68). Such a treatment assumes the one-electron approximation and assumes that the core states are independent of the valence electron’s state. Neither of these assumptions is exactly satisfied. In particular the wave function is not precisely a product of one-electron states, but it may be written as a superposition of all possible product wave functions. The low-energy excited states may therefore be written as a product of normal core states and an excited valence state plus correction terms in which core excitations occur (BETHE and SALPETER [1957] § 68). In the states of interest in optical problems these correction terms occur with very small coefficients and the “single-valence-electron-excitation” approximation is ordinarily sufficient, and is sufficient for our purposes. A similar situation is to be expected for a variety of electron-excess and defect centers in ionic crystals in which one electron or a group of electrons moves in the crystal field of closed shell ions.
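The role of the proviso that all transitions be included can be illustrated numerically. The sketch below does not model an alkali atom; it assumes a hypothetical single electron in a one-dimensional infinite square well, for which the dipole matrix elements are known in closed form and the one-dimensional form of the sum rule (without the orientational factor 1/3) applies. Starting from the second level, the sum over all final states is unity, while dropping the single downward transition, which has negative oscillator strength, leaves a sum well above unity. This is the same bookkeeping that arises when lower-lying states are occupied and therefore unavailable as final states.

```python
import math


def box_oscillator_strength(j, l):
    """Oscillator strength f_{jl} for a particle in a 1D infinite well.

    Uses E_n proportional to n^2 and the closed-form dipole matrix element
    x_{jl} = -8 L j l / (pi^2 (j^2 - l^2)^2) for j + l odd (zero otherwise).
    The mass, hbar and the well width L cancel, so f_{jl} is a pure number.
    """
    if j == l or (j + l) % 2 == 0:
        return 0.0
    return 64.0 * j**2 * l**2 / (math.pi**2 * (l**2 - j**2) ** 3)


def f_sum(j, final_states):
    """Sum of oscillator strengths from level j to the given final levels."""
    return sum(box_oscillator_strength(j, l) for l in final_states)


L_MAX = 2001  # truncation of the (rapidly converging) sum, chosen arbitrarily

total = f_sum(2, range(1, L_MAX))         # every final state included
upward_only = f_sum(2, range(3, L_MAX))   # level 1 treated as occupied
downward = box_oscillator_strength(2, 1)  # the excluded "emission", negative

if __name__ == "__main__":
    print(f"all final states      : {total:.4f}")
    print(f"excluded transition   : {downward:.4f}")
    print(f"allowed (upward) only : {upward_only:.4f}")
```

The allowed sum exceeds unity by exactly the magnitude of the excluded negative term, which is the content of the partial sum rule developed in this section.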
To treat the f-sum rule in such cases consider the division of the N-electron system into two parts A and B containing $n$ and $N - n$ electrons, respectively, and assume that the interaction of A with B is sufficiently weak relative to the binding energy of the electrons in B that the subsystem A can be assumed to move in an effective potential $u(\mathbf{r})$. In the alkali atom A would be the valence electron and B the core states. The Hamiltonian for A is then
$$H_A = \sum_{s=1}^{n}\left[\frac{p_s^2}{2m} + u(r_s)\right] + V_A(r_1, r_2, \ldots r_n), \tag{3.8}$$
where $V_A(r_1, r_2, \ldots r_n)$ is the potential for interactions within the subsystem A. $H_A$ has a complete set of eigenfunctions $\varphi_l(r_1, r_2, \ldots r_n)$ and the proof of the f-sum rule for the subsystem A may be carried almost to the point of eq. (3.7). However, not all the eigenfunctions of the Hamiltonian in eq. (3.8) can be associated with the final states for excitations of the subsystem A since some of the $\varphi_l$'s correspond to the occupied ground state of subsystem B, $\varphi^{B}$. Thus, the sum rule for the oscillator strengths of subsystem A, $f^{A}_{jl}$, takes the form
$$\sum_l f^{A}_{jl} = n - \frac{2m}{3\hbar^2}\sum_c (E_c - E_j)\,\bigl|\langle\varphi_c|\mathbf{R}|\varphi_j\rangle\bigr|^2, \tag{3.9}$$
where $\mathbf{R} = \sum_s \mathbf{r}_s$ and the $\varphi_c$'s are the functions already contained in the occupied ground state of B. Physically the second term on the right-hand side of eq. (3.9) can be viewed as a correction to the f-sum of the isolated system because of the Pauli principle prohibition of transitions to the other occupied states of the complete system contained in part B. Equation (3.9) can therefore be rewritten in terms of a fictitious oscillator strength of the forbidden transitions as
$$\sum_l f^{A}_{jl} = n - \sum_c f_{jc}, \tag{3.10}$$
where $c$ ranges over all occupied states in the remainder of the system to which transitions cannot occur. Thus in the potassium atom, for example, system A might be the 4s valence electron and system B the $1s^2 2s^2 2p^6 3s^2 3p^6$ core electrons. The forbidden transitions in this case would be the 4s → np, n < 4, transitions, i.e., X-ray emissions. Since emissions have a negative oscillator strength (because of the negative transition energy), the total oscillator strength for absorptions from the ground state of the valence electron in alkalies (other than lithium) is greater than unity (UNSÖLD [1955], SEITZ [1940] p. 644). The values for some alkalies calculated in the one-electron approximation are given in Table 1. A similar argument may be made for the core electrons, but here the forbidden transitions include X-ray absorptions to the higher occupied core and valence electron states. Thus, the innermost electrons must have one-electron oscillator strength sums of less than unity since the Pauli principle excludes transitions to higher occupied states that would be allowed in a
TABLE 1
The total oscillator strength for absorptions from the ground state of alkali atoms involving transitions of the valence electron as calculated in the one-electron approximation.

Element    Σf
H          1.00 (identically)
Li         1.00
Na         1.04 a
K          1.10 a
Rb         1.14 b
Cs         1.17 b

a Calculated from dipole matrix elements by BIERMANN and LÜBECK [1945] (see also BIERMANN [1950]).
b Calculated from dipole and gradient matrix elements by the authors using wave functions generated with the Herman-Skillman atomic structure programs (HERMAN and SKILLMAN [1963]). Dipole and gradient oscillator strengths were the same to within less than 1 %. For a comparison of the methods see CHANDRASEKHAR [1945] and GREEN, JOHNSON and KOLCHIN [1966].
one-electron picture (KRONIG and KRAMERS [1928], COMPTON and ALLISON [1935]). (In the case of electronic levels with intermediate energies there are occupied levels above and below the level in question. Thus, the forbidden transitions are of both positive and negative oscillator strength so that the net f-sum for allowed transitions is either greater or less than unity depending on the particular state.) However, what is a forbidden emission for one electron is a forbidden absorption for a deeper lying electron so that, considering the system as a whole, the total oscillator strength summed over all transitions of the system remains just $N$, the total number of electrons.
3.3. APPLICATIONS TO DEFECT PROBLEMS
Consider the implications of the foregoing for an impurity atom in a solid in the tight-binding approximation. Substitutional atomic Li in solid argon is a particularly simple example. In this case we would treat the outer (2s) electron of Li as (almost) isolated from the rest of the system, and, since the Li atom has no lower-lying states to which optical transitions can occur, any corrections to the f-sum rule for the 2s electron alone must result from the presence of the Ar medium. Deviations will occur, because whatever account is taken of the interactions between Li and its neighbors, we must deal with orthonormal wave functions. If we use Schmidt (COURANT and HILBERT [1953]) or Löwdin symmetric (LÖWDIN [1956]) orthogonalization we must subtract from the Li 2s state an appropriate amount of Ar 1s, 2s, 2p, 3s and 3p states so that the corrected Li 2s function is orthogonal to that for the rest of the system. (The “appropriate amounts” are given by overlap integrals between the Li 2s function and the Ar 1s, 2s, . . . functions.) Similarly for the Li 2p, 3p, . . . states. Accordingly, when a transition matrix element is computed between, say, the corrected 2s and 2p states of Li, the integral will contain not only the (renormalized) Li 2s and 2p functions, but also Ar matrix elements and Li-Ar two-center matrix elements of various forms, in addition to the two-center overlap integrals. Since this information obviously cannot be supplied by the Hamiltonian for the Li atom alone, there cannot be a complete set of states, and the f-sum rule breaks down. In the language of the foregoing paragraphs, final states of the Li sub-system which correspond to occupied levels of the Ar crystal must be excluded. If those occupied levels are all of lower energy than that of the 2s state of Li, then the f-sum for the Li 2s electron must be greater than unity, but of course the f-value for any particular transition might be either increased or decreased from its atomic value. For example, the oscillator strength for the 1s → 2p transition of atomic hydrogen in solid argon increases from 0.416 to 0.486, and that for the 2s → 2p transition of atomic lithium in solid neon decreases from 0.768 to 0.590 (BHARGAVA and DEXTER [1970]). It is worthwhile repeating this point in a slightly different way. In the presence of overlapping neighbors (argon atoms, say) the oscillator strength for any transition of the defect (e.g., lithium) is changed in several ways. First, the transition energy is changed. Second, the normalization of the wave functions is changed. These are multiplicative factors which do not change the form of eq. (3.2).
However, the squared transition matrix element, $|R_{jl}|^2$, now takes the form $(|R_{jl}|^2 + Y R_{jl} + Z)$, where Y and Z are complicated functions of (1) overlap integrals of $\varphi_j$ and $\varphi_l$ with all occupied states of the host, (2) two-center matrix elements between the impurity and host atoms, and (3) host matrix elements. It is apparent from the change in form of the expression for the oscillator strength that the “separated” f-sum rule cannot be valid.
In the opposite extreme, let us consider the extremely diffuse center, for example a donor impurity in a high-dielectric-constant semiconductor (KOHN [1952]). Let us also ignore “central-cell corrections”, and defer discussion of the effective mass until later. In the simplest case we would expect to be able to incorporate all interactions between the optically active electron and the medium through the use of the bulk low-energy dielectric constant, $\epsilon$, in the potential energy term, $-e^2/\epsilon r$, and one would expect the f-sum rule to be obeyed for the extra electron (again, exclusive of effective-mass effects). However, the potential energy term is now a function of the transition energy since it depends on $\epsilon$, and ordinary dispersion theory shows that
$\epsilon$ increases with energy. In this case, that of an energy-dependent potential, there is no assurance that there exists a complete set of states, and hence the derivation of the f-sum rule breaks down. Here we make the important distinction between having different self-consistent potentials for different states as in an atom, and an energy-dependent potential not calculable from the Hamiltonian of the center, but determined by the properties of the medium. If the band gap is sufficiently large compared with the binding energy (B.E.) of the donor (or acceptor), one would not expect the variation of $\epsilon$ with energy to produce a large effect. This is because 56.5 % of the total oscillator strength of the hydrogenic atom occurs at an energy less than the B.E., 88.1 % at less than twice the B.E., and 94.7 % at less than three times the B.E. (see § 6.1). If the band gap is sufficiently large that no appreciable change occurs in $\epsilon$ from zero energy up to 3 B.E., and if the conduction band(s) remains parabolic up to twice the B.E., the f-sum rule should be reasonably well satisfied, even though not formally derivable.
Perhaps it is worthwhile pointing out here that generally $f_{lj}$ is not equal to the negative of $f_{jl}$ for an imperfection in a solid because of the changed potential, wave functions, and energy levels that will come about with lattice relaxation around the excited center (FOWLER and DEXTER [1962, 1965]). Thus, it would not generally be correct to use an oscillator strength derived from measurement of the spontaneous emission rate in any expression having to do with absorption, such as eq. (4.4). For well-shielded centers, where the overlap of both ground and excited state wave functions with those of the neighbors is negligibly small, and there is little Stokes’ shift, as is the case for many rare earth transitions, it probably is a good approximation to set $f_{lj} = -f_{jl}$. In other cases, such as the F center in KCl, there may be a factor of 10 error.
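The 56.5 % figure quoted above for the hydrogenic atom can be checked with a few lines of Python. The sketch below assumes the standard closed form for the hydrogen 1s → np oscillator strength, $f_n = 2^8 n^5 (n-1)^{2n-4} / [3 (n+1)^{2n+4}]$, which is not given in this section; the function names and the cutoff `n_max` are illustrative choices only.

```python
def f_1s_to_np(n: int) -> float:
    """Oscillator strength of the hydrogen 1s -> np transition.

    Assumed closed form: f_n = 2^8 n^5 (n-1)^(2n-4) / [3 (n+1)^(2n+4)],
    rearranged so that large powers are taken of a ratio below unity.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    ratio = ((n - 1) / (n + 1)) ** (2 * n)
    return 256.0 * n**5 * ratio / (3.0 * (n - 1) ** 4 * (n + 1) ** 4)


def discrete_fraction(n_max: int = 2000) -> float:
    """Sum of f over the bound np states up to n_max.

    All bound-bound transitions from 1s lie below the binding energy, so
    this sum approximates the fraction of strength below the B.E.
    """
    return sum(f_1s_to_np(n) for n in range(2, n_max + 1))


if __name__ == "__main__":
    print(f"f(1s -> 2p)        = {f_1s_to_np(2):.4f}")
    print(f"sum over np states = {discrete_fraction():.4f}")
```

The terms fall off as $n^{-3}$, so truncating at `n_max = 2000` changes the sum only in the seventh decimal place. The remaining strength lies in the continuum above the binding energy.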
In summary of this section, we have shown that it is not generally possible to derive an f-sum rule for part of a system. In the tight-binding approximation, even for the lowest excited states, sizeable modifications to f can generally be expected, positive or negative. Even for very tightly bound systems it is inevitable that overlaps will become large for highly excited states. Even for centers which may be treated in the effective-mass approximation an f-sum rule is not formally valid, although it is reasonably accurate numerically if the binding energy is small compared with the band gap of the host.
§ 4. Cross Section, Smakula’s Equation, and Effective Masses
4.1. THE GENERALIZED SMAKULA’S EQUATION
Focusing of attention on transitions in a relatively independent portion of a large system tends to obscure the circumstance that one can only observe
interactions of the system as a whole with an applied radiation field. The important point is that the oscillator strengths appropriate to the subsystem rule, eq. (3.10), cannot be measured directly. The experimentally observable quantity is the absorption coefficient or cross section of the entire system. This is clear when one notes that the subsystem described by eq. (3.8) is assumed to move in the one-electron potential $v(\mathbf{r})$ arising from the remainder of the system. In discussing the oscillator strength it was tacitly assumed that this potential is a static self-consistent potential which, at best, can only account for the average interaction between the two parts of the system; this is adequate for an approximate calculation of energy levels and wave functions. However, for calculating transition probabilities it is just the time-dependent part of the potential which is of importance, since transitions between the subsystem energy levels are induced not only directly by the applied field, but also indirectly by the time-dependent interaction between the various parts of the system which is modulated by the applied field. In the classical case the externally induced polarization of the medium surrounding the subsystem is visualized as giving rise to an additional, in-phase, electric field at the subsystem, and in the present section we consider Smakula’s equation in this approximation. In the general case overlap effects are important and the exchange interaction must also be considered. Then the idea of an effective field becomes less useful. An approximate approach involving the time-dependent part of the potential itself will be discussed in §§ 5.3, 5.4.
To proceed we specialize to the well-studied case of electric-dipole transitions of an atomic center embedded in a dielectric medium.
It is assumed that the host medium is transparent at and near the energies of the center’s absorptions, and that the density of centers is sufficiently low that interactions among centers may be neglected. The absorption cross section, integrated over energy, of the system containing a single center is given by the rate of energy absorption divided by the energy flux at the absorbing center. The rate of absorption is the product of transition energy, $\hbar\omega_{lj} = (E_l - E_j)$, and the transition probability which is in the dipole approximation*
Here the electric vector, $\boldsymbol{\mathcal{E}}(\mathbf{r})$, has magnitude equal to the r.m.s. value
* The transition probability is given exactly in terms of matrix elements of the vector potential, $\mathbf{A}$ (SCHIFF [1968]). In the dipole approximation, i.e., for wavelengths large compared with the dimensions of the absorbing center, the exact expression may be shown to be equivalent to matrix elements of the electric vector of the radiation field, $\boldsymbol{\mathcal{E}} = -(1/c)\,\partial\mathbf{A}/\partial t$.
of the electric field at point $\mathbf{r}$. The energy flux is given by $n_0 c\,\mathcal{E}_0^2/(4\pi)$, where $n_0$ is the real part of the index of refraction of the medium at the energy $\hbar\omega_{lj}$ and $\mathcal{E}_0$ is the r.m.s. value of the average field in the medium. The ratio of these quantities gives for the integrated absorption cross section, $\sigma$, the well-known result
(see LAX [1952], DEXTER [1958]). For the case of a broad absorption line clearly $\hbar\omega_{lj}$ and 1
(4.4)
Eq. (4.4) is used experimentally by way of the definition of the absorption coefficient, $\mu$, in the relation
$$dJ = -\mu J\,dx, \qquad (4.5)$$
where J is the intensity density of incident radiation. That is, $\mu$ is the energy density removed per unit time per unit volume from a beam of unit intensity density, and is equal to the cross section, $\sigma$, times the concentration per unit volume of absorbing centers, $\rho$. Thus the integrated absorption coefficient,
$$M_{jl} = \int \mu_{jl}(\hbar\omega)\,d(\hbar\omega), \qquad (4.6)$$
or “the area under the absorption band” may be related to the concentration and the oscillator strength $f_{jl}$ by the equation
(4.7)
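The prefactor $mc/(2\pi^2 e^2\hbar)$ that enters Smakula's equation can be evaluated directly from fundamental constants. The sketch below is only illustrative: the constant values are modern CODATA-style numbers rather than the 1969 values cited in the text, and the `concentration` helper assumes the usual form $\rho f_{jl} = [mc/(2\pi^2 e^2\hbar)]\,n_0\,(\mathcal{E}_0/\mathcal{E}_{\mathrm{eff}})^2 M_{jl}$, which is an assumption here and not a quotation of eq. (4.7).

```python
import math

# Assumed constant values (modern CODATA-style, Gaussian conventions).
ELECTRON_REST_ENERGY_EV = 510_998.95   # m c^2 in eV
HBAR_C_EV_CM = 1.973269804e-5          # hbar * c in eV cm
FINE_STRUCTURE = 7.2973525693e-3       # e^2 / (hbar c)


def smakula_prefactor() -> float:
    """Return m c / (2 pi^2 e^2 hbar) in cm^-2 eV^-1.

    Uses e^2 = alpha * hbar * c, so that
    m c / (2 pi^2 e^2 hbar) = (m c^2) / (2 pi^2 alpha (hbar c)^2).
    """
    return ELECTRON_REST_ENERGY_EV / (
        2.0 * math.pi**2 * FINE_STRUCTURE * HBAR_C_EV_CM**2
    )


def concentration(area_cm1_ev: float, n0: float, f: float,
                  field_ratio: float) -> float:
    """Hypothetical helper: centers per cm^3 from the band area M (cm^-1 eV).

    field_ratio is E_eff / E_0; the functional form is the assumed
    Smakula relation stated in the lead-in.
    """
    return smakula_prefactor() * n0 * area_cm1_ev / (f * field_ratio**2)


if __name__ == "__main__":
    print(f"m c / (2 pi^2 e^2 hbar) = {smakula_prefactor():.4e} cm^-2 eV^-1")
```

The result is a number per unit area per unit energy, so multiplying by a band area in cm$^{-1}$ eV gives a concentration in cm$^{-3}$.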
The numerical value of $mc/(2\pi^2 e^2\hbar)$ is $9.1106 \times 10^{15}$ cm$^{-2}$ eV$^{-1}$ (TAYLOR et al. [1969]). This is the generalized form of Smakula’s equation, which is commonly used to estimate the concentration of absorbing centers from the measured $M_{jl}$. For a particular system of imperfection plus host, a simultaneous measurement of $\rho$ (by chemical means, say) and $M_{jl}$ allows the “calibration” of eq. (4.7), i.e., a determination of the quantity $(\mathcal{E}_0/\mathcal{E}_{\mathrm{eff}})^2 f_{jl}$. Here it should be stressed that such experiments determine not the oscillator strength, but rather the proportionality factor between $\rho$ and $M_{jl}$, or even more commonly, in practice, that between $\rho$ and the product of band height and width. The oscillator strengths quoted in the literature are therefore a measure of these proportionality factors for an assumed effective field ratio - usually the Lorentz field - and an approximate band shape. They are not truly oscillator strengths in the sense of eq. (3.2).
4.2. EFFECTIVE MASSES
An important feature of the foregoing is that the electronic mass does not appear in eq. (4.2) for the absorption cross section, whereas it does appear in the expressions for the oscillator strengths, eqs. (3.2) and (4.4). If the true electronic mass is inserted, the f-values conform to the traditional meaning, and are the appropriate strengths to use in the Thomas-Reiche-Kuhn f-sum rule discussed in § 3. However, for some purposes and in some systems it is convenient to employ an effective mass in the expression for the oscillator strength. This choice was introduced by LAX [1956] in the limit of the effective-mass approximation for shallow donor states in semiconductors. He observed that with this choice the sum of the f’s is unity for transitions between effective-mass states associated with a given band extremum. That is, if we define a “pro forma” oscillator strength
where m* is the harmonic mean of the masses in the diagonalized effective-mass tensor,
$$\frac{3}{m^*} = \frac{1}{m_1} + \frac{1}{m_2} + \frac{1}{m_3}, \qquad (4.9)$$
the $f^*$’s for transitions from an effective-mass state j to an effective-mass state l obey
$$\sum_{l\ (\text{e.m. states})} f^*_{jl} = 1. \qquad (4.10)$$
This can be easily seen from inspection of the hydrogenic Hamiltonian and simple dimensional analysis. If the kinetic operator is changed to $-(\hbar^2/2m^*)\nabla^2$, the characteristic unit of length is changed to $(\hbar^2/me^2)\,m/m^*$, and energies are changed by a factor $m^*/m$. Hence the product of factors $\hbar\omega_{lj}|\langle\varphi_j|\mathbf{R}|\varphi_l\rangle|^2$ changes by the ratio $m/m^*$, and replacement of the electron mass by the effective mass in the definition of f restores the expressions for the hydrogen atom for which the f-sum rule is valid. Of course this argument ignores the complication of an energy-dependent potential, as discussed in § 3.3. Similarly, changes of m* with state of excitation corresponding to non-parabolic bands are also excluded.
The important restriction in eq. (4.10) is that the sum is over only a partial number of final states. The index l ranges over states - both discrete and continuous - derived from the lowest conduction band in the case of donor states or the highest valence band in the case of acceptors. Transitions to states derived from other bands are ignored.
In terms of m* and $f^*_{jl}$ the “traditional” oscillator strength associated with transitions between effective-mass states is
$$f_{jl} = \frac{m}{m^*}\,f^*_{jl}. \qquad (4.11)$$
On summing over all final effective-mass states we have from eq. (4.10)
$$\sum_{l\ (\text{e.m. states})} f_{jl} = \frac{m}{m^*}. \qquad (4.12)$$
Furthermore, from the f-sum rule for the single electron under consideration we have a sum rule for transitions from the effective-mass state j to all states, k, associated with higher and lower bands, occupied or unoccupied,
$$\sum_{k\ (\text{higher and lower bands})} f_{jk} = 1 - \frac{m}{m^*}. \qquad (4.13)$$
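The dimensional argument can be made concrete with the hydrogenic 1s → 2p transition. The following Python sketch works in atomic units and assumes the standard hydrogen values $\Delta E = 3/8$ hartree and $|\langle 1s|z|2p_0\rangle|^2 = 2^{15}/3^{10}\,a_0^2$, neither of which is quoted in the text; dielectric screening is ignored, and the function name is illustrative.

```python
# Hydrogen 1s -> 2p in atomic units (assumed standard values).
DELTA_E_1S_2P = 3.0 / 8.0            # transition energy, hartree
Z2_1S_2P = 2.0**15 / 3.0**10         # |<1s|z|2p0>|^2, bohr^2


def scaled_strengths(mass_ratio: float) -> tuple[float, float]:
    """Return (traditional f, pro forma f*) for effective mass m*/m = mass_ratio.

    Energies scale as m*/m and lengths as m/m*, as in the dimensional
    argument; f = 2 * mass * energy * |z|^2 in atomic units.
    """
    if mass_ratio <= 0:
        raise ValueError("mass_ratio must be positive")
    energy = DELTA_E_1S_2P * mass_ratio
    z_squared = Z2_1S_2P / mass_ratio**2
    f_traditional = 2.0 * 1.0 * energy * z_squared        # true electron mass
    f_pro_forma = 2.0 * mass_ratio * energy * z_squared   # effective mass
    return f_traditional, f_pro_forma


if __name__ == "__main__":
    for ratio in (1.0, 0.5, 0.1):
        f, f_star = scaled_strengths(ratio)
        print(f"m*/m = {ratio:4.2f}:  f = {f:.4f}   f* = {f_star:.4f}")
```

For every mass ratio the pro forma strength stays at the atomic value, while the traditional strength is larger by the factor $m/m^*$.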
Here the summation specifically excludes transitions to effective-mass bound states and continuum states around the extremum of the band from which the state j is derived. These results are closely related to the so-called effective-mass f-sum rule for interband transitions in a perfect crystal (SOMMERFELD and BETHE [1933], WILSON [1953]). This asserts that for transitions from band s to band t
$$\sum_{t \neq s} f_{st} = 1 - \frac{m}{m^*}. \qquad (4.14)$$
The term $m/m^*$ on the right-hand side corresponds to $f_{ss}$, the intraband oscillator strength. This quantity does not arise in atomic or localized state problems because $f_{ss}$ is zero, i.e., $(E_s - E_s) = 0$ and the diagonal dipole transition matrix elements are zero or, at least, finite. However, for Bloch functions there are finite diagonal momentum matrix elements and hence infinite diagonal dipole matrix elements. When the limit as $t \to s$ is taken, the product of the vanishing energy difference and the infinite dipole matrix elements leads to a finite intraband oscillator strength of just $m/m^*$. SOMMERFELD and BETHE [1933] have discussed this point in especially clear detail
and interpret the term $f_{ss}$ as a zero-frequency “absorption line” of strength $m/m^*$. In metals this is a well-known effect reflected in the singularity in the imaginary part of the dielectric response function at zero frequency.
Now in effective-mass theory localized states are made up of linear combinations of the states of a single band. There is a redistribution of the intraband oscillator strength associated with the $\omega = 0$ “transition”, but as long as no other bands are involved, the total oscillator strength associated with the states in question remains $m/m^*$. Thus, the intraband oscillator strength appears in the effective-mass state spectrum of the defect, giving rise to the sum rule eq. (4.12).
In passing it is perhaps worthwhile to emphasize the close connection between effective masses and oscillator strengths. From eq. (4.13) or (4.14) one has three possible cases for $m^* > 0$.
(1) If $0 < \sum_{t \neq s} f_{st} < 1$, then $m_s^* > m$; (4.15a)
here f is dominated by transitions to higher lying bands. Of course, both allowed and “virtual” Pauli principle forbidden transitions are included in the f-sum.
(2) If $\sum_{t \neq s} f_{st} = 0$, then $m_s^* = m$; (4.15b)
in this case “virtual” emissions and absorptions exactly balance in the f-sum.
(3) If $\sum_{t \neq s} f_{st} < 0$, then $m_s^* < m$, (4.15c)
for which the f-sum is dominated by “virtual” emissions. Thus, whether the valence band mass, m*, is greater than, equal to, or less than the electronic mass depends on the strength of the “virtual” valence band to core transitions relative to the oscillator strength for all allowed absorptions.
To summarize: If the oscillator strength is to have its traditional meaning, the mass in eq. (4.4) must be the electronic mass and not the effective mass. The effective mass may only be used to define a “pro forma” oscillator strength that forces the partial f-sum for shallow donor or acceptor states into the form of eq. (4.10).
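The three cases (4.15a-c) follow from solving the interband sum rule, eq. (4.14), for the mass ratio. A minimal Python sketch, with illustrative function names and the restriction to $m^* > 0$ made explicit, is:

```python
def effective_mass_ratio(interband_f_sum: float) -> float:
    """Return m*/m from eq. (4.14): sum over t != s of f_st = 1 - m/m*.

    Restricted to the case m* > 0, which requires the sum to be below 1.
    """
    if interband_f_sum >= 1.0:
        raise ValueError("a positive effective mass requires an f-sum below 1")
    return 1.0 / (1.0 - interband_f_sum)


def classify(interband_f_sum: float) -> str:
    """Name the case of eq. (4.15) that a given interband f-sum falls into."""
    if interband_f_sum > 0.0:
        return "4.15a: m* > m (transitions to higher bands dominate)"
    if interband_f_sum == 0.0:
        return "4.15b: m* = m (virtual emissions and absorptions balance)"
    return "4.15c: m* < m (virtual emissions dominate)"


if __name__ == "__main__":
    for total in (0.5, 0.0, -1.0):
        print(f"f-sum = {total:+.1f}: m*/m = {effective_mass_ratio(total):.2f}"
              f"  ({classify(total)})")
```

The sample f-sums are arbitrary inputs chosen to land in each of the three cases; they are not values for any real band.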
§ 5. Local Field Corrections
5.1. CORRELATION EFFECTS AND EFFECTIVE LOCAL FIELDS
In order to define and discuss the oscillator strength of a defect in a solid, it was assumed in § 3 that the N-electron system under consideration could
be treated as two parts: subsystem A containing n electrons - we assumed this to be the defect - and the remainder of the system, subsystem B, containing N - n electrons. Then as far as the calculation of wave functions and eigenvalues $w_i$ is concerned, subsystem A was assumed to obey the Hamiltonian
where $V_A(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n)$ is the potential for interactions within the subsystem A, and $v(\mathbf{r}_s)$ is a static effective potential which accounts for interactions with the remainder of the system. In general $v(\mathbf{r}_s)$ is non-local and state-dependent, which gives rise to the deviations from the f-sum rule as discussed in § 3.2. In the one-electron approximation $v(\mathbf{r}_s)$ is the Hartree-Fock potential in which all but accidental electronic correlations are neglected. In the more general case $v(\mathbf{r}_s)$ might include correlations approximately through a dielectric screening function. This neglect or partial treatment of correlation is generally an adequate approximation for stationary-state calculations in insulators since many-electron effects involve admixtures of states having excitation energies of approximately the band gap, an energy large compared with the correlation terms in the Hamiltonian.
However, in calculating dynamical effects - such as transition probabilities which are of interest here - terms arising from the correlation between electrons in the defect and the remainder of the system are not small. The polarization of the electronic system by an external field causes a periodic deviation of the electrons from their average positions so that even in the one-electron model the subsystem experiences a time-dependent self-consistent potential in addition to the applied field. Classically, this time-dependent potential may be thought of as the static potential, $v(\mathbf{r}_s)$, plus the potential arising from the induced polarization in the remainder of the system. In general the polarization fields yield sizable shieldings or enhancements of the external field throughout the system. This is analogous to the Sternheimer shielding and anti-shielding effects in the nuclear quadrupole interaction (COHEN and REIF [1957], LUCKEN [1969]).
The effects of polarization in the host medium may be visualized by considering the field of a single polarized atom (MOTT and GURNEY [1948]).
An example is given in Fig. 1 which shows the electric field along the dipole axis of a hydrogen atom polarized by an external field (directed to the right). At distances large compared with the extent of the charge distribution the field is that of a point dipole, $\mathbf{d}$, for which $\boldsymbol{\mathcal{E}} = -(\mathbf{d} - 3\,\mathbf{d}\cdot\hat{\mathbf{r}}\,\hat{\mathbf{r}})/r^3$. Along the axis the applied and dipole fields are parallel, yielding a net field greater
[Fig. 1: panel (a), “Electric Field of a Polarized Hydrogenic 1s State”, with curves labelled POINT DIPOLE $2d/r^3$, RIGID ATOM and DEFORMABLE ATOM; panel (b), “Radial Charge Distribution for a Hydrogenic 1s State”, plotted against r.]
Fig. 1. The electric polarization field along the dipolar axis of a polarized hydrogen atom. In (a) the solid curve gives the field calculated quantum mechanically using HASSÉ’S [1930, 1931] variational technique. Outside the atom the field approaches the dipole form, $2d/r^3$, where d is the magnitude of the dipole moment and r the radial distance; within the charge distribution the field reverses. The field obtained by assuming a simple rigid displacement of electronic charge is shown by the dot-dash curve. Note that this “classical” approximation yields a considerable overestimate of the internal fields as compared with the quantum mechanical treatment which allows for deformation of the electron distribution. For comparison the radial charge distribution of the ground state is given in (b).
than the applied field. Within the atom the field is everywhere less than that for a point dipole; in particular near the center of the atom the field along the axis is reversed so that there the net field is opposite to that of the applied field. Furthermore, the average value of the field arising from polarization is zero since it may be thought of as the superposition of the fields of two charge distributions of equal magnitude, but opposite sign, that are slightly displaced from one another. Thus, in the case of a very fast electron with uniform density everywhere the positive and negative parts of the polarization field average to zero so that the net force is just that of the applied field (the average field in the medium, $\boldsymbol{\mathcal{E}}_0$, if the atom is in a host medium; see PANOFSKY and PHILLIPS [1955]).
Quantum mechanically the classical assumption of an effective field is equivalent (in first-order perturbation theory in the dipole approximation) to taking the Hamiltonian for subsystem A in the presence of an external field to be
$$H_A(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n, t) = H_A(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n) - \sum_{s=1}^{n} e\,\boldsymbol{\mathcal{E}}_{\mathrm{eff}}(t)\cdot\mathbf{r}_s, \qquad (5.2)$$
where $\boldsymbol{\mathcal{E}}_{\mathrm{eff}}$ is the average effective field at the center. Since the effective field is a one-electron operator, this approximation implies the neglect of terms arising from overlap and exchange. This is discussed in § 5.2. As will be seen in § 5.3, the introduction of a local field also implies the neglect of certain correlation effects since the field operator is derived from the $\omega_{lj}$th Fourier component of the average charge density in the same manner as $v(\mathbf{r}_s)$ is derived from the static component of the charge density.
In the present section classical and quantum mechanical treatments of the electron correlation effect will be reviewed and their validity investigated for the purpose of computing absorption strengths. Much that we say is not new. We hope by this fairly complete review to demonstrate the extreme difficulty of formulating precisely what is meant by “effective field”, and the impossibility of expressing it in terms of macroscopic parameters, except in extreme cases.
It should be kept in mind that in discussing the interaction of the system with an external field it is convenient to work with a needle-shaped specimen with its long axis oriented along the electric field. For this choice of geometry, complications arising from charge or dipole layers on surfaces perpendicular to the field are eliminated, and from elementary electrodynamics we have that the average field in the medium is the same as that in the surrounding free space. (This is the electrostatic analog of a zero “demagnetization factor”; see PANOFSKY and PHILLIPS [1955].)
5.2. CLASSICAL LOCAL FIELDS
Classically the coupling between defect and host occurs through the direct Coulomb interaction. Such a treatment is valid in the limit of negligible overlap between defect and host wave functions. Within this approximation the concept of a local field is well defined and the field is just the external field plus the fields arising from the polarization of atoms or ions surrounding the defect. Clearly the field experienced by an electron in the vicinity of a polarized atom depends on its charge distribution and the geometry and polarization of the surrounding ions. The various classical fields that have been proposed represent different assumptions for these factors (see FRÖHLICH [1958], BROWN [1956]). The three classical fields most commonly used are critically examined in the following paragraphs. Particular emphasis is placed on the Onsager field (ONSAGER [1936]) since this results from a self-consistent approach appropriate to the idealized case of a point defect having no overlap with the host medium. We also discuss classical treatments of effective fields for overlapping charge distributions and their validity for questions of defect absorption.
5.2.1. The macroscopic or Drude-Sellmeier field*
If the charge under consideration is spread uniformly throughout the system, it performs a spatial average of the fields arising from polarization of the constituent ions. Since these fields average to zero, the net Coulomb field experienced is just the macroscopic field in the medium, $\boldsymbol{\mathcal{E}}_0$. This would be the case for a fast electron passing through any matter and in particular applies to electrons in plasmas and metals (FRÖHLICH [1958], DARWIN [1934, 1944]). The macroscopic field is also approximately correct for systems with small polarizabilities and densities such that the dielectric constant does not differ significantly from unity. In refractive index calculations it leads to the Drude-Sellmeier formula applicable to dilute nonpolar gases, metals, and plasmas.
5.2.2. The Lorentz local field
In insulators correlation effects are most often accounted for by means of the Lorentz local field (LORENTZ [1909], FRÖHLICH [1958])
$$\boldsymbol{\mathcal{E}}_L = \boldsymbol{\mathcal{E}}_0 + \tfrac{4}{3}\pi\mathbf{P} = \tfrac{1}{3}(\epsilon + 2)\,\boldsymbol{\mathcal{E}}_0, \qquad (5.3)$$
where $\boldsymbol{\mathcal{E}}_0$ is the macroscopic average field in the medium and $\mathbf{P}$ the polarization. Assumption of this field yields the well-known Mossotti-Clausius-Lorenz-Lorentz (DARWIN [1934], second footnote) expression for the refractive index which is obeyed by many non-polar fluids over wide ranges in density (VAN VLECK [1940], BROWN [1956]). The Lorentz local field is the local field at a lattice site assuming
1. an isotropic or cubic distribution of polarizable atoms, ions or isotropic molecules; with
2. dimensions small compared with the nearest neighbor spacings;
3. dipole-dipole interactions with no short range forces such as overlap; and
4. uniform polarization throughout the system.
* For a penetrating classical analysis of the use of the Sellmeier and Lorentz fields see DARWIN [1934, 1944].
Derivations of eq. (5.3) are standard and are based on continuum electrostatics or direct summation of fields of a lattice of point dipoles (MOTT and GURNEY [1948], FRÖHLICH [1958]). Despite the fact that assumptions 2 and 3 are not completely satisfied in ionic crystals, studies of the refractive indices of the alkali halides by Shockley et al. (SHOCKLEY [1946], TESSMAN, KAHN and SHOCKLEY [1953]) indicate that the Lorentz field gives the best agreement with experiment in calculations of the optical dielectric constant in ionic crystals. This would seem to lend support to the use of the Lorentz field in the generalized form of Smakula’s equation, eq. (4.4), provided the defect charge distribution is no larger than that of a typical ion. However, it will be shown below that even under the most favorable conditions the Lorentz field is incorrect for a defect even in principle.
5.2.3. The Onsager field
In the case of a crystal containing defects, assumption 4 given above is not justified because the defect’s polarizability is not the same as that of the host medium. HERRING [1956] and SILSBEE [1956] have pointed out that this non-uniform polarization may be accounted for with Onsager’s approach to the local field (ONSAGER [1936], VAN VLECK [1940], FRÖHLICH [1958]). Onsager observed that the field acting on a molecule within a system in an external electric field may be decomposed into two fields. The first is the field that would obtain at the position of the molecule as a result of the applied field in the absence of the polarized molecule under consideration. This is usually referred to as the cavity field. The second is the field at the molecule arising from the polarization (or “reaction”) induced in the medium by the molecule’s own dipole moment. For simplicity, Onsager chose a model consisting of a single molecule contained in a cavity of radius a in a
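continuous medium of dielectric constant $\epsilon$. In this model the cavity field, i.e., the field in the vacant cavity caused by the external field, is uniform and is
$$\mathbf{G} = \frac{3\epsilon}{2\epsilon + 1}\,\boldsymbol{\mathcal{E}}_0 = g\,\boldsymbol{\mathcal{E}}_0. \qquad (5.4)$$
The reaction field arising from the presence of a dipole in the cavity is also uniform within the cavity and is
$$\mathbf{R} = \frac{2(\epsilon - 1)}{2\epsilon + 1}\,\frac{\mathbf{d}}{a^3} = r\,\mathbf{d}, \qquad (5.5)$$
where $\mathbf{d}$ is the dipole moment of the molecule. The total field, $\mathbf{F}$, acting on the molecule is the sum of $\mathbf{G}$ and $\mathbf{R}$
$$\mathbf{F} = g\,\boldsymbol{\mathcal{E}}_0 + r\,\mathbf{d}. \qquad (5.6)$$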
In the event that the molecule in the cavity has the same polarizability as those in the medium this sum reduces to the Lorentz field*. It should be noted in passing that the specific values of g and r given in eq. (5.4) and eq. (5.5) are appropriate to Onsager’s continuum model. A more detailed model involving a discrete lattice would yield somewhat different values. However, as long as short range forces such as overlap are negligible, an equation of the form of eq. (5.6) should hold where g and r are functions of the lattice structure, spacings and the dielectric constant of the crystal alone.
Since the field at the molecule depends on the molecule’s dipole moment and vice versa, the two must be found self-consistently. If we denote the polarizability of the central molecule by $\alpha$, the moment is $\mathbf{d} = \alpha\mathbf{F}$ and from eq. (5.6)
$$\mathbf{F} = \frac{g}{1 - \alpha r}\,\boldsymbol{\mathcal{E}}_0 \qquad (5.7)$$
and
$$\mathbf{d} = \frac{\alpha}{1 - \alpha r}\,g\,\boldsymbol{\mathcal{E}}_0. \qquad (5.8)$$
The latter equation shows that the effect of polarization in the medium has been to renormalize or “dress” the dipole to give an effective local polarizability $\alpha_{\mathrm{eff}} = \alpha/(1 - \alpha r)$. We note for future reference that, since the Onsager field reduces to the Lorentz field if the polarizability of the central molecule is the same as those
* This can be verified by noting that from the definition of $\epsilon$ in terms of the dipole moment per unit volume one has $|\mathbf{d}|/a^3 = \tfrac{1}{3}(\epsilon - 1)\mathcal{E}_0$.
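in the remainder of the system, eq. (5.7) yields an alternate expression for $\boldsymbol{\mathcal{E}}_L$,
$$\boldsymbol{\mathcal{E}}_L = \mathbf{F}(\alpha = \alpha_H) = \frac{g}{1 - \alpha_H r}\,\boldsymbol{\mathcal{E}}_0, \qquad (5.9)$$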
where $\alpha_H$ is the polarizability of the host, with useful implications.
Eq. (5.8) gives the moment $\mathbf{d}$ within the cavity. However, the quantity of interest for the interaction with an external field is the total moment of the system. This may be divided into three parts: 1) that portion arising from the polarization of the surrounding medium that would exist in the absence of the local moment; 2) the moment $\mathbf{d}$; and 3) the polarization of the medium induced by the moment $\mathbf{d}$. Since the first of these is fixed, only the change of the second two from the perfect crystal case is of concern for an impurity problem. The result for the net external moment, $\mathbf{d}_{\mathrm{ext}}$, due to the moment $\mathbf{d}$ plus the associated polarization in the medium is, in the cavity-in-a-continuum model used by Onsager,
$$\mathbf{d}_{\mathrm{ext}} = \frac{3\epsilon}{2\epsilon + 1}\,\mathbf{d} = g\,\mathbf{d}. \qquad (5.10)$$
(5.11)
for any g appropriate to eq. (5.6). Physically these results for the external moment can be viewed in either of two ways. The viewpoint leading to eq. (5.10) is that the moment in the cavity polarizes the medium so that at large distances from the cavity the effective moment is the cavity moment plus that of the polarized surroundings. Herring's point of view is to divide the interaction of the moment in the cavity with the local field into that with the cavity field - d . g b , and that with the reaction field - d . rd. The latter is a self energy associated with the existence of the moment and its polarization field. The former, - g d . b,, is the actual coupling between d and 6,. From either viewpoint it is clear that eqs. (5.8) and (5.11) may be combined to yield the effective external polarizability, u e X t , (5.12)
of an object of polarizability a within the cavity. Now consider a host crystal with index of refraction no into which p defects
per unit volume are introduced by replacement of host atoms. The polarizability of the host atoms will be denoted by $\alpha_H$ and will be taken as real (i.e., no absorptions) in the frequency range of interest. The defect center’s polarizability will be denoted by $\alpha_C$, which is in general complex. $\rho$ will be assumed small enough to exclude mutual interactions of defects and to ensure that the change in the index of refraction, $\delta\tilde{n}$, is small relative to $n_0$. Then from $\tilde{n}^2 = \epsilon = 1 + 4\pi\chi$ one has to a good approximation
$$\delta\tilde{n} = \delta n - ik = (2\pi/n_0)\,\delta\chi, \qquad (5.13)$$
where n and k are the real and imaginary parts of the complex index of refraction, $\tilde{n}$, and $\delta\chi$ is the change in overall polarizability of the system. Combining eqs. (5.12) and (5.13) we have for the change in optical constants
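$$\delta n - ik = \frac{2\pi\rho}{n_0}\,g^2\left[\frac{\alpha_C}{1 - r\alpha_C} - \frac{\alpha_H}{1 - r\alpha_H}\right]. \qquad (5.14)$$
Now, the optical absorption coefficient, $\mu(\omega)$, is related to the imaginary part of the refractive index by $\mu(\omega) = 2\omega k(\omega)/c$. Since $\alpha_H$ is real, $\mu$ is determined entirely by $\alpha_C$ giving*
$$\mu(\omega) = -\rho\,\frac{4\pi\omega}{c}\,\frac{g^2}{n_0}\,\mathrm{Im}\,\frac{\alpha_C}{1 - r\alpha_C}. \qquad (5.15)$$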
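The effect of the renormalizing denominator in eq. (5.15) can be illustrated numerically for a single damped oscillator. The sketch below assumes a classical Lorentz-oscillator polarizability $\alpha_C(\omega) = A/(\omega_0^2 - \omega^2 + i\gamma\omega)$ with $A = fe^2/m$, the sign of the damping term being chosen so that $\mathrm{Im}\,\alpha_C < 0$, as the $n - ik$ convention requires; this functional form and all parameter values are assumptions made for illustration.

```python
import math


def lorentz_polarizability(omega: float, omega0: float,
                           strength: float, gamma: float) -> complex:
    """Assumed oscillator form alpha = A / (omega0^2 - omega^2 + i gamma omega)."""
    return strength / complex(omega0**2 - omega**2, gamma * omega)


def dressed_absorption(omega: float, omega0: float, strength: float,
                       gamma: float, r: float) -> float:
    """-Im[alpha / (1 - r alpha)], the frequency-dependent factor in eq. (5.15)."""
    alpha = lorentz_polarizability(omega, omega0, strength, gamma)
    return -(alpha / (1.0 - r * alpha)).imag


def peak_position(omega0: float, strength: float, gamma: float, r: float,
                  lo: float = 0.8, hi: float = 1.1, steps: int = 3000) -> float:
    """Locate the maximum of the dressed absorption on a uniform grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: dressed_absorption(w, omega0, strength, gamma, r))


if __name__ == "__main__":
    omega0, strength, gamma, r = 1.0, 0.1, 0.01, 1.0  # arbitrary units
    predicted = math.sqrt(omega0**2 - r * strength)
    print(f"bare peak    : {peak_position(omega0, strength, gamma, 0.0):.4f}")
    print(f"dressed peak : {peak_position(omega0, strength, gamma, r):.4f}")
    print(f"sqrt(w0^2 - rA): {predicted:.4f}")
```

For this oscillator form $\alpha_C/(1 - r\alpha_C)$ is again a single oscillator with $\omega_0^2$ replaced by $\omega_0^2 - rA$, so the grid search should find the dressed peak at the shifted frequency with the line shape otherwise unchanged.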
This expression shows that the medium has two effects on the polarizability of the defect center: a multiplicative enhancement by the factor $g^2$ and a renormalization of $\alpha_C$ by the factor $(1 - r\alpha_C)^{-1}$. It is not, as assumed in Smakula’s equation, a simple multiplicative modification by the square of
* Some confusion exists in the literature over the correct form for eqs. (5.14) and (5.15). To see the reason for this, note that from eq. (5.9) the quantity $g^2[\alpha_C/(1 - r\alpha_C) - \alpha_H/(1 - r\alpha_H)]$ can be written in the three equivalent forms
Now, it can be argued that for the purposes of calculating the integrated absorption, most of the contribution comes from within a few natural line widths of the center of the line. In this energy range $|\alpha_C| \gg |\alpha_H|$, suggesting that $\alpha_H$ may be neglected as small compared with $\alpha_C$ wherever it occurs explicitly. However, doing this yields the contradictory result
$$(G/\mathcal{E}_0)^2 \approx \mathbf{G}\cdot\boldsymbol{\mathcal{E}}_L/\mathcal{E}_0^2 = \mathcal{E}_L^2(1 - r\alpha_H)/\mathcal{E}_0^2 \approx (\mathcal{E}_L/\mathcal{E}_0)^2.$$
Thus, in the present notation, HERRING [1956] found a factor $(\mathcal{E}_L/\mathcal{E}_0)^2(1 - r\alpha_H)$ rather than $(G/\mathcal{E}_0)^2$ in eq. (5.15). Basically the difficulty is that the quantity $\alpha_H/(1 - r\alpha_C)$ has been inadvertently neglected. Since the denominator of this term may be small in the vicinity of a resonance where Re $\alpha_C$ changes sign, the approximation is not valid. This error was first noted by SILSBEE [1956].
an effective field ratio. The point is that even classically a single quantity is generally insufficient to describe the coupling of one part of a system to another part and to the external field. In other words, the total local field determines the local moment, but only a portion of the local field, the cavity field, is effective in coupling to the external field. Herring and Silsbee noted that in the special case in which $\alpha_C$ is a single Lorentzian, $\mathcal{L}_{\omega_0,\gamma}(\omega)$, centered at $\omega_0$ with full width at half-maximum $\gamma$,
(5.16)
the only effect of the factor $(1 - r\alpha_C)^{-1}$ in eq. (5.15) is to shift the resonance rigidly from $\omega_0$ to $\omega_0'$, where
$$\omega_0'^{\,2} = \omega_0^2 - r f (e^2/m), \tag{5.17}$$
yielding
(5.18)
Comparison of this with the integrated cross section*, eq. (4.4), shows that in this particular case $g$ plays the role of the effective field ratio $\mathcal{E}_{\rm eff}/\mathcal{E}_0$. Hence, in the limit of a single, compact Lorentz oscillator in a dielectric medium with negligible short range interactions between host and defect, the effective field is the Onsager cavity field. We stress that these are just the conditions for which the Lorentz field is widely and erroneously believed to apply. In the case of $\epsilon = 2$, use of the Lorentz field rather than the Onsager field overestimates $(\mathcal{E}_{\rm eff}/\mathcal{E}_0)^2$ by over 23%, whereas for $\epsilon = 3$ the overestimate is 68%. In general the effect of $(1 - r\alpha_C)^{-1}$ is not just a simple frequency shift. For example, if $\alpha_C$ can be expressed as a sum of Lorentz oscillators
$$\alpha_C = \sum_{i}^{N} \alpha_{C,i}, \tag{5.19}$$
where $\alpha_{C,i}$ has the form of eq. (5.16), and if the widths, $\gamma_i$, and frequencies, $\omega_i$, are such that the various resonances have negligible overlap, then the external polarizability has the form
$$\alpha_{\rm ext}(\omega) = \sum_i \alpha_{{\rm ext},i}(\omega), \tag{5.20}$$
where
* To do this note that $\int_0^\infty \omega\,{\rm Im}\,\mathcal{L}(\omega)\,d\omega = -\tfrac{1}{2}\pi f(e^2/m)$.
and
The fact that we neglect the small off-resonance imaginary part of $\alpha_{C,j}$ for $j \neq i$ has been emphasized by writing ${\rm Re}\,\alpha_{C,j}$ in the summations. In this more general case the effect of $(1 - r\alpha_C)^{-1}$ has been to shift the frequencies and to alter the strength of the resonance by an additional factor of $[1 - r\sum_{j\neq i}{\rm Re}\,\alpha_{C,j}(\omega)]^{-1}$. Since ${\rm Re}\,\alpha_{C,j}$ has a dispersive form, $\sum_{j\neq i}{\rm Re}\,\alpha_{C,j}$ may be either positive or negative, so that the strength of a particular resonance is either increased or decreased depending on the strength and energy of $\alpha_{C,i}(\omega)$ relative to the other $\alpha_{C,j}(\omega)$’s. The basic point for our purposes is that it is no longer possible to identify a quantity in eq. (5.21) as the effective field. The only possibility would be $g[1 - r\sum_{j\neq i}{\rm Re}\,\alpha_{C,j}(\omega)]^{-1/2}$, but this does not correspond to any actual field. In summary we compare the use of the Lorentz field and the Onsager fields in Smakula’s equation. From the expression for the local field, eq. (5.9), the square of the effective field ratio in the Lorentz approximation is
$$(\mathcal{E}_L/\mathcal{E}_0)^2 = g^2/(1 - r\alpha_H)^2. \tag{5.23}$$
The present analysis shows that the appropriate quantity is a combination of the Onsager cavity field and the reaction field enhancement factor. Defining an equivalent Onsager “effective field”, we have
$$(\mathcal{E}_{\rm eff}/\mathcal{E}_0)^2 = g^2/(1 - r\alpha_C). \tag{5.24}$$
These results differ in two ways. First, the $\alpha$ in the denominator of eq. (5.23) is that of the host, rather than that of the defect which eq. (5.24) shows to be correct. Secondly, the expressions differ even in form since the denominator of eq. (5.23) is squared. Again, the reason is that squaring the Lorentz field brings in the renormalizing effect of the reaction field once too often. That is, the coupling of the induced moment back to the external field does not contain a renormalization; it is, from eq. (5.10), $g$ and not $g(1 - r\alpha_H)^{-1}$.
5.2.4. Overlapping charge distributions and the study of GUERTIN and STERN [1964]
In the previous discussion consideration has been limited to point dipoles or, at least, ions with dimensions very small compared with the nearest neighbor spacing. These conditions are only encountered at low densities
and are rarely if ever realized for defects in solids or liquids. Furthermore, polarizabilities and transition rates are determined by matrix elements between the ground and excited states, and even if the ground state is compact this is not true of all the excited states that contribute to the polarizability. An attempt to include the effect of overlapping charge distributions has been made by GUERTIN and STERN [1964] in a classical model calculation for various cubic lattices in which electrons have extended charge distributions. Assuming a Gaussian charge distribution and allowing rigid displacement of the electron distribution, they found that the local field varied smoothly from the Lorentz value at small overlap to the macroscopic (average) field for very diffuse charge distributions. Since the polarization of the medium is taken to be uniform, all unit cells are assumed to have the same polarizability in this calculation. Furthermore, the effective field calculated by Guertin and Stern is the field tending to polarize an ion. That is, it is the effective field in the sense of eq. (5.7). As in the discussion of eqs. (5.23) and (5.24), the square of this field is not to be used in Smakula’s equation since the effect of the reaction field would be included once too often. Guertin and Stern have pointed out that the effective fields in real systems may differ from their calculated values because of their assumption of Gaussian charge distributions rather than ones having a more realistic exponential decay at large distances. A far more serious source of error is the assumption that charge distributions polarize by rigid displacement. This can be seen directly from Fig. 1, which shows the electric field for a polarized hydrogen atom in several approximations. Assuming a rigid displacement of the electronic cloud relative to the nucleus yields the dot-dashed curve, whereas a quantum mechanical calculation taking deformation of the charge distribution into account yields the solid curve*. It will be seen that the classical “rigid ion” approximation greatly overestimates the magnitude of the internal fields and that it does not predict the position of the zeroes of the field accurately.
5.3. QUANTUM MECHANICAL FORMULATION
In a complete quantum mechanical treatment the system of defect and host is viewed as a whole and only the applied field and its interaction with the entire system is considered. However, correlations within the system must
* The Q.M. calculation was made using the variational method given by HASSÉ [1930, 1931] with a one-parameter function. A more flexible electronic wave function presumably would have given even smaller fields within the atom because of its greater capacity to screen out the field.
be adequately accounted for and the calculations carried beyond the usual stationary state one-electron approximation. Two approximate methods of doing this are discussed in the present section. They differ in the order in which the correlations and the electromagnetic perturbation are considered and serve to shed light on different aspects of the problem. The Hamiltonian for the entire system of matter plus vector potential in the semiclassical approximation may be written in the form
$$\mathcal{H} = \sum_i\left[-\frac{\hbar^2}{2m}\nabla_i^2 + V_{\rm nuc}(\mathbf{r}_i)\right] + \tfrac{1}{2}{\sum_{i,j}}'\frac{e^2}{r_{ij}} + \frac{ie\hbar}{mc}\sum_i \mathbf{A}(\mathbf{r}_i,t)\cdot\nabla_i. \tag{5.25}$$
Here $V_{\rm nuc}(\mathbf{r}_i)$ includes nuclear-nuclear as well as electron-nuclear Coulomb interactions. The usual choice of gauge has been made so that ${\rm div}\,\mathbf{A} = 0$, and terms in $A^2$ have been omitted on the assumption that the radiation field is of low intensity.
5.3.1. Configuration mixing procedure
The most direct approach to the Hamiltonian in eq. (5.25) is to solve the many-electron problem in the absence of the electromagnetic interaction and then to include the electromagnetic part by time-dependent perturbation theory. The first part of the problem, the electronic structure, is generally treated in the Hartree or Hartree-Fock approximation in insulators. In these methods the electron-electron interaction is replaced by some suitably averaged one-electron interaction, $V_{ee}(\mathbf{r}_i)$. The resulting Hamiltonian,
$$\mathcal{H}_{\rm HF} = \sum_i\left[-\frac{\hbar^2}{2m}\nabla_i^2 + V_{\rm nuc}(\mathbf{r}_i) + V_{ee}(\mathbf{r}_i)\right], \tag{5.26}$$
is separable and has solutions that are (antisymmetric) products of one-electron functions. The exact solution for the many-electron problem in the absence of a field may be expanded in the complete set consisting of all possible determinants formed from the one-electron basis set for $\mathcal{H}_{\rm HF}$, eq. (5.26). Thus, a solution of the electronic problem consists of finding the stationary states of $\mathcal{H}_{\rm HF}$ and using them to diagonalize the actual electronic Hamiltonian, which can be written as
$$\mathcal{H}(\mathbf{A} = 0) = \mathcal{H}_{\rm HF} + \tfrac{1}{2}{\sum_{i,j}}'\frac{e^2}{r_{ij}} - \sum_i V_{ee}(\mathbf{r}_i). \tag{5.27}$$
The correlation effects introduced by the electron-electron interaction
appear as admixtures of one-electron configurations connected to the state of interest by the correction term, $[\tfrac{1}{2}{\sum_{i,j}}' e^2/r_{ij} - \sum_i V_{ee}(\mathbf{r}_i)]$, which is a function of pairs of coordinates. In the Hartree-Fock method the lowest order effect of this “perturbation” on the zero-order one-electron solution is to mix in configurations which differ from the original state by two one-electron functions. That is, to lowest contributing order there are two electron-hole pairs created by the “perturbation” term. For the present discussion of defect absorption, the configuration-interaction terms of primary importance are those in which one of the “excitations” is “on” the defect and the other “on” a neighboring host ion. Now, the effect of an electromagnetic field of frequency $\omega$ near a resonance of the defect is two-fold. In the zero-order one-electron part of the wave function real transitions of the defect are induced in the usual way. In the correlation correction terms which involve excited states on the host and the defect, the field (which is a one-electron operator) causes virtual transitions of the host from the excited state to the ground state, yielding a final state with just the defect excited. The latter process is comparable to the first if the host consists of a large number of highly polarizable ions. Higher order terms yield additional channels for defect excitation. In general the explicit calculation of the process just outlined is a formidable task since for most systems the interactions are so many and so large that perturbation theory or configuration interaction methods are inadequate. However, the calculation has been carried out in detail by DEXTER [1956] in an idealized model of an insulator with cubic crystal structure. To make the calculation tractable it was assumed that the overlap of wave functions centered on one atom with those on another atom is negligible. It was further assumed that normalized free atom functions are also appropriate one-electron functions in the solid. The dominant interaction between atoms was taken to be the dipole-dipole or van der Waals’ interaction, with exchange and higher multipole interactions of lesser importance. The calculation was carried to first order in the polarizability of the host atoms, and the integrated absorption coefficient of the defect was found to be given in terms of the oscillator strength in vacuo by an equation similar to Smakula’s. The quantity corresponding to the classical effective field ratio was given by
(5.28)
where $\alpha_H$ is the polarizability of the host atoms, of density $N_0$, at frequency
$\omega$. Here $J$ is an exchange term and $K$ includes higher order corrections such as dipole-quadrupole and higher multipole effects, squares of the parameter $4\pi N_0\alpha_H(\omega)$, and effects of overlap and nonorthogonality. For the assumed model $J$ and $K$ represent small corrections, so that to first order in the polarizability of the host the effective field is just the Lorentz field. This is in agreement with the first-order classical result in which short range forces (such as exchange) are neglected. (To reproduce Onsager’s results would have required carrying the calculation to high order, which was impractical in the formulation employed.) Conversely, if $J$ and $K$ are not negligible, and they rarely if ever are, there is no reason to expect the Lorentz or Onsager result to be valid.
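The size of the difference between the two classical limits referred to here can be checked with a few lines of arithmetic. The sketch below assumes the standard continuum expressions for a spherical cavity in a medium of dielectric constant eps, namely (eps + 2)/3 for the Lorentz field ratio and 3 eps/(2 eps + 1) for the Onsager cavity field ratio. These closed forms and the function names are assumptions of the illustration, not quotations from the text; under them, the squared ratios reproduce the overestimates of roughly 23% and 68% quoted earlier for dielectric constants of 2 and 3.

```python
def lorentz_ratio(eps):
    """Lorentz local-field ratio (eps + 2)/3 (assumed standard form)."""
    return (eps + 2.0) / 3.0


def onsager_cavity_ratio(eps):
    """Onsager cavity-field ratio 3*eps/(2*eps + 1) for a spherical cavity (assumed standard form)."""
    return 3.0 * eps / (2.0 * eps + 1.0)


def lorentz_overestimate(eps):
    """Fractional excess of the squared Lorentz ratio over the squared Onsager ratio."""
    return (lorentz_ratio(eps) / onsager_cavity_ratio(eps)) ** 2 - 1.0


for eps in (2.0, 3.0):
    excess = 100.0 * lorentz_overestimate(eps)
    print(f"eps = {eps:.0f}: Lorentz field overestimates the squared ratio by {excess:.1f}%")
# eps = 2 gives 23.5% and eps = 3 gives 68.0%
```

The comparison is purely classical; it says nothing about the exchange and overlap corrections $J$ and $K$, which the text argues are rarely negligible.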
5.3.2. The adiabatic polarization procedure
The approach of the preceding section was to solve the electronic problem in the one-electron approximation, correct for correlations, and finally include the time-dependent external field term. As seen above, only an approximate solution for a simplified model has so far been feasible. An alternative approach (SMITH, to be published) is to include the electromagnetic terms (which are one-electron operators) in the one-electron problem and then correct for correlation. This procedure yields a time-dependent Hartree-Fock Hamiltonian in which the external perturbation and the average electron-electron interaction as modulated by the external field appear as time-dependent one-electron potentials. The electronic states are, of course, time-dependent and are solved for in a self-consistent fashion, adiabatically, on the assumption that the frequency of the applied field is much less than that of the host crystal’s fundamental electronic absorption. In this approach the analog of $V_{ee}(\mathbf{r}_i)$ in the time-independent Hartree-Fock equation, eq. (5.26), is a time-dependent potential $V_{ee}(\mathbf{r}_i, t)$. Transitions of the defect are induced by the combined effect of $\mathbf{A}(\mathbf{r}_i, t)$ and $V_{ee}(\mathbf{r}_i, t)$. Correlation effects not included in the average potential $V_{ee}(\mathbf{r}_i, t)$ have negligible effect on these transition probabilities. An advantage of this approach is that it more nearly parallels the classical formulation and yields a clearer interpretation of the classical effective field. To proceed we retain the coupling to the electromagnetic field in the reduction to the one-electron problem. The resulting Hamiltonian is
$$\mathcal{H}_0(t) = \sum_i\left[-\frac{\hbar^2}{2m}\nabla_i^2 + V_{\rm nuc}(\mathbf{r}_i) + V_{ee}(\mathbf{r}_i,t) + \frac{ie\hbar}{mc}\mathbf{A}(\mathbf{r}_i,t)\cdot\nabla_i\right]. \tag{5.29}$$
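In analogy with the time-independent Hartree-Fock method the one-electron potentials satisfy*
$$V_{ee}(\mathbf{r}_1,t)\,|\varphi_i(\mathbf{r}_1,t)\rangle = \sum_{j\neq i}\left[\langle\varphi_j(\mathbf{r}_2,t)|\frac{e^2}{r_{12}}|\varphi_j(\mathbf{r}_2,t)\rangle\,|\varphi_i(\mathbf{r}_1,t)\rangle - \langle\varphi_j(\mathbf{r}_2,t)|\frac{e^2}{r_{12}}|\varphi_i(\mathbf{r}_2,t)\rangle\,|\varphi_j(\mathbf{r}_1,t)\rangle\right], \tag{5.30}$$
where $\varphi_i(\mathbf{r}, t)$ are time-dependent one-electron wave functions. This Hamiltonian differs from the total system Hamiltonian, eq. (5.25), only in that the actual electron-electron interaction has been replaced by the self-consistent Hartree-Fock potential $V_{ee}(\mathbf{r}_i, t)$. It is related to the time-independent Hartree-Fock Hamiltonian, eq. (5.26), by
$$\mathcal{H}_0(t) = \mathcal{H}_{\rm HF} + \sum_i\left\{[V_{ee}(\mathbf{r}_i,t) - V_{ee}(\mathbf{r}_i)] + \frac{ie\hbar}{mc}\mathbf{A}(\mathbf{r}_i,t)\cdot\nabla_i\right\}. \tag{5.31}$$
The quantity in square brackets, $[V_{ee}(\mathbf{r}_i, t) - V_{ee}(\mathbf{r}_i)]$, will be denoted by $v(\mathbf{r}_i, t)$. Since the Hamiltonian $\mathcal{H}_0(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)$ is separable, the one-electron functions, $\varphi_j(\mathbf{r}, t)$, obey individual time-dependent wave equations
$$i\hbar\frac{\partial}{\partial t}\varphi_j(\mathbf{r},t) = \mathcal{H}_0(t)\,\varphi_j(\mathbf{r},t) = \left\{-\frac{\hbar^2}{2m}\nabla^2 + V_{\rm nuc}(\mathbf{r}) + V_{ee}(\mathbf{r}) + v(\mathbf{r},t) + \frac{ie\hbar}{mc}\mathbf{A}(\mathbf{r},t)\cdot\nabla\right\}\varphi_j(\mathbf{r},t). \tag{5.32}$$
For normal laboratory light sources the vector potential and $v(\mathbf{r}, t)$ are small perturbations on the solutions, $u_n(\mathbf{r})$, for the field-free one-electron problem,
$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V_{\rm nuc}(\mathbf{r}) + V_{ee}(\mathbf{r})\right]u_n(\mathbf{r}) = E_n u_n(\mathbf{r}). \tag{5.33}$$
Thus, the time-dependent solutions, $\varphi_j(\mathbf{r}, t)$, may be obtained from first-order time-dependent perturbation theory as
* This choice neglects the effect of retardation due to the finite propagation time of the Coulomb field, but this effect leads to negligible corrections at optical frequencies. The point is that a significant change in the position of an electron due to the external field occurs in times of the order $\omega^{-1}$, so that significant field differences due to the finite propagation appear at distances greater than $c/\omega = \lambda/2\pi$. At optical frequencies this is of the order of 100 atom spacings, a large distance for correlation effects in insulators.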
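$$\varphi_j^{(1)}(\mathbf{r},t) \approx u_j(\mathbf{r})\exp\left(\frac{-iE_jt}{\hbar}\right) + \sum_k a_{jk}^{(1)}(t)\,u_k(\mathbf{r})\exp\left(\frac{-iE_kt}{\hbar}\right), \tag{5.34}$$
where the coefficients $a_{jk}^{(1)}(t)$ are given by
$$a_{jk}^{(1)}(t) = \frac{1}{i\hbar}\int_{-\infty}^{t}\langle u_k|\,v(\mathbf{r},t') + \frac{ie\hbar}{mc}\mathbf{A}\cdot\nabla\,|u_j\rangle\exp(i\omega_{kj}t')\,dt'. \tag{5.35}$$
Using these first-order wave functions we may calculate the potential $v(\mathbf{r}, t)$ from eq. (5.30). To first order this yields
$$v(\mathbf{r}_1,t)\,|\varphi_i(\mathbf{r}_1,t)\rangle = \sum_j\sum_k\Big\{a_{jk}(t)\exp(-i\omega_{kj}t)\Big[\langle u_j(\mathbf{r}_2)|\frac{e^2}{r_{12}}|u_k(\mathbf{r}_2)\rangle\,|\varphi_i(\mathbf{r}_1,t)\rangle - \langle u_j(\mathbf{r}_2)|\frac{e^2}{r_{12}}|\varphi_i(\mathbf{r}_2,t)\rangle\,|u_k(\mathbf{r}_1)\rangle\Big] + a_{jk}^{*}(t)\exp(-i\omega_{jk}t)\Big[\langle u_k(\mathbf{r}_2)|\frac{e^2}{r_{12}}|u_j(\mathbf{r}_2)\rangle\,|\varphi_i(\mathbf{r}_1,t)\rangle - \langle u_k(\mathbf{r}_2)|\frac{e^2}{r_{12}}|\varphi_i(\mathbf{r}_2,t)\rangle\,|u_j(\mathbf{r}_1)\rangle\Big]\Big\}, \tag{5.36}$$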
where the sum over $k$ runs over excited states and that over $j$ runs over all occupied states except the one on which $v(\mathbf{r}, t)$ operates. Physically eq. (5.36) says that the potential $v(\mathbf{r}, t)$ is, in the present approximation, just the time-dependent part of the Coulomb and exchange potentials of the polarized charge distribution (excluding, of course, any self-interactions). As usual, the exchange terms in eq. (5.36) are quite complicated if there is significant overlap between $\varphi_i(\mathbf{r}, t)$ and $u_j(\mathbf{r})$. A qualitative idea of their effect can be had by using the local approximations of SLATER [1951] and of KOHN and SHAM [1965] in which the exchange potential is a functional of the electron density. In this case the exchange terms in $v(\mathbf{r}, t)$ are found to be proportional to ${\rm Re}\,a_{jk}u_j^{*}(\mathbf{r})u_k(\mathbf{r})$. Since $u_j(\mathbf{r})$ and $u_k(\mathbf{r})$ are orthogonal, the spatial average of such terms is zero. Thus, in the limit of a defect with a diffuse charge distribution which overlaps many neighbors the effect of exchange in $v(\mathbf{r}, t)$ should be negligible. This is, of course, also the case for a defect with a compact charge distribution having no appreciable overlap with neighboring ions. In intermediate cases a detailed knowledge of the wave functions is required; however, in these cases there should also be considerable cancellation in the potential matrix elements because of the orthogonality of $u_j(\mathbf{r})$ and $u_k(\mathbf{r})$. The evaluation of the $a(t)$’s requires a self-consistent solution of eqs. (5.34) and (5.36), and the explicit form depends on the details of the electronic structure through $v(\mathbf{r}, t)$. As a qualitative example of the form of $a_{jk}(t)$ and $v(\mathbf{r}, t)$ for an individual polarized atom, the Coulomb part of the potential and field of an isolated Li+ ion has been calculated for an applied plane wave field $\boldsymbol{\mathcal{E}}_{\rm ext}(\mathbf{r}, t) = -(1/c)\,\partial\mathbf{A}/\partial t = \boldsymbol{\mathcal{E}}_{\rm ext}\sin(\mathbf{q}\cdot\mathbf{r} - \omega t)$. While this is a far cry from a self-consistent solution for a system of interacting atoms in a solid, it serves to give a qualitative idea of the potentials involved. In this calculation it was assumed that the wavelength, $\lambda = 2\pi/|\mathbf{q}|$, is large compared with the dimensions of the ion and that the frequency, $\omega$, is much less than any natural electronic frequency of the ion. Eq. (5.35) may then be evaluated by the usual procedure on the assumption that the applied field is turned on slowly, and the matrix elements involving $\nabla$ are eliminated by the use of the dipole-gradient identity, $\langle k|\nabla|j\rangle = -(m\omega_{kj}/\hbar)\langle k|\mathbf{r}|j\rangle$. The result for the off-resonance electronic states is
$\times \exp(i\omega_{kj}t)$.
(5.37)
This expression has a resonant form and, aside from the factor $\exp(i\omega_{kj}t)$, it has two parts: a real component in phase with the applied electric field and an imaginary part 90° out of phase with the field. In the present example only the real part contributes to the Coulomb potential, so that the polarization may be thought of as being induced directly by the external electric field potential, $-e\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_{\rm ext}\sin(\mathbf{q}\cdot\mathbf{r} - \omega t)$, experienced by the electron distribution. The potential $v(\mathbf{r}, t)$ and the associated electric field, $\boldsymbol{\mathcal{E}}(\mathbf{r}, t) = -(1/e)\nabla v(\mathbf{r}, t)$, were calculated from the variational equivalent of eq. (5.36) for a lithium ion in the $1s^2$ ground state*. The results are given in Fig. 2 in units of $v(\mathbf{r}, t)/|\mathbf{d}(t)|$ and $\mathcal{E}(\mathbf{r}, t)/|\mathbf{d}(t)|$, where $\mathbf{d}(t)$ is the time-dependent induced dipole moment of the ion
$$\mathbf{d}(\mathbf{r},t) = \sum_k f_{jk}\,\frac{e^2/m}{\omega_{kj}^2 - \omega^2}\,\boldsymbol{\mathcal{E}}_{\rm ext}\sin(\mathbf{q}\cdot\mathbf{r} - \omega t), \tag{5.38}$$
with
(5.39)
the oscillator strength for the (virtual) transition of the ion $u_j \to u_k$.
* The calculation was made using the variational method of HASSÉ [1930, 1931] and wave functions of MORSE et al. [1935, 1956].
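The dipole-gradient identity used above to eliminate the matrix elements of the gradient can be checked numerically on a case where everything is known in closed form. The sketch below does this for the 1s to 2p_z transition of atomic hydrogen in atomic units (hbar = m = e = 1). The choice of hydrogen, the Simpson integrator and all names in the code are assumptions made for illustration only; this is not the variational Li+ calculation described in the text.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


# Hydrogen radial functions in atomic units (hbar = m = e = 1).
def R1s(r):
    return 2.0 * math.exp(-r)


def dR1s_dr(r):
    return -2.0 * math.exp(-r)


def R2p(r):
    return r * math.exp(-0.5 * r) / (2.0 * math.sqrt(6.0))


ANGULAR = 1.0 / math.sqrt(3.0)  # angular integral <Y_10| cos(theta) |Y_00>
R_MAX = 60.0                    # radial cutoff; the integrands are negligible beyond it

# <2p_z| z |1s> = ANGULAR * integral of R2p(r) * r * R1s(r) * r^2 dr
z_kj = ANGULAR * simpson(lambda r: R2p(r) * r * R1s(r) * r * r, 0.0, R_MAX)

# <2p_z| d/dz |1s> = ANGULAR * integral of R2p(r) * R1s'(r) * r^2 dr,
# because d/dz acting on a spherical function f(r) gives cos(theta) * f'(r).
grad_kj = ANGULAR * simpson(lambda r: R2p(r) * dR1s_dr(r) * r * r, 0.0, R_MAX)

omega_kj = (-1.0 / 8.0) - (-1.0 / 2.0)  # E(2p) - E(1s) in hartree

print("<k|d/dz|j>        =", grad_kj)
print("-omega_kj <k|z|j> =", -omega_kj * z_kj)
```

The two printed numbers should agree to within the integration error, at about -0.279 in atomic units, which is the statement of the identity for this pair of states.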
Fig. 2. The potential and electric field arising from polarization of an isolated Li+ ion as a function of distance along the dipolar axis. Note that the potential is an odd function and has a spatial average of zero. The spatial average of the electric field is also zero. The Tosi-Fumi average crystal radius for the Li+ ion is 1.7 $a_0$ (FUMI and TOSI [1964], TOSI and FUMI [1964]).
As in Fig. 1 the particular direction chosen for displaying the results is a line through the nucleus parallel to the external field. The potential is an odd function and at distances large compared with the extent of the charge distribution approaches the simple dipole value, $|\mathbf{d}(t)|/r^2$. For comparison of distances and overlaps in an ionic solid, the radial charge densities of a F- ion’s 2p electron (GILBERT [1967]) and of a LiF F center* are given in Fig. 3 along with the fields of a pair of Li+ ions at distances of $\pm 3.80\,a_0$ (the nearest-neighbor distance in LiF). These figures will be discussed in detail in what follows. It should be stressed that in Figs. 2 and 3 isolated ions are considered and only the portion of potential and field arising directly from polarization due to the $\mathbf{A}\cdot\nabla$ term has been displayed. The portion arising
* The F-center is an electron bound in a negative-ion vacancy by the Madelung potential. (See SCHULMAN and COMPTON [1962].) The charge density shown is for the simple hydrogenic “Type I” wave function given by GOURARY and ADRIAN [1957]. More accurate wave functions using variational functions of greater flexibility or numerical integration give qualitatively similar results. Note, however, that orthogonalization to the occupied core states has been neglected. This is of prime importance for hyperfine and spin-orbit effects (GOURARY and ADRIAN [1960], SMITH [1965a]). In the present case orthogonalization corrections increase the contribution to matrix elements from regions near the nuclei of surrounding ions, but do not alter the general conclusions.
Fig. 3. A comparison of charge distributions for a fluoride ion and an F center with the electric fields of polarized neighbor lithium ions in a LiF crystal. The solid curves centered on Li+ sites show the electric field due to the polarization of the Li+ ions by an external field directed along the [100] direction. Two charge distributions are shown at the negative-ion site. The dashed curves give the radial charge density of a F- 2p valence electron (directed along [100]) and the dot-dash curve gives the F-center electron density. The major portion of the F- 2p charge lies within the volume of the negative-ion site where the electric fields are approximately dipolar. The F-center distribution is more diffuse and also experiences the reversed intraionic field. Thus, the average field experienced by the F-center electron is considerably less than that felt by the ion’s valence electrons.
from the oscillatory part of the self-consistent potential, $v(\mathbf{r}, t)$, must also be included when other atoms are present.
5.4. QUANTUM MECHANICAL DEFINITION OF $\mathcal{E}_{\rm eff}$
In the present discussion attention has been focused on the potential, which is more fundamental in the quantum mechanical approach than are the fields. In this framework the perturbation inducing transitions of the optically active electron is, from eq. (5.32),
$$w(\mathbf{r},t) = v(\mathbf{r},t) + \frac{ie\hbar}{mc}\mathbf{A}\cdot\nabla, \tag{5.40}$$
whereas in the classical picture we seek an effective field such that
(5.41)
where $\boldsymbol{\mathcal{E}}_{\rm eff}(t)$ is the spatial average of the local effective field $\boldsymbol{\mathcal{E}}_{\rm eff}(\mathbf{r}, t)$*. In general it is not possible to transform $w(\mathbf{r}, t)$ into the simpler form of eq. (5.41). The operator $\mathbf{A}\cdot\nabla$ of eq. (5.40) cannot be duplicated with a scalar potential since the electric field of the radiation is not conservative, so that both electric and magnetic effects are involved. Moreover, the potential $v(\mathbf{r}, t)$ usually will be non-local and state-dependent. The first of these objections is no problem for first-order calculations of transition probabilities in the dipole approximation since magnetic effects are negligible compared with electric ones under these conditions. That is, for radiation of wavelengths large compared with the dimensions of the defect it is easy to show that the correct transition probabilities are given by replacing $(ie\hbar/mc)\mathbf{A}\cdot\nabla$ by the electric coupling $-(1/c)\,\partial\mathbf{A}(\mathbf{r}, t)/\partial t\cdot\mathbf{r}$. The second objection is more difficult to overcome; however, in the approximation in which the exchange interaction is a functional of the electron density (SLATER [1951], KOHN and SHAM [1965]) an effective field may be defined. Then the potential $v(\mathbf{r}, t)$ is local although still generally state-dependent, and the contribution to $\boldsymbol{\mathcal{E}}_{\rm eff}(\mathbf{r}, t)$ from $v(\mathbf{r}, t)$ is $-\nabla v(\mathbf{r}, t)/e$. Thus, for these special conditions the effective field is given by
$$\boldsymbol{\mathcal{E}}_{\rm eff}(\mathbf{r},t) \approx -\frac{1}{e}\nabla v(\mathbf{r},t) - \frac{1}{c}\frac{\partial\mathbf{A}(\mathbf{r},t)}{\partial t}. \tag{5.42}$$
* If the electron-photon interaction could be expressed exactly as a scalar potential, the path of integration in eq. (5.41) would be arbitrary. However, a portion of the effective field arises from the non-conservative electric field of the radiation. For systems of interest here the path of integration is a straight line from the center of the defect’s electronic distribution to the point r.
We emphasize here that although it may not be possible to define $\boldsymbol{\mathcal{E}}_{\rm eff}(\mathbf{r}, t)$, what is actually needed in the calculation of the absorption cross section are the matrix elements of $w(\mathbf{r}, t)$. An average effective field may thus be defined such that
(5.43)
It is in this sense that $\mathcal{E}_{\rm eff}$ in Smakula's equation, eq. (4.2), is to be interpreted. In practice it should be possible to carry out a self-consistent calculation for $v(\mathbf{r}, t)$ and hence $\boldsymbol{\mathcal{E}}_{\rm eff}(t)$ in some of the simpler systems. The most likely candidates are systems having relatively small overlaps such as the rare-gas solids. Since overlaps between distant neighbors are small, only a few discrete atoms about the impurity would have to be treated in the tight-binding approximation, and the remainder of the solid could be replaced by a dielectric continuum with the appropriate dielectric function. Without performing detailed numerical calculations, some limiting values for effective fields (or potentials) as well as a qualitative picture of intermediate cases can be had by considering the free-ion results, Figs. 2 and 3. From the asymptotic dipole form of the field of an isolated atom or ion it is clear that the present prescription for $v(\mathbf{r}, t)$, when taken to self-consistency, will yield [via eq. (5.42)] the Lorentz field for a uniformly polarized cubic array of identical, nonoverlapping atoms with dimensions small compared with the interatomic spacings. Likewise, the Onsager cavity field (for a discrete lattice) will be obtained for a similar (but not uniformly polarized) array containing a cavity. In the opposite extreme of an impurity having a very diffuse charge distribution, the effective field must be independent of the detailed structure and its value may be inferred to be the average field in the medium. This can be seen from Fig. 2 and the preceding discussion of exchange effects. In discussing eq. (5.36) it was argued that because of the orthogonality of the wave functions involved, the spatial average of the exchange terms in $v(\mathbf{r}, t)$ is zero. Furthermore, it can be seen from Fig. 2 that the Coulomb part of $v(\mathbf{r}, t)$ for a uniformly polarized atom is an odd function of $\mathbf{r}$ about the nucleus. Thus, the spatial average of $v(\mathbf{r}, t)$ is zero, and if the wave functions of the defect are diffuse and do not vary significantly over the extent of $v(\mathbf{r}, t)$ for an ion, the matrix elements of $v(\mathbf{r}, t)$ go to zero. This leaves the term $\partial\mathbf{A}/\partial t$ in eq. (5.42), which gives matrix elements of $\mathbf{r}\cdot\boldsymbol{\mathcal{E}}_0(\mathbf{r}, t)$, so that $\mathcal{E}_{\rm eff}$ is just the average field in the medium. Two examples of intermediate cases are shown in Fig. 3. They are the F- ion 2p valence state and the LiF F-center 1s state centered on an F- site between two polarized Li+ ions in a LiF crystal. For the F- ion the charge distribution is compact and primarily samples the long-range dipole part of
the polarized Li+ ion field. More specifically, the actual dipole potential of the Li+ ion begins to deviate from that of a point dipole at approximately 1.8 $a_0$, so that only that portion of the charge on a neighboring ion outside 2.0 $a_0$ feels a non-dipole field. In the case of F- only 14.5% of the valence electron’s charge lies outside 2.0 $a_0$, so that virtually all of the F- ion lies in a purely dipole field. Thus, it is not surprising that the polarizability (ground state) data for the alkali halides are explainable in terms of a Lorentz field, as Shockley et al. (SHOCKLEY [1946], TESSMAN et al. [1953]) found. The F-center, on the other hand, is very diffuse: the major part (73%) of the charge density in the ground state lies outside 2.0 $a_0$, and almost all lies outside 2.0 $a_0$ for excited states. Thus, for the F-center the major portion of $v(\mathbf{r}, t)$ will average away when matrix elements are calculated, so that the effective field is more nearly the average field in the medium.
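The contrast between a compact and a diffuse charge distribution can be made concrete with a toy calculation. The sketch below takes a hypothetical hydrogenic 1s density proportional to exp(-2r/a), for which the fraction of charge outside a radius R is exp(-x)(1 + x + x²/2) with x = 2R/a, and evaluates it at R = 2.0 $a_0$ for several orbital radii a. The density, the radii and the function name are assumptions of the illustration; the sketch does not use the F- 2p or F-center wave functions cited in the text and is not meant to reproduce the 14.5% and 73% figures.

```python
import math


def fraction_outside(radius, a):
    """Fraction of a normalized density ~ exp(-2r/a) lying outside `radius`.

    Both arguments are in the same length unit (here, Bohr radii).
    """
    x = 2.0 * radius / a
    return math.exp(-x) * (1.0 + x + 0.5 * x * x)


R_CUT = 2.0  # radius, in Bohr radii, beyond which the neighbor field is non-dipolar

for a in (0.5, 1.0, 2.0, 4.0):  # hypothetical orbital radii in Bohr radii
    print(f"a = {a:3.1f} a0: fraction outside {R_CUT} a0 = {fraction_outside(R_CUT, a):.3f}")
```

As the assumed orbital radius grows, the fraction of charge sampling the non-dipolar region rises steadily toward unity, which is the qualitative trend the comparison of the F- ion with the F-center relies on.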
§ 6. Absorption Strengths of Color Centers in Alkali Halides
In the present section a critical survey of some of the available experimental data on color centers in the alkali halides will be made in light of the ideas in the preceding sections. First we consider the convergence of f-sums and the extent to which the f-sum rule can be applied to actual absorption measurements, which must be made over a limited energy range. The “direct” measurement of oscillator strengths via Smakula’s equation is then discussed and the results for the F-center are critically examined. The measurement of relative oscillator strengths is considered for the cases of U → F conversion and the $\alpha$ and $\beta$ bands. Finally, we consider a possible experimental test of the dependence of $(\mathcal{E}_{\rm eff}/\mathcal{E}_0)$ on the index of refraction. In the case of the F-center it is concluded that total f-sums for the F, K and L absorptions are significantly greater than unity. Further, we find that the net effective field experienced by the electron bound to an F-center in the alkali halides is approximately the average field in the medium, not the commonly assumed Lorentz field.
6.1. THE CONVERGENCE OF f-SUMS
The optical absorption of defects in strongly ionic crystals must of necessity be observed in the transparent region between the host crystal’s fundamental infrared and ultraviolet absorptions. Since this optical “window” is of finite width, not all transitions associated with the defect are observable. Of particular interest here is the obscuring of defect absorption in the U.V. by the host crystal absorption (even in a favorable case the host absorption is $10^5$ times stronger than that of the defect).
[Figure 4. Panel title: “Oscillator strength sum vs. energy of highest transition”. Curves: f-sum for Lyman series; f-sum for RbCl model F center; df/dE of Lyman series continuum. Horizontal axis: energy (Rydbergs).]
Fig. 4. Partial oscillator strength sums for the Lyman series of free atomic hydrogen and for a model F-center in RbCl. A schematic representation of the Lyman spectrum is given in the lower figure and the corresponding partial oscillator strength sum is plotted above as a function of the highest transition energy considered. For comparison, partial f-sums for the discrete spectrum of an F-center in RbCl as calculated from a semicontinuum model (with no f-sum corrections for core function overlap) are shown. In plotting the F-center values the energy of the ground state relative to the bottom of the conduction band was taken as the Rydberg.
A general estimate of how much defect spectrum is “lost” is not possible because of the diversity of defect electronic structure. However, it is instructive to consider the spectrum of atomic hydrogen, for which the oscillator strength distribution is known. As we shall argue below, this, together with model calculations for the F-center, strongly suggests that only a negligible portion of the absorption of a number of important electron excess centers lies under the fundamental. A schematic representation of the discrete and continuous absorption of atomic hydrogen is given in Fig. 4 together with the sum of oscillator strengths for all transitions up to the energy under consideration (SUGIURA [1927], BETHE and SALPETER [1957]). The discrete absorption accounts for slightly over half of the total, with a partial f-sum of 0.565. The f-sum converges to unity slowly in the continuum, but at twice the binding energy
(2 Ry) the f-sum is 0.881 and at 3 Ry it is 0.947, with only ~5 % of the oscillator strength at higher energies. For the F-center, discrete transitions to states lying below the first crystal conduction band are responsible (CHIAROTTI and GRASSANO [1966a, b]) for the main F-band and the low energy part of the K-band, the K_1-band (NAKAZAWA and KANZAKI [1967], and MAIER and GEBHARDT [1968]). These are analogous to the discrete transitions in hydrogen. The high energy part of the K-band, the K_2-band (NAKAZAWA and KANZAKI [1967]), and the L-bands (LUTY [1960]) give rise to photoconductivity (CRANDALL and MIKKOR [1965]; NAKAZAWA and KANZAKI [1965]; and SPINOLO and SMITH [1965]) and presumably are due to F-center transitions to conduction states or to defect states degenerate with conduction bands (PAGE, STROZIER and HYGH [1968], FOWLER [1968], and BASSANI, IADONISI and PREZIOSI [1969]). These are analogous to the atomic continuum transitions. In a typical alkali halide such as RbCl the K_1- and K_2-bands merge at 2.5 eV (MAIER and GEBHARDT [1968]) while the fundamental absorption begins at 7.5 eV (EBY, TEEGARDEN and DUTTON [1959]). Thus, a spectrum over a range of some three times the ionization potential can be observed for F-centers. SMITH and SPINOLO [1965b] have estimated the distribution of F-center oscillator strength between discrete and continuum states on the basis of the semicontinuum model, assuming an approximate state-independent potential that reproduces the F-K spectrum of RbCl. The results for the discrete transitions are indicated in Fig. 4 by circles*; they show that for this model the f-sum for all discrete transitions is 0.73, almost half again that for atomic hydrogen. Subsequent calculations by IADONISI and PREZIOSI [1967] give a better fit to the observed spectrum and yield a higher partial f-sum for discrete transitions.
The distribution of oscillator strength in the continuum is not known, but it seems reasonable that it is qualitatively similar to the atomic case apart from structure associated with the conduction bands**. Under this assumption, extrapolation of the F-center results in Fig. 4 into the continuum leads to the conclusion that any unobserved portion of the F-center absorption is much less than 5 % of the total.
* The Smith-Spinolo calculation makes no provision for overlap with core states so that the f-sum approaches unity asymptotically.
** Since the final states are related to the various crystal conduction bands, structure is expected in the absorption, whereas in the hydrogenic case the continuum is a smooth curve (Fig. 4). In the limit of the effective-mass approximation this would appear as a series of hydrogenic spectra, each associated with a particular conduction band. In the case of the F-center a model calculation for the continuum spectrum (PAGE et al. [1968]) shows that variations in the transition probability and in the density of states combine to produce structure in the spectrum.
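The hydrogen numbers quoted above (0.565 for the discrete lines, 0.881 at 2 Ry, 0.947 at 3 Ry) can be checked with a short numerical sketch. It uses the standard closed-form expressions for the 1s → np line strengths and for the 1s continuum density df/dE. The function names, the cutoff `n_max` and the Simpson step count are illustrative choices, not part of the original text.

```python
import math

def lyman_f(n):
    """Oscillator strength of the hydrogen 1s -> np line (n >= 2).

    Closed form 2^8 n^5 (n-1)^(2n-4) / [3 (n+1)^(2n+4)], evaluated through
    logarithms so that large n does not overflow.
    """
    log_f = (8 * math.log(2) + 5 * math.log(n)
             + (2 * n - 4) * math.log(n - 1)
             - math.log(3) - (2 * n + 4) * math.log(n + 1))
    return math.exp(log_f)

def discrete_f_sum(n_max=5000):
    """Sum of all Lyman line strengths up to n_max (tail beyond is ~1e-8)."""
    return sum(lyman_f(n) for n in range(2, n_max + 1))

def continuum_df_dE(E):
    """df/dE per Rydberg for 1s -> continuum at photon energy E >= 1 Ry."""
    k = math.sqrt(max(E - 1.0, 0.0))      # electron wave number in 1/a_0
    if k < 1e-8:                           # threshold limit k -> 0
        return (2 ** 7 / 3) * math.exp(-4.0)
    num = math.exp(-(4.0 / k) * math.atan(k))
    den = (1.0 + k * k) ** 4 * (1.0 - math.exp(-2.0 * math.pi / k))
    return (2 ** 7 / 3) * num / den

def partial_f_sum(E_max, steps=2000):
    """f-sum of all transitions up to E_max (in Ry, E_max >= 1).

    Discrete lines plus a composite Simpson integral over the continuum.
    """
    h = (E_max - 1.0) / steps              # steps must be even
    s = continuum_df_dE(1.0) + continuum_df_dE(E_max)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * continuum_df_dE(1.0 + i * h)
    return discrete_f_sum() + s * h / 3.0

if __name__ == "__main__":
    print("discrete:", round(discrete_f_sum(), 3))
    print("up to 2 Ry:", round(partial_f_sum(2.0), 3))
    print("up to 3 Ry:", round(partial_f_sum(3.0), 3))
```

The same routine evaluated at larger `E_max` shows how slowly the remaining few per cent is collected, which is the point of the comparison with the F-center window.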
This argument also holds for perturbed F-centers such as the F_A† and Z_1 (or F_Z1 in PICK's [1972] notation)†† centers. It also seems likely that it would apply to many centers in which the electrons are sufficiently weakly bound that the major absorption bands occur well below the crystal’s fundamental electronic absorption. Probable candidates are the F-aggregate (COMPTON and RABIN [1964]) and F'- (PICK [1938, 1940]) centers (as well as shallow donor impurities in large dielectric constant semiconductors (KOHN [1957])). These centers will be considered further in § 6.4, but we note here the experimental observation that the total integrated absorption remains the same on bleaching F-centers to form aggregates (PETROFF [1950]) or F_Z1 (PICK [1939a, 1939b]), on F → F' conversion (PICK [1938, 1940]), and in colloid formation (DOYLE [1958a]). That is, to within experimental error, the observable integrated cross section per electron is the same for each of these centers. This fact, together with considerations of the effective field, will lead to the conclusion that virtually all the absorption for this group of centers falls in the region of crystal transparency. For centers having transitions primarily in the u.v., considerable absorption will generally lie at energies beyond the fundamental edge. This is certainly true for centers such as OH-, SH-, etc., where the observable u.v. bands are found to have f of the order of 0.1 to 0.2 (for OH- see FRITZ, LUTY and ANGER [1963], PAUS and LUTY [1965], KLEIN, KENNEDY, GIE and WEDDING [1968], KURZ [1969a], see also KUHN and LUTY [1964] and KOSTLIN [1967]; for SH- the measurements of FISCHER and GRUNDIG [1965] yield the ratio f_SH-/f_F = 0.19 on the traditional assumption of a Lorentz field for both centers). It is probably also true for the U-center (PICK [1965], FOWLER [1968]), a case discussed below, and for the U_2-center, for which values of f_U2/f_F of approximately [...] are reported (KURZ [1969a, b]).
6.2. DIRECT MEASUREMENT OF OSCILLATOR STRENGTHS
Direct determination of the oscillator strength, f, of a transition relies on Smakula’s equation in the form

$$ f = \frac{m c}{2\pi^{2} e^{2}\hbar}\,\frac{n_0}{\rho}\left(\frac{\mathcal{E}_0}{\mathcal{E}_{\rm eff}}\right)^{2}\int \mu(E)\,dE \tag{6.1} $$

where n_0 is the average index of refraction at the transition, μ the absorption coefficient, and ρ the number density of centers.

† The F_A-center is an F-center with a nearest-neighbor monovalent cation impurity such as Na+ in KCl (LUTY [1968]).
†† The F_Z1-center is an F-center perturbed by a divalent cation impurity and its accompanying cation vacancy arranged in an as yet uncertain geometry (BUSHNELL [1964], PAUS and LUTY [1968], GEHRER and LANGER [1968], PAUS [1969]).

Such determinations are
subject to the theoretical uncertainty associated with the choice of the effective field as well as to uncertainties in measurement. Experimentally it is necessary to determine the integrated absorption and the number density of centers. Generally the absorption is measured with relative ease. However, if a single band is involved, a common procedure is to measure the height and the width at half-maximum and to compute the area assuming a band shape, usually a Lorentzian or a Gaussian. As several authors have pointed out, the Lorentzian line shape assumed in Smakula’s original work is a poor fit to most color center bands (DEXTER [1958], DOYLE [1958b], KONITZER and MARKHAM [1960], MARKHAM and KONITZER [1961], KLICK, PATTERSON and KNOX [1964]). Experimentally the bands are found to be more nearly Gaussians or superpositions of Gaussians. Also, the theory for strong electron-phonon coupling (DEXTER [1954]), common in ionic crystals, indicates that a Gaussian should be a much better approximation to the band shape than a Lorentzian. The conversion factors for various band shapes and the observed F-band shape in several salts have been given by DOYLE [1958b] and are reproduced in Table 2 for reference. Note that assumption of a Lorentzian line shape overestimates the area of the band (and consequently f) by approximately 50 %.

TABLE 2
Values of the ratio, S, of the integrated absorption coefficient to the product of peak height, μ_max, and full width at half maximum, W_1/2.

Band                                      S
Lorentzian                                π/2 = 1.57
Gaussian                                  (1/2)(π/ln 2)^1/2 = 1.07
Actual KCl F-band (including K)^a         1.19 ± 0.05
Actual KCl F-band (excluding K)^b         1.06
Actual KBr F-band (including K)^a         1.31 ± 0.05
Actual NaCl F-band (including K)^a        1.19 ± 0.05

a After DOYLE [1958b].
b Calculated from Doyle’s value for the area including the K-band and the K-to-F area ratio reported by LUTY [1960].
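As a rough numerical companion to Smakula's equation and the shape factors of Table 2, the sketch below evaluates the prefactor m c / (2π² e² ħ) in practical units (absorption coefficient in cm⁻¹, photon energy in eV) and combines it with a shape factor S and a Lorentz local field. The function names and the SI route to the constant are assumptions made for illustration. The physical constants are CODATA values.

```python
import math

# CODATA 2018 constants in SI units
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
M_E = 9.1093837015e-31     # electron mass, kg
C_LIGHT = 2.99792458e8     # speed of light, m/s
E_CHARGE = 1.602176634e-19 # elementary charge, C
HBAR = 1.054571817e-34     # reduced Planck constant, J s

def smakula_constant():
    """m c / (2 pi^2 e^2 hbar) in cm^-2 eV^-1.

    In SI the Gaussian e^2 becomes e^2 / (4 pi eps0), which gives
    2 eps0 m c / (pi e^2 hbar) in m^-2 J^-1 before unit conversion.
    """
    si_value = 2.0 * EPS0 * M_E * C_LIGHT / (math.pi * E_CHARGE ** 2 * HBAR)
    return si_value * 1e-4 * E_CHARGE      # m^-2 -> cm^-2, J^-1 -> eV^-1

# Analytic shape factors: area / (peak height * full width at half maximum)
S_LORENTZIAN = math.pi / 2.0
S_GAUSSIAN = 0.5 * math.sqrt(math.pi / math.log(2.0))

def lorentz_field_ratio_sq(n):
    """(E_eff / E_0)^2 for the Lorentz local field, [(n^2 + 2) / 3]^2."""
    return ((n * n + 2.0) / 3.0) ** 2

def density_times_f(n, mu_max, width_eV, shape=S_GAUSSIAN, field_ratio_sq=None):
    """rho * f in cm^-3 from peak height (cm^-1) and width (eV).

    field_ratio_sq defaults to the Lorentz value; pass 1.0 for the
    average field.
    """
    if field_ratio_sq is None:
        field_ratio_sq = lorentz_field_ratio_sq(n)
    area = shape * mu_max * width_eV
    return smakula_constant() * n * area / field_ratio_sq

if __name__ == "__main__":
    c = smakula_constant()
    print("prefactor (cm^-2 eV^-1): %.3e" % c)
    print("Gaussian, Lorentz field coefficient: %.2e" % (9 * c * S_GAUSSIAN))
    print("Lorentzian / Gaussian area ratio: %.3f" % (S_LORENTZIAN / S_GAUSSIAN))
```

With a Gaussian shape and the Lorentz field the coefficient multiplying n μ_max W / (n² + 2)² comes out near 0.87 × 10¹⁷ cm⁻³, the familiar working form of the equation. The Lorentzian-to-Gaussian area ratio of about 1.48 is the "approximately 50 %" overestimate mentioned in the text.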
The major experimental uncertainty lies in the determination of center density. Basically six approaches have been used*. They include:

a) Chemical measurements of impurity ion or coloring agent concentration (see for example KLEINSCHROD [1936]). This method has been used in several variations in measurements of F-center oscillator strengths (KLEINSCHROD [1936], SCOTT [1951, 1955, 1958], DOYLE [1958b], KLEEFSTRA [1963], PHELPS [1963]). In these experiments the stoichiometric excess of alkali metal in additively colored crystals was measured and the assumption made that there was a one-to-one relation between excess metal atoms and F-centers. Formation of separate phases, and precipitation of metal as colloids or at dislocations or other imperfections, would introduce errors and tend to underestimate oscillator strengths. In general chemical measurements are difficult because only small deviations from stoichiometry or relatively small impurity concentrations are involved. Overall accuracy of approximately 20 % would seem to be the present limit of this approach.

b) Spectroscopic methods such as flame spectrochemical analysis or atomic absorption spectroscopy (FUKUDA [1964a, 1964b]). These techniques have been used to determine the concentration of heavy metal impurities. In favorable cases such as Ag+ in KCl accuracies of 10 % are obtained (FUKUDA [1964c]).

c) Radioactive tracer techniques and neutron activation analysis. These methods have been applied principally to heavy metal impurity centers. Perhaps the most accurate color center density determinations to date have been made on Tl+ centers using radioactive ^204Tl as a tracer (WAGNER [1964], LEUTE and SCHULZ [1966]). The researchers report oscillator strengths with an experimental uncertainty of ± 2-4 %.

d) E.P.R. measurement of the number of unpaired spins. Provided the center under study is paramagnetic and its E.P.R. spectrum does not overlap that of other centers, the number of centers may be determined by comparison of the area of the E.P.R. absorption with that of a calibration salt having a known number of unpaired spins. This method has been applied to the F-center by SILSBEE [1956]. Although this method is very selective, in that it measures the density of a particular paramagnetic center, and is in principle quite sensitive, technical problems limit the accuracy. For example, Silsbee found that even with careful measurements reproducibility is good to only 10 %.

e) Static magnetic susceptibility measurements. The density of paramagnetic centers may also be found by measuring the static magnetic susceptibility, provided only one magnetic species is present and the electronic g-value is known. This approach has been used by Heer and co-workers for the F-center (RAUCH and HEER [1957], BATES and HEER [1958]). Possible sources of error are magnetic impurities or imperfections with magnetic properties similar to those of the center of interest.

f) Density measurements. Several determinations of F-center concentration have been based on the change in crystal density upon coloration (MOLLWO [1934a], ESTERMAN, LEIVO and STERN [1949], WITT [1952]; see also PAUS and THOMMEN [1963]). The assumption is that each F-center decreases the density of the crystal by the mass of a halide atom. In general this method appears to be primarily of historical interest since it is subject to many uncertainties associated with the processes involved in the particular method of coloration used.

* Early electrical attempts to measure F-center number densities (STASIW [1932]) have been shown to be difficult to analyze because of the number of conduction processes involved. For details see MOLLWO and ROOS [1934b].

In addition to these direct measurements Seitz observed that the results of F → F' conversion can be used to estimate the number density of color centers (SEITZ [1946] p. 390). PICK'S [1938, 1940] study showed that in the appropriate temperature range two F-centers in KCl are converted by each photon absorbed if KLEINSCHROD'S [1936] chemical calibration of Smakula's equation is correct. This led to the notion that the F'-center is an F-center which has trapped an electron. Now, if it is assumed that the maximum quantum efficiency for F → F' conversion is two in all crystals, Smakula's equation may be calibrated from F → F' quantum efficiency measurements. Trapping of electrons at defects other than F-centers and F-center-electron recombination are possible sources of error in this method.

6.3. AN EXAMPLE - THE F-CENTER
The most extensive oscillator strength measurements have been made for the F-center. Traditionally, the quantity reported has been a “conventional” oscillator strength based on Smakula’s equation assuming a Lorentzian line-shape and a Lorentz local field. These numbers are essentially calibration constants of proportionality between the density of centers and the product of F-band width at half-maximum and peak height. As such they are useful for determining the concentration of centers in a crystal from its absorption spectrum. However, these values should not be confused with the actual oscillator strength for the isolated defect. The results of a number of experiments have been collected in Table 3. All values are based on the Lorentz local field and in most cases the band area was computed for a Lorentzian or Gaussian shape. The resulting values are given in columns marked f(ℒ) and f(𝒢) respectively. In a number of cases the actual band area was measured; these results are labeled f(𝒜). The Gaussian values are in the best agreement with those for the actual area, but differences of up to 10 % exist. Consider the results for KCl, the most widely studied substance. A straight average of the f values gives a “conventional” oscillator strength of f_F(ℒ) ≈ 0.88. Excluding the highest and lowest values yields f_F(ℒ) ≈ 0.85. For
TABLE 3
Some reported values of the F-center oscillator strength as derived from Smakula’s equation for the Lorentz local field. f(ℒ), f(𝒢) and f(𝒜) are the values for a Lorentz line shape, a Gaussian line shape, and the actual area under the absorption band, respectively.
Ref.
KLEINSCHROD^a [1936]
PICK^c [1938, 1940]
SILSBEE [1956]
RAUCH and HEER [1957]
BATES and HEER [1958]
SCOTT and HILLS [1958]
DOYLE [1958b]
KLEEFSTRA [1963]
PHELPS [1963]
0.81 (0.55)b 0.7 0.87 (0.59)b 0.70 0.47
0.81
0.85 (0.58)b 0.66 0.45 0.
0.82 0.56 0.54 1.17 0.75 0 0.86
0.91 0.93 0.8
a A value of f(ℒ) ≈ 0.9 for the KCl F-band has been derived from earlier measurements of STASIW [1932] by MARKHAM [1966].
definiteness we assume the latter in the following. From Doyle’s measured shape factors in Table 2 we can estimate the corresponding value of f_F(𝒜); it is 0.57. Then, from LUTY’S [1960] measurements of the relative area of the F, K and L bands the total strength of all the observed F-center bands is 0.68. This is to be compared with the expectation, based on the f-sum rule, of a total oscillator strength greater than unity. From the discussion in § 6.2, the discrepancy between these values is outside experimental error. Although the possibility that more than 1/3 of the F-center absorption is obscured by the crystal’s fundamental absorption cannot be definitely excluded, it seems highly improbable from the considerations of § 6.1. It is far more likely that the assumption of a Lorentz field in the analysis has led to oscillator strengths that are too small. That this must be the case is illustrated in Fig. 5, which shows the radial extent of the dipole operator for an F-center* (SMITH and DEXTER [1968b]). The specific crystal illustrated is RbCl and the ions along a [100] direction are indicated by circles of the classical ionic radius. The curves give the integrand (ψ_1s r ψ_np) r² for 2 ≤ n ≤ 5. The striking feature is that the integrands are not compact and that the major contribution comes from the region between 4 and 12 a_0, which includes 32 ions. Furthermore, there is almost no
* The F-center wave functions used are for the semicontinuum model with empirical parameters given by SMITH and SPINOLO [1965b].
TABLE 3 (continued). Values are for the Lorentz local field; f(ℒ), f(𝒢) and f(𝒜) are the values for a Lorentz line shape, a Gaussian line shape, and the actual area under the absorption band, respectively.
K-band included?
0.9
0.71 0.48 0.52
0.46 0.31
0.38 0.26
Method (in the order of the references): Chemical - pH; F → F' conversion; E.P.R.; Static mag. susc.; Static mag. susc.; Chemical - H2 evol.; Chemical - pH; Chemical - pH; Chemical - H2 evol.
0.85
b Recalculated for a Gaussian band by DEXTER [1958].
c See also SEITZ [1946] p. 390.
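The KCl estimate in the text, from the conventional value 0.85 to the value 0.57 for the actual band area, is a one-line rescaling by shape factors. The sketch below reproduces it, assuming that the conventional value refers to a Lorentzian shape (S = 1.57) and that the Table 2 entry 1.06 is the shape factor of the KCl F-band alone. Both assignments are inferred from the arithmetic rather than stated explicitly. The variable names are illustrative.

```python
# Shape factors S = area / (peak height * full width at half maximum)
S_LORENTZIAN = 1.57        # Table 2, Lorentzian
S_KCL_F_BAND = 1.06        # Table 2, assumed to be the KCl F-band without K

def rescale_to_actual_area(f_conventional, s_actual, s_assumed=S_LORENTZIAN):
    """Convert an f value computed with an assumed band shape to the
    value for the measured band area (same local field in both)."""
    return f_conventional * s_actual / s_assumed

f_conventional_kcl = 0.85  # conventional KCl value adopted in the text
f_actual_kcl = rescale_to_actual_area(f_conventional_kcl, S_KCL_F_BAND)

total_observed = 0.68      # F + K + L total quoted in the text
shortfall = 1.0 - total_observed   # fraction missing if the f-sum were unity

if __name__ == "__main__":
    print("f for actual area:", round(f_actual_kcl, 2))
    print("shortfall relative to unity:", round(shortfall, 2))
```

The shortfall of about one third is the amount that would have to be hidden under the fundamental absorption if the Lorentz field were correct.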
contribution from within the vacancy. Hence, the Lorentz and Onsager local fields are clearly the wrong choices (they apply only near r = 0, where the integrand vanishes) and ℰ_eff must be much nearer the average field, ℰ_0. A similar conclusion holds if we consider the gradient matrix elements. The gradient integrand for the 1s → 2p transition is given as the dashed curve in the figure. This situation is also found for matrix elements between the ground state and continuum wave functions (SMITH [1967]). Further evidence that the Lorentz field is incorrect was presented some years ago by DOYLE [1958b], who reasoned that if the one-electron f-sum rule could be applied to the observed spectra, Smakula's equation could be inverted and, assuming Σf = 1, the effective field ratio calculated from the total integrated cross section. His results for (ℰ_eff/ℰ_0)² are given in the column marked Σf = 1 of Table 4. In § 3.2 it was argued that Σf_F-center ≥ 1, so that Doyle's values represent an upper bound on the field ratios (assuming negligible absorption is obscured by the host crystal fundamental). Another approach is to assume that Σf_F ≈ Σf_alkali, the total valence electron oscillator strength for the alkali atom. The justification for this is that in the simplest L.C.A.O. approximation the F-center ground state is made up of alkali-atom valence states. The effective field ratios calculated for this assumption are given in the column marked Σf = Σf_alkali in Table 4. They are significantly lower than either the Lorentz or Onsager values. A more realistic
Fig. 5. The integrands of several 1s → np dipole matrix elements and the 1s → 2p momentum matrix element for the RbCl F-center in the semicontinuum model. Ions along the [100] direction are indicated by circles; the position and number of ions in all directions up to the eighth shell are shown in the lower part of the figure. Two measures of the “size” of the vacancy are shown. They are the Mott-Littleton radius R_M-L (MOTT and LITTLETON [1938]) and the central well in the semicontinuum model potential R_E (SMITH and SPINOLO [1965b]). In all cases the matrix element integrands are spread over many neighboring ions.
L.C.A.O. model would include halide-ion and excited alkali-ion states since overlap with halide ions is sizeable; this would probably lead to smaller effective field ratios, particularly for salts of the heavy halides. Thus, this treatment of the experimental data implies that an effective field only slightly larger than the average field is consistent with both the f-sum rule as amended to include occupied states and with the relatively diffuse nature of the dipole transition matrix integrand. Another way to interpret the data is to note that by assuming an effective field ratio of unity the experimental data may be used to set an upper limit on the oscillator strength of the observed F-center transitions. These upper limits for the total oscillator strengths are approximately 1.3 in NaCl, 1.4 in KCl and 1.5 in KBr, while those for the F-band itself are of the order of 1.2*. Although there are insufficient data here to draw any final conclusion, it is interesting to note that the total f-sum increases as the atomic number
TABLE 4
Estimates of the square of the effective field ratio for the F-center. The values listed are those for the Lorentz and Onsager fields, the field calculated from the observed cross section assuming (after Doyle) a Σf of unity, and the field calculated for a Σf equal to the total oscillator strength of the alkali atom valence electron. The index of refraction for the region of the F-band is n_F.

                          (ℰ_eff/ℰ_0)²
Substance   n_F^c    Lorentz   Onsager   Σf = 1            Σf = Σf_alkali^a
NaCl        1.56     2.18      1.55      1.35 ± 0.07^b     1.30
KCl         1.49     1.98      1.50      1.34 ± 0.15^b     1.22
KBr         1.56     2.18      1.55      1.54 ± 0.13^b     1.40

a Values of Σf_alkali are taken from Table 1.
b Based on the oscillator strengths reported by DOYLE [1958b].
c For values of the refractive index used in preparing this and subsequent tables see PICK [1962] and GYULAI [1928].
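The Lorentz and Onsager columns of Table 4 follow directly from the refractive index. A minimal sketch, assuming the usual expressions (n² + 2)/3 for the Lorentz field ratio and 3n²/(2n² + 1) for the Onsager cavity field ratio (the function names are illustrative):

```python
def lorentz_ratio_sq(n):
    """Square of the Lorentz local field ratio, [(n^2 + 2) / 3]^2."""
    return ((n * n + 2.0) / 3.0) ** 2

def onsager_ratio_sq(n):
    """Square of the Onsager cavity field ratio, [3 n^2 / (2 n^2 + 1)]^2."""
    return (3.0 * n * n / (2.0 * n * n + 1.0)) ** 2

# Refractive indices in the F-band region, from Table 4
N_F = {"NaCl": 1.56, "KCl": 1.49, "KBr": 1.56}

if __name__ == "__main__":
    for salt, n in N_F.items():
        print("%-5s n_F=%.2f  Lorentz=%.2f  Onsager=%.2f"
              % (salt, n, lorentz_ratio_sq(n), onsager_ratio_sq(n)))
```

Both model values lie well above the ratios inferred from the measured cross sections, which is the comparison the table is meant to make.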
of either the alkali or the halide ions increases. This is in line with the notion that the larger the number of occupied core states, the larger the corrections to the f-sum rule.

6.4. RELATIVE OSCILLATOR STRENGTHS
Although the absolute determination of oscillator strengths is generally difficult, relative strengths of absorptions for centers related by photochemical reactions are easier to measure. Since the F-center is involved in a large number of such processes, oscillator strengths of many centers have been investigated relative to the F-band. Of course, such measurements require that all converted centers yield observable species and assume that the number of F-centers produced from, or required to form, each of the other centers is known. The technique is based on the generalized form of Smakula's equation, eq. (6.1), which gives the relative strengths of two absorptions a and b to be

$$ \frac{f_a}{f_b} = \frac{n_a}{n_b}\,\frac{\rho_b}{\rho_a}\,\frac{(\mathcal{E}_0/\mathcal{E}_{\rm eff})_a^{2}}{(\mathcal{E}_0/\mathcal{E}_{\rm eff})_b^{2}}\,\frac{\int \mu_a(E)\,dE}{\int \mu_b(E)\,dE} $$
This equation shows that even relative oscillator strengths cannot be determined unambiguously since the effective fields depend on the details of the defects. The simplest case occurs for centers with similar electron distributions and
* In this calculation the weighted means of the published oscillator strengths given by DOYLE [1958b] were used. Since the published results assume a Lorentz local field and a Lorentzian line-shape, corrections were applied to find the oscillator strengths for the actual band shape and for ℰ_eff = ℰ_0. The total oscillator strength for the F-, K- and L-bands was then found using the relative areas given by LUTY [1960].
absorptions at energies for which n_a ≈ n_b. Then the effective fields are approximately the same and the relative strength is the ratio of the integrated absorptions. A probable example is the “family” of F-, F-aggregate, and perturbed F-centers. To a first approximation, all these centers are made up of combinations of F-center states or slightly perturbed F-center states. Thus, the transition matrix element integrands are of similar spatial extent, implying similar effective fields. If, as contended above, the F-center experiences nearly the average field, the more extended aggregates also experience the average field; further, the group may be expanded to include the F’-center, which has an even more diffuse charge distribution (LA and BARTRAM [1966], STROZIER and DICK [1969]). Table 5 contains data for these centers obtained from the bleaching of F-centers. Since there is always the possibility (slight here) that products of the F-center conversion may have escaped detection, all values are lower limits. In addition there is photo-conversion data for R- and N-centers, the higher F-aggregate centers containing three and four F-centers, respectively (COMPTON and RABIN [1964]). PETROFF [1950] found that conversion of F-centers into F_A- and F-aggregate centers proceeded without change in area under the absorption curves. Since Table 5 shows that both F_A- and M-centers have roughly the same total oscillator strength as the F-center, this implies that the observed R- and N-center transitions must also have the same oscillator strength per electron as the F-center. To summarize, the F-, F’-, F_A-, F_Z1-, M-, N- and R-centers all have equal cross section per electron to within experimental error. The simplest explanation is that all these centers experience virtually the same effective field, have very nearly the same total oscillator strength per electron (values somewhat in excess of unity), and that substantially all the absorption is observed in each case.
Other explanations require variations of the effective field to be offset by observation of a correspondingly different fraction of the total oscillator strength; or, if the fields are the same, that the same fraction of the absorption is observed for all centers. Considering the diversity in electronic structure, these latter possibilities seem unlikely. The presence of the F’-center in this group is additional evidence that the net effective field for these centers is the average field, since this more diffuse center samples even more of the crystal field than the F-center (LA and BARTRAM [1966], STROZIER and DICK [1969]). In the general case where there are significant differences in charge distribution and refractive index the general formula must be used. As an example consider the U-center, a substitutional H- ion (PICK [1965], FOWLER [1968]). Bleaching of this center yields F-centers on a one-to-one
TABLE 5
Oscillator strengths of some electron excess color centers derived from the F-center by bleaching experiments

Center       Crystal     Transition      f/f_F     f/f_F per electron   Reference
F'           KCl         F'-band         ~2.00     1.00                 PICK [1938, 1940]
             KCl         F'-band         1.87      0.94                 DELBECQ [1963]
F_A          KCl:Na      A_1             0.36
                         A_2 (two)       0.32
                         Total           1.0       1.0                  LUTY [1961, 1962]
             KCl:Li      A_1             0.33
                         A_2 (two)       0.33
                         Total           1.0       1.0                  FRITZ et al. [1965]
Z_1 (F_Z1)   KCl:Sr      Z_1-band        1.04      1.04                 CAMAGNI and CHIAROTTI [1954]
             KCl:Sr      Z_1-band        0.97      0.97                 KLEEFSTRA [1963]
             KCl:Sr      Z_1-band        1.00^a    1.00^a               HARTEL and LUTY [1964]
             KCl:Sr^b    Z_1             0.89
                         K_Z1            0.12
                         Total           1.01      1.01                 PAUS [1969, 1971]
             RbCl:Ca^b   Z_1             0.75
                         K_Z1            0.19
                         Total           0.94      0.94                 PAUS [1971]
             RbCl:Sr^b   Z_1             0.90
                         K_Z1            0.17
                         Total           1.07      1.07                 PAUS [1971]
M^c          KCl         M_1             0.43
                         M_2             0.36
                         M_2'            ~0.36
                         M_3             0.22
                         M_3'            ~0.22
                         Total           1.59      0.8                  DELBECQ [1963]
             KCl:H       M_1             0.42      -                    EROS [1965]
Colloids     NaCl        Colloid bands   -         1.00                 DOYLE [1958a]

a From the author’s integration of absorption spectra given by HARTEL and LUTY [1964].
b Z_1 refers to the main Z_1-band. K_Z1 refers to the K-band-like absorption making up the high energy tail of the Z_1-band.
c See also COMPTON and RABIN [1964] for a reinterpretation of the data of OKURA [1957] and of TOMIKI [1959a, 1959b, 1960] which leads to values of f_M1/f_F of 0.38 and 0.43 respectively.
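The totals and per-electron values of Table 5 are simple sums. The sketch below recomputes a few of them from the component strengths. The electron counts (two for the F'- and M-centers, one for the F_A- and Z_1-centers) are assumptions based on the descriptions of these centers in the text, and the dictionary layout is purely illustrative.

```python
# (component strengths relative to the F-band, assumed number of electrons)
TABLE5_ROWS = {
    "F' KCl (Pick)":       ([2.00], 2),
    "F' KCl (Delbecq)":    ([1.87], 2),
    "F_A KCl:Na":          ([0.36, 0.32, 0.32], 1),   # A_1 plus two A_2
    "Z_1 KCl:Sr (Paus)":   ([0.89, 0.12], 1),         # Z_1 plus K_Z1
    "Z_1 RbCl:Ca (Paus)":  ([0.75, 0.19], 1),
    "Z_1 RbCl:Sr (Paus)":  ([0.90, 0.17], 1),
    "M KCl (Delbecq)":     ([0.43, 0.36, 0.36, 0.22, 0.22], 2),
}

def total_strength(components):
    """Sum of the component band strengths relative to the F-band."""
    return sum(components)

def per_electron(components, electrons):
    """Total relative strength divided by the number of electrons."""
    return total_strength(components) / electrons

if __name__ == "__main__":
    for name, (parts, electrons) in TABLE5_ROWS.items():
        print("%-20s total=%.2f  per electron=%.2f"
              % (name, total_strength(parts), per_electron(parts, electrons)))
```

All of the per-electron values fall between roughly 0.8 and 1.1, which is the regularity the surrounding argument relies on.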
basis (MARTIENSSEN and PICK [1953]). However, the U-band lies from 3 to 4 eV higher in energy than the F-band, so that the index of refraction in the U-band region is 10 to 20 % higher than at the F-band. The difference in effective fields for the F- and U-centers is not known. We note, however, that the U-center must be somewhat more compact than the F-center because of the additional binding of the partially shielded proton. But this binding is relatively small (the electron affinity of hydrogen is ~0.75 eV (WEAST [1970])), so that the two charge distributions probably do not differ greatly. This leads to the tentative conclusion that the U-center effective field is probably somewhat greater than that for the F-center, but much less than the Onsager or Lorentz values. In any event, reasonable limits for f_U/f_F can be set by assuming that either the Lorentz field or numerically equal fields act at both centers. The experimental results for f_U/f_F are summarized in Table 6. As reported in the literature the experimental data effectively exclude both the K-band
TABLE 6
Oscillator strengths of the U-band relative to that of the F-band. Values in the first column are calculated assuming a Lorentz effective field; those in the second column are based on the assumption that the effective fields for both centers are numerically equal. Since the U-center has two equivalent electrons, the oscillator strength per electron is half the listed value.

                    f_U/f_F
Crystal    Lorentz field    ℰ_eff^U = ℰ_eff^F    Reference
NaCl       1.2              1.8                   GOTO et al. [1963]
KCl        1.3              1.65                  KLEINSCHROD [1936]
KCl        1.16             1.4                   HIRAI [1960]
KCl        1.0              1.3                   FISCHER and GRUNDIG [1965]
KCl:H      1.05             1.36                  EROS [1965]
KBr        1.0              1.48                  TIMUSK et al. [1963]
KBr:H      0.86             1.22                  EROS [1965]
absorption of the F-center and the short-wavelength tail of the U-band.* The ratios consequently refer just to the principal bands of the F- and U-centers. Moreover, as in the case of F-center conversions there is the possibility of undetected products of U-center destruction, so that these results should be interpreted as upper limits.
* Except in the measurements of Timusk and Martienssen it was assumed that both the U- and F-bands have similar shape so that their areas could be calculated from the product of width, peak-height and a constant which is the same for both bands. There appears to be insufficient information to justify or reject this assumption. In Timusk and Martienssen's measurements the areas were measured, but the short-wavelength absorptions were subtracted.
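The two columns of Table 6 differ only in the local field assumed at each center, so one can be converted into the other from the refractive indices alone. The sketch below shows that conversion. The function name is illustrative, and the value n_U = 1.16 n_F used in the example is an assumption chosen inside the 10 to 20 % range quoted in the text, not a measured index.

```python
def lorentz_to_equal_field(ratio_lorentz, n_a, n_b):
    """Convert f_a/f_b derived with Lorentz fields at both centers into
    the value for numerically equal effective fields.

    With f proportional to n (E_0/E_eff)^2 times the band area, and
    (E_eff/E_0) = (n^2 + 2)/3 for the Lorentz field, the two ratios
    differ by the factor [(n_a^2 + 2) / (n_b^2 + 2)]^2.
    """
    return ratio_lorentz * ((n_a ** 2 + 2.0) / (n_b ** 2 + 2.0)) ** 2

N_F_KBR = 1.56                 # F-band region index for KBr (Table 4)
N_U_ASSUMED = 1.16 * N_F_KBR   # hypothetical U-band region index

if __name__ == "__main__":
    converted = lorentz_to_equal_field(0.86, N_U_ASSUMED, N_F_KBR)
    print("KBr:H, equal-field f_U/f_F: %.2f" % converted)
    print("per electron: %.2f" % (converted / 2.0))
```

For the KBr:H entry (0.86 with Lorentz fields) this gives a value close to the tabulated 1.22, and half of that per electron, consistent with the 0.6 to 0.8 range discussed for the U-band.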
The outstanding difference between the U-center results and those for the F- and F-aggregate centers (Table 5) is that the observed U-band oscillator strength per electron is of the order of 0.6 to 0.8 of that of the F-band. Since the F-band accounts for roughly 80 % of the observed F-center spectrum, the U-band oscillator strength per electron is only of the order of half that for the total observable F-band spectrum. Consequently, the U-band alone cannot exhaust thef-sum rule for the U-center and these crude estimates suggest that somewhat less than half the U-center spectrum must lie toward higher energies. The high energy tail or shoulder of the U-band, the U,-band of GOTO,ISHIIand UETA[1963], accounts for some of this, but from published spectra it appears that the tail is only of the order of 10 to 15 % of the main U-band (TIMUSK and MARTIENSSEN [1963], GOTOet al. [1963]). Thus, we conclude that 4 to 3 of the U-center absorption probably lies under the host crystal absorption. This is not surprising since it is reasonable to expect U-center absorptions at higher energies in analogy to the F-center K- and L-bands and if the K-like absorption is assigned to the U,-band, the L-like bands would lie under the fundamental. Further evidence for this point of view is given by the photoconductivity studies of GOTOet al. [I9631 which show a higher energy U-center absorption, the U,-band, lying very near the exciton edge. In the case of the U-center the percentage of total strength in the higher energy bands postulated in the preceding paragraph should be greater than that in the F-center. The point is that to a first approximation the F-center is a particle in a box while both U-center electrons experience the Coulomb field of the central proton. 
In the case of a particle in a spherical well the oscillator strength of the first transition (Is 2p) lies between 0.97 and 0.98 over a wide range of parameters appropriate to defect centers in ionic crystals (SMITHand DEXTER [1969]). This is distinctly different from the case of a Coulomb potential in which the total oscillator strength is distributed over many transitions with only 0.41 associated with the Is + 2p transition (BETHE and SALPETER [1957]). Qualitatively this difference arises from the infinitely deep potential near the hydrogen nucleus. The U-center ground S state has a large density near the proton where the potential is deep. The excited P states, since they contain an extra factor of r 2 in the density, have a maximum density further out where the potential is weaker. The U-center ground and excited states are consequently determined by different regions of the potential. This leads to a relatively tightly-bound ground state and relatively diffuse excited states with little overlap between the ground state and any single excited state. Although the transition energies may be large, the oscillator strengths are --f
small because of the quadratic dependence of transition probability on the dipole matrix element. In contrast, for the relatively flat wells found in the F-center family the ground and (unrelaxed) excited states “feel” roughly the same well depth, and the ground and first few excited states are more nearly of the same spatial extent. This gives rise to a few large dipole matrix elements and, hence, oscillator strengths.
The other absorptions whose strength is conveniently measured relative to the F-band are the α- and β-bands (DELBECQ et al. [1951, 1952]). These bands lie at slightly longer wavelengths than the host crystal exciton lines and are thought to be localized excitons perturbed by the presence of a negative-ion vacancy or an F-center, respectively. The different refractive indices in the F-band and the α- and β-band regions are known (PICK [1962]), but uncertainty in the effective fields for such “excitons” makes an analysis difficult. As in the previous discussion of U-centers we have simply calculated the ratios for equal and Lorentz fields as examples. The oscillator strengths for the α- and β-bands are given in Table 7. Timusk and Martienssen, and Rockstad used the actual area under the absorption bands in arriving at their results. In all the α-band measurements the change in refractive index appears to have been accounted for. In the case of Rigden’s NaCl β-band results neither the change in index nor the actual area under the bands appears to have been used. The value of $f_\beta/f_F$ from the original work is given in parentheses followed by a recalculated value taking into account the change in n.

TABLE 7
Oscillator strengths for α- and β-bands. Column headings have the same meaning as in Table 6. Values in parentheses are the reported experimental values not corrected for the variation in refractive index.
Crystals: NaCl, KCl, KBr
Column headings: Crystal; $\mathcal{E}_{\rm eff} = \mathcal{E}_0$; Lorentz; $f_\beta/f_F$; $f_\alpha/f_F$; Reference
Values (in extraction order): 2.72; (0.64); 0.355; 1.39; 2.52; 2.3,; 2.49; (1.6g); 0.934
References: ONAKA et al. [1963]; RIGDEN [1961]; ONAKA et al. [1963]; ONAKA et al. [1963]; TIMUSK et al. [1963] and ROCKSTAD [1965]
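The Coulomb-potential figure quoted in the U-center discussion above (only 0.41 of the strength in the 1s → 2p transition) can be checked numerically. The following is a minimal sketch, assuming the standard closed-form expression for the hydrogen 1s → np absorption oscillator strengths; the function names and the cutoff `n_max` are illustrative choices, not part of the original article.

```python
def f_1s_np(n: int) -> float:
    """Oscillator strength of the hydrogen 1s -> np transition (n >= 2).

    Assumed closed form: f = 2^8 n^5 (n-1)^(2n-4) / (3 (n+1)^(2n+4)).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    # Exact integer arithmetic avoids overflow for large n; the final
    # division converts the exact ratio to a float.
    numerator = 2**8 * n**5 * (n - 1) ** (2 * n - 4)
    denominator = 3 * (n + 1) ** (2 * n + 4)
    return numerator / denominator


def discrete_sum(n_max: int = 200) -> float:
    """Sum of the discrete-line strengths 1s -> np for n = 2 .. n_max."""
    return sum(f_1s_np(n) for n in range(2, n_max + 1))


if __name__ == "__main__":
    print(f"f(1s -> 2p) = {f_1s_np(2):.4f}")
    print(f"discrete lines up to n = 200: {discrete_sum():.4f}")
    print(f"remainder (continuum and higher lines): {1 - discrete_sum():.4f}")
```

In this sketch the first line carries well under half of the unit sum and a substantial remainder lies beyond the discrete lines, in contrast to the 0.97 to 0.98 quoted for the spherical well.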
6.5. POSSIBLE TESTS OF $\mathcal{E}_{\rm eff}/\mathcal{E}_0$ AS A FUNCTION OF $n_0$
The various classical effective fields differ strongly in their dependence on the index of refraction, suggesting that it may be possible to determine which is acting in a given situation by a variation of the host crystal. This dependence is shown in Fig. 6, in which $(\mathcal{E}_{\rm eff}/\mathcal{E}_0)^2$ is plotted vs $n_0$ for the Lorentz, Onsager and average fields. In order to distinguish effects due to effective fields it is necessary to find a property of the defect which is relatively insensitive to the host material. Some possible quantities are:
a) transition matrix elements between electronic states that are shielded by outer electrons; examples include transitions in rare earths and transition elements;
b) dipole moments of molecular impurities for which there is no chemical bonding with the host;
c) oscillator strength sums where corrections for occupied host-crystal core states are the same for a series of host crystals.

[Figure 6. Horizontal axis: REFRACTIVE INDEX, $n_0$; arrows mark NaF, LiF, NaCl, NaI, LiI, AgBr and TlBr.]
Fig. 6. The square of the effective field ratio as a function of the refractive index in the Lorentz, Onsager, and average field approximations. The optical indices of refraction for a number of common crystals are indicated by arrows.

In light of the conclusions about the F-center drawn in the previous sections the last case is of particular interest, since the f-sum for an F-center in the halide of a given alkali should change little relative to the index of refraction as the halide ion changes. Thus, differences in the total integrated absorption between various hosts should primarily reflect changes in the effective field. For the series of lithium salts the value of $(\mathcal{E}_{\rm eff}/\mathcal{E}_0)^2$ increases by a factor of 2.2 from the fluoride to the iodide in the case of the Lorentz field, while a factor of 1.23 increase is predicted for the Onsager field. The corresponding figures for the sodium salts are 1.7 and 1.22, respectively. Thus, it should be possible to differentiate among the three classical cases in an experiment good to 10 or 15 %, while a 25 or 30 % experiment could distinguish between the Lorentz and the Onsager or average fields. Unfortunately, measurements of total integrated cross sections on series of compounds do not seem to have been performed.
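The lithium-salt factors quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions: the Lorentz ratio is taken as $(n_0^2+2)/3$, the Onsager ratio is taken in its cavity-field form $3n_0^2/(2n_0^2+1)$, the average field is the macroscopic field (ratio 1), and the refractive indices for LiF and LiI are handbook-style values assumed here rather than taken from the article.

```python
def lorentz_ratio(n0: float) -> float:
    """Effective-to-macroscopic field ratio for the Lorentz local field."""
    return (n0 * n0 + 2.0) / 3.0


def onsager_cavity_ratio(n0: float) -> float:
    """Field ratio for the Onsager cavity field (assumed form)."""
    return 3.0 * n0 * n0 / (2.0 * n0 * n0 + 1.0)


def average_ratio(n0: float) -> float:
    """Average (macroscopic) field: the ratio is unity for every host."""
    return 1.0


def growth(ratio, n_low: float, n_high: float) -> float:
    """Factor by which the squared field ratio grows between two hosts."""
    return (ratio(n_high) / ratio(n_low)) ** 2


# Assumed optical refractive indices (illustrative values only).
N_LIF = 1.392
N_LII = 1.955

if __name__ == "__main__":
    for name, ratio in (("Lorentz", lorentz_ratio),
                        ("Onsager", onsager_cavity_ratio),
                        ("average", average_ratio)):
        print(f"{name:8s} LiF -> LiI: {growth(ratio, N_LIF, N_LII):.2f}")
```

With these assumed indices the Lorentz factor comes out near 2.2 and the Onsager factor near 1.2, so the spread among the three models is what sets the accuracy required of the proposed experiment.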
§ 7. Summary
In this article we have reviewed and attempted to put into perspective the theory for the strength of optical absorption by defects in insulating solids. Experimental results for color centers in the alkali halides have been considered as an example of these ideas, and possible experimental tests of the theory have been discussed. The major points emphasized may be summarized as follows:
1) We have shown on general grounds that it is not possible completely to decouple the absorption of a particular defect from that of the crystal as a whole. In the one-electron approximation this circumstance leads to deviations from the f-sum rule as it would apply to the isolated defect. Physically this is a result of the Pauli principle prohibition of transitions of the defect electron to occupied states of the host crystal. In the case of electron-excess color centers one is led to the expectation of total f-sums per electron exceeding unity.
2) Smakula’s classical relation between the integrated absorption cross section and defect oscillator strength may be generalized provided the effective field ratio, oscillator strength and mass are given their proper quantum mechanical interpretation. In general the effective field cannot be associated with an actual electromagnetic field, but is, rather, a parameter related to the time-dependent potential in the neighborhood of the defect. For most applications, the mass to be used in Smakula’s equation is the electronic mass, not the effective mass of the host crystal. The resulting oscillator strengths then obey the f-sum rule consistent with the Pauli principle.
3) The idea of a local effective field was considered from both classical and quantum mechanical viewpoints. In the limit of small overlap between defect and host the classical approximation is sufficient. However, the field at the defect must be found self-consistently because the defect and host atoms have different polarizabilities. Such a treatment leads to a local effective field involving the Onsager cavity and reaction fields. The result is not, as often assumed, the Lorentz field. When overlap is not negligible, exchange effects must be included and in general a local effective field cannot be defined. Then a time-dependent Hartree-Fock formulation may be used, and transitions are seen to be induced by the time-dependent self-consistent potential experienced by the defect. In the limit of a very diffuse center, Coulomb and exchange potentials arising from the polarization of the medium average to zero and the effective field reduces to the macroscopic field in the medium.
4) The distributions of oscillator strength as a function of energy for hydrogenic defects and the F-center were considered. The results indicate that the f-sum is nearly exhausted by the observable absorption in a number of important centers in the alkali halides. A comparison of theory and experiment indicates that this holds for the F-, F_A-, F,,-, F’- and F-aggregate centers and that for these centers only a few percent of the total absorption is obscured by the fundamental crystal absorption. In the case of the U-center one-quarter to one-half of the total appears to be lost under the fundamental, while a larger percentage remains undetected in centers such as substitutional OH⁻ and SH⁻.
5) A comparison of measured F-center oscillator strengths with the f-sum rule prediction of a total strength exceeding unity leads to the conclusion that the effective local field experienced by the F-center electron is close to the macroscopic field, $\mathcal{E}_0$. This conclusion is verified by the observation that the F-center dipole transition matrix element integrands are very diffuse and average over the polarization fields of a large number of ions.
6) An experimental test of the conclusions that, for the F-center, the observed absorption almost exhausts the f-sum and that $\mathcal{E}_{\rm eff} \approx \mathcal{E}_0$ is suggested. The test consists of the measurement of total integrated absorption as a function of refractive index for a series of crystals. Experiments good to 25 or 30 % should be sufficient to distinguish between an effective field with the Lorentz or with the macroscopic field value.
Acknowledgements
The authors would like to thank Dr. M. Altarelli for helpful comments on an early draft of this article. They are indebted to Dr. C. J. Delbecq and Dr. P. Yuster for discussions of the experimental determination of oscillator strengths, and would like to thank Dr. A. Fukuda, Dr. G. Kurz and Dr. H. Paus for comments on impurity center experiments and for supplying the authors with unpublished results.

References
BALLHAUSEN, C. J., 1962, Introduction to Ligand Field Theory (McGraw-Hill, Inc., New York).
BASSANI, F., G. IADONISI and B. PREZIOSI, 1969, Phys. Rev. 186, 735.
BATES, R. T. and C. V. HEER, 1958, J. Phys. Chem. Solids 7, 14.
BENNETT, H. S., 1968, Phys. Rev. 169, 729.
BENNETT, H. S., 1969, Phys. Rev. 184, 918.
BETHE, H. A., 1929, Ann. Physik [5] 3, 133.
BETHE, H. A. and E. E. SALPETER, 1957, Quantum Mechanics of One- and Two-Electron Atoms (Springer-Verlag, Berlin) Sects. 59, 61 and 62.
BHARGAVA, R. K. and D. L. DEXTER, 1970, Phys. Rev. B1, 1.
BIERMANN, L. and K. LÜBECK, 1948, Z. Astrophys. 25, 325.
BIERMANN, L., 1950, Oszillatorenstärken und Lebensdauer angeregter Zustände, in: Landolt-Börnstein Zahlenwerte und Funktionen, Vol. 1, ed. A. Eucken, 6th ed. (Springer-Verlag, Berlin) part 1, pp. 260-275.
BROWN, W. F., 1956, Dielectrics, in: Handbuch der Physik, Vol. 17, ed. S. Flügge (Springer-Verlag, Berlin).
BUSHNELL, J. C., 1964, ENDOR Study of Z1 Centers in KCl, Thesis, University of Illinois, Urbana/USA (unpublished).
CAMAGNI, P. and G. CHIAROTTI, 1954, Nuovo Cimento 11, 1.
CHANDRASEKHAR, S., 1945, Astrophys. J. 102, 223.
CHIAROTTI, G. and U. M. GRASSANO, 1966a, Phys. Rev. Letters 16, 124.
CHIAROTTI, G. and U. M. GRASSANO, 1966b, Nuovo Cimento B46, 78.
COHEN, M. H. and F. REIF, 1957, Quadrupole Effects in Nuclear Magnetic Resonance Studies of Solids, in: Solid State Physics, Vol. 5, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York).
COMPTON, A. H. and S. K. ALLISON, 1935, X-Rays in Theory and Experiment (D. Van Nostrand, Princeton), see especially chapter VII, § 9 and Table VII-13.
COMPTON, W. D. and H. RABIN, 1964, F-Aggregate Centers in Alkali Halide Crystals, in: Solid State Physics, Vol. 16, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York).
COURANT, R. and D. HILBERT, 1953, Methods of Mathematical Physics, Vol. I (Interscience Publishers, Inc., New York).
CRANDALL, R. S. and M. MIKKOR, 1965, Phys. Rev. 138, A1247.
DARWIN, C. G., 1934, Proc. Roy. Soc. (London) A146, 17.
DARWIN, C. G., 1944, Proc. Roy. Soc. (London) A182, 152.
DELBECQ, C. J., P. PRINGSHEIM and P. YUSTER, 1951, J. Chem. Phys. 19, 574.
DELBECQ, C. J., P. PRINGSHEIM and P. YUSTER, 1952, J. Chem. Phys. 20, 746.
DELBECQ, C. J., 1963, Z. Physik 171, 560.
DEXTER, D. L., 1954, Phys. Rev. 96, 615.
DEXTER, D. L., 1956, Phys. Rev. 101, 48.
DEXTER, D. L., 1958, Theory of the Optical Properties of Imperfections in Nonmetals, in: Solid State Physics, Vol. 6, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York).
DOYLE, W. T., 1958a, Phys. Rev. 111, 1067.
DOYLE, W. T., 1958b, Phys. Rev. 111, 1072.
EBY, J. E., K. J. TEEGARDEN and D. B. DUTTON, 1959, Phys. Rev. 116, 1099.
EROS, S., 1965, An Investigation of F Centers, U Centers, and M Centers in KBr : H, KCl : H and NaCl : H, Final report U.S. Army Research Office Contract No. DA 49-092-ARO-59 with Carson Laboratories, Inc., Bristol, Connecticut (Defense Documentation Center Report No. AD-465278).
ESTERMANN, I., W. J. LEIVO and O. STERN, 1949, Phys. Rev. 75, 627.
FISCHER, F. and H. GRÜNDIG, 1965, Z. Physik 184, 299.
FOWLER, W. B. and D. L. DEXTER, 1962, Phys. Rev. 128, 2154.
FOWLER, W. B. and D. L. DEXTER, 1965, J. Chem. Phys. 43, 1768.
FOWLER, W. B., 1968, Electronic States and Optical Transitions of Color Centers, in: Physics of Color Centers, ed. W. B. Fowler (Academic Press, Inc., New York).
FRITZ, B., F. LÜTY and J. ANGER, 1963, Z. Physik 174, 240.
FRITZ, B., F. LÜTY and G. RAUSCH, 1965, Phys. Stat. Sol. 11, 635.
FRÖHLICH, H., 1958, Theory of Dielectrics, 2nd ed. (Oxford University Press, London).
FUKUDA, A., 1964a, Science of Light 13, 64.
FUKUDA, A., K. INOHARA and R. ONAKA, 1964b, J. Phys. Soc. Japan 19, 1274.
FUKUDA, A., 1964c, J. Spectroscopical Soc. Japan 12, 201, and private communication.
FUMI, F. G. and M. P. TOSI, 1964, J. Phys. Chem. Solids 25, 31.
GEHRER, G. and H. LANGER, 1968, Phys. Letters 26A, 232.
GILBERT, T. L., 1967, unpublished wave functions calculated with atomic structure programs developed by C. Froese Fischer [FROESE, C., 1965, Hartree-Fock Program with Configuration Mixing, University of British Columbia Computing Centre Report].
GOTO, T., T. ISHII and M. UETA, 1963, J. Phys. Soc. Japan 18, 1422.
GOURARY, B. S. and F. J. ADRIAN, 1957, Phys. Rev. 105, 1180.
GOURARY, B. S. and F. J. ADRIAN, 1960, Wave Functions for Electron-Excess Color Centers in Alkali Halide Crystals, in: Solid State Physics, Vol. 10, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York).
GREEN, L. C., N. C. JOHNSON and E. K. KOLCHIN, 1966, Astrophys. J. 144, 369.
GUERTIN, R. F. and F. STERN, 1964, Phys. Rev. 134, A427.
GYULAI, Z., 1927, Z. Physik 46, 80.
HÄRTEL, H. and F. LÜTY, 1964, Z. Physik 182, 111.
HARTREE, D. R., 1957, The Calculation of Atomic Structures (John Wiley and Sons, Inc., New York).
HASSÉ, H. R., 1930, Proc. Cambridge Phil. Soc. 26, 542.
HASSÉ, H. R., 1931, Proc. Cambridge Phil. Soc. 27, 66.
HELLMANN, H., 1935, Acta Physicochimica URSS 1, 913.
HELLMANN, H., 1936, Acta Physicochimica URSS 4, 225.
HERMAN, F. and S. SKILLMAN, 1963, Atomic Structure Calculations (Prentice-Hall, Inc., Englewood Cliffs, N.J.).
HERRING, C., 1956, Theoretical Ideas Pertaining to Traps or Centers, in: Photoconductivity Conference, eds. R. G. Breckenridge et al. (John Wiley and Sons, Inc., New York) p. 81.
HIRAI, M., 1960, J. Phys. Soc. Japan 15, 1308.
IADONISI, G. and B. PREZIOSI, 1967, Nuovo Cimento B48, 92.
KLEEFSTRA, M., 1963, J. Phys. Chem. Solids 24, 1567.
KLEIN, M. V., S. O. KENNEDY, T. I. GIE and B. WEDDING, 1968, Mat. Res. Bull. 3, 677.
KLEINSCHROD, F. C., 1936, Ann. Physik [5] 27, 97.
KLICK, C. C., D. A. PATTERSON and R. S. KNOX, 1964, Phys. Rev. 133, A1717.
KOHN, W., 1957, Shallow Impurity States in Silicon and Germanium, in: Solid State Physics, Vol. 5, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York).
KOHN, W. and L. J. SHAM, 1965, Phys. Rev. 140, A1133.
KONITZER, J. D. and J. J. MARKHAM, 1960, J. Chem. Phys. 32, 843.
KÖSTLIN, H., 1967, Z. Physik 204, 290.
KRONIG, R. de L. and H. A. KRAMERS, 1928, Z. Physik 48, 174.
KUHN, U. and F. LÜTY, 1964, Solid State Comm. 2, 281.
KURZ, G., 1969a, Phys. Stat. Sol. 31, 93.
KURZ, G., 1969b, Phys. Stat. Sol. 32, 91.
LA, S. Y. and R. H. BARTRAM, 1966, Phys. Rev. 144, 670.
LAX, M., 1952, J. Chem. Phys. 20, 1752.
LAX, M., 1956, The Influence of Lattice Vibrations on Electronic Transitions in Solids, in: Photoconductivity Conference, eds. R. G. Breckenridge et al. (John Wiley and Sons, Inc., New York) p. 111.
LEUTE, H. and G. SCHULZ, 1966, Z. Physik 192, 299.
LORENTZ, H. A., 1909, The Theory of Electrons (B. G. Teubner, Leipzig; 2nd ed. reprinted by Dover Press, New York, 1952).
LÖWDIN, P.-O., 1956, Advan. Phys. 5, 1.
LUCKEN, E. A. C., 1969, Nuclear Quadrupole Coupling Constants (Academic Press, Ltd., London).
LÜTY, F., 1960, Z. Physik 160, 1.
LÜTY, F., 1961, Z. Physik 165, 17.
LÜTY, F., 1962, Habilitationsschrift (unpublished, Stuttgart).
LÜTY, F., 1968, F_A Centers in Alkali Halide Crystals, in: Color Centers in Alkali Halides, ed. W. B. Fowler (Academic Press, Inc., New York).
MAIER, K. and W. GEBHARDT, 1968, Phys. Stat. Sol. 27, 713.
MARKHAM, J. J. and J. D. KONITZER, 1961, J. Chem. Phys. 34, 1936.
MARKHAM, J. J., 1966, F-Centers in Alkali Halides (Academic Press, Inc., New York) p. 34.
MARTIENSSEN, W. and H. PICK, 1953, Z. Physik 135, 309.
MCCLURE, D. S., 1959, Electronic Spectra of Molecules and Ions in Crystals, Part II, in: Solid State Physics, Vol. 9, eds. F. Seitz and D. Turnbull (Academic Press, Inc., New York) p. 399.
MOLLWO, E., 1934a, Nachr. Gesell. Wiss. Göttingen, II Math.-Physik Kl., N.F. 1, No. 6, 79.
MOLLWO, E. and W. ROOS, 1934b, Nachr. Gesell. Wiss. Göttingen, Math.-Physik Kl., N.F. 1, No. 8, 107.
MOTT, N. F. and M. J. LITTLETON, 1938, Trans. Faraday Soc. 34, 485.
MOTT, N. F. and R. W. GURNEY, 1948, Electronic Processes in Ionic Crystals, 2nd ed. (Oxford University Press, London).
MORSE, P. M., L. A. YOUNG and E. S. HAURWITZ, 1935, Phys. Rev. 48, 948.
MORSE, P. M. and H. YILMAZ, 1956, Tables for the Determination of Atomic Wave Functions (Technology Press of M.I.T., Cambridge, Mass.).
NAKAZAWA, F. and H. KANZAKI, 1965, J. Phys. Soc. Japan 20, 468.
NAKAZAWA, F. and H. KANZAKI, 1967, J. Phys. Soc. Japan 22, 844.
OHKURA, H., 1957, J. Phys. Soc. Japan 12, 1313.
ONAKA, R., I. FUJITA and A. FUKUDA, 1963, J. Phys. Soc. Japan 18, Supplement II, 263.
ONSAGER, L., 1936, J. Am. Chem. Soc. 58, 1486.
PAGE, L. J., J. STROZIER and E. H. HYGH, 1968, Phys. Rev. Letters 21, 348.
PANOFSKY, W. K. H. and M. PHILLIPS, 1955, Classical Electricity and Magnetism (Addison-Wesley Publishing Co., Reading, Mass.) § 2-3.
PAUS, H. J. and K. THOMMEN, 1963, Phys. Letters 5, 315.
PAUS, H. J. and F. LÜTY, 1965, Phys. Stat. Sol. 12, 341.
PAUS, H. J. and F. LÜTY, 1968, Phys. Rev. Letters 20, 57.
PAUS, H. J., 1969, Z. Physik 218, 56.
PAUS, H. J., 1971, private communication.
PETROFF, ST., 1950, Z. Physik 127, 443.
PHELPS, F. T., 1963, Bull. Am. Phys. Soc. 8, 340, Abstract L14.
PICK, H., 1938, Ann. Physik [5] 31, 365.
PICK, H., 1939a, Ann. Physik [5] 35, 73.
PICK, H., 1939b, Z. Physik 114, 127.
PICK, H., 1940, Ann. Physik [5] 37, 421.
PICK, H., 1962, Optische Konstanten ausgewählter fester Stoffe, in: Landolt-Börnstein Zahlenwerte und Funktionen, Vol. II, eds. J. Bartels et al., 6th ed. (Springer-Verlag, Berlin) part 8, pp. 405-433.
PICK, H., 1965, Ergeb. Exakt. Naturw. 38, 1.
PICK, H., 1972, Structure of Trapped Electron and Trapped Hole Centers in Alkali Halides, in: Optical Properties of Solids, ed. F. Abelès (North-Holland Publishing Co., Amsterdam) p. 653.
RAUCH, C. J. and C. V. HEER, 1957, Phys. Rev. 105, 914.
RIGDEN, J. D., 1961, Phys. Rev. 121, 357.
ROCKSTAD, H., 1965, Phys. Rev. 140, A311.
ROSENFELD, L., 1951, Theory of Electrons (North-Holland Publishing Co., Amsterdam).
SCHIFF, L. I., 1968, Quantum Mechanics, 3rd ed. (McGraw-Hill, Inc., New York).
SCHULMAN, J. H. and W. D. COMPTON, 1962, Color Centers in Solids (Pergamon Press, Inc., New York).
SCOTT, A. B. and W. A. SMITH, 1951, Phys. Rev. 83, 982.
SCOTT, A. B., 1955, Nuovo Cimento Suppl. [10] 1, 104.
SCOTT, A. B. and M. E. HILLS, 1958, J. Chem. Phys. 28, 24.
SEIDEL, H. and H. C. WOLF, 1968, ESR and ENDOR Spectroscopy of Color Centers in Alkali Halide Crystals, in: Physics of Color Centers, ed. W. B. Fowler (Academic Press, Inc., New York) ch. 8.
SEITZ, F., 1938, J. Chem. Phys. 6, 150.
SEITZ, F., 1940, Modern Theory of Solids (McGraw-Hill Book Co., Inc., New York).
SEITZ, F., 1946, Rev. Mod. Phys. 18, 384, especially p. 390.
SEITZ, F., 1954, Rev. Mod. Phys. 26, 7.
SHOCKLEY, W., 1946, Phys. Rev. 70, 105.
SILSBEE, R. H., 1956, Phys. Rev. 103, 1675.
SLATER, J. C., 1951, Phys. Rev. 81, 385.
SMAKULA, A., 1930, Z. Physik 59, 603.
SMITH, D. Y., 1965a, Phys. Rev. 137, A574.
SMITH, D. Y. and G. SPINOLO, 1965b, Phys. Rev. 140, A2121.
SMITH, D. Y., 1967, unpublished calculations.
SMITH, D. Y. and D. L. DEXTER, 1968a, Bull. Am. Phys. Soc. 13, 439, Abstract DJ4.
SMITH, D. Y. and D. L. DEXTER, 1968b, International Symposium on Color Centers in Alkali Halides, Rome, 1968 (unpublished) Abstract 172.
SMITH, D. Y. and D. L. DEXTER, 1969, unpublished calculations.
SOMMERFELD, A. and H. BETHE, 1933, Elektronentheorie der Metalle, in: Handbuch der Physik, Vol. 24/2, eds. H. Geiger and K. Scheel, 2nd ed. (J. Springer, Berlin) Sec. 9-C.
SPINOLO, G. and D. Y. SMITH, 1965, Phys. Rev. 140, A2117.
STASIW, O., 1932, Nachr. Gesell. Wiss. Göttingen, II Math.-Phys. Kl. 3, No. 26.
STROZIER, J. and B. G. DICK, 1969, Phys. Stat. Sol. 31, 203.
SUGIURA, Y., 1927, J. Phys. Radium 8, 113.
TAYLOR, B. N., W. H. PARKER and D. N. LANGENBERG, 1969, Rev. Mod. Phys. 41, 375, especially Table XXXII.
TESSMAN, J. R., A. H. KAHN and W. SHOCKLEY, 1953, Phys. Rev. 92, 890.
TIMUSK, T. and W. MARTIENSSEN, 1963, Z. Physik 176, 305.
TOMIKI, T., 1959a, J. Phys. Soc. Japan 14, 1114.
TOMIKI, T., 1959b, J. Phys. Soc. Japan 14, 1243.
TOMIKI, T., 1960, J. Phys. Soc. Japan 15, 488.
TOSI, M. P. and F. G. FUMI, 1964, J. Phys. Chem. Solids 25, 45.
UNSÖLD, A., 1955, Physik der Sternatmosphären, 2nd ed. (Springer-Verlag, Berlin) p. 350.
VAN VLECK, J. H., 1932, Theory of Magnetic and Electric Susceptibility (Oxford University Press, London).
VAN VLECK, J. H., 1940, Annals N. Y. Acad. Sci. 40, 293.
WAGNER, W.-U., 1964, Z. Physik 181, 143.
WEAST, R. C., editor, 1970, Handbook of Chemistry and Physics, 51st ed. (The Chemical Rubber Co., Cleveland).
WILSON, A. H., 1953, The Theory of Metals, 2nd ed. (Cambridge University Press, London).
WITT, H., 1952, Nachr. Akad. Wiss. Göttingen, II Math.-Physik Kl., 17.
WOLF, K. L. and K. F. HERZFELD, 1928, Absorption und Dispersion, in: Handbuch der Physik, Vol. XX, eds. H. Geiger and K. Scheel (J. Springer, Berlin) p. 480.
VI
ELASTOOPTIC LIGHT MODULATION AND DEFLECTION BY
E. K. SITTIG Bell Telephone Laboratories, Incorporated, Murray Hill, N. J., USA
CONTENTS
§ 1. INTRODUCTION
§ 2. PHENOMENOLOGICAL THEORY OF ELASTOOPTICS
§ 3. MATERIALS FOR ELASTOOPTIC DEVICES
§ 4. CLASSIFICATION OF ELASTOOPTIC MODULATORS AND DEFLECTORS. GENERAL CRITERIA
§ 5. REFRACTIVE AND BIREFRINGENCE LIGHT DEFLECTION
§ 6. DIFFRACTIVE LIGHT DEFLECTION
§ 7. PIEZOELECTRIC TRANSDUCERS FOR DIFFRACTION LIGHT DEFLECTORS
§ 8. AREAS OF APPLICATION
§ 9. OUTLOOKS AND CONCLUSION
ACKNOWLEDGMENTS
BIBLIOGRAPHY
§ 1. Introduction
The observation that transparent media exhibit variations of the refractive index upon application of elastic stress dates well back to the last century, when Toepler, around 1870, made sound waves in air visible by means of his Schlieren method. That elastic stress induces changes in the birefringence of solid media was already known to Brewster in 1815. Both these phenomena are, of course, special aspects of elastooptics, which deals in general with variations induced in the refractive indicatrix of a transparent medium by stress fields. Stress birefringence, in particular, has been an engineering tool for many years. After the advent of modern electronics in the thirties elastooptics gained some new dimensions. By means of piezoelectric transducers one could produce ultrasonic waves in the frequency range above 1 MHz, inducing refractive index gradients big enough to image sound fields in, e.g., water quite readily by means of Schlieren optics. Thus, sound beams, diffractive phenomena, etc. could be demonstrated with good contrast. Patents on ultrasonic light modulation using this method began to appear around 1932. In particular, it proved possible to store a whole TV-line in a water-cell by driving the transducer attached to the cell with the TV signal. The resulting index modulation could then be Schlieren-imaged and formed the basis of a TV projection system developed at Scophony Ltd., described among others by OKOLICSANYI [1937]. Based on a prediction by BRILLOUIN [1922], DEBYE and SEARS [1932] and LUCAS and BIQUARD [1932] showed that a sound wave could also serve as a phase grating to diffract light. By varying the frequency of the sound wave, the diffraction angle could be varied accordingly. To obtain a large number of resolvable positions with this technique required a large number of “grating lines”, i.e., sound waves in the optical aperture, and their coherent illumination. Thus it required the laser as a light source and the evolution of ultrasonic technology into the frequency range beyond 100 MHz before this approach could mature into providing technologically useful devices. These “elastooptic diffractive deflectors” or “acoustooptic deflectors” have
recently reached performance levels which give them distinct advantages over other forms of nonmechanical light deflection, so that the discussion to follow will center on them. BERGMANN [1954] has reviewed the work of many authors that had contributed to an understanding of the diffraction effects from ultrasonically generated phase gratings prior to the appearance of the laser. A later review by QUATE et al. [1965] summarized diffractive phenomena with emphasis on the theoretical aspects. Indeed, up to then the field was dominated by the interest in Brillouin scattering in transparent crystals and liquids. Thus the emphasis was on diffraction from incoherent phonons at frequencies above $10^{12}$ Hz. By that time piezoelectric thin film transducers had opened up a way of readily producing coherent sound sources with frequencies beyond $10^{9}$ Hz, so that diffraction angles of several degrees could be produced, albeit with low diffraction efficiency and even so requiring large electrical input powers to the transducers. The technique of deflecting and modulating laser beams using Bragg diffraction was developed in the early sixties, notably by GORDON [1966] and by KORPEL et al. [1966]. Their papers summarize and reference their earlier work. Since then, improved materials and transducer technology have led to the attainment of high diffraction efficiency at low transducer input powers. This permits random deflections into several hundred positions within a few microseconds, so that, e.g., television displays can be realized with diffractive deflectors. Modulators with rise times of a few nanoseconds can also be obtained with this technique. In contrast to diffractive devices, refractive and birefringent elastooptic light modulators or deflectors have found little application compared with their electrooptic counterparts. The reasons for this will be discussed. They derive, in essence, from the speed limitations of sound waves as compared with electromagnetic waves.
The sections to follow will deal briefly with the tensor properties of the elastooptic effect and discuss some materials useful for application in devices. Then some aspects common to all elastooptic devices will be dealt with and the fundamental limitations of refractive and birefringent approaches will be discussed. The rest of the paper will deal with diffractive deflectors and modulators, their design, limitations and technological problems.
§ 2. Phenomenological Theory of Elastooptics
This section serves as a short account of the definitions of the electric, optic and elastic variables, the material property tensors and the constitutive relations between them that enter in the description of elastooptic effects. A
more detailed description may be found in the appropriate sections of the books by NYE [1967] and BORN and WOLF [1965], and on elasticity and piezoelectricity in the reviews by THURSTON [1964], BERLINCOURT et al. [1964] and MCSKIMIN [1964] and the references given there. A very readable account of elastic wave propagation has been given by MASON [1958]. NELSON and LAX [1971] have presented a theory of the elastooptic effect derived from classical nonlinear electrodynamics.

2.1. DIELECTRIC RELATIONS AND OPTICS IN CRYSTALS
For the following we will restrict the discussion to nonmagnetic and nonconductive materials, since the others rarely exhibit high optical transparency in the visible and near infrared range. We further assume that the media are homogeneous. The electric field E, polarization P and displacement D, all vectors, then obey the constitutive relations
$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}, \qquad \mathbf{P} = \varepsilon_0 \chi \mathbf{E}, \tag{2.1}$$
$\varepsilon_0 = 8.85$ pF/m being the free-space permittivity, and $\chi$ the susceptibility tensor. Hence in component representation and with summation over repeated subscripts assumed,
$$D_i = \varepsilon_0 \varepsilon_{ij} E_j, \qquad i, j = 1, \ldots, 3. \tag{2.2}$$
The permittivity tensor $\varepsilon_{ij}$ is, in the absence of losses, Hermitian, i.e., $\varepsilon_{ij} = \varepsilon_{ji}^*$, the asterisk indicating the complex conjugate. In media without optical activity or Faraday rotation the permittivities $\varepsilon_{ij}$ are real and hence $\varepsilon_{ij} = \varepsilon_{ji}$ (BORN and WOLF [1965] Ch. 14). The remaining six permittivities can then be reduced to three by suitable transformation of the coordinate system to yield the principal permittivities
$$\varepsilon_{ii} = \varepsilon_i; \qquad \varepsilon_{ij} = 0 \ \text{for} \ i \neq j \tag{2.3}$$
and the principal refractive indices $n_i$ via
$$n_i = \varepsilon_i^{1/2}. \tag{2.4}$$
These three components define the semiaxes of a representative ellipsoid known by various names such as “indicatrix”, “index ellipsoid” or “ellipsoid of wave normals”:
$$\frac{x_1^2}{n_1^2} + \frac{x_2^2}{n_2^2} + \frac{x_3^2}{n_3^2} = 1. \tag{2.5}$$
An incident light beam having a wave vector k within the medium can propagate with polarization directions which are obtained as the axes of an ellipse resulting from the intersection of a plane normal to k with the ellipsoid. The indicatrix can be found for all crystal classes, but in the triclinic and monoclinic systems the direction of the principal axes depends on the actual values of the six components $\varepsilon_{ij}$, and hence there can be dispersion of the axis directions as well as of the principal indices if the $\varepsilon_{ij}$ exhibit dispersion. In the classes with higher symmetry, the principal axes point along the appropriate crystallographic symmetry axes, regardless of the actual values of $\varepsilon_{ij}$. In the trigonal, tetragonal and hexagonal classes only two independent values $\varepsilon_{ij}$ exist and the indicatrix is an ellipsoid of revolution. Cubic and isotropic materials have a spherical indicatrix. Hence the intersection ellipse can, in general, degenerate into a circle for two separate directions of k, which thus define the optical axes of the “biaxial” crystal. It can do this for only one direction in the trigonal, tetragonal and hexagonal classes, which coincides with the direction of one of the principal axes in these “uniaxial” crystals. The cubic classes are optically isotropic, as are, of course, isotropic solids and fluids. NYE [1967, Ch. 14] has given a treatment of optical activity. He states that for propagation off the optical axes in uniaxial and biaxial crystals the effect of optical activity amounts to a small perturbation on the static birefringence and thus is in practice only relevant for on-axis propagation or in optically isotropic media. In that case, if $n_l$ and $n_r$ designate the refractive indices for left and right rotating circularly polarized waves and $\bar{n}$ the average index, the expression $(n_l - n_r)/\bar{n}$ is usually found to be of order [...]. Strain-induced variations of this expression cannot be expected to lead to effects comparing in magnitude with those obtained by modulating the refractive indices directly. This justifies neglect of optical activity in the discussion to follow.
The electrooptic and elastooptic effects consist in the variation of the permittivity tensor, and hence the components of the indicatrix, by applied electrical fields or elastic strains, respectively. In crystals exhibiting piezoelectricity the two effects are coupled. Before formulating constitutive relations, we need to deal with the elastic relations.

2.2. ELASTIC RELATIONS AND SOUND IN CRYSTALS
In an elastic medium a displacement $u = (u_1, u_2, u_3)$ of a point P of coordinates $x = (x_1, x_2, x_3)$ causes a displacement $u + \delta u$ of a neighboring point Q at $x + \delta x$, which can be determined within the linear approximation by the first term of a Taylor expansion. It is customary to write
$$2S_{ij} = \partial u_i/\partial x_j + \partial u_j/\partial x_i, \qquad 2\Omega_{ij} = \partial u_i/\partial x_j - \partial u_j/\partial x_i, \qquad i, j = 1, \ldots, 3 \tag{2.6}$$
in terms of which one obtains for the components of $\delta u$, summation over repeated subscripts being implied,
$$\delta u_i = (S_{ij} + \Omega_{ij})\,\delta x_j. \tag{2.7}$$
The reason for doing so is that the terms $S_{11}$, $S_{22}$, $S_{33}$ can be shown to be the extensional strains (elongations) of a cubic volume element along the $x_1$, $x_2$, $x_3$-axes, and the $S_{ij}$ for $i \neq j$ are the shear strains (angular deformations) in the planes defined by $x_i$, $x_j$. Likewise, the $\Omega_{ij}$ can be shown to be proportional to the rigid rotations around the normals to the $(x_i, x_j)$ planes. From the definitions, it is clear that the $S_{ij}$-tensor is symmetric, i.e., $S_{ij} = S_{ji}$, and that the $\Omega_{ij}$-tensor is antisymmetric, i.e., $\Omega_{ij} = -\Omega_{ji}$. To deform the volume element requires stresses, these being defined as forces per unit area acting on the surfaces of a cubic volume element. Again, we have the tensions $T_{ii}$ acting on the sides of the element and the shear stresses $T_{ij}$ acting in the directions $x_j$ on the planes whose normal points along $x_i$. Considerations of angular momentum balance require that $T_{ij} = T_{ji}$. In any crystal the constitutive stress-strain relations can be written as
$$T_{ij} = c_{ijkl} S_{kl}; \qquad i, j, k, l = 1, \ldots, 3 \tag{2.8}$$
in which the elastic tensor components $c_{ijkl}$ are constants in the linear (small deformation) approximation. Because of the symmetry of the T- and S-tensors the number of independent $c_{ijkl}$ reduces to 36 from 81, and the requirement that the energy
$$dW = c_{ijkl} S_{ij}\,dS_{kl} \tag{2.9}$$
required to strain a volume element be a perfect differential reduces this number further to 21. In order not to have to carry redundant subscripts one often uses a compressed matrix notation replacing the subscript pairs ij and kl by single subscripts m, n according to the scheme
ij = 11; 22; 33; 23,32; 31,13; 12,21
m  =  1;  2;  3;     4;     5;     6.
However, the matrix thus obtained does not transform like a tensor, but is still symmetric, i.e., $c_{mn} = c_{nm}$, which produces 21 independent constants. Their number reduces further upon application of the crystal symmetry operations, down to three for the cubic crystal system, two for an isotropic solid and one for fluids.
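The counting argument above can be checked mechanically. The following is a minimal Python sketch, not part of the original treatment; the names `VOIGT`, `contract` and `count_independent` are illustrative assumptions. It applies the contraction scheme ij to m and counts the constants remaining after each symmetry argument.

```python
# Contracted index scheme from the text: subscript pair ij -> single subscript m.
VOIGT = {(1, 1): 1, (2, 2): 2, (3, 3): 3,
         (2, 3): 4, (3, 2): 4,
         (3, 1): 5, (1, 3): 5,
         (1, 2): 6, (2, 1): 6}


def contract(i, j, k, l):
    """Map the tensor subscripts ijkl of c_ijkl to the matrix subscripts (m, n)."""
    return VOIGT[(i, j)], VOIGT[(k, l)]


def count_independent():
    """Count the constants left after each symmetry argument in the text."""
    idx = range(1, 4)
    components = [(i, j, k, l) for i in idx for j in idx for k in idx for l in idx]
    # Symmetry of the T- and S-tensors: only the contracted pair (m, n) matters.
    after_ts = {contract(*c) for c in components}
    # Perfect-differential (energy) argument: c_mn = c_nm.
    after_energy = {tuple(sorted(mn)) for mn in after_ts}
    return len(components), len(after_ts), len(after_energy)


print(count_independent())  # (81, 36, 21)
```

The three numbers reproduce the reduction from 81 to 36 to 21 constants; crystal symmetry operations, which reduce the count further, are not modeled here.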
The greater complexity of the elastic relations is also reflected in the propagation of elastic (sound) waves as compared with electromagnetic waves. Applying eqs. (2.6) and (2.8) to the stress equation of motion gives
$$\rho\,\partial^2 u_i/\partial t^2 = c_{ijkl}\,\partial^2 u_k/\partial x_j \partial x_l, \tag{2.10}$$
$\rho$ being the density and $u_i$ the displacement. One may insert a plane displacement wave of angular frequency $\omega$ propagating in the direction given by the unit wave vector $\hat{k}$ with phase velocity $v$, viz.
$$u_i = U_i \exp[j\omega(t - \hat{k} \cdot x/v)] \tag{2.11}$$
where $x$ is the position vector in the crystal coordinate system and the $U_i$ are constants. This gives
$$(c_{ijkl}\hat{k}_j\hat{k}_l - \rho v^2 \delta_{ik})\,U_k = 0 \tag{2.12}$$
where $\delta_{ik} = 1$ for $i = k$, $\delta_{ik} = 0$ for $i \neq k$. Equations (2.12) are nontrivially solvable if
$$\det(c_{ijkl}\hat{k}_j\hat{k}_l - \rho v^2 \delta_{ik}) = 0. \tag{2.13}$$
This determinant admits of three real solutions $v_1$, $v_2$, $v_3$ yielding mutually perpendicular displacement vectors, none necessarily parallel to the wave normal, so that generally the waves propagate as extraordinary rays. Only for special directions, equivalent to optical axes, do these displacements describe a purely longitudinal wave and two purely transverse ones with orthogonal polarization propagating as ordinary rays. A detailed treatment of this topic has been given by FARNELL [1961] and FEDOROV [1968]. The equivalent of the indicatrix is generally considerably more complicated than an ellipsoid.
2.3. PIEZOELECTRICITY
All crystal classes not possessing a center of symmetry, except group (432), are piezoelectric, i.e., produce an electric displacement upon application of a strain. The constitutive relations are
$$T_{ij} = c^E_{ijkl} S_{kl} - e_{kij} E_k \tag{2.14}$$
$$D_i = e_{ikl} S_{kl} + \varepsilon^S_{ik} E_k. \tag{2.15}$$
The superscript indicates which variable is held constant throughout the material in defining the elastic constants and permittivities. There are a number of equivalent representations for different choices of dependent variables, which are enumerated by BERLINCOURT et al. [1964]. Because of various inherent symmetries the second and third subscripts of the piezoelectric tensor $e_{jkl}$ can be contracted so that at most 18 independent constants can exist, which are further reduced in number by the crystal symmetries. Piezoelectricity appears in a consideration of elastooptic devices in two aspects: a) it couples electrooptic and elastooptic phenomena, b) it is the agent that permits the generation of elastic waves by the application of an electric field across a suitably oriented plate of such materials, which thus forms a transducer.
2.4. ELECTROOPTIC AND ELASTOOPTIC CONSTITUTIVE RELATIONS
Linear electrooptic and elastooptic effects result from nonlinearities in the relation between the polarization P and electric field E. Customarily one introduces the relative impermeability tensor $B = \varepsilon^{-1}$ and considers only small changes $\Delta B_{ij}$ in its components so that only the first order terms in the electric field E and strain S need be retained. One obtains thus
$$\Delta B_{ij} = r^S_{ijk} E_k + p^E_{ijkl} S_{kl} = \Delta B^S_{ij} + \Delta B^E_{ij} \tag{2.16}$$
which defines the electrooptic tensor $r_{ijk}$ and elastooptic tensor $p_{ijkl}$. The superscripts denote the variables to be held constant in defining the components, since E and S may be coupled by the piezoelectric relations eqs. (2.14) and (2.15). NELSON and LAX [1971] have formulated relations applicable in the presence of such coupling to the elastooptic effect created by sound waves. In fact, it is found (e.g., NYE [1967]) that a linear electrooptic effect only exists in crystals which are also piezoelectric. However, because of different symmetry constraints on the components this is not true for the elastooptic effect, which is thus found in nonpiezoelectric crystals, isotropic solids and even fluids. Because of intrinsic symmetries the subscripts jk in $r_{ijk}$ can be contracted to $m = 1, \ldots, 6$, leaving at most 18 components whose number is further reduced by the crystal symmetries. Traditionally, it has been assumed that $p_{ijkl}$ can also be contracted to $p_{nm}$ ($n, m = 1, \ldots, 6$) so that at most 36 independent components should exist. Hence one often finds listings using the contracted notation. NELSON and LAX [1970] have, however, shown recently on the basis of nonlinear electrodynamics in birefringent crystals that this need not be true. In their treatment the correct representation is in terms of the displacement gradients $\partial u_k/\partial x_l$, i.e., neglecting the electrooptic effect,
$$\Delta B^E_{ij} = \tilde{p}_{ijkl}\,\partial u_k/\partial x_l \tag{2.17}$$
so that the effect also depends on the rigid rotation components defined in eq. (2.6). Thus
$$\Delta B^E_{ij} = p_{ijkl} S_{kl} + p'_{ijkl}\,\Omega_{kl} \tag{2.18}$$
where $p_{ijkl}$ is symmetric, but $p'_{ijkl}$ is antisymmetric upon interchanging k and l. Experiments by NELSON and LAZAY [1970] have demonstrated the separate existence of the rotation effect by Brillouin scattering in rutile. Physically, the effect arises from the rotation of the crystal birefringence in a volume element upon application of a shear strain and is only expected to be relevant in strongly birefringent crystals traversed by transverse (shear) sound waves. In any case, because of the symmetry of $B_{ij}$, the assumption $p_{ijkl} = p_{jikl}$ remains valid and the crystal symmetries reduce the number of independent constants further. The enumeration by NYE [1967] uses fully contracted notation $p_{mn}$.
2.5. SIMPLIFIED DESCRIPTION: THE ACOUSTOOPTIC FIGURE OF MERIT
In view of the complicated interrelations among all the crystal tensor components one has to resort in most practical applications to considerable simplifications: If all tensors are fully known, one may select the sound wave vector to point along an “elastic axis” to obtain a pure longitudinal or shear wave and then select the type of sound wave, light polarization and wave vector direction to obtain a maximum effect. Having accomplished this, one can use a simplified stress-strain relation with an “effective elastic constant” $c_{\mathrm{eff}}$ defined by
$$T = c_{\mathrm{eff}} S \tag{2.19}$$
resulting in a propagation velocity $v$ given by
$$v^2 = c_{\mathrm{eff}}/\rho, \tag{2.20}$$
where $\rho$ is the density. The energy density in the stressed medium is
$$W = \int_0^S T\,dS = \tfrac{1}{2} c_{\mathrm{eff}} S^2 \tag{2.21}$$
with reference to the unstressed medium. The sound power density is then the energy propagating through a unit cross section in unit time, so that a beam of cross section $A$ and propagation velocity $v$ carries the power
$$P_s = A W v = \tfrac{1}{2} A \rho v^3 S^2. \tag{2.22}$$
Note that in the presence of piezoelectricity $c_{\mathrm{eff}}$ and $v$ also depend on the electrical boundary conditions imposed on the material. Likewise we write
$$\Delta B = \Delta(1/n^2) = pS \tag{2.23}$$
having selected the tensor components $B = B_{ij}$, $p = p_{ijkl}$ and $S = S_{kl}$ appropriately, and note that for $n = n_0 + \Delta n$ and $\Delta n \ll n_0$ taking the derivative of eq. (2.23) gives
$$\Delta n = -\tfrac{1}{2} n_0^3\,pS. \tag{2.24}$$
This index modulation, sustained over an interaction length $L$ along the light propagation direction, imparts to light of the free-space wavelength $\lambda_0$ a phase excursion $\Delta\varphi$ relative to a light beam having traveled the same distance in a medium of refractive index $n_0$. This is given by
$$\Delta\varphi = 2\pi L\,\Delta n/\lambda_0 = -\pi L n_0^3\,pS/\lambda_0. \tag{2.25}$$
Similarly one obtains for electrooptic phase modulation
$$\Delta\varphi = -\pi L n_0^3\,rE/\lambda_0. \tag{2.26}$$
These two expressions suggest a comparison of the maximum achievable phase modulation in either class of devices: In practice, $S$ is restricted to small values by the onset of nonlinearities and fatigue. Likewise dielectric breakdown limits $E$ to less than $10^7$ V/m. PINNOW [1970] points out that in most materials $p$ has values between 0.2 and 0.5, and according to KAMINOW and TURNER [1966] $r$ rarely exceeds $10^{-11}$ m/V in nonferroelectric materials. Hence $pS$ and $rE$ tend to be restricted to values below $10^{-4}$. Equation (2.25) may be specialized to a strain excursion produced by a traveling sound wave in a beam of cross section $A = L \cdot H$, $H$ being the height perpendicular to the light and sound propagation directions. With eq. (2.22), we then obtain
$$\Delta\varphi = \pi\,[2 P_s L n_0^6 p^2/(\rho v^3 H \lambda_0^2)]^{1/2}. \tag{2.27}$$
In accordance with GORDON [1966], DIXON [1967], PINNOW [1970] and others we define
$$M_2 = n_0^6 p^2/\rho v^3 \tag{2.28}$$
as the “acoustooptic figure of merit” in terms of which
$$\Delta n = (M_2 P_s/2LH)^{1/2} \tag{2.29}$$
and
$$\Delta\varphi = \pi\,(2 M_2 P_s L/H\lambda_0^2)^{1/2}. \tag{2.30}$$
The form of $M_2$, one figure of merit among several in use, indicates the high premium put on materials with a high index of refraction and low sound velocity.
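The chain from sound power to phase excursion can be cross-checked numerically. The sketch below is a hedged illustration in Python: it computes the magnitude of the phase excursion once via the strain, using eqs. (2.22) and (2.25), and once via the figure of merit, using eqs. (2.28) and (2.30). All material and geometry values are hypothetical SI numbers, and the function names are assumptions introduced here.

```python
import math


def figure_of_merit_m2(n0, p, rho, v):
    """M2 = n0**6 * p**2 / (rho * v**3), eq. (2.28)."""
    return n0**6 * p**2 / (rho * v**3)


def strain_from_power(power, length, height, rho, v):
    """Invert eq. (2.22), P_s = (1/2) A rho v**3 S**2, with A = L * H."""
    return math.sqrt(2.0 * power / (length * height * rho * v**3))


def phase_from_strain(n0, p, strain, length, wavelength0):
    """Magnitude of eq. (2.25): pi * L * n0**3 * p * S / lambda0."""
    return math.pi * length * n0**3 * p * strain / wavelength0


def phase_from_power(m2, power, length, height, wavelength0):
    """Eq. (2.30): pi * sqrt(2 * M2 * P_s * L / (H * lambda0**2))."""
    return math.pi * math.sqrt(2.0 * m2 * power * length / (height * wavelength0**2))


# Hypothetical values in SI units, for illustration only.
N0, P, RHO, V = 2.0, 0.3, 5000.0, 4000.0
POWER, L, H, LAMBDA0 = 1.0, 0.01, 0.001, 633e-9

strain = strain_from_power(POWER, L, H, RHO, V)
phi_a = phase_from_strain(N0, P, strain, L, LAMBDA0)
phi_b = phase_from_power(figure_of_merit_m2(N0, P, RHO, V), POWER, L, H, LAMBDA0)
print(strain, phi_a, phi_b)
```

Both routes give the same phase excursion, which is the content of eq. (2.27): the figure of merit collects all material constants, leaving only power, geometry and wavelength as design parameters.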
§ 3. Materials for Elastooptic Devices
3.1. HEURISTIC APPROACHES TO THE SELECTION OF MATERIALS
A guideline for finding materials having combinations of refractive index $n$, sound velocity $v$ and elastooptic constants $p$ to yield large values of $M_2$ as defined by eq. (2.28) has been published by PINNOW [1970]. In this guideline, the few cases are neglected where a large piezoelectric effect causes strong coupling between elastooptic and electrooptic effects. For the remainder, classes of materials are identified in which to search, based on the appropriate theories existing separately for $n$, $v$ and $p$. A heuristic treatment of sound velocities can be based on a simple model considering points of mass $M$ spaced a distance $a$ apart, connected with springs of stiffness constant $C$. Any appropriate text, e.g., KITTEL [1968], shows that the sound velocity well below resonance cutoff is given by
$$v = (Ca^2/M)^{1/2} \tag{3.1}$$
so that low sound velocities would be expected in substances where $M$, being interpreted as the molecular weight, is high. However, for different groups of chemical constituents $C$ and $a$ can be expected to differ because of changes in the interatomic forces. Pinnow finds a relation
$$\log(v/\rho) = -b\bar{M} + d \tag{3.2}$$
to hold quite well, where $\rho$ is the density, $\bar{M}$ the mean atomic weight (molecular weight divided by the number of atoms per molecule) and the parameters $b$ and $d$ are constants within a class of materials such as oxides, alkali halides, etc. Deviations from the relation correlate to some extent with the Mohs hardness, as does the sound absorption. This procedure predicts an average velocity in crystals which, strictly speaking, would apply to a polycrystalline aggregate. However, a method described by ANDERSON [1965] predicts such average values from crystal data and shows that anisotropic variations of sound velocity usually stay within a 25 % deviation from the average. The index of refraction can be described by summing the contributions to the dielectric polarizability in a model of oscillators of strength $s_k$ and resonance frequencies $\nu_k$ driven by a frequency $\nu$, leading to Sellmeier’s dispersion formula (BORN and WOLF [1965] p. 96)
$$n^2 - 1 = \sum_k s_k/(\nu_k^2 - \nu^2). \tag{3.3}$$
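A short numerical sketch may help to visualize the behavior of eq. (3.3). The Python fragment below evaluates the formula for a single oscillator; the strength and resonance frequency are hypothetical values chosen so that the static index equals 2, and the function name is an assumption made for this illustration.

```python
def sellmeier_index(nu, oscillators):
    """Refractive index from n**2 - 1 = sum_k s_k / (nu_k**2 - nu**2), eq. (3.3).

    oscillators is a list of (s_k, nu_k) pairs; nu must lie below every nu_k.
    """
    n_squared = 1.0 + sum(s_k / (nu_k**2 - nu**2) for s_k, nu_k in oscillators)
    return n_squared ** 0.5


# One hypothetical oscillator: resonance at 1.5e15 Hz, strength set so n(0) = 2.
MODEL = [(6.75e30, 1.5e15)]

for nu in (0.0, 5.0e14, 1.0e15):
    print(nu, sellmeier_index(nu, MODEL))
```

The index grows as the driving frequency approaches the resonance from below, so in this model a high index at a given wavelength goes together with a nearby absorption at shorter wavelength.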
Equation (3.3) immediately leads to the expectation that the maximum refractive index one can hope to use at a given light wavelength is predominantly determined by the location of the nearest shorter wavelength absorption edge. A theory by WEMPLE and DIDOMENICO [1970] relates this location to the electronic energy band model of the solid and leads to the conclusion that in the visible wavelength range no index of refraction higher than 2.5 can be expected in a material of low light absorption. A microscopic theory of the elastooptic effect can be based on the Lorentz-Lorenz relation (BORN and WOLF [1965] p. 87) for the dielectric polarizability $\alpha$ in optically isotropic materials
$$(n^2 - 1)/(n^2 + 2) = A N \alpha \tag{3.4}$$
where $N$ is the number of polarizable point ions per unit volume and $A$ a constant. Basically, a dependence on compressive strains can be expected to exist in all materials from the corresponding variation of number density $N$. In addition, the polarizability itself depends on distortions in the local electric field. An appropriate theory was worked out by MUELLER [1935], but is of limited validity in crystals of lower symmetry, for shear deformation and for covalent bonding. The recent theory by WEMPLE and DIDOMENICO [1970] relates the variation of permittivity to strain-induced variations of the electronic energy band structure via the Sellmeier model, thus also describing the dependence of the $p$-coefficients on light wavelength (dispersion). PINNOW [1970] chooses the Mueller approach: He discusses the influence of isotropic compression on the average $\bar{p} = \tfrac{1}{3}(p_{11} + p_{12} + p_{13})$, which can be separated into a term arising from an increase in packing density and one due to a concomitant change in the molecular polarizability, which he finds to be a function of $n$ and the ionicity of the crystal bonds. Because of the large intermolecular spacing in liquids and their high concomitant compressibility the packing density term prevails. In solids, in contrast, the variation of molecular polarizability prevails and this tends to be largest in ionic crystals as compared with covalent ones. For shear strains there is no change in packing density and the appropriate component $p_{44}$ arises exclusively from changes in molecular polarizability with shear strain. PINNOW [1970] has investigated and listed data on a large number of materials and classified such materials in terms of chemical composition. For the visible range, suitable candidates are found mainly among the oxides and some halides and sulfides. In the infrared, substances like GaP, Ge, As2S3 and a variety of chalcogenide glasses combine high refractive index with adequate transparency.
The material investigation then comprises the following procedure: One determines the spectral range of optical transparency, the principal refractive indices and their directions with standard optical techniques. The technique to determine the components of the elastic tensor amounts to measuring the velocities of longitudinal and orthogonally polarized transverse sound waves in various principal directions of the crystal. It is considerably more intricate than in the optical case. The elastooptic tensor components are determined for characteristic selections of sound wave propagation. A measurement of the diffraction efficiency of Bragg diffraction (see § 6) for various polarization directions of the incident and exit light beams then permits a determination of $M_2$ of eq. (2.28). An elegant method of accomplishing this has been described by DIXON and COHEN [1966]. Finally, once all relevant tensor components are known, optimum values of $M_2$ can be found by recomputation for appropriately rotated sound and light propagation directions. It is clear that the large number of constants involved militates somewhat against the use of crystals of low symmetry.
3.2. DATA OF ELASTOOPTIC MATERIALS
A comprehensive list of photoelastic materials data known at present has been prepared by PINNOW [1972], comprising older listings by DIXON [1967], SPENCER et al. [1967], REINTJES and SCHULTZ [1968] and others. A list of the forms the elastooptic matrix $p_{mn}$ assumes for the various crystal classes has been given by NYE [1967]. However, to be useful for practical applications, a material must also satisfy a number of additional conditions: (a) Light and sound absorption must be low. Apart from the obvious reasons, the internal heating produced by these at higher power levels causes optical inhomogeneity. (b) Optical damage, photochromic effects, etc. must be absent. (c) The material must have reasonable technological properties and be available with adequate optical homogeneity. A listing, taken from PINNOW [1972], of substances meeting these criteria more or less is shown in Table 1. Besides $M_2$, two other figures of merit useful in diffractive modulator design, $M_1$ and $M_3$, are listed. These are defined by
$$M_1 = M_2 n v^2; \qquad M_3 = M_2 n v \tag{3.5}$$
and are listed as multiples $M_1^*$, $M_2^*$, $M_3^*$ of the corresponding values for fused silica, which has
$M_1 = 7.89 \times 10^{-7}$ [cm² s g⁻¹]
$M_2 = 1.51 \times 10^{-18}$ [s³ g⁻¹]
$M_3 = 1.29 \times 10^{-12}$ [cm s² g⁻¹].

TABLE 1
Selected elastooptic materials (from PINNOW [1972])

Material               Point    Density     Range of       Sound wave       Sound     Wavelength   Index   M1*     M2*     M3*
                       group    ρ           transparency   polarization     velocity  of measure-  of re-
                                [g cm⁻³]    [µm]           and direction    v [mm/µs] ment [µm]    fraction
Fused silica (SiO2)             2.20        0.2-4.5        long.            5.96      0.633        1.46    1.0     1.0     1.0
                                                           shear            3.76                           1.12    0.31    0.2
Water, H2O                      1.0         0.2-0.9        long.            1.5       0.633        1.33    6.1     106     24
Water, D2O                      1.16        0.2-1.8
α-HIO3                 222      4.63        0.3-1.8        long. [001]      2.44      0.633        1.98    13.6    55.0    32
PbMoO4                 4/m      6.95        0.4-5.5        long. [001]      3.66      0.633        2.39    15.3    23.7    24.9
LiNbO3                 3m       4.7         0.5-4.5        long. [11-20]    6.57      0.633        2.20    8.3     4.6     7.5
TiO2                   4/mmm    4.26        0.45-5.5       long. [11-20]    7.86      0.633        2.58    7.9     2.6     6.2
TeO2                   422      6.12        0.35-5.0       long. [001]      4.26      0.633        2.27    18.5    22.8    25.6
                                                           shear [110]      0.617                          8.8     525     85.0
GaP                    -43m     4.13        0.6-10.0       long. [110]      6.32      0.633        3.31    75.0    29.5    69.0
                                                           shear [100]      4.13                           17.4    16.0    25.7
As2S3 glass                     3.2         0.6-11.0       long.            2.6       1.15         2.46    78.0    230     182.0
Ge33Se55As12 glass              4.4         1.0-14.0       long.            2.52      1.06         2.7     53.0    164.0   128
Ge                     m3m      5.32        2.0-20.0       long. [111]      5.50      10.6         4.0     1270    540     1380
                                                           shear [100]      3.51                           182     190     308
Te                     32       6.24        5.0-20.0       long. [11-20]    2.2       10.6         4.8     1320    2920    3550

The columns “Sound absorption at 500 MHz” and “Optical wave polarization and direction” of the original table are not reproduced, since their row assignment is not legible in the source.
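The definitions in eq. (3.5) can be checked against the reference values quoted for fused silica. The Python sketch below uses the tabulated index (1.46), longitudinal sound velocity (5.96 mm/µs, i.e. 5.96 × 10⁵ cm/s) and $M_2$ (1.51 × 10⁻¹⁸ s³/g) in cgs units; the function names are assumptions introduced for this illustration.

```python
def m1_from_m2(m2, n, v):
    """M1 = M2 * n * v**2, eq. (3.5)."""
    return m2 * n * v**2


def m3_from_m2(m2, n, v):
    """M3 = M2 * n * v, eq. (3.5)."""
    return m2 * n * v


# Fused silica, longitudinal wave, cgs units (values as quoted in the text).
N_SILICA = 1.46
V_SILICA = 5.96e5      # cm/s
M2_SILICA = 1.51e-18   # s^3/g

m1 = m1_from_m2(M2_SILICA, N_SILICA, V_SILICA)   # cm^2 s / g
m3 = m3_from_m2(M2_SILICA, N_SILICA, V_SILICA)   # cm s^2 / g
print(m1, m3)
```

The computed values agree with the quoted $M_1 = 7.89 \times 10^{-7}$ and $M_3 = 1.29 \times 10^{-12}$ to within about two percent, a difference consistent with rounding of the tabulated inputs.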
It is apparent that materials with a higher figure of merit than that of fused silica exist. Liquid water (and, for extended transmission range in the infrared, D2O) is, in addition, known to have a zero temperature coefficient of sound velocity around 71 °C. It has, however, too high a sound absorption to be useful at sound frequencies beyond 30 MHz. Lithium niobate suffers from optical damage for wavelengths shorter than red, and α-iodic acid is water soluble, even hygroscopic, and thus difficult to process and maintain. Recent development of diffractive deflectors and modulators in the visible range has, therefore, centered on lead molybdate. A discussion of its relevant properties by COQUIN et al. [1971] indicates, however, that it has rather high temperature coefficients of sound velocity and refractive index. For longitudinal sound waves along the c-axis they obtain $(1/v_0)\,dv/dT = -161$ ppm/°C, and perpendicular to the c-axis $(1/n_0)\,dn/dT = -30$ and $-18$ ppm/°C for the ordinary and extraordinary ray, respectively. This means that temperature excursions in high resolution diffractive deflectors, where $v$ and $n$ determine the deflection angle, must be restricted to a few degrees. Measurements on tellurium dioxide by UCHIDA and OHMACHI [1970] indicate that in this material a crystallographic orientation with zero temperature coefficient of the sound velocity may exist for shear waves. This material is also of particular interest because of the very high value of $M_2^*$ attainable for transverse (shear) sound waves along the [110]-direction. This is due to an uncommonly low transverse sound velocity in this direction which, in turn, goes along with a correspondingly higher sound absorption. The prospects for the evolution of materials with a high figure of merit seem thus to be tied in with tradeoffs existing between sound velocity and sound absorption as well as refractive index and spectral range of transparency.
Trying to attain higher elastooptic coefficients by the utilization of ferroelectric or ferroelastic effects is not likely to help, since these phenomena tend to exhibit hysteresis-type losses upon domain switching. Nonlinearity and high sound absorption are the likely result.
§ 4. Classification of Elastooptic Modulators and Deflectors. General Criteria
4.1. ELEMENTARY DESCRIPTION
Modulators and deflectors based on refractive index modulation can be categorized as follows: As shown in Fig. 4.1 we assume the light beam, usually of circular cross section with diameter $D$, to traverse a medium in which index modulation is maintained over an interaction length $L$ by means of a transducer of, as yet, unspecified properties. This is driven by some electrical signal. The following categories can be defined: A refractive deflector (Fig. 4.1a) would deflect the light beam via an index gradient set up in the medium, or by modulation of the refractive index of a prism. In a birefringence modulator (Fig. 4.1b) induced birefringence would rotate the polarization vector of incident light or vary the retardation between two components of it. By using suitably polarized incident light and an analyzer or Wollaston prism at the output, modulation or deflection into two positions can be obtained. In a Schlieren-optical modulator (Fig. 4.1c) the light deflected by index gradients would pass a Schlieren-stop which would otherwise block it. Finally, in a diffractive deflector (Fig. 4.1d), a periodic index modulation set up by a sound wave acts as a phase diffraction grating deflecting the incident light beam into one or several diffraction orders.
Fig. 4.1. Basic types of elastooptic light modulators and deflectors: (a) refraction, (b) birefringence, (c) Raman-Nath diffraction (Schlieren), (d) Bragg diffraction. A transducer launches a sound wave upwards into the elastooptic medium. A light beam traverses the insonified region from left to right and incurs refraction (a), a change of polarization vector (b) or diffraction (c) and (d) due to interaction with the sound wave.
The refractive and birefringent device categories mentioned have in common that the spatial phase modulation imparted by an electric or elastic field must be sensibly constant across the light beam diameter, so that the wavelength of the field must be large compared with the beam diameter. The converse is required for Schlieren and diffractive devices. By appropriate spatial filtering of the output beam, deflectors can always operate as modulators. This makes it unnecessary in many cases to discuss them separately, although details of design may differ. Some fundamental comparisons between electrooptic and elastooptic devices can be made at this point: A basic limit on the rise time of such devices is set by the transit time $\tau = D/v$ which the modulating signal takes to traverse the light beam diameter $D$ with the velocity $v$. In an electrooptic device this velocity exceeds $3 \times 10^6$ m s$^{-1}$ even in media with very high permittivity. In an elastooptic device this velocity is the sound velocity, which rarely exceeds $10^4$ m s$^{-1}$ in any known material. Thus refractive or birefringent electrooptic devices have a speed advantage of at least 300 over their elastooptic counterparts, all other things being equal. However, the same factor applies to the signal frequencies needed to obtain comparable modulation wavelengths serving as grating lines in diffractive devices to produce a given diffraction angle. Thus one can achieve diffraction angles on the order of a few degrees with technically well accessible frequencies below 1 GHz in elastooptic deflectors, for which almost any electrooptic device would require at least 300 GHz. Accordingly, development in elastooptic devices has centered on the diffractive approaches in recent years, in contrast to the situation in electrooptic devices. Some exceptions, e.g., COHEN and GORDON [1964, 1965], aimed at overcoming limits set by the transducers at the time.
4.2. SOME GENERAL RELATIONS
If a deflector can produce a deflection angle variation $\Delta\theta$ in response to an input signal variation and $\theta_{\min}$ is the minimum resolvable angle, the device can produce
$$N = \Delta\theta/\theta_{\min} \tag{4.1}$$
resolvable positions or spots. If optical distortions can be neglected, $\theta_{\min}$ is a function of the diffraction angle of the light beam traversing the device and the resolution criterion invoked for a given application. We write, therefore,
$$\theta_{\min} = R\lambda/D \tag{4.2}$$
where $\lambda$ is the wavelength inside the deflection medium, $D$ is the beam diameter and $R$ is a factor depending on the beam geometry and the resolution criterion. Since all angles inside the deflection medium and in free space are simply related by Snell's law and the wavelength inside the medium of index $n$ is $\lambda = \lambda_0/n$, $\lambda_0$ being the free-space wavelength, it suffices to consider only the variables inside the medium. Also, since most applications of elastooptic devices deal with the control of laser beams, we will henceforth only consider Gaussian beams of wavelength $\lambda$ in the medium and waist diameter $2w_0$ for the $1/e^2$ intensity contour. According to KOGELNIK and LI [1966] the beam radius $w(z)$ a distance $z$ away from the waist position is given by
$$w^2(z)/w_0^2 = 1 + (\lambda z/\pi w_0^2)^2 \tag{4.3}$$
and the full far-field diffraction angle of the $1/e^2$ contour in Fig. 4.2 is given by
$$\beta_0 = 2\lambda/\pi w_0. \tag{4.4}$$
Fig. 4.2. Definitions for a Gaussian light beam. Beam diameter $D$, waist radius $w_0$ and far-field diffraction angle $\beta$ refer to the $1/e^2$-intensity contour of a beam with circular cross section.
If we clip such a beam at the $1/e^2$ points, i.e., choose $D = 2w_0$, the full far-field diffraction angle widens to
$$\beta_1 \approx 1.83\,\lambda/D \tag{4.5}$$
at the $1/e^2$ contour. If the center of the adjacent position is placed at this angle, we have obviously $R = 1.83$ and the crosstalk intensity at the center of an adjacent position is down 8.7 dB. For TV-like displays closer spacing ($R \approx 1$) can be chosen, but for more stringent crosstalk suppression requirements $R \geq 2$ may be necessary. For more detail see RANDOLPH and MORRISON [1971]. The minimum rise time of any elastooptic device is given by the transit time $\tau$ of the deflecting signal across the beam diameter $D$,
$$\tau = D/v, \tag{4.6}$$
and is, according to Fig. 4.2, minimized if the beam waist is placed at the center of the interaction length $L$. For the particular choice
$$L = 2\pi w_0^2/\lambda \tag{4.7}$$
eq. (4.3) indicates that the beam expands to $1.414\,w_0$ at the ends of the interaction region so that $\tau \approx 3w_0/v$. Thus the relation
$$L \approx 0.7\,\tau^2 v^2/\lambda \tag{4.8}$$
represents an estimate for the upper limit on the interaction length compatible with the rise time of any elastooptic deflector.
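The steps leading from eq. (4.3) to eq. (4.8) can be retraced numerically. The following Python sketch assumes a hypothetical waist radius, wavelength in the medium and sound velocity; the function names are likewise assumptions made for illustration.

```python
import math


def beam_radius(z, w0, wavelength):
    """1/e^2 beam radius a distance z from the waist, eq. (4.3)."""
    return w0 * math.sqrt(1.0 + (wavelength * z / (math.pi * w0**2))**2)


def interaction_length(w0, wavelength):
    """L = 2 pi w0**2 / lambda, eq. (4.7)."""
    return 2.0 * math.pi * w0**2 / wavelength


def max_length_for_rise_time(tau, v, wavelength):
    """L ~ 0.7 tau**2 v**2 / lambda, eq. (4.8)."""
    return 0.7 * tau**2 * v**2 / wavelength


# Hypothetical values in SI units, for illustration only.
W0, WAVELENGTH, V_SOUND = 50e-6, 0.3e-6, 4000.0

length = interaction_length(W0, WAVELENGTH)
w_end = beam_radius(length / 2.0, W0, WAVELENGTH)   # radius at the ends of L
tau = 3.0 * W0 / V_SOUND                            # transit time, tau ~ 3 w0 / v
print(length, w_end / W0, tau, max_length_for_rise_time(tau, V_SOUND, WAVELENGTH))
```

With the waist centered, the beam radius at the ends of the interaction region is the square root of two times the waist radius, and substituting $w_0 = \tau v/3$ into eq. (4.7) gives the coefficient $2\pi/9 \approx 0.7$ of eq. (4.8).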
§ 5. Refractive and Birefringence Light Deflection
5.1. REFRACTIVE LIGHT DEFLECTION
Refractive deflection of a light beam due to index gradients was already described and investigated by LUCAS and BIQUARD [1932] and NOMOTO [1937]. It has been used for sinusoidal scanning of light beams by GIAROLA and BILLETER [1963], LIPNICK et al. [1964, 1965], AAS and ERF [1964] and others. DEMARIA and DANIELSON [1966] analyzed and demonstrated the use of cylindrical sound waves in a laser medium to obtain a time-varying optical waveguide action for mode scanning. FOSTER et al. [1970] described traveling lens arrays used to increase the resolution of diffractive scanners.
Fig. 5.1. Refractive deflection of a light beam in a constant refractive index gradient directed along the y-axis.
Since the sound velocity is negligible compared to the light velocity, ray tracing can be done as if the medium had stationary index variations. If a narrow light beam traverses a medium with spatially variable index of refraction $n(x, y, z)$ it is deflected with a radius of curvature $r$ given by (BORN and WOLF [1965] p. 124)
$$1/r = \nu \cdot \mathrm{grad} \log n \tag{5.1}$$
where $\nu$ is the unit vector normal to the propagation direction. For a plane index gradient along the y-direction in Fig. 5.1 and the light beam propagating in the z-direction, this becomes
$$\frac{1}{r} = \frac{1}{n}\frac{dn}{dy}. \tag{5.2}$$
Having traveled a distance $z$, the light beam is deflected by an angle $\theta$ given by
$$\sin\theta = \frac{z}{r} = \frac{z}{n}\frac{dn}{dy}. \tag{5.3}$$
If we vary $n$ by means of a plane sound wave of angular frequency $\omega = 2\pi f$ and wavelength $\Lambda$ traveling in the y-direction with phase velocity $v$, we have
$$n = n_0 + \Delta n \cos(\omega t - Ky); \qquad K = 2\pi/\Lambda; \qquad \Lambda = v/f. \tag{5.4}$$
Hence, after having traversed the distance $z = L$ in the deflection medium, the light beam is deflected by an angle
$$\sin\theta \approx \theta = \theta_m \sin(\omega t - Ky) \tag{5.5}$$
where $\Delta n \ll n_0$ is assumed and
$$\theta_m = KL\,\Delta n/n_0 = (\lambda_0/\Lambda)\,\Delta\varphi/n_0 = (\lambda/\Lambda)\,\Delta\varphi \tag{5.6}$$
is the peak deflection angle in terms of the light wavelength $\lambda$ in the deflection medium. The index modulation $\Delta n$ or phase modulation $\Delta\varphi$ is given by eqs. (2.29) or (2.30). For $\theta_m$ below 0.15 the small angle approximation is good to within 1 %. The light beam is therefore deflected as if a sinusoidal lens sheet with lens spacing $\Lambda$ moves by with the sound velocity $v$. Computations by DEMARIA and DANIELSON [1966] indicate that an approximation by cylinder lenses is good for a region of about $\Lambda/4$ centered on the maximum or minimum of the refractive index excursion. The focal length of these lenses would thus be
$$F = \pm(\Lambda/8)/(\theta_m \sin\tfrac{1}{4}\pi) \approx 0.177\,\Lambda/\theta_m = (\Lambda^2/22.6L)(n_0/\Delta n) \tag{5.7}$$
if $L \ll F$. If beam convergence is reached within the sound field ($L \approx F$), FOSTER et al. [1970] find
$$F = \tfrac{1}{4}\Lambda\,(n_0/\Delta n)^{1/2} \tag{5.8}$$
the change being caused by the now approximately parabolic ray paths. If one wants to use such a device to scan a light beam directionally, the focusing and defocusing concomitant with the direction change impairs the resolution. To make the refractive aberration equal to the diffraction limit of an incident coherent collimated beam of diameter $D$, equating the focusing angle $\delta\theta = D/F$ with the minimum angle of resolution $\theta_{\min}$ of eq. (4.2) yields
$$D^2 = 0.177\,R\lambda\Lambda/\theta_m \tag{5.9}$$
for $L \ll F$ as the maximum usable beam diameter. $R$ depends on the resolution criterion used. The number of resolvable spots $N$ is then
$$N = \theta_m/\theta_{\min} = \theta_m D/\lambda S, \tag{5.10}$$
the safety factor $S$ (larger than unity) depending on $R$ and the requirements for scan linearity etc. necessary for a given application. We combine eqs. (5.9) and (5.10) to get
$$N^2 = 0.177\,\theta_m\Lambda/\lambda S^2 \approx (L/\lambda)(\Delta n/n_0)/S^2. \tag{5.11}$$
As mentioned in 9 2 An/no tends to be restricted to values below and, for other practical reasons, L/Acan hardly ever exceed lo6 so that only on the order of 10 positions can be resolved with such deflectors. 5.2. BIREFRINGENT MODULATION
In analogy to electrooptic birefringence modulation, elastooptic birefringence modulation can be obtained by means of static strains, oscillatory strains or sound waves. While the response to static strains has found a place among traditional engineering methods, dynamic birefringence modulation has hardly ever been suggested for applications, presumably because of the existence of electrooptical counterparts with their higher speed potential. However, useful linear electrooptic effects can be obtained only in crystals of low symmetry, whereas elastooptic devices can be built even from isotropic materials. BONCH-BRUEVICH [1956] apparently first suggested the use of standing waves in a resonating isotropic bar for light modulation and calculated the modulation depth obtainable. KEMP [1969] suggested the use of resonance modes in rectangular glass bars. SITTIG [1970] proposed the use of plane-strain resonance modes in cylindrical bars; these had actually already been demonstrated by BERGMANN [1949], although without a general explanation, which was supplied by BOHME et al. [1960]. The relevant arguments run as follows. In an isotropic medium indicatrix and strain ellipsoid must have coinciding axis directions, with the indicatrix being a sphere in the unstrained condition. An applied uniaxial strain $S$ deforms this sphere to an ellipsoid of revolution, producing a birefringence $(n'' - n')$. A light beam traversing the material normal to the strain direction thus encounters a retardation

$$\delta = \frac{2\pi}{\lambda}(n'' - n')L = \frac{\pi}{\lambda}\,n_0^3\,(p_{11} - p_{12})\,S L \tag{5.12}$$
between the two components polarized in and normal to the strain direction respectively. If the polarization direction of the incident beam of intensity $I_i$ is inclined by an angle $\varphi$ relative to the strain direction, the intensity $I_e$ exiting from a crossed analyzer is (BORN and WOLF [1965] p. 696)

$$I_e/I_i = \sin^2 2\varphi\,\sin^2 \tfrac{1}{2}\delta = \tfrac{1}{2}(\sin^2 2\varphi)(1 - \cos\delta) \tag{5.13}$$

so that the optimum modulation is obtained for $\varphi = \tfrac{1}{4}\pi$. However, if $S$ is applied as a sound wave

$$S = S_0\,e^{j(\omega t - Ky)}, \tag{5.14}$$

$\delta$ varies across the light beam diameter $D$. For $\Lambda \gg D$ this variation can be neglected and we obtain for $\varphi = \tfrac{1}{4}\pi$

$$I_e/I_i = \tfrac{1}{2}\bigl(1 - \cos(\delta_m \cos\omega t)\bigr) \tag{5.15}$$

where

$$\delta_m = \tfrac{1}{2}(p_{11} - p_{12})\,n_0^3\,S_0\,kL. \tag{5.16}$$
In this elastostatic case $I_e/I_i = 1$ can be reached for $\delta_m \ge \pi$. The intensity modulation is nonlinear. In the general case the variation of $\delta$ across the aperture causes the modulation depth to be reduced. For a rectangular aperture eq. (5.15) has then to be replaced by its average over the aperture,

$$I_e/I_i = \frac{1}{2D}\int_{-D/2}^{D/2}\bigl[1 - \cos\delta(y, t)\bigr]\,dy. \tag{5.17}$$

Assuming a standing sound wave across the aperture, we may write $\delta(y, t) = (\delta_m \sin Ky)\sin\omega t$ and obtain, using a standard Bessel function expansion,

$$2I_e/I_i = 1 - J_0(a) - 2\sum_{q=1}^{\infty} J_{2q}(a)\,\frac{\sin(qKD)}{qKD} \tag{5.18}$$

with $a = \delta_m \sin\omega t$. If $KD$ is chosen to be an integer multiple of $\pi$, all terms in the sum vanish, so that

$$I_e/I_i = \tfrac{1}{2}\bigl(1 - J_0(a)\bigr). \tag{5.19}$$

Again, nonlinear modulation results, but the maximum modulation achievable is given by

$$I_e/I_i = 0.701 \quad\text{for}\quad a = 3.83 \tag{5.20}$$

in contrast to the elastostatic case. Because of the requirement for standing wave operation, this mode is only usable over the narrow frequency ranges of resonance. In the elastostatic case a large modulation depth is only attainable in practice for frequencies below 1 MHz because $D \ll \Lambda$ is required.
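The two limiting cases of birefringent modulation can be compared numerically. The following is a minimal sketch, not part of the original treatment: it assumes the peak transmission in the elastostatic case is $\tfrac{1}{2}(1 - \cos\delta_m)$ and in the standing-wave case with $KD$ a multiple of $\pi$ is $\tfrac{1}{2}(1 - J_0(a))$. The function names are illustrative, and $J_0$ is evaluated from Bessel's integral so that only the standard library is needed.

```python
import math


def bessel_j(order, x, steps=200):
    """J_order(x) from Bessel's integral (1/pi) * int_0^pi cos(order*t - x*sin t) dt.

    The trapezoidal rule converges very quickly here because the integrand
    extends to a smooth, even, periodic function of t.
    """
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # end points t = 0 and t = pi
    for i in range(1, steps):
        t = i * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def elastostatic_peak(delta_m):
    """Peak transmission for sound wavelength >> beam diameter (cos(wt) = 1)."""
    return 0.5 * (1.0 - math.cos(delta_m))


def standing_wave_transmission(a):
    """Transmission for a standing wave with K*D an integer multiple of pi."""
    return 0.5 * (1.0 - bessel_j(0, a))


def best_standing_wave(a_max=6.0, step=0.001):
    """Scan the drive parameter a and return (a, transmission) at the maximum."""
    best_a, best_t = 0.0, 0.0
    for i in range(int(a_max / step) + 1):
        a = i * step
        t = standing_wave_transmission(a)
        if t > best_t:
            best_a, best_t = a, t
    return best_a, best_t


if __name__ == "__main__":
    a_opt, t_opt = best_standing_wave()
    print("standing wave: max I_e/I_i = %.3f at a = %.2f" % (t_opt, a_opt))
    print("elastostatic at delta_m = pi: I_e/I_i = %.3f" % elastostatic_peak(math.pi))
```

The scan locates the first minimum of $J_0$, which is where the standing-wave transmission peaks, while the elastostatic expression reaches unity at a retardation of $\pi$.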
§ 6. Diffractive Light Deflection

In contrast to the device categories described in § 5, diffractive devices are characterized by the use of sound wavelengths short compared with the diameter of the incident light beam. The effect of the sound wave in the deflection medium is then to set up a phase diffraction grating moving with the sound velocity and with a grating spacing equal to the sound wavelength. The effect of the motion is a frequency displacement of the scattered light wave due to the Doppler effect, which is small, being on the order of the ratio of the sound velocity to the light velocity, i.e., $\approx 10^{-5}$. Inasmuch as the location of the grating lines can be considered stationary during the transit time of the light through the interaction region, the theory of diffraction from stationary phase gratings is directly applicable. In particular, a high degree of isomorphism exists with the theories of phase holograms (KOGELNIK [1967, 1969]) and of X-ray diffraction in crystals (BATTERMAN and COLE [1964]). General theories of light diffraction by ultrasonic waves have been presented by RAMAN and NATH [1935, 1936], EXTERMANN and WANNIER [1936], BHATIA and NOBLE [1953], and others. This work has been reviewed by BORN and WOLF [1965] and QUATE [1965]. A more recent treatment based on coupled wave theory was given by KLEIN et al. [1965] and KLEIN and COOK [1967]. All these treatments deal with isotropic media only. These theories were the starting points for a great deal of detailed investigation of sound propagation properties in transparent media, some of which is listed in the bibliography.

6.1. SURVEY OF THE THEORY
KLEIN and COOK [1967] started with a scalar wave equation for the electric field amplitude $E$ of a light wave propagating in a medium of spatially and temporally varying refractive index $n(y, t)$, viz.

$$\nabla^2 E = \bigl(n^2(y, t)/c^2\bigr)\,\bigl(\partial^2 E/\partial t^2\bigr) \tag{6.1}$$

where $c$ is the free-space light velocity. They write the refractive index as a Fourier expansion

$$n(y, t) = n_0 + \sum_{q=1}^{\infty} \Delta n_q \sin\bigl[q(\Omega t - Ky) + \delta_q\bigr] \tag{6.2}$$

where $\Omega$ and $K$ are the angular frequency and wave vector of a sound wave traveling in the $y$-direction (Fig. 6.1), $\Delta n_q$ and $\delta_q$ being the refractive index excursion and its phase of the $q$th Fourier component.

Fig. 6.1. Diffraction of a light beam incident under the angle $\theta_i$ to the wave plane of a sound wave. The diffraction angle of the 1st order equals $2\theta_B$. The diffraction order is indicated by an integer $p$.

The light amplitude $E$ is likewise expanded into the Fourier series

$$E = e^{j\omega t}\sum_{p=-\infty}^{\infty} E_p(z)\,\exp\bigl[j(p\Omega t - \mathbf{k}_p\cdot\mathbf{r})\bigr] \tag{6.3}$$

where for light incident at the angle $\theta_i$ against the $z$-axis

$$\mathbf{k}_p\cdot\mathbf{r} = k(z\cos\theta_i + y\sin\theta_i) + pKy \tag{6.4}$$

and $\omega$ and $k$ are angular frequency and wave number of the light. This formulation amounts to a plane-wave expansion into Fraunhofer diffraction orders with index $p$, leaving the interaction region with the diffraction angle $\theta_d$ given by $\sin\theta_d + \sin\theta_i = pK/k$ in Fig. 6.1. Introducing eqs. (6.2)-(6.4) into (6.1) they obtain, neglecting second order
and higher terms, a set of coupled difference-differential equations for the amplitudes $E_p$. For moderate sound intensities the second and higher order terms in eq. (6.2) can be neglected, so that for sinusoidal index variations

$$\frac{dE_p}{dz} + \frac{\psi}{2L}\,(E_{p-1} - E_{p+1}) = j\,\frac{pQ}{2L}\,(p - 2\alpha)\,E_p \tag{6.5}$$

is the set to be solved, subject to the boundary conditions for $z = 0$:

$$E_0(0) = E_i; \qquad E_p(0) = 0 \ \text{for}\ p \ne 0.$$

The diffraction problem is thus seen to depend on three parameters $Q$, $\alpha$ and $\psi$. In terms of the "Bragg angle" $\theta_B$ given by

$$\sin\theta_B = K/2k = \lambda/2\Lambda, \tag{6.6}$$

the incidence angle $\theta_i$ and the phase excursion $\Delta\varphi$ of eq. (2.30), these parameters are defined by

$$\psi = kL\,\Delta n/\cos\theta_i = \Delta\varphi/\cos\theta_i, \tag{6.7}$$

$$Q = K^2 L/(k\cos\theta_i) = 2KL\sin\theta_B/\cos\theta_i, \tag{6.8}$$

$$\alpha = (-k/K)\sin\theta_i = -\sin\theta_i/(2\sin\theta_B). \tag{6.9}$$
Physically, $\psi$ can be interpreted as a measure of the phase excursion imparted to the light wave upon traversing the interaction length $L$, and $\alpha$ as a measure of the incidence angle in units of the Bragg angle. The parameter $Q$ can be interpreted according to ADLER [1967] as a measure of the amount a light beam with a width equal to $\tfrac{1}{2}\Lambda$ spreads by diffraction over the interaction length $L$. Thus if $Q \ll 1$, this diffraction spread is small and the "thin" phase grating approximation holds. For $Q \gg 1$, one has a "thick" phase grating in the terminology of holography. It turns out that the cases $Q \ll 1$ and $Q \gg 1$ can be dealt with analytically, whereas $Q \approx 1$ has to be treated by computational procedures. For $Q \ll 1$ Klein and Cook reestablished a result originally found by RAMAN and NATH [1935], namely that

$$E_p = E_i \exp\!\left(-j\,\frac{pQ\alpha z}{2L}\right) J_p\!\left(\psi\,\frac{\sin(Q\alpha z/2L)}{\tfrac{1}{2}Q\alpha}\right), \tag{6.10}$$

$J_p$ being the Bessel function of order $p$. We define as the diffraction efficiency $\eta_p$ for the $p$th order the power exiting in that order divided by the incident power in the absence of light absorption in the medium, viz.

$$\eta_p = |E_p|^2/|E_i|^2 = J_p^2\bigl(\psi \sin[\tfrac{1}{2}Q\alpha]/[\tfrac{1}{2}Q\alpha]\bigr). \tag{6.11}$$
Thus a series of positive and negative diffraction orders ensues. The diffraction efficiency is largest in the first order and attained with a minimal $\Delta\varphi$ for $\alpha = 0$, i.e., zero incidence angle $\theta_i$. However, $\eta_1 \le 0.339$, the maximum being reached when the argument of $J_1$ equals 1.84. Thus, the "Raman-Nath" case is not very advantageous for light deflection where one wants to recover most of the incident beam in one diffracted order. However, for Schlieren-optical light modulation, where all diffracted orders can be recollimated or removed by spatial filtering, full modulation is obtainable in either the zeroth order or all higher diffraction orders combined. For $Q \gg 1$ Klein and Cook rederived a result already obtained by PHARISEAU [1956], namely that in the vicinity of $\alpha = \tfrac{1}{2}$, i.e., for light incident at the Bragg angle, one obtains

$$\eta_1 = 1 - \eta_0 = \bigl[(\psi/2\sigma)\sin\sigma\bigr]^2 \tag{6.12}$$

with

$$\sigma = \tfrac{1}{4}\bigl[Q^2(1 - 2\alpha)^2 + (2\psi)^2\bigr]^{1/2}. \tag{6.13}$$

For exact incidence under the Bragg angle, $\alpha = \tfrac{1}{2}$ or $\theta_i = \theta_B$, one has

$$\eta_1 = \sin^2 \tfrac{1}{2}\psi, \tag{6.14}$$

and all incident power can be deflected into the first order for $\psi = \pi$, the corresponding sound power being given by eq. (2.30). For this reason, recent development of diffractive light deflectors has tended to concentrate on this "Bragg diffraction" and so have more recent treatments and reviews such as those by COHEN and GORDON [1965], GORDON [1966] and DIXON [1970]. If $\alpha \ne \tfrac{1}{2}$, i.e., $\theta_i \ne \theta_B$, the factor $\psi/2\sigma$ becomes less than unity for all values of $\psi$ so that it is not possible to reach $\eta_1 = 1$, and the reduction of the maximum obtainable depends on $Q^2(1 - 2\alpha)^2$, so that the angular selectivity of the diffraction efficiency increases with $Q$. Klein and Cook also dealt with the transitional range $Q \approx 1$ by numerical computation. This range is characterized by decreased angular selectivity of the Bragg incidence condition and the emergence of a multiplicity of diffraction orders. For the diffraction efficiency of the two first orders ($p = \pm 1$) in the range $1 \le Q \le 10$ and for small values of $\theta_i$ and $\psi$, an expression derived by RAO and MURTY [1958] again gives eq. (6.12), but with

$$\sigma = \tfrac{1}{4}Q(1 - 2p\alpha), \qquad p = \pm 1 \tag{6.15}$$

which for $p = 1$ is seen to be a subcase of eq. (6.13). This case has been discussed in detail by PARKS [1969].
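The thin-grating and thick-grating limits can be contrasted with a short numerical sketch. It is illustrative only and rests on stated assumptions: the Raman-Nath efficiency is taken as $J_p^2(\psi\,\sin[\tfrac{1}{2}Q\alpha]/[\tfrac{1}{2}Q\alpha])$, and the Bragg-regime efficiency as $[(\psi/2\sigma)\sin\sigma]^2$ with $\sigma = \tfrac{1}{4}[Q^2(1-2\alpha)^2 + (2\psi)^2]^{1/2}$, the form that reduces to $\sin^2\tfrac{1}{2}\psi$ at $\alpha = \tfrac{1}{2}$. Function names are hypothetical.

```python
import math


def bessel_j(order, x, steps=200):
    """J_order(x) from Bessel's integral, evaluated with the trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # end points t = 0 and t = pi
    for i in range(1, steps):
        t = i * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def raman_nath_efficiency(p, psi, q, alpha):
    """Efficiency of order p for a thin grating (Q << 1)."""
    u = 0.5 * q * alpha
    sinc = 1.0 if abs(u) < 1e-12 else math.sin(u) / u
    return bessel_j(p, psi * sinc) ** 2


def bragg_efficiency(psi, q, alpha):
    """First-order efficiency for a thick grating (Q >> 1) near alpha = 1/2.

    Assumed form of sigma: 0.25 * sqrt(Q^2 (1 - 2 alpha)^2 + (2 psi)^2).
    """
    sigma = 0.25 * math.sqrt((q * (1.0 - 2.0 * alpha)) ** 2 + (2.0 * psi) ** 2)
    if sigma == 0.0:
        return 0.0
    return (psi / (2.0 * sigma) * math.sin(sigma)) ** 2


if __name__ == "__main__":
    # Thin grating, normal incidence: the first order cannot exceed J_1^2 at its maximum.
    print("Raman-Nath, psi = 1.84: eta_1 = %.4f" % raman_nath_efficiency(1, 1.84, 0.0, 0.0))
    # Thick grating: full deflection at exact Bragg incidence, less when detuned.
    for alpha in (0.50, 0.45, 0.40):
        print("Bragg, Q = 20, alpha = %.2f: eta_1 = %.4f"
              % (alpha, bragg_efficiency(math.pi, 20.0, alpha)))
```

Evaluated this way, the Raman-Nath first order peaks at about 0.339 for an argument near 1.84, whereas the Bragg expression reaches unity at $\psi = \pi$ and falls off as $\alpha$ departs from $\tfrac{1}{2}$, the faster the larger $Q$.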
6.2. THE DESIGN OF BRAGG DEFLECTORS AND MODULATORS
Detailed treatments of devices using Bragg diffraction have been given by GORDON [1966], KORPEL et al. [1966] and MAYDAN [1970], the latter concentrating on modulators with rise times in the nanosecond range. If $Q$ is chosen large enough ($> 10$) to make eq. (6.12) valid, it follows from eq. (6.13) that the range of incidence angles which allow substantial diffraction efficiency is small compared with the Bragg angle. The diffraction of an incident plane light wave of frequency $\omega_i$, wavelength $\lambda_i$, and wave vector $\mathbf{k}_i$ by a plane sound wave of frequency $\Omega$, wavelength $\Lambda$ and wave vector $\mathbf{K}$ may then be described by the "energy-momentum" relations

$$\omega_d = \omega_i \pm \Omega; \qquad \mathbf{k}_d = \mathbf{k}_i \pm \mathbf{K} \tag{6.16}$$

for the diffracted light wave $(\omega_d, \mathbf{k}_d)$. Since $\omega_d/\omega_i = 1 \pm \Omega/\omega_i \approx 1$, dispersion can be neglected and one has in an isotropic medium $k_d \approx k_i = k$ and hence from Fig. 6.2 the Bragg condition

$$\sin\theta_i = \sin\theta_d = \sin\theta_B = K/2k = \lambda/2\Lambda = \lambda f/2v, \tag{6.17}$$

and since $\sin\theta_B$ has to be less than unity, $f$ is restricted to frequencies less than

$$f_{\max} = 2v/\lambda. \tag{6.18}$$

Fig. 6.2. Bragg diffraction of plane light waves by a plane sound wave in an isotropic medium. The wave vectors of the sound wave, the incident light wave and diffracted light wave are $\mathbf{K}$, $\mathbf{k}_i$, and $\mathbf{k}_d$ respectively.
For small angles, $\sin\theta_B \approx \theta_B$ is seen to vary linearly with $\lambda$ or $f$, which suggests that a monochromatic, collimated light beam can be deflected by a variable angle upon varying the sound frequency $f$. However, a consequence of the initial assumptions is that only those directional components of the light and sound beam interact which fulfill the Bragg condition rigorously. If the minimum resolution angle $\theta_{\min}$ of the incident beam, given by eq. (4.4) or (4.3), is small compared with $\theta_B$, it is only possible to vary $\theta_d$ over a range $\Delta\theta_d$ with constant incidence angle $\theta_i$ if the sound beam either contains appropriate direction components over this range or is steered to maintain $\theta_i = \theta_d$. The latter case will be dealt with later. With the definitions of § 4.2 and eq. (4.2), we obtain for the number of resolvable positions obtainable with the deflector upon varying the sound frequency over the range $\Delta f$ a relation originally due to GORDON and COHEN [1965], viz.

$$N = \Delta\theta_d/\theta_{\min} = \Delta f\,\tau/R, \tag{6.19}$$

$R \ge 1$ depending on the resolution criterion chosen for a particular application. It is undesirable to operate any deflector over more than a 2 : 1 frequency range, in order to avoid spurious spots within the deflection range which are produced by harmonics of the drive source or residual higher diffraction orders. Hence $\Delta f$ ought to be centered around a frequency $f_0$ for which

$$f_0 \ge 1.5\,\Delta f. \tag{6.20}$$
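A small worked example may help fix the magnitudes in eqs. (6.17), (6.19) and (6.20). The sketch below is illustrative only: the sound velocity, aperture, bandwidth and light wavelength are assumed round numbers, not data from this chapter, and the transit time is approximated as $D/v$ (that is, $\cos\theta_i \approx 1$).

```python
import math


def bragg_angle(wavelength_in_medium, frequency, sound_velocity):
    """Bragg angle from sin(theta_B) = lambda * f / (2 v)."""
    return math.asin(wavelength_in_medium * frequency / (2.0 * sound_velocity))


def resolvable_spots(bandwidth, transit_time, r=1.0):
    """Number of resolvable positions N = delta_f * tau / R."""
    return bandwidth * transit_time / r


def min_center_frequency(bandwidth):
    """Lowest center frequency that keeps the band within one octave."""
    return 1.5 * bandwidth


if __name__ == "__main__":
    v = 4000.0             # assumed sound velocity, m/s
    aperture = 10e-3       # assumed light beam width D, m
    delta_f = 100e6        # assumed drive bandwidth, Hz
    lam = 0.6328e-6 / 2.0  # assumed wavelength in a medium of index 2, m

    tau = aperture / v                 # transit time of sound across the beam
    f0 = min_center_frequency(delta_f)
    f_lo, f_hi = f0 - 0.5 * delta_f, f0 + 0.5 * delta_f

    print("tau = %.2f us, N = %.0f spots" % (tau * 1e6, resolvable_spots(delta_f, tau)))
    print("band %.0f-%.0f MHz, ratio %.1f : 1" % (f_lo / 1e6, f_hi / 1e6, f_hi / f_lo))
    print("Bragg angle at f0 = %.2f mrad" % (1e3 * bragg_angle(lam, f0, v)))
```

With $f_0 = 1.5\,\Delta f$ the band edges are $\Delta f$ and $2\Delta f$, which is exactly the 2 : 1 limit named in the text; any lower center frequency would let second harmonics fall inside the scan range.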
Fig. 6.3. Bragg diffraction of a light beam over the angular range $2(\theta_H - \theta_L)$ by a sound beam emanating from a transducer of length $L$, tilted by an angle $\theta_1$ against the direction of the incident light beam. To obtain equal directional response of the transducer at the angles $\theta_H$ and $\theta_L$ requires $\theta_1 \ne \theta_0$.
Since we want to operate with a fixed incidence angle $\theta_i$ and fixed sound propagation direction, the Bragg condition $\theta_d = \theta_i$ can only be met by the transducer emitting sound in all directions bisecting the angle between the incident and deflected beams. If the light enters along the $z$-axis in Fig. 6.3, and $\theta_1$ designates the tilt angle of the transducer normal from the $y$-axis, the directional response of a rectangular transducer of width $L$ at the angle $(\theta - \theta_1)$ from the transducer normal in the $yz$-plane is given by

$$P_s(\theta)/P_s(\theta_1) = \sin^2 x/x^2 \tag{6.21}$$

where

$$x = \tfrac{1}{2}KL\sin(\theta - \theta_1) \approx \pi L(\theta - \theta_1)/\Lambda. \tag{6.22}$$

This response is down 1 dB from the maximum at $\theta = \theta_1$ for $x_e = x_1 = \pm 0.82$ and down 3 dB for $x_e = x_3 = \pm 1.39$. We desire that for the frequencies $f_L$ and $f_H$ at the ends of the passband

$$\pi L(\theta_H - \theta_1)/\Lambda_H = x_e \quad\text{and}\quad \pi L(\theta_L - \theta_1)/\Lambda_L = -x_e. \tag{6.23}$$

Solving these equations for $L$ and $\theta_1$, having made use of eq. (6.17) and writing

$$f_L = f_0 - \tfrac{1}{2}\Delta f \quad\text{and}\quad f_H = f_0 + \tfrac{1}{2}\Delta f, \tag{6.24}$$

we obtain eqs. (6.25) and (6.26) for $L$ and $\theta_1$ respectively. Since from eq. (6.20) $\Delta f/f_0$ is always less than $\tfrac{2}{3}$, the term $\Delta f^2/4f_0^2$ can be neglected in eq. (6.26). Choosing $x_e = x_1$ gives thus

$$L \approx v^2/(\lambda f_0\,\Delta f). \tag{6.27}$$
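The band-edge conditions (6.23) can be solved in closed form and checked numerically. The sketch below is a reconstruction under stated assumptions rather than a quotation of eqs. (6.25) and (6.26): it uses the small-angle Bragg relation $\theta = \lambda f/2v$ and $\Lambda = v/f$, which lead to $\theta_1 = \lambda(f_H^2 + f_L^2)/[2v(f_H + f_L)]$ and $L = 2x_e v^2 (f_H + f_L)/[\pi\lambda (f_H - f_L) f_H f_L]$. The numerical inputs are assumed illustrative values.

```python
import math


def single_transducer(wavelength, v, f0, delta_f, x_e=0.82):
    """Solve the two band-edge conditions for transducer length L and tilt theta_1.

    Assumes the small-angle Bragg relation theta = wavelength * f / (2 v).
    """
    f_lo = f0 - 0.5 * delta_f
    f_hi = f0 + 0.5 * delta_f
    theta_1 = wavelength * (f_hi ** 2 + f_lo ** 2) / (2.0 * v * (f_hi + f_lo))
    length = (2.0 * x_e * v ** 2 * (f_hi + f_lo)
              / (math.pi * wavelength * (f_hi - f_lo) * f_hi * f_lo))
    return length, theta_1


def response_argument(length, theta_1, wavelength, v, f):
    """x = pi * L * (theta - theta_1) / Lambda evaluated at frequency f."""
    theta = wavelength * f / (2.0 * v)
    return math.pi * length * (theta - theta_1) * f / v


if __name__ == "__main__":
    lam, v = 0.3e-6, 4000.0     # assumed wavelength in the medium (m), sound velocity (m/s)
    f0, delta_f = 150e6, 100e6  # assumed center frequency and bandwidth (Hz)

    length, theta_1 = single_transducer(lam, v, f0, delta_f)
    theta_0 = lam * f0 / (2.0 * v)
    approx = v ** 2 / (lam * f0 * delta_f)

    print("L = %.2f mm (simple estimate %.2f mm)" % (length * 1e3, approx * 1e3))
    print("theta_1/theta_0 = %.3f" % (theta_1 / theta_0))
    for f in (f0 - 0.5 * delta_f, f0 + 0.5 * delta_f):
        print("x at %.0f MHz = %+.3f" % (f / 1e6, response_argument(length, theta_1, lam, v, f)))
```

For the widest permitted band, $\Delta f/f_0 = \tfrac{2}{3}$, this exact solution gives $\theta_1/\theta_0 = 1 + \Delta f^2/4f_0^2 \approx 1.11$ and a length about 17 % above the simple estimate $v^2/(\lambda f_0 \Delta f)$, which indicates the size of the neglected term.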
From eq. (6.26), we deduce that the response peak does not occur exactly at $f = f_0$ but at the slightly higher frequency $f_1$ given by $f_1/f_0 = \theta_1/\theta_0$. There remains the determination of the sound power $P_s$ required to obtain a desired diffraction efficiency at band center. This one obtains from eqs. (6.14), (6.7) and (2.30) in the form

$$\eta = \sin^2 (P_s/P_0)^{1/2} \tag{6.28}$$

where for light of the free-space wavelength $\lambda_0 = n\lambda$, and $H$ being the sound beam height normal to the $yz$-plane,

$$P_0 = 2H\lambda_0^2\cos^2\theta_i/(\pi^2 L M_2). \tag{6.29}$$
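The power relations can be made concrete with a brief sketch. It assumes the forms $\eta = \sin^2(P_s/P_0)^{1/2}$ and $P_0 = 2H\lambda_0^2\cos^2\theta_i/(\pi^2 L M_2)$; the figure of merit used for fused silica, $M_2 \approx 1.51\times10^{-15}$ s³/kg, is a commonly quoted value introduced here as an assumption, and the function names are illustrative.

```python
import math


def reference_power(lambda0, height, length, m2, theta_i=0.0):
    """P_0 = 2 H lambda0^2 cos^2(theta_i) / (pi^2 L M_2), in watts for SI inputs."""
    return 2.0 * height * lambda0 ** 2 * math.cos(theta_i) ** 2 / (math.pi ** 2 * length * m2)


def diffraction_efficiency(p_sound, p_ref):
    """eta = sin^2( sqrt(P_s / P_0) )."""
    return math.sin(math.sqrt(p_sound / p_ref)) ** 2


if __name__ == "__main__":
    m2_fused_silica = 1.51e-15  # s^3/kg, assumed literature value
    # Square sound beam (H = L) at the He-Ne wavelength.
    p0 = reference_power(0.6328e-6, 1e-3, 1e-3, m2_fused_silica)
    print("P_0 for H = L in fused silica: %.1f W" % p0)
    for ratio in (0.25, 1.0, (math.pi / 2.0) ** 2):
        print("P_s/P_0 = %.2f -> eta = %.3f" % (ratio, diffraction_efficiency(ratio * p0, p0)))
```

Under these assumptions a square sound beam in fused silica needs roughly 54 W for $P_0$, about 71 % of the light is deflected at $P_s = P_0$, and complete deflection requires $P_s/P_0 = (\pi/2)^2 \approx 2.5$.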
This expression is conveniently normalized to $\lambda_0 = 0.6328\ \mu\text{m}$ and $M_2^*$ as listed in Table 1 to give $P_0$ in watt as

$$P_0 = 54\,(\lambda_0[\mu\text{m}]/0.6328)^2\,H\cos^2\theta_i/(L M_2^*). \tag{6.30}$$

$P_0$ in eq. (6.29) can first be combined with eqs. (6.19), (6.20) and (6.27):

$$P_0 = 3H\lambda_0^3\cos^2\theta_i\,N^2R^2/(\pi^2\tau^2 M_1) \quad\text{with}\quad M_1 = M_2\,n_0 v^2. \tag{6.31}$$

For a light beam with circular cross section $H = D$, and upon noting that the transit time $\tau$ is related to $D$ by

$$D/\tau = v\cos\theta_i, \tag{6.32}$$

eq. (6.31) turns into

$$P_0 = 3\lambda_0^3\cos^3\theta_i\,N^2R^2/(\pi^2\tau M_3) \quad\text{with}\quad M_3 = M_2\,n_0 v. \tag{6.33}$$

Thus, the figures of merit $M_1$ and $M_3$ already mentioned in § 3.2 should be maximized for deflectors with given $N$ and $\tau$ designed according to the preceding relations with elliptical ($D \ne H$) or circular ($D = H$) light beam cross section respectively, as pointed out by GORDON [1966]. Inspection of eq. (6.28) shows that at $P_s = P_0$ about 71 % of the incident light is deflected, but that for $P_s > P_0$ the relation between $\eta$ and $P_s$ becomes markedly nonlinear: to reach $\eta = 1$ requires $P_s/P_0 \approx 2.5$. Where thermal and signal distortion are of concern, this fact may make it inadvisable to use diffraction efficiencies exceeding 70 %.

Bragg modulators can be regarded as deflectors with a single resolvable spot corresponding to $N = 1$ in eq. (6.19), so that $\tau = R/\Delta f$ and the center frequency $f_0$ is still given by (6.20). However, $R$ must be chosen large enough to maintain a separation between the zero order and diffracted light beams adequate for the desired on-off ratio. In addition, for rise times approaching the nanosecond region the oblique traversal of the light beam at the Bragg angle through the sound field noticeably lengthens the rise time depending on the transducer length $L$, limiting the maximum usable $L$. This leads to high required sound power densities via eq. (6.29). Also, in attempting to make $D$ small, the increasing diffraction spread of the light beam limits the usable interaction length via eq. (4.22). Finally, the requirement to match the sound beam cross section to the dimensions of the interaction region makes the diffraction angles of the sound and light beams comparable, so that the condition $Q \gg 1$ of eq. (6.8) is violated, leading to a somewhat reduced diffraction efficiency. MAYDAN [1970] has treated this case in detail and finds the performance of such modulators at optimum if

$$1 \le 2\lambda L/(\pi w_0 \Lambda) \le 2 \tag{6.34}$$
which yields a rise time $t_r$ (10-90 %) for the optical signal of

$$t_r \approx 1.7\,w_0/v, \tag{6.35}$$

$w_0$ being the waist radius of a Gaussian beam. The center frequency $f_0$ should then be

$$f_0 \ge 2/t_r. \tag{6.36}$$

Maydan also points out that the high power densities resulting from eq. (6.29) for such modulators can be obtained by cylindrically focused sound beams of focal length $F_s$. However, this method introduces an additional latency time $t_l = F_s/v$ for the sound signal to reach the focus.

6.3. REDUCTION OF SOUND POWER, BEAM STEERING
The power $P_0$ stipulated by eq. (6.33) or (6.29) becomes very high if short rise times or large numbers of resolvable spots are to be obtained. The concomitant increase of power density leads to technological problems caused mainly by excessive heating due to losses. Hence it is desirable to find means of reducing the sound power and power density. This can be accomplished in several ways. A reduction of sound power below the value given by eq. (6.33) results from making $H < D$, with a proportional reduction of $P_0$ because of the reduced transducer area. This means using light beams of elliptical cross section, which can be obtained with cylindrical lens telescopes or the beam expansion prisms described by GIRES [1969]. The limit is set by the widening diffraction angle of the sound beam, which causes part of the sound power to bypass the interaction region. A further reduction is obtainable by having the light beam traverse several times an appropriately foreshortened sound beam. Such reentrant deflectors have been described by FELDMAN et al. [1971]. Neither of these measures, however, lowers the power density $P_0/HL$ in the sound wave. To this end $L$ must be increased beyond the value set by eq. (6.27). When this is done, eq. (6.22) shows that the angular bandwidth is reduced accordingly, but eq. (6.29) indicates that the response is increased at band center. The bandwidth reduction can be compensated for to some extent by driving the transducer harder at the band edges than at midband, thus "equalizing" the overall response. In this way $L$ can be approximately doubled and the power density reduced correspondingly before more power is required at the band edges than would have been required with the original transducer.

A larger improvement is obtained by steering the sound beam to maintain the Bragg angle over a larger frequency range, as described by KORPEL [1966] and GORDON [1966]. In Korpel's approach the transducer consists of $N_e$ rectangular elements of width $L_e$, spaced a distance $s$ on a face inclined by the angle $\chi$ against the direction of the incident light beam as shown in Fig. 6.4a. Each subarray, spaced at $2s$, produces its $p$th diffraction order at an angle $\gamma$ which varies with frequency and thus is used to track the Bragg angle. Driving the two subarrays in phase cancels all odd orders, driving them in opposite phase cancels all even orders. Blazing the array as shown in Fig. 6.4b, i.e., tilting the transducer elements to turn their directivity maxima into the desired diffraction order, then maximizes the sound power. The condition that the $p$th diffraction order of the array track the Bragg angle $\theta$ is then that the steering error $\zeta$ vanish, i.e.,

$$\zeta = \gamma + \theta - \chi = 0 \tag{6.37}$$

with

$$\sin\gamma = p\Lambda/2s; \qquad \sin\theta = \lambda/2\Lambda. \tag{6.38}$$

Fig. 6.4. Diffractive beam steering of the sound beam to maintain the Bragg angle over a finite frequency range. The transducer consists of an array of elements of length $L_e$ and spacing $s$ on a plane tilted by the angle $\chi$ against the direction of the incident light. In (b) the array is blazed to obtain maximum directional response from the transducer elements in the desired diffraction order.

Because of their inverse dependence on the sound frequency $f = v/\Lambda$, these
angles cannot track over an extended frequency range. However, exact coincidence for two frequencies, $f_1$ and $f_2$, with wavelengths $\Lambda_1$ and $\Lambda_2$, can be achieved by suitable choice of $s$ and $\chi$. We normalize the frequencies to a center frequency $f_c$ of wavelength $\Lambda_c$, not necessarily identical with the desired band center $f_0$ of eq. (6.20), so that

$$F = f/f_c; \qquad F_1 = f_1/f_c; \qquad F_2 = f_2/f_c. \tag{6.39}$$

Writing eq. (6.37) for $F = F_1$ and $F_2$ and solving in the small angle approximation for $s$ and $\chi$, one obtains, $\theta_c = \lambda/2\Lambda_c$ being the Bragg angle at $f_c$,

$$s = p\Lambda_1\Lambda_2/\lambda = p\Lambda_c/2\theta_c F_1 F_2 \tag{6.40}$$

$$\chi = \tfrac{1}{2}\lambda(1/\Lambda_1 + 1/\Lambda_2) = \theta_c(F_1 + F_2). \tag{6.41}$$

Blazing the array means turning the individual transducers by the angle $\gamma_c$ so that their normals point in the direction of the chosen diffraction order for $f = f_c$. The resulting step height is then

$$h = s\sin\gamma_c \approx s\gamma_c = \tfrac{1}{2}p\Lambda_c. \tag{6.42}$$

Inserting the values of $s$ and $\chi$ thus determined back into eq. (6.37) one finds the steering error $\zeta$ to be

$$\zeta = \theta_c(F_1F_2/F + F - F_1 - F_2) \tag{6.43}$$

which duly vanishes for $F = F_1$ and $F_2$. It remains now to find out how the directional response of the array depends on $\zeta$ to obtain the variation of the diffraction efficiency over the passband of the deflector. The normalized directional response of such an array in the far field is given by (BORN and WOLF [1965] p. 404)

$$\left(\frac{\sin x}{x}\right)^2\left(\frac{\sin N_e y}{N_e \sin y}\right)^2 \tag{6.44}$$

with

$$x = \pi L_e(\theta - \theta_c)/\Lambda = \pi L_e\theta_c F(F - 1)/\Lambda_c \tag{6.45}$$

$$y = \pi s\zeta/\Lambda = \tfrac{1}{2}\pi p\,[1 + F(F - F_1 - F_2)/F_1F_2]. \tag{6.46}$$

However, $L_e$ cannot exceed $s$, and writing $L_e = rs$ we get with eq. (6.40)

$$x = \pi r p F(F - 1)/2F_1F_2. \tag{6.47}$$
In practice we desire $r$ to be as near unity as is technically possible in order to minimize the sound power density emanating from the transducer array. The total length of the array is $L_a = N_e s$ and is now to be compared with the length $L$ of a single transducer given by eq. (6.25) as

$$L = \frac{x_e\Lambda_c}{\pi\theta_c}\,\frac{F_2 + F_1}{(F_2 - F_1)F_1F_2} \tag{6.48}$$

in our present notation, where $x_e$ determines the drop-off at the band edges $F_1$ and $F_2$. Thus

$$L_a/L = pN_e\pi(F_2 - F_1)/x_e(F_2 + F_1) \tag{6.49}$$

is the gain in total transducer length using the array, versus a single transducer of length $L$. It remains to determine the values $F_1$ and $F_2$ to produce the bandpass shape desired for a particular application. Trial computations of eq. (6.44) for a variety of values for $p$, $N_e$, $F_1$ and $F_2$, some of which are shown in Fig. 6.5, indicate that for a desired bandwidth of $\Delta f/f_c \approx \tfrac{2}{3}$ with a 3 dB drop-off at the band edges, values of $pN_e$ up to about 12 can be used.

Fig. 6.5. Normalized response of a beam-steered transducer array with $p = 2$, $F_1 = 0.78$, $F_2 = 1.18$ and a single transducer ($p = 0$) with $x_e = 0.8$ at $F_2 = 1.18$, plotted against the normalized frequency $f/f_c$. Parameters are the number of elements $N_e$ and the diffraction order $p$. $F_1$ and $F_2$ were chosen to produce an approximately symmetric response with a 3 dB drop at $f/f_c = 0.67$ and 1.3.

Inserting the appropriate values into eq. (6.49) yields then $L_a/L \approx 10$ as the transducer length advantage to be gained. PINNOW [1971] obtained a similar result by a somewhat different route.
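The trial computations mentioned above are easy to repeat. The sketch below is illustrative and assumes the normalized forms $\zeta/\theta_c = F_1F_2/F + F - F_1 - F_2$, $x = \pi r p F(F-1)/2F_1F_2$, $y = \tfrac{1}{2}\pi p[1 + F(F - F_1 - F_2)/F_1F_2]$, and the far-field array response $(\sin x/x)^2(\sin N_e y/N_e \sin y)^2$; the parameter values $p = 2$, $F_1 = 0.78$, $F_2 = 1.18$ follow Fig. 6.5, while $N_e = 6$ and $r = 1$ are assumed for the example.

```python
import math


def steering_error(f_norm, f1, f2):
    """Steering error in units of theta_c; zero at the two tracking frequencies."""
    return f1 * f2 / f_norm + f_norm - f1 - f2


def array_response(f_norm, f1, f2, p, n_e, r=1.0):
    """Normalized far-field response of a blazed, stepped transducer array."""
    x = math.pi * r * p * f_norm * (f_norm - 1.0) / (2.0 * f1 * f2)
    y = 0.5 * math.pi * p * (1.0 + f_norm * (f_norm - f1 - f2) / (f1 * f2))
    element = 1.0 if abs(x) < 1e-12 else (math.sin(x) / x) ** 2
    sin_y = math.sin(y)
    if abs(sin_y) < 1e-9:
        grating = 1.0  # limit of (sin(N y) / (N sin y))^2 at a principal maximum
    else:
        grating = (math.sin(n_e * y) / (n_e * sin_y)) ** 2
    return element * grating


if __name__ == "__main__":
    p, n_e, f1, f2 = 2, 6, 0.78, 1.18
    for i in range(5, 16):
        f_norm = 0.1 * i
        resp = array_response(f_norm, f1, f2, p, n_e)
        level = 10.0 * math.log10(resp) if resp > 0.0 else float("-inf")
        print("F = %.1f  zeta/theta_c = %+.3f  response = %6.1f dB"
              % (f_norm, steering_error(f_norm, f1, f2), level))
```

At the two tracking frequencies the grating factor is unity and only the single-element factor reduces the response; between and beyond them the product of the two factors shapes the passband, so $F_1$ and $F_2$ can be tuned by rerunning the scan.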
COQUIN et al. [1970] have extended this technique to larger arrays of transducers with individual subarrays which are driven with fixed phase offsets switched with frequency so that the overall directional response matches the Bragg angle.

6.4. BRAGG DIFFRACTION IN ANISOTROPIC MEDIA
DIXON [1967] has pointed out that Bragg diffraction in isotropic media described in the preceding sections is modified in anisotropic media, since incident and diffracted rays can propagate with differing refractive indices $n_i$ and $n_d$ if the elastooptic interaction couples an ordinary input ray with an extraordinary output ray or vice versa. The relations (6.16) are then still valid, but $k_i \ne k_d$. The case where both rays are in the same polarization mode is relatively trivial, since for the small diffraction angles usually encountered in practical devices both rays have substantially equal refractive indices so that Fig. 6.6a still applies. However, if diffraction occurs into an orthogonal polarization mode, birefringence causes the refractive index difference $n_i - n_d$ to remain finite even for small angles, and $k_i$ and $k_d$ in Fig. 6.6b are not required to have the same magnitude. In particular, it then is possible to have collinear Bragg diffraction as shown in Fig. 6.6c if $k_d = K + k_i$ with $K > 0$, whereas the isotropic case would obviously require $K = 0$.

Fig. 6.6. Wave vector relation for Bragg diffraction. The sound wave, incident and diffracted light waves have the wave vectors $\mathbf{K}$, $\mathbf{k}_i$ and $\mathbf{k}_d$ respectively. (a) isotropic medium, (b) birefringent medium, (c) collinear diffraction for $f = f_{\min}$.

For the free space light wavelength $\lambda_0$ with $k_0 = 2\pi/\lambda_0$ we have

$$k_i = n_i k_0; \qquad k_d = n_d k_0, \tag{6.50}$$

and obtain from Fig. 6.6b, $\Lambda$ being the sound wavelength,

$$\sin\theta_i = (\lambda_0/2n_i\Lambda)\,[1 + (\Lambda^2/\lambda_0^2)(n_i^2 - n_d^2)] \tag{6.51}$$

$$\sin\theta_d = (\lambda_0/2n_d\Lambda)\,[1 - (\Lambda^2/\lambda_0^2)(n_i^2 - n_d^2)] \tag{6.52}$$

which for $n_i = n_d$ reduces to eq. (6.17). Writing $f = v/\Lambda$, one finds that for a particular frequency

$$f_{\min} = v\,|n_i - n_d|/\lambda_0 \tag{6.53}$$

$\sin\theta_d = -\sin\theta_i$ with unit magnitude, for which, therefore, $\mathbf{k}_i$, $\mathbf{k}_d$ and $\mathbf{K}$ are collinear, but $\mathbf{k}_i$ and $\mathbf{k}_d$ have orthogonal polarization and thus can be separated by suitable optical components. For $f < f_{\min}$ this form of Bragg diffraction cannot exist. For $f > f_{\min}$, $\mathbf{k}_i$ and $\mathbf{k}_d$ are not collinear. The normal form of Bragg diffraction exists at all frequencies but does not exhibit the property of collinearity. HARRIS and WALLACE [1969] and HARRIS et al. [1969] have used these properties of collinear Bragg diffraction to obtain an electrically tunable spectral light filter, covering the range from 0.70 to 0.55 µm with a bandwidth of 0.2 nm and peak transmission of 50 %. They used a LiNbO$_3$ crystal, setting up a standing sound wave along the $x$-axis with the collinearly incident light polarized along either the $y$- or $z$-axis. The pass band is of the form $\sin^2 x/x^2$ with a half-power optical bandwidth in wavenumbers, neglecting dispersion, given by

$$\Delta\tilde{\nu} = 1/(2L|\Delta n|)\ \text{cm}^{-1} \tag{6.54}$$

$L$ being the collinear interaction length. For LiNbO$_3$ in the orientation mentioned, $\Delta n = n_i - n_d = 0.09$ and $v = 6.57$ mm/µs, giving $f_{\min} = 1.075$ GHz for $\lambda_0 = 0.55$ µm and $f_{\min} = 0.75$ GHz for $\lambda_0 = 0.7$ µm from eq. (6.53). For very high frequencies the first terms in the brackets of eqs. (6.51) and (6.52) dominate, so that $\theta_i$ and $\theta_d$ are symmetrically disposed around $\theta_B = \lambda_0/(n_i + n_d)\Lambda$ and approach the normal case, for $f = f_{\max}$ of eq. (6.18), as is shown in more detail by Dixon, who also suggested the application of the effect for switching the polarization of the light.
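The collinear limit can be verified numerically. The sketch below is illustrative and assumes the forms $\sin\theta_i = (\lambda_0/2n_i\Lambda)[1 + (\Lambda^2/\lambda_0^2)(n_i^2 - n_d^2)]$, $\sin\theta_d = (\lambda_0/2n_d\Lambda)[1 - (\Lambda^2/\lambda_0^2)(n_i^2 - n_d^2)]$ and $f_{\min} = v|n_i - n_d|/\lambda_0$. Only the index difference 0.09 and the sound velocity are taken from the text; the individual indices 2.29 and 2.20 are assumed round values for lithium niobate.

```python
def collinear_frequency(v, n_i, n_d, lambda0):
    """Lowest frequency of birefringent Bragg diffraction, f_min = v |n_i - n_d| / lambda0."""
    return v * abs(n_i - n_d) / lambda0


def anisotropic_bragg_sines(f, v, n_i, n_d, lambda0):
    """Return (sin theta_i, sin theta_d) for sound frequency f."""
    sound_wavelength = v / f
    g = (sound_wavelength / lambda0) ** 2 * (n_i ** 2 - n_d ** 2)
    sin_i = lambda0 / (2.0 * n_i * sound_wavelength) * (1.0 + g)
    sin_d = lambda0 / (2.0 * n_d * sound_wavelength) * (1.0 - g)
    return sin_i, sin_d


if __name__ == "__main__":
    v = 6570.0             # m/s, i.e. 6.57 mm/us as quoted in the text
    n_i, n_d = 2.29, 2.20  # assumed indices with the quoted difference of 0.09
    lambda0 = 0.55e-6      # m

    f_min = collinear_frequency(v, n_i, n_d, lambda0)
    print("f_min = %.3f GHz" % (f_min / 1e9))
    print("at f_min:   sin(theta_i), sin(theta_d) = %.3f, %.3f"
          % anisotropic_bragg_sines(f_min, v, n_i, n_d, lambda0))
    print("at 2 f_min: sin(theta_i), sin(theta_d) = %.3f, %.3f"
          % anisotropic_bragg_sines(2.0 * f_min, v, n_i, n_d, lambda0))
```

At $f_{\min}$ both sines have unit magnitude and opposite sign, which is the collinear geometry; setting $n_i = n_d$ in the same function returns the two equal sines of the isotropic Bragg condition.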
LEAN et al. [1967] have used noncollinear anisotropic Bragg diffraction in sapphire to extend the Bragg bandwidth beyond the value given by eq. (6.21) without resorting to beam steering. Figure 6.6b illustrates this possibility: for a given $\mathbf{k}_i$ of fixed direction, $\mathbf{k}_d$ can be varied in direction without, to first order, necessitating a change in the direction of $\mathbf{K}$ as long as $\mathbf{K}$ is tangential to the locus of the tip of $\mathbf{k}_d$. In all these applications high frequencies around 1 GHz had to be used, which were only attainable with thin-film transducers of relatively high transducer loss. This fact restricted the attainable diffraction efficiency due to onset of thermal distortion. Lower frequencies of operation are obtainable with crystals of lower birefringence, according to eq. (6.53). Also, since the elastooptic constants $p_{44}$, $p_{45}$, $p_{55}$ or $p_{66}$ have to be used for this type of diffraction, their higher values in lithium niobate and calcium molybdate make these materials more advantageous than lead molybdate for these applications. However, in the materials known these constants tend to be lower than the ones usable for the normal type of Bragg diffraction.

6.5. THERMAL AND SOUND ABSORPTION PROBLEMS
It is apparent from the principle of diffractive deflection that sound frequencies of many MHz have to be used, where sound absorption and transducer losses cannot be neglected and give rise to heat production within the device. Thus, eq. (6.17) needs to be rewritten for temperature-dependent sound velocity and index of refraction, viz. sin 8 = i 0 f / 2 n v w (,iof/’2nov0)(1-
[’no * + 1*]AT) dT dT vo
(6.55)
for small temperature excursions. The heat produced and hence the device temperature depends on the time-averaged drive power level and, since this varies with frequency in a practical transducer, it also depends on the drive frequency. Since the sound absorption also depends on frequency, temperature gradients within the device and their spatial variation also depend on frequency. Thus, to minimize temperature effects on the deflection angle one would want the temperature coefficient of refractive index and sound velocity to be small or to compensate each other. While such cases may exist, as mentioned in § 3, in practical cases, as discussed by COQUIN et al. [1970] temperature coefficients of sound velocity exceeding 10-4/0Chave to be dealt with. As long as ATis constant through the deflection medium, eq. (6.55) indicates that a simple angle variation of the deflected spot is the result, which can be compensated for by varying the frequency f accordingly. If A T depends lin-
VI,
o 71
PIEZOELECTRIC TRANSDUCERS
261
early on y in Fig. 6.2 spot distortion results due to the linear variation of sound wavelength with position across the aperture. GERIGand MONTAGUE [1964] have shown that this type of distortion is equivalent to a stationary cylinder lens and, as such, can be compensated for by a cylinder lens. Nonlinear temperature gradients produce distortions which are usually not correctable, so that sound absorption and transducer dissipation losses set a practical limit, even if good heat-sinking is used. The problem is well illustrated by considering that a lead molybdate deflector with a resolution of a few hundred positions will not tolerate any temperature excursions in excess of a few degrees and yet may have to be driven with several watt of sound power. To accomplish this requires careful thermal design, minimum dissipation losses in the transducer, its bonds and electrodes, minimum ohmic losses in the latter, forced air cooling and reflection-free extraction of the transmitted sound wave by an absorber which simultaneously is a good heat sink. PINNOW[1970] points out that materials with high figure of merit M2 tend to have high sound absorption. This ultimately sets the limit for the capacity speed product after all thermal design improvements have been exhausted. This comes about since the large bandwidth and center frequency required for high capacity-speed products by eqs. (6.19) and (6.20) lead to small usable interaction lengths via eq. (6.27) necessitating high power densities. Also the absorption coefficient tends in most materials to increase at least proportional to f and often tof2. Sound absorption also affects the resolution via what COHEN and GORDON [ 19651 term a "finite coherence width" effect. The exponentially decreasing phase modulation with distance from a transducer makes it approximately equivalent to a constant amplitude grating of shorter aperture width D. 
As a result the diffraction angle of the light increases when absorption is present and the intensity profile tails off more slowly. This has also been shown by SWEENEY and BUDD [1970]. However, the effect is only relevant for most applications if the power drops more than a few dB across the light beam width.
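The frequency scaling of the absorption discussed above can be illustrated with a minimal sketch. It assumes that the absorption coefficient follows a simple power law in frequency (exponent between 1 and 2, as stated in the text); the function name, the reference absorption and the beam width are hypothetical placeholders and not data from this chapter.

```python
def power_drop_db(alpha_ref_db_per_cm, f_ref_hz, f_hz, beam_width_cm, exponent=2.0):
    """Sound power drop (dB) across the light beam width.

    Assumes the absorption coefficient scales as (f / f_ref) ** exponent,
    with the exponent between 1 and 2 (illustrative power-law model).
    """
    alpha = alpha_ref_db_per_cm * (f_hz / f_ref_hz) ** exponent
    return alpha * beam_width_cm


# Hypothetical material: 0.5 dB/cm at 100 MHz, light beam 1 cm wide.
for f in (100e6, 200e6, 400e6):
    drop = power_drop_db(0.5, 100e6, f, 1.0)
    print(f"{f / 1e6:.0f} MHz: {drop:.1f} dB across the beam")
```

With a quadratic law, doubling the sound frequency quadruples the drop in dB, which is why the "few dB across the light beam width" criterion is reached quickly at high center frequencies.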
§ 7. Piezoelectric Transducers for Diffraction Light Deflectors
7.1. TRANSDUCER THEORY
The diffractive deflectors and modulators described in the previous section are of particular interest for capacity-speed products $N/\tau = \Delta f$ of many MHz and require sound powers on the order of watts. Piezoelectric transducers afford, at present, the best means to produce such powers and frequencies.
As shown in Fig. 7.1, such a transducer consists of a piezoelectric plate of dimensions $L \times H \times l$ attached to the deflection medium. For high frequency operation, where $L, H \gg \Lambda$ (the sound wavelength), the driving voltage is applied as shown to two electrodes extending perpendicular to the sound propagation direction, and the transducer is called “thickness driven”. From a practical point of view one is interested in knowing the electric input voltage $V$ and current $I$ required to produce the desired sound power $P_s$ as a function of frequency $f$, the plate dimensions, the relevant materials data of the transducer and deflection medium, and the desired mode of sound propagation.

Fig. 7.1. A “thickness driven” transducer attached to the deflection medium and driven from a source with internal impedance $R_s$. The shaded areas are the drive electrodes.
Rather than dealing with the theory of such transducers in general, we defer to the discussion given by BERLINCOURT et al. [1964] and treat the problem in an “equivalent circuit” approximation, the limit of validity of which has been discussed by TIERSTEN [1970]. In this approach one describes the transducer as a “black box” coupling the electric input power to the sound power transmitted into the deflection medium. When a transducer of electrical input impedance $Z_i$ is driven from a source of impedance $R_s$, a part of the power $P_i$ incident upon the transducer terminals is reflected back to the source if $R_s \neq Z_i$; the rest, $P_t$, is transmitted into the transducer terminals. Customarily one calls the expression

$$ML = -10 \log (P_t/P_i) \tag{7.1}$$

the “matching loss”, expressed in dB. Of the fraction $P_t$ entering the transducer a part is absorbed in the transducer due to internal dissipation from dielectric, sound absorption and other losses; the rest, $P_s$, is delivered to the deflection medium. One expresses the “dissipation loss” in dB by

$$DL = -10 \log (P_s/P_t). \tag{7.2}$$

The total “transducer loss” TL is then, in dB,

$$TL = -10 \log (P_s/P_i) = ML + DL. \tag{7.3}$$
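The loss bookkeeping of eqs. (7.1) to (7.3) can be sketched numerically as follows. The sketch assumes that the incident power is the available power of a source with a purely resistive internal impedance, so that the transmitted fraction is one minus the squared magnitude of the reflection coefficient; this is a standard circuit result rather than something taken from the chapter, and the function names and example values are hypothetical.

```python
import math


def matching_loss_db(z_in, r_source):
    """ML of eq. (7.1) for a real source impedance r_source (ohms)."""
    gamma = (z_in - r_source) / (z_in + r_source)   # reflection coefficient
    transmitted_fraction = 1.0 - abs(gamma) ** 2    # transmitted / incident power
    return -10.0 * math.log10(transmitted_fraction)


def dissipation_loss_db(p_sound, p_transmitted):
    """DL of eq. (7.2): sound power delivered versus power entering."""
    return -10.0 * math.log10(p_sound / p_transmitted)


def transducer_loss_db(ml_db, dl_db):
    """TL of eq. (7.3): the two losses simply add in dB."""
    return ml_db + dl_db


ml = matching_loss_db(complex(10.0, -20.0), 50.0)   # hypothetical input impedance
dl = dissipation_loss_db(0.9, 1.0)                  # hypothetical 10 % dissipation
print(f"ML = {ml:.2f} dB, DL = {dl:.2f} dB, TL = {transducer_loss_db(ml, dl):.2f} dB")
```

A perfectly matched transducer gives a matching loss of 0 dB, so that the transducer loss reduces to the dissipation loss alone.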
It is necessary to consider these losses separately, since $DL > 0$ causes the transducer to heat. But in practice $DL \ll ML$, so that the transducer loss and its dependence on frequency are mainly determined by ML. Minimizing ML for a transducer of given area $L \times H$, center frequency $f_0$ and bandwidth $\Delta f$ driven from a source impedance $R_s$ (in practice near 50 Ω) involves choosing the optimal transducer material and its thickness and the design of an electrical matching network to be connected between source and transducer. In order to obtain guidelines on how to accomplish this, we now have to evaluate $Z_i$ and TL in terms of the materials data and dimensions of the transducer. BERLINCOURT et al. [1964] have discussed an equivalent circuit representation due to MASON [1948] and find the relevant data of the transducer material to be its density $\rho_0$, sound velocity $v_0$, effective permittivity $\varepsilon$ and electromechanical coupling factor $K$, which are expressed and listed in terms of the components of the dielectric, elastic and piezoelectric tensors of a material, for the choice of longitudinal or shear waves to be generated and for a suitable crystal cut to produce these with adequate mode purity. Table 2 gives an example. SITTIG [1969] and MEITZLER and SITTIG [1969] have given expressions for $Z_i$ and TL in terms of these parameters. Their “insertion loss” IL or “insertion gain” IG is related to TL by

$$TL = \tfrac{1}{2} IL = -\tfrac{1}{2} IG. \tag{7.4}$$

TABLE 2
Selected piezoelectric materials (from MEITZLER [1971])

Material      Point group  Mode  Orientation  K      ε/ε0   v0 [mm/μs]  Z0 = ρ0 v0 [10^6 kg/(m² s)]
CdS           6mm          L     0°           0.154  9.53   4.50        21.7
                           S     90°          0.188  9.02   1.80         8.7
                           S     39.7°        0.212  9.33   2.10        10.2
ZnO           6mm          L     0°           0.282  8.84   6.40        36.4
                           S     90°          0.259  8.33   2.88        16.4
                           S     43.0°        0.322  8.63   3.21        18.4
LiIO3         6            L     0°           0.51   6.0    4.13        18.5
                           S     90°          0.60   8.0    2.52        11.3
PZT 7-A       6mm          L     0°           0.50   235.0  4.80        33.8
                           S     90°          0.67   460.0  2.50        17.6
SPN           6mm          L     0°           0.46   310.0  6.94        31.3
                           S     90°          0.65   545.0  3.76        16.9
LiNbO3        3m           L     Z            0.17   29     7.32        34.4
                           L     35°Y         0.49   39     7.40        34.8
                           S     163°Y        0.62   43     4.56        21.4
                           S     X            0.68   44     4.80        22.3
Ba2NaNb5O15   mm2          L     Z            0.57   32     6.15        32.6
                           S     X            0.21   222    3.64        19.3
                           S     Y            0.25   227    3.66        19.4
For a transducer with width $H$, length $L$ and thickness $l$ attached as shown in Fig. 7.1 to a deflection medium of density $\rho$ and sound velocity $v$, one customarily defines a “half wave frequency” $f_0$ and a “clamped capacitance” $C_0$ by

$$\omega_0 = 2\pi f_0 = \pi v_0 / l; \qquad C_0 = H L \varepsilon / l \tag{7.5}$$

and uses a “normalized impedance” $z_d$ for the deflection medium and $r_s$ for the driving source, defined by

$$z_d = \rho v / \rho_0 v_0; \qquad r_s = R_s \omega_0 C_0 .$$
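As a minimal sketch of how eq. (7.5) and the two normalized quantities are evaluated in practice, the following helper works in SI units. The function name, the dictionary keys and all numerical inputs are illustrative assumptions; the material values are round numbers of a plausible order of magnitude, not a design taken from this chapter.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def transducer_parameters(v0, thickness, width, length, eps_rel,
                          r_source, rho0, rho, v):
    """Evaluate eq. (7.5) and the normalized impedances z_d and r_s (SI units)."""
    omega0 = math.pi * v0 / thickness                  # omega_0 = 2*pi*f_0 = pi*v_0/l
    f0 = omega0 / (2.0 * math.pi)                      # half wave frequency
    c0 = width * length * eps_rel * EPS0 / thickness   # clamped capacitance C_0
    z_d = (rho * v) / (rho0 * v0)                      # deflection medium
    r_s = r_source * omega0 * c0                       # driving source
    return {"f0": f0, "C0": c0, "z_d": z_d, "r_s": r_s}


# Hypothetical example: 37 um thick plate, 1 mm x 5 mm electrode, 50 ohm source.
p = transducer_parameters(v0=7400.0, thickness=37e-6, width=1e-3, length=5e-3,
                          eps_rel=39.0, r_source=50.0,
                          rho0=4700.0, rho=6950.0, v=3630.0)
print(f"f0 = {p['f0'] / 1e6:.0f} MHz, C0 = {p['C0'] * 1e12:.1f} pF, "
      f"z_d = {p['z_d']:.2f}, r_s = {p['r_s']:.2f}")
```

Because both the thickness and the capacitance depend on the chosen center frequency, $r_s$ for a fixed electrode area grows with the square of $f_0$, which is the reason small electrode areas or subdivided electrodes are needed at high frequencies.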
It is then found that the transducer has a bandpass response peaking near $f = f_0$ with transmission zeroes at $f = 0$ and $f = 2f_0$. To obtain a bandwidth on the order of $\Delta f/f_0 = \tfrac{1}{2}$ it is necessary to choose $z_d$ and $r_s$ near unity. Figures 7.2a-f show computed values of TL for $r_s = 1$ and $z_d = 0.4, \ldots, 2$ and a coupling factor of $K = 0.1, \ldots, 0.7$ as a function of $f/f_0$, thus covering the range of practical interest. The slight tilt of the pass bands is usually corrected by the electrical matching network, which, in the simplest case, may be an inductor parallel to the transducer terminals. For large area transducers at frequencies above 100 MHz it may not be practical to attain $r_s \approx 1$, as generator impedances on the order of 1 Ω would be required. In such a case the transducer must be subdivided into series-connected sections, or more elaborate impedance matching networks such as stepped ladder LC filters must be resorted to. At frequencies approaching 100 MHz, the effect of electrode and bonding layers can no longer be neglected, as these act as mismatched transmission line sections in the sound propagation path and thus make the normalized impedances, such as $z_d$, seen by the transducer complex and frequency dependent. SITTIG [1969] and others quoted there have dealt with this problem. Sittig finds that ideally all layers between the transducer and deflection medium should have the same acoustic impedance $Z = \rho v$ as the latter, in order not to impair the bandpass characteristics. If this is not possible, such layers must be made extremely thin, typically small fractions of a μm, or multiple quarterwave layers must be used which minimize the reflection factors at the interfaces. This problem is analogous to the design of optical antireflection coatings of interfaces with mismatched index of refraction.

Figs. 7.2 (a-f). The transducer loss as a function of normalized frequency $f/f_0$ for $r_s = R_s \omega_0 C_0 = 1$, coupling factor $K = 0.1, \ldots, 0.7$ and $z_d = \rho v/\rho_0 v_0 = 0.4, \ldots, 2.0$.

7.2. TECHNOLOGY OF TRANSDUCERS
From Fig. 7.2, it is clear that transducer materials should exhibit a high coupling factor $K$ to obtain small transducer losses. Taken from a larger compilation of data by MEITZLER [1971], Table 2 lists the properties of some piezoelectric transducer materials which combine high coupling factor with reasonable technological properties. Cadmium sulfide and zinc oxide are included in spite of their lower coupling factor because they can be made by thin-film deposition methods as described by FOSTER et al. [1968]. These avoid the problems of bonding methods and permit deposition on cylindrical surfaces to obtain sound focusing. The listed orientations produce longitudinal (L) and shear waves (S) with adequate mode purity, i.e., the power simultaneously radiated in the other modes is down at least 30 dB from the power in the designated mode.

As is evident from eq. (7.5) and the design equations for diffractive deflectors, the impedance to which the generator has to be matched, $1/\omega_0 C_0$, becomes very low even for moderate values of $\varepsilon$ at frequencies above 100 MHz. For $L \times H = 1$ cm², $\varepsilon/\varepsilon_0 = 100$, $v_0 = 5$ mm/μs one would obtain $1/\omega_0 C_0 = 4.5$ Ω at 100 MHz and 1.1 Ω at 200 MHz. Thus a low $\varepsilon$ is usually also desirable. Lithium iodate would be a very good choice in terms of low $\varepsilon$ and high $K$ were it not for the fact that it is water soluble and thus difficult to process. Ferroelectric ceramics of the lead zirconate-titanate family like Clevite’s PZT-7A or sodium potassium niobate (SPN) combine a high coupling factor with good technological tractability. Their high permittivity $\varepsilon$ requires them, however, to be subdivided into many series-connected sections to obtain manageable input impedances. Their high dielectric loss and sound absorption also militate against their use, as these contribute to the dissipation loss and attendant thermal problems. Thus, lithium niobate and, for longitudinal waves, barium sodium niobate seem to be preferable in that they combine a moderate $\varepsilon$ with a high coupling factor and can be processed with standard methods.

Electrodes and the bonding layers needed to affix these materials to the deflection medium should have an acoustic impedance $\rho v$ close to that of the deflection medium in order to avoid bandpass distortions, or else be very thin. Electrodes have to consist of metals with high conductivity like Ag, Au or Al to keep ohmic dissipation losses small. Even if acoustically matched, metal layers exhibit sound absorptions of up to 0.1 dB/μm at a few hundred MHz, so that thick layers would again raise dissipation losses. Bonds between vacuum-deposited layers of such metals can be made by a variety of methods, such as thermocompression. But because of thermal expansion mismatch between the transducer, deflector and absorber materials, only methods are suitable that avoid large temperature excursions from ambient. The organic adhesive and indium alloy bonds described or referred to by SITTIG [1969] are suitable, although the former are restricted to frequencies below 100 MHz. Their low $\rho v$ causes them to be sufficiently mismatched to most deflector materials at higher frequencies that such bonds would have to be thinner than 0.1 μm, which is difficult to accomplish. For frequencies above 1 GHz the transducer thickness decreases to a few micrometers, which is difficult to obtain reliably with machining techniques. There thin-film transducers of CdS or ZnO become the preferred choice in spite of their lower coupling factor. Thermal problems and dielectric breakdown set limits on the sound power, so that, at present, acoustooptic devices using these transducers rarely reach CW-diffraction efficiencies of more than a few percent.
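The impedance argument of this section can be checked to order of magnitude with a short sketch. It assumes a half-wave thick plate treated as a simple parallel-plate capacitor, as in eq. (7.5), and models a subdivision into $n$ equal series-connected sections as $n$ capacitors of value $C_0/n$ in series; the function name and the choice of four sections are illustrative assumptions. Evaluated with the inputs as quoted in the text (1 cm², $\varepsilon/\varepsilon_0 = 100$, 5 mm/μs), this estimate comes out near 0.45 Ω and 0.11 Ω, i.e. with the same $1/f_0^2$ scaling as the quoted 4.5 Ω and 1.1 Ω but a factor of ten lower; an electrode area of 0.1 cm² would reproduce the quoted values, so the sketch is best read as a scaling check only.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def clamped_reactance_ohm(f0, v0, area, eps_rel, n_sections=1):
    """1/(omega_0 C_0) of a half-wave transducer split into n series sections."""
    thickness = v0 / (2.0 * f0)               # half-wave thickness, from eq. (7.5)
    c0 = eps_rel * EPS0 * area / thickness    # undivided clamped capacitance
    c_series = c0 / n_sections ** 2           # n sections of C_0/n connected in series
    return 1.0 / (2.0 * math.pi * f0 * c_series)


for f0 in (100e6, 200e6):
    x1 = clamped_reactance_ohm(f0, 5e3, 1e-4, 100.0)
    x4 = clamped_reactance_ohm(f0, 5e3, 1e-4, 100.0, n_sections=4)
    print(f"{f0 / 1e6:.0f} MHz: {x1:.2f} ohm undivided, {x4:.1f} ohm in 4 sections")
```

Series subdivision raises the impedance level by $n^2$, which is why a handful of sections is often enough to bring a high-permittivity transducer within reach of a practical source impedance.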
§ 8. Areas of Application
Since, according to eq. (6.17), the Bragg angle depends equally on the light wavelength and sound frequency, a diffractive deflector can serve as a spectral analyzer for sound or light waves. ROSENTHAL [1955], LAMBERT [1962], DIXON [1969], SLAYMAKER [1968] and HARRIS [1969] have dealt with these applications. Raman-Nath diffraction has been used extensively in the past to analyze strain distribution in traveling or standing waves, to mention only the work of LOEBER and HIEDEMANN [1954, 1956], BREAZEALE and HIEDEMANN [1955, 1958], HARGROVE [1964, 1967, 1968], COHEN and GORDON [1965] and MALONEY et al. [1968]. Modulators using either Raman-Nath or Bragg diffraction have found application in intracavity light modulation of lasers, be it for Q-switching (DEMARIA [1963]), mode locking (HARGROVE et al. [1964]), or cavity dumping (MAYDAN [1970]). The requirements on optical homogeneity in any of these applications are quite stringent and tend to dominate the selection of the deflector material.

Another range of applications may be described as correlation. A signal is amplitude modulated onto a suitable RF carrier and inserted into an elastooptic cell with a transit time chosen to encompass the desired correlation time. Thus at a given time a phase image of the carrier and its modulation exists over an aperture $v\tau$, and a Fourier transform can be obtained if the cell is illuminated with coherent, collimated light and a subsequent lens focuses the light transmitted through the aperture onto a photodetector. Suitable amplitude or phase spatial filters arranged in the aperture then produce their convolution with the signal. COOK and BERNFELD [1967] and MALONEY [1969] reviewed a number of such configurations and MELTZ and MALONEY [1968] have dealt with the theoretical aspects. By their geometry, such ultrasonic light modulators usually work in the Raman-Nath range. Schlieren-optical image display cells as were used in the TV projectors pioneered by Scophony Ltd. also belong in this category.

GERIG and MONTAGUE [1964] have pointed out that an RF pulse of constant amplitude but linearly frequency modulated over the range $\Delta f$ in the time $\Delta t$ produces a phase grating with linearly varying spacing. This is approximately equivalent to a linear Fresnel-zone plate which thus has a constant focal length

$$F = v^2 \Delta t / (\Delta f \, \lambda_0) \tag{8.1}$$
and resembles a cylinder lens in its imaging properties. However, this lens moves across the aperture with the sound velocity. It thus can be used to scan an image of the coherent light source illuminating the cell across a photodetector with a slit aperture, whose output is then the matched filter response of the linearly frequency modulated sound signal.

Multiposition deflectors operating in the Bragg regime have now reached capabilities where bandwidths on the order of 100 MHz, transit times of up to 10 μs and CW-diffraction efficiencies around 50% are available. A typical device configuration is shown in Fig. 8.1. Thus, e.g., random access speeds in the low-μs range into 100 well resolved positions are attainable, which are of interest for fast access file storage in computer systems along approaches outlined, e.g., by SMITS and GALLAHER [1967] and ANDERSON [1968]. Likewise the combination of an acoustooptical light modulator and deflectors for linear beam scanning permits the generation of TV displays with scanned laser beams as described by KORPEL et al. [1966]. In this particular application system simplifications result when the modulator is designed as a deflector for a few positions, which then can be used to obtain simultaneously in one device modulation and angular separation of a polychromatic beam or several monochromatic beams for multicolor displays. The transit time of the deflector can be made as long as the retrace time of a linear scan, so that resolutions of several hundred spots can be reached with frequency scans over less than 100 MHz. Combined with mechanical scanning, acoustooptic beam positioning also has potential in areas like laser machining of thin films or photocomposition and art work generation in photolithographic applications.

Fig. 8.1. A Bragg deflector with 4 series-connected transducer elements for convenient matching to the source impedance. After traversing the interaction region the sound is absorbed. The absorber must have an acoustic impedance $\rho v$ closely matching the deflection medium to keep reflections from the interface down. Tilting the interface by an angle $\psi$ assures that spurious reflections are outside the Bragg angle range used for deflection.

The trade-off between rise time and number of resolved positions implicit in eq. (6.19) exists, of course, only for a single device. By cascading several devices or by optically joining several shorter scans into one longer scan a larger number of positions can be obtained without increasing the rise time above that of a single device. In practice, optical complexity and cumulative light losses in the optical elements of such a system are the penalties inherent in this approach. It is also possible to perform X and Y deflection in a single device traversed by two sound beams perpendicular to each other, as described by UCHIDA and IWASAKI [1969] and LAMACCHIA and COQUIN [1971]. The trade-off, however, is one of reduced cost of materials versus greater susceptibility to thermal problems resulting from the larger sound power density.
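The focal length of the traveling chirp lens in eq. (8.1) can be made plausible by a small-angle argument: the local deflection angle is about $\lambda_0 f / v$, and for a linear sweep the frequency varies across the aperture at the rate $(\Delta f/\Delta t)/v$ per unit length, so that the angle changes linearly with position, as for a lens of focal length $v^2 \Delta t/(\Delta f \lambda_0)$. The sketch below simply evaluates this expression; the function name and the numerical values are hypothetical and not taken from the chapter.

```python
def chirp_lens_focal_length(v, delta_t, delta_f, wavelength):
    """Focal length F = v**2 * delta_t / (delta_f * wavelength) of eq. (8.1).

    v          sound velocity (m/s)
    delta_t    duration of the linear frequency sweep (s)
    delta_f    frequency range covered by the sweep (Hz)
    wavelength light wavelength lambda_0 as used in eq. (8.1) (m)
    """
    return v ** 2 * delta_t / (delta_f * wavelength)


# Hypothetical cell: v = 3.6 mm/us, 50 MHz swept in 5 us, 633 nm light.
F = chirp_lens_focal_length(3.6e3, 5e-6, 50e6, 633e-9)
print(f"focal length = {F:.2f} m")
```

A faster sweep (larger $\Delta f/\Delta t$) shortens the focal length, whereas a higher sound velocity lengthens it quadratically.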
§ 9. Outlooks and Conclusion

In terms of achievable performance elastooptic devices are always in direct competition with electrooptic devices. At least for the visible range of the spectrum comparable magnetooptic devices do not exist at present. The maximally achievable refractive index excursions are comparable but small in both electrooptic and elastooptic devices, so that interaction lengths of many light wavelengths and, where appropriate, phase matching are generally required. The large ratio between the velocities of light and of sound favors diffractive approaches for elastooptic devices. As a result, multiposition or analog deflection is readily accomplished with single acoustooptic deflectors where electrooptic devices would need very high voltages in refractive approaches or cascaded two-position birefringent deflectors, requiring stringent control of high voltages and temperature. Thus, external limitations in, e.g., the electronic circuitry undo a good part of the speed advantage which electrooptic devices inherently possess, so that in either case rise times on the order of microseconds may result in practice.

For modulator applications with rise times exceeding about 10 ns acoustooptic approaches offer the advantage of only requiring low voltage power at readily accessible frequencies, making adaptation to solid-state circuitry simple. HENDERSON and ABRAMS [1970] have shown this, comparing acoustooptic devices using Ge with electrooptic ones using GaAs or CdTe to modulate light at 10.6 μm. For rise times less than 10 ns the problems concomitant with the high sound intensity required in the interaction region gradually shift the advantage toward the electrooptic devices, at least with the elastooptic materials known at present.

Elastooptic devices also benefit from the larger category of materials with useful effects available, compared with electrooptic ones. In the latter, linear effects are only found in crystal classes which lack a center of symmetry and thus are also piezoelectric. Spurious elastooptic effects generated by the driving electric field thus need special attention. Also, the larger class of available materials makes it easier to circumvent problems arising from optical damage, inhomogeneity, etc.

Ultrasonic transducers can be expected to be available with conversion losses of a few dB and fractional bandwidth around 50 % up to 1 GHz. At higher frequencies thin-film transducers are usable but have, at present, larger conversion losses or restricted fractional bandwidth. With present thermal limitations CW-diffraction efficiencies beyond 70 %, high duty factors and high capacity-speed products are difficult to attain all in the same device, but the prospects are hopeful. Improvement of this situation can be expected to come from engineering development and materials research. Diffractive elastooptic devices can then also be expected to make larger inroads into high-resolution scanning presently done with rotating mirrors.

Acknowledgments
The author is indebted to B. A. Stevens for assistance in compiling the bibliography and to L. K. Anderson for clarifying discussions.

Bibliography

AAS, H. G. and R. K. ERF, 1964, Application of Ultrasonic Standing Waves to the Generation of Optical Beam Scanning, J. Acoust. Soc. Am. 36, 1906-1913.
ABRAMS, R. L. and D. A. PINNOW, 1971, Efficient Acoustooptic Modulation at 3.39 μm and 10.6 μm in Crystalline Germanium, IEEE J. Quantum Electronics QE-7, 135-136.
ADLER, R., 1967, Ultrasonic Light Modulation and Deflection, IEEE Intern. Convent. Rec. Pt. 11, 15, 69-77.
ADLER, R., 1967, Interaction Between Light and Sound, IEEE Spectrum 4, 42-54.
ANGELBECK, A. W., 1968, Intracavity Control of Lasers Using Acoustical Waves of Two Frequencies, Appl. Opt. 7, 2329-2330.
ALIPPI, A. and L. PALMIERI, 1968, New Holographic Method for the Investigation of Light Diffraction by Ultrasonic Standing Waves, Acustica 20, 84-87.
ALIPPI, A. and L. PALMIERI, 1969, Spectrum Analysis of the Light Diffracted by an Ultrasonic Standing Wave by Means of a Holographic Method, Acustica 21, 104-111.
ANDERSON, L. K., 1968, Holographic Optical Memory for Bulk Data Storage, Bell Laboratories Record 46, 319-325.
ANDERSON, O. L., 1965, Determination and Some Uses of Isotropic Elastic Constants of Polycrystalline Aggregates Using Single Crystal Data, in: Physical Acoustics, vol. 3B, ed. W. P. Mason (Academic Press, New York) pp. 43-95.
ATZENI, C. and L. PANTANI, 1969, A Simplified Optical Correlator for Radar-Signal Processing, Proc. IEEE 57, 344-346.
BALAKSHY, V. L. and V. N. PARYGIN, 1968, Ultrasound Refraction Deflector of Infrared Range, Bull. Moscow Univ. Phys. Astronomy, Nr. 5, 112-115.
BATTERMAN, B. and H. COLE, 1964, Dynamical Diffraction of X-rays by Perfect Crystals, Rev. Mod. Phys. 36, 681-717.
BAYER, E., K. H. HELLWEGE and G. SCHAACK, 1963, Niederfrequente Modulation der Emission eines Rubin Lasers, Phys. Lett. 6, 243-245.
BELOVA, G. N. and V. F. KASANTSEV, 1969, Ultrasonic Modulation of a Laser, Sov. Phys. Acoust. 15, 4-9.
BERGMANN, L., 1954, Der Ultraschall (S. Hirzel Verlag, Stuttgart).
BERLINCOURT, D., 1967, Delay Line Transducer Materials, IEEE Intern. Conv. Rec. Pt. 11, 61-68.
BERLINCOURT, D. A., D. R. CURRAN and H. JAFFE, 1964, Piezoelectric and Piezomagnetic Materials and Their Function in Transducers, in: Physical Acoustics, vol. 1A, ed. W. P. Mason (Academic Press, New York).
BERNSTEIN, S., J. MINKOFF and M. ARM, 1967, Birefringence in Amorphous Solids with Application to Solid Light Modulators, Columbia Univ. Electronic Res. Labs. Tech. Rpt. T-3/321.
BERRY, M. V., 1967, The Diffraction of Light by Ultrasound (Academic Press, New York).
BHATIA, A. B. and W. J. NOBLE, 1953, Diffraction of Light by Ultrasonic Waves: I. General Theory, II. Approximate Expressions for the Intensities and Comparison with Experiment, Proc. Royal Soc. (London) 220A, 356-368 and 369-385.
BOHME, H., E. FROMM and E. K. SITTIG, 1960, Schwingungen des Isotropen Kreiszylinders mit Verschwindender Axial-Komponente, Acustica 10, 67-71.
BONCH-BRUEVICH, A. H., 1956, Transmission of Polarized Light Through a Medium with Standing Ultrasonic Waves, Sov. Phys.-Tech. Phys. 1, 428-430.
BORN, M. and E. WOLF, 1965, Principles of Optics, 3rd ed. (Pergamon Press, New York).
BORSUK, G. M. and W. J. THALER, 1970, Frequency Modulated Laser Communication System, IEEE Transact. Sonics and Ultrason. SU-17, 207-209.
BREAZEALE, M. A. and E. A. HIEDEMANN, 1955, Simple Way to Observe Optical Diffraction Patterns Produced by Shear Waves, J. Acoust. Soc. Am. 27, 1220-1221.
BREAZEALE, M. A. and E. A. HIEDEMANN, 1958, Investigation of Progressive Ultrasonic Waves by Light Refraction, J. Acoust. Soc. Am. 30, 751-756.
BRIENZA, M. J. and A. J. DEMARIA, 1966, Continuously Variable Ultrasonic-Optical Delay Line, Appl. Phys. Lett. 9, 312-314.
BURCKHARDT, C. B., 1966, Diffraction of a Plane Wave at a Sinusoidally Stratified Dielectric Grating, J. Opt. Soc. Am. 56, 1502-1509.
BURCKHARDT, C. B., 1967, Efficiency of a Dielectric Grating, J. Opt. Soc. Am. 57, 601-603.
CARLETON, H. R. and W. T. MALONEY, 1967, Advantages of Transverse-Wave Light Modulators, Proc. IEEE 55, 1077.
CARLETON, H. R., W. T. MALONEY and G. MELTZ, 1969, Collinear Heterodyning in Optical Processors, Proc. IEEE 57, 769-775.
CARLETON, H. R. and R. A. SOREF, 1966, Modulation of 10.6 μm Laser Radiation by Ultrasonic Diffraction, Appl. Phys. Lett. 9, 110-112.
CARPENTER, R. O’B., 1953, Electro-optic Sound-on-Film Modulators, J. Acoust. Soc. Am. 25, 1145-1148.
CHU, RUEY-SHI and T. TAMIR, 1969, Guided Wave Theory of Light Diffraction by Acoustic Microwaves, IEEE Transact. Microwave Theory and Techniques MTT-17, 1002-1020.
COHEN, M. G. and E. I. GORDON, 1964, Electro-optic [KTa$_x$Nb$_{1-x}$O$_3$ (KTN)] Gratings for Light Beam Modulation and Deflection, Appl. Phys. Lett. 5, 181-182.
COHEN, M. G. and E. I. GORDON, 1965, Acoustic Beam Probing Using Optical Techniques, Bell Syst. Tech. J. 44, 693-721.
COLLINS, J. H., E. G. H. LEAN and H. J. SHAW, 1967, Pulse Compression by Bragg Diffraction of Light with Microwave Sound, Appl. Phys. Lett. 11, 240.
COOK, B. D., 1965, Optical Method for Ultrasonic Waveform Analysis, Using a Recursion Relation, J. Acoust. Soc. Am. 37, 172-173.
COOK, B. D. and E. A. HIEDEMANN, 1961, Diffraction of Light by Ultrasonic Waves of Various Standing Wave Ratios, J. Acoust. Soc. Am. 33, 945-948.
COOK, C. E. and M. BERNFELD, 1967, Radar Signals (Academic Press, New York).
COQUIN, G. A., J. P. GRIFFIN and L. K. ANDERSON, 1970, Wide Band Acoustooptic Deflectors Using Acoustic Beam Steering, IEEE Transact. Sonics and Ultrason. SU-17, 34-40.
COQUIN, G. A., D. A. PINNOW and A. W. WARNER, 1971, The Physical Properties of Lead Molybdate Relevant to Acoustooptic Device Applications, J. Appl. Phys. 42, 2162-2168.
CRUMLEY, B., L. C. FOSTER and M. D. EWEY, 1965, Laser Mode Locking by an External Doppler Cell, Appl. Phys. Lett. 6, 6-8.
CUMMINS, H. Z. and N. KNABLE, 1963, Single Side Band Modulation of Coherent Light by Bragg Reflection from Acoustical Waves, Proc. IEEE 51, 1246.
DAMON, R. W., W. T. MALONEY and D. H. MCMAHON, 1970, Interaction of Light with Ultrasound: Phenomena and Applications, in: Physical Acoustics, eds. W. P. Mason and R. N. Thurston, vol. 7, ch. 5 (Academic Press, New York).
DEBYE, P. and F. W. SEARS, 1932, Scattering of Light by Supersonic Waves, Proc. Nat. Acad. Sci. Wash. 18, 409-414.
DEFEBVRE, A., 1968, Light Diffraction by Ultrasonics (Debye-Sears Effect), Rev. Opt. 47, 149-172.
DEMARIA, A. J., 1963, Ultrasonic-Diffraction Shutters for Optical Maser Oscillators, J. Appl. Phys. 34, 2984-2988.
DEMARIA, A. J. and G. E. DANIELSON, 1966, Internal Laser Modulation by Acoustic Lens-Like Effects, IEEE J. Quantum Electron. QE-2, 157-164.
DEMARIA, A. J., R. GAGOSZ and G. BARNARD, 1963, Ultrasonic-Refraction Shutters for Optical Maser Oscillators, J. Appl. Phys. 34, 453-456.
DIXON, R. W., 1967, Acoustic Diffraction of Light in Anisotropic Media, Proc. Symp. Mod. Optics, Polytech. Inst. of Brooklyn, and IEEE J. Quantum Electron. 3, 85-93.
DIXON, R. W., 1967, Optical Investigation of Magnetically Induced Elastic Wave Dispersion in YIG, J. Appl. Phys. 38, 3634-3640.
DIXON, R. W., 1967, Photoelastic Properties of Selected Materials and Their Relevance for Applications to Acoustooptic Light Modulators and Scanners, J. Appl. Phys. 38, 5149-5153.
DIXON, R. W., 1969, Multiwavelength Acoustooptic Devices, private communication.
DIXON, R. W., 1967, Acoustic Nonlinear Frequency Mixing Detected Using Optical Bragg Diffraction, Appl. Phys. Lett. 11, 340-344.
DIXON, R. W., 1970, Acoustooptic Interactions and Devices, IEEE Transact. Electron Devices ED-17, 229-235.
DIXON, R. W. and A. N. CHESTER, 1966, An Acoustic Light Modulator for 10.6 μm, Appl. Phys. Lett. 9, 190-192.
DIXON, R. W. and M. G. COHEN, 1966, A New Technique for Measuring Magnitudes of Photoelastic Tensors and its Application to Lithium Niobate, Appl. Phys. Lett. 8, 205-207.
DIXON, R. W. and E. I. GORDON, February 1967, Acoustic Light Modulators Using Optical Heterodyne Mixing, Bell Syst. Tech. J. 46, 367-389.
DIXON, R. W. and H. MATTHEWS, 1967, Diffraction of Light by Elastic Waves in YIG, Appl. Phys. Lett. 10, 195-197.
DURAN, J., 1969, Critical Study and Improvements of the Methods for Measuring Elastooptical Constants by Ultrasonics, Rev. Physique Appl. 4, 57-62.
DURAN, M. J. and M. S. PAUTHIER-CAMIER, 1968, Critique et Amélioration de la Sensibilité des Méthodes de Mesure des Constantes Elastooptiques par des Ultrasons, C. R. Acad. Sci. Paris B267, 1308-1311.
EXTERMANN, R. and G. WANNIER, 1936, Théorie de la Diffraction de la Lumière par les Ultrasons, Helv. Phys. Acta 9, 520-532.
EXTERMANN, R. and G. WANNIER, 1937, Théorie de la Diffraction de la Lumière par les Ultrasons, Helv. Phys. Acta 10, 185-217.
FARNELL, 1961, Elastic Waves in Trigonal Crystals, Can. J. Phys. 39, 65-79.
FEDOROV, F. I., 1968, Theory of Elastic Waves in Crystals (Plenum Press, New York).
FELDMAN, M., A. W. WARNER, J. P. GRIFFIN and R. E. DEAN, 1971, Reentrant Acoustooptic Deflectors and Modulators, private communication.
FLINCHBAUGH, D. A., 1965, Focusing Ultrasonic System Applicable to Two-Dimensional Optical Beam Scanning and Laser Output Modulation, J. Acoust. Soc. Am. 37, 975.
FOSTER, N. F., G. A. COQUIN, G. A. ROZGONYI and F. A. VANATTA, 1968, Cadmium Sulphide and Zinc Oxide Thin-Film Transducers, IEEE Transact. Sonics and Ultrason. SU-15, 28-41.
FOSTER, L. C., C. B. CRUMLY and R. L. COHOON, 1970, A High Resolution Linear Optical Scanner Using a Traveling Wave Acoustic Lens, Appl. Optics 9, 2154-2160.
FULLER, G. G., 1970, An Experimental Laser-Photochromic Display System, Radio and Electron. Eng. 39, 123-129.
GABRIELLI, I., 1969, High Frequency Modulation of Light by Ultrasonic Progressive Waves, Acustica 21, 97-103.
GERIG, J. S. and H. MONTAGUE, 1964, A Simple Optical Filter for Chirp Radar, Proc. IEEE 52, 1753.
GIAROLA, A. J. and T. R. BILLETER, 1963, Electroacoustic Deflection of a Coherent Light Beam, Proc. IEEE 51, 1150-1151.
GILL, S. P., 1964, The Diffraction of Light by Sound, Harvard Univ. Acoust. Res. Lab. Tech. Memo. 58.
GIRES, F., 1969, Dispositif Enlargissant une Dimension Seulement du Faisceau de Lumière issu d'un Laser, Rev. Physique Appl. 4, 505-506.
GIRES, F. and C. LARDAT, 1970, Un Corrélateur Optique Compact pour le Traitement des Signaux Electriques Codés, Rev. Tech. Thomson-CSF 2, 205-216.
GORDON, E. I. and M. G. COHEN, 1965, Electrooptic Diffraction Grating for Light Beam Modulation and Diffraction, IEEE J. Quant. Electr. 1, 191-198.
GORDON, E. I., 1966, Figure of Merit for Acoustooptical Deflection and Modulation Devices, IEEE J. Quant. Electr. 2, 104-105.
GORDON, E. I., 1966, A Review of Acoustooptical Deflection and Modulation Devices, Proc. IEEE 54, 1391-1401.
HAKKI, B. W. and R. W. DIXON, 1969, Phonon Frequency Spectra of Traveling Acoustoelectric Domains in CdS, Appl. Phys. Lett. 14, 185-188.
HANCE, H. and J. K. PARKS, 1965, Wide-Band Modulation of a Laser Beam Using Bragg-Angle Diffraction by Amplitude-Modulated Ultrasonic Waves, J. Acoust. Soc. Am. 38, 14-23.
HARGROVE, L. E., 1962, Optical Effects of Ultrasonic Waves Producing Phase and Amplitude Modulation, J. Acoust. Soc. Am. 34, 1547-1552.
HARGROVE, L. E., 1964, Successive Diffraction Theory for Diffraction of Light by Ultrasonic Waves of Arbitrary Wave Form, J. Acoust. Soc. Am. 36, 323-326.
HARGROVE, L. E., 1967, Exact Simplification of Time-Dependent Ultrasonic Standing Wave Light Diffraction Equations, J. Acoust. Soc. Am. 41, 91-92.
HARGROVE, L. E., 1967, Fourier Series for Intensity of Light Modulated by Ultrasonic Standing Waves, IEEE Transact. Sonics and Ultrason. SU-14, 33-36.
HARGROVE, L. E., 1967, Limits of Validity of Some Optical Ultrasonic Waveform Determination Methods, J. Acoust. Soc. Am. 41, 1025-1028.
HARGROVE, L. E., 1968, Effects of Ultrasonic Waves on Gaussian Light Beams with Diameter Comparable to Ultrasonic Wavelength, J. Acoust. Soc. Am. 43, 847-851.
HARGROVE, L. E., 1968, Fourier Series for Intensity of a Small Gaussian Light Beam Modulated by a Progressive Ultrasonic Wave, J. Acoust. Soc. Am. 43, 1448-1449.
HARGROVE, L. E., 1971, Diffraction of a Gaussian Light Beam by Cylindrical Ultrasonic Standing Waves, J. Acoust. Soc. Am. 49, 120.
HARGROVE, L. E. and K. ACHYUTAN, 1965, Use of Light Diffraction in Measuring the Parameter of Nonlinearity of Liquids and the Photoelastic Constants of Solids, in: Physical Acoustics, vol. 2B, ed. W. P. Mason (Academic Press, New York) ch. 12.
HARGROVE, L. E., R. L. FORK and M. A. POLLACK, 1964, Locking of He-Ne Laser Modes Induced by Synchronous Intracavity Modulation, Appl. Phys. Lett. 5, 4-5.
HARGROVE, L. E., E. A. HIEDEMANN and R. MERTENS, 1962, Diffraction of Light by Two Spatially Separated Ultrasonic Waves of Different Frequencies, Z. Phys. 167, 326-336.
HARRIS, S. E., S. T. NIEH and D. K. WINSLOW, 1969, Electronically Tunable Acoustooptic Filter, Appl. Phys. Lett. 15, 325-326.
VII
BIBLIOGRAPHY
283
HARRIS,S. E. and R. W. WALLACE,1969, Acoustooptic Tunable Filter, J. Opt. SOC. Am. 59, 744-747. HENDERSON, D. M. and R. L. ABRAMS, 1970, A Comparison of Acoustoptic and Electrooptic Modulators a t 10.6 Microns, Optics Communications 2, 223-226. HIEDEMANN, E. A. and M. A. BREAZEALE, 1959, Secondary Interference in the Fresnel Zone of Gratings, J. Opt. SOC.Am. 49, 372-375. HILLYARD, N. C. and H. G. JERRARD, 1962, Theories of Birefringence Induced in Liquids by Ultrasonic Waves, J. Appl. Phys. 33, 3470-3479. ILIN,V. S. and G. P. KOSTYWNINA, 1967, Diffraction of Electromagnetic Waves by Ultrasonic Waves in an Anisotropic Medium 11, lz. Vysh. Ucheb. Zaved. Radiofiz. SSSR 10, 703-718. INABA,H. and T. KOBAYASHI, 1965, Ultrasonic Frequency Modulation of Laser Oscillation from Nd3+ Glass Rod, Z. Angew. Math. Phys. 16, 66-67. IYENGAR, K. S., 1955, Refraction of Light in Solids by Ultrasonic Waves, P r o c h d i a n Acad. Sci. 41, 25-29. KAMENSKII, V. I. and V. L. POKROVSKII, 1969, Influence of an Acoustic Wave on the Optical Properties of Crystals, Sov. Phys. Solid State 10, 2841-2843. I. P. and E. H. TURNER, 1966, Electrooptic Light Modulators, Proc. IEEE 54, KAMINOV, 1374-1390. KECK,G., 1956, Sichtbarmachung von Stehenden Ultraschall Feldern und AkustischOptische Bildwandlung, Acustica 6 , 543-548. KEMP,J. C., 1969, Piezo-optical Birefringence Modulators: New Use for a Long-Known Effect, J. Opt. SOC.Am. 59, 950-954. KING, M., W. R. BENNETT, L. B. LAmERT and M. ARM, 1967, Real-Time Electrooptical Signal Processors with Coherent Detection, Appl. Optics 6, 1367-1375. KITTEL,C., 1968, Introduction to Solid State Physics, 3rd ed. (Wiley, New York). KLEIN,W. R., 1966, Theoretical Efficiency of Bragg Devices, Proc. IEEE 54, 803-804. KLEIN,W. R. and B. D. COOK,1967, Unified Approach to Ultrasonic Light Diffraction, IEEE Transact. Sonics and Ultrason. SU-14, 123-134. and E. A. HIEDEMANN, 1965, Experimental Study of Fraunhofer KLEIN,W. R., C. B. 
TIPNIS Light Diffraction by Ultrasonic Beams of Moderately High Frequency at Oblique Incidence, J. Acoust. SOC.Am. 38, 229-233. KOHN,E. S., 1969, Effect of the Bandgap on the Elastooptic and Electrooptic Properties of Hexagonal Semiconductors, J. Appl. Phys. 40, 2608-2613. KOGELNIK, H., 1967, Hologram Efficiency and Response, Microwaves 6 , 68-73. H., 1969, Coupled Wave Theory for Thick Hologram Gratings, Bell Syst. Tech. KOGELNIK, J. 48, 2909-2947. KOGELNIK, H. and T. LI, 1966, Laser Beams and Resonators, Proc. IEEE 54, 1312-1329. 1954, The Study of Sound Field by Means of Optical RefracKOLB,J. and A. 0. LOEBER, tion Effects, J. Acoust. SOC.Am. 26, 249-251. KORPEL, A,, 1968, Acoustic Imaging by Diffracted Light. I. Two-Dimensional Interaction, IEEE Transact. Sonics and Ultrason. SU-15, 153-1 57. KORPEL,A., R. ADLER,P. DESMARES and T. M. SMITH,1965, A n Ultrasonic Light Deflection System, IEEE J. Quantum Electronics QE-1, 60-61. KORPEL,A., R. ADLER,P. DESMARES and W. WATSON,1966, A Television Display Using Acoustic Deflection and Modulation of Coherent Light, Proc. IEEE 54, 1429-1437. KRISCHER, C., 1968, Measurement of Sound Velocities in Crystals Using Bragg Diffraction of Light and Applications, Appl. Phys. Lett. 13, 310-311. C., 1970, Optical Measurements of Ultrasonic Attenuation and Reflection KRISCHER, Losses in Fused Silica, J. Acoust. SOC.Am. 48, 1086-1092. KULIASKO, F., R. MERTENS and 0. LEROY,1968, Diffraction of Light by Supersonic Waves: Solution of the Raman-Nath Equations, Proc. Ind. Acad. Sci., Sec. A 67, 295311.
284
ELASTOOPTIC LIGHT MODULATION A N D DEFLECTION
[VI
KUPPERS,H., 1966, The Diffraction of Light by Transverse Ultrasonic Waves in Crystals, Acustica 16, 365-367. LAMBERT, L. B., 1962, Wide Band Instantaneous Spectrum Analyzers Employing Delay Line Light Modulators, IRE Intern. Conv. Record Pt. 6, 69-98. LAMACCHIA, J. T. and G. A. COQUIN,1971, Simultaneous X, Y Acoustooptic Deflections, Proc. IEEE 59, 304-305. LANDOLT, H. H. and R. BORNSTEIN, 1966, Elastic, Piezoelectric, Piezooptic and Electrooptic Constants of Crystals (Springer, Berlin) Group 111, 1, 137. LEAN,E. G., M. L. DAKSSand C. G. POWELL, 1969, Efficiencies and Bandwidths of Intracavity Acoustooptic Devices, IBM J. Res. and Develop 13, 184-191. LEAN,E. G. H. and C . G. POWELL,1970, Optical Probing o f Surface Acoustic Waves, Proc. IEEE 58, 1939-1947. LEAN,E. G. H., C . F. QUATEand H. J. SHAW,1967, Continuous Deflection of Laser Beams, Appl. Phys. Lett. 10, 48-50. LIPNICK,R., A. RElCH and G. A. SCHOEN,1965, Nonmechanical Scanning of Light in One and Two Dimensions, Proc. IEEE (Correspond.) 53, 321. LOEBER, A. P. and E. A. HIEDEMANN, 1954, A Simple Method for Photographing the Sound Pressure Distribution in a Stationary Ultrasonic Wave, J. Acoust. SOC.Am. 26, 257. LOEBER, A. P. and E. A. HIEDEMA", 1956, Investigation of Stationary Ultrasonic Waves by Light Refraction, J. Acoust. SOC.Am. 28, 27-35. LUCAS,R., 1932, Sur la Diffraction de la Lumiere par les Ondes Elastiques, Compt. Rend. Acad. Sci. Paris 195, 1066-1068. LUCAS,R. and P. BIQUARD, 1932, Nouvelles Proprietts Optiques des Liquides Soumis a des Ondes Ultrasonores, Compt. Rend. Acad. Sci. Paris 194, 21 32-21 34. MALONEY, W. T., 1969, An Ultrasonic Shutter for Noise Reduction in Real-Time Optical Correlators, Appl. Optics 8,443-446. MALONEY, W. T., 1969, Acoustooptical Approaches to Radar Signal Processing, October 1969, IEEE Spectrum, 40-48. MALONEY, W. T. and H. R. CARLETON, 1967, Light Diffraction by Transverse Ultrasonic Waves in Hexagonal Crystals, IEEE Transact. Sonics and Ultrason. SU-14, 135-139. 
MALONEY, W. T., G. MELTZand R. L. GRAVEL, 1968, Optical Probing of the Fresnel and Fraunhofer Regions o f a Rectangular Acoustic Transducer, IEEE Transact. Sonics and Ultrason. SU-15, 167-172. MARADUDIN, A. A. and E. BURSTEIN, 1967, Relation Between Photoelasticity, Electrostriction, and First-Order Raman Effect in Crystals o f the Diamond Structure, Phys. Rev. 164, 1081-1099. MCMAHON, D. H., 1969, Relative Efficiency of Optical Bragg Diffraction as a Function of Interaction Geometry, IEEE Transact. Sonics and Ultrason. SU-16, 41-44. MASON,W. P., 1948, Electromechanical Transducers and Wave Filters, 2nd ed. (D. Van Nostrand Co., New York). MASON,W. P. 1958, Physical Acoustics and the Properties of Solids (D. Van Nostrand Co., New York). MAYDAN,D., 1970, Acoustooptical Pulse Modulators, IEEE J. Quantum Electronics QE-6, 15-24. MAYDAN, D., 1970, Fast Modulator for Extraction o f Internal Laser Power, J. Appl. Phys. 41, 1552-1559. MAYER,W. G., 1964, Light Diffraction by Ultrasonic Waves for Oblique Incidence, J. Acoust. SOC.Am. 36, 779. MAYER, W. G., G. B. LAMERS, and D. C. AUTH, 1967, Interaction o f Light and Ultrasonic Surface Waves, J. Acoust. SOC.Am. 42, 1255-1257. MCSKIMIN, H. J., 1964, Ultrasonic Methods for Measuring the Mechanical Properties of Liquids and Solids, in: Physical Acoustics, vol. IA, ed. W. P. Mason (Academic Press, New York).
VII
BIBLIOGRAPHY
285
MEITZLER, A. H., 1971, Piezoelectric Transducer Materials and Techniques for Ultrasonic Devices Operating Above 100 MHz, in: Ultrasonic Transducer Materials, ed. 0. E. Mattiat (Plenum Press, New York). MEITZLER,A. H. and E. K. SITTIG,1969, Characterization of Piezoelectric Transducers Used in Ultrasonic Devices Operating Above 0.1 GHz, J. Appl. Phys. 40, 4341-4352. 1968, Optical Correlation of Fresnel Images, Appl. MELTZ,G. and W. T. MALONEY, Optics 7, 2091-2099. MERTENS, R., 1955, On the Diffraction of Light by Progressive and Standing Supersonic Waves, Proc. Indian Acad. Sci. (A) 42, 195-198. MICHAEL, A. J., 1968, Intensity Method for Stress-Optical Measurements, J. Opt. SOC.Am. 58, 889-894. J. B. 1968, Operation of Ultrasonic Light Modulators Under Conditions of MINKOFF, Random Electrical Excitation, J. Acoust. SOC.Am. 44, 903-91 1. MORI,H. and T. SUMINOKURA, 1968, lnterferometric Method for Measuring Ultrasonic Light Diffraction Spectra, Japan J. Appl. Phys. 7, 1518-1522. MUELLER, H., 1935, Theory of the Photoelastic Effect of Cubic Crystals, Phys. Rev. 47, 947-9 57. MUELLER, H., 1938, Determination of Elastooptical Constants with Supersonic Waves, Z. Kristall. 99, 122. NELSON, D. F. and M. LAX,1970, New Symmetry for Acousto-optic Scattering, Phys. Rev. Lett. 24, 379-380. NELSON,D. F. and M. LAX,1971, Theory of the Photoelastic Interaction, Phys. Rev. (B) 3, 2778-2794. NELSON,D. F. and P. D. LAZAY,1970, Measurement of the Rotational Contribution to Brillouin Scattering, Phys. Rev. Lett. 25, 1187-1 191. 1965, Certain Photoelastic Properties of Gallium NIKITENKO, V. I. and G. P. MARTYNENKO, Arsenide and Silicon, Sov. Phys. Solid State 7, 494-496. NOMOTO, O., 1954, Theory of the Visualization of Ultrasonic Waves I and 11, J. Phys. SOC. Japan 9,267-278 and 279-286. 1968, Theory of the Diffraction of Light by Ultrasonic NOMOTO, 0. and Y. TORIKAI, Waves: A Successive Diffraction Computation, in: Reports 6th Intern. Congr. Acoust. 
Tokyo (Elsevier Publishing Co., New York, 1969) paper H-4-7. NYE.J. F., 1967, Physical Properties of Crystals (Oxford Clarendon Press). OHMACHI, Y. and N. UCHIDA,1971, Acoustic and Acoustooptical Properties of P b 2 M o 0 5 Single Crystal, J. Appl. Phys. 42, 521-524. OKOLICSANYI, F., 1937, The Wave Slot, An Optical Television System, Wireless Eng. 14, 527-536. PARKS,J. K., 1969, An Acoustooptic Receiver and Fast Spectrum Analyzer for Electromagnetic Signals in the VHF-UHF Range, IEEE Transact. Commun. Tech. COM-17, 686-700. PARTHASARATHY, S., M. PANCHOLY and H. SINGH,1954, Diffraction of Light by Several Ultrasonic Beams: Intensityof theCombinationLines, J. Sci. and Industr. Res. 13b, 81-83. PARTHASARATHY, S. and H. SINGH,1954, Diffraction de la Luniikre par deux Faisceaux d’Ultra-sons, Ann. de Physique (12) 9, 382-384. PEDINOFF, M. E. and H. A. SEGUIN,1967, Direct Measurement of Infrared Photoelastic Constants of Silicon, IEEE J. Quantum Electron. 3, 31-32. PETERSON, G . E. and P. M. BRIDENBAUGH, 1965, Time Resolution in Acoustic Mode Patterns in K D P Crystals, Appl. Optics 4, 1655-1659. PHARISEAU, P., 1956, On the Diffraction of Light by Progressive Supersonic Waves, Proc. Ind. Acad. Sci. 44A, 165-170. PHILLIPS,J. C., 1967, Bond Bending and Stretching Model of Photoelastic Constants, Phys. Lett. 25A, 727-728.
286
ELASTOOPTIC LIGHT MODULATION A N D DEFLECTION
[VI
PINNOW,D. A., 1970, Guide Lines for the Selection of Acoustooptic Materials, IEEE J. Quantum Electronics QE-6, 223-238. PINNOW,D. A., 1971, Acoustooptic Light Deflection: Design Considerations for First Order Beam Steering Transducers, IEEE Transact. Sonics and Ultrason. SU-18, 209214. PINKOW,D. A,, 1972, Elasto-Optics, in: Laser Handbook, eds. F. T. Arecchi and E. 0. Schulz-Dubois (North-Holland, Amsterdam), to be published. PINNOW,D. A. and R. W. DIXON,1968, Alpha-Iodic Acid: A Solution-Grown Crystal with a High Figure of Merit for Acoustooptic Device Applications, Appl. Phys. Lett. 13, 156-158. PINNOW,D. A,, L. G . VANUITERT, A. W. WARNERand W. A. BONNER,1969, Lead Molybdate: A Melt-Grown Crystal with a High Figure of Merit for Acoustic Device Applications, Appl. Phys. Lett. 15, 83-86. PINNOW,D. A., S. R. WILLIAMSON and J. T. LAMACCHIA, 1969, Acoustooptic Light Deflection -The Design and Operation of A Simple X, Y-Deflection System, J. Opt. SOC. Am. 59, 490. PORRECA, F., 1953, On the Propagation of Narrow Light Beams in Liquids Traversed by Ultrasound, Nuovo Cimento 6, 274-281. F., 1955, On the Persistence of a Phase Grating in Some Suspensions When PORRECA, Stopping the Supersonic Waves, Nuovo Cimento 2, 904-906. PRIMAK, W. and D. Post, 1959, Photoelastic Constants of Vitreous Silica an-l Its Elastic Coefficient of Refractive Index, J. Appl. Phys. 30, 779-788. QUATE,C. F., C. D. W. WILKINSON and D. K. WINSLOW,1965, Interaction of Light and Microwave Sound, Proc. IEEE 53, 1604-1621. RAMAN, C. V. and N. S. N. Nath, 1936, The Diffraction of Light by High Frequency Sound Waves, Generalized Theory, Proc. Ind. Acad. Sci. 4, 222-242. RANDOLPH, J. and J. MORRISON, 1971, Modulation Transfer Characteristics of an Acoustooptic Deflector, Appl. Optics 10, 1383-1385; 1453-1454. RAO, R. C., 1955, Intensity Measurements of Ultrasonic Diffraction Orders, Proc. Ind. Acad. Sci (A) 42, 331-335. RAO,B. 
R., 1956, Die Beugung von Licht Durch Ultraschallwellen bei 300 MHz, Nature (London) 178, 160-161. RAO, 9. R. and J. S. MURTY,1958, Diffraction of Light by Weak Ultrasonic Fields, Z. Physik 152, 440-447. RAMAVATARAM, K., 1955, Ultrasonic Diffraction Patterns in Some Optical Glasses, J. Opt. SOC.Am. 45, 749-750. REEDER, T. M. and D. K. WINSLOW, 1969, Characteristics of Microwave Acoustic Transducers for Volume Wave Excitation, IEEE Transact. Microwave Theory and Techn. MTT-17, 921-941. ROSENTHAL, A. H., 1955, Color Control by Ultrasonic Wave Gratings, J. Opt. SOC.Am. 45, 751-756. ROSENTHAL,A. H., 1961, Application of Ultrasonic Light Modulation to Signal Recording, Display, Analysis and Communication, IRE Transact. Ultrason. Eng. 8, 1-5. SACOCCIO, E. J. 1967, Application of the Dynamical Theory of X-ray Diffraction to Holography, J. Appl. Phys. 38, 3994-3998. SCHMIDT, R. V., 1970, Optical Probing of Bulk Waves Present in Acoustic Surface Wave Delay Lines (LiNbOa), Appl. Phys. Lett. 17, 369-371. SCHULZ,M. B. 1968, Polarization of Light Bragg Diffracted by Sound in Optically Isotropic Solids, IEEE J. Quant. Electronics QE-4, 232-234. SCHULZ,M. B., M. G. HOLLAND and L. DAVIS,1967, Optical Pulse Compression Using Bragg Scattering by Ultrasonic Waves, Appl. Phys. Lett. 11, 237. SIEGMAN, A. E., C. F. QUATE,J. BJORKHOLM and G . FRANCOIS, 1964, Frequency Translation of a He-Ne Laser, Output Frequency by Acoustic Output Coupling Inside the Resonant Cavity, Appl. Phys. Lett. 5 , 1-2.
VII
BIBLIOGRAPHY
287
SITTIG,E. K., A. W. WARNER! and H. D. COOK,1969, Bonded Piezoelectric Transducers for Frequencies Beyond 100 MHz, Ultrasonics 7, 108-112. SITTIG,E. K. and G . A. COQUIN,1970, Visualization of Plane-Strain Vibration Modes of a Long Cylinder Capable of Producing Sound Radiation, J. Acoust. SOC.Am. 48, 11501159. SITTIG,E. K., 1969, Effects of Bonding and Electrode Layers on the Transmission Parameters of Piezoelectric Transducers Used in Ultrasonic Digital Delay Lines, IEEE Transact. Sonics and Ultrason. SU-16, 2-10. SLAYMAKER, F. H., 1968, Real-Time Debye-Sears Effect Spectrum Analyzer for Audio Frequencies, J. Acoust. SOC.Am. 44, 1140-1 142. SLIWINSKI, A. and M. LABOWSKI, 1968, Some Aspects of Laser Light Diffraction by an Ultrasonic Wave in Inhomogeneous Media, in: Reports 6th Intern. Congr. Acoust. Tokyo (Elsevier Publishing Co., New York, 1969) paper H-4-9. SMITH,A. W., 1968, Diffraction of Light by Magnetoeleastic Waves, IEEE Transact. Sonics and Ultrason. SU-15, 161-167. SMITH,T. M. and A. KORPEL,1965, Measurement of Light-Sound Interaction Efficiencies in Solids, IEEE J. Quantum Electronics QE-1, 283. SMITS,F. M. and L. E. GALLAHER, 1967, Design Considerations for a Semipermanent Optical Memory, Bell Syst. Tech. J. 56, 1267-1278. SORIN,A. S., V. E. PERFILOVA and A. S. VASILEVSKAYA, 1967, Phenomenological Description of the Electrooptical and Elastooptical Properties of Ferroelectric Materials, Bull. Acad. Sci. USSR Phys. Ser. 31, 1142-1 145. SPEARS, V. L. and R. BRAY,1968, Modulation of Optical Absorption at the Intrinsic Edge by Acoustoelectric Domains in n-Gallium Arsenide, Appl. Phys. Lett. 12, 118-120. 1967, Dielectric Materials for ElectroSPENCER, E. G., P. V. LENZOand A. A. BALLMAN, optic, Elastooptic and Ultrasonic Device Applications, Proc. IEEE 55, 2074-2108. SWEENEY, H. E. and W. E. BUDD,1970, The Number of Resolvable Spots in a Photoelastic Beam Deflection System, Proc. IEEE 58, 1162-1 163. TELL,B., J. M. WORLOCK and R. J. 
MARTIN,1965, Enhancement of Elastooptic Constants in the Neighborhood of a Bandgap in Zinc Oxyde and Cadmium Sulfide, Appl. Phys. Lett. 6, 123-124. THALER, W. J., 1964, Frequency Modulation of an He-Ne Laser Beam Via Ultrasonic Waves in Quartz, Appl. Phys. Lett. 5, 29-31. THURSTON, R. N., 1964, Wave Propagation in Fluids and Normal Solids, in: Physical Acoustics, vol. IA ed. W. P. Mason, (Academic Press, New York). TIERSTEN, H. F., 1970, Electromechanical Coupling Factors and Fundamental Material Constants of Thickness Vibrating Piezoelectric Plates, Ultrasonics 8, 19-23. TOEPLER, A., 1867, Optische Studien nach der Methode der Schlieren Beobachtung, Pogg. Ann. d. Physik 131, 180-215. TORGUET, R. and E. DIEULESAINT, 1969, Continuous Deflection of Light with Longitudinal Acoustic Waves, Electron. Lett. 5, 632-633. TSAI,C. S., 1971, The Increase of Bragg Diffraction Intensity Due to Acoustic Resonance and Its Application for Demultiplexing and Multiplexing in Laser Communication, Appl. Optics 10, 215-218. TSAI,C. A. and B. A. AULD, 1966, Multiple Acoustic Diffraction Techniques for Frequency Shifting of Laser Beams, Proc. IEEE 54, 1217-1218. TSAI,C. S. and H. V. HANCE,1967, Optical Imaging of the Cross Section of a Microwave Acoustic Beam in Rutile by Bragg Diffraction of a Laser Beam, J. Acoust. SOC.Am. 42, 1345-1347. TSAI,C. S. and H. V. HANCE,1970, Experimental Investigation of the Resolution Capability of Microwave Ultrasonic Beam Visualization Techniques Using Bragg Diffraction of a Laser Beam, J. Acoust. SOC.Am. 48, 1110-1 118. UCHIDA,N., 1968, Elastooptic Coefficient of Liquids Determined by Ultrasonic Light Diffraction Method, Japan J. Appl. Phys. 7, 1259-1267.
288
E L A S T O O P T I C LIGHT M O D U L A T I O N AND DEFLECTION
[VI
UCHIDA,N., 1969, Direct Measurement of Photoelastic Cofficients by Ultrasonic Light Diffraction Technique, Japan J. Appl. Phys. 8, 329-333. UCHIDA,N. and H. IWASAKI,1969, Two-Dimensional Acoustooptical Deflector, Japan J. Appl. Phys. 8, 811. UCHIDA,N. and Y. OHMACHI, 1969, Elastic and Photoelastic Properties of TeOz Single Crystal, J. Appl. Phys. 40, 4692-4695. 1970, Acoustooptical Light Deflector Using TeOZ Single UccrrDA, N. and Y. OHMACHI, Crystal, in: Digest 1V Intern. Quant. Elect. Conf. Kyoto, Japan, paper 5-2. 1970, Acoustooptical Light Deflector Using TeOz Single UCHIDA,N. and Y. OHMACHI, Crystal, Japan J. Appl. Phys. 9, 155-156. VASILEVSKAYA, A. S. and A. S. SORIN,1969, Electrooptical and Elastooptical Properties of Crystals of the Potassium Dihydrogen Phosphate Group and Their Relation to Structure, Kristallographiya 14, 713-716. VASILEVSKAYA, A. S. and A. S. SORIN,1967, Electrooptical and Elastooptical Properties of Deuterated Ammonium Dihydrogen Phosphate Crystals, Sov. Phys. Solid State 8, 2756-2757. VASILEVSKAYA, A. S., A. S. SORINand J. S. REZ, 1967, Electrooptical and Elastooptical Properties of Alkali Metal Dihydrogen Arsenates, Sov. Phys. Solid State 9, 986-987. and A. A. BALLMAN, 1969, Elastooptic Properties of VENTURINI, E. L., E. G. SPENCER BilZGe02,,, BiIzSiOzo,and SrxBal-xNbL06, J. Appl. Phys. 40, 1622-1624. WEMPLE, S. H. and M. DIDOMENICO, 1970, Theory of the Elastooptic Effect in Nonmetallic Crystals, Phys. Rev. B1, 193-202. R., A. KORPEL and S. LOTSOFF,1967, Application of Acoustic Bragg Diffraction WHITMAN, to Optical Processing Techniques, in: Proc. Symp. Modern Optics (Polytechnic Press) pp. 243-256. WHITMAN,R. L. and A. KORPEL,1969, Probing of Acoustic Surface Perturbations by Coherent Light, Appl. Opt. 8, 1567-1576. Wu, WE[-HAU, 1970, Calculation of Deflection Efficiency in a Phase Modulated Acoustooptical Deflector, Appl. Opt. 9, 506-507. 
M., 1970, Far-Field Diffraction Patterns of Scattering Light Arising From YAMASAKI, Interaction Between Acoustic and Light Waves with Complex Amplitude Distributions, Japan J. Appl. Phys. 9, 497-504. ZANKEL,K. L., 1959, The Effect of a Progressive Ultrasonic Wave o n a Light Beam of Finite Width, Naturwissenschaften 46, 105-106. K. L. and E. A. HIEDEMANN, 1960, Diffraction of a Narrow Beam of Light by ZANKEL, Ultrasonic Waves, IRE Transact. on Ultrason. Eng. UE-7, 71-75. ZITTER, R. N., 1968, Ultrasonic Diffraction of Light by Short Acoustic Pulses, J. Acoust. SOC. Am. 43, 864-870.
VII
QUANTUM DETECTION THEORY
BY
CARL W. HELSTROM
Department of Applied Physics and Information Science, University of California, La Jolla, Calif. 92037, USA
CONTENTS
§ 1. DETECTION THEORY
§ 2. DETECTION THEORY IN QUANTUM MECHANICS
§ 3. DETECTION OF A COHERENT SIGNAL
§ 4. MODAL DECOMPOSITION OF APERTURE FIELDS
§ 5. DETECTION OF INCOHERENT LIGHT
§ 6. ESTIMATION THEORY
ACKNOWLEDGMENT
REFERENCES
§ 1. Detection Theory
To the question “What is the weakest light detectable?” the easy answer is “one photon”. Yet it is not the presence of any light whatever that is ordinarily of interest, but the presence of light coming from a particular source or possessing specific characteristics. The astronomer wants to know whether there is a star at a certain point in the sky, the spectroscopist whether light of a definite wavelength is being emitted. In a laser radar the reception of a light pulse from a certain direction reveals the presence of a target. Our question must include the nature of the light to be detected, by, for instance, specifying the electromagnetic field whose presence or absence at the aperture of some observing instrument is to be determined.
Detection involves decision. Is the specified light field present or not? An observer must decide, but decide he cannot without liability to error. Thermal background light is usually incident on the aperture of his instrument, and because of its random nature, this background field cannot be predicted and subtracted. The light to be detected is itself a random phenomenon, both because of the chaos inherent in natural sources and because of the photonic character shared by all forms of light, coherent or not. When we say that a certain light field is detectable, therefore, we must also state with what probability of error. Our question is complex, and “one photon” is an inadequate response.
A receiver of light is to be viewed as making (or assisting an observer to make) decisions about the presence or absence of a specific light field at its aperture, and these decisions are subject to error. The most effective optical system permits them to be made with minimum probability of error, and the performance of this optimum system determines the weakest detectable light.
How to design the optimum optical receiver is a problem in detection theory, which is basically an application of the statistical testing of hypotheses, or “decision theory”. We shall begin with an outline of the detection theory appropriate for classical physics and then show how it is modified to take the laws of quantum mechanics into account. After that a model of an ideal quantum receiver will be proposed and analyzed to determine the detectabilities of both coherent laser light and incoherent natural light. Our main concern will be to discover the fundamental limitations on optical detection imposed by the nature of the wanted light and the unwanted background illumination.
Detailed treatments of quantum detection theory and its applications can be found in papers by the writer [1967-1970]. A review by HELSTROM et al. [1970] stresses its relation to optical communications. Articles bringing it into the study of optical communication through the turbulent atmosphere have been published by KENNEDY and HOVERSTEN [1968], KENNEDY [1970], and HOVERSTEN, HARGER and HALME [1970]. Detection theory as based on classical statistics has been applied to the detection of light by means of photosensitive surfaces in papers by REIFFEN and SHERMAN [1963], HELSTROM [1964, 1971], GOODMAN [1966], BAKUT, VYGON, KURIKSHA, REPIN and TARTAKOVSKII [1966], STEFANYUK [1966], BAKUT [1966, 1967], GINZBURG [1966], BAR-DAVID [1969], KARP and CLARK [1970], KARP, O'NEILL and GAGLIARDI [1970], and GAGLIARDI [1972]. We shall not attempt to summarize this work here.
A telescope is used not only to discover a star, but to measure its location and radiant power, and a spectrometer measures the wavelengths and radiant fluxes of spectral lines. A laser radar determines the distance to a target by measuring the time elapsed until the arrival of a reflected coherent light pulse. Each of these instruments can be regarded as estimating certain parameters of the light field at its aperture, and its results are liable to error because of the random nature of that field, subject to quantum fluctuations and corrupted by background light. As the primary field becomes weaker, error increases; and we shall show how the attendant limitations on the measuring power of an optical instrument can be evaluated. To this end we shall appeal to statistical estimation theory. We turn first to detection.
1.1. BINARY DETECTION
The optical receiver is regarded as processing the light field appearing at its aperture $A$ during an interval of time $(0, T)$, and the fundamental limitations on its ability to detect are brought out by asking what an arbitrary instrument processing this aperture field could achieve. We suppose first that during the observation interval the instrument abstracts from the field certain data, represented by a set of numbers $x_1, x_2, \ldots, x_n$, or by a vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. These data might, for instance, be the values of one component of the aperture field at various points $\mathbf{r}_i \in A$ and times $t_i \in (0, T)$. Later the data set will be conceptually expanded until it encompasses all the information available in the field. For the present we shall ignore any quantum-mechanical limitations on the acquisition of the data $\mathbf{x}$, restricting ourselves thus to the domain of classical physics. Quantum detection theory will be introduced in § 2. Details and proofs of what follows can be found in texts such as those by MIDDLETON [1960], HELSTROM [1968c], and VAN TREES [1968].
The instrument is to decide, on the basis of the data $\mathbf{x}$, whether a certain field $\mathcal{F}$ exists as a constituent of the total field at the aperture, or whether the aperture contains only a field $\mathcal{F}_0$ due to the background. It chooses between two hypotheses: ($H_0$) the field $\mathcal{F}$ is absent, $\mathcal{F}_0$ only being at hand, and ($H_1$) the field $\mathcal{F}$ is present along with $\mathcal{F}_0$. The field $\mathcal{F}$ is commonly called the signal, $\mathcal{F}_0$ the noise.
Because of the stochastic nature of the aperture field, the data $x_1, x_2, \ldots, x_n$ are random variables, and the two possibilities $H_0$ and $H_1$ correspond to distinct probability density functions (p.d.f.'s) of the data $\mathbf{x}$. Under hypothesis $H_0$ the data are distributed according to the p.d.f. $p_0(x_1, x_2, \ldots, x_n) = p_0(\mathbf{x})$, under $H_1$ according to the p.d.f. $p_1(x_1, x_2, \ldots, x_n) = p_1(\mathbf{x})$. The former describes the statistical properties of the background field $\mathcal{F}_0$, the latter those of the combined fields $\mathcal{F}$ and $\mathcal{F}_0$. We suppose both p.d.f.'s completely known.
The instrument or the observer using it chooses in effect one of these p.d.f.'s as the more compatible with the data. How the choice is made is termed a strategy and is best visualized in terms of the $n$-dimensional Euclidean space of the data $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. The strategy corresponds to a division of this space into two regions $R_0$ and $R_1$. When the data point $\mathbf{x}$ falls into region $R_0$, hypothesis $H_0$ is selected; when $\mathbf{x}$ falls into $R_1$, $H_1$ is chosen. Fig. 1.1 illustrates this division of a two-dimensional data space.
Fig. 1.1. Decision regions in two dimensions.
We envision our strategy as being tried over and over again. The data being random variables, the point $\mathbf{x}$ falls here and there in the space with probability density $p_0(\mathbf{x})$ or $p_1(\mathbf{x})$ depending on which hypothesis happens to be correct.
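The picture of a strategy being tried over and over can be sketched numerically. The example below is only an illustration under stated assumptions, not a method taken from the text: it assumes a two-dimensional data space with independent unit-variance Gaussian data, a mean of (0, 0) under $H_0$ and (1, 1) under $H_1$, and an arbitrarily chosen boundary $x_1 + x_2 = 1$ separating $R_0$ from $R_1$. All function names are hypothetical.

```python
import random


def simulate_strategy(in_region_r1, sample_h0, sample_h1, trials=20000, seed=0):
    """Repeat the decision many times under each hypothesis.

    Returns the fraction of trials in which the data point lands in R1
    (so that H1 is chosen) when H0 is true and when H1 is true.
    """
    rng = random.Random(seed)
    hits_h0 = sum(in_region_r1(sample_h0(rng)) for _ in range(trials))
    hits_h1 = sum(in_region_r1(sample_h1(rng)) for _ in range(trials))
    return hits_h0 / trials, hits_h1 / trials


def sample_h0(rng):
    # Assumed background-only data: zero-mean, unit-variance Gaussian pair.
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def sample_h1(rng):
    # Assumed signal-plus-background data: the mean is shifted to (1, 1).
    return (rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0))


def in_region_r1(x):
    # Arbitrary illustrative boundary: R1 is the half-plane x1 + x2 > 1.
    return x[0] + x[1] > 1.0


frac_r1_given_h0, frac_r1_given_h1 = simulate_strategy(
    in_region_r1, sample_h0, sample_h1
)
print(f"fraction of trials in R1 when H0 is true: {frac_r1_given_h0:.3f}")
print(f"fraction of trials in R1 when H1 is true: {frac_r1_given_h1:.3f}")
```

Under these assumed densities the data point lands in $R_1$ in roughly a quarter of the trials when $H_0$ is true and in roughly three quarters when $H_1$ is true. Moving the boundary trades one fraction against the other.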
When hypothesis $H_0$ is true and the data point $\mathbf{x}$ falls into region $R_1$, hypothesis $H_1$ is incorrectly chosen, and an error of the first kind is said to occur. Its probability is
$$Q_0 = \int_{R_1} p_0(\mathbf{x})\, d^n x, \tag{1.1}$$
where $d^n x = dx_1\, dx_2 \cdots dx_n$ is the rectilinear volume element in the space. In detection theory $Q_0$ is called the false-alarm probability. When on the other hand $H_1$ is true and $\mathbf{x}$ falls into $R_0$, hypothesis $H_0$ is incorrectly chosen; and an error of the second kind, or “false dismissal”, occurs. Its probability is
$$Q_1 = \int_{R_0} p_1(\mathbf{x})\, d^n x. \tag{1.2}$$
As $H_1$ is associated with the presence of the sought field $\mathcal{F}$ and $H_0$ with its absence, the complementary probability
$$Q_d = 1 - Q_1 = \int_{R_1} p_1(\mathbf{x})\, d^n x \tag{1.3}$$
of choosing $H_1$ when $H_1$ is true is called the probability of detection. Statisticians call $Q_0$ the size, $Q_d$ the power of the statistical test.
At this point a metaphysical element enters the theory. We must specify how serious these errors are, and this is most conveniently done by introducing the costs $C_0$ of an error of the first kind and $C_1$ of an error of the second kind. If in addition we know the prior probability $\zeta$ of hypothesis $H_0$ and $(1 - \zeta)$ of $H_1$, that is, if we know how often in a long series of trials each hypothesis occurs, we can work out the average cost $\bar{C}$ associated with any strategy,
$$\bar{C} = (1 - \zeta) C_1 Q_1 + \zeta C_0 Q_0 = (1 - \zeta) C_1 \left[ 1 - \int_{R_1} p_1(\mathbf{x})\, d^n x \right] + \zeta C_0 \int_{R_1} p_0(\mathbf{x})\, d^n x, \tag{1.4}$$
which we can write as
$$\bar{C} = (1 - \zeta) C_1 \left\{ 1 - \int_{R_1} \left[ p_1(\mathbf{x}) - \Lambda_0\, p_0(\mathbf{x}) \right] d^n x \right\} \tag{1.5}$$
by introducing the constant
$$\Lambda_0 = \frac{\zeta C_0}{(1 - \zeta) C_1}. \tag{1.6}$$
The Bayes criterion for the optimality of a statistical test requires the average cost to be minimum. The strategy that achieves this is called the Bayes strategy, and to find it we move the surface dividing $R_0$ from $R_1$ about the data space until the average cost as given by eq. (1.5) is as small as possible. Minimizing $\bar{C}$ involves maximizing the integral on the right-hand side of eq. (1.5), and this is done by putting into region $R_1$ all points $\mathbf{x}$ at which the integrand is positive, that is, for which $p_1(\mathbf{x}) > \Lambda_0\, p_0(\mathbf{x})$, and into region $R_0$ all the rest. Equivalently, the Bayes strategy chooses hypothesis $H_0$ whenever $\Lambda(\mathbf{x}) < \Lambda_0$ and $H_1$ whenever $\Lambda(\mathbf{x}) > \Lambda_0$, where
$$\Lambda(\mathbf{x}) = p_1(\mathbf{x}) / p_0(\mathbf{x}) \tag{1.7}$$
is called the likelihood ratio. The number $\Lambda_0$ with which it is compared is called the decision level.
As the costs $C_0$ and $C_1$ and the prior probability $\zeta$ may not always be accurately known, an alternative viewpoint dispenses with them and simply fixes the false-alarm probability $Q_0$ at a level that the observer can afford. The strategy maximizing the probability $Q_d$ of detection is then sought. This goal is called the Neyman-Pearson criterion. The same strategy results: pick hypothesis $H_0$ whenever $\Lambda(\mathbf{x}) < \Lambda_0$ and hypothesis $H_1$ otherwise. The decision level $\Lambda_0$ is set so that the false-alarm probability
$$Q_0 = \Pr\left[ \Lambda(\mathbf{x}) > \Lambda_0 \mid H_0 \right] \tag{1.8}$$
equals the pre-assigned value.
The optimum instrument for detecting the field $\mathcal{F}$, then, is one that somehow evaluates the likelihood ratio $\Lambda(\mathbf{x})$ for the field at its aperture $A$ during the observation interval $(0, T)$, the data set $\mathbf{x}$ being augmented by ever finer sampling until all the information contained in the field is utilized. Whether the Bayes or the Neyman-Pearson criterion is preferred does not affect the design of the optimum instrument, but only the value $\Lambda_0$ with which its output $\Lambda(\mathbf{x})$ is compared.
1.2. DISCRETE DATA AND RANDOMIZATION
Practicality may restrict the kind of processing to which the aperture field can be subjected. It may be necessary, for instance, to focus the field onto a photoelectric surface and count the electrons emitted during the observation interval from various regions thereof. A decision strategy utilizing the counts as the data $x = (x_1, x_2, \ldots, x_n)$ and meeting one of the criteria of optimality is needed.
The data are now discrete random variables. The same type of analysis as we have just presented can be carried out; it is merely necessary to replace integrals over probability densities as in eqs. (1.1)-(1.5) with sums over probabilities. The best system is now one that bases its decisions on the likelihood ratio
$$\Lambda(x) = P_1(x)/P_0(x), \tag{1.9}$$
which is the quotient of the probabilities $P_0(x)$ and $P_1(x)$ of the data under the two hypotheses. The likelihood ratio $\Lambda(x)$ will be a discrete random variable, taking values only on a countable set of numbers. Under the Bayes criterion, hypothesis $H_1$ is chosen if $\Lambda(x)$ exceeds the decision level $\Lambda_0$ given by eq. (1.6).
When the Neyman-Pearson criterion is applied to discrete data $x$, the pre-assigned value of the false-alarm probability $Q_0$ may not be attainable by setting $\Lambda_0$ equal to any of the discrete set of values that the likelihood ratio $\Lambda(x)$ can take on. It is then necessary to randomize the decision. Imagine the possible values $\lambda_i$ that the likelihood ratio $\Lambda(x)$ can assume as arranged in ascending order,
$$0 \le \lambda_1 < \lambda_2 < \ldots < \lambda_{i-1} < \lambda_i < \lambda_{i+1} < \ldots < \infty.$$
Each of these has a certain probability
$$P_i = \Pr\{\Lambda(x) = \lambda_i \mid H_0\} \tag{1.10}$$
of occurring under hypothesis $H_0$. Suppose that setting the decision level at one of the $\lambda_i$, say $\Lambda'$, yields a false-alarm probability just too large and that taking the next in order, $\Lambda''$, yields one just too small,
$$\sum_{\lambda_i \ge \Lambda'} P_i > Q_0 > \sum_{\lambda_i \ge \Lambda''} P_i, \quad \Lambda'' > \Lambda'. \tag{1.11}$$
The randomized strategy then chooses hypothesis $H_0$ whenever $\Lambda(x) < \Lambda'$ and $H_1$ whenever $\Lambda(x) \ge \Lambda'' > \Lambda'$; but when $\Lambda(x) = \Lambda'$, hypothesis $H_1$ is chosen with a certain probability $f$, and $H_0$ with probability $1-f$, by for instance tossing a properly biased coin. The value of $f$ is picked so that the false-alarm probability
$$Q_0 = \sum_{\lambda_i > \Lambda'} P_i + f\,\Pr\{\Lambda(x) = \Lambda' \mid H_0\} \tag{1.12}$$
takes on the pre-assigned value. The probability of detection is then
$$Q_d = \sum_{\lambda_i > \Lambda'} \Pr\{\Lambda(x) = \lambda_i \mid H_1\} + f\,\Pr\{\Lambda(x) = \Lambda' \mid H_1\}. \tag{1.13}$$
We shall need this type of randomized strategy when the receiver is one that counts photons. As an example, suppose that the decision is based on a single number $n$ of photons whose distribution under each hypothesis has the Bose form, but with different means, $\bar{n}_0$ and $\bar{n}_1$, $\bar{n}_1 > \bar{n}_0$,
$$P_i(n) = (1-v_i)v_i^n, \quad v_i = \bar{n}_i/(\bar{n}_i+1), \quad i = 0, 1. \tag{1.14}$$
Basing the decision on the likelihood ratio $\Lambda(n)$ is now equivalent to basing it on the number $n$ of counts. A randomized decision rule would set a certain integral decision level $\nu$ and would choose hypothesis $H_1$ when $n > \nu$ and $H_0$ when $n < \nu$, choosing hypothesis $H_1$ with probability $f$ whenever the number $n$ equals $\nu$ exactly. Then eqs. (1.12) and (1.13) become
$$Q_0 = v_0^{\nu+1} + f(1-v_0)v_0^{\nu}, \quad Q_d = v_1^{\nu+1} + f(1-v_1)v_1^{\nu}. \tag{1.15}$$
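The inversion of eq. (1.15) for a prescribed false-alarm probability can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `bose_np_test` and its argument names are hypothetical, the integer level is written `nu`, and the pre-assigned $Q_0$ is taken to lie in $(0, 1]$.

```python
def bose_np_test(nbar0, nbar1, q0):
    """Randomized Neyman-Pearson test for Bose-distributed counts, eqs. (1.14)-(1.15).

    nbar0, nbar1: mean counts under H0 and H1 (hypothetical argument names).
    q0: pre-assigned false-alarm probability, assumed to satisfy 0 < q0 <= 1.
    Returns (nu, f, qd): integer decision level, randomization probability,
    and the resulting detection probability.
    """
    v0 = nbar0 / (nbar0 + 1.0)
    v1 = nbar1 / (nbar1 + 1.0)
    # Smallest integer level nu for which Pr(n > nu | H0) = v0**(nu+1) < q0.
    nu = 0
    while v0 ** (nu + 1) >= q0:
        nu += 1
    # Solve the first of eqs. (1.15) for f; this choice of nu gives 0 < f <= 1.
    f = (q0 - v0 ** (nu + 1)) / ((1.0 - v0) * v0 ** nu)
    # Second of eqs. (1.15).
    qd = v1 ** (nu + 1) + f * (1.0 - v1) * v1 ** nu
    return nu, f, qd


nu, f, qd = bose_np_test(nbar0=1.0, nbar1=3.0, q0=0.1)
print(nu, f, qd)
```

With a mean of one background photon and three photons when the signal is present, a false-alarm probability of 0.1 cannot be met by any integer level alone, so the rule randomizes on the boundary count.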
From the first of eqs. (1.15) the values of $\nu$ and $f$, $0 < f \le 1$, can easily be determined. Further examples, involving photon or photoelectron counting, are found in HELSTROM [1969a, 1971].

1.3. COMPOSITE HYPOTHESES
Up to now we have supposed that the observer or the designer of the optical receiver has complete knowledge of the p.d.f.'s $p_0(x)$ and $p_1(x)$ governing the data $x$ under the two hypotheses, which are therefore termed simple. Although $p_0(x)$ may be entirely known (it describes the background field in our applications), it often happens that the p.d.f. $p_1(x)$ of the data under hypothesis $H_1$ depends on certain parameters whose values are unknown a priori. These are usually parameters of the field to be detected, such as an overall amplitude or flux level, the phase of a coherent field, or the arrival time and wavelength of a laser pulse. Hypothesis $H_1$ is then termed composite, and the p.d.f. of the data under $H_1$ is written as $p_1(x; \theta)$, where $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ stands for the collection of $m$ unknown parameters.
If a prior p.d.f. $z(\theta)$ of these parameters, representing the relative frequencies with which their values fall into various ranges, is known, and if the error cost $C_1$ is independent of the parameters $\theta$, the Bayes criterion requires the choice between the simple hypothesis $H_0$ and the composite hypothesis $H_1$ to be based on the average likelihood ratio
$$\Lambda(x) = \int z(\theta)\Lambda(x; \theta)\,d^m\theta, \tag{1.16}$$
which is compared as before with the decision level $\Lambda_0$ given by eq. (1.6). The Neyman-Pearson criterion now specifies that the average detection probability
$$Q_d[z] = \int_{R_1}\!\int z(\theta)p_1(x; \theta)\,d^m\theta\,d^nx \tag{1.17}$$
be maximum, the false-alarm probability $Q_0$ being fixed; and it leads to comparison of the same average likelihood ratio $\Lambda(x)$ with a decision level $\Lambda_0$ set to yield a pre-assigned false-alarm probability.
Occasionally the Neyman-Pearson test of hypothesis $H_0$ against a hypothesis $H_1$ with a known set of values $\theta$ leads to a dichotomy of the data space into two regions $R_0$ and $R_1$ that is independent of the parameters $\theta$. This happens when the likelihood ratio $\Lambda(x; \theta)$ depends on the data $x$ only through some function $f(x)$ that does not involve the parameters $\theta$. The decision can then be based on the value of $f(x)$ for the data at hand. Such a function $f(x)$ is called a sufficient statistic. The test is said to be uniformly most powerful and applies whatever the prior p.d.f. $z(\theta)$ may be.
If the prior p.d.f. $z(\theta)$ is unknown, and if no uniformly most powerful test exists, a conservative observer may wish to use the Neyman-Pearson test that is based on the least favorable prior p.d.f. $\tilde{z}(\theta)$. The least favorable prior p.d.f. $\tilde{z}(\theta)$ is the one for which the maximum average detection probability $Q_d[z]$ is minimum, the false-alarm probability being held fixed.
The principal application of the concept of a least favorable distribution is to the detection of a coherent signal or field of unknown phase $\phi$. The least favorable distribution of $\phi$ is the uniform one,
$$\tilde{z}(\phi) = (2\pi)^{-1}, \quad 0 \le \phi < 2\pi, \tag{1.18}$$
and the detection probability attained by the resulting system is independent of the actual phase $\phi$ of the coherent field that happens to be present. This independence is characteristic of least favorable p.d.f.'s (HELSTROM [1968c] p. 159). If, on the other hand, the observer knew what phase $\phi$ the field carries when it is present, he could construct a receiver with a higher probability $Q_d(\phi)$ of detection for the same false-alarm probability. With light fields this knowledge of the phase is seldom to be had.

1.4. THRESHOLD DETECTION
When the overall strength $S$ of the field to be detected is unknown, the optimum detector can in many cases not be constructed, for the likelihood ratio determining its structure is such a function of $S$ that no uniformly most powerful test exists. There are two courses of action. One is to design the receiver for a standard signal strength $S_0$, and for other strengths $S \ne S_0$ to accept a detection probability $Q_d(S)$ less than the maximum possible. An alternative approach is to postulate the unfavorable situation of very weak signals and design the receiver to be optimum in the limit of vanishing signal strength $S \to 0$. The likelihood ratio $\Lambda(x; S)$ as a function of the strength $S$ of the signal is expanded in a power series about $S = 0$,
$$\Lambda(x; S) = \Lambda(x; 0) + S\,(\partial\Lambda/\partial S)_{S=0} + \tfrac{1}{2}S^2\,(\partial^2\Lambda/\partial S^2)_{S=0} + \ldots \tag{1.19}$$
The receiver bases its decisions on the coefficient of the smallest power of $S$ appearing, usually $(\partial\Lambda/\partial S)_{S=0}$ or $(\partial^2\Lambda/\partial S^2)_{S=0}$, comparing it with a decision level set to yield a pre-assigned false-alarm probability. This scheme is known as threshold detection (MIDDLETON [1960, § 19.4; 1966]).
When a number $M$ of statistically independent sets $x^{(j)}$ of data are available, $j = 1, 2, \ldots, M$, all of which either contain the field or do not, the decision may be based on the sum $Z$ of a certain function $f(x^{(j)})$ of each set of data,
$$Z = \sum_{j=1}^{M} f(x^{(j)}), \tag{1.20}$$
hypothesis $H_1$ being chosen when $Z$ exceeds a certain decision level $Z_0$. The logarithm of the likelihood ratio of the data, $\ln\Lambda(x^{(1)}, x^{(2)}, \ldots, x^{(M)})$, will be such a statistic. If $M \gg 1$, $Z$ will have approximately a Gaussian distribution under both hypotheses $H_0$ and $H_1$, by virtue of the central limit theorem (CRAMER [1946] p. 213), and the false-alarm and detection probabilities will be
$$Q_0 \approx \operatorname{erfc} x = (2\pi)^{-1/2}\int_x^{\infty}\exp(-t^2/2)\,dt, \tag{1.21}$$
$$Q_d \approx \operatorname{erfc}\,(x - D(S)M^{1/2}). \tag{1.22}$$
Here $x$ is related to the decision level $Z_0$, and $D(S)$ is an effective signal-to-noise ratio given by
$$D^2(S) = [E(f \mid H_1) - E(f \mid H_0)]^2/\operatorname{Var}_0 f, \tag{1.23}$$
where $\operatorname{Var}_0 f$ is the variance of the statistic $f(x)$ when no signal is present, and $E$ stands for expected value.
If we now fix the probabilities $Q_0$ and $Q_d$ and let $M$ increase, the signal strength $S$ required will decrease. The best statistic $f(x)$ to use will, for $M \gg 1$, be that for which the effective signal-to-noise ratio $D(S)$ is largest in the limit $S \to 0$. This optimum statistic can easily be shown to be the threshold statistic just described through eq. (1.19) (RUDNICK [1962]). Thus if we compare two detectors forming different functions $f(x)$ of the data, for the same values of $Q_0$, $Q_d$, and $M \gg 1$, the threshold detector will require the smaller signal strength $S$.
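The trade-off between the number $M$ of data sets and the effective signal-to-noise ratio can be made concrete with a short numerical sketch of eqs. (1.21) and (1.22). It assumes the Gaussian approximation holds, writes the per-set signal-to-noise ratio as `d`, and uses hypothetical function names. Note that "erfc" here is the upper tail of the unit normal distribution as defined in eq. (1.21), not the complementary error function of the Python `math` module.

```python
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()  # zero mean, unit variance


def upper_tail(x):
    """erfc x in the convention of eq. (1.21): Pr(t > x) for a unit normal t."""
    return 1.0 - _UNIT_NORMAL.cdf(x)


def detection_probability(q0, d, m):
    """Eq. (1.22): detection probability for false-alarm probability q0,
    effective signal-to-noise ratio d per data set, and m data sets."""
    x = _UNIT_NORMAL.inv_cdf(1.0 - q0)  # invert eq. (1.21) for x
    return upper_tail(x - d * m ** 0.5)


def sets_required(q0, qd, d):
    """Number of data sets (as a real number) needed to reach (q0, qd)."""
    x0 = _UNIT_NORMAL.inv_cdf(1.0 - q0)
    xd = _UNIT_NORMAL.inv_cdf(1.0 - qd)
    return ((x0 - xd) / d) ** 2


m = sets_required(q0=1e-3, qd=0.9, d=0.1)
print(m, detection_probability(1e-3, 0.1, m))
```

Because $M$ enters only through the product $D(S)M^{1/2}$, halving the signal-to-noise ratio quadruples the number of data sets required for the same pair $(Q_0, Q_d)$.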
1.5. MULTIPLE HYPOTHESES
A communication system might transmit any one of $M$ laser pulses of different forms or different wavelengths, each corresponding to a symbol of an alphabet of $M$ symbols into which messages have been coded. A pulse is sent forth once every $T$ seconds. During each interval of $T$ seconds' duration when a transmitted signal arrives, the receiver must decide which of the $M$ possible pulses has been sent. Its decisions are based on the data $x$ obtained, for instance, by sampling the field at the aperture of the receiver during the interval.
The receiving optical system must now permit a choice not between two, but among $M$ hypotheses, which we label $H_j$, $j = 1, 2, \ldots, M$. Under hypothesis $H_j$, “The $j$th signal is present”, the data $x$ are described by a probability density function $p_j(x)$. We say that the receiver must carry out a multiple hypothesis test.
The Bayes criterion of minimum average cost can again be applied if we know the prior probabilities $\zeta_j$ of each of the $M$ hypotheses, with
$$\sum_{j=1}^{M}\zeta_j = 1,$$
and the costs $C_{ij}$ of choosing hypothesis $H_i$ when $H_j$ is true. The average cost is then
$$\bar{C} = \sum_{i=1}^{M}\sum_{j=1}^{M}\zeta_jC_{ij}\int_{R_i}p_j(x)\,d^nx, \tag{1.24}$$
where $R_i$ is the region of the data space for which hypothesis $H_i$ is selected. The average cost is minimum for a strategy that calculates the $M$ posterior risks
$$\sum_{j=1}^{M}C_{ij}P(H_j \mid x), \quad i = 1, 2, \ldots, M, \tag{1.25}$$
and selects the hypothesis with the smallest posterior risk. Here
$$P(H_j \mid x) = \zeta_jp_j(x)/p(x) \tag{1.26}$$
is the posterior probability of hypothesis $H_j$ upon observation of the data $x$, with
$$p(x) = \sum_{i=1}^{M}\zeta_ip_i(x) \tag{1.27}$$
the total probability density function of the data $x$.
Let us introduce the p.d.f. $p_0(x)$ of the data when no signal field, but only the background field, is present. It corresponds to a pseudo-hypothesis $H_0$ that the background field alone is at hand. Then by dividing both numerator and denominator of eq. (1.26) by $p_0(x)$, we can write the posterior probabilities upon which the decision depends as
$$P(H_j \mid x) = \zeta_j\Lambda_j(x)\Big/\sum_{i=1}^{M}\zeta_i\Lambda_i(x), \quad \Lambda_j(x) = p_j(x)/p_0(x), \quad j = 1, 2, \ldots, M, \tag{1.28}$$
where $\Lambda_j(x)$ is the likelihood ratio for deciding between hypothesis $H_j$ and the pseudo-hypothesis $H_0$. In this form the passage to the limit of exhaustive sampling of the aperture field is most easily carried out.
When the costs of error are equal, and correct decisions cost nothing,
$$C_{ij} = C, \quad i \ne j; \qquad C_{ii} = 0, \quad \text{all } i, \tag{1.29}$$
the Bayes strategy requires selection of that hypothesis for which the posterior probability $P(H_i \mid x)$ is maximum. This is the familiar Bayes rule for choosing among a number of statistical hypotheses. Equivalently, one chooses the hypothesis $H_i$ for which $\zeta_i\Lambda_i(x)$ is largest.
Estimating an unknown parameter $\theta$ of a field or of its p.d.f. $p(x; \theta)$ can be considered a continuous version of the testing of multiple hypotheses. Indeed, if we content ourselves with deciding in which of a number of finite ranges of values the parameter $\theta$ lies, estimation becomes equivalent to choosing among a number of hypotheses $H_j$. We shall see in § 6 that the Bayes formulation of multiple hypothesis testing can be carried over directly to estimation.
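The multiple-hypothesis Bayes rule of this section can be illustrated by a minimal sketch that forms the posterior probabilities from the likelihood ratios against the pseudo-hypothesis and then picks the smallest posterior risk. The function names and the list-based layout of the cost matrix are assumptions made for illustration; hypotheses are indexed from 0 in the code, so index 0 stands for $H_1$.

```python
def posterior_probabilities(priors, likelihood_ratios):
    """Posterior probabilities P(H_j | x) from prior probabilities zeta_j and
    likelihood ratios Lambda_j(x) = p_j(x) / p_0(x), as in eq. (1.28)."""
    weights = [z * lam for z, lam in zip(priors, likelihood_ratios)]
    total = sum(weights)
    return [w / total for w in weights]


def bayes_decision(priors, likelihood_ratios, costs):
    """Return (index of chosen hypothesis, list of posterior risks).

    costs[i][j] is the cost C_ij of choosing hypothesis i when j is true.
    """
    post = posterior_probabilities(priors, likelihood_ratios)
    risks = [sum(c * p for c, p in zip(row, post)) for row in costs]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks


# Equal error costs, eq. (1.29): the rule reduces to the largest zeta_j * Lambda_j.
priors = [0.5, 0.3, 0.2]
lambdas = [1.0, 2.0, 4.0]
costs = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(bayes_decision(priors, lambdas, costs))
```

In this example the third hypothesis wins despite having the smallest prior probability, because its likelihood ratio outweighs the prior.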
§ 2. Detection Theory in Quantum Mechanics
2.1. BINARY DETECTION
In our outline of detection theory in the first section, we showed that the best optical instrument for binary detection is one that generates a likelihood ratio $\Lambda(x)$ for samples $x$ of its aperture field during the observation interval; and we presumed that this would eventually be accomplished with a set of samples $x$ that exhausts all the information in that field. The field would have to be sampled at points $r_i$ in the aperture and at times $t_i$ in the observation interval that are but infinitesimally separated. This infinitely dense sampling needs to be possible only conceptually; the same result can be achieved by appropriate filtering of the aperture field. Thus in classical physics this passage to the limit of an infinite number $n$ of data as required by detection theory poses no serious difficulties.
Light fields, however, are subject to the laws of quantum mechanics, which place limitations on the extent and precision with which the fields can be measured. The amplitude and phase of a coherent field, for instance, are not simultaneously measurable with perfect accuracy; and measurements of a field at points on each other's light cones interfere. Although fields containing many photons can be treated along the lines of classical physics, when as in many optical detection problems few photons may be available, the quantum-mechanical behavior of the fields must be taken into account. Some care must be exercised, therefore, in prescribing how the samples $x$ are to be taken in order to use all the information in the aperture field. Indeed, we require a reformulation of detection theory in quantum-mechanical terms.
In quantum mechanics it is difficult to treat measurements of a system at a succession of times, and in order to avoid this difficulty in studying the detection of light, we imagine a Gedankenexperiment. Behind the aperture of our optical instrument we place a large, lossless cavity, initially closed and empty. During the observation interval $(0, T)$ the aperture is open, and the field inside the cavity interacts with the external field, which contains background light and perhaps also the field to be detected. At the end of the observation interval the aperture is closed. The decision about the presence or absence of the sought field, that is, the choice between hypotheses $H_0$ and $H_1$, will now be based on measurements of the field inside the cavity at some later time $t > T$.
The cavity field can be described quantum-mechanically by giving its density operators $\rho_0$ and $\rho_1$ under the two hypotheses $H_0$ and $H_1$. These operators correspond to the classical p.d.f.'s $p_0(x)$ and $p_1(x)$ that we dealt with in § 1.
Any measurable quantity attached to the field corresponds to an Hermitian operator, say $X$, and the outcomes of a measurement are the eigenvalues of $X$. If $X$ has discrete eigenvalues $x_k$ and eigenstates $|x_k\rangle$,
$$X|x_k\rangle = x_k|x_k\rangle,$$
the probability under hypothesis $H_i$ that a measurement yields the value $x_k$ is $\langle x_k|\rho_i|x_k\rangle$, $i = 0, 1$. If the eigenvalues $x$ form a continuous spectrum,
$$X|x\rangle = x|x\rangle,$$
the p.d.f.'s of the outcome of a measurement of $X$ are $\langle x|\rho_i|x\rangle$, $i = 0, 1$, under the two hypotheses.
The observer must choose between the hypotheses by making the best possible measurements on the field in the cavity, the best measurements again being defined as those enabling us to choose between $H_0$ and $H_1$ with minimum average cost.
If $n$ operators $X_1, X_2, \ldots, X_n$ are to be measured simultaneously, they must commute among themselves: $X_iX_j = X_jX_i$, all $i$ and $j$. Let the outcomes of measuring them be $x_1, x_2, \ldots, x_n$, respectively. A decision strategy can be described as a function $f(x_1, x_2, \ldots, x_n)$ of the outcomes that takes on only two possible values, 0 and 1. If in a given test of the strategy, $f(x_1, x_2, \ldots, x_n) = 0$, hypothesis $H_0$ is chosen, otherwise $H_1$. The optimum classical strategy derived in § 1, for instance, can be expressed as the function
$$f(x) = U(p_1(x) - \Lambda_0p_0(x)), \quad x = (x_1, x_2, \ldots, x_n), \tag{2.1}$$
where
$$U(y) = 0, \quad y < 0; \qquad U(y) = 1, \quad y > 0, \tag{2.2}$$
is the unit step function. Classically the set $x$ can be augmented until all the relevant information in the field is encompassed. Quantum-mechanically this is impossible, and the problem of which operators $X_k$ to measure must be faced.
Since the $n$ operators $X_k$ must commute, we can form the operator $f(X_1, X_2, \ldots, X_n)$ and measure it instead. Because only the values 0 and 1 can result from the measurement, this operator must be a projection operator, and we denote it by $\Pi$. Among all possible projection operators for the cavity field, we must determine which one yields minimum average cost.
The probability $Q_0$ of an error of the first kind is the probability that measuring $\Pi$ yields the value 1 when $H_0$ is true,
$$Q_0 = \operatorname{Tr}(\rho_0\Pi), \tag{2.3}$$
where Tr stands for the trace of an operator. The probability $Q_1$ of an error of the second kind is
$$Q_1 = \operatorname{Tr}[\rho_1(1-\Pi)] = 1 - \operatorname{Tr}(\rho_1\Pi), \tag{2.4}$$
where $1$ is the identity operator. The average cost is now, as in eq. (1.5),
$$\bar{C} = \zeta C_0Q_0 + (1-\zeta)C_1Q_1 = (1-\zeta)C_1\{1 - \operatorname{Tr}[(\rho_1 - \Lambda_0\rho_0)\Pi]\}, \tag{2.5}$$
where $\Lambda_0$ is again given by eq. (1.6). We must choose the projection operator minimizing $\bar{C}$, or equivalently, maximizing $\operatorname{Tr}[(\rho_1 - \Lambda_0\rho_0)\Pi]$. Let the eigenvalues and eigenstates of the operator $\rho_1 - \Lambda_0\rho_0$ be $\eta_k$ and $|\eta_k\rangle$ respectively, as in
$$(\rho_1 - \Lambda_0\rho_0)|\eta_k\rangle = \eta_k|\eta_k\rangle. \tag{2.6}$$
Then we must maximize
$$\operatorname{Tr}[(\rho_1 - \Lambda_0\rho_0)\Pi] = \sum_k \eta_k\langle\eta_k|\Pi|\eta_k\rangle, \tag{2.7}$$
and this we can do by picking $\Pi$ as an operator projecting the state vector of the system onto the subspace spanned by the eigenstates $|\eta_k\rangle$ associated with positive eigenvalues, $\eta_k > 0$,
$$\Pi = \sum_{k:\,\eta_k>0}|\eta_k\rangle\langle\eta_k| = \sum_k U(\eta_k)|\eta_k\rangle\langle\eta_k|. \tag{2.8}$$
The minimum average cost is now
$$\bar{C}_{\min} = (1-\zeta)C_1\left[1 - \sum_k \eta_kU(\eta_k)\right]. \tag{2.9}$$
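For a two-state system the minimum cost of eq. (2.9) can be evaluated in closed form, which gives a convenient check on the prescription. The sketch below assumes real symmetric $2 \times 2$ density matrices given as nested lists and takes the decision level as $\Lambda_0 = \zeta C_0/[(1-\zeta)C_1]$, the value implied by matching the two forms of eq. (2.5). The function names are hypothetical.

```python
import math


def positive_eigenvalue_sum(a, b, d):
    """Sum of the positive eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return sum(eta for eta in (mean + radius, mean - radius) if eta > 0.0)


def min_average_cost(rho0, rho1, zeta, c0, c1):
    """Minimum Bayes cost of eq. (2.9) for 2x2 real symmetric density matrices."""
    lam0 = zeta * c0 / ((1.0 - zeta) * c1)  # decision level implied by eq. (2.5)
    # Matrix elements of rho1 - lam0 * rho0, whose eigenvalues are the eta_k.
    a = rho1[0][0] - lam0 * rho0[0][0]
    b = rho1[0][1] - lam0 * rho0[0][1]
    d = rho1[1][1] - lam0 * rho0[1][1]
    return (1.0 - zeta) * c1 * (1.0 - positive_eigenvalue_sum(a, b, d))


# Two pure states separated by an angle theta, equal priors and unit costs.
theta = math.pi / 3
c, s = math.cos(theta), math.sin(theta)
rho0 = [[1.0, 0.0], [0.0, 0.0]]
rho1 = [[c * c, c * s], [c * s, s * s]]
print(min_average_cost(rho0, rho1, zeta=0.5, c0=1.0, c1=1.0))
```

With equal priors and costs the result is the minimum probability of error, which for these two pure states equals $\tfrac{1}{2}(1 - \sin\theta)$.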
The optimum projection operator can be formally written in terms of the unit step function as $\Pi = U(\rho_1 - \Lambda_0\rho_0)$, which is the quantum-mechanical counterpart to eq. (2.1).
When the density operators $\rho_0$ and $\rho_1$ commute, they possess a common set of eigenstates $|\eta_k\rangle$,
$$\rho_i|\eta_k\rangle = P_i(k)|\eta_k\rangle, \quad i = 0, 1, \tag{2.10}$$
where $P_i(k)$ is the probability under hypothesis $H_i$ that the system is in state $|\eta_k\rangle$. The eigenvalues of the operator $\rho_1 - \Lambda_0\rho_0$ are
$$\eta_k = P_1(k) - \Lambda_0P_0(k), \tag{2.11}$$
and the optimum strategy is therefore to pick hypothesis $H_1$ when
$$\Lambda(k) = P_1(k)/P_0(k) > \Lambda_0; \tag{2.12}$$
this is the same as in classical detection theory. Any operator with the same eigenstates $|\eta_k\rangle$ as $\rho_0$ and $\rho_1$ and having distinct eigenvalues can just as well be measured, and the decision can be based on a likelihood ratio formed from the outcome of the measurement.
The Neyman-Pearson criterion is not so easily handled in quantum mechanics. One defines a randomized decision operator $\Pi_r$, the outcome of whose measurement is a number $f_r$ lying between 0 and 1. This outcome is taken as the probability with which one should choose hypothesis $H_1$; that is, a chance device would be constructed that yields a zero with probability $1 - f_r$ and a one with probability $f_r$, and which of these, 0 or 1, turned up would determine the decision. Measurement of the operator $\Pi_r$ may yield a different value of $f_r$ on each trial of the system. The false-alarm and detection probabilities are now
$$Q_0 = \operatorname{Tr}(\rho_0\Pi_r), \quad Q_d = \operatorname{Tr}(\rho_1\Pi_r), \tag{2.13}$$
and in order to maximize $Q_d$ for fixed $Q_0$ we introduce a Lagrange multiplier $\Lambda$ and maximize $\operatorname{Tr}[(\rho_1 - \Lambda\rho_0)\Pi_r]$. Again $\Pi_r$ takes the form in eq. (2.8), where the $\eta_k$ and $|\eta_k\rangle$ are the eigenvalues and eigenstates of the operator $(\rho_1 - \Lambda\rho_0)$,
$$(\rho_1 - \Lambda\rho_0)|\eta_k\rangle = \eta_k|\eta_k\rangle. \tag{2.14}$$
It is now necessary to vary $\Lambda$ until the false-alarm probability $Q_0$ takes on exactly the pre-assigned value. As the operator equation (2.14) is generally difficult to solve when $\rho_0$ and $\rho_1$ do not commute, finding the optimum operator for the Neyman-Pearson strategy will not be easy. If $\rho_0$ and $\rho_1$ do commute, the problem reduces to the classical one, and the randomized strategy described in § 1.2 is optimum.
In many applications of the Neyman-Pearson criterion, as in radar detection, the false-alarm probability $Q_0$ is very small, $Q_0 \ll 1$, and $\Lambda$ is thereupon very large, $\Lambda \gg 1$. (If $\Lambda = \infty$, hypothesis $H_1$ is never chosen, and $Q_0 = 0$.) We are then tempted to define $\Lambda' = 1/\Lambda$ and rewrite the eigenvalue equation (2.14) as
$$(\rho_0 - \Lambda'\rho_1)|\eta_k\rangle = \eta_k'|\eta_k\rangle, \quad \eta_k' = -\eta_k/\Lambda. \tag{2.15}$$
When $\Lambda' \ll 1$ we can treat the term $\Lambda'\rho_1$ as a perturbation and approximate the eigenvalues by the first two terms of the standard perturbation expansion,
$$\eta_k' \approx P_0(k) - \Lambda'\langle k|\rho_1|k\rangle, \tag{2.16}$$
where $P_0(k)$ are the eigenvalues and $|k\rangle$ the eigenstates of the density operator $\rho_0$,
$$\rho_0|k\rangle = P_0(k)|k\rangle. \tag{2.17}$$
Now hypothesis $H_1$ is chosen when the outcome $\eta_k'$ of a measurement of $\rho_0 - \Lambda'\rho_1$ is negative, that is, when
$$\langle k|\rho_1|k\rangle > \Lambda P_0(k) = \Lambda\langle k|\rho_0|k\rangle.$$
The decision can therefore, in this approximation, be based on a measurement of $\rho_0$ or of all the projection operators $|k\rangle\langle k|$, after which the classical likelihood ratio
$$\Lambda(k) = \langle k|\rho_1|k\rangle/\langle k|\rho_0|k\rangle$$
is formed for the eigenstate $|k\rangle$ of $\rho_0$ that turns up. The decision level $\Lambda$ is set so that the false-alarm probability takes on a pre-assigned value. We shall later apply this approximation to the detection of a coherent signal.
2.2. THE CHOICE BETWEEN PURE STATES
A simple case in which the eigenvalue equation (2.14) can be solved exactly involves the choice between two pure states, which we denote by $|\psi_0\rangle$ and $|\psi_1\rangle$. The density operators are then
$$\rho_0 = |\psi_0\rangle\langle\psi_0|, \quad \rho_1 = |\psi_1\rangle\langle\psi_1|$$
(BAKUT and SHCHUROV [1968]; HELSTROM [1968d]). There are only two eigenstates with non-zero eigenvalues; we denote them by $|\eta_0\rangle$ and $|\eta_1\rangle$. They are linear combinations of $|\psi_0\rangle$ and $|\psi_1\rangle$,
$$|\eta_i\rangle = z_{i0}|\psi_0\rangle + z_{i1}|\psi_1\rangle, \quad i = 0, 1.$$
Substitution into eq. (2.14) yields simultaneous equations for the $z_{ij}$'s, and one finds non-zero solutions for them only when $\eta = \eta_0$ or $\eta = \eta_1$ in the determinantal equation
$$(\eta - 1)(\eta + \Lambda) + \Lambda|\gamma|^2 = 0,$$
where $\gamma = \langle\psi_1|\psi_0\rangle$. Thus one obtains the eigenvalues
$$\eta_1 = \tfrac{1}{2}(1-\Lambda) + R > 0, \quad \eta_0 = \tfrac{1}{2}(1-\Lambda) - R < 0,$$
$$R = \{[\tfrac{1}{2}(1-\Lambda)]^2 + \Lambda q\}^{1/2}, \quad q = 1 - |\gamma|^2.$$
The false-alarm and detection probabilities are
$$Q_0 = |\langle\eta_1|\psi_0\rangle|^2 = (\eta_1 - q)/2R, \quad Q_d = |\langle\eta_1|\psi_1\rangle|^2 = (\eta_1 + \Lambda q)/2R.$$
If in particular the states are orthogonal, y = 0, q = 1, and we find Qo = 0, Qd = 1; orthogonal states can be distinguished without error. 2.3. THRESHOLD DETECTION
The optimum detection operator $\Pi$ is often difficult to determine, and it will in general depend on the strength of the signal. If that strength is unknown in advance, the designer might consider the threshold approach described in § 1.4, where it is assumed that the signal is very weak. A detector is then termed optimum when it requires the least signal strength to attain a given pair $(Q_0, Q_d)$ of false-alarm and detection probabilities, a great number $M$ of inputs being at hand, all either containing or not containing the signal. Equivalently, the best detector is the one that maximizes the effective signal-to-noise ratio, eq. (1.23), in the limit $S \to 0$.
Let us suppose that the detector measures an operator $X$ on the field. The effective signal-to-noise ratio is the quantum-mechanical counterpart of eq. (1.23),
$$D^2 = [\operatorname{Tr}(\rho_1X) - \operatorname{Tr}(\rho_0X)]^2/\operatorname{Var}_0X, \tag{2.19}$$
where $\operatorname{Var}_0X = \operatorname{Tr}(\rho_0X^2) - [\operatorname{Tr}(\rho_0X)]^2$. The operator $X$ for which $D^2$ is maximum can be shown, by using the Schwarz inequality for traces, to be the solution of the operator equation
$$\rho_1 - \rho_0 = \tfrac{1}{2}(\rho_0X + X\rho_0) \tag{2.20}$$
(HELSTROM [1969d] p. 241). Letting $S$ represent the signal strength, with $\rho_1 = \rho_1(S)$, $\rho_0 = \rho_1(0)$, we define the symmetrized logarithmic derivative (s.l.d.) of $\rho_1(S)$ as the solution $L$ of the equation
$$\partial\rho_1/\partial S = \tfrac{1}{2}(\rho_0L + L\rho_0). \tag{2.21}$$
The threshold detection operator is then the value of $L$ in the limit $S \to 0$. Often the threshold operator is much easier to determine than the optimum detection operator $U(\rho_1 - \Lambda_0\rho_0)$.

2.4. MULTIPLE HYPOTHESES
The distinction between the quantum-mechanical detection theory and the classical lies in this, that classically everything about the field can in principle be measured, and there is no ambiguity about carrying the likelihood ratio $\Lambda(x)$ to the limit where all the information in the field is utilized; quantum-mechanically it is necessary also to decide what to measure, for it is impossible to measure simultaneously all the operators corresponding to a complete classical description of the field. In binary detection the quantum theory discovers the best operator to measure in order to choose between two hypotheses $H_0$ and $H_1$ with minimum average cost. When decisions among more than two hypotheses are required, however, the optimum operators to measure are as yet unknown, except when the associated density operators commute.
In a laser communication system transmitting every $T$ seconds one of $M$ different coherent signals, the observer is confronted with the choice among $M$ density operators $\rho_1, \rho_2, \ldots, \rho_M$ for the field in the lossless cavity representing the ideal receiver. His strategy can be described as one of measuring $M$ projection operators $\Pi_1, \Pi_2, \ldots, \Pi_M$, forming a resolution of the identity,
$$\Pi_1 + \Pi_2 + \ldots + \Pi_M = 1, \tag{2.22}$$
and he chooses hypothesis $H_i$ when $\Pi_i$ yields the value 1 and the rest yield 0. In order that the operators $\Pi_k$ be measurable on the same system, they must commute. In terms of the costs and prior probabilities introduced in § 1.5, the average cost of operation will be, as in eq. (1.24),
$$\bar{C} = \sum_{i=1}^{M}\sum_{j=1}^{M}\zeta_jC_{ij}\operatorname{Tr}(\Pi_i\rho_j), \tag{2.23}$$
for $\operatorname{Tr}(\Pi_i\rho_j)$ is the probability under hypothesis $H_j$ that the outcome of measuring $\Pi_i$ is 1.
When the density operators $\rho_k$ commute among themselves, they possess a common set of eigenstates, and it is necessary simply to determine which of those states the system is in. The classical procedure described in § 1.5 can then be applied to deciding which hypothesis is true. Optical communication systems involving the choice among commuting density operators have been treated by LIU [1970].
For noncommuting density operators, the optimum resolution of the identity $(\Pi_1, \Pi_2, \ldots, \Pi_M)$, that is, the one minimizing $\bar{C}$ in eq. (2.23), is unknown. Although some conditions on the solution of this problem have been reported, it is unclear how they can be applied (YUEN, KENNEDY and LAX [1970]).
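For the commuting case the minimization of eq. (2.23) reduces to a separate choice for each common eigenstate, which a few lines of code make explicit. In this sketch `state_probs[j][k]` stands for the probability under hypothesis $j$ that the system is found in common eigenstate $k$; that layout, the function name, and the zero-based indexing are assumptions made for illustration.

```python
def commuting_min_cost(priors, state_probs, costs):
    """Minimum average cost, eq. (2.23), for commuting density operators.

    priors[j]: prior probability zeta_j.
    state_probs[j][k]: probability of common eigenstate k under hypothesis j.
    costs[i][j]: cost of choosing hypothesis i when j is true.
    Returns (decisions, cost), where decisions[k] is the hypothesis chosen
    when eigenstate k turns up; the projector for hypothesis i is the sum of
    |k><k| over the states k assigned to i.
    """
    n_states = len(state_probs[0])
    decisions, total = [], 0.0
    for k in range(n_states):
        # Contribution to the average cost if state k is assigned to hypothesis i.
        risks = [
            sum(z * c * p[k] for z, c, p in zip(priors, row, state_probs))
            for row in costs
        ]
        best = min(range(len(risks)), key=risks.__getitem__)
        decisions.append(best)
        total += risks[best]
    return decisions, total


print(commuting_min_cost([0.5, 0.5], [[0.8, 0.2], [0.3, 0.7]], [[0, 1], [1, 0]]))
```

Each eigenstate is assigned to the hypothesis with the smallest risk, so the projection operators automatically commute and sum to the identity.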
§ 3. Detection of a Coherent Signal
3.1. THE TRANSMISSION-LINE RECEIVER: CLASSICAL ANALYSIS
The optimum quantum receiver has been prescribed in terms of the density operators of the electromagnetic field within a lossless cavity, which is exposed to the incident light during an observation interval $(0, T)$. These density operators are evaluated at an arbitrary time $t > T$ after the aperture has been closed, but they depend on the field at the aperture during the interval $(0, T)$. As it is in terms of the aperture field that we want to express the detectability of light signals, we must specify the relation between it and the internal field at $t > T$.
It is instructive to begin with a one-dimensional counterpart of the ideal receiver, a lossless transmission line into which the signal and the noise are introduced by a voltage source having a real internal impedance $Z_s$ (Fig. 3.1). The source might represent an antenna, whose radiation resistance contributes to $Z_s$ and whose terminals exhibit a randomly fluctuating noise voltage $n(t)$ as a result of background fields incident upon it. When the field to be detected also falls upon the surface of the antenna, the terminals experience an additional voltage $s(t)$, which here becomes “the signal”. The field surrounding the transmission line, or equivalently the voltage $V(x, t)$ and the current $I(x, t)$ at points $x$ along it, corresponds to the field in the lossless cavity, but having a single spatial dimension can be analyzed with less cumbersome mathematics. The transmission-line receiver has previously been treated by SHE [1965, 1968].

Fig. 3.1. Transmission-line receiver.

The transmission line will first be studied on the basis of classical physics, and by forming a likelihood ratio from the amplitudes of its normal modes, we shall determine the detectability of a coherent signal pulse of known form and finite duration. Next, the normal modes will be quantized, and the quantum detection theory of § 2 will be applied to the same problem. Finally we shall see how to treat the effect of amplification on signal detectability in this ideal receiver.
As shown in Fig. 3.1, the transmission line, whose length is $l$, is open at the far end. At first, the line is quiescent. The generator, whose voltage output is $v(t)$, is connected during the observation interval $(0, T)$, which is long enough to contain the entire signal $s(t)$. The line has an inductance $L$ per unit length and a capacitance $C$ per unit length. The velocity of propagation along the line is $c = (LC)^{-1/2}$, and we assume that $l = cT$. The characteristic impedance of the line is $Z_0 = (L/C)^{1/2}$. At time $t$ the voltage between point $x$ of the line and ground is $V(x, t)$, and the current moving in the positive $x$-direction is $I(x, t)$. These are related by the partial differential equations
$$-\partial V/\partial x = L\,\partial I/\partial t, \quad -\partial I/\partial x = C\,\partial V/\partial t, \tag{3.1}$$
with the boundary conditions
$$V(0, t) + Z_sI(0, t) = v(t), \tag{3.2}$$
$$I(l, t) = 0. \tag{3.3}$$
The general solutions of these equations represent waves moving to the left and to the right, and they can be put into the form
$$V(x, t) = f_+(t - x/c) + f_-(t + x/c), \tag{3.4}$$
$$I(x, t) = Z_0^{-1}[f_+(t - x/c) - f_-(t + x/c)]. \tag{3.5}$$
The entire system being linear, the principle of superposition applies, and we can treat the components due to signal and noise separately. The rightward and leftward waves can be expanded in Fourier series
$$f_+(t - x/c) = \sum_{m=-\infty}^{\infty} g_m\exp[-i\omega_m(t - x/c)], \tag{3.6}$$
$$f_-(t + x/c) = \sum_{m=-\infty}^{\infty} h_m\exp[-i\omega_m(t + x/c)], \tag{3.7}$$
where $\omega_m = 2\pi m/T$. The terms of these series correspond to normal modes of the field between the line and ground when the ends of the line are open. Because $V(x, t)$ and $I(x, t)$ are real, $g_{-m} = g_m^*$, $h_{-m} = h_m^*$.
We begin within the framework of classical physics, disregarding quantum fluctuations. When a signal $s(t)$ is present (hypothesis $H_1$), the input is
$$v(t) = s(t) + n(t), \tag{3.8}$$
where $n(t)$ is a white, Gaussian random noise process with mean value 0 and autocovariance function given by Nyquist's law as
$$\phi(\tau) = E[n(t)n(t+\tau)] = 2K\mathcal{T}Z_s\delta(\tau), \tag{3.9}$$
where $K$ is Boltzmann's constant and $\mathcal{T}$ is the effective absolute temperature of the source. During the interval $(0, T)$ the voltage and current on the line are
$$V(x, t) = Z_0(Z_0+Z_s)^{-1}v(t - x/c), \tag{3.10}$$
$$I(x, t) = (Z_0+Z_s)^{-1}v(t - x/c), \tag{3.11}$$
which satisfy the conditions in eqs. (3.2) and (3.3), $v(t)$ being zero for $t < 0$. The line supports a wave moving to the right.
At time $t = T$ we disconnect the source and measure the voltage and current along the line, or what is the same thing, the amplitudes of the normal modes defined in eqs. (3.6) and (3.7). All the $h_m$'s are zero, there being as yet no leftward waves. The coefficients $g_m$ of the rightward modes, on the other hand, are given by
VII,
0 31
DETECTION O F A COHERENT S I G N A L
gm
=
gmx+igmy = l-’/;f+(T--x/c)
=
T - ’ Zo(Zo+Z,)-’
311
exp [-io,x/c]dx
loT
v(t) exp (iom t)dt.
(3.12)
As these are linear functionals of the input, they are Gaussian random variables. Because $g_{-m} = g_m^*$, we fix our attention only on the coefficients with $m \geq 0$, and for these it is easily shown that $g_{mx}$ and $g_{ny}$ are statistically independent for all positive values $m$ and $n$. Under hypothesis $H_0$, $v(t) = n(t)$, the coefficients $g_{mx}$ and $g_{my}$ all have mean value zero; under hypothesis $H_1$ their mean values are given by

$$E(g_m | H_1) = \bar{g}_m = \bar{g}_{mx} + i\bar{g}_{my} = T^{-1}Z_0(Z_0 + Z_s)^{-1}\int_0^T s(t)\exp(i\omega_m t)\,dt. \tag{3.13}$$
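As a numerical illustration of eq. (3.13), the following Python sketch evaluates the mean mode amplitudes by the midpoint rule. The pulse shape, the impedance values and all function names are hypothetical choices made only for this example; none of them comes from the text.

```python
import cmath
import math

def mean_mode_amplitudes(signal, T, Z0, Zs, m_max, n_steps=2048):
    """Midpoint-rule estimate of the mean amplitudes of eq. (3.13) for |m| <= m_max."""
    dt = T / n_steps
    scale = Z0 / (Z0 + Zs) / T          # the factor T^-1 Z0 / (Z0 + Zs)
    amplitudes = {}
    for m in range(-m_max, m_max + 1):
        omega_m = 2.0 * math.pi * m / T
        total = 0j
        for k in range(n_steps):
            t = (k + 0.5) * dt
            total += signal(t) * cmath.exp(1j * omega_m * t)
        amplitudes[m] = scale * total * dt
    return amplitudes

# Hypothetical test pulse: a carrier at the 8th harmonic with a raised-cosine envelope.
T = 1.0

def pulse(t):
    return math.sin(math.pi * t / T) ** 2 * math.cos(2.0 * math.pi * 8 * t / T)

# Assumed matched line, Z0 = Zs = 50 ohms.
g_bar = mean_mode_amplitudes(pulse, T, Z0=50.0, Zs=50.0, m_max=10)
for m in (7, 8, 9):
    print(m, g_bar[m])
```

For this pulse only the modes m = 7, 8, 9 (and their negatives) carry signal, and the amplitudes for negative m come out as the complex conjugates of those for positive m, as required for a real signal.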
Their variances are, under both hypotheses,

$$\operatorname{Var} g_{mx} = \operatorname{Var} g_{my} = \sigma^2 = K\mathcal{T}Z_s T^{-1}[Z_0/(Z_0 + Z_s)]^2, \tag{3.14}$$
as a consequence of eq. (3.9) for the autocovariance of the noise. In general, the total energy in the electromagnetic field of the transmission line is given by

$$\mathcal{E} = \int_0^l \tfrac{1}{2}\{C[V(x, t)]^2 + L[I(x, t)]^2\}\,dx. \tag{3.15}$$

At time $T$, the portion of the energy in the field due to the signal, when present, is

$$\mathcal{E}_s = 2Cl\sum_{m=0}^{\infty}|\bar{g}_m|^2 = Z_0(Z_0 + Z_s)^{-2}\int_0^T [s(t)]^2\,dt, \tag{3.16}$$
in obtaining which we have used $Cc = Z_0^{-1}$. The observer, having measured the first $M$ of the mode amplitudes $g_m$, must form their likelihood ratio in order to decide whether a signal is present, and for this he needs the joint probability density functions (p.d.f.'s) of their real and imaginary parts $g_{mx}$, $g_{my}$. As these are Gaussian variables, their joint p.d.f. under hypothesis $H_1$ is

$$p_1(\{g_{mx}, g_{my}\}) = \prod_{m=0}^{M-1}(2\pi\sigma^2)^{-1}\exp(-|g_m - \bar{g}_m|^2/2\sigma^2); \tag{3.17}$$
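under $H_0$ it is

$$p_0(\{g_{mx}, g_{my}\}) = \prod_{m=0}^{M-1}(2\pi\sigma^2)^{-1}\exp(-|g_m|^2/2\sigma^2). \tag{3.18}$$

The likelihood ratio is then

$$\Lambda(\{g_m\}) = p_1(\{g_{mx}, g_{my}\})/p_0(\{g_{mx}, g_{my}\}) = \prod_{m=0}^{M-1}\exp[(\operatorname{Re}\bar{g}_m^* g_m - \tfrac{1}{2}|\bar{g}_m|^2)/\sigma^2], \tag{3.19}$$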
where Re indicates the real part of a complex number. The likelihood ratio, as we learned in § 1.1, is to be compared with a pre-assigned decision level $\Lambda_0$, hypothesis $H_1$ (“signal present”) being chosen when $\Lambda > \Lambda_0$. Because the exponential function is monotone, this procedure is equivalent to the simpler prescription to form the sufficient statistic

$$G_M = \operatorname{Re}\sum_{m=0}^{M-1}\bar{g}_m^* g_m \tag{3.20}$$
and compare it with a decision level $G_{M0}$ depending on the pre-assigned false-alarm probability $Q_0$. As the statistic $G_M$ is a linear combination of Gaussian random variables $\{g_{mx}, g_{my}\}$, it also has a Gaussian distribution under both hypotheses. Its mean values are, by eq. (3.13),

$$E(G_M | H_0) = 0, \qquad E(G_M | H_1) = \sum_{m=0}^{M-1}|\bar{g}_m|^2, \tag{3.21}$$

and its variance under both hypotheses is, by eq. (3.14) and the independence of the coefficients,

$$\operatorname{Var} G_M = \sigma_G^2 = \sigma^2\sum_{m=0}^{M-1}|\bar{g}_m|^2. \tag{3.22}$$
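The false-alarm probability is the probability under hypothesis $H_0$ that $G_M$ exceeds its decision level $G_{M0}$,

$$Q_0 = \Pr\{G_M > G_{M0} | H_0\} = \operatorname{erfc}\xi, \qquad \xi = G_{M0}/\sigma_G, \tag{3.23}$$

and the probability of detection is similarly

$$Q_d = \Pr\{G_M > G_{M0} | H_1\} = \operatorname{erfc}(\xi - d_M), \qquad d_M^2 = \sigma^{-2}\sum_{m=0}^{M-1}|\bar{g}_m|^2. \tag{3.24}$$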
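The Gaussian test just described is easy to evaluate numerically. The sketch below assumes that erfc in this chapter denotes the tail area of the unit normal density, $\operatorname{erfc} x = (2\pi)^{-1/2}\int_x^\infty \exp(-t^2/2)\,dt$, which is not the complementary error function of most software libraries; the assumption is suggested by the fact that a vanishing signal must give $Q_d = Q_0$. The function names are illustrative only.

```python
import math
from statistics import NormalDist

def gauss_tail(x):
    """Area of the unit normal density above x (assumed meaning of 'erfc' in the text)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def decision_parameter(q0):
    """Normalized decision level xi that yields the false-alarm probability q0."""
    return NormalDist().inv_cdf(1.0 - q0)

def detection_probability(q0, d):
    """Q_d = erfc(xi - d) for a Gaussian statistic with signal-to-noise ratio d."""
    return gauss_tail(decision_parameter(q0) - d)

# Detection probability at a false-alarm probability of 1e-4 for a few values of d.
for d in (0.0, 2.0, 4.0, 6.0):
    print(d, detection_probability(1e-4, d))
```

With d = 0 the detection probability reduces to the false-alarm probability, and it rises monotonically toward unity as d grows.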
The more modes included in the statistic $G_M$ of eq. (3.20), the larger $d_M$ and $Q_d$. The maximum detection probability is attained by using all the modes and is given by

$$Q_d = \operatorname{erfc}(\xi - d), \tag{3.25}$$

where by eqs. (3.14) and (3.16)

$$d^2 = \sigma^{-2}\sum_{m=0}^{\infty}|\bar{g}_m|^2 = \mathcal{E}_s/2Cl\sigma^2, \tag{3.26}$$

$\mathcal{E}_s$ being the energy of the signal component of the field in the transmission line as in eq. (3.16). This energy is maximum when the transmission line is matched to the source, $Z_0 = Z_s$, and the maximum probability of detection is given by eq. (3.25) with the maximum signal-to-noise ratio

$$d^2 = 2E_s/K\mathcal{T}, \qquad E_s = (4Z_s)^{-1}\int_0^T [s(t)]^2\,dt. \tag{3.27}$$

The parameter $\xi$ is determined by the false-alarm probability $Q_0$ through $Q_0 = \operatorname{erfc}\xi$. The energy $E_s$ is the maximum that can be extracted from the signal field, and the probability of detection we have calculated here is identical to that determined by the conventional detection-theoretical analysis involving the likelihood ratio for the input $v(t) = n(t)$ ($H_0$) versus the input $v(t) = s(t) + n(t)$ ($H_1$), $0 < t < T$ (HELSTROM [1968c] ch. 4). That analysis leads to a receiver consisting of a filter matched to the signal and a subsequent decision device. The output of the matched filter at the end of the observation interval $(0, T)$ is compared with a decision level, and if the level is surpassed, hypothesis $H_1$ is selected. We have shown here that the field of the transmission line contains at time $T$ all the information necessary to detect the signal in the optimum manner, and the transmission-line receiver, when all its mode amplitudes $g_m$ are processed according to eq. (3.20) with $M = \infty$, is equivalent to the standard optimum receiver of classical detection theory.

3.2. QUANTIZATION OF THE RECEIVER
In quantum mechanics the voltage $V(x, t)$, the current $I(x, t)$, and all quantities linearly related to them, such as the mode amplitudes $g_m$ and $h_m$, become operators acting on the state vector $|\Psi\rangle$ of the transmission line and its ambient field. Let us consider first the field when the line is open at both ends, as at time $t < 0$, and let us expand it in normal modes as in eqs. (3.6) and (3.7). The energy in the field is given in terms of the mode amplitudes by eq. (3.15), and hence the Hamiltonian operator of the field must be

$$\mathcal{H} = 2Cl\sum_{m=0}^{\infty}(g_m^+ g_m + h_m^+ h_m), \tag{3.28}$$
where the adjoint operators $g_m^+$, $h_m^+$ take the place of the complex conjugates $g_m^*$, $h_m^*$, respectively. Equations (3.6) and (3.7) show that the operators $g_m$ and $h_m$ must have the time dependences

$$g_m(t) = g_m(t_0)\exp[-i\omega_m(t - t_0)], \qquad h_m(t) = h_m(t_0)\exp[-i\omega_m(t - t_0)], \tag{3.29}$$

and they must hence obey the differential equations

$$dg_m/dt = -i\omega_m g_m, \qquad dh_m/dt = -i\omega_m h_m. \tag{3.30}$$

However, these time derivatives must also be given by the commutators with the Hamiltonian,

$$dg_m/dt = (i/\hbar)[\mathcal{H}, g_m] = (i/\hbar)(\mathcal{H}g_m - g_m\mathcal{H}), \qquad dh_m/dt = (i/\hbar)[\mathcal{H}, h_m], \tag{3.31}$$

where $\hbar$ is Planck's constant $h/2\pi$. By comparison with eq. (3.30), it follows that the commutators of the operators $g_m$, $h_m$ and their adjoints must be

$$[g_m, g_n^+] = [h_m, h_n^+] = (\hbar\omega_m/2Cl)\,\delta_{mn}, \qquad [g_m, h_n^+] = 0,$$

where $\delta_{mn}$ is the Kronecker delta, $\delta_{mn} = 1$, $m = n$; $\delta_{mn} = 0$, $m \neq n$. We thus identify $g_m$ and $h_m$ with the annihilation operators $a_m$ and $b_m$ of the modes, which we define by

$$g_m = (\hbar\omega_m/2Cl)^{1/2}a_m, \qquad h_m = (\hbar\omega_m/2Cl)^{1/2}b_m. \tag{3.32}$$
The adjoint operators $a_m^+$, $b_m^+$ are the associated creation operators. The annihilation and creation operators obey the standard commutation rules

$$[a_m, a_n^+] = [b_m, b_n^+] = \delta_{mn}, \tag{3.33}$$
$$[a_m, a_n] = [a_m, b_n] = [b_m, b_n] = [a_m, b_n^+] = 0 \tag{3.34}$$

(LOUISELL [1964] ch. 4). In terms of them the Hamiltonian operator takes the usual form

$$\mathcal{H} = \sum_{m=0}^{\infty}\hbar\omega_m(a_m^+ a_m + b_m^+ b_m). \tag{3.35}$$
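The commutation rule for a single mode can be checked numerically in a truncated number basis. The sketch below is an illustration only: the truncation dimension and the helper names are arbitrary assumptions, and the truncation necessarily spoils the commutator in its last diagonal element.

```python
import math

def annihilation_matrix(dim):
    """Matrix of a in the basis |0>, ..., |dim-1>, with <n-1| a |n> = sqrt(n)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def matmul(p, q):
    size = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transpose(p):
    return [list(row) for row in zip(*p)]

dim = 6                                   # arbitrary truncation
a = annihilation_matrix(dim)
a_dag = transpose(a)                      # real matrix, so the adjoint is the transpose
number = matmul(a_dag, a)                 # number operator a^+ a
aa_dag = matmul(a, a_dag)
commutator = [[aa_dag[i][j] - number[i][j] for j in range(dim)] for i in range(dim)]

print([number[i][i] for i in range(dim)])       # eigenvalues of the number operator
print([commutator[i][i] for i in range(dim)])   # unity except in the last, truncated entry
```

The number operator is diagonal with integral eigenvalues, and the commutator equals the unit matrix on every state below the truncation level.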
The normal modes of the field behave like quantum-mechanical harmonic oscillators with angular frequencies $\omega_m$. The operator $a_m^+ a_m$ is the number operator for the mth rightward mode; it has integral eigenvalues, which we identify with the number of photons in this mth mode, each photon contributing an energy $\hbar\omega_m$.

The voltage source producing the input $v(t)$ is to be connected to the transmission line at time $t = 0$, and the field of the line is to be observed at time $T = l/c$. We must therefore investigate the behavior of the modes during the interim, when one end of the line is connected to the source with its internal impedance $Z_s$. For later purposes, we suppose also that the far end of the line terminates in a load impedance $Z_L$, as in Fig. 3.2. Our aim is to show that the mode operators $a_m(T)$, $b_m(T)$ at time $T$ obey the same commutation rules (3.33), (3.34), despite the altered terminations of the line. The boundary conditions are now, in place of eq. (3.2),

$$V(0, t) + Z_s I(0, t) = v(t), \qquad V(l, t) - Z_L I(l, t) = 0. \tag{3.36}$$

Fig. 3.2. Transmission line terminated at both ends.
The field at time $t = T$ is composed of two parts, the remnant of the original field at time $t = 0-$ and the field produced by the source impedance $Z_s$ and the load impedance $Z_L$. To see what happened to the initial field, we decompose the initial voltage and current into rightward and leftward components $f_+^0(-x/c)$ and $f_-^0(x/c)$ through eqs. (3.4), (3.5), which become for $t = 0$

$$V(x, 0) = f_+^0(-x/c) + f_-^0(x/c), \qquad Z_0 I(x, 0) = f_+^0(-x/c) - f_-^0(x/c), \tag{3.37}$$

and permit $f_+^0(-x/c)$ and $f_-^0(x/c)$ to be determined. The subsequent behavior of the waves $f_+^0(t - x/c)$ and $f_-^0(t + x/c)$ is governed by eqs. (3.4), (3.5) and (3.36), in the last of which we put $v(t) = 0$. A straightforward analysis shows that during the interval $(0, T)$ the rightward wave $f_+^0$ is reflected at the right-hand end $x = l$ of the line and turns into a leftward wave, diminished in amplitude by the reflection coefficient

$$\beta_L = (Z_L - Z_0)/(Z_L + Z_0); \tag{3.38}$$
the leftward wave $f_-^0$ is similarly reflected at $x = 0$ and diminished by the factor

$$\beta_s = (Z_s - Z_0)/(Z_s + Z_0), \tag{3.39}$$

becoming a rightward wave. As a result, the portion of the field at time $t = T$ due to the initial field at $t = 0$ is made up of the components

$$f_+'(T - x/c) = \beta_s f_-^0(T - x/c), \qquad f_-'(T + x/c) = \beta_L f_+^0(x/c - T); \tag{3.40}$$

we denote these portions by single primes. The corresponding Fourier coefficients $g_m$ and $h_m$ and the annihilation operators $a_m$ and $b_m$ for the rightward and leftward modes undergo a similar exchange and diminution, the latter becoming

$$a_m'(T) = \beta_s b_m(0), \qquad b_m'(T) = \beta_L a_m(0). \tag{3.41}$$
Since $|\beta_s| < 1$ and $|\beta_L| < 1$, these new operators $a_m'(T)$ and $b_m'(T)$ no longer obey the commutation rules in eq. (3.33). The deficiency must be made up from the portion of the field resulting from the terminating impedances $Z_s$ and $Z_L$. These contain an enormous number of atoms and molecules, ions and electrons whose thermal agitation, as we know from classical statistical mechanics, produces across their terminals fluctuating voltages $n_s(t)$ and $n_L(t)$, respectively, having spectral densities $2K\mathcal{T}_s Z_s$ and $2K\mathcal{T}_L Z_L$, where $\mathcal{T}_s$ and $\mathcal{T}_L$ are their respective absolute temperatures. LAX [1966] and HAUS [1970] have shown that these voltages $n_s(t)$ and $n_L(t)$ must be treated as quantum-mechanical operators obeying the commutation rules

$$[n_s(t_1), n_s(t_2)] = 2i\hbar Z_s\,\delta'(t_1 - t_2), \qquad [n_L(t_1), n_L(t_2)] = 2i\hbar Z_L\,\delta'(t_1 - t_2), \qquad [n_s(t_1), n_L(t_2)] = 0, \tag{3.42}$$

where $\delta'(t)$ is the derivative of the delta function. The randomly varying voltage operators $n_s(t)$ and $n_L(t)$ contribute terms $a_m''(T)$ and $b_m''(T)$ to the mode amplitudes, and by virtue of their disparate physical origins, these terms commute with the portions $a_m'(T)$ and $b_m'(T)$ arising from the initial field at $t = 0-$. By eqs. (3.12) and (3.32) we find

$$a_m''(T) = (2Cl/\hbar\omega_m)^{1/2}Z_0(Z_0 + Z_s)^{-1}T^{-1}\int_0^T n_s(t)\exp(i\omega_m t)\,dt,$$
$$b_m''(T) = (2Cl/\hbar\omega_m)^{1/2}Z_0(Z_0 + Z_L)^{-1}T^{-1}\int_0^T n_L(t)\exp(i\omega_m t)\,dt. \tag{3.43}$$
Using eq. (3.42) we evaluate their commutators as

$$[a_m''(T), a_m''^+(T)] = (2Cl/\hbar\omega_m)Z_0^2(Z_0 + Z_s)^{-2}T^{-2}\int_0^T\!\!\int_0^T [n_s(t_1), n_s(t_2)]\exp[i\omega_m(t_1 - t_2)]\,dt_1\,dt_2 = 4Z_0 Z_s/(Z_s + Z_0)^2 = 1 - \beta_s^2, \tag{3.44}$$

where we have used $l/T = c = (LC)^{-1/2} = (CZ_0)^{-1}$, and where the reflection coefficient $\beta_s$ is defined by eq. (3.39). Thus with $a_m(T) = a_m'(T) + a_m''(T)$, we find from eqs. (3.41) and (3.44) the commutator

$$[a_m(T), a_m^+(T)] = \beta_s^2[b_m(0), b_m^+(0)] + 1 - \beta_s^2 = 1. \tag{3.45}$$
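The bookkeeping behind eq. (3.45) can be verified for any pair of impedances. In the sketch below the function names and the numerical impedance values are hypothetical; the only relations used are the reflection coefficient of eq. (3.39) and the commutator of eq. (3.44).

```python
def reflection_coefficient(z_term, z0):
    """beta = (Z - Z0) / (Z + Z0) for a terminating impedance Z, as in eq. (3.39)."""
    return (z_term - z0) / (z_term + z0)

def thermal_commutator(z_term, z0):
    """Commutator contributed by the noise voltage of the termination, eq. (3.44)."""
    return 4.0 * z0 * z_term / (z_term + z0) ** 2

def total_commutator(z_term, z0):
    """beta^2 times the unit commutator of the reflected mode, plus the thermal part."""
    beta = reflection_coefficient(z_term, z0)
    return beta ** 2 * 1.0 + thermal_commutator(z_term, z0)

# Hypothetical source impedances on an assumed 50-ohm line.
for z_s in (5.0, 50.0, 200.0):
    print(z_s, reflection_coefficient(z_s, 50.0), total_commutator(z_s, 50.0))
```

Whatever the mismatch, the reflected part and the thermal part add up to unity, so that the mode operators at time T remain proper annihilation operators.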
Similarly, $[b_m(T), b_m^+(T)] = 1$. Because of the orthogonality of the modes, the commutativity of the operators for different modes continues to hold. We can thus treat the transmission line at time $t = T$ in terms of the annihilation operators $a_m(T)$, $b_m(T)$ and their adjoint creation operators $a_m^+(T)$, $b_m^+(T)$; and these have all the ordinary quantum-mechanical properties deriving from the representation of the modes as independent harmonic oscillators.

3.3. THE COHERENT SIGNAL OF KNOWN PHASE
3.3.1. The density operators
Now we can return to the detection of a coherent signal of known form, analyzed from the classical standpoint in § 3.1, and treat it quantum-mechanically. As before, we imagine measuring the field modes at time $t = T$; but now, in place of the p.d.f.'s of the mode amplitudes, we need the density operators $\rho_0$ and $\rho_1$ of the modes under the two hypotheses $H_0$ (“signal absent”) and $H_1$ (“signal present”). The receiver is the same as the transmission line discussed in § 3.2, but the far end $x = l$ is open, $Z_L = \infty$. We learned that we can treat the modes at time $t = T$ in terms of the annihilation operators $a_m = a_m(T)$ and their adjoints $a_m^+ = [a_m(T)]^+$ for the rightward modes, and that these obey the usual commutation rules, eqs. (3.33), (3.34), for annihilation and creation operators. At $t = T$ the leftward modes contain neither signal nor noise and can be disregarded.

Because the noise arises from the enormous number of particles in the source impedance $Z_s$ (or equivalently, from the thermal background field picked up by the antenna), its values have, by the central-limit theorem of statistics, Gaussian distributions. Quantum-mechanically this means that the density operators $\rho_0$ and $\rho_1$ have Gaussian P-representations and depend only on the expected values and variances of the operators $a_m$ (GLAUBER [1963]). We start with a finite number $M$ of modes, allowing $M$ later to go to infinity. By eqs. (3.32) and (3.12), we see that under hypothesis $H_0$ the expected value of $a_m$ is

$$E[a_m | H_0] = \operatorname{Tr}[\rho_0 a_m(T)] = 0; \tag{3.46}$$

under $H_1$ it is

$$\mu_m = E[a_m | H_1] = \operatorname{Tr}[\rho_1 a_m(T)] = (2Cl/\hbar\omega_m)^{1/2}\bar{g}_m, \tag{3.47}$$
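with $\bar{g}_m$ given by eq. (3.13). We can call $|\mu_m|^2$ the average number of signal photons in mode $m$; the energy of the signal component of the field is, by eq. (3.16),

$$\mathcal{E}_s = \operatorname{Tr}[(\rho_1 - \rho_0)\mathcal{H}] = \sum_{m=0}^{\infty}\hbar\omega_m|\mu_m|^2. \tag{3.48}$$

Under both hypotheses the mean number of noise photons in the mth mode is

$$\mathcal{N}_m' = E(a_m^+ a_m | H_0) = \operatorname{Tr}(\rho_0 a_m^+ a_m) = 4Z_0 Z_s(Z_0 + Z_s)^{-2}\mathcal{N}_m, \qquad \mathcal{N}_m = [\exp(\hbar\omega_m/K\mathcal{T}) - 1]^{-1} = \mathrm{pl}(\hbar\omega_m/K\mathcal{T}), \tag{3.49}$$

where $\mathcal{T} = \mathcal{T}_s$ is the effective absolute temperature of the source and

$$\mathrm{pl}(x) = (e^x - 1)^{-1} \tag{3.50}$$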
is the Planck factor. In the classical limit $\hbar\omega_m \ll K\mathcal{T}$ eq. (3.49) reduces to eq. (3.14) by virtue of eq. (3.32). From eqs. (3.46), (3.47) and (3.49) the Gaussian P-representations of the density operators $\rho_0$ and $\rho_1$ can be written down. The optimum processing, which requires solution of eq. (2.14), has not been determined.

A decisive simplification follows from the realistic assumption that the signal has so narrow a bandwidth $W$ that the mean numbers $\mathcal{N}_m'$ of noise photons in all modes affected by the signal are equal. By eq. (3.49) this requires $W \ll K\mathcal{T}/h$, and as $K\mathcal{T}/h = 6.2 \times 10^{12}$ sec$^{-1}$ at $\mathcal{T} = 300$ °K, this condition will usually be met. We can then put for all modes

$$\mathcal{N}_m \equiv \mathcal{N} = \mathrm{pl}(\hbar\Omega/K\mathcal{T}), \tag{3.51}$$

where $\Omega$ is the central angular frequency of the signal.

We can now introduce a new set of modes by a unitary transformation of the original amplitudes $a_m$ (HELSTROM [1968b]). We put

$$c_n = \sum_{m=0}^{M-1}U_{nm}a_m,$$

where $U = \|U_{nm}\|$ is a unitary matrix. The operators $c_n$ and their adjoints $c_n^+$ then obey the same commutation rules as the $a_n$'s and $a_n^+$'s:

$$[c_n, c_m^+] = \delta_{nm}, \qquad [c_n, c_m] = 0. \tag{3.52}$$
Furthermore, the new modes will be statistically independent and contain mean numbers

$$\mathcal{N}' = 4Z_0 Z_s(Z_s + Z_0)^{-2}\mathcal{N} \tag{3.53}$$

of noise photons. The unitary matrix is so chosen that

$$c_1 = \nu_M^{-1/2}\sum_{m=0}^{M-1}\mu_m^* a_m, \tag{3.54}$$

where

$$\nu_M = \sum_{m=0}^{M-1}|\mu_m|^2, \tag{3.55}$$

i.e., $U_{1m} = \nu_M^{-1/2}\mu_m^*$. The remaining rows $U_{nm}$ of $U$ are orthogonal to $U_{1m}$. The expected values of the $c_n$'s will then all vanish under both hypotheses, except for $c_1$, for which

$$E(c_1 | H_1) = \operatorname{Tr}(\rho_1 c_1) = \Gamma, \qquad |\Gamma|^2 = \nu_M. \tag{3.56}$$
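The construction of the new modes can be illustrated with a small numerical example. The sketch below takes the first row of the unitary matrix proportional to the complex conjugates of the mean amplitudes, normalized to unit length, and completes the matrix by Gram-Schmidt orthogonalization of the unit vectors. The Gram-Schmidt completion is merely one convenient choice (the text requires only that the remaining rows be orthogonal to the first), and the amplitudes and function names are hypothetical.

```python
import math

def matched_row(mu):
    """First row: mu_m^* divided by the square root of the sum of |mu_m|^2."""
    nu = sum(abs(z) ** 2 for z in mu)
    return [z.conjugate() / math.sqrt(nu) for z in mu], nu

def complete_unitary(first_row, tol=1e-9):
    """Extend a unit row to a unitary matrix by Gram-Schmidt on the unit vectors."""
    dim = len(first_row)
    rows = [list(first_row)]
    for k in range(dim):
        if len(rows) == dim:
            break
        cand = [complex(1.0) if j == k else 0j for j in range(dim)]
        for r in rows:
            # subtract the component of the candidate along the row r
            overlap = sum(rj.conjugate() * cj for rj, cj in zip(r, cand))
            cand = [cj - overlap * rj for cj, rj in zip(cand, r)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in cand))
        if norm > tol:
            rows.append([c / norm for c in cand])
    return rows

def transformed_means(U, mu):
    """Expected values of the new operators: sum over m of U_nm mu_m."""
    return [sum(u * m for u, m in zip(row, mu)) for row in U]

mu = [0.3 + 0.4j, -0.5j, 0.2 + 0j, 0.1 - 0.1j]   # hypothetical mean amplitudes
row1, nu_M = matched_row(mu)
U = complete_unitary(row1)
means = transformed_means(U, mu)
print(nu_M, [abs(z) for z in means])
```

Only the first of the new modes has a non-vanishing mean amplitude, and the square of that amplitude equals the total mean number of signal photons in the original modes.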
We can thus disregard all the new modes except the first, for they are unaffected by the signal; the optimum receiver works only with the field in the mode whose annihilation operator is $c_1$. The resemblance of $c_1$ to the statistic $G_M$ in eq. (3.20) is apparent; the quantum counterpart to $G_M$ is the Hermitian operator

$$\mathcal{Q} = \tfrac{1}{2}(c_1 + c_1^+). \tag{3.57}$$

We call this new mode the “matched mode”; in effect, it correlates the amplitudes $a_m$ with the signal components $\mu_m$. The density operators for the matched mode under the two hypotheses are, in the P-representation,

$$\rho_0 = (\pi\mathcal{N}')^{-1}\int\exp(-|\gamma|^2/\mathcal{N}')\,|\gamma\rangle\langle\gamma|\,d^2\gamma, \tag{3.58}$$
$$\rho_1 = (\pi\mathcal{N}')^{-1}\int\exp(-|\gamma - \Gamma|^2/\mathcal{N}')\,|\gamma\rangle\langle\gamma|\,d^2\gamma, \tag{3.59}$$
where $|\gamma\rangle$ is a coherent state (GLAUBER [1963]), $\gamma = \gamma_x + i\gamma_y$, $d^2\gamma = d\gamma_x\,d\gamma_y$, and the integration is carried out over the entire $(\gamma_x, \gamma_y)$-plane. These density operators must be substituted into eq. (2.14), and the eigenvalues $\eta_k$ and eigenstates $|\eta_k\rangle$ must be determined.

3.3.2. The extreme quantum limit

For arbitrary values of $\mathcal{N}'$ and $\Gamma$ the exact solution of this eigenvalue problem is unknown. For $\mathcal{T} = 0$ and $\mathcal{N} = 0$, however, the matched mode is in a pure coherent state under each hypothesis, the density operators are simply $\rho_0 = |0\rangle\langle 0|$, $\rho_1 = |\Gamma\rangle\langle\Gamma|$, and

$$q = 1 - \exp(-\nu_M).$$

The detection probability is maximum when $M = \infty$; then by eq. (3.48) with $\omega_m \approx \Omega$ for the significant modes

$$q = 1 - \exp(-N_s'), \qquad N_s' = \sum_{m=0}^{\infty}|\mu_m|^2 = \mathcal{E}_s/\hbar\Omega, \qquad \mathcal{E}_s = 4Z_s Z_0(Z_s + Z_0)^{-2}E_s, \tag{3.61}$$

where $E_s$ is the available signal energy as defined in eq. (3.27). The detection probability is maximum when the transmission line is matched to the source, $Z_0 = Z_s$, $\mathcal{E}_s = E_s$, whereupon $N_s' = N_s = E_s/\hbar\Omega$ equals the average number of photons provided by the source of the signal. In Fig. 3.3 this maximum probability $Q_d$ of detection is plotted versus the equivalent signal-to-noise ratio $D_g = 2N_s^{1/2}$ as the curve marked “optimum”; the false-alarm probability is $Q_0 = 10^{-4}$. When the false-alarm probability is very small, $Q_0 \ll 1$, the probability of detection is approximately, by eq. (2.18),

$$Q_d \approx q = 1 - \exp(-N_s), \tag{3.62}$$

which, as we shall see, is the detection probability attained by a receiver that disregards the phase of the signal and simply counts the number of photons in the matched mode. The difference between the maximum possible value of $Q_d$ and that in eq. (3.62) is of the order of $Q_0^{1/2}$.
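The size of the gap between the optimum receiver and the photon counter in this noiseless limit can be examined numerically. The sketch below rests on two assumptions that are not taken from this section: the maximum detection probability for two pure states is written in its standard Neyman-Pearson form, $Q_d = \{[Q_0(1 - q)]^{1/2} + [(1 - Q_0)q]^{1/2}\}^2$ for $Q_0 \leq 1 - q$ and $Q_d = 1$ otherwise, and the photon counter is randomized so as to use the whole permitted false-alarm probability. The function names are illustrative.

```python
import math

def optimum_pure_state_qd(q0, q):
    """Assumed pure-state Neyman-Pearson result; q = 1 - |<psi0|psi1>|^2."""
    if q0 >= 1.0 - q:
        return 1.0
    return (math.sqrt(q0 * (1.0 - q)) + math.sqrt((1.0 - q0) * q)) ** 2

def counting_qd(q0, n_signal):
    """Noiseless photon counter: choose H1 if n > 0, and with probability q0 if n = 0."""
    q = 1.0 - math.exp(-n_signal)
    return q + q0 * (1.0 - q)

q0 = 1e-4
for n_signal in (1.0, 2.0, 4.0, 8.0):
    q = 1.0 - math.exp(-n_signal)
    best = optimum_pure_state_qd(q0, q)
    counted = counting_qd(q0, n_signal)
    print(n_signal, best, counted, best - counted)
```

For these values the optimum receiver gains no more than about twice the square root of the false-alarm probability over the counter, in line with the order-of-magnitude statement above.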
Fig. 3.3. Probability $Q_d$ of detecting a coherent signal versus the signal-to-noise ratio $D_g = [4N_s/(2\mathcal{N} + 1)]^{1/2}$ for $Q_0 = 10^{-4}$. The curve “optimum” gives $Q_d$ for the optimum receiver when the signal phase is known and $\mathcal{N} = 0$. The dashed curve gives $Q_d$ for the threshold receiver of a signal of known phase for all $\mathcal{N}$. The remaining curves refer to a receiver that counts the number of photons in the matched mode; these are indexed by the value of $\mathcal{N}$.
3.3.3. The threshold receiver

The optimum receiver for a coherent signal of known phase depends on the amplitude of the expected signal; the corresponding statistical test is not uniformly most powerful. In view of the added disadvantage that the required eigenvalue equation is difficult to solve, it is natural to turn to the threshold receiver, which was described in § 2.3. It is not hard to show from eqs. (3.58) and (3.59) that the threshold operator $\Pi_\theta$ is proportional to the operator $\mathcal{Q} = \tfrac{1}{2}(c_1 + c_1^+)$ given in eq. (3.57) as the quantum counterpart of the classical sufficient statistic $G_M$ (HELSTROM [1967b]). The threshold receiver measures the Hermitian operator $\mathcal{Q}$ and compares the outcome $\mathcal{Q}'$ with a decision level $\mathcal{Q}_0$, declaring a signal present when $\mathcal{Q}' > \mathcal{Q}_0$. This outcome $\mathcal{Q}'$ has a Gaussian distribution by virtue of the Gaussian forms of the density operators $\rho_0$ and $\rho_1$, eqs. (3.58), (3.59). In the limit $M \to \infty$ the means and variance of $\mathcal{Q}'$ are, by eqs. (3.54), (3.49),

$$E[\mathcal{Q}' | H_0] = 0, \qquad E[\mathcal{Q}' | H_1] = \operatorname{Tr}(\rho_1\mathcal{Q}) = N_s'^{1/2}, \tag{3.63}$$

$$\operatorname{Var}\mathcal{Q}' = \operatorname{Tr}(\rho_0\mathcal{Q}^2) = \lim_{M\to\infty}\tfrac{1}{4}\nu_M^{-1}\sum_{m=0}^{M-1}|\mu_m|^2\operatorname{Tr}[\rho_0(a_m a_m^+ + a_m^+ a_m)] = \tfrac{1}{4}(2\mathcal{N}' + 1), \tag{3.64}$$

and the false-alarm and detection probabilities are, as in § 3.1,

$$Q_0 = \operatorname{erfc}\xi, \qquad Q_d = \operatorname{erfc}(\xi - D_g), \tag{3.65}$$

where $D_g$ is an equivalent signal-to-noise ratio defined by

$$D_g^2 = 4N_s'/(2\mathcal{N}' + 1), \tag{3.66}$$
which again is maximum when $Z_0 = Z_s$, $N_s' = N_s$, $\mathcal{N}' = \mathcal{N}$. The probability $Q_d$ for $Q_0 = 10^{-4}$ is plotted versus $D_g$ in Fig. 3.3 as the dashed line marked “threshold”. In the classical limit $K\mathcal{T} \gg \hbar\Omega$, $D_g^2$ goes to the signal-to-noise ratio $d^2 = 2E_s/K\mathcal{T}$ defined in eq. (3.27). By the correspondence principle, we expect the optimum quantum receiver to tend ever closer to one that measures $\mathcal{Q}$ as the effective temperature $\mathcal{T}$ or the average number $\mathcal{N}$ of noise photons increases.

3.3.4. Approximations to the optimum receiver

For small values of $\mathcal{N}'$ the eigenvalue equation (2.6) with $\rho_0$ and $\rho_1$ given by eqs. (3.58), (3.59) has been solved by YOSHITANI [1970], taking $\Lambda_0 = 1$ as for a binary communication system transmitting 0's and 1's with equal probability. He used a perturbation method starting with the exact solution for $\mathcal{N}' = 0$ described in § 3.3.2. An alternative approach is to diagonalize the matrix $\langle m|(\rho_1 - \Lambda_0\rho_0)|n\rangle$ obtained by expressing the density operators in the number representation, with

$$c_1^+ c_1|n\rangle = n|n\rangle, \qquad n = 0, 1, 2, \ldots, \tag{3.67}$$

defining the eigenstates $|n\rangle$ of $\rho_0$. The matrix elements are

$$\langle n|\rho_0|m\rangle = (1 - v)v^n\delta_{nm}, \qquad v = \mathcal{N}'/(\mathcal{N}' + 1), \tag{3.68}$$
$$\langle n|\rho_1|m\rangle = (1 - v)(n!/m!)^{1/2}v^n[(1 - v)\Gamma^*]^{m-n}\exp[-(1 - v)|\Gamma|^2]\,L_n^{m-n}[-(1 - v)^2|\Gamma|^2/v], \quad m \geq n;$$
$$\langle n|\rho_1|m\rangle = \langle m|\rho_1|n\rangle^*, \quad m < n, \tag{3.69}$$

where $L_n^{m-n}(x)$ is the associated Laguerre polynomial (LOUISELL and WALKER [1965]). One truncates the matrix to such a size that the neglected elements are insignificant. The average probability of error

$$P_e = \tfrac{1}{2}(Q_0 + 1 - Q_d),$$

which is given by eq. (2.9) with $\zeta = \tfrac{1}{2}$, $C = 1$, is plotted in Fig. 3.4 versus the signal-to-noise ratio $D_g = [4N_s/(2\mathcal{N} + 1)]^{1/2}$ for a matched receiver ($Z_0 = Z_s$). The line marked $\mathcal{N} = \infty$ represents the error probability $P_e = \operatorname{erfc}(\tfrac{1}{2}D_g)$ attained at all values of $\mathcal{N}$ by a receiver measuring the threshold operator $\mathcal{Q}$ of eq. (3.57). The optimum quantum receiver has a significantly smaller probability $P_e$ of error only for small values of the average number $\mathcal{N}$ of noise photons per mode.
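The error probability of the threshold receiver is simple to tabulate. The sketch below again assumes that erfc denotes the tail area of the unit normal density. For comparison at zero noise it uses the standard minimum error probability for two equally likely pure states, $P_e = \tfrac{1}{2}\{1 - [1 - \exp(-N_s)]^{1/2}\}$, which is quoted here as an assumption and not taken from this section; the function names are illustrative.

```python
import math

def gauss_tail(x):
    """Area of the unit normal density above x (assumed meaning of 'erfc' in the text)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def equivalent_snr(n_signal, n_noise):
    """D_g = [4 N_s / (2 N + 1)]^(1/2) for a matched receiver."""
    return math.sqrt(4.0 * n_signal / (2.0 * n_noise + 1.0))

def threshold_error_probability(n_signal, n_noise):
    """P_e = erfc(D_g / 2) for equally likely hypotheses."""
    return gauss_tail(0.5 * equivalent_snr(n_signal, n_noise))

def optimum_error_probability_no_noise(n_signal):
    """Assumed pure-state minimum error probability at zero noise, equal priors."""
    return 0.5 * (1.0 - math.sqrt(1.0 - math.exp(-n_signal)))

for n_signal in (0.5, 1.0, 2.0, 4.0):
    print(n_signal,
          threshold_error_probability(n_signal, 0.0),
          optimum_error_probability_no_noise(n_signal))
```

At zero noise the optimum error probability lies visibly below that of the threshold receiver, which is the regime in which the text reports a significant advantage for the optimum quantum receiver.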
Fig. 3.4. Probability $P_e$ of error in detection of known signal with prior probability $\tfrac{1}{2}$, plotted versus the signal-to-noise ratio $D_g = [4N_s/(2\mathcal{N} + 1)]^{1/2}$; $N_s$ = average number of signal photons, $\mathcal{N}$ = average number of noise photons. (From HELSTROM et al. [1970].)
We remarked in § 2.1 that when the false-alarm probability $Q_0$ is very small, the optimum receiver is nearly the same as one that measures the operator $\rho_0$ and bases its decision on the outcome. In the present case, measuring $\rho_0$ is equivalent to measuring $c_1^+ c_1$, that is, to counting the number of photons in the matched mode. We saw in § 3.3.2 that in the extreme quantum limit $\mathcal{N} = 0$ the optimum receiver yields only a slightly larger probability of detection than one that disregards the known phase of the signal and counts photons. We conjecture that the same holds true for small as well as for zero average numbers $\mathcal{N}$ of noise photons per mode.

In Fig. 3.3 we have plotted versus the signal-to-noise ratio $D_g$ the probability of detection attained by a receiver that counts the number of photons in the matched mode. How these were calculated will be shown in the next section. The curve marked “optimum” gives the detection probability attained by the optimum receiver for $\mathcal{N} = 0$. The dashed line in Fig. 3.3 indicates the detection probability for the threshold receiver at all values of $\mathcal{N}$. For an average number $\mathcal{N}$ of noise photons greater than about 0.2 the threshold receiver is superior to the photon counter when $Q_0 = 10^{-4}$, but the difference is small.

3.4. RECEPTION OF A SIGNAL OF RANDOM PHASE
The receiver just described needs to know the form of the expected signal exactly, and when, as we have assumed, the signal is a pulse modulation of a carrier of frequency $\Omega$, the phase of the carrier upon arrival of the signal at the receiver must be known as well. The distance between transmitter and receiver must therefore be known within a fraction of a wavelength $\lambda = 2\pi c/\Omega$, or else the carrier phase, as in a communication system, must have been as accurately tracked from some initially precise value. At very high frequencies such knowledge of the phase will most often be unavailable. The signal will have the form

$$s(t) = \operatorname{Re}[F(t)e^{i\Omega t + i\psi}], \tag{3.70}$$

where $F(t)$ is a known complex envelope, but the phase $\psi$ will ordinarily be unknown. It is as likely to have one value as another, and the least favorable distribution of the phase is the uniform one,

$$z(\psi) = (2\pi)^{-1}, \qquad 0 \leq \psi < 2\pi. \tag{3.71}$$
When this signal appears in the transmission line of our ideal receiver, all its complex mode amplitudes $\bar{g}_m$ will have a common phase factor $e^{i\psi}$. It is still possible to form a matched mode as in eq. (3.54) by taking the coefficients $U_{1m}$ proportional to $\bar{g}_m^*$ for $\psi = 0$. The remaining new modes can be made independent of the first for all values of $\psi$, and as the signal does not affect them, it suffices to consider only the field in the matched mode. In place of eq. (3.56) the mean value of the operator $c_1$ when the signal is present with phase $\psi$ is

$$E(c_1 | H_1, \psi) = \Gamma e^{i\psi}. \tag{3.72}$$
The density operator $\rho_0$ of the mode in the absence of a signal is the same as in eq. (3.58); when the signal in eq. (3.70) is present, the density operator is

$$\rho_1(\psi) = (\pi\mathcal{N}')^{-1}\int\exp(-|\gamma - \Gamma e^{i\psi}|^2/\mathcal{N}')\,|\gamma\rangle\langle\gamma|\,d^2\gamma \tag{3.73}$$

in place of eq. (3.59). Since $\psi$ is unknown and random with the uniform prior p.d.f. $z(\psi)$ of eq. (3.71), the actual density operator under hypothesis $H_1$ is the average

$$\rho_1 = \int_0^{2\pi}z(\psi)\rho_1(\psi)\,d\psi = (\pi\mathcal{N}')^{-1}\int\exp[-(|\gamma|^2 + |\Gamma|^2)/\mathcal{N}']\,I_0(2|\Gamma||\gamma|/\mathcal{N}')\,|\gamma\rangle\langle\gamma|\,d^2\gamma, \tag{3.74}$$

where $I_0(x)$ is the modified Bessel function of the first kind. Since in the P-representation both $\rho_0$ and $\rho_1$ depend on the complex variable $\gamma$ only through its absolute value $|\gamma|$, they must be diagonal in the number representation. From eq. (3.69) we obtain, by averaging over $\psi$ after replacing $\Gamma$ by $\Gamma e^{i\psi}$,

$$\rho_i = \sum_{n=0}^{\infty}P_{in}|n\rangle\langle n|, \qquad i = 0, 1, \tag{3.75}$$

with

$$P_{0n} = (1 - v)v^n, \qquad v = \mathcal{N}'/(\mathcal{N}' + 1),$$
$$P_{1n} = (1 - v)v^n\exp[-(1 - v)N_s']\,L_n[-(1 - v)^2N_s'/v] \tag{3.76}$$
(LACHS [1965]). As the density operators commute, the optimum receiver reduces to one measuring the number $n$ of photons in the matched mode. A receiver based on the Bayes criterion will form the likelihood ratio $\Lambda(n) = P_{1n}/P_{0n}$ for the actual number $n$ of photons counted, and it will decide that a signal is present if $\Lambda(n)$ exceeds a decision level defined as in eq. (1.6). If the Neyman-Pearson criterion governs the design of the receiver, randomization must be employed, as described in § 1.2. Hypothesis $H_1$ is chosen whenever $n$ exceeds a certain integer $\nu$, $H_0$ whenever $n < \nu$; when $n = \nu$, $H_1$ is chosen with a certain probability $f$. The false-alarm and detection probabilities are now, as in eqs. (1.12) and (1.13),

$$Q_0 = f(1 - v)v^{\nu} + v^{\nu+1}, \tag{3.77}$$
$$Q_d = fP_{1\nu} + \sum_{n=\nu+1}^{\infty}P_{1n}. \tag{3.78}$$
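The randomized counting test can be sketched in a few lines. The code below takes the threshold integer as the greatest integer in $\ln Q_0/\ln v$, solves the false-alarm equation for the randomization probability $f$, and evaluates the detection probability from the count distribution of eq. (3.76). Two implementation choices are assumptions of this sketch and not taken from the text: the products $v^n L_n$ are generated by a rescaled three-term Laguerre recurrence, which stays finite as the noise goes to zero, and the infinite sum is replaced by one minus the finite sum over counts up to the threshold. All names are illustrative.

```python
import math

def count_probabilities(n_signal, n_noise, n_max):
    """P_0n and P_1n of eq. (3.76) for n = 0..n_max."""
    v = n_noise / (n_noise + 1.0)
    shift = (1.0 - v) ** 2 * n_signal        # equals -v * x, x being the Laguerre argument
    u = [1.0]                                # u[n] = v**n * L_n(x)
    if n_max >= 1:
        u.append(v + shift)                  # v * L_1(x) = v * (1 - x)
    for k in range(1, n_max):
        u.append(((v * (2 * k + 1) + shift) * u[k] - k * v * v * u[k - 1]) / (k + 1))
    prefactor = (1.0 - v) * math.exp(-(1.0 - v) * n_signal)
    p1 = [prefactor * value for value in u]
    p0 = [(1.0 - v) * v ** n for n in range(n_max + 1)]
    return p0, p1

def counting_receiver(q0, n_signal, n_noise):
    """Threshold nu, randomization probability f and detection probability Q_d."""
    v = n_noise / (n_noise + 1.0)
    if v > 0.0:
        nu = int(math.floor(math.log(q0) / math.log(v)))
        f = (q0 - v ** (nu + 1)) / ((1.0 - v) * v ** nu)
        f = min(1.0, max(0.0, f))            # guard against rounding at the interval ends
    else:
        nu, f = 0, q0                        # no noise: only the count n = 0 is randomized
    _, p1 = count_probabilities(n_signal, n_noise, nu)
    qd = f * p1[nu] + (1.0 - sum(p1[: nu + 1]))
    return nu, f, qd

for n_signal in (5.0, 10.0, 20.0):
    print(n_signal, counting_receiver(1e-4, n_signal, 1.0))
```

With no signal the routine returns a detection probability equal to the prescribed false-alarm probability, which is a convenient check on the randomization.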
Here $\nu$ is the greatest integer in $\ln Q_0/\ln v$. From these formulas the curves in Fig. 3.3 were constructed. (See HELSTROM [1969a] for additional graphs of $Q_d$.) In the limit $\mathcal{N}' \gg 1$ the false-alarm and detection probabilities are given approximately by

$$Q_0 = \exp(-\tfrac{1}{2}\beta^2), \qquad Q_d = Q(\alpha, \beta), \qquad \alpha = (2N_s'/\mathcal{N}')^{1/2},$$
$$Q(\alpha, \beta) = \int_\beta^\infty x\exp[-\tfrac{1}{2}(\alpha^2 + x^2)]\,I_0(\alpha x)\,dx, \tag{3.79}$$

where $Q(\alpha, \beta)$ is Marcum's Q-function. These probabilities are the same as for the optimum classical receiver of a signal of random phase, which passes the output of the matched filter through a rectifier and compares the rectified output at time $t = T$ with a decision level related to $\beta$ (HELSTROM [1968c] pp. 166-171).

3.5. AMPLIFICATION
Let us return to our model of the ideal receiver as the transmission line shown in Fig. 3.1. The signal, when present at the input, has come in during the interval $(0, T)$ and at time $t = T$ occupies the entire line, whose length is $l = cT$. At time $t = T$ the input with its impedance $Z_s$ is detached, and a load impedance $Z_L$ is attached at the right-hand end ($x = l$) of the line. The signal and noise waves occupying the line and moving to the right are reflected and attenuated by the load, which injects additional noise at temperature $\mathcal{T}_L$. At time $t = 2T$ the signal, when present, again occupies the entire line and is represented by

$$V(x, 2T) = -Z_0 I(x, 2T) = \beta_L Z_0(Z_s + Z_0)^{-1}s(x/c), \tag{3.80}$$

where $\beta_L$ is the reflection coefficient defined in eq. (3.38). The field is again expanded in a Fourier series as in eq. (3.7), and classically the leftward modes have signal components

$$\bar{h}_m = \left(\frac{\beta_L Z_0}{Z_s + Z_0}\right)\frac{1}{T}\int_0^T s(t)\exp(i\omega_m t)\,dt. \tag{3.81}$$

The noise components of $h_m = h_{mx} + ih_{my}$ have the variances

$$\operatorname{Var} h_{mx} = \operatorname{Var} h_{my} = K\mathcal{T}\beta_L^2 Z_s[Z_0/(Z_s + Z_0)]^2T^{-1} + K\mathcal{T}_L Z_L[Z_0/(Z_L + Z_0)]^2T^{-1} \tag{3.82}$$
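as in eq. (3.14). Thus the mean number of signal photons in the matched mode whose annihilation operator is

$$d_1 = (N_s'')^{-1/2}\sum_m\mu_m''^{*}b_m, \qquad b_m = (2Cl/\hbar\Omega)^{1/2}h_m, \qquad \mu_m'' = (2Cl/\hbar\Omega)^{1/2}\bar{h}_m, \tag{3.83}$$

is

$$N_s'' = \beta_L^2 N_s', \tag{3.84}$$

where $N_s'$ is given by eq. (3.61). The mean number of noise photons is, by comparison of eqs. (3.82), (3.14) and (3.49),

$$\mathcal{N}'' = \beta_L^2\mathcal{N}' + (1 - \beta_L^2)\,\mathrm{pl}(\hbar\Omega/K\mathcal{T}_L), \qquad \mathcal{N}' = (1 - \beta_s^2)\,\mathrm{pl}(\hbar\Omega/K\mathcal{T}), \tag{3.85}$$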
where we have replaced the classical noise factor $K\mathcal{T}/\hbar\Omega$ by the Planck factor $\mathrm{pl}(\hbar\Omega/K\mathcal{T})$ for both the source and the load. With $|\beta_L| = 1$ ($Z_L = 0$ or $Z_L = \infty$), the processing of the leftward modes can take place at time $t = 2T$ with the same effectiveness as for detecting the signal in the rightward modes at time $t = T$. Otherwise the signal will, with $|\beta_L| < 1$, be weakened, and additional noise will be added by the load.

Let us henceforth for simplicity assume that the source and the transmission line are matched so that $Z_s = Z_0$, $\beta_s = 0$, and

$$N_s'' = \beta_L^2 N_s, \qquad \mathcal{N}'' = \beta_L^2\,\mathrm{pl}(\hbar\Omega/K\mathcal{T}) + (1 - \beta_L^2)\,\mathrm{pl}(\hbar\Omega/K\mathcal{T}_L), \tag{3.86}$$

and let us ask what happens if $Z_L$ is a negative resistance, $Z_L < 0$, such as might appear in an idealized amplifier. Then $\beta_L^2 > 1$ and $N_s'' > N_s$; the signal in the transmission line at time $t = 2T$ has been amplified, the additional photons having been supplied by the power source driving the negative resistance. We know that a quantum amplifier such as the maser contains a great many atoms or molecules, the population of whose energy levels has been inverted by pumping. As discussed by TAKAHASI [1965], this reversed distribution can be idealized as a Planck distribution corresponding to a negative temperature $\mathcal{T}_L < 0$. The Planck factor $\mathrm{pl}(\hbar\Omega/K\mathcal{T}_L)$ is then negative, and the average number of noise photons in the matched mode is
$$\mathcal{N}'' = G\mathcal{N} + (G - 1)\mathcal{N}_a, \tag{3.87}$$
$$\mathcal{N}_a = -\mathrm{pl}(-\hbar\Omega/K|\mathcal{T}_L|) = \exp(\hbar\Omega/K|\mathcal{T}_L|)\,\mathrm{pl}(\hbar\Omega/K|\mathcal{T}_L|), \tag{3.88}$$

where $G = \beta_L^2$ is the amplification factor, or gain,

$$N_s'' = GN_s. \tag{3.89}$$

The minimum number $(G - 1)\mathcal{N}_a$ of additional photons is added when $\mathcal{T}_L = -0$, $\mathcal{N}_a = 1$,

$$\mathcal{N}''_{\min} = G(\mathcal{N} + 1) - 1. \tag{3.90}$$
Fig. 3.5. Probability $Q_d$ of detecting an amplified coherent signal of random phase versus the initial number $N_s$ of signal photons; $\mathcal{N} = 0$, at fixed false-alarm probability $Q_0$. The curves are indexed with the gain $G$.

Fig. 3.6. Probability $Q_d$ of detecting an amplified coherent signal of random phase versus the initial number $N_s$ of signal photons; $\mathcal{N} = 1$, for two values of $Q_0$. The curves are indexed with the gain $G$.

Our earlier discussion of how the commutation rules are maintained by the presence of an additional driving voltage in the load $Z_L$ applies here as well; the commutator $[b_m''(T), b_m''^+(T)] = 4Z_L Z_0/(Z_L + Z_0)^2$ corresponding to eq. (3.44) takes a negative sign with $Z_L$. Further details can be found in the article by TAKAHASI [1965].

For a coherent signal of known phase, the best processing of the amplified mode is, in the limit $G \gg 1$, to measure the operator $\tfrac{1}{2}(d_1 + d_1^+)$. The detection probability then depends, as in eqs. (3.65) and (3.66), on the equivalent signal-to-noise ratio

$$D_g''^2 = 4N_s''/(2\mathcal{N}'' + 1) = \frac{4GN_s}{2G(\mathcal{N} + 1) - 1} \to \frac{2N_s}{\mathcal{N} + 1}, \qquad G \gg 1. \tag{3.91}$$

This maximum attainable signal-to-noise ratio is always less than the effective signal-to-noise ratio $D_g^2 = 2N_s/(\mathcal{N} + \tfrac{1}{2})$ characteristic of the mode before amplification. Thus amplification results in diminishing the detectability of the signal. When the phase of the signal is unknown, the false-alarm and detection probabilities can be calculated by eqs. (3.77) and (3.78), into which eq. (3.76) has been substituted, with $\mathcal{N}'$ replaced by $\mathcal{N}'' = G(\mathcal{N} + 1) - 1$ and $N_s'$ replaced by $N_s'' = GN_s$. In Figs. 3.5 and 3.6 we plot the detection probability $Q_d$ versus the average initial number $N_s$ of signal photons, for various gains $G$, taking $\mathcal{N} = 0$ and $\mathcal{N} = 1$, respectively. The effect of amplification is the smaller, the larger the average initial number $\mathcal{N}$ of noise photons per mode.

This decrease of signal detectability due to amplification is to be expected. Detection theory prescribes the best method of processing the input $v(t)$ to the receiver; that is, it gives the strategy yielding minimum average probability of error or maximum detection probability with fixed false-alarm probability. In neither of the cases we have treated does this strategy include an amplification of the data. Hence amplification cannot yield a lower average error probability or higher detection probability, and as it is accompanied by the addition of noise to the data, amplification generally diminishes the detectability of a signal.
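The loss of signal-to-noise ratio under amplification can be tabulated directly from the expressions above. The sketch below assumes a matched source and the minimum added noise, so that the amplified mode holds $G(\mathcal{N} + 1) - 1$ noise photons and $GN_s$ signal photons; the function names and the sample values are illustrative.

```python
def snr_squared_before(n_signal, n_noise):
    """D_g^2 = 4 N_s / (2 N + 1) for the matched mode before amplification."""
    return 4.0 * n_signal / (2.0 * n_noise + 1.0)

def snr_squared_after(n_signal, n_noise, gain):
    """Equivalent squared signal-to-noise ratio after amplification with minimum added noise."""
    return 4.0 * gain * n_signal / (2.0 * gain * (n_noise + 1.0) - 1.0)

def high_gain_limit(n_signal, n_noise):
    """Limit of the amplified ratio for very large gain: 2 N_s / (N + 1)."""
    return 2.0 * n_signal / (n_noise + 1.0)

n_signal = 4.0                            # hypothetical mean number of signal photons
for n_noise in (0.0, 1.0, 10.0):
    before = snr_squared_before(n_signal, n_noise)
    ratios = [snr_squared_after(n_signal, n_noise, g) / before for g in (1.0, 2.0, 10.0, 1000.0)]
    print(n_noise, before, ratios)
```

The ratio of the amplified to the unamplified value falls from unity at unit gain toward $(\mathcal{N} + \tfrac{1}{2})/(\mathcal{N} + 1)$ at high gain, so the penalty is a factor of two at zero noise and becomes slight when many noise photons are present.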
§ 4. Modal Decomposition of Aperture Fields
We turn now to the detection of light beams, which we suppose to be falling normally on the aperture of an optical instrument and coming from sources spanning a narrow solid angle as seen by the observer. Because stops might be used to cut out all the incident background light outside a slender cone of directions normal to the aperture of the instrument, without reducing the detectability of the light from the source, all rays can be treated as paraxial, and the light fields can be accurately described by a scalar wave theory (GREEN and WOLF [1953]; WALTHER [1967]). In thus dealing with a scalar field, our subsequent discussions assume that the incident light, both signal and background, is linearly polarized. To treat unpolarized light it is merely necessary to double the number of field modes considered when applying our results, and intermediate degrees of polarization can be similarly handled by breaking the incident light up into statistically independent, linearly polarized components. Furthermore, the light to be detected occupies a spectral band whose width $W$ is a small fraction of the central carrier frequency $\Omega = 2\pi c/\lambda$; $W \ll \Omega$.
The background light is distributed broadly not only in direction, but also in frequency, possessing a Planck spectral density with absolute temperature $\mathcal T$. Components of the background light having frequencies far outside the band occupied by the light from the source can be filtered out without affecting the detectability of the signal. The observer is to decide whether a certain coherent or incoherent light beam is present or not, and the only information on which he can base his decision resides in the scalar field $\psi(\mathbf r, t)$ at the aperture $A$ of his optical instrument during a finite interval $(0, T)$. He must so process that field that his decisions are made with the greatest effectiveness, and to this end he applies the principles of detection theory outlined in the preceding chapters. The field at the aperture $A$ is expanded in a set of orthonormal functions $\eta_p(\mathbf r)$,
$$\int_A \eta_p^*(\mathbf r)\,\eta_q(\mathbf r)\,d^2\mathbf r = \delta_{pq}, \tag{4.1}$$
$d^2\mathbf r$ being the two-dimensional element of area in the aperture. We write the field there as
$$\psi(\mathbf r, t) = \sum_p \psi_p(t)\,\eta_p(\mathbf r), \tag{4.2}$$
where the component
$$\psi_p(t) = \int_A \eta_p^*(\mathbf r)\,\psi(\mathbf r, t)\,d^2\mathbf r \tag{4.3}$$
will be called the $p$th aperture mode. Because the background light is spatially homogeneous, falling on the aperture from a cone of directions much broader than that from which the light to be detected arrives, its spatial coherence function can be assumed to be proportional to the two-dimensional delta-function $\delta(\mathbf r_1 - \mathbf r_2)$, and the separate aperture modes $\psi_p(t)$ will be statistically independent in the absence of a signal field (hypothesis $H_0$). By appropriately choosing the spatial expansion functions $\eta_p(\mathbf r)$ the aperture modes $\psi_p(t)$ can be made independent under hypothesis $H_1$ as well (KURIKSHA [1968]). This statistical independence requires the mode functions $\eta_p(\mathbf r)$ to vary but little over the correlation distance $\hbar c/K\mathcal T$ of the background field whenever any background light is present; the source must subtend a solid angle much smaller than $(K\mathcal T/\hbar\Omega)^2$.
In our ideal receiver the light passing through the aperture is admitted to a lossless cavity during the observation interval $(0, T)$. We assume that the aperture $A$ is so much larger than a wavelength of light that all the light falling on the aperture enters the cavity and none is reflected or diffracted backward. Each aperture mode $\psi_p(t)$ can be thought of as exciting a distinct combination of normal modes within the cavity in much the same way as the input source $v(t)$ to the transmission-line receiver in § 3 excited the normal modes of the line, and we can regard each of these sets of cavity modes as associated with a separate and independent transmission line matched to the external field. The $p$th equivalent transmission line, excited only by the $p$th aperture mode, can be imagined as having connected to its input terminals the generator of a voltage
$$v_p(t) = Z_0^{1/2}\,\psi_p(t), \tag{4.4}$$
the internal impedance of this generator being equal to the characteristic impedance $Z_0$ of the line. The energy absorbed by the $p$th line is, as in eq. (3.27), equal to
$$E_p = (4Z_0)^{-1}\int_0^T [v_p(t)]^2\,dt = \tfrac14\int_0^T [\psi_p(t)]^2\,dt, \tag{4.5}$$
and the total energy absorbed by the cavity is
$$E = \sum_p E_p = \tfrac14\int_0^T\!\!\int_A [\psi(\mathbf r, t)]^2\,d^2\mathbf r\,dt, \tag{4.6}$$
which specifies the normalization of the aperture field $\psi(\mathbf r, t)$.
In § 3 we supposed that the input is disconnected from the transmission line at time $t = T$ and that the modes of the line are observed immediately. Actually, their amplitudes could be measured at any subsequent time without changing the attainable false-alarm and detection probabilities, for they possess a known sinusoidal dependence on the time $t$, as in eq. (3.29). On a more fundamental level we recognize that the eigenvalues $\eta_k$ of the operator $\rho_1 - \Lambda_0\rho_0$ in eq. (2.6) are independent of the time of observation of the field in the ideal receiver, and so are the false-alarm and detection probabilities, as given by eq. (2.13). A change in the time of observation merely imposes a unitary transformation on the detection operator $\Pi$ or $\Pi_1$. It does not matter, therefore, at what time $t \ge T$ the cavity modes associated with the several aperture modes $\psi_p(t)$ are observed, once the aperture has been closed or, equivalently, the sources have been disconnected from the transmission lines.
Because the fields under consideration are quasimonochromatic, occupying a frequency band of width $W$ about a carrier frequency $\Omega = 2\pi c/\lambda$, the aperture modes can be unambiguously divided into their positive- and negative-frequency components,
$$\psi_p(t) = \psi_p^{(+)}(t) + \psi_p^{(-)}(t). \tag{4.7}$$
The former is a carrier $\exp(-i\Omega t)$ multiplied by a slowly varying modulation factor. Because $\psi_p(t)$ is real, the latter is the complex conjugate,
$$\psi_p^{(-)}(t) = [\psi_p^{(+)}(t)]^*, \tag{4.8}$$
and represents a corresponding modulation of $\exp(i\Omega t)$. Quantum-mechanically the positive- and negative-frequency components become adjoint noncommuting operators,
$$\psi_p^{(-)}(t) = [\psi_p^{(+)}(t)]^{+}. \tag{4.9}$$
Their commutators follow from those in eq. (3.42), and when the time-dependences $\exp(\pm i\Omega t)$ and the slowly varying nature of the modulations are taken into account, these can be written, by using eq. (4.4) in the conversion,
$$[\psi_p^{(+)}(t_1), \psi_q^{(-)}(t_2)] = 2\hbar\Omega\,\delta(t_1 - t_2)\,\delta_{pq}, \qquad [\psi_p^{(+)}(t_1), \psi_q^{(+)}(t_2)] = 0. \tag{4.10}$$
Distinct aperture modes commute by virtue of the orthogonality of the expansion functions $\eta_p(\mathbf r)$. The mode amplitudes $g_m$ introduced in § 3 are seen from eq. (3.12) to be proportional to the coefficients of a Fourier series for the input $v(t)$ over the interval $(0, T)$. Because of the limitation of the input to frequencies in the neighborhood of $\Omega$, only the positive-frequency part of $v(t)$ contributes to the Fourier integral for $g_m$. When we use eqs. (3.32) and (4.4) with the generator impedance equal to $Z_0$ and $Cl = T/Z_0$, we find that we can write the annihilation operators for mode $p$ as
$$a_{pm} = (2\hbar\Omega)^{-1/2}\int_0^T \gamma_m^*(t)\,\psi_p^{(+)}(t)\,e^{i\Omega t}\,dt, \tag{4.11}$$
$$\gamma_m(t) = T^{-1/2}\exp[-i(\omega_m - \Omega)t], \qquad \omega_m = 2\pi m/T, \tag{4.12}$$
where we have replaced the factor $(\hbar\omega_m)^{-1/2}$ with $(\hbar\Omega)^{-1/2}$ because $\omega_m \approx \Omega$. The expansion functions $\gamma_m(t)$ are orthonormal over the observation interval $(0, T)$. In treating the detection of incoherent light beams, however, we shall find it convenient to expand the aperture modes $\psi_p^{(+)}(t)$ in series of orthonormal functions $\gamma_m(t)\exp(-i\Omega t)$ different from the sinusoidal ones in eq. (4.12). In general, then, we shall define the annihilation operators $a_{pm}$ for the $p$th aperture mode by eq. (4.11) with
$$\int_0^T \gamma_m^*(t)\,\gamma_n(t)\,dt = \delta_{mn}. \tag{4.13}$$
Which expansion functions are used will depend on the spectral density of the light to be detected, and as we shall assume spectral purity, the same set $\gamma_m(t)$ will apply to all the aperture modes $\psi_p(t)$. These annihilation operators $a_{pm}$ and their adjoint creation operators $a_{pm}^{+}$ will obey the commutation rules
$$[a_{pm}, a_{qn}^{+}] = \delta_{pq}\,\delta_{mn}, \qquad [a_{pm}, a_{qn}] = 0, \tag{4.14}$$
by virtue of the commutators given in eq. (4.10). These have also been derived from the commutation rules for a scalar field in free space (HELSTROM [1970b]). We are now writing the $p$th aperture mode, as a function of time, in the form
$$\psi_p(t) = (2\hbar\Omega)^{1/2}\sum_m \left[a_{pm}\,\gamma_m(t)\,e^{-i\Omega t} + a_{pm}^{+}\,\gamma_m^*(t)\,e^{i\Omega t}\right]. \tag{4.15}$$
The total energy in the $p$th mode is, by eqs. (4.5), (4.7) and (4.13),
$$E_p = \tfrac12\int_0^T \psi_p^{(-)}(t)\,\psi_p^{(+)}(t)\,dt = \hbar\Omega\sum_m a_{pm}^{+}a_{pm}, \tag{4.16}$$
and $a_{pm}^{+}a_{pm}$ is the number operator for the $m$th temporal mode of $\psi_p(t)$.
Once the spatial modes have been properly chosen, the results of our analysis in § 3 can be applied to the detection of coherent light beams. Suppose, for example, that the light to be detected is coming from an ideal laser whose emissions are pulses of known temporal form $s(t)$. Let the field created by the laser at point $\mathbf r$ of the aperture be proportional to $f(\mathbf r)$, where
$$\int_A |f(\mathbf r)|^2\,d^2\mathbf r = 1. \tag{4.17}$$
We then take the first aperture mode function $\eta_1(\mathbf r)$ equal to $f(\mathbf r)$, and we pick the remaining functions $\eta_p(\mathbf r)$ to be orthogonal to $f(\mathbf r)$ over the aperture. Our results in § 3 determine the detectability of the laser pulse in the presence of spatially and temporally white thermal background light of absolute temperature $\mathcal T$. If the carrier phase of the laser pulse is known, the analysis in § 3.3 applies; if it is unknown, that in § 3.4 applies.
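The orthonormality required of the temporal expansion functions in eq. (4.13) can be verified numerically for the sinusoidal set of eq. (4.12). The sketch below assumes, purely for illustration, that the carrier frequency is itself a multiple of $2\pi/T$, $\Omega = 2\pi m_0/T$, so that $\omega_m - \Omega = 2\pi(m - m_0)/T$; the values of `m0`, `T` and the step count are arbitrary choices, not values from the text.

```python
import cmath
import math


def gamma(m, t, period, m0):
    """Sinusoidal expansion function of eq. (4.12), with Omega = 2*pi*m0/period assumed."""
    return cmath.exp(-2j * math.pi * (m - m0) * t / period) / math.sqrt(period)


def inner_product(m, n, period=2.0, m0=1000, steps=4096):
    """Riemann-sum approximation to the integral of conj(gamma_m) * gamma_n over (0, period)."""
    dt = period / steps
    total = 0j
    for k in range(steps):
        t = k * dt
        total += gamma(m, t, period, m0).conjugate() * gamma(n, t, period, m0)
    return total * dt


if __name__ == "__main__":
    print(abs(inner_product(1001, 1001)))  # close to 1
    print(abs(inner_product(1001, 1003)))  # close to 0
```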
§ 5. Detection of Incoherent Light
5.1. THE OPTICAL FIELDS
By incoherent light we mean light emitted by natural sources, which consist of enormous numbers of randomly emitting atoms and ions. Classically, the fields produced by such sources can be treated as spatio-temporal Gaussian random processes, for they are the sums of a great many small, random components produced by individual atomic radiators. Quantum-mechanically, the density operators of these fields have Gaussian P-representations, and the fields possess only first-order coherence (GLAUBER [1963]).
When the light to be detected is present (hypothesis $H_1$), the positive-frequency half $\psi^{(+)}(\mathbf r, t)$ of the field at the aperture of the observing instrument consists of two parts,
$$\psi^{(+)}(\mathbf r, t) = \psi_s(\mathbf r, t) + \psi_n(\mathbf r, t),$$
$\psi_s(\mathbf r, t)$ representing the beam to be detected (the signal), and $\psi_n(\mathbf r, t)$ representing the background light (the noise). (We omit the superscript $(+)$ indicating that $\psi_s$ and $\psi_n$ contain only positive frequencies.) The background light arrives with a distribution in frequency and angle so much broader than that of the signal that the background can be taken as spatially and temporally white; its mutual coherence function has the form
$$\tfrac12 E[\psi_n^{+}(\mathbf r_1, t_1)\,\psi_n(\mathbf r_2, t_2)] = N_0\,\delta(\mathbf r_1 - \mathbf r_2)\,\delta(t_1 - t_2), \tag{5.1}$$
where the superscript $+$ indicates classically the complex conjugate, quantum-mechanically the adjoint. The signal field we assume to be spectrally pure, so that its mutual coherence function can be factored into a spatial part $\varphi_s(\mathbf r_1, \mathbf r_2)$ and a temporal part $\chi(t_1 - t_2)$; we write it in the form
$$\tfrac12 E[\psi_s^{+}(\mathbf r_1, t_1)\,\psi_s(\mathbf r_2, t_2)] = \varphi_s(\mathbf r_1, \mathbf r_2)\,\chi(t_1 - t_2)\exp[-i\Omega(t_1 - t_2)], \tag{5.2}$$
where $\Omega = 2\pi c/\lambda$ is the angular frequency of the carrier. The temporal coherence function $\chi(\tau)$ is so normalized that $\chi(0) = 1$; its Fourier transform
$$X(\omega) = \int_{-\infty}^{\infty} \chi(\tau)\,e^{i\omega\tau}\,d\tau \tag{5.3}$$
represents the spectral density of the object light, with angular frequencies $\omega$ referred to $\Omega$ as origin. The bandwidth $W$ of this light is conveniently defined as
$$W = \left[\int_{-\infty}^{\infty} |\chi(\tau)|^2\,d\tau\right]^{-1}; \tag{5.4}$$
$W$ is the reciprocal of the coherence time defined by MANDEL [1958]. The signal and noise processes are both circular Gaussian in the sense that $E[\psi^{(+)}(\mathbf r_1, t_1)\,\psi^{(+)}(\mathbf r_2, t_2)] = 0$ (HELSTROM [1968c] pp. 69-72).
The signal is considered as arising from a distant luminous source, which can be taken as planar and as having a radiance $B(\mathbf u)$ as a function of position $\mathbf u = (u_x, u_y)$. The spatial part of the mutual coherence function at the aperture is then
$$\varphi_s(\mathbf r_1, \mathbf r_2) = (4\pi R^2)^{-1}\exp\left[\frac{ik}{2R}(r_1^2 - r_2^2)\right]\beta(\mathbf r_1 - \mathbf r_2), \tag{5.5}$$
where $k = \Omega/c = 2\pi/\lambda$ is the propagation constant, $R$ is the distance between object and aperture, and
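$$\beta(\mathbf r) = \int_O B(\mathbf u)\exp(ik\,\mathbf u\cdot\mathbf r/R)\,d^2\mathbf u, \tag{5.6}$$
$O$ designating an integration over the source or object plane. Equation (5.5) follows directly from the Fresnel-Kirchhoff diffraction formula as applied to the object plane, at which the mutual coherence function of the signal has the form
$$\tfrac12 E[\psi_s^{+}(\mathbf u_1, t_1)\,\psi_s(\mathbf u_2, t_2)] = \pi k^{-2}B(\mathbf u_1)\,\delta(\mathbf u_1 - \mathbf u_2)\,\chi(t_1 - t_2)\exp[-i\Omega(t_1 - t_2)], \tag{5.7}$$
the delta-function indicating the spatial incoherence of the light upon emission by the object plane. The average total energy received from the object is, by eqs. (4.16), (5.2), (5.5) and (5.6),
$$E_s = \int_A\!\int_0^T \varphi_s(\mathbf r, \mathbf r)\,\chi(0)\,d^2\mathbf r\,dt = B_T AT/4\pi R^2, \qquad B_T = \beta(0) = \int_O B(\mathbf u)\,d^2\mathbf u. \tag{5.8}$$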
The most useful decomposition of the aperture field into modes as in eq. (4.2) now involves as expansion functions $\eta_p(\mathbf r)$ the eigenfunctions of the spatial part $\varphi_s(\mathbf r_1, \mathbf r_2)$ of the mutual coherence function of the object light; they are solutions of the integral equation
$$h_p\,\eta_p(\mathbf r_1) = (AB_T)^{-1}\int_A \beta(\mathbf r_2 - \mathbf r_1)\,\eta_p(\mathbf r_2)\,d^2\mathbf r_2 \tag{5.9}$$
with eigenvalues $h_p$ that sum to 1,
$$\sum_p h_p = 1, \tag{5.10}$$
and indicate the fraction of the total signal energy going into each aperture mode $\psi_p(t)$. By using these expansion functions, we ensure that the aperture modes $\psi_p(t)$ are statistically independent both in the presence and in the absence of the object light.
For the temporal decomposition of each aperture mode $\psi_p(t)$ we use the eigenfunctions $\gamma_m(t)$ of the temporal part of the mutual coherence function of the object light; these are defined by the integral equation
$$\chi_m\,\gamma_m(t_1) = T^{-1}\int_0^T \chi(t_1 - t_2)\,\gamma_m(t_2)\,dt_2. \tag{5.11}$$
The eigenvalues $\chi_m$ sum to 1,
$$\sum_m \chi_m = 1, \tag{5.12}$$
and specify the fractions of the signal energy in each aperture mode $\psi_p(t)$ that are associated with the terms $\gamma_m(t)\exp(-i\Omega t)$ in eq. (4.15). When, as is often the case, the product $WT$ of the signal bandwidth and the observation time is very large, the eigenvalues $\chi_m$ are given approximately by
$$\chi_m \approx T^{-1}X(2\pi m/T), \tag{5.13}$$
where $X(\omega)$ is the spectral density of the object light as defined in eq. (5.3), and $m$ takes on both positive and negative integral values. The associated eigenfunctions are approximately the complex exponentials given by eq. (4.12), as can be seen by substitution into the integral equation (5.11) (GRENANDER and SZEGÖ [1958] § 8.6, pp. 136-139).
We define the combination $\eta_p(\mathbf r)\,\gamma_m(t)\exp(-i\Omega t)$ as a spatio-temporal mode of the aperture field. It possesses on the average a fraction $h_p\chi_m$ of the energy received from the object. All the spatio-temporal modes are statistically independent under both hypotheses $H_0$ ("signal absent") and $H_1$ ("signal present"). From eqs. (4.11), (5.2) and (5.3) we can calculate the average number of photons in the spatio-temporal mode $(pm)$ under hypothesis $H_1$,
$$E(a_{pm}^{+}a_{pm} \mid H_1) = \mathrm{Tr}(\rho_1 a_{pm}^{+}a_{pm}) = N_{pm} + \mathcal N, \tag{5.14}$$
where
$$N_{pm} = h_p\chi_m E_s/\hbar\Omega = h_p\chi_m N_s \tag{5.15}$$
is the average number of signal photons in the mode and $\mathcal N = N_0/\hbar\Omega$ is the average number of noise photons; $N_s = E_s/\hbar\Omega$ is the total average number of photons received at aperture $A$ from the object during the observation interval $(0, T)$. The number $\mathcal N = [\exp(\hbar\Omega/K\mathcal T) - 1]^{-1}$ is given by the Planck law, eq. (3.50), in terms of Boltzmann's constant $K$ and the effective absolute temperature $\mathcal T$ of the background light; it is the same for all the significant spatio-temporal modes, for we again assume that $\hbar W \ll K\mathcal T$ whenever $\mathcal N > 0$. Under hypothesis $H_0$,
$$E(a_{pm}^{+}a_{pm} \mid H_0) = \mathrm{Tr}(\rho_0 a_{pm}^{+}a_{pm}) = \mathcal N. \tag{5.16}$$
The number of photons in each mode has a Bose distribution under both hypotheses $H_0$ and $H_1$; the mean numbers given by eqs. (5.14) and (5.16) suffice to determine these distributions.
5.2. THE OPTIMUM RECEIVER
Because we have arranged for statistically independent modes under both hypotheses, both the density operators $\rho_0$ and $\rho_1$ for the field in the ideal receiver factor into products of density operators for the individual spatio-temporal modes. For each mode the density operator has, under both $H_0$ and $H_1$, a Gaussian P-representation with mean zero, like that in eq. (3.58), in which $\mathcal N' = \mathcal N$ under hypothesis $H_0$ and $\mathcal N' = N_{pm} + \mathcal N$ under $H_1$. Both these modal density operators are therefore diagonal in the number representation. Let $|\nu_{pm}\rangle$ be the eigenstate of the number operator $a_{pm}^{+}a_{pm}$ with eigenvalue $\nu_{pm}$,
$$a_{pm}^{+}a_{pm}|\nu_{pm}\rangle = \nu_{pm}|\nu_{pm}\rangle. \tag{5.17}$$
Then the density operators for the entire field are
$$\rho_i = \prod_{p,m}\sum_{\nu_{pm}=0}^{\infty} P_{pm}^{(i)}(\nu_{pm})\,|\nu_{pm}\rangle\langle\nu_{pm}|, \qquad i = 0, 1, \tag{5.18}$$
$$P_{pm}^{(i)}(\nu) = (1 - v_{pm}^{(i)})\,(v_{pm}^{(i)})^{\nu}, \qquad v_{pm}^{(i)} = N_{pm}^{(i)}/(N_{pm}^{(i)} + 1), \qquad i = 0, 1,$$
$$N_{pm}^{(0)} = \mathcal N, \qquad N_{pm}^{(1)} = N_{pm} + \mathcal N = h_p\chi_m N_s + \mathcal N. \tag{5.19}$$
Because the density operators $\rho_0$ and $\rho_1$ are simultaneously diagonal in the number representation, the optimum receiver is one that measures the numbers $\nu_{pm}$ of photons in each spatio-temporal mode, forms the likelihood ratio
$$\Lambda(\{\nu_{pm}\}) = \prod_{p,m} P_{pm}^{(1)}(\nu_{pm})/P_{pm}^{(0)}(\nu_{pm}), \tag{5.20}$$
and compares it with an appropriate decision level $\Lambda_0$, deciding that the signal field is present if $\Lambda > \Lambda_0$. Equivalently, it compares the logarithmic likelihood ratio
$$\ln\Lambda(\{\nu_{pm}\}) = \sum_{p,m}\left[\nu_{pm}\ln\frac{v_{pm}^{(1)}}{v_{pm}^{(0)}} + \ln\frac{1 - v_{pm}^{(1)}}{1 - v_{pm}^{(0)}}\right] \tag{5.21}$$
with $\ln\Lambda_0$ or with a decision level determined by the false-alarm probability. In the subsequent sections we shall for various special cases study this optimum receiver or its approximating threshold receiver and evaluate its performance.
5.3. DETECTION OF POINT SOURCES
When the light to be detected comes from a point source such as a distant star, its field possesses complete first-order spatial coherence over the aperture, $\beta(\mathbf r) \equiv \beta(0)$, $\mathbf r \in A$. The lowest spatial eigenfunction is then constant,
$$\eta_1(\mathbf r) \equiv A^{-1/2}, \quad \mathbf r \in A, \qquad h_1 = 1, \tag{5.22}$$
where $A$ stands for the area of the aperture. The remaining eigenfunctions are orthogonal to $\eta_1(\mathbf r)$ over the aperture. Only the aperture mode $\psi_1(t)$ is excited by the light from the object, and the others can be disregarded. We can therefore drop the spatial-mode subscript $p$ in the preceding equations, putting $h_p = 1$ and restricting our attention to the temporal modes $\gamma_m(t)\exp(-i\Omega t)$ into which $\psi_1(t)$ is decomposed.
5.3.1. Extreme quantum limit
In the extreme quantum limit $\mathcal N = 0$ ($\mathcal T = 0$) no photons are counted in any of the temporal modes in the absence of a signal (hypothesis $H_0$). It is therefore necessary to use a randomized strategy, as discussed in § 1.2, in order to attain a pre-assigned false-alarm probability $Q_0$. The receiver will choose hypothesis $H_1$ whenever any photons at all are counted; when no photons are observed, it chooses hypothesis $H_1$ with probability $Q_0$. The probability of detection is now
$$Q_d = 1 - p_1(0) + Q_0\,p_1(0) = 1 - (1 - Q_0)\,p_1(0), \tag{5.23}$$
where $p_1(0)$ is the probability that no photons are found in any of the modes under hypothesis $H_1$, that is, by eq. (5.19),
$$p_1(0) = \prod_m (1 - v_m^{(1)}) = \prod_m (1 + \chi_m N_s)^{-1} = [f(N_s)]^{-1}, \tag{5.24}$$
where $\chi_m$ is an eigenvalue of the integral equation (5.11). Our previous assumption that $\hbar W \ll K\mathcal T$ is unnecessary here, the number $\mathcal N$ being zero in all modes. The function
$$f(z) = \prod_m (1 + \chi_m z) \tag{5.25}$$
is the Fredholm determinant of the integral equation (5.11). For a source with a Lorentz spectrum
$$X(\omega) = 2W(\omega^2 + W^2)^{-1}, \qquad \chi(\tau) = e^{-W|\tau|}, \tag{5.26}$$
the Fredholm determinant is well known,
$$f(z) = e^{-\mu}\left[\cosh b + \frac{b^2 + \mu^2}{2\mu b}\sinh b\right], \qquad b = (\mu^2 + 2\mu z)^{1/2}, \quad \mu = WT, \tag{5.27}$$
(SIEGERT [1957]). In Fig. 5.1 we have plotted the probability $Q_d$ of detection for a fixed false-alarm probability $Q_0$ and various values of the time-bandwidth product $\mu = WT$. For $\mu = 0$, $p_1(0) = (1 + N_s)^{-1}$, and in the limit $\mu \gg 1$, $p_1(0) = \exp(-N_s)$. These curves would be altered only imperceptibly for smaller false-alarm probabilities.
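The detection probability of eqs. (5.23) and (5.24) is easy to evaluate once the Fredholm determinant is available. The sketch below assumes the standard closed form for the exponential kernel $T^{-1}e^{-W|\tau|}$ on $(0, T)$, $f(z) = e^{-\mu}[\cosh b + (b^2+\mu^2)(2\mu b)^{-1}\sinh b]$ with $b = (\mu^2 + 2\mu z)^{1/2}$; it reduces to $1 + z$ as $\mu \to 0$ and to approximately $e^{z}$ for $\mu \gg 1$, in agreement with the two limits quoted above. The function names and the sample false-alarm probability are illustrative assumptions.

```python
import math


def fredholm_lorentz(z, mu):
    """Fredholm determinant prod(1 + chi_m z) for a Lorentz spectrum, mu = W*T (assumed closed form)."""
    if mu == 0.0:
        return 1.0 + z  # a single temporal mode carries all the signal energy
    b = math.sqrt(mu * mu + 2.0 * mu * z)
    g = (b * b + mu * mu) / (2.0 * mu * b)
    # exp(-mu) * (cosh b + g sinh b), written with exponentials to avoid overflow
    return 0.5 * (math.exp(b - mu) * (1.0 + g) + math.exp(-b - mu) * (1.0 - g))


def detection_probability(n_signal, mu, q0):
    """Eq. (5.23): Qd = 1 - (1 - Q0) p1(0), with p1(0) = 1 / f(Ns) from eq. (5.24)."""
    p_none = 1.0 / fredholm_lorentz(n_signal, mu)
    return 1.0 - (1.0 - q0) * p_none


if __name__ == "__main__":
    for mu in (0.0, 1.0, 5.0, 50.0):
        print(mu, detection_probability(10.0, mu, 1e-4))
```

Run with a fixed $N_s$, the printed values increase with $\mu$, so a larger time-bandwidth product makes the source easier to detect at this limit.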
Fig. 5.1. Probability $Q_d$ of detecting linearly polarized incoherent light from a point source versus number $N_s$ of photons; $\mathcal N = 0$, fixed false-alarm probability $Q_0$. Curves are indexed by the time-bandwidth product $WT$. Solid lines: Lorentz spectrum; dashed lines: rectangular spectrum. The curve for $WT = 6.366$ for the rectangular spectrum nearly coincides with that for $WT = 5$ for the Lorentz spectrum.
If the incident light is unpolarized, it can be broken up into two statistically independent linearly polarized modes with an average number $\tfrac12 N_s$ of signal photons in each. The detection probability is now
$$Q_d = 1 - (1 - Q_0)\,[f(\tfrac12 N_s)]^{-2} \tag{5.28}$$
and is plotted in Fig. 5.2 for various values of $WT$, again with the same false-alarm probability $Q_0$.
Whether polarized or unpolarized, the light is the more easily detected, the greater its bandwidth. The slow descent of the Lorentz spectrum to zero with increasing $\omega$ is in strong contrast to the sharp cutoff of the rectangular spectrum,
$$X(\omega) = W^{-1}, \quad -\pi W < \omega < \pi W; \qquad X(\omega) = 0, \quad |\omega| > \pi W, \tag{5.29}$$
for which the temporal coherence function is
$$\chi(\tau) = \mathrm{sinc}\,W\tau, \tag{5.30}$$
with $\mathrm{sinc}\,x = \sin(\pi x)/(\pi x)$. The integral equation (5.11) that results has been solved by SLEPIAN and POLLAK [1961], and extensive tables of eigenvalues have been provided by SLEPIAN and SONNENBLICK [1965]. In their notation,
$$\chi_m = \lambda_m(c)/WT = \pi\lambda_m(c)/2c, \qquad c = \tfrac12\pi WT; \tag{5.31}$$
the eigenfunctions $\gamma_m(t)$ are prolate spheroidal wave functions. For $WT \gg 1$ there are approximately $WT$ eigenvalues $\chi_m$ equal to $(WT)^{-1}$, and the rest are exponentially small; compare eq. (5.13). In order to show the effect of the spectral shape on the detectability of incoherent light at the extreme quantum limit, we have plotted as dashed lines in Figs. 5.1 and 5.2 the detection probability calculated from eqs. (5.24) and (5.28) by using these eigenvalues.
Fig. 5.2. Probability $Q_d$ of detecting unpolarized incoherent light from a point source versus number $N_s$ of photons; $\mathcal N = 0$, fixed false-alarm probability $Q_0$. Curves are indexed by the time-bandwidth product $WT$. Solid lines: Lorentz spectrum; dashed lines: rectangular spectrum.
5.3.2. Intermediate range
The logarithmic likelihood ratio of eq. (5.21) cannot be converted into a statistic having only integral values when $\mathcal N \neq 0$, for the factors $\ln(v_m^{(1)}/v_m^{(0)})$ are not in general commensurable. The distributions of $\ln\Lambda(\{\nu_m\})$ under the two hypotheses are then very difficult to calculate.
When the spectral density of the signal is rectangular, and the time-bandwidth product $WT$ is large, however, there are approximately $WT$ temporal eigenvalues $\chi_m = (WT)^{-1}$, and the rest are negligible. If we make this approximation, we find that the optimum receiver of the light forms the sum
$$\nu = \sum_{m=1}^{\mu}\nu_m, \qquad \mu = WT, \tag{5.32}$$
of the numbers $\nu_m$ of photons in the $\mu$ significant temporal modes. The sum $\nu$ differs now from the logarithmic likelihood ratio $\ln\Lambda(\{\nu_m\})$ only by an additive constant and a constant factor, which can be absorbed into the decision level. Since $\nu$ is an integer, a randomized decision rule must be applied as described in § 1.2, and the false-alarm and detection probabilities are calculated by eqs. (1.12) and (1.13). The probability of obtaining a total of $\nu$ photons is now the generalized Bose-Einstein distribution
$$p_i(\nu) = [(\mu - 1)!]^{-1}(\nu!)^{-1}(\nu + \mu - 1)!\,(1 - v_i)^{\mu}\,v_i^{\nu}, \qquad i = 0, 1, \tag{5.33}$$
$$v_0 = \mathcal N/(\mathcal N + 1), \qquad v_1 = (\mathcal N + \mu^{-1}N_s)/(\mathcal N + \mu^{-1}N_s + 1). \tag{5.34}$$
The mean total numbers of photons are obtained from eq. (5.14),
$$\bar\nu_0 = E(\nu \mid H_0) = \mu\mathcal N, \qquad \bar\nu_1 = E(\nu \mid H_1) = \mu\mathcal N + N_s. \tag{5.35}$$
The probability of detection has been plotted in Fig. 5.3 for $\mathcal N = 1$, $Q_0 = 10^{-4}$, and various values of the time-bandwidth product $\mu = WT$. The curve for $\mu = 1$ gives the detection probability for a single temporal mode when $\mathcal N = 1$ and can be calculated by eq. (1.15). For $\mu = 2$ and 5 the curves represent only crude approximations.
Fig. 5.3. Probability $Q_d$ of detecting an incoherent point source with a rectangular spectral density versus number $N_s$ of photons from the source; $\mathcal N = 1$, $Q_0 = 10^{-4}$. Curves are indexed with the number $\mu = WT$ of significant modes.
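The calculation behind curves of this kind can be sketched as follows. Eqs. (1.12) and (1.13) are not reproduced in this section, so the randomized rule is assumed here to have the usual Neyman-Pearson form: choose $H_1$ when $\nu$ exceeds an integer threshold, and choose $H_1$ with a fixed probability when $\nu$ equals the threshold, the threshold and that probability being set so that the false-alarm probability is exactly $Q_0$. The function names are illustrative, and $\mu$ is taken to be an integer.

```python
from math import comb


def bose_einstein_pmf(count, modes, v):
    """Generalized Bose-Einstein probability of eq. (5.33) for `count` photons in `modes` modes."""
    return comb(count + modes - 1, count) * (1.0 - v) ** modes * v ** count


def detection_probability(n_signal, n_noise, modes, q0):
    """Detection probability of the randomized photon-counting test at false-alarm probability q0."""
    v0 = n_noise / (n_noise + 1.0)                 # eq. (5.34), hypothesis H0
    mean1 = n_noise + n_signal / modes
    v1 = mean1 / (mean1 + 1.0)                     # eq. (5.34), hypothesis H1
    above0 = 1.0                                   # P(nu >= k | H0), updated below
    above1 = 1.0                                   # P(nu >= k | H1), updated below
    k = 0
    while True:
        p0 = bose_einstein_pmf(k, modes, v0)
        p1 = bose_einstein_pmf(k, modes, v1)
        above0 -= p0                               # now P(nu > k | H0)
        above1 -= p1                               # now P(nu > k | H1)
        if above0 <= q0:
            # k is the threshold; randomize at nu == k to reach q0 exactly
            fraction = (q0 - above0) / p0
            return above1 + fraction * p1
        k += 1


if __name__ == "__main__":
    for n_signal in (5.0, 10.0, 20.0, 40.0):
        print(n_signal, detection_probability(n_signal, 1.0, 5, 1e-4))
```

With $\mathcal N = 0$ the threshold is zero and the result reduces to eq. (5.23) with $p_1(0) = (1 + \mu^{-1}N_s)^{-\mu}$.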
5.3.3. The Poisson limit
If the time-bandwidth product $WT$ increases without limit and at the same time the mean number $\mathcal N$ of noise photons per mode decreases in such a way that $E(\nu \mid H_0) = \mathcal N WT$ remains fixed, the distributions of the total number $\nu$ of photons counted become in the limit $\mu \to \infty$ Poisson distributions, independently of the spectral density of the light from the point source,
$$p_i(\nu) = \bar\nu_i^{\,\nu}\exp(-\bar\nu_i)/\nu!, \qquad i = 0, 1. \tag{5.36}$$
A detector based on observations of the number $\nu$ is optimum, however, only when the spectral density of the signal is rectangular, as in eq. (5.29). In Fig. 5.4 we have plotted the probability $Q_d$ of detection in this limit $WT \gg 1$ versus the average number $N_s$ of signal photons for various values of $\mathcal N WT \le 1$. For $\mathcal N WT \ge 1$ the detection probability is plotted in Fig. 5.5 versus the signal-to-noise ratio
$$D_p = N_s(\mathcal N WT)^{-1/2}. \tag{5.37}$$
For $\mathcal N WT \gg 1$ the distribution of the sum $\nu$ is nearly Gaussian, and the false-alarm and detection probabilities are approximately as in eq. (3.65) with $D_p$ in place of $D_q$.
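The passage to the Poisson limit can be illustrated numerically by comparing the generalized Bose-Einstein distribution of eq. (5.33) with the Poisson distribution of eq. (5.36) as the number of modes grows while the mean total count is held fixed. This is only a sketch; the function names, the sample mean of 2 photons and the mode numbers are arbitrary choices made for the illustration.

```python
from math import exp, lgamma, log, log1p


def bose_einstein_pmf(count, modes, mean_per_mode):
    """Eq. (5.33) evaluated through logarithms so that large mode numbers cause no overflow."""
    log_v = log(mean_per_mode) - log1p(mean_per_mode)   # ln[N / (N + 1)]
    log_one_minus_v = -log1p(mean_per_mode)             # ln[1 / (N + 1)]
    log_p = (lgamma(count + modes) - lgamma(modes) - lgamma(count + 1)
             + modes * log_one_minus_v + count * log_v)
    return exp(log_p)


def poisson_pmf(count, mean):
    """Poisson probability of eq. (5.36)."""
    return exp(count * log(mean) - mean - lgamma(count + 1))


def largest_gap(total_mean, modes, max_count=40):
    """Largest pointwise difference between the two distributions for counts below max_count."""
    return max(abs(bose_einstein_pmf(k, modes, total_mean / modes) - poisson_pmf(k, total_mean))
               for k in range(max_count))


if __name__ == "__main__":
    for modes in (2, 20, 200, 20000):
        print(modes, largest_gap(2.0, modes))
```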
Fig. 5.4. Probability $Q_d$ of detecting an incoherent point source having a rectangular spectral density and supplying a total number $N_s$ of photons in the Poisson limit $\mathcal N \ll 1$, $WT \gg 1$. The curves are indexed by the average total number $\mathcal N WT$ of thermal photons in the significant modes; fixed false-alarm probability $Q_0$.
Fig. 5.5. Probability $Q_d$ of detecting an incoherent point source having a rectangular spectral density and supplying a total number $N_s$ of photons, versus $D_p = N_s(\mathcal N WT)^{-1/2}$, in the Poisson limit $\mathcal N \ll 1$, $WT \gg 1$. The curves are indexed by the average total number $\mathcal N WT$ of thermal photons in the significant modes; fixed false-alarm probability $Q_0$.
5.3.4. The effect of amplification
When the spectral density of the signal is rectangular, the $\mu = WT$ temporal modes correspond to the sinusoidal eigenfunctions $\gamma_m(t)\exp(-i\Omega t)$ of eq. (4.12) and can be associated directly with the modes of the transmission-line receiver as treated in § 3. We can then envision amplifying the mixture of incoherent signal and background as described in § 3.5. Under hypothesis $H_1$ the mean number of photons in each of the significant temporal modes before amplification is
$$\bar n_1 = \mathcal N + \mu^{-1}N_s; \tag{5.38}$$
after amplification with gain $G$ it is under the best of circumstances given by eq. (3.90) as
$$\bar n_1' = G(\bar n_1 + 1) - 1 = G(\mathcal N + \mu^{-1}N_s + 1) - 1, \tag{5.39}$$
and the parameter $v_1$ in the distribution in eq. (5.34) is
$$v_1 = \bar n_1'/(\bar n_1' + 1), \tag{5.40}$$
with $v_0$ obtained by setting $N_s = 0$. In Figs. 5.6 and 5.7 we have plotted the probability of detection calculated from the distributions of eq. (5.33) with these values of $v_0$ and $v_1$, for $WT = 5$ and 10 and for $\mathcal N = 0$ and various amplification factors $G$. A randomized decision procedure leading to an exact false-alarm probability $Q_0$ was postulated. Again the detectability of the signal decreases with increasing amplification. For larger values of $\mathcal N$ the effect of amplification is less severe.
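As a small illustration of how the amplified parameters enter the count distribution, the sketch below evaluates the per-mode mean numbers after amplification, taken in the form $G(\bar n + 1) - 1$ attributed above to eq. (3.90), and the resulting parameters $v_0$ and $v_1$. The function name and the sample values are hypothetical.

```python
def amplified_parameters(n_signal, n_noise, modes, gain=1.0):
    """Return (v0, v1) for the count distribution of eq. (5.33) after amplification with gain G."""
    mean0 = gain * (n_noise + 1.0) - 1.0                        # per-mode mean under H0
    mean1 = gain * (n_noise + n_signal / modes + 1.0) - 1.0     # per-mode mean under H1
    return mean0 / (mean0 + 1.0), mean1 / (mean1 + 1.0)


if __name__ == "__main__":
    # With no thermal background the unamplified receiver (G = 1) counts nothing under H0;
    # any gain G > 1 introduces counts under H0 as well.
    for gain in (1.0, 2.0, 10.0):
        print(gain, amplified_parameters(20.0, 0.0, 5, gain))
```

For $G = 1$ the parameters reduce to those of eq. (5.34); for $G > 1$ both $v_0$ and $v_1$ move toward unity, so the two count distributions broaden and overlap more.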
Fig. 5.6. Probability $Q_d$ of detecting an amplified incoherent signal of random phase and rectangular spectral density versus the mean number $N_s$ of signal photons; $\mathcal N = 0$, $WT = 5$, fixed false-alarm probability $Q_0$. The curves are indexed with the gain $G$.
Fig. 5.7. Probability $Q_d$ of detecting an amplified incoherent signal of random phase and rectangular spectral density versus the mean number $N_s$ of signal photons; $\mathcal N = 0$, $WT = 10$, fixed false-alarm probability $Q_0$. The curves are indexed with the gain $G$.
5.3.5. The classical limit
For the sake of completeness we analyze the detection of light from an incoherent point source in the classical limit when the average number of photons in each temporal mode is very large under both hypotheses. The problem is then the same as that of detecting a narrowband Gaussian stochastic signal having energy $E_s = \hbar\Omega N_s$ and a normalized temporal mutual coherence function $\chi(\tau)\exp(-i\Omega\tau)$ in the presence of white Gaussian noise of unilateral spectral density $K\mathcal T = \hbar\Omega\mathcal N$. Our results will also apply to the detection of an incoherently emitting point source after extreme amplification, $G \gg 1$, the number of noise and signal photons in each temporal mode $\gamma_m(t)\exp(-i\Omega t)$ being assumed given by an expression like eq. (3.90) after amplification; it is only necessary in what follows to replace $\mathcal N$ by $\mathcal N + 1$.
In the limit $\mathcal N \gg 1$, $N_s \gg 1$, the logarithmic likelihood ratio in eq. (5.21) becomes, with the spatial-mode subscript $p$ dropped,
$$\ln\Lambda(\{\nu_m\}) = \sum_m\left[\nu_m\left(\frac{1}{N_m^{(0)}} - \frac{1}{N_m^{(1)}}\right) - \ln\frac{N_m^{(1)}}{N_m^{(0)}}\right] = \sum_m\left[D^2\chi_m(1 + D^2\chi_m)^{-1}x_m - \ln(1 + D^2\chi_m)\right], \tag{5.41}$$
where $D^2 = N_s/\mathcal N = E_s/K\mathcal T$ is the signal-to-noise ratio, $\chi_m$ is the temporal eigenvalue defined in eq. (5.11), and $x_m = \nu_m/\mathcal N$. Here we have used eq. (5.19). In this classical limit $x_m = \nu_m/\mathcal N$ is a continuous random variable with exponential p.d.f.'s under both hypotheses,
$$p_{0m}(x_m) = \exp(-x_m), \qquad p_{1m}(x_m) = (1 + D^2\chi_m)^{-1}\exp[-(1 + D^2\chi_m)^{-1}x_m], \qquad x_m > 0, \tag{5.42}$$
and the $x_m$'s for different modes are statistically independent.
In order to calculate the false-alarm and detection probabilities we first find the moment-generating functions (m.g.f.'s) of the detection statistic
$$L = \sum_m D^2\chi_m(1 + D^2\chi_m)^{-1}x_m, \tag{5.43}$$
which differs only by a constant from the logarithmic likelihood ratio in eq. (5.41). Hypothesis $H_1$ is chosen when $L$ exceeds a decision level $L_0$. The m.g.f.'s are the Laplace transforms of the p.d.f.'s of $L$,
$$E[e^{-sL} \mid H_0] = f(D^2)/f(D^2(1 + s)), \tag{5.44}$$
$$E[e^{-sL} \mid H_1] = [f(D^2 s)]^{-1}, \tag{5.45}$$
where $f(z)$ is the Fredholm determinant defined in eq. (5.25). The p.d.f.'s of $L$ are found by taking the inverse Laplace transforms of eqs. (5.44) and (5.45) by means of the residue theorem, and when these are integrated over $L_0 < L < \infty$, the false-alarm and detection probabilities are found to be
$$Q_0 = f(D^2)\sum_{m=1}^{\infty}\left[f'(-\chi_m^{-1})(D^2 + \chi_m^{-1})\right]^{-1}\exp[-(1 + D^{-2}\chi_m^{-1})L_0],$$
$$Q_d = \sum_{m=1}^{\infty}\chi_m\left[f'(-\chi_m^{-1})\right]^{-1}\exp[-D^{-2}\chi_m^{-1}L_0], \tag{5.46}$$
where $f'(z) = df/dz$.
For object light with a Lorentz spectral density defined by eq. (5.26) the Fredholm determinant $f(z)$ was given in eq. (5.27); its zeros $-\chi_m^{-1}$ are obtained by solving a certain transcendental equation (HELSTROM [1968c] p. 140). In this way the detection probability $Q_d$ was calculated for a fixed false-alarm probability $Q_0$ and various values of the time-bandwidth product $\mu = WT$; the results are plotted in Fig. 5.8. The statistic $L$ does not yield a uniformly most powerful test, depending as it does on the signal-to-noise ratio $D^2$, and hence each point on a curve in Fig. 5.8 refers to a different detector, specifically the optimum one for that signal-to-noise ratio.
Fig. 5.8. Probability $Q_d$ of detecting an incoherent point source with a Lorentz spectral density in the classical limit $\mathcal N \gg 1$, versus the signal-to-noise ratio $D^2 = N_s/\mathcal N$; fixed false-alarm probability $Q_0$. The curves are indexed by the time-bandwidth product $WT$.
When $WT \gg 1$, the eigenvalues $\chi_m$ of the temporal coherence function $\chi(\tau)$ are approximately given by eq. (5.13), and the Fredholm determinant in eq. (5.25) can be written approximately as
$$f(z) \approx \exp\left\{\frac{T}{2\pi}\int_{-\infty}^{\infty}\ln[1 + zT^{-1}X(\omega)]\,d\omega\right\}. \tag{5.47}$$
f(z)
M
epmb, b
=
4 p 2 +2 p z, p
=
WT,
(5.48)
and by substituting into eqs. (5.44) and (5.45) and taking the inverse Laplace
350
tVIL §
QUANTUM DETECTION THEORY
5
transformation we find the approximate false-alarm and detection probabilities for the optimum classical receiver, Qo
= erfc [(J14/C)*(<-l)]-e~~ erfc [(M/(P(<+l)], (pZ+2D2p)+, p
>> 1,
(5.49)
Q~ x erfc [(p/(’)”((’- l)]-eZ” erfc [(p/(’)*(t’+ I)], 5’ = p t / M , p
>> 1,
(5.50)
M
=
where erfc(·) is the error-function integral defined in eq. (3.23). For each value of $D^2$, $\zeta$ is determined from eq. (5.49) to obtain the pre-assigned false-alarm probability $Q_0$, whereupon $Q_d$ is calculated from eq. (5.50). Good agreement was found between the residue series and these approximations for $\mu = WT \gtrsim 12$.

Fig. 5.9. Probability $Q_d$ (in per cent) of detecting an incoherent point source with a rectangular spectral density in the classical limit $\mathcal{N} \gg 1$, versus the signal-to-noise ratio $D^2 = N_s/\mathcal{N}$, at a fixed false-alarm probability $Q_0$. The curves are indexed by the time-bandwidth product $WT$.

In Fig. 5.9 the probability of detecting light with a rectangular spectral density, eq. (5.29), is presented for various values of the time-bandwidth product $WT$. The eigenvalues $x_m$ are given by eq. (5.31), and the Fredholm determinant is used in the form given in eq. (5.25). Only a finite number of terms, of the order of $WT$, need to be carried because the eigenvalues $x_m$ decrease rapidly once the index $m$ exceeds $WT$. For $WT \geq 7$ the approximation obtained from eq. (5.47) has been used; it leads to a gamma distribution for the test statistic under each hypothesis. The false-alarm and detection probabilities were then calculated from

$$Q_0 = T(x, WT), \qquad Q_d = T\!\left(\frac{x}{1 + D^2/WT},\; WT\right), \tag{5.51}$$

where

$$T(x, \nu) = [\Gamma(\nu)]^{-1}\int_x^{\infty} t^{\nu-1} e^{-t}\,dt$$
is related to the incomplete gamma function.

5.3.6. The threshold detector

The structure of the optimum detector of incoherent light in the presence of a thermal background field depends on the strength of the signal to be detected; it does not provide a uniformly most powerful test of the hypothesis "signal present" against the null hypothesis "signal absent". If the signal strength is unknown in advance, therefore, the optimum detector derived in the foregoing sections cannot be applied. As signal detectability is poorest for weak signals, it is natural to use instead the detector that is best in the limit of vanishing signal strength; this is the threshold detector described in § 1.4 and § 2.3. Since the density operators for the field in the receiver under the two hypotheses commute, it suffices here to use the classical form of the threshold detector. The logarithmic likelihood ratio in eq. (5.21) is easily expanded in powers of the average number $N_s$ of signal photons, and the term of first order in $N_s$ yields, after the part independent of the data is dropped and the remainder is divided by $N_s/[\mathcal{N}(\mathcal{N}+1)]$,

$$U = \sum_p \sum_m h_p x_m \nu_{pm}, \tag{5.52}$$

which is the threshold statistic. It is a simpler linear combination of the data $\nu_{pm}$ than the optimum statistic in eq. (5.21) and has the advantage of being independent of the strength $N_s$ of the light to be detected.

The light signal we are considering here comes from a point source and possesses first-order spatial coherence over the aperture. As in the previous parts of this section, we deal only with a single spatial mode and drop the subscript $p$, writing the threshold statistic in the form

$$U = \sum_m x_m \nu_m, \tag{5.53}$$
where $\nu_m$ is the number of photons counted in the $m$th temporal mode $\gamma_m(t)\exp(-i\Omega t)$. When the spectral density of the object light is rectangular with bandwidth $W$, there are, as we have said, approximately $WT$ equal eigenvalues $x_m = (WT)^{-1}$, and the rest are negligibly small, provided $WT \gg 1$. The threshold detector in this case performs the same operation on the data $\nu_m$ as the optimum detector, summing the numbers of photons in the $WT$ significant temporal modes and comparing the sum with an appropriate decision level. The detection probabilities calculated in parts (ii) and (iii) and plotted in Figs. 5.3, 5.4 and 5.5 apply to the threshold detector as well, and for light with a rectangular spectrum the threshold and optimum detectors exhibit nearly the same performance.

For signal spectra other than the rectangular, the threshold statistic $U$ becomes a continuous, rather than a discrete, random variable. Its p.d.f.'s under the two hypotheses must be obtained from its moment generating functions (m.g.f.'s),

$$f_1(s) = E(e^{-Us} \mid H_1) = \prod_m \left\{1 + (\mathcal{N} + x_m N_s)[1 - \exp(-x_m s)]\right\}^{-1}, \tag{5.54}$$

$$f_0(s) = E(e^{-Us} \mid H_0) = \prod_m \left\{1 + \mathcal{N}[1 - \exp(-x_m s)]\right\}^{-1}, \tag{5.55}$$
which follow from the Bose distributions in eq. (5.19). An inverse Laplace transform is required, and numerical methods must ordinarily be used.

For most observations of natural sources of incoherent light the time-bandwidth product $WT$ is very large. The eigenvalues are then given approximately by eq. (5.13). If we assume $\mathcal{N} \ll 1$, with the product $\mathcal{N}WT$ of the order of 1, the m.g.f.'s in eqs. (5.54) and (5.55) become approximately

$$f_1(s) \approx \exp\left\{-\frac{T}{2\pi}\int_{-\infty}^{\infty} \left[\mathcal{N} + N_s T^{-1}X(\omega)\right]\left\{1 - \exp[-X(\omega)s/T]\right\} d\omega\right\},$$

$$f_0(s) \approx \exp\left\{-\frac{\mathcal{N}T}{2\pi}\int_{-\infty}^{\infty} \left\{1 - \exp[-X(\omega)s/T]\right\} d\omega\right\}, \tag{5.56}$$

the sum over modes involved in $\ln f_1(s)$ and $\ln f_0(s)$ having been converted to an integration. For a uniform spectral density these lead to the Poisson distributions found in § 5.3.3.
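The reduction to Poisson statistics for a uniform spectrum can be checked numerically. The sketch below is only an illustration, not the method of the cited papers: it integrates eq. (5.56) by the midpoint rule for a rectangular spectral density, assumed here to be $X(\omega) = 1/W$ for $|\omega| \leq \pi W$ so that its integral over $d\omega/2\pi$ is 1, and compares the result with the logarithm of the m.g.f. of $U = n/WT$, where $n$ is Poisson-distributed with mean $\mathcal{N}WT + N_s$. The function names and the numerical values are hypothetical.

```python
import math

def log_mgf(s, spectrum, omega_max, T, n_thermal, n_signal, steps=2000):
    """ln f_1(s) of eq. (5.56), midpoint rule over [-omega_max, omega_max].

    spectrum(omega) is X(omega), assumed normalized so that its integral
    over d(omega)/(2*pi) equals 1. Setting n_signal = 0 gives ln f_0(s).
    """
    d_omega = 2.0 * omega_max / steps
    total = 0.0
    for i in range(steps):
        omega = -omega_max + (i + 0.5) * d_omega
        x = spectrum(omega)
        total += (n_thermal + n_signal * x / T) * (1.0 - math.exp(-x * s / T))
    return -(T / (2.0 * math.pi)) * total * d_omega

def rectangular(W):
    """Assumed rectangular spectrum: X = 1/W inside the band, 0 outside."""
    return lambda omega: 1.0 / W if math.pi * W >= abs(omega) else 0.0

def log_mgf_scaled_poisson(s, mean, scale):
    """ln E[exp(-s * scale * n)] for a Poisson variable n with the given mean."""
    return -mean * (1.0 - math.exp(-s * scale))

# Illustrative values only: WT = 100, N*WT = 1, N_s = 5.
W, T = 50.0, 2.0
n_thermal, n_signal = 0.01, 5.0
for s in (0.5, 3.0, 40.0):
    numeric = log_mgf(s, rectangular(W), math.pi * W, T, n_thermal, n_signal)
    poisson = log_mgf_scaled_poisson(s, n_thermal * W * T + n_signal, 1.0 / (W * T))
    print(s, numeric, poisson)
```

The two columns printed for each $s$ should coincide to rounding error, because for a constant spectrum the integrand of eq. (5.56) is itself constant over the band.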
When the object light has a Lorentz spectral density as given by eq. (5.26), the m.g.f. of the statistic $U$ under hypothesis $H_1$ becomes

$$f_1(s) = \exp\left\{-\mathcal{N}WT\,\sigma e^{-\sigma}[I_0(\sigma) + I_1(\sigma)] - N_s[1 - e^{-\sigma}I_0(\sigma)]\right\}, \qquad \sigma = s/WT, \tag{5.57}$$

with $f_0(s)$ obtained by setting $N_s = 0$; $I_0(\sigma)$ and $I_1(\sigma)$ are modified Bessel functions. It appears impossible to invert these m.g.f.'s analytically in order to obtain the p.d.f.'s of the threshold statistic $U$, from which the false-alarm and detection probabilities can be calculated; numerical methods described elsewhere are required (HELSTROM [1969c]). It was found that the detection probability is slightly smaller for a Lorentz spectrum than for a rectangular spectrum with the same value of $WT$.

5.4. DETECTION OF EXTENDED OBJECTS
When the object is not a point source, but extends over more than one resolution element of the instrument, its light arriving at the aperture does not possess full first-order spatial coherence, and more than one aperture mode $\psi_p(t)$ must be processed for detection. A resolution element in the object plane at distance $R$ has an area of the order of $(\lambda R)^2/A$, where $\lambda = 2\pi c/\Omega$ is the wavelength of the object light and $A$ is the area of the aperture of the observing instrument. It is now necessary to solve the integral equation (5.9), and this can be done analytically for only a few forms of the radiance distribution $B(\mathbf{u})$ of the object.

Some general statements can be made about the spectrum of eigenvalues $h_p$, however, if the aperture is rectangular ($a_x \times a_y$) and if $B(\mathbf{u})$ changes only slightly over a single resolution element of area $(\lambda R)^2/a_x a_y$. An approximation for the spatial eigenvalues $h_p$ can be found that is much like the one in eq. (5.13) for the temporal eigenvalues. It links them with samples of the Fourier transform of the kernel $\varphi(\mathbf{r})$ of eq. (5.9), that is, to samples of the radiance $B(\mathbf{u})$ at uniformly spaced points in the object plane,

$$h_{\mathbf{p}} \approx \delta_x \delta_y B(p_x\delta_x, p_y\delta_y)/B_T, \qquad \mathbf{p} = (p_x, p_y), \qquad \delta_x = \lambda R/a_x, \qquad \delta_y = \lambda R/a_y, \tag{5.58}$$

where $p_x$ and $p_y$ are integers, and $\delta_x$ and $\delta_y$ are the linear resolution elements in the x- and y-directions in the object plane. The spatial mode index $p$ has for convenience been made into a 2-vector $\mathbf{p}$. The approximate number of significant eigenvalues $h_{\mathbf{p}}$ is given by eq. (5.58) as

$$M = AA_0/(\lambda R)^2, \tag{5.59}$$
where $A = a_x a_y$ is the area of the aperture and $A_0$ is the area of the planar object at distance $R$. More generally we can derive eq. (5.59) by observing that by eq. (5.9) $M$ is given approximately by

$$M \approx AA_0/(\lambda R)^2, \tag{5.60}$$

if we define the effective area of the object as

$$A_0 = B_T^2\left[\int_O [B(\mathbf{u})]^2\,d^2\mathbf{u}\right]^{-1}, \qquad B_T = \int_O B(\mathbf{u})\,d^2\mathbf{u}. \tag{5.61}$$
For a uniformly radiating object $A_0$ is its geometrical area. Implicit in our derivation is that $M \gg 1$, so that the kernel $\varphi(\mathbf{r})$ covers a much smaller area than the aperture $A$. The number $M$ given by eq. (5.59) has been defined by GABOR [1961] as the number of spatial degrees of freedom in the object. If the object has a uniform radiance over the area $A_0$, the $M$ significant eigenvalues can be taken as equal, $h_p \approx M^{-1}$, when $M \gg 1$, and the rest can be set equal to 0.

When the object is a uniformly radiating circular disk of radius $b$ and area $A_0 = \pi b^2$, and the aperture is circular with radius $a$, the integral equation (5.9) reduces to one treated by SLEPIAN [1964]. The spatial mode functions $\eta_p(\mathbf{r})$ are proportional to the generalized prolate spheroidal wave functions, and the eigenvalues can be expressed as

$$h_{N,k} = (4/\alpha^2)\lambda_{N,k}, \qquad \alpha = kab/R, \tag{5.62}$$

where the $\lambda_{N,k}$ are the eigenvalues tabulated by SLEPIAN [1964]; our $\alpha$ corresponds to his parameter $c$. The eigenfunctions with $N = 0$ exhibit circular symmetry. The eigenvalues with $N > 0$ have multiplicity 2, corresponding to eigenfunctions proportional to $\cos N\phi$ and $\sin N\phi$, $\phi$ being the angular coordinate in the aperture plane. Hence in all our expressions the terms for modes with $N > 0$ must be taken twice. With this proviso, there are when $M \gg 1$ approximately $M = \tfrac{1}{4}\alpha^2$ significant eigenvalues, and the rest are nearly 0. When a specific set of eigenvalues is needed for calculating detection probabilities, those for the uniform circular object and the circular aperture will be used.

5.4.1. Extreme quantum limit

When there is no background light at all, the method of § 5.3.1 can be
applied immediately. For linearly polarized light the detection probability is given by eq. (5.23), where now instead of eq. (5.24) the probability $p_1(0)$ of receiving no photons at all is given by

$$p_1(0) = \prod_p \prod_m (1 + h_p x_m N_s)^{-1} = \prod_p [f(h_p N_s)]^{-1}, \tag{5.63}$$

with $f(z)$ the Fredholm determinant specified by the temporal spectral density $X(\omega)$ as in eq. (5.25). In particular, for a very small time-bandwidth product, $WT \ll 1$,

$$p_1(0) = \prod_p (1 + h_p N_s)^{-1}, \tag{5.64}$$

and for $WT \gg 1$,

$$p_1(0) = \prod_p \exp(-h_p N_s) = \exp(-N_s). \tag{5.65}$$

Thus when the observation time is very long, $WT \gg 1$, the detectability of the extended object depends only on the total average number $N_s$ of photons received from it, and not at all on its size and shape, provided that the optimum receiver is utilized.

In Fig. 5.10 we have plotted the probability $Q_d$ of detecting a circular object at a circular aperture when the light is linearly polarized and $WT \ll 1$; curves for various values of the parameter $\alpha = 2M^{1/2}$ are exhibited. In Fig. 5.11 we give the detection probabilities for unpolarized light, determined as in eq. (5.28) appropriately modified.

Fig. 5.10. Probability $Q_d$ of detecting a circular object at a circular aperture in the extreme quantum limit, versus the average number $N_s$ of photons received; linearly polarized light, $WT \ll 1$, fixed false-alarm probability $Q_0$. Curves are indexed by the parameter $\alpha = 2M^{1/2} = kab/R$.

Fig. 5.11. Probability $Q_d$ of detecting a circular object at a circular aperture in the extreme quantum limit, versus the average number $N_s$ of photons received; unpolarized light, $WT \ll 1$, fixed false-alarm probability $Q_0$. Curves are indexed by the parameter $\alpha = 2M^{1/2} = kab/R$.

5.4.2. Intermediate range
The optimum statistic for detecting the light from an arbitrary extended object is given in eq. (5.21) as a certain linear combination of the numbers $\nu_{pm}$ of photons observed in the spatio-temporal modes. For most sources of incoherent light $WT \gg 1$, and for ordinary temperatures and wavelengths the mean number $\mathcal{N}$ of thermal photons per mode, given by the Planck formula, eq. (3.49), is very small, $\mathcal{N} \ll 1$. From eq. (5.19) we see that since $x_m$ is of the order of $(WT)^{-1}$ and very small, the mean number of photons in each spatio-temporal mode is much less than 1 under both hypotheses, $i = 0, 1$, and the logarithm of the likelihood ratio, eq. (5.21), is approximately

$$U' = \sum_p \sum_m \left[\nu_{pm}\ln(1 + h_p x_m N_s/\mathcal{N}) - h_p x_m N_s\right]. \tag{5.66}$$

We shall assume here that the temporal spectral density $X(\omega)$ is rectangular, as in eq. (5.29), so that $x_m \approx (WT)^{-1}$ for all $\mu = WT$ significant temporal modes. The statistic $U'$ then depends only on

$$\nu_p = \sum_{m=1}^{\mu} \nu_{pm}, \tag{5.67}$$

which is the sum of the numbers of photons in all the $\mu$ significant temporal modes into which the aperture mode $\psi_p(t)$ is decomposed. The statistic $U'$ can then be written as

$$U' = \sum_p \left[\nu_p \ln(1 + h_p N_s/\mathcal{N}WT) - h_p N_s\right]. \tag{5.68}$$
This statistic depends on the expected number $N_s$ of photons from the object, and it cannot provide a uniformly most powerful test for the presence of the object light. A receiver that is less effective, but whose design does not require knowing the number $N_s$, is the threshold receiver introduced in § 1.4. Here its statistic is obtained by expanding $U'$ in powers of $N_s/\mathcal{N}WT$ and keeping only the term proportional to the first. Using the threshold receiver involves comparing the statistic

$$U'' = \sum_p h_p \nu_p \tag{5.69}$$

with a decision level $U_0$ and deciding that the light from the object is present whenever $U'' > U_0$. When $WT \gg 1$ the total number $\nu_p$ of photons in the aperture mode $\psi_p(t)$ has a Poisson distribution under each hypothesis,

$$P_i(\nu_p) = \bar{\nu}_{pi}^{\,\nu_p}\exp(-\bar{\nu}_{pi})/\nu_p!, \qquad i = 0, 1, \qquad \bar{\nu}_{p0} = \mathcal{N}WT, \qquad \bar{\nu}_{p1} = h_p N_s + \mathcal{N}WT. \tag{5.70}$$
These distributions have been used to calculate the probability of detecting a circular object at a circular aperture (HELSTROM [1970b]). In Fig. 5.12 this probability has been plotted versus the average number $N_s$ of received photons for $\mathcal{N}WT = 1$ and various values of the parameter $\alpha = 2M^{1/2}$. The larger the object, the more incoherent aperture modes among which the object light is divided, and the smaller the probability of detection.

Fig. 5.12. Probability $Q_d$ of detecting a uniform circular object of radius $b$ by observations at a circular aperture of radius $a$, versus the average number $N_s$ of photons received from the object; $\mathcal{N}WT = 1$, fixed false-alarm probability $Q_0$. The curves are indexed by the parameter $\alpha = 2M^{1/2} = kab/R$. (From HELSTROM [1970b].)

The optimum detector represented by the statistic $U'$ in eq. (5.68) was also evaluated and found to attain detection probabilities very close to those of the threshold detector. The two detectors are the same for a point source, when only a single aperture mode contributes. When $M \gg 1$, on the other hand, the significant eigenvalues $h_p$ are nearly equal to $M^{-1}$, and both $U'$ and $U''$ weight the associated numbers $\nu_p$ in the same way. Thus the optimum and threshold detectors differ only in their treatment of the modes whose eigenvalues $h_p$ lie between $M^{-1}$ and zero, and these influence the statistics $U'$ and $U''$ only weakly. When $M \gg 1$ the optimum and the threshold statistics are nearly proportional to Poisson-distributed random variables with mean values $\mathcal{N}MWT$ and $\mathcal{N}MWT + N_s$ under the two hypotheses. The probability of detection can then be read from the curves in Figs. 5.4 and 5.5 if $\mathcal{N}WT$ is replaced by $\mathcal{N}MWT$.
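The Poisson limit just described is easy to evaluate directly. The following is a minimal sketch under stated assumptions, not the computation behind Fig. 5.12: it treats the statistic as a single Poisson count with mean $\mathcal{N}MWT$ under $H_0$ and $\mathcal{N}MWT + N_s$ under $H_1$, and it uses a non-randomized integer decision level, so the actual false-alarm probability is at most the requested $Q_0$ rather than exactly equal to it. All function names are hypothetical.

```python
import math

def poisson_tail(mean, k_min):
    """P(n >= k_min) for a Poisson variable n with the given mean."""
    term = math.exp(-mean)  # P(n = 0)
    cdf = 0.0
    for k in range(k_min):
        cdf += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - cdf)

def decision_level(background_mean, q0):
    """Smallest integer k whose false-alarm probability P(n >= k | H0) is at most q0."""
    k = 0
    while poisson_tail(background_mean, k) > q0:
        k += 1
    return k

def detection_probability(n_signal, n_thermal_wt, modes, q0):
    """Q_d for Poisson means N*M*WT (H0) and N*M*WT + N_s (H1)."""
    background = n_thermal_wt * modes
    k = decision_level(background, q0)
    return poisson_tail(background + n_signal, k)

# Illustrative values: N*WT = 1, N_s = 20, requested Q0 = 0.01.
for modes in (1, 4, 16):
    print(modes, detection_probability(20.0, 1.0, modes, 0.01))
```

With these illustrative numbers the detection probability falls as the number $M$ of aperture modes grows, in line with the remark that a larger object divides its light among more incoherent modes.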
§ 6. Estimation Theory
6.1. CLASSICAL PARAMETER ESTIMATION
The field at the aperture of an optical instrument may consist of two parts, the signal component $\mathcal{F}$ due to light from a source or object plane, and the noise component $\mathcal{F}_0$ due to the thermal background. The field $\mathcal{F}$ may depend on certain parameters of the source, such as its radiant power or its wavelength or, in the case of a laser radar echo, its time of arrival; we denote them by $(\theta_1, \theta_2, \ldots, \theta_m) = \boldsymbol{\theta}$, and we indicate that the field depends on them by writing it $\mathcal{F}(\boldsymbol{\theta})$. The observer wishes to estimate the values of these parameters. His estimates $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m$ will be based on the actual total field $\mathcal{F}(\boldsymbol{\theta}) + \mathcal{F}_0$ at the aperture $A$ during an observation interval $(0, T)$. How best to process that field in order to estimate the parameters $\boldsymbol{\theta}$ is a problem in statistical estimation theory. As applied to the parameters of signals received in the presence of noise, estimation theory has been discussed in texts such as those by MIDDLETON [1960, ch. 21], VAN TREES [1968, § 2.4], and HELSTROM [1968c, ch. 8].

When classical electromagnetic theory is valid, we can sample the aperture field at various space-time points $(\mathbf{r}, t)$, $\mathbf{r} \in A$, $t \in (0, T)$, denoting the samples we obtain by $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, as in § 1. Later these $n$ sampling points are taken closer and closer together, and we pass to the limit $n \to \infty$ of an infinite number of samples. The joint probability density function (p.d.f.) of the data $\mathbf{x}$, given that the field is $\mathcal{F}(\boldsymbol{\theta}) + \mathcal{F}_0$, is designated by $p(\mathbf{x}|\boldsymbol{\theta})$; it embodies the statistical properties of the object light (the signal) and the background (the noise).

The Bayesian attitude toward estimation, as toward detection, asserts that the best scheme is the cheapest, on the average. We define a cost $C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})$ of issuing estimates $\hat{\boldsymbol{\theta}} = (\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ of the parameters $\theta_j$ when the true values are $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_m)$. For a single parameter, the squared error

$$C(\hat\theta, \theta) = (\hat\theta - \theta)^2 \tag{6.1}$$

is a common and mathematically tractable cost function. In addition, we
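must specify a prior p.d.f. $z(\boldsymbol{\theta})$ of the parameters, representing the relative frequencies with which their values lie in various regions of the parameter space $\Theta$. An estimation strategy is a set of $m$ functions of the data, $\hat\theta_i = \hat\theta_i(\mathbf{x})$, $i = 1, 2, \ldots, m$, which we collect into a vector estimator $\hat{\boldsymbol{\theta}}(\mathbf{x})$. Estimation is a continuous version of a multiple-hypothesis test, in which the hypotheses state that the parameters $\boldsymbol{\theta}$ of $p(\mathbf{x}|\boldsymbol{\theta})$ lie within one of numerous infinitesimal regions $d^m\boldsymbol{\theta}$ of the space $\Theta$. By analogy with eq. (1.24), the average cost of a particular set of strategies $\hat{\boldsymbol{\theta}}(\mathbf{x})$ is

$$\bar{C} = \int\!\!\int z(\boldsymbol{\theta})\,C(\hat{\boldsymbol{\theta}}(\mathbf{x}), \boldsymbol{\theta})\,p(\mathbf{x}|\boldsymbol{\theta})\,d^m\boldsymbol{\theta}\,d^n\mathbf{x}. \tag{6.2}$$

The best estimator $\hat{\boldsymbol{\theta}}(\mathbf{x})$ is the one for which $\bar{C}$ is minimum. As with multiple hypothesis testing, the best strategy picks that set $\hat{\boldsymbol{\theta}}$ for which the posterior risk

$$r(\hat{\boldsymbol{\theta}}) = \int C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta})\,p(\boldsymbol{\theta}|\mathbf{x})\,d^m\boldsymbol{\theta} \tag{6.3}$$

is minimum (cf. eq. (1.25)), where

$$p(\boldsymbol{\theta}|\mathbf{x}) = z(\boldsymbol{\theta})\,p(\mathbf{x}|\boldsymbol{\theta})/p(\mathbf{x}), \qquad p(\mathbf{x}) = \int z(\boldsymbol{\theta})\,p(\mathbf{x}|\boldsymbol{\theta})\,d^m\boldsymbol{\theta}; \tag{6.4}$$

$p(\boldsymbol{\theta}|\mathbf{x})$ is the posterior p.d.f. of the parameter values $\boldsymbol{\theta}$, given the observed data $\mathbf{x}$, and $p(\mathbf{x})$ is the total p.d.f. of the data $\mathbf{x}$.

With a quadratic cost function as in eq. (6.1), the Bayes estimate of a single parameter $\theta$ is the conditional expected value

$$\hat\theta = \int \theta\,p(\theta|\mathbf{x})\,d\theta, \tag{6.5}$$

as can be shown by substitution into eq. (6.3) and minimization with respect to $\hat\theta$. On the other hand, a cost function of the form

$$C(\hat\theta, \theta) = A - B\,\delta(\hat\theta - \theta), \tag{6.6}$$

which is the continuous counterpart to the one in eq. (1.29), leads to the maximum-likelihood estimate, which selects those values of the parameters for which the posterior p.d.f. $p(\boldsymbol{\theta}|\mathbf{x})$ is maximum. When the prior p.d.f. $z(\boldsymbol{\theta})$ of the parameters is very broad, as when nothing is known about their values in advance, all information about them being necessarily derived from the measurements of $\mathbf{x}$, the maximum-likelihood strategy is equivalent to choosing the values of $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_m)$ for which the conditional p.d.f. $p(\mathbf{x}|\boldsymbol{\theta})$ is maximum.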
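The two strategies can be compared on a discretized parameter axis. The sketch below is illustrative only: it assumes a single parameter on a uniform grid, a Gaussian prior, and one datum $x$ equal to the parameter plus Gaussian noise, none of which is prescribed by the text. It forms the posterior of eq. (6.4), then evaluates the conditional mean of eq. (6.5) and the posterior maximum favoured by the cost of eq. (6.6). The function names are hypothetical.

```python
import math

def posterior_on_grid(thetas, prior, likelihood, x):
    """Posterior weights p(theta|x) on a uniform grid, following eq. (6.4)."""
    weights = [prior(t) * likelihood(x, t) for t in thetas]
    total = sum(weights)
    return [w / total for w in weights]

def conditional_mean(thetas, post):
    """Bayes estimate for the squared-error cost, eq. (6.5)."""
    return sum(t * p for t, p in zip(thetas, post))

def posterior_mode(thetas, post):
    """Grid point at which the posterior p.d.f. is largest (cost of eq. (6.6))."""
    return max(zip(post, thetas))[1]

def gaussian(mean, var):
    """Gaussian p.d.f. with the given mean and variance."""
    return lambda t: math.exp(-(t - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Assumed example: prior mean 1, prior variance 4, noise variance 0.5, datum x = 3.
thetas = [-10.0 + 0.001 * i for i in range(22001)]
prior = gaussian(1.0, 4.0)
likelihood = lambda x, theta: gaussian(theta, 0.5)(x)
post = posterior_on_grid(thetas, prior, likelihood, 3.0)
print(conditional_mean(thetas, post), posterior_mode(thetas, post))
```

For this Gaussian example the posterior is symmetric, so the two estimates agree; both should lie close to $(4 \cdot 3 + 0.5 \cdot 1)/4.5$, the standard Gaussian posterior mean. For a skewed posterior the two strategies would generally differ.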
When the parameters refer to the field created by some source, we can generally specify the p.d.f. $p_0(\mathbf{x})$ that would describe the data $\mathbf{x}$ were the source inoperative. In optics this p.d.f. embodies the statistical properties of the background light. As it does not involve the parameters of the source field, we can just as well express these estimation strategies in terms of the likelihood ratio $\Lambda(\mathbf{x}; \boldsymbol{\theta}) = p(\mathbf{x}|\boldsymbol{\theta})/p_0(\mathbf{x})$. Using such a likelihood ratio facilitates passage to the limit of an infinite number of data, in which all the information in the aperture field is exhausted.

6.2. QUANTUM ESTIMATION
When the field $\mathcal{F}(\boldsymbol{\theta}) + \mathcal{F}_0$ of the incident light must be treated quantum-mechanically, it is again convenient to imagine it as admitted to the lossless cavity of an ideal receiver by opening the aperture during the observation interval $(0, T)$. The density operator $\rho$ of the field inside the receiver at a later time $t > T$ will depend on the parameters $\boldsymbol{\theta}$ of the light from the source, $\rho = \rho(\boldsymbol{\theta})$. The parameters $\boldsymbol{\theta}$ must be estimated by measurements on the field that are consistent with the laws of quantum mechanics. As with multiple-hypothesis testing, if the density operators $\rho(\boldsymbol{\theta})$ commute for all pairs of parameter sets $\boldsymbol{\theta}$, they all have a common set of eigenstates; and by determining first in which of these states the cavity field actually is, the parameters can be estimated by the classical methods described in § 6.1. For noncommuting density operators $\rho(\boldsymbol{\theta})$, on the other hand, optimum estimation poses difficult problems.

6.2.1. Minimum-mean-square-error estimation

Let us consider first the estimation of a single parameter $\theta$. The Bayes cost is, as in eq. (6.2),

$$\bar{C} = \int \mathrm{Tr}\,[C(\hat\theta, \theta)\,\rho(\theta)]\,z(\theta)\,d\theta, \tag{6.7}$$
where as before $z(\theta)$ is the prior p.d.f. of the unknown parameter and $C(\hat\theta, \theta)$ is a cost function measuring the seriousness of an error, that is, of issuing an estimate $\theta'$ when the true value of the parameter is $\theta$. The estimator $\hat\theta$ is an Hermitian operator whose measurement on the system yields the numerical estimate $\theta'$, leaving the system in an eigenstate $|\theta'\rangle$ of $\hat\theta$, $\hat\theta|\theta'\rangle = \theta'|\theta'\rangle$. When the cost of an error is measured by its square, as in eq. (6.1), the average cost is

$$\bar{C} = \int \mathrm{Tr}\,[\rho(\theta)(\hat\theta - \theta)^2]\,z(\theta)\,d\theta = \mathrm{Tr}\,(\hat\theta^2\Gamma_0 - 2\Gamma_1\hat\theta + \Gamma_2), \tag{6.8}$$
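where $\Gamma_0$, $\Gamma_1$ and $\Gamma_2$ are operators defined by

$$\Gamma_k = \int \theta^k z(\theta)\,\rho(\theta)\,d\theta. \tag{6.9}$$

PERSONICK [1971] has shown that the optimum estimator $\hat\theta$ is the solution of the operator equation

$$\hat\theta\,\Gamma_0 + \Gamma_0\,\hat\theta = 2\Gamma_1 \tag{6.10}$$

and that the minimum mean-square error is

$$\bar{C}_{\min} = \mathrm{Tr}\,(\Gamma_2 - \Gamma_1\hat\theta). \tag{6.11}$$

The operator $\hat\theta$ can be written formally as

$$\hat\theta = 2\int_0^{\infty} \exp(-\Gamma_0\alpha)\,\Gamma_1 \exp(-\Gamma_0\alpha)\,d\alpha, \tag{6.12}$$

which can be shown to satisfy eq. (6.10),

$$\Gamma_0\hat\theta + \hat\theta\,\Gamma_0 = -2\int_0^{\infty} \frac{d}{d\alpha}\left[\exp(-\Gamma_0\alpha)\,\Gamma_1\exp(-\Gamma_0\alpha)\right]d\alpha = -2\exp(-\Gamma_0\alpha)\,\Gamma_1\exp(-\Gamma_0\alpha)\Big|_0^{\infty} = 2\Gamma_1.$$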
When the density operators $\rho(\theta)$ commute for all pairs of values of $\theta$, it is only necessary to determine in which of their common set of eigenstates the system actually is, and measuring the operator $\hat\theta$ is equivalent to finding the conditional expected value given by eq. (6.5). A similar approach to optimum filtering of quantum-mechanical variables has been taken by GRISHANIN and STRATONOVICH [1970].

As an example, consider estimating the amplitude $r$ of a coherent signal of known form, received in the presence of a thermal background. We showed in § 3.3.1 that by making a certain linear combination of the normal modes of the receiver we can form a "matched mode" that alone contains the signal. The density operator for this matched mode can be written as in eq. (3.59), and if we take as our unknown parameter $r$ as defined by eq. (3.56) in the limit $M \to \infty$, we can write the density operator as

$$\rho(r) = (\pi\mathcal{N})^{-1}\int \exp(-|\alpha - r|^2/\mathcal{N})\,|\alpha\rangle\langle\alpha|\,d^2\alpha. \tag{6.13}$$

We allow $r$ to be either positive or negative; it is proportional to the classical signal amplitude, with $N_s = |r|^2$ the mean number of photons supplied by the source. Here $\mathcal{N}$ is the mean number of background photons per mode, as given by the Planck formula, eq. (3.51); a matched receiver (2, = Z,) is assumed.
Suppose that the amplitude $r$ has a Gaussian prior p.d.f.,

$$z(r) = (2\pi\sigma^2)^{-1/2}\exp[-(r - \bar{r})^2/2\sigma^2], \tag{6.14}$$

where $\bar{r}$ is the a priori expected value of $r$ and $\sigma^2$ is a variance measuring our uncertainty about the value of $r$ before any observations. The estimation operator $\hat{r}$ that minimizes the mean-square error is then

$$\hat{r} = \left[\sigma^2\hat{x} + (\tfrac{1}{2}\mathcal{N} + \tfrac{1}{4})\,\bar{r}\right]/(\sigma^2 + \tfrac{1}{2}\mathcal{N} + \tfrac{1}{4}), \tag{6.15}$$

where $\hat{x} = \tfrac{1}{2}(c_1 + c_1^{\dagger})$ is the operator related to the "coordinate" of the harmonic oscillator representing the matched mode, as defined in eq. (3.57) in terms of the annihilation and creation operators for that mode (PERSONICK [1971]). The minimum mean-square error when $\hat{r}$ is measured is
$$\bar{C} = \sigma^2(\tfrac{1}{2}\mathcal{N} + \tfrac{1}{4})/(\sigma^2 + \tfrac{1}{2}\mathcal{N} + \tfrac{1}{4}). \tag{6.16}$$

If the amplitude $r$ is most uncertain a priori, $\sigma^2 \to \infty$, $\hat{r} = \hat{x}$ and $\bar{C} = \tfrac{1}{2}\mathcal{N} + \tfrac{1}{4}$. The relative mean-square error is

$$E(\hat{r} - r)^2/r^2 = (2\mathcal{N} + 1)/4N_s = D_q^{-2}, \tag{6.17}$$

where $D_q$ is the signal-to-noise ratio specifying, through eq. (3.65), the probability of detecting the coherent signal in thermal noise when the threshold operator $\hat{x}$ is measured on the field in the ideal receiver. In the classical limit this estimator $\hat{r}$ of the signal amplitude becomes equivalent to the one derived by ordinary estimation theory, and its relative mean-square error approaches the classical value $\mathcal{N}/2N_s = K\mathcal{T}/2E_s$, where $E_s$ is the energy in the signal field and $K\mathcal{T}$ is the mean thermal energy per mode (HELSTROM [1968c, pp. 254-256; 1968d]). Although classically the optimum statistic for estimating the amplitude of a coherent signal in the presence of Gaussian noise is the same as the optimum statistic for detecting the signal, this is not the case quantum-mechanically; the optimum detection operator $\Pi$, as we learned in § 3.3, is not the same as the operator $\hat{x}$.

When two or more parameters are to be estimated with minimum mean-square error, the cost function takes the form

$$C(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = \sum_{i=1}^{m} a_i(\hat\theta_i - \theta_i)^2, \tag{6.18}$$
where the $a_i$ are appropriate weights. Although one might proceed formally as for a single parameter, the resulting estimation operators $\hat\theta_i$ may not commute, and hence may not be simultaneously measurable on the same system. A case in point is the estimation of amplitude and phase of a coherent signal in a thermal background, or equivalently, the estimation of $\gamma_x$ and $\gamma_y$ in the density operator

$$\rho(\gamma) = (\pi\mathcal{N})^{-1}\int \exp(-|\alpha - \gamma|^2/\mathcal{N})\,|\alpha\rangle\langle\alpha|\,d^2\alpha, \qquad \gamma = \gamma_x + i\gamma_y. \tag{6.19}$$

The estimators that would result involve $\tfrac{1}{2}(c_1 + c_1^{\dagger})$ for $\gamma_x$ and $\tfrac{1}{2}i(c_1^{\dagger} - c_1)$ for $\gamma_y$, and these do not commute. How to find commuting operators $\{\hat\theta_i\}$ that minimize the total mean squared error is unknown.
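The single-parameter result of eqs. (6.14) to (6.16) can be illustrated with a short simulation. The sketch below rests on one assumption that should be stated plainly: the outcomes of measuring $\hat{x}$ on the state of eq. (6.13) are modelled as Gaussian random numbers with mean $r$ and variance $\tfrac{1}{2}\mathcal{N} + \tfrac{1}{4}$, the quadrature variance of a displaced thermal state. Given that model, the estimate of eq. (6.15) is applied to each simulated outcome and the empirical mean-square error is compared with eq. (6.16). Function names and numerical values are hypothetical.

```python
import math
import random

def noise_variance(n_thermal):
    """Assumed variance of the outcomes of measuring x = (c1 + c1^+)/2: N/2 + 1/4."""
    return 0.5 * n_thermal + 0.25

def amplitude_estimate(x, prior_mean, prior_var, n_thermal):
    """Numerical estimate given by eq. (6.15) when the measurement of x yields the value x."""
    v = noise_variance(n_thermal)
    return (prior_var * x + v * prior_mean) / (prior_var + v)

def minimum_mse(prior_var, n_thermal):
    """Minimum mean-square error of eq. (6.16)."""
    v = noise_variance(n_thermal)
    return prior_var * v / (prior_var + v)

def simulated_mse(prior_mean, prior_var, n_thermal, trials, seed=0):
    """Monte Carlo mean-square error of the estimate over prior and noise."""
    rng = random.Random(seed)
    v = noise_variance(n_thermal)
    total = 0.0
    for _ in range(trials):
        r = rng.gauss(prior_mean, math.sqrt(prior_var))  # amplitude drawn from eq. (6.14)
        x = rng.gauss(r, math.sqrt(v))                   # simulated measurement outcome
        total += (amplitude_estimate(x, prior_mean, prior_var, n_thermal) - r) ** 2
    return total / trials

# Illustrative values: prior mean 2, prior variance 1.5, N = 0.3.
print(simulated_mse(2.0, 1.5, 0.3, 100000), minimum_mse(1.5, 0.3))
```

Letting the prior variance grow without bound in `minimum_mse` reproduces the limit $\bar{C} = \tfrac{1}{2}\mathcal{N} + \tfrac{1}{4}$ quoted after eq. (6.16).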
6.2.2. Arbitrary cost functions

When cost functions other than the squared error, eq. (6.1), govern the estimation, the optimum estimation operators are unknown. For a single parameter, it is necessary to discover an operator $\hat\theta$ that minimizes $\bar{C}$ in eq. (6.7). We have noted the similarity between estimation and multiple hypothesis testing, and in the quantum domain the optimum strategies for both remain undiscovered. There appears to be no quantum counterpart for the posterior p.d.f. $p(\theta|\mathbf{x})$ in the limit where the data $\mathbf{x}$ encompass all the information available in the field of the receiver.
6.3. THE CRAMÉR-RAO INEQUALITY

Although the optimum Bayes estimator of a parameter $\theta$ of a density operator $\rho(\theta)$ is not in general known, it is possible to set a lower bound to the mean-square error attainable by any estimator having a given bias. In some cases the estimator attaining this lower bound can be determined. The way is shown by the Cramér-Rao inequality of classical statistics.

Suppose that a parameter $\theta$ of the joint p.d.f. $p(\mathbf{x}|\theta)$ of a set $\mathbf{x}$ of data is to be estimated; the data might be samples of the field at the aperture of an optical instrument during an observation interval $(0, T)$, and $\theta$ might be the radiant power of a star in the field of view. An estimator of $\theta$ is a function $\hat\theta(\mathbf{x})$ of the data, and its bias is

$$b(\theta) = E[\hat\theta(\mathbf{x})|\theta] - \theta. \tag{6.20}$$

The mean square error is

$$\mathcal{E}(\theta) = E\{[\hat\theta(\mathbf{x}) - \theta]^2|\theta\}, \tag{6.21}$$

and it has been shown that this mean square error can be no less than the Cramér-Rao bound given by the inequality

$$\mathcal{E}(\theta) \geq [1 + b'(\theta)]^2 \Big/ E\left\{\left[\frac{\partial}{\partial\theta}\ln p(\mathbf{x}|\theta)\right]^2 \Big|\, \theta\right\}, \tag{6.22}$$
where $b'(\theta) = db(\theta)/d\theta$, all derivatives being evaluated at the true value of the parameter (CRAMÉR [1946] p. 473; RAO [1945]). Furthermore, this lower bound is attainable if the derivative appearing in eq. (6.22) has the form

$$\frac{\partial}{\partial\theta}\ln p(\mathbf{x}|\theta) = k(\theta)[\hat\theta(\mathbf{x}) - \theta], \tag{6.23}$$

where $k(\theta)$ is independent of the data $\mathbf{x}$. The estimate $\hat\theta(\mathbf{x})$ in eq. (6.23) is unbiased and attains a minimum mean-square error given by

$$\mathcal{E}(\theta) = [k(\theta)]^{-1}; \tag{6.24}$$
it is termed an efficient estimator. Efficient estimators exist only in exceptional cases. The Cramér-Rao inequality in this form has been applied to the estimation of the position of a stellar image on a photosensitive surface by HELSTROM [1964] and FARRELL [1966]; applications to the estimation of other parameters of an optical source are given by HELSTROM [1969b, 1970c].

A quantum-mechanical estimator of the parameter $\theta$ will be an Hermitian operator $\hat\theta$ that when measured on the system yields an outcome that is taken as the estimate of $\theta$. Its expected value is

$$E[\hat\theta|\theta] = \mathrm{Tr}\,[\rho(\theta)\hat\theta] \tag{6.25}$$

and its bias is, as in eq. (6.20),

$$b(\theta) = E[\hat\theta|\theta] - \theta = \mathrm{Tr}\,[\rho(\theta)(\hat\theta - \theta)]. \tag{6.26}$$

The mean-square error is defined by

$$\mathcal{E}(\theta) = E[(\hat\theta - \theta)^2|\theta] = \mathrm{Tr}\,[\rho(\theta)(\hat\theta - \theta)^2], \tag{6.27}$$

and it can be shown to be bounded below by the quantum-mechanical counterpart of eq. (6.22),

$$\mathcal{E}(\theta) \geq [1 + b'(\theta)]^2/\mathrm{Tr}\,(\rho L^2), \tag{6.28}$$

where $L$ is the symmetrized logarithmic derivative (s.l.d.) of $\rho(\theta)$ with respect to $\theta$, defined as the solution of the operator equation

$$\partial\rho/\partial\theta = \tfrac{1}{2}(\rho L + L\rho) \tag{6.29}$$

(HELSTROM [1967c]). The estimator $\hat\theta$ is said to be efficient if the s.l.d. has the form corresponding to eq. (6.23),

$$L = k(\theta)(\hat\theta - \theta), \tag{6.30}$$
$k(\theta)$ being a c-number function, whereupon the estimator $\hat\theta$ is also unbiased and attains the minimum mean-square error given by eq. (6.28) with $b'(\theta) = 0$. As in eq. (6.24), the minimum mean-square error is $[k(\theta)]^{-1}$.

Multi-dimensional versions of both the classical and the quantum-mechanical Cramér-Rao inequalities exist (HELSTROM [1968a]) and permit one to bound the mean-square errors of estimates of more than a single unknown parameter. An extension to parameter estimation when a prior p.d.f. of the unknown parameters is known has been given by PERSONICK [1971], who has applied the bound to estimation in quantum communications.

The estimator $\hat{x} = \tfrac{1}{2}(c_1 + c_1^{\dagger})$ of the amplitude $r$ of a coherent signal in thermal noise, which was derived in § 6.2.1, is an efficient estimator. The s.l.d. $L$ has the form

$$L = 4(\hat{x} - r)/(2\mathcal{N} + 1), \tag{6.31}$$

and $[k(r)]^{-1} = \tfrac{1}{4}(2\mathcal{N} + 1)$ is the mean-square error it attains, as in eq. (6.16) for $\sigma^2 \to \infty$ (HELSTROM [1968a]). A smaller mean-square error is possible, but only for a biased estimator.

If both components of the complex amplitude $\gamma = \gamma_x + i\gamma_y$ of a coherent signal in thermal noise are to be estimated (the density operator is now that given in eq. (6.19)), the multivariate Cramér-Rao inequality in its quantum-mechanical form sets the lower bounds

$$E(\hat\gamma_x - \gamma_x)^2 \geq \tfrac{1}{4}(2\mathcal{N} + 1), \qquad E(\hat\gamma_y - \gamma_y)^2 \geq \tfrac{1}{4}(2\mathcal{N} + 1). \tag{6.32}$$

The operators

$$\hat\gamma_x = \tfrac{1}{2}(c_1 + c_1^{\dagger}), \qquad \hat\gamma_y = \tfrac{1}{2}i(c_1^{\dagger} - c_1)$$
are efficient estimators of $\gamma_x$ and $\gamma_y$, respectively, but they do not commute and hence cannot be measured in the same receiver. If the field in the matched mode is amplified, as described in § 3.5, the complex amplitude of the matched mode might be estimated and the estimate divided by the square root $G^{1/2}$ of the gain to yield estimates of $\gamma_x$ and $\gamma_y$. The resulting minimum mean-square errors are obtained by substituting from eq. (3.90) $G(\mathcal{N} + 1) - 1$ for $\mathcal{N}$. In the limit $G \to \infty$, both $G^{1/2}\gamma_x$ and $G^{1/2}\gamma_y$ can be measured simultaneously by classical-physical efficient estimators, and the minimum mean-square errors so attained are, after division by $G$,

$$E(\hat\gamma_x - \gamma_x)^2 = \tfrac{1}{2}(\mathcal{N} + 1), \qquad E(\hat\gamma_y - \gamma_y)^2 = \tfrac{1}{2}(\mathcal{N} + 1), \tag{6.33}$$

which are larger than those in eq. (6.32). These correspond to an uncertainty product

$$[E(\hat{p} - p)^2\,E(\hat{q} - q)^2]^{1/2} = \hbar(\mathcal{N} + 1)$$
in the coordinate q = (2h/SZ)3(c, +c:) and momentum p = (2hSZ)*i(c: cl) of the harmonic oscillator representing the matched mode. When N = 0, this is the same uncertainty product as found by ARTHURS and KELLY[I9651 in a study of the simultaneous measurement of noncommuting observables; see also SHEand HEFFNER [1966]. The quantum-mechanical CramCr-Rao inequality has been applied to the estimation of parameters of incoherently radiating objects (HELSTROM [1970a]). It was assumed that the product WTof the observation time Tand the bandwidth W of the object light is very large, yet the mean number N of noise photons per mode is very small, so that M W Tis of the order of 1. For a point source, the mean-square relative error of an unbiased estimate of the radiant power B, of a point source is bounded below by
where with X ( w )the spectral density of the object light as defined in eq. (5.3),
fl(9) =
""f" [X(~)]~[1+9Wx(w)]-'do, 2n -02
9 = N,/J'"WT. In the extreme quantum limit, 9+ 00, J;(9) -+ 1, and the minimum meansquare relative error is bounded below by N,-'. For M W T > N , , fi(g) % 9, and the lower bound is ( N f I MWT)-'. For an unbiased estimate of the frequency SZ of incoherent light having bandwidth Wand coming from a point source, the following lower bounds were derived from the CramCr-Rao inequality, E(O-52)2/W2 2 2/N,, E(fi-SZ)2/W2 2 2MWT/N:,
<< N , MWT >> N , . N W T
For a coordinate $u_x$ of the position of the point source in a plane at distance $R$, the relative mean-square error is subject to the following bounds,
$$E(\hat{u}_x-u_x)^2/\delta^2 \ge 1/N_s, \qquad \mathcal{N}WT \ll N_s,$$
$$E(\hat{u}_x-u_x)^2/\delta^2 \ge 2\mathcal{N}WT/N_s^2, \qquad \mathcal{N}WT \gg N_s.$$
Here $\delta = \lambda R/2\pi a$, where $\lambda = 2\pi c/\Omega$ is the wavelength of the light and $a$ is the radius of the aperture, taken as circular; thus $\delta$ represents the size of a resolution element in the object plane. Bounds on errors in estimates for extended sources were also derived in the reference cited.
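As a rough numerical illustration of the resolution element $\delta = \lambda R/2\pi a$ and of the position bound in the regime where noise photons are negligible, a minimal Python sketch follows. The function names and the sample values are hypothetical and are not taken from the reference cited; only the bound $E(\hat{u}_x-u_x)^2/\delta^2 \ge 1/N_s$ is used.

```python
import math

def resolution_element(wavelength, distance, aperture_radius):
    """delta = lambda * R / (2 * pi * a); all lengths in the same unit."""
    return wavelength * distance / (2.0 * math.pi * aperture_radius)

def rms_position_bound(n_signal, wavelength, distance, aperture_radius):
    """Lower bound on the r.m.s. error of the coordinate u_x when the
    noise-photon term is negligible: sqrt(E(u_hat - u)^2) >= delta / sqrt(N_s)."""
    delta = resolution_element(wavelength, distance, aperture_radius)
    return delta / math.sqrt(n_signal)

if __name__ == "__main__":
    # Illustrative values only: 500 nm light, source plane at 1 km,
    # aperture radius 10 cm, 100 signal photons.
    delta = resolution_element(500e-9, 1.0e3, 0.1)
    bound = rms_position_bound(100, 500e-9, 1.0e3, 0.1)
    print("resolution element (m):", delta)
    print("r.m.s. position bound (m):", bound)
```

The bound scales as $N_s^{-1/2}$, so collecting four times as many signal photons halves the smallest attainable r.m.s. position error in this regime.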
Acknowledgment
The writer's research described in this article was partly carried out under Grant NGL 05-009-079 from the National Aeronautics and Space Administration. I wish to thank Mrs. Lily Wang and Mr. Yie-Ming Hong for their assistance with the numerical computations.
References
ARTHURS, E. and J. L. KELLY Jr., 1965, Bell System Tech. J. 44, 725.
BAKUT, P. A., 1966, Radio Eng. Electron. (USSR) 11, 551.
BAKUT, P. A., 1967, Radio Eng. Electron. (USSR) 12, 1.
BAKUT, P. A. and S. S. SHCHUROV, 1968, Probl. Peredachi Informatsii 4(1), 77.
BAKUT, P. A., V. G. VYGON, A. A. KURIKSHA, V. G. REPIN and G. P. TARTAKOVSKII, 1966, Probl. Peredachi Informatsii 2(4), 39.
BAR-DAVID, I., 1969, Trans. IEEE IT-15, 31.
CRAMÉR, H., 1946, Mathematical Methods of Statistics (Princeton Univ. Press, Princeton).
FARRELL, E. J., 1966, J. Opt. Soc. Am. 56, 578.
GABOR, D., 1961, Progress in Optics 1, 138.
GAGLIARDI, R., 1972, Trans. IEEE IT-18(1), 208.
GINZBURG, S. A., 1966, Radio Eng. Electron. (USSR) 11, 1972.
GLAUBER, R. J., 1963, Phys. Rev. 131, 2766.
GOODMAN, J. W., 1966, Trans. IEEE AES-2, 526.
GREEN, H. S. and E. WOLF, 1953, Proc. Phys. Soc. (London) A66, 1129.
GRENANDER, U. and G. SZEGŐ, 1958, Toeplitz Forms and Their Applications (Univ. of California Press, Berkeley).
GRISHANIN, B. A. and R. L. STRATONOVICH, 1970, Probl. Peredachi Informatsii 6(3), 15.
HAUS, H. A., 1970, Proc. IEEE 58, 1599.
HELSTROM, C. W., 1964, Trans. IEEE IT-10, 275.
HELSTROM, C. W., 1967a, J. Opt. Soc. Am. 57, 353.
HELSTROM, C. W., 1967b, Inform. and Control 10, 254.
HELSTROM, C. W., 1967c, Phys. Letters 25A, 101.
HELSTROM, C. W., 1968a, Trans. IEEE IT-14, 234.
HELSTROM, C. W., 1968b, Int. J. Theor. Phys. 1, 37.
HELSTROM, C. W., 1968c, Statistical Theory of Signal Detection, 2nd ed. (Pergamon Press, Oxford).
HELSTROM, C. W., 1968d, Inform. and Control 13, 156.
HELSTROM, C. W., 1969a, Trans. IEEE AES-5, 562.
HELSTROM, C. W., 1969b, J. Opt. Soc. Am. 59, 164.
HELSTROM, C. W., 1969c, J. Opt. Soc. Am. 59, 924.
HELSTROM, C. W., 1969d, J. Stat. Phys. 1, 231.
HELSTROM, C. W., 1970a, J. Opt. Soc. Am. 60, 233.
HELSTROM, C. W., 1970b, J. Opt. Soc. Am. 60, 521.
HELSTROM, C. W., 1970c, J. Opt. Soc. Am. 60, 659.
HELSTROM, C. W., 1971, Trans. IEEE AES-7, 210.
HELSTROM, C. W., J. W. S. LIU and J. P. GORDON, 1970, Proc. IEEE 58, 1578.
HOVERSTEN, E. V., R. O. HARGER and S. J. HALME, 1970, Proc. IEEE 58, 1626.
KARP, S. and J. R. CLARK, 1970, Trans. IEEE IT-16, 672.
KARP, S., E. L. O'NEILL and R. M. GAGLIARDI, 1970, Proc. IEEE 58, 1611.
KENNEDY, R. S., 1970, Proc. IEEE 58, 1651.
KENNEDY, R. S. and E. V. HOVERSTEN, 1968, Trans. IEEE IT-14, 716.
KURIKSHA, A. A., 1968, Radio Eng. Electron. (USSR) 13, 1567.
LACHS, G., 1965, Phys. Rev. 138, B1012.
LAX, M., 1966, Phys. Rev. 145, 110.
LIU, J. W. S., 1970, Trans. IEEE IT-16, 319.
LOUISELL, W. H., 1964, Radiation and Noise in Quantum Electronics (McGraw-Hill, New York).
LOUISELL, W. H. and L. R. WALKER, 1965, Phys. Rev. 137, B204.
MANDEL, L., 1958, Proc. Phys. Soc. (London) 72, 1037.
MIDDLETON, D., 1960, An Introduction to Statistical Communication Theory (McGraw-Hill, New York).
MIDDLETON, D., 1966, Trans. IEEE IT-12, 230.
PERSONICK, S., 1971, Trans. IEEE IT-17, 240.
RAO, C. R., 1945, Bull. Calcutta Math. Soc. 37, 81.
REIFFEN, B. and H. SHERMAN, 1963, Proc. IEEE 51, 1316.
RUDNICK, P., 1962, Nature 193, 604.
SHE, C. Y., 1965, J. Appl. Phys. 36, 3784.
SHE, C. Y., 1968, Trans. IEEE IT-14, 32.
SHE, C. Y. and H. HEFFNER, 1966, Phys. Rev. 152, 1103.
SIEGERT, A. J. F., 1957, Trans. IRE IT-3, 38.
SLEPIAN, D., 1964, Bell System Tech. J. 43, 3009.
SLEPIAN, D. and H. O. POLLAK, 1961, Bell System Tech. J. 40, 43.
SLEPIAN, D. and E. SONNENBLICK, 1965, Bell System Tech. J. 44, 1745.
STEFANYUK, V. L., 1966, Probl. Peredachi Informatsii 2(1), 58.
TAKAHASI, H., 1965, Information Theory of Quantum-Mechanical Channels, in: Adv. in Communications Systems Vol. 1, ed. A. Balakrishnan (Academic Press, New York) p. 227.
VAN TREES, H. L., 1968, Detection, Estimation, and Modulation Theory, pt. 1 (Wiley, New York).
WALTHER, A., 1967, J. Opt. Soc. Am. 57, 639.
YOSHITANI, R., 1970, On the Detectability Limit of Coherent Optical Signals in Thermal Radiation, Ph. D. thesis, Univ. of California, Los Angeles.
YUEN, H. P., R. S. KENNEDY and M. LAX, 1970, Proc. IEEE 58, 1770.
AUTHOR INDEX A
BASOV,N. G., 70, 85 BASSANI,F., 207, 224 BATES,R. T., 210 212, 224 BATTERMAN, B., 252, 279 BAYER,E., 279 BEDDOES,M. P., 38, 42 J., 77, 86 BEESLEY, G. N., 279 BELOVA, BENNETT,H. S., 167, 224 W. R., 6,42, 283 BENNETT, BERG,A. D., 77, 85, 86 L., 232, 250, 279 BERGMANN, BERLINCOURT, D. A., 233, 236, 268, 269, 279 W., 103, 135 BERNARD, M., 276,280 BERNFELD, S., 279 BERNSTEIN, BERRY,M. V., 280 BETHE,H. A., 168, 172, 174, 182, 206, 219, 224,227 BHARGAVA, R. K., 177, 224 BHATIA,A. B., 252, 280 A. K., 38, 42 BHUSHAN, L., 176, 224 BIERMANN, T., 248, 282 BILLETER, P., 231, 248, 284 BIQUARD, BISIGNANI, W. T., 30,42, 43 J., 286 BJORKHOLM, H., 82, 85 BLANCHET, BOHME,H., 250, 280 BONCH-BRUEVICH, A. H., 250, 280 BONNER,W. A., 286 BORN,M., 233,240,241,249,251,252,262, 280 R., 284 BORNSTEIN, BORSUK,G. M., 280 BOWSER,M. L., 63, 85 D. J., 76, 80, 85 BRADLEY, BRAY,R., 287 M. A., 275, 280,283 BREAZEALE,
AAS,H. G., 248, 279 ABRAGAM, A., 125, 135 ABRAMS,R.L., 278, 279, 283 K., 282 ACHYUTAN, ADLER,R., 232, 254, 256, 276, 279 ADRIAN, F. J., 167, 201, 225 AGARWAL,G. S., 100, 135 ALIPPI,A., 279 ALLISON,S. K., 176, 224 ANDERSON, A. E., 77, 85 ANDERSON, G. B., 34, 42 L. K., 264, 266, 276,279, 280 ANDERSON, ANDERSON, 0. L., 240, 279 ANDREWS,H. G., 26, 43 A. W., 279 ANGELBECK, ANGER,J., 208, 225 ARM,M., 279, 283 ARTHURS,E., 367, 368 ATZENI,C., 279 AULD, B. A., 287 AUTH, D. C., 284
B BACCI,H., 82, 83, 85, 86 BAER,T., 23, 42 BAKER,J. G., 143, 149, 163 BAKUT,P. A., 292, 306, 368 V. L., 279 BALAKSHY, C. J., 168, 224 BALLHAUSEN, A. A., 243, 287, 288 BALLMAN, P., 74, 85 BALOSKOVIC, BARANNE, A., 157, 163 BAR-DAVID,I., 292, 368 BARNARD, G., 281 BARNSLEY, D. A., 49, 85 M. R., 63, 76, 85 BARRAULT, BARTON,M. P., 23,42 BARTRAM, R. H., 216, 226 371
BRIDENBAUGH, P. M., 285 BRIENZA, M. J., 280 BROWN,E. F., 21, 42 BROWN,R. P., 67, 87 BROWN,W. F., 187, 188, 224 BRUINING, H., 71, 72, 86 BUDD,W. E., 267, 287 BUDRIKIS, F. L., 37, 43 BULPITT, T. H., 73, 85 BUNTENBACK, R. W., 73, 86 BURCKHARDT, C. B., 280 E., 284 BURSTEIN, J. C., 208, 224 BUSCHNELL,
c CALDWELL, D. O., 77, 85 CALLEN, H. B., 103, 135 CAMAGNI, P.,217, 224 CANDY, J. C., 21, 42 CARLETON, H. R. 280, 284 CARPENTER, R. O’B., 280 S., 176, 224 CHANDRASEKHAR, CHARLES, D. R., 74, 85 C., 23, 42 CHERRY, CHESTER, A. N., 281 CHIAROTTI, G., 207, 217, 224 R. A,, 59, 61-64, 67, 85-87 CHIPPENDALE, C H R ~ T I EH., N , 150, 163 CHU,Ruey-Shi, 280 J. R., 292, 368 CLARK, COHEN,M . G., 242,246,255,257,261,275, 280-282 COHEN,M . H., 184, 224 COHOON,R. L.,248, 249, 281 COLE,H., 252, 279 COLEMAN, K . R., 49, 85 COLLINS, J. H., 280 COLMER, R. A,, 69, 86 COMPTON, A. H., 176, 224 COMPTON, W. D., 167, 174, 201, 208, 216, 2 17, 224,227 COOK,B. D., 252, 280,283 COOK, C. E., 276, 280 G. A., 244, 264, 266, 274, 277, COQUIN, 280, 281, 284, 287 CORRADETTI, M., 22, 42 COTTON,R. V., 21, 43 COURANT, R., 176, 224 COURTNEY-PRATT, J. S., 49, 58, 61, 64, 72, 85 C R A M ~H., R , 299, 365, 368 R. S., 207, 224 CRANDALL, CROWTHER, W. R., 26, 43
CRUMLEY, B., 280 CRUMLY, C. B., 248, 249, 281 CUMMINS, H. Z., 280 CUNNINGHAM, J. E., 18, 37, 42 CURRAN, D. R., 233, 236,268,269, 279 CUTLER, C. C., 6, 20, 42
D DAKSS,M. L., 284 DAMON,R. W., 281 DANIELSON, G. E., 248, 249, 281 DARRIN,C. G., 187, 188, 224 DAVEY, J. R., 6, 42 DAVIS,L., 286 DEAN,R. E., 260, 281 J. H., 47, 52, 85 DEBOER, DEBYE,P., 23 1, 28 I DEFEBVRE, A., 28 1 DE JAGER,F., 21, 42 DELBECQ, C. J., 217, 220, 224 A. J., 248, 249, 275, 280, 281 DEMARIA, DESMARES, P., 232, 256, 276, 283 DEUTSCH, S., 37, 42 DEXTER,D. L., 168,173, 177, 178,180,196, 209, 212, 213, 219, 224, 225, 227 DIAMANT, L. M., 74, 84, 85 DICK,B. G., 216, 227 DIDOMENICO, M., 241,288 DIEULESAINT, E., 287 DIMITROFF, G. Z., 149, 163 DIXON,R. W., 239,242,255, 264,275,281, 282,286 DOMALAIN, M., 74, 85 DOYLE,W. T., 174, 208-210, 212, 213-215, 217, 225 DRIARD,B., 57, 74, 85 DROZHBIN, Yu. A., 70, 85 DUBOVIC,A. S., 64, 85 DURAN,J., 281 DURAN,M. J., 281 DURANT, M., 74, 85 DUTTON,D. B., 207, 225
E EBY,J. E., 207, 225 ELIAS,P., 20, 42 EMBERSON, D. L., 57, 76, 85, 87 EREZ,A., 49, 85 ERF,R. K., 248,279 EROS,S., 217, 218, 225 ESCHARD, G., 82, 85 ESTERMANN, I., 21 1, 225
EWEY,M. D., 280 R., 252, 281 EXTERMANN, S., 49, 85 EYLON, F FANCHENCO, S. D., 64, 71, 85, 87 E. J., 236, 281, 365, 368 FARRELL, F. I., 236, 281 FEDOROV, V. M., 74, 84, 85 FEDOROV, M., 260, 281 FELDMAN, P., 55, 56, 85 FELLGETT, FISCHER, F., 208, 218, 225 FLECK,J. A., 135 D. A., 281 FLINCHBAUGH, J. R., 54, 85 FOLKES, FORK,R. L., 275, 282 L. C., 248, 249, 280, 281 FOSTER, N. F., 274, 281 FOSTER, W. B., 167, 178,207,208,216,225 FOWLER, G., 286 FRANCOIS, FRANCKEN, J. C., 71, 72, 86 FRITZ,B., 208, 217, 225 FROHLICH, H., 170, 187, 188, 225 E., 250,280 FROMM, I., 220, 226 FUJITA, A., 210, 220, 225, 226 FUKUDA, FULLER, G. G., 282 F U M IF. , G., 201, 225, 228 G
GABOR, D., 20, 37, 42, 354, 368 I., 282 GABRIELLI, GAGLIARDI, R., 292, 368 R. M., 292, 368 GAGLIARDI, GAGOSZ, R., 281 GALLAHER, L. E., 276, 287 GARFIELD, B. R. C., 54, 85 S. C. B., 151, 161, 163 GASCOIGNE, GAVGANEN, L. V., 74, 84, 85 W., 207, 226 GEBHARDT, G., 208, 225 GEHRER, GERIC,J. S., 267, 276, 282 GIAROLA, A. J., 248, 282 GIBSON, F. C., 63, 85 GIE, T. I., 208, 225 T. L., 201, 225 GILBERT, GILL,S. P., 282 S. A., 292, 368 GINZBURG, GIRES,F., 260, 282 R. J., 99, 100, 103, 135, 320, 335, GLAUBER, 368 GOETZE, G. W., 77, 85 J. W., 292, 368 GOODMAN,
GORDON, E. I., 232,239,246,255-257,259, 261, 267, 275, 280-282 GORDON, J. P., 94, 100, 135, 292, 323, 368 Goss, W., 49, 85 GOTO,T., 218,219, 225 B. S., 167, 201, 225 GOURARY, GRAF,J., 85 D. N., 25, 31, 42 GRAHAM, R. E., 20, 42 GRAHAM, U. M., 207, 224 GRASSANO, R. L., 275, 284 GRAVEL, GREEN, H. S., 330, 368 L. C., 176,225 GREEN, U., 337, 368 GRENANDER, J. P., 260, 264, 266, 280, 281 GRIFFIN, B. A., 362, 368 GRISHANIN, U. F., 38, 42 GRONEMANN, GROVER, F. H., 63, 85 E. P., 84, 87 GRUGLIAKOV, GRUNDIG, H., 208,218,225 R. F., 193, 194, 225 GUERTIN, GURNEY, R. W., 184, 188, 226 GUYOT,L. F., 57, 74, 85 GYULAI, Z., 215, 225 H HAAKE, F., 135 A., 26, 42 HABIBI, HAKEN, H., 100, 101, 135 B. W., 282 HAKKI, S. J., 292, 368 HALME, R., 103, 135 HANBURY-BROWN, HANCE,H., 282 HANCE, H. V., 287 R. O., 292, 368 HARGER, L. E., 275,282 HARGROVE, S . E., 265, 275, 282 HARRIS, C. W., 20, 22, 42 HARRISON, H., 217, 225 HARTEL, D. R., 174, 225 HARTREE, HA&, H. R., 185, 194, 200, 225 E. S., 200, 226 HAURWITZ, HAUS,H. A., 316,368 HEALEY, T. J., 81-83, 85 HEER,C. V., 210 212, 224, 227 HEFFNER, H., 367, 369 HELLMANN, H., 174, 225 K. H., 279 HELLWEGE, C. W., 292, 293, 297, 298, 306, HELSTROM, 307, 313, 319, 322, 323, 326, 334, 335, 348, 353, 358, 359, 363, 365, 366, 368 D. M., 278,283 HENDERSON, HERMAN, F., 176, 225
HERRING, C., 188, 191,225 K. F., 169, 228 HERZFELD, HETT,J. H., 63, 86 HIEDEMANN, E. A., 252, 275, 280,282-284, 288 HIGGINS,J. F., 76, 85 HILBERT, D., 176, 224 HILL,D. A,, 77, 85 HILL,P. C. J., 20, 37, 42 HILLS,M. E., 210, 212, 227 N . C., 283 HILLYARD, HIRAI,M., 218, 225 HOGAN,A. W., 60, 85 M. G., 286 HOLLAND, HOLST,G., 47, 52, 85 HOVERSTEN, E. V., 292, 368 HUANG,J . Y., 25, 43 HUANG,T. S., 6, I I , 12, 19, 23, 27, 34, 39, 40, 42-44 HUFFMAN, D. A., 17, 43 HUSTON,A., 66, 86 HUSTON,A. E., 67-70, 72, 83, 86 HYGH,E. H., 207, 226 I IADONISI, G., 207, 224, 225 ILIN, V. S., 283 INABA, H., 283 K., 2 10, 225 INOHARA, IsHII,T., 218, 219, 225 ISKOLDSKI,A. M., 74, 83-85, 87 IWASAKI,H., 277, 288 K . S., 283 IYENGAR,
J JAFFE,H., 233, 236, 268, 269, 279 JENKINS, J. A., 59, 61, 86 JERRARD, H. G., 283 N. C., 176, 225 JOHNSON, JONES,L. W., 71, 86 JONES, R. C., 55, 86
K KAHN,A. H., 188, 205, 227 V. I., 283 KAMENSKII, I. P., 239, 283 KAMINOV, KAMP,J. C., 250, 283 KANE,J., 26, 43 KANTER,H., 77, 85 KANZAKI, H., 207, 226 KAPLAN,D., 74, 85 KARP,S., 292, 368 KASANTSEV, V. F., 279
KAY,N. D., 30, 43 KECK,G., 283 KEKEZ,M. M., 63, 76, 85 KELLYJr., J. L., 367, 368 KENNEDY, R. S., 292, 308, 368, 369 S. O., 208, 225 KENNEDY, KEY, M. H., 76, 85 KING, M., 283 KING, R. W., 63, 86 S. C., 30,43 KITSOPOULOS, KITTEL,C., 240, 283 KLAUDER, J. R., 99, 117, 135 KLEEFSTRA, M., 210, 212, 217, 225 KLEIN,M. V., 208, 225 KLEIN,W. R., 252, 283 F. C., 210-212, 218, 225 KLEINSCHROD, KLICK,C. C., 209, 225 N., 280 KNABLE, KNAPP,C. F., 30, 43 KNOX,R. S., 209, 225 T., 283 KOBAYASHI, H., 247, 252, 283 KOGELNIK, KOHLER,H., 153, 157, 161, 163 KOHN,E. S., 283 KbHN, W., 177, 199, 203, 208, 226 KOLB,J., 283 KOLCHIN,E. K., 176, 225 KOLERS,P., 41, 43 V. S., 66, 83, 86 KOMELKOV, KONITZER, J. D., 209, 226 V., 104, 135 KORENMAN, V. V., 66, 70, 71, 86 KOROBKIN, KORPEL,A., 232,260,276, 283, 287, 288 KOSTLIN,H., 208, 226 G. P., 283 KOSTYUNINA, KRAMER, H. P., 25,43 KRAMERS, H. A., 176, 226 E. R., 30, 43 KRETZMER, KRISCHER, C., 283 KRONIG,R. de L., 176, 226 KUBBA,M. H., 23,42 KUBO,R., 103, 135 KUHN,U., 208, 226 KULIASKO, F., 283 KUPPERS,H., 283 A. A., 292, 331, 368, 369 KURIKSHA, KURZ,G., 208,226 A., 66, 69, 86 KUTUKOV, L LA, S. Y., 216, 226 LABOWSKI, M., 287 LACHS,G., 325, 369
LAMACCHIA, J. T., 277, 284, 286 LAMARRAGUE, P., 14, 85 LAMBJR., W. E., 95, 99, 103, 135 LAMBERT, L. B., 215, 283, 284 G. B., 284 LAMERS, LANDOLT,H. H., 284 LANGENBERG, D. N., 180, 227 LANCER,H., 208, 225 LARDAT,C., 282 E., 82, 86 LAVIRON, LAX,M., 101-103, 115, 118, 123, 127, 135, 168, 180, 181, 226, 233, 237, 285, 308, 316, 369 LAZAY,P. D., 238,285 LEAN,E. G. H., 266, 280, 284 F., 74, 85 LE CARVENNEC, LEIVO,W. J., 211, 225 LENZO,P. V., 243, 287 LEROY,O., 283 LEUTE,H., 210, 226 LI, T., 247, 283 LIDDY,B. T., 54, 85 LIMB,J. O., 21, 43 LINDEN,B. R., 72, 86 LIPNICK, R., 248, 284 LITTLETON, M. J., 214, 226 LIU, J. W. S., 292, 308, 323, 368, 369 LOEBER,A. O., 283 LOEBER,A. P., 275, 284 LOTSOFF,S., 288 LORENTZ,H. A., 168-170, 187, 226 LOUISELL,W. H., 94, 101, 102, 118, 127, 128, 135, 314, 323, 369 LOWDIN,P. -O., 176, 226 LUBECK,K., 176, 224 LUCAS,R., 23 1, 248, 284 LUCKEN,E. A. C., 184, 226 LUCKY,R. W., 6, 43 LUNN,G. H., 64, 86 LWTY, F., 207-209,212,215,217,225,226
M MAGYAR, G., 75, 86 MAIER,K., 207, 226 MAJUMDAR, S., 68, 80, 85, 86 W. T., 275, 276, 280, 281, 284, MALONEY, 285 MANDEL,L., 51, 55, 57, 75, 86, 335, 369 MARADUDIN, A. A,, 284 J., 128, 135 MARBURGER, MARILLEAU, J., 82, 83, 85 MARKHAM, J. J., 174, 209, 212, 226 W., 218-220, 226, 227 MARTIENSSEN,
MARTIN,J., 6, 43 MARTIN,R. J., 287 MARTONE,M., 63, 86 G. P., 285 MARTYNENKO, MASON,W. P., 233, 269, 284 MATTHEWS, H., 281 MATTHEWS, M. W., 25,43 MAYDAN,D., 256, 259, 275, 284 MAYER,W. G., 284 MCCLURE,D. S., 167, 226 MCGEE,J. D., 77, 86 MCKENNA,J., 99, 135 MCMAHON,D. H., 281, 284 MCSKIMIN,H. J., 233, 284 MEEK,J. M., 61, 86 MEIER,O., 38, 42 MEINEL,A. M., 144, 153, 163 A. H., 269,270,274,284, 285 MEITZLER, MELTZ,G., 275, 276, 280, 284, 285 MENIGER,R. C., 73, 86 MERTENS,R., 282, 283, 285 MICHAEL, A. J., 285 MIDDLETON, D., 11, 43, 293, 299, 359, 369 MIKKOR, M., 207,224 MINKOFF,J., 279 MINKOFF,J. B., 285 MOLLOW,B. R., 100, 135 MOLLWO,E., 169, 171, 209, 211, 226 MONTAGUE, J. C., 73, 87 MOORE,J. C., 73, 87 MORI,H., 285 J., 247, 286 MORRISON, MORSE,P. M., 200, 226 MOTT,N . F., 184, 188, 214, 226 MOTT-SMITH, J. c . , 23, 43 MOUNTS,F. W., 21, 37, 43 MUELLER, H., 241, 285 MURTY,J. S., 255, 286
N NAKAZAWA, F., 207, 226 NATH,N . S. N., 252, 254, 286 NELSON,D. F., 233, 237, 238, 285 NESTERIKHIN, Yu. E., 66, 74, 83-87 NIEH,S. T., 265, 282 NIKITENKO, V. I., 285 NIKITIN,V. V., 70, 85 NIKLAS,W. F., 73, 74, 86 NOBLE,W. J., 252, 280 NOMOTO,O., 248, 285 NUNN,M., 146, 163 NYE,J. F., 233, 234, 237, 238, 243, 285
0 OHKURA, H., 217, 226 Y., 244, 285, 288 OHMACHI, OKOLICSANYI, F., 231, 285 OLIVER, B. M., 3, 43 ONAKA,R., 210, 220, 225, 226 O’NEALJr., J. B., 20, 21, 43 O’NEILL,E. L., 292, 368 ONSAGER, 1..169, 187, 188, 226 OWREN,H. H., 81-83, 85
POWELL, C. G., 284 PRASADA, B., 12, 43 PRATT,T. H., 47, 58, 86 PRATT,W. K., 6, 26,43 PREZIOSI, B., 207, 224, 225 PRIMAK, W., 286 PRINGSHEIM, P., 220, 224 PROSSER, R. D., 77, 85 PRUDENCE, M. B., 69, 86
Q P PAGE,L. J., 207, 226 PALMIERI, L., 279 PAN,J. W., 31, 43 S. D., 64, 71, 85 PANCHENKO, PANCHOLY, M., 285 PANOFSKY, W. K. H., 186, 226 PANTANI, L., 279 PARKER, W. H., 180, 227 PARKS,J. K., 255, 282, 285 PARTHASARATHY, S., 285 PATTERSON, D. A,, 209, 225 PARYGIN, V. N., 279 PAUL,M., 142, 144, 148, 163 PAUS,H. J., 208, 211, 217, 226, 227 PAUTHIER-CAMIER, M. S., 281 PEARSON, D. E., 23, 42 PFDINOFF, M. E., 285 PERFILOVA, V. E., 287 PERIGAMENT, M. I., 66, 86 PERL,M . L., 77, 86 PERSONICK, S., 362, 363, 366, 369 PETERSON, D. P., I I , 43 PETERSON, G . E., 285 PETROFF, St., 208, 216, 227 PHARISEAU, P., 255, 285 PHELPS,F. T., 210, 212, 227 PHILLIPS, J. C., 285 PHILLIPS, M., 186, 226 PICK,H., 167, 208,211, 212, 214, 216-218, 220, 226, 227 PIERCE, J . R., 3, 43 PINNOW,D. A., 239-244, 263, 267, 279, 280, 285, 286 PLAKHOV, A. G., 64, 66, 71, 85, 87 POLAERT,R., 82, 85 POLLACK, M . A., 275, 282 POLLAK, H. O., 342, 369 PORRECA,F., 286 POST, A., 25, 43 POST,D., 286
QUATE,C. F., 232, 252, 266, 284, 286 R
RABIN,H., 208, 216, 217, 224 RADER,C. M., 26, 43 C. W., 63, 85 RAMALY, RAMAN, C. V., 252, 254, 286 RAMAVATARAM, K., 286 RANDOLPH, J., 247, 286 RAO, B. R., 255, 286 RAO,C. R., 365, 369 RAO,R. C., 286 RAUCH,C. J., 210, 212, 227 RAUSCH,G., 217, 225 REED,W. O., 73, 74, 86 REEDER, T. M., 286 REFSDAL, I. N., 163 REICH,A,, 248, 284 REID,C. D., 72, 86 REIF,F., 184, 224 Reiffen, B., 292, 369 REMM,R. L., 21, 43 REPIN,V. G., 292, 368 REZ, J. S., 288 RICHARDS, G. P., 30, 42, 43 RICHARDS, M. A., 62, 63, 86 RIGDEN, J. D., 220,227 RISKEN,H., 100, 101, 135 1.G., 24, 43 ROBERTS, ROCKSTAD, H., 220, 227 Roos, W., 169, 171, 209, 226 ROSE,A,, 55, 86 ROSENFELD, A., 6, 43 ROSENFELD, L., 169, 227 ROSENTHAL, A. H., 275, 286 ROSIN,S., 143, 163, 164 ROSS,F. E., 140-142, 164 ROZGONYI, G. A., 274, 281 RUDNICK,P., 369 RUGGLES,P. C., 75, 86
S SACOCCIO, E. J., 286 E. E., 172, 174, 206, 219, 224 SALPETER, SALTZ,J., 6, 43 R. A., 140, 160, 164 SAMPSON, SAXE,R. F., 63, 86 G., 279 SCHAACK, W., 47, 58, 72, 86 SCHAFFERNICHT, P., 71, 72, 86 SCHAGEN, R., 38, 43 SCHAPHORST, M., 66, 70, 71, 86 SCHELEV, SCHIPF, L. I., 179, 227 R. A., 77, 85 SCHLUTER, R. V., 286 SCHMIDT, G. A., 248, 284 SCHOEN, W. F., 23, 30, 31, 39, 43 SCHREIBER, J. H., 167, 174, 201, 227 SCHULMAN, D. H., 152, 162, 164 SCHULTE, P. M., 25, 43 SCHULTHEISS, G., 210,226 SCHULZ, SCHULZ,M. B., 243, 286 K., 150, 164 SCHWARZSCHILD, M., 103, 104, 106, 107, 135 SCHWINGER, SCOTT,A. B., 210, 212, 227 SCOTT,F. H., 63, 85 F. W., 13, 44 SCOVILL, SCULLY,M., 95, 99, 103, 135 SEARS,F. W., 231, 281 SEGUIN,H. A., 285 H., 168, 227 SEIDEL, SEITZ,F., 167, 168, 172, 175, 211, 213,227 SEGRE,S. E., 63, 86 SENENOV, A. S., 70, 85 I. R., 117, 135 SENITZKY, A. J., 37, 43 SEYLER, SHAM,L. J., 199, 203, 226 C. E., 3, 4, 7, 16, 43, 44 SHANNON, SHAW,H. J., 266, 280, 284 S. S., 306, 368 SHCHUROV, SHE,C. Y., 309, 367, 369 SHEN, Y. R., 125, 135 H., 292, 369 SHERMAN, W., 188, 205, 227 SHOCKLEY, A. J. F., 340, 369 SIEGERT, A. E., 286 SIEGMAN, SIEGMUND, W. P., 58, 86 R. H., 188, 191, 210, 212, 227 SILSBEE, P., 66, 69, 86 SIMONOV, SINGH,H., 285 SIROU, F., 51, 74, 85 SITTIG,E. K., 250, 270, 275, 280, 285-287 SKILLMAN, S., 176, 225 SLARK,N. A., 75, 86
J. C., 199, 203, 227 SLATER, F. H., 275, 287 SLAYMAKER, D., 342, 354, 369 SLEPIAN, A., 287 SLIWINSKI, A., 168, 169, 227 SMAKULA, SMITH,A. W., 287 SMITH,D. Y., 173, 197, 201, 207, 212-214, 219, 227 SMITH,R. W., 77, 79, 85-87 SMITH,T. M., 232, 283, 287 SMITH,W. A,, 210,227 SMITS,F. M., 276, 287 G. E., 64, 66, 71, 85, 87 SMOLKIN, SNELL,P. A., 72, 86 A,, 182,227 SOMMERFELD, E., 342, 369 SONNENBLICK, SOREF,R. A,, 280 SORIN,A. S., 287, 288 SPEARS,V. L., 287 D. A., 23, 44 SPENCER, SPENCER, E. G., 243, 287, 288 G., 207, 212,214, 227 SPINOLO, STASIW,O., 209, 212, 227 M. Ya., 84, 87 STCHELEV, V. L., 292, 369 STEFANYUK, B. M., 70, 85 STEPANOV, STERN,F., 193, 194, 225 STERN,O., 21 1 , 225 G. W., 55, 87 STEWART, R. G., 73, 87 STOUDENHEIMER, R. L., 362, 368 STRATONOVICH, G. R., 21, 43 STROHMEYER, J., 207, 216,226,227 STROZIER, E. C. G., 99, 117, 135 SUDARSHAN, Y., 206, 227 SUGIURA, SUMINOKURA, T., 285 H. E., 267, 287 SWEENEY, SZEGB,G., 337, 368
T TAKAHASI, H., 327, 329, 369 T., 280 TAMIR, G. P., 292, 368 TARTAKOVSKII, B. N., 180,227 TAYLOR, TEEGARDEN, K. J., 207, 225 TELL,B., 287 J. R., 188, 205, 227 TESSMAN, TEVES,M. C., 47, 52, 85 THACKERAY, D. P. C., 64, 85 THALER, W. J., 280, 287 K., 211, 227 THOMMEN, THURSTON, R. N., 233, 287 H. F., 268, 287 TIERSTEN,
TIMUSK, T., 218-220, 227 TIPNIS, C. B., 252, 283 TOEPLER, A., 287 TOMIKI, T., 217, 228 TORGUET, R., 287 TORIKAI, Y., 285 TOSI,M. P., 201, 225, 228 TRETIAK, 0. J., 6, 1 1 , 12, 23, 39, 42, 43 TSAI,C. A., 287 TSAI,C. S., 287 TURNER, E. H., 239, 283 TURNOCK, R. C., 61, 86, 87 Twiss, R. Q., 103, 135
1J UCHIDA, N., 244, 277, 285, 287, 288 UETA,M., 218, 219, 225 UNSOLD,A,, 175, 228 V VANATTA,F. A., 274, 281 VANTREES,H. L., 293, 359, 369 VANUITERT, L. G., 286 VANVLECK,J. H., 168, 188, 228 VASILEVSKAYA, A. S., 287, 288 VEENEMOUS, C. F., 47, 52, 85 VENTURINI, E. L., 288 VIOLETTE, H., 160, 162, 164 VOROBJEV, V. V., 83, 87 VYGON,V. G., 292, 368 W WACKS,K., 25,44 W.-U., 210, 228 WAGNER, WALKER, L., 94, 135 WALKER, L. R., 323, 369 WALLACE, R. W., 265, 282 WALrERS, F., 66, 67, 86, 87 WALTHER, A,, 330, 369 WANIECK, M., 55, 87 WANNIER, G., 252, 281 WARNER, A. W., 244, 260, 280, 281, 286, 287 WATSON,W., 232, 276, 283 WAX, N., 102, 135 WEAST,R. C., 218, 228 WEAVER, W., 4, 7, 16, 44
WEDDING,B., 208, 225 WEEKLEY, B., 57, 87 WEIDLICH, W., 100, 101, 135 WELDON,E. J., 6, 43 WELTON, T. A., 103, 135 WEMPLE,S. H., 241, 288 WENDT,G., 74, 85 WHELAN, J. W., 30, 42 WHITMAN, R., 288 WILCOCK, W. L., 57, 87 WILKINS, L. C., 6, 44 WILKINSON, C. D. W., 286 WILLIAMSON, S. R., 286 WILSON,A. H., 182, 228 WILSON,R. N., 163, 164 WINKLER, M. R., 21, 44 WINSLOW,D. K., 265, 282,286 WINTZ,P. A., 6, 26, 42, 44 WITT,H., 21 I , 228 WITT,R., 21, 44 WOLF,E., 100, 135, 232, 240,241, 249, 251, 252, 262, 280, 330, 368 WOLF,H. C., 168, 227 WOLF, K. L., 169, 228 WOODS,J. W., 27, 43,44 WOOLGAR,A. G., 75, 86 WORLOCK, J. M., 287 WORMELL, P. M. J. H., 146, 164 Wu, WEI-HAU,288 WYNNE,C. G., 141, 142, 144, 146, 147, 150, 152-157, 161-164
Y YAKOVLEV, V. A., 70, 85 YAMAGUCHI, Y., 12, 43 YAMASAKI, M., 288 YILMAZ, H., 200,226 YOSHITANI, R.,322, 369 YOUNG,L. A., 200, 226 YOUNGBLOOD, W. A., 18,44 YUEN,H. P., 308, 369 YUSTER, P., 220, 224
Z ZANKEL, K. L., 288 ZAVOISKY, E. K., 64, 66, 71, 85, 87 ZITTER,R. N., 288
SUBJECT INDEX A aberration balancing, 156 - correction, I39 et seq. -, monochromatic higher order, 146 -, refractive, 250 absorbing center, 179 absorption, 151 - band, 170 -coefficient, 179, 191, 208 - -, integrated, 196 - cross section, 167, 179, 204 - defect, 167 - edge, 241 - line, 180 - strength, 167, 186, 205 et seq. accentuation, high-spatial-frequency, 1 1 acceptor, 178 acoustic figure of merit, 237 et seq. acoustically matched metal layers, 275 acoustooptic beam positioning, 276 - deflector, 231 addition of impurities, 167 adiabatic polarization procedure, 197 afocal doublet, 146 aliasing, 1 1 alkali halide, 205 et seq. - - crystal, 169, 173 aluminum backing, 58 et seq., 84 amplification, 326 et seq. - effect on detection, 345 amplitude filter, 276 - modulation, 114 analyser, 245 angular bandwidth, 260 - deformation, 235 - field, useful, 140 - selectivity, 255 annihilation operator, 91, 314 anti-shielding effects, 184 antireflection coating, 274
aperture plate, 80 aplanatic mirror pair, 150 area coding, 23 array of photosensitive area, 64 artificial contours, 25 aspheric plate, 145, 150 - plates, multiple, 153 aspherical plate corrector, 144 ef seq. asphericity coefficient, 145 astigmatic correction, 150 astigmatism, 139, 141, 145, 148, 151, 161 - coefficient, 147 astrometry, 142 atmospheric seeing, 140 -turbulence, 149 atomic absorption spectroscopy, 210 - beam, 96 et seq. - current, 130 - hydrogen, 206
B backward time operator, 105
- - path, 107 - - transition, 122 band gap, 178
- shape, 181 bandpass response, 270
- shape, 263 bandwidth, channel, 7 compression, 17, 33 -, definition of, 335 - reduction, 37, 260 Baker corrector, 143 et seq. baseband signal, 7 et seq. Bayes criterion, 294, 297, 325 - rule, 301 - strategy, 295 - - for multiple hypothesis, 300 - - in estimation. 360 beam deflection, 67 -
- splitter, 48, 69, 82 - - steered transducer, 263 - steering, 260, 266 beats, 70 bias potential, 67 biaxial crystal, 233 binding energy, 178 biplanar image tube, 52, 81 et say. birefringence, 23 1 , 265 - light deflection, 248 et seq. - modulator, 245 -, static, 234 birefringent modulation, 250 bit rate, 28, 39 - sequence, 7 blackbody photon distribution, 99 bleaching, 216 Bloch function, 182 block boundary, 28 - code, 9 - quantization, 25, 29, 33 - -, Hadamard, 29 blurring, 33, 65 Bose distribution, 296 et seq. - -, generalized, 362 boundary condition, 129 et seq., 238, 254 - point, 23 Bragg angle, 254 et seq., 260, 264, 275 - bandwidth, 266 - condition, 255 et seq. - deflector, 256 - diffraction, 232, 242, 255 et seq., 264 et seq., 275 - -, collinear, 265 - incidence condition, 255 et seq. - modulator, 256, 259 brightness changes, 30 - level, 16 Brillouin scattering, 232, 238 Brownian motion, 1 15 brute force focussing, 54, 78 buffer storage, 40 bulk low energy dielectric constant, 177
C C W-diffraction efficiency, 276, 278 carrier, high frequency, 7 - phase offset, 9 cascade image intensifier, 56 et seq., 79 - intensifier, 73, 76 Cassegrain focus, 139, 150 - - corrector, 140 - telescope, 150, 160
cavity, 95 dumping, 275 - field, 188 et seq. - -in-a-continuum model, 190 center of symmetry, 236 central-cell correction, 177 channel, 4 -bandwidth, 7 capacity, 16 et seq. - coding, 6, 10 - disturbances, 23 - noise, 5 et seq., 9, 39 -, noiseless, 39 check bit, 9 chromatic aberration, 53, 140, 144, 160 - defects, 146 - errors, 149 chrominance, 38 chronography, electron optical, 71 circular polarization, 234 classical dispersion relation, 170 close doublet corrector, 155 coarse-fine quantization, 30 et seq. code, BCH, 9 -, block, 9 -, convolutional, 9 coding area, 23 -, dual mode, 30 -, interpolating, 37 -, interpolative, 18 -, intraframe, 37 -, predictive, 20 et seq. -, psychovisual, 17 et seq. -, run-length, 22 et seq., 3 I , 34 -, source, 5 , 10 -, statistical, 16 et seq. 20 -* transformational, 33 coherent energy, 1 15 - illumination, 231 collective oscillations, 125 color center, 174, 205 et seq., 222 - picture, 38 coloration, 167 coloring agent, 209 coma, 139, 142, 145, 147 - coefficient, 153 -flare, 140 complex index of refraction, 170 compression, isotropic, 241 computer optimisation, I55 conducting layer, 54 - substrate, 75 conduction band, 207 -
-
conductivity, electrical, 50 confusion, disc of, 53, 78 constitutive relations, 233 contour coding, 40 - interpolation, 20 - tracing, 3 1 convolutional code, 9 corrective signal, 18 correlation effects, 183, 187, 195, 197 - function, 92, 103 et seq., 110 et seq. -, interaction induced, 126 - matrix, 26 -time, 91, 95, 99, 103, 111, 126 costs in estimation, 359 - of error, 294 CoudC focus, 139, 150 Coulomb field, 187, 219 - interaction, 187 - potential, 174, 199 et seq. counting of photons, 297 covalent bonding, 241 CramCr-Rao inequality, 364 et seq. creation operator, 91, 314 criterion, Bayes, 294 -, Neyman-Pearson, 295, 297, 304 critical flicker frequency, 38 cross section, 178 crosstalk, 6 - suppression, 247 crystal birefringence, 238 - potential, 168 - symmetry, 235 cubic lattice, 194 curved photocathode, 53
D damped harmonic oscillator, 94, 98 damping, 93 - constant, 170 - rate, 98 decay rate, 97 -time, 82, 126 decision level, 295 defect, 167 et seq., 172 -absorption, 167, 169 et seq., 171 - energy level, 167 - excitation, 196 - -host interaction, 168 -_system, 172 - oscillator strength, 169 - 's polarizability, 188, 191 deflection angle, 246
-, circular, 71 -, laser beam, 232 -
plate, 65 et seq.
- shutter, 64 deflector passband, 262 degration, 28, 38 degree of excitation, internal, 104 degrees of freedom, spatial, 354 delay, 63, 69 - -line, tapped, 8 -, optical, 64 delta modulation, 21 - squared modulation, 21 demagnetization factor, 186 demodulator, 5, 7 density, 18 - matrix, 91 et seq., 95 et seq., 120, 123 - - equation of motion, 94 - - theory, 95, 114 - operator, 302 --coherent signal in Gaussian noise, 319 - - in number representation, 323 - -, pure states, 306, 320 - - signal of random phase, 325 destructive interference, 127 --time, 114 detection, binary, 292 et seq., 301 -, coherent light beam, 334 -, extended objects, 353 et sey. -, incoherent light, 334 et seq. -, probability of, 294, 304 - threshold, 298 et seq., 306 et seq., 321 et seq., 351 et seq. dielectric breakdown, 239 -constant, 188 - continuum, 204 - function, 170, 204 - medium, 179 - polarizability, 240 et seq. - screening function, 184 - tensor, 270 differential PCM, 20 et sey. differentiator, 3 1 diffraction effects, 232 - efficiency, 232, 242, 254 et seq., 262, 266 -, far field, 247 - limit, 250 diffractive deflector, 244 et seq., 275 - light deflection, 250 et seq., 267 - modulator, 244 diffuse center, 177 diffusion, 115 - coefficients, 101, 115
digital communication system, 3
- computer, 40 diode tube, 62 dipole approximation, 179 - -dipole interaction, 188, 196 - -quadrupole effect, 197 directional response, 262 disc of confusion, 53, 78 discrete-frame approach, 17 dispersion relation, classical, 170 - theory, 171 - -, classical, 169 dissipation loss, 268 distortion, 17, 53 et seq., 61, 72, 82 - measure, 4, 41 donor, 178 - impurity, 174, 177 Doppler effect, 252 doublet corrector, 142, 152 - lens corrector, 151 et seq. drift, 115 - rate, 91 - space, electron, 77 driving current, 132 dropout, 7 Drude-Sellmeier field, 187 -_formula, 187 dual mode coding, 30 et seq. duo-binary technique, 8 dynamic storage, 77, 79 et seq. dynamical effects, 189
E E. P. R., 210 E. P. R. absorption, 210 edge, 30 et seq. - detection, 19 - information, 31 - point, 3 I - sensitivity, 41 - sharpness, 22, 35 - signal, 31 effective field, 167 et seq., 186,203, 209, 221 - - ratio, 169, 180, I92 et seq., 2 13 - local field, 183 - mass, 167, 169, 178, 181 - - approximation, 178 - -/:sum rule, 182 - - states, 182 - - tensor, 181 - polarizability, 189 et sey. eigenfunctions, 168 elastic axis, 238
constant, effective, 238 relations, 234 et seq. - stress, 231 -tensor, 235, 242, 270 - waves, 236 elastooptic coefficient, 244 - constant, 240 - diffractive deflector, 231 et seq. - effects, 237, 240 - materials, 242 - phenomena, 237 - tensor, 237, 242 electric-dipole transition, 179 - spark, 58 electro-optical effects, 49 - _ system, 49 electron drift, 54 - -nuclear Coulomb interaction, 195 - -optical system, 52 - -phonon coupling, 209 electronic charge distribution, 168 - correlations, 184 - transition, 167 electrooptic effects, 237, 240 - phenomena, 237 - tensor, 237 electrostatic deflection system, 65 - focussing, 58, 71, 74 - lens, 53 ellipsoid of wave normals, 233 elongation, 235 emission energy, 50, 78 encoder, error-detection-and-correction code, 5 -, psychovisual, 5 -, statistical, 5 energy momentum relation, 256 entropy, 17 equalizer, 8 -, automatic, 8 erfc x , definition of, 3 12 error-detection-and-correction code, 9 - _ - _ _ _ - encoder, 5 -, information bit, 10 estimation, 301 -, classical theory, 359 et seq. -, efficient, 365 -, quantum theory, 361 et seq. exchange potential, 199 excited center, 178 exciton, 220 -, localized, 220 explosive reactions, 58 -
SUBJECT INDEX
exposure time, 61 et seq., 76 extensional strain, 235 extra-axial imagery, 139 extraordinary ray, 236, 244, 264 F
F-aggregates center, 208, 216 F-center, 169, 173, 178, 204 et seq., 211 et seq. F- - oscillator strength, 210, 223 F- - transition, 215 f-number, 168 f-sum rule, 172, 178, 205 et seq., 215, 222 f - - -, partial, 174, 183 fading, 7 fall time, 78 false-alarm probability, 294, 304 - dismissal, 294 far field, 262 - - diffraction, 247 Faraday rotation, 233 fatigue, 239 feedback of light, 51 ferroelastic effects, 244 ferroelectric effects, 244 fibre optic coupling, 58 - - output window, 82 - - plate, 58 fidelity criterion, 4 field curvature, 53, 141, 145, 148, 150, 153, 161 - decay time, 96 - flatness, 140 - flattener, fibre optic, 72 - fluctuations, 111, 116 - - -correlation function, 117 - -free one electron problem, 198 - size, 150 figure of merit, 237 et seq., 243 et seq., 259, 267 filter, 5 -, digital transversal, 8 -, low-pass, 11 finite coherence width effect, 267 flame analysis, 210 flicker, 37 -, critical, 38 flint glass, 155 fluctuation correlation functions, 112 - -dissipation theorem, 103 - spectrum, 132 fluid, 237
focus condition, 54, 61, 71, 76 focussing, brute force, 54, 78 -, electrostatic, 58, 71, 74 -, loop, 54, 56, 77 -, magnetically, 75 Fokker-Planck equation, 101 et seq., 116 forbidden emission, 176 - transition, 175 et seq. forward-backward time approach, 94 -time operator, 105 - - path, 107 - - transition, 122 four-lens corrector, 146 et seq. Fourier spectrum, 33 - transform, 33, 40, 100, 119, 276 - - coding, piecewise, 34 fourth power figuring, 154 frame correction, 37 framing rate, 48, 66 et seq., 73 Fraunhofer diffraction, 253 Fredholm determinant, 339 - -, approximation to, 349 free field oscillation, 97 - -space permittivity, 233 frequency distortion, 6 - pulling, 97 - shift, 93, 98, 192 et seq. Fresnel-zone plate, 276 fundamental absorption, 207 et seq., 212 fused silica, 155, 163 - - doublet, 163
G gain, 76 gamma function, incomplete, 351 Gascoigne’s aspheric plate, 162 Gaussian band shape, 209 - beam, 247, 260 - charge distribution, 194 - noise current, 117 - -, white, 7 - random noise, 6, 94 geometrical dimensions, 71 giant pulse, ruby laser, 84 gradient operator, 31 - threshold, 31 Gray coding, 23 Green’s function, 91, 101 et seq. - -, advanced, 134 - - approach, 93 - -, retarded, 134 - - theory, 94, 104 et seq., 120
Gregorian mirrors, 150 grid control, 71 - potential, 61 et seq.
distortion, 40 immobilization, 68 - intensification, 55 - magnification, 53 H - quality, 1 I , 39 et seq., 78 Hadamard block quantization, 29 - resolution, 39, 53 et seq. - matrices, 26 et seq. - segmentation, 73 - transform, 33, 41 - storage, 3 et seq. Hamiltonian, 105 - -, dynamic, 79 - operator, 3 14 - -, - electron, 77 -, time independent - surface, 161 Hanbury-Brown, Twiss correlation effect, - transmission, 3 et seq. 1 I3 - t u b e , 47 , - - function, 103, 116 -, -, biplanar, 52 - - -, - effect, 103 - - resolution, 59 harmonic oscillator, 94, 314 imperfection, lattice, 167 Hartree approximation, 123, 195 -, localized, 168 - Fock approximation, 195 et seq. impermeability tensor, relative, 237 equation, 197 impurity ion, 209 - _ formulation, time-dependent, 223 -, trace, 167 Hamiltonian, 197 et seq. incoherent energy, 1 1 5 - - - potential, 184, 198 - excitation, I12 heat sink, 267 - level of excitation, 97 et seq. heating, 260 - phonon, 232 -, internal, 243 index ellipsoid, 233 Heisenberg picture, 92 et seq., 114, 117, 120 - gradient, 245, 248 - radiation mode operator, 108 - modulation, 245, 249 helical path, 54 - o f refraction, 170, 180, 190, 208, 217, 240 high speed camera, 47, 71 indicatrix, 233 et seq., 250 - - photography, 47 induced dipole moment, 200 - - streak system, 71 infrared, 241 hologram, phase, 252 - absorption, 205 holography, 254 - radiation, 47, 60 Holst type image tube, 81 inhomogeneity, optical, 243 host crystal, 168 initial condition, 129 - matrix element, 177 insertian gain, 270 human vision, 12, 17 - loss, 270 hydrogen, atomic 206 insulator, 187 hypothesis, 293 integrated absorption cross section, 180 -, composite, 297 - cross section, 192 -, multiple, 300, 307 inter-frame redundancy, 37 -, pseudo-, 300 interaction, defect-host, 168 -, simple, 297 - induced correlation, 126 hysteresis-type losses, 244 - length, 278 - -, collinear, 265 1 -, many-body, 93 image convertor, 47 - picture, 93, 105 - curvature, 72 -, reservoir, 93 - decay 
time, 58 -, system-plus-reservoir, 93, 95 - degradation, 54 interatomic forces, 240 - digitization, 1 1 - interaction, 95 - dissection, 64 interband transition, 182 - dissector camera, 64 interference, 6, 8, 75 -
intermolecular spacing, 241 internal degree of excitation, 104 interpolating coding, 37 interpolation, contour, 37 -, linear, 37 interpolative coding, 18 intersymbol interference, 8 et seq. intraband oscillator strength, 182 et seq. intracavity light modulation of lasers, 275 intraframe coding, 37 ionic crystal, 188 ionization potential, 207 irradiation, 167 Isaac Newton telescope, 140, 147 isomorphism, 252 isopreference curve, 15 et seq. isotropic solid, 237
K Kerr cell, 48 et seq.
- - shutter, 49 Kolmogoroff-Smoluchowski relation, 102, 128
L Langevin force, 93 et seq., 101, 115
- theory, 115 - type current source, 114 laser, 75, 104, 232
- axial modes, 70 - beam modulation, 232 - machining, 277 latency time, 260 lattice imperfection, 167 - relaxation, 178 lens optimisation computer program, 146 light collection, 55 - spikes, high power, 70 likelihood ratio, average, 297 - -, definition of, 295 linear predictive coding, 20 local field, 168 - -, classical, 187 - - correction, 183 et seq. localized exciton, 220 longitudinal wave, 270, 274 loop focussing, 54, 58, 77 Lorentz field, 181, 188 et seq., 192 et seq., 205, 221 - local field, 170, 187, 211, 213 - relation, 241 - oscillators, 169, 192
Lorentzian absorption band, 171
- band shape, 209 - line-shape, 211 - spectrum, 111 low-pass filter, 11, 33 low-passing, 33 Löwdin symmetric orthogonalization, 176 lowpass, 18 luminance, 38 Lyman series, 206 - spectrum, 206
M macroscopic field, 187
- theory, 91 magnetic deflection, 59, 61, 74
- impurity, 210 - lens, 53, 61 et seq. - resonance, 168 - susceptibility, 210 magnetical focussing, 75 magnetron, 71 many-body interaction, 93 Marcum’s Q-function, 326 Markov approximation, 119, 126 et seq., 132 - process, 94 - theory, 126 Markovian dynamics, 111 - interaction, 114 - statistical mechanics, 102 et seq. - steady state, 104 master equation, 97 et seq. matched mode, 319, 362 - transmission line, 82 matching loss, 268 material property tensor, 232 mean-square error, 40 mechanical shutter, 63 mesh, 72 et seq., 77 microscopic theory, 91 mismatched transmission line, 66 mode locking, 70 - purity, 274 - scanning, 248 modes, aperture, 331 -, matched, 319, 362 -, normal, in transmission line, 310 -, spatio-temporal, 337 modulation, DPCM, 38 -, delta, 21, 38 -, - squared, 21 - depth, 251
-, laser beam, 232 modulator, 5, 7 Mohs hardness, 240 molecular polarizability, 241 monoclinic crystal, 234 Mossotti-Clausius-Lorenz-Lorentz expression, 188 motion picture, 37 Mott-Littleton radius, 214 multichannel system, 82 multilevel generator, 21 multiple channel, 48 - frame, 47 multiposition deflector, 276 mutual coherence function, 335
N needle-shaped specimen, 186 negative oscillator strength, 175 neighboring sample, 16 nesa coating, 54 - layer, 55 nett deflection, 69 neutron activation analysis, 210 Newtonian-Cassegrain telescope, 139 - telescope, 139 - - corrector, 139 et seq. Neyman-Pearson criterion, 295, 297, 304 noise, 6, 24 et seq., 293 - current, 116, 120 - -, spontaneous, 115 - effects, 93 - filtering, 25 - quantization, 12, 24 et seq. - theory, classical, 100 non-linear optical process, 125 - -linearities, 239 nonpolar fluid, 188 - gas, 187 numerical aperture, 142 Nyquist criterion, 8 - rate, 8 - - signalling, 8 - 's law, 310
O
objective lens, single, 48 et seq. observable, physical, 91 off-resonance electronic state, 200 On-axis spectroscopy, 150 one-electron approximation, 172, 174, 184, 195 - - - potential, 198
Onsager cavity field, 192 et seq., 204 - effective field, 193 - field, 187 et seq., 193, 221 - local field, 213 - 's continuum model, 189 operator, annihilation, 91, 314 -, creation, 91, 314 -, density, 302 -, estimation, 361 -, Hamiltonian, 314 -, projection, 303 optical activity, 233 et seq. - window, 205 optimum matrix, 26 ordinary ray, 236, 244 orthicon, 77 orthogonalization, 176 orthonormal wave functions, 176 oscillation, free field, 97 oscillator strength, 167 et seq., 170 et seq., 180 et seq., 200, 208 et seq., 211, 240 - - distribution, 223 - -, relative, 215 oscillatory strain, 250 outer space applications, 149 over-correct astigmatism, 153 overlap integral, 177 overlapping charge distributions, 193 et seq. - neighbor, 177 overshooting, 22
P P-function, 100, 128 P-representation, 99 Palomar observatory, 147 parallax, 82 paramagnetic center, 210 parameters of density operator, 361 - - signal, 297, 359 partial f-sum rule, 174, 183 - response signaling, 8 partition functional Z, generalized, 94, 104 pattern recognition, 41 Pauli principle, 173, 175, 222 - - forbidden transitions, 183 perfect crystal, 182, 190 permittivity, 274 -, effective, 269 - tensor, 233 et seq. perturbation, 195 et seq. - approximation, 305 - induced transitions, 203 - theory, first order, 186
Petzval curvature, 160
- -, zero, 161 - field curvature, 143, 147, 161 phase diffraction grating, 252 - grating, 231, 254 - hologram, 252 - modulation, 114, 239, 249 - -, electrooptic, 239 - -, spatial, 246 - spatial filter, 276 phonon, 94 et seq. phosphor decay time, 58 - material, 51 - screen, 41, 49, 51, 17 photo-conversion, 216 photocathode, 47, 49 - efficiency, 55 - resistance, 54, 58, 61, 75, 79 - sensitivity curve, 49 et seq. photochemical reactions, 215 photochromic effects, 243 photocomposition, 277 photoconductivity, 207, 219 photoelectron, 50 - path, 53 photographic emulsion, 56 - film granularity, 148 - plate, 55 photolithographic applications, 277 photon-electron correlations, 103 physical observable, 91 picture bit rate, 16 - contrast, 19 - phone, 13 - quality, 10 - type, 16 piecewise-linear approximation, 18 piezoelectric effects, 240 - relations, 237 - tensor, 236 et seq., 270 - transducer, 231 et seq., 267 piezoelectricity, 236 et seq. pincushion distortion, 82 Planck factor, 318, 327 - 's constant, 171 Pockels effect, 49 point dipole, 188 - source, detection of, 338 et seq. - spread function, 53 Poisson distribution, 343, 352, 357 et seq. Poissonian light distribution, 55 polarizability, 169 et seq., 189, 194, 205 polarization, 184
- response, 170 - switching, 265 postfilter, 11, 25 power of test, 294 prediction, equation of, 20 - scheme, planar, 22 -, two-dimensional, 22 predictive coding, 20 et seq. prefilter, 11, 25 prime focus, 150 - - corrector, 149 principal permittivity, 233 prior probability, 294 - - distribution, 297, 360 - -, least favorable, 298 prism camera, 48 probability diagram, flow of, 99 - distribution, 16, 17, 20, 91, 297, 360 - - function, 99 et seq. prolate spheroidal wavefunctions, 342 - - -, generalized, 354 pseudorandom noise, 24 - scanning, 37 psychovisual coding, 17 et seq. - encoder, 5 - encoding, 17 - redundancy, 3, 5 pulse shape, 76 - shaped time function, 7 pure states, 306
Q Q-function, 326 Q-switching, 275 quadrupole interaction, nuclear, 184 quantization, 11 et seq. -, block, 25, 29, 33 -, coarse-fine, 30 et seq. - level, 33 -, logarithmic, 12 - noise, 12, 24 et seq., 30 - - reduction, 24 et seq. -, uniform, 12 quantizer, 5 -, two level, 21 quantum efficiency, 50, 55 et seq., 211 - -, detective, 55 - limit, extreme, 320, 339, 354, 367 - noise operator, 91, 93, 113 et seq., 117 - - - theory, 113 et seq. - regression theorem, 103, 113 et seq., 127 et seq.
- uncertainty, 100 Queen Elizabeth telescope, 144
R R-center, 216 radiation mode, 95, 107 radioactive tracer techniques, 210 Raman-Nath diffraction, 275 ramp voltage waveform, 66 et seq. random noise, 24 randomization, 295 et seq., 304, 325, 339 randomizing effect, 95 rare earth transitions, 178 - -gas solid, 204 rate-distortion theory, 3 et seq. - - -, Shannon’s, 4 - of the source, 4 real-time coding, 38 rectangular spectrum, 341, 351, 357 reduction, quantization noise, 24 et seq. redundancy, psychovisual, 3, 5 - reduction, 3 et seq., 16 et seq., 23, 39 - removal, 6 -, statistical, 3, 5 reentrant deflector, 260 refraction, complex index of, 170 -, index of, 170, 180, 190, 208, 217 refractive index, 187 et seq., 191, 216, 231, 240 et seq., 253 - - gradient, 231 - - modulation, 244 - light deflection, 248 et seq. relaxation, 124 - rate, 91, 103, 111, 127 - time, 95, 99, 112, 119 remote sensing, 3 reservoir correlation function, 115 - interaction, 93, 107, 125 - of oscillations, 94 residual aberrations, 157 resolution, 151 - criterion, 246, 250, 257 - element, 353, 367 - of the identity, 308 -, spatial, 37, 51, 79 -, time, 51, 59, 79 et seq. resonance, 192 et seq., 196 - cutoff, 240 - mode, 250 resonant cavity, 71
resonating isotropic bar, 250 rigid ion approximation, 194 ringing, 22 rise time, 78 risk, posterior, 300, 360 Ritchey-Chrétien telescope, 139, 142, 149 et seq., 160 Ross corrector, 140 et seq. - doublet corrector, 142 et seq. rotating drum, 48 - glass prism, 48 - mirror, 48, 55 - - camera, 55, 63 rotation effect, 238 run-length coding, 22 et seq., 31, 34 S
sagittal curvature, 150, 161 - focus, 140 sampler, 5 sampling, 11 et seq. - rate, 39 scaling, 156 scanning, laser beam, 276 Schlieren method, 231 - optical image display, 276 - - light modulation, 255 - - modulator, 245 Schmidt camera, 139, 143, 145 - mirror, double, 72 - orthogonalization, 176 scintillation chamber, 66, 76 - tracks, 76 et seq. Schrödinger equation, 92 - picture, 92 et seq., 114, 117, 120 - state, 120 secondary electron emission, 57 - focus, 139 - - corrector, 160 - spectrum effects, 155, 160 - - error, 141, 144 secular motion, 98, 126 - time changes, 132 seeing, 148 - limitation, 149 Seidel aberrations, 145, 153 - astigmatism, 141 - coma, 139, 151 Seidel spherical aberration, 147 et seq. - theory, 145 selection, event, 77 self-consistent potential, static, 179 - -interaction, 199
- -locking, 70 Sellmeier’s dispersion formula, 240 semi-continuum model, 207 semiconductor, 174, 177, 208 Shannon, formula of, 7 - ’s theory, 16 - ’s rate distortion theory, 4 shear deformation, 241 - strain, 235 - wave, 244, 210, 274 shielded center, 178 shift plate, 69 et seq. shot noise, 6 shutter, 60 et seq. -, fast, 47 - potential, 67 - pulse shape, 75 - synchronization, 47 signal, 293, 309 -, coherent, 309 - distortion, 259 -, random phase, 324 -, standard strength, 298 - -to-noise ratio, 55 - - - -, effective, 299 - - - -, equivalent, 329 - - - - for detection of coherent signal, 313, 363 single absorption, 94 - aspheric plate, 151 - element corrector, 161 - emission, 94 - exposure, 47, 79 - mode fields, 101 - objective lens, 48 et seq. - -side-band signalling, 9 sixth-power figuring, 154 size of test, 294 slope prediction, 20 slot plate, 65 Smakula’s equation, 168 et seq., 171, 178 et seq., 188, 191, 193 et seq., 204 et seq., 208, 211 et seq., 215 - -, generalized, 178 et seq. - relation, 169, 171, 222 - treatment, 169 et seq. small-area detail, 34 Snell’s law, 247 sound, 234 et seq. - absorption, 244, 266 - power, 260 - velocity, 239 et seq., 244, 266, 269 - wave, 231, 238, 250, 252
- - propagation, 242 source coding, 5, 10 space charge, 54, 58, 79, 82 spark-gap, laser-triggered, 76 spatial filtering, 5, 255 - incoherence, 95 - resolution, 51, 79 spectral analyzer, 275 - correlation function, 103 - light filter, tunable, 265 - sensitivity curve, 52 spectrochemical analysis, 210 spectroscopic methods, 210 spectroscopy, on-axis, 150 spectrum analyzer, atomic beam, 103 speech signal, 21 spherical aberration, 139, 141 et seq., 145 - well, 219 spontaneous emission, 114 - - rate, 178 spot diagram, 154 staircase deflection, 63 - waveform, 66 stannic oxide, 54 star field, 157 static birefringence, 234 - strain, 250 statistical coding, 16 et seq., 20, 40 - encoder, 5 - redundancy, 3, 5 steady state mode, 97 - - probability distribution, 98 steering error, 261 et seq. stellar photometry, 142 Sternheimer shielding, 184 stimulated absorption, 114 - emission, 114 stoichiometric excess, 210 stoichiometry, 210 Stokes’ shift, 178 storage, 76 strain distribution, 275 - ellipsoid, 250 strategy, detection, 293 -, estimation, 360 stray capacitance, 65, 75 - inductance, 65 streak camera, 48, 61 - operation, 70 - system, 48 - technique, 58 - velocity, 59 stress birefringence, 231
- -strain relations, constitutive, 235 stroboscope, 59 subjective quality, 3, 13 et seq. - test, 13 sufficient statistic, 298, 312, 322 sum rule, 167, 175 surface reflection, 151 susceptibility, 210 - tensor, 233 symmetrized logarithmic derivative, 307, 365 symmetry, centre of, 236 synchronization, 63, 71, 79 synthesization of subjective color, 17 synthetic highs, 30 - - generator, 31 system-reservoir interaction, 95, 104
T TSEM, 57 et seq., 75 et seq. TV projector, 276 tangential curvature, 150, 161 - focus, 140 tapped delay-line, 8 television, 13, 17 - camera, 77 temperature coefficient, 244, 266 - effects, 266 temporal incoherence, 95 et seq. - response of human vision, 17 test criterion, 132 thermal distortion, 259, 266 - emission, 76 - equilibrium, 99 et seq., 103, 110 et seq., 125, 134 - - correlation function, 116 - expansion mismatch, 275 - light source, 103 - noise, 6 - problem, 266 - steady state, 104 thermocompression, 275 threshold, 31 - detector, 298 et seq., 306 et seq., 321 et seq., 351, 357 - function, 19 thresholding, 33 thickness driven transducer, 268 thin film transducer, 266 Thomas-Reiche-Kuhn sum rule, 173, 181 three mirror anastigmat system, 148 time delay, 62, 69 - evolution, 91
- resolution, 51, 59, 70 et seq., 78 - - limitation, 80 - translation operator, 92 trace impurity, 167 tracer, 210 transducer, 267 et seq. - loss, 266, 269, 274 transformational coding, 33 transient response, 103 transit time, 259 - - difference, 65 et seq. - -, electron, 52, 65, 78 et seq. transition amplitude, 121 - energy, 168 - probability, 121, 171, 179, 184, 197, 220 transition rate, 194 transmission line, 308 et seq. - - energy, 311 - - normal modes, 310 - - quantization, 313 et seq. transmittance, 18 transparency, 241 triclinic crystal, 234 triggering, 63 triode, 62 two-center matrix element, 177 - mirror corrector, 148 et seq. U U-center, 208 U. V. transmission of glass, 163 ultrasonic light modulation, 231, 276 - transducer, 278 - wave, 231, 252 ultraviolet absorption, 205 under-corrected Seidel astigmatism, 155 unitary operator, 92 V
variable-length binary codeword, 16 vacuum-deposited layers, 275 vestigial-side-band signalling, 9 video signal, 31 virtual emission, 183 - transition, 200 - valence band, 183
W Waals’ interaction, van der, 196 waveguide action, 248 Weber-Fechner law, 12, 18
weighting function, 40 wide field angle, 157 Wollaston prism, 245
X X-ray diffraction, 252 - - emission, 175
Z-functional approach, 94 - - theory, 120 Z, generalized partition functional, 94, 104 zero-frequency absorption line, 183 - Petzval curvature, 161 - temperature coefficient, 244
CUMULATIVE INDEX - VOLUMES I-X ABEL~S F.,, Methods for Determining Optical parameters of Thin Films 11, 249 VII, 139 ABELLA, 1. D., Echoes at Optical Frequencies AGRANOVICH, V. M., V. L., GINZBURG, Crystal Optics with Spatial Disperision IX, 235 IX, 179 ALLEN,L., D. G . C. JONES,Mode Locking in Gas Lasers IX, 123 E. O., Synthesis of Optical Birefringent Networks AMMAN, ARMSTRONG, J. A,, A. W. SMITH,Experimental Studies of Intensity Fluctuations in Lasers VI, 21 I BARAKAT, R., The Intensity Distribution and Total Illumination of AberrationI, 67 Free Diffraction Images BECKMANN, P., Scattering of Light by Rough Surfaces VI, 53 I BLOOM,A. L., Gas Lasers and their Application to Precise Length Measurements IX, IV, 145 BOUSQUET, P., see P. Rouard BRYNGDAHL, O., Applications of Shearing Interferometry IV, 37 BURCH,J. M . , The Metrological Applications of Diffraction Gratings 11, 73 I COHEN-TANNOUDJI, C., A. KASTLER, Optical Pumping v, CUMMINS, H. Z . , H. L. SWINNEY, Light Beating Spectroscopy VIII, 133 DELANO, E., R. J. PEGIS,Methods of Synthesis for Dielectric Multilayer Filters VII, 67 DEMARIA, A. J., Picosecond Laser Pulses IX, 31 X , 165 D. L., see D. Y . Smith DEXTER, EBERLY, J . H., Interaction of Very Intense Light Free Electrons VII, 359 I, 253 FIoRENTiNI, A., Dynamic Characteristics of Visual Processes FOCKE, J., Higher Order Aberration Theory IV, I FRANCON, M., S . MALLICK, Measurement of the Second Order Degree of Coherence VI, 71 B. R., Evoluation, Design and Extrapolation Methods for Optical FRIEDEN, IX, 31 I Signals, Based on Use of the Prolate Functions VIII, 51 FRY,G . A., The Optical Performance of the Human Eye I, 109 GABOR,D., Light and Information I l l , 187 GAMO,H . , Matrix Treatment of Partial Coherence IX, 235 GINZBURG, V. L., see V. M. Agranovich 11, 109 GIOVANELLI, R. G . , Diffusion Through Non-Uniform Media GNIADEK, K., J. PETYKIEWICZ, Applications of Optical Methods in the Diffraction IX, 281 Theory of Elastic Waves J . 
W., Synthetic-Aperture Optics VIII, 1 GOODMAN, X , 289 HELSTROM, C. W., Quantum Detection Theory VI, 171 HERRIOTT, D. R., Some Applications of Lasers to Interferometry HUANG,T. S., Bandwidth Compression of Optical Images x, 1 JACOBSSON, R., Light Reflection from Films of Continuously Varying Refractive Index V, 247 JACQUINOT, P., 9. ROIZEN-DOSSIER, Apodisation 111, 29 IX, 179 JONES,D. G . C., see L. Allen KASTLER, A., see C. Cohen-Tannoudji v, I KINOSITA, K., Surface Deterioration of Optical Glasses IV, 85 G . , Multiple-Beam Interference and Natural Modes in Open ResoKOPPELMAN, VII, 1 nators 111, 1 KOTTLER, F., The Elements of Radiative Transfer 392
KOTTLER, F., Diffraction at a Black Screen, Part I: Kirchhoff’s Theory IV, 281 KOTTLER,F., Diffraction at a Black Screen, Part 11: Electromagnetic Theory VI, 331 KUBOTA, H., Interference Color I, 211 LEITH,E. N., J. UPATNIEKS,Recent Advances in Holography VI, 1 LEVI,L., Vision in Communication VIII, 343 LIPSON,H., C. A. TAYLOR, X-Ray Crystal-Structure Determination as a Branch of Physical Optics V, 287 MALLICK, S., see M. FranGon VI, 71 MANDEL, L., Fluctuations of Light Beams 11, 181 MEHTA,C. L., Theory of Photoelectron Counting VIII, 373 MIKAELIAN, A. L., M. L. TER-MIKAELIAN, Quasi-Classical Theory of Laser Radiation VII, 231 MIYAMOTO, K., Wave Optics and Geometrical Optics in Optical Design I, 31 MURATA, K., Instruments for the Measuring of Optical Transfer Functions V, 199 MUSSET,A., A. THELEN, Multilayer Antireflection Coatings VIII, 201 OOUE,S., The Photographic Image VII, 299 PEGIS,R. J., The Modern Development of Hamiltonian Optics 1, 1 PEGIS,R. J., see E. Delano VII, 67 PERSHAN, P. S., Non-Linear Optics V, 8 3 PETYKIEWICZ, J., see K. Gniadek IX, 281 PICHT,J., The Wave of a Moving Classical Electron V, 351 RISKEN,H., Statistical Properties of Laser Light VIII, 239 ROIZEN-DOSSIER, B., see P. Jacquinot I l l , 29 ROUARD, P., P. BOUSQLJET, Optical Constants of Thin Films IV, 145 RUBINOWICZ, A., The Miyamoto-Wolf Diffraction Wave IV, 199 SAKAI, H., see G . A. Vanasse VI, 259 SCULLY, M. 0. K. G . WHITNEY, Tools of Theoretical Quantum Optics X, 89 SITTIG,E. K., Elastooptic Light Modulation and Deflection X, 229 SMITH,A. W., see J . A. Armstrong VI, 211 SMITH,D. Y., D. L. DEXTER, Optical Absorption Strength of Defects in Insulators X, 165 x, 45 SMITH,R. W., The Use of Image Tubes as Shutters V, 145 STEEL,W. H., Two-Beam Interferometry STROHBEHN, J. W., Optical Propagation Through the Turbulent Atmosphere IX, 73 STROKE, G. W., Ruling, Testing and Use of Optical Gratings for High-Resolution Spectroscopy 11, 1 SWINNEY, H. J., see H. Z . Cummins VIII, 133 TAYLOR, C. 
A., see H. Lipson V, 287 TER-MIKAELIAN, M. L., see A. L. Mikaelian VII, 231 VIII, 201 THELEN, A., see A. Musset VII, 169 THOMPSON, B. J . , Image Formation with Partially Coherent Light TSUJIUCHI, J., Correction of Optical Images by Compensation of Aberrations and 11, 131 by Spatial Frequency Filtering J., see E. N. Leith v1, 1 UPATNIEKS, V1, 259 VANASSE, G . A., H. SAKAI,Fourier Spectroscopy 1, 289 VANHEEL,A. C. S., Modern Alignment Devices WELFORD, W. T., Aberration Theory of Gratings and Grating Mountings IV, 241 WHITNEY, K. G., see M. 0. Scully X, 89 WOLTER,H., On Basic Analogies and Principal Differences between Optical and Electronic Information I, 155 X, 137 WYNNE, C. G., Field Correctors for Astronomical Telescopes V1, 105 YAMAJI,K., Design of Zoom Lenses YAMAMOTO, T., Coherence Theory of Source-Size Compensation in Interference Microscopy VIII, 295